[CORE-13707] `redpanda`: the great config bootstrap fix by WillemKauf · Pull Request #30396 · redpanda-data/redpanda

WillemKauf · 2026-05-06T22:03:43Z

The Great Config Bootstrap Fix

BLUF

Re-arrangement of the bootstrap process in order to allow restarting nodes to see an up to date view of the cluster wide state before initializing systems that depend on it.

BLUF Continued

Two important cluster wide states include the cluster configuration and feature table views, which we now attempt to fetch via a new fetch_controller_snapshot RPC to the controller leader, similar to a node joining the cluster for the first time.

This is mostly why the config cache exists, as well as why we snapshot member tables and feature tables into the kvstore- to allow a view of these properties BEFORE bootstrapping. However, these local snapshots can always be stale, and it is important to bring the node up to date with the controller leader before the system is started.

This prevents bugs like:

Node goes down for a while, cloud_topics_enabled is set on, a cloud topic is created, node comes back, replays the controller log after bootstrapping itself, doesn't have cloud topics capabilities but creates the topic anyways in an undefined state.
Node goes down for a while, log_compaction_use_sliding_window is set on, node comes back, replays the controller log after bootstrapping itself, sets this property on, but no memory reservation has been properly allocated for the sliding window map, potential seg fault.

These are just two examples of the undefined behaviors that can happen when we bootstrap before establishing a cluster wide view of these configurations and/or features.

We can also no longer promote cluster configs during cluster recovery or during controller log replay after the bootstrapping process - that would be invalid behavior.

The core problem: where does a node's config come from at start-up?

Before — error prone

flowchart TD
    A[start-up] --> B[load local config cache / bootstrap yaml<br/>load local feature-table snapshot]
    B --> C{joining or restarting?}
    C -->|joiner| D[register_with_cluster<br/>→ gets a controller_join_snapshot<br/>config + features are fresh]
    C -->|restarter| E[use LOCAL cache + snapshot only<br/>⚠️ may be stale / many versions behind]
    D --> F[wire up + start ALL runtime services<br/>sized from this config]
    E --> F
    F --> G[controller-&gt;start → raft0 replay]
    G --> H[the restarter promotes cluster configs read and appears to possess the up-to-date cluster configuration + features, but services already started on stale values]

    style E fill:#ffd9d9
    style H fill:#ffd9d9

The hazards:

A restarter that was down for a while runs the entire early bootstrap on
stale local config and an old feature-table snapshot.
Runtime services and memory allocations are constructed/sized from those stale values.
Nothing enforces "don't read cluster config before it's trustworthy", so
stale reads are silent.

After — establish a consistent view first

flowchart TD
    A[start-up] --> RO[recover kvstore from local segments and read the node UUID, cluster UUID, feature snapshot]
    RO --> NR[mark cluster config as NOT ready to catch any potential misuses of stale configs]
    NR --> HY[preload local config cache / bootstrap yaml]
    HY --> RPC[start internal RPC bootstrap server]
    RPC --> EV[establish_cluster_view: call fetch_controller_snapshot from the controller leader and apply fresh config_manager + feature_backend state]
    EV --> RDY[mark cluster config READY]
    RDY --> INIT[initialize: build memory_groups,<br/>re-set RPC server memory capacity]
    INIT --> RUN[wire up + start runtime services<br/>now sized from a consistent, fresh config]
    RUN --> R0[controller-&gt;start → raft0 replay<br/>now just tops up from an already-consistent base]

    style EV fill:#d9ffd9
    style RDY fill:#d9ffd9

Key change: both joiners and restarters obtain an authoritative
config_manager + feature_backend view from the controller leader before
the runtime is built — so everything downstream is sized and gated from fresh
values. The window between "not ready" and "ready" is guarded by
usable_before_ready (see the last commit).

Commit walkthrough

Commits 1→3: read-only `kvstore` access for bootstrap

We need read-only access to the kvstore early in bootstrap so we can read
persisted identity (node UUID, cluster UUID, feature snapshot, controller-log
presence) before the storage subsystem is fully started.

storage: expose kvstore::recover() — surface kvstore::recover() (and
an accessor on storage::api) so the kvstore's persisted state can be read
for read-only purposes early in bootstrap, without a full storage start.
storage: move around constructor logic in api — make the kvstore
available right after storage::api's constructor, so the read-only accessor
above is usable as early as possible.
cluster: remove storage::api dependency injection in cluster_discovery
— cluster_discovery only needed node_uuid/cluster_uuid (constant once
set). Pass those two values directly instead of injecting the whole
storage::api, decoupling discovery from the storage lifecycle.

Commits 4→5: a chunk cache that doesn't force `memory_groups()` early

The chunk cache is needed by storage early, but its old singleton form
constructed memory_groups() eagerly — and we don't want to compute memory
group allotments before we actually know the (cluster-consistent) config that
determines them.

storage: abstract out chunk_cache::preallocate() — split the pool
pre-allocation out of start() into a dedicated preallocate() step, so
starting the cache and warming its pool become independent.
storage: remove chunk_cache singleton — the process-wide
thread_local singleton was a mistake. Encapsulate the cache inside
storage_resources (reachable everywhere the cache was used). Lifecycle is now
api::start() → storage_resources::start() → chunk_cache::start(prealloc),
where prealloc lets tests skip the up-front pool fill; add()/get()
assert_started() to catch misuse. Crucially, this defers construction of
memory_groups() until storage::api::start(), instead of at first
singleton access during early bootstrap.

Commit 6 — well-contained fix

kvstore: set is_compaction_enabled=false flag — during kvstore segment
recovery we were reaching into an override-less ntp_config and probing the
cluster default log_cleanup_policy, which is meaningless in the kvstore
context. Pass is_compaction_enabled=false explicitly instead.

Commits 7→9: size the bootstrap RPC server without `memory_groups()`

We need the internal RPC server up early to run the bootstrap_service (and
later fetch_controller_snapshot). But sizing its per-core memory from
memory_groups() would force that computation before we trust the config. So
size it from a fixed placeholder during bootstrap, then adjust it afterward.

config: add rpc_server_bootstrap_memory — a fixed placeholder for
max_service_memory_per_core used while bootstrapping, so the bootstrap RPC
server doesn't have to consult memory_groups().
utils: add adjustable_semaphore::underlying() — expose the underlying
semaphore, needed to resize capacity later.
net: use adjustable_semaphore in server — back net::server's
_memory limit with an adjustable_semaphore and add set_memory_capacity().
The server starts sized from rpc_server_bootstrap_memory; once bootstrap is
complete and memory_groups() is trustworthy, its capacity is adjusted to the
real allotment.

Commits 10→15 — preparation for the snapshot fetch

cluster: make members_manager::read_members_from_kvstore() static —
so it can be called without a members_manager instance during bootstrap.
cluster: template-ize maybe_make_join_snapshot — move it to the
header as a variadic template, so a caller can choose exactly which
controller-STM backends to include in a snapshot (needed for the fetch RPC).
cluster: apply config_manager before parallel snapshot batch — pull
config_manager::apply_snapshot out of the ss::when_all parallel block and
run it first, serially. Downstream backends (members, topics, security,
quota, cluster_link, …) may read shard_local_cfg() during their own
apply_snapshot, so config must be fresh before they run.
cluster: stop promoting pending values at start — drop
promote_all_pending() from config_manager::start(). Since
shard_local_cfg() is now hydrated from the controller snapshot before
bootstrap proceeds, any cluster_config_delta_cmd replayed afterward is a
change the cluster made after hydration; for needs_restart properties
it must stay pending (the strict reading of needs-restart semantics) rather
than retroactively mutating the running node.
cluster: factor out is_request_logical_version_compatible — small
extraction currently gating join_node_request; reused next for the fetch
RPC.
cluster: add fetch_controller_snapshot RPC — the heart of the new
approach. A new RPC that returns the controller leader's latest in-memory
config_manager + feature_backend state. It is called unconditionally
during bootstrap — whether the node is restarting or joining — because a node
that was down a long time has only stale local config/features, and these two
pieces of global state must be brought to a consistent view early in the
node's lifetime.

Commit 16 — `redpanda: the great config bootstrap fix`

This is the payload that rewires application::run() to use everything above.
The new start-up order is:

wire_up_and_start_crypto_services();   // crypto first (TLS cipher validation, etc.)
wire_up_bootstrap_services();          // feature_table, storage::api, smp groups
mark_config_ready(false);              // cluster config is NOT trustworthy yet
hydrate_cluster_config(node_cfg_yaml); // preload local cache / bootstrap file
bootstrap_from_kvstore();              // read-only recover: node/cluster UUID, feature snapshot
wire_up_and_start_rpc_service();       // bootstrap RPC server (sized from rpc_server_bootstrap_memory)
establish_cluster_view();              // fetch_controller_snapshot → apply → mark_config_ready(true)
log_cluster_config();                  // now safe to log the consistent config
initialize();                          // build memory_groups(); re-set RPC server memory capacity
setup_metrics();
wire_up_and_start(...);                // runtime services; controller->start → raft0 replay

What changed conceptually (before → after):

Concern	Before	After
Restarter's config/features at early bootstrap	local cache + old snapshot (stale)	refreshed from controller leader via `fetch_controller_snapshot`
When the cluster-consistent view exists	after raft0 replay (late, post-runtime)	before runtime services are built
kvstore at early bootstrap	needed full start	recovered read-only
RPC server memory	`memory_groups()` (from stale config)	`rpc_server_bootstrap_memory` placeholder, adjusted post-bootstrap
`memory_groups()`	computed early on stale config	computed in `initialize()` after the view is consistent
Stale `shard_local_cfg()` reads	silent	guarded by `usable_before_ready` (next commit)

sequenceDiagram
    participant N as Booting node
    participant K as local kvstore
    participant L as controller leader
    N->>K: recover() (read-only): node/cluster UUID, feature snapshot
    N->>N: mark_config_ready(false)
    N->>N: hydrate local config cache
    N->>N: start bootstrap RPC server (rpc_server_bootstrap_memory)
    N->>L: fetch_controller_snapshot (config_manager + feature_backend)
    L-->>N: authoritative snapshot
    N->>N: apply snapshot → mark_config_ready(true)
    N->>N: initialize(): build memory_groups, resize RPC memory
    N->>N: start runtime services on a consistent config
    N->>N: controller->start → raft0 replay (tops up from consistent base)

Commit 17 — `config`: add `usable_before_ready` flag

A guardrail against acting on stale config. shard_local_cfg() is marked
not-ready until establish_cluster_view() completes; reading any property's
active value while not-ready trips an assertion unless the property opted
in via usable_before_ready::yes. Default is ::no (strict). Properties that
are genuinely needed during bootstrap and whose stale value is harmless are
flagged after audit. See the appendix for the full list and rationale.

Commit 18 — `redpanda`: add `test_cfg` and `chunk_cache_prealloc`

Now that chunk_cache is no longer a process wide singleton, fixture
tests that allocate several applications to simulate multi-broker
tests allocate chunk_cache N times per shard instead of 1, leading to potential
memory issues. Add test_cfg to redpanda, encapsulating the existing
cloud_topics::test_cfg and adding a new flag chunk_cache_prealloc,
which is true by default for production use.

Commit 19 — `rptest`: add `MultiNodeBootstrapTest`

End-to-end ducktape coverage: brings up multiple nodes and exercises the new
bootstrap path (restart and join), asserting that a node converges to the
cluster's configuration/feature view at start-up rather than lagging until
raft0 replay.

Appendix: properties flagged `usable_before_ready`

These are read during bootstrap, before shard_local_cfg() is ready. Each is
audited as safe to read with a potentially-stale local value (the consumer is
re-synced/reconstructed after the view is established, or the value has no
lasting effect).

Internal RPC bootstrap server (constructed before the cluster view; metrics
start disabled, memory capacity re-applied afterward): rpc_server_bootstrap_memory,
rpc_server_listen_backlog, rpc_server_tcp_recv_buf, rpc_server_tcp_send_buf,
rpc_server_compress_replies.
Metrics: disable_metrics, disable_public_metrics, aggregate_metrics.
TLS for the bootstrap RPC server's reloadable credentials: tls_min_version,
tls_enable_renegotiation, tls_v1_2_cipher_suites, tls_v1_3_cipher_suites.
smp service groups (sized in wire_up_bootstrap_services):
raft_smp_max_non_local_requests, pp_sr_smp_max_non_local_requests,
topic_partitions_per_shard.
storage::api / kvstore read-only recovery + storage_resources:
kvstore_max_segment_size, kvstore_flush_interval, append_chunk_size,
storage_read_buffer_size, storage_read_readahead_count,
storage_target_replay_bytes, storage_max_concurrent_replay.
Cluster discovery / join: join_retry_timeout_ms.

What this PR doesn't fix:

The potential race/log invariant violation caused by reading a config cache that is more up to date than the controller snapshot (e.g. we have read-ahead in the log's history and then replay other entries out of order). It may be the case that we want to eliminate the config cache read entirely in favour of the controller snapshot.

  flowchart LR
      subgraph correct["✅ Correct order (log is source of truth)"]
          direction LR
          c5["offset 5<br/>snapshot: X=old"] --> c7["offset 7<br/>reads X ⇒ old"] --> c10["offset 10<br/>writes X=new"]
      end

      subgraph broken["❌ Replay reading kvstore first"]
          direction LR
          b0["kvstore preloaded<br/>X=new (from @10)"] --> b5["offset 5<br/>snapshot"] --> b7["offset 7<br/>reads X ⇒ new"]
  --> b10["offset 10<br/>writes X=new"]
          b7 -. "future state leaks<br/>into the past" .-> b0
      end

Backports Required

Release Notes

Bug Fixes

Fixes a bug in which stale configuration values could be read by joining or restarting redpanda nodes during the bootstrapping process.

Copilot

Pull request overview

This PR reworks Redpanda’s startup/bootstrap sequencing to prevent early reads of stale cluster configuration during node join/restart. It introduces a “config readiness” gate for config::shard_local_cfg(), splits kvstore recovery from writer startup, and adds an RPC path to fetch a leader-authoritative controller snapshot for restarting nodes.

Changes:

Add a readiness check to config::shard_local_cfg() plus an *_unsafe() escape hatch for early bootstrap.
Split kvstore recovery into an explicit phase, and restructure application bootstrap to recover/apply snapshots before marking config “ready”.
Add a fetch_controller_snapshot RPC and controller snapshot plumbing to support leader-authoritative bootstrap for restarting nodes.

Reviewed changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
src/v/storage/storage_resources.h	Extend ctor signature and add API to update append chunk size during bootstrap.
src/v/storage/storage_resources.cc	Plumb append chunk size into `storage_resources` initialization and add updater method.
src/v/storage/kvstore.h	Make `recover()` a public entry point and add `start(recover_t)` mode switch.
src/v/storage/kvstore.cc	Implement split recovery/start flow; adjust read APIs to require recovery rather than full start.
src/v/storage/api.h	Construct kvstore earlier and add `recover_kvstore()` bootstrap hook before `start()`.
src/v/redpanda/BUILD	Add Bazel deps needed by updated application/bootstrap compilation units.
src/v/redpanda/application.h	Reshape bootstrap APIs into async helpers for applying snapshots/config and discovery.
src/v/redpanda/application.cc	Move cluster-config hydration earlier and ensure feature-table metrics are set up during metrics init.
src/v/redpanda/application_start.cc	Use stored `cluster_discovery` instance and remove passing discovery by reference into runtime start.
src/v/redpanda/application_config.cc	Use `shard_local_cfg_unsafe()` for bootstrap-time kvstore config extraction.
src/v/redpanda/application_bootstrap.cc	Rework bootstrap sequence: recover kvstore, apply snapshots, discovery/join, then mark config ready.
src/v/features/feature_table.h	Add `setup_metrics()` API.
src/v/features/feature_table.cc	Defer probe/metrics setup from ctor to explicit `setup_metrics()`.
src/v/config/node_config.cc	Use `shard_local_cfg_unsafe()` during node-config YAML loading/ignores.
src/v/config/configuration.h	Add readiness flag + document `shard_local_cfg()` vs `shard_local_cfg_unsafe()`.
src/v/config/configuration.cc	Implement `shard_local_cfg_unsafe()` and guard `shard_local_cfg()` with readiness vassert.
src/v/cluster/types.h	Add serde RPC request/reply types for fetching a controller snapshot.
src/v/cluster/service.h	Add controller service RPC endpoint for `fetch_controller_snapshot`.
src/v/cluster/service.cc	Implement RPC handler to delegate snapshot fetch logic to members manager.
src/v/cluster/members_manager.h	Declare handler to produce/forward leader-authoritative controller snapshot replies.
src/v/cluster/members_manager.cc	Implement snapshot fetch path (forward to leader if needed; leader builds partial join snapshot).
src/v/cluster/controller.json	Register new `fetch_controller_snapshot` RPC in controller protocol schema.
src/v/cluster/controller.cc	Adjust controller start wiring after config_manager ctor signature change.
src/v/cluster/controller_stm.h	Generalize join snapshot construction via templated backend selection.
src/v/cluster/controller_stm.cc	Reorder snapshot application so config manager applies earlier; remove old join snapshot impl.
src/v/cluster/config_manager.h	Remove cluster recovery table dependency from config manager constructor/state.
src/v/cluster/config_manager.cc	Switch more bootstrap-time reads to `shard_local_cfg_unsafe()` and adjust replay/bootstrap flow.
src/v/cluster/cluster_discovery.h	Update discovery ctor signature and add controller snapshot fetch API (with stale-comment mismatch).
src/v/cluster/cluster_discovery.cc	Implement new controller snapshot fetch RPC path and decouple discovery from storage::api.

Comments suppressed due to low confidence (1)

src/v/cluster/config_manager.cc:193

config_manager::apply_local() still stages updates to needs_restart properties via set_pending_value(...), but config_manager::start() no longer promotes pending values after STM replay. This means nodes may never apply needs_restart config updates even after a restart (pending values remain invisible indefinitely). Reintroduce a promotion point after replay (as before), or otherwise ensure pending values are promoted at least once during startup after replay completes.

ss::future<> config_manager::start() {
    if (_seen_version == config_version_unset) {
        vlog(clusterlog.trace, "Starting config_manager... (initial)");

        // Background: wait till we have a leader. If this node is leader

This singleton was a bad idea. Instead, encapsulate it within `storage_resources`, which is also accessible everywhere `chunk_cache` is currently used. This commit: * Removes the singleton accessor for `chunk_cache` * Adds the `_chunk_cache` object storage to `storage_resources` * Reworks the `start()` logic for `chunk_cache` * `api::start()` -> `storage_resources::start()` -> `chunk_cache::start()`, but with an bool argument `prealloc`. Meant for tests, where we previously wouldn't preallocate the chunk cache upfront when calling `api::start()`. * Public `chunk_cache` APIs `add()` and `get()` now called `assert_started()`. * This ensures we allocate chunks in production, while allowing tests to continue to not preallocate. * Ensures tests which require the `chunk_cache` properly start/stop their encapsulated `storage_resources`. This also accomplishes the goal of deferring construction of `memory_groups()` before `storage::api::start()` is called during the bootstrapping/start-up process in `redpanda`.

Otherwise we reach into the `ntp_config` (with 0 overrides) and try to probe the cluster default for `log_cleanup_policy`, which is incorrect in the `kvstore` context.

We are going to be using `rpc_server_bootstrap_memory` as a placeholder value for `cfg.max_service_memory_per_core` during the bootstrap process as a way of avoiding a call to `memory_groups()` early on in the bootstrap process. In order to correctly size the memory allotment for the server post bootstrap, we need a way of adjusting the number of units used for the semaphore `_memory` in `net::server`. Use an `adjustable_semaphore` for this purpose and add a new function `set_memory_capacity()` to adjust the semaphore's capacity. This will be called after the bootstrapping process is complete and `memory_groups` is considered usable/trustworthy.

Will be used in the future as a static member function without a `members_manager` instance.

We are going to want control over what backends we need to include in a `controller_stm` snapshot in a future commit. Move this function to the header and use a variadic template for this purpose.

Pull config_manager::apply_snapshot out of the ss::when_all parallel block and run it serially after feature_backend and before members_manager. Downstream backends (members, topics, plugin, recovery, security, quota, migrations, cluster_link) may consult shard_local_cfg during their own apply_snapshot work; sequencing config_manager first guarantees they see fresh values.

Removes the promote_all_pending() call from config_manager::start(). Pending values that accumulated during the post-bootstrap controller log replay (i.e. cluster_config_delta_cmd records applied above the hydrated snapshot offset) now stay pending instead of being promoted when config_manager::start runs. This is the strict reading of needs_restart=yes semantics: a property change visible to a node only after it restarts. shard_local_cfg() is already hydrated from the controller snapshot before bootstrap proceeds (see application::bootstrap_cluster_config_view), so any delta replayed afterward is a change the cluster made *after* the hydration point and should not retroactively affect the running node's active values for needs_restart properties.

A bit of code extrapolated out. Currently used for gating `join_node_request`s, will be used in a future commit for handling `fetch_controller_snapshot` requests as well.

A new RPC to fetch a controller snapshot, containing the latest in-memory state of the controller leader's `config_manager` and `feature_backend`. This RPC will be called unconditionally during bootstrap of a `redpanda` broker, whether that node is simply restarting or is joining for the first time. The reason for this being that a local node who is down for a long time will have only stale configurations and features in its local cache/snapshots/log, and the configuration and feature managers are two global pieces of state that should be brought to a consistent view early on in a `redpanda` node's lifetime.

An extra layer of security against using potentially stale configs out of the `shard_local_cfg()`. Defaults to `usable_before_ready::no` to be strict. Properties which are used in the bootstrapping process before a proper view of the cluster configuration can be established and which have no impact on the behavior of the system if read stale can be safely marked as `usable_before_ready::yes`.

Now that `chunk_cache` is no longer a process wide singleton, fixture tests that allocate several `application`s to simulate multi-broker tests allocate `chunk_cache` N times instead of 1, leading to potential memory issues. Add `test_cfg` to `redpanda`, encapsulating the existing `cloud_topics::test_cfg` and adding a new flag `chunk_cache_prealloc`, which is `true` by default for production use.

oleiman · 2026-06-17T06:33:09Z

-      .handle_exception_type([](const ss::gate_closed_exception&) {
-          lg.trace("Shutdown requested during recovery");


any particular reason for removing this?

oleiman · 2026-06-17T06:33:11Z


    /*
-     * Recovery
+     * Recovery (recover() itself is a public entry point declared above)


nitpick: shouldn't this comment be attached to recover()?

oleiman · 2026-06-17T06:33:12Z

-        _kvstore = std::make_unique<kvstore>(
-          _kv_conf_cb(), ss::this_shard_id(), _resources, _feature_table);
-        return _kvstore->start().then([this] {
+        if (!_kvstore) {


q: why do we need this? just for start/stopping storage in tests?

oleiman · 2026-06-17T06:33:13Z

+        co_await _storage.invoke_on_all(
+          [](storage::api& a) { return a.start(); });


nitpick: what's the difference?

EDIT: ah, default param

oleiman · 2026-06-17T06:33:14Z

+     * unaffected; callers that signal()/consume() directly through the
+     * underlying handle bypass capacity bookkeeping.


this feels a little sketchy, but i guess for our use case the whole point is that we only update the capacity at one point post-bootstrap?

oleiman · 2026-06-17T06:33:15Z

-ss::future<> promote_all_pending() {
-    co_await ss::smp::invoke_on_all(
-      [] { config::shard_local_cfg().promote_pending(); });
-}


iiuc the idea here is that, post the great config bootstrap fix, by the time we apply the local cache it is fully up-to-date with the a valid config snapshot from the controller leader, right?

nitpick: update commit message. only mentions start right now
nitpick: should this commit go on the other side of the actual bootstrap fix? bit easier to see it's correct that way, and then we'll have a valid bisection point just in case.

oleiman · 2026-06-17T06:33:16Z

-    // Snapshot application is a wholesale state replacement — promote
-    // all pending values immediately so needs_restart properties take
-    // effect (e.g. during cluster recovery).
-    co_await promote_all_pending();


checking my understanding: we no longer need this because we'll be applying a snapshot earlier in the boostrap process and ultimately forcing pending values to active in apply_local?

oleiman

made it up to config store changes. looking pretty good!

Copilot AI review requested due to automatic review settings May 6, 2026 22:03

WillemKauf requested a review from a team as a code owner May 6, 2026 22:03

github-actions Bot added area/build area/redpanda labels May 6, 2026

Copilot started reviewing on behalf of WillemKauf May 6, 2026 22:04 View session

WillemKauf force-pushed the config_bootstrap_fix branch from 4a97a52 to 4755fa7 Compare May 6, 2026 22:04

Copilot AI reviewed May 6, 2026

View reviewed changes

Comment thread src/v/storage/api.h Outdated

Comment thread src/v/storage/kvstore.cc

Comment thread src/v/config/configuration.h Outdated

Comment thread src/v/cluster/cluster_discovery.h Outdated

WillemKauf force-pushed the config_bootstrap_fix branch 4 times, most recently from 72cd608 to 0140ac7 Compare May 7, 2026 16:10

This comment was marked as outdated.

Sign in to view

WillemKauf force-pushed the config_bootstrap_fix branch 6 times, most recently from 3ec0835 to 2058bd9 Compare May 8, 2026 02:09

This comment was marked as outdated.

Sign in to view

WillemKauf force-pushed the config_bootstrap_fix branch 2 times, most recently from 625236c to d9c2e30 Compare May 8, 2026 19:46

This comment was marked as outdated.

Sign in to view

WillemKauf force-pushed the config_bootstrap_fix branch from d9c2e30 to 7ac0443 Compare May 8, 2026 21:33

WillemKauf changed the title ~~WIP: The great config bootstrap fix~~ redpanda: the great config bootstrap fix May 8, 2026

This comment was marked as outdated.

Sign in to view

WillemKauf force-pushed the config_bootstrap_fix branch from 7ac0443 to 07a3683 Compare May 9, 2026 05:09

WillemKauf changed the title ~~redpanda: the great config bootstrap fix~~ [CORE-13707] redpanda: the great config bootstrap fix May 9, 2026

WillemKauf closed this May 12, 2026

WillemKauf reopened this May 13, 2026

WillemKauf force-pushed the config_bootstrap_fix branch 2 times, most recently from 697cfdc to e4e5cfe Compare June 8, 2026 22:44

WillemKauf added 11 commits June 8, 2026 19:00

kvstore: set is_compaction_enabled=false flag

a86387c

Otherwise we reach into the `ntp_config` (with 0 overrides) and try to probe the cluster default for `log_cleanup_policy`, which is incorrect in the `kvstore` context.

config: add rpc_server_bootstrap_memory

a96c4b4

utils: add adjustable_semaphore::underlying()

ec1451f

cluster: make members_manager::read_members_from_kvstore() static

7f925f7

Will be used in the future as a static member function without a `members_manager` instance.

cluster: template-ize maybe_make_join_snapshot

418b744

We are going to want control over what backends we need to include in a `controller_stm` snapshot in a future commit. Move this function to the header and use a variadic template for this purpose.

cluster: factor out is_request_logical_version_compatible

de3f646

A bit of code extrapolated out. Currently used for gating `join_node_request`s, will be used in a future commit for handling `fetch_controller_snapshot` requests as well.

WillemKauf force-pushed the config_bootstrap_fix branch from e4e5cfe to aa327c3 Compare June 8, 2026 23:00

This comment was marked as outdated.

Sign in to view

WillemKauf added 4 commits June 8, 2026 22:20

redpanda: the great config bootstrap fix

6f3f7ba

rptest: add MultiNodeBootstrapTest

083ed0f

WillemKauf force-pushed the config_bootstrap_fix branch from aa327c3 to 083ed0f Compare June 9, 2026 02:58

sjust-redpanda self-requested a review June 10, 2026 23:47

oleiman reviewed Jun 17, 2026

View reviewed changes

oleiman self-requested a review June 17, 2026 16:32

		.handle_exception_type([](const ss::gate_closed_exception&) {
		lg.trace("Shutdown requested during recovery");

		co_await _storage.invoke_on_all(
		[](storage::api& a) { return a.start(); });

		* unaffected; callers that signal()/consume() directly through the
		* underlying handle bypass capacity bookkeeping.

Conversation

WillemKauf commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

The Great Config Bootstrap Fix

BLUF

BLUF Continued

The core problem: where does a node's config come from at start-up?

Before — error prone

After — establish a consistent view first

Commit walkthrough

Commits 1→3: read-only kvstore access for bootstrap

Commits 4→5: a chunk cache that doesn't force memory_groups() early

Commit 6 — well-contained fix

Commits 7→9: size the bootstrap RPC server without memory_groups()

Commits 10→15 — preparation for the snapshot fetch

Commit 16 — redpanda: the great config bootstrap fix

Commit 17 — config: add usable_before_ready flag

Commit 18 — redpanda: add test_cfg and chunk_cache_prealloc

Commit 19 — rptest: add MultiNodeBootstrapTest

Appendix: properties flagged usable_before_ready

What this PR doesn't fix:

Backports Required

Release Notes

Bug Fixes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

oleiman Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

oleiman Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

oleiman Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

oleiman Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

oleiman Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

oleiman Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

oleiman Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

oleiman left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

WillemKauf commented May 6, 2026 •

edited

Loading

Commits 1→3: read-only `kvstore` access for bootstrap

Commits 4→5: a chunk cache that doesn't force `memory_groups()` early

Commits 7→9: size the bootstrap RPC server without `memory_groups()`

Commit 16 — `redpanda: the great config bootstrap fix`

Commit 17 — `config`: add `usable_before_ready` flag

Commit 18 — `redpanda`: add `test_cfg` and `chunk_cache_prealloc`

Commit 19 — `rptest`: add `MultiNodeBootstrapTest`

Appendix: properties flagged `usable_before_ready`