Skip to content

[CORE-13707] redpanda: the great config bootstrap fix#30396

Open
WillemKauf wants to merge 19 commits into
redpanda-data:devfrom
WillemKauf:config_bootstrap_fix
Open

[CORE-13707] redpanda: the great config bootstrap fix#30396
WillemKauf wants to merge 19 commits into
redpanda-data:devfrom
WillemKauf:config_bootstrap_fix

Conversation

@WillemKauf

@WillemKauf WillemKauf commented May 6, 2026

Copy link
Copy Markdown
Contributor

The Great Config Bootstrap Fix

BLUF

Re-arrangement of the bootstrap process in order to allow restarting nodes to see an up to date view of the cluster wide state before initializing systems that depend on it.

BLUF Continued

Two important cluster wide states include the cluster configuration and feature table views, which we now attempt to fetch via a new fetch_controller_snapshot RPC to the controller leader, similar to a node joining the cluster for the first time.

This is mostly why the config cache exists, as well as why we snapshot member tables and feature tables into the kvstore- to allow a view of these properties BEFORE bootstrapping. However, these local snapshots can always be stale, and it is important to bring the node up to date with the controller leader before the system is started.

This prevents bugs like:

Node goes down for a while, cloud_topics_enabled is set on, a cloud topic is created, node comes back, replays the controller log after bootstrapping itself, doesn't have cloud topics capabilities but creates the topic anyways in an undefined state.
Node goes down for a while, log_compaction_use_sliding_window is set on, node comes back, replays the controller log after bootstrapping itself, sets this property on, but no memory reservation has been properly allocated for the sliding window map, potential seg fault.

These are just two examples of the undefined behaviors that can happen when we bootstrap before establishing a cluster wide view of these configurations and/or features.

We can also no longer promote cluster configs during cluster recovery or during controller log replay after the bootstrapping process - that would be invalid behavior.

The core problem: where does a node's config come from at start-up?

Before — error prone

flowchart TD
    A[start-up] --> B[load local config cache / bootstrap yaml<br/>load local feature-table snapshot]
    B --> C{joining or restarting?}
    C -->|joiner| D[register_with_cluster<br/>→ gets a controller_join_snapshot<br/>config + features are fresh]
    C -->|restarter| E[use LOCAL cache + snapshot only<br/>⚠️ may be stale / many versions behind]
    D --> F[wire up + start ALL runtime services<br/>sized from this config]
    E --> F
    F --> G[controller-&gt;start → raft0 replay]
    G --> H[the restarter promotes cluster configs read and appears to possess the up-to-date cluster configuration + features, but services already started on stale values]

    style E fill:#ffd9d9
    style H fill:#ffd9d9
Loading

The hazards:

  • A restarter that was down for a while runs the entire early bootstrap on
    stale local config and an old feature-table snapshot.
  • Runtime services and memory allocations are constructed/sized from those stale values.
  • Nothing enforces "don't read cluster config before it's trustworthy", so
    stale reads are silent.

After — establish a consistent view first

flowchart TD
    A[start-up] --> RO[recover kvstore from local segments and read the node UUID, cluster UUID, feature snapshot]
    RO --> NR[mark cluster config as NOT ready to catch any potential misuses of stale configs]
    NR --> HY[preload local config cache / bootstrap yaml]
    HY --> RPC[start internal RPC bootstrap server]
    RPC --> EV[establish_cluster_view: call fetch_controller_snapshot from the controller leader and apply fresh config_manager + feature_backend state]
    EV --> RDY[mark cluster config READY]
    RDY --> INIT[initialize: build memory_groups,<br/>re-set RPC server memory capacity]
    INIT --> RUN[wire up + start runtime services<br/>now sized from a consistent, fresh config]
    RUN --> R0[controller-&gt;start → raft0 replay<br/>now just tops up from an already-consistent base]

    style EV fill:#d9ffd9
    style RDY fill:#d9ffd9
Loading

Key change: both joiners and restarters obtain an authoritative
config_manager + feature_backend view from the controller leader before
the runtime is built — so everything downstream is sized and gated from fresh
values. The window between "not ready" and "ready" is guarded by
usable_before_ready (see the last commit).


Commit walkthrough

Commits 1→3: read-only kvstore access for bootstrap

We need read-only access to the kvstore early in bootstrap so we can read
persisted identity (node UUID, cluster UUID, feature snapshot, controller-log
presence) before the storage subsystem is fully started.

  1. storage: expose kvstore::recover() — surface kvstore::recover() (and
    an accessor on storage::api) so the kvstore's persisted state can be read
    for read-only purposes early in bootstrap, without a full storage start.
  2. storage: move around constructor logic in api — make the kvstore
    available right after storage::api's constructor, so the read-only accessor
    above is usable as early as possible.
  3. cluster: remove storage::api dependency injection in cluster_discovery
    cluster_discovery only needed node_uuid/cluster_uuid (constant once
    set). Pass those two values directly instead of injecting the whole
    storage::api, decoupling discovery from the storage lifecycle.

Commits 4→5: a chunk cache that doesn't force memory_groups() early

The chunk cache is needed by storage early, but its old singleton form
constructed memory_groups() eagerly — and we don't want to compute memory
group allotments before we actually know the (cluster-consistent) config that
determines them.

  1. storage: abstract out chunk_cache::preallocate() — split the pool
    pre-allocation out of start() into a dedicated preallocate() step, so
    starting the cache and warming its pool become independent.
  2. storage: remove chunk_cache singleton — the process-wide
    thread_local singleton was a mistake. Encapsulate the cache inside
    storage_resources (reachable everywhere the cache was used). Lifecycle is now
    api::start()storage_resources::start()chunk_cache::start(prealloc),
    where prealloc lets tests skip the up-front pool fill; add()/get()
    assert_started() to catch misuse. Crucially, this defers construction of
    memory_groups() until storage::api::start()
    , instead of at first
    singleton access during early bootstrap.

Commit 6 — well-contained fix

  1. kvstore: set is_compaction_enabled=false flag — during kvstore segment
    recovery we were reaching into an override-less ntp_config and probing the
    cluster default log_cleanup_policy, which is meaningless in the kvstore
    context. Pass is_compaction_enabled=false explicitly instead.

Commits 7→9: size the bootstrap RPC server without memory_groups()

We need the internal RPC server up early to run the bootstrap_service (and
later fetch_controller_snapshot). But sizing its per-core memory from
memory_groups() would force that computation before we trust the config. So
size it from a fixed placeholder during bootstrap, then adjust it afterward.

  1. config: add rpc_server_bootstrap_memory — a fixed placeholder for
    max_service_memory_per_core used while bootstrapping, so the bootstrap RPC
    server doesn't have to consult memory_groups().
  2. utils: add adjustable_semaphore::underlying() — expose the underlying
    semaphore, needed to resize capacity later.
  3. net: use adjustable_semaphore in server — back net::server's
    _memory limit with an adjustable_semaphore and add set_memory_capacity().
    The server starts sized from rpc_server_bootstrap_memory; once bootstrap is
    complete and memory_groups() is trustworthy, its capacity is adjusted to the
    real allotment.

Commits 10→15 — preparation for the snapshot fetch

  1. cluster: make members_manager::read_members_from_kvstore() static
    so it can be called without a members_manager instance during bootstrap.
  2. cluster: template-ize maybe_make_join_snapshot — move it to the
    header as a variadic template, so a caller can choose exactly which
    controller-STM backends to include in a snapshot (needed for the fetch RPC).
  3. cluster: apply config_manager before parallel snapshot batch — pull
    config_manager::apply_snapshot out of the ss::when_all parallel block and
    run it first, serially. Downstream backends (members, topics, security,
    quota, cluster_link, …) may read shard_local_cfg() during their own
    apply_snapshot, so config must be fresh before they run.
  4. cluster: stop promoting pending values at start — drop
    promote_all_pending() from config_manager::start(). Since
    shard_local_cfg() is now hydrated from the controller snapshot before
    bootstrap proceeds, any cluster_config_delta_cmd replayed afterward is a
    change the cluster made after hydration; for needs_restart properties
    it must stay pending (the strict reading of needs-restart semantics) rather
    than retroactively mutating the running node.
  5. cluster: factor out is_request_logical_version_compatible — small
    extraction currently gating join_node_request; reused next for the fetch
    RPC.
  6. cluster: add fetch_controller_snapshot RPC — the heart of the new
    approach. A new RPC that returns the controller leader's latest in-memory
    config_manager + feature_backend state. It is called unconditionally
    during bootstrap — whether the node is restarting or joining — because a node
    that was down a long time has only stale local config/features, and these two
    pieces of global state must be brought to a consistent view early in the
    node's lifetime.

Commit 16 — redpanda: the great config bootstrap fix

This is the payload that rewires application::run() to use everything above.
The new start-up order is:

wire_up_and_start_crypto_services();   // crypto first (TLS cipher validation, etc.)
wire_up_bootstrap_services();          // feature_table, storage::api, smp groups
mark_config_ready(false);              // cluster config is NOT trustworthy yet
hydrate_cluster_config(node_cfg_yaml); // preload local cache / bootstrap file
bootstrap_from_kvstore();              // read-only recover: node/cluster UUID, feature snapshot
wire_up_and_start_rpc_service();       // bootstrap RPC server (sized from rpc_server_bootstrap_memory)
establish_cluster_view();              // fetch_controller_snapshot → apply → mark_config_ready(true)
log_cluster_config();                  // now safe to log the consistent config
initialize();                          // build memory_groups(); re-set RPC server memory capacity
setup_metrics();
wire_up_and_start(...);                // runtime services; controller->start → raft0 replay

What changed conceptually (before → after):

Concern Before After
Restarter's config/features at early bootstrap local cache + old snapshot (stale) refreshed from controller leader via fetch_controller_snapshot
When the cluster-consistent view exists after raft0 replay (late, post-runtime) before runtime services are built
kvstore at early bootstrap needed full start recovered read-only
RPC server memory memory_groups() (from stale config) rpc_server_bootstrap_memory placeholder, adjusted post-bootstrap
memory_groups() computed early on stale config computed in initialize() after the view is consistent
Stale shard_local_cfg() reads silent guarded by usable_before_ready (next commit)
sequenceDiagram
    participant N as Booting node
    participant K as local kvstore
    participant L as controller leader
    N->>K: recover() (read-only): node/cluster UUID, feature snapshot
    N->>N: mark_config_ready(false)
    N->>N: hydrate local config cache
    N->>N: start bootstrap RPC server (rpc_server_bootstrap_memory)
    N->>L: fetch_controller_snapshot (config_manager + feature_backend)
    L-->>N: authoritative snapshot
    N->>N: apply snapshot → mark_config_ready(true)
    N->>N: initialize(): build memory_groups, resize RPC memory
    N->>N: start runtime services on a consistent config
    N->>N: controller->start → raft0 replay (tops up from consistent base)
Loading

Commit 17 — config: add usable_before_ready flag

  1. A guardrail against acting on stale config. shard_local_cfg() is marked
    not-ready until establish_cluster_view() completes; reading any property's
    active value while not-ready trips an assertion unless the property opted
    in via usable_before_ready::yes. Default is ::no (strict). Properties that
    are genuinely needed during bootstrap and whose stale value is harmless are
    flagged after audit. See the appendix for the full list and rationale.

Commit 18 — redpanda: add test_cfg and chunk_cache_prealloc

  1. Now that chunk_cache is no longer a process wide singleton, fixture
    tests that allocate several applications to simulate multi-broker
    tests allocate chunk_cache N times per shard instead of 1, leading to potential
    memory issues. Add test_cfg to redpanda, encapsulating the existing
    cloud_topics::test_cfg and adding a new flag chunk_cache_prealloc,
    which is true by default for production use.

Commit 19 — rptest: add MultiNodeBootstrapTest

  1. End-to-end ducktape coverage: brings up multiple nodes and exercises the new
    bootstrap path (restart and join), asserting that a node converges to the
    cluster's configuration/feature view at start-up rather than lagging until
    raft0 replay.

Appendix: properties flagged usable_before_ready

These are read during bootstrap, before shard_local_cfg() is ready. Each is
audited as safe to read with a potentially-stale local value (the consumer is
re-synced/reconstructed after the view is established, or the value has no
lasting effect).

  • Internal RPC bootstrap server (constructed before the cluster view; metrics
    start disabled, memory capacity re-applied afterward): rpc_server_bootstrap_memory,
    rpc_server_listen_backlog, rpc_server_tcp_recv_buf, rpc_server_tcp_send_buf,
    rpc_server_compress_replies.
  • Metrics: disable_metrics, disable_public_metrics, aggregate_metrics.
  • TLS for the bootstrap RPC server's reloadable credentials: tls_min_version,
    tls_enable_renegotiation, tls_v1_2_cipher_suites, tls_v1_3_cipher_suites.
  • smp service groups (sized in wire_up_bootstrap_services):
    raft_smp_max_non_local_requests, pp_sr_smp_max_non_local_requests,
    topic_partitions_per_shard.
  • storage::api / kvstore read-only recovery + storage_resources:
    kvstore_max_segment_size, kvstore_flush_interval, append_chunk_size,
    storage_read_buffer_size, storage_read_readahead_count,
    storage_target_replay_bytes, storage_max_concurrent_replay.
  • Cluster discovery / join: join_retry_timeout_ms.

What this PR doesn't fix:

The potential race/log invariant violation caused by reading a config cache that is more up to date than the controller snapshot (e.g. we have read-ahead in the log's history and then replay other entries out of order). It may be the case that we want to eliminate the config cache read entirely in favour of the controller snapshot.

  flowchart LR
      subgraph correct["✅ Correct order (log is source of truth)"]
          direction LR
          c5["offset 5<br/>snapshot: X=old"] --> c7["offset 7<br/>reads X ⇒ old"] --> c10["offset 10<br/>writes X=new"]
      end

      subgraph broken["❌ Replay reading kvstore first"]
          direction LR
          b0["kvstore preloaded<br/>X=new (from @10)"] --> b5["offset 5<br/>snapshot"] --> b7["offset 7<br/>reads X ⇒ new"]
  --> b10["offset 10<br/>writes X=new"]
          b7 -. "future state leaks<br/>into the past" .-> b0
      end
Loading

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v26.1.x
  • v25.3.x
  • v25.2.x

Release Notes

Bug Fixes

  • Fixes a bug in which stale configuration values could be read by joining or restarting redpanda nodes during the bootstrapping process.

Copilot AI review requested due to automatic review settings May 6, 2026 22:03
@WillemKauf WillemKauf requested a review from a team as a code owner May 6, 2026 22:03
@WillemKauf WillemKauf force-pushed the config_bootstrap_fix branch from 4a97a52 to 4755fa7 Compare May 6, 2026 22:04

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reworks Redpanda’s startup/bootstrap sequencing to prevent early reads of stale cluster configuration during node join/restart. It introduces a “config readiness” gate for config::shard_local_cfg(), splits kvstore recovery from writer startup, and adds an RPC path to fetch a leader-authoritative controller snapshot for restarting nodes.

Changes:

  • Add a readiness check to config::shard_local_cfg() plus an *_unsafe() escape hatch for early bootstrap.
  • Split kvstore recovery into an explicit phase, and restructure application bootstrap to recover/apply snapshots before marking config “ready”.
  • Add a fetch_controller_snapshot RPC and controller snapshot plumbing to support leader-authoritative bootstrap for restarting nodes.

Reviewed changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/v/storage/storage_resources.h Extend ctor signature and add API to update append chunk size during bootstrap.
src/v/storage/storage_resources.cc Plumb append chunk size into storage_resources initialization and add updater method.
src/v/storage/kvstore.h Make recover() a public entry point and add start(recover_t) mode switch.
src/v/storage/kvstore.cc Implement split recovery/start flow; adjust read APIs to require recovery rather than full start.
src/v/storage/api.h Construct kvstore earlier and add recover_kvstore() bootstrap hook before start().
src/v/redpanda/BUILD Add Bazel deps needed by updated application/bootstrap compilation units.
src/v/redpanda/application.h Reshape bootstrap APIs into async helpers for applying snapshots/config and discovery.
src/v/redpanda/application.cc Move cluster-config hydration earlier and ensure feature-table metrics are set up during metrics init.
src/v/redpanda/application_start.cc Use stored cluster_discovery instance and remove passing discovery by reference into runtime start.
src/v/redpanda/application_config.cc Use shard_local_cfg_unsafe() for bootstrap-time kvstore config extraction.
src/v/redpanda/application_bootstrap.cc Rework bootstrap sequence: recover kvstore, apply snapshots, discovery/join, then mark config ready.
src/v/features/feature_table.h Add setup_metrics() API.
src/v/features/feature_table.cc Defer probe/metrics setup from ctor to explicit setup_metrics().
src/v/config/node_config.cc Use shard_local_cfg_unsafe() during node-config YAML loading/ignores.
src/v/config/configuration.h Add readiness flag + document shard_local_cfg() vs shard_local_cfg_unsafe().
src/v/config/configuration.cc Implement shard_local_cfg_unsafe() and guard shard_local_cfg() with readiness vassert.
src/v/cluster/types.h Add serde RPC request/reply types for fetching a controller snapshot.
src/v/cluster/service.h Add controller service RPC endpoint for fetch_controller_snapshot.
src/v/cluster/service.cc Implement RPC handler to delegate snapshot fetch logic to members manager.
src/v/cluster/members_manager.h Declare handler to produce/forward leader-authoritative controller snapshot replies.
src/v/cluster/members_manager.cc Implement snapshot fetch path (forward to leader if needed; leader builds partial join snapshot).
src/v/cluster/controller.json Register new fetch_controller_snapshot RPC in controller protocol schema.
src/v/cluster/controller.cc Adjust controller start wiring after config_manager ctor signature change.
src/v/cluster/controller_stm.h Generalize join snapshot construction via templated backend selection.
src/v/cluster/controller_stm.cc Reorder snapshot application so config manager applies earlier; remove old join snapshot impl.
src/v/cluster/config_manager.h Remove cluster recovery table dependency from config manager constructor/state.
src/v/cluster/config_manager.cc Switch more bootstrap-time reads to shard_local_cfg_unsafe() and adjust replay/bootstrap flow.
src/v/cluster/cluster_discovery.h Update discovery ctor signature and add controller snapshot fetch API (with stale-comment mismatch).
src/v/cluster/cluster_discovery.cc Implement new controller snapshot fetch RPC path and decouple discovery from storage::api.
Comments suppressed due to low confidence (1)

src/v/cluster/config_manager.cc:193

  • config_manager::apply_local() still stages updates to needs_restart properties via set_pending_value(...), but config_manager::start() no longer promotes pending values after STM replay. This means nodes may never apply needs_restart config updates even after a restart (pending values remain invisible indefinitely). Reintroduce a promotion point after replay (as before), or otherwise ensure pending values are promoted at least once during startup after replay completes.
ss::future<> config_manager::start() {
    if (_seen_version == config_version_unset) {
        vlog(clusterlog.trace, "Starting config_manager... (initial)");

        // Background: wait till we have a leader. If this node is leader

Comment thread src/v/storage/api.h Outdated
Comment thread src/v/storage/kvstore.cc
Comment thread src/v/config/configuration.h Outdated
Comment thread src/v/cluster/cluster_discovery.h Outdated
@WillemKauf WillemKauf force-pushed the config_bootstrap_fix branch 4 times, most recently from 72cd608 to 0140ac7 Compare May 7, 2026 16:10
@vbotbuildovich

This comment was marked as outdated.

@WillemKauf WillemKauf force-pushed the config_bootstrap_fix branch 6 times, most recently from 3ec0835 to 2058bd9 Compare May 8, 2026 02:09
@vbotbuildovich

This comment was marked as outdated.

@vbotbuildovich

This comment was marked as outdated.

@WillemKauf WillemKauf force-pushed the config_bootstrap_fix branch 2 times, most recently from 625236c to d9c2e30 Compare May 8, 2026 19:46
@vbotbuildovich

This comment was marked as outdated.

@WillemKauf WillemKauf force-pushed the config_bootstrap_fix branch from d9c2e30 to 7ac0443 Compare May 8, 2026 21:33
@WillemKauf WillemKauf changed the title WIP: The great config bootstrap fix redpanda: the great config bootstrap fix May 8, 2026
@vbotbuildovich

This comment was marked as outdated.

@WillemKauf WillemKauf force-pushed the config_bootstrap_fix branch from 7ac0443 to 07a3683 Compare May 9, 2026 05:09
@WillemKauf WillemKauf changed the title redpanda: the great config bootstrap fix [CORE-13707] redpanda: the great config bootstrap fix May 9, 2026
@WillemKauf WillemKauf closed this May 12, 2026
@WillemKauf WillemKauf reopened this May 13, 2026
@WillemKauf WillemKauf force-pushed the config_bootstrap_fix branch 2 times, most recently from 697cfdc to e4e5cfe Compare June 8, 2026 22:44
WillemKauf added 11 commits June 8, 2026 19:00
This singleton was a bad idea. Instead, encapsulate it within
`storage_resources`, which is also accessible everywhere `chunk_cache`
is currently used.

This commit:
* Removes the singleton accessor for `chunk_cache`
* Adds the `_chunk_cache` object storage to `storage_resources`
* Reworks the `start()` logic for `chunk_cache`
  * `api::start()` -> `storage_resources::start()` -> `chunk_cache::start()`,
    but with an bool argument `prealloc`. Meant for tests, where we previously
    wouldn't preallocate the chunk cache upfront when calling `api::start()`.
  * Public `chunk_cache` APIs `add()` and `get()` now called `assert_started()`.
  * This ensures we allocate chunks in production, while allowing tests to
    continue to not preallocate.
* Ensures tests which require the `chunk_cache` properly start/stop their
  encapsulated `storage_resources`.

This also accomplishes the goal of deferring construction of `memory_groups()`
before `storage::api::start()` is called during the bootstrapping/start-up
process in `redpanda`.
Otherwise we reach into the `ntp_config` (with 0 overrides) and try to
probe the cluster default for `log_cleanup_policy`, which is incorrect
in the `kvstore` context.
We are going to be using `rpc_server_bootstrap_memory` as a placeholder
value for `cfg.max_service_memory_per_core` during the bootstrap process
as a way of avoiding a call to `memory_groups()` early on in the bootstrap
process.

In order to correctly size the memory allotment for the server post bootstrap,
we need a way of adjusting the number of units used for the semaphore
`_memory` in `net::server`.

Use an `adjustable_semaphore` for this purpose and add a new function
`set_memory_capacity()` to adjust the semaphore's capacity. This will be called
after the bootstrapping process is complete and `memory_groups` is considered
usable/trustworthy.
Will be used in the future as a static member function without a
`members_manager` instance.
We are going to want control over what backends we need to include
in a `controller_stm` snapshot in a future commit. Move this function
to the header and use a variadic template for this purpose.
Pull config_manager::apply_snapshot out of the ss::when_all parallel
block and run it serially after feature_backend and before
members_manager. Downstream backends (members, topics, plugin,
recovery, security, quota, migrations, cluster_link) may consult
shard_local_cfg during their own apply_snapshot work; sequencing
config_manager first guarantees they see fresh values.
Removes the promote_all_pending() call from config_manager::start().
Pending values that accumulated during the post-bootstrap controller
log replay (i.e. cluster_config_delta_cmd records applied above the
hydrated snapshot offset) now stay pending instead of being promoted
when config_manager::start runs.

This is the strict reading of needs_restart=yes semantics: a property
change visible to a node only after it restarts. shard_local_cfg() is
already hydrated from the controller snapshot before bootstrap
proceeds (see application::bootstrap_cluster_config_view), so any
delta replayed afterward is a change the cluster made *after* the
hydration point and should not retroactively affect the running
node's active values for needs_restart properties.
A bit of code extrapolated out. Currently used for gating
`join_node_request`s, will be used in a future commit for handling
`fetch_controller_snapshot` requests as well.
A new RPC to fetch a controller snapshot, containing the latest in-memory
state of the controller leader's `config_manager` and `feature_backend`.

This RPC will be called unconditionally during bootstrap of a `redpanda`
broker, whether that node is simply restarting or is joining for the first
time. The reason for this being that a local node who is down for a long
time will have only stale configurations and features in its local
cache/snapshots/log, and the configuration and feature managers are two global
pieces of state that should be brought to a consistent view early on in a
`redpanda` node's lifetime.
@WillemKauf WillemKauf force-pushed the config_bootstrap_fix branch from e4e5cfe to aa327c3 Compare June 8, 2026 23:00
@vbotbuildovich

This comment was marked as outdated.

An extra layer of security against using potentially stale configs
out of the `shard_local_cfg()`. Defaults to `usable_before_ready::no`
to be strict. Properties which are used in the bootstrapping process
before a proper view of the cluster configuration can be established
and which have no impact on the behavior of the system if read stale
can be safely marked as `usable_before_ready::yes`.
Now that `chunk_cache` is no longer a process wide singleton, fixture
tests that allocate several `application`s to simulate multi-broker
tests allocate `chunk_cache` N times instead of 1, leading to potential
memory issues.

Add `test_cfg` to `redpanda`, encapsulating the existing
`cloud_topics::test_cfg` and adding a new flag `chunk_cache_prealloc`,
which is `true` by default for production use.
@WillemKauf WillemKauf force-pushed the config_bootstrap_fix branch from aa327c3 to 083ed0f Compare June 9, 2026 02:58
@sjust-redpanda sjust-redpanda self-requested a review June 10, 2026 23:47
Comment thread src/v/storage/kvstore.cc
Comment on lines -118 to -119
.handle_exception_type([](const ss::gate_closed_exception&) {
lg.trace("Shutdown requested during recovery");

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any particular reason for removing this?

Comment thread src/v/storage/kvstore.h

/*
* Recovery
* Recovery (recover() itself is a public entry point declared above)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: shouldn't this comment be attached to recover()?

Comment thread src/v/storage/api.h
_kvstore = std::make_unique<kvstore>(
_kv_conf_cb(), ss::this_shard_id(), _resources, _feature_table);
return _kvstore->start().then([this] {
if (!_kvstore) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

q: why do we need this? just for start/stopping storage in tests?

Comment on lines +39 to +40
co_await _storage.invoke_on_all(
[](storage::api& a) { return a.start(); });

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: what's the difference?

EDIT: ah, default param

Comment on lines +133 to +134
* unaffected; callers that signal()/consume() directly through the
* underlying handle bypass capacity bookkeeping.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this feels a little sketchy, but i guess for our use case the whole point is that we only update the capacity at one point post-bootstrap?

Comment on lines -214 to -217
ss::future<> promote_all_pending() {
co_await ss::smp::invoke_on_all(
[] { config::shard_local_cfg().promote_pending(); });
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iiuc the idea here is that, post the great config bootstrap fix, by the time we apply the local cache it is fully up-to-date with the a valid config snapshot from the controller leader, right?

nitpick: update commit message. only mentions start right now
nitpick: should this commit go on the other side of the actual bootstrap fix? bit easier to see it's correct that way, and then we'll have a valid bisection point just in case.

Comment on lines -1099 to -1102
// Snapshot application is a wholesale state replacement — promote
// all pending values immediately so needs_restart properties take
// effect (e.g. during cluster recovery).
co_await promote_all_pending();

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checking my understanding: we no longer need this because we'll be applying a snapshot earlier in the boostrap process and ultimately forcing pending values to active in apply_local?

@oleiman oleiman left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made it up to config store changes. looking pretty good!

@oleiman oleiman self-requested a review June 17, 2026 16:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants