[APMSVLS-501] refactor(bottlecap): preparatory work for testmode binary by lucaspimentel · Pull Request #1201 · DataDog/datadog-lambda-extension

lucaspimentel · 2026-04-23T01:31:50Z

Overview

Preparatory refactors for the upcoming bottlecap-testmode binary (APMSVLS-501), which reuses TraceAgent / handle_traces but has no Lambda lifecycle to drive. Three logical changes:

1. `InvocationProcessorHandle::noop()`

Constructor backed by a background task that acknowledges every ProcessorCommand with a sensible default so callers never block on their response oneshots:

Request-response commands (GetReparentingInfo, UpdateReparenting, SetColdStartSpanTraceId, PlatformRuntimeDone, PlatformReport) reply with empty/default values.
Fire-and-forget commands are dropped silently.

The match is exhaustive: adding a new ProcessorCommand variant will cause a compile error here, forcing test-mode behavior to be decided explicitly. A response-carrying variant placed in the fire-and-forget arm would silently drop its sender, causing the caller to receive ProcessorError::ChannelReceive instead of the intended default.

The noop channel capacity matches the real InvocationProcessorService to avoid backpressure surprises.

2. Promote `start_trace_agent` to a top-level library module

Moved the start_trace_agent helper from the private bin/bottlecap/main.rs into a new public module, bottlecap::startup, so callers outside the Lambda binary can construct the full trace-processing pipeline through a single call instead of duplicating ~110 lines of wiring. Placed at the crate root (not under traces/) because the module wires trace, stats, proxy, config, lifecycle, tags, appsec, and flushing pieces together: it is cross-cutting orchestration, not a trace-domain API. No behavior change; the Lambda binary's call site is unchanged (same function name, same signature, same internal spawn).

The companion TraceAgentPipeline return type is a pub struct with named pub fields (trace/stats/proxy flushers, the trace-channel sender, the shutdown token, and the aggregator/concentrator handles), rather than a positional tuple.

3. `RouterExtension` seam on `TraceAgent`

Adds a generic extension point that lets a caller merge additional axum routes into the trace-agent's HTTP router without TraceAgent knowing what those routes do. When no caller attaches an extension (the Lambda binary case), the HTTP surface on port 8126 is unchanged.

New pub trait RouterExtension: Send + Sync { fn extend(&self, router: Router) -> Result<Router, Box<dyn Error>>; } defined in traces::trace_agent. Returning Err aborts agent startup and surfaces the error in the existing log path, preventing silent failures from a panicking or misconfigured extension.
TraceAgent holds router_extension: Option<Arc<dyn RouterExtension>> with a consuming builder method with_router_extension(self, ext: Arc<dyn RouterExtension>) -> Self.
make_router calls extension.extend(router)? when the field is set, before applying the outer fallback and body-limit layers.
Split the library helper in startup.rs into build_trace_agent (returns an unspawned TraceAgent plus the TraceAgentPipeline handles) and a thin start_trace_agent wrapper that spawns. Callers that need to attach a RouterExtension before startup use build_trace_agent; the Lambda binary's call site continues to use start_trace_agent.
A #[cfg(test)]-only SpyExtension validates the seam end-to-end: the trait shape compiles, state is carried via Arc, and merged routes are reachable through the composed Router returned by make_router.

Any concrete flush/drain/diagnostic endpoint lives behind an impl of this trait in the consumer crate, not in trace_agent.rs.

Testing

cargo check --bin bottlecap
cargo check --lib
cargo test --lib
cargo clippy --workspace --all-targets --features default -- -D warnings
cargo fmt --all -- --check
cargo nextest run --workspace (532/532 passed)
New unit tests on InvocationProcessorHandle::noop(): noop_request_response_methods_return_defaults, noop_fire_and_forget_commands_do_not_panic, noop_platform_runtime_done_and_report_respond_without_blocking, noop_request_response_variants_complete_within_timeout
New unit tests on the RouterExtension seam: with_router_extension_adds_reachable_route_to_make_router, make_router_returns_404_for_extension_route_when_none_attached

"Why did the noop handler go to therapy? Because it had to acknowledge every command without ever really doing anything." — Claude 🤖

Copilot

Pull request overview

Preparatory refactors to enable an upcoming bottlecap-testmode binary to reuse the trace-processing pipeline (TraceAgent / handle_traces) without relying on the Lambda lifecycle, and to optionally expose a deterministic flush hook for test harnesses.

Changes:

Added InvocationProcessorHandle::noop() backed by a background task that acknowledges all ProcessorCommand variants with sensible defaults (plus unit tests).
Promoted start_trace_agent wiring from the Lambda binary into a reusable library module (bottlecap::traces::startup), including a build_trace_agent variant that returns an unspawned TraceAgent.
Added an opt-in POST /flush route on the trace-agent listener when a FlushingService is attached via TraceAgent::with_flushing_service.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
bottlecap/src/traces/trace_agent.rs	Adds optional `FlushingService` plumbing and conditionally registers `POST /flush`.
bottlecap/src/traces/startup.rs	Introduces shared library helpers to build/spawn the trace-agent pipeline.
bottlecap/src/traces/mod.rs	Exposes the new `startup` module and re-exports pipeline helpers/types.
bottlecap/src/lifecycle/invocation/processor_service.rs	Adds `InvocationProcessorHandle::noop()` plus tests to ensure defaults/non-blocking behavior.
bottlecap/src/bin/bottlecap/main.rs	Switches Lambda binary to use the new library `start_trace_agent` helper and removes duplicated wiring.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 723059ecf6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

duncanista · 2026-04-24T04:10:08Z

this is binary specific code, not traces component related code, you could create a file/module in the bottlecap binary folder instead of here, those methods are supposed to be binary specific, this breaks that pattern

duncanista

Review

Appreciate the thorough PR description and the preparatory structure here. The noop() constructor and build_/start_trace_agent split are solid ideas. However, there are several Rust style and structural issues worth addressing before this merges, especially since this is establishing patterns that the test-mode binary will build on.

1. `TraceAgentPipeline` — 8-element tuple as public API

pub type TraceAgentPipeline = (
    Sender<SendDataBuilderInfo>,
    Arc<trace_flusher::TraceFlusher>,
    Arc<trace_processor::ServerlessTraceProcessor>,
    Arc<stats_flusher::StatsFlusher>,
    Arc<ProxyFlusher>,
    CancellationToken,
    StatsConcentratorHandle,
    TraceAggregatorHandle,
);

In Rust, tuples beyond 3-4 elements are a code smell. At every call site, someone destructures by position — if the order changes, or two fields happen to share a type, it silently breaks. Now that this is being promoted from a private function's return type to a pub type alias in the library, it should be a struct with named fields:

pub struct TraceAgentPipeline {
    pub trace_channel: Sender<SendDataBuilderInfo>,
    pub trace_flusher: Arc<trace_flusher::TraceFlusher>,
    pub trace_processor: Arc<trace_processor::ServerlessTraceProcessor>,
    // ...
}

Named fields are self-documenting, order-independent, and play nicely with IDE autocompletion. The original code had this problem, but promoting it to library API is the right time to fix it.

2. `startup.rs` doesn't belong in `traces/`

(Expanding on Jordan's inline comment.)

build_trace_agent imports from config, lifecycle, tags, traces, appsec, and flushing — it touches nearly every module in the crate. It's application-level wiring/orchestration, not a traces component. Placing it in traces::startup misrepresents its role and breaks the pattern where traces/ contains trace-domain code.

The existing pattern is that each binary owns its startup wiring in bin/<name>/main.rs. If sharing between the Lambda and test-mode binaries is truly needed, consider:

A top-level bottlecap::startup module (not nested under a domain module)
Or a shared helper in bin/ if both binaries live in the same crate

Re-exporting via pub use startup::{...} from traces::mod.rs further blurs the boundary — callers see bottlecap::traces::start_trace_agent and assume it's trace-domain API.

3. `flush` handler — over-engineered timeout logic

The current code (after accepting Copilot's suggestion):

Err(_) => {
    task.abort();
    match task.await {
        Err(join_err) if join_err.is_cancelled() => StatusCode::GATEWAY_TIMEOUT,
        Err(_) => StatusCode::INTERNAL_SERVER_ERROR,
        Ok(()) => StatusCode::GATEWAY_TIMEOUT,
    }
}

After task.abort(), the await will always resolve to Err(JoinError::Cancelled) (the Ok(()) arm is only reachable in a vanishingly small race window, and even then the right status is still 504). The inner match has 3 arms that all effectively return the same thing. Simplify to:

Err(_) => {
    task.abort();
    StatusCode::GATEWAY_TIMEOUT
}

We don't need to await after abort — we already know the status code. Simpler code is also more correct code because there are fewer branches to reason about.

4. Noop tests — reach through internals instead of using the public API

The third test (noop_platform_runtime_done_and_report_respond_without_blocking) is 50+ lines that manually construct ProcessorCommand::PlatformRuntimeDone and ProcessorCommand::PlatformReport, sending them through handle.sender directly. But InvocationProcessorHandle already has public methods for both:

handle.on_platform_runtime_done(...)
handle.on_platform_report(...)

The test should use those. Testing through the public API is more resilient to internal refactors (e.g., if the channel type changes) and is more readable — the reader immediately understands what behavior is being tested rather than having to mentally map command construction to handler behavior. It also avoids the 15+ lines of field construction for each command variant.

5. Doc comments leak consumer knowledge into library functions

build_trace_agent's doc references "bottlecap-testmode binary" and "POST /flush route" repeatedly. Library functions should document what they do and why a caller would use them, not who specifically calls them. Consumer-specific guidance belongs in the consumer's code. For example:

"Returns an unspawned TraceAgent so the caller can configure it (e.g., attach optional routes) before spawning."

is better than:

"notably the bottlecap-testmode binary, which uses it to back a POST /flush route"

6. `flushing_service: Option<Arc<FlushingService>>` on `TraceAgent`

Adding a test-only field to a production struct is pragmatic for one field, but worth flagging as a pattern to not repeat. Every TraceAgent in production carries a None that exists solely for test-mode. If more test-mode hooks are needed later, consider a trait or a router-extension pattern instead of accumulating Option fields.

Summary

Area	Suggestion
`TraceAgentPipeline`	Replace 8-tuple with a named struct
`startup.rs` location	Move out of `traces/`, it's orchestration not trace-domain code
`flush` timeout	Simplify the post-abort match to just return 504
Noop tests	Use `handle.on_platform_runtime_done()` / `on_platform_report()` instead of raw channel sends
Doc comments	Remove consumer-specific references from library function docs
`flushing_service` field	Fine for now, but flag as not-to-repeat pattern

Happy to discuss any of these — the overall direction is good, these are about getting the Rust patterns right before the test-mode binary builds on top.

Tuples beyond 3-4 fields are fragile: every call site destructures by position, and any reordering (or two fields of the same type swapping) breaks silently. Now that the pipeline is a pub API type shared between the Lambda binary and future test-mode binary, promote it to a struct with named pub fields. Addresses review feedback on PR #1201.

This module wires trace, stats, proxy, config, lifecycle, tags, appsec, and flushing pieces together. It is cross-cutting orchestration, not trace-domain code, so placing it under traces/ misrepresents its role and makes callers treat bottlecap::traces::start_trace_agent as if it were a trace-domain API. Promote it to bottlecap::startup at the crate root so both the Lambda binary and the forthcoming test-mode binary can consume it without reaching into a domain module. Addresses review feedback on PR #1201.

After task.abort(), the awaited JoinError is effectively always Cancelled, and all three arms of the post-abort match returned (or should have returned) 504 for the test harness. The extra match adds branches to reason about without changing the result, and the lone Err(_) -> 500 arm is unreachable in practice (a panic surfaces through the Ok(Err(_)) branch of the outer timeout). Drop the await-after-abort and return 504 directly. Addresses review feedback on PR #1201.

The test previously built ProcessorCommand::PlatformRuntimeDone and PlatformReport by hand and pushed them through handle.sender, making the reader mentally map raw command construction to the handler behavior and coupling the test to the channel internals. Use the already-public InvocationProcessorHandle methods, which exercise the same code path, drop ~15 lines of field plumbing per variant, and stay resilient to future refactors of the command channel. Addresses review feedback on PR #1201.

Library doc comments named the bottlecap-testmode binary and the POST /flush route it registers. Library APIs should document what they do and why a caller might reach for them, not which specific caller today happens to. Rewrite the affected doc comments (and one internal impl comment) to describe behavior and motivation generically: "a deterministic on-demand drain hook", "a handle that has no Lambda lifecycle state to drive", and so on. Addresses review feedback on PR #1201.

…ension trait The flushing_service: Option<Arc<FlushingService>> field was test-mode scaffolding on the production struct. Every TraceAgent in production carried a None that existed solely for a future test binary, and any additional test-mode hook would have had to add another Option field with the same shape, each pulling another consumer-specific dependency into trace_agent.rs. Replace it with a single generic extension seam: - pub trait RouterExtension { fn extend(&self, router: Router) -> Router; } - TraceAgent stores Option<Arc<dyn RouterExtension>> and exposes with_router_extension(self, ext) -> Self in place of with_flushing_service. - make_router calls extension.extend(router) when set, before applying the outer fallback and body-limit layers. - FLUSH_ENDPOINT_PATH, the flush handler, and the crate::flushing FlushingService import are removed from trace_agent.rs. The trace- agent module now knows nothing about FlushingService. Future test-mode hooks add new methods (with defaults) to the trait or new impls behind it, not new Option fields on TraceAgent. A #[cfg(test)]-only SpyExtension validates the seam end-to-end: trait shape compiles, state is carried via Arc, and merged routes are reachable through the composed Router. Addresses review feedback on PR #1201 (deferred comment 6).

lucaspimentel · 2026-04-24T20:17:01Z

@duncanista, all six items addressed across six commits on top of 605933a:

TraceAgentPipeline tuple → struct with named pub fields (6586fc5). Destructure by name at the call site, so reordering or same-typed-field swaps can't break silently.
startup.rs moved out of traces/ to a new top-level bottlecap::startup (80e7752). The Lambda binary calls bottlecap::startup::start_trace_agent now, no longer reaching into a domain module; the re-export from traces::mod.rs is gone.
POST /flush timeout arm collapsed to task.abort(); 504 (b043eb8). The redundant await-after-abort and the unreachable 500 branch are gone; a panic still surfaces through the outer Ok(Err(_)) arm as 500.
Noop test uses handle.on_platform_runtime_done / on_platform_report (962f394) instead of constructing ProcessorCommand variants by hand. About 15 lines of per-variant field plumbing removed.
Consumer-specific references stripped from library doc comments (5fa5bb2). No more bottlecap-testmode or POST /flush route naming in build_trace_agent, with_router_extension, InvocationProcessorHandle::noop, etc. The library docs describe behavior, not today's caller.
flushing_service field replaced with a RouterExtension trait (043d991, bc31322). Swapped the Option<Arc<FlushingService>> field for a single generic slot Option<Arc<dyn RouterExtension>> with pub trait RouterExtension { fn extend(&self, router: Router) -> Router; }. TraceAgent no longer imports or references FlushingService; the flush handler and FLUSH_ENDPOINT_PATH const are gone. Future test-mode hooks become new trait impls (or default-implemented methods added to the trait), not new Option fields. A cfg(test) test builds a real TraceAgent, attaches a spy extension via with_router_extension, and verifies the route is reachable through make_router; a second test confirms the None path still falls through to 404.

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Copilot · 2026-04-24T23:17:26Z

+    // Build one shared hyper-based HTTP client for trace and stats flushing.
+    // This client type is required by libdd_trace_utils for SendData::send().
+    let trace_http_client = trace_http_client::create_client(
+        config.proxy_https.as_ref(),
+        config.tls_cert_file.as_ref(),
+        config.skip_ssl_validation,
+    )
+    .expect("Failed to create trace HTTP client");
+


build_trace_agent is a public library API but it uses .expect("Failed to create trace HTTP client"), which will panic on common misconfigurations (invalid proxy URL/cert file, IO errors). Consider returning a Result<(TraceAgent, TraceAgentPipeline), Box<dyn Error + Send + Sync>> (and similarly making start_trace_agent return Result<TraceAgentPipeline, _>) so callers can handle startup failures without aborting the process.

Copilot · 2026-04-24T23:17:27Z

+    /// Returns a handle backed by a background task that acknowledges every
+    /// command with a sensible default. Use this for callers that need an
+    /// `InvocationProcessorHandle` (for example, to reuse `TraceAgent` or
+    /// `handle_traces`) but have no Lambda lifecycle state to drive.
+    ///
+    /// The match below is exhaustive: adding a new `ProcessorCommand` variant
+    /// will fail to compile here, so the noop behavior for it must be
+    /// decided explicitly. A response-carrying variant placed in the
+    /// fire-and-forget arm would silently drop its sender, causing the
+    /// caller to receive `ProcessorError::ChannelReceive` instead of the
+    /// intended default.
+    #[must_use]
+    pub fn noop() -> Self {
+        let (sender, mut receiver) = mpsc::channel::<ProcessorCommand>(1000);
+        tokio::spawn(async move {
+            while let Some(command) = receiver.recv().await {


InvocationProcessorHandle::noop() spawns a Tokio task internally. If this constructor is called outside of an active Tokio runtime, tokio::spawn will panic at runtime. Please either document this requirement explicitly in the doc comment (e.g., "must be called from within a Tokio runtime") or provide an alternative constructor that takes a tokio::runtime::Handle (or is async) to avoid surprising panics for library callers.

The forthcoming bottlecap-testmode binary (APMSVLS-511) reuses TraceAgent and handle_traces but has no Lambda lifecycle to drive. Adds a noop constructor that spawns a background task acknowledging every ProcessorCommand with a sensible default, so callers never block on their response oneshots. The match is exhaustive: a new ProcessorCommand variant forces a compile error here, keeping test-mode behavior explicit rather than silently dropping responses.

…isfy clippy 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

Moves the start_trace_agent helper from the private bin/bottlecap/main.rs into a new public module, bottlecap::traces::startup, so callers outside the Lambda binary (notably the forthcoming bottlecap-testmode binary) can construct the full trace-processing pipeline through a single call instead of duplicating ~110 lines of wiring. No behavior change. The Lambda binary's call site is unchanged: same function name, same signature, same internal spawn. 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

…ode /flush hook Adds an opt-in POST /flush route to the trace-agent listener on 8126, backed by a caller-supplied FlushingService. The route is only registered when the caller calls TraceAgent::with_flushing_service; otherwise the Lambda binary's HTTP surface is unchanged. - Add an optional flushing_service field to TraceAgent with a consuming builder method with_flushing_service(self, fs) -> Self. - In make_router, conditionally merge a /flush sub-router that calls FlushingService::flush_blocking_final when the field is set. - Split the library helper in src/traces/startup.rs into build_trace_agent (returns an unspawned TraceAgent plus the pipeline handles) and a thin start_trace_agent wrapper that spawns. Test-mode will use build_trace_agent so it can attach its FlushingService before spawning; Lambda's call site is unchanged. - No behavior change for the Lambda binary: it never attaches a flushing service, so /flush remains unexposed and start_trace_agent still spawns internally. Preparatory for APMSVLS-511 (bottlecap-testmode binary). 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

build_trace_agent returns (TraceAgent, TraceAgentPipeline), a 2-tuple whose complexity is hidden behind the type alias — the lint never fires. 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

Callers using build_trace_agent need the return type alias accessible via the same crate::traces path as the function itself. 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

Wrap flush_blocking_final in a spawned task so panics return 500 instead of tearing down the handler, and add a 30-second timeout so a stuck flush returns 504 rather than hanging the test harness. 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

Add a test that sends PlatformRuntimeDone and PlatformReport commands directly through the noop handle's internal channel and verifies the response oneshots are fulfilled without blocking. 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

Tuples beyond 3-4 fields are fragile: every call site destructures by position, and any reordering (or two fields of the same type swapping) breaks silently. Now that the pipeline is a pub API type shared between the Lambda binary and future test-mode binary, promote it to a struct with named pub fields. Addresses review feedback on PR #1201.

This module wires trace, stats, proxy, config, lifecycle, tags, appsec, and flushing pieces together. It is cross-cutting orchestration, not trace-domain code, so placing it under traces/ misrepresents its role and makes callers treat bottlecap::traces::start_trace_agent as if it were a trace-domain API. Promote it to bottlecap::startup at the crate root so both the Lambda binary and the forthcoming test-mode binary can consume it without reaching into a domain module. Addresses review feedback on PR #1201.

After task.abort(), the awaited JoinError is effectively always Cancelled, and all three arms of the post-abort match returned (or should have returned) 504 for the test harness. The extra match adds branches to reason about without changing the result, and the lone Err(_) -> 500 arm is unreachable in practice (a panic surfaces through the Ok(Err(_)) branch of the outer timeout). Drop the await-after-abort and return 504 directly. Addresses review feedback on PR #1201.

The test previously built ProcessorCommand::PlatformRuntimeDone and PlatformReport by hand and pushed them through handle.sender, making the reader mentally map raw command construction to the handler behavior and coupling the test to the channel internals. Use the already-public InvocationProcessorHandle methods, which exercise the same code path, drop ~15 lines of field plumbing per variant, and stay resilient to future refactors of the command channel. Addresses review feedback on PR #1201.

Library doc comments named the bottlecap-testmode binary and the POST /flush route it registers. Library APIs should document what they do and why a caller might reach for them, not which specific caller today happens to. Rewrite the affected doc comments (and one internal impl comment) to describe behavior and motivation generically: "a deterministic on-demand drain hook", "a handle that has no Lambda lifecycle state to drive", and so on. Addresses review feedback on PR #1201.

…ension trait The flushing_service: Option<Arc<FlushingService>> field was test-mode scaffolding on the production struct. Every TraceAgent in production carried a None that existed solely for a future test binary, and any additional test-mode hook would have had to add another Option field with the same shape, each pulling another consumer-specific dependency into trace_agent.rs. Replace it with a single generic extension seam: - pub trait RouterExtension { fn extend(&self, router: Router) -> Router; } - TraceAgent stores Option<Arc<dyn RouterExtension>> and exposes with_router_extension(self, ext) -> Self in place of with_flushing_service. - make_router calls extension.extend(router) when set, before applying the outer fallback and body-limit layers. - FLUSH_ENDPOINT_PATH, the flush handler, and the crate::flushing FlushingService import are removed from trace_agent.rs. The trace- agent module now knows nothing about FlushingService. Future test-mode hooks add new methods (with defaults) to the trait or new impls behind it, not new Option fields on TraceAgent. A #[cfg(test)]-only SpyExtension validates the seam end-to-end: trait shape compiles, state is carried via Arc, and merged routes are reachable through the composed Router. Addresses review feedback on PR #1201 (deferred comment 6).

The previous SpyExtension test called `extension.extend(Router::new())` directly, which only proves axum's Router::merge works. Build an actual TraceAgent via a small test helper, attach the spy via with_router_extension, and hit /spy through the full router returned by make_router. Add a second test confirming that with no extension attached, /spy falls through to the 404 handler. Also trim the "deterministic drain endpoint" example from the with_router_extension and start_trace_agent doc blocks; keep it only in the trait doc to avoid repetition across three locations.

…comment Dropping a oneshot sender causes the receiver to return ProcessorError::ChannelReceive, not a hang. Update the comment to describe the actual observable behavior. 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

…omment The previous note mentioned only the three tasks spawned directly in build_trace_agent (aggregator, concentrator, dedup). TraceAgent::new unconditionally spawns a fourth task: the trace-payload drain loop. Update the summary line and the leak warning to reflect all four tasks. 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

…rrors Previously, a panic or error inside extend() would propagate as a panic through make_router() and start(), then be silently dropped by the tokio::spawn wrapper in start_trace_agent (which only logged Err returns, not panics). Changing the return type to Result<Router, Box<dyn std::error::Error>> means implementors must return errors explicitly. Those errors propagate through make_router via ?, then through start() to the spawn wrapper, where they are already logged. 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

… timeout The exhaustive match in noop() prevents unhandled variants at compile time, but it cannot prevent a future contributor from placing a response-carrying variant in the fire-and-forget arm. In that case, rx.await returns ProcessorError::ChannelReceive, and the test would hang rather than fail cleanly. Add a test that wraps every request-response call in a 500ms timeout so any such regression produces a clear failure instead of a hung test suite. 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

lucaspimentel changed the title ~~feat(bottlecap): add InvocationProcessorHandle::noop() for test-mode binary~~ [APMSVLS-511] Add InvocationProcessorHandle::noop() for test-mode binary Apr 23, 2026

lucaspimentel changed the title ~~[APMSVLS-511] Add InvocationProcessorHandle::noop() for test-mode binary~~ [APMSVLS-511] Add InvocationProcessorHandle::noop() for test-mode binary Apr 23, 2026

lucaspimentel changed the title ~~[APMSVLS-511] Add InvocationProcessorHandle::noop() for test-mode binary~~ [APMSVLS-511] refactor(bottlecap): preparatory work for testmode binary Apr 23, 2026

lucaspimentel changed the title ~~[APMSVLS-511] refactor(bottlecap): preparatory work for testmode binary~~ [APMSVLS-501] refactor(bottlecap): preparatory work for testmode binary Apr 23, 2026

lucaspimentel marked this pull request as ready for review April 23, 2026 22:15

lucaspimentel requested a review from a team as a code owner April 23, 2026 22:15

Copilot AI review requested due to automatic review settings April 23, 2026 22:15

lucaspimentel requested a review from a team as a code owner April 23, 2026 22:15

lucaspimentel requested a review from jchrostek-dd April 23, 2026 22:15

Copilot started reviewing on behalf of lucaspimentel April 23, 2026 22:15 View session

Copilot AI reviewed Apr 23, 2026

View reviewed changes

Comment thread bottlecap/src/traces/trace_agent.rs Outdated

Comment thread bottlecap/src/lifecycle/invocation/processor_service.rs

chatgpt-codex-connector Bot reviewed Apr 23, 2026

View reviewed changes

Comment thread bottlecap/src/traces/trace_agent.rs Outdated

duncanista reviewed Apr 24, 2026

View reviewed changes

duncanista requested changes Apr 24, 2026

View reviewed changes

lucaspimentel requested review from Copilot and duncanista April 24, 2026 23:12

Copilot started reviewing on behalf of lucaspimentel April 24, 2026 23:12 View session

Copilot AI reviewed Apr 24, 2026

View reviewed changes

lucaspimentel and others added 5 commits April 24, 2026 19:34

fix(bottlecap): replace unwrap() with expect() in noop() tests to sat…

2dfd8f4

…isfy clippy 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

fix: return 500 from /flush handler on panic

fa1b184

🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

lucaspimentel and others added 23 commits April 24, 2026 19:34

refactor(startup): remove redundant re-export and allow attr

8a2fc53

🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

refactor(startup): remove unnecessary clippy::type_complexity allow

6a9a581

build_trace_agent returns (TraceAgent, TraceAgentPipeline), a 2-tuple whose complexity is hidden behind the type alias — the lint never fires. 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

refactor(traces): re-export TraceAgentPipeline from traces module

d7a8a08

Callers using build_trace_agent need the return type alias accessible via the same crate::traces path as the function itself. 🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

refactor(noop): match real service channel capacity

ecc5e21

docs(trace_agent): explain /flush panic isolation

81e34f5

docs(startup): warn about leaked tasks if agent not spawned

13a302c

refactor(startup): consolidate tokio::sync imports

3ae66a4

style: apply rustfmt

039fb0f

🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

fix(trace_agent): abort timed-out flush task to bound execution

18b20a6

🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

style(trace_agent): apply rustfmt to RouterExtension::extend signatures

c825e8d

🤖 Co-Authored-By: Claude Code <noreply@anthropic.com>

lucaspimentel force-pushed the lpimentel/bottlecap-test-mode branch from 05677cb to c825e8d Compare April 24, 2026 23:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[APMSVLS-501] refactor(bottlecap): preparatory work for testmode binary#1201

[APMSVLS-501] refactor(bottlecap): preparatory work for testmode binary#1201
lucaspimentel wants to merge 28 commits intomainfrom
lpimentel/bottlecap-test-mode

lucaspimentel commented Apr 23, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

duncanista Apr 24, 2026

Uh oh!

duncanista left a comment

Uh oh!

lucaspimentel commented Apr 24, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 24, 2026

Uh oh!

Copilot AI Apr 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

lucaspimentel commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview

1. InvocationProcessorHandle::noop()

2. Promote start_trace_agent to a top-level library module

3. RouterExtension seam on TraceAgent

Testing

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

duncanista Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

duncanista left a comment

Choose a reason for hiding this comment

Review

1. TraceAgentPipeline — 8-element tuple as public API

2. startup.rs doesn't belong in traces/

3. flush handler — over-engineered timeout logic

4. Noop tests — reach through internals instead of using the public API

5. Doc comments leak consumer knowledge into library functions

6. flushing_service: Option<Arc<FlushingService>> on TraceAgent

Summary

Uh oh!

lucaspimentel commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

lucaspimentel commented Apr 23, 2026 •

edited

Loading

1. `InvocationProcessorHandle::noop()`

2. Promote `start_trace_agent` to a top-level library module

3. `RouterExtension` seam on `TraceAgent`

1. `TraceAgentPipeline` — 8-element tuple as public API

2. `startup.rs` doesn't belong in `traces/`

3. `flush` handler — over-engineered timeout logic

6. `flushing_service: Option<Arc<FlushingService>>` on `TraceAgent`

lucaspimentel commented Apr 24, 2026 •

edited

Loading