feat(agent)!: hook system v2 — composable middleware #2012
Merged
Evolve the agent hook system into a composable middleware layer ahead of the vector-store/RAG rework, without touching vector stores.
- HookContext: run-scoped context (run_id, turn, is_streaming, agent_name, shared Scratchpad) passed to every on_event; breaks the trait signature.
- Mergeable request patches: RequestOverride -> RequestPatch, Flow::OverrideRequest -> Flow::PatchRequest. CompletionCall patches from every hook accumulate and merge in registration order (no more first-patch short-circuit); documented per-field merge rules.
- RequestPatch::extra_context: per-turn passive-RAG document injection, appended after static + dynamic context.
- RequestPatch::history: per-turn replacement of the messages sent to the provider; transcript untouched, RAG keys off the original history.
- Chained tool rewrites: RewriteArgs/RewriteResult compose across a HookStack.
- StepEvent::ModelTurnFinished: normalized once-per-turn event on both surfaces (covers streamed tool-only turns).
All merge/patch/context/history/event semantics land once in the shared drive loop, so run() and stream() behave identically. Fail-closed handling, invalid-tool-call recovery, and run-all-then-decide tool execution unchanged.
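The accumulate-and-merge behavior described in this commit can be pictured with a small sketch. It is not Rig's implementation: `PatchSketch`, `merge`, and `merge_all` are hypothetical names, and only three fields are modeled, using the rules this PR states elsewhere (`extra_context` appends in registration order, `active_tools` intersects, `preamble` is last-writer-wins).

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified stand-in for a request patch (not Rig's real type).
#[derive(Debug, Default, Clone, PartialEq)]
pub struct PatchSketch {
    pub preamble: Option<String>,
    pub extra_context: Vec<String>,
    pub active_tools: Option<BTreeSet<String>>,
}

impl PatchSketch {
    /// Folds `later` into `self`; `self` comes from earlier-registered hooks.
    pub fn merge(mut self, later: PatchSketch) -> PatchSketch {
        // Preamble: the later writer wins when it sets a value.
        if later.preamble.is_some() {
            self.preamble = later.preamble;
        }
        // Extra context: additive, kept in registration order.
        self.extra_context.extend(later.extra_context);
        // Allow-lists: two narrowing hooks compose as an intersection.
        self.active_tools = match (self.active_tools.take(), later.active_tools) {
            (Some(a), Some(b)) => Some(a.intersection(&b).cloned().collect()),
            (a, b) => a.or(b),
        };
        self
    }
}

/// Merges every hook's patch in registration order; no patch short-circuits.
pub fn merge_all(patches: impl IntoIterator<Item = PatchSketch>) -> PatchSketch {
    patches
        .into_iter()
        .fold(PatchSketch::default(), PatchSketch::merge)
}
```

With two patches that allow `{x, y}` and `{y, z}`, the merged allow-list in this sketch is `{y}`, and both hooks' context documents survive in registration order.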
…ook v2
Review found the pre-v2 'first non-Continue short-circuits the rest' blanket claim still on all four public add_hook rustdocs and the foundational CHANGELOG bullet — false for CompletionCall (accumulate) and ToolCall/ToolResult (chain). State the event-dependent composition and point to the hook module docs.
… ModelTurnFinished
Address the second adversarial review round:
- Flow::Terminate from a tool hook is now turn-wide fail-fast. Sequential execution stops before starting remaining siblings; concurrent execution drops not-yet-started siblings (a shared flag makes them skip) while draining already-in-flight ones, so no new side effect runs after a termination and the lowest call-index terminate reason still wins deterministically. Avoids the Semantic-Kernel run-all-then-decide fail-open.
- Tool-result redaction: gen_ai.tool.call.result is recorded only AFTER the ToolResult hook runs (replacement on RewriteResult, raw on Keep, nothing on Terminate), so a redaction hook never leaks the raw output to the trace/logs.
- Streaming ModelTurnFinished.content now carries the canonical committed StreamedTurn::choice (reasoning->text->tool ordering), not the raw stream.choice aggregate; raw is kept for final stream behavior.
- Scratchpad/HookContext docs: drop "never contends"; document that at tool_concurrency > 1 tool hooks for different tools share the context and run concurrently, update is race-free per op but imposes no ordering; recommend commutative/idempotent state or keying by call id.
- Docs/test hygiene: CHANGELOG old signature includes ctx; anthropic request_override wording; () observes nothing; streaming transcript-untouched assertion.
Tests: sequential + concurrent fail-fast (both surfaces), redaction non-leak (span capture), canonical ModelTurnFinished ordering — all verified to fail without their fix. Full agent suite 229 passed.
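The redaction ordering in this commit (record the tool result only after the result hooks have run) combines naturally with chained rewrites. The sketch below illustrates that ordering under stated assumptions: `ResultFlow`, `run_result_chain`, and `record_after_hooks` are hypothetical names, hooks are plain closures, and the "trace" is just a `Vec<String>` standing in for a span field.

```rust
/// Hypothetical outcome of one result hook (not Rig's real `Flow` type).
pub enum ResultFlow {
    Keep,
    Rewrite(String),
    Terminate(String),
}

/// Runs hooks in registration order, threading each rewrite into the next hook.
/// The first hook sees the raw output; later hooks see the rewritten text.
pub fn run_result_chain(
    raw: &str,
    hooks: &[&dyn Fn(&str) -> ResultFlow],
) -> Result<String, String> {
    let mut current = raw.to_string();
    for hook in hooks {
        match hook(&current) {
            ResultFlow::Keep => {}
            ResultFlow::Rewrite(next) => current = next,
            // Terminal mid-chain: later hooks do not run.
            ResultFlow::Terminate(reason) => return Err(reason),
        }
    }
    Ok(current)
}

/// Records the result only after the chain has run: the effective text on
/// success, nothing at all on termination. The raw output is never recorded
/// unless every hook kept it.
pub fn record_after_hooks(
    raw: &str,
    hooks: &[&dyn Fn(&str) -> ResultFlow],
    trace: &mut Vec<String>,
) -> Result<String, String> {
    let effective = run_result_chain(raw, hooks)?;
    trace.push(effective.clone());
    Ok(effective)
}
```

Recording before the hooks run would be simpler, but it would put the unredacted output into the trace, which is the leak this commit closes.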
…; fix () observes doc
Round-2 review follow-ups (both low-severity accuracy fixes):
- The concurrent drop test's narrative was wrong: a synchronous terminator meant no sibling ran at all, so it never exercised the in-flight-drain vs beyond-window-drop distinction it described. Rewrite with a Notify so tc1 is genuinely in flight (drains, called contains 1) while tc2 beyond the window is dropped (called never contains 2) and tc0's body never runs. Renamed accordingly; 8/8 non-flaky.
- Correct the () hook observes() doc: observes gates only the two streaming delta events, not 'every event' (non-delta events dispatch unconditionally).
…al tool-choice validation
Round-3 redesign for correctness and cleaner semantics (breaking):
1. Split model-emitted tool call from execution start. StreamAssistantItem (StreamedAssistantContent::ToolCall) now reports the tool call the MODEL emitted (at turn commit, whether or not it runs). A new MultiTurnStreamItem::ToolExecutionStart marks that Rig actually started executing a tool (after hook checks) — never for a dropped/hook-skipped/invalid-recovery call. run_single_tool now reports `executed` to distinguish a real run from a Flow::Skip.
2. Atomic per-batch commit/surface. drive_tool_calls collects tool outcomes instead of streaming them one-by-one; execution-start + result items are surfaced (in call order) and committed only after the whole batch settles OK. On the first termination/fail-closed error the batch fails fast (stop new, drop not-yet-started concurrent siblings, drain in-flight, lowest call-index error) with no successful items surfaced and no history commit — no orphan execution-start events. Sequential and concurrent share the atomic path; results now surface in call order (was completion order for concurrent).
3. Local tool-choice validation. allowed_tool_names_for_choice now validates the effective request before the provider call: Required with no advertised tool (executable or output tool) and Specific naming a non-advertised tool are local errors, with an active_tools-aware hint suggesting a compatible tool_choice in the same RequestPatch. Structured-output Tool mode with no real tools still works via the synthetic output tool.
Tests: split taxonomy + atomic ordering (both surfaces), no-orphan-execution-start on concurrent termination, no-successful-result/no-commit on termination, Fix-3 unit + no-provider-call integration tests. Full lib 1135 pass; anthropic/openai/gemini cassette suites (322) pass.
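The atomic per-batch rule in item 2 reduces to an all-or-nothing fold over the collected outcomes. The following is a minimal sketch of that idea only; `Outcome` and `settle_batch` are hypothetical names, and scheduling, draining, and hook dispatch are out of scope.

```rust
/// Hypothetical per-call outcome, tagged with its position in the model's
/// tool-call list.
pub struct Outcome {
    pub call_index: usize,
    pub result: Result<String, String>,
}

/// Settles a finished batch. On success, returns every result in call order
/// (ready to be surfaced and committed together). On any failure, returns
/// only the error with the lowest call index and surfaces nothing else.
pub fn settle_batch(mut outcomes: Vec<Outcome>) -> Result<Vec<(usize, String)>, String> {
    // Outcomes may have been collected in completion order; restore call order.
    outcomes.sort_by_key(|o| o.call_index);

    let mut surfaced = Vec::with_capacity(outcomes.len());
    for outcome in outcomes {
        match outcome.result {
            Ok(text) => surfaced.push((outcome.call_index, text)),
            // Walking in call order means the first error seen is the
            // lowest-index one, so the terminal reason is deterministic.
            Err(reason) => return Err(reason),
        }
    }
    Ok(surfaced)
}
```

Because successful items are only returned when the whole batch is clean, a caller built on this shape cannot emit an execution-start or result item for a batch that later fails.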
…esults, precise active_tools hint
Address the round-3 adversarial review (7 confirmed findings):
- ToolExecutionStart now carries the EFFECTIVE (hook-rewritten) tool call, not the model's original — matching its doc and preventing a RewriteArgs redaction from leaking the original args. run_single_tool returns ToolExecution::Executed(effective_call) | Skipped.
- A Flow::Skip tool result is now surfaced as a ToolResult (no ToolExecutionStart) instead of being committed-but-not-streamed — the stream matches history again.
- The active_tools error hint is shown only when the filter actually dropped a tool that would have satisfied the choice, not for a plain typo (threads the pre-filter tool set into allowed_tool_names_for_choice).
- Docs: ToolExecutionStart/StreamUserItem and the tool_concurrency method docs no longer scope atomicity to concurrency>1 or claim completion-order surfacing; ToolExecutionStart timing reworded to the atomic-settle model.
Tests: ToolExecutionStart-carries-rewritten-args, hook-skip-surfaces-result-without-execution-start, specific-typo-not-blamed-on-active_tools. Full lib 1138 pass; anthropic/openai/gemini cassette suites (322) pass.
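The "hint only when the filter is actually to blame" rule can be sketched as a pure function over the pre-filter and post-filter tool sets. This is an illustration, not the real `allowed_tool_names_for_choice`: `Choice`, `validate_choice`, and the error wording are hypothetical, and the synthetic output tool is left out for brevity.

```rust
/// Hypothetical tool-choice enum (a simplified stand-in, not Rig's type).
pub enum Choice {
    Auto,
    Required,
    Specific(String),
}

/// Validates a choice locally, before any provider call.
/// `registered` is the pre-filter tool set; `active_tools` is an optional
/// per-turn allow-list. Returns the tool names the choice permits.
pub fn validate_choice(
    choice: &Choice,
    registered: &[&str],
    active_tools: Option<&[&str]>,
) -> Result<Vec<String>, String> {
    // Effective advertised set: registered tools that survive the allow-list.
    let advertised: Vec<String> = registered
        .iter()
        .copied()
        .filter(|name| active_tools.map_or(true, |allow| allow.contains(name)))
        .map(str::to_string)
        .collect();

    let hint = "; hint: active_tools filtered it out, set a compatible tool_choice in the same patch";
    match choice {
        Choice::Auto => Ok(advertised),
        Choice::Required if advertised.is_empty() => {
            let mut msg = String::from("tool_choice Required but no tool is advertised");
            // Blame the filter only if a tool existed before filtering.
            if !registered.is_empty() {
                msg.push_str(hint);
            }
            Err(msg)
        }
        Choice::Required => Ok(advertised),
        Choice::Specific(name) if advertised.iter().any(|t| t == name) => Ok(vec![name.clone()]),
        Choice::Specific(name) => {
            let mut msg = format!("tool_choice names unadvertised tool `{}`", name);
            // A plain typo was never registered, so it gets no filter hint.
            if registered.contains(&name.as_str()) {
                msg.push_str(hint);
            }
            Err(msg)
        }
    }
}
```

Threading the pre-filter set into the check is what lets the error distinguish "the allow-list removed this tool" from "this tool never existed".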
…::large_enum_variant)
The Executed(ToolCall) variants dwarfed the empty Skipped/Preresolved variants; box the ToolCall so the enums stay small. Surfaced under the rig package's feature set (cargo clippy -p rig --all-features).
- Correct AgentHook::on_event default doc: the default observes() returns true (not "observes nothing"); () skips only the high-frequency delta dispatch and still receives every other event, returning Continue.
- Add unit_hook_observes_no_event_kind guarding the () observes override.
- Rename first_terminate_short_circuits_on_observe_only_events (it dispatched a chained ToolCall) to ..._on_chained_tool_call; add a real observe-only (_ => arm) terminate test via TextDelta.
- Assert ModelTurnFinished is suppressed on recovered turns on both the blocking and streaming surfaces (its own guard, distinct from the response-finish guards), with cross-driver parity.
- Add runner_add_hook_appends_to_agent_default_hooks proving runner/prompt add_hook appends to (not replaces) the agent's default hooks.
- Fix broken intra-doc link ToolCallOutcome::executed -> ::execution.
…ut-tool warning
Deep-review follow-ups on hook system v2:
- Fix stale AgentRunner::tool_concurrency streaming doc. It still described the
pre-atomic-batch behavior ("emits each ToolResult as its tool finishes ...
completion order"); the driver now surfaces ToolExecutionStart + ToolResult
stream items in call order only after the whole batch settles. This matches the
implementation and the sibling StreamingPromptRequest::tool_concurrency doc that
links to it.
- Fix Scratchpad doc that referenced "clones of a run's HookContext" — HookContext
is not Clone (holds an AtomicUsize; it is shared by &-reference). Attribute the
sharing to the shared &HookContext / Scratchpad clones instead.
- completion.rs: the committed Tool-mode stall warning fired falsely when
tool_choice = Specific names the synthetic output tool. allowed_tool_names_for_choice
accepts and advertises such a choice (the output tool is callable), but
tool_choice_permits_output_tool treats every Specific as forbidding. Add a
name-aware output_tool_callable helper used only at the warning site (resolve_output_mode
keeps the coarser check, since the output-tool name is not known there yet) and a
unit test.
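The Scratchpad doc fix above rests on a distinction that is easy to show in code: a context holding an `AtomicUsize` is shared by `&` reference rather than cloned, while the scratchpad inside it can be cloned cheaply because clones share storage. The sketch below assumes an `Arc<Mutex<HashMap<TypeId, _>>>` type-map; `ScratchpadSketch` and `HookContextSketch` are hypothetical, and the real types' internals may differ.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical type-map scratchpad; clones point at the same storage.
#[derive(Clone, Default)]
pub struct ScratchpadSketch {
    slots: Arc<Mutex<HashMap<TypeId, Box<dyn Any + Send>>>>,
}

impl ScratchpadSketch {
    /// Stores one value per type, replacing any previous value of that type.
    pub fn insert<T: Any + Send>(&self, value: T) {
        let mut slots = self.slots.lock().unwrap();
        slots.insert(TypeId::of::<T>(), Box::new(value));
    }

    /// Returns a clone of the stored value of type `T`, if any.
    pub fn get<T: Any + Send + Clone>(&self) -> Option<T> {
        let slots = self.slots.lock().unwrap();
        let found = slots
            .get(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_ref::<T>())
            .cloned();
        found
    }
}

/// Hypothetical run context. It deliberately does not derive `Clone`:
/// hooks receive `&HookContextSketch`, and the atomic counter stays unique.
pub struct HookContextSketch {
    turn: AtomicUsize,
    pub scratchpad: ScratchpadSketch,
}

impl HookContextSketch {
    pub fn new() -> Self {
        Self {
            turn: AtomicUsize::new(0),
            scratchpad: ScratchpadSketch::default(),
        }
    }

    /// Current turn, readable through a shared reference.
    pub fn turn(&self) -> usize {
        self.turn.load(Ordering::Relaxed)
    }

    /// Advances the turn (the driver's job in this sketch) and returns it.
    pub fn advance_turn(&self) -> usize {
        self.turn.fetch_add(1, Ordering::Relaxed) + 1
    }
}
```

Each `insert` or `get` in this sketch is race-free because it holds the lock, but two concurrent tool hooks can still interleave in either order, which is why keying state by call id is the safer pattern.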
…ool-mode finalization, fix stale changelog
Deep-review follow-ups on structured-output Tool-mode interactions:
- ModelTurnFinished.content: doc said "as recorded into the run", but on a Tool-mode output-tool finalization turn the run persists the turn as assistant text (structured output) with the tool call dropped, while both drivers fire the event with the model-emitted content (including the output-tool call). The content is the model's committed turn output (consistent across surfaces, fired at turn commit before finalization). Correct the field doc to state this with an explicit Tool-mode caveat, and add a both-surfaces regression test.
- StreamAssistantItem: the contract promised a complete ToolCall item for every model tool call whether or not Rig executes it, but an output-tool call finalizes the run directly (bypassing drive_tool_calls), so its complete item is never emitted (only deltas + FinalResponse); invalid-recovery calls are the same shape. Narrow the doc to document both carve-outs, and add a streaming regression test.
- CHANGELOG: two Unreleased lines still described streaming ToolResult items in completion order, contradicting the atomic call-order-after-settle behavior. Update them to match.
A comparative study of LangChain, LangGraph, OpenAI Agents, pydantic-ai, Semantic Kernel, and the Vercel AI SDK surfaced three scoped, pure-doc improvements to the hook system (no API or behavior change):
- Flow::Skip / Flow::skip: document that `reason` is delivered to the model verbatim as the tool result, so it doubles as a prompt — tell the model the tool did not run and not to retry, or it may re-emit the identical call (mirrors LangChain's HITL reject/respond feedback).
- Flow::retry: add a worked example building corrective feedback from InvalidToolCallContext::available_tools (closes an asymmetric doc gap — the rewrite_args/rewrite_result constructors already had examples; mirrors LangGraph's INVALID_TOOL_NAME_ERROR_TEMPLATE).
- hook module docs: add a "Why a returned Flow, not a next()-style middleware" rationale, citing the silent-skip footgun that a next()/middleware model carries (Semantic Kernel ADR 0043) and which Rig's fail-closed returned-Flow design prevents.
pull Bot pushed a commit to sternelee/rig that referenced this pull request on Jul 5, 2026
) * test(gemini): live cassette hook-system stress suite
Adds a Gemini cassette-backed stress suite (tests/providers/gemini/cassette/hook_stress.rs) that exercises the merged hook system (v2, 0xPlaygrounds#2012) across long, realistic multi-turn workflows recorded against real Gemini and replayed deterministically:
- HookContext identity (stable run_id, advancing turn, is_streaming, agent_name) and a shared Scratchpad threaded across two cooperating hooks and many turns.
- RequestPatch: extra_context injection + active_tools narrowing + temperature, proven by downstream effects (the injected fact reaches the answer; the filtered-out tool never executes).
- Chained tool lifecycle: RewriteArgs -> observe -> RewriteResult redaction, with paired positive/negative assertions (the marker reaches the model; the raw result does not; the transcript keeps the model's original args).
- Streaming lifecycle ordering (tool call -> execution start -> tool result -> final response) and is_streaming parity vs the blocking surface.
- Per-turn atomic call/result pairing and the Skip zero-execution invariant.
Assertions follow tools_support's loose-assertion convention (exact equality only for rig-synthesized values), so the cassettes survive re-recording; hooks are deterministic so replay requests stay byte-identical. Cassettes are auto-scrubbed and safety-checked by the recorder (key -> [REDACTED]; ids/signatures placeholdered) and were reviewed manually.
No hook-system bugs surfaced: every scenario passed on the first live recording, confirming the documented behavior.
Inspired by how LangChain, LangGraph, OpenAI Agents, pydantic-ai, Semantic Kernel, and the Vercel AI SDK test their hook/middleware/guardrail systems (ordered breadcrumb logs; proving mutations via downstream effects; paired positive+negative redaction; zero-downstream skip invariants).
* test(gemini): expand hook-system stress suite (+24 live cassette tests)
Broadens the Gemini hook-system stress suite from 6 to 30 recorded workflows across four themed cassette files, driven by a shared deterministic-fixtures module (hook_stress_support.rs):
- hook_stress_context (8): HookContext identity (stable run_id, advancing turn, is_streaming, agent_name incl. the unset case); a shared Scratchpad written by one hook and read by a second, growing across turns; internal_call_id correlation of ToolCall/ToolResult; two observe-only hooks both firing; add_hook appending across builder + request; CompletionCall patch accumulation; and active_tools set-intersection across two narrowing hooks.
- hook_stress_patch (4): preamble override; tool_choice=Required (first turn only); per-turn history replacement injecting a prior fact; multi-field patch (preamble + extra_context).
- hook_stress_tools (6): single-key and chained RewriteArgs; chained RewriteResult (redact -> wrap) and truncation; Terminate from a ToolResult (post-execution); and model-driven recovery from a tool error.
- hook_stress_streaming (6): TextDelta / StreamResponseFinish / ModelTurnFinished on streaming; ModelTurnFinished on tool turns; RewriteResult redaction reaching the FinalResponse; active_tools narrowing and Skip on the streaming driver; and blocking-vs-streaming answer parity (two cassettes).
All recorded live against Gemini and replaying deterministically; every patch / rewrite / skip effect is proven by a downstream-observable change, and model-shaped values use loose assertions (exact only for rig-synthesized ones). Cassettes are auto-scrubbed + safety-checked by the recorder and were reviewed.
A footgun surfaced along the way: forcing tool_choice=Required on *every* turn loops until max_turns (each turn re-forces a tool call). Captured with a first-turn-only patch fixture (FirstTurnPatch) so the intended pattern is shown.
No hook-system bugs found; every scenario confirms the documented behavior.
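The `tool_choice=Required` footgun noted in this commit can be reproduced with a toy loop. This is a deliberately crude model, not the FirstTurnPatch fixture itself: the function names are hypothetical, turns are assumed to be counted from 0, and the stub model is assumed to call a tool whenever forced and to answer otherwise.

```rust
/// Returns the 1-based turn on which the stub model answers, or `None` if
/// the run exhausts `max_turns` without a final answer.
pub fn turns_until_answer(max_turns: usize, force_tool: impl Fn(usize) -> bool) -> Option<usize> {
    for turn in 0..max_turns {
        // Stub model (an assumption): a forced turn always yields a tool
        // call; an unforced turn yields the final answer.
        if !force_tool(turn) {
            return Some(turn + 1);
        }
    }
    None
}

/// Footgun: forces a tool call on every turn, so the run never finishes.
pub fn force_every_turn(_turn: usize) -> bool {
    true
}

/// Intended pattern: force a tool call on the first turn only.
pub fn force_first_turn_only(turn: usize) -> bool {
    turn == 0
}
```

Because patches are per-turn and non-sticky, a hook has to decide on each turn whether to force the tool call, and gating that decision on the turn number is what keeps the run from looping.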
Hook system v2 — composable middleware
Evolves Rig's agent hook system into a composable middleware layer that supports serious production use cases — RAG/context injection, guardrails, request shaping, telemetry, tool policy, multi-turn orchestration — before the vector-store/RAG rework, and without touching vector stores. Major breaking change; no deprecated aliases (renamed once, correctly).
Update — round 3 (tool-execution semantics)
A third round redesigned tool-execution streaming for the cleanest, most correct semantics (breaking, per the mandate to prefer a small breaking change over surprising behavior):
1. `MultiTurnStreamItem::StreamAssistantItem(StreamedAssistantContent::ToolCall)` now reports the tool call the model emitted — surfaced when the model turn is committed, whether or not Rig executes it. A new `MultiTurnStreamItem::ToolExecutionStart { tool_call, internal_call_id }` marks that Rig has started executing a tool, emitted only after the tool passed its `ToolCall` hook checks and its body actually runs (never for a dropped, hook-skipped, or invalid-recovery call). `run_single_tool` now returns `ToolCallOutcome { content, executed }` so the driver can tell a real run from a `Flow::Skip`. All three events (model call → execution start → result) correlate via `internal_call_id`. This is LangGraph's distinction between model-emitted tool calls and tool-execution lifecycle events.
2. `drive_tool_calls` collects tool outcomes instead of streaming them one-by-one. Successful `ToolExecutionStart` + `ToolResult` items are surfaced (in call order) and committed to history only after the whole batch settles successfully. On the first hook termination / fail-closed error the batch fails fast — stop new, drop not-yet-started concurrent siblings, drain in-flight, surface the deterministic lowest call-index error, surface no successful items, commit nothing (no orphan execution-start events, no partial history). Sequential and concurrent share the atomic path; `run()` and `stream()` return the same terminal reason. This matches OpenAI Agents' bounded concurrent execution that commits/surfaces only after the batch settles. (Previously the concurrent path streamed each result as its tool completed, in completion order.)
3. After `active_tools` filtering, `allowed_tool_names_for_choice` validates the effective request before the provider call: `ToolChoice::Required` with no advertised tool (no executable tool and no synthetic output tool) and `ToolChoice::Specific` naming a tool not in the effective advertised set are local request errors with no provider round-trip. When a per-turn `active_tools` allow-list caused the incompatibility, the error says so and suggests setting a compatible `tool_choice` in the same `RequestPatch`. Structured-output Tool mode with no real tools still works when the synthetic output tool satisfies the choice. This is Pydantic AI's explicit local validation for impossible `tool_choice`/tool-set combinations.

Round-3 tests: split taxonomy + atomic ordering on both surfaces (`stream_emits_model_tool_calls_then_atomic_execution_items`, `..._results_in_call_order_after_batch_settles_...`); `concurrent_termination_surfaces_no_execution_items` (no orphan start, no successful result, side effect ran but suppressed); Fix-3 unit tests + no-provider-call integration tests (`required_with_empty_active_tools_...`, `specific_naming_filtered_out_tool_...`); the anthropic streaming-tools cassette updated to assert call-order surfacing.

A 4-dimension adversarial review of round-3 confirmed 7 findings (no logic bugs in the atomic batch, split, or validation — the machinery was sound):
`ToolExecutionStart` now carries the effective (hook-rewritten) tool call rather than the model's original — matching its doc and preventing a `RewriteArgs` redaction from leaking the original args (`run_single_tool` returns `ToolExecution::Executed(effective_call) | Skipped`); a `Flow::Skip` result is now surfaced as a `ToolResult` (no `ToolExecutionStart`) rather than committed-but-not-streamed; the `active_tools` error hint is shown only when the filter actually dropped a satisfying tool (not for a plain typo); and several doc-scoping/wording fixes (atomicity is not scoped to `concurrency > 1`; no completion-order claims). Regression tests added for each. Full rig-core lib 1138 pass; anthropic/openai/gemini cassette suites (322) pass.

Update — round 2 (correctness hardening)
A second adversarial review round, informed by how Pydantic AI / OpenAI Agents / LangGraph handle these cases, tightened four behaviors:
- `Flow::Terminate` from a tool hook is now turn-wide fail-fast (was run-all-then-decide). Sequential execution (`tool_concurrency == 1`, the default) surfaces the terminate immediately and never starts the remaining sibling tools. Concurrent execution (`> 1`) drops not-yet-started siblings — a shared `terminating` flag makes them skip — while already-in-flight siblings are drained (so no detached task is left and the lowest call-index terminate reason still wins deterministically; a dropped sibling always has a higher index than the terminator that dropped it). No post-termination successful result is surfaced or committed. This avoids the Semantic-Kernel fail-open where every dispatched tool runs to completion before termination is honored, matching Pydantic AI / OpenAI Agents' cancel-or-drain-on-failure.
- `gen_ai.tool.call.result` is recorded on the span only after the `ToolResult` hook runs — the redacted replacement on `RewriteResult`, the raw output on `Keep`, and nothing on `Terminate` — so a redaction guardrail never leaks the raw output to the trace or logs. The first `ToolResult` hook still observes the tool's actual output. (OpenAI Agents applies tool-output guardrails before tracing / tool-end / model-visible output for the same reason.)
- `ModelTurnFinished.content` guarantee. On the streaming surface, `ModelTurnFinished.content` now carries the canonical committed content from `StreamedTurn::finish` (reasoning → text → tool-call ordering), matching what is recorded into run history — not the raw `stream.choice` aggregate. The raw choice is retained for the raw/final stream item. This mirrors Vercel AI SDK / LangGraph separating raw stream events from normalized final state; the blocking surface already used the committed `resp.choice`.
- `Scratchpad`/`HookContext` now document that at `tool_concurrency > 1` the `ToolCall`/`ToolResult` hooks for different tools share one `HookContext` and can run concurrently; `Scratchpad::update` is race-free per operation but imposes no deterministic ordering across concurrent tool hooks — store commutative/idempotent state or key per-tool state by the call id / internal call id (as LangGraph/OpenAI Agents/Pydantic AI treat run context as shared runtime state, not an ordered log under parallel tool execution).

Round-2 tests: sequential fail-fast (tool B side effect must not run) and concurrent drop-not-yet-started + drain-in-flight (both surfaces, `Notify`-synchronized for determinism, 5/5 non-flaky); redaction non-leak via a span-field capturing subscriber; canonical `ModelTurnFinished` ordering. The redaction and canonical tests were verified to fail without their fix. Also: `git diff --check` clean, `()` observes nothing, CHANGELOG signature includes `ctx`, streaming transcript-untouched assertion added.

Shortcomings addressed → what fixed them
`Continue`, so a RAG hook's request patch skips a later tool-policy/telemetry hook
`CompletionCall`, every hook runs and their patches merge
`RequestOverride` is narrow, replacement-only, no per-hook merge semantics
`RequestPatch` with documented per-field merge rules
`RequestPatch::extra_context: Vec<Document>`
`HookContext` (run id, turn, streaming flag, agent name, shared `Scratchpad`)
`RewriteArgs`/`RewriteResult` wins)
`StepEvent::ModelTurnFinished` (once per accepted turn, both surfaces)

New dispatch contract
A `HookStack` combines hooks in registration order, and how their `Flow` results combine depends on the event:
- `CompletionCall` — accumulate & merge. Every hook runs; each `Flow::PatchRequest` is merged in registration order into one effective patch. A mergeable patch does not short-circuit later hooks. `Flow::Terminate` stops the stack; any unsupported flow fails closed (accumulated patch discarded).
- `ToolCall`/`ToolResult` — chain. Every hook runs; a `RewriteArgs`/`RewriteResult` is threaded into the next hook's event so rewrites compose (redact → truncate). `Skip`/`Terminate` are terminal mid-chain. The first result hook still observes the tool's actual output.
- Other events: the first non-`Continue` wins (observe-only / recovery: `CompletionResponse`, `ModelTurnFinished`, `InvalidToolCall`, streamed deltas).

Nesting composes: a `HookStack` pushed as a hook returns its own net flow (a merged patch / threaded rewrite / terminal action), which the outer stack folds in again. Register observe-only hooks before steering hooks (a `Terminate` short-circuits the stack).

`RequestPatch` merge semantics (per field)
- `extra_context`
- `additional_params`
- `preamble`
- `temperature`, `max_tokens`, `tool_choice`
- `active_tools`
- `history`

Patches are per-turn and non-sticky — `CompletionCall` re-fires each turn and re-resolves from the agent baseline.

`extra_context` — document ordering
Document order in the completion request is static → dynamic (vector-store) → hook extras, hook extras in registration order. The RAG query text is unchanged. Per-turn and non-sticky; works identically on `run()` and `stream()`.

`history` view
`RequestPatch::history` replaces the prior messages sent to the provider for the turn (context-window compaction / summarization). The persisted transcript and run state are untouched, and RAG's query text still derives from the original history — only what is sent changes.

`HookContext`
Passed by `&` to every `on_event`: `run_id()`, `turn()` (advanced per turn by the driver), `is_streaming()`, `agent_name()`, and a shared `Scratchpad` (interior-mutable type-map: `insert`/`get`/`update`/`remove`/`contains`) so cooperating hooks share run-scoped state without their own `Arc<Mutex<…>>`. It is a driver construct — nothing from it reaches the sans-IO `AgentRun`.

Streaming/non-streaming
`run()` and `stream()` share one drive loop (`drive_agent`), so every new semantic (patch accumulation, `extra_context`, `history`, chained rewrites) lands once and behaves identically — verified on both surfaces in tests. `ModelTurnFinished` fires exactly once per accepted turn on both surfaces, including a streamed tool-only turn (which fires no `StreamResponseFinish`); it is suppressed for turns recovered via invalid-tool-call retry/repair/skip, and fires after the medium-specific raw event when one fires. `CompletionResponse` (blocking) and `StreamResponseFinish` (streaming, text turns only) are retained for the raw provider payload.

Decision-point resolutions
- `CompletionCall` (hooks see the agent baseline, not earlier hooks' patches): simpler, keeps `StepEvent: Copy`, sufficient because patches are declarative with documented conflict rules.
- `active_tools` intersects rather than last-writer-wins: it is an allow-list guardrail, so two narrowing hooks must compose as narrowing; widening is a config error.
- `preamble`/`history`: last-writer-wins with a `tracing::warn!` — silent conflict is the Semantic Kernel wart; a warning keeps composition debuggable. Additive guidance belongs in `extra_context`, not preamble concatenation.
- `ModelTurnFinished` fires after the raw event: consistent "raw before normalized" ordering across surfaces.
- `HookContext.turn` is an `AtomicUsize` (built once per run, advanced per turn): gives a coherent turn to non-`CompletionCall` events; `Sync` for `&HookContext` across awaits.

Inspiration references
Studied under `/Users/kisaczka/Desktop/code/many_rigs/inspirations/`:
- Semantic Kernel (`docs/decisions/0043-filters-exception-handling.md`, `KernelFunctionFromPrompt.InvokeStreamingCoreAsync`): its ADR admits forgetting `await next()` silently disables downstream filters, and it shipped a real streaming/non-streaming short-circuit asymmetry. → kept Rig's typed `Flow` + fail-closed model; enforced both-surface parity with tests.
- LangGraph (`chat_agent_executor.py` `llm_input_messages` vs `messages`; `pregel/main.py` invoke = stream): the ephemeral per-call view vs committed edit → `RequestPatch::history`; one drive loop → parity guarantee.
- Vercel AI SDK (`generate-text.ts` `prepareStep`, `util/notify.ts`): per-field replace-vs-merge documented explicitly; observers return void and are error-isolated; naming churn (`onFinish` → `onEnd`) → renamed once with no aliases.
- OpenAI Agents (`tool_guardrails.py`): rich enum outcomes over booleans; run-scoped `RunContextWrapper` → `HookContext`.
- LangChain (`middleware/types.py`, `factory.py`): first-registered-is-outermost onion, immutable request + override; classic callbacks' untyped-kwargs pain → typed `StepEvent`.
- Pydantic AI (`capabilities/abstract.py`): a run-scoped `RunContext` on every hook is table stakes; but its ~30-method mega-trait is a tax Rig avoids by keeping the single `on_event` + match-arm model.

Tests
- `cargo test -p rig-core --lib agent` — 226 passed (220 existing + 6 new hook-v2 integration tests), 0 failed.
- `hook.rs` unit tests: 16 (accumulation, terminate short-circuit, nested-stack composition, per-field merge rules incl. empty intersection, scratchpad).
- `runner.rs` integration tests (both surfaces): extra_context after static context, append order, per-turn non-sticky, history override (transcript untouched), `ModelTurnFinished` once per accepted turn incl. tool-only, chained `RewriteArgs` + `RewriteResult`.
- anthropic `request_override`/`tool_call_rewrite_args`/`tool_result_rewrite` (6), openai `request_hook`/`permission_control` (3), gemini `agent_run_streamed` (6) — all pass on both surfaces.
- `cargo fmt --check`, `cargo clippy -p rig-core --lib --tests --all-features` (0 warnings), `cargo doc -p rig-core --no-deps --all-features` (0 warnings). All 5 hook examples + `agent_with_durable_approval` `cargo check` clean.

A 5-dimension adversarial review (API/design, merge-semantics edge cases, streaming parity, regressions, deep dispatch correctness) surfaced no correctness/parity/regression issues; the only confirmed findings were doc-only — the four public `add_hook` rustdocs and the foundational CHANGELOG bullet still carried the pre-v2 "first non-`Continue` short-circuits the rest" blanket claim (false for the new accumulate/chain cases) — now corrected to state the event-dependent composition and point to the module docs.

Known limitations & follow-ups (not in this PR)
- `RunStarted`/`RunFinished { outcome }` observe-only lifecycle events (so telemetry sees hook-initiated terminate) — deferred.
- `RequestPatch` is `#[non_exhaustive]`, so a future `tools` field is non-breaking.
- `dynamic_context` reimplemented as a bundled hook using `extra_context`.

Non-goals (unchanged)
No vector-store crate/`VectorStoreIndex` removal; `dynamic_context` untouched; no RAG example migration; no generic `Retriever` trait; no second observer trait or declarative ordering engine.