From cde0946095d710d3fdbd1219c318db5f82502edd Mon Sep 17 00:00:00 2001
From: maclane
Date: Fri, 12 Jun 2026 18:49:12 -0400
Subject: [PATCH 1/2] docs(tbtc/signer): Phase 7.0 sidecar transport addendum

Maps the frozen interactive-session API onto the sidecar process
boundary from Decision Log entry 2 and scopes what that boundary means
for the decision-gated #4007 TEE checker stack.

Key points: the coarse JSON request/response contract was chosen for
exactly this swap, and the frozen spec's engine-held nonce custody
dissolves the old decision-brief objection to round-level APIs - rounds
cross the boundary, nonces do not. Decision 7's init-config demand
semantics carry over unchanged with "sidecar unreachable" joining the
same fatal failure family; a sidecar crash is exactly the markers-only
restart story, introducing no new failure mode.

Security boundary: owner-only UDS with peer-credential pinning, never a
network listener; the state-key provider moves to the sidecar's process
environment.

Transport-parameterized conformance tests are the mechanism keeping
"transport swap, not API rework" true.

The sidecar track (7.S1-7.S3) runs parallel to 7.1-7.5 and must
converge before the ECDSA-retirement phases. Four open questions with
proposed defaults (spawn model, framing, connection model, packaging)
for sign-off.

Co-Authored-By: Claude Fable 5
---
 .../phase-7-sidecar-transport-addendum.md     | 173 ++++++++++++++++++
 1 file changed, 173 insertions(+)
 create mode 100644 pkg/tbtc/signer/docs/phase-7-sidecar-transport-addendum.md

diff --git a/pkg/tbtc/signer/docs/phase-7-sidecar-transport-addendum.md b/pkg/tbtc/signer/docs/phase-7-sidecar-transport-addendum.md
new file mode 100644
index 0000000000..7f9b7f5506
--- /dev/null
+++ b/pkg/tbtc/signer/docs/phase-7-sidecar-transport-addendum.md
@@ -0,0 +1,173 @@
+# Phase 7.0 Addendum: Sidecar Transport Mapping
+
+Date: 2026-06-12
+Status: Proposed (same review process as the Phase 7 spec freeze)
+Owner: Threshold Labs
+Scope: maps the frozen interactive-session API
+(`phase-7-interactive-session-spec-freeze.md`, section 5) onto the
+sidecar process boundary chosen in Decision Log entry 2, and scopes
+what that boundary means for #4007 (the decision-gated TEE checker
+stack). This document changes no contract: the sidecar is a
+transport swap by construction, and anything here that would alter
+the frozen spec is a defect in this document.
+
+## 1. What the sidecar is
+
+A separate OS process that owns the signer engine and every secret
+it holds: key-share state, the state-encryption key path, and (after
+Phase 7.1) the in-memory interactive nonces. The keep-client host
+process — Go runtime, libp2p, Ethereum client, every transitive
+dependency — talks to it over local IPC and holds no signing
+secrets at any time.
+
+The isolation claim, stated precisely: today a memory-disclosure
+bug anywhere in the host address space can read whatever the
+in-process engine holds, because the dlopen FFI is an API boundary,
+not a security boundary. The sidecar makes the boundary an OS
+process boundary. It is also the deliberate stepping stone to the
+TEE deployment: a sidecar process becomes an enclave process with
+the same wire protocol, which is precisely why decision 2 told
+isolation-sensitive work to assume this shape.
+
+## 2. Why the frozen API maps cleanly
+
+Two prior decisions did the work in advance:
+
+* The engine API is already coarse JSON request/response over a C
+  ABI — chosen over round-level FFI compatibility partly FOR
+  "cleaner future sidecar extraction"
+  (`signer-api-contract-decision-brief.md`).
+* The frozen section-5 calls are idempotent-or-fail-closed,
+  self-contained request/response with no callbacks and no shared
+  memory.
+
+One tension to resolve explicitly: the old decision brief argued
+against round-level APIs because they kept "nonce/round details
+crossing the FFI boundary" and made the transport swap harder. The
+Phase 7 API *is* round-level (`Round1`/`Round2`) — interactivity is
+forced by true two-round FROST with a network exchange between
+rounds — but the brief's actual objection is dissolved by the
+frozen spec's section 4: rounds cross the boundary, **nonces do
+not**. What transits is public commitments, signing packages, and
+shares. The chattiness objection is inherent to interactive FROST
+and is bounded (two round trips per attempt against a ~41-block
+attempt budget; the Annex B arithmetic gives ~175x headroom).
+
+## 3. Transport mapping
+
+Same JSON envelopes, different carrier:
+
+| Engine call (frozen spec §5 / existing API) | dlopen transitional | Sidecar |
+|---|---|---|
+| `InstallNativeTBTCSignerConfig` (init) | `frost_tbtc_init_signer_config` symbol | First request after connect (handshake step 2) |
+| `InteractiveSessionOpen/Round1/Round2/Aggregate/Abort` | per-call symbols (Phase 7.1/7.2) | One method each, identical JSON bodies |
+| Coarse transitional calls (until deleted per spec §7) | existing symbols | Same mapping rule |
+
+Carrier (proposed defaults, section 8): a UNIX domain socket with
+length-prefixed JSON frames, a small connection pool, and exactly
+one in-flight request per connection. No request multiplexing in
+v1: the engine's concurrency model and registries are unchanged,
+and the pool bounds parallelism exactly as the host's call sites do
+today. Errors keep the structured `ErrorResponse` contract
+(`consumed_attempt_replay` etc.) — the codes are the cross-version
+interface and MUST NOT fork between transports.
+
+Transport conformance: the contract tests that pin the FFI behavior
+become transport-parameterized — the same request/response suites
+run against the dlopen bridge and the sidecar, and divergence is a
+release blocker. This is the mechanism that keeps "transport swap,
+not API rework" true over time.
+
+## 4. Process model and lifecycle
+
+* **Spawn/supervision (proposed default)**: keep-client spawns the
+  sidecar as a child process and supervises it (restart with
+  backoff). The alternative — independent systemd unit — is open
+  question (a); the child model keeps the operator surface to one
+  service and lets the existing init-config demand semantics apply
+  without a coordination protocol.
+* **Handshake**: (1) version exchange — the host refuses to operate
+  a sidecar outside its supported range, fail closed; (2) init-
+  config install — the host reads `TBTC_SIGNER_INIT_CONFIG_PATH`
+  and posts the install request as the first message, exactly the
+  #4037/#4041 flow. **Decision 7 carries over unchanged**: with the
+  path set, a sidecar that cannot be spawned, cannot complete the
+  handshake, or rejects the config is process-fatal for the host,
+  in every profile. The enforcement point
+  (`enforceNativeInitConfigDemand`) gains "sidecar unreachable" as
+  one more member of the same failure family.
+* **Crash semantics**: a sidecar crash loses in-flight nonces — by
+  the frozen spec's section 4 and ratified question 4
+  (markers-only), this is exactly the restart story: live attempts
+  fail safe, durable consumption markers prevent any replay, the
+  supervisor restarts the sidecar, re-init runs (idempotent by
+  config fingerprint), and the attestation TTL applies at re-init
+  (runbook prerequisite 6). No new failure mode is introduced; the
+  sidecar converts "host process restart" into the strictly smaller
+  "signer process restart."
+* **Shutdown**: host-initiated graceful stop sends `SessionAbort`
+  for live sessions (zeroize), then terminates. SIGKILL is
+  equivalent to a crash and is safe by the same argument.
+
+## 5. Security boundary
+
+* Socket: filesystem-permission-guarded UDS (owner-only directory),
+  peer-credential check (`SO_PEERCRED`/`LOCAL_PEERCRED`) pinning
+  the host UID. Never a network listener — a TCP mode is explicitly
+  out of scope and should be rejected in review if proposed.
+* Authentication beyond UID pinning is deliberately deferred: the
+  v1 trust model is same-host, same-operator. The TEE phase
+  replaces this with an attestation-bound channel; designing that
+  channel is part of #4007's scope, not this addendum's.
+* Secrets: the state-encryption key provider (env/command) runs in
+  the **sidecar's** process environment, not the host's. The config
+  file may carry `state_key_command` (its 0600 guidance stands);
+  the command executes sidecar-side. Host environment variables
+  stop being a secret channel entirely.
+
+## 6. What does not change
+
+JSON schema ownership (Rust), the error-code contract, idempotency
+and fail-closed semantics, registries and persistence
+(sidecar-local files, same formats), provenance gating, the frozen
+section-5 verification rules, and the section-7 deletion trigger.
+The dlopen bridge remains the shipping transport until the sidecar
+lands; Phases 7.1-7.5 build and validate on dlopen without waiting.
+
+## 7. #4007 (TEE checker stack) scoping
+
+#4007 gates *whether a signer may register* on TEE attestation
+evidence and stays decision-gated on the DAO's TEE policy — this
+addendum does not undraft it. What the sidecar decision gives it is
+a concrete subject: the artifact whose identity gets attested is
+the sidecar binary (later, the enclave image), not the composite
+keep-client process. #4007's open scoping questions become: which
+measurement (binary hash / enclave MRENCLAVE-equivalent), who
+verifies (the DAO-whitelist checker), and how the attestation binds
+to the UDS channel. Those land in #4007's own design doc; the
+interface contract it must respect is sections 3-5 here.
+
+## 8. Open questions (proposed defaults; decide at this addendum's
+sign-off)
+
+* (a) **Spawn model**: keep-client child process (default) vs.
+  independent systemd unit.
+* (b) **Wire framing**: length-prefixed JSON frames (default) vs.
+  newline-delimited JSON.
+* (c) **Connection model**: small pool, one in-flight request per
+  connection (default) vs. request-id multiplexing.
+* (d) **Packaging**: sidecar binary ships in the same release
+  artifact as keep-client (default) vs. separate artifact with its
+  own version line.
+
+## 9. Sequencing
+
+The sidecar is not on the 7.1-7.5 critical path: those phases build
+on the dlopen transport, and the frozen API guarantees the swap is
+transport-only. The sidecar track runs in parallel and must
+converge **before the ECDSA-retirement phases** (decision 1's
+timing: take the isolation step before mainnet TVL migrates).
+Suggested shape: 7.S1 sidecar process + handshake + conformance
+suite; 7.S2 operational hardening (supervision, packaging,
+runbook); 7.S3 cutover of the production default with dlopen kept
+as the rollback transport for one release.
From e45b975fc4672986ce989d822527a6010f8a3da6 Mon Sep 17 00:00:00 2001
From: maclane
Date: Fri, 12 Jun 2026 21:34:43 -0400
Subject: [PATCH 2/2] docs(tbtc/signer): scope the sidecar secret-boundary claim to signing

Section 1 said the host "holds no signing secrets at any time," but
section 3 maps the transitional DKG calls unchanged and the frozen
Phase 7 spec still has the DKG APIs returning/accepting
secret_package_hex through the host until the DKG-custody follow-up.
So in deployments that run DKG through this transport the host still
sees DKG secret material (review finding).

Section 1 now scopes the property to the signing path and states
explicitly that #4007 must treat the host<->sidecar signing interface
as a secret boundary but NOT the DKG interface until DKG custody moves
inside the sidecar - a precondition for the sidecar being a complete
secret boundary.

Co-Authored-By: Claude Fable 5
---
 .../phase-7-sidecar-transport-addendum.md     | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/pkg/tbtc/signer/docs/phase-7-sidecar-transport-addendum.md b/pkg/tbtc/signer/docs/phase-7-sidecar-transport-addendum.md
index 7f9b7f5506..755cc0232c 100644
--- a/pkg/tbtc/signer/docs/phase-7-sidecar-transport-addendum.md
+++ b/pkg/tbtc/signer/docs/phase-7-sidecar-transport-addendum.md
@@ -17,8 +17,23 @@ A separate OS process that owns the signer engine and every secret
 it holds: key-share state, the state-encryption key path, and (after
 Phase 7.1) the in-memory interactive nonces. The keep-client host
 process — Go runtime, libp2p, Ethereum client, every transitive
-dependency — talks to it over local IPC and holds no signing
-secrets at any time.
+dependency — talks to it over local IPC.
+
+**Boundary scope (important, and a hard prerequisite for #4007).**
+The "host holds no signing secrets" property is *scoped to the
+signing path* and holds once Phase 7.1's engine-held nonce custody
+ships: key shares are env/command-only and nonces never leave the
+engine. It does **not** yet hold for **DKG**: the transitional DKG
+APIs that section 3 maps unchanged still return and accept
+`secret_package_hex` through the host (frozen Phase 7 spec section 4
+names DKG secret-package custody as an out-of-scope follow-up). So in
+any deployment that runs DKG through this transport, the host process
+still sees DKG secret material. #4007 must therefore treat the
+host↔sidecar **signing** interface as a secret boundary but must NOT
+treat the DKG interface as one until the DKG-custody follow-up moves
+that material inside the sidecar (or DKG is run out-of-band). Closing
+that gap is a precondition for the sidecar being a complete secret
+boundary.
 
 The isolation claim, stated precisely: today a memory-disclosure
 bug anywhere in the host address space can read whatever the