Trail of Bits tBTC v2 AC Extension — security remediation#4126
Open
lrsaturnino wants to merge 18 commits into
Open
Trail of Bits tBTC v2 AC Extension — security remediation#4126lrsaturnino wants to merge 18 commits into
lrsaturnino wants to merge 18 commits into
Conversation
…riplet eligibility filter excludeOperatorTriplets assigned the third operator from operators[j] instead of operators[k], so the post-exclusion seat-count filter double-counted the middle operator and ignored the third. This admitted triples whose true remaining seat count is below retryParticipantsCount and dropped otherwise-valid triples, biasing the tECDSA DKG/signing retry candidate set toward under-quorum attempts that then fail at the protocol layer. Assign rightOperator from operators[k] so the filter subtracts the seats of all three distinct operators in each triple. Add a regression test that pins the eligible-triplet count and the selected exclusion against the corrected arithmetic.
… past redemption events PastRedemptionRequestedEvents populated the converted event's TxMaxFee field from event.TreasuryFee, so every historical RedemptionRequested event returned to off-chain consumers (monitoring, replay, analytics) reported the treasury fee in place of the per-transaction max fee. The live-state converter in the same file already reads TxMaxFee correctly, confirming this as an off-by-one-field copy-paste error. Map TxMaxFee from event.TxMaxFee. Extract the per-event conversion into a pure convertPastRedemptionRequestedEvent helper so the field-by-field mapping is unit testable, and add a regression test asserting the treasury fee and tx max fee are populated from their own distinct sources. TOB-TBTCACEXT-10
… signing done check The listen goroutine wrote to signingDoneCheck.doneSigners under doneSignersMutex, but waitUntilAllDone read the map (length check and range iteration) and isValidDoneMessage read it (duplicate-signer check) without holding the mutex. Go maps are not safe for concurrent read/write, so normal signing completion — including the covenant signer submit paths — raced and could fail or crash the signer process. Acquire doneSignersMutex around the length check and iteration in waitUntilAllDone (capturing the result into locals and releasing the lock before returning) and around the duplicate-signer read in isValidDoneMessage. Add a race-detector regression test that drives concurrent listen writes and waitUntilAllDone reads; it broadcasts done messages with the stateless standard retransmission strategy to isolate this race from the separate backoff-strategy synchronization.
…ck counters BackoffStrategy stored its tickCounter, delay, and retransmitTick counters without synchronization, but ScheduleRetransmissions invokes strategy.Tick from a fresh goroutine on every ticker event. When one retransmission tick was still running as the next fired, two goroutines read and wrote the same BackoffStrategy instance concurrently, corrupting the retransmission schedule used by tECDSA signing, DKG, and signing-done messages. Add a sync.Mutex to BackoffStrategy and guard the counter read/modify/write in Tick, capturing the retransmit decision into a local and invoking retransmitFn after releasing the lock so the message is not retransmitted while holding shared state. Add a race-detector regression test that calls Tick concurrently from many goroutines behind a start barrier.
…ttempt members signingDoneCheck.listen discarded the attempt member list and kept only its length, and isValidDoneMessage validated wallet-group membership but not attempt inclusion. Any valid wallet group member — including one excluded from the current attempt — could therefore fill a completion slot, letting waitUntilAllDone reach its threshold before the selected signers finished and skewing the next batch item's start block from a non-attempt member's endBlock. Record the selected attempt members in a set in listen and reject done messages whose sender is not in that set in isValidDoneMessage. Keep the set as goroutine-local state captured by the listener closure and passed to isValidDoneMessage, rather than on the shared signingDoneCheck struct, so concurrent listen calls on the same instance cannot race on it or trigger a "concurrent map writes" fatal error while populating it. Add a regression test that sends a done message from a valid group member excluded from the attempt and asserts the completion check does not count it.
…oval certificates verifySignerApprovalCertificate parsed the DER-encoded threshold signature and called ecdsa.Verify without checking that the signature used the canonical low-S form. ECDSA signatures are malleable: for a valid secp256k1 signature (r, s), the signature (r, N - s) also verifies, so two distinct byte encodings were accepted for the same approval — a hazard if certificate signature bytes are ever used as an identifier, deduplication key, or audit-log key. Add a low-S check after DER parsing and before ecdsa.Verify, mirroring the rule already applied by the covenant signer artifact approval verifier. Add a regression test that malleates a valid certificate's signature to its high-S counterpart (r, N - s) and asserts the verifier rejects it.
… destination buildCovenantTransactionBuilder wrote MigrationDestination.DepositScript directly as the destination output's PublicKeyScript. In tBTC that field is the plain deposit script, and the Bridge's Deposit library rebuilds the funding output and requires its 32-byte hash to equal sha256(depositScript) for P2WSH. Signing the raw deposit script as the output script produced a funding output that revealDepositWithExtraData rejects, making the migration deposit unrevealable and leaving the migration debt unrepaid. Derive the P2WSH scriptPubKey (OP_0 <sha256(depositScript)>) from the deposit script via bitcoin.PayToWitnessScriptHash(bitcoin.WitnessScriptHash(...)), matching the pattern already used for the active covenant UTXO, and reject empty deposit scripts. Update the self_v1 and qc_v1 submit tests to assert the destination output is the P2WSH of the deposit script and not the raw script.
…jobs persist across restarts The covenant signer store persisted jobs under the nested directory name "covenant-signer/jobs". The disk persistence handle creates only one directory level (os.Mkdir, not MkdirAll) and enumerates only one directory level on load, reporting a nested file's directory as its first-level parent. As a result the first job save could fail when the parent directory did not exist, and even when it did, every persisted job was skipped on reload — breaking idempotency and recovery for accepted requests across restarts. Use the single-level directory name "covenant-signer-jobs", skip non-.json files (such as the store lock file that now shares the jobs directory) during load, and migrate any job files left under the legacy nested path on startup. Add disk-backed store tests that persist a job through a real NewBasicDiskHandle, restart the store, and assert both the request ID and route request indexes are restored, plus a test covering the legacy-path migration.
…ts in covenant signing
When a tBTC wallet is closed or terminated, handleWalletClosure archives it and
removes it from the wallet registry, but two paths kept the archived wallet
signable through the covenant signer:
1. getSigningExecutor consults the node's signingExecutors cache before the
wallet registry, so an executor created while the wallet was live kept being
returned after archival. The node could keep participating in threshold
signing (including the covenant signer paths) for a wallet the closure was
meant to deauthorize, until the process restarted.
2. VerifySignerApproval resolved the wallet and checked that registry data was
available, but never checked the wallet lifecycle state. The signer set hash
embedded in an approval certificate binds only the wallet identity, members
hash, and threshold, none of which change when a wallet is closed or
terminated, so a certificate issued while the wallet was live kept verifying
after closure.
Add node.invalidateSigningExecutor, which evicts the wallet's cached executor
under signingExecutorsMutex, and call it from handleWalletClosure after
walletRegistry.archiveWallet succeeds. Fail closed in VerifySignerApproval by
rejecting covenant signing for wallets that are not in the live state, the only
state covenant migrations are expected for; supporting another state later
requires adding it explicitly with justification.
Add regression tests that cache an executor, archive the wallet, confirm the
stale executor is still served pre-invalidation, and assert the wallet is no
longer signable after invalidation; and that VerifySignerApproval rejects an
otherwise-valid request once the wallet is no longer live.
…p without approval roots The covenant signer service only enforced the presence of a signer approval verifier and the route trust roots when the requireApprovalTrustRoots flag was set; otherwise it merely logged warnings and started anyway. A production (non-loopback) deployment could therefore start with the multi-party approval model silently disabled, relying on operators noticing warnings and enabling strict settings. Treat a non-loopback listen address as requiring the full approval model, mirroring the existing non-loopback authToken requirement: validateRequiredApprovalTrustRoots now also enforces the trust roots and the signer approval verifier when the listen address is non-loopback. Loopback deployments keep the warning-only behavior unless requireApprovalTrustRoots is set. Add startup tests asserting that a non-loopback address fails when trust roots or the verifier are missing.
… DKG approval archiveClosedWallets used IsWalletRegistered(walletID) == false as the signal that a wallet was closed or terminated and could be archived. That signal also matches a freshly generated wallet whose DKG result has been submitted but not yet approved on-chain: during the approval window the wallet exists only on the node's disk and is not yet registered. A node restart in that window archived the fresh key share, and enough affected operators could leave the new wallet without a signing quorum. Distinguish the two cases via GetWallet, which returns ErrWalletNotFound for a wallet the Bridge has never recorded (the pending case) and returns the wallet with its state for a registered wallet (which the Bridge retains permanently after closure). Archive only wallets recorded on-chain in a Closed or Terminated state, and leave not-yet-registered wallets in place. Add regression tests covering both the pending-wallet skip and the closed-wallet archival.
…evoked withdrawals The internal withdrawRevoked overwrote deposits[operator].withdrawn with the current available amount instead of adding to it. After a prior partial withdrawal had already incremented withdrawn, calling withdrawRevoked reset the counter, making availableAmount report the same tokens as available again. Because the escrow holds KEEP for many operators in one shared balance, repeated withdrawRevoked calls could withdraw the same tokens over and over and drain tokens backing other operators' deposits. Set deposits[operator].withdrawn = deposit.withdrawn.add(amount), matching the accounting already used by withdraw and migrate. Add a regression test that performs a partial withdraw, revokes the grant, calls withdrawRevoked, and asserts a second withdrawRevoked withdraws nothing.
…ling completes The DKGResultSubmitted callback recorded the deduplication key before running validation. Because validation can exit early on transient failures (RPC errors in IsDKGResultValid/ChallengeDKGResult/GetDKGState/DKGParameters, or a failed challenge-confirmation wait), a failed attempt left the key set. A later redelivery of the same event was then dropped as already-processed, silently removing the node from the challenger set for the rest of the challenge window. Split the deduplicator into claim/confirm/abort semantics backed by an in-progress set plus the existing completed TimeCache: an event is only recorded as completed once its handler succeeds, while concurrent duplicate deliveries are still ignored. executeDkgValidation/validateDKG now return an error so the callback confirms the dedup entry on a terminal outcome and releases it on a transient failure, allowing a redelivered event to retry.
…archival The WalletClosed callback recorded the deduplication key before archiving the wallet. handleWalletClosure can fail transiently (RPC error while reading the block counter or waiting for closure confirmation, an unconfirmed result, or a failed archive write) and return before removing the wallet from the local registry. The dedup key stayed set, so a redelivered WalletClosed event was dropped and the closed wallet remained in walletRegistry, keeping it signable through fresh getSigningExecutor lookups until process restart. Move the wallet closure to claim/confirm/abort deduplication: the callback now confirms the dedup entry only after handleWalletClosure succeeds and releases it on failure, so a redelivered event retries the archival.
…er handling The DKGStarted callback recorded the deduplication key before the block-height confirmation wait, the DKG state check, the past-event lookup, and the local DKG join. If any of those steps returned early on a transient error, the dedup key stayed set and a redelivered event was skipped, so the operator never joined the new group. Move the DKG started handling to claim/confirm/abort deduplication: the callback now confirms the dedup entry only once the local DKG join has been dispatched (or the event was authoritatively found unconfirmed) and releases it on any transient early return, so a redelivered event retries the handling.
…f targets Each SPV scanner fetched the latest TransactionLimit transactions for a wallet address and only then filtered for unproven protocol transactions. Because the limit was applied to the raw address history before protocol-level filtering, an attacker could send more than TransactionLimit dust transactions to the wallet address to push the real protocol transaction out of the bounded window, so the auto-prover found nothing and never submitted the SPV proof. Add a shared getUnprovenWalletTransactions helper that lists the wallet's confirmed transaction hashes and scans them newest-to-oldest, fetching and protocol-filtering each until enough unproven protocol transactions are found or the history is exhausted. The limit now bounds the number of matching protocol transactions rather than the raw address transactions examined, so unrelated spam can no longer evict protocol transactions from discovery. All four scanners (deposit sweep, redemption, moving funds, moved funds sweep) use the helper.
…bridge fraud defense The covenant signer produces a normal Bitcoin SIGHASH_ALL signature over a covenant active UTXO using the tBTC wallet key. Because a covenant active UTXO is neither a swept deposit, a spent main UTXO, nor a processed moved-funds sweep, the tBTC Bridge cannot defeat a fraud challenge that replays this signature, so a valid covenant migration signature can become a slashable wallet signature once the transaction is broadcast. The complete remediation is a bridge-side covenant fraud-defense path in the tBTC Bridge (Fraud.defeatFraudChallenge), which lives outside this repository. As the keep-core-side safeguard, gate covenant signing behind an explicit operator confirmation that the bridge covenant fraud-defense is deployed: covenantSigner.bridgeCovenantFraudDefenseConfirmed defaults to false and, until set, OnSubmit fails closed for both self_v1 and qc_v1 routes so the node cannot produce a signature that would expose a wallet to an undefeatable fraud challenge. The signing choke point is documented with the same invariant.
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
…diation Reconcile the Trail of Bits AC Extension remediation with the base feature branch, which has since landed its own independent deep-audit work. Three files conflicted; both resolutions preserve the security intent: - signer approval low-S (TOB-TBTCACEXT-70): the base already rejects non-canonical high-S signatures inline in verifySignerApprovalCertificate, so the duplicate isLowSSecp256k1 helper added here was dropped in favor of the base's version. - covenant-signer job directory (TOB-TBTCACEXT-65): the base reworked store.go but still persisted jobs under the nested "covenant-signer/jobs" path, which the single-level disk handle skips on reload. Re-ported the flat "covenant-signer-jobs" directory, the legacy-path startup migration, and the non-.json load skip onto the base's version. Build passes and the covenant-signer, signing-done, wallet-lifecycle, and retransmission/retry/SPV package tests are green on the merged tree.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Applies the remediation for the Trail of Bits "tBTC v2 AC Extension" security assessment. Every in-scope finding whose target code lives in this repository is fixed in its own commit, tagged with its public report ID (
TOB-TBTCACEXT-<n>) so each fix maps 1:1 to a finding, with regression test(s) added alongside each fix.Base / stacking
This branch was created off the frozen review commit
235e8fd9, an ancestor offeat/psbt-covenant-final-project-pr(the reviewed PR, #3882). The PR therefore targets that branch so the diff is exactly the remediation (17 commits) rather than PR #3882's own changes. Because the feature branch has since advanced past the frozen commit (it now carries the deep-audit fixes in #3938), expect to reconcile against those later commits when this lands. Retarget tomainonce #3882 lands.Findings remediated (17)
doneSignersmap on every access in the signing-done check), 6 (synchronizeBackoffStrategytick counters against concurrent retransmission ticks), 22 (scan past address spam when discovering SPV proof targets), 36 (do not archive wallets still inside the DKG approval window), 56 (deduplicate DKG results only after handling completes), 57 (deduplicate wallet closures only after archival succeeds), 65 (use a flat job directory so covenant-signer jobs persist across restarts), 87 (deauthorize archived and non-live wallets in covenant signing)TxMaxFeefrom its own field in past redemption events), 26 (reject signing-done messages from members not in the current attempt), 72 (deduplicate theDKGStartedcallback only after its handler completes)Finding 1's complete remediation is a bridge-side change in the tBTC Bridge (
Fraud.defeatFraudChallenge) that lives in thetbtc-v2repository; the keep-core side fails closed until an operator confirms that defense is deployed.Verification
go build ./...passes. The affected-package tests are green (pkg/tecdsa/retry,pkg/net/retransmission,pkg/covenantsigner,pkg/maintainer/spv, and the covenant-signer, signing-done, and wallet-lifecycle regression tests inpkg/tbtc). Each finding commit carries its own regression test(s) — race-detector tests for the concurrency findings, handler-level transient-failure and redelivery tests for the deduplication findings, and value-level tests for the accounting and filtering fixes.Opened as a draft for review.