Skip to content

Submitter Commit State#926

Open
Kukoomomo wants to merge 17 commits into
mainfrom
submitter_state
Open

Submitter Commit State#926
Kukoomomo wants to merge 17 commits into
mainfrom
submitter_state

Conversation

@Kukoomomo
Copy link
Copy Markdown
Contributor

@Kukoomomo Kukoomomo commented Apr 2, 2026

Summary by CodeRabbit

Release Notes

New Features

  • Added support for optimized "commitState" transactions that reference previously stored blob hashes, reducing transaction overhead

Bug Fixes

  • Enhanced blockchain reorganization detection and state recovery mechanisms
  • Improved transaction confirmation handling and validation logic

Improvements

  • Refined pending transaction pool management and state consistency
  • Optimized batch index and nonce selection processes

Review Change Stack

Kukoomomo and others added 11 commits March 12, 2026 17:09
- Add _setStoredBlobHash helper and BATCH_BLOB_VERSIONED_HASHES_SLOT in L1MessageBaseTest
- Preset batchBlobVersionedHashes in tests so commitBatch succeeds without blob tx
- test_commitAndFinalizeWithL1Messages_succeeds: set stored blob hash for batch 1 and 2
- test_commitBatches_succeeds: set stored blob hash for batch 1
- test_revertBatch_succeeds: set stored blob hash for batch 1 and 2
- Remove duplicate ZERO_VERSIONED_HASH from RollupCommitBatchWithProofTest

Made-with: Cursor
Co-authored-by: FletcherMan <fanciture@163.com>
Co-authored-by: fletcher.fan <fletcher.fan@bitget.com>
@Kukoomomo Kukoomomo requested a review from a team as a code owner April 2, 2026 07:16
@Kukoomomo Kukoomomo requested review from panos-xyz and removed request for a team April 2, 2026 07:16
@Kukoomomo Kukoomomo changed the title Submitter commit state Submitter Commit State Apr 2, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 2, 2026

📝 Walkthrough

Walkthrough

This PR introduces support for the commitState method, which recommits a batch using a stored blob hash on L1 instead of including the blob data in the transaction. The changes span new constants and contracts, parsing utilities, pending transaction pool refactoring, and rollup service integration with dynamic method selection based on L1 blob-hash state.

Changes

CommitState Feature Implementation

Layer / File(s) Summary
Constants and Contract Definitions
tx-submitter/constants/methods.go, tx-submitter/iface/rollup.go, tx-submitter/mock/rollup.go
New MethodCommitState constant and IsCommitLikeMethod helper unify commit-like checks; IRollup interface adds BatchBlobVersionedHashes to expose per-batch stored blob hash state; mock implementation provided.
Parsing and Recognition of CommitState
tx-submitter/utils/utils.go
Extended ParseMethod, ParseBusinessInfo, and ParseParentBatchIndex to recognize and parse commitState selector; removed unused ParseL1Mempool functions.
Pending Transaction Pool Simplification
tx-submitter/services/pendingtx.go, tx-submitter/services/rollup_handle_test.go
Simplified NewPendingTxs constructor to accept only journal (removed method ID parameters); added thread-safe Len() method; refactored recovery to populate in-memory state directly and deduplicate via journal rewrite.
Reorg Detection and Metrics Separation
tx-submitter/services/reorg.go
Removed metrics parameter from NewReorgDetector; reorg metric recording moved to rollup service level via new handleReorg.
Rollup Initialization and Reorg Detection Integration
tx-submitter/services/rollup.go (initialization, ProcessTx, reorg handling)
Wire simplified PendingTxs(journal) initialization; integrate detectReorgWithRetry() with metrics recording via handleReorg; add safety checks for submitter-rotation timing state.
Transaction Confirmation and Fee Metrics
tx-submitter/services/rollup.go (confirmation, fee handling)
Pass consistent status and currentBlock to handleConfirmedTx; early-return on nil state to prevent receipt mismatches; use IsCommitLikeMethod for fee metric routing.
Pending Transaction Failure Handling and Early Cancellation
tx-submitter/services/rollup.go (handlePendingTx)
Cancel "doomed" commit-like and finalize txs early when target batch is already committed/finalized by another submitter, before applying timeout replacement; remove tracked txs without re-adding cancels to pool.
Finalize Processing and PendingTxs Accessor Patterns
tx-submitter/services/rollup.go (finalize, getters)
Use accessor methods (GetPFinalize, GetNonce, Len) instead of direct field access; rely on SendTx to add txs to pending pool.
Rollup Commit-Like Method Selection Based on Blob-Hash State
tx-submitter/services/rollup.go (rollup)
Query BatchBlobVersionedHashes for the next batch to dynamically select between commitState (blob hash exists) and commitBatch (no blob hash); choose transaction type (dynamic-fee vs rollup) accordingly.
Transaction Resubmission and Rebuild for Blob-Hash Changes
tx-submitter/services/rollup.go (tryRebuildRollupCommitTx, ReSubmitTx)
New tryRebuildRollupCommitTx inspects L1 blob-hash state and rebuilds in-flight commit-like txs to match current conditions; ReSubmitTx attempts rebuild first, bumps blobFeeCap when converting non-blob to blob tx.

Sequence Diagram(s)

sequenceDiagram
  participant Rollup as Rollup Service
  participant ReorgDetect as ReorgDetector
  participant L1 as L1 RPC / Contract
  participant PendingPool as PendingTxs
  participant Confirm as Confirmation Handler
  Rollup->>ReorgDetect: ProcessTx calls detectReorgWithRetry()
  ReorgDetect->>L1: fetch block history, detect reorg
  L1-->>ReorgDetect: reorg depth or ok
  alt hasReorg
    ReorgDetect-->>Rollup: hasReorg=true, depth
    Rollup->>Rollup: handleReorg(depth) increments metrics
    Rollup->>PendingPool: clears confirmed txs
  else no reorg
    ReorgDetect-->>Rollup: hasReorg=false, 0
  end
  Rollup->>L1: query BatchBlobVersionedHashes(nextBatch)
  L1-->>Rollup: stored blob hash or empty
  alt blob hash exists
    Rollup->>Rollup: encode commitState (blob already stored)
    Rollup->>Rollup: build dynamic-fee tx
  else no blob hash
    Rollup->>Rollup: encode commitBatch (include blob in tx)
    Rollup->>Rollup: build rollup tx (with blob sidecar)
  end
  Rollup->>PendingPool: SendTx adds signed tx
  PendingPool-->>Confirm: pending tx awaits confirmation
  Confirm->>L1: poll receipt
  L1-->>Confirm: receipt or nil
  Confirm->>Confirm: handleConfirmedTx processes success/failure
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • morph-l2/morph#924: Refactors PendingTxs.Recover() to directly populate in-memory state and deduplicate via journal rewrite, aligning with this PR's pending-pool lifecycle changes.
  • morph-l2/morph#911: Introduces commitState and per-batch stored blob-hash support in L1 Rollup contracts, providing the state that this PR queries via BatchBlobVersionedHashes.
  • morph-l2/morph#800: Modifies blob-tx construction and resubmission to incorporate versioning/sidecar logic, overlapping with this PR's dynamic method selection and rebuild paths.

Suggested reviewers

  • SecurityLife
  • FletcherMan
  • Web3Jumb0

Poem

🐰 A commitState arrives, with hashes stored on L1,
No blob sidecar needed when the blob's already done!
Dynamic method picking, rebuild paths anew,
The pending pool flows cleaner—accessors see it through. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 45.45% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main addition to the codebase: a new commit state feature.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch submitter_state

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
tx-submitter/services/rollup.go (2)

386-405: ⚠️ Potential issue | 🔴 Critical

Nil/uninitialized rotator handling is still inconsistent.

These lines now treat r.rotator == nil or missing startTime/epoch as a safe skip, but rollup() still unconditionally calls r.rotator.CurrentSubmitter(...) and dereferences r.rotator.startTime/epoch. With the same configuration the background rollup loop will panic instead of skipping.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx-submitter/services/rollup.go` around lines 386 - 405, The rollup() logic
calls r.rotator.CurrentSubmitter and later dereferences
r.rotator.startTime/epoch even when r.rotator is nil or uninitialized, risking a
panic; update rollup() to first check r.cfg.PriorityRollup and r.rotator != nil
before calling r.rotator.CurrentSubmitter, and also guard access to
r.rotator.startTime and r.rotator.epoch (returning errNotMyTurn when rotator is
nil or its state fields are nil) so all uses of r.rotator are protected;
specifically adjust the conditional around r.cfg.PriorityRollup and r.rotator,
move the CurrentSubmitter call after that guard, and ensure any early returns
use consistent error values.

602-688: ⚠️ Potential issue | 🟠 Major

Reverted batches can leave doomed commitBatch txs pending until timeout.

This doomed-commit check only looks at batchIndex <= lastCommitted. After revertBatch, lastCommitted drops back below the batch again, but Rollup.commitBatch now reverts while batchBlobVersionedHashes[batchIndex] is set. A recovered/pending commitBatch therefore sits until TxTimeout before ReSubmitTx can rebuild it to commitState, which can stall recommit or burn a nonce on-chain.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx-submitter/services/rollup.go` around lines 602 - 688, handlePendingTx can
leave doomed commitBatch txs pending when a revertBatch leaves lastCommitted <
batchIndex but batchBlobVersionedHashes[batchIndex] remains set; update the
commit-like branch in handlePendingTx to also detect this reverted-but-has-blob
state and cancel immediately: after computing batchIndex and lastCommitted
(using utils.ParseParentBatchIndex and r.Rollup.LastCommittedBatchIndex), add a
check against the rollup's batch blob map (batchBlobVersionedHashes or an
existing accessor) and treat a present/non-empty blob entry for batchIndex as a
reason to cancel (same CancelTx flow used when batchIndex <= lastCommitted);
reference handlePendingTx, CancelTx, commitBatch, revertBatch, and
batchBlobVersionedHashes when making the change.
tx-submitter/services/pendingtx.go (1)

242-299: ⚠️ Potential issue | 🟠 Major

Recover pnonce from the max nonce, not the last journal entry.

Line 294 assumes ParseAllTxs() returns nonce order, but journal append order can contain replacements with a lower nonce after a higher one. After restart that leaves pnonce too low and the next submission can reuse an already-pending nonce.

💡 Suggested fix
-	var maxCommitBatchIndex, maxFinalizeBatchIndex uint64
+	var maxCommitBatchIndex, maxFinalizeBatchIndex, maxNonce uint64

 	for _, tx := range txs {
+		if tx.Nonce() > maxNonce {
+			maxNonce = tx.Nonce()
+		}
+
 		// Get method name
 		method := utils.ParseMethod(tx, abi)
@@
-	pt.SetNonce(txs[len(txs)-1].Nonce())
+	pt.SetNonce(maxNonce)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx-submitter/services/pendingtx.go` around lines 242 - 299, The recovery
currently sets pnonce from the last journal entry via
pt.SetNonce(txs[len(txs)-1].Nonce()), which is unsafe; instead compute the
maximum nonce seen while iterating txs and call pt.SetNonce(maxNonce) after the
loop. Update the loop that processes txs to track a maxNonce (compare using
tx.Nonce()), handle the empty-txs case appropriately, and replace the
pt.SetNonce(...) call with pt.SetNonce(maxNonce) so pnonce reflects the highest
pending nonce rather than the last array element.
🧹 Nitpick comments (2)
contracts/contracts/test/base/L1MessageBase.t.sol (1)

50-54: Storage slot constants are fragile — consider using foundry's storage inspection.

These hardcoded slot numbers (172, 173) will silently break if the Rollup contract's storage layout changes. Consider adding a comment noting how these were derived (e.g., forge inspect Rollup storage-layout) or adding a sanity-check assertion in setUp() that verifies the slots are still correct.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@contracts/contracts/test/base/L1MessageBase.t.sol` around lines 50 - 54, The
hardcoded storage slot constants BATCH_BLOB_VERSIONED_HASHES_SLOT and
ROLLUP_DELAY_PERIOD_SLOT (and ZERO_VERSIONED_HASH) are fragile; run and record
the exact command used to derive them (e.g., `forge inspect Rollup
storage-layout`) in a comment next to the constants, and add a sanity-check
assertion in setUp() that uses vm.load against the deployed Rollup contract to
verify those slots contain the expected sentinel values (compare the loaded
bytes32 to ZERO_VERSIONED_HASH or known defaults) so tests fail fast if the
Rollup storage layout changes.
tx-submitter/mock/rollup.go (1)

68-71: Mock always returns empty hash — commitState path untestable.

The mock always returns [32]byte{}, which means useCommitState (checked via storedBlobHash != [32]byte{} in rollup.go:1163) will always be false. This prevents testing the commitState code path.

Consider adding a setter method to control the return value:

Proposed addition
 type MockRollup struct {
 	lastCommittedBatchIndex *big.Int
 	lastFinalizedBatchIndex *big.Int
 	insideChallengeWindow   bool
 	batchExists             bool
 	batchTx                 *types.Transaction
 	finalizeTx              *types.Transaction
+	blobVersionedHash       [32]byte
 }

 // BatchBlobVersionedHashes implements IRollup (no stored hash by default)
 func (m *MockRollup) BatchBlobVersionedHashes(opts *bind.CallOpts, batchIndex *big.Int) ([32]byte, error) {
-	return [32]byte{}, nil
+	return m.blobVersionedHash, nil
 }

+// SetBlobVersionedHash sets the mock value for BatchBlobVersionedHashes
+func (m *MockRollup) SetBlobVersionedHash(hash [32]byte) {
+	m.blobVersionedHash = hash
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tx-submitter/mock/rollup.go` around lines 68 - 71, The mock
BatchBlobVersionedHashes currently always returns an empty [32]byte, which
prevents the rollup code path that checks storedBlobHash != [32]byte{}
(useCommitState) from being exercised; add a field to MockRollup (e.g.,
batchBlobVersionedHashesRet [32]byte) and a setter method (e.g.,
SetBatchBlobVersionedHashesHash(hash [32]byte)) that sets that field, then
update BatchBlobVersionedHashes to return the stored field value instead of
always returning [32]byte{} so tests can toggle useCommitState by setting a
non-zero hash.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@contracts/contracts/test/Rollup.t.sol`:
- Around line 1396-1406: The test test_commitState_reverts_when_carrying_blob
currently sends a non-blob batch and only asserts success, so it doesn't
exercise the "blobhash(0) != 0" revert; change the test to explicitly assert the
revert path by constructing a BatchDataInput where blobhash(0) is non-zero
(e.g., bytes32(uint256(1))) and wrap the commit call with
hevm.expectRevert("commitState must not carry blob") (or vm.expectRevert if your
test harness uses vm), then hevm.prank(alice); call
rollup.commitState(batchDataInput, batchSignatureInput) to ensure the revert is
triggered; optionally keep the existing non-blob assert as a separate test or
split into two cases so both success and revert paths are covered (referencing
test_commitState_reverts_when_carrying_blob, batchDataInput,
rollup.commitState).

In `@tx-submitter/services/pendingtx.go`:
- Around line 279-299: The recovery rewrite is not crash-safe because pt.dump()
calls journal.Clean() before re-appending entries from pt.txinfos, risking
truncation if a crash or append fails; fix by making dump() atomic: write the
deduplicated entries to a temporary journal (or append to a new file) and fsync
the temp, then swap/rename it over the live journal (or only call
journal.Clean() after successful writes), ensuring pt.dump() (and any calls to
journal.Clean()/journal.Append()) only truncates the original after the new data
is durably persisted; update references in pendingtx.go to use the
atomic-temp-and-rename approach inside pt.dump() so pt.txinfos remains the
authoritative source until the swap completes.

In `@tx-submitter/services/reorg.go`:
- Around line 31-37: The inline comment and the value for maxHistory in
NewReorgDetector disagree (comment says "Track last 50 blocks" but maxHistory is
5); fix by making them consistent: either set maxHistory to 50 to match the
comment in NewReorgDetector or change the comment to reflect tracking 5 blocks.
Update the unique symbol NewReorgDetector and its maxHistory field so the value
and comment match.

---

Outside diff comments:
In `@tx-submitter/services/pendingtx.go`:
- Around line 242-299: The recovery currently sets pnonce from the last journal
entry via pt.SetNonce(txs[len(txs)-1].Nonce()), which is unsafe; instead compute
the maximum nonce seen while iterating txs and call pt.SetNonce(maxNonce) after
the loop. Update the loop that processes txs to track a maxNonce (compare using
tx.Nonce()), handle the empty-txs case appropriately, and replace the
pt.SetNonce(...) call with pt.SetNonce(maxNonce) so pnonce reflects the highest
pending nonce rather than the last array element.

In `@tx-submitter/services/rollup.go`:
- Around line 386-405: The rollup() logic calls r.rotator.CurrentSubmitter and
later dereferences r.rotator.startTime/epoch even when r.rotator is nil or
uninitialized, risking a panic; update rollup() to first check
r.cfg.PriorityRollup and r.rotator != nil before calling
r.rotator.CurrentSubmitter, and also guard access to r.rotator.startTime and
r.rotator.epoch (returning errNotMyTurn when rotator is nil or its state fields
are nil) so all uses of r.rotator are protected; specifically adjust the
conditional around r.cfg.PriorityRollup and r.rotator, move the CurrentSubmitter
call after that guard, and ensure any early returns use consistent error values.
- Around line 602-688: handlePendingTx can leave doomed commitBatch txs pending
when a revertBatch leaves lastCommitted < batchIndex but
batchBlobVersionedHashes[batchIndex] remains set; update the commit-like branch
in handlePendingTx to also detect this reverted-but-has-blob state and cancel
immediately: after computing batchIndex and lastCommitted (using
utils.ParseParentBatchIndex and r.Rollup.LastCommittedBatchIndex), add a check
against the rollup's batch blob map (batchBlobVersionedHashes or an existing
accessor) and treat a present/non-empty blob entry for batchIndex as a reason to
cancel (same CancelTx flow used when batchIndex <= lastCommitted); reference
handlePendingTx, CancelTx, commitBatch, revertBatch, and
batchBlobVersionedHashes when making the change.

---

Nitpick comments:
In `@contracts/contracts/test/base/L1MessageBase.t.sol`:
- Around line 50-54: The hardcoded storage slot constants
BATCH_BLOB_VERSIONED_HASHES_SLOT and ROLLUP_DELAY_PERIOD_SLOT (and
ZERO_VERSIONED_HASH) are fragile; run and record the exact command used to
derive them (e.g., `forge inspect Rollup storage-layout`) in a comment next to
the constants, and add a sanity-check assertion in setUp() that uses vm.load
against the deployed Rollup contract to verify those slots contain the expected
sentinel values (compare the loaded bytes32 to ZERO_VERSIONED_HASH or known
defaults) so tests fail fast if the Rollup storage layout changes.

In `@tx-submitter/mock/rollup.go`:
- Around line 68-71: The mock BatchBlobVersionedHashes currently always returns
an empty [32]byte, which prevents the rollup code path that checks
storedBlobHash != [32]byte{} (useCommitState) from being exercised; add a field
to MockRollup (e.g., batchBlobVersionedHashesRet [32]byte) and a setter method
(e.g., SetBatchBlobVersionedHashesHash(hash [32]byte)) that sets that field,
then update BatchBlobVersionedHashes to return the stored field value instead of
always returning [32]byte{} so tests can toggle useCommitState by setting a
non-zero hash.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4c9cf0df-78cd-4d89-bd4f-b5ec573f6fb2

📥 Commits

Reviewing files that changed from the base of the PR and between df02f26 and 1d9cb0e.

📒 Files selected for processing (19)
  • bindings/bin/rollup_deployed.hex
  • bindings/bin/zkevmverifierv1_deployed.hex
  • bindings/bindings/rollup.go
  • bindings/bindings/rollup_more.go
  • bindings/bindings/zkevmverifierv1.go
  • bindings/bindings/zkevmverifierv1_more.go
  • contracts/contracts/l1/rollup/IRollup.sol
  • contracts/contracts/l1/rollup/Rollup.sol
  • contracts/contracts/test/Rollup.t.sol
  • contracts/contracts/test/base/L1MessageBase.t.sol
  • tx-submitter/batch/batch_restart_test.go
  • tx-submitter/constants/methods.go
  • tx-submitter/iface/rollup.go
  • tx-submitter/mock/rollup.go
  • tx-submitter/services/pendingtx.go
  • tx-submitter/services/reorg.go
  • tx-submitter/services/rollup.go
  • tx-submitter/services/rollup_handle_test.go
  • tx-submitter/utils/utils.go

Comment thread contracts/contracts/test/Rollup.t.sol
Comment thread tx-submitter/services/pendingtx.go
Comment thread tx-submitter/services/reorg.go
Copy link
Copy Markdown
Contributor

@panos-xyz panos-xyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review — Critical & High Severity Findings

CRITICAL-1: ClearConfirmedTxs() called but not defined in this diff

In rollup.go, handleReorg now calls:

if r.pendingTxs != nil {
    r.pendingTxs.ClearConfirmedTxs()
}

However, the diff removes ClearPendingTxs() and MarkUnconfirmed() but does not show any implementation of ClearConfirmedTxs(). If this method does not exist, the code will not compile. If it exists in a pre-existing file not touched by this PR, its implementation needs verification — does it hold the mutex? Does it sync the journal?

Risk: Compilation failure, or undefined behavior after reorg if the method exists but is incorrect.


CRITICAL-2: Removing pendingTxs.Add() calls relies on a dangerous implicit assumption

The PR removes pendingTxs.Add(tx) in 5+ locations with comments like "SendTx already adds signedTx to the pool":

  • rollup.go finalize path
  • rollup.go rollup path
  • pendingtx.go handlePendingTx cancel (commit & finalize)
  • pendingtx.go replaceTimedOutTx

Core concern: If SendTx fails after broadcasting to the L1 mempool but before its internal Add() call (e.g., network timeout on confirmation, signing error on the tracking side), the transaction occupies a nonce on L1 but is not tracked in the pending pool. Consequences:

  • The nonce is consumed but the system is unaware
  • Subsequent transactions using the same nonce will be rejected by L1
  • The submitter can become permanently stuck

Question: Why not keep the Add() calls as a defensive fallback? The code savings are minimal, but the risk introduced is significant. At minimum, SendTx should guarantee atomic add-or-fail semantics, and that guarantee should be documented and tested.


CRITICAL-3: TOCTOU race in tryRebuildRollupCommitTx

stored, err := r.Rollup.BatchBlobVersionedHashes(nil, big.NewInt(int64(batchIndex)))
// ... series of operations (cache lookup, signature rebuild, tx construction) ...
newTx, err = r.createDynamicFeeTx(...)  // or createRollupTx

There is a time window between reading the stored value from L1 and actually building + signing + submitting the new tx. If L1 reorgs during this window:

  • Read stored hash → build commitState → hash cleared on L1 by the time tx lands → tx reverts
  • Read no stored hash → build commitBatch + blob → hash gets set on L1 → wasted blob fee

When blob fees are elevated, this race condition can cause significant fund waste. Consider adding a post-send verification or at minimum documenting this as a known limitation.


HIGH-1: handleConfirmedTx silently swallows nil receipt

func (r *Rollup) handleConfirmedTx(..., status *txStatus, currentBlock uint64) error {
    if status == nil || status.receipt == nil {
        return nil  // silently returns, tx stays in pending pool
    }

If getTxStatus returns txStatusConfirmed but the receipt disappears due to a reorg happening between the status check and this call, processSingleTx routes to the confirmed branch, and handleConfirmedTx silently returns nil. The transaction will never be processed or removed — it becomes a zombie in the pending pool.

Suggestion: At minimum log.Warn this condition. Ideally, return the tx to pending status so the next processing cycle re-evaluates it.


HIGH-2: ParseParentBatchIndex falls back to commitBatch ABI for unknown selectors

} else {
    // Unknown selector: keep legacy behavior (unpack as commitBatch)
    method, ok = rollupAbi.Methods["commitBatch"]
}

If a finalizeBatch transaction's calldata is accidentally passed to this function (shouldn't happen, but defensive programming should account for it), it will unpack using commitBatch's ABI layout and produce a garbage batch index. This garbage index could then be used in critical state decisions.

Suggestion: Return 0 for truly unknown selectors instead of guessing.


HIGH-3: Recover() journal rewrite breaks consistency on non-crash failure

// Adds to in-memory map without journal
pt.mu.Lock()
pt.txinfos[tx.Hash()] = &types.TxRecord{...}
pt.mu.Unlock()

// ... later ...
if err := pt.dump(); err != nil {
    return fmt.Errorf("failed to rewrite journal after recovery: %w", err)
}

The comment states "a crash here is safe — the next restart will re-read the original entries". However, dump() failure ≠ crash. If dump() returns an error:

  • The in-memory map has already been modified (new txs added)
  • The journal file retains its old state
  • The process does not crash — it continues running with inconsistent in-memory vs. journal state
  • Subsequent Add() / Remove() calls will append to the stale journal

Suggestion: If dump() fails, either crash (as the comment implies is safe) or rollback the in-memory state to match the journal.


HIGH-4: Reorg handling failure is silently swallowed

if hasReorg {
    if err := r.handleReorg(depth); err != nil {
        log.Warn("Post-reorg handling failed", "error", err)  // only warns!
    }
}

If handleReorg fails (e.g., ClearConfirmedTxs errors out), the reorg state cleanup is incomplete, but the submitter continues processing transactions. It may make decisions based on pre-reorg confirmed state, potentially submitting duplicate or conflicting transactions.

Suggestion: A failed reorg cleanup should either retry or halt processing for the current cycle (return error from ProcessTx), not silently continue.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tx-submitter/services/rollup.go (1)

594-594: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Potential nil pointer dereference on BlobGasFeeCap().String().

For DynamicFeeTx transactions, BlobGasFeeCap() returns nil. Calling .String() on a nil *big.Int will panic. The same issue exists at line 630.

🐛 Proposed fix to handle nil BlobGasFeeCap
 			log.Error("Failed to cancel commit batch transaction",
 				"error", err,
 				"tx", tx.Hash().Hex(),
 				"nonce", tx.Nonce(),
 				"gas", tx.Gas(),
 				"gas_tip_cap", tx.GasTipCap().String(),
 				"gas_fee_cap", tx.GasFeeCap().String(),
-				"blob_fee_cap", tx.BlobGasFeeCap().String(),
+				"blob_fee_cap", tx.BlobGasFeeCap(),
 				"batch_index", batchIndex,
 				"last_committed", lastCommitted.Uint64())

Apply the same fix at line 630 for the finalize batch error log. Using tx.BlobGasFeeCap() directly will safely print <nil> instead of panicking.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tx-submitter/services/rollup.go` at line 594, The log calls are dereferencing
a potentially nil *big.Int by calling BlobGasFeeCap().String(), which will panic
for DynamicFeeTx; change both occurrences (the one at the current diff and the
one in the finalize-batch error log) to pass tx.BlobGasFeeCap() (the *big.Int)
directly to the logger or guard with a nil check before calling .String(), so
the logger prints "<nil>" safely when BlobGasFeeCap() is nil; update the log
invocations around tx.BlobGasFeeCap() in rollup.go (the error logging blocks)
accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@tx-submitter/services/rollup.go`:
- Line 594: The log calls are dereferencing a potentially nil *big.Int by
calling BlobGasFeeCap().String(), which will panic for DynamicFeeTx; change both
occurrences (the one at the current diff and the one in the finalize-batch error
log) to pass tx.BlobGasFeeCap() (the *big.Int) directly to the logger or guard
with a nil check before calling .String(), so the logger prints "<nil>" safely
when BlobGasFeeCap() is nil; update the log invocations around
tx.BlobGasFeeCap() in rollup.go (the error logging blocks) accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 92b49d88-49a4-40c5-b13e-2ce378ac3b23

📥 Commits

Reviewing files that changed from the base of the PR and between 7caec1d and 5618c27.

📒 Files selected for processing (1)
  • tx-submitter/services/rollup.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants