feat(bench): add attack_replay benchmark by vincent-k2026 · Pull Request #299 · megaeth-labs/mega-evm

vincent-k2026 · 2026-05-22T07:08:27Z

Summary

Adds attack_replay, a hermetic regression benchmark that replays a real MegaETH mainnet attack contract deployment through MegaEvm.

The fixture is a self-contained ~64 KB JSON snapshot captured via debug_traceCall + prestateTracer (diffMode=false) at a fixed block:

- tx: caller / nonce / gas / value / 17 KB initcode / chain_id
- prestate (11 accounts): 4 proxy contracts at 0x4200..., 1 ERC-20 with code + 10 storage slots, the caller, and 5 supporting storage contracts
- block env: number / timestamp / basefee / gas_limit / beneficiary / mix_hash

Bench arms

Three arms on the same in-memory state:

| arm | typical wall time | engine |
| --- | --- | --- |
| `attack_replay/equivalence` | ~1.15 ms | `MegaSpecId::EQUIVALENCE` |
| `attack_replay/mini_rex` | ~35.9 ms | `MegaSpecId::MINI_REX` |
| `attack_replay/pure_revm` | ~1.10 ms | vanilla `revm` (`Context::mainnet`) baseline |

Both mega-evm specs execute the exact same 205,951 opcodes and deploy the same 582-byte runtime. The ~30x gap between EQUIVALENCE and MINI_REX isolates the cost of the multi-dimensional AdditionalLimit accounting (quadratic LOG / compute / storage / data / KV buckets) that MINI_REX enables.

The pure_revm arm is a self-check: it should land near the EQUIVALENCE arm, confirming the bench is honest and not short-circuiting.
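
The ratios quoted above can be re-derived from the table with a few lines of arithmetic. The sketch below uses only the figures stated in this description; the function names are illustrative and are not part of the bench. Spreading the whole gap evenly over every opcode is a simplification, since the limit accounting is unlikely to cost the same on every instruction.

```rust
/// Wall times (ms) and opcode count as quoted in the table and text above.
const EQUIVALENCE_MS: f64 = 1.15;
const MINI_REX_MS: f64 = 35.9;
const PURE_REVM_MS: f64 = 1.10;
const OPCODE_STEPS: f64 = 205_951.0;

/// Slowdown factor of one arm relative to a baseline arm.
fn slowdown(arm_ms: f64, baseline_ms: f64) -> f64 {
    arm_ms / baseline_ms
}

/// Extra time per executed opcode, in nanoseconds, if the whole gap
/// between the two arms is averaged over every step.
fn extra_ns_per_opcode(arm_ms: f64, baseline_ms: f64, steps: f64) -> f64 {
    (arm_ms - baseline_ms) * 1_000_000.0 / steps
}

/// Returns (MINI_REX / EQUIVALENCE, EQUIVALENCE / pure_revm, extra ns per opcode).
fn summarize() -> (f64, f64, f64) {
    (
        slowdown(MINI_REX_MS, EQUIVALENCE_MS),
        slowdown(EQUIVALENCE_MS, PURE_REVM_MS),
        extra_ns_per_opcode(MINI_REX_MS, EQUIVALENCE_MS, OPCODE_STEPS),
    )
}
```

With the quoted numbers, `summarize()` gives a gap of roughly 31x between MINI_REX and EQUIVALENCE, an EQUIVALENCE arm about 4.5% slower than pure_revm, and an average of roughly 169 ns of additional work per opcode.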

Why this bench

Numbers correlate directly with production: the mini_rex arm matches the sequencer-monitor's observed ~33 ms inside api.inspect(...) for this exact transaction, making the bench a stable, reproducible target for any limit-tracker / hot-path optimization (e.g. caching net_usage in FrameLimitTracker).

Sanity checks

Run before criterion warm-up:

- Asserts the ExecutionResult::Success variant and reports the deployed code length, the deployed address, and the accounts/slots touched.
- Counts opcode steps via a minimal OpcodeCounter inspector and asserts steps >= MIN_EXPECTED_OPCODE_STEPS (100,000). Any future setup mistake that silently short-circuits tx validation will fail the bench loudly instead of producing artificially fast numbers.
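
The step-count guard can be illustrated in isolation. This is a minimal sketch, not the bench's code: `StepHook` and `run_with_hook` are hypothetical stand-ins for revm's inspector mechanism and for a real EVM run, and only `OpcodeCounter` and `MIN_EXPECTED_OPCODE_STEPS` take their names from the description above.

```rust
/// Minimum opcode steps the bench expects, per the PR description.
const MIN_EXPECTED_OPCODE_STEPS: u64 = 100_000;

/// Hypothetical interpreter step hook (NOT revm's `Inspector` trait).
trait StepHook {
    fn on_step(&mut self, opcode: u8);
}

/// Counts every interpreter step it observes.
#[derive(Default)]
struct OpcodeCounter {
    steps: u64,
}

impl StepHook for OpcodeCounter {
    fn on_step(&mut self, _opcode: u8) {
        self.steps += 1;
    }
}

/// Toy "execution": feeds each byte to the hook, standing in for a real EVM run.
fn run_with_hook<H: StepHook>(code: &[u8], hook: &mut H) {
    for &op in code {
        hook.on_step(op);
    }
}

/// Rejects a run that executed too few steps, so a short-circuited
/// setup is reported before any timing is taken.
fn check_step_count(steps: u64) -> Result<u64, String> {
    if steps >= MIN_EXPECTED_OPCODE_STEPS {
        Ok(steps)
    } else {
        Err(format!(
            "only {} opcode steps executed, expected at least {}",
            steps, MIN_EXPECTED_OPCODE_STEPS
        ))
    }
}
```

Running the check once before the timed loop keeps the guard out of the measured path. A run of 205,951 steps passes, while a transaction rejected during validation would execute zero or very few steps and return the error.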

Test plan

- `cargo bench --bench attack_replay -p mega-evm --no-run`: builds clean
- `cargo +nightly fmt --check -p mega-evm`: clean
- `cargo clippy --bench attack_replay -p mega-evm`: 0 warnings
- `cargo bench --bench attack_replay -p mega-evm -- --quick`: runs, all sanity assertions pass, numbers as expected


Troublor · 2026-05-26T15:50:50Z

@RealiCZ — non-blocking follow-up suggestion; happy to see this merge as-is.

The hand-rolled fixture parser is the right local call for one bench, but if more replay-style benches land we'll want a unified path. Sketching what the follow-up could look like:

1. Adopt the EEST state test schema as the canonical fixture format. state-test already has it: TestUnit { env, pre, transaction, post, out }, with AccountInfo deriving both Serialize and Deserialize (camelCase + alloy_serde::quantity), so the format is already round-trippable.
2. Extend mega-evme replay with --dump-fixture <FILE>. Record every Database::basic / storage / code_by_hash access during the replay to populate pre; copy block env into env, the on-chain tx into transaction, and the actual ResultAndState into post so the fixture self-checks on read. Workflow becomes:
   ```
   mega-evme replay <tx_hash> --rpc <url> --dump-fixture foo.json
   ```
   One command, no external cast rpc debug_traceCall + prestateTracer needed.
3. Reuse state-test types from this bench. Add state-test as a dev-dependency on mega-evm; Cargo permits the dev-dep cycle (mega-evm --dev→ state-test --regular→ mega-evm). The bench drops parse_u256 / parse_bytes / parse_address + the bespoke TxFixture / AccountFixture / BlockFixture structs in favor of use state_test::types::TestUnit. mega-evme's private AccountState (in bin/mega-evme/src/common/state.rs) also folds into the same type.
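
To make the access-recording idea in the list above concrete, here is a minimal sketch of a wrapper that remembers every value a state source hands out. All names are hypothetical: `StateSource` is a simplified stand-in and not revm's `Database` trait, and the real `--dump-fixture` implementation would also need to record code and serialize the result into the fixture schema.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified state source (NOT revm's `Database` trait).
trait StateSource {
    fn balance(&mut self, address: [u8; 20]) -> u128;
    fn storage(&mut self, address: [u8; 20], slot: u64) -> u64;
}

/// Everything read for one account during the replay.
#[derive(Default, Debug, PartialEq)]
struct RecordedAccount {
    balance: Option<u128>,
    storage: BTreeMap<u64, u64>,
}

/// Wraps a state source and records each value returned by the inner source.
struct RecordingSource<S> {
    inner: S,
    pre: BTreeMap<[u8; 20], RecordedAccount>,
}

impl<S: StateSource> RecordingSource<S> {
    fn new(inner: S) -> Self {
        Self { inner, pre: BTreeMap::new() }
    }

    /// Consumes the wrapper and returns the recorded prestate.
    fn into_prestate(self) -> BTreeMap<[u8; 20], RecordedAccount> {
        self.pre
    }
}

impl<S: StateSource> StateSource for RecordingSource<S> {
    fn balance(&mut self, address: [u8; 20]) -> u128 {
        let value = self.inner.balance(address);
        self.pre.entry(address).or_default().balance = Some(value);
        value
    }

    fn storage(&mut self, address: [u8; 20], slot: u64) -> u64 {
        let value = self.inner.storage(address, slot);
        self.pre.entry(address).or_default().storage.insert(slot, value);
        value
    }
}

/// Toy backing store used only to exercise the wrapper.
struct FixedSource;

impl StateSource for FixedSource {
    fn balance(&mut self, _address: [u8; 20]) -> u128 {
        7
    }

    fn storage(&mut self, _address: [u8; 20], slot: u64) -> u64 {
        slot * 2
    }
}
```

The sketch assumes the wrapper sits below any execution-time cache, so that the values it sees are pre-transaction values rather than values the transaction has already written. Under that assumption the recorded map contains exactly the accounts and slots the replay touched, which is what a minimal fixture needs.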

Implementation notes for whoever picks this up:

- state-test's lib target is implicit but real (src/lib.rs alongside [[bin]]), so no Cargo.toml change is needed on the state-test side.
- EEST TransactionParts is multi-variant (matrix of data × gas × value); for a single-shot bench fixture, use a single-element variant via TxPartIndices(0, 0, 0).
- Worth carving default-features = false + a slim types feature on state-test so consumers don't drag in walkdir / indicatif / triehash / k256 / plain_hasher.
- Keep this separate from the existing --rpc.capture-file / ReplayTransport machinery in bin/mega-evme/src/common/provider/. That's a transport-level artifact feeding AlloyDB, solving a different problem.
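
The single-element variant mentioned in the notes above can be pictured as follows. This is a sketch under stated assumptions: the struct and field names mirror the comment's wording, but the field types are simplified and the `single` / `resolve` helpers are hypothetical, not the actual state-test API.

```rust
/// Simplified stand-in for a multi-variant transaction: each field holds
/// a list of alternatives, forming a data x gas x value matrix.
struct TransactionParts {
    data: Vec<Vec<u8>>,
    gas_limit: Vec<u64>,
    value: Vec<u128>,
}

/// Selects one alternative per dimension.
#[derive(Clone, Copy)]
struct TxPartIndices {
    data: usize,
    gas: usize,
    value: usize,
}

/// One concrete transaction picked out of the matrix.
#[derive(Debug, PartialEq)]
struct ConcreteTx {
    data: Vec<u8>,
    gas_limit: u64,
    value: u128,
}

impl TransactionParts {
    /// Wraps a single transaction so that indices (0, 0, 0) are the only valid choice.
    fn single(data: Vec<u8>, gas_limit: u64, value: u128) -> Self {
        Self {
            data: vec![data],
            gas_limit: vec![gas_limit],
            value: vec![value],
        }
    }

    /// Resolves indices into a concrete transaction, or `None` if any index is out of range.
    fn resolve(&self, idx: TxPartIndices) -> Option<ConcreteTx> {
        Some(ConcreteTx {
            data: self.data.get(idx.data)?.clone(),
            gas_limit: *self.gas_limit.get(idx.gas)?,
            value: *self.value.get(idx.value)?,
        })
    }
}
```

A replay fixture built with `single(...)` has exactly one cell in the matrix, so a bench can resolve indices (0, 0, 0) once during setup and treat any other index as a malformed fixture.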


claude · 2026-05-27T02:01:04Z

LGTM. The benchmark is well-structured and the sanity-check approach (asserting Success + asserting ≥ 100,000 opcode steps) is exactly right for catching silent short-circuits before criterion warms up.

One observation (non-blocking, all sanity checks pass): the mega_evm arms zero out operator_fee_scalar/operator_fee_constant to make the OP-stack L1 fees disappear, while the pure_revm arm disables balance_check and base_fee entirely. Both approaches produce a successful execution, so the opcode-level work is the same across arms — but it's worth leaving a comment in build_pure_revm_context (or inline in the bench closure) explaining why disable_balance_check is needed for the revm arm but not for mega_evm, so the next reader doesn't have to reconstruct it.

Troublor's follow-up on adopting the EEST state test schema if more replay benches land is the right long-term call. Happy to see this merge as-is.

claude · 2026-05-27T06:11:43Z

Label check

Two labels look off:

- comp:misc → should be comp:core: all changes are in crates/mega-evm/Cargo.toml and crates/mega-evm/benches/attack_replay.rs, which belong to the mega-evm core crate, not the miscellaneous part of the repo.
- spec:stable → should be spec:unchanged: the benchmark uses the EQUIVALENCE and MINI_REX specs but does not touch or change any spec behavior. spec:unchanged ("No change to any mega-evm's behavior") is the correct descriptor here; spec:stable implies the diff modifies stable-spec code under a must-not-change-behavior constraint, which is not the case.

claude · 2026-05-27T06:11:51Z

Two label issues on this PR:

- comp:misc → comp:core: All changes are in crates/mega-evm/ (Cargo.toml + benches/attack_replay.rs), which is the mega-evm core crate. comp:misc is for changes outside the main crates; comp:core is the correct fit here.
- spec:stable → spec:unchanged: spec:stable is intended for PRs that touch stable spec code and must not change its behavior. This PR doesn't modify any spec implementation; it only exercises stable specs (EQUIVALENCE, MINI_REX) inside a benchmark. spec:unchanged ("No change to any mega-evm's behavior") is the accurate label for a benchmark-only addition.

claude · 2026-05-27T06:11:52Z

Label issues:

- comp:misc → should be comp:core: all changed files (crates/mega-evm/Cargo.toml, crates/mega-evm/benches/attack_replay.rs, and the fixture under crates/mega-evm/benches/fixtures/) are in the mega-evm core crate.
- spec:stable → should be spec:unchanged: no EVM behavior or spec code is modified; this PR only adds a benchmark. spec:stable indicates the PR "touches stable spec code," which isn't the case here.

vincent-k2026 requested review from RealiCZ, Troublor and flyq as code owners May 22, 2026 07:08

flyq approved these changes May 26, 2026


Troublor approved these changes May 26, 2026


RealiCZ approved these changes May 27, 2026


Merge remote-tracking branch 'origin/main' into krabat/bench/attack-r…

bedcc79

…eplay # Conflicts: # crates/mega-evm/Cargo.toml

RealiCZ added the labels spec:stable (Touches stable spec code, must not change behavior), comp:misc (Changes to the miscellaneous part of this repo), and api:unchanged (No change to the public interface or API) on May 27, 2026

RealiCZ merged commit e728098 into main May 27, 2026
35 of 36 checks passed

RealiCZ deleted the krabat/bench/attack-replay branch May 27, 2026 06:11

RealiCZ merged 2 commits into main from krabat/bench/attack-replay
