feat(bench): add attack_replay benchmark #299
Adds `attack_replay`, a hermetic regression bench that replays a real
MegaETH mainnet attack contract deployment through `MegaEvm`.
The fixture is a self-contained ~64 KB JSON snapshot captured via
`debug_traceCall` + `prestateTracer` (diffMode=false) at a fixed block:
- tx (caller / nonce / gas / value / 17 KB initcode / chain_id)
- prestate (11 accounts: 4 proxy contracts at 0x4200..., 1 ERC-20 with
code + 10 storage slots, caller, and 5 supporting storage contracts)
- block env (number / timestamp / basefee / gas_limit / beneficiary /
mix_hash)
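A rough picture of the fixture's shape, as a std-only Rust sketch. The type and field names (`Fixture`, `TxEnv`, `Account`, `BlockEnv`) are illustrative assumptions, not the names used in the bench. Amounts are narrowed to `u128` for brevity, and the values in `main` are placeholders rather than data from the real snapshot.

```rust
use std::collections::BTreeMap;

/// Transaction fields captured in the fixture (hypothetical names).
#[derive(Debug, Clone, Default)]
struct TxEnv {
    caller: [u8; 20],
    nonce: u64,
    gas_limit: u64,
    value: u128, // simplification: real EVM values are 256-bit
    initcode: Vec<u8>,
    chain_id: u64,
}

/// One prestate account: balance, nonce, code, and storage slots.
#[derive(Debug, Clone, Default)]
struct Account {
    balance: u128,
    nonce: u64,
    code: Vec<u8>,
    storage: BTreeMap<[u8; 32], [u8; 32]>,
}

/// Block environment fields captured in the fixture.
#[derive(Debug, Clone, Default)]
struct BlockEnv {
    number: u64,
    timestamp: u64,
    basefee: u64,
    gas_limit: u64,
    beneficiary: [u8; 20],
    mix_hash: [u8; 32],
}

/// The whole snapshot: one tx, its prestate, and the block it ran in.
#[derive(Debug, Clone, Default)]
struct Fixture {
    tx: TxEnv,
    prestate: BTreeMap<[u8; 20], Account>,
    block: BlockEnv,
}

impl Fixture {
    /// Total number of storage slots across all prestate accounts.
    fn total_slots(&self) -> usize {
        self.prestate.values().map(|a| a.storage.len()).sum()
    }
}

fn main() {
    let mut fx = Fixture::default();
    // Placeholder account with a single storage slot, not real snapshot data.
    let mut acct = Account::default();
    acct.storage.insert([0u8; 32], [1u8; 32]);
    fx.prestate.insert([0x42; 20], acct);
    println!("accounts={} slots={}", fx.prestate.len(), fx.total_slots());
}
```

Keeping the prestate in an ordered map is one way to make iteration order deterministic, which can help keep setup cost stable between runs.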
The bench produces three arms on the same in-memory state:
attack_replay/equivalence ~1.15 ms (MegaSpecId::EQUIVALENCE)
attack_replay/mini_rex ~35.9 ms (MegaSpecId::MINI_REX)
attack_replay/pure_revm ~1.10 ms (vanilla revm baseline)
Both mega-evm specs execute the exact same 205,951 opcodes and deploy
the same 582-byte runtime. The ~30x gap between EQUIVALENCE and
MINI_REX isolates the cost of the multi-dimensional AdditionalLimit
accounting (quadratic LOG / compute / storage / data / KV buckets).
The pure_revm arm is a self-check: it should land near EQUIVALENCE,
confirming the bench is honest and not short-circuiting.
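The arithmetic behind the "~30x" figure and the self-check can be worked through with the approximate timings listed above. This is only an illustrative sketch: the helper names and the 10% tolerance are assumptions, not part of the bench.

```rust
/// Ratio of a candidate timing to a baseline timing.
fn slowdown(candidate_ms: f64, baseline_ms: f64) -> f64 {
    candidate_ms / baseline_ms
}

/// True when `a` and `b` differ by at most `rel_tol` relative to the smaller one.
fn roughly_equal(a: f64, b: f64, rel_tol: f64) -> bool {
    (a - b).abs() <= rel_tol * a.min(b)
}

fn main() {
    // Approximate timings reported above, in milliseconds.
    let (equivalence, mini_rex, pure_revm) = (1.15, 35.9, 1.10);

    // 35.9 / 1.15 is roughly 31.2, consistent with the "~30x" claim.
    println!("mini_rex / equivalence = {:.1}x", slowdown(mini_rex, equivalence));

    // The 10% tolerance is an arbitrary assumption for illustration.
    println!("self-check ok: {}", roughly_equal(equivalence, pure_revm, 0.10));
}
```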
Numbers correlate directly with production: the MINI_REX arm matches
the sequencer-monitor's observed ~33 ms inside `api.inspect(...)` for
this tx, making the bench a stable target for any limit-tracker /
hot-path optimization.
Sanity checks run before criterion warm-up:
- ExecutionResult variant + deployed code + accounts/slots touched
- opcode step count via a minimal OpcodeCounter inspector, with a
MIN_EXPECTED_OPCODE_STEPS guard so any future setup mistake that
silently short-circuits validation fails the bench loudly instead
of producing artificially fast numbers.
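The opcode-count guard can be sketched in isolation. The `StepObserver` trait and the toy `run` loop below are hypothetical stand-ins for revm's inspector interface and interpreter, used only so the example compiles with the standard library. The real bench's `OpcodeCounter` presumably hooks into the actual EVM instead.

```rust
/// Hypothetical stand-in for an EVM inspector hook: called once per opcode.
trait StepObserver {
    fn on_step(&mut self, opcode: u8);
}

/// Counts every executed opcode.
#[derive(Default)]
struct OpcodeCounter {
    steps: u64,
}

impl StepObserver for OpcodeCounter {
    fn on_step(&mut self, _opcode: u8) {
        self.steps += 1;
    }
}

/// Toy interpreter: walks the bytecode and halts after STOP (0x00).
fn run(code: &[u8], observer: &mut impl StepObserver) {
    for &op in code {
        observer.on_step(op);
        if op == 0x00 {
            break;
        }
    }
}

/// Returns an error when execution was suspiciously short.
fn check_steps(steps: u64, min_expected: u64) -> Result<(), String> {
    if steps >= min_expected {
        Ok(())
    } else {
        Err(format!(
            "only {} opcode steps, expected at least {}",
            steps, min_expected
        ))
    }
}

fn main() {
    // 0x5b is JUMPDEST, a cheap no-op; nine of them followed by STOP.
    let mut code = vec![0x5b; 9];
    code.push(0x00);

    let mut counter = OpcodeCounter::default();
    run(&code, &mut counter);

    // Illustrative threshold for the toy program; the bench uses
    // MIN_EXPECTED_OPCODE_STEPS instead.
    const MIN_STEPS: u64 = 10;
    check_steps(counter.steps, MIN_STEPS).expect("execution short-circuited");
    println!("steps = {}", counter.steps);
}
```

Failing hard before any timing starts means a broken setup cannot produce a plausible-looking but meaningless measurement.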
Run:
cargo bench --bench attack_replay
@RealiCZ — non-blocking follow-up suggestion; happy to see this merge as-is. The hand-rolled fixture parser is the right local call for one bench, but if more replay-style benches land we'll want a unified path. Sketching what the follow-up could look like:
Implementation notes for whoever picks this up:
…eplay # Conflicts: # crates/mega-evm/Cargo.toml
LGTM. The benchmark is well-structured and the sanity-check approach (asserting One observation (non-blocking, all sanity checks pass): Troublor's follow-up on adopting the EEST state test schema if more replay benches land is the right long-term call. Happy to see this merge as-is.
Label check Two labels look off:
Summary
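Adds `attack_replay`, a hermetic regression benchmark that replays a real MegaETH mainnet attack contract deployment through `MegaEvm`.
The fixture is a self-contained ~64 KB JSON snapshot captured via `debug_traceCall` + `prestateTracer` (diffMode=false) at a fixed block:
- tx (caller / nonce / gas / value / 17 KB initcode / chain_id)
- prestate (11 accounts: 4 proxy contracts at `0x4200...`, 1 ERC-20 with code + 10 storage slots, the caller, and 5 supporting storage contracts)
- block env (number / timestamp / basefee / gas_limit / beneficiary / mix_hash)
Bench arms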
Three arms on the same in-memory state:
| Arm | Configuration |
| --- | --- |
| `attack_replay/equivalence` | `MegaSpecId::EQUIVALENCE` |
| `attack_replay/mini_rex` | `MegaSpecId::MINI_REX` |
| `attack_replay/pure_revm` | vanilla `revm` (`Context::mainnet`) baseline |
Both mega-evm specs execute the exact same 205,951 opcodes and deploy the same 582-byte runtime. The ~30x gap between `EQUIVALENCE` and `MINI_REX` isolates the cost of the multi-dimensional `AdditionalLimit` accounting (quadratic LOG / compute / storage / data / KV buckets) that `MINI_REX` enables.
The `pure_revm` arm is a self-check: it should land near the `EQUIVALENCE` arm, confirming the bench is honest and not short-circuiting.
Why this bench
Numbers correlate directly with production: the `mini_rex` arm matches the sequencer-monitor's observed ~33 ms inside `api.inspect(...)` for this exact transaction, making the bench a stable, reproducible target for any limit-tracker / hot-path optimization (e.g. caching `net_usage` in `FrameLimitTracker`).
Sanity checks
Run before criterion warm-up:
- Asserts the `ExecutionResult::Success` variant and reports deployed code length + addr + accounts/slots touched.
- Counts opcode steps via a minimal `OpcodeCounter` inspector and asserts `steps >= MIN_EXPECTED_OPCODE_STEPS` (100,000). Any future setup mistake that silently short-circuits tx validation will fail the bench loudly instead of producing artificially fast numbers.
Test plan
- `cargo bench --bench attack_replay -p mega-evm --no-run` builds clean
- `cargo +nightly fmt --check -p mega-evm` clean
- `cargo clippy --bench attack_replay -p mega-evm` 0 warnings
- `cargo bench --bench attack_replay -p mega-evm -- --quick` runs, all sanity assertions pass, numbers as expected