perf(snapshot): cache/reduce fs.stat calls during tracking #9727

Open
kilo-code-bot[bot] wants to merge 1 commit into main from perf/snapshot-reduce-stat-calls

@kilo-code-bot kilo-code-bot Bot commented Apr 30, 2026

Summary

Snapshot.add() called fs.stat on every candidate path — tracked and untracked — on every Snapshot.track() (2–3× per turn), but the stat result was only ever used for a 2 MB size cap applied to the untracked subset (see the block set below the stat loop). On very large repos like jetbrains/intellij-community (~75k+ tracked files) that was tens of thousands of syscalls per turn, which hung the CLI on startup.

This PR keeps the cap's behavior and only stats untracked candidates. Tracked files are always staged regardless of size, so statting them was dead work. The remaining stat calls also run at higher concurrency (64 vs 8) since the working set is now much smaller and typically contains only new files.
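A minimal sketch of the approach described above, not the actual code in packages/opencode/src/snapshot/index.ts. The names `findLarge`, `StatSize`, `LIMIT` and `CONCURRENCY` are hypothetical; only the 2 MB cap and the concurrency of 64 come from this description. The stat function is injected so the sketch runs without touching the file system.

```typescript
// Hypothetical names throughout. Only the 2 MB cap and the concurrency of 64
// come from the PR description; binary megabytes are an assumption.
type StatSize = (path: string) => Promise<number | undefined>

const LIMIT = 2 * 1024 * 1024
const CONCURRENCY = 64

// Returns the untracked paths whose size exceeds `limit`.
// Tracked paths are never passed in, so they are never statted.
async function findLarge(
  untracked: string[],
  statSize: StatSize,
  limit: number = LIMIT,
  concurrency: number = CONCURRENCY,
): Promise<Set<string>> {
  const large = new Set<string>()
  let next = 0

  // Each worker pulls the next index until the list is exhausted,
  // so at most `concurrency` stat calls are in flight at once.
  async function worker(): Promise<void> {
    while (next < untracked.length) {
      const path = untracked[next++]
      if (path === undefined) break
      const size = await statSize(path)
      if (size !== undefined && size > limit) large.add(path)
    }
  }

  const workers = Math.min(concurrency, untracked.length)
  await Promise.all(Array.from({ length: workers }, () => worker()))
  return large
}
```

Against a real working tree, `statSize` could be something like `(p) => fs.stat(p).then((s) => s.size).catch(() => undefined)` with `fs` imported from `node:fs/promises`. Treating a failed stat as "not large" is an assumption of this sketch.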

Why this is correct

Before and after this change, the only use of the stat result is:

const block = new Set(untracked.filter((item) => large.has(item)))

block is constructed from untracked only. A tracked file could never end up in block, so whether we stat it or not, it gets staged. Skipping stat on tracked files preserves snapshot correctness 1:1.
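The argument can be checked with a small sketch. The paths are made up, and `blocked` and `staged` are hypothetical helpers. The sketch assumes staging means "every candidate that is not in block". It computes the staged set twice: once with a `large` set built from all candidates, and once with a `large` set built from untracked candidates only.

```typescript
// Illustrative data only; these paths do not come from the repository.
const tracked = ["src/big-asset.bin", "src/index.ts"]
const untracked = ["dump.sql", "notes.md"]

// Before: every candidate was statted, so `large` could contain tracked paths.
const largeBefore = new Set(["src/big-asset.bin", "dump.sql"])
// After: only untracked candidates are statted.
const largeAfter = new Set(["dump.sql"])

// Same expression as in the PR, parameterised over `large`.
function blocked(large: Set<string>): Set<string> {
  return new Set(untracked.filter((item) => large.has(item)))
}

// Assumption: a candidate is staged unless it is in the block set.
function staged(large: Set<string>): string[] {
  const block = blocked(large)
  return [...tracked, ...untracked].filter((item) => !block.has(item))
}

// Both calls log the same list: the large tracked file is staged either way,
// and the large untracked file is blocked either way.
console.log(staged(largeBefore))
console.log(staged(largeAfter))
```

The tracked entry in `largeBefore` never affects the result because the filter only iterates over `untracked`.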

Tests

New regression tests in packages/opencode/test/kilocode/snapshot-stat-perf.test.ts:

  • Large (>2 MB) tracked file modification is still captured in the snapshot.
  • Large (>2 MB) untracked file is still blocked (not staged).
  • track() over a 1000-tracked-file repo completes well under the timing bound.

Existing snapshot tests (snapshot-cache, snapshot-freeze-repro, test/snapshot/snapshot.test.ts) still pass. The one failing test in this area (test/session/snapshot-tool-race.test.ts) fails on main too — it's unrelated to this change.

Scope

  • Leaves JS enumeration and check-ignore logic alone (those are separate PRs).
  • Based on main.

Built for Imanol Maiztegui by Kilo for Slack

Snapshot.add() was calling fs.stat on every candidate file — tracked and
untracked — on every turn, only to apply a 2 MB cap that is only ever
used to filter untracked files (the `block` set). On very large repos
like jetbrains/intellij-community (~75k+ tracked files) this meant tens
of thousands of syscalls per turn, which hung the CLI on startup.

Stat only untracked candidates and bump concurrency for the remaining
calls. Tracked files are always staged regardless of size, so the stat
result was dead data.

Adds regression tests that cover the preserved semantics (large tracked
files still tracked, large untracked files still blocked) plus a
many-tracked-files timing bound.

kilo-code-bot Bot commented Apr 30, 2026

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Files Reviewed (3 files)
  • .changeset/snapshot-skip-tracked-stat.md
  • packages/opencode/src/snapshot/index.ts
  • packages/opencode/test/kilocode/snapshot-stat-perf.test.ts

Reviewed by gpt-5.5-2026-04-23 · 188,750 tokens
