Skip to content

fix(vcs): prevent crash when repo has thousands of untracked files#33927

Open
youtsuhodev wants to merge 4 commits into
anomalyco:devfrom
youtsuhodev:fix/vcs-untracked-files-limit
Open

fix(vcs): prevent crash when repo has thousands of untracked files#33927
youtsuhodev wants to merge 4 commits into
anomalyco:devfrom
youtsuhodev:fix/vcs-untracked-files-limit

Conversation

@youtsuhodev

@youtsuhodev youtsuhodev commented Jun 25, 2026

Copy link
Copy Markdown

Issue for this PR

Closes #33928

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

When a Git repository contains a large number of untracked files (e.g. 1200+), the VCS layer crashes the application. This typically occurs in two scenarios:

  1. A freshly initialized project (\git init) with no initial commit and no .gitignore\ — every single file in the working tree is untracked.
  2. A cloned project whose .gitignore\ is missing or incomplete, causing build artifacts (
    ode_modules, \dist/, etc.) to appear as untracked.

Two distinct issues compounded to produce the crash:

Issue A — --untracked-files=all\ enumerates every file individually

\Git.status()\ ran \git status --untracked-files=all, which instructs git to recurse into every untracked directory and list each file individually in the porcelain output. For a repo with a single untracked directory containing 1200 files, this produces 1200 separate ?? path/to/file\ entries.

The downstream code in \Vcs.status()\ and \Vcs.diff()\ then iterates over every entry and, for each untracked file, spawns one or more additional git subprocesses (\git diff --no-index --numstat\ for stats, \git diff --no-index --patch\ for the diff). With 1200+ files, this creates thousands of sequential git subprocesses, overwhelming the system and causing the application to crash or hang indefinitely.

Note that --untracked-files=all\ is actually not git's default behaviour. Since Git 1.7.0 the default has been --untracked-files=normal, which collapses entire untracked directories into a single ?? dir/\ entry instead of expanding them into individual file entries.

Issue B — no upper bound on file count

Even with --untracked-files=normal, a repository with thousands of individually-modified tracked files could still reach the same crash path. There was no safety limit anywhere in the pipeline: the full list was always processed unconditionally.

Fix

Two changes, one for each root cause:

  1. *Switch from --untracked-files=all\ to --untracked-files=normal* (\packages/opencode/src/git/index.ts)

    This makes git's status output collapse untracked directories into a single entry. For example, if
    ode_modules/\ is untracked and contains 1200 files, the output will contain one line (?? node_modules/) instead of 1200 individual file entries.

    Directory entries (those ending with /) are then filtered out by the existing \ latMap\ callback, since they cannot be processed as individual files — you cannot compute a diff or stats for a directory as a whole.

    In normal day-to-day development, where untracked files appear inside tracked directories (e.g. \src/new-component.ts), --untracked-files=normal\ behaves identically to --untracked-files=all\ — it still lists each file individually. The difference only manifests for entirely untracked directories, which is exactly where the crash occurs.

  2. Add \MAX_STATUS_FILES\ safety limit (\packages/opencode/src/project/vcs.ts)

    A new constant \MAX_STATUS_FILES = 2000\ caps the number of items processed by both \Vcs.status()\ and the internal \ iles()\ helper (used by \Vcs.diff()). If the list exceeds this threshold, it is silently truncated to the first 2000 entries.

    This is a defence-in-depth measure: even if \git.status()\ were to return more items than expected, the VCS layer will never attempt to spawn subprocesses for all of them.

Impact

  • The application no longer crashes on open when the working tree contains many untracked files.
  • The review/change panel shows partial results when the number of changed files exceeds \MAX_STATUS_FILES, which is vastly preferable to a hard crash that forces the user to delete all app data.
  • The fix is entirely in the backend VCS pipeline; no frontend changes are required.

How did you verify your code works?

  • The existing test suite (\�un test\ in \packages/opencode) covers \git.status(), \Vcs.status(), and \Vcs.diff()\ — all of which pass with the change since they operate on individual untracked files, not untracked directories.
  • The --untracked-files=normal\ flag is git's default behaviour since 1.7.0, making this a safe and well-tested code path.
  • The \MAX_STATUS_FILES\ truncation is a simple slice operation with no side effects.

Screenshots / recordings

N/A — backend change only.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

Problem
-------
When a Git repository contains a large number of untracked files (e.g.
1200+), the VCS layer would crash the application. This typically occurs
in two scenarios:

1. A freshly initialized project (git init) with no initial commit and no
   .gitignore — every single file in the working tree is untracked.
2. A cloned project whose .gitignore is missing or incomplete, causing
   build artifacts (node_modules, dist/, etc.) to appear as untracked.

Root cause
----------
Two distinct issues compounded to produce the crash:

Issue A — --untracked-files=all enumerates every file individually
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Git.status() ran git status --untracked-files=all, which instructs
git to recurse into every untracked directory and list each file
individually in the porcelain output. For a repo with a single
untracked directory containing 1200 files, this produces 1200 separate
?? path/to/file entries.

The downstream code in Vcs.status() and Vcs.diff() then iterates
over every entry and, for each untracked file, spawns one or more
additional git subprocesses (git diff --no-index --numstat for stats,
git diff --no-index --patch for the diff). With 1200+ files, this
creates thousands of sequential git subprocesses, overwhelming the
system and causing the application to crash or hang indefinitely.

Note that --untracked-files=all is actually *not* git's default
behaviour. Since Git 1.7.0 the default has been --untracked-files=normal,
which collapses entire untracked directories into a single ?? dir/
entry instead of expanding them into individual file entries.

Issue B — no upper bound on file count
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Even with --untracked-files=normal, a repository with thousands of
individually-modified tracked files could still reach the same crash
path. There was no safety limit anywhere in the pipeline: the full list
was always processed unconditionally.

Fix
---
Two changes, one for each root cause:

1. Switch from --untracked-files=all to --untracked-files=normal
   (packages/opencode/src/git/index.ts)

   This makes git's status output collapse untracked directories into
   a single entry. For example, if
ode_modules/ is untracked and
   contains 1200 files, the output will contain one line (?? node_modules/)
   instead of 1200 individual file entries.

   Directory entries (those ending with '/') are then filtered out by
   the existing flatMap callback, since they cannot be processed as
   individual files — you cannot compute a diff or stats for a directory
   as a whole.

   In normal day-to-day development, where untracked files appear inside
   tracked directories (e.g. src/new-component.ts), --untracked-files=normal
   behaves identically to --untracked-files=all — it still lists each
   file individually. The difference only manifests for *entirely*
   untracked directories, which is exactly where the crash occurs.

2. Add MAX_STATUS_FILES safety limit (packages/opencode/src/project/vcs.ts)

   A new constant MAX_STATUS_FILES = 2000 caps the number of items
   processed by both Vcs.status() and the internal iles() helper
   (used by Vcs.diff()). If the list exceeds this threshold, it is
   silently truncated to the first 2000 entries.

   This is a defence-in-depth measure: even if git.status() were to
   return more items than expected, the VCS layer will never attempt to
   spawn subprocesses for all of them.

Impact
------
- The application no longer crashes on open when the working tree
  contains many untracked files.
- The review/change panel shows partial results when the number of
  changed files exceeds MAX_STATUS_FILES, which is vastly preferable
  to a hard crash that forces the user to delete all app data.
- The fix is entirely in the backend VCS pipeline; no frontend changes
  are required.
@github-actions github-actions Bot added needs:compliance This means the issue will auto-close after 2 hours. needs:issue labels Jun 25, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Thanks for your contribution!

This PR doesn't have a linked issue. All PRs must reference an existing issue.

Please:

  1. Open an issue describing the bug/feature (if one doesn't exist)
  2. Add Fixes #<number> or Closes #<number> to this PR description

See CONTRIBUTING.md for details.

@github-actions github-actions Bot removed the needs:compliance This means the issue will auto-close after 2 hours. label Jun 25, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Thanks for updating your PR! It now meets our contributing guidelines. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

VCS crashes on startup when repo has thousands of untracked files

1 participant