fix(browse): restrict sensitive state file ACLs on Windows via icacls#1121
fix(browse): restrict sensitive state file ACLs on Windows via icacls#1121scarson wants to merge 1 commit intogarrytan:mainfrom
Conversation
gstack marks sensitive state files with `{ mode: 0o600 }` on
writeFileSync/appendFileSync and `{ mode: 0o700 }` on mkdirSync, plus
occasional `fs.chmodSync(path, 0o600)` post-write. On Windows, Node's fs
module doesn't translate POSIX mode bits to NTFS ACLs, so every one of
those calls is a silent no-op. The files end up with inherited ACLs from
the parent directory — typically user-full + SYSTEM-full + Administrators-
full + whatever inherited groups the parent holds.
Empirical demonstration via icacls on a fresh file written with
`{ mode: 0o600 }` on Windows 11:
BEFORE (Node writeFileSync with mode: 0o600):
S-1-5-21-... :(I)(M,DC)
Artemis\CodexSandboxUsers:(I)(M,DC)
NT AUTHORITY\SYSTEM:(I)(F)
BUILTIN\Administrators:(I)(F)
ARTEMIS\Sam:(I)(F)
AFTER (restrictFilePermissions via icacls):
ARTEMIS\Sam:(F)
Six ACEs collapsed to one. Files this affects today on every Windows
gstack install:
- `.gstack/.auth.json` server bearer token + port
- `.gstack/sidebar-agent-queue.jsonl`
full user prompts + canary token
(the prompt-injection defense secret)
- `.gstack/sidebar-sessions/*/session.json`
worktree paths, session IDs, config
- `.gstack/sidebar-sessions/*/chat.jsonl`
full chat history
- `.gstack/security/device-salt` hash key
- `.gstack/security/attempts.jsonl` threat-detection audit log
- `.gstack/security/session-state.json` security decisions
- `.gstack/security/decisions/tab-*.json` per-tab security verdicts
- config.stateDir/browse-states/*.json saved cookie bundles
- config.stateDir/browse.json state file
- `.context/sidebar-inbox/*.json` inter-agent messages
Most painful real-world scenario: self-hosted CI runners (GitHub Actions,
GitLab, Jenkins) where the runner's service account runs as a different
user on the same Windows box — that account can read every state file
the developer's gstack wrote, because inherited ACLs grant broad access.
Same concern on shared dev machines and multi-tenant servers.
Fix: a new `browse/src/file-permissions.ts` module with four functions:
restrictFilePermissions(path) chmod 0o600 on POSIX;
icacls /inheritance:r /grant:r
<user>:(F) on Windows
restrictDirectoryPermissions(path) chmod 0o700 on POSIX;
icacls /inheritance:r /grant:r
<user>:(OI)(CI)(F) on Windows
(OI+CI so new children inherit
the restricted ACL)
writeSecureFile(path, data) writeFileSync + restrictFile
appendSecureFile(path, data) appendFileSync + restrictFile on create
mkdirSecure(path) mkdirSync recursive + restrictDir
16 call sites across 7 files converted to the drop-in wrappers. Error
handling: icacls failures log a one-shot warning to stderr (filesystem
stays functional, file is just not ACL-restricted) so operators know to
audit their setup if icacls is unavailable or misbehaving, but the
warning won't spam. Bare `restrictFilePermissions`/`restrictDirectory-
Permissions` are exported too for the one sidebar-agent site that re-
applies ACL without rewriting.
Tests: 13 new assertions in `browse/test/file-permissions.test.ts`
covering POSIX mode bits (0o600 for files, 0o700 for dirs), Windows
no-throw on nonexistent paths, idempotent mkdir, recursive creation,
Buffer payloads, and appendSecureFile ACL-only-on-create semantics.
Windows-platform tests don't assert on icacls output (brittle across
Windows versions / locales) — they verify the "doesn't crash, file still
readable by caller" contract.
Zero regressions: same 140 pass / 1 fail on the pre-existing test set on
clean upstream vs. 153 pass / 1 fail with my changes (13 new tests; the
1 pre-existing failure is an unrelated beforeEach timeout).
Design notes for reviewer (inviting pushback on approach):
1. icacls is the right Windows primitive — no Node-native NTFS ACL API
exists, shell-out is the idiomatic answer. If there's a library you'd
prefer over `execFileSync('icacls', ...)` I'm happy to swap.
2. The ACL grants ONLY the current user. Not SYSTEM, not Administrators.
Rationale: this matches the POSIX 0o600 semantic (owner-only). A user
could argue SYSTEM should retain access for backup/AV — I'd counter
that the files are recreated on each gstack session, so backup loss
is annoying-but-not-catastrophic. Open to changing this if you'd
prefer SYSTEM+user.
3. icacls failures log once-per-process. Alternative: silent (lose audit
trail) or log-per-failure (spammy). Split the difference with a
single heads-up so ops teams notice the pattern.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
For context: opened #1122 today ("Boiling the Windows CI Lake") proposing a Phase-1 Windows smoke CI that would have caught this class of bug at PR time. This PR is cited in the RFC's coverage table as one of the five recent examples that motivated it — not asking for anything on this PR, it can merge on its own merits. Cross-linking because the "Why this shipped unnoticed" paragraph above reads differently with a systemic answer in flight. Also a companion issue at #1123 for higher-level discussion if the PR's review thread gets crowded. |
|
Closing in favor of #1308, refreshed against current main and re-staged post-v1.24.0.0. The reshape is mostly mechanical (the file-permissions.ts module is unchanged in shape), but coverage is extended to terminal-agent.ts (8 new call sites, file is new since v1.20) and tunnel-denial-log.ts (1 new site, new since v1.20). Without those, the original PR would have left ~9 silent-NTFS-no-op gaps on current main — exactly the surfaces the threat model cares about (PTY token storage, tunnel denial audit log). This PR's empirical icacls before/after, threat model, design-notes pushback section, and 13-test suite all carry forward to #1308. |
Summary
gstack's
~/.gstack/state directory holds a lot of secrets: the server bearer token (.auth.json), the canary token that backstops prompt-injection defense, full user prompts in the agent queue, complete chat history, per-tab security decisions, saved cookie bundles, the device salt used for threat-payload hashing, and the threat-detection audit log. Every one of those files is written withfs.writeFileSync(path, data, { mode: 0o600 })or appended with the same, and parent dirs get{ mode: 0o700 }.On Windows, all of that is a silent no-op. Node's fs module doesn't translate POSIX mode bits to NTFS ACLs. Every "restricted" file ends up with the parent directory's inherited ACL, which on every Windows install I've tested is a large set of principals.
Empirical proof
Fresh file written on Windows 11 with
fs.writeFileSync(p, 'secret', { mode: 0o600 }), then inspected viaicacls:Six ACEs — the one I actually want (
ARTEMIS\Sam) is the LAST of six. The{ mode: 0o600 }hint on the write call did nothing. After applying this PR's helper:One ACE. That's the POSIX 0o600 semantic translated correctly.
Threat model
The "inherited by 6 principals" state is merely aesthetic on a single-user laptop where all those principals map back to the same human. It's a real leak on:
.gstack/.The agent queue contains the canary token, which is the secret that backstops prompt-injection defense. Letting another user read that defeats one of the security classifier's core layers.
Why this shipped unnoticed
Same story as the previous two PRs.
.github/workflows/runs every job onubuntu-latestormacos-latest—chmodworks there, the test suite passes, and no CI config ever inspects NTFS ACLs.The fix
A new
browse/src/file-permissions.tsmodule with two primitives and three drop-in wrappers:(OI)(CI)on directories means new child files (object inherit) and subdirs (container inherit) inherit the single-user-full ACL — important because anyfs.writeFileSync(...)inside amkdirSecure-created dir will land already-restricted without further work.16 call sites across 7 files converted:
browse/src/server.tsbrowse/src/sidebar-agent.tsbrowse/src/security.tsbrowse/src/cli.tsbrowse/src/security-classifier.tsbrowse/src/browser-manager.tsbrowse/src/meta-commands.tsbrowse/src/config.tsTwo call sites that the grep heuristic from the first audit pass missed — including the most-critical one (
.auth.jsonwith the server bearer token,browser-manager.ts:273) — were picked up on a second sweep and included. (Lesson for the future: grep-then-diff, not grep-then-file.)Error handling
icacls failures (nonexistent path, missing icacls.exe, hardened environments that block it) log a one-shot warning to stderr and proceed — the filesystem is still functional, the file just ends up with inherited ACLs. Warning message:
Once-per-process gating prevents log spam if the condition persists.
Design notes — inviting pushback
This PR introduces a new pattern to gstack (shell-out to icacls). Before landing, I'd value your read on three judgment calls:
icacls vs. a Node library. I used
execFileSync('icacls', ...)— the Win32 native tool that ships in System32 on every Windows install since 7. The alternative is an npm library likewin-permissionsthat wraps icacls behind a typed API. I'd lean shell-out (one fewer dependency, no npm-supply-chain exposure for a security-critical function), but happy to swap.ACL scope: just the user, or user + SYSTEM? The helper grants ONLY the current user — matches POSIX 0o600 exactly. An argument for adding SYSTEM: Windows backup / antivirus services run as SYSTEM and may need read access. An argument against: gstack state files are small, ephemeral, and recreated on each session; backup loss is annoying but not catastrophic. I went with "just user" to be deliberately strict; happy to relax to
SYSTEM:(F),<user>:(F)if you prefer.Graceful degradation on icacls failure. I log once and continue. Alternatives: silent (lose audit trail), log per failure (spammy), hard-fail (crash the server on a security hardening failure — probably wrong). The once-per-process compromise seemed right. Happy to adjust.
I'm aware this PR is larger than #1118/#1119/#1120 — the surface is wider because the bug touches every sensitive-file site. Each individual change is mechanical (swap call to wrapper), which I hope makes the diff easy to skim.
Test plan
bun test browse/test/file-permissions.test.ts— 13 pass, 0 fail (POSIX mode-bit assertions, Windows no-throw, mkdir idempotence, recursive creation, Buffer payloads, append-creates-then-reapplies-once semantics)beforeEachtimeout in a test unrelated to my changesfs.chmodSync(path, 0o6XX)— same semantics as the inline{ mode: 0o6XX }it replaces. No behavior change on Linux/macOS.What this PR doesn't do
~/.gstack/ngrok.envis created by the user (we only read it), so we can't restrict its ACL at write time. A user running on a shared box is on their own for that one — worth a note in docs but not a code fix.mode: 0o600was a no-op, those files keep their original broad ACL until gstack next writes them (which restricts on the new write). A one-shot migration step could scan~/.gstack/and runicaclson everything — out of scope for this PR; happy to add if you want it.Commits
4d1ae05—fix(browse): restrict sensitive state file ACLs on Windows via icaclsSingle commit — the helper + call-site replacements + tests are all one cohesive change.
🤖 Generated with Claude Code