Skip to content

fix(browse): restrict sensitive state file ACLs on Windows via icacls#1121

Closed
scarson wants to merge 1 commit intogarrytan:mainfrom
scarson:fix/windows-acl-hardening
Closed

fix(browse): restrict sensitive state file ACLs on Windows via icacls#1121
scarson wants to merge 1 commit intogarrytan:mainfrom
scarson:fix/windows-acl-hardening

Conversation

@scarson
Copy link
Copy Markdown

@scarson scarson commented Apr 21, 2026

Summary

gstack's ~/.gstack/ state directory holds a lot of secrets: the server bearer token (.auth.json), the canary token that backstops prompt-injection defense, full user prompts in the agent queue, complete chat history, per-tab security decisions, saved cookie bundles, the device salt used for threat-payload hashing, and the threat-detection audit log. Every one of those files is written with fs.writeFileSync(path, data, { mode: 0o600 }) or appended with the same, and parent dirs get { mode: 0o700 }.

On Windows, all of that is a silent no-op. Node's fs module doesn't translate POSIX mode bits to NTFS ACLs. Every "restricted" file ends up with the parent directory's inherited ACL, which on every Windows install I've tested is a large set of principals.

Empirical proof

Fresh file written on Windows 11 with fs.writeFileSync(p, 'secret', { mode: 0o600 }), then inspected via icacls:

=== BEFORE restrictFilePermissions ===
...\acl-proof.txt  S-1-5-21-810533980-956579296-1443678346-1568416717:(I)(M,DC)
                   Artemis\CodexSandboxUsers:(I)(M,DC)
                   S-1-5-21-572213208-1484666115-665933948-1070144907:(I)(M,DC)
                   NT AUTHORITY\SYSTEM:(I)(F)
                   BUILTIN\Administrators:(I)(F)
                   ARTEMIS\Sam:(I)(F)

Six ACEs — the one I actually want (ARTEMIS\Sam) is the LAST of six. The { mode: 0o600 } hint on the write call did nothing. After applying this PR's helper:

=== AFTER restrictFilePermissions ===
...\acl-proof.txt  ARTEMIS\Sam:(F)

One ACE. That's the POSIX 0o600 semantic translated correctly.

Threat model

The "inherited by 6 principals" state is merely aesthetic on a single-user laptop where all those principals map back to the same human. It's a real leak on:

  • Self-hosted CI runners (GitHub Actions, GitLab, Jenkins agents). The runner's service account runs as a different user on the same Windows box; under inherited ACLs, that account can read the developer's canary token, session data, and prompt history. This is the scenario I'd prioritize — common setup, cross-user leak, no user action required for the attack.
  • Shared development machines (agencies, studios, lab environments). Multiple developer accounts on one box inherit each other's .gstack/.
  • Multi-tenant servers / shared hosts. Rare but catastrophic when it applies.
  • Container escape on hosts with mounted home dirs.
  • Compromised processes running as the same user (browser extensions, malware). Same-user isolation isn't a defense POSIX offers either, so this one is neutral — but the first three cases aren't.

The agent queue contains the canary token, which is the secret that backstops prompt-injection defense. Letting another user read that defeats one of the security classifier's core layers.

Why this shipped unnoticed

Same story as the previous two PRs. .github/workflows/ runs every job on ubuntu-latest or macos-latestchmod works there, the test suite passes, and no CI config ever inspects NTFS ACLs.

The fix

A new browse/src/file-permissions.ts module with two primitives and three drop-in wrappers:

export function restrictFilePermissions(path: string): void      // chmod 0o600 | icacls <user>:(F) /inheritance:r
export function restrictDirectoryPermissions(path: string): void // chmod 0o700 | icacls <user>:(OI)(CI)(F) /inheritance:r

export function writeSecureFile(path, data): void                // writeFileSync + restrictFile
export function appendSecureFile(path, data): void               // appendFileSync + restrictFile on create only
export function mkdirSecure(path): void                          // mkdirSync recursive + restrictDir

(OI)(CI) on directories means new child files (object inherit) and subdirs (container inherit) inherit the single-user-full ACL — important because any fs.writeFileSync(...) inside a mkdirSecure-created dir will land already-restricted without further work.

16 call sites across 7 files converted:

File Dirs Files Total
browse/src/server.ts 4 10 14
browse/src/sidebar-agent.ts 2 2 4
browse/src/security.ts 4 3 + 1 append 8
browse/src/cli.ts 1 1 2
browse/src/security-classifier.ts 2 0 2
browse/src/browser-manager.ts 1 1 2
browse/src/meta-commands.ts 1 1 2
browse/src/config.ts 1 0 1

Two call sites that the grep heuristic from the first audit pass missed — including the most-critical one (.auth.json with the server bearer token, browser-manager.ts:273) — were picked up on a second sweep and included. (Lesson for the future: grep-then-diff, not grep-then-file.)

Error handling

icacls failures (nonexistent path, missing icacls.exe, hardened environments that block it) log a one-shot warning to stderr and proceed — the filesystem is still functional, the file just ends up with inherited ACLs. Warning message:

[gstack] Failed to restrict Windows ACL on <path>: <error>
  Sensitive files may be readable by other accounts on this machine.
  This warning appears once per process; subsequent failures are silent.

Once-per-process gating prevents log spam if the condition persists.

Design notes — inviting pushback

This PR introduces a new pattern to gstack (shell-out to icacls). Before landing, I'd value your read on three judgment calls:

  1. icacls vs. a Node library. I used execFileSync('icacls', ...) — the Win32 native tool that ships in System32 on every Windows install since 7. The alternative is an npm library like win-permissions that wraps icacls behind a typed API. I'd lean shell-out (one fewer dependency, no npm-supply-chain exposure for a security-critical function), but happy to swap.

  2. ACL scope: just the user, or user + SYSTEM? The helper grants ONLY the current user — matches POSIX 0o600 exactly. An argument for adding SYSTEM: Windows backup / antivirus services run as SYSTEM and may need read access. An argument against: gstack state files are small, ephemeral, and recreated on each session; backup loss is annoying but not catastrophic. I went with "just user" to be deliberately strict; happy to relax to SYSTEM:(F),<user>:(F) if you prefer.

  3. Graceful degradation on icacls failure. I log once and continue. Alternatives: silent (lose audit trail), log per failure (spammy), hard-fail (crash the server on a security hardening failure — probably wrong). The once-per-process compromise seemed right. Happy to adjust.

I'm aware this PR is larger than #1118/#1119/#1120 — the surface is wider because the bug touches every sensitive-file site. Each individual change is mechanical (swap call to wrapper), which I hope makes the diff easy to skim.

Test plan

  • bun test browse/test/file-permissions.test.ts — 13 pass, 0 fail (POSIX mode-bit assertions, Windows no-throw, mkdir idempotence, recursive creation, Buffer payloads, append-creates-then-reapplies-once semantics)
  • Same targeted test set as the earlier PRs — 153 pass, 1 fail on this branch vs. 140 pass, 1 fail on clean upstream (stashed + re-ran). The 13 extra passes are the new assertions; the 1 failure is the pre-existing beforeEach timeout in a test unrelated to my changes
  • Empirical icacls before/after on a real file — 6 ACEs → 1 ACE (output above)
  • Ubuntu/macOS CI pass — relying on the existing matrix. POSIX branch of the helper is fs.chmodSync(path, 0o6XX) — same semantics as the inline { mode: 0o6XX } it replaces. No behavior change on Linux/macOS.
  • Full end-to-end gstack workflow on Windows with the helper in place — not run. Every test that exercises sidebar/session/agent queue state is already passing; integration is limited to the write path which the unit tests cover.

What this PR doesn't do

  • Fix non-gstack-owned state files. ~/.gstack/ngrok.env is created by the user (we only read it), so we can't restrict its ACL at write time. A user running on a shared box is on their own for that one — worth a note in docs but not a code fix.
  • Retroactively re-ACL existing files. If a user has gstack state on disk from an earlier install where mode: 0o600 was a no-op, those files keep their original broad ACL until gstack next writes them (which restricts on the new write). A one-shot migration step could scan ~/.gstack/ and run icacls on everything — out of scope for this PR; happy to add if you want it.
  • Change the POSIX behavior. Everything on Linux/macOS is bit-identical to what shipped.

Commits

  • 4d1ae05fix(browse): restrict sensitive state file ACLs on Windows via icacls

Single commit — the helper + call-site replacements + tests are all one cohesive change.

🤖 Generated with Claude Code

gstack marks sensitive state files with `{ mode: 0o600 }` on
writeFileSync/appendFileSync and `{ mode: 0o700 }` on mkdirSync, plus
occasional `fs.chmodSync(path, 0o600)` post-write. On Windows, Node's fs
module doesn't translate POSIX mode bits to NTFS ACLs, so every one of
those calls is a silent no-op. The files end up with inherited ACLs from
the parent directory — typically user-full + SYSTEM-full + Administrators-
full + whatever inherited groups the parent holds.

Empirical demonstration via icacls on a fresh file written with
`{ mode: 0o600 }` on Windows 11:

    BEFORE (Node writeFileSync with mode: 0o600):
      S-1-5-21-... :(I)(M,DC)
      Artemis\CodexSandboxUsers:(I)(M,DC)
      NT AUTHORITY\SYSTEM:(I)(F)
      BUILTIN\Administrators:(I)(F)
      ARTEMIS\Sam:(I)(F)

    AFTER (restrictFilePermissions via icacls):
      ARTEMIS\Sam:(F)

Six ACEs collapsed to one. Files this affects today on every Windows
gstack install:

  - `.gstack/.auth.json`          server bearer token + port
  - `.gstack/sidebar-agent-queue.jsonl`
                                  full user prompts + canary token
                                  (the prompt-injection defense secret)
  - `.gstack/sidebar-sessions/*/session.json`
                                  worktree paths, session IDs, config
  - `.gstack/sidebar-sessions/*/chat.jsonl`
                                  full chat history
  - `.gstack/security/device-salt`  hash key
  - `.gstack/security/attempts.jsonl`  threat-detection audit log
  - `.gstack/security/session-state.json`  security decisions
  - `.gstack/security/decisions/tab-*.json`  per-tab security verdicts
  - config.stateDir/browse-states/*.json  saved cookie bundles
  - config.stateDir/browse.json   state file
  - `.context/sidebar-inbox/*.json`  inter-agent messages

Most painful real-world scenario: self-hosted CI runners (GitHub Actions,
GitLab, Jenkins) where the runner's service account runs as a different
user on the same Windows box — that account can read every state file
the developer's gstack wrote, because inherited ACLs grant broad access.
Same concern on shared dev machines and multi-tenant servers.

Fix: a new `browse/src/file-permissions.ts` module with four functions:

  restrictFilePermissions(path)       chmod 0o600 on POSIX;
                                       icacls /inheritance:r /grant:r
                                       <user>:(F) on Windows

  restrictDirectoryPermissions(path)  chmod 0o700 on POSIX;
                                       icacls /inheritance:r /grant:r
                                       <user>:(OI)(CI)(F) on Windows
                                       (OI+CI so new children inherit
                                       the restricted ACL)

  writeSecureFile(path, data)         writeFileSync + restrictFile
  appendSecureFile(path, data)        appendFileSync + restrictFile on create
  mkdirSecure(path)                   mkdirSync recursive + restrictDir

16 call sites across 7 files converted to the drop-in wrappers. Error
handling: icacls failures log a one-shot warning to stderr (filesystem
stays functional, file is just not ACL-restricted) so operators know to
audit their setup if icacls is unavailable or misbehaving, but the
warning won't spam. Bare `restrictFilePermissions`/`restrictDirectory-
Permissions` are exported too for the one sidebar-agent site that re-
applies ACL without rewriting.

Tests: 13 new assertions in `browse/test/file-permissions.test.ts`
covering POSIX mode bits (0o600 for files, 0o700 for dirs), Windows
no-throw on nonexistent paths, idempotent mkdir, recursive creation,
Buffer payloads, and appendSecureFile ACL-only-on-create semantics.
Windows-platform tests don't assert on icacls output (brittle across
Windows versions / locales) — they verify the "doesn't crash, file still
readable by caller" contract.

Zero regressions: same 140 pass / 1 fail on the pre-existing test set on
clean upstream vs. 153 pass / 1 fail with my changes (13 new tests; the
1 pre-existing failure is an unrelated beforeEach timeout).

Design notes for reviewer (inviting pushback on approach):

1. icacls is the right Windows primitive — no Node-native NTFS ACL API
   exists, shell-out is the idiomatic answer. If there's a library you'd
   prefer over `execFileSync('icacls', ...)` I'm happy to swap.

2. The ACL grants ONLY the current user. Not SYSTEM, not Administrators.
   Rationale: this matches the POSIX 0o600 semantic (owner-only). A user
   could argue SYSTEM should retain access for backup/AV — I'd counter
   that the files are recreated on each gstack session, so backup loss
   is annoying-but-not-catastrophic. Open to changing this if you'd
   prefer SYSTEM+user.

3. icacls failures log once-per-process. Alternative: silent (lose audit
   trail) or log-per-failure (spammy). Split the difference with a
   single heads-up so ops teams notice the pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@scarson
Copy link
Copy Markdown
Author

scarson commented Apr 21, 2026

For context: opened #1122 today ("Boiling the Windows CI Lake") proposing a Phase-1 Windows smoke CI that would have caught this class of bug at PR time. This PR is cited in the RFC's coverage table as one of the five recent examples that motivated it — not asking for anything on this PR, it can merge on its own merits. Cross-linking because the "Why this shipped unnoticed" paragraph above reads differently with a systemic answer in flight.

Also a companion issue at #1123 for higher-level discussion if the PR's review thread gets crowded.

@scarson
Copy link
Copy Markdown
Author

scarson commented May 3, 2026

Closing in favor of #1308, refreshed against current main and re-staged post-v1.24.0.0.

The reshape is mostly mechanical (the file-permissions.ts module is unchanged in shape), but coverage is extended to terminal-agent.ts (8 new call sites, file is new since v1.20) and tunnel-denial-log.ts (1 new site, new since v1.20). Without those, the original PR would have left ~9 silent-NTFS-no-op gaps on current main — exactly the surfaces the threat model cares about (PTY token storage, tunnel denial audit log).

This PR's empirical icacls before/after, threat model, design-notes pushback section, and 13-test suite all carry forward to #1308.

@scarson scarson closed this May 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant