Skip to content

fix: extend trajectory JSONL redaction to cover Google, AWS, Daytona, and header patterns (#537)#585

Merged
bingran-you merged 4 commits into
mainfrom
fix/537-trajectory-redaction
Jun 4, 2026
Merged

fix: extend trajectory JSONL redaction to cover Google, AWS, Daytona, and header patterns (#537)#585
bingran-you merged 4 commits into
mainfrom
fix/537-trajectory-redaction

Conversation

@ElegantLin

@ElegantLin ElegantLin commented May 30, 2026

Copy link
Copy Markdown
Contributor

Summary

Trajectory.to_jsonl only redacted two secret patterns (sk-ant-, Bearer). Every other production credential pattern could leak into acp_trajectory.jsonl if it entered trajectory content (e.g. agent running env as a tool call).

Added redaction for:

  • Google AI / Gemini: AIzaSy*
  • AWS access keys: AKIA* / ASIA* (anchored to exact 20-char length to avoid matching ASIAPACIFIC etc.)
  • Daytona: dtn_* (≥16 char suffix to avoid short identifiers)
  • OpenAI project keys: sk-proj-*
  • HTTP headers: x-api-key, api-key (Azure)

Extracted patterns into a compiled _REDACTION_PATTERNS list and a reusable redact_trajectory_text() function for downstream callers.

Test plan

  • 10 parametrized tests covering each secret pattern
  • 7 parametrized false-positive guard tests (English words, hyphenated slugs, short identifiers)
  • End-to-end tests through Trajectory.to_jsonl(redact_keys=True/False)
  • 21/21 trajectory tests pass; ruff clean
  • Self-reviewed for regex correctness and false positives before submitting

Closes #537

🤖 Generated with Claude Code


Open in Devin Review

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

Open in Devin Review

@ElegantLin ElegantLin force-pushed the fix/537-trajectory-redaction branch 3 times, most recently from 08f055a to 8a6cee7 Compare June 1, 2026 16:43
@bingran-you

Copy link
Copy Markdown
Collaborator

Thermo-nuclear code quality review

What this PR is trying to solve

This PR is trying to close #537 by expanding Trajectory.to_jsonl(redact_keys=True) beyond the old Anthropic/OpenAI/Bearer-only redaction. The proposed solution moves regexes into a compiled _REDACTION_PATTERNS list, exposes redact_trajectory_text(), and adds focused tests for Google, AWS, Daytona, OpenAI project keys, and JSON-shaped x-api-key / api-key headers.

The direction is right, but I would not merge this yet. There are still high-confidence gaps against the issue this PR claims to close.

Blocking findings

  1. x-api-key:, api-key:, and non-Bearer authorization: text still leak.

    src/benchflow/trajectories/types.py:160-173 only redacts headers when trajectory text contains JSON object keys like "x-api-key": "..." or "api-key": "...". But Trajectory JSONL redaction can miss Google AIza, AWS AKIA, x-api-key, Daytona dtn_ patterns #537 explicitly calls out generic header forms such as x-api-key: ..., api-key: ..., and authorization: <not Bearer>, which can appear when an agent prints env, curls with verbose headers, dumps request logs, or writes tool output into a trajectory. For Azure-style api-key: abc123secret456value, the value does not match any provider-prefix regex and survives.

    Handoff: add redaction for header/key-value text forms, not just JSON keys, for at least x-api-key, api-key, x-goog-api-key, and authorization regardless of scheme. Add tests that pass raw trajectory-content strings such as x-api-key: abc123secret456value, api-key=abc123secret456value, and authorization: Token abc123secret456value through Trajectory.to_jsonl(), not only through the helper.

  2. The documented secret-leak audit still fails on redacted Google/Daytona env output.

    The v0.5 audit in docs/v05-e2e-testing-guide.md:443 checks for AIzaSy, dtn_, GEMINI_API_KEY, and DAYTONA_API_KEY and expects no matches. This implementation preserves those prefixes in src/benchflow/trajectories/types.py:149-159: Google becomes AIzaSy....***REDACTED***, and Daytona becomes dtn_....***REDACTED***. If trajectory content contains GEMINI_API_KEY=AIzaSy..., the output still contains both GEMINI_API_KEY and AIzaSy, so the experiment-health audit still flags the artifact.

    Handoff: either redact the whole matched token/pair for audit-sensitive families, or update the audit and tests to distinguish redacted placeholders from live secrets. Include an end-to-end regression that constructs a trajectory body containing GEMINI_API_KEY=AIzaSy... and DAYTONA_API_KEY=dtn_..., writes JSONL, and proves the same leak-audit intent passes.

Test and maintainability gaps

The new tests are useful, but they mostly test redact_trajectory_text() on idealized JSON strings. The risky path is benchmark artifact publication after messy agent/tool output, so coverage should include Trajectory.to_jsonl() with raw shell env output, raw HTTP headers, and key/value logs.

Also, per this repo's AGENTS.md, regression tests should name the PR/commit they guard in the docstring. The new test file should make that explicit.

Verdict: request changes. The abstraction is small and the file stays well under 1k lines, but the implementation still misses core cases from #537 and leaves the documented experiment-health audit ambiguous.

ElegantLin added a commit that referenced this pull request Jun 3, 2026
Addresses bingran-you CHANGES_REQUESTED on PR #585 / issue #537.

Finding 1 — raw-text header/kv forms leaked. Redaction only matched JSON
keys ("x-api-key": "v"). Now also covers raw forms agents print into
trajectories (env dumps, curl -v, request logs): x-api-key: v, api-key=v,
api_key: v, x-goog-api-key: v, authorization: <scheme> v / bare value.
The field name is kept; the value is dropped.

Finding 2 — audit-sensitive prefixes survived. Google/AWS/Daytona/OpenAI
tokens were redacted but kept their live prefix (AIzaSy***), so the v0.5
§14 leak audit (grep AIzaSy/dtn_) still flagged them. Now the whole
token is replaced, prefix included. Updated the audit grep in
docs/v05-e2e-testing-guide.md to match live-key *shapes* (suffix length),
not bare variable names, with a note that names are not secrets.

Header-pattern hardening from this work: underscore api_key gets a leading
boundary so it can't match the tail of GEMINI_API_KEY; value classes
exclude backslash (a JSON-escaped \n can't swallow the next header) and
'*' (an already-redacted value isn't re-matched); authorization splits
scheme/bare with a lookahead to avoid double-redaction.

Tests run through Trajectory.to_jsonl() with raw env/header/kv text, not
only the helper on idealized JSON. Module docstring names PR #585.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ElegantLin added a commit that referenced this pull request Jun 3, 2026
Addresses bingran-you CHANGES_REQUESTED on PR #585 / issue #537.

Finding 1 — raw-text header/kv forms leaked. Redaction only matched JSON
keys ("x-api-key": "v"). Now also covers raw forms agents print into
trajectories (env dumps, curl -v, request logs): x-api-key: v, api-key=v,
api_key: v, x-goog-api-key: v, authorization: <scheme> v / bare value.
The field name is kept; the value is dropped.

Finding 2 — audit-sensitive prefixes survived. Google/AWS/Daytona/OpenAI
tokens were redacted but kept their live prefix (AIzaSy***), so the v0.5
§14 leak audit (grep AIzaSy/dtn_) still flagged them. Now the whole
token is replaced, prefix included. Updated the audit grep in
docs/v05-e2e-testing-guide.md to match live-key *shapes* (suffix length),
not bare variable names, with a note that names are not secrets.

Header-pattern hardening from this work: underscore api_key gets a leading
boundary so it can't match the tail of GEMINI_API_KEY; value classes
exclude backslash (a JSON-escaped \n can't swallow the next header) and
'*' (an already-redacted value isn't re-matched); authorization splits
scheme/bare with a lookahead to avoid double-redaction.

Tests run through Trajectory.to_jsonl() with raw env/header/kv text, not
only the helper on idealized JSON. Module docstring names PR #585.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ElegantLin added a commit that referenced this pull request Jun 3, 2026
These files (test_sandbox_hardening.py, test_usage_proxy_gemini_rewrite.py)
were committed to main unformatted (via #unrelated PRs) and trip
'ruff format --check' on this PR's merge-commit CI run. Formatting them
here to unblock #585; no logic change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ElegantLin ElegantLin force-pushed the fix/537-trajectory-redaction branch from bfb7293 to 8520eaa Compare June 3, 2026 04:20
ElegantLin added a commit that referenced this pull request Jun 4, 2026
…paced API keys (#585)

Addresses the deeper review round on PR #585 / issue #537.

1. Redaction was only applied to llm_trajectory.jsonl (Trajectory.to_jsonl),
   never to acp_trajectory.jsonl — the file the PR set out to protect. Added a
   reusable redact_acp_trajectory_jsonl() helper and wired it into all three
   ACP-trajectory write sites (rollout.py local write + verifier-publish tmpfile,
   hosted_env.py), so an agent running `env` no longer leaks secrets there.
2. The underscore `api_key` rule had a (?<![A-Za-z0-9_]) lookbehind that left
   namespaced env vars (GEMINI_API_KEY=..., {"openai_api_key": ...}) unredacted.
   Removed it; the \1 capture preserves the key name, so GEMINI_API_KEY=secret
   -> GEMINI_API_KEY=***REDACTED*** with no over-redaction.

Regression tests assert the raw secret is absent from the written file.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ElegantLin

Copy link
Copy Markdown
Contributor Author

Follow-up (589597a) after a deeper self-review found this PR didn't actually achieve its goal:

  • Redaction wasn't wired to acp_trajectory.jsonl (the named target) — only llm_trajectory.jsonl was redacted. Added redact_acp_trajectory_jsonl() and wired it into all three ACP-trajectory write sites (rollout local write, verifier-publish tmpfile, hosted_env).
  • Namespaced env keys leaked: GEMINI_API_KEY=... / {"openai_api_key": ...} slipped through due to a lookbehind on the api_key rule. Removed it (the \1 capture preserves the key name, so no over-redaction).
    New tests assert the raw secret is absent from the written file.

ElegantLin and others added 2 commits June 4, 2026 14:25
… and header patterns (#537)

Trajectory.to_jsonl only redacted sk-ant- (Anthropic) and Bearer headers.
Other production credential patterns could leak into acp_trajectory.jsonl
if they entered trajectory content (e.g. agent running `env` as a tool call).

Added redaction for:
- Google AI / Gemini: AIzaSy*
- AWS access keys: AKIA* / ASIA* (anchored to exact 20-char length)
- Daytona: dtn_* (>=16 char suffix to avoid short identifiers)
- OpenAI project keys: sk-proj-*
- HTTP headers: x-api-key, api-key (Azure)

Extracted patterns into a compiled _REDACTION_PATTERNS list and a reusable
redact_trajectory_text() function. All test fixtures use obviously-fake keys.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses bingran-you CHANGES_REQUESTED on PR #585 / issue #537.

Finding 1 — raw-text header/kv forms leaked. Redaction only matched JSON
keys ("x-api-key": "v"). Now also covers raw forms agents print into
trajectories (env dumps, curl -v, request logs): x-api-key: v, api-key=v,
api_key: v, x-goog-api-key: v, authorization: <scheme> v / bare value.
The field name is kept; the value is dropped.

Finding 2 — audit-sensitive prefixes survived. Google/AWS/Daytona/OpenAI
tokens were redacted but kept their live prefix (AIzaSy***), so the v0.5
§14 leak audit (grep AIzaSy/dtn_) still flagged them. Now the whole
token is replaced, prefix included. Updated the audit grep in
docs/v05-e2e-testing-guide.md to match live-key *shapes* (suffix length),
not bare variable names, with a note that names are not secrets.

Header-pattern hardening from this work: underscore api_key gets a leading
boundary so it can't match the tail of GEMINI_API_KEY; value classes
exclude backslash (a JSON-escaped \n can't swallow the next header) and
'*' (an already-redacted value isn't re-matched); authorization splits
scheme/bare with a lookahead to avoid double-redaction.

Tests run through Trajectory.to_jsonl() with raw env/header/kv text, not
only the helper on idealized JSON. Module docstring names PR #585.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ElegantLin added a commit that referenced this pull request Jun 4, 2026
…paced API keys (#585)

Addresses the deeper review round on PR #585 / issue #537.

1. Redaction was only applied to llm_trajectory.jsonl (Trajectory.to_jsonl),
   never to acp_trajectory.jsonl — the file the PR set out to protect. Added a
   reusable redact_acp_trajectory_jsonl() helper and wired it into all three
   ACP-trajectory write sites (rollout.py local write + verifier-publish tmpfile,
   hosted_env.py), so an agent running `env` no longer leaks secrets there.
2. The underscore `api_key` rule had a (?<![A-Za-z0-9_]) lookbehind that left
   namespaced env vars (GEMINI_API_KEY=..., {"openai_api_key": ...}) unredacted.
   Removed it; the \1 capture preserves the key name, so GEMINI_API_KEY=secret
   -> GEMINI_API_KEY=***REDACTED*** with no over-redaction.

Regression tests assert the raw secret is absent from the written file.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ElegantLin ElegantLin force-pushed the fix/537-trajectory-redaction branch from 589597a to 35f8782 Compare June 4, 2026 14:26
…paced API keys (#585)

Addresses the deeper review round on PR #585 / issue #537.

1. Redaction was only applied to llm_trajectory.jsonl (Trajectory.to_jsonl),
   never to acp_trajectory.jsonl — the file the PR set out to protect. Added a
   reusable redact_acp_trajectory_jsonl() helper and wired it into all three
   ACP-trajectory write sites (rollout.py local write + verifier-publish tmpfile,
   hosted_env.py), so an agent running `env` no longer leaks secrets there.
2. The underscore `api_key` rule had a (?<![A-Za-z0-9_]) lookbehind that left
   namespaced env vars (GEMINI_API_KEY=..., {"openai_api_key": ...}) unredacted.
   Removed it; the \1 capture preserves the key name, so GEMINI_API_KEY=secret
   -> GEMINI_API_KEY=***REDACTED*** with no over-redaction.

Regression tests assert the raw secret is absent from the written file.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ElegantLin ElegantLin force-pushed the fix/537-trajectory-redaction branch from 35f8782 to 8d0f203 Compare June 4, 2026 14:59
@bingran-you bingran-you merged commit de0c137 into main Jun 4, 2026
2 checks passed
@bingran-you bingran-you deleted the fix/537-trajectory-redaction branch June 4, 2026 19:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Trajectory JSONL redaction can miss Google AIza, AWS AKIA, x-api-key, Daytona dtn_ patterns

2 participants