fix: extend trajectory JSONL redaction to cover Google, AWS, Daytona, and header patterns (#537) by ElegantLin · Pull Request #585 · benchflow-ai/benchflow

ElegantLin · 2026-05-30T03:44:30Z

Summary

Trajectory.to_jsonl only redacted two secret patterns (sk-ant-, Bearer). Every other production credential pattern could leak into acp_trajectory.jsonl if it entered trajectory content (e.g. agent running env as a tool call).

Added redaction for:

Google AI / Gemini: AIzaSy*
AWS access keys: AKIA* / ASIA* (anchored to exact 20-char length to avoid matching ASIAPACIFIC etc.)
Daytona: dtn_* (≥16 char suffix to avoid short identifiers)
OpenAI project keys: sk-proj-*
HTTP headers: x-api-key, api-key (Azure)

Extracted patterns into a compiled _REDACTION_PATTERNS list and a reusable redact_trajectory_text() function for downstream callers.

Test plan

10 parametrized tests covering each secret pattern
7 parametrized false-positive guard tests (English words, hyphenated slugs, short identifiers)
End-to-end tests through Trajectory.to_jsonl(redact_keys=True/False)
21/21 trajectory tests pass; ruff clean
Self-reviewed for regex correctness and false positives before submitting

Closes #537

🤖 Generated with Claude Code

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

bingran-you · 2026-06-02T23:09:34Z

Thermo-nuclear code quality review

What this PR is trying to solve

This PR is trying to close #537 by expanding Trajectory.to_jsonl(redact_keys=True) beyond the old Anthropic/OpenAI/Bearer-only redaction. The proposed solution moves regexes into a compiled _REDACTION_PATTERNS list, exposes redact_trajectory_text(), and adds focused tests for Google, AWS, Daytona, OpenAI project keys, and JSON-shaped x-api-key / api-key headers.

The direction is right, but I would not merge this yet. There are still high-confidence gaps against the issue this PR claims to close.

Blocking findings

x-api-key:, api-key:, and non-Bearer authorization: text still leak.

src/benchflow/trajectories/types.py:160-173 only redacts headers when trajectory text contains JSON object keys like "x-api-key": "..." or "api-key": "...". But Trajectory JSONL redaction can miss Google AIza, AWS AKIA, x-api-key, Daytona dtn_ patterns #537 explicitly calls out generic header forms such as x-api-key: ..., api-key: ..., and authorization: <not Bearer>, which can appear when an agent prints env, curls with verbose headers, dumps request logs, or writes tool output into a trajectory. For Azure-style api-key: abc123secret456value, the value does not match any provider-prefix regex and survives.

Handoff: add redaction for header/key-value text forms, not just JSON keys, for at least x-api-key, api-key, x-goog-api-key, and authorization regardless of scheme. Add tests that pass raw trajectory-content strings such as x-api-key: abc123secret456value, api-key=abc123secret456value, and authorization: Token abc123secret456value through Trajectory.to_jsonl(), not only through the helper.
The documented secret-leak audit still fails on redacted Google/Daytona env output.

The v0.5 audit in docs/v05-e2e-testing-guide.md:443 checks for AIzaSy, dtn_, GEMINI_API_KEY, and DAYTONA_API_KEY and expects no matches. This implementation preserves those prefixes in src/benchflow/trajectories/types.py:149-159: Google becomes AIzaSy....***REDACTED***, and Daytona becomes dtn_....***REDACTED***. If trajectory content contains GEMINI_API_KEY=AIzaSy..., the output still contains both GEMINI_API_KEY and AIzaSy, so the experiment-health audit still flags the artifact.

Handoff: either redact the whole matched token/pair for audit-sensitive families, or update the audit and tests to distinguish redacted placeholders from live secrets. Include an end-to-end regression that constructs a trajectory body containing GEMINI_API_KEY=AIzaSy... and DAYTONA_API_KEY=dtn_..., writes JSONL, and proves the same leak-audit intent passes.

Test and maintainability gaps

The new tests are useful, but they mostly test redact_trajectory_text() on idealized JSON strings. The risky path is benchmark artifact publication after messy agent/tool output, so coverage should include Trajectory.to_jsonl() with raw shell env output, raw HTTP headers, and key/value logs.

Also, per this repo's AGENTS.md, regression tests should name the PR/commit they guard in the docstring. The new test file should make that explicit.

Verdict: request changes. The abstraction is small and the file stays well under 1k lines, but the implementation still misses core cases from #537 and leaves the documented experiment-health audit ambiguous.

Addresses bingran-you CHANGES_REQUESTED on PR #585 / issue #537. Finding 1 — raw-text header/kv forms leaked. Redaction only matched JSON keys ("x-api-key": "v"). Now also covers raw forms agents print into trajectories (env dumps, curl -v, request logs): x-api-key: v, api-key=v, api_key: v, x-goog-api-key: v, authorization: <scheme> v / bare value. The field name is kept; the value is dropped. Finding 2 — audit-sensitive prefixes survived. Google/AWS/Daytona/OpenAI tokens were redacted but kept their live prefix (AIzaSy***), so the v0.5 §14 leak audit (grep AIzaSy/dtn_) still flagged them. Now the whole token is replaced, prefix included. Updated the audit grep in docs/v05-e2e-testing-guide.md to match live-key *shapes* (suffix length), not bare variable names, with a note that names are not secrets. Header-pattern hardening from this work: underscore api_key gets a leading boundary so it can't match the tail of GEMINI_API_KEY; value classes exclude backslash (a JSON-escaped \n can't swallow the next header) and '*' (an already-redacted value isn't re-matched); authorization splits scheme/bare with a lookahead to avoid double-redaction. Tests run through Trajectory.to_jsonl() with raw env/header/kv text, not only the helper on idealized JSON. Module docstring names PR #585. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

These files (test_sandbox_hardening.py, test_usage_proxy_gemini_rewrite.py) were committed to main unformatted (via #unrelated PRs) and trip 'ruff format --check' on this PR's merge-commit CI run. Formatting them here to unblock #585; no logic change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…paced API keys (#585) Addresses the deeper review round on PR #585 / issue #537. 1. Redaction was only applied to llm_trajectory.jsonl (Trajectory.to_jsonl), never to acp_trajectory.jsonl — the file the PR set out to protect. Added a reusable redact_acp_trajectory_jsonl() helper and wired it into all three ACP-trajectory write sites (rollout.py local write + verifier-publish tmpfile, hosted_env.py), so an agent running `env` no longer leaks secrets there. 2. The underscore `api_key` rule had a (?<![A-Za-z0-9_]) lookbehind that left namespaced env vars (GEMINI_API_KEY=..., {"openai_api_key": ...}) unredacted. Removed it; the \1 capture preserves the key name, so GEMINI_API_KEY=secret -> GEMINI_API_KEY=***REDACTED*** with no over-redaction. Regression tests assert the raw secret is absent from the written file. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ElegantLin · 2026-06-04T06:00:18Z

Follow-up (589597a) after a deeper self-review found this PR didn't actually achieve its goal:

Redaction wasn't wired to acp_trajectory.jsonl (the named target) — only llm_trajectory.jsonl was redacted. Added redact_acp_trajectory_jsonl() and wired it into all three ACP-trajectory write sites (rollout local write, verifier-publish tmpfile, hosted_env).
Namespaced env keys leaked: GEMINI_API_KEY=... / {"openai_api_key": ...} slipped through due to a lookbehind on the api_key rule. Removed it (the \1 capture preserves the key name, so no over-redaction).
New tests assert the raw secret is absent from the written file.

… and header patterns (#537) Trajectory.to_jsonl only redacted sk-ant- (Anthropic) and Bearer headers. Other production credential patterns could leak into acp_trajectory.jsonl if they entered trajectory content (e.g. agent running `env` as a tool call). Added redaction for: - Google AI / Gemini: AIzaSy* - AWS access keys: AKIA* / ASIA* (anchored to exact 20-char length) - Daytona: dtn_* (>=16 char suffix to avoid short identifiers) - OpenAI project keys: sk-proj-* - HTTP headers: x-api-key, api-key (Azure) Extracted patterns into a compiled _REDACTION_PATTERNS list and a reusable redact_trajectory_text() function. All test fixtures use obviously-fake keys. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Addresses bingran-you CHANGES_REQUESTED on PR #585 / issue #537. Finding 1 — raw-text header/kv forms leaked. Redaction only matched JSON keys ("x-api-key": "v"). Now also covers raw forms agents print into trajectories (env dumps, curl -v, request logs): x-api-key: v, api-key=v, api_key: v, x-goog-api-key: v, authorization: <scheme> v / bare value. The field name is kept; the value is dropped. Finding 2 — audit-sensitive prefixes survived. Google/AWS/Daytona/OpenAI tokens were redacted but kept their live prefix (AIzaSy***), so the v0.5 §14 leak audit (grep AIzaSy/dtn_) still flagged them. Now the whole token is replaced, prefix included. Updated the audit grep in docs/v05-e2e-testing-guide.md to match live-key *shapes* (suffix length), not bare variable names, with a note that names are not secrets. Header-pattern hardening from this work: underscore api_key gets a leading boundary so it can't match the tail of GEMINI_API_KEY; value classes exclude backslash (a JSON-escaped \n can't swallow the next header) and '*' (an already-redacted value isn't re-matched); authorization splits scheme/bare with a lookahead to avoid double-redaction. Tests run through Trajectory.to_jsonl() with raw env/header/kv text, not only the helper on idealized JSON. Module docstring names PR #585. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…paced API keys (#585) Addresses the deeper review round on PR #585 / issue #537. 1. Redaction was only applied to llm_trajectory.jsonl (Trajectory.to_jsonl), never to acp_trajectory.jsonl — the file the PR set out to protect. Added a reusable redact_acp_trajectory_jsonl() helper and wired it into all three ACP-trajectory write sites (rollout.py local write + verifier-publish tmpfile, hosted_env.py), so an agent running `env` no longer leaks secrets there. 2. The underscore `api_key` rule had a (?<![A-Za-z0-9_]) lookbehind that left namespaced env vars (GEMINI_API_KEY=..., {"openai_api_key": ...}) unredacted. Removed it; the \1 capture preserves the key name, so GEMINI_API_KEY=secret -> GEMINI_API_KEY=***REDACTED*** with no over-redaction. Regression tests assert the raw secret is absent from the written file. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

devin-ai-integration Bot reviewed May 30, 2026

View reviewed changes

ElegantLin force-pushed the fix/537-trajectory-redaction branch 3 times, most recently from 08f055a to 8a6cee7 Compare June 1, 2026 16:43

ElegantLin force-pushed the fix/537-trajectory-redaction branch from bfb7293 to 8520eaa Compare June 3, 2026 04:20

ElegantLin and others added 2 commits June 4, 2026 14:25

ElegantLin force-pushed the fix/537-trajectory-redaction branch from 589597a to 35f8782 Compare June 4, 2026 14:26

ElegantLin force-pushed the fix/537-trajectory-redaction branch from 35f8782 to 8d0f203 Compare June 4, 2026 14:59

ElegantLin assigned bingran-you Jun 4, 2026

fix: redact verifiers trajectory exports

495dbc0

bingran-you merged commit de0c137 into main Jun 4, 2026
2 checks passed

bingran-you deleted the fix/537-trajectory-redaction branch June 4, 2026 19:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: extend trajectory JSONL redaction to cover Google, AWS, Daytona, and header patterns (#537)#585

fix: extend trajectory JSONL redaction to cover Google, AWS, Daytona, and header patterns (#537)#585
bingran-you merged 4 commits into
mainfrom
fix/537-trajectory-redaction

ElegantLin commented May 30, 2026 •

edited by devin-ai-integration Bot

Loading

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

bingran-you commented Jun 2, 2026

Uh oh!

ElegantLin commented Jun 4, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ElegantLin commented May 30, 2026 • edited by devin-ai-integration Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

✅ Devin Review: No Issues Found

Uh oh!

bingran-you commented Jun 2, 2026

Thermo-nuclear code quality review

What this PR is trying to solve

Blocking findings

Test and maintainability gaps

Uh oh!

ElegantLin commented Jun 4, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ElegantLin commented May 30, 2026 •

edited by devin-ai-integration Bot

Loading