fix(hook): secret redaction was broken — rewrite via Python + add tests#16
Merged
fix(hook): secret redaction was broken — rewrite via Python + add tests#16
Conversation
The PostToolUse hook claimed to redact bearer tokens / api keys /
passwords from observation summaries before writing them to disk.
A code review found the redaction was broken in three concrete ways:
1. The keyword=value sed pattern used `\047` to mean a single quote
inside a POSIX bracket expression. Both BSD sed (macOS) and GNU
sed treat `\047` as literal characters there, so `api_key="..."`
and `password="..."` patterns never matched. Verified empirically:
`api_key="abc12345xyz_secretvalue"` came through unchanged.
2. The Bearer pattern's character class (`[A-Za-z0-9_\.\-\/+=]+`)
had unnecessary backslash escapes that fouled the match. Result:
`Bearer sk-ant-api03-realsecret` was redacted as
`Bearer [REDACTED]-ant-api03-realsecret` — the `sk-` got eaten,
the rest of the API key leaked into observations on disk. This
is the actual Anthropic key prefix and the most realistic case.
3. Inline-env-var assignments (`MY_TOKEN=abc curl ...`) were never
caught — only `export TOKEN=...` form. Common shell pattern,
fully bypassed redaction.
Plus secondary issues:
4. The `ls*` skip-list pattern matched `lsof`, `lsblk`, `lsattr` and
other non-read-only commands, dropping their observations.
5. The Bash branch only redacted SUMMARY, not COMMAND. The full
command also gets persisted into context.command in the JSONL.
Fix:
- `redact_secrets()` helper rewritten in Python via `python3 -c`
(sed character classes diverge between BSD and GNU; Python regex
is portable and verified). Three rules:
a. inline ENV=val with secret-implying variable name → REDACTED
b. keyword (token/password/api_key/secret/credential/auth) =
value (4+ non-space/quote chars) → REDACTED
c. Bearer X, sk-..., ghp_..., AKIA... → REDACTED
- Read-only skip patterns tightened: `ls` and `ls *` instead of `ls*`,
same for `pwd`. Adds `grep` and `rg` to the skip list.
- Bash branch now redacts COMMAND before deriving SUMMARY, so both
on-disk fields are clean.
- Drops the brittle `if grep ... TOKEN|SECRET` heuristic — replaced
by the rule (a) regex, which is more general and tested.
Adds tests/test_post_tool_use_hook.sh — 19 assertions covering all
six redaction cases that previously failed, the read-only skip list
(including the lsof/lsblk regression), and sensitive-file-path skips
on Write. Passes 19/19 against the new hook; would fail 6+ on the
prior version.
Performance: each tool call now invokes python3 once or twice. ~50ms
overhead, acceptable for a hook that fires on Edit/Write/Bash. The
hook is still non-blocking and fast-exits on read-only commands
before reaching Python.
anthroos
pushed a commit
that referenced
this pull request
May 2, 2026
Security-relevant: existing users should scrub ~/.openexp/observations/ if they installed before today. PR #16 fixed a redaction regression that let Bearer/api_key/password values reach disk verbatim. Entry documents the bug, the fix, and gives users a concrete grep + rm recipe to clean up old observations.
anthroos
added a commit
that referenced
this pull request
May 2, 2026
Security-relevant: existing users should scrub ~/.openexp/observations/ if they installed before today. PR #16 fixed a redaction regression that let Bearer/api_key/password values reach disk verbatim. Entry documents the bug, the fix, and gives users a concrete grep + rm recipe to clean up old observations. Co-authored-by: Ivan Pasichnyk <ivanpasichnyk@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
TL;DR
The PostToolUse hook claimed to redact
Bearer X,api_key=…,password=…,token=…etc from observation summaries before writing them to~/.openexp/observations/. A code review found the redaction was broken on the most realistic inputs, including Anthropic API keys (sk-ant-api03-…).The bug
Empirically verified before this PR (run on macOS, but reproduces identically on Linux GNU sed):
api_key="abc12345xyz_secretvalue"api_key="abc12345xyz_secretvalue"Bearer sk-ant-api03-veryverysecretBearer [REDACTED]-ant-api03-veryverysecrettoken=secretvalue1234567890token=[REDACTED]4567890password="hunter2hunter2"password=[REDACTED]"ANTHROPIC_API_KEY=sk-ant-… npm testThree concrete root causes:
\047for'inside a POSIX bracket expression. Both BSD sed (macOS) and GNU sed interpret\047as the literal characters\,0,4,7there, not as an escaped single-quote. The character class never matched the quote forms.[A-Za-z0-9_\.\-\/+=]+had unnecessary backslash escapes that fouled the match — onlyskconsumed before failing on-.MY_TOKEN=abc curl …) was never targeted at all; the previous heuristic only caughtexport TOKEN=….Plus two secondary issues caught in the same review:
ls*skip-list pattern matchedlsof,lsblk,lsattrand dropped their observations.SUMMARY. The fullCOMMANDalso gets persisted intocontext.commandin the JSONL — uncovered.Fix
redact_secrets()rewritten in Python viapython3 -c. sed character classes diverge between BSD and GNU; Python regex is portable, readable, and tested. Three rules:Other changes:
ls/ls *instead ofls*.pwd/pwd *. Addsgrepandrg.COMMANDbefore derivingSUMMARY. Both on-disk fields are clean.if grep ... TOKEN|SECRETheuristic — replaced by the rule (a) regex which is more general and tested.Tests
New
tests/test_post_tool_use_hook.sh— 19 assertions covering the six previously-broken redaction cases, the read-only skip list (including thelsof/lsblkregression), and sensitive-file-path skips onWrite. Passes 19/19 against this PR; would fail 6+ on the prior version.Performance impact
Each Bash tool call now invokes
python3once (~50ms cold). Acceptable for a hook that fires on Edit/Write/Bash. The hook still fast-exits on read-only commands before reaching Python — no overhead forls -la,cat,find.Severity / impact on existing users
Anyone running OpenExp with the previous hook has potentially leaked secrets from
Bashtool calls into~/.openexp/observations/observations-YYYY-MM-DD.jsonlon their own disk. The file is in$HOMEwith default permissions (700 typically), so not network-exposed — but if a user's transcripts get backed up, synced, or shared, the secrets travel with them.After merge, recommend a follow-up issue to scan/redact existing observation files. Tracked separately.