
fix(security): allowlist-sanitize history-file identifiers via SafeSlug #189

Merged
rafeegnash merged 1 commit into master from fix/23-path-traversal-safeslug
Jun 4, 2026

Conversation

@rafeegnash
Collaborator

Summary

Seven of the ten provider conversation-history modules built filenames with a blocklist sanitiser: `strings.NewReplacer` over a fixed set of shell metacharacters, which let `.` and `..` through untouched. With CLI flags like `--cluster` (k8s) or `--account-id` (cloudflare) wired straight into the filename, an attacker could write the LLM history file outside `~/.clanker`.

The other three providers (sentry, linear, notion) already used the right pattern — an allowlist over `[A-Za-z0-9_-]` — but the implementation was copy-pasted three times. Consolidate into one canonical helper.

What changed

  • New `secfile.SafeSlug(s string) string` — allowlist sanitiser, 64-byte cap, `"default"` fallback for empty input.
  • Ten `conversation.go` modules call it instead of a local `sanitize*` / `safeSlug` helper. Every local helper is deleted.
  • The AST drift test (introduced in #22, "feat: gcp support") gained two more invariants for every `conversation.go`:
    1. No `strings.NewReplacer` calls (the blocklist anti-pattern).
    2. No top-level `func sanitize*` / `func safeSlug` declarations.
    Either form would silently reintroduce the original CVE.

Exploit reachability (per provider)

| Provider | Input source | Exploitable? |
| --- | --- | --- |
| k8s | `--cluster` flag | YES, direct |
| cloudflare | `--account-id` flag | YES, direct |
| iam | STS `GetCallerIdentity` | NO, server-derived 12 digits |
| flyio / railway / vercel | env vars / config file | YES via env var (`FLY_ORG=../../...`) |
| verda | config-file scope | YES via config |
| sentry / linear / notion | already allowlist | already correct |

Acceptance cases (see `internal/secfile/secfile_test.go::TestSafeSlug`)

| Input | Output |
| --- | --- |
| `..` | `default` |
| `../../etc/passwd` | `etcpasswd` |
| `my-cluster.dev` | `my-clusterdev` |
| (empty string) | `default` |
| `123456789012` | `123456789012` (AWS account IDs preserved byte-for-byte) |
| `deadbeef…cafebabe` (32-char hex) | preserved (Cloudflare account IDs) |
| `中文` / `\x00\x00` | `default` (unicode and null bytes stripped) |
| `"a" × 200` | `"a" × 64` (length-capped) |
| `C:\Users\admin` | `CUsersadmin` (Windows separators) |

Migration

Existing history files written by the old sanitiser may become orphaned. For example, `cloudflare_my.account.json` was the filename under the old code, but `secfile.SafeSlug` strips the dot, so the new code looks for `cloudflare_myaccount.json`. This is intentional: a rename migration would itself need to construct and trust the unsafe old path. Affected users silently start a fresh conversation. Documented in the commit body.

Out of scope (per fresh-eyes review)

  • `internal/k8s/plan/types.go::sanitizeFilename` — different subsystem (plan files), different threat model.
  • The cross-tenant collision where two empty IDs both fall back to `"default"` — predates this PR, low risk in practice (12-digit AWS IDs / UUIDs / hex IDs never strip to empty).
  • The `internal/convhistory` extraction (#25, "feat(k8s): add k8s ask command for natural language cluster queries"). `secfile` only owns the security primitives.

Test plan

  • `go build ./...` clean
  • `go vet ./...` clean
  • `go test -race -count=1 ./...` — all packages pass
  • `secfile.SafeSlug` table-driven test covers every acceptance case + corner case (unicode, null bytes, length, Windows separators)
  • Drift test fails the build if any `conversation.go` regresses to `strings.NewReplacer` or a local sanitiser function
  • Net diff: -120 LoC across the repo (10 local helpers deleted, 1 shared added)
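The two drift invariants can be checked with the standard `go/parser` and `go/ast` packages. The following is a rough sketch of that kind of check, not the repository's actual drift test: the function name `driftViolations` and the message strings are hypothetical, methods are treated as not top-level, and the `strings` package is matched by identifier only, so an aliased import would be missed.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// driftViolations parses one Go source file and reports uses of the two
// forbidden forms: strings.NewReplacer calls and top-level functions
// named sanitize* or safeSlug.
func driftViolations(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, 0)
	if err != nil {
		return nil, err
	}
	var out []string

	// Invariant 2: no local sanitiser declarations (methods are skipped).
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Recv != nil {
			continue
		}
		if strings.HasPrefix(fn.Name.Name, "sanitize") || fn.Name.Name == "safeSlug" {
			out = append(out, "local sanitiser: "+fn.Name.Name)
		}
	}

	// Invariant 1: no strings.NewReplacer calls anywhere in the file.
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		if pkg, ok := sel.X.(*ast.Ident); ok && pkg.Name == "strings" && sel.Sel.Name == "NewReplacer" {
			out = append(out, "strings.NewReplacer call at "+fset.Position(call.Pos()).String())
		}
		return true
	})
	return out, nil
}

func main() {
	src := `package k8s

import "strings"

func sanitizeCluster(s string) string {
	return strings.NewReplacer("/", "_").Replace(s)
}
`
	violations, err := driftViolations("conversation.go", src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, v := range violations {
		fmt.Println(v)
	}
}
```

In a real test, a non-empty result for any `conversation.go` would be reported with `t.Errorf`, which is what makes a regression fail the build.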

Closes #23

Seven provider conversation-history modules built filenames with a
blocklist sanitiser — strings.NewReplacer over a fixed list of shell
metacharacters that let `.` and `..` through untouched. With CLI
flags like `--cluster ../../etc/passwd` (k8s) or `--account-id ../..`
(cloudflare) wired straight into the filename, an attacker could
write the LLM history outside ~/.clanker.

The other three providers (sentry, linear, notion) already used the
right pattern — an allowlist over [A-Za-z0-9_-] with a "default"
fallback for empty input — but the implementation was copy-pasted
three times.

Move the canonical implementation into internal/secfile as
SafeSlug(s string) string. Bound the result to 64 bytes so a verbose
input can't blow past common filesystem limits. Replace the
sanitiser call in all ten conversation modules and delete every
local sanitize/safeSlug helper. Net diff: -150 LoC.

Extend the AST-based drift test (from #22) to fail the build if any
conversation.go:
  1. calls strings.NewReplacer (the blocklist anti-pattern), or
  2. declares a top-level func named sanitize*/safeSlug.

Either form would silently reintroduce the original CVE.

Acceptance cases pass (see internal/secfile/secfile_test.go):
  ".."                  -> "default"
  "../../etc/passwd"    -> "etcpasswd"
  "my-cluster.dev"      -> "my-clusterdev"
  ""                    -> "default"
  "123456789012"        -> "123456789012"  (AWS account IDs preserved)
  "中文" / "\x00\x00"     -> "default"        (unicode/null stripped)
  "a" * 200             -> "a" * 64        (length cap)

Migration: existing history files written by the old sanitiser may
become orphaned (e.g. cloudflare_my.account.json -> code now reads
cloudflare_myaccount.json). The orphan is intentional — a rename
migration would itself need to handle the unsafe old name. Users
silently start a fresh conversation.

Closes #23
rafeegnash merged commit 82cadfc into master Jun 4, 2026
5 checks passed
rafeegnash deleted the fix/23-path-traversal-safeslug branch June 4, 2026 15:58