Agent-preset egress fixes, Claude workflows, SLUICE_MASK + overlay dirs, and doctor checks #20
Merged
Found by the audit-egress workflow, each verified against upstream source:
- amp: @sourcegraph/amp -> @ampcode/cli (old name is now a thin alias that depends on the new package; new pkg keeps the `amp` bin)
- crush: catwalk.charm.sh -> catwalk.charm.land (Charm domain migration; the stale SNI was silently blocking Crush's model catalog)
- aider: avoid two off-allowlist runtime fetches without widening the allowlist - LITELLM_LOCAL_MODEL_COST_MAP=True (use litellm's bundled price map) and --no-check-update (skip aider's pypi.org version ping)
- cursor: add the api5 agent fleet (.api5.cursor.sh) + auth hosts, drop the unused api.cursor.com (not used by the CLI per Cursor's network doc)
- codex: correct a misleading comment - Codex's default Statsig telemetry to ab.chatgpt.com stays blocked by design, not allowlisted
- opencode: autoupdate:false pins the baked version against opencode's launch-time self-upgrade (which would drift from `sluice lock`)
…st (claude)

Follow-up to the egress-drift pass, verified against upstream:
- gemini: note that usage-stats telemetry (Clearcut, play.googleapis.com) is left blocked; privacy.usageStatisticsEnabled=false silences the per-run warning
- claude: re-add statsig.anthropic.com - commit 6cc6077 dropped it as dead, but upstream init-firewall.sh still allowlists it alongside statsig.com; both are feature-flag/metrics hosts whose flags can affect behavior. sentry.io error reporting stays blocked.
… findings

Add .claude/workflows/ - Claude Code multi-agent orchestration scripts (not GitHub Actions):
- review-launcher: review src/*.sh across security / bash-3.2 / docker-podman portability, each finding adversarially verified
- audit-egress: audit each agents/*.config.sh preset's egress against the allowlist + THREAT_MODEL.md, then verify each finding against upstream source
- triage-tests: run the bats suites, cluster failures, root-cause per cluster

Apply the audit-egress findings (all verified against upstream) to the remaining presets:
- amp: add auth.ampcode.com + production.ampworkers.com - the auth handshake and the Amp client's WebSocket were blocked at runtime (both per ampcode.com/security)
- crush: note that Crush's default PostHog telemetry to data.charm.land is left blocked
- qwen: note that the inherited Gemini-CLI Clearcut telemetry to play.googleapis.com is left blocked
- cursor: attribute the model/agent stream to the .api5 agent hosts, not bare cursor.sh
raw.githubusercontent.com serves arbitrary bytes from any user's repo/branch, so an allowlisted box can launder data out through it (THREAT_MODEL item 2). laundering_host() already flagged gist.githubusercontent.com but missed its sibling. Surgical add, not *.githubusercontent.com - a wildcard would noisily flag the base-allowlisted objects.githubusercontent.com (release assets) on every run. Regression test added (Docker-free).
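The surgical-versus-wildcard choice can be pictured as an exact-match check. This is only an illustration: `is_laundering_host_sketch` is a hypothetical stand-in, and the real `laundering_host()` may be structured differently and may flag hosts not shown here.

```shell
# Hypothetical stand-in for the launcher's laundering_host(); illustration only.
# Returns 0 when the host is one of the flagged relays, 1 otherwise.
is_laundering_host_sketch() {
  case "$1" in
    # Exact names only: both serve arbitrary user-controlled bytes.
    gist.githubusercontent.com|raw.githubusercontent.com) return 0 ;;
    # A '*.githubusercontent.com' arm here would also match
    # objects.githubusercontent.com (release assets) and flag it on every run.
    *) return 1 ;;
  esac
}

for host in raw.githubusercontent.com objects.githubusercontent.com; do
  if is_laundering_host_sketch "$host"; then
    echo "flagged: $host"
  else
    echo "ok: $host"
  fi
done
```

Listing the two hosts by name keeps the heuristic quiet for the base-allowlisted release-asset host, at the cost of needing another explicit entry if a further sibling turns out to be write-capable.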
New knob: space-separated project-root-relative globs, expanded at launch. A matching file gets an empty read-only bind, a matching dir an empty tmpfs - the box sees the path exists but cannot read it. Stays in force during 'learn --audit' (the open-egress run). Agent presets default to SLUICE_MASK=".env*" (set it empty to disable). sluice doctor lists the active mask + match count, warns with names when secret-looking files (.env*, *.pem, *key*.json, ...) are readable and unmasked, and doctor --json gains a mask object. THREAT_MODEL documents the honest limits: launch-time evaluation (later files unmasked), path existence still visible, unmatched/nested paths unprotected. Also fixes two doctor --json bugs the new tests surfaced: RUNNER was unbound under a live daemon (resolve_runner was never called), and _json_arr dropped a final line without a trailing newline, losing registry.yarnpkg.com from the base array. Tests: verify-doctor-checks.bats runs engine-free (gate); verify-security-mask.bats needs a box (written, gated on an engine).
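A minimal sketch of the launch-time expansion described above, assuming plain shell globbing from the project root. `mask_plan_sketch` and its `ro-bind` / `tmpfs` output tags are hypothetical names for illustration; the actual sluice code and mount flags are not shown in this PR text.

```shell
# Hypothetical helper, not sluice's implementation: print one line per
# SLUICE_MASK match, tagged with the kind of shadow mount it would get.
mask_plan_sketch() {
  local root=$1 globs=$2 path
  (
    cd "$root" || exit 1
    # Unquoted on purpose: word-split the knob, then glob-expand each
    # pattern relative to the project root.
    for path in $globs; do
      if [ -d "$path" ]; then
        echo "tmpfs $path"      # matching dir  -> empty tmpfs
      elif [ -f "$path" ]; then
        echo "ro-bind $path"    # matching file -> empty read-only bind
      fi                        # unmatched patterns stay literal and are skipped
    done
  )
}
```

Calling `mask_plan_sketch . ".env* secrets"` lists only paths that exist at that moment, which is the launch-time limit the commit message mentions: a file created after launch is never matched, and a top-level pattern such as `.env*` does not reach nested directories.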
A symlink whose target leaves the project dir (plus the git common dir,
for worktrees) works on the host but dangles inside the box - the real
case was .claude/CLAUDE.md -> ~/.claude/shared/... breaking silently so
the agent ran without its project instructions. doctor now lists each
such link ('will be broken inside the box'); doctor --json gains
broken_symlinks.
Scope comparison is physical-path based (macOS /var is itself a symlink;
git reports the common dir physically). The scan is bounded: depth 6,
.git/node_modules/vendor/build dirs pruned, first 200 links considered.
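The bounded scan and physical-path comparison could look roughly like the sketch below. It is an assumption-laden illustration: `escaping_symlinks_sketch` is a hypothetical name, it checks only the project dir (not the git common dir), and it resolves one level of link rather than full chains.

```shell
# Hypothetical sketch of the doctor scan; not the code from this PR.
# Prints project-relative paths of symlinks whose target lies outside the project.
escaping_symlinks_sketch() {
  local root link target dir phys
  root=$(cd "$1" && pwd -P) || return 1          # physical project root
  find "$root" -maxdepth 6 \
    \( -name .git -o -name node_modules -o -name vendor -o -name build \) -prune \
    -o -type l -print | head -n 200 |
  while IFS= read -r link; do
    target=$(readlink "$link") || continue
    case "$target" in
      /*) ;;                                      # absolute target
      *) target="$(dirname "$link")/$target" ;;   # relative to the link's dir
    esac
    # Resolve the target's parent physically; skip links already broken on the host.
    dir=$(cd "$(dirname "$target")" 2>/dev/null && pwd -P) || continue
    phys="$dir/$(basename "$target")"
    case "$phys" in
      "$root"/*) ;;                               # stays inside the mounted scope
      *) printf '%s\n' "${link#"$root"/}" ;;      # will dangle inside the box
    esac
  done
}
```

Using `pwd -P` on both sides is what makes the comparison hold on macOS, where `/var` is a symlink and a naive string comparison would report every temp-dir project as out of scope.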
…a box-local volume

New knob: space-separated project-relative dirs (e.g. node_modules) each mounted over the project bind with a per-box named volume - the Linux box keeps its own contents while the host's stay untouched, ending the macOS-host vs Linux-box install flip-flop. The volume starts empty, persists across container recreation, and the entrypoint chowns a fresh one to the sluice user (named volumes over a bind path never auto-init). Entries are validated (relative only, no ..).

Volumes are labeled sluice.box=<container> at creation so cleanup needs no config sourcing: 'sluice rm' (incl. orphan -b rm) and 'sluice prune' remove them; 'stop' keeps them like persisted state. doctor lists overlay dirs (+ --json overlay_dirs), the launch line names them, and 'ls --json' surfaces them via a new sluice.overlays image label.

Tests: doctor surfacing runs engine-free (gate); verify-security-overlaydirs.bats needs a box (written, gated on an engine).
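The entry validation ("relative only, no ..") can be sketched with a single `case` statement. `valid_overlay_dir_sketch` is a hypothetical name, and rejecting the empty string is an extra assumption of this sketch rather than something the commit message states.

```shell
# Hypothetical validator for one overlay-dir entry; illustration only.
# Returns 0 for an acceptable project-relative dir, 1 otherwise.
valid_overlay_dir_sketch() {
  case "$1" in
    '') return 1 ;;                        # empty entry (assumed invalid)
    /*) return 1 ;;                        # absolute path
    ..|../*|*/..|*/../*) return 1 ;;       # any '..' path component
  esac
  return 0
}

for entry in node_modules /etc ../sibling apps/web/node_modules; do
  if valid_overlay_dir_sketch "$entry"; then
    echo "ok: $entry"
  else
    echo "rejected: $entry"
  fi
done
```

Matching `..` only as a whole path component keeps names such as `a..b` valid while still blocking every form that could climb out of the project bind.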
…olded config

When 'sluice agent <name>' scaffolds the project config from a preset, sniff the project's manifests and extend the written SLUICE_ALLOW_DOMAINS with the stack's package-registry hosts (commented '# from stack detection: <stack>'), so the agent's first pip/bundle/yarn install doesn't trip the firewall into a learn cycle. Always the full registry set - an agent installs deps at runtime, so init's prefetch shortcut doesn't apply. Preset files themselves stay tool-only, and the single assignment line keeps 'sluice learn' rewrites working.

The which-agent-is-this-repo-set-up-for note now matches on the preset's first-line banner instead of a verbatim cmp - the old check went blind the moment the config differed from the preset (now by design; before, on any learn edit).
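As a rough picture of manifest sniffing, the sketch below maps a few well-known manifest files to their public registry hosts. Everything in it is an assumption for illustration: the function name, the manifests checked, and the host lists are not taken from sluice's actual detection table.

```shell
# Hypothetical sketch: map manifests found in a project dir to registry hosts.
# Manifest names and host lists are illustrative, not sluice's real table.
detect_registry_hosts_sketch() {
  local dir=$1 hosts=""
  if [ -f "$dir/package.json" ]; then
    hosts="$hosts registry.npmjs.org"
  fi
  if [ -f "$dir/requirements.txt" ] || [ -f "$dir/pyproject.toml" ]; then
    hosts="$hosts pypi.org files.pythonhosted.org"
  fi
  if [ -f "$dir/Gemfile" ]; then
    hosts="$hosts rubygems.org"
  fi
  # Trim the leading space; prints an empty line when nothing was detected.
  printf '%s\n' "${hosts# }"
}
```

The space-separated result could then be appended to the single SLUICE_ALLOW_DOMAINS assignment in the scaffolded config, which is the property the commit message relies on to keep 'sluice learn' rewrites working.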
'brew reinstall --HEAD' is not a valid brew invocation; moving an existing stable install to the dev stream needs 'brew uninstall sluice && brew install --HEAD Pyronewbic/tap/sluice'.
…o stderr

The first Linux CI run of the new suites surfaced three harness issues (the security behavior itself passed): two exact-output asserts broke on the pre-existing stdout build/start lines when a box (re)starts inside a @test (now --partial), and the overlay teardown's chown-back ran after the 'sluice rm' test had deleted the image it needs (chown back first).

The new masking / overlay-dirs launch lines now print to stderr like the egress receipt, so 'sluice run' stdout stays clean for piping.

Also wire the new no-Docker suites (doctor scans, agent scaffold) into the CI unit job - they were only running locally via 'make test'. All 19 mask/overlay tests verified green against a local Docker.
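The stdout/stderr split can be illustrated with a tiny sketch. The helper names and the `sluice:` message prefix are hypothetical; only the principle (status lines on stderr, payload on stdout) comes from the commit message.

```shell
# Hypothetical helpers; the real launch-line wording is not shown in this PR.
launch_note_sketch() {
  printf 'sluice: %s\n' "$*" >&2       # status line -> stderr
}

run_sketch() {
  launch_note_sketch "masking .env*"
  echo '{"result":"ok"}'               # command payload -> stdout
}
```

With this split, a caller piping `run_sketch` into another tool receives only the payload, while the status line still reaches the terminal.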
Summary
Fixes egress drift across the nine agent presets (surfaced and verified via a new `audit-egress` workflow), adds three Claude Code multi-agent workflows, and closes a laundering-host heuristic gap in the launcher. Every host, package, and flag change was verified against upstream source/docs before editing.

Agent presets
- amp: `@sourcegraph/amp` to `@ampcode/cli`; added `auth.ampcode.com` and `production.ampworkers.com` (the auth handshake and the Amp client WebSocket were blocked at runtime; both per ampcode.com/security).
- crush: `catwalk.charm.sh` is now `catwalk.charm.land` (Charm domain migration; the stale SNI silently blocked the model catalog); noted that the default PostHog telemetry to `data.charm.land` stays blocked.
- aider: `LITELLM_LOCAL_MODEL_COST_MAP=True` (bundled price map) and `--no-check-update` (skips the pypi.org version ping).
- cursor: added the `.api5` agent fleet and auth hosts (scoped leading-dot wildcards), dropped the unused `api.cursor.com`.
- claude: re-added `statsig.anthropic.com` (a prior commit dropped it as dead, but it is still in upstream's init-firewall alongside `statsig.com`).

Workflows (`.claude/workflows/`)
Claude Code multi-agent orchestration scripts (not GitHub Actions):
- `review-launcher` - reviews `src/*.sh` across security / bash-3.2 / docker-podman portability, adversarially verifying each finding.
- `audit-egress` - audits each preset's egress against the allowlist + THREAT_MODEL, then verifies each finding against upstream (this is what found the drift above).
- `triage-tests` - runs the bats suites, clusters failures, root-causes each cluster.

Launcher
- `laundering_host()` now flags `raw.githubusercontent.com` - it serves arbitrary bytes from any repo, a write-capable exfil channel (THREAT_MODEL item 2). Surgical add rather than a `*.githubusercontent.com` wildcard, which would noisily flag the base-allowlisted `objects.githubusercontent.com`.

Test plan
- `make build-check` green (bin/sluice in sync with src/)
- `shellcheck -S warning bin/sluice` clean
- `test/verify-security-laundering.bats` 5/5 (including a new `raw.githubusercontent.com` regression case)

Dogfooding batch (added on top)
Five additions from a real session driving Claude Code in a sluice box against a NestJS monorepo on a macOS host. All additive: new knobs, new doctor checks, docs - no verb/flag/knob changes.
- `SLUICE_MASK` - in-repo secret masking. Project-root-relative globs shadowed at launch (empty read-only bind for files, tmpfs for dirs); the box sees the path exists but cannot read it. Agent presets default to `.env*` (set `SLUICE_MASK=""` to disable); stays in force during `learn --audit`; `doctor` lists the active mask and warns by name on unmasked secret-looking files. THREAT_MODEL documents the honest limits (launch-time evaluation, name still visible, nested paths need their own pattern).
- `sluice doctor` dangling-symlink check - warns on symlinks resolving outside the mounted scope (project dir + git common dir), the `.claude/CLAUDE.md -> ~/.claude/shared/...` silent-breakage case. Bounded scan (depth 6, vendor dirs pruned, first 200 links), physical-path comparison for macOS.
- `SLUICE_OVERLAY_DIRS` - per-box named volumes over project-relative dirs (the devcontainer node_modules trick), ending the macOS-host vs Linux-box install flip-flop. Validated (relative, no `..`), labeled for cleanup (`rm`/`prune` remove, `stop` keeps), entrypoint chowns fresh volumes to uid 1000, surfaced in doctor + `ls --json`.
- `sluice agent <name>` appends the detected stack's package-registry hosts to the scaffolded allowlist (`# from stack detection: ...`), so first installs don't trip the firewall into a learn cycle. Presets stay tool-only.
- `brew uninstall && brew install --HEAD` one-liner (`brew reinstall --HEAD` is not a valid invocation).

Also fixes two pre-existing `doctor --json` bugs the new tests surfaced (unbound `RUNNER` under a live daemon; `_json_arr` dropping the final base host), and routes the new launch status lines to stderr so `sluice run` stdout stays pipeable.

Test plan (this batch)
- `verify-doctor-checks.bats` (15, engine-free), `verify-agent-scaffold.bats` (7, stub engine), `verify-security-mask.bats` (9) + `verify-security-overlaydirs.bats` (10) - all green in CI's escape-hatches job after the harness fixes, and verified against a local Docker.
- `make build-check` + `shellcheck` clean; new code paths exercised under macOS bash 3.2.

🤖 Generated with Claude Code