
Agent-preset egress fixes, Claude workflows, SLUICE_MASK + overlay dirs, and doctor checks #20

Merged
Pyronewbic merged 10 commits into main from feat/ls-switcher-changes on Jun 10, 2026

Pyronewbic (Owner) commented Jun 5, 2026

Summary

Fixes egress drift across the nine agent presets (surfaced and verified via a new audit-egress workflow), adds three Claude Code multi-agent workflows, and closes a laundering-host heuristic gap in the launcher. Every host, package, and flag change was verified against upstream source/docs before editing.

Agent presets

  • amp — package renamed from @sourcegraph/amp to @ampcode/cli; added auth.ampcode.com and production.ampworkers.com (the auth handshake and the Amp client WebSocket were blocked at runtime; both per ampcode.com/security).
  • crush - catwalk.charm.sh is now catwalk.charm.land (Charm domain migration; the stale SNI silently blocked the model catalog); noted the default PostHog telemetry to data.charm.land stays blocked.
  • aider — avoids two off-allowlist runtime fetches without widening the allowlist: LITELLM_LOCAL_MODEL_COST_MAP=True (bundled price map) and --no-check-update (skips the pypi.org version ping).
  • cursor — added the .api5 agent fleet and auth hosts (scoped leading-dot wildcards), dropped the unused api.cursor.com.
  • codex / gemini / qwen — documented default telemetry hosts that stay (correctly) blocked, rather than allowlisting them.
  • claude — restored statsig.anthropic.com (a prior commit dropped it as dead, but it is still in upstream's init-firewall alongside statsig.com).
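
A minimal sketch of what a scoped leading-dot wildcard such as .api5.cursor.sh could mean in practice. The function name host_matches, the sample host agent.api5.cursor.sh, and the rule that the bare domain itself does not match are assumptions for illustration; this is not sluice's actual matcher.

```shell
# host_matches HOST ENTRY...: hypothetical allowlist check.
# Assumed rule: an entry with a leading dot matches any subdomain of that
# suffix (but not the bare domain); any other entry must equal HOST exactly.
host_matches() {
  local host=$1 entry
  shift
  for entry in "$@"; do
    case $entry in
      .*) case $host in *"$entry") return 0 ;; esac ;;
      *)  if [ "$host" = "$entry" ]; then return 0; fi ;;
    esac
  done
  return 1
}

if host_matches agent.api5.cursor.sh .api5.cursor.sh; then echo "allowed"; fi
if ! host_matches api.cursor.com .api5.cursor.sh; then echo "blocked"; fi
```

Because the suffix keeps its leading dot, a lookalike such as evilapi5.cursor.sh does not match the entry, which is the point of scoping the wildcard.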

Workflows (.claude/workflows/)

Claude Code multi-agent orchestration scripts (not GitHub Actions):

  • review-launcher — reviews src/*.sh across security / bash-3.2 / docker-podman portability, adversarially verifying each finding.
  • audit-egress — audits each preset's egress against the allowlist + THREAT_MODEL, then verifies each finding against upstream (this is what found the drift above).
  • triage-tests — runs the bats suites, clusters failures, root-causes each cluster.

Launcher

laundering_host() now flags raw.githubusercontent.com — it serves arbitrary bytes from any repo, a write-capable exfil channel (THREAT_MODEL item 2). Surgical add rather than a *.githubusercontent.com wildcard, which would noisily flag the base-allowlisted objects.githubusercontent.com.
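
The surgical-add reasoning can be sketched as an exact-name check. This is illustrative only and not the real laundering_host(): the name flags_laundering_host is hypothetical, and the list holds just the two githubusercontent.com hosts named in this PR.

```shell
# flags_laundering_host HOST: succeed when HOST is a known laundering channel.
# Exact names are listed one by one; a *.githubusercontent.com wildcard would
# also catch objects.githubusercontent.com, which the base allowlist needs.
flags_laundering_host() {
  case $1 in
    gist.githubusercontent.com|raw.githubusercontent.com) return 0 ;;
    *) return 1 ;;
  esac
}

for h in raw.githubusercontent.com objects.githubusercontent.com; do
  if flags_laundering_host "$h"; then
    echo "warn: $h can serve arbitrary user-controlled bytes"
  else
    echo "ok:   $h"
  fi
done
```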

Test plan

  • make build-check green (bin/sluice in sync with src/)
  • shellcheck -S warning bin/sluice clean
  • test/verify-security-laundering.bats 5/5 (including a new raw.githubusercontent.com regression case)

Dogfooding batch (added on top)

Five additions from a real session driving Claude Code in a sluice box against a NestJS monorepo on a macOS host. All additive: new knobs, new doctor checks, docs - no changes to existing verbs, flags, or knobs.

  • SLUICE_MASK - in-repo secret masking. Project-root-relative globs shadowed at launch (empty read-only bind for files, tmpfs for dirs); the box sees the path exists but cannot read it. Agent presets default to .env* (set SLUICE_MASK="" to disable); stays in force during learn --audit; doctor lists the active mask and warns by name on unmasked secret-looking files. THREAT_MODEL documents the honest limits (launch-time evaluation, name still visible, nested paths need their own pattern).
  • sluice doctor dangling-symlink check - warns on symlinks resolving outside the mounted scope (project dir + git common dir), the .claude/CLAUDE.md -> ~/.claude/shared/... silent-breakage case. Bounded scan (depth 6, vendor dirs pruned, first 200 links), physical-path comparison for macOS.
  • SLUICE_OVERLAY_DIRS - per-box named volumes over project-relative dirs (the devcontainer node_modules trick), ending the macOS-host vs Linux-box install flip-flop. Validated (relative, no ..), labeled for cleanup (rm/prune remove, stop keeps), entrypoint chowns fresh volumes to uid 1000, surfaced in doctor + ls --json.
  • Agent scaffold stack union - sluice agent <name> appends the detected stack's package-registry hosts to the scaffolded allowlist (# from stack detection: ...), so first installs don't trip the firewall into a learn cycle. Presets stay tool-only.
  • README Updating fix - documents the brew uninstall && brew install --HEAD one-liner, since brew reinstall --HEAD is not a valid invocation.
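
The SLUICE_MASK evaluation described in the first bullet can be sketched as a launch-time glob expansion. The helper mask_plan and its output format are hypothetical; it only prints which shadowing strategy each match would get and does not mount anything.

```shell
# mask_plan ROOT PATTERNS: hypothetical helper, not sluice's code.
# PATTERNS is a space-separated list of globs relative to ROOT.
mask_plan() {
  local root=$1 patterns=$2
  (
    cd "$root" || exit 1
    # Unquoted on purpose: word-split the list and let the shell expand globs.
    for match in $patterns; do
      if [ -d "$match" ]; then
        echo "tmpfs $match"      # directory: shadow with an empty tmpfs
      elif [ -e "$match" ]; then
        echo "ro-bind $match"    # file: shadow with an empty read-only bind
      fi                         # a glob with no match stays literal and is skipped
    done
  )
}

demo=$(mktemp -d)
touch "$demo/.env" "$demo/.env.local"
mkdir "$demo/secrets"
mask_plan "$demo" ".env* secrets"
rm -rf "$demo"
```

Since the globs are expanded once, a file created after launch is not in the plan, which matches the launch-time limit the THREAT_MODEL notes.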

Also fixes two pre-existing doctor --json bugs the new tests surfaced (unbound RUNNER under a live daemon; _json_arr dropping the final base host), and routes the new launch status lines to stderr so sluice run stdout stays pipeable.

Test plan (this batch)

  • New gate suites: verify-doctor-checks.bats (15, engine-free), verify-agent-scaffold.bats (7, stub engine), verify-security-mask.bats (9) + verify-security-overlaydirs.bats (10) - all green in CI's escape-hatches job after the harness fixes, and verified against a local Docker.
  • The two no-Docker suites are now wired into the CI unit job.
  • make build-check + shellcheck clean; new code paths exercised under macOS bash 3.2.

🤖 Generated with Claude Code

Pyronewbic added 10 commits June 5, 2026 16:24

Found by the audit-egress workflow, each verified against upstream source:
- amp: @sourcegraph/amp -> @ampcode/cli (old name is now a thin alias that
  depends on the new package; new pkg keeps the `amp` bin)
- crush: catwalk.charm.sh -> catwalk.charm.land (Charm domain migration; the
  stale SNI was silently blocking Crush's model catalog)
- aider: avoid two off-allowlist runtime fetches without widening the
  allowlist - LITELLM_LOCAL_MODEL_COST_MAP=True (use litellm's bundled price
  map) and --no-check-update (skip aider's pypi.org version ping)
- cursor: add the api5 agent fleet (.api5.cursor.sh) + auth hosts, drop the
  unused api.cursor.com (not used by the CLI per Cursor's network doc)
- codex: correct a misleading comment - Codex's default Statsig telemetry to
  ab.chatgpt.com stays blocked by design, not allowlisted
- opencode: autoupdate:false pins the baked version against opencode's
  launch-time self-upgrade (which would drift from `sluice lock`)

…st (claude)

Follow-up to the egress-drift pass, verified against upstream:
- gemini: note that usage-stats telemetry (Clearcut, play.googleapis.com) is
  left blocked; privacy.usageStatisticsEnabled=false silences the per-run warning
- claude: re-add statsig.anthropic.com - commit 6cc6077 dropped it as dead, but
  upstream init-firewall.sh still allowlists it alongside statsig.com; both are
  feature-flag/metrics hosts whose flags can affect behavior. sentry.io error
  reporting stays blocked.

… findings

Add .claude/workflows/ - Claude Code multi-agent orchestration scripts (not
GitHub Actions):
- review-launcher: review src/*.sh across security / bash-3.2 / docker-podman
  portability, each finding adversarially verified
- audit-egress: audit each agents/*.config.sh preset's egress against the
  allowlist + THREAT_MODEL.md, then verify each finding against upstream source
- triage-tests: run the bats suites, cluster failures, root-cause per cluster

Apply the audit-egress findings (all verified against upstream) to the
remaining presets:
- amp: add auth.ampcode.com + production.ampworkers.com - the auth handshake
  and the Amp client's WebSocket were blocked at runtime (both per
  ampcode.com/security)
- crush: note that Crush's default PostHog telemetry to data.charm.land is
  left blocked
- qwen: note that the inherited Gemini-CLI Clearcut telemetry to
  play.googleapis.com is left blocked
- cursor: attribute the model/agent stream to the .api5 agent hosts, not bare
  cursor.sh

raw.githubusercontent.com serves arbitrary bytes from any user's repo/branch,
so an allowlisted box can launder data out through it (THREAT_MODEL item 2).
laundering_host() already flagged gist.githubusercontent.com but missed its
sibling. Surgical add, not *.githubusercontent.com - a wildcard would noisily
flag the base-allowlisted objects.githubusercontent.com (release assets) on
every run. Regression test added (Docker-free).

New knob: space-separated project-root-relative globs, expanded at launch.
A matching file gets an empty read-only bind, a matching dir an empty
tmpfs - the box sees the path exists but cannot read it. Stays in force
during 'learn --audit' (the open-egress run). Agent presets default to
SLUICE_MASK=".env*" (set it empty to disable).

sluice doctor lists the active mask + match count, warns with names when
secret-looking files (.env*, *.pem, *key*.json, ...) are readable and
unmasked, and doctor --json gains a mask object. THREAT_MODEL documents
the honest limits: launch-time evaluation (later files unmasked), path
existence still visible, unmatched/nested paths unprotected.

Also fixes two doctor --json bugs the new tests surfaced: RUNNER was
unbound under a live daemon (resolve_runner was never called), and
_json_arr dropped a final line without a trailing newline, losing
registry.yarnpkg.com from the base array.
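
The dropped-final-line failure mode is a common shell pitfall and can be reproduced in a few lines. This is a sketch, not sluice's _json_arr: the name json_arr and the host first.example are placeholders, and the values are assumed to need no JSON escaping.

```shell
# json_arr: read lines on stdin, print them as a JSON array of strings.
# `read` returns non-zero on a final line that lacks a trailing newline, so a
# plain `while read` loop never processes it. The `|| [ -n "$line" ]` guard
# keeps that last line.
json_arr() {
  local line sep="" out="["
  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] || continue   # skip blank lines
    out="$out$sep\"$line\""
    sep=","
  done
  printf '%s]\n' "$out"
}

# Note the missing trailing newline after the last host.
printf 'first.example\nregistry.yarnpkg.com' | json_arr
# prints ["first.example","registry.yarnpkg.com"]
```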

Tests: verify-doctor-checks.bats runs engine-free (gate);
verify-security-mask.bats needs a box (written, gated on an engine).

A symlink whose target leaves the project dir (plus the git common dir,
for worktrees) works on the host but dangles inside the box - the real
case was .claude/CLAUDE.md -> ~/.claude/shared/... breaking silently so
the agent ran without its project instructions. doctor now lists each
such link ('will be broken inside the box'); doctor --json gains
broken_symlinks.

Scope comparison is physical-path based (macOS /var is itself a symlink;
git reports the common dir physically). The scan is bounded: depth 6,
.git/node_modules/vendor/build dirs pruned, first 200 links considered.
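
A minimal sketch of the physical-path comparison, assuming a single scope root and a one-hop readlink. The helper names are hypothetical, and the bounded scan (depth, pruning, link cap) is left out.

```shell
# physical_dir DIR: print DIR with every symlink component resolved.
physical_dir() ( cd "$1" 2>/dev/null && pwd -P )

# escapes_scope LINK ROOT: succeed when LINK's target lies outside ROOT.
# Only the first hop of the link is followed in this sketch.
escapes_scope() {
  local link=$1 root target dir
  root=$(physical_dir "$2") || return 2
  target=$(readlink "$link") || return 2
  case $target in
    /*) ;;                                    # absolute target
    *)  target=$(dirname "$link")/$target ;;  # relative to the link's directory
  esac
  # A target directory that cannot be entered is treated as out of scope.
  dir=$(physical_dir "$(dirname "$target")") || return 0
  case $dir/ in
    "$root"/*) return 1 ;;   # inside the mounted scope
    *)         return 0 ;;
  esac
}

demo=$(mktemp -d)
mkdir "$demo/proj" "$demo/shared"
touch "$demo/shared/CLAUDE.md" "$demo/proj/notes.md"
ln -s ../shared/CLAUDE.md "$demo/proj/CLAUDE.md"
ln -s notes.md "$demo/proj/alias.md"
for l in "$demo/proj/CLAUDE.md" "$demo/proj/alias.md"; do
  if escapes_scope "$l" "$demo/proj"; then
    echo "will be broken inside the box: ${l##*/}"
  fi
done
rm -rf "$demo"
```

Resolving both sides with pwd -P is what keeps the comparison stable on macOS, where a temp directory under /var is reached through a symlink.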

…a box-local volume

New knob: space-separated project-relative dirs (e.g. node_modules) each
mounted over the project bind with a per-box named volume - the Linux box
keeps its own contents while the host's stay untouched, ending the
macOS-host vs Linux-box install flip-flop. The volume starts empty,
persists across container recreation, and the entrypoint chowns a fresh
one to the sluice user (named volumes over a bind path never auto-init).

Entries are validated (relative only, no ..). Volumes are labeled
sluice.box=<container> at creation so cleanup needs no config sourcing:
'sluice rm' (incl. orphan -b rm) and 'sluice prune' remove them; 'stop'
keeps them like persisted state. doctor lists overlay dirs (+ --json
overlay_dirs), the launch line names them, and 'ls --json' surfaces them
via a new sluice.overlays image label.
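
The two validation rules (relative only, no ..) can be sketched with a single case statement. The function valid_overlay_dir is hypothetical and sluice's real checks may be stricter.

```shell
# valid_overlay_dir ENTRY: succeed when ENTRY is a relative path with no
# ".." path component.
valid_overlay_dir() {
  case $1 in
    ""|/*)               return 1 ;;   # empty or absolute
    ..|../*|*/..|*/../*) return 1 ;;   # any ".." component
    *)                   return 0 ;;
  esac
}

for d in node_modules apps/web/node_modules /etc ../outside a/../b; do
  if valid_overlay_dir "$d"; then echo "ok      $d"; else echo "reject  $d"; fi
done
```

Matching on whole path components means a name such as a..b is still accepted, while a/../b is rejected.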

Tests: doctor surfacing runs engine-free (gate);
verify-security-overlaydirs.bats needs a box (written, gated on an engine).

…olded config

When 'sluice agent <name>' scaffolds the project config from a preset,
sniff the project's manifests and extend the written SLUICE_ALLOW_DOMAINS
with the stack's package-registry hosts (commented '# from stack
detection: <stack>'), so the agent's first pip/bundle/yarn install
doesn't trip the firewall into a learn cycle. Always the full registry
set - an agent installs deps at runtime, so init's prefetch shortcut
doesn't apply. Preset files themselves stay tool-only, and the single
assignment line keeps 'sluice learn' rewrites working.
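
A rough sketch of manifest sniffing, under the assumption that each stack maps to a fixed host list. The function detect_stack_hosts and the manifest-to-host mapping are illustrative; the hosts are the public default registries for npm/yarn, pip, and RubyGems, not necessarily the exact set sluice writes.

```shell
# detect_stack_hosts DIR: print space-separated registry hosts for the
# manifests found in DIR (empty output when none are found).
detect_stack_hosts() {
  local dir=$1 hosts=""
  if [ -f "$dir/package.json" ]; then
    hosts="$hosts registry.npmjs.org registry.yarnpkg.com"
  fi
  if [ -f "$dir/requirements.txt" ] || [ -f "$dir/pyproject.toml" ]; then
    hosts="$hosts pypi.org files.pythonhosted.org"
  fi
  if [ -f "$dir/Gemfile" ]; then
    hosts="$hosts rubygems.org"
  fi
  echo "${hosts# }"   # drop the leading separator
}

demo=$(mktemp -d)
touch "$demo/package.json"
echo "extra hosts: $(detect_stack_hosts "$demo")"
rm -rf "$demo"
```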

The which-agent-is-this-repo-set-up-for note now matches on the preset's
first-line banner instead of a verbatim cmp. The old check went blind
the moment the config differed from the preset, which now happens by
design at scaffold time and previously happened after any learn edit.

'brew reinstall --HEAD' is not a valid brew invocation; moving an
existing stable install to the dev stream needs
'brew uninstall sluice && brew install --HEAD Pyronewbic/tap/sluice'.

…o stderr

The first Linux CI run of the new suites surfaced three harness issues
(the security behavior itself passed): two exact-output asserts broke on
the pre-existing stdout build/start lines when a box (re)starts inside a
@test (now --partial), and the overlay teardown's chown-back ran after
the 'sluice rm' test had deleted the image it needs (chown back first).

The new masking / overlay-dirs launch lines now print to stderr like the
egress receipt, so 'sluice run' stdout stays clean for piping. Also wire
the new no-Docker suites (doctor scans, agent scaffold) into the CI
unit job - they were only running locally via 'make test'.
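
Why the stream matters for piping, as a small sketch. The helper status, the stand-in fake_run, and the message text are all hypothetical.

```shell
# status MESSAGE: write launch chatter to stderr, never stdout.
status() { printf 'sluice: %s\n' "$*" >&2; }

# Stand-in for a `sluice run` invocation: one status line, one payload line.
fake_run() {
  status "masking 2 paths"
  echo '{"ok":true}'
}

# Only the payload enters the pipe; the status line still reaches the terminal.
fake_run | tr -d '{}'
```

With the status line on stdout instead, the consumer on the right side of the pipe would receive it mixed into the payload.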

All 19 mask/overlay tests verified green against a local Docker.

Pyronewbic changed the title from "Agent-preset egress drift fixes, Claude workflows, and a laundering-host gap" to "Agent-preset egress fixes, Claude workflows, SLUICE_MASK + overlay dirs, and doctor checks" on Jun 10, 2026
Pyronewbic merged commit 8b17832 into main on Jun 10, 2026
8 checks passed