Merged
140 commits
4092f32
docs: add Pi Sandbox v1 design spec
HashWarlock Apr 3, 2026
80ebe7a
docs: add Pi Sandbox v1 implementation plan (25 tasks, Phases 0-7)
HashWarlock Apr 4, 2026
f19cb42
feat: bootstrap pi-sandbox-extension TS package (Phase 1)
HashWarlock Apr 4, 2026
cb06cc0
feat: bootstrap protocol test scaffolding (Phase 1)
HashWarlock Apr 4, 2026
fc805f1
feat: add frozen TS contract types (Phase 2)
HashWarlock Apr 4, 2026
a4f47d2
feat: add frozen Rust contract types and basic plan parsing (Phase 2)
HashWarlock Apr 4, 2026
bc40a64
feat: add crash synthesis for supervisor crash recovery (Phase 3)
HashWarlock Apr 4, 2026
c70063f
feat: add runtime client with NDJSON subprocess I/O and cancel (Phase 3)
HashWarlock Apr 4, 2026
219d9df
feat: add validator, observer, supervisor, and wire main.rs (Phase 4)
HashWarlock Apr 4, 2026
310f18c
test: add all 6 canonical protocol tests (Phase 5)
HashWarlock Apr 4, 2026
531c0ad
fix: align Rust contract with spec (EffectiveNetwork fields, terminal…
HashWarlock Apr 4, 2026
9985ead
feat: add profiles, session manager, runtime base, reconciler, and Pi…
HashWarlock Apr 4, 2026
d34708b
docs: add Phases 8-10 design spec (Bubblewrap, build flows, network o…
HashWarlock Apr 4, 2026
dc8ed83
docs: add Phases 8-10 implementation plan (17 tasks)
HashWarlock Apr 4, 2026
aee1792
feat: add namespacesApplied and envApplied to EffectiveState
HashWarlock Apr 4, 2026
d4c2b6d
feat: add bubblewrap binary discovery module
HashWarlock Apr 4, 2026
e197272
feat: add plan_builder module for bwrap argv construction
HashWarlock Apr 6, 2026
bdfc234
feat: update validator for namespace resolution and NAMESPACE_DEGRADE…
HashWarlock Apr 6, 2026
07b73d9
feat: update supervisor for bwrap dispatch with direct-execution fall…
HashWarlock Apr 6, 2026
ee0768d
test: add bwrap integration protocol test with macOS fallback
HashWarlock Apr 6, 2026
61755cc
feat: scaffold integration test infrastructure
HashWarlock Apr 6, 2026
bdd871d
feat: add integration test helpers and fixture repos
HashWarlock Apr 6, 2026
586f85a
test: add npm install integration test
HashWarlock Apr 6, 2026
bbecf2e
test: add pip install integration test
HashWarlock Apr 6, 2026
f91d094
test: add cargo build integration test
HashWarlock Apr 6, 2026
f52c86d
test: add optional network smoke test (skipped by default)
HashWarlock Apr 6, 2026
ec9e2ec
feat: replace observer stub with NetworkObserver and /proc/net/tcp pa…
HashWarlock Apr 6, 2026
2eb68aa
feat: wire NetworkObserver into supervisor for live connection tracking
HashWarlock Apr 6, 2026
b39da8e
test: add network observation protocol test (Linux/macOS)
HashWarlock Apr 6, 2026
adb4801
docs: add Phases 11-12 design spec (legacy deprecation, browser, allo…
HashWarlock Apr 6, 2026
8609c9c
docs: add Phases 11-12 implementation plan (12 tasks)
HashWarlock Apr 6, 2026
2788b03
chore: remove legacy sandbox-rs server (Phase 11)
HashWarlock Apr 6, 2026
baf540f
feat: add BrowserManager with Playwright session-scoped pages (Phase …
HashWarlock Apr 6, 2026
f6a6536
feat: register sandbox_browser tool in extension (Phase 12a)
HashWarlock Apr 6, 2026
3cdaba8
feat: wire browser cleanup into session manager tombstone (Phase 12a)
HashWarlock Apr 6, 2026
80739e0
fix: wire BrowserManager into index.ts entry point and shutdown handler
HashWarlock Apr 6, 2026
5fb5429
fix: address code review issues in BrowserManager (race condition, Ma…
HashWarlock Apr 6, 2026
e38a511
test: add BrowserManager unit and integration tests (Phase 12a)
HashWarlock Apr 6, 2026
51674a5
feat: update TS contract for allowlist enforcement types (Phase 12b)
HashWarlock Apr 6, 2026
9f72d45
feat: add ResolvedAllowlistEntry and resolved_allowlist to Rust contr…
HashWarlock Apr 7, 2026
7c27e0b
feat: add DNS resolution and iptables detection to validator (Phase 12b)
HashWarlock Apr 7, 2026
d3fa695
feat: add iptables wrapper generation and allowlist-aware plan builde…
HashWarlock Apr 7, 2026
554c6a4
feat: wire allowlist enforcement into supervisor with leak detection …
HashWarlock Apr 7, 2026
a7fd960
test: add allowlist enforcement protocol tests (Phase 12b)
HashWarlock Apr 7, 2026
080bc73
docs: add Docker fallback for macOS design spec
HashWarlock Apr 8, 2026
cf8e36c
docs: add Docker fallback for macOS implementation plan
HashWarlock Apr 8, 2026
8ee0ce3
feat: add Docker sidecar Dockerfile for macOS bwrap fallback
HashWarlock Apr 8, 2026
aaa9180
feat: add Clone to plan types and isolation_backend to EffectiveState
HashWarlock Apr 8, 2026
c34fd9a
feat: add isolationBackend and Docker warning codes to TS contract
HashWarlock Apr 8, 2026
3c62bc9
feat: add docker.rs with path rewriting functions and tests
HashWarlock Apr 8, 2026
b21159a
feat: add DockerAvailable variant to BwrapAvailability, update all ma…
HashWarlock Apr 8, 2026
6265cf2
feat: add Docker detection and sidecar lifecycle management
HashWarlock Apr 8, 2026
4206fc5
feat: wire Docker sidecar detection into bubblewrap detect() on macOS
HashWarlock Apr 9, 2026
12567c2
feat: add Docker execution branch with path rewriting and crash recovery
HashWarlock Apr 9, 2026
4af704d
feat: add Docker sidecar integration tests (gated behind RUN_DOCKER_T…
HashWarlock Apr 9, 2026
3636592
fix: add seccomp=unconfined for bwrap, fix TS schema, isolate Docker …
HashWarlock Apr 9, 2026
9dae9d7
docs: add Nix flake runtime redesign design spec
HashWarlock Apr 9, 2026
0ec7598
docs: add Nix flake runtime implementation plan (Part A)
HashWarlock Apr 9, 2026
1330270
feat: add Nix flake with mkSandboxRootfs and built-in profiles
HashWarlock Apr 9, 2026
334d40d
feat: add curated package mapping (nix/packages.json)
HashWarlock Apr 9, 2026
e0b247d
feat: rename crate to nixosandbox, add clap CLI skeleton with subcomm…
HashWarlock Apr 9, 2026
e4bfdff
feat: add sandbox spec types with validation and tests
HashWarlock Apr 9, 2026
9fada74
feat: add session management (create/list/load/destroy) with tests
HashWarlock Apr 9, 2026
fd1e29b
feat: add Nix build invocation (build_profile, build_spec, validate_r…
HashWarlock Apr 9, 2026
0681a59
feat: add build_rootfs() for pivot-root bwrap execution
HashWarlock Apr 9, 2026
905383e
feat: wire all CLI subcommands (create, exec, enter, list, destroy, b…
HashWarlock Apr 9, 2026
35a38c1
chore: delete legacy docker-compose, update tests for renamed crate
HashWarlock Apr 9, 2026
fc7e80d
feat: wire nixosandbox binary into flake.nix as default package
HashWarlock Apr 9, 2026
fea46f3
docs: add Part B design spec (Docker sidecar, legacy cleanup, integra…
HashWarlock Apr 9, 2026
a04c468
docs: add Part B implementation plan (12 tasks)
HashWarlock Apr 9, 2026
6228201
refactor: rename and strip Dockerfile to nixosandbox-sidecar
HashWarlock Apr 9, 2026
c42756d
refactor: rename PI_SANDBOX_* env vars to NIXOSANDBOX_*
HashWarlock Apr 9, 2026
9fcce4a
refactor: update docker.rs naming, add /nix/store mount
HashWarlock Apr 9, 2026
b77ff27
feat: wire Docker exec with session path rewriting
HashWarlock Apr 9, 2026
783c3ab
feat: enhance exec --json with full event stream
HashWarlock Apr 9, 2026
cfc46d0
chore: delete legacy NDJSON-only protocol tests
HashWarlock Apr 9, 2026
18ca7a4
feat: delete legacy-ndjson subcommand and inbound types
HashWarlock Apr 9, 2026
6ac4a34
refactor: update protocol test infrastructure for exec --json
HashWarlock Apr 9, 2026
14caaa1
refactor: adapt protocol tests for exec --json
HashWarlock Apr 9, 2026
39bf746
feat: create integration test infrastructure
HashWarlock Apr 9, 2026
2c7879c
feat: add rootfs-pipeline integration tests
HashWarlock Apr 9, 2026
10078d8
feat: add Docker rootfs integration tests
HashWarlock Apr 9, 2026
a45d66b
fix: add sequence/ts to result event, clarify docker mount naming
HashWarlock Apr 9, 2026
1acbc81
docs: add Part C design spec (extension simplification + battlecard)
HashWarlock Apr 9, 2026
fdc25b1
docs: add Part C implementation plan (14 tasks)
HashWarlock Apr 9, 2026
436ecea
feat: add agent and description fields to SessionMetadata
HashWarlock Apr 9, 2026
a04e614
feat: wire --agent and --description flags into create
HashWarlock Apr 9, 2026
035a525
feat: add status subcommand with battlecard output
HashWarlock Apr 9, 2026
59b32b4
chore: delete supervisor.rs and validator.rs
HashWarlock Apr 9, 2026
82e57af
chore: delete dead outbound types from contract.rs
HashWarlock Apr 9, 2026
9ad67fd
chore: delete legacy build(), build_with_allowlist(), rewrite_plan()
HashWarlock Apr 9, 2026
c1fdebb
chore: final Rust dead code cleanup
HashWarlock Apr 9, 2026
8884cc2
chore: delete 5 extension modules replaced by CLI
HashWarlock Apr 9, 2026
f47e2fe
chore: delete inbound types from contract.ts
HashWarlock Apr 9, 2026
add134d
feat: create cli-client.ts — thin CLI wrappers
HashWarlock Apr 9, 2026
2b28caa
feat: rewrite extension.ts as thin CLI adapter
HashWarlock Apr 9, 2026
4bd0e39
test: update crash-synthesis test for simplified API
HashWarlock Apr 9, 2026
d437f6a
refactor: simplify index.ts and rename extension package
HashWarlock Apr 9, 2026
b86139e
fix: address code review findings
HashWarlock Apr 9, 2026
f1f161e
fix: update flake to nixos-25.11 with multi-platform support
HashWarlock Apr 9, 2026
279454f
docs: add llm-agents.nix integration design spec
HashWarlock Apr 9, 2026
0db3531
docs: add llm-agents.nix integration implementation plan (12 tasks)
HashWarlock Apr 9, 2026
645c869
feat: add llm-agents.nix flake input with numtide binary cache
HashWarlock Apr 9, 2026
57e0fe1
feat: create unified package catalog from llm-agents.nix + nixpkgs
HashWarlock Apr 9, 2026
5c88f50
feat: create mkAgentSandbox -- catalog-aware composition layer
HashWarlock Apr 9, 2026
3e28712
feat: wire catalog and mkAgentSandbox into flake outputs
HashWarlock Apr 9, 2026
b13e3fb
feat: add --with flag on create and catalog subcommand to CLI
HashWarlock Apr 9, 2026
7ce4407
feat: add build_with_catalog() and query_catalog() to nix.rs
HashWarlock Apr 9, 2026
504b70e
feat: wire --with catalog creation and catalog subcommand into main
HashWarlock Apr 9, 2026
2b56460
feat: add catalogPackages() and withPackages to cli-client
HashWarlock Apr 9, 2026
88786e4
feat: add sandbox_catalog tool and 'with' param to sandbox_run
HashWarlock Apr 9, 2026
95ec278
feat: re-export catalogPackages and CatalogResponse from index
HashWarlock Apr 9, 2026
590cbbb
test: add validation test for empty packages
HashWarlock Apr 9, 2026
ea947db
fix: sanitize package names and persist network mode in session metadata
HashWarlock Apr 9, 2026
a1826dd
fix: serialize session tests with mutex to prevent env var races
HashWarlock Apr 9, 2026
b18f1dd
ci: add GitHub Actions workflow for Rust, TypeScript, Nix tests
HashWarlock Apr 9, 2026
b470c59
ci: fix TS typecheck, add agent sandbox smoke tests
HashWarlock Apr 9, 2026
0cc3be1
ci: install bubblewrap on runner, make agent exec tests strict
HashWarlock Apr 9, 2026
9018924
ci: install bubblewrap from nixpkgs (apt version lacks --pivot-root)
HashWarlock Apr 9, 2026
2296bdf
ci: fix bwrap PATH issue, update actions to Node.js 24
HashWarlock Apr 9, 2026
52cac85
fix: use detected bwrap path at spawn time, handle custom profiles co…
HashWarlock Apr 9, 2026
086d037
fix: replace invalid --pivot-root with --ro-bind for bwrap root mount
HashWarlock Apr 9, 2026
7ae566e
fix: use system bwrap (setuid), add AppArmor fallback and lifecycle f…
HashWarlock Apr 9, 2026
0f54d70
fix: mount /nix/store read-only in sandbox for symlink resolution
HashWarlock Apr 10, 2026
df7df58
fix: add /nix/store mount point to rootfs for bwrap bind mount
HashWarlock Apr 10, 2026
7c95cc0
docs: rewrite README for CLI + catalog architecture
HashWarlock Apr 10, 2026
962a7c0
docs: add CLAUDE.md for Claude Code context
HashWarlock Apr 10, 2026
9ef970f
fix: null-safe args and required array for Pi extension compatibility
HashWarlock Apr 10, 2026
b00cf4f
docs: add design spec for dynamic numtide catalog passthrough
HashWarlock Apr 10, 2026
11f498d
docs: add implementation plan for dynamic numtide catalog
HashWarlock Apr 10, 2026
31f0439
feat: expose all llm-agents.nix packages dynamically in catalog
HashWarlock Apr 10, 2026
fbd881d
docs: add implementation plan for CI additions and catalog defensive …
HashWarlock Apr 10, 2026
8cebc7f
ci: add openclaw and hermes-agent smoke tests, verify catalog agent c…
HashWarlock Apr 10, 2026
58748c6
ci: raise catalog agent threshold to 80, improve failure diagnostics,…
HashWarlock Apr 10, 2026
b24859f
fix: guard query_catalog against non-derivation attrs, warn on empty …
HashWarlock Apr 10, 2026
3413794
refactor: flatten filterDrvs expression in query_catalog for readability
HashWarlock Apr 10, 2026
2d2961b
style: format query_catalog Nix expression as multi-line for readability
HashWarlock Apr 10, 2026
892254f
ci: lower catalog count threshold to 50, add jules smoke test
HashWarlock Apr 10, 2026
ae158a7
docs: update agent count to 88+, catalog table, CI smoke test list, C…
HashWarlock Apr 10, 2026
df3f22c
fix: avoid backslash in f-string expression inside single-quoted shel…
HashWarlock Apr 10, 2026
228 changes: 228 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,228 @@
name: CI

on:
  push:
    branches: [master, pi-sandbox-refactor]
  pull_request:
    branches: [master]

jobs:
  rust-tests:
    name: Rust tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6

      - uses: actions/cache@v5
        with:
          path: |
            ~/.cargo/registry
            ~/.cargo/git
            crates/nixosandbox/target
          key: cargo-${{ runner.os }}-${{ hashFiles('crates/nixosandbox/Cargo.lock') }}
          restore-keys: cargo-${{ runner.os }}-

      - name: Build
        working-directory: crates/nixosandbox
        run: cargo build

      - name: Test
        working-directory: crates/nixosandbox
        run: cargo test

  typescript-check:
    name: TypeScript typecheck
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6

      - uses: actions/setup-node@v6
        with:
          node-version: "22"

      - name: Install dependencies
        working-directory: packages/pi-sandbox-extension
        run: npm install

      - name: Typecheck
        working-directory: packages/pi-sandbox-extension
        run: npx tsc --noEmit

  nix-eval:
    name: Nix evaluation & catalog
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6

      - uses: cachix/install-nix-action@v30
        with:
          extra_nix_config: |
            experimental-features = nix-command flakes
            extra-substituters = https://cache.numtide.com
            extra-trusted-public-keys = niks3.numtide.com-1:DTx8wZduET09hRmMtKdQDxNNthLQETkc/yaX7M4qK0g=

      - name: Flake check
        run: nix flake check --accept-flake-config

      - name: Evaluate catalog agents
        run: |
          nix eval --accept-flake-config .#catalog.agents --apply 'x: builtins.attrNames x'

      - name: Evaluate catalog tools
        run: |
          nix eval --accept-flake-config .#catalog.tools --apply 'x: builtins.attrNames x'

      - name: Verify existing profiles
        run: |
          nix eval --accept-flake-config .#packages.x86_64-linux.sandbox-strict.name

  nix-build:
    name: Nix build & smoke test
    runs-on: ubuntu-latest
    needs: [rust-tests, nix-eval]
    steps:
      - uses: actions/checkout@v6

      - uses: cachix/install-nix-action@v30
        with:
          extra_nix_config: |
            experimental-features = nix-command flakes
            extra-substituters = https://cache.numtide.com
            extra-trusted-public-keys = niks3.numtide.com-1:DTx8wZduET09hRmMtKdQDxNNthLQETkc/yaX7M4qK0g=

      - name: Build nixosandbox CLI
        run: nix build --accept-flake-config .#nixosandbox

      - name: Test catalog subcommand
        run: |
          NIXOSANDBOX_FLAKE_ROOT=$PWD ./result/bin/nixosandbox catalog --json | python3 -m json.tool > /dev/null
          echo "Catalog JSON is valid"

      - name: Verify catalog agent count and new packages
        run: |
          catalog_json=$(NIXOSANDBOX_FLAKE_ROOT=$PWD ./result/bin/nixosandbox catalog --json)
          agent_count=$(echo "$catalog_json" | python3 -c "import sys, json; d=json.load(sys.stdin); print(len(d['agents']))")
          echo "Catalog agent count: $agent_count"
          # Threshold is conservative — guards against empty catalog, not exact upstream count.
          # llm-agents.nix is a third-party input that may fluctuate; 50 is well above any realistic minimum.
          [ "$agent_count" -gt 50 ] || { echo "ERROR: expected >50 agents, got $agent_count"; exit 1; }
          echo "$catalog_json" | python3 -c '
          import sys, json
          d = json.load(sys.stdin)
          agents = d["agents"]
          missing = [n for n in ["openclaw", "hermes-agent", "jules"] if n not in agents]
          if missing:
              print(f"ERROR: missing agents: {missing}", file=sys.stderr)
              sys.exit(1)
          print(f"All expected agents present in {len(agents)} total")
          '

      - name: Test catalog filter
        run: |
          output=$(NIXOSANDBOX_FLAKE_ROOT=$PWD ./result/bin/nixosandbox catalog --filter claude)
          echo "$output"
          echo "$output" | grep -q "claude-code"

      - name: Test --with rootfs creation
        run: |
          output=$(NIXOSANDBOX_FLAKE_ROOT=$PWD ./result/bin/nixosandbox create \
            --with bash,coreutils --network off --name ci-smoke --json)
          echo "$output"
          session_id=$(echo "$output" | python3 -c "import sys,json; print(json.load(sys.stdin)['sessionId'])")
          echo "Session created: $session_id"
          NIXOSANDBOX_FLAKE_ROOT=$PWD ./result/bin/nixosandbox destroy "$session_id"
          echo "Session destroyed"

      - name: Test mutual exclusivity
        run: |
          if NIXOSANDBOX_FLAKE_ROOT=$PWD ./result/bin/nixosandbox create --profile strict --with bash 2>&1; then
            echo "ERROR: should have failed" && exit 1
          else
            echo "Mutual exclusivity check passed"
          fi

  agent-smoke-tests:
    name: Agent sandbox smoke tests
    runs-on: ubuntu-latest
    needs: [nix-build]
    strategy:
      fail-fast: false
      matrix:
        include:
          - agent: claude-code
            binary: claude
            check: "--version"
          - agent: codex
            binary: codex
            check: "--help"
          - agent: opencode
            binary: opencode
            check: "--version"
          - agent: amp
            binary: amp
            check: "--help"
          - agent: droid
            binary: droid
            check: "--version"
          - agent: pi
            binary: pi
            check: "--help"
          - agent: openclaw
            binary: openclaw
            check: "--help"
          - agent: hermes-agent
            binary: hermes
            check: "--version"
          - agent: jules
            binary: jules
            check: "--help"
    steps:
      - uses: actions/checkout@v6

      - uses: cachix/install-nix-action@v30
        with:
          extra_nix_config: |
            experimental-features = nix-command flakes
            extra-substituters = https://cache.numtide.com
            extra-trusted-public-keys = niks3.numtide.com-1:DTx8wZduET09hRmMtKdQDxNNthLQETkc/yaX7M4qK0g=

      - name: Install and verify bubblewrap
        run: |
          sudo apt-get update -qq
          sudo apt-get install -y -qq bubblewrap
          echo "bwrap path: $(which bwrap)"
          bwrap --version
          # Verify bwrap can actually create sandboxes.
          # Ubuntu 24.04 may restrict user namespaces via AppArmor (containers/bubblewrap#632).
          if ! bwrap --ro-bind / / --dev /dev --proc /proc -- echo "bwrap sandbox works"; then
            echo "bwrap failed — relaxing AppArmor unprivileged userns restriction"
            sudo sysctl -w kernel.apparmor_restrict_unprivileged_userns=0
            bwrap --ro-bind / / --dev /dev --proc /proc -- echo "bwrap sandbox works after sysctl fix"
          fi

      - name: Build nixosandbox CLI
        run: nix build --accept-flake-config .#nixosandbox

      - name: Create sandbox with ${{ matrix.agent }}
        run: |
          echo "Creating sandbox with ${{ matrix.agent }} + bash..."
          output=$(NIXOSANDBOX_FLAKE_ROOT=$PWD ./result/bin/nixosandbox create \
            --with ${{ matrix.agent }},bash --network off --name ci-${{ matrix.agent }} --json)
          echo "$output"
          echo "$output" | python3 -c "import sys,json; d=json.load(sys.stdin); print(f'Session: {d[\"sessionId\"]}, Profile: {d[\"profile\"]}')"
          session_id=$(echo "$output" | python3 -c "import sys,json; print(json.load(sys.stdin)['sessionId'])")
          echo "SESSION_ID=$session_id" >> "$GITHUB_ENV"

      - name: Verify ${{ matrix.agent }} binary launches
        run: |
          echo "Running: ${{ matrix.binary }} ${{ matrix.check }}"
          NIXOSANDBOX_FLAKE_ROOT=$PWD ./result/bin/nixosandbox exec "$SESSION_ID" --json \
            -- ${{ matrix.binary }} ${{ matrix.check }}
          echo "${{ matrix.agent }} smoke test passed"

      - name: Cleanup session
        if: always()
        run: |
          if [ -n "$SESSION_ID" ]; then
            NIXOSANDBOX_FLAKE_ROOT=$PWD ./result/bin/nixosandbox destroy "$SESSION_ID" 2>/dev/null || true
          fi
11 changes: 11 additions & 0 deletions .gitignore
@@ -108,3 +108,14 @@ tests/__pycache__/
*.key
credentials.json
secrets.json

# Pi Sandbox Extension
packages/pi-sandbox-extension/node_modules/
packages/pi-sandbox-extension/dist/

# Rust / Cargo
crates/nixosandbox/target/

# Protocol tests
tests/protocol/node_modules/
.pi/
96 changes: 96 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,96 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Build and test commands

### Rust CLI
```bash
cd crates/nixosandbox
cargo build # build
cargo test # run all 20 tests
cargo test session::tests # run tests in one module
cargo test metadata_roundtrip # run a single test by name
```

### TypeScript extension
```bash
cd packages/pi-sandbox-extension
npm install
npx tsc --noEmit # typecheck only
npm run build # compile to dist/
```

### Nix
```bash
nix flake check --accept-flake-config
nix build --accept-flake-config .#nixosandbox # build CLI as Nix package
nix eval --accept-flake-config .#catalog.agents --apply 'x: builtins.attrNames x'
nix eval --accept-flake-config .#catalog.tools --apply 'x: builtins.attrNames x'
```

### CLI smoke test (Linux only, requires bwrap)
```bash
NIXOSANDBOX_FLAKE_ROOT=$PWD ./result/bin/nixosandbox catalog --json
NIXOSANDBOX_FLAKE_ROOT=$PWD ./result/bin/nixosandbox create --with bash,coreutils --network off --json
NIXOSANDBOX_FLAKE_ROOT=$PWD ./result/bin/nixosandbox exec <session-id> -- echo hello
NIXOSANDBOX_FLAKE_ROOT=$PWD ./result/bin/nixosandbox destroy <session-id>
```

## Architecture

### Core data flow

1. **User runs** `nixosandbox create --with claude-code,bash --network off`
2. **nix.rs** resolves package names via `build_with_catalog()` which generates a Nix expression calling `mkAgentSandbox`
3. **mkAgentSandbox.nix** resolves names from the catalog (agents first, then tools) and delegates to `mkSandboxRootfs`
4. **mkSandboxRootfs.nix** uses `pkgs.buildEnv` to merge packages, then creates a rootfs directory with symlinks into `/nix/store`
5. **session.rs** creates a session directory under `~/.local/share/nixosandbox/sessions/<id>/` with metadata, workspace, home, and cache dirs
6. **User runs** `nixosandbox exec <id> -- command`
7. **plan_builder.rs** constructs bwrap argv: `--ro-bind <rootfs> /`, `--ro-bind /nix/store /nix/store`, writable bind mounts for workspace/home/cache, namespace flags, env vars
8. **main.rs** spawns bwrap (detected path from `bubblewrap::detect()`) with the constructed argv
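
Step 7 can be sketched roughly as follows. This is a minimal illustration, not the actual `plan_builder.rs`: the `ExecPlan` struct, the `build_bwrap_argv` name, the in-sandbox mount targets (`/workspace`, `/home/sandbox`, `/cache`), the choice of `--unshare-pid`, and the `HOME` value are all assumptions made for the example. Only the two `--ro-bind` mounts and the general flag families come from the description above.

```rust
use std::path::Path;

/// Hypothetical inputs for one `exec` call; field names are illustrative.
pub struct ExecPlan<'a> {
    pub rootfs: &'a Path,
    pub workspace: &'a Path,
    pub home: &'a Path,
    pub cache: &'a Path,
    pub network_off: bool,
    pub command: &'a [&'a str],
}

/// Build the arguments passed to bwrap (the bwrap binary path itself is excluded).
pub fn build_bwrap_argv(plan: &ExecPlan) -> Vec<String> {
    let p = |path: &Path| path.display().to_string();
    let mut argv: Vec<String> = Vec::new();

    // Read-only root filesystem, plus the Nix store its symlinks point into.
    argv.extend(["--ro-bind".to_string(), p(plan.rootfs), "/".to_string()]);
    argv.extend(["--ro-bind", "/nix/store", "/nix/store"].map(String::from));

    // Writable per-session directories (the in-sandbox targets are assumed).
    for (src, dst) in [
        (plan.workspace, "/workspace"),
        (plan.home, "/home/sandbox"),
        (plan.cache, "/cache"),
    ] {
        argv.extend(["--bind".to_string(), p(src), dst.to_string()]);
    }

    // Namespace flags: a new PID namespace (assumed), and no network if requested.
    argv.push("--unshare-pid".to_string());
    if plan.network_off {
        argv.push("--unshare-net".to_string());
    }

    // Environment seen by the sandboxed process (value assumed).
    argv.extend(["--setenv", "HOME", "/home/sandbox"].map(String::from));

    // Everything after `--` is the command to run inside the sandbox.
    argv.push("--".to_string());
    argv.extend(plan.command.iter().map(|s| s.to_string()));
    argv
}
```

Returning a plain `Vec<String>` keeps the builder free of process spawning, so the mount and namespace layout can be unit-tested without bwrap installed; the caller would hand the vector to `std::process::Command::args`.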

### Key design decisions

- **Profile field is overloaded**: `session.profile` stores either a built-in profile name (e.g., `"strict"`) which maps to `nix/profiles/<name>.json`, or `"custom:<pkg1>,<pkg2>"` for `--with` sessions. Code must check `meta.profile.starts_with("custom:")` before calling `load_profile()`.
- **Rootfs symlinks require Nix store**: The rootfs contains absolute symlinks into `/nix/store`. bwrap must bind-mount `/nix/store` read-only, and the rootfs must have `/nix/store` as an empty mount point directory.
- **macOS support via Docker sidecar**: On non-Linux, `bubblewrap::detect()` tries a Docker sidecar container (`nixosandbox-sidecar`) with bwrap inside. Session paths are rewritten from host to container paths via `docker::rewrite_path()`.
- **Package name validation**: `nix::validate_package_name()` rejects names not matching `[a-zA-Z0-9_.-]+` to prevent Nix expression injection via `--with`.
- **Network mode storage**: For `--with` sessions, the network mode is stored in `session.network` (not derivable from a profile file). For built-in profiles, it's in the profile JSON.
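
The overloaded profile field and the package-name check could look something like the sketch below. The `ProfileKind` enum and `parse_profile` function are hypothetical names introduced for illustration, and the `bool` return type of `validate_package_name` is an assumption; only the `custom:` prefix convention and the `[a-zA-Z0-9_.-]+` character set come from the notes above.

```rust
/// How a stored `session.profile` string should be interpreted.
#[derive(Debug, PartialEq)]
pub enum ProfileKind {
    /// Built-in profile; maps to `nix/profiles/<name>.json`.
    BuiltIn(String),
    /// `--with` session; carries the package list.
    Custom(Vec<String>),
}

/// Accept only non-empty names made of `[a-zA-Z0-9_.-]`.
pub fn validate_package_name(name: &str) -> bool {
    !name.is_empty()
        && name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '_' | '.' | '-'))
}

/// Split the overloaded profile field; reject custom lists containing bad names.
pub fn parse_profile(profile: &str) -> Result<ProfileKind, String> {
    match profile.strip_prefix("custom:") {
        Some(list) => {
            let pkgs: Vec<String> = list.split(',').map(str::to_string).collect();
            if let Some(bad) = pkgs.iter().find(|p| !validate_package_name(p)) {
                return Err(format!("invalid package name: {bad:?}"));
            }
            Ok(ProfileKind::Custom(pkgs))
        }
        // No prefix: treat the whole string as a built-in profile name.
        None => Ok(ProfileKind::BuiltIn(profile.to_string())),
    }
}
```

Parsing once into an enum means callers match on `ProfileKind` instead of repeating the `starts_with("custom:")` check before every `load_profile()` call. A name such as `bash; rm -rf /` fails validation because it contains a space, `;`, and `/`.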

### Rust module responsibilities

| Module | Owns |
|--------|------|
| `cli.rs` | clap argument parsing |
| `main.rs` | command dispatch, `cmd_create`, `cmd_exec`, `cmd_catalog`, etc. |
| `session.rs` | session CRUD, metadata serialization, directory layout |
| `nix.rs` | `find_flake_root()`, `build_profile()`, `build_with_catalog()`, `query_catalog()` (filters non-derivation attrs via `filterDrvs` before reading `.meta.description`) |
| `plan_builder.rs` | bwrap argv construction (`--ro-bind`, `--bind`, `--unshare-*`, `--setenv`) |
| `bubblewrap.rs` | bwrap detection: `NIXOSANDBOX_BWRAP_PATH` env var, then `which bwrap`, Docker fallback on macOS |
| `docker.rs` | Docker sidecar lifecycle (find, start, create, image build) |
| `spec.rs` | profile/spec loading from JSON, validation |
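
The lookup order listed for `bubblewrap.rs` might be sketched like this. The `detect_bwrap` signature is hypothetical: the real `detect()` presumably reads the environment itself and also handles the Docker fallback, which is omitted here. Falling through to the `PATH` search when the override is unusable is likewise an assumption.

```rust
use std::path::{Path, PathBuf};

/// Return the first usable bwrap path: explicit override, then a PATH search.
/// `is_executable` is injected so the lookup order can be tested without
/// touching the real filesystem.
pub fn detect_bwrap(
    override_path: Option<&str>,
    path_var: &str,
    is_executable: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    // 1. The NIXOSANDBOX_BWRAP_PATH value wins when it points at something usable.
    if let Some(p) = override_path {
        let p = PathBuf::from(p);
        if is_executable(&p) {
            return Some(p);
        }
    }
    // 2. Otherwise walk the PATH entries in order, like `which bwrap`.
    std::env::split_paths(path_var)
        .map(|dir| dir.join("bwrap"))
        .find(|candidate| is_executable(candidate))
}
```

A caller would pass `std::env::var("NIXOSANDBOX_BWRAP_PATH").ok().as_deref()` and the current `PATH`, with a predicate that checks the file's execute permission.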

### Nix module responsibilities

| File | Owns |
|------|------|
| `nix/catalog.nix` | Unified `{ agents, tools }` attrset — agents is a full dynamic passthrough of `llm-agents-pkgs` (no whitelist), tools from nixpkgs |
| `nix/mkSandboxRootfs.nix` | Builds rootfs directory tree from package list (symlinks, /etc, certs) |
| `nix/mkAgentSandbox.nix` | Resolves catalog names to packages, delegates to mkSandboxRootfs |
| `nix/profiles/*.json` | Built-in profile specs (strict, build-install, offline-review, debug-network) |
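
The "agents first, then tools" resolution in `mkAgentSandbox.nix` is written in Nix; the Rust sketch below only mirrors the lookup order for illustration. The `Catalog` struct, the `resolve` function, and the use of plain strings as package values are all assumptions.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory view of the `{ agents, tools }` catalog (name -> package).
pub struct Catalog {
    pub agents: BTreeMap<String, String>,
    pub tools: BTreeMap<String, String>,
}

/// Resolve each requested name, preferring `agents` over `tools`.
/// Returns packages in request order, or the list of names found in neither set.
pub fn resolve(catalog: &Catalog, names: &[&str]) -> Result<Vec<String>, Vec<String>> {
    let mut found = Vec::new();
    let mut missing = Vec::new();
    for name in names {
        match catalog.agents.get(*name).or_else(|| catalog.tools.get(*name)) {
            Some(pkg) => found.push(pkg.clone()),
            None => missing.push(name.to_string()),
        }
    }
    if missing.is_empty() {
        Ok(found)
    } else {
        Err(missing)
    }
}
```

Because `agents` is consulted first, a name present in both sets resolves to the agent package, which matters now that `agents` is an unfiltered passthrough of the upstream package set.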

### TypeScript extension (packages/pi-sandbox-extension)

Provides Pi coding agent integration. Key files:
- `cli-client.ts` — spawns the nixosandbox CLI as a subprocess, provides `createSession()`, `execCommand()`, `catalogPackages()`
- `extension.ts` — registers Pi tools: `sandboxRun`, `sandboxReadFile`, `sandboxWriteFile`, `sandboxListFiles`, `sandboxSessionInfo`, `sandboxBrowser`, `sandboxCatalog`
- `contract.ts` — TypeScript type definitions for the NDJSON protocol

## Session tests use env var serialization

Tests in `session.rs` mutate `NIXOSANDBOX_DATA_DIR` to use temp directories. They serialize via `static ENV_LOCK: Mutex<()>` acquired in `with_temp_data_dir()`. If adding session tests, always use this helper.
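
A helper of this shape might look like the sketch below. Only the `ENV_LOCK` static, the `with_temp_data_dir` name, and the `NIXOSANDBOX_DATA_DIR` variable come from the text above; the closure-based signature, the temp directory naming, and the cleanup behaviour are assumptions.

```rust
use std::env;
use std::fs;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Mutex;

/// Serializes every test that mutates NIXOSANDBOX_DATA_DIR.
static ENV_LOCK: Mutex<()> = Mutex::new(());
/// Makes each temp directory name unique within the process.
static COUNTER: AtomicU32 = AtomicU32::new(0);

/// Run `body` with NIXOSANDBOX_DATA_DIR pointing at a fresh temp directory.
pub fn with_temp_data_dir<T>(body: impl FnOnce(&Path) -> T) -> T {
    // A panicking test poisons the mutex; the guarded data is `()`, so recover.
    let _guard = ENV_LOCK.lock().unwrap_or_else(|e| e.into_inner());

    let dir: PathBuf = env::temp_dir().join(format!(
        "nixosandbox-test-{}-{}",
        std::process::id(),
        COUNTER.fetch_add(1, Ordering::Relaxed)
    ));
    fs::create_dir_all(&dir).expect("create temp data dir");

    // set_var/remove_var are `unsafe` in the 2024 edition because they race with
    // concurrent readers; holding ENV_LOCK keeps the session tests apart.
    unsafe { env::set_var("NIXOSANDBOX_DATA_DIR", &dir) };
    let result = body(&dir);
    unsafe { env::remove_var("NIXOSANDBOX_DATA_DIR") };

    // Best-effort cleanup; skipped if `body` panicked.
    let _ = fs::remove_dir_all(&dir);
    result
}
```

The lock is needed because `cargo test` runs tests in the same process on multiple threads, and environment variables are process-wide. The guard is held for the whole closure, so tests using the helper run one at a time.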

## CI uses nixpkgs nixos-25.11 on Ubuntu runners

The flake pins `nixpkgs` to `nixos-25.11`. CI jobs run on `ubuntu-latest` and use `cachix/install-nix-action@v30` with the numtide binary cache. Agent smoke tests use the apt-installed system bwrap (setuid) because the Nix-built bwrap lacks the setuid bit and fails under Ubuntu 24.04's AppArmor restriction on unprivileged user namespaces.