Sandboxed workspace terminals for safe AI agent autonomy #144

## Vision

Workset becomes the **universal agent jail**. Any AI coding agent — Claude, Codex, Cursor, or future tools — gets full autonomy inside a sandboxed workspace. Workset owns the security boundary, not the agent. The agent doesn't even know it's jailed.

This isn't opt-in. It's how workset works. Workset is opinionated: agents develop inside the sandbox, users build and run on the host.

## Problem

Today, every AI agent implements its own permission system — approve `mkdir`, approve `grep`, approve `cat`, hundreds of times per session. This creates two bad outcomes:
1. **Friction**: developers disable safety to move fast (`--dangerously-skip-permissions`)
2. **Risk**: when safety is off, the agent has full host access (SSH keys, credentials, other projects, system config)

No agent's permission model is trustworthy enough to replace OS-level isolation. The agent shouldn't be responsible for policing itself.

## Design: jail + gateway + human-in-the-loop

### 1. The jail (agent side)

The agent operates inside a sandboxed environment. Full autonomy — no permission prompts:
- Edit any file in the workspace
- Run any script, install dev tools, delete and recreate things
- Complete freedom within the workspace boundary

What the agent sees:
```
/workspace/          ← read-write (the repos)
/nix/store/          ← read-only (tools, runtimes)
/tmp/                ← isolated
```

What doesn't exist: `~/`, `~/.ssh`, `~/.aws`, `/etc`, other workspaces, host config.

**Implementation**: Nix for reproducible toolchains, plus bubblewrap (Linux) or an equivalent macOS sandbox for filesystem and process isolation. Workset manages the Nix environment through its UI; the agent doesn't modify the environment declaration.
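On Linux, the filesystem view above could be expressed as a bubblewrap invocation. The following is a minimal sketch, not the workset implementation: `bwrap_argv` is a hypothetical helper, and the choice of `--unshare-all` (which also cuts off the network) is an assumption that a real design would need to revisit, for example with `--share-net`.

```python
def bwrap_argv(workspace: str, command: list[str]) -> list[str]:
    """Build a bubblewrap command line exposing only /workspace, /nix/store and /tmp.

    Hypothetical helper: the flag selection is an assumption, not workset's actual setup.
    """
    return [
        "bwrap",
        "--unshare-all",                          # new namespaces, including PID and network
        "--die-with-parent",                      # tear the jail down if the parent exits
        "--clearenv",                             # do not leak host environment variables
        "--ro-bind", "/nix/store", "/nix/store",  # tools and runtimes, read-only
        "--bind", workspace, "/workspace",        # the repos, read-write
        "--tmpfs", "/tmp",                        # private, initially empty /tmp
        "--proc", "/proc",
        "--dev", "/dev",
        "--chdir", "/workspace",
        *command,                                 # PATH is cleared, so pass an absolute path
    ]
```

Because nothing else is bound into the mount namespace, paths such as `~/.ssh` or `/etc` are simply absent inside the jail rather than being blocked by a rule.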

### 2. The gateway (host side)

When the agent needs to build, test, or run something on the host — because the target IS the host (e.g., a macOS app, Swift/Xcode) — it goes through a workset-controlled gateway.

An agent-agnostic command interface (task-file style) that any agent can write to:
- Agent requests: "run `xcodebuild -scheme MyApp`" with a reason
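- Workset evaluates the request against a per-workspace policy
- Allowed commands execute on the host, and output streams back to the agent
- Denied commands are rejected outright
- Commands marked `ask`, and any command the policy doesn't recognize, go to human approval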
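One way the task-file interface could look from the agent side is a JSON file dropped into a directory that the gateway watches. This is a sketch under assumptions: the field names, the `requests_dir` location, and the `write_request` helper are all hypothetical and not part of any defined workset protocol.

```python
import json
import time
import uuid
from pathlib import Path


def write_request(requests_dir: Path, agent: str, command: str, reason: str) -> Path:
    """Write one host-command request as a JSON file and return its path.

    Hypothetical format: field names are illustrative, not a defined protocol.
    """
    request = {
        "id": uuid.uuid4().hex,
        "agent": agent,
        "command": command,
        "reason": reason,
        "created_at": time.time(),
    }
    requests_dir.mkdir(parents=True, exist_ok=True)
    final_path = requests_dir / f"{request['id']}.json"
    tmp_path = final_path.with_suffix(".tmp")
    tmp_path.write_text(json.dumps(request, indent=2), encoding="utf-8")
    tmp_path.replace(final_path)  # rename last so a reader never sees a partial file
    return final_path
```

Since any agent can write a file, this keeps the interface agent-agnostic: no SDK or plugin is required on the agent side.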

```yaml
# Per-workspace host command policy
host_commands:
  always_allow:
    - swift build
    - swift test
    - npm test
    - cargo test
  ask:
    - xcodebuild *
    - docker build *
  deny:
    - rm -rf ~/*
```
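A minimal sketch of how the gateway might evaluate a request against this policy is shown below. The precedence (deny first, then allow, then ask, with unknown commands falling back to ask) and the use of shell-style glob matching are assumptions; the `decide` function is hypothetical.

```python
from fnmatch import fnmatchcase

# Mirrors the YAML policy above.
POLICY = {
    "always_allow": ["swift build", "swift test", "npm test", "cargo test"],
    "ask": ["xcodebuild *", "docker build *"],
    "deny": ["rm -rf ~/*"],
}


def decide(command: str, policy: dict[str, list[str]]) -> str:
    """Return "deny", "allow", or "ask" for a requested host command."""
    command = " ".join(command.split())  # collapse runs of whitespace
    # Assumed precedence: deny wins over allow, allow wins over ask.
    for verdict, key in (("deny", "deny"), ("allow", "always_allow"), ("ask", "ask")):
        if any(fnmatchcase(command, pattern) for pattern in policy.get(key, [])):
            return verdict
    return "ask"  # unknown commands go to human approval


print(decide("swift test", POLICY))                # allow
print(decide("xcodebuild -scheme MyApp", POLICY))  # ask
print(decide("rm -rf ~/Documents", POLICY))        # deny
print(decide("curl example.com", POLICY))          # ask (unknown)
```

Matching on the raw command string is the simplest thing that works for a sketch. A real gateway would more likely parse the command into arguments before matching, so that patterns cannot be sidestepped with quoting or shell metacharacters.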

### 3. Human-in-the-loop UI

Workset's desktop app provides the approval interface:

> **Claude** wants to run `xcodebuild -scheme MyApp -configuration Debug`
> *Reason: verifying the build compiles after refactoring the networking layer*
> **[Approve]  [Approve & Remember]  [Deny]**

This is the single place where humans interact with agent requests — not scattered across terminal permission prompts.
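The three buttons could map onto the policy roughly as follows. This is a sketch only: `Decision` and `apply_decision` are hypothetical names, and it assumes that "Approve & Remember" adds the exact command string to `always_allow` rather than a broader pattern.

```python
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    APPROVE_AND_REMEMBER = "approve_and_remember"
    DENY = "deny"


def apply_decision(policy: dict[str, list[str]], command: str, decision: Decision) -> bool:
    """Return True if the command may run, updating the policy when asked to remember."""
    if decision is Decision.DENY:
        return False
    if decision is Decision.APPROVE_AND_REMEMBER:
        allowed = policy.setdefault("always_allow", [])
        if command not in allowed:
            allowed.append(command)  # assumption: remember the exact command only
    return True
```

Remembering only the exact command keeps one click from silently widening the allowlist. Whether to offer pattern-level remembering is a separate design question.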

## Architecture

```
┌──────────────────────────────────┐
│  Sandbox (Nix + bwrap/macOS)     │
│                                  │
│  Agent: FULL autonomy            │
│  • edit, script, install, delete │
│  • zero permission prompts       │
│                                  │
│  Needs host? → structured request│
└─────────────┬────────────────────┘
              │ task-file / gateway protocol
     ┌────────▼─────────┐
     │  Workset Gateway  │
     │                   │
     │  allowlist → run  │
     │  ask → UI prompt  │
     │  deny → reject    │
     │  audit → log      │
     └────────┬─────────┘
              │ stdout/stderr streamed back
     ┌────────▼─────────┐
     │  Host System      │
     │  (user's machine) │
     │  runs as user     │
     └──────────────────┘
```
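The lower half of the diagram (run as the user, stream stdout/stderr back) could be sketched as below. `run_on_host` is a hypothetical helper; running without a shell and merging stderr into stdout are assumptions made for the sketch, not decisions stated in this issue.

```python
import shlex
import subprocess
from typing import Iterator


def run_on_host(command: str) -> Iterator[str]:
    """Run an already-approved command on the host and yield its output line by line."""
    process = subprocess.Popen(
        shlex.split(command),      # no shell, so metacharacters are not interpreted
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr with stdout
        text=True,
    )
    assert process.stdout is not None
    for line in process.stdout:
        yield line.rstrip("\n")    # each line can be forwarded to the agent as it arrives
    exit_code = process.wait()
    yield f"[exit {exit_code}]"
```

Yielding lines as they arrive lets the agent see long builds progress instead of waiting for the command to finish.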

## Key design decisions

- **Agent-agnostic**: the gateway protocol works for any agent, not just Claude
- **Build vs. run separation**: agent writes code in the jail; user decides what runs on the host
- **Workset manages the environment**: the Nix environment declaration (packages, tools) is changed via the UI, not by the agent
- **Default-on**: sandboxing is how workset works, not an opt-in feature
- **macOS-native support**: can't jail into Linux when building macOS apps; need macOS-appropriate isolation

## macOS considerations

Bubblewrap is Linux-only. On macOS:
- Filesystem scoping via sandbox profiles or equivalent
- Can't use a Linux VM when the build target is macOS-native (Swift, Xcode)
- The jail must still be macOS — just with restricted filesystem/process visibility
- This is the hardest part of the design and needs dedicated exploration
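As a starting point for that exploration, the sketch below generates a deny-by-default profile in the Scheme-like language accepted by macOS's deprecated `sandbox-exec` tool. Everything here is an assumption: the profile language is not officially documented, the rule set is untested and almost certainly too small for a real toolchain to run, and `sandbox_profile` is a hypothetical helper.

```python
def sandbox_profile(workspace: str) -> str:
    """Return a deny-by-default sandbox profile scoped to one workspace.

    Untested sketch: a working profile would need many more allowances
    (system libraries, devices, Mach services, and so on).
    """
    if '"' in workspace or "\\" in workspace:
        raise ValueError("workspace path must not contain quotes or backslashes")
    rules = [
        "(version 1)",
        "(deny default)",
        "(allow process-exec)",
        "(allow process-fork)",
        '(allow file-read* (subpath "/nix/store"))',
        f'(allow file-read* file-write* (subpath "{workspace}"))',
    ]
    return "\n".join(rules) + "\n"
```

The generated text could be written to a file and passed to `sandbox-exec -f <profile> <command>`. Unlike the bubblewrap case, disallowed paths remain visible in the namespace and are denied on access rather than being absent.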

## What this enables

- **Move fast safely**: no permission fatigue, no risk to host
- **Universal agent support**: one security model for all AI tools
- **Clean human-in-the-loop**: approvals only where they matter (host actions)
- **Auditable**: every host command request is logged with context
- **Reproducible environments**: Nix ensures consistent toolchains across machines
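For the audit point, one simple option is an append-only JSON Lines file with one record per host command request. The record fields and the `audit` helper below are hypothetical, shown only to illustrate what "logged with context" might capture.

```python
import json
import time
from pathlib import Path


def audit(log_path: Path, agent: str, command: str, reason: str, verdict: str) -> None:
    """Append one host-command request to a JSON Lines audit log."""
    record = {
        "timestamp": time.time(),
        "agent": agent,
        "command": command,
        "reason": reason,    # the agent's stated reason, kept for later review
        "verdict": verdict,  # e.g. "allow", "ask", or "deny"
    }
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
```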

## Related

- ADR: `docs-dev/adr/0002-agent-sandbox-architecture.md`
