Sandboxed workspace terminals for safe AI agent autonomy #144

@jrschumacher

Vision

Workset becomes the universal agent jail. Any AI coding agent — Claude, Codex, Cursor, or future tools — gets full autonomy inside a sandboxed workspace. Workset owns the security boundary, not the agent. The agent doesn't even know it's jailed.

This isn't opt-in. It's how workset works. Workset is opinionated: agents develop inside the sandbox, users build and run on the host.

Problem

Today, every AI agent implements its own permission system — approve mkdir, approve grep, approve cat, hundreds of times per session. This creates two bad outcomes:

  1. Friction: developers disable safety to move fast (--dangerously-skip-permissions)
  2. Risk: when safety is off, the agent has full host access (SSH keys, credentials, other projects, system config)

No agent's permission model is trustworthy enough to replace OS-level isolation. The agent shouldn't be responsible for policing itself.

Design: jail + gateway + human-in-the-loop

1. The jail (agent side)

The agent operates inside a sandboxed environment. Full autonomy — no permission prompts:

  • Edit any file in the workspace
  • Run any script, install dev tools, delete and recreate things
  • Complete freedom within the workspace boundary

What the agent sees:

/workspace/          ← read-write (the repos)
/nix/store/          ← read-only (tools, runtimes)
/tmp/                ← isolated

What doesn't exist: ~/, ~/.ssh, ~/.aws, /etc, other workspaces, host config.

Implementation: Nix for reproducible toolchains, plus bubblewrap on Linux or an equivalent macOS sandbox mechanism for filesystem/process isolation. Workset manages the Nix environment through its UI; the agent doesn't modify the environment declaration.
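
To make the Linux side concrete, here is a minimal sketch of how a launcher could assemble a bubblewrap command line for the layout above. The function name `bwrap_argv` and the choice to point `HOME` at `/workspace` are assumptions for illustration, not part of the proposal. PATH setup for the Nix toolchain is left out.

```python
def bwrap_argv(workspace: str, command: list[str]) -> list[str]:
    """Build a bwrap argument vector exposing only the three paths listed above.

    Hypothetical helper: `workspace` is the host path of the workspace checkout,
    `command` is the agent process to start inside the jail.
    """
    return [
        "bwrap",
        "--unshare-all",       # fresh namespaces; this also cuts off the network
        "--die-with-parent",   # tear the jail down if the launcher exits
        "--bind", workspace, "/workspace",        # read-write (the repos)
        "--ro-bind", "/nix/store", "/nix/store",  # read-only (tools, runtimes)
        "--tmpfs", "/tmp",                        # isolated scratch space
        "--proc", "/proc",
        "--dev", "/dev",
        "--chdir", "/workspace",
        "--setenv", "HOME", "/workspace",         # assumption: no host home inside
        *command,
    ]


if __name__ == "__main__":
    print(" ".join(bwrap_argv("/home/user/worksets/demo", ["bash"])))
```

Because nothing else is bound, paths such as `~/.ssh` simply have no mount inside the jail. If the agent needs to download packages, `--share-net` would have to be added after `--unshare-all`, which is a policy decision this sketch does not make.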

2. The gateway (host side)

When the agent needs to build, test, or run something on the host — because the target IS the host (e.g., a macOS app, Swift/Xcode) — it goes through a workset-controlled gateway.

The gateway is an agent-agnostic command interface (task-file style) that any agent can write to:

  • Agent requests: "run xcodebuild -scheme MyApp" with a reason
  • Workset evaluates against a per-workspace allowlist
  • Allowed commands execute on the host, output streams back to the agent
  • Unknown commands go to human approval

# Per-workspace host command policy
host_commands:
  always_allow:
    - swift build
    - swift test
    - npm test
    - cargo test
  ask:
    - xcodebuild *
    - docker build *
  deny:
    - rm -rf ~/*
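
One way the policy above could be evaluated is sketched below. The precedence (deny, then ask, then always_allow), the use of shell-style glob matching, and the `decide` function name are all assumptions; the proposal only says that unknown commands go to human approval.

```python
from fnmatch import fnmatchcase

# Mirrors the example policy above.
POLICY = {
    "always_allow": ["swift build", "swift test", "npm test", "cargo test"],
    "ask": ["xcodebuild *", "docker build *"],
    "deny": ["rm -rf ~/*"],
}


def decide(command: str, policy: dict[str, list[str]]) -> str:
    """Return "allow", "ask", or "deny" for a requested host command."""
    normalized = " ".join(command.split())  # collapse repeated whitespace
    # Assumed precedence: the most restrictive matching rule wins.
    for verdict, key in (("deny", "deny"), ("ask", "ask"), ("allow", "always_allow")):
        if any(fnmatchcase(normalized, pattern) for pattern in policy.get(key, [])):
            return verdict
    return "ask"  # unknown commands go to human approval


if __name__ == "__main__":
    for cmd in ("swift build", "xcodebuild -scheme MyApp", "rm -rf ~/Documents", "make install"):
        print(f"{cmd!r}: {decide(cmd, POLICY)}")
```

Under these assumptions an entry without a wildcard matches only the exact command, so `swift build --release` would fall through to a prompt rather than run silently.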

3. Human-in-the-loop UI

Workset's desktop app provides the approval interface:

Claude wants to run xcodebuild -scheme MyApp -configuration Debug
Reason: verifying the build compiles after refactoring the networking layer
[Approve] [Approve & Remember] [Deny]

This is the single place where humans interact with agent requests — not scattered across terminal permission prompts.
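
A rough sketch of what the three buttons might do to the per-workspace policy follows. The choice identifiers and the rule that "Approve & Remember" stores the exact command under `always_allow` are assumptions; the proposal does not define how remembered approvals are persisted.

```python
import copy


def apply_choice(policy: dict[str, list[str]], command: str, choice: str):
    """Return (run_now, updated_policy) for one button press in the prompt.

    `choice` is one of "approve", "approve_and_remember", "deny"
    (hypothetical identifiers for the three buttons).
    """
    updated = copy.deepcopy(policy)  # never mutate the caller's policy in place
    if choice == "approve":
        return True, updated
    if choice == "approve_and_remember":
        allowed = updated.setdefault("always_allow", [])
        if command not in allowed:
            allowed.append(command)  # assumption: remember the exact command only
        return True, updated
    if choice == "deny":
        return False, updated
    raise ValueError(f"unknown choice: {choice}")


if __name__ == "__main__":
    policy = {"always_allow": ["swift build"], "ask": ["xcodebuild *"], "deny": []}
    cmd = "xcodebuild -scheme MyApp -configuration Debug"
    print(apply_choice(policy, cmd, "approve_and_remember"))
```

Remembering the exact command rather than a widened pattern keeps a single approval from silently covering unrelated invocations.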

Architecture

┌──────────────────────────────────┐
│  Sandbox (Nix + bwrap/macOS)     │
│                                  │
│  Agent: FULL autonomy            │
│  • edit, script, install, delete │
│  • zero permission prompts       │
│                                  │
│  Needs host? → structured request│
└─────────────┬────────────────────┘
              │ task-file / gateway protocol
     ┌────────▼─────────┐
     │  Workset Gateway  │
     │                   │
     │  allowlist → run  │
     │  ask → UI prompt  │
     │  deny → reject    │
     │  audit → log      │
     └────────┬─────────┘
              │ stdout/stderr streamed back
     ┌────────▼─────────┐
     │  Host System      │
     │  (user's machine) │
     │  runs as user     │
     └──────────────────┘
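
The flow in the diagram could be wired together roughly as follows. This is a sketch under stated assumptions: the request fields (`agent`, `command`, `reason`), the function names, and the JSON-lines audit record are hypothetical, and output is captured after the command finishes rather than streamed.

```python
import json
import shlex
import subprocess
import time
from typing import Callable


def run_on_host(command: str) -> tuple[int, str]:
    """Run as the user, without a shell, so ';' or '&&' are never interpreted."""
    proc = subprocess.run(shlex.split(command), capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


def handle_request(
    request: dict,
    decide: Callable[[str], str],        # returns "allow", "ask", or "deny"
    ask_human: Callable[[dict], bool],   # UI prompt; True means approved
    audit_log: list[str],                # stand-in for an append-only log file
    run: Callable[[str], tuple[int, str]] = run_on_host,
) -> dict:
    """Process one structured request from the sandbox and record it."""
    verdict = decide(request["command"])
    if verdict == "ask":
        verdict = "allow" if ask_human(request) else "deny"
    result = {"verdict": verdict, "exit_code": None, "output": ""}
    if verdict == "allow":
        result["exit_code"], result["output"] = run(request["command"])
    # Every request is logged with its context, whether or not it ran.
    audit_log.append(json.dumps({
        "ts": time.time(), **request,
        "verdict": verdict, "exit_code": result["exit_code"],
    }))
    return result
```

Passing `decide`, `ask_human`, and `run` in as callables keeps the gateway independent of any particular agent or UI, and lets the flow be exercised without touching the host.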

Key design decisions

  • Agent-agnostic: the gateway protocol works for any agent, not just Claude
  • Build vs. run separation: agent writes code in the jail; user decides what runs on the host
  • Workset manages the environment: packages/tools added via UI, not by the agent
  • Default-on: sandboxing is how workset works, not an opt-in feature
  • macOS-native support: can't jail into Linux when building macOS apps; need macOS-appropriate isolation

macOS considerations

Bubblewrap is Linux-only. On macOS:

  • Filesystem scoping via sandbox profiles or equivalent
  • Can't use a Linux VM when the build target is macOS-native (Swift, Xcode)
  • The jail must still be macOS — just with restricted filesystem/process visibility
  • This is the hardest part of the design and needs dedicated exploration

What this enables

  • Move fast safely: no permission fatigue, no risk to host
  • Universal agent support: one security model for all AI tools
  • Clean human-in-the-loop: approvals only where they matter (host actions)
  • Auditable: every host command request is logged with context
  • Reproducible environments: Nix ensures consistent toolchains across machines

Related

  • ADR: docs-dev/adr/0002-agent-sandbox-architecture.md
