Atelier OS

A purpose-built Linux desktop for AI agents.
Sway compositor + open-weights vision model.
One container per teammate. Fleet API. Bug-free or it doesn't ship.

What this is

Most "AI desktop" projects fall into one of two camps:

An Electron app that takes over your real laptop (UI-TARS-desktop). The model clicks your keyboard, the operator hopes nothing else is open, and there's no fleet story.
A cloud VM you rent (Anthropic Cowork, ByteDance Remote, Cua, Bytebot-now-archived). Polished UX, surprise bills, vendor sees every pixel, can't put it on your VPS.

Atelier OS is a third thing: a docker compose up that gives every teammate their own persistent Linux desktop, themed for AI use, driven by an open-weights vision model running on your hardware via a small FastAPI fleet that screenshots → thinks → clicks for as long as the task takes. The AI's files survive restarts. The fleet API lets one chat dispatch parallel goals across N employees' desktops. Every action is in an append-only audit log on your disk.

MIT. Built atop standard Linux pieces — Sway (Wayland headless), grim / wtype / wlrctl, Holo3-35B-A3B (#1 open computer-use model on OSWorld-Verified at 77.8%) or Anthropic Computer Use API. The whole orchestrator is ~1,300 LOC of Python + ~250 LOC of configs.

What the desktop looks like

Captured via GET /sessions/{id}/screenshot after dispatching the sequence below through the action-daemon — proving every verb (key combo, type, screenshot) end-to-end:

# 1. Sway opens foot (terminal) on Mod+Return
$ echo '{"action":"key","text":"super+Return"}'  | docker exec -i ${CN} atelier-action-cli
# 2. Type a comment
$ echo '{"action":"type","text":"# Atelier OS v0.1.5 — live desktop"}' | docker exec -i ${CN} atelier-action-cli
$ echo '{"action":"key","text":"Return"}' | docker exec -i ${CN} atelier-action-cli
# 3. Screenshot
$ curl -o shot.png http://fleet:8090/sessions/${SID}/screenshot
$ file shot.png
shot.png: PNG image data, 1280 x 720, 8-bit/color RGB, non-interlaced

The image above is the byte-identical output of that last curl.

What works today (v0.1.6, 451/451 tests green on real hardware)

./setup.sh from a fresh clone → fleet API healthy on :8090 in ~2 min
POST /sessions → spawns a per-teammate desktop container (image 0.1.5)
Desktop boots: Sway → action-daemon → wayvnc → websockify + noVNC on :8443
GET /sessions/{id}/embed → 307 to a live, iframe-able noVNC stream (Wayland-native VNC over WebSocket, ~500 ms latency, no plugin)
GET /sessions/{id}/screenshot → real 1280×720 PNG
POST /sessions/{id}/task → dispatches a goal, SSE step stream; per-session mutex — a second task on the same session returns 409
12 action verbs work end-to-end through the in-container daemon: screenshot, mouse_move (absolute), left/right/double click, type (incl. unicode), key (Return, Escape, ctrl+a, alt+Tab, …), scroll up+down, wait
GET /sessions/{id}/audit → every action + every cost increment, JSONL
Per-session /home/operator survives restart; registry survives fleet restart
Daily-cost cap (MAX_USD_PER_DAY), enforced atomically across concurrent submits
451/451 tests pass: 306 unit (sub-3s, no docker) + 145 integration (against live containers — every endpoint, every verb, every error path, multi-session stress, audit integrity, performance benchmarks p50/p95/p99, noVNC websockify handshake, documentation accuracy, snapshot lifecycle, Bearer-auth gating, Holo3 compose-profile structure, install-smoke harness)
Operator install verifier: python3 scripts/launch-smoke.py after docker compose up -d walks spawn → screenshot → snapshot → clone → teardown end-to-end. Stdlib-only, exits 0/1, slots into CI.

Roadmap (and what's tracked vs done)

See TRACKING.md for the full shipping log — every known limitation has a reproducer + acceptance + GitHub issue link. v0.1.6 closed all 8 v0.2 deliverables (S1-S8). Status at a glance:

✅ Fleet-API auth (S1, PR #13) — opt-in ATELIER_API_TOKEN, constant- time comparison, anti-leak header-only design
✅ TLS in compose (S2, PR #14) — opt-in with-tls profile, Caddy auto-Let's-Encrypt, SSE-flush + X-Forwarded-* preserved
✅ Per-session CPU/memory quotas (S3, PR #11) — --memory 4g --cpus 2.0 defaults via env
✅ Snapshot + clone session-state API (S4, PR #15) — POST /sessions/{id}/snapshot + POST /sessions {snapshot_id: ...} clones source's image + home tar into a new session. Path-traversal safe.
✅ Live budget visible in the desktop (S5, PR #12) — waybar shows $0.42 / $2.00; falls back to last-known with (stale) on fleet hiccup
✅ Bundled vLLM Holo3 sidecar (S6, PR #16) — --profile holo3 up spawns vLLM with safe 24 GB GPU defaults; weights cached in named volume
✅ Action-daemon protocol versioning (S7, PR #11) — {"v": 1, ...} on every wire-level request
✅ Boot-wait race (S8, PR #10) — ?wait=ready query param blocks until the desktop healthcheck is green
⏭ Real WebRTC (vs WebSocket VNC) — out of scope for atelier-os; future separate project (the v0.1.6 wayvnc + websockify + noVNC stack is iframe-embeddable today, ~500 ms latency)

"In the market of Iron you need to sell Gold." Atelier OS is the Gold: every piece is standard upstream Linux but the curation, theming, streaming, fleet, audit, and chat-embed are assembled into one experience nobody else ships.

Quick start

git clone https://github.com/karany97/atelier-os.git
cd atelier-os
./setup.sh           # one prompt for a session password, then ~2 min

That's the whole install. The script copies .env.example → .env, prompts for a session password (or generates one), checks FLEET_PORT isn't taken (common collision: multica also wants 8090), builds both images (docker compose build fleet desktop-template), starts the fleet, and waits for /health.

Verify your install

# stdlib-only, no pip install. Walks /health → /budget → spawn session
# (with ?wait=ready) → screenshot → snapshot → clone-from-snapshot →
# teardown. Reports PASS/FAIL per check; exits 0 if all green.
python3 scripts/launch-smoke.py

Operators behind bearer auth pass the token:

FLEET_URL=http://localhost:8090 \
ATELIER_API_TOKEN=$(grep ATELIER_API_TOKEN .env | cut -d= -f2) \
python3 scripts/launch-smoke.py

Manual override paths:

# Per-session embed (iframe-able noVNC live stream)
$ curl -X POST http://localhost:8090/sessions \
  -H 'content-type: application/json' \
  -d '{"label":"alice"}'
{"id":"sess_...","embed_url":"http://localhost:8090/sessions/sess_.../embed", ...}

$ curl -I http://localhost:8090/sessions/sess_.../embed
HTTP/1.1 307 Temporary Redirect
Location: http://localhost:7100/    # → noVNC client served on the session's port

# Per-session screenshot (PNG bytes through the action-daemon)
$ curl -o shot.png http://localhost:8090/sessions/sess_.../screenshot
$ file shot.png
shot.png: PNG image data, 1280 x 720, 8-bit/color RGB

# Dispatch a goal (returns 202; tail the SSE stream for live steps)
$ curl -X POST http://localhost:8090/sessions/sess_.../task \
  -H 'content-type: application/json' \
  -d '{"goal":"open epiphany and search nandai jewellery"}'
{"task_id":"task_...","status":"running","stream_url":".../stream",...}

To spin up a second employee's desktop:

curl -X POST http://localhost:8090/sessions \
  -H 'content-type: application/json' \
  -d '{"label":"alice","theme":"terracotta"}'
# → 201 + {"session_id":"...", "embed_url":"...", "vnc_url":"..."}

To embed in any chat / dashboard:

<iframe src="https://atelier-os.example.com/sessions/abc123/embed"
        allow="clipboard-read; clipboard-write; fullscreen"
        sandbox="allow-scripts allow-same-origin allow-forms allow-popups"
        referrerpolicy="no-referrer"></iframe>

Behind TLS (issue #2 — opt-in Caddy profile)

The default docker compose up runs the fleet on plaintext HTTP at $FLEET_PORT (assume private network). To put Caddy in front for TLS termination + automatic Let's Encrypt cert:

# 1. Copy + customize the Caddyfile template (or just set the env vars below)
cp compose/Caddyfile.example compose/Caddyfile

# 2. Tell Caddy your domain + Let's Encrypt contact email
cat >> .env <<'EOF'
ATELIER_TLS_DOMAIN=atelier-os.example.com
ATELIER_TLS_EMAIL=ops@example.com
EOF

# 3. Bring up the with-tls profile (caddy alongside the fleet)
docker compose --profile with-tls up -d

Caddy auto-provisions a cert via HTTP-01 challenge on first request, so :80 + :443 must be reachable from the public internet and your domain's A/AAAA record must point at this host. ACME state persists in a named volume across docker compose down.

Caveat — per-session noVNC ports. The /sessions/{id}/embed endpoint redirects to http://host:7XXX/ (the per-session port the desktop container exposes). Caddy in this template does NOT TLS- terminate those ports — they stay plain HTTP. If your iframe loads from an HTTPS atelier, browsers will block the HTTP redirect target (mixed-content). For v0.1.6 most operators keep those per-session ports on a private network and embed from a same-network atelier instance. v0.2 will ship a handle_path /v/{port}/* Caddy pattern that dynamically proxies the per-session ports through the same TLS endpoint.

With local Holo3 vision (issue #6 — opt-in `holo3` compose profile)

The default MODEL_BACKEND=anthropic calls Anthropic Computer Use (real $0.05–$0.40/task). For fully-local, free, open-weights vision the holo3 profile spins up a vLLM sidecar serving Holo3-35B-A3B:

# 1. NVIDIA Container Toolkit installed on the host (24 GB GPU minimum)
# 2. Switch the fleet to local Holo3 in .env
sed -i 's/^MODEL_BACKEND=anthropic/MODEL_BACKEND=holo3/' .env

# 3. Bring up the fleet + the Holo3 sidecar
docker compose --profile holo3 up -d

First boot downloads ~70 GB of Holo3 weights from HuggingFace (cached to the holo3-models named volume; survives docker compose down). Steady-state boot: ~30 s. The fleet auto-discovers HOLO3_ENDPOINT=http://holo3:8000/v1 over the compose network.

Holo3-35B-A3B is a MoE with ~3 B active params — single-card inference on RTX 3090 / 4090 / A6000 / A100. OSWorld-Verified 77.8% (the #1 OSS computer-use model, beats Anthropic Opus 4.6 at 1/10 the cost).

With Bearer auth (issue #1 — opt-in token)

The fleet API is open by default (assume private network). To require Authorization: Bearer <token> on every state-touching endpoint:

echo "ATELIER_API_TOKEN=$(openssl rand -hex 32)" >> .env
docker compose restart fleet

/health, /sessions/{id}/embed, and /budget stay open by design (load-bearing for healthchecks, iframe navigation, and the in-container budget poller respectively — see comments in driver/src/main.py). Constant-time comparison via hmac.compare_digest. Query-param tokens are rejected (anti-leak).

The architecture in one diagram

   ┌──────────────────────────┐
   │  Destiny Atelier (chat)  │     ──── any chat surface, really
   └──────────┬───────────────┘
              │  iframe / SSE
              ▼
   ┌──────────────────────────────────────────────────────────┐
   │  Atelier OS fleet API (FastAPI, this repo)               │
   │  · POST   /sessions             create new desktop       │
   │  · GET    /sessions/{id}/embed  iframe-able WebRTC URL   │
   │  · POST   /sessions/{id}/task   dispatch a goal          │
   │  · GET    /sessions/{id}/audit  per-turn audit log       │
   │  · DELETE /sessions/{id}        teardown + persist home  │
   └──────────┬───────────────────────────────────────────────┘
              │  docker exec
              ▼
   ┌──────────────────────────────────────────────────────────┐
   │  Per-employee desktop container (one each)               │
   │   ┌────────────────────────────────────────────────────┐ │
   │   │  Sway (Wayland kiosk compositor)                  │ │
   │   │  + waybar status bar (Atelier branding)           │ │
   │   │  + Firefox + xterm + Files + your apps            │ │
   │   │  + GTK4/Qt6 Atelier theme (terracotta default)    │ │
   │   └────────────────────────────────────────────────────┘ │
   │   ┌────────────────────────────────────────────────────┐ │
   │   │  wayvnc + websockify + noVNC (HTTP→WS→VNC on :8443)│ │
   │   │  · Wayland-native VNC server (sway's sibling project)│
   │   │  · ~500 ms latency over WebSocket, no plugin       │ │
   │   │  · iframe-embeddable, fullscreen-capable           │ │
   │   └────────────────────────────────────────────────────┘ │
   │   ┌────────────────────────────────────────────────────┐ │
   │   │  Action daemon (grim + wtype + ydotool)           │ │
   │   │  · screenshot via grim (Wayland-native, lossless) │ │
   │   │  · type via wtype, key via ydotool                │ │
   │   │  · ~1 ms per action (Unix socket, not docker exec)│ │
   │   └────────────────────────────────────────────────────┘ │
   └──────────────────────────────────────────────────────────┘
              │
              ▼ (model calls)
   ┌────────────────────────┐    ┌────────────────────────┐
   │  Holo3-35B-A3B local   │ OR │  Anthropic Computer    │
   │  (Apache 2.0, vLLM)    │    │  Use API (paid escape) │
   │  OSWorld-Verified 77.8%│    │  Sonnet 4.5 default    │
   └────────────────────────┘    └────────────────────────┘

Comparison

	UI-TARS-desktop	Bytebot (archived)	Anthropic Cowork	Atelier OS
Form factor	Electron app	Docker (dead since Mar 2026)	macOS/Windows app	`docker compose up`
Per-employee fleet	❌	❌	❌	✅ N sessions
Iframe-embeddable	❌	❌	❌	✅ noVNC/wayvnc
Persistent home	❌ (releases on navigate)	✅	❌	✅
Open weights	✅ (UI-TARS 1.5-7B)	✅	❌	✅ (Holo3 default)
Wayland-native screen capture	❌ (X11)	❌ (X11)	n/a	✅ (`grim`)
Live video stream	✅ (WebRTC)	✅ (VNC)	n/a	✅ (wayvnc, WS)
Screenshot-on-demand via API	❌	❌	n/a	✅ (proven, PNG)
Per-session task mutex	n/a	n/a	n/a	✅
Test count	unknown	unknown	proprietary	451 unit+integration
Audit log per turn	❌	❌	❌	✅ JSONL
Self-host floor	local app on your laptop	Docker on Linux	macOS only	Docker on any Linux
License	Apache 2.0	Apache 2.0	proprietary	MIT
Cost per task	model-only	model-only	$0.20-$2.00	model-only ($0 with local Holo3, $0.05-$0.40 with Sonnet 4.5)

Why Holo3-35B-A3B (the model that ships in v0.1)

Holo3-35B-A3B (HCompany, 2026) is currently the #1 open-source computer-use model on the OSWorld- Verified benchmark at 77.8% — ahead of OpenCUA-72B (45%) and Simular Agent-S2 (34.5%), and even beats Anthropic Opus 4.6 on the same benchmark at roughly 1/10 the per-token cost.

It's a 35B-parameter MoE with 3B active params — fits a single RTX 3090 at 40-60 t/s via vLLM. Apache 2.0 weights. We ship a vLLM systemd unit that boots it on demand.

Operators who don't have a GPU can point the driver at the Anthropic Computer Use API instead (MODEL_BACKEND=anthropic). Either way, the desktop, the fleet, the audit, the embed — all stay the same.

What's in v0.1.6 (the launch-sprint release)

✅ Repo layout (MIT, README, threat model, architecture doc, CHANGELOG)
✅ compose/docker-compose.yml — fleet API + N session containers, opt-in with-tls (Caddy) + holo3 (vLLM sidecar) profiles
✅ desktop/Dockerfile — Ubuntu 24.04 + Sway + Atelier theme, builds atelier-os/desktop:0.1.5 cleanly
✅ desktop/sway-config + desktop/waybar-config.json — keyboard kiosk + live-budget status bar
✅ desktop/atelier-budget-poller.py — in-container fleet /budget poller, atomic write to /tmp/atelier-budget
✅ driver/src/main.py — FastAPI fleet API (sessions, embed, task, audit, budget, snapshot lifecycle, optional Bearer-token gating)
✅ driver/src/session.py — per-session container lifecycle, quotas, snapshot_id-driven clone path
✅ driver/src/snapshot.py — multi-session snapshot module (image + home tar, path-traversal-safe extract, most-recent delete safeguard)
✅ driver/src/action.py — Wayland-native action dispatch
✅ driver/src/model.py — Holo3 + Anthropic backends, swappable
✅ driver/src/audit.py — append-only JSONL audit log
✅ tests/ — 451 tests (306 unit + 145 integration), all green
✅ scripts/launch-smoke.py — operator install verifier (spawn → snapshot → clone → teardown, exits 0/1)
✅ docs/threat-model.md + docs/architecture.md
✅ TRACKING.md — every known limitation has a closed PR + writeup
⏳ .github/workflows/ci.yml — staged; awaits one-time OAuth scope refresh on the maintainer side (issue #9)

What's coming (v0.2 — June 2026)

Hot-swap themes per-session (each employee picks their palette)
File-drop from chat to session home
Real WebRTC streaming pipeline (vs current wayvnc + WebSocket) for the per-session embed
Multi-cluster: federate sessions across N atelier-os hosts
One-USB-boot ISO ("Atelier OS Live") for kiosk deployments

Companion repos

Repo	What
karany97/nandai-atelier	The chat surface that embeds Atelier OS sessions
karany97/destiny-computer	The v0.1 single-desktop pattern Atelier OS evolved from
karany97/atelier-os	This repo

License

MIT. See LICENSE.

Security

If you find a vulnerability, please email security@destiny.computer instead of opening a public issue. PGP key in repo root. 72-hour acknowledgement, 14-day public disclosure.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Atelier OS

A purpose-built Linux desktop for AI agents.
Sway compositor + open-weights vision model.
One container per teammate. Fleet API. Bug-free or it doesn't ship.

What this is

What the desktop looks like

What works today (v0.1.6, 451/451 tests green on real hardware)

Roadmap (and what's tracked vs done)

Quick start

Verify your install

Behind TLS (issue #2 — opt-in Caddy profile)

With local Holo3 vision (issue #6 — opt-in `holo3` compose profile)

With Bearer auth (issue #1 — opt-in token)

The architecture in one diagram

Comparison

Why Holo3-35B-A3B (the model that ships in v0.1)

What's in v0.1.6 (the launch-sprint release)

What's coming (v0.2 — June 2026)

Companion repos

License

Security

About

Uh oh!

Releases 1

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 71 Commits
.github		.github
compose		compose
desktop		desktop
docs		docs
driver		driver
scripts		scripts
tests		tests
.env.example		.env.example
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
TRACKING.md		TRACKING.md
setup.sh		setup.sh

Folders and files

Latest commit

History

Repository files navigation

Atelier OS

A purpose-built Linux desktop for AI agents.Sway compositor + open-weights vision model.One container per teammate. Fleet API. Bug-free or it doesn't ship.

What this is

What the desktop looks like

What works today (v0.1.6, 451/451 tests green on real hardware)

Roadmap (and what's tracked vs done)

Quick start

Verify your install

Behind TLS (issue #2 — opt-in Caddy profile)

With local Holo3 vision (issue #6 — opt-in holo3 compose profile)

With Bearer auth (issue #1 — opt-in token)

The architecture in one diagram

Comparison

Why Holo3-35B-A3B (the model that ships in v0.1)

What's in v0.1.6 (the launch-sprint release)

What's coming (v0.2 — June 2026)

Companion repos

License

Security

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

A purpose-built Linux desktop for AI agents.
Sway compositor + open-weights vision model.
One container per teammate. Fleet API. Bug-free or it doesn't ship.

With local Holo3 vision (issue #6 — opt-in `holo3` compose profile)

Packages