Full docs: docs/ — Quickstart · Subcommand reference · Trace schema · Architecture · Threat model.
Single-binary Zig CLI for observe-and-control of a Linux desktop at the
process / window / input level. The substrate every agent on this workstation
uses to drive Hyprland, launch .desktop apps, type keystrokes, click,
take screenshots, and copy/paste — without shell-eval anywhere in the hot
path.
Closes HARNESS_AUDIT R3 (ydotool/uinput daemon),
R7 (argv-safe launcher via aac-launch), and R8 (intent-traced
mutating actions).
DeepMind, Anthropic, and OpenAI shipped vision/planning models for
computer-use in 2024–25. The control layer those models drive — the thing
that actually moves the mouse, types the text, focuses the window —
remained a patchwork of shell scripts wrapping wtype / ydotool /
hyprctl / xdotool. Every wrapper had eval setsid $CMD somewhere in
it. Every wrapper logged nothing.
agent-app-control is the Zig replacement. One static binary, all the
verbs, intent-traced before execution, argv-safe at the spawn boundary.
A vision model talks to it over a single JSON-friendly CLI surface.
Three layers, three trust postures:
- Raw layer (agent-app-control-raw, formerly agent-app-control-zig): the single binary. Every subcommand. No policy. Used directly when the caller has already established intent (audit harnesses, test rigs).
- Guarded layer (agent-app-control): a Bash wrapper that enforces the capability/intent model. It refuses mutating actions without STAX_INTENT=<verb> and signs decisions into ~/.local/state/staxctl/traces.jsonl. The default user-facing entry.
- Parity layer (agent-app-control-parity): a regression-test harness that runs the Zig binary against the original Bash reference and asserts schema/stats agreement.

The boundary between layers is the JSON contract emitted by the Raw layer. The Guarded layer never re-implements logic; it only gates and logs.
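
As a rough illustration of the gating the Guarded layer performs, the sketch below maps a verb to one of the decision values the wrapper logs (executed, denied_missing_intent, denied_action_not_allowed). The two verb lists, the function names, and the rule that STAX_INTENT must equal the verb are assumptions made for this example; the real wrapper's policy is not shown in this document and may differ.

```shell
# Assumption: verbs treated as read-only and never gated.
READ_ONLY="help version contract doctor list list-json app-info clients active workspaces monitors qsr"
# Assumption: a small, hypothetical allow-list of mutating verbs for this caller.
ALLOWED_MUTATING="screenshot type copy"

# in_list WORD ITEM... : succeed if WORD equals one of the items.
in_list() {
  local needle="$1" item
  shift
  for item in "$@"; do
    [ "$item" = "$needle" ] && return 0
  done
  return 1
}

# Print the decision this sketch would record for a verb.
guard_decision() {
  local verb="$1"
  # Word splitting of the two lists below is intentional.
  if in_list "$verb" $READ_ONLY; then
    echo executed
  elif ! in_list "$verb" $ALLOWED_MUTATING; then
    echo denied_action_not_allowed
  elif [ "${STAX_INTENT:-}" != "$verb" ]; then
    echo denied_missing_intent
  else
    echo executed
  fi
}
```

With this sketch, `STAX_INTENT=screenshot guard_decision screenshot` prints `executed`, while the same call with the variable unset prints `denied_missing_intent`.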
| subcommand | output | purpose |
|---|---|---|
| help / --help / -h | text | usage |
| version / --version | text | agent-app-control-zig 0.4.0 |
| contract | JSON | machine-readable contract (capability ids, allowed verbs) |
| doctor | JSON | environment health (session, deps, pointer backends) |
| list | TSV | visible .desktop applications |
| list-json | JSON | same, structured |
| app-info ID | JSON | one .desktop file parsed |
| clients | JSON | Hyprland windows (via hyprctl clients -j) |
| active | JSON | focused window |
| workspaces | JSON | Hyprland workspaces |
| monitors | JSON | Hyprland monitors |
| qsr | JSON | Quasi-Sovereign-Reflex graph (Phase 4f) |

| subcommand | required args | what it does |
|---|---|---|
| focus PATTERN | class/title regex | focus window by class/title |
| workspace NAME | workspace id/name | switch workspace |
| close-active | none | close focused window |
| fullscreen | none | toggle fullscreen on focused window |
| launch-desktop ID | .desktop id | launch via gtk-launch |
| launch-web URL | http/https URL | launch as Chromium PWA |
| open TARGET | path or URL | xdg-open (sandbox-checked) |
| dispatch NAME ARGS | hyprctl dispatcher | raw hyprctl dispatch |
| type TEXT | text | keyboard via wtype (no ydotoold needed) |
| move-pointer X Y | coordinates | mouse via ydotool (requires ydotoold) |
| click [BTN] | button name | mouse click via ydotool |
| key KEYCODE[:STATE] | keycode + state | keyboard via ydotool |
| drag X1 Y1 X2 Y2 | coords | mouse drag |
| copy TEXT | text | wl-copy to clipboard |
| clipboard-read / paste | none | wl-paste from clipboard |
| screenshot [PATH] | optional path | full-screen via grim |
| screenshot-region [--geometry GEOM] [PATH] | optional geom | region via grim+slurp |
| media ACTION | playerctl action | media control |
| audio [status] | none | wpctl/pactl |

Every mutating action writes one line to ~/.local/state/staxctl/traces.jsonl
before spawning the underlying tool. Line shape:

```json
{"ts_ms": 1778850120697, "layer": "raw", "intent": "screenshot", "target": "/tmp/r8-test.png"}
```

The Guarded layer adds a second trace line with capability id, intent
flag, and decision (executed / denied_action_not_allowed /
denied_missing_intent).
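
To make the trace-then-spawn ordering concrete, here is a minimal shell sketch that builds a line in the shape shown above and appends it before running a command. The function names, the `TS_MS` and `TRACE_FILE` overrides, and the limited JSON escaping are assumptions for illustration; the Raw layer does this in Zig and its actual code is not reproduced here.

```shell
# Escape backslashes and double quotes so the value is a valid JSON string.
# Simplification: control characters such as newlines are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build one trace line. TS_MS may be preset (handy for tests); otherwise
# GNU date supplies milliseconds since the epoch.
trace_line() {
  local layer="$1" intent="$2" target="$3"
  local ts="${TS_MS:-$(date +%s%3N)}"
  printf '{"ts_ms": %s, "layer": "%s", "intent": "%s", "target": "%s"}\n' \
    "$ts" "$(json_escape "$layer")" "$(json_escape "$intent")" "$(json_escape "$target")"
}

# Append the trace line first, then run the command with its argv intact.
# TRACE_FILE is a hypothetical override; the default is the path named above.
traced_run() {
  local intent="$1" target="$2"
  shift 2
  local file="${TRACE_FILE:-$HOME/.local/state/staxctl/traces.jsonl}"
  mkdir -p "$(dirname "$file")"
  trace_line raw "$intent" "$target" >> "$file" || return 1
  "$@"
}
```

Because the append happens before `"$@"` runs, a command that fails or crashes still leaves its intent line behind.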
The substrate has two trust boundaries:
- Caller intent: every mutating action requires the caller to know what they're doing. The Guarded layer makes this explicit via the STAX_INTENT=<verb> env var. The Raw layer trusts the caller and logs the intent post-hoc (because tests and audit harnesses need to call directly).
- System resource caps: pointer operations require ydotoold to be running with uinput access. The installer (~/.config/nix-config/bin/install-ydotoold-udev.sh) ships the udev rule + systemd-user unit; before install the pointer verbs return a clean FileNotFound error rather than silently failing.

The doctor subcommand asserts both at runtime:

```sh
$ agent-app-control doctor | jq '.session, .pointer_backends'
```

This project is the load-bearing closure for three open R-items:
- R3: Wayland pointer control conditional on a daemon that wasn't running. Closed by the udev rule + systemd-user unit + modprobe reload shipped 2026-05-15.
- R7: Shell-eval launcher pattern (eval setsid $CMD). Closed by the aac-launch Zig binary that parses .desktop Exec lines into argv. The Raw layer now delegates to aac-launch for any launch path.
- R8: Mutating actions weren't auditable. Closed by isMutatingCommand() + logIntent() writing a JSONL trace line for every mutating subcommand BEFORE execution.

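
The R7 closure replaces shell evaluation with argv parsing. The sketch below shows the general idea in portable shell: it splits an Exec value into arguments without ever passing the string to `eval`. It is a simplified stand-in, not aac-launch itself. The function name is hypothetical, only spaces, double quotes, backslash escapes inside quotes, and `%%` are handled, and field codes such as `%f` or `%U` are simply dropped.

```shell
# Split a .desktop Exec value into argv, printing one argument per line.
exec_to_argv() {
  local rest="$1" c="" arg="" in_quotes=0 have_arg=0
  while [ -n "$rest" ]; do
    c=${rest%"${rest#?}"}   # first character
    rest=${rest#?}          # remainder
    if [ "$in_quotes" -eq 1 ]; then
      case "$c" in
        '"') in_quotes=0 ;;
        '\') # the character after a backslash is taken literally
             c=${rest%"${rest#?}"}; rest=${rest#?}; arg="$arg$c" ;;
        *)   arg="$arg$c" ;;
      esac
      continue
    fi
    case "$c" in
      ' ')
        [ "$have_arg" -eq 1 ] && printf '%s\n' "$arg"
        arg=""; have_arg=0 ;;
      '"') in_quotes=1; have_arg=1 ;;
      '%') # "%%" is a literal percent; any other field code is dropped
        c=${rest%"${rest#?}"}; rest=${rest#?}
        if [ "$c" = "%" ]; then arg="$arg%"; have_arg=1; fi ;;
      *) arg="$arg$c"; have_arg=1 ;;
    esac
  done
  [ "$have_arg" -eq 1 ] && printf '%s\n' "$arg"
  return 0
}
```

Text such as `$(id)` or backticks comes out as ordinary argument characters because nothing here is evaluated. In Bash the lines can be collected with `mapfile -t argv < <(exec_to_argv "$line")` and run as `"${argv[@]}"`.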
This project is currently vendored inside ~/.config/nix-config/. The
Home Manager wrapper installs all three layers:

```sh
home-manager switch   # installs ~/.local/bin/agent-app-control        (Guarded)
                      #          ~/.local/bin/agent-app-control-raw    (Raw)
                      #          ~/.local/bin/agent-app-control-parity (parity)
```

Standalone build:

```sh
zig build -Doptimize=ReleaseFast
./zig-out/bin/agent-app-control help
```

Requires Zig 0.16.0 and libc.
Runtime (each verified by doctor):
- hyprctl (Hyprland IPC)
- wtype (keyboard input, no daemon)
- wl-copy / wl-paste (clipboard)
- grim / slurp / satty (screenshot)
- ydotool + ydotoold (mouse/key; requires daemon + uinput)
- playerctl / wpctl / pactl (media + audio)
- walker / uwsm-app / xdg-terminal-exec (launch)
- gtk-launch / xdg-open (desktop entries / URL handling)
- jq (JSON for some glue)

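
For a quick check outside of doctor, a PATH lookup over the list above can be sketched as follows. The function name is hypothetical, and the check only tests whether each tool is installed; it does not confirm that ydotoold is running or that uinput is accessible, which doctor is described as covering.

```shell
# Print the names from the argument list that are not on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
  return 0
}
```

For example, `missing_tools hyprctl wtype wl-copy wl-paste grim slurp ydotool ydotoold playerctl jq` prints nothing when every listed tool is installed.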
```sh
zig build test --summary all          # unit tests
agent-app-control-parity              # Zig vs Bash reference parity
agent-app-control doctor              # runtime self-check
stax-experiment-parity --cross-arch   # cross-arch Zig binary parity
```

Test coverage focuses on:
- isMutatingCommand() enumeration (every mutating verb is tagged)
- .desktop parser invariants
- JSON output schema stability
- Error-code mapping consistency
- Trace-line write ordering (intent BEFORE exec)

Three boundaries the substrate enforces:
- Argv-safety: every .desktop Exec line goes through aac-launch, which tokenizes per the FreeDesktop spec, with no shell expansion of $foo, backticks, or command substitution.
- URL whitelist: launch-web and open reject non-http/https URLs and paths outside ~/ or /tmp.
- Trace before exec: every mutating action leaves an audit line before the action fires. Forensics can reconstruct the intent chain even when the action succeeds.

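
The URL and path rule above can be approximated in a few lines of shell. This is a sketch under stated assumptions, not the binary's actual check: the function name is hypothetical, paths are compared lexically (symlinks are not resolved, and a literal `~` is not expanded), and any `..` component is rejected outright.

```shell
# Succeed if TARGET is an http/https URL or an absolute path under $HOME or /tmp.
is_allowed_target() {
  local target="$1"
  local home="${HOME:-/nonexistent}"   # avoid an empty prefix matching everything
  case "$target" in
    http://?*|https://?*) return 0 ;;
    *://*) return 1 ;;                 # any other URL scheme
  esac
  case "/$target/" in
    */../*) return 1 ;;                # no parent-directory traversal
  esac
  case "$target" in
    "$home"/*|/tmp/*) return 0 ;;
  esac
  return 1
}
```

Under these rules `https://example.org/` and `/tmp/shot.png` pass, while `file:///etc/passwd` and `/tmp/../etc/passwd` are rejected.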
Report security issues to the operator-private channel; do not file as public GitHub issues until coordinated.
This is a workstation-private project at the time of writing. The
public-mirror move is gated by the merchant-lens question: does
agent-app-control own a bottleneck downstream tools must route through?
At the time of writing (2026-05-15) it does NOT — it's the substrate
the stax workstation uses on itself. External use is welcome but
unsupported until that changes.
AGPL-3.0-or-later. See LICENSE.
- SMC17/aac-launch: the argv-safe .desktop Exec launcher this project delegates launch operations to.
- SMC17/stax-experiment: the experiment register that audits this project's own claims.
- HARNESS_AUDIT.md: the risk register that names R3/R7/R8/R10/R11 etc. as the items this substrate is required to close.