agent-app-control

Full docs: docs/ — Quickstart · Subcommand reference · Trace schema · Architecture · Threat model.

Single-binary Zig CLI for observe-and-control of a Linux desktop at the process / window / input level. The substrate every agent on this workstation uses to drive Hyprland, launch .desktop apps, type keystrokes, click, take screenshots, and copy/paste — without shell-eval anywhere in the hot path.

Closes HARNESS_AUDIT R3 (ydotool/uinput daemon), R7 (argv-safe launcher via aac-launch), and R8 (intent-traced mutating actions).

Why this exists

DeepMind, Anthropic, and OpenAI shipped vision/planning models for computer-use in 2024–25. The control layer those models drive — the thing that actually moves the mouse, types the text, focuses the window — remained a patchwork of shell scripts wrapping wtype / ydotool / hyprctl / xdotool. Every wrapper had eval setsid $CMD somewhere in it. Every wrapper logged nothing.

agent-app-control is the Zig replacement. One static binary, all the verbs, intent-traced before execution, argv-safe at the spawn boundary. Vision model talks to it over a single JSON-friendly CLI surface.

Architecture

Three layers, three trust postures:

Raw layer (agent-app-control-raw — formerly agent-app-control-zig): the single binary. Every subcommand. No policy. Used directly when the caller has already established intent (audit harnesses, test rigs).
Guarded layer (agent-app-control): a Bash wrapper that enforces the capability/intent model — refuses mutating actions without STAX_INTENT=<verb>, signs decisions into ~/.local/state/staxctl/traces.jsonl. The default user-facing entry.
Parity layer (agent-app-control-parity): a regression-test harness that runs the Zig binary against the original Bash reference and asserts schema/stats agreement.

The boundary between layers is the JSON contract emitted by the Raw layer. The Guarded layer never re-implements logic; it only gates and logs.

Subcommands

Read-only (no trace, no intent required)

subcommand	output	purpose
`help` / `--help` / `-h`	text	usage
`version` / `--version`	text	`agent-app-control-zig 0.4.0`
`contract`	JSON	machine-readable contract (capability ids, allowed verbs)
`doctor`	JSON	environment health (session, deps, pointer backends)
`list`	TSV	visible `.desktop` applications
`list-json`	JSON	same, structured
`app-info ID`	JSON	one `.desktop` file parsed
`clients`	JSON	Hyprland windows (via `hyprctl clients -j`)
`active`	JSON	focused window
`workspaces`	JSON	Hyprland workspaces
`monitors`	JSON	Hyprland monitors
`qsr`	JSON	Quasi-Sovereign-Reflex graph (Phase 4f)

Mutating (intent-traced via `logIntent` → `traces.jsonl` before exec)

subcommand	required args	what it does
`focus PATTERN`	class/title regex	focus window by class/title
`workspace NAME`	workspace id/name	switch workspace
`close-active`	—	close focused window
`fullscreen`	—	toggle fullscreen on focused window
`launch-desktop ID`	`.desktop` id	launch via `gtk-launch`
`launch-web URL`	http/https URL	launch as Chromium PWA
`open TARGET`	path or URL	`xdg-open` (sandbox-checked)
`dispatch NAME ARGS`	hyprctl dispatcher	raw `hyprctl dispatch`
`type TEXT`	text	keyboard via `wtype` (no ydotoold needed)
`move-pointer X Y`	coordinates	mouse via `ydotool` (requires ydotoold)
`click [BTN]`	button name	mouse click via `ydotool`
`key KEYCODE[:STATE]`	keycode + state	keyboard via `ydotool`
`drag X1 Y1 X2 Y2`	coords	mouse drag
`copy TEXT`	text	`wl-copy` to clipboard
`clipboard-read` / `paste`	—	`wl-paste` from clipboard
`screenshot [PATH]`	optional path	full-screen via `grim`
`screenshot-region [--geometry GEOM] [PATH]`	optional geom	region via `grim`+`slurp`
`media ACTION`	playerctl action	media control
`audio [status]`	—	`wpctl`/`pactl`

Every mutating action writes one line to ~/.local/state/staxctl/traces.jsonl before spawning the underlying tool. Line shape:

{"ts_ms": 1778850120697, "layer": "raw", "intent": "screenshot", "target": "/tmp/r8-test.png"}

The Guarded layer adds a second trace line with capability id, intent flag, and decision (executed / denied_action_not_allowed / denied_missing_intent).

Intent + trust model

The substrate has two trust boundaries:

Caller intent: every mutating action requires the caller to know what they're doing. The Guarded layer makes this explicit via STAX_INTENT=<verb> env var. The Raw layer trusts the caller and logs the intent post-hoc (because tests and audit harnesses need to call directly).
System resource caps: pointer operations require ydotoold to be running with uinput access. The installer (~/.config/nix-config/bin/install-ydotoold-udev.sh) ships the udev rule + systemd-user unit; before install the pointer verbs return a clean FileNotFound error rather than silently failing.

The doctor subcommand asserts both at runtime:

$ agent-app-control doctor | jq '.session, .pointer_backends'

Lineage (HARNESS_AUDIT closures)

This project is the load-bearing closure for three open R-items:

R3 — Wayland pointer control conditional on a daemon that wasn't running. Closed by the udev rule + systemd-user unit + modprobe reload shipped 2026-05-15.
R7 — Shell-eval launcher pattern (eval setsid $CMD). Closed by the aac-launch Zig binary that parses .desktop Exec lines into argv. The Raw layer now delegates to aac-launch for any launch path.
R8 — Mutating actions weren't auditable. Closed by isMutatingCommand() + logIntent() writing a JSONL trace line for every mutating subcommand BEFORE execution.

Install

This project is currently vendored inside ~/.config/nix-config/. The Home Manager wrapper installs both layers:

home-manager switch  # installs ~/.local/bin/agent-app-control (Guarded)
                     #          ~/.local/bin/agent-app-control-raw (Raw)
                     #          ~/.local/bin/agent-app-control-parity (parity)

Standalone build:

zig build -Doptimize=ReleaseFast
./zig-out/bin/agent-app-control help

Requires Zig 0.16.0 and libc.

Dependencies

Runtime (each verified by doctor):

hyprctl (Hyprland IPC)
wtype (keyboard input — no daemon)
wl-copy / wl-paste (clipboard)
grim / slurp / satty (screenshot)
ydotool + ydotoold (mouse/key — requires daemon + uinput)
playerctl / wpctl / pactl (media + audio)
walker / uwsm-app / xdg-terminal-exec (launch)
gtk-launch / xdg-open (desktop entries / URL handling)
jq (JSON for some glue)

Testing

zig build test --summary all       # unit tests
agent-app-control-parity           # Zig vs Bash reference parity
agent-app-control doctor           # runtime self-check
stax-experiment-parity --cross-arch # cross-arch Zig binary parity

Test coverage focuses on:

isMutatingCommand() enumeration (every mutating verb is tagged)
.desktop parser invariants
JSON output schema stability
Error-code mapping consistency
Trace-line write ordering (intent BEFORE exec)

Security

Three boundaries the substrate enforces:

Argv-safety: every .desktop Exec line goes through aac-launch which tokenizes per the FreeDesktop spec — no shell expansion of $foo, backticks, or command substitution.
URL whitelist: launch-web and open reject non-http/https URLs and paths outside ~/ or /tmp.
Trace before exec: every mutating action leaves an audit line before the action fires. Forensics can reconstruct the intent chain even when the action succeeds.

Report security issues to the operator-private channel; do not file as public GitHub issues until coordinated.

Contributing

This is a workstation-private project at the time of writing. The public-mirror move is gated by the merchant-lens question: does agent-app-control own a bottleneck downstream tools must route through? At the time of writing (2026-05-15) it does NOT — it's the substrate the stax workstation uses on itself. External use is welcome but unsupported until that changes.

License

AGPL-3.0-or-later. See LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
.github		.github
bench		bench
bindings/python		bindings/python
demos/llm-driving-desktop		demos/llm-driving-desktop
docs		docs
src		src
.gitignore		.gitignore
API_STABILITY.md		API_STABILITY.md
BENCH.md		BENCH.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
build.zig		build.zig
build.zig.zon		build.zig.zon
cosign.pub		cosign.pub
release-signing.gpg.pub		release-signing.gpg.pub

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

agent-app-control

Why this exists

Architecture

Subcommands

Read-only (no trace, no intent required)

Mutating (intent-traced via `logIntent` → `traces.jsonl` before exec)

Intent + trust model

Lineage (HARNESS_AUDIT closures)

Install

Dependencies

Testing

Security

Contributing

License

See also

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

agent-app-control

Why this exists

Architecture

Subcommands

Read-only (no trace, no intent required)

Mutating (intent-traced via logIntent → traces.jsonl before exec)

Intent + trust model

Lineage (HARNESS_AUDIT closures)

Install

Dependencies

Testing

Security

Contributing

License

See also

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Mutating (intent-traced via `logIntent` → `traces.jsonl` before exec)

Packages