agent-app-control

Full docs: docs/Quickstart · Subcommand reference · Trace schema · Architecture · Threat model.

Single-binary Zig CLI for observe-and-control of a Linux desktop at the process / window / input level. The substrate every agent on this workstation uses to drive Hyprland, launch .desktop apps, type keystrokes, click, take screenshots, and copy/paste — without shell-eval anywhere in the hot path.

Closes HARNESS_AUDIT R3 (ydotool/uinput daemon), R7 (argv-safe launcher via aac-launch), and R8 (intent-traced mutating actions).

Why this exists

DeepMind, Anthropic, and OpenAI shipped vision/planning models for computer use in 2024–25. The control layer those models drive (the thing that actually moves the mouse, types the text, focuses the window) remained a patchwork of shell scripts wrapping wtype / ydotool / hyprctl / xdotool. Every wrapper had eval setsid $CMD somewhere in it, and none of them logged anything.

agent-app-control is the Zig replacement: one static binary with all the verbs, intent-traced before execution and argv-safe at the spawn boundary. A vision model talks to it over a single JSON-friendly CLI surface.

Architecture

Three layers, three trust postures:

  1. Raw layer (agent-app-control-raw — formerly agent-app-control-zig): the single binary. Every subcommand. No policy. Used directly when the caller has already established intent (audit harnesses, test rigs).
  2. Guarded layer (agent-app-control): a Bash wrapper that enforces the capability/intent model — refuses mutating actions without STAX_INTENT=<verb>, signs decisions into ~/.local/state/staxctl/traces.jsonl. The default user-facing entry.
  3. Parity layer (agent-app-control-parity): a regression-test harness that runs the Zig binary against the original Bash reference and asserts schema/stats agreement.

The boundary between layers is the JSON contract emitted by the Raw layer. The Guarded layer never re-implements logic; it only gates and logs.

Subcommands

Read-only (no trace, no intent required)

subcommand            output  purpose
help / --help / -h    text    usage
version / --version   text    agent-app-control-zig 0.4.0
contract              JSON    machine-readable contract (capability ids, allowed verbs)
doctor                JSON    environment health (session, deps, pointer backends)
list                  TSV     visible .desktop applications
list-json             JSON    same, structured
app-info ID           JSON    one .desktop file parsed
clients               JSON    Hyprland windows (via hyprctl clients -j)
active                JSON    focused window
workspaces            JSON    Hyprland workspaces
monitors              JSON    Hyprland monitors
qsr                   JSON    Quasi-Sovereign-Reflex graph (Phase 4f)

Mutating (intent-traced via logIntent to traces.jsonl before exec)

subcommand                                  required args       what it does
focus PATTERN                               class/title regex   focus window by class/title
workspace NAME                              workspace id/name   switch workspace
close-active                                (none)              close focused window
fullscreen                                  (none)              toggle fullscreen on focused window
launch-desktop ID                           .desktop id         launch via gtk-launch
launch-web URL                              http/https URL      launch as Chromium PWA
open TARGET                                 path or URL         xdg-open (sandbox-checked)
dispatch NAME ARGS                          hyprctl dispatcher  raw hyprctl dispatch
type TEXT                                   text                keyboard via wtype (no ydotoold needed)
move-pointer X Y                            coordinates         mouse via ydotool (requires ydotoold)
click [BTN]                                 button name         mouse click via ydotool
key KEYCODE[:STATE]                         keycode + state     keyboard via ydotool
drag X1 Y1 X2 Y2                            coords              mouse drag
copy TEXT                                   text                wl-copy to clipboard
clipboard-read / paste                      (none)              wl-paste from clipboard
screenshot [PATH]                           optional path       full-screen via grim
screenshot-region [--geometry GEOM] [PATH]  optional geom       region via grim+slurp
media ACTION                                playerctl action    media control
audio [status]                              (none)              wpctl/pactl

Every mutating action writes one line to ~/.local/state/staxctl/traces.jsonl before spawning the underlying tool. Line shape:

{"ts_ms": 1778850120697, "layer": "raw", "intent": "screenshot", "target": "/tmp/r8-test.png"}

The Guarded layer adds a second trace line with capability id, intent flag, and decision (executed / denied_action_not_allowed / denied_missing_intent).
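For readers who want to consume the trace file from a script, the following is a minimal sketch of a JSONL reader. It assumes only the Raw-layer line shape shown above; the helper names parse_traces and intents_by_layer are hypothetical and not part of this project. Lines that fail to parse are skipped on the assumption that an interrupted writer can leave a torn final line.

```python
import json
from typing import Iterable, Iterator, List


def parse_traces(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per valid JSONL line; skip blank or malformed lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: a torn or partial line is ignored, not fatal
        if isinstance(record, dict):
            yield record


def intents_by_layer(records: Iterable[dict], layer: str) -> List[str]:
    """Return the intents logged by one layer, in file order."""
    return [r["intent"] for r in records if r.get("layer") == layer and "intent" in r]


# The first sample line is the one documented above; the others are
# deliberately blank and truncated to exercise the skip paths.
sample = [
    '{"ts_ms": 1778850120697, "layer": "raw", "intent": "screenshot", "target": "/tmp/r8-test.png"}',
    "",
    '{"ts_ms": 17788',
]
records = list(parse_traces(sample))
print(intents_by_layer(records, "raw"))  # prints ['screenshot']
```

In practice the same functions could be fed an open file handle on ~/.local/state/staxctl/traces.jsonl instead of the in-memory sample.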

Intent + trust model

The substrate has two trust boundaries:

  1. Caller intent: every mutating action requires the caller to know what they're doing. The Guarded layer makes this explicit via the STAX_INTENT=<verb> environment variable. The Raw layer trusts the caller: it logs the intent but does not gate on it, because tests and audit harnesses need to call it directly.
  2. System resource caps: pointer operations require ydotoold to be running with uinput access. The installer (~/.config/nix-config/bin/install-ydotoold-udev.sh) ships the udev rule + systemd-user unit; before install the pointer verbs return a clean FileNotFound error rather than silently failing.
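The caller-intent gate can be sketched as a small decision function. This is an illustration in Python, not the Bash wrapper itself. The three decision strings are the ones listed for the Guarded trace line. Everything else is an assumption: the order of the checks, the rule that STAX_INTENT must equal the verb, the function name decide, and the way the allowed set is passed in.

```python
from typing import AbstractSet, Mapping

# Subset of the mutating verbs from the subcommand table, for illustration only.
MUTATING_VERBS = frozenset(
    {"focus", "workspace", "launch-desktop", "type", "click", "screenshot"}
)


def decide(verb: str, env: Mapping[str, str], allowed: AbstractSet[str]) -> str:
    """Return the decision a guarded wrapper might record for one invocation."""
    if verb not in MUTATING_VERBS:
        # Read-only subcommands need no intent and are not gated.
        return "executed"
    if verb not in allowed:
        # Assumption: the capability check runs before the intent check.
        return "denied_action_not_allowed"
    if env.get("STAX_INTENT") != verb:
        # Assumption: a mismatched intent is treated the same as a missing one.
        return "denied_missing_intent"
    return "executed"


allowed = {"screenshot", "type"}
print(decide("screenshot", {"STAX_INTENT": "screenshot"}, allowed))  # prints executed
print(decide("screenshot", {}, allowed))  # prints denied_missing_intent
print(decide("click", {"STAX_INTENT": "click"}, allowed))  # prints denied_action_not_allowed
```

Keeping the decision a pure function of verb, environment, and allowed set matches the stated design: the gate decides and logs, and the Raw layer does the work.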

The doctor subcommand asserts both at runtime:

$ agent-app-control doctor | jq '.session, .pointer_backends'

Lineage (HARNESS_AUDIT closures)

This project is the load-bearing closure for three open R-items:

  • R3 — Wayland pointer control conditional on a daemon that wasn't running. Closed by the udev rule + systemd-user unit + modprobe reload shipped 2026-05-15.
  • R7 — Shell-eval launcher pattern (eval setsid $CMD). Closed by the aac-launch Zig binary that parses .desktop Exec lines into argv. The Raw layer now delegates to aac-launch for any launch path.
  • R8 — Mutating actions weren't auditable. Closed by isMutatingCommand() + logIntent() writing a JSONL trace line for every mutating subcommand BEFORE execution.
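To make the R7 closure concrete, here is a rough Python approximation of turning a .desktop Exec line into an argv list without involving a shell. It is not the aac-launch parser. It leans on shlex.split, whose quoting rules are close to but not identical to the Desktop Entry specification, and it simply drops field codes such as %u where a real launcher would substitute files or URLs. The name exec_to_argv is hypothetical.

```python
import shlex
from typing import List

# Field codes from the Desktop Entry specification, including deprecated ones.
FIELD_CODES = frozenset(
    {"%f", "%F", "%u", "%U", "%d", "%D", "%n", "%N", "%i", "%c", "%k", "%v", "%m"}
)


def exec_to_argv(exec_line: str) -> List[str]:
    """Tokenize an Exec line into argv; nothing is ever expanded by a shell."""
    argv = []
    for token in shlex.split(exec_line):
        if token in FIELD_CODES:
            continue  # simplification: drop field codes instead of substituting
        argv.append(token.replace("%%", "%"))  # %% is a literal percent sign
    return argv


argv = exec_to_argv('firefox --name "My $HOME; rm -rf ~" %u')
print(argv)  # prints ['firefox', '--name', 'My $HOME; rm -rf ~']
```

Because the result is a list passed straight to a spawn call (for example subprocess.run(argv) with shell=False), the $HOME and the semicolon stay inert text inside one argument rather than being interpreted.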

Install

This project is currently vendored inside ~/.config/nix-config/. The Home Manager wrapper installs all three layers:

home-manager switch  # installs ~/.local/bin/agent-app-control (Guarded)
                     #          ~/.local/bin/agent-app-control-raw (Raw)
                     #          ~/.local/bin/agent-app-control-parity (parity)

Standalone build:

zig build -Doptimize=ReleaseFast
./zig-out/bin/agent-app-control help

Requires Zig 0.16.0 and libc.

Dependencies

Runtime (each verified by doctor):

  • hyprctl (Hyprland IPC)
  • wtype (keyboard input — no daemon)
  • wl-copy / wl-paste (clipboard)
  • grim / slurp / satty (screenshot)
  • ydotool + ydotoold (mouse/key — requires daemon + uinput)
  • playerctl / wpctl / pactl (media + audio)
  • walker / uwsm-app / xdg-terminal-exec (launch)
  • gtk-launch / xdg-open (desktop entries / URL handling)
  • jq (JSON for some glue)

Testing

zig build test --summary all       # unit tests
agent-app-control-parity           # Zig vs Bash reference parity
agent-app-control doctor           # runtime self-check
stax-experiment-parity --cross-arch # cross-arch Zig binary parity

Test coverage focuses on:

  • isMutatingCommand() enumeration (every mutating verb is tagged)
  • .desktop parser invariants
  • JSON output schema stability
  • Error-code mapping consistency
  • Trace-line write ordering (intent BEFORE exec)
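The trace-ordering property in the last bullet can be checked with a test shaped roughly like the one below. This is a Python stand-in, not the project's Zig test: log_intent and run_mutating are hypothetical analogues of logIntent() and the dispatch path. The fake action reads the trace file from inside the "exec" step and then fails, so the check passes only if the line was written first.

```python
import json
import os
import tempfile
import time
from typing import Callable, List


def log_intent(trace_path: str, intent: str, target: str) -> None:
    """Append one trace line and flush it to disk before returning."""
    record = {"ts_ms": int(time.time() * 1000), "layer": "raw",
              "intent": intent, "target": target}
    with open(trace_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())


def run_mutating(trace_path: str, intent: str, target: str,
                 action: Callable[[], None]) -> None:
    """Write the trace line first, then run the action."""
    log_intent(trace_path, intent, target)
    action()


def check_ordering() -> List[dict]:
    """Return the trace lines visible to the action at the moment it runs."""
    with tempfile.TemporaryDirectory() as tmp:
        trace_path = os.path.join(tmp, "traces.jsonl")
        seen: List[dict] = []

        def action() -> None:
            with open(trace_path, encoding="utf-8") as f:
                seen.extend(json.loads(line) for line in f)
            raise RuntimeError("simulated tool failure")

        try:
            run_mutating(trace_path, "screenshot", "/tmp/r8-test.png", action)
        except RuntimeError:
            pass  # the trace must survive a failing action
        return seen


seen = check_ordering()
print(len(seen), seen[0]["intent"])  # prints 1 screenshot
```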

Security

Three boundaries the substrate enforces:

  1. Argv-safety: every .desktop Exec line goes through aac-launch which tokenizes per the FreeDesktop spec — no shell expansion of $foo, backticks, or command substitution.
  2. URL whitelist: launch-web and open reject non-http/https URLs and paths outside ~/ or /tmp.
  3. Trace before exec: every mutating action leaves an audit line before the action fires. Forensics can reconstruct the intent chain even when the action fails or never returns.
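The second boundary, the URL and path allow-list, might look roughly like the sketch below. The rule itself (http/https only, paths under ~/ or /tmp) comes from the list above. The resolution strategy is an assumption: symlinks and ".." segments are resolved with os.path.realpath before comparing. The function names and the home parameter are hypothetical.

```python
import os
from typing import Optional
from urllib.parse import urlsplit


def is_allowed_url(url: str) -> bool:
    """Accept only http/https URLs that actually name a host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def is_allowed_path(path: str, home: Optional[str] = None) -> bool:
    """Accept paths that resolve to a location inside the home dir or /tmp."""
    home_root = os.path.realpath(home or os.path.expanduser("~"))
    tmp_root = os.path.realpath("/tmp")
    # Resolve ".." and symlinks first so "/tmp/../etc" cannot slip through.
    resolved = os.path.realpath(path)
    for root in (home_root, tmp_root):
        if os.path.commonpath([resolved, root]) == root:
            return True
    return False


print(is_allowed_url("https://example.com/app"))  # prints True
print(is_allowed_url("file:///etc/passwd"))  # prints False
print(is_allowed_path("/tmp/../etc/passwd", home="/home/user"))  # prints False
```

Comparing with os.path.commonpath rather than a string prefix avoids accepting sibling directories such as /tmpfoo.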

Report security issues to the operator-private channel; do not file as public GitHub issues until coordinated.

Contributing

This is a workstation-private project. The public-mirror move is gated by the merchant-lens question: does agent-app-control own a bottleneck downstream tools must route through? At the time of writing (2026-05-15) it does not; it is the substrate the stax workstation uses on itself. External use is welcome but unsupported until that changes.

License

AGPL-3.0-or-later. See LICENSE.

See also

  • SMC17/aac-launch — the argv-safe .desktop Exec launcher this project delegates launch operations to.
  • SMC17/stax-experiment — the experiment register that audits this project's own claims.
  • HARNESS_AUDIT.md — the risk register that names R3/R7/R8/R10/R11 etc. as the items this substrate is required to close.

About

Single-binary Zig CLI for observe-and-control of a Linux desktop. Closes HARNESS_AUDIT R3/R7/R8. AGPL-3.0. Workstation-private build (path deps to homelab + zig-canonical).
