Epic: Add Visual Companion as an Ouroboros-native design decision bridge
Summary
Add visual-companion to the Ouroboros plugin ecosystem as a first-class AgentOS capability for visual UI/UX decision making.
The goal is to give agents and users a browser-backed interaction surface for design questions: an agent can show a concrete screen, component state, layout choice, or interaction prompt, then recover the user's click/form response as structured JSON evidence for the next workflow step.
This is not about replacing design tools, building a general website generator, or making Ouroboros own frontend rendering. It is about closing a common specification gap: users are often asked to make UI/UX decisions through words alone, even when the meaning of those words depends on seeing the actual behavior.
In short:
Let Ouroboros ask visual questions visually, and recover the answer as auditable workflow input.
Problem
Today, design-oriented decisions often happen through text-only prompts. That works poorly when the user does not already share the same frontend vocabulary as the agent.
Common failure cases:
- A user is asked to choose between UI/UX terms such as layout density, hover behavior, selection state, drawer behavior, card rhythm, spacing, responsive behavior, or visual hierarchy, but cannot predict how those terms will behave in the actual interface.
- Non-designers, frontend beginners, and vibe coders may approve a textual description that sounds right but behaves differently once implemented.
- Agents may continue from an ambiguous design answer and encode the wrong interaction model into the implementation.
- Review happens late, after code has already been written, instead of during the decision point where the ambiguity first appears.
This is directly aligned with the Ouroboros thesis: many AI coding failures start at unclear input. For UI/UX work, the unclear input is often visual, not just verbal.
Product goal
Make design decisions inspectable before they become implementation commitments.
A user should be able to say, in effect:
ooo visual-companion start
ooo visual-companion show --html screen.html
ooo visual-companion wait
and get an Ouroboros-native decision product:
A browser screen was shown
The user clicked or submitted a choice
The response was recovered as JSON
A handoff/audit artifact was emitted
A downstream workflow can continue from the visual answer
Source capability
The initial source capability is the existing visual-companion skill surface, copied into an installable Ouroboros plugin rather than referenced from an external skill path.
The plugin should preserve the skill's useful runtime behavior:
- local browser server,
- agent-authored HTML screens,
- click/form event capture,
- pending-event recovery,
- explicit session start/stop lifecycle,
- durable state and handoff files.
Non-goals
This epic must not turn ouroboros-plugins into a UI framework or website generator.
Out of scope for the first implementation:
- Generating HTML or designing screens automatically inside the plugin.
- Replacing human design judgment.
- Replacing Figma, Storybook, Playwright, or browser automation tools.
- Running remote browser infrastructure by default.
- Uploading user screens, prompts, or interaction data to external services.
- Treating screenshots alone as sufficient; the key capability is interactive visual I/O.
- Writing runtime artifacts into installed plugin homes.
Desired plugin shape
Initial plugin candidate:
plugins/visual-companion/
  ouroboros.plugin.json
  README.md
  assets/
    skill/
      SKILL.md
      scripts/
        server.cjs
        wait-for-event.cjs
        read-pending-event.cjs
        ...
  visual_companion_plugin/
    __init__.py
    __main__.py
  tests/
    test_visual_companion_plugin.py
The name is intentionally capability-oriented. visual-companion should mean a companion surface for visual decisions, not a general browser automation wrapper.
UX preservation contract
The plugin must preserve the mental model of a visual question I/O bridge.
Preserve agent-authored screens
The plugin should not decide what the UI question should look like. Agents and workflows author the HTML screen, and the plugin serves it.
Preserve interactive answers
The plugin should recover user choices as structured events, not only screenshots or prose summaries.
Preserve local-first behavior
The default path should run on localhost and write artifacts to an Ouroboros-controlled project/output directory, not into the trusted installed plugin home.
AgentOS-native translation
The plugin must translate the visual interaction into Ouroboros primitives rather than merely launch a browser server.
Expected capabilities:
ledger:write — record visual question lifecycle and result status.
state:write — persist session metadata and event recovery state.
provenance:write — record bundled skill source, command shape, and generated artifacts.
runtime:execute — start and stop the local Node browser server.
handoff:attach — attach browser URL, screen files, event payloads, diagnostics, and result metadata.
progress:write — report lifecycle and wait/read progress.
Expected permissions:
filesystem:read — read bundled scripts, input HTML, and session state.
filesystem:write — write screen files, session metadata, event state, diagnostics, and handoff artifacts.
shell:execute — execute bundled Node scripts for local serving and event waiting.
Command plan
start
ooo visual-companion start --project-dir <path>
Responsibilities:
- create a local visual companion session,
- start the bundled browser server,
- return browser URL and public session paths,
- keep private event token out of public stdout/handoff,
- write recoverable session state.
Risk: write
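The split between recoverable session state and public output could look roughly like the following. This is a minimal sketch, not the plugin's actual code: the function name `write_session_state` and the field name `event_token` are hypothetical, and storing the token inside `adapter-session.json` is an assumption made only for illustration.

```python
import json
import secrets
from pathlib import Path

# Hypothetical field name; the real session schema may differ.
PRIVATE_KEYS = {"event_token"}


def write_session_state(state_dir: Path, url: str) -> dict:
    """Persist the full session record and return only its public view."""
    state_dir.mkdir(parents=True, exist_ok=True)
    session = {
        "url": url,
        "state_dir": str(state_dir),
        "event_token": secrets.token_urlsafe(16),
    }
    # The full record, including the private token, stays on disk for recovery.
    (state_dir / "adapter-session.json").write_text(
        json.dumps(session, indent=2), encoding="utf-8"
    )
    # Only non-private fields are safe for stdout and handoff artifacts.
    return {key: value for key, value in session.items() if key not in PRIVATE_KEYS}
```

The caller would print or attach only the returned dictionary, so the token never reaches public output by accident.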
show
ooo visual-companion show --html <path> --state-dir <path>
Responsibilities:
- copy an agent-authored HTML screen into the active session,
- avoid overwriting prior screens,
- return the served screen URL and artifact path.
Risk: write
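Non-overwriting publication can be sketched as below. The helper name `publish_screen` is hypothetical, and the `screen.html`, `screen-1.html` naming simply follows the artifact layout shown in this epic. The sketch assumes a single writer per session and does not guard against concurrent publishers.

```python
import shutil
from pathlib import Path


def publish_screen(html_path: Path, content_dir: Path) -> Path:
    """Copy a screen into the session content directory without overwriting."""
    content_dir.mkdir(parents=True, exist_ok=True)
    candidate = content_dir / "screen.html"
    index = 0
    # Walk screen.html, screen-1.html, screen-2.html, ... until a free name is found.
    while candidate.exists():
        index += 1
        candidate = content_dir / f"screen-{index}.html"
    shutil.copyfile(html_path, candidate)
    return candidate
```

Because earlier screens are never replaced, the sequence of questions shown to the user remains inspectable after the session ends.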
wait
ooo visual-companion wait --state-dir <path> --timeout-ms <n>
Responsibilities:
- wait for the next click/form event,
- return the event as structured JSON,
- optionally clear pending event state.
Risk: read_only
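The wait behavior can be illustrated with a small polling loop. This is a Python stand-in for illustration only; the plan above delegates waiting to the bundled Node scripts. The function name `wait_for_event`, the `poll_ms` and `clear` parameters, and the choice to report a timeout as a blocked result are all assumptions. It also assumes the server writes `pending-event.json` atomically.

```python
import json
import time
from pathlib import Path


def wait_for_event(
    state_dir: Path, timeout_ms: int, poll_ms: int = 100, clear: bool = False
) -> dict:
    """Poll for pending-event.json until it appears or the timeout expires."""
    pending = state_dir / "pending-event.json"
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        if pending.exists():
            event = json.loads(pending.read_text(encoding="utf-8"))
            if clear:
                # Optionally consume the event so a later wait sees only new input.
                pending.unlink()
            return {"status": "completed", "event": event}
        if time.monotonic() >= deadline:
            return {"status": "blocked", "reason": "timeout"}
        time.sleep(poll_ms / 1000)
```

Using `time.monotonic()` keeps the timeout stable even if the system clock is adjusted while the user is deciding.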
read
ooo visual-companion read --state-dir <path>
Responsibilities:
- recover the latest pending event without blocking,
- represent missing state as an explicit blocked result.
Risk: read_only
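A non-blocking read that reports missing state explicitly might look like this. It is a sketch under assumptions: the name `read_pending_event`, the `reason` strings, and the decision to treat an unparseable event file as a failed result are illustrative, not taken from the source skill.

```python
import json
from pathlib import Path


def read_pending_event(state_dir: Path) -> dict:
    """Return the latest pending event, or an explicit blocked/failed result."""
    if not state_dir.is_dir():
        return {"status": "blocked", "reason": "state directory not found"}
    pending = state_dir / "pending-event.json"
    if not pending.is_file():
        return {"status": "blocked", "reason": "no pending event"}
    try:
        event = json.loads(pending.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        # Corrupt state is a failure, not a missing answer.
        return {"status": "failed", "reason": f"unreadable event: {exc.msg}"}
    return {"status": "completed", "event": event}
```

Returning a result object instead of raising lets a downstream workflow branch on `status` without parsing error text.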
stop
ooo visual-companion stop --session-dir <path>
Responsibilities:
- stop the spawned local server,
- record lifecycle result,
- leave session artifacts inspectable.
Risk: write
Artifact contract
A successful visual session should produce a bounded artifact layout similar to:
<output-root>/.brainstorm/<session-id>/
  content/
    screen.html
    screen-1.html
  state/
    adapter-session.json
    server-info
    server.log
    pending-event.json
<output-root>/.omx/handoffs/visual-companion/
  <run-id>-start.json
  <run-id>-show.json
  <run-id>-wait.json
  <run-id>-read.json
  <run-id>-stop.json
Runtime artifacts must not be written into installed plugin homes because plugin homes are trust subjects.
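One way to enforce this boundary is to derive every artifact path from the output root and reject roots that fall inside the plugin home. The sketch below is an assumption about how such a guard could be written; `session_paths` and its parameters are hypothetical names, and only the directory names come from the layout above.

```python
from pathlib import Path


def session_paths(output_root: Path, session_id: str, plugin_home: Path) -> dict:
    """Build the artifact layout and refuse output roots inside the plugin home."""
    root = output_root.resolve()
    home = plugin_home.resolve()
    # Reject the plugin home itself and any directory nested beneath it.
    if root == home or home in root.parents:
        raise ValueError(f"output root {root} is inside plugin home {home}")
    session = root / ".brainstorm" / session_id
    return {
        "content": session / "content",
        "state": session / "state",
        "handoffs": root / ".omx" / "handoffs" / "visual-companion",
    }
```

Resolving both paths first means a symlinked or relative output root cannot bypass the check.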
Implementation phases
Phase 0 — Contract and boundary alignment
- Confirm the plugin fits the current manifest schema.
- Document why this is an AgentOS visual I/O capability, not browser automation.
- Preserve the plugin-home trust boundary described by the Superpowers artifact fix.
Phase 1 — Reference plugin skeleton
- Add plugins/visual-companion/ouroboros.plugin.json.
- Add README with product boundary, commands, permissions, and non-goals.
- Copy the source visual-companion skill assets into the plugin package.
- Add catalog entry.
Phase 2 — Adapter command surface
- Implement start, show, wait, read, and stop.
- Normalize completed/blocked/failed JSON result shapes.
- Keep private browser event token out of public output.
- Write handoff artifacts for every command.
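A normalized result shape and per-command handoff file could be sketched as follows. The field names `command`, `status`, and `details` are hypothetical; only the three status values and the `<run-id>-<command>.json` naming come from this epic.

```python
import json
from pathlib import Path

STATUSES = {"completed", "blocked", "failed"}


def make_result(command: str, status: str, **details) -> dict:
    """Build a result dictionary with one shared shape for every command."""
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status}")
    return {"command": command, "status": status, "details": details}


def write_handoff(handoff_dir: Path, run_id: str, result: dict) -> Path:
    """Write the result as <run-id>-<command>.json and return the file path."""
    handoff_dir.mkdir(parents=True, exist_ok=True)
    path = handoff_dir / f"{run_id}-{result['command']}.json"
    path.write_text(json.dumps(result, indent=2, sort_keys=True), encoding="utf-8")
    return path
```

With a single constructor for results, a blocked `read` and a completed `wait` differ only in `status` and `details`, which keeps downstream parsing uniform.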
Phase 3 — Tests and validation
- Unit-test manifest shape.
- Unit-test command parser/help surface.
- Unit-test blocked handoff behavior.
- Unit-test non-overwriting screen publication.
- Run a Node-backed lifecycle smoke test that starts the server, publishes HTML, posts a browser event, reads/waits the event, and stops the server.
- Run repository contract validation.
Phase 4 — Future expansion
- Add richer event schemas for form, selection, and multi-choice visual prompts.
- Add optional screenshot attachment only as evidence, not as the primary answer channel.
- Add higher-level workflow helpers after the primitive bridge is stable.
Acceptance criteria
This epic is complete when:
- A visual-companion plugin exists under plugins/visual-companion/ with a valid ouroboros.plugin.json.
- The source skill assets are copied into the plugin package.
- The plugin exposes start, show, wait, read, and stop commands.
- The plugin serves agent-authored HTML through a local browser server.
- Browser click/form events are recoverable as structured JSON.
- Private event tokens are not exposed in public stdout or handoff artifacts.
- Session state and handoff artifacts are durable and inspectable.
- Runtime artifacts are written outside installed plugin homes.
- Failure and blocked states are explicit.
- scripts/validate_contract.py passes.
- Focused unit/lifecycle tests pass.
- The README explains why this is a visual decision I/O bridge rather than a UI generator or generic browser wrapper.