Apra-Labs · kumaakh · May 28, 2026 · May 15, 2026 · May 15, 2026 · May 28, 2026
diff --git a/README.md b/README.md
@@ -233,6 +233,105 @@ reviewer  Opus 4.7        final review
 Provider strengths, role recommendations, and gotchas:
 [docs/provider-guide.md](docs/provider-guide.md).
 
+## Transport
+
+Fleet runs as a singleton service on your machine. Once started, the server
+listens on port 7523 by default, and multiple LLM clients (Claude Code, Gemini,
+Copilot, Codex) can connect concurrently to the same fleet instance.
+
+### HTTP+SSE Transport (default)
+
+By default, fleet uses the **HTTP+SSE transport** -- clients connect over HTTP and
+receive server-push notifications over Server-Sent Events (SSE).
+
+```bash
+apra-fleet                  # Start HTTP server (default)
+apra-fleet --transport http # Explicitly use HTTP
+```
+
+When the server starts, it writes a `server.json` file to `~/.apra-fleet/` containing:
+```json
+{
+  "pid": 12345,
+  "port": 7523,
+  "url": "http://localhost:7523/mcp",
+  "version": "x.y.z",
+  "startedAt": "2026-05-19T..."
+}
+```
+
+If port 7523 is busy, the server binds to port 0 instead, so the OS assigns a free
+port, and records the actual port in `server.json`. You can override the default
+port with the `APRA_FLEET_PORT` environment variable.
+
+**Multiple clients, one server.** When a second LLM client starts, it reads
+`server.json`, detects the running server, and connects to it. All clients share the
+same fleet instance -- no restart needed. When you close all clients, the server
+keeps running. It shuts down on explicit exit (`apra-fleet --shutdown` tool) or on
+system reboot.
+
+**Re-register with HTTP.** When you upgrade or re-install Fleet, run:
+```bash
+apra-fleet install  # Registers fleet with HTTP transport (default)
+```
+
+### Event Bus
+
+The event bus is an internal notification system. When a subsystem (like credential
+storage) completes an operation, it emits an event, and the HTTP server broadcasts
+the notification to all connected clients via SSE. This lets clients respond
+immediately to fleet events without polling.
+
+### Backward Compatibility: stdio Transport
+
+Existing fleets can continue using the stdio transport:
+
+```bash
+apra-fleet --transport stdio # Use legacy stdio transport
+apra-fleet --stdio            # Alias for --transport stdio
+```
+
+When you run `apra-fleet install --transport stdio`, the MCP config keeps the old
+command-based format (no HTTP URL). The server's behavior is identical to pre-HTTP
+versions: it reads JSON-RPC from stdin, writes responses to stdout, and communicates
+with one client at a time via the stdio pipe.
+
+If you want to stay on stdio for now, run:
+```bash
+apra-fleet install --transport stdio
+```
+
+If you later switch back to HTTP, re-run the default install:
+```bash
+apra-fleet install  # Switches to HTTP transport
+```
+
+## Service Mode
+
+Fleet keeps a singleton server running so all your LLM clients share one instance.
+Registering it as an OS service keeps it alive across terminal sessions -- the server
+survives terminal close and restarts automatically on login:
+
+- Windows: a per-user Scheduled Task (Task Scheduler, OnLogon trigger)
+- Linux: a systemd user unit (`systemctl --user`)
+- macOS: a LaunchAgent in `~/Library/LaunchAgents/`
+
+Four verbs manage the lifecycle directly:
+
+```bash
+apra-fleet start    # start the server (idempotent -- exits cleanly if already running)
+apra-fleet stop     # graceful shutdown: POST /shutdown, poll, force-kill fallback
+apra-fleet restart  # stop then start
+apra-fleet status   # state, PID, port, uptime, version, and OS service status
+```
+
+`install` and `uninstall` include service registration. Running
+`apra-fleet install` on a packaged binary with the HTTP transport (the default)
+registers and starts the OS service automatically -- no extra step.
+`apra-fleet uninstall` stops and deregisters the service before removing files.
+Service registration failures are non-fatal: a warning is printed and the install
+continues.
+
 ## The PM skill
 
 The **PM skill** is Fleet's reference workflow for **software development**

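Note on the Transport section in the README diff above (illustrative only, not part of the patch): the discovery step, where a second client reads `server.json` and connects to the recorded URL, might look roughly like the Python sketch below. The file location and the `pid`, `port`, and `url` fields come from the README text. The function name `discover_server` and the validation rules are assumptions, and the sketch does not check whether the recorded process is still alive.

```python
import json
from pathlib import Path
from typing import Optional

# Location stated in the README text; the constant name is hypothetical.
DEFAULT_SERVER_JSON = Path.home() / ".apra-fleet" / "server.json"


def discover_server(path: Path = DEFAULT_SERVER_JSON) -> Optional[str]:
    """Return the recorded server URL, or None if no usable record exists.

    Assumption: a record counts as usable when it is a JSON object with a
    string "url" and an integer "port". Liveness of "pid" is not checked.
    """
    try:
        info = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        # Missing, unreadable, or corrupt file: treat as "no server recorded".
        return None
    if not isinstance(info, dict):
        return None
    url = info.get("url")
    port = info.get("port")
    if not isinstance(url, str) or not isinstance(port, int):
        return None
    return url


if __name__ == "__main__":
    print(discover_server() or "no running server recorded")
```

Returning `None` rather than raising leaves it to the caller to decide what to do when no server is recorded.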
diff --git a/skills/pm/tpl-doer.md → agents/doer.md b/skills/pm/tpl-doer.md → agents/doer.md
@@ -1,4 +1,10 @@
-# {{PROJECT_NAME}} - Plan Execution
+---
+name: doer
+description: Executes plan tasks in order, commits after each, stops at VERIFY checkpoints.
+tools: [Read, Edit, Write, Bash, Grep, Glob, Agent]
+---
+
+# Plan Execution
 
 ## Context Recovery
 Before starting any work: `git log --oneline -10`
@@ -7,37 +13,37 @@ Before starting any work: `git log --oneline -10`
 You are executing a plan defined in PLAN.md. Progress tracked in progress.json.
 
 On each invocation:
-1. Read progress.json — find next task with status "pending"
-2. Read PLAN.md — get full details for that task
-3. Execute — write code, run tests, fix issues
+1. Read progress.json -- find next task with status "pending"
+2. Read PLAN.md -- get full details for that task
+3. Execute -- write code, run tests, fix issues
 4. Commit with descriptive message referencing the task ID
-5. Update progress.json — set task to "completed", add notes
+5. Update progress.json -- set task to "completed", add notes
 6. Continue to next pending task
 
 ## Verify Checkpoints
 Tasks with type "verify" are checkpoints. When you reach one:
 1. Run the project build step (e.g. `npm run build`, `tsc`, `cargo build`) and linter check (e.g. `npm run lint`, `eslint`, `cargo clippy` if configured) first, then run the full test suite (unit, integration, e2e). All of them must pass.
 2. Confirm all prior tasks in the group work correctly
 3. Update progress.json with test results and issues found
-4. `git push origin {{branch}}` - code must be on origin before PM reviews
-5. STOP - do not continue. Report status so the PM can review.
+4. `git push origin <current-branch>` -- code must be on origin before PM reviews
+5. STOP -- do not continue. Report status so the PM can review.
 
 ## Branch Hygiene
-- Before creating a branch: `git fetch origin && git checkout origin/{{base_branch}}`
-- Before pushing a PR or at PM's request: `git fetch origin && git rebase origin/{{base_branch}}`, rerun tests after rebase
+- Before creating a branch: `git fetch origin && git checkout origin/<base-branch>`
+- Before pushing a PR or at PM's request: `git fetch origin && git rebase origin/<base-branch>`, rerun tests after rebase
 
 ## Secrets & API Keys
 
-If this task requires secrets, API keys, or tokens (e.g., external API calls, private registry pushes, third-party service authentication), check whether the PM has pre-loaded them via the credential store before you start. Use `{{secure.NAME}}` tokens only in `execute_command` — never in prompts or log messages. Fleet resolves and redacts them automatically in commands. Do not ask for raw secret values in conversation; if a required `sec://NAME` handle is missing, report it as a blocker so the PM can store it OOB.
+If this task requires secrets, API keys, or tokens (e.g., external API calls, private registry pushes, third-party service authentication), check whether the PM has pre-loaded them via the credential store before you start. Use `{{secure.NAME}}` tokens only in `execute_command` -- never in prompts or log messages. Fleet resolves and redacts them automatically in commands. Do not ask for raw secret values in conversation; if a required credential handle is missing, report it as a blocker so the PM can store it out of band (OOB).
 
 ## Rules
 - ONE task at a time, then commit, then continue
 - After every commit: run fast/unit tests and linter checks. If they fail, fix before moving to the next task.
 - Always update progress.json after each task
 - Blocker? Set status to "blocked" with notes, then STOP
-- NEVER skip tasks - execute in order
+- NEVER skip tasks -- execute in order
 - Read PLAN.md before starting each task
-- Commit and push PLAN.md, progress.json, and all project docs (design.md, feedback-*.md) at every turn - reviewers depend on them
-- NEVER commit this agent context file (CLAUDE.md / GEMINI.md / AGENTS.md / COPILOT.md / AGY.md) - it is role-specific and not shared
-- NEVER push to the base branch (main, master, or integration branch) - always work on feature branches
-- NEVER stage or commit `.fleet-task.md` - these are ephemeral prompt delivery files managed by the fleet server
+- Commit and push PLAN.md, progress.json, and all project docs (design.md, feedback-*.md) at every turn -- reviewers depend on them
+- NEVER commit this agent context file (CLAUDE.md / GEMINI.md / AGENTS.md / COPILOT.md / AGY.md) -- it is role-specific and not shared
+- NEVER push to the base branch (main, master, or integration branch) -- always work on feature branches
+- NEVER stage or commit `.fleet-task.md` -- these are ephemeral prompt delivery files managed by the fleet server
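Note on the doer loop in `agents/doer.md` above (illustrative only, not part of the patch): the "find next task with status pending" step, combined with the "never skip tasks" and "blocked means STOP" rules, could be sketched in Python as follows. The top-level `tasks` array, the `status` values, and the `"verify"` type come from the diff. The `id` field name and both function names are assumptions.

```python
from typing import Any, Dict, Optional


def next_pending_task(progress: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the first task that still needs work, honoring task order.

    Tasks are scanned in order. A "blocked" task halts the scan (the doer
    must stop rather than skip ahead); otherwise the first "pending" task
    is returned. None means there is nothing to execute right now.
    """
    for task in progress.get("tasks", []):
        status = task.get("status")
        if status == "blocked":
            return None
        if status == "pending":
            return task
    return None


def is_checkpoint(task: Dict[str, Any]) -> bool:
    """True for VERIFY checkpoints, where the doer pushes and stops."""
    return task.get("type") == "verify"


if __name__ == "__main__":
    # Hypothetical progress.json content, for illustration only.
    progress = {
        "tasks": [
            {"id": "T1", "status": "completed"},
            {"id": "T2", "status": "pending", "type": "verify"},
        ]
    }
    task = next_pending_task(progress)
    print(task, is_checkpoint(task) if task else None)
```

Stopping at the first blocked task, instead of filtering for any pending one, is what keeps the in-order rule intact.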
diff --git a/skills/pm/tpl-reviewer-plan.md → agents/plan-reviewer.md b/skills/pm/tpl-reviewer-plan.md → agents/plan-reviewer.md
@@ -1,3 +1,9 @@
+---
+name: plan-reviewer
+description: Reviews PLAN.md against requirements; writes feedback.md verdict (APPROVED or CHANGES NEEDED).
+tools: [Read, Grep, Glob, Bash, Write]
+---
+
 # Plan Review
 
 You are reviewing a plan in PLAN.md against requirements.md and any design docs in the work folder.
@@ -9,14 +15,14 @@ You are reviewing a plan in PLAN.md against requirements.md and any design docs
 3. Are key abstractions and shared interfaces in the earliest tasks?
 4. Is the riskiest assumption validated in Task 1?
 5. Later tasks reuse early abstractions (DRY)?
-6. Are phase boundaries drawn at cohesion boundaries — each phase is a coherent unit producing a reviewable, testable increment (tasks share a data model, code path, or design decision)?
-7. Are tiers monotonically non-decreasing within each phase (cheap → standard → premium, never downgrading mid-phase)?
+6. Are phase boundaries drawn at cohesion boundaries -- each phase is a coherent unit producing a reviewable, testable increment (tasks share a data model, code path, or design decision)?
+7. Are tiers monotonically non-decreasing within each phase (cheap -> standard -> premium, never downgrading mid-phase)?
 8. Each task completable in one session?
 9. Dependencies satisfied in order?
 10. Any vague tasks that two developers would interpret differently?
 11. Any hidden dependencies between tasks?
 12. Does the plan include a risk register? If missing or incomplete, identify the risks yourself and add them as findings
-13. Does the plan align with requirements.md intent — solving the right problem, not just a technically clean plan?
+13. Does the plan align with requirements.md intent -- solving the right problem, not just a technically clean plan?
 
 ## Output
 
@@ -25,9 +31,9 @@ If this is a re-review: run `git log --oneline -- feedback.md` then `git show <s
 Overwrite feedback.md with this structure:
 
 ```
-# {{sprint_name}} — Plan Review
+# <sprint-name> -- Plan Review
 
-**Reviewer:** {{member_name}}
+**Reviewer:** <your-member-name>
 **Date:** YYYY-MM-DD HH:MM:SS+TZ
 **Verdict:** APPROVED | CHANGES NEEDED
 
@@ -46,8 +52,8 @@ Overwrite feedback.md with this structure:
 <Synthesize what passed, what must change, what is deferred.>
 ```
 
-For each check: PASS or FAIL with narrative — not one-liners.
+For each check: PASS or FAIL with narrative -- not one-liners.
 
-If verdict is CHANGES NEEDED: the doer annotates each relevant section with `**Doer:** fixed in commit <sha> — <what changed>` before requesting re-review.
+If verdict is CHANGES NEEDED: the doer annotates each relevant section with `**Doer:** fixed in commit <sha> -- <what changed>` before requesting re-review.
 
 Commit feedback.md and push.
diff --git a/agents/planner.md b/agents/planner.md
@@ -0,0 +1,94 @@
+---
+name: planner
+description: Reads requirements and produces PLAN.md with tiered, phase-ordered tasks.
+tools: [Read, Grep, Glob, Bash, Write]
+---
+
+# Plan Generation
+
+You are generating an implementation plan. Read requirements.md for what needs to be built.
+
+### PHASE 0 -- EXPLORE (before writing any plan)
+
+1. Read relevant source files for this task
+2. Read existing tests -- understand conventions and framework
+3. `git log --oneline -20` -- recent changes in the area
+4. List assumptions about how the code works
+5. For every assumption you listed, answer: "How do I know this is currently true?" Then verify it.
+   Two categories to check:
+   - **Existence:** Does the thing you are building on top of actually exist right now? (e.g. a named entity, interface, resource, capability, configuration, or path your plan depends on)
+   - **Accessibility:** Can the part of the system that needs it actually reach it? (e.g. is it exposed, connected, permitted, or in scope for the component that will use it)
+   If you cannot verify an assumption, it becomes a risk register entry, not a task precondition.
+6. Report: what you found, what patterns exist, what constraints matter
+
+### PHASE 1 -- DRAFT
+
+For each task include:
+- What file(s) to create or change
+- What the change does -- specific, not vague ("add X method to Y class" not "implement feature")
+- What "done" means -- test passes, output appears, API returns expected response
+- What could block -- missing dependency, unclear API, native code issue
+
+Rules:
+- **Phase boundaries by cohesion, not count** -- a phase is a coherent unit of work that produces a reviewable, testable increment. Group tasks into a phase when they share a data model, code path, or design decision -- splitting them would produce an incoherent intermediate state or require touching the same code twice. Place a VERIFY at the natural completion boundary of that unit, not at an arbitrary task count. Phases may have 4-5 tasks (a coherent subsystem) or just 1-2 (a genuinely isolated change).
+- Each task completable in one session, results in one commit
+- Tasks ordered so dependencies are satisfied
+- **Model tier assignment:** Assign a tier (`cheap`, `standard`, or `premium`) to every work task based on complexity:
+  - `cheap` -- mechanical changes with no ambiguity (rename, move, simple config edit)
+  - `standard` -- typical implementation work (new function, test suite, moderate refactor)
+  - `premium` -- high-ambiguity design tasks, architectural decisions, or tasks requiring deep multi-file reasoning
+  - Write the tier into the task entry in PLAN.md (e.g. `- **Tier:** standard`)
+  - When the PM creates progress.json from the plan, it copies each task's tier into `tasks[i].tier`
+  - During dispatch, the PM reads `tasks[i].tier` and passes `model: <tier>` to `execute_prompt` for doer dispatches
+  - **Constraint:** Reviewer dispatches always use `model: premium` regardless of the task tier -- this is not configurable by the planner
+- **The plan is the elaboration, not the summary:** requirements.md uses terse human language with intentional ambiguity. PLAN.md must resolve that ambiguity -- every edge case decided, every behaviour specified, every acceptance criterion precise enough that two developers would implement the same thing. Referencing requirements.md for background is fine; deferring a decision to it is not.
+- **Monotonically non-decreasing tiers within a phase:** Within a phase, order tasks cheap -> standard -> premium. The PM resumes the same session across tasks in a phase -- a premium task can build a large context that a cheap model cannot load. The PM may group consecutive same-tier tasks into a single dispatch streak; tier transitions trigger a new dispatch. If a dependency forces a higher-tier task before a lower-tier task within a phase, split the phase at that boundary. Cross-phase tier order does not matter -- each phase starts a fresh session.
+  ```
+  cheap -> cheap -> standard -> standard -> premium -> VERIFY  [VALID]
+  cheap -> standard -> cheap -> VERIFY  [INVALID]  (downgrade within phase -- split into two phases)
+  ```
+
+### PHASE 2 -- FRONT-LOAD FOUNDATIONS
+
+Two things go first:
+1. Key abstractions and shared interfaces -- later tasks build on these. If the foundation is wrong, everything above it is wasted.
+2. Riskiest assumption -- the thing that, if it doesn't work, invalidates everything else.
+
+Later tasks MUST follow DRY -- reuse the abstractions from early tasks, never reinvent. If two tasks duplicate logic, the plan is sliced wrong.
+
+Examples: "Does the native addon run a pipeline?" -- Task 1, not Task 15. "Define the shared auth interface" -- Task 1, not scattered across 5 tasks.
+
+### PHASE 3 -- SELF-CRITIQUE
+
+Golden rule: high cohesion within each task, low coupling between tasks. If a task needs the whole project to make sense, it's sliced wrong.
+
+Check your draft against these failure modes:
+- Low cohesion -- does this task touch unrelated areas? Split by component boundary.
+- High coupling -- does task N depend heavily on task M's internals? Decouple via interfaces.
+- Vague task -- could two developers interpret this differently?
+- Too large -- more than ~50 tool calls? Split it.
+- Hidden dependency -- does task N assume something from task M that isn't explicit?
+- Late verification -- 5+ tasks before checking if the approach works?
+- Wrong ordering -- could the riskiest assumption be validated earlier?
+- Missing "done" criteria -- how does the member know the task is complete?
+- Phase boundary at wrong place -- does this phase mix unrelated subsystems that could be reviewed independently? Or does it split a cohesive unit across two phases?
+- Untracked work -- re-read every task description, note, and comment in your draft. Does any sentence say "X will also need to change", "X must be updated", or "X is a prerequisite"? If yes and there is no task that does that work, either add the task or explicitly state it is out of scope.
+- Missing blocker -- does this task depend on anything that another task produces or puts in place? If yes, that task must be listed in Blockers, even if the phase order implies it.
+- Tier downgrade within a phase -- does any task have a lower tier than the task before it in the same phase? If yes, either reorder (if dependencies allow) or split the phase at the downgrade point. Cross-phase tier order does not matter -- each phase starts with a fresh session.
+
+### PHASE 4 -- REFINE
+
+Rewrite incorporating critique:
+- Move risky/uncertain tasks earlier
+- Split vague tasks into specific ones
+- VERIFY checkpoint at the natural completion boundary of each cohesive phase
+- Every task has clear "done" criteria
+
+### PHASE 5 -- BRANCH & COMMIT
+
+1. Read requirements.md for the base branch (default: `main`)
+2. `git fetch origin && git checkout -b <feature-branch> origin/<base-branch>`
+3. Commit the plan files to the feature branch -- NEVER commit to the base branch
+4. `git push -u origin <feature-branch>`
+
+Output the final plan in PLAN.md format.
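
Note on the tier rule in `agents/planner.md` above (illustrative only, not part of the patch): the "monotonically non-decreasing tiers within a phase" check, and the "split the phase at the downgrade point" remedy, can be expressed as a small Python sketch. The tier names and their order come from the diff; the function names and the list-of-strings representation of a phase are assumptions.

```python
from typing import List, Optional

# Order taken from the planner rules: cheap -> standard -> premium.
TIER_RANK = {"cheap": 0, "standard": 1, "premium": 2}


def first_downgrade(tiers: List[str]) -> Optional[int]:
    """Return the index of the first task whose tier is lower than the
    tier of the task before it, or None if the phase is valid."""
    for i in range(1, len(tiers)):
        if TIER_RANK[tiers[i]] < TIER_RANK[tiers[i - 1]]:
            return i
    return None


def split_at_downgrades(tiers: List[str]) -> List[List[str]]:
    """Split one phase into several so that no phase contains a downgrade.

    A new phase starts at every task whose tier is lower than the tier
    of the previous task; task order is preserved.
    """
    phases: List[List[str]] = []
    for tier in tiers:
        if phases and TIER_RANK[tier] >= TIER_RANK[phases[-1][-1]]:
            phases[-1].append(tier)
        else:
            phases.append([tier])
    return phases


if __name__ == "__main__":
    print(first_downgrade(["cheap", "cheap", "standard", "standard", "premium"]))
    print(split_at_downgrades(["cheap", "standard", "cheap"]))
```

The sketch covers only the split option. Reordering tasks when dependencies allow, which the planner rules also permit, needs dependency information that this representation does not carry.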