Open
40 commits
af2788b
feat(mcp): HTTP+SSE transport with singleton server and event bus (#2…
kumaakh May 28, 2026
1dc0a1a
docs(skill-reorg): requirements for skill & agent reorganization sprint
May 15, 2026
8b33596
docs(skill-reorg): trim verbose schema/doc scope in Task 1
May 15, 2026
8fd6eef
feat(substitution-engine): add shared substitution engine
May 28, 2026
5d7278c
feat(send-files): add optional substitutions parameter
May 28, 2026
ceaf9c8
feat(execute-prompt): add substitutions parameter, remove dangerously…
May 28, 2026
ec328ba
docs(skill-reorg): document substitutions, purge dangerously_skip_per…
May 28, 2026
d1b1978
fix(send-files): replace non-ASCII warning symbol with ASCII
May 28, 2026
a12c1a8
feat(execute-prompt): add agent parameter for native subagent activation
May 28, 2026
77836ea
feat(agents): add planner, plan-reviewer, doer, reviewer agent defini…
May 28, 2026
99c8fe2
chore(pm): delete source tpl files, update skill references to agents
May 28, 2026
cc52d97
fix(execute-prompt): normalize path separators to forward-slash in ag…
May 28, 2026
32be3e3
test(execute-prompt): add Gemini resume+agent test and Gemini unknown…
May 28, 2026
74409e7
docs(task6): research migration from claude -p to Claude Code SDK
May 28, 2026
06a0d75
feat(agy): add AGY as third provider for execute_prompt agent parameter
May 28, 2026
beca04d
docs(arch): cloud fleet architecture -- interactive sessions, multi-t…
May 28, 2026
515b2f4
docs(arch): correct dual-path model -- SSH+-p and HTTP+SSE coexist fo…
May 29, 2026
d0d8a6f
docs(arch): add process lifecycle management section (control plane v…
May 29, 2026
7448024
docs(arch): rename domain to fleets.apralabs.com
May 29, 2026
88fecb1
docs(arch): add dashboard and web UI section to cloud fleet architecture
May 29, 2026
349748f
docs(arch): reference discussion 188, add VS Code extension subsectio…
May 29, 2026
6664747
docs(arch): remove customer name from public document
May 29, 2026
911bd7c
docs(arch): add market context and positioning section
May 29, 2026
b327ee8
feat(e2e): add toy suite with toy-doer and toy-reviewer local Windows…
May 29, 2026
271f549
Revert "feat(e2e): add toy suite with toy-doer and toy-reviewer local…
May 29, 2026
9d8b6d4
feat(jwt): add session JWT sign/verify service
May 29, 2026
9f2ba3c
feat(session-registry): add in-memory interactive session registry
May 29, 2026
36240ed
feat(register-member): bootstrap interactive session for local Claude…
May 29, 2026
95958b6
feat(http-transport): add JWT auth and session registry integration
May 29, 2026
83c5337
feat(send-message): add send_message MCP tool for interactive session…
May 29, 2026
934f896
fix(interactive-session): correct SSE delivery, kill-before-spawn, he…
May 29, 2026
3f3986e
fix(interactive-session): use --dangerously-load-development-channels…
May 29, 2026
b4ff0d1
fix(send-message): correct notifications/claude/channel params format…
May 29, 2026
bf6fcd4
fix(interactive-session): URL query param as fallback member identity…
May 29, 2026
04d8e94
feat(install): write agent files to provider agentsDir during install
May 30, 2026
152420c
feat(install): add agent file installation for claude, gemini, and agy
May 30, 2026
ddd7b3a
review(install): APPROVED -- agent file installation for claude, gemi…
May 30, 2026
0fca009
docs: harvest agent-install sprint knowledge into docs/
May 30, 2026
7caab42
cleanup: remove sprint control files
May 30, 2026
e395173
fix(service-manager): replace dynamic imports with static imports for…
May 31, 2026
99 changes: 99 additions & 0 deletions README.md
@@ -233,6 +233,105 @@ reviewer Opus 4.7 final review
Provider strengths, role recommendations, and gotchas:
[docs/provider-guide.md](docs/provider-guide.md).

## Transport

Fleet runs as a singleton service on your machine. Once started, the server listens
on port 7523 by default, and multiple LLM clients (Claude Code, Gemini, Copilot,
Codex) can connect concurrently to the same Fleet instance.

### HTTP+SSE Transport (default)

By default, fleet uses the **HTTP+SSE transport** -- clients connect over HTTP and
receive server-push notifications over Server-Sent Events (SSE).

```bash
apra-fleet # Start HTTP server (default)
apra-fleet --transport http # Explicitly use HTTP
```

When the server starts, it writes a `server.json` file to `~/.apra-fleet/` containing:
```json
{
"pid": 12345,
"port": 7523,
"url": "http://localhost:7523/mcp",
"version": "x.y.z",
"startedAt": "2026-05-19T..."
}
```

If port 7523 is busy, the server binds to port 0 instead, which lets the OS assign a
free port, and records the actual port in `server.json`. To change the default port,
set the `APRA_FLEET_PORT` environment variable.
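
The fallback can be pictured as a two-step bind. The sketch below is illustrative
Python, not Fleet's actual code: the function name `bind_with_fallback` is
hypothetical, and only the default port 7523 and the `APRA_FLEET_PORT` variable are
taken from the text above.

```python
import os
import socket

DEFAULT_PORT = 7523  # default port named in the text above


def bind_with_fallback(preferred: int) -> socket.socket:
    """Listen on `preferred` if it is free; otherwise let the OS pick a port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", preferred))
    except OSError:
        # The preferred port is taken: start over with a fresh socket and
        # bind to port 0, which asks the OS for any free port.
        sock.close()
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.bind(("127.0.0.1", 0))
    sock.listen()
    return sock


if __name__ == "__main__":
    preferred = int(os.environ.get("APRA_FLEET_PORT", DEFAULT_PORT))
    server = bind_with_fallback(preferred)
    # getsockname() reports the port actually bound; this is the value a
    # server would record in server.json.
    print("listening on port", server.getsockname()[1])
    server.close()
```

Reading the port back from the socket, rather than echoing the requested one, is what keeps the recorded port correct in the fallback case.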

**Multiple clients, one server.** When a second LLM client starts, it reads
`server.json`, detects the running server, and connects to it. All clients share the
same Fleet instance -- no restart needed. The server keeps running after you close
all clients. It shuts down on explicit exit (`apra-fleet --shutdown` tool) or on
system reboot.
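
A client-side lookup along these lines might look as follows. This is a hedged
sketch: `discover_server_url` is a hypothetical helper, and it only parses the
`url` field shown in the `server.json` example. A real client would presumably also
confirm that the recorded process is still alive, which is omitted here.

```python
import json
from pathlib import Path
from typing import Optional

# Location described in the text above.
SERVER_FILE = Path.home() / ".apra-fleet" / "server.json"


def discover_server_url(path: Path = SERVER_FILE) -> Optional[str]:
    """Return the recorded server URL, or None if no usable record exists."""
    try:
        record = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None  # missing file, unreadable file, or malformed JSON
    if not isinstance(record, dict):
        return None
    url = record.get("url")
    return url if isinstance(url, str) and url else None


if __name__ == "__main__":
    url = discover_server_url()
    print(url if url else "no running server recorded")
```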

**Re-register with HTTP.** When you upgrade or re-install Fleet, run:
```bash
apra-fleet install # Registers fleet with HTTP transport (default)
```

### Event Bus

The event bus is an internal notification system. When a subsystem (like credential
storage) completes an operation, it emits an event, and the HTTP server broadcasts
the notification to all connected clients via SSE. This lets clients respond
immediately to fleet events without polling.
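
The emit-then-broadcast flow can be sketched in a few lines of Python. Everything
named here (`EventBus`, `sse_frame`, the `credential.stored` event name) is
hypothetical and chosen for illustration. Only the `event:` / `data:` framing
followed by a blank line comes from the Server-Sent Events format itself.

```python
import json
from typing import Callable, Dict, List

Listener = Callable[[str], None]


def sse_frame(event: str, data: Dict) -> str:
    """Encode one event in the Server-Sent Events wire format."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


class EventBus:
    """Fan each emitted event out to every subscribed client."""

    def __init__(self) -> None:
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> Callable[[], None]:
        self._listeners.append(listener)
        # The returned callback unsubscribes, e.g. when a client disconnects.
        return lambda: self._listeners.remove(listener)

    def emit(self, event: str, data: Dict) -> int:
        """Send the event to all listeners and return how many received it."""
        frame = sse_frame(event, data)
        receivers = list(self._listeners)  # copy: listeners may unsubscribe
        for listener in receivers:
            listener(frame)
        return len(receivers)


if __name__ == "__main__":
    bus = EventBus()
    bus.subscribe(lambda frame: print(frame, end=""))
    bus.emit("credential.stored", {"name": "EXAMPLE"})
```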

### Backward Compatibility: stdio Transport

Existing fleets can continue using the stdio transport:

```bash
apra-fleet --transport stdio # Use legacy stdio transport
apra-fleet --stdio # Alias for --transport stdio
```

When you run `apra-fleet install --transport stdio`, the MCP config keeps the old
command-based format (no HTTP URL). The server's behavior is identical to pre-HTTP
versions: it reads JSON-RPC from stdin, writes responses to stdout, and communicates
with one client at a time via the stdio pipe.
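
The stdin/stdout loop described above can be approximated as follows. This is a
minimal sketch under stated assumptions: messages are taken to be newline-delimited
JSON, `serve_stdio` and the `handle` callback are hypothetical names, and error
handling is left out.

```python
import io
import json
from typing import Callable, TextIO


def serve_stdio(handle: Callable[[dict], dict], stdin: TextIO, stdout: TextIO) -> None:
    """Read one JSON-RPC message per line and answer each request in turn."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between messages
        message = json.loads(line)
        if "id" not in message:
            handle(message)  # notification: processed, but never answered
            continue
        response = {"jsonrpc": "2.0", "id": message["id"], "result": handle(message)}
        stdout.write(json.dumps(response) + "\n")
        stdout.flush()


if __name__ == "__main__":
    # In-memory streams stand in for the real pipe in this demonstration.
    requests = io.StringIO('{"jsonrpc": "2.0", "id": 1, "method": "ping"}\n')
    replies = io.StringIO()
    serve_stdio(lambda msg: {"echo": msg["method"]}, requests, replies)
    print(replies.getvalue(), end="")
```

Because the loop owns a single input stream, it can serve only one client, which is the limitation the HTTP transport removes.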

If you want to stay on stdio for now, run:
```bash
apra-fleet install --transport stdio
```

If you later switch back to HTTP, re-run the default install:
```bash
apra-fleet install # Switches to HTTP transport
```

## Service Mode

Fleet keeps a singleton server running so all your LLM clients share one instance.
Registering it as an OS service keeps it alive across terminal sessions -- the server
survives terminal close and restarts automatically on login:

- Windows: a per-user Scheduled Task (Task Scheduler, OnLogon trigger)
- Linux: a systemd user unit (`systemctl --user`)
- macOS: a LaunchAgent in `~/Library/LaunchAgents/`

Four verbs manage the lifecycle directly:

```bash
apra-fleet start # start the server (idempotent -- exits cleanly if already running)
apra-fleet stop # graceful shutdown: POST /shutdown, poll, force-kill fallback
apra-fleet restart # stop then start
apra-fleet status # state, PID, port, uptime, version, and OS service status
```
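
The `stop` sequence (request shutdown, poll, force-kill as a fallback) can be
sketched like this. The three callbacks, the return labels, and the timeout values
are assumptions made for illustration; they are not taken from Fleet's source.

```python
import time
from typing import Callable


def stop_server(
    request_shutdown: Callable[[], None],
    is_running: Callable[[], bool],
    force_kill: Callable[[], None],
    timeout_s: float = 5.0,
    poll_interval_s: float = 0.1,
) -> str:
    """Ask the server to exit, wait for it, and force-kill only as a last resort."""
    if not is_running():
        return "not-running"
    request_shutdown()  # e.g. the POST /shutdown request
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if not is_running():
            return "graceful"
        time.sleep(poll_interval_s)
    force_kill()  # the server ignored the request within the timeout
    return "forced"


if __name__ == "__main__":
    state = {"up": True}
    outcome = stop_server(
        request_shutdown=lambda: state.update(up=False),
        is_running=lambda: state["up"],
        force_kill=lambda: None,
    )
    print(outcome)
```

Passing the three operations in as callbacks keeps the ordering logic independent of how a process is actually probed or killed on each OS.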

`install` and `uninstall` include service registration. Running
`apra-fleet install` on a packaged binary with the HTTP transport (the default)
registers and starts the OS service automatically -- no extra step.
`apra-fleet uninstall` stops and deregisters the service before removing files.
Service registration failures are non-fatal: a warning is printed and the install
continues.

## The PM skill

The **PM skill** is Fleet's reference workflow for **software development**
36 changes: 21 additions & 15 deletions skills/pm/tpl-doer.md → agents/doer.md
@@ -1,4 +1,10 @@
-# {{PROJECT_NAME}} - Plan Execution
+---
+name: doer
+description: Executes plan tasks in order, commits after each, stops at VERIFY checkpoints.
+tools: [Read, Edit, Write, Bash, Grep, Glob, Agent]
+---
+
+# Plan Execution

## Context Recovery
Before starting any work: `git log --oneline -10`
@@ -7,37 +13,37 @@ Before starting any work: `git log --oneline -10`
You are executing a plan defined in PLAN.md. Progress tracked in progress.json.

On each invocation:
-1. Read progress.json find next task with status "pending"
-2. Read PLAN.md get full details for that task
-3. Execute write code, run tests, fix issues
+1. Read progress.json -- find next task with status "pending"
+2. Read PLAN.md -- get full details for that task
+3. Execute -- write code, run tests, fix issues
 4. Commit with descriptive message referencing the task ID
-5. Update progress.json set task to "completed", add notes
+5. Update progress.json -- set task to "completed", add notes
6. Continue to next pending task
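
Steps 1 and 5 of this loop can be sketched in Python. The sketch assumes
progress.json holds a `tasks` array whose entries carry a `status` field; the `id`
and `notes` field names and both helper names are hypothetical.

```python
from typing import Optional


def next_pending_task(progress: dict) -> Optional[dict]:
    """Return the first task whose status is "pending", in plan order."""
    for task in progress.get("tasks", []):
        if task.get("status") == "pending":
            return task
    return None


def mark_completed(task: dict, notes: str) -> None:
    """Record the outcome on the task entry once its commit is made."""
    task["status"] = "completed"
    task["notes"] = notes


if __name__ == "__main__":
    progress = {
        "tasks": [
            {"id": "T1", "status": "completed"},
            {"id": "T2", "status": "pending"},
            {"id": "T3", "status": "pending"},
        ]
    }
    task = next_pending_task(progress)
    print(task["id"])  # the earliest pending task
    mark_completed(task, "tests pass")
```

Scanning in array order, rather than picking any pending task, is what enforces the rule that tasks are never skipped.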

## Verify Checkpoints
Tasks with type "verify" are checkpoints. When you reach one:
1. Run the project build step (e.g. `npm run build`, `tsc`, `cargo build`) and linter check (e.g. `npm run lint`, `eslint`, `cargo clippy` if configured) first, then run the full test suite (unit, integration, e2e). All of them must pass.
2. Confirm all prior tasks in the group work correctly
3. Update progress.json with test results and issues found
-4. `git push origin {{branch}}` - code must be on origin before PM reviews
-5. STOP - do not continue. Report status so the PM can review.
+4. `git push origin <current-branch>` -- code must be on origin before PM reviews
+5. STOP -- do not continue. Report status so the PM can review.

## Branch Hygiene
-- Before creating a branch: `git fetch origin && git checkout origin/{{base_branch}}`
-- Before pushing a PR or at PM's request: `git fetch origin && git rebase origin/{{base_branch}}`, rerun tests after rebase
+- Before creating a branch: `git fetch origin && git checkout origin/<base-branch>`
+- Before pushing a PR or at PM's request: `git fetch origin && git rebase origin/<base-branch>`, rerun tests after rebase

## Secrets & API Keys

-If this task requires secrets, API keys, or tokens (e.g., external API calls, private registry pushes, third-party service authentication), check whether the PM has pre-loaded them via the credential store before you start. Use `{{secure.NAME}}` tokens only in `execute_command` never in prompts or log messages. Fleet resolves and redacts them automatically in commands. Do not ask for raw secret values in conversation; if a required `sec://NAME` handle is missing, report it as a blocker so the PM can store it OOB.
+If this task requires secrets, API keys, or tokens (e.g., external API calls, private registry pushes, third-party service authentication), check whether the PM has pre-loaded them via the credential store before you start. Use `{{secure.NAME}}` tokens only in `execute_command` -- never in prompts or log messages. Fleet resolves and redacts them automatically in commands. Do not ask for raw secret values in conversation; if a required credential handle is missing, report it as a blocker so the PM can store it OOB.

## Rules
 - ONE task at a time, then commit, then continue
 - After every commit: run fast/unit tests and linter checks. If they fail, fix before moving to the next task.
 - Always update progress.json after each task
 - Blocker? Set status to "blocked" with notes, then STOP
-- NEVER skip tasks - execute in order
+- NEVER skip tasks -- execute in order
 - Read PLAN.md before starting each task
-- Commit and push PLAN.md, progress.json, and all project docs (design.md, feedback-*.md) at every turn - reviewers depend on them
-- NEVER commit this agent context file (CLAUDE.md / GEMINI.md / AGENTS.md / COPILOT.md / AGY.md) - it is role-specific and not shared
-- NEVER push to the base branch (main, master, or integration branch) - always work on feature branches
-- NEVER stage or commit `.fleet-task.md` - these are ephemeral prompt delivery files managed by the fleet server
+- Commit and push PLAN.md, progress.json, and all project docs (design.md, feedback-*.md) at every turn -- reviewers depend on them
+- NEVER commit this agent context file (CLAUDE.md / GEMINI.md / AGENTS.md / COPILOT.md / AGY.md) -- it is role-specific and not shared
+- NEVER push to the base branch (main, master, or integration branch) -- always work on feature branches
+- NEVER stage or commit `.fleet-task.md` -- these are ephemeral prompt delivery files managed by the fleet server
20 changes: 13 additions & 7 deletions skills/pm/tpl-reviewer-plan.md → agents/plan-reviewer.md
@@ -1,3 +1,9 @@
+---
+name: plan-reviewer
+description: Reviews PLAN.md against requirements; writes feedback.md verdict (APPROVED or CHANGES NEEDED).
+tools: [Read, Grep, Glob, Bash, Write]
+---
+
# Plan Review

You are reviewing a plan in PLAN.md against requirements.md and any design docs in the work folder.
@@ -9,14 +15,14 @@ You are reviewing a plan in PLAN.md against requirements.md and any design docs
3. Are key abstractions and shared interfaces in the earliest tasks?
4. Is the riskiest assumption validated in Task 1?
5. Later tasks reuse early abstractions (DRY)?
-6. Are phase boundaries drawn at cohesion boundaries each phase is a coherent unit producing a reviewable, testable increment (tasks share a data model, code path, or design decision)?
-7. Are tiers monotonically non-decreasing within each phase (cheap standard premium, never downgrading mid-phase)?
+6. Are phase boundaries drawn at cohesion boundaries -- each phase is a coherent unit producing a reviewable, testable increment (tasks share a data model, code path, or design decision)?
+7. Are tiers monotonically non-decreasing within each phase (cheap -> standard -> premium, never downgrading mid-phase)?
8. Each task completable in one session?
9. Dependencies satisfied in order?
10. Any vague tasks that two developers would interpret differently?
11. Any hidden dependencies between tasks?
12. Does the plan include a risk register? If missing or incomplete, identify the risks yourself and add them as findings
-13. Does the plan align with requirements.md intent solving the right problem, not just a technically clean plan?
+13. Does the plan align with requirements.md intent -- solving the right problem, not just a technically clean plan?

## Output

@@ -25,9 +31,9 @@ If this is a re-review: run `git log --oneline -- feedback.md` then `git show <s
Overwrite feedback.md with this structure:

```
-# {{sprint_name}} — Plan Review
+# <sprint-name> -- Plan Review

-**Reviewer:** {{member_name}}
+**Reviewer:** <your-member-name>
**Date:** YYYY-MM-DD HH:MM:SS+TZ
**Verdict:** APPROVED | CHANGES NEEDED

@@ -46,8 +52,8 @@ Overwrite feedback.md with this structure:
<Synthesize what passed, what must change, what is deferred.>
```

-For each check: PASS or FAIL with narrative not one-liners.
+For each check: PASS or FAIL with narrative -- not one-liners.

-If verdict is CHANGES NEEDED: the doer annotates each relevant section with `**Doer:** fixed in commit <sha> <what changed>` before requesting re-review.
+If verdict is CHANGES NEEDED: the doer annotates each relevant section with `**Doer:** fixed in commit <sha> -- <what changed>` before requesting re-review.

Commit feedback.md and push.
94 changes: 94 additions & 0 deletions agents/planner.md
@@ -0,0 +1,94 @@
---
name: planner
description: Reads requirements and produces PLAN.md with tiered, phase-ordered tasks.
tools: [Read, Grep, Glob, Bash, Write]
---

# Plan Generation

You are generating an implementation plan. Read requirements.md for what needs to be built.

### PHASE 0 -- EXPLORE (before writing any plan)

1. Read relevant source files for this task
2. Read existing tests -- understand conventions and framework
3. `git log --oneline -20` -- recent changes in the area
4. List assumptions about how the code works
5. For every assumption you listed, answer: "How do I know this is currently true?" Then verify it.
Two categories to check:
- **Existence:** Does the thing you are building on top of actually exist right now? (e.g. a named entity, interface, resource, capability, configuration, or path your plan depends on)
- **Accessibility:** Can the part of the system that needs it actually reach it? (e.g. is it exposed, connected, permitted, or in scope for the component that will use it)
If you cannot verify an assumption, it becomes a risk register entry, not a task precondition.
6. Report: what you found, what patterns exist, what constraints matter

### PHASE 1 -- DRAFT

For each task include:
- What file(s) to create or change
- What the change does -- specific, not vague ("add X method to Y class" not "implement feature")
- What "done" means -- test passes, output appears, API returns expected response
- What could block -- missing dependency, unclear API, native code issue

Rules:
- **Phase boundaries by cohesion, not count** -- a phase is a coherent unit of work that produces a reviewable, testable increment. Group tasks into a phase when they share a data model, code path, or design decision -- splitting them would produce an incoherent intermediate state or require touching the same code twice. Place a VERIFY at the natural completion boundary of that unit, not at an arbitrary task count. Phases may have 4-5 tasks (a coherent subsystem) or just 1-2 (a genuinely isolated change).
- Each task completable in one session, results in one commit
- Tasks ordered so dependencies are satisfied
- **Model tier assignment:** Assign a tier (`cheap`, `standard`, or `premium`) to every work task based on complexity:
- `cheap` -- mechanical changes with no ambiguity (rename, move, simple config edit)
- `standard` -- typical implementation work (new function, test suite, moderate refactor)
- `premium` -- high-ambiguity design tasks, architectural decisions, or tasks requiring deep multi-file reasoning
- Write the tier into the task entry in PLAN.md (e.g. `- **Tier:** standard`)
- When the PM creates progress.json from the plan, it copies each task's tier into `tasks[i].tier`
- During dispatch, the PM reads `tasks[i].tier` and passes `model: <tier>` to `execute_prompt` for doer dispatches
- **Constraint:** Reviewer dispatches always use `model: premium` regardless of the task tier -- this is not configurable by the planner
- **The plan is the elaboration, not the summary:** requirements.md uses terse human language with intentional ambiguity. PLAN.md must resolve that ambiguity -- every edge case decided, every behaviour specified, every acceptance criterion precise enough that two developers would implement the same thing. Referencing requirements.md for background is fine; deferring a decision to it is not.
- **Monotonically non-decreasing tiers within a phase:** Within a phase, order tasks cheap -> standard -> premium. The PM resumes the same session across tasks in a phase -- a premium task can build a large context that a cheap model cannot load. The PM may group consecutive same-tier tasks into a single dispatch streak; tier transitions trigger a new dispatch. If a dependency forces a higher-tier task before a lower-tier task within a phase, split the phase at that boundary. Cross-phase tier order does not matter -- each phase starts a fresh session.
```
cheap -> cheap -> standard -> standard -> premium -> VERIFY [VALID]
cheap -> standard -> cheap -> VERIFY [INVALID] (downgrade within phase -- split into two phases)
```
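
The ordering rule can be checked mechanically. The following Python sketch is
illustrative only: the `TIER_RANK` mapping and the function name are assumptions,
while the tier names and the two example sequences come from the rule above.

```python
from typing import List

# Rank tiers so that "non-decreasing" becomes a simple numeric comparison.
TIER_RANK = {"cheap": 0, "standard": 1, "premium": 2}


def tiers_non_decreasing(tiers: List[str]) -> bool:
    """True if no task in the phase has a lower tier than the one before it."""
    ranks = [TIER_RANK[tier] for tier in tiers]
    return all(earlier <= later for earlier, later in zip(ranks, ranks[1:]))


if __name__ == "__main__":
    valid = ["cheap", "cheap", "standard", "standard", "premium"]
    invalid = ["cheap", "standard", "cheap"]
    print(tiers_non_decreasing(valid))    # True
    print(tiers_non_decreasing(invalid))  # False
```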

### PHASE 2 -- FRONT-LOAD FOUNDATIONS

Two things go first:
1. Key abstractions and shared interfaces -- later tasks build on these. If the foundation is wrong, everything above it is wasted.
2. Riskiest assumption -- the thing that, if it doesn't work, invalidates everything else.

Later tasks MUST follow DRY -- reuse the abstractions from early tasks, never reinvent. If two tasks duplicate logic, the plan is sliced wrong.

Examples: "Does the native addon run a pipeline?" -- Task 1, not Task 15. "Define the shared auth interface" -- Task 1, not scattered across 5 tasks.

### PHASE 3 -- SELF-CRITIQUE

Golden rule: high cohesion within each task, low coupling between tasks. If a task needs the whole project to make sense, it's sliced wrong.

Check your draft against these failure modes:
- Low cohesion -- does this task touch unrelated areas? Split by component boundary.
- High coupling -- does task N depend heavily on task M's internals? Decouple via interfaces.
- Vague task -- could two developers interpret this differently?
- Too large -- more than ~50 tool calls? Split it.
- Hidden dependency -- does task N assume something from task M that isn't explicit?
- Late verification -- 5+ tasks before checking if the approach works?
- Wrong ordering -- could the riskiest assumption be validated earlier?
- Missing "done" criteria -- how does the member know the task is complete?
- Phase boundary at wrong place -- does this phase mix unrelated subsystems that could be reviewed independently? Or does it split a cohesive unit across two phases?
- Untracked work -- re-read every task description, note, and comment in your draft. Does any sentence say "X will also need to change", "X must be updated", or "X is a prerequisite"? If yes and there is no task that does that work, either add the task or explicitly state it is out of scope.
- Missing blocker -- does this task depend on anything that another task produces or puts in place? If yes, that task must be listed in Blockers, even if the phase order implies it.
- Tier downgrade within a phase -- does any task have a lower tier than the task before it in the same phase? If yes, either reorder (if dependencies allow) or split the phase at the downgrade point. Cross-phase tier order does not matter -- each phase starts with a fresh session.
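
The "split the phase at the downgrade point" remedy can be sketched as a single
pass over the tiers of a phase. The helper name and the `TIER_RANK` mapping are
hypothetical. The sketch ignores dependencies, which in practice decide whether
reordering is possible instead.

```python
from typing import List

TIER_RANK = {"cheap": 0, "standard": 1, "premium": 2}


def split_at_downgrades(tiers: List[str]) -> List[List[str]]:
    """Cut a task sequence into phases so that tiers never drop within a phase."""
    phases: List[List[str]] = []
    for tier in tiers:
        if phases and TIER_RANK[tier] >= TIER_RANK[phases[-1][-1]]:
            phases[-1].append(tier)  # same or higher tier: stay in the phase
        else:
            phases.append([tier])    # first task, or a downgrade: new phase
    return phases


if __name__ == "__main__":
    print(split_at_downgrades(["cheap", "standard", "cheap"]))
    # [['cheap', 'standard'], ['cheap']]
```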

### PHASE 4 -- REFINE

Rewrite incorporating critique:
- Move risky/uncertain tasks earlier
- Split vague tasks into specific ones
- VERIFY checkpoint at the natural completion boundary of each cohesive phase
- Every task has clear "done" criteria

### PHASE 5 -- BRANCH & COMMIT

1. Read requirements.md for the base branch (default: `main`)
2. `git fetch origin && git checkout -b <feature-branch> origin/<base-branch>`
3. Commit the plan files to the feature branch -- NEVER commit to the base branch
4. `git push -u origin <feature-branch>`

Output the final plan in PLAN.md format.