feat(install): install agent files for claude/gemini/agy on apra-fleet install #289

Open
kumaakh wants to merge 39 commits into main from enhancement/skill-reorg

@kumaakh kumaakh commented May 30, 2026

Summary

  • Adds agentsDir to ProviderInstallConfig for claude, gemini, and agy providers
  • Wires agents/*.md into AssetManifest, buildDevManifest, and the SEA bundler (gen-sea-config.mjs)
  • runInstall now writes planner/doer/reviewer/plan-reviewer agent files to the provider-specific user agents directory during apra-fleet install
  • codex and copilot have no agent concept -- agentsDir is undefined and the step is skipped silently
  • Step count and install summary log updated to reflect the new step

Provider paths

| Provider | Agents installed to |
| --- | --- |
| claude | ~/.claude/agents/*.md |
| gemini | ~/.gemini/agents/*.md |
| agy | ~/.gemini/antigravity-cli/agents/*.md |
| codex | skipped |
| copilot | skipped |
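
A minimal sketch of how the provider paths above could be resolved. The helper name `resolveAgentsDir` and the `Provider` type are hypothetical and are not taken from the PR; only the directory layout comes from the table.

```typescript
import * as os from "node:os";
import * as path from "node:path";

type Provider = "claude" | "gemini" | "agy" | "codex" | "copilot";

// Hypothetical helper: returns the user-level agents directory for a provider,
// or undefined when the provider has no agent concept (codex, copilot).
function resolveAgentsDir(provider: Provider, home: string = os.homedir()): string | undefined {
  switch (provider) {
    case "claude":
      return path.join(home, ".claude", "agents");
    case "gemini":
      return path.join(home, ".gemini", "agents");
    case "agy":
      return path.join(home, ".gemini", "antigravity-cli", "agents");
    case "codex":
    case "copilot":
      return undefined;
  }
}

for (const p of ["claude", "gemini", "agy", "codex", "copilot"] as const) {
  console.log(p, "->", resolveAgentsDir(p) ?? "skipped");
}
```

Returning `undefined` rather than throwing mirrors the described behaviour of skipping the step silently for providers without agents.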

Test plan

  • 1492 tests pass (0 failures)
  • 5 new install tests: claude/gemini/agy write verified, codex/copilot skip verified
  • Reviewed by fleet-rev: APPROVED
  • ASCII compliance verified

Notes

This PR also contains all prior enhancement/skill-reorg work (substitution engine, execute-prompt agent/substitution params, interactive sessions, agent definitions, skill reorganization). The agent installation wiring is the final piece that makes execute_prompt with agent: parameter functional on a fresh install.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

kumaakh and others added 30 commits May 30, 2026 02:06
…) (#273)

* docs(mcp): implementation plan for HTTP+SSE transport (#258)

4-phase plan: event bus + HTTP transport, server refactor with
--transport flag, credential_store_set event wiring + install config,
and documentation. Singleton model with per-session McpServer.

* review: plan review for HTTP+SSE transport (#258)

CHANGES NEEDED -- 3 blocking findings:
- HIGH-1: provider mcp.json config formats underspecified in Task 7
- HIGH-2: singleton startup race condition unaddressed in Task 5
- HIGH-3: SEA binary compatibility not verified

* docs(mcp): revise plan per review -- transport decision, race fix, SEA, provider configs (#258)

* feat(mcp): typed event bus for fleet pub/sub (#258)

* chore: update progress for T1 completion

* review: plan re-review for HTTP+SSE transport (#258)

APPROVED -- all 3 prior HIGH findings resolved:
- HIGH-1: concrete provider configs for Claude/Gemini/Copilot/Codex, port 7523
- HIGH-2: atomic startup lock via fs.openSync(path, 'wx')
- HIGH-3: SEA verification task added to Phase 1
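
The atomic startup lock named in HIGH-2 can be sketched as follows. This is an illustration under assumptions: the function name `tryClaimLock`, the lock file name, and writing the PID into the file are not from the plan; only the `fs.openSync(path, 'wx')` call is.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper. The 'wx' flag makes open fail with EEXIST when the
// file already exists, so only one process can win the startup race.
function tryClaimLock(lockPath: string): boolean {
  try {
    const fd = fs.openSync(lockPath, "wx");
    fs.writeSync(fd, String(process.pid)); // assumption: record the owner PID
    fs.closeSync(fd);
    return true;
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "EEXIST") return false;
    throw err; // any other error (permissions, missing dir) is a real failure
  }
}

// Usage against a throwaway directory; the file name is illustrative.
const lockDir = fs.mkdtempSync(path.join(os.tmpdir(), "fleet-lock-"));
const lockPath = path.join(lockDir, "server.lock");
console.log(tryClaimLock(lockPath)); // true: first claim wins
console.log(tryClaimLock(lockPath)); // false: lock already held
```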

* feat(mcp): HTTP transport with multi-session support (#258)

* test(mcp): verify HTTP transport in SEA binary (#258)

* chore: mark VERIFY Phase 1 completed in progress.json

* review: Phase 1 core abstractions (#258)

* refactor(mcp): extract tool registration into shared module (#258)

* chore: mark task 5 completed in progress.json

* feat(mcp): --transport flag and dual startup paths (#258)

* feat(mcp): singleton lifecycle detection with atomic claim (#258)

* chore: update progress.json -- task 5/6 complete, VERIFY Phase 2 done

* chore: mark VERIFY server refactor + dual transport completed (#258)

* chore: record VERIFY commit SHA in progress.json (#258)

* review: Phase 2 server refactor and dual transport (#258)

* feat(mcp): emit credential:stored event on OOB secret delivery (#258)

* chore: record T7 commit SHA in progress.json (#258)

* feat(mcp): provider-specific HTTP transport install configs (#258)

* chore: record T8 commit SHA in progress.json (#258)

* test(mcp): transport integration tests + Gemini client verification (#258)

* chore: record T9 commit SHA in progress.json (#258)

* chore: mark VERIFY event wiring + client config completed (#258)

* review: Phase 3 event wiring and client config (#258)

* docs(mcp): document HTTP+SSE transport, singleton model, event bus (#258)

* chore: record T10 completion + VERIFY checkpoint results (#258)

* review: Phase 4 docs + final sprint review (#258)

* cleanup: remove fleet control files

* docs(service): OS service lifecycle implementation plan

Add PLAN.md with the implementation plan for making apra-fleet behave
like a normal OS service -- start/stop/restart/status verbs, per-user
service registration folded into install/uninstall, cross-platform
support for Windows (schtasks), Linux (systemd --user), and macOS
(launchd LaunchAgent), all without elevation. Extends PR #273.

* review: OS service lifecycle plan review

* docs(service): revise plan per review -- dev-path, branch, macOS idempotency, stop semantics

* review: OS service lifecycle plan re-review

* feat(service): POST /shutdown endpoint and service constants (#258)

* progress: mark T1 complete (ef84f92)

* feat(service): T2 ServiceManager interface and factory

ServiceManager interface (register, unregister, start, stop, query,
isInstalled) + ServiceStatus type in types.ts. Factory getServiceManager()
selects per-platform adapter (win32/linux/darwin), falling back to
NoopServiceManager on unsupported platforms. gracefulStopByServerJson()
reads server.json and POST /shutdown with 5s pid-poll + SIGTERM fallback.
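
A rough sketch of the interface-plus-factory shape described above. The method names come from the commit message; the signatures, the `ServiceStatus` values, and the adapter map parameter are assumptions made for illustration.

```typescript
// Assumed shape; the real ServiceStatus type in types.ts may differ.
type ServiceStatus = "running" | "stopped" | "not-installed";

interface ServiceManager {
  register(): Promise<void>;
  unregister(): Promise<void>;
  start(): Promise<void>;
  stop(): Promise<void>;
  query(): Promise<ServiceStatus>;
  isInstalled(): Promise<boolean>;
}

// Fallback for unsupported platforms: every operation is a harmless no-op.
class NoopServiceManager implements ServiceManager {
  async register(): Promise<void> {}
  async unregister(): Promise<void> {}
  async start(): Promise<void> {}
  async stop(): Promise<void> {}
  async query(): Promise<ServiceStatus> {
    return "not-installed";
  }
  async isInstalled(): Promise<boolean> {
    return false;
  }
}

type AdapterFactory = () => ServiceManager;

// Hypothetical signature: the adapter map is injected here so the sketch
// stays self-contained; the real factory presumably imports its adapters.
function getServiceManager(
  platform: string,
  adapters: Partial<Record<string, AdapterFactory>>,
): ServiceManager {
  const make = adapters[platform];
  return make ? make() : new NoopServiceManager();
}

// Usage: only win32/linux/darwin would have adapters registered.
const mgr = getServiceManager(process.platform, {});
console.log(mgr instanceof NoopServiceManager); // true, since no adapters were passed
```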

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(service): add platform adapter stubs to unblock build

Minimal throw-not-implemented stubs for WindowsServiceManager,
LinuxServiceManager, MacOSServiceManager. Created by PM after token
outage interrupted fleet-dev mid-sprint. T3/T4/T5 will replace these
with real schtasks/systemd/launchd implementations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(http-transport): declare claude/channel MCP capability

Adds experimental: { 'claude/channel': {} } to the McpServer capabilities
on each session. Enables server-to-client push via notifications/claude/channel
over the existing SSE stream. POC validated: server can inject messages into
a Claude Code session unprompted.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(service): add T6.5 capability logging task, mark T1/T2 complete

Extends sprint plan with T6.5 (MCP session capability logging, beads 78g):
log clientInfo, capabilities, and channel flag on session init/close.
Marks T1 (ef84f92) and T2 (9963198) as completed in progress.json.
Notes stubs committed for T3/T4/T5 to unblock build after token outage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(service): Windows Scheduled Task adapter (#258)

* feat(service): Linux systemd user unit adapter (#258)

* feat(service): macOS launchd LaunchAgent adapter (#258)

* test(service): service manager adapter unit tests (#258)

* feat(http-transport): log MCP session init and close with client caps (#258)

* progress: mark T6/T6.5 complete, update T3-T5 notes

* progress: VERIFY blocked on build/test approval

* feat(service): service manager unit tests (#258)

* progress: VERIFY passed -- 85 files, 1365 tests green

* progress: mark VERIFY id=8 completed

* review: Phase 1 platform service foundation code review

* feat(service): start and stop CLI commands (#258)

Add runStart and runStop CLI verbs. start checks for a running instance
(idempotent), uses the service manager when a unit is installed, otherwise
spawns a detached process redirected to LOG_FILE_PATH. stop posts /shutdown,
polls up to 5s, falls back to taskkill (Windows) or SIGTERM. Both wired into
src/index.ts dispatch.

* feat(service): restart CLI command (#258)

Add runRestart: calls runStop then runStart. Wire into index.ts dispatch.
Also commit progress.json update for T7.

* feat(service): status CLI command (#258)

Add runStatus: reads server.json, GET /health for live metrics (version,
uptime, sessions), queries service manager for unit state. Formats output
with State/PID/Port/URL/Version/Uptime/Sessions/Service fields. Wired into
index.ts dispatch.

* test(service): CLI verb tests and help update (#258)

18 vitest tests covering start (already-running idempotent, service-managed
start, detached spawn, timeout failure), stop (not-running idempotent,
/shutdown POST, cleanup), restart (stop-then-start, idempotent when stopped),
and status (running/stopped states, service labels, health fields).
Update --help to list start/stop/restart/status verbs.

* progress: VERIFY P2 complete -- 86 files, 1376 passed, 18 new CLI verb tests green

* feat(service): CLI verb tests and --help update (#258)

Add tests/cli-verbs.test.ts with 18 tests covering runStart (already
running idempotent, service manager path, spawn path, failure exit),
runStop (not running idempotent, /shutdown post, file cleanup), runRestart
(stop then start), and runStatus (stopped/running states, service labels,
health fields). Update --help to list start/stop/restart/status verbs.

* feat(service): extend install/uninstall with service lifecycle (#258)

install: in SEA+HTTP mode, register the service unit and start it as a
new numbered step after Beads. Adds Service line to the Done summary.
totalSteps updated; beads step uses baseSteps so numbering is correct.

uninstall: replace hard killApraFleet with svcMgr.stop() (graceful POST
/shutdown + poll + fallback) in the --force path; always call
svcMgr.unregister() before file removal (idempotent, tolerates not-found).

* progress: VERIFY P2 final -- 86 files, 1383 passed, 0 failed

Update T10 and VERIFY P2 entries to reference 37a28b6 (spy-based rewrite
that fixed node:fs factory-mock leakage in fileParallelism:false mode).
Full suite: 86 files, 1383 passed, 13 skipped, 0 failed.

* test(service): install/uninstall service integration tests (#258)

13 tests covering T11+T12: install calls register+start in SEA+HTTP mode,
skips for stdio or dev mode, warns non-fatally on register failure, shows
correct step numbering; uninstall calls stop then unregister in correct
order, skips both in dry-run, swallows unregister errors, guards server-
running check without --force.

* chore: VERIFY P3 -- install/uninstall integration complete

npm run build: clean. npm test: 87 files, 1396 passed, 13 skipped, 0 failed.

* docs(readme): document service model and start/stop/restart/status verbs (#258)

* docs(arch): document service manager architecture (#258)

* chore: VERIFY P4 -- documentation complete, 87 files 1396 passed

* fix(service): quote args in Windows bat wrapper to support paths with spaces (#258)

* fix(service): always run gracefulStop before systemd check in LinuxServiceManager (#258)

* fix(service): XML-escape path values in macOS plist builder (#258)

* fix(service): use SIGKILL not SIGTERM for force-kill fallback on Unix (#258)

* fix(service): delegate stop to ServiceManager when service is installed (#258)

* fix(service): rollback register() if start() fails during install (#258)

* test(service): update bat wrapper test to expect quoted args (#258)

* ci(llms-full): regen after rebase on main

* fix(service): address reviewer findings -- shutdown exit code, shared process-utils, agy transport tests (#258)

R1: wrap each step in shutdown() in its own try/catch and always call
process.exit(0), preventing SIGTERM from triggering systemd/launchd
restart loop on graceful stop.

R2: extract isPidAlive and postShutdown into src/utils/process-utils.ts;
remove 4 duplicate isPidAlive copies (stop.ts, singleton.ts,
service-manager/index.ts, task-cleanup.ts) and 2 postShutdown copies
(stop.ts, service-manager/index.ts).
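
For reference, a shared `isPidAlive` is commonly written with signal 0, as sketched below. This is an assumption about the technique, not a copy of src/utils/process-utils.ts.

```typescript
// Signal 0 performs an existence and permission check without delivering
// a signal to the target process.
function isPidAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but is owned by another user.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

console.log(isPidAlive(process.pid)); // true: the current process is running
```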

R3: add --transport http and --transport stdio test cases for agy to
tests/install-multi-provider.test.ts to match the pattern used by
claude, gemini, codex, and copilot.

---------

Co-authored-by: Bot <bot@apra-fleet.dev>
Co-authored-by: Akhil Kumar <akhil@Akhils-MacBook-Pro.local>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Six tasks: shared substitution engine on send_files+execute_prompt
(with secrets-boundary invariant and dangerously_skip_permissions
cleanup), define 4 role agents (planner/plan-reviewer/doer/reviewer)
for Claude+Gemini, analyze pm skill splits, deep review reorganized
skills, deep review installer routing, migrate from claude -p to
long-running agents before 2026-06-15.

Branch: enhancement/skill-reorg off main@1970ced.
Member: apra-fleet-reorg.
Reduce the Schema and documentation scope section plus the Cleanup
section to a tight checklist. Removes prose explanations, quoted
historical doc text, and reasoning bullets that the doer LLM does
not need. Keeps: files in scope, what to add or remove, PR audit
gates, regression test for dangerously_skip_permissions removal.
Single module implementing scan / validate-keys / transform / emit-warning.
No credential-store imports. Pure text in, text out.

- TOKEN_RE matches {{ ws* [A-Za-z_][A-Za-z0-9_]* ws* }} (lenient whitespace)
- {{secure.NAME}} in content: passes through verbatim (dot excluded by grammar)
- validateSubstitutionKeys exported for pre-read-key-rejection in handlers
- Values never appear in errors or warnings; token names may
- No recursive substitution
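
The rules above can be sketched roughly as follows. `TOKEN_RE` and `validateSubstitutionKeys` are named in the commit message; `applySubstitutions`, the return shape of the validator, and the choice to leave unknown tokens untouched are assumptions.

```typescript
// Lenient whitespace inside the braces; a dot is not part of the grammar,
// so {{secure.NAME}} never matches and passes through verbatim.
const TOKEN_RE = /\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g;
const KEY_RE = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Assumed return shape: the list of rejected key names (names only, never values).
function validateSubstitutionKeys(subs: Record<string, string>): string[] {
  return Object.keys(subs).filter((key) => !KEY_RE.test(key));
}

// Hypothetical transform: pure text in, text out.
function applySubstitutions(text: string, subs: Record<string, string>): string {
  const bad = validateSubstitutionKeys(subs);
  if (bad.length > 0) {
    throw new Error(`invalid substitution keys: ${bad.join(", ")}`);
  }
  // String.replace scans the input once and does not rescan inserted values,
  // which gives the no-recursive-substitution property.
  return text.replace(TOKEN_RE, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(subs, name) ? subs[name] : match,
  );
}

console.log(applySubstitutions("branch={{ branch }} key={{secure.API_KEY}}", { branch: "main" }));
// prints "branch=main key={{secure.API_KEY}}"
```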
Wires substitution engine into the send_files handler:
- Key validation before any file I/O (zero side effects on key rejection)
- Reads file contents, applies engine, writes temp files, transfers those
- Source files on fleet host are never modified
- Heuristic warning when substitutions omitted but files contain tokens
- Atomicity: validation strictly before file writes
…_skip_permissions

- Wires substitution engine into execute_prompt handler with same
  semantics as send_files: validate-then-transform, values never logged,
  {{secure.NAME}} pass-through, heuristic warning when omitted
- Removes dangerously_skip_permissions from schema (.strict() rejects it)
  and all handler code; per-member unattended mode is the replacement
- Regression test confirms schema now returns a validation error for the
  removed field; surface integration tests cover q-v from requirements
…missions

- fleet SKILL.md: adds Substitutions section covering send_files and
  execute_prompt, secrets-boundary callout, and execute_prompt parameters
  table (removes dangerously_skip_permissions row)
- pm SKILL.md: tpl-*.md listed as templates sent via send_files with
  substitutions; dangerously_skip_permissions references removed
- pm doer-reviewer.md: setup checklist updated to use send_files with
  substitutions map instead of PM-side templating
- docs/tools-work.md, docs/provider-matrix.md: dangerously_skip_permissions
  references replaced with update_member(unattended=...) guidance
- docs/test-audit-report.md: updated test descriptions to match new code
- ASCII fix: em dashes, arrows, and emoji replaced with ASCII equivalents
Replace emoji introduced by Task 1 with ASCII equivalents per
OBS-3 review finding and CLAUDE.md ASCII-only convention:
- send-files.ts: warning ⚠️ -> [WARN], success ✅ -> [OK],
  failure ❌ -> [FAIL], blocked ⛔ -> [ERR]
- execute-prompt.ts: warning ⚠️ -> [WARN]
- send-files-collision.test.ts: update assertions to match [ERR]
- unattended-mode.test.ts: replace box-drawing chars with -
Add optional agent: string parameter to execute_prompt. For Claude members,
invokes claude --agent <name>. For Gemini members, prepends @<name> to the
prompt on every dispatch (resume=true included). Substitution runs before
the @<name> prepend. Agent file existence is validated before any CLI
invocation -- missing agent returns a clear error naming both expected
paths. Windows buildAgentPromptCommand updated symmetrically.

Tests cover: Claude --agent flag, Gemini @name prepend, unknown agent error,
substitution-then-prepend ordering, and home-directory agent path resolution.
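
A simplified sketch of the two dispatch styles. The helper `buildAgentDispatch` and its return shape are hypothetical, and the single space between `@<name>` and the prompt is an assumption; the `--agent` flag and the `@<name>` prefix come from the commit message.

```typescript
type AgentProvider = "claude" | "gemini";

interface AgentDispatch {
  extraArgs: string[]; // extra CLI arguments for the provider binary
  prompt: string; // prompt text as it will be sent
}

// `substitutedPrompt` is expected to have been through the substitution
// engine already, so the @<name> prefix itself is never substituted.
function buildAgentDispatch(
  provider: AgentProvider,
  substitutedPrompt: string,
  agent?: string,
): AgentDispatch {
  if (!agent) return { extraArgs: [], prompt: substitutedPrompt };
  if (provider === "claude") {
    return { extraArgs: ["--agent", agent], prompt: substitutedPrompt };
  }
  // Gemini: prepend on every dispatch, including resumed sessions.
  return { extraArgs: [], prompt: `@${agent} ${substitutedPrompt}` };
}

console.log(buildAgentDispatch("claude", "implement task 3", "doer"));
console.log(buildAgentDispatch("gemini", "implement task 3", "doer"));
```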
…tions

Four role agent files at agents/<name>.md (repo root, sibling to skills/).
Each file has YAML frontmatter (name, description, tools) and a body derived
from the corresponding skills/pm/tpl-*.md source with {{token}} placeholders
stripped. No model: in frontmatter -- model tier is chosen by PM at dispatch.

Install paths (decided in Task 2): ~/.claude/agents/<name>.md (Claude) and
~/.gemini/agents/<name>.md (Gemini). Installer routing is Task 5's scope.
Completes Task 2 source migration: removes plan-prompt.md, tpl-doer.md,
tpl-reviewer-plan.md, tpl-reviewer.md from skills/pm/ now that canonical
definitions live in agents/<name>.md.

Updates skills/fleet/SKILL.md, skills/pm/context-file.md,
skills/pm/doer-reviewer.md, and skills/pm/single-pair-sprint.md to
reference agent activation (agent: "doer", agent: "planner", etc.)
instead of tpl-*.md file delivery.

Also replaces all non-ASCII characters (emoji, arrows, dashes) in
docs/gemini-lifecycle-walkthrough.md and updated pm skill docs.
…-agent test; fix stale context-file.md rules

- BLOCKING-1: add Gemini resume=true agent test -- asserts @doer prepend on continuation dispatch
- BLOCKING-2: add Gemini-provider unknown-agent test -- asserts not-found error + .gemini path in message
- BLOCKING-3: remove stale 'on first send gitignore' and 'send context file on role switch' rules from context-file.md; replace with agent: param activation model
Maps current execute_prompt CLI flags to SDK equivalents, identifies gaps
(--agent flag, SSH-remote execution, stall detector log path), and recommends
a thin fleet-runner.mjs wrapper script as the migration path. Deadline: 2026-06-15.
AGY behaves like Gemini: prepends @<name> to the prompt on every dispatch
(including resume=true). Agent file validation uses the AGY config root at
~/.gemini/antigravity-cli/agents/ rather than ~/.agy/agents/.

- src/providers/agy.ts: prepend @<name> in buildPromptCommand when agentName is set
- src/os/windows.ts: include provider.name === 'agy' in the @<name> prepend check
- src/tools/execute-prompt.ts: update schema description; special-case AGY agent
  file paths (.gemini/antigravity-cli/agents/) in local and remote validation
- tests/execute-prompt-agent.test.ts: add three AGY test cases (fresh dispatch,
  resume=true dispatch, unknown agent returns error with antigravity-cli path)
- requirements.md: replace Claude+Gemini with Claude+AGY+Gemini throughout;
  add AGY row to Task 2 provider table; update install paths and done criteria
- skills/pm/context-file.md: add AGY install path for role agents
… members

- Add toy_doer (C:\akhil\git\fleet-e2e-toy) and toy_reviewer
  (C:\akhil\git\fleet-e2e-toy-rev) entries to members.json; toy_reviewer
  carries clone_url so the repo is cloned if the folder does not exist
- Add toy suite to suites.json (local Windows, Claude provider, github VCS)
- Add clone_url parameter to register_member: when set and the work folder
  is absent, git clone is used instead of mkdir (local and remote members)
- Update fleet-e2e.yml: add toy to suite options and runner selection,
  extract reviewer_clone_url from members.json, git clone in the pre-create
  step when clone_url is set, pass REVIEWER_CLONE_URL to rendered scripts
- Update setup-script.md template to pass clone_url to register_member
  when the token is non-empty
Implements sign() and verify() using HS256 (node:crypto) with a
persisted 32-byte hex key at ~/.apra-fleet/fleet.key.  No external
dependency required.  Tokens carry member_id, project_id, role, and
work_folder with a 7-day expiry.
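
A minimal sketch of HS256 signing and verification with node:crypto only. The claim names and 7-day expiry come from the commit message; the function signatures, the `iat`/`exp` field names, and the in-memory key are assumptions (key persistence to ~/.apra-fleet/fleet.key is omitted here).

```typescript
import { createHmac, randomBytes, timingSafeEqual } from "node:crypto";

interface MemberClaims {
  member_id: string;
  project_id: string;
  role: string;
  work_folder: string;
}

type SignedClaims = MemberClaims & { iat: number; exp: number };

const SEVEN_DAYS_S = 7 * 24 * 60 * 60;

function encodeJson(value: unknown): string {
  return Buffer.from(JSON.stringify(value), "utf8").toString("base64url");
}

function sign(claims: MemberClaims, key: Buffer, nowS: number = Math.floor(Date.now() / 1000)): string {
  const header = encodeJson({ alg: "HS256", typ: "JWT" });
  const payload = encodeJson({ ...claims, iat: nowS, exp: nowS + SEVEN_DAYS_S });
  const sig = createHmac("sha256", key).update(`${header}.${payload}`).digest("base64url");
  return `${header}.${payload}.${sig}`;
}

// Returns the claims when the signature is valid and the token has not
// expired; returns null otherwise.
function verify(token: string, key: Buffer, nowS: number = Math.floor(Date.now() / 1000)): SignedClaims | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, sig] = parts;
  const expected = createHmac("sha256", key).update(`${header}.${payload}`).digest();
  const actual = Buffer.from(sig, "base64url");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;
  const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8")) as SignedClaims;
  return typeof claims.exp === "number" && claims.exp > nowS ? claims : null;
}

const demoKey = randomBytes(32); // stands in for the persisted 32-byte key
const demoToken = sign({ member_id: "m1", project_id: "p1", role: "doer", work_folder: "/tmp/work" }, demoKey);
console.log(verify(demoToken, demoKey)?.role); // prints "doer"
```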
Tracks connected member sessions (online/busy/idle) with SSE response
references for message injection.  Exports a singleton sessionRegistry
with register/unregister/get/list/setStatus/setSseResponse operations.
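
One way such a registry could look, as a sketch. The operation names and the three statuses come from the commit message; the field names, return types, and the use of `unknown` for the SSE response are assumptions.

```typescript
type MemberStatus = "online" | "busy" | "idle";

interface MemberSession {
  memberId: string;
  status: MemberStatus;
  sseResponse?: unknown; // the real code would hold an HTTP response object
}

class SessionRegistry {
  private readonly sessions = new Map<string, MemberSession>();

  register(memberId: string): MemberSession {
    const session: MemberSession = { memberId, status: "online" };
    this.sessions.set(memberId, session);
    return session;
  }

  unregister(memberId: string): boolean {
    return this.sessions.delete(memberId);
  }

  get(memberId: string): MemberSession | undefined {
    return this.sessions.get(memberId);
  }

  list(): MemberSession[] {
    return [...this.sessions.values()];
  }

  setStatus(memberId: string, status: MemberStatus): boolean {
    const session = this.sessions.get(memberId);
    if (!session) return false;
    session.status = status;
    return true;
  }

  setSseResponse(memberId: string, sseResponse: unknown): boolean {
    const session = this.sessions.get(memberId);
    if (!session) return false;
    session.sseResponse = sseResponse;
    return true;
  }
}

// Module-level singleton, as described in the commit message.
const sessionRegistry = new SessionRegistry();
sessionRegistry.register("toy_doer");
sessionRegistry.setStatus("toy_doer", "busy");
console.log(sessionRegistry.list());
```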
… members

After registration, local Claude members get:
- A signed JWT written into <workFolder>/.claude/settings.local.json
  as the apra-fleet MCP server entry (Bearer auth to 127.0.0.1:7523)
- claude spawned detached in the work folder so it connects immediately
POST /mcp: if Authorization: Bearer <token> is present, verify it and
register the member in sessionRegistry; reject with 401 on invalid token.
Unauthenticated connections (PM/tool) continue to work unchanged.

GET /mcp (SSE channel): capture the response object in sessionRegistry
so send_message can push events, and unregister on connection close.
… dispatch

Pushes a fleet:task SSE event to a connected member's SSE channel and
marks the member busy.  Returns {ok, msgid} or an error if the member
is not connected.  Registered in tool-registry as send_message.
Bot and others added 9 commits May 30, 2026 02:06
…alth check pre-launch

CRITICAL-1: Replace raw sseRes/SSE-write approach with McpServer.sendLoggingMessage().
Session registry now stores McpServer instance (not ServerResponse). Registration
happens in onsessioninitialized (not on raw POST), so the server object is live.
Unregistration happens in onsessionclosed. Removes the GET-handler SSE capture.

CRITICAL-2: Before spawning claude on re-registration, kill any existing process
tracked in the session registry by PID (process.kill, ignores already-gone errors).
Spawned PID is stored in sessionRegistry so subsequent re-registrations can clean up.

HIGH-1: GET http://127.0.0.1:7523/health before spawning claude. Returns a clear
error if the fleet server is not running on port 7523.
… flag and notifications/claude/channel method
- Add agentsDir field to ProviderInstallConfig: ~/.claude/agents for claude,
  ~/.gemini/agents for gemini, ~/.gemini/antigravity-cli/agents for agy;
  undefined for codex and copilot (no agent concept)
- Add agents field to AssetManifest interface and collect agents/*.md in
  buildDevManifest and gen-sea-config.mjs so they are bundled in the SEA binary
- In runInstall, add an agents installation step (skipped when agentsDir is
  undefined) between PM skill and Beads; update step counts and summary log
- Fix all AssetManifest mocks in install tests to include agents: {}; update
  hardcoded step number expectations that shifted (+1 for agentsStep)
- Add 5 new test cases verifying: claude/gemini/agy write agents to the correct
  dir, codex/copilot skip gracefully
- Add agentsDir field to ProviderInstallConfig (set for claude, gemini, agy; undefined for codex and copilot)
- Add agents field to AssetManifest interface
- In buildDevManifest: collect agents/*.md files (key=filename, value=relative path)
- In runInstall: write each agent file to paths.agentsDir when defined; skip for codex/copilot
- Update step count and summary log to include agentsDir when agents are installed
- In gen-sea-config.mjs: bundle agents/*.md as SEA assets for production installs
- Add agent installation tests: claude->~/.claude/agents/, gemini->~/.gemini/agents/, agy->~/.gemini/antigravity-cli/agents/, codex/copilot skip gracefully
- Add agents field to all AssetManifest mocks in install test files
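
The install step described in these bullets might look roughly like this. `installAgents` is a hypothetical helper, and the map here is file name to file contents for brevity, whereas the manifest in the PR maps file name to a relative path.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical step: write each bundled agent file into agentsDir and return
// the paths written. An undefined agentsDir means the provider has no agent
// concept, so the step is skipped and nothing is written.
function installAgents(agents: Record<string, string>, agentsDir: string | undefined): string[] {
  if (agentsDir === undefined) return [];
  fs.mkdirSync(agentsDir, { recursive: true });
  const written: string[] = [];
  for (const [fileName, content] of Object.entries(agents)) {
    const dest = path.join(agentsDir, fileName);
    fs.writeFileSync(dest, content, "utf8");
    written.push(dest);
  }
  return written;
}

// Usage against a throwaway directory instead of a real home directory.
const demoHome = fs.mkdtempSync(path.join(os.tmpdir(), "fleet-install-"));
const demoAgents = { "doer.md": "---\nname: doer\n---\n", "reviewer.md": "---\nname: reviewer\n---\n" };
console.log(installAgents(demoAgents, path.join(demoHome, ".claude", "agents")));
console.log(installAgents(demoAgents, undefined)); // []: codex/copilot case
```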
…ni, agy

Reviewed commits 04d8e94 and 152420c. All 5 providers handle correctly
(3 write agents, 2 skip). SEA bundler includes all 4 agent files. Step
numbering is correct. Build clean, 1492 tests pass. No blocking issues.