feat(mcp): HTTP+SSE transport with singleton server and event bus (#258) #273
Merged
kumaakh
pushed a commit
that referenced
this pull request
May 19, 2026
Add PLAN.md with the implementation plan for making apra-fleet behave like a normal OS service -- start/stop/restart/status verbs, per-user service registration folded into install/uninstall, cross-platform support for Windows (schtasks), Linux (systemd --user), and macOS (launchd LaunchAgent), all without elevation. Extends PR #273.
4-phase plan: event bus + HTTP transport, server refactor with --transport flag, credential_store_set event wiring + install config, and documentation. Singleton model with per-session McpServer.
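A typed event bus of the kind the plan names could look like the sketch below. Only the `credential:stored` event name comes from this PR; the `FleetEvents` map, its payload shape, and the `TypedEventBus` class are hypothetical and not taken from the apra-fleet source.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical event map: the "credential:stored" name appears in this PR,
// but the payload shape is an assumption made for illustration.
type FleetEvents = {
  "credential:stored": { name: string };
};

// Thin wrapper over EventEmitter that ties each event name to its payload type.
class TypedEventBus<Events extends Record<string, unknown>> {
  private readonly emitter = new EventEmitter();

  // Subscribes a listener and returns a function that removes it again.
  on<K extends keyof Events & string>(
    event: K,
    listener: (payload: Events[K]) => void,
  ): () => void {
    this.emitter.on(event, listener);
    return () => {
      this.emitter.off(event, listener);
    };
  }

  // Delivers the payload synchronously; returns true if anyone was listening.
  emit<K extends keyof Events & string>(event: K, payload: Events[K]): boolean {
    return this.emitter.emit(event, payload);
  }
}
```

With this shape, `bus.emit("credential:stored", { name: 42 })` is rejected at compile time, which is the main reason to type the bus rather than use `EventEmitter` directly.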
CHANGES NEEDED -- 3 blocking findings:
- HIGH-1: provider mcp.json config formats underspecified in Task 7
- HIGH-2: singleton startup race condition unaddressed in Task 5
- HIGH-3: SEA binary compatibility not verified
…A, provider configs (#258)
APPROVED -- all 3 prior HIGH findings resolved:
- HIGH-1: concrete provider configs for Claude/Gemini/Copilot/Codex, port 7523
- HIGH-2: atomic startup lock via fs.openSync(path, 'wx')
- HIGH-3: SEA verification task added to Phase 1
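The HIGH-2 resolution relies on the `'wx'` flag, which makes `fs.openSync` fail with `EEXIST` when the file already exists, so only one process can create the lock. A minimal sketch follows; `tryClaimLock`, the lock path, and writing the PID into the file are assumptions, not the PR's actual code.

```typescript
import * as fs from "node:fs";

// Hypothetical helper: returns true if this process created the lock file,
// false if another process already holds it.
function tryClaimLock(lockPath: string): boolean {
  let fd: number;
  try {
    // "wx" = open for writing, fail if the path exists. The existence check
    // and the creation happen as one step, so two racing processes cannot both win.
    fd = fs.openSync(lockPath, "wx");
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "EEXIST") return false;
    throw err; // permissions, missing directory, etc. are real errors
  }
  try {
    // Assumption: record the owner PID so a stale lock can be detected later.
    fs.writeSync(fd, String(process.pid));
  } finally {
    fs.closeSync(fd);
  }
  return true;
}
```

A caller that gets `false` would typically connect to the already-running instance instead of starting a second server.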
Add runStart and runStop CLI verbs. start checks for a running instance (idempotent), uses the service manager when a unit is installed, otherwise spawns a detached process redirected to LOG_FILE_PATH. stop posts /shutdown, polls up to 5s, falls back to taskkill (Windows) or SIGTERM. Both wired into src/index.ts dispatch.
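The stop sequence described above (POST /shutdown, poll up to 5s, then force) can be sketched as below. The names `isPidAlive` and `postShutdown` appear later in this PR, but their bodies here are assumptions, as are `waitForExit`, `stopServer`, the loopback host, and the poll interval.

```typescript
import * as http from "node:http";
import { execFileSync } from "node:child_process";

// Signal 0 delivers nothing; it only checks whether the PID exists.
function isPidAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but belongs to another user.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

// Best-effort POST /shutdown; resolves even if the server is already gone.
function postShutdown(port: number): Promise<void> {
  return new Promise((resolve) => {
    const req = http.request(
      { host: "127.0.0.1", port, path: "/shutdown", method: "POST" },
      (res) => {
        res.resume(); // drain the body so the socket can close
        res.on("close", () => resolve());
      },
    );
    req.setTimeout(2000, () => req.destroy(new Error("timeout")));
    req.on("error", () => resolve());
    req.end();
  });
}

// Polls until the PID disappears or the timeout elapses; true means it exited.
async function waitForExit(pid: number, timeoutMs = 5000, intervalMs = 100): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (!isPidAlive(pid)) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return !isPidAlive(pid);
}

// Graceful first, forceful only if the process outlives the polling window.
async function stopServer(pid: number, port: number): Promise<void> {
  await postShutdown(port);
  if (await waitForExit(pid)) return;
  if (process.platform === "win32") {
    execFileSync("taskkill", ["/PID", String(pid), "/F"]);
  } else {
    process.kill(pid, "SIGTERM"); // a later fix in this PR switches the Unix fallback to SIGKILL
  }
}
```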
Add runRestart: calls runStop then runStart. Wire into index.ts dispatch. Also commit progress.json update for T7.
Add runStatus: reads server.json, GET /health for live metrics (version, uptime, sessions), queries service manager for unit state. Formats output with State/PID/Port/URL/Version/Uptime/Sessions/Service fields. Wired into index.ts dispatch.
18 vitest tests covering start (already-running idempotent, service-managed start, detached spawn, timeout failure), stop (not-running idempotent, /shutdown POST, cleanup), restart (stop-then-start, idempotent when stopped), and status (running/stopped states, service labels, health fields). Update --help to list start/stop/restart/status verbs.
Add tests/cli-verbs.test.ts with 18 tests covering runStart (already running idempotent, service manager path, spawn path, failure exit), runStop (not running idempotent, /shutdown post, file cleanup), runRestart (stop then start), and runStatus (stopped/running states, service labels, health fields). Update --help to list start/stop/restart/status verbs.
install: in SEA+HTTP mode, register the service unit and start it as a new numbered step after Beads. Adds Service line to the Done summary. totalSteps updated; beads step uses baseSteps so numbering is correct. uninstall: replace hard killApraFleet with svcMgr.stop() (graceful POST /shutdown + poll + fallback) in the --force path; always call svcMgr.unregister() before file removal (idempotent, tolerates not-found).
Update T10 and VERIFY P2 entries to reference 37a28b6 (spy-based rewrite that fixed node:fs factory-mock leakage in fileParallelism:false mode). Full suite: 86 files, 1383 passed, 13 skipped, 0 failed.
13 tests covering T11+T12: install calls register+start in SEA+HTTP mode, skips for stdio or dev mode, warns non-fatally on register failure, shows correct step numbering; uninstall calls stop then unregister in correct order, skips both in dry-run, swallows unregister errors, guards server-running check without --force.
npm run build: clean. npm test: 87 files, 1396 passed, 13 skipped, 0 failed.
5d2203f to
5836328
added 2 commits
May 28, 2026 02:00
… process-utils, agy transport tests (#258)
R1: wrap each step in shutdown() in its own try/catch and always call process.exit(0), preventing SIGTERM from triggering systemd/launchd restart loop on graceful stop.
R2: extract isPidAlive and postShutdown into src/utils/process-utils.ts; remove 4 duplicate isPidAlive copies (stop.ts, singleton.ts, service-manager/index.ts, task-cleanup.ts) and 2 postShutdown copies (stop.ts, service-manager/index.ts).
R3: add --transport http and --transport stdio test cases for agy to tests/install-multi-provider.test.ts to match the pattern used by claude, gemini, codex, and copilot.
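The R1 pattern, isolating each shutdown step so one failure cannot skip the rest or change the exit code, might look like this. `ShutdownStep`, `runShutdownSteps`, and the step list are hypothetical; the real shutdown() body is not shown in this PR conversation.

```typescript
// Hypothetical description of one cleanup action.
type ShutdownStep = { name: string; run: () => void | Promise<void> };

// Runs every step even if earlier ones throw; returns the names that failed.
async function runShutdownSteps(steps: ShutdownStep[]): Promise<string[]> {
  const failed: string[] = [];
  for (const step of steps) {
    try {
      await step.run();
    } catch (err) {
      failed.push(step.name);
      console.error(`shutdown step "${step.name}" failed:`, err);
    }
  }
  return failed;
}

// Always exits 0 so a supervisor configured to restart on failure
// treats a requested stop as a clean exit.
async function shutdown(steps: ShutdownStep[]): Promise<never> {
  await runShutdownSteps(steps);
  process.exit(0);
}
```

Keeping the step loop separate from `process.exit(0)` also makes the isolation behavior testable without terminating the test process.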
kumaakh
added a commit
that referenced
this pull request
May 29, 2026
…) (#273)

* docs(mcp): implementation plan for HTTP+SSE transport (#258)
  4-phase plan: event bus + HTTP transport, server refactor with --transport flag, credential_store_set event wiring + install config, and documentation. Singleton model with per-session McpServer.
* review: plan review for HTTP+SSE transport (#258)
  CHANGES NEEDED -- 3 blocking findings:
  - HIGH-1: provider mcp.json config formats underspecified in Task 7
  - HIGH-2: singleton startup race condition unaddressed in Task 5
  - HIGH-3: SEA binary compatibility not verified
* docs(mcp): revise plan per review -- transport decision, race fix, SEA, provider configs (#258)
* feat(mcp): typed event bus for fleet pub/sub (#258)
* chore: update progress for T1 completion
* review: plan re-review for HTTP+SSE transport (#258)
  APPROVED -- all 3 prior HIGH findings resolved:
  - HIGH-1: concrete provider configs for Claude/Gemini/Copilot/Codex, port 7523
  - HIGH-2: atomic startup lock via fs.openSync(path, 'wx')
  - HIGH-3: SEA verification task added to Phase 1
* feat(mcp): HTTP transport with multi-session support (#258)
* test(mcp): verify HTTP transport in SEA binary (#258)
* chore: mark VERIFY Phase 1 completed in progress.json
* review: Phase 1 core abstractions (#258)
* refactor(mcp): extract tool registration into shared module (#258)
* chore: mark task 5 completed in progress.json
* feat(mcp): --transport flag and dual startup paths (#258)
* feat(mcp): singleton lifecycle detection with atomic claim (#258)
* chore: update progress.json -- task 5/6 complete, VERIFY Phase 2 done
* chore: mark VERIFY server refactor + dual transport completed (#258)
* chore: record VERIFY commit SHA in progress.json (#258)
* review: Phase 2 server refactor and dual transport (#258)
* feat(mcp): emit credential:stored event on OOB secret delivery (#258)
* chore: record T7 commit SHA in progress.json (#258)
* feat(mcp): provider-specific HTTP transport install configs (#258)
* chore: record T8 commit SHA in progress.json (#258)
* test(mcp): transport integration tests + Gemini client verification (#258)
* chore: record T9 commit SHA in progress.json (#258)
* chore: mark VERIFY event wiring + client config completed (#258)
* review: Phase 3 event wiring and client config (#258)
* docs(mcp): document HTTP+SSE transport, singleton model, event bus (#258)
* chore: record T10 completion + VERIFY checkpoint results (#258)
* review: Phase 4 docs + final sprint review (#258)
* cleanup: remove fleet control files
* docs(service): OS service lifecycle implementation plan
  Add PLAN.md with the implementation plan for making apra-fleet behave like a normal OS service -- start/stop/restart/status verbs, per-user service registration folded into install/uninstall, cross-platform support for Windows (schtasks), Linux (systemd --user), and macOS (launchd LaunchAgent), all without elevation. Extends PR #273.
* review: OS service lifecycle plan review
* docs(service): revise plan per review -- dev-path, branch, macOS idempotency, stop semantics
* review: OS service lifecycle plan re-review
* feat(service): POST /shutdown endpoint and service constants (#258)
* progress: mark T1 complete (ef84f92)
* feat(service): T2 ServiceManager interface and factory
  ServiceManager interface (register, unregister, start, stop, query, isInstalled) + ServiceStatus type in types.ts. Factory getServiceManager() selects per-platform adapter (win32/linux/darwin), falling back to NoopServiceManager on unsupported platforms. gracefulStopByServerJson() reads server.json and POST /shutdown with 5s pid-poll + SIGTERM fallback.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore(service): add platform adapter stubs to unblock build
  Minimal throw-not-implemented stubs for WindowsServiceManager, LinuxServiceManager, MacOSServiceManager. Created by PM after token outage interrupted fleet-dev mid-sprint. T3/T4/T5 will replace these with real schtasks/systemd/launchd implementations.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(http-transport): declare claude/channel MCP capability
  Adds experimental: { 'claude/channel': {} } to the McpServer capabilities on each session. Enables server-to-client push via notifications/claude/channel over the existing SSE stream. POC validated: server can inject messages into a Claude Code session unprompted.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(service): add T6.5 capability logging task, mark T1/T2 complete
  Extends sprint plan with T6.5 (MCP session capability logging, beads 78g): log clientInfo, capabilities, and channel flag on session init/close. Marks T1 (ef84f92) and T2 (9963198) as completed in progress.json. Notes stubs committed for T3/T4/T5 to unblock build after token outage.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(service): Windows Scheduled Task adapter (#258)
* feat(service): Linux systemd user unit adapter (#258)
* feat(service): macOS launchd LaunchAgent adapter (#258)
* test(service): service manager adapter unit tests (#258)
* feat(http-transport): log MCP session init and close with client caps (#258)
* progress: mark T6/T6.5 complete, update T3-T5 notes
* progress: VERIFY blocked on build/test approval
* feat(service): service manager unit tests (#258)
* progress: VERIFY passed -- 85 files, 1365 tests green
* progress: mark VERIFY id=8 completed
* review: Phase 1 platform service foundation code review
* feat(service): start and stop CLI commands (#258)
  Add runStart and runStop CLI verbs. start checks for a running instance (idempotent), uses the service manager when a unit is installed, otherwise spawns a detached process redirected to LOG_FILE_PATH. stop posts /shutdown, polls up to 5s, falls back to taskkill (Windows) or SIGTERM. Both wired into src/index.ts dispatch.
* feat(service): restart CLI command (#258)
  Add runRestart: calls runStop then runStart. Wire into index.ts dispatch. Also commit progress.json update for T7.
* feat(service): status CLI command (#258)
  Add runStatus: reads server.json, GET /health for live metrics (version, uptime, sessions), queries service manager for unit state. Formats output with State/PID/Port/URL/Version/Uptime/Sessions/Service fields. Wired into index.ts dispatch.
* test(service): CLI verb tests and help update (#258)
  18 vitest tests covering start (already-running idempotent, service-managed start, detached spawn, timeout failure), stop (not-running idempotent, /shutdown POST, cleanup), restart (stop-then-start, idempotent when stopped), and status (running/stopped states, service labels, health fields). Update --help to list start/stop/restart/status verbs.
* progress: VERIFY P2 complete -- 86 files, 1376 passed, 18 new CLI verb tests green
* feat(service): CLI verb tests and --help update (#258)
  Add tests/cli-verbs.test.ts with 18 tests covering runStart (already running idempotent, service manager path, spawn path, failure exit), runStop (not running idempotent, /shutdown post, file cleanup), runRestart (stop then start), and runStatus (stopped/running states, service labels, health fields). Update --help to list start/stop/restart/status verbs.
* feat(service): extend install/uninstall with service lifecycle (#258)
  install: in SEA+HTTP mode, register the service unit and start it as a new numbered step after Beads. Adds Service line to the Done summary. totalSteps updated; beads step uses baseSteps so numbering is correct.
  uninstall: replace hard killApraFleet with svcMgr.stop() (graceful POST /shutdown + poll + fallback) in the --force path; always call svcMgr.unregister() before file removal (idempotent, tolerates not-found).
* progress: VERIFY P2 final -- 86 files, 1383 passed, 0 failed
  Update T10 and VERIFY P2 entries to reference 37a28b6 (spy-based rewrite that fixed node:fs factory-mock leakage in fileParallelism:false mode). Full suite: 86 files, 1383 passed, 13 skipped, 0 failed.
* test(service): install/uninstall service integration tests (#258)
  13 tests covering T11+T12: install calls register+start in SEA+HTTP mode, skips for stdio or dev mode, warns non-fatally on register failure, shows correct step numbering; uninstall calls stop then unregister in correct order, skips both in dry-run, swallows unregister errors, guards server-running check without --force.
* chore: VERIFY P3 -- install/uninstall integration complete
  npm run build: clean. npm test: 87 files, 1396 passed, 13 skipped, 0 failed.
* docs(readme): document service model and start/stop/restart/status verbs (#258)
* docs(arch): document service manager architecture (#258)
* chore: VERIFY P4 -- documentation complete, 87 files 1396 passed
* fix(service): quote args in Windows bat wrapper to support paths with spaces (#258)
* fix(service): always run gracefulStop before systemd check in LinuxServiceManager (#258)
* fix(service): XML-escape path values in macOS plist builder (#258)
* fix(service): use SIGKILL not SIGTERM for force-kill fallback on Unix (#258)
* fix(service): delegate stop to ServiceManager when service is installed (#258)
* fix(service): rollback register() if start() fails during install (#258)
* test(service): update bat wrapper test to expect quoted args (#258)
* ci(llms-full): regen after rebase on main
* fix(service): address reviewer findings -- shutdown exit code, shared process-utils, agy transport tests (#258)
  R1: wrap each step in shutdown() in its own try/catch and always call process.exit(0), preventing SIGTERM from triggering systemd/launchd restart loop on graceful stop.
  R2: extract isPidAlive and postShutdown into src/utils/process-utils.ts; remove 4 duplicate isPidAlive copies (stop.ts, singleton.ts, service-manager/index.ts, task-cleanup.ts) and 2 postShutdown copies (stop.ts, service-manager/index.ts).
  R3: add --transport http and --transport stdio test cases for agy to tests/install-multi-provider.test.ts to match the pattern used by claude, gemini, codex, and copilot.

---------

Co-authored-by: Bot <bot@apra-fleet.dev>
Co-authored-by: Akhil Kumar <akhil@Akhils-MacBook-Pro.local>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
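For orientation, the T2 ServiceManager interface and factory from the commit list above can be sketched roughly as follows. The method names and the `getServiceManager` / `NoopServiceManager` identifiers come from the commit message; every signature, the `ServiceStatus` values, and the `StubAdapter` placeholder are assumptions rather than the actual types.ts contents.

```typescript
// Assumed status values; the real ServiceStatus type is not shown in the PR text.
type ServiceStatus = "running" | "stopped" | "not-installed";

// Method names are from the commit message; the signatures are guesses.
interface ServiceManager {
  register(): Promise<void>;
  unregister(): Promise<void>;
  start(): Promise<void>;
  stop(): Promise<void>;
  query(): Promise<ServiceStatus>;
  isInstalled(): Promise<boolean>;
}

// Fallback for platforms without a supported per-user service mechanism.
class NoopServiceManager implements ServiceManager {
  async register(): Promise<void> {}
  async unregister(): Promise<void> {}
  async start(): Promise<void> {}
  async stop(): Promise<void> {}
  async query(): Promise<ServiceStatus> {
    return "not-installed";
  }
  async isInstalled(): Promise<boolean> {
    return false;
  }
}

// Hypothetical placeholder standing in for the real Windows/Linux/macOS adapters.
class StubAdapter extends NoopServiceManager {
  constructor(readonly backend: string) {
    super();
  }
}

// Selects an adapter by platform; the parameter makes the choice testable.
function getServiceManager(platform: NodeJS.Platform = process.platform): ServiceManager {
  switch (platform) {
    case "win32":
      return new StubAdapter("schtasks");
    case "linux":
      return new StubAdapter("systemd --user");
    case "darwin":
      return new StubAdapter("launchd LaunchAgent");
    default:
      return new NoopServiceManager();
  }
}
```

Falling back to a no-op manager rather than throwing lets install and uninstall call `register()` and `unregister()` unconditionally on any platform.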
kumaakh
added a commit
that referenced
this pull request
May 30, 2026
…) (#273) * docs(mcp): implementation plan for HTTP+SSE transport (#258) 4-phase plan: event bus + HTTP transport, server refactor with --transport flag, credential_store_set event wiring + install config, and documentation. Singleton model with per-session McpServer. * review: plan review for HTTP+SSE transport (#258) CHANGES NEEDED -- 3 blocking findings: - HIGH-1: provider mcp.json config formats underspecified in Task 7 - HIGH-2: singleton startup race condition unaddressed in Task 5 - HIGH-3: SEA binary compatibility not verified * docs(mcp): revise plan per review -- transport decision, race fix, SEA, provider configs (#258) * feat(mcp): typed event bus for fleet pub/sub (#258) * chore: update progress for T1 completion * review: plan re-review for HTTP+SSE transport (#258) APPROVED -- all 3 prior HIGH findings resolved: - HIGH-1: concrete provider configs for Claude/Gemini/Copilot/Codex, port 7523 - HIGH-2: atomic startup lock via fs.openSync(path, 'wx') - HIGH-3: SEA verification task added to Phase 1 * feat(mcp): HTTP transport with multi-session support (#258) * test(mcp): verify HTTP transport in SEA binary (#258) * chore: mark VERIFY Phase 1 completed in progress.json * review: Phase 1 core abstractions (#258) * refactor(mcp): extract tool registration into shared module (#258) * chore: mark task 5 completed in progress.json * feat(mcp): --transport flag and dual startup paths (#258) * feat(mcp): singleton lifecycle detection with atomic claim (#258) * chore: update progress.json -- task 5/6 complete, VERIFY Phase 2 done * chore: mark VERIFY server refactor + dual transport completed (#258) * chore: record VERIFY commit SHA in progress.json (#258) * review: Phase 2 server refactor and dual transport (#258) * feat(mcp): emit credential:stored event on OOB secret delivery (#258) * chore: record T7 commit SHA in progress.json (#258) * feat(mcp): provider-specific HTTP transport install configs (#258) * chore: record T8 commit SHA in progress.json (#258) * 
test(mcp): transport integration tests + Gemini client verification (#258) * chore: record T9 commit SHA in progress.json (#258) * chore: mark VERIFY event wiring + client config completed (#258) * review: Phase 3 event wiring and client config (#258) * docs(mcp): document HTTP+SSE transport, singleton model, event bus (#258) * chore: record T10 completion + VERIFY checkpoint results (#258) * review: Phase 4 docs + final sprint review (#258) * cleanup: remove fleet control files * docs(service): OS service lifecycle implementation plan Add PLAN.md with the implementation plan for making apra-fleet behave like a normal OS service -- start/stop/restart/status verbs, per-user service registration folded into install/uninstall, cross-platform support for Windows (schtasks), Linux (systemd --user), and macOS (launchd LaunchAgent), all without elevation. Extends PR #273. * review: OS service lifecycle plan review * docs(service): revise plan per review -- dev-path, branch, macOS idempotency, stop semantics * review: OS service lifecycle plan re-review * feat(service): POST /shutdown endpoint and service constants (#258) * progress: mark T1 complete (ef84f92) * feat(service): T2 ServiceManager interface and factory ServiceManager interface (register, unregister, start, stop, query, isInstalled) + ServiceStatus type in types.ts. Factory getServiceManager() selects per-platform adapter (win32/linux/darwin), falling back to NoopServiceManager on unsupported platforms. gracefulStopByServerJson() reads server.json and POST /shutdown with 5s pid-poll + SIGTERM fallback. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore(service): add platform adapter stubs to unblock build Minimal throw-not-implemented stubs for WindowsServiceManager, LinuxServiceManager, MacOSServiceManager. Created by PM after token outage interrupted fleet-dev mid-sprint. T3/T4/T5 will replace these with real schtasks/systemd/launchd implementations. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(http-transport): declare claude/channel MCP capability
  Adds experimental: { 'claude/channel': {} } to the McpServer capabilities on each session. Enables server-to-client push via notifications/claude/channel over the existing SSE stream. POC validated: server can inject messages into a Claude Code session unprompted.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(service): add T6.5 capability logging task, mark T1/T2 complete
  Extends sprint plan with T6.5 (MCP session capability logging, beads 78g): log clientInfo, capabilities, and channel flag on session init/close. Marks T1 (ef84f92) and T2 (9963198) as completed in progress.json. Notes stubs committed for T3/T4/T5 to unblock build after token outage.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(service): Windows Scheduled Task adapter (#258)
* feat(service): Linux systemd user unit adapter (#258)
* feat(service): macOS launchd LaunchAgent adapter (#258)
* test(service): service manager adapter unit tests (#258)
* feat(http-transport): log MCP session init and close with client caps (#258)
* progress: mark T6/T6.5 complete, update T3-T5 notes
* progress: VERIFY blocked on build/test approval
* feat(service): service manager unit tests (#258)
* progress: VERIFY passed -- 85 files, 1365 tests green
* progress: mark VERIFY id=8 completed
* review: Phase 1 platform service foundation code review
* feat(service): start and stop CLI commands (#258)
  Add runStart and runStop CLI verbs. start checks for a running instance (idempotent), uses the service manager when a unit is installed, otherwise spawns a detached process redirected to LOG_FILE_PATH. stop posts /shutdown, polls up to 5s, falls back to taskkill (Windows) or SIGTERM. Both wired into src/index.ts dispatch.
* feat(service): restart CLI command (#258)
  Add runRestart: calls runStop then runStart. Wire into index.ts dispatch. Also commit progress.json update for T7.
* feat(service): status CLI command (#258)
  Add runStatus: reads server.json, GET /health for live metrics (version, uptime, sessions), queries service manager for unit state. Formats output with State/PID/Port/URL/Version/Uptime/Sessions/Service fields. Wired into index.ts dispatch.
* test(service): CLI verb tests and help update (#258)
  18 vitest tests covering start (already-running idempotent, service-managed start, detached spawn, timeout failure), stop (not-running idempotent, /shutdown POST, cleanup), restart (stop-then-start, idempotent when stopped), and status (running/stopped states, service labels, health fields). Update --help to list start/stop/restart/status verbs.
* progress: VERIFY P2 complete -- 86 files, 1376 passed, 18 new CLI verb tests green
* feat(service): CLI verb tests and --help update (#258)
  Add tests/cli-verbs.test.ts with 18 tests covering runStart (already running idempotent, service manager path, spawn path, failure exit), runStop (not running idempotent, /shutdown post, file cleanup), runRestart (stop then start), and runStatus (stopped/running states, service labels, health fields). Update --help to list start/stop/restart/status verbs.
* feat(service): extend install/uninstall with service lifecycle (#258)
  install: in SEA+HTTP mode, register the service unit and start it as a new numbered step after Beads. Adds Service line to the Done summary. totalSteps updated; beads step uses baseSteps so numbering is correct.
  uninstall: replace hard killApraFleet with svcMgr.stop() (graceful POST /shutdown + poll + fallback) in the --force path; always call svcMgr.unregister() before file removal (idempotent, tolerates not-found).
* progress: VERIFY P2 final -- 86 files, 1383 passed, 0 failed
  Update T10 and VERIFY P2 entries to reference 37a28b6 (spy-based rewrite that fixed node:fs factory-mock leakage in fileParallelism:false mode). Full suite: 86 files, 1383 passed, 13 skipped, 0 failed.
* test(service): install/uninstall service integration tests (#258)
  13 tests covering T11+T12: install calls register+start in SEA+HTTP mode, skips for stdio or dev mode, warns non-fatally on register failure, shows correct step numbering; uninstall calls stop then unregister in correct order, skips both in dry-run, swallows unregister errors, guards server-running check without --force.
* chore: VERIFY P3 -- install/uninstall integration complete
  npm run build: clean. npm test: 87 files, 1396 passed, 13 skipped, 0 failed.
* docs(readme): document service model and start/stop/restart/status verbs (#258)
* docs(arch): document service manager architecture (#258)
* chore: VERIFY P4 -- documentation complete, 87 files 1396 passed
* fix(service): quote args in Windows bat wrapper to support paths with spaces (#258)
* fix(service): always run gracefulStop before systemd check in LinuxServiceManager (#258)
* fix(service): XML-escape path values in macOS plist builder (#258)
* fix(service): use SIGKILL not SIGTERM for force-kill fallback on Unix (#258)
* fix(service): delegate stop to ServiceManager when service is installed (#258)
* fix(service): rollback register() if start() fails during install (#258)
* test(service): update bat wrapper test to expect quoted args (#258)
* ci(llms-full): regen after rebase on main
* fix(service): address reviewer findings -- shutdown exit code, shared process-utils, agy transport tests (#258)
  R1: wrap each step in shutdown() in its own try/catch and always call process.exit(0), preventing SIGTERM from triggering systemd/launchd restart loop on graceful stop.
  R2: extract isPidAlive and postShutdown into src/utils/process-utils.ts; remove 4 duplicate isPidAlive copies (stop.ts, singleton.ts, service-manager/index.ts, task-cleanup.ts) and 2 postShutdown copies (stop.ts, service-manager/index.ts).
  R3: add --transport http and --transport stdio test cases for agy to tests/install-multi-provider.test.ts to match the pattern used by claude, gemini, codex, and copilot.

---------

Co-authored-by: Bot <bot@apra-fleet.dev>
Co-authored-by: Akhil Kumar <akhil@Akhils-MacBook-Pro.local>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
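The stop sequence described in the commits above (POST /shutdown, poll for up to 5s, then force-kill) might look roughly like the sketch below. Only the names isPidAlive and gracefulStop, the /shutdown endpoint, the 5s window, and the SIGKILL fallback come from the commit messages. The function bodies, the waitForExit helper, and the 100ms poll interval are assumptions for illustration, not the code in src/utils/process-utils.ts.

```typescript
// Hypothetical sketch of the graceful-stop flow; not the PR's actual code.

/** True if a process with this PID exists. Signal 0 probes without killing. */
function isPidAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return (err as { code?: string }).code === "EPERM";
  }
}

/** Poll until `isAlive` reports false or the timeout elapses (helper name is an assumption). */
async function waitForExit(
  pid: number,
  timeoutMs = 5000,
  intervalMs = 100,
  isAlive: (pid: number) => boolean = isPidAlive,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (!isAlive(pid)) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return !isAlive(pid);
}

/** Ask the server to shut down, wait, and force-kill only as a last resort. */
async function gracefulStop(pid: number, port: number): Promise<"graceful" | "forced"> {
  try {
    // Global fetch requires Node 18 or later.
    await fetch(`http://127.0.0.1:${port}/shutdown`, { method: "POST" });
  } catch {
    // The server may already be unreachable; fall through to polling.
  }
  if (await waitForExit(pid)) return "graceful";
  try {
    // The PR uses taskkill on Windows; this sketch uses process.kill everywhere for brevity.
    process.kill(pid, "SIGKILL");
  } catch {
    // The process exited between the last poll and the kill.
  }
  return "forced";
}
```

Passing the liveness check into waitForExit as a parameter keeps the polling loop testable without spawning real processes.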
Closes #258.
Summary
Replaces fleet's stdio-only MCP transport with a default singleton HTTP+SSE server that multiple LLM clients share, while keeping stdio as a backward-compat fallback. Both transports co-exist.
* (src/services/http-transport.ts) - one McpServer per client session over StreamableHTTPServerTransport, bound to 127.0.0.1 only, default port 7523 with fallback. Multiple Claude/Gemini clients connect concurrently to one fleet service per machine.
* (src/services/event-bus.ts) - internal pub/sub; events are broadcast to all connected SSE clients as notifications/message.
* (src/services/singleton.ts) - atomic startup claim (fs.openSync wx) prevents start races; PID + /health double-check detects a running instance; stale server.json/lock cleanup.
* --transport http|stdio flag - default http; stdio path unchanged, no regression.
* credential:stored the moment the OOB secret is delivered, no polling.
* apra-fleet install writes HTTP transport config for Claude, Gemini, Copilot and Codex (stdio config when --transport stdio).
* docs/architecture.md "Transport Layer".

Transport choice
Uses StreamableHTTPServerTransport (not the deprecated SSEServerTransport). Verified that both Claude Code and Gemini CLI support Streamable HTTP as of 2026-05.

Validation
Follow-ups (filed as backlog)
* src/services/tool-registry.ts (ASCII cleanup).

Out of scope: the Anthropic client-side change to surface notifications/message as conversation injections (external ask; server side is spec-compliant).

Generated via apra-fleet doer/reviewer sprint.
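For readers unfamiliar with the atomic startup claim mentioned in the Summary, here is a minimal sketch of the idea, assuming a lock file whose content is the owner's PID. The wx flag to fs.openSync is from the PR. The function names tryClaim and claimOrReclaim, the server.lock file name, and the injected liveness callback are hypothetical, and the real singleton.ts also double-checks /health, which this sketch omits.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

/**
 * Try to create the lock file exclusively. With the "wx" flag the open fails
 * with EEXIST if the file already exists, so at most one process succeeds.
 */
function tryClaim(lockPath: string): boolean {
  try {
    const fd = fs.openSync(lockPath, "wx");
    fs.writeSync(fd, String(process.pid));
    fs.closeSync(fd);
    return true;
  } catch (err) {
    if ((err as { code?: string }).code === "EEXIST") return false;
    throw err; // Permission or path errors are real failures, not lost races.
  }
}

/**
 * Claim the lock, removing it first when its recorded owner is no longer
 * alive (a stale lock). `isOwnerAlive` is injected; the PR describes a PID
 * plus /health double-check at this point.
 */
function claimOrReclaim(
  lockPath: string,
  isOwnerAlive: (pid: number) => boolean,
): boolean {
  if (tryClaim(lockPath)) return true;
  const owner = Number(fs.readFileSync(lockPath, "utf8").trim());
  if (Number.isInteger(owner) && owner > 0 && isOwnerAlive(owner)) return false;
  fs.rmSync(lockPath, { force: true });
  return tryClaim(lockPath); // Still atomic: a concurrent reclaimer may win instead.
}

// Usage: two claims on the same path; only the first succeeds.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "fleet-lock-"));
const demoLock = path.join(demoDir, "server.lock"); // hypothetical file name
const first = tryClaim(demoLock);
const second = tryClaim(demoLock);
console.log({ first, second });
```

The exclusive create is what closes the start race: checking for the file and then writing it in two steps would leave a window in which two processes both see no lock. Stale cleanup has its own small window between the remove and the retry, which is one reason a liveness double-check on the winner is useful.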