
feat: switch MCP transport from stdio to HTTP+SSE for server-push and event-driven workflows #258

@kumaakh

Background

Fleet currently uses the MCP stdio transport: the LLM client writes JSON-RPC requests to the server's stdin and reads responses from stdout. As fleet uses it today, this is strictly request-response, so the server only speaks when spoken to. Fleet has no mechanism for pushing unsolicited messages to the LLM.

The MCP spec defines a second transport — HTTP + Server-Sent Events (SSE) — where the client POSTs requests over HTTP and the server maintains an open SSE stream. On that stream the server can push notifications/* events at any time, unprompted, for the lifetime of the session.
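
To make the wire format concrete, here is a minimal sketch of how one pushed notification could be framed on the SSE stream. The helper name toSseFrame and the example payload text are assumptions for illustration, not existing fleet code; the frame layout follows the standard SSE "field: value" format with a blank line as terminator.

```typescript
// A JSON-RPC 2.0 notification carries a method but no id, so no reply is expected.
interface JsonRpcNotification {
  jsonrpc: "2.0";
  method: string;
  params?: Record<string, unknown>;
}

// Hypothetical helper: serialise one notification as a single SSE frame.
function toSseFrame(note: JsonRpcNotification, eventName = "message"): string {
  // JSON.stringify escapes newlines, so the payload always fits on one data: line.
  return `event: ${eventName}\ndata: ${JSON.stringify(note)}\n\n`;
}

const frame = toSseFrame({
  jsonrpc: "2.0",
  method: "notifications/message",
  params: { level: "info", data: "example push from fleet" },
});
console.log(frame);
```

The client reads frames like this for as long as the stream stays open, which is what lets the server speak without a preceding request.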

What needs to change in fleet

| Layer | Change |
| --- | --- |
| MCP server | Replace the stdio JSON-RPC handler with an HTTP server (e.g. Express or native node:http). Expose a POST endpoint for tool calls and an SSE endpoint (/events) for push notifications. |
| MCP client config | mcp.json changes from "type": "stdio" to "type": "sse" with a URL pointing to the local HTTP server. |
| Event bus | Internal pub/sub bus inside fleet so any subsystem (auth socket, task monitor, stall detector) can emit events that get forwarded onto the SSE stream. |
| Claude Code client | Claude Code already supports the SSE transport. Whether it surfaces notifications/message as LLM conversation injections is a separate Anthropic ask, but the server side would be ready. |
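
A rough sketch of how the server and event-bus layers could fit together, using only node:http and node:events. Everything named here (bus, emitNotification, attachSseClient, the /message POST path) is hypothetical; only the /events path comes from this issue. Tool-call handling is deliberately elided.

```typescript
import { EventEmitter } from "node:events";
import { createServer } from "node:http";

// Hypothetical internal bus: any subsystem emits "push" with a notification body.
const bus = new EventEmitter();

// Minimal shape of something we can stream to (an HTTP response satisfies this).
interface SseSink {
  write(chunk: string): unknown;
}

function emitNotification(data: string, level = "info"): void {
  bus.emit("push", { jsonrpc: "2.0", method: "notifications/message", params: { level, data } });
}

// Subscribe one open SSE stream to the bus; returns an unsubscribe function.
function attachSseClient(sink: SseSink): () => void {
  const onPush = (note: unknown): void => {
    sink.write(`event: message\ndata: ${JSON.stringify(note)}\n\n`);
  };
  bus.on("push", onPush);
  return () => {
    bus.off("push", onPush);
  };
}

const server = createServer((req, res) => {
  if (req.method === "GET" && req.url === "/events") {
    res.writeHead(200, {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    });
    // The HTTP+SSE transport starts the stream by naming the POST endpoint.
    res.write("event: endpoint\ndata: /message\n\n");
    req.on("close", attachSseClient(res)); // stop forwarding when the client leaves
    return;
  }
  if (req.method === "POST" && req.url === "/message") {
    // Assumption: a real handler would parse the JSON-RPC body and dispatch the tool call.
    res.writeHead(202).end();
    return;
  }
  res.writeHead(404).end();
});

// Not called here; port 0 asks the OS for a free port on loopback only.
function start(): void {
  server.listen(0, "127.0.0.1");
}
```

Keeping the bus separate from the HTTP handler means the auth socket, task monitor, and stall detector only need to call emitNotification and never touch the transport.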

Immediate use case that motivates this

credential_store_set currently returns immediately with a "Waiting…" message. The LLM has no way to know when the user completes the out-of-band (OOB) entry. With SSE, fleet pushes ✓ Secret stored: e2e_bb_token onto the event stream the moment the auth socket delivers the value, so the LLM sees it without polling.
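
The flow above could look roughly like this. The class and method names (CredentialStore, requestSet, deliver) are invented for the sketch and do not reflect fleet's actual internals; the push callback stands in for whatever forwards text onto the SSE stream.

```typescript
type Push = (text: string) => void;

// Hypothetical model of the credential flow: request now, confirm later via push.
class CredentialStore {
  private pending = new Set<string>();
  private secrets = new Map<string, string>();
  private push: Push;

  constructor(push: Push) {
    this.push = push;
  }

  // Tool-call path: returns immediately, as credential_store_set does today.
  requestSet(name: string): string {
    this.pending.add(name);
    return `Waiting for out-of-band entry of ${name}…`;
  }

  // Auth-socket path: called when the user finishes the out-of-band entry.
  deliver(name: string, value: string): boolean {
    if (!this.pending.delete(name)) return false; // nobody asked for this secret
    this.secrets.set(name, value);
    this.push(`✓ Secret stored: ${name}`); // the secret value is never pushed
    return true;
  }
}

const pushed: string[] = [];
const store = new CredentialStore((text) => pushed.push(text));
const reply = store.requestSet("e2e_bb_token");
store.deliver("e2e_bb_token", "example-value");
console.log(reply, pushed);
```

The tool call still returns at once; the difference is that the confirmation arrives on its own instead of waiting for the LLM to ask again.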

Other event-driven workflows this unlocks

  • execute_prompt completion — LLM dispatches a background prompt and gets notified when it finishes, without calling monitor_task in a loop.
  • Member online/offline — fleet pushes a notification when a registered member's SSH keepalive changes state. LLM knows immediately without calling fleet_status.
  • Stall detected — stall detector emits an event directly into the LLM conversation instead of just writing a log line. LLM can decide to intervene.
  • CI status flip — fleet could subscribe to a GitHub webhook and forward CI pass/fail events into an active LLM session that's waiting on a PR.
  • Credential expiry warning — fleet pushes a heads-up N minutes before a TTL credential expires, while the LLM is mid-sprint and about to use it.
  • File change watch — fleet watches a path on a member and notifies the LLM when a build artifact or config file changes.
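
One way to keep these workflows uniform is a single typed event union that every subsystem emits, with one function turning each event into a notifications/message payload. The event kinds, field names, and message wording below are all assumptions made for the sketch.

```typescript
// Hypothetical event shapes, one per workflow listed above.
type FleetEvent =
  | { kind: "task_done"; taskId: string; exitCode: number }
  | { kind: "member_state"; member: string; online: boolean }
  | { kind: "stall"; taskId: string; idleSeconds: number }
  | { kind: "ci_status"; pr: number; passed: boolean }
  | { kind: "credential_expiring"; name: string; minutesLeft: number }
  | { kind: "file_changed"; member: string; path: string };

interface LogMessage {
  level: "info" | "warning";
  data: string;
}

// Map each event to the params of a notifications/message push.
function toLogMessage(ev: FleetEvent): LogMessage {
  switch (ev.kind) {
    case "task_done":
      return {
        level: ev.exitCode === 0 ? "info" : "warning",
        data: `Task ${ev.taskId} finished with exit code ${ev.exitCode}`,
      };
    case "member_state":
      return {
        level: ev.online ? "info" : "warning",
        data: `Member ${ev.member} is now ${ev.online ? "online" : "offline"}`,
      };
    case "stall":
      return { level: "warning", data: `Stall detected: task ${ev.taskId} idle for ${ev.idleSeconds}s` };
    case "ci_status":
      return {
        level: ev.passed ? "info" : "warning",
        data: `CI ${ev.passed ? "passed" : "failed"} on PR #${ev.pr}`,
      };
    case "credential_expiring":
      return { level: "warning", data: `Credential ${ev.name} expires in ${ev.minutesLeft} min` };
    case "file_changed":
      return { level: "info", data: `File changed on ${ev.member}: ${ev.path}` };
  }
}

const stallMessage = toLogMessage({ kind: "stall", taskId: "t-1", idleSeconds: 90 });
console.log(stallMessage);
```

Adding a new workflow then means adding one union member and one case, and the compiler flags any event kind the switch forgets.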

Why this matters architecturally

All of the above today require the LLM to poll (monitor_task, fleet_status, repeated execute_command checks). Polling wastes turns, burns tokens, and introduces latency. SSE collapses all of these into a single persistent channel — fleet becomes an event source, not just a tool executor.

Suggested approach

  1. Keep stdio as a fallback (for environments that don't support HTTP) controlled by a --transport flag.
  2. Default to HTTP+SSE for local fleet servers (localhost, random port, written to a well-known file so mcp.json can be auto-generated).
  3. File a parallel request to Anthropic to surface MCP notifications/message events as LLM conversation injections in Claude Code. Once this change lands the server side will be ready; the client side still needs Anthropic's support.
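
Steps 1 and 2 could be sketched as follows: parse a --transport flag with SSE as the default, then generate the mcp.json entry from the chosen transport and port. The function names, the "fleet" server key and command, and the mcpServers layout are assumptions; writing the port to a well-known file is left out.

```typescript
type Transport = "stdio" | "sse";

interface ServerEntry {
  type: Transport;
  url?: string;
  command?: string;
}

// Hypothetical flag parser: --transport stdio|sse, defaulting to sse as proposed.
function parseTransport(argv: string[]): Transport {
  const i = argv.indexOf("--transport");
  if (i === -1) return "sse";
  const value = argv[i + 1];
  if (value === "stdio" || value === "sse") return value;
  throw new Error(`unknown transport: ${String(value)}`);
}

// Build the config a client would read; the shape is assumed, not confirmed.
function buildMcpConfig(transport: Transport, port: number): { mcpServers: Record<string, ServerEntry> } {
  const entry: ServerEntry =
    transport === "sse"
      ? { type: "sse", url: `http://127.0.0.1:${port}/events` }
      : { type: "stdio", command: "fleet" }; // command name is a placeholder
  return { mcpServers: { fleet: entry } };
}

const config = buildMcpConfig(parseTransport(["--transport", "sse"]), 49152);
console.log(JSON.stringify(config, null, 2));
```

Because the port is random, regenerating this file on every start is what keeps the client pointed at the right URL.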

Labels

enhancement wishlist mcp architecture
