Originally created by: kumaakh
Fleet currently uses the MCP stdio transport — the LLM client writes JSON-RPC requests to the server's stdin and reads responses from stdout. This is strictly request-response: the server can only speak when spoken to. There is no mechanism for the server to push unsolicited messages to the LLM.
The MCP spec defines a second transport — HTTP + Server-Sent Events (SSE) — where the client POSTs requests over HTTP and the server maintains an open SSE stream. On that stream the server can push notifications/* events at any time, unprompted, for the lifetime of the session.
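To make concrete what travels on that stream, here is a minimal sketch (hypothetical helper name, not from the fleet codebase) of serializing a JSON-RPC notification into a `text/event-stream` frame — the `event:`/`data:` lines and the blank-line terminator are what the SSE wire format requires:

```typescript
// Sketch only: turn an MCP-style JSON-RPC notification into one SSE frame.
type JsonRpcNotification = { jsonrpc: "2.0"; method: string; params?: unknown };

function toSseFrame(note: JsonRpcNotification, eventName = "message"): string {
  const payload = JSON.stringify(note);
  // A multi-line payload must be split into one `data:` line per line.
  const dataLines = payload.split("\n").map((l) => `data: ${l}`).join("\n");
  // A blank line terminates the event.
  return `event: ${eventName}\n${dataLines}\n\n`;
}

// Example: the kind of push fleet could send when a secret lands.
const frame = toSseFrame({
  jsonrpc: "2.0",
  method: "notifications/message",
  params: { level: "info", data: "Secret stored: e2e_bb_token" },
});
```

The client keeps the HTTP response open and parses frames as they arrive; nothing about the frame itself is request-scoped, which is what makes unsolicited pushes possible.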
| Layer | Change |
|---|---|
| MCP server | Replace stdio JSON-RPC handler with an HTTP server (e.g. Express or native node:http). Expose a POST endpoint for tool calls and an SSE endpoint (/events) for push notifications. |
| MCP client config | mcp.json changes from "type": "stdio" to "type": "sse" with a URL pointing to the local HTTP server. |
| Event bus | Internal pub/sub bus inside fleet so any subsystem (auth socket, task monitor, stall detector) can emit events that get forwarded onto the SSE stream. |
| Claude Code client | Claude Code already supports the SSE transport. Whether it surfaces notifications/message as LLM conversation injections is a separate Anthropic ask — but the server side is ready. |
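The event-bus row above can be sketched as a small pub/sub class; `FleetEventBus` and the event shape are illustrative names, not taken from the fleet codebase:

```typescript
// Hypothetical internal bus: subsystems emit, the SSE endpoint subscribes
// once and forwards every event onto the open stream.
type FleetEvent = { type: string; payload: unknown };
type Listener = (e: FleetEvent) => void;

class FleetEventBus {
  private listeners = new Set<Listener>();

  subscribe(fn: Listener): () => void {
    this.listeners.add(fn);
    return () => { this.listeners.delete(fn); }; // unsubscribe handle
  }

  emit(e: FleetEvent): void {
    for (const fn of this.listeners) fn(e);
  }
}

// Usage: the auth socket emits, a (simulated) SSE writer receives.
const bus = new FleetEventBus();
const received: FleetEvent[] = [];
const unsubscribe = bus.subscribe((e) => received.push(e));
bus.emit({ type: "secret_stored", payload: { key: "e2e_bb_token" } });
unsubscribe();
```

Keeping the bus decoupled from HTTP means the stall detector and task monitor never need to know whether anyone is listening.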
`credential_store_set` currently returns immediately with a "Waiting…" message. The LLM has no way to know when the user completes the OOB entry. With SSE, fleet pushes "✓ Secret stored: e2e_bb_token" onto the event stream the moment the auth socket delivers the value — the LLM sees it without polling.
- `execute_prompt` completion — the LLM dispatches a background prompt and gets notified when it finishes, without calling `monitor_task` in a loop.
- `fleet_status` changes.

All of the above today require the LLM to poll (`monitor_task`, `fleet_status`, repeated `execute_command` checks). Polling wastes turns, burns tokens, and introduces latency. SSE collapses all of these into a single persistent channel — fleet becomes an event source, not just a tool executor.
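The polling pattern being replaced can be sketched as a task registry: dispatch returns a `task_id` immediately and the caller keeps re-reading recorded status, which is roughly what `long_running` plus `monitor_task` do today. `TaskRegistry` and its methods are illustrative stand-ins, not the real implementation:

```typescript
// Sketch of the fire-and-forget + poll pattern (names are hypothetical).
type TaskStatus = "running" | "done" | "failed";
type TaskRecord = { status: TaskStatus; output?: string };

class TaskRegistry {
  private tasks = new Map<string, TaskRecord>();
  private nextId = 0;

  start(): string {
    const id = `task-${++this.nextId}`;
    this.tasks.set(id, { status: "running" });
    return id; // handed back to the caller right away
  }

  complete(id: string, output: string): void {
    this.tasks.set(id, { status: "done", output });
  }

  monitor(id: string): TaskRecord | undefined {
    return this.tasks.get(id); // each poll is a whole extra LLM turn
  }
}

const registry = new TaskRegistry();
const taskId = registry.start();         // fire-and-forget dispatch
const before = registry.monitor(taskId); // still "running"
registry.complete(taskId, "exit 0");
const after = registry.monitor(taskId);  // now "done"
```

With SSE, `complete()` would instead emit an event onto the stream, and the `monitor()` loop — and every turn it costs — disappears.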
Open items:

- A `--transport` flag (`mcp.json` can be auto-generated).
- Surfacing `notifications/message` events as LLM conversation injections in Claude Code — the server side is ready; the client side needs Anthropic's support.

Labels: enhancement, wishlist, mcp, architecture
Originally posted by: kumaakh
## Analysis: Non-blocking `execute_prompt` with SSE

### TL;DR
The proposal shifts the `execute_prompt` MCP tool from a synchronous, blocking model to an asynchronous, fire-and-forget pattern. Instead of waiting for the LLM execution to complete, the server would immediately return a unique stream identifier. The caller can then subscribe to the stream to receive real-time updates via Server-Sent Events (SSE). This unlocks parallel prompt dispatching for the orchestrator (PM) and provides a versatile streaming primitive for all long-running fleet tasks.

### Current State
Currently, `execute_prompt` is fully synchronous from the MCP caller's perspective:

- The tool is registered in `src/index.ts` (L224).
- `src/tools/execute-prompt.ts` blocks on `await strategy.execCommand(...)` (L190).
- `inFlightAgents.add(agent.id)` (L120 in `execute-prompt.ts`) means concurrent dispatches to the same member are instantly rejected.
- The `stallDetector` (`src/services/stall/index.ts`) polls the LLM's conversation log (e.g. `.jsonl` files) to ensure the process hasn't hung.
- No partial results are streamed to the caller (`execute-prompt.ts`).
- The `execute_command` tool (`src/tools/execute-command.ts`) already implements a `long_running` fire-and-forget flag that returns a `task_id`, which the caller polls via `monitor_task` (`src/index.ts` L248).

### Why the current model strains at scale
### SSE in MCP — what's actually possible
The `@modelcontextprotocol/sdk` (v1.27.0 in `package.json`) supports two primary transports: `StdioServerTransport` and `SSEServerTransport`. `apra-fleet` uses stdio (JSON-RPC over standard input/output) for its MCP server because it operates primarily via CLI. Switching means hosting `SSEServerTransport` over an HTTP server (e.g., Express). Push updates can then be delivered via `notifications/message` or by implementing a long-polling tool (similar to how `monitor_task` currently polls `execute_command`).

### Implementation plan in phases
1. Extend `executePromptSchema` to accept an `async: z.boolean().default(false)` parameter.
2. When `async=true`, bypass the `await strategy.execCommand` block. Instead, spawn the process, capture the `inv` (invocation ID) or `sessionId`, and return immediately: `{ stream_id: "<id>", status: "started" }`.
3. Add an `SSEServerTransport`. Implement the `GET /sse` endpoint where clients subscribe using `stream_id`.
4. Emit events (e.g. `prompt_progress`) containing partial tokens or status updates keyed by `stream_id`.
5. Update `skills/pm/SKILL.md` to instruct the PM to use `async=true` for multi-member dispatches and `monitor_task` to gather results across parallel tasks.
6. Extend streaming to `fleet_status` changes, file transfer progress, and `execute_command` outputs, deprecating the need for manual polling via `monitor_task`.

### Risks
- `execute_prompt` currently blocks until completion. We must ensure `async=false` preserves the exact current behavior.
- `stallDetector` state.

### Open Questions
### Out-of-scope notes
- Removing the `stallDetector` (it is still required to detect frozen background tasks).
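For reference, the Phase 1–2 contract described in the plan above can be sketched as a single entry point that either blocks (current behavior) or returns a stream handle; the executor and ID source here are stand-ins for `strategy.execCommand` and the `inv`/`sessionId` capture, not the real code:

```typescript
// Illustrative only: one tool entry point, two result shapes.
type SyncResult = { status: "completed"; output: string };
type AsyncHandle = { stream_id: string; status: "started" };

function executePrompt(
  prompt: string,
  opts: { async?: boolean },          // mirrors async: z.boolean().default(false)
  run: (p: string) => string,         // stand-in for the blocking execution path
  newStreamId: () => string,          // stand-in for inv / sessionId capture
): SyncResult | AsyncHandle {
  if (opts.async) {
    // Real code would spawn the process here and emit progress on the bus.
    return { stream_id: newStreamId(), status: "started" };
  }
  // async=false must remain byte-for-byte the current blocking behavior.
  return { status: "completed", output: run(prompt) };
}

const sync = executePrompt("hi", {}, (p) => p.toUpperCase(), () => "s-1");
const fireAndForget = executePrompt("hi", { async: true }, (p) => p, () => "s-1");
```

Keeping one entry point (rather than a second tool) means the schema default enforces backward compatibility: callers that never pass `async` cannot observe any change.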