Menu

#75 feat: inter-session attention mechanism — PM↔member communication across sessions

open
nobody
None
2026-04-26
2026-04-05
Anonymous
No

Originally created by: kumaakh

When a member hits a verify checkpoint or becomes blocked, there is no way for the PM to communicate across sessions. The PM must manually poll `progress.json`.

Don't try to inject into a running session. Instead, have the orchestrator write updates to a known file (`TASK_UPDATES.md`) and include in the original dispatch prompt: "Before each major step, check TASK_UPDATES.md for any new instructions."

The agent reads that file as part of its agentic loop since it was instructed to. Low-tech but surprisingly effective — poll by instruction rather than by infrastructure. Works equally for Claude and Gemini members.

PM writes to `TASK_UPDATES.md` via `send_files` or `execute_command` — no session interrupt needed.

Future infrastructure: `execute_prompt` with `update` flag

A cleaner solution is to add an `update` flag to `execute_prompt`:

  • If the member's session is currently running: inject the prompt as a task instruction into the live session (native interject).
  • If the session has finished: replace `.fleet-task.md` with the new prompt and resume the same session (same session ID, continuity preserved).

This gives PM a single call to "reach" a member regardless of whether they're mid-task or at a checkpoint — no polling, no separate resume logic.

Other future options

  • Needs-attention file polled by PM on session start
  • Slack DM via webhook
  • Desktop notification / toast
  • Statusline urgency sort (#2) as lightweight alternative

The file-polling pattern above serves as the immediate solution and remains a useful low-overhead fallback once native interject lands.

Backlog item [#14] from docs/MCP-BACKLOG.md. High priority.

Related

Tickets: #14
Tickets: #152

Discussion

  • Anonymous

    Anonymous - 2026-04-12

    Originally posted by: kumaakh

    From my research:

    Given your PM → doer/reviewer architecture, the cleanest pattern might be: don't try to inject into a running session. Instead, have your orchestrator write updates to a known file (like TASK_UPDATES.md) and include in the original prompt: "Before each major step, check TASK_UPDATES.md for any new instructions." Claude will cat that file as part of its agentic loop since you told it to. This is low-tech but surprisingly effective — you're essentially making the agent poll by instruction rather than by infrastructure.

     
  • Anonymous

    Anonymous - 2026-04-12

    Originally posted by: kumaakh

    PM Skill Integration Pattern

    Given the PM → doer/reviewer architecture, the cleanest pattern might be: don't try to inject into a running session. Instead, have the orchestrator write updates to a known file (e.g. TASK_UPDATES.md) and include in the original dispatch prompt: "Before each major step, check TASK_UPDATES.md for any new instructions."

    Claude will read that file as part of its agentic loop since it was instructed to. This is low-tech but surprisingly effective — you're essentially making the agent poll by instruction rather than by infrastructure.

    How this fits the PM skill

    • PM writes urgent instructions to TASK_UPDATES.md in the member's work folder via send_files or execute_command
    • The doer's agent context file (CLAUDE.md) includes the polling instruction as a standing rule
    • No new fleet infrastructure needed — works today with zero changes
    • Complements a future native interject mechanism: if infra-level injection lands, the file-polling fallback becomes the low-latency path for non-critical updates

    Suggested addition to doer context file template

    ## Live Updates
    Before each task: check TASK_UPDATES.md in your work folder for any new instructions from PM.
    If the file exists and contains unread instructions, apply them before proceeding.
    
     
  • Anonymous

    Anonymous - 2026-04-23

    Originally posted by: kumaakh

    Technical direction: The file-polling pattern described in the comments is already the recommended immediate approach and requires no infrastructure changes. The native interject path is the right next step:

    Add an update flag to execute_prompt (src/tools/execute-prompt.ts):

    • Session running: use the provider's native interject mechanism (Claude: --append-system-prompt or a follow-up -p on the same session ID; check Claude CLI docs for the correct flag).
    • Session idle/finished: write new prompt to .fleet-task.md and call execute_prompt with
      esume: true on the existing session ID — continuity preserved.

    The key design question is whether Claude CLI supports injecting into a currently running session (mid-turn). If not, the file-polling approach remains the only real-time path. Worth checking claude --help for any --inject or --append flag before implementing.

     
  • Anonymous

    Anonymous - 2026-04-26

    Originally posted by: kumaakh

    Research: Inter-session attention mechanism — Feasibility findings

    1. Claude Code CLI capability: Can you inject input into a running claude -p session?

    No. claude -p (print mode) is a fire-and-forget invocation. It reads a prompt, runs to completion (up to --max-turns), and exits. There is no stdin channel for injecting additional user turns mid-session. Key observations:

    • claude -p closes stdin immediately after reading the prompt. The fleet codebase confirms this: LocalStrategy.execCommand calls child.stdin?.end() right after spawn (src/services/strategy.ts:96).
    • --resume / -c resumes a completed session — it starts a new process that loads the prior conversation as context, then appends a new user turn. It does not attach to a running process.
    • --session-id <uuid> sets the session ID for a new invocation; it doesn't connect to a live one.
    • --input-format stream-json allows streaming input, but only before the agent starts running — it's for piping structured input at invocation time, not for mid-turn injection.

    Verdict: Claude Code CLI has no mechanism for injecting a message into a running -p session from an external process.

    2. Current execute_prompt resume path in apra-fleet

    Reading src/tools/execute-prompt.ts and src/providers/claude.ts:

    • Fleet stores sessionId on each agent after a successful prompt execution (touchAgent(agent.id, parsed.sessionId) at line 165).
    • On the next execute_prompt call with resume: true, it appends -c to the claude -p command (Claude provider's resumeFlag()). This resumes the prior conversation context in a new process.
    • If the session is stale (process exits non-zero), fleet retries without the session ID (lines 146-150).
    • The prompt is written to .fleet-task.md in the work folder, then cleaned up after execution.

    Key insight: Resume today is strictly sequential — it only works after the previous session has exited. There is no "resume into a running session" path.

    Could "resume with a NEW prompt" serve as the inject mechanism? Only if the current session has finished. You cannot resume a session that is still running — the CLI would either fail or start a parallel conversation with the same history. So resume is the right mechanism for follow-up messages, but not for mid-flight steering.

    3. File polling viability (TASK_UPDATES.md approach)

    How it would work:

    • PM writes steering instructions to a file (e.g., TASK_UPDATES.md) in the member's work folder.
    • The member's system prompt includes an instruction like: "Before each major step, check TASK_UPDATES.md for new directives. If present, incorporate them into your plan."
    • Claude Code's agentic loop would see the file on its next Read tool call (or the prompt could instruct periodic checking).

    Risks and limitations:

    • Timing gap: If the member finishes before checking, the update is missed entirely. The PM would need to re-dispatch.
    • No guaranteed check frequency: Claude Code decides when to read files. You can instruct it to check, but compliance is probabilistic, not deterministic.
    • Race conditions: PM writes while member reads — needs atomic writes or a sequence counter.
    • Clutters the work folder: TASK_UPDATES.md competes with the codebase for the agent's attention.

    Mitigation: Use a sentinel file (e.g., .fleet-attention) that the member's prompt says to check after every tool call. But this is still best-effort.

    4. Recommendation

    Approach B (resume with new prompt) is simpler and more reliable than file polling — but with a critical caveat.

    The realistic interaction model is:

    1. PM dispatches a prompt → member runs to completion → PM gets the result.
    2. PM sends a follow-up → fleet uses -c to resume with new context → member runs again.

    This is already how fleet works today. The "attention mechanism" reduces to: can the PM interrupt a running session to redirect it?

    For true mid-flight injection, neither approach works without upstream Claude Code changes. But there's a pragmatic middle ground:

    Recommended minimal API change:

    1. Add a cancel_and_redirect option to execute_prompt that:
    2. Kills the running session process (fleet already tracks PIDs for this)
    3. Immediately resumes with -c and the new steering prompt
    4. The new prompt includes context like "Your previous task was interrupted. The PM has a new directive: ..."
    5. This gives the PM effective "attention" without needing mid-turn injection. The member loses in-progress work on the current turn, but retains full conversation history via session resume.

    Effort: Small — the PID tracking and kill infrastructure already exists. The resume path already works. The main work is a new tool parameter and the orchestration logic.

    File polling (Approach A) should be a Phase 2 enhancement for cases where the PM wants to augment a running session without interrupting it (e.g., "also check the tests when you're done"). It's lower priority because the timing guarantees are weak.

     
  • Anonymous

    Anonymous - 2026-04-26

    Originally posted by: kumaakh

    Research: Inter-session attention — stream-json as a potential injection path

    Building on the prior feasibility research, I investigated one under-explored avenue: Claude Code's --input-format stream-json mode.

    Key finding: --input-format stream-json may support mid-session user-turn injection

    From claude --help:

    --input-format stream-json    Input format (only works with --print): "text" (default),
                                   or "stream-json" (realtime streaming input)
    --replay-user-messages        Re-emit user messages from stdin back on stdout for
                                   acknowledgment (only works with --input-format=stream-json
                                   and --output-format=stream-json)
    

    The description says "realtime streaming input" — not "initial structured input." And --replay-user-messages explicitly references "user messages from stdin," implying multiple messages can be sent over the lifetime of the process.

    This suggests that claude -p --input-format stream-json --output-format stream-json keeps stdin open and accepts additional user messages while the agent is running. If confirmed, this is exactly the injection mechanism fleet needs — no polling, no kill-and-resume, no upstream changes required.

    How fleet currently blocks this path

    Both execution strategies close stdin immediately:

    • LocalStrategy (src/services/strategy.ts:95): child.stdin?.end() right after spawn
    • SSH (src/services/ssh.ts:143-144): stream.end() immediately, with the comment "Close stdin so commands that read from it (e.g. claude -p) get EOF"

    If we kept stdin open and switched to stream-json input/output format, we could potentially write additional user turns to the running process.

    What an update flag implementation could look like

    execute_prompt({ member: "doer", prompt: "...", update: true })
    
    1. If no session is running: dispatch normally (write .fleet-task.md, invoke claude -p, etc.)
    2. If session IS running and was started with --input-format stream-json:
    3. Write a new user-turn JSON message to the running process's stdin
    4. The agent receives it as a new instruction mid-loop
    5. If session finished: resume with -c and the new prompt (existing path)

    Architecture changes needed

    1. Keep stdin open — don't call child.stdin?.end() or stream.end() at spawn time. Instead, store a reference to the writable stream.
    2. Switch to stream-json — use --input-format stream-json --output-format stream-json instead of --output-format json.
    3. Track running processes — maintain a Map<agentId, { stdin: Writable, sessionId: string }> so execute_prompt with update: true can write to the correct stdin.
    4. Parse streaming output — the output format changes from a single JSON blob to a stream of events. Fleet's parseResponse would need a streaming variant.

    Caveats and risks

    • Unverified experimentally: The stream-json input behavior needs testing. It's possible "realtime streaming input" only means streaming the initial prompt in chunks, not injecting new turns during autonomous execution. The --replay-user-messages flag is strong evidence for multi-turn support, but testing is essential.
    • SSH transport complication: For remote members, keeping the SSH channel's stdin open for the entire session duration changes the execution model from fire-and-forget to persistent-connection. This increases complexity around timeout handling, connection drops, and reconnection.
    • Local-only first: This approach is significantly simpler for local members (just keep a Node.js ChildProcess.stdin reference). Remote members via SSH require more careful engineering.

    Updated recommendation

    Approach Effort Reliability Mid-flight? Upstream needed?
    File polling (TASK_UPDATES.md) Low Low Yes (best-effort) No
    Cancel + resume (kill → -c) Low-Med High No (restarts turn) No
    stream-json stdin injection Medium High (if confirmed) Yes (true inject) No
    UDS channel → mid-turn inject High Highest Yes Yes

    Recommended next step: Spike a manual test of --input-format stream-json to confirm multi-turn stdin injection works:

    # Test: does writing a second user message to stdin get processed?
    echo '{"type":"user","content":"What is 2+2?"}' | claude -p --input-format stream-json --output-format stream-json
    

    If confirmed, implement stream-json injection for local members first (Phase 1), then extend to SSH (Phase 2). The cancel-and-resume approach remains the reliable fallback regardless.

     

Log in to post a comment.

MongoDB Logo MongoDB