apra-fleet / Tickets / #75 feat: inter-session attention mechanism — PM↔member communication across sessions

Anonymous - 2026-04-12

Originally posted by: kumaakh

From my research:

Given your PM → doer/reviewer architecture, the cleanest pattern might be: don't try to inject into a running session. Instead, have your orchestrator write updates to a known file (like TASK_UPDATES.md) and include in the original prompt: "Before each major step, check TASK_UPDATES.md for any new instructions." Claude will cat that file as part of its agentic loop since you told it to. This is low-tech but surprisingly effective — you're essentially making the agent poll by instruction rather than by infrastructure.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Anonymous - 2026-04-12

Originally posted by: kumaakh

PM Skill Integration Pattern

Given the PM → doer/reviewer architecture, the cleanest pattern might be: don't try to inject into a running session. Instead, have the orchestrator write updates to a known file (e.g. TASK_UPDATES.md) and include in the original dispatch prompt: "Before each major step, check TASK_UPDATES.md for any new instructions."

Claude will read that file as part of its agentic loop since it was instructed to. This is low-tech but surprisingly effective — you're essentially making the agent poll by instruction rather than by infrastructure.

How this fits the PM skill

PM writes urgent instructions to TASK_UPDATES.md in the member's work folder via send_files or execute_command

The doer's agent context file (CLAUDE.md) includes the polling instruction as a standing rule

No new fleet infrastructure needed — works today with zero changes

Complements a future native interject mechanism: if infra-level injection lands, the file-polling fallback becomes the low-latency path for non-critical updates

Suggested addition to doer context file template

## Live Updates Before each task: check TASK_UPDATES.md in your work folder for any new instructions from PM. If the file exists and contains unread instructions, apply them before proceeding.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Anonymous - 2026-04-23

Originally posted by: kumaakh

Technical direction: The file-polling pattern described in the comments is already the recommended immediate approach and requires no infrastructure changes. The native interject path is the right next step:

Add an update flag to execute_prompt (src/tools/execute-prompt.ts):

Session running: use the provider's native interject mechanism (Claude: --append-system-prompt or a follow-up -p on the same session ID; check Claude CLI docs for the correct flag).

Session idle/finished: write new prompt to .fleet-task.md and call execute_prompt with
esume: true on the existing session ID — continuity preserved.

The key design question is whether Claude CLI supports injecting into a currently running session (mid-turn). If not, the file-polling approach remains the only real-time path. Worth checking claude --help for any --inject or --append flag before implementing.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Anonymous - 2026-04-26

Originally posted by: kumaakh

Research: Inter-session attention mechanism — Feasibility findings

1. Claude Code CLI capability: Can you inject input into a running claude -p session?

No. claude -p (print mode) is a fire-and-forget invocation. It reads a prompt, runs to completion (up to --max-turns), and exits. There is no stdin channel for injecting additional user turns mid-session. Key observations:

claude -p closes stdin immediately after reading the prompt. The fleet codebase confirms this: LocalStrategy.execCommand calls child.stdin?.end() right after spawn (src/services/strategy.ts:96).

--resume / -c resumes a completed session — it starts a new process that loads the prior conversation as context, then appends a new user turn. It does not attach to a running process.

--session-id <uuid> sets the session ID for a new invocation; it doesn't connect to a live one.

--input-format stream-json allows streaming input, but only before the agent starts running — it's for piping structured input at invocation time, not for mid-turn injection.

Verdict: Claude Code CLI has no mechanism for injecting a message into a running -p session from an external process.

2. Current execute_prompt resume path in apra-fleet

Reading src/tools/execute-prompt.ts and src/providers/claude.ts:

Fleet stores sessionId on each agent after a successful prompt execution (touchAgent(agent.id, parsed.sessionId) at line 165).

On the next execute_prompt call with resume: true, it appends -c to the claude -p command (Claude provider's resumeFlag()). This resumes the prior conversation context in a new process.

If the session is stale (process exits non-zero), fleet retries without the session ID (lines 146-150).

The prompt is written to .fleet-task.md in the work folder, then cleaned up after execution.

Key insight: Resume today is strictly sequential — it only works after the previous session has exited. There is no "resume into a running session" path.

Could "resume with a NEW prompt" serve as the inject mechanism? Only if the current session has finished. You cannot resume a session that is still running — the CLI would either fail or start a parallel conversation with the same history. So resume is the right mechanism for follow-up messages, but not for mid-flight steering.

3. File polling viability (TASK_UPDATES.md approach)

How it would work:

PM writes steering instructions to a file (e.g., TASK_UPDATES.md) in the member's work folder.

The member's system prompt includes an instruction like: "Before each major step, check TASK_UPDATES.md for new directives. If present, incorporate them into your plan."

Claude Code's agentic loop would see the file on its next Read tool call (or the prompt could instruct periodic checking).

Risks and limitations:

Timing gap: If the member finishes before checking, the update is missed entirely. The PM would need to re-dispatch.

No guaranteed check frequency: Claude Code decides when to read files. You can instruct it to check, but compliance is probabilistic, not deterministic.

Race conditions: PM writes while member reads — needs atomic writes or a sequence counter.

Clutters the work folder: TASK_UPDATES.md competes with the codebase for the agent's attention.

Mitigation: Use a sentinel file (e.g., .fleet-attention) that the member's prompt says to check after every tool call. But this is still best-effort.

4. Recommendation

Approach B (resume with new prompt) is simpler and more reliable than file polling — but with a critical caveat.

The realistic interaction model is:

PM dispatches a prompt → member runs to completion → PM gets the result.

PM sends a follow-up → fleet uses -c to resume with new context → member runs again.

This is already how fleet works today. The "attention mechanism" reduces to: can the PM interrupt a running session to redirect it?

For true mid-flight injection, neither approach works without upstream Claude Code changes. But there's a pragmatic middle ground:

Recommended minimal API change:

Add a cancel_and_redirect option to execute_prompt that:

Kills the running session process (fleet already tracks PIDs for this)

Immediately resumes with -c and the new steering prompt

The new prompt includes context like "Your previous task was interrupted. The PM has a new directive: ..."

This gives the PM effective "attention" without needing mid-turn injection. The member loses in-progress work on the current turn, but retains full conversation history via session resume.

Effort: Small — the PID tracking and kill infrastructure already exists. The resume path already works. The main work is a new tool parameter and the orchestration logic.

File polling (Approach A) should be a Phase 2 enhancement for cases where the PM wants to augment a running session without interrupting it (e.g., "also check the tests when you're done"). It's lower priority because the timing guarantees are weak.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Originally posted by: kumaakh

Research: Inter-session attention — `stream-json` as a potential injection path

Building on the prior feasibility research, I investigated one under-explored avenue: Claude Code's --input-format stream-json mode.

Key finding: `--input-format stream-json` may support mid-session user-turn injection

From claude --help:

--input-format stream-json    Input format (only works with --print): "text" (default),
                               or "stream-json" (realtime streaming input)
--replay-user-messages        Re-emit user messages from stdin back on stdout for
                               acknowledgment (only works with --input-format=stream-json
                               and --output-format=stream-json)

The description says "realtime streaming input" — not "initial structured input." And --replay-user-messages explicitly references "user messages from stdin," implying multiple messages can be sent over the lifetime of the process.

This suggests that claude -p --input-format stream-json --output-format stream-json keeps stdin open and accepts additional user messages while the agent is running. If confirmed, this is exactly the injection mechanism fleet needs — no polling, no kill-and-resume, no upstream changes required.

How fleet currently blocks this path

Both execution strategies close stdin immediately:

LocalStrategy (src/services/strategy.ts:95): child.stdin?.end() right after spawn
SSH (src/services/ssh.ts:143-144): stream.end() immediately, with the comment "Close stdin so commands that read from it (e.g. claude -p) get EOF"

If we kept stdin open and switched to stream-json input/output format, we could potentially write additional user turns to the running process.

What an `update` flag implementation could look like

execute_prompt({ member: "doer", prompt: "...", update: true })

If no session is running: dispatch normally (write .fleet-task.md, invoke claude -p, etc.)
If session IS running and was started with --input-format stream-json:
Write a new user-turn JSON message to the running process's stdin
The agent receives it as a new instruction mid-loop
If session finished: resume with -c and the new prompt (existing path)

Architecture changes needed

Keep stdin open — don't call child.stdin?.end() or stream.end() at spawn time. Instead, store a reference to the writable stream.
Switch to stream-json — use --input-format stream-json --output-format stream-json instead of --output-format json.
Track running processes — maintain a Map<agentId, { stdin: Writable, sessionId: string }> so execute_prompt with update: true can write to the correct stdin.
Parse streaming output — the output format changes from a single JSON blob to a stream of events. Fleet's parseResponse would need a streaming variant.

Caveats and risks

Unverified experimentally: The stream-json input behavior needs testing. It's possible "realtime streaming input" only means streaming the initial prompt in chunks, not injecting new turns during autonomous execution. The --replay-user-messages flag is strong evidence for multi-turn support, but testing is essential.
SSH transport complication: For remote members, keeping the SSH channel's stdin open for the entire session duration changes the execution model from fire-and-forget to persistent-connection. This increases complexity around timeout handling, connection drops, and reconnection.
Local-only first: This approach is significantly simpler for local members (just keep a Node.js ChildProcess.stdin reference). Remote members via SSH require more careful engineering.

Updated recommendation

Approach	Effort	Reliability	Mid-flight?	Upstream needed?
File polling (TASK_UPDATES.md)	Low	Low	Yes (best-effort)	No
Cancel + resume (`kill → -c`)	Low-Med	High	No (restarts turn)	No
stream-json stdin injection	Medium	High (if confirmed)	Yes (true inject)	No
UDS channel → mid-turn inject	High	Highest	Yes	Yes

Recommended next step: Spike a manual test of --input-format stream-json to confirm multi-turn stdin injection works:

# Test: does writing a second user message to stdin get processed?
echo '{"type":"user","content":"What is 2+2?"}' | claude -p --input-format stream-json --output-format stream-json

If confirmed, implement stream-json injection for local members first (Phase 1), then extend to SSH (Phase 2). The cancel-and-resume approach remains the reliable fallback regardless.

feat: inter-session attention mechanism — PM↔member communication across sessions

Apra Fleet is an open-source MCP server

Milestone

Searches

Help

#75 feat: inter-session attention mechanism — PM↔member communication across sessions

Recommended pattern (works today, no infrastructure needed)

Future infrastructure: `execute_prompt` with `update` flag

Other future options

Related

Discussion

From my research:

PM Skill Integration Pattern

How this fits the PM skill

Suggested addition to doer context file template

Research: Inter-session attention mechanism — Feasibility findings

1. Claude Code CLI capability: Can you inject input into a running `claude -p` session?

2. Current `execute_prompt` resume path in apra-fleet

3. File polling viability (TASK_UPDATES.md approach)

4. Recommendation

Research: Inter-session attention — `stream-json` as a potential injection path

Key finding: `--input-format stream-json` may support mid-session user-turn injection

How fleet currently blocks this path

What an `update` flag implementation could look like

Architecture changes needed

Caveats and risks

Updated recommendation

feat: inter-session attention mechanism — PM↔member communication across sessions

Apra Fleet is an open-source MCP server

Milestone

Searches

Help

#75 feat: inter-session attention mechanism — PM↔member communication across sessions

Recommended pattern (works today, no infrastructure needed)

Future infrastructure: `execute_prompt` with `update` flag

Other future options

Related

Discussion

From my research:

PM Skill Integration Pattern

How this fits the PM skill

Suggested addition to doer context file template

Research: Inter-session attention mechanism — Feasibility findings

1. Claude Code CLI capability: Can you inject input into a running claude -p session?

2. Current execute_prompt resume path in apra-fleet

3. File polling viability (TASK_UPDATES.md approach)

4. Recommendation

Research: Inter-session attention — stream-json as a potential injection path

Key finding: --input-format stream-json may support mid-session user-turn injection

How fleet currently blocks this path

What an update flag implementation could look like

Architecture changes needed

Caveats and risks

Updated recommendation

1. Claude Code CLI capability: Can you inject input into a running `claude -p` session?

2. Current `execute_prompt` resume path in apra-fleet

Research: Inter-session attention — `stream-json` as a potential injection path

Key finding: `--input-format stream-json` may support mid-session user-turn injection

What an `update` flag implementation could look like