Bug Description
After restoring a session that involved transfer_task (sub-agent delegation), all subsequent prompts fail with:
all models failed: error receiving from stream: POST "https://ai-backend-service-stage.docker.com/proxy/v1/messages?beta=true": 400 Bad Request
{"type":"error","error":{"type":"invalid_request_error","message":"messages.26.content.0: unexpected `tool_use_id` found in `tool_result` blocks: tooluse_PDwaDa9qmwIyg9WUMFXlez. Each `tool_result` block must have a corresponding `tool_use` block in the previous message."}}
This is a recurrence of #1644. The root cause identified in #1738 (PersistentRuntime writing sub-agent streaming messages to the parent session) was never fixed.
Root Cause Analysis
There are two layers of bugs. The first corrupts the session state, and the second fails to protect against the corruption.
Layer 1: PersistentRuntime persists sub-agent streaming messages to parent session (#1738 — still unfixed)
In pkg/runtime/persistent_runtime.go, the handleEvent method skips persistence for sub-sessions:
    if sess.IsSubSession() {
        return
    }
However, sess is always the parent session. When handleTaskTransfer runs a sub-agent via r.RunStream(ctx, child), the child's streaming events (AgentChoiceEvent, AgentChoiceReasoningEvent) are forwarded to the parent's event channel. Since the parent is not a sub-session, the guard does not trigger, and persistStreamingContent writes the sub-agent's assistant messages directly into the parent session's session_items.
After a task transfer, the parent session's database state looks like:
position | agent_name | role | notes
---------|------------------|-----------|----------------------------------
N | root | assistant | has transfer_task tool_use (ID: X)
N+1 | sub-agent | assistant | ← ORPHAN: written by streaming persistence
N+2 | sub-agent | assistant | ← ORPHAN: written by streaming persistence
...
N+k | (subsession) | | sub-session reference
N+k+1 | root | tool | tool_result for transfer_task (ID: X)
On session restore, GetMessages() includes all message items (skipping sub-session items). The parent session now contains sub-agent assistant messages with their own tool_use blocks that have no corresponding tool_result messages in the parent context.
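One way to close this gap is to filter streaming events by the session that produced them. The sketch below is illustrative only: streamEvent, its SessionID field, shouldPersist, and filterForSession are hypothetical names that do not exist in the codebase. They stand in for AgentChoiceEvent / AgentChoiceReasoningEvent with an added session identifier.

```go
package main

// streamEvent is a hypothetical stand-in for AgentChoiceEvent and
// AgentChoiceReasoningEvent. The SessionID field is an assumption:
// the real events would need it added.
type streamEvent struct {
	SessionID string // assumed new field: session that produced the event
	AgentName string
	Content   string
}

// shouldPersist reports whether an event was produced by the session that
// is currently being persisted. Events forwarded from a child session
// carry the child's ID and are rejected.
func shouldPersist(ev streamEvent, persistingSessionID string) bool {
	return ev.SessionID == persistingSessionID
}

// filterForSession keeps only the events that shouldPersist accepts.
func filterForSession(events []streamEvent, sessionID string) []streamEvent {
	var kept []streamEvent
	for _, ev := range events {
		if shouldPersist(ev, sessionID) {
			kept = append(kept, ev)
		}
	}
	return kept
}
```

With a check like this in the persistence path, sub-agent events forwarded to the parent's channel would still reach the UI but would not be written into the parent's session_items.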
Layer 2: Beta converter lacks orphan tool_result protection
The non-beta convertMessages (client.go) uses a pendingAssistantToolUse flag to include tool_result user messages only when they immediately follow an assistant message with tool_use blocks. Orphan tool results are silently dropped.
The beta convertBetaMessages (beta_converter.go) has no such guard. Every tool role message is unconditionally converted and sent to the API. When the corrupted session history contains orphan sub-agent tool_use/tool_result pairs, they pass straight through to Anthropic.
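A guard of this kind might look roughly like the following. This is a simplified sketch, not the actual converter: the message type and dropOrphanToolResults are hypothetical, and it tracks pending tool_use IDs in a set rather than a single boolean, which is slightly stricter than the pendingAssistantToolUse flag described above.

```go
package main

// message is a hypothetical, simplified view of a chat message.
type message struct {
	Role       string   // "user", "assistant", or "tool"
	ToolUseIDs []string // assistant messages: IDs of the tool calls made
	ToolUseID  string   // tool messages: ID of the call being answered
}

// dropOrphanToolResults keeps a tool message only if it answers a tool_use
// from the most recent assistant message and no non-tool message has
// appeared in between. Everything else passes through unchanged.
func dropOrphanToolResults(msgs []message) []message {
	out := make([]message, 0, len(msgs))
	pending := map[string]bool{} // tool_use IDs still awaiting a result
	for _, m := range msgs {
		switch m.Role {
		case "assistant":
			pending = map[string]bool{}
			for _, id := range m.ToolUseIDs {
				pending[id] = true
			}
			out = append(out, m)
		case "tool":
			if pending[m.ToolUseID] {
				delete(pending, m.ToolUseID) // each call is answered once
				out = append(out, m)
			}
			// Otherwise the result is an orphan and is dropped.
		default:
			pending = map[string]bool{} // any other message ends the window
			out = append(out, m)
		}
	}
	return out
}
```

Note that this only removes orphan tool results. An assistant tool_use left without a result still has to be handled by the sequencing repair step.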
Layer 2b: Forward-only sequencing validation
validateSequencing (client.go) only checks the forward direction: each assistant tool_use must have a matching tool_result in the next user message. It does not check the reverse, that each tool_result references a tool_use in the immediately preceding assistant message, so orphan tool_result blocks pass validation.
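The missing reverse check could be sketched as below. The apiMessage type and validateToolResults are hypothetical simplifications; the real validateSequencing works on the Anthropic SDK's message types, which are not reproduced here.

```go
package main

import "fmt"

// apiMessage is a hypothetical, flattened view of a converted message.
type apiMessage struct {
	Role          string   // "user" or "assistant"
	ToolUseIDs    []string // tool_use block IDs (assistant messages)
	ToolResultIDs []string // tool_use_id of each tool_result (user messages)
}

// validateToolResults performs the reverse check: every tool_result in a
// user message must reference a tool_use in the immediately preceding
// assistant message.
func validateToolResults(msgs []apiMessage) error {
	for i, m := range msgs {
		if m.Role != "user" || len(m.ToolResultIDs) == 0 {
			continue
		}
		allowed := map[string]bool{}
		if i > 0 && msgs[i-1].Role == "assistant" {
			for _, id := range msgs[i-1].ToolUseIDs {
				allowed[id] = true
			}
		}
		for _, id := range m.ToolResultIDs {
			if !allowed[id] {
				return fmt.Errorf("messages[%d]: tool_result %q has no tool_use in the preceding assistant message", i, id)
			}
		}
	}
	return nil
}
```

Running this alongside the existing forward check would reject the orphan case locally instead of surfacing it as a 400 from the API.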
Layer 2c: repairSequencing can worsen partial tool_result mismatches
When an assistant message has tool_use IDs {A, B} and the next user message only has tool_result for A, repairSequencing inserts a synthetic user message with tool_result for B between the assistant and the existing user message:
    assistant(tool_use: A, B)
    synthetic_user(tool_result: B)   ← inserted by repair
    user(tool_result: A)             ← A's "previous message" is now synthetic_user, not assistant → ERROR
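A merge-based repair avoids this ordering problem. The following is a minimal sketch under simplifying assumptions: the turn type and repairPartialResults are hypothetical, and synthetic results are represented only by their IDs rather than full tool_result blocks.

```go
package main

// turn is a hypothetical, flattened view of a converted message.
type turn struct {
	Role          string   // "user" or "assistant"
	ToolUseIDs    []string // tool_use block IDs (assistant messages)
	ToolResultIDs []string // tool_use_id of each tool_result (user messages)
}

// repairPartialResults makes sure every assistant tool_use has a result in
// the next message. If the next message is a user message that already
// holds some tool results, the missing IDs are merged into it. Otherwise a
// synthetic user message is inserted right after the assistant message.
func repairPartialResults(msgs []turn) []turn {
	out := make([]turn, 0, len(msgs)+1)
	for i := 0; i < len(msgs); i++ {
		m := msgs[i]
		out = append(out, m)
		if m.Role != "assistant" || len(m.ToolUseIDs) == 0 {
			continue
		}
		var next *turn
		if i+1 < len(msgs) && msgs[i+1].Role == "user" && len(msgs[i+1].ToolResultIDs) > 0 {
			next = &msgs[i+1]
		}
		have := map[string]bool{}
		if next != nil {
			for _, id := range next.ToolResultIDs {
				have[id] = true
			}
		}
		var missing []string
		for _, id := range m.ToolUseIDs {
			if !have[id] {
				missing = append(missing, id)
			}
		}
		if len(missing) == 0 {
			continue
		}
		if next != nil {
			// Merge into a copy so the caller's slice is left untouched.
			merged := *next
			merged.ToolResultIDs = append(append([]string{}, next.ToolResultIDs...), missing...)
			out = append(out, merged)
			i++ // the original next message has been replaced by merged
		} else {
			out = append(out, turn{Role: "user", ToolResultIDs: missing})
		}
	}
	return out
}
```

For the {A, B} example above, this yields assistant(tool_use: A, B) followed by a single user(tool_result: A, B), so both results keep the assistant message as their immediate predecessor.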
Steps to Reproduce
- Configure a multi-agent setup with transfer_task / sub_agents
- Start a session and trigger a task transfer to a sub-agent
- Let the sub-agent execute tool calls and complete
- End the session
- Restore the session with --session <id>
- Send any new prompt
Expected Behavior
Restored sessions should work correctly. Sub-agent messages should only exist in the sub-session, not in the parent session.
Affected Code
File | Issue
-----|------
pkg/runtime/persistent_runtime.go handleEvent() | Sub-agent streaming events persisted to parent session (Layer 1)
pkg/model/provider/anthropic/beta_converter.go convertBetaMessages() | Missing pendingAssistantToolUse guard (Layer 2)
pkg/model/provider/anthropic/client.go validateSequencing() | Forward-only validation, no reverse orphan check (Layer 2b)
pkg/model/provider/anthropic/client.go repairSequencing() | Inserts synthetic message that orphans existing tool_results (Layer 2c)
Suggested Fixes
- persistent_runtime.go: Filter AgentChoiceEvent / AgentChoiceReasoningEvent by comparing e.AgentName against the parent session's current agent, or add a SessionID field to streaming events and filter by session ID.
- beta_converter.go: Add a pendingAssistantToolUse guard matching the non-beta convertMessages behavior.
- client.go validateSequencing: Add reverse validation: check that every tool_result references a tool_use in the immediately preceding assistant message.
- client.go repairSequencing: When partial tool_results exist, merge synthetic tool_results into the existing next user message instead of inserting a separate synthetic message before it.
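Related Issues
- #1644 (tool_use_id found in tool_result blocks): same error, closed without identifying the PersistentRuntime root cause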
Environment
- OS: macOS (arm64)
- Version: built from main (HEAD at v1.30.1)
- Session ID: 17cebf40-7cdb-4463-936d-916feedc2d4e
- Agents definition: multi-agent setup with sub_agents and transfer_task