tool_use.caller field causes an error in Anthropic Computer Use Agent. #249

@asim-hl

Bug: tool_use.caller field causes "Extra inputs are not permitted" error on Step 2+

Description

When using AnthropicCUAClient with Claude models (e.g., claude-sonnet-4-20250514), the agent fails on Step 2 of execution with:

Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'messages.1.content.1.tool_use.caller: Extra inputs are not permitted'}}

Root Cause

In stagehand/agent/anthropic_cua.py, the _process_provider_response method serializes response content blocks using block.model_dump() (line ~299):

raw_assistant_content_blocks = [
    block.model_dump() for block in response.content
]

The Anthropic SDK's beta API response includes a caller field on tool_use blocks (see anthropic/types/beta/beta_tool_use_block_param.py). This field is valid in API responses but is not accepted when sending tool_use blocks back to the API in subsequent requests as part of the conversation history.

When raw_assistant_content_blocks is appended to current_messages (lines ~211-212) and sent in the next API call, the API rejects the request because the tool_use block still carries the caller field.
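
To confirm that this is what is happening before patching anything, the outgoing message list can be scanned for the offending key. The following is only a minimal diagnostic sketch: find_response_only_fields and RESPONSE_ONLY_KEYS are hypothetical names, not part of stagehand or the Anthropic SDK, and it assumes the history is a list of plain dicts shaped like the Messages API payload.

```python
from typing import Any

# Assumption: "caller" is the only response-only key seen so far.
# Extend this set if the API rejects other keys in the same way.
RESPONSE_ONLY_KEYS = {"caller"}


def find_response_only_fields(messages: list[dict[str, Any]]) -> list[str]:
    """Return dotted paths (in the style of the API error) for offending keys."""
    paths = []
    for m_idx, message in enumerate(messages):
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain string content has no blocks to inspect
        for b_idx, block in enumerate(content):
            if not isinstance(block, dict):
                continue  # un-dumped SDK objects are skipped in this sketch
            for key in sorted(RESPONSE_ONLY_KEYS.intersection(block)):
                paths.append(
                    f"messages.{m_idx}.content.{b_idx}.{block.get('type')}.{key}"
                )
    return paths
```

Running this on current_messages just before the second request should report the same path that appears in the 400 error if the diagnosis above is right.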

Steps to Reproduce

  1. Create an agent using AnthropicCUAClient with any Claude model
  2. Execute a multi-step task that requires more than one API call
  3. Observe the error on Step 2

Expected Behavior

The agent should successfully execute multi-step tasks without API validation errors.

Actual Behavior

The agent fails on Step 2 with invalid_request_error because the caller field is included in the conversation history.

Suggested Fix

Exclude the caller field when serializing response content blocks:

# In _process_provider_response method
raw_assistant_content_blocks = []
if hasattr(response, "content") and isinstance(response.content, list):
    try:
        for block in response.content:
            block_dict = block.model_dump()
            # Remove 'caller' field - it's valid in responses but not accepted in requests
            block_dict.pop("caller", None)
            raw_assistant_content_blocks.append(block_dict)
    except Exception as e:
        # Existing error handling, unchanged: log and fall back to the raw blocks
        self.logger.error(
            f"Could not model_dump response.content blocks: {e}",
            category=StagehandFunctionName.AGENT,
        )
        raw_assistant_content_blocks = response.content

Environment

  • stagehand version: 0.5.7
  • anthropic SDK version: 0.75.0
  • Python version: 3.13
  • Model: claude-sonnet-4-20250514 (also affects other Claude models)

Workaround

Until this is fixed, users can monkey-patch AnthropicCUAClient._process_provider_response to strip the caller field. Apply the patch once at startup, before any agent runs:

from stagehand.agent.anthropic_cua import AnthropicCUAClient
from stagehand.handlers.cua_handler import StagehandFunctionName

def _patched_process_provider_response(self, response):
    self.last_tool_use_ids = []
    model_message_parts = []
    agent_action = None

    raw_assistant_content_blocks = []
    if hasattr(response, "content") and isinstance(response.content, list):
        try:
            for block in response.content:
                block_dict = block.model_dump()
                block_dict.pop("caller", None)  # Remove problematic field
                raw_assistant_content_blocks.append(block_dict)
        except Exception as e:
            self.logger.error(
                f"Could not model_dump response.content blocks: {e}",
                category=StagehandFunctionName.AGENT,
            )
            raw_assistant_content_blocks = response.content

        tool_use_block = None
        for block in response.content:
            if block.type == "tool_use":
                tool_use_block = block
                self.last_tool_use_ids.append(block.id)
            elif block.type == "text":
                model_message_parts.append(block.text)

        if tool_use_block:
            tool_name = tool_use_block.name
            tool_input = tool_use_block.input if hasattr(tool_use_block, "input") else {}
            agent_action = self._convert_tool_use_to_agent_action(tool_name, tool_input)
            if agent_action:
                agent_action.step = raw_assistant_content_blocks

    model_message_text = " ".join(model_message_parts).strip() or None
    task_completed = not bool(agent_action)
    return (agent_action, model_message_text, task_completed, raw_assistant_content_blocks)

AnthropicCUAClient._process_provider_response = _patched_process_provider_response
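
A smaller variant of the same workaround wraps the original method instead of re-implementing it, so the rest of the method keeps tracking upstream changes. This is a sketch under two assumptions taken from the code above: the method returns a 4-tuple whose last element is the list of dumped blocks, and those blocks are plain dicts. strip_caller_from_result is a hypothetical helper name.

```python
import functools


def strip_caller_from_result(original):
    """Wrap a method returning (action, text, done, blocks); drop 'caller' from blocks."""

    @functools.wraps(original)
    def wrapper(self, response):
        result = original(self, response)
        blocks = result[3]
        if isinstance(blocks, list):
            for block in blocks:
                # Mutate in place so anything else holding these same dicts
                # (for example agent_action.step) sees the cleaned version too.
                if isinstance(block, dict):
                    block.pop("caller", None)
        return result

    return wrapper


# Hypothetical usage, assuming stagehand is installed:
# from stagehand.agent.anthropic_cua import AnthropicCUAClient
# AnthropicCUAClient._process_provider_response = strip_caller_from_result(
#     AnthropicCUAClient._process_provider_response
# )
```

If model_dump() fails and the method falls back to the raw SDK objects, this wrapper leaves them untouched, so the caller field would still be sent in that case.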
