@mfenderov
Contributor

Prompt caching is automatically enabled for models that support it (detected via models.dev) to reduce latency and costs. System prompts, tool definitions, and recent messages are cached with a 5-minute TTL.

To disable:

provider_opts:
  disable_prompt_caching: true
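
For readers unfamiliar with how cache checkpoints look on the wire, here is a minimal Python sketch of the behaviour described above, written against the request shape of the Bedrock Converse API (`cachePoint` blocks in `system`, `toolConfig.tools`, and message `content`). It is an illustration only, not the code in this PR: the function name `build_converse_request` and its parameters are hypothetical, it assumes the caller has already established that the model supports caching, and it only builds the request dictionary without calling any service. The 5-minute TTL is applied by the service, so it does not appear in the request.

```python
from typing import Any


def cache_point() -> dict[str, Any]:
    """Return a fresh Converse-style cache checkpoint block."""
    return {"cachePoint": {"type": "default"}}


def build_converse_request(
    model_id: str,
    system_prompt: str,
    tools: list[dict[str, Any]],
    messages: list[dict[str, Any]],
    disable_prompt_caching: bool = False,  # mirrors the provider_opts flag
) -> dict[str, Any]:
    """Build Converse request kwargs, adding cache checkpoints unless disabled.

    Hypothetical helper: assumes the model is already known to support caching.
    """
    system: list[dict[str, Any]] = [{"text": system_prompt}]
    tool_list = list(tools)
    # Copy each content list so the caller's messages are not mutated.
    msgs = [{"role": m["role"], "content": list(m["content"])} for m in messages]

    if not disable_prompt_caching:
        # Checkpoint after the system prompt.
        system.append(cache_point())
        # Checkpoint after the tool definitions, if there are any.
        if tool_list:
            tool_list.append(cache_point())
        # Checkpoint at the end of the most recent message.
        if msgs:
            msgs[-1]["content"].append(cache_point())

    request: dict[str, Any] = {
        "modelId": model_id,
        "system": system,
        "messages": msgs,
    }
    if tool_list:
        request["toolConfig"] = {"tools": tool_list}
    return request
```

With `disable_prompt_caching=True` the request is returned without any `cachePoint` blocks, which corresponds to the opt-out shown in the config above.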

P.S. Benchmarked with examples/pr-reviewer-bedrock.yaml: 92% cache read vs 8% cache write.
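
One way to derive figures like these from a response is sketched below. It assumes the percentages are shares of cache-related input tokens (the note above does not say exactly what was measured) and reads the `cacheReadInputTokens` and `cacheWriteInputTokens` fields from a Bedrock Converse `usage` object. The function name `cache_token_shares` is hypothetical, and the token counts in the usage line are made up purely for illustration.

```python
def cache_token_shares(usage: dict[str, int]) -> tuple[float, float]:
    """Return (read_share, write_share) of cache-related input tokens.

    `usage` is assumed to look like the `usage` object of a Converse response.
    Missing fields are treated as zero.
    """
    read = usage.get("cacheReadInputTokens", 0)
    written = usage.get("cacheWriteInputTokens", 0)
    total = read + written
    if total == 0:
        # No cache activity at all: report zero for both shares.
        return 0.0, 0.0
    return read / total, written / total


# Illustrative numbers only, not taken from the benchmark.
read_share, write_share = cache_token_shares(
    {"cacheReadInputTokens": 9200, "cacheWriteInputTokens": 800}
)
print(f"cache read: {read_share:.0%}, cache write: {write_share:.0%}")
```

A high read share means most cached tokens were served from an existing cache entry rather than written fresh, which is where the latency and cost savings come from.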

Assisted-By: cagent

@mfenderov mfenderov requested a review from a team as a code owner January 13, 2026 11:48
@krissetto
Contributor

This generally looks good to me, thanks for the contribution! ❤️

@dgageot, since you've been deep-diving on Anthropic caching lately, you might want to take a look here and compare Bedrock against the tests you've been running with Anthropic's API.
