feat: move sampling options (temperature/topP/maxTokens) into modelOptions by AlemTuzlak · Pull Request #660 · TanStack/ai

AlemTuzlak · 2026-05-30T14:24:23Z

Summary

Moves sampling options — temperature, topP, maxTokens — off the root of chat() / ai() / generate() and into provider-native, fully-typed modelOptions. modelOptions is now the single sampling surface; there is no generic root-level mapping anymore. Each provider exposes its real SDK/API key names.

This continues and completes #499 by @harry-whorlow (which was built on the old packages/typescript/ layout and could not be rebased onto the restructured packages/ tree). Supersedes #499.

BREAKING for a 0.x SDK. A provider-aware codemod migrates existing code automatically.

Provider-native `modelOptions` keys

Provider	temperature	topP	maxTokens
OpenAI (Responses)	`temperature`	`top_p`	`max_output_tokens`
Anthropic	`temperature`	`top_p`	`max_tokens`
Gemini	`temperature`	`topP`	`maxOutputTokens`
Grok	`temperature`	`top_p`	`max_tokens`
Groq	`temperature`	`top_p`	`max_completion_tokens`
Ollama	`options.temperature`	`options.top_p`	`options.num_predict` (nested)
OpenRouter (chat)	`temperature`	`topP`	`maxCompletionTokens`
OpenRouter (Responses)	`temperature`	`topP`	`maxOutputTokens`
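
As a rough illustration of the table above, the sketch below maps a generic `{ temperature, topP, maxTokens }` triple onto each provider's native keys. `toModelOptions`, `ProviderName`, `GenericSampling`, and `NATIVE_KEYS` are hypothetical names made up for this example and are not part of @tanstack/ai; only the key spellings come from the table.

```typescript
// Hypothetical helper for illustration; not part of @tanstack/ai.
// Only the key spellings are taken from the table above.
type ProviderName =
  | 'openai-responses'
  | 'anthropic'
  | 'gemini'
  | 'grok'
  | 'groq'
  | 'ollama'
  | 'openrouter-chat'
  | 'openrouter-responses'

interface GenericSampling {
  temperature?: number
  topP?: number
  maxTokens?: number
}

interface NativeKeys {
  temperature: string
  topP: string
  maxTokens: string
  // Set when the provider expects sampling under a nested object (Ollama).
  nestUnder?: string
}

const NATIVE_KEYS: Record<ProviderName, NativeKeys> = {
  'openai-responses': { temperature: 'temperature', topP: 'top_p', maxTokens: 'max_output_tokens' },
  anthropic: { temperature: 'temperature', topP: 'top_p', maxTokens: 'max_tokens' },
  gemini: { temperature: 'temperature', topP: 'topP', maxTokens: 'maxOutputTokens' },
  grok: { temperature: 'temperature', topP: 'top_p', maxTokens: 'max_tokens' },
  groq: { temperature: 'temperature', topP: 'top_p', maxTokens: 'max_completion_tokens' },
  ollama: { temperature: 'temperature', topP: 'top_p', maxTokens: 'num_predict', nestUnder: 'options' },
  'openrouter-chat': { temperature: 'temperature', topP: 'topP', maxTokens: 'maxCompletionTokens' },
  'openrouter-responses': { temperature: 'temperature', topP: 'topP', maxTokens: 'maxOutputTokens' },
}

function toModelOptions(
  provider: ProviderName,
  sampling: GenericSampling,
): Record<string, unknown> {
  const keys = NATIVE_KEYS[provider]
  const flat: Record<string, unknown> = {}
  if (sampling.temperature !== undefined) flat[keys.temperature] = sampling.temperature
  if (sampling.topP !== undefined) flat[keys.topP] = sampling.topP
  if (sampling.maxTokens !== undefined) flat[keys.maxTokens] = sampling.maxTokens
  if (Object.keys(flat).length === 0) return {}
  // Ollama is the only row in the table that nests its sampling keys.
  return keys.nestUnder === undefined ? flat : { [keys.nestUnder]: flat }
}
```

With this sketch, `toModelOptions('groq', { maxTokens: 256 })` yields `{ max_completion_tokens: 256 }`, while the same input for `'ollama'` yields `{ options: { num_predict: 256 } }`.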

What changed

Core (@tanstack/ai): removed temperature/topP/maxTokens from TextOptions, the public activity options, and ChatMiddlewareConfig (+ structured-output config). Removed all engine plumbing. Middleware now adjusts sampling via config.modelOptions. metadata is unchanged and stays at the root.
All 7 text providers read sampling from typed modelOptions (native keys, verified against each provider's API). The provider-options types gained the sampling fields where missing (OpenAI, Anthropic, Gemini, Ollama; Grok/Groq/OpenRouter already had them).
Casts removed on the sampling path (Anthropic, Ollama, OpenRouter). No new casts introduced.
Bugs fixed along the way: a pre-existing Ollama double-nesting mismatch (typed nested options vs flat runtime spread); an OpenRouter variant field leaking into the request body; OTel gen_ai.request.* attributes now read a union of provider key spellings; summarize({ maxLength }) now resolves the correct per-provider token key from the wrapped adapter (was about to become a silent no-op).
Codemod move-sampling-to-model-options (jscodeshift): resolves the provider from the adapter factory and rewrites keys per provider (incl. Ollama nesting); reports + skips on conflicts or unresolvable adapters. 12 fixture cases.
Docs: new migration guide (docs/migration/sampling-options-to-model-options), updated adapter pages, middleware dynamic-temperature example, API reference, and typed-options.
Agent skills: adapter-configuration, middleware, chat-experience SKILL.md + reference files updated to the new surface.
Changeset: minor across @tanstack/ai + all affected adapter packages.
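
One of the fixes listed above has the OTel `gen_ai.request.*` attributes read a union of provider key spellings. The sketch below shows one way such a lookup could work. `samplingAttributes`, `pickNumber`, and `isBag` are hypothetical names, and the candidate key lists are assembled from the provider table in this description; the actual middleware in `packages/ai/src/middlewares/otel.ts` may be structured differently.

```typescript
// Illustrative sketch only; not the actual middleware code.
type Bag = Record<string, unknown>

function isBag(value: unknown): value is Bag {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

// Returns the first candidate that is a finite number, if any.
function pickNumber(...candidates: unknown[]): number | undefined {
  return candidates.find(
    (value): value is number => typeof value === 'number' && Number.isFinite(value),
  )
}

function samplingAttributes(modelOptions: unknown): Record<string, number> {
  const root: Bag = isBag(modelOptions) ? modelOptions : {}
  // Ollama keeps its sampling keys one level down, under `options`.
  const rootOptions = root['options']
  const nested: Bag = isBag(rootOptions) ? rootOptions : {}

  const entries: Array<[string, number | undefined]> = [
    ['gen_ai.request.temperature', pickNumber(root['temperature'], nested['temperature'])],
    ['gen_ai.request.top_p', pickNumber(root['top_p'], root['topP'], nested['top_p'])],
    [
      'gen_ai.request.max_tokens',
      pickNumber(
        root['max_output_tokens'],
        root['max_tokens'],
        root['max_completion_tokens'],
        root['maxOutputTokens'],
        root['maxCompletionTokens'],
        nested['num_predict'],
      ),
    ],
  ]

  const attributes: Record<string, number> = {}
  for (const [name, value] of entries) {
    if (value !== undefined) attributes[name] = value
  }
  return attributes
}
```

Attributes are only emitted for keys that are actually present, so a call without sampling options produces an empty attribute set.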

Migration

pnpm codemod:move-sampling-to-model-options "src/**/*.{ts,tsx}"
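
To make the rewrite concrete, here is an object-level analogy of what the codemod is described as doing: move the root sampling keys into `modelOptions` under the provider's names, and report and skip when a target key is already set. The real transform operates on source code through jscodeshift; `moveSampling`, `KeyMap`, `MoveResult`, and `ANTHROPIC_KEYS` are hypothetical names used only for this sketch.

```typescript
// Object-level analogy of the rewrite; the actual codemod works on the AST.
type Options = Record<string, unknown>

interface KeyMap {
  temperature: string
  topP: string
  maxTokens: string
}

type MoveResult =
  | { status: 'moved'; options: Options }
  | { status: 'unchanged'; options: Options }
  | { status: 'skipped'; options: Options; reason: string }

const ROOT_KEYS = ['temperature', 'topP', 'maxTokens'] as const

// Key names for Anthropic, as listed in the PR description.
const ANTHROPIC_KEYS: KeyMap = {
  temperature: 'temperature',
  topP: 'top_p',
  maxTokens: 'max_tokens',
}

function isOptions(value: unknown): value is Options {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

function moveSampling(options: Options, keyMap: KeyMap): MoveResult {
  const present = ROOT_KEYS.filter((key) => key in options)
  if (present.length === 0) return { status: 'unchanged', options }

  const existing = options['modelOptions']
  const modelOptions: Options = isOptions(existing) ? { ...existing } : {}

  // Report and skip instead of overwriting a value the caller already set.
  for (const key of present) {
    const target = keyMap[key]
    if (target in modelOptions) {
      return { status: 'skipped', options, reason: `modelOptions.${target} is already set` }
    }
  }

  const movedNames = new Set<string>(present)
  const rest: Options = {}
  for (const [name, value] of Object.entries(options)) {
    if (!movedNames.has(name) && name !== 'modelOptions') rest[name] = value
  }
  for (const key of present) modelOptions[keyMap[key]] = options[key]

  return { status: 'moved', options: { ...rest, modelOptions } }
}
```

For example, `moveSampling({ model: 'm', maxTokens: 100 }, ANTHROPIC_KEYS)` returns a `'moved'` result whose options are `{ model: 'm', modelOptions: { max_tokens: 100 } }`, while an input that already has `modelOptions.max_tokens` comes back as `'skipped'` with the original options untouched.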

Testing

Per-adapter unit tests assert sampling reaches the wire under the correct native key (incl. Ollama no-double-nesting, Anthropic max_tokens default + no spurious dropped-key warning).
Codemod fixture tests (12 cases). OTel + summarize regression tests.
E2E suite verified green (already routes sampling via modelOptions).
Full pnpm test:pr green: test:sherif, test:knip, test:docs, test:eslint, test:lib, test:types, test:build, build across 33 projects.

Known follow-up (not blocking)

The ts-code-mode-web / ts-react-chat examples still pass a generic maxTokens on calls with dynamically-resolved adapters — the codemod can't infer a provider there, and they're excluded from CI (examples/**). They need per-example provider mapping in a follow-up.

🤖 Generated with Claude Code

Summary by CodeRabbit

Breaking Changes
- Root-level sampling options (temperature, topP, maxTokens) no longer work on chat()/ai()/generate(); supply them via provider-native modelOptions (provider-specific key names; Ollama nests under modelOptions.options). metadata remains at the root.
New Features
- Added an automated codemod to migrate sampling options into modelOptions.
Documentation
- Updated adapters, middleware, migration guide, examples, and docs to reflect modelOptions usage and migration guidance.

coderabbitai · 2026-05-30T14:24:36Z

📝 Walkthrough

This PR removes root-level sampling options (temperature, topP, maxTokens) and consolidates sampling and token-limit configuration into provider-native modelOptions across adapters, middleware, docs, examples, and tests. It supplies a jscodeshift codemod, migration guide, and extensive test coverage.

Changes

Sampling Options Migration to modelOptions

Layer / File(s)	Summary
Public API contract removal `packages/ai/src/types.ts`, `packages/ai/src/activities/chat/middleware/types.ts`	`TextOptions` and `ChatMiddlewareConfig` drop `temperature`, `topP`, `maxTokens` from the public contract; `modelOptions`/`metadata` remain.
Core engine refactoring `packages/ai/src/activities/chat/index.ts`	Engine paths (beforeRun, streamModelResponse, runStructuredFinalization, middleware wiring) stop flattening/forwarding root sampling fields and use `modelOptions`/`metadata`.
Codemod and fixtures `codemods/move-sampling-to-model-options/*`, `codemods/README.md`, `codemods/package.json`	Add jscodeshift transform with provider detection, key renames, Ollama nesting, conflict detection, tests/fixtures demonstrating merges, shorthand, conflicts, and unresolvable adapters.
Documentation & migration guide `docs/migration/sampling-options-to-model-options.md`, `docs/adapters/*`, `docs/advanced/middleware.md`, `docs/config.json`	New migration doc, per-adapter docs updated to show `modelOptions` usage, middleware guidance updated, and site navigation entry added.
OpenAI base adapters & types `packages/ai-openai/src/text/text-provider-options.ts`, `packages/openai-base/src/adapters/*`	Extracted `OpenAISamplingOptions`; Responses/ChatCompletions adapters now source sampling from `modelOptions` using provider-native keys.
Anthropic adapter & tests `packages/ai-anthropic/src/adapters/text.ts`, `packages/ai-anthropic/tests/*`	`mapCommonOptionsToAnthropic` reads sampling/token limits from `modelOptions` (`max_tokens`), with tests for forwarding and defaults.
Gemini, Grok, Groq, OpenRouter adapters & tests `packages/ai-gemini/`, `packages/ai-grok/`, `packages/ai-groq/`, `packages/ai-openrouter/`	Adapters read sampling from `modelOptions` (provider-native names), OpenRouter extracts `variant` from `modelOptions`, tests updated to assert forwarding and serialization.
Ollama nested options `packages/ai-ollama/src/`, `packages/ai-ollama/tests/`	Ollama uses nested `modelOptions.options` for sampling; removed `OllamaTextProviderOptions` export and updated method signatures and meta typing.
Middleware, telemetry, summarize `packages/ai/src/middlewares/otel.ts`, `packages/ai/src/activities/summarize/chat-stream-summarize.ts`	OTel middleware normalizes sampling attributes from `modelOptions` (including nested Ollama keys); summarize adapter maps `maxLength` into provider-native token keys and injects defaults into `modelOptions` when appropriate.
Examples & helper `examples/ts-code-mode-web/src/lib/max-tokens-model-options.ts`, updated examples and routes	Add `maxTokensModelOptions` helper and update example tooling/routes to map generic `maxTokens` into provider-native `modelOptions`.
Tests `packages/ai/tests/*`, provider tests	Update chat, middleware, OTel tests and add summarize tests to validate `modelOptions`-based sampling and provider-specific mappings.
Scripts `package.json`, `codemods/package.json`	Add root codemod script and codemod package scripts; add `prettier` devDependency for codemod tests.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

jherr
tannerlinsley
tombeckenham

🐰 I hopped through code with nimble feet,
Shifted sampling keys to modelOptions neat,
Ollama nests, providers speak true names,
A codemod hops in to rewrite the frames,
🥕 Happy migrating — the rabbit proclaims!

github-actions · 2026-05-30T14:25:11Z

🚀 Changeset Version Preview

9 package(s) bumped directly, 21 bumped as dependents.

🟥 Major bumps

Package	Version	Reason
`@tanstack/ai-anthropic`	0.13.1 → 1.0.0	Changeset
`@tanstack/ai-gemini`	0.14.1 → 1.0.0	Changeset
`@tanstack/ai-grok`	0.10.1 → 1.0.0	Changeset
`@tanstack/ai-groq`	0.3.1 → 1.0.0	Changeset
`@tanstack/ai-ollama`	0.7.1 → 1.0.0	Changeset
`@tanstack/ai-openai`	0.12.1 → 1.0.0	Changeset
`@tanstack/ai-openrouter`	0.12.1 → 1.0.0	Changeset
`@tanstack/openai-base`	0.6.1 → 1.0.0	Changeset
`@tanstack/ai-code-mode`	0.2.3 → 1.0.0	Dependent
`@tanstack/ai-code-mode-skills`	0.2.3 → 1.0.0	Dependent
`@tanstack/ai-elevenlabs`	0.2.18 → 1.0.0	Dependent
`@tanstack/ai-event-client`	0.5.2 → 1.0.0	Dependent
`@tanstack/ai-fal`	0.7.21 → 1.0.0	Dependent
`@tanstack/ai-isolate-node`	0.1.28 → 1.0.0	Dependent
`@tanstack/ai-isolate-quickjs`	0.1.28 → 1.0.0	Dependent
`@tanstack/ai-preact`	0.9.1 → 1.0.0	Dependent
`@tanstack/ai-react`	0.15.1 → 1.0.0	Dependent
`@tanstack/ai-react-ui`	0.8.6 → 1.0.0	Dependent
`@tanstack/ai-solid`	0.13.1 → 1.0.0	Dependent
`@tanstack/ai-solid-ui`	0.7.6 → 1.0.0	Dependent
`@tanstack/ai-svelte`	0.13.1 → 1.0.0	Dependent
`@tanstack/ai-vue`	0.13.1 → 1.0.0	Dependent

🟨 Minor bumps

Package	Version	Reason
`@tanstack/ai`	0.26.1 → 0.27.0	Changeset

🟩 Patch bumps

Package	Version	Reason
`@tanstack/ai-client`	0.16.1 → 0.16.2	Dependent
`@tanstack/ai-devtools-core`	0.4.6 → 0.4.7	Dependent
`@tanstack/ai-isolate-cloudflare`	0.2.19 → 0.2.20	Dependent
`@tanstack/ai-vue-ui`	0.2.13 → 0.2.14	Dependent
`@tanstack/preact-ai-devtools`	0.1.49 → 0.1.50	Dependent
`@tanstack/react-ai-devtools`	0.2.49 → 0.2.50	Dependent
`@tanstack/solid-ai-devtools`	0.2.49 → 0.2.50	Dependent

nx-cloud · 2026-05-30T14:26:05Z

View your CI Pipeline Execution ↗ for commit 4b15de3

Command	Status	Duration	Result
`nx run-many --targets=build --exclude=examples/...`	✅ Succeeded	1m 8s	View ↗

☁️ Nx Cloud last updated this comment at 2026-06-03 07:10:20 UTC

pkg-pr-new · 2026-05-30T14:28:03Z

@tanstack/ai

npm i https://pkg.pr.new/@tanstack/ai@660

@tanstack/ai-anthropic

npm i https://pkg.pr.new/@tanstack/ai-anthropic@660

@tanstack/ai-client

npm i https://pkg.pr.new/@tanstack/ai-client@660

@tanstack/ai-code-mode

npm i https://pkg.pr.new/@tanstack/ai-code-mode@660

@tanstack/ai-code-mode-skills

npm i https://pkg.pr.new/@tanstack/ai-code-mode-skills@660

@tanstack/ai-devtools-core

npm i https://pkg.pr.new/@tanstack/ai-devtools-core@660

@tanstack/ai-elevenlabs

npm i https://pkg.pr.new/@tanstack/ai-elevenlabs@660

@tanstack/ai-event-client

npm i https://pkg.pr.new/@tanstack/ai-event-client@660

@tanstack/ai-fal

npm i https://pkg.pr.new/@tanstack/ai-fal@660

@tanstack/ai-gemini

npm i https://pkg.pr.new/@tanstack/ai-gemini@660

@tanstack/ai-grok

npm i https://pkg.pr.new/@tanstack/ai-grok@660

@tanstack/ai-groq

npm i https://pkg.pr.new/@tanstack/ai-groq@660

@tanstack/ai-isolate-cloudflare

npm i https://pkg.pr.new/@tanstack/ai-isolate-cloudflare@660

@tanstack/ai-isolate-node

npm i https://pkg.pr.new/@tanstack/ai-isolate-node@660

@tanstack/ai-isolate-quickjs

npm i https://pkg.pr.new/@tanstack/ai-isolate-quickjs@660

@tanstack/ai-ollama

npm i https://pkg.pr.new/@tanstack/ai-ollama@660

@tanstack/ai-openai

npm i https://pkg.pr.new/@tanstack/ai-openai@660

@tanstack/ai-openrouter

npm i https://pkg.pr.new/@tanstack/ai-openrouter@660

@tanstack/ai-preact

npm i https://pkg.pr.new/@tanstack/ai-preact@660

@tanstack/ai-react

npm i https://pkg.pr.new/@tanstack/ai-react@660

@tanstack/ai-react-ui

npm i https://pkg.pr.new/@tanstack/ai-react-ui@660

@tanstack/ai-solid

npm i https://pkg.pr.new/@tanstack/ai-solid@660

@tanstack/ai-solid-ui

npm i https://pkg.pr.new/@tanstack/ai-solid-ui@660

@tanstack/ai-svelte

npm i https://pkg.pr.new/@tanstack/ai-svelte@660

@tanstack/ai-utils

npm i https://pkg.pr.new/@tanstack/ai-utils@660

@tanstack/ai-vue

npm i https://pkg.pr.new/@tanstack/ai-vue@660

@tanstack/ai-vue-ui

npm i https://pkg.pr.new/@tanstack/ai-vue-ui@660

@tanstack/openai-base

npm i https://pkg.pr.new/@tanstack/openai-base@660

@tanstack/preact-ai-devtools

npm i https://pkg.pr.new/@tanstack/preact-ai-devtools@660

@tanstack/react-ai-devtools

npm i https://pkg.pr.new/@tanstack/react-ai-devtools@660

@tanstack/solid-ai-devtools

npm i https://pkg.pr.new/@tanstack/solid-ai-devtools@660

commit: 4b15de3

coderabbitai

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

packages/ai-ollama/src/adapters/text.ts (1)
36-39: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Narrow the fallback modelOptions type to the subset this adapter actually forwards.

For arbitrary model strings, ResolveModelOptions falls back to the full ollama ChatRequest, but mapCommonOptionsToOllama() only reads modelOptions.options (and then format, keep_alive, logprobs, top_logprobs, and think when present). It sources model/messages from options.model/options.messages and converts tools only from options.tools, so request-level keys like model, messages, stream, and tools typed via the fallback can be silently ignored at runtime.
```typescript
type ResolveModelOptions<TModel extends string> =
  TModel extends keyof OllamaChatModelOptionsByName
    ? OllamaChatModelOptionsByName[TModel]
    : ChatRequest
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/ai-ollama/src/adapters/text.ts` around lines 36 - 39, The fallback
for ResolveModelOptions is too broad (falls back to ChatRequest) even though
mapCommonOptionsToOllama only forwards specific fields; change the fallback to a
narrowed type that only includes the fields actually forwarded (e.g., an object
with an optional options property limited to Pick<ChatRequest['options'],
'format'|'keep_alive'|'logprobs'|'top_logprobs'|'think'> and an optional tools
property of ChatRequest['tools']) so ResolveModelOptions<TModel> returns either
OllamaChatModelOptionsByName[TModel] or that minimal subset; update the type
alias ResolveModelOptions accordingly so mapCommonOptionsToOllama and related
code use the correct, narrower modelOptions shape.
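
A narrowed fallback along these lines might look like the sketch below. It follows the review's description of what the adapter forwards (`options`, `format`, `keep_alive`, `logprobs`, `top_logprobs`, `think`). `ChatRequestLike` is a simplified stand-in for the ollama SDK's `ChatRequest` with assumed field types, and `ForwardedModelOptions` and `ModelOptionsByName` are hypothetical names, so treat this as a shape to check against the real types rather than a drop-in fix.

```typescript
// Simplified stand-in for the ollama SDK's ChatRequest; field types are assumptions.
interface ChatRequestLike {
  model: string
  messages: Array<{ role: string; content: string }>
  stream?: boolean
  tools?: unknown[]
  format?: string | object
  keep_alive?: string | number
  logprobs?: boolean
  top_logprobs?: number
  think?: boolean | string
  options?: { temperature?: number; top_p?: number; num_predict?: number }
}

// Only the keys the review says the adapter actually reads from modelOptions.
type ForwardedKey = 'options' | 'format' | 'keep_alive' | 'logprobs' | 'top_logprobs' | 'think'
type ForwardedModelOptions = Partial<Pick<ChatRequestLike, ForwardedKey>>

// Hypothetical per-model map standing in for OllamaChatModelOptionsByName.
interface ModelOptionsByName {
  'known-model': { options?: { temperature?: number } }
}

type ResolveModelOptions<TModel extends string> =
  TModel extends keyof ModelOptionsByName
    ? ModelOptionsByName[TModel]
    : ForwardedModelOptions

// An arbitrary model string now resolves to the narrow subset.
const fallbackOptions: ResolveModelOptions<'some-custom-model'> = {
  options: { temperature: 0.2, num_predict: 256 },
  keep_alive: '5m',
}

// A known model still resolves to its own entry.
const knownOptions: ResolveModelOptions<'known-model'> = {
  options: { temperature: 0.7 },
}

// Request-level keys would no longer type-check on the fallback, for example:
//   const rejected: ResolveModelOptions<'some-custom-model'> = { messages: [] }
```

The design point is that keys the adapter silently ignores (`model`, `messages`, `stream`, `tools`) stop being accepted by the type, so the mismatch surfaces at compile time instead of at runtime.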

🧹 Nitpick comments (4)

codemods/move-sampling-to-model-options/README.md (1)
18-23: ⚡ Quick win

Clarify OpenAI adapter split in the mapping table.

The table currently presents a single OpenAI mapping to max_output_tokens, which can confuse openaiChatCompletions users (they need max_tokens). Please add an explicit note or split rows for OpenAI Responses vs Chat Completions.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@codemods/move-sampling-to-model-options/README.md` around lines 18 - 23, The
mapping table currently lists a single OpenAI entry mapping `maxTokens` to
`max_output_tokens`, which is ambiguous for users of `openaiChatCompletions`;
update the table to split the OpenAI row into two rows (e.g., "openai
(Responses)" and "openai (Chat Completions)") and set `maxTokens` ->
`max_output_tokens` for the Responses row and `maxTokens` -> `max_tokens` for
the Chat Completions row; mention the `openaiChatCompletions` identifier in the
note so readers know which row to use.
packages/ai/tests/summarize-max-length.test.ts (1)
1-139: ⚡ Quick win

Place this unit test alongside the summarize source file.

This new suite lives under packages/ai/tests/, but the repo convention is to colocate *.test.ts files with the source they cover. Moving it next to packages/ai/src/activities/summarize/chat-stream-summarize.ts will keep the summarize contract and its regression coverage together.

As per coding guidelines, "Place unit tests in *.test.ts files alongside source files".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/ai/tests/summarize-max-length.test.ts` around lines 1 - 139, Move
the test file packages/ai/tests/summarize-max-length.test.ts next to the
implementation file chat-stream-summarize.ts so the test is colocated with its
source; update imports in summarize-max-length.test.ts to reference the local
module path (adjust any ../src/... imports) and keep references to
ChatStreamSummarizeAdapter, createRecordingTextAdapter, and the test helpers
(resolveDebugOption, ev) intact so the suite continues to import the same
symbols from the colocated files. Ensure the new location preserves the same
filename and that any path changes are minimal and correct for the package
module resolution.
codemods/move-sampling-to-model-options/transform.ts (1)
341-356: 💤 Low value

Consider removing the type cast for clarity.

Line 351 casts key.name as RootSamplingKey, but key.name could be any string. While safe at runtime (Set.has() returns false for non-members), the cast is technically incorrect.
♻️ Clearer type-safe alternative
```diff
       const key = (prop as Property).key
       if (
         key.type === 'Identifier' &&
-        movedSet.has(key.name as RootSamplingKey)
+        ROOT_SAMPLING_KEYS.includes(key.name as RootSamplingKey) &&
+        movedSet.has(key.name as RootSamplingKey)
       ) {
         return false
```
Or use a type guard:
```diff
+      const isRootSamplingKey = (name: string): name is RootSamplingKey =>
+        ROOT_SAMPLING_KEYS.includes(name as RootSamplingKey)
+
       ...
       if (
         key.type === 'Identifier' &&
-        movedSet.has(key.name as RootSamplingKey)
+        isRootSamplingKey(key.name) &&
+        movedSet.has(key.name)
       ) {
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@codemods/move-sampling-to-model-options/transform.ts` around lines 341 - 356,
The cast key.name as RootSamplingKey is unnecessary/unsafe; instead make the set
string-typed and check the identifier name directly: change movedSet's type from
Set<RootSamplingKey> to Set<string> when creating it (const movedSet = new
Set<string>(presentKeys)) and inside the obj.properties filter, after confirming
key.type === 'Identifier', use const name = key.name and call movedSet.has(name)
(no cast) to decide removal.
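
For reference, a cast-free version of the type-guard variant can be sketched as follows. It reuses the names from the review (`ROOT_SAMPLING_KEYS`, `RootSamplingKey`, `movedSet`, `presentKeys`), but works on plain property-name strings instead of AST nodes; `dropMovedKeys` is a hypothetical wrapper added so the snippet runs on its own.

```typescript
const ROOT_SAMPLING_KEYS = ['temperature', 'topP', 'maxTokens'] as const
type RootSamplingKey = (typeof ROOT_SAMPLING_KEYS)[number]

// Widening the tuple to readonly string[] is a plain assignment, so no cast is needed.
const ROOT_KEY_NAMES: readonly string[] = ROOT_SAMPLING_KEYS

function isRootSamplingKey(name: string): name is RootSamplingKey {
  return ROOT_KEY_NAMES.some((key) => key === name)
}

// Hypothetical stand-in for the property filter in the transform:
// keeps every property name except the sampling keys that were moved.
function dropMovedKeys(propertyNames: string[], presentKeys: RootSamplingKey[]): string[] {
  const movedSet = new Set<RootSamplingKey>(presentKeys)
  return propertyNames.filter((name) => !(isRootSamplingKey(name) && movedSet.has(name)))
}
```

Because the guard narrows `name` before `movedSet.has(name)` is called, the set can keep its `Set<RootSamplingKey>` type without any assertion.
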
codemods/move-sampling-to-model-options/__testfixtures__/shorthand.output.ts (1)
14-16: 💤 Low value

Consider using shorthand property syntax.

The codemod could be enhanced to preserve or use ES6 shorthand syntax when the property name matches the identifier name. Line 15 uses temperature: temperature, but modern JavaScript/TypeScript convention prefers the shorthand form temperature, for better readability.

Both forms correctly reference the identifier rather than inlining the literal value, so the functional requirement is met.
♻️ More idiomatic shorthand syntax
```diff
 modelOptions: {
-  temperature: temperature,
+  temperature,
 },
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@codemods/move-sampling-to-model-options/__testfixtures__/shorthand.output.ts`
around lines 14 - 16, The object inside modelOptions uses verbose property
syntax "temperature: temperature,"—update the codemod transform that emits the
modelOptions object so when a property key equals its identifier (e.g.,
temperature) it emits the ES6 shorthand (temperature,) instead; locate the code
that constructs or prints the modelOptions object (the logic producing the
"modelOptions" node and its properties) and change it to detect identical
key+identifier pairs and output the shorthand property form.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/migration/migration.md`:
- Around line 200-201: The "Complete Migration Example" still places sampling
props (temperature, maxTokens) at the chat() root even though the guide states
they were moved into provider-native modelOptions; update the "After" snippet(s)
referenced around the "Complete Migration Example" so that temperature and
maxTokens are removed from the chat() root and instead placed inside
modelOptions using provider-native keys (e.g., provider-specific names) while
keeping metadata at the root; ensure both affected blocks (around lines 461–468
and the final After snippet) reflect modelOptions usage for sampling.

In `@docs/migration/sampling-options-to-model-options.md`:
- Around line 210-211: The file ends with leaked XML/HTML tags that break
rendering; remove the accidental trailing tags "</content>" and "</invoke>" from
the end of docs/migration/sampling-options-to-model-options.md so the document
ends cleanly with the previous markdown content (ensure there are no other stray
markup fragments left).

In `@packages/ai-anthropic/src/adapters/text.ts`:
- Line 365: The code uses `modelOptions?.max_tokens || 1024` which treats
explicit 0 as falsy and overrides it; change this to use the nullish coalescing
operator so explicit 0 (or other falsy but defined values) are preserved:
replace that expression with `modelOptions?.max_tokens ?? 1024` (or an explicit
undefined check) so `validateTextProviderOptions()` / `validateMaxTokens()` can
correctly detect and reject invalid zero values.
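
The difference between the two operators can be seen in isolation with the sketch below. The 1024 default comes from the comment above; `resolveWithOr`, `resolveWithNullish`, and `assertPositiveMaxTokens` are hypothetical names, and the validator is only a stand-in since the behaviour of the real `validateMaxTokens()` is not shown here.

```typescript
const DEFAULT_MAX_TOKENS = 1024

// Mirrors the flagged expression: 0 is falsy, so it is replaced by the default.
function resolveWithOr(maxTokens: number | undefined): number {
  return maxTokens || DEFAULT_MAX_TOKENS
}

// Only null and undefined fall back to the default; an explicit 0 survives.
function resolveWithNullish(maxTokens: number | undefined): number {
  return maxTokens ?? DEFAULT_MAX_TOKENS
}

// Stand-in validator (assumption): rejects anything that is not a positive integer.
function assertPositiveMaxTokens(maxTokens: number): void {
  if (!Number.isInteger(maxTokens) || maxTokens < 1) {
    throw new RangeError(`max_tokens must be a positive integer, got ${maxTokens}`)
  }
}
```

With `||`, a caller passing `0` silently gets 1024 and validation never sees the bad value; with `??`, the `0` reaches the validator and can be rejected.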

In `@packages/ai-code-mode/models-eval/run-eval.ts`:
- Around line 224-225: The Ollama branch currently returns sampling params at
the top level (case 'ollama'), but the adapter expects them nested under
modelOptions.options; modify the 'ollama' case to return an object with an
options property containing num_predict: maxTokens and num_ctx: 32768 (i.e.,
move num_predict and num_ctx inside options) so the sampling parameters are
forwarded correctly to the Ollama adapter.

In `@packages/ai-openai/src/text/text-provider-options.ts`:
- Around line 17-38: OpenAIBaseOptions currently inherits temperature/top_p via
OpenAISamplingOptions which makes those fields valid for reasoning models;
remove temperature and top_p from the shared base by stopping OpenAIBaseOptions
from extending OpenAISamplingOptions (keep max_output_tokens in the base),
create a new OpenAISamplingOptions (or NonReasoningSamplingOptions) interface
that includes temperature/top_p and apply that only to non-reasoning model
option types in model-meta.ts (e.g., O1/O3 should not use OpenAIBaseOptions with
sampling fields); additionally ensure adapters/responses-text.ts
mapOptionsToRequest() does not blindly spread modelOptions into
ResponseCreateParams for reasoning models—either filter out temperature/top_p
there or only pass the sampling interface for non-reasoning models so the client
never sends temperature/top_p to the Responses API for reasoning models.

In `@packages/ai-openrouter/src/adapters/text.ts`:
- Around line 1148-1153: The request builder currently spreads only
restModelOptions into the ChatRequest (const request) so a root-level metadata
passed to chat({ metadata }) is dropped; update the construction of request in
the text adapter (the const request: Omit<ChatRequest, 'stream'> block) to also
include the root metadata (e.g., preserve options.metadata or a top-level
metadata parameter) alongside model, messages and tools so the ChatRequest
forwarded to OpenRouter contains metadata the same way responses adapter does.

In `@packages/ai/src/activities/summarize/chat-stream-summarize.ts`:
- Around line 80-87: In the Ollama branch (the adapterName === 'ollama' block)
ensure we don’t override caller-set token limits: before injecting
merged.options = { num_predict: maxLength, ...existing } check if existing
already contains any recognized flat token-limit keys (e.g., num_predict,
max_tokens, max_length, max_output_tokens) and if so return merged unchanged;
only set num_predict when none of those keys are present. Update the guard
around merged.options / existing and the assignment in that block to honor those
keys instead of always injecting num_predict.
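
One possible reading of that guard is sketched below: inject `num_predict` only when the caller has not already set a token limit, either nested under `options` or flat on `modelOptions`. `withOllamaMaxLength` and `TOKEN_LIMIT_KEYS` are hypothetical names, and checking both levels is an interpretation of the comment rather than the confirmed fix.

```typescript
type Bag = Record<string, unknown>

// Token-limit spellings named in the review comment.
const TOKEN_LIMIT_KEYS = ['num_predict', 'max_tokens', 'max_length', 'max_output_tokens'] as const

function isBag(value: unknown): value is Bag {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

function withOllamaMaxLength(modelOptions: Bag | undefined, maxLength: number): Bag {
  const merged: Bag = { ...(modelOptions ?? {}) }
  const existingOptions = merged['options']
  const nested: Bag = isBag(existingOptions) ? existingOptions : {}

  // Honor any limit the caller already chose, at either level.
  const callerSetLimit = TOKEN_LIMIT_KEYS.some((key) => key in merged || key in nested)
  if (callerSetLimit) return merged

  // Otherwise add the default limit while keeping the other nested options.
  merged['options'] = { num_predict: maxLength, ...nested }
  return merged
}
```

Under this reading, `summarize({ maxLength })` acts purely as a default: it fills in `options.num_predict` when nothing was specified and stays out of the way otherwise.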

In `@packages/ai/src/middlewares/otel.ts`:
- Around line 169-172: The helper firstNumber currently returns any value of
typeof 'number' (including NaN and Infinity); update its check in the
firstNumber function so it only returns numbers that are finite by using
Number.isFinite(candidate) (i.e., return candidate only if typeof candidate ===
'number' && Number.isFinite(candidate)), leaving the rest of the control flow
unchanged so invalid numeric values are filtered out and undefined is returned
when no finite number is found.
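
A minimal sketch of the two behaviours, assuming `firstNumber` takes a variadic list of candidates (its real signature is not shown in this thread); `firstNumberLoose` is a hypothetical name for the current `typeof`-only check.

```typescript
// Current behaviour as described: any value of type 'number' wins, including NaN.
function firstNumberLoose(...candidates: unknown[]): number | undefined {
  for (const candidate of candidates) {
    if (typeof candidate === 'number') return candidate
  }
  return undefined
}

// Suggested behaviour: skip NaN and +/-Infinity and keep looking.
function firstNumber(...candidates: unknown[]): number | undefined {
  for (const candidate of candidates) {
    if (typeof candidate === 'number' && Number.isFinite(candidate)) return candidate
  }
  return undefined
}
```

Given the candidates `NaN, Infinity, 0.7`, the loose version returns `NaN` while the finite-aware version returns `0.7`, so a malformed value no longer ends up as a span attribute.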

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1e00706c-024b-453a-a0d4-8a667e653722

📥 Commits

Reviewing files that changed from the base of the PR and between 548e113 and 4e8afb8.

📒 Files selected for processing (86)

.changeset/sampling-options-to-model-options.md
codemods/README.md
codemods/move-sampling-to-model-options/README.md
codemods/move-sampling-to-model-options/__testfixtures__/anthropic-merge.input.ts
codemods/move-sampling-to-model-options/__testfixtures__/anthropic-merge.output.ts
codemods/move-sampling-to-model-options/__testfixtures__/conflict.input.ts
codemods/move-sampling-to-model-options/__testfixtures__/conflict.output.ts
codemods/move-sampling-to-model-options/__testfixtures__/create-chat-options.input.ts
codemods/move-sampling-to-model-options/__testfixtures__/create-chat-options.output.ts
codemods/move-sampling-to-model-options/__testfixtures__/gemini-rename.input.ts
codemods/move-sampling-to-model-options/__testfixtures__/gemini-rename.output.ts
codemods/move-sampling-to-model-options/__testfixtures__/generate-and-ai.input.ts
codemods/move-sampling-to-model-options/__testfixtures__/generate-and-ai.output.ts
codemods/move-sampling-to-model-options/__testfixtures__/groq-maxtokens.input.ts
codemods/move-sampling-to-model-options/__testfixtures__/groq-maxtokens.output.ts
codemods/move-sampling-to-model-options/__testfixtures__/no-import.input.ts
codemods/move-sampling-to-model-options/__testfixtures__/no-import.output.ts
codemods/move-sampling-to-model-options/__testfixtures__/ollama-nested.input.ts
codemods/move-sampling-to-model-options/__testfixtures__/ollama-nested.output.ts
codemods/move-sampling-to-model-options/__testfixtures__/openai-basic.input.ts
codemods/move-sampling-to-model-options/__testfixtures__/openai-basic.output.ts
codemods/move-sampling-to-model-options/__testfixtures__/openrouter-maxtokens.input.ts
codemods/move-sampling-to-model-options/__testfixtures__/openrouter-maxtokens.output.ts
codemods/move-sampling-to-model-options/__testfixtures__/shorthand.input.ts
codemods/move-sampling-to-model-options/__testfixtures__/shorthand.output.ts
codemods/move-sampling-to-model-options/__testfixtures__/unresolvable-adapter.input.ts
codemods/move-sampling-to-model-options/__testfixtures__/unresolvable-adapter.output.ts
codemods/move-sampling-to-model-options/transform.test.ts
codemods/move-sampling-to-model-options/transform.ts
codemods/package.json
docs/adapters/anthropic.md
docs/adapters/gemini.md
docs/adapters/grok.md
docs/adapters/groq.md
docs/adapters/ollama.md
docs/adapters/openai.md
docs/adapters/openrouter.md
docs/advanced/middleware.md
docs/advanced/typed-options.md
docs/api/ai.md
docs/config.json
docs/migration/migration.md
docs/migration/sampling-options-to-model-options.md
package.json
packages/ai-anthropic/src/adapters/text.ts
packages/ai-anthropic/src/text/text-provider-options.ts
packages/ai-anthropic/tests/anthropic-adapter.test.ts
packages/ai-code-mode/models-eval/judge.ts
packages/ai-code-mode/models-eval/run-eval.ts
packages/ai-gemini/src/adapters/text.ts
packages/ai-gemini/src/experimental/text-interactions/adapter.ts
packages/ai-gemini/src/text/text-provider-options.ts
packages/ai-gemini/tests/gemini-adapter.test.ts
packages/ai-gemini/tests/text-interactions-adapter.test.ts
packages/ai-grok/tests/grok-adapter.test.ts
packages/ai-groq/tests/groq-adapter.test.ts
packages/ai-ollama/src/adapters/text.ts
packages/ai-ollama/src/index.ts
packages/ai-ollama/src/meta/models-meta.ts
packages/ai-ollama/tests/text-adapter.test.ts
packages/ai-openai/src/text/text-provider-options.ts
packages/ai-openai/tests/chat-per-model-type-safety.test.ts
packages/ai-openai/tests/openai-adapter.test.ts
packages/ai-openrouter/src/adapters/responses-text.ts
packages/ai-openrouter/src/adapters/text.ts
packages/ai-openrouter/tests/openrouter-adapter.test.ts
packages/ai-openrouter/tests/openrouter-responses-adapter.test.ts
packages/ai/skills/ai-core/adapter-configuration/SKILL.md
packages/ai/skills/ai-core/adapter-configuration/references/anthropic-adapter.md
packages/ai/skills/ai-core/adapter-configuration/references/gemini-adapter.md
packages/ai/skills/ai-core/adapter-configuration/references/ollama-adapter.md
packages/ai/skills/ai-core/adapter-configuration/references/openai-adapter.md
packages/ai/skills/ai-core/chat-experience/SKILL.md
packages/ai/skills/ai-core/middleware/SKILL.md
packages/ai/src/activities/chat/index.ts
packages/ai/src/activities/chat/middleware/types.ts
packages/ai/src/activities/summarize/chat-stream-summarize.ts
packages/ai/src/middlewares/otel.ts
packages/ai/src/types.ts
packages/ai/tests/chat.test.ts
packages/ai/tests/middleware.test.ts
packages/ai/tests/middlewares/otel.test.ts
packages/ai/tests/summarize-max-length.test.ts
packages/openai-base/src/adapters/chat-completions-text.ts
packages/openai-base/src/adapters/responses-text.ts
packages/openai-base/tests/responses-text.test.ts

💤 Files with no reviewable changes (4)

packages/ai-gemini/src/adapters/text.ts
packages/ai-gemini/src/experimental/text-interactions/adapter.ts
packages/ai/src/types.ts
packages/ai-ollama/src/index.ts

coderabbitai · 2026-05-30T14:41:17Z

+/** Sampling controls shared by all Responses-API models. */
+export interface OpenAISamplingOptions {
+  /**
+   * Sampling temperature, 0–2. Higher = more random. Recommend altering this or top_p, not both.
+   * Note: OpenAI reasoning models (o-series, GPT-5 reasoning) reject temperature/top_p.
+   * https://platform.openai.com/docs/api-reference/responses/create#responses_create-temperature
+   */
+  temperature?: number
+  /**
+   * Nucleus sampling. 0.1 = only the top 10% probability mass is considered.
+   * https://platform.openai.com/docs/api-reference/responses/create#responses_create-top_p
+   */
+  top_p?: number
+  /**
+   * Upper bound on generated tokens (visible output + reasoning tokens).
+   * https://platform.openai.com/docs/api-reference/responses/create#responses_create-max_output_tokens
+   */
+  max_output_tokens?: number
+}
+
 // Core, always-available options for Responses API
-export interface OpenAIBaseOptions {
+export interface OpenAIBaseOptions extends OpenAISamplingOptions {


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

Do OpenAI Responses API reasoning models (for example o3 and GPT-5 reasoning models) accept the temperature and top_p parameters, or are those parameters rejected?

💡 Result:

OpenAI's dedicated reasoning models, such as the o1 and o3 series (including o3-mini) and the GPT-5 series, generally do not accept the traditional temperature and top_p sampling parameters via the API [1][2][3][4]. Attempts to include these parameters in API requests for these models typically result in a 400-level invalid_request_error stating that they are unsupported [1][2]. These reasoning models instead utilize specialized parameters to control their behavior, most notably reasoning.effort (or simply reasoning_effort in some environments), which allows developers to adjust the depth of the model's internal thinking process (e.g., minimal, low, medium, high, or xhigh) [5][6][7][8][3]. The exclusion of sampling parameters like temperature is a deliberate design choice by OpenAI to maintain the reliability and determinism required for complex, multi-step reasoning tasks [2]. Because these models are optimized to generate a high-quality, singular "best" chain-of-thought, standard sampling techniques used for creative text generation are often incompatible with their internal architecture [2]. Developers building on these models should omit temperature and top_p from their API calls to avoid request errors [1][2][4].

Citations:

1: CLI erroneously sends unsupported parameters (temperature/top_p) to the o3-mini model openai/openai-python#2072

2: https://community.openai.com/t/o3-mini-unsupported-parameter-temperature/1140846

3: https://learn.microsoft.com/en-us/azure/foundry/openai/how-to/reasoning

4: https://forum.popclip.app/t/required-api-changes-for-gpt-5-x-reasoning-models/3661

5: https://developers.openai.com/api/docs/guides/reasoning

6: https://docs.x.ai/developers/model-capabilities/text/reasoning

7: https://developers.openai.com/api/docs/models/gpt-5

8: https://developers.openai.com/api/docs/guides/latest-model

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find all references to the shared sampling/base types
rg -n "OpenAISamplingOptions|OpenAIBaseOptions" packages/ai-openai/src/text/text-provider-options.ts packages/ai-openai/src -S

# Show the rest of the file around where base/model option maps are composed
FILE="packages/ai-openai/src/text/text-provider-options.ts"
if [ -f "$FILE" ]; then
  echo "---- $FILE (context) ----"
  # print around the shown snippet and further down for composition
  nl -ba "$FILE" | sed -n '1,220p'
fi

# Look for option maps / reasoning models naming patterns
rg -n "reasoning|o3|o1|gpt-5|responses.*(create|temperature|top_p)" packages/ai-openai/src -S

# Identify where modelOptions types are built for text providers
rg -n "modelOptions|ModelOptions|options.*model" packages/ai-openai/src/text packages/ai-openai/src -S

Repository: TanStack/ai

Length of output: 9554

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show key type definitions and where reasoning models include base options.
rg -n "interface OpenAIReasoningOptions|type OpenAIReasoningOptions|OpenAIReasoningOptions" packages/ai-openai/src/text/text-provider-options.ts packages/ai-openai/src/model-meta.ts -S

# Print relevant ranges from text-provider-options.ts around sampling/base/reasoning definitions
FILE="packages/ai-openai/src/text/text-provider-options.ts"

# Get line numbers via python and print small windows around important anchors
python3 - <<'PY'
import re, itertools
file_path="packages/ai-openai/src/text/text-provider-options.ts"
anchors=["export interface OpenAISamplingOptions","export interface OpenAIBaseOptions","OpenAIReasoningOptions","ExternalTextProviderOptions"]
with open(file_path,'r',encoding='utf-8') as f:
    lines=f.readlines()
for i,line in enumerate(lines,1):
    for a in anchors:
        if a in line:
            print(f"--- anchor:{a} at {i} ---")
            start=max(1,i-25); end=min(len(lines),i+120)
            for j in range(start,end+1):
                # trim long lines a bit? just output as-is
                print(f"{j}:{lines[j-1].rstrip()}")
PY

# Print relevant ranges from model-meta.ts around reasoning models (O3/O1) where OpenAIBaseOptions is used
python3 - <<'PY'
file_path="packages/ai-openai/src/model-meta.ts"
targets=["[O3.name]","[O1.name]","[O3_PRO.name]","[O1_PRO.name]","reasoning"]
with open(file_path,'r',encoding='utf-8') as f:
    lines=f.readlines()
def match_any(line):
    return any(t in line for t in targets)
for i,line in enumerate(lines,1):
    if match_any(line):
        print(f"--- match at {i}: {line.strip()} ---")
        start=max(1,i-20); end=min(len(lines),i+40)
        for j in range(start,end+1):
            print(f"{j}:{lines[j-1].rstrip()}")
PY

# Search for how temperature/top_p is serialized into API requests.
rg -n "temperature|max_output_tokens|top_p" packages/ai-openai/src -S

Repository: TanStack/ai

Length of output: 50367

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show key reasoning type definitions and where reasoning models include base options.
rg -n "OpenAIReasoningOptions" packages/ai-openai/src/text/text-provider-options.ts packages/ai-openai/src/model-meta.ts -S

# Print relevant windows from text-provider-options.ts around sampling/base/reasoning/external option composition.
python3 - <<'PY'
file_path="packages/ai-openai/src/text/text-provider-options.ts"
anchors=[
    "export interface OpenAISamplingOptions",
    "export interface OpenAIBaseOptions",
    "OpenAIReasoningOptions",
    "export type ExternalTextProviderOptions"
]
with open(file_path,'r',encoding='utf-8') as f:
    lines=f.readlines()
for i,line in enumerate(lines,1):
    for a in anchors:
        if a in line:
            print(f"\n--- anchor:{a} at {i} ---")
            start=max(1,i-25); end=min(len(lines),i+180)
            for j in range(start,end+1):
                print(f"{j}:{lines[j-1].rstrip()}")
PY

# Print relevant windows from model-meta.ts around reasoning models (O3/O1) where OpenAIBaseOptions is used.
python3 - <<'PY'
file_path="packages/ai-openai/src/model-meta.ts"
with open(file_path,'r',encoding='utf-8') as f:
    lines=f.readlines()
# Print around O3/O1 entries explicitly
for i,line in enumerate(lines,1):
    if "[O3.name]" in line or "[O1.name]" in line:
        print(f"\n--- model-meta match at {i}: {line.strip()} ---")
        start=max(1,i-25); end=min(len(lines),i+60)
        for j in range(start,end+1):
            print(f"{j}:{lines[j-1].rstrip()}")
PY

# Search for how temperature/top_p is serialized into API requests.
rg -n "temperature|max_output_tokens|top_p" packages/ai-openai/src -S

Repository: TanStack/ai

Length of output: 50367

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find non-type usages/serialization points for temperature/top_p.
# Limit output to keep it readable.
rg -n --type ts "(^|[^a-zA-Z0-9_])(temperature|top_p|max_output_tokens)([^a-zA-Z0-9_]|$)" packages/ai-openai/src -S \
  | head -n 80

# Look specifically for payload construction / Responses create call sites.
rg -n --type ts "responses\.create|Responses\.create|createResponse|response.*payload|payload\s*=" packages/ai-openai/src -S \
  | head -n 80

# Look for any conditional gating related to reasoning models.
rg -n --type ts "reasoning.*temperature|temperature.*reasoning|top_p.*reasoning|reasoning.*top_p" packages/ai-openai/src -S \
  | head -n 80
rg -n --type ts "O1\.name|O3\.name|o-series|o3|gpt-5.*reasoning|reasoning model" packages/ai-openai/src -S \
  | head -n 80

Repository: TanStack/ai

Length of output: 8286

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for where temperature/top_p are mapped into Responses API payloads for text.
rg -n --type ts "top_p\s*:|temperature\s*:" packages/ai-openai/src/adapters packages/ai-openai/src/text -S

# Look for any explicit filtering based on model capability/name.
rg -n --type ts "O1\.name|O3\.name|o3|o1|reasoning model|reasoning" packages/ai-openai/src/adapters packages/ai-openai/src/text -S | head -n 200

# Also check the legacy/openai adapter entrypoint if it builds payload for responses.create.
rg -n --type ts "responses\.create|OpenAI\.responses\.create|create\(" packages/ai-openai/src/adapters packages/ai-openai/src/text -S | head -n 200

Repository: TanStack/ai

Length of output: 37

🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="packages/ai-openai/src/adapters/text.ts"
python3 - <<'PY'
import itertools, pathlib
path=pathlib.Path("packages/ai-openai/src/adapters/text.ts")
text=path.read_text(encoding="utf-8").splitlines()
# print whole file if small? otherwise print key regions
print("LINES", len(text))
# find occurrences of responses.create or payload building
need=[]
for i,line in enumerate(text,1):
    if "responses.create" in line or "Responses.create" in line or "openai.responses.create" in line:
        need.append(i)
    if "temperature" in line or "top_p" in line:
        need.append(i)
if not need:
    # still print first 260 lines and any later sections likely containing request building
    slices=[(1,220),(220,440),(440,660),(660,880)]
else:
    slices=[]
    for i in sorted(set(need)):
        slices.append((max(1,i-60), min(len(text), i+80)))
# dedupe slices by start
seen=set()
for a,b in slices:
    key=(a,b)
    if key in seen:
        continue
    seen.add(key)
    print(f"\n---- slice {a}-{b} ----")
    for j in range(a,b+1):
        print(f"{j}:{text[j-1]}")
PY

# Broader scan in adapters for responses payload creation / temperature
rg -n --type ts "responses\.create|OpenAI.*responses|response[s]?\s*\(" packages/ai-openai/src/adapters -S
rg -n --type ts "temperature|max_output_tokens|top_p" packages/ai-openai/src/adapters -S || true

Repository: TanStack/ai

Length of output: 5664

🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "OpenAIBaseResponsesTextAdapter" -S packages

# If found, show the key parts where it builds the ResponseCreateParams payload.
# Otherwise, we may need to determine that the filtering (if any) lives in an external package.
rg -n "mapOptionsToRequest\\(|ResponseCreateParams|responses\\.create|temperature|top_p" -S packages | head -n 200

Repository: TanStack/ai

Length of output: 28149

🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="packages/openai-base/src/adapters/responses-text.ts"

# show the region around the comment mentioning modelOptions is sole sampling surface
# (use python to avoid relying on nl)
python3 - <<'PY'
import pathlib
path=pathlib.Path("packages/openai-base/src/adapters/responses-text.ts")
lines=path.read_text(encoding="utf-8").splitlines()
# print around the first occurrence of the anchor text
anchor="modelOptions is the sole sampling surface"
idx=None
for i,l in enumerate(lines):
    if anchor in l:
        idx=i
        break
print("ANCHOR_LINE_INDEX", idx+1 if idx is not None else None)
if idx is None:
    raise SystemExit("anchor not found")
start=max(0, idx-80)
end=min(len(lines), idx+220)
for j in range(start,end):
    print(f"{j+1}:{lines[j]}")
PY

# Then narrow specifically for temperature/top_p serialization in this file
rg -n "temperature|top_p|max_output_tokens" "$FILE" -S

# Print small windows around each temperature/top_p occurrence (keep concise)
python3 - <<'PY'
import pathlib,re
path=pathlib.Path("packages/openai-base/src/adapters/responses-text.ts")
lines=path.read_text(encoding="utf-8").splitlines()
hits=[]
for i,l in enumerate(lines,1):
    if re.search(r"\b(temperature|top_p|max_output_tokens)\b", l):
        hits.append(i)
# keep up to 10 hits for brevity
for i in hits[:12]:
    start=max(1,i-15); end=min(len(lines),i+25)
    print(f"\n--- window around line {i} ---")
    for j in range(start,end+1):
        print(f"{j}:{lines[j-1]}")
PY

Repository: TanStack/ai

Length of output: 113

🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="packages/openai-base/src/adapters/responses-text.ts"

# Find where mapOptionsToRequest starts and where it mentions modelOptions/sampling.
rg -n --type ts "mapOptionsToRequest|modelOptions|sampling|temperature|top_p" "$FILE" -S

# Print around mapOptionsToRequest and around any temperature/top_p mentions.
python3 - <<'PY'
import pathlib,re
path=pathlib.Path("packages/openai-base/src/adapters/responses-text.ts")
lines=path.read_text(encoding="utf-8").splitlines()
# line numbers with temperature/top_p/max_output_tokens
targets=[]
for i,l in enumerate(lines,1):
    if re.search(r"\b(temperature|top_p|max_output_tokens)\b", l):
        targets.append(i)
# add mapOptionsToRequest region start
for i,l in enumerate(lines,1):
    if "mapOptionsToRequest" in l:
        targets.append(i)
targets=sorted(set(targets))
print("HIT_COUNT", len(targets))
# print windows
for i in targets[:30]:
    start=max(1,i-25); end=min(len(lines), i+50)
    print(f"\n---- window around {i} ----")
    for j in range(start,end+1):
        print(f"{j}:{lines[j-1]}")
PY

Repository: TanStack/ai

Length of output: 37172

Don’t expose temperature/top_p on the shared OpenAI base options for reasoning models

OpenAIBaseOptions currently extends OpenAISamplingOptions (temperature/top_p), and model-meta.ts assigns OpenAIBaseOptions to reasoning models like O1/O3, so these fields become valid for all models.

At runtime, packages/openai-base/src/adapters/responses-text.ts mapOptionsToRequest() spreads ...modelOptions directly into ResponseCreateParams (the sole sampling surface), with no capability-based filtering—so temperature/top_p will be sent to /v1/responses and rejected by OpenAI reasoning models.

Keep max_output_tokens in the shared base if needed, but gate/remove temperature/top_p from the shared base and put them only on the non-reasoning model option types.
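
A minimal sketch of the type split suggested here, under stated assumptions: every type name below is hypothetical and none of them exist in the PR; only the field names (`temperature`, `top_p`, `max_output_tokens`) come from the diff above. The real per-model matrix in model-meta.ts is much larger, and the `reasoning.effort` values shown are illustrative.

```typescript
// Hypothetical names throughout; only the field names come from the PR.

/** Token cap: kept on the shared base because every model accepts it. */
interface OpenAITokenLimitOptions {
  max_output_tokens?: number
}

/** Sampling knobs that reasoning models reject. */
interface OpenAINonReasoningSamplingOptions {
  temperature?: number
  top_p?: number
}

/** Option type for a non-reasoning model: token cap plus sampling. */
type ChatModelOptions = OpenAITokenLimitOptions &
  OpenAINonReasoningSamplingOptions

/** Option type for a reasoning model: token cap plus reasoning controls. */
type ReasoningModelOptions = OpenAITokenLimitOptions & {
  reasoning?: { effort?: 'low' | 'medium' | 'high' }
}

const chatOptions: ChatModelOptions = {
  temperature: 0.2,
  max_output_tokens: 256,
}

const reasoningOptions: ReasoningModelOptions = {
  max_output_tokens: 256,
  reasoning: { effort: 'low' },
}

// @ts-expect-error temperature is not part of the reasoning option type
const rejected: ReasoningModelOptions = { temperature: 0.2 }
```

With a split along these lines, passing `temperature` to a reasoning model would fail at compile time instead of surfacing as a 400 from the API.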

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/ai-openai/src/text/text-provider-options.ts` around lines 17 - 38, OpenAIBaseOptions currently inherits temperature/top_p via OpenAISamplingOptions which makes those fields valid for reasoning models; remove temperature and top_p from the shared base by stopping OpenAIBaseOptions from extending OpenAISamplingOptions (keep max_output_tokens in the base), create a new OpenAISamplingOptions (or NonReasoningSamplingOptions) interface that includes temperature/top_p and apply that only to non-reasoning model option types in model-meta.ts (e.g., O1/O3 should not use OpenAIBaseOptions with sampling fields); additionally ensure adapters/responses-text.ts mapOptionsToRequest() does not blindly spread modelOptions into ResponseCreateParams for reasoning models—either filter out temperature/top_p there or only pass the sampling interface for non-reasoning models so the client never sends temperature/top_p to the Responses API for reasoning models.

Deferring this one to a follow-up rather than folding it into this PR, for two reasons:

Not a regression introduced here. Before this PR, temperature/topP lived on the chat() root and were sent to any model, reasoning ones included — the runtime behavior (a user can send temperature to o3/gpt-5 and get a 400) is unchanged. This PR only relocated the fields; it didn't newly enable sending them to reasoning models at runtime.

It's a sizable, orthogonal type-design change. Every OpenAI model in model-meta.ts (~30 entries) composes OpenAIBaseOptions. Splitting sampling out of the shared base and re-applying it only to non-reasoning models touches the whole matrix and reshapes the public per-model type surface — that deserves its own PR with its own tests, not a rider on the sampling-relocation change. The docstring on OpenAISamplingOptions already documents that reasoning models reject these.

Tracking as a follow-up: gate temperature/top_p to non-reasoning OpenAI models at the type level (and optionally filter them in responses-text.ts for reasoning models).
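
The optional runtime half of that follow-up could look roughly like the sketch below. It is an illustration only: `stripUnsupportedSampling` and `isReasoningModel` are hypothetical helpers, and the prefix list is an assumption made for the example, not an authoritative list of reasoning models.

```typescript
// Hypothetical helpers; not part of the PR.
// Assumption: a model-name prefix is enough to spot a reasoning model.
const REASONING_MODEL_PREFIXES = ['o1', 'o3', 'gpt-5'] as const

// Keys that reasoning models reject according to the review thread.
const REJECTED_KEYS = new Set<string>(['temperature', 'top_p'])

function isReasoningModel(model: string): boolean {
  return REASONING_MODEL_PREFIXES.some((prefix) => model.startsWith(prefix))
}

function stripUnsupportedSampling(
  model: string,
  modelOptions: Record<string, unknown>,
): Record<string, unknown> {
  // Non-reasoning models keep their options untouched.
  if (!isReasoningModel(model)) return modelOptions

  // Copy every key except the rejected sampling keys.
  const filtered: Record<string, unknown> = {}
  for (const [key, value] of Object.entries(modelOptions)) {
    if (!REJECTED_KEYS.has(key)) filtered[key] = value
  }
  return filtered
}
```

A filter like this trades a loud 400 for a silent drop, so it would probably want to be paired with a warning.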

@AlemTuzlak, that reasoning makes sense — this PR relocates the surface, it doesn't regress behaviour, and the docstring on OpenAISamplingOptions already calls out the restriction. Splitting the per-model type matrix into a separate PR is the right call.

Would you like me to open a GitHub issue to track the follow-up work (gating temperature/top_p to non-reasoning model types and optionally filtering them in responses-text.ts)?

tombeckenham

pnpm vitest run move-sampling-to-model-options fails
Ollama summarize silently drops the temperature: 0.3 default

tombeckenham

Review of the sampling→modelOptions move. Core mapping is correct and well-tested; flagging a few real issues inline. Two blockers:

Codemod suite is red (9/12 fail) and escapes CI — codemods exposes a test script, not test:lib, so pnpm test:pr never runs it.
Ollama temperature/token caps land flat in two places (summarize default + the eval harness) but the adapter reads sampling only from modelOptions.options → silently dropped.

Also (not inline-able, examples aren't in the diff): ts-code-mode-web (api.product-codemode.ts:271, api.judge.ts:56, api.banking-demo.ts:237) and ts-react-chat/api.structured-output.ts:409 still pass root maxTokens to chat() — now silently ignored.

tombeckenham · 2026-06-02T05:33:48Z

+): Record<string, unknown> {
+  switch (provider) {
+    case 'ollama':
+      return { num_predict: maxTokens, num_ctx: 32768 }


Flat num_predict/num_ctx are dropped: this result is passed as modelOptions (L818), but the Ollama adapter reads sampling only from modelOptions.options. Nest it:

return { options: { num_predict: maxTokens, num_ctx: 32768 } }

tombeckenham · 2026-06-02T05:33:48Z

+    let working: Record<string, unknown> = {
+      temperature: 0.3,
+      ...(options.modelOptions as Record<string, unknown> | undefined),
+    }


Regression for Ollama: temperature: 0.3 stays flat, but Ollama reads sampling only from modelOptions.options, so the default never reaches the wire (it did before this PR). OTel reads the flat value, so telemetry will report 0.3 while the request omits it. Nest temperature under options for the ollama name.
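
One way to place the default where each provider reads it is sketched below. This is not the PR's code: `withDefaultTemperature` is a hypothetical name, and matching on the literal provider name `'ollama'` is an assumption for illustration.

```typescript
// Sketch only; names are hypothetical.
type ModelOptions = Record<string, unknown>

const DEFAULT_TEMPERATURE = 0.3

function isRecord(value: unknown): value is ModelOptions {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

function withDefaultTemperature(
  providerName: string,
  modelOptions: ModelOptions = {},
): ModelOptions {
  if (providerName === 'ollama') {
    // Ollama reads sampling only from the nested `options` object.
    const nested = isRecord(modelOptions.options) ? modelOptions.options : {}
    // Default first, caller's nested options last so they win.
    return {
      ...modelOptions,
      options: { temperature: DEFAULT_TEMPERATURE, ...nested },
    }
  }
  // Every other provider reads a flat `temperature`.
  return { temperature: DEFAULT_TEMPERATURE, ...modelOptions }
}
```

Because the default is written first and the caller's values are spread after it, an explicit caller temperature still overrides 0.3 in both shapes.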

tombeckenham · 2026-06-02T05:33:48Z

+ * provider reads — no adapter reads a generic `maxTokens`. A value of `null`
+ * marks a nested shape (handled specially below for Ollama).


Inaccurate: the map is Record<string, string> with no null and no ollama key — Ollama is a hardcoded branch in applyMaxLength. Drop the null sentence.

tombeckenham · 2026-06-02T05:33:48Z

+] as const
+
+/**
+ * Resolve `maxLength` to the provider-native max-output-tokens key for the


"the text adapter's name" → it's this summarize adapter's own name (constructor arg, defaults to 'chat-stream-summarize'), independent of the wrapped text adapter.

tombeckenham · 2026-06-02T05:33:48Z

+  }
+
+  const key = MAX_TOKENS_KEY_BY_ADAPTER[adapterName]
+  if (key === undefined) return merged


Silent no-op: any name not in the map (new provider, the default 'chat-stream-summarize', or the new openaiCompatible adapter's custom names) drops maxLength with no signal. At least logger.warn here; better, type adapter name as a literal union so the map must be exhaustive.
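
A sketch of the warn-instead-of-drop behaviour asked for here. The key names follow the PR's provider table, but the function shape, the injected `warn` callback, the adapter-name strings, and the rule that a caller-set value wins are all assumptions made for the example.

```typescript
// Sketch only; not the PR's implementation.
const MAX_TOKENS_KEY_BY_ADAPTER = new Map<string, string>([
  ['openai', 'max_output_tokens'],
  ['anthropic', 'max_tokens'],
  ['gemini', 'maxOutputTokens'],
  ['grok', 'max_tokens'],
  ['groq', 'max_completion_tokens'],
])

function applyMaxLength(
  adapterName: string,
  modelOptions: Record<string, unknown>,
  maxLength: number,
  warn: (message: string) => void = (message) => console.warn(message),
): Record<string, unknown> {
  if (adapterName === 'ollama') {
    // Ollama is a special case: the cap lives under the nested `options`.
    const nested = modelOptions.options
    const options = typeof nested === 'object' && nested !== null ? nested : {}
    return { ...modelOptions, options: { num_predict: maxLength, ...options } }
  }

  const key = MAX_TOKENS_KEY_BY_ADAPTER.get(adapterName)
  if (key === undefined) {
    // Unknown adapter name: surface the drop instead of ignoring it silently.
    warn(`summarize: maxLength ignored for unrecognised adapter "${adapterName}"`)
    return modelOptions
  }

  // Assumption: a caller-set value for the same key takes precedence.
  return { [key]: maxLength, ...modelOptions }
}
```

Passing `warn` as a parameter keeps the sketch independent of any particular logger.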

tombeckenham · 2026-06-02T05:33:48Z

 }

+/**
+ * Return the first candidate that is a finite `number`, or `undefined`. Used to


Says "finite" but the check is only typeof === 'number', so NaN/Infinity pass. Either guard with Number.isFinite or drop "finite".
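
For reference, a version that matches the "finite" wording might look like the following. The variadic signature is an assumption; only the `typeof` check and the `Number.isFinite` guard come from this thread.

```typescript
/**
 * Return the first candidate that is a finite number, or undefined.
 * Sketch only: the real helper's signature may differ.
 */
function firstNumber(...candidates: Array<unknown>): number | undefined {
  for (const candidate of candidates) {
    // typeof alone would let NaN and Infinity through.
    if (typeof candidate === 'number' && Number.isFinite(candidate)) {
      return candidate
    }
  }
  return undefined
}
```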

tombeckenham · 2026-06-02T05:33:48Z

+): { reports: Array<string> } {
+  const expected = read(`${name}.output.${ext}`)
+  const { output, reports } = runTransform(name, ext)
+  expect(normalize(output)).toBe(normalize(expected))


This suite is red locally (9/12): recast prints ;/no trailing commas, fixtures are Prettier-style, and normalize() only trims. Transform output is semantically correct — run it through Prettier in the harness (or regenerate fixtures), and wire this into test:pr (it currently isn't, since codemods has no test:lib target).

tombeckenham · 2026-06-02T05:33:48Z

@@ -0,0 +1,31 @@
+---
+'@tanstack/ai': minor


Please note the removed public export OllamaTextProviderOptions (deleted from ai-ollama/src/index.ts) — it's a breaking API-surface change; mention the migration to the per-model type / SDK ChatRequest.

Blocking fixes (codemod CI + Ollama silent drops):
- codemod: Prettier-normalize the transform test harness so recast's print style no longer diverges from the Prettier-formatted fixtures (20/20 green), and add a `test:lib` script to the codemods package so `nx affected` (and thus `test:pr`) actually runs the suite instead of skipping it.
- summarize: place the default `temperature` where the wrapped provider reads it: nested under `options` for Ollama (a flat value was dropped at the wire while still surfacing in OTel). Honor caller-set flat token limits in the Ollama branch, and warn instead of silently dropping `maxLength` for an unrecognised adapter name.
- code-mode eval harness: nest Ollama `num_predict`/`num_ctx` under `options`.
- examples: route the generic `maxTokens` through provider-native `modelOptions` (shared `maxTokensModelOptions` helper for dynamically resolved adapters; native keys inline where the adapter is static).

Other review items:
- anthropic: `max_tokens ?? 1024` so an explicit `0` reaches validation instead of being coerced to the default.
- openrouter (chat): forward root `metadata` like the responses adapter (+ test).
- otel: `firstNumber` now requires `Number.isFinite` (rejects NaN/Infinity).
- logger: add `InternalLogger.warn`, gated by the `errors` category so `debug: false` still silences it.
- docs: fix the migration example to put sampling under `modelOptions`; drop leaked trailing tags from the sampling guide; note the removed `OllamaTextProviderOptions` export in the changeset; split the codemod README's OpenAI row into Responses vs Chat Completions.
- codemod: emit ES6 shorthand when a moved value matches its key; drop an unsafe `as RootSamplingKey` cast.

AlemTuzlak · 2026-06-02T19:30:23Z

Thanks for the thorough review @tombeckenham + CodeRabbit. Addressed in acb3371.

Blockers

Codemod suite red + escaping CI. The 9/12 failures were purely recast-vs-Prettier print style (semicolons, trailing commas) — output was semantically correct. The harness now Prettier-normalizes both sides before comparing (transform.test.ts), so it's robust to recast's printer; 20/20 green. And the root cause of it escaping CI: the codemods project had no test:lib target, so nx affected --targets=…,test:lib,… skipped it. Added "test:lib": "vitest run" to codemods/package.json — verified it now runs under pnpm test:pr.
Ollama temperature: 0.3 dropped in summarize. Ollama reads sampling only from modelOptions.options, so the flat default was dropped at the wire while OTel reported it. The default now nests under options.temperature for Ollama (flat for everyone else). Regression test added that asserts both options.temperature and options.num_predict are nested (and that nothing flat leaks).
Ollama caps land flat in the eval harness (run-eval.ts) — now nested under options.

CodeRabbit items

migration.md "After" example: sampling moved into modelOptions (native keys); metadata stays at root.
sampling-options-to-model-options.md: removed the leaked </content></invoke> tags.
anthropic max_tokens ?? 1024 (so an explicit 0 reaches validateMaxTokens instead of being coerced).
openrouter chat adapter now forwards root metadata like the responses adapter (+ test).
otel firstNumber now requires Number.isFinite.
summarize: honor caller-set flat token limits in the Ollama branch; fixed the stale null/"text adapter's name" comments; warn instead of silently dropping maxLength for an unrecognised adapter name (new InternalLogger.warn, gated by the errors category so debug: false still silences it — names are legitimately open-ended, e.g. openaiCompatible, so a literal-union type isn't viable).
changeset: documented the removed OllamaTextProviderOptions export (→ OllamaChatModelOptionsByName / SDK ChatRequest).
codemod README: split the OpenAI row into Responses vs Chat Completions.
codemod transform: drop the unsafe as RootSamplingKey cast (Set<string>); emit ES6 shorthand when a moved value matches its key.
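
The `max_tokens ?? 1024` item in the list above hinges on the difference between `||` and `??`. A small illustration follows; the function names are made up for the example, and `validateMaxTokens` itself is not reproduced.

```typescript
// Illustration of the operator difference only.

// `||` falls back on any falsy value, so an explicit 0 is replaced.
function withLogicalOr(max_tokens?: number): number {
  return max_tokens || 1024
}

// `??` falls back only on null/undefined, so an explicit 0 is preserved
// and can reach a later validation step.
function withNullish(max_tokens?: number): number {
  return max_tokens ?? 1024
}
```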

Examples (root `maxTokens` now silently ignored)

Fixed the ts-code-mode-web (api.product-codemode, api.judge, api.banking-demo, plus the execute-prompt/structured-output libs) and ts-react-chat/api.structured-output call sites — routed through provider-native modelOptions via a small maxTokensModelOptions(adapter, …) helper for the dynamically-resolved adapters, native keys inline where the adapter is static.

Deferred / not changed (with reasoning)

OpenAI reasoning-model temperature/top_p type split (CodeRabbit) — deferred to a follow-up; not a runtime regression (root temperature was always sendable pre-PR) and it's a ~30-model type-surface change orthogonal to this PR. Replied in-thread.
Ollama ResolveModelOptions fallback narrowing (CodeRabbit, outside-diff) — a type refinement on the arbitrary-model-string fallback; pre-existing design, risks per-model assignability. Worth a follow-up, out of scope here.
Move summarize-max-length.test.ts next to source (CodeRabbit nitpick) — skipped: packages/ai keeps all 43 tests in tests/ with zero colocated tests, so moving this one would break the package's actual convention.

Full pnpm test:pr green across 34 projects.

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

.changeset/sampling-options-to-model-options.md (1)
15-15: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Clarify OpenAI Chat Completions vs Responses modelOptions token key in the changeset.

The changeset entry only documents max_output_tokens under “OpenAI (Responses)” (line 15), but the OpenAI chat-completions adapter uses provider-native modelOptions wire names max_tokens / max_completion_tokens (and doesn’t read the root maxTokens). Update the doc to add a separate “OpenAI (Chat Completions)” entry (including both key variants) or explicitly scope max_output_tokens to Responses only.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.changeset/sampling-options-to-model-options.md at line 15, The changeset
currently lists OpenAI (Responses) modelOptions as `{ temperature, top_p,
max_output_tokens }` but doesn’t clarify that OpenAI chat-completions adapters
use provider-native keys; update the changeset to either (A) add a new "OpenAI
(Chat Completions)" entry listing `modelOptions: { temperature, top_p,
max_tokens, max_completion_tokens }` or (B) explicitly scope the existing line
to say `OpenAI (Responses): modelOptions: { temperature, top_p,
max_output_tokens }` and note that chat-completions uses
`max_tokens`/`max_completion_tokens` (and does not read root `maxTokens`),
referencing the `modelOptions`, `max_output_tokens`, `max_tokens`, and
`max_completion_tokens` names so readers know the exact keys to use.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/ai-openrouter/tests/openrouter-adapter.test.ts`:
- Around line 1499-1515: Add an assertion that runs the outbound serializer on
the outgoing payload so the test verifies metadata survives SDK serialization:
after extracting rawParams = mockSend.mock.calls[0][0] and chatRequest =
rawParams.chatRequest, call ChatRequest$outboundSchema.parse(...) (the same
pattern used in other tests) on the serialized payload and assert the parsed
result contains metadata { env: 'test' } to ensure metadata is not stripped
during serialization.

In `@packages/ai/src/logger/internal-logger.ts`:
- Around line 115-120: The warn method currently swallows warnings if the
user-provided logger lacks warn (this.logger.warn) — update warn in the class
(the warn(message: string, meta?: Record<string, unknown>) method that checks
this.categories.errors and builds prefixed) to guard for a missing warn function
and fall back to other available logging methods (for example this.logger.error,
this.logger.debug) and finally console.warn/console.error as the last resort;
perform feature-detection (typeof this.logger.warn === 'function') before
calling and only use the try/catch for unexpected failures while ensuring the
warning is never silently dropped.

---

Outside diff comments:
In @.changeset/sampling-options-to-model-options.md:
- Line 15: The changeset currently lists OpenAI (Responses) modelOptions as `{
temperature, top_p, max_output_tokens }` but doesn’t clarify that OpenAI
chat-completions adapters use provider-native keys; update the changeset to
either (A) add a new "OpenAI (Chat Completions)" entry listing `modelOptions: {
temperature, top_p, max_tokens, max_completion_tokens }` or (B) explicitly scope
the existing line to say `OpenAI (Responses): modelOptions: { temperature,
top_p, max_output_tokens }` and note that chat-completions uses
`max_tokens`/`max_completion_tokens` (and does not read root `maxTokens`),
referencing the `modelOptions`, `max_output_tokens`, `max_tokens`, and
`max_completion_tokens` names so readers know the exact keys to use.
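
The logger fallback requested in the inline comment on internal-logger.ts could be approached along these lines. This is a sketch under assumptions: the `LoggerLike` shape and `emitWarning` name are hypothetical, and the category gating that the real `InternalLogger.warn` performs is left out.

```typescript
// Sketch only; not the PR's InternalLogger.
type LogMethod = (message: string, meta?: Record<string, unknown>) => void

interface LoggerLike {
  warn?: LogMethod
  error?: LogMethod
  debug?: LogMethod
}

type Sink = 'warn' | 'error' | 'debug' | 'console'

function emitWarning(
  logger: LoggerLike,
  message: string,
  meta?: Record<string, unknown>,
): Sink {
  // Try the user's logger methods in order of preference.
  for (const level of ['warn', 'error', 'debug'] as const) {
    const method = logger[level]
    if (typeof method !== 'function') continue
    try {
      method.call(logger, message, meta)
      return level
    } catch {
      // A throwing method should not swallow the warning; try the next one.
    }
  }
  // Last resort so the warning is never dropped.
  console.warn(message, meta)
  return 'console'
}
```

Returning the sink that was used is only there to make the sketch easy to check; a real method would likely return nothing.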


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cd6d37d6-e98c-4ff7-9a4e-0b165fcab82c

📥 Commits

Reviewing files that changed from the base of the PR and between 4e8afb8 and acb3371.

⛔ Files ignored due to path filters (1)

pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

📒 Files selected for processing (24)

.changeset/sampling-options-to-model-options.md
codemods/move-sampling-to-model-options/README.md
codemods/move-sampling-to-model-options/__testfixtures__/anthropic-merge.output.ts
codemods/move-sampling-to-model-options/__testfixtures__/shorthand.output.ts
codemods/move-sampling-to-model-options/transform.test.ts
codemods/move-sampling-to-model-options/transform.ts
codemods/package.json
docs/migration/migration.md
docs/migration/sampling-options-to-model-options.md
examples/ts-code-mode-web/src/lib/create-execute-prompt-tool.ts
examples/ts-code-mode-web/src/lib/max-tokens-model-options.ts
examples/ts-code-mode-web/src/lib/structured-output.ts
examples/ts-code-mode-web/src/routes/_banking-demo/api.banking-demo.ts
examples/ts-code-mode-web/src/routes/_database-demo/api.judge.ts
examples/ts-code-mode-web/src/routes/_home/api.product-codemode.ts
examples/ts-react-chat/src/routes/api.structured-output.ts
packages/ai-anthropic/src/adapters/text.ts
packages/ai-code-mode/models-eval/run-eval.ts
packages/ai-openrouter/src/adapters/text.ts
packages/ai-openrouter/tests/openrouter-adapter.test.ts
packages/ai/src/activities/summarize/chat-stream-summarize.ts
packages/ai/src/logger/internal-logger.ts
packages/ai/src/middlewares/otel.ts
packages/ai/tests/summarize-max-length.test.ts

💤 Files with no reviewable changes (1)

docs/migration/sampling-options-to-model-options.md

✅ Files skipped from review due to trivial changes (1)

codemods/move-sampling-to-model-options/README.md

🚧 Files skipped from review as they are similar to previous changes (11)

codemods/move-sampling-to-model-options/__testfixtures__/shorthand.output.ts
packages/ai/tests/summarize-max-length.test.ts
docs/migration/migration.md
codemods/move-sampling-to-model-options/__testfixtures__/anthropic-merge.output.ts
codemods/move-sampling-to-model-options/transform.test.ts
packages/ai-code-mode/models-eval/run-eval.ts
packages/ai-openrouter/src/adapters/text.ts
packages/ai/src/middlewares/otel.ts
packages/ai/src/activities/summarize/chat-stream-summarize.ts
codemods/move-sampling-to-model-options/transform.ts
packages/ai-anthropic/src/adapters/text.ts

coderabbitai · 2026-06-02T19:38:44Z

+  it('forwards root metadata to the request (same as the responses adapter)', async () => {
+    setupMockSdkClient(minimalStreamChunks)
+    const adapter = createAdapter()
+
+    for await (const _ of chat({
+      adapter,
+      messages: [{ role: 'user', content: 'test' }],
+      // Root `metadata` is still part of the contract; it must not be dropped
+      // by the chat-completions request builder.
+      metadata: { env: 'test' },
+    })) {
+      // consume
+    }
+
+    const [rawParams] = mockSend.mock.calls[0]!
+    const params = rawParams.chatRequest
+    expect(params.metadata).toEqual({ env: 'test' })


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert the serialized payload too.

This only checks chatRequest before the SDK's outbound schema runs. A regression that still strips metadata during serialization would pass here, so this test should mirror the existing ChatRequest$outboundSchema.parse(...) assertion pattern as well.

Suggested addition

     const [rawParams] = mockSend.mock.calls[0]!
     const params = rawParams.chatRequest
     expect(params.metadata).toEqual({ env: 'test' })
+
+    const serialized = ChatRequest$outboundSchema.parse(params)
+    expect((serialized as { metadata?: unknown }).metadata).toEqual({
+      env: 'test',
+    })
   })

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

  it('forwards root metadata to the request (same as the responses adapter)', async () => {
    setupMockSdkClient(minimalStreamChunks)
    const adapter = createAdapter()

    for await (const _ of chat({
      adapter,
      messages: [{ role: 'user', content: 'test' }],
      // Root `metadata` is still part of the contract; it must not be dropped
      // by the chat-completions request builder.
      metadata: { env: 'test' },
    })) {
      // consume
    }

    const [rawParams] = mockSend.mock.calls[0]!
    const params = rawParams.chatRequest
    expect(params.metadata).toEqual({ env: 'test' })

    const serialized = ChatRequest$outboundSchema.parse(params)
    expect((serialized as { metadata?: unknown }).metadata).toEqual({
      env: 'test',
    })
  })

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/ai-openrouter/tests/openrouter-adapter.test.ts` around lines 1499 - 1515, Add an assertion that runs the outbound serializer on the outgoing payload so the test verifies metadata survives SDK serialization: after extracting rawParams = mockSend.mock.calls[0][0] and chatRequest = rawParams.chatRequest, call ChatRequest$outboundSchema.parse(...) (the same pattern used in other tests) on the serialized payload and assert the parsed result contains metadata { env: 'test' } to ensure metadata is not stripped during serialization.
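
The reason a pre-serialization assertion is not enough can be shown with a small self-contained sketch. Everything here is hypothetical: `makeOutboundParser` is a stand-in for an SDK outbound schema that copies only the keys it knows about, and it is not the OpenRouter SDK's actual `ChatRequest$outboundSchema`.

```typescript
// Hypothetical stand-in for an SDK outbound schema: copies only allowlisted keys.
type ChatRequestLike = {
  model: string
  metadata?: Record<string, string>
  [key: string]: unknown
}

function makeOutboundParser(allowedKeys: ReadonlyArray<string>) {
  return (request: ChatRequestLike): Record<string, unknown> => {
    const out: Record<string, unknown> = {}
    for (const key of allowedKeys) {
      if (key in request) out[key] = request[key]
    }
    return out
  }
}

const request: ChatRequestLike = {
  model: 'example-model',
  metadata: { env: 'test' },
}

// A parser that omits `metadata` from its allowlist silently strips it.
const lossyParse = makeOutboundParser(['model'])
// A parser that lists `metadata` lets the value through.
const faithfulParse = makeOutboundParser(['model', 'metadata'])

// The request object holds metadata in both cases, so asserting on it alone
// cannot tell the two parsers apart. Only the parsed output can.
const lossyMetadata = lossyParse(request).metadata
const faithfulMetadata = faithfulParse(request).metadata
```

An assertion on `request.metadata` passes with either parser; only an assertion on the parsed result catches the lossy one.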

coderabbitai · 2026-06-02T19:38:44Z

+  warn(message: string, meta?: Record<string, unknown>): void {
+    if (!this.categories.errors) return
+    const prefixed = `⚠️ [tanstack-ai:warn] ⚠️ ${message}`
+    try {
+      this.logger.warn(prefixed, meta)
+    } catch {


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fallback when the user logger does not implement warn.

This is a new runtime requirement on custom loggers. If a caller still provides the old debug/error shape, this.logger.warn(...) throws and the blanket catch turns the warning into a silent drop, which defeats the whole point of surfacing recoverable anomalies.

Suggested fix

   warn(message: string, meta?: Record<string, unknown>): void {
     if (!this.categories.errors) return
     const prefixed = `⚠️ [tanstack-ai:warn] ⚠️ ${message}`
     try {
-      this.logger.warn(prefixed, meta)
+      const warn = (this.logger as { warn?: typeof this.logger.error }).warn
+      if (typeof warn === 'function') {
+        warn.call(this.logger, prefixed, meta)
+      } else {
+        this.logger.error(prefixed, meta)
+      }
     } catch {
       // User-supplied logger threw; swallow so a broken logger never masks the
       // condition we were trying to surface.
     }
   }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

  warn(message: string, meta?: Record<string, unknown>): void {
    if (!this.categories.errors) return
    const prefixed = `⚠️ [tanstack-ai:warn] ⚠️ ${message}`
    try {
      const warn = (this.logger as { warn?: typeof this.logger.error }).warn
      if (typeof warn === 'function') {
        warn.call(this.logger, prefixed, meta)
      } else {
        this.logger.error(prefixed, meta)
      }
    } catch {

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/ai/src/logger/internal-logger.ts` around lines 115 - 120, The warn method currently swallows warnings if the user-provided logger lacks warn (this.logger.warn) — update warn in the class (the warn(message: string, meta?: Record<string, unknown>) method that checks this.categories.errors and builds prefixed) to guard for a missing warn function and fall back to other available logging methods (for example this.logger.error, this.logger.debug) and finally console.warn/console.error as the last resort; perform feature-detection (typeof this.logger.warn === 'function') before calling and only use the try/catch for unexpected failures while ensuring the warning is never silently dropped.
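
The fuller fallback chain described in the prompt above (try `warn`, then `error`, then `debug`, then the console) could be sketched as follows. This is a standalone illustration under stated assumptions: `PartialLogger` and `safeWarn` are hypothetical names, and the real `InternalLogger` in `@tanstack/ai` may be structured differently.

```typescript
// Hypothetical minimal logger shape; every method is optional on purpose.
type LogFn = (message: string, meta?: Record<string, unknown>) => void

interface PartialLogger {
  warn?: LogFn
  error?: LogFn
  debug?: LogFn
}

// Tries each available method in priority order and returns the name of the
// sink that accepted the message, so a warning is never silently dropped.
function safeWarn(
  logger: PartialLogger,
  message: string,
  meta?: Record<string, unknown>,
): string {
  const candidates: Array<[string, LogFn | undefined]> = [
    ['warn', logger.warn],
    ['error', logger.error],
    ['debug', logger.debug],
  ]
  for (const [name, fn] of candidates) {
    if (typeof fn !== 'function') continue
    try {
      fn.call(logger, message, meta)
      return name
    } catch {
      // This method threw; fall through to the next candidate.
    }
  }
  // Last resort: the global console.
  console.warn(message, meta)
  return 'console'
}
```

Returning the sink name is only a convenience for testing the sketch; a production method would likely return `void`.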

Extract the provider-native max-output-token key spellings into `utilities/sampling-keys.ts` so the summarize wrapper's caller-limit detection and the OTel middleware's sampling-attribute extraction read from one source and cannot drift. Also document the intentional non-native-spelling caveat in `applyMaxLength`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

tombeckenham

Made one small change to centralise a set of consts. All good to go

coderabbitai

🧹 Nitpick comments (1)

packages/ai/src/utilities/sampling-keys.ts (1)
1-14: 💤 Low value

Consider documenting the order priority logic.

The comment mentions that the OTel middleware "picks the first numeric spelling" (line 9), and line 15 describes the array as "ordered," but the header documentation doesn't explicitly state that order determines priority when multiple keys are present. Adding a sentence like "The order determines which key OTel reports when multiple token caps are set" would clarify the significance of the ordering.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/ai/src/utilities/sampling-keys.ts` around lines 1 - 14, Update the
header comment in packages/ai/src/utilities/sampling-keys.ts to explicitly state
that the array order defines priority when multiple provider-native token keys
are present: mention that the first key found is the one used by
middlewares/otel.ts for the gen_ai.request.max_tokens attribute and that
activities/summarize/chat-stream-summarize.ts relies on this ordering to detect
caller-supplied limits; also reference MAX_TOKENS_KEY_BY_ADAPTER to remind
maintainers to keep both lists in the same priority order.
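
As a rough illustration of why the ordering matters, the sketch below picks the first numeric spelling from an ordered list. The constant name `MAX_TOKEN_KEY_SPELLINGS`, the helper name, and the particular order shown are assumptions made for this example; the real list and its order live in `packages/ai/src/utilities/sampling-keys.ts`.

```typescript
// Hypothetical name and ordering; spellings are taken from the provider table
// in the PR summary. Ollama's nested options.num_predict is not covered here.
const MAX_TOKEN_KEY_SPELLINGS = [
  'max_output_tokens',
  'max_completion_tokens',
  'max_tokens',
  'maxOutputTokens',
  'maxCompletionTokens',
] as const

// Returns the first key (in list order) whose value is a number.
function firstNumericTokenCap(
  modelOptions: Record<string, unknown>,
): { key: string; value: number } | undefined {
  for (const key of MAX_TOKEN_KEY_SPELLINGS) {
    const value = modelOptions[key]
    if (typeof value === 'number') return { key, value }
  }
  return undefined
}

// When two spellings are present, list order decides which one is reported.
const reported = firstNumericTokenCap({ max_tokens: 100, max_output_tokens: 50 })
```

With the order assumed here, `reported` carries the `max_output_tokens` entry even though `max_tokens` appears first in the object literal, which is the behavior the header comment should spell out.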

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1d8794ad-0064-4d4d-9dc9-948c682cee38

📥 Commits

Reviewing files that changed from the base of the PR and between ca91230 and 4b15de3.

📒 Files selected for processing (3)

packages/ai/src/activities/summarize/chat-stream-summarize.ts
packages/ai/src/middlewares/otel.ts
packages/ai/src/utilities/sampling-keys.ts

🚧 Files skipped from review as they are similar to previous changes (2)

packages/ai/src/middlewares/otel.ts
packages/ai/src/activities/summarize/chat-stream-summarize.ts

After rebasing onto main (which merged #660 moving temperature/topP/maxTokens into modelOptions), fix the two spots where our edits still assumed top-level sampling:
- thinking-content.md: max_tokens now lives in modelOptions alongside the thinking budget (was a top-level maxTokens).
- anthropic.md: drop the stale "auto-raises top-level maxTokens" note; budget_tokens must be below modelOptions.max_tokens.
(ollama.md and typed-options.md conflicts were resolved to main's new convention during the rebase.)
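
A minimal sketch of the constraint mentioned above, that `budget_tokens` must stay below `modelOptions.max_tokens`. The `AnthropicThinkingOptions` type and the `isThinkingBudgetValid` helper are hypothetical and local to this example; they mirror the `{ type: 'enabled', budget_tokens }` form named in the docs commits rather than the adapter's real exported types.

```typescript
// Hypothetical local type mirroring the shape named in the commit message.
type AnthropicThinkingOptions = {
  max_tokens: number
  thinking?: { type: 'enabled'; budget_tokens: number }
}

// True when no thinking budget is set, or when it is strictly below max_tokens.
function isThinkingBudgetValid(options: AnthropicThinkingOptions): boolean {
  if (!options.thinking) return true
  return options.thinking.budget_tokens < options.max_tokens
}

// Both values now sit side by side inside modelOptions.
const thinkingModelOptions: AnthropicThinkingOptions = {
  max_tokens: 4096,
  thinking: { type: 'enabled', budget_tokens: 2048 },
}

const budgetOk = isThinkingBudgetValid(thinkingModelOptions)
```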

The "safe allowlist" example still forwarded temperature/maxTokens as top-level chat() options (missed by #660's sampling-into-modelOptions migration). Map them into modelOptions under OpenAI's native keys (temperature / max_output_tokens) so the example type-checks.
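
The mapping this commit describes could look roughly like the sketch below. The helper name `pickSamplingOptions` and the shape of `forwardedProps` are assumptions for illustration; only the target keys (`temperature`, `max_output_tokens`) come from the commit message.

```typescript
// Target shape uses OpenAI's native keys as named in the commit message.
type OpenAISamplingOptions = {
  temperature?: number
  max_output_tokens?: number
}

// Hypothetical allowlist helper: copies only known, well-typed sampling values
// out of untrusted client-supplied props and renames them to native keys.
function pickSamplingOptions(
  forwardedProps: Record<string, unknown>,
): OpenAISamplingOptions {
  const modelOptions: OpenAISamplingOptions = {}

  const temperature = forwardedProps['temperature']
  if (typeof temperature === 'number') {
    modelOptions.temperature = temperature
  }

  const maxTokens = forwardedProps['maxTokens']
  if (typeof maxTokens === 'number') {
    modelOptions.max_output_tokens = maxTokens
  }

  return modelOptions
}

// Anything outside the allowlist (here, `apiKey`) is dropped.
const safeModelOptions = pickSamplingOptions({
  temperature: 0.3,
  maxTokens: 256,
  apiKey: 'should-not-be-forwarded',
})
```

The result would then be passed as `modelOptions` to `chat()` instead of spreading the client values at the root.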

…691)

* docs: fix inaccurate code samples and expand coverage across guides

Audited all guide pages against the actual package APIs and fixed copy-paste-broken / outdated samples and filled coverage gaps.

Middleware & structured outputs:
- New Built-in Middleware page (toolCache, contentGuard, otel) + new top-level Middleware nav section; document structured-output chunk transforms via onChunk + ctx.phase; fix middleware type import paths.
- Document client consumption (useChat partial/final) on the one-shot page.

Correctness fixes (verified against packages/ source):
- chat: providerOptions -> modelOptions; invalid model ids; budget_tokens requires maxTokens; async stream() factory -> fetcher; add missing imports; document default maxIterations(5) and agentLoopStrategy.
- tools: toServerSentEventsStream -> toServerSentEventsResponse; remove duplicate tools key; clarify tool-call vs tool-result states; fix the React examples and state diagram; add emitCustomEvent / runtime-context.
- media: add required model args to factory calls; fix recursive generateVideo; TranscriptionResult.words is top-level; speed is a top-level speech option; Gemini audio returns b64Json; onResult transform.
- advanced: adapter.model (not selectedModel); GeminiImageMetadata; source.mimeType; text.format structured-output shape; fill How It Works; createModel capabilities form; soften unsubstantiated bundle figures.
- protocol: rewrite SSE / HTTP-stream pages to the AG-UI event format (drop obsolete chunk shapes and [DONE]); use toHttpResponse/toHttpStream; expand chunk-definitions with TOOL_CALL_RESULT, MESSAGES_SNAPSHOT, REASONING_* and deprecated-alias notes.
- adapters: elevenlabs SFX model + @elevenlabs/client; ollama modelOptions placement; cencori AG-UI event/tools alignment.
- fix @tanstack/ai-openai/adapters -> @tanstack/ai-openai (ag-ui-compliance, otel).

* docs: address CR-round findings (correctness + latest models)

A 7+1-agent confirmation review of the docs PR surfaced further source-accuracy issues (and caught one regression the first fix pass introduced). All verified against packages/ source:
- tools/server-tools: JSON-schema tool input is `unknown` (not `any`); samples now narrow/cast args.
- thinking-content: drop the adaptive-thinking / output_config.effort example — those option types are not wired into any model's typed modelOptions; document the `{ type: 'enabled', budget_tokens }` form.
- multimodal-content: correct the Anthropic modality bullets (no `claude-3*` ids; Claude Haiku 3 supports documents).
- comparison: fix the ImagePart (`source: { type:'url', url }`) and TextPart (`content`) shapes in the flagship example.
- chunk-definitions: RUN_STARTED/RUN_FINISHED `threadId` is required; add REASONING_MESSAGE_CHUNK to the internal-members note.
- media: createOpenaiVideo needs a model arg; video `seconds` is a string union; transcription `responseFormat`/`prompt` are top-level (not modelOptions); drop the non-existent gpt-4o-mini-audio-preview TTS model; add the Audio row to the generations table.
- advanced: typed-options gpt-image-1 size must be a GptImageSize.
- observability: aiEventClient imports from @tanstack/ai-event-client (the @tanstack/ai/event-client subpath does not exist).
- adapters: revert claude-haiku-3 -> claude-3-haiku (the id passed to anthropicText); clarify max_tokens auto-adjust; @elevenlabs/client (not @11labs/client); elevenlabs agentId optional, debug is DebugOption.
- structured-outputs: Standard JSON Schema /json-schema link; Zod v4.2+.

Model ids touched in these fixes use the latest per provider from model-meta.ts (gpt-5.5, claude-sonnet-4-6, etc.).

* docs: use latest per-provider models in examples

Sweep example model ids across the PR's docs to the latest available per provider, sourced from each adapter's model-meta.ts:
- OpenAI: gpt-5.2 -> gpt-5.5, gpt-5-mini -> gpt-5.4-mini
- Anthropic: claude-sonnet-4-5 -> claude-sonnet-4-6, claude-opus-4-6 -> claude-opus-4.8
- Gemini: gemini-2.0-flash -> gemini-3-flash-preview, image -> gemini-3.1-flash-image-preview, tts -> gemini-3.1-flash-tts-preview

Every replacement id was verified present in model-meta.ts. Intentional cases preserved: negative/capability-contrast examples (per-model-type-safety), the claude-3-haiku web_search note, model enumeration/availability tables, DALL-E and o-series demos, and the Cencori pass-through ids (external provider, no in-repo model-meta).

* docs: reconcile thinking examples with modelOptions sampling convention

After rebasing onto main (which merged #660 moving temperature/topP/maxTokens into modelOptions), fix the two spots where our edits still assumed top-level sampling:
- thinking-content.md: max_tokens now lives in modelOptions alongside the thinking budget (was a top-level maxTokens).
- anthropic.md: drop the stale "auto-raises top-level maxTokens" note; budget_tokens must be below modelOptions.max_tokens.
(ollama.md and typed-options.md conflicts were resolved to main's new convention during the rebase.)

* docs: migrate ag-ui-compliance forwardedProps allowlist to modelOptions

The "safe allowlist" example still forwarded temperature/maxTokens as top-level chat() options (missed by #660's sampling-into-modelOptions migration). Map them into modelOptions under OpenAI's native keys (temperature / max_output_tokens) so the example type-checks.

* docs: remove deprecated observability + protocol pages, drop all casts

- Remove the deprecated Observability page (event-client observability is superseded; otelMiddleware is the supported path) and its nav entry + inbound links.
- Remove the protocol pages (chunk-definitions, sse-protocol, http-stream-protocol) — TanStack AI implements AG-UI, whose protocol is documented upstream; repoint the few inbound links to docs.ag-ui.com.
- Fix the broken ToolCacheStorage snippet (it imported the type then re-declared it) and verify the shape against source.
- Remove every `as <Type>` assertion cast from the docs (JSON-schema tool inputs, JSON.parse, formData, custom-event values, type brands, …), replacing them with typeof/in guards, type guards, typed annotations, or schema validation. `createModel`'s provider-option brand now uses a typed const instead of `{} as X`.
- CLAUDE.md / AGENTS.md: codify the docs conventions — no `as` casts in samples, use the latest model per provider from model-meta.ts, and show both server and client sides when a doc spans both.

* docs: latest model in built-in-middleware + concrete structured-output transform

- built-in-middleware.md: gpt-4o -> gpt-5.5 in the examples.
- middleware.md: make the "Transforming structured-output chunks" example self-contained — redact SSNs inline in the streaming JSON delta instead of calling an undefined `redact()` helper.

(The docs conventions — no casts, latest models, show both sides — already live in the project CLAUDE.md / AGENTS.md; the earlier global-CLAUDE.md addition has been reverted.)

AlemTuzlak added 17 commits May 30, 2026 13:40

refactor(ai-openai): read sampling options from modelOptions

356d2d2

refactor(openai-base): read sampling options from modelOptions in chat-completions base

38a6211

refactor(ai-anthropic): read sampling options from modelOptions, drop cast

81738b5

fix(ai-anthropic): exempt max_tokens from dropped-key warning

c44b3e0

refactor(ai-gemini): read sampling options from modelOptions

55a26ce

fix(ai-ollama): read sampling from nested modelOptions.options, drop cast and flat root reads

e4495e1

refactor(ai-openrouter): read sampling options from modelOptions, drop cast

9c70ac3

refactor(ai): remove root sampling options; modelOptions is the sole sampling surface

93240a3

fix(ai): preserve summarize maxLength per-provider + fix otel sampling attribute spellings

014e104

refactor(ai-openrouter): read sampling from modelOptions in responses adapter

f91990c

refactor(ai-gemini): read sampling from modelOptions in text-interactions adapter

d9e1f8a

test: migrate remaining root sampling usages to modelOptions

1240e32

feat(codemods): add move-sampling-to-model-options codemod

3734f55

docs: document sampling options under modelOptions + migration guide

96ce4c6

docs(skills): sampling options now live in modelOptions

ec88a7e

chore: changeset for sampling-options-to-modelOptions move

0886b1b

docs: correct sampling migration framing to breaking change

fd0ad2c

AlemTuzlak mentioned this pull request May 30, 2026

chore(adapters): move common options to model options #499

Closed

ci: apply automated fixes

4e8afb8

AlemTuzlak requested a review from tombeckenham May 30, 2026 14:32

coderabbitai Bot reviewed May 30, 2026

tombeckenham requested changes Jun 2, 2026

tombeckenham reviewed Jun 2, 2026

coderabbitai Bot reviewed Jun 2, 2026

AlemTuzlak and others added 2 commits June 2, 2026 21:53

Merge remote-tracking branch 'origin/main' into feat/sampling-options-to-modeloptions

ca91230

# Conflicts:
#	examples/ts-code-mode-web/src/lib/structured-output.ts

tombeckenham approved these changes Jun 3, 2026

coderabbitai Bot reviewed Jun 3, 2026

AlemTuzlak merged commit 6df32b5 into main Jun 3, 2026
10 checks passed

AlemTuzlak deleted the feat/sampling-options-to-modeloptions branch June 3, 2026 08:46

github-actions Bot mentioned this pull request Jun 3, 2026

ci: Version Packages #690

Merged

		* provider reads — no adapter reads a generic `maxTokens`. A value of `null`
		* marks a nested shape (handled specially below for Ollama).
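
The comment fragment above appears to belong to the per-adapter token-key map used by `summarize({ maxLength })`. A simplified sketch of how such a map might be applied follows. The key spellings come from the provider table in the PR summary; the adapter-name strings, the `withMaxLength` helper, and the decision to overwrite any existing cap are assumptions made for this example, and the real `applyMaxLength` also performs caller-limit detection that is omitted here.

```typescript
// `null` marks a nested shape (Ollama keeps its cap under options.num_predict).
// The adapter-name strings are assumptions made for this sketch.
const MAX_TOKENS_KEY_BY_ADAPTER: Readonly<Partial<Record<string, string | null>>> = {
  openai: 'max_output_tokens',
  anthropic: 'max_tokens',
  gemini: 'maxOutputTokens',
  grok: 'max_tokens',
  groq: 'max_completion_tokens',
  ollama: null,
}

// Hypothetical helper: writes `maxLength` under the provider's native key.
function withMaxLength(
  adapterName: string,
  modelOptions: Record<string, unknown>,
  maxLength: number,
): Record<string, unknown> {
  const key = MAX_TOKENS_KEY_BY_ADAPTER[adapterName]
  // Unknown adapter: leave the options untouched rather than guess a key.
  if (key === undefined) return modelOptions
  if (key === null) {
    // Nested shape: merge into the existing `options` object if there is one.
    const nested = modelOptions['options']
    const options = typeof nested === 'object' && nested !== null ? nested : {}
    return { ...modelOptions, options: { ...options, num_predict: maxLength } }
  }
  return { ...modelOptions, [key]: maxLength }
}
```

For example, `withMaxLength('anthropic', { temperature: 0.1 }, 300)` yields an object with `max_tokens: 300`, while the `'ollama'` case places the value under `options.num_predict`.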
