feat: move sampling options (temperature/topP/maxTokens) into modelOptions#660
…t-completions base
…cast and flat root reads
…g attribute spellings
📝 Walkthrough
This PR removes root-level sampling options (temperature, topP, maxTokens) and consolidates sampling and token-limit configuration into provider-native modelOptions across adapters, middleware, docs, examples, and tests. It supplies a jscodeshift codemod, a migration guide, and extensive test coverage.
Changes
Sampling Options Migration to modelOptions
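To make the migration concrete, here is a minimal sketch of how the three root options could map onto provider-native keys. The `RootSampling` and `Provider` types and the `toModelOptions` helper are hypothetical and exist only for illustration; they are not the codemod's implementation or the actual @tanstack/ai types. The key names come from this review (`max_output_tokens` for OpenAI Responses, `max_tokens` for Chat Completions and Anthropic, `options.num_predict` for Ollama).

```typescript
// Hypothetical shapes for illustration only; not the real @tanstack/ai types.
type RootSampling = { temperature?: number; topP?: number; maxTokens?: number }

type Provider =
  | 'openai-responses'
  | 'openai-chat-completions'
  | 'anthropic'
  | 'ollama'

// Provider-native token-limit key, as named in this review.
const TOKEN_LIMIT_KEY: Record<Exclude<Provider, 'ollama'>, string> = {
  'openai-responses': 'max_output_tokens',
  'openai-chat-completions': 'max_tokens',
  anthropic: 'max_tokens',
}

// Translate the removed root-level options into a modelOptions object.
function toModelOptions(
  provider: Provider,
  root: RootSampling,
): Record<string, unknown> {
  const flat: Record<string, unknown> = {}
  if (root.temperature !== undefined) flat.temperature = root.temperature
  if (root.topP !== undefined) flat.top_p = root.topP

  if (provider === 'ollama') {
    // Ollama nests sampling parameters under modelOptions.options.
    if (root.maxTokens !== undefined) flat.num_predict = root.maxTokens
    return { options: flat }
  }

  if (root.maxTokens !== undefined) {
    flat[TOKEN_LIMIT_KEY[provider]] = root.maxTokens
  }
  return flat
}
```

The sketch treats Ollama as the only provider with a nested shape, which matches the `ollama-nested` fixture name but is otherwise an assumption about the other adapters.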
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚀 Changeset Version Preview
9 package(s) bumped directly, 21 bumped as dependents.
🟥 Major bumps
🟨 Minor bumps
🟩 Patch bumps
View your CI Pipeline Execution ↗ for commit 4b15de3
@tanstack/ai
@tanstack/ai-anthropic
@tanstack/ai-client
@tanstack/ai-code-mode
@tanstack/ai-code-mode-skills
@tanstack/ai-devtools-core
@tanstack/ai-elevenlabs
@tanstack/ai-event-client
@tanstack/ai-fal
@tanstack/ai-gemini
@tanstack/ai-grok
@tanstack/ai-groq
@tanstack/ai-isolate-cloudflare
@tanstack/ai-isolate-node
@tanstack/ai-isolate-quickjs
@tanstack/ai-ollama
@tanstack/ai-openai
@tanstack/ai-openrouter
@tanstack/ai-preact
@tanstack/ai-react
@tanstack/ai-react-ui
@tanstack/ai-solid
@tanstack/ai-solid-ui
@tanstack/ai-svelte
@tanstack/ai-utils
@tanstack/ai-vue
@tanstack/ai-vue-ui
@tanstack/openai-base
@tanstack/preact-ai-devtools
@tanstack/react-ai-devtools
@tanstack/solid-ai-devtools
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/ai-ollama/src/adapters/text.ts (1)
36-39: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Narrow the fallback `modelOptions` type to the subset this adapter actually forwards.
For arbitrary model strings, `ResolveModelOptions` falls back to the full `ollama` `ChatRequest`, but `mapCommonOptionsToOllama()` only reads `modelOptions.options` (and then `format`, `keep_alive`, `logprobs`, `top_logprobs`, and `think` when present). It sources `model`/`messages` from `options.model`/`options.messages` and converts `tools` only from `options.tools`, so request-level keys like `model`, `messages`, `stream`, and `tools` typed via the fallback can be silently ignored at runtime.
    type ResolveModelOptions<TModel extends string> =
      TModel extends keyof OllamaChatModelOptionsByName
        ? OllamaChatModelOptionsByName[TModel]
        : ChatRequest
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/ai-ollama/src/adapters/text.ts` around lines 36 - 39, The fallback for ResolveModelOptions is too broad (falls back to ChatRequest) even though mapCommonOptionsToOllama only forwards specific fields; change the fallback to a narrowed type that only includes the fields actually forwarded (e.g., an object with an optional options property limited to Pick<ChatRequest['options'], 'format'|'keep_alive'|'logprobs'|'top_logprobs'|'think'> and an optional tools property of ChatRequest['tools']) so ResolveModelOptions<TModel> returns either OllamaChatModelOptionsByName[TModel] or that minimal subset; update the type alias ResolveModelOptions accordingly so mapCommonOptionsToOllama and related code use the correct, narrower modelOptions shape.
🧹 Nitpick comments (4)
codemods/move-sampling-to-model-options/README.md (1)
18-23: ⚡ Quick win
Clarify OpenAI adapter split in the mapping table.
The table currently presents a single OpenAI mapping to `max_output_tokens`, which can confuse `openaiChatCompletions` users (they need `max_tokens`). Please add an explicit note or split rows for OpenAI Responses vs Chat Completions.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@codemods/move-sampling-to-model-options/README.md` around lines 18 - 23, The mapping table currently lists a single OpenAI entry mapping `maxTokens` to `max_output_tokens`, which is ambiguous for users of `openaiChatCompletions`; update the table to split the OpenAI row into two rows (e.g., "openai (Responses)" and "openai (Chat Completions)") and set `maxTokens` -> `max_output_tokens` for the Responses row and `maxTokens` -> `max_tokens` for the Chat Completions row; mention the `openaiChatCompletions` identifier in the note so readers know which row to use.
packages/ai/tests/summarize-max-length.test.ts (1)
1-139: ⚡ Quick win
Place this unit test alongside the summarize source file.
This new suite lives under `packages/ai/tests/`, but the repo convention is to colocate `*.test.ts` files with the source they cover. Moving it next to `packages/ai/src/activities/summarize/chat-stream-summarize.ts` will keep the summarize contract and its regression coverage together.
As per coding guidelines, "Place unit tests in *.test.ts files alongside source files".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/ai/tests/summarize-max-length.test.ts` around lines 1 - 139, Move the test file packages/ai/tests/summarize-max-length.test.ts next to the implementation file chat-stream-summarize.ts so the test is colocated with its source; update imports in summarize-max-length.test.ts to reference the local module path (adjust any ../src/... imports) and keep references to ChatStreamSummarizeAdapter, createRecordingTextAdapter, and the test helpers (resolveDebugOption, ev) intact so the suite continues to import the same symbols from the colocated files. Ensure the new location preserves the same filename and that any path changes are minimal and correct for the package module resolution.
codemods/move-sampling-to-model-options/transform.ts (1)
341-356: 💤 Low value
Consider removing the type cast for clarity.
Line 351 casts `key.name as RootSamplingKey`, but `key.name` could be any string. While safe at runtime (`Set.has()` returns false for non-members), the cast is technically incorrect.
♻️ Clearer type-safe alternative
      const key = (prop as Property).key
      if (
        key.type === 'Identifier' &&
    -   movedSet.has(key.name as RootSamplingKey)
    +   ROOT_SAMPLING_KEYS.includes(key.name as RootSamplingKey) &&
    +   movedSet.has(key.name as RootSamplingKey)
      ) {
        return false
Or use a type guard:
    + const isRootSamplingKey = (name: string): name is RootSamplingKey =>
    +   ROOT_SAMPLING_KEYS.includes(name as RootSamplingKey)
    + ...
      if (
        key.type === 'Identifier' &&
    -   movedSet.has(key.name as RootSamplingKey)
    +   isRootSamplingKey(key.name) &&
    +   movedSet.has(key.name)
      ) {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@codemods/move-sampling-to-model-options/transform.ts` around lines 341 - 356, The cast key.name as RootSamplingKey is unnecessary/unsafe; instead make the set string-typed and check the identifier name directly: change movedSet's type from Set<RootSamplingKey> to Set<string> when creating it (const movedSet = new Set<string>(presentKeys)) and inside the obj.properties filter, after confirming key.type === 'Identifier', use const name = key.name and call movedSet.has(name) (no cast) to decide removal.
codemods/move-sampling-to-model-options/__testfixtures__/shorthand.output.ts (1)
14-16: 💤 Low value
Consider using shorthand property syntax.
The codemod could be enhanced to preserve or use ES6 shorthand syntax when the property name matches the identifier name. Line 15 uses `temperature: temperature,` but modern JavaScript/TypeScript convention prefers the shorthand form `temperature,` for better readability.
Both forms correctly reference the identifier rather than inlining the literal value, so the functional requirement is met.
♻️ More idiomatic shorthand syntax
      modelOptions: {
    -   temperature: temperature,
    +   temperature,
      },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@codemods/move-sampling-to-model-options/__testfixtures__/shorthand.output.ts` around lines 14 - 16, The object inside modelOptions uses verbose property syntax "temperature: temperature,"—update the codemod transform that emits the modelOptions object so when a property key equals its identifier (e.g., temperature) it emits the ES6 shorthand (temperature,) instead; locate the code that constructs or prints the modelOptions object (the logic producing the "modelOptions" node and its properties) and change it to detect identical key+identifier pairs and output the shorthand property form.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/migration/migration.md`:
- Around line 200-201: The "Complete Migration Example" still places sampling
props (temperature, maxTokens) at the chat() root even though the guide states
they were moved into provider-native modelOptions; update the "After" snippet(s)
referenced around the "Complete Migration Example" so that temperature and
maxTokens are removed from the chat() root and instead placed inside
modelOptions using provider-native keys (e.g., provider-specific names) while
keeping metadata at the root; ensure both affected blocks (around lines 461–468
and the final After snippet) reflect modelOptions usage for sampling.
In `@docs/migration/sampling-options-to-model-options.md`:
- Around line 210-211: The file ends with leaked XML/HTML tags that break
rendering; remove the accidental trailing tags "</content>" and "</invoke>" from
the end of docs/migration/sampling-options-to-model-options.md so the document
ends cleanly with the previous markdown content (ensure there are no other stray
markup fragments left).
In `@packages/ai-anthropic/src/adapters/text.ts`:
- Line 365: The code uses `modelOptions?.max_tokens || 1024` which treats
explicit 0 as falsy and overrides it; change this to use the nullish coalescing
operator so explicit 0 (or other falsy but defined values) are preserved:
replace that expression with `modelOptions?.max_tokens ?? 1024` (or an explicit
undefined check) so `validateTextProviderOptions()` / `validateMaxTokens()` can
correctly detect and reject invalid zero values.
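The difference between the two operators can be shown with a small sketch. The `Opts` type and the `assertPositiveMaxTokens` stand-in validator are hypothetical; only the `|| 1024` versus `?? 1024` behaviour is taken from the comment above.

```typescript
// Hypothetical option shape; only max_tokens matters for this illustration.
type Opts = { max_tokens?: number }

// `||` replaces every falsy value, including an explicit 0.
const withOr = (o?: Opts): number => o?.max_tokens || 1024

// `??` replaces only null and undefined, so an explicit 0 survives.
const withNullish = (o?: Opts): number => o?.max_tokens ?? 1024

// Stand-in validator (an assumption, not the adapter's real one):
// it can only reject 0 if the fallback has not already hidden it.
function assertPositiveMaxTokens(n: number): void {
  if (n < 1) throw new Error(`max_tokens must be >= 1, got ${n}`)
}
```

With `withOr`, a caller passing `max_tokens: 0` silently gets 1024 and the validator never sees the bad value; with `withNullish`, the 0 reaches the validator and is rejected.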
In `@packages/ai-code-mode/models-eval/run-eval.ts`:
- Around line 224-225: The Ollama branch currently returns sampling params at
the top level (case 'ollama'), but the adapter expects them nested under
modelOptions.options; modify the 'ollama' case to return an object with an
options property containing num_predict: maxTokens and num_ctx: 32768 (i.e.,
move num_predict and num_ctx inside options) so the sampling parameters are
forwarded correctly to the Ollama adapter.
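A minimal sketch of the nested shape this comment asks for. The `OllamaModelOptions` type and the `ollamaModelOptions` helper are hypothetical names used for illustration; the actual `run-eval.ts` switch is not shown in this thread.

```typescript
// Hypothetical type: sampling parameters live under `options`, not at the top level.
type OllamaModelOptions = {
  options: { num_predict: number; num_ctx: number }
}

// Build the modelOptions value for the 'ollama' branch.
function ollamaModelOptions(maxTokens: number): OllamaModelOptions {
  return {
    options: {
      num_predict: maxTokens, // token limit
      num_ctx: 32768, // context window size from the review comment
    },
  }
}
```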
In `@packages/ai-openai/src/text/text-provider-options.ts`:
- Around line 17-38: OpenAIBaseOptions currently inherits temperature/top_p via
OpenAISamplingOptions which makes those fields valid for reasoning models;
remove temperature and top_p from the shared base by stopping OpenAIBaseOptions
from extending OpenAISamplingOptions (keep max_output_tokens in the base),
create a new OpenAISamplingOptions (or NonReasoningSamplingOptions) interface
that includes temperature/top_p and apply that only to non-reasoning model
option types in model-meta.ts (e.g., O1/O3 should not use OpenAIBaseOptions with
sampling fields); additionally ensure adapters/responses-text.ts
mapOptionsToRequest() does not blindly spread modelOptions into
ResponseCreateParams for reasoning models—either filter out temperature/top_p
there or only pass the sampling interface for non-reasoning models so the client
never sends temperature/top_p to the Responses API for reasoning models.
In `@packages/ai-openrouter/src/adapters/text.ts`:
- Around line 1148-1153: The request builder currently spreads only
restModelOptions into the ChatRequest (const request) so a root-level metadata
passed to chat({ metadata }) is dropped; update the construction of request in
the text adapter (the const request: Omit<ChatRequest, 'stream'> block) to also
include the root metadata (e.g., preserve options.metadata or a top-level
metadata parameter) alongside model, messages and tools so the ChatRequest
forwarded to OpenRouter contains metadata the same way responses adapter does.
In `@packages/ai/src/activities/summarize/chat-stream-summarize.ts`:
- Around line 80-87: In the Ollama branch (the adapterName === 'ollama' block)
ensure we don’t override caller-set token limits: before injecting
merged.options = { num_predict: maxLength, ...existing } check if existing
already contains any recognized flat token-limit keys (e.g., num_predict,
max_tokens, max_length, max_output_tokens) and if so return merged unchanged;
only set num_predict when none of those keys are present. Update the guard
around merged.options / existing and the assignment in that block to honor those
keys instead of always injecting num_predict.
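One way to read this guard is sketched below. The `OllamaLikeOptions` type and the `withDefaultNumPredict` helper are hypothetical, and checking both the root object and the nested `options` object is an assumption, since the comment does not say at which level the "flat" keys are expected.

```typescript
// Token-limit keys listed in the review comment.
const TOKEN_LIMIT_KEYS = [
  'num_predict',
  'max_tokens',
  'max_length',
  'max_output_tokens',
] as const

// Hypothetical modelOptions shape for the Ollama branch.
type OllamaLikeOptions = {
  options?: Record<string, unknown>
  [key: string]: unknown
}

const hasTokenLimit = (o: Record<string, unknown>): boolean =>
  TOKEN_LIMIT_KEYS.some((k) => o[k] !== undefined)

// Inject num_predict only when the caller has not set any token limit.
function withDefaultNumPredict(
  modelOptions: OllamaLikeOptions,
  maxLength: number,
): OllamaLikeOptions {
  const existing = modelOptions.options ?? {}
  if (hasTokenLimit(existing) || hasTokenLimit(modelOptions)) {
    return modelOptions // caller's limit wins; leave untouched
  }
  return { ...modelOptions, options: { num_predict: maxLength, ...existing } }
}
```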
In `@packages/ai/src/middlewares/otel.ts`:
- Around line 169-172: The helper firstNumber currently returns any value of
typeof 'number' (including NaN and Infinity); update its check in the
firstNumber function so it only returns numbers that are finite by using
Number.isFinite(candidate) (i.e., return candidate only if typeof candidate ===
'number' && Number.isFinite(candidate)), leaving the rest of the control flow
unchanged so invalid numeric values are filtered out and undefined is returned
when no finite number is found.
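The finite-number check described in the last item could look like the sketch below. The variadic signature is an assumption; the real `firstNumber` helper in `otel.ts` is not shown in this thread.

```typescript
// Return the first candidate that is a finite number, or undefined.
// NaN, Infinity, and -Infinity all have typeof 'number' but are filtered out.
function firstNumber(...candidates: unknown[]): number | undefined {
  for (const candidate of candidates) {
    if (typeof candidate === 'number' && Number.isFinite(candidate)) {
      return candidate
    }
  }
  return undefined
}
```

Note that 0 is still returned, since it is finite; only non-finite values are skipped.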
---
Outside diff comments:
In `@packages/ai-ollama/src/adapters/text.ts`:
- Around line 36-39: The fallback for ResolveModelOptions is too broad (falls
back to ChatRequest) even though mapCommonOptionsToOllama only forwards specific
fields; change the fallback to a narrowed type that only includes the fields
actually forwarded (e.g., an object with an optional options property limited to
Pick<ChatRequest['options'],
'format'|'keep_alive'|'logprobs'|'top_logprobs'|'think'> and an optional tools
property of ChatRequest['tools']) so ResolveModelOptions<TModel> returns either
OllamaChatModelOptionsByName[TModel] or that minimal subset; update the type
alias ResolveModelOptions accordingly so mapCommonOptionsToOllama and related
code use the correct, narrower modelOptions shape.
---
Nitpick comments:
In
`@codemods/move-sampling-to-model-options/__testfixtures__/shorthand.output.ts`:
- Around line 14-16: The object inside modelOptions uses verbose property syntax
"temperature: temperature,"—update the codemod transform that emits the
modelOptions object so when a property key equals its identifier (e.g.,
temperature) it emits the ES6 shorthand (temperature,) instead; locate the code
that constructs or prints the modelOptions object (the logic producing the
"modelOptions" node and its properties) and change it to detect identical
key+identifier pairs and output the shorthand property form.
In `@codemods/move-sampling-to-model-options/README.md`:
- Around line 18-23: The mapping table currently lists a single OpenAI entry
mapping `maxTokens` to `max_output_tokens`, which is ambiguous for users of
`openaiChatCompletions`; update the table to split the OpenAI row into two rows
(e.g., "openai (Responses)" and "openai (Chat Completions)") and set `maxTokens`
-> `max_output_tokens` for the Responses row and `maxTokens` -> `max_tokens` for
the Chat Completions row; mention the `openaiChatCompletions` identifier in the
note so readers know which row to use.
In `@codemods/move-sampling-to-model-options/transform.ts`:
- Around line 341-356: The cast key.name as RootSamplingKey is
unnecessary/unsafe; instead make the set string-typed and check the identifier
name directly: change movedSet's type from Set<RootSamplingKey> to Set<string>
when creating it (const movedSet = new Set<string>(presentKeys)) and inside the
obj.properties filter, after confirming key.type === 'Identifier', use const
name = key.name and call movedSet.has(name) (no cast) to decide removal.
In `@packages/ai/tests/summarize-max-length.test.ts`:
- Around line 1-139: Move the test file
packages/ai/tests/summarize-max-length.test.ts next to the implementation file
chat-stream-summarize.ts so the test is colocated with its source; update
imports in summarize-max-length.test.ts to reference the local module path
(adjust any ../src/... imports) and keep references to
ChatStreamSummarizeAdapter, createRecordingTextAdapter, and the test helpers
(resolveDebugOption, ev) intact so the suite continues to import the same
symbols from the colocated files. Ensure the new location preserves the same
filename and that any path changes are minimal and correct for the package
module resolution.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1e00706c-024b-453a-a0d4-8a667e653722
📒 Files selected for processing (86)
- .changeset/sampling-options-to-model-options.md
- codemods/README.md
- codemods/move-sampling-to-model-options/README.md
- codemods/move-sampling-to-model-options/__testfixtures__/anthropic-merge.input.ts
- codemods/move-sampling-to-model-options/__testfixtures__/anthropic-merge.output.ts
- codemods/move-sampling-to-model-options/__testfixtures__/conflict.input.ts
- codemods/move-sampling-to-model-options/__testfixtures__/conflict.output.ts
- codemods/move-sampling-to-model-options/__testfixtures__/create-chat-options.input.ts
- codemods/move-sampling-to-model-options/__testfixtures__/create-chat-options.output.ts
- codemods/move-sampling-to-model-options/__testfixtures__/gemini-rename.input.ts
- codemods/move-sampling-to-model-options/__testfixtures__/gemini-rename.output.ts
- codemods/move-sampling-to-model-options/__testfixtures__/generate-and-ai.input.ts
- codemods/move-sampling-to-model-options/__testfixtures__/generate-and-ai.output.ts
- codemods/move-sampling-to-model-options/__testfixtures__/groq-maxtokens.input.ts
- codemods/move-sampling-to-model-options/__testfixtures__/groq-maxtokens.output.ts
- codemods/move-sampling-to-model-options/__testfixtures__/no-import.input.ts
- codemods/move-sampling-to-model-options/__testfixtures__/no-import.output.ts
- codemods/move-sampling-to-model-options/__testfixtures__/ollama-nested.input.ts
- codemods/move-sampling-to-model-options/__testfixtures__/ollama-nested.output.ts
- codemods/move-sampling-to-model-options/__testfixtures__/openai-basic.input.ts
- codemods/move-sampling-to-model-options/__testfixtures__/openai-basic.output.ts
- codemods/move-sampling-to-model-options/__testfixtures__/openrouter-maxtokens.input.ts
- codemods/move-sampling-to-model-options/__testfixtures__/openrouter-maxtokens.output.ts
- codemods/move-sampling-to-model-options/__testfixtures__/shorthand.input.ts
- codemods/move-sampling-to-model-options/__testfixtures__/shorthand.output.ts
- codemods/move-sampling-to-model-options/__testfixtures__/unresolvable-adapter.input.ts
- codemods/move-sampling-to-model-options/__testfixtures__/unresolvable-adapter.output.ts
- codemods/move-sampling-to-model-options/transform.test.ts
- codemods/move-sampling-to-model-options/transform.ts
- codemods/package.json
- docs/adapters/anthropic.md
- docs/adapters/gemini.md
- docs/adapters/grok.md
- docs/adapters/groq.md
- docs/adapters/ollama.md
- docs/adapters/openai.md
- docs/adapters/openrouter.md
- docs/advanced/middleware.md
- docs/advanced/typed-options.md
- docs/api/ai.md
- docs/config.json
- docs/migration/migration.md
- docs/migration/sampling-options-to-model-options.md
- package.json
- packages/ai-anthropic/src/adapters/text.ts
- packages/ai-anthropic/src/text/text-provider-options.ts
- packages/ai-anthropic/tests/anthropic-adapter.test.ts
- packages/ai-code-mode/models-eval/judge.ts
- packages/ai-code-mode/models-eval/run-eval.ts
- packages/ai-gemini/src/adapters/text.ts
- packages/ai-gemini/src/experimental/text-interactions/adapter.ts
- packages/ai-gemini/src/text/text-provider-options.ts
- packages/ai-gemini/tests/gemini-adapter.test.ts
- packages/ai-gemini/tests/text-interactions-adapter.test.ts
- packages/ai-grok/tests/grok-adapter.test.ts
- packages/ai-groq/tests/groq-adapter.test.ts
- packages/ai-ollama/src/adapters/text.ts
- packages/ai-ollama/src/index.ts
- packages/ai-ollama/src/meta/models-meta.ts
- packages/ai-ollama/tests/text-adapter.test.ts
- packages/ai-openai/src/text/text-provider-options.ts
- packages/ai-openai/tests/chat-per-model-type-safety.test.ts
- packages/ai-openai/tests/openai-adapter.test.ts
- packages/ai-openrouter/src/adapters/responses-text.ts
- packages/ai-openrouter/src/adapters/text.ts
- packages/ai-openrouter/tests/openrouter-adapter.test.ts
- packages/ai-openrouter/tests/openrouter-responses-adapter.test.ts
- packages/ai/skills/ai-core/adapter-configuration/SKILL.md
- packages/ai/skills/ai-core/adapter-configuration/references/anthropic-adapter.md
- packages/ai/skills/ai-core/adapter-configuration/references/gemini-adapter.md
- packages/ai/skills/ai-core/adapter-configuration/references/ollama-adapter.md
- packages/ai/skills/ai-core/adapter-configuration/references/openai-adapter.md
- packages/ai/skills/ai-core/chat-experience/SKILL.md
- packages/ai/skills/ai-core/middleware/SKILL.md
- packages/ai/src/activities/chat/index.ts
- packages/ai/src/activities/chat/middleware/types.ts
- packages/ai/src/activities/summarize/chat-stream-summarize.ts
- packages/ai/src/middlewares/otel.ts
- packages/ai/src/types.ts
- packages/ai/tests/chat.test.ts
- packages/ai/tests/middleware.test.ts
- packages/ai/tests/middlewares/otel.test.ts
- packages/ai/tests/summarize-max-length.test.ts
- packages/openai-base/src/adapters/chat-completions-text.ts
- packages/openai-base/src/adapters/responses-text.ts
- packages/openai-base/tests/responses-text.test.ts
💤 Files with no reviewable changes (4)
- packages/ai-gemini/src/adapters/text.ts
- packages/ai-gemini/src/experimental/text-interactions/adapter.ts
- packages/ai/src/types.ts
- packages/ai-ollama/src/index.ts
      /** Sampling controls shared by all Responses-API models. */
      export interface OpenAISamplingOptions {
        /**
         * Sampling temperature, 0–2. Higher = more random. Recommend altering this or top_p, not both.
         * Note: OpenAI reasoning models (o-series, GPT-5 reasoning) reject temperature/top_p.
         * https://platform.openai.com/docs/api-reference/responses/create#responses_create-temperature
         */
        temperature?: number
        /**
         * Nucleus sampling. 0.1 = only the top 10% probability mass is considered.
         * https://platform.openai.com/docs/api-reference/responses/create#responses_create-top_p
         */
        top_p?: number
        /**
         * Upper bound on generated tokens (visible output + reasoning tokens).
         * https://platform.openai.com/docs/api-reference/responses/create#responses_create-max_output_tokens
         */
        max_output_tokens?: number
      }

      // Core, always-available options for Responses API
    - export interface OpenAIBaseOptions {
    + export interface OpenAIBaseOptions extends OpenAISamplingOptions {
🧩 Analysis chain
🌐 Web query:
Do OpenAI Responses API reasoning models (for example o3 and GPT-5 reasoning models) accept the `temperature` and `top_p` parameters, or are those parameters rejected?
💡 Result:
OpenAI's dedicated reasoning models, such as the o1 and o3 series (including o3-mini) and the GPT-5 series, generally do not accept the traditional temperature and top_p sampling parameters via the API [1][2][3][4]. Attempts to include these parameters in API requests for these models typically result in a 400-level invalid_request_error stating that they are unsupported [1][2]. These reasoning models instead utilize specialized parameters to control their behavior, most notably reasoning.effort (or simply reasoning_effort in some environments), which allows developers to adjust the depth of the model's internal thinking process (e.g., minimal, low, medium, high, or xhigh) [5][6][7][8][3]. The exclusion of sampling parameters like temperature is a deliberate design choice by OpenAI to maintain the reliability and determinism required for complex, multi-step reasoning tasks [2]. Because these models are optimized to generate a high-quality, singular "best" chain-of-thought, standard sampling techniques used for creative text generation are often incompatible with their internal architecture [2]. Developers building on these models should omit temperature and top_p from their API calls to avoid request errors [1][2][4].
Citations:
- 1: CLI erroneously sends unsupported parameters (temperature/top_p) to the o3-mini model openai/openai-python#2072
- 2: https://community.openai.com/t/o3-mini-unsupported-parameter-temperature/1140846
- 3: https://learn.microsoft.com/en-us/azure/foundry/openai/how-to/reasoning
- 4: https://forum.popclip.app/t/required-api-changes-for-gpt-5-x-reasoning-models/3661
- 5: https://developers.openai.com/api/docs/guides/reasoning
- 6: https://docs.x.ai/developers/model-capabilities/text/reasoning
- 7: https://developers.openai.com/api/docs/models/gpt-5
- 8: https://developers.openai.com/api/docs/guides/latest-model
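One way an adapter could act on this finding is to drop `temperature` and `top_p` before building the request for a reasoning model. The sketch below is illustrative only: `isReasoningModel`, its prefix list, the `ResponsesModelOptions` type, and `stripUnsupportedSampling` are all hypothetical and are not part of `@tanstack/ai-openai` or the OpenAI SDK.

```typescript
// Hypothetical prefix list; a real adapter would use its own model metadata.
const REASONING_PREFIXES = ['o1', 'o3']

function isReasoningModel(model: string): boolean {
  return REASONING_PREFIXES.some(
    (p) => model === p || model.startsWith(`${p}-`),
  )
}

// Hypothetical subset of Responses-API model options.
type ResponsesModelOptions = {
  temperature?: number
  top_p?: number
  max_output_tokens?: number
  [key: string]: unknown
}

// Remove sampling keys that reasoning models reject; keep everything else.
function stripUnsupportedSampling(
  model: string,
  modelOptions: ResponsesModelOptions,
): ResponsesModelOptions {
  if (!isReasoningModel(model)) return modelOptions
  const rest: ResponsesModelOptions = { ...modelOptions }
  delete rest.temperature
  delete rest.top_p
  return rest
}
```

Filtering at request-build time complements the type-level fix: the types stop new code from compiling with unsupported keys, while the runtime filter covers untyped or dynamically built options.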
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find all references to the shared sampling/base types
rg -n "OpenAISamplingOptions|OpenAIBaseOptions" packages/ai-openai/src/text/text-provider-options.ts packages/ai-openai/src -S
# Show the rest of the file around where base/model option maps are composed
FILE="packages/ai-openai/src/text/text-provider-options.ts"
if [ -f "$FILE" ]; then
  echo "---- $FILE (context) ----"
  # print around the shown snippet and further down for composition
  nl -ba "$FILE" | sed -n '1,220p'
fi
# Look for option maps / reasoning models naming patterns
rg -n "reasoning|o3|o1|gpt-5|responses.*(create|temperature|top_p)" packages/ai-openai/src -S
# Identify where modelOptions types are built for text providers
rg -n "modelOptions|ModelOptions|options.*model" packages/ai-openai/src/text packages/ai-openai/src -S
Repository: TanStack/ai
Length of output: 9554
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show key type definitions and where reasoning models include base options.
rg -n "interface OpenAIReasoningOptions|type OpenAIReasoningOptions|OpenAIReasoningOptions" packages/ai-openai/src/text/text-provider-options.ts packages/ai-openai/src/model-meta.ts -S
# Print relevant ranges from text-provider-options.ts around sampling/base/reasoning definitions
FILE="packages/ai-openai/src/text/text-provider-options.ts"
# Get line numbers via python and print small windows around important anchors
python3 - <<'PY'
import re, itertools
file_path="packages/ai-openai/src/text/text-provider-options.ts"
anchors=["export interface OpenAISamplingOptions","export interface OpenAIBaseOptions","OpenAIReasoningOptions","ExternalTextProviderOptions"]
with open(file_path,'r',encoding='utf-8') as f:
    lines=f.readlines()
for i,line in enumerate(lines,1):
    for a in anchors:
        if a in line:
            print(f"--- anchor:{a} at {i} ---")
            start=max(1,i-25); end=min(len(lines),i+120)
            for j in range(start,end+1):
                # trim long lines a bit? just output as-is
                print(f"{j}:{lines[j-1].rstrip()}")
PY
# Print relevant ranges from model-meta.ts around reasoning models (O3/O1) where OpenAIBaseOptions is used
python3 - <<'PY'
file_path="packages/ai-openai/src/model-meta.ts"
targets=["[O3.name]","[O1.name]","[O3_PRO.name]","[O1_PRO.name]","reasoning"]
with open(file_path,'r',encoding='utf-8') as f:
    lines=f.readlines()
def match_any(line):
    return any(t in line for t in targets)
for i,line in enumerate(lines,1):
    if match_any(line):
        print(f"--- match at {i}: {line.strip()} ---")
        start=max(1,i-20); end=min(len(lines),i+40)
        for j in range(start,end+1):
            print(f"{j}:{lines[j-1].rstrip()}")
PY
# Search for how temperature/top_p is serialized into API requests.
rg -n "temperature|max_output_tokens|top_p" packages/ai-openai/src -S
Repository: TanStack/ai
Length of output: 50367
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show key reasoning type definitions and where reasoning models include base options.
rg -n "OpenAIReasoningOptions" packages/ai-openai/src/text/text-provider-options.ts packages/ai-openai/src/model-meta.ts -S
# Print relevant windows from text-provider-options.ts around sampling/base/reasoning/external option composition.
python3 - <<'PY'
file_path="packages/ai-openai/src/text/text-provider-options.ts"
anchors=[
    "export interface OpenAISamplingOptions",
    "export interface OpenAIBaseOptions",
    "OpenAIReasoningOptions",
    "export type ExternalTextProviderOptions"
]
with open(file_path,'r',encoding='utf-8') as f:
    lines=f.readlines()
for i,line in enumerate(lines,1):
    for a in anchors:
        if a in line:
            print(f"\n--- anchor:{a} at {i} ---")
            start=max(1,i-25); end=min(len(lines),i+180)
            for j in range(start,end+1):
                print(f"{j}:{lines[j-1].rstrip()}")
PY
# Print relevant windows from model-meta.ts around reasoning models (O3/O1) where OpenAIBaseOptions is used.
python3 - <<'PY'
file_path="packages/ai-openai/src/model-meta.ts"
with open(file_path,'r',encoding='utf-8') as f:
    lines=f.readlines()
# Print around O3/O1 entries explicitly
for i,line in enumerate(lines,1):
    if "[O3.name]" in line or "[O1.name]" in line:
        print(f"\n--- model-meta match at {i}: {line.strip()} ---")
        start=max(1,i-25); end=min(len(lines),i+60)
        for j in range(start,end+1):
            print(f"{j}:{lines[j-1].rstrip()}")
PY
# Search for how temperature/top_p is serialized into API requests.
rg -n "temperature|max_output_tokens|top_p" packages/ai-openai/src -S
Repository: TanStack/ai
Length of output: 50367
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find non-type usages/serialization points for temperature/top_p.
# Limit output to keep it readable.
rg -n --type ts "(^|[^a-zA-Z0-9_])(temperature|top_p|max_output_tokens)([^a-zA-Z0-9_]|$)" packages/ai-openai/src -S \
  | head -n 80
# Look specifically for payload construction / Responses create call sites.
rg -n --type ts "responses\.create|Responses\.create|createResponse|response.*payload|payload\s*=" packages/ai-openai/src -S \
  | head -n 80
# Look for any conditional gating related to reasoning models.
rg -n --type ts "reasoning.*temperature|temperature.*reasoning|top_p.*reasoning|reasoning.*top_p" packages/ai-openai/src -S \
  | head -n 80
rg -n --type ts "O1\.name|O3\.name|o-series|o3|gpt-5.*reasoning|reasoning model" packages/ai-openai/src -S \
  | head -n 80
Repository: TanStack/ai
Length of output: 8286
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look for where temperature/top_p are mapped into Responses API payloads for text.
rg -n --type ts "top_p\s*:|temperature\s*:" packages/ai-openai/src/adapters packages/ai-openai/src/text -S
# Look for any explicit filtering based on model capability/name.
rg -n --type ts "O1\.name|O3\.name|o3|o1|reasoning model|reasoning" packages/ai-openai/src/adapters packages/ai-openai/src/text -S | head -n 200
# Also check the legacy/openai adapter entrypoint if it builds payload for responses.create.
rg -n --type ts "responses\.create|OpenAI\.responses\.create|create\(" packages/ai-openai/src/adapters packages/ai-openai/src/text -S | head -n 200
Repository: TanStack/ai
Length of output: 37
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="packages/ai-openai/src/adapters/text.ts"
python3 - <<'PY'
import itertools, pathlib
path=pathlib.Path("packages/ai-openai/src/adapters/text.ts")
text=path.read_text(encoding="utf-8").splitlines()
# print whole file if small? otherwise print key regions
print("LINES", len(text))
# find occurrences of responses.create or payload building
need=[]
for i,line in enumerate(text,1):
    if "responses.create" in line or "Responses.create" in line or "openai.responses.create" in line:
        need.append(i)
    if "temperature" in line or "top_p" in line:
        need.append(i)
if not need:
    # still print first 260 lines and any later sections likely containing request building
    slices=[(1,220),(220,440),(440,660),(660,880)]
else:
    slices=[]
    for i in sorted(set(need)):
        slices.append((max(1,i-60), min(len(text), i+80)))
# dedupe slices by start
seen=set()
for a,b in slices:
    key=(a,b)
    if key in seen: continue
    seen.add(key)
    print(f"\n---- slice {a}-{b} ----")
    for j in range(a,b+1):
        print(f"{j}:{text[j-1]}")
PY
# Broader scan in adapters for responses payload creation / temperature
rg -n --type ts "responses\.create|OpenAI.*responses|response[s]?\s*\(" packages/ai-openai/src/adapters -S
rg -n --type ts "temperature|max_output_tokens|top_p" packages/ai-openai/src/adapters -S || true
Repository: TanStack/ai
Length of output: 5664
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "OpenAIBaseResponsesTextAdapter" -S packages
# If found, show the key parts where it builds the ResponseCreateParams payload.
# Otherwise, we may need to determine that the filtering (if any) lives in an external package.
rg -n "mapOptionsToRequest\\(|ResponseCreateParams|responses\\.create|temperature|top_p" -S packages | head -n 200
Repository: TanStack/ai
Length of output: 28149
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="packages/openai-base/src/adapters/responses-text.ts"
# show the region around the comment mentioning modelOptions is sole sampling surface
# (use python to avoid relying on nl)
python3 - <<'PY'
import pathlib
path=pathlib.Path("packages/openai-base/src/adapters/responses-text.ts")
lines=path.read_text(encoding="utf-8").splitlines()
# print around the first occurrence of the anchor text
anchor="modelOptions is the sole sampling surface"
idx=None
for i,l in enumerate(lines):
    if anchor in l:
        idx=i
        break
print("ANCHOR_LINE_INDEX", idx+1 if idx is not None else None)
if idx is None:
    raise SystemExit("anchor not found")
start=max(0, idx-80)
end=min(len(lines), idx+220)
for j in range(start,end):
    print(f"{j+1}:{lines[j]}")
PY
# Then narrow specifically for temperature/top_p serialization in this file
rg -n "temperature|top_p|max_output_tokens" "$FILE" -S
# Print small windows around each temperature/top_p occurrence (keep concise)
python3 - <<'PY'
import pathlib,re
path=pathlib.Path("packages/openai-base/src/adapters/responses-text.ts")
lines=path.read_text(encoding="utf-8").splitlines()
hits=[]
for i,l in enumerate(lines,1):
    if re.search(r"\b(temperature|top_p|max_output_tokens)\b", l):
        hits.append(i)
# keep up to 12 hits for brevity
for i in hits[:12]:
    start=max(1,i-15); end=min(len(lines),i+25)
    print(f"\n--- window around line {i} ---")
    for j in range(start,end+1):
        print(f"{j}:{lines[j-1]}")
PY
Repository: TanStack/ai
Length of output: 113
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="packages/openai-base/src/adapters/responses-text.ts"
# Find where mapOptionsToRequest starts and where it mentions modelOptions/sampling.
rg -n --type ts "mapOptionsToRequest|modelOptions|sampling|temperature|top_p" "$FILE" -S
# Print around mapOptionsToRequest and around any temperature/top_p mentions.
python3 - <<'PY'
import pathlib,re
path=pathlib.Path("packages/openai-base/src/adapters/responses-text.ts")
lines=path.read_text(encoding="utf-8").splitlines()
# line numbers with temperature/top_p/max_output_tokens
targets=[]
for i,l in enumerate(lines,1):
    if re.search(r"\b(temperature|top_p|max_output_tokens)\b", l):
        targets.append(i)
# add mapOptionsToRequest region start
for i,l in enumerate(lines,1):
    if "mapOptionsToRequest" in l:
        targets.append(i)
targets=sorted(set(targets))
print("HIT_COUNT", len(targets))
# print windows
for i in targets[:30]:
    start=max(1,i-25); end=min(len(lines), i+50)
    print(f"\n---- window around {i} ----")
    for j in range(start,end+1):
        print(f"{j}:{lines[j-1]}")
PY
Repository: TanStack/ai
Length of output: 37172
Don’t expose temperature/top_p on the shared OpenAI base options for reasoning models
- `OpenAIBaseOptions` currently extends `OpenAISamplingOptions` (`temperature`/`top_p`), and `model-meta.ts` assigns `OpenAIBaseOptions` to reasoning models like `O1`/`O3`, so these fields become valid for all models.
- At runtime, `packages/openai-base/src/adapters/responses-text.ts` `mapOptionsToRequest()` spreads `...modelOptions` directly into `ResponseCreateParams` (the sole sampling surface) with no capability-based filtering, so `temperature`/`top_p` will be sent to `/v1/responses` and rejected by OpenAI reasoning models.

Keep `max_output_tokens` in the shared base if needed, but gate/remove `temperature`/`top_p` from the shared base and put them only on the non-reasoning model option types.
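For illustration, the runtime half of this suggestion could look roughly like the sketch below. Everything in it is hypothetical: `isReasoningModel`, `stripUnsupportedSampling`, and the prefix list are made-up names, and the real capability check would presumably come from `model-meta.ts` rather than string prefixes.

```typescript
// Hypothetical sketch: none of these names exist in the TanStack AI codebase.
type ModelOptions = Record<string, unknown>

// Assumed reasoning-model id prefixes; the real source of truth would be the
// capability data in model-meta.ts.
const REASONING_MODEL_PREFIXES = ['o1', 'o3'] as const

// Sampling keys that reasoning models are said to reject.
const UNSUPPORTED_SAMPLING_KEYS = new Set<string>(['temperature', 'top_p'])

function isReasoningModel(model: string): boolean {
  return REASONING_MODEL_PREFIXES.some(
    (prefix) => model === prefix || model.startsWith(`${prefix}-`),
  )
}

// Returns a copy of modelOptions. For reasoning models the copy omits
// temperature/top_p; every other key (e.g. max_output_tokens) is kept.
function stripUnsupportedSampling(
  model: string,
  modelOptions: ModelOptions,
): ModelOptions {
  const result: ModelOptions = {}
  const reasoning = isReasoningModel(model)
  for (const [key, value] of Object.entries(modelOptions)) {
    if (reasoning && UNSUPPORTED_SAMPLING_KEYS.has(key)) continue
    result[key] = value
  }
  return result
}
```

A request builder could call such a filter just before spreading `modelOptions` into the payload, which keeps the type-level change and the runtime guard independent of each other.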
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/ai-openai/src/text/text-provider-options.ts` around lines 17 - 38,
OpenAIBaseOptions currently inherits temperature/top_p via OpenAISamplingOptions
which makes those fields valid for reasoning models; remove temperature and
top_p from the shared base by stopping OpenAIBaseOptions from extending
OpenAISamplingOptions (keep max_output_tokens in the base), create a new
OpenAISamplingOptions (or NonReasoningSamplingOptions) interface that includes
temperature/top_p and apply that only to non-reasoning model option types in
model-meta.ts (e.g., O1/O3 should not use OpenAIBaseOptions with sampling
fields); additionally ensure adapters/responses-text.ts mapOptionsToRequest()
does not blindly spread modelOptions into ResponseCreateParams for reasoning
models—either filter out temperature/top_p there or only pass the sampling
interface for non-reasoning models so the client never sends temperature/top_p
to the Responses API for reasoning models.
Deferring this one to a follow-up rather than folding it into this PR, for two reasons:

- Not a regression introduced here. Before this PR, `temperature`/`topP` lived on the `chat()` root and were sent to any model, reasoning ones included. The runtime behavior (a user can send `temperature` to o3/gpt-5 and get a 400) is unchanged. This PR only relocated the fields; it didn't newly enable sending them to reasoning models at runtime.
- It's a sizable, orthogonal type-design change. Every OpenAI model in `model-meta.ts` (~30 entries) composes `OpenAIBaseOptions`. Splitting sampling out of the shared base and re-applying it only to non-reasoning models touches the whole matrix and reshapes the public per-model type surface. That deserves its own PR with its own tests, not a rider on the sampling-relocation change. The docstring on `OpenAISamplingOptions` already documents that reasoning models reject these.
Tracking as a follow-up: gate temperature/top_p to non-reasoning OpenAI models at the type level (and optionally filter them in responses-text.ts for reasoning models).
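As a rough picture of what the type-level half of that follow-up might involve, here is a minimal sketch using a conditional type. The interface names, the `ReasoningModelId` union, and `buildRequest` are all invented for this example and do not reflect the package's actual per-model types.

```typescript
// Hypothetical names throughout; this is not the library's real type surface.
interface TokenLimitOptions {
  max_output_tokens?: number
}

interface SamplingOptions {
  temperature?: number
  top_p?: number
}

// Assumed reasoning-model ids, standing in for the model-meta.ts entries.
type ReasoningModelId = 'o1' | 'o3'

// Reasoning models only get the token limit; every other model also gets
// the sampling fields.
type ModelOptionsFor<TModel extends string> = TModel extends ReasoningModelId
  ? TokenLimitOptions
  : TokenLimitOptions & SamplingOptions

function buildRequest<TModel extends string>(
  model: TModel,
  modelOptions: ModelOptionsFor<TModel>,
): { model: TModel; modelOptions: ModelOptionsFor<TModel> } {
  return { model, modelOptions }
}

const chatRequest = buildRequest('gpt-4o', {
  temperature: 0.2,
  max_output_tokens: 256,
})
const reasoningRequest = buildRequest('o3', { max_output_tokens: 256 })
// buildRequest('o3', { temperature: 0.2 }) is intended to fail type-checking.
```

With roughly 30 model entries, a union of reasoning ids like this would probably be derived from the model metadata rather than written by hand.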
@AlemTuzlak, that reasoning makes sense — this PR relocates the surface, it doesn't regress behaviour, and the docstring on OpenAISamplingOptions already calls out the restriction. Splitting the per-model type matrix into a separate PR is the right call.
Would you like me to open a GitHub issue to track the follow-up work (gating temperature/top_p to non-reasoning model types and optionally filtering them in responses-text.ts)?
tombeckenham
left a comment
- pnpm vitest run move-sampling-to-model-options fails
- Ollama summarize silently drops the temperature: 0.3 default
tombeckenham
left a comment
Review of the sampling→modelOptions move. Core mapping is correct and well-tested; flagging a few real issues inline. Two blockers:

- Codemod suite is red (9/12 fail) and escapes CI: `codemods` exposes a `test` script, not `test:lib`, so `pnpm test:pr` never runs it.
- Ollama `temperature`/token caps land flat in two places (summarize default + the eval harness) but the adapter reads sampling only from `modelOptions.options`, so they are silently dropped.
Also (not inline-able, examples aren't in the diff): ts-code-mode-web (api.product-codemode.ts:271, api.judge.ts:56, api.banking-demo.ts:237) and ts-react-chat/api.structured-output.ts:409 still pass root maxTokens to chat() — now silently ignored.
): Record<string, unknown> {
  switch (provider) {
    case 'ollama':
      return { num_predict: maxTokens, num_ctx: 32768 }
Flat num_predict/num_ctx are dropped: this result is passed as modelOptions (L818), but the Ollama adapter reads sampling only from modelOptions.options. Nest it:
return { options: { num_predict: maxTokens, num_ctx: 32768 } }

let working: Record<string, unknown> = {
  temperature: 0.3,
  ...(options.modelOptions as Record<string, unknown> | undefined),
}
Regression for Ollama: temperature: 0.3 stays flat, but Ollama reads sampling only from modelOptions.options, so the default never reaches the wire (it did before this PR). OTel reads the flat value, so telemetry will report 0.3 while the request omits it. Nest temperature under options for the ollama name.
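A minimal sketch of that nesting, assuming the provider can be identified by a plain name string. `withDefaultTemperature` and `isRecord` are hypothetical helpers written for this comment, not functions from the summarize adapter.

```typescript
// Hypothetical helper; the real summarize wrapper may be structured differently.
type ModelOptions = Record<string, unknown>

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

// Places the default temperature where the provider is assumed to read it:
// nested under `options` for Ollama, flat for everything else. A value the
// caller already set always wins over the default.
function withDefaultTemperature(
  providerName: string,
  modelOptions: ModelOptions = {},
  defaultTemperature = 0.3,
): ModelOptions {
  if (providerName === 'ollama') {
    const existing = modelOptions['options']
    const nested = isRecord(existing) ? existing : {}
    return {
      ...modelOptions,
      options: { temperature: defaultTemperature, ...nested },
    }
  }
  return { temperature: defaultTemperature, ...modelOptions }
}
```

Because the default is written first and the caller's values are spread after it, the same function also covers the "caller-set value wins" case without a separate branch.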
| * provider reads — no adapter reads a generic `maxTokens`. A value of `null` | ||
| * marks a nested shape (handled specially below for Ollama). |
Inaccurate: the map is Record<string, string> with no null and no ollama key — Ollama is a hardcoded branch in applyMaxLength. Drop the null sentence.
] as const

/**
 * Resolve `maxLength` to the provider-native max-output-tokens key for the
"the text adapter's name" → it's this summarize adapter's own name (constructor arg, defaults to 'chat-stream-summarize'), independent of the wrapped text adapter.
}

const key = MAX_TOKENS_KEY_BY_ADAPTER[adapterName]
if (key === undefined) return merged
Silent no-op: any name not in the map (new provider, the default 'chat-stream-summarize', or the new openaiCompatible adapter's custom names) drops maxLength with no signal. At least logger.warn here; better, type adapter name as a literal union so the map must be exhaustive.
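To make both options concrete, here is a sketch under stated assumptions: the adapter-name union and the two-entry map are illustrative only (the real `MAX_TOKENS_KEY_BY_ADAPTER` has more entries), `applyMaxLengthSketch` is a made-up name, and letting a caller-set limit win over `maxLength` is an assumption about the intended precedence.

```typescript
// Illustrative subset; the real map and adapter names may differ.
type KnownAdapterName = 'openai' | 'anthropic'

// Typing the map by the literal union forces it to stay exhaustive: adding a
// name to KnownAdapterName without a key here is a compile-time error.
const MAX_TOKENS_KEY: Record<KnownAdapterName, string> = {
  openai: 'max_output_tokens',
  anthropic: 'max_tokens',
}

function isKnownAdapterName(name: string): name is KnownAdapterName {
  return Object.prototype.hasOwnProperty.call(MAX_TOKENS_KEY, name)
}

// Hypothetical stand-in for applyMaxLength: warns instead of silently
// dropping maxLength when the adapter name is not recognised.
function applyMaxLengthSketch(
  adapterName: string,
  modelOptions: Record<string, unknown>,
  maxLength: number,
  warn: (message: string) => void = console.warn,
): Record<string, unknown> {
  if (!isKnownAdapterName(adapterName)) {
    warn(
      `maxLength ignored: no max-token key is known for adapter "${adapterName}"`,
    )
    return { ...modelOptions }
  }
  const key = MAX_TOKENS_KEY[adapterName]
  // Assumption: a limit the caller already set takes precedence.
  return { [key]: maxLength, ...modelOptions }
}
```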
}

/**
 * Return the first candidate that is a finite `number`, or `undefined`. Used to
Says "finite" but the check is only typeof === 'number', so NaN/Infinity pass. Either guard with Number.isFinite or drop "finite".
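For reference, the guarded variant is a one-line change. The sketch below uses a hypothetical name (`firstFiniteNumber`) and a variadic signature, which may not match the helper in `otel.ts`.

```typescript
// Hypothetical version of the helper with the finite check applied.
// `typeof NaN` and `typeof Infinity` are both 'number', so the typeof test
// alone is not enough to match the docstring.
function firstFiniteNumber(...candidates: Array<unknown>): number | undefined {
  for (const candidate of candidates) {
    if (typeof candidate === 'number' && Number.isFinite(candidate)) {
      return candidate
    }
  }
  return undefined
}
```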
): { reports: Array<string> } {
  const expected = read(`${name}.output.${ext}`)
  const { output, reports } = runTransform(name, ext)
  expect(normalize(output)).toBe(normalize(expected))
This suite is red locally (9/12): recast prints ;/no trailing commas, fixtures are Prettier-style, and normalize() only trims. Transform output is semantically correct — run it through Prettier in the harness (or regenerate fixtures), and wire this into test:pr (it currently isn't, since codemods has no test:lib target).
@@ -0,0 +1,31 @@
---
'@tanstack/ai': minor
Please note the removed public export OllamaTextProviderOptions (deleted from ai-ollama/src/index.ts) — it's a breaking API-surface change; mention the migration to the per-model type / SDK ChatRequest.
Blocking fixes (codemod CI + Ollama silent drops):

- codemod: Prettier-normalize the transform test harness so recast's print style no longer diverges from the Prettier-formatted fixtures (20/20 green), and add a `test:lib` script to the codemods package so `nx affected` (and thus `test:pr`) actually runs the suite instead of skipping it.
- summarize: place the default `temperature` where the wrapped provider reads it, nested under `options` for Ollama (a flat value was dropped at the wire while still surfacing in OTel). Honor caller-set flat token limits in the Ollama branch, and warn instead of silently dropping `maxLength` for an unrecognised adapter name.
- code-mode eval harness: nest Ollama `num_predict`/`num_ctx` under `options`.
- examples: route the generic `maxTokens` through provider-native `modelOptions` (shared `maxTokensModelOptions` helper for dynamically resolved adapters; native keys inline where the adapter is static).

Other review items:

- anthropic: `max_tokens ?? 1024` so an explicit `0` reaches validation instead of being coerced to the default.
- openrouter (chat): forward root `metadata` like the responses adapter (+ test).
- otel: `firstNumber` now requires `Number.isFinite` (rejects NaN/Infinity).
- logger: add `InternalLogger.warn`, gated by the `errors` category so `debug: false` still silences it.
- docs: fix the migration example to put sampling under `modelOptions`; drop leaked trailing tags from the sampling guide; note the removed `OllamaTextProviderOptions` export in the changeset; split the codemod README's OpenAI row into Responses vs Chat Completions.
- codemod: emit ES6 shorthand when a moved value matches its key; drop an unsafe `as RootSamplingKey` cast.
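The `max_tokens ?? 1024` item comes down to the difference between nullish coalescing and logical OR. A small sketch, with hypothetical function names and the 1024 default taken from the commit message above:

```typescript
// `??` only falls back on null/undefined, so an explicit 0 is passed through
// and can be rejected by later validation.
function resolveMaxTokens(requested: number | undefined, fallback = 1024): number {
  return requested ?? fallback
}

// `||` falls back on any falsy value, so an explicit 0 is silently replaced
// by the default.
function resolveMaxTokensLossy(requested: number | undefined, fallback = 1024): number {
  return requested || fallback
}
```

Here `resolveMaxTokens(0)` yields `0`, while `resolveMaxTokensLossy(0)` yields `1024`; both yield `1024` for `undefined`.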
|
Thanks for the thorough review @tombeckenham + CodeRabbit. Addressed in acb3371. Blockers
CodeRabbit items
Examples (root
|
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.changeset/sampling-options-to-model-options.md (1)
15-15: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Clarify OpenAI Chat Completions vs Responses modelOptions token key in the changeset.
The changeset entry only documents `max_output_tokens` under “OpenAI (Responses)” (line 15), but the OpenAI chat-completions adapter uses provider-native `modelOptions` wire names `max_tokens`/`max_completion_tokens` (and doesn’t read the root `maxTokens`). Update the doc to add a separate “OpenAI (Chat Completions)” entry (including both key variants) or explicitly scope `max_output_tokens` to Responses only.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.changeset/sampling-options-to-model-options.md at line 15, The changeset currently lists OpenAI (Responses) modelOptions as `{ temperature, top_p, max_output_tokens }` but doesn’t clarify that OpenAI chat-completions adapters use provider-native keys; update the changeset to either (A) add a new "OpenAI (Chat Completions)" entry listing `modelOptions: { temperature, top_p, max_tokens, max_completion_tokens }` or (B) explicitly scope the existing line to say `OpenAI (Responses): modelOptions: { temperature, top_p, max_output_tokens }` and note that chat-completions uses `max_tokens`/`max_completion_tokens` (and does not read root `maxTokens`), referencing the `modelOptions`, `max_output_tokens`, `max_tokens`, and `max_completion_tokens` names so readers know the exact keys to use.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/ai-openrouter/tests/openrouter-adapter.test.ts`:
- Around line 1499-1515: Add an assertion that runs the outbound serializer on
the outgoing payload so the test verifies metadata survives SDK serialization:
after extracting rawParams = mockSend.mock.calls[0][0] and chatRequest =
rawParams.chatRequest, call ChatRequest$outboundSchema.parse(...) (the same
pattern used in other tests) on the serialized payload and assert the parsed
result contains metadata { env: 'test' } to ensure metadata is not stripped
during serialization.
In `@packages/ai/src/logger/internal-logger.ts`:
- Around line 115-120: The warn method currently swallows warnings if the
user-provided logger lacks warn (this.logger.warn) — update warn in the class
(the warn(message: string, meta?: Record<string, unknown>) method that checks
this.categories.errors and builds prefixed) to guard for a missing warn function
and fall back to other available logging methods (for example this.logger.error,
this.logger.debug) and finally console.warn/console.error as the last resort;
perform feature-detection (typeof this.logger.warn === 'function') before
calling and only use the try/catch for unexpected failures while ensuring the
warning is never silently dropped.
---
Outside diff comments:
In @.changeset/sampling-options-to-model-options.md:
- Line 15: The changeset currently lists OpenAI (Responses) modelOptions as `{
temperature, top_p, max_output_tokens }` but doesn’t clarify that OpenAI
chat-completions adapters use provider-native keys; update the changeset to
either (A) add a new "OpenAI (Chat Completions)" entry listing `modelOptions: {
temperature, top_p, max_tokens, max_completion_tokens }` or (B) explicitly scope
the existing line to say `OpenAI (Responses): modelOptions: { temperature,
top_p, max_output_tokens }` and note that chat-completions uses
`max_tokens`/`max_completion_tokens` (and does not read root `maxTokens`),
referencing the `modelOptions`, `max_output_tokens`, `max_tokens`, and
`max_completion_tokens` names so readers know the exact keys to use.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: cd6d37d6-e98c-4ff7-9a4e-0b165fcab82c
⛔ Files ignored due to path filters (1)
- pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (24)
- .changeset/sampling-options-to-model-options.md
- codemods/move-sampling-to-model-options/README.md
- codemods/move-sampling-to-model-options/__testfixtures__/anthropic-merge.output.ts
- codemods/move-sampling-to-model-options/__testfixtures__/shorthand.output.ts
- codemods/move-sampling-to-model-options/transform.test.ts
- codemods/move-sampling-to-model-options/transform.ts
- codemods/package.json
- docs/migration/migration.md
- docs/migration/sampling-options-to-model-options.md
- examples/ts-code-mode-web/src/lib/create-execute-prompt-tool.ts
- examples/ts-code-mode-web/src/lib/max-tokens-model-options.ts
- examples/ts-code-mode-web/src/lib/structured-output.ts
- examples/ts-code-mode-web/src/routes/_banking-demo/api.banking-demo.ts
- examples/ts-code-mode-web/src/routes/_database-demo/api.judge.ts
- examples/ts-code-mode-web/src/routes/_home/api.product-codemode.ts
- examples/ts-react-chat/src/routes/api.structured-output.ts
- packages/ai-anthropic/src/adapters/text.ts
- packages/ai-code-mode/models-eval/run-eval.ts
- packages/ai-openrouter/src/adapters/text.ts
- packages/ai-openrouter/tests/openrouter-adapter.test.ts
- packages/ai/src/activities/summarize/chat-stream-summarize.ts
- packages/ai/src/logger/internal-logger.ts
- packages/ai/src/middlewares/otel.ts
- packages/ai/tests/summarize-max-length.test.ts
💤 Files with no reviewable changes (1)
- docs/migration/sampling-options-to-model-options.md
✅ Files skipped from review due to trivial changes (1)
- codemods/move-sampling-to-model-options/README.md
🚧 Files skipped from review as they are similar to previous changes (11)
- codemods/move-sampling-to-model-options/__testfixtures__/shorthand.output.ts
- packages/ai/tests/summarize-max-length.test.ts
- docs/migration/migration.md
- codemods/move-sampling-to-model-options/__testfixtures__/anthropic-merge.output.ts
- codemods/move-sampling-to-model-options/transform.test.ts
- packages/ai-code-mode/models-eval/run-eval.ts
- packages/ai-openrouter/src/adapters/text.ts
- packages/ai/src/middlewares/otel.ts
- packages/ai/src/activities/summarize/chat-stream-summarize.ts
- codemods/move-sampling-to-model-options/transform.ts
- packages/ai-anthropic/src/adapters/text.ts
it('forwards root metadata to the request (same as the responses adapter)', async () => {
  setupMockSdkClient(minimalStreamChunks)
  const adapter = createAdapter()

  for await (const _ of chat({
    adapter,
    messages: [{ role: 'user', content: 'test' }],
    // Root `metadata` is still part of the contract; it must not be dropped
    // by the chat-completions request builder.
    metadata: { env: 'test' },
  })) {
    // consume
  }

  const [rawParams] = mockSend.mock.calls[0]!
  const params = rawParams.chatRequest
  expect(params.metadata).toEqual({ env: 'test' })
Assert the serialized payload too.
This only checks chatRequest before the SDK's outbound schema runs. A regression that still strips metadata during serialization would pass here, so this test should mirror the existing ChatRequest$outboundSchema.parse(...) assertion pattern as well.
Suggested addition
   const [rawParams] = mockSend.mock.calls[0]!
   const params = rawParams.chatRequest
   expect(params.metadata).toEqual({ env: 'test' })
+
+  const serialized = ChatRequest$outboundSchema.parse(params)
+  expect((serialized as { metadata?: unknown }).metadata).toEqual({
+    env: 'test',
+  })
 })
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/ai-openrouter/tests/openrouter-adapter.test.ts` around lines 1499 -
1515, Add an assertion that runs the outbound serializer on the outgoing payload
so the test verifies metadata survives SDK serialization: after extracting
rawParams = mockSend.mock.calls[0][0] and chatRequest = rawParams.chatRequest,
call ChatRequest$outboundSchema.parse(...) (the same pattern used in other
tests) on the serialized payload and assert the parsed result contains metadata
{ env: 'test' } to ensure metadata is not stripped during serialization.
warn(message: string, meta?: Record<string, unknown>): void {
  if (!this.categories.errors) return
  const prefixed = `⚠️ [tanstack-ai:warn] ⚠️ ${message}`
  try {
    this.logger.warn(prefixed, meta)
  } catch {
Fallback when the user logger does not implement warn.
This is a new runtime requirement on custom loggers. If a caller still provides the old debug/error shape, this.logger.warn(...) throws and the blanket catch turns the warning into a silent drop, which defeats the whole point of surfacing recoverable anomalies.
Suggested fix
 warn(message: string, meta?: Record<string, unknown>): void {
   if (!this.categories.errors) return
   const prefixed = `⚠️ [tanstack-ai:warn] ⚠️ ${message}`
   try {
-    this.logger.warn(prefixed, meta)
+    const warn = (this.logger as { warn?: typeof this.logger.error }).warn
+    if (typeof warn === 'function') {
+      warn.call(this.logger, prefixed, meta)
+    } else {
+      this.logger.error(prefixed, meta)
+    }
   } catch {
     // User-supplied logger threw; swallow so a broken logger never masks the
     // condition we were trying to surface.
   }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/ai/src/logger/internal-logger.ts` around lines 115 - 120, The warn
method currently swallows warnings if the user-provided logger lacks warn
(this.logger.warn) — update warn in the class (the warn(message: string, meta?:
Record<string, unknown>) method that checks this.categories.errors and builds
prefixed) to guard for a missing warn function and fall back to other available
logging methods (for example this.logger.error, this.logger.debug) and finally
console.warn/console.error as the last resort; perform feature-detection (typeof
this.logger.warn === 'function') before calling and only use the try/catch for
unexpected failures while ensuring the warning is never silently dropped.
…-to-modeloptions # Conflicts: # examples/ts-code-mode-web/src/lib/structured-output.ts
Extract the provider-native max-output-token key spellings into `utilities/sampling-keys.ts` so the summarize wrapper's caller-limit detection and the OTel middleware's sampling-attribute extraction read from one source and cannot drift. Also document the intentional non-native-spelling caveat in `applyMaxLength`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
tombeckenham
left a comment
Made one small change to centralise a set of consts. All good to go
🧹 Nitpick comments (1)
packages/ai/src/utilities/sampling-keys.ts (1)
1-14: 💤 Low value
Consider documenting the order priority logic.
The comment mentions that the OTel middleware "picks the first numeric spelling" (line 9), and line 15 describes the array as "ordered," but the header documentation doesn't explicitly state that order determines priority when multiple keys are present. Adding a sentence like "The order determines which key OTel reports when multiple token caps are set" would clarify the significance of the ordering.
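To show why the ordering matters, here is a sketch of a first-match lookup over an ordered key list. The three key spellings appear elsewhere in this PR, but their order here, the constant name, and `firstTokenCap` are assumptions for illustration; the actual contents of `sampling-keys.ts` may differ.

```typescript
// Assumed priority order; earlier keys win when several caps are present.
const MAX_TOKEN_KEYS_BY_PRIORITY = [
  'max_output_tokens',
  'max_completion_tokens',
  'max_tokens',
] as const

type MaxTokenKey = (typeof MAX_TOKEN_KEYS_BY_PRIORITY)[number]

// Returns the first key (in priority order) whose value is a finite number,
// together with that value, or undefined when no cap is set.
function firstTokenCap(
  modelOptions: Record<string, unknown>,
): { key: MaxTokenKey; value: number } | undefined {
  for (const key of MAX_TOKEN_KEYS_BY_PRIORITY) {
    const value = modelOptions[key]
    if (typeof value === 'number' && Number.isFinite(value)) {
      return { key, value }
    }
  }
  return undefined
}
```

With this order, options that set both `max_tokens` and `max_output_tokens` report the `max_output_tokens` value, which is the kind of behavior the header comment would need to spell out.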
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/ai/src/utilities/sampling-keys.ts` around lines 1 - 14, Update the header comment in packages/ai/src/utilities/sampling-keys.ts to explicitly state that the array order defines priority when multiple provider-native token keys are present: mention that the first key found is the one used by middlewares/otel.ts for the gen_ai.request.max_tokens attribute and that activities/summarize/chat-stream-summarize.ts relies on this ordering to detect caller-supplied limits; also reference MAX_TOKENS_KEY_BY_ADAPTER to remind maintainers to keep both lists in the same priority order.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@packages/ai/src/utilities/sampling-keys.ts`:
- Around line 1-14: Update the header comment in
packages/ai/src/utilities/sampling-keys.ts to explicitly state that the array
order defines priority when multiple provider-native token keys are present:
mention that the first key found is the one used by middlewares/otel.ts for the
gen_ai.request.max_tokens attribute and that
activities/summarize/chat-stream-summarize.ts relies on this ordering to detect
caller-supplied limits; also reference MAX_TOKENS_KEY_BY_ADAPTER to remind
maintainers to keep both lists in the same priority order.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1d8794ad-0064-4d4d-9dc9-948c682cee38
📒 Files selected for processing (3)
- packages/ai/src/activities/summarize/chat-stream-summarize.ts
- packages/ai/src/middlewares/otel.ts
- packages/ai/src/utilities/sampling-keys.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/ai/src/middlewares/otel.ts
- packages/ai/src/activities/summarize/chat-stream-summarize.ts
After rebasing onto main (which merged #660 moving temperature/topP/ maxTokens into modelOptions), fix the two spots where our edits still assumed top-level sampling: - thinking-content.md: max_tokens now lives in modelOptions alongside the thinking budget (was a top-level maxTokens). - anthropic.md: drop the stale "auto-raises top-level maxTokens" note; budget_tokens must be below modelOptions.max_tokens. (ollama.md and typed-options.md conflicts were resolved to main's new convention during the rebase.)
The "safe allowlist" example still forwarded temperature/maxTokens as top-level chat() options (missed by #660's sampling-into-modelOptions migration). Map them into modelOptions under OpenAI's native keys (temperature / max_output_tokens) so the example type-checks.
…691)

* docs: fix inaccurate code samples and expand coverage across guides

Audited all guide pages against the actual package APIs and fixed copy-paste-broken / outdated samples and filled coverage gaps.

Middleware & structured outputs:
- New Built-in Middleware page (toolCache, contentGuard, otel) + new top-level Middleware nav section; document structured-output chunk transforms via onChunk + ctx.phase; fix middleware type import paths.
- Document client consumption (useChat partial/final) on the one-shot page.

Correctness fixes (verified against packages/ source):
- chat: providerOptions -> modelOptions; invalid model ids; budget_tokens requires maxTokens; async stream() factory -> fetcher; add missing imports; document default maxIterations(5) and agentLoopStrategy.
- tools: toServerSentEventsStream -> toServerSentEventsResponse; remove duplicate tools key; clarify tool-call vs tool-result states; fix the React examples and state diagram; add emitCustomEvent / runtime-context.
- media: add required model args to factory calls; fix recursive generateVideo; TranscriptionResult.words is top-level; speed is a top-level speech option; Gemini audio returns b64Json; onResult transform.
- advanced: adapter.model (not selectedModel); GeminiImageMetadata; source.mimeType; text.format structured-output shape; fill How It Works; createModel capabilities form; soften unsubstantiated bundle figures.
- protocol: rewrite SSE / HTTP-stream pages to the AG-UI event format (drop obsolete chunk shapes and [DONE]); use toHttpResponse/toHttpStream; expand chunk-definitions with TOOL_CALL_RESULT, MESSAGES_SNAPSHOT, REASONING_* and deprecated-alias notes.
- adapters: elevenlabs SFX model + @elevenlabs/client; ollama modelOptions placement; cencori AG-UI event/tools alignment.
- fix @tanstack/ai-openai/adapters -> @tanstack/ai-openai (ag-ui-compliance, otel).

* docs: address CR-round findings (correctness + latest models)

A 7+1-agent confirmation review of the docs PR surfaced further source-accuracy issues (and caught one regression the first fix pass introduced). All verified against packages/ source:
- tools/server-tools: JSON-schema tool input is `unknown` (not `any`); samples now narrow/cast args.
- thinking-content: drop the adaptive-thinking / output_config.effort example; those option types are not wired into any model's typed modelOptions; document the `{ type: 'enabled', budget_tokens }` form.
- multimodal-content: correct the Anthropic modality bullets (no `claude-3*` ids; Claude Haiku 3 supports documents).
- comparison: fix the ImagePart (`source: { type:'url', url }`) and TextPart (`content`) shapes in the flagship example.
- chunk-definitions: RUN_STARTED/RUN_FINISHED `threadId` is required; add REASONING_MESSAGE_CHUNK to the internal-members note.
- media: createOpenaiVideo needs a model arg; video `seconds` is a string union; transcription `responseFormat`/`prompt` are top-level (not modelOptions); drop the non-existent gpt-4o-mini-audio-preview TTS model; add the Audio row to the generations table.
- advanced: typed-options gpt-image-1 size must be a GptImageSize.
- observability: aiEventClient imports from @tanstack/ai-event-client (the @tanstack/ai/event-client subpath does not exist).
- adapters: revert claude-haiku-3 -> claude-3-haiku (the id passed to anthropicText); clarify max_tokens auto-adjust; @elevenlabs/client (not @11labs/client); elevenlabs agentId optional, debug is DebugOption.
- structured-outputs: Standard JSON Schema /json-schema link; Zod v4.2+.

Model ids touched in these fixes use the latest per provider from model-meta.ts (gpt-5.5, claude-sonnet-4-6, etc.).

* docs: use latest per-provider models in examples

Sweep example model ids across the PR's docs to the latest available per provider, sourced from each adapter's model-meta.ts:
- OpenAI: gpt-5.2 -> gpt-5.5, gpt-5-mini -> gpt-5.4-mini
- Anthropic: claude-sonnet-4-5 -> claude-sonnet-4-6, claude-opus-4-6 -> claude-opus-4.8
- Gemini: gemini-2.0-flash -> gemini-3-flash-preview, image -> gemini-3.1-flash-image-preview, tts -> gemini-3.1-flash-tts-preview

Every replacement id was verified present in model-meta.ts. Intentional cases preserved: negative/capability-contrast examples (per-model-type-safety), the claude-3-haiku web_search note, model enumeration/availability tables, DALL-E and o-series demos, and the Cencori pass-through ids (external provider, no in-repo model-meta).

* docs: reconcile thinking examples with modelOptions sampling convention

After rebasing onto main (which merged #660 moving temperature/topP/maxTokens into modelOptions), fix the two spots where our edits still assumed top-level sampling:
- thinking-content.md: max_tokens now lives in modelOptions alongside the thinking budget (was a top-level maxTokens).
- anthropic.md: drop the stale "auto-raises top-level maxTokens" note; budget_tokens must be below modelOptions.max_tokens.

(ollama.md and typed-options.md conflicts were resolved to main's new convention during the rebase.)

* docs: migrate ag-ui-compliance forwardedProps allowlist to modelOptions

The "safe allowlist" example still forwarded temperature/maxTokens as top-level chat() options (missed by #660's sampling-into-modelOptions migration). Map them into modelOptions under OpenAI's native keys (temperature / max_output_tokens) so the example type-checks.

* docs: remove deprecated observability + protocol pages, drop all casts

- Remove the deprecated Observability page (event-client observability is superseded; otelMiddleware is the supported path) and its nav entry + inbound links.
- Remove the protocol pages (chunk-definitions, sse-protocol, http-stream-protocol); TanStack AI implements AG-UI, whose protocol is documented upstream; repoint the few inbound links to docs.ag-ui.com.
- Fix the broken ToolCacheStorage snippet (it imported the type then re-declared it) and verify the shape against source.
- Remove every `as <Type>` assertion cast from the docs (JSON-schema tool inputs, JSON.parse, formData, custom-event values, type brands, …), replacing them with typeof/in guards, type guards, typed annotations, or schema validation. `createModel`'s provider-option brand now uses a typed const instead of `{} as X`.
- CLAUDE.md / AGENTS.md: codify the docs conventions: no `as` casts in samples, use the latest model per provider from model-meta.ts, and show both server and client sides when a doc spans both.

* docs: latest model in built-in-middleware + concrete structured-output transform

- built-in-middleware.md: gpt-4o -> gpt-5.5 in the examples.
- middleware.md: make the "Transforming structured-output chunks" example self-contained: redact SSNs inline in the streaming JSON delta instead of calling an undefined `redact()` helper.

(The docs conventions, namely no casts, latest models, and show both sides, already live in the project CLAUDE.md / AGENTS.md; the earlier global-CLAUDE.md addition has been reverted.)
Summary
Moves sampling options (`temperature`, `topP`, `maxTokens`) off the root of `chat()` / `ai()` / `generate()` and into provider-native, fully-typed `modelOptions`. `modelOptions` is now the single sampling surface; there is no generic root-level mapping anymore. Each provider exposes its real SDK/API key names.

This continues and completes #499 by @harry-whorlow (which was built on the old `packages/typescript/` layout and could not be rebased onto the restructured `packages/` tree). Supersedes #499.

Provider-native `modelOptions` keys

Replacements for the former root-level `temperature`, `topP`, and `maxTokens`, one provider per row:
- `temperature`, `top_p`, `max_output_tokens`
- `temperature`, `top_p`, `max_tokens`
- `temperature`, `topP`, `maxOutputTokens`
- `temperature`, `top_p`, `max_tokens`
- `temperature`, `top_p`, `max_completion_tokens`
- `options.temperature`, `options.top_p`, `options.num_predict` (nested)
- `temperature`, `topP`, `maxCompletionTokens`
- `temperature`, `topP`, `maxOutputTokens`

What changed
- … `@tanstack/ai`): removed `temperature` / `topP` / `maxTokens` from `TextOptions`, the public activity options, and `ChatMiddlewareConfig` (+ structured-output config). Removed all engine plumbing. Middleware now adjusts sampling via `config.modelOptions`. `metadata` is unchanged and stays at the root.
- … `modelOptions` (native keys, verified against each provider's API). The provider-options types gained the sampling fields where missing (OpenAI, Anthropic, Gemini, Ollama; Grok/Groq/OpenRouter already had them).
- … `options` vs flat runtime spread); an OpenRouter `variant` field leaking into the request body; OTel `gen_ai.request.*` attributes now read a union of provider key spellings; `summarize({ maxLength })` now resolves the correct per-provider token key from the wrapped adapter (was about to become a silent no-op).
- … `move-sampling-to-model-options` (jscodeshift): resolves the provider from the adapter factory and rewrites keys per provider (incl. Ollama nesting); reports + skips on conflicts or unresolvable adapters. 12 fixture cases.
- … `docs/migration/sampling-options-to-model-options`), updated adapter pages, middleware `dynamic-temperature` example, API reference, and typed-options.
- … `adapter-configuration`, `middleware`, `chat-experience` SKILL.md + reference files updated to the new surface.
- … `minor` across `@tanstack/ai` + all affected adapter packages.

Migration

    pnpm codemod:move-sampling-to-model-options "src/**/*.{ts,tsx}"

Testing
- … `max_tokens` default + no spurious dropped-key warning).
- … `modelOptions`).
- `pnpm test:pr` green: `test:sherif, test:knip, test:docs, test:eslint, test:lib, test:types, test:build, build` across 33 projects.

Known follow-up (not blocking)

The `ts-code-mode-web` / `ts-react-chat` examples still pass a generic `maxTokens` on calls with dynamically-resolved adapters; the codemod can't infer a provider there, and they're excluded from CI (`examples/**`). They need per-example provider mapping in a follow-up.

🤖 Generated with Claude Code
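For call sites the codemod has to skip, the per-provider key mapping could be applied by hand along these lines. This is only a rough sketch: `toModelOptions`, `LegacySampling`, and the `KEYS` table are hypothetical names, not part of `@tanstack/ai`, and the provider-to-key pairing is an assumption based on the key names listed in this PR and each provider's public API, covering four providers only.

```typescript
// Hypothetical helper (not part of @tanstack/ai): maps the former root-level
// sampling options onto provider-native modelOptions keys.
type Provider = "openai" | "anthropic" | "gemini" | "ollama";

interface LegacySampling {
  temperature?: number;
  topP?: number;
  maxTokens?: number;
}

// [temperature key, top-p key, token-limit key] per provider (assumed pairing).
const KEYS: Record<Provider, readonly [string, string, string]> = {
  openai: ["temperature", "top_p", "max_output_tokens"],
  anthropic: ["temperature", "top_p", "max_tokens"],
  gemini: ["temperature", "topP", "maxOutputTokens"],
  ollama: ["temperature", "top_p", "num_predict"],
};

function toModelOptions(
  provider: Provider,
  legacy: LegacySampling,
): Record<string, unknown> {
  const [temperatureKey, topPKey, maxTokensKey] = KEYS[provider];
  const flat: Record<string, number> = {};

  // Copy only the options that were actually set, under the native key names.
  if (legacy.temperature !== undefined) flat[temperatureKey] = legacy.temperature;
  if (legacy.topP !== undefined) flat[topPKey] = legacy.topP;
  if (legacy.maxTokens !== undefined) flat[maxTokensKey] = legacy.maxTokens;

  // Ollama nests sampling under `options`; the other providers stay flat.
  if (provider === "ollama") {
    return Object.keys(flat).length > 0 ? { options: flat } : {};
  }
  return flat;
}

// Example: the result would be passed as `modelOptions` on the call.
const ollamaOptions = toModelOptions("ollama", { temperature: 0.2, maxTokens: 256 });
```

Keeping the mapping in a small lookup table keeps the Ollama nesting as the only special case, which mirrors how the PR describes the codemod's per-provider rewrite.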
Summary by CodeRabbit
Breaking Changes
New Features
Documentation