xml fallback for llama.cpp models#2732

Draft
areebahmeddd wants to merge 3 commits into docker:main from areebahmeddd:feat/xml-fallback

Conversation

@areebahmeddd areebahmeddd commented May 9, 2026

summary

closes #2263

  • some models served via docker model runner (like qwen3-coder and hermes) ignore the openai function-calling api and instead return tool calls as <tool_call>...</tool_call> text in the assistant message, so the tools were never actually executed
  • added a fallback that, after streaming completes, detects and extracts these xml blocks and converts them into real tool calls
  • suppressed the raw xml blocks from the tui while streaming so they never appear as visible chat output
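the fallback described above can be sketched roughly like this. this is an illustrative sketch, not the actual PR code: the names (`extractToolCalls`, `toolCall`) are made up here, and it assumes each `<tool_call>` block wraps a JSON payload with `name` and `arguments` fields, which is the convention qwen- and hermes-style templates typically use:

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

// toolCall mirrors the minimal shape of an openai-style function call.
type toolCall struct {
	Name      string          `json:"name"`
	Arguments json.RawMessage `json:"arguments"`
}

// toolCallRe matches <tool_call>...</tool_call> blocks, across newlines.
var toolCallRe = regexp.MustCompile(`(?s)<tool_call>\s*(.*?)\s*</tool_call>`)

// extractToolCalls scans the finished streamed text for xml-wrapped tool
// calls, parses the JSON payload inside each block, and returns the text
// with the raw xml stripped plus the recovered calls.
func extractToolCalls(text string) (string, []toolCall) {
	var calls []toolCall
	for _, m := range toolCallRe.FindAllStringSubmatch(text, -1) {
		var tc toolCall
		// blocks that don't parse as JSON are left out and simply removed
		if err := json.Unmarshal([]byte(m[1]), &tc); err == nil && tc.Name != "" {
			calls = append(calls, tc)
		}
	}
	cleaned := strings.TrimSpace(toolCallRe.ReplaceAllString(text, ""))
	return cleaned, calls
}

func main() {
	out := "let me check.\n<tool_call>\n{\"name\": \"get_weather\", \"arguments\": {\"city\": \"Berlin\"}}\n</tool_call>"
	cleaned, calls := extractToolCalls(out)
	fmt.Println(cleaned)
	for _, c := range calls {
		fmt.Printf("call: %s args=%s\n", c.Name, c.Arguments)
	}
}
```

running this only after the stream completes avoids the problem of an opening `<tool_call>` tag arriving in one chunk and its close in a later one.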

ai model disclosure

used vscode copilot (claude sonnet 4.6) to understand the issue context and help generate a fix. changes were self-reviewed and verified via:

task build
task test
task lint

Signed-off-by: Areeb Ahmed <areebahmed0709@gmail.com>


Development

Successfully merging this pull request may close these issues.

Tool call detection issue with DMR and qwen3-coder-30b-a3b-instruct-gguf:Q3_K_M
