fix: mistral tool parser drops calls with leading whitespace #8772
Open
nhodge01 wants to merge 1 commit into
Mistral's TRT-LLM build emits " [{...}]" (note the leading space)
rather than "[TOOL_CALLS][{...}]" or "[{...}]" — the chat template's
BOS token handling produces a leading space token. With the original
strict startswith check, neither the bot_token nor "[" branches
matched, and the parser bailed, treating the JSON tool-call payload
as plain content.
Symptom: tool_choice: "auto" returned the tool call as a raw string
in `content` rather than a structured `tool_calls` array. The engine
produced the right answer; the parser just couldn't see it. Affects
parallel tool calls in particular.
Strip leading whitespace before the startswith check. The detection
uses the stripped string; the response still uses the original
full_text for content, so non-tool responses with leading whitespace
pass through unchanged.
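
The detection change described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the parser code from this PR: the function name `parse_tool_calls`, the constant `BOT_TOKEN`, and the returned dict shape are hypothetical; only `stripped`, `full_text`, and the `startswith` check come from the description.

```python
import json

# Assumed marker token; the description refers to it as the bot_token.
BOT_TOKEN = "[TOOL_CALLS]"


def parse_tool_calls(full_text: str) -> dict:
    """Hypothetical sketch: detect a Mistral-style tool-call payload."""
    # Detection runs on a copy with leading whitespace removed,
    # so " [{...}]" is treated the same as "[{...}]".
    stripped = full_text.lstrip()

    if stripped.startswith(BOT_TOKEN):
        payload = stripped[len(BOT_TOKEN):]
    elif stripped.startswith("["):
        payload = stripped
    else:
        # Not a tool call: return the original text, whitespace intact.
        return {"content": full_text, "tool_calls": None}

    try:
        calls = json.loads(payload)
    except json.JSONDecodeError:
        # Looked like a tool call but is not valid JSON: treat as content.
        return {"content": full_text, "tool_calls": None}

    # Require a list of objects that each name a function.
    is_call_list = isinstance(calls, list) and all(
        isinstance(c, dict) and "name" in c for c in calls
    )
    if not is_call_list:
        return {"content": full_text, "tool_calls": None}

    return {"content": None, "tool_calls": calls}
```

Keeping `full_text` untouched on the fallback paths is what lets ordinary responses that begin with whitespace pass through unchanged.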
Summary
The Mistral tool-call parser bails on detection when the model output has leading whitespace before `[TOOL_CALLS]` or `[`. This is the common case for Mistral on TRT-LLM: the chat template's BOS handling produces a leading space token, so generated text starts with `" [{...}]"` rather than `"[{...}]"`. With the strict `startswith` check, neither branch matches and the parser falls through to the content-only path.

Symptom: `tool_choice: "auto"` returns the tool call as a raw JSON string in `content` rather than a structured `tool_calls` array. Most visible on parallel tool calls, which almost always come back with the leading space.

Reproduction
Against a Mistral Small 3.x model served via Triton with `--tool-call-parser=mistral`:

Before: `content: " [{\"name\": \"get_weather\", ...}, {...}]"`, `tool_calls: null`, `finish_reason: stop`
After: `content: null`, `tool_calls: [{...}, {...}]`, `finish_reason: tool_calls`

Fix
`stripped` is used only for the detection. The response still uses the original `full_text` for `content`, so non-tool responses where the model legitimately starts with whitespace pass through unchanged.

Testing
Manual end-to-end testing with Mistral Small 24B Instruct 2501 on Triton `26.03-trtllm-python-py3` (NVFP4, TP=2):
- `tool_choice: "required"`: passes before and after
- `tool_choice: "auto"`: fails before, passes after
- `tool_choice: "auto"`: fails before, passes after

I have not exercised the `L0_*` test suite locally; happy to add or run any specific tests if helpful.

CLA
CLA submission in flight to triton-cla@nvidia.com.