fix: mistral tool parser drops calls with leading whitespace#8772

Open
nhodge01 wants to merge 1 commit into triton-inference-server:main from nhodge01:fix/mistral-tool-parser-leading-whitespace

Conversation


@nhodge01 nhodge01 commented May 8, 2026

Summary

The Mistral tool-call parser bails on detection when the model output has leading whitespace before [TOOL_CALLS] or [. This is the common case for Mistral on TRT-LLM — the chat template's BOS handling produces a leading space token, so generated text starts with " [{...}]" rather than "[{...}]". With the strict startswith check, neither branch matches and the parser falls through to the content-only path.

Symptom: tool_choice: "auto" returns the tool call as a raw JSON string in content rather than a structured tool_calls array. Most visible on parallel tool calls, which almost always come back with the leading space.
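For illustration, the failing detection reduces to a strict prefix check that a single leading space defeats. This is a minimal sketch, not the parser's actual code; the function name is illustrative, and `bot_token` is assumed to be the literal "[TOOL_CALLS]" marker from Mistral's chat template:

```python
bot_token = "[TOOL_CALLS]"  # assumed value of self.bot_token

def looks_like_tool_call(full_text: str) -> bool:
    # Original strict check: a leading space defeats both branches.
    return full_text.startswith(bot_token) or full_text.startswith("[")

print(looks_like_tool_call('[{"name": "get_weather"}]'))   # True
print(looks_like_tool_call(' [{"name": "get_weather"}]'))  # False -> dropped
```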

Reproduction

Against a Mistral Small 3.x model served via Triton with --tool-call-parser=mistral:

curl -s -X POST http://localhost:9000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model":"ensemble",
    "messages":[{"role":"user","content":"Weather in SF AND book a hotel for next Friday"}],
    "tool_choice":"auto",
    "tools":[
      {"type":"function","function":{"name":"get_weather","parameters":{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}}},
      {"type":"function","function":{"name":"search_hotels","parameters":{"type":"object","properties":{"city":{"type":"string"},"checkin_date":{"type":"string"}},"required":["city","checkin_date"]}}}
    ]
  }'

Before: content: " [{\"name\": \"get_weather\", ...}, {...}]", tool_calls: null, finish_reason: stop
After: content: null, tool_calls: [{...}, {...}], finish_reason: tool_calls

Fix

-        if not (full_text.startswith(self.bot_token) or full_text.startswith("[")):
+        stripped = full_text.lstrip()
+        if not (stripped.startswith(self.bot_token) or stripped.startswith("[")):

stripped is used only for the detection. The response still uses the original full_text for content, so non-tool responses where the model legitimately starts with whitespace pass through unchanged.
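Sketched in context, the patched logic looks roughly like this (a simplified stand-in for the real parser method; `extract_tool_payload` and its return shape are assumptions for illustration):

```python
bot_token = "[TOOL_CALLS]"  # assumed value of self.bot_token

def extract_tool_payload(full_text: str):
    """Return (tool_payload, content): exactly one is non-None."""
    stripped = full_text.lstrip()
    # Detection uses the stripped text so a leading space can't defeat it.
    if not (stripped.startswith(bot_token) or stripped.startswith("[")):
        return None, full_text  # content-only path: original text untouched
    # Drop the bot token, if present, before JSON parsing of the call list.
    payload = stripped.removeprefix(bot_token).lstrip()
    return payload, None

print(extract_tool_payload(' [{"name": "get_weather"}]'))
print(extract_tool_payload(" The weather is sunny."))
```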

Testing

Manual end-to-end testing with Mistral Small 24B Instruct 2501 on Triton 26.03-trtllm-python-py3 (NVFP4, TP=2):

  • Single tool call with tool_choice: "required" — passes before and after
  • Single tool call with tool_choice: "auto" — fails before, passes after
  • Parallel tool calls with tool_choice: "auto" — fails before, passes after
  • Decline path (model chooses prose over tool) — passes before and after
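The matrix above can also be mirrored as quick assertions against the detection predicate. This is a standalone sketch, not part of the L0_* suite; the inline `detect` helper mimics the patched check rather than calling the real parser:

```python
def detect(text: str, bot_token: str = "[TOOL_CALLS]") -> bool:
    # Mimics the patched detection: strip, then check both prefixes.
    s = text.lstrip()
    return s.startswith(bot_token) or s.startswith("[")

assert detect('[{"name": "get_weather", "arguments": {"city": "SF"}}]')
assert detect(' [{"name": "a"}, {"name": "b"}]')   # parallel, leading space
assert detect("[TOOL_CALLS][{\"name\": \"a\"}]")
assert not detect("Sure, the weather in SF is sunny.")  # decline path
print("all cases pass")
```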

I have not exercised the L0_* test suite locally — happy to add or run any specific tests if helpful.

CLA

CLA submission in flight to triton-cla@nvidia.com.

  Mistral's TRT-LLM build emits " [{...}]" (note the leading space)
  rather than "[TOOL_CALLS][{...}]" or "[{...}]" — the chat template's
  BOS token handling produces a leading space token. With the original
  strict startswith check, neither the bot_token branch nor the "["
  branch matched, and the parser bailed, treating the JSON tool-call
  payload as plain content.

  Symptom: tool_choice: "auto" returned the tool call as a raw string
  in `content` rather than a structured `tool_calls` array. The engine
  produced the right answer; the parser just couldn't see it. Affects
  parallel tool calls in particular.

  Strip leading whitespace before the startswith check. The detection
  uses the stripped string; the response still uses the original
  full_text for content, so non-tool responses with leading whitespace
  pass through unchanged.
@nhodge01 nhodge01 marked this pull request as ready for review May 8, 2026 17:13
