[agent_loop, tool] fix: support hermes-format tool calls on gpt-oss tokenizer models #6481
Open
dafu-wu wants to merge 1 commit into
…s tokenizer is detected
When a model is based on a gpt-oss tokenizer but trained to emit hermes-style
<tool_call> output, two incompatible things happen simultaneously:
1. format: hermes correctly parses <tool_call> blocks in the model output.
2. apply_chat_template raises TemplateError because the gpt-oss jinja template
does not accept standard role:tool messages:
"Message has tool role, but there was no previous assistant message
with a tool call!"
The gpt-oss path already handles this by calling build_gpt_oss_tool_response_text
which manually encodes tool results as gpt-oss channel tokens, bypassing
apply_chat_template entirely.
Fix: detect the gpt-oss tokenizer at init time by checking for the <|channel|>
special token. When detected, always use the manual gpt-oss tool response
formatter regardless of the configured tool parser name, so models that output
hermes-format tool calls but use a gpt-oss tokenizer do not crash.
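The detection and routing described above could look roughly like the sketch below. It is illustrative only, not the code in this PR: `is_gpt_oss_tokenizer`, `pick_tool_response_encoder`, and the two encoder arguments are hypothetical names, and the sketch assumes the tokenizer exposes a Hugging Face style `get_vocab()` that includes special tokens.

```python
from typing import Callable, Protocol


class TokenizerLike(Protocol):
    """Minimal tokenizer surface assumed here (Hugging Face style get_vocab)."""

    def get_vocab(self) -> dict[str, int]: ...


# Special token used as the gpt-oss marker, as described in the commit message.
GPT_OSS_MARKER = "<|channel|>"


def is_gpt_oss_tokenizer(tokenizer: TokenizerLike) -> bool:
    """Return True when the vocabulary contains the gpt-oss channel token."""
    return GPT_OSS_MARKER in tokenizer.get_vocab()


def pick_tool_response_encoder(
    tokenizer: TokenizerLike,
    tool_parser_name: str,
    manual_encoder: Callable[[str], str],
    template_encoder: Callable[[str], str],
) -> Callable[[str], str]:
    """Choose how tool results are encoded, independent of the tool call parser.

    manual_encoder stands in for a formatter that bypasses apply_chat_template;
    template_encoder stands in for the normal chat template path. Both are
    hypothetical placeholders.
    """
    if tool_parser_name == "gpt-oss" or is_gpt_oss_tokenizer(tokenizer):
        return manual_encoder
    return template_encoder
```

Running the check once at init time and caching the result keeps the per-turn path cheap, which matches the "detect at init time" wording above.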
Code Review
This pull request introduces a check to detect the gpt-oss tokenizer by looking for the <|channel|> special token, ensuring that tool responses are manually formatted when this tokenizer is detected. The reviewer identified a critical issue where multimodal tool responses (which contain structured list content) are stringified as raw Python lists, corrupting the prompt format. A code suggestion was provided to extract and concatenate only the text parts from the content list before formatting.
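The reviewer's actual suggestion is not shown on this page. As a rough sketch of the idea, the helper below collapses a tool response's content into plain text before formatting. The function name `tool_content_to_text`, the empty-string separator, and the `{"type": "text", "text": ...}` item shape are assumptions for illustration.

```python
from typing import Any


def tool_content_to_text(content: Any, sep: str = "") -> str:
    """Collapse tool response content into a plain string.

    Strings pass through unchanged. For list content, only items shaped like
    {"type": "text", "text": ...} (or bare strings) are kept and joined with
    sep; other parts such as images are skipped rather than stringified.
    """
    if content is None:
        return ""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = []
        for item in content:
            if isinstance(item, str):
                parts.append(item)
            elif isinstance(item, dict) and item.get("type") == "text":
                parts.append(str(item.get("text", "")))
        return sep.join(parts)
    # Fallback for unexpected types: keep the previous str() behavior.
    return str(content)
```

Skipping non-text parts avoids embedding a raw Python list repr in the prompt, at the cost of dropping image content from the encoded tool result.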
What does this PR do?
Fixes a crash in the multi-turn tool agent loop when a model with a gpt-oss tokenizer emits hermes-style <tool_call> tool calls.
Models based on a gpt-oss tokenizer but SFT-trained to emit hermes-style <tool_call> blocks crash during multi-turn rollout with a TemplateError.
Root cause: format: hermes correctly parses <tool_call> and executes tools, but then calls apply_chat_template with a standard role: tool message. The gpt-oss jinja template rejects this format since it expects tool_calls as a structured attribute on the assistant message, not a separate role: tool entry.
Checklist Before Starting
Test
Reproduced the TemplateError crash by running multi-turn rollout with a gpt-oss-tokenizer model configured with format: hermes. After this fix, tool execution proceeds normally and tool results are encoded as gpt-oss channel tokens.
API and Usage Example
No API change. Behavior is automatically correct when a gpt-oss tokenizer is detected. Existing configs using format: hermes with gpt-oss tokenizer models will work without any change.
Design & Code Changes
verl/experimental/agent_loop/tool_agent_loop.py
The gpt-oss and gemma4 parsers already have dedicated manual tool response formatters that bypass apply_chat_template. The fix detects whether the loaded tokenizer is gpt-oss-style (presence of the <|channel|> special token) and routes tool response encoding through build_gpt_oss_tool_response_text regardless of which parser is configured.
This decouples the tool call extraction format (tool_parser_name, e.g. hermes or gpt-oss) from the tool response encoding format (gpt-oss channel tokens vs. chat template), allowing models that mix gpt-oss framing with hermes-format tool outputs to work correctly.
Checklist Before Submitting
pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always
ci-request channel.