Skip to content

[agent_loop, tool] fix: support hermes-format tool calls on gpt-oss tokenizer models#6481

Open
dafu-wu wants to merge 1 commit into
verl-project:mainfrom
dafu-wu:fix/tool-agent-loop-gpt-oss-hermes-compat-v2
Open

[agent_loop, tool] fix: support hermes-format tool calls on gpt-oss tokenizer models#6481
dafu-wu wants to merge 1 commit into
verl-project:mainfrom
dafu-wu:fix/tool-agent-loop-gpt-oss-hermes-compat-v2

Conversation

@dafu-wu
Copy link
Copy Markdown
Contributor

@dafu-wu dafu-wu commented May 26, 2026

What does this PR do?

Fixes a crash in the multi-turn tool agent loop when using a model with a gpt-oss tokenizer that emits hermes-style <tool_call> tool calls.

Models based on a gpt-oss tokenizer but SFT-trained to emit hermes-style <tool_call> blocks crash during multi-turn rollout with:

jinja2.exceptions.TemplateError: Message has tool role, but there was no
previous assistant message with a tool call!

Root cause: format: hermes correctly parses <tool_call> and executes tools, but then calls apply_chat_template with a standard role: tool message. The gpt-oss jinja template rejects this format since it expects tool_calls as a structured attribute on the assistant message, not a separate role: tool entry.

Checklist Before Starting

Test

Reproduced the TemplateError crash by running multi-turn rollout with a gpt-oss-tokenizer model configured with format: hermes. After this fix, tool execution proceeds normally and tool results are encoded as gpt-oss channel tokens.

API and Usage Example

No API change. Behavior is automatically correct when a gpt-oss tokenizer is detected. Existing configs using format: hermes with gpt-oss tokenizer models will work without any change.

Design & Code Changes

verl/experimental/agent_loop/tool_agent_loop.py

The gpt-oss and gemma4 parsers already have dedicated manual tool response formatters that bypass apply_chat_template. The fix detects whether the loaded tokenizer is gpt-oss-style (presence of <|channel|> special token) and routes tool response encoding through build_gpt_oss_tool_response_text regardless of which parser is configured:

# __init__: detect gpt-oss tokenizer once
_channel_token_id = self.tokenizer.convert_tokens_to_ids("<|channel|>")
self._is_gpt_oss_tokenizer = (
    _channel_token_id is not None and _channel_token_id != self.tokenizer.unk_token_id
)

# _handle_processing_tools_state: route accordingly
if self.tool_parser_name == "gpt-oss" or self._is_gpt_oss_tokenizer:
    tool_response_text = build_gpt_oss_tool_response_text(add_messages, tool_call_names)
    ...

This decouples the tool call extraction format (tool_parser_name, e.g. hermes or gpt-oss) from the tool response encoding format (gpt-oss channel tokens vs. chat template), allowing models that mix gpt-oss framing with hermes-format tool outputs to work correctly.

Checklist Before Submitting

  • Read the Contribute Guide.
  • Apply pre-commit checks: pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always
  • Add / Update the documentation.
  • Add unit or end-to-end test(s) to the CI workflow to cover all the code. If not feasible, explain why: the crash requires a real gpt-oss model checkpoint with hermes-format SFT, which is not feasible in CI.
  • Once your PR is ready for CI, send a message in the ci-request channel.

…s tokenizer is detected

When a model is based on a gpt-oss tokenizer but trained to emit hermes-style
<tool_call> output, two incompatible things happen simultaneously:

1. format: hermes correctly parses <tool_call> blocks in the model output.
2. apply_chat_template raises TemplateError because the gpt-oss jinja template
   does not accept standard role:tool messages:
     "Message has tool role, but there was no previous assistant message
      with a tool call!"

The gpt-oss path already handles this by calling build_gpt_oss_tool_response_text
which manually encodes tool results as gpt-oss channel tokens, bypassing
apply_chat_template entirely.

Fix: detect the gpt-oss tokenizer at init time by checking for the <|channel|>
special token. When detected, always use the manual gpt-oss tool response
formatter regardless of the configured tool parser name, so models that output
hermes-format tool calls but use a gpt-oss tokenizer do not crash.
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a check to detect the gpt-oss tokenizer by looking for the <|channel|> special token, ensuring that tool responses are manually formatted when this tokenizer is detected. The reviewer identified a critical issue where multimodal tool responses (which contain structured list content) are stringified as raw Python lists, corrupting the prompt format. A code suggestion was provided to extract and concatenate only the text parts from the content list before formatting.

Comment thread verl/experimental/agent_loop/tool_agent_loop.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant