🚀 The feature, motivation and pitch
GPT-OSS models use the Harmony chat format, whose response generation behavior differs from that of standard chat models. Even when
tool_choice="required" is set, these models tend to generate direct text responses instead of tool calls, resulting in a tool call
success rate of only 91%.
Harmony format models select one of three channels (final, analysis, commentary) after completing their internal processing. Tool
calls are only generated through the commentary channel with a specified recipient. Currently, there is no mechanism to enforce the
tool call path when tool_choice="required" is set for these models.
The proposed solution uses a LogitsProcessor-based enforcement approach:
- Detect the response pattern <|end|><|start|>assistant<|channel|> in the generated tokens
- Force the next tokens to be commentary to=, guaranteeing the tool call path
- Implement this as a PatternForcedSequenceLogitsProcessor built around a state machine (NORMAL → FORCING → NORMAL)
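The steps above can be sketched as a small state machine. This is a minimal illustration, not the implementation in PR #33306: the class name PatternForcedSequenceSketch, the integer token IDs, and the plain-list logits are all assumptions made for the example, and a real tokenizer may split commentary to= into a different number of tokens.

```python
from enum import Enum, auto
from typing import List, Sequence


class State(Enum):
    NORMAL = auto()
    FORCING = auto()


class PatternForcedSequenceSketch:
    """Hypothetical sketch: once `pattern` has been generated, force `forced`."""

    def __init__(self, pattern: Sequence[int], forced: Sequence[int]) -> None:
        # Both sequences are assumed to be non-empty lists of token IDs.
        self.pattern = list(pattern)
        self.forced = list(forced)
        self.state = State.NORMAL
        self.pos = 0  # index of the next forced token while in FORCING

    def __call__(self, output_ids: Sequence[int], logits: List[float]) -> List[float]:
        n = len(self.pattern)
        # NORMAL -> FORCING when the generated tokens end with the pattern.
        if self.state is State.NORMAL and list(output_ids[-n:]) == self.pattern:
            self.state = State.FORCING
            self.pos = 0

        if self.state is State.FORCING:
            # Allow exactly one token: every other logit becomes -inf.
            target = self.forced[self.pos]
            logits = [x if i == target else float("-inf") for i, x in enumerate(logits)]
            self.pos += 1
            if self.pos == len(self.forced):
                # FORCING -> NORMAL once the whole sequence has been forced.
                self.state = State.NORMAL
                self.pos = 0
        return logits


def demo() -> List[int]:
    # Toy vocabulary; these IDs are placeholders, not real GPT-OSS token IDs.
    END, START, ASSISTANT, CHANNEL, FINAL, COMMENTARY, TO_EQ, OTHER = range(8)
    proc = PatternForcedSequenceSketch(
        pattern=[END, START, ASSISTANT, CHANNEL],
        forced=[COMMENTARY, TO_EQ],
    )
    generated = [OTHER, END, START, ASSISTANT, CHANNEL]
    for _ in range(3):
        raw = [0.0] * 8
        raw[FINAL] = 5.0  # the unconstrained toy model prefers the final channel
        logits = proc(generated, raw)
        generated.append(max(range(len(logits)), key=logits.__getitem__))
    return generated


if __name__ == "__main__":
    print(demo())
```

In this toy run the two tokens after the pattern are forced to COMMENTARY and TO_EQ even though the raw logits favor FINAL, and the third token is sampled without constraint because the processor has returned to NORMAL.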
This positive enforcement approach is more robust than a bad_words blocking approach, which cannot guarantee that every unwanted
token is blocked (edge-case tokens such as " final" and " finally" can slip through).
Related PR: #33306
Alternatives
A bad_words-based blocking approach was considered, where tokens like final and analysis would be suppressed. However, this
approach could not guarantee 100% success because edge-case tokens (e.g., " final", " finally") could bypass the filter. The
LogitsProcessor-based positive enforcement approach guarantees 100% tool call generation by forcing the exact required token
sequence.
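The bypass described above can be illustrated with a toy example. The vocabulary, token IDs, logit values, and the helper names block_bad_words and pick_after_blocking below are all hypothetical. The point is only that a block list keyed on specific token IDs leaves near-duplicate tokens untouched.

```python
from typing import Dict, List, Set

# Hypothetical vocabulary: the leading-space variants are separate tokens.
VOCAB: Dict[str, int] = {
    "final": 0,
    "analysis": 1,
    "commentary": 2,
    " final": 3,
    " finally": 4,
}


def block_bad_words(logits: List[float], banned_ids: Set[int]) -> List[float]:
    """Set the logits of the banned token IDs to -inf and keep the rest."""
    return [float("-inf") if i in banned_ids else x for i, x in enumerate(logits)]


def pick_after_blocking() -> str:
    banned = {VOCAB["final"], VOCAB["analysis"]}
    # Made-up logits in which the model leans toward a final-style answer.
    logits = [4.0, 3.0, 1.0, 3.5, 2.0]
    masked = block_bad_words(logits, banned)
    best = max(range(len(masked)), key=masked.__getitem__)
    id_to_token = {i: t for t, i in VOCAB.items()}
    return id_to_token[best]


if __name__ == "__main__":
    print(repr(pick_after_blocking()))
```

Here the exact tokens final and analysis are suppressed, yet greedy selection still lands on " final" rather than commentary, which is the kind of gap that forcing the required sequence avoids.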
Additional context
Test results with the proposed implementation (PR #33306):