fix(tool_parser): fix func call parsing for native tool-call token#1062
ConnorLi96 wants to merge 2 commits into main
Conversation
📝 Walkthrough

Detects tool-section start markers in reasoning output and splits reasoning vs normal text (including during streaming), exposes marker config in ParserConfig, extends KimiK2 to accept alternative delimiters, and changes processor/streaming selection to prefer configured native tool parsers over the JSON-schema path when appropriate.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Client
    participant Processor as Processor
    participant ReasoningParser as Reasoning Parser
    participant ToolParser as Tool Parser
    participant JSONBridge as JSON-Bridge
    Client->>Processor: model output (streaming/non-streaming)
    Processor->>ReasoningParser: detect_and_parse_reasoning(text)
    alt finds tool-section marker
        ReasoningParser-->>Processor: reasoning_text (trimmed) + normal_text (from marker)
        Processor->>ToolParser: parse_tool_calls(normal_text)
    else no marker
        ReasoningParser-->>Processor: full reasoning_text
        alt configured native parser available && prioritized
            Processor->>ToolParser: parse_tool_calls(reasoning_text or later tool text)
        else
            Processor->>JSONBridge: parse_json_schema_response(...)
            JSONBridge-->>Processor: bridged result (may fall back to ToolParser)
        end
    end
    Processor->>Client: emit parsed result / streaming events
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
```diff
  // Pattern for removing completed tool calls
- let end_pattern = r"<\|tool_call_begin\|>.*?<\|tool_call_end\|>";
+ let end_pattern = r"<\|tool_call_begin\|>.*?(?:<\|tool_call_end\|>|<\|func_end\|>)";
```
🔴 Important: end_pattern is missing the (?s) flag that was added to tool_call_pattern and stream_pattern. When multi-line JSON is parsed successfully (via the (?s) in the extraction patterns), the tool_call_end_pattern used for buffer cleanup at line 289 won't match across newlines. This causes the else { self.buffer.clear() } fallback to fire, which discards all buffered content — including any subsequent tool call that has started accumulating in the buffer.
Scenario: model emits two tool calls where the first has multi-line JSON arguments. When the first completes, the buffer cleanup fails to match just the first call, clears everything, and the second tool call is lost.
```diff
- let end_pattern = r"<\|tool_call_begin\|>.*?(?:<\|tool_call_end\|>|<\|func_end\|>)";
+ let end_pattern = r"(?s)<\|tool_call_begin\|>.*?(?:<\|tool_call_end\|>|<\|func_end\|>)";
```
```diff
  // Pattern for complete tool calls
- let tool_call_pattern = r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*<\|tool_call_argument_begin\|>\s*(?P<function_arguments>\{.*?\})\s*<\|tool_call_end\|>";
+ // Supports alternative delimiters: <|func_start|>/<|func_end|>; (?s) for multi-line JSON
+ let tool_call_pattern = r"(?s)<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*(?:<\|tool_call_argument_begin\|>\s*|<\|func_start\|>\s*)?(?P<function_arguments>\{.*?\})\s*(?:<\|tool_call_end\|>|<\|func_end\|>)";
```
🟡 Nit: The argument delimiter group (?:<\|tool_call_argument_begin\|>\s*|<\|func_start\|>\s*)? is entirely optional (trailing ?). This means the regex will also match tool calls with no delimiter between the ID and the JSON body, e.g. <|tool_call_begin|> functions.search:0 {"query":"x"} <|tool_call_end|>. If this relaxation is intentional (to be lenient with models), a brief comment would help. If not, dropping the ? and making one of the two delimiters required would be safer:
```
(?:<\|tool_call_argument_begin\|>\s*|<\|func_start\|>\s*)
```
```rust
// Assume reasoning was truncated before end token
// Don't consume tool call markers as reasoning content
if let Some(tool_pos) = processed_text.find("<|tool_calls_section_begin|>") {
    let reasoning_text = processed_text[..tool_pos].trim().to_string();
```
🟡 Nit: The hardcoded "<|tool_calls_section_begin|>" marker ties the reasoning parser to Kimi K2's specific token format. If another model family uses a different tool-call marker while also skipping </think>, this won't catch it. Consider extracting this as a configurable field on ParserConfig (or at least a constant) so it's easier to extend and easier to spot the coupling.
Code Review
This pull request enhances tool call parsing by adding support for alternative delimiters in the KimiK2 parser and preventing tool call markers from being incorrectly consumed as reasoning content in the BaseReasoningParser. It also updates the model gateway to prioritize explicitly configured tool parsers over JSON schema parsing. Feedback suggests that hardcoding model-specific tokens in the base parser should be avoided by moving them to a configuration object to maintain generality. Additionally, it is recommended to use a more robust method for identifying the first occurrence of multiple possible end delimiters to ensure correct string splitting when multiple delimiter types are present.
```rust
if !processed_text.contains(&self.config.think_end_token) {
    // Assume reasoning was truncated before end token
    // Don't consume tool call markers as reasoning content
    if let Some(tool_pos) = processed_text.find("<|tool_calls_section_begin|>") {
```
Hardcoding the model-specific token <|tool_calls_section_begin|> in the BaseReasoningParser is problematic as it affects all models using this base implementation (e.g., DeepSeek, Qwen). If another model's reasoning content happens to contain this string (for example, when discussing Kimi's format), it will prematurely terminate the reasoning block. This token should ideally be moved to ParserConfig as an optional exit marker to maintain the generic nature of the base parser.
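The configurable-marker approach both reviewers converge on can be sketched in std-only Rust. The function name and the marker-list shape below are illustrative assumptions, not the PR's actual API:

```rust
// Sketch: split reasoning text at the earliest configured tool-section marker.
// `split_at_first_marker` is a hypothetical name, not a symbol from the PR.
fn split_at_first_marker<'a>(text: &'a str, markers: &[&str]) -> (&'a str, &'a str) {
    // Earliest occurrence of any configured marker wins.
    match markers.iter().filter_map(|m| text.find(*m)).min() {
        Some(pos) => (&text[..pos], &text[pos..]),
        None => (text, ""),
    }
}

fn main() {
    let markers = ["<|tool_calls_section_begin|>"];
    let (reasoning, normal) =
        split_at_first_marker("thinking here<|tool_calls_section_begin|>tool tokens", &markers);
    assert_eq!(reasoning, "thinking here");
    assert_eq!(normal, "<|tool_calls_section_begin|>tool tokens");
}
```

Keeping the markers in a list on the config (rather than one hardcoded string) is what lets other model families register their own exit tokens without touching the base parser.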
```rust
if self.in_reasoning && self.config.stream_reasoning {
    // Stream the content immediately
    // Some models skip </think> and go straight to tool calls
    if let Some(tool_pos) = current_text.find("<|tool_calls_section_begin|>") {
```
```rust
let end_pos = argument_diff.find("<|tool_call_end|>")
    .or_else(|| argument_diff.find("<|func_end|>"));
```
The use of or_else here is potentially incorrect if both tokens are present in the argument_diff. It will prioritize <|tool_call_end|> even if <|func_end|> appears earlier in the string. While these are likely alternative delimiters, it is safer to find the minimum position of all possible end tokens to ensure the first occurrence is used for splitting.
```rust
let end_pos = [
    argument_diff.find("<|tool_call_end|>"),
    argument_diff.find("<|func_end|>"),
].into_iter().flatten().min();
```

References
- When stripping specific marker strings from data that might contain structured formats like JSON, use whole-string matching instead of character-set matching. Character-set matching is unsafe as it can remove characters that are part of valid data, leading to data corruption.
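The difference between `or_else` and taking the minimum shows up in a tiny std-only sketch (string and positions are for this sample only):

```rust
// Earliest end token across both delimiter variants, via min over positions.
fn earliest_end(text: &str) -> Option<usize> {
    ["<|tool_call_end|>", "<|func_end|>"]
        .into_iter()
        .filter_map(|tok| text.find(tok))
        .min()
}

fn main() {
    let diff = r#"{"q":"x"}<|func_end|>junk<|tool_call_end|>"#;
    // find("<|tool_call_end|>").or_else(...) would return Some(25) here,
    // slicing past the real boundary; the minimum correctly returns Some(9).
    assert_eq!(earliest_end(diff), Some(9));
}
```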
```rust
let end_pos2 = function_args.find("<|tool_call_end|>")
    .or_else(|| function_args.find("<|func_end|>"));
```
As noted above, using or_else for finding delimiters can lead to incorrect splitting if multiple delimiter types appear in the text. Finding the minimum index of all candidate tokens is more robust.
```rust
let end_pos2 = [
    function_args.find("<|tool_call_end|>"),
    function_args.find("<|func_end|>"),
].into_iter().flatten().min();
```

References
- When stripping specific marker strings from data that might contain structured formats like JSON, use whole-string matching instead of character-set matching. Character-set matching is unsafe as it can remove characters that are part of valid data, leading to data corruption.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1ccf3f411b
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
```rust
let tool_chunks = if is_specific_function
    && !(self.configured_tool_parser.is_some()
        && tool_parser_available)
{
```
Restore specific-function streaming branch for configured parsers
This new condition routes ToolChoice::Function through process_tool_calls_stream whenever a parser is configured, but that path still uses used_json_schema to select the JSON parser. In specific-function mode the model commonly streams arguments-only JSON (no name field), which JsonParser cannot turn into tool-call deltas, so streaming can emit no tool_calls (or leak raw text) instead of the expected call with the requested function name.
```diff
  // Pattern for removing completed tool calls
- let end_pattern = r"<\|tool_call_begin\|>.*?<\|tool_call_end\|>";
+ let end_pattern = r"<\|tool_call_begin\|>.*?(?:<\|tool_call_end\|>|<\|func_end\|>)";
```
Make tool-call end regex multiline-aware
Multiline arguments are now supported via (?s) in the extraction regexes, but the end-token cleanup regex still uses .*? without dotall. When arguments contain newlines, end detection can fail, and the completion branch falls back to clearing the entire buffer, which drops trailing text or subsequent tool calls from the same chunk.
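The drain-vs-clear structure this comment describes can be sketched std-only, with a literal token search standing in for the real regex; the function name is illustrative, only the delimiter tokens come from the PR:

```rust
// Simplified sketch of the streaming buffer cleanup: drain exactly one
// completed tool call, otherwise fall back to clearing the whole buffer.
fn cleanup_completed_call(buffer: &mut String) {
    let ends = ["<|tool_call_end|>", "<|func_end|>"];
    if let Some(start) = buffer.find("<|tool_call_begin|>") {
        // Earliest end token after the start marker closes the first call.
        let first_end = ends
            .iter()
            .filter_map(|t| buffer[start..].find(*t).map(|p| (start + p, t.len())))
            .min_by_key(|&(pos, _)| pos);
        if let Some((pos, len)) = first_end {
            buffer.replace_range(..pos + len, ""); // keep trailing content
            return;
        }
    }
    // Lossy fallback: when end detection misses (e.g. a non-dotall regex on
    // multi-line JSON), everything buffered is discarded, including a
    // partially accumulated second call.
    buffer.clear();
}

fn main() {
    let mut buf = String::from("<|tool_call_begin|>x<|func_end|><|tool_call_begin|>y");
    cleanup_completed_call(&mut buf);
    // Only the first, completed call is drained; the partial second survives.
    assert_eq!(buf, "<|tool_call_begin|>y");

    let mut lossy = String::from("<|tool_call_begin|>{\n\"multi\": \"line\"");
    cleanup_completed_call(&mut lossy);
    assert!(lossy.is_empty()); // the full-clear fallback the review flags
}
```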
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
model_gateway/src/routers/grpc/regular/streaming.rs (1)
380-409: ⚠️ Potential issue | 🔴 Critical

Don't keep `used_json_schema` true once a configured parser wins. This guard skips the arguments-only shortcut, but line 407 still passes `used_json_schema` into `process_tool_calls_stream`. For `tool_choice=function/required` with a configured native parser, that helper still instantiates `Some("json")`, so native `<|tool_call...|>` streams are still routed to the JSON parser and silently miss tool calls.

🔧 Suggested fix
```diff
- let tool_chunks = if is_specific_function
-     && !(self.configured_tool_parser.is_some()
-         && tool_parser_available)
+ let force_native_parser =
+     self.configured_tool_parser.is_some() && tool_parser_available;
+ let use_json_parser = used_json_schema && !force_native_parser;
+ let tool_chunks = if is_specific_function && !force_native_parser
  {
      Self::process_specific_function_stream(
          &delta,
@@
      self.process_tool_calls_stream(
          &delta,
          index,
@@
-         used_json_schema,
+         use_json_parser,
      )
      .await
  };
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@model_gateway/src/routers/grpc/regular/streaming.rs` around lines 380 - 409, When a configured native parser wins the specific-function branch, the used_json_schema flag must be cleared so subsequent processing doesn't route native <|tool_call...|> streams to the JSON parser; update the branch that calls Self::process_specific_function_stream to also reset used_json_schema (or set it to None/false) when configured_tool_parser.is_some() && tool_parser_available is true, ensuring process_tool_calls_stream later receives the cleared value; reference symbols: process_specific_function_stream, process_tool_calls_stream, used_json_schema, configured_tool_parser, tool_parser_available.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/tool_parser/src/parsers/kimik2.rs`:
- Around line 67-68: The regex stored in end_pattern (used to build
tool_call_end_pattern) is missing the dotall flag so it doesn't match multiline
JSON; update the pattern or construction of tool_call_end_pattern to enable
dot-matches-newline (for example prefix the pattern with (?s) like
"(?s)<\\|tool_call_begin\\|>.*?(?:<\\|tool_call_end\\|>|<\\|func_end\\|>)" or
build via Regex::new using RegexBuilder with dot_matches_newline(true)) so
tool_call_end_pattern will match across newlines and avoid dropping buffered
partial calls when tool_call_end_pattern.find() fails.
In `@model_gateway/src/routers/grpc/regular/streaming.rs`:
- Around line 1757-1759: The branch that falls back to streaming_tool_parser
should respect the configured parser override; compute a boolean (e.g.,
force_native_parser = self.configured_tool_parser.is_some() &&
tool_parser_available) and then choose parser_name = if used_json_schema &&
!force_native_parser { Some("json") } else {
self.configured_tool_parser.as_deref() } and use that parser_name when
constructing streaming_tool_parser so ToolChoice::Tool/Any won't feed native
tool-call tokens into the JSON parser.
---
Outside diff comments:
In `@model_gateway/src/routers/grpc/regular/streaming.rs`:
- Around line 380-409: When a configured native parser wins the
specific-function branch, the used_json_schema flag must be cleared so
subsequent processing doesn't route native <|tool_call...|> streams to the JSON
parser; update the branch that calls Self::process_specific_function_stream to
also reset used_json_schema (or set it to None/false) when
configured_tool_parser.is_some() && tool_parser_available is true, ensuring
process_tool_calls_stream later receives the cleared value; reference symbols:
process_specific_function_stream, process_tool_calls_stream, used_json_schema,
configured_tool_parser, tool_parser_available.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5c597fba-ac2e-406e-9aba-008bbaecb0a4
📒 Files selected for processing (4)
crates/reasoning_parser/src/parsers/base.rs
crates/tool_parser/src/parsers/kimik2.rs
model_gateway/src/routers/grpc/regular/processor.rs
model_gateway/src/routers/grpc/regular/streaming.rs
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7fa35d374f
```rust
let force_native_parser =
    self.configured_tool_parser.is_some() && tool_parser_available;
let tool_chunks = if is_specific_function && !force_native_parser {
```
Preserve specific-function stream path for configured JSON parser
This force_native_parser check treats any configured parser as a native-token parser, so configuring tool_parser=json now skips process_specific_function_stream and routes specific-function streaming through process_tool_calls_stream. In that mode the model output is typically arguments-only JSON (no name), and JsonParser::parse_incremental cannot emit tool-call deltas from it, so the stream can finish without any tool_calls for valid specific-function requests. This regression is triggered whenever a configured parser exists but does not provide native tool-call tokens.
```rust
if self.configured_tool_parser.is_some() && tool_parser_available {
    // Explicitly configured parser takes priority (models may emit native tokens regardless of tool_choice)
```
Keep JSON-schema parsing for specific function in non-streaming
Prioritizing any configured parser ahead of used_json_schema breaks specific-function non-streaming when the configured parser is json (or any parser expecting explicit function names): parse_tool_calls receives arguments-only JSON and returns no calls, while the previous JSON-schema path correctly synthesized the tool call using the selected function name. The same precedence pattern appears in both chat and messages non-streaming branches, so requests can silently lose tool_calls despite valid constrained output.
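What the comment says the previous JSON-schema path did for a specific-function request can be sketched as a hypothetical helper; the `ToolCall` type and function name below are illustrative, not the router's actual symbols:

```rust
// Sketch: constrained decoding for tool_choice = specific function typically
// yields arguments-only JSON (no "name" field), so the tool call must be
// synthesized with the function name taken from the request.
#[derive(Debug, PartialEq)]
struct ToolCall {
    name: String,
    arguments: String,
}

fn synthesize_specific_call(requested: &str, output: &str) -> Option<ToolCall> {
    let args = output.trim();
    // A bare JSON object carries no function name; reuse the requested one.
    if args.starts_with('{') && args.ends_with('}') {
        Some(ToolCall {
            name: requested.to_string(),
            arguments: args.to_string(),
        })
    } else {
        None
    }
}

fn main() {
    let call = synthesize_specific_call("search", r#"{"query": "rust"}"#).unwrap();
    assert_eq!(call.name, "search");
    assert_eq!(call.arguments, r#"{"query": "rust"}"#);
}
```

A parser that expects explicit function names in the output (like a generic JSON tool parser) has no way to recover `"search"` from that payload, which is why routing this case away from the schema path silently drops the call.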
Hi @ConnorLi96, this PR has merge conflicts that must be resolved before it can be merged. Please rebase your branch:

```shell
git fetch origin main
git rebase origin/main
# resolve any conflicts, then:
git push --force-with-lease
```
Force-pushed from 7fa35d3 to 206a006
```rust
stream_reasoning,
max_buffer_size: DEFAULT_MAX_BUFFER_SIZE,
always_in_reasoning,
tool_section_start_markers: Vec::new(),
```
🟡 Nit: All tests use tool_section_start_markers: Vec::new(), so the new marker-detection logic in both detect_and_parse_reasoning (line 75) and parse_reasoning_streaming_incremental (line 149) has zero test coverage. Consider adding at least two tests:
- Non-streaming: reasoning text with a tool marker but no `</think>` → should split into reasoning + normal text at the marker.
- Streaming: feed reasoning chunks followed by a chunk containing the marker → should transition out of reasoning and return the marker text as `normal_text`.
Example sketch:

```rust
#[test]
fn test_tool_section_marker_ends_reasoning_non_streaming() {
    let config = ParserConfig {
        tool_section_start_markers: vec!["<|tool_calls_section_begin|>".to_string()],
        ..Default::default()
    };
    let mut parser = BaseReasoningParser::new(config);
    let result = parser
        .detect_and_parse_reasoning("<think>thinking here<|tool_calls_section_begin|>tool tokens")
        .unwrap();
    assert_eq!(result.reasoning_text, "thinking here");
    assert_eq!(result.normal_text, "<|tool_calls_section_begin|>tool tokens");
}
```
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
model_gateway/src/routers/grpc/regular/streaming.rs (1)
380-406: ⚠️ Potential issue | 🟠 Major

Don't treat an explicitly configured "json" parser as a native-parser override. `force_native_parser` becomes true for any configured parser name. If the router is configured with `tool_call_parser: "json"`, these branches now skip the JSON-schema-specific handling and route `ToolChoice::Function`/`messages::ToolChoice::Tool` through incremental parsing instead. That path has no request-side function-choice context, so arguments-only tool calls lose the selected function name and are no longer streamed correctly.

🔧 Proposed fix

```diff
- let force_native_parser =
-     self.configured_tool_parser.is_some() && tool_parser_available;
+ let force_native_parser = matches!(
+     self.configured_tool_parser.as_deref(),
+     Some(name) if name != "json"
+ ) && tool_parser_available;
```

Apply the same guard in both the chat and Messages streaming paths.
Also applies to: 1639-1649, 1757-1759
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@model_gateway/src/routers/grpc/regular/streaming.rs` around lines 380 - 406, The code sets force_native_parser = self.configured_tool_parser.is_some() && tool_parser_available which treats any explicitly configured parser (including "json") as a native-parser override and skips JSON-schema-specific streaming; update the guard so force_native_parser is true only when a non-JSON native parser is configured (e.g., self.configured_tool_parser.as_deref() != Some("json") && tool_parser_available), and apply the same change in the other streaming branches referenced (the chat/Messages paths that use force_native_parser around process_specific_function_stream and process_tool_calls_stream) so JSON-configured parser does not disable JSON-schema incremental handling for ToolChoice::Function/Tool.
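The proposed guard can be isolated as a small predicate. The names follow the suggested fix above, but this is a standalone sketch, not the gateway's code:

```rust
// Only a configured parser other than "json" should force the native path;
// a configured "json" parser must keep the JSON-schema handling.
fn force_native_parser(configured: Option<&str>, tool_parser_available: bool) -> bool {
    matches!(configured, Some(name) if name != "json") && tool_parser_available
}

fn main() {
    assert!(force_native_parser(Some("kimi_k2"), true));
    assert!(!force_native_parser(Some("json"), true)); // JSON-schema path kept
    assert!(!force_native_parser(None, true));
    assert!(!force_native_parser(Some("kimi_k2"), false));
}
```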
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@model_gateway/src/routers/grpc/regular/processor.rs`:
- Around line 151-160: The JSON-schema handling is being skipped whenever
configured_tool_parser is set, which breaks cases where configured_tool_parser
== "json" because parse_tool_calls cannot recover tool_choice/function names;
update the Messages branch (and the similar branch around
parse_json_schema_response at the other site) to use the same guard as the first
branch: only call parse_tool_calls when configured_tool_parser.is_some() &&
tool_parser_available && configured_tool_parser.as_deref() != Some("json"),
otherwise call parse_json_schema_response(...) so tool_choice and function names
are preserved; adjust the logic around configured_tool_parser,
tool_parser_available, parse_tool_calls, and parse_json_schema_response to
ensure ToolChoice::Function / messages::ToolChoice::Tool are handled by
parse_json_schema_response when the configured parser is "json".
---
Outside diff comments:
In `@model_gateway/src/routers/grpc/regular/streaming.rs`:
- Around line 380-406: The code sets force_native_parser =
self.configured_tool_parser.is_some() && tool_parser_available which treats any
explicitly configured parser (including "json") as a native-parser override and
skips JSON-schema-specific streaming; update the guard so force_native_parser is
true only when a non-JSON native parser is configured (e.g.,
self.configured_tool_parser.as_deref() != Some("json") &&
tool_parser_available), and apply the same change in the other streaming
branches referenced (the chat/Messages paths that use force_native_parser around
process_specific_function_stream and process_tool_calls_stream) so
JSON-configured parser does not disable JSON-schema incremental handling for
ToolChoice::Function/Tool.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: b1e968b9-4b79-49b5-842c-886f69e488fa
📒 Files selected for processing (14)
crates/reasoning_parser/src/factory.rs
crates/reasoning_parser/src/parsers/base.rs
crates/reasoning_parser/src/parsers/cohere_cmd.rs
crates/reasoning_parser/src/parsers/deepseek_r1.rs
crates/reasoning_parser/src/parsers/glm45.rs
crates/reasoning_parser/src/parsers/kimi.rs
crates/reasoning_parser/src/parsers/minimax.rs
crates/reasoning_parser/src/parsers/nano_v3.rs
crates/reasoning_parser/src/parsers/qwen3.rs
crates/reasoning_parser/src/parsers/step3.rs
crates/reasoning_parser/src/traits.rs
crates/tool_parser/src/parsers/kimik2.rs
model_gateway/src/routers/grpc/regular/processor.rs
model_gateway/src/routers/grpc/regular/streaming.rs
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 147556504a
```rust
let end_pos = argument_diff
    .find("<|tool_call_end|>")
    .or_else(|| argument_diff.find("<|func_end|>"));
```
Pick earliest end token when parsing mixed Kimi delimiters
Now that parse_incremental accepts both <|tool_call_end|> and <|func_end|>, using find("<|tool_call_end|>").or_else(find("<|func_end|>")) can pick a later end marker instead of the first one in the buffer. If one call ends with <|func_end|> and a later call uses <|tool_call_end|>, the parser slices arguments through the next call’s tokens, so is_complete_json never succeeds and tool-call deltas can be dropped. Compute the minimum position across both delimiters before slicing.
Force-pushed from 350443f to 468f1df
…dels

- Prioritize explicitly configured tool parser over JSON schema parsing
- Support alternative delimiters (<|func_start|>/<|func_end|>) in KimiK2 parser
- Prevent reasoning parser from consuming tool call markers when </think> is skipped

Signed-off-by: ConnorLi96 <ConnorLi96@users.noreply.github.com>
Made-with: Cursor
- Replace hardcoded `<|tool_calls_section_begin|>` in BaseReasoningParser with configurable `tool_section_start_markers` in ParserConfig, injected only for Kimi models that need it
- Fix `used_json_schema` flag leaking into `process_tool_calls_stream` when a configured native parser should take priority — compute `force_native_parser` and clear the JSON parser flag accordingly
- Same fix for Messages API streaming path: `streaming_tool_parser` now respects configured parser over JSON schema parser
- Add missing `(?s)` dotall flag to KimiK2 `end_pattern` regex so multi-line JSON arguments are correctly cleared from the buffer

Signed-off-by: ConnorLi96 <ConnorLi96@users.noreply.github.com>
Signed-off-by: ConnorLi96 <connorli@together.ai>
Made-with: Cursor
Signed-off-by: ConnorLi96 <ConnorLi96@users.noreply.github.com>
Made-with: Cursor
Signed-off-by: ConnorLi96 <ConnorLi96@users.noreply.github.com>
Made-with: Cursor
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/reasoning_parser/src/factory.rs`:
- Around line 219-232: The ParserConfig for the "kimi_k25" registration
redundantly sets fields that match defaults; update the closure passed to
registry.register_parser("kimi_k25") to construct ParserConfig using struct
update syntax with ..Default::default() for the defaulted fields so only the
non-default tokens and tool_section_start_markers are specified (i.e., build
ParserConfig { think_start_token: ..., think_end_token: ...,
tool_section_start_markers: markers.clone(), ..Default::default() }) and keep
the
Box::new(BaseReasoningParser::new(config).with_model_type("kimi_k25".to_string()))
part unchanged.
- Around line 235-250: Simplify the ParserConfig construction inside the
registry.register_parser("kimi_thinking") closure by using struct update syntax
(..Default::default()) and only explicitly set the fields that actually differ
from the default: set always_in_reasoning: true and keep
tool_section_start_markers: markers.clone() (remove explicit think_start_token,
think_end_token, stream_reasoning, max_buffer_size if they match defaults);
update the config passed to BaseReasoningParser::new accordingly in this
closure.
In `@model_gateway/src/routers/grpc/regular/processor.rs`:
- Around line 151-154: Rename the local variable has_native_parser to
force_native_parser to match the streaming path; update its declaration and all
uses within the same function (the expression using
self.configured_tool_parser.as_deref().is_some_and(|p| p != "json")) and also
rename the corresponding variable in the Messages-handling branch where the same
logic is duplicated so both paths use the identical identifier
(force_native_parser) for consistency.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: f1324192-1946-46c1-a12f-8422a02d91a7
📒 Files selected for processing (14)
crates/reasoning_parser/src/factory.rs
crates/reasoning_parser/src/parsers/base.rs
crates/reasoning_parser/src/parsers/cohere_cmd.rs
crates/reasoning_parser/src/parsers/deepseek_r1.rs
crates/reasoning_parser/src/parsers/glm45.rs
crates/reasoning_parser/src/parsers/kimi.rs
crates/reasoning_parser/src/parsers/minimax.rs
crates/reasoning_parser/src/parsers/nano_v3.rs
crates/reasoning_parser/src/parsers/qwen3.rs
crates/reasoning_parser/src/parsers/step3.rs
crates/reasoning_parser/src/traits.rs
crates/tool_parser/src/parsers/kimik2.rs
model_gateway/src/routers/grpc/regular/processor.rs
model_gateway/src/routers/grpc/regular/streaming.rs
```rust
registry.register_parser("kimi_k25", {
    let markers = kimi_tool_markers.clone();
    move || {
        let config = ParserConfig {
            think_start_token: "<think>".to_string(),
            think_end_token: "</think>".to_string(),
            stream_reasoning: true,
            max_buffer_size: DEFAULT_MAX_BUFFER_SIZE,
            always_in_reasoning: false,
            tool_section_start_markers: markers.clone(),
        };
        Box::new(BaseReasoningParser::new(config).with_model_type("kimi_k25".to_string()))
    }
});
```
🧹 Nitpick | 🔵 Trivial
Consider using ..Default::default() for consistency.
The explicit fields stream_reasoning, max_buffer_size, and always_in_reasoning: false all match the default values. For consistency with the deepseek_v31 and passthrough parser registrations, consider simplifying:
♻️ Suggested simplification

```diff
  registry.register_parser("kimi_k25", {
      let markers = kimi_tool_markers.clone();
      move || {
          let config = ParserConfig {
              think_start_token: "<think>".to_string(),
              think_end_token: "</think>".to_string(),
-             stream_reasoning: true,
-             max_buffer_size: DEFAULT_MAX_BUFFER_SIZE,
-             always_in_reasoning: false,
              tool_section_start_markers: markers.clone(),
+             ..Default::default()
          };
          Box::new(BaseReasoningParser::new(config).with_model_type("kimi_k25".to_string()))
      }
  });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/reasoning_parser/src/factory.rs` around lines 219 - 232, The
ParserConfig for the "kimi_k25" registration redundantly sets fields that match
defaults; update the closure passed to registry.register_parser("kimi_k25") to
construct ParserConfig using struct update syntax with ..Default::default() for
the defaulted fields so only the non-default tokens and
tool_section_start_markers are specified (i.e., build ParserConfig {
think_start_token: ..., think_end_token: ..., tool_section_start_markers:
markers.clone(), ..Default::default() }) and keep the
Box::new(BaseReasoningParser::new(config).with_model_type("kimi_k25".to_string()))
part unchanged.
```rust
let has_native_parser = self
    .configured_tool_parser
    .as_deref()
    .is_some_and(|p| p != "json");
```
🧹 Nitpick | 🔵 Trivial
Minor naming inconsistency with streaming path.
The variable is named has_native_parser here but force_native_parser in streaming.rs. While the logic is identical, consistent naming would improve maintainability.
♻️ Suggested naming alignment
- let has_native_parser = self
+ let force_native_parser = self
.configured_tool_parser
.as_deref()
.is_some_and(|p| p != "json");
- if has_native_parser && tool_parser_available {
+ if force_native_parser && tool_parser_available {

Apply the same rename in the Messages path at line 637.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
let force_native_parser = self
    .configured_tool_parser
    .as_deref()
    .is_some_and(|p| p != "json");
if force_native_parser && tool_parser_available {
    // ... rest of the code
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@model_gateway/src/routers/grpc/regular/processor.rs` around lines 151 - 154,
Rename the local variable has_native_parser to force_native_parser to match the
streaming path; update its declaration and all uses within the same function
(the expression using self.configured_tool_parser.as_deref().is_some_and(|p| p
!= "json")) and also rename the corresponding variable in the Messages-handling
branch where the same logic is duplicated so both paths use the identical
identifier (force_native_parser) for consistency.
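The flag being renamed drives the parser-priority decision this PR introduces. A minimal sketch of that three-tier selection as a hypothetical free function (the real logic lives inside the router and differs in structure; the names here are illustrative):

```rust
// Hypothetical sketch: prefer a configured native tool parser, then the
// JSON-schema path, then plain text. Names are illustrative only.
fn choose_parser(configured_tool_parser: Option<&str>, tool_parser_available: bool) -> &'static str {
    let force_native_parser = configured_tool_parser.is_some_and(|p| p != "json");
    if force_native_parser && tool_parser_available {
        "native" // model-specific parser (e.g. KimiK2) wins over JSON schema
    } else if configured_tool_parser.is_some() {
        "json_schema" // fall back to the schema-constrained JSON path
    } else {
        "plain_text" // no tool parsing at all
    }
}

fn main() {
    assert_eq!(choose_parser(Some("kimi_k2"), true), "native");
    assert_eq!(choose_parser(Some("json"), true), "json_schema");
    assert_eq!(choose_parser(None, false), "plain_text");
    println!("ok");
}
```

This is why the explicit `p != "json"` check matters: a configured `"json"` parser should keep using the schema path, while any other configured parser forces the native route.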
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 468f1dfa1b
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
if let Some(tool_pos) = self.find_tool_section_start(&current_text) {
    let reasoning_text = current_text[..tool_pos].trim().to_string();
    let normal_text = current_text[tool_pos..].to_string();
    self.buffer.clear();
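A plausible shape for `find_tool_section_start` is an earliest-match scan over the configured markers. This is a hypothetical sketch, not the crate's actual implementation:

```rust
// Hypothetical sketch: earliest occurrence of any configured tool-section
// start marker in the buffered text. Not the crate's actual code.
fn find_tool_section_start(text: &str, markers: &[&str]) -> Option<usize> {
    markers.iter().filter_map(|m| text.find(m)).min()
}

fn main() {
    let markers = ["<|tool_calls_section_begin|>", "<|func_start|>"];
    let text = "thinking...<|tool_calls_section_begin|><|func_start|>";
    // The section-begin marker appears first, at byte offset 11.
    assert_eq!(find_tool_section_start(text, &markers), Some(11));
    assert_eq!(find_tool_section_start("no markers", &markers), None);
    println!("ok");
}
```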
Buffer partial tool-section markers across streaming chunks
The new reasoning bailout only checks for a full tool-section marker in current_text, but when no full marker is found this branch still streams current_text as reasoning and clears the buffer. If <|tool_calls_section_begin|> is split across chunk boundaries (a common streaming pattern), the first chunk drops the marker prefix, so later chunks never match the marker and the parser keeps treating tool-call tokens as reasoning. In that case, streaming tool calls are silently lost even though this code path is meant to recover when </think> is missing.
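One way to address this, sketched under the assumption that markers are ASCII and chunks can split at arbitrary byte boundaries: before streaming a chunk as reasoning, hold back the longest suffix that could still grow into a marker. The helper below is hypothetical, not the parser's actual code:

```rust
/// Length of the longest suffix of `text` that is a proper prefix of any
/// marker. A streamer can withhold that many bytes so a marker split across
/// chunk boundaries is not emitted as reasoning text and lost.
fn held_back_suffix_len(text: &str, markers: &[&str]) -> usize {
    let mut held = 0;
    for marker in markers {
        // Only suffixes shorter than the marker can be partial matches.
        let max_len = marker.len().saturating_sub(1).min(text.len());
        for len in (held + 1)..=max_len {
            if text.is_char_boundary(text.len() - len)
                && marker.starts_with(&text[text.len() - len..])
            {
                held = len;
            }
        }
    }
    held
}

fn main() {
    let markers = ["<|tool_calls_section_begin|>"];
    // "<|tool_" could still become the full marker: hold back 7 bytes.
    assert_eq!(held_back_suffix_len("reasoning <|tool_", &markers), 7);
    // Nothing marker-like at the end: safe to stream everything.
    assert_eq!(held_back_suffix_len("plain reasoning", &markers), 0);
    println!("ok");
}
```

The held-back bytes would stay in the buffer instead of being cleared, so the next chunk can complete the marker and trigger the bailout.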
Problem
Function calling is broken for models that emit native tool-call tokens (e.g. Kimi K2):
- When `tool_choice` is `required` or a specific function, the JSON schema parser is picked over the model-specific tool parser. Models like Kimi always output native tokens (`<|tool_call_begin|>`, etc.) regardless of `tool_choice`, so the JSON schema path fails silently.
- When the model skips `</think>` and jumps straight to `<|tool_calls_section_begin|>`, the reasoning parser treats everything — including tool-call tokens — as reasoning content. The tool parser never sees them.
- The model may emit `<|func_start|>`/`<|func_end|>` instead of `<|tool_call_argument_begin|>`/`<|tool_call_end|>`, and may produce multi-line JSON arguments. The KimiK2 parser regex rejects all of these.

Solution
- Add `tool_section_start_markers` to `ParserConfig` so the reasoning parser can bail out when tool-call markers appear (instead of hardcoding Kimi-specific tokens in the base parser).
- Accept `<|func_start|>`/`<|func_end|>` as alternative delimiters in the KimiK2 regex; add the `(?s)` flag for multi-line JSON; use `.min()` to find the earliest end delimiter.

Changes
- `crates/reasoning_parser/src/traits.rs`: add `tool_section_start_markers` to `ParserConfig`
- `crates/reasoning_parser/src/parsers/base.rs`: bail out of reasoning mode when configured markers are found
- `crates/reasoning_parser/src/factory.rs` + 7 parser files: inject markers for Kimi models only
- `crates/tool_parser/src/parsers/kimik2.rs`: alternative delimiters, `(?s)`, hyphens in IDs, `.min()` for end tokens
- `model_gateway/src/routers/grpc/regular/processor.rs`: three-tier parser priority, `force_native_parser` flag
- `model_gateway/src/routers/grpc/regular/streaming.rs`: same priority fix + clear `used_json_schema` when the native parser wins

Test Plan
- Tested with `tool_choice: "auto"`, `"required"`, and a specific function

Checklist:
- `cargo +nightly fmt` passes
- `cargo clippy --all-targets --all-features -- -D warnings` passes

Closes #1110
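The KimiK2 end-delimiter fix described in the solution can be illustrated without the parser's regex: scan the argument text for both supported end delimiters and take the earliest hit. A sketch using `str::find` (hypothetical helper, not the actual KimiK2 parser code):

```rust
/// Find where tool-call arguments end: the earliest occurrence of either
/// supported end delimiter, together with which delimiter matched.
fn find_args_end(text: &str) -> Option<(usize, &'static str)> {
    const END_TOKENS: [&str; 2] = ["<|tool_call_end|>", "<|func_end|>"];
    END_TOKENS
        .iter()
        .filter_map(|tok| text.find(tok).map(|pos| (pos, *tok)))
        .min_by_key(|&(pos, _)| pos)
}

fn main() {
    // The alternative delimiter appears first, right after the 14-byte JSON.
    let out = find_args_end("{\"city\": \"SF\"}<|func_end|>rest<|tool_call_end|>");
    assert_eq!(out, Some((14, "<|func_end|>")));
    assert_eq!(find_args_end("no delimiter here"), None);
    println!("ok");
}
```

Taking the minimum position rather than checking one token at a time is what prevents a later `<|tool_call_end|>` from swallowing content that actually ended at an earlier `<|func_end|>`.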
Summary by CodeRabbit
New Features
Improvements