
fix(tool_parser): fix func call parsing for native tool-call token#1062

Open
ConnorLi96 wants to merge 2 commits into main from connorli/fix-func-call-parsing-v2

Conversation


@ConnorLi96 ConnorLi96 commented Apr 8, 2026

Problem

Function calling is broken for models that emit native tool-call tokens (e.g. Kimi K2):

  1. When tool_choice is required or a specific function, the JSON schema parser is picked over the model-specific tool parser. Models like Kimi always output native tokens (<|tool_call_begin|>, etc.) regardless of tool_choice, so the JSON schema path fails silently.
  2. When the model skips </think> and jumps straight to <|tool_calls_section_begin|>, the reasoning parser treats everything — including tool-call tokens — as reasoning content. The tool parser never sees them.
  3. Some model variants emit <|func_start|>/<|func_end|> instead of <|tool_call_argument_begin|>/<|tool_call_end|>, and may produce multi-line JSON arguments. The KimiK2 parser regex rejects all of these.

Solution

  • Three-tier parser priority: explicitly configured parser > JSON schema > auto-detected parser. Models with a configured native parser always use it.
  • Configurable tool_section_start_markers in ParserConfig so the reasoning parser can bail out when tool-call markers appear (instead of hardcoding Kimi-specific tokens in the base parser).
  • Accept <|func_start|>/<|func_end|> as alternative delimiters in KimiK2 regex; add (?s) flag for multi-line JSON; use .min() to find earliest end delimiter.
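The `.min()` end-token lookup described in the last bullet can be sketched in plain std Rust (the function name here is illustrative, not the actual parser method):

```rust
// Return the byte offset of whichever end delimiter appears first in the
// buffered text, instead of always preferring <|tool_call_end|>.
fn earliest_end_token(text: &str) -> Option<usize> {
    [text.find("<|tool_call_end|>"), text.find("<|func_end|>")]
        .into_iter()
        .flatten() // drop the None for any delimiter that is absent
        .min()
}

fn main() {
    // <|func_end|> occurs before <|tool_call_end|>, so the split point is at
    // the earlier delimiter (byte offset 14), not the "preferred" one.
    let chunk = r#"{"query":"x"} <|func_end|><|tool_call_end|>"#;
    assert_eq!(earliest_end_token(chunk), Some(14));
}
```

This matters when a model variant emits one delimiter style while the other string also appears later in the buffer: an `or_else` chain would split at the wrong position.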

Changes

  • crates/reasoning_parser/src/traits.rs: add tool_section_start_markers to ParserConfig
  • crates/reasoning_parser/src/parsers/base.rs: bail out of reasoning mode when configured markers are found
  • crates/reasoning_parser/src/factory.rs + 7 parser files: inject markers for Kimi models only
  • crates/tool_parser/src/parsers/kimik2.rs: alternative delimiters, (?s), hyphens in IDs, .min() for end tokens
  • model_gateway/src/routers/grpc/regular/processor.rs: three-tier parser priority, force_native_parser flag
  • model_gateway/src/routers/grpc/regular/streaming.rs: same priority fix + clear used_json_schema when native parser wins
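A minimal sketch of the three-tier selection order from the Solution section (signature and names are illustrative assumptions, not the actual gateway code):

```rust
/// Priority: explicitly configured parser > JSON schema path > auto-detected parser.
fn select_tool_parser<'a>(
    configured: Option<&'a str>,
    used_json_schema: bool,
    auto_detected: Option<&'a str>,
) -> Option<&'a str> {
    if configured.is_some() {
        // A model with a configured native parser always uses it, because
        // models like Kimi K2 emit native tokens regardless of tool_choice.
        configured
    } else if used_json_schema {
        Some("json")
    } else {
        auto_detected
    }
}

fn main() {
    // Kimi K2 with tool_choice: "required": configured parser beats JSON schema.
    assert_eq!(select_tool_parser(Some("kimik2"), true, None), Some("kimik2"));
    // No configured parser: the JSON schema path is used as before.
    assert_eq!(select_tool_parser(None, true, Some("qwen")), Some("json"));
}
```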

Test Plan

  • Verified Kimi K2 function calling with tool_choice: "auto", "required", and specific function
  • Verified models without a configured tool parser still use JSON schema path
  • Tested streaming and non-streaming for both Chat Completions and Messages API

Checklist:

  • cargo +nightly fmt passes
  • cargo clippy --all-targets --all-features -- -D warnings passes

Closes #1110

Summary by CodeRabbit

  • New Features

    • Added support for alternative tool call delimiters for improved model compatibility.
    • Enhanced reasoning content parsing with automatic tool section detection and separation.
  • Improvements

    • Optimized tool parser selection and request routing logic.
    • Standardized parser configuration initialization for consistency across reasoning parsers.


coderabbitai bot commented Apr 8, 2026

Warning

Rate limit exceeded

@ConnorLi96 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 54 minutes and 5 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 54 minutes and 5 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8bedb765-a392-4cd7-a794-0e3f7ed4e645

📥 Commits

Reviewing files that changed from the base of the PR and between 350443f and 468f1df.

📒 Files selected for processing (14)
  • crates/reasoning_parser/src/factory.rs
  • crates/reasoning_parser/src/parsers/base.rs
  • crates/reasoning_parser/src/parsers/cohere_cmd.rs
  • crates/reasoning_parser/src/parsers/deepseek_r1.rs
  • crates/reasoning_parser/src/parsers/glm45.rs
  • crates/reasoning_parser/src/parsers/kimi.rs
  • crates/reasoning_parser/src/parsers/minimax.rs
  • crates/reasoning_parser/src/parsers/nano_v3.rs
  • crates/reasoning_parser/src/parsers/qwen3.rs
  • crates/reasoning_parser/src/parsers/step3.rs
  • crates/reasoning_parser/src/traits.rs
  • crates/tool_parser/src/parsers/kimik2.rs
  • model_gateway/src/routers/grpc/regular/processor.rs
  • model_gateway/src/routers/grpc/regular/streaming.rs
📝 Walkthrough

Walkthrough

Detects tool-section start markers in reasoning output and splits reasoning vs normal text (including during streaming), exposes marker config in ParserConfig, extends KimiK2 to accept alternative delimiters, and changes processor/streaming selection to prefer configured native tool parsers over the JSON-schema path when appropriate.

Changes

Cohort / File(s) Summary
Reasoning parser core & config
crates/reasoning_parser/src/parsers/base.rs, crates/reasoning_parser/src/traits.rs
Add tool_section_start_markers: Vec<String> to ParserConfig; implement find_tool_section_start and split logic to stop reasoning at the earliest tool-section marker (applies to detect_and_parse and streaming incremental paths).
Reasoning parser factory & registrations
crates/reasoning_parser/src/factory.rs
Introduce shared kimi_tool_markers and update parser registrations (closures use move + cloned markers) to inject tool_section_start_markers for specific parsers; use ..Default::default() when building ParserConfig.
Reasoning parser constructors
crates/reasoning_parser/src/parsers/...
cohere_cmd.rs, deepseek_r1.rs, glm45.rs, kimi.rs, minimax.rs, nano_v3.rs, qwen3.rs, step3.rs
Switch parser constructors to use ..Default::default() for ParserConfig (removing explicit stream/max-buffer/always_in_reasoning settings).
Tool parser (KimiK2) delimiter handling
crates/tool_parser/src/parsers/kimik2.rs
Accept alternative delimiters <|func_start|>/<|func_end|> alongside <|tool_call_argument_begin|>/<|tool_call_end|>; add (?s) for multi-line JSON, allow hyphens in IDs, and use .min() to pick the earliest end token.
Processor (non-streaming) parsing priority
model_gateway/src/routers/grpc/regular/processor.rs
Compute has_native_parser from configured_tool_parser; when has_native_parser && tool_parser_available, prefer and call parse_tool_calls(...) before JSON-schema bridge; otherwise preserve existing JSON-schema vs fallback logic.
Streaming parsing priority & flags
model_gateway/src/routers/grpc/regular/streaming.rs
Add force_native_parser derived from configured parser; adjust Chat and Messages streaming branches so specific-function JSON-only path runs only when not forcing native parser, and pass used_json_schema into incremental streaming only when not forcing native parser; select parser accordingly.
Tests & helpers
crates/reasoning_parser/src/parsers/base.rs (tests and helpers)
Add helper/config defaults use ..Default::default() and three tests covering non-streaming split, streaming split/state transition, and absence of tool markers.
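The marker detection described for base.rs above can be sketched roughly as follows (std-only; names and signatures are assumptions — the real logic lives on the parser and covers both the one-shot and streaming paths):

```rust
// Find the earliest configured tool-section marker, if any.
fn find_tool_section_start(text: &str, markers: &[String]) -> Option<usize> {
    markers.iter().filter_map(|m| text.find(m.as_str())).min()
}

// Split truncated reasoning: everything before the marker is reasoning content,
// the marker and everything after it is normal text handed to the tool parser.
fn split_at_tool_section<'a>(text: &'a str, markers: &[String]) -> (&'a str, &'a str) {
    match find_tool_section_start(text, markers) {
        Some(pos) => (text[..pos].trim_end(), &text[pos..]),
        None => (text, ""),
    }
}

fn main() {
    let markers = vec!["<|tool_calls_section_begin|>".to_string()];
    let (reasoning, rest) =
        split_at_tool_section("thinking here <|tool_calls_section_begin|>tokens", &markers);
    assert_eq!(reasoning, "thinking here");
    assert!(rest.starts_with("<|tool_calls_section_begin|>"));
}
```

Because the markers come from `ParserConfig` rather than being hardcoded, only parsers that register Kimi-style markers pay this cost; other models are unaffected.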

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant Processor as Processor
  participant ReasoningParser as Reasoning Parser
  participant ToolParser as Tool Parser
  participant JSONBridge as JSON-Bridge

  Client->>Processor: model output (streaming/non-streaming)
  Processor->>ReasoningParser: detect_and_parse_reasoning(text)
  alt finds tool-section marker
    ReasoningParser-->>Processor: reasoning_text (trimmed) + normal_text (from marker)
    Processor->>ToolParser: parse_tool_calls(normal_text)
  else no marker
    ReasoningParser-->>Processor: full reasoning_text
    alt configured native parser available && prioritized
      Processor->>ToolParser: parse_tool_calls(reasoning_text or later tool text)
    else
      Processor->>JSONBridge: parse_json_schema_response(...)
      JSONBridge-->>Processor: bridged result (may fall back to ToolParser)
    end
  end
  Processor->>Client: emit parsed result / streaming events

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • CatherineSue
  • key4ng
  • slin1237

Poem

🐰 I nibble markers, split the hay,
Thought-threads part where tool-calls play.
Delimiters twirl in tidy rows,
Parsers hop where the marker goes—
Hooray for neat parsing day! 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Description check: ✅ Passed. Skipped because CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title directly addresses the main fix: updating the tool parser to correctly handle native tool-call tokens from models like Kimi K2, in line with the changeset's core objective.
  • Linked Issues check: ✅ Passed. The PR implements all coding requirements from issue #1110: native parser priority, tool-section marker detection in the reasoning parser, alternative delimiter support in KimiK2, and priority fixes in the model gateway.
  • Out of Scope Changes check: ✅ Passed. All changes directly address issue #1110: parser priority logic, tool-section markers, KimiK2 delimiter support, and streaming/non-streaming fixes; the minor refactor to struct update syntax aligns with config normalization for the new markers field.
  • Docstring Coverage check: ✅ Passed. Docstring coverage is 100.00%, above the required threshold of 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@ConnorLi96 ConnorLi96 changed the title fix(tool_parser): fix func call parsing for native tool-call token mo… fix(tool_parser): fix func call parsing for native tool-call token Apr 8, 2026
@github-actions github-actions bot added grpc gRPC client and router changes tool-parser Tool/function call parser changes reasoning-parser Reasoning parser changes model-gateway Model gateway crate changes labels Apr 8, 2026

  // Pattern for removing completed tool calls
- let end_pattern = r"<\|tool_call_begin\|>.*?<\|tool_call_end\|>";
+ let end_pattern = r"<\|tool_call_begin\|>.*?(?:<\|tool_call_end\|>|<\|func_end\|>)";

🔴 Important: end_pattern is missing the (?s) flag that was added to tool_call_pattern and stream_pattern. When multi-line JSON is parsed successfully (via the (?s) in the extraction patterns), the tool_call_end_pattern used for buffer cleanup at line 289 won't match across newlines. This causes the else { self.buffer.clear() } fallback to fire, which discards all buffered content — including any subsequent tool call that has started accumulating in the buffer.

Scenario: model emits two tool calls where the first has multi-line JSON arguments. When the first completes, the buffer cleanup fails to match just the first call, clears everything, and the second tool call is lost.

Suggested change
let end_pattern = r"<\|tool_call_begin\|>.*?(?:<\|tool_call_end\|>|<\|func_end\|>)";
let end_pattern = r"(?s)<\|tool_call_begin\|>.*?(?:<\|tool_call_end\|>|<\|func_end\|>)";

  // Pattern for complete tool calls
- let tool_call_pattern = r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*<\|tool_call_argument_begin\|>\s*(?P<function_arguments>\{.*?\})\s*<\|tool_call_end\|>";
+ // Supports alternative delimiters: <|func_start|>/<|func_end|>; (?s) for multi-line JSON
+ let tool_call_pattern = r"(?s)<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*(?:<\|tool_call_argument_begin\|>\s*|<\|func_start\|>\s*)?(?P<function_arguments>\{.*?\})\s*(?:<\|tool_call_end\|>|<\|func_end\|>)";

🟡 Nit: The argument delimiter group (?:<\|tool_call_argument_begin\|>\s*|<\|func_start\|>\s*)? is entirely optional (trailing ?). This means the regex will also match tool calls with no delimiter between the ID and the JSON body, e.g. <|tool_call_begin|> functions.search:0 {"query":"x"} <|tool_call_end|>. If this relaxation is intentional (to be lenient with models), a brief comment would help. If not, dropping the ? and making one of the two delimiters required would be safer:

(?:<\|tool_call_argument_begin\|>\s*|<\|func_start\|>\s*)

// Assume reasoning was truncated before end token
// Don't consume tool call markers as reasoning content
if let Some(tool_pos) = processed_text.find("<|tool_calls_section_begin|>") {
let reasoning_text = processed_text[..tool_pos].trim().to_string();

🟡 Nit: The hardcoded "<|tool_calls_section_begin|>" marker ties the reasoning parser to Kimi K2's specific token format. If another model family uses a different tool-call marker while also skipping </think>, this won't catch it. Consider extracting this as a configurable field on ParserConfig (or at least a constant) so it's easier to extend and easier to spot the coupling.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request enhances tool call parsing by adding support for alternative delimiters in the KimiK2 parser and preventing tool call markers from being incorrectly consumed as reasoning content in the BaseReasoningParser. It also updates the model gateway to prioritize explicitly configured tool parsers over JSON schema parsing. Feedback suggests that hardcoding model-specific tokens in the base parser should be avoided by moving them to a configuration object to maintain generality. Additionally, it is recommended to use a more robust method for identifying the first occurrence of multiple possible end delimiters to ensure correct string splitting when multiple delimiter types are present.

if !processed_text.contains(&self.config.think_end_token) {
// Assume reasoning was truncated before end token
// Don't consume tool call markers as reasoning content
if let Some(tool_pos) = processed_text.find("<|tool_calls_section_begin|>") {

medium

Hardcoding the model-specific token <|tool_calls_section_begin|> in the BaseReasoningParser is problematic as it affects all models using this base implementation (e.g., DeepSeek, Qwen). If another model's reasoning content happens to contain this string (for example, when discussing Kimi's format), it will prematurely terminate the reasoning block. This token should ideally be moved to ParserConfig as an optional exit marker to maintain the generic nature of the base parser.

if self.in_reasoning && self.config.stream_reasoning {
// Stream the content immediately
// Some models skip </think> and go straight to tool calls
if let Some(tool_pos) = current_text.find("<|tool_calls_section_begin|>") {

medium

Similar to the non-streaming path, hardcoding <|tool_calls_section_begin|> here introduces model-specific logic into a generic parser. This can lead to unexpected behavior for other models if this specific sequence appears in their reasoning stream.

Comment on lines +243 to +244
let end_pos = argument_diff.find("<|tool_call_end|>")
.or_else(|| argument_diff.find("<|func_end|>"));

medium

The use of or_else here is potentially incorrect if both tokens are present in the argument_diff. It will prioritize <|tool_call_end|> even if <|func_end|> appears earlier in the string. While these are likely alternative delimiters, it is safer to find the minimum position of all possible end tokens to ensure the first occurrence is used for splitting.

                        let end_pos = [
                            argument_diff.find("<|tool_call_end|>"),
                            argument_diff.find("<|func_end|>"),
                        ].into_iter().flatten().min();
References
  1. When stripping specific marker strings from data that might contain structured formats like JSON, use whole-string matching instead of character-set matching. Character-set matching is unsafe as it can remove characters that are part of valid data, leading to data corruption.

Comment on lines +265 to +266
let end_pos2 = function_args.find("<|tool_call_end|>")
.or_else(|| function_args.find("<|func_end|>"));

medium

As noted above, using or_else for finding delimiters can lead to incorrect splitting if multiple delimiter types appear in the text. Finding the minimum index of all candidate tokens is more robust.

                        let end_pos2 = [
                            function_args.find("<|tool_call_end|>"),
                            function_args.find("<|func_end|>"),
                        ].into_iter().flatten().min();
References
  1. When stripping specific marker strings from data that might contain structured formats like JSON, use whole-string matching instead of character-set matching. Character-set matching is unsafe as it can remove characters that are part of valid data, leading to data corruption.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1ccf3f411b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +380 to +383
let tool_chunks = if is_specific_function
&& !(self.configured_tool_parser.is_some()
&& tool_parser_available)
{

P1 Badge Restore specific-function streaming branch for configured parsers

This new condition routes ToolChoice::Function through process_tool_calls_stream whenever a parser is configured, but that path still uses used_json_schema to select the JSON parser. In specific-function mode the model commonly streams arguments-only JSON (no name field), which JsonParser cannot turn into tool-call deltas, so streaming can emit no tool_calls (or leak raw text) instead of the expected call with the requested function name.



  // Pattern for removing completed tool calls
- let end_pattern = r"<\|tool_call_begin\|>.*?<\|tool_call_end\|>";
+ let end_pattern = r"<\|tool_call_begin\|>.*?(?:<\|tool_call_end\|>|<\|func_end\|>)";

P2 Badge Make tool-call end regex multiline-aware

Multiline arguments are now supported via (?s) in the extraction regexes, but the end-token cleanup regex still uses .*? without dotall. When arguments contain newlines, end detection can fail, and the completion branch falls back to clearing the entire buffer, which drops trailing text or subsequent tool calls from the same chunk.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
model_gateway/src/routers/grpc/regular/streaming.rs (1)

380-409: ⚠️ Potential issue | 🔴 Critical

Don't keep used_json_schema true once a configured parser wins.

This guard skips the arguments-only shortcut, but Line 407 still passes used_json_schema into process_tool_calls_stream. For tool_choice=function/required with a configured native parser, that helper still instantiates Some("json"), so native <|tool_call...|> streams are still routed to the JSON parser and silently miss tool calls.

🔧 Suggested fix
-                            let tool_chunks = if is_specific_function
-                                && !(self.configured_tool_parser.is_some()
-                                    && tool_parser_available)
+                            let force_native_parser =
+                                self.configured_tool_parser.is_some() && tool_parser_available;
+                            let use_json_parser = used_json_schema && !force_native_parser;
+                            let tool_chunks = if is_specific_function && !force_native_parser
                             {
                                 Self::process_specific_function_stream(
                                     &delta,
@@
                                 self.process_tool_calls_stream(
                                     &delta,
                                     index,
@@
-                                    used_json_schema,
+                                    use_json_parser,
                                 )
                                 .await
                             };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/regular/streaming.rs` around lines 380 - 409,
When a configured native parser wins the specific-function branch, the
used_json_schema flag must be cleared so subsequent processing doesn't route
native <|tool_call...|> streams to the JSON parser; update the branch that calls
Self::process_specific_function_stream to also reset used_json_schema (or set it
to None/false) when configured_tool_parser.is_some() && tool_parser_available is
true, ensuring process_tool_calls_stream later receives the cleared value;
reference symbols: process_specific_function_stream, process_tool_calls_stream,
used_json_schema, configured_tool_parser, tool_parser_available.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/tool_parser/src/parsers/kimik2.rs`:
- Around line 67-68: The regex stored in end_pattern (used to build
tool_call_end_pattern) is missing the dotall flag so it doesn't match multiline
JSON; update the pattern or construction of tool_call_end_pattern to enable
dot-matches-newline (for example prefix the pattern with (?s) like
"(?s)<\\|tool_call_begin\\|>.*?(?:<\\|tool_call_end\\|>|<\\|func_end\\|>)" or
build via Regex::new using RegexBuilder with dot_matches_newline(true)) so
tool_call_end_pattern will match across newlines and avoid dropping buffered
partial calls when tool_call_end_pattern.find() fails.

In `@model_gateway/src/routers/grpc/regular/streaming.rs`:
- Around line 1757-1759: The branch that falls back to streaming_tool_parser
should respect the configured parser override; compute a boolean (e.g.,
force_native_parser = self.configured_tool_parser.is_some() &&
tool_parser_available) and then choose parser_name = if used_json_schema &&
!force_native_parser { Some("json") } else {
self.configured_tool_parser.as_deref() } and use that parser_name when
constructing streaming_tool_parser so ToolChoice::Tool/Any won't feed native
tool-call tokens into the JSON parser.

---

Outside diff comments:
In `@model_gateway/src/routers/grpc/regular/streaming.rs`:
- Around line 380-409: When a configured native parser wins the
specific-function branch, the used_json_schema flag must be cleared so
subsequent processing doesn't route native <|tool_call...|> streams to the JSON
parser; update the branch that calls Self::process_specific_function_stream to
also reset used_json_schema (or set it to None/false) when
configured_tool_parser.is_some() && tool_parser_available is true, ensuring
process_tool_calls_stream later receives the cleared value; reference symbols:
process_specific_function_stream, process_tool_calls_stream, used_json_schema,
configured_tool_parser, tool_parser_available.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5c597fba-ac2e-406e-9aba-008bbaecb0a4

📥 Commits

Reviewing files that changed from the base of the PR and between d3fc32a and 1ccf3f4.

📒 Files selected for processing (4)
  • crates/reasoning_parser/src/parsers/base.rs
  • crates/tool_parser/src/parsers/kimik2.rs
  • model_gateway/src/routers/grpc/regular/processor.rs
  • model_gateway/src/routers/grpc/regular/streaming.rs


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7fa35d374f


Comment on lines +380 to +382
let force_native_parser =
self.configured_tool_parser.is_some() && tool_parser_available;
let tool_chunks = if is_specific_function && !force_native_parser {

P1 Badge Preserve specific-function stream path for configured JSON parser

This force_native_parser check treats any configured parser as a native-token parser, so configuring tool_parser=json now skips process_specific_function_stream and routes specific-function streaming through process_tool_calls_stream. In that mode the model output is typically arguments-only JSON (no name), and JsonParser::parse_incremental cannot emit tool-call deltas from it, so the stream can finish without any tool_calls for valid specific-function requests. This regression is triggered whenever a configured parser exists but does not provide native tool-call tokens.


Comment on lines +151 to +152
if self.configured_tool_parser.is_some() && tool_parser_available {
// Explicitly configured parser takes priority (models may emit native tokens regardless of tool_choice)

P1 Badge Keep JSON-schema parsing for specific function in non-streaming

Prioritizing any configured parser ahead of used_json_schema breaks specific-function non-streaming when the configured parser is json (or any parser expecting explicit function names): parse_tool_calls receives arguments-only JSON and returns no calls, while the previous JSON-schema path correctly synthesized the tool call using the selected function name. The same precedence pattern appears in both chat and messages non-streaming branches, so requests can silently lose tool_calls despite valid constrained output.



mergify bot commented Apr 9, 2026

Hi @ConnorLi96, this PR has merge conflicts that must be resolved before it can be merged. Please rebase your branch:

git fetch origin main
git rebase origin/main
# resolve any conflicts, then:
git push --force-with-lease

@mergify mergify bot added the needs-rebase PR has merge conflicts that need to be resolved label Apr 9, 2026
@ConnorLi96 ConnorLi96 force-pushed the connorli/fix-func-call-parsing-v2 branch from 7fa35d3 to 206a006 Compare April 9, 2026 12:03
@mergify mergify bot removed the needs-rebase PR has merge conflicts that need to be resolved label Apr 9, 2026
stream_reasoning,
max_buffer_size: DEFAULT_MAX_BUFFER_SIZE,
always_in_reasoning,
tool_section_start_markers: Vec::new(),

🟡 Nit: All tests use tool_section_start_markers: Vec::new(), so the new marker-detection logic in both detect_and_parse_reasoning (line 75) and parse_reasoning_streaming_incremental (line 149) has zero test coverage. Consider adding at least two tests:

  1. Non-streaming: reasoning text with a tool marker but no </think> → should split into reasoning + normal text at the marker.
  2. Streaming: feed reasoning chunks followed by a chunk containing the marker → should transition out of reasoning and return the marker text as normal_text.

Example sketch:

#[test]
fn test_tool_section_marker_ends_reasoning_non_streaming() {
    let config = ParserConfig {
        tool_section_start_markers: vec!["<|tool_calls_section_begin|>".to_string()],
        ..Default::default()
    };
    let mut parser = BaseReasoningParser::new(config);
    let result = parser
        .detect_and_parse_reasoning("<think>thinking here<|tool_calls_section_begin|>tool tokens")
        .unwrap();
    assert_eq!(result.reasoning_text, "thinking here");
    assert_eq!(result.normal_text, "<|tool_calls_section_begin|>tool tokens");
}


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
model_gateway/src/routers/grpc/regular/streaming.rs (1)

380-406: ⚠️ Potential issue | 🟠 Major

Don't treat an explicitly configured "json" parser as a native-parser override.

force_native_parser becomes true for any configured parser name. If the router is configured with tool_call_parser: "json", these branches now skip the JSON-schema-specific handling and route ToolChoice::Function / messages::ToolChoice::Tool through incremental parsing instead. That path has no request-side function-choice context, so arguments-only tool calls lose the selected function name and stop streaming correctly.

🔧 Proposed fix
- let force_native_parser =
-     self.configured_tool_parser.is_some() && tool_parser_available;
+ let force_native_parser = matches!(
+     self.configured_tool_parser.as_deref(),
+     Some(name) if name != "json"
+ ) && tool_parser_available;

Apply the same guard in both the chat and Messages streaming paths.

Also applies to: 1639-1649, 1757-1759

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/regular/streaming.rs` around lines 380 - 406,
The code sets force_native_parser = self.configured_tool_parser.is_some() &&
tool_parser_available which treats any explicitly configured parser (including
"json") as a native-parser override and skips JSON-schema-specific streaming;
update the guard so force_native_parser is true only when a non-JSON native
parser is configured (e.g., self.configured_tool_parser.as_deref() !=
Some("json") && tool_parser_available), and apply the same change in the other
streaming branches referenced (the chat/Messages paths that use
force_native_parser around process_specific_function_stream and
process_tool_calls_stream) so JSON-configured parser does not disable
JSON-schema incremental handling for ToolChoice::Function/Tool.
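Extracted as a pure function, the proposed guard is easy to check against a truth table. This is an illustrative sketch (the function name and standalone signature are hypothetical; the real code computes this inline on `self`):

```rust
// Hypothetical extraction of the proposed guard: only an explicitly
// configured, non-"json" parser should force the native tool-call path.
fn force_native_parser(configured: Option<&str>, tool_parser_available: bool) -> bool {
    matches!(configured, Some(name) if name != "json") && tool_parser_available
}
```

With this shape, `tool_call_parser: "json"` falls through to the JSON-schema handling instead of overriding it.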
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@model_gateway/src/routers/grpc/regular/processor.rs`:
- Around line 151-160: The JSON-schema handling is being skipped whenever
configured_tool_parser is set, which breaks cases where configured_tool_parser
== "json" because parse_tool_calls cannot recover tool_choice/function names;
update the Messages branch (and the similar branch around
parse_json_schema_response at the other site) to use the same guard as the first
branch: only call parse_tool_calls when configured_tool_parser.is_some() &&
tool_parser_available && configured_tool_parser.as_deref() != Some("json"),
otherwise call parse_json_schema_response(...) so tool_choice and function names
are preserved; adjust the logic around configured_tool_parser,
tool_parser_available, parse_tool_calls, and parse_json_schema_response to
ensure ToolChoice::Function / messages::ToolChoice::Tool are handled by
parse_json_schema_response when the configured parser is "json".


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b1e968b9-4b79-49b5-842c-886f69e488fa

📥 Commits

Reviewing files that changed from the base of the PR and between 1ccf3f4 and 206a006.

📒 Files selected for processing (14)
  • crates/reasoning_parser/src/factory.rs
  • crates/reasoning_parser/src/parsers/base.rs
  • crates/reasoning_parser/src/parsers/cohere_cmd.rs
  • crates/reasoning_parser/src/parsers/deepseek_r1.rs
  • crates/reasoning_parser/src/parsers/glm45.rs
  • crates/reasoning_parser/src/parsers/kimi.rs
  • crates/reasoning_parser/src/parsers/minimax.rs
  • crates/reasoning_parser/src/parsers/nano_v3.rs
  • crates/reasoning_parser/src/parsers/qwen3.rs
  • crates/reasoning_parser/src/parsers/step3.rs
  • crates/reasoning_parser/src/traits.rs
  • crates/tool_parser/src/parsers/kimik2.rs
  • model_gateway/src/routers/grpc/regular/processor.rs
  • model_gateway/src/routers/grpc/regular/streaming.rs

Comment thread model_gateway/src/routers/grpc/regular/processor.rs Outdated

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 147556504a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +247 to +249
let end_pos = argument_diff
    .find("<|tool_call_end|>")
    .or_else(|| argument_diff.find("<|func_end|>"));

P2: Pick earliest end token when parsing mixed Kimi delimiters

Now that parse_incremental accepts both <|tool_call_end|> and <|func_end|>, using find("<|tool_call_end|>").or_else(find("<|func_end|>")) can pick a later end marker instead of the first one in the buffer. If one call ends with <|func_end|> and a later call uses <|tool_call_end|>, the parser slices arguments through the next call’s tokens, so is_complete_json never succeeds and tool-call deltas can be dropped. Compute the minimum position across both delimiters before slicing.

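The fix the comment asks for can be sketched as a small helper (the name `earliest_end_delimiter` is hypothetical; the real code operates on `argument_diff` inline): take the minimum position across both end delimiters rather than preferring one.

```rust
// Sketch of the suggested fix: the earliest of either end delimiter wins,
// so a later <|tool_call_end|> can never shadow an earlier <|func_end|>.
fn earliest_end_delimiter(argument_diff: &str) -> Option<usize> {
    let a = argument_diff.find("<|tool_call_end|>");
    let b = argument_diff.find("<|func_end|>");
    match (a, b) {
        (Some(x), Some(y)) => Some(x.min(y)),
        (x, y) => x.or(y),
    }
}
```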

ConnorLi96 and others added 2 commits April 14, 2026 09:59
…dels

- Prioritize explicitly configured tool parser over JSON schema parsing
- Support alternative delimiters (<|func_start|>/<|func_end|>) in KimiK2 parser
- Prevent reasoning parser from consuming tool call markers when </think> is skipped

Signed-off-by: ConnorLi96 <ConnorLi96@users.noreply.github.com>
Made-with: Cursor
- Replace hardcoded `<|tool_calls_section_begin|>` in BaseReasoningParser
  with configurable `tool_section_start_markers` in ParserConfig, injected
  only for Kimi models that need it
- Fix `used_json_schema` flag leaking into `process_tool_calls_stream`
  when a configured native parser should take priority — compute
  `force_native_parser` and clear the JSON parser flag accordingly
- Same fix for Messages API streaming path: `streaming_tool_parser` now
  respects configured parser over JSON schema parser
- Add missing `(?s)` dotall flag to KimiK2 `end_pattern` regex so
  multi-line JSON arguments are correctly cleared from the buffer

Signed-off-by: ConnorLi96 <ConnorLi96@users.noreply.github.com>
Signed-off-by: ConnorLi96 <connorli@together.ai>
Made-with: Cursor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/reasoning_parser/src/factory.rs`:
- Around line 219-232: The ParserConfig for the "kimi_k25" registration
redundantly sets fields that match defaults; update the closure passed to
registry.register_parser("kimi_k25") to construct ParserConfig using struct
update syntax with ..Default::default() for the defaulted fields so only the
non-default tokens and tool_section_start_markers are specified (i.e., build
ParserConfig { think_start_token: ..., think_end_token: ...,
tool_section_start_markers: markers.clone(), ..Default::default() }) and keep
the
Box::new(BaseReasoningParser::new(config).with_model_type("kimi_k25".to_string()))
part unchanged.
- Around line 235-250: Simplify the ParserConfig construction inside the
registry.register_parser("kimi_thinking") closure by using struct update syntax
(..Default::default()) and only explicitly set the fields that actually differ
from the default: set always_in_reasoning: true and keep
tool_section_start_markers: markers.clone() (remove explicit think_start_token,
think_end_token, stream_reasoning, max_buffer_size if they match defaults);
update the config passed to BaseReasoningParser::new accordingly in this
closure.

In `@model_gateway/src/routers/grpc/regular/processor.rs`:
- Around line 151-154: Rename the local variable has_native_parser to
force_native_parser to match the streaming path; update its declaration and all
uses within the same function (the expression using
self.configured_tool_parser.as_deref().is_some_and(|p| p != "json")) and also
rename the corresponding variable in the Messages-handling branch where the same
logic is duplicated so both paths use the identical identifier
(force_native_parser) for consistency.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f1324192-1946-46c1-a12f-8422a02d91a7

📥 Commits

Reviewing files that changed from the base of the PR and between 206a006 and 350443f.

📒 Files selected for processing (14)
  • crates/reasoning_parser/src/factory.rs
  • crates/reasoning_parser/src/parsers/base.rs
  • crates/reasoning_parser/src/parsers/cohere_cmd.rs
  • crates/reasoning_parser/src/parsers/deepseek_r1.rs
  • crates/reasoning_parser/src/parsers/glm45.rs
  • crates/reasoning_parser/src/parsers/kimi.rs
  • crates/reasoning_parser/src/parsers/minimax.rs
  • crates/reasoning_parser/src/parsers/nano_v3.rs
  • crates/reasoning_parser/src/parsers/qwen3.rs
  • crates/reasoning_parser/src/parsers/step3.rs
  • crates/reasoning_parser/src/traits.rs
  • crates/tool_parser/src/parsers/kimik2.rs
  • model_gateway/src/routers/grpc/regular/processor.rs
  • model_gateway/src/routers/grpc/regular/streaming.rs

Comment on lines +219 to 232
registry.register_parser("kimi_k25", {
    let markers = kimi_tool_markers.clone();
    move || {
        let config = ParserConfig {
            think_start_token: "<think>".to_string(),
            think_end_token: "</think>".to_string(),
            stream_reasoning: true,
            max_buffer_size: DEFAULT_MAX_BUFFER_SIZE,
            always_in_reasoning: false,
            tool_section_start_markers: markers.clone(),
        };
        Box::new(BaseReasoningParser::new(config).with_model_type("kimi_k25".to_string()))
    }
});

🧹 Nitpick | 🔵 Trivial

Consider using ..Default::default() for consistency.

The explicit fields stream_reasoning, max_buffer_size, and always_in_reasoning: false all match the default values. For consistency with the deepseek_v31 and passthrough parser registrations, consider simplifying:

♻️ Suggested simplification
         registry.register_parser("kimi_k25", {
             let markers = kimi_tool_markers.clone();
             move || {
                 let config = ParserConfig {
                     think_start_token: "<think>".to_string(),
                     think_end_token: "</think>".to_string(),
-                    stream_reasoning: true,
-                    max_buffer_size: DEFAULT_MAX_BUFFER_SIZE,
-                    always_in_reasoning: false,
                     tool_section_start_markers: markers.clone(),
+                    ..Default::default()
                 };
                 Box::new(BaseReasoningParser::new(config).with_model_type("kimi_k25".to_string()))
             }
         });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/reasoning_parser/src/factory.rs` around lines 219 - 232, The
ParserConfig for the "kimi_k25" registration redundantly sets fields that match
defaults; update the closure passed to registry.register_parser("kimi_k25") to
construct ParserConfig using struct update syntax with ..Default::default() for
the defaulted fields so only the non-default tokens and
tool_section_start_markers are specified (i.e., build ParserConfig {
think_start_token: ..., think_end_token: ..., tool_section_start_markers:
markers.clone(), ..Default::default() }) and keep the
Box::new(BaseReasoningParser::new(config).with_model_type("kimi_k25".to_string()))
part unchanged.
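A self-contained mirror of the pattern being suggested (the struct here is an illustrative stand-in, not the crate's `ParserConfig`; field set and defaults are assumptions): with a `Default` impl in place, struct update syntax keeps only the non-default fields explicit.

```rust
// Hypothetical mirror of ParserConfig to illustrate ..Default::default().
#[derive(Debug)]
struct ParserConfig {
    think_start_token: String,
    think_end_token: String,
    stream_reasoning: bool,
    tool_section_start_markers: Vec<String>,
}

impl Default for ParserConfig {
    fn default() -> Self {
        Self {
            think_start_token: "<think>".to_string(),
            think_end_token: "</think>".to_string(),
            stream_reasoning: true,
            tool_section_start_markers: Vec::new(),
        }
    }
}

fn kimi_k25_config(markers: Vec<String>) -> ParserConfig {
    // Only the field that differs from the default is spelled out.
    ParserConfig {
        tool_section_start_markers: markers,
        ..Default::default()
    }
}
```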

Comment thread crates/reasoning_parser/src/factory.rs
Comment on lines +151 to +154
let has_native_parser = self
    .configured_tool_parser
    .as_deref()
    .is_some_and(|p| p != "json");

🧹 Nitpick | 🔵 Trivial

Minor naming inconsistency with streaming path.

The variable is named has_native_parser here but force_native_parser in streaming.rs. While the logic is identical, consistent naming would improve maintainability.

♻️ Suggested naming alignment
-            let has_native_parser = self
+            let force_native_parser = self
                 .configured_tool_parser
                 .as_deref()
                 .is_some_and(|p| p != "json");

-            if has_native_parser && tool_parser_available {
+            if force_native_parser && tool_parser_available {

Apply the same rename in the Messages path at line 637.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/regular/processor.rs` around lines 151 - 154,
Rename the local variable has_native_parser to force_native_parser to match the
streaming path; update its declaration and all uses within the same function
(the expression using self.configured_tool_parser.as_deref().is_some_and(|p| p
!= "json")) and also rename the corresponding variable in the Messages-handling
branch where the same logic is duplicated so both paths use the identical
identifier (force_native_parser) for consistency.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 468f1dfa1b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +149 to +152
if let Some(tool_pos) = self.find_tool_section_start(&current_text) {
    let reasoning_text = current_text[..tool_pos].trim().to_string();
    let normal_text = current_text[tool_pos..].to_string();
    self.buffer.clear();

P1: Buffer partial tool-section markers across streaming chunks

The new reasoning bailout only checks for a full tool-section marker in current_text, but when no full marker is found this branch still streams current_text as reasoning and clears the buffer. If <|tool_calls_section_begin|> is split across chunk boundaries (a common streaming pattern), the first chunk drops the marker prefix, so later chunks never match the marker and the parser keeps treating tool-call tokens as reasoning. In that case, streaming tool calls are silently lost even though this code path is meant to recover when </think> is missing.

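One way to implement the hold-back the comment describes, sketched standalone (the helper name is hypothetical): before emitting a chunk as reasoning, compute the longest buffer suffix that is a proper prefix of any marker, and carry that suffix into the next chunk instead of emitting it.

```rust
// Hypothetical sketch: length of the longest suffix of `buf` that is a
// proper prefix of any tool-section marker. That suffix must be held back,
// since the next chunk may complete the marker.
fn partial_marker_suffix_len(buf: &str, markers: &[&str]) -> usize {
    markers
        .iter()
        .map(|m| {
            // Try prefix lengths from longest to shortest; markers are ASCII,
            // so byte slicing is safe here.
            (1..m.len())
                .rev()
                .find(|&k| buf.ends_with(&m[..k]))
                .unwrap_or(0)
        })
        .max()
        .unwrap_or(0)
}
```

A streaming step would then emit only `buf[..buf.len() - held]` as reasoning and prepend the retained tail to the next chunk; a full marker is still handled by the existing find-and-split path.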


Labels

grpc gRPC client and router changes model-gateway Model gateway crate changes reasoning-parser Reasoning parser changes tool-parser Tool/function call parser changes


Development

Successfully merging this pull request may close these issues.

[Bug]: Model-specific tool parser is bypassed under constrained tool_choice (Kimi K2)
