
fix(tool_parser): fix func call parsing for native tool-call token#1062

Open
ConnorLi96 wants to merge 2 commits into main from connorli/fix-func-call-parsing-v2

Conversation


@ConnorLi96 ConnorLi96 commented Apr 8, 2026

Problem

Function calling is broken for models that emit native tool-call tokens (e.g. Kimi K2):

  1. When tool_choice is required or a specific function, the JSON schema parser is picked over the model-specific tool parser. Models like Kimi always output native tokens (<|tool_call_begin|>, etc.) regardless of tool_choice, so the JSON schema path fails silently.
  2. When the model skips </think> and jumps straight to <|tool_calls_section_begin|>, the reasoning parser treats everything — including tool-call tokens — as reasoning content. The tool parser never sees them.
  3. Some model variants emit <|func_start|>/<|func_end|> instead of <|tool_call_argument_begin|>/<|tool_call_end|>, and may produce multi-line JSON arguments. The KimiK2 parser regex rejects all of these.

Solution

  • Three-tier parser priority: explicitly configured parser > JSON schema > auto-detected parser. Models with a configured native parser always use it.
  • Configurable tool_section_start_markers in ParserConfig so the reasoning parser can bail out when tool-call markers appear (instead of hardcoding Kimi-specific tokens in the base parser).
  • Accept <|func_start|>/<|func_end|> as alternative delimiters in KimiK2 regex; add (?s) flag for multi-line JSON; use .min() to find earliest end delimiter.
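The `.min()` end-token lookup described in the last bullet can be sketched in plain std Rust (the function name here is illustrative, not the actual parser method):

```rust
// Return the byte offset of whichever end delimiter appears first in the
// buffered text, instead of always preferring <|tool_call_end|>.
fn earliest_end_token(text: &str) -> Option<usize> {
    [text.find("<|tool_call_end|>"), text.find("<|func_end|>")]
        .into_iter()
        .flatten() // drop the None for any delimiter that is absent
        .min()
}

fn main() {
    // <|func_end|> occurs before <|tool_call_end|>, so the split point is at
    // the earlier delimiter (byte offset 14), not the "preferred" one.
    let chunk = r#"{"query":"x"} <|func_end|><|tool_call_end|>"#;
    assert_eq!(earliest_end_token(chunk), Some(14));
}
```

This matters when a model variant emits one delimiter style while the other string also appears later in the buffer: an `or_else` chain would split at the wrong position.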

Changes

  • crates/reasoning_parser/src/traits.rs: add tool_section_start_markers to ParserConfig
  • crates/reasoning_parser/src/parsers/base.rs: bail out of reasoning mode when configured markers are found
  • crates/reasoning_parser/src/factory.rs + 7 parser files: inject markers for Kimi models only
  • crates/tool_parser/src/parsers/kimik2.rs: alternative delimiters, (?s), hyphens in IDs, .min() for end tokens
  • model_gateway/src/routers/grpc/regular/processor.rs: three-tier parser priority, force_native_parser flag
  • model_gateway/src/routers/grpc/regular/streaming.rs: same priority fix + clear used_json_schema when native parser wins
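A minimal sketch of the three-tier selection order from the Solution section (signature and names are illustrative assumptions, not the actual gateway code):

```rust
/// Priority: explicitly configured parser > JSON schema path > auto-detected parser.
fn select_tool_parser<'a>(
    configured: Option<&'a str>,
    used_json_schema: bool,
    auto_detected: Option<&'a str>,
) -> Option<&'a str> {
    if configured.is_some() {
        // A model with a configured native parser always uses it, because
        // models like Kimi K2 emit native tokens regardless of tool_choice.
        configured
    } else if used_json_schema {
        Some("json")
    } else {
        auto_detected
    }
}

fn main() {
    // Kimi K2 with tool_choice: "required": configured parser beats JSON schema.
    assert_eq!(select_tool_parser(Some("kimik2"), true, None), Some("kimik2"));
    // No configured parser: the JSON schema path is used as before.
    assert_eq!(select_tool_parser(None, true, Some("qwen")), Some("json"));
}
```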

Test Plan

  • Verified Kimi K2 function calling with tool_choice: "auto", "required", and specific function
  • Verified models without a configured tool parser still use JSON schema path
  • Tested streaming and non-streaming for both Chat Completions and Messages API

Checklist:

  • cargo +nightly fmt passes
  • cargo clippy --all-targets --all-features -- -D warnings passes

Closes #1110

Summary by CodeRabbit

  • New Features

    • Added support for alternative tool call delimiters for improved model compatibility.
    • Enhanced reasoning content parsing with automatic tool section detection and separation.
  • Improvements

    • Optimized tool parser selection and request routing logic.
    • Standardized parser configuration initialization for consistency across reasoning parsers.


coderabbitai bot commented Apr 8, 2026

Warning

Rate limit exceeded

@ConnorLi96 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 54 minutes and 5 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 54 minutes and 5 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8bedb765-a392-4cd7-a794-0e3f7ed4e645

📥 Commits

Reviewing files that changed from the base of the PR and between 350443f and 468f1df.

📒 Files selected for processing (14)
  • crates/reasoning_parser/src/factory.rs
  • crates/reasoning_parser/src/parsers/base.rs
  • crates/reasoning_parser/src/parsers/cohere_cmd.rs
  • crates/reasoning_parser/src/parsers/deepseek_r1.rs
  • crates/reasoning_parser/src/parsers/glm45.rs
  • crates/reasoning_parser/src/parsers/kimi.rs
  • crates/reasoning_parser/src/parsers/minimax.rs
  • crates/reasoning_parser/src/parsers/nano_v3.rs
  • crates/reasoning_parser/src/parsers/qwen3.rs
  • crates/reasoning_parser/src/parsers/step3.rs
  • crates/reasoning_parser/src/traits.rs
  • crates/tool_parser/src/parsers/kimik2.rs
  • model_gateway/src/routers/grpc/regular/processor.rs
  • model_gateway/src/routers/grpc/regular/streaming.rs
📝 Walkthrough

Walkthrough

Detects tool-section start markers in reasoning output and splits reasoning vs normal text (including during streaming), exposes marker config in ParserConfig, extends KimiK2 to accept alternative delimiters, and changes processor/streaming selection to prefer configured native tool parsers over the JSON-schema path when appropriate.

Changes

Cohort / File(s) Summary
Reasoning parser core & config
crates/reasoning_parser/src/parsers/base.rs, crates/reasoning_parser/src/traits.rs
Add tool_section_start_markers: Vec<String> to ParserConfig; implement find_tool_section_start and split logic to stop reasoning at the earliest tool-section marker (applies to detect_and_parse and streaming incremental paths).
Reasoning parser factory & registrations
crates/reasoning_parser/src/factory.rs
Introduce shared kimi_tool_markers and update parser registrations (closures use move + cloned markers) to inject tool_section_start_markers for specific parsers; use ..Default::default() when building ParserConfig.
Reasoning parser constructors
crates/reasoning_parser/src/parsers/...
cohere_cmd.rs, deepseek_r1.rs, glm45.rs, kimi.rs, minimax.rs, nano_v3.rs, qwen3.rs, step3.rs
Switch parser constructors to use ..Default::default() for ParserConfig (removing explicit stream/max-buffer/always_in_reasoning settings).
Tool parser (KimiK2) delimiter handling
crates/tool_parser/src/parsers/kimik2.rs
Accept alternative delimiters <|func_start|>/<|func_end|> alongside <|tool_call_argument_begin|>/<|tool_call_end|>; add (?s) for multi-line JSON, allow hyphens in IDs, and use .min() to pick the earliest end token.
Processor (non-streaming) parsing priority
model_gateway/src/routers/grpc/regular/processor.rs
Compute has_native_parser from configured_tool_parser; when has_native_parser && tool_parser_available, prefer and call parse_tool_calls(...) before JSON-schema bridge; otherwise preserve existing JSON-schema vs fallback logic.
Streaming parsing priority & flags
model_gateway/src/routers/grpc/regular/streaming.rs
Add force_native_parser derived from configured parser; adjust Chat and Messages streaming branches so specific-function JSON-only path runs only when not forcing native parser, and pass used_json_schema into incremental streaming only when not forcing native parser; select parser accordingly.
Tests & helpers
crates/reasoning_parser/src/parsers/base.rs (tests and helpers)
Add helper/config defaults use ..Default::default() and three tests covering non-streaming split, streaming split/state transition, and absence of tool markers.
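The marker detection described for base.rs above can be sketched roughly as follows (std-only; names and signatures are assumptions — the real logic lives on the parser and covers both the one-shot and streaming paths):

```rust
// Find the earliest configured tool-section marker, if any.
fn find_tool_section_start(text: &str, markers: &[String]) -> Option<usize> {
    markers.iter().filter_map(|m| text.find(m.as_str())).min()
}

// Split truncated reasoning: everything before the marker is reasoning content,
// the marker and everything after it is normal text handed to the tool parser.
fn split_at_tool_section<'a>(text: &'a str, markers: &[String]) -> (&'a str, &'a str) {
    match find_tool_section_start(text, markers) {
        Some(pos) => (text[..pos].trim_end(), &text[pos..]),
        None => (text, ""),
    }
}

fn main() {
    let markers = vec!["<|tool_calls_section_begin|>".to_string()];
    let (reasoning, rest) =
        split_at_tool_section("thinking here <|tool_calls_section_begin|>tokens", &markers);
    assert_eq!(reasoning, "thinking here");
    assert!(rest.starts_with("<|tool_calls_section_begin|>"));
}
```

Because the markers come from `ParserConfig` rather than being hardcoded, only parsers that register Kimi-style markers pay this cost; other models are unaffected.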

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant Processor as Processor
  participant ReasoningParser as Reasoning Parser
  participant ToolParser as Tool Parser
  participant JSONBridge as JSON-Bridge

  Client->>Processor: model output (streaming/non-streaming)
  Processor->>ReasoningParser: detect_and_parse_reasoning(text)
  alt finds tool-section marker
    ReasoningParser-->>Processor: reasoning_text (trimmed) + normal_text (from marker)
    Processor->>ToolParser: parse_tool_calls(normal_text)
  else no marker
    ReasoningParser-->>Processor: full reasoning_text
    alt configured native parser available && prioritized
      Processor->>ToolParser: parse_tool_calls(reasoning_text or later tool text)
    else
      Processor->>JSONBridge: parse_json_schema_response(...)
      JSONBridge-->>Processor: bridged result (may fall back to ToolParser)
    end
  end
  Processor->>Client: emit parsed result / streaming events

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • CatherineSue
  • key4ng
  • slin1237

Poem

🐰 I nibble markers, split the hay,
Thought-threads part where tool-calls play.
Delimiters twirl in tidy rows,
Parsers hop where the marker goes—
Hooray for neat parsing day! 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Description check: ✅ Passed. Skipped because CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title directly addresses the main fix: updating the tool parser to correctly handle native tool-call tokens from models like Kimi K2, in line with the changeset's core objective.
  • Linked Issues check: ✅ Passed. The PR implements all coding requirements from issue #1110: native parser priority, tool-section marker detection in the reasoning parser, alternative delimiter support in KimiK2, and priority fixes in the model gateway.
  • Out of Scope Changes check: ✅ Passed. All changes directly address issue #1110: parser priority logic, tool-section markers, KimiK2 delimiter support, and streaming/non-streaming fixes; the minor refactor to struct update syntax aligns with config normalization for the new markers field.
  • Docstring Coverage check: ✅ Passed. Docstring coverage is 100.00%, above the required threshold of 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@ConnorLi96 ConnorLi96 changed the title fix(tool_parser): fix func call parsing for native tool-call token mo… fix(tool_parser): fix func call parsing for native tool-call token Apr 8, 2026
@github-actions github-actions bot added grpc gRPC client and router changes tool-parser Tool/function call parser changes reasoning-parser Reasoning parser changes model-gateway Model gateway crate changes labels Apr 8, 2026

  // Pattern for removing completed tool calls
- let end_pattern = r"<\|tool_call_begin\|>.*?<\|tool_call_end\|>";
+ let end_pattern = r"<\|tool_call_begin\|>.*?(?:<\|tool_call_end\|>|<\|func_end\|>)";

🔴 Important: end_pattern is missing the (?s) flag that was added to tool_call_pattern and stream_pattern. When multi-line JSON is parsed successfully (via the (?s) in the extraction patterns), the tool_call_end_pattern used for buffer cleanup at line 289 won't match across newlines. This causes the else { self.buffer.clear() } fallback to fire, which discards all buffered content — including any subsequent tool call that has started accumulating in the buffer.

Scenario: model emits two tool calls where the first has multi-line JSON arguments. When the first completes, the buffer cleanup fails to match just the first call, clears everything, and the second tool call is lost.

Suggested change
let end_pattern = r"<\|tool_call_begin\|>.*?(?:<\|tool_call_end\|>|<\|func_end\|>)";
let end_pattern = r"(?s)<\|tool_call_begin\|>.*?(?:<\|tool_call_end\|>|<\|func_end\|>)";

  // Pattern for complete tool calls
- let tool_call_pattern = r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*<\|tool_call_argument_begin\|>\s*(?P<function_arguments>\{.*?\})\s*<\|tool_call_end\|>";
+ // Supports alternative delimiters: <|func_start|>/<|func_end|>; (?s) for multi-line JSON
+ let tool_call_pattern = r"(?s)<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*(?:<\|tool_call_argument_begin\|>\s*|<\|func_start\|>\s*)?(?P<function_arguments>\{.*?\})\s*(?:<\|tool_call_end\|>|<\|func_end\|>)";

🟡 Nit: The argument delimiter group (?:<\|tool_call_argument_begin\|>\s*|<\|func_start\|>\s*)? is entirely optional (trailing ?). This means the regex will also match tool calls with no delimiter between the ID and the JSON body, e.g. <|tool_call_begin|> functions.search:0 {"query":"x"} <|tool_call_end|>. If this relaxation is intentional (to be lenient with models), a brief comment would help. If not, dropping the ? and making one of the two delimiters required would be safer:

(?:<\|tool_call_argument_begin\|>\s*|<\|func_start\|>\s*)

// Assume reasoning was truncated before end token
// Don't consume tool call markers as reasoning content
if let Some(tool_pos) = processed_text.find("<|tool_calls_section_begin|>") {
let reasoning_text = processed_text[..tool_pos].trim().to_string();

🟡 Nit: The hardcoded "<|tool_calls_section_begin|>" marker ties the reasoning parser to Kimi K2's specific token format. If another model family uses a different tool-call marker while also skipping </think>, this won't catch it. Consider extracting this as a configurable field on ParserConfig (or at least a constant) so it's easier to extend and easier to spot the coupling.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request enhances tool call parsing by adding support for alternative delimiters in the KimiK2 parser and preventing tool call markers from being incorrectly consumed as reasoning content in the BaseReasoningParser. It also updates the model gateway to prioritize explicitly configured tool parsers over JSON schema parsing. Feedback suggests that hardcoding model-specific tokens in the base parser should be avoided by moving them to a configuration object to maintain generality. Additionally, it is recommended to use a more robust method for identifying the first occurrence of multiple possible end delimiters to ensure correct string splitting when multiple delimiter types are present.

if !processed_text.contains(&self.config.think_end_token) {
// Assume reasoning was truncated before end token
// Don't consume tool call markers as reasoning content
if let Some(tool_pos) = processed_text.find("<|tool_calls_section_begin|>") {

medium

Hardcoding the model-specific token <|tool_calls_section_begin|> in the BaseReasoningParser is problematic as it affects all models using this base implementation (e.g., DeepSeek, Qwen). If another model's reasoning content happens to contain this string (for example, when discussing Kimi's format), it will prematurely terminate the reasoning block. This token should ideally be moved to ParserConfig as an optional exit marker to maintain the generic nature of the base parser.

if self.in_reasoning && self.config.stream_reasoning {
// Stream the content immediately
// Some models skip </think> and go straight to tool calls
if let Some(tool_pos) = current_text.find("<|tool_calls_section_begin|>") {

medium

Similar to the non-streaming path, hardcoding <|tool_calls_section_begin|> here introduces model-specific logic into a generic parser. This can lead to unexpected behavior for other models if this specific sequence appears in their reasoning stream.

Comment on lines +243 to +244
let end_pos = argument_diff.find("<|tool_call_end|>")
.or_else(|| argument_diff.find("<|func_end|>"));

medium

The use of or_else here is potentially incorrect if both tokens are present in the argument_diff. It will prioritize <|tool_call_end|> even if <|func_end|> appears earlier in the string. While these are likely alternative delimiters, it is safer to find the minimum position of all possible end tokens to ensure the first occurrence is used for splitting.

                        let end_pos = [
                            argument_diff.find("<|tool_call_end|>"),
                            argument_diff.find("<|func_end|>"),
                        ].into_iter().flatten().min();
References
  1. When stripping specific marker strings from data that might contain structured formats like JSON, use whole-string matching instead of character-set matching. Character-set matching is unsafe as it can remove characters that are part of valid data, leading to data corruption.

Comment on lines +265 to +266
let end_pos2 = function_args.find("<|tool_call_end|>")
.or_else(|| function_args.find("<|func_end|>"));

medium

As noted above, using or_else for finding delimiters can lead to incorrect splitting if multiple delimiter types appear in the text. Finding the minimum index of all candidate tokens is more robust.

                        let end_pos2 = [
                            function_args.find("<|tool_call_end|>"),
                            function_args.find("<|func_end|>"),
                        ].into_iter().flatten().min();
References
  1. When stripping specific marker strings from data that might contain structured formats like JSON, use whole-string matching instead of character-set matching. Character-set matching is unsafe as it can remove characters that are part of valid data, leading to data corruption.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1ccf3f411b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +380 to +383
let tool_chunks = if is_specific_function
&& !(self.configured_tool_parser.is_some()
&& tool_parser_available)
{

P1 Badge Restore specific-function streaming branch for configured parsers

This new condition routes ToolChoice::Function through process_tool_calls_stream whenever a parser is configured, but that path still uses used_json_schema to select the JSON parser. In specific-function mode the model commonly streams arguments-only JSON (no name field), which JsonParser cannot turn into tool-call deltas, so streaming can emit no tool_calls (or leak raw text) instead of the expected call with the requested function name.



  // Pattern for removing completed tool calls
- let end_pattern = r"<\|tool_call_begin\|>.*?<\|tool_call_end\|>";
+ let end_pattern = r"<\|tool_call_begin\|>.*?(?:<\|tool_call_end\|>|<\|func_end\|>)";

P2 Badge Make tool-call end regex multiline-aware

Multiline arguments are now supported via (?s) in the extraction regexes, but the end-token cleanup regex still uses .*? without dotall. When arguments contain newlines, end detection can fail, and the completion branch falls back to clearing the entire buffer, which drops trailing text or subsequent tool calls from the same chunk.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
model_gateway/src/routers/grpc/regular/streaming.rs (1)

380-409: ⚠️ Potential issue | 🔴 Critical

Don't keep used_json_schema true once a configured parser wins.

This guard skips the arguments-only shortcut, but Line 407 still passes used_json_schema into process_tool_calls_stream. For tool_choice=function/required with a configured native parser, that helper still instantiates Some("json"), so native <|tool_call...|> streams are still routed to the JSON parser and silently miss tool calls.

🔧 Suggested fix
-                            let tool_chunks = if is_specific_function
-                                && !(self.configured_tool_parser.is_some()
-                                    && tool_parser_available)
+                            let force_native_parser =
+                                self.configured_tool_parser.is_some() && tool_parser_available;
+                            let use_json_parser = used_json_schema && !force_native_parser;
+                            let tool_chunks = if is_specific_function && !force_native_parser
                             {
                                 Self::process_specific_function_stream(
                                     &delta,
@@
                                 self.process_tool_calls_stream(
                                     &delta,
                                     index,
@@
-                                    used_json_schema,
+                                    use_json_parser,
                                 )
                                 .await
                             };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/regular/streaming.rs` around lines 380 - 409,
When a configured native parser wins the specific-function branch, the
used_json_schema flag must be cleared so subsequent processing doesn't route
native <|tool_call...|> streams to the JSON parser; update the branch that calls
Self::process_specific_function_stream to also reset used_json_schema (or set it
to None/false) when configured_tool_parser.is_some() && tool_parser_available is
true, ensuring process_tool_calls_stream later receives the cleared value;
reference symbols: process_specific_function_stream, process_tool_calls_stream,
used_json_schema, configured_tool_parser, tool_parser_available.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/tool_parser/src/parsers/kimik2.rs`:
- Around line 67-68: The regex stored in end_pattern (used to build
tool_call_end_pattern) is missing the dotall flag so it doesn't match multiline
JSON; update the pattern or construction of tool_call_end_pattern to enable
dot-matches-newline (for example prefix the pattern with (?s) like
"(?s)<\\|tool_call_begin\\|>.*?(?:<\\|tool_call_end\\|>|<\\|func_end\\|>)" or
build via Regex::new using RegexBuilder with dot_matches_newline(true)) so
tool_call_end_pattern will match across newlines and avoid dropping buffered
partial calls when tool_call_end_pattern.find() fails.

In `@model_gateway/src/routers/grpc/regular/streaming.rs`:
- Around line 1757-1759: The branch that falls back to streaming_tool_parser
should respect the configured parser override; compute a boolean (e.g.,
force_native_parser = self.configured_tool_parser.is_some() &&
tool_parser_available) and then choose parser_name = if used_json_schema &&
!force_native_parser { Some("json") } else {
self.configured_tool_parser.as_deref() } and use that parser_name when
constructing streaming_tool_parser so ToolChoice::Tool/Any won't feed native
tool-call tokens into the JSON parser.

---

Outside diff comments:
In `@model_gateway/src/routers/grpc/regular/streaming.rs`:
- Around line 380-409: When a configured native parser wins the
specific-function branch, the used_json_schema flag must be cleared so
subsequent processing doesn't route native <|tool_call...|> streams to the JSON
parser; update the branch that calls Self::process_specific_function_stream to
also reset used_json_schema (or set it to None/false) when
configured_tool_parser.is_some() && tool_parser_available is true, ensuring
process_tool_calls_stream later receives the cleared value; reference symbols:
process_specific_function_stream, process_tool_calls_stream, used_json_schema,
configured_tool_parser, tool_parser_available.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5c597fba-ac2e-406e-9aba-008bbaecb0a4

📥 Commits

Reviewing files that changed from the base of the PR and between d3fc32a and 1ccf3f4.

📒 Files selected for processing (4)
  • crates/reasoning_parser/src/parsers/base.rs
  • crates/tool_parser/src/parsers/kimik2.rs
  • model_gateway/src/routers/grpc/regular/processor.rs
  • model_gateway/src/routers/grpc/regular/streaming.rs


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7fa35d374f


Comment on lines +380 to +382
let force_native_parser =
self.configured_tool_parser.is_some() && tool_parser_available;
let tool_chunks = if is_specific_function && !force_native_parser {

P1 Badge Preserve specific-function stream path for configured JSON parser

This force_native_parser check treats any configured parser as a native-token parser, so configuring tool_parser=json now skips process_specific_function_stream and routes specific-function streaming through process_tool_calls_stream. In that mode the model output is typically arguments-only JSON (no name), and JsonParser::parse_incremental cannot emit tool-call deltas from it, so the stream can finish without any tool_calls for valid specific-function requests. This regression is triggered whenever a configured parser exists but does not provide native tool-call tokens.


Comment on lines +151 to +152
if self.configured_tool_parser.is_some() && tool_parser_available {
// Explicitly configured parser takes priority (models may emit native tokens regardless of tool_choice)

P1 Badge Keep JSON-schema parsing for specific function in non-streaming

Prioritizing any configured parser ahead of used_json_schema breaks specific-function non-streaming when the configured parser is json (or any parser expecting explicit function names): parse_tool_calls receives arguments-only JSON and returns no calls, while the previous JSON-schema path correctly synthesized the tool call using the selected function name. The same precedence pattern appears in both chat and messages non-streaming branches, so requests can silently lose tool_calls despite valid constrained output.



mergify bot commented Apr 9, 2026

Hi @ConnorLi96, this PR has merge conflicts that must be resolved before it can be merged. Please rebase your branch:

git fetch origin main
git rebase origin/main
# resolve any conflicts, then:
git push --force-with-lease

@mergify mergify bot added the needs-rebase PR has merge conflicts that need to be resolved label Apr 9, 2026
@ConnorLi96 ConnorLi96 force-pushed the connorli/fix-func-call-parsing-v2 branch from 7fa35d3 to 206a006 Compare April 9, 2026 12:03
@mergify mergify bot removed the needs-rebase PR has merge conflicts that need to be resolved label Apr 9, 2026
stream_reasoning,
max_buffer_size: DEFAULT_MAX_BUFFER_SIZE,
always_in_reasoning,
tool_section_start_markers: Vec::new(),

🟡 Nit: All tests use tool_section_start_markers: Vec::new(), so the new marker-detection logic in both detect_and_parse_reasoning (line 75) and parse_reasoning_streaming_incremental (line 149) has zero test coverage. Consider adding at least two tests:

  1. Non-streaming: reasoning text with a tool marker but no </think> → should split into reasoning + normal text at the marker.
  2. Streaming: feed reasoning chunks followed by a chunk containing the marker → should transition out of reasoning and return the marker text as normal_text.

Example sketch:

#[test]
fn test_tool_section_marker_ends_reasoning_non_streaming() {
    let config = ParserConfig {
        tool_section_start_markers: vec!["<|tool_calls_section_begin|>".to_string()],
        ..Default::default()
    };
    let mut parser = BaseReasoningParser::new(config);
    let result = parser
        .detect_and_parse_reasoning("<think>thinking here<|tool_calls_section_begin|>tool tokens")
        .unwrap();
    assert_eq!(result.reasoning_text, "thinking here");
    assert_eq!(result.normal_text, "<|tool_calls_section_begin|>tool tokens");
}


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
model_gateway/src/routers/grpc/regular/streaming.rs (1)

380-406: ⚠️ Potential issue | 🟠 Major

Don't treat an explicitly configured "json" parser as a native-parser override.

force_native_parser becomes true for any configured parser name. If the router is configured with tool_call_parser: "json", these branches now skip the JSON-schema-specific handling and route ToolChoice::Function / messages::ToolChoice::Tool through incremental parsing instead. That path has no request-side function-choice context, so arguments-only tool calls lose the selected function name and stop streaming correctly.

🔧 Proposed fix
- let force_native_parser =
-     self.configured_tool_parser.is_some() && tool_parser_available;
+ let force_native_parser = matches!(
+     self.configured_tool_parser.as_deref(),
+     Some(name) if name != "json"
+ ) && tool_parser_available;

Apply the same guard in both the chat and Messages streaming paths.

Also applies to: 1639-1649, 1757-1759

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/regular/streaming.rs` around lines 380 - 406,
The code sets force_native_parser = self.configured_tool_parser.is_some() &&
tool_parser_available which treats any explicitly configured parser (including
"json") as a native-parser override and skips JSON-schema-specific streaming;
update the guard so force_native_parser is true only when a non-JSON native
parser is configured (e.g., self.configured_tool_parser.as_deref() !=
Some("json") && tool_parser_available), and apply the same change in the other
streaming branches referenced (the chat/Messages paths that use
force_native_parser around process_specific_function_stream and
process_tool_calls_stream) so JSON-configured parser does not disable
JSON-schema incremental handling for ToolChoice::Function/Tool.
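Extracted as a pure function, the proposed guard is easy to check against a truth table. This is an illustrative sketch (the function name and standalone signature are hypothetical; the real code computes this inline on `self`):

```rust
// Hypothetical extraction of the proposed guard: only an explicitly
// configured, non-"json" parser should force the native tool-call path.
fn force_native_parser(configured: Option<&str>, tool_parser_available: bool) -> bool {
    matches!(configured, Some(name) if name != "json") && tool_parser_available
}
```

With this shape, `tool_call_parser: "json"` falls through to the JSON-schema handling instead of overriding it.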
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@model_gateway/src/routers/grpc/regular/processor.rs`:
- Around line 151-160: The JSON-schema handling is being skipped whenever
configured_tool_parser is set, which breaks cases where configured_tool_parser
== "json" because parse_tool_calls cannot recover tool_choice/function names;
update the Messages branch (and the similar branch around
parse_json_schema_response at the other site) to use the same guard as the first
branch: only call parse_tool_calls when configured_tool_parser.is_some() &&
tool_parser_available && configured_tool_parser.as_deref() != Some("json"),
otherwise call parse_json_schema_response(...) so tool_choice and function names
are preserved; adjust the logic around configured_tool_parser,
tool_parser_available, parse_tool_calls, and parse_json_schema_response to
ensure ToolChoice::Function / messages::ToolChoice::Tool are handled by
parse_json_schema_response when the configured parser is "json".


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b1e968b9-4b79-49b5-842c-886f69e488fa

📥 Commits

Reviewing files that changed from the base of the PR and between 1ccf3f4 and 206a006.

📒 Files selected for processing (14)
  • crates/reasoning_parser/src/factory.rs
  • crates/reasoning_parser/src/parsers/base.rs
  • crates/reasoning_parser/src/parsers/cohere_cmd.rs
  • crates/reasoning_parser/src/parsers/deepseek_r1.rs
  • crates/reasoning_parser/src/parsers/glm45.rs
  • crates/reasoning_parser/src/parsers/kimi.rs
  • crates/reasoning_parser/src/parsers/minimax.rs
  • crates/reasoning_parser/src/parsers/nano_v3.rs
  • crates/reasoning_parser/src/parsers/qwen3.rs
  • crates/reasoning_parser/src/parsers/step3.rs
  • crates/reasoning_parser/src/traits.rs
  • crates/tool_parser/src/parsers/kimik2.rs
  • model_gateway/src/routers/grpc/regular/processor.rs
  • model_gateway/src/routers/grpc/regular/streaming.rs

Comment thread model_gateway/src/routers/grpc/regular/processor.rs Outdated

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 147556504a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +247 to +249
let end_pos = argument_diff
    .find("<|tool_call_end|>")
    .or_else(|| argument_diff.find("<|func_end|>"));

P2: Pick earliest end token when parsing mixed Kimi delimiters

Now that parse_incremental accepts both <|tool_call_end|> and <|func_end|>, using find("<|tool_call_end|>").or_else(find("<|func_end|>")) can pick a later end marker instead of the first one in the buffer. If one call ends with <|func_end|> and a later call uses <|tool_call_end|>, the parser slices arguments through the next call’s tokens, so is_complete_json never succeeds and tool-call deltas can be dropped. Compute the minimum position across both delimiters before slicing.

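The fix the comment asks for can be sketched as a small helper (the name `earliest_end_delimiter` is hypothetical; the real code operates on `argument_diff` inline): take the minimum position across both end delimiters rather than preferring one.

```rust
// Sketch of the suggested fix: the earliest of either end delimiter wins,
// so a later <|tool_call_end|> can never shadow an earlier <|func_end|>.
fn earliest_end_delimiter(argument_diff: &str) -> Option<usize> {
    let a = argument_diff.find("<|tool_call_end|>");
    let b = argument_diff.find("<|func_end|>");
    match (a, b) {
        (Some(x), Some(y)) => Some(x.min(y)),
        (x, y) => x.or(y),
    }
}
```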

ConnorLi96 and others added 2 commits April 14, 2026 09:59
…dels

- Prioritize explicitly configured tool parser over JSON schema parsing
- Support alternative delimiters (<|func_start|>/<|func_end|>) in KimiK2 parser
- Prevent reasoning parser from consuming tool call markers when </think> is skipped

Signed-off-by: ConnorLi96 <ConnorLi96@users.noreply.github.com>
Made-with: Cursor
- Replace hardcoded `<|tool_calls_section_begin|>` in BaseReasoningParser
  with configurable `tool_section_start_markers` in ParserConfig, injected
  only for Kimi models that need it
- Fix `used_json_schema` flag leaking into `process_tool_calls_stream`
  when a configured native parser should take priority — compute
  `force_native_parser` and clear the JSON parser flag accordingly
- Same fix for Messages API streaming path: `streaming_tool_parser` now
  respects configured parser over JSON schema parser
- Add missing `(?s)` dotall flag to KimiK2 `end_pattern` regex so
  multi-line JSON arguments are correctly cleared from the buffer

Signed-off-by: ConnorLi96 <ConnorLi96@users.noreply.github.com>
Signed-off-by: ConnorLi96 <connorli@together.ai>
Made-with: Cursor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/reasoning_parser/src/factory.rs`:
- Around line 219-232: The ParserConfig for the "kimi_k25" registration
redundantly sets fields that match defaults; update the closure passed to
registry.register_parser("kimi_k25") to construct ParserConfig using struct
update syntax with ..Default::default() for the defaulted fields so only the
non-default tokens and tool_section_start_markers are specified (i.e., build
ParserConfig { think_start_token: ..., think_end_token: ...,
tool_section_start_markers: markers.clone(), ..Default::default() }) and keep
the
Box::new(BaseReasoningParser::new(config).with_model_type("kimi_k25".to_string()))
part unchanged.
- Around line 235-250: Simplify the ParserConfig construction inside the
registry.register_parser("kimi_thinking") closure by using struct update syntax
(..Default::default()) and only explicitly set the fields that actually differ
from the default: set always_in_reasoning: true and keep
tool_section_start_markers: markers.clone() (remove explicit think_start_token,
think_end_token, stream_reasoning, max_buffer_size if they match defaults);
update the config passed to BaseReasoningParser::new accordingly in this
closure.

In `@model_gateway/src/routers/grpc/regular/processor.rs`:
- Around line 151-154: Rename the local variable has_native_parser to
force_native_parser to match the streaming path; update its declaration and all
uses within the same function (the expression using
self.configured_tool_parser.as_deref().is_some_and(|p| p != "json")) and also
rename the corresponding variable in the Messages-handling branch where the same
logic is duplicated so both paths use the identical identifier
(force_native_parser) for consistency.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f1324192-1946-46c1-a12f-8422a02d91a7

📥 Commits

Reviewing files that changed from the base of the PR and between 206a006 and 350443f.

📒 Files selected for processing (14)
  • crates/reasoning_parser/src/factory.rs
  • crates/reasoning_parser/src/parsers/base.rs
  • crates/reasoning_parser/src/parsers/cohere_cmd.rs
  • crates/reasoning_parser/src/parsers/deepseek_r1.rs
  • crates/reasoning_parser/src/parsers/glm45.rs
  • crates/reasoning_parser/src/parsers/kimi.rs
  • crates/reasoning_parser/src/parsers/minimax.rs
  • crates/reasoning_parser/src/parsers/nano_v3.rs
  • crates/reasoning_parser/src/parsers/qwen3.rs
  • crates/reasoning_parser/src/parsers/step3.rs
  • crates/reasoning_parser/src/traits.rs
  • crates/tool_parser/src/parsers/kimik2.rs
  • model_gateway/src/routers/grpc/regular/processor.rs
  • model_gateway/src/routers/grpc/regular/streaming.rs

Comment on lines +219 to 232
registry.register_parser("kimi_k25", {
    let markers = kimi_tool_markers.clone();
    move || {
        let config = ParserConfig {
            think_start_token: "<think>".to_string(),
            think_end_token: "</think>".to_string(),
            stream_reasoning: true,
            max_buffer_size: DEFAULT_MAX_BUFFER_SIZE,
            always_in_reasoning: false,
            tool_section_start_markers: markers.clone(),
        };
        Box::new(BaseReasoningParser::new(config).with_model_type("kimi_k25".to_string()))
    }
});

🧹 Nitpick | 🔵 Trivial

Consider using ..Default::default() for consistency.

The explicit fields stream_reasoning, max_buffer_size, and always_in_reasoning: false all match the default values. For consistency with the deepseek_v31 and passthrough parser registrations, consider simplifying:

♻️ Suggested simplification
         registry.register_parser("kimi_k25", {
             let markers = kimi_tool_markers.clone();
             move || {
                 let config = ParserConfig {
                     think_start_token: "<think>".to_string(),
                     think_end_token: "</think>".to_string(),
-                    stream_reasoning: true,
-                    max_buffer_size: DEFAULT_MAX_BUFFER_SIZE,
-                    always_in_reasoning: false,
                     tool_section_start_markers: markers.clone(),
+                    ..Default::default()
                 };
                 Box::new(BaseReasoningParser::new(config).with_model_type("kimi_k25".to_string()))
             }
         });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/reasoning_parser/src/factory.rs` around lines 219 - 232, The
ParserConfig for the "kimi_k25" registration redundantly sets fields that match
defaults; update the closure passed to registry.register_parser("kimi_k25") to
construct ParserConfig using struct update syntax with ..Default::default() for
the defaulted fields so only the non-default tokens and
tool_section_start_markers are specified (i.e., build ParserConfig {
think_start_token: ..., think_end_token: ..., tool_section_start_markers:
markers.clone(), ..Default::default() }) and keep the
Box::new(BaseReasoningParser::new(config).with_model_type("kimi_k25".to_string()))
part unchanged.
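A self-contained mirror of the pattern being suggested (the struct here is an illustrative stand-in, not the crate's `ParserConfig`; field set and defaults are assumptions): with a `Default` impl in place, struct update syntax keeps only the non-default fields explicit.

```rust
// Hypothetical mirror of ParserConfig to illustrate ..Default::default().
#[derive(Debug)]
struct ParserConfig {
    think_start_token: String,
    think_end_token: String,
    stream_reasoning: bool,
    tool_section_start_markers: Vec<String>,
}

impl Default for ParserConfig {
    fn default() -> Self {
        Self {
            think_start_token: "<think>".to_string(),
            think_end_token: "</think>".to_string(),
            stream_reasoning: true,
            tool_section_start_markers: Vec::new(),
        }
    }
}

fn kimi_k25_config(markers: Vec<String>) -> ParserConfig {
    // Only the field that differs from the default is spelled out.
    ParserConfig {
        tool_section_start_markers: markers,
        ..Default::default()
    }
}
```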

Comment thread crates/reasoning_parser/src/factory.rs
Comment on lines +151 to +154
let has_native_parser = self
    .configured_tool_parser
    .as_deref()
    .is_some_and(|p| p != "json");

🧹 Nitpick | 🔵 Trivial

Minor naming inconsistency with streaming path.

The variable is named has_native_parser here but force_native_parser in streaming.rs. While the logic is identical, consistent naming would improve maintainability.

♻️ Suggested naming alignment
-            let has_native_parser = self
+            let force_native_parser = self
                 .configured_tool_parser
                 .as_deref()
                 .is_some_and(|p| p != "json");

-            if has_native_parser && tool_parser_available {
+            if force_native_parser && tool_parser_available {

Apply the same rename in the Messages path at line 637.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/regular/processor.rs` around lines 151 - 154,
Rename the local variable has_native_parser to force_native_parser to match the
streaming path; update its declaration and all uses within the same function
(the expression using self.configured_tool_parser.as_deref().is_some_and(|p| p
!= "json")) and also rename the corresponding variable in the Messages-handling
branch where the same logic is duplicated so both paths use the identical
identifier (force_native_parser) for consistency.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 468f1dfa1b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +149 to +152
if let Some(tool_pos) = self.find_tool_section_start(&current_text) {
    let reasoning_text = current_text[..tool_pos].trim().to_string();
    let normal_text = current_text[tool_pos..].to_string();
    self.buffer.clear();

P1: Buffer partial tool-section markers across streaming chunks

The new reasoning bailout only checks for a full tool-section marker in current_text, but when no full marker is found this branch still streams current_text as reasoning and clears the buffer. If <|tool_calls_section_begin|> is split across chunk boundaries (a common streaming pattern), the first chunk drops the marker prefix, so later chunks never match the marker and the parser keeps treating tool-call tokens as reasoning. In that case, streaming tool calls are silently lost even though this code path is meant to recover when </think> is missing.

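One way to implement the hold-back the comment describes, sketched standalone (the helper name is hypothetical): before emitting a chunk as reasoning, compute the longest buffer suffix that is a proper prefix of any marker, and carry that suffix into the next chunk instead of emitting it.

```rust
// Hypothetical sketch: length of the longest suffix of `buf` that is a
// proper prefix of any tool-section marker. That suffix must be held back,
// since the next chunk may complete the marker.
fn partial_marker_suffix_len(buf: &str, markers: &[&str]) -> usize {
    markers
        .iter()
        .map(|m| {
            // Try prefix lengths from longest to shortest; markers are ASCII,
            // so byte slicing is safe here.
            (1..m.len())
                .rev()
                .find(|&k| buf.ends_with(&m[..k]))
                .unwrap_or(0)
        })
        .max()
        .unwrap_or(0)
}
```

A streaming step would then emit only `buf[..buf.len() - held]` as reasoning and prepend the retained tail to the next chunk; a full marker is still handled by the existing find-and-split path.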


Labels

grpc gRPC client and router changes model-gateway Model gateway crate changes reasoning-parser Reasoning parser changes tool-parser Tool/function call parser changes


Development

Successfully merging this pull request may close these issues.

[Bug]: Model-specific tool parser is bypassed under constrained tool_choice (Kimi K2)
