feat(tool_parser): add DeepSeek V3.2 DSML tool call parser #1030

Open
key4ng wants to merge 11 commits into main from keyang/deepseek_3_2_tool_call

Conversation

Collaborator

@key4ng key4ng commented Apr 3, 2026

Description

Problem

DeepSeek V3.2 introduces a new XML-like DSML format for tool calls, replacing the special-token approach used in V3/V3.1. The gateway has no parser for this format, so V3.2 models cannot use tool calling through the gRPC streaming path.

Solution

Add a new DeepSeek32Parser that handles the DSML format with incremental streaming support, following the SGLang DeepSeekV32Detector pattern.

Changes

  • New parser: crates/tool_parser/src/parsers/deepseek32.rs — handles DSML format with regex-based parsing
    • Supports both XML parameter tags (<|DSML|parameter>) and direct JSON fallback inside invoke blocks
    • Type-aware argument reconstruction: string="true" → string, string="false" → parsed JSON
    • Incremental streaming with argument diffing
    • Partial DSML prefix detection to avoid flushing incomplete tags
  • Factory registration: deepseek32 parser with model mappings
    • deepseek-v3.2* / deepseek-ai/DeepSeek-V3.2* → deepseek32 (DSML format)
    • deepseek-v3.2-exp* / deepseek-ai/DeepSeek-V3.2-Exp* → deepseek31 (V3.2-Exp uses V3.1 format)
  • 13 integration tests: complete parsing, streaming, factory registration
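
For context, a complete DSML invoke block can be recognized with plain string scanning. The following is a minimal, std-only sketch; the tag and attribute shapes are assumptions inferred from the PR description and SGLang's detector, and the real parser is regex-based with streaming state:

```rust
// Minimal std-only sketch of extracting a tool name and string parameters
// from a complete DSML invoke block. Tag/attribute shapes are assumptions
// based on the PR description; the actual parser is regex-based and streaming.
fn parse_invoke(block: &str) -> Option<(String, Vec<(String, String)>)> {
    const INVOKE: &str = "<|DSML|invoke name=\"";
    const PARAM: &str = "<|DSML|parameter name=\"";
    const PARAM_END: &str = "</|DSML|parameter>";

    // Tool name sits between the invoke marker and the closing quote.
    let start = block.find(INVOKE)? + INVOKE.len();
    let end = start + block[start..].find('"')?;
    let name = block[start..end].to_string();

    // Each parameter carries its name attribute and a tag-delimited body.
    let mut params = Vec::new();
    let mut rest = &block[end..];
    while let Some(p) = rest.find(PARAM) {
        let r = &rest[p + PARAM.len()..];
        let key_end = r.find('"')?;
        let key = r[..key_end].to_string();
        let body_start = r.find('>')? + 1;
        let body_end = body_start + r[body_start..].find(PARAM_END)?;
        params.push((key, r[body_start..body_end].to_string()));
        rest = &r[body_end..];
    }
    Some((name, params))
}

fn main() {
    let block = concat!(
        "<|DSML|invoke name=\"get_weather\">",
        "<|DSML|parameter name=\"city\" string=\"true\">Paris</|DSML|parameter>",
        "</|DSML|invoke>",
    );
    let (name, params) = parse_invoke(block).unwrap();
    println!("{name} {params:?}");
    // → get_weather [("city", "Paris")]
}
```

The real implementation additionally handles the direct-JSON fallback, `string="false"` JSON parsing, and partial tags arriving mid-stream.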

Test Plan

cargo test -p tool-parser --test tool_parser_deepseek32
# 13 passed; 0 failed

cargo test -p tool-parser
# 347 passed; 0 failed (no regressions)

cargo clippy -p tool-parser --all-targets --all-features -- -D warnings
# clean
Checklist
  • cargo +nightly fmt passes
  • cargo clippy --all-targets --all-features -- -D warnings passes
  • (Optional) Documentation updated
  • (Optional) Please join us on Slack #sig-smg to discuss, review, and merge PRs

Summary by CodeRabbit

  • New Features

    • Added support for DeepSeek V3.2 model format, enabling extraction of DSML tool calls and streaming of tool names plus incremental argument deltas.
    • Model detection updated so V3.2 variants use the new parser while V3.2-Exp variants continue to use prior V3.1-format handling.
  • Tests

    • Added comprehensive tests for complete and incremental/streaming parsing, multiple invokes, parameter types, nested JSON bodies, edge cases, and model-to-parser mappings.

@github-actions github-actions bot added the tests (Test changes) and tool-parser (Tool/function call parser changes) labels on Apr 3, 2026

coderabbitai bot commented Apr 3, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds a new DeepSeek V3.2 parser (DeepSeek32Parser) with complete and incremental DSML parsing, registers and maps it in the parser factory for V3.2 model patterns, re-exports the parser, and adds integration tests covering parsing and factory resolution.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Parser Implementation `crates/tool_parser/src/parsers/deepseek32.rs` | New DeepSeek32Parser implementing ToolParser: DSML detection, complete parsing, incremental/streaming parsing with buffering/state, JSON argument reconstruction, partial-parameter handling, and reset/helpers. |
| Parsers Module & Re-exports `crates/tool_parser/src/parsers/mod.rs`, `crates/tool_parser/src/lib.rs` | Added deepseek32 module and publicly re-exported DeepSeek32Parser. |
| Factory & Model Mapping `crates/tool_parser/src/factory.rs` | Registered "deepseek32" in ParserFactory::new(); mapped deepseek-v3.2* and deepseek-ai/DeepSeek-V3.2* to it; mapped deepseek-v3.2-exp* / deepseek-ai/DeepSeek-V3.2-Exp* to the existing "deepseek31" parser. |
| Tests `crates/tool_parser/tests/tool_parser_deepseek32.rs` | New tests for DeepSeek32Parser and factory mappings: complete parsing (XML-like params, JSON payloads, mixed/nested types), incremental streaming across chunks, marker detection, and model→parser resolution. |
| Factory Re-exports/Registration `crates/tool_parser/src/factory.rs`, `crates/tool_parser/src/lib.rs` | Updated imports/registrations and public re-exports to include DeepSeek32Parser. |

Sequence Diagram(s)

(Skipped)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested reviewers

  • slin1237
  • CatherineSue

Poem

🐇 I nibble DSML crumbs beneath the moon,
Invokes turned to JSON, tidy and soon,
Chunks hop and settle into my nest,
Names then args stream out at my best,
V3.2 carrots—what a cozy fest!

🚥 Pre-merge checks: ✅ 3 passed

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly and concisely summarizes the main change: adding a DeepSeek V3.2 DSML tool call parser. It accurately reflects the primary objective of the changeset. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%. |



Comment @coderabbitai help to get the list of available commands and usage tips.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1cc2419261

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces the DeepSeek32Parser to support the DeepSeek V3.2 DSML tool-calling format, providing both complete and incremental parsing. The parser is integrated into the ParserFactory with mappings for V3.2 and V3.2-Exp models, and integration tests are included. Feedback suggests improving the robustness of parameter parsing and using warning-level logging for invalid tool names.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/tool_parser/src/parsers/deepseek32.rs`:
- Line 313: Remove the explicit drop(captures); statement in deepseek32.rs: the
local variable captures (which borrows buf_snapshot) will go out of scope
naturally, so delete the drop call and ensure there are no further references to
captures after its intended use (verify the surrounding code in the function
where captures and buf_snapshot are used).
- Around line 370-398: The argument diff logic in argument_diff can be
simplified: when is_complete is false and you have a prev_args (from
self.prev_tool_call_arr) and DSML parameters only ever accumulate, replace the
find_common_prefix-based branching inside the else-if that checks let Some(prev)
= &prev_args with a direct slice from sent_len into current_args (i.e., treat
the new content as current_args[sent_len..].to_string()); this removes the
prefix computation while preserving behavior for monotonic accumulation—keep the
existing handling for the is_complete branch and the None cases, and only change
the block that currently calls helpers::find_common_prefix and compares
prefix.len() to sent_len.
- Around line 316-324: The invalid-tool branch currently breaks leaving parser
state stale and preventing processing of remaining invokes; change it to follow
the pattern used in other parsers: when func_name is invalid and is_complete is
true, advance self.buffer via match_end (if Some(end)) and then reset parser
state (clear streamed_args_for_tool and set current_tool_name_sent = false) and
continue the loop instead of break; when func_name is invalid and is_complete is
false, reset the same state (clear streamed_args_for_tool and set
current_tool_name_sent = false) and return/exit early so partial invokes are
dropped and state is clean for the next chunk.
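
The diff simplification in the second comment (when DSML arguments only ever accumulate, slice directly from the already-sent length) can be sketched as follows; the helper name and signature here are illustrative, not the PR's actual code:

```rust
// Sketch of the suggested monotonic diff: if the serialized arguments only
// ever grow, the undelivered delta is just the tail past `sent_len`, the
// number of bytes already streamed for this tool call. No common-prefix
// computation is needed under that assumption.
fn argument_delta(current_args: &str, sent_len: usize) -> Option<String> {
    if sent_len < current_args.len() {
        Some(current_args[sent_len..].to_string())
    } else {
        None // nothing new to emit yet
    }
}

fn main() {
    // 9 bytes of `{"city":"` were already streamed; emit only the new tail.
    assert_eq!(
        argument_delta("{\"city\":\"Par", 9),
        Some("Par".to_string())
    );
    assert_eq!(argument_delta("{\"city\":\"", 9), None);
}
```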

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6a76df48-375a-4a91-898f-0a879f03acd3

📥 Commits

Reviewing files that changed from the base of the PR and between f6beb69 and 1cc2419.

📒 Files selected for processing (5)
  • crates/tool_parser/src/factory.rs
  • crates/tool_parser/src/lib.rs
  • crates/tool_parser/src/parsers/deepseek32.rs
  • crates/tool_parser/src/parsers/mod.rs
  • crates/tool_parser/tests/tool_parser_deepseek32.rs

key4ng added 3 commits April 2, 2026 18:46
…r stripping

Signed-off-by: key4ng <rukeyang@gmail.com>
…ation

Signed-off-by: key4ng <rukeyang@gmail.com>
…reaking

Signed-off-by: key4ng <rukeyang@gmail.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2c7c1c2f6c



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/tool_parser/src/parsers/deepseek32.rs`:
- Around line 336-381: The bug is that prev_tool_call_arr stores "arguments" as
an object but the code expects a string, so prev_args becomes None and the first
partial chunk is dropped; update the logic in the block computing argument_diff
(around parse_parameters_from_dsml, streamed_args_for_tool, prev_tool_call_arr
and helpers::find_common_prefix) to treat a missing/non-string previous
arguments as an empty string (or initialize "arguments" as an empty string when
setting prev_tool_call_arr) and then compute the diff from sent_len (i.e., if
prev_args is None treat prev = "" and emit current_args[sent_len..] when
!is_complete or when appropriate), ensuring the first partial arguments are
returned instead of None.


📥 Commits

Reviewing files that changed from the base of the PR and between 1cc2419 and 2c7c1c2.

📒 Files selected for processing (1)
  • crates/tool_parser/src/parsers/deepseek32.rs


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cdff14236c


Signed-off-by: key4ng <rukeyang@gmail.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
crates/tool_parser/src/parsers/deepseek32.rs (1)

346-349: ⚠️ Potential issue | 🟠 Major

Emit the first partial argument delta.

Line 348 seeds arguments with a non-string value, but Lines 363-367 only recover previous arguments via .as_str(). On the first incomplete invoke, prev_args is therefore missing, so Lines 389-390 return None and the stream emits the tool name without any initial parameter bytes.

🛠️ Possible fix
-            let prev_args = if tool_id < self.prev_tool_call_arr.len() {
-                self.prev_tool_call_arr[tool_id]
-                    .get("arguments")
-                    .and_then(|v| v.as_str())
-                    .map(|s| s.to_string())
-            } else {
-                None
-            };
+            let prev_args = if tool_id < self.prev_tool_call_arr.len() {
+                self.prev_tool_call_arr[tool_id]
+                    .get("arguments")
+                    .and_then(|v| v.as_str())
+                    .unwrap_or_default()
+                    .to_string()
+            } else {
+                String::new()
+            };
 
             let argument_diff = if is_complete {
                 if sent_len < current_args.len() {
                     Some(current_args[sent_len..].to_string())
                 } else {
                     Some(String::new())
                 }
-            } else if let Some(prev) = &prev_args {
-                if current_args == *prev {
+            } else if prev_args.is_empty() {
+                if sent_len < current_args.len() {
+                    Some(current_args[sent_len..].to_string())
+                } else {
+                    None
+                }
+            } else if current_args == prev_args {
                     None
                 } else {
-                    let prefix = helpers::find_common_prefix(prev, &current_args);
+                    let prefix = helpers::find_common_prefix(&prev_args, &current_args);
                     if prefix.len() > sent_len {
                         Some(prefix[sent_len..].to_string())
                     } else {
                         None
                     }
                 }
-            } else {
-                None
             };

Also applies to: 363-390

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/tool_parser/src/parsers/deepseek32.rs` around lines 346 - 349, The
code seeds prev_tool_call_arr[tool_id] with "arguments": {} (in deepseek32
parser) but later logic expects a string via .as_str(), so on the first partial
invoke prev_args is missing and no initial parameter bytes are emitted; fix by
initializing the seeded value to an empty string (e.g., set "arguments" to ""
instead of {}) or alternatively update the prev_args recovery (where .as_str()
is used) to handle non-string JSON by calling .as_str().unwrap_or_else(||
json_value.to_string().as_str()) or using to_string()/unwrap_or_default() so the
first partial argument delta is emitted; modify the code around
prev_tool_call_arr, tool_id and the prev_args extraction to ensure arguments are
a string when consumed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/tool_parser/src/parsers/deepseek32.rs`:
- Around line 244-270: The partial-marker detection in the has_partial_prefix
check is too narrow (only '<', '<|', '</', '</|') causing longer truncated
tokens (e.g. '<|DSML' or '<|DSML|fun') to be treated as normal_text; update the
computation of has_partial_prefix in the parsing function (the variables
current_text, has_dsml, has_partial_prefix and the early-return branch that
yields StreamingParseResult::default()) so it detects any trailing incomplete
tag: find the last '<' (or "</") in current_text and treat it as a partial
prefix if there is no matching closing '>' after it (or use a regex like
r"</?[^>]*$" to detect an unterminated tag); this will ensure such longer
partial DSML fragments are buffered instead of flushed as normal_text.
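
The broader check suggested above (buffer any trailing unterminated tag instead of four hard-coded short suffixes) can be sketched as a small std-only predicate; the function name is illustrative:

```rust
// Sketch of the widened partial-prefix check: any trailing "<..." with no
// closing '>' after it may be a truncated DSML tag and should stay buffered
// rather than being flushed as normal text.
fn has_unterminated_tag(text: &str) -> bool {
    match text.rfind('<') {
        Some(i) => !text[i..].contains('>'),
        None => false,
    }
}

fn main() {
    assert!(has_unterminated_tag("hello <|DSML"));     // truncated marker
    assert!(has_unterminated_tag("hello <|DSML|inv")); // longer truncation
    assert!(!has_unterminated_tag("done <x>"));        // tag is closed
    assert!(!has_unterminated_tag("plain text"));      // no tag at all
}
```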

---

Duplicate comments:
In `@crates/tool_parser/src/parsers/deepseek32.rs`:
- Around line 346-349: The code seeds prev_tool_call_arr[tool_id] with
"arguments": {} (in deepseek32 parser) but later logic expects a string via
.as_str(), so on the first partial invoke prev_args is missing and no initial
parameter bytes are emitted; fix by initializing the seeded value to an empty
string (e.g., set "arguments" to "" instead of {}) or alternatively update the
prev_args recovery (where .as_str() is used) to handle non-string JSON by
calling .as_str().unwrap_or_else(|| json_value.to_string().as_str()) or using
to_string()/unwrap_or_default() so the first partial argument delta is emitted;
modify the code around prev_tool_call_arr, tool_id and the prev_args extraction
to ensure arguments are a string when consumed.


📥 Commits

Reviewing files that changed from the base of the PR and between 2c7c1c2 and cdff142.

📒 Files selected for processing (1)
  • crates/tool_parser/src/parsers/deepseek32.rs


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7fa9333830



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
crates/tool_parser/src/parsers/deepseek32.rs (2)

348-351: ⚠️ Potential issue | 🟠 Major

Still dropping the first partial argument delta.

prev_tool_call_arr[tool_id]["arguments"] is initialized as {}, but Lines 365-369 only read strings. On the first incomplete invoke, prev_args becomes None, so the !is_complete path returns None and no parameters are streamed until a later chunk. Initialize "arguments" as "" or treat a non-string previous value as empty.

🛠️ Minimal fix
                 self.prev_tool_call_arr[tool_id] = serde_json::json!({
                     "name": func_name,
-                    "arguments": {},
+                    "arguments": "",
                 });
...
-            } else {
-                None
+            } else if sent_len < current_args.len() {
+                Some(current_args[sent_len..].to_string())
+            } else {
+                None
             };

Also applies to: 365-392

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/tool_parser/src/parsers/deepseek32.rs` around lines 348 - 351, The
current initialization of self.prev_tool_call_arr[tool_id]["arguments"] as an
object causes the first partial argument delta to be dropped because prev_args
(read elsewhere in deepseek32.rs around the prev_args / is_complete logic)
expects a string; change the initialization in the prev_tool_call_arr entry for
func_name to set "arguments" to an empty string "" (or alternatively update the
code that reads prev_args to treat non-string values as empty string) so that
the first incomplete chunk is appended/streamed correctly; update any logic that
merges incoming argument chunks (the code paths around prev_args and
is_complete) to handle and coerce non-string previous values to "" before
concatenation.

246-272: ⚠️ Potential issue | 🟠 Major

Longer truncated DSML prefixes are still flushed as normal text.

Lines 249-252 only preserve <, <|, </, and </|. A chunk ending with <|DSML or <|DSML|inv still reaches Line 254 and gets emitted as normal_text, so the next chunk can never reconstruct the tag. Buffer any trailing unterminated <... fragment instead of hard-coding four suffixes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/tool_parser/src/parsers/deepseek32.rs` around lines 246 - 272, The
code currently only treats four very short suffixes as partial DSML prefixes
(has_partial_prefix) so longer truncated tags like "<|DSML" get flushed as
normal_text; change has_partial_prefix to detect any trailing unterminated '<'
fragment by finding the last '<' in current_text and checking if there is no
corresponding '>' after it (i.e., an open tag that runs to the end of the
chunk), and if so treat that suffix as a partial prefix; when producing
normal_text (in the branch that strips end tokens and returns
StreamingParseResult), remove that trailing unterminated fragment from
normal_text and put it back into self.buffer so the fragment is preserved for
the next chunk; update uses of has_partial_prefix, current_text, buffer, and
StreamingParseResult accordingly instead of hard-coding the four suffixes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/tool_parser/src/parsers/deepseek32.rs`:
- Around line 51-67: strip_dsml_trailing currently trims by character set and
can remove legitimate argument characters; change it to only remove an actual
trailing substring that is a prefix of the full DSML closing tag. For each
fragment group (DSML_PARAM_END_FRAGMENTS and DSML_INVOKE_END_FRAGMENTS) build
the full closing string by concatenating the fragments (e.g.
"</|DSML|parameter>"), then for the input string find the longest k>0 such that
result.ends_with(&full[..k]) and remove exactly that suffix (no per-character
trimming). Update strip_dsml_trailing to perform this suffix-prefix check and
removal using the concatenated full tag rather than fragment.contains(c); keep
references to DSML_PARAM_END_FRAGMENTS and DSML_INVOKE_END_FRAGMENTS and the
function name strip_dsml_trailing for locating the change.
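
The suffix-prefix removal described above can be sketched as follows, where the full closing tag is the concatenated fragment group (e.g. "</|DSML|parameter>"); the helper name here is illustrative:

```rust
// Sketch of the suggested fix: strip only the longest trailing substring
// that is literally a prefix of the full closing tag. Character-set trimming
// (the current behavior) would also eat legitimate argument bytes, e.g.
// turning the partial value "bar" into "b".
fn strip_partial_closing_tag(s: &str, full_tag: &str) -> String {
    // Longest k > 0 such that `s` ends with the first k bytes of `full_tag`.
    for k in (1..=full_tag.len().min(s.len())).rev() {
        if s.ends_with(&full_tag[..k]) {
            return s[..s.len() - k].to_string();
        }
    }
    s.to_string()
}

fn main() {
    let tag = "</|DSML|parameter>";
    // A truncated closing tag is removed exactly.
    assert_eq!(strip_partial_closing_tag("bar</|DS", tag), "bar");
    // Plain argument bytes are left untouched.
    assert_eq!(strip_partial_closing_tag("bar", tag), "bar");
}
```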

---

Duplicate comments:
In `@crates/tool_parser/src/parsers/deepseek32.rs`:
- Around line 348-351: The current initialization of
self.prev_tool_call_arr[tool_id]["arguments"] as an object causes the first
partial argument delta to be dropped because prev_args (read elsewhere in
deepseek32.rs around the prev_args / is_complete logic) expects a string; change
the initialization in the prev_tool_call_arr entry for func_name to set
"arguments" to an empty string "" (or alternatively update the code that reads
prev_args to treat non-string values as empty string) so that the first
incomplete chunk is appended/streamed correctly; update any logic that merges
incoming argument chunks (the code paths around prev_args and is_complete) to
handle and coerce non-string previous values to "" before concatenation.
- Around line 246-272: The code currently only treats four very short suffixes
as partial DSML prefixes (has_partial_prefix) so longer truncated tags like
"<|DSML" get flushed as normal_text; change has_partial_prefix to detect any
trailing unterminated '<' fragment by finding the last '<' in current_text and
checking if there is no corresponding '>' after it (i.e., an open tag that runs
to the end of the chunk), and if so treat that suffix as a partial prefix; when
producing normal_text (in the branch that strips end tokens and returns
StreamingParseResult), remove that trailing unterminated fragment from
normal_text and put it back into self.buffer so the fragment is preserved for
the next chunk; update uses of has_partial_prefix, current_text, buffer, and
StreamingParseResult accordingly instead of hard-coding the four suffixes.


📥 Commits

Reviewing files that changed from the base of the PR and between cdff142 and 7fa9333.

📒 Files selected for processing (1)
  • crates/tool_parser/src/parsers/deepseek32.rs

key4ng added 2 commits April 6, 2026 13:39
…m DSML block

Signed-off-by: key4ng <rukeyang@gmail.com>
…id invoke abort

Signed-off-by: key4ng <rukeyang@gmail.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bf8518dd3b


Collaborator Author

key4ng commented Apr 6, 2026

SGLang Alignment Review (Corrected)

Replaces the previous comparison after fact-checking against the actual SGLang source.

This parser follows SGLang's DeepSeekV32Detector implementation with targeted improvements.

Regex Patterns

| Purpose | SGLang | Ours | Match? |
| --- | --- | --- | --- |
| function_calls block | `(.*?)` with `re.DOTALL` | Same, with inline `(?s)` | Yes |
| Complete invoke | Reuses streaming regex (with `\|$` fallback) | Dedicated stricter regex (no `\|$`) | Ours is stricter |
| Complete parameter | `string="([^"]+)"` | `string="(true\|false)"` | Stricter per spec |
| Partial parameter | `string="([^"]+)"` | `string="(true\|false)"` | Stricter per spec |
| Invoke (streaming) | `(.*?)(end_tag\|$)` | Same | Yes |

DSML Fragment Stripping

| Aspect | SGLang | Ours | Match? |
| --- | --- | --- | --- |
| Param fragments | `["</", "\|DSML\|", "parameter"]` | Same | Yes |
| Invoke fragments | `["</", "\|DSML\|", "inv", "oke"]` | Same | Yes |
| Stripping method | `rstrip(chars)` in reverse | `trim_end_matches` in reverse | Equivalent |
| Where applied (params) | Strip `remaining_content` BEFORE partial regex | Same | Yes |
| Where applied (JSON) | Strip invoke-end from direct JSON | Same | Yes |

Parameter Parsing

| Aspect | SGLang | Ours | Match? |
| --- | --- | --- | --- |
| Direct JSON detection | `starts_with("{")` | Same | Yes |
| Direct JSON partial | Strip invoke-end, return | Same | Yes |
| Direct JSON complete | Check `ends_with("}")`, return | Same | Yes |
| `string="true"` | `value.strip()` (always trims whitespace) | `Value::String(value)` (no trim in complete path) | Minor diff — SGLang trims |
| `string="false"` | `json.loads` with fallback | `serde_json::from_str` with fallback | Yes |
| Partial: strip before regex | Strip `remaining[last_match_end:]`, then regex | Same | Yes |
| Partial: incomplete JSON | `_partial_json_loads` dependency | `serde_json::from_str` with string fallback | Functional gap for partial non-string values; diff algorithm compensates |
| Return format | `json.dumps(parameters)` | `serde_json::to_string` | Yes |

Streaming Logic

| Aspect | SGLang | Ours | Match? |
| --- | --- | --- | --- |
| Buffer accumulation | `self._buffer += new_text` | `self.buffer.push_str(chunk)` | Yes |
| DSML marker detection | `bot_token` or `<\|DSML\|invoke` | Same | Yes |
| Non-DSML flush | Strip `eot_token`, `invoke_end_token` | Strip 4 end tokens | Yes |
| Invoke loop | `while True` + `re.search` | `loop` + captures | Yes |
| Complete detection | `bool(group(3))` | `.is_some_and()` | Yes |
| Tool name emit | `ToolCallItem(name=func_name)` | Same | Yes |
| Arg parsing call | `allow_partial=not is_tool_end` | Same | Yes |
| Diff (complete) | `current_params[sent_len:]` | Same | Yes |
| Diff (partial, has prev) | `_find_common_prefix` + `> sent_len` | Same | Yes |
| Diff (partial, no prev) | Falls through — no emission | Emits from `sent_len` | Ours is better |
| Update prev state | `{"name": ..., "arguments": ...}` | Same | Yes |
| Complete: advance buffer | `self._buffer = text[match.end():]` | Same | Yes |
| Complete: advance | `tool_id += 1`, reset, continue | Same | Yes |
| Partial: break | `break` | Same | Yes |
| Tool name validation | No validation — all names forwarded as-is | Validates against `tool_indices`, skips invalid | Ours is better |

State Management

| Field | SGLang | Ours | Match? |
| --- | --- | --- | --- |
| Buffer | `self._buffer` | `self.buffer` | Yes |
| Tool index | `self.current_tool_id` (starts -1) | Same | Yes |
| Name sent flag | `self.current_tool_name_sent` | Same | Yes |
| Previous tool calls | `self.prev_tool_call_arr` (list of dicts) | Same (Vec of Value) | Yes |
| Streamed args | `self.streamed_args_for_tool` (list of strings) | Same | Yes |

Improvements Over SGLang

| Area | SGLang | Ours |
| --- | --- | --- |
| First partial args | Silently dropped (`prev_args` is None) | Emits from `sent_len` — fixes one-chunk delay |
| Tool name validation | No validation — invalid names forwarded to client | Validates against tools list; skips invalid invokes |
| Complete invoke regex | Reuses streaming regex with end-of-string fallback | Dedicated stricter regex for `parse_complete` |
| string attribute regex | Accepts any quoted value | Strict `true` or `false` only, per official spec |

Acceptable Differences

  • Partial JSON for string="false": SGLang uses _partial_json_loads dependency; we use serde_json::from_str with string fallback. The diff algorithm (common-prefix) handles structure changes safely when the closing tag arrives.
  • potentially_dsml mid-buffer check: SGLang checks for |DSML| substring anywhere; we only check ends_with. DSML tokens arrive atomically from the tokenizer.
  • Whitespace trimming on string="true" values: SGLang calls .strip() on complete parameter values; our complete path does not. Partial path does trim.
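
The common-prefix helper referenced throughout the streaming comparison (`_find_common_prefix` in SGLang, `helpers::find_common_prefix` in the PR) has a straightforward shape; this is a sketch of an equivalent, not the PR's actual code:

```rust
// Sketch of a common-prefix helper: the longest shared prefix of the
// previously parsed and newly parsed argument strings bounds what is safe
// to emit, since anything past the divergence point may still change.
fn find_common_prefix(a: &str, b: &str) -> String {
    a.chars()
        .zip(b.chars())
        .take_while(|(x, y)| x == y)
        .map(|(x, _)| x)
        .collect()
}

fn main() {
    let prev = "{\"city\":\"Par";
    let curr = "{\"city\":\"Paris\"}";
    assert_eq!(find_common_prefix(prev, curr), "{\"city\":\"Par");
}
```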


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
crates/tool_parser/src/parsers/deepseek32.rs (2)

57-67: ⚠️ Potential issue | 🟠 Major

strip_dsml_trailing still removes real argument bytes.

Line 64 trims by character set, not by DSML suffix. A partial value like bar becomes b, and direct-JSON fragments can also lose legitimate trailing bytes before the diffing code sees them. Strip only the longest trailing substring that is actually a prefix of the closing tag.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/tool_parser/src/parsers/deepseek32.rs` around lines 57 - 67,
strip_dsml_trailing currently uses trim_end_matches with a character predicate
and thus deletes any trailing characters that appear anywhere in a fragment;
instead detect and remove full DSML closing-tag substrings only: in
strip_dsml_trailing, stop using trim_end_matches(|c| ...) and replace with logic
that finds the longest fragment from fragments that is a suffix of the current
result (use ends_with(fragment)) and then chop off that exact fragment (once)
from the end; iterate in reverse or repeatedly as before but always remove
whole-fragment suffixes only so legitimate trailing bytes are not lost.
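A minimal Python sketch of the suffix-only stripping the reviewer suggests (the helper name and fragment list are illustrative, not the actual Rust code):

```python
# Closing tags whose prefixes may trail a streaming chunk (illustrative list).
CLOSING_TAGS = ["</|DSML|parameter>", "</|DSML|invoke>", "</|DSML|function_calls>"]

def strip_dsml_suffix(s: str) -> str:
    """Remove only the longest trailing substring that is a prefix of a closing tag."""
    best = 0
    for tag in CLOSING_TAGS:
        # Try the longest prefix first; stop at the first match for this tag.
        for n in range(len(tag), 0, -1):
            if s.endswith(tag[:n]):
                best = max(best, n)
                break
    return s[: len(s) - best] if best else s

print(strip_dsml_suffix("Tokyo</|DSML|para"))  # Tokyo
print(strip_dsml_suffix("bar"))                # bar (real argument bytes survive)
```

Unlike character-set trimming, this cannot eat legitimate trailing bytes such as the `r` in `bar`, because it only removes exact prefixes of known closing tags.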

246-252: ⚠️ Potential issue | 🟠 Major

Buffer any unterminated DSML tag, not just four 1–2 byte suffixes.

Lines 249-252 only preserve <, <|, </, and </|. A chunk ending with <|DSML, <|DSML|function_cal, or <|DSML|inv falls through Line 254 as normal_text, so the next chunk can no longer reconstruct the marker. Detect any trailing <... without a closing > instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/tool_parser/src/parsers/deepseek32.rs` around lines 246 - 252, The
current partial-prefix check only matches four specific short suffixes and
misses longer unterminated DSML fragments; update the logic that computes
has_partial_prefix (used alongside has_dsml and current_text in this parser) to
detect any trailing opening tag without a closing '>' instead of only exact
suffixes — e.g., consider the last index of '<' versus the last index of '>' in
current_text and treat as a partial if there's an unmatched '<' (and ensure this
works with the existing self.has_tool_markers check so longer fragments like
"<|DSML" or "<|DSML|function_cal" are buffered rather than emitted as
normal_text).
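The unmatched-`<` check the prompt describes can be sketched as follows (function name is illustrative):

```python
def ends_in_partial_tag(text: str) -> bool:
    """True if the buffer ends inside a possibly unterminated tag:
    the last '<' has no '>' after it."""
    lt = text.rfind("<")
    gt = text.rfind(">")
    return lt != -1 and lt > gt

assert ends_in_partial_tag("some text <|DSML|function_cal")    # buffer, don't flush
assert not ends_in_partial_tag('<|DSML|invoke name="f">')      # tag closed
assert not ends_in_partial_tag("plain text")                   # safe to flush
```

This covers arbitrarily long fragments like `<|DSML|function_cal`, not just the four short suffixes the current code preserves.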
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/tool_parser/src/parsers/deepseek32.rs`:
- Around line 378-398: The code currently emits the placeholder "{}" as an
argument diff when prev_args is None (first partial chunk) because
parse_parameters_from_dsml(..., true) returns "{}" for an empty/incomplete
payload; update the first-partial fallback in the argument_diff computation (the
branch that now does `else if sent_len < current_args.len() { /* First partial
chunk */ Some(current_args[sent_len..].to_string()) }`) to only emit when
current_args contains actual payload bytes (e.g., require current_args != "{}"
or current_args.len() > 2) so the placeholder isn't streamed; keep the existing
complete-path (is_complete) behavior so a truly complete empty-object call can
still be sent.
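The placeholder guard the comment asks for can be sketched like this (helper name is hypothetical; the real logic is inside the Rust argument-diff branch):

```python
def first_partial_diff(current_args: str, sent_len: int):
    """Emit the first partial chunk only when it carries real payload,
    not the bare "{}" an empty or incomplete parse produces."""
    if current_args == "{}":        # placeholder, nothing to stream yet
        return None
    if sent_len < len(current_args):
        return current_args[sent_len:]
    return None

assert first_partial_diff("{}", 0) is None
assert first_partial_diff('{"city":"Tok', 0) == '{"city":"Tok'
```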


📥 Commits

Reviewing files that changed from the base of the PR and between 7fa9333 and bf8518d.

📒 Files selected for processing (1)
  • crates/tool_parser/src/parsers/deepseek32.rs

@key4ng
Collaborator Author

key4ng commented Apr 6, 2026

Parser Walkthrough

What the model outputs

DeepSeek V3.2 uses an XML-like "DSML" format for tool calls:

I'll check the weather for you.

<|DSML|function_calls>
<|DSML|invoke name="get_weather">
<|DSML|parameter name="city" string="true">Tokyo</|DSML|parameter>
<|DSML|parameter name="date" string="false">16</|DSML|parameter>
</|DSML|invoke>
</|DSML|function_calls>

The parser turns this into: name: "get_weather", arguments: '{"city":"Tokyo","date":16}'

  • string="true" → JSON string value
  • string="false" → parse as raw JSON (number, bool, array, object)
  • Also handles a fallback format where the model outputs raw JSON inside <invoke> instead of <parameter> tags
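The type-aware mapping above can be sketched in Python (the regex and helper name are illustrative; the actual parser is Rust):

```python
import json
import re

# Illustrative regex for complete parameter tags; the real parser's regex differs.
PARAM_RE = re.compile(
    r'<\|DSML\|parameter name="([^"]+)" string="(true|false)">(.*?)</\|DSML\|parameter>',
    re.DOTALL,
)

def params_to_arguments(invoke_body: str) -> str:
    args = {}
    for name, is_string, raw in PARAM_RE.findall(invoke_body):
        if is_string == "true":
            args[name] = raw                    # keep raw text as a JSON string
        else:
            try:
                args[name] = json.loads(raw)    # number, bool, array, or object
            except json.JSONDecodeError:
                args[name] = raw                # fallback: treat as string
    return json.dumps(args, separators=(",", ":"))

body = (
    '<|DSML|parameter name="city" string="true">Tokyo</|DSML|parameter>\n'
    '<|DSML|parameter name="date" string="false">16</|DSML|parameter>'
)
print(params_to_arguments(body))  # {"city":"Tokyo","date":16}
```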

Two modes

  • parse_complete — called with the entire model output at once. Extracts normal text before the DSML block, parses all invoke blocks, returns tool calls.
  • parse_incremental — called once per streaming chunk. Accumulates tokens in a buffer, emits tool names and argument diffs incrementally.
DSML Fragment Stripping

During streaming, a chunk may end mid-closing-tag:

chunk 1: ...name="city" string="true">Tokyo</|DSML|para
chunk 2: meter>

The captured value from chunk 1 would be Tokyo</|DSML|para. We strip trailing DSML fragments using character-level right-trimming (same approach as SGLang's rstrip):

Fragments: ["</", "|DSML|", "parameter"]   (applied in reverse)

"Tokyo</|DSML|para"
  → strip chars in "parameter": a,r,a,p  → "Tokyo</|DSML|"
  → strip chars in "|DSML|": |,L,M,S,D,| → "Tokyo</"
  → strip chars in "</": /,<              → "Tokyo"
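The walkthrough's trimming maps directly onto Python's `str.rstrip`, which treats its argument as a set of characters (a sketch of the approach as described, not the Rust code):

```python
def strip_fragments(s: str) -> str:
    # Apply fragment character sets right to left, as in the example above.
    for chars in ["parameter", "|DSML|", "</"]:
        s = s.rstrip(chars)  # rstrip strips any trailing chars from the set
    return s

print(strip_fragments("Tokyo</|DSML|para"))  # Tokyo
```

Because this trims by character set rather than exact suffix, a value that happens to end in those characters (e.g. `bar`) would also lose bytes, which is the edge case flagged in review.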
Argument Reconstruction (parse_parameters_from_dsml)

Converts DSML parameter tags into a JSON arguments string. Two paths:

Direct JSON path: If invoke content starts with {, treat as raw JSON. Strip trailing DSML fragments if streaming.

XML parameter path:

  1. Match all complete <parameter> tags → build a Map<String, Value>
  2. string="true"Value::String("Tokyo")
  3. string="false" → try serde_json::from_str("42")Value::Number(42), fallback to string
  4. If streaming (allow_partial): find text after the last complete parameter, strip DSML fragments, try to match a partial <parameter> tag and add it
  5. Serialize map → {"city":"Tokyo","date":16}
Streaming Engine (parse_incremental)

Phase 1 — Buffer or flush?

Each chunk is appended to the buffer, then:

  • No DSML markers, no partial tag prefix → flush as normal_text
  • Ends with <, <|, </, </| → might be start of DSML tag, buffer and wait
  • Has DSML content → enter the invoke processing loop

Phase 2 — Invoke processing loop

Processes invoke blocks one at a time from the buffer:

Buffer: "<invoke name="search">...complete...</invoke><invoke name="weather">...partial..."
         ├──── complete: process + consume ────┤├──── partial: process + break ────┤

For each invoke match:

  1. Validate tool name against provided tools list. Skip invalid complete invokes, reset on invalid partial.
  2. Emit tool name on first encounter → client knows "get_weather is starting"
  3. Parse current args via parse_parameters_from_dsml(content, allow_partial)
  4. Compute diff against what we've already sent:
    • Complete → send everything from sent_len to end
    • Partial with previous → use find_common_prefix to find stable prefix, send new stable portion
    • Partial without previous (first chunk) → send from sent_len
  5. Advance or wait: complete invoke → slice buffer past it, increment tool_id, continue. Partial → break and wait for more chunks.
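Step 4's diffing can be sketched as follows (function names illustrative; the real implementation lives in the Rust parser):

```python
def find_common_prefix(a: str, b: str) -> int:
    """Length of the longest common prefix of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def argument_diff(prev_args, current_args, sent_len, is_complete):
    if is_complete:
        # Complete invoke: flush everything not yet sent.
        return current_args[sent_len:] or None
    if prev_args is not None:
        # Only the prefix shared with the previous parse is stable.
        stable = find_common_prefix(prev_args, current_args)
        return current_args[sent_len:stable] if stable > sent_len else None
    # First partial chunk: emit from sent_len.
    return current_args[sent_len:] if sent_len < len(current_args) else None

assert argument_diff(None, '{"city":"Tok', 0, False) == '{"city":"Tok'
assert argument_diff('{"city":"Tok', '{"city":"Tokyo"}', 12, False) is None
assert argument_diff('{"city":"Tokyo"}', '{"city":"Tokyo"}', 12, True) == 'yo"}'
```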
End-to-end streaming example

Model calls get_weather(city="Tokyo"), tokens arrive as:

| Chunk | Action | Emitted |
|---|---|---|
| `"Let me check.\n\n"` | No DSML → flush | `normal_text: "Let me check."` |
| `"<\|DSML\|function_calls>\n"` | Has DSML, no invoke match yet | nothing |
| `"<\|DSML\|invoke name=\"get_weather\">\n"` | Invoke matched (partial). Emit name | `name: "get_weather"` |
| `"<\|DSML\|parameter name=\"city\" string=\"true\">"` | Param tag opened, no value yet | nothing |
| `"Tokyo"` | Partial param value → args = `{"city":"Tokyo"}` | `params: '{"city":"Tokyo"}'` |
| `"</\|DSML\|parameter>\n"` | Param complete, no new diff | nothing |
| `"</\|DSML\|invoke>\n"` | Invoke COMPLETE. Send remaining diff. Advance `tool_id`. | `params: "}"` |
| `"</\|DSML\|function_calls>"` | No invoke match. End tokens stripped. | nothing |

Client receives tool call get_weather with arguments {"city":"Tokyo"} streamed incrementally.

key4ng added 2 commits April 8, 2026 11:20
…o prevent delta corruption

Signed-off-by: key4ng <rukeyang@gmail.com>
…ing for DSML fragments

Signed-off-by: key4ng <rukeyang@gmail.com>
@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 09abcf7018


@key4ng
Collaborator Author

key4ng commented Apr 8, 2026

E2E Validation: DeepSeek V3.2-Exp

Tested against a live DeepSeek V3.2-Exp (FP8) deployment on 8x H200.
V3.2-Exp uses V3.1 tool call format — auto-detected as deepseek31 parser via factory mapping deepseek-ai/DeepSeek-V3.2-Exp* → deepseek31.

Setup

sglang backend (gRPC mode):

/home/ubuntu/sglang_venv/bin/python -m sglang.launch_server \
  --model-path /raid/models/deepseek-ai/DeepSeek-V3.2-Exp \
  --served-model-name deepseek-ai/DeepSeek-V3.2-Exp \
  --tp 8 --trust-remote-code --port 30000 --grpc-mode

smg router (V3.1 vLLM template, no --tool-call-parser needed):

./target/debug/smg \
  --worker-urls grpc://localhost:30000 \
  --model-path deepseek-ai/DeepSeek-V3.2-Exp \
  --tokenizer-path /raid/models/deepseek-ai/DeepSeek-V3.2-Exp \
  --chat-template e2e_test/fixtures/chat_templates/tool_chat_template_deepseekv31.jinja \
  --port 8080

Run tests:

SMG_BASE_URL=http://localhost:8080 SMG_MODEL=deepseek-ai/DeepSeek-V3.2-Exp \
  python -m pytest e2e_test/chat_completions/test_deepseek32_tool_calling.py \
  -v --tb=short --no-header --rootdir=/tmp --noconftest

Test result

============================= test session starts ==============================
collected 17 items

TestDeepSeek32NonStreaming::test_single_tool_call_required           PASSED
TestDeepSeek32NonStreaming::test_tool_call_arguments_are_valid_json  PASSED
TestDeepSeek32NonStreaming::test_tool_call_has_id                    PASSED
TestDeepSeek32NonStreaming::test_tool_call_finish_reason             PASSED
TestDeepSeek32NonStreaming::test_model_picks_correct_tool            PASSED
TestDeepSeek32NonStreaming::test_parallel_tool_calls                 PASSED
TestDeepSeek32NonStreaming::test_single_tool_call_auto               PASSED
TestDeepSeek32NonStreaming::test_tool_choice_none                    PASSED
TestDeepSeek32NonStreaming::test_usage_stats_present                 PASSED
TestDeepSeek32NonStreaming::test_unicode_in_tool_arguments           PASSED
TestDeepSeek32Streaming::test_streaming_single_tool_call             PASSED
TestDeepSeek32Streaming::test_streaming_arguments_arrive_incrementally PASSED
TestDeepSeek32Streaming::test_streaming_finish_reason                PASSED
TestDeepSeek32Streaming::test_streaming_single_tool_call_auto        PASSED
TestDeepSeek32Streaming::test_streaming_parallel_tool_calls          PASSED
TestDeepSeek32MultiTurn::test_tool_result_followup                   PASSED
TestDeepSeek32MultiTurn::test_tool_result_followup_streaming         PASSED

============================== 17 passed in 9.70s ==============================
Full test file: e2e_test/chat_completions/test_deepseek32_tool_calling.py
"""DeepSeek V3.2 Tool Calling E2E Tests.

Tests for the DeepSeek V3.2 DSML-format tool parser via the SMG gateway.
Tests both non-streaming and streaming modes against a live sglang backend.

IMPORTANT: DeepSeek V3.2 has no built-in Jinja chat template in tokenizer_config.json.
The DSML template must be provided via --chat-template. Without it:
- tool_choice=required works (uses JSON schema constrained decoding, bypasses tool parser)
- tool_choice=auto fails (model output not parsed by deepseek32 DSML parser)

Usage:
    SMG_BASE_URL=http://localhost:8080 pytest \
        e2e_test/chat_completions/test_deepseek32_tool_calling.py -v \
        --rootdir=/tmp --noconftest

Setup:
    # sglang
    python -m sglang.launch_server \
        --model-path /raid/models/deepseek-ai/DeepSeek-V3.2 \
        --served-model-name deepseek-ai/DeepSeek-V3.2 \
        --tp 8 --trust-remote-code --port 30000 --grpc-mode

    # smg (DSML template required for tool_choice=auto tests)
    ./target/debug/smg \
        --worker-urls grpc://localhost:30000 \
        --model-path deepseek-ai/DeepSeek-V3.2 \
        --tokenizer-path /raid/models/deepseek-ai/DeepSeek-V3.2 \
        --tool-call-parser deepseek32 \
        --chat-template e2e_test/fixtures/chat_templates/tool_chat_template_deepseekv32.jinja \
        --port 8080
"""

from __future__ import annotations

import json
import logging
import os

import openai
import pytest

logger = logging.getLogger(__name__)

BASE_URL = os.environ.get("SMG_BASE_URL", "http://localhost:8080")
MODEL = os.environ.get("SMG_MODEL", "deepseek-ai/DeepSeek-V3.2")

# =============================================================================
# Client fixture
# =============================================================================


@pytest.fixture(scope="module")
def client():
    return openai.OpenAI(base_url=f"{BASE_URL}/v1", api_key="dummy")


# =============================================================================
# Tool definitions
# =============================================================================

WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name, e.g. 'San Francisco'",
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature unit",
                },
            },
            "required": ["location"],
        },
    },
}

SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "search",
        "description": "Search for information on the web.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search query string",
                },
            },
            "required": ["query"],
        },
    },
}

TRANSLATE_TOOL = {
    "type": "function",
    "function": {
        "name": "translate",
        "description": "Translate text from one language to another.",
        "parameters": {
            "type": "object",
            "properties": {
                "text": {"type": "string", "description": "Text to translate"},
                "target_language": {"type": "string", "description": "Target language code"},
            },
            "required": ["text", "target_language"],
        },
    },
}

ALL_TOOLS = [WEATHER_TOOL, SEARCH_TOOL, TRANSLATE_TOOL]


# =============================================================================
# Helpers
# =============================================================================


def assert_valid_tool_call(tool_call, expected_name=None):
    assert tool_call.function.name, "Tool call must have a function name"
    assert tool_call.function.arguments, "Tool call must have arguments"
    args = json.loads(tool_call.function.arguments)
    assert isinstance(args, dict), "Arguments must be a JSON object"
    if expected_name:
        assert tool_call.function.name == expected_name
    return args


def collect_streaming_tool_calls(stream):
    tool_calls = {}
    chunks_count = 0
    finish_reason = None
    for chunk in stream:
        chunks_count += 1
        delta = chunk.choices[0].delta if chunk.choices else None
        if not delta:
            continue
        if chunk.choices[0].finish_reason:
            finish_reason = chunk.choices[0].finish_reason
        if delta.tool_calls:
            for tc in delta.tool_calls:
                idx = tc.index
                if idx not in tool_calls:
                    tool_calls[idx] = {"name": "", "arguments": ""}
                if tc.function and tc.function.name:
                    tool_calls[idx]["name"] = tc.function.name
                if tc.function and tc.function.arguments:
                    tool_calls[idx]["arguments"] += tc.function.arguments
    return tool_calls, chunks_count, finish_reason


# =============================================================================
# Non-Streaming Tests
# =============================================================================


class TestDeepSeek32NonStreaming:
    """Non-streaming tool call tests for DeepSeek V3.2 DSML parser."""

    def test_single_tool_call_required(self, client):
        """tool_choice=required forces a tool call."""
        response = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
            tools=[WEATHER_TOOL],
            tool_choice="required",
            temperature=0,
            max_tokens=512,
        )

        msg = response.choices[0].message
        assert msg.tool_calls, "Expected tool calls with tool_choice=required"
        args = assert_valid_tool_call(msg.tool_calls[0], "get_weather")
        assert "location" in args
        logger.info("Tool args: %s", args)

    def test_tool_call_arguments_are_valid_json(self, client):
        """Tool call arguments must be parseable JSON objects."""
        response = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": "Search for 'best restaurants in Tokyo'"}],
            tools=[SEARCH_TOOL],
            tool_choice="required",
            temperature=0,
            max_tokens=512,
        )

        msg = response.choices[0].message
        assert msg.tool_calls
        args = json.loads(msg.tool_calls[0].function.arguments)
        assert isinstance(args, dict)
        assert "query" in args

    def test_tool_call_has_id(self, client):
        """Each tool call should have a unique ID."""
        response = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": "Check weather in Berlin"}],
            tools=[WEATHER_TOOL],
            tool_choice="required",
            temperature=0,
            max_tokens=512,
        )

        msg = response.choices[0].message
        assert msg.tool_calls
        assert msg.tool_calls[0].id, "Tool call should have an ID"
        assert msg.tool_calls[0].type == "function"

    def test_tool_call_finish_reason(self, client):
        """finish_reason should be 'tool_calls' when tools are returned."""
        response = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": "Weather in London?"}],
            tools=[WEATHER_TOOL],
            tool_choice="required",
            temperature=0,
            max_tokens=512,
        )

        if response.choices[0].message.tool_calls:
            assert response.choices[0].finish_reason == "tool_calls"

    def test_model_picks_correct_tool(self, client):
        """With multiple tools, model should pick the right one."""
        response = client.chat.completions.create(
            model=MODEL,
            messages=[
                {"role": "user", "content": "Translate 'hello' to French. Use the translate tool."}
            ],
            tools=ALL_TOOLS,
            tool_choice="required",
            temperature=0,
            max_tokens=512,
        )

        msg = response.choices[0].message
        assert msg.tool_calls
        assert msg.tool_calls[0].function.name == "translate"

    def test_parallel_tool_calls(self, client):
        """Model can return multiple tool calls in a single response."""
        response = client.chat.completions.create(
            model=MODEL,
            messages=[
                {
                    "role": "user",
                    "content": (
                        "Do two things: 1) Get weather in Tokyo "
                        "2) Search for 'Tokyo travel guide'. Call both tools in parallel."
                    ),
                }
            ],
            tools=[WEATHER_TOOL, SEARCH_TOOL],
            tool_choice="required",
            temperature=0,
            max_tokens=1024,
        )

        msg = response.choices[0].message
        assert msg.tool_calls
        assert len(msg.tool_calls) >= 2
        names = {tc.function.name for tc in msg.tool_calls}
        assert "get_weather" in names
        assert "search" in names

    def test_single_tool_call_auto(self, client):
        """tool_choice=auto exercises the deepseek32 DSML parser path.

        Unlike tool_choice=required (which uses JSON schema constrained decoding
        and bypasses the parser), auto mode lets the model output freely and
        relies on the deepseek32 parser to detect and parse DSML markers.
        Requires the DSML chat template (--chat-template) to be set.
        """
        response = client.chat.completions.create(
            model=MODEL,
            messages=[
                {
                    "role": "user",
                    "content": "Use the get_weather tool to check the weather in Tokyo.",
                }
            ],
            tools=[WEATHER_TOOL],
            tool_choice="auto",
            temperature=0,
            max_tokens=512,
        )

        msg = response.choices[0].message
        assert msg.tool_calls, "Model should call tool when explicitly asked (auto mode)"
        args = assert_valid_tool_call(msg.tool_calls[0], "get_weather")
        assert "location" in args
        logger.info("Auto DSML tool args: %s", args)

    def test_tool_choice_none(self, client):
        """tool_choice=none should prevent tool calls."""
        response = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": "What's the weather in NYC?"}],
            tools=[WEATHER_TOOL],
            tool_choice="none",
            temperature=0,
            max_tokens=256,
        )

        msg = response.choices[0].message
        assert not msg.tool_calls
        assert msg.content

    def test_usage_stats_present(self, client):
        """Response should include usage statistics."""
        response = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": "Check weather in NYC"}],
            tools=[WEATHER_TOOL],
            tool_choice="required",
            temperature=0,
            max_tokens=256,
        )

        assert response.usage is not None
        assert response.usage.prompt_tokens > 0
        assert response.usage.completion_tokens > 0

    def test_unicode_in_tool_arguments(self, client):
        """Tool arguments with unicode content."""
        response = client.chat.completions.create(
            model=MODEL,
            messages=[
                {"role": "user", "content": "Translate 'こんにちは' to English using the translate tool."}
            ],
            tools=[TRANSLATE_TOOL],
            tool_choice="required",
            temperature=0,
            max_tokens=512,
        )

        msg = response.choices[0].message
        assert msg.tool_calls
        args = json.loads(msg.tool_calls[0].function.arguments)
        assert "text" in args


# =============================================================================
# Streaming Tests
# =============================================================================


class TestDeepSeek32Streaming:
    """Streaming tool call tests for DeepSeek V3.2 DSML parser."""

    def test_streaming_single_tool_call(self, client):
        """Streaming delivers tool call name and arguments across chunks."""
        stream = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
            tools=[WEATHER_TOOL],
            tool_choice="required",
            temperature=0,
            max_tokens=512,
            stream=True,
        )

        tool_calls, chunks_count, finish_reason = collect_streaming_tool_calls(stream)

        assert chunks_count > 1, "Streaming should return multiple chunks"
        assert len(tool_calls) >= 1
        tc = tool_calls[0]
        assert tc["name"] == "get_weather"
        args = json.loads(tc["arguments"])
        assert "location" in args

    def test_streaming_arguments_arrive_incrementally(self, client):
        """Arguments should arrive across multiple chunks."""
        stream = client.chat.completions.create(
            model=MODEL,
            messages=[
                {"role": "user", "content": "Search for 'comprehensive guide to machine learning'"}
            ],
            tools=[SEARCH_TOOL],
            tool_choice="required",
            temperature=0,
            max_tokens=512,
            stream=True,
        )

        arg_chunk_count = 0
        for chunk in stream:
            delta = chunk.choices[0].delta if chunk.choices else None
            if delta and delta.tool_calls:
                for tc in delta.tool_calls:
                    if tc.function and tc.function.arguments:
                        arg_chunk_count += 1

        assert arg_chunk_count > 1, f"Expected incremental args, got {arg_chunk_count} chunks"

    def test_streaming_finish_reason(self, client):
        """Streaming should end with a finish_reason."""
        stream = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": "Weather in London"}],
            tools=[WEATHER_TOOL],
            tool_choice="required",
            temperature=0,
            max_tokens=256,
            stream=True,
        )

        _, _, finish_reason = collect_streaming_tool_calls(stream)
        assert finish_reason is not None

    def test_streaming_single_tool_call_auto(self, client):
        """Streaming with tool_choice=auto exercises the DSML incremental parser.

        This is the most important streaming test — it validates parse_incremental
        with real DSML token output. All other streaming tests use required mode
        which bypasses the parser via JSON schema constrained decoding.
        """
        stream = client.chat.completions.create(
            model=MODEL,
            messages=[
                {
                    "role": "user",
                    "content": "Use the get_weather tool to check the weather in Tokyo.",
                }
            ],
            tools=[WEATHER_TOOL],
            tool_choice="auto",
            temperature=0,
            max_tokens=512,
            stream=True,
        )

        tool_calls, chunks_count, _ = collect_streaming_tool_calls(stream)

        assert chunks_count > 1, "Streaming should return multiple chunks"
        assert len(tool_calls) >= 1, "Model should call tool when explicitly asked (auto streaming)"
        tc = tool_calls[0]
        assert tc["name"] == "get_weather"
        args = json.loads(tc["arguments"])
        assert "location" in args
        logger.info("Streaming auto DSML tool args: %s", args)

    def test_streaming_parallel_tool_calls(self, client):
        """Streaming should handle multiple tool calls when model emits them."""
        stream = client.chat.completions.create(
            model=MODEL,
            messages=[
                {
                    "role": "user",
                    "content": (
                        "Do two things at once: "
                        "1) Get weather in Paris "
                        "2) Search for 'Paris travel tips'. "
                        "You MUST call BOTH get_weather AND search tools in parallel."
                    ),
                }
            ],
            tools=[WEATHER_TOOL, SEARCH_TOOL],
            tool_choice="required",
            temperature=0,
            max_tokens=1024,
            stream=True,
        )

        tool_calls, _, _ = collect_streaming_tool_calls(stream)

        assert len(tool_calls) >= 1, "Should have at least one streaming tool call"
        for idx, tc in tool_calls.items():
            assert tc["name"], f"Tool call {idx} should have a name"
            args = json.loads(tc["arguments"])
            assert isinstance(args, dict), f"Tool call {idx} args should be valid JSON object"

        names = {tc["name"] for tc in tool_calls.values()}
        logger.info("Streaming parallel tool names: %s (count: %d)", names, len(tool_calls))

        if len(tool_calls) >= 2:
            assert "get_weather" in names
            assert "search" in names


# =============================================================================
# Multi-Turn Tests
# =============================================================================


class TestDeepSeek32MultiTurn:
    """Multi-turn conversations with tool results."""

    def test_tool_result_followup(self, client):
        """Model should use tool result to form a final text response."""
        response = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
            tools=[WEATHER_TOOL],
            tool_choice="required",
            temperature=0,
            max_tokens=512,
        )

        msg = response.choices[0].message
        assert msg.tool_calls
        tool_call = msg.tool_calls[0]

        response2 = client.chat.completions.create(
            model=MODEL,
            messages=[
                {"role": "user", "content": "What's the weather in Tokyo?"},
                {
                    "role": "assistant",
                    "tool_calls": [
                        {
                            "id": tool_call.id,
                            "type": "function",
                            "function": {
                                "name": tool_call.function.name,
                                "arguments": tool_call.function.arguments,
                            },
                        }
                    ],
                },
                {
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": json.dumps(
                        {"temperature": 22, "unit": "celsius", "condition": "sunny"}
                    ),
                },
            ],
            tools=[WEATHER_TOOL],
            temperature=0,
            max_tokens=512,
        )

        msg2 = response2.choices[0].message
        assert msg2.content, "Model should reply with text after receiving tool result"

    def test_tool_result_followup_streaming(self, client):
        """Streaming follow-up with tool result should produce text content."""
        response = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": "What's the weather in Paris?"}],
            tools=[WEATHER_TOOL],
            tool_choice="required",
            temperature=0,
            max_tokens=256,
        )

        msg = response.choices[0].message
        assert msg.tool_calls
        tool_call = msg.tool_calls[0]

        stream = client.chat.completions.create(
            model=MODEL,
            messages=[
                {"role": "user", "content": "What's the weather in Paris?"},
                {
                    "role": "assistant",
                    "tool_calls": [
                        {
                            "id": tool_call.id,
                            "type": "function",
                            "function": {
                                "name": tool_call.function.name,
                                "arguments": tool_call.function.arguments,
                            },
                        }
                    ],
                },
                {
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": json.dumps(
                        {"temperature": 18, "unit": "celsius", "condition": "cloudy"}
                    ),
                },
            ],
            tools=[WEATHER_TOOL],
            temperature=0,
            max_tokens=256,
            stream=True,
        )

        content_parts = []
        for chunk in stream:
            delta = chunk.choices[0].delta if chunk.choices else None
            if delta and delta.content:
                content_parts.append(delta.content)

        assert "".join(content_parts), "Streaming follow-up should produce text content"

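# Illustrative sketch (not part of the gateway parser): a minimal regex-based
# extraction of DSML-style invoke blocks, to document the format these tests
# exercise end-to-end. The exact tag syntax below is an assumption for
# illustration, loosely modeled on the <|DSML|...> tags and the
# string="true"/"false" type-aware reconstruction described in this PR; the
# real Rust parser in crates/tool_parser/src/parsers/deepseek32.rs is the
# authoritative implementation.

import re

_DSML_SAMPLE = (
    '<|DSML|invoke name="get_weather">'
    '<|DSML|parameter name="location" string="true">Tokyo</|DSML|parameter>'
    '<|DSML|parameter name="unit" string="true">celsius</|DSML|parameter>'
    "</|DSML|invoke>"
)

_INVOKE_RE = re.compile(
    r'<\|DSML\|invoke name="([^"]+)">(.*?)</\|DSML\|invoke>', re.S
)
_PARAM_RE = re.compile(
    r'<\|DSML\|parameter name="([^"]+)" string="(true|false)">'
    r"(.*?)</\|DSML\|parameter>",
    re.S,
)


def _parse_dsml(text):
    """Return [(tool_name, arguments_dict), ...] from a DSML fragment.

    string="true" keeps the raw value as a Python string; string="false"
    parses the value as JSON (mirroring the type-aware argument
    reconstruction described in the PR, as understood here).
    """
    calls = []
    for name, body in _INVOKE_RE.findall(text):
        args = {}
        for pname, is_string, value in _PARAM_RE.findall(body):
            args[pname] = value if is_string == "true" else json.loads(value)
        calls.append((name, args))
    return calls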

# =============================================================================
# Run directly
# =============================================================================

if __name__ == "__main__":
    import sys

    sys.exit(
        pytest.main([__file__, "-v", "--tb=short", "-x", "--no-header", *sys.argv[1:]])
    )
