fix: return finish_reason=tool_calls when tool calls detected by eloe · Pull Request #14 · eloe/mlx-vlm

eloe · 2026-04-06T16:16:23Z

Summary

Return finish_reason="tool_calls" instead of "stop" when process_tool_calls finds calls. Applies to both streaming and non-streaming /chat/completions. Adds 3 tests.

OpenAI-compatible clients check finish_reason to decide whether to enter the tool execution loop. Previously mlx-vlm always returned "stop" even when process_tool_calls found calls in the model output. Now both streaming and non-streaming /chat/completions responses return finish_reason="tool_calls" when tool calls are present, and "stop" otherwise. Adds 3 tests covering: stop without tools, tool_calls with tools, stop with tools but no calls made.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
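The behavior described above can be sketched in a few lines. This is a minimal illustration rather than the actual mlx-vlm code: `pick_finish_reason` and `should_run_tools` are hypothetical helper names, and the `{"calls": [...]}` dict shape is assumed from the `tool_calls["calls"]` usage in this PR's diff.

```python
from typing import Any, Dict, List


def pick_finish_reason(tool_calls: Dict[str, List[Any]]) -> str:
    """Server side: "tool_calls" if any calls were parsed, otherwise "stop"."""
    return "tool_calls" if tool_calls.get("calls") else "stop"


def should_run_tools(response: Dict[str, Any]) -> bool:
    """Client side: enter the tool loop only when finish_reason is "tool_calls"."""
    choice = response["choices"][0]
    return choice.get("finish_reason") == "tool_calls"


# Hypothetical responses shaped like OpenAI chat completions.
with_calls = {
    "choices": [
        {"finish_reason": pick_finish_reason({"calls": [{"name": "get_weather"}]})}
    ]
}
without_calls = {"choices": [{"finish_reason": pick_finish_reason({"calls": []})}]}

print(should_run_tools(with_calls))
print(should_run_tools(without_calls))
```

With the old behavior (always "stop"), a client written like `should_run_tools` would never execute the tools, even when the model output contained valid calls.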

Copilot

Pull request overview

This PR updates the OpenAI-compatible /chat/completions endpoint to return finish_reason="tool_calls" (instead of "stop") when tool calls are detected, aligning both streaming and non-streaming responses with expected tool-calling semantics.

Changes:

- Set streaming final chunk finish_reason to "tool_calls" when process_tool_calls finds calls.
- Set non-streaming response finish_reason to "tool_calls" when tool calls are present.
- Add unit tests covering non-streaming finish_reason behavior for tool/no-tool scenarios.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `mlx_vlm/server.py` | Computes `finish_reason` based on detected tool calls for both streaming and non-streaming chat completions. |
| `mlx_vlm/tests/test_server.py` | Adds tests asserting `finish_reason` is `"stop"` vs `"tool_calls"` for non-streaming `/chat/completions`. |


Copilot · 2026-04-06T16:31:18Z

@@ -1,4 +1,5 @@
 import argparse
+import asyncio


asyncio is imported but never used in this module. Please remove the unused import to avoid lint failures and keep imports minimal.

Suggested change

-import asyncio

Copilot · 2026-04-06T16:31:19Z

                    else:
                        tool_calls = {}
                        tool_calls["calls"] = []

-                    # Signal stream end
+                    # Signal stream end with correct finish_reason
+                    stream_finish = "tool_calls" if tool_calls.get("calls") else "stop"
                    choices = [
                        ChatStreamChoice(
-                            finish_reason="stop",
+                            finish_reason=stream_finish,
                            delta=ChatMessage(


The streaming path now sets finish_reason based on whether tool calls were detected, but the new/updated behavior isn’t covered by tests in this file (current tests only exercise non-streaming /chat/completions). Please add a streaming test case (e.g., stream=True) that asserts the final SSE chunk includes finish_reason: tool_calls when process_tool_calls returns calls, and stop when it doesn’t.
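One way such a streaming assertion could look is sketched below. It assumes the server emits OpenAI-style SSE lines (`data: {...}` followed by a `data: [DONE]` sentinel), which this PR does not confirm for mlx-vlm; `final_finish_reason` is a hypothetical helper, and the sample stream is hand-written rather than captured from the server.

```python
import json
from typing import Iterable, Optional


def final_finish_reason(sse_lines: Iterable[str]) -> Optional[str]:
    """Return the last non-null finish_reason seen in an SSE stream."""
    last = None
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            if choice.get("finish_reason") is not None:
                last = choice["finish_reason"]
    return last


# Hand-written sample streams; a real test would read lines from the
# streaming response of a test client with stream=True.
tool_stream = [
    'data: {"choices": [{"delta": {"content": "Hi"}, "finish_reason": null}]}',
    "",
    'data: {"choices": [{"delta": {}, "finish_reason": "tool_calls"}]}',
    "",
    "data: [DONE]",
]
plain_stream = [
    'data: {"choices": [{"delta": {"content": "Hi"}, "finish_reason": null}]}',
    "",
    'data: {"choices": [{"delta": {}, "finish_reason": "stop"}]}',
    "",
    "data: [DONE]",
]

assert final_finish_reason(tool_stream) == "tool_calls"
assert final_finish_reason(plain_stream) == "stop"
```

In an actual test, process_tool_calls would be mocked to return calls in one case and an empty list in the other, and the helper would run over the lines of the streamed response.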

eloe requested a review from Copilot April 6, 2026 16:28

Copilot started reviewing on behalf of eloe April 6, 2026 16:28

Copilot AI reviewed Apr 6, 2026


fix: address Copilot review feedback for PR #14

f9d9f0d

Remove unused asyncio import and add streaming test for finish_reason=tool_calls. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

eloe mentioned this pull request Apr 8, 2026

Combined server enhancements: OpenAI API compliance, prompt caching, concurrency #21

Merged

6 tasks

eloe added a commit that referenced this pull request Apr 9, 2026

fix: address Copilot review feedback for PR #14

115769b


eloe wants to merge 2 commits into main from feature/finish-reason-tool-calls
