Skip to content

fix: return finish_reason=tool_calls when tool calls detected#14

Open
eloe wants to merge 2 commits into
mainfrom
feature/finish-reason-tool-calls
Open

fix: return finish_reason=tool_calls when tool calls detected#14
eloe wants to merge 2 commits into
mainfrom
feature/finish-reason-tool-calls

Conversation

@eloe
Copy link
Copy Markdown
Owner

@eloe eloe commented Apr 6, 2026

Summary\nReturn finish_reason="tool_calls" instead of "stop" when process_tool_calls finds calls. Both streaming and non-streaming /chat/completions. 3 tests.

OpenAI-compatible clients check finish_reason to decide whether to
enter the tool execution loop. Previously mlx-vlm always returned
"stop" even when process_tool_calls found calls in the model output.

Now both streaming and non-streaming /chat/completions responses
return finish_reason="tool_calls" when tool calls are present,
and "stop" otherwise.

Adds 3 tests covering: stop without tools, tool_calls with tools,
stop with tools but no calls made.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the OpenAI-compatible /chat/completions endpoint to return finish_reason="tool_calls" (instead of "stop") when tool calls are detected, aligning both streaming and non-streaming responses with expected tool-calling semantics.

Changes:

  • Set streaming final chunk finish_reason to "tool_calls" when process_tool_calls finds calls.
  • Set non-streaming response finish_reason to "tool_calls" when tool calls are present.
  • Add unit tests covering non-streaming finish_reason behavior for tool/no-tool scenarios.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
mlx_vlm/server.py Computes finish_reason based on detected tool calls for both streaming and non-streaming chat completions.
mlx_vlm/tests/test_server.py Adds tests asserting finish_reason is "stop" vs "tool_calls" for non-streaming /chat/completions.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread mlx_vlm/server.py Outdated
@@ -1,4 +1,5 @@
import argparse
import asyncio
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

asyncio is imported but never used in this module. Please remove the unused import to avoid lint failures and keep imports minimal.

Suggested change
import asyncio

Copilot uses AI. Check for mistakes.
Comment thread mlx_vlm/server.py
Comment on lines 1176 to 1185
else:
tool_calls = {}
tool_calls["calls"] = []

# Signal stream end
# Signal stream end with correct finish_reason
stream_finish = "tool_calls" if tool_calls.get("calls") else "stop"
choices = [
ChatStreamChoice(
finish_reason="stop",
finish_reason=stream_finish,
delta=ChatMessage(
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The streaming path now sets finish_reason based on whether tool calls were detected, but the new/updated behavior isn’t covered by tests in this file (current tests only exercise non-streaming /chat/completions). Please add a streaming test case (e.g., stream=True) that asserts the final SSE chunk includes finish_reason: tool_calls when process_tool_calls returns calls, and stop when it doesn’t.

Copilot uses AI. Check for mistakes.
Remove unused asyncio import and add streaming test for
finish_reason=tool_calls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
eloe added a commit that referenced this pull request Apr 9, 2026
Remove unused asyncio import and add streaming test for
finish_reason=tool_calls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants