fix(mcp): avoid relisting tools on resumed responses#1060
fix(mcp): avoid relisting tools on resumed responses#1060
Conversation
📝 WalkthroughWalkthroughTracks MCP Changes
Sequence DiagramsequenceDiagram
participant Client
participant Router as Router<br/>(responses/route.rs)
participant History as History Loader<br/>(responses/history.rs)
participant ToolLoop as Tool Loop<br/>(mcp/tool_loop.rs)
participant MCP as MCP Session
Client->>Router: ResponsesRequest (first turn)
Router->>History: load_input_history()
History-->>Router: LoadedInputHistory { previous_response_id: None, existing_labels: [] }
Router->>ToolLoop: execute_tool_loop(ctx { existing_labels: [] })
ToolLoop->>ToolLoop: mcp_list_tools_bindings_to_emit([], all_bindings)
ToolLoop->>MCP: emit all MCP list-tools
ToolLoop-->>Router: response with mcp_list_tools + mcp_call
Client->>Router: ResponsesRequest (resume with previous_response_id)
Router->>History: load_input_history(previous_response_id)
History->>History: extract_mcp_list_tools_labels(prior_output)
History-->>Router: LoadedInputHistory { previous_response_id: Some(id), existing_labels: ["binding1"] }
Router->>ToolLoop: execute_tool_loop(ctx { existing_labels: ["binding1"] })
ToolLoop->>ToolLoop: mcp_list_tools_bindings_to_emit([binding1], all_bindings)
ToolLoop->>MCP: emit only new MCP list-tools
ToolLoop-->>Router: response with only mcp_call (no repeated mcp_list_tools)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces a mechanism to prevent redundant MCP tool listings when resuming conversations via previous_response_id. It tracks previously emitted MCP server labels throughout the request lifecycle and filters them from subsequent outputs in both streaming and non-streaming modes. The feedback identifies several improvement opportunities: replacing fixed time.sleep calls in tests with more robust polling to reduce flakiness, addressing fragile list indexing and logic duplication in the Python E2E tests, and optimizing the Rust implementation of existing_mcp_list_tools_labels to avoid inefficient JSON serialization.
49b6d6f to
66ecd5c
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 66ecd5c9ed
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
model_gateway/src/routers/openai/responses/history.rs (1)
44-60:⚠️ Potential issue | 🟠 MajorMCP items are silently dropped during conversation loading, causing duplicate list-tools emissions on resume.
Conversation persistence stores
mcp_list_toolsandmcp_callitems in conversation storage (seeitem_to_new_conversation_item), but the conversation loader here only reconstructsmessage,function_call, andfunction_call_outputitems. All other types—includingmcp_list_toolsandmcp_call—fall into the_pattern and are discarded with a warning.As a result,
existing_mcp_list_tools_labelsis only seeded fromget_response_chain(previous_response_id). When a conversation-backed request resumes, the tool loop receives an empty label set relative to the conversation history, causingmcp_list_tools_bindings_to_emitto re-emit bindings for servers that were already listed in the conversation's output, leading to duplicates.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@model_gateway/tests/common/mock_worker.rs`:
- Around line 669-688: The current guard around emitting mock tool calls uses
has_prior_tool_context which is true if previous_response_id or any mcp_* resume
metadata exists, causing resumed requests to short-circuit to final assistant
messages; update the logic so that only concrete prior tool results count:
remove previous_response_id from the has_prior_tool_context boolean and instead
require has_function_output || has_mcp_history (or, if previous_response_id is
present, look up stored mock history by that id to detect an actual prior tool
result before treating it as prior context). Apply this same fix to the other
occurrences noted (the checks around has_previous_response_id/has_mcp_history at
the other locations).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: cb2b06d9-8d9c-4283-8063-7853aed7a16e
📒 Files selected for processing (11)
e2e_test/responses/test_tools_call.pymodel_gateway/src/routers/openai/chat.rsmodel_gateway/src/routers/openai/context.rsmodel_gateway/src/routers/openai/mcp/mod.rsmodel_gateway/src/routers/openai/mcp/tool_loop.rsmodel_gateway/src/routers/openai/responses/history.rsmodel_gateway/src/routers/openai/responses/non_streaming.rsmodel_gateway/src/routers/openai/responses/route.rsmodel_gateway/src/routers/openai/responses/streaming.rsmodel_gateway/tests/api/responses_api_test.rsmodel_gateway/tests/common/mock_worker.rs
Signed-off-by: Ziwen Zhao <zzw.mose@gmail.com>
Signed-off-by: Ziwen Zhao <zzw.mose@gmail.com>
Signed-off-by: Ziwen Zhao <zzw.mose@gmail.com>
66ecd5c to
627547e
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@e2e_test/responses/test_tools_call.py`:
- Around line 277-279: Extract a small helper function named
get_completed_response_id(events) that finds events with type
"response.completed", asserts exactly one (or raises a clear assertion message)
and returns its .response.id, then replace the inline list-comprehensions that
index [0] (e.g., where previous_response_id is set from events1 and the
analogous use with events2) with calls to get_completed_response_id(events1) and
get_completed_response_id(events2) so failures produce a clear assertion message
and the test is less brittle.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 523c1a25-0943-444c-8759-44d07b1f7efc
📒 Files selected for processing (11)
e2e_test/responses/test_tools_call.pymodel_gateway/src/routers/openai/chat.rsmodel_gateway/src/routers/openai/context.rsmodel_gateway/src/routers/openai/mcp/mod.rsmodel_gateway/src/routers/openai/mcp/tool_loop.rsmodel_gateway/src/routers/openai/responses/history.rsmodel_gateway/src/routers/openai/responses/non_streaming.rsmodel_gateway/src/routers/openai/responses/route.rsmodel_gateway/src/routers/openai/responses/streaming.rsmodel_gateway/tests/api/responses_api_test.rsmodel_gateway/tests/common/mock_worker.rs
Description
Problem
When
/v1/responsesresumed a prior turn withprevious_response_id, SMG re-emittedmcp_list_toolsfor MCP bindings that were already established in earlier turns.That differs from OpenAI behavior. On follow-up turns, OpenAI does not repeat
mcp_list_toolsfor existing bindings. It only emits a newmcp_list_toolsitem when the request adds a new MCP binding.This PR intentionally scopes the fix to the OpenAI Responses router path first. The gRPC Responses paths will be handled in a follow-up PR.
Solution
SMG now reads prior response history for
mcp_list_tools.server_labelvalues and treats those bindings as already known.On resumed turns:
mcp_list_toolsmcp_list_toolsmcp_callbehavior is unchanged for the current turnThis first PR only applies that behavior to the OpenAI Responses router path. A follow-up PR will align the gRPC Responses paths.
Changes
previous_response_idmcp_list_toolsonly for bindings not already seen in prior response outputTest Plan
Before This Change
First request:
Example response shape:
{ "id": "resp_1", "output": [ { "type": "mcp_list_tools", "server_label": "brave" }, { "type": "mcp_call", "server_label": "brave" }, { "type": "message" } ] }Second request with the same MCP binding and
previous_response_id:Problematic response shape before this change:
{ "id": "resp_2", "output": [ { "type": "mcp_list_tools", "server_label": "brave" }, { "type": "mcp_call", "server_label": "brave" }, { "type": "message" } ] }Why this is wrong:
bravewas already listed in the first turnmcp_list_toolsfor the same bindingThird request adds a new MCP binding:
Problematic response shape before this change:
{ "id": "resp_3", "output": [ { "type": "mcp_list_tools", "server_label": "brave" }, { "type": "mcp_list_tools", "server_label": "deepwiki" }, { "type": "mcp_call", "server_label": "brave" }, { "type": "mcp_call", "server_label": "deepwiki" }, { "type": "message" } ] }Why this is wrong:
deepwikiis newbraveshould not be relisted againAfter This Change
First request stays the same:
{ "id": "resp_1", "output": [ { "type": "mcp_list_tools", "server_label": "brave" }, { "type": "mcp_call", "server_label": "brave" }, { "type": "message" } ] }Second request with the same binding now returns:
{ "id": "resp_2", "output": [ { "type": "mcp_call", "server_label": "brave" }, { "type": "message" } ] }Key change:
mcp_list_toolsforbraveThird request with a newly added binding now returns:
{ "id": "resp_3", "output": [ { "type": "mcp_list_tools", "server_label": "deepwiki" }, { "type": "mcp_call", "server_label": "brave" }, { "type": "mcp_call", "server_label": "deepwiki" }, { "type": "message" } ] }Key change:
deepwikibinding gets a newmcp_list_toolsbravebinding is not relistedStreaming Check
The same rule also applies to SSE event flow:
response.mcp_list_tools.in_progressandresponse.mcp_list_tools.completedagain for existing bindingsChecklist
cargo +nightly fmtpassescargo clippy --all-targets --all-features -- -D warningspassesSummary by CodeRabbit
New Features
Bug Fixes
Tests