fix: preserve non-ASCII text in provider request formatting #2776

Open
kimnamu wants to merge 1 commit into strands-agents:main from kimnamu:fix/provider-non-ascii-request-formatting

kimnamu commented Jun 13, 2026

Thank you to the Strands maintainers for the excellent SDK, and for the recent #2653, which fixed the @tool decorator path. This PR addresses the provider-side counterpart of the same issue.

Fixes #2660

Problem

Provider request-formatting paths serialize tool-call arguments and tool-result {"json": ...} content blocks with json.dumps(), which defaults to ensure_ascii=True. Non-ASCII text (CJK, emoji) is therefore escaped to \uXXXX in the request sent to the model. For non-Latin scripts this inflates token usage (and cost) and hurts readability/debuggability.

This is the provider-side counterpart to the @tool decorator path fixed in #2653, and it matches what the SDK already does in telemetry/tracer.py and the session managers (ensure_ascii=False).
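
To make the escaping concrete, here is a minimal standalone sketch using only the standard library (it does not touch the SDK). It compares the two serializations of the same payload; character count is used here only as a rough proxy for request size, not as a token measurement.

```python
import json

payload = {"city": "東京"}

# Default behaviour: ensure_ascii=True escapes every non-ASCII character.
escaped = json.dumps(payload)
# With ensure_ascii=False the characters are emitted as-is.
preserved = json.dumps(payload, ensure_ascii=False)

print(escaped)    # {"city": "\u6771\u4eac"}
print(preserved)  # {"city": "東京"}

# Both strings decode to the same object, so only the wire form differs.
assert json.loads(escaped) == json.loads(preserved) == payload

# Each escaped CJK character costs 6 characters instead of 1.
print(len(escaped), len(preserved))  # 24 14
```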

Reproduction (no network / API key needed)

from strands.models.openai import OpenAIModel

tool_result = {"toolUseId": "c1", "status": "success", "content": [{"json": {"city": "東京"}}]}
print(OpenAIModel.format_request_tool_message(tool_result)["content"])
| | Before | After |
|---|---|---|
| OpenAI tool-result {"json": {"city": "東京"}} | {"city": "\u6771\u4eac"} | {"city": "東京"} |
| OpenAI tool-call args {"query": "東京"} | {"query": "\u6771\u4eac"} | {"query": "東京"} |
| Anthropic tool-result json | \uXXXX-escaped ❌ | preserved ✅ |
| Response-parsing paths (bedrock.py, ollama.py, gemini.py stream→internal) | unchanged ✅ | unchanged (not request-facing) |
| Public API / signatures / output ordering | ✅ | unchanged |

Change

json.dumps(x) → json.dumps(x, ensure_ascii=False) in the model-visible format_request_* paths only, across the affected providers (openai, anthropic, mistral, llamaapi, writer, llamacpp, ollama, openai_responses). Response→internal conversion paths are intentionally left untouched.
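
As an illustration of the shape of this change, the sketch below shows a simplified content-block formatter. The function name `format_tool_result_content` and the returned dictionary layout are hypothetical and chosen for the example; they are not the SDK's actual provider code.

```python
import json
from typing import Any


def format_tool_result_content(content: dict[str, Any]) -> dict[str, str]:
    """Hypothetical stand-in for a provider's request-formatting step.

    Converts one tool-result content block into a text part for the
    model-facing request.
    """
    if "json" in content:
        # The one-argument change: keep non-ASCII characters unescaped.
        text = json.dumps(content["json"], ensure_ascii=False)
        return {"type": "text", "text": text}
    if "text" in content:
        return {"type": "text", "text": content["text"]}
    raise TypeError(f"unsupported content block keys: {sorted(content)}")


print(format_tool_result_content({"json": {"city": "東京"}})["text"])
# {"city": "東京"}
```

Because `ensure_ascii` affects only how characters are written, not which keys or values are emitted, a change of this kind leaves key order and structure as they were.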

Tests

Added regression tests on the issue's primary paths (test_openai.py, test_anthropic.py). Verified they catch the bug:

  • Source reverted (bug present) → 3 failed (e.g. assert '\\u' not in '{"city": "\\u6771\\u4eac"}')
  • Source restored (fix) → 3 passed
  • Full changed-provider suite → 398 passed, no regressions.
  • ruff check / ruff format --check / mypy (changed files) all clean.
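
A regression test of the kind described above might look roughly like the following sketch. `format_tool_message` is a hypothetical stand-in so the example runs without the SDK installed; the PR's real tests call the provider classes in test_openai.py and test_anthropic.py instead.

```python
import json


def format_tool_message(tool_result: dict) -> dict:
    """Hypothetical stand-in for a provider's format_request_* method."""
    parts = [
        json.dumps(block["json"], ensure_ascii=False)
        for block in tool_result["content"]
        if "json" in block
    ]
    return {
        "role": "tool",
        "tool_call_id": tool_result["toolUseId"],
        "content": "".join(parts),
    }


def test_non_ascii_is_preserved() -> None:
    tool_result = {
        "toolUseId": "c1",
        "status": "success",
        "content": [{"json": {"city": "東京", "mood": "🎉"}}],
    }
    content = format_tool_message(tool_result)["content"]

    # No \uXXXX escape sequences should reach the model-facing request.
    assert "\\u" not in content
    # The original characters appear verbatim, including the emoji.
    assert "東京" in content and "🎉" in content
    # The payload still round-trips to the same object.
    assert json.loads(content) == {"city": "東京", "mood": "🎉"}


test_non_ascii_is_preserved()
```

Checking for the absence of `\u` in addition to the round trip matters: the round trip alone passes under either setting, so it would not catch the regression.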

This contribution was prepared with the help of an AI agent (Claude Code); a human reviewed the change, rationale, and test results before submission.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Provider request-formatting paths serialize tool-call arguments and
tool-result json content blocks with json.dumps(), which defaults to
ensure_ascii=True and escapes non-ASCII (CJK, emoji) to \uXXXX in the
request sent to the model. This inflates token usage for non-Latin
scripts and hurts readability/debuggability.

This is the provider-side counterpart to the @tool decorator path fixed
in strands-agents#2653, and matches what the SDK already does in telemetry/tracer.py
and the session managers (ensure_ascii=False).

Fixes strands-agents#2660
github-actions bot added the size/s, python, area-model, and bug labels Jun 13, 2026
Labels

area-model, bug, python, size/s

Development

Successfully merging this pull request may close these issues.

[BUG] Provider request formatting escapes non-ASCII in tool-result JSON and tool-call arguments (follow-up to #2636)

1 participant