Why this is here (not on the upstream repo)
This is a tracking issue for a bug in the upstream `waybarrios/vllm-mlx` project, filed here because the automation PAT used during the Bifrost migration session cannot file issues on external repos. The reproducer and context below are written to be copy-pasted into a real issue on `waybarrios/vllm-mlx` by a human with the right credentials.
Upstream bug: unescaped control characters in /v1/chat/completions responses
`vllm-mlx` 0.2.6 emits literal unescaped newline characters (and likely other control bytes in U+0000 – U+001F) inside `choices[].message.content` strings when the underlying MLX model generates multi-line output (Python code, long prose, etc.). This violates RFC 8259 §7, which requires control characters in string values to be escaped as `\n`, `\t`, `\uXXXX`, etc.
Strict JSON parsers (`jq`, some SDKs in strict mode) reject the response body:
```text
jq: parse error: Invalid string: control characters from U+0000 through U+001F must be escaped at line 4, column 32
```
Reproducer
Environment:
- Apple M4 Max, macOS 26 (Tahoe)
- `vllm-mlx` 0.2.6 (installed via `uvx`)
- Model: `mlx-community/Qwen3-Coder-30B-A3B-Instruct-4bit`
- Serving config: `--cache-memory-mb 16384 --enable-auto-tool-choice --tool-call-parser hermes`
Steps:
- Start the server directly (or via llama-swap):
```bash
uvx --from vllm-mlx==0.2.6 vllm-mlx serve \
  mlx-community/Qwen3-Coder-30B-A3B-Instruct-4bit \
  --port 11443 --host 127.0.0.1 \
  --cache-memory-mb 16384 \
  --enable-auto-tool-choice --tool-call-parser hermes
```
- Send a prompt that produces multi-line output:
```bash
curl -sf -X POST http://127.0.0.1:11443/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model":"mlx-community/Qwen3-Coder-30B-A3B-Instruct-4bit",
    "messages":[{
      "role":"user",
      "content":"Return the corrected factorial function, nothing else:\ndef factorial(n):\n if n == 0:\n return 0\n return n * factorial(n-1)"
    }],
    "max_tokens":150,
    "temperature":0
  }' | jq .
```
Expected
`jq` parses the response successfully; `.choices[0].message.content` is a valid JSON string with escaped newlines.
Actual
`jq` fails with `Invalid string: control characters from U+0000 through U+001F must be escaped`. Raw bytes (via `curl ... | hexdump -C`) show literal `0x0A` bytes inside the `content` string rather than the two-byte escape sequence `\n` (bytes `0x5C 0x6E`).
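The hexdump check can also be automated. The following is a minimal sketch, not part of `vllm-mlx`: the helper name `find_raw_control_bytes` is hypothetical, and it assumes the response body has been saved as raw bytes. It walks the body, tracks whether it is inside a JSON string literal, and reports the offset of every byte below `0x20` found there.

```python
def find_raw_control_bytes(body: bytes) -> list[int]:
    """Return offsets of bytes below 0x20 that sit inside JSON string literals.

    Hypothetical helper for illustration; whitespace between JSON tokens
    (outside strings) is legal and is therefore not reported.
    """
    offsets = []
    in_string = False
    escaped = False
    for i, b in enumerate(body):
        if not in_string:
            if b == 0x22:  # '"' opens a string literal
                in_string = True
            continue
        if escaped:
            escaped = False  # byte after a backslash is part of an escape
        elif b == 0x5C:  # '\' starts an escape sequence
            escaped = True
        elif b == 0x22:  # unescaped '"' closes the string literal
            in_string = False
        elif b < 0x20:  # raw control byte, forbidden by RFC 8259 section 7
            offsets.append(i)
    return offsets


bad = b'{"content": "a\nb"}'    # literal 0x0A inside the string
good = b'{"content": "a\\nb"}'  # escaped as backslash + 'n'
print(find_raw_control_bytes(bad))   # [14]
print(find_raw_control_bytes(good))  # []
```

An empty result for a captured response body would indicate that this particular defect is absent from that response.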
Impact
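- Strict JSON clients cannot parse the response. Workarounds: `jq -Rn` (raw-input mode, which reads the body as plain text instead of parsing it as JSON) or a cleanup filter that escapes the control characters before parsing.
- Lenient parsers handle it fine (for example Python's `json.loads` with `strict=False`; the default `strict=True` rejects raw control characters), and most OpenAI SDKs reportedly cope, so the bug has been invisible to most consumers.
- Gateways that relay the body unmodified (such as Bifrost, maximhq/bifrost) propagate the invalid JSON to their clients.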
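As an illustration of the cleanup-filter workaround, here is a minimal sketch. The function name `escape_raw_controls` is hypothetical, and it assumes the body is otherwise well-formed JSON whose only defect is raw control characters inside string literals. It rewrites those characters into their escaped forms and leaves everything else untouched.

```python
import json


def escape_raw_controls(body: str) -> str:
    """Escape raw control characters that occur inside JSON string literals.

    Hypothetical client-side repair for illustration; it does not try to
    fix any other kind of malformed JSON.
    """
    out = []
    in_string = False
    escaped = False
    for ch in body:
        if in_string:
            if escaped:
                escaped = False  # character after a backslash, keep as-is
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            elif ch < "\x20":
                # json.dumps yields e.g. '"\\n"'; strip the surrounding quotes
                out.append(json.dumps(ch)[1:-1])
                continue
        elif ch == '"':
            in_string = True
        out.append(ch)
    return "".join(out)


raw_body = '{"content": "line1\nline2"}'  # literal newline inside the string
repaired = escape_raw_controls(raw_body)
print(repaired)                          # {"content": "line1\nline2"}
print(json.loads(repaired)["content"])   # prints line1 and line2 on two lines
```

A filter like this only masks the problem on the client side; the server-side fix described under the root cause remains the real remedy.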
Root cause (best guess)
The response serialization path for `/v1/chat/completions` likely concatenates raw model output into a JSON template rather than round-tripping through `json.dumps` / a proper JSON encoder. The fix is to ensure the full response body is encoded through a JSON library that escapes control characters per RFC 8259.
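To make the suspected failure mode concrete, here is a small sketch contrasting the two approaches. It is not taken from the `vllm-mlx` source: the functions `build_body_by_template` and `build_body_with_encoder` are hypothetical, and the template version is only a guess at the kind of code that would produce the observed bytes.

```python
import json


def build_body_by_template(content: str) -> str:
    # Hypothetical buggy pattern: raw model text is spliced into a
    # JSON-shaped string, so control characters are emitted verbatim.
    return '{"choices": [{"message": {"role": "assistant", "content": "' + content + '"}}]}'


def build_body_with_encoder(content: str) -> str:
    # Encoding the whole structure through json.dumps escapes control
    # characters (and quotes/backslashes) as RFC 8259 requires.
    payload = {"choices": [{"message": {"role": "assistant", "content": content}}]}
    return json.dumps(payload)


model_output = "def factorial(n):\n    return 1 if n == 0 else n * factorial(n - 1)"

broken = build_body_by_template(model_output)
fixed = build_body_with_encoder(model_output)

print("\n" in broken)  # True: a literal newline sits inside the string
print("\n" in fixed)   # False: the newline became a backslash followed by 'n'
print(json.loads(fixed)["choices"][0]["message"]["content"] == model_output)  # True
```

The encoder version also handles quotes and backslashes in the model output, which a template approach would have to escape by hand.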
Priority
Low. Only affects strict parsers; most real-world consumers handle it fine. Filing for visibility and to track the eventual fix.
Discovery context
Caught during the Bifrost AI gateway benchmark session on 2026-04-10. Full context in JacobPEvans/nix-ai#469 Bench G.
Related tracking issues:
Next step
When convenient, copy the Upstream bug section above (everything between that header and the Discovery context section) into a new issue at https://github.com/waybarrios/vllm-mlx/issues. Link the upstream issue back here once filed so we can track the fix.