feat(ai): support thinking/reasoning models in OpenAI-compatible strategy by ImIvanGil · Pull Request #177 · Axenide/Ambxst

ImIvanGil · 2026-05-13T00:27:04Z

Problem

OpenAI-compatible thinking / reasoning models (Kimi K2.5, K2.6, the kimi-*-thinking variants, the OpenAI o1 family) fail in the AI panel regardless of prompt: the real error is swallowed and the panel shows only the generic message "No response received from the API."

There are two distinct upstream causes, both in OpenAiApiStrategy.qml:

Cause 1: hardcoded `temperature: 0.7`

getBody() hardcodes temperature: 0.7. Thinking models reject any temperature other than 1 with HTTP 400:

{
    "error": {
        "message": "invalid temperature: only 1 is allowed for this model",
        "type": "invalid_request_error"
    }
}

The error body is a single line of plain JSON with no data: prefix, so the SSE-parsing SplitParser ignores it. curl still exits 0 (an HTTP 400 is not a curl failure by default), the buffer is never populated, and the panel falls into its "no streaming data received" branch and shows the generic placeholder.
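The failure mode above can be sketched in a few lines of JavaScript. This is a minimal illustration only: extractSsePayload is a hypothetical stand-in for the line filtering done around SplitParser, and the real wiring in OpenAiApiStrategy.qml may differ.

```javascript
// Hypothetical stand-in for the SSE line filter; not the project's actual code.
function extractSsePayload(line) {
    const trimmed = line.trim();
    // Only lines carrying the SSE "data:" field are treated as stream events.
    if (!trimmed.startsWith("data:")) return null;
    return trimmed.slice("data:".length).trim();
}

// The HTTP 400 body arrives as one bare JSON line, without a "data:" prefix.
const errorBody = '{"error":{"message":"invalid temperature: only 1 is allowed for this model","type":"invalid_request_error"}}';
// A normal streaming event, for comparison.
const sseLine = 'data: {"choices":[{"delta":{"content":"Hola"}}]}';

console.log(extractSsePayload(errorBody)); // null
console.log(extractSsePayload(sseLine));   // {"choices":[{"delta":{"content":"Hola"}}]}
```

Because the error line yields no payload, nothing ever reaches the response buffer, which matches the "no streaming data received" branch described above.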

Cause 2: parser ignores `reasoning_content`

Thinking models stream their chain-of-thought as delta.reasoning_content before emitting the actual answer as delta.content. The existing parser only checks delta.content:

if (delta && delta.content)
    return { content: delta.content, done: false, error: null };

Every reasoning chunk is dropped, so the buffer stays empty while the model thinks. The actual delta.content chunks may arrive hundreds of reasoning chunks later, and by then downstream timing issues can cause the short content tail to be missed as well. Either way, the user sees nothing.

Reproduced empirically against api.moonshot.ai/v1: a "hola" prompt to kimi-k2.6 produced 196 reasoning_content chunks then 2 content chunks ("Hola"). The 196-to-2 ratio is typical.
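To make the ratio concrete, here is a small sketch that runs the old delta.content-only check over a synthetic stream shaped like the reported run. The chunk texts and the point where "Hola" is split are invented for illustration; only the 196/2 counts come from the report above, and oldParseDelta is a hypothetical wrapper around the quoted condition.

```javascript
// Old behaviour, as quoted above: only delta.content is recognised.
function oldParseDelta(delta) {
    if (delta && delta.content)
        return { content: delta.content, done: false, error: null };
    return null; // hypothetical "nothing to show" result
}

// Synthetic stream: 196 reasoning chunks followed by 2 content chunks.
const stream = [];
for (let i = 0; i < 196; i++) {
    stream.push({ reasoning_content: "step " + i + " " });
}
stream.push({ content: "Ho" }, { content: "la" }); // split point is invented

const surfaced = stream.map(oldParseDelta).filter((r) => r !== null);

console.log(surfaced.length);                          // 2
console.log(surfaced.map((r) => r.content).join(""));  // Hola
console.log(stream.length - surfaced.length);          // 196
```

Out of 198 chunks, only the last two would ever produce visible text under the old check.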

What this PR does

Two minimal additive changes in OpenAiApiStrategy.qml:

1. Dynamic temperature

+let temp = 0.7;
+if (model.model && /k2\.(5|6)|thinking|^o1(-|$)/.test(model.model)) {
+    temp = 1;
+}
 let body = {
     model: model.model,
     messages: _formatMessages(messages),
-    temperature: 0.7
+    temperature: temp
 };

Regex covers:

k2.5, k2.6 (Kimi K2.5/K2.6 vision-and-thinking)
*thinking* (kimi-thinking-preview, kimi-k2-thinking, kimi-k2-thinking-turbo, …)
o1, o1-preview, o1-mini, etc. (OpenAI reasoning family)
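The regex can be exercised on its own to see where its boundaries fall. In this sketch, pickTemperature is a hypothetical helper wrapping the same test as the diff, and the model IDs are illustrative strings rather than a verified list of provider IDs.

```javascript
// Same pattern as in the diff above.
const THINKING_MODEL_RE = /k2\.(5|6)|thinking|^o1(-|$)/;

// Hypothetical helper mirroring the temperature selection in getBody().
function pickTemperature(modelId) {
    return modelId && THINKING_MODEL_RE.test(modelId) ? 1 : 0.7;
}

// IDs the pattern is meant to catch.
console.log(pickTemperature("kimi-k2.6"));              // 1
console.log(pickTemperature("kimi-k2-thinking-turbo")); // 1
console.log(pickTemperature("o1"));                     // 1
console.log(pickTemperature("o1-mini"));                // 1

// IDs that keep the default.
console.log(pickTemperature("kimi-k2-0905-preview"));   // 0.7
console.log(pickTemperature("moonshot-v1-8k"));         // 0.7

// Edge cases worth knowing about: the match is case-sensitive,
// and the o1 branch is anchored to the start of the string.
console.log(pickTemperature("Kimi-K2.6"));              // 0.7
console.log(pickTemperature("openai/o1"));              // 0.7
```

The last two results suggest that providers using capitalised IDs or a vendor prefix would still receive 0.7; whether that matters depends on how model IDs are registered in practice.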

2. `reasoning_content` accumulation

parseStreamChunk():

 if (delta && delta.content)
     return { content: delta.content, done: false, error: null };

+if (delta && delta.reasoning_content)
+    return { content: delta.reasoning_content, done: false, error: null };
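For context, the patched check might sit inside a line parser roughly like the one below. This is a sketch under assumptions: parseStreamLine, its error string, and the sample reasoning text are hypothetical, and the surrounding structure of the real parseStreamChunk() is not shown in this PR description. The "[DONE]" sentinel is the usual end-of-stream marker in OpenAI-style streaming.

```javascript
// Hypothetical line parser built around the two checks from the diff.
function parseStreamLine(line) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) return null; // not an SSE data line
    const payload = trimmed.slice("data:".length).trim();
    if (payload === "[DONE]") return { content: "", done: true, error: null };

    let json;
    try {
        json = JSON.parse(payload);
    } catch (e) {
        return { content: "", done: false, error: "malformed chunk" };
    }

    const delta = json.choices && json.choices[0] && json.choices[0].delta;
    if (delta && delta.content)
        return { content: delta.content, done: false, error: null };
    // New in this PR: reasoning chunks are surfaced as ordinary content.
    if (delta && delta.reasoning_content)
        return { content: delta.reasoning_content, done: false, error: null };
    return { content: "", done: false, error: null };
}

// Invented sample stream: two reasoning chunks, one answer chunk, then the sentinel.
const lines = [
    'data: {"choices":[{"delta":{"reasoning_content":"User says hola. "}}]}',
    'data: {"choices":[{"delta":{"reasoning_content":"Reply in Spanish. "}}]}',
    'data: {"choices":[{"delta":{"content":"Hola"}}]}',
    "data: [DONE]",
];

let shown = "";
let finished = false;
for (const line of lines) {
    const result = parseStreamLine(line);
    if (!result) continue;
    shown += result.content;
    if (result.done) finished = true;
}

console.log(shown);    // User says hola. Reply in Spanish. Hola
console.log(finished); // true
```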

parseResponse() (non-stream):

+let outContent = msg.content || "";
+if (msg.reasoning_content && !outContent) {
+    outContent = msg.reasoning_content;
+}
-return { content: msg.content };
+return { content: outContent };
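The non-stream fallback can be tried in isolation as follows. extractMessageText is a hypothetical wrapper, and the lookup of choices[0].message is an assumption about how msg is obtained, since that part of parseResponse() is not shown in the diff.

```javascript
// Hypothetical wrapper around the fallback logic from the diff.
function extractMessageText(response) {
    const msg = response && response.choices && response.choices[0]
        && response.choices[0].message;
    if (!msg) return { content: "" };

    let outContent = msg.content || "";
    // Fall back to the reasoning text only when there is no final answer.
    if (msg.reasoning_content && !outContent) {
        outContent = msg.reasoning_content;
    }
    return { content: outContent };
}

const both = { choices: [{ message: { content: "Hola", reasoning_content: "thinking..." } }] };
const reasoningOnly = { choices: [{ message: { content: "", reasoning_content: "thinking..." } }] };
const neither = { choices: [{ message: {} }] };

console.log(extractMessageText(both).content);          // Hola
console.log(extractMessageText(reasoningOnly).content); // thinking...
console.log(extractMessageText(neither).content === ""); // true
```

Note the asymmetry with the streaming path: here the reasoning text appears only when the answer is empty, whereas the streaming path surfaces both.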

When streaming, this surfaces the model's chain-of-thought as it arrives, followed by the final answer concatenated at the end, the same flow as the ChatGPT o1 and Claude thinking UIs. Non-thinking models are unchanged.

Tested with

Kimi K2.6 (heavy thinking, 196+ reasoning chunks per short prompt) — works.
Kimi K2 (0905 preview) (no thinking, direct content stream) — unchanged, works.
Moonshot v1 family (no thinking) — unchanged, works.

Visual note

Reasoning and final answer arrive in the same chat bubble, concatenated. Could be polished in a follow-up that renders them in separate styled sections (greyed-out reasoning block + answer below), but that's a UX call worth its own PR.
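One possible shape for that follow-up, sketched under assumptions: instead of folding both fields into a single content string, the parser could tag each chunk so the UI can style the two parts separately. The kind field, classifyDelta, and accumulate are all hypothetical names and are not part of this PR.

```javascript
// Hypothetical: tag each delta as reasoning or answer instead of merging them.
function classifyDelta(delta) {
    if (delta && delta.content)
        return { kind: "answer", text: delta.content };
    if (delta && delta.reasoning_content)
        return { kind: "reasoning", text: delta.reasoning_content };
    return null;
}

// Accumulate the two streams into separate buffers for separate rendering.
function accumulate(deltas) {
    const out = { reasoning: "", answer: "" };
    for (const delta of deltas) {
        const part = classifyDelta(delta);
        if (part) out[part.kind] += part.text;
    }
    return out;
}

// Invented sample deltas.
const result = accumulate([
    { reasoning_content: "User greets in Spanish. " },
    { reasoning_content: "Answer briefly." },
    { content: "Ho" },
    { content: "la" },
]);

console.log(result.reasoning); // User greets in Spanish. Answer briefly.
console.log(result.answer);    // Hola
```

Keeping two buffers would let the panel grey out or collapse the reasoning block without changing how chunks are received.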

Diff stats

modules/services/ai/strategies/OpenAiApiStrategy.qml | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)

Related

Depends on #176 (feat(ai): wire Config.ai.extraModels for user-defined OpenAI-compat providers), which registers custom OpenAI-compatible providers (Kimi/Moonshot) via Config.ai.extraModels; without it, users can't even select a thinking model to test against. Even without #176, the temperature fix still helps any built-in provider that ever adds thinking variants.

…tegy

OpenAI-compatible "thinking" models (Kimi K2.5, K2.6, kimi-*-thinking variants, GPT o1 family) emit a different stream shape and reject the default temperature. Without this patch, every request to them fails silently in the panel with the generic "No response received from the API." message.

Two changes, both in OpenAiApiStrategy.qml:

1. **Dynamic temperature** in getBody():
   Thinking models require `temperature: 1` and reject anything else with HTTP 400 `invalid_request_error` ("only 1 is allowed for this model"). The current hardcoded `temperature: 0.7` causes every thinking-model request to fail before streaming even starts. Fixed by regex-detecting thinking model IDs:
   /k2\.(5|6)|thinking|^o1(-|$)/
   Other models continue to use 0.7 unchanged.

2. **reasoning_content support** in parseStreamChunk() and parseResponse():
   Thinking models emit `delta.reasoning_content` (and `message.reasoning_content` in non-stream) BEFORE the final `delta.content`. The existing parser only checks `delta.content`, so all reasoning chunks are ignored and the response buffer ends up empty. With this patch, reasoning_content is treated as content and surfaced to the user: they see the model's chain-of-thought streaming in, then the final answer concatenated at the end. Same flow as ChatGPT/Claude thinking UIs.

This is purely additive: non-thinking models behave identically to before.

Tested with Kimi K2.6 (long thinking) and K2 (0905-preview, non-thinking). Both work; non-thinking is unchanged.

Note: relies on PR Axenide#176 to register custom OpenAI-compatible providers (Kimi/Moonshot, OpenRouter, etc.) via Config.ai.extraModels. Without that PR, only providers with built-in fetch (Gemini/OpenAI/etc.) benefit from the temperature fix here.

ImIvanGil mentioned this pull request May 13, 2026

fix(ai): surface real HTTP error responses instead of "No response received" #178

Open

ImIvanGil wants to merge 1 commit into Axenide:main from ImIvanGil:feat/ai-thinking-models-support
