From 90e80aca25e02f6afd594547fef9ae1c13dd0671 Mon Sep 17 00:00:00 2001
From: dfbb
Date: Wed, 17 Jun 2026 10:43:27 +0000
Subject: [PATCH] fix(gemini): preserve per-tool-call thought signatures

Gemini 3 thinking models attach a thoughtSignature to each function-call
part that must be replayed verbatim on that specific part. crush stored a
single ReasoningContent per assistant message and concatenated every
signature into one string (AppendThoughtSignature), losing per-call
association. On replay ToAIMessage sent the concatenated blob as one
signature, so Gemini rejected the request with 'Corrupted thought
signature'. The corrupted signature was persisted, poisoning the whole
session and recurring across restarts.

- message.ToolCall: add ThoughtSignature field; preserve it in
  FinishToolCall/AppendToolCallInput; add SetToolCallThoughtSignature
- stop reasoning mutators (FinishThinking, AppendReasoningContent,
  SetReasoningResponsesData) from silently dropping signature fields
- ToAIMessage: emit each signature in its own ReasoningPart immediately
  before its text/tool-call part instead of concatenating
- agent: buffer per-toolID signatures in OnReasoningEnd and attach them
  to the matching tool call in OnToolCall

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 docs/CORRUPTED_THOUGHT_SIGNATURE_ANALYSIS.md | 100 ++++++++++++++++++
 internal/agent/agent.go                      |  23 +++-
 internal/message/content.go                  | 105 ++++++++++++++-----
 internal/message/content_test.go             |  81 ++++++++++++++
 4 files changed, 283 insertions(+), 26 deletions(-)
 create mode 100644 docs/CORRUPTED_THOUGHT_SIGNATURE_ANALYSIS.md

diff --git a/docs/CORRUPTED_THOUGHT_SIGNATURE_ANALYSIS.md b/docs/CORRUPTED_THOUGHT_SIGNATURE_ANALYSIS.md
new file mode 100644
index 0000000000..a55164e9a7
--- /dev/null
+++ b/docs/CORRUPTED_THOUGHT_SIGNATURE_ANALYSIS.md
@@ -0,0 +1,100 @@
+# Crush + Gemini 3 Pro `Corrupted thought signature` Analysis Report
+
+- Version: crush v0.77.0 (local source tree `../crush`)
+- Dependencies: `charm.land/fantasy v0.31.1`, built on `google.golang.org/genai v1.60.0`
+- Symptom: with Gemini 3 Pro, after running for a while crush frequently reports `bad request: Corrupted thought signature`; the error keeps recurring within the same session and persists across restarts (logs `1.txt`/`2.txt`: the same `session_id e1d37cb9` fails repeatedly from 10:16 onward).
+
+---
+
+## 1. Conclusion (TL;DR)
+
+`Corrupted thought signature` is an error returned by the Gemini API. When Gemini 3 / 2.5 thinking models return a function call, they attach an encrypted `thoughtSignature` to the corresponding part, and **subsequent requests must send every signature back verbatim, each on its own matching part**.
+
+crush's message model stores **only one `ReasoningContent`** per assistant message, so it cannot express "multiple signatures, each belonging to a different part". When Gemini 3 Pro issues **multiple/parallel function calls**:
+
+1. crush **concatenates the signatures into a single string** and sends that back, which corrupts the signature;
+2. the corrupted signature is **persisted to the SQLite session history**, and every later request replays the poisoned history, so the error keeps recurring and survives restarts.
+
+---
+
+## 2. The contract on the fantasy side (key constraints)
+
+File: `charm.land/fantasy@v0.31.1/providers/google/google.go`
+
+### 2.1 Parsing the response (response → fantasy events)
+- `Stream` (lines 728–826) and `mapResponse` (lines 1362–1404): **each function call's signature is emitted as a separate `OnReasoningEnd` event** carrying `ReasoningMetadata{Signature, ToolID: <that call's id>}`.
+- In other words, **one assistant turn fires `OnReasoningEnd` multiple times, once per tool-call signature**; the signature of a text-only turn has `ToolID == ""`.
+
+### 2.2 Replaying the request (fantasy messages → genai request)
+- `toGooglePrompt` (lines 414–477) walks the assistant parts in `Content` order:
+  - when it meets a `ReasoningPart` with google metadata, it stashes the signature in `currentReasoningMetadata` (this emits no genai part by itself);
+  - it sets `Part.ThoughtSignature` on the **next text / toolCall part that immediately follows**, then clears the stash.
+- **Conclusion: the signature-to-part association depends entirely on order.** Each signature must live in its own `ReasoningPart`, placed directly before the part it belongs to. Reasoning parts without google metadata are skipped outright (lines 425–428) and are never attached by mistake.
+
+---
+
+## 3. Defects on the crush side (the real bug)
+
+The getter `ReasoningContent()` in `internal/message/content.go` (line 153) returns only the **first** reasoning part, and every `Append*` method operates on that single reasoning part, so at the model level there is only one `ReasoningContent`.
+
+### Defect 1: multiple signatures are concatenated into one (the core trigger)
+- `internal/agent/agent.go:884-888` calls `AppendThoughtSignature` on every `OnReasoningEnd`.
+- `AppendThoughtSignature` at `content.go:270-285` executes `c.ThoughtSignature + signature`, joining N distinct base64 signatures end to end into one string and keeping only the last `ToolID`.
+- On replay, `ToAIMessage` (`content.go:520-525`) sends this concatenated blob back as a **single** signature → `Corrupted thought signature`.
+
+### Defect 2: tool-call parts do not carry their own signatures
+- The `ToolCall` struct (`content.go:101-107`) has no signature field.
+- `ToAIMessage` (`content.go:528-535`) rebuilds tool calls without any google metadata, so each function call's individual signature has nowhere to go.
+
+### Defect 3: signatures are dropped by the `if reasoning.Thinking != ""` gate
+- `ToAIMessage:510` emits the reasoning part (and its signature) only when the thinking text is non-empty.
+- Gemini tool turns often have a non-empty signature but empty thinking text, in which case the signature is not sent back at all.
+
+### Defect 4: mutators silently clear signatures
+These methods rebuild the struct field by field without copying the signature fields, which wipes stored signatures mid-turn:
+- `FinishThinking` (`content.go:316`): does not copy `ThoughtSignature`/`ToolID`/`ResponsesData`.
+- `AppendReasoningContent` (`content.go:249`): does not copy `ThoughtSignature`/`ToolID`/`ResponsesData`.
+- `SetReasoningResponsesData` (`content.go:302`): does not copy `ThoughtSignature`/`ToolID`/`Signature`.
+- If a signature field is added to `ToolCall`, `FinishToolCall` (346) and `AppendToolCallInput` (362) would wipe it the same way and need fixing too.
+
+### Defect 5: misordered parts
+`ToAIMessage` reorders the parts as `text → reasoning → all toolcalls`, which violates fantasy's requirement that each signature sit directly before its part.
+
+---
+
+## 4. Why it "always shows up after a while and then persists"
+
+- In early, simple turns (text only / a single tool call) the concatenation degenerates to a single signature, so most requests get through;
+- as soon as Gemini 3 Pro issues **multiple/parallel function calls**, the signatures are concatenated and therefore corrupted;
+- crucially, **the corrupted concatenated signature is persisted to the SQLite session history**, so every later request in that session replays the poisoned history and the error keeps recurring, even across restarts (this matches the repeated failures of the same `session_id` in the logs).
+
+> Ruled out: `prepared.Messages[i].ProviderOptions = nil` at `agent.go:800` clears the **message-level** `Message.ProviderOptions` (used for cache-control), which is a different field from the **part-level** `ReasoningPart.ProviderOptions[google]` that holds the signature. This was verified not to affect signatures.
+
+---
+
+## 5. Proposed fix
+
+Core idea: let the crush model store signatures **per tool call**, and replay them in `ToAIMessage` in the order fantasy requires (one separate `ReasoningPart` per signature, placed directly before its part).
+
+### Change 1: `internal/message/content.go`
+- Add the field `ThoughtSignature string` `json:"thought_signature,omitempty"` to `ToolCall`.
+- Preserve `ThoughtSignature` when `FinishToolCall` (346), `AppendToolCallInput` (362), and `AddToolCall` rebuild the struct.
+- Fix `FinishThinking` (316), `AppendReasoningContent` (249), and `SetReasoningResponsesData` (302): copy `ThoughtSignature`/`ToolID`/`ResponsesData`/`Signature` when rebuilding instead of clearing them.
+- Add `SetToolCallThoughtSignature(id, sig string)`.
+- **Rewrite the Assistant branch of `ToAIMessage`**:
+  - thinking/text signature (`ToolID==""`): emit one `ReasoningPart`, writing `ProviderOptions[google]` **only when `ThoughtSignature != ""`**, then emit the text part;
+  - each tool call: if its `ThoughtSignature != ""`, **first emit a `ReasoningPart` containing only that signature** (`ReasoningMetadata{Signature, ToolID: call.ID}`), immediately followed by the `ToolCallPart`.
+
+### Change 2: `internal/agent/agent.go` (inside the Stream closure)
+- Add `pendingThoughtSigs := map[string]string{}`.
+- `OnReasoningEnd` (877), when handling google metadata: if `ToolID != ""`, store `pendingThoughtSigs[ToolID] = Signature` (no more concatenation); call `AppendThoughtSignature(sig, "")` only if `ToolID == ""`.
+- `OnToolCall` (923) (final state): set `ThoughtSignature = pendingThoughtSigs[tc.ToolCallID]` when creating the `message.ToolCall`.
+
+### Change 3: tests
+Add `ToAIMessage` unit tests in `internal/message`: build an assistant message with "thinking + 2 parallel tool calls, each with a different signature" and assert that the output order is `reasoning(sig_text)?, text, reasoning(sig1)+toolcall1, reasoning(sig2)+toolcall2`, with each `ReasoningPart` holding exactly one signature and the correct `ToolID`.
+
+### Note on already-corrupted sessions
+This fix only prevents **new turns** from being poisoned. Concatenated signatures already written to old session history cannot be recovered; affected sessions need a **new session**.
+
+### Verification
+`go build ./...` + `go test ./internal/message/... ./internal/agent/...`; then run a long Gemini 3 Pro session with many parallel tool calls as a regression check to confirm the error no longer appears.
diff --git a/internal/agent/agent.go b/internal/agent/agent.go
index f4972b181a..cced02dd91 100644
--- a/internal/agent/agent.go
+++ b/internal/agent/agent.go
@@ -723,6 +723,14 @@ func (a *sessionAgent) Run(ctx context.Context, call SessionAgentCall) (result *
 	// message of the turn is the value reachable through this
 	// pointer when the defer runs.
 	var currentAssistant *message.Message
+	// pendingThoughtSigs buffers Google Gemini per-tool-call thought
+	// signatures keyed by tool call ID. The provider emits a tool call's
+	// signature (via OnReasoningEnd with a ToolID) BEFORE the tool call
+	// itself arrives, so we stash it here and attach it once OnToolCall
+	// creates the tool call. Each signature must be replayed verbatim on its
+	// own tool call or Gemini rejects the request with "Corrupted thought
+	// signature".
+	pendingThoughtSigs := make(map[string]string)
 	// Drain any debounced message updates before returning. message.Service
 	// already flushes synchronously on terminal updates, but a defer here
 	// guarantees the contract at every Run exit (success, error, panic
@@ -864,6 +872,7 @@ func (a *sessionAgent) Run(ctx context.Context, call SessionAgentCall) (result *
 			callContext = context.WithValue(callContext, tools.SupportsImagesContextKey, largeModel.CatwalkCfg.SupportsImages)
 			callContext = context.WithValue(callContext, tools.ModelNameContextKey, largeModel.CatwalkCfg.Name)
 			currentAssistant = &assistantMsg
+			clear(pendingThoughtSigs)
 			return callContext, prepared, err
 		},
 		OnReasoningStart: func(id string, reasoning fantasy.ReasoningContent) error {
@@ -883,7 +892,16 @@ func (a *sessionAgent) Run(ctx context.Context, call SessionAgentCall) (result *
 			}
 			if googleData, ok := reasoning.ProviderMetadata[google.Name]; ok {
 				if reasoning, ok := googleData.(*google.ReasoningMetadata); ok {
-					currentAssistant.AppendThoughtSignature(reasoning.Signature, reasoning.ToolID)
+					// A signature bound to a tool call (ToolID set) must travel
+					// with that specific tool call, not be concatenated onto the
+					// shared reasoning block. Buffer it until OnToolCall creates
+					// the tool call. Signatures without a ToolID belong to the
+					// final text answer and stay on the reasoning content.
+					if reasoning.ToolID != "" {
+						pendingThoughtSigs[reasoning.ToolID] = reasoning.Signature
+					} else {
+						currentAssistant.AppendThoughtSignature(reasoning.Signature, reasoning.ToolID)
+					}
 				}
 			}
 			if openaiData, ok := reasoning.ProviderMetadata[openai.Name]; ok {
@@ -927,6 +945,9 @@ func (a *sessionAgent) Run(ctx context.Context, call SessionAgentCall) (result *
 				Input:            tc.Input,
 				ProviderExecuted: false,
 				Finished:         true,
+				// Attach the buffered Google thought signature (if any) so it
+				// is persisted and replayed verbatim with this tool call.
+				ThoughtSignature: pendingThoughtSigs[tc.ToolCallID],
 			}
 			currentAssistant.AddToolCall(toolCall)
 			// Use parent ctx instead of genCtx to ensure the update succeeds
diff --git a/internal/message/content.go b/internal/message/content.go
index c62cdf5161..485aa36602 100644
--- a/internal/message/content.go
+++ b/internal/message/content.go
@@ -104,6 +104,11 @@ type ToolCall struct {
 	Input            string `json:"input"`
 	ProviderExecuted bool   `json:"provider_executed"`
 	Finished         bool   `json:"finished"`
+	// ThoughtSignature is the per-tool-call thought signature returned by
+	// Google Gemini thinking models. It must be replayed verbatim, attached
+	// to this specific tool call, or Gemini rejects the request with
+	// "Corrupted thought signature".
+	ThoughtSignature string `json:"thought_signature,omitempty"`
 }
 
 func (ToolCall) isPart() {}
@@ -251,10 +256,13 @@ func (m *Message) AppendReasoningContent(delta string) {
 	for i, part := range m.Parts {
 		if c, ok := part.(ReasoningContent); ok {
 			m.Parts[i] = ReasoningContent{
-				Thinking:   c.Thinking + delta,
-				Signature:  c.Signature,
-				StartedAt:  c.StartedAt,
-				FinishedAt: c.FinishedAt,
+				Thinking:         c.Thinking + delta,
+				Signature:        c.Signature,
+				ThoughtSignature: c.ThoughtSignature,
+				ToolID:           c.ToolID,
+				ResponsesData:    c.ResponsesData,
+				StartedAt:        c.StartedAt,
+				FinishedAt:       c.FinishedAt,
 			}
 			found = true
 		}
@@ -303,10 +311,13 @@ func (m *Message) SetReasoningResponsesData(data *openai.ResponsesReasoningMetad
 	for i, part := range m.Parts {
 		if c, ok := part.(ReasoningContent); ok {
 			m.Parts[i] = ReasoningContent{
-				Thinking:      c.Thinking,
-				ResponsesData: data,
-				StartedAt:     c.StartedAt,
-				FinishedAt:    c.FinishedAt,
+				Thinking:         c.Thinking,
+				Signature:        c.Signature,
+				ThoughtSignature: c.ThoughtSignature,
+				ToolID:           c.ToolID,
+				ResponsesData:    data,
+				StartedAt:        c.StartedAt,
+				FinishedAt:       c.FinishedAt,
 			}
 			return
 		}
@@ -318,10 +329,13 @@ func (m *Message) FinishThinking() {
 		if c, ok := part.(ReasoningContent); ok {
 			if c.FinishedAt == 0 {
 				m.Parts[i] = ReasoningContent{
-					Thinking:   c.Thinking,
-					Signature:  c.Signature,
-					StartedAt:  c.StartedAt,
-					FinishedAt: time.Now().Unix(),
+					Thinking:         c.Thinking,
+					Signature:        c.Signature,
+					ThoughtSignature: c.ThoughtSignature,
+					ToolID:           c.ToolID,
+					ResponsesData:    c.ResponsesData,
+					StartedAt:        c.StartedAt,
+					FinishedAt:       time.Now().Unix(),
 				}
 			}
 			return
@@ -348,10 +362,12 @@ func (m *Message) FinishToolCall(toolCallID string) {
 		if c, ok := part.(ToolCall); ok {
 			if c.ID == toolCallID {
 				m.Parts[i] = ToolCall{
-					ID:       c.ID,
-					Name:     c.Name,
-					Input:    c.Input,
-					Finished: true,
+					ID:               c.ID,
+					Name:             c.Name,
+					Input:            c.Input,
+					ProviderExecuted: c.ProviderExecuted,
+					Finished:         true,
+					ThoughtSignature: c.ThoughtSignature,
 				}
 				return
 			}
@@ -364,10 +380,12 @@ func (m *Message) AppendToolCallInput(toolCallID string, inputDelta string) {
 		if c, ok := part.(ToolCall); ok {
 			if c.ID == toolCallID {
 				m.Parts[i] = ToolCall{
-					ID:       c.ID,
-					Name:     c.Name,
-					Input:    c.Input + inputDelta,
-					Finished: c.Finished,
+					ID:               c.ID,
+					Name:             c.Name,
+					Input:            c.Input + inputDelta,
+					ProviderExecuted: c.ProviderExecuted,
+					Finished:         c.Finished,
+					ThoughtSignature: c.ThoughtSignature,
 				}
 				return
 			}
@@ -387,6 +405,21 @@ func (m *Message) AddToolCall(tc ToolCall) {
 	m.Parts = append(m.Parts, tc)
 }
 
+// SetToolCallThoughtSignature attaches a Google thought signature to the tool
+// call with the given ID, preserving its other fields. No-op if not found.
+func (m *Message) SetToolCallThoughtSignature(toolCallID, signature string) {
+	if signature == "" {
+		return
+	}
+	for i, part := range m.Parts {
+		if c, ok := part.(ToolCall); ok && c.ID == toolCallID {
+			c.ThoughtSignature = signature
+			m.Parts[i] = c
+			return
+		}
+	}
+}
+
 func (m *Message) SetToolCalls(tc []ToolCall) {
 	// remove any existing tool call part it could have multiple
 	parts := make([]ContentPart, 0)
@@ -503,11 +536,17 @@ func (m *Message) ToAIMessage() []fantasy.Message {
 	case Assistant:
 		var parts []fantasy.MessagePart
 		text := strings.TrimSpace(m.Content().Text)
-		if text != "" {
-			parts = append(parts, fantasy.TextPart{Text: text})
-		}
 		reasoning := m.ReasoningContent()
-		if reasoning.Thinking != "" {
+
+		// Emit the reasoning block (if any) BEFORE the text part. The Google
+		// provider replays a thought signature by attaching it to the part
+		// immediately following its ReasoningPart, so the order matters. We
+		// only carry the Google signature here when it is NOT bound to a
+		// specific tool call (ToolID == ""), i.e. the signature of the final
+		// text answer; per-tool-call signatures are emitted next to their
+		// tool call below.
+		hasGoogleTextSig := reasoning.ThoughtSignature != "" && reasoning.ToolID == ""
+		if reasoning.Thinking != "" || hasGoogleTextSig {
 			reasoningPart := fantasy.ReasoningPart{Text: reasoning.Thinking, ProviderOptions: fantasy.ProviderOptions{}}
 			if reasoning.Signature != "" {
 				reasoningPart.ProviderOptions[anthropic.Name] = &anthropic.ReasoningOptionMetadata{
@@ -517,7 +556,7 @@ func (m *Message) ToAIMessage() []fantasy.Message {
 			if reasoning.ResponsesData != nil {
 				reasoningPart.ProviderOptions[openai.Name] = reasoning.ResponsesData
 			}
-			if reasoning.ThoughtSignature != "" {
+			if hasGoogleTextSig {
 				reasoningPart.ProviderOptions[google.Name] = &google.ReasoningMetadata{
 					Signature: reasoning.ThoughtSignature,
 					ToolID:    reasoning.ToolID,
@@ -525,7 +564,23 @@ func (m *Message) ToAIMessage() []fantasy.Message {
 			}
 			parts = append(parts, reasoningPart)
 		}
+		if text != "" {
+			parts = append(parts, fantasy.TextPart{Text: text})
+		}
 		for _, call := range m.ToolCalls() {
+			// Replay the per-tool-call thought signature in its own
+			// ReasoningPart placed immediately before the tool call, so the
+			// Google provider attaches it to exactly this function call.
+			if call.ThoughtSignature != "" {
+				parts = append(parts, fantasy.ReasoningPart{
+					ProviderOptions: fantasy.ProviderOptions{
+						google.Name: &google.ReasoningMetadata{
+							Signature: call.ThoughtSignature,
+							ToolID:    call.ID,
+						},
+					},
+				})
+			}
 			parts = append(parts, fantasy.ToolCallPart{
 				ToolCallID: call.ID,
 				ToolName:   call.Name,
diff --git a/internal/message/content_test.go b/internal/message/content_test.go
index 04e601012a..c62ecb4751 100644
--- a/internal/message/content_test.go
+++ b/internal/message/content_test.go
@@ -7,6 +7,7 @@ import (
 	"testing"
 
 	"charm.land/fantasy"
+	"charm.land/fantasy/providers/google"
 	"github.com/stretchr/testify/require"
 )
 
@@ -116,6 +117,86 @@ func TestToAIMessage_ASCIIButInvalidBase64(t *testing.T) {
 	require.Equal(t, mediaLoadFailedPlaceholder, textContent.Text)
 }
 
+// TestToAIMessage_GoogleThoughtSignaturesPerToolCall verifies that each tool
+// call's Google thought signature is replayed in its own ReasoningPart placed
+// immediately before that tool call, never concatenated. Concatenation or
+// misplacement is what triggers Gemini's "Corrupted thought signature" error.
+func TestToAIMessage_GoogleThoughtSignaturesPerToolCall(t *testing.T) {
+	t.Parallel()
+
+	msg := &Message{
+		Role: Assistant,
+		Parts: []ContentPart{
+			ReasoningContent{Thinking: "let me think", FinishedAt: 1},
+			ToolCall{ID: "call_1", Name: "view", Input: "{}", Finished: true, ThoughtSignature: "SIG1"},
+			ToolCall{ID: "call_2", Name: "ls", Input: "{}", Finished: true, ThoughtSignature: "SIG2"},
+		},
+	}
+
+	messages := msg.ToAIMessage()
+	require.Len(t, messages, 1)
+	content := messages[0].Content
+	// reasoning(thinking), reasoning(SIG1), toolcall_1, reasoning(SIG2), toolcall_2
+	require.Len(t, content, 5)
+
+	// [0] thinking reasoning, no google signature attached.
+	r0, ok := content[0].(fantasy.ReasoningPart)
+	require.True(t, ok)
+	require.Equal(t, "let me think", r0.Text)
+	require.Nil(t, r0.ProviderOptions[google.Name])
+
+	assertGoogleSig := func(i int, sig, toolID string) {
+		t.Helper()
+		rp, ok := content[i].(fantasy.ReasoningPart)
+		require.True(t, ok, "part %d must be a ReasoningPart", i)
+		meta, ok := rp.ProviderOptions[google.Name].(*google.ReasoningMetadata)
+		require.True(t, ok, "part %d must carry google ReasoningMetadata", i)
+		require.Equal(t, sig, meta.Signature)
+		require.Equal(t, toolID, meta.ToolID)
+	}
+
+	assertGoogleSig(1, "SIG1", "call_1")
+	tc1, ok := content[2].(fantasy.ToolCallPart)
+	require.True(t, ok)
+	require.Equal(t, "call_1", tc1.ToolCallID)
+
+	assertGoogleSig(3, "SIG2", "call_2")
+	tc2, ok := content[4].(fantasy.ToolCallPart)
+	require.True(t, ok)
+	require.Equal(t, "call_2", tc2.ToolCallID)
+}
+
+// TestToAIMessage_GoogleTextAnswerSignature verifies the final-answer thought
+// signature (no tool ID) is replayed on a ReasoningPart immediately before the
+// text part.
+func TestToAIMessage_GoogleTextAnswerSignature(t *testing.T) {
+	t.Parallel()
+
+	msg := &Message{
+		Role: Assistant,
+		Parts: []ContentPart{
+			ReasoningContent{ThoughtSignature: "TEXTSIG", FinishedAt: 1},
+			TextContent{Text: "final answer"},
+		},
+	}
+
+	messages := msg.ToAIMessage()
+	require.Len(t, messages, 1)
+	content := messages[0].Content
+	require.Len(t, content, 2)
+
+	rp, ok := content[0].(fantasy.ReasoningPart)
+	require.True(t, ok)
+	meta, ok := rp.ProviderOptions[google.Name].(*google.ReasoningMetadata)
+	require.True(t, ok)
+	require.Equal(t, "TEXTSIG", meta.Signature)
+	require.Empty(t, meta.ToolID)
+
+	tp, ok := content[1].(fantasy.TextPart)
+	require.True(t, ok)
+	require.Equal(t, "final answer", tp.Text)
+}
+
 func BenchmarkPromptWithTextAttachments(b *testing.B) {
 	cases := []struct {
 		name string