feat: gemini-acp text streaming compat #13

Draft
kevin-david wants to merge 1 commit into Open-ACP:main from kevin-david:feat-gemini-text-stream-compat

Conversation

kevin-david commented May 11, 2026

Contributor

Why

Gemini-acp's text stream has a few quirks that the adapter wasn't handling. Coding sessions on gemini behaved noticeably worse than on Claude Code: chain-of-thought dumped into the response, long messages truncated to 1900 chars, and tables rendered as raw markdown pipes. This PR closes the gap.

What changes

1. Inline-marker chain-of-thought stripping. Gemini emits reasoning as plain text with a literal [Thought: true] marker at the thought/response boundary. At medium and below we retroactively trim the draft to only the post-marker content so the user sees the response without the thinking dump. High keeps both visible. Adds a MessageDraft.replaceBuffer helper for the in-place rewrite.

2. Finalize text draft on every turn end. Agents that don't emit usage / session_end at turn-end (gemini) left the streaming text draft stuck at its 1900-char mid-stream truncation. Register a plugin middleware on the turn:end hook that calls a new public DiscordAdapter.finalizeSessionDraft so the buffer gets split into chunks regardless of the agent's terminal-event behavior. Also finalize on incoming user message (messageCreate) so the prior turn's draft seals before the next turn's text appends to it.

3. Discord-specific rendering instruction injected per session. First-prompt-only middleware on agent:beforePrompt prepends a <system_instruction> block telling the agent: no markdown tables, use ASCII tables with +---+ borders inside triple-backtick fences, tables ≤90 chars wide, apply silently. Discord-specific (only fires for sourceAdapterId === 'discord'). Adds the middleware:register permission to the plugin. (Supersedes #10)
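
As a rough illustration of change 3, here is a minimal sketch of a first-prompt-only hook. The `PromptPayload` shape, the `beforePrompt` function, and the `instructed` set are hypothetical stand-ins, since the plugin's real middleware signature is not shown here, and the instruction text is a paraphrase of the rules listed above rather than the shipped string.

```typescript
// Hypothetical payload shape; the real agent:beforePrompt payload may differ.
interface PromptPayload {
  sessionId: string;
  sourceAdapterId: string;
  text: string;
}

// Paraphrase of the rules in the PR description, not the shipped instruction text.
const DISCORD_INSTRUCTION =
  "<system_instruction>\n" +
  "- Do not use markdown tables.\n" +
  "- Use ASCII tables with +---+ borders inside triple-backtick fences.\n" +
  "- Keep tables at most 90 characters wide.\n" +
  "- Apply these rules silently.\n" +
  "</system_instruction>\n\n";

// Session ids that have already received the instruction block.
const instructed = new Set<string>();

function beforePrompt(payload: PromptPayload): PromptPayload {
  // Prompts that did not originate from Discord pass through untouched.
  if (payload.sourceAdapterId !== "discord") return payload;
  // Only the first prompt of each session gets the block prepended.
  if (instructed.has(payload.sessionId)) return payload;
  instructed.add(payload.sessionId);
  return { ...payload, text: DISCORD_INSTRUCTION + payload.text };
}
```

Tracking by session id keeps the instruction out of every later prompt in the same session, which is the "first-prompt-only" behavior described above.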

Stacking note

Works best in combination with #11 (content-loss split fix) — without that, finalize would still drop content into the void on long responses. Order of merge: #11 first, then this.
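
For context on what "split into chunks" means at finalize time, here is a minimal sketch of a lossless splitter. It is not the code from #11 or from this PR: `splitIntoChunks` and `CHUNK_LIMIT` are hypothetical names, and the 1900-character budget is simply the truncation figure quoted above.

```typescript
// Assumed per-message budget, taken from the 1900-char figure in the description.
const CHUNK_LIMIT = 1900;

function splitIntoChunks(buffer: string, limit: number = CHUNK_LIMIT): string[] {
  const chunks: string[] = [];
  let rest = buffer;
  while (rest.length > limit) {
    // Prefer to break just after the last newline that fits in the budget.
    const head = rest.slice(0, limit);
    const newline = head.lastIndexOf("\n");
    const cut = newline > 0 ? newline + 1 : limit;
    chunks.push(rest.slice(0, cut));
    rest = rest.slice(cut);
  }
  // Whatever remains fits in a single message.
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

The property worth checking in review is that joining the chunks reproduces the buffer exactly, so no content is dropped on long responses.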

Test plan

  • Long table responses now arrive in full across multiple Discord messages
  • Gemini chain-of-thought no longer dumps into the response at medium
  • First-turn instruction appears to be honored (no markdown tables, ASCII style with borders)
  • Reviewer: confirm assistant sessions (PR #9, "feat(assistant): instruct agent to fence tables and fixed-width content for Discord", already shipped) and regular sessions both render tables correctly
  • Reviewer: confirm no regression for Claude Code (no [Thought:] marker → retro-rewrite is a no-op)
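
For reviewers checking the no-op case, here is a minimal sketch of the retroactive trim. Only the `[Thought: true]` marker and the `replaceBuffer` name come from this PR; the `MessageDraft` body, the `Verbosity` level names, and `onTextChunk` are hypothetical, and the sketch assumes a single marker per turn.

```typescript
const THOUGHT_MARKER = "[Thought: true]";

// Hypothetical level names; the PR only says "medium and below" versus "high".
type Verbosity = "low" | "medium" | "high";

// Stand-in for the adapter's draft; only replaceBuffer is named in the PR.
class MessageDraft {
  private buffer = "";
  append(text: string): void {
    this.buffer += text;
  }
  replaceBuffer(text: string): void {
    this.buffer = text;
  }
  getText(): string {
    return this.buffer;
  }
}

function stripThought(text: string, level: Verbosity): string {
  // High verbosity keeps both the thinking and the response visible.
  if (level === "high") return text;
  const index = text.indexOf(THOUGHT_MARKER);
  // No marker (for example Claude Code output): return the text unchanged.
  if (index === -1) return text;
  return text.slice(index + THOUGHT_MARKER.length).trimStart();
}

function onTextChunk(draft: MessageDraft, chunk: string, level: Verbosity): void {
  // Append first so a marker split across two chunks is still detected.
  draft.append(chunk);
  const stripped = stripThought(draft.getText(), level);
  if (stripped !== draft.getText()) draft.replaceBuffer(stripped);
}
```

Because the search runs over the whole buffer after each append, text without the marker never triggers a rewrite.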

🤖 Generated with Claude Code

Gemini-acp's text stream has a few quirks the adapter wasn't handling.
This commit makes coding sessions on gemini behave like sessions on
Claude Code:

1. **Inline-marker chain-of-thought stripping.** Gemini emits reasoning
   as plain text with a literal `[Thought: true]` marker at the
   thought/response boundary. At medium and below we retroactively trim
   the draft to only the post-marker content so the user sees the
   response without the thinking dump. High keeps both visible. Adds a
   `MessageDraft.replaceBuffer` helper for the in-place rewrite.

2. **Finalize text draft on every turn end.** Agents that don't emit
   `usage`/`session_end` at turn-end (gemini) left the streaming text
   draft stuck at its 1900-char mid-stream truncation. Register a
   plugin middleware on the `turn:end` hook that calls a new public
   `DiscordAdapter.finalizeSessionDraft` so the buffer gets split into
   chunks regardless of the agent's terminal-event behavior. Also
   finalize on incoming user message (messageCreate) so the prior
   turn's draft seals before the next turn's text appends to it.

3. **Discord-specific rendering instruction injected per session.**
   First-prompt-only middleware on `agent:beforePrompt` prepends a
   `<system_instruction>` block telling the agent: no markdown tables,
   use ASCII tables with +---+ borders inside triple-backtick fences,
   tables ≤90 chars wide, apply silently. Discord-specific (only fires
   for sourceAdapterId === 'discord'). Adds the `middleware:register`
   permission to the plugin.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread src/adapter.ts
// Reset tracker state for new prompt cycle on existing sessions
// Reset tracker state and finalize any in-flight draft for existing sessions.
// Some agents (e.g. gemini) don't emit usage/tool_call events between turns,
// so a new user message is the only reliable signal that the prior turn ended.

Contributor Author

Check that this doesn't break with back-to-back messages.

Contributor Author

Back-to-back messages are fine. Anything else to be worried about here?

Comment thread src/index.ts
"and box-drawing or `+---+` style borders, then wrap the whole table in " +
"triple-backtick code fences. The monospace inside the fence aligns the " +
"columns correctly.\n" +
"- Tables MUST be no wider than 90 characters per row. Discord's mobile " +

Contributor Author

I actually think the usable width is even narrower than this: on my Pixel Fold I can only fit 51 characters in folded mode and 71 in unfolded mode. Desktop in full thread view fits about 91. Thoughts?
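
To make the width question easier to test, here is a small sketch that renders a `+---+` bordered table and checks it against a candidate limit. `renderAsciiTable` and `fitsWidth` are hypothetical helpers, not part of the adapter, and the check counts UTF-16 code units, which only matches display width for plain ASCII content.

```typescript
// Renders rows as an ASCII table with +---+ borders; the first row is the header.
function renderAsciiTable(rows: string[][]): string[] {
  if (rows.length === 0) return [];
  const columnCount = Math.max(...rows.map((row) => row.length));
  const widths: number[] = [];
  for (let c = 0; c < columnCount; c++) {
    widths.push(Math.max(...rows.map((row) => (row[c] ?? "").length)));
  }
  const border = "+" + widths.map((w) => "-".repeat(w + 2)).join("+") + "+";
  const lines: string[] = [border];
  rows.forEach((row, i) => {
    const cells = widths.map((w, c) => " " + (row[c] ?? "").padEnd(w) + " ");
    lines.push("|" + cells.join("|") + "|");
    // Separate the header row from the body.
    if (i === 0) lines.push(border);
  });
  // Close the table when there are body rows after the header.
  if (rows.length > 1) lines.push(border);
  return lines;
}

// True when every rendered line fits within maxWidth characters.
function fitsWidth(lines: string[], maxWidth: number): boolean {
  return lines.every((line) => line.length <= maxWidth);
}
```

Running `fitsWidth` with 51, 71, and 91 against a few representative tables would show how often the current 90-character rule overflows on the narrower layouts.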

@0xmrpeter

Contributor

Clean implementation of both features — the replaceBuffer approach for real-time thought stripping is elegant, and hooking into turn:end for draft finalization is exactly the right place for this. The middleware pattern is consistent with how the rest of the adapter works.

A few minor things worth addressing before merge:

  • Deprecated getter: finalizeSessionDraft uses session?.threadId, which is the deprecated single-thread getter. It should be session?.threadIds.get('discord') to stay consistent with the multi-adapter model.
  • Wrong field for adapter check: session?.channelId === 'discord' in the turn:end handler uses the creation-time channelId field. session.attachedAdapters.includes('discord') would be more accurate, especially in multi-adapter scenarios.
  • Empty edit guard: replaceBuffer('') when post-marker content is empty will fire an empty Discord edit call. An if (postMarker) guard would skip the unnecessary API call.
  • Silent catch: the catch in the turn:end middleware swallows errors completely. Even a log.warn would help with debugging if finalizeSessionDraft fails.
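
Taken together, the four suggestions could look roughly like the sketch below. The `Session`, `Logger`, and `Draft` interfaces are hypothetical shapes inferred from the field names in this review, and the `finalize` callback stands in for `finalizeSessionDraft`, whose real signature is not shown in this thread.

```typescript
// Hypothetical shapes inferred from the review; the real types may differ.
interface Session {
  threadIds: Map<string, string>;
  attachedAdapters: string[];
}
interface Logger {
  warn(message: string, error?: unknown): void;
}
interface Draft {
  replaceBuffer(text: string): void;
}

// Adapter check based on attached adapters rather than the creation-time channelId.
function isDiscordSession(session: Session | undefined): boolean {
  return session?.attachedAdapters.includes("discord") ?? false;
}

// Per-adapter lookup instead of the deprecated single-thread getter.
function discordThreadId(session: Session | undefined): string | undefined {
  return session?.threadIds.get("discord");
}

// Skips the rewrite (and the Discord edit it would trigger) when nothing follows the marker.
function applyPostMarker(draft: Draft, postMarker: string): boolean {
  if (!postMarker) return false;
  draft.replaceBuffer(postMarker);
  return true;
}

// turn:end handler that logs failures instead of swallowing them.
async function onTurnEnd(
  session: Session | undefined,
  finalize: (threadId: string) => Promise<void>,
  log: Logger,
): Promise<void> {
  if (!isDiscordSession(session)) return;
  const threadId = discordThreadId(session);
  if (!threadId) return;
  try {
    await finalize(threadId);
  } catch (error) {
    log.warn("finalizeSessionDraft failed", error);
  }
}
```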

This stacks on #11 — worth merging that one first. No blocking issues here.
