feat: route all agents through Gemini via LiteLLM proxy #17
Replace custom Sendblue integration with chat-adapter-sendblue and @chat-adapter/telegram, both unified under Chat SDK. All platforms share the same handlers, memory, automations, and Composio tools.

- Add server/bot.ts with env-driven adapter registry (registerIfConfigured)
- Delete server/sendblue.ts (replaced by chat-adapter-sendblue)
- Refactor server/index.ts webhook bridge to forward all headers + debug logs
- Add scripts/telegram-webhook.mjs for auto-registration on dev boot
- Update scripts/dev.mjs: auto-register Telegram webhook alongside Sendblue
- Add README section: "Adding more chat platforms" with Slack walkthrough
- Add .agents/skills/chat-sdk/SKILL.md for agent-assisted Chat SDK work

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Capture raw request body via express.json verify callback so HMAC signature verification works for Slack, GitHub, Discord, etc.
- Proxy adapter responses faithfully (status + headers + raw bytes) instead of forcing JSON; this fixes URL verification challenge flows
- Fix double-handler: scope onNewMessage catch-all to sendblue only; all other platforms use onDirectMessage to avoid firing twice on @-mentions
- Gate webhook body logging behind DEBUG_WEBHOOKS=true env var to avoid PII in production logs
- Surface webhook registration failures explicitly in autoRegisterWebhook
- Update dev banner to list all active platform webhooks dynamically instead of hardcoding Sendblue-only messaging

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
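The raw-body point in the commit above matters because an HMAC is computed over the exact bytes that were sent: re-serializing parsed JSON can change whitespace and invalidate the signature. The following is a minimal sketch of that idea, assuming a generic hex-encoded HMAC-SHA256 scheme. `verifySignature` is a hypothetical helper, not code from this PR, and real platforms (Slack, GitHub, Discord) each define their own signing format. In Express, the `verify` callback of `express.json` receives the raw `Buffer`, which is what a helper like this would be given.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper (not from the PR): checks a hex-encoded HMAC-SHA256
// signature against the raw request bytes exactly as they arrived.
function verifySignature(rawBody: Buffer, secret: string, signatureHex: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws when lengths differ, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

Signing `{"a": 1}` and then verifying against the re-serialized `{"a":1}` fails under this sketch, which is the failure mode the verify callback is meant to avoid.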
- litellm.config.yaml: map Claude model IDs (claude-sonnet-4-6, claude-haiku-4-5-20251001, etc.) to gemini/gemini-2.5-flash so Claude Code CLI accepts the model name while LiteLLM routes to Gemini
- scripts/start-proxy.sh: load .env.local stripping inline comments; run LiteLLM on port 4001, thin proxy on port 4000
- scripts/anthropic-proxy.mjs: intercepts /v1/messages/count_tokens and returns a mock 200 (LiteLLM+Gemini sends empty body → 500 bug); proxies all other requests to LiteLLM on 4001
- server/interaction-agent.ts: fix empty-reply fallback (was sending literal "(no reply)" string to iMessage); add self-description to safe-to-answer list; log unknown SDK message types for debugging
- server/bot.ts: send helpful fallback when model produces no text; log elapsed time in no-reply path

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rt and enhance execution agent logic Co-authored-by: Copilot <copilot@github.com>
Greptile Summary

This PR routes all agent requests through a LiteLLM proxy to Gemini, adds a thin Node.js proxy to work around a LiteLLM+Gemini

Confidence Score: 4/5

Safe to merge after resolving the litellm.config.yaml / .env.example model-name mismatch; all other changes are straightforward.

One P1 defect: documented model names (gemini-main, gpt-main) are absent from the LiteLLM config, causing immediate failures for any user who follows the .env.example instructions. No P0 issues found.

litellm.config.yaml and .env.example: model name definitions must be consistent between them.

Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
    participant CC as Claude Code CLI
    participant P as anthropic-proxy :4000
    participant L as LiteLLM :4001
    participant G as Gemini API
    CC->>P: POST /v1/messages/count_tokens
    P-->>CC: 200 {input_tokens: 10000} (mocked)
    CC->>P: POST /v1/messages (claude-sonnet-4-6)
    P->>L: forward request
    L->>G: gemini/gemini-2.5-flash
    G-->>L: response
    L-->>P: response
    P-->>CC: response
    Note over CC,G: All three mapped Claude model IDs route to gemini-2.5-flash
```
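The routing decision in the diagram can be sketched as a pure function. This is an illustration rather than the code in scripts/anthropic-proxy.mjs: `routeRequest`, `RouteDecision`, and `LITELLM_ORIGIN` are hypothetical names, and the `127.0.0.1` host is an assumption (the PR only states the ports).

```typescript
// Assumed upstream address; the PR only says LiteLLM listens on port 4001.
const LITELLM_ORIGIN = "http://127.0.0.1:4001";

// Either answer locally with a canned response or forward to LiteLLM.
type RouteDecision =
  | { kind: "mock"; status: number; body: string }
  | { kind: "forward"; target: string };

// Hypothetical helper mirroring the diagram: count_tokens is mocked,
// every other path is forwarded unchanged.
function routeRequest(url: string | undefined): RouteDecision {
  const path = url ?? "/";
  if (path.startsWith("/v1/messages/count_tokens")) {
    return { kind: "mock", status: 200, body: JSON.stringify({ input_tokens: 10000 }) };
  }
  return { kind: "forward", target: LITELLM_ORIGIN + path };
}
```

Keeping the decision separate from the HTTP plumbing makes the one special case easy to test without starting either server.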
Duplicate turn for subscribed Sendblue threads
After the first Sendblue message, handleTurn calls thread.subscribe(). From then on, every message in that thread triggers onSubscribedMessage, and the onNewMessage(/[\s\S]*/) catch-all (whose only guard is adapter.name !== "sendblue") fires as well. The result is two calls to handleTurn, two thread.post() calls, and two Convex mutations per message.
A simple fix is to add an early-exit guard inside onSubscribedMessage since onNewMessage already covers follow-up messages for that adapter:
```ts
bot.onSubscribedMessage(async (thread, message) => {
  // Sendblue messages are already handled via onNewMessage above
  if (thread.adapter.name === "sendblue") return;
  await handleTurn(thread, message);
});
```

scripts/anthropic-proxy.mjs

```js
if (req.url?.startsWith("/v1/messages/count_tokens")) {
  res.writeHead(200, { "Content-Type": "application/json" });
  res.end(JSON.stringify({ input_tokens: 10000 }));
  return;
}
```
Hardcoded mock token count may mis-signal context capacity
Returning input_tokens: 10000 for every count_tokens request means Claude Code always believes the context holds about 10k tokens, regardless of actual conversation length. If a conversation approaches the real model limit, Claude Code gets no signal to compact or warn the user; it will silently exceed the context window and hit truncation or errors at inference time.
A safer sentinel is a very large value (e.g. 200_000) so Claude Code never hits the heuristic threshold, or the comment should explicitly document the chosen value and its rationale.
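A third option, not proposed in the review, would be to return a rough estimate derived from the request size instead of a constant. The sketch below assumes about 4 characters per token, which is only a heuristic and not a real tokenizer. `estimateInputTokens`, `countTokensResponse`, and `ASSUMED_CHARS_PER_TOKEN` are hypothetical names.

```typescript
// Heuristic only: roughly 4 characters per token is an assumption,
// not the behavior of any actual Claude or Gemini tokenizer.
const ASSUMED_CHARS_PER_TOKEN = 4;

// Estimates a token count from the raw request body length.
function estimateInputTokens(rawBody: string): number {
  if (rawBody.length === 0) return 0;
  return Math.ceil(rawBody.length / ASSUMED_CHARS_PER_TOKEN);
}

// Builds the JSON payload a mocked count_tokens endpoint could return.
function countTokensResponse(rawBody: string): string {
  return JSON.stringify({ input_tokens: estimateInputTokens(rawBody) });
}
```

An estimate like this grows with the conversation, so the client at least sees a value that moves in the right direction, though it may be off by a wide margin for non-English text or code.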
scripts/start-proxy.sh

```sh
  set +a
fi

# Ensure mandatory keys are set for LiteLLM
```
Unconditional empty-string export may confuse LiteLLM
Exporting OPENAI_API_KEY as an empty string (via the :-"" default) always sets the variable in the environment, even when OpenAI is not in use. LiteLLM inspects environment variable presence to decide whether to validate/route OpenAI requests; an empty string may trigger unexpected validation errors or suppress the clearer "key not configured" diagnostic. Only exporting it when the value is non-empty would be safer.
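The same idea can be expressed outside the shell script, for example in a Node launcher that spawns the proxy (an assumption; the PR does this in scripts/start-proxy.sh). The sketch below builds a child environment that contains an optional key only when it has a non-empty value. `dropEmptyKeys` and `Env` are hypothetical names, and `OTHER_KEY` in the usage line is a placeholder.

```typescript
type Env = Record<string, string | undefined>;

// Returns a copy of `env` in which each listed optional key is removed
// when it is unset, empty, or whitespace-only.
function dropEmptyKeys(env: Env, optionalKeys: string[]): Env {
  const result: Env = { ...env };
  for (const key of optionalKeys) {
    const value = result[key];
    if (value === undefined || value.trim() === "") {
      delete result[key];
    }
  }
  return result;
}

// Usage: OPENAI_API_KEY is left out entirely rather than passed as "".
const childEnv = dropEmptyKeys({ OTHER_KEY: "x", OPENAI_API_KEY: "" }, ["OPENAI_API_KEY"]);
```

Omitting the variable keeps "not configured" distinguishable from "configured but empty" for whatever reads the environment downstream.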
Closes #16
Summary
- litellm.config.yaml: maps Claude Code's internal model IDs (claude-sonnet-4-6, claude-haiku-4-5-20251001, claude-sonnet-4-5-20250929) to gemini/gemini-2.5-flash so requests route to Gemini transparently
- scripts/start-proxy.sh: fixed .env.local loading (inline comments broke xargs); now runs LiteLLM on port 4001 + thin proxy on port 4000
- scripts/anthropic-proxy.mjs: new thin Node.js proxy that intercepts /v1/messages/count_tokens (LiteLLM+Gemini bug: sends empty body → 500) and returns a mock 200; proxies all other requests to LiteLLM
- server/interaction-agent.ts: fixed "(no reply)" literal being sent to iMessage when model produces no text; added self-description to safe-to-answer list; added unknown SDK message type logging
- server/bot.ts: empty reply now sends a helpful fallback message instead of nothing

Test plan

- sh scripts/start-proxy.sh starts without env errors
- /v1/messages/count_tokens calls return 200 (no more 500 floods)

🤖 Generated with Claude Code