Route all agents through Gemini via LiteLLM proxy #16

@farzanshibu

Problem

All agents (dispatcher + execution) were hardcoded to Claude models via the Anthropic API. Running Gemini was not supported, and a LiteLLM proxy setup was partially wired but broken in several ways:

  • scripts/start-proxy.sh failed to load .env.local: inline comments in the file broke the xargs-based parsing
  • ANTHROPIC_AUTH_TOKEN was set instead of ANTHROPIC_API_KEY, so the SDK ignored it
  • LiteLLM had no model aliases matching the model names Claude Code CLI sends internally (claude-sonnet-4-6, claude-haiku-4-5-20251001, etc.), so requests failed with 400 errors
  • LiteLLM's /v1/messages/count_tokens endpoint sends an empty body to Gemini's countTokens API, producing 500 errors that flood the logs
  • When the dispatcher model produced no text blocks (tool-only responses), the literal string "(no reply)" was sent to the user over iMessage
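
To illustrate the first item, here is a minimal sketch of env-file parsing that tolerates inline comments. It is not the code in scripts/start-proxy.sh (which is a shell script and is not shown in this issue); the function name `parse_env_lines` and the variable names in `sample` are hypothetical, and the sketch assumes unquoted values where whitespace followed by `#` starts a comment.

```python
import re

# Assumption: an inline comment is whitespace, then '#', then the rest of the line.
# A '#' with no whitespace before it (e.g. inside a URL fragment) is kept.
_INLINE_COMMENT = re.compile(r"\s+#.*$")


def parse_env_lines(lines):
    """Return a dict of KEY -> value, ignoring blank lines and comments."""
    env = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or full-line comment
        line = _INLINE_COMMENT.sub("", line)  # drop a trailing "  # note"
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=value assignment
        env[key.strip()] = value.strip()
    return env


# Hypothetical file contents, for illustration only.
sample = [
    "# LiteLLM settings",
    "GEMINI_API_KEY=abc123   # placeholder value",
    "",
    "LITELLM_PORT=4001",
]
print(parse_env_lines(sample))
```

Without the comment-stripping step, a pipeline that splits the file on whitespace would treat `#`, `placeholder`, and `value` as separate words and hand them on as if they were assignments.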

Solution (this PR)

See PR #X for implementation. This issue tracks the root causes.

Checklist

  • .env.local loading fixed (strip inline comments before xargs)
  • ANTHROPIC_API_KEY set correctly for LiteLLM auth
  • litellm.config.yaml maps all Claude Code internal model IDs to gemini/gemini-2.5-flash
  • Thin proxy on port 4000 intercepts count_tokens requests and returns a mock 200 response
  • LiteLLM runs on port 4001 behind the proxy
  • "(no reply)" fallback replaced with a real user-facing fallback message
  • System prompt updated to allow self-description questions (e.g. "which model are you")
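
The two-port layout in the checklist could look roughly like the sketch below. It uses only the Python standard library and is not the implementation from the PR. The ports 4000 and 4001 come from the checklist; the names `ThinProxy`, `is_count_tokens`, and `make_server` are hypothetical, and the mock payload shape (`{"input_tokens": 0}`) is an assumption about what the client will accept. The sketch buffers whole responses, so it does not handle streaming.

```python
import json
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

UPSTREAM = "http://127.0.0.1:4001"  # LiteLLM, per the checklist
LISTEN_PORT = 4000                  # port the agents talk to
MOCK_COUNT = {"input_tokens": 0}    # assumed placeholder payload
SKIPPED = ("host", "content-length", "accept-encoding")

# Talk to the upstream directly, ignoring any system-wide HTTP proxy settings.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_count_tokens(path):
    """True when the request path targets a count_tokens endpoint."""
    return path.split("?", 1)[0].rstrip("/").endswith("/count_tokens")


class ThinProxy(BaseHTTPRequestHandler):
    def _send(self, status, body, content_type="application/json"):
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        length = int(self.headers.get("Content-Length") or 0)
        payload = self.rfile.read(length)
        if is_count_tokens(self.path):
            # Answer locally so the request never reaches LiteLLM or Gemini.
            self._send(200, json.dumps(MOCK_COUNT).encode())
            return
        headers = {k: v for k, v in self.headers.items() if k.lower() not in SKIPPED}
        req = urllib.request.Request(
            UPSTREAM + self.path, data=payload, headers=headers, method="POST"
        )
        try:
            with _OPENER.open(req) as resp:
                ctype = resp.headers.get("Content-Type", "application/json")
                self._send(resp.status, resp.read(), ctype)
        except urllib.error.HTTPError as err:
            # Relay upstream error responses unchanged.
            ctype = err.headers.get("Content-Type", "application/json")
            self._send(err.code, err.read(), ctype)
        except urllib.error.URLError:
            self._send(502, b'{"error": "upstream unavailable"}')


def make_server(port=LISTEN_PORT):
    """Build the proxy server; call .serve_forever() on the result to run it."""
    return ThreadingHTTPServer(("127.0.0.1", port), ThinProxy)
```

Running `make_server().serve_forever()` would start the proxy on port 4000. Matching on the path suffix rather than the full path is a design choice made here so that query strings and trailing slashes do not bypass the interception.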
