feat: multi-provider routing — Anthropic OAuth, LiteLLM, OpenAI, Gemini #39

Summary

Extend Mux from single-provider routing (Anthropic OAuth only) to policy-driven, multi-provider routing across several LLM backends. The policy engine should decide, per request, both which model and which provider to use, based on prompt complexity, model availability, cost, and caller intent.

Current State

  • Mux routes all requests through a single downstream path: DOWNSTREAM_MODE=anthropic-sdk via Anthropic OAuth
  • The policy engine (src/policy.ts) selects between Haiku/Sonnet/Opus based on prompt analysis (3-tier routing)
  • RouteDecision.provider and RouteDecision.backendTarget fields exist but are hardcoded to config.defaultProvider / config.defaultBackendTarget — never used for actual dispatch
  • callDownstream() makes a binary choice based on the global DOWNSTREAM_MODE config, not per-request
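
Roughly, today's dispatch has the shape sketched below (illustrative only; `callAnthropicSdk` / `callOpenAICompatible` are stand-ins for the existing paths, not the actual function names):

```ts
// Illustrative sketch of the current single-switch dispatch; helper names
// are stand-ins, not the actual source.
type ChatRequest = { model: string; messages: unknown[] };
type RouteDecision = { model: string; provider: string; backendTarget: string };

declare const config: { downstreamMode: "anthropic-sdk" | "openai-compatible" };
declare function callAnthropicSdk(req: ChatRequest, route: RouteDecision): Promise<unknown>;
declare function callOpenAICompatible(req: ChatRequest, route: RouteDecision): Promise<unknown>;

// One global env switch (DOWNSTREAM_MODE) picks the path for every request;
// route.provider is computed by the policy engine but never consulted here.
async function callDownstream(req: ChatRequest, route: RouteDecision): Promise<unknown> {
  return config.downstreamMode === "anthropic-sdk"
    ? callAnthropicSdk(req, route)
    : callOpenAICompatible(req, route);
}
```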

Target Providers

| Provider | Auth | Backend | Models |
| --- | --- | --- | --- |
| Anthropic OAuth | OAuth token (`sk-ant-oat01-...`) | agentweave-proxy → Anthropic API | Claude Haiku, Sonnet, Opus |
| LiteLLM | API key or none | LiteLLM gateway (OpenAI-compatible) | Any model LiteLLM supports |
| OpenAI | OAuth or API key | OpenAI API directly | GPT-4o, GPT-4o-mini, o1, o3 |
| Google Gemini | API key | Gemini API (or via LiteLLM) | Gemini 2.5 Pro, Flash |

Design Goals

  1. Policy-driven provider selection: resolveRoute() returns a provider that determines which backend handles the request — not a global config switch
  2. Provider registry: Each provider has its own auth config, base URL, and handler. Adding a new provider should be a config + registry entry, not a code change
  3. Fallback chains: If provider A is unavailable (rate limit, timeout, error), try provider B
  4. Cost-aware routing: Route to cheaper providers/models for simple queries, premium for complex
  5. Model availability awareness: If a model isn't available on the preferred provider, route to one that has it
  6. Unified observability: All providers emit the same tracing attributes so the AgentWeave dashboard shows routing decisions across providers
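
To make goal 1 concrete, `RouteDecision` would carry a provider that dispatch actually honors. A minimal sketch (`reason` and `fallbacks` are assumed fields for illustration, not existing code):

```ts
// Sketch of a provider-aware RouteDecision; only model/provider/backendTarget
// exist today, the optional fields are assumptions.
interface RouteDecision {
  model: string;          // e.g. "gpt-4o-mini" or a Claude Haiku model id
  provider: string;       // registry key: "anthropic-oauth" | "litellm" | "openai" | "gemini"
  backendTarget: string;  // concrete base URL / endpoint, for tracing
  reason?: string;        // why the policy chose this route (observability, goal 6)
  fallbacks?: string[];   // ordered providers to try on failure (goal 3)
}
```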

Architecture Sketch

```
Request → Policy Engine → RouteDecision { model, provider, backendTarget }
                              ↓
                     Provider Registry
                     ┌─────────────────┐
                     │ anthropic-oauth │ → Anthropic SDK (existing)
                     │ litellm         │ → OpenAI-compatible HTTP
                     │ openai          │ → OpenAI SDK (new)
                     │ gemini          │ → Gemini SDK or via LiteLLM
                     └─────────────────┘
```

Implementation Phases

Phase 1: Provider abstraction

  • Define a Provider interface: { name, canHandle(model), call(req, route), stream(req, route, res) } (sketched after this list)
  • Refactor existing Anthropic SDK path into a provider
  • Refactor existing OpenAI-compatible path into a provider
  • Make callDownstream() dispatch based on route.provider → provider registry
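
A minimal sketch of that interface and the registry dispatch, assuming streaming writes to a Node `http.ServerResponse` (the request/response types are placeholders):

```ts
import type { ServerResponse } from "node:http";

type ChatRequest = { model: string; messages: unknown[] };
type RouteDecision = { model: string; provider: string; backendTarget: string };

// The Provider interface from the first bullet, written out (types assumed).
interface Provider {
  name: string;
  canHandle(model: string): boolean;
  call(req: ChatRequest, route: RouteDecision): Promise<unknown>;
  stream(req: ChatRequest, route: RouteDecision, res: ServerResponse): Promise<void>;
}

// Registry keyed by provider name: adding a provider becomes a registration
// plus config, not a new branch in callDownstream().
const registry = new Map<string, Provider>();

function registerProvider(p: Provider): void {
  registry.set(p.name, p);
}

// Phase 1 end state: dispatch on route.provider instead of a global mode.
async function callDownstream(req: ChatRequest, route: RouteDecision): Promise<unknown> {
  const provider = registry.get(route.provider);
  if (!provider?.canHandle(route.model)) {
    throw new Error(`no provider can handle ${route.provider}/${route.model}`);
  }
  return provider.call(req, route);
}
```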

Phase 2: LiteLLM integration

  • Add LiteLLM as a provider (OpenAI-compatible endpoint)
  • Configure model → provider mapping in env/config
  • Route non-Anthropic models to LiteLLM
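
The model → provider mapping implies a small pattern matcher for keys like `"claude-*"`. A sketch, assuming the mapping is the parsed `MODEL_PROVIDERS` JSON from the Config Vision below (the matcher itself is hypothetical):

```ts
// Assumed parsed form of MODEL_PROVIDERS (see Config Vision below).
const modelProviders: Record<string, string> = {
  "claude-*": "anthropic-oauth",
  "gpt-*": "openai",
  "gemini-*": "gemini",
};

// Match "claude-*"-style patterns; first match wins, LiteLLM as catch-all.
// e.g. providerForModel("gpt-4o-mini") returns "openai".
function providerForModel(model: string, fallback = "litellm"): string {
  for (const [pattern, provider] of Object.entries(modelProviders)) {
    const re = new RegExp("^" + pattern.split("*").map(escapeRegExp).join(".*") + "$");
    if (re.test(model)) return provider;
  }
  return fallback;
}

function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}
```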

Phase 3: Direct OpenAI + Gemini

  • Add OpenAI SDK provider (direct, not through LiteLLM)
  • Add Gemini provider (direct or via LiteLLM)
  • OAuth support for OpenAI
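
The direct OpenAI path could implement the same Provider shape over the official `openai` npm package. A sketch (the Provider shape is the assumed one from Phase 1; OAuth token exchange is out of scope here, so this uses a plain API key):

```ts
import OpenAI from "openai";

// Sketch of a direct OpenAI provider (assumed Provider shape from Phase 1).
const openaiProvider = {
  name: "openai",
  // Claim GPT and o-series models; everything else falls through to LiteLLM.
  canHandle: (model: string) => /^(gpt-|o1|o3)/.test(model),
  async call(req: {
    model: string;
    messages: Array<{ role: "system" | "user" | "assistant"; content: string }>;
  }) {
    const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
    // Standard Chat Completions call; streaming would pass `stream: true`.
    return client.chat.completions.create({
      model: req.model,
      messages: req.messages,
    });
  },
};
```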

Phase 4: Fallback chains + cost routing

  • Provider health tracking (error rates, latency)
  • Fallback chain config: anthropic → litellm → openai
  • Cost-weighted routing for equivalent models across providers
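
One way to express the fallback chain: walk the configured next-provider links until a call succeeds, bounded by a hop limit. A sketch (error classification is collapsed into a bare try/catch; a real version would only fall through on rate limits, timeouts, and 5xx, and would feed provider health tracking):

```ts
type ChatRequest = { model: string; messages: unknown[] };
type RouteDecision = { model: string; provider: string; backendTarget: string };
declare function callDownstream(req: ChatRequest, route: RouteDecision): Promise<unknown>;

// Chain shape mirrors PROVIDER_FALLBACKS in the Config Vision below.
const fallbacks: Record<string, string | undefined> = {
  "anthropic-oauth": "litellm",
  "openai": "litellm",
};

async function callWithFallback(
  req: ChatRequest,
  route: RouteDecision,
  maxHops = 2,
): Promise<unknown> {
  let current: string | undefined = route.provider;
  let lastError: unknown;
  for (let hop = 0; current !== undefined && hop <= maxHops; hop++) {
    try {
      return await callDownstream(req, { ...route, provider: current });
    } catch (err) {
      lastError = err;               // real impl: rethrow non-retryable errors
      current = fallbacks[current];  // next link in the chain, if any
    }
  }
  throw lastError;
}
```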

Config Vision

```bash
# Provider registry (JSON)
PROVIDERS={"anthropic-oauth":{"type":"anthropic-sdk","baseUrl":"http://192.168.1.70:30400","oauthToken":"sk-ant-..."},"litellm":{"type":"openai-compatible","baseUrl":"http://127.0.0.1:4001/v1"},"openai":{"type":"openai-sdk","apiKey":"sk-..."},"gemini":{"type":"openai-compatible","baseUrl":"http://127.0.0.1:4001/v1"}}

# Model → provider mapping
MODEL_PROVIDERS={"claude-*":"anthropic-oauth","gpt-*":"openai","gemini-*":"gemini"}

# Fallback chains
PROVIDER_FALLBACKS={"anthropic-oauth":"litellm","openai":"litellm"}
```
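
Parsing that vision into typed config could be a thin JSON layer over the environment. A sketch (field names mirror the JSON above; `parseJsonEnv` is a hypothetical helper and validation is deliberately minimal):

```ts
// Sketch: parse the PROVIDERS / MODEL_PROVIDERS / PROVIDER_FALLBACKS env JSON
// into typed config. Field names mirror the JSON above.
interface ProviderConfig {
  type: "anthropic-sdk" | "openai-compatible" | "openai-sdk";
  baseUrl?: string;
  apiKey?: string;
  oauthToken?: string;
}

function parseJsonEnv<T>(name: string, fallback: T): T {
  const raw = process.env[name];
  if (!raw) return fallback;
  try {
    return JSON.parse(raw) as T;
  } catch (err) {
    throw new Error(`invalid JSON in ${name}: ${(err as Error).message}`);
  }
}

const providers = parseJsonEnv<Record<string, ProviderConfig>>("PROVIDERS", {});
const modelProviders = parseJsonEnv<Record<string, string>>("MODEL_PROVIDERS", {});
const providerFallbacks = parseJsonEnv<Record<string, string>>("PROVIDER_FALLBACKS", {});
```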
