AI gateway: model-based routing + failover metric + model-id refresh #536

Merged
rickcrawford merged 1 commit into main from ai-gateway-followup-fixes Jun 24, 2026
Conversation

@rickcrawford

Contributor

Follow-up to #534, implementing the gaps that work surfaced. (Recreated after #534 merged; supersedes the closed #535.)

Model-based provider selection

Before the routing strategy runs, the dispatcher narrows providers to those whose models list declares the requested model. An empty models list is a wildcard; if no provider declares the model, the model name passes straight through unchanged. A single OpenAI-compatible origin can now route by the model field to the correct vendor regardless of strategy (previously round-robin could send a request to a provider that does not serve the model, returning model_not_found). Extracted as model_eligible_providers() with unit tests; documented under "Model-based provider selection" in docs/ai-gateway.md.
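
The narrowing step described above can be sketched roughly as follows. This is an illustration in Python, not the project's code: the `Provider` shape, the provider names, and the choice to keep the full provider list when nothing matches are assumptions made for the example. Only the function name and the wildcard rule come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    # Hypothetical shape: only the fields this sketch needs.
    name: str
    models: list[str] = field(default_factory=list)


def model_eligible_providers(providers, model):
    """Return the providers that may serve `model`, preserving order."""
    # An empty `models` list is a wildcard: the provider accepts any model.
    eligible = [p for p in providers if not p.models or model in p.models]
    # Assumption: if no provider declares the model, do not narrow at all,
    # so the request passes through to the routing strategy unchanged.
    return eligible or list(providers)


# Hypothetical provider names; the model ids are the ones named in this PR.
providers = [
    Provider("anthropic", ["claude-sonnet-4-5", "claude-haiku-4-5"]),
    Provider("google", ["gemini-3.5-flash"]),
]

names = [p.name for p in model_eligible_providers(providers, "gemini-3.5-flash")]
```

Because the filter runs before the strategy, round-robin or any other strategy only ever sees providers that can serve the requested model.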

Failover metric

record_failover() was defined and unit-tested but never called, so sbproxy_ai_failovers_total stayed empty even on real failovers. Wired it into both advancement points (retriable 5xx and transport error). Verified live:

sbproxy_ai_failovers_total{from_provider="primary-unreachable",reason="transport",to_provider="anthropic-backup"} 1
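
A minimal sketch of what wiring the metric into both advancement points might look like, written in Python for illustration. The `dispatch` loop, the `TransportError` type, the `send` callback, and the `"retriable_5xx"` reason value are hypothetical. Only `record_failover()`, the label names, and the `"transport"` reason appear in this PR. A plain `Counter` stands in for the real metrics registry, and every 5xx is treated as retriable for simplicity.

```python
from collections import Counter

# Stand-in for sbproxy_ai_failovers_total, keyed by
# (from_provider, to_provider, reason).
failovers_total = Counter()


def record_failover(from_provider, to_provider, reason):
    failovers_total[(from_provider, to_provider, reason)] += 1


class TransportError(Exception):
    """Hypothetical error raised when a provider cannot be reached."""


def dispatch(providers, send):
    """Try providers in order; `send(name)` returns an HTTP status code
    or raises TransportError. Returns (provider_name, status)."""
    last_status = None
    for i, name in enumerate(providers):
        try:
            status = send(name)
            # Advancement point 1: retriable 5xx (assumed: any 5xx).
            reason = "retriable_5xx" if 500 <= status <= 599 else None
        except TransportError:
            # Advancement point 2: transport error.
            status, reason = None, "transport"
        if reason is None:
            return name, status
        last_status = status
        # Record only when there is a next provider to advance to.
        if i + 1 < len(providers):
            record_failover(name, providers[i + 1], reason)
    return None, last_status
```

With an unreachable first provider and a healthy second one, this sketch records a single failover carrying the same labels as the sample line above.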

Model-id refresh

Refreshed deprecated provider model ids across example configs/READMEs and AI doc snippets (claude-3-5-* -> claude-{sonnet,haiku}-4-5; gemini-2.x / 1.5 / flash-latest -> gemini-3.5-flash). Bedrock ARNs and OpenRouter-format ids are intentionally left unchanged.

Follow-up to the cassettes work, implementing the gaps it surfaced.

- Model-based provider selection: before the routing strategy runs, the
  dispatcher narrows providers to those whose `models` list declares the
  requested model (an empty list is a wildcard; no match passes through
  unchanged). A single OpenAI-compatible origin can now route by the model
  field to the right vendor regardless of strategy. Extracted as
  model_eligible_providers() with unit tests; documented in docs/ai-gateway.md.

- Failover metric: record_failover() was defined and tested but never called,
  so sbproxy_ai_failovers_total stayed empty on real failovers. Wired it into
  both advancement points (retriable 5xx and transport error).

- Refresh deprecated provider model ids across example configs/READMEs and AI
  doc snippets (claude-3-5-* -> claude-{sonnet,haiku}-4-5, gemini-2.x/1.5/
  flash-latest -> gemini-3.5-flash). Bedrock ARNs and OpenRouter-format ids
  left as-is.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B8vnyA6iBx5FnoLbsnwWww
@rickcrawford merged commit 367c5c9 into main Jun 24, 2026
5 checks passed
@rickcrawford deleted the ai-gateway-followup-fixes branch June 24, 2026 22:59