AI gateway: model-based routing + failover metric + model-id refresh by rickcrawford · Pull Request #536 · soapbucket/sbproxy

rickcrawford · 2026-06-24T22:44:14Z

Follow-up to #534, addressing the gaps that work surfaced. (Recreated after #534 merged; supersedes the closed #535.)

Model-based provider selection

Before the routing strategy runs, the dispatcher narrows providers to those whose models list declares the requested model. An empty models list is a wildcard; if no provider declares the model, the model name passes straight through unchanged. A single OpenAI-compatible origin can now route by the model field to the correct vendor regardless of strategy (previously round-robin could send a request to a provider that does not serve the model, returning model_not_found). Extracted as model_eligible_providers() with unit tests; documented under "Model-based provider selection" in docs/ai-gateway.md.
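The narrowing step can be illustrated with a minimal Python sketch. This is not the project's implementation: the `Provider` type, the provider and model names, and two behaviours are assumptions made for illustration. The first is that wildcard providers stay eligible alongside providers that declare the model. The second is that the full list is returned when nothing matches, so the strategy runs as before and the model name is forwarded unchanged.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Provider:
    # Hypothetical stand-in for a provider config entry.
    name: str
    models: List[str] = field(default_factory=list)  # empty list = wildcard


def model_eligible_providers(providers: List[Provider], model: str) -> List[Provider]:
    """Narrow `providers` to those that may serve `model`.

    Runs before the routing strategy, so round-robin (or any other
    strategy) only ever picks among eligible providers.
    """
    eligible = [p for p in providers if not p.models or model in p.models]
    # Assumption: if nothing is eligible, keep the full list so the request
    # is routed as before and the model name passes through unchanged.
    return eligible if eligible else list(providers)
```

With providers `vendor-a` (declares `model-a`), `vendor-b` (declares `model-b`), and a wildcard `catchall`, a request for `model-a` is narrowed to `vendor-a` and `catchall`, so the strategy can no longer pick `vendor-b` for it.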

Failover metric

record_failover() was defined and unit-tested but never called, so sbproxy_ai_failovers_total stayed empty even on real failovers. Wired it into both advancement points (retriable 5xx and transport error). Verified live:

sbproxy_ai_failovers_total{from_provider="primary-unreachable",reason="transport",to_provider="anthropic-backup"} 1
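A rough sketch of what "wired into both advancement points" means, in Python for illustration only. The dispatch loop, the `TransportError` type, the `send` callback, and the `"5xx"` reason label are hypothetical, and the sketch treats every 5xx as retriable. Only the `record_failover()` name, the label names, and the `"transport"` reason come from the description and the metric line above.

```python
from collections import Counter

# Stand-in for the sbproxy_ai_failovers_total counter,
# keyed by (from_provider, to_provider, reason).
failovers_total = Counter()


def record_failover(from_provider, to_provider, reason):
    failovers_total[(from_provider, to_provider, reason)] += 1


class TransportError(Exception):
    """Hypothetical error for a connection-level failure."""


def dispatch(providers, send):
    """Try providers in order, advancing on a 5xx or a transport error."""
    for i, name in enumerate(providers):
        next_name = providers[i + 1] if i + 1 < len(providers) else None
        try:
            status = send(name)
        except TransportError:
            reason = "transport"  # advancement point 1: transport error
        else:
            if status < 500:
                return name, status
            reason = "5xx"  # advancement point 2 (label is an assumption)
        if next_name is not None:
            # The call that was missing: count the hop at both points.
            record_failover(name, next_name, reason)
    raise RuntimeError("all providers failed")
```

Recording the hop only when a next provider exists keeps the counter aligned with actual failovers rather than with every failed attempt.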

Model-id refresh

Refreshed deprecated provider model ids across example configs/READMEs and AI doc snippets (claude-3-5-* → claude-{sonnet,haiku}-4-5, gemini-2.x / 1.5 / flash-latest → gemini-3.5-flash). Bedrock ARNs and OpenRouter-format ids are intentionally left unchanged.

Follow-up to the cassettes work, implementing the gaps it surfaced.

- Model-based provider selection: before the routing strategy runs, the dispatcher narrows providers to those whose `models` list declares the requested model (an empty list is a wildcard; no match passes through unchanged). A single OpenAI-compatible origin can now route by the model field to the right vendor regardless of strategy. Extracted as model_eligible_providers() with unit tests; documented in docs/ai-gateway.md.
- Failover metric: record_failover() was defined and tested but never called, so sbproxy_ai_failovers_total stayed empty on real failovers. Wired it into both advancement points (retriable 5xx and transport error).
- Refresh deprecated provider model ids across example configs/READMEs and AI doc snippets (claude-3-5-* -> claude-{sonnet,haiku}-4-5, gemini-2.x/1.5/flash-latest -> gemini-3.5-flash). Bedrock ARNs and OpenRouter-format ids left as-is.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B8vnyA6iBx5FnoLbsnwWww

rickcrawford merged commit 367c5c9 into main Jun 24, 2026
5 checks passed

rickcrawford deleted the ai-gateway-followup-fixes branch June 24, 2026 22:59

rickcrawford merged 1 commit into main from ai-gateway-followup-fixes
