AI gateway: model-based routing + failover metric + model-id refresh #536
Merged
Follow-up to the cassettes work, implementing the gaps it surfaced.
- Model-based provider selection: before the routing strategy runs, the
dispatcher narrows providers to those whose `models` list declares the
requested model (an empty list is a wildcard; no match passes through
unchanged). A single OpenAI-compatible origin can now route by the model
field to the right vendor regardless of strategy. Extracted as
model_eligible_providers() with unit tests; documented in docs/ai-gateway.md.
- Failover metric: record_failover() was defined and tested but never called,
so sbproxy_ai_failovers_total stayed empty on real failovers. Wired it into
both advancement points (retriable 5xx and transport error).
- Refresh deprecated provider model ids across example configs/READMEs and AI
doc snippets (claude-3-5-* -> claude-{sonnet,haiku}-4-5, gemini-2.x/1.5/
flash-latest -> gemini-3.5-flash). Bedrock ARNs and OpenRouter-format ids
left as-is.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01B8vnyA6iBx5FnoLbsnwWww
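The narrowing step from the first bullet can be sketched roughly as follows. This is a minimal illustration, not the gateway's actual code: the `Provider` shape and the provider names are hypothetical. The no-match branch assumes that "passes through unchanged" means the full provider list is handed to the routing strategy untouched.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    # Hypothetical shape; the real provider config type is not shown in this PR.
    name: str
    models: list[str] = field(default_factory=list)


def model_eligible_providers(providers: list[Provider], model: str) -> list[Provider]:
    """Narrow providers to those whose `models` list declares `model`.

    An empty `models` list is treated as a wildcard. If nothing matches,
    the input is returned unchanged (assumed reading of "passes through").
    """
    eligible = [p for p in providers if not p.models or model in p.models]
    return eligible if eligible else list(providers)


# Usage: provider names are illustrative; model ids are the refreshed ones from this PR.
providers = [
    Provider("anthropic", ["claude-sonnet-4-5", "claude-haiku-4-5"]),
    Provider("google", ["gemini-3.5-flash"]),
]
candidates = model_eligible_providers(providers, "gemini-3.5-flash")
print([p.name for p in candidates])
```

Running the filter before the strategy means round-robin or any other strategy only ever sees providers that can serve the request.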
Follow-up to #534, implementing the gaps that work surfaced. (Recreated after #534 merged; supersedes the closed #535.)
Model-based provider selection
Before the routing strategy runs, the dispatcher narrows providers to those whose `models` list declares the requested model. An empty `models` list is a wildcard; if no provider declares the model, the model name passes straight through unchanged. A single OpenAI-compatible origin can now route by the `model` field to the correct vendor regardless of strategy (previously round-robin could send a request to a provider that does not serve the model, returning `model_not_found`). Extracted as `model_eligible_providers()` with unit tests; documented under "Model-based provider selection" in `docs/ai-gateway.md`.
Failover metric
`record_failover()` was defined and unit-tested but never called, so `sbproxy_ai_failovers_total` stayed empty even on real failovers. Wired it into both advancement points (retriable 5xx and transport error). Verified live.
Model-id refresh
Refreshed deprecated provider model ids across example configs/READMEs and AI doc snippets (`claude-3-5-*` → `claude-{sonnet,haiku}-4-5`, `gemini-2.x/1.5/flash-latest` → `gemini-3.5-flash`). Bedrock ARNs and OpenRouter-format ids are intentionally left unchanged.
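For the failover metric, the wiring at the two advancement points might look something like the sketch below. Everything except the name `record_failover` is an assumption made for illustration: the in-memory `Counter` stands in for `sbproxy_ai_failovers_total`, and the `reason` labels, the set of retriable statuses, and the `dispatch`/`send` signatures are hypothetical.

```python
from collections import Counter
from typing import Callable

# In-memory stand-in for the sbproxy_ai_failovers_total counter (assumption).
FAILOVERS = Counter()

# Which 5xx codes count as retriable is an assumption, not taken from the PR.
RETRIABLE_STATUSES = {500, 502, 503, 504}


def record_failover(from_provider: str, reason: str) -> None:
    # Hypothetical signature; the PR only names the function.
    FAILOVERS[(from_provider, reason)] += 1


def dispatch(providers: list[str], send: Callable[[str], int]) -> tuple[str, int]:
    """Try providers in order and return (provider, status) for the first usable response."""
    for i, provider in enumerate(providers):
        is_last = i == len(providers) - 1
        try:
            status = send(provider)
        except ConnectionError:
            if is_last:
                raise  # nothing left to fail over to
            record_failover(provider, "transport_error")  # advancement point: transport error
            continue
        if status in RETRIABLE_STATUSES and not is_last:
            record_failover(provider, "retriable_5xx")  # advancement point: retriable 5xx
            continue
        return provider, status
    raise RuntimeError("no providers configured")


# Usage: a fake transport where "a" is unreachable and "b" returns 503.
def fake_send(provider: str) -> int:
    if provider == "a":
        raise ConnectionError("connection refused")
    return 503 if provider == "b" else 200


print(dispatch(["a", "b", "c"], fake_send))
print(dict(FAILOVERS))
```

The counter is incremented only when the dispatcher actually moves on to another provider, so a failure on the last provider is not counted as a failover in this sketch.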