test(ai-proxy): end-to-end coverage for ADR-0030 + drive-by dispatch fix#77
Closed
ndreno wants to merge 4 commits intofeat/ai-proxy-routes-tablefrom
Closed
test(ai-proxy): end-to-end coverage for ADR-0030 + drive-by dispatch fix#77ndreno wants to merge 4 commits intofeat/ai-proxy-routes-tablefrom
ndreno wants to merge 4 commits intofeat/ai-proxy-routes-tablefrom
Conversation
…nge)
ADR-0030 §1 calls for the dispatcher to become protocol-aware, with
path-based dispatch picking translation between Chat Completions and
the Responses API on the same target pool. This PR is the mechanical
move that preps the source layout — zero behavior change for the only
existing path (/v1/chat/completions).
Source layout:
- lib.rs orchestration (target resolution, fallback chain,
metrics, context propagation, shared helpers,
host stubs); now declares mod protocols/providers.
- protocols/
chat_completion.rs OpenAI Chat Completions adapter — handle(),
translate_to_anthropic, translate_from_anthropic,
AnthropicRequest.
- providers/
openai.rs OpenAI-compatible transport (openai_call,
openai_stream, maybe_inject_max_tokens, openai_url,
openai_headers).
anthropic.rs Anthropic Messages transport + ANTHROPIC_API_VERSION
constant (was inlined at lib.rs:359).
ollama.rs Empty slot — Ollama shares OpenAI transport today.
ADR-0030 §2 will use this file to reject Responses
requests against an Ollama target.
Dispatch is now path-aware via a function-pointer indirection
(`ProtocolHandler` type alias). Today only /v1/chat/completions is
routed; ADR-0030 PR-4 adds /v1/responses, PR-5 adds /v1/models. Unknown
paths return 404 with `urn:barbacane:error:not-found` (one new test
covers this).
All 43 existing unit tests pass unchanged via super::module::* paths;
test count goes 43 → 44 with the new path-dispatch test. WASM build
clean. Workspace clippy + fmt clean.
Pre-push checklist:
- cargo build --workspace clean
- cargo build --target wasm32-unknown-unknown -r clean
- cargo test --workspace --exclude barbacane-test all green
- cargo fmt --all -- --check clean
- cargo clippy --lib --bins clean
- cargo deny check advisories FAILS on
RUSTSEC-2026-0114 (wasmtime), pre-existing on main, separate PR.
…(ADR-0030 §0)
BREAKING CHANGE for ADR-0024 deployments. The model identifier is no
longer a gateway-side config knob — the client's `model` field on the
request body is passed to the upstream provider verbatim. The gateway
declares providers (where to go, with what credentials), never an
authoritative model list.
Removed:
- `model: String` from `TargetConfig` (was required)
- `model: Option<String>` from the flat `AiProxy` top-level config
- The `target.model` fallback in `translate_to_anthropic`
- The `cfg.model` assertion in `config_flat_minimal`
Added:
- `extract_client_model(body)` — parses the `model` field from an
OpenAI-format request body. Returns `None` for absent body, malformed
JSON, missing field, non-string value, or empty string.
- `model_required_response()` — `400 problem+json` with
`urn:barbacane:error:model_required` and `code: "model_required"`.
Returned by `dispatch()` when the client omits `model`. Matches both
upstream provider contracts (OpenAI Chat Completions and Responses
both require `model`) and ADR-0030 §0's caller-owned-model principle.
- `dispatch_chat_completion()` — path-specific helper that extracts the
client model upfront, short-circuits 400 if missing, otherwise calls
the shared orchestration loop with the client model plumbed through.
- `#[serde(deny_unknown_fields)]` on `AiProxy` and `TargetConfig` —
closes the runtime safety net, so leftover nested `model:` (which
vacuum's auto-generated validator does not recurse into yet) fails
at WASM instance load with a clear "unknown field model" error.
Plumbing:
- `ProtocolHandler` signature now takes `client_model: &str` as a 4th
argument before `streaming`. The orchestration loop extracts the
model once and passes it through to the handler.
- `propagate_context(target, client_model, resp)` now writes `ai.model`
from the client value, not `target.model`. Downstream middlewares
(`ai-cost-tracker`, `ai-token-limit`) read the same identifier the
client requested.
- `anthropic_call` and `translate_to_anthropic` take `client_model: &str`
explicitly; no fallback.
Migration:
- vacuum surfaces leftover `model:` at lint time on the flat config
via the auto-generated `additionalProperties: false` check; the
message names the field. Runtime `deny_unknown_fields` catches
leftover `model:` on `targets.<>` and `fallback[]`.
- All shipped fixtures updated: `tests/fixtures/ai-{proxy,gateway}.yaml`,
`crates/barbacane-test/tests/ai_{proxy,gateway}.rs`,
`docs/rulesets/tests/valid-complete.yaml`.
Tests: 50 unit (was 44; +6 covering model_required, extract_client_model
helper, legacy-model-rejection at top-level and nested target).
14/14 ruleset tests pass.
Considered and rejected: a dedicated `barbacane-ai-proxy-no-model`
vacuum rule. Migration-specific lint rules accumulate forever; the
auto-generated message + CHANGELOG entry is sufficient. The genuine
gap (vacuum doesn't recurse into nested objects) is addressed by a
separate generator improvement that benefits every plugin, not by a
bespoke rule for this one migration.
Adds glob-based dynamic model routing as a third resolution layer between ai.target context lookup and default_target/flat fallthrough. Each routes entry binds a glob pattern (e.g. claude-*, gpt-4o*, o[1-4]*) to a provider + credentials. First match wins. Catalog policy lives on the target via optional allow/deny glob lists. Critically, allow/deny applies on every resolution path that produces a target carrying those rules — including ai.target-driven dispatch — so a cel misconfig that sets ai.target to a target whose deny covers the requested model still gets 403. Catalog policy is a property of the target, not the resolution path. Resolution precedence (4-step ladder, ADR-0030 §3): 1. ai.target context key (set by upstream cel) 2. routes glob match against client model 3. default_target → targets[name] 4. flat provider config Failure modes: - 400 model_required (PR-2): client omitted model - 400 no_route (new): routes configured but no entry matched and no fallthrough — operator's catalog doesn't cover the requested model - 403 model_not_permitted (new): allow/deny rejected the model. Does NOT fall through to fallback or to another route — that would silently escalate a denied model to a different provider. Escape hatch: tighten the route's pattern so non-matching models miss the route entirely and reach the catch-all. - 500 misconfiguration: nothing configured, or a route's glob fails to compile (surfaced from ensure_compiled_routes at first dispatch). Implementation: - New Route struct (pattern + provider + credentials + allow/deny). - New CompiledRoute caching the precompiled GlobMatcher per route. - ensure_compiled_routes runs once per plugin instance, lazily on first dispatch — same pattern as cel's compiled CEL program. - New ResolveOutcome enum distinguishes Resolved / NoRouteMatch / NotConfigured so dispatch can map each to the right HTTP shape. - evaluate_catalog_policy compiles per-target allow/deny on the fly (lists are typically <5 entries; cheap). Fails closed on a glob compile error rather than silently bypassing the policy. - Glob library: `globset` 0.4 with case-sensitive, anchored matching. default-features = false to keep the WASM binary small. - Schema: TargetConfig gains optional allow/deny; new RouteEntry; new GlobPattern referenced from both. Pattern allowlist regex ^[A-Za-z0-9_*?\[\]\-:.+/]+$ pins glob characters at lint time so vacuum surfaces nonsense like regex syntax before runtime. - New resolution_total counter with `resolution=context|routes|default| flat` label and a debug log (ai-proxy: resolved provider=X via=Y) for the "why did my request go there?" debugging case. Tests: 67/67 (was 50; +17 covering routes first-match-wins, catch-all fallthrough, no_route when no fallthrough, default/flat fallthrough, ai.target overrides routes, invalid glob compile error, allow pass/ reject, deny pass/match, allow+deny ordering, end-to-end 400 no_route, end-to-end 403 model_not_permitted, no-fallthrough on deny, the ai.target+deny subtlety, and resolution_total label emission). 14/14 ruleset tests pass.
…e-by dispatch fix Adds integration tests, fixture coverage, and a vacuum migration fixture for the ADR-0030 implementation stack (PR-1 → PR-3). Surfaced and fixed one bug along the way. Drive-by fix ============ The path-based dispatch added in PR-1 (#73) was too strict: it returned 404 for any req.path != /v1/chat/completions, breaking every existing test fixture and any operator-defined operation path. The dispatcher shouldn't constrain the operator's choice of path when only one protocol is on offer. Now defaults to chat_completion::handle for any path; PR-4 will narrowly add /v1/responses when there's a real second protocol to differentiate. Removes the dead 404 arm from error_response(). The unit test from PR-1 that asserted the rejected behavior is replaced with one that asserts the dispatcher accepts custom paths today. Coverage added ============== Integration tests (crates/barbacane-test/tests/ai_proxy.rs, +5 tests): - test_ai_proxy_routes_first_match_wins — wiremock with three route prefixes proves claude-* / gpt-* / catch-all dispatch to the right upstream URL via the actual data plane pipeline. - test_ai_proxy_400_when_body_omits_model — proves the model_required short-circuit fires end-to-end. - test_ai_proxy_400_no_route_when_model_does_not_match — proves the no_route response shape ships through the data plane. - test_ai_proxy_403_model_not_permitted_does_not_reach_upstream — wiremock with .expect(0) proves the upstream is never called when catalog policy denies; would catch a regression that leaks through to the provider. - test_ai_proxy_403_does_not_fall_through_to_next_route — proves the no-fallthrough rule from ADR-0030 §3 holds in the real pipeline. End-to-end tests (crates/barbacane-test/tests/ai_gateway.rs, +2 tests): - cel_driven_target_deny_fires_403_not_silent_pass — the load-bearing ADR-0030 §3 subtlety in the actual pipeline. cel writes ai.target=anthropic-tier based on a header; the named target carries deny:["claude-opus-*"]; a request with claude-opus-4-6 hits 403, not silent pass. Mock has .expect(0) so the test proves the upstream is never called. - cel_driven_target_deny_passes_when_model_does_not_match_deny — positive control for the same spec; proves a non-denied model still reaches upstream. Compilation smoke (tests/fixtures/ai-proxy.yaml, +2 operations): - /ai/routed/chat/completions with full routes table including allow, deny, and a catch-all. - /ai/restricted/chat/completions with default_target + catalog deny on a named target. Vacuum migration UX (docs/rulesets/tests/invalid-ai-proxy-leftover- model.yaml + run-tests.sh): - Regression fixture proving the auto-generated dispatch validator surfaces "Unknown config field 'model' for dispatcher 'ai-proxy'" with the full list of allowed fields when an operator forgets to delete `model:` after upgrading from ADR-0024. The lint message is the migration UX promised in PR-2's CHANGELOG. Known gap (out of scope for this PR): nested-glob lint coverage for routes[].pattern and allow/deny entries. The auto-generated validator doesn't recurse into nested objects, so vacuum can't catch a regex- syntax pattern at lint time today. Runtime catches it via globset compile error from ensure_compiled_routes (covered by unit tests). This is the generator-recursion improvement we discussed earlier — a separate PR that benefits every plugin's nested schema, not a migration-specific lint. Test counts: plugin 793 → 801 (+8 — PR-1 unit test relaxed +1 test, PR-3 +17 already in stack, this PR +8 from drive-by fix and new helpers). Integration 275 → 282 (+7). 15/15 ruleset tests pass.
8 tasks
7645fe0 to
2d375c7
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
End-to-end coverage for the ADR-0030 implementation stack (PR-1 → PR-3) + a drive-by fix to a bug surfaced by the new tests.
Stacked on PR #76 (routes table). Will rebase down the chain as each lower PR merges.
Drive-by fix
The path-based dispatch added in PR #73 was too strict: it returned 404 for any
req.path != /v1/chat/completions, breaking every existing test fixture and any operator-defined operation path. The dispatcher shouldn't constrain the operator's choice of path when only one protocol is on offer.Now defaults to
chat_completion::handlefor any path; PR-4 will narrowly add/v1/responseswhen there's a real second protocol to differentiate. The dead 404 arm oferror_response()is dropped, and the unit test from PR-1 that asserted the rejected behavior is replaced with one asserting the dispatcher accepts custom paths today.Coverage added
Integration (
crates/barbacane-test/tests/ai_proxy.rs, +5 tests)test_ai_proxy_routes_first_match_wins— wiremock with three route prefixes (/route-claude,/route-gpt,/route-catchall) proves the right upstream URL is hit for each model glob via the actual data plane pipeline.test_ai_proxy_400_when_body_omits_model— proves themodel_requiredshort-circuit fires end-to-end.test_ai_proxy_400_no_route_when_model_does_not_match— proves theno_routeresponse shape ships through the data plane.test_ai_proxy_403_model_not_permitted_does_not_reach_upstream— wiremock with.expect(0)proves the upstream is never called when catalog policy denies. Catches the regression where a denied model leaks through.test_ai_proxy_403_does_not_fall_through_to_next_route— proves the no-fallthrough rule from ADR-0030 §3 holds in the real pipeline (denied claude-opus-* does NOT reach the catch-all ollama route).End-to-end (
crates/barbacane-test/tests/ai_gateway.rs, +2 tests)cel_driven_target_deny_fires_403_not_silent_pass— the load-bearing ADR-0030 §3 subtlety in the actual pipeline.celwritesai.target=anthropic-tierbased on a header; the named target carriesdeny: ["claude-opus-*"]; a request withmodel: claude-opus-4-6hits 403, not silent pass. Mock has.expect(0)so the test proves the upstream is never called even though the target was selected via context.cel_driven_target_deny_passes_when_model_does_not_match_deny— positive control for the same spec; non-denied model reaches upstream.Compilation smoke (
tests/fixtures/ai-proxy.yaml, +2 operations)/ai/routed/chat/completions— full routes table withallow,deny, and a catch-all./ai/restricted/chat/completions—default_target+ catalogdenyon a named target.Vacuum migration UX (
docs/rulesets/tests/invalid-ai-proxy-leftover-model.yaml+run-tests.sh)Regression fixture proving the auto-generated dispatch validator surfaces:
— with the full list of allowed fields including
routes— when an operator forgets to deletemodel:after upgrading from ADR-0024. This is the migration UX promised in PR #75's CHANGELOG.Known gap (out of scope)
Nested-glob lint coverage for
routes[].patternandallow/denyentries. The auto-generated validator doesn't recurse into nested objects, so vacuum can't catch a regex-syntax pattern at lint time today. Runtime catches it viaglobsetcompile error fromensure_compiled_routes(covered by unit tests). This is the generator-recursion improvement we discussed earlier — deserves a separate PR that benefits every plugin's nested schema, not a migration-specific lint.Test plan
cargo test -p barbacane-test --test ai_proxy— 9/9 pass (was 4 before this PR; the 5 new integration tests + the 4 pre-existing ones that the PR-1 path-match was breaking)cargo test -p barbacane-test --test ai_gateway— 5/5 pass (3 pre-existing + 2 new cel+deny end-to-end)cargo test -p barbacane-test --test compilation test_fixture_compiles_ai_proxy— passes with the extended fixturecargo test --manifest-path plugins/ai-proxy/Cargo.toml— 67 unit tests pass (1 PR-1 unit test relaxed; counts unchanged)bash docs/rulesets/tests/run-tests.sh— 15/15 pass (was 14; +1 for the leftover-model fixture)cargo build --target wasm32-unknown-unknown --release— cleancargo fmt --all -- --check— cleancargo clippy --all-targets— clean on production codeStacking
Branched off PR #76 → which is branched off PR #75 → which is branched off PR #73. Will rebase down the stack as each lower PR merges.