
test(ai-proxy): end-to-end coverage for ADR-0030 + drive-by dispatch fix#77

Closed
ndreno wants to merge 4 commits into feat/ai-proxy-routes-table from test/ai-proxy-routes-coverage

Conversation


@ndreno ndreno commented May 4, 2026

Summary

End-to-end coverage for the ADR-0030 implementation stack (PR-1 → PR-3) + a drive-by fix to a bug surfaced by the new tests.

Stacked on PR #76 (routes table). Will rebase down the chain as each lower PR merges.

Drive-by fix

The path-based dispatch added in PR #73 was too strict: it returned 404 for any req.path != /v1/chat/completions, breaking every existing test fixture and any operator-defined operation path. The dispatcher shouldn't constrain the operator's choice of path when only one protocol is on offer.

Now defaults to chat_completion::handle for any path; PR-4 will narrowly add /v1/responses when there's a real second protocol to differentiate. The dead 404 arm of error_response() is dropped, and the unit test from PR-1 that asserted the rejected behavior is replaced with one asserting the dispatcher accepts custom paths today.
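A minimal sketch of the relaxed dispatch (the `ProtocolHandler` name is from the PR; the handler body and the `select_handler` helper are hypothetical stand-ins):

```rust
// Hypothetical sketch: with only one protocol on offer, every path maps to
// the Chat Completions handler instead of 404ing on unknown paths.
type ProtocolHandler = fn(&str) -> String;

fn chat_completion_handle(path: &str) -> String {
    format!("chat_completion::handle({path})")
}

fn select_handler(_path: &str) -> ProtocolHandler {
    // PR-4 will add a narrow arm for "/v1/responses" once a second protocol
    // exists; until then any operator-defined path is accepted.
    chat_completion_handle
}
```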

Coverage added

Integration (crates/barbacane-test/tests/ai_proxy.rs, +5 tests)

  • test_ai_proxy_routes_first_match_wins — wiremock with three route prefixes (/route-claude, /route-gpt, /route-catchall) proves the right upstream URL is hit for each model glob via the actual data plane pipeline.
  • test_ai_proxy_400_when_body_omits_model — proves the model_required short-circuit fires end-to-end.
  • test_ai_proxy_400_no_route_when_model_does_not_match — proves the no_route response shape ships through the data plane.
  • test_ai_proxy_403_model_not_permitted_does_not_reach_upstream — wiremock with .expect(0) proves the upstream is never called when catalog policy denies. Catches the regression where a denied model leaks through.
  • test_ai_proxy_403_does_not_fall_through_to_next_route — proves the no-fallthrough rule from ADR-0030 §3 holds in the real pipeline (denied claude-opus-* does NOT reach the catch-all ollama route).

End-to-end (crates/barbacane-test/tests/ai_gateway.rs, +2 tests)

  • cel_driven_target_deny_fires_403_not_silent_pass — the load-bearing ADR-0030 §3 subtlety in the actual pipeline. cel writes ai.target=anthropic-tier based on a header; the named target carries deny: ["claude-opus-*"]; a request with model: claude-opus-4-6 hits 403, not silent pass. Mock has .expect(0) so the test proves the upstream is never called even though the target was selected via context.
  • cel_driven_target_deny_passes_when_model_does_not_match_deny — positive control for the same spec; non-denied model reaches upstream.

Compilation smoke (tests/fixtures/ai-proxy.yaml, +2 operations)

  • /ai/routed/chat/completions — full routes table with allow, deny, and a catch-all.
  • /ai/restricted/chat/completions — default_target + catalog deny on a named target.

Vacuum migration UX (docs/rulesets/tests/invalid-ai-proxy-leftover-model.yaml + run-tests.sh)

Regression fixture proving the auto-generated dispatch validator surfaces:

Unknown config field "model" for dispatcher "ai-proxy". Allowed fields: provider, api_key, base_url, timeout, max_tokens, fallback, routes, targets, default_target.

— with the full list of allowed fields including routes — when an operator forgets to delete model: after upgrading from ADR-0024. This is the migration UX promised in PR #75's CHANGELOG.

Known gap (out of scope)

Nested-glob lint coverage for routes[].pattern and allow/deny entries. The auto-generated validator doesn't recurse into nested objects, so vacuum can't catch a regex-syntax pattern at lint time today. Runtime catches it via globset compile error from ensure_compiled_routes (covered by unit tests). This is the generator-recursion improvement we discussed earlier — deserves a separate PR that benefits every plugin's nested schema, not a migration-specific lint.

Test plan

  • cargo test -p barbacane-test --test ai_proxy — 9/9 pass (was 4 before this PR; the 5 new integration tests + the 4 pre-existing ones that the PR-1 path-match was breaking)
  • cargo test -p barbacane-test --test ai_gateway — 5/5 pass (3 pre-existing + 2 new cel+deny end-to-end)
  • cargo test -p barbacane-test --test compilation test_fixture_compiles_ai_proxy — passes with the extended fixture
  • cargo test --manifest-path plugins/ai-proxy/Cargo.toml — 67 unit tests pass (1 PR-1 unit test relaxed; counts unchanged)
  • bash docs/rulesets/tests/run-tests.sh — 15/15 pass (was 14; +1 for the leftover-model fixture)
  • cargo build --target wasm32-unknown-unknown --release — clean
  • cargo fmt --all -- --check — clean
  • cargo clippy --all-targets — clean on production code
  • CI green (modulo the pre-existing wasmtime advisory addressed in PR #74, fix(deps): resolve RUSTSEC-2026-0114 (wasmtime patch bump))

Stacking

Branched off PR #76 → which is branched off PR #75 → which is branched off PR #73. Will rebase down the stack as each lower PR merges.

ndreno added 4 commits May 4, 2026 11:36
…nge)

ADR-0030 §1 calls for the dispatcher to become protocol-aware, with
path-based dispatch picking translation between Chat Completions and
the Responses API on the same target pool. This PR is the mechanical
move that preps the source layout — zero behavior change for the only
existing path (/v1/chat/completions).

Source layout:
- lib.rs               orchestration (target resolution, fallback chain,
                        metrics, context propagation, shared helpers,
                        host stubs); now declares mod protocols/providers.
- protocols/
    chat_completion.rs OpenAI Chat Completions adapter — handle(),
                        translate_to_anthropic, translate_from_anthropic,
                        AnthropicRequest.
- providers/
    openai.rs          OpenAI-compatible transport (openai_call,
                        openai_stream, maybe_inject_max_tokens, openai_url,
                        openai_headers).
    anthropic.rs       Anthropic Messages transport + ANTHROPIC_API_VERSION
                        constant (was inlined at lib.rs:359).
    ollama.rs          Empty slot — Ollama shares OpenAI transport today.
                        ADR-0030 §2 will use this file to reject Responses
                        requests against an Ollama target.

Dispatch is now path-aware via a function-pointer indirection
(`ProtocolHandler` type alias). Today only /v1/chat/completions is
routed; ADR-0030 PR-4 adds /v1/responses, PR-5 adds /v1/models. Unknown
paths return 404 with `urn:barbacane:error:not-found` (one new test
covers this).

All 43 existing unit tests pass unchanged via super::module::* paths;
test count goes 43 → 44 with the new path-dispatch test. WASM build
clean. Workspace clippy + fmt clean.

Pre-push checklist:
- cargo build --workspace                             clean
- cargo build --target wasm32-unknown-unknown -r      clean
- cargo test --workspace --exclude barbacane-test     all green
- cargo fmt --all -- --check                          clean
- cargo clippy --lib --bins                           clean
- cargo deny check advisories                         FAILS on
  RUSTSEC-2026-0114 (wasmtime), pre-existing on main, separate PR.
…(ADR-0030 §0)

BREAKING CHANGE for ADR-0024 deployments. The model identifier is no
longer a gateway-side config knob — the client's `model` field on the
request body is passed to the upstream provider verbatim. The gateway
declares providers (where to go, with what credentials), never an
authoritative model list.

Removed:
- `model: String` from `TargetConfig` (was required)
- `model: Option<String>` from the flat `AiProxy` top-level config
- The `target.model` fallback in `translate_to_anthropic`
- The `cfg.model` assertion in `config_flat_minimal`

Added:
- `extract_client_model(body)` — parses the `model` field from an
  OpenAI-format request body. Returns `None` for absent body, malformed
  JSON, missing field, non-string value, or empty string.
- `model_required_response()` — `400 problem+json` with
  `urn:barbacane:error:model_required` and `code: "model_required"`.
  Returned by `dispatch()` when the client omits `model`. Matches both
  upstream provider contracts (OpenAI Chat Completions and Responses
  both require `model`) and ADR-0030 §0's caller-owned-model principle.
- `dispatch_chat_completion()` — path-specific helper that extracts the
  client model upfront, short-circuits 400 if missing, otherwise calls
  the shared orchestration loop with the client model plumbed through.
- `#[serde(deny_unknown_fields)]` on `AiProxy` and `TargetConfig` —
  closes the runtime safety net, so leftover nested `model:` (which
  vacuum's auto-generated validator does not recurse into yet) fails
  at WASM instance load with a clear "unknown field model" error.
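The two new helpers' contracts can be sketched std-only (the real extract_client_model parses the JSON request body; here the already-extracted field value stands in, and the response tuple is a simplification of the real problem+json type):

```rust
// Sketch of the extract_client_model contract from the commit message.
fn extract_client_model(field: Option<&str>) -> Option<String> {
    match field {
        Some(m) if !m.is_empty() => Some(m.to_string()),
        // Absent body, malformed JSON, missing field, non-string, or empty string.
        _ => None,
    }
}

// 400 problem+json shape with the URN and code named above.
fn model_required_response() -> (u16, String) {
    (
        400,
        String::from(r#"{"type":"urn:barbacane:error:model_required","code":"model_required"}"#),
    )
}
```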

Plumbing:
- `ProtocolHandler` signature now takes `client_model: &str` as a 4th
  argument before `streaming`. The orchestration loop extracts the
  model once and passes it through to the handler.
- `propagate_context(target, client_model, resp)` now writes `ai.model`
  from the client value, not `target.model`. Downstream middlewares
  (`ai-cost-tracker`, `ai-token-limit`) read the same identifier the
  client requested.
- `anthropic_call` and `translate_to_anthropic` take `client_model: &str`
  explicitly; no fallback.
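The context-propagation change can be sketched as follows (the `ai.model` key name is from the commit message; the map-based context is a stand-in for the real host context API):

```rust
use std::collections::HashMap;

// Sketch: ai.model now carries the client-requested model verbatim, never a
// gateway-configured target.model. Downstream middlewares (ai-cost-tracker,
// ai-token-limit) read the same identifier the client sent.
fn propagate_context(ctx: &mut HashMap<String, String>, client_model: &str) {
    ctx.insert("ai.model".to_string(), client_model.to_string());
}
```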

Migration:
- vacuum surfaces leftover `model:` at lint time on the flat config
  via the auto-generated `additionalProperties: false` check; the
  message names the field. Runtime `deny_unknown_fields` catches
  leftover `model:` on `targets.<>` and `fallback[]`.
- All shipped fixtures updated: `tests/fixtures/ai-{proxy,gateway}.yaml`,
  `crates/barbacane-test/tests/ai_{proxy,gateway}.rs`,
  `docs/rulesets/tests/valid-complete.yaml`.

Tests: 50 unit (was 44; +6 covering model_required, extract_client_model
helper, legacy-model-rejection at top-level and nested target).
14/14 ruleset tests pass.

Considered and rejected: a dedicated `barbacane-ai-proxy-no-model`
vacuum rule. Migration-specific lint rules accumulate forever; the
auto-generated message + CHANGELOG entry is sufficient. The genuine
gap (vacuum doesn't recurse into nested objects) is addressed by a
separate generator improvement that benefits every plugin, not by a
bespoke rule for this one migration.
Adds glob-based dynamic model routing as a third resolution layer between
ai.target context lookup and default_target/flat fallthrough. Each routes
entry binds a glob pattern (e.g. claude-*, gpt-4o*, o[1-4]*) to a provider
+ credentials. First match wins.

Catalog policy lives on the target via optional allow/deny glob lists.
Critically, allow/deny applies on every resolution path that produces a
target carrying those rules — including ai.target-driven dispatch — so a
cel misconfig that sets ai.target to a target whose deny covers the
requested model still gets 403. Catalog policy is a property of the
target, not the resolution path.
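A simplified sketch of target-level allow/deny evaluation (prefix matching stands in for the real globset matchers; the deny-before-allow ordering is an assumption inferred from the test names):

```rust
// Catalog policy is a property of the target: this check runs regardless of
// how the target was resolved (ai.target, routes, default, or flat).
fn model_permitted(model: &str, allow: &[&str], deny: &[&str]) -> bool {
    let hits = |pat: &str| model.starts_with(pat.trim_end_matches('*'));
    if deny.iter().any(|p| hits(p)) {
        return false; // denied — never falls through to another route or fallback
    }
    // An empty allow list means "no allowlist restriction".
    allow.is_empty() || allow.iter().any(|p| hits(p))
}
```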

Resolution precedence (4-step ladder, ADR-0030 §3):
1. ai.target context key (set by upstream cel)
2. routes glob match against client model
3. default_target → targets[name]
4. flat provider config
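The ladder can be sketched std-only (a prefix check stands in for globset matching, the Resolved payload is elided, and the routes-miss fallthrough to default/flat is inferred from the test list below):

```rust
#[derive(Debug, PartialEq)]
enum ResolveOutcome {
    Resolved(&'static str), // which rung matched, for the resolution_total label
    NoRouteMatch,           // → 400 no_route
    NotConfigured,          // → 500 misconfiguration
}

struct Cfg {
    routes: Vec<&'static str>, // route glob patterns (simplified)
    default_target: Option<&'static str>,
    flat_provider: Option<&'static str>,
}

fn resolve(ai_target: Option<&str>, model: &str, cfg: &Cfg) -> ResolveOutcome {
    if ai_target.is_some() {
        return ResolveOutcome::Resolved("context"); // 1. ai.target from upstream cel
    }
    if cfg.routes.iter().any(|p| model.starts_with(p.trim_end_matches('*'))) {
        return ResolveOutcome::Resolved("routes"); // 2. first glob match wins
    }
    if cfg.default_target.is_some() {
        return ResolveOutcome::Resolved("default"); // 3. default_target → targets[name]
    }
    if cfg.flat_provider.is_some() {
        return ResolveOutcome::Resolved("flat"); // 4. flat provider config
    }
    if cfg.routes.is_empty() {
        ResolveOutcome::NotConfigured
    } else {
        ResolveOutcome::NoRouteMatch // routes configured, nothing matched, no fallthrough
    }
}
```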

Failure modes:
- 400 model_required (PR-2): client omitted model
- 400 no_route (new): routes configured but no entry matched and no
  fallthrough — operator's catalog doesn't cover the requested model
- 403 model_not_permitted (new): allow/deny rejected the model. Does
  NOT fall through to fallback or to another route — that would
  silently escalate a denied model to a different provider. Escape
  hatch: tighten the route's pattern so non-matching models miss the
  route entirely and reach the catch-all.
- 500 misconfiguration: nothing configured, or a route's glob fails
  to compile (surfaced from ensure_compiled_routes at first dispatch).
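The four failure modes above map to HTTP shapes roughly as follows (the status/code pairs are from this commit message; the "misconfiguration" code string and the enum itself are assumed names):

```rust
enum DispatchError {
    ModelRequired,
    NoRoute,
    ModelNotPermitted,
    Misconfiguration,
}

// URNs follow the urn:barbacane:error:<code> convention seen elsewhere.
fn status_and_code(e: &DispatchError) -> (u16, &'static str) {
    match e {
        DispatchError::ModelRequired => (400, "model_required"),
        DispatchError::NoRoute => (400, "no_route"),
        // Never falls through to fallback or another route.
        DispatchError::ModelNotPermitted => (403, "model_not_permitted"),
        DispatchError::Misconfiguration => (500, "misconfiguration"),
    }
}
```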

Implementation:
- New Route struct (pattern + provider + credentials + allow/deny).
- New CompiledRoute caching the precompiled GlobMatcher per route.
- ensure_compiled_routes runs once per plugin instance, lazily on
  first dispatch — same pattern as cel's compiled CEL program.
- New ResolveOutcome enum distinguishes Resolved / NoRouteMatch /
  NotConfigured so dispatch can map each to the right HTTP shape.
- evaluate_catalog_policy compiles per-target allow/deny on the fly
  (lists are typically <5 entries; cheap). Fails closed on a glob
  compile error rather than silently bypassing the policy.
- Glob library: `globset` 0.4 with case-sensitive, anchored matching.
  default-features = false to keep the WASM binary small.
- Schema: TargetConfig gains optional allow/deny; new RouteEntry; new
  GlobPattern referenced from both. Pattern allowlist regex
  ^[A-Za-z0-9_*?\[\]\-:.+/]+$ pins glob characters at lint time so
  vacuum surfaces nonsense like regex syntax before runtime.
- New resolution_total counter with `resolution=context|routes|default|
  flat` label and a debug log (ai-proxy: resolved provider=X via=Y) for
  the "why did my request go there?" debugging case.

Tests: 67/67 (was 50; +17 covering routes first-match-wins, catch-all
fallthrough, no_route when no fallthrough, default/flat fallthrough,
ai.target overrides routes, invalid glob compile error, allow pass/
reject, deny pass/match, allow+deny ordering, end-to-end 400 no_route,
end-to-end 403 model_not_permitted, no-fallthrough on deny, the
ai.target+deny subtlety, and resolution_total label emission). 14/14
ruleset tests pass.
…e-by dispatch fix

Adds integration tests, fixture coverage, and a vacuum migration fixture
for the ADR-0030 implementation stack (PR-1 → PR-3). Surfaced and fixed
one bug along the way.

Drive-by fix
============

The path-based dispatch added in PR-1 (#73) was too strict: it returned
404 for any req.path != /v1/chat/completions, breaking every existing
test fixture and any operator-defined operation path. The dispatcher
shouldn't constrain the operator's choice of path when only one protocol
is on offer. Now defaults to chat_completion::handle for any path; PR-4
will narrowly add /v1/responses when there's a real second protocol to
differentiate. Removes the dead 404 arm from error_response().

The unit test from PR-1 that asserted the rejected behavior is replaced
with one that asserts the dispatcher accepts custom paths today.

Coverage added
==============

Integration tests (crates/barbacane-test/tests/ai_proxy.rs, +5 tests):
- test_ai_proxy_routes_first_match_wins — wiremock with three route
  prefixes proves claude-* / gpt-* / catch-all dispatch to the right
  upstream URL via the actual data plane pipeline.
- test_ai_proxy_400_when_body_omits_model — proves the model_required
  short-circuit fires end-to-end.
- test_ai_proxy_400_no_route_when_model_does_not_match — proves the
  no_route response shape ships through the data plane.
- test_ai_proxy_403_model_not_permitted_does_not_reach_upstream —
  wiremock with .expect(0) proves the upstream is never called when
  catalog policy denies; would catch a regression that leaks through
  to the provider.
- test_ai_proxy_403_does_not_fall_through_to_next_route — proves the
  no-fallthrough rule from ADR-0030 §3 holds in the real pipeline.

End-to-end tests (crates/barbacane-test/tests/ai_gateway.rs, +2 tests):
- cel_driven_target_deny_fires_403_not_silent_pass — the load-bearing
  ADR-0030 §3 subtlety in the actual pipeline. cel writes
  ai.target=anthropic-tier based on a header; the named target carries
  deny:["claude-opus-*"]; a request with claude-opus-4-6 hits 403,
  not silent pass. Mock has .expect(0) so the test proves the upstream
  is never called.
- cel_driven_target_deny_passes_when_model_does_not_match_deny —
  positive control for the same spec; proves a non-denied model still
  reaches upstream.

Compilation smoke (tests/fixtures/ai-proxy.yaml, +2 operations):
- /ai/routed/chat/completions with full routes table including allow,
  deny, and a catch-all.
- /ai/restricted/chat/completions with default_target + catalog deny
  on a named target.

Vacuum migration UX (docs/rulesets/tests/invalid-ai-proxy-leftover-model.yaml
+ run-tests.sh):
- Regression fixture proving the auto-generated dispatch validator
  surfaces "Unknown config field 'model' for dispatcher 'ai-proxy'"
  with the full list of allowed fields when an operator forgets to
  delete `model:` after upgrading from ADR-0024. The lint message is
  the migration UX promised in PR-2's CHANGELOG.

Known gap (out of scope for this PR): nested-glob lint coverage for
routes[].pattern and allow/deny entries. The auto-generated validator
doesn't recurse into nested objects, so vacuum can't catch a regex-
syntax pattern at lint time today. Runtime catches it via globset
compile error from ensure_compiled_routes (covered by unit tests).
This is the generator-recursion improvement we discussed earlier — a
separate PR that benefits every plugin's nested schema, not a
migration-specific lint.

Test counts: plugin 793 → 801 (+8 — PR-1 unit test relaxed +1 test,
PR-3 +17 already in stack, this PR +8 from drive-by fix and new
helpers). Integration 275 → 282 (+7). 15/15 ruleset tests pass.
@ndreno ndreno force-pushed the feat/ai-proxy-routes-table branch from 7645fe0 to 2d375c7 on May 5, 2026 06:40
@ndreno ndreno deleted the branch feat/ai-proxy-routes-table May 5, 2026 06:41
@ndreno ndreno closed this May 5, 2026