feat(filter): add /v1/responses model rewrite filter for Codex passthrough by nerdalert · Pull Request #594 · praxis-proxy/praxis

nerdalert · 2026-06-12T14:28:29Z

Summary

Adds openai_responses_model_rewrite, a Responses API request filter that translates client-facing model names into backend deployment names.

Capabilities

Rewrites configured model aliases.
Injects a default when model is missing or null.
Publishes original and effective model headers and metadata for downstream routing.
Preserves every other request field, including tools, instructions, input, function outputs, streaming flags, and unknown fields.
Skips non-Responses traffic without modifying it.
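
As a rough illustration of the last capability, the check below sketches how a filter might decide whether a request is a Responses create call. The name is_responses_create appears in the review of this PR, but the signature, the query-string handling, and the exact-match rule here are assumptions, not the filter's actual code.

```rust
/// Hypothetical sketch: returns true only for `POST /v1/responses`.
///
/// Assumptions (not taken from the PR): the method and path arrive as plain
/// strings, a query string is ignored, and sub-resources such as
/// `/v1/responses/{id}` are not treated as create calls.
pub fn is_responses_create(method: &str, path: &str) -> bool {
    // Drop anything after the first '?' so a query string does not affect matching.
    let path_only = path.split('?').next().unwrap_or(path);
    // HTTP method names are case-sensitive, so compare exactly.
    method == "POST" && path_only == "/v1/responses"
}
```

Under these assumptions, a POST to /v1/chat/completions or a GET to /v1/responses returns false, so the request would be forwarded unmodified.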

Workflow

An operator deploys Praxis between their AI clients and inference backends, then configures alias rules such as:

codex-mini-latest -> llama-3.3-70b

When a user or tool such as Codex sends a POST /v1/responses request asking for codex-mini-latest, Praxis silently replaces the model name before forwarding the request to the backend.

The client does not need to know the backend's deployment name.

If a request arrives without a model, the operator can configure a default model that Praxis injects automatically.
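
The alias and default rules described above can be sketched as a small lookup. This is an illustration only: resolve_model and ResolvedModel are hypothetical names, and two behaviors are assumptions rather than facts from this PR: names without an alias are forwarded unchanged, and the default is used verbatim without a second alias lookup.

```rust
use std::collections::HashMap;

/// Outcome of resolving the `model` field of a Responses request.
#[derive(Debug, PartialEq, Eq)]
pub struct ResolvedModel {
    /// Model name the client sent, if any.
    pub original: Option<String>,
    /// Model name that would be forwarded to the backend.
    pub effective: String,
}

/// Hypothetical helper, not the PR's API.
///
/// `requested` is `None` when the field is missing or JSON null.
/// Returns `None` when no model was sent and no default is configured.
pub fn resolve_model(
    requested: Option<&str>,
    aliases: &HashMap<String, String>,
    default_model: Option<&str>,
) -> Option<ResolvedModel> {
    match requested {
        Some(name) => {
            // Assumption: a name without an alias passes through unchanged.
            let effective = aliases.get(name).map(String::as_str).unwrap_or(name);
            Some(ResolvedModel {
                original: Some(name.to_owned()),
                effective: effective.to_owned(),
            })
        }
        // Assumption: the default is injected as-is, with no alias lookup.
        None => default_model.map(|d| ResolvedModel {
            original: None,
            effective: d.to_owned(),
        }),
    }
}
```

With the alias codex-mini-latest -> llama-3.3-70b, a request for codex-mini-latest resolves to llama-3.3-70b while the original name stays available for the original-model header.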

Praxis also publishes routing headers such as x-praxis-ai-effective-model. Operators can use these headers to route rewritten models to different backend clusters:

Effective Model	Example Destination
llama-3.3-70b	Llama GPU pool
qwen-2.5-72b	Qwen GPU pool
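
A minimal sketch of the routing decision in the table above, assuming the headers are available as name/value pairs. The function name route_by_effective_model and the slice-of-tuples header representation are hypothetical; in a real deployment this choice would live in the proxy's routing configuration rather than in hand-written code.

```rust
/// Header name taken from the example configuration in this PR.
const EFFECTIVE_MODEL_HEADER: &str = "x-praxis-ai-effective-model";

/// Hypothetical sketch: map the effective-model header to a destination label.
/// Returns `None` when the header is absent or names an unknown model.
pub fn route_by_effective_model(headers: &[(&str, &str)]) -> Option<&'static str> {
    // HTTP header names are case-insensitive, so compare without regard to case.
    let model = headers
        .iter()
        .find(|(name, _)| name.eq_ignore_ascii_case(EFFECTIVE_MODEL_HEADER))
        .map(|(_, value)| value.trim())?;

    match model {
        "llama-3.3-70b" => Some("Llama GPU pool"),
        "qwen-2.5-72b" => Some("Qwen GPU pool"),
        _ => None,
    }
}
```

Returning None for unknown models leaves the fallback policy (reject, or use a default pool) to the caller, which is a design choice of this sketch and not something the PR specifies.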

All other request fields pass through untouched. Praxis does not execute client tools or take ownership of the Codex tool loop.

Example Configuration

  - filter: openai_responses_model_rewrite
    default_model: "llama-3.3-70b"
    model_aliases:
      codex-mini-latest: "llama-3.3-70b"
      gpt-4.1-mini: "qwen-2.5-72b"
    headers:
      effective_model: x-praxis-ai-effective-model
      original_model: x-praxis-ai-original-model

Demo And Results

The runnable passthrough demo, captured results, benchmark data, and documented claim boundaries are available here:

Native /v1/responses Passthrough Demo and Results (https://github.com/nerdalert/praxis-research-spikes/tree/main/demo/v1-responses-passthrough)

Demonstrated Behavior

Native POST /v1/responses passthrough.
Model alias rewriting and default injection.
Effective-model-based backend routing.
Streaming SSE preservation.
Codex-shaped tools and function_call_output preservation.
Mixed Responses and Chat Completions routing.
Request-path benchmark profiles.

praxis-bot

Review: openai_responses_model_rewrite filter

Core logic is clean. Config validation is thorough. 40+ unit tests and 11 integration tests. Follows established patterns. One compile-breaking bug and a few test gaps.

Findings: 1 Critical, 2 Medium, 3 Small, 4 Nit. See inline comments.

Non-inline: (1) is_responses_create does its own path matching instead of reading classifier metadata, which contradicts the docs. (2) header_guard is started early in the content_length test. (3) The integration test file has a doc comment.

praxis-bot

(superseded by later review)

praxis-bot

PR Review: feat(filter): add /v1/responses model rewrite filter for Codex passthrough

Summary: Adds openai_responses_model_rewrite, a Responses API request filter that translates client-facing model names into backend deployment names via alias mapping, with default model injection for missing/null values.

Overall: Well-structured filter implementation that follows the project's established patterns closely. The code is clean, the test coverage is thorough with both unit and integration tests, and the integration tests exercise real end-to-end behavior, including routing by the effective-model header. A few items need attention -- primarily a missing #[cfg(feature)] gate on a re-export, a documentation accuracy issue, and a note on JSON round-trip key reordering.
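
To illustrate the key-reordering note: when a JSON object is parsed into a sorted map and serialized again, keys come back in sorted order rather than request order. serde_json's object map is backed by a BTreeMap unless its preserve_order feature is enabled; whether this filter parses bodies that way is an assumption here. The std-only sketch below models just the ordering effect, and keys_after_sorted_round_trip is a hypothetical name.

```rust
use std::collections::BTreeMap;

/// Hypothetical sketch: returns the order in which a sorted-map round trip
/// would re-emit the given top-level JSON keys.
pub fn keys_after_sorted_round_trip(keys_in_request_order: &[&str]) -> Vec<String> {
    // Collecting into a BTreeMap discards insertion order and sorts by key.
    let map: BTreeMap<&str, ()> = keys_in_request_order
        .iter()
        .map(|key| (*key, ()))
        .collect();
    map.keys().map(|key| key.to_string()).collect()
}
```

A body sent with keys in the order model, input, tools, stream would come back as input, model, stream, tools. The JSON is semantically unchanged, but a byte-for-byte comparison with the original request would fail.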

Severity	Count
Medium	3
Small	4
Nit	3

Findings without inline placement

[Small] tests/integration/tests/suite/main.rs: The new mod openai_responses_model_rewrite; declaration should be gated with #[cfg(feature = "ai-inference")] to match the pattern used for other AI-inference test modules. Without the gate, the module will fail to compile when the feature is disabled.
[Nit] filter/src/builtins/http/ai/openai/responses/model_rewrite/mod.rs: The on_request_body method checks body.is_none() before calling rewrite_body, but rewrite_body immediately re-checks with let Some(raw) = body.as_ref(). The outer guard makes the inner one unreachable -- not a bug, but a minor redundancy.

See inline comments for all other findings.

leseb

I'm requesting changes because I'd like to discuss some key points of the filter's goals; otherwise, the goal makes sense to me!

praxis-bot-app · 2026-06-19T03:41:25Z

PR too large: 1763 lines added (limit: 750, excludes Cargo files, tests, docs, examples, and benchmarks). Please split into smaller PRs. Add skip/pr-conventions label to override.

…rough

This adds openai_responses_model_rewrite, a Responses API request-body filter that lets Praxis translate Codex-facing model names into the backend’s actual deployment names while preserving native /v1/responses traffic.

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>

nerdalert requested review from a team June 12, 2026 14:28

nerdalert requested review from franciscojavierarceo, leseb, shaneutt and twghu as code owners June 12, 2026 14:28

nerdalert added the skip/pr-hygiene label Jun 12, 2026

nerdalert force-pushed the brent-responses-passthrough branch from 42e80e2 to 823b264 Compare June 12, 2026 14:32

shaneutt self-assigned this Jun 12, 2026

shaneutt added this to AI Gateway Jun 12, 2026

github-project-automation Bot moved this to Backlog in AI Gateway Jun 12, 2026

shaneutt assigned franciscojavierarceo and leseb Jun 12, 2026

shaneutt moved this from Backlog to Review in AI Gateway Jun 12, 2026

shaneutt added this to the v0.4.0 milestone Jun 12, 2026

praxis-bot reviewed Jun 12, 2026

Comment thread filter/src/builtins/http/ai/mod.rs

praxis-bot reviewed Jun 12, 2026

Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/mod.rs

praxis-bot reviewed Jun 12, 2026

Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/mod.rs Outdated

praxis-bot reviewed Jun 12, 2026

Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/tests.rs Outdated

praxis-bot reviewed Jun 12, 2026

Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/tests.rs

praxis-bot reviewed Jun 12, 2026

Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/mod.rs

praxis-bot reviewed Jun 12, 2026

Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/mod.rs

praxis-bot reviewed Jun 12, 2026

Comment thread tests/integration/tests/suite/openai_responses_model_rewrite.rs Outdated

nerdalert force-pushed the brent-responses-passthrough branch 4 times, most recently from 5eab4df to b30ac7a Compare June 16, 2026 01:29