feat(filter): add /v1/responses model rewrite filter for Codex passthrough #594

Merged
shaneutt merged 3 commits into praxis-proxy:main from nerdalert:brent-responses-passthrough on Jun 23, 2026

Conversation

@nerdalert nerdalert commented Jun 12, 2026

Member

Summary

Adds openai_responses_model_rewrite, a Responses API request filter that translates client-facing model names into backend deployment names.

Capabilities

  • Rewrites configured model aliases.
  • Injects a default when model is missing or null.
  • Publishes original and effective model headers and metadata for downstream routing.
  • Preserves every other request field, including tools, instructions, input, function outputs, streaming flags, and unknown fields.
  • Skips non-Responses traffic without modifying it.
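
As a minimal sketch of the "skips non-Responses traffic" behavior, the check below matches only POST requests to /v1/responses. The function body, its signature, and the query-string handling are assumptions for illustration; the filter's actual `is_responses_create` is not shown on this page, and the review further down discusses whether it should read classifier metadata instead.

```rust
/// Hypothetical stand-in for the filter's request check.
/// Returns true only for `POST /v1/responses`; everything else is skipped.
fn is_responses_create(method: &str, path: &str) -> bool {
    // Drop any query string so `/v1/responses?x=1` still matches (assumption).
    let path_only = path.split('?').next().unwrap_or(path);
    method.eq_ignore_ascii_case("POST") && path_only == "/v1/responses"
}
```

Under these assumptions, a Chat Completions request or a GET on the same path returns false and would be forwarded unmodified.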

Workflow

An operator deploys Praxis between their AI clients and inference backends, then configures alias rules such as:

codex-mini-latest -> llama-3.3-70b

When a user or tool such as Codex sends a POST /v1/responses request asking for codex-mini-latest, Praxis silently replaces the model name before forwarding the request to the backend.

The client does not need to know the backend's deployment name.

If a request arrives without a model, the operator can configure a default model that Praxis injects automatically.
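
The alias and default rules described above can be sketched as a small pure function. The `Resolved` type and `resolve_model` name are hypothetical, and passing unknown model names through unchanged is an assumption rather than something this PR description states.

```rust
use std::collections::HashMap;

/// Outcome of resolving the request's `model` field (hypothetical type).
#[derive(Debug, PartialEq)]
struct Resolved {
    /// Model name the client sent, if any.
    original: Option<String>,
    /// Model name forwarded to the backend, if one could be determined.
    effective: Option<String>,
}

/// Apply alias rewriting, then default injection for a missing or null model.
fn resolve_model(
    requested: Option<&str>,
    aliases: &HashMap<String, String>,
    default_model: Option<&str>,
) -> Resolved {
    let effective = match requested {
        // Known alias: rewrite. Unknown name: pass through (assumption).
        Some(name) => Some(aliases.get(name).cloned().unwrap_or_else(|| name.to_string())),
        // Missing or null model: fall back to the configured default, if any.
        None => default_model.map(str::to_string),
    };
    Resolved {
        original: requested.map(str::to_string),
        effective,
    }
}
```

With the alias codex-mini-latest -> llama-3.3-70b, a request for codex-mini-latest resolves to llama-3.3-70b, and a request with no model resolves to the default.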

Praxis also publishes routing headers such as x-praxis-ai-effective-model. Operators can use these headers to route rewritten models to different backend clusters:

Effective Model    Example Destination
llama-3.3-70b      Llama GPU pool
qwen-2.5-72b       Qwen GPU pool
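
To illustrate how a downstream router might consume the effective-model header, here is a rough sketch. The function name and the cluster identifiers `llama-gpu-pool` and `qwen-gpu-pool` are hypothetical; in Praxis itself this routing would be expressed in configuration, not in code like this.

```rust
/// Pick a backend cluster from the effective-model header (hypothetical helper).
/// `headers` is a list of (name, value) pairs; names compare case-insensitively.
fn route_by_effective_model(headers: &[(&str, &str)]) -> Option<&'static str> {
    let model = headers
        .iter()
        .find(|(name, _)| name.eq_ignore_ascii_case("x-praxis-ai-effective-model"))
        .map(|(_, value)| *value)?;
    match model {
        "llama-3.3-70b" => Some("llama-gpu-pool"), // hypothetical cluster name
        "qwen-2.5-72b" => Some("qwen-gpu-pool"),   // hypothetical cluster name
        _ => None, // no rule matched; the caller decides on a fallback
    }
}
```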

All other request fields pass through untouched. Praxis does not execute client tools or take ownership of the Codex tool loop.

Example Configuration

  - filter: openai_responses_model_rewrite
    default_model: "llama-3.3-70b"
    model_aliases:
      codex-mini-latest: "llama-3.3-70b"
      gpt-4.1-mini: "qwen-2.5-72b"
    headers:
      effective_model: x-praxis-ai-effective-model
      original_model: x-praxis-ai-original-model
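
One way to picture how the `headers:` block is used: the filter takes the configured header names and pairs them with the original and effective model values. The struct and function below are hypothetical, and omitting the original-model header when the client sent no model is an assumption.

```rust
/// Header names taken from the `headers:` block of the configuration
/// (hypothetical struct; field names mirror the YAML keys).
struct HeaderNames {
    effective_model: String,
    original_model: String,
}

/// Build the routing headers to attach to the forwarded request.
fn routing_headers(
    names: &HeaderNames,
    original: Option<&str>,
    effective: &str,
) -> Vec<(String, String)> {
    let mut out = vec![(names.effective_model.clone(), effective.to_string())];
    // Only emit the original-model header if the client sent one (assumption).
    if let Some(orig) = original {
        out.push((names.original_model.clone(), orig.to_string()));
    }
    out
}
```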

Demo And Results

The runnable passthrough demo, captured results, benchmark data, and documented claim boundaries are available here:

Native /v1/responses Passthrough Demo and Results (https://github.com/nerdalert/praxis-research-spikes/tree/main/demo/v1-responses-passthrough)

Demonstrated Behavior

  • Native POST /v1/responses passthrough.
  • Model alias rewriting and default injection.
  • Effective-model-based backend routing.
  • Streaming SSE preservation.
  • Codex-shaped tools and function_call_output preservation.
  • Mixed Responses and Chat Completions routing.
  • Request-path benchmark profiles.

@nerdalert nerdalert requested review from a team June 12, 2026 14:28
@nerdalert nerdalert force-pushed the brent-responses-passthrough branch from 42e80e2 to 823b264 Compare June 12, 2026 14:32
@shaneutt shaneutt self-assigned this Jun 12, 2026
@github-project-automation github-project-automation Bot moved this to Backlog in AI Gateway Jun 12, 2026
@shaneutt shaneutt moved this from Backlog to Review in AI Gateway Jun 12, 2026
@shaneutt shaneutt added this to the v0.4.0 milestone Jun 12, 2026

@praxis-bot praxis-bot left a comment

Collaborator

Review: openai_responses_model_rewrite filter

Core logic is clean. Config validation thorough. 40+ unit tests and 11 integration tests. Follows established patterns. One compile-breaking bug and a few test gaps.

Findings: 1 Critical, 2 Medium, 3 Small, 4 Nit. See inline comments.

Non-inline: (1) is_responses_create does its own path matching instead of reading classifier metadata, which contradicts the docs. (2) header_guard is started early in the content_length test. (3) The integration test file has a doc comment.

Comment thread filter/src/builtins/http/ai/mod.rs
Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/mod.rs
Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/mod.rs Outdated
Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/tests.rs Outdated
Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/tests.rs
Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/mod.rs
Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/mod.rs
Comment thread tests/integration/tests/suite/openai_responses_model_rewrite.rs Outdated
@nerdalert nerdalert force-pushed the brent-responses-passthrough branch 4 times, most recently from 5eab4df to b30ac7a Compare June 16, 2026 01:29

@praxis-bot praxis-bot left a comment

Collaborator

(superseded by later review)

@praxis-bot praxis-bot left a comment

Collaborator

PR Review: feat(filter): add /v1/responses model rewrite filter for Codex passthrough

Summary: Adds openai_responses_model_rewrite, a Responses API request filter that translates client-facing model names into backend deployment names via alias mapping, with default model injection for missing/null values.

Overall: Well-structured filter implementation that follows the project's established patterns closely. The code is clean, the test coverage is thorough with both unit and integration tests, and the integration tests exercise real end-to-end behavior, including routing by the effective-model header. A few items need attention: primarily a missing #[cfg(feature)] gate on a re-export, a documentation accuracy issue, and a note on JSON round-trip key reordering.

Severity    Count
Medium      3
Small       4
Nit         3

Findings without inline placement

  • [Small] tests/integration/tests/suite/main.rs: The new mod openai_responses_model_rewrite; declaration should be gated with #[cfg(feature = "ai-inference")] to match the pattern used for other AI-inference test modules. Without the gate, the module will fail to compile when the feature is disabled.

  • [Nit] filter/src/builtins/http/ai/openai/responses/model_rewrite/mod.rs: The on_request_body method checks body.is_none() before calling rewrite_body, but rewrite_body immediately re-checks with let Some(raw) = body.as_ref(). The outer guard makes the inner one unreachable -- not a bug, but a minor redundancy.
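
The feature gate requested in the [Small] finding above can be sketched as follows. The module is written inline here only to keep the example self-contained (the real declaration would be `mod openai_responses_model_rewrite;` pointing at a file), and `suite_name` and `compiled_suites` are hypothetical helpers.

```rust
// Compiled only when the `ai-inference` feature is enabled, so a build
// without the feature never tries to compile the AI-specific tests.
#[cfg(feature = "ai-inference")]
mod openai_responses_model_rewrite {
    pub fn suite_name() -> &'static str {
        "openai_responses_model_rewrite"
    }
}

/// List the test suites present in this build (hypothetical helper).
fn compiled_suites() -> Vec<&'static str> {
    #[allow(unused_mut)] // `suites` is only mutated when the feature is on
    let mut suites = vec!["core"];
    // Any use of the gated module must carry the same gate.
    #[cfg(feature = "ai-inference")]
    suites.push(openai_responses_model_rewrite::suite_name());
    suites
}
```

With the feature disabled, both the module and the `push` call are compiled out, so nothing refers to code that depends on AI-inference items.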


See inline comments for all other findings.

Comment thread filter/src/builtins/http/ai/mod.rs
Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/mod.rs
Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/mod.rs
Comment thread docs/operating/filter-reference.md Outdated
Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/mod.rs
Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/tests.rs
Comment thread tests/integration/tests/suite/openai_responses_model_rewrite.rs
Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/mod.rs
@nerdalert nerdalert force-pushed the brent-responses-passthrough branch 3 times, most recently from 208ed75 to f86c6c7 Compare June 16, 2026 04:06

@leseb leseb left a comment

Collaborator

I'm requesting changes because I'd like to discuss some key points of the filter's goals; otherwise the goal makes sense to me!

Comment thread filter/src/builtins/http/ai/openai/responses/classifier/mod.rs Outdated
Comment thread examples/configs/ai/openai/responses/model-rewrite.yaml
Comment thread filter/src/builtins/http/ai/openai/responses/classifier/mod.rs Outdated
Comment thread filter/src/builtins/http/ai/openai/responses/classifier/mod.rs Outdated
Comment thread filter/src/builtins/http/ai/openai/responses/model_rewrite/mod.rs
@nerdalert nerdalert force-pushed the brent-responses-passthrough branch from 7baf162 to 14ae0df Compare June 19, 2026 03:41
@praxis-bot-app

PR too large: 1763 lines added (limit: 750, excludes Cargo files, tests, docs, examples, and benchmarks). Please split into smaller PRs. Add skip/pr-conventions label to override.

@nerdalert nerdalert force-pushed the brent-responses-passthrough branch 2 times, most recently from 734c3f9 to 93c5b01 Compare June 19, 2026 04:26
@shaneutt shaneutt added skip/pr-conventions Skip conventions checks for PRs and removed skip/pr-hygiene labels Jun 19, 2026
@nerdalert nerdalert requested a review from leseb June 23, 2026 14:56
…rough

This adds openai_responses_model_rewrite, a Responses API request-body
filter that lets Praxis translate Codex-facing model names into the
backend’s actual deployment names while preserving native /v1/responses traffic.

Signed-off-by: Brent Salisbury <bsalisbu@redhat.com>
@nerdalert nerdalert force-pushed the brent-responses-passthrough branch from 34d0e8c to 5897339 Compare June 23, 2026 15:24
@shaneutt shaneutt removed their assignment Jun 23, 2026
@shaneutt shaneutt merged commit b3e4f54 into praxis-proxy:main Jun 23, 2026
16 checks passed
@github-project-automation github-project-automation Bot moved this from Review to Done in AI Gateway Jun 23, 2026
Labels

skip/pr-conventions Skip conventions checks for PRs

Projects

Status: Done

5 participants