feat(filter): add shared AI classifier and anthropic_messages_format filter by franciscojavierarceo · Pull Request #587 · praxis-proxy/praxis

franciscojavierarceo · 2026-06-11T20:51:59Z

Summary

Extracts AI request format classification into a shared classifier module at filter/src/builtins/http/ai/classifier/mod.rs, adding the AnthropicMessages variant to AiRequestFormat
Adds the anthropic_messages_format filter that classifies Anthropic Messages API requests and promotes format facts to internal headers for downstream routing
Updates openai/responses to import from the shared classifier instead of its local copy

Behavioral changes from extraction: The shared classifier adds prompt object detection (requests with a prompt object but no input now classify as Responses instead of UnknownJson). New fields (has_tools, has_prompt_id, max_tokens) are extracted into ClassifiedRequest for Anthropic consumption; existing openai_responses_format users are unaffected.

Example configs and integration tests are in #592 (PR 6 of this stack).

Part 1 of the Anthropic Messages API filter stack (epic #484).

Test plan

cargo test -p praxis-proxy-filter -- all unit tests pass including classifier and messages_format tests
make lint -- clippy and fmt clean
make test-unit -- no regressions across workspace

…filter

Extract AI request format classification into a shared classifier module and add the anthropic_messages_format filter for Anthropic Messages API detection. The classifier promotes format facts to internal headers for downstream routing.

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

franciscojavierarceo · 2026-06-12T19:49:20Z

Looks like the Coding Conventions check is failing because the PRAXIS_BOT token has expired.

Run labels=$(gh pr view "$PR_NUMBER"
HTTP 401: Bad credentials (https://api.github.com/graphql)
Try authenticating with: gh auth login -h github.com

praxis-bot

Review: feat(filter): add shared AI classifier and anthropic_messages_format filter

Clean extraction of the classifier into a shared module, and a well-structured new filter. The code follows project conventions closely and the classifier disambiguation logic for Anthropic vs ChatCompletions is reasonable. Several test coverage gaps and one correctness concern below.

Findings

#	Severity	File	Finding
1	Medium	`classifier/mod.rs`	`classify_format` uses `prompt` object presence to classify as Responses, but old code did not. This silently changes behavior for existing `openai_responses_format` users.
2	Medium	`messages_format/mod.rs`	Missing `on_invalid: continue` test for non-JSON/invalid-JSON. Only reject-mode negative tests exist.
3	Medium	`messages_format/tests.rs`	`run_filter` utility always uses path `/v1/messages`, which triggers the path-based override. No test exercises body-only Anthropic classification.
4	Medium	`messages_format/tests.rs`	No test for `on_invalid: reject` with `NonJson` or `UnknownJson` formats, only `InvalidJson` is tested.
5	Medium	Project	Per CLAUDE.md test requirements: no example config and no integration test for the new filter.
6	Small	`messages_format/mod.rs`	`max_tokens` written to metadata but not promoted to filter results. Asymmetric with `stream`.
7	Small	`classifier/mod.rs`	`has_anthropic_signals` doc comment incomplete re: system array coverage.
8	Small	`messages_format/tests.rs`	No test for `stream: false` being promoted.
9	Small	`messages_format/tests.rs`	No test for header suppression when config field is `null`.
10	Nit	`messages_format/mod.rs`	`is_messages_path` lacks trailing-slash normalization unlike `is_responses_path`.
11	Nit	`classifier/mod.rs`	Extraction adds behavioral changes (prompt object detection, new fields) that should be called out in PR description.

See inline comments for details on findings 1-7, 10.

praxis-bot · 2026-06-13T12:10:06Z

+
+/// Determine format from top-level keys.
+///
+/// Precedence: `input` or `prompt` object → Responses, then


[Medium] This adds prompt object detection to classify_format, which the old classifier at openai/responses/classifier/mod.rs did not have. The old code only checked input and messages; now any payload with a prompt object (but no input) is classified as Responses instead of UnknownJson. This changes behavior for existing openai_responses_format users.

The addition is likely intentional and correct (Responses API does support prompt objects), but it is a behavioral change bundled into what the PR describes as an extraction/refactor of the classifier. Worth calling out explicitly in the PR description so reviewers are aware.

praxis-bot · 2026-06-13T12:10:10Z

+    /// Detected body format.
+    pub format: AiRequestFormat,
+    /// Whether `conversation` is present and non-null.
+    pub has_conversation: bool,


[Small] New fields has_tools, has_prompt_id, and max_tokens are added to ClassifiedRequest compared to the original. The existing openai_responses_format filter (via the updated responses/mod.rs) now receives these new fields but does not use them in its own promotion logic. Fine for forward-compatibility, but the struct now extracts data that only anthropic_messages_format consumes.

praxis-bot · 2026-06-13T12:10:14Z

+            None => &[],
+        };
+
+        let mut classified = classify_request_body(bytes);


[Medium] The path override (is_messages_path) uses exact equality with "/v1/messages". Unlike is_responses_path in the shared classifier which strips trailing slashes, a request to /v1/messages/ would not trigger the Anthropic override. Consider normalizing the trailing slash for consistency, or add a test documenting that /v1/messages/ intentionally falls through to body-only classification.
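A minimal sketch of the trailing-slash normalization suggested above. The function body is an assumption for illustration, not the project's actual implementation; only the name is_messages_path and the "/v1/messages" path come from this thread.

```rust
/// Hypothetical stand-in for the filter's path check (not the project's code).
/// Trailing slashes are stripped before the comparison, so `/v1/messages/`
/// is treated the same as `/v1/messages`.
fn is_messages_path(path: &str) -> bool {
    path.trim_end_matches('/') == "/v1/messages"
}
```

With this shape, /v1/messages and /v1/messages/ both match, while sub-paths such as /v1/messages/batches do not. Note that trim_end_matches also accepts repeated slashes (/v1/messages//), which may or may not be desired.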

praxis-bot · 2026-06-13T12:10:22Z

+        "{}",
+        r#"{"model":"claude-opus-4-8","max_tokens":1024,"messages":[{"role":"user","content":"Hi"}]}"#,
+    )
+    .await;


[Medium] run_filter always sets the path to /v1/messages, which activates the is_messages_path override in on_request_body. This means promotes_anthropic_messages_format passes because of the path boost, not because the body alone classifies as AnthropicMessages.

Consider adding a test where the path is e.g. /v1/some-other-path and the body has messages + max_tokens + system to verify body-only classification promotes correctly without the path boost.

praxis-bot · 2026-06-13T12:10:26Z

+}
+
+// -----------------------------------------------------------------------------
+// Promotion Tests


[Medium] Only InvalidJson is tested in reject mode. Missing tests for:

NonJson (empty body) in reject mode

UnknownJson (valid JSON without messages or input) in reject mode

continue mode with invalid/non-JSON bodies (should confirm filter releases without rejecting)

These are distinct code paths in handle_invalid_format and each should have coverage.
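The code paths listed above can be sketched roughly as follows. This is a simplified model, not the filter's real handle_invalid_format: the OnInvalid enum and the rejection_reason function are hypothetical names, and the message strings are borrowed from the openai_responses_format diff quoted elsewhere in this review, so the Anthropic filter may word them differently.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum AiRequestFormat {
    Responses,
    ChatCompletions,
    AnthropicMessages,
    InvalidJson,
    NonJson,
    UnknownJson,
}

/// Hypothetical model of the `on_invalid` config setting.
#[derive(Clone, Copy, Debug, PartialEq)]
enum OnInvalid {
    Reject,
    Continue,
}

/// Returns `Some(reason)` when the request should be rejected,
/// or `None` when the filter should let it continue.
fn rejection_reason(format: AiRequestFormat, on_invalid: OnInvalid) -> Option<&'static str> {
    // In continue mode nothing is rejected, whatever the format.
    if on_invalid == OnInvalid::Continue {
        return None;
    }
    match format {
        AiRequestFormat::InvalidJson => Some("invalid JSON body"),
        AiRequestFormat::NonJson => Some("request body is not JSON"),
        AiRequestFormat::UnknownJson => Some("unrecognized AI API format"),
        // Recognized formats pass through even in reject mode.
        AiRequestFormat::Responses
        | AiRequestFormat::ChatCompletions
        | AiRequestFormat::AnthropicMessages => None,
    }
}
```

Each arm of the match, in each mode, corresponds to one of the missing test cases.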

praxis-bot · 2026-06-13T12:10:30Z

+
+/// Write durable metadata.
+fn write_metadata(ctx: &mut HttpFilterContext<'_>, classified: &ClassifiedRequest) {
+    ctx.set_metadata("anthropic_format.format", classified.format.as_str());


[Small] max_tokens is written to metadata (anthropic_format.max_tokens) but not promoted to filter results in promote_filter_results. This is asymmetric with stream which is promoted to both. If downstream branch chains might need to condition on max_tokens, this would require a follow-up change.
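One way to picture the symmetric promotion is the sketch below, which writes both fields to both stores. Everything here is an assumption made for illustration: plain HashMaps stand in for the filter context and the filter results, the Classified struct and promote function are hypothetical, and only the anthropic_format.max_tokens metadata key is taken from the comment above.

```rust
use std::collections::HashMap;

/// Hypothetical subset of the classified request facts.
struct Classified {
    stream: Option<bool>,
    max_tokens: Option<u64>,
}

/// Write each present fact to both durable metadata and filter results,
/// so downstream branch conditions can read either store.
fn promote(
    classified: &Classified,
    metadata: &mut HashMap<String, String>,
    results: &mut HashMap<String, String>,
) {
    if let Some(stream) = classified.stream {
        // Assumed key name, mirroring the `anthropic_format.` prefix.
        metadata.insert("anthropic_format.stream".to_owned(), stream.to_string());
        results.insert("stream".to_owned(), stream.to_string());
    }
    if let Some(max_tokens) = classified.max_tokens {
        metadata.insert("anthropic_format.max_tokens".to_owned(), max_tokens.to_string());
        // The promotion that the comment above notes is missing.
        results.insert("max_tokens".to_owned(), max_tokens.to_string());
    }
}
```

Absent fields are skipped in both stores, so a missing max_tokens never shows up as an empty string.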

praxis-bot · 2026-06-13T12:10:33Z

+
+    results.set("format", classified.format.as_str())?;
+
+    if let Some(model) = &classified.model


[Nit] The filter result key is "anthropic_messages_format" while the metadata prefix is "anthropic_format.". The inconsistency could cause confusion: a user might expect anthropic_messages_format.format in metadata. Consider aligning the metadata prefix to "anthropic_messages_format." to match, or document the difference.

praxis-bot · 2026-06-13T12:35:43Z

PR too large: 1517 lines added (limit: 750, excludes Cargo files, tests, docs, examples, and benchmarks). Please split into smaller PRs. Add skip/pr-hygiene label to override.

- Normalize trailing slash on /v1/messages/ path check
- Add body-only classification test without path boost
- Add NonJson, UnknownJson reject-mode and continue-mode tests
- Promote max_tokens to filter results for symmetry with stream
- Clarify has_anthropic_signals doc re: system string and array forms
- Add stream:false promotion test
- Add null header config suppression test

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

leseb

Thanks! A few observations:

Splitting into two PRs would have been easier (one for the reworked classifier and one for the message format)
We are missing integration tests for the new filters
We are missing YAML examples for the new filters

EDIT: i just noticed #592 but i think it's better to have CI validating new config and tests in the same PR :)

franciscojavierarceo · 2026-06-15T14:51:25Z

> Thanks! A few observations:
>
> Splitting into two PRs would have been easier (one for the reworked classifier and one for the message format)
> we are missing integration tests for the new filters
> we are missing YAML examples for the new filters
>
> EDIT: i just noticed #592 but i think it's better to have CI validating new config and tests in the same PR :)

Yeah, that makes sense that splitting would have been cleaner.

As you saw, the integration tests and examples are in subsequent PRs. I wanted to keep each change moderate in scope so I wouldn't overload you with too much to review at once, but LMK if you want me to include them here.

praxis-bot · 2026-06-16T02:23:45Z

CI failure: Analyze (rust) (CodeQL)

The CodeQL Analyze (rust) step stalled during the "Perform CodeQL Analysis" phase and the job was marked as failed. All other checks (lint, test, build, coverage, etc.) passed. This looks like a transient CodeQL infrastructure timeout — re-running the workflow should resolve it.

praxis-bot

PR Review - Shared AI Classifier and Anthropic Messages Format Filter

Summary: Extracts AI request body classification into a shared classifier module and adds the anthropic_messages_format filter.

Overall: Well-structured extraction. The shared classifier is sound for multi-provider format detection. Test coverage is solid with a few gaps noted in inline comments and below.

Severity	Count
Medium	4
Small	3
Nit	2

Findings without inline placement

[Medium] The run_filter test utility hardcodes path to /v1/messages, so path-boost is always active. No test verifies pure body-heuristic classification without path or header boost. Consider adding one.
[Small] No test for ChatCompletions not being rejected in reject mode via handle_invalid_format.

praxis-bot · 2026-06-16T02:38:15Z

+    /// Extracted `max_tokens` field value, if present.
+    pub max_tokens: Option<u64>,
+    /// Extracted `model` field value, if present.
+    pub model: Option<String>,


[Nit] Doc comment says prompt.id but the code checks prompt.prompt_id (line 137). Consider updating to prompt.prompt_id for consistency.

praxis-bot · 2026-06-16T02:38:24Z

+fn classify_format(obj: &serde_json::Map<String, serde_json::Value>) -> AiRequestFormat {
+    if obj.contains_key("input") || obj.get("prompt").is_some_and(serde_json::Value::is_object) {
+        return AiRequestFormat::Responses;
+    }


[Small] The classify_format precedence: input or prompt-object takes absolute precedence over messages+max_tokens+system. A request with both a prompt object AND Anthropic signals would classify as Responses. This is probably intentional (Responses API uses prompt objects), but worth a test case to lock in this edge case behavior explicitly.
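The edge case can be pinned down with a small model of the precedence. In the sketch below, BodyFacts is a hypothetical stand-in for inspecting the top-level keys of the serde_json map, and the rule that messages plus both max_tokens and system means Anthropic is an assumption made for illustration; the real has_anthropic_signals logic may differ.

```rust
#[derive(Debug, PartialEq)]
enum AiRequestFormat {
    Responses,
    AnthropicMessages,
    ChatCompletions,
    UnknownJson,
}

/// Hypothetical summary of which top-level keys a JSON body contains.
#[derive(Default)]
struct BodyFacts {
    has_input: bool,
    prompt_is_object: bool,
    has_messages: bool,
    has_max_tokens: bool,
    has_system: bool,
}

fn classify_format(facts: &BodyFacts) -> AiRequestFormat {
    // `input` or a `prompt` object wins over everything else.
    if facts.has_input || facts.prompt_is_object {
        return AiRequestFormat::Responses;
    }
    if facts.has_messages {
        // Assumed Anthropic signal rule; not confirmed by the PR.
        if facts.has_max_tokens && facts.has_system {
            return AiRequestFormat::AnthropicMessages;
        }
        return AiRequestFormat::ChatCompletions;
    }
    AiRequestFormat::UnknownJson
}
```

Under this model, a body carrying a prompt object together with messages, max_tokens, and system still classifies as Responses, which is the behavior a regression test would lock in.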

praxis-bot · 2026-06-16T02:38:27Z

+
+        let bytes = match body.as_ref() {
+            Some(b) => b.as_ref(),
+            None => &[],


[Medium] The boost logic reclassifies ChatCompletions to AnthropicMessages when anthropic-version header is present OR path is /v1/messages. However, it does not boost UnknownJson. If the body has max_tokens and system but no messages key, the classifier returns UnknownJson and the filter will not reclassify. This might be desired (let the backend reject), but worth documenting the intent since the boost only applies to ChatCompletions.
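The boost described above can be sketched as a small pure function. The name apply_boost and its boolean parameters are hypothetical; the sketch only models the stated behavior, where ChatCompletions is reclassified and every other format, including UnknownJson, is left alone.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum AiRequestFormat {
    Responses,
    ChatCompletions,
    AnthropicMessages,
    UnknownJson,
}

/// Reclassify ChatCompletions as AnthropicMessages when either request-level
/// hint is present. All other formats are returned unchanged.
fn apply_boost(
    format: AiRequestFormat,
    has_anthropic_version_header: bool,
    is_messages_path: bool,
) -> AiRequestFormat {
    match format {
        AiRequestFormat::ChatCompletions if has_anthropic_version_header || is_messages_path => {
            AiRequestFormat::AnthropicMessages
        }
        // UnknownJson is deliberately not boosted in this model.
        other => other,
    }
}
```

Writing the rule this way makes the intent explicit: a body without a messages key stays UnknownJson even on /v1/messages, so the on_invalid setting decides what happens to it.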

praxis-bot · 2026-06-16T02:38:33Z

+) {
+    if let Some(header) = &config.headers.format {
+        ctx.extra_request_headers
+            .push((Cow::Owned(header.clone()), classified.format.as_str().to_owned()));


[Medium] promote_filter_results does not promote has_tools, has_prompt_id, has_conversation, or has_previous_response_id to filter results, unlike the openai_responses_format filter which promotes all of these. If downstream branch conditions need to route Anthropic requests by tool usage or conversation state, these will be unavailable. write_metadata also does not promote has_tools. This may be intentional for the initial PR (fields not yet needed for Anthropic routing), but it creates an asymmetry between the two format filters that could surprise users.

praxis-bot · 2026-06-16T02:38:36Z

+// -----------------------------------------------------------------------------
+
+/// Default maximum request body size for `StreamBuffer` mode (1 MiB).
+const DEFAULT_MAX_BODY_BYTES: usize = 1_048_576; // 1 MiB


[Nit] The default is 1 MiB here vs. openai_responses_format default of 64 MiB (MAX_JSON_BODY_BYTES). This difference is presumably intentional (Anthropic messages are typically smaller), but there is no doc comment explaining the rationale for the 1 MiB default vs. the ceiling of 64 MiB.

praxis-bot · 2026-06-16T02:38:39Z

@@ -6,6 +6,10 @@

 pub(crate) mod agentic;


[Medium] This mod anthropic declaration is not gated behind #[cfg(feature = "ai-inference")], but its internals import from crate::builtins::http::ai::classifier, which is feature-gated. If ai-inference is disabled, compilation would fail. Either add the feature gate here (matching the pattern of the other AI filter modules) or verify the module compiles without the feature.

praxis-bot · 2026-06-16T02:38:42Z

@@ -189,7 +188,9 @@ fn handle_invalid_format(format: AiRequestFormat, config: &ResponsesFormatConfig
                AiRequestFormat::InvalidJson => "invalid JSON body",
                AiRequestFormat::NonJson => "request body is not JSON",
                AiRequestFormat::UnknownJson => "unrecognized AI API format",


[Small] Good addition of AiRequestFormat::AnthropicMessages to the pass-through arm. This ensures the responses filter does not reject requests that the shared classifier now identifies as Anthropic format.

praxis-bot-app · 2026-06-16T14:52:58Z

Unsigned commits: 7a50a84. Please sign your commits.

Adds the unified-gateway example config and 4 integration tests to the classifier PR per review feedback. The config routes by classifier-promoted x-praxis-ai-format headers with the openai_responses_format header promotion explicitly suppressed to prevent overwriting.

Signed-off-by: Francisco Arceo <farceo@redhat.com>

franciscojavierarceo mentioned this pull request Jun 11, 2026

feat(filter): add anthropic_validate filter #588

Merged

8 tasks

shaneutt assigned leseb Jun 12, 2026

shaneutt added the area/ai AI and inference filters label Jun 12, 2026

shaneutt added this to AI Gateway Jun 12, 2026

github-project-automation Bot moved this to Backlog in AI Gateway Jun 12, 2026

shaneutt moved this from Backlog to Review in AI Gateway Jun 12, 2026

shaneutt added this to the v0.4.0 milestone Jun 12, 2026

franciscojavierarceo force-pushed the feat/anthropic-1-classifier branch from 873d487 to 76c3827 Compare June 12, 2026 17:20

franciscojavierarceo marked this pull request as ready for review June 12, 2026 19:39

franciscojavierarceo requested review from a team June 12, 2026 19:39

franciscojavierarceo requested review from leseb, shaneutt and twghu as code owners June 12, 2026 19:39

praxis-bot reviewed Jun 13, 2026

Merge branch 'main' into feat/anthropic-1-classifier

3c7b05e

franciscojavierarceo added the skip/pr-hygiene label Jun 14, 2026

leseb requested changes Jun 15, 2026

Comment thread filter/src/builtins/http/ai/anthropic/messages_format/mod.rs

Comment thread filter/src/builtins/http/ai/classifier/mod.rs

Comment thread filter/src/builtins/http/ai/classifier/mod.rs

franciscojavierarceo and others added 3 commits June 15, 2026 16:08

Merge branch 'main' into feat/anthropic-1-classifier

8571213

Merge branch 'main' into feat/anthropic-1-classifier

4aa9c4a

Merge branch 'main' into feat/anthropic-1-classifier

4ea5c7a

praxis-bot reviewed Jun 16, 2026

franciscojavierarceo force-pushed the feat/anthropic-1-classifier branch from 7a50a84 to b6e09a4 Compare June 16, 2026 15:16

shaneutt and others added 2 commits June 16, 2026 12:49

Merge branch 'main' into feat/anthropic-1-classifier

945dded

Merge branch 'main' into feat/anthropic-1-classifier

41da547

leseb approved these changes Jun 17, 2026

shaneutt merged commit ae99313 into praxis-proxy:main Jun 17, 2026
16 checks passed

github-project-automation Bot moved this from Review to Done in AI Gateway Jun 17, 2026

franciscojavierarceo mentioned this pull request Jun 17, 2026

test(filter): add Anthropic Messages integration coverage #592

Merged

3 tasks


4 participants
