docs(proposals): add How? section to #210 response-based token counting by mkoushni · Pull Request #643 · praxis-proxy/praxis

mkoushni · 2026-06-21T09:59:08Z

docs(proposals): add How? section to #210 response-based token counting

Summary

Completes the graduation criteria for proposal #210 by adding the How? section — the design and implementation plan for the token_count filter that extracts token usage from AI provider response bodies and headers.

What changed

docs/proposals/00210_response-based-token-counting.md — 193 lines added, status updated from proposed → accepted.

Open questions resolved

All four open questions from the What?/Why? section are answered in the new How? section:

| Question | Decision |
| --- | --- |
| Provider identification | Explicit `provider:` YAML key, with no auto-detection. Azure and OpenAI share the same JSON schema, so auto-detection would be ambiguous. |
| Streaming completion signal | `BodyMode::StreamBuffer`: the proxy buffers all response body bytes and delivers them once with `end_of_stream: true`. Stream close is the authoritative trigger, which covers providers that omit `[DONE]` (Google Gemini). |
| Streaming accumulation per provider | Per-provider strategy: single terminal-chunk scan for OpenAI/Azure/Google/Bedrock Converse; two-event scan (`message_start` + `message_delta`) for Anthropic; header-only for Bedrock InvokeModel. |
| Partial usage data | Only the final assembled payload is parsed, with no summing of intermediate chunks, to avoid double-counting. |
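The provider and strategy decisions in the table can be sketched as a small dispatch. This is an illustration only, not the code from the proposal: `ProviderKind::OpenAi` and the `provider: openai` value appear in this PR, while the other variant names, the other YAML strings, the `ExtractionStrategy` enum, and both function names are assumptions made for the example.

```rust
/// Hypothetical mirror of the proposal's `ProviderKind` enum.
/// Only the `OpenAi` variant name is confirmed by the PR; the rest are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProviderKind {
    OpenAi,
    Azure,
    Google,
    BedrockConverse,
    BedrockInvokeModel,
    Anthropic,
}

/// Hypothetical names for the three extraction strategies in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExtractionStrategy {
    /// Scan the buffered body once for the terminal chunk carrying usage.
    TerminalChunkScan,
    /// Combine usage from `message_start` and `message_delta` events.
    TwoEventScan,
    /// Read usage from response headers; the body is never inspected.
    HeaderOnly,
}

/// Maps the explicit `provider:` value to a provider kind.
/// Unknown values are rejected instead of guessed, because auto-detection
/// is ambiguous (Azure and OpenAI share the same JSON schema).
/// Only "openai" is a confirmed value; the other strings are assumptions.
pub fn parse_provider(value: &str) -> Option<ProviderKind> {
    match value {
        "openai" => Some(ProviderKind::OpenAi),
        "azure" => Some(ProviderKind::Azure),
        "google" => Some(ProviderKind::Google),
        "bedrock_converse" => Some(ProviderKind::BedrockConverse),
        "bedrock_invoke_model" => Some(ProviderKind::BedrockInvokeModel),
        "anthropic" => Some(ProviderKind::Anthropic),
        _ => None,
    }
}

/// Selects the per-provider extraction strategy described in the table.
pub fn strategy_for(provider: ProviderKind) -> ExtractionStrategy {
    match provider {
        ProviderKind::OpenAi
        | ProviderKind::Azure
        | ProviderKind::Google
        | ProviderKind::BedrockConverse => ExtractionStrategy::TerminalChunkScan,
        ProviderKind::Anthropic => ExtractionStrategy::TwoEventScan,
        ProviderKind::BedrockInvokeModel => ExtractionStrategy::HeaderOnly,
    }
}
```

Because both `match` expressions are exhaustive, adding a provider variant later forces an explicit strategy choice at compile time instead of silently falling through to a default.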

Design content added

Requirements — 7 concrete implementation requirements.
Filter struct and config — TokenCountConfig, ProviderKind enum, TokenCountFilter struct with YAML snippet.
HttpFilter hook table — behaviour of each hook (on_request, on_response, response_body_access, response_body_mode, on_response_body).
SSE extraction detail — dispatch tree: Anthropic two-event scan vs. last-valid-chunk scan for all other providers.
Bedrock InvokeModel path — header-only extraction in on_response; BodyAccess::None prevents unnecessary buffering.
FilterContext metadata keys — token.input, token.output, token.total written via ctx.set_token_usage.
Module registration — step-by-step wiring for ai/mod.rs, http/mod.rs, builtins/mod.rs, and registry.rs.
YAML configuration example — minimal working filter chain showing token_count with provider: openai.
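As a rough illustration of the last-valid-chunk scan mentioned under "SSE extraction detail", the sketch below walks a fully buffered SSE body and returns the last `data:` payload that is not the `[DONE]` sentinel. The function name `last_data_payload` is hypothetical, and the sketch stops short of JSON parsing; the actual filter presumably parses the payload and checks for a usage object, which this example does not attempt.

```rust
/// Returns the last `data:` payload in a buffered SSE body, skipping empty
/// payloads and the `[DONE]` sentinel. Hypothetical helper for illustration.
///
/// The scan runs only once the whole body is available (stream close), so it
/// behaves the same for providers that send `[DONE]` and those that do not.
pub fn last_data_payload(body: &str) -> Option<&str> {
    body.lines() // `lines` handles both "\n" and "\r\n" terminators
        .filter_map(|line| line.strip_prefix("data:")) // keep only data fields
        .map(str::trim) // drop the optional space after the colon
        .filter(|payload| !payload.is_empty() && *payload != "[DONE]")
        .last() // only the final payload is used; nothing is summed
}
```

Returning only the final payload matches the "no summing of intermediate chunks" decision: earlier chunks are ignored, so usage cannot be double-counted.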

PR Review

Summary: Adds the How? section to the #210 proposal, transitioning status from proposed to accepted. Well-structured design that correctly addresses all four open questions and aligns with the existing token_usage library and FilterContext APIs.

| Severity | Count |
| --- | --- |
| Medium | 3 |

No critical or large issues found. The design decisions are sound -- StreamBuffer for body aggregation, explicit provider: key over auto-detection, and header-only path for Bedrock InvokeModel are all correct choices.

- Add listeners block to YAML configuration example
- Align ProviderKind::OpenAi casing with TokenUsageProvider::OpenAi
- Update response_body_mode hook table row to reflect Stream/StreamBuffer split

Signed-off-by: mkoushni <mkoushni@redhat.com>

mkoushni marked this pull request as ready for review June 21, 2026 11:02

mkoushni requested a review from a team June 21, 2026 11:02

mkoushni requested review from shaneutt and twghu as code owners June 21, 2026 11:02

praxis-bot reviewed Jun 22, 2026

Comment thread docs/proposals/00210_response-based-token-counting.md Outdated

praxis-bot reviewed Jun 22, 2026

Comment thread docs/proposals/00210_response-based-token-counting.md

praxis-bot reviewed Jun 22, 2026

Comment thread docs/proposals/00210_response-based-token-counting.md

mkoushni added 2 commits June 22, 2026 18:32

docs(proposals): add How? section to praxis-proxy#210 response-based …

be4253b

…token counting Signed-off-by: mkoushni <mkoushni@redhat.com>

mkoushni force-pushed the feat/210-response-based-token-counting branch from b7bf13b to 39631ec June 22, 2026 15:41

shaneutt self-assigned this Jun 22, 2026

shaneutt added this to AI Gateway Jun 22, 2026

github-project-automation Bot moved this to Backlog in AI Gateway Jun 22, 2026

shaneutt moved this from Backlog to Review in AI Gateway Jun 22, 2026

shaneutt added this to the v0.4.0 milestone Jun 22, 2026

mkoushni wants to merge 2 commits into praxis-proxy:main from mkoushni:feat/210-response-based-token-counting