Skip to content

feat: enhance token usage resolution for OpenAI and Anthropic compati…#223

Open
AlapinEnjoyer wants to merge 2 commits into
alibaba:mainfrom
Val-Dennis:feat/recognize-openai-cached-tokens
Open

feat: enhance token usage resolution for OpenAI and Anthropic compati…#223
AlapinEnjoyer wants to merge 2 commits into
alibaba:mainfrom
Val-Dennis:feat/recognize-openai-cached-tokens

Conversation

@AlapinEnjoyer

Copy link
Copy Markdown
Contributor

Description

Cached token usage was not getting tracked for the OpenAI provider, the MR adds a way to recognize the common OpenAI-compatible cached-token fields for custom providers

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Refactoring (no functional changes)
  • Documentation update
  • CI / Build / Tooling

How Has This Been Tested?

  • make test passes locally
  • Manual testing (describe below)

Tested with a custom provider run

Checklist

  • My code follows the project's coding style (go fmt, go vet)
  • I have performed a self-review of my code
  • I have added tests that prove my fix is effective or my feature works
  • New and existing unit tests pass locally with my changes
  • I have updated the documentation accordingly (if applicable)
  • I have signed the CLA

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 OpenCodeReview found 1 issue(s) in this PR.

  • ✅ 1 posted as inline comment(s)
  • 📝 0 posted as summary

Comment thread internal/llm/usage_resolver.go Outdated
Comment on lines +45 to +46
"usage.prompt_tokens_details.cache_creation_tokens",
"data.usage.prompt_tokens_details.cache_creation_tokens",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor maintainability issue: These two new entries lack comments describing which providers or API formats use cache_creation_tokens under prompt_tokens_details. All other entries in these path lists have explanatory comments. Adding a comment (e.g., // OpenAI-compatible providers or similar) would keep the style consistent and help future maintainers understand when this path applies.

Suggestion:

Suggested change
"usage.prompt_tokens_details.cache_creation_tokens",
"data.usage.prompt_tokens_details.cache_creation_tokens",
"usage.prompt_tokens_details.cache_creation_tokens", // OpenAI-compatible providers
"data.usage.prompt_tokens_details.cache_creation_tokens", // wrapped OpenAI-compatible providers

@lizhengfeng101 lizhengfeng101 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @AlapinEnjoyer, thanks for this PR! 🎉 Great catch on the missing cached_tokens field — that's the standard OpenAI field name and it's definitely needed. The tests are solid too. A few thoughts below:


1. Potential double-counting of TotalTokens (pre-existing, not introduced by this PR)

When total_tokens is absent from the response, the fallback calculation at usage_resolver.go:77-79 does:

ui.TotalTokens = prompt + completion + cacheRead + cacheWrite

In OpenAI's semantics, prompt_tokens already includes cached_tokens (cached tokens are a subset of prompt tokens), so the correct total would be prompt + completion — adding cache tokens on top double-counts them. In contrast, Anthropic's input_tokens excludes cache tokens, so the addition is correct there.

The test TestResolveUsageWrappedCachedTokens asserts TotalTokens == 205 (100+20+75+10), which would be a double-count under OpenAI semantics.

This is a pre-existing design choice, not something you introduced — but your PR does make more OpenAI-format responses reach this code path. Might be worth a follow-up to handle the two semantics differently, or at least a comment noting the trade-off. No action needed in this PR if you'd prefer to keep the scope focused.

2. Where does cache_creation_tokens come from?

The newly added paths:

usage.prompt_tokens_details.cache_creation_tokens
data.usage.prompt_tokens_details.cache_creation_tokens

This field name isn't part of OpenAI's standard response format (OpenAI doesn't expose cache creation), and it differs from Anthropic's cache_creation_input_tokens (which lives at the usage top level, not under prompt_tokens_details). Could you add a short comment noting which provider or proxy uses this field? It would help future readers understand where it comes from.

3. Minor: test idea for path priority

Since probePath uses first-match-wins, it might be worth adding a small test where both cached_tokens and cache_read_input_tokens (or cache_tokens_hit) are present in the same response, to verify the expected path takes priority. This would document the intended ordering.


Overall this is a solid improvement — the gap in OpenAI cached token tracking was real and this fixes it cleanly. Nice work on the test coverage too! 👍

…bility

Recognize OpenAI cached_tokens and wrapped proxy response paths, add tests
for path priority and provider-specific total token fallback semantics.

Co-authored-by: Cursor <cursoragent@cursor.com>
@AlapinEnjoyer AlapinEnjoyer force-pushed the feat/recognize-openai-cached-tokens branch from 8e2443c to c0b2d88 Compare June 28, 2026 12:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants