ACPP Token Usage & Model Information Specification

1. Canonical Data Structure: UsageInfo

Defined in acp/session.go, this is the single source of truth for usage data within a session:

| Field | Type | Description | Semantics |
| --- | --- | --- | --- |
| InputTokens | int64 | Tokens consumed by user input | Cumulative across all prompts |
| OutputTokens | int64 | Tokens generated by model | Cumulative across all prompts |
| CacheCreationInputTokens | int64 | Tokens used to create prompt cache | Cumulative |
| CacheReadInputTokens | int64 | Tokens read from cached prompts | Cumulative |
| ContextWindow | int64 | Model's context window size | Gauge (last known value) |
| MaxOutputTokens | int64 | Model's max output token limit | Gauge (last known value) |
| WebSearchRequests | int64 | Number of web searches performed | Cumulative |
| CostUSD | float64 | Total cost in USD | Cumulative |
| PromptCount | int64 | Number of prompts sent | Cumulative |
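
For orientation, the field table can be read as a Go struct. This is a minimal sketch reconstructed from the table only; the actual declaration in acp/session.go may differ in field order, tags, or comments.

```go
package main

import "fmt"

// UsageInfo is a sketch of the structure described in the table above.
// It is reconstructed from the table, not copied from acp/session.go.
type UsageInfo struct {
	InputTokens              int64   // cumulative across all prompts
	OutputTokens             int64   // cumulative across all prompts
	CacheCreationInputTokens int64   // cumulative
	CacheReadInputTokens     int64   // cumulative
	ContextWindow            int64   // gauge: last known value
	MaxOutputTokens          int64   // gauge: last known value
	WebSearchRequests        int64   // cumulative
	CostUSD                  float64 // cumulative
	PromptCount              int64   // cumulative
}

func main() {
	// Values taken from the sample payload in section 2.
	u := UsageInfo{
		InputTokens:     1000,
		OutputTokens:    500,
		ContextWindow:   200000,
		MaxOutputTokens: 16384,
		PromptCount:     1,
	}
	fmt.Printf("%+v\n", u)
}
```

The zero value is a usable "no usage yet" state, which is convenient for a session that has not received any usage data.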

Model identification is stored separately as lastModel string on each session — only the most recently seen model ID is kept.

SDK version is extracted once during initialization from InitializeResponse.AgentInfo._meta.{claudeCode,rai}.sdkVersion.
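
The lookup path above could be followed roughly as shown below. This is a sketch under assumptions: the JSON key "agentInfo" and the helper name sdkVersionFrom are illustrative and not confirmed by this document; only the _meta.{claudeCode,rai}.sdkVersion path comes from the text.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// initializeResponse models only the path named in the text.
// The wire key "agentInfo" is an assumption.
type initializeResponse struct {
	AgentInfo struct {
		Meta map[string]struct {
			SDKVersion string `json:"sdkVersion"`
		} `json:"_meta"`
	} `json:"agentInfo"`
}

// sdkVersionFrom (hypothetical helper) returns the first non-empty
// sdkVersion found under the claudeCode or rai section, or "" if none.
func sdkVersionFrom(raw []byte) (string, error) {
	var resp initializeResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return "", err
	}
	for _, key := range []string{"claudeCode", "rai"} {
		if section, ok := resp.AgentInfo.Meta[key]; ok && section.SDKVersion != "" {
			return section.SDKVersion, nil
		}
	}
	return "", nil
}

func main() {
	raw := []byte(`{"agentInfo":{"_meta":{"claudeCode":{"sdkVersion":"1.0.0"}}}}`)
	v, err := sdkVersionFrom(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(v)
}
```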


2. Where Usage Data Originates

ACP Backend (subprocess, ACPSession)

Source: _meta field on PromptResponse and AgentMessageChunk events.

Structure (JSON):

{
  "_meta": {
    "claudeCode": {
      "model": "claude-opus-4-6",
      "sdkVersion": "1.0.0",
      "totalCostUsd": 0.1234,
      "modelUsage": {
        "claude-opus-4-6": {
          "inputTokens": 1000,
          "outputTokens": 500,
          "cacheCreationInputTokens": 200,
          "cacheReadInputTokens": 800,
          "contextWindow": 200000,
          "maxOutputTokens": 16384,
          "webSearchRequests": 2,
          "costUSD": 0.1234
        }
      }
    }
  }
}

The key under _meta is one of "claudeCode", "rai", "codex", or "gemini"; all four use the same inner structure. modelUsage is a map keyed by model name, and its values are summed across all models into a single UsageInfo.
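
The summing step can be sketched as follows. The type and function names (modelUsage, usageTotals, parseUsage) are illustrative, not ACPP's own. The sketch sums only the counter fields; how the gauge fields (contextWindow, maxOutputTokens) are combined across models is not stated here, so they are left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// modelUsage matches the counter fields of one modelUsage entry.
type modelUsage struct {
	InputTokens              int64 `json:"inputTokens"`
	OutputTokens             int64 `json:"outputTokens"`
	CacheCreationInputTokens int64 `json:"cacheCreationInputTokens"`
	CacheReadInputTokens     int64 `json:"cacheReadInputTokens"`
	WebSearchRequests        int64 `json:"webSearchRequests"`
}

// usageTotals holds the flat sums (a subset of UsageInfo).
type usageTotals struct {
	InputTokens, OutputTokens                      int64
	CacheCreationInputTokens, CacheReadInputTokens int64
	WebSearchRequests                              int64
}

// agentKeys lists the _meta keys named in the text.
var agentKeys = []string{"claudeCode", "rai", "codex", "gemini"}

// parseUsage (hypothetical) finds the first known agent section and
// sums its per-model counters. ok is false if no section is present.
func parseUsage(raw []byte) (totals usageTotals, ok bool, err error) {
	var envelope struct {
		Meta map[string]json.RawMessage `json:"_meta"`
	}
	if err = json.Unmarshal(raw, &envelope); err != nil {
		return totals, false, err
	}
	for _, key := range agentKeys {
		section, found := envelope.Meta[key]
		if !found {
			continue
		}
		var body struct {
			ModelUsage map[string]modelUsage `json:"modelUsage"`
		}
		if err = json.Unmarshal(section, &body); err != nil {
			return totals, false, err
		}
		for _, mu := range body.ModelUsage {
			totals.InputTokens += mu.InputTokens
			totals.OutputTokens += mu.OutputTokens
			totals.CacheCreationInputTokens += mu.CacheCreationInputTokens
			totals.CacheReadInputTokens += mu.CacheReadInputTokens
			totals.WebSearchRequests += mu.WebSearchRequests
		}
		return totals, true, nil
	}
	return totals, false, nil
}

func main() {
	raw := []byte(`{"_meta":{"claudeCode":{"modelUsage":{
		"claude-opus-4-6":{"inputTokens":1000,"outputTokens":500,
		"cacheCreationInputTokens":200,"cacheReadInputTokens":800,"webSearchRequests":2}}}}}`)
	totals, ok, err := parseUsage(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(ok, totals.InputTokens, totals.OutputTokens)
}
```

Note that the per-model keys are dropped once the loop finishes, which is the flattening described in section 6.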

When updated:

  1. Streaming: each AgentMessageChunk with _meta replaces the current usage (values are cumulative from the agent).
  2. Prompt response: PromptResponse.Meta is used as a fallback if streaming did not provide usage.
  3. PromptCount is incremented by ACPP on each Prompt() call (not from the agent).

Accumulation model: The agent provides cumulative values. ACPP replaces (not adds) on each update, preserving only PromptCount from the previous state.
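
A minimal sketch of the replace semantics, using a trimmed-down UsageInfo and hypothetical helper names (replaceUsage, recordPrompt) that are not taken from the ACPP source:

```go
package main

import "fmt"

// UsageInfo is trimmed to the fields needed for the illustration.
type UsageInfo struct {
	InputTokens  int64
	OutputTokens int64
	PromptCount  int64
}

// replaceUsage overwrites the stored usage with the agent's cumulative
// values, carrying over only PromptCount, which ACPP owns.
func replaceUsage(prev, fromAgent UsageInfo) UsageInfo {
	next := fromAgent
	next.PromptCount = prev.PromptCount
	return next
}

// recordPrompt stands in for the PromptCount increment on each Prompt() call.
func recordPrompt(u UsageInfo) UsageInfo {
	u.PromptCount++
	return u
}

func main() {
	var u UsageInfo

	u = recordPrompt(u)
	u = replaceUsage(u, UsageInfo{InputTokens: 1000, OutputTokens: 500})

	u = recordPrompt(u)
	// The agent reports running totals, so 2500 already includes the first 1000.
	u = replaceUsage(u, UsageInfo{InputTokens: 2500, OutputTokens: 900})

	fmt.Printf("%+v\n", u)
}
```

After the second update the stored input total is 2500, not 3500, and PromptCount is 2. Adding instead of replacing here would double count.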

OpenCode Backend (HTTP API, OpenCodeSession)

Source: Synchronous POST /session/:id/message response.

Structure (JSON):

{
  "info": {
    "id": "msg_123",
    "sessionID": "sess_abc",
    "role": "assistant",
    "modelID": "claude-opus-4-6",
    "cost": 0.05,
    "tokens": {
      "input": 500,
      "output": 200,
      "reasoning": 0,
      "cache": {
        "read": 100,
        "write": 50
      }
    }
  }
}

When updated:

  1. Prompt response only: updateUsageFromMessage() runs after each Prompt() call.
  2. SSE message.updated: used only for lastModel tracking, not usage.

Accumulation model: The API provides per-prompt deltas. ACPP adds each response's values to the running total.
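
The additive model can be sketched as below. The helper name addMessageUsage is hypothetical (the real logic lives in updateUsageFromMessage()), and mapping cache.write to CacheCreationInputTokens and cache.read to CacheReadInputTokens is an assumption based on the field names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// UsageInfo is trimmed to the fields the OpenCode response can fill.
type UsageInfo struct {
	InputTokens, OutputTokens                      int64
	CacheCreationInputTokens, CacheReadInputTokens int64
	CostUSD                                        float64
}

// ocMessage mirrors the sample response shown above.
type ocMessage struct {
	Info struct {
		ModelID string  `json:"modelID"`
		Cost    float64 `json:"cost"`
		Tokens  struct {
			Input     int64 `json:"input"`
			Output    int64 `json:"output"`
			Reasoning int64 `json:"reasoning"` // decoded but not accumulated
			Cache     struct {
				Read  int64 `json:"read"`
				Write int64 `json:"write"`
			} `json:"cache"`
		} `json:"tokens"`
	} `json:"info"`
}

// addMessageUsage adds one response's per-prompt deltas to the running total.
func addMessageUsage(total UsageInfo, raw []byte) (UsageInfo, error) {
	var msg ocMessage
	if err := json.Unmarshal(raw, &msg); err != nil {
		return total, err
	}
	tok := msg.Info.Tokens
	total.InputTokens += tok.Input
	total.OutputTokens += tok.Output
	total.CacheReadInputTokens += tok.Cache.Read
	total.CacheCreationInputTokens += tok.Cache.Write // assumed mapping
	total.CostUSD += msg.Info.Cost
	return total, nil
}

func main() {
	resp := []byte(`{"info":{"modelID":"claude-opus-4-6","cost":0.05,
		"tokens":{"input":500,"output":200,"reasoning":0,"cache":{"read":100,"write":50}}}}`)
	var total UsageInfo
	for i := 0; i < 2; i++ { // two prompts with identical responses
		var err error
		if total, err = addMessageUsage(total, resp); err != nil {
			panic(err)
		}
	}
	fmt.Printf("%+v\n", total)
}
```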


3. Current Inconsistencies

| Issue | ACP Backend | OpenCode Backend |
| --- | --- | --- |
| Accumulation | Replace (agent gives cumulative) | Add (API gives per-prompt deltas) |
| ContextWindow | Populated from modelUsage | Never populated |
| MaxOutputTokens | Populated from modelUsage | Never populated |
| WebSearchRequests | Populated from modelUsage | Never populated |
| SDKVersion | From InitializeResponse | Never populated |
| reasoning tokens | Not tracked | Available in API (ocTokens.Reasoning) but ignored |
| Model history | Last model only | Last model only |
| CostUSD source | totalCostUsd at section level, fallback to per-model costUSD | Per-message cost field |
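
The ACP cost rule in the last row can be illustrated with a small helper. This is a sketch: the name resolveCost is hypothetical, and treating a zero totalCostUsd as "absent" is an assumption, since the document does not say how a missing value is detected.

```go
package main

import "fmt"

// resolveCost prefers the section-level total and falls back to the sum
// of per-model costs. A zero total is treated as absent (assumption).
func resolveCost(totalCostUSD float64, perModelCostUSD map[string]float64) float64 {
	if totalCostUSD > 0 {
		return totalCostUSD
	}
	var sum float64
	for _, c := range perModelCostUSD {
		sum += c
	}
	return sum
}

func main() {
	// Section-level total present: per-model values are not consulted.
	fmt.Println(resolveCost(0.1234, map[string]float64{"claude-opus-4-6": 0.1234}))

	// No section-level total: sum the per-model costs (placeholder model names).
	fmt.Println(resolveCost(0, map[string]float64{"model-a": 0.5, "model-b": 0.25}))
}
```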

4. Where Usage Data is Stored

Database (session table)

Persisted columns:

| Column | Type | Source field |
| --- | --- | --- |
| model | TEXT | StatusInfo.Model (last model) |
| sdk_version | TEXT | StatusInfo.SDKVersion |
| input_tokens | BIGINT | UsageInfo.InputTokens |
| output_tokens | BIGINT | UsageInfo.OutputTokens |
| cache_creation_input_tokens | BIGINT | UsageInfo.CacheCreationInputTokens |
| cache_read_input_tokens | BIGINT | UsageInfo.CacheReadInputTokens |
| cost_usd | DOUBLE PRECISION | UsageInfo.CostUSD |
| prompt_count | BIGINT | UsageInfo.PromptCount |

Not stored in DB: ContextWindow, MaxOutputTokens, WebSearchRequests.

Written at two points:

  • UpdateSession() — after each prompt completes
  • FinishSession() — when session closes (sets finished_at)

Prompt JSONL files (prompts/ directory)

Per-prompt snapshots: {timestamp, prompt_text, usage_snapshot}. Written after each Prompt() call. Contains full UsageInfo at that point in time.
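
One way such a record could be written as a single JSONL line is sketched below. The top-level keys timestamp, prompt_text, and usage_snapshot come from the description above; the keys inside usage_snapshot, the RFC 3339 timestamp format, and the helper names are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// usageSnapshot is a trimmed UsageInfo; its JSON keys are assumed.
type usageSnapshot struct {
	InputTokens  int64   `json:"input_tokens"`
	OutputTokens int64   `json:"output_tokens"`
	CostUSD      float64 `json:"cost_usd"`
	PromptCount  int64   `json:"prompt_count"`
}

// promptRecord is one line of a prompt JSONL file.
type promptRecord struct {
	Timestamp  string        `json:"timestamp"` // RFC 3339 assumed
	PromptText string        `json:"prompt_text"`
	Usage      usageSnapshot `json:"usage_snapshot"`
}

// encodePromptRecord renders the record as one newline-terminated line.
// json.Marshal escapes newlines inside strings, so the line stays intact.
func encodePromptRecord(rec promptRecord) ([]byte, error) {
	line, err := json.Marshal(rec)
	if err != nil {
		return nil, err
	}
	return append(line, '\n'), nil
}

// decodePromptRecord parses one line back into a record.
func decodePromptRecord(line []byte) (promptRecord, error) {
	var rec promptRecord
	err := json.Unmarshal(line, &rec)
	return rec, err
}

func main() {
	rec := promptRecord{
		Timestamp:  time.Now().UTC().Format(time.RFC3339),
		PromptText: "explain the build failure\nand suggest a fix",
		Usage:      usageSnapshot{InputTokens: 1000, OutputTokens: 500, CostUSD: 0.1234, PromptCount: 1},
	}
	line, err := encodePromptRecord(rec)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(line))
}
```

Because each snapshot holds the running totals at that point, per-prompt usage can be recovered by subtracting consecutive snapshots.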

Session log files (log/ directory)

Raw ACP SessionUpdate JSON objects. Contains the original _meta payloads with full model-level breakdown.

Database log table

Same events as log files, indexed by session_id and event_type.


5. Where Usage Data is Exposed

| Consumer | Fields shown | Fields missing |
| --- | --- | --- |
| Web UI session view | cost, input/output tokens, cache creation, prompts, model, SDK | cache read, context window, max output, web searches |
| Web UI daily stats | sessions, prompts, input/output, cache create/read, cost | model breakdown |
| Web UI monthly stats | same as daily, grouped by directory | model breakdown |
| Console summary (FormatUsageSummary) | prompts, input/output, cache create+read, cost | model, context window, web searches |
| Prometheus metrics | all UsageInfo fields + context window + max output + session status | per-model breakdown |
| JSON API (/api/sessions.json) | prompts, input/output, cache create/read, cost, model, SDK | context window, max output, web searches |

6. Model Information

Currently only a single lastModel string is tracked per session. This means:

  • If a session uses multiple models (e.g. Haiku for tool calls, Opus for reasoning), only the last one seen is recorded.
  • The ACP _meta.modelUsage map contains per-model breakdowns, but ACPP sums them into a flat total and discards the per-model detail (except in raw logs).
  • OpenCode provides modelID per message, but again only the last is kept.

There is no per-model usage breakdown in the database schema or any display layer.