Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions .changeset/wave-3-claude-code-skill.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
---
"tokenometer": minor
"@tokenometer/core": minor
---

Add `@tokenometer/claude-code-skill` — a Claude Code skill that teaches
Claude Code agents to invoke `npx tokenometer` for prompt-cost analysis.
Install via `~/.claude/skills/tokenometer/SKILL.md`. Submission to
community skill registry tracked separately.
11 changes: 11 additions & 0 deletions .changeset/wave-3-latency.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
---
"tokenometer": minor
"@tokenometer/core": minor
---

Add `--latency` flag — measures real generation latency (TTFT + total ms +
tokens/sec, p50/p95/mean over n trials) alongside token cost. Implies
`--empirical`. Default trials = 3, configurable via `--latency-trials <n>`
(1-10). Bumps default `--max-spend` to $0.25 to cover the n × 200-token
generations. Supported providers: Anthropic, OpenAI, Google, Cohere,
Mistral (latter two are metered).
19 changes: 19 additions & 0 deletions .changeset/wave-3-mistral-cohere.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
---
"tokenometer": minor
"@tokenometer/core": minor
---

Add Mistral and Cohere providers.

- Mistral: `mistral-tokenizer-js` for SentencePiece family (Mistral 7B,
Mixtral, Mistral Large 2407, Codestral); `chars/4` heuristic for Tekken
models (NeMo, Pixtral, Mistral Small 2409+, Devstral, Mistral Medium
2505+, Magistral, Ministral). All marked `approximate: true`. Empirical
mode unsupported (Mistral has no public token-count API).
- Cohere: offline heuristic `chars/4` (Cohere SDK is REST-only; no offline
tokenizer ships in JS). Empirical via `POST /v1/tokenize` when
`COHERE_API_KEY` is set.

Pricing for Mistral auto-sourced from `@tokenlens/models/mistral`. Cohere
pricing comes from `LOCAL_OVERRIDES` (`command-r`, `command-r-plus`)
because `@tokenlens/models` does not yet ship a Cohere catalog at v1.3.0.
11 changes: 11 additions & 0 deletions .changeset/wave-3-playground-showcase.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
---
"tokenometer": minor
"@tokenometer/core": minor
---

Playground (`https://tokenometer.vercel.app`) gains showcase pages for
every Wave 2 feature: `/diff`, `/by-file`, `/sarif`, `/vision`,
`/config-builder`, `/init`, `/models` (Cost Atlas + per-model SEO pages),
plus placeholder pages for the VS Code extension and Claude Code skill.
</content>
</invoke>
9 changes: 9 additions & 0 deletions .changeset/wave-3-vscode-extension.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
---
"tokenometer": minor
"@tokenometer/core": minor
---

Add VS Code / Cursor extension (`@tokenometer/vscode`). Status bar shows
live token count + USD cost for the active editor file across Claude,
GPT-4o, and Gemini. Reuses `@tokenometer/core`. Marketplace publish
follows in Phase I.
19 changes: 11 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -58,14 +58,15 @@ The `Approx` column shows `✓` when the count is a proxy (Anthropic / Google of

## Why this exists

`tiktoken` and `@anthropic-ai/tokenizer` give you a token count for one provider. They don't tell you:
**Cost AND latency in one CLI — the only tool that does both.** `tiktoken` and `@anthropic-ai/tokenizer` give you a token count for one provider. They don't tell you:

- What the same prompt costs across **multiple providers and models**
- How **fast** each provider actually responds (TTFT + tokens/sec) — a real generation, not a synthetic benchmark
- Whether **format conversion** (YAML ↔ JSON ↔ XML ↔ MD) actually moves the needle
- The **empirical** cost — what your provider actually charged on a real call, after prompt caching
- Whether a PR introduced a **prompt-cost regression**

Tokenometer is dev-time, multi-provider, multi-format, optionally empirical, and CI-native.
Tokenometer is dev-time, multi-provider, multi-format, optionally empirical, latency-aware, and CI-native.

## Install

Expand Down Expand Up @@ -135,11 +136,13 @@ The CLI also supports `--output json|sarif` for machine-readable output, `--by-f

Tokenometer picks a tokenizer per provider and flags the count as approximate (`approximate: true` in the API result) when the offline path is a proxy:

| Provider | Offline tokenizer | Exactness | Empirical (`--empirical`) |
|-----------|------------------------------|-------------|----------------------------------|
| OpenAI | `gpt-tokenizer` `o200k_base` | exact | same `o200k_base` (matches OpenAI production count) |
| Anthropic | `gpt-tokenizer` `cl100k_base`| approximate | `messages.countTokens` (exact, free) |
| Google | `chars / 4` heuristic | approximate | `model.countTokens` (exact, free) |
| Provider | Offline tokenizer | Exactness | Empirical (`--empirical`) |
|-----------|------------------------------------------------|-------------|----------------------------------|
| OpenAI | `gpt-tokenizer` `o200k_base` | exact | same `o200k_base` (matches OpenAI production count) |
| Anthropic | `gpt-tokenizer` `cl100k_base` | approximate | `messages.countTokens` (exact, free) |
| Google | `chars / 4` heuristic | approximate | `model.countTokens` (exact, free) |
| Mistral | `mistral-tokenizer-js` (V1/V2/V3) · `chars/4` for Tekken | approximate | unsupported (no public token-count endpoint) |
| Cohere | `chars / 4` heuristic | approximate | `POST /v1/tokenize` (exact, free, requires `COHERE_API_KEY`) |

Cost = `tokens / 1000 × per-1k input rate`. Pricing and context windows are sourced from the [`tokenlens`](https://www.npmjs.com/package/tokenlens) registry, with a small set of local overrides for bleeding-edge models the registry hasn't picked up yet — see [`packages/core/src/rates.ts`](packages/core/src/rates.ts) (`RATES_VERSION`).

Expand All @@ -153,7 +156,7 @@ Cost = `tokens / 1000 × per-1k input rate`. Pricing and context windows are sou

## Status

Early. v0.0.x — see [milestones](https://github.com/faraa2m/tokenometer/milestones). Roadmap to v1.0.0 in progress: VS Code extension, Claude Code skill, vision-token cost, Mistral + Cohere providers.
Early. v0.0.x — see [milestones](https://github.com/faraa2m/tokenometer/milestones). Roadmap to v1.0.0 in progress: VS Code extension, Claude Code skill, vision-token cost.

## License

Expand Down
Loading
Loading