faraa2m · faraa2m · May 10, 2026 · May 10, 2026 · May 10, 2026 · May 10, 2026
diff --git a/.changeset/wave-3-claude-code-skill.md b/.changeset/wave-3-claude-code-skill.md
@@ -0,0 +1,9 @@
+---
+"tokenometer": minor
+"@tokenometer/core": minor
+---
+
+Add `@tokenometer/claude-code-skill` — a Claude Code skill that teaches
+Claude Code agents to invoke `npx tokenometer` for prompt-cost analysis.
+Install via `~/.claude/skills/tokenometer/SKILL.md`. Submission to
+community skill registry tracked separately.
diff --git a/.changeset/wave-3-latency.md b/.changeset/wave-3-latency.md
@@ -0,0 +1,11 @@
+---
+"tokenometer": minor
+"@tokenometer/core": minor
+---
+
+Add `--latency` flag — measures real generation latency (TTFT + total ms +
+tokens/sec, p50/p95/mean over n trials) alongside token cost. Implies
+`--empirical`. Default trials = 3, configurable via `--latency-trials <n>`
+(1-10). Bumps default `--max-spend` to $0.25 to cover the n × 200-token
+generations. Supported providers: Anthropic, OpenAI, Google, Cohere,
+Mistral (latter two are metered).
diff --git a/.changeset/wave-3-mistral-cohere.md b/.changeset/wave-3-mistral-cohere.md
@@ -0,0 +1,19 @@
+---
+"tokenometer": minor
+"@tokenometer/core": minor
+---
+
+Add Mistral and Cohere providers.
+
+- Mistral: `mistral-tokenizer-js` for SentencePiece family (Mistral 7B,
+  Mixtral, Mistral Large 2407, Codestral); `chars/4` heuristic for Tekken
+  models (NeMo, Pixtral, Mistral Small 2409+, Devstral, Mistral Medium
+  2505+, Magistral, Ministral). All marked `approximate: true`. Empirical
+  mode unsupported (Mistral has no public token-count API).
+- Cohere: offline heuristic `chars/4` (Cohere SDK is REST-only; no offline
+  tokenizer ships in JS). Empirical via `POST /v1/tokenize` when
+  `COHERE_API_KEY` is set.
+
+Pricing for Mistral auto-sourced from `@tokenlens/models/mistral`. Cohere
+pricing comes from `LOCAL_OVERRIDES` (`command-r`, `command-r-plus`)
+because `@tokenlens/models` does not yet ship a Cohere catalog at v1.3.0.
diff --git a/.changeset/wave-3-playground-showcase.md b/.changeset/wave-3-playground-showcase.md
@@ -0,0 +1,11 @@
+---
+"tokenometer": minor
+"@tokenometer/core": minor
+---
+
+Playground (`https://tokenometer.vercel.app`) gains showcase pages for
+every Wave 2 feature: `/diff`, `/by-file`, `/sarif`, `/vision`,
+`/config-builder`, `/init`, `/models` (Cost Atlas + per-model SEO pages),
+plus placeholder pages for the VS Code extension and Claude Code skill.
+</content>
+</invoke>
diff --git a/.changeset/wave-3-vscode-extension.md b/.changeset/wave-3-vscode-extension.md
@@ -0,0 +1,9 @@
+---
+"tokenometer": minor
+"@tokenometer/core": minor
+---
+
+Add VS Code / Cursor extension (`@tokenometer/vscode`). Status bar shows
+live token count + USD cost for the active editor file across Claude,
+GPT-4o, and Gemini. Reuses `@tokenometer/core`. Marketplace publish
+follows in Phase I.
diff --git a/README.md b/README.md
@@ -58,14 +58,15 @@ The `Approx` column shows `✓` when the count is a proxy (Anthropic / Google of
 
 ## Why this exists
 
-`tiktoken` and `@anthropic-ai/tokenizer` give you a token count for one provider. They don't tell you:
+**Cost AND latency in one CLI — the only tool that does both.** `tiktoken` and `@anthropic-ai/tokenizer` give you a token count for one provider. They don't tell you:
 
 - What the same prompt costs across **multiple providers and models**
+- How **fast** each provider actually responds (TTFT + tokens/sec) — a real generation, not a synthetic benchmark
 - Whether **format conversion** (YAML ↔ JSON ↔ XML ↔ MD) actually moves the needle
 - The **empirical** cost — what your provider actually charged on a real call, after prompt caching
 - Whether a PR introduced a **prompt-cost regression**
 
-Tokenometer is dev-time, multi-provider, multi-format, optionally empirical, and CI-native.
+Tokenometer is dev-time, multi-provider, multi-format, optionally empirical, latency-aware, and CI-native.
 
 ## Install
 
@@ -135,11 +136,13 @@ The CLI also supports `--output json|sarif` for machine-readable output, `--by-f
 
 Tokenometer picks a tokenizer per provider and flags the count as approximate (`approximate: true` in the API result) when the offline path is a proxy:
 
-| Provider  | Offline tokenizer            | Exactness   | Empirical (`--empirical`)        |
-|-----------|------------------------------|-------------|----------------------------------|
-| OpenAI    | `gpt-tokenizer` `o200k_base` | exact       | same `o200k_base` (matches OpenAI production count) |
-| Anthropic | `gpt-tokenizer` `cl100k_base`| approximate | `messages.countTokens` (exact, free) |
-| Google    | `chars / 4` heuristic        | approximate | `model.countTokens` (exact, free) |
+| Provider  | Offline tokenizer                              | Exactness   | Empirical (`--empirical`)        |
+|-----------|------------------------------------------------|-------------|----------------------------------|
+| OpenAI    | `gpt-tokenizer` `o200k_base`                   | exact       | same `o200k_base` (matches OpenAI production count) |
+| Anthropic | `gpt-tokenizer` `cl100k_base`                  | approximate | `messages.countTokens` (exact, free) |
+| Google    | `chars / 4` heuristic                          | approximate | `model.countTokens` (exact, free) |
+| Mistral   | `mistral-tokenizer-js` (V1/V2/V3) · `chars/4` for Tekken | approximate | unsupported (no public token-count endpoint) |
+| Cohere    | `chars / 4` heuristic                          | approximate | `POST /v1/tokenize` (exact, free, requires `COHERE_API_KEY`) |
 
 Cost = `tokens / 1000 × per-1k input rate`. Pricing and context windows are sourced from the [`tokenlens`](https://www.npmjs.com/package/tokenlens) registry, with a small set of local overrides for bleeding-edge models the registry hasn't picked up yet — see [`packages/core/src/rates.ts`](packages/core/src/rates.ts) (`RATES_VERSION`).
 
@@ -153,7 +156,7 @@ Cost = `tokens / 1000 × per-1k input rate`. Pricing and context windows are sou
 
 ## Status
 
-Early. v0.0.x — see [milestones](https://github.com/faraa2m/tokenometer/milestones). Roadmap to v1.0.0 in progress: VS Code extension, Claude Code skill, vision-token cost, Mistral + Cohere providers.
+Early. v0.0.x — see [milestones](https://github.com/faraa2m/tokenometer/milestones). Roadmap to v1.0.0 in progress: VS Code extension, Claude Code skill, vision-token cost.
 
 ## License