
feat: Wave 3 v1.0.0 rollout — VS Code ext, Claude Code skill, Mistral+Cohere, --latency, playground showcase #14

Merged
faraa2m merged 3 commits into main from feat/v1-rollout-wave-3
May 10, 2026

faraa2m (Owner) commented May 10, 2026

Summary

Wave 3 of the v1.0.0 rollout. Five workstreams, all additive — no breaking changes.

What's in here

VS Code / Cursor extension (packages/vscode/, new package)

  • Status bar shows live <model> · <tokens> tok · $<cost> for the active editor file.
  • Settings: tokenometer.model, tokenometer.format, tokenometer.warnOnCostAbove (turns the bar warning-yellow when exceeded).
  • Commands: Tokenometer: switch model (quick-pick across KNOWN_MODELS), Tokenometer: show info.
  • Debounced 200ms recompute on doc change; 1MB byte-size gate; supported file types .md/.txt/.json/.yaml/.yml/.xml.
  • Reuses @tokenometer/core via esbuild bundle. Marketplace publish lands in Wave 4 Phase I.
  • 22 unit tests for the pure helpers (status bar formatter, threshold check, debounce, file gate).
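
The pure helpers listed above could look roughly like the sketch below. This is an illustration only: the function names, the four-decimal cost format, and the strict "greater than" threshold comparison are assumptions, not the extension's actual code.

```typescript
// Hypothetical helpers; names and exact formatting are assumptions of this sketch.

/** Formats "<model> · <tokens> tok · $<cost>" for the status bar. */
function formatStatusBar(model: string, tokens: number, costUsd: number): string {
  return `${model} · ${tokens} tok · $${costUsd.toFixed(4)}`;
}

/** True when a warning threshold is set and the cost is strictly above it. */
function exceedsThreshold(costUsd: number, warnOnCostAbove?: number): boolean {
  return warnOnCostAbove !== undefined && costUsd > warnOnCostAbove;
}

const MAX_BYTES = 1024 * 1024; // the 1MB byte-size gate
const SUPPORTED_EXTENSIONS = [".md", ".txt", ".json", ".yaml", ".yml", ".xml"];

/** File gate: supported extension and at most 1MB of content. */
function shouldCount(fileName: string, byteLength: number): boolean {
  const lower = fileName.toLowerCase();
  return byteLength <= MAX_BYTES && SUPPORTED_EXTENSIONS.some((ext) => lower.endsWith(ext));
}

/** Trailing-edge debounce: only the last call within `waitMs` actually runs. */
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs = 200,
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}
```

Keeping these helpers free of any `vscode` import is what makes them unit-testable without launching an extension host.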

Claude Code skill (packages/claude-code-skill/, new package)

  • SKILL.md with name: tokenometer-cost-check + multi-trigger description.
  • 13-section body teaches Claude Code agents when + how to invoke npx tokenometer: offline + empirical mode, --output json/sarif, --by-file, --image, auto provider detection, .tokenometer.yml, tokenometer init for CI scaffold.
  • One-line install: copy SKILL.md to ~/.claude/skills/tokenometer/SKILL.md. Convenience install.sh included.

Mistral + Cohere providers (@tokenometer/core)

  • Mistral: mistral-tokenizer-js for SentencePiece family (Mistral 7B, Mixtral 8x7B/8x22B, Mistral Large 2407, Codestral 22B); chars/4 heuristic for Tekken-tokenizer models (NeMo, Pixtral, Mistral Small 2409+, Devstral, Mistral Medium 2505+, Magistral, Ministral). All marked approximate: true. Empirical mode unsupported — Mistral has no public token-count API; throws a clear error pointing at the metered chat-completion fallback.
  • Cohere: chars/4 heuristic offline. Empirical via POST /v1/tokenize using bare fetch (Cohere SDK ships only a REST wrapper, no offline tokenizer — saved 4.9 MB install).
  • Pricing: Mistral auto-sourced from @tokenlens/models/mistral (19 models). Cohere via LOCAL_OVERRIDES (command-r, command-r-plus) until tokenlens adds upstream.
  • KNOWN_MODELS count: 42 → 63.
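
As a rough illustration of the chars/4 heuristic used for the Tekken-tokenizer Mistral models and for Cohere offline, a minimal sketch follows. The function names, the round-up behaviour, and the per-million-token pricing shape are assumptions for illustration, not the actual `@tokenometer/core` API.

```typescript
interface TokenEstimate {
  tokens: number;
  approximate: boolean; // heuristic counts are always flagged as approximate
}

/**
 * chars/4 heuristic. Rounding up is an assumption of this sketch.
 * Note: `text.length` counts UTF-16 code units, not Unicode code points.
 */
function estimateTokensByChars(text: string): TokenEstimate {
  return { tokens: Math.ceil(text.length / 4), approximate: true };
}

/** Cost in USD for a token count, given a hypothetical price per million tokens. */
function estimateCostUsd(tokens: number, usdPerMillionTokens: number): number {
  return (tokens / 1e6) * usdPerMillionTokens;
}
```

For example, an 11-character string such as "hello world" is estimated at 3 tokens (11 / 4 = 2.75, rounded up).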

--latency flag (CLI + core)

  • Measures real generation latency: TTFT + total ms + tokens/sec. Reports p50 / p95 / mean over n trials (default 3, configurable 1-10 via --latency-trials).
  • Implies --empirical. Bumps default --max-spend ceiling to $0.25 to cover ~200-token generations.
  • Streamed via SDK for Anthropic + Google; bare fetch SSE/NDJSON for OpenAI + Mistral + Cohere (avoids extra SDK install footprint).
  • Table renderer adds latency columns when present. JSON + SARIF outputs carry the full latency block.
  • Differentiation pitch: the only LLM cost CLI that also reports latency. No competitor (tokencost, tiktoken, gpt-token-counter-live) ships latency.
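
The p50 / p95 / mean aggregation over a handful of trials can be sketched as below. This assumes a nearest-rank percentile, which may differ from what the CLI does (it could interpolate instead); the type and function names are hypothetical.

```typescript
interface LatencyTrial {
  ttftMs: number; // time to first token
  totalMs: number; // total wall-clock time for the generation
  outputTokens: number;
}

interface Summary {
  p50: number;
  p95: number;
  mean: number;
}

/** Nearest-rank percentile (an assumption; other definitions interpolate). */
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no samples");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

/** Summarizes one metric (for example TTFT or total ms) across trials. */
function summarize(values: number[]): Summary {
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  return { p50: percentile(values, 50), p95: percentile(values, 95), mean };
}

/** Generation throughput for a single trial. */
function tokensPerSecond(trial: LatencyTrial): number {
  return trial.outputTokens / (trial.totalMs / 1000);
}
```

With the default of 3 trials, a nearest-rank p95 is simply the slowest trial, so the p95 column only becomes informative as `--latency-trials` approaches its upper bound.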

Playground showcase (packages/web/)

  • 11 new routes via react-router-dom@7: /diff, /by-file, /sarif, /vision, /config-builder, /init, /models (Cost Atlas: sortable + searchable + filter-by-provider), /models/<id> (per-model SEO pages), /editor, /claude-code, plus / and /calculator for the existing landing.
  • Top nav with Tools dropdown; footer linking repo + npm + Marketplace placeholders.
  • Map<Provider, ...> everywhere instead of static Records — playground gracefully extends as more providers land.
  • Static robots.txt + sitemap.xml; per-page document.title via usePageTitle hook (no react-helmet dependency).
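
To illustrate the `Map<Provider, ...>` point above: a `Record` keyed by a fixed provider union must list every provider up front, while a `Map` built from the model list picks up new providers automatically. The sketch below uses hypothetical type and function names, not the playground's actual code.

```typescript
type Provider = string; // kept open-ended in this sketch so new providers need no type change

interface ModelInfo {
  id: string;
  provider: Provider;
}

/** Groups models by provider; providers appear as soon as one of their models is listed. */
function groupByProvider(models: ModelInfo[]): Map<Provider, ModelInfo[]> {
  const groups = new Map<Provider, ModelInfo[]>();
  for (const model of models) {
    const bucket = groups.get(model.provider) ?? [];
    bucket.push(model);
    groups.set(model.provider, bucket);
  }
  return groups;
}
```

A filter-by-provider control can then iterate `groups.keys()` rather than a hard-coded list.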

Why now

  • Plan called for Wave 3 to add the user-facing differentiation: editor presence (E.1), agent-AI awareness (E.2), provider breadth (H), latency (the marketing line), and a playground that demos every shipped feature.
  • All five landed in this PR. v1.0.0 launch can claim every feature it advertises.

Test plan

  • npm run lint — 89 files clean
  • npm run typecheck — clean
  • npm test: 227/227 tests across 20 files (was 150 tests across 16 files before Wave 3)
  • npm run build — clean
  • npm run build -w @tokenometer/web — clean (4.43 MB main JS, 2.01 MB gzipped, larger-than-500kB warning is acknowledged + accepted for now)
  • CLI smoke: SARIF output validates as version: "2.1.0"; --by-file / --output json / --image paths exercised by integration tests
  • Action smoke (post-merge): tokenometer's own prompt-cost.yml runs and the per-file Action comment renders
  • Manual: install the VS Code extension .vsix (built locally via npm run package:vsix --workspace=packages/vscode) and confirm status bar renders. Marketplace publish deferred to Phase I.
  • Manual: cp packages/claude-code-skill/SKILL.md ~/.claude/skills/tokenometer/ then issue a test prompt to confirm Claude Code triggers on it

Out of scope (Wave 4)

  • Phase I unified release pipeline — VS Code Marketplace publish via vsce, Open VSX publish, post-publish smoke test job, Vercel deploy hook, Marketplace verify HTTP probe. Right now the release workflow only handles npm via Changesets.
  • Phase G content posts — drafts are pre-staged in ~/Downloads/tokenometer-content-todo.md; manual posting after Wave 4 is green.

🤖 Generated with Claude Code

…ere, latency, playground showcase

Adds the headline differentiation features for v1.0.0. No breaking changes
to existing flags or output paths.

VS Code / Cursor extension (`packages/vscode/`, new package):
- Status bar shows live token count + USD cost for the active editor file
  across Claude, GPT-4o, and Gemini.
- Settings: model, format, warnOnCostAbove threshold (turns the bar
  background warning-yellow when exceeded).
- Commands: Tokenometer: switch model, Tokenometer: show info.
- Debounced 200ms recompute on document changes; 1MB file-size gate.
- Reuses @tokenometer/core via esbuild bundle. Marketplace publish in
  Wave 4 Phase I.

Claude Code skill (`packages/claude-code-skill/`, new package):
- SKILL.md with `name: tokenometer-cost-check` + multi-trigger description.
- 13-section body teaching Claude Code agents when and how to invoke
  `npx tokenometer` for prompt cost analysis: offline + empirical mode,
  --output json/sarif, --by-file, --image, auto provider detection,
  .tokenometer.yml config, CI guardrail via `tokenometer init`.
- One-line install via `~/.claude/skills/tokenometer/SKILL.md`.

Mistral + Cohere providers (`@tokenometer/core`):
- Mistral: `mistral-tokenizer-js` for SentencePiece family (Mistral 7B,
  Mixtral 8x7B/8x22B, Mistral Large 2407, Codestral 22B); chars/4
  heuristic for Tekken-tokenizer models (NeMo, Pixtral, Mistral Small
  2409+, Devstral, Mistral Medium 2505+, Magistral, Ministral). All
  marked `approximate: true`. Empirical mode unsupported — Mistral has
  no public token-count API.
- Cohere: chars/4 heuristic offline; empirical via POST /v1/tokenize
  using bare fetch (Cohere SDK ships only a REST wrapper, no offline
  tokenizer).
- Pricing: Mistral auto-sourced from @tokenlens/models; Cohere via
  LOCAL_OVERRIDES until tokenlens adds upstream. Total KNOWN_MODELS:
  42 → 63.
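
A minimal sketch of the bare-fetch empirical path for Cohere, for illustration only. The function name and the injected `fetchImpl` parameter are hypothetical; the base URL, the `text` / `model` request fields, and the `tokens` response array reflect the public v1 tokenize endpoint as understood here and should be checked against Cohere's API reference.

```typescript
// Minimal structural type so the sketch does not depend on DOM typings.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

/** Counts tokens via the tokenize endpoint; `fetchImpl` is injected for testability. */
async function countCohereTokens(
  text: string,
  model: string,
  apiKey: string,
  fetchImpl: FetchLike,
): Promise<number> {
  const response = await fetchImpl("https://api.cohere.com/v1/tokenize", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}`, "Content-Type": "application/json" },
    body: JSON.stringify({ text, model }),
  });
  if (!response.ok) throw new Error(`Cohere tokenize failed with HTTP ${response.status}`);
  const payload = (await response.json()) as { tokens?: unknown };
  if (!Array.isArray(payload.tokens)) throw new Error("Unexpected tokenize response shape");
  return payload.tokens.length;
}
```

In production the global `fetch` can be passed as `fetchImpl`; tests can pass a stub so no network call or API key is needed.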

Latency mode (`--latency` flag):
- Measures real generation latency: TTFT (time to first token) + total
  wall-clock + tokens/sec. Reports p50/p95/mean over n trials (default
  3, configurable 1-10 via --latency-trials).
- Implies --empirical. Bumps default --max-spend ceiling to $0.25 to
  cover ~200-token generations.
- Streamed via SDK for Anthropic/Google; bare fetch SSE/NDJSON for
  OpenAI/Mistral/Cohere (no extra SDK install footprint).
- New table columns + JSON/SARIF carry the latency block.
- Differentiation pitch: only LLM cost CLI that also reports latency.

Playground showcase (`packages/web/`):
- 11 new routes: /diff, /by-file, /sarif, /vision, /config-builder,
  /init, /models (Cost Atlas with sort/search/filter), /models/<id>
  (per-model SEO pages), /editor, /claude-code, plus / and /calculator
  for the existing landing.
- react-router-dom 7, top nav with Tools dropdown, footer linking repo +
  npm + marketplace placeholders.
- Cross-page provider robustness: Map<Provider, ...> instead of static
  Records so playground gracefully extends as new providers land.
- Static robots.txt + sitemap.xml; per-page document.title via
  usePageTitle hook (no react-helmet dependency).

Test count: 150 → 227 (across 20 files). Lint, typecheck, root + web
builds all clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel Bot commented May 10, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
tokenometer Ready Ready Preview, Comment May 10, 2026 4:10am

…itions

Wave 3 added 21 new model ids (Mistral × 19 + Cohere × 2), bringing
KNOWN_MODELS from 42 to 63. Each prompt × format × new-model combination
adds a fresh cell to the benchmark sweep — the drift detector caught
210 new cells (10 prompts × 21 models × 1 format check) and rightly
failed CI.

Re-ran `npm run benchmarks:regenerate` to refresh the committed dataset
to 10 × 63 × 5 = 3,150 cells. No existing cells changed; only additions.
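
The cell arithmetic above can be checked with a small sketch of a grid sweep plus a drift check. The function names and the `prompt|model|format` key layout are assumptions for illustration, not the repository's actual benchmark code.

```typescript
/** Builds one key per prompt × model × format cell. */
function enumerateCells(prompts: string[], models: string[], formats: string[]): string[] {
  const cells: string[] = [];
  for (const prompt of prompts) {
    for (const model of models) {
      for (const format of formats) cells.push(`${prompt}|${model}|${format}`);
    }
  }
  return cells;
}

/** Cells expected by the sweep but absent from the committed dataset. */
function findMissingCells(expected: string[], committed: Iterable<string>): string[] {
  const have = new Set(committed);
  return expected.filter((cell) => !have.has(cell));
}

/** Generates placeholder names such as "m0", "m1", ... for the example. */
function names(prefix: string, count: number): string[] {
  return Array.from({ length: count }, (_, i) => `${prefix}${i}`);
}

const fullGrid = enumerateCells(names("p", 10), names("m", 63), names("f", 5));
console.log(fullGrid.length); // 10 × 63 × 5 = 3150
```

Checking a single format, a dataset committed for the original 42 models is missing 10 × 21 = 210 cells once the 21 new models are added, which matches the count the drift detector reported.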

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Action bundles @tokenometer/core via esbuild. Wave 3 added Mistral
and Cohere providers (LOCAL_OVERRIDES entries, tokenize-mistral.ts,
tokenize-cohere.ts), but the bundle wasn't re-emitted, so CI's bundle
sync check rightly failed.

Re-ran `npm run build --workspace=packages/action`. Bundle now reflects
the new providers (+857 / -482 lines, mostly the new tokenize-cohere
heuristic path and the LOCAL_OVERRIDES additions for command-r and
command-r-plus). No source changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@faraa2m faraa2m merged commit 061e0c5 into main May 10, 2026
7 checks passed
@faraa2m faraa2m deleted the feat/v1-rollout-wave-3 branch May 10, 2026 04:16
faraa2m added a commit that referenced this pull request May 11, 2026
…+Cohere, --latency, playground showcase (#14)

* feat: Wave 3 v1.0.0 rollout — VS Code, Claude Code skill, Mistral+Cohere, latency, playground showcase

* chore(benchmarks): regenerate results.json after Mistral + Cohere additions

* chore(action): rebuild dist/index.cjs after Mistral + Cohere additions

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>