feat: Wave 2 v1.0.0 rollout — config, SARIF, vision, per-file diff, auto-detect #13

Merged
faraa2m merged 2 commits into main from feat/v1-rollout-wave-2 on May 10, 2026

faraa2m (Owner) commented May 10, 2026

Summary

Wave 2 of the v1.0.0 rollout. Pure additions across @tokenometer/core, the GitHub Action, and the CLI. No breaking changes.

What's in here

Core lib (@tokenometer/core)

  • loadConfig / parseConfig: .tokenometer.yml loader with walk-up directory search. Halts at filesystem root or git root. Validates models against KNOWN_MODELS, formats against the supported set, budgets as non-negative, and paths as an array of strings.
  • toSarif — SARIF 2.1.0 emitter. One result per (file × model × format) cell with prompt-cost ruleId.
  • anthropicVisionTokens, openaiVisionTokens, googleVisionTokens — documented per-provider formulas for image tokenization.
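
For reviewers, the validation rules listed for parseConfig could look roughly like the sketch below. This is an illustration only, not the code in @tokenometer/core: `validateConfigSketch`, `ValidationContext`, and the error strings are hypothetical names, the model and format sets are passed in rather than guessed, and YAML parsing is assumed to have happened already.

```typescript
// Hypothetical sketch of the validation rules described above.
// The real KNOWN_MODELS list and supported-format set are not reproduced here;
// callers pass them in.
interface ValidationContext {
  knownModels: ReadonlySet<string>;
  supportedFormats: ReadonlySet<string>;
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Returns a list of human-readable errors; an empty list means the config is valid.
function validateConfigSketch(raw: unknown, ctx: ValidationContext): string[] {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    return ["config must be a mapping"];
  }
  const cfg = raw as Record<string, unknown>;
  const errors: string[] = [];

  const models = cfg.models;
  if (models !== undefined) {
    if (!isStringArray(models)) errors.push("models must be an array of strings");
    else for (const m of models) if (!ctx.knownModels.has(m)) errors.push(`unknown model: ${m}`);
  }

  const formats = cfg.formats;
  if (formats !== undefined) {
    if (!isStringArray(formats)) errors.push("formats must be an array of strings");
    else for (const f of formats) if (!ctx.supportedFormats.has(f)) errors.push(`unsupported format: ${f}`);
  }

  const budgets = cfg.budgets;
  if (budgets !== undefined) {
    if (typeof budgets !== "object" || budgets === null || Array.isArray(budgets)) {
      errors.push("budgets must be a mapping");
    } else {
      for (const [name, value] of Object.entries(budgets)) {
        if (typeof value !== "number" || !Number.isFinite(value) || value < 0) {
          errors.push(`budget ${name} must be a non-negative number`);
        }
      }
    }
  }

  if (cfg.paths !== undefined && !isStringArray(cfg.paths)) {
    errors.push("paths must be an array of strings");
  }
  return errors;
}
```

Collecting every error instead of throwing on the first one is a design choice of this sketch; it lets a CLI report all config problems in one run.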

GitHub Action (packages/action)

  • Sticky PR comment now includes a top-N changed-file table with per-file Δ tokens and Δ USD, plus a collapsible "all files" details block.
  • New optional top-n-files input (range 1–20, default 5).
  • Per-file aggregator extracted to per-file-diff.ts for unit tests (11 tests + 4 markdown snapshots).
  • dist/index.cjs rebuilt.
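
A rough sketch of the per-file aggregation idea, for anyone reading without the diff open. The names here (`diffPerFileSketch`, `topChangedSketch`, `clampTopN`) and the ranking by absolute token delta are assumptions for illustration; the real per-file-diff.ts may rank and shape rows differently.

```typescript
// Hypothetical sketch: per-file deltas between base and head, then a top-N cut.
interface FileCost { path: string; tokens: number; usd: number; }
interface FileDelta { path: string; deltaTokens: number; deltaUsd: number; }

// The top-n-files input is described as range 1 to 20 with default 5.
function clampTopN(input?: number): number {
  if (input === undefined || !Number.isFinite(input)) return 5;
  return Math.min(20, Math.max(1, Math.trunc(input)));
}

function indexByPath(files: FileCost[]): Map<string, FileCost> {
  const byPath = new Map<string, FileCost>();
  for (const f of files) byPath.set(f.path, f);
  return byPath;
}

// Files present on only one side count as 0 on the other (added or deleted files).
function diffPerFileSketch(base: FileCost[], head: FileCost[]): FileDelta[] {
  const baseByPath = indexByPath(base);
  const headByPath = indexByPath(head);
  const paths = Array.from(new Set(base.map((f) => f.path).concat(head.map((f) => f.path))));
  const deltas: FileDelta[] = [];
  for (const path of paths) {
    const b = baseByPath.get(path);
    const h = headByPath.get(path);
    const deltaTokens = (h?.tokens ?? 0) - (b?.tokens ?? 0);
    const deltaUsd = (h?.usd ?? 0) - (b?.usd ?? 0);
    // Unchanged files are left out of the changed-file table.
    if (deltaTokens !== 0 || deltaUsd !== 0) deltas.push({ path, deltaTokens, deltaUsd });
  }
  // Assumption: largest absolute token change first, path as a stable tie-breaker.
  return deltas.sort(
    (x, y) => Math.abs(y.deltaTokens) - Math.abs(x.deltaTokens) || x.path.localeCompare(y.path),
  );
}

function topChangedSketch(deltas: FileDelta[], topN?: number): FileDelta[] {
  return deltas.slice(0, clampTopN(topN));
}
```

In this shape the full `deltas` array would feed the collapsible "all files" block, and `topChangedSketch` would feed the summary table.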

CLI (tokenometer)

  • Auto provider detection: when --model is omitted, picks based on which of ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY / GEMINI_API_KEY is set. Stderr note when multiple are present (defaults to claude-opus-4-7).
  • .tokenometer.yml config loading; --no-config to skip; --config <path> to point at a specific file. User flags always override config.
  • --by-file: per-file token/cost attribution table (no-op for single file).
  • --output table|json|sarif: machine-readable output. JSON emits TokenometerResult; SARIF passes through toSarif from core.
  • --image <path> (repeatable): vision-token cost estimation. Each image becomes a virtual file row in the result; provider is inferred from the model.
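
To make the SARIF shape concrete, here is a minimal sketch of a SARIF 2.1.0 document with one result per (file × model × format) cell and a prompt-cost ruleId. The `Cell` type, the `toSarifSketch` name, the `note` level, and the message wording are assumptions; the actual toSarif in core may emit more fields.

```typescript
// Hypothetical sketch of a SARIF 2.1.0 emitter; not the real toSarif.
interface Cell { file: string; model: string; format: string; tokens: number; usd: number; }

function toSarifSketch(cells: Cell[]) {
  return {
    $schema: "https://json.schemastore.org/sarif-2.1.0.json",
    version: "2.1.0",
    runs: [
      {
        tool: {
          driver: {
            name: "tokenometer",
            rules: [{ id: "prompt-cost", shortDescription: { text: "Estimated prompt token cost" } }],
          },
        },
        // One result per (file, model, format) cell.
        results: cells.map((c) => ({
          ruleId: "prompt-cost",
          level: "note", // assumption: informational, not a failure
          message: {
            text: `${c.file}: ${c.tokens} tokens (~$${c.usd.toFixed(4)}) for ${c.model} / ${c.format}`,
          },
          locations: [{ physicalLocation: { artifactLocation: { uri: c.file } } }],
        })),
      },
    ],
  };
}
```

Serialising the return value with `JSON.stringify(doc, null, 2)` gives output that the `jq '.version'` smoke check in the test plan can read.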

Release workflow

  • Chained npm run lint:fix after npx changeset version so Biome compact arrays survive Changesets' package.json rewrites.
  • Direct fix for the lint failure pattern that broke PR #12 (chore(release): version packages). Going forward, the version PR's CI stays green without manual intervention.

Mistral integration research

  • Memo at .planning/research/mistral-integration.md informs Wave 3 Phase H impl: ship mistral-tokenizer-js for SentencePiece family with approximate: true, fall back to char-ratio heuristic for Tekken models (NeMo / Pixtral / Devstral / Mistral Medium 2505+), defer empirical mode (no public token-count endpoint).

Why now

  • Plan said Wave 2 = Phase C (feature wins) + Phase D (vision) + Phase H prep. All landed.
  • The release workflow lint chain wasn't in the original plan but is required to keep the auto-bump pipeline working past Wave 1's keywords additions.

Test plan

  • npm run lint — 61 files clean
  • npm run typecheck — clean
  • npm test: 150/150 across 16 files (was 48/6 before)
  • npm run build — clean
  • Action dist/index.cjs rebuilt and contains the new aggregatePerFileDiff / renderPerFileMarkdown symbols
  • CLI smoke: --output table, --output json | jq '.files[0].path', --output sarif | jq '.version' (returns "2.1.0"), --by-file with multiple input files
  • Action smoke (post-merge): tokenometer's own prompt-cost.yml runs against this PR and sticky comment renders the new per-file table
  • Add a changeset for any follow-up tweaks before merging the resulting Version Packages PR

Out of scope (Wave 3 / 4)

  • VS Code / Cursor extension (E.1)
  • Claude Code skill (E.2 — confirmed this round)
  • Playground showcase pages for the new features (F)
  • --latency flag
  • Mistral + Cohere implementation (per memo)
  • Unified release pipeline phase I (Marketplace verify, VS Code publish, smoke test job)

🤖 Generated with Claude Code

…uto-detect

Adds the first batch of feature work toward v1.0.0. Pure additions; no
breaking changes to existing flags or output.

Core lib (`@tokenometer/core`):
- `loadConfig` / `parseConfig` for `.tokenometer.yml` (walk-up; halts at
  filesystem root or git root). Validates models against KNOWN_MODELS and
  formats against the supported set.
- `toSarif` for SARIF 2.1.0 emission (consumed by CLI `--output sarif` and
  by `gh code-scanning`).
- `anthropicVisionTokens`, `openaiVisionTokens`, `googleVisionTokens` —
  documented per-provider formulas; ignores `detail` where the provider
  doesn't expose it.
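
For context, the per-provider formulas can be sketched as below. These follow the publicly documented rules as I understand them and are simplified: the function names carry a `Sketch` suffix to mark them as illustrations, Anthropic's resizing of oversized images is omitted, the OpenAI rule assumes images are never upscaled, and the Google tiling is a coarse approximation rather than the exact crop logic. Only the OpenAI sketch takes `detail`, matching the note above.

```typescript
// Hypothetical sketches of the documented image-token rules; not the core implementation.

// Anthropic: approximately (width * height) / 750 tokens.
// Resizing of very large images before counting is omitted here.
function anthropicVisionTokensSketch(width: number, height: number): number {
  return Math.ceil((width * height) / 750);
}

// OpenAI tile-based rule: low detail is a flat 85 tokens; high detail is
// 85 base tokens plus 170 per 512px tile after two downscaling steps.
function openaiVisionTokensSketch(
  width: number,
  height: number,
  detail: "low" | "high" = "high",
): number {
  if (detail === "low") return 85;
  // Step 1: fit within a 2048 x 2048 square.
  const fit = Math.min(1, 2048 / Math.max(width, height));
  let w = width * fit;
  let h = height * fit;
  // Step 2: bring the shortest side down to 768 (assumption: never upscale).
  const shrink = Math.min(1, 768 / Math.min(w, h));
  w *= shrink;
  h *= shrink;
  const tiles = Math.ceil(w / 512) * Math.ceil(h / 512);
  return 85 + 170 * tiles;
}

// Google: 258 tokens for small images; larger images are approximated here
// as 258 tokens per 768 x 768 tile.
function googleVisionTokensSketch(width: number, height: number): number {
  if (width <= 384 && height <= 384) return 258;
  return 258 * Math.ceil(width / 768) * Math.ceil(height / 768);
}
```

For example, a 1024 × 1024 image at high detail shrinks to 768 × 768, covers four 512px tiles, and costs 85 + 4 × 170 = 765 tokens under the OpenAI rule.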

GitHub Action (`packages/action`):
- Sticky PR comment now includes a top-N changed-file table with per-file
  Δ tokens and Δ USD plus a collapsible "all files" details block.
- New optional `top-n-files` input (1-20, default 5).
- Per-file aggregator extracted to `per-file-diff.ts` for unit tests.
- Action `dist/index.cjs` rebuilt.

CLI (`tokenometer`):
- Auto provider detection: when `--model` is omitted, picks based on which
  of `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GOOGLE_API_KEY`/`GEMINI_API_KEY`
  is set. Stderr note when multiple are present.
- `.tokenometer.yml` config loading; `--no-config` to skip; `--config <path>`
  to point at a specific file. User flags always win.
- `--by-file` per-file token/cost attribution table (no-op for single file).
- `--output table|json|sarif` for machine-readable output.
- `--image <path>` (repeatable) for vision-token cost across providers.
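
The auto-detection rule can be sketched as a pure function over the environment. This is one reading of the behaviour described in this PR, not the CLI source: `detectProviderSketch` and the `Detection` shape are hypothetical, and the only model name used is the claude-opus-4-7 fallback mentioned in the PR description for the multiple-keys case.

```typescript
// Hypothetical sketch of provider auto-detection when --model is omitted.
type Provider = "anthropic" | "openai" | "google";

interface Detection {
  provider: Provider | null;
  model?: string; // only set for the multiple-keys fallback
  note?: string;  // intended for stderr
}

function detectProviderSketch(env: Record<string, string | undefined>): Detection {
  const found: Provider[] = [];
  if (env.ANTHROPIC_API_KEY) found.push("anthropic");
  if (env.OPENAI_API_KEY) found.push("openai");
  // Either Google variable counts as a Google key.
  if (env.GOOGLE_API_KEY || env.GEMINI_API_KEY) found.push("google");

  if (found.length === 0) return { provider: null };
  if (found.length === 1) return { provider: found[0] ?? null };

  // Several keys present: fall back to the documented default and say so.
  return {
    provider: "anthropic",
    model: "claude-opus-4-7",
    note: `multiple API keys set (${found.join(", ")}); defaulting to claude-opus-4-7`,
  };
}
```

A caller would pass the process environment, write `note` to stderr when it is present, and skip detection entirely when `--model` is given.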

Release workflow:
- Chained `npm run lint:fix` after `changeset version` so Biome compact
  arrays survive Changesets' package.json rewrites — fixes the lint failure
  pattern seen on PR #12.

Mistral integration research memo at `.planning/research/mistral-integration.md`
informs Wave 3 Phase H impl: ship `mistral-tokenizer-js` for SentencePiece
family with `approximate: true`, fall back to char-ratio heuristic for Tekken
models, defer empirical mode (no public token-count endpoint).

Test count: 102 → 150 (across 16 files). Lint, typecheck, build all clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel Bot commented May 10, 2026

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| tokenometer | Ready | Preview, Comment | May 10, 2026 3:14am |

fix(core): defer node:fs/path imports in loadConfig for browser builds

The static `import { existsSync } from 'node:fs'` etc. at the top of
config.ts caused Vite (used by @tokenometer/web) to externalize the
node:* modules for browser builds, then Rollup failed because the
externalized stub doesn't expose named members like `join`.

Move the node:fs / node:fs/promises / node:path imports inside
loadConfig as dynamic imports. Vite stops static-tracing them, so the
browser build succeeds. Node still resolves them normally at runtime.

parseConfig is unchanged — it was already pure.
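
A minimal sketch of the pattern, assuming a Node runtime. `findConfigPathSketch` is a hypothetical helper, not the actual loadConfig; it only shows the node:* modules being imported inside the async function body, combined with the walk-up search that stops at a git root or the filesystem root.

```typescript
// Hypothetical sketch: node:* modules are loaded lazily inside the function,
// so nothing Node-specific appears in the module's static import list.
async function findConfigPathSketch(
  startDir: string,
  fileName = ".tokenometer.yml",
): Promise<string | null> {
  const { existsSync } = await import("node:fs");
  const { dirname, join } = await import("node:path");

  let dir = startDir;
  for (;;) {
    const candidate = join(dir, fileName);
    if (existsSync(candidate)) return candidate;
    // Stop at the git root: do not search above the repository.
    if (existsSync(join(dir, ".git"))) return null;
    const parent = dirname(dir);
    // dirname of the filesystem root returns the root itself.
    if (parent === dir) return null;
    dir = parent;
  }
}
```

Pure helpers such as parseConfig can stay in the same file with no imports, so the browser bundle only ever evaluates code that has no Node dependency.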

Fixes the test/build CI step + Vercel preview deploy on PR #13.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

faraa2m merged commit cc745d5 into main May 10, 2026
7 checks passed
faraa2m deleted the feat/v1-rollout-wave-2 branch May 10, 2026 03:37
faraa2m added a commit that referenced this pull request May 11, 2026
…uto-detect (#13)

* feat: Wave 2 v1.0.0 rollout — config, SARIF, vision, per-file diff, auto-detect

* fix(core): defer node:fs/path imports in loadConfig for browser builds

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>