feat: Wave 2 v1.0.0 rollout — config, SARIF, vision, per-file diff, auto-detect#13
Merged
…uto-detect

Adds the first batch of feature work toward v1.0.0. Pure additions; no breaking changes to existing flags or output.

Core lib (`@tokenometer/core`):
- `loadConfig` / `parseConfig` for `.tokenometer.yml` (walk-up; halts at filesystem root or git root). Validates models against KNOWN_MODELS and formats against the supported set.
- `toSarif` for SARIF 2.1.0 emission (consumed by CLI `--output sarif` and by `gh code-scanning`).
- `anthropicVisionTokens`, `openaiVisionTokens`, `googleVisionTokens`: documented per-provider formulas; ignores `detail` where the provider doesn't expose it.

GitHub Action (`packages/action`):
- Sticky PR comment now includes a top-N changed-file table with per-file Δ tokens and Δ USD plus a collapsible "all files" details block.
- New optional `top-n-files` input (1-20, default 5).
- Per-file aggregator extracted to `per-file-diff.ts` for unit tests.
- Action `dist/index.cjs` rebuilt.

CLI (`tokenometer`):
- Auto provider detection: when `--model` is omitted, picks based on which of `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GOOGLE_API_KEY`/`GEMINI_API_KEY` is set. Stderr note when multiple are present.
- `.tokenometer.yml` config loading; `--no-config` to skip; `--config <path>` to point at a specific file. User flags always win.
- `--by-file` per-file token/cost attribution table (no-op for single file).
- `--output table|json|sarif` for machine-readable output.
- `--image <path>` (repeatable) for vision-token cost across providers.

Release workflow:
- Chained `npm run lint:fix` after `changeset version` so Biome compact arrays survive Changesets' package.json rewrites; fixes the lint failure pattern seen on PR #12.

Mistral integration research memo at `.planning/research/mistral-integration.md` informs Wave 3 Phase H impl: ship `mistral-tokenizer-js` for SentencePiece family with `approximate: true`, fall back to char-ratio heuristic for Tekken models, defer empirical mode (no public token-count endpoint).

Test count: 102 → 150 (across 16 files). Lint, typecheck, build all clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
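The auto provider detection described in the CLI notes can be pictured roughly as follows. This is a minimal sketch, not the CLI's actual code: the environment variable names come from the commit message, while `detectProvider`, the `Detection` shape, and the order in which providers are checked are assumptions made for illustration.

```typescript
type Provider = "anthropic" | "openai" | "google";

// Env var names are taken from the commit message.
// ASSUMPTION: the check order (anthropic, openai, google) is illustrative only.
const KEY_VARS: ReadonlyArray<readonly [Provider, readonly string[]]> = [
  ["anthropic", ["ANTHROPIC_API_KEY"]],
  ["openai", ["OPENAI_API_KEY"]],
  ["google", ["GOOGLE_API_KEY", "GEMINI_API_KEY"]],
];

// Hypothetical result shape: the chosen provider plus a flag the caller
// can use to decide whether to print a note on stderr.
interface Detection {
  provider: Provider | undefined;
  ambiguous: boolean;
}

function detectProvider(env: Record<string, string | undefined>): Detection {
  // A provider counts as present when any of its variables is non-empty.
  const present = KEY_VARS
    .filter(([, names]) => names.some((n) => (env[n] ?? "").trim() !== ""))
    .map(([provider]) => provider);
  return { provider: present[0], ambiguous: present.length > 1 };
}
```

A caller would pass `process.env`, fall back to this result only when `--model` is absent, and write a note to stderr when `ambiguous` is true.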
fix(core): defer node:fs/path imports in loadConfig for browser builds

The static `import { existsSync } from 'node:fs'` etc. at the top of
config.ts caused Vite (used by @tokenometer/web) to externalize the
node:* modules for browser builds, then Rollup failed because the
externalized stub doesn't expose named members like `join`.

Move the node:fs / node:fs/promises / node:path imports inside
loadConfig as dynamic imports. Vite stops static-tracing them, so the
browser build succeeds. Node still resolves them normally at runtime.

parseConfig is unchanged; it was already pure.

Fixes the test/build CI step + Vercel preview deploy on PR #13.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
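The deferred-import pattern this commit describes might look something like the sketch below. It is not the real `loadConfig`: `findConfigPath` is a hypothetical helper that only locates the file, and treating "a directory containing `.git`" as the git root is an assumption about how the walk-up halts.

```typescript
// Hypothetical helper illustrating the pattern; not the actual loadConfig.
async function findConfigPath(
  startDir: string,
  fileName = ".tokenometer.yml",
): Promise<string | undefined> {
  // Dynamic imports are resolved when the function runs, so the module has
  // no static top-level node:* imports for a bundler to trace.
  const { existsSync } = await import("node:fs");
  const { dirname, join, resolve } = await import("node:path");

  let dir = resolve(startDir);
  for (;;) {
    const candidate = join(dir, fileName);
    if (existsSync(candidate)) return candidate;
    // ASSUMPTION: a directory that contains .git is the git root; stop there.
    if (existsSync(join(dir, ".git"))) return undefined;
    const parent = dirname(dir);
    // At the filesystem root, dirname() returns its input unchanged.
    if (parent === dir) return undefined;
    dir = parent;
  }
}
```

Code that never calls the function, such as a browser bundle that only uses `parseConfig`, never evaluates the `node:*` imports.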
faraa2m
added a commit
that referenced
this pull request
May 11, 2026
Summary

Wave 2 of the v1.0.0 rollout. Pure additions across `@tokenometer/core`, the GitHub Action, and the CLI. No breaking changes.

What's in here

Core lib (`@tokenometer/core`)
- `loadConfig` / `parseConfig`: `.tokenometer.yml` loader with walk-up directory search. Halts at filesystem root or git root. Validates `models` against `KNOWN_MODELS`, `formats` against the supported set, `budgets` non-negative, `paths` array-of-strings.
- `toSarif`: SARIF 2.1.0 emitter. One result per (file × model × format) cell with `prompt-cost` ruleId.
- `anthropicVisionTokens`, `openaiVisionTokens`, `googleVisionTokens`: documented per-provider formulas for image tokenization.

GitHub Action (`packages/action`)
- Sticky PR comment now includes a top-N changed-file table with per-file Δ tokens and Δ USD plus a collapsible "all files" details block.
- New optional `top-n-files` input (range 1–20, default 5).
- Per-file aggregator extracted to `per-file-diff.ts` for unit tests (11 tests + 4 markdown snapshots).
- Action `dist/index.cjs` rebuilt.

CLI (`tokenometer`)
- Auto provider detection: when `--model` is omitted, picks based on which of `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GOOGLE_API_KEY`/`GEMINI_API_KEY` is set. Stderr note when multiple are present (defaults to `claude-opus-4-7`).
- `.tokenometer.yml` config loading; `--no-config` to skip; `--config <path>` to point at a specific file. User flags always override config.
- `--by-file`: per-file token/cost attribution table (no-op for single file).
- `--output table|json|sarif`: machine-readable output. JSON emits `TokenometerResult`; SARIF passes through `toSarif` from core.
- `--image <path>` (repeatable): vision-token cost estimation. Each image becomes a virtual file row in the result; provider is inferred from the model.

Release workflow
- Chained `npm run lint:fix` after `npx changeset version` so Biome compact arrays survive Changesets' `package.json` rewrites.

Mistral integration research
- Memo at `.planning/research/mistral-integration.md` informs Wave 3 Phase H impl: ship `mistral-tokenizer-js` for SentencePiece family with `approximate: true`, fall back to char-ratio heuristic for Tekken models (NeMo / Pixtral / Devstral / Mistral Medium 2505+), defer empirical mode (no public token-count endpoint).

Why now
- `keywords` additions.

Test plan
- `npm run lint`: 61 files clean
- `npm run typecheck`: clean
- `npm test`: 150/150 across 16 files (was 48/6 before)
- `npm run build`: clean
- `dist/index.cjs` rebuilt and contains the new `aggregatePerFileDiff` / `renderPerFileMarkdown` symbols
- `--output table`, `--output json | jq '.files[0].path'`, `--output sarif | jq '.version'` (returns `"2.1.0"`), `--by-file` with multiple input files
- `prompt-cost.yml` runs against this PR and sticky comment renders the new per-file table

Out of scope (Wave 3 / 4)
- `--latency` flag

🤖 Generated with Claude Code
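To make the SARIF output described above more concrete, here is a rough sketch of a SARIF 2.1.0 log with one result per (file × model × format) cell and a `prompt-cost` rule ID. It is not the `toSarif` implementation from `@tokenometer/core`: the `CostCell` fields, the `toSarifSketch` name, the `note` level, and the message wording are all assumptions; only the SARIF envelope (`version`, `runs`, `tool.driver`, `results`) follows the published format.

```typescript
// Hypothetical input shape; the real TokenometerResult may differ.
interface CostCell {
  file: string;
  model: string;
  format: string;
  tokens: number;
  usd: number;
}

function toSarifSketch(cells: CostCell[]) {
  return {
    $schema: "https://json.schemastore.org/sarif-2.1.0.json",
    version: "2.1.0" as const,
    runs: [
      {
        // ASSUMPTION: the tool name is illustrative; the rule id comes from the PR text.
        tool: { driver: { name: "tokenometer", rules: [{ id: "prompt-cost" }] } },
        // One SARIF result per (file x model x format) cell.
        results: cells.map((c) => ({
          ruleId: "prompt-cost",
          level: "note" as const,
          message: {
            text: `${c.file}: ${c.tokens} tokens (~$${c.usd.toFixed(4)}) for ${c.model} / ${c.format}`,
          },
          locations: [{ physicalLocation: { artifactLocation: { uri: c.file } } }],
        })),
      },
    ],
  };
}
```

Serializing the return value with `JSON.stringify` gives a document whose top-level `version` field is `"2.1.0"`, matching the `jq '.version'` check in the test plan.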