feat(skill): perf-trend render bench history report across recent main CI runs by dryotta · Pull Request #396 · dryotta/mdownreview

dryotta · 2026-05-16T18:28:49Z

What

Adds a new perf-trend skill plus the supporting scripts/perf-trend.mjs helper and npm run perf:trend wrapper. The skill pulls the bench-output artifact from recent successful ci.yml runs on main, parses the criterion bencher output, joins it with each run's commit SHA + PR title, and renders a markdown trend report.
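The parsing step can be sketched roughly as follows. This is not the code in scripts/perf-trend.mjs; it assumes the artifact contains libtest-style bencher lines of the form `test <name> ... bench: <n> ns/iter (+/- <dev>)`, and the function name `parseBencherOutput` is hypothetical.

```javascript
// Minimal sketch, assuming bencher-style lines such as:
//   test hot_path/get_file_comments_large ... bench:    70123456 ns/iter (+/- 1200345)
// `parseBencherOutput` is a hypothetical name, not the helper's real API.
const BENCH_LINE =
  /^test\s+(\S+)\s+\.\.\.\s+bench:\s+([\d,]+)\s+ns\/iter\s+\(\+\/-\s+([\d,]+)\)/;

function parseBencherOutput(text) {
  const results = [];
  for (const line of text.split(/\r?\n/)) {
    const match = BENCH_LINE.exec(line.trim());
    if (!match) continue; // skip compiler noise, panic messages, blank lines
    results.push({
      name: match[1],
      // Some toolchains print thousands separators; strip them before converting.
      nsPerIter: Number(match[2].replace(/,/g, "")),
      deviation: Number(match[3].replace(/,/g, "")),
    });
  }
  return results;
}
```

Lines that do not match are dropped rather than treated as errors, so a harness that panics partway through simply yields fewer records.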

Why

The bench job's per-run job-summary table shows only latest vs budget. It tells you nothing about:

whether a number is moving over time,
whether a regression is new or chronic,
whether other budgeted benches are silently missing because an earlier harness in cargo bench panicked.

The perf-trend skill answers all three on demand.

Report sections

**⚠ Missing budgeted benches**: names from the workflow's BUDGETS table that produced no output in the window. The CI summary step is silent about these; surfacing them is the headline reason this skill exists.
Summary: per-bench row with latest, budget, window median, Δ vs median, status (within / over budget), and an ASCII sparkline (oldest to newest).
Movers: top regressions / improvements (at least 10% off median, at least 3 samples), each citing the commit SHA + PR title of the latest run.
Per-bench history: <details> block with the full series.
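One way to compute a Summary row and the Movers test is sketched below. It is an illustration, not the renderer in the PR: the function names, the ASCII ramp, and the reading of the thresholds as "at least 10%" and "at least 3 samples" are all assumptions. `series` is assumed non-empty and ordered oldest to newest.

```javascript
// Hypothetical ASCII ramp, lowest to highest value in the window.
const SPARK_CHARS = "_.:-=+*#";

function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Map each sample onto the ramp, scaled between the window min and max.
function sparkline(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const span = max - min;
  return values
    .map((v) => {
      const idx =
        span === 0 ? 0 : Math.round(((v - min) / span) * (SPARK_CHARS.length - 1));
      return SPARK_CHARS[idx];
    })
    .join("");
}

// Build one summary row for a bench; all values are in ns/iter.
function summarize(name, series, budgetNs) {
  const latest = series[series.length - 1];
  const med = median(series);
  const deltaPct = ((latest - med) / med) * 100;
  return {
    name,
    latest,
    budgetNs,
    median: med,
    deltaPct,
    overBudget: latest > budgetNs,
    // Assumed mover rule: enough samples and far enough from the median.
    isMover: series.length >= 3 && Math.abs(deltaPct) >= 10,
    spark: sparkline(series),
  };
}
```

Comparing against the window median rather than the previous run keeps a single noisy sample from flagging a mover on its own.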

What it found on current `main`

Running on the 14-day window already surfaces two latent issues the per-run CI summary has been silent about:

hot_path/get_file_comments_large has been ~3-4x over its 20 ms budget (~70-83 ms) for at least 8 consecutive runs.
matching_bench panics with "yaml anchors/aliases not allowed in sidecars" on every run, short-circuiting cargo bench and dropping 4 of 5 budgeted benches from the output (matching/50_comments_1000_lines, fold_regions/large_100kb_jsonlike/default, parse_kql/pipeline_50_steps, strip_json_comments/large_100kb).

Both are pre-existing repo issues outside the scope of this PR, but the report makes them impossible to miss going forward.
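Both findings reduce to simple checks over the parsed runs. The sketch below shows one possible shape for them. The function names and data layout (a `Map` of budgets, an array of per-run result arrays) are assumptions for illustration, not the script's actual internals.

```javascript
// Sketch only: `budgets` is a Map of bench name -> budget in ns, and
// `runs` is an array (one entry per CI run) of parsed result arrays.
function findMissingBenches(budgets, runs) {
  const seen = new Set();
  for (const run of runs) {
    for (const result of run) seen.add(result.name);
  }
  // Budgeted names that never appeared anywhere in the window.
  return [...budgets.keys()].filter((name) => !seen.has(name)).sort();
}

// Count how many of the most recent samples in a row exceed the budget.
// `series` is ordered oldest to newest.
function consecutiveOverBudget(series, budgetNs) {
  let count = 0;
  for (let i = series.length - 1; i >= 0 && series[i] > budgetNs; i--) count++;
  return count;
}
```

A bench counts as missing only if it is absent from every run in the window, which is the pattern a panic on every run would produce.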

Design

Budgets parsed live from .github/workflows/ci.yml: a single source of truth; if the workflow's BUDGETS table changes, the script follows.
14-day default window matches the bench-output artifact retention.
--summary flag appends the report to \ so a future scheduled-workflow caller can post the same renderer's output to a job page without rewriting it.
Read-only: never writes to the repo, never opens issues. Acting on findings is for the user or a follow-on skill (per user direction during design).
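Reading budgets straight from the workflow text might look like the sketch below. The exact layout of the BUDGETS table is not shown in this PR, so the code assumes bash associative-array entries of the form `["name"]=value` somewhere after a `declare -A BUDGETS` line, with values in nanoseconds. `parseBudgets` is a hypothetical name.

```javascript
// Minimal sketch, assuming the workflow embeds a bash table such as:
//   declare -A BUDGETS=(
//     ["hot_path/get_file_comments_large"]=20000000
//   )
// Both the entry syntax and the ns unit are assumptions.
function parseBudgets(workflowText) {
  const budgets = new Map();
  const start = workflowText.indexOf("declare -A BUDGETS");
  if (start === -1) return budgets; // no table found: report no budgets

  // Matches [name]=123 with optional quotes around the key and the value.
  const entry = /\[\s*["']?([^"'\]]+?)["']?\s*\]\s*=\s*["']?(\d+)["']?/g;
  for (const match of workflowText.slice(start).matchAll(entry)) {
    budgets.set(match[1], Number(match[2]));
  }
  return budgets;
}
```

Because the scan runs from the declaration to the end of the file, a real implementation would probably want to stop at the end of the table; the sketch leaves that out for brevity.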

Verified

npm run lint:skills (11 skills now, all npm run references resolve)
npm run lint:markdown-surfaces
npx eslint scripts/perf-trend.mjs
node scripts/perf-trend.mjs --help
node scripts/perf-trend.mjs --days 14 --runs 10 (renders full report with Missing-benches section)
node scripts/perf-trend.mjs --summary with \ set (writes 35 lines to the summary file)

…n CI runs

Adds a new perf-trend skill (and supporting scripts/perf-trend.mjs + `npm run perf:trend`) that pulls the bench-output artifacts from recent successful ci.yml runs on main, parses criterion bencher output, and renders a markdown trend report.

Why
---
The bench job's per-run summary table (in the GitHub job summary) shows only the latest measurement vs budget. It tells you nothing about whether the value is moving, whether the regression is new, or whether other budgeted benches are silently missing from the output. This skill answers all three.

Output sections
---------------
1. **⚠ Missing budgeted benches**: flags any bench named in the workflow's declare -A BUDGETS block that produced no output in the window. Root cause is usually an earlier-running bench harness panicking and short-circuiting `cargo bench` for the rest of the suite. The existing CI summary is silent about this; surfacing it is the headline reason this skill exists.
2. **Summary**: per-bench row with latest, budget, window median, % delta vs median, status (within / over budget), and an ASCII sparkline (left=oldest, right=newest).
3. **Movers**: top regressions / improvements (at least 10% off median, at least 3 samples), each citing the commit SHA + PR title of the latest run.
4. **Per-bench history**: <details> block with the full (date, commit, ns/iter, dev, title) series.

Validation against current main
-------------------------------
Running `npm run perf:trend` on the 14-day window surfaced two latent issues that the per-run CI summary has been silent about:
- `hot_path/get_file_comments_large` has been ~3-4x over its 20 ms budget for at least 8 consecutive runs (~70-83 ms).
- `matching_bench` panics with *"yaml anchors/aliases not allowed in sidecars"* on every run, which short-circuits `cargo bench` and drops 4 of 5 budgeted benches from the output: `matching/50_comments_1000_lines`, `fold_regions/large_100kb_jsonlike/default`, `parse_kql/pipeline_50_steps`, `strip_json_comments/large_100kb`.

Both are pre-existing repo issues; fixing them is out of scope for this PR, but the report makes them impossible to miss next time.

Design notes
------------
- Budgets parsed live from .github/workflows/ci.yml (single source of truth; updates to the workflow's BUDGETS table flow through automatically).
- 14-day window matches the bench-output artifact retention.
- --summary flag appends to `\` so a future scheduled-workflow caller can post the same report to a job page without rewriting the renderer.
- Read-only: never writes to the repo, never opens issues. Acting on findings is for the user or a follow-on skill.

Verified
--------
- `npm run lint:skills` (11 skills now, all references resolve)
- `npm run lint:markdown-surfaces`
- `npx eslint scripts/perf-trend.mjs`
- `node scripts/perf-trend.mjs --help`
- `node scripts/perf-trend.mjs --days 14 --runs 10` renders the full report
- `node scripts/perf-trend.mjs --summary` with `\` set writes 35 lines to the summary file

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

dryotta merged commit f8b6641 into main May 16, 2026
15 checks passed

dryotta deleted the chore/perf-trend-skill branch May 16, 2026 18:38

dryotta merged 1 commit into main from chore/perf-trend-skill
