chore(benchmarks): refresh stale perf baseline #5081

Merged
proggeramlug merged 1 commit into main from chore/refresh-perf-baseline on Jun 13, 2026
@TheHypnoo TheHypnoo commented Jun 13, 2026

Member

What

Regenerate benchmarks/baseline.json, which had gone badly stale.

The previous baseline (53f0f29, 2026-06-09) recorded 100x–3000x array/numeric
regressions that no longer reproduce after the intervening codegen work — the
regression gate was comparing against fantasy numbers:

| Row | Stale baseline | Current | vs Node |
|---|---|---|---|
| 03_array_write | 4143 ms | 3 ms | Perry wins |
| 04_array_read | 4067 ms | 167 ms | 15× |
| 16_matrix_multiply | 10221 ms | 671 ms | 17× |
| 10_nested_loops | 4579 ms | 306 ms | 16× |
| bench_numeric_array_numeric | 3546 ms | 248 ms | 50× |
| bench_numeric_array_downgrade | 21902 ms | 4690 ms | 781× |
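
To make the "fantasy numbers" problem concrete, here is a minimal sketch of a regression gate. The function name, the `Timings` type, and the 1.25 tolerance are assumptions for illustration; they are not taken from `benchmarks/compare.sh`. Only the two `04_array_read` timings come from the table above, and the "slower run" is invented.

```typescript
// Hypothetical gate: flag a benchmark when its current time exceeds the
// baseline by more than a tolerance factor. The names and the 1.25 default
// are illustrative assumptions, not the behavior of benchmarks/compare.sh.
type Timings = Record<string, number>; // benchmark name -> milliseconds

function findRegressions(
  baseline: Timings,
  current: Timings,
  tolerance = 1.25,
): string[] {
  const regressed: string[] = [];
  for (const [name, baseMs] of Object.entries(baseline)) {
    const nowMs = current[name];
    // Skip benchmarks that were not measured in the current run.
    if (nowMs !== undefined && nowMs > baseMs * tolerance) {
      regressed.push(name);
    }
  }
  return regressed;
}

// Stale and refreshed values for 04_array_read, as listed in the PR table.
const staleBaseline: Timings = { "04_array_read": 4067 };
const freshBaseline: Timings = { "04_array_read": 167 };

// An invented future run that is 3x slower than today's 167 ms.
const slowerRun: Timings = { "04_array_read": 501 };

// Against the stale baseline the slowdown goes unnoticed (501 < 4067 * 1.25).
const missedByStale = findRegressions(staleBaseline, slowerRun);
// Against the refreshed baseline it is flagged (501 > 167 * 1.25).
const caughtByFresh = findRegressions(freshBaseline, slowerRun);
```

Under these assumptions, a 3x slowdown passes silently against the stale baseline and is only caught once the baseline is refreshed.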

How

Regenerated on the current commit via:

./benchmarks/compare.sh --update-baseline --full --runs 5

(macOS host, Node 24, median of 5). Only benchmarks/baseline.json changes.
A provenance note recording the refresh and the runner-variance caveat is
included in the file's notes field.
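
As a rough illustration of why a median of 5 runs is a reasonable choice on a noisy host, the sketch below compares the median and the mean of one sample set. The `median` helper and the sample timings are invented for this example; how `compare.sh` aggregates runs internally is not shown in this PR.

```typescript
// Median of a non-empty list of timings. Illustrative helper only; it is
// not the aggregation code used by benchmarks/compare.sh.
function median(samples: number[]): number {
  if (samples.length === 0) {
    throw new Error("median of an empty sample set is undefined");
  }
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Five invented timings in milliseconds, one of them a noisy outlier.
const runsMs = [170, 165, 167, 412, 166];

const medianMs = median(runsMs); // 167
const meanMs = runsMs.reduce((sum, ms) => sum + ms, 0) / runsMs.length; // 216
```

With these made-up samples, the single outlier shifts the mean by about 30% while the median stays at the typical value, which is the property a regression gate wants from its baseline.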

Why it matters

With the stale numbers gone, the genuine remaining gaps vs Node are no longer
masked and the gate is meaningful again. Notable real gaps worth follow-up:

  • 09_method_calls (~490×): a trivial monomorphic method is not inlined; its body
    emits ~4 non-inlinable typed-feedback runtime calls per invocation
    (register_site ×2 + class-field get/set guards) even though typed feedback
    is off by default. The fast paths behind the guards are correct inline slot
    load/stores; the scaffolding calls are the cost.
  • bench_numeric_array_downgrade (~780×): any[] forces a dynamic add plus a
    boxed set fallback per element.
  • bench_object_property (~70×), bench_numeric_array_numeric (~50×).
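
For readers unfamiliar with the workload shapes named in the list above, here is a hypothetical TypeScript sketch of them. These are not the actual benchmark sources; the class and function names are invented, and the comments only restate what the PR description says about the generated code.

```typescript
// Hypothetical shapes only; the real benchmark files are not shown in this PR.

class Counter {
  private value = 0;

  // A trivial monomorphic method: one receiver class, one field read and
  // one field write. The PR reports that calls like this are not inlined.
  increment(): void {
    this.value = this.value + 1;
  }

  get(): number {
    return this.value;
  }
}

function callMethodInLoop(iterations: number): number {
  const counter = new Counter();
  for (let i = 0; i < iterations; i++) {
    counter.increment();
  }
  return counter.get();
}

// Statically typed numeric array: element type is known to be number.
function bumpTyped(xs: number[]): number[] {
  for (let i = 0; i < xs.length; i++) {
    xs[i] = xs[i] + 1;
  }
  return xs;
}

// Same loop over any[]: the element type is unknown at compile time, which
// is the case the PR describes as a dynamic add plus a boxed set per element.
function bumpUntyped(xs: any[]): any[] {
  for (let i = 0; i < xs.length; i++) {
    xs[i] = xs[i] + 1;
  }
  return xs;
}
```

The two `bump` functions have identical bodies and produce identical results for numeric input; only the declared element type differs, which is what makes the gap between the typed and downgraded rows a compiler issue rather than a workload difference.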

Summary by CodeRabbit

  • Tests
    • Updated benchmark baseline with refreshed performance metrics across all benchmark entries.
    • Enhanced benchmarks to include correctness validation tracking alongside performance measurements.

baseline.json (53f0f29, 2026-06-09) recorded 100x-3000x array/numeric
regressions that no longer reproduce after intervening codegen work
(03_array_write 4143->3ms, 04_array_read 4067->167ms, 16_matrix_multiply
10221->671ms, bench_numeric_array_downgrade 21902->4690ms).

Regenerated on the current commit via
`./benchmarks/compare.sh --update-baseline --full --runs 5` so the
regression gate reflects reality. Genuine remaining Node gaps are now
visible rather than masked by stale catastrophic numbers: 09_method_calls
(~490x), bench_numeric_array_downgrade (~780x), bench_object_property
(~70x), bench_numeric_array_numeric (~50x).

coderabbitai Bot commented Jun 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: a3b8eccb-9177-4880-a8e9-85aae7b212d3

📥 Commits

Reviewing files that changed from the base of the PR and between ab5b7e8 and 4ea1116.

📒 Files selected for processing (1)
  • benchmarks/baseline.json

📝 Walkthrough

The pull request refreshes benchmarks/baseline.json to a new performance baseline. The file's top-level metadata (commit hash and timestamp) are updated, the notes are expanded with runner-variance guidance, and each benchmark entry is extended with a correctness object tracking validation status alongside existing timing and memory metrics.

Changes

Benchmark Baseline Refresh

| Layer / File(s) | Summary |
|---|---|
| Baseline metadata update (benchmarks/baseline.json) | Top-level commit, generated_at timestamp, and notes are refreshed to reflect the new baseline and provide runner-variance guidance. |
| Benchmark measurements and correctness validation (benchmarks/baseline.json) | All benchmark entries (from 02_loop_overhead through bench_numeric_array_downgrade) are updated with new performance metrics (timing and memory), and each entry gains a correctness object containing validation status and actual/expected line counts. |
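
The walkthrough describes the file's structure only in prose, so the sketch below guesses at a possible shape. Apart from `commit`, `generated_at`, `notes`, and `correctness`, which the walkthrough names, every field name here is an assumption, and all sample values are placeholders rather than data from the real `benchmarks/baseline.json`.

```typescript
// Assumed schema, reconstructed from the walkthrough prose. Field names
// other than commit, generated_at, notes, and correctness are hypothetical.
interface Correctness {
  status: "pass" | "fail";
  actualLines: number;
  expectedLines: number;
}

interface BenchmarkEntry {
  timeMs: number;
  memoryKb: number;
  correctness: Correctness;
}

interface Baseline {
  commit: string;
  generated_at: string;
  notes: string;
  benchmarks: Record<string, BenchmarkEntry>;
}

// A timing is only worth gating on if the benchmark produced the expected
// output; this helper lists entries whose correctness record says otherwise.
function untrustworthyEntries(baseline: Baseline): string[] {
  return Object.entries(baseline.benchmarks)
    .filter(([, entry]) => {
      const c = entry.correctness;
      return c.status !== "pass" || c.actualLines !== c.expectedLines;
    })
    .map(([name]) => name);
}

// Placeholder document; none of these values come from the real file.
const sampleBaseline: Baseline = {
  commit: "0000000",
  generated_at: "1970-01-01T00:00:00Z",
  notes: "placeholder",
  benchmarks: {
    example_ok: {
      timeMs: 10,
      memoryKb: 1024,
      correctness: { status: "pass", actualLines: 3, expectedLines: 3 },
    },
    example_bad: {
      timeMs: 1,
      memoryKb: 512,
      correctness: { status: "fail", actualLines: 0, expectedLines: 3 },
    },
  },
};

const suspect = untrustworthyEntries(sampleBaseline); // ["example_bad"]
```

One plausible reason to store correctness next to timing is visible in the placeholder data: `example_bad` is the fastest entry precisely because it produced no output, so a gate that looked at time alone would treat it as an improvement.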

Estimated Code Review Effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

📊 New numbers dance across the baseline sheet,
With correctness blooms that make benchmarks complete,
Perry's performance captured in measured time,
Each entry now truth-tagged—a baseline sublime! 🐰✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely describes the main change: refreshing the stale performance baseline in benchmarks/baseline.json. |
| Description check | ✅ Passed | The description is comprehensive, explaining the what, how, and why with concrete examples and data; all key template sections are addressed. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@proggeramlug proggeramlug merged commit 56dff97 into main Jun 13, 2026
14 checks passed
@proggeramlug proggeramlug deleted the chore/refresh-perf-baseline branch June 13, 2026 09:59