chore(benchmarks): refresh stale perf baseline by TheHypnoo · Pull Request #5081 · PerryTS/perry

TheHypnoo · 2026-06-13T09:46:34Z

What

Regenerate benchmarks/baseline.json, which had gone badly stale.

The previous baseline (53f0f29, 2026-06-09) recorded 100x–3000x array/numeric
regressions that no longer reproduce after the intervening codegen work, so the
regression gate was comparing against fantasy numbers:

| Benchmark | Stale baseline | Current | Current vs Node |
|---|---|---|---|
| 03_array_write | 4143 ms | 3 ms | Perry wins |
| 04_array_read | 4067 ms | 167 ms | 15× |
| 16_matrix_multiply | 10221 ms | 671 ms | 17× |
| 10_nested_loops | 4579 ms | 306 ms | 16× |
| bench_numeric_array_numeric | 3546 ms | 248 ms | 50× |
| bench_numeric_array_downgrade | 21902 ms | 4690 ms | 781× |
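
To put the table in perspective, the gap between the stale and current timings can be computed directly from the numbers above. This is a minimal sketch; the `Row` type and the `staleOverCurrent` helper are illustrative names, not part of the repository.

```typescript
// Timings copied from the table above, in milliseconds.
type Row = { name: string; staleMs: number; currentMs: number };

const rows: Row[] = [
  { name: "03_array_write", staleMs: 4143, currentMs: 3 },
  { name: "04_array_read", staleMs: 4067, currentMs: 167 },
  { name: "16_matrix_multiply", staleMs: 10221, currentMs: 671 },
  { name: "10_nested_loops", staleMs: 4579, currentMs: 306 },
  { name: "bench_numeric_array_numeric", staleMs: 3546, currentMs: 248 },
  { name: "bench_numeric_array_downgrade", staleMs: 21902, currentMs: 4690 },
];

// How many times larger the stale timing is than the current one.
function staleOverCurrent(row: Row): number {
  return row.staleMs / row.currentMs;
}

for (const row of rows) {
  console.log(`${row.name}: ${staleOverCurrent(row).toFixed(1)}x`);
}
```

With these inputs the ratio ranges from roughly 4.7x (bench_numeric_array_downgrade) to roughly 1381x (03_array_write).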

How

Regenerated on the current commit via:

./benchmarks/compare.sh --update-baseline --full --runs 5

(macOS host, Node 24, median of 5). Only benchmarks/baseline.json changes.
A provenance note recording the refresh and the runner-variance caveat is
included in the file's notes field.
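
For readers unfamiliar with this kind of gate, the idea of a "median of 5" compared against a stored baseline can be sketched as follows. This is not the logic of compare.sh, whose internals are not shown in this PR; the function names, the 1.5 tolerance factor, and the sample timings are all assumptions made for illustration.

```typescript
// Median of a list of run timings (ms). Sorting a copy leaves the input intact.
function median(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical gate: report a regression when the current timing exceeds
// the baseline by more than the tolerance factor (1.5 is an assumed value).
function isRegression(baselineMs: number, currentMs: number, tolerance = 1.5): boolean {
  return currentMs > baselineMs * tolerance;
}

const runs = [171, 165, 167, 180, 166]; // made-up timings for five runs
const current = median(runs);

console.log(current);                     // middle value of the sorted runs
console.log(isRegression(167, 400));      // fresh baseline: slowdown is flagged
console.log(isRegression(4067, 400));     // stale baseline: same slowdown passes
```

The last two calls show why a stale baseline matters in this sketch: a timing of 400 ms is flagged against a 167 ms baseline but passes unnoticed against a 4067 ms one.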

Why it matters

With the stale numbers gone, the genuine remaining gaps vs Node are no longer
masked and the gate is meaningful again. Notable real gaps worth follow-up:

- 09_method_calls (~490×): a trivial monomorphic method is not inlined; its body
  emits ~4 non-inlinable typed-feedback runtime calls per invocation
  (register_site ×2 plus class-field get/set guards) even though typed feedback
  is off by default. The fast paths behind the guards are correct inline slot
  loads/stores; the scaffolding calls are the cost.
- bench_numeric_array_downgrade (~780×): any[] forces a dynamic add plus a
  boxed set fallback per element.
- bench_object_property (~70×) and bench_numeric_array_numeric (~50×).
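
The code shapes behind the first two gaps look roughly like the following. These snippets are illustrative reconstructions from the descriptions above, not the actual benchmark sources; `Counter`, `runMethodCalls`, `fillTyped`, and `fillDowngraded` are hypothetical names.

```typescript
// Shape 1: a trivial monomorphic method. Every call goes to the same class,
// so the body (one field load plus one field store) is a natural inlining target.
class Counter {
  private value = 0;

  increment(): void {
    this.value = this.value + 1;
  }

  get(): number {
    return this.value;
  }
}

function runMethodCalls(iterations: number): number {
  const c = new Counter();
  for (let i = 0; i < iterations; i++) c.increment();
  return c.get();
}

// Shape 2: the same loop over number[] and over any[].
// With number[], the element type is known statically.
function fillTyped(n: number): number[] {
  const xs: number[] = new Array<number>(n).fill(0);
  for (let i = 0; i < n; i++) xs[i] = xs[i] + i;
  return xs;
}

// With any[], the static type of xs[i] is unknown, so the add and the store
// cannot be specialized to numbers from the types alone.
function fillDowngraded(n: number): any[] {
  const xs: any[] = new Array(n).fill(0);
  for (let i = 0; i < n; i++) xs[i] = xs[i] + i;
  return xs;
}

console.log(runMethodCalls(1000));
console.log(fillTyped(4), fillDowngraded(4));
```

Both fill functions produce the same values at runtime; only the static element type differs, which is the distinction the bench_numeric_array_downgrade note describes.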

Summary by CodeRabbit

Tests
- Updated benchmark baseline with refreshed performance metrics across all benchmark entries.
- Enhanced benchmarks to include correctness validation tracking alongside performance measurements.

baseline.json (53f0f29, 2026-06-09) recorded 100x-3000x array/numeric regressions that no longer reproduce after intervening codegen work (03_array_write 4143->3ms, 04_array_read 4067->167ms, 16_matrix_multiply 10221->671ms, bench_numeric_array_downgrade 21902->4690ms). Regenerated on the current commit via `./benchmarks/compare.sh --update-baseline --full --runs 5` so the regression gate reflects reality. Genuine remaining Node gaps are now visible rather than masked by stale catastrophic numbers: 09_method_calls (~490x), bench_numeric_array_downgrade (~780x), bench_object_property (~70x), bench_numeric_array_numeric (~50x).

coderabbitai · 2026-06-13T09:46:52Z

No actionable comments were generated in the recent review. 🎉

📥 Commits

Reviewing files that changed from the base of the PR and between ab5b7e8 and 4ea1116.

📒 Files selected for processing (1)

benchmarks/baseline.json

📝 Walkthrough

The pull request refreshes benchmarks/baseline.json to a new performance baseline. The file's top-level metadata (commit hash and timestamp) is updated, the notes are expanded with runner-variance guidance, and each benchmark entry gains a correctness object that tracks validation status alongside the existing timing and memory metrics.

Changes

Benchmark Baseline Refresh

| Layer / File(s) | Summary |
|---|---|
| Baseline metadata update (`benchmarks/baseline.json`) | Top-level commit, generated_at timestamp, and notes are refreshed to reflect the new baseline and provide runner-variance guidance. |
| Benchmark measurements and correctness validation (`benchmarks/baseline.json`) | All benchmark entries (from `02_loop_overhead` through `bench_numeric_array_downgrade`) are updated with new performance metrics (timing and memory), and each entry gains a `correctness` object containing validation status and actual/expected line counts. |
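
As a rough picture of what such an entry could look like, here is a sketch of a baseline entry with a correctness object. The real schema of benchmarks/baseline.json is not shown in this PR, so every field name below (`timeMs`, `memoryKb`, `status`, `actualLines`, `expectedLines`) and the `checkCorrectness` helper are assumptions for illustration only.

```typescript
// Hypothetical shape of the correctness object described in the walkthrough.
interface Correctness {
  status: "pass" | "fail";
  actualLines: number;
  expectedLines: number;
}

// Hypothetical shape of one benchmark entry.
interface BaselineEntry {
  timeMs: number;
  memoryKb: number;
  correctness: Correctness;
}

// Count output lines, ignoring a single trailing newline.
function countLines(text: string): number {
  if (text === "") return 0;
  const lines = text.split("\n");
  if (lines[lines.length - 1] === "") lines.pop();
  return lines.length;
}

// Assumed validation rule: the benchmark passes only if its output
// matches the expected output exactly.
function checkCorrectness(actual: string, expected: string): Correctness {
  return {
    status: actual === expected ? "pass" : "fail",
    actualLines: countLines(actual),
    expectedLines: countLines(expected),
  };
}

const entry: BaselineEntry = {
  timeMs: 167,   // timing taken from the 04_array_read row of the PR table
  memoryKb: 0,   // placeholder; no memory figures appear in this PR
  correctness: checkCorrectness("1\n2\n3\n", "1\n2\n3\n"),
};

console.log(JSON.stringify(entry, null, 2));
```

Recording line counts next to the status makes a truncated or empty benchmark output visible in the baseline itself, rather than only a surprisingly fast timing.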

Estimated Code Review Effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely describes the main change: refreshing the stale performance baseline in benchmarks/baseline.json. |
| Description check | ✅ Passed | The description is comprehensive, explaining the what, how, and why with concrete examples and data; all key template sections are addressed. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

proggeramlug merged commit 56dff97 into main Jun 13, 2026
14 checks passed

proggeramlug deleted the chore/refresh-perf-baseline branch June 13, 2026 09:59
