chore(test): ratchet node-suite baseline to clean 98.1% run by proggeramlug · Pull Request #5166 · PerryTS/perry

proggeramlug · 2026-06-14T22:20:54Z

Refreshes test-parity/node_suite_baseline.json floors from a clean node-26 full run after the recent merge wave (#5144 singleton diffs, #5124 http/net relink, #5099 events, #5106/#5107 util).

node-suite: 2810/2863 (98.1%) — up from the prior 97.5% baseline.

Floor changes:

- object 23/23, util 86/86, tty 32/32, events 69/69 → full. These are deterministic gains from #5144 (fix(node-suite): close singleton diffs: MIMEParams crash, boxed-String inspect, Reflect.construct newTarget), #5106-7 (fix(util): quote top-level strings in util.format %o/%O, Node parity), and #5099 (fix(events,fs,globals): node v26 parity: events.on async iterator, fs watch buffering, four globals diffs).
- stream 770, globals 111, diagnostics_channel 66, fs-promises 77 → ratcheted up with small flake margins.
- http 19 → 17: http is verified 19/19 in isolation, but the full-suite harness flakes to 17 under port contention. It is floored at 17 so the guard stops false-alarming on flake. A real http regression (link break → many compile-fails, behavior break → well below 17) is a far larger drop and still trips the guard. Documented in the schema note.

Validated: the guard passes clean against the run these floors came from (all 53 modules ≥ floor). Tooling-only; no version/CHANGELOG bump (maintainer folds in at merge).
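The guard rule referred to here (every module's pass count must be at or above its floor, with improvements reported as ratchet candidates) can be sketched roughly as follows. This is a minimal illustration, not the code of scripts/node_suite_regression_check.py: the function name `check_floors` and the observed run values are assumptions made for the example, while the floors (http 17, events 69, util 86) are the ones quoted in this PR.

```python
from typing import Dict, List, Tuple


def check_floors(
    floors: Dict[str, int], observed: Dict[str, int]
) -> Tuple[List[str], List[str]]:
    """Return (regressions, ratchet_candidates) for one run against its floors.

    Hypothetical helper; the real regression check may be structured differently.
    """
    regressions: List[str] = []
    candidates: List[str] = []
    for module, floor in sorted(floors.items()):
        got = observed.get(module, 0)  # a module missing from the run counts as zero passes
        if got < floor:
            regressions.append(f"{module}: {got} < floor {floor}")
        elif got > floor:
            candidates.append(f"{module}: {got} > floor {floor}")
    return regressions, candidates


# Floors quoted in this PR; the observed values below are invented for illustration.
floors = {"http": 17, "events": 69, "util": 86}
flaky_run = {"http": 17, "events": 69, "util": 86}   # http flaked down to 17
broken_run = {"http": 9, "events": 69, "util": 86}   # the 19->9 style drop from the schema note

print(check_floors(floors, flaky_run))
print(check_floors(floors, broken_run))
```

Under these assumptions, an http result of 17 passes the guard, while a drop to 9 is reported as a regression, which is the behavior the lowered floor is meant to preserve.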

Summary by CodeRabbit

Tests
- Updated regression test baseline with improved overall pass rates from 97.5% to 98.1%
- Adjusted test module baselines to reflect observed suite stability and behavior
- Enhanced test documentation noting normalization behavior and baseline refresh criteria

Refresh floors from a clean node-26 full run (2810/2863, 98.1%):
- object 23/23, util 86/86, tty 32/32, events 69/69 -> full (#5144/#5106-7/#5099)
- stream 770, globals 111, diagnostics_channel 66, fs-promises 77 -> ratcheted up with small flake margins
- http floor 19 -> 17: verified 19/19 in isolation but the full-suite harness flakes to 17 under port contention; a real http regression is a far larger drop (link break / behavior break), which still trips the guard. Stops the false-alarm seen on prior runs.

coderabbitai · 2026-06-14T22:21:13Z

Walkthrough

test-parity/node_suite_baseline.json is refreshed from a clean node-26 run. The schema note is extended, the overall baseline rises from 2792/2863 (97.5%) to 2810/2863 (98.1%), several module floors increase, and http is lowered to 17/19 to account for observed flakiness.

Changes

Node Suite Baseline Refresh

Layer / File(s)	Summary
Schema note and overall aggregate `test-parity/node_suite_baseline.json`	`note` text extended to describe console normalization and node-26 floor refresh; overall `pass` raised to `2810`, `pct` updated to `98.1`.
Per-module floor rebalancing `test-parity/node_suite_baseline.json`	`diagnostics_channel`, `events`, `fs-promises`, `globals`, `object`, `stream`, `tty`, and `util` floors increased; `http` floor lowered to `17/19` for flake tolerance.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

PerryTS/perry#5097: Introduces the node_suite_regression_check.py script that reads per-module floors from this same node_suite_baseline.json file to classify improvements and regressions.
PerryTS/perry#5101: Introduces console stdout normalization whose effect (driving console to full-pass) is reflected in the updated note and floor values in this baseline.
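The stdout normalization mentioned for #5101 (scrubbing console.time durations and stack-trace frame lines on both sides before comparing) might look something like the sketch below. The regular expressions, the `<duration>` placeholder, and the choice to drop frame lines entirely are assumptions for illustration; the real node_suite_run.normalize() is not shown in this PR and may differ.

```python
import re

# Hypothetical patterns; not taken from node_suite_run.py.
_TIMER = re.compile(r"^(.*: )\d+(?:\.\d+)?(?:ms|s)$")  # e.g. "timer: 1.234ms"
_FRAME = re.compile(r"^\s+at .*$")                      # e.g. "    at foo (file.js:1:2)"


def normalize(stdout: str) -> str:
    """Scrub environment-variant tokens so two runs can be compared line by line."""
    kept = []
    for line in stdout.splitlines():
        if _FRAME.match(line):
            continue  # stack-trace frame lines vary by environment, so drop them
        kept.append(_TIMER.sub(r"\1<duration>", line))  # mask the timing value only
    return "\n".join(kept)


# Applied symmetrically: both outputs go through the same scrub before the compare.
oracle_out = "timer: 1.234ms\nError: boom\n    at foo (/a/file.js:1:2)"
candidate_out = "timer: 0.987ms\nError: boom\n    at foo (/b/file.js:9:9)"
print(normalize(oracle_out) == normalize(candidate_out))
```

Because the same function is applied to both outputs, only the deterministic content decides whether a console test passes.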

Poem

🐇 Hop hop, the floors are raised up high,
Ninety-eight percent, oh my oh my!
HTTP flakes? We planned for that too,
Refreshed from node-26, shiny and new.
The baseline is set, the numbers are right —
This bunny sleeps well in the CI night! 🌙

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name	Status	Explanation	Resolution
Description check	❓ Inconclusive	The description covers the core content (what changed, which modules improved, HTTP floor rationale) but lacks explicit coverage of the required template sections (Summary, Changes, Test plan checklist items).	Structure the description to align with the template: add explicit 'Summary' section, organized 'Changes' bullet list, and confirmation of test plan checklist items.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely summarizes the main change: updating the node-suite baseline to a clean 98.1% pass rate run.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test-parity/node_suite_baseline.json`:
- Line 5: The note field in node_suite_baseline.json references a total pass
count of 2810 from a clean node-26 run, but the actual sum of all module floor
values in the file totals 2801. Verify the sum of all module pass/floor values
currently specified in the baseline to confirm the discrepancy, then either
update the note to reference the correct total (2801) or, if you have access to
the original clean run data showing 2810, adjust the individual module floor
values accordingly to match that total.
- Around line 7-10: The overall aggregate in the JSON baseline has an arithmetic
mismatch where the overall.pass value of 2810 does not match the sum of all 53
module pass values which totals 2801, creating a 9-test discrepancy. Update the
overall.pass field to 2801 to match the actual sum of module pass values, and
recalculate the overall.pct field to 97.8% based on the corrected pass count
divided by the total count of 2863 to restore consistency between the aggregate
and its components.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 4db945d6-3435-4577-9d93-b37ca9a6d5ef

📥 Commits

Reviewing files that changed from the base of the PR and between b5b18cb and 7538579.

📒 Files selected for processing (1)

test-parity/node_suite_baseline.json

coderabbitai · 2026-06-14T22:25:19Z

    "description": "Floor baseline for scripts/node_suite_regression_check.py. Each module's run must produce pass >= floor.pass; dropping below is a regression (exit 1). Improvements are always accepted and reported as ratchet candidates. Captured in the node-26 environment with scripts/node_suite_run.py (pre-warm + fast/slow lanes).",
    "oracle": "node v26.3.0 on Linux (the box)",
-    "note": "Deterministic modules are floored at full pass. Timing/racy modules (http2, net, stream, diagnostics_channel, fs-promises) carry a small margin below observed pass so ordinary flake does not false-alarm; the guard still catches real regressions, which are large (e.g. dns 6->0, http 19->9). node_suite_run.normalize() scrubs environment-variant tokens (console.time hrtime durations, stack-trace frame lines) symmetrically before the stdout compare, so console is floored at full pass (119) on its deterministic content."
+    "note": "Deterministic modules are floored at full pass. Timing/racy modules (http2, net, stream, diagnostics_channel, fs-promises) carry a small margin below observed pass so ordinary flake does not false-alarm; the guard still catches real regressions, which are large (e.g. dns 6->0, http 19->9). node_suite_run.normalize() scrubs environment-variant tokens (console.time hrtime durations, stack-trace frame lines) symmetrically before the stdout compare, so console is floored at full pass (119) on its deterministic content. http is verified 19/19 in isolation but the full-suite harness flakes to 17 under port contention, so it is floored at 17 (flake margin, not a regression); a real http break is a much larger drop. Floors refreshed from a clean node-26 run at 2810/2863 (98.1%)."


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Schema note references inconsistent overall pass count.

The note states "Floors refreshed from a clean node-26 run at 2810/2863 (98.1%)", but the sum of module pass values is 2801 (not 2810). Either the note should reference 2801, or the module floor values need adjustment to match the actual 2810-pass run.


coderabbitai · 2026-06-14T22:25:19Z

+  "overall": {
+    "pass": 2810,
+    "total": 2863,
+    "pct": 98.1


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

    #!/bin/bash
    # Description: Sum all module pass and total values from the baseline JSON.
    jq '.modules | to_entries | map(.value.pass) | add' test-parity/node_suite_baseline.json
    jq '.modules | to_entries | map(.value.total) | add' test-parity/node_suite_baseline.json

Repository: PerryTS/perry

Length of output: 67

Fix arithmetic mismatch in overall baseline aggregate.

The overall.pass is set to 2810, but the sum of all 53 module pass values is 2801. This creates a 9-test discrepancy. The pct should be updated from 98.1% to 97.8% to match.
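The arithmetic behind this comment can be reproduced with a short sketch. It assumes the baseline keeps per-module entries under a `modules` object with `pass` and `total` fields, as the jq queries in the analysis imply; the helper names `pct` and `aggregate` and the two-module toy baseline are hypothetical. Whether the 9-test gap is an error or simply the accumulated flake margins (floors set below the observed run) is not settled in the thread, so the sketch only recomputes the numbers.

```python
from typing import Any, Dict


def pct(passed: int, total: int) -> float:
    """Pass rate as a percentage, rounded to one decimal place."""
    return round(100 * passed / total, 1)


def aggregate(baseline: Dict[str, Any]) -> Dict[str, float]:
    """Sum per-module pass/total values, mirroring the two jq queries."""
    modules = baseline["modules"]
    passed = sum(m["pass"] for m in modules.values())
    total = sum(m["total"] for m in modules.values())
    return {"pass": passed, "total": total, "pct": pct(passed, total)}


# Toy baseline with two of the floors quoted in this PR (not the full 53 modules).
toy = {"modules": {"object": {"pass": 23, "total": 23},
                   "http": {"pass": 17, "total": 19}}}
print(aggregate(toy))

# The three figures that appear in the discussion, all out of 2863 tests.
for passed in (2792, 2801, 2810):
    print(passed, pct(passed, 2863))
```

Run against the real file, comparing `aggregate(...)` with the stored `overall` block would surface the same difference the reviewer reports.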

Proposed fix

  "overall": {
-    "pass": 2810,
+    "pass": 2801,
     "total": 2863,
-    "pct": 98.1
+    "pct": 97.8
  },


coderabbitai Bot reviewed Jun 14, 2026


proggeramlug merged commit 3166a22 into main Jun 15, 2026
15 checks passed

proggeramlug deleted the chore/ratchet-node-suite-baseline-981 branch June 15, 2026 06:38
