chore(test): ratchet node-suite baseline to clean 98.1% run #5166
Refresh floors from a clean node-26 full run (2810/2863, 98.1%):
- object 23/23, util 86/86, tty 32/32, events 69/69 -> full (#5144/#5106-7/#5099)
- stream 770, globals 111, diagnostics_channel 66, fs-promises 77 -> ratcheted up with small flake margins
- http floor 19 -> 17: verified 19/19 in isolation but the full-suite harness flakes to 17 under port contention; a real http regression is a far larger drop (link break / behavior break), which still trips the guard. Stops the false-alarm seen on prior runs.
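A minimal sketch of the flooring rule described above, for illustration only. The helper name `ratchet_floor` and the explicit `flake_margin` parameter are assumptions, not part of the repo's scripts; the numbers are the ones quoted in this description.

```python
def ratchet_floor(observed_pass: int, flake_margin: int = 0) -> int:
    """Return the floor to record: observed pass count minus a flake margin.

    Hypothetical helper; the real tooling may derive floors differently.
    """
    if flake_margin < 0:
        raise ValueError("flake_margin must be non-negative")
    return max(observed_pass - flake_margin, 0)


# Deterministic module: util passed 86/86, so it is floored at full pass.
util_floor = ratchet_floor(86)

# http: 19/19 in isolation, but flakes to 17 in the full suite, so the
# floor carries a margin of 2 below the isolated result.
http_floor = ratchet_floor(19, flake_margin=2)

print(util_floor, http_floor)
```

With these inputs, a drop such as http 19 -> 9 would still fall well below the floor of 17, which is the property the description relies on.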
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~3 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test-parity/node_suite_baseline.json`:
- Line 5: The note field in node_suite_baseline.json references a total pass
count of 2810 from a clean node-26 run, but the actual sum of all module floor
values in the file totals 2801. Verify the sum of all module pass/floor values
currently specified in the baseline to confirm the discrepancy, then either
update the note to reference the correct total (2801) or, if you have access to
the original clean run data showing 2810, adjust the individual module floor
values accordingly to match that total.
- Around line 7-10: The overall aggregate in the JSON baseline has an arithmetic
mismatch where the overall.pass value of 2810 does not match the sum of all 53
module pass values which totals 2801, creating a 9-test discrepancy. Update the
overall.pass field to 2801 to match the actual sum of module pass values, and
recalculate the overall.pct field to 97.8% based on the corrected pass count
divided by the total count of 2863 to restore consistency between the aggregate
and its components.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 4db945d6-3435-4577-9d93-b37ca9a6d5ef
📒 Files selected for processing (1)
test-parity/node_suite_baseline.json
  "description": "Floor baseline for scripts/node_suite_regression_check.py. Each module's run must produce pass >= floor.pass; dropping below is a regression (exit 1). Improvements are always accepted and reported as ratchet candidates. Captured in the node-26 environment with scripts/node_suite_run.py (pre-warm + fast/slow lanes).",
  "oracle": "node v26.3.0 on Linux (the box)",
- "note": "Deterministic modules are floored at full pass. Timing/racy modules (http2, net, stream, diagnostics_channel, fs-promises) carry a small margin below observed pass so ordinary flake does not false-alarm; the guard still catches real regressions, which are large (e.g. dns 6->0, http 19->9). node_suite_run.normalize() scrubs environment-variant tokens (console.time hrtime durations, stack-trace frame lines) symmetrically before the stdout compare, so console is floored at full pass (119) on its deterministic content."
+ "note": "Deterministic modules are floored at full pass. Timing/racy modules (http2, net, stream, diagnostics_channel, fs-promises) carry a small margin below observed pass so ordinary flake does not false-alarm; the guard still catches real regressions, which are large (e.g. dns 6->0, http 19->9). node_suite_run.normalize() scrubs environment-variant tokens (console.time hrtime durations, stack-trace frame lines) symmetrically before the stdout compare, so console is floored at full pass (119) on its deterministic content. http is verified 19/19 in isolation but the full-suite harness flakes to 17 under port contention, so it is floored at 17 (flake margin, not a regression); a real http break is a much larger drop. Floors refreshed from a clean node-26 run at 2810/2863 (98.1%)."
Schema note references inconsistent overall pass count.
The note states "Floors refreshed from a clean node-26 run at 2810/2863 (98.1%)", but the sum of module pass values is 2801 (not 2810). Either the note should reference 2801, or the module floor values need adjustment to match the actual 2810-pass run.
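One way to confirm this finding locally is sketched below. It assumes the baseline keeps per-module counts under `modules.<name>.pass` and the aggregate under `overall.pass`; the function name `aggregate_mismatch` and the toy values are illustrative, not taken from the real file.

```python
import json


def aggregate_mismatch(baseline: dict) -> int:
    """Return overall.pass minus the sum of module pass values (0 = consistent)."""
    module_sum = sum(entry["pass"] for entry in baseline["modules"].values())
    return baseline["overall"]["pass"] - module_sum


# Toy baseline with made-up numbers, shaped like the assumed schema.
toy = json.loads(
    '{"overall": {"pass": 10, "total": 11},'
    ' "modules": {"a": {"pass": 4, "total": 4}, "b": {"pass": 5, "total": 7}}}'
)

print(aggregate_mismatch(toy))  # 1: the aggregate overstates the module sum by one
```

To run it against the real file, load `test-parity/node_suite_baseline.json` with `json.load` and pass the result in; a nonzero return value is the discrepancy.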
  "overall": {
    "pass": 2810,
    "total": 2863,
    "pct": 98.1
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Sum all module pass and total values from the baseline JSON.
jq '.modules | to_entries | map(.value.pass) | add' test-parity/node_suite_baseline.json
jq '.modules | to_entries | map(.value.total) | add' test-parity/node_suite_baseline.json
Repository: PerryTS/perry
Length of output: 67
Fix arithmetic mismatch in overall baseline aggregate.
The overall.pass is set to 2810, but the sum of all 53 module pass values is 2801. This creates a 9-test discrepancy. The pct should be updated from 98.1% to 97.8% to match.
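The percentages quoted here can be reproduced with a short calculation. This is only a sketch of the arithmetic, assuming `pct` is the pass ratio rounded to one decimal place; `pass_pct` is a hypothetical helper name.

```python
def pass_pct(passed: int, total: int) -> float:
    """Pass percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * passed / total, 1)


print(pass_pct(2810, 2863))  # 98.1, the value currently in overall.pct
print(pass_pct(2801, 2863))  # 97.8, the value implied by the module sum
print(2810 - 2801)           # 9, the size of the discrepancy
```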
Proposed fix
"overall": {
- "pass": 2810,
+ "pass": 2801,
"total": 2863,
- "pct": 98.1
+ "pct": 97.8
},
Refreshes `test-parity/node_suite_baseline.json` floors from a clean node-26 full run after the recent merge wave (#5144 singleton diffs, #5124 http/net relink, #5099 events, #5106/#5107 util).
node-suite: 2810/2863 (98.1%), up from the prior 97.5% baseline.
Floor changes:
Validated: the guard passes clean against the run these floors came from (all 53 modules ≥ floor). Tooling-only; no version/CHANGELOG bump (maintainer folds in at merge).
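For readers unfamiliar with the guard, the check it performs can be sketched roughly as follows. This is an illustrative approximation of the rule "pass >= floor, otherwise regression", not the code in `scripts/node_suite_regression_check.py`; the function names and the flat `module -> pass count` dictionaries are assumptions.

```python
def check_floors(floors: dict[str, int], run: dict[str, int]) -> tuple[list[str], list[str]]:
    """Compare a run against floors.

    Returns (regressions, ratchet_candidates) as sorted module names.
    """
    regressions, candidates = [], []
    for module, floor in sorted(floors.items()):
        passed = run.get(module, 0)  # assumption: a missing module counts as 0 passes
        if passed < floor:
            regressions.append(module)
        elif passed > floor:
            candidates.append(module)  # improvement: accepted, reported for ratcheting
    return regressions, candidates


def guard_exit_code(regressions: list[str]) -> int:
    """Exit 1 on any regression, 0 otherwise."""
    return 1 if regressions else 0


floors = {"events": 69, "http": 17}

# Ordinary flake: http lands on 17, exactly at the floor, so the guard passes.
print(check_floors(floors, {"events": 69, "http": 17}))
# A real break (http 19 -> 9) drops below the floor and trips the guard.
regressions, _ = check_floors(floors, {"events": 69, "http": 9})
print(regressions, guard_exit_code(regressions))
```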