
test: commit gap-suite runner + fix stale CLAUDE.md /tmp pointer #5222

Merged
TheHypnoo merged 10 commits into main from chore/ci-smoke-parity-gate
Jun 18, 2026
Conversation

@TheHypnoo TheHypnoo commented Jun 15, 2026

Member

What

  • Commit scripts/run_gap_tests.sh — a runner for the gap suite (test-files/test_gap_*.ts). It is a thin wrapper over run_parity_tests.sh --filter test_gap_, so it reuses the one canonical normalizer, skip-list, and per-test output cap, builds the compiler itself, and gates locally on no new untriaged failures vs test-parity/known_failures.json.
  • Fix the stale CLAUDE.md parity-status line: it pointed at an out-of-repo /tmp/run_gap_tests.sh (a file that was never in the repo) and an outdated test count.

Why

CLAUDE.md told contributors to run /tmp/run_gap_tests.sh, which doesn't exist — the gap suite, the highest-signal-per-second test Perry has, had no committed runner. This gives everyone a real, in-repo command.

What changed from the original PR

Dropped the smoke-parity CI job. As designed it was continue-on-error (never blocking) and started ~all-red (60 untriaged pre-existing gaps on the Linux/node-22 image), so on every PR it cost a full release build + 237-test run (~57 min) while enforcing nothing and emitting noise — not worth the per-PR cost in that form. A real gap gate should land later with a green baseline and as a required check in one move (triage the baseline first), tracked separately.

No source changes; no existing CI job altered.

Add scripts/run_gap_tests.sh — a committed runner for the gap suite
(test-files/test_gap_*.ts, 235 tests), replacing the out-of-repo
/tmp/run_gap_tests.sh that CLAUDE.md pointed at. It is a thin wrapper over
run_parity_tests.sh --filter test_gap_, so it reuses the one canonical
normalizer, skip-list, and output cap (seed of the single-normalizer work),
and gates on "no NEW failures vs known_failures.json" rather than
run_parity_tests.sh's loose <80%-aggregate exit.

Wire it into a new smoke-parity CI job. The gap suite is the highest-signal-
per-second test Perry has and had no committed runner and no PR gate — a
single-feature regression could merge green. The job is INFORMATIONAL for now
(continue-on-error): the first runs surface which gap tests fail on the Linux
image under node 26 so they can be triaged into known_failures.json; once
curated + green, a follow-up drops continue-on-error and branch protection
makes it required.

Uses node 26 (the node-suite baseline oracle) and only allow-listed actions
(no sccache), so it is not blocked by the org action allow-list.

Also fix the stale CLAUDE.md parity-status line (28 -> 235 tests, /tmp ->
scripts/run_gap_tests.sh).

No source changes; no existing job altered.

coderabbitai Bot commented Jun 15, 2026


Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds scripts/run_gap_tests.sh, a Bash script that runs the test_gap_* parity subset via run_parity_tests.sh, extracts failures from the JSON report, diffs them against a known-failures list, and exits nonzero on untriaged failures. A new smoke-parity CI job calls this script with continue-on-error: true. CLAUDE.md is updated to reference the new command and the expanded 235-test count.
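
The gate step described above can be sketched roughly as follows. This is a minimal illustration, not the committed script: it assumes the failing test names have already been extracted from the JSON report into a plain text file with one name per line (the report and known-failures schemas are not shown in this PR), and gap_gate and its two file arguments are hypothetical names.

```bash
#!/usr/bin/env bash
# Sketch only: compare the failures of this run with the triaged list.
# Inputs are plain text files with one test name per line (an assumption;
# the real script reads JSON files whose schema is not shown here).

gap_gate() {
  all_fails=$1   # failures from this run
  known=$2       # failures that were already triaged
  work=$(mktemp -d) || return 2

  # comm requires sorted input; sort -u also drops duplicate names.
  sort -u "$all_fails" > "$work/all"
  sort -u "$known" > "$work/known"

  # -23 keeps only lines unique to the first file: untriaged failures.
  comm -23 "$work/all" "$work/known" > "$work/new"

  if [ -s "$work/new" ]; then
    echo "Untriaged gap failures:"
    cat "$work/new"
    rm -rf "$work"
    return 1
  fi
  rm -rf "$work"
  echo "Gap gate OK"
}
```

Gating on the set difference rather than on an aggregate pass rate means a single new failure fails the run even when the overall percentage barely moves.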

Changes

Gap test smoke gate

| Layer / File(s) | Summary |
| --- | --- |
| Gap test runner and triage gate: scripts/run_gap_tests.sh | New script invokes run_parity_tests.sh --filter test_gap_, reads test-parity/reports/latest.json to collect parity and compile failures, diffs them against test-parity/known_failures.json, and exits 1 with a printed list when untriaged failures are found; exits 0 with "Gap gate OK" otherwise. |
| CI job wiring and docs update: .github/workflows/test.yml, CLAUDE.md | Adds the smoke-parity GitHub Actions job (continue-on-error: true, 60-min timeout, Node.js 22) that runs ./scripts/run_gap_tests.sh; updates CLAUDE.md to reflect 235 tracked tests and the committed script as the run command. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • PerryTS/perry#5116: Defines the existing gc-stress optional gate job that the new smoke-parity job is positioned alongside in the workflow.

Poem

🐇 Hop, hop! The gap tests now run in CI lane,
A Bash script sniffs failures, untriaged ones get named.
continue-on-error keeps the pipeline sane,
Known failures are skipped — no false alarms remain.
The rabbit checks the gate: "Gap gate OK!" it exclaims. ✅

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Title check | ✅ Passed | The title accurately summarizes the two main changes: committing the gap-suite runner script and fixing the stale CLAUDE.md reference. |
| Description check | ✅ Passed | The description covers the summary, concrete changes, and rationale. It explains why the gap runner is needed and why the smoke-parity CI job was dropped, fulfilling the template's core requirements. |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/test.yml:
- Around line 284-295: Replace all mutable version tags with full commit SHAs
for the four GitHub Actions used in the smoke-parity job in the test.yml
workflow file. Update actions/checkout from `@v6` to its commit SHA,
dtolnay/rust-toolchain from `@stable` to its commit SHA, Swatinem/rust-cache from
`@v2` to its commit SHA, and actions/setup-node from `@v6` to its commit SHA. Each
action should be specified in the format owner/repo@<full-commit-sha> to ensure
reproducibility and prevent supply-chain drift.

In `@scripts/run_gap_tests.sh`:
- Around line 50-60: The script uses fixed temporary filenames like
/tmp/gap_all_fails.txt, /tmp/gap_known.txt, and /tmp/gap_new.txt which can
conflict when multiple script instances run concurrently. Replace all three
fixed temp filenames with dynamically generated unique names using mktemp or by
incorporating a process-specific identifier (such as $$) into each filename.
Update the assignments where these temp files are created from $REPORT, $KNOWN,
and the comm command output, and ensure the wc -l command references the correct
updated variable name for /tmp/gap_all_fails.txt.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 301df9eb-f909-4158-8e87-a5913c9fbe0e

📥 Commits

Reviewing files that changed from the base of the PR and between 4d405b4 and 4262e2e.

📒 Files selected for processing (3)
  • .github/workflows/test.yml
  • CLAUDE.md
  • scripts/run_gap_tests.sh

Comment thread .github/workflows/test.yml Outdated
Comment thread scripts/run_gap_tests.sh Outdated

@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (1)
.github/workflows/test.yml (1)

124-126: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pin mozilla-actions/sccache-action to a full commit SHA.

Line 125, Line 206, and Line 402 use a mutable tag (@v0.0.10). For CI supply-chain integrity and reproducibility, this needs owner/repo@<40-char-sha> pinning.

Suggested change
-      - name: Start sccache
-        uses: mozilla-actions/sccache-action@v0.0.10
+      - name: Start sccache
+        uses: mozilla-actions/sccache-action@<full-commit-sha>

Also applies to: 205-207, 401-403
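
One way to check a workflow for this class of issue is to list every uses: reference that is not pinned to a 40-character hexadecimal SHA. The helper below is an illustrative sketch, not part of this PR: unpinned_actions is a hypothetical name, and the pattern assumes one uses: entry per line.

```bash
#!/usr/bin/env bash
# Sketch only: print "line:text" for each `uses:` entry in a workflow file
# whose ref is not a full 40-character commit SHA.

unpinned_actions() {
  # 1) keep lines that declare an action,
  # 2) skip local (./path) and docker:// references, which have no git ref,
  # 3) drop entries already pinned to a 40-hex SHA (a trailing comment
  #    such as "# v2" after the SHA is allowed).
  grep -nE '^[[:space:]]*(-[[:space:]]*)?uses:' "$1" \
    | grep -vE 'uses:[[:space:]]*(\./|docker://)' \
    | grep -vE '@[0-9a-f]{40}([[:space:]]|$)' || true
}
```

An empty result means every external action in the file is pinned; any printed line still uses a tag or branch name.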

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test.yml around lines 124 - 126, The
mozilla-actions/sccache-action action is pinned to a mutable version tag
(`@v0.0.10`) at three locations in the workflow file, which compromises
reproducibility and supply-chain security. Replace the mutable tag `@v0.0.10` with
a full 40-character commit SHA for the mozilla-actions/sccache-action action at
all three affected locations: .github/workflows/test.yml lines 125, 205-207, and
401-403. Use the same commit SHA across all three locations to ensure consistent
behavior.

Source: Linters/SAST tools


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 7070c0c8-861a-47aa-ba29-e1f0a32ed83b

📥 Commits

Reviewing files that changed from the base of the PR and between 4262e2e and 5a309ab.

📒 Files selected for processing (2)
  • .github/workflows/test.yml
  • CLAUDE.md
✅ Files skipped from review due to trivial changes (1)
  • CLAUDE.md

@TheHypnoo

Member Author

Good catch — fixed in the latest commit. run_gap_tests.sh now allocates a run-scoped mktemp -d dir and cleans it up via trap ... EXIT, so concurrent runs can't clobber each other's failure lists.
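
The pattern described in this reply generally looks like the sketch below. It is not the committed script, which may structure this differently; with_run_dir is a hypothetical helper name, and it runs in a subshell so that the EXIT trap is scoped to a single call.

```bash
#!/usr/bin/env bash
# Sketch only: give each run a private scratch directory and always clean up.

# The function body is a subshell, so the EXIT trap fires when that subshell
# ends, whether the wrapped command succeeded or failed.
with_run_dir() (
  # mktemp -d creates a uniquely named directory, so concurrent runs
  # cannot overwrite each other's files.
  workdir=$(mktemp -d) || exit 2
  trap 'rm -rf "$workdir"' EXIT

  # Run the given command with the scratch directory as its last argument.
  "$@" "$workdir"
)
```

For example, `with_run_dir sh -c 'echo "scratch dir: $1"' _` prints the path of a directory that no longer exists once the call returns.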

TheHypnoo and others added 4 commits June 16, 2026 12:27
run_gap_tests.sh wrote its failure lists to fixed /tmp/gap_*.txt names, so
concurrent runs (a second PR, local + CI on the same box, or the upcoming
node-suite-guard alongside) could clobber each other and produce a false
gate result. Allocate a run-scoped dir with mktemp -d and rm -rf it on EXIT.
…ls: false

Two fixes to the informational gap gate:

- Oracle node 26 -> 22. The gap suite is byte-diffed live against
  `node --experimental-strip-types`, and it is already green under node 22
  (the legacy parity job). Node 26 introduced version-sensitive diffs
  (v8 / perf_hooks / process internals — e.g. test_gap_node_v8, test_gap_v8_2,
  test_gap_perfhooks, test_gap_process_*) that are not Perry regressions and
  would pollute the triage list. The node-suite regression guard keeps node 26
  for its frozen pass-count baseline — a separate mechanism.
- persist-credentials: false on checkout. The job is read-only (build + test),
  so it should not leave the GITHUB_TOKEN in the local git config (CodeRabbit /
  least privilege).

@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (1)
.github/workflows/test.yml (1)

319-334: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pin smoke-parity actions to full commit SHAs.

Line 319 through Line 334 still use mutable refs (@v6, @stable, @v2), which leaves this job open to supply-chain drift. Please pin all four actions in this job to immutable SHAs.

#!/bin/bash
# Verify mutable action refs specifically within smoke-parity job.
# Expected after fix: no output.
awk '
  /^  smoke-parity:/ {in_job=1}
  in_job && /^  [a-zA-Z0-9_-]+:/ && $1 != "smoke-parity:" {in_job=0}
  in_job {print NR ":" $0}
' .github/workflows/test.yml \
| rg -n 'uses:\s*[^@[:space:]]+@(v[0-9]+|stable|main|master)$'
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test.yml around lines 319 - 334, Pin all four actions in
the smoke-parity job to immutable full commit SHAs instead of mutable version
references. Replace the `@v6` reference in actions/checkout, the `@stable`
reference in dtolnay/rust-toolchain, the `@v2` reference in Swatinem/rust-cache,
and the `@v6` reference in actions/setup-node with their respective full commit
SHA values to prevent supply-chain drift and ensure reproducible builds.

Source: Linters/SAST tools


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: e759d216-e2aa-4101-92bd-411b33cc243a

📥 Commits

Reviewing files that changed from the base of the PR and between 101ce8d and 6d5a488.

📒 Files selected for processing (1)
  • .github/workflows/test.yml

proggeramlug and others added 2 commits June 18, 2026 12:52
… runner

The smoke-parity job was continue-on-error (never blocking) and started
all-red (60 untriaged pre-existing gaps), so on every PR it cost a full
release build + 237-test run while enforcing nothing. Drop it; this PR now
just commits the gap runner and fixes the stale CLAUDE.md /tmp pointer. A
real gap gate should land later with a green baseline and as a required
check in one move.
@TheHypnoo TheHypnoo changed the title from "ci(test): committed gap-suite runner + informational smoke-parity gate" to "test: commit gap-suite runner + fix stale CLAUDE.md /tmp pointer" Jun 18, 2026
@TheHypnoo TheHypnoo merged commit c84bba3 into main Jun 18, 2026
15 checks passed
@TheHypnoo TheHypnoo deleted the chore/ci-smoke-parity-gate branch June 18, 2026 22:12
2 participants