Skip to content

fix(hir): functional source-compile correctness batch 3 (semver + minimatch green)#5408

Merged
proggeramlug merged 4 commits into
mainfrom
fix/functional-correctness-batch3
Jun 18, 2026
Merged

fix(hir): functional source-compile correctness batch 3 (semver + minimatch green)#5408
proggeramlug merged 4 commits into
mainfrom
fix/functional-correctness-batch3

Conversation

@proggeramlug

@proggeramlug proggeramlug commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Batch 3: functional source-compile correctness fixes

Found via the differential harness (secret-tests/corpus/functional). All fixes are in HIR lowering; no runtime/codegen changes.

Functional harness OK: 28 → 30 (semver + minimatch now FULLY green, byte-for-byte vs node --experimental-strip-types). Import-smoke 59/59 (no regression).

Bugs fixed (root cause)

1. .test() / .exec() on an untyped local mis-lowered to RegExpTestcommit 334a6b4
receiver.test(arg) was lowered to the Expr::RegExpTest codegen fast path whenever the receiver was a regex literal OR an identifier whose local type was Any/Unknown/unknown (unwrap_or(true)). But .test() is also a common INSTANCE method (semver's Comparator.test / Range.test). An imported class instance is typed Any at the call site, so comparator.test(version) mis-lowered into js_regexp_test(comparator-coerced-to-string, …) → silently returned a bogus boolean, never ran the method body → semver.satisfies(...)/maxSatisfying always returned false. The runtime already routes a genuine RegExp receiver's .test()/.exec() through dynamic dispatch (dispatch_regex_receiver_method, #1731), so falling through to a normal method call is correct for both a regex value and a class instance. Now only takes the fast path with positive evidence (regex literal, or RegExp-typed local). → semver green.

2. Named class EXPRESSION hijacks a same-named top-level/imported classcommits fe3b74d, 34b92bc
A named class expression nested in a function body whose name collided with a top-level class X declaration OR an imported class binding reused that class's ClassId (via lookup_class), silently overwriting the real class with the nested (often near-empty) expression. Per JS spec a class-expression's name binds only inside its own body. minimatch's defaults() returns Object.assign(m, { Minimatch: class Minimatch extends orig.Minimatch {…}, AST: class AST extends orig.AST {…} })Minimatch collides with the real export class Minimatch (→ new Minimatch(pattern) built a body-less instance, every field/method undefined, constructor never ran), and AST collides with import { AST } (→ AST.fromGlob(...) inside Minimatch.parse dispatched to the wrong empty class). Fix: track top-level class-declaration names and rename a colliding class expression to a fresh unique <name>__class_expr_N so it gets its own ClassId; the value position (object property / new site) holds the resulting ClassRef directly.

3. .match() / .matchAll() on an untyped arg mis-lowered to StringMatchcommit 34b92bc
Same heuristic class as (1): obj.match(arg) was lowered to Expr::StringMatch (String.prototype.match(regex)) whenever the ARG was an Any/Unknown/untyped local, treating the RECEIVER as a string. minimatch's top-level minimatch(p, pat) arrow returns new Minimatch(pat).match(p) (untyped param p), which mis-lowered to StringMatch(new Minimatch(pat), p) → returned null instead of the boolean. Runtime already handles a genuine string receiver's .match(regex) via dynamic dispatch (#519/#510). Now only takes the fast path with positive regex evidence. → minimatch green (combined with #2).

Characterized (not fixed — deeper, package-specific)

  • zodnew Class({…}) where Class is a function PARAMETER holding a cross-module class; undefined is not a constructor at core/api.ts:67. A related minimal repro shows the cross-module case loses object-literal-spread fields (#809-adjacent).
  • markedmarked.parse(md) returns [object Object] (deep Marked/Parser instance + parseMarkdown arrow-bound-at-construction chain).
  • immer — Proxy-trap path (proxy.ts), reading 'slice'.
  • lodashbaseIsNative's reIsNative = RegExp('…') module-level regex is undefined (was previously masked; the .test() fix correctly surfaces it as a throw).
  • ajv (runtime SIGSEGV), cheerio (compile-time stack overflow over ~222 modules), joi, jsonwebtoken, yup — not yet localized.

Packages newly FULLY green

  • semver, minimatch

Tests

  • crates/perry-hir/tests/instance_test_method_not_regex.rs — asserts untyped/member .test()/.match() receivers do NOT lower to RegExpTest/StringMatch, while regex-literal forms still do.
  • crates/perry-hir/tests/nested_class_expr_name_collision.rs — asserts a same-named nested class expression doesn't clobber the top-level class's methods/fields.

Notes

  • No version / CHANGELOG / Cargo.toml bump (maintainer folds at merge).
  • Lint: cargo fmt --all --check clean. Pre-existing check_file_size.sh failure on crates/perry/src/commands/compile/link/mod.rs (2132 lines) is unchanged debt on base (git show HEAD:…link/mod.rs | wc -l = 2132), not touched by this PR.
  • Known macOS-local red ignored per instructions: require_native_builtin_lowers_like_namespace_import.

Summary by CodeRabbit

  • Bug Fixes

    • Stricter type validation for regex method calls (.test(), .exec(), .match(), .matchAll()) now requires explicit type evidence
    • Fixed handling of named class expressions to prevent overriding module-level classes with identical names
  • Tests

    • Added regression tests for regex method call behavior and type validation
    • Added tests for nested class expression naming collision scenarios

Ralph Küpper added 3 commits June 18, 2026 13:58
`receiver.test(arg)` was lowered to the RegExp `Expr::RegExpTest` codegen
fast path whenever the receiver was a regex literal OR an identifier whose
local type was `Any`/`Unknown`/unknown (`unwrap_or(true)`). But `.test()`
is also a common INSTANCE method name — semver's `Comparator.test` /
`Range.test`, minimatch's matcher `.test`, etc. An imported class instance
is typed `Any` at the call site, so `comparator.test(version)` mis-lowered
into `js_regexp_test(comparator-coerced-to-string, ...)`: it silently
returned a bogus boolean and never ran the real method body, so
`semver.satisfies(...)` always returned false.

The runtime already routes a genuine RegExp receiver's `.test()`/`.exec()`
through dynamic method dispatch (dispatch_regex_receiver_method, #1731), so
falling through to a normal method call is correct for BOTH a regex value
(runtime regex dispatch) and a class instance (instance method). Only take
the codegen fast path with positive evidence: a regex literal receiver, or
a local explicitly typed `RegExp`.

Fixes semver source-compile (satisfies/maxSatisfying); functional harness
OK 28 -> 29. Regression test asserts untyped/member receivers do NOT lower
to RegExpTest while a regex literal still does.
… decl

A NAMED class EXPRESSION nested in a function body whose name collides with
a TOP-LEVEL class DECLARATION in the same module reused the declaration's
ClassId / module-scope registration (lookup_class hit the pre-registered
top-level id), silently overwriting the real class with the nested one. Per
JS spec a class-expression's name binds only inside its own body, so the two
are distinct classes.

minimatch's `defaults()` returns
  Object.assign(m, { Minimatch: class Minimatch extends orig.Minimatch
    { constructor(){super(...)} static defaults(){…} } })
— a near-empty nested expression sharing the name of the real
`export class Minimatch { …18 fields, ~10 methods… }`. The nested one won,
so `new Minimatch(pattern)` built a body-less instance (every field/method
undefined; the constructor never ran — assertValidPattern didn't even
throw on a bad pattern).

Fix: track top-level class-declaration names (module_class_decl_names,
populated by the module pre-pass) and, in the class-expression value path,
rename a colliding named expression to a fresh unique `<name>__class_expr_N`
so it gets its own ClassId. The value position (object property / new site)
holds the resulting ClassRef directly, so the original name isn't needed at
module scope.

Advances minimatch past the empty-constructor wall (constructor body now
runs, this.set populated); a separate follow-on (AST.fromGlob in parse)
remains. No functional-harness OK regression (still 29); import-smoke 59/59.
Regression test asserts the top-level class keeps its real methods/fields.
…ss-expr also dodges imported-class collision

Two fixes that together fully green minimatch's source-compile:

1. .match()/.matchAll() heuristic (regex_string.rs): `obj.match(arg)` was
   lowered to `Expr::StringMatch` (String.prototype.match(regex)) whenever
   the ARG was an `Any`/`Unknown`/untyped local — treating the RECEIVER as a
   string. But `.match()` is also an instance method: minimatch's top-level
   `minimatch(p, pat)` arrow returns `new Minimatch(pat).match(p)`, where
   `p` is an untyped param. That mis-lowered to
   `StringMatch(new Minimatch(pat), p)` → the call returned `null` instead
   of the boolean match result. The runtime already routes a genuine string
   receiver's `.match(regex)` through dynamic dispatch (#519/#510), so
   falling through to a normal method call is correct for both a string and
   a class instance. Only take the StringMatch fast path with positive
   evidence the arg is a regex (literal or RegExp-typed local). Mirrors the
   earlier `.test()`/`.exec()` fix.

2. class-expression collision (lower_expr.rs): extend the same-named guard
   to also cover IMPORTED class bindings (lookup_imported_func), not just
   top-level declarations. minimatch's `defaults()` nests `{ AST: class AST
   extends orig.AST {…} }` whose name collides with `import { AST } from
   './ast.js'`; the nested expression hijacked the imported AST's identity,
   so `AST.fromGlob(...)` inside `Minimatch.parse` dispatched to the wrong
   (empty) class (`reading 'AST'`). Now renamed to a fresh unique name.

minimatch is now FULLY green (matches node). functional OK 29 -> 30;
import-smoke 59/59; string.match with literal/RegExp-typed args unaffected.
Regression tests cover chained `new X().match()` and string-literal match.
@coderabbitai

coderabbitai Bot commented Jun 18, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Two independent HIR lowering correctness fixes: a new module_class_decl_names field is added to LoweringContext and populated during the top-level class pre-pass, then used in lower_expr to rename colliding nested class expressions; separately, try_regex_string_methods now requires explicit RegExp type evidence (not Any/Unknown) before taking regex fast paths for .test()/.exec()/.match()/.matchAll().

Changes

Nested Class Expression Name Collision Fix

Layer / File(s) Summary
LoweringContext field and constructor init
crates/perry-hir/src/lower/lowering_context.rs, crates/perry-hir/src/lower/context.rs
Adds module_class_decl_names: HashSet<String> to LoweringContext with inline docs and initializes it as an empty HashSet in with_class_id_start.
Top-level class pre-pass registration
crates/perry-hir/src/lower/lower_module_fn.rs
During the module pre-pass that registers top-level class declarations, each class binding name is now inserted into ctx.module_class_decl_names.
Collision detection, rename, and regression test
crates/perry-hir/src/lower/lower_expr.rs, crates/perry-hir/tests/nested_class_expr_name_collision.rs
lower_expr's class-expression value path checks the name against module_class_decl_names, existing class bindings, and imported classes; renames colliding nested expressions to a fresh __class_expr_-suffixed identifier. Regression test verifies the top-level class's methods and fields are preserved.

Regex Fast-Path Type-Evidence Strictness

Layer / File(s) Summary
Stricter type checks in try_regex_string_methods
crates/perry-hir/src/lower/expr_call/regex_string.rs
For .test()/.exec(), an identifier receiver is accepted as RegExp only when lookup_local_type resolves to Type::Named("RegExp"); same requirement applied to the argument for .match()/.matchAll(). Prior Any/Unknown fallbacks removed.
Regression tests
crates/perry-hir/tests/instance_test_method_not_regex.rs
Five tests: untyped-receiver .test() → no RegExpTest; member-access-receiver .test() → no RegExpTest; regex-literal receiver .test()RegExpTest; untyped-argument .match() → no StringMatch; regex-literal argument .match()StringMatch.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • PerryTS/perry#5325: Both PRs address class-name collisions for nested classes so that an inner class doesn't incorrectly reuse or resolve to the outer/module-scope binding or ClassId during lowering.

Poem

🐇 Two bugs were lurking, hiding in the hir,
A nested class that stole its parent's id with flair,
And regex fast-paths fooled by types unknown—
Now strict type checks ensure each name's its own!
The carrot of correctness, freshly grown. 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title directly summarizes the main changes: functional correctness fixes for HIR lowering, with semver and minimatch now passing tests.
Description check ✅ Passed The description is comprehensive, covering summary, root causes of bugs fixed, test additions, and verification steps. It follows the template structure and includes all essential details.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/functional-correctness-batch3

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
crates/perry-hir/tests/instance_test_method_not_regex.rs (1)

67-78: 💤 Low value

Consider adding a test for explicitly typed RegExp locals taking the fast path.

The tests cover regex literals and untyped/any receivers, but there's no test verifying that an explicitly typed RegExp local (e.g., const re: RegExp = ...) correctly takes the RegExpTest fast path. This would provide symmetry with the literal case and ensure the Type::Named("RegExp") check in regex_string.rs:50-53 is exercised.

#[test]
fn explicitly_typed_regexp_receiver_uses_fast_path() {
    let src = r#"
        const re: RegExp = /foo/;
        const hit = re.test("foobar");
    "#;
    let module = lower(src);
    assert!(
        module_has_regexp_test(&module),
        "`re.test(...)` with explicit RegExp type should lower to RegExpTest"
    );
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/perry-hir/tests/instance_test_method_not_regex.rs` around lines 67 -
78, Add a new test function called
explicitly_typed_regexp_receiver_uses_fast_path to the test file that verifies
an explicitly typed RegExp local variable correctly takes the RegExpTest fast
path. The test should create a source string with a constant variable explicitly
typed as RegExp (e.g., const re: RegExp = /foo/), call the test method on it,
lower the module using the lower function, and then assert using
module_has_regexp_test that the module contains the RegExpTest operator. This
test ensures the Type::Named("RegExp") check in regex_string.rs is properly
exercised and provides symmetry with the existing regex literal and untyped
receiver tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/perry-hir/src/lower/lower_expr.rs`:
- Around line 2380-2383: The condition check at lines 2380-2383 includes an
overly broad collision check with ctx.lookup_imported_func(&n).is_some() that is
unnecessary and causes unintended renaming behavior. According to the comment at
line 2368, imported class bindings are already caught via ctx.lookup_class(), so
the imported function check should be removed. Remove the
ctx.lookup_imported_func(&n).is_some() portion from the logical OR condition,
keeping only the checks for ctx.module_class_decl_names.contains(&n),
ctx.lookup_class(&n).is_some(), and the current_class comparison to restrict the
predicate to class bindings only.

---

Nitpick comments:
In `@crates/perry-hir/tests/instance_test_method_not_regex.rs`:
- Around line 67-78: Add a new test function called
explicitly_typed_regexp_receiver_uses_fast_path to the test file that verifies
an explicitly typed RegExp local variable correctly takes the RegExpTest fast
path. The test should create a source string with a constant variable explicitly
typed as RegExp (e.g., const re: RegExp = /foo/), call the test method on it,
lower the module using the lower function, and then assert using
module_has_regexp_test that the module contains the RegExpTest operator. This
test ensures the Type::Named("RegExp") check in regex_string.rs is properly
exercised and provides symmetry with the existing regex literal and untyped
receiver tests.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: e55d623a-52d0-4871-b60e-532c27439a81

📥 Commits

Reviewing files that changed from the base of the PR and between 54ebb6c and 34b92bc.

📒 Files selected for processing (7)
  • crates/perry-hir/src/lower/context.rs
  • crates/perry-hir/src/lower/expr_call/regex_string.rs
  • crates/perry-hir/src/lower/lower_expr.rs
  • crates/perry-hir/src/lower/lower_module_fn.rs
  • crates/perry-hir/src/lower/lowering_context.rs
  • crates/perry-hir/tests/instance_test_method_not_regex.rs
  • crates/perry-hir/tests/nested_class_expr_name_collision.rs

Comment on lines +2380 to +2383
if (ctx.module_class_decl_names.contains(&n)
|| ctx.lookup_class(&n).is_some()
|| ctx.lookup_imported_func(&n).is_some())
&& ctx.current_class.as_deref() != Some(n.as_str()) =>

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "lookup_imported_func|register_imported_func" crates/perry-hir/src/lower -g '*.rs'
rg -n "lookup_class|register_class" crates/perry-hir/src/lower -g '*.rs'
rg -n "lookup_imported_class|register_imported_class" crates/perry-hir/src/lower -g '*.rs'

Repository: PerryTS/perry

Length of output: 16118


🏁 Script executed:

sed -n '2370,2390p' crates/perry-hir/src/lower/lower_expr.rs | cat -n

Repository: PerryTS/perry

Length of output: 1455


🏁 Script executed:

# Check how imported items are registered, especially classes
sed -n '280,340p' crates/perry-hir/src/lower/module_decl.rs | cat -n

Repository: PerryTS/perry

Length of output: 4550


🏁 Script executed:

# Look for context on how imported classes flow through the system
rg -n "imported.*class|class.*import" crates/perry-hir/src/lower/lower_expr.rs -A 2 -B 2

Repository: PerryTS/perry

Length of output: 1272


🏁 Script executed:

rg -n "class_expr|CollisionDetect" crates/perry-hir/src/lower/tests.rs -A 3 -B 3

Repository: PerryTS/perry

Length of output: 39


🏁 Script executed:

# Check if there are any tests that exercise the imported function case for class expressions
rg -n "new.*\(class|class.*expression" crates/perry-hir/src/lower/tests.rs -A 2 -B 2

Repository: PerryTS/perry

Length of output: 39


🏁 Script executed:

# Look for documentation or comments about why lookup_imported_func might be needed
sed -n '2350,2400p' crates/perry-hir/src/lower/lower_expr.rs | cat -n

Repository: PerryTS/perry

Length of output: 3540


Remove overly broad collision check from class expression renaming logic.

The comment at line 2368 explicitly states that imported class bindings are caught via lookup_class (since "named class imports are registered too"). Checking ctx.lookup_imported_func(&n) is unnecessary and over-broad—it triggers renaming on collisions with imported functions, not just classes. This changes behavior for non-class import name collisions when it shouldn't. Restrict the predicate to class bindings only.

Suggested diff
-                    if (ctx.module_class_decl_names.contains(&n)
-                        || ctx.lookup_class(&n).is_some()
-                        || ctx.lookup_imported_func(&n).is_some())
+                    if (ctx.module_class_decl_names.contains(&n)
+                        || ctx.lookup_class(&n).is_some())
                         && ctx.current_class.as_deref() != Some(n.as_str()) =>
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/perry-hir/src/lower/lower_expr.rs` around lines 2380 - 2383, The
condition check at lines 2380-2383 includes an overly broad collision check with
ctx.lookup_imported_func(&n).is_some() that is unnecessary and causes unintended
renaming behavior. According to the comment at line 2368, imported class
bindings are already caught via ctx.lookup_class(), so the imported function
check should be removed. Remove the ctx.lookup_imported_func(&n).is_some()
portion from the logical OR condition, keeping only the checks for
ctx.module_class_decl_names.contains(&n), ctx.lookup_class(&n).is_some(), and
the current_class comparison to restrict the predicate to class bindings only.

@proggeramlug proggeramlug merged commit 9697fcd into main Jun 18, 2026
15 checks passed
@proggeramlug proggeramlug deleted the fix/functional-correctness-batch3 branch June 18, 2026 17:50
nglmercer pushed a commit to nglmercer/perry that referenced this pull request Jun 18, 2026
…lict in compile/link/mod.rs

Brings main's PerryTS#5400 (Windows response-file link) + PerryTS#5406/PerryTS#5408/PerryTS#5410/PerryTS#5412 codegen
split + PerryTS#5402 require Tier 2 + PerryTS#5414 unknown builtin-namespace fix into the
Windows plugin support branch. The conflict in compile/link/mod.rs is
structural: main extracted the orchestrator into link/build_and_run.rs
while the branch kept it inline. Resolved by taking main's refactor and
re-applying the Windows plugin support on top:

- compile/link/mod.rs: add PLUGIN_HOST_SYMBOLS const (the runtime + plugin
  export list — single source of truth for the macOS -u force-keep and
  the Windows .def file) plus use std::io::Write for the .def writer.

- compile/link/build_and_run.rs: convert the if ctx.needs_plugins &&
  !is_windows block to if ctx.needs_plugins with platform branches.
  The Windows branch writes a per-build perry_plugin_host_<pid>.def
  using the NAME directive (not LIBRARY — that one tells link.exe
  the output is a DLL and breaks the host .exe) and passes
  /DEF:<path> to the linker. PLUGIN_HOST_SYMBOLS replaces the inline
  
untime_syms array on macOS too.

- compile.rs: dylib link path on Windows now uses lld-link with
  /FORCE:UNRESOLVED (the .def file lists plugin_activate /
  perry_plugin_abi_version / plugin_deactivate so lld-link's empty
  default export table doesn't break GetProcAddress in the host) and
  writes a per-build .def file with the same three symbols. Lifts the
  pre-existing 'use link.exe' out for the LLVM-friendly linker.

- codegen/entry.rs: re-apply the dylib plugin ABI shim block (emits
  perry_plugin_abi_version, plugin_activate, and the optional
  plugin_deactivate) on top of main's refactor of compile_module_entry.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant