Skip to content

perf(transform): inline small this-using methods on exact receivers (1.6x on method_calls)#5092

Merged
proggeramlug merged 8 commits into
mainfrom
perf/inline-this-methods
Jun 14, 2026
Merged

perf(transform): inline small this-using methods on exact receivers (1.6x on method_calls)#5092
proggeramlug merged 8 commits into
mainfrom
perf/inline-this-methods

Conversation

@TheHypnoo

@TheHypnoo TheHypnoo commented Jun 13, 2026

Copy link
Copy Markdown
Member

Unlocks HIR inlining of small monomorphic methods that read/write this.field — previously they were never inlined, so a hot obj.method() in a loop always paid full dynamic dispatch + a non-inlined call per iteration.

Two gaps fixed (both in perry-transform/src/inline/)

  1. Loop bodies were seeded with empty exact-receiver facts (call_inliner.rs, While/DoWhile/For arms). A const c = new Counter() before a loop established the c → Counter fact, but the loop body got a fresh empty fact set, so the receiver class was unknown inside the loop and nothing inlined. Now the body inherits the loop-invariant subset of outer facts (loop_invariant_seed_facts): a fact is kept only if its local is never reassigned anywhere in the loop (collect_mutated_local_ids, which recurses into closures). A receiver rebound mid-loop drops its fact → not mis-inlined.

  2. is_inlinable rejected every method referencing this (body_references_dynamic_this) — i.e. essentially all real methods — even though the method-inliner substitutes this for the concrete receiver (substitute_this_in_stmts). Added is_inlinable_method, which allows a direct this (substituted) but still rejects new.target and any nested closure (whose lexical this/new.target is not rewritten), via method_body_blocks_this_substitution.

Soundness

  • method_lookup_safe still gates each call site, so methods on classes with dynamic/native extends (runtime-dependent prototype chains) are not inlined.
  • Loop fact inheritance drops any receiver mutated in the loop, so reassigned receivers keep dynamic dispatch.
  • new.target and nested closures keep the method out of the inlinable set.

Result

Benchmark Before After
bench_method_calls (10M monomorphic counter.increment()) 5229ms 3221ms (1.6×)

increment is now inlined into the loop. General win for this.field-heavy OOP code. (The remaining gap to Node is the per-access field-shape guards — separate follow-up, see plan 008.)

Validation

  • cargo test -p perry-transform -p perry-codegen -p perry-runtime — pass.
  • 29/29 suite benchmarks match node --experimental-strip-types.
  • Class-heavy correctness incl. a receiver reassigned inside a loop (p = new Counter() in the loop body) matches Node — proving the loop-fact mutation guard prevents mis-inlining.
  • Full local parity: zero regressions. Every pre-existing mismatch fails identically with and without this change (verified by rebuilding the base binary and diffing combined stdout+stderr per file).

Summary by CodeRabbit

  • Optimizations
    • Expanded inlining for methods called on a known receiver with stricter eligibility checks and improved this accessor/mutator handling.
    • Improved loop-body inlining by more accurately seeding receiver information while excluding facts tied to receivers potentially mutated by loop control flow.
  • Bug Fixes / Safety
    • Strengthened correctness by detecting lexical super usage and adding a recursion guard to prevent unbounded inlining work.
    • Improved detection for cases that block safe this substitution during inlining.

Two gaps kept hot monomorphic method calls from ever being inlined, so a call
like `const c = new Counter(); for (...) c.increment();` always paid full
dynamic-dispatch + a non-inlined call per iteration:

1. Loop bodies were seeded with EMPTY exact-receiver facts (While/DoWhile/For in
   call_inliner.rs), discarding the `c -> Counter` fact established before the
   loop — so the receiver class was unknown inside the loop and nothing inlined.
   Now the body inherits the loop-invariant subset of outer facts via
   `loop_invariant_seed_facts`: a fact is kept only if its local is never
   reassigned anywhere in the loop (collect_mutated_local_ids, which recurses
   into closures), so a receiver rebound mid-loop correctly drops its fact and
   is not mis-inlined.

2. is_inlinable rejected EVERY method that references `this`
   (body_references_dynamic_this) — i.e. essentially all real methods — even
   though the method-inliner substitutes `this` for the concrete receiver
   (substitute_this_in_stmts). Added is_inlinable_method, which allows a direct
   `this` (it gets substituted) but still rejects new.target and any nested
   closure (whose lexical this/new.target binding is not rewritten), via
   method_body_blocks_this_substitution.

method_lookup_safe still gates each call site, so methods on classes with
dynamic/native `extends` (runtime-dependent prototype) are NOT inlined.

bench_method_calls (10M monomorphic counter.increment()): 5229ms -> 3221ms
(1.6x); the method body is now inlined into the loop. General win for OOP code.

Validation: perry-transform + perry-codegen + perry-runtime tests pass; 29/29
suite benchmarks match Node; class-heavy correctness incl. a receiver reassigned
inside a loop matches Node; full local parity shows zero regressions vs the base
(all pre-existing mismatches fail identically with and without this change).
@coderabbitai

coderabbitai Bot commented Jun 13, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • ✅ Review completed - (🔄 Check again to review again)
📝 Walkthrough

Walkthrough

This PR enhances method inlining by adding a this-substitution safety check, a method-specific inlinability predicate, super-usage detection utilities, and conservative loop-invariant receiver fact seeding; it wires these utilities into the inliner driver.

Changes

Method inlining with this-substitution safety, super detection, and loop optimization

Layer / File(s) Summary
This-substitution safety and method inlinability
crates/perry-transform/src/inline/closure_analysis.rs, crates/perry-transform/src/inline/analysis.rs
method_body_blocks_this_substitution detects NewTarget and nested closures that block safe this substitution. is_inlinable_method applies standard inlinability checks and uses the safety guard to permit this-based accessor/mutator inlining when safe.
Super-usage and recursion depth detection
crates/perry-transform/src/inline/super_detect.rs
Introduces thread-local recursion depth guard with RAII semantics, expression and statement tree scanners for Expr::SuperPropertySet, and method-level predicate combining them to detect whether a method body contains lexical-super or exceeds inline recursion depth limits.
Loop-invariant receiver facts optimization
crates/perry-transform/src/inline/call_inliner.rs
Updates imports, introduces loop_invariant_seed_facts helper that filters ExactReceiverFacts by excluding LocalIds mutated in loop bodies or extra expressions, and applies this helper to seed loop body inlining for while, do-while, and for loops.
Module integration and method filtering
crates/perry-transform/src/inline/mod.rs
Adds super_detect submodule and re-exports its helpers alongside is_inlinable_method and method_body_blocks_this_substitution; updates method candidate selection to use is_inlinable_method instead of the generic is_inlinable predicate.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

I'm a rabbit in the code so spry,
I peek where this may safely lie,
I scan for super creeping near,
I seed the loops with facts most clear,
and leave inlined methods with a happy cheer 🐇

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main performance optimization: enabling inlining of this-using methods with 1.6x speedup on method calls.
Description check ✅ Passed The description comprehensively covers the two key fixes, soundness guarantees, benchmark results, and validation. It follows the template structure with Summary, Changes, Related issue (implicit: perf improvement), and Test plan sections.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch perf/inline-this-methods

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/perry-transform/src/inline/call_inliner.rs (1)

661-678: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

For-loop init mutations are not included in receiver-fact invalidation

Line 717 seeds body_facts from exact_receiver_facts, but the mutation filter currently excludes init (Lines 661-678). If init rebinds a receiver local, the old exact class fact can survive and allow unsound method inlining in the loop body. Include init in the mutated-local set before using body_facts.

Proposed fix
                 let mut for_extra: Vec<&Expr> = Vec::new();
                 if let Some(c) = condition.as_ref() {
                     for_extra.push(c);
                 }
                 if let Some(u) = update.as_ref() {
                     for_extra.push(u);
                 }
                 let mut body_facts =
                     loop_invariant_seed_facts(exact_receiver_facts, body, &for_extra);
+                if let Some(init_stmt) = init.as_ref() {
+                    let mut init_mutated: HashSet<LocalId> = HashSet::new();
+                    collect_mutated_local_ids(std::slice::from_ref(init_stmt.as_ref()), &mut init_mutated);
+                    body_facts.retain(|id, _| !init_mutated.contains(id));
+                }
                 inline_calls_in_stmts(
                     body,
                     func_candidates,
                     method_candidates,

Also applies to: 709-718

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/perry-transform/src/inline/call_inliner.rs` around lines 661 - 678,
The loop-init handling currently runs inline_calls_in_stmts on init (using
init_stmts and init_facts) but fails to add any locals rebinding in init into
the mutated-local set used to derive body_facts from exact_receiver_facts,
allowing stale exact-class facts to survive; update the code that processes the
for-loop init (the branch that creates init_stmts/init_facts and calls
inline_calls_in_stmts) to record any locals written by init into the
mutated-local set (the same mutation filter used before seeding body_facts from
exact_receiver_facts) so those receiver facts are invalidated before computing
body_facts, and apply the same change at the other for-init handling site that
mirrors this logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@crates/perry-transform/src/inline/call_inliner.rs`:
- Around line 661-678: The loop-init handling currently runs
inline_calls_in_stmts on init (using init_stmts and init_facts) but fails to add
any locals rebinding in init into the mutated-local set used to derive
body_facts from exact_receiver_facts, allowing stale exact-class facts to
survive; update the code that processes the for-loop init (the branch that
creates init_stmts/init_facts and calls inline_calls_in_stmts) to record any
locals written by init into the mutated-local set (the same mutation filter used
before seeding body_facts from exact_receiver_facts) so those receiver facts are
invalidated before computing body_facts, and apply the same change at the other
for-init handling site that mirrors this logic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 5c095746-c4a4-4839-aee5-a456a63a3a3e

📥 Commits

Reviewing files that changed from the base of the PR and between 9b6bbe0 and 8333d39.

📒 Files selected for processing (4)
  • crates/perry-transform/src/inline/analysis.rs
  • crates/perry-transform/src/inline/call_inliner.rs
  • crates/perry-transform/src/inline/closure_analysis.rs
  • crates/perry-transform/src/inline/mod.rs

proggeramlug and others added 2 commits June 13, 2026 15:42
The for-loop init can rebind a receiver local (`for (c = makeOther(); …)`),
but loop_invariant_seed_facts only scanned body + condition + update, so a stale
pre-loop exact-class fact could survive into the body and allow an unsound
method inline. Also drop facts for locals mutated by init. Addresses CodeRabbit
review feedback on #5092.

Verified: a receiver rebound in for-init is observed as the new class in the
body (matches Node).
proggeramlug and others added 3 commits June 14, 2026 12:54
…size gate

call_inliner.rs grew past the 2000-line CI limit. Move the inline-recursion
guard and lexical-`super` detection helpers into a new super_detect.rs sibling
module, re-exported through mod.rs.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/perry-transform/src/inline/super_detect.rs`:
- Around line 39-50: The expr_contains_lexical_super_set function only detects
SuperPropertySet but must check for all super expression variants that reference
the implicit receiver. Modify the matches! check at the beginning of
expr_contains_lexical_super_set to include all super forms: SuperPropertySet,
SuperMethodCall, SuperCall, SuperPropertyGet, ObjectSuperPropertyGet,
ObjectSuperPropertySet, and ObjectSuperMethodCall. This ensures the function
correctly identifies all unsafe-to-inline super expressions, not just property
sets.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 8f59af95-1139-4a49-b487-6467b71b8392

📥 Commits

Reviewing files that changed from the base of the PR and between ef71158 and 5bc6b60.

📒 Files selected for processing (3)
  • crates/perry-transform/src/inline/call_inliner.rs
  • crates/perry-transform/src/inline/mod.rs
  • crates/perry-transform/src/inline/super_detect.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/perry-transform/src/inline/mod.rs
  • crates/perry-transform/src/inline/call_inliner.rs

Comment thread crates/perry-transform/src/inline/super_detect.rs Outdated
expr_contains_lexical_super matched only Expr::SuperPropertySet, so a method
whose body read super.prop (SuperPropertyGet), called super()/super.m() via the
spread or object-literal variants, etc. could pass the inline-safety guard and
be substituted onto a different receiver — re-resolving super against the wrong
parent. Match every super form (reads, writes, calls, and the object-super
variants). Renamed the helpers (drop the now-inaccurate _set suffix) and exposed
MAX_INLINE_EXPR_RECURSION_DEPTH through mod.rs for the call_inliner unit test.

Addresses CodeRabbit review on #5092.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
crates/perry-transform/src/inline/super_detect.rs (1)

18-18: ⚡ Quick win

Make the recursion guard non-constructible outside this module.

InlineExprRecursionGuard is a pub(crate) unit struct, so any crate-internal caller can instantiate/drop it without going through enter_inline_expr_recursion(). That makes the thread-local depth counter easy to desynchronize and can accidentally punch holes in the recursion cap.

Proposed fix
-pub(crate) struct InlineExprRecursionGuard;
+pub(crate) struct InlineExprRecursionGuard(());

 pub(crate) fn enter_inline_expr_recursion() -> Option<InlineExprRecursionGuard> {
     let entered = INLINE_EXPR_RECURSION_DEPTH.with(|depth| {
         let current = depth.get();
         if current >= MAX_INLINE_EXPR_RECURSION_DEPTH {
@@
     });
-    entered.then_some(InlineExprRecursionGuard)
+    entered.then_some(InlineExprRecursionGuard(()))
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/perry-transform/src/inline/super_detect.rs` at line 18, The
`InlineExprRecursionGuard` struct is currently a `pub(crate)` unit struct that
can be directly instantiated by any code in the crate, bypassing the
thread-local depth counter management in `enter_inline_expr_recursion()`. Make
this struct non-constructible outside the module by adding a private field (such
as a zero-sized private type) to the struct definition, ensuring all instances
can only be created through the proper `enter_inline_expr_recursion()` function
and preventing the recursion counter from becoming desynchronized.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@crates/perry-transform/src/inline/super_detect.rs`:
- Line 18: The `InlineExprRecursionGuard` struct is currently a `pub(crate)`
unit struct that can be directly instantiated by any code in the crate,
bypassing the thread-local depth counter management in
`enter_inline_expr_recursion()`. Make this struct non-constructible outside the
module by adding a private field (such as a zero-sized private type) to the
struct definition, ensuring all instances can only be created through the proper
`enter_inline_expr_recursion()` function and preventing the recursion counter
from becoming desynchronized.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 5da1fda0-e6c5-4742-8ec9-493921dbc98e

📥 Commits

Reviewing files that changed from the base of the PR and between ef71158 and 8a258b8.

📒 Files selected for processing (3)
  • crates/perry-transform/src/inline/call_inliner.rs
  • crates/perry-transform/src/inline/mod.rs
  • crates/perry-transform/src/inline/super_detect.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/perry-transform/src/inline/mod.rs
  • crates/perry-transform/src/inline/call_inliner.rs

@proggeramlug proggeramlug merged commit 5873cb6 into main Jun 14, 2026
15 checks passed
@proggeramlug proggeramlug deleted the perf/inline-this-methods branch June 14, 2026 13:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants