perf(transform): inline small this-using methods on exact receivers (1.6x on method_calls) by TheHypnoo · Pull Request #5092 · PerryTS/perry

TheHypnoo · 2026-06-13T13:35:39Z

Unlocks HIR inlining of small monomorphic methods that read/write this.field — previously they were never inlined, so a hot obj.method() in a loop always paid full dynamic dispatch + a non-inlined call per iteration.

Two gaps fixed (both in `perry-transform/src/inline/`)

1. Loop bodies were seeded with empty exact-receiver facts (`call_inliner.rs`, While/DoWhile/For arms). A `const c = new Counter()` before a loop established the `c → Counter` fact, but the loop body got a fresh empty fact set, so the receiver class was unknown inside the loop and nothing inlined. Now the body inherits the loop-invariant subset of outer facts (`loop_invariant_seed_facts`): a fact is kept only if its local is never reassigned anywhere in the loop (`collect_mutated_local_ids`, which recurses into closures). A receiver rebound mid-loop drops its fact, so it is not mis-inlined.
2. `is_inlinable` rejected every method referencing `this` (`body_references_dynamic_this`), i.e. essentially all real methods, even though the method-inliner replaces `this` with the concrete receiver (`substitute_this_in_stmts`). Added `is_inlinable_method`, which allows a direct `this` (it gets substituted) but still rejects `new.target` and any nested closure (whose lexical `this`/`new.target` is not rewritten), via `method_body_blocks_this_substitution`.
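The loop-fact inheritance described in the first item can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `Stmt`, `collect_mutated`, and `loop_invariant_facts` are hypothetical stand-ins for the real HIR types and for `collect_mutated_local_ids` / `loop_invariant_seed_facts`, and class facts are modelled as plain strings.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-in for the real local identifier type.
type LocalId = u32;

// Hypothetical miniature statement tree; the real HIR is much richer.
enum Stmt {
    /// `local = <expr>`: rebinds a local.
    Assign(LocalId),
    /// `local.method()`: uses a receiver without rebinding it.
    MethodCall(LocalId),
    /// A nested closure whose body may assign to captured locals.
    Closure(Vec<Stmt>),
}

/// Collect every local assigned anywhere in `stmts`, recursing into closures.
fn collect_mutated(stmts: &[Stmt], out: &mut HashSet<LocalId>) {
    for stmt in stmts {
        match stmt {
            Stmt::Assign(id) => {
                out.insert(*id);
            }
            Stmt::MethodCall(_) => {}
            Stmt::Closure(body) => collect_mutated(body, out),
        }
    }
}

/// Keep only the outer facts whose local is never reassigned in the loop body.
fn loop_invariant_facts(
    outer: &HashMap<LocalId, String>,
    body: &[Stmt],
) -> HashMap<LocalId, String> {
    let mut mutated = HashSet::new();
    collect_mutated(body, &mut mutated);
    outer
        .iter()
        .filter(|(id, _)| !mutated.contains(*id))
        .map(|(id, class)| (*id, class.clone()))
        .collect()
}

fn main() {
    // Two receivers known to be `Counter` before the loop.
    let mut outer = HashMap::new();
    outer.insert(0, "Counter".to_string());
    outer.insert(1, "Counter".to_string());

    // Local 0 is only called; local 1 is rebound inside a closure in the body.
    let body = vec![
        Stmt::MethodCall(0),
        Stmt::Closure(vec![Stmt::Assign(1)]),
    ];

    let seeded = loop_invariant_facts(&outer, &body);
    assert_eq!(seeded.get(&0).map(String::as_str), Some("Counter"));
    assert!(!seeded.contains_key(&1));
    println!("{} fact(s) survive into the loop body", seeded.len());
}
```

The filter is deliberately conservative: any assignment anywhere in the loop drops the fact, even one that would rebind the local to the same class.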

Soundness

method_lookup_safe still gates each call site, so methods on classes with dynamic/native extends (runtime-dependent prototype chains) are not inlined.
Loop fact inheritance drops any receiver mutated in the loop, so reassigned receivers keep dynamic dispatch.
new.target and nested closures keep the method out of the inlinable set.
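As a rough illustration of the last guard, the sketch below models a method body as a tiny expression tree and rejects it when it contains `new.target` or a nested closure. Everything here is an assumption made for the example: the `Expr` variants, the `MAX_INLINE_EXPRS` size threshold, and the function bodies are hypothetical and only mirror the names `method_body_blocks_this_substitution` and `is_inlinable_method` from the description.

```rust
// Hypothetical miniature expression tree; the real HIR has many more variants.
enum Expr {
    This,
    NewTarget,
    Field(Box<Expr>, &'static str),
    Closure(Vec<Expr>),
    Number(f64),
}

/// True when `e` contains something a plain `this` -> receiver rewrite cannot
/// handle: `new.target`, or a nested closure with its own lexical bindings.
fn expr_blocks_this_substitution(e: &Expr) -> bool {
    match e {
        Expr::NewTarget | Expr::Closure(_) => true,
        Expr::Field(object, _) => expr_blocks_this_substitution(object),
        Expr::This | Expr::Number(_) => false,
    }
}

fn method_body_blocks_this_substitution(body: &[Expr]) -> bool {
    body.iter().any(expr_blocks_this_substitution)
}

// Assumption: "small" is modelled as a cap on top-level expressions.
const MAX_INLINE_EXPRS: usize = 8;

/// A direct `this` is allowed because it is substituted at the call site.
fn is_inlinable_method(body: &[Expr]) -> bool {
    body.len() <= MAX_INLINE_EXPRS && !method_body_blocks_this_substitution(body)
}

fn main() {
    // `this.count` plus a literal, roughly the shape of `increment()`.
    let increment = vec![
        Expr::Field(Box::new(Expr::This), "count"),
        Expr::Number(1.0),
    ];
    // A method that captures `this` in a closure stays out of the inlinable set.
    let deferred = vec![Expr::Closure(vec![Expr::This])];
    // So does a method that reads `new.target`.
    let ctor_like = vec![Expr::NewTarget];

    assert!(is_inlinable_method(&increment));
    assert!(!is_inlinable_method(&deferred));
    assert!(!is_inlinable_method(&ctor_like));
    println!("increment is inlinable; closure and new.target bodies are not");
}
```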

Result

Benchmark	Before	After
`bench_method_calls` (10M monomorphic `counter.increment()`)	5229ms	3221ms (1.6×)

increment is now inlined into the loop. General win for this.field-heavy OOP code. (The remaining gap to Node is the per-access field-shape guards — separate follow-up, see plan 008.)
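For reference, the arithmetic behind the table can be checked with a few lines. The sketch assumes "10M" means exactly 10,000,000 iterations; the helper names are made up for the example.

```rust
/// Ratio of the old wall time to the new wall time.
fn speedup(before_ms: f64, after_ms: f64) -> f64 {
    before_ms / after_ms
}

/// Average cost of one loop iteration in nanoseconds.
fn ns_per_iter(total_ms: f64, iters: f64) -> f64 {
    total_ms * 1_000_000.0 / iters
}

fn main() {
    // Figures from the benchmark table; the iteration count is an assumption.
    let (before_ms, after_ms, iters) = (5229.0, 3221.0, 10_000_000.0);

    // 5229 / 3221 is about 1.62, which the title rounds to 1.6x.
    println!("speedup: {:.2}x", speedup(before_ms, after_ms));
    // Roughly 522.9 ns per call before and 322.1 ns after.
    println!("before: {:.1} ns/iter", ns_per_iter(before_ms, iters));
    println!("after:  {:.1} ns/iter", ns_per_iter(after_ms, iters));
}
```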

Validation

cargo test -p perry-transform -p perry-codegen -p perry-runtime — pass.
29/29 suite benchmarks match node --experimental-strip-types.
Class-heavy correctness incl. a receiver reassigned inside a loop (p = new Counter() in the loop body) matches Node — proving the loop-fact mutation guard prevents mis-inlining.
Full local parity: zero regressions. Every pre-existing mismatch fails identically with and without this change (verified by rebuilding the base binary and diffing combined stdout+stderr per file).

Summary by CodeRabbit

Optimizations
- Expanded inlining for methods called on a known receiver with stricter eligibility checks and improved this accessor/mutator handling.
- Improved loop-body inlining by more accurately seeding receiver information while excluding facts tied to receivers potentially mutated by loop control flow.
Bug Fixes / Safety
- Strengthened correctness by detecting lexical super usage and adding a recursion guard to prevent unbounded inlining work.
- Improved detection for cases that block safe this substitution during inlining.

Two gaps kept hot monomorphic method calls from ever being inlined, so a call like `const c = new Counter(); for (...) c.increment();` always paid full dynamic dispatch plus a non-inlined call per iteration:

1. Loop bodies were seeded with EMPTY exact-receiver facts (While/DoWhile/For in call_inliner.rs), discarding the `c -> Counter` fact established before the loop, so the receiver class was unknown inside the loop and nothing inlined. Now the body inherits the loop-invariant subset of outer facts via `loop_invariant_seed_facts`: a fact is kept only if its local is never reassigned anywhere in the loop (collect_mutated_local_ids, which recurses into closures), so a receiver rebound mid-loop correctly drops its fact and is not mis-inlined.

2. is_inlinable rejected EVERY method that references `this` (body_references_dynamic_this), i.e. essentially all real methods, even though the method-inliner replaces `this` with the concrete receiver (substitute_this_in_stmts). Added is_inlinable_method, which allows a direct `this` (it gets substituted) but still rejects new.target and any nested closure (whose lexical this/new.target binding is not rewritten), via method_body_blocks_this_substitution.

method_lookup_safe still gates each call site, so methods on classes with dynamic/native `extends` (runtime-dependent prototype) are NOT inlined.

bench_method_calls (10M monomorphic counter.increment()): 5229ms -> 3221ms (1.6x); the method body is now inlined into the loop. General win for OOP code.

Validation: perry-transform + perry-codegen + perry-runtime tests pass; 29/29 suite benchmarks match Node; class-heavy correctness incl. a receiver reassigned inside a loop matches Node; full local parity shows zero regressions vs the base (all pre-existing mismatches fail identically with and without this change).

coderabbitai · 2026-06-13T13:35:56Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.


📝 Walkthrough

This PR enhances method inlining by adding a this-substitution safety check, a method-specific inlinability predicate, super-usage detection utilities, and conservative loop-invariant receiver fact seeding; it wires these utilities into the inliner driver.

Changes

Method inlining with this-substitution safety, super detection, and loop optimization

Layer / File(s)	Summary
This-substitution safety and method inlinability `crates/perry-transform/src/inline/closure_analysis.rs`, `crates/perry-transform/src/inline/analysis.rs`	`method_body_blocks_this_substitution` detects `NewTarget` and nested closures that block safe `this` substitution. `is_inlinable_method` applies standard inlinability checks and uses the safety guard to permit `this`-based accessor/mutator inlining when safe.
Super-usage and recursion depth detection `crates/perry-transform/src/inline/super_detect.rs`	Introduces thread-local recursion depth guard with RAII semantics, expression and statement tree scanners for `Expr::SuperPropertySet`, and method-level predicate combining them to detect whether a method body contains lexical-super or exceeds inline recursion depth limits.
Loop-invariant receiver facts optimization `crates/perry-transform/src/inline/call_inliner.rs`	Updates imports, introduces `loop_invariant_seed_facts` helper that filters `ExactReceiverFacts` by excluding LocalIds mutated in loop bodies or extra expressions, and applies this helper to seed loop body inlining for `while`, `do-while`, and `for` loops.
Module integration and method filtering `crates/perry-transform/src/inline/mod.rs`	Adds `super_detect` submodule and re-exports its helpers alongside `is_inlinable_method` and `method_body_blocks_this_substitution`; updates method candidate selection to use `is_inlinable_method` instead of the generic `is_inlinable` predicate.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

I'm a rabbit in the code so spry,
I peek where this may safely lie,
I scan for super creeping near,
I seed the loops with facts most clear,
and leave inlined methods with a happy cheer 🐇

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main performance optimization: enabling inlining of this-using methods with 1.6x speedup on method calls.
Description check	✅ Passed	The description comprehensively covers the two key fixes, soundness guarantees, benchmark results, and validation. It follows the template structure with Summary, Changes, Related issue (implicit: perf improvement), and Test plan sections.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/perry-transform/src/inline/call_inliner.rs (1)

661-678: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

For-loop init mutations are not included in receiver-fact invalidation

Line 717 seeds body_facts from exact_receiver_facts, but the mutation filter currently excludes init (Lines 661-678). If init rebinds a receiver local, the old exact class fact can survive and allow unsound method inlining in the loop body. Include init in the mutated-local set before using body_facts.

Proposed fix

                 let mut for_extra: Vec<&Expr> = Vec::new();
                 if let Some(c) = condition.as_ref() {
                     for_extra.push(c);
                 }
                 if let Some(u) = update.as_ref() {
                     for_extra.push(u);
                 }
                 let mut body_facts =
                     loop_invariant_seed_facts(exact_receiver_facts, body, &for_extra);
+                if let Some(init_stmt) = init.as_ref() {
+                    let mut init_mutated: HashSet<LocalId> = HashSet::new();
+                    collect_mutated_local_ids(std::slice::from_ref(init_stmt.as_ref()), &mut init_mutated);
+                    body_facts.retain(|id, _| !init_mutated.contains(id));
+                }
                 inline_calls_in_stmts(
                     body,
                     func_candidates,
                     method_candidates,

Also applies to: 709-718

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/perry-transform/src/inline/call_inliner.rs` around lines 661 - 678,
The loop-init handling currently runs inline_calls_in_stmts on init (using
init_stmts and init_facts) but fails to add any locals rebinding in init into
the mutated-local set used to derive body_facts from exact_receiver_facts,
allowing stale exact-class facts to survive; update the code that processes the
for-loop init (the branch that creates init_stmts/init_facts and calls
inline_calls_in_stmts) to record any locals written by init into the
mutated-local set (the same mutation filter used before seeding body_facts from
exact_receiver_facts) so those receiver facts are invalidated before computing
body_facts, and apply the same change at the other for-init handling site that
mirrors this logic.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 5c095746-c4a4-4839-aee5-a456a63a3a3e

📥 Commits

Reviewing files that changed from the base of the PR and between 9b6bbe0 and 8333d39.

📒 Files selected for processing (4)

crates/perry-transform/src/inline/analysis.rs
crates/perry-transform/src/inline/call_inliner.rs
crates/perry-transform/src/inline/closure_analysis.rs
crates/perry-transform/src/inline/mod.rs

The for-loop init can rebind a receiver local (`for (c = makeOther(); …)`), but loop_invariant_seed_facts only scanned body + condition + update, so a stale pre-loop exact-class fact could survive into the body and allow an unsound method inline. Also drop facts for locals mutated by init. Addresses CodeRabbit review feedback on #5092. Verified: a receiver rebound in for-init is observed as the new class in the body (matches Node).
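A small sketch of the rule this commit describes: facts are dropped for locals written by any part of the `for` loop, now including `init`. The `ForLoop` summary struct and `seed_body_facts` are hypothetical; the real code walks HIR statements and expressions rather than precomputed lists of assigned locals.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-in for the real local identifier type.
type LocalId = u32;

/// Hypothetical summary of a `for` loop: the locals assigned by each part.
struct ForLoop {
    init_assigns: Vec<LocalId>,
    update_assigns: Vec<LocalId>,
    body_assigns: Vec<LocalId>,
}

/// Seed the loop body with the outer facts that no part of the loop rebinds.
fn seed_body_facts(
    outer: &HashMap<LocalId, &'static str>,
    for_loop: &ForLoop,
) -> HashMap<LocalId, &'static str> {
    // Union of every local written by init, update, or body.
    let mutated: HashSet<LocalId> = for_loop
        .init_assigns
        .iter()
        .chain(&for_loop.update_assigns)
        .chain(&for_loop.body_assigns)
        .copied()
        .collect();

    let mut facts = outer.clone();
    facts.retain(|id, _| !mutated.contains(id));
    facts
}

fn main() {
    // Before the loop: local 0 (`c`) and local 1 (`d`) are both exact `Counter`s.
    let outer = HashMap::from([(0, "Counter"), (1, "Counter")]);

    // `for (c = makeOther(); ...; i++) { d.increment(); }` with `i` as local 2.
    let for_loop = ForLoop {
        init_assigns: vec![0],
        update_assigns: vec![2],
        body_assigns: vec![],
    };

    let facts = seed_body_facts(&outer, &for_loop);
    assert!(!facts.contains_key(&0)); // stale fact for `c` is gone
    assert_eq!(facts.get(&1), Some(&"Counter")); // `d` is still known
    println!("{} fact(s) seeded into the for-loop body", facts.len());
}
```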

…size gate

call_inliner.rs grew past the 2000-line CI limit. Move the inline-recursion guard and lexical-`super` detection helpers into a new super_detect.rs sibling module, re-exported through mod.rs.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/perry-transform/src/inline/super_detect.rs`:
- Around line 39-50: The expr_contains_lexical_super_set function only detects
SuperPropertySet but must check for all super expression variants that reference
the implicit receiver. Modify the matches! check at the beginning of
expr_contains_lexical_super_set to include all super forms: SuperPropertySet,
SuperMethodCall, SuperCall, SuperPropertyGet, ObjectSuperPropertyGet,
ObjectSuperPropertySet, and ObjectSuperMethodCall. This ensures the function
correctly identifies all unsafe-to-inline super expressions, not just property
sets.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 8f59af95-1139-4a49-b487-6467b71b8392

📥 Commits

Reviewing files that changed from the base of the PR and between ef71158 and 5bc6b60.

📒 Files selected for processing (3)

crates/perry-transform/src/inline/call_inliner.rs
crates/perry-transform/src/inline/mod.rs
crates/perry-transform/src/inline/super_detect.rs

🚧 Files skipped from review as they are similar to previous changes (2)

crates/perry-transform/src/inline/mod.rs
crates/perry-transform/src/inline/call_inliner.rs

expr_contains_lexical_super matched only Expr::SuperPropertySet, so a method whose body read super.prop (SuperPropertyGet), called super()/super.m() via the spread or object-literal variants, etc. could pass the inline-safety guard and be substituted onto a different receiver — re-resolving super against the wrong parent. Match every super form (reads, writes, calls, and the object-super variants). Renamed the helpers (drop the now-inaccurate _set suffix) and exposed MAX_INLINE_EXPR_RECURSION_DEPTH through mod.rs for the call_inliner unit test. Addresses CodeRabbit review on #5092.
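The widened check can be pictured with a reduced expression enum. This is a sketch under assumptions: the super variant names come from the review comment, but their payloads, the non-super variants, and the function body are invented for illustration, and the object-super variants are left out to keep it short.

```rust
// Hypothetical subset of an expression enum. Payloads are invented.
enum Expr {
    SuperCall(Vec<Expr>),
    SuperMethodCall(&'static str, Vec<Expr>),
    SuperPropertyGet(&'static str),
    SuperPropertySet(&'static str, Box<Expr>),
    This,
    Number(f64),
    Binary(Box<Expr>, Box<Expr>),
}

/// True when `e` contains any form of lexical `super`, not only property writes.
fn contains_lexical_super(e: &Expr) -> bool {
    match e {
        // Every super form resolves against the method's own home class, so
        // moving the body onto another receiver's call site is unsafe.
        Expr::SuperCall(_)
        | Expr::SuperMethodCall(..)
        | Expr::SuperPropertyGet(_)
        | Expr::SuperPropertySet(..) => true,
        Expr::Binary(lhs, rhs) => contains_lexical_super(lhs) || contains_lexical_super(rhs),
        Expr::This | Expr::Number(_) => false,
    }
}

fn main() {
    // `this + super.y`: a read that a SuperPropertySet-only check would miss.
    let read = Expr::Binary(
        Box::new(Expr::This),
        Box::new(Expr::SuperPropertyGet("y")),
    );
    let write = Expr::SuperPropertySet("y", Box::new(Expr::Number(1.0)));
    let call = Expr::SuperMethodCall("m", vec![Expr::Number(2.0)]);
    let ctor = Expr::SuperCall(vec![]);
    let plain = Expr::Binary(Box::new(Expr::This), Box::new(Expr::Number(1.0)));

    assert!(contains_lexical_super(&read));
    assert!(contains_lexical_super(&write));
    assert!(contains_lexical_super(&call));
    assert!(contains_lexical_super(&ctor));
    assert!(!contains_lexical_super(&plain));
    println!("all super forms are detected");
}
```

Listing the super variants in one `match` arm and the rest explicitly, with no wildcard arm, means adding a new variant to the enum forces a compile error here until someone decides which side it belongs on.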

coderabbitai

🧹 Nitpick comments (1)

crates/perry-transform/src/inline/super_detect.rs (1)

18-18: ⚡ Quick win

Make the recursion guard non-constructible outside this module.

InlineExprRecursionGuard is a pub(crate) unit struct, so any crate-internal caller can instantiate/drop it without going through enter_inline_expr_recursion(). That makes the thread-local depth counter easy to desynchronize and can accidentally punch holes in the recursion cap.

Proposed fix

-pub(crate) struct InlineExprRecursionGuard;
+pub(crate) struct InlineExprRecursionGuard(());

 pub(crate) fn enter_inline_expr_recursion() -> Option<InlineExprRecursionGuard> {
     let entered = INLINE_EXPR_RECURSION_DEPTH.with(|depth| {
         let current = depth.get();
         if current >= MAX_INLINE_EXPR_RECURSION_DEPTH {
@@
     });
-    entered.then_some(InlineExprRecursionGuard)
+    entered.then_some(InlineExprRecursionGuard(()))
 }
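To show why the private field matters, here is a self-contained sketch of an RAII depth guard over a thread-local counter. It is not the code from `super_detect.rs`: the module name `guard`, the `MAX_DEPTH` value, and the `depth`/`nest` helpers are assumptions made for the example.

```rust
mod guard {
    use std::cell::Cell;

    // Assumption: a small cap chosen only for the demonstration.
    pub const MAX_DEPTH: usize = 4;

    thread_local! {
        static DEPTH: Cell<usize> = Cell::new(0);
    }

    /// The private `()` field means code outside this module cannot write
    /// `RecursionGuard(())`, so every guard comes from `enter`.
    pub struct RecursionGuard(());

    /// Increment the depth and hand out a guard, or refuse once the cap is hit.
    pub fn enter() -> Option<RecursionGuard> {
        DEPTH.with(|depth| {
            let current = depth.get();
            if current >= MAX_DEPTH {
                None
            } else {
                depth.set(current + 1);
                Some(RecursionGuard(()))
            }
        })
    }

    impl Drop for RecursionGuard {
        /// Each guard undoes exactly the increment made by its `enter` call.
        fn drop(&mut self) {
            DEPTH.with(|depth| depth.set(depth.get() - 1));
        }
    }

    pub fn depth() -> usize {
        DEPTH.with(|depth| depth.get())
    }
}

/// Recurse `levels` more times and report the depth reached before stopping.
fn nest(levels: usize) -> usize {
    match guard::enter() {
        Some(_held) if levels > 0 => nest(levels - 1),
        Some(_held) => guard::depth(),
        None => guard::depth(), // cap reached: stop instead of recursing further
    }
}

fn main() {
    // Asking for more levels than the cap stops at the cap.
    assert_eq!(nest(10), guard::MAX_DEPTH);
    // Every guard was dropped on the way out, so the counter is back to zero.
    assert_eq!(guard::depth(), 0);
    println!("recursion capped at {}", guard::MAX_DEPTH);
}
```

With a plain unit struct, any caller could create a guard without incrementing the counter, and dropping it would then decrement a depth that was never raised.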

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/perry-transform/src/inline/super_detect.rs` at line 18, The
`InlineExprRecursionGuard` struct is currently a `pub(crate)` unit struct that
can be directly instantiated by any code in the crate, bypassing the
thread-local depth counter management in `enter_inline_expr_recursion()`. Make
this struct non-constructible outside the module by adding a private field (such
as a zero-sized private type) to the struct definition, ensuring all instances
can only be created through the proper `enter_inline_expr_recursion()` function
and preventing the recursion counter from becoming desynchronized.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 5da1fda0-e6c5-4742-8ec9-493921dbc98e

📥 Commits

Reviewing files that changed from the base of the PR and between ef71158 and 8a258b8.

📒 Files selected for processing (3)

crates/perry-transform/src/inline/call_inliner.rs
crates/perry-transform/src/inline/mod.rs
crates/perry-transform/src/inline/super_detect.rs

🚧 Files skipped from review as they are similar to previous changes (2)

crates/perry-transform/src/inline/mod.rs
crates/perry-transform/src/inline/call_inliner.rs

coderabbitai Bot reviewed Jun 13, 2026

proggeramlug and others added 2 commits June 13, 2026 15:42

Merge branch 'main' into perf/inline-this-methods

b74ef11

TheHypnoo mentioned this pull request Jun 13, 2026

perf(method dispatch): method_calls ~290× Node — remaining cost is per-field-access shape-guard calls (plan + standby) #5093

Open

style: cargo fmt (perry-transform inliner)

445ce5e

TheHypnoo mentioned this pull request Jun 13, 2026

perf(GC): make per-object layout O(1)-loadable — kill per-operation thread-local layout tracking (umbrella: method_calls/array-downgrade/object-property) #5094

Open

proggeramlug and others added 3 commits June 14, 2026 12:54

Merge branch 'main' into perf/inline-this-methods

ef71158

Merge branch 'main' into perf/inline-this-methods

3c5c1cc

refactor(transform): extract super-detection helpers to satisfy file-…

5bc6b60

coderabbitai Bot reviewed Jun 14, 2026

Comment thread crates/perry-transform/src/inline/super_detect.rs Outdated

coderabbitai Bot reviewed Jun 14, 2026

proggeramlug merged commit 5873cb6 into main Jun 14, 2026
15 checks passed

proggeramlug deleted the perf/inline-this-methods branch June 14, 2026 13:43

proggeramlug merged 8 commits into main from perf/inline-this-methods
