perf(codegen): outline class-field-SET guard-miss arm to one call (#5334 lever A)#5351

Merged
proggeramlug merged 1 commit into main from feat/outline-fieldset-fallback on Jun 18, 2026

proggeramlug (Contributor) commented Jun 17, 2026

What

First step of the IR-efficiency roadmap (#5334, Tier 1 / lever A): outline the cold guard-miss arm of the class-field-SET inline-cache diamond.

The default diamond runs the inline js_typed_feedback_class_field_set_guard in its entry block; on a guard PASS it stores the slot inline, on a MISS it branches to the fallback arm. That arm emitted two inline calls per set site:

class_field_set.fallback:
  call void @js_typed_feedback_record_fallback_call(i64 <site>)
  call void @js_object_set_field_by_name(i64 %obj, i64 %key, double %val)
  br label %class_field_set.merge

The guard has already run and failed in the entry block (that failure is what branched control here), so nothing is left to decide. Collapse the pair into one outlined call:

class_field_set.fallback:
  call void @js_class_field_set_fallback(i64 <site>, i64 %obj, i64 %key, double %val)
  br label %class_field_set.merge

js_class_field_set_fallback records the miss and routes the write by name, so its observable behavior matches the two-call block it replaces.

Why it's safe

  • Perf-neutral by construction: the hot class_field_set.fast slot store is untouched; the change is confined to the cold guard-miss arm, which never executes on a monomorphic hot path.
  • The helper is a pure record-then-by-name fallback (it does not re-run the guard — the guard already failed upstream).
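
A minimal sketch of the record-then-by-name shape described above, assuming stand-in functions in place of the real runtime. The names `record_fallback_call`, `set_field_by_name`, `class_field_set_fallback_sketch`, and `take_events` are hypothetical; only the parameter order `(site_id, obj_bits, key_raw, value)` and the `i64, i64, i64, double` types come from the IR shown earlier.

```rust
use std::cell::RefCell;

thread_local! {
    // Stand-in event log so the call order can be observed without the real runtime.
    static EVENTS: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

// Hypothetical stand-in for js_typed_feedback_record_fallback_call.
fn record_fallback_call(site_id: i64) {
    EVENTS.with(|e| e.borrow_mut().push(format!("record site={site_id}")));
}

// Hypothetical stand-in for js_object_set_field_by_name.
fn set_field_by_name(obj_bits: i64, key_raw: i64, value: f64) {
    EVENTS.with(|e| {
        e.borrow_mut()
            .push(format!("set obj={obj_bits} key={key_raw} value={value}"))
    });
}

/// Cold-path helper sketch: record the miss, then write by name.
/// It does not re-run the guard, because the caller only reaches it after a miss.
/// (The symbol-export attribute of the real helper is omitted here so the
/// sketch builds standalone.)
#[cold]
#[inline(never)]
pub extern "C" fn class_field_set_fallback_sketch(
    site_id: i64,
    obj_bits: i64,
    key_raw: i64,
    value: f64,
) {
    record_fallback_call(site_id);
    set_field_by_name(obj_bits, key_raw, value);
}

/// Drain the stand-in event log (demo helper only).
pub fn take_events() -> Vec<String> {
    EVENTS.with(|e| std::mem::take(&mut *e.borrow_mut()))
}
```

Calling `class_field_set_fallback_sketch(7, 100, 200, 1.5)` and then `take_events()` yields the record event followed by the by-name write, in the same order as the two inline calls it stands in for.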

Verification

  • Emitted IR on a Point-field-churn test: the fallback arm drops from 2 calls to 1 (21 calls to js_class_field_set_fallback, and no js_object_set_field_by_name calls left at class-field set sites).
  • End-to-end native run produces the correct result (60000012).
  • Full perry-codegen suite green. Updated two typed_feedback tests whose record_fallback_call/by_name assertions were satisfied by the now-folded field-set fallback (one of them incidentally, via a class's synthesized field-init).
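
The "2 calls to 1" check on emitted IR can be approximated with a small text scan. This is only a sketch: `calls_in_block`, `OLD_ARM`, and `NEW_ARM` are hypothetical names, the two constants are copied from the IR snippets in this description, and the actual assertions in the typed_feedback tests may look quite different.

```rust
/// Count `call` instructions inside the basic block named `label` in textual IR.
pub fn calls_in_block(ir: &str, label: &str) -> usize {
    let header = format!("{label}:");
    let mut inside = false;
    let mut count = 0;
    for line in ir.lines() {
        let t = line.trim();
        // A bare `name:` line starts a new basic block.
        if t.ends_with(':') && !t.contains(' ') {
            inside = t == header;
            continue;
        }
        // Count both void calls and calls whose result is assigned.
        if inside && (t.starts_with("call ") || t.contains("= call ")) {
            count += 1;
        }
    }
    count
}

/// Fallback arm before the change (two inline calls).
pub const OLD_ARM: &str = r"class_field_set.fallback:
  call void @js_typed_feedback_record_fallback_call(i64 <site>)
  call void @js_object_set_field_by_name(i64 %obj, i64 %key, double %val)
  br label %class_field_set.merge
";

/// Fallback arm after the change (one outlined call).
pub const NEW_ARM: &str = r"class_field_set.fallback:
  call void @js_class_field_set_fallback(i64 <site>, i64 %obj, i64 %key, double %val)
  br label %class_field_set.merge
";
```

With these inputs, `calls_in_block(OLD_ARM, "class_field_set.fallback")` counts two calls and the same query on `NEW_ARM` counts one.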

Roadmap

Establishes the outline-helper pattern reused by the larger levers in #5334 (C: nan-box round-trips — see #5350; D: non-pointer barrier elision; B: adaptive full-outline for oversized modules). Refs #5334.

Summary by CodeRabbit

  • Performance

    • Collapsed the class-field setter's guard-miss fallback from two runtime calls into one, which shrinks the emitted IR per set site.
  • Tests

    • Updated test expectations to reflect changes in class field setter behavior.


The default class-field-set diamond runs the inline
`js_typed_feedback_class_field_set_guard` in its entry block; on a guard
PASS it stores the slot inline, on a MISS it branches to the fallback arm.
That arm emitted TWO inline calls per set site —
`js_typed_feedback_record_fallback_call` then `js_object_set_field_by_name`.
Since the guard has already run and FAILED (that failure is what branched
control here), nothing is left to decide: collapse the pair into a single
outlined `js_class_field_set_fallback(site_id, obj_bits, key_raw, value)`
that records the miss and routes the write by name.

Perf-neutral by construction: the hot `class_field_set.fast` slot store is
untouched, and the change is confined to the cold guard-miss arm, which
never executes on a monomorphic hot path. IR shrinks by one call per
class-field-SET site (verified on emitted IR: fallback arm 2 calls -> 1;
full perry-codegen suite green).

First step of the IR-efficiency roadmap in #5334 (Tier 1, lever A:
outline cold IC machinery). Establishes the outline-helper pattern reused
by the larger levers.

coderabbitai (Bot) commented Jun 17, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 81441ca7-f1c7-4873-8140-414cb28d6902

📥 Commits

Reviewing files that changed from the base of the PR and between e5fbd2e and c0bf51f.

📒 Files selected for processing (4)
  • crates/perry-codegen/src/expr/property_set.rs
  • crates/perry-codegen/src/runtime_decls/objects.rs
  • crates/perry-codegen/tests/typed_feedback.rs
  • crates/perry-runtime/src/typed_feedback/guards.rs

📝 Walkthrough

A new cold-path runtime helper, js_class_field_set_fallback, is introduced that consolidates typed-feedback fallback recording and by-name field write into a single extern "C" function. The codegen's class-field setter guard-miss path is updated to emit one call instead of two, the FFI declaration is added, and typed-feedback tests are updated to match.

Changes

Class-field SET fallback consolidation

| Layer / File(s) | Summary |
|---|---|
| New js_class_field_set_fallback runtime helper (crates/perry-runtime/src/typed_feedback/guards.rs) | Adds js_class_field_set_fallback(site_id, obj_bits, key_raw, value) as a #[no_mangle] pub extern "C" cold-path function combining fallback recording and by-name field write; adds a #[used] static G1C to retain it at link time. |
| FFI declaration, cold-path emission, and test updates (crates/perry-codegen/src/runtime_decls/objects.rs, crates/perry-codegen/src/expr/property_set.rs, crates/perry-codegen/tests/typed_feedback.rs) | Declares the new FFI function in declare_phase_b_objects; updates the guard-miss cold path to emit a single js_class_field_set_fallback call instead of the prior two-call sequence; updates typed-feedback tests to assert the new call is emitted and js_object_set_field_by_name is absent at the SET site. |
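
To make the cold-path emission change concrete, here is a rough sketch of an emitter that produces either shape of the fallback arm as textual IR. `FieldSetSite` and `emit_fallback_arm` are hypothetical names invented for illustration; the actual code in property_set.rs is not shown in this PR page and very likely builds IR through a different API.

```rust
/// Operands available in the guard-miss arm (field names are illustrative).
pub struct FieldSetSite<'a> {
    pub site_id: u64,
    pub obj: &'a str,
    pub key: &'a str,
    pub val: &'a str,
}

/// Emit the fallback arm as IR text lines.
/// `outlined = false` reproduces the old two-call sequence,
/// `outlined = true` the single outlined call.
pub fn emit_fallback_arm(site: &FieldSetSite<'_>, outlined: bool) -> Vec<String> {
    let mut out = vec!["class_field_set.fallback:".to_string()];
    if outlined {
        out.push(format!(
            "  call void @js_class_field_set_fallback(i64 {}, i64 {}, i64 {}, double {})",
            site.site_id, site.obj, site.key, site.val
        ));
    } else {
        out.push(format!(
            "  call void @js_typed_feedback_record_fallback_call(i64 {})",
            site.site_id
        ));
        out.push(format!(
            "  call void @js_object_set_field_by_name(i64 {}, i64 {}, double {})",
            site.obj, site.key, site.val
        ));
    }
    // Both shapes rejoin the diamond at the same merge block.
    out.push("  br label %class_field_set.merge".to_string());
    out
}
```

The outlined form is one line shorter per set site, which is the per-site IR saving the description refers to.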

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • PerryTS/perry#5146: The new js_class_field_set_fallback helper delegates to js_object_set_field_by_name, and this PR modifies the accessor/frozen-order logic inside that callee function.

Poem

A rabbit once took two hops to set a field,
First recording the miss, then yielding the yield.
Now one single bound, js_class_field_set_fallback in name,
Combines both the record and write in one frame.
🐇✨ Fewer hops, same result — efficiency claimed!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically describes the main change: outlining the class-field-SET guard-miss arm to a single call, with a reference to the related issue. |
| Description check | ✅ Passed | The PR description is comprehensive and well-structured with sections covering What, Why it's safe, Verification, and Roadmap context, fully aligned with the template requirements. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@proggeramlug proggeramlug merged commit 849f2e3 into main Jun 18, 2026
15 checks passed
@proggeramlug proggeramlug deleted the feat/outline-fieldset-fallback branch June 18, 2026 03:59
proggeramlug added a commit that referenced this pull request Jun 18, 2026
…les (#5334 lever B) (#5385)

* perf(codegen): full-outline class-field IC diamond for oversized modules (#5334 lever B)

Pathologically-large modules (the motivating case: a 13MB minified bundle
that lowers to ~1.25GB of LLVM IR across ~92K functions) are forced to
`clang -O0` (#4880), where the inline class-field-SET IC diamond's
~15-lines-per-site expansion is never optimized away — and clang needs
~15GB RSS just to chew through it.

For such modules, replace the ENTIRE diamond (guard call + fast slot store
+ fallback arm) with a single `call @js_class_field_set_ic(...)`. The
runtime helper reproduces the diamond's exact semantics — run the guard,
then on PASS do the same raw-f64/boxed slot store, on FAIL record + route
by name. This trades a function-call frame on the (cold, startup-
dominated) field-set path for a large per-site IR reduction so clang can
compile the module at all.
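
A toy model of the full-outline semantics, for illustration only. The guard condition used here (a shape id match plus an in-bounds slot) is an assumption, since this message does not say what the real guard checks; `Obj`, `Site`, `fallback`, and `class_field_set_ic_sketch` are hypothetical names, and only the raw-f64 store is modelled, not the boxed one.

```rust
/// Toy object: a shape id, raw f64 slots, and a log of by-name writes.
pub struct Obj {
    pub shape: u32,
    pub slots: Vec<f64>,
    pub by_name_writes: Vec<(u32, f64)>,
}

/// Toy inline-cache site: the shape and slot it was specialised for.
pub struct Site {
    pub expected_shape: u32,
    pub slot: usize,
    pub key: u32,
    pub misses: u32,
}

/// Guard-FAIL tail: record the miss, then route the write by name.
fn fallback(site: &mut Site, obj: &mut Obj, value: f64) {
    site.misses += 1;
    obj.by_name_writes.push((site.key, value));
}

/// Whole diamond behind one call: guard, fast slot store on PASS, fallback on FAIL.
pub fn class_field_set_ic_sketch(site: &mut Site, obj: &mut Obj, value: f64) {
    // Assumed guard: shape matches and the cached slot exists.
    if obj.shape == site.expected_shape && site.slot < obj.slots.len() {
        obj.slots[site.slot] = value; // same store the inline fast arm would do
    } else {
        fallback(site, obj, value);
    }
}
```

The trade-off is visible in the shape of the function: every set pays for a call, but the call site itself shrinks to a single instruction.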

Gating (codegen-time, decided once per module in compile_module):
- `PERRY_FULL_OUTLINE_IC=1/on/true` forces ON, `=0/off/false` forces OFF;
- otherwise auto: function count >= PERRY_FULL_OUTLINE_IC_MIN_FUNCS
  (default 4000) — the defining trait of the bundle case; ordinary
  per-file modules stay on the inline diamond and keep the hot fast store.
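
The gating rules above could be sketched as a pure function over the two environment values. This is not the actual `decide_full_outline_ic`: the name `decide_full_outline_ic_sketch` and its signature are hypothetical, and the trimming, case-insensitive matching, and fall-back-to-default on an unparsable threshold are assumptions.

```rust
/// Default auto-gate threshold, taken from the text above.
pub const DEFAULT_MIN_FUNCS: usize = 4000;

/// `force` is the value of PERRY_FULL_OUTLINE_IC (if set),
/// `min_funcs` the value of PERRY_FULL_OUTLINE_IC_MIN_FUNCS (if set).
pub fn decide_full_outline_ic_sketch(
    force: Option<&str>,
    min_funcs: Option<&str>,
    callable_count: usize,
) -> bool {
    // An explicit override wins over the auto gate.
    match force.map(|s| s.trim().to_ascii_lowercase()).as_deref() {
        Some("1") | Some("on") | Some("true") => return true,
        Some("0") | Some("off") | Some("false") => return false,
        _ => {} // unset or unrecognised: fall through to auto (assumption)
    }
    // Auto: compare the module's function count against the threshold.
    let threshold = min_funcs
        .and_then(|s| s.trim().parse::<usize>().ok())
        .unwrap_or(DEFAULT_MIN_FUNCS);
    callable_count >= threshold
}
```

A caller would typically pass `std::env::var("PERRY_FULL_OUTLINE_IC").ok().as_deref()` for the first argument; keeping the function pure makes the gate testable without touching the process environment.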

The decision is a thread-local set at the top of compile_module (codegen
is sequential per module), not a process-global OnceLock, so it can't pin
one module's decision across a multi-module build.
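
The difference between the two storage choices can be shown with a small sketch. All names here (`FULL_OUTLINE_IC`, `PINNED`, `compile_module_sketch`, `compile_module_pinned`) are hypothetical; the real compile_module does far more than set a flag.

```rust
use std::cell::Cell;
use std::sync::OnceLock;

thread_local! {
    // Per-thread flag, overwritten at the top of every module compile.
    static FULL_OUTLINE_IC: Cell<bool> = Cell::new(false);
}

// Process-global alternative, shown only for contrast: the first module wins.
static PINNED: OnceLock<bool> = OnceLock::new();

/// Hypothetical stand-in for a lowering step that consults the flag.
fn site_uses_full_outline() -> bool {
    FULL_OUTLINE_IC.with(|flag| flag.get())
}

/// Thread-local variant: each module re-decides before lowering its sites.
pub fn compile_module_sketch(callable_count: usize, min_funcs: usize) -> bool {
    FULL_OUTLINE_IC.with(|flag| flag.set(callable_count >= min_funcs));
    site_uses_full_outline()
}

/// OnceLock variant: the decision made for the first module sticks forever.
pub fn compile_module_pinned(callable_count: usize, min_funcs: usize) -> bool {
    *PINNED.get_or_init(|| callable_count >= min_funcs)
}
```

Compiling a large module and then a small one gives a fresh decision each time with the thread-local flag, whereas the `OnceLock` variant keeps reporting the first module's answer.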

NB: the full-outline boxed store always emits the write barrier (via
js_object_set_field), so the compile-time non-pointer barrier elision
(#5334 lever D) does not apply on this path — acceptable, since it is
gated to oversized, non-hot-loop modules.

Verified: forced ON collapses the diamond to one call (no fast/fallback
blocks, no inline guard call); a class-field-write program runs to the
correct result under full-outline; full perry-codegen suite green. The two
class-field structure tests now pin PERRY_FULL_OUTLINE_IC off and
serialize on ENV_LOCK against the new lever-B test.

Final lever of the IR-efficiency roadmap (#5334). Levers A #5351, C #5350
merged; D #5381 in review.

* review: count class callables in lever-B gate; dedup IC fallback tail

Addresses self-review findings on #5334 lever B (#5385):

- Gate denominator: `decide_full_outline_ic` was fed `hir.functions.len()`,
  which excludes class methods, static methods, accessors, and constructors
  (those live in `hir.classes[].*`, collected separately). A class-heavy
  minified bundle — the exact pathology lever B targets — could have a small
  `functions.len()` yet emit tens of thousands of LLVM functions, so the gate
  would never fire. New `module_callable_count()` counts top-level functions
  plus all class callables; the gate now uses it. New test
  `full_outline_ic_auto_gate_counts_class_methods` covers a class-heavy module
  triggering with only one top-level function.

- Dedup: `js_class_field_set_ic`'s guard-FAIL tail re-implemented
  `js_class_field_set_fallback` verbatim. It now delegates to that helper, so
  by-name routing (frozen / accessor / setter-in-chain) is defined once.

Full perry-codegen + perry-runtime typed_feedback suites green.

* review(coderabbit): count class computed_members in lever-B gate denominator

ClassComputedMember holds a Function that compile_module lowers like any
method (emits an LLVM function), so computed members must count toward the
oversized-module gate alongside methods/static_methods/getters/setters.
A class with many computed-key methods could otherwise stay under
PERRY_FULL_OUTLINE_IC_MIN_FUNCS and keep the inline diamond when the auto
gate should fire.
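
A rough sketch of the denominator after both review fixes, assuming a simplified module shape. `ClassSketch` and `ModuleSketch` are hypothetical and store plain counts, whereas the real HIR holds the functions themselves; `module_callable_count_sketch` only mirrors the counting rule described above.

```rust
/// Simplified class: counts stand in for the real lists of members.
pub struct ClassSketch {
    pub has_constructor: bool,
    pub methods: usize,
    pub static_methods: usize,
    pub getters: usize,
    pub setters: usize,
    pub computed_members: usize,
}

/// Simplified module: top-level function count plus its classes.
pub struct ModuleSketch {
    pub functions: usize,
    pub classes: Vec<ClassSketch>,
}

/// Count everything that lowers to its own LLVM function.
pub fn module_callable_count_sketch(m: &ModuleSketch) -> usize {
    let class_callables: usize = m
        .classes
        .iter()
        .map(|c| {
            usize::from(c.has_constructor)
                + c.methods
                + c.static_methods
                + c.getters
                + c.setters
                + c.computed_members // added by the follow-up review fix
        })
        .sum();
    m.functions + class_callables
}
```

With one top-level function and a single class carrying thousands of methods or computed members, the count clears the default threshold of 4000 even though the top-level function list alone would not.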

* lint: GC_STORE_AUDIT(POINTER_FREE) marker on js_class_field_set_ic raw store

The full-outline IC helper's raw-f64 slot write is a barrier-free store the
GC store-site inventory requires to be annotated. A passing guard with
require_raw_f64 proves the slot is pointer-free (typed-shape descriptor) and
the value is a plain number — identical to the inline class_field_set.fast
raw-f64 store. Fixes the failing lint 'GC store-site inventory' step.

---------

Co-authored-by: Ralph Küpper <ralph2@skelpo.com>