perf(codegen): full-outline the class-field-GET diamond for oversized modules (#5391 path 2)#5410
Conversation
… modules (#5391 path 2) Extends the #5334 lever-B full-outline from field-SET to field-GET. After codegen-unit splitting (#5407) removed the module memory wall and the string-init split removed the 68MB generated function, the residual wall is the bundle's large minified USER functions (7-18MB closures): clang -O0 time is superlinear in single-function size. Full-outlining the IC diamonds shrinks those functions at their source. When full-outline is enabled (size-gated), the class-field-GET diamond (inline precheck + guard call + fast slot load + by-name fallback + phi) collapses to one `js_class_field_get_ic(...)` call returning the field value. The runtime helper runs the same guard, reads the field slot as f64 on a pass (a plain number is self-boxing in nan-boxing, so raw-f64 and boxed slots read identically — matching the inline `load double`), and records + reads by name on a miss. The outline drops the inline path's static raw-number type hint (result treated as a general JS value), which is value-correct. class-field GET is the largest remaining diamond kind on the bundle (~39K sites). Gated by the existing PERRY_FULL_OUTLINE_IC; default off = unchanged. Unit test covers both gate states; full codegen suite green.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughAdds a fully-outlined inline-cache path for class-field GET operations. A new ChangesFull-outline IC for class-field GET
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
…lict in compile/link/mod.rs Brings main's PerryTS#5400 (Windows response-file link) + PerryTS#5406/PerryTS#5408/PerryTS#5410/PerryTS#5412 codegen split + PerryTS#5402 require Tier 2 + PerryTS#5414 unknown builtin-namespace fix into the Windows plugin support branch. The conflict in compile/link/mod.rs is structural: main extracted the orchestrator into link/build_and_run.rs while the branch kept it inline. Resolved by taking main's refactor and re-applying the Windows plugin support on top: - compile/link/mod.rs: add PLUGIN_HOST_SYMBOLS const (the runtime + plugin export list — single source of truth for the macOS -u force-keep and the Windows .def file) plus use std::io::Write for the .def writer. - compile/link/build_and_run.rs: convert the if ctx.needs_plugins && !is_windows block to if ctx.needs_plugins with platform branches. The Windows branch writes a per-build perry_plugin_host_<pid>.def using the NAME directive (not LIBRARY — that one tells link.exe the output is a DLL and breaks the host .exe) and passes /DEF:<path> to the linker. PLUGIN_HOST_SYMBOLS replaces the inline untime_syms array on macOS too. - compile.rs: dylib link path on Windows now uses lld-link with /FORCE:UNRESOLVED (the .def file lists plugin_activate / perry_plugin_abi_version / plugin_deactivate so lld-link's empty default export table doesn't break GetProcAddress in the host) and writes a per-build .def file with the same three symbols. Lifts the pre-existing 'use link.exe' out for the LLVM-friendly linker. - codegen/entry.rs: re-apply the dylib plugin ABI shim block (emits perry_plugin_abi_version, plugin_activate, and the optional plugin_deactivate) on top of main's refactor of compile_module_entry.
What
Extends the #5334 lever-B full-outline from class-field SET to class-field GET: when full-outline is enabled (size-gated by
PERRY_FULL_OUTLINE_IC), theclass_field_getdiamond (inline precheck + guard call + fast slot load + by-name fallback + phi) collapses to a singlejs_class_field_get_ic(...)call returning the field value.Part of #5391 path 2 — shrinking large minified functions at their source so
clang -O0(whose per-function time is superlinear in size) can compile them.Runtime helper
js_class_field_get_icruns the samejs_typed_feedback_class_field_get_guard; on a PASS reads the field slot asf64(a plain number is self-boxing in nan-boxing, so raw-f64 and boxed slots read identically — matching the inlineclass_field_get.fastplainload double); on a MISS records the fallback and reads by name. The outline drops the inline path's static raw-number type hint (result treated as a general JS value), which is value-correct.Validation
full_outline_ic_collapses_class_field_get_to_single_call(both gate states).5999997) with full-outline ON and OFF.perry-codegensuite green. Default off = unchanged behavior.Scope note (honest)
This shrinks class-field-heavy functions. The bundle's largest functions, however, are dominated by a different construct — inline object/array literal construction (the 18MB
perry_closure...31856is ~17Kjs_gc_note_slot_layout+ 11Kjs_inline_arena_slow_alloc+ 9Kjs_write_barrier_slot: a giant data-table builder). Shrinking those needs literal-construction / inline-allocator outlining, a separate follow-up under #5391.Refs #5391, #5334.
Summary by CodeRabbit