diff --git a/W6_ISSUE_DRAFT.md b/W6_ISSUE_DRAFT.md new file mode 100644 index 0000000000..78b7536624 --- /dev/null +++ b/W6_ISSUE_DRAFT.md @@ -0,0 +1,105 @@ +# Lazy default-import of a cjs-wrapped module binds to a named class export instead of `export default` (scale-emergent) + +## Summary +In a large compiled bundle, a `require()` **inside a function** of a CommonJS-wrapped +module binds the import local to the module's **named class export** instead of its +**`export default`** (`module.exports`) object. The metadata is correct; only the +final codegen symbol binding is wrong, and only at giant-module scale. + +## Concrete instance (Next.js app-router render → HTTP 500) +- Module: `next/dist/server/lib/incremental-cache/shared-cache-controls.external.js` +- Source exports (CJS): `Object.defineProperty(exports, "SharedCacheControls", { get })` + a top-level `class SharedCacheControls`. +- `cjs_wrap` output (correct): hoists the class, emits both `export default _cjs;` and `export { SharedCacheControls };`. +- `PERRY_DUMP_EXPORTS` (recorded metadata, correct): + - `Named { local: "default", exported: "default" }` (→ `_cjs`, the exports object) + - `Named { local: "SharedCacheControls", exported: "SharedCacheControls" }` (the class) +- Importer `app-page-turbo.runtime.prod.js` does `const uw = require(".../shared-cache-controls.external.js")` **inside `IncrementalCache.getIncrementalCache`**, recorded as: + - `Default { local: "_lazyreq_26" }`, `is_adopted_require = true` (a lazy default import) +- Runtime: `typeof uw === "function"` and `uw.SharedCacheControls === undefined` → `new uw.SharedCacheControls(...)` throws **`TypeError: undefined is not a constructor`** in `IncrementalCache`'s constructor → app-router render returns HTTP 500. + +So the lazy default import `_lazyreq_26` binds to the **class** symbol instead of the +`"default"` (`_cjs`) symbol. + +## What's ruled out +- `cjs_wrap` output — correct (`export default _cjs` present). 
+- Runtime `module.exports` — correct (`typeof module.exports === "object"`, `.SharedCacheControls` a function, `exports === module.exports`). +- `reachability.rs` — tree-shaking only; `shared-cache-controls` is a non-barrel → module-granularity (whole module kept). +- Default-export-name collision — `__default` symbols are per-origin (`perry_fn___default`). + +## Not minimally reproducible (~13 shapes all bind correctly) +relative `.js`; node_modules pkg; `.external.js` suffix; `compilePackages` NativeCompiled; +full-subpath require; within-package sibling require; circular require; +`exports.X = X`; `module.exports.X`; getter + static-field exact class shape; +dual-importer (named + namespace); **lazy require inside a function**; +**multi-module (5) lazy default imports**. Every one returns `_cjs`/binds correctly. +The defect appears only inside the real ~600KB `app-page-turbo` module. + +## Likely area +Codegen default-import → export-symbol resolution (`import_function_prefixes` / +`perry_fn___default`). Hypotheses: the `__default` symbol for an +`export default ` (the IIFE result `_cjs`) is not emitted / not +reachable at scale, so the default import falls back to the module's other (named +class) export; or a scale-only symbol-resolution path differs. + +## Repro env +`/tmp/perry-nextjs-demo` (Next 16 standalone, `output: 'standalone'`), compiled with +`PERRY_LL_O0_THRESHOLD_BYTES=536870912 PERRY_ALLOW_PERRY_FEATURES=1 PERRY_ALLOW_EVAL=1 PERRY_ALLOW_UNIMPLEMENTED=1`. +Diagnostic: `PERRY_DUMP_EXPORTS` dump added to `bootstrap.rs enforce_package_default_exports`. + +## Context +This is the 6th wall in the Next.js app-router bring-up; walls 1–5 fixed on +`feat/nextjs-wall-46` (incl. `9970fbbe7` 0-arg class-object resolve, `af8c832b0` +readFileSync ENOENT, `6c41417ff` anon-class-expression capture). With W6 fixed the +render should advance past `IncrementalCache` construction. 
+ +--- +## DEEP UPDATE (corrected root via runtime probes) +Earlier "binds to the class" was WRONG. Confirmed via ~10 probe cycles: +- Importer: `_lazyreq_26` is in `imported_vars`, NOT in `class_ids` → correctly reaches the getter path (dyn_extern_i18n.rs:594/625), calls `perry_fn___default`. +- Exporter: that getter IS emitted (`emit_getter=true`, `is_function_alias=false`), loads `@perry_global___55`. +- HIR: `export default _cjs` → `LocalGet(0)`; local 0 = `_cjs`, init = `Call` (the IIFE call result — correct). +- Module scope: at shared-cache-controls's OWN scope, `module.exports` (= `_cjs`) is `typeof object` (PERRY_SCC probe). +- Cross-module runtime (W6X at `new uw.SharedCacheControls`): `uw` = an UNNAMED CLOSURE (`typeof function`, `name===""`, no keys) — NOT the class, NOT the object. + +So: `perry_global___55` (the `"default"` global) holds a **closure** at runtime, even though `_cjs` is the exports **object** at its own scope. The cross-module `"default"` transfer (the module-init `perry_global__55 = LocalGet(0)` assignment, or the IIFE-result local read) **mistypes the object as a closure**, ONLY at giant-bundle scale (~14 minimal repros — incl. lazy-require, deferred, -O3 auto-optimize, exact class shape — all transfer the object correctly). Not the I64/F64 module_var_data_ids path (that's inlining-only). + +Next: runtime-probe the value written to `perry_global___55` at module-init (object vs closure) to confirm the assignment vs getter mistyping; investigate the IIFE-result local (`_cjs`, local 0) read at module-init scope at scale. + +--- +## ROOT (store-time probe, decisive) +`js_debug_val` injected at the module-global store (let_stmt.rs:785, gated PERRY_DBG_STORE on the COMPILE) shows the `"default"` Let (id 55) store-time value: +`[DEBUG_VAL] label=55 bits=0x7FFD045AB87A73B8` — tag `0x7FFD` = POINTER (runs once, deferred-init). Runtime `uw` is `typeof function`, so this pointer is the **closure**. 
+ +So `_cjs` (local 0, init = the IIFE `Call`) holds the **IIFE closure**, not the IIFE **call result** (the exports object), at store time — i.e. `const _cjs = (function(){...; return module.exports})()` binds `_cjs` to the *function* instead of its *return value*, ONLY at giant-bundle scale. The IIFE body's `module.exports` IS an object (PERRY_SCC), so the IIFE returns the object; the bug is the Call-result binding of `_cjs`. Not reproducible in ~14 minimal repros (incl. deferred lazy-require + -O3 auto-optimize) — a scale-emergent codegen defect in the IIFE-call-result assignment for a deferred cjs-wrapped module. + +FIX area: the codegen that lowers `const x = (closure)()` (the cjs_wrap IIFE) — ensure `x` binds the Call RESULT, not the callee closure, under the giant-module / deferred-init path. Needs a scale reproduction or someone with the IIFE-call/deferred-init codegen context. + +--- +## store==load (definitive, same run) +`js_debug_val` at the store (let_stmt.rs) AND the importer getter-call (dyn_extern_i18n.rs:628), same run: +`label=55 (store) bits=0x7FFD02D428FA7130` == `label=9955 (load) bits=0x7FFD02D428FA7130` — IDENTICAL. +So the getter faithfully returns the stored value (NOT a load-side/getter bug, NOT corruption). `uw.name===""` (anonymous) ⇒ the stored value is the **IIFE function itself**, not the class (which would be `name==="SharedCacheControls"`). Definitive root: `const _cjs = (function(){...; return module.exports})()` binds `_cjs` to the IIFE **closure (callee)**, not the IIFE's **call result** (the exports object) — the IIFE body DOES run (module.exports populated) but its return value is discarded and the closure is stored. Giant-bundle-scale only (16+ repros incl. 150-module -O3 build all bind the call result correctly). 
The IIFE-call path is via the receiverless closure-value call (lower_call/console_promise.rs:997); it's correct in repros, so the defect is a scale-specific interaction (inlining / deferred-init / whole-program -O3) in the real bundle. Not reproducible synthetically → needs in-bundle debugging or the team's oversized-module codegen work. + +--- +## ROOT CONFIRMED (2026-06-19): deferred-require var captured by-value as a stale thunk +After exhaustively refuting prefix/global-id/FuncId collisions, -O3, GC, and the IIFE-return path (all clean), and tracing the value across the module boundary, the root is: + +`uw = require("next/dist/server/lib/incremental-cache/shared-cache-controls.external.js")` is an **adopted/deferred require** (`cjs_wrap` rewrites `const uw = require('S')` → `import uw from 'S'`; `is_deferred_require` on the import decl). The `IncrementalCache` **constructor** (a class method inside app-page-turbo's cjs-wrap IIFE) **captures `uw` by value** (`js_closure_get_capture_f64`, NON-boxed; literals_vars.rs:434) at class-definition time — when `uw` is still the **unresolved thunk/closure**. So `new uw.SharedCacheControls(...)` reads a function → `uw.SharedCacheControls === undefined` → `TypeError: undefined is not a constructor` → HTTP 500. + +### Verified value chain (one run, is_closure probe) +- perry_global store of the export = OBJECT (`is_closure=false`) +- importer getter-call result (the `uw` value via the getter) = SAME OBJECT (`is_closure=false`, identical bits) +- but the constructor's captured `uw` = FUNCTION (anonymous closure) — `W6X typeof=function` +- probe `js_dbg_closure_only` at the capture-read site: **47 by-value captures-of-closures in app-page-turbo**; `uw` is one (candidate ids 49 / 7499 / 7962 / 186xx). + +### Why the boxing analysis misses it +`boxed_vars.rs:151` boxes a var only when `(declared AND captured AND mutated) OR self-recursive-closure`. `uw` is captured but assigned once (not "mutated"), so it's snapshotted by value. 
The **self-recursive-closure** rule (`collect_self_recursive_closure_ids`, boxed_vars.rs:148) is the exact precedent — it boxes `let f = closure()` because "the store happens AFTER captures populate." `uw`'s deferred require is the same "value not ready at capture" shape, just with a require/import init. + +### Candidate fixes (delicate — import/capture subsystem; needs the repr decided first) +Whether `uw` is a `Stmt::Let` (boxed_vars-visible) or a pure adopted-import binding determines the site: +1. **box-when-captured**: if `uw` is a captured `Let` whose init is an adopted-require/import value → add to the boxed set (mirror the self-recursive rule), so the capture is by-reference and sees the resolved object. Verify the box actually receives the resolved value. +2. **eager-init**: a cjs-wrap-IIFE require runs at module init, so resolve it eagerly into the local before the class definition (then by-value capture = object). +3. **getter-on-read**: lower the constructor's `uw` read through the imported-var getter (`ExternFuncRef` → `perry_fn___`), consistent with init-scope reads, instead of a by-value capture. + +### Repro status +NOT minimally reproducible (the eager-resolve path works in isolation; module-level require captured by a class method passes). Needs the cjs-wrap adopted/deferred-require + cross-module-getter shape — bundle-only so far. A faithful repro likely requires a compilePackage that cjs-wraps `const uw = require('dep'); module.exports.C = class { constructor(){ new uw.Thing() } }`. 
diff --git a/crates/perry-codegen/src/codegen/artifacts.rs b/crates/perry-codegen/src/codegen/artifacts.rs index 35e9c56524..1ea51d4465 100644 --- a/crates/perry-codegen/src/codegen/artifacts.rs +++ b/crates/perry-codegen/src/codegen/artifacts.rs @@ -79,6 +79,82 @@ pub(super) struct ModuleArtifactsCtx<'a> { pub cross_module: &'a CrossModuleCtx, } +/// The standalone-constructor arity Perry emits for `class`, accounting for the +/// JS spec default ctor `constructor(...args) { super(...args) }` that a class +/// with NO own constructor but WITH heritage inherits. Walks the ancestor chain +/// (local `class_table` + cross-module `imported_classes` ctor-param counts + +/// imported stubs) for the nearest ctor-bearing parent's user arity, mirroring +/// the `found_params` walk in the per-class ctor emission below. +/// +/// This MUST agree with the synthesized standalone ctor's actual LLVM signature, +/// otherwise the runtime registers a different `total_params` than the function +/// declares and `replay_registered_class_constructor` forwards the wrong number +/// of args (Next.js wall 51: `class AppRouteRouteMatcher extends +/// _mod.RouteMatcher {}` emitted a 1-param forwarding ctor but registered it as +/// 0 params, so `new mod.AppRouteRouteMatcher(def)` dropped `def` before +/// `RouteMatcher(definition)` ran and every matcher's `this.definition` was a +/// garbage number). +fn synthesized_ctor_param_count( + class: &perry_hir::Class, + class_table: &HashMap, + imported_class_stubs: &[perry_hir::Class], + imported_classes: &[super::opts::ImportedClass], +) -> usize { + if let Some(c) = class.constructor.as_ref() { + return c.params.len(); + } + // A native parent (`extends Error` / `extends events.EventEmitter`) has its + // own native construction path that consumes the construction args directly; + // don't synthesize a forwarding ctor for it. + if class.native_extends.is_some() { + return 0; + } + // No heritage at all → nothing to forward. 
+ if class.extends_name.is_none() && class.extends_expr.is_none() { + return 0; + } + // Walk ancestors for the nearest ctor-bearing parent's user arity. + let mut cur = class.extends_name.clone(); + while let Some(pname) = cur { + let imported_ctor_params = imported_classes + .iter() + .find(|i| i.local_alias.as_deref().unwrap_or(&i.name) == pname.as_str()) + .map(|ic| ic.constructor_param_count) + .unwrap_or(0); + if let Some(pclass) = class_table.get(pname.as_str()) { + if let Some(pctor) = &pclass.constructor { + return pctor.params.len(); + } + if imported_ctor_params > 0 { + return imported_ctor_params; + } + cur = pclass.extends_name.clone(); + } else if let Some(stub) = imported_class_stubs.iter().find(|c| c.name == pname) { + if imported_ctor_params > 0 { + return imported_ctor_params; + } + cur = stub.extends_name.clone(); + } else { + break; + } + } + // The parent's exact ctor arity is genuinely unavailable here in some build + // modes: the auto-optimize / standalone path compiles each nested + // `node_modules` module with an EMPTY `imported_classes` list and resolves + // the cross-module parent purely as a runtime DYNAMIC parent + // (`extends_expr` + `js_register_class_parent_dynamic`), so it is absent + // from `class_table` and `imported_class_stubs` here. Without a forwarding + // signature the synthesized `super()` dropped every construction arg + // (Next.js wall 51: `class PagesRouteMatcher extends _mod.RouteMatcher {}` + // → `RouteMatcher(definition)` saw garbage → every matcher's + // `this.definition` was undefined). Forward a generous fixed band of + // positional params: the `new` site pads missing slots with `undefined`, + // and a parent ctor reading fewer params ignores the trailing `undefined`s, + // so over-declaring is correct for any (non-native) parent up to this band. 
+ const UNRESOLVED_PARENT_FWD_ARITY: usize = 8; + UNRESOLVED_PARENT_FWD_ARITY +} + /// Emit the artifact tail: bodies, wrappers, namespace globals, entry /// function, string pool. Mirrors the in-prelude execution order of /// the original `compile_module`. @@ -341,78 +417,31 @@ pub(super) fn emit_module_artifacts(c: ModuleArtifactsCtx<'_>) -> Result<()> { let ctor_body = if let Some(c) = class.constructor.as_ref() { (c.params.clone(), c.body.clone(), c.captures.clone()) } else if class.extends_name.is_some() { - // Walk ancestors for the first one with a ctor; adopt its - // params (cleared of ids — they'll be fresh). - let mut found_params: Vec<perry_hir::Param> = Vec::new(); - let mut cur = class.extends_name.clone(); - while let Some(pname) = cur { - // v0.5.760: also consult `opts.imported_classes` for - // cross-module parent ctors. Pre-fix the loop fell - // through to the next ancestor when `class_table`'s - // entry for an imported class returned a stub with - // `constructor: None` (stubs always have None) — even - // though the source module did have a real ctor/effect. - // Result: `class Child extends Parent { x = - // "y" }` (no own ctor, parent in another module) had - // its synthesized ctor with ZERO params, so the user's - // `new Child("arg")` lost the arg before reaching - // Parent_constructor. Explicit zero-arg ctors and - // field-initializer ctors still stop the walk even with - // zero adopted params. Refs #420.
- let imported_ctor = opts - .imported_classes - .iter() - .find(|i| i.local_alias.as_deref().unwrap_or(&i.name) == pname.as_str()) - .filter(|ic| { - ic.constructor_param_count > 0 - || ic.has_own_constructor - || ic.has_instance_fields - }); - if let Some(pclass) = class_table.get(pname.as_str()) { - if let Some(pctor) = &pclass.constructor { - found_params = pctor.params.clone(); - break; - } - if let Some(imported_ctor) = imported_ctor { - for i in 0..imported_ctor.constructor_param_count { - found_params.push(perry_hir::Param { - id: 0xFFFF_0000 + i as u32, - name: format!("__forward_arg{}", i), - ty: perry_types::Type::Any, - default: None, - decorators: Vec::new(), - is_rest: false, - arguments_object: None, - }); - } - break; - } - cur = pclass.extends_name.clone(); - } else if let Some(stub) = imported_class_stubs.iter().find(|c| c.name == pname) - { - // Imported stub — params not in HIR; use effectful - // ctor metadata as a synthetic count of unnamed args. - if let Some(imported_ctor) = imported_ctor { - for i in 0..imported_ctor.constructor_param_count { - found_params.push(perry_hir::Param { - id: 0xFFFF_0000 + i as u32, - name: format!("__forward_arg{}", i), - ty: perry_types::Type::Any, - default: None, - decorators: Vec::new(), - is_rest: false, - arguments_object: None, - }); - } - } else { - cur = stub.extends_name.clone(); - continue; - } - break; - } else { - break; - } - } + // No own ctor + heritage → JS spec default ctor + // `constructor(...args) { super(...args) }`. Synthesize forwarding + // params matching the closest ancestor ctor's arity (incl. + // cross-module parents) via the shared helper — its result is + // ALSO what gets registered into CLASS_CONSTRUCTORS below, so the + // emitted signature and the runtime `total_params` always agree + // (Next.js wall 51). The actual `super(...)` is emitted by the + // compile_method post-init step from these params positionally. 
+ let n = synthesized_ctor_param_count( + class, + class_table, + imported_class_stubs, + opts.imported_classes, + ); + let found_params: Vec<perry_hir::Param> = (0..n) + .map(|i| perry_hir::Param { + id: 0xFFFF_0000 + i as u32, + name: format!("__forward_arg{}", i), + ty: perry_types::Type::Any, + default: None, + decorators: Vec::new(), + is_rest: false, + arguments_object: None, + }) + .collect(); (found_params, Vec::new(), Vec::new()) } else { (Vec::new(), Vec::new(), Vec::new()) @@ -1616,6 +1645,26 @@ pub(super) fn emit_module_artifacts(c: ModuleArtifactsCtx<'_>) -> Result<()> { user_fn_source.push((sym, src.clone())); } + // Wall 51: the standalone-ctor arity registered into CLASS_CONSTRUCTORS must + // match the arity of the ctor function actually emitted above (which, for a + // no-own-ctor class with heritage, is the synthesized `super(...args)` + // forwarding ctor whose arity comes from the nearest ancestor ctor — possibly + // cross-module). Compute that arity here with the same walk so the runtime + // forwards the right number of construction args to the parent ctor.
+ let ctor_arity_overrides: HashMap<String, u32> = class_table + .iter() + .filter(|(name, class)| **name == class.name && class.id != 0) + .map(|(name, class)| { + let n = synthesized_ctor_param_count( + class, + class_table, + imported_class_stubs, + opts.imported_classes, + ) as u32; + (name.clone(), n) + }) + .collect(); + emit_string_pool( llmod, strings, @@ -1623,6 +1672,7 @@ pub(super) fn emit_module_artifacts(c: ModuleArtifactsCtx<'_>) -> Result<()> { class_keys_init_data, class_ids, class_table, + &ctor_arity_overrides, closure_rest_params, closure_arities, closure_lengths, diff --git a/crates/perry-codegen/src/codegen/entry.rs b/crates/perry-codegen/src/codegen/entry.rs index 97ee5260ba..85c0c18575 100644 --- a/crates/perry-codegen/src/codegen/entry.rs +++ b/crates/perry-codegen/src/codegen/entry.rs @@ -205,6 +205,22 @@ pub(super) fn compile_module_entry( .filter(|s| !s.is_empty()) .map(|suite| llmod.add_string_constant(suite)) }; + // Next.js wall 54 (part 2): emit a string constant for every Deferred + // `.next/server/**` module path now (before `main` borrows `llmod`); the + // registration calls go in the block below. `(string_const_name, + // byte_len, sanitized_prefix)`. + let nextjs_path_inits: Vec<(String, usize, String)> = if is_dylib { + Vec::new() + } else { + cross_module + .nextjs_path_init_modules + .iter() + .map(|(path, prefix)| { + let (cn, len) = llmod.add_string_constant(path); + (cn, len, prefix.clone()) + }) + .collect() + }; let main = if is_dylib { llmod.define_function("perry_module_init", VOID, vec![]) } else { @@ -316,6 +332,25 @@ pub(super) fn compile_module_entry( } blk.call_void(&format!("{}__init", prefix), &[]); } + // Next.js wall 54 (part 2): record each Deferred `.next/server/**` + // module's `__init` address under its absolute path so a runtime + // `require(absolutePath)` (turbopack page/chunk loading) can trigger + // its init lazily. No init runs here — only the address is recorded.
+ // The `__init` symbols are already declared above for every + // non-entry prefix, so `ptrtoint` of the symbol resolves at link. + for (const_name, byte_len, prefix) in &nextjs_path_inits { + let path_ptr = format!("@{}", const_name); + let len_str = byte_len.to_string(); + let init_addr = format!("ptrtoint (ptr @{}__init to i64)", prefix); + blk.call_void( + "js_register_path_init", + &[ + (PTR, path_ptr.as_str()), + (I64, len_str.as_str()), + (I64, init_addr.as_str()), + ], + ); + } } // Mark the boundary between init prelude and user code so // hoisted post-init setup (cached `@perry_class_keys_*` loads diff --git a/crates/perry-codegen/src/codegen/helpers.rs b/crates/perry-codegen/src/codegen/helpers.rs index 6db349f87e..b24ec5b944 100644 --- a/crates/perry-codegen/src/codegen/helpers.rs +++ b/crates/perry-codegen/src/codegen/helpers.rs @@ -711,6 +711,12 @@ pub(super) fn init_static_fields_early( // static-field & friends. Initialized and computed-key fields are emitted // inline at their source position elsewhere and are skipped here. for c in &hir.classes { + // Next.js wall 54: a nested class (declared inside a function) has its + // static-field initializers run when the enclosing function evaluates + // the class, NOT at module init — skip them here. + if c.is_nested { + continue; + } let Some(&class_id) = ctx.class_ids.get(&c.name) else { continue; }; @@ -799,6 +805,13 @@ pub(super) fn init_static_fields_late( refs.iter().any(|id| !module_local_scope.contains(id)) }; for c in &hir.classes { + // Next.js wall 54: a nested class's static-field initializers must run + // when the enclosing function evaluates the class, not at module init. + // Running a side-effectful one eagerly (e.g. `static #a = new Self()`) + // both mistimes it and can crash before user code. + if c.is_nested { + continue; + } for sf in &c.static_fields { // Computed-key static fields go through the class-static-symbol // side table. 
Refs #420 — drizzle's `static [entityKind] = diff --git a/crates/perry-codegen/src/codegen/method.rs b/crates/perry-codegen/src/codegen/method.rs index 0420e89eb8..cffe513dae 100644 --- a/crates/perry-codegen/src/codegen/method.rs +++ b/crates/perry-codegen/src/codegen/method.rs @@ -337,7 +337,16 @@ pub(super) fn compile_method( } effective_parent = pc.extends_name.as_deref(); } - if let Some(pname) = effective_parent { + // Wall 51: a class with a DYNAMIC parent (`extends_expr`, e.g. + // `class X extends _mod.Parent {}`) must route its synthesized + // super through the runtime dynamic-parent dispatcher below + // (`js_fetch_or_value_super` keyed on the decl-time-registered parent + // value), NOT this inline static-symbol call — the parent's + // standalone ctor symbol lives under a different module prefix and + // the static call would target the wrong/empty symbol, so the parent + // ctor never ran and inherited fields stayed undefined. Skip the + // inline path for dynamic-parent classes. + if let Some(pname) = effective_parent.filter(|_| class.extends_expr.is_none()) { let pname_owned = pname.to_string(); let node_stream_kind = if pname_owned == "Readable" { node_stream_parent_kind(ctx.classes, class) @@ -478,6 +487,72 @@ pub(super) fn compile_method( .call(DOUBLE, runtime_fn, &[(DOUBLE, &this_box), (DOUBLE, &opts)]); } + // Wall 51: a no-own-ctor class with a DYNAMIC / cross-module parent + // (`class X extends _mod.Parent {}`, captured as `extends_expr`) that + // the inline walk above could NOT resolve to a local/imported ctor + // symbol (the auto-optimize / standalone build compiles each nested + // module with the parent absent from `ctx.classes` / + // `imported_class_ctors`, resolving it purely as a runtime dynamic + // parent). Without an emitted super-call the parent ctor never runs + // and inherited `this. 
= …` writes are lost — Next.js route + // matchers (`class PagesRouteMatcher extends _mod.RouteMatcher {}`) + // left every `this.definition` undefined, so `matcher.definition + // .pathname` threw. Forward this synthesized ctor's params to the + // runtime dynamic-parent super dispatcher, mirroring the explicit + // `Expr::SuperCall` dynamic-parent path in `expr/this_super_call.rs`. + if builtin_parent_runtime.is_none() && class.extends_expr.is_some() { + if let Some(cid) = ctx.class_ids.get(&class.name).copied().filter(|c| *c != 0) { + let undef_lit = + crate::nanbox::double_literal(f64::from_bits(crate::nanbox::TAG_UNDEFINED)); + let mut lowered_args: Vec = Vec::with_capacity(method.params.len()); + for p in &method.params { + if let Some(slot) = ctx.locals.get(&p.id).cloned() { + lowered_args.push(ctx.block().load(DOUBLE, &slot)); + } else { + lowered_args.push(undef_lit.clone()); + } + } + let parent_val = ctx.block().call( + DOUBLE, + "js_get_dynamic_parent_value", + &[(crate::types::I32, &cid.to_string())], + ); + let (args_ptr, args_len) = if lowered_args.is_empty() { + ("null".to_string(), "0".to_string()) + } else { + let buf_reg = ctx.func.alloca_entry_array(DOUBLE, lowered_args.len()); + for (i, a_val) in lowered_args.iter().enumerate() { + let slot = + ctx.block() + .gep(DOUBLE, &buf_reg, &[(I64, &format!("{}", i))]); + ctx.block().store(DOUBLE, a_val, &slot); + } + let ptr_reg = ctx.block().next_reg(); + ctx.block().emit_raw(format!( + "{} = getelementptr [{} x double], ptr {}, i64 0, i64 0", + ptr_reg, + lowered_args.len(), + buf_reg + )); + (ptr_reg, lowered_args.len().to_string()) + }; + let this_box = match ctx.this_stack.last().cloned() { + Some(slot) => ctx.block().load(DOUBLE, &slot), + None => undef_lit.clone(), + }; + let _ = ctx.block().call( + DOUBLE, + "js_fetch_or_value_super", + &[ + (DOUBLE, &parent_val), + (DOUBLE, &this_box), + (crate::types::PTR, &args_ptr), + (I64, &args_len), + ], + ); + } + } + // Apply self field initializers 
AFTER the parent body chain has // run, so they can read state set by the parent body (e.g. drizzle's // PgText.enumValues = this.config.enumValues — this.config is set diff --git a/crates/perry-codegen/src/codegen/mod.rs b/crates/perry-codegen/src/codegen/mod.rs index bed5f221c5..77c8b24352 100644 --- a/crates/perry-codegen/src/codegen/mod.rs +++ b/crates/perry-codegen/src/codegen/mod.rs @@ -437,6 +437,7 @@ pub fn compile_module(hir: &HirModule, opts: CompileOptions) -> Result> decorators: Vec::new(), is_exported: false, aliases: Vec::new(), + is_nested: false, }; imported_class_stubs.push(stub); imported_stub_prefixes.push(ic.source_prefix.clone()); @@ -1342,6 +1343,7 @@ pub fn compile_module(hir: &HirModule, opts: CompileOptions) -> Result> .collect(), namespace_entries: opts.namespace_entries.clone(), dynamic_import_path_to_prefix: opts.dynamic_import_path_to_prefix.clone(), + nextjs_path_init_modules: opts.nextjs_path_init_modules.clone(), deferred_module_prefixes: opts.deferred_module_prefixes.clone(), module_init_deps: opts.module_init_deps.clone(), is_dynamic_import_target: opts.is_dynamic_import_target, diff --git a/crates/perry-codegen/src/codegen/opts.rs b/crates/perry-codegen/src/codegen/opts.rs index eb877a91ed..bc5a290dac 100644 --- a/crates/perry-codegen/src/codegen/opts.rs +++ b/crates/perry-codegen/src/codegen/opts.rs @@ -335,6 +335,13 @@ pub struct CompileOptions { /// (multi-path). Empty if this module performs no dynamic imports. pub dynamic_import_path_to_prefix: std::collections::HashMap, + /// Next.js wall 54 (part 2): `(absolute_source_path, sanitized_prefix)` for + /// every Deferred `.next/server/**` module. The entry's `main` emits a + /// `js_register_path_init(path, &__init)` for each so a runtime + /// `require(absolutePath)` can lazily trigger the module's init. Only + /// populated for the entry module; empty otherwise. 
+ pub nextjs_path_init_modules: Vec<(String, String)>, + /// Issue #753: sanitized prefixes of modules whose init must NOT /// run as part of the entry module's eager init chain. Reachable /// from the entry only through dynamic `import()` edges, so their @@ -753,6 +760,10 @@ pub(crate) struct CrossModuleCtx { /// dispatch site in `expr.rs::Expr::DynamicImport` to find the /// `@__perry_ns_` global to load. pub dynamic_import_path_to_prefix: std::collections::HashMap, + /// Next.js wall 54 (part 2): `(absolute_source_path, sanitized_prefix)` for + /// every Deferred `.next/server/**` module — see [`CompileOptions`]. The + /// entry's `main` emits one `js_register_path_init` per entry. + pub nextjs_path_init_modules: Vec<(String, String)>, /// Issue #753: sanitized prefixes of modules reached only through /// dynamic `import()` edges. Their `__init` is excluded /// from the entry-main eager init call sequence and fires lazily diff --git a/crates/perry-codegen/src/codegen/string_pool.rs b/crates/perry-codegen/src/codegen/string_pool.rs index c08f7c2dac..e9aed70f72 100644 --- a/crates/perry-codegen/src/codegen/string_pool.rs +++ b/crates/perry-codegen/src/codegen/string_pool.rs @@ -107,6 +107,13 @@ pub(super) fn emit_string_pool( class_keys_init_data: &[(String, String, u32, Vec, Vec)], class_ids: &HashMap, classes: &HashMap, + // Wall 51: per-class standalone-constructor arity, accounting for the + // synthesized `super(...args)` forwarding ctor a no-own-ctor class with + // heritage inherits (its arity comes from the nearest ancestor ctor, which + // may be cross-module). Keyed by canonical class name. Overrides the naive + // `class.constructor.params.len()` (which is 0 for a no-own-ctor class) so + // the registered `total_params` matches the emitted ctor's real signature. + ctor_arity_overrides: &HashMap, closure_rest_params: &HashMap, // Declared ABI arity for non-rest closures, used for runtime padding. 
closure_arities: &HashMap, @@ -503,6 +510,15 @@ pub(super) fn emit_string_pool( // the heap-class-object arm of `js_new_function_construct`, so it's // behavior-neutral for top-level class declarations (INT32 ref `new`). let mut ctor_triples: Vec<(u32, String, u32)> = Vec::new(); + // #wall3: class ctors with a rest param (`constructor(...args)`) need their + // standalone `_constructor` func_ptr registered in CLOSURE_REST_REGISTRY so + // a member-new (`new ns.Sub(opts)` → js_new_function_construct → + // js_native_call_value) BUNDLES trailing args into the rest array. Without + // this the rest param binds to the first arg as a scalar (a=opts, not + // [opts]) and `super(...args)` spreads a bare object → 0x400000000 mis-box → + // crash (Next.js `new c.AppPageRouteModule({...})`). Mirrors the + // closure-rest registration but keyed by the `_constructor` symbol. + let mut ctor_rest_regs: Vec<(String, usize)> = Vec::new(); for (class_name, class) in classes.iter() { // Refs #486: skip alias keys (class_table now contains both the // canonical name and self-binding aliases like `_X` from @@ -602,16 +618,32 @@ pub(super) fn emit_string_pool( // Class-expression templates with no own/synthesized constructor (no // captures) have arity 0 — the standalone ctor then just runs the // literal field initializers. - let ctor_params = class + // Wall 51: prefer the synthesized-ctor arity override (which walks the + // ancestor chain for a no-own-ctor class with heritage, incl. + // cross-module parents) so the registered `total_params` matches the + // standalone ctor function actually emitted in `artifacts.rs`. Falls + // back to the own-ctor param count for classes not in the map. 
+ let ctor_params = ctor_arity_overrides + .get(class_name) + .copied() + .unwrap_or_else(|| { + class + .constructor + .as_ref() + .map(|c| c.params.len() as u32) + .unwrap_or(0) + }); + let ctor_symbol = format!("{}__{}_constructor", module_prefix, class_name); + // #wall3: record the rest-param position (in USER params) so the runtime + // bundles trailing args at the dynamic member-new dispatch path. + if let Some(rest_idx) = class .constructor .as_ref() - .map(|c| c.params.len() as u32) - .unwrap_or(0); - ctor_triples.push(( - cid, - format!("{}__{}_constructor", module_prefix, class_name), - ctor_params, - )); + .and_then(|c| c.params.iter().position(|p| p.is_rest)) + { + ctor_rest_regs.push((ctor_symbol.clone(), rest_idx)); + } + ctor_triples.push((cid, ctor_symbol, ctor_params)); } method_triples.sort_unstable(); for (cid, method_name, llvm_name, param_count, has_synth_args, has_rest, spec_length) in @@ -720,6 +752,20 @@ pub(super) fn emit_string_pool( ], ); } + // #wall3: register rest-bearing class ctors' func_ptrs in the closure-rest + // side table so the dynamic member-new dispatch (js_native_call_value via + // js_new_function_construct) bundles trailing args into the rest array, + // matching the static `new` path. See `ctor_rest_regs` above. 
+ ctor_rest_regs.sort_unstable(); + for (ctor_symbol, rest_idx) in ctor_rest_regs { + chunker.roll_if_full(); + let blk = chunker.current_block(); + let func_ref = format!("@{}", ctor_symbol); + blk.call_void( + "js_register_closure_rest", + &[(PTR, &func_ref), (I32, &rest_idx.to_string())], + ); + } // Refs #618 / #420: register every class id with the runtime so // `js_value_typeof` can distinguish a class ref (NaN-boxed INT32 with diff --git a/crates/perry-codegen/src/collectors/class_accessors.rs b/crates/perry-codegen/src/collectors/class_accessors.rs index 3c2884e680..5a61a57654 100644 --- a/crates/perry-codegen/src/collectors/class_accessors.rs +++ b/crates/perry-codegen/src/collectors/class_accessors.rs @@ -107,6 +107,7 @@ mod tests { decorators: Vec::new(), is_exported: false, aliases: Vec::new(), + is_nested: false, static_accessor_names: Vec::new(), static_accessor_fn_ids: Vec::new(), } diff --git a/crates/perry-codegen/src/collectors/this_as_value.rs b/crates/perry-codegen/src/collectors/this_as_value.rs index 306e6d1228..08ff783f78 100644 --- a/crates/perry-codegen/src/collectors/this_as_value.rs +++ b/crates/perry-codegen/src/collectors/this_as_value.rs @@ -452,6 +452,7 @@ mod tests { decorators: Vec::new(), is_exported: false, aliases: Vec::new(), + is_nested: false, static_accessor_names: Vec::new(), static_accessor_fn_ids: Vec::new(), } diff --git a/crates/perry-codegen/src/expr/bigint_set.rs b/crates/perry-codegen/src/expr/bigint_set.rs index 20696bd121..cc1691b34d 100644 --- a/crates/perry-codegen/src/expr/bigint_set.rs +++ b/crates/perry-codegen/src/expr/bigint_set.rs @@ -279,28 +279,21 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { let set_box = lower_expr(ctx, &Expr::LocalGet(*set_id))?; let blk = ctx.block(); let set_handle = unbox_to_i64(blk, &set_box); + // `js_set_add` mutates the set in place and ALWAYS returns the same + // `SetHeader` pointer it was given — `ensure_capacity` reallocs only + // the internal elements 
buffer, never the header. So there is no + // "realloc'd pointer" to write back: the previous writeback to + // `set_id`'s storage was vestigial (copied from the array-push + // pattern) and actively WRONG for a boxed/mutable closure capture — + // it overwrote the capture SLOT (which holds a box pointer) with the + // Set value, so the next read dereferenced the Set-as-box and saw + // `undefined` (Next.js turbopack runtime: `loadedChunks.add(p)` + // silently cleared the module-level `loadedChunks` Set captured by + // the chunk-loader closure, SIGSEGV on the next `.add`). GC moves of + // the header are handled by root rewriting of the variable slot, not + // here. let new_handle = blk.call(I64, "js_set_add", &[(I64, &set_handle), (DOUBLE, &v)]); - let new_box = nanbox_pointer_inline(blk, &new_handle); - // Write back to the storage so subsequent reads see the - // possibly-realloc'd pointer. - if let Some(&capture_idx) = ctx.closure_captures.get(set_id) { - let closure_ptr = ctx - .current_closure_ptr - .clone() - .ok_or_else(|| anyhow!("SetAdd captured but no current_closure_ptr"))?; - let idx_str = capture_idx.to_string(); - ctx.block().call_void( - "js_closure_set_capture_f64", - &[(I64, &closure_ptr), (I32, &idx_str), (DOUBLE, &new_box)], - ); - } else if let Some(slot) = ctx.locals.get(set_id).cloned() { - ctx.block().store(DOUBLE, &new_box, &slot); - } else if let Some(global_name) = ctx.module_globals.get(set_id).cloned() { - let g_ref = format!("@{}", global_name); - // GC_STORE_AUDIT(ROOT): module global Set slot is a registered mutable GC root. 
- emit_root_nanbox_store_on_block(ctx.block(), &new_box, &g_ref); - } - Ok(new_box) + Ok(nanbox_pointer_inline(blk, &new_handle)) } // -------- set.has(value) -> boolean -------- diff --git a/crates/perry-codegen/src/expr/property_get.rs b/crates/perry-codegen/src/expr/property_get.rs index 47c7a0f822..97d9b66d15 100644 --- a/crates/perry-codegen/src/expr/property_get.rs +++ b/crates/perry-codegen/src/expr/property_get.rs @@ -600,6 +600,20 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { // ObjectHeader, and the outer PropertyGet routes through // `js_object_get_field_by_name`'s NATIVE_MODULE_CLASS_ID arm. if let Expr::NativeModuleRef(module_name) = object.as_ref() { + // Devirt: register this module's runtime dispatch bucket before + // the namespace value is produced, so later method calls on it + // route to the real handlers. The CJS-`require` shim lowers + // `require("path")` to `PropertyGet { NativeModuleRef("path"), + // "default" }` (NOT a bare NativeModuleRef), so the bare-ref + // install in `static_field_meta` never fired for the + // require-then-`.default.join()` shape (Next.js' `_path.default + // .join(...)` returned undefined — the dispatcher was unregistered + // and `nm_dispatch_lookup` fell to the `None`/undefined arm). + // Emitting it here mirrors the bare-ref path and keeps the + // handlers alive against the auto-optimize dead-strip. 
+ if let Some(install_sym) = crate::nm_install::nm_install_symbol(module_name) { + ctx.block().call_void(install_sym, &[]); + } if module_name == "process" && property == "version" { let blk = ctx.block(); let handle = blk.call(I64, "js_process_version", &[]); diff --git a/crates/perry-codegen/src/lower_call/native/mod.rs b/crates/perry-codegen/src/lower_call/native/mod.rs index 73820c58ca..abf194d84b 100644 --- a/crates/perry-codegen/src/lower_call/native/mod.rs +++ b/crates/perry-codegen/src/lower_call/native/mod.rs @@ -147,6 +147,44 @@ pub(crate) fn lower_native_method_call( &[(DOUBLE, &iter), (DOUBLE, &done)], )); } + // Next.js wall 53: runtime `require(absolutePath.json)` fallback. + "requireJsonDisk" => { + let specifier = args.first().map_or_else( + || Ok(double_literal(f64::from_bits(crate::nanbox::TAG_UNDEFINED))), + |arg| lower_expr(ctx, arg), + )?; + return Ok(ctx.block().call( + DOUBLE, + "js_require_json_disk", + &[(DOUBLE, &specifier)], + )); + } + // Next.js wall 54: register an AOT-compiled module by absolute path. + "registerPathModule" => { + let path = args.first().map_or_else( + || Ok(double_literal(f64::from_bits(crate::nanbox::TAG_UNDEFINED))), + |arg| lower_expr(ctx, arg), + )?; + let exports = args.get(1).map_or_else( + || Ok(double_literal(f64::from_bits(crate::nanbox::TAG_UNDEFINED))), + |arg| lower_expr(ctx, arg), + )?; + ctx.block().call_void( + "js_register_path_module", + &[(DOUBLE, &path), (DOUBLE, &exports)], + ); + return Ok(double_literal(f64::from_bits(crate::nanbox::TAG_UNDEFINED))); + } + // Next.js wall 54: resolve runtime `require(absolutePath.js)`. 
+ "requirePathModule" => { + let path = args.first().map_or_else( + || Ok(double_literal(f64::from_bits(crate::nanbox::TAG_UNDEFINED))), + |arg| lower_expr(ctx, arg), + )?; + return Ok(ctx + .block() + .call(DOUBLE, "js_require_path_module", &[(DOUBLE, &path)])); + } _ => {} } } @@ -2378,15 +2416,55 @@ pub(crate) fn lower_native_method_call( return lower_native_module_dispatch(ctx, sig, Some(&handle), args); } - // Unknown native method: lower the receiver and args for side - // effects (so closures inside them get auto-collected and any - // string literals get interned), then return a sentinel. This - // unblocks compilation of programs that touch native modules - // we haven't wired up yet — they'll produce garbage at runtime - // but won't fail at codegen time. - let _ = lower_expr(ctx, recv)?; - for a in args { - let _ = lower_expr(ctx, a)?; + // Unknown native method: route to the runtime method dispatcher on the + // ACTUAL receiver value instead of returning a 0.0 sentinel. The HIR can + // mis-classify a receiver's class — a webpack closure-captured array `e` + // gets registered as `FormData` (stale/aliased native-instance type), so + // `e.indexOf(s)` lowers as `NativeMethodCall{FormData, "indexOf"}`. None of + // the FormData arms match `indexOf`, and the old `0.0` sentinel made + // `!~e.indexOf(s)` always 0 → the Next.js `__webpack_require__.t` interop + // loop ran 0 iterations → empty React namespace → `cacheSignal is not a + // function`. `js_native_call_method` dispatches on the runtime type, so a + // real array receiver runs `Array.prototype.indexOf`, a real FormData runs + // its method, etc. (Same shape as the `new Console(...)` instance path + // above.) Falls back gracefully for genuinely-unimplemented modules too: + // the dispatcher returns `undefined` rather than a misleading numeric 0. 
+ let recv_box = lower_expr(ctx, recv)?; + let mut lowered_args: Vec = Vec::with_capacity(args.len()); + for arg in args { + lowered_args.push(lower_expr(ctx, arg)?); } - Ok(double_literal(0.0)) + let (args_ptr, args_len) = if lowered_args.is_empty() { + ("null".to_string(), "0".to_string()) + } else { + let n = lowered_args.len(); + let buf = ctx.func.alloca_entry_array(DOUBLE, n); + { + let blk = ctx.block(); + for (i, value) in lowered_args.iter().enumerate() { + let slot = blk.gep(DOUBLE, &buf, &[(I64, &i.to_string())]); + blk.store(DOUBLE, value, &slot); + } + } + (buf, n.to_string()) + }; + let method_idx = ctx.strings.intern(method); + let entry = ctx.strings.entry(method_idx); + let bytes_global = format!("@{}", entry.bytes_global); + let name_len = entry.byte_len.to_string(); + // #wall4: null-safe — dispatch real receivers (fixes the mis-typed array + // `e.indexOf`), but a genuinely nullish receiver returns the 0.0 sentinel + // instead of hard-throwing (so app-page-turbo's top-level nullish-receiver + // `.indexOf` doesn't abort the whole external module load → 500). + Ok(ctx.block().call( + DOUBLE, + "js_native_call_method_nullsafe", + &[ + (DOUBLE, &recv_box), + (PTR, &bytes_global), + (I64, &name_len), + (PTR, &args_ptr), + (I64, &args_len), + ], + )) } diff --git a/crates/perry-codegen/src/lower_call/property_get.rs b/crates/perry-codegen/src/lower_call/property_get.rs index 8561007e62..0b6f190262 100644 --- a/crates/perry-codegen/src/lower_call/property_get.rs +++ b/crates/perry-codegen/src/lower_call/property_get.rs @@ -414,11 +414,17 @@ pub fn try_lower_property_get_method_call( // startsWith / endsWith only exist on String — both 1-arg // and 2-arg (searchString, position) forms route here. "startsWith" | "endsWith" if args.len() == 1 || args.len() == 2 => true, - // `normalize` is string-exclusive only at 0/1 args. 
User classes - // commonly define 2-arg `normalize(pathname, matched)` methods - // (Next.js route normalizers) — those must fall through to the - // runtime dispatcher instead of erroring on String arity. - "normalize" if args.len() <= 1 => true, + // `normalize` is NOT force-routed to the string path for Any-typed + // receivers at any arity. User classes commonly define a 1-arg + // `normalize(pathname)` method (Next.js route normalizers: + // `this.normalize(matchedPath)`, `normalizer.normalize(initPathname)`) + // — forcing the string path made the pathname argument the Unicode + // `form`, throwing `RangeError: The normalization form should be one + // of NFC, NFD, NFKC, NFKD` (Next.js wall 50). A receiver that really + // is a string still gets `String.prototype.normalize` two ways: the + // statically-typed-string fast path above (`is_string_expr`), and the + // `jsval.is_string()` arm of `js_native_call_method` for Any-typed + // strings. So nothing is lost by falling through here. "lastIndexOf" if args.len() == 1 => true, _ => false, }; diff --git a/crates/perry-codegen/src/runtime_decls/objects.rs b/crates/perry-codegen/src/runtime_decls/objects.rs index 67aa1fc9f6..95d1451156 100644 --- a/crates/perry-codegen/src/runtime_decls/objects.rs +++ b/crates/perry-codegen/src/runtime_decls/objects.rs @@ -249,6 +249,14 @@ pub fn declare_phase_b_objects(module: &mut LlModule) { module.declare_function("js_object_rest", I64, &[I64, I64]); // RequireObjectCoercible for object destructuring (throws on null/undefined). module.declare_function("js_require_object_coercible", DOUBLE, &[DOUBLE]); + // Next.js wall 53: runtime `require(absolutePath.json)` disk fallback. + module.declare_function("js_require_json_disk", DOUBLE, &[DOUBLE]); + // Next.js wall 54: runtime `require(absolutePath.js)` -> AOT-compiled module. 
+ module.declare_function("js_register_path_module", VOID, &[DOUBLE, DOUBLE]); + module.declare_function("js_require_path_module", DOUBLE, &[DOUBLE]); + // Next.js wall 54 (part 2): register a Deferred module's `__init` address by + // path so a runtime `require(absolutePath)` can trigger its lazy init. + module.declare_function("js_register_path_init", VOID, &[PTR, I64, I64]); // Array alloc variant that pre-sets length to N (for exclude_keys array filling). module.declare_function("js_array_alloc_with_length", I64, &[I32]); // Unchecked array set (plain array, no buffer/Set/Map dispatch). diff --git a/crates/perry-codegen/src/runtime_decls/stdlib_ffi.rs b/crates/perry-codegen/src/runtime_decls/stdlib_ffi.rs index cd62606623..1f4724aa33 100644 --- a/crates/perry-codegen/src/runtime_decls/stdlib_ffi.rs +++ b/crates/perry-codegen/src/runtime_decls/stdlib_ffi.rs @@ -1905,6 +1905,11 @@ pub fn declare_stdlib_ffi(module: &mut LlModule) { DOUBLE, &[DOUBLE, I64, I64, I64, I64], ); + module.declare_function( + "js_native_call_method_nullsafe", + DOUBLE, + &[DOUBLE, I64, I64, I64, I64], + ); module.declare_function("js_native_call_value", DOUBLE, &[DOUBLE, I64, I64]); module.declare_function("js_new_from_handle", DOUBLE, &[DOUBLE, I64, I64]); module.declare_function("js_new_instance", DOUBLE, &[I64, I64, I64, I64, I64]); diff --git a/crates/perry-codegen/src/type_analysis_tests.rs b/crates/perry-codegen/src/type_analysis_tests.rs index be5b0886c6..3c2805385d 100644 --- a/crates/perry-codegen/src/type_analysis_tests.rs +++ b/crates/perry-codegen/src/type_analysis_tests.rs @@ -210,6 +210,7 @@ fn hir_inferred_types_reuse_codegen_contextual_class_facts() { computed_members: Vec::new(), decorators: Vec::new(), is_exported: false, + is_nested: false, aliases: Vec::new(), }; let widget = perry_hir::Class { @@ -278,6 +279,7 @@ fn hir_inferred_types_reuse_codegen_contextual_class_facts() { computed_members: Vec::new(), decorators: Vec::new(), is_exported: false, + is_nested: false, 
aliases: Vec::new(), }; let classes = HashMap::from([("Base".to_string(), &base), ("Widget".to_string(), &widget)]); diff --git a/crates/perry-codegen/tests/argless_builtin_extra_args.rs b/crates/perry-codegen/tests/argless_builtin_extra_args.rs index 44d8b51e43..ed4d5b1134 100644 --- a/crates/perry-codegen/tests/argless_builtin_extra_args.rs +++ b/crates/perry-codegen/tests/argless_builtin_extra_args.rs @@ -12,6 +12,7 @@ fn empty_opts() -> CompileOptions { target: None, is_entry_module: false, non_entry_module_prefixes: Vec::new(), + nextjs_path_init_modules: Vec::new(), import_function_prefixes: std::collections::HashMap::new(), import_function_origin_names: std::collections::HashMap::new(), import_function_v8_specifiers: std::collections::HashMap::new(), diff --git a/crates/perry-codegen/tests/class_keys_gc_root.rs b/crates/perry-codegen/tests/class_keys_gc_root.rs index 89cc445e73..5d79025d3b 100644 --- a/crates/perry-codegen/tests/class_keys_gc_root.rs +++ b/crates/perry-codegen/tests/class_keys_gc_root.rs @@ -31,6 +31,7 @@ fn entry_opts() -> CompileOptions { target: None, is_entry_module: true, non_entry_module_prefixes: Vec::new(), + nextjs_path_init_modules: Vec::new(), import_function_prefixes: std::collections::HashMap::new(), import_function_origin_names: std::collections::HashMap::new(), import_function_v8_specifiers: std::collections::HashMap::new(), @@ -116,6 +117,7 @@ fn module_with_declared_field_class() -> Module { static_methods: Vec::new(), decorators: Vec::new(), is_exported: false, + is_nested: false, aliases: Vec::new(), }], interfaces: Vec::new(), diff --git a/crates/perry-codegen/tests/constructor_recursion.rs b/crates/perry-codegen/tests/constructor_recursion.rs index 1c6077c27c..8bc800760f 100644 --- a/crates/perry-codegen/tests/constructor_recursion.rs +++ b/crates/perry-codegen/tests/constructor_recursion.rs @@ -44,6 +44,7 @@ fn empty_opts() -> CompileOptions { app_metadata: AppMetadata::default(), namespace_entries: Vec::new(), 
dynamic_import_path_to_prefix: std::collections::HashMap::new(), + nextjs_path_init_modules: Vec::new(), deferred_module_prefixes: std::collections::HashSet::new(), module_init_deps: Vec::new(), is_dynamic_import_target: false, @@ -117,6 +118,7 @@ fn module_with_recursive_constructor_return() -> Module { decorators: Vec::new(), is_exported: false, aliases: Vec::new(), + is_nested: false, }], interfaces: Vec::new(), type_aliases: Vec::new(), diff --git a/crates/perry-codegen/tests/destructure_call_location.rs b/crates/perry-codegen/tests/destructure_call_location.rs index 3bd10d0523..70a1228346 100644 --- a/crates/perry-codegen/tests/destructure_call_location.rs +++ b/crates/perry-codegen/tests/destructure_call_location.rs @@ -23,6 +23,7 @@ fn base_opts() -> CompileOptions { target: None, is_entry_module: true, non_entry_module_prefixes: Vec::new(), + nextjs_path_init_modules: Vec::new(), import_function_prefixes: std::collections::HashMap::new(), import_function_origin_names: std::collections::HashMap::new(), import_function_v8_specifiers: std::collections::HashMap::new(), diff --git a/crates/perry-codegen/tests/large_object_barriers.rs b/crates/perry-codegen/tests/large_object_barriers.rs index 3b7f958948..e5212f7e01 100644 --- a/crates/perry-codegen/tests/large_object_barriers.rs +++ b/crates/perry-codegen/tests/large_object_barriers.rs @@ -44,6 +44,7 @@ fn empty_opts() -> CompileOptions { app_metadata: AppMetadata::default(), namespace_entries: Vec::new(), dynamic_import_path_to_prefix: std::collections::HashMap::new(), + nextjs_path_init_modules: Vec::new(), deferred_module_prefixes: std::collections::HashSet::new(), module_init_deps: Vec::new(), is_dynamic_import_target: false, diff --git a/crates/perry-codegen/tests/macos_bundle_chdir_gate.rs b/crates/perry-codegen/tests/macos_bundle_chdir_gate.rs index 9b25cf45a6..9b40bfcd79 100644 --- a/crates/perry-codegen/tests/macos_bundle_chdir_gate.rs +++ b/crates/perry-codegen/tests/macos_bundle_chdir_gate.rs @@ -50,6 
+50,7 @@ fn entry_opts(target: Option<&str>) -> CompileOptions { app_metadata: AppMetadata::default(), namespace_entries: Vec::new(), dynamic_import_path_to_prefix: std::collections::HashMap::new(), + nextjs_path_init_modules: Vec::new(), deferred_module_prefixes: std::collections::HashSet::new(), module_init_deps: Vec::new(), is_dynamic_import_target: false, diff --git a/crates/perry-codegen/tests/native_proof_buffer_views.rs b/crates/perry-codegen/tests/native_proof_buffer_views.rs index 53dce67ce6..d406a6e1ce 100644 --- a/crates/perry-codegen/tests/native_proof_buffer_views.rs +++ b/crates/perry-codegen/tests/native_proof_buffer_views.rs @@ -49,6 +49,7 @@ fn empty_opts() -> CompileOptions { app_metadata: AppMetadata::default(), namespace_entries: Vec::new(), dynamic_import_path_to_prefix: std::collections::HashMap::new(), + nextjs_path_init_modules: Vec::new(), deferred_module_prefixes: std::collections::HashSet::new(), module_init_deps: Vec::new(), is_dynamic_import_target: false, @@ -246,6 +247,7 @@ fn class(id: u32, name: &str, fields: Vec) -> Class { decorators: Vec::new(), is_exported: false, aliases: Vec::new(), + is_nested: false, } } diff --git a/crates/perry-codegen/tests/native_proof_regressions.rs b/crates/perry-codegen/tests/native_proof_regressions.rs index 96db13caa1..ed8eac4f7a 100644 --- a/crates/perry-codegen/tests/native_proof_regressions.rs +++ b/crates/perry-codegen/tests/native_proof_regressions.rs @@ -49,6 +49,7 @@ fn empty_opts() -> CompileOptions { app_metadata: AppMetadata::default(), namespace_entries: Vec::new(), dynamic_import_path_to_prefix: std::collections::HashMap::new(), + nextjs_path_init_modules: Vec::new(), deferred_module_prefixes: std::collections::HashSet::new(), module_init_deps: Vec::new(), is_dynamic_import_target: false, @@ -230,6 +231,7 @@ fn class(id: u32, name: &str, fields: Vec) -> Class { decorators: Vec::new(), is_exported: false, aliases: Vec::new(), + is_nested: false, } } diff --git 
a/crates/perry-codegen/tests/shadow_slot_hygiene.rs b/crates/perry-codegen/tests/shadow_slot_hygiene.rs index 39f4b09d8d..f15d1ab99e 100644 --- a/crates/perry-codegen/tests/shadow_slot_hygiene.rs +++ b/crates/perry-codegen/tests/shadow_slot_hygiene.rs @@ -44,6 +44,7 @@ fn empty_opts() -> CompileOptions { app_metadata: AppMetadata::default(), namespace_entries: Vec::new(), dynamic_import_path_to_prefix: std::collections::HashMap::new(), + nextjs_path_init_modules: Vec::new(), deferred_module_prefixes: std::collections::HashSet::new(), module_init_deps: Vec::new(), is_dynamic_import_target: false, diff --git a/crates/perry-codegen/tests/static_symbol_hygiene.rs b/crates/perry-codegen/tests/static_symbol_hygiene.rs index 6d1444fc03..a2800594ff 100644 --- a/crates/perry-codegen/tests/static_symbol_hygiene.rs +++ b/crates/perry-codegen/tests/static_symbol_hygiene.rs @@ -44,6 +44,7 @@ fn empty_opts() -> CompileOptions { app_metadata: AppMetadata::default(), namespace_entries: Vec::new(), dynamic_import_path_to_prefix: std::collections::HashMap::new(), + nextjs_path_init_modules: Vec::new(), deferred_module_prefixes: std::collections::HashSet::new(), module_init_deps: Vec::new(), is_dynamic_import_target: false, @@ -113,6 +114,7 @@ fn class_with_static(id: u32, value: f64) -> Class { decorators: Vec::new(), is_exported: false, aliases: Vec::new(), + is_nested: false, } } diff --git a/crates/perry-codegen/tests/typed_feedback.rs b/crates/perry-codegen/tests/typed_feedback.rs index 565e91d400..4c8c6fda6f 100644 --- a/crates/perry-codegen/tests/typed_feedback.rs +++ b/crates/perry-codegen/tests/typed_feedback.rs @@ -75,6 +75,7 @@ fn empty_opts() -> CompileOptions { app_metadata: AppMetadata::default(), namespace_entries: Vec::new(), dynamic_import_path_to_prefix: std::collections::HashMap::new(), + nextjs_path_init_modules: Vec::new(), deferred_module_prefixes: std::collections::HashSet::new(), module_init_deps: Vec::new(), is_dynamic_import_target: false, @@ -130,6 +131,7 
@@ fn class(id: u32, name: &str, fields: Vec) -> Class { decorators: Vec::new(), is_exported: false, aliases: Vec::new(), + is_nested: false, } } diff --git a/crates/perry-codegen/tests/typed_shape_descriptor.rs b/crates/perry-codegen/tests/typed_shape_descriptor.rs index e3873205be..fde01999b0 100644 --- a/crates/perry-codegen/tests/typed_shape_descriptor.rs +++ b/crates/perry-codegen/tests/typed_shape_descriptor.rs @@ -44,6 +44,7 @@ fn empty_opts() -> CompileOptions { app_metadata: AppMetadata::default(), namespace_entries: Vec::new(), dynamic_import_path_to_prefix: std::collections::HashMap::new(), + nextjs_path_init_modules: Vec::new(), deferred_module_prefixes: std::collections::HashSet::new(), module_init_deps: Vec::new(), is_dynamic_import_target: false, @@ -87,6 +88,7 @@ fn class(id: u32, name: &str, fields: Vec) -> Class { decorators: Vec::new(), is_exported: false, aliases: Vec::new(), + is_nested: false, } } diff --git a/crates/perry-codegen/tests/typed_shape_descriptors.rs b/crates/perry-codegen/tests/typed_shape_descriptors.rs index 3206ebba09..f658220058 100644 --- a/crates/perry-codegen/tests/typed_shape_descriptors.rs +++ b/crates/perry-codegen/tests/typed_shape_descriptors.rs @@ -74,6 +74,7 @@ fn empty_opts() -> CompileOptions { app_metadata: AppMetadata::default(), namespace_entries: Vec::new(), dynamic_import_path_to_prefix: std::collections::HashMap::new(), + nextjs_path_init_modules: Vec::new(), deferred_module_prefixes: std::collections::HashSet::new(), module_init_deps: Vec::new(), is_dynamic_import_target: false, diff --git a/crates/perry-hir/src/analysis/value_types_tests.rs b/crates/perry-hir/src/analysis/value_types_tests.rs index ea93feee53..1fac520de0 100644 --- a/crates/perry-hir/src/analysis/value_types_tests.rs +++ b/crates/perry-hir/src/analysis/value_types_tests.rs @@ -618,6 +618,7 @@ fn seeds_contextual_class_and_enum_facts_from_module() { computed_members: Vec::new(), decorators: Vec::new(), is_exported: false, + is_nested: 
false, aliases: Vec::new(), }); @@ -688,6 +689,7 @@ fn infers_named_class_and_interface_property_facts() { computed_members: Vec::new(), decorators: Vec::new(), is_exported: false, + is_nested: false, aliases: Vec::new(), }); module.classes.push(Class { @@ -710,6 +712,7 @@ fn infers_named_class_and_interface_property_facts() { computed_members: Vec::new(), decorators: Vec::new(), is_exported: false, + is_nested: false, aliases: Vec::new(), }); module.interfaces.push(Interface { @@ -1645,6 +1648,7 @@ fn resolves_this_and_super_in_class_context() { computed_members: Vec::new(), decorators: Vec::new(), is_exported: false, + is_nested: false, aliases: Vec::new(), }); module.classes.push(Class { @@ -1667,6 +1671,7 @@ fn resolves_this_and_super_in_class_context() { computed_members: Vec::new(), decorators: Vec::new(), is_exported: false, + is_nested: false, aliases: Vec::new(), }); diff --git a/crates/perry-hir/src/destructuring/var_decl.rs b/crates/perry-hir/src/destructuring/var_decl.rs index 9f88e59890..8a84e1804a 100644 --- a/crates/perry-hir/src/destructuring/var_decl.rs +++ b/crates/perry-hir/src/destructuring/var_decl.rs @@ -30,6 +30,58 @@ pub(crate) fn lower_var_decl_with_destructuring( ); } + // A fresh binding of `name` must not inherit a stale + // native-instance tag that an UNRELATED earlier binding of the + // same name registered (e.g. a minified webpack bundle that + // `new FormData()`-binds a local `i` in one factory and reuses + // `var i = { exports: {} }` as the require-cache object in + // another). `native_instances` is module-global + last-match-wins, + // so push a tombstone to shadow the old tag here, BEFORE the + // native-instance registration checks below — if THIS init is + // itself a native instance, it re-registers after the tombstone + // and last-match-wins keeps the correct tag. 
Without this, a plain + // `i.exports` read mis-routes through the stale module's native + // method dispatch and folds to 0 (Next.js app-page-turbo `require` + // → React's `exports.Fragment = …` "read only property" throw). + if ctx.lookup_native_instance(&name).is_some() { + ctx.shadow_native_instance(name.clone()); + } + + // #wall5: same scope-leak for native MODULES. `native_modules_index` + // is module-global + first-match-wins (no scope tracking), so a + // local re-bind of a name a top-level `const url = require('url')` + // registered (e.g. undici's `const util = require('./util')`, or a + // local `const url = []` / a URL object) would mis-resolve + // `util.isStream` / `url.push` through the node-module dispatch and + // fire the unimplemented-API gate (Next.js app-page-turbo: 88× url.push, + // 84× util.destroy, the url.o render throw). Shadow the module here — + // UNLESS this very decl IS the native-module binding (`= require('url')` + // of a node-core module), which must keep resolving as the module. + if ctx.lookup_native_module(&name).is_some() { + let binds_native_module = decl.init.as_deref().is_some_and(|init| { + if let ast::Expr::Call(call) = init { + if let ast::Callee::Expr(callee) = &call.callee { + if let ast::Expr::Ident(id) = callee.as_ref() { + if &*id.sym == "require" { + if let Some(ast::Expr::Lit(ast::Lit::Str(s))) = + call.args.first().map(|a| a.expr.as_ref()) + { + if let Some(spec) = s.value.as_str() { + let bare = spec.strip_prefix("node:").unwrap_or(spec); + return perry_api_manifest::is_node_core_module(bare); + } + } + } + } + } + } + false + }); + if !binds_native_module { + ctx.shadow_native_module_if_present(&name); + } + } + // #809: tag locals provably bound to a plain object (an object // literal or `Object.create(...)`). 
`static_receiver_class` // consults this so `x.toJSON()` / `.toString()` / `.valueOf()` @@ -1865,6 +1917,49 @@ pub(crate) fn lower_var_decl_with_destructuring( return Ok(result); } } + // Next.js / webpack require pattern: `var i = n[e] = {exports:{}}`. + // A chained member-assignment whose RHS is an object literal + // miscompiles in the full-bundle context: the constructed object's + // own field reads back as 0 when the construction flows directly + // into both the member store and the binding (the nested webpack + // bundle's `exports` then reads 0 → `exports.Fragment = …` throws). + // A directly-bound object literal (`var x = {exports:{}}`) is fine, + // so hoist the construction to its own `Let` and feed the member-set + // and the binding from that temp — mirroring the working form. + let init = match init { + Some(Expr::PutValueSet { + target, + key, + value, + receiver, + strict, + }) if matches!(value.as_ref(), Expr::New { .. } | Expr::Object(_)) => { + let tmp_id = ctx.define_local("__nx_member_init".to_string(), Type::Any); + result.push(Stmt::Let { + id: tmp_id, + name: "__nx_member_init".to_string(), + ty: Type::Any, + mutable: false, + init: Some(*value), + }); + result.push(Stmt::Expr(Expr::PutValueSet { + target, + key, + value: Box::new(Expr::LocalGet(tmp_id)), + receiver, + strict, + })); + result.push(Stmt::Let { + id, + name, + ty, + mutable, + init: Some(Expr::LocalGet(tmp_id)), + }); + return Ok(result); + } + other => other, + }; result.push(Stmt::Let { id, name, diff --git a/crates/perry-hir/src/ir/decl.rs b/crates/perry-hir/src/ir/decl.rs index 8f25fae744..cfc043bf91 100644 --- a/crates/perry-hir/src/ir/decl.rs +++ b/crates/perry-hir/src/ir/decl.rs @@ -240,6 +240,16 @@ pub struct Class { /// `var X = class _X { ... new _X() ... }` records `_X` here so codegen /// can look it up as the same class. Refs #486. 
pub aliases: Vec, + /// Whether this class was declared/expressed INSIDE a function body (not at + /// module top level), even though HIR hoists it into `module.classes`. A + /// nested class's static-field initializers must run when the enclosing + /// function evaluates the class — NOT at module init. Running a nested + /// class's side-effectful static initializer (e.g. `static #a = new Self()`) + /// eagerly at module init both mistimes it and can crash before any user + /// code (Next.js wall 54: NextResponse's `static #a = this.EMPTY = new z()` + /// inside a turbopack factory threw at module init). Codegen + /// (`init_static_fields_*`) skips module-init static init for these. + pub is_nested: bool, } #[derive(Debug, Clone, Copy, PartialEq, Eq)] diff --git a/crates/perry-hir/src/lower/context.rs b/crates/perry-hir/src/lower/context.rs index 6b840eefd4..6c15169911 100644 --- a/crates/perry-hir/src/lower/context.rs +++ b/crates/perry-hir/src/lower/context.rs @@ -99,6 +99,7 @@ impl LoweringContext { pending_with_implicit_inits: Vec::new(), scope_depth: 0, scope_local_marks: Vec::new(), + scope_module_shadow_marks: Vec::new(), inside_block_scope: 0, namespace_vars: Vec::new(), current_namespace: None, @@ -122,6 +123,7 @@ impl LoweringContext { module_native_instances_index: HashMap::new(), func_return_native_instances_index: HashMap::new(), native_modules_index: HashMap::new(), + module_shadow_stack: Vec::new(), class_statics_index: HashMap::new(), weakref_locals: HashSet::new(), finreg_locals: HashSet::new(), @@ -145,6 +147,8 @@ impl LoweringContext { mixin_funcs: HashMap::new(), anon_shape_classes: HashMap::new(), forward_class_names: std::collections::HashSet::new(), + class_renames: std::collections::HashMap::new(), + next_class_rename_id: 0, module_class_decl_names: std::collections::HashSet::new(), next_anon_shape_id: 0, class_method_return_types: Vec::new(), @@ -391,6 +395,27 @@ impl LoweringContext { self.classes_index.get(name).map(|&idx| 
self.classes[idx].1) } + /// Apply any active scope-local class-name alias (see `class_renames`). + /// Identity for non-aliased names, so non-colliding classes are unaffected. + pub(crate) fn resolve_class_name(&self, name: &str) -> String { + self.class_renames + .get(name) + .cloned() + .unwrap_or_else(|| name.to_string()) + } + + /// Register a scope-local rename for `class X` when an outer/prior `class X` + /// is already registered (a distinct class that the name-keyed dedup would + /// otherwise skip). Returns immediately if no collision or already aliased. + /// Call from each body's Phase-1.5 class scan. + pub(crate) fn maybe_rename_colliding_class(&mut self, name: &str) { + if self.lookup_class(name).is_some() && !self.class_renames.contains_key(name) { + let unique = format!("{}${}", name, self.next_class_rename_id); + self.next_class_rename_id += 1; + self.class_renames.insert(name.to_string(), unique); + } + } + /// Issue #562: look up the `(module, class)` tuple from a class's /// `native_extends` clause (e.g. `class X extends WritableStream` → /// `Some(("writable_stream", "WritableStream"))`). Used by @@ -738,6 +763,24 @@ impl LoweringContext { self.locals.lookup(name) } + /// Like `lookup_local`, but only searches locals defined in the CURRENT + /// function scope (at or after the most recent `enter_scope` mark). Used by + /// function-declaration hoisting so a nested `function a` SHADOWS an + /// outer-scope binding of the same name (fresh local + box) instead of + /// reusing the outer local's box — which, when the outer binding is a + /// closure-captured variable (a webpack chunk's `function a` require + /// captured by an inner IIFE, with `function a` error-formatters in nested + /// module factories), let the nested declaration overwrite the captured + /// box at runtime. 
+ pub(crate) fn lookup_local_in_current_scope(&self, name: &str) -> Option<LocalId> { + let scope_start = self.scope_local_marks.last().copied().unwrap_or(0); + self.locals[scope_start..] + .iter() + .rev() + .find(|(n, _, _)| n == name) + .map(|(_, id, _)| *id) + } + fn lookup_local_index(&self, name: &str) -> Option { self.locals.lookup_index(name) } @@ -968,6 +1011,9 @@ impl LoweringContext { decorators: Vec::new(), is_exported: false, aliases: Vec::new(), + // Synthetic anon-shape class; no static fields, so static-init + // timing is irrelevant. + is_nested: false, }); self.anon_shape_classes @@ -1063,12 +1109,46 @@ impl LoweringContext { } pub(crate) fn lookup_native_module(&self, name: &str) -> Option<(&str, Option<&str>)> { + // #wall5: a local binding (function parameter / `const`) named the same + // as a registered native module (`url`, `util`, `path`, …) SHADOWS that + // module within its scope — `native_modules_index` is module-global and + // first-match-wins, so without this a nested `function(url){ url.push() }` + // (a local array) or undici's own `util` object would route `url.push` / + // `util.isStream` through the node-module dispatch and the + // unimplemented-API gate fires (Next.js app-page-turbo: 88× `url.push`, + // 84× `util.destroy`, the `url.o` render throw). Mirrors the scope-aware + // `native_instances` shadowing (shadow_native_instance / truncate). + if self.module_shadow_stack.iter().any(|n| n == name) { + return None; + } self.native_modules_index.get(name).map(|&idx| { let (_, m, method) = &self.native_modules[idx]; (m.as_str(), method.as_ref().map(|s| s.as_str())) }) } + /// #wall5: shadow a native-module name for the current scope IF it is a + /// registered module (so a local/param of that name resolves as a value, not + /// the module). No-op for non-module names. Restore with + /// `truncate_module_shadow` at scope exit. Parallel to + /// `shadow_native_instance_if_present`.
+ pub(crate) fn shadow_native_module_if_present(&mut self, name: &str) { + if self.native_modules_index.contains_key(name) { + self.module_shadow_stack.push(name.to_string()); + } + } + + /// Current depth of the module-shadow stack (a scope mark). + pub(crate) fn module_shadow_mark(&self) -> usize { + self.module_shadow_stack.len() + } + + /// Restore the module-shadow stack to `mark`, re-exposing modules whose + /// shadowing local bindings went out of scope. + pub(crate) fn truncate_module_shadow(&mut self, mark: usize) { + self.module_shadow_stack.truncate(mark); + } + pub(crate) fn register_builtin_module_alias( &mut self, local_name: String, @@ -1131,6 +1211,36 @@ impl LoweringContext { .push((local_name, module_name, class_name)); } + /// Shadow any prior native-instance tag for `local_name` by pushing a + /// tombstone (empty module). `native_instances` is module-global and + /// last-match-wins, so without this a fresh binding of a name that an + /// unrelated `new FormData()`/`new Response()`/etc. earlier registered + /// (e.g. a minified bundle reusing the local `i`) would inherit the stale + /// native tag — routing a plain `i.exports` read through FormData's native + /// method dispatch (→ 0) instead of an ordinary property read. A real + /// native binding re-registers AFTER this tombstone, so last-match-wins + /// keeps the correct tag. (Next.js app-page-turbo `require` fix.) + pub(crate) fn shadow_native_instance(&mut self, local_name: String) { + self.native_instances + .push((local_name, String::new(), String::new())); + } + + /// Tombstone a stale native-instance tag for `name` ONLY if one is currently + /// live, so a fresh binding (var-decl OR function parameter) of that name + /// shadows it. A function PARAMETER named the same as a leaked native + /// instance (e.g. 
a minified `function(e){…}` whose `e` collides with an + /// earlier `e = new Response()` in another factory) must NOT route + /// `e.` through the stale native dispatch — that folds named reads to + /// 0 (the same class as the Fragment `i.exports` wall, but for params: in + /// the Next.js app-page bundle superstruct's `enums(e){ e.map(…).join() }` + /// saw `e.map`/`e.length`/`e.constructor` all read 0 while `e[0]` and + /// `Array.prototype.map.call(e)` worked → `(number).join is not a function`). + pub(crate) fn shadow_native_instance_if_present(&mut self, name: &str) { + if self.lookup_native_instance(name).is_some() { + self.shadow_native_instance(name.to_string()); + } + } + /// Truncate `native_instances` back to `mark`, keeping the /// `native_instances_index` shadow stacks in sync: every recorded index /// `>= mark` is popped (these belong to bindings whose scope is exiting), @@ -1182,6 +1292,22 @@ impl LoweringContext { matches!((module, class), ("module", "Module") | ("repl", _)) } + // Tombstone shadowing (see `shadow_native_instance`): if the most + // recent `native_instances` entry for `name` is a tombstone (empty + // module), this binding deliberately shadows any older native tag of + // the same name — resolve to no native instance so the read/call + // lowers as an ordinary property access. + if let Some((_, module, _)) = self + .native_instances + .iter() + .rev() + .find(|(n, _, _)| n == name) + { + if module.is_empty() { + return None; + } + } + // Issue #1132 — walk the scoped instances back-to-front so a // later (inner-scope) registration shadows an earlier // (outer-scope) one with the same name. `native_instances` is @@ -1318,6 +1444,10 @@ impl LoweringContext { let local_mark = self.locals.len(); self.scope_depth += 1; self.scope_local_marks.push(local_mark); + // #wall5: parallel mark for the native-module shadow stack, restored in + // exit_scope (kept off the returned tuple to avoid churning its callers). 
+ self.scope_module_shadow_marks + .push(self.module_shadow_stack.len()); ( local_mark, self.native_instances.len(), @@ -1329,6 +1459,10 @@ impl LoweringContext { debug_assert!(self.scope_depth > 0, "exit_scope called at module depth"); self.scope_depth = self.scope_depth.saturating_sub(1); self.scope_local_marks.pop(); + // #wall5: restore native-module shadowing for this scope. + if let Some(m) = self.scope_module_shadow_marks.pop() { + self.module_shadow_stack.truncate(m); + } if self.locals.len() > mark.0 { let mut kept: Vec<(String, LocalId, Type)> = Vec::new(); for entry in self.locals.drain_from(mark.0) { diff --git a/crates/perry-hir/src/lower/expr_call/array_only_methods.rs b/crates/perry-hir/src/lower/expr_call/array_only_methods.rs index 0cf1437ea6..51c97dd5f3 100644 --- a/crates/perry-hir/src/lower/expr_call/array_only_methods.rs +++ b/crates/perry-hir/src/lower/expr_call/array_only_methods.rs @@ -341,6 +341,18 @@ pub(super) fn try_array_only_methods( | "reduce" | "reduceRight" | "join" + // wall 49: mutating array methods are also common + // user-class methods (Stack.push, Queue.shift, + // Next.js DefaultRouteMatcherManager.push). On an + // unknown (`Any`) receiver, folding to the array + // op corrupts a class instance (its ObjectHeader is + // read as an ArrayHeader). Bail to dynamic dispatch; + // real arrays are `Type::Array`, handled by the + // class_typed=false + typed fast paths elsewhere. + | "push" + | "pop" + | "shift" + | "unshift" ); class_typed || (unknown_recv && is_overlapping) } @@ -1094,7 +1106,15 @@ pub(super) fn try_array_only_methods( ast::Expr::New(_) => true, // new ClassName().push() _ => false, }; - if !is_user_class_receiver { + // wall 49: `recv_is_class` is true for an unknown/`Any` + // receiver (e.g. `const m = new mod.Class(); m.push(x)`, + // where the cross-module dynamic `new` leaves `m` typed + // `Any`). 
Folding to a native array push corrupts the + // class instance — its ObjectHeader is reinterpreted as an + // ArrayHeader and the user `push` method never runs. Bail + // to dynamic dispatch, which resolves the class method + // first and falls back to the array op for real arrays. + if !is_user_class_receiver && !recv_is_class { let array_expr = lower_expr(ctx, &member.obj)?; let any_spread = call.args.iter().any(|a| a.spread.is_some()); if !any_spread { diff --git a/crates/perry-hir/src/lower/expr_call/globals.rs b/crates/perry-hir/src/lower/expr_call/globals.rs index 9b941ba9e3..c0c7f70019 100644 --- a/crates/perry-hir/src/lower/expr_call/globals.rs +++ b/crates/perry-hir/src/lower/expr_call/globals.rs @@ -220,6 +220,61 @@ pub(super) fn try_global_builtins( }; return Ok(Ok(Expr::QueueMicrotask(Box::new(callback)))); } + // Internal intrinsic emitted only by the CJS wrapper's `require` + // fallback (cjs_wrap/wrap.rs): runtime `require(absolutePath.json)`. + // Reads + JSON.parses the file from disk via the runtime; `.json` is + // pure data so no eval is involved (Next.js wall 53). + "__perry_require_json_disk" => { + let specifier = if !args.is_empty() { + args.remove(0) + } else { + Expr::Undefined + }; + return Ok(Ok(Expr::NativeMethodCall { + module: "__perry_runtime".to_string(), + class_name: None, + object: None, + method: "requireJsonDisk".to_string(), + args: vec![specifier], + })); + } + // Wall 54: register an AOT-compiled module's exports under its + // absolute source path (emitted at the tail of each CJS wrapper). 
+ "__perry_register_path_module" => { + let path = if !args.is_empty() { + args.remove(0) + } else { + Expr::Undefined + }; + let exports = if !args.is_empty() { + args.remove(0) + } else { + Expr::Undefined + }; + return Ok(Ok(Expr::NativeMethodCall { + module: "__perry_runtime".to_string(), + class_name: None, + object: None, + method: "registerPathModule".to_string(), + args: vec![path, exports], + })); + } + // Wall 54: resolve a runtime `require(absolutePath.js)` to an + // AOT-compiled module's exports (or `undefined` on miss). + "__perry_require_path_module" => { + let path = if !args.is_empty() { + args.remove(0) + } else { + Expr::Undefined + }; + return Ok(Ok(Expr::NativeMethodCall { + module: "__perry_runtime".to_string(), + class_name: None, + object: None, + method: "requirePathModule".to_string(), + args: vec![path], + })); + } "Symbol" => { // Symbol() / Symbol(description) if args.is_empty() { diff --git a/crates/perry-hir/src/lower/expr_call/local_array_methods.rs b/crates/perry-hir/src/lower/expr_call/local_array_methods.rs index f5088dd9fa..292411452c 100644 --- a/crates/perry-hir/src/lower/expr_call/local_array_methods.rs +++ b/crates/perry-hir/src/lower/expr_call/local_array_methods.rs @@ -148,6 +148,19 @@ pub(super) fn try_local_array_methods( | "reduce" | "reduceRight" | "join" + // wall 49: the mutating array methods are ALSO commonly + // user-class methods (Stack.push, Queue.shift, Next.js + // `DefaultRouteMatcherManager.push`). On an unknown (`Any`) + // receiver the inline array fast path reads the instance's + // ObjectHeader as an ArrayHeader and corrupts it; route + // through dynamic dispatch instead, which handles both real + // arrays and class instances correctly. Typed arrays + // (`Type::Array`) are unaffected — `is_unknown_recv` is + // false for them, so they keep the fast path. 
+ | "push" + | "pop" + | "shift" + | "unshift" ); let is_unknown_recv = matches!(type_info, None | Some(Type::Any) | Some(Type::Unknown)); diff --git a/crates/perry-hir/src/lower/expr_call/native_module.rs b/crates/perry-hir/src/lower/expr_call/native_module.rs index 02bfab092b..b8abb210eb 100644 --- a/crates/perry-hir/src/lower/expr_call/native_module.rs +++ b/crates/perry-hir/src/lower/expr_call/native_module.rs @@ -1622,6 +1622,17 @@ pub(super) fn try_native_module_methods( perry_api_manifest::module_has_symbol(module_name, &method_name); if perry_api_manifest::module_has_any_entries(module_name) && manifest_entry.is_none() + // #wall4: an unmistakable `String.prototype` method + // (`endsWith`, `slice`, …) called on an identifier that + // shares a node-core module name (`url`, `path`) means + // the receiver is a runtime string, NOT the module — + // don't gate it as an unimplemented module API; fall + // through to dynamic dispatch on the real receiver. + // Next.js app-page-turbo calls `url.endsWith(...)` on a + // URL string bound to a local named `url`. + && !super::super::array_fold::is_known_string_prototype_method( + &method_name, + ) { // #925: this is the gate that fires // for `crypto.hmacSha256(data, key)`. diff --git a/crates/perry-hir/src/lower/expr_call/regex_string.rs b/crates/perry-hir/src/lower/expr_call/regex_string.rs index 0c4d009335..6058a02af7 100644 --- a/crates/perry-hir/src/lower/expr_call/regex_string.rs +++ b/crates/perry-hir/src/lower/expr_call/regex_string.rs @@ -86,24 +86,36 @@ pub(super) fn try_regex_string_methods( && args.len() == 1 { let is_match_all = method_ident.sym.as_ref() == "matchAll"; - // Check if the argument is a regex literal or a local holding a regex. - // - // CRITICAL (#A — minimatch source-compile): do NOT assume - // an `Any`/`Unknown`/untyped ARG is a regex (which would - // treat the RECEIVER as a string and lower to - // `Expr::StringMatch`). 
`.match()` is also a common - // INSTANCE method name — minimatch's `Minimatch.match` is - // invoked as `new Minimatch(pat).match(p)` inside the - // top-level `minimatch(p, pat)` arrow, where `p` is an - // untyped param. The old `Any | Unknown | None => true` - // heuristic mis-lowered that into - // `StringMatch(new Minimatch(pat), p)`, so `minimatch(...)` - // returned `null` instead of the boolean match result. - // The runtime already routes a genuine string receiver's - // `.match(regex)` through dynamic dispatch (#519/#510), so - // falling through to a normal method call is correct for - // BOTH a string and a class instance. Only take the codegen - // fast path with positive evidence the arg is a regex. + // Only fold to `String.prototype.match`/`matchAll` when the + // RECEIVER is statically a string. `.match` is also a common + // user-class method name (Next.js route matchers' + // `RouteMatcher.match(pathname)`), so an unknown / `Any` / + // class-instance receiver must NOT be assumed a string — + // otherwise `m.match(p)` on a class instance compiled to + // `js_string_match(m_as_string, p)`, reinterpreting the + // instance pointer as a string and returning null (Next.js + // wall 52: `DefaultRouteMatcherManager.validate` → + // `matcher.match(pathname)` never matched the App-Router root + // "/" → HTTP 500). A receiver that really is a string still + // gets `match` two ways: this fold (statically-typed string), + // and the `jsval.is_string()` arm of `js_native_call_method` + // for `Any`-typed strings. Mirrors the wall-50 `normalize` + // fix. 
+ let recv_is_string = match member.obj.as_ref() { + ast::Expr::Lit(ast::Lit::Str(_)) => true, + ast::Expr::Tpl(_) => true, + ast::Expr::Ident(ident) => { + matches!( + ctx.lookup_local_type(ident.sym.as_ref()), + Some(Type::String) + ) + } + _ => false, + }; + if !recv_is_string { + return Ok(Err(args)); + } + // Check if the argument is a regex literal or a local holding a regex let arg_is_regex = match call.args.first().map(|a| a.expr.as_ref()) { Some(ast::Expr::Lit(ast::Lit::Regex(_))) => true, Some(ast::Expr::Ident(ident)) => { diff --git a/crates/perry-hir/src/lower/expr_function.rs b/crates/perry-hir/src/lower/expr_function.rs index 51d2b8c98e..c76e8bb001 100644 --- a/crates/perry-hir/src/lower/expr_function.rs +++ b/crates/perry-hir/src/lower/expr_function.rs @@ -227,6 +227,7 @@ pub(super) fn lower_arrow(ctx: &mut LoweringContext, arrow: &ast::ArrowExpr) -> let is_rest = is_rest_param(param); let param_ty = get_pat_type(param, ctx); let param_id = ctx.define_local(param_name.clone(), param_ty.clone()); + ctx.shadow_native_instance_if_present(&param_name); params.push(Param { id: param_id, name: param_name, @@ -600,6 +601,7 @@ fn lower_fn_expr_anon(ctx: &mut LoweringContext, fn_expr: &ast::FnExpr) -> Resul } let is_rest = is_rest_param(&param.pat); let param_id = ctx.define_local(param_name.clone(), Type::Any); + ctx.shadow_native_instance_if_present(&param_name); params.push(Param { id: param_id, name: param_name, @@ -993,9 +995,15 @@ fn lower_fn_expr_anon(ctx: &mut LoweringContext, fn_expr: &ast::FnExpr) -> Resul // `js_global_get_or_throw_unresolved("X")` → `ReferenceError: X is not // defined` (Next.js RSCPathnameNormalizer). Scoped: restored after the body.
let saved_forward_class_names = ctx.forward_class_names.clone(); + let saved_class_renames = ctx.class_renames.clone(); if let Some(ref block) = fn_expr.function.body { for stmt in &block.stmts { if let ast::Stmt::Decl(ast::Decl::Class(class_decl)) = stmt { + // Disambiguate a distinct same-named class (the cjs/ncc IIFE + // shape `(function(e){…class s{…}…})(t)` declares superstruct's + // `Struct` = `class s`, which collided with other `class s` in + // the bundle and was dedup-skipped). See `class_renames`. + ctx.maybe_rename_colliding_class(class_decl.ident.sym.as_str()); ctx.forward_class_names .insert(class_decl.ident.sym.to_string()); } @@ -1065,6 +1073,7 @@ fn lower_fn_expr_anon(ctx: &mut LoweringContext, fn_expr: &ast::FnExpr) -> Resul ctx.annexb_block_fn_var_ids = saved_annexb_block_fn_var_ids; ctx.annexb_block_fn_names_all = saved_annexb_block_fn_names_all; ctx.forward_class_names = saved_forward_class_names; + ctx.class_renames = saved_class_renames; // Prepend destructuring statements to body if !destructuring_stmts.is_empty() { diff --git a/crates/perry-hir/src/lower/expr_member.rs b/crates/perry-hir/src/lower/expr_member.rs index 50bc1b02da..e6b417f3b2 100644 --- a/crates/perry-hir/src/lower/expr_member.rs +++ b/crates/perry-hir/src/lower/expr_member.rs @@ -1278,6 +1278,30 @@ fn lower_member_inner(ctx: &mut LoweringContext, member: &ast::MemberExpr) -> Re object: Box::new(object_expr), property: property_name, }); + } else if class_name == "AsyncLocalStorage" + && matches!( + property_name.as_str(), + "run" | "getStore" | "enterWith" | "exit" | "disable" + ) + { + // `als.getStore` / `als.run` etc. are method-VALUE reads, + // not zero-arg native calls. 
A bare read (`const { getStore + // } = als`, `const gs = als.getStore`, `typeof als.getStore` + // — Next.js' cacheComponents / patch-fetch async-storage + // setup) must return the callable BOUND METHOD, not invoke + // `getStore()` with no args (which returns the store → + // undefined → `TypeError: getStore is not a function` at + // server startup, before `✓ Ready`). Keep PropertyGet so the + // runtime handle-property dispatch + // (`dispatch_async_local_storage_property`) binds the method; + // the call form `als.getStore()` still dispatches via the + // runtime handle method dispatch. Mirrors the EventEmitter / + // Console / net.Socket method-value-read arms above. + let object_expr = lower_expr(ctx, &member.obj)?; + return Ok(Expr::PropertyGet { + object: Box::new(object_expr), + property: property_name, + }); } else if matches!(module_name.as_str(), "http" | "https") && class_name == "Agent" && property_name == "close" @@ -2316,6 +2340,16 @@ fn lower_member_inner(ctx: &mut LoweringContext, member: &ast::MemberExpr) -> Re if !obj_is_named_import && perry_api_manifest::module_has_any_entries(module) && perry_api_manifest::module_has_symbol(module, prop).is_none() + // #wall4: a method that is unmistakably a `String.prototype` member + // (`endsWith`, `startsWith`, `slice`, …) called on an identifier that + // *happens* to share a node-core module name (`url`, `path`) means the + // receiver is a runtime string value, NOT the module — don't gate it + // as an unimplemented module API; fall through to a normal PropertyGet + // so it dispatches dynamically on the real receiver. Next.js's + // app-page-turbo bundle calls `url.endsWith(...)` on a URL *string* + // bound to a local named `url`, which otherwise threw + // "url.endsWith is not implemented in Perry (ahead-of-time)". 
+ && !super::array_fold::is_known_string_prototype_method(prop) { // #3896: a bare *value read* of an absent member on a Node // builtin module namespace/default object is an ordinary diff --git a/crates/perry-hir/src/lower/expr_new.rs b/crates/perry-hir/src/lower/expr_new.rs index 40fd3e981b..5b8b26d108 100644 --- a/crates/perry-hir/src/lower/expr_new.rs +++ b/crates/perry-hir/src/lower/expr_new.rs @@ -20,6 +20,60 @@ use crate::lower_types::extract_ts_type_with_ctx; use super::expr_new_builtins::{global_member_constructor_name, module_constructor_name}; use super::{lower_expr, LoweringContext}; +/// Collect the compile-time-constant string fragments of a `+`-concatenation +/// (or template) expression, skipping any dynamic operands. Used to recognize a +/// runtime-constructed `new Function` body by its constant skeleton. +fn collect_const_string_parts(e: &ast::Expr, out: &mut String) { + match e { + ast::Expr::Lit(ast::Lit::Str(s)) => out.push_str(s.value.as_str().unwrap_or("")), + ast::Expr::Bin(b) if b.op == ast::BinaryOp::Add => { + collect_const_string_parts(&b.left, out); + collect_const_string_parts(&b.right, out); + } + ast::Expr::Paren(p) => collect_const_string_parts(&p.expr, out), + ast::Expr::Tpl(t) => { + for q in &t.quasis { + out.push_str(q.raw.as_str()); + } + } + // Dynamic operand (an identifier, call, etc.) — skip it. + _ => {} + } +} + +/// Recognize depd's `wrapfunction` deprecation-wrapper shape: +/// `new Function("fn","log","deprecate","message","site", +/// '"use strict"\n'+"return function ("+a+") {"+ +/// "log.call(deprecate, message, site)\n"+"return fn.apply(this, arguments)\n"+"}")`. +/// The five param-name args are constant string literals; only the body +/// (last arg) is runtime-constructed. 
The runtime `js_function_ctor_from_strings` +/// re-verifies the full template and returns the wrapped fn, so matching here +/// lets the site proceed to that recognizer instead of being deferred to a +/// throw-on-call value (which `send` invokes eagerly at Next.js startup). +fn is_depd_wrapfunction_shape(args: &[ast::ExprOrSpread]) -> bool { + if args.len() != 6 { + return false; + } + const PARAM_NAMES: [&str; 5] = ["fn", "log", "deprecate", "message", "site"]; + for (i, name) in PARAM_NAMES.iter().enumerate() { + if args[i].spread.is_some() { + return false; + } + match crate::eval_classifier::const_string_of(&args[i].expr) { + Some(s) if s == *name => {} + _ => return false, + } + } + if args[5].spread.is_some() { + return false; + } + let mut body = String::new(); + collect_const_string_parts(&args[5].expr, &mut body); + body.contains("return function (") + && body.contains("log.call(deprecate, message, site)") + && body.contains("return fn.apply(this, arguments)") +} + /// Lower `new TextDecoder(label?, { fatal?, ignoreBOM? })` into /// `Expr::TextDecoderNew { label, fatal, ignore_bom }`. Shared by /// `expr_new.rs` (bound to a local) and `textencoder.rs` (inline @@ -880,7 +934,9 @@ pub(super) fn lower_new(ctx: &mut LoweringContext, new_expr: &ast::NewExpr) -> R // Try to extract class name from callee match callee_expr { ast::Expr::Ident(ident) => { - let class_name = ident.sym.to_string(); + // Resolve through any scope-local class rename so `new X` binds to + // the lexically-correct (possibly disambiguated) class. + let class_name = ctx.resolve_class_name(ident.sym.as_str()); if matches!( ctx.lookup_native_module(&class_name), Some(("url", Some("Url"))) @@ -1153,25 +1209,39 @@ pub(super) fn lower_new(ctx: &mut LoweringContext, new_expr: &ast::NewExpr) -> R )? { return Ok(folded); } - // Not fully const-foldable — body is the last argument - // (`new Function(p1, p2, body)`); earlier args are param names. 
- let body_arg = args_slice.last().map(|a| a.expr.as_ref()); - match crate::eval_classifier::check_site( - crate::eval_classifier::EvalSurface::NewFunction, - body_arg, - &ctx.source_file_path, - new_expr.span, - )? { - crate::eval_classifier::EvalDecision::Proceed => {} - // #5206: default (defer) mode — compile to a function value - // that throws a descriptive Error only when invoked. - crate::eval_classifier::EvalDecision::DeferToRuntimeError(message) => { - return super::const_fold_fn::synth_deferred_eval_value( - ctx, - crate::eval_classifier::EvalSurface::NewFunction, - &message, - new_expr.span, - ); + // depd `wrapfunction` builds its deprecation wrapper with a + // runtime-constructed body (`'…return function ('+a+') {…'`), so + // it isn't const-foldable and the classifier would defer it to a + // throw-on-call value — which Next.js' `send` invokes eagerly at + // startup (`new Function(…)(fn,…)`), crashing before `✓ Ready`. + // The runtime `js_function_ctor_from_strings` recognizes this + // exact template and returns the wrapped fn (the deprecation log + // is a non-essential warning), so PROCEED to the codegen + // `Expr::New { "Function" }` path for it instead of deferring. + // Any other runtime-unknown body still defers. NO general eval. + if is_depd_wrapfunction_shape(args_slice) { + // fall through to `Expr::New { class_name: "Function" }`. + } else { + // Not fully const-foldable — body is the last argument + // (`new Function(p1, p2, body)`); earlier args are param names. + let body_arg = args_slice.last().map(|a| a.expr.as_ref()); + match crate::eval_classifier::check_site( + crate::eval_classifier::EvalSurface::NewFunction, + body_arg, + &ctx.source_file_path, + new_expr.span, + )? { + crate::eval_classifier::EvalDecision::Proceed => {} + // #5206: default (defer) mode — compile to a function value + // that throws a descriptive Error only when invoked. 
+ crate::eval_classifier::EvalDecision::DeferToRuntimeError(message) => { + return super::const_fold_fn::synth_deferred_eval_value( + ctx, + crate::eval_classifier::EvalSurface::NewFunction, + &message, + new_expr.span, + ); + } } } } @@ -1878,7 +1948,7 @@ pub(super) fn lower_new(ctx: &mut LoweringContext, new_expr: &ast::NewExpr) -> R let synthetic_name = format!("__anon_class_{}", ctx.fresh_class()); let class = lower_class_from_ast(ctx, &class_expr.class, &synthetic_name, false)?; ctx.pending_classes.push(class); - let args = new_expr + let mut args: Vec<Expr> = new_expr .args .as_ref() .map(|args| { @@ -1888,6 +1958,24 @@ pub(super) fn lower_new(ctx: &mut LoweringContext, new_expr: &ast::NewExpr) -> R }) .transpose()? .unwrap_or_default(); + // Issue #212 (anon-class-expression parity): a class expression + // nested in a function may capture enclosing-scope locals. + // `lower_class_from_ast` → `synthesize_class_captures` extended + // the synthesized constructor with one param per captured id and + // rewrote the METHOD bodies to read `this.__perry_cap_`. The + // named-class `new C()` path above forwards those captures as + // `LocalGet(id)`; the directly-constructed anonymous form + // (`new class { m() { return outer } }()`) must do the same, or + // the cap params receive `undefined` and every method that reads + // a captured local sees `undefined`. Refs Next.js bundled tracer + // (`getActiveScopeSpan` → `trace.getSpan` on undefined `trace`).
+ let class_captures: Vec = ctx + .lookup_class_captures(&synthetic_name) + .map(|c| c.to_vec()) + .unwrap_or_default(); + for cid in class_captures { + args.push(Expr::LocalGet(cid)); + } let type_args = new_expr .type_args .as_ref() diff --git a/crates/perry-hir/src/lower/expr_object.rs b/crates/perry-hir/src/lower/expr_object.rs index a41f81318e..aa2eee0c51 100644 --- a/crates/perry-hir/src/lower/expr_object.rs +++ b/crates/perry-hir/src/lower/expr_object.rs @@ -21,7 +21,10 @@ use crate::analysis::{ closure_uses_this, collect_assigned_locals_stmt, collect_local_refs_stmt, uses_this_stmt, }; use crate::ir::{EnumValue, Expr, Function, Param, Stmt}; -use crate::lower_decl::{append_synthetic_arguments_param, body_uses_arguments, lower_block_stmt}; +use crate::lower_decl::{ + append_synthetic_arguments_param, body_uses_arguments, lower_block_stmt, + lower_fn_body_block_stmt, +}; use crate::lower_patterns::{ generate_param_destructuring_stmts, get_param_default, get_pat_name, is_destructuring_pattern, is_rest_param, @@ -158,6 +161,15 @@ fn lower_method_prop( let method_key = match &method.key { ast::PropName::Ident(ident) => MethodKeyKind::Static(ident.sym.to_string()), ast::PropName::Str(s) => MethodKeyKind::Static(s.value.as_str().unwrap_or("").to_string()), + // A numeric-keyed method shorthand (`{ 900(e,t,r){} }`) — its key is the + // stringified number per spec (`{900(){}}` has own key "900"). Without + // this arm it fell through to `_ => Ok(None)` and the method was DROPPED + // entirely (invisible to `obj[900]` and `Object.keys`), so a webpack + // bundle's numeric-keyed module-factory table (`{900(e,t,r){…}}`, + // `t[900].call(…)`) lost its factories — Next.js's + // `app-page-turbo.runtime.prod.js` entry `a(900)`. Mirrors the KeyValue + // and closed-shape numeric-key handling (`number_to_js_key`). 
+ ast::PropName::Num(n) => MethodKeyKind::Static(super::number_to_js_key(n.value)), ast::PropName::Computed(computed) => match lower_expr(ctx, computed.expr.as_ref()) { Ok(e) => MethodKeyKind::Computed(e), Err(_) => return Ok(None), @@ -229,6 +241,7 @@ fn lower_method_prop( } let param_type = extract_param_type_with_ctx(&param.pat, Some(ctx)); let param_id = ctx.define_local(param_name.clone(), param_type.clone()); + ctx.shadow_native_instance_if_present(&param_name); params.push(Param { id: param_id, name: param_name, @@ -294,7 +307,7 @@ fn lower_method_prop( } let mut body = if let Some(ref block) = method.function.body { - lower_block_stmt(ctx, block)? + lower_fn_body_block_stmt(ctx, block)? } else { Vec::new() }; @@ -475,6 +488,7 @@ fn lower_accessor_prop( let param_type = extract_param_type_with_ctx(pat, Some(ctx)); let param_default = get_param_default(ctx, pat)?; let param_id = ctx.define_local(param_name.clone(), param_type.clone()); + ctx.shadow_native_instance_if_present(&param_name); params.push(Param { id: param_id, name: param_name, @@ -488,7 +502,7 @@ fn lower_accessor_prop( } let body = if let Some(block) = body { - lower_block_stmt(ctx, block)? + lower_fn_body_block_stmt(ctx, block)? } else { Vec::new() }; diff --git a/crates/perry-hir/src/lower/lower_expr.rs b/crates/perry-hir/src/lower/lower_expr.rs index 1b437ec94f..3a1bc5d686 100644 --- a/crates/perry-hir/src/lower/lower_expr.rs +++ b/crates/perry-hir/src/lower/lower_expr.rs @@ -623,6 +623,23 @@ fn lower_expr_impl(ctx: &mut LoweringContext, expr: &ast::Expr) -> Result ctx.with_env_stack = saved_with_envs; return Ok(wrap_with_gets(&name, fallback?, with_envs)); } + // A class declared in the current function body lexically shadows a + // same-named binding from an OUTER scope. Resolution normally checks + // `lookup_local` (which finds outer-scope locals) before the class, + // so without this a nested `class a` whose name also exists as an + // outer local resolved to that outer local.
In the Next.js app-page + // bundle a webpack chunk's `a` (`a=()=>{}`, undefined at module-init + // time) is captured into a module factory that declares + // `class a extends Error` (p-timeout's TimeoutError); the export + // `e.exports.TimeoutError=a` then read the outer `undefined` instead + // of the class, so `new r.TimeoutError` threw "undefined is not a + // constructor". Gate on there being NO current-scope local of that + // name (a sibling param/var/let still wins). + if ctx.forward_class_names.contains(&name) + && ctx.lookup_local_in_current_scope(&name).is_none() + { + return Ok(Expr::ClassRef(ctx.resolve_class_name(&name))); + } if let Some(id) = ctx.lookup_local(&name) { // A with-fallback implicit global may still be the HOLE // sentinel (the with-env took the write) — reading it then @@ -665,14 +682,14 @@ fn lower_expr_impl(ctx: &mut LoweringContext, expr: &ast::Expr) -> Result }) } else if ctx.lookup_class(&name).is_some() { // Class used as a first-class value (e.g., { Point: Point }) - Ok(Expr::ClassRef(name)) + Ok(Expr::ClassRef(ctx.resolve_class_name(&name))) } else if ctx.forward_class_names.contains(&name) { // Forward reference to a sibling class declared LATER in the // same function body (vendored zod: ZodType.optional() → // ZodOptional.create(...)). JS resolves this at call time; // emit a ClassRef by name — codegen resolves it from the // class registry, which has every pending class by then. 
- Ok(Expr::ClassRef(name)) + Ok(Expr::ClassRef(ctx.resolve_class_name(&name))) } else if name == "undefined" { // Global undefined identifier Ok(Expr::Undefined) diff --git a/crates/perry-hir/src/lower/lowering_context.rs b/crates/perry-hir/src/lower/lowering_context.rs index e0e07a222a..8d08b26f0a 100644 --- a/crates/perry-hir/src/lower/lowering_context.rs +++ b/crates/perry-hir/src/lower/lowering_context.rs @@ -298,6 +298,10 @@ pub struct LoweringContext { /// Function-body var prebinding uses the top mark to distinguish /// parameters/current-scope locals from outer captures with the same name. pub(crate) scope_local_marks: Vec, + /// #wall5: per-scope marks into `module_shadow_stack`, pushed in + /// `enter_scope` and popped in `exit_scope` to restore native-module + /// shadowing when a scope that re-bound a module name exits. + pub(crate) scope_module_shadow_marks: Vec, /// Block scope nesting counter (for bare `{}`, `if`, loops, try/finally). /// A local only counts as module-level when both `scope_depth == 0` and /// `inside_block_scope == 0`; `const captured = i` inside a top-level for @@ -407,6 +411,12 @@ pub struct LoweringContext { /// `lookup_native_module` scanned FORWARD (first-match-wins), so the index /// keeps the FIRST pushed index per name (`entry().or_insert`). pub(crate) native_modules_index: HashMap, + /// #wall5: scope-stack of native-module names currently SHADOWED by a local + /// binding (param / `const`) of the same name. `lookup_native_module` + /// returns `None` for shadowed names so a local `url`/`util`/etc. resolves + /// as a value, not the node module. Pushed at param/var-decl sites, truncated + /// at scope exit (parallel to `native_instances` scoping). + pub(crate) module_shadow_stack: Vec, /// Perf index for `class_statics` (push-only, never truncated). The old /// `has_static_method`/`has_static_field` scanned FORWARD (first-match-wins), /// so the index keeps the FIRST pushed index per class name. 
@@ -531,6 +541,16 @@ pub struct LoweringContext { /// call dispatched into `Object.create`. Scoped save/restore in /// `lower_fn_body_block_stmt`. pub(crate) forward_class_names: std::collections::HashSet, + /// Scope-local class-name aliases disambiguating distinct same-named classes + /// across nested function/factory scopes within ONE module (class refs are + /// name-keyed: `Expr::New { class_name }` / `ClassRef(name)`). When a body + /// declares `class X` while an outer/prior `class X` is already registered, + /// the body's X is renamed `X$` and `X -> X$` recorded so every + /// reference in that body binds to the lexically-correct class. Saved/ + /// restored per body in both `lower_fn_body_block_stmt` and `lower_fn_expr`. + pub(crate) class_renames: std::collections::HashMap, + /// Monotonic suffix source for `class_renames` unique names. + pub(crate) next_class_rename_id: u32, /// Names of TOP-LEVEL `class X { … }` declarations in the module being /// lowered (populated by the module pre-pass). A NAMED class EXPRESSION /// nested in a function body — e.g. 
minimatch's `defaults()` returns diff --git a/crates/perry-hir/src/lower/module_decl.rs b/crates/perry-hir/src/lower/module_decl.rs index 1a7af0c083..0d145c0277 100644 --- a/crates/perry-hir/src/lower/module_decl.rs +++ b/crates/perry-hir/src/lower/module_decl.rs @@ -2048,6 +2048,7 @@ pub(crate) fn lower_namespace_as_class( decorators: Vec::new(), is_exported, aliases: Vec::new(), + is_nested: false, }); } }; @@ -2348,5 +2349,6 @@ pub(crate) fn lower_namespace_as_class( decorators: Vec::new(), is_exported, aliases: Vec::new(), + is_nested: false, }) } diff --git a/crates/perry-hir/src/lower_decl/block.rs b/crates/perry-hir/src/lower_decl/block.rs index 8111a2cc31..e4463d86e0 100644 --- a/crates/perry-hir/src/lower_decl/block.rs +++ b/crates/perry-hir/src/lower_decl/block.rs @@ -82,6 +82,30 @@ pub(crate) fn pre_register_forward_captured_lets( } } } + } else { + // `var` bindings are already predefined + boxed by + // `predefine_var_bindings_in_function_body`, but their box is + // NOT in the prealloc set. A closure created EARLIER in the body + // that references a `var` declared LATER (`r.d(t,{x:()=>n.x}); + // var n=r("…")` — the webpack ESM re-export shape in Next.js' + // react-server.node.js) must capture the *live* box, not a + // TAG_UNDEFINED snapshot. Add forward-captured `var` ids to the + // prealloc set so codegen allocates the box at function entry. + for decl in &var_decl.decls { + let mut binding_idents: Vec<(String, u32)> = Vec::new(); + collect_pat_forward_idents(&decl.name, &mut binding_idents); + for (name, _span_lo) in binding_idents { + if !seen_closure_refs.contains(&name) { + continue; + } + if let Some(id) = ctx.lookup_local(&name) { + if !forward_boxed_ids.contains(&id) { + ctx.var_hoisted_ids.insert(id); + forward_boxed_ids.push(id); + } + } + } + } } } // Record closures introduced by THIS statement for subsequent decls. 
@@ -906,7 +930,11 @@ pub fn lower_fn_body_block_stmt( continue; } let name = fn_decl.ident.sym.to_string(); - let local_id = if let Some(existing) = ctx.lookup_local(&name) { + // Reuse only a CURRENT-scope binding (a sibling `var`/`function` + // hoisted into this same body). A same-named local from an OUTER + // scope must be shadowed with a fresh local, else this declaration + // would write into the outer binding's box at runtime. + let local_id = if let Some(existing) = ctx.lookup_local_in_current_scope(&name) { existing } else { ctx.define_local(name.clone(), Type::Any) @@ -923,8 +951,13 @@ pub fn lower_fn_body_block_stmt( // module function). Scoped: the previous set is restored on exit so // names don't leak across function bodies. let saved_forward_class_names = ctx.forward_class_names.clone(); + let saved_class_renames = ctx.class_renames.clone(); for stmt in &block.stmts { if let ast::Stmt::Decl(ast::Decl::Class(class_decl)) = stmt { + // Disambiguate a distinct same-named class declared in this body so + // its references don't bind to a colliding `class X` elsewhere in + // the bundled module (see `class_renames`). + ctx.maybe_rename_colliding_class(class_decl.ident.sym.as_str()); ctx.forward_class_names .insert(class_decl.ident.sym.to_string()); } @@ -967,12 +1000,14 @@ pub fn lower_fn_body_block_stmt( Err(err) => { ctx.current_strict = parent_strict; ctx.forward_class_names = saved_forward_class_names; + ctx.class_renames = saved_class_renames; ctx.annexb_block_fn_var_ids = saved_annexb_block_fn_var_ids; ctx.annexb_block_fn_names_all = saved_annexb_block_fn_names_all; return Err(err); } }; ctx.forward_class_names = saved_forward_class_names; + ctx.class_renames = saved_class_renames; // Re-register capture snapshots for classes declared in this body at // its END. 
The decl-site `RegisterClassCaptures` runs before later diff --git a/crates/perry-hir/src/lower_decl/body_stmt.rs b/crates/perry-hir/src/lower_decl/body_stmt.rs index 63d85a3a85..a18257c86c 100644 --- a/crates/perry-hir/src/lower_decl/body_stmt.rs +++ b/crates/perry-hir/src/lower_decl/body_stmt.rs @@ -266,7 +266,10 @@ pub fn lower_body_stmt(ctx: &mut LoweringContext, stmt: &ast::Stmt) -> Result { // Class declared inside a function body (e.g., noble-curves' Point class) - let class_name = class_decl.ident.sym.to_string(); + // Resolve through any scope-local rename: a disambiguated duplicate + // has a unique name, so it is NOT a real redeclaration and must be + // lowered (not skipped) under that unique name. + let class_name = ctx.resolve_class_name(class_decl.ident.sym.as_str()); // Skip if a class with the same name already exists (avoids duplicate definitions // when the same class name appears at both module level and function body level) let already_exists = ctx.pending_classes.iter().any(|c| c.name == class_name) @@ -729,6 +732,8 @@ pub fn lower_body_stmt(ctx: &mut LoweringContext, stmt: &ast::Stmt) -> Result Result { - let name = class_decl.ident.sym.to_string(); + // Resolve through any active scope-local rename so a disambiguated + // duplicate class registers (and self-references) under its unique name. + let name = ctx.resolve_class_name(class_decl.ident.sym.as_str()); validate_legacy_decorator_surface(&class_decl.class, &name)?; validate_class_element_early_errors(&class_decl.class, &name)?; let class_id = match ctx.lookup_class(&name) { @@ -486,21 +488,30 @@ pub fn lower_class_decl( Err(_) => (None, Some(parent_name), None, None), } } else { - // Refs #488 drizzle-sqlite: also try resolving the parent - // class by name across modules. Pre-fix the Member arm set - // `extends = None`, so `class SQLiteIntegerBuilder extends - // import_mid.SQLiteColumnBuilder { ... 
}` lost its parent - // link entirely — inherited methods (drizzle's - // ColumnBuilder.setName etc.) were unreachable on instances. - // Class names are unique enough in practice that `lookup_class` - // resolves; if it doesn't, we fall back to the prior - // name-only behavior (no regression for unknown parents). - ( - ctx.lookup_class(&parent_name), - Some(parent_name), - None, - None, - ) + // A NAMED cross-module member-extends (`class NodeNextRequest + // extends _index.BaseNextRequest`). The static `extends_name` + // path requires the parent in codegen's class table, which a + // cross-module parent often is NOT — and `ctx.lookup_class` is + // module-order-dependent (the parent module may not be lowered + // yet) — so `super(...)` became a no-op and the parent ctor never + // ran (Next.js `BaseNextRequest`'s ctor sets `this.url`/ + // `this.method` → "Invariant: url can not be undefined"). Route + // through the dynamic `extends_expr` path UNCONDITIONALLY, exactly + // like the `.default` arm (wall 38) and the unknown-Ident arm + // below: the decl-time `RegisterClassParentDynamic` records the + // parent value and `super()` runs the parent ctor at runtime via + // `js_fetch_or_value_super`, which already tolerates native / + // closure / class-ref / builtin parents (wall 38/42 hardening). + // Keep the (possibly-None) static `extends` link + `extends_name` + // for inherited-method / `instanceof` dispatch when resolvable. + // The colliding-name native case (`class Agent extends http.Agent`) + // is handled by the `parent_name == name` arm above. (Refs #488 + // drizzle-sqlite for the original cross-module link.) 
+ let resolved = ctx.lookup_class(&parent_name); + match lower_expr(ctx, super_class) { + Ok(expr) => (resolved, Some(parent_name), None, Some(Box::new(expr))), + Err(_) => (resolved, Some(parent_name), None, None), + } } } else { // Issue #711: `class X extends fn(...)` / `class X extends @@ -1292,6 +1303,9 @@ pub fn lower_class_decl( decorators: lower_decorators(ctx, &class_decl.class.decorators), is_exported, aliases: Vec::new(), + // Declared inside a function body / non-module block → its static-field + // initializers must run on class evaluation, not at module init. + is_nested: ctx.scope_depth > 0 || ctx.inside_block_scope > 0, }) } @@ -1422,12 +1436,16 @@ pub fn lower_class_from_ast( Err(_) => (None, Some(parent_name), None, None), } } else { - ( - ctx.lookup_class(&parent_name), - Some(parent_name), - None, - None, - ) + // Named cross-module member-extends — route through `extends_expr` + // UNCONDITIONALLY so `super()` runs the parent ctor at runtime even + // when the parent isn't in codegen's class table / not yet lowered. + // Keep in lockstep with the matching arm in `lower_class_decl` + // (wall 48: NodeNextRequest extends _index.BaseNextRequest). + let resolved = ctx.lookup_class(&parent_name); + match lower_expr(ctx, super_class) { + Ok(expr) => (resolved, Some(parent_name), None, Some(Box::new(expr))), + Err(_) => (resolved, Some(parent_name), None, None), + } } } else { // Issue #711: see the matching arm in `lower_class_decl` above @@ -1783,5 +1801,8 @@ pub fn lower_class_from_ast( decorators: lower_decorators(ctx, &class.decorators), is_exported, aliases: Vec::new(), + // Declared inside a function body / non-module block → its static-field + // initializers must run on class evaluation, not at module init. 
+ is_nested: ctx.scope_depth > 0 || ctx.inside_block_scope > 0, }) } diff --git a/crates/perry-hir/src/lower_decl/class_members.rs b/crates/perry-hir/src/lower_decl/class_members.rs index c31b55faa8..67f117e499 100644 --- a/crates/perry-hir/src/lower_decl/class_members.rs +++ b/crates/perry-hir/src/lower_decl/class_members.rs @@ -45,6 +45,8 @@ pub fn lower_constructor( let param_default = get_param_default(ctx, &p.pat)?; let is_rest = is_rest_param(&p.pat); let param_id = ctx.define_local(param_name.clone(), param_type.clone()); + ctx.shadow_native_instance_if_present(&param_name); + ctx.shadow_native_module_if_present(&param_name); params.push(Param { id: param_id, name: param_name, @@ -100,6 +102,8 @@ pub fn lower_constructor( } }; let param_id = ctx.define_local(param_name.clone(), param_type.clone()); + ctx.shadow_native_instance_if_present(&param_name); + ctx.shadow_native_module_if_present(&param_name); // Record this param for synthesizing `this.field = param` assignment param_prop_assignments.push((param_id, param_name.clone())); params.push(Param { @@ -498,6 +502,8 @@ pub fn lower_class_method_with_name( let param_type = extract_param_type_with_ctx(&param.pat, Some(ctx)); let is_rest = is_rest_param(&param.pat); let param_id = ctx.define_local(param_name.clone(), param_type.clone()); + ctx.shadow_native_instance_if_present(&param_name); + ctx.shadow_native_module_if_present(&param_name); params.push(Param { id: param_id, name: param_name, @@ -804,6 +810,8 @@ pub fn lower_setter_method_with_name( } let param_type = extract_param_type_with_ctx(&param.pat, Some(ctx)); let param_id = ctx.define_local(param_name.clone(), param_type.clone()); + ctx.shadow_native_instance_if_present(&param_name); + ctx.shadow_native_module_if_present(&param_name); params.push(Param { id: param_id, name: param_name, diff --git a/crates/perry-hir/src/lower_decl/fn_decl.rs
b/crates/perry-hir/src/lower_decl/fn_decl.rs @@ -31,7 +31,22 @@ fn function_has_use_strict(func: &ast::Function) -> bool { pub fn lower_fn_decl(ctx: &mut LoweringContext, fn_decl: &ast::FnDecl) -> Result { let name = fn_decl.ident.sym.to_string(); - let func_id = ctx.lookup_func(&name).unwrap_or_else(|| ctx.fresh_func()); + // A function declaration's name must be resolvable inside its own body + // (recursion / self-reference) and to sibling statements (hoisting). The + // pre-scan hoists top-level/shallow decls, but deeply-nested decls inside + // closures can slip through — and then `fresh_func()` here would mint an id + // WITHOUT registering the name, so `lookup_func(name)` stays None and a + // self-reference `o(...)` / member read on the func name lowers to an + // unresolved global (→ `ReferenceError: o is not defined` deep in Next.js's + // webpack-bundled modules). Register the name now when it wasn't already. + let func_id = match ctx.lookup_func(&name) { + Some(id) => id, + None => { + let id = ctx.fresh_func(); + ctx.register_func(name.clone(), id); + id + } + }; // #4101: retain the original source text so `fn.toString()` reconstructs // it. 
Slice the module source against the function's AST span; prepend the @@ -104,6 +119,8 @@ pub fn lower_fn_decl(ctx: &mut LoweringContext, fn_decl: &ast::FnDecl) -> Result } let param_type = extract_param_type_with_ctx(&param.pat, Some(ctx)); let param_id = ctx.define_local(param_name.clone(), param_type.clone()); + ctx.shadow_native_instance_if_present(&param_name); + ctx.shadow_native_module_if_present(&param_name); let is_rest = is_rest_param(&param.pat); params.push(Param { id: param_id, diff --git a/crates/perry-hir/src/lower_decl/private_members.rs b/crates/perry-hir/src/lower_decl/private_members.rs index 8840c5ef91..bae5b0883c 100644 --- a/crates/perry-hir/src/lower_decl/private_members.rs +++ b/crates/perry-hir/src/lower_decl/private_members.rs @@ -102,6 +102,8 @@ pub fn lower_private_method( let param_type = extract_param_type_with_ctx(&param.pat, Some(ctx)); let is_rest = is_rest_param(&param.pat); let param_id = ctx.define_local(param_name.clone(), param_type.clone()); + ctx.shadow_native_instance_if_present(&param_name); + ctx.shadow_native_module_if_present(&param_name); params.push(Param { id: param_id, name: param_name, @@ -287,6 +289,8 @@ pub fn lower_private_setter( let param_name = get_pat_name(&param.pat)?; let param_type = extract_param_type_with_ctx(&param.pat, Some(ctx)); let param_id = ctx.define_local(param_name.clone(), param_type.clone()); + ctx.shadow_native_instance_if_present(&param_name); + ctx.shadow_native_module_if_present(&param_name); params.push(Param { id: param_id, name: param_name, diff --git a/crates/perry-hir/src/monomorph/specialize.rs b/crates/perry-hir/src/monomorph/specialize.rs index bf0a9b8ea0..7e388aed45 100644 --- a/crates/perry-hir/src/monomorph/specialize.rs +++ b/crates/perry-hir/src/monomorph/specialize.rs @@ -265,5 +265,6 @@ pub fn specialize_class(class: &Class, type_args: &[Type], new_id: ClassId) -> C decorators: class.decorators.clone(), is_exported: class.is_exported, aliases: class.aliases.clone(), + is_nested: class.is_nested, } } diff --git
a/crates/perry-hir/src/stable_hash/decls.rs b/crates/perry-hir/src/stable_hash/decls.rs index 07a0d03ac9..67b40811c5 100644 --- a/crates/perry-hir/src/stable_hash/decls.rs +++ b/crates/perry-hir/src/stable_hash/decls.rs @@ -31,6 +31,7 @@ impl SH for Class { decorators, is_exported, aliases, + is_nested, } = self; id.hash(h); name.hash(h); @@ -52,6 +53,7 @@ impl SH for Class { decorators.hash(h); is_exported.hash(h); aliases.hash(h); + is_nested.hash(h); } } diff --git a/crates/perry-hir/src/stable_hash/tests.rs b/crates/perry-hir/src/stable_hash/tests.rs index e39e08f33c..92d0a4bff3 100644 --- a/crates/perry-hir/src/stable_hash/tests.rs +++ b/crates/perry-hir/src/stable_hash/tests.rs @@ -279,6 +279,7 @@ fn module_metadata_affects_hash() { decorators: vec![], is_exported: false, aliases: vec![], + is_nested: false, }); assert_ne!(base_hash, hash_module(&m_class)); diff --git a/crates/perry-runtime/src/closure/dispatch.rs b/crates/perry-runtime/src/closure/dispatch.rs index 14bf27cb9c..76f0e95564 100644 --- a/crates/perry-runtime/src/closure/dispatch.rs +++ b/crates/perry-runtime/src/closure/dispatch.rs @@ -1387,6 +1387,24 @@ pub unsafe extern "C" fn js_native_call_value( if func_ptr.is_null() && crate::object::is_function_prototype_object_value(func_value) { return f64::from_bits(crate::value::TAG_UNDEFINED); } + // W2 (Next.js app-page-turbo): a class-object (OBJECT_TYPE_CLASS) can reach + // the value-call path — e.g. `new s.RequestCookies(headers)` where the + // dynamic callee `s.RequestCookies` resolves (through a webpack lazy-export + // getter) to a class object, but the construct site lowered to a call rather + // than routing to `js_new_function_construct`. Calling a class object has + // exactly one sensible meaning — construct it — so do that here instead of + // `throw_not_callable` (which surfaces as "value is not a function"). 
+ if func_ptr.is_null() && crate::object::is_class_object_value(func_value) { + // W4 experiment: a 0-arg call of a class object is most likely a + // new-expression CALLEE RESOLUTION (`new s.RequestCookies(headers)` whose + // member callee eval'd as a 0-arg call). Returning the class object lets + // the OUTER `new` construct it with the real args. A call WITH args is a + // direct construct. + if args_len == 0 { + return f64::from_bits(func_value.to_bits()); + } + return crate::object::js_new_function_construct(func_value, args_ptr, args_len); + } let dispatch_args_len = if !func_ptr.is_null() && lookup_closure_rest(func_ptr).is_none() { match lookup_closure_arity(func_ptr) { Some(declared) if (declared as usize) > args_len => declared as usize, diff --git a/crates/perry-runtime/src/closure/dynamic_props.rs b/crates/perry-runtime/src/closure/dynamic_props.rs index 78dca4d22e..ca18c76051 100644 --- a/crates/perry-runtime/src/closure/dynamic_props.rs +++ b/crates/perry-runtime/src/closure/dynamic_props.rs @@ -319,6 +319,18 @@ pub fn is_closure_ptr(ptr: usize) -> bool { if crate::value::addr_class::is_handle_band(ptr) { return false; } + // #wall2: reject any address outside the platform heap range BEFORE the + // `*(ptr + 12)` magic probe. The handle-band check only covers the low + // small-id bands; a MIS-BOXED value like `0x4_0000_0000` (i32 4 << 32 — a + // Next.js route-module options object whose codegen boxing went wrong) is + // aligned and above the handle band, so it passed both guards and the magic + // read dereferenced unmapped memory → SIGSEGV (the Next.js startup crash + // after app-page-turbo loads). `is_valid_obj_ptr` is the real heap floor + // (macOS: 0x2000_0000_0000); a non-heap address is definitively not a + // closure, so return false instead of faulting. 
+ if !crate::value::addr_class::is_valid_obj_ptr(ptr as *const u8) { + return false; + } if ptr % std::mem::align_of::() != 0 { return false; } diff --git a/crates/perry-runtime/src/fs/mod.rs b/crates/perry-runtime/src/fs/mod.rs index 8f33df5ffa..bf193e2667 100644 --- a/crates/perry-runtime/src/fs/mod.rs +++ b/crates/perry-runtime/src/fs/mod.rs @@ -289,10 +289,24 @@ pub extern "C" fn js_fs_read_file_sync_options( c_err.as_ptr(), ); } - // Return empty string instead of null to prevent crashes when - // callers access .length on the result without null-checking. - // Perry's try/catch doesn't catch null-pointer segfaults. - js_string_from_bytes(b"".as_ptr(), 0) + // Node's `readFileSync` THROWS for an unreadable path + // (ENOENT/EACCES/EISDIR/…). Returning an empty string here made + // callers that try/catch a missing file behave wrongly: e.g. + // Next.js `loadManifest` wraps an optional manifest read + // (`subresource-integrity-manifest.json`, absent in most builds) + // in try/catch and falls back to `{}` on ENOENT — but an empty + // string slipped past the catch into `JSON.parse('')`, throwing a + // misleading `SyntaxError: Unexpected end of JSON input`. Surface + // a Node-shaped fs error instead. This is a real, catchable JS + // throw (caught by JS try/catch) — NOT the null-pointer segfault + // the previous empty-string workaround was guarding against. 
+ let path_str = decode_path_value(path_value).unwrap_or_default(); + let io_err = std::fs::read(&path_str) + .err() + .unwrap_or_else(|| std::io::Error::from(std::io::ErrorKind::NotFound)); + crate::exception::js_throw(crate::fs::errors::build_fs_error_value( + &io_err, "open", &path_str, + )) } } } diff --git a/crates/perry-runtime/src/json/replacer.rs b/crates/perry-runtime/src/json/replacer.rs index a4334ea2f8..e61688946b 100644 --- a/crates/perry-runtime/src/json/replacer.rs +++ b/crates/perry-runtime/src/json/replacer.rs @@ -14,14 +14,53 @@ use std::fmt::Write as FmtWrite; // ─── JSON.stringify with replacer ──────────────────────────────────────────── -/// Call a replacer closure with (key, value) and return the result as f64 +/// Call a replacer closure with (key, value) and return the result as f64. +/// +/// Per ECMA-262 `SerializeJSONProperty`, the replacer is invoked with `this` +/// bound to the *holder* — the object/array that contains the property (or, for +/// the root value, the `{ "": value }` wrapper). Code that relies on the holder +/// (e.g. `this[key] instanceof Date`, or React's Flight reply encoder which +/// keys its already-serialized/dedup Maps by `this`) breaks without it — the +/// Flight encoder's `referenceMap.get(this)` then never finds the parent path, +/// so it re-serializes endlessly (Next.js standalone startup runaway). Mirror +/// the reviver path (`internalize_json_property`), which sets the implicit +/// `this` to the holder around the user-callback call. 
#[inline] pub(crate) unsafe fn call_replacer( replacer: *const crate::ClosureHeader, key_f64: f64, value_f64: f64, + holder_f64: f64, ) -> f64 { - crate::js_closure_call2(replacer, key_f64, value_f64) + let prev_this = crate::object::js_implicit_this_set(holder_f64); + let result = crate::js_closure_call2(replacer, key_f64, value_f64); + crate::object::js_implicit_this_set(prev_this); + result +} + +/// NaN-box a heap object/array pointer as the holder `this` for `call_replacer`. +#[inline] +unsafe fn holder_value(ptr: *const u8) -> f64 { + f64::from_bits(POINTER_TAG | (ptr as u64 & POINTER_MASK)) +} + +/// Build the spec root holder `{ "": value }` (ECMA-262 `JSON.stringify` step: +/// `Let wrapper be OrdinaryObjectCreate(...); CreateDataPropertyOrThrow(wrapper, +/// "", value)`), so the root replacer call sees `this` = the wrapper. GC-safe +/// (mirrors `apply_reviver_with_source`'s root-holder wrapper). +unsafe fn root_holder(value_f64: f64) -> f64 { + let scope = crate::gc::RuntimeHandleScope::new(); + let val_handle = scope.root_nanbox_f64(value_f64); + let wrapper = crate::object::js_object_alloc(0, 1); + let wrapper_handle = scope.root_raw_mut_ptr(wrapper); + let empty = js_string_from_bytes(b"".as_ptr(), 0); + let empty_handle = scope.root_string_ptr(empty); + crate::object::js_object_set_field_by_name( + wrapper_handle.get_raw_mut_ptr::(), + empty_handle.get_raw_const_ptr::(), + val_handle.get_nanbox_f64(), + ); + holder_value(wrapper_handle.get_raw_mut_ptr::() as *const u8) } /// Resolve `value.toJSON(key)` if `value` is an object with a callable @@ -267,7 +306,12 @@ pub(crate) unsafe fn stringify_object_with_replacer_pretty( } } let field_after_to_json = apply_to_json_keyed(field_val, key_f64_for_replacer); - let replaced = call_replacer(replacer, key_f64_for_replacer, field_after_to_json); + let replaced = call_replacer( + replacer, + key_f64_for_replacer, + field_after_to_json, + holder_value(ptr), + ); let replaced_bits = replaced.to_bits(); // Omit 
the property if the replacer returns undefined or a function. @@ -363,7 +407,7 @@ pub(crate) unsafe fn stringify_array_with_replacer_pretty( let key_f64 = nanbox_string_f64(idx_ptr); let elem_after_to_json = apply_to_json_keyed(elem, key_f64); - let replaced = call_replacer(replacer, key_f64, elem_after_to_json); + let replaced = call_replacer(replacer, key_f64, elem_after_to_json, holder_value(ptr)); let replaced_bits = replaced.to_bits(); // Array holes / undefined / functions become null (per JSON spec). @@ -409,8 +453,16 @@ pub unsafe extern "C" fn js_json_stringify_with_replacer( let empty_key_f64 = nanbox_string_f64(empty_str); let value_after_to_json = apply_to_json_keyed(value, empty_key_f64); - // Call replacer with ("", root_value) - let replaced_root = call_replacer(replacer, empty_key_f64, value_after_to_json); + // Call replacer with ("", root_value), `this` = the `{ "": value }` wrapper. + // Per spec the holder wraps the ORIGINAL root value (so a root replacer's + // `this[""]` observes the pre-`toJSON` value); only the replacer's value + // argument is post-`toJSON`. CodeRabbit (PR #5438). + let replaced_root = call_replacer( + replacer, + empty_key_f64, + value_after_to_json, + root_holder(value), + ); let replaced_bits = replaced_root.to_bits(); // If replacer returns undefined for root, return undefined. 
@@ -1250,7 +1302,12 @@ pub unsafe extern "C" fn js_json_stringify_full( let empty_str = js_string_from_bytes(b"".as_ptr(), 0); let empty_key_f64 = nanbox_string_f64(empty_str); let value_after_to_json = apply_to_json_keyed(value, empty_key_f64); - let replaced_root = call_replacer(closure_ptr, empty_key_f64, value_after_to_json); + let replaced_root = call_replacer( + closure_ptr, + empty_key_f64, + value_after_to_json, + root_holder(value_after_to_json), + ); let replaced_bits = replaced_root.to_bits(); if replaced_bits == TAG_UNDEFINED { STRINGIFY_STACK.with(|s| s.borrow_mut().clear()); diff --git a/crates/perry-runtime/src/lib.rs b/crates/perry-runtime/src/lib.rs index 89f452f9b0..275d3b604a 100644 --- a/crates/perry-runtime/src/lib.rs +++ b/crates/perry-runtime/src/lib.rs @@ -252,10 +252,11 @@ pub use value::{ pub use value::{ js_set_handle_array_get, js_set_handle_array_length, js_set_handle_call_method, js_set_handle_object_get_property, js_set_handle_to_string, js_set_handle_typeof, - js_set_native_crypto_dispatch, js_set_native_domain_dispatch, js_set_native_events_construct, - js_set_native_http_dispatch, js_set_native_module_js_loader, - js_set_native_querystring_dispatch, js_set_native_sqlite_dispatch, js_set_native_tls_dispatch, - js_set_native_webcrypto_dispatch, js_set_native_zlib_dispatch, js_set_new_from_handle_v8, + js_set_native_async_hooks_construct, js_set_native_crypto_dispatch, + js_set_native_domain_dispatch, js_set_native_events_construct, js_set_native_http_dispatch, + js_set_native_module_js_loader, js_set_native_querystring_dispatch, + js_set_native_sqlite_dispatch, js_set_native_tls_dispatch, js_set_native_webcrypto_dispatch, + js_set_native_zlib_dispatch, js_set_new_from_handle_v8, }; // Extension pump registration — allows extensions to register pump functions diff --git a/crates/perry-runtime/src/module_require.rs b/crates/perry-runtime/src/module_require.rs index af3da9c511..ce35d9bfa0 100644 --- 
a/crates/perry-runtime/src/module_require.rs +++ b/crates/perry-runtime/src/module_require.rs @@ -218,6 +218,134 @@ pub extern "C" fn js_module_create_require(filename_or_url: f64) -> f64 { make_require(undefined()) } +/// Next.js wall 54: registry mapping an AOT-compiled CJS module's absolute +/// source path to its evaluated `module.exports`, so a RUNTIME +/// `require(absolutePath.js)` (Next.js / turbopack load page + chunk modules by +/// a path computed at request time, not a static specifier) resolves to the +/// module Perry already compiled instead of throwing `MODULE_NOT_FOUND`. Keyed +/// by canonicalized path so `/a/../b` and symlinks normalize to one entry. Each +/// compiled module self-registers at the end of its CJS wrapper init. +static MODULE_PATH_REGISTRY: std::sync::RwLock>> = + std::sync::RwLock::new(None); + +/// Next.js wall 54 (part 2): registry mapping an AOT-compiled module's absolute +/// source path to the ADDRESS of its `__init` function, so a runtime +/// `require(absolutePath.js)` can LAZILY trigger init of a module that was NOT +/// run at startup (Deferred). The `.next/server/**` page/route/chunk modules are +/// loaded by a path computed at request time; eager-initing them at startup runs +/// React-SSR code before the server is ready (and turbopack chunks must init in +/// the order the page loader's `R.c()` calls demand), so they are Deferred and +/// init on first `require`. Populated once at program start by codegen-emitted +/// `js_register_path_init` calls (no init runs there — only the address is +/// recorded). The init function is idempotent (guarded), self-registers its +/// exports into [`MODULE_PATH_REGISTRY`], and may recursively `require` its own +/// dependencies (chunk loaders) — all safe because each module inits once. 
+static MODULE_PATH_INIT_REGISTRY: std::sync::RwLock< + Option<std::collections::HashMap<String, usize>>, +> = std::sync::RwLock::new(None); + +fn canonicalize_module_path(path: &str) -> String { + std::fs::canonicalize(path) + .map(|p| p.to_string_lossy().into_owned()) + .unwrap_or_else(|_| path.to_string()) +} + +/// Codegen FFI: record that `__init` (address `init_addr`) initializes +/// the module whose absolute source path is `path_value`. Emitted once per +/// Deferred `.next/server/**` module at the top of `main`. See +/// [`MODULE_PATH_INIT_REGISTRY`]. +/// # Safety +/// `path_ptr`/`path_len` describe a valid UTF-8 byte range (a codegen string +/// constant). `init_addr` is the address of an `extern "C" fn()` module +/// initializer (from `ptrtoint` of the symbol). +#[no_mangle] +pub unsafe extern "C" fn js_register_path_init(path_ptr: *const u8, path_len: i64, init_addr: i64) { + let slice = std::slice::from_raw_parts(path_ptr, path_len as usize); + let path = String::from_utf8_lossy(slice).into_owned(); + let key = canonicalize_module_path(&path); + let mut guard = MODULE_PATH_INIT_REGISTRY.write().unwrap(); + guard + .get_or_insert_with(std::collections::HashMap::new) + .insert(key, init_addr as usize); +} + +/// Codegen FFI: register an AOT-compiled module's exports under its absolute +/// source path (emitted at the tail of each CJS wrapper). See +/// [`MODULE_PATH_REGISTRY`]. +#[no_mangle] +pub extern "C" fn js_register_path_module(path_value: f64, exports: f64) { + let path = value_to_string(path_value, "path"); + let key = canonicalize_module_path(&path); + let mut guard = MODULE_PATH_REGISTRY.write().unwrap(); + guard + .get_or_insert_with(std::collections::HashMap::new) + .insert(key, exports.to_bits()); +} + +/// Codegen FFI: resolve a runtime `require(absolutePath.js)` to a registered +/// AOT-compiled module's exports, or `undefined` when no module is registered +/// for that path (caller then falls back to the `.json` disk read / throws +/// `MODULE_NOT_FOUND`). 
Module exports are always objects, so `undefined` +/// unambiguously signals "miss". +#[no_mangle] +pub extern "C" fn js_require_path_module(path_value: f64) -> f64 { + let path = value_to_string(path_value, "id"); + let key = canonicalize_module_path(&path); + // Fast path: the module already ran (eager, or a prior require) and + // self-registered its exports. + { + let guard = MODULE_PATH_REGISTRY.read().unwrap(); + if let Some(map) = guard.as_ref() { + if let Some(bits) = map.get(&key) { + return f64::from_bits(*bits); + } + } + } + // Wall 54 (part 2): no exports yet — if a Deferred module is registered for + // this path, trigger its init now. The init self-registers its exports (and + // may recursively require its chunk dependencies). Crucially, all registry + // locks are RELEASED before calling init, since init re-enters + // `js_register_path_module` / `js_require_path_module`. + let init_addr = { + let guard = MODULE_PATH_INIT_REGISTRY.read().unwrap(); + guard.as_ref().and_then(|m| m.get(&key).copied()) + }; + if let Some(addr) = init_addr { + // SAFETY: `addr` is the address of a codegen-emitted `extern "C" fn()` + // module initializer, recorded by `js_register_path_init` from a value + // produced by `ptrtoint` of the same function symbol. The initializer + // is idempotent (guarded by `@__perry_init_done_`). + let init_fn: extern "C" fn() = unsafe { std::mem::transmute::<usize, extern "C" fn()>(addr) }; + init_fn(); + let guard = MODULE_PATH_REGISTRY.read().unwrap(); + if let Some(map) = guard.as_ref() { + if let Some(bits) = map.get(&key) { + return f64::from_bits(*bits); + } + } + } + undefined() +} + +/// Next.js wall 53: runtime `require(absolutePath)` of a `.json` file. +/// +/// Emitted only by the CJS wrapper's `require` fallback (cjs_wrap/wrap.rs) for a +/// specifier computed at runtime (e.g. Next.js `require(this.middlewareManifestPath)`) +/// — the statically-resolved relative cases can't cover it. 
Node's `require` +/// reads + `JSON.parse`s `.json` files; `.json` is pure data so this needs no +/// code evaluation. Reads the file from disk and parses it, throwing +/// `MODULE_NOT_FOUND` (matching Node's require) when the path doesn't exist. +#[no_mangle] +pub extern "C" fn js_require_json_disk(specifier: f64) -> f64 { + let path = value_to_string(specifier, "id"); + let content = match std::fs::read_to_string(&path) { + Ok(c) => c, + Err(_) => throw_module_not_found(&path), + }; + let text_ptr = js_string_from_bytes(content.as_ptr(), content.len() as u32); + let parsed = unsafe { crate::json::js_json_parse(text_ptr) }; + f64::from_bits(parsed.bits()) +} /// Ambient `require` for compiled external / `compilePackages` modules (Tier 1 of /// #5389, fixes #5373). These modules carry no CJS ambient `require` binding, so a /// bare or computed `require(expr)` would otherwise lower to diff --git a/crates/perry-runtime/src/object/class_constructors.rs b/crates/perry-runtime/src/object/class_constructors.rs index 21654ac717..3f9302782a 100644 --- a/crates/perry-runtime/src/object/class_constructors.rs +++ b/crates/perry-runtime/src/object/class_constructors.rs @@ -523,11 +523,42 @@ pub(crate) unsafe fn replay_class_object_constructor( let user_params = (total_params as usize).saturating_sub(effective_caps); let undef = f64::from_bits(crate::value::TAG_UNDEFINED); let mut final_args: Vec<f64> = Vec::with_capacity(total_params as usize); - for i in 0..user_params { - if !args_ptr.is_null() && i < args_len { - final_args.push(*args_ptr.add(i)); - } else { - final_args.push(undef); + // #wall3: a `constructor(...args)` (rest param) called via the dynamic + // member-new path (`new ns.Sub(opts)` → js_new_function_construct → + // is_class_object_value → here) must BUNDLE the trailing call args into a JS + // array for the rest slot. 
call_vtable_method's own `has_rest` can't do it + // because the rest param is NOT last here — the positional `__perry_cap_*` + // capture params follow it — so we pack the rest array ourselves at the rest + // index, then append caps. Without this the rest binds to the first arg as a + // scalar (`args`=opts, not [opts]) and `super(...args)` spreads a bare object + // → 0x400000000 mis-box → crash (Next.js `new c.AppPageRouteModule({...})`). + let rest_idx = crate::closure::lookup_closure_rest(ctor_ptr as *const u8) + .map(|ri| ri as usize) + .filter(|ri| *ri < user_params); + if let Some(ri) = rest_idx { + for i in 0..ri { + if !args_ptr.is_null() && i < args_len { + final_args.push(*args_ptr.add(i)); + } else { + final_args.push(undef); + } + } + let mut rest_arr = crate::array::js_array_alloc(0); + if !args_ptr.is_null() { + let mut i = ri; + while i < args_len { + rest_arr = crate::array::js_array_push_f64(rest_arr, *args_ptr.add(i)); + i += 1; + } + } + final_args.push(crate::value::js_nanbox_pointer(rest_arr as i64)); + } else { + for i in 0..user_params { + if !args_ptr.is_null() && i < args_len { + final_args.push(*args_ptr.add(i)); + } else { + final_args.push(undef); + } } } for j in 0..n_caps { @@ -571,11 +602,43 @@ pub(crate) unsafe fn replay_registered_class_constructor( let undef = f64::from_bits(crate::value::TAG_UNDEFINED); let mut final_args: Vec<f64> = Vec::with_capacity(total_params as usize); - for i in 0..user_params { - if !args_ptr.is_null() && i < args_len { - final_args.push(*args_ptr.add(i)); - } else { - final_args.push(undef); + // #wall3: a `constructor(...args)` reached via the dynamic class-REF member-new + // path (`new ns.Sub(opts)` where ns.Sub resolves to an INT32 ClassRef at + // runtime → js_new_function_construct → constructor_class_ref_id → + // construct_registered_class_ref → here) must BUNDLE trailing call args into a + // JS array for the rest slot. 
The rest is NOT the last ctor param (positional + // `__perry_cap_*` capture params follow it), so call_vtable_method's own + // `has_rest` can't pack it — we pack the rest array ourselves at the rest + // index, then append caps. Without this the rest binds to the first arg as a + // scalar (`args`=opts, not [opts]) and `super(...args)` spreads a bare object + // → 0x400000000 mis-box → crash (Next.js `new c.AppPageRouteModule({...})`). + let rest_idx = crate::closure::lookup_closure_rest(ctor_ptr as *const u8) + .map(|ri| ri as usize) + .filter(|ri| *ri < user_params); + if let Some(ri) = rest_idx { + for i in 0..ri { + if !args_ptr.is_null() && i < args_len { + final_args.push(*args_ptr.add(i)); + } else { + final_args.push(undef); + } + } + let mut rest_arr = crate::array::js_array_alloc(0); + if !args_ptr.is_null() { + let mut i = ri; + while i < args_len { + rest_arr = crate::array::js_array_push_f64(rest_arr, *args_ptr.add(i)); + i += 1; + } + } + final_args.push(crate::value::js_nanbox_pointer(rest_arr as i64)); + } else { + for i in 0..user_params { + if !args_ptr.is_null() && i < args_len { + final_args.push(*args_ptr.add(i)); + } else { + final_args.push(undef); + } } } for bits in &caps { diff --git a/crates/perry-runtime/src/object/class_registry.rs b/crates/perry-runtime/src/object/class_registry.rs index 6175cff03c..96dd7235c2 100644 --- a/crates/perry-runtime/src/object/class_registry.rs +++ b/crates/perry-runtime/src/object/class_registry.rs @@ -1963,6 +1963,23 @@ pub unsafe extern "C" fn js_new_function_construct( return dispatch(method.as_ptr(), method.len(), args_ptr, args_len); } } + // `new ()` / `<...AsyncResource>()`. + // Next.js stores the native ctor on `globalThis.AsyncLocalStorage` and + // later does `new maybeGlobalAsyncLocalStorage()` (a dynamic callee), so + // the static `new AsyncLocalStorage()` codegen arm never fires. 
Without + // this the instance was a class_id=0 empty object whose `.getStore` read + // back `undefined` -> "getStore is not a function" at server startup. + // Route to the stdlib handle constructor via the registered dispatcher. + if module == "async_hooks" + && matches!(method.as_str(), "AsyncLocalStorage" | "AsyncResource") + { + let ptr = crate::value::JS_NATIVE_ASYNC_HOOKS_CONSTRUCT + .load(std::sync::atomic::Ordering::SeqCst); + if !ptr.is_null() { + let dispatch: crate::value::JsNativeEventsConstructFn = std::mem::transmute(ptr); + return dispatch(method.as_ptr(), method.len(), args_ptr, args_len); + } + } if module == "zlib" && matches!(method.as_str(), "ZstdCompress" | "ZstdDecompress") { let ptr = crate::value::JS_NATIVE_ZLIB_DISPATCH.load(std::sync::atomic::Ordering::SeqCst); diff --git a/crates/perry-runtime/src/object/field_get_set.rs b/crates/perry-runtime/src/object/field_get_set.rs index 6c0a1d6ca0..697e9ab9c7 100644 --- a/crates/perry-runtime/src/object/field_get_set.rs +++ b/crates/perry-runtime/src/object/field_get_set.rs @@ -1076,7 +1076,13 @@ pub extern "C" fn js_object_set_field_by_index( let name_len = (*key).byte_len as usize; let name_bytes = std::slice::from_raw_parts(name_ptr, name_len); if let Ok(name) = std::str::from_utf8(name_bytes) { - if ACCESSORS_IN_USE.with(|c| c.get()) { + // Gate on the per-object descriptor flag: `ACCESSOR_DESCRIPTORS` + // is keyed by raw address, so a fresh object reusing a freed + // address must not pick up the previous tenant's stale accessor + // (it would silently drop `obj.k = v` for a getter-only stale + // entry). A fresh allocation has the flag clear. 
+ if ACCESSORS_IN_USE.with(|c| c.get()) && super::object_has_descriptors(obj as usize) + { if let Some(acc) = get_accessor_descriptor(obj as usize, name) { if acc.set != 0 { let closure = (acc.set & crate::value::POINTER_MASK) @@ -5228,8 +5234,11 @@ pub extern "C" fn js_object_get_field_by_name( } else { // Accessor short-circuit: if this (obj, key) has a getter installed, // invoke it instead of reading the slot. The `ACCESSORS_IN_USE` - // thread-local gate keeps this off the hot path in the common case. - if ACCESSORS_IN_USE.with(|c| c.get()) { + // thread-local gate keeps this off the hot path in the common case; + // the per-object flag gate avoids invoking a stale getter left by a + // freed object whose address this fresh object reused. + if ACCESSORS_IN_USE.with(|c| c.get()) && super::object_has_descriptors(obj as usize) + { if let Ok(name) = std::str::from_utf8(key_bytes) { if let Some(acc) = get_accessor_descriptor(obj as usize, name) { if acc.get != 0 { @@ -5255,7 +5264,8 @@ pub extern "C" fn js_object_get_field_by_name( // linear scan below (the index is an accelerator, not authoritative). if key_count >= WIDE_KEY_INDEX_MIN_KEYS { if let Some(i) = wide_key_index_lookup(keys_id, key_bytes, key, keys, key_count) { - if ACCESSORS_IN_USE.with(|c| c.get()) { + if ACCESSORS_IN_USE.with(|c| c.get()) && super::object_has_descriptors(obj as usize) + { if let Ok(name) = std::str::from_utf8(key_bytes) { if let Some(acc) = get_accessor_descriptor(obj as usize, name) { if acc.get != 0 { @@ -5296,7 +5306,8 @@ pub extern "C" fn js_object_get_field_by_name( wide_key_index_note_hit(keys_id, key_bytes, i as u32); } // Accessor short-circuit (see fast path above). 
- if ACCESSORS_IN_USE.with(|c| c.get()) { + if ACCESSORS_IN_USE.with(|c| c.get()) && super::object_has_descriptors(obj as usize) + { if let Ok(name) = std::str::from_utf8(key_bytes) { if let Some(acc) = get_accessor_descriptor(obj as usize, name) { if acc.get != 0 { diff --git a/crates/perry-runtime/src/object/field_set_by_name.rs b/crates/perry-runtime/src/object/field_set_by_name.rs index 2f6a2a39be..860580352b 100644 --- a/crates/perry-runtime/src/object/field_set_by_name.rs +++ b/crates/perry-runtime/src/object/field_set_by_name.rs @@ -917,7 +917,15 @@ pub extern "C" fn js_object_set_field_by_name( // frozen check at the top of each block threw before the accessor was // consulted (test262 // assign/target-is-frozen-accessor-property-set-succeeds). - if ACCESSORS_IN_USE.with(|c| c.get()) { + // + // Gate on the per-object `OBJ_FLAG_HAS_DESCRIPTORS` flag, not just the + // thread-global `ACCESSORS_IN_USE`: `ACCESSOR_DESCRIPTORS` is keyed by + // raw address, so a fresh object reusing a freed address would otherwise + // read back the previous tenant's stale getter-only accessor and falsely + // throw "Cannot assign to read only property" on a plain `{}` (Next.js + // app-page-turbo runtime's `exports.Fragment = …`). A fresh allocation + // has the flag clear, so it skips the stale lookup entirely. + if ACCESSORS_IN_USE.with(|c| c.get()) && super::object_has_descriptors(obj as usize) { if let Some(ref k) = incoming_key_str { if let Some(acc) = get_accessor_descriptor(obj as usize, k) { if acc.set != 0 { @@ -1097,7 +1105,18 @@ pub extern "C" fn js_object_set_field_by_name( } // Per-property writable check (set by Object.defineProperty / freeze). // Issue #615 — strict-mode throw on read-only assign. 
- if PROPERTY_ATTRS_IN_USE.with(|c| c.get()) { + // + // Gate on the per-object `OBJ_FLAG_HAS_DESCRIPTORS` flag, not just + // the thread-global `PROPERTY_ATTRS_IN_USE`: `PROPERTY_DESCRIPTORS` + // is keyed by raw address, so a fresh object reusing a freed + // address would otherwise read back the previous tenant's stale + // `(addr, key)` descriptor and falsely throw "Cannot assign to + // read only property" on a plain `{}` (Next.js app-page-turbo + // runtime's `exports.Fragment = …`). A fresh allocation has the + // flag clear, so it skips the lookup entirely. + if PROPERTY_ATTRS_IN_USE.with(|c| c.get()) + && super::object_has_descriptors(obj as usize) + { if let Some(ref k) = incoming_key_str { if let Some(attrs) = get_property_attrs(obj as usize, k) { if !attrs.writable() { diff --git a/crates/perry-runtime/src/object/mod.rs b/crates/perry-runtime/src/object/mod.rs index daff4c5126..9ebcce340d 100644 --- a/crates/perry-runtime/src/object/mod.rs +++ b/crates/perry-runtime/src/object/mod.rs @@ -750,6 +750,27 @@ pub(crate) fn get_property_attrs(obj: usize, key: &str) -> Option PROPERTY_DESCRIPTORS.with(|m| m.borrow().get(&(obj, key.to_string())).copied()) } +/// Whether this specific object has ever had a property descriptor installed on +/// it (`OBJ_FLAG_HAS_DESCRIPTORS`, set by [`note_descriptor_target`] for every +/// `PROPERTY_DESCRIPTORS` insertion on a `GC_TYPE_OBJECT`). The flag lives in +/// the GcHeader and travels with the object across evacuation. +/// +/// `PROPERTY_DESCRIPTORS` is keyed by raw address, so once a freed object's slot +/// is reused by a fresh object, a stale `(addr, key)` descriptor entry would be +/// read back for the new object — falsely reporting e.g. a `writable: false` +/// `Fragment` on a brand-new `{}` and throwing "Cannot assign to read only +/// property". 
A fresh allocation's `_reserved` is zeroed, so gating descriptor +/// lookups on this per-object flag avoids the stale-address-reuse false +/// positive (Next.js app-page-turbo runtime's webpack `exports.Fragment = …`). +pub(crate) fn object_has_descriptors(obj: usize) -> bool { + unsafe { + if let Some(header) = crate::value::addr_class::try_read_gc_header(obj) { + return header._reserved & crate::gc::OBJ_FLAG_HAS_DESCRIPTORS != 0; + } + } + false +} + /// Store a property descriptor for (obj, key). pub(crate) fn set_property_attrs(obj: usize, key: String, attrs: PropertyAttrs) { note_descriptor_target(obj); diff --git a/crates/perry-runtime/src/object/native_call_method.rs b/crates/perry-runtime/src/object/native_call_method.rs index 03ac126764..fb49299c1f 100644 --- a/crates/perry-runtime/src/object/native_call_method.rs +++ b/crates/perry-runtime/src/object/native_call_method.rs @@ -1418,6 +1418,34 @@ pub(crate) unsafe fn try_dispatch_instance_method_value( )) } +/// #wall4: null-safe variant used ONLY by the unknown-native-method fallback in +/// codegen (`lower_call/native/mod.rs`). The HIR can mis-classify a receiver's +/// class so an `obj.method()` reaches that fallback; dispatching via +/// `js_native_call_method` is correct for a REAL receiver (fixes the Next.js +/// `e.indexOf` mis-typed-as-FormData case where `e` is a real array). But a +/// genuinely undefined/null receiver must NOT hard-throw "Cannot read +/// properties of undefined" — the prior `0.0` sentinel let such call sites limp, +/// and Next's `app-page-turbo.runtime.prod.js` TOP-LEVEL has a nullish-receiver +/// `.indexOf` that, if it throws, aborts the entire module load (then the +/// `_not-found` page can't be required → HTTP 500). Returns the SAME `0.0` +/// sentinel as the old fallback for a nullish receiver (preserving the exact +/// pre-fix non-crashing behavior — `undefined` instead broke downstream code +/// that expected a number); otherwise dispatches identically. 
+#[no_mangle] +pub unsafe extern "C" fn js_native_call_method_nullsafe( + object: f64, + method_name_ptr: *const i8, + method_name_len: usize, + args_ptr: *const f64, + args_len: usize, +) -> f64 { + let v = crate::value::JSValue::from_bits(object.to_bits()); + if v.is_undefined() || v.is_null() { + return 0.0; + } + js_native_call_method(object, method_name_ptr, method_name_len, args_ptr, args_len) +} + #[no_mangle] pub unsafe extern "C" fn js_native_call_method( object: f64, diff --git a/crates/perry-runtime/src/object/object_ops.rs b/crates/perry-runtime/src/object/object_ops.rs index 2af22aa6e1..6568a6f12e 100644 --- a/crates/perry-runtime/src/object/object_ops.rs +++ b/crates/perry-runtime/src/object/object_ops.rs @@ -1354,6 +1354,54 @@ pub extern "C" fn js_object_define_property( // 4. Present `get`/`set` must be callable. let target_is_class_ref = super::class_ref_id(obj_value).is_some(); if !target_is_class_ref && !value_is_object_like(obj_value) { + // A native HANDLE target (a small pointer-tagged id — e.g. an http + // ServerResponse, Headers, a timer) is not a heap object, so Perry + // can't attach an arbitrary own property to it the way V8 can. Node + // framework code nonetheless calls `Object.defineProperty(handle, …)`: + // Next.js `patchSetHeaderWithCookieSupport` marks `res` with a Symbol + // (`Object.defineProperty(res, PATCHED_SET_HEADER, { value: true })`). + // Throwing here aborts the whole request (HTTP 500). Instead treat the + // define as a best-effort success: for a string key with a data + // descriptor, route the value through the handle property-set so + // `res[key]` round-trips; symbol keys / accessor descriptors degrade + // to a no-op (the framework's patch is idempotent, so re-running is + // harmless). Matches how `js_object_set_field_by_name` already tolerates + // small-handle receivers. 
+ let jv = crate::value::JSValue::from_bits(obj_value.to_bits()); + let handle_id = if jv.is_pointer() { + let p = jv.as_pointer::() as usize; + if p >= 1 && p < 0x10000 { + Some(p) + } else { + None + } + } else { + None + }; + if let Some(hid) = handle_id { + // Best-effort: store a string-keyed data-descriptor value on the + // handle via the same dispatch `obj.key = value` uses. + let ks = crate::value::js_get_string_pointer_unified(key_value) + as *const crate::StringHeader; + if !ks.is_null() { + if let Some(dispatch) = super::class_handles::handle_property_set_dispatch() { + let dval = if desc_has_field(descriptor_value, b"value") { + Some(f64::from_bits( + desc_read_field(descriptor_value, b"value").bits(), + )) + } else { + None + }; + if let Some(v) = dval { + let name_ptr = + (ks as *const u8).add(std::mem::size_of::<crate::StringHeader>()); + let name_len = (*ks).byte_len as usize; + dispatch(hid as i64, name_ptr, name_len, v); + } + } + } + return obj_value; + } throw_object_type_error(b"Object.defineProperty called on non-object"); } // A descriptor must be an Object; a Symbol is pointer-tagged but not an @@ -2760,6 +2808,24 @@ pub extern "C" fn js_object_get_prototype_of(obj_value: f64) -> f64 { ); } } + // A native-module namespace object (`require("path")` etc., + // class_id NATIVE_MODULE_CLASS_ID, the `__module__`-tagged + // object) is an ordinary object whose [[Prototype]] is + // %Object.prototype% — NOT itself. The `return obj_value` self- + // prototype fallback below makes turbopack's `interopEsm` + // proto-chain walk (`for(cur=raw; !LEAF.includes(cur); + // cur=getProto(cur))`) never terminate — getProto keeps + // returning the same object, so it creates export getters + // forever (the Next.js standalone startup runaway: unbounded + // memory growth, no `✓ Ready`). Return Object.prototype so the + // walk reaches a LEAF_PROTOTYPE and stops. 
+ if (*obj).class_id == super::native_module::NATIVE_MODULE_CLASS_ID { + let proto = crate::object::builtin_prototype_value("Object"); + if proto.to_bits() != crate::value::TAG_UNDEFINED { + return proto; + } + return f64::from_bits(TAG_NULL); + } } return obj_value; } @@ -2868,6 +2934,24 @@ pub extern "C" fn js_object_get_prototype_of(obj_value: f64) -> f64 { ); } } + // A native-module namespace object (`require("path")` etc., + // class_id NATIVE_MODULE_CLASS_ID, the `__module__`-tagged + // object) is an ordinary object whose [[Prototype]] is + // %Object.prototype% — NOT itself. The `return obj_value` self- + // prototype fallback below makes turbopack's `interopEsm` + // proto-chain walk (`for(cur=raw; !LEAF.includes(cur); + // cur=getProto(cur))`) never terminate — getProto keeps + // returning the same object, so it creates export getters + // forever (the Next.js standalone startup runaway: unbounded + // memory growth, no `✓ Ready`). Return Object.prototype so the + // walk reaches a LEAF_PROTOTYPE and stops. + if (*obj).class_id == super::native_module::NATIVE_MODULE_CLASS_ID { + let proto = crate::object::builtin_prototype_value("Object"); + if proto.to_bits() != crate::value::TAG_UNDEFINED { + return proto; + } + return f64::from_bits(TAG_NULL); + } } return obj_value; } diff --git a/crates/perry-runtime/src/object/polymorphic_index.rs b/crates/perry-runtime/src/object/polymorphic_index.rs index 2977b75e84..79841a3368 100644 --- a/crates/perry-runtime/src/object/polymorphic_index.rs +++ b/crates/perry-runtime/src/object/polymorphic_index.rs @@ -36,7 +36,20 @@ unsafe fn property_key_string_ptr(value: f64) -> *mut crate::StringHeader { #[no_mangle] pub extern "C" fn js_object_get_index_polymorphic(obj_handle: i64, idx: f64) -> f64 { let raw = if (obj_handle as u64) >> 48 >= 0x7FF8 { - (obj_handle as u64) & 0x0000_FFFF_FFFF_FFFF + // NaN-boxed: only POINTER_TAG (0x7FFD) and STRING_TAG (0x7FFF) carry a + // heap pointer in the low 48 bits. 
INT32 (0x7FFE), BIGINT (0x7FFA) and + // the undefined/null/bool tags (0x7FFC) are PRIMITIVES — indexing them + // yields `undefined` per JS (`(983055)[0] === undefined`). Treating an + // INT32's integer payload as a pointer derefs a wild address → SIGSEGV. + // This is the Next.js app-page-turbo render crash: a NaN-boxed-int + // receiver (0xf000f = 983055) indexed inside a class `get` method + // (js_object_get_index_polymorphic read its GcHeader at raw-8). Reject + // non-pointer/non-string NaN-boxed receivers up front (cross-platform — + // not dependent on a heap-address floor). + match (obj_handle as u64) >> 48 { + 0x7FFD | 0x7FFF => (obj_handle as u64) & 0x0000_FFFF_FFFF_FFFF, + _ => return f64::from_bits(crate::value::TAG_UNDEFINED), + } } else { obj_handle as u64 }; @@ -62,6 +75,18 @@ pub extern "C" fn js_object_get_index_polymorphic(obj_handle: i64, idx: f64) -> ); } + // #wall5-render: `obj[idx]` where `obj` is a mis-boxed / non-heap value + // (e.g. a small bogus pointer like 0xf000f produced upstream) must NOT + // dereference the GcHeader at `raw-8` — that's a wild read → SIGSEGV. The + // `raw < 0x1000` guards above are too weak (0xf000f passes). Typed-array / + // buffer / string receivers were already handled before this point, so a + // value reaching here that isn't a valid arena/old-gen object pointer is + // not indexable → `undefined` (matches JS `(5)[0]` etc.). Mirrors the + // is_closure_ptr heap-range guard (wall #2). Next.js app-page-turbo's + // `u_i_24_6.get` indexed such a value during the app render → crash. 
+ if !crate::value::addr_class::is_valid_obj_ptr(raw as *const u8) { + return f64::from_bits(crate::value::TAG_UNDEFINED); + } let gc_type = unsafe { let gc_header_addr = raw.wrapping_sub(crate::gc::GC_HEADER_SIZE as u64) as usize; if gc_header_addr < 0x1000 { diff --git a/crates/perry-runtime/src/proxy.rs b/crates/perry-runtime/src/proxy.rs index 86027a9ce6..b9b1223848 100644 --- a/crates/perry-runtime/src/proxy.rs +++ b/crates/perry-runtime/src/proxy.rs @@ -996,15 +996,29 @@ fn own_set_descriptor(target: f64, key: f64) -> Option { return None; } let key_name = key_to_rust_string(key)?; - if let Some(acc) = crate::object::get_accessor_descriptor(obj_ptr, &key_name) { - return Some(OwnSetDescriptor::Accessor { - setter_bits: acc.set, - }); - } - if let Some(attrs) = crate::object::get_property_attrs(obj_ptr, &key_name) { - return Some(OwnSetDescriptor::Data { - writable: attrs.writable(), - }); + // `ACCESSOR_DESCRIPTORS` / `PROPERTY_DESCRIPTORS` are keyed by raw address, + // so a fresh object reusing a freed address would otherwise read back the + // previous tenant's stale getter-only accessor / non-writable descriptor and + // report this `obj.k = v` as read-only — falsely throwing "Cannot assign to + // read only property" on a plain `{}` (Next.js app-page-turbo runtime's + // `exports.Fragment = …`, reached here once a descriptor on Object.prototype + // disables the plain-object [[Set]] fast path process-wide, #5054). Gate on + // the per-object `OBJ_FLAG_HAS_DESCRIPTORS` flag — set reliably for every + // descriptor installed on a `GC_TYPE_OBJECT`, and clear on a fresh + // allocation. Closures don't carry the flag, so keep consulting the side + // tables for them (their `name`/`length` + user `defineProperty` descriptors + // live there). 
+ if crate::object::object_has_descriptors(obj_ptr) || crate::closure::is_closure_ptr(obj_ptr) { + if let Some(acc) = crate::object::get_accessor_descriptor(obj_ptr, &key_name) { + return Some(OwnSetDescriptor::Accessor { + setter_bits: acc.set, + }); + } + if let Some(attrs) = crate::object::get_property_attrs(obj_ptr, &key_name) { + return Some(OwnSetDescriptor::Data { + writable: attrs.writable(), + }); + } } if crate::closure::is_closure_ptr(obj_ptr) { if crate::object::has_own_helpers::closure_own_key_present(obj_ptr, &key_name) { diff --git a/crates/perry-runtime/src/value/handle.rs b/crates/perry-runtime/src/value/handle.rs index 1ee1cb4992..1d099bd569 100644 --- a/crates/perry-runtime/src/value/handle.rs +++ b/crates/perry-runtime/src/value/handle.rs @@ -110,6 +110,17 @@ pub extern "C" fn js_set_native_events_construct(func: JsNativeEventsConstructFn JS_NATIVE_EVENTS_CONSTRUCT.store(func as *mut (), Ordering::SeqCst); } +/// Register the async_hooks dynamic-construct dispatcher. Called by perry-stdlib +/// at startup so `new ()` (the Next.js +/// `new maybeGlobalAsyncLocalStorage()` shape, where the ctor value came from +/// `globalThis.AsyncLocalStorage = AsyncLocalStorage`) builds a real handle +/// instead of a class_id=0 empty object. Shares the `JsNativeEventsConstructFn` +/// (method_ptr, method_len, args_ptr, args_len) -> f64 signature. +#[no_mangle] +pub extern "C" fn js_set_native_async_hooks_construct(func: JsNativeEventsConstructFn) { + JS_NATIVE_ASYNC_HOOKS_CONSTRUCT.store(func as *mut (), Ordering::SeqCst); +} + /// Set the native module JS property loader (called by perry-jsruntime) /// This callback loads a native module via V8 and gets a property from it. 
#[no_mangle] diff --git a/crates/perry-runtime/src/value/mod.rs b/crates/perry-runtime/src/value/mod.rs index 3cff9a3700..4c34841c8a 100644 --- a/crates/perry-runtime/src/value/mod.rs +++ b/crates/perry-runtime/src/value/mod.rs @@ -59,11 +59,11 @@ pub(crate) use tags::{ STRING_TAG, TAG_FALSE, TAG_HOLE, TAG_MASK, TAG_NULL, TAG_TRUE, TAG_UNDEFINED, }; pub use tags::{ - JS_HANDLE_CALL_METHOD, JS_HANDLE_TYPEOF, JS_NATIVE_CRYPTO_DISPATCH, JS_NATIVE_DOMAIN_DISPATCH, - JS_NATIVE_EVENTS_CONSTRUCT, JS_NATIVE_HTTP_DISPATCH, JS_NATIVE_MODULE_JS_LOADER, - JS_NATIVE_QUERYSTRING_DISPATCH, JS_NATIVE_SQLITE_DISPATCH, JS_NATIVE_TLS_DISPATCH, - JS_NATIVE_WEBCRYPTO_DISPATCH, JS_NATIVE_ZLIB_DISPATCH, JS_NEW_FROM_HANDLE_V8, - SHORT_STRING_MAX_LEN, + JS_HANDLE_CALL_METHOD, JS_HANDLE_TYPEOF, JS_NATIVE_ASYNC_HOOKS_CONSTRUCT, + JS_NATIVE_CRYPTO_DISPATCH, JS_NATIVE_DOMAIN_DISPATCH, JS_NATIVE_EVENTS_CONSTRUCT, + JS_NATIVE_HTTP_DISPATCH, JS_NATIVE_MODULE_JS_LOADER, JS_NATIVE_QUERYSTRING_DISPATCH, + JS_NATIVE_SQLITE_DISPATCH, JS_NATIVE_TLS_DISPATCH, JS_NATIVE_WEBCRYPTO_DISPATCH, + JS_NATIVE_ZLIB_DISPATCH, JS_NEW_FROM_HANDLE_V8, SHORT_STRING_MAX_LEN, }; // Crate-internal handle dispatch atomics + callback type aliases (read by @@ -85,11 +85,12 @@ pub(crate) use handle::js_handle_is_function; pub use handle::{ is_js_handle, js_handle_array_get, js_handle_array_length, js_set_handle_array_get, js_set_handle_array_length, js_set_handle_call_method, js_set_handle_object_get_property, - js_set_handle_to_string, js_set_handle_typeof, js_set_native_crypto_dispatch, - js_set_native_domain_dispatch, js_set_native_events_construct, js_set_native_http_dispatch, - js_set_native_module_js_loader, js_set_native_querystring_dispatch, - js_set_native_sqlite_dispatch, js_set_native_tls_dispatch, js_set_native_webcrypto_dispatch, - js_set_native_zlib_dispatch, js_set_new_from_handle_v8, native_module_try_js_property, + js_set_handle_to_string, js_set_handle_typeof, js_set_native_async_hooks_construct, + 
js_set_native_crypto_dispatch, js_set_native_domain_dispatch, js_set_native_events_construct, + js_set_native_http_dispatch, js_set_native_module_js_loader, + js_set_native_querystring_dispatch, js_set_native_sqlite_dispatch, js_set_native_tls_dispatch, + js_set_native_webcrypto_dispatch, js_set_native_zlib_dispatch, js_set_new_from_handle_v8, + native_module_try_js_property, }; // ----- Basic NaN-box pack / unpack FFI ----- diff --git a/crates/perry-runtime/src/value/tags.rs b/crates/perry-runtime/src/value/tags.rs index 137bad1d9d..1962ae9dff 100644 --- a/crates/perry-runtime/src/value/tags.rs +++ b/crates/perry-runtime/src/value/tags.rs @@ -184,3 +184,11 @@ pub static JS_NATIVE_DOMAIN_DISPATCH: AtomicPtr<()> = AtomicPtr::new(std::ptr::n pub static JS_NATIVE_TLS_DISPATCH: AtomicPtr<()> = AtomicPtr::new(std::ptr::null_mut()); pub static JS_NATIVE_HTTP_DISPATCH: AtomicPtr<()> = AtomicPtr::new(std::ptr::null_mut()); pub static JS_NATIVE_EVENTS_CONSTRUCT: AtomicPtr<()> = AtomicPtr::new(std::ptr::null_mut()); +// Dynamic `new ()` (e.g. `new maybeGlobalAsyncLocalStorage()` +// where the value came from `globalThis.AsyncLocalStorage = AsyncLocalStorage`). +// Registered by perry-stdlib at startup so a bound `async_hooks.AsyncLocalStorage` / +// `AsyncResource` export value constructed dynamically reaches the real handle +// constructor instead of falling through to the class_id=0 empty object. Takes +// (method_name_ptr, method_name_len, args_ptr, args_len), returns the NaN-boxed +// instance. Next.js standalone server startup blocker. 
+pub static JS_NATIVE_ASYNC_HOOKS_CONSTRUCT: AtomicPtr<()> = AtomicPtr::new(std::ptr::null_mut()); diff --git a/crates/perry-stdlib/src/common/dispatch.rs b/crates/perry-stdlib/src/common/dispatch.rs index 094a1773f9..27d9e3dabf 100644 --- a/crates/perry-stdlib/src/common/dispatch.rs +++ b/crates/perry-stdlib/src/common/dispatch.rs @@ -287,6 +287,41 @@ unsafe fn dispatch_event_emitter_property(handle: i64, property: &str) -> Option Some(bind_method(method)) } +/// `AsyncLocalStorage` METHOD-VALUE reads (the property-read counterpart of +/// `dispatch_async_local_storage_method`). `als.getStore()` (a direct call) +/// already dispatched, but reading `als.getStore` AS A VALUE (`const gs = +/// als.getStore`, `{ getStore } = als`, `typeof als.getStore`) returned +/// `undefined` — there was no property-read dispatch for ALS handles (only +/// EventEmitter had one, #4995). Next.js' server startup reads `getStore` as a +/// value (cacheComponents / patch-fetch async-storage setup) and then calls it, +/// so it threw `TypeError: getStore is not a function` BEFORE `✓ Ready`. Bind +/// each method to the handle so the read yields a callable bound method, exactly +/// like `dispatch_event_emitter_property`. +unsafe fn dispatch_async_local_storage_property(handle: i64, property: &str) -> Option<f64> { + if !matches!( + property, + "run" | "getStore" | "enterWith" | "exit" | "disable" + ) { + return None; + } + if get_handle_mut::(handle).is_none() { + return None; + } + extern "C" { + fn js_class_method_bind( + instance: f64, + method_name_ptr: *const u8, + method_name_len: usize, + ) -> f64; + } + let m = property.as_bytes(); + Some(js_class_method_bind( + nanbox_handle_value(handle), + m.as_ptr(), + m.len(), + )) +} + /// Dispatch a method call on a handle-based object. 
#[no_mangle] pub unsafe extern "C" fn js_handle_method_dispatch( @@ -1813,6 +1848,10 @@ pub unsafe extern "C" fn js_handle_property_dispatch( return value; } + if let Some(value) = dispatch_async_local_storage_property(handle, property_name) { + return value; + } + #[cfg(feature = "http-client")] if let Some(value) = crate::http::dispatch_agent_property(handle, property_name) { return value; @@ -3292,6 +3331,42 @@ pub unsafe extern "C" fn js_stdlib_init_dispatch() { } #[cfg(any(feature = "bundled-events", feature = "external-events-construct"))] perry_runtime::js_set_native_events_construct(events_native_construct); + + // Dynamic `new ()` -> real handle. Next.js does + // `globalThis.AsyncLocalStorage = AsyncLocalStorage` then + // `new maybeGlobalAsyncLocalStorage()`; the dynamic callee misses the static + // `new AsyncLocalStorage()` codegen arm, so the runtime construct path must + // build the handle here (else `.getStore` is undefined at server startup). + unsafe extern "C" fn async_hooks_native_construct( + method_ptr: *const u8, + method_len: usize, + args_ptr: *const f64, + args_len: usize, + ) -> f64 { + let method = std::slice::from_raw_parts(method_ptr, method_len); + match method { + b"AsyncLocalStorage" => { + let handle = crate::async_local_storage::js_async_local_storage_new(); + perry_runtime::js_nanbox_pointer(handle) + } + b"AsyncResource" => { + let type_value = if !args_ptr.is_null() && args_len > 0 { + *args_ptr + } else { + TAG_UNDEFINED_F64 + }; + let options = if !args_ptr.is_null() && args_len > 1 { + *args_ptr.add(1) + } else { + TAG_UNDEFINED_F64 + }; + let handle = perry_runtime::async_hooks::js_async_resource_new(type_value, options); + perry_runtime::js_nanbox_pointer(handle) + } + _ => TAG_UNDEFINED_F64, + } + } + perry_runtime::js_set_native_async_hooks_construct(async_hooks_native_construct); super::net_socket_bridge::register_net_socket_handle_probe(); js_register_worker_threads_namespace_getters( 
crate::worker_threads::js_worker_threads_get_worker_data, diff --git a/crates/perry-transform/src/generator/id_scan.rs b/crates/perry-transform/src/generator/id_scan.rs index b552c8b879..01ade2b817 100644 --- a/crates/perry-transform/src/generator/id_scan.rs +++ b/crates/perry-transform/src/generator/id_scan.rs @@ -68,6 +68,27 @@ pub fn compute_max_local_id(module: &Module) -> LocalId { } scan_stmts_for_max_local(&setter.1.body, &mut max_id); } + // Issue #5143 (LocalId parallel): class FIELD initializers and + // computed-key exprs hold closures whose params/body LocalIds live in + // this namespace but are NOT reachable through any method/ctor body. + // `compute_max_func_id` already scans these (the #5143 FuncId fix); the + // LocalId scan was left incomplete, so the generator/async transform + // could synthesize state/done/sent/wrapper LocalIds that COLLIDE with a + // field-init closure's locals and corrupt unrelated codegen (e.g. a + // module-global/capture read resolving to the wrong value — Next.js + // app-page-turbo `()=>X` export getters at scale). A too-high max is + // always safe; a missed id collides. 
+ for field in class.fields.iter().chain(class.static_fields.iter()) { + if let Some(init) = &field.init { + scan_expr_for_max_local(init, &mut max_id); + } + if let Some(key_expr) = &field.key_expr { + scan_expr_for_max_local(key_expr, &mut max_id); + } + } + if let Some(extends_expr) = &class.extends_expr { + scan_expr_for_max_local(extends_expr, &mut max_id); + } } max_id } @@ -535,6 +556,7 @@ mod tests { decorators: Vec::new(), is_exported: false, aliases: Vec::new(), + is_nested: false, } } @@ -569,4 +591,50 @@ mod tests { module.classes.push(class); assert_eq!(compute_max_func_id(&module), 73); } + + /// #5143 (LocalId parallel): a class FIELD-initializer closure's param/body + /// LocalIds must be visible to `compute_max_local_id` too — they were only + /// counted for FuncId, so the generator/async transform could mint a + /// state/done LocalId colliding with a field-init local and corrupt codegen. + #[test] + fn class_field_initializer_locals_visible_to_max_local_id() { + let field = ClassField { + name: "request".to_string(), + key_expr: None, + ty: Type::Any, + init: Some(Expr::Closure { + func_id: 3, + params: vec![Param { + id: 91, + name: "input".to_string(), + ty: Type::Any, + default: None, + decorators: Vec::new(), + is_rest: false, + arguments_object: None, + }], + return_type: Type::Any, + body: vec![Stmt::Return(Some(Expr::LocalGet(91)))], + captures: Vec::new(), + mutable_captures: Vec::new(), + captures_this: false, + captures_new_target: false, + enclosing_class: None, + is_arrow: true, + is_async: false, + is_generator: false, + is_strict: false, + }), + is_private: false, + is_readonly: false, + decorators: Vec::new(), + }; + let mut module = Module::new("test"); + module.classes.push(class_with_fields("Hono", vec![field])); + assert_eq!( + compute_max_local_id(&module), + 91, + "field-initializer closure param LocalId must be counted" + ); + } } diff --git a/crates/perry-transform/src/inline/mod.rs 
b/crates/perry-transform/src/inline/mod.rs index f8956f5f73..3b880fafeb 100644 --- a/crates/perry-transform/src/inline/mod.rs +++ b/crates/perry-transform/src/inline/mod.rs @@ -712,6 +712,7 @@ mod tests { decorators: Vec::new(), is_exported: false, aliases: Vec::new(), + is_nested: false, } } diff --git a/crates/perry/src/commands/compile.rs b/crates/perry/src/commands/compile.rs index 8345ecd78d..6140d24b79 100644 --- a/crates/perry/src/commands/compile.rs +++ b/crates/perry/src/commands/compile.rs @@ -219,6 +219,7 @@ mod tests { computed_members: Vec::new(), decorators: Vec::new(), is_exported: true, + is_nested: false, aliases: Vec::new(), }; let project_root = PathBuf::from("/repo"); @@ -2201,6 +2202,24 @@ pub fn run_with_parse_cache( .filter(|(_, m)| m.init_kind == perry_hir::ModuleInitKind::Deferred) .map(|(_, m)| sanitize_name(&m.name)) .collect(); + // Next.js wall 54 (part 2): `(absolute_path, prefix)` for every + // `.next/server/**` runtime module so the entry's `main` can record + // its `__init` address by path (`js_register_path_init`). Only the + // entry emits these; the runtime `require(absolutePath)` shim then + // triggers the matching module's lazy init on first load. + let nextjs_path_init_modules: Vec<(String, String)> = if is_entry { + ctx.native_modules + .iter() + .filter(|(p, _)| { + self::collect_modules::is_nextjs_runtime_module(p) + }) + .map(|(p, m)| { + (p.to_string_lossy().into_owned(), sanitize_name(&m.name)) + }) + .collect() + } else { + Vec::new() + }; // Issue #753: prefixes of this module's static-import + // re-export source modules (non-entry only — the entry's // body is in `main`, not a `__init`). 
The wrapper at @@ -4078,6 +4097,7 @@ pub fn run_with_parse_cache( .get(path) .cloned() .unwrap_or_default(), + nextjs_path_init_modules, deferred_module_prefixes, module_init_deps, // Issue #842: signal side-effect-only dynamic-import diff --git a/crates/perry/src/commands/compile/cjs_wrap/wrap.rs b/crates/perry/src/commands/compile/cjs_wrap/wrap.rs index 75837b8f3c..ce437fcc9d 100644 --- a/crates/perry/src/commands/compile/cjs_wrap/wrap.rs +++ b/crates/perry/src/commands/compile/cjs_wrap/wrap.rs @@ -787,6 +787,26 @@ pub(in crate::commands::compile) fn wrap_commonjs_with_body_offset( if (typeof specifier !== 'string') throw __perry_cjs_require_error('type', 'ERR_INVALID_ARG_TYPE', 'The "id" argument must be of type string.'); if (specifier === '') throw __perry_cjs_require_error('type', 'ERR_INVALID_ARG_VALUE', 'The argument "id" must be a non-empty string.'); {require_cases} + // Runtime `require(absolutePath.js)` of a module Perry AOT-compiled but + // that is only reachable via a runtime-computed path (Next.js / turbopack + // load page + chunk modules by a manifest path at request time, not a + // static specifier). Resolve it from the path->module registry that each + // compiled module self-registers into at init; `undefined` = not + // registered, fall through to the `.json` read / MODULE_NOT_FOUND throw. + {{ + const __perry_path_mod = __perry_require_path_module(specifier); + if (__perry_path_mod !== undefined) return __perry_path_mod; + }} + // Runtime `require(absolutePath)` of a `.json` file (Next.js loads + // manifests this way: `require(this.middlewareManifestPath)`). Node's + // require reads + JSON.parses `.json` files; the statically-resolved + // cases above only cover specifiers known at compile time, so a path + // computed at runtime falls here. `.json` is pure data (no eval), so we + // read it from disk and parse it. `.js`/`.node` runtime require stays + // unsupported — that would require evaluating arbitrary code. 
+ if ((specifier.charCodeAt(0) === 47 || (specifier.length > 2 && specifier.charCodeAt(1) === 58)) && specifier.slice(-5) === '.json') {{ + return __perry_require_json_disk(specifier); + }} throw __perry_cjs_require_error('error', 'MODULE_NOT_FOUND', "Cannot find module '" + specifier + "'"); }} Object.defineProperty(require, 'name', {{ @@ -814,6 +834,14 @@ pub(in crate::commands::compile) fn wrap_commonjs_with_body_offset( require.main = module;"# ); + // Wall 54: self-register this compiled module's exports under its absolute + // source path so a runtime `require(absolutePath.js)` (turbopack/Next.js + // page+chunk loading) resolves to it. `{:?}` debug-quotes to a valid JS + // string literal. + let path_register = format!( + "__perry_register_path_module({:?}, __cjs_module.exports);", + source_path.to_string_lossy() + ); let wrapped = if let Some(flat_class) = &flat_default_class { // Issue #4933 — flat emission. Drop the IIFE and run the CommonJS body // at ESM module scope: `module.exports = {flat_class}` then resolves to @@ -832,6 +860,7 @@ pub(in crate::commands::compile) fn wrap_commonjs_with_body_offset( {body_for_iife} const _cjs = __cjs_module.exports; +{path_register} export default {flat_class}; export {{ {flat_class} }}; {direct_class_exports} @@ -850,6 +879,7 @@ const _cjs = (function() {{ {body_for_iife} + {path_register} return __cjs_module.exports; }})(); diff --git a/crates/perry/src/commands/compile/collect_modules.rs b/crates/perry/src/commands/compile/collect_modules.rs index 818a4d118d..ea705d059a 100644 --- a/crates/perry/src/commands/compile/collect_modules.rs +++ b/crates/perry/src/commands/compile/collect_modules.rs @@ -56,6 +56,40 @@ use wasm_asset::{is_wasm_asset, synthesize_wasm_stub_module}; const MAX_CROSS_MODULE_INLINE_PRIOR_MODULES: usize = 128; +/// Next.js wall 54 (part 2): recursively gather every `*.js` file under `dir` +/// (page/route loaders + turbopack chunks). 
Symlinks are not followed; errors +/// reading a subdirectory are skipped silently (best-effort discovery). +fn collect_js_files_recursive(dir: &std::path::Path, out: &mut Vec<PathBuf>) { + let Ok(entries) = fs::read_dir(dir) else { + return; + }; + for entry in entries.flatten() { + let path = entry.path(); + let Ok(file_type) = entry.file_type() else { + continue; + }; + if file_type.is_dir() { + collect_js_files_recursive(&path, out); + } else if file_type.is_file() && path.extension().and_then(|e| e.to_str()) == Some("js") { + out.push(path); + } + } +} + +/// Next.js wall 54 (part 2): true for a module discovered under a standalone +/// bundle's `.next/server/**` tree (page/route/chunk modules loaded by a +/// runtime-computed path). Matched by the `.next` then `server` path-component +/// sequence so it never false-matches a user file merely named `next` or a +/// `node_modules/.next-*` package. Used when the entry collects +/// `nextjs_path_init_modules`: these modules stay Deferred, and the entry +/// registers each `__init` by path so the first runtime `require` inits it. 
+pub(super) fn is_nextjs_runtime_module(path: &std::path::Path) -> bool { + let comps: Vec<&std::ffi::OsStr> = path.components().map(|c| c.as_os_str()).collect(); + comps + .windows(2) + .any(|w| w[0] == std::ffi::OsStr::new(".next") && w[1] == std::ffi::OsStr::new("server")) +} + /// Collect all modules to compile (transitive closure of imports) pub(super) fn collect_modules( entry_path: &PathBuf, @@ -70,6 +104,37 @@ pub(super) fn collect_modules( ) -> Result<()> { let mut states: HashMap = HashMap::new(); let mut stack = vec![WorkFrame::Enter(entry_path.clone())]; + // Next.js wall 54 (part 2): a standalone `server.js` loads its page, route, + // and turbopack chunk modules from `/.next/server/**` by a path + // computed at request time (`require(getPagePath(...))`, turbopack + // `R.c("chunkpath")`) — never via a static `import`/`require` literal — so + // the import walk below never reaches them and they would not be AOT + // compiled. Seed every `.next/server/**/*.js` file as an additional root so + // each compiles natively and self-registers under its absolute path (see + // cjs_wrap `__perry_register_path_module`), letting the runtime + // `require(absolutePath)` resolve it. Detected only when the entry sits next + // to a `.next/server` directory (a Next.js standalone bundle). + if let Some(entry_dir) = entry_path.parent() { + let next_server_dir = entry_dir.join(".next").join("server"); + if next_server_dir.is_dir() { + let mut next_js_files = Vec::new(); + collect_js_files_recursive(&next_server_dir, &mut next_js_files); + if !next_js_files.is_empty() { + if matches!(format, OutputFormat::Text) { + println!( + "Next.js standalone: discovered {} runtime module(s) under {}", + next_js_files.len(), + next_server_dir.display() + ); + } + // Push after the entry so the entry is processed first; order + // among the discovered files does not matter (the walk dedups). 
+ for f in next_js_files { + stack.push(WorkFrame::Enter(f)); + } + } + } + } while let Some(frame) = stack.pop() { match frame { WorkFrame::Enter(next_path) => { @@ -177,7 +242,14 @@ fn collect_module_one( // and synthesize a throwing-stub module (see the wasm branch below). Real // `.wasm` ESM instantiation is the companion issue #5234. let is_wasm = is_wasm_asset(&canonical); - let is_in_node_modules = canonical.to_string_lossy().contains("node_modules"); + // Match a real `node_modules/` directory COMPONENT, not a substring: a + // file whose NAME contains "node_modules" (e.g. turbopack's bundled chunks + // `.next/server/chunks/ssr/node_modules_next_dist_…._.js`) is NOT in + // node_modules and must compile natively, not get force-routed to the + // (removed) JS runtime. (Next.js wall 54.) + let is_in_node_modules = canonical + .components() + .any(|c| c.as_os_str() == "node_modules"); let is_perry_native = is_in_node_modules && is_in_perry_native_package(&canonical); let is_in_compiled_pkg = (is_in_node_modules && is_in_compile_package(&canonical, &ctx.compile_packages)) || ctx.compile_package_dirs.values().any(|dir| { diff --git a/crates/perry/src/commands/compile/init_order.rs b/crates/perry/src/commands/compile/init_order.rs index 106544e94a..5c1324ccc8 100644 --- a/crates/perry/src/commands/compile/init_order.rs +++ b/crates/perry/src/commands/compile/init_order.rs @@ -34,6 +34,13 @@ use super::CompilationContext; pub(super) fn classify_eager_modules(ctx: &mut CompilationContext, entry_path: &Path) { let mut eager: HashSet = HashSet::new(); eager.insert(entry_path.to_path_buf()); + // Next.js wall 54 (part 2): the `.next/server/**` page/route/chunk modules + // have no static importer (loaded by a runtime-computed path). They are left + // Deferred — eager-initing turbopack chunks at startup runs React-SSR code + // before the server is ready. 
Instead, the entry registers each module's + // `__init` address by path (`js_register_path_init`), so the first runtime + // `require(absolutePath)` (turbopack `R.c()` / `require(getPagePath(...))`) + // triggers init lazily and in dependency order. loop { let mut changed = false; let paths: Vec = ctx.native_modules.keys().cloned().collect(); diff --git a/crates/perry/src/commands/compile/object_cache.rs b/crates/perry/src/commands/compile/object_cache.rs index ade2ac6ce8..eaa7d8fc92 100644 --- a/crates/perry/src/commands/compile/object_cache.rs +++ b/crates/perry/src/commands/compile/object_cache.rs @@ -1021,6 +1021,7 @@ mod object_cache_tests { target: Some("aarch64-apple-darwin".to_string()), is_entry_module: false, non_entry_module_prefixes: Vec::new(), + nextjs_path_init_modules: Vec::new(), import_function_prefixes: std::collections::HashMap::new(), import_function_origin_names: std::collections::HashMap::new(), import_function_v8_specifiers: std::collections::HashMap::new(), diff --git a/isoA b/isoA new file mode 100755 index 0000000000..8721139e47 Binary files /dev/null and b/isoA differ diff --git a/scripts/check_file_size.sh b/scripts/check_file_size.sh index 171e47c72d..da211ff0d6 100755 --- a/scripts/check_file_size.sh +++ b/scripts/check_file_size.sh @@ -319,6 +319,16 @@ crates/perry-codegen/tests/native_proof_regressions.rs # `build_optimized_libs` driver + well-known resolution. Splitting the # freshness/well-known helpers into a sibling module is tracked under #1435. crates/perry/src/commands/compile/optimized_libs.rs +# Next.js app-router bring-up (PR #5438 / umbrella #793): these three crossed +# the gate from the wall-fix additions (HIR destructuring var-decl handling, +# `new ()`/anon-class lowering, and the stdlib FFI decl table) layered +# on top of main's recent growth — each was at/just under 2000 on main +# (2000/1991/1968) and is now a few-to-~90 LOC over. 
Topical split (by +# destructuring-pattern family / new-callee shape / FFI namespace) is a +# reasonable follow-up, deferred to keep the wall-fix PR focused. +crates/perry-codegen/src/runtime_decls/stdlib_ffi.rs +crates/perry-hir/src/destructuring/var_decl.rs +crates/perry-hir/src/lower/expr_new.rs EOF )