diff --git a/TYPE_LOWERING.md b/TYPE_LOWERING.md
new file mode 100644
index 0000000000..9cbd6992fb
--- /dev/null
+++ b/TYPE_LOWERING.md
@@ -0,0 +1,1241 @@
+# Perry: Type Lowering & Native Runtime Support — Full Findings & Gaps
+
+---
+
+## 1. Type Lowering Pipeline
+
+Perry's type system flows from TypeScript annotations through HIR to native code. Types are **erased** before final machine code, but they drive optimization decisions throughout the pipeline.
+
+### HIR Type Representation
+
+The `LoweringContext` in `perry-hir` infers types during AST→HIR lowering via `infer_type_from_expr`:
+
+| TypeScript Type | HIR Type | Runtime Representation |
+|---|---|---|
+| `number` | `Type::Number` | Raw `f64` (IEEE 754 double) |
+| `string` | `Type::String` | Pointer to `StringHeader` (NaN-boxed `STRING_TAG 0x7FFF`) |
+| `boolean` | `Type::Boolean` | `TAG_TRUE/TAG_FALSE` singletons |
+| `bigint` | `Type::BigInt` | Pointer to `BigIntHeader` (`BIGINT_TAG 0x7FFA`) |
+| `class T` | `Type::Named(name)` | Pointer to `ObjectHeader` with `class_id` (`POINTER_TAG 0x7FFD`) |
+| `any` / `unknown` | `Type::Any` | Dynamic NaN-boxed `f64` |
+| `T[]` | `Type::Array(elem)` | Pointer to `ArrayHeader` |
+| `i32` (inferred) | `Type::Int32` | Parallel `i32` alloca slot | [1](#0-0) [2](#0-1)
+
+### Generics: Monomorphization
+
+Perry implements generics via monomorphization — each unique type instantiation produces a specialized function/class with mangled names (e.g., `identity$number`). The `MonomorphizationContext` uses work queues to recursively specialize dependencies. [3](#0-2)
+
+---
+
+## 2. NaN-Boxing: The Universal Value Representation
+
+All JS values are represented as 64-bit `f64` (`JSValue`). The top 16 bits encode the type tag; the bottom 48 bits carry the payload (pointer, integer, or SSO data).
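The tag/payload split described here can be sketched in a few lines of Rust. This is a minimal illustration, not Perry's implementation: the helper names (`box_tagged`, `tag_of`, `payload_of`, `is_plain_number`) and the `TAG_SHIFT` / `PAYLOAD_MASK` constants are hypothetical, the two tag values are the ones quoted in the type table, and the `0x7FF9..=0x7FFF` band is taken from the `is_number` comment in the runtime excerpt cited later in this document.

```rust
// Hypothetical helpers for illustration only: a 16-bit tag in the top
// bits and a 48-bit payload in the low bits of one 64-bit word.
const TAG_SHIFT: u32 = 48;
const PAYLOAD_MASK: u64 = (1u64 << TAG_SHIFT) - 1;
const POINTER_TAG: u64 = 0x7FFD; // value quoted in the type table
const STRING_TAG: u64 = 0x7FFF; // value quoted in the type table

/// Pack a 16-bit tag and a 48-bit payload into one 64-bit word.
fn box_tagged(tag: u64, payload: u64) -> u64 {
    (tag << TAG_SHIFT) | (payload & PAYLOAD_MASK)
}

/// Extract the top 16 bits (the type tag).
fn tag_of(bits: u64) -> u64 {
    bits >> TAG_SHIFT
}

/// Extract the bottom 48 bits (pointer, integer, or inline bytes).
fn payload_of(bits: u64) -> u64 {
    bits & PAYLOAD_MASK
}

/// Assumption: a plain number is any word whose top 16 bits fall
/// outside the 0x7FF9..=0x7FFF tag band.
fn is_plain_number(bits: u64) -> bool {
    !(0x7FF9..=0x7FFF).contains(&tag_of(bits))
}

fn main() {
    let boxed = box_tagged(POINTER_TAG, 0x1234_5678);
    assert_eq!(tag_of(boxed), POINTER_TAG);
    assert_eq!(payload_of(boxed), 0x1234_5678);
    // Reinterpreted as an f64, a tagged word is a NaN, which is why
    // ordinary doubles and tagged values can share one representation.
    assert!(f64::from_bits(boxed).is_nan());
    assert!(is_plain_number(1.5f64.to_bits()));
    assert!(!is_plain_number(box_tagged(STRING_TAG, 0)));
}
```

Keeping the tag in the exponent and high mantissa bits means an untagged `f64` needs no decoding at all, while a tagged value costs one shift or mask to unpack.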
+
+```
+Bit 63:     Sign (always 0 for tagged values)
+Bits 62-48: Type tag
+Bits 47-0:  Payload (pointer / integer / SSO bytes)
+```
+
+| Tag Constant | Value | Meaning |
+|---|---|---|
+| `TAG_UNDEFINED` | `0x7FFC_0000_0000_0001` | `undefined` singleton |
+| `TAG_NULL` | `0x7FFC_0000_0000_0002` | `null` singleton |
+| `TAG_FALSE` | `0x7FFC_0000_0000_0003` | `false` |
+| `TAG_TRUE` | `0x7FFC_0000_0000_0004` | `true` |
+| `TAG_HOLE` | `0x7FFC_0000_0000_0010` | Sparse array sentinel |
+| `POINTER_TAG` | `0x7FFD` | Object/Array/Symbol heap pointer |
+| `INT32_TAG` | `0x7FFE` | 32-bit signed integer |
+| `STRING_TAG` | `0x7FFF` | Heap `StringHeader` pointer |
+| `SHORT_STRING_TAG` | `0x7FF9` | SSO: ≤5 bytes inline in payload |
+| `BIGINT_TAG` | `0x7FFA` | Heap `BigIntHeader` pointer |
+| `JS_HANDLE_TAG` | `0x7FFB` | Handle into V8/QuickJS heap | [4](#0-3)
+
+### Codegen Fast Paths from Types
+
+When the compiler knows a value's type statically, it bypasses the full NaN-boxing overhead:
+
+- **i32 fast path**: Locals proven to be integer-valued (via `collect_integer_locals`, `collect_strictly_i32_bounded_locals`) get a parallel `i32` alloca slot. Loop counters, bitwise ops, and `| 0` coercions qualify. This eliminates `fptosi/sitofp` round-trips per iteration.
+- **Bounds elimination**: `for (let i = 0; i < arr.length; i++) arr[i]` — the compiler caches `arr.length` once and records `(i, arr)` in `bounded_index_pairs`, emitting raw `getelementptr + load` without runtime bounds checks.
+- **Integer modulo**: `%` on provably-integer operands emits `fptosi → srem → sitofp` instead of `fmod` (a libm call on ARM — ~30ns vs ~1 cycle).
+- **Inline `.length`**: `PropertyGet` for `.length` on arrays/strings unboxes the pointer and loads from offset 0 directly.
+- **Numeric class fields**: `this.value + 1` where `value: number` skips `js_number_coerce` wrapping, enabling LLVM GVN/LICM.
+- **Scalar replacement**: Non-escaping object literals, array literals, and `new` expressions are decomposed into per-field stack allocas — zero heap allocation. [5](#0-4) [6](#0-5) [7](#0-6) [8](#0-7)
+
+---
+
+## 3. Runtime Built-in Type Support
+
+### String (`StringHeader`)
+
+- UTF-8 (WTF-8 for lone surrogates) heap-allocated with `utf16_len`, `byte_len`, `capacity`, `refcount`, `flags`.
+- **SSO**: strings ≤5 bytes encoded inline in the NaN-box payload — no heap allocation.
+- **In-place append**: `refcount == 1` enables O(n) amortized `js_string_append` instead of always-allocating `js_string_concat`.
+- **Chain optimization**: `a + b + c` collapses to `js_string_concat_chain` (single allocation).
+- SIMD-optimized operations (NEON/SSE2 for string scanning). [9](#0-8)
+
+### Array (`ArrayHeader`)
+
+- Inline elements (NaN-boxed `f64`) follow the header in memory.
+- `length` and `capacity` at fixed offsets for inline codegen.
+- Numeric arrays can be "downgraded" to typed `f64[]` for SIMD vectorization.
+
+### BigInt (`BigIntHeader`)
+
+- 1024-bit (16 × `u64` limbs, little-endian). Sized for secp256k1 intermediate products.
+- Allocated from arena bump allocator (not `gc_malloc`) for lower overhead. [10](#0-9)
+
+### Map / Set
+
+- `MapHeader` + `SetHeader` with side-table indices: `MAP_INDEX` (numeric keys), `MAP_STRING_INDEX` (FNV-1a content hashes for GC-safe string lookup), `SET_INDEX`.
+- O(1) average lookup; content-based equality for strings.
+
+### Buffer (`BufferHeader`)
+
+- Layout matches `ArrayHeader` (length at offset 0, capacity at offset 4).
+- Small buffer slab allocator for buffers < 256 bytes.
+- `BUFFER_REGISTRY`, `ARRAY_BUFFER_REGISTRY`, `BUFFER_AB_ALIAS` for `instanceof` and aliasing checks.
+
+### Date
+
+- Stored as raw `f64` timestamp. `DATE_REGISTRY` tracks bit patterns for `instanceof Date`. Invalid Date = `DATE_NAN_BITS` (`0x7FF8_0000_0000_0DA7`). [11](#0-10)
+
+### Symbol
+
+- `SymbolHeader` allocated on heap, tagged with `POINTER_TAG`. `Symbol.for` / `Symbol.keyFor` supported via a global registry.
+
+### RegExp (`RegExpHeader`)
+
+- Backed by Rust's `regex` crate. Stores compiled `Regex`, original pattern/flags, and `last_index` for stateful execution. [12](#0-11)
+
+---
+
+## 4. Object Model & Dynamic Dispatch
+
+### `ObjectHeader` Layout
+
+Every heap object has: `object_type` (u32), `class_id` (u32), `field_count` (u32), `keys_array` pointer. Inline property slots follow immediately in memory.
+
+- **Shape caching**: Objects with the same key set share a `keys_array` pointer.
+- **`KEYS_INDEX`**: FNV-1a hash map built when `keys_array.length > 32` for O(1) lookup.
+- **`OVERFLOW_FIELDS`**: TLS `PtrHashMap>` for dynamically-grown objects. [13](#0-12)
+
+### VTable / Dynamic Dispatch
+
+`CLASS_VTABLE_REGISTRY` maps `class_id` → `ClassVTable` (method name → function pointer). `js_native_call_method` is the dispatch entry point:
+
+1. `JS_HANDLE_TAG` → V8/QuickJS bridge
+2. Class object → `js_class_static_method_call`
+3. VTable lookup → direct `func_ptr` call
+4. Prototype objects → `CLASS_PROTOTYPE_OBJECTS` synthetic class IDs [14](#0-13)
+
+---
+
+## 5. GC & Memory Management
+
+### Dual-Track Allocation
+
+| Track | Types | Strategy |
+|---|---|---|
+| Arena (bump-pointer) | `GC_TYPE_ARRAY`, `GC_TYPE_OBJECT`, `GC_TYPE_LAZY_ARRAY` | 1 MB thread-local blocks, linear walk for discovery |
+| Malloc (mimalloc) | `GC_TYPE_STRING`, `GC_TYPE_CLOSURE`, `GC_TYPE_PROMISE`, `GC_TYPE_MAP` | Tracked in `MALLOC_STATE` |
+
+Every allocation is preceded by an 8-byte `GcHeader`: `obj_type` (u8), `gc_flags` (u8), `_reserved` (u16), `size` (u32). [15](#0-14)
+
+### Mark-Sweep Collector
+
+- **Mark**: precise shadow stack roots + `MALLOC_STATE` + conservative C-stack scan (any bit pattern matching a heap address is treated as a root → "pinned").
+- **Sweep**: malloc objects without mark bit are freed; arena blocks without live objects are reset.
+- **Write barriers**: emitted by codegen for property/array stores to track old→young references. [16](#0-15) [17](#0-16)
+
+---
+
+## 6. Closures, Async, & Event Loop
+
+### Closures
+
+`ClosureHeader`: `func_ptr` (usize), `capture_count` (u32, high bit = `CAPTURES_THIS_FLAG`), `type_tag` (`CLOSURE_MAGIC 0x434C_4F53`), variadic `captures[]` (u64 slots). Mutable captures are heap-boxed. Side-tables: `CLOSURE_REST_REGISTRY`, `CLOSURE_ARITY_REGISTRY`, `DISPATCH_CACHE`.
+
+### Async/Await
+
+Lowered in two passes:
+1. `transform_async_to_generator`: `await` → `yield`, marks `is_generator = true`, `was_plain_async = true`.
+2. `transform_generators`: converts to a `while(true)` + `if (__state === N)` state machine. [18](#0-17)
+
+### Promise & Microtask Queue
+
+`js_promise_run_microtasks` drains the microtask queue. Uses `setjmp` to catch throws from callbacks and reject the chained promise without exiting the loop. `MT_STEP_CHAIN_REUSE_HIT` optimization avoids fresh Promise allocations during `await` chains. [19](#0-18)
+
+### Async Bridge (Rust Futures → TS Promises)
+
+Tokio worker threads cannot allocate JS objects (thread-local arenas). Results go through `PENDING_DEFERRED` with a `converter` closure that runs on the main thread. Promises are pinned (`GC_FLAG_PINNED`) while a tokio worker holds them. [20](#0-19)
+
+### Event Loop
+
+`js_wait_for_event` blocks on a `Condvar` until a timer deadline or `js_notify_main_thread` signal. Adaptive spin-throttle prevents 100% CPU on past-deadline timers.
+
+### Threading
+
+Shared-nothing: each thread has its own arena + GC. Values cross boundaries via `SerializedValue` (deep copy). `parallelMap`, `parallelFilter`, `spawn`. No `SharedArrayBuffer` or `Atomics`. [21](#0-20)
+
+---
+
+## 7. JS Interop Escape Hatch
+
+When Perry cannot compile a module natively, `--enable-js-runtime` embeds V8/QuickJS. JS objects are represented as `JS_HANDLE_TAG` NaN-boxed values. `JS_HANDLE_CALL_METHOD`, `JS_HANDLE_ARRAY_GET`, `JS_HANDLE_OBJECT_GET_PROPERTY` are function pointers registered by the JS runtime bridge. [22](#0-21)
+
+---
+
+## 8. Gaps in the AOT Runtime
+
+The following are confirmed gaps, stubs, or architectural limitations for a **complete** AOT TypeScript runtime:
+
+### A. Weak Reference Semantics (Stub)
+
+`WeakRef`, `WeakMap`, `WeakSet`, and `FinalizationRegistry` expose the correct API shape but are **not GC-accurate**. `WeakRef` holds a **strong** reference internally. `FinalizationRegistry` records registrations but **never fires cleanup callbacks**. The GC's mark phase does not track weak references. [23](#0-22)
+
+### B. `AsyncLocalStorage` / `async_hooks` — Partial
+
+`AsyncLocalStorage` and `async_hooks.createHook` have native runtime implementations, but CLAUDE.md explicitly flags `#788` (real `AsyncLocalStorage` tracking across `await`/microtasks/timers) and `#789` (real `async_hooks.createHook` lifecycle + asyncId) as open issues — today these are described as "name-only stubs" for the full lifecycle semantics. [24](#0-23)
+
+### C. `Proxy` / `Reflect` — Not Supported
+
+`Proxy` is not a full engine-level trap layer. `Reflect.metadata` and general `Reflect` API calls outside decorator syntax are unsupported. `Object.setPrototypeOf` is modeled as a no-op (Perry's class IDs are baked at allocation time). [25](#0-24)
+
+### D. `eval()` / `new Function()` / Dynamic `import()` — Not Supported
+
+AOT compilation is fundamentally incompatible with runtime code generation. Dynamic `require()` and `await import()` are also unsupported; only static ESM imports are allowed. [26](#0-25)
+
+### E. `SharedArrayBuffer` / `Atomics` — Not Supported
+
+Perry's shared-nothing threading model (deep-copy across boundaries) is architecturally incompatible with `SharedArrayBuffer`. No `Atomics` support. [27](#0-26)
+
+### F. Regex Lookbehind — Categorical Gap
+
+Rust's `regex` crate does not support lookbehind assertions (`(?<=)` / `(?<!)`) […]
+
+```rust
+[…] -> Type {
+    match expr {
+        // Literals
+        ast::Expr::Lit(lit) => match lit {
+            ast::Lit::Num(_) => Type::Number,
+            ast::Lit::Str(_) => Type::String,
+            ast::Lit::Bool(_) => Type::Boolean,
+            ast::Lit::BigInt(_) => Type::BigInt,
+            ast::Lit::Null(_) => Type::Null,
+            ast::Lit::Regex(_) => Type::Named("RegExp".to_string()),
+            _ => Type::Any,
+        },
+
+        // Template literals are always strings
+        ast::Expr::Tpl(_) => Type::String,
+
+        // Array literals → infer element type from first element
+        ast::Expr::Array(arr) => {
+            let elem_ty = arr
+                .elems
+                .iter()
+                .find_map(|e| e.as_ref().map(|elem| infer_type_from_expr(&elem.expr, ctx)))
+                .unwrap_or(Type::Any);
+            Type::Array(Box::new(elem_ty))
+        }
+
+        // Variable reference → look up known type
+        ast::Expr::Ident(ident) => {
+            let name = ident.sym.as_ref();
+            ctx.lookup_local_type(name).cloned().unwrap_or(Type::Any)
+        }
+
+        // Binary operators
+        ast::Expr::Bin(bin) => {
+            use ast::BinaryOp::*;
+            match bin.op {
+                // Comparison/equality operators always return boolean
+                EqEq | NotEq | EqEqEq | NotEqEq | Lt | LtEq | Gt | GtEq | In | InstanceOf => {
+                    Type::Boolean
+                }
+
+                // Addition: string if either side is string, else number if both number
+                Add => {
+                    let left = infer_type_from_expr(&bin.left, ctx);
+                    let right = infer_type_from_expr(&bin.right, ctx);
+                    if matches!(left, Type::String) || matches!(right, Type::String) {
+                        Type::String
+```
+
+**File:** crates/perry-codegen/src/type_analysis.rs (L589-674)
+```rust
+pub(crate) fn is_numeric_expr(ctx: &FnCtx<'_>, e: &Expr) -> bool {
+    match e {
+        Expr::Integer(_) | Expr::Number(_) => true,
+        Expr::Uint8ArrayGet { .. }
+        | Expr::BufferIndexGet { .. }
+        | Expr::Uint8ArrayLength(_)
+        | Expr::BufferLength(_) => true,
+        Expr::LocalGet(id) => matches!(
+            ctx.local_types.get(id),
+            Some(HirType::Number) | Some(HirType::Int32)
+        ),
+        // NOTE: Expr::Compare is NOT numeric — it produces a NaN-boxed
+        // TAG_TRUE/TAG_FALSE which `fcmp one cond, 0.0` would handle
+        // incorrectly (NaN compared with 0.0 is unordered → false).
+        // Comparisons go through the slow path (js_is_truthy) which
+        // dispatches on the NaN tag.
+        //
+        // For Add: only numeric when BOTH operands are statically
+        // numeric (otherwise it could be string concatenation). The
+        // recursive check is critical for nested arithmetic like
+        // `sum + p.x + p.y` which parses as `((sum + p.x) + p.y)` —
+        // the inner Add must be recognized as numeric for the outer
+        // Add to also be numeric, otherwise the outer one wraps the
+        // inner result in `js_number_coerce` and prevents LLVM from
+        // doing GVN/LICM on the chain.
+        Expr::Binary {
+            op: BinaryOp::Add,
+            left,
+            right,
+        } => is_numeric_expr(ctx, left) && is_numeric_expr(ctx, right),
+        Expr::Binary { op, .. } => !matches!(op, BinaryOp::Add),
+        Expr::Update { .. } => true,
+        Expr::DateNow => true,
+        // `obj.field` where the field is declared as `number` on the
+        // owning class. Without this, `this.value + 1` in a hot loop
+        // wraps the field load in `js_number_coerce` which prevents
+        // LLVM from doing GVN/LICM on the load. The class field
+        // walker matches `class_field_global_index`'s inheritance
+        // traversal so the type of any inherited field is also seen.
+        Expr::PropertyGet { object, property } => {
+            let Some(owner_class_name) = receiver_class_name(ctx, object) else {
+                return false;
+            };
+            let mut current = ctx.classes.get(owner_class_name.as_str()).copied();
+            while let Some(cls) = current {
+                if let Some(f) = cls.fields.iter().find(|f| f.name == *property) {
+                    return matches!(f.ty, HirType::Number | HirType::Int32);
+                }
+                current = cls
+                    .extends_name
+                    .as_deref()
+                    .and_then(|p| ctx.classes.get(p).copied());
+            }
+            false
+        }
+        // `arr[i]` where `arr` is statically `number[]` / `Int32[]`.
+        // Without this, `sum + arr[i]` in a hot loop wraps the element
+        // load in `js_number_coerce` which blocks LLVM's vectorizer
+        // and adds a function call per iteration.
+        Expr::IndexGet { object, .. } => {
+            let Expr::LocalGet(arr_id) = object.as_ref() else {
+                return false;
+            };
+            match ctx.local_types.get(arr_id) {
+                Some(HirType::Array(elem)) => {
+                    matches!(**elem, HirType::Number | HirType::Int32)
+                }
+                _ => false,
+            }
+        }
+        // User function calls returning Number: skip js_number_coerce.
+        // Without this, `fib(n-1) + fib(n-2)` wraps both results in
+        // js_number_coerce — ~4 billion wasted runtime calls on fib(40).
+        Expr::Call { callee, .. } => {
+            if let Expr::FuncRef(fid) = callee.as_ref() {
+                ctx.func_signatures
+                    .get(fid)
+                    .map(|(_, _, returns_number)| *returns_number)
+                    .unwrap_or(false)
+            } else {
+                false
+            }
+        }
+        _ => false,
+    }
+}
+```
+
+**File:** crates/perry-runtime/src/value/jsvalue.rs (L26-103)
+```rust
+    }
+
+    /// Create a boolean value
+    #[inline]
+    pub const fn bool(value: bool) -> Self {
+        Self {
+            bits: if value { TAG_TRUE } else { TAG_FALSE },
+        }
+    }
+
+    /// Create an f64 number value
+    #[inline]
+    pub fn number(value: f64) -> Self {
+        // Just reinterpret the bits - f64 values are stored directly
+        Self {
+            bits: value.to_bits(),
+        }
+    }
+
+    /// Create an i32 value (stored in payload, faster than f64 for integers)
+    #[inline]
+    pub const fn int32(value: i32) -> Self {
+        Self {
+            bits: INT32_TAG | ((value as u32) as u64),
+        }
+    }
+
+    /// Create a pointer value (for heap-allocated objects)
+    #[inline]
+    pub fn pointer(ptr: *const u8) -> Self {
+        debug_assert!(
+            (ptr as u64) <= POINTER_MASK,
+            "Pointer too large for NaN-boxing"
+        );
+        Self {
+            bits: POINTER_TAG | (ptr as u64 & POINTER_MASK),
+        }
+    }
+
+    /// Check if this is a number (not a tagged value)
+    #[inline]
+    pub fn is_number(&self) -> bool {
+        // Perry-owned tags occupy the positive qNaN band 0x7FF9..=0x7FFF.
+        // Keep IEEE f64 values, including canonical qNaN 0x7FF8 and negative
+        // NaN payloads, classified as numbers.
+        let tag = self.bits & TAG_MASK;
+        !(SHORT_STRING_TAG..=STRING_TAG).contains(&tag)
+    }
+
+    /// Check if this is undefined
+    #[inline]
+    pub fn is_undefined(&self) -> bool {
+        self.bits == TAG_UNDEFINED
+    }
+
+    /// Check if this is null
+    #[inline]
+    pub fn is_null(&self) -> bool {
+        self.bits == TAG_NULL
+    }
+
+    /// Check if this is a boolean
+    #[inline]
+    pub fn is_bool(&self) -> bool {
+        self.bits == TAG_TRUE || self.bits == TAG_FALSE
+    }
+
+    /// Check if this is an int32
+    #[inline]
+    pub fn is_int32(&self) -> bool {
+        (self.bits & !INT32_MASK) == INT32_TAG
+    }
+
+    /// Check if this is a pointer (object or array)
+    #[inline]
+    pub fn is_pointer(&self) -> bool {
+        (self.bits & !POINTER_MASK) == POINTER_TAG
+    }
+```
+
+**File:** crates/perry-runtime/src/value/jsvalue.rs (L106-130)
+```rust
+    /// (STRING_TAG only — inline SSO values return false). This is
+    /// the legacy predicate that most call sites rely on: they
+    /// follow `is_string()` with `as_string_ptr()` assuming a real
+    /// `*mut StringHeader`. Keeping this strict avoids a massive
+    /// audit during the SSO rollout; use `is_any_string()` when
+    /// you want to accept both representations.
+    ///
+    /// ⚠ #1781 footgun — do NOT write
+    /// `if v.is_string() { /* read ptr */ } else { /* treat as pointer
+    /// / number / array */ }`. An inline SSO short string (len 0..=5,
+    /// `SHORT_STRING_TAG = 0x7FF9`) fails this STRICT check and falls into
+    /// the else-branch, where its payload bytes get masked to 48 bits and
+    /// dereferenced (SIGSEGV — the fault address spells the string) or
+    /// silently produce a wrong result. This blind spot has been patched
+    /// piecemeal at least five times (Buffer.from, querystring, str.replace,
+    /// js_is_truthy, the #1781 batch). When a value can be *any* runtime
+    /// string, branch on [`is_any_string`](Self::is_any_string) +
+    /// [`is_short_string`](Self::is_short_string) (decode via
+    /// [`short_string_to_buf`](Self::short_string_to_buf)), or route the
+    /// whole value through `js_get_string_pointer_unified`, which
+    /// materializes SSO bytes onto the heap so downstream `*StringHeader`
+    /// code is unchanged. Reading keys out of a `keys_array` is the one
+    /// safe exception: stored keys are always heap `STRING_TAG`.
+    #[inline]
+    pub fn is_string(&self) -> bool {
+```
+
+**File:** crates/perry-codegen/src/expr/mod.rs (L495-516)
+```rust
+    /// where `(i, arr)` is in the set, the IndexSet skips its
+    /// runtime bound check + cap check + realloc fallback entirely
+    /// and emits a single inline-store sequence.
+    ///
+    /// The for-loop guarantees `i < arr.length` is true at the cond
+    /// check, and `stmt_preserves_array_length` already proved the
+    /// body can't change `arr.length` or reassign `i`, so the
+    /// IndexSet site can rely on `i < arr.length` without rechecking.
+    pub bounded_index_pairs: Vec,
+
+    /// Parallel i32 counter slots for integer loop counters that are
+    /// used as bounded array indices. When a for-loop counter is in
+    /// `integer_locals` AND appears in `bounded_index_pairs`, `lower_for`
+    /// allocates a parallel i32 alloca tracked here. The `Expr::Update`
+    /// lowering increments the i32 slot alongside the normal double slot,
+    /// and the IndexGet/IndexSet bounded fast-path loads the i32 directly
+    /// instead of emitting a `fptosi double → i32` on every iteration.
+    ///
+    /// Eliminates ~3 cycles per iteration on M-series (fcvtzs latency)
+    /// on hot array-walking loops like `for (let i = 0; i < arr.length;
+    /// i++) arr[i] = expr`.
+    pub i32_counter_slots: std::collections::HashMap,
+```
+
+**File:** crates/perry-codegen/src/stmt/let_stmt.rs (L612-682)
+```rust
+            // Int32 specialization (issue #48): if this local qualifies as
+            // integer-valued (all writes are `| 0` / `>>> 0` / bitwise / int
+            // literal / ++/--), allocate a parallel i32 slot. Update/LocalSet
+            // mirror writes to it; IndexGet and hot-loop consumers prefer it
+            // over the double slot — skipping the `fadd → fcvtzs → scvtf`
+            // round-trip per iteration of `sum = (sum + i) | 0`.
+            //
+            // Only fire on `mutable` locals: an immutable `const SEED = 0xDEAD_BEEF`
+            // never benefits from i32 specialization (no per-iteration cost), and
+            // its initializer may legitimately exceed i32 range (e.g. 0x9E3779B9
+            // = 2654435769 > INT32_MAX) — fptosi'ing it saturates to INT32_MAX
+            // and silently corrupts every read of the i32 slot. Mutable locals
+            // are always written through paths we control (Update, `(expr) | 0`)
+            // which produce in-range int32 values per JS ToInt32 semantics.
+            let init_in_i32_range = match init {
+                Some(perry_hir::Expr::Integer(n)) => i32::try_from(*n).is_ok(),
+                _ => true, // non-Integer init: writes will always go via i32-coercing paths
+            };
+            // Issue #140 follow-up + #435 fix: gate the Let-site i32
+            // shadow on `index_used_locals` (with transitive closure —
+            // see `collect_index_used_locals` in collectors.rs). The
+            // original v0.5.164 gate dropped the shadow for image-
+            // convolution's transitively-index-used locals (`xx → idx
+            // → array[idx]`) because the analysis was direct-only; the
+            // comment said dropping the gate was "fine" because
+            // `is_int32_producing_expr` would keep the right locals
+            // off the shadow path. That claim was wrong:
+            // `is_int32_producing_expr` accepts `Add | Sub | Mul`
+            // over int-stable operands, so pure accumulators like
+            // `let sum = 0; for (...) sum = sum + compute(i)` (the
+            // canonical 14_closure shape) ended up with an i32 shadow
+            // whose reads truncated 64-bit sums to 32-bit signed
+            // integers — silent-correctness bug, exit 0, no
+            // diagnostics. The gate-with-transitive-closure restores
+            // both invariants: image_conv's chain stays on the i32
+            // path (xx is transitively index-used through idx), and
+            // accumulators that never reach an array index stay off
+            // it.
+            //
+            // Drop the `*mutable` gate: immutable integer-stable Lets
+            // also benefit from an i32 shadow when they participate in
+            // an integer-arithmetic chain (`const row = yy * W;` then
+            // `idx = (row + xx) * 3` in a hot inner loop). The
+            // saturation concern in the original v0.5.164 comment was
+            // about `const SEED = 0x9E3779B9 >>> 0` whose value
+            // exceeds INT32_MAX — but that's a u32 (`>>> 0`), and
+            // `>>> 0` is intentionally not seeded into signed integer_locals
+            // (see collect_integer_let_ids). Mutable u32 recurrences are handled
+            // separately through unsigned_i32_locals so ordinary JS reads use
+            // `uitofp` instead of signed `sitofp`.
+            // (Issue #436) Allow the i32 fast path when the local is
+            // either index-used (existing #435 path) OR
+            // strictly-i32-bounded by every write (new path that
+            // recovers the FNV-1a `h` accumulator and similar
+            // explicit-i32-coerce shapes without reintroducing #435's
+            // accumulator overflow).
+            let is_unsigned_i32_local = ctx.unsigned_i32_locals.contains(&id);
+            let i32_safe_local = ctx.index_used_locals.contains(&id)
+                || ctx.strictly_i32_bounded_locals.contains(&id)
+                || is_unsigned_i32_local;
+            let needs_i32_slot = (ctx.integer_locals.contains(&id) || is_unsigned_i32_local)
+                && i32_safe_local
+                && init_in_i32_range
+                && !ctx.boxed_vars.contains(&id)
+                && !ctx.module_globals.contains_key(&id)
+                && !ctx.i32_counter_slots.contains_key(&id);
+            if needs_i32_slot {
+                let i32_slot = ctx.func.alloca_entry(I32);
+                ctx.func.entry_allocas_push_store(I32, "0", &i32_slot);
+                ctx.i32_counter_slots.insert(id, i32_slot);
+            }
+```
+
+**File:** crates/perry-codegen/src/collectors/escape_objects.rs (L1-24)
+```rust
+use perry_hir::{BinaryOp, Expr, Function, Stmt};
+use std::collections::HashSet;
+
+use super::*;
+
+pub fn collect_non_escaping_object_literals(
+    stmts: &[perry_hir::Stmt],
+    boxed_vars: &HashSet,
+    module_globals: &std::collections::HashMap,
+) -> std::collections::HashMap> {
+    let mut candidates: std::collections::HashMap> =
+        std::collections::HashMap::new();
+    find_object_literal_candidates(stmts, boxed_vars, module_globals, &mut candidates);
+
+    if candidates.is_empty() {
+        return candidates;
+    }
+
+    let mut escaped: HashSet = HashSet::new();
+    check_object_literal_escapes_in_stmts(stmts, &candidates, &mut escaped);
+
+    candidates.retain(|id, _| !escaped.contains(id));
+    candidates
+}
+```
+
+**File:** benchmarks/polyglot/METHODOLOGY.md (L203-251)
+```markdown
+### 2. Integer-modulo fast path
+
+`crates/perry-codegen/src/type_analysis.rs:488` (`is_integer_valued_expr`)
+and `crates/perry-codegen/src/collectors.rs:1006` (`collect_integer_locals`).
+The `BinaryOp::Mod` lowering in `expr.rs:823` checks whether both operands
+are provably integer-valued. If so, it emits
+`fptosi → srem → sitofp` instead of `frem double`.
+
+On ARM, `frem` lowers to a **libm function call** (`fmod`) — there is no
+hardware remainder instruction for f64. That's ~30 ns per call, plus the
+overhead of a real function call in a tight loop. `srem` is a single ARM
+instruction at ~1–2 cycles. The ratio is why `accumulate` shows Perry at
+25 ms vs every other language at ~96 ms — the gap is entirely `srem` vs
+`fmod` dispatch cost.
+
+This is a **type-driven** optimization, not a language-capability
+optimization. Every language in the suite would hit the same 25 ms if its
+benchmark used `int64`/`i64`/`long` instead of `double`. The optimized
+variants (phase 2, see `RESULTS_OPT.md`) confirm this. Perry's win on
+`accumulate` is: it can infer, from the TS source code and the absence of
+non-integer operations on the accumulator, that the `double` here is always
+holding an integer value, and swap the lowering to use the integer
+instruction set — while the human-written TS source still looks like
+`sum += i % 1000`.
+
+### 3. i32 loop counter + bounds elimination
+
+`crates/perry-codegen/src/stmt.rs:651-782`. When Perry lowers a `for` loop
+whose condition is `i < arr.length` and whose body indexes `arr[i]`:
+
+1. It allocates a parallel **i32 counter slot** alongside the f64 counter
+   (`i32_counter_slots`).
+2. It caches `arr.length` once at loop entry (`cached_lengths`).
+3. It records the `(counter, array)` pair as statically in-bounds
+   (`bounded_index_pairs`) — subsequent `arr[i]` reads skip the runtime
+   length load and bounds check entirely.
+
+The array-access codegen sites consult these maps and emit a raw
+`getelementptr + load` when available. On `array_write` and `array_read`,
+this produces code that LLVM can autovectorize into NEON 2-wide f64 SIMD,
+matching `-O3 -ffast-math` C++ output.
+
+**Important**: this is *not* "Perry removes safety." It's static proof that
+the bounds check is dead. The JS semantics are preserved: you can still
+read past the end of an array, you still get `undefined`. The compiler has
+just observed, for this specific `for` loop shape, that the index is bounded
+by the length. Rust's iterator path (`.iter().sum()`) does the same analysis
+at the IR level — and matches Perry to the millisecond on `array_read`
+when used. Phase 2 confirms this.
+```
+
+**File:** benchmarks/polyglot/METHODOLOGY.md (L260-276)
+```markdown
+### `object_create` (Perry: ~2–8 ms, Rust/C++/Go/Swift: 0 ms)
+
+The 0 ms results from Rust/C++/Go/Swift are real. Those languages:
+1. Stack-allocate the struct (or elide the allocation entirely).
+2. Inline the constructor.
+3. Observe the struct never escapes the loop.
+4. Compute the sum in closed form at compile time.
+
+The entire loop body is dead code. The benchmark measures nothing.
+
+Perry cannot match this without abandoning its dynamic value model.
+JavaScript objects are heap-allocated by spec (with limited escape
+analysis available via the v0.5.17 scalar-replacement pass, which
+currently kicks in only when the object is *only ever accessed* via
+field get/set — any method call defeats it). This is an inherent
+cost of compiling a dynamic language: the optimizer has less static
+information to work with.
+```
+
+**File:** crates/perry-runtime/src/bigint.rs (L1-13)
+```rust
+//! BigInt runtime support for Perry
+//!
+//! Provides 1024-bit integer arithmetic for cryptocurrency operations.
+//! Uses 16 x u64 limbs in little-endian order.
+//! 1024 bits is needed because secp256k1 (used by ethers.js/noble-curves)
+//! has a ~256-bit prime, and intermediate products (a*b before mod reduction)
+//! can be ~512 bits. With 512-bit two's complement, bit 511 is the sign bit,
+//! causing false negatives. 1024 bits keeps the sign bit at bit 1023.
+
+/// Number of 64-bit limbs in a BigInt (1024 bits total)
+pub const BIGINT_LIMBS: usize = 16;
+/// Total number of bits
+const BIGINT_BITS: usize = BIGINT_LIMBS * 64;
+```
+
+**File:** crates/perry-runtime/src/date.rs (L19-53)
+```rust
+    static DATE_REGISTRY: RefCell> = RefCell::new(HashSet::new());
+}
+
+/// Canonical "Invalid Date" bit pattern.
+///
+/// An *Invalid Date* (`new Date(NaN)`, `new Date("nope")`, the zero-date
+/// branch of `@perryts/mysql`'s `MyDateTime.toDate()`, …) is still a Date
+/// object per ECMA-262 §21.4.1.1 — `typeof` must be `"object"` and
+/// `instanceof Date` must be `true`, even though its time value is NaN.
+///
+/// Perry stores Date as a raw f64 with no tag and tracks finite Dates in
+/// the thread-local `DATE_REGISTRY`. A NaN can't go in that value-keyed
+/// set: NaN never compares equal, the bit pattern isn't stable, and the
+/// set is thread-local so a Date minted on a socket/worker thread (mysql
+/// row decode) wouldn't be seen on the main thread anyway. So Invalid
+/// Date gets a single canonical sentinel recognized *by bit pattern*,
+/// globally, with no registration step — it works across threads for
+/// free because it is a constant, not a tracked value.
+///
+/// The pattern is a quiet NaN (exponent all ones, mantissa MSB set so it
+/// stays quiet per IEEE-754 §6.2.1 and arithmetic propagates instead of
+/// trapping). It lives in the 0x7FF8 space, which `JSValue::is_number`
+/// treats as a plain number rather than a NaN-box tag, so the value
+/// flows through arithmetic and the existing `if timestamp.is_nan()`
+/// guards in every Date getter exactly like a bare NaN — only `typeof` /
+/// `instanceof` / dynamic dispatch get to see that it is really a Date.
+/// The low payload `0x0DA7` just distinguishes it from the FPU's
+/// canonical `0x7FF8_0000_0000_0000`.
+pub const DATE_NAN_BITS: u64 = 0x7FF8_0000_0000_0DA7;
+
+/// The canonical Invalid Date value.
+#[inline]
+pub fn date_invalid() -> f64 {
+    f64::from_bits(DATE_NAN_BITS)
+}
+```
+
+**File:** crates/perry-runtime/src/regex.rs (L116-129)
+```rust
+pub struct RegExpHeader {
+    /// Pointer to the compiled Regex object (boxed)
+    regex_ptr: *mut Regex,
+    /// Original pattern string (for debugging/serialization)
+    pattern_ptr: *const StringHeader,
+    /// Flags string (e.g., "gi" for global+ignoreCase)
+    flags_ptr: *const StringHeader,
+    /// Cached flags for quick access
+    pub case_insensitive: bool,
+    pub global: bool,
+    pub multiline: bool,
+    /// lastIndex for global/sticky regexes (byte offset into the string for stateful exec)
+    pub last_index: u32,
+}
+```
+
+**File:** crates/perry-runtime/src/object/mod.rs (L80-105)
+```rust
+    static OVERFLOW_FIELDS: RefCell>> =
+        RefCell::new(crate::fast_hash::new_ptr_hash_map());
+    static CLASS_PROTOTYPE_METHOD_VALUES: RefCell> =
+        RefCell::new(HashMap::new());
+
+    /// Sidecar hash index for object key lookup. The on-object
+    /// `keys_array` only supports O(N) linear scan; for objects that
+    /// grow beyond `KEYS_INDEX_THRESHOLD` keys, the linear scan
+    /// becomes O(N²) total work for the build-then-fill pattern (e.g.
+    /// `for (i=0..N) obj["k_"+i] = i`). Without this index, building
+    /// a 10k-key dictionary takes ~9 s (Bun: 4 ms — 2200× slower).
+    ///
+    /// Keyed on the keys_array heap pointer. Each entry maps
+    /// FNV-1a content hash of the key bytes → slot index in the
+    /// keys_array. Built lazily on first lookup at threshold; rebuilt
+    /// on miss after a reallocation (`js_array_push` returns a new
+    /// pointer when the backing storage grew). Incremental updates
+    /// happen when the array stays in place.
+    ///
+    /// Stale entries (keys_array address recycled by GC into an
+    /// unrelated array) are tolerated: lookup just misses, content
+    /// validation against the actual stored key on the linear-scan
+    /// fallback ensures correctness.
+ static KEYS_INDEX: RefCell>)>> = + RefCell::new(crate::fast_hash::new_ptr_hash_map()); +} +``` + +**File:** crates/perry-runtime/src/object/native_call_method.rs (L91-162) +```rust +pub unsafe extern "C" fn js_native_call_method( + object: f64, + method_name_ptr: *const i8, + method_name_len: usize, + args_ptr: *const f64, + args_len: usize, +) -> f64 { + // Get the method name (parsed early for depth guard logging) + let method_name_owned = if method_name_ptr.is_null() || method_name_len == 0 { + String::new() + } else { + let bytes = std::slice::from_raw_parts(method_name_ptr as *const u8, method_name_len); + String::from_utf8_lossy(bytes).into_owned() + }; + let method_name = method_name_owned.as_str(); + let root_scope = crate::gc::RuntimeHandleScope::new(); + let object_handle = root_scope.root_nanbox_f64(object); + let original_args: Vec = if args_len > 0 && !args_ptr.is_null() { + std::slice::from_raw_parts(args_ptr, args_len).to_vec() + } else { + Vec::new() + }; + let arg_handles = root_scope.root_nanbox_f64_slice(&original_args); + let refreshed_args = || crate::gc::RuntimeHandleScope::refreshed_nanbox_f64_slice(&arg_handles); + let object = object_handle.get_nanbox_f64(); + // RAII recursion depth guard: prevent stack overflow from circular module deps. + // The guard auto-decrements on drop, covering all ~20 return points in this function. + // When max depth is hit, return a pointer to a static empty object instead of undefined. + // This prevents crashes when callers NaN-unbox the result and dereference it as a pointer. 
+ let _depth_guard = match CallMethodDepthGuard::enter(method_name) { + Some(g) => g, + None => { + let null_obj_ptr = &NULL_OBJECT_BYTES as *const NullObjectBytes as *mut u8; + return f64::from_bits(JSValue::pointer(null_obj_ptr).bits()); + } + }; + + // Check if this is a JS handle (V8 object from JS runtime) + if crate::value::is_js_handle(object) { + let func_ptr = + crate::value::JS_HANDLE_CALL_METHOD.load(std::sync::atomic::Ordering::SeqCst); + if !func_ptr.is_null() { + let func: unsafe extern "C" fn(f64, *const i8, usize, *const f64, usize) -> f64 = + std::mem::transmute(func_ptr); + let result = func(object, method_name_ptr, method_name_len, args_ptr, args_len); + return result; + } + return f64::from_bits(0x7FF8_0000_0000_0001); // undefined + } + + let jsval = JSValue::from_bits(object.to_bits()); + + // #1758 / epic #1785: a class-object VALUE reaching the *dynamic* + // dispatcher is a STATIC method call. This happens when the static + // analyzer couldn't prove the receiver is a class object — e.g. + // `class X extends (make(...) as any).annotations(y) {}` where the + // `make()` factory call wasn't inlined to a `ClassExprFresh` (so the + // `.annotations` receiver lowers to a generic Call result), or any + // `(expr-returning-a-class-object).staticMethod()`. The compile-time + // static-dispatch tower (property_get.rs) binds `this` via + // IMPLICIT_THIS; the generic field-scan path below does NOT, so + // `this.` (effect's `annotations() { make(this.ast, ...) }`) + // read `undefined`. Route to `js_class_static_method_call`, which binds + // `this` to the receiver and walks the class_id parent chain — but only + // when the method actually resolves in the static chain, so an own + // function-valued static field still falls through to the generic path. 
+ if crate::object::class_registry::is_class_object_value(object) { + let class_id = crate::object::js_object_get_class_id(jsval.as_pointer::()); + if class_id != 0 + && crate::object::class_registry::lookup_static_method_in_chain(class_id, method_name) + .is_some() + { +``` + +**File:** crates/perry-runtime/src/gc/mod.rs (L4-10) +```rust +//! - 8-byte GcHeader prepended to every heap allocation (invisible to callers) +//! - Arena objects (arrays/objects): discovered by walking arena blocks linearly (zero per-alloc tracking cost) +//! - Explicit malloc objects (promises/maps/errors, large closures, and compatibility residents): tracked in MALLOC_STATE +//! - Mark phase: precise thread-local roots + optional conservative stack scan + type-specific tracing +//! - Sweep phase: free malloc objects; arena objects added to free list for reuse +//! - Trigger: only checked on new arena block allocation or explicit gc() call + +``` + +**File:** crates/perry-runtime/src/gc/mod.rs (L158-163) +```rust + // Order matters for the C4b pinning policy: + // + // 1. Optional conservative C-stack/register scan first. Those + // words cannot be rewritten, so when evacuation is enabled + // we pin objects discovered by this phase before any + // rewriteable root source can add marks. Default `auto` +``` + +**File:** crates/perry-codegen/src/expr/write_barrier.rs (L1-1) +```rust +//! GC write-barrier emission helpers + stream-subclass `super(...)` +``` + +**File:** crates/perry-transform/src/async_to_generator.rs (L29-36) +```rust +//! ## Why this fixes the spec gap +//! +//! Pre-fix Perry's async functions ran their entire body synchronously on +//! the calling thread, with each `await` lowered to a busy-wait poll loop +//! on the awaited Promise. This diverges from spec semantics: an `await` +//! should always yield to the microtask queue, even on already-resolved +//! Promises, so synchronous code following an unawaited async call runs +//! before the awaited body's continuation. 
+``` + +**File:** crates/perry-runtime/src/promise/microtasks.rs (L27-56) +```rust +pub extern "C" fn js_promise_run_microtasks() -> i32 { + mt_profile_register(); + let mut ran = 0; + + ran += crate::async_hooks::drain_gc_destroy_queue(); + + // Process any scheduled resolutions (simulates async completions) + ran += super::combinators::process_scheduled_resolves(); + + // Process diagnostics_channel publishes queued by perry/thread workers. + ran += crate::node_submodules::diagnostics_channel_process_pending(); + + // Process pending thread results (from perry/thread spawn) + ran += crate::thread::js_thread_process_pending(); + + // Then process the task queue. + // + // ── Exception trap (Issue #...): install ONE setjmp for the WHOLE + // loop body, instead of a fresh setjmp per microtask. The previous + // shape paid setjmp+js_try_push/end every microtask just so that a + // `throw` from a callback could be re-routed to reject the chained + // `next` promise. setjmp+longjmp on aarch64 saves ~16 callee-saved + // x-regs and ~8 d-regs per call — that's ~25 ns per microtask, and + // an async benchmark with 200k microtasks pays ~5 ms in setjmp cost + // alone. The single outer setjmp captures the same "throw out of a + // microtask body" case (since `js_throw` longjmps to the most recent + // try block; if no user try is in scope, this one is it). When the + // longjmp lands, we read the current promise context out of a + // thread-local set just before invoking the callback, reject its + // `next`, and continue the loop. +``` + +**File:** crates/perry-stdlib/src/common/async_bridge.rs (L7-68) +```rust +//! IMPORTANT: perry-runtime uses thread-local arenas for memory allocation. +//! This means JSValue objects created on tokio worker threads will be allocated +//! from a different arena than the main thread, causing memory corruption. +//! +//! To avoid this, async operations should: +//! 1. 
NOT create JSValue objects (arrays, strings, objects) in async blocks +//! 2. Store raw Rust data and use deferred conversion callbacks +//! 3. The conversion callbacks run on the main thread during js_stdlib_process_pending + +use std::future::Future; +use std::sync::atomic::{AtomicUsize, Ordering}; +use std::sync::Mutex; + +use once_cell::sync::Lazy; +use tokio::runtime::Runtime; + +/// Issue #859: pin a Promise so the GC can't sweep it while a tokio +/// worker is computing its eventual resolution. +/// +/// Without pinning, the await chain has no path back to the Promise: +/// `P.next = N` is a forward edge, and after the user code yields, all +/// JS-side roots reach only `N`. The tokio future holds `promise_ptr` +/// as `usize`, invisible to the GC. So `js_promise_new()` in a native +/// binding + `spawn_for_promise(...)` opens a window where `P` is +/// unreachable; if GC fires during that window, `P` is swept, and +/// when the worker finally calls `js_promise_resolve(P, ...)` it +/// dereferences freed (and possibly OS-reclaimed) memory → SIGBUS. +/// +/// Pin/unpin must run on the main thread. The bit is set here (right +/// before crossing the worker boundary) and cleared in +/// [`js_stdlib_process_pending`] after the queued resolution drains. +/// +/// # Safety +/// `promise_ptr` must point to a live Promise allocated by +/// `js_promise_new()` — i.e. an `8-byte GcHeader`-prefixed allocation +/// in the GC arena. Callers in `spawn_for_promise[_deferred]` satisfy +/// this trivially; direct callers of [`queue_promise_resolution`] / +/// [`queue_deferred_resolution`] (fetch, zlib, etc.) must also pin +/// before handing the pointer to a worker future. 
+#[inline] +pub unsafe fn pin_promise_for_native_resolution(promise_ptr: usize) { + if promise_ptr == 0 { + return; + } + let header = (promise_ptr as *mut u8).sub(perry_runtime::gc::GC_HEADER_SIZE) + as *mut perry_runtime::gc::GcHeader; + (*header).gc_flags |= perry_runtime::gc::GC_FLAG_PINNED; +} + +/// Inverse of [`pin_promise_for_native_resolution`]; called from +/// `js_stdlib_process_pending` immediately before the queued +/// resolve/reject so the next GC cycle can reclaim the (now-settled) +/// promise on its normal schedule. +#[inline] +unsafe fn unpin_promise_after_native_resolution(promise_ptr: usize) { + if promise_ptr == 0 { + return; + } + let header = (promise_ptr as *mut u8).sub(perry_runtime::gc::GC_HEADER_SIZE) + as *mut perry_runtime::gc::GcHeader; + (*header).gc_flags &= !perry_runtime::gc::GC_FLAG_PINNED; +} +``` + +**File:** crates/perry-runtime/src/thread.rs (L98-110) +```rust +//! - **No shared mutable state**: Closures passed to `parallelMap` and `spawn` +//! cannot capture mutable variables. The Perry compiler rejects this at +//! compile time with a clear error message. +//! +//! - **Deep copy across boundaries**: All values crossing thread boundaries +//! (captures and return values) are serialized and deserialized. Numbers and +//! booleans are zero-cost (just 64-bit copies). Strings, arrays, and objects +//! are deep-copied. +//! +//! - **Independent arenas**: Each worker thread gets its own thread-local arena +//! and GC. No synchronization overhead during computation. Arenas are freed +//! when the thread exits. +//! +``` + +**File:** crates/perry-runtime/src/weakref.rs (L1-11) +```rust +//! WeakRef and FinalizationRegistry runtime support. +//! +//! Pragmatic / stub implementation: WeakRef holds a STRONG reference internally +//! (so `deref()` always returns the wrapped value) and FinalizationRegistry stores +//! registrations but never actually fires the cleanup callbacks. Implementing real +//! 
weak references would require integrating with `gc.rs`'s mark phase and +//! clearing the slot during sweep — that's a multi-day project, and most user code +//! that uses these APIs only relies on their behaviour for the lifetime of the +//! references (not on actual collection). +//! +//! This implementation matches the Node.js output for `test_gap_weakref_finalization.ts`. +``` + +**File:** CLAUDE.md (L21-22) +```markdown +- **Async context** — `#788` (real `AsyncLocalStorage` tracking across `await`/microtasks/timers) and `#789` (real `async_hooks.createHook` lifecycle + asyncId). Today these are name-only stubs. +- **Compile-as-package** — `#348` (ink TUI end-to-end), `#488/#489` (Drizzle + MySQL), `#678` (linker emits native callsites for V8-fallback modules). +``` + +**File:** CLAUDE.md (L26-26) +```markdown +**Known categorical gaps**: lookbehind regex (Rust `regex` crate), `console.dir`/`console.group*` formatting, lone surrogate handling (WTF-8). +``` + +**File:** docs/src/language/limitations.md (L31-44) +```markdown +parameter decorators, method parameter decorators, and property +decorators. That path emits `design:paramtypes` for decorated +classes/methods, `design:type` for decorated properties, and implements +`Reflect.defineMetadata`, `Reflect.getMetadata`, +`Reflect.getOwnMetadata`, `Reflect.hasMetadata`, +`Reflect.hasOwnMetadata`, `Reflect.getMetadataKeys`, +`Reflect.getOwnMetadataKeys`, `Reflect.deleteMetadata`, and +`@Reflect.metadata(...)`. + +Accessor decorators, descriptor replacement, general +`Reflect.metadata(...)` calls outside decorator syntax, `Symbol` +metadata keys, and full Angular / NestJS / TypeORM runtime metadata flows +are not supported. See [Decorators](decorators.md) for details and a +worked migration recipe. 
+``` + +**File:** docs/src/language/limitations.md (L59-71) +```markdown +## No User-Space CommonJS require() + +Use static ESM imports in Perry source: + + +```text +// Supported +import { foo } from "./module"; + +// Not supported +const mod = require("./module"); +const mod = await import("./module"); +``` +``` + +**File:** docs/src/language/limitations.md (L76-102) +```markdown +## Limited Prototype Manipulation + +Perry compiles classes to fixed structures. Dynamic prototype modification is not supported: + + +```text +// Not supported +MyClass.prototype.newMethod = function() {}; +Object.setPrototypeOf(obj, proto); +``` + +`Object.getPrototypeOf(...)` and `Reflect.getPrototypeOf(...)` are supported +for class/prototype inspection patterns, but `Object.setPrototypeOf(...)` / +`Reflect.setPrototypeOf(...)` do not mutate Perry's fixed class layout. + +## Weak References Are Not GC-Accurate + +`WeakMap`, `WeakSet`, `WeakRef`, and `FinalizationRegistry` expose the expected +API shape, but their weak-reference semantics are pragmatic, not GC-accurate: +`WeakRef` keeps a strong reference internally, and `FinalizationRegistry` +records registrations but does not run cleanup callbacks after collection. + +## Limited Proxy Trapping + +Proxy support is not a full engine-level trap layer for every possible dynamic +object access. Prefer plain objects and explicit APIs unless a package only +needs Perry's supported Proxy surface. +``` + +**File:** docs/src/language/limitations.md (L108-108) +```markdown +Threads do not share mutable state — closures passed to thread primitives cannot capture mutable variables (enforced at compile time). Values are deep-copied across thread boundaries. There is no `SharedArrayBuffer` or `Atomics`. +``` + +**File:** docs/src/language/limitations.md (L109-116) +```markdown + +## npm Package Compatibility + +Not all npm packages work with Perry: + +- **Natively supported**: ~50 popular packages (fastify, mysql2, redis, etc.) 
— these are compiled natively. See [Standard Library](../stdlib/overview.md). +- **`compilePackages`**: Pure TS/JS packages can be compiled natively via [configuration](../getting-started/project-config.md). +- **Not supported**: Packages requiring native addons (`.node` files), `eval()`, dynamic `require()`, or Node.js internals. +``` + +**File:** docs/src/packages/porting.md (L133-142) +```markdown +### Computed property keys in object literals + +```text +// Not supported +const obj = { [key]: value }; + +// Rewrite +const obj: Record = {}; +obj[key] = value; +``` +``` + +**File:** docs/memory-perf-roadmap.md (L194-210) +```markdown +#### 5. Precise root tracking via codegen + +- **Impact:** by itself, zero. But it's the **unlock** for tier 3. Once roots + are precise, conservative stack scan goes away, `mark_block_persisting_arena_objects` + goes away entirely, moving GC becomes possible. +- **Effort:** 3-4 weeks. Emit a per-function "shadow stack" at every safepoint: + a stack-allocated array of pointers to live JS values. GC walks the shadow + stack instead of the raw machine stack. +- **Risk:** register pressure + shadow-stack overhead. Benchmark carefully. + Typical cost: 2-8% on pointer-heavy workloads; effectively free on + computation-heavy workloads. +- **Scope:** codegen.rs + every call-site emission. Large but mechanical. + +**Ship criteria:** +- All gap tests + runtime tests pass with conservative scan disabled. +- No benchmark regresses >5%. +- `mark_block_persisting_arena_objects` can be deleted. +``` + +**File:** test-parity/known_failures.json (L11-22) +```json + "test_parity_stream": { + "issue": "793", + "added": "2026-05-15", + "category": "module-inventory", + "reason": "Node.js module inventory \u2014 `node:stream` surface not fully implemented. Tracker for surface coverage; flips to PASS as each API lands. Not a regression." 
+ }, + "test_parity_stream_web": { + "issue": "793", + "added": "2026-05-15", + "category": "module-inventory", + "reason": "Node.js module inventory \u2014 `node:stream/web` (WHATWG streams) surface not fully implemented. Tracker for surface coverage; flips to PASS as each API lands. Not a regression." + }, +``` diff --git a/TYPE_LOWERING_GUIDANCE.md b/TYPE_LOWERING_GUIDANCE.md new file mode 100644 index 0000000000..dd4d750a0e --- /dev/null +++ b/TYPE_LOWERING_GUIDANCE.md @@ -0,0 +1,847 @@ +Perry’s next major performance step should be **representation-aware type lowering**: keep values in native typed form for as long as possible, and box into `JSValue` only at true dynamic boundaries. + +Right now Perry already has the ingredients: HIR type inference, monomorphized generics, `Int32` fast paths, array bounds elimination, numeric field recognition, scalar replacement, NaN-boxed `JSValue`, shape caching, fixed class IDs, and runtime-specialized support for strings, arrays, maps, buffers, dates, BigInts, regexes, promises, and async state machines. The problem is that too much of the system still uses the universal `JSValue` representation too early. The fastest Perry should look less like “native code calling a JS runtime helper often” and more like “a static compiler that only falls back to JS dynamic semantics where the program actually needs them.” + +## Main guidance + +The core rule should be: + +```text +Do not lower TypeScript values directly to JSValue. +Lower them to typed SSA values first. +Box only at dynamic boundaries. 
+``` + +A good target model is: + +```text +TypeScript/HIR type + → Perry type facts + → representation-specific IR + → LLVM native value + → JSValue only if needed +``` + +For example: + +```text +number → f64 +integer-stable number → i32 / u32 / i53 +boolean → i1 +string → PerryStringRef, not raw JSValue +class Point → ptr PointObjectLayout +number[] packed → ptr ArrayF64 +any / unknown → i64 JSValueBits +JS interop handle → JSHandleValue +``` + +This matters because LLVM can optimize `i32`, `double`, `ptr`, and typed loads. It cannot reason well about every value being a NaN-boxed `f64`. Perry’s current architecture says every JS value crossing a function boundary is a NaN-boxed `f64`, with tags for strings, objects, int32, BigInt, short strings, handles, and singleton values. That is fine as a **public ABI**, but it should not be the default internal representation inside optimized functions. + +## Use `JSValue` as an ABI, not as the optimizer’s native type + +Keep `JSValue` for: + +```text +public function boundaries +unknown calls +any / unknown +dynamic property access +generic arrays +object dictionaries +exceptions +closures that escape +Promise/microtask storage +thread serialization +V8/QuickJS bridge values +``` + +But inside a compiled function, Perry should prefer typed values. + +Example target shape: + +```text +// Public generic trampoline. +foo$jsvalue(JSValue a, JSValue b) -> JSValue { + if a,b are numbers: + return box_number(foo$number_number(a as f64, b as f64)) + else: + return foo$generic(a, b) +} + +// Internal typed clone. +foo$number_number(f64 a, f64 b) -> f64 { + return a + b +} +``` + +The same applies to classes: + +```text +Point.distance$typed(ptr Point, ptr Point) -> f64 +Point.distance$generic(JSValue this, JSValue other) -> JSValue +``` + +This gives Perry a static-compiler version of what a JIT does dynamically: one generic path for correctness, and typed paths for speed. 
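
As a rough illustration of the trampoline shape above, the following Rust sketch models a boxed value as raw bits and routes to a typed clone when both arguments are plain numbers. The names `Boxed`, `add_jsvalue`, `add_number_number`, and `add_generic` are hypothetical and are not Perry's real API. The tag-band check assumes the `0x7FF9` to `0x7FFF` range from the NaN-boxing table, and the generic path is only a placeholder that returns `undefined`.

```rust
/// Hypothetical stand-in for a NaN-boxed value: the raw 64 bits.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Boxed(pub u64);

/// `undefined` singleton, as listed in the tag table.
pub const TAG_UNDEFINED: u64 = 0x7FFC_0000_0000_0001;

impl Boxed {
    pub fn from_number(n: f64) -> Self {
        Boxed(n.to_bits())
    }

    /// Assumption: anything outside the 0x7FF9..=0x7FFF tag band is a plain f64.
    pub fn as_number(self) -> Option<f64> {
        let top16 = self.0 >> 48;
        if (0x7FF9..=0x7FFF).contains(&top16) {
            None
        } else {
            Some(f64::from_bits(self.0))
        }
    }
}

/// Internal typed clone: no tag checks, no boxing.
pub fn add_number_number(a: f64, b: f64) -> f64 {
    a + b
}

/// Placeholder for the generic path (string concat, coercions, and so on).
pub fn add_generic(_a: Boxed, _b: Boxed) -> Boxed {
    Boxed(TAG_UNDEFINED)
}

/// Public trampoline: unbox once at the boundary, box once on the way out.
pub fn add_jsvalue(a: Boxed, b: Boxed) -> Boxed {
    match (a.as_number(), b.as_number()) {
        (Some(x), Some(y)) => Boxed::from_number(add_number_number(x, y)),
        _ => add_generic(a, b),
    }
}
```

The point of the split is that `add_number_number` contains nothing but an `f64` add, so an optimizer can inline and vectorize it, while all tag inspection stays in the thin wrapper.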
+ +## Represent `JSValue` bits as `i64` in LLVM IR + +Even if the external ABI still passes `JSValue` as `f64`, Perry should seriously consider representing boxed values internally as `i64` bit patterns, not as LLVM `double`. + +NaN-boxing depends on preserving payload bits exactly. But LLVM and CPU floating-point optimizations naturally treat `double` as a numeric value, not as a tagged pointer carrier. Perry’s own `JSValue::is_number` logic distinguishes real IEEE numbers from Perry-owned positive quiet-NaN tag bands, and the runtime already has careful handling for tags such as `SHORT_STRING_TAG`, `STRING_TAG`, and `POINTER_TAG`. + +Recommended internal split: + +```text +NumberValue = double +BoxedValue = i64 +PointerValue = ptr +BoolValue = i1 +Int32Value = i32 +Uint32Value = i32 with unsigned interpretation +``` + +Only bitcast between `i64` and `double` at ABI edges where the current ABI requires `f64`. + +This also avoids accidental optimizer corruption from fast-math flags. JavaScript numeric semantics include `NaN`, infinities, and signed zero, so broad fast-math should not be applied to general JS `number` operations unless Perry has proven the operation is in a restricted numeric domain. + +## Build a richer type-fact lattice + +Current HIR types are useful but too coarse. `Type::Number`, `Type::String`, `Type::Array(elem)`, `Type::Named(name)`, and `Type::Any` are not enough for aggressive lowering. Perry should keep HIR types, then add a second layer of **type facts**. 
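
A minimal Rust sketch of what such a fact layer could look like, assuming a small kind lattice plus a nullability flag. `Kind`, `Facts`, `join`, and `is_unboxed_numeric` are illustrative names, not existing Perry types. `join` models a control-flow merge, where the compiler may keep only what holds on both incoming paths.

```rust
/// Illustrative value kinds; `Any` is the top element (nothing is known).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Kind {
    Int32,
    Number, // general f64
    String,
    Any,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Facts {
    pub kind: Kind,
    pub non_null: bool,
}

impl Kind {
    /// Least upper bound of two kinds.
    pub fn join(self, other: Kind) -> Kind {
        match (self, other) {
            (a, b) if a == b => a,
            // Every int32 is also a valid f64, so the merge stays numeric.
            (Kind::Int32, Kind::Number) | (Kind::Number, Kind::Int32) => Kind::Number,
            _ => Kind::Any,
        }
    }
}

impl Facts {
    /// Merge facts from two paths: keep only what both paths guarantee.
    pub fn join(self, other: Facts) -> Facts {
        Facts {
            kind: self.kind.join(other.kind),
            non_null: self.non_null && other.non_null,
        }
    }

    /// True while the value can live in a native numeric register.
    pub fn is_unboxed_numeric(self) -> bool {
        self.non_null && matches!(self.kind, Kind::Int32 | Kind::Number)
    }
}
```

A real lattice would carry ranges, constants, and representation as well; the sketch only shows that facts degrade monotonically toward `Any` and that lowering decisions read the merged result.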
+ +Recommended fact shape: + +```text +Value facts: + kind: number | int32 | uint32 | int53 | bool | string | object | array | bigint | any + nullability: non-null | nullable | nullish | unknown + representation: unboxed | boxed | pointer | handle + range: integer bounds if known + constant: literal value if known + +Array facts: + element kind: f64 | i32 | uint32 | JSValue | string | object + packed vs holey + length stable inside region + capacity stable inside region + no external alias + no body write can grow/shrink + +Object facts: + exact class id + exact shape id + field layout + field pointer bitmap + frozen/sealed/no-extend state + dictionary fallback possible or impossible + +Effect facts: + may allocate + may call unknown code + may mutate array length + may mutate object shape + may throw + may access JS bridge + may run microtasks +``` + +This turns Perry’s optimizer from “infer a type once” into “carry a proof.” That proof then decides whether Perry emits raw native loads, bounds checks, dynamic dispatch, write barriers, or generic helper calls. + +## Generalize the existing integer fast path + +Perry already has a real performance win here. It recognizes integer-valued expressions, tracks `i32` loop-counter slots, uses integer modulo instead of `fmod`, and avoids repeated `fptosi/sitofp` round-trips in hot array-walking loops. The provided benchmark methodology says this is why integer modulo can become a single integer instruction instead of a libm `fmod` call, and why array read/write loops can become raw `getelementptr + load` patterns that LLVM can vectorize. + +The next step is to expand the numeric lattice: + +```text +i32 signed ToInt32 domain +u32 unsigned ToUint32 domain +i53 safe JS integer domain +f64 general JS number +f64-nonNaN proven non-NaN +f64-finite proven finite +``` + +Do not force everything integer-like into `i32`. 
JavaScript has several subtly different integer domains: + +```text +x | 0 → signed i32 +x >>> 0 → unsigned u32 +array idx → uint32-ish but constrained by length +number → f64, often integer-valued but not always safely i32 +``` + +Perry’s comments already show why this matters: previous `i32` shadowing could silently corrupt accumulators that were integer-stable but not actually safe as signed 32-bit values. The current gating via index-used locals, strictly bounded locals, and unsigned locals is the right direction. Extend that into a first-class numeric domain system rather than a set of local special cases. + +## Make arrays representation-specialized + +Arrays should not all be `ArrayHeader + JSValue[]`. + +Recommended array kinds: + +```text +PackedF64Array +PackedI32Array +PackedU32Array +PackedStringArray +PackedObjectArray +PackedValueArray +HoleyValueArray +DictionaryArray +TypedArray-backed variants +``` + +For `number[]` in a proven packed numeric loop, Perry should lower: + +```ts +for (let i = 0; i < arr.length; i++) { + sum += arr[i] +} +``` + +to roughly: + +```text +arr_ptr = checked_unbox_packed_f64_array(arr) +len = arr.length +data = arr.data + +for i32 i in [0, len): + sum = fadd sum, load data[i] +``` + +No per-iteration NaN-box decode. No per-iteration length load. No bounds check if the loop proof establishes `i < arr.length`. No runtime helper call. Perry already has a narrower version of this with cached lengths, `bounded_index_pairs`, and `i32_counter_slots`; generalize it to element-kind-specialized arrays. + +For stores, use transitions: + +```text +PackedF64Array + number store → stay PackedF64Array +PackedF64Array + undefined store → transition to PackedValueArray or HoleyValueArray +PackedI32Array + f64 store → transition to PackedF64Array or PackedValueArray +PackedObjectArray + object D → stay only if D <= C-compatible +``` + +This is where Perry can exploit its AOT restrictions. 
Since Perry does not support general `eval`, `new Function`, dynamic import, dynamic `require`, full prototype mutation, or full Proxy trapping, it has fewer invalidation hazards than V8. Those limitations are not just compatibility gaps; they are optimization permissions. + +## Add array-loop versioning + +For uncertain arrays, compile two paths: + +```text +fast path: + guard array is PackedF64Array + guard no holes + guard length stable + run typed loop + +slow path: + generic JS array access +``` + +Example: + +```text +if likely(is_packed_f64_array(arr)) { + return sum_packed_f64(arr) +} +return sum_generic_jsvalue(arr) +``` + +This is AOT-friendly. It does not require a JIT. It only requires small guarded clones. + +Use this for: + +```text +Array.prototype.map/filter/reduce +for loops over arr.length +JSON parse/stringify internal loops +Buffer/Uint8Array loops +string scanning +numeric kernels +``` + +Code size must be controlled. Do not clone every function for every type combination. Clone only when: + +```text +function is hot by benchmark/profile +loop body is small +specialization removes helper calls +array kind is stable +generic fallback remains available +``` + +PGO is a good fit here because it lets the compiler choose which clones matter for real workloads. LLVM’s own documentation describes PGO as a way for a compiler to optimize according to how code actually runs, with representative profile selection being important. ([LLVM][1]) + +## Make object/class fields unboxed where possible + +Perry’s object model currently uses `ObjectHeader`, `class_id`, `field_count`, `keys_array`, inline property slots, shape caching, `KEYS_INDEX`, overflow fields, and vtable-based dynamic dispatch. That is a workable JS object model, but class instances with declared fields should not be treated like generic dictionaries in hot paths. 
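
To make the contrast concrete, here is a small Rust sketch. It assumes that a dictionary-style object stores boxed slots behind a key scan, while a class instance with declared fields has a fixed `#[repr(C)]` layout. `DictObject`, `PointLayout`, and `sum_fields` are hypothetical names used only for illustration, and Perry's real `ObjectHeader` carries more state than is shown here.

```rust
/// Dictionary-style object: every read scans keys, then decodes a boxed slot.
pub struct DictObject {
    pub keys: Vec<&'static str>,
    pub slots: Vec<u64>, // NaN-boxed bits
}

impl DictObject {
    /// Simplification: assumes the slot holds a plain number and skips tag checks.
    pub fn get_number(&self, key: &str) -> Option<f64> {
        let idx = self.keys.iter().position(|k| *k == key)?;
        Some(f64::from_bits(self.slots[idx]))
    }
}

/// Fixed layout for `class Point { x: number; y: number }`.
#[repr(C)]
pub struct PointLayout {
    pub class_id: u32,
    pub shape_id: u32,
    pub x: f64,
    pub y: f64,
}

/// With a fixed layout, a field read is a load at a constant offset.
pub fn sum_fields(p: &PointLayout) -> f64 {
    p.x + p.y
}
```

The dictionary read costs a scan plus a decode per access; the fixed-layout read compiles to two loads and an add, which is the form LLVM can hoist and vectorize.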
+ +For classes, generate fixed layouts: + +```ts +class Point { + x: number + y: number +} +``` + +Target layout: + +```text +ObjectHeader +class_id +shape_id +field_bitmap +x: f64 +y: f64 +``` + +Not: + +```text +ObjectHeader +keys_array = ["x", "y"] +slots[0] = JSValue(number) +slots[1] = JSValue(number) +``` + +The second layout is more dynamic, but much slower. The first layout gives LLVM ordinary typed loads: + +```llvm +%x_ptr = getelementptr %Point, ptr %p, field_x +%x = load double, ptr %x_ptr +``` + +For nullable/dynamic fields, use mixed layout: + +```text +f64 fields +i32 fields +pointer fields +JSValue spill fields +overflow dictionary +``` + +This also improves GC. A typed field layout gives the collector a pointer bitmap, so it scans only pointer fields instead of inspecting every slot dynamically. + +## Fix scalar replacement across method calls + +The current scalar replacement limitation is important: Perry can stack-allocate or decompose non-escaping object literals only when the object is accessed exclusively through field get/set; any method call defeats it. + +That should be one of the highest-priority compiler improvements. + +Example: + +```ts +class Point { + constructor(public x: number, public y: number) {} + sum() { return this.x + this.y } +} + +let p = new Point(x, y) +total += p.sum() +``` + +Current likely behavior: + +```text +allocate Point +store x/y +dynamic or semi-dynamic method call +load fields +GC-visible object +``` + +Target behavior: + +```text +p.x = scalar x +p.y = scalar y +inline Point.sum +total += x + y +no allocation +``` + +To get there, Perry needs method summaries: + +```text +method Point.sum: + receiver escapes? no + mutates receiver shape? no + reads fields: x, y + writes fields: none + may call unknown? no + may throw? no +``` + +Then escape analysis can treat simple method calls as field operations. 
LLVM’s SROA pass is specifically designed to break analyzable aggregate allocas into scalar SSA values, and its vectorizers can then operate on clean scalar/loop IR; Perry should feed LLVM IR that makes those passes obvious instead of hiding work behind runtime helper calls. ([LLVM][2]) ([LLVM][3]) + +## Replace stringly dynamic dispatch with IDs + +`js_native_call_method` currently receives a method name pointer and length, builds/uses a string name, handles JS handles, class static methods, vtable lookup, prototype objects, and fallback paths. That is correct but expensive for hot calls. + +For compiled code, method/property names should be lowered to interned IDs at compile time: + +```text +"toString" → SymbolId / PropertyId 17 +"value" → FieldId 3 +"length" → BuiltinPropertyId::Length +``` + +Then dispatch can be: + +```text +if exact class id known: + direct call function pointer + +else if class id known but subclass possible: + vtable[class_id][method_id] + +else: + generic js_native_call_method_by_id + +only final fallback: + js_native_call_method_by_string +``` + +Hot-path method calls should not allocate Rust `String`s, hash method names, or scan strings. For static class methods, the same ID system should apply. + +## Unify string lowering; eliminate the SSO footgun + +The current short-string optimization is valuable, but the strict `is_string()` versus `is_any_string()` distinction is a correctness and performance hazard. The context explicitly warns that `is_string()` only recognizes heap `STRING_TAG`, while `SHORT_STRING_TAG` can fall into wrong branches and even be dereferenced as a pointer if call sites are not careful. 
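
The hazard can be shown on raw bits. The sketch below assumes the tag values from the NaN-boxing table (`STRING_TAG 0x7FFF`, `SHORT_STRING_TAG 0x7FF9`) sit in the top 16 bits. The free functions are illustrative stand-ins, not the runtime's actual `JSValue` methods, and `heap_string_addr` is a hypothetical helper.

```rust
const STRING_TAG: u64 = 0x7FFF;
const SHORT_STRING_TAG: u64 = 0x7FF9;

fn tag(bits: u64) -> u64 {
    bits >> 48
}

/// Heap string only: the payload is a `StringHeader` pointer.
pub fn is_heap_string(bits: u64) -> bool {
    tag(bits) == STRING_TAG
}

/// Inline short string: the payload holds bytes, not a pointer.
pub fn is_short_string(bits: u64) -> bool {
    tag(bits) == SHORT_STRING_TAG
}

/// What most call sites actually mean by "is this a string?".
pub fn is_any_string(bits: u64) -> bool {
    is_heap_string(bits) || is_short_string(bits)
}

/// Only heap strings may be turned into an address. Short strings yield
/// `None`, so a caller cannot dereference inline bytes by accident.
pub fn heap_string_addr(bits: u64) -> Option<u64> {
    if is_heap_string(bits) {
        Some(bits & 0x0000_FFFF_FFFF_FFFF)
    } else {
        None
    }
}
```

A call site that branches on the heap-only predicate sends short strings down the "not a string" path; routing every pointer extraction through an `Option`-returning helper makes that mistake visible at compile time rather than at a crash.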
+ +Recommended fix: + +```text +PerryStringRef: + Short { bytes[5], len } + Heap { ptr: *StringHeader } +``` + +Then generated code and runtime helpers should use one abstraction: + +```text +is_string_like(value) +string_len(value) +string_bytes(value) +string_materialize_if_needed(value) +``` + +Rename low-level predicates to make misuse hard: + +```text +is_heap_string() +is_short_string() +is_any_string() +``` + +Do not allow new runtime code to branch on `is_string()` unless the name means “any string.” For performance, specialize string operations: + +```text +short + short → inline small concat if result <= 5 +heap refcount == 1 → append in place +concat chain → one allocation +string scan → SIMD path +property key → interned ID / pointer identity +``` + +The provided context already describes SSO, in-place append, concat-chain optimization, and SIMD string scanning. The main improvement is making the type lowering and runtime API impossible to misuse. + +## Use Perry’s unsupported JS features as optimization assumptions + +Perry does not support or only partially supports several highly dynamic JS features: full `Proxy`, full `Reflect`, `eval`, `new Function`, dynamic import, user-space dynamic `require`, full prototype mutation, `SharedArrayBuffer`, and `Atomics`. 
+ +That means Perry can assume much more than V8 in native-compiled mode: + +```text +class layouts do not get monkey-patched at runtime +prototype methods do not arbitrarily change +static ESM imports form a closed module graph +no eval can introduce new code +no SharedArrayBuffer means no cross-thread mutation races +deep-copy threading means local arrays are not concurrently modified +``` + +Perry should formalize this into compilation modes: + +```text +strict-native mode: + assumes Perry limitations + strongest type lowering + no dynamic fallback except explicit JS runtime bridge + +compat mode: + more JSValue paths + more guards + less layout specialization +``` + +This lets Perry turn compatibility limitations into performance wins without pretending to be fully dynamic JavaScript. + +## Add effect analysis before lowering + +Many current optimizations depend on proving that a loop body does not mutate `arr.length`, does not reassign the loop counter, and does not invalidate the cached length. Perry already does some of this for bounded index pairs. + +Make this a general effect system: + +```text +Effect::ReadsArrayLength(arr) +Effect::WritesArrayLength(arr) +Effect::WritesArrayElement(arr) +Effect::MayGrowArray(arr) +Effect::MutatesShape(obj) +Effect::CallsUnknown +Effect::Allocates +Effect::MayThrow +Effect::RunsMicrotasks +Effect::TouchesJSHandle +``` + +Then lowering decisions become principled: + +```text +Can cache arr.length? + yes if no effect in loop may write length or reassign arr + +Can eliminate bounds check? + yes if loop induction range is within cached length + +Can direct-load field? + yes if receiver shape is stable and no effect mutates it + +Can stack-allocate object? + yes if object does not escape through call, closure, return, throw, async, or unknown store + +Can skip write barrier? 
+ yes if stored value is statically primitive or parent is young +``` + +This will improve both performance and correctness because optimizations become proof-based rather than pattern-based. + +## Emit better LLVM metadata + +Perry should make LLVM’s optimizer see what Perry already knows. + +For typed arrays and fixed class layouts, emit: + +```text +nonnull +dereferenceable(N) +align +noalias for fresh allocations +alias.scope / noalias for independent arrays +TBAA metadata for headers, lengths, capacities, fields, data buffers +range metadata for lengths, tags, enum values +cold/noinline for generic fallbacks +alwaysinline for tiny tag checks and unbox helpers +readonly/readnone/willreturn/nounwind where valid +``` + +LLVM’s LangRef documents `dereferenceable`, `noalias`, `alias.scope`, `TBAA`, and `range` metadata; these are exactly the kinds of facts Perry can provide from its type/layout system. ([LLVM][4]) ([LLVM][4]) ([LLVM][4]) ([LLVM][4]) + +The important point: do not merely generate “correct” LLVM IR. Generate IR that exposes aliasing, bounds, layout, and type facts. + +## Lower write barriers statically + +Write barriers should not be emitted as generic runtime calls for every store. + +For every store site, codegen should decide: + +```text +storing number/bool/null/undefined/int32? + no pointer child → no barrier + +parent proven young? + no old→young edge → no barrier + +child proven old or non-GC? + no young child → no barrier + +parent old and child maybe young? + inline fast card barrier +``` + +The GC context says Perry’s current barrier fires on every heap store emitted by codegen, decodes parent and child, checks old→young, and dirties a page when needed. That is semantically right, but it leaves too much work for runtime. 
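The per-store decision table above can be sketched as a small compile-time classifier. This is a hypothetical illustration under stated assumptions: the `Generation`, `StoredKind`, and `Barrier` types and the `barrier_for_store` function are invented names, and the sketch assumes codegen already has a conservative estimate of each operand's generation.

```rust
/// What the compiler statically knows about where a heap object lives.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Generation {
    Young,
    /// Old generation or memory the GC does not manage.
    OldOrNonGc,
    Unknown,
}

/// Whether the stored value can carry a GC pointer at all.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum StoredKind {
    /// number, bool, null, undefined, int32: no pointer child.
    Primitive,
    Pointer,
}

/// The barrier code that codegen emits for one store site.
#[derive(Debug, PartialEq, Eq)]
enum Barrier {
    None,
    InlineCard,
}

/// Applies the four rules in order; anything not proven safe keeps a barrier.
fn barrier_for_store(parent: Generation, kind: StoredKind, child: Generation) -> Barrier {
    if kind == StoredKind::Primitive {
        return Barrier::None; // no pointer child
    }
    if parent == Generation::Young {
        return Barrier::None; // cannot create an old-to-young edge
    }
    if child == Generation::OldOrNonGc {
        return Barrier::None; // no young child to remember
    }
    Barrier::InlineCard // parent may be old and child may be young
}
```

The function is deliberately conservative: `Unknown` on either side falls through to the inline card barrier, so a missing fact costs performance but not correctness.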
+ +Type lowering can remove many barriers before codegen: + +```text +obj.x = 3.14 // no barrier +obj.flag = true // no barrier +obj.count = i32 // no barrier +youngObj.child = y // no old→young barrier +oldObj.child = y // inline card barrier only if y may be young pointer +``` + +This also argues for unboxed class fields. A `number` field stored as raw `f64` never needs a GC barrier. + +## Improve Map/Set lowering with key-specialized tables + +The current runtime has separate side-table indices for numeric keys, string keys, and sets, with content hashing for strings. + +The compiler should exploit that: + +```ts +const m = new Map<string, number>() +m.set(k, v) +m.get(k) +``` + +should lower to: + +```text +MapStringNumber +key: PerryStringRef / interned string id where possible +value: f64 +``` + +Not: + +```text +generic Map +generic hash +generic equality +boxed value +``` + +Recommended specializations: + +```text +Map +Map +Map +Map +Set +Set +``` + +For `Record` and object dictionaries, use the same idea: once an object crosses the `KEYS_INDEX_THRESHOLD`, compile dynamic key operations against a dictionary representation directly rather than repeatedly going through generic object field helpers. + +## Rework BigInt representation + +The current BigInt design uses fixed 1024-bit storage, mainly to satisfy crypto workloads where secp256k1 intermediates can exceed 512 bits. + +That is good for crypto kernels, but it is too heavy as the default BigInt representation. + +Recommended split: + +```text +SmallBigInt: + inline i64/u64 or two limbs + +MediumBigInt: + heap variable-limb Vec + +CryptoBigInt1024: + fixed 16-limb path for known crypto packages / specialized kernels +``` + +Then lowering can choose: + +```text +1n + 2n → SmallBigInt or constant fold +BigInt loop counter → SmallBigInt +crypto modular multiply → CryptoBigInt1024 +unknown BigInt expression → generic variable-limb BigInt +``` + +This prevents every BigInt from paying crypto-sized costs. 
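One way to picture the recommended BigInt split is a tagged value whose small case never touches the heap. The sketch below is an assumption-laden illustration, not Perry's runtime: `BigIntValue`, its variant names, the sign-plus-magnitude limb encoding, and `add_small` are all hypothetical, and only the small-plus-small addition path is shown.

```rust
/// Hypothetical three-tier BigInt value mirroring the split described above.
#[allow(dead_code)]
#[derive(Debug, PartialEq, Eq)]
enum BigIntValue {
    /// Inline 64-bit value, no allocation.
    Small(i64),
    /// Heap-allocated magnitude, little-endian 64-bit limbs, plus a sign.
    Medium { negative: bool, limbs: Vec<u64> },
    /// Fixed 16-limb (1024-bit) storage reserved for crypto kernels.
    Crypto1024([u64; 16]),
}

/// Adds two small BigInts, promoting to `Medium` only when `i64` overflows.
fn add_small(a: i64, b: i64) -> BigIntValue {
    match a.checked_add(b) {
        // Common case: the result still fits inline.
        Some(sum) => BigIntValue::Small(sum),
        None => {
            // The exact sum of two i64 values always fits in i128.
            let wide = i128::from(a) + i128::from(b);
            let magnitude = wide.unsigned_abs();
            let mut limbs = vec![magnitude as u64, (magnitude >> 64) as u64];
            // Drop high zero limbs so equal values compare equal.
            while limbs.len() > 1 && limbs.last() == Some(&0) {
                limbs.pop();
            }
            BigIntValue::Medium { negative: wide < 0, limbs }
        }
    }
}
```

With this shape, `1n + 2n` and typical loop counters stay in the `Small` variant, and the 1024-bit storage is paid for only by code that the compiler has routed to the crypto path.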
+ +## Treat async lowering as allocation lowering + +Perry already lowers async/await into generator/state-machine form, runs promise microtasks, and has optimized the microtask loop by using one outer `setjmp` instead of one per microtask. The async bridge also avoids creating `JSValue` objects on Tokio worker threads and defers conversion to the main thread because arenas are thread-local. + +The next performance step is to make async lowering typed: + +```text +async function f(): Promise<number> +``` + +should become: + +```text +state machine result slot: f64 +Promise continuation: typed f64 until boxed +``` + +Avoid: + +```text +state machine slot: JSValue +every await result: boxed +every continuation: generic closure +``` + +Recommended async optimizations: + +```text +typed result slots in state machines +typed capture slots +static continuation structs instead of generic closures +Promise allocation reuse for await chains +no boxing until resolving externally-visible Promise +microtask queue entries specialized by callback signature +``` + +The existing `MT_STEP_CHAIN_REUSE_HIT` style optimization should be expanded into a general “typed async continuation” path. + +## Use internal typed calling conventions + +Perry’s monomorphization is already a strong asset. Generics are specialized into mangled function/class names such as `identity$number`. 
+ +Extend that idea beyond TypeScript generics: + +```text +Function clone dimensions: + argument representation + return representation + receiver class id + array element kind + nullability + closure capture layout +``` + +Example: + +```text +sum$ArrayF64__f64 +sum$ArrayI32__i32 +sum$ArrayValue__JSValue +sum$generic +``` + +But control code size: + +```text +clone only loops/functions above threshold +clone only if helper calls disappear +clone only if call graph is stable +merge clones with same machine representation +cap clone count per function +use profile data to choose +``` + +This gives Perry a static analog to V8’s specialization without needing a JIT. + +## Add package-level specialization + +Perry can know the whole package at compile time. Use that. + +For npm/native packages that Perry supports directly, ship lowering profiles: + +```text +fastify: + object-shape stable request/response paths + string/header maps + async Promise chains + +mysql2: + row object shapes + date/string/buffer decoding + typed column arrays + +redis: + string/buffer heavy paths + command array flattening + +noble/ethers: + BigInt/Uint8Array crypto kernels + fixed-limb arithmetic +``` + +This should not be handwritten one-off hacks in codegen. It should be a declarative profile system: + +```text +known stable shapes +known method purity +known allocation patterns +known typed arrays +known no-dynamic-require subset +``` + +Then the normal optimizer consumes those facts. 
+ +## Make lowering observable + +Before adding many optimizations, add a compiler report: + +```text +perry build --explain-lowering +``` + +For each function: + +```text +boxes inserted: 42 +unboxes inserted: 17 +js_number_coerce calls: 3 +runtime property gets: 8 +direct field loads: 21 +bounds checks eliminated: 14 +barriers eliminated: 32 +object allocations scalar-replaced: 6 +array kind: PackedF64Array +generic fallback emitted: yes +reason scalar replacement failed: method call escapes receiver +reason bounds check kept: loop body may mutate length +reason typed call failed: callee return type unknown +``` + +This will pay for itself quickly. Perry’s biggest optimization risk is silent missed lowering: code still works, but one helper call in a hot loop destroys performance. + +## Recommended implementation order + +First, split internal `JSValue` representation into `i64 JSValueBits` and typed native values. Keep the existing external ABI if needed, but stop letting LLVM see boxed values as ordinary floating-point values unless they are actually numbers. + +Second, add a `TypeFacts` pass after HIR lowering and before LLVM generation. This should compute numeric domains, array kinds, object shapes, nullability, escape state, and side effects. + +Third, implement late boxing. Function bodies should use native typed values; boxes should be inserted only at returns, unknown calls, dynamic stores, closure captures, async suspension, thread serialization, and JS interop. + +Fourth, create internal typed function clones plus generic trampolines. Start with `number`, `int32`, `boolean`, `string`, and packed numeric arrays. + +Fifth, generalize array lowering into packed/holey/value/dictionary representations. Make the existing bounded-index and cached-length logic a special case of a broader array-fact system. + +Sixth, make fixed class layouts use unboxed fields and direct method calls when the receiver class is exact. 
Add method purity/effect summaries so scalar replacement works across simple method calls. + +Seventh, replace string-name dispatch with interned property/method IDs. Keep string fallback for dynamic cases only. + +Eighth, unify string handling around `PerryStringRef` so SSO and heap strings share one safe lowering path. + +Ninth, specialize Map/Set/Record by key and value kind. + +Tenth, apply PGO to choose which typed clones to emit and which generic paths to mark cold. + +## What not to do + +Do not rely on TypeScript annotations as runtime truth. TypeScript types guide optimization, but values can still enter through `any`, unknown package boundaries, dynamic data, JSON, V8/QuickJS handles, or native callbacks. Use annotations as optimistic facts only when the compiler can prove the boundary is closed, or emit guards. + +Do not globally enable fast-math. JavaScript `number` semantics are not C `-ffast-math` semantics. Signed zero, `NaN`, infinities, and coercions matter. + +Do not solve performance primarily by adding more runtime helpers. The goal is fewer helper calls in hot paths. + +Do not let monomorphization explode code size. Specialization should be costed. + +Do not keep expanding object side tables for hot class instances. Use side tables for dynamic dictionaries; use fixed typed layouts for classes and stable shapes. + +## Bottom line + +The right direction for Perry is: + +```text +AOT TypeScript compiler + + typed SSA + + late boxing + + guarded typed clones + + packed array/object layouts + + effect/range/escape analysis + + LLVM metadata + + generic JSValue fallback +``` + +Perry should not try to beat V8 by making a faster generic JS object model. It should beat V8 where AOT has an advantage: closed-world TypeScript, fixed class layouts, static imports, typed arrays, monomorphized functions, predictable async state machines, and compiler-proven loops. 
+ +The best single sentence guidance is: **make `JSValue` the fallback representation, not the default representation.** + +[1]: https://llvm.org/docs/HowToBuildWithPGO.html "How To Build Clang and LLVM with Profile-Guided Optimizations — LLVM 23.0.0git documentation" +[2]: https://llvm.org/docs/Passes.html?utm_source=chatgpt.com "LLVM's Analysis and Transform Passes" +[3]: https://llvm.org/docs/Vectorizers.html?utm_source=chatgpt.com "Auto-Vectorization in LLVM — LLVM 23.0.0git documentation" +[4]: https://llvm.org/docs/LangRef.html "LLVM Language Reference Manual — LLVM 23.0.0git documentation" diff --git a/benchmarks/compiler_output/fixtures/raw_numeric_object_fields.ts b/benchmarks/compiler_output/fixtures/raw_numeric_object_fields.ts new file mode 100644 index 0000000000..84a63d00c8 --- /dev/null +++ b/benchmarks/compiler_output/fixtures/raw_numeric_object_fields.ts @@ -0,0 +1,30 @@ +class Gauge { + value: number = 1.5; + total: number = 2.5; + note: any = "stable"; +} + +function forceDynamicRead(gauge: any): number { + return gauge.value; +} + +function forceDynamicWrite(gauge: any, value: any): void { + gauge.value = value; +} + +function rawNumericObjectFieldsChecksum(): number { + const fast = new Gauge(); + fast.value = 4.5; + fast.total = 7.5; + let sum = fast.value + fast.total; + + const fallback = new Gauge(); + forceDynamicWrite(fallback, "boxed"); + sum += typeof (fallback as any).value === "string" ? 
11 : 0; + forceDynamicWrite(fallback, 3.25); + sum += forceDynamicRead(fallback); + + return sum; +} + +console.log("raw_numeric_object_fields:" + rawNumericObjectFieldsChecksum()); diff --git a/benchmarks/compiler_output/workloads.toml b/benchmarks/compiler_output/workloads.toml index 38dac5600f..d2d816f4d0 100644 --- a/benchmarks/compiler_output/workloads.toml +++ b/benchmarks/compiler_output/workloads.toml @@ -632,12 +632,14 @@ detail = "numeric indexed read takes the guarded raw-f64 fast path and loads the [[workloads.numeric_arrays.ir_checks]] name = "numeric_array_uses_unboxed_set" -contains = "js_array_numeric_set_f64_unboxed" -detail = "numeric indexed write uses the guarded raw-f64 helper" +contains = "js_typed_feedback_numeric_array_index_set_guard" +regex = '''idxset\.(?:bounded_numeric_fast|inbounds)\.\d+:[\s\S]*?inttoptr i64 %\w+ to ptr[\s\S]*?call double @js_array_numeric_value_to_raw_f64\(double %\w+\)\s*\n\s*store double %\w+, ptr %\w+[^\n]*\n\s*br label %idxset\.(?:bounded_numeric_merge|merge)''' +regex_none = ["call i32 @js_array_numeric_set_f64_unboxed"] +detail = "numeric indexed write takes the guarded raw-f64 fast path, canonicalizes the value, and stores the raw slot inline" [[workloads.numeric_arrays.stdout_checks]] name = "numeric_arrays_checksum" -contains = "25" +equals = "25\n" detail = "numeric-array fixture stdout checksum" [workloads.numeric_arrays.native_rep_checks] @@ -653,14 +655,26 @@ bounds_state = "proven_or_guarded" consumed_fact_kind = "raw_f64_layout" consumed_fact_state = "consumed" +[[workloads.numeric_arrays.native_rep_checks.require_records]] +name = "numeric_array_push_guard_consumed" +expr_kind = "NumericArrayPush" +consumer = "js_array_numeric_push_f64_unboxed" +native_rep_name = "f64" +access_mode = "checked_native" +bounds_state = "proven_or_guarded" +consumed_fact_kind = "bounds" +consumed_fact_state = "consumed" + [[workloads.numeric_arrays.native_rep_checks.require_records]] name = 
"numeric_array_push_dynamic_fallback" expr_kind = "NumericArrayPush" consumer = "js_array_push_f64" access_mode = "dynamic_fallback" materialization_reason = "runtime_api" +fallback_reason = "runtime_api" rejected_fact_kind = "raw_f64_layout" rejected_fact_state = "rejected" +rejected_fact_reason = "runtime_api" [[workloads.numeric_arrays.native_rep_checks.require_records]] name = "numeric_array_push_dynamic_fallback_invalidates_layout" @@ -668,8 +682,21 @@ expr_kind = "NumericArrayPush" consumer = "js_array_push_f64" access_mode = "dynamic_fallback" materialization_reason = "runtime_api" +fallback_reason = "runtime_api" rejected_fact_kind = "raw_f64_layout" rejected_fact_state = "invalidated" +rejected_fact_reason = "runtime_api" + +[[workloads.numeric_arrays.native_rep_checks.require_records]] +name = "numeric_array_push_materialization_hazard_invalidated" +expr_kind = "NumericArrayPush" +consumer = "js_array_push_f64" +access_mode = "dynamic_fallback" +materialization_reason = "runtime_api" +fallback_reason = "runtime_api" +rejected_fact_kind = "materialization_hazard" +rejected_fact_state = "invalidated" +rejected_fact_reason = "runtime_api" [[workloads.numeric_arrays.native_rep_checks.require_records]] name = "numeric_array_get_fast_f64" @@ -681,14 +708,26 @@ bounds_state = "proven_or_guarded" consumed_fact_kind = "raw_f64_layout" consumed_fact_state = "consumed" +[[workloads.numeric_arrays.native_rep_checks.require_records]] +name = "numeric_array_get_guard_consumed" +expr_kind = "NumericArrayIndexGet" +consumer = "js_array_numeric_get_f64_unboxed" +native_rep_name = "f64" +access_mode = "checked_native" +bounds_state = "proven_or_guarded" +consumed_fact_kind = "bounds" +consumed_fact_state = "consumed" + [[workloads.numeric_arrays.native_rep_checks.require_records]] name = "numeric_array_get_dynamic_fallback" expr_kind = "NumericArrayIndexGet" consumer = "js_typed_feedback_array_index_get_fallback_boxed" access_mode = "dynamic_fallback" 
materialization_reason = "runtime_api" +fallback_reason = "runtime_api" rejected_fact_kind = "raw_f64_layout" rejected_fact_state = "rejected" +rejected_fact_reason = "runtime_api" [[workloads.numeric_arrays.native_rep_checks.require_records]] name = "numeric_array_get_dynamic_fallback_invalidates_layout" @@ -696,8 +735,21 @@ expr_kind = "NumericArrayIndexGet" consumer = "js_typed_feedback_array_index_get_fallback_boxed" access_mode = "dynamic_fallback" materialization_reason = "runtime_api" +fallback_reason = "runtime_api" rejected_fact_kind = "raw_f64_layout" rejected_fact_state = "invalidated" +rejected_fact_reason = "runtime_api" + +[[workloads.numeric_arrays.native_rep_checks.require_records]] +name = "numeric_array_get_materialization_hazard_invalidated" +expr_kind = "NumericArrayIndexGet" +consumer = "js_typed_feedback_array_index_get_fallback_boxed" +access_mode = "dynamic_fallback" +materialization_reason = "runtime_api" +fallback_reason = "runtime_api" +rejected_fact_kind = "materialization_hazard" +rejected_fact_state = "invalidated" +rejected_fact_reason = "runtime_api" [[workloads.numeric_arrays.native_rep_checks.require_records]] name = "numeric_array_set_fast_f64" @@ -709,14 +761,26 @@ bounds_state = "proven_or_guarded" consumed_fact_kind = "raw_f64_layout" consumed_fact_state = "consumed" +[[workloads.numeric_arrays.native_rep_checks.require_records]] +name = "numeric_array_set_guard_consumed" +expr_kind = "NumericArrayIndexSet" +consumer = "js_array_numeric_set_f64_unboxed" +native_rep_name = "f64" +access_mode = "checked_native" +bounds_state = "proven_or_guarded" +consumed_fact_kind = "bounds" +consumed_fact_state = "consumed" + [[workloads.numeric_arrays.native_rep_checks.require_records]] name = "numeric_array_set_dynamic_fallback" expr_kind = "NumericArrayIndexSet" consumer_contains = "fallback" access_mode = "dynamic_fallback" materialization_reason = "runtime_api" +fallback_reason = "runtime_api" rejected_fact_kind = "raw_f64_layout" 
rejected_fact_state = "rejected" +rejected_fact_reason = "runtime_api" [[workloads.numeric_arrays.native_rep_checks.require_records]] name = "numeric_array_set_dynamic_fallback_invalidates_layout" @@ -724,11 +788,24 @@ expr_kind = "NumericArrayIndexSet" consumer_contains = "fallback" access_mode = "dynamic_fallback" materialization_reason = "runtime_api" +fallback_reason = "runtime_api" rejected_fact_kind = "raw_f64_layout" rejected_fact_state = "invalidated" +rejected_fact_reason = "runtime_api" + +[[workloads.numeric_arrays.native_rep_checks.require_records]] +name = "numeric_array_set_materialization_hazard_invalidated" +expr_kind = "NumericArrayIndexSet" +consumer_contains = "fallback" +access_mode = "dynamic_fallback" +materialization_reason = "runtime_api" +fallback_reason = "runtime_api" +rejected_fact_kind = "materialization_hazard" +rejected_fact_state = "invalidated" +rejected_fact_reason = "runtime_api" [workloads.raw_numeric_object_fields] -source = "tests/raw_numeric_object_fields.ts" +source = "benchmarks/compiler_output/fixtures/raw_numeric_object_fields.ts" kind = "raw_numeric_object_fields" allow_dynamic_property_runtime = true allow_hot_loop_conversions = true @@ -759,6 +836,11 @@ write_barriers_traced = 64 boxed_number_allocations_static = 0 buffer_slow_path_accesses_static = 0 +[[workloads.raw_numeric_object_fields.stdout_checks]] +name = "raw_numeric_object_fields_checksum" +equals = "raw_numeric_object_fields:26.25\n" +detail = "raw numeric object field fixture stdout checksum" + [[workloads.raw_numeric_object_fields.ir_checks]] name = "raw_numeric_field_get_guard" contains = "js_typed_feedback_class_field_get_guard" @@ -769,11 +851,46 @@ name = "raw_numeric_field_set_guard" contains = "js_typed_feedback_class_field_set_guard" detail = "class numeric field writes are guarded before raw slot stores" +[[workloads.raw_numeric_object_fields.ir_checks]] +name = "raw_numeric_field_get_scoped_raw_load" +section = "llvm_before" +function_contains = 
"rawNumericObjectFieldsChecksum" +regex = '''class_field_get\.fast\.\d+:[\s\S]*?load double, ptr %\w+[\s\S]*?br label %class_field_get\.merge''' +detail = "checksum function raw numeric field read performs a scoped guarded load double" + +[[workloads.raw_numeric_object_fields.ir_checks]] +name = "raw_numeric_field_set_scoped_raw_store" +section = "llvm_before" +function_contains = "rawNumericObjectFieldsChecksum" +regex = '''class_field_set\.fast\.\d+:[\s\S]*?call double @js_array_numeric_value_to_raw_f64[\s\S]*?store double %\w+, ptr %\w+[\s\S]*?br label %class_field_set\.merge''' +detail = "checksum function raw numeric field write canonicalizes and performs a scoped guarded store double" + [workloads.raw_numeric_object_fields.native_rep_checks] allow_materialization_reasons = ["runtime_api"] +[[workloads.raw_numeric_object_fields.native_rep_checks.require_records]] +name = "raw_scalar_ctor_field_store_raw_f64" +source_function = "rawNumericObjectFieldsChecksum" +expr_kind = "ScalarThisFieldSet" +consumer = "scalar_object_field_store.raw_f64" +native_rep_name = "f64" +access_mode = "none" +consumed_fact_kind = "representation" +consumed_fact_state = "consumed" + +[[workloads.raw_numeric_object_fields.native_rep_checks.require_records]] +name = "raw_scalar_field_load_raw_f64" +source_function = "rawNumericObjectFieldsChecksum" +expr_kind = "ScalarObjectFieldGet" +consumer = "scalar_object_field_load.raw_f64" +native_rep_name = "f64" +access_mode = "none" +consumed_fact_kind = "representation" +consumed_fact_state = "consumed" + [[workloads.raw_numeric_object_fields.native_rep_checks.require_records]] name = "raw_class_field_get_fast_f64" +source_function = "rawNumericObjectFieldsChecksum" expr_kind = "ClassFieldGet" consumer = "class_field_get.raw_f64_load" native_rep_name = "f64" @@ -782,26 +899,56 @@ bounds_state = "proven_or_guarded" consumed_fact_kind = "raw_f64_layout" consumed_fact_state = "consumed" 
+[[workloads.raw_numeric_object_fields.native_rep_checks.require_records]] +name = "raw_class_field_get_guard_consumed" +source_function = "rawNumericObjectFieldsChecksum" +expr_kind = "ClassFieldGet" +consumer = "class_field_get.raw_f64_load" +native_rep_name = "f64" +access_mode = "checked_native" +bounds_state = "proven_or_guarded" +consumed_fact_kind = "bounds" +consumed_fact_state = "consumed" + [[workloads.raw_numeric_object_fields.native_rep_checks.require_records]] name = "raw_class_field_get_dynamic_fallback" +source_function = "rawNumericObjectFieldsChecksum" expr_kind = "ClassFieldGet" consumer = "js_object_get_field_by_name_f64" access_mode = "dynamic_fallback" materialization_reason = "runtime_api" +fallback_reason = "runtime_api" rejected_fact_kind = "raw_f64_layout" rejected_fact_state = "rejected" +rejected_fact_reason = "runtime_api" [[workloads.raw_numeric_object_fields.native_rep_checks.require_records]] name = "raw_class_field_get_dynamic_fallback_invalidates_layout" +source_function = "rawNumericObjectFieldsChecksum" expr_kind = "ClassFieldGet" consumer = "js_object_get_field_by_name_f64" access_mode = "dynamic_fallback" materialization_reason = "runtime_api" +fallback_reason = "runtime_api" rejected_fact_kind = "raw_f64_layout" rejected_fact_state = "invalidated" +rejected_fact_reason = "runtime_api" + +[[workloads.raw_numeric_object_fields.native_rep_checks.require_records]] +name = "raw_class_field_get_materialization_hazard_invalidated" +source_function = "rawNumericObjectFieldsChecksum" +expr_kind = "ClassFieldGet" +consumer = "js_object_get_field_by_name_f64" +access_mode = "dynamic_fallback" +materialization_reason = "runtime_api" +fallback_reason = "runtime_api" +rejected_fact_kind = "materialization_hazard" +rejected_fact_state = "invalidated" +rejected_fact_reason = "runtime_api" [[workloads.raw_numeric_object_fields.native_rep_checks.require_records]] name = "raw_class_field_set_fast_f64" +source_function = 
"rawNumericObjectFieldsChecksum" expr_kind = "ClassFieldSet" consumer = "class_field_set.raw_f64_store" native_rep_name = "f64" @@ -810,23 +957,52 @@ bounds_state = "proven_or_guarded" consumed_fact_kind = "raw_f64_layout" consumed_fact_state = "consumed" +[[workloads.raw_numeric_object_fields.native_rep_checks.require_records]] +name = "raw_class_field_set_guard_consumed" +source_function = "rawNumericObjectFieldsChecksum" +expr_kind = "ClassFieldSet" +consumer = "class_field_set.raw_f64_store" +native_rep_name = "f64" +access_mode = "checked_native" +bounds_state = "proven_or_guarded" +consumed_fact_kind = "bounds" +consumed_fact_state = "consumed" + [[workloads.raw_numeric_object_fields.native_rep_checks.require_records]] name = "raw_class_field_set_dynamic_fallback" +source_function = "rawNumericObjectFieldsChecksum" expr_kind = "ClassFieldSet" consumer = "js_object_set_field_by_name" access_mode = "dynamic_fallback" materialization_reason = "runtime_api" +fallback_reason = "runtime_api" rejected_fact_kind = "raw_f64_layout" rejected_fact_state = "rejected" +rejected_fact_reason = "runtime_api" [[workloads.raw_numeric_object_fields.native_rep_checks.require_records]] name = "raw_class_field_set_dynamic_fallback_invalidates_layout" +source_function = "rawNumericObjectFieldsChecksum" expr_kind = "ClassFieldSet" consumer = "js_object_set_field_by_name" access_mode = "dynamic_fallback" materialization_reason = "runtime_api" +fallback_reason = "runtime_api" rejected_fact_kind = "raw_f64_layout" rejected_fact_state = "invalidated" +rejected_fact_reason = "runtime_api" + +[[workloads.raw_numeric_object_fields.native_rep_checks.require_records]] +name = "raw_class_field_set_materialization_hazard_invalidated" +source_function = "rawNumericObjectFieldsChecksum" +expr_kind = "ClassFieldSet" +consumer = "js_object_set_field_by_name" +access_mode = "dynamic_fallback" +materialization_reason = "runtime_api" +fallback_reason = "runtime_api" +rejected_fact_kind = 
"materialization_hazard" +rejected_fact_state = "invalidated" +rejected_fact_reason = "runtime_api" [workloads.scalar_replacement_literals] source = "benchmarks/compiler_output/fixtures/scalar_replacement_literals.ts" @@ -870,7 +1046,7 @@ detail = "scalar-replaced literals do not use runtime property or array access h [[workloads.scalar_replacement_literals.stdout_checks]] name = "scalar_replacement_checksum" -contains = "17" +equals = "17\n" detail = "scalar-replacement fixture stdout checksum" [workloads.scalar_replacement_literals.native_rep_checks] @@ -957,7 +1133,7 @@ detail = "tracked typed arrays use native element-width loads" [[workloads.width_aware_buffer_kernels.stdout_checks]] name = "width_aware_buffer_kernels_checksum" -contains = "width_aware_buffer_kernels:38314632556" +equals = "width_aware_buffer_kernels:38314632556\n" detail = "width-aware buffer and typed-array semantic checksum" [workloads.width_aware_buffer_kernels.native_rep_checks] @@ -1105,7 +1281,7 @@ buffer_slow_path_accesses_static = 16 [[workloads.native_owned_typed_views.stdout_checks]] name = "native_owned_typed_views_checksum" -contains = "native_owned_typed_views:1383.25" +equals = "native_owned_typed_views:1383.25\n" detail = "native-owned typed view checksum" [workloads.native_owned_typed_views.native_rep_checks] @@ -1262,7 +1438,7 @@ buffer_slow_path_accesses_static = 0 [[workloads.native_pod_layout_constants.stdout_checks]] name = "native_pod_layout_constants_checksum" -contains = "native_pod_layout_constants:55" +equals = "native_pod_layout_constants:55\n" detail = "POD layout constants preserve native-arena packet checksum" [workloads.native_memory_bulk_fill] @@ -1294,7 +1470,7 @@ buffer_slow_path_accesses_static = 0 [[workloads.native_memory_bulk_fill.stdout_checks]] name = "native_memory_bulk_fill_checksum" -contains = "native_memory_bulk_fill:471" +equals = "native_memory_bulk_fill:471\n" detail = "NativeMemory.fillU32 and copy preserve packet checksum" 
[workloads.native_memory_bulk_fill.native_rep_checks] @@ -1356,7 +1532,7 @@ buffer_slow_path_accesses_static = 0 [[workloads.native_memory_fixture.stdout_checks]] name = "native_memory_fixture_checksum" -contains = "native_memory_fixture:1701" +equals = "native_memory_fixture:1701\n" detail = "native arenas, bulk fill/copy, POD view lowering, and native pod+count call preserve checksum" [workloads.native_memory_fixture.native_rep_checks] @@ -1437,7 +1613,7 @@ buffer_slow_path_accesses_static = 0 [[workloads.native_abi_packet_typed.stdout_checks]] name = "native_abi_packet_typed_checksum" -contains = "native_abi_packet_typed:" +equals = "native_abi_packet_typed:33688032\n" detail = "typed packet fixture emits a semantic checksum" [workloads.native_abi_packet_typed.native_rep_checks] @@ -1511,7 +1687,7 @@ buffer_slow_path_accesses_static = 128 [[workloads.native_abi_packet_control.stdout_checks]] name = "native_abi_packet_control_checksum" -contains = "native_abi_packet_control:" +equals = "native_abi_packet_control:33688032\n" detail = "control packet fixture emits a semantic checksum" [workloads.native_abi_packet_control.native_rep_checks] diff --git a/crates/perry-codegen/docs/native-representation.md b/crates/perry-codegen/docs/native-representation.md index eeed6ef8ee..bb5b87023f 100644 --- a/crates/perry-codegen/docs/native-representation.md +++ b/crates/perry-codegen/docs/native-representation.md @@ -14,8 +14,8 @@ generic JavaScript `double`/NaN-box value. the selected `NativeRep`, the LLVM type, and the SSA value. 3. A `NativeRep` describes the compiler contract, not an optimization by itself. Examples are `I32`, `U32`, `U64`, `USize`, `F32`, `F64`, `U8`, - `BufferLen`, `NativeHandle`, `PromiseBoundary`, `JsValue`, and - `BufferView`. + `BufferLen`, `NativeHandle`, `PromiseBoundary`, `JsValueBits`, `JsValue`, + and `BufferView`. 4. `materialize_js_value` is the boundary where a native value is converted back to the generic JS ABI representation. 
Each conversion records a `MaterializationReason` and, for native ABI crossings, a @@ -61,7 +61,8 @@ added. ## Native ABI Contract -Schema version 5 records explicit native ABI transitions. Native values may stay +Schema version 12 records explicit native ABI transitions and internal boxed +bits counts. Native values may stay region-local with their LLVM ABI type: - `I32`, `U32`, and `BufferLen`: LLVM `i32`; `U32` and `BufferLen` materialize @@ -72,13 +73,18 @@ region-local with their LLVM ABI type: - `F32`: LLVM `float`; JS-number materialization is explicit `fpext` to `double`. Raw `f32` records are not JS-visible. - `F64` and `JsValue`: LLVM `double`. +- `JsValueBits`: LLVM `i64`, used only as an internal NaN-box bit-pattern + representation. Public ABI records still use `JsValue`/`double`. - `BufferView`: LLVM `ptr`, scoped to the native buffer proof region. `native_abi_transition` records use `{ from_native_rep, to_native_rep, op, reason, lossy }`. Valid ops are `none`, `signed_int_to_float`, -`unsigned_int_to_float`, `float_extend`, `pointer_box`, and `promise_box`. -The legacy `scalar_conversion` field is still written for compatibility, but -new checks should read `native_abi_transition`. +`unsigned_int_to_float`, `float_extend`, `js_value_to_bits`, +`bits_to_js_value`, `pointer_box`, `native_handle_box`, and `promise_box`. +The `js_value_to_bits` and `bits_to_js_value` ops are plain bitcasts that mark +the boundary between the current `double` ABI and the optimizer-local boxed +bits representation. The legacy `scalar_conversion` field is still written for +compatibility, but new checks should read `native_abi_transition`. ## Verification Mode @@ -97,6 +103,8 @@ The verifier rejects records that claim: - `explicit_assume` as a bounds proof. - LLVM type mismatches for the claimed native rep. - JS-visible or materialized raw `F32` records. +- `JsValueBits` used as an external ABI descriptor or dynamic fallback record. 
+- Materialized `JsValueBits` records without a `js_value_to_bits` transition.
 - Escaping raw `NativeHandle` or `PromiseBoundary` records.
 - Native ABI transitions without a matching materialization reason.
 - Invalid transition ops or signedness, including implicit unsigned/signed
diff --git a/crates/perry-codegen/src/codegen/closure.rs b/crates/perry-codegen/src/codegen/closure.rs
index 9cd9f3e8c9..4de67dc939 100644
--- a/crates/perry-codegen/src/codegen/closure.rs
+++ b/crates/perry-codegen/src/codegen/closure.rs
@@ -246,6 +246,7 @@ pub(super) fn compile_closure(
         source_function: format!("closure_{}", func_id),
         source_function_slug: crate::expr::native_region_slug(&format!("closure_{}", func_id)),
         active_region_id: None,
+        native_facts: &native_facts,
         locals,
         local_types,
         current_block: 0,
diff --git a/crates/perry-codegen/src/codegen/entry.rs b/crates/perry-codegen/src/codegen/entry.rs
index af441609eb..862b23e10f 100644
--- a/crates/perry-codegen/src/codegen/entry.rs
+++ b/crates/perry-codegen/src/codegen/entry.rs
@@ -355,6 +355,7 @@ pub(super) fn compile_module_entry(
         source_function: "module_init".to_string(),
         source_function_slug: crate::expr::native_region_slug("module_init"),
         active_region_id: None,
+        native_facts: &main_native_facts,
         locals: HashMap::new(),
         local_types: init_local_types,
         current_block: 0,
@@ -794,6 +795,7 @@ pub(super) fn compile_module_entry(
         source_function: "module_init".to_string(),
         source_function_slug: crate::expr::native_region_slug("module_init"),
         active_region_id: None,
+        native_facts: &init_native_facts,
         locals: HashMap::new(),
         local_types: HashMap::new(),
         current_block: 0,
diff --git a/crates/perry-codegen/src/codegen/function.rs b/crates/perry-codegen/src/codegen/function.rs
index 6228ec749b..2c2b1c7ea5 100644
--- a/crates/perry-codegen/src/codegen/function.rs
+++ b/crates/perry-codegen/src/codegen/function.rs
@@ -155,6 +155,7 @@ pub(super) fn compile_function(
         source_function: f.name.clone(),
         source_function_slug:
crate::expr::native_region_slug(&f.name),
         active_region_id: None,
+        native_facts: &native_facts,
         locals,
         local_types,
         current_block: 0,
diff --git a/crates/perry-codegen/src/codegen/method.rs b/crates/perry-codegen/src/codegen/method.rs
index 42d8aba4d0..006b1e486d 100644
--- a/crates/perry-codegen/src/codegen/method.rs
+++ b/crates/perry-codegen/src/codegen/method.rs
@@ -142,6 +142,7 @@ pub(super) fn compile_method(
             class.name, method.name
         )),
         active_region_id: None,
+        native_facts: &native_facts,
         locals,
         local_types,
         current_block: 0,
@@ -674,6 +675,7 @@ pub(super) fn compile_static_method(
             class.name, f.name
         )),
         active_region_id: None,
+        native_facts: &native_facts,
         locals,
         local_types,
         current_block: 0,
diff --git a/crates/perry-codegen/src/collectors/hir_facts.rs b/crates/perry-codegen/src/collectors/hir_facts.rs
index 80299ea839..3328b6fa50 100644
--- a/crates/perry-codegen/src/collectors/hir_facts.rs
+++ b/crates/perry-codegen/src/collectors/hir_facts.rs
@@ -9,7 +9,7 @@ use std::collections::{HashMap, HashSet};
 /// existing native optimizations, and every consumer must keep the normal
 /// JSValue/NaN-boxed fallback at dynamic boundaries.
#[derive(Debug, Clone, Default)] -pub(crate) struct NativeRegionFactGraph { +pub(crate) struct TypeFacts { pub representation: RepresentationFacts, pub integer_range: IntegerRangeFacts, pub bounds: BoundsFacts, @@ -27,6 +27,8 @@ pub(crate) struct NativeRegionFactGraph { pub materialization_hazards: MaterializationHazardFacts, } +pub(crate) type NativeRegionFactGraph = TypeFacts; + #[derive(Debug, Clone, Default)] pub(crate) struct RepresentationFacts { pub integer_locals: HashSet, @@ -83,7 +85,8 @@ pub(crate) struct MaterializationHazardFacts { pub initially_known_hazard_locals: HashSet, } -impl NativeRegionFactGraph { +#[allow(dead_code)] +impl TypeFacts { pub(crate) fn integer_locals(&self) -> &HashSet { &self.representation.integer_locals } @@ -131,6 +134,53 @@ impl NativeRegionFactGraph { pub(crate) fn materialization_hazard_locals(&self) -> &HashSet { &self.materialization_hazards.initially_known_hazard_locals } + + pub(crate) fn proves_i32_lowering(&self, local_id: u32) -> bool { + self.representation.integer_locals.contains(&local_id) + || self + .integer_range + .strictly_i32_bounded_locals + .contains(&local_id) + } + + pub(crate) fn proves_unsigned_i32_lowering(&self, local_id: u32) -> bool { + self.representation.unsigned_i32_locals.contains(&local_id) + } + + pub(crate) fn proves_bounds_range_seed(&self, local_id: u32) -> bool { + self.bounds.range_seed_locals.contains(&local_id) + } + + pub(crate) fn proves_noalias_buffer(&self, local_id: u32) -> bool { + self.alias_noalias + .known_noalias_buffer_locals + .contains(&local_id) + } + + pub(crate) fn proves_pure_helper(&self, function_id: u32) -> bool { + self.purity.pure_helper_function_ids.contains(&function_id) + } + + pub(crate) fn platform_constant(&self, local_id: u32) -> Option { + self.platform_constants.constants.get(&local_id).copied() + } + + pub(crate) fn scalar_replaceable_object_locals(&self) -> &HashSet { + &self.shape_stability.scalar_replaceable_object_locals + } + + pub(crate) fn 
proves_scalar_replacement(&self, local_id: u32) -> bool { + self.shape_stability + .scalar_replaceable_object_locals + .contains(&local_id) + || self.escape.non_escaping_arrays.contains_key(&local_id) + } + + pub(crate) fn has_materialization_hazard(&self, local_id: u32) -> bool { + self.materialization_hazards + .initially_known_hazard_locals + .contains(&local_id) + } } /// Build the full native-region fact graph in one pass boundary. @@ -139,7 +189,7 @@ impl NativeRegionFactGraph { /// function is the single contract used by codegen entry points so new native /// consumers do not need to rediscover facts independently. #[allow(clippy::too_many_arguments)] -pub(crate) fn collect_native_region_fact_graph( +pub(crate) fn collect_type_facts( stmts: &[Stmt], flat_const_ids: &HashSet, clamp_fn_ids: &HashSet, @@ -148,7 +198,7 @@ pub(crate) fn collect_native_region_fact_graph( module_globals: &HashMap, classes: &HashMap, compile_time_constants: &HashMap, -) -> NativeRegionFactGraph { +) -> TypeFacts { let integer_locals = super::integer_locals::collect_integer_locals( stmts, flat_const_ids, @@ -180,7 +230,7 @@ pub(crate) fn collect_native_region_fact_graph( .chain(non_escaping_object_literals.keys()) .copied() .collect(); - let graph = NativeRegionFactGraph { + let graph = TypeFacts { representation: RepresentationFacts { integer_locals: integer_locals.clone(), unsigned_i32_locals, @@ -219,15 +269,38 @@ pub(crate) fn collect_native_region_fact_graph( graph } -// #854: thin wrapper over collect_native_region_fact_graph, currently only -// exercised by this module's unit tests; kept as the focused-collector entry seam. 
+#[allow(clippy::too_many_arguments)] +pub(crate) fn collect_native_region_fact_graph( + stmts: &[Stmt], + flat_const_ids: &HashSet, + clamp_fn_ids: &HashSet, + arg_dependent_clamp_fn_ids: &HashSet, + boxed_vars: &HashSet, + module_globals: &HashMap, + classes: &HashMap, + compile_time_constants: &HashMap, +) -> NativeRegionFactGraph { + collect_type_facts( + stmts, + flat_const_ids, + clamp_fn_ids, + arg_dependent_clamp_fn_ids, + boxed_vars, + module_globals, + classes, + compile_time_constants, + ) +} + +// #854: thin wrapper over collect_type_facts, currently only exercised by this +// module's unit tests; kept as the focused-collector entry point. #[allow(dead_code)] pub(crate) fn collect_hir_facts( stmts: &[Stmt], flat_const_ids: &HashSet, clamp_fn_ids: &HashSet, -) -> NativeRegionFactGraph { - collect_native_region_fact_graph( +) -> TypeFacts { + collect_type_facts( stmts, flat_const_ids, clamp_fn_ids, @@ -426,6 +499,7 @@ mod tests { ); assert!(facts.unsigned_i32_locals().contains(&2)); + assert!(facts.proves_unsigned_i32_lowering(2)); assert!(!facts.integer_locals().contains(&2)); } @@ -472,8 +546,11 @@ mod tests { ); assert!(graph.known_noalias_buffer_locals().contains(&1)); + assert!(graph.proves_noalias_buffer(1)); assert_eq!(graph.compile_time_constants().get(&90), Some(&1.0)); + assert_eq!(graph.platform_constant(90), Some(1.0)); assert!(graph.purity.pure_helper_function_ids.contains(&7)); + assert!(graph.proves_pure_helper(7)); } #[test] @@ -505,12 +582,17 @@ mod tests { ); assert!(graph.integer_locals().contains(&1)); + assert!(graph.proves_i32_lowering(1)); + assert!(graph.proves_bounds_range_seed(1)); assert!(graph.index_used_locals().contains(&1)); assert!(graph.non_escaping_object_literals().contains_key(&3)); assert!(graph .shape_stability .scalar_replaceable_object_locals .contains(&3)); + assert!(graph.scalar_replaceable_object_locals().contains(&3)); + assert!(graph.proves_scalar_replacement(3)); + 
assert!(!graph.has_materialization_hazard(3)); } // Regression: a mutable `let __d = undefined` seed (the shape the diff --git a/crates/perry-codegen/src/collectors/mod.rs b/crates/perry-codegen/src/collectors/mod.rs index 8552613726..c10ca79531 100644 --- a/crates/perry-codegen/src/collectors/mod.rs +++ b/crates/perry-codegen/src/collectors/mod.rs @@ -48,7 +48,9 @@ pub(crate) use escape_objects::{ check_object_literal_escapes_in_expr, check_object_literal_escapes_in_stmts, collect_non_escaping_object_literals, find_object_literal_candidates, }; -pub(crate) use hir_facts::{collect_hir_facts, collect_native_region_fact_graph}; +pub(crate) use hir_facts::{ + collect_hir_facts, collect_native_region_fact_graph, collect_type_facts, NativeRegionFactGraph, +}; pub(crate) use i32_locals::{ collect_integer_let_ids, collect_localset_ids_in_expr_filtered, collect_localset_ids_in_stmts, collect_localset_ids_in_stmts_filtered, collect_strictly_i32_bounded_locals, diff --git a/crates/perry-codegen/src/expr/bigint_set.rs b/crates/perry-codegen/src/expr/bigint_set.rs index 9bc6790b27..20696bd121 100644 --- a/crates/perry-codegen/src/expr/bigint_set.rs +++ b/crates/perry-codegen/src/expr/bigint_set.rs @@ -46,6 +46,39 @@ use super::{ I18nLowerCtx, }; +fn number_coerce_operand_is_already_primitive_number(ctx: &FnCtx<'_>, operand: &Expr) -> bool { + if crate::type_analysis::expr_may_return_boxed_value_from_raw_f64_fallback(ctx, operand) + || is_bigint_expr(ctx, operand) + { + return false; + } + match operand { + Expr::Integer(_) + | Expr::Number(_) + | Expr::PodLayoutSizeOf { .. } + | Expr::PodLayoutAlignOf { .. } + | Expr::PodLayoutOffsetOf { .. } + | Expr::DateNow + | Expr::Uint8ArrayLength(_) + | Expr::BufferLength(_) => true, + Expr::LocalGet(id) | Expr::Update { id, .. 
} => ctx.integer_locals.contains(id), + Expr::Binary { op, left, right } => match op { + BinaryOp::Add | BinaryOp::Sub | BinaryOp::Mul | BinaryOp::Div | BinaryOp::Mod => { + number_coerce_operand_is_already_primitive_number(ctx, left) + && number_coerce_operand_is_already_primitive_number(ctx, right) + } + BinaryOp::BitAnd + | BinaryOp::BitOr + | BinaryOp::BitXor + | BinaryOp::Shl + | BinaryOp::Shr + | BinaryOp::UShr => true, + BinaryOp::Pow => false, + }, + _ => false, + } +} + pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { match expr { Expr::ObjectRest { @@ -229,10 +262,15 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { // -------- Number(value) coercion -------- Expr::NumberCoerce(operand) => { + let already_number = number_coerce_operand_is_already_primitive_number(ctx, operand); let v = lower_expr(ctx, operand)?; - Ok(ctx - .block() - .call(DOUBLE, "js_number_coerce", &[(DOUBLE, &v)])) + if already_number { + Ok(v) + } else { + Ok(ctx + .block() + .call(DOUBLE, "js_number_coerce", &[(DOUBLE, &v)])) + } } // -------- set.add(value) — updates the local in place -------- diff --git a/crates/perry-codegen/src/expr/binary.rs b/crates/perry-codegen/src/expr/binary.rs index 11ffa42db5..d44284df3b 100644 --- a/crates/perry-codegen/src/expr/binary.rs +++ b/crates/perry-codegen/src/expr/binary.rs @@ -23,8 +23,10 @@ use crate::lower_string_method::{ use crate::nanbox::{double_literal, POINTER_MASK_I64}; #[allow(unused_imports)] use crate::type_analysis::{ - compute_auto_captures, is_array_expr, is_bigint_expr, is_bool_expr, is_map_expr, - is_numeric_expr, is_set_expr, is_string_expr, is_url_search_params_expr, receiver_class_name, + add_operands_have_pod_materialization_hazard, compute_auto_captures, + expr_may_return_boxed_value_from_raw_f64_fallback, is_array_expr, is_bigint_expr, is_bool_expr, + is_map_expr, is_numeric_expr, is_set_expr, is_string_expr, is_url_search_params_expr, + receiver_class_name, }; 
#[allow(unused_imports)] use crate::types::{DOUBLE, I1, I32, I64, I8, PTR}; @@ -46,6 +48,17 @@ use super::{ I18nLowerCtx, }; +fn lower_arithmetic_operand(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result<(String, bool)> { + if expr_may_return_boxed_value_from_raw_f64_fallback(ctx, expr) { + if let Some(value) = + super::index_get::lower_numeric_index_get_for_number_context(ctx, expr)? + { + return Ok((value, true)); + } + } + Ok((lower_expr(ctx, expr)?, false)) +} + pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { match expr { Expr::Binary { op, left, right } => { @@ -116,6 +129,7 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { // → string concat, BIGINT → bigint add, otherwise numeric. if !(crate::type_analysis::is_numeric_expr(ctx, left) && crate::type_analysis::is_numeric_expr(ctx, right)) + || add_operands_have_pod_materialization_hazard(ctx, left, right) { let l = lower_expr(ctx, left)?; let r = lower_expr(ctx, right)?; @@ -205,25 +219,29 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { return Ok(blk.sitofp(I64, &m, DOUBLE)); } - let l_raw = lower_expr(ctx, left)?; - let r_raw = lower_expr(ctx, right)?; + let (l_raw, l_fallback_coerced) = lower_arithmetic_operand(ctx, left)?; + let (r_raw, r_fallback_coerced) = lower_arithmetic_operand(ctx, right)?; // Coerce non-numeric operands to numbers for arithmetic. // JS: `true + true = 2`, `null + 1 = 1`, etc. Without // this, fadd on NaN-tagged booleans propagates the NaN // payload instead of computing 1.0 + 1.0 = 2.0. 
let l_numeric = is_numeric_expr(ctx, left); let r_numeric = is_numeric_expr(ctx, right); - let l = if l_numeric { - l_raw - } else { + let l_needs_coerce = !l_fallback_coerced + && (!l_numeric || expr_may_return_boxed_value_from_raw_f64_fallback(ctx, left)); + let r_needs_coerce = !r_fallback_coerced + && (!r_numeric || expr_may_return_boxed_value_from_raw_f64_fallback(ctx, right)); + let l = if l_needs_coerce { ctx.block() .call(DOUBLE, "js_number_coerce", &[(DOUBLE, &l_raw)]) - }; - let r = if r_numeric { - r_raw } else { + l_raw + }; + let r = if r_needs_coerce { ctx.block() .call(DOUBLE, "js_number_coerce", &[(DOUBLE, &r_raw)]) + } else { + r_raw }; let v = match op { BinaryOp::Add => { diff --git a/crates/perry-codegen/src/expr/buffer_access.rs b/crates/perry-codegen/src/expr/buffer_access.rs index e1f75bd278..07204d349a 100644 --- a/crates/perry-codegen/src/expr/buffer_access.rs +++ b/crates/perry-codegen/src/expr/buffer_access.rs @@ -206,7 +206,7 @@ fn lower_index_i32_value(ctx: &mut FnCtx<'_>, index: &Expr) -> Result, value: &Expr) -> Result { &ctx.i32_counter_slots, ctx.flat_const_arrays, &ctx.array_row_aliases, - ctx.integer_locals, + ctx.native_facts.integer_locals(), ctx.clamp3_functions, ctx.clamp_u8_functions, ctx.integer_returning_functions, @@ -639,7 +639,7 @@ pub(crate) fn lower_typed_array_store( &ctx.i32_counter_slots, ctx.flat_const_arrays, &ctx.array_row_aliases, - ctx.integer_locals, + ctx.native_facts.integer_locals(), ctx.clamp3_functions, ctx.clamp_u8_functions, ctx.integer_returning_functions, diff --git a/crates/perry-codegen/src/expr/buffer_views.rs b/crates/perry-codegen/src/expr/buffer_views.rs index 390ec3104e..1f0f02f6f3 100644 --- a/crates/perry-codegen/src/expr/buffer_views.rs +++ b/crates/perry-codegen/src/expr/buffer_views.rs @@ -1,4 +1,4 @@ -use perry_hir::Expr; +use perry_hir::{walker::walk_expr_children, Expr}; use crate::native_value::{ AliasState, BoundsState, BufferElem, BufferIndexUnit, BufferViewSlot, LengthSource, @@ 
-299,19 +299,12 @@ pub(crate) fn downgrade_buffer_aliases_in_expr( expr: &Expr, reason: MaterializationReason, ) { - match expr { - Expr::LocalGet(id) => downgrade_buffer_alias(ctx, *id, reason), - Expr::Binary { left, right, .. } => { - downgrade_buffer_aliases_in_expr(ctx, left, reason.clone()); - downgrade_buffer_aliases_in_expr(ctx, right, reason); - } - Expr::PropertyGet { object, .. } => downgrade_buffer_aliases_in_expr(ctx, object, reason), - Expr::IndexGet { object, index } => { - downgrade_buffer_aliases_in_expr(ctx, object, reason.clone()); - downgrade_buffer_aliases_in_expr(ctx, index, reason); - } - _ => {} + if let Expr::LocalGet(id) = expr { + downgrade_buffer_alias(ctx, *id, reason.clone()); } + walk_expr_children(expr, &mut |child| { + downgrade_buffer_aliases_in_expr(ctx, child, reason.clone()); + }); } pub(crate) fn buffer_access_materialization_reason( diff --git a/crates/perry-codegen/src/expr/call_spread.rs b/crates/perry-codegen/src/expr/call_spread.rs index 078e8c9a12..a58ceb58bc 100644 --- a/crates/perry-codegen/src/expr/call_spread.rs +++ b/crates/perry-codegen/src/expr/call_spread.rs @@ -21,6 +21,7 @@ use crate::lower_string_method::{ }; #[allow(unused_imports)] use crate::nanbox::{double_literal, POINTER_MASK_I64}; +use crate::native_value::MaterializationReason; #[allow(unused_imports)] use crate::type_analysis::{ compute_auto_captures, is_array_expr, is_bigint_expr, is_bool_expr, is_map_expr, @@ -31,10 +32,10 @@ use crate::types::{DOUBLE, I1, I32, I64, I8, PTR}; #[allow(unused_imports)] use super::{ - buffer_alias_metadata_suffix, can_lower_expr_as_i32, emit_layout_note_slot_on_block, - emit_shadow_slot_clear, emit_shadow_slot_update_for_expr, emit_string_literal_global, - emit_v8_export_call, emit_v8_member_method_call, emit_write_barrier, - emit_write_barrier_slot_on_block, expr_is_known_non_pointer_shadow_value, + buffer_alias_metadata_suffix, can_lower_expr_as_i32, downgrade_buffer_aliases_in_expr, + emit_layout_note_slot_on_block, 
emit_shadow_slot_clear, emit_shadow_slot_update_for_expr, + emit_string_literal_global, emit_v8_export_call, emit_v8_member_method_call, + emit_write_barrier, emit_write_barrier_slot_on_block, expr_is_known_non_pointer_shadow_value, extract_array_of_object_shape, i32_bool_to_nanbox, import_origin_suffix, is_global_this_builtin_function_name, is_global_this_builtin_name, is_known_finite, lower_array_literal, lower_channel_reduction, lower_expr, lower_expr_as_i32, @@ -58,6 +59,18 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { .iter() .filter(|a| matches!(a, CallArg::Expr(_))) .count(); + downgrade_buffer_aliases_in_expr(ctx, callee, MaterializationReason::UnknownCallEscape); + for arg in args { + match arg { + CallArg::Expr(expr) | CallArg::Spread(expr) => { + downgrade_buffer_aliases_in_expr( + ctx, + expr, + MaterializationReason::UnknownCallEscape, + ) + } + } + } // console.log(...arr) / .info / .warn / .error / .debug — bundle // every regular arg + every spread source into a single array, diff --git a/crates/perry-codegen/src/expr/calls.rs b/crates/perry-codegen/src/expr/calls.rs index 4e6b727203..147d110755 100644 --- a/crates/perry-codegen/src/expr/calls.rs +++ b/crates/perry-codegen/src/expr/calls.rs @@ -2389,6 +2389,11 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { )) } _ => { + super::downgrade_buffer_aliases_in_expr( + ctx, + callee, + crate::native_value::MaterializationReason::UnknownCallEscape, + ); for arg in args { super::downgrade_buffer_aliases_in_expr( ctx, @@ -2408,6 +2413,11 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { byte_offset, .. 
} => { + super::downgrade_buffer_aliases_in_expr( + ctx, + callee, + crate::native_value::MaterializationReason::UnknownCallEscape, + ); for arg in args { super::downgrade_buffer_aliases_in_expr( ctx, diff --git a/crates/perry-codegen/src/expr/compare.rs b/crates/perry-codegen/src/expr/compare.rs index e950fabf92..7fafb7d1a7 100644 --- a/crates/perry-codegen/src/expr/compare.rs +++ b/crates/perry-codegen/src/expr/compare.rs @@ -23,8 +23,9 @@ use crate::lower_string_method::{ use crate::nanbox::{double_literal, POINTER_MASK_I64}; #[allow(unused_imports)] use crate::type_analysis::{ - compute_auto_captures, is_array_expr, is_bigint_expr, is_bool_expr, is_map_expr, - is_numeric_expr, is_set_expr, is_string_expr, is_url_search_params_expr, receiver_class_name, + compute_auto_captures, expr_may_return_boxed_value_from_raw_f64_fallback, is_array_expr, + is_bigint_expr, is_bool_expr, is_map_expr, is_numeric_expr, is_set_expr, is_string_expr, + is_url_search_params_expr, receiver_class_name, }; #[allow(unused_imports)] use crate::types::{DOUBLE, I1, I32, I64, I8, PTR}; @@ -402,6 +403,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { // path below (and Dates are subsumed — they aren't numeric_expr). 
let both_numeric = is_numeric_expr(ctx, left) && is_numeric_expr(ctx, right) + && !expr_may_return_boxed_value_from_raw_f64_fallback(ctx, left) + && !expr_may_return_boxed_value_from_raw_f64_fallback(ctx, right) && !is_bigint_expr(ctx, left) && !is_bigint_expr(ctx, right); if is_relational_op && !both_numeric { diff --git a/crates/perry-codegen/src/expr/i32_fast_path.rs b/crates/perry-codegen/src/expr/i32_fast_path.rs index 6fb0277066..0a33bf90d8 100644 --- a/crates/perry-codegen/src/expr/i32_fast_path.rs +++ b/crates/perry-codegen/src/expr/i32_fast_path.rs @@ -5,7 +5,10 @@ use anyhow::Result; use perry_hir::{BinaryOp, Expr}; use super::{lower_expr, unbox_to_i64, FlatConstInfo, FnCtx}; -use crate::native_value::{ExpectedNativeRep, LoweredValue}; +use crate::native_value::{ + materialize_js_value_bits, ExpectedNativeRep, LoweredValue, MaterializationReason, +}; +use crate::type_analysis::{expr_may_return_boxed_value_from_raw_f64_fallback, is_numeric_expr}; use crate::types::{DOUBLE, F32, I32, I64}; /// Returns true if `e` is guaranteed to produce a finite double value @@ -358,6 +361,7 @@ pub(crate) fn lower_expr_native( expected: ExpectedNativeRep, ) -> Result { match expected { + ExpectedNativeRep::JsValueBits => lower_expr_native_js_value_bits(ctx, e), ExpectedNativeRep::I32 => lower_expr_native_i32(ctx, e), ExpectedNativeRep::I64 => lower_expr_native_i64(ctx, e), ExpectedNativeRep::U32 => lower_expr_native_u32(ctx, e), @@ -414,6 +418,10 @@ fn handle_id_lowered(value: String) -> LoweredValue { LoweredValue::handle_id(value) } +fn js_value_bits_lowered(value: String) -> LoweredValue { + LoweredValue::js_value_bits(value) +} + fn native_expr_kind(e: &Expr) -> &'static str { match e { Expr::Integer(_) => "Integer", @@ -555,6 +563,29 @@ fn lower_expr_native_i32(ctx: &mut FnCtx<'_>, e: &Expr) -> Result Ok(lowered) } +fn lower_expr_native_js_value_bits(ctx: &mut FnCtx<'_>, e: &Expr) -> Result { + let value = lower_expr(ctx, e)?; + let bits = materialize_js_value_bits( + 
ctx, + LoweredValue::js_value(value), + MaterializationReason::FunctionAbi, + ); + let lowered = js_value_bits_lowered(bits); + ctx.record_lowered_value( + native_expr_kind(e), + None, + "lower_expr_native_js_value_bits", + &lowered, + None, + None, + None, + false, + false, + Vec::new(), + ); + Ok(lowered) +} + fn lower_expr_native_u32(ctx: &mut FnCtx<'_>, e: &Expr) -> Result { let value = match e { Expr::Integer(n) if *n >= 0 && u32::try_from(*n).is_ok() => (*n as u32).to_string(), @@ -663,7 +694,15 @@ fn lower_expr_native_usize(ctx: &mut FnCtx<'_>, e: &Expr) -> Result, e: &Expr) -> Result { - let value = lower_expr(ctx, e)?; + let needs_raw_f64_fallback_coercion = expr_may_return_boxed_value_from_raw_f64_fallback(ctx, e) + || matches!(e, Expr::IndexGet { .. }) && is_numeric_expr(ctx, e); + let raw = lower_expr(ctx, e)?; + let value = if needs_raw_f64_fallback_coercion { + ctx.block() + .call(DOUBLE, "js_number_coerce", &[(DOUBLE, &raw)]) + } else { + raw + }; let lowered = f64_lowered(value); ctx.record_lowered_value( native_expr_kind(e), @@ -681,7 +720,15 @@ fn lower_expr_native_f64(ctx: &mut FnCtx<'_>, e: &Expr) -> Result } fn lower_expr_native_f32(ctx: &mut FnCtx<'_>, e: &Expr) -> Result { - let d = lower_expr(ctx, e)?; + let needs_raw_f64_fallback_coercion = expr_may_return_boxed_value_from_raw_f64_fallback(ctx, e) + || matches!(e, Expr::IndexGet { .. 
}) && is_numeric_expr(ctx, e); + let raw = lower_expr(ctx, e)?; + let d = if needs_raw_f64_fallback_coercion { + ctx.block() + .call(DOUBLE, "js_number_coerce", &[(DOUBLE, &raw)]) + } else { + raw + }; let value = ctx.block().fptrunc(DOUBLE, &d, F32); let lowered = f32_lowered(value); ctx.record_lowered_value( diff --git a/crates/perry-codegen/src/expr/index.rs b/crates/perry-codegen/src/expr/index.rs index 3e6fad1fd8..d3a12f701c 100644 --- a/crates/perry-codegen/src/expr/index.rs +++ b/crates/perry-codegen/src/expr/index.rs @@ -15,6 +15,14 @@ use crate::native_value::{ }; use crate::types::{DOUBLE, I1, I32, I64}; +fn canonicalize_raw_f64_numeric_store_value(blk: &mut LlBlock, value_double: &str) -> String { + blk.call( + DOUBLE, + "js_array_numeric_value_to_raw_f64", + &[(DOUBLE, value_double)], + ) +} + /// Inline fast-path lowering for `local_arr[i] = v`. /// /// Compiles to: @@ -144,7 +152,7 @@ pub(crate) fn lower_index_set_fast( &[ (I64, feedback_site_id), (DOUBLE, arr_box), - (I32, &idx_i32), + (DOUBLE, idx_double), (DOUBLE, val_double), ], ); @@ -211,14 +219,14 @@ pub(crate) fn lower_index_set_fast( ctx.current_block = inbounds_idx; { let blk = ctx.block(); + let (element_addr, element_ptr) = element_slot(blk, &arr_handle, &idx_i32); if require_numeric_layout { - blk.call( - I32, - "js_array_numeric_set_f64_unboxed", - &[(I64, &arr_handle), (I32, &idx_i32), (DOUBLE, val_double)], - ); + let numeric_value = canonicalize_raw_f64_numeric_store_value(blk, val_double); + // GC_STORE_AUDIT(POINTER_FREE): require_numeric_layout proves the + // array is raw-f64 and the value is canonicalized to a plain f64 — + // no GC pointer is written into the slot, so no write barrier. + blk.store(DOUBLE, &numeric_value, &element_ptr); } else { - let (element_addr, element_ptr) = element_slot(blk, &arr_handle, &idx_i32); // In-place overwrite of a non-raw-layout (e.g. 
downgraded `any[]`) // array element: the slot holds a valid value, so the scalar-aware // note skips the GC layout hashmap on scalar-over-scalar stores @@ -299,20 +307,28 @@ pub(crate) fn lower_index_set_fast( { let blk = ctx.block(); let (element_addr, element_ptr) = element_slot(blk, &arr_handle, &idx_i32); - let value_bits = emit_jsvalue_slot_store_on_block( - blk, - &element_ptr, - val_double, - &arr_handle, - &idx_i32, - layout_note_needed, - &arr_handle, - &element_addr, - write_barrier_needed, - ) - .unwrap_or_else(|| blk.bitcast_double_to_i64(val_double)); - if !value_is_numeric { - emit_array_numeric_write_note_on_block(blk, &arr_handle, &value_bits); + if require_numeric_layout { + let numeric_value = canonicalize_raw_f64_numeric_store_value(blk, val_double); + // GC_STORE_AUDIT(POINTER_FREE): require_numeric_layout proves the + // array is raw-f64 and the value is canonicalized to a plain f64 — + // no GC pointer is written into the slot, so no write barrier. + blk.store(DOUBLE, &numeric_value, &element_ptr); + } else { + let value_bits = emit_jsvalue_slot_store_on_block( + blk, + &element_ptr, + val_double, + &arr_handle, + &idx_i32, + layout_note_needed, + &arr_handle, + &element_addr, + write_barrier_needed, + ) + .unwrap_or_else(|| blk.bitcast_double_to_i64(val_double)); + if !value_is_numeric { + emit_array_numeric_write_note_on_block(blk, &arr_handle, &value_bits); + } } // Bump length: store idx+1 to arr_ptr+0. 
let new_len = blk.add(I32, &idx_i32, "1"); diff --git a/crates/perry-codegen/src/expr/index_get.rs b/crates/perry-codegen/src/expr/index_get.rs index 6ba83f993e..782374c06a 100644 --- a/crates/perry-codegen/src/expr/index_get.rs +++ b/crates/perry-codegen/src/expr/index_get.rs @@ -193,6 +193,7 @@ fn lower_guarded_array_index_get( idx_i32: &str, block_prefix: &str, require_numeric_layout: bool, + coerce_numeric_fallback: bool, ) -> Result { let contract = if require_numeric_layout { TypedFeedbackContract::numeric_array_get_index() @@ -235,7 +236,7 @@ fn lower_guarded_array_index_get( ctx.block().cond_br(&guard_ok, &fast_label, &fallback_label); ctx.current_block = fallback_idx; - let fallback_val = ctx.block().call( + let fallback_boxed = ctx.block().call( DOUBLE, "js_typed_feedback_array_index_get_fallback_boxed", &[ @@ -244,15 +245,16 @@ fn lower_guarded_array_index_get( (DOUBLE, idx_box), ], ); + let fallback_val = if require_numeric_layout && coerce_numeric_fallback { + ctx.block() + .call(DOUBLE, "js_number_coerce", &[(DOUBLE, &fallback_boxed)]) + } else { + fallback_boxed.clone() + }; let fallback_end_label = ctx.block().label.clone(); ctx.block().br(&merge_label); if require_numeric_layout { - let fallback = LoweredValue { - semantic: SemanticKind::JsValue, - rep: NativeRep::JsValue, - llvm_ty: DOUBLE, - value: fallback_val.clone(), - }; + let fallback = LoweredValue::js_value(fallback_boxed.clone()); ctx.record_lowered_value_with_access_mode_and_facts( "NumericArrayIndexGet", None, @@ -363,6 +365,51 @@ fn lower_guarded_array_index_get( )) } +pub(crate) fn lower_numeric_index_get_for_number_context( + ctx: &mut FnCtx<'_>, + expr: &Expr, +) -> Result> { + let Expr::IndexGet { object, index } = expr else { + return Ok(None); + }; + if !is_array_expr(ctx, object) || !expr_has_numeric_pointer_free_array_layout(ctx, object) { + return Ok(None); + } + + if let (Expr::LocalGet(arr_id), Expr::LocalGet(idx_id)) = (object.as_ref(), index.as_ref()) { + if ctx + 
.bounded_index_pairs + .iter() + .any(|fact| fact.index_local_id == *idx_id && fact.array_local_id == *arr_id) + { + let arr_box = lower_expr(ctx, object)?; + let i32_slot_opt = ctx.i32_counter_slots.get(idx_id).cloned(); + let idx_i32 = if let Some(ref i32_slot) = i32_slot_opt { + ctx.block().load(I32, i32_slot) + } else { + let idx_double = lower_expr(ctx, index)?; + ctx.block().fptosi(DOUBLE, &idx_double, I32) + }; + let idx_double = ctx.block().sitofp(I32, &idx_i32, DOUBLE); + return lower_guarded_array_index_get( + ctx, + &arr_box, + &idx_double, + &idx_i32, + "bidx.num", + true, + true, + ) + .map(Some); + } + } + + let arr_box = lower_expr(ctx, object)?; + let idx_double = lower_expr(ctx, index)?; + let idx_i32 = ctx.block().fptosi(DOUBLE, &idx_double, I32); + lower_guarded_array_index_get(ctx, &arr_box, &idx_double, &idx_i32, "arr", true, true).map(Some) +} + fn lower_bounded_array_index_get( ctx: &mut FnCtx<'_>, arr_box: &str, @@ -678,7 +725,13 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { if let Some(k) = k { if k < slots.len() { let value = ctx.block().load(DOUBLE, &slots[k]); - let lowered = LoweredValue { + let raw_f64_element = + crate::type_analysis::scalar_replaced_array_element_is_raw_f64( + ctx, + object.as_ref(), + index.as_ref(), + ); + let lowered_js = LoweredValue { semantic: SemanticKind::JsValue, rep: NativeRep::JsValue, llvm_ty: DOUBLE, @@ -688,15 +741,34 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { "ScalarArrayIndexGet", Some(*id), "scalar_array_element_load", - &lowered, + &lowered_js, None, None, None, None, false, false, - vec![format!("index={}", k)], + vec![ + format!("index={}", k), + format!("raw_f64_element={}", raw_f64_element as u8), + ], ); + if raw_f64_element { + let lowered_f64 = LoweredValue::f64(value.clone()); + ctx.record_lowered_value_with_access_mode( + "ScalarArrayIndexGet", + Some(*id), + "scalar_array_element_load.raw_f64", + &lowered_f64, + None, + None, + None, + None, 
+ false, + false, + vec![format!("index={}", k), "raw_f64_element=1".to_string()], + ); + } return Ok(value); } } @@ -844,6 +916,7 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { &idx_i32, "bidx.num", true, + false, ); } return lower_bounded_array_index_get(ctx, &arr_box, &idx_i32); @@ -865,6 +938,7 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { &idx_i32, "arr", require_numeric_layout, + false, ); } // Generic dynamic object access: stringify the index (no-op diff --git a/crates/perry-codegen/src/expr/index_set.rs b/crates/perry-codegen/src/expr/index_set.rs index 0ede0171f6..4aebefb1e1 100644 --- a/crates/perry-codegen/src/expr/index_set.rs +++ b/crates/perry-codegen/src/expr/index_set.rs @@ -22,7 +22,8 @@ use crate::lower_string_method::{ #[allow(unused_imports)] use crate::nanbox::{double_literal, POINTER_MASK_I64}; use crate::native_value::{ - BoundsState, BufferAccessMode, LoweredValue, MaterializationReason, NativeRep, SemanticKind, + BoundsState, BufferAccessMode, ExpectedNativeRep, LoweredValue, MaterializationReason, + NativeRep, SemanticKind, }; #[allow(unused_imports)] use crate::type_analysis::{ @@ -44,7 +45,7 @@ use super::{ expr_has_numeric_pointer_free_array_layout, expr_is_known_non_pointer_shadow_value, extract_array_of_object_shape, i32_bool_to_nanbox, import_origin_suffix, is_global_this_builtin_function_name, is_global_this_builtin_name, is_known_finite, - lower_array_literal, lower_channel_reduction, lower_expr, lower_expr_as_i32, + lower_array_literal, lower_channel_reduction, lower_expr, lower_expr_as_i32, lower_expr_native, lower_index_set_fast, lower_js_args_array, lower_object_literal, lower_stream_super_init, lower_typed_array_store, lower_url_string_getter, materialize_js_value, nanbox_bigint_inline, nanbox_pointer_inline, nanbox_pointer_inline_pub, nanbox_string_inline, proxy_build_args_array, @@ -54,6 +55,30 @@ use super::{ TypedFeedbackKind, }; +fn canonicalize_raw_f64_numeric_store_value( + 
blk: &mut crate::block::LlBlock, + value_double: &str, +) -> String { + blk.call( + DOUBLE, + "js_array_numeric_value_to_raw_f64", + &[(DOUBLE, value_double)], + ) +} + +fn lower_value_for_optional_barrier( + ctx: &mut FnCtx<'_>, + value: &Expr, + write_barrier_needed: bool, +) -> Result<(String, Option<String>)> { + if !write_barrier_needed { + return Ok((lower_expr(ctx, value)?, None)); + } + let value_bits = lower_expr_native(ctx, value, ExpectedNativeRep::JsValueBits)?.value; + let value_double = ctx.block().bitcast_i64_to_double(&value_bits); + Ok((value_double, Some(value_bits))) +} + fn is_width_tracked_typed_array_receiver(ctx: &FnCtx<'_>, object: &Expr) -> bool { matches!( receiver_class_name(ctx, object).as_deref(), @@ -210,7 +235,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result<String> { let arr_box = lower_expr(ctx, object)?; let key_box = lower_expr(ctx, index)?; let value_needs_barrier = array_store_needs_write_barrier(ctx, value); - let val_double = lower_expr(ctx, value)?; + let (val_double, val_bits) = + lower_value_for_optional_barrier(ctx, value, value_needs_barrier)?; let (arr_handle, key_handle) = { let blk = ctx.block(); let arr_handle = unbox_to_i64(blk, &arr_box); @@ -234,8 +260,9 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result<String> { ], ); if value_needs_barrier { - let val_bits = ctx.block().bitcast_double_to_i64(&val_double); let arr_bits = ctx.block().bitcast_double_to_i64(&arr_box); + let val_bits = + val_bits.unwrap_or_else(|| ctx.block().bitcast_double_to_i64(&val_double)); emit_write_barrier(ctx, &arr_bits, &val_bits); } return Ok(val_double); @@ -340,15 +367,15 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result<String> { let require_numeric_layout = value_is_numeric && expr_has_numeric_pointer_free_array_layout(ctx, object); let arr_box = lower_expr(ctx, object)?; - let val_double = lower_expr(ctx, value)?; + let idx_double = lower_expr(ctx, index)?; // Grab i32 slot name before mutably borrowing ctx for
block(). let i32_slot_opt = ctx.i32_counter_slots.get(idx_id).cloned(); let idx_i32 = if let Some(ref i32_slot) = i32_slot_opt { ctx.block().load(I32, i32_slot) } else { - let idx_double = lower_expr(ctx, index)?; ctx.block().fptosi(DOUBLE, &idx_double, I32) }; + let val_double = lower_expr(ctx, value)?; if require_numeric_layout { let feedback_site_id = emit_typed_feedback_register_site( ctx, @@ -388,7 +415,7 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { &[ (I64, &feedback_site_id), (DOUBLE, &arr_box), - (I32, &idx_i32), + (DOUBLE, &idx_double), (DOUBLE, &val_double), ], ); @@ -439,11 +466,23 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { let blk = ctx.block(); let arr_bits = blk.bitcast_double_to_i64(&arr_box); let arr_handle = blk.and(I64, &arr_bits, POINTER_MASK_I64); - blk.call( - I32, - "js_array_numeric_set_f64_unboxed", - &[(I64, &arr_handle), (I32, &idx_i32), (DOUBLE, &val_double)], - ); + // The numeric-array set guard above was called with + // `in_bounds=true`, so it has already proved a live, + // non-forwarded plain Array in raw-f64 layout, a numeric + // RHS, and an in-bounds index. Store the f64 slot inline + // instead of calling the helper that re-validates the same + // facts before doing this store. + let idx_i64 = blk.zext(I32, &idx_i32, I64); + let byte_offset = blk.shl(I64, &idx_i64, "3"); + let with_header = blk.add(I64, &byte_offset, "8"); + let element_addr = blk.add(I64, &arr_handle, &with_header); + let element_ptr = blk.inttoptr(I64, &element_addr); + let numeric_value = + canonicalize_raw_f64_numeric_store_value(blk, &val_double); + // GC_STORE_AUDIT(POINTER_FREE): guarded raw-f64 + // numeric store — the canonicalized value is a + // plain f64, never a GC pointer, so no barrier. 
+ blk.store(DOUBLE, &numeric_value, &element_ptr); blk.br(&merge_label); } let stored = LoweredValue { diff --git a/crates/perry-codegen/src/expr/js_runtime.rs b/crates/perry-codegen/src/expr/js_runtime.rs index 62211677bd..6a4172a037 100644 --- a/crates/perry-codegen/src/expr/js_runtime.rs +++ b/crates/perry-codegen/src/expr/js_runtime.rs @@ -21,6 +21,7 @@ use crate::lower_string_method::{ }; #[allow(unused_imports)] use crate::nanbox::{double_literal, POINTER_MASK_I64}; +use crate::native_value::MaterializationReason; #[allow(unused_imports)] use crate::type_analysis::{ compute_auto_captures, is_array_expr, is_bigint_expr, is_bool_expr, is_map_expr, @@ -31,10 +32,10 @@ use crate::types::{DOUBLE, I1, I32, I64, I8, PTR}; #[allow(unused_imports)] use super::{ - buffer_alias_metadata_suffix, can_lower_expr_as_i32, emit_layout_note_slot_on_block, - emit_shadow_slot_clear, emit_shadow_slot_update_for_expr, emit_string_literal_global, - emit_v8_export_call, emit_v8_member_method_call, emit_write_barrier, - emit_write_barrier_slot_on_block, expr_is_known_non_pointer_shadow_value, + buffer_alias_metadata_suffix, can_lower_expr_as_i32, downgrade_buffer_aliases_in_expr, + emit_layout_note_slot_on_block, emit_shadow_slot_clear, emit_shadow_slot_update_for_expr, + emit_string_literal_global, emit_v8_export_call, emit_v8_member_method_call, + emit_write_barrier, emit_write_barrier_slot_on_block, expr_is_known_non_pointer_shadow_value, extract_array_of_object_shape, i32_bool_to_nanbox, import_origin_suffix, is_global_this_builtin_function_name, is_global_this_builtin_name, is_known_finite, lower_array_literal, lower_channel_reduction, lower_expr, lower_expr_as_i32, @@ -46,6 +47,16 @@ use super::{ I18nLowerCtx, }; +fn downgrade_unknown_call_expr(ctx: &mut FnCtx<'_>, expr: &Expr) { + downgrade_buffer_aliases_in_expr(ctx, expr, MaterializationReason::UnknownCallEscape); +} + +fn downgrade_unknown_call_args(ctx: &mut FnCtx<'_>, args: &[Expr]) { + for arg in args { + 
downgrade_unknown_call_expr(ctx, arg); + } +} + pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { match expr { Expr::JsLoadModule { path } => { @@ -71,6 +82,7 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { module_handle, export_name, } => { + downgrade_unknown_call_expr(ctx, module_handle); let handle_dbl = lower_expr(ctx, module_handle)?; let (bytes_global, byte_len) = { let idx = ctx.strings.intern(export_name); @@ -92,6 +104,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { func_name, args, } => { + downgrade_unknown_call_expr(ctx, module_handle); + downgrade_unknown_call_args(ctx, args); let handle_dbl = lower_expr(ctx, module_handle)?; let (bytes_global, byte_len) = { let idx = ctx.strings.intern(func_name); @@ -123,6 +137,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { method_name, args, } => { + downgrade_unknown_call_expr(ctx, object); + downgrade_unknown_call_args(ctx, args); let obj_dbl = lower_expr(ctx, object)?; let (bytes_global, byte_len) = { let idx = ctx.strings.intern(method_name); @@ -149,6 +165,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { } Expr::JsCallValue { callee, args } => { + downgrade_unknown_call_expr(ctx, callee); + downgrade_unknown_call_args(ctx, args); let func_dbl = lower_expr(ctx, callee)?; let mut lowered_args: Vec = Vec::with_capacity(args.len()); for arg in args { @@ -166,6 +184,7 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { object, property_name, } => { + downgrade_unknown_call_expr(ctx, object); let obj_dbl = lower_expr(ctx, object)?; let (bytes_global, byte_len) = { let idx = ctx.strings.intern(property_name); @@ -185,6 +204,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { property_name, value, } => { + downgrade_unknown_call_expr(ctx, object); + downgrade_unknown_call_expr(ctx, value); let obj_dbl = lower_expr(ctx, object)?; let val_dbl = lower_expr(ctx, value)?; let 
(bytes_global, byte_len) = { @@ -210,6 +231,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { class_name, args, } => { + downgrade_unknown_call_expr(ctx, module_handle); + downgrade_unknown_call_args(ctx, args); let handle_dbl = lower_expr(ctx, module_handle)?; let (bytes_global, byte_len) = { let idx = ctx.strings.intern(class_name); @@ -237,6 +260,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { } Expr::JsNewFromHandle { constructor, args } => { + downgrade_unknown_call_expr(ctx, constructor); + downgrade_unknown_call_args(ctx, args); let ctor_dbl = lower_expr(ctx, constructor)?; let mut lowered_args: Vec = Vec::with_capacity(args.len()); for arg in args { @@ -270,6 +295,7 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { closure, param_count, } => { + downgrade_unknown_call_expr(ctx, closure); let closure_dbl = lower_expr(ctx, closure)?; let blk = ctx.block(); let closure_i64 = unbox_to_i64(blk, &closure_dbl); diff --git a/crates/perry-codegen/src/expr/mod.rs b/crates/perry-codegen/src/expr/mod.rs index 3c8800f0bf..ddf9e96523 100644 --- a/crates/perry-codegen/src/expr/mod.rs +++ b/crates/perry-codegen/src/expr/mod.rs @@ -14,6 +14,7 @@ use perry_types::Type as HirType; use crate::block::LlBlock; use crate::codegen::AppMetadata; +use crate::collectors::NativeRegionFactGraph; use crate::function::LlFunction; use crate::lower_call::{lower_call, lower_native_method_call, lower_new}; use crate::lower_conditional::{lower_conditional, lower_logical, lower_truthy}; @@ -155,6 +156,14 @@ pub(crate) struct FnCtx<'a> { pub source_function_slug: String, /// Stable id for the labeled loop currently being lowered. pub active_region_id: Option, + /// Full native-region fact graph collected for this lowered HIR region. + /// + /// Existing fields below borrow individual subgraphs for compatibility + /// with older lowering consumers. 
New native-lowering decisions should + /// prefer this structured graph so representation, range, bounds, alias, + /// escape, shape, constants, and materialization-hazard facts stay tied + /// to the same collector snapshot. + pub native_facts: &'a NativeRegionFactGraph, /// Map from HIR LocalId → LLVM alloca pointer (e.g. `%r3`). pub locals: std::collections::HashMap, /// Map from HIR LocalId → static HIR Type. Used by `is_string_expr` and @@ -2049,7 +2058,9 @@ pub(crate) fn lower_expr(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { pub(crate) fn lower_math_operand(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { let raw = lower_expr(ctx, expr)?; - if is_numeric_expr(ctx, expr) { + if is_numeric_expr(ctx, expr) + && !crate::type_analysis::expr_may_return_boxed_value_from_raw_f64_fallback(ctx, expr) + { Ok(raw) } else { Ok(ctx diff --git a/crates/perry-codegen/src/expr/native_record.rs b/crates/perry-codegen/src/expr/native_record.rs index f5c52f1336..4d93178558 100644 --- a/crates/perry-codegen/src/expr/native_record.rs +++ b/crates/perry-codegen/src/expr/native_record.rs @@ -73,6 +73,13 @@ pub(super) fn native_fact_uses_for_record( let mut consumed = Vec::new(); let mut rejected = Vec::new(); match &lowered.rep { + NativeRep::JsValueBits => consumed.push(native_fact_use( + "representation", + local_id, + "consumed", + "js_value_bits", + None, + )), NativeRep::JsValue => {} NativeRep::I32 => consumed.push(native_fact_use( "representation", diff --git a/crates/perry-codegen/src/expr/new_dynamic.rs b/crates/perry-codegen/src/expr/new_dynamic.rs index ea0789d1af..63cc59e0f3 100644 --- a/crates/perry-codegen/src/expr/new_dynamic.rs +++ b/crates/perry-codegen/src/expr/new_dynamic.rs @@ -21,6 +21,7 @@ use crate::lower_string_method::{ }; #[allow(unused_imports)] use crate::nanbox::{double_literal, POINTER_MASK_I64}; +use crate::native_value::MaterializationReason; #[allow(unused_imports)] use crate::type_analysis::{ compute_auto_captures, is_array_expr, is_bigint_expr, 
is_bool_expr, is_map_expr, @@ -31,10 +32,10 @@ use crate::types::{DOUBLE, I1, I32, I64, I8, PTR}; #[allow(unused_imports)] use super::{ - buffer_alias_metadata_suffix, can_lower_expr_as_i32, emit_layout_note_slot_on_block, - emit_shadow_slot_clear, emit_shadow_slot_update_for_expr, emit_string_literal_global, - emit_v8_export_call, emit_v8_member_method_call, emit_write_barrier, - emit_write_barrier_slot_on_block, expr_is_known_non_pointer_shadow_value, + buffer_alias_metadata_suffix, can_lower_expr_as_i32, downgrade_buffer_aliases_in_expr, + emit_layout_note_slot_on_block, emit_shadow_slot_clear, emit_shadow_slot_update_for_expr, + emit_string_literal_global, emit_v8_export_call, emit_v8_member_method_call, + emit_write_barrier, emit_write_barrier_slot_on_block, expr_is_known_non_pointer_shadow_value, extract_array_of_object_shape, i32_bool_to_nanbox, import_origin_suffix, is_global_this_builtin_function_name, is_global_this_builtin_name, is_known_finite, lower_array_literal, lower_channel_reduction, lower_expr, lower_expr_as_i32, @@ -93,6 +94,18 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { } => { use perry_hir::CallArg; let new_byte_offset = *byte_offset; + downgrade_buffer_aliases_in_expr(ctx, callee, MaterializationReason::UnknownCallEscape); + for arg in args { + match arg { + CallArg::Expr(expr) | CallArg::Spread(expr) => { + downgrade_buffer_aliases_in_expr( + ctx, + expr, + MaterializationReason::UnknownCallEscape, + ) + } + } + } let func_double = lower_expr(ctx, callee)?; let mut acc_handle = ctx.block().call(I64, "js_array_alloc", &[(I32, "0")]); for a in args { @@ -593,6 +606,18 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { | Expr::Logical { .. 
} ); if routes_through_function_construct { + downgrade_buffer_aliases_in_expr( + ctx, + callee, + MaterializationReason::UnknownCallEscape, + ); + for arg in args { + downgrade_buffer_aliases_in_expr( + ctx, + arg, + MaterializationReason::UnknownCallEscape, + ); + } let func_double = lower_expr(ctx, callee)?; let lowered_args: Vec = args .iter() @@ -620,6 +645,14 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { // back to the class_id=0 empty-object baseline inside the helper, // preserving the previous best-effort behavior for shapes the // compiler can't resolve statically. + downgrade_buffer_aliases_in_expr(ctx, callee, MaterializationReason::UnknownCallEscape); + for arg in args { + downgrade_buffer_aliases_in_expr( + ctx, + arg, + MaterializationReason::UnknownCallEscape, + ); + } let func_double = lower_expr(ctx, callee)?; let lowered_args: Vec = args .iter() diff --git a/crates/perry-codegen/src/expr/pod_record.rs b/crates/perry-codegen/src/expr/pod_record.rs index 1e31b59633..04c161ac02 100644 --- a/crates/perry-codegen/src/expr/pod_record.rs +++ b/crates/perry-codegen/src/expr/pod_record.rs @@ -8,6 +8,7 @@ use crate::native_value::{ LoweredValue, MaterializationReason, NativeRep, NativeValueState, PodLayoutField, PodLayoutManifest, SemanticKind, }; +use crate::type_analysis::expr_may_return_boxed_value_from_raw_f64_fallback; use crate::types::{DOUBLE, F32, I32, I64, I8}; use super::{ @@ -408,8 +409,34 @@ pub(crate) fn lower_and_store_initial_pod_field( field: &PodLayoutField, value: &Expr, ) -> Result<()> { - let expected = field_expected_rep(field); - let lowered = lower_expr_native(ctx, value, expected)?; + let needs_raw_f64_fallback_coercion = + expr_may_return_boxed_value_from_raw_f64_fallback(ctx, value) + || matches!(value, Expr::IndexGet { .. } | Expr::PropertyGet { .. 
}); + let lowered = if matches!(field.native_rep, NativeRep::F64 | NativeRep::F32) + && needs_raw_f64_fallback_coercion + { + let raw = lower_expr(ctx, value)?; + let coerced = ctx + .block() + .call(DOUBLE, "js_number_coerce", &[(DOUBLE, &raw)]); + let (rep, llvm_ty, value) = match field.native_rep { + NativeRep::F64 => (NativeRep::F64, DOUBLE, coerced), + NativeRep::F32 => { + let value = ctx.block().fptrunc(DOUBLE, &coerced, F32); + (NativeRep::F32, F32, value) + } + _ => unreachable!(), + }; + LoweredValue { + semantic: SemanticKind::JsNumber, + rep, + llvm_ty, + value, + } + } else { + let expected = field_expected_rep(field); + lower_expr_native(ctx, value, expected)? + }; store_pod_field_native(ctx, local_id, data_slot, field, &lowered); Ok(()) } diff --git a/crates/perry-codegen/src/expr/property_get.rs b/crates/perry-codegen/src/expr/property_get.rs index ad939a9c68..70ab169f7c 100644 --- a/crates/perry-codegen/src/expr/property_get.rs +++ b/crates/perry-codegen/src/expr/property_get.rs @@ -688,7 +688,19 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { .cloned() { let value = ctx.block().load(DOUBLE, &slot); - let lowered = LoweredValue { + let declared_raw_f64 = crate::type_analysis::scalar_replaced_field_is_raw_f64( + ctx, + object.as_ref(), + property, + ); + let raw_f64_field = + crate::type_analysis::scalar_replaced_field_raw_f64_store_state( + ctx, + Some(*id), + property, + declared_raw_f64, + ); + let lowered_js = LoweredValue { semantic: SemanticKind::JsValue, rep: NativeRep::JsValue, llvm_ty: DOUBLE, @@ -698,15 +710,34 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { "ScalarObjectFieldGet", Some(*id), "scalar_object_field_load", - &lowered, + &lowered_js, None, None, None, None, false, false, - vec![format!("field={}", property)], + vec![ + format!("field={}", property), + format!("raw_f64_field={}", raw_f64_field as u8), + ], ); + if raw_f64_field { + let lowered_f64 = LoweredValue::f64(value.clone()); + 
ctx.record_lowered_value_with_access_mode( + "ScalarObjectFieldGet", + Some(*id), + "scalar_object_field_load.raw_f64", + &lowered_f64, + None, + None, + None, + None, + false, + false, + vec![format!("field={}", property), "raw_f64_field=1".to_string()], + ); + } return Ok(value); } // Issue #613: when the local is scalar-replaced but the @@ -739,14 +770,27 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { } // Also handle `this` during scalar-replaced ctor inlining if let Expr::This = object.as_ref() { - if let Some(slot) = ctx.scalar_ctor_target.last().and_then(|tid| { - ctx.scalar_replaced - .get(tid) - .map(|fs| fs.get(property.as_str()).cloned()) - }) { + if let Some(target_id) = ctx.scalar_ctor_target.last().copied() { + let slot = ctx + .scalar_replaced + .get(&target_id) + .and_then(|fs| fs.get(property.as_str()).cloned()); if let Some(slot) = slot { let value = ctx.block().load(DOUBLE, &slot); - let lowered = LoweredValue { + let declared_raw_f64 = + crate::type_analysis::scalar_replaced_field_is_raw_f64( + ctx, + object.as_ref(), + property, + ); + let raw_f64_field = + crate::type_analysis::scalar_replaced_field_raw_f64_store_state( + ctx, + Some(target_id), + property, + declared_raw_f64, + ); + let lowered_js = LoweredValue { semantic: SemanticKind::JsValue, rep: NativeRep::JsValue, llvm_ty: DOUBLE, @@ -754,17 +798,36 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { }; ctx.record_lowered_value_with_access_mode( "ScalarThisFieldGet", - None, + Some(target_id), "scalar_object_field_load", - &lowered, + &lowered_js, None, None, None, None, false, false, - vec![format!("field={}", property)], + vec![ + format!("field={}", property), + format!("raw_f64_field={}", raw_f64_field as u8), + ], ); + if raw_f64_field { + let lowered_f64 = LoweredValue::f64(value.clone()); + ctx.record_lowered_value_with_access_mode( + "ScalarThisFieldGet", + Some(target_id), + "scalar_object_field_load.raw_f64", + &lowered_f64, + None, + 
None, + None, + None, + false, + false, + vec![format!("field={}", property), "raw_f64_field=1".to_string()], + ); + } return Ok(value); } return Ok(double_literal(f64::from_bits(crate::nanbox::TAG_UNDEFINED))); @@ -1696,11 +1759,12 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { ctx.current_block = fallback_idx; let blk = ctx.block(); blk.call_void("js_typed_feedback_record_fallback_call", &[(I64, &site_id)]); - let val_fallback = blk.call( + let val_fallback_js = blk.call( DOUBLE, "js_object_get_field_by_name_f64", &[(I64, &obj_bits), (I64, &key_raw)], ); + let val_fallback = val_fallback_js.clone(); let fallback_end_label = blk.label.clone(); blk.br(&merge_label); if requires_raw_f64 { @@ -1708,7 +1772,7 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { semantic: SemanticKind::JsValue, rep: NativeRep::JsValue, llvm_ty: DOUBLE, - value: val_fallback.clone(), + value: val_fallback_js.clone(), }; ctx.record_lowered_value_with_access_mode_and_facts( "ClassFieldGet", diff --git a/crates/perry-codegen/src/expr/property_set.rs b/crates/perry-codegen/src/expr/property_set.rs index 6d52001fcb..14d2739884 100644 --- a/crates/perry-codegen/src/expr/property_set.rs +++ b/crates/perry-codegen/src/expr/property_set.rs @@ -26,8 +26,9 @@ use crate::native_value::{ }; #[allow(unused_imports)] use crate::type_analysis::{ - compute_auto_captures, is_array_expr, is_bigint_expr, is_bool_expr, is_map_expr, - is_numeric_expr, is_set_expr, is_string_expr, is_url_search_params_expr, receiver_class_name, + compute_auto_captures, expr_may_return_boxed_value_from_raw_f64_fallback, is_array_expr, + is_bigint_expr, is_bool_expr, is_map_expr, is_numeric_expr, is_set_expr, is_string_expr, + is_url_search_params_expr, receiver_class_name, }; #[allow(unused_imports)] use crate::types::{DOUBLE, I1, I32, I64, I8, PTR}; @@ -51,6 +52,17 @@ use super::{ TypedFeedbackKind, }; +fn canonicalize_raw_f64_numeric_store_value( + blk: &mut crate::block::LlBlock, + 
value_double: &str, +) -> String { + blk.call( + DOUBLE, + "js_array_numeric_value_to_raw_f64", + &[(DOUBLE, value_double)], + ) +} + fn class_has_computed_runtime_members(ctx: &FnCtx<'_>, class_name: &str) -> bool { ctx.classes .get(class_name) @@ -193,9 +205,22 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { .and_then(|fs| fs.get(property.as_str())) .cloned() { + let raw_f64_field = crate::type_analysis::scalar_replaced_field_is_raw_f64( + ctx, + object.as_ref(), + property, + ); + let numeric_store = raw_f64_field + && is_numeric_expr(ctx, value) + && !expr_may_return_boxed_value_from_raw_f64_fallback(ctx, value); let val_double = lower_expr(ctx, value)?; - ctx.block().store(DOUBLE, &val_double, &slot); - let lowered = LoweredValue { + let stored_value = if numeric_store { + canonicalize_raw_f64_numeric_store_value(ctx.block(), &val_double) + } else { + val_double.clone() + }; + ctx.block().store(DOUBLE, &stored_value, &slot); + let lowered_js = LoweredValue { semantic: SemanticKind::JsValue, rep: NativeRep::JsValue, llvm_ty: DOUBLE, @@ -205,30 +230,61 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { "ScalarObjectFieldSet", Some(*id), "scalar_object_field_store", - &lowered, + &lowered_js, None, None, None, None, false, false, - vec![format!("field={}", property)], + vec![ + format!("field={}", property), + format!("raw_f64_field={}", raw_f64_field as u8), + ], ); + if numeric_store { + let lowered_f64 = LoweredValue::f64(stored_value.clone()); + ctx.record_lowered_value_with_access_mode( + "ScalarObjectFieldSet", + Some(*id), + "scalar_object_field_store.raw_f64", + &lowered_f64, + None, + None, + None, + None, + false, + false, + vec![format!("field={}", property), "raw_f64_field=1".to_string()], + ); + } return Ok(val_double); } } // Handle `this` during scalar-replaced constructor inlining: if let Expr::This = object.as_ref() { - if let Some(slot) = ctx - .scalar_ctor_target - .last() - .and_then(|tid| 
ctx.scalar_replaced.get(tid)) - { - let maybe_slot = slot.get(property.as_str()).cloned(); + if let Some(target_id) = ctx.scalar_ctor_target.last().copied() { + let maybe_slot = ctx + .scalar_replaced + .get(&target_id) + .and_then(|slots| slots.get(property.as_str()).cloned()); + let raw_f64_field = crate::type_analysis::scalar_replaced_field_is_raw_f64( + ctx, + object.as_ref(), + property, + ); + let numeric_store = raw_f64_field + && is_numeric_expr(ctx, value) + && !expr_may_return_boxed_value_from_raw_f64_fallback(ctx, value); let val_double = lower_expr(ctx, value)?; if let Some(slot) = maybe_slot { - ctx.block().store(DOUBLE, &val_double, &slot); - let lowered = LoweredValue { + let stored_value = if numeric_store { + canonicalize_raw_f64_numeric_store_value(ctx.block(), &val_double) + } else { + val_double.clone() + }; + ctx.block().store(DOUBLE, &stored_value, &slot); + let lowered_js = LoweredValue { semantic: SemanticKind::JsValue, rep: NativeRep::JsValue, llvm_ty: DOUBLE, @@ -236,17 +292,36 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { }; ctx.record_lowered_value_with_access_mode( "ScalarThisFieldSet", - None, + Some(target_id), "scalar_object_field_store", - &lowered, + &lowered_js, None, None, None, None, false, false, - vec![format!("field={}", property)], + vec![ + format!("field={}", property), + format!("raw_f64_field={}", raw_f64_field as u8), + ], ); + if numeric_store { + let lowered_f64 = LoweredValue::f64(stored_value.clone()); + ctx.record_lowered_value_with_access_mode( + "ScalarThisFieldSet", + Some(target_id), + "scalar_object_field_store.raw_f64", + &lowered_f64, + None, + None, + None, + None, + false, + false, + vec![format!("field={}", property), "raw_f64_field=1".to_string()], + ); + } } return Ok(val_double); } @@ -418,39 +493,48 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { // as the array-store barrier elision. 
let field_set_barrier_needed = !expr_produces_non_pointer_bits_by_construction(ctx, value); - let blk = ctx.block(); - let obj_ptr = blk.inttoptr(I64, &obj_handle); - let header_skip = "24".to_string(); - let fields_base = blk.gep(I8, &obj_ptr, &[(I64, &header_skip)]); - let field_ptr = blk.gep(DOUBLE, &fields_base, &[(I64, &field_idx_str)]); - if requires_raw_f64 { - // Guarded raw-f64 slots are pointer-free by typed - // shape descriptor; non-number writes miss the - // guard and use the boxed setter fallback. - // GC_STORE_AUDIT(POINTER_FREE): typed raw-f64 class - // slots contain numbers only. - blk.store(DOUBLE, &val_double, &field_ptr); - } else { - let field_addr = blk.ptrtoint(&field_ptr, I64); - emit_jsvalue_slot_store_on_block( - blk, - &field_ptr, - &val_double, - &obj_handle, - &field_idx_str, - true, - &obj_bits, - &field_addr, - field_set_barrier_needed, - ); - } - blk.br(&merge_label); - if requires_raw_f64 { + let raw_stored_value = { + let blk = ctx.block(); + let obj_ptr = blk.inttoptr(I64, &obj_handle); + let header_skip = "24".to_string(); + let fields_base = blk.gep(I8, &obj_ptr, &[(I64, &header_skip)]); + let field_ptr = blk.gep(DOUBLE, &fields_base, &[(I64, &field_idx_str)]); + let raw_stored_value = if requires_raw_f64 { + // Guarded raw-f64 slots are pointer-free by typed + // shape descriptor; non-number writes miss the + // guard and use the boxed setter fallback. + // GC_STORE_AUDIT(POINTER_FREE): typed raw-f64 class + // slots contain numbers only. + let numeric_value = + canonicalize_raw_f64_numeric_store_value(blk, &val_double); + blk.store(DOUBLE, &numeric_value, &field_ptr); + Some(numeric_value) + } else { + // #5334 lever D: skip the barrier when the value + // is a non-pointer by construction. 
+ let field_addr = blk.ptrtoint(&field_ptr, I64); + emit_jsvalue_slot_store_on_block( + blk, + &field_ptr, + &val_double, + &obj_handle, + &field_idx_str, + true, + &obj_bits, + &field_addr, + field_set_barrier_needed, + ); + None + }; + blk.br(&merge_label); + raw_stored_value + }; + if let Some(numeric_value) = raw_stored_value { let stored = LoweredValue { semantic: SemanticKind::JsNumber, rep: NativeRep::F64, llvm_ty: DOUBLE, - value: val_double.clone(), + value: numeric_value.clone(), }; ctx.record_lowered_value_with_access_mode_and_facts( "ClassFieldSet", diff --git a/crates/perry-codegen/src/expr/proxy_reflect.rs b/crates/perry-codegen/src/expr/proxy_reflect.rs index 7d62a0eb87..1ef74c3b95 100644 --- a/crates/perry-codegen/src/expr/proxy_reflect.rs +++ b/crates/perry-codegen/src/expr/proxy_reflect.rs @@ -21,6 +21,7 @@ use crate::lower_string_method::{ }; #[allow(unused_imports)] use crate::nanbox::{double_literal, POINTER_MASK_I64}; +use crate::native_value::MaterializationReason; #[allow(unused_imports)] use crate::type_analysis::{ compute_auto_captures, is_array_expr, is_bigint_expr, is_bool_expr, is_map_expr, @@ -31,10 +32,10 @@ use crate::types::{DOUBLE, I1, I32, I64, I8, PTR}; #[allow(unused_imports)] use super::{ - buffer_alias_metadata_suffix, can_lower_expr_as_i32, emit_layout_note_slot_on_block, - emit_shadow_slot_clear, emit_shadow_slot_update_for_expr, emit_string_literal_global, - emit_v8_export_call, emit_v8_member_method_call, emit_write_barrier, - emit_write_barrier_slot_on_block, expr_is_known_non_pointer_shadow_value, + buffer_alias_metadata_suffix, can_lower_expr_as_i32, downgrade_buffer_aliases_in_expr, + emit_layout_note_slot_on_block, emit_shadow_slot_clear, emit_shadow_slot_update_for_expr, + emit_string_literal_global, emit_v8_export_call, emit_v8_member_method_call, + emit_write_barrier, emit_write_barrier_slot_on_block, expr_is_known_non_pointer_shadow_value, extract_array_of_object_shape, i32_bool_to_nanbox, import_origin_suffix, 
is_global_this_builtin_function_name, is_global_this_builtin_name, is_known_finite, lower_array_literal, lower_channel_reduction, lower_expr, lower_expr_as_i32, @@ -46,6 +47,16 @@ use super::{ I18nLowerCtx, }; +fn downgrade_unknown_call_expr(ctx: &mut FnCtx<'_>, expr: &Expr) { + downgrade_buffer_aliases_in_expr(ctx, expr, MaterializationReason::UnknownCallEscape); +} + +fn downgrade_unknown_call_args(ctx: &mut FnCtx<'_>, args: &[Expr]) { + for arg in args { + downgrade_unknown_call_expr(ctx, arg); + } +} + /// `p.call(thisArg, ...rest)` / `p.apply(thisArg, argsArray)` where `p` is a /// Proxy (#3656). The HIR lowers the callee to `ProxyGet(p, "call"|"apply")`, /// which would otherwise read `.call`/`.apply` off the *target* and invoke the @@ -66,6 +77,8 @@ pub(crate) fn try_lower_proxy_fn_call_apply( Expr::String(s) if s == "call" => false, _ => return Ok(None), }; + downgrade_unknown_call_expr(ctx, proxy); + downgrade_unknown_call_args(ctx, args); let p = lower_expr(ctx, proxy)?; let this_arg = match args.first() { Some(a) => lower_expr(ctx, a)?, @@ -121,6 +134,8 @@ pub(crate) fn try_lower_proxy_method_call( if method_name == "call" || method_name == "apply" { return Ok(None); } + downgrade_unknown_call_expr(ctx, proxy); + downgrade_unknown_call_args(ctx, args); let recv_box = lower_expr(ctx, proxy)?; let mut lowered_args: Vec = Vec::with_capacity(args.len()); for a in args { @@ -386,6 +401,8 @@ fn try_lower_process_env_put_value_set( pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { match expr { Expr::ProxyNew { target, handler } => { + downgrade_unknown_call_expr(ctx, target); + downgrade_unknown_call_expr(ctx, handler); let t = lower_expr(ctx, target)?; let h = lower_expr(ctx, handler)?; Ok(ctx @@ -393,6 +410,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { .call(DOUBLE, "js_proxy_new", &[(DOUBLE, &t), (DOUBLE, &h)])) } Expr::ProxyGet { proxy, key } => { + downgrade_unknown_call_expr(ctx, proxy); + 
downgrade_unknown_call_expr(ctx, key); let p = lower_expr(ctx, proxy)?; let k = lower_expr(ctx, key)?; Ok(ctx @@ -400,6 +419,9 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { .call(DOUBLE, "js_proxy_get", &[(DOUBLE, &p), (DOUBLE, &k)])) } Expr::ProxySet { proxy, key, value } => { + downgrade_unknown_call_expr(ctx, proxy); + downgrade_unknown_call_expr(ctx, key); + downgrade_unknown_call_expr(ctx, value); let p = lower_expr(ctx, proxy)?; let k = lower_expr(ctx, key)?; let v = lower_expr(ctx, value)?; @@ -411,6 +433,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { Ok(v) } Expr::ProxyHas { proxy, key } => { + downgrade_unknown_call_expr(ctx, proxy); + downgrade_unknown_call_expr(ctx, key); let p = lower_expr(ctx, proxy)?; let k = lower_expr(ctx, key)?; Ok(ctx @@ -418,6 +442,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { .call(DOUBLE, "js_proxy_has", &[(DOUBLE, &p), (DOUBLE, &k)])) } Expr::ProxyDelete { proxy, key } => { + downgrade_unknown_call_expr(ctx, proxy); + downgrade_unknown_call_expr(ctx, key); let p = lower_expr(ctx, proxy)?; let k = lower_expr(ctx, key)?; Ok(ctx @@ -425,6 +451,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { .call(DOUBLE, "js_proxy_delete", &[(DOUBLE, &p), (DOUBLE, &k)])) } Expr::ProxyApply { proxy, args } => { + downgrade_unknown_call_expr(ctx, proxy); + downgrade_unknown_call_args(ctx, args); let p = lower_expr(ctx, proxy)?; let arr_handle = proxy_build_args_array(ctx, args)?; let blk = ctx.block(); @@ -437,6 +465,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { )) } Expr::ProxyConstruct { proxy, args } => { + downgrade_unknown_call_expr(ctx, proxy); + downgrade_unknown_call_args(ctx, args); let p = lower_expr(ctx, proxy)?; let arr_handle = proxy_build_args_array(ctx, args)?; let blk = ctx.block(); @@ -452,6 +482,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { // #2846: return a real `{ proxy, revoke }` 
record so `typeof // rec.revoke === "function"`, `rec.proxy.a` forwards, and the // revoke function survives aliasing/storage. + downgrade_unknown_call_expr(ctx, target); + downgrade_unknown_call_expr(ctx, handler); let t = lower_expr(ctx, target)?; let h = lower_expr(ctx, handler)?; Ok(ctx @@ -459,6 +491,7 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { .call(DOUBLE, "js_proxy_revocable", &[(DOUBLE, &t), (DOUBLE, &h)])) } Expr::ProxyRevoke(proxy) => { + downgrade_unknown_call_expr(ctx, proxy); let p = lower_expr(ctx, proxy)?; ctx.block().call_void("js_proxy_revoke", &[(DOUBLE, &p)]); Ok(double_literal(f64::from_bits(crate::nanbox::TAG_UNDEFINED))) @@ -471,6 +504,9 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { // #2766: pass the optional receiver through; the runtime defaults // an `undefined` receiver to the target and binds it as `this` for // accessor getters. + downgrade_unknown_call_expr(ctx, target); + downgrade_unknown_call_expr(ctx, key); + downgrade_unknown_call_expr(ctx, receiver); let t = lower_expr(ctx, target)?; let k = lower_expr(ctx, key)?; let r = lower_expr(ctx, receiver)?; @@ -490,6 +526,10 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { // `undefined` receiver to the target. A receiver distinct from an // Integer-Indexed target redirects the write to the receiver per // OrdinarySet (test262 internals/Set/key-is-valid-index-reflect-set). 
+ downgrade_unknown_call_expr(ctx, target); + downgrade_unknown_call_expr(ctx, key); + downgrade_unknown_call_expr(ctx, value); + downgrade_unknown_call_expr(ctx, receiver); let t = lower_expr(ctx, target)?; let k = lower_expr(ctx, key)?; let v = lower_expr(ctx, value)?; @@ -547,6 +587,10 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { }, ); } + downgrade_unknown_call_expr(ctx, target); + downgrade_unknown_call_expr(ctx, key); + downgrade_unknown_call_expr(ctx, value); + downgrade_unknown_call_expr(ctx, receiver); let t = lower_expr(ctx, target)?; let k = lower_expr(ctx, key)?; let v = lower_expr(ctx, value)?; @@ -569,6 +613,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { )) } Expr::ReflectHas { target, key } => { + downgrade_unknown_call_expr(ctx, target); + downgrade_unknown_call_expr(ctx, key); let t = lower_expr(ctx, target)?; let k = lower_expr(ctx, key)?; Ok(ctx @@ -576,6 +622,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { .call(DOUBLE, "js_reflect_has", &[(DOUBLE, &t), (DOUBLE, &k)])) } Expr::ReflectDelete { target, key } => { + downgrade_unknown_call_expr(ctx, target); + downgrade_unknown_call_expr(ctx, key); let t = lower_expr(ctx, target)?; let k = lower_expr(ctx, key)?; Ok(ctx @@ -583,6 +631,7 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { .call(DOUBLE, "js_reflect_delete", &[(DOUBLE, &t), (DOUBLE, &k)])) } Expr::ReflectOwnKeys(target) => { + downgrade_unknown_call_expr(ctx, target); let t = lower_expr(ctx, target)?; Ok(ctx .block() @@ -593,6 +642,9 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { this_arg, args, } => { + downgrade_unknown_call_expr(ctx, func); + downgrade_unknown_call_expr(ctx, this_arg); + downgrade_unknown_call_expr(ctx, args); let f = lower_expr(ctx, func)?; let ta = lower_expr(ctx, this_arg)?; let a = lower_expr(ctx, args)?; @@ -607,6 +659,9 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { args, 
new_target, } => { + downgrade_unknown_call_expr(ctx, target); + downgrade_unknown_call_expr(ctx, args); + downgrade_unknown_call_expr(ctx, new_target); let t = lower_expr(ctx, target)?; let a = lower_expr(ctx, args)?; let nt = lower_expr(ctx, new_target)?; @@ -621,6 +676,9 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { key, descriptor, } => { + downgrade_unknown_call_expr(ctx, target); + downgrade_unknown_call_expr(ctx, key); + downgrade_unknown_call_expr(ctx, descriptor); let t = lower_expr(ctx, target)?; let k = lower_expr(ctx, key)?; let d = lower_expr(ctx, descriptor)?; @@ -631,6 +689,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { )) } Expr::ReflectGetOwnPropertyDescriptor { target, key } => { + downgrade_unknown_call_expr(ctx, target); + downgrade_unknown_call_expr(ctx, key); let t = lower_expr(ctx, target)?; let k = lower_expr(ctx, key)?; Ok(ctx.block().call( @@ -642,6 +702,8 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { Expr::ReflectSetPrototypeOf { target, proto } => { // #2761: Reflect-specific boolean result (false on rejected change) // + TypeError on bad args, distinct from Object.setPrototypeOf. + downgrade_unknown_call_expr(ctx, target); + downgrade_unknown_call_expr(ctx, proto); let t = lower_expr(ctx, target)?; let p = lower_expr(ctx, proto)?; Ok(ctx.block().call( @@ -656,6 +718,7 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { // `=== Class.prototype` comparison is still folded to a constant // bool at lowering time (lower_expr.rs); this path handles every // other (value-returning) use. + downgrade_unknown_call_expr(ctx, target); let t = lower_expr(ctx, target)?; Ok(ctx .block() @@ -664,6 +727,7 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { Expr::ReflectIsExtensible(target) => { // #2762: Reflect-specific — boolean result + TypeError on // non-object, distinct from Object.isExtensible. 
+ downgrade_unknown_call_expr(ctx, target); let t = lower_expr(ctx, target)?; Ok(ctx .block() @@ -673,6 +737,7 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { // #2762: Reflect-specific — boolean result + TypeError on // non-object, distinct from Object.preventExtensions (which // returns the object). + downgrade_unknown_call_expr(ctx, target); let t = lower_expr(ctx, target)?; Ok(ctx .block() @@ -684,6 +749,12 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { target, property_key, } => { + downgrade_unknown_call_expr(ctx, key); + downgrade_unknown_call_expr(ctx, value); + downgrade_unknown_call_expr(ctx, target); + if let Some(property_key) = property_key { + downgrade_unknown_call_expr(ctx, property_key); + } let k = lower_expr(ctx, key)?; let v = lower_expr(ctx, value)?; let t = lower_expr(ctx, target)?; @@ -703,6 +774,11 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { target, property_key, } => { + downgrade_unknown_call_expr(ctx, key); + downgrade_unknown_call_expr(ctx, target); + if let Some(property_key) = property_key { + downgrade_unknown_call_expr(ctx, property_key); + } let k = lower_expr(ctx, key)?; let t = lower_expr(ctx, target)?; let p = property_key @@ -721,6 +797,11 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { target, property_key, } => { + downgrade_unknown_call_expr(ctx, key); + downgrade_unknown_call_expr(ctx, target); + if let Some(property_key) = property_key { + downgrade_unknown_call_expr(ctx, property_key); + } let k = lower_expr(ctx, key)?; let t = lower_expr(ctx, target)?; let p = property_key @@ -739,6 +820,11 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { target, property_key, } => { + downgrade_unknown_call_expr(ctx, key); + downgrade_unknown_call_expr(ctx, target); + if let Some(property_key) = property_key { + downgrade_unknown_call_expr(ctx, property_key); + } let k = lower_expr(ctx, key)?; let t = lower_expr(ctx, target)?; let 
p = property_key @@ -757,6 +843,11 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { target, property_key, } => { + downgrade_unknown_call_expr(ctx, key); + downgrade_unknown_call_expr(ctx, target); + if let Some(property_key) = property_key { + downgrade_unknown_call_expr(ctx, property_key); + } let k = lower_expr(ctx, key)?; let t = lower_expr(ctx, target)?; let p = property_key @@ -774,6 +865,10 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { target, property_key, } => { + downgrade_unknown_call_expr(ctx, target); + if let Some(property_key) = property_key { + downgrade_unknown_call_expr(ctx, property_key); + } let t = lower_expr(ctx, target)?; let p = property_key .as_ref() @@ -790,6 +885,10 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { target, property_key, } => { + downgrade_unknown_call_expr(ctx, target); + if let Some(property_key) = property_key { + downgrade_unknown_call_expr(ctx, property_key); + } let t = lower_expr(ctx, target)?; let p = property_key .as_ref() @@ -807,6 +906,11 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { target, property_key, } => { + downgrade_unknown_call_expr(ctx, key); + downgrade_unknown_call_expr(ctx, target); + if let Some(property_key) = property_key { + downgrade_unknown_call_expr(ctx, property_key); + } let k = lower_expr(ctx, key)?; let t = lower_expr(ctx, target)?; let p = property_key diff --git a/crates/perry-codegen/src/expr/static_method.rs b/crates/perry-codegen/src/expr/static_method.rs index 928903c7dd..8a3acd219a 100644 --- a/crates/perry-codegen/src/expr/static_method.rs +++ b/crates/perry-codegen/src/expr/static_method.rs @@ -21,6 +21,7 @@ use crate::lower_string_method::{ }; #[allow(unused_imports)] use crate::nanbox::{double_literal, POINTER_MASK_I64}; +use crate::native_value::MaterializationReason; #[allow(unused_imports)] use crate::type_analysis::{ compute_auto_captures, is_array_expr, is_bigint_expr, is_bool_expr, 
is_map_expr, @@ -31,10 +32,10 @@ use crate::types::{DOUBLE, I1, I32, I64, I8, PTR}; #[allow(unused_imports)] use super::{ - buffer_alias_metadata_suffix, can_lower_expr_as_i32, emit_layout_note_slot_on_block, - emit_shadow_slot_clear, emit_shadow_slot_update_for_expr, emit_string_literal_global, - emit_v8_export_call, emit_v8_member_method_call, emit_write_barrier, - emit_write_barrier_slot_on_block, expr_is_known_non_pointer_shadow_value, + buffer_alias_metadata_suffix, can_lower_expr_as_i32, downgrade_buffer_aliases_in_expr, + emit_layout_note_slot_on_block, emit_shadow_slot_clear, emit_shadow_slot_update_for_expr, + emit_string_literal_global, emit_v8_export_call, emit_v8_member_method_call, + emit_write_barrier, emit_write_barrier_slot_on_block, expr_is_known_non_pointer_shadow_value, extract_array_of_object_shape, i32_bool_to_nanbox, import_origin_suffix, is_global_this_builtin_function_name, is_global_this_builtin_name, is_known_finite, lower_array_literal, lower_channel_reduction, lower_expr, lower_expr_as_i32, @@ -46,6 +47,12 @@ use super::{ I18nLowerCtx, }; +fn downgrade_unknown_call_args(ctx: &mut FnCtx<'_>, args: &[Expr]) { + for arg in args { + downgrade_buffer_aliases_in_expr(ctx, arg, MaterializationReason::UnknownCallEscape); + } +} + pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { match expr { Expr::StaticMethodCall { @@ -53,6 +60,7 @@ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { method_name, args, } => { + downgrade_unknown_call_args(ctx, args); // Built-in static methods that the runtime provides directly. 
if class_name == "AbortSignal" && method_name == "timeout" { let ms = if !args.is_empty() { diff --git a/crates/perry-codegen/src/expr/unary.rs b/crates/perry-codegen/src/expr/unary.rs index 55661572af..2316e1af65 100644 --- a/crates/perry-codegen/src/expr/unary.rs +++ b/crates/perry-codegen/src/expr/unary.rs @@ -23,8 +23,9 @@ use crate::lower_string_method::{ use crate::nanbox::{double_literal, POINTER_MASK_I64}; #[allow(unused_imports)] use crate::type_analysis::{ - compute_auto_captures, is_array_expr, is_bigint_expr, is_bool_expr, is_map_expr, - is_numeric_expr, is_set_expr, is_string_expr, is_url_search_params_expr, receiver_class_name, + compute_auto_captures, expr_may_return_boxed_value_from_raw_f64_fallback, is_array_expr, + is_bigint_expr, is_bool_expr, is_map_expr, is_numeric_expr, is_set_expr, is_string_expr, + is_url_search_params_expr, receiver_class_name, }; #[allow(unused_imports)] use crate::types::{DOUBLE, I1, I32, I64, I8, PTR}; @@ -49,7 +50,8 @@ use super::{ pub(crate) fn lower(ctx: &mut FnCtx<'_>, expr: &Expr) -> Result { match expr { Expr::Unary { op, operand } => { - let numeric = is_numeric_expr(ctx, operand); + let numeric = is_numeric_expr(ctx, operand) + && !expr_may_return_boxed_value_from_raw_f64_fallback(ctx, operand); // `-` must stay a BigInt (`typeof -1n === "bigint"`). 
// `fneg` on a NaN-boxed BigInt flips the NaN payload's sign bit // and produces a garbage number, so route negation through the diff --git a/crates/perry-codegen/src/expr/write_barrier.rs b/crates/perry-codegen/src/expr/write_barrier.rs index a6015f841c..6aa192675e 100644 --- a/crates/perry-codegen/src/expr/write_barrier.rs +++ b/crates/perry-codegen/src/expr/write_barrier.rs @@ -8,6 +8,7 @@ use perry_hir::Expr; use super::{lower_expr, FnCtx}; use crate::block::LlBlock; use crate::nanbox::double_literal; +use crate::native_value::LoweredValue; use crate::types::{DOUBLE, I32, I64}; /// Gen-GC Phase C2 helper: emit a write barrier after heap-store sites @@ -20,6 +21,19 @@ pub(crate) fn emit_write_barrier(ctx: &mut FnCtx<'_>, parent_bits: &str, child_b if !crate::codegen::write_barriers_enabled() { return; } + let child_bits_value = LoweredValue::js_value_bits(child_bits.to_string()); + ctx.record_lowered_value( + "WriteBarrier", + None, + "write_barrier.child_bits", + &child_bits_value, + None, + None, + None, + false, + false, + Vec::new(), + ); ctx.block() .call_void("js_write_barrier", &[(I64, parent_bits), (I64, child_bits)]); } diff --git a/crates/perry-codegen/src/lower_conditional.rs b/crates/perry-codegen/src/lower_conditional.rs index 69c1232eb1..3fe902fe6e 100644 --- a/crates/perry-codegen/src/lower_conditional.rs +++ b/crates/perry-codegen/src/lower_conditional.rs @@ -7,7 +7,9 @@ use anyhow::Result; use perry_hir::{Expr, LogicalOp}; use crate::expr::{lower_expr, FnCtx}; -use crate::type_analysis::{is_bool_expr, is_numeric_expr}; +use crate::type_analysis::{ + expr_may_return_boxed_value_from_raw_f64_fallback, is_bool_expr, is_numeric_expr, +}; use crate::types::{DOUBLE, I32, I64}; /// Convert a lowered condition value to an `i1` for `cond_br`. @@ -30,7 +32,9 @@ use crate::types::{DOUBLE, I32, I64}; /// a function call but produces correct results across the entire JS /// truthiness table. 
pub(crate) fn lower_truthy(ctx: &mut FnCtx<'_>, cond_val: &str, cond_expr: &Expr) -> String { - if is_numeric_expr(ctx, cond_expr) { + if is_numeric_expr(ctx, cond_expr) + && !expr_may_return_boxed_value_from_raw_f64_fallback(ctx, cond_expr) + { return ctx.block().fcmp("one", cond_val, "0.0"); } if is_bool_expr(ctx, cond_expr) { diff --git a/crates/perry-codegen/src/native_value/artifact.rs b/crates/perry-codegen/src/native_value/artifact.rs index 68f0004340..aa1f84554e 100644 --- a/crates/perry-codegen/src/native_value/artifact.rs +++ b/crates/perry-codegen/src/native_value/artifact.rs @@ -36,6 +36,8 @@ pub(crate) enum NativeValueState { #[serde(rename_all = "snake_case")] pub(crate) enum NativeAbiTransitionOp { None, + JsValueToBits, + BitsToJsValue, SignedIntToFloat, UnsignedIntToFloat, FloatExtend, @@ -302,6 +304,7 @@ struct NativeRepSummary { consumed_fact_count: usize, rejected_fact_count: usize, raw_f64_layout_fact_counts: BTreeMap, + js_value_bits_count: usize, native_owned_view_count: usize, pod_layout_count: usize, pod_record_count: usize, @@ -326,6 +329,7 @@ impl NativeRepSummary { ("rejected".to_string(), 0), ("invalidated".to_string(), 0), ]); + let mut js_value_bits_count = 0; let mut native_owned_view_count = 0; let mut pod_layout_count = 0; let mut pod_record_count = 0; @@ -335,6 +339,9 @@ impl NativeRepSummary { *native_rep_counts .entry(record.native_rep_name.clone()) .or_insert(0) += 1; + if matches!(record.native_rep, NativeRep::JsValueBits) { + js_value_bits_count += 1; + } if record.materialization_reason.is_some() { materialization_count += 1; } @@ -362,6 +369,8 @@ impl NativeRepSummary { native_abi_transition_count += 1; let op_name = match transition.op { NativeAbiTransitionOp::None => "none", + NativeAbiTransitionOp::JsValueToBits => "js_value_to_bits", + NativeAbiTransitionOp::BitsToJsValue => "bits_to_js_value", NativeAbiTransitionOp::SignedIntToFloat => "signed_int_to_float", NativeAbiTransitionOp::UnsignedIntToFloat => 
"unsigned_int_to_float", NativeAbiTransitionOp::FloatExtend => "float_extend", @@ -433,6 +442,7 @@ impl NativeRepSummary { consumed_fact_count, rejected_fact_count, raw_f64_layout_fact_counts, + js_value_bits_count, native_owned_view_count, pod_layout_count, pod_record_count, @@ -486,7 +496,7 @@ pub(crate) fn write_native_rep_artifact_if_enabled( pid, wall_nonce, counter )); let artifact = NativeRepArtifact { - schema_version: 11, + schema_version: 12, module, records, pod_layouts: collect_pod_layouts(records), diff --git a/crates/perry-codegen/src/native_value/materialize.rs b/crates/perry-codegen/src/native_value/materialize.rs index 5e501441f0..6410ecd847 100644 --- a/crates/perry-codegen/src/native_value/materialize.rs +++ b/crates/perry-codegen/src/native_value/materialize.rs @@ -44,7 +44,9 @@ fn transition_lossy(rep: &NativeRep, op: &NativeAbiTransitionOp) -> bool { NativeAbiTransitionOp::UnsignedIntToFloat => { matches!(rep, NativeRep::U64 | NativeRep::USize | NativeRep::HandleId) } - NativeAbiTransitionOp::None + NativeAbiTransitionOp::JsValueToBits + | NativeAbiTransitionOp::BitsToJsValue + | NativeAbiTransitionOp::None | NativeAbiTransitionOp::FloatExtend | NativeAbiTransitionOp::PointerBox | NativeAbiTransitionOp::NativeHandleBox @@ -52,19 +54,20 @@ fn transition_lossy(rep: &NativeRep, op: &NativeAbiTransitionOp) -> bool { } } -fn record_materialized_transition( +fn record_transition( ctx: &mut FnCtx<'_>, expr_kind: &'static str, consumer: &'static str, materialized: &LoweredValue, from_native_rep: String, + to_native_rep: String, op: NativeAbiTransitionOp, reason: MaterializationReason, lossy: bool, ) { let transition = NativeAbiTransitionRecord { from_native_rep, - to_native_rep: NativeRep::JsValue.name().to_string(), + to_native_rep, op, reason: reason.clone(), lossy, @@ -86,6 +89,29 @@ fn record_materialized_transition( ); } +fn record_materialized_transition( + ctx: &mut FnCtx<'_>, + expr_kind: &'static str, + consumer: &'static str, + materialized: 
&LoweredValue, + from_native_rep: String, + op: NativeAbiTransitionOp, + reason: MaterializationReason, + lossy: bool, +) { + record_transition( + ctx, + expr_kind, + consumer, + materialized, + from_native_rep, + NativeRep::JsValue.name().to_string(), + op, + reason, + lossy, + ); +} + pub(crate) fn record_runtime_native_handle_box_transition( ctx: &mut FnCtx<'_>, value: &str, @@ -168,6 +194,52 @@ pub(crate) fn materialize_promise_boundary_to_js_value( ) } +pub(crate) fn materialize_js_value_bits( + ctx: &mut FnCtx<'_>, + lowered: LoweredValue, + reason: MaterializationReason, +) -> String { + if matches!(&lowered.rep, NativeRep::JsValueBits) { + return lowered.value; + } + let js_value = materialize_js_value(ctx, lowered, reason.clone()); + let bits = ctx.block().bitcast_double_to_i64(&js_value); + let materialized = LoweredValue::js_value_bits(bits.clone()); + record_transition( + ctx, + "materialize_js_value_bits", + "materialize_js_value_bits", + &materialized, + NativeRep::JsValue.name().to_string(), + NativeRep::JsValueBits.name().to_string(), + NativeAbiTransitionOp::JsValueToBits, + reason, + false, + ); + bits +} + +fn materialize_js_value_bits_to_js_value( + ctx: &mut FnCtx<'_>, + lowered: LoweredValue, + reason: MaterializationReason, +) -> String { + let from_native_rep = lowered.rep.name().to_string(); + let value = ctx.block().bitcast_i64_to_double(&lowered.value); + let materialized = LoweredValue::js_value(value.clone()); + record_materialized_transition( + ctx, + "materialize_js_value", + "materialize_js_value_bits", + &materialized, + from_native_rep, + NativeAbiTransitionOp::BitsToJsValue, + reason, + false, + ); + value +} + pub(crate) fn materialize_js_value( ctx: &mut FnCtx<'_>, lowered: LoweredValue, @@ -176,6 +248,9 @@ pub(crate) fn materialize_js_value( if matches!(&lowered.rep, NativeRep::JsValue) { return lowered.value; } + if matches!(&lowered.rep, NativeRep::JsValueBits) { + return materialize_js_value_bits_to_js_value(ctx, lowered, 
reason); + } if matches!(&lowered.rep, NativeRep::NativeHandle) { return materialize_native_handle_to_js_value(ctx, lowered, reason); } @@ -196,6 +271,7 @@ pub(crate) fn materialize_js_value( NativeRep::BufferView(_) | NativeRep::PodRecord { .. } | NativeRep::PodRecordView { .. } + | NativeRep::JsValueBits | NativeRep::JsValue | NativeRep::NativeHandle | NativeRep::PromiseBoundary => NativeAbiTransitionOp::None, @@ -217,6 +293,7 @@ pub(crate) fn materialize_js_value( NativeRep::BufferView(_) => lowered.value.clone(), NativeRep::PodRecord { .. } => lowered.value.clone(), NativeRep::PodRecordView { .. } => lowered.value.clone(), + NativeRep::JsValueBits => lowered.value.clone(), NativeRep::JsValue | NativeRep::F64 | NativeRep::NativeHandle diff --git a/crates/perry-codegen/src/native_value/mod.rs b/crates/perry-codegen/src/native_value/mod.rs index ebbce9f67e..5a7755813d 100644 --- a/crates/perry-codegen/src/native_value/mod.rs +++ b/crates/perry-codegen/src/native_value/mod.rs @@ -16,7 +16,7 @@ pub(crate) use buffer::{ GuardedBufferIndex, LengthSource, NativeOwnedViewFact, NativeOwnedViewSlot, }; pub(crate) use materialize::{ - materialize_js_value, materialize_native_handle_to_js_value, + materialize_js_value, materialize_js_value_bits, materialize_native_handle_to_js_value, materialize_promise_boundary_to_js_value, record_runtime_native_handle_box_transition, MaterializationReason, }; diff --git a/crates/perry-codegen/src/native_value/rep.rs b/crates/perry-codegen/src/native_value/rep.rs index 54dcce0860..9fb2861353 100644 --- a/crates/perry-codegen/src/native_value/rep.rs +++ b/crates/perry-codegen/src/native_value/rep.rs @@ -18,6 +18,10 @@ pub(crate) enum SemanticKind { #[derive(Debug, Clone, Serialize, PartialEq, Eq)] #[serde(rename_all = "snake_case", tag = "kind", content = "value")] pub(crate) enum NativeRep { + /// Internal NaN-box bit pattern carried as an integer. 
Public Perry ABI + /// slots still use `JsValue`/LLVM `double`; this rep is for optimizer-local + /// boxed values where preserving payload bits matters. + JsValueBits, JsValue, I32, /// Legacy signed 64-bit scalar. Kept for existing native-library @@ -73,6 +77,7 @@ pub(crate) enum NativeRep { impl NativeRep { pub(crate) fn name(&self) -> &'static str { match self { + Self::JsValueBits => "js_value_bits", Self::JsValue => "js_value", Self::I32 => "i32", Self::I64 => "i64", @@ -95,6 +100,10 @@ impl NativeRep { #[derive(Debug, Clone, Copy, PartialEq, Eq)] pub(crate) enum ExpectedNativeRep { + // #854: internal boxed-bits request path. Production GC/layout consumers + // record this rep for region-local NaN-box payload bits; external ABI + // classifiers must still select `JsValue`. + JsValueBits, I32, I64, U32, @@ -179,6 +188,10 @@ impl LoweredValue { Self::new(SemanticKind::JsValue, NativeRep::JsValue, DOUBLE, value) } + pub(crate) fn js_value_bits(value: impl Into) -> Self { + Self::new(SemanticKind::JsValue, NativeRep::JsValueBits, I64, value) + } + pub(crate) fn native_handle(value: impl Into) -> Self { Self::new(SemanticKind::JsValue, NativeRep::NativeHandle, I64, value) } @@ -228,7 +241,8 @@ impl LoweredValue { pub(crate) fn is_rep(&self, expected: ExpectedNativeRep) -> bool { matches!( (expected, &self.rep), - (ExpectedNativeRep::I32, NativeRep::I32) + (ExpectedNativeRep::JsValueBits, NativeRep::JsValueBits) + | (ExpectedNativeRep::I32, NativeRep::I32) | (ExpectedNativeRep::I64, NativeRep::I64) | (ExpectedNativeRep::U32, NativeRep::U32) | (ExpectedNativeRep::U64, NativeRep::U64) diff --git a/crates/perry-codegen/src/native_value/verify.rs b/crates/perry-codegen/src/native_value/verify.rs index 4a14024d4a..175556a178 100644 --- a/crates/perry-codegen/src/native_value/verify.rs +++ b/crates/perry-codegen/src/native_value/verify.rs @@ -38,6 +38,7 @@ pub(crate) fn verify_native_rep_records(records: &[NativeRepRecord]) -> Result<( record.function, 
record.block_label, record.consumer )); } + validate_js_value_bits_record(record, &mut errors); if matches!( record.native_rep, NativeRep::NativeHandle | NativeRep::PromiseBoundary @@ -261,6 +262,52 @@ fn raw_f64_checked_native_consumer(record: &NativeRepRecord) -> bool { ) } +fn validate_js_value_bits_record(record: &NativeRepRecord, errors: &mut Vec<String>) { + if !matches!(record.native_rep, NativeRep::JsValueBits) { + return; + } + let prefix = || { + format!( + "{}:{} {}", + record.function, record.block_label, record.consumer + ) + }; + if record.native_abi_type.is_some() { + errors.push(format!( + "{} js_value_bits cannot be used as an external ABI descriptor", + prefix() + )); + } + if record.access_mode == Some(BufferAccessMode::DynamicFallback) + || record.fallback_reason.is_some() + || record.native_value_state == NativeValueState::DynamicFallback + { + errors.push(format!( + "{} js_value_bits cannot be a dynamic fallback record", + prefix() + )); + } + if record.materialization_reason.is_some() + || record.native_value_state == NativeValueState::Materialized + { + let transition = record + .native_abi_transition + .as_ref() + .or(record.scalar_conversion.as_ref()); + if !transition.is_some_and(|conversion| { + conversion.from_native_rep == NativeRep::JsValue.name() + && conversion.to_native_rep == NativeRep::JsValueBits.name() + && conversion.op == NativeAbiTransitionOp::JsValueToBits + && !conversion.lossy + }) { + errors.push(format!( + "{} materialized js_value_bits record must carry js_value_to_bits transition", + prefix() + )); + } + } +} + fn raw_f64_dynamic_fallback_record(record: &NativeRepRecord) -> bool { matches!( (record.expr_kind.as_str(), record.consumer.as_str()), @@ -790,7 +837,8 @@ fn expected_llvm_type(rep: &NativeRep) -> Option<&'static str> { Some(match rep { NativeRep::JsValue | NativeRep::F64 => DOUBLE, NativeRep::F32 => F32, - NativeRep::I64 + NativeRep::JsValueBits + | NativeRep::I64 | NativeRep::U64 | NativeRep::USize | 
NativeRep::HandleId @@ -951,6 +999,12 @@ fn valid_native_abi_transition( lossy: bool, record_rep: &NativeRep, ) -> bool { + if to == NativeRep::JsValueBits.name() { + return matches!(record_rep, NativeRep::JsValueBits) + && from == NativeRep::JsValue.name() + && matches!(op, NativeAbiTransitionOp::JsValueToBits) + && !lossy; + } if to != NativeRep::JsValue.name() { return false; } @@ -959,6 +1013,8 @@ fn valid_native_abi_transition( } match op { NativeAbiTransitionOp::None => matches!(from, "f64" | "js_value") && !lossy, + NativeAbiTransitionOp::JsValueToBits => false, + NativeAbiTransitionOp::BitsToJsValue => from == "js_value_bits" && !lossy, NativeAbiTransitionOp::SignedIntToFloat => { matches!(from, "i32" | "i64") && lossy == (from == "i64") } @@ -1797,6 +1853,91 @@ mod tests { assert!(verify_native_rep_records(&[r]).is_err()); } + #[test] + fn accepts_region_local_js_value_bits() { + let mut r = record(); + r.semantic = SemanticKind::JsValue; + r.native_rep = NativeRep::JsValueBits; + r.native_rep_name = "js_value_bits".to_string(); + r.llvm_ty = I64; + r.llvm_value = "%bits".to_string(); + assert!(verify_native_rep_records(&[r]).is_ok()); + } + + #[test] + fn accepts_js_value_bits_materialization_transitions() { + let mut to_bits = record(); + to_bits.semantic = SemanticKind::JsValue; + to_bits.native_rep = NativeRep::JsValueBits; + to_bits.native_rep_name = "js_value_bits".to_string(); + to_bits.llvm_ty = I64; + to_bits.llvm_value = "%bits".to_string(); + to_bits.native_value_state = NativeValueState::Materialized; + to_bits.materialization_reason = Some(MaterializationReason::FunctionAbi); + to_bits.native_abi_transition = Some(NativeAbiTransitionRecord { + from_native_rep: "js_value".to_string(), + to_native_rep: "js_value_bits".to_string(), + op: NativeAbiTransitionOp::JsValueToBits, + reason: MaterializationReason::FunctionAbi, + lossy: false, + }); + + let mut to_js_value = record(); + to_js_value.semantic = SemanticKind::JsValue; + 
to_js_value.native_rep = NativeRep::JsValue; + to_js_value.native_rep_name = "js_value".to_string(); + to_js_value.llvm_ty = DOUBLE; + to_js_value.llvm_value = "%boxed".to_string(); + to_js_value.native_value_state = NativeValueState::Materialized; + to_js_value.materialization_reason = Some(MaterializationReason::ReturnAbi); + to_js_value.native_abi_transition = Some(NativeAbiTransitionRecord { + from_native_rep: "js_value_bits".to_string(), + to_native_rep: "js_value".to_string(), + op: NativeAbiTransitionOp::BitsToJsValue, + reason: MaterializationReason::ReturnAbi, + lossy: false, + }); + + assert!(verify_native_rep_records(&[to_bits, to_js_value]).is_ok()); + } + + #[test] + fn rejects_materialized_js_value_bits_without_transition() { + let mut r = record(); + r.semantic = SemanticKind::JsValue; + r.native_rep = NativeRep::JsValueBits; + r.native_rep_name = "js_value_bits".to_string(); + r.llvm_ty = I64; + r.llvm_value = "%bits".to_string(); + r.native_value_state = NativeValueState::Materialized; + r.materialization_reason = None; + assert!(verify_native_rep_records(&[r]).is_err()); + } + + #[test] + fn rejects_js_value_bits_as_abi_or_fallback() { + let mut abi = record(); + abi.semantic = SemanticKind::JsValue; + abi.native_rep = NativeRep::JsValueBits; + abi.native_rep_name = "js_value_bits".to_string(); + abi.llvm_ty = I64; + abi.llvm_value = "%bits".to_string(); + abi.native_abi_type = Some(abi_type("jsvalue", NativeAbiDirection::Param, Some(0), 0)); + assert!(verify_native_rep_records(&[abi]).is_err()); + + let mut fallback = record(); + fallback.semantic = SemanticKind::JsValue; + fallback.native_rep = NativeRep::JsValueBits; + fallback.native_rep_name = "js_value_bits".to_string(); + fallback.llvm_ty = I64; + fallback.llvm_value = "%bits".to_string(); + fallback.access_mode = Some(BufferAccessMode::DynamicFallback); + fallback.native_value_state = NativeValueState::DynamicFallback; + fallback.materialization_reason = 
Some(MaterializationReason::RuntimeApi); + fallback.fallback_reason = Some(MaterializationReason::RuntimeApi); + assert!(verify_native_rep_records(&[fallback]).is_err()); + } + #[test] fn rejects_materialized_f32_record() { let mut r = record(); diff --git a/crates/perry-codegen/src/runtime_decls/arrays.rs b/crates/perry-codegen/src/runtime_decls/arrays.rs index 364c901260..db89ecf06b 100644 --- a/crates/perry-codegen/src/runtime_decls/arrays.rs +++ b/crates/perry-codegen/src/runtime_decls/arrays.rs @@ -57,6 +57,7 @@ pub fn declare_phase_b_arrays(module: &mut LlModule) { module.declare_function("js_array_mark_numeric_f64_layout", I32, &[I64]); module.declare_function("js_array_is_numeric_f64_layout", I32, &[I64]); module.declare_function("js_array_clear_numeric_layout", VOID, &[I64]); + module.declare_function("js_array_numeric_value_to_raw_f64", DOUBLE, &[DOUBLE]); module.declare_function("js_array_note_numeric_write", VOID, &[I64, I64]); module.declare_function("js_array_length", I32, &[I64]); // Array.isArray runtime dispatch for values with indeterminate diff --git a/crates/perry-codegen/src/runtime_decls/objects.rs b/crates/perry-codegen/src/runtime_decls/objects.rs index 24aaea2cfd..d68a75475d 100644 --- a/crates/perry-codegen/src/runtime_decls/objects.rs +++ b/crates/perry-codegen/src/runtime_decls/objects.rs @@ -288,7 +288,7 @@ pub fn declare_phase_b_objects(module: &mut LlModule) { module.declare_function( "js_typed_feedback_array_index_set_fallback_boxed", DOUBLE, - &[I64, DOUBLE, I32, DOUBLE], + &[I64, DOUBLE, DOUBLE, DOUBLE], ); module.declare_function( "js_typed_feedback_observe_array_element", diff --git a/crates/perry-codegen/src/stmt/loops.rs b/crates/perry-codegen/src/stmt/loops.rs index f05d636d21..ae9090d0af 100644 --- a/crates/perry-codegen/src/stmt/loops.rs +++ b/crates/perry-codegen/src/stmt/loops.rs @@ -21,6 +21,15 @@ struct NumericBulkFillLoop { value: NumericBulkFillValue, } +#[derive(Clone, Copy)] +struct LengthHoist { + arr_id: u32, + 
counter_id: u32, + op: perry_hir::CompareOp, + lhs_addend: i32, + buffer_bounds_width_units: Option<u32>, +} + /// Runtime-guarded i32 specialization for `i < n` loops whose bound `n` is an /// `any`/untyped (non-`number`) local. The `is-number` flag and `fptosi(n)` /// value are both hoisted to stack slots once before the loop; the cond block @@ -273,7 +282,7 @@ pub(crate) fn lower_for( // Saves ~25-30% on `for (let i = 0; i < arr.length; i++) arr[i] = i` // and `for (let i = 0; i < arr.length; i++) for (let j = 0; j < // arr.length; j++) ...` patterns. - let hoist_classification: Option<(u32, u32, perry_hir::CompareOp)> = condition + let hoist_classification: Option<LengthHoist> = condition .and_then(|cond| classify_for_length_hoist(cond, body)) // `__arr_N` is the for-of desugar's holder — an ALIAS of the user's // iterable local. Body mutations go through the user's name @@ -282,79 +291,85 @@ pub(crate) fn lower_for( // length every step (array-expand/contract in test262), so never // hoist for desugared for-of loops; user-written `i < arr.length` // loops keep the peephole.
- .filter(|(arr_id, _, _)| { + .filter(|hoist| { !ctx.local_id_to_name - .get(arr_id) + .get(&hoist.arr_id) .is_some_and(|n| n.starts_with("__arr_")) }); - let hoisted_length_arr_id: Option = hoist_classification.map(|(arr, _, _)| arr); - let hoisted_index_bounds_are_safe = hoist_classification.is_some_and(|(_, counter_id, op)| { - matches!(op, perry_hir::CompareOp::Lt) - && loop_counter_bounds_are_safe(ctx, counter_id, update, body) + let hoisted_length_arr_id: Option = hoist_classification.map(|hoist| hoist.arr_id); + let hoisted_index_bounds_are_safe = hoist_classification.is_some_and(|hoist| { + matches!(hoist.op, perry_hir::CompareOp::Lt) + && hoist.lhs_addend == 0 + && loop_counter_bounds_are_safe(ctx, hoist.counter_id, update, body) + }); + let hoisted_buffer_bounds_width = hoist_classification.and_then(|hoist| { + hoist.buffer_bounds_width_units.filter(|_| { + ctx.buffer_view_slots.contains_key(&hoist.arr_id) + && loop_counter_bounds_are_safe(ctx, hoist.counter_id, update, body) + }) }); - let hoisted_length_slot: Option = - if let Some((arr_id, counter_id, _op)) = hoist_classification { - let arr_box_loaded = lower_expr( - ctx, - &perry_hir::Expr::PropertyGet { - object: Box::new(perry_hir::Expr::LocalGet(arr_id)), - property: "length".to_string(), + let hoisted_length_slot: Option = if let Some(hoist) = hoist_classification { + let arr_box_loaded = lower_expr( + ctx, + &perry_hir::Expr::PropertyGet { + object: Box::new(perry_hir::Expr::LocalGet(hoist.arr_id)), + property: "length".to_string(), + }, + )?; + let slot = ctx.func.alloca_entry(DOUBLE); + ctx.block().store(DOUBLE, &arr_box_loaded, &slot); + ctx.cached_lengths.insert(hoist.arr_id, slot.clone()); + // Also tell `lower_index_set_fast` (and similar sites) that + // `arr[counter_id]` is statically inbounds for this body, so + // it can skip the runtime length-load + bound check. 
+ if hoisted_index_bounds_are_safe { + ctx.bounded_index_pairs.push(BoundedIndexPair { + index_local_id: hoist.counter_id, + array_local_id: hoist.arr_id, + scope_id: loop_proof_scope_id, + }); + } + if let Some(bounds_width_units) = hoisted_buffer_bounds_width { + ctx.bounded_buffer_index_pairs.push(BoundedBufferIndex { + index_local_id: hoist.counter_id, + buffer_local_id: hoist.arr_id, + scope_id: loop_proof_scope_id, + bounds_width_units, + bounds: BoundsState::Proven { + proof: BoundsProof::LoopGuard, }, - )?; - let slot = ctx.func.alloca_entry(DOUBLE); - ctx.block().store(DOUBLE, &arr_box_loaded, &slot); - ctx.cached_lengths.insert(arr_id, slot.clone()); - // Also tell `lower_index_set_fast` (and similar sites) that - // `arr[counter_id]` is statically inbounds for this body, so - // it can skip the runtime length-load + bound check. - if hoisted_index_bounds_are_safe { - ctx.bounded_index_pairs.push(BoundedIndexPair { - index_local_id: counter_id, - array_local_id: arr_id, - scope_id: loop_proof_scope_id, - }); - if ctx.buffer_view_slots.contains_key(&arr_id) { - ctx.bounded_buffer_index_pairs.push(BoundedBufferIndex { - index_local_id: counter_id, - buffer_local_id: arr_id, - scope_id: loop_proof_scope_id, - bounds_width_units: 1, - bounds: BoundsState::Proven { - proof: BoundsProof::LoopGuard, - }, - }); - } - } + }); + } - // If the counter is provably integer-valued (initialized from - // an Integer literal, only mutated via Update ++/--), allocate - // a parallel i32 slot. The Update lowering will keep it in sync, - // and IndexGet/IndexSet will load the i32 directly instead of - // emitting a `fptosi double → i32` on every iteration. - if ctx.integer_locals.contains(&counter_id) { - if let Some(counter_slot) = ctx.locals.get(&counter_id).cloned() { - let i32_slot = ctx.func.alloca_entry(I32); - // Initialize from the current double value. 
- let cur_dbl = ctx.block().load(DOUBLE, &counter_slot); - let cur_i32 = ctx.block().fptosi(DOUBLE, &cur_dbl, I32); - ctx.block().store(I32, &cur_i32, &i32_slot); - ctx.i32_counter_slots.insert(counter_id, i32_slot); - } + // If the counter is provably integer-valued (initialized from + // an Integer literal, only mutated via Update ++/--), allocate + // a parallel i32 slot. The Update lowering will keep it in sync, + // and IndexGet/IndexSet will load the i32 directly instead of + // emitting a `fptosi double → i32` on every iteration. + if ctx.integer_locals.contains(&hoist.counter_id) { + if let Some(counter_slot) = ctx.locals.get(&hoist.counter_id).cloned() { + let i32_slot = ctx.func.alloca_entry(I32); + // Initialize from the current double value. + let cur_dbl = ctx.block().load(DOUBLE, &counter_slot); + let cur_i32 = ctx.block().fptosi(DOUBLE, &cur_dbl, I32); + ctx.block().store(I32, &cur_i32, &i32_slot); + ctx.i32_counter_slots.insert(hoist.counter_id, i32_slot); } + } - Some(slot) - } else { - None - }; + Some(slot) + } else { + None + }; // If we have an i32 counter AND a hoisted length, pre-compute the // length as i32 so the loop condition can use `icmp slt/sle i32` // instead of `fcmp olt/ole double`. This eliminates the float counter fadd + // fcmp per iteration — saves ~2 instructions on the inner loop of // nested_loops and similar patterns. - let i32_length_slot: Option = if let Some((_, counter_id, _op)) = hoist_classification { + let i32_length_slot: Option = if let Some(hoist) = hoist_classification { if let (Some(_), Some(len_dbl_slot)) = ( - ctx.i32_counter_slots.get(&counter_id).cloned(), + ctx.i32_counter_slots.get(&hoist.counter_id).cloned(), hoisted_length_slot.as_ref(), ) { let len_dbl = ctx.block().load(DOUBLE, len_dbl_slot); @@ -558,81 +573,83 @@ pub(crate) fn lower_for( // Cond block — fast i32 path when both counter and length are i32. 
ctx.current_block = cond_idx; - let used_i32_cond = if let (Some((_, counter_id, op)), Some(ref len_i32_slot)) = - (hoist_classification, &i32_length_slot) - { - // Existing path: `i < arr.length` / `i <= arr.length` with - // hoisted i32 length. - if let Some(ctr_i32_slot) = ctx.i32_counter_slots.get(&counter_id).cloned() { - let ctr = ctx.block().load(I32, &ctr_i32_slot); - let len = ctx.block().load(I32, len_i32_slot); - let cmp = match op { - perry_hir::CompareOp::Le => ctx.block().icmp_sle(I32, &ctr, &len), - _ => ctx.block().icmp_slt(I32, &ctr, &len), - }; - ctx.block().cond_br(&cmp, &body_label, &exit_label); - true - } else { - false - } - } else if let (Some((counter_id, _, op)), Some(ref bound_i32_slot)) = - (local_bound_classification, &i32_local_bound_slot) - { - // Issue #168: `i < n` / `i <= n` where `n` is a number-typed local - // or parameter. The fptosi(n) was hoisted above; use icmp i32. - if let Some(ctr_i32_slot) = ctx.i32_counter_slots.get(&counter_id).cloned() { - let ctr = ctx.block().load(I32, &ctr_i32_slot); - let bound = ctx.block().load(I32, bound_i32_slot); - let cmp = match op { - perry_hir::CompareOp::Le => ctx.block().icmp_sle(I32, &ctr, &bound), - _ => ctx.block().icmp_slt(I32, &ctr, &bound), - }; - ctx.block().cond_br(&cmp, &body_label, &exit_label); - true - } else { - false - } - } else if let Some(ref dyn_bound) = dynamic_i32_bound { - // Issue #168 follow-up: `i < n` / `i <= n` where `n` is an `any`/untyped - // local. Branch on the one-time `is-number` flag hoisted above: the - // fast loop uses `icmp slt i32`; the slow loop keeps full JS comparison - // semantics. The branch is loop-invariant, so LLVM's LoopUnswitch peels - // it into two loops at -O2+; even unswitched, the hot (is-number) path - // executes pure integer compares with no per-iteration `sitofp` / call. 
- if let Some(ctr_i32_slot) = ctx.i32_counter_slots.get(&dyn_bound.counter_id).cloned() { - let fast_idx = ctx.new_block("for.cond.fast"); - let slow_idx = ctx.new_block("for.cond.slow"); - let fast_label = ctx.block_label(fast_idx); - let slow_label = ctx.block_label(slow_idx); - let flag = ctx.block().load(I1, &dyn_bound.flag_slot); - ctx.block().cond_br(&flag, &fast_label, &slow_label); + let used_i32_cond = + if let (Some(hoist), Some(ref len_i32_slot)) = (hoist_classification, &i32_length_slot) { + // Existing path: `i < arr.length` / `i <= arr.length` with + // hoisted i32 length. + if let Some(ctr_i32_slot) = ctx.i32_counter_slots.get(&hoist.counter_id).cloned() { + let mut ctr = ctx.block().load(I32, &ctr_i32_slot); + if hoist.lhs_addend != 0 { + ctr = ctx.block().add(I32, &ctr, &hoist.lhs_addend.to_string()); + } + let len = ctx.block().load(I32, len_i32_slot); + let cmp = match hoist.op { + perry_hir::CompareOp::Le => ctx.block().icmp_sle(I32, &ctr, &len), + _ => ctx.block().icmp_slt(I32, &ctr, &len), + }; + ctx.block().cond_br(&cmp, &body_label, &exit_label); + true + } else { + false + } + } else if let (Some((counter_id, _, op)), Some(ref bound_i32_slot)) = + (local_bound_classification, &i32_local_bound_slot) + { + // Issue #168: `i < n` / `i <= n` where `n` is a number-typed local + // or parameter. The fptosi(n) was hoisted above; use icmp i32. + if let Some(ctr_i32_slot) = ctx.i32_counter_slots.get(&counter_id).cloned() { + let ctr = ctx.block().load(I32, &ctr_i32_slot); + let bound = ctx.block().load(I32, bound_i32_slot); + let cmp = match op { + perry_hir::CompareOp::Le => ctx.block().icmp_sle(I32, &ctr, &bound), + _ => ctx.block().icmp_slt(I32, &ctr, &bound), + }; + ctx.block().cond_br(&cmp, &body_label, &exit_label); + true + } else { + false + } + } else if let Some(ref dyn_bound) = dynamic_i32_bound { + // Issue #168 follow-up: `i < n` / `i <= n` where `n` is an `any`/untyped + // local. 
Branch on the one-time `is-number` flag hoisted above: the + // fast loop uses `icmp slt i32`; the slow loop keeps full JS comparison + // semantics. The branch is loop-invariant, so LLVM's LoopUnswitch peels + // it into two loops at -O2+; even unswitched, the hot (is-number) path + // executes pure integer compares with no per-iteration `sitofp` / call. + if let Some(ctr_i32_slot) = ctx.i32_counter_slots.get(&dyn_bound.counter_id).cloned() { + let fast_idx = ctx.new_block("for.cond.fast"); + let slow_idx = ctx.new_block("for.cond.slow"); + let fast_label = ctx.block_label(fast_idx); + let slow_label = ctx.block_label(slow_idx); + let flag = ctx.block().load(I1, &dyn_bound.flag_slot); + ctx.block().cond_br(&flag, &fast_label, &slow_label); - // Fast path: integer induction variable + `icmp`. - ctx.current_block = fast_idx; - let ctr = ctx.block().load(I32, &ctr_i32_slot); - let bound = ctx.block().load(I32, &dyn_bound.bound_i32_slot); - let cmp = match dyn_bound.op { - perry_hir::CompareOp::Le => ctx.block().icmp_sle(I32, &ctr, &bound), - _ => ctx.block().icmp_slt(I32, &ctr, &bound), - }; - ctx.block().cond_br(&cmp, &body_label, &exit_label); + // Fast path: integer induction variable + `icmp`. + ctx.current_block = fast_idx; + let ctr = ctx.block().load(I32, &ctr_i32_slot); + let bound = ctx.block().load(I32, &dyn_bound.bound_i32_slot); + let cmp = match dyn_bound.op { + perry_hir::CompareOp::Le => ctx.block().icmp_sle(I32, &ctr, &bound), + _ => ctx.block().icmp_slt(I32, &ctr, &bound), + }; + ctx.block().cond_br(&cmp, &body_label, &exit_label); - // Slow path: generic per-iteration comparison (full coercion). - ctx.current_block = slow_idx; - if let Some(cond_expr) = condition { - let cv = lower_expr(ctx, cond_expr)?; - let i1 = lower_truthy(ctx, &cv, cond_expr); - ctx.block().cond_br(&i1, &body_label, &exit_label); + // Slow path: generic per-iteration comparison (full coercion). 
+ ctx.current_block = slow_idx; + if let Some(cond_expr) = condition { + let cv = lower_expr(ctx, cond_expr)?; + let i1 = lower_truthy(ctx, &cv, cond_expr); + ctx.block().cond_br(&i1, &body_label, &exit_label); + } else { + ctx.block().br(&body_label); + } + true } else { - ctx.block().br(&body_label); + false } - true } else { false - } - } else { - false - }; + }; if !used_i32_cond { if let Some(cond_expr) = condition { let cv = lower_expr(ctx, cond_expr)?; @@ -703,8 +720,8 @@ pub(crate) fn lower_for( // Pop the hoisted-length entry so nested loops or sibling loops // don't see a stale slot. - if let Some((_, counter_id, _op)) = hoist_classification { - ctx.i32_counter_slots.remove(&counter_id); + if let Some(hoist) = hoist_classification { + ctx.i32_counter_slots.remove(&hoist.counter_id); } if let Some(arr_id) = hoisted_length_arr_id { ctx.cached_lengths.remove(&arr_id); @@ -752,21 +769,24 @@ pub(crate) fn clear_loop_body_shadow_slots(ctx: &mut FnCtx<'_>, body: &[Stmt]) { } /// Inspect a `for` loop's condition expression and body, and return -/// `Some((arr_local_id, counter_local_id, op))` if the loop is the -/// well-known shape `for (let i = ...; i < .length; ...) { body }` -/// (or `<=`) AND the body is provably free of operations that can change -/// `arr.length`. +/// `Some(...)` if the loop is the well-known shape +/// `for (let i = ...; i < .length; ...) { body }` (or `<=`) AND the +/// body is provably free of operations that can change `arr.length`. +/// +/// Also recognizes fixed-width native-buffer guards such as +/// `i + 4 <= buf.length`. The hoist descriptor keeps the LHS addend so the +/// fast condition remains `i + 4 <= len`, not `i <= len`. /// /// The walker also accepts `arr[i] = expr` IndexSets where `i` is the /// loop counter from a strict `<` condition — those are guaranteed /// inbounds and therefore can't trigger the realloc slow path that would /// extend `arr.length`. 
Under `<=`, `i == arr.length` is reachable, so /// array writes must go through the normal extension-capable path. -pub(crate) fn classify_for_length_hoist( +fn classify_for_length_hoist( cond: &perry_hir::Expr, body: &[perry_hir::Stmt], -) -> Option<(u32, u32, perry_hir::CompareOp)> { - use perry_hir::{CompareOp, Expr}; +) -> Option<LengthHoist> { + use perry_hir::{BinaryOp, CompareOp, Expr}; let (op, left, right) = match cond { Expr::Compare { op, left, right } => (*op, left.as_ref(), right.as_ref()), _ => return None, @@ -781,18 +801,53 @@ pub(crate) fn classify_for_length_hoist( }, _ => return None, }; - let bounded_idx_id = match left { - Expr::LocalGet(id) => *id, + let (bounded_idx_id, lhs_addend) = match left { + Expr::LocalGet(id) => (*id, 0), + Expr::Binary { op, left, right } if matches!(op, BinaryOp::Add | BinaryOp::Sub) => { + match (left.as_ref(), right.as_ref()) { + (Expr::LocalGet(id), Expr::Integer(addend)) => { + let addend = if matches!(op, BinaryOp::Sub) { + addend.checked_neg()?
+ } else { + *addend + }; + if !(0..=i32::MAX as i64).contains(&addend) { + return None; + } + (*id, addend as i32) + } + (Expr::Integer(addend), Expr::LocalGet(id)) if matches!(op, BinaryOp::Add) => { + if !(0..=i32::MAX as i64).contains(addend) { + return None; + } + (*id, *addend as i32) + } + _ => return None, + } + } _ => return None, }; - let has_strict_bound = matches!(op, CompareOp::Lt); + let has_strict_bound = matches!(op, CompareOp::Lt) && lhs_addend == 0; if !body .iter() .all(|s| stmt_preserves_array_length(s, arr_id, bounded_idx_id, has_strict_bound)) { return None; } - Some((arr_id, bounded_idx_id, op)) + let buffer_bounds_width_units = match op { + CompareOp::Lt => i64::from(lhs_addend).checked_add(1), + CompareOp::Le => Some(i64::from(lhs_addend)), + _ => None, + } + .filter(|width| *width >= 1 && *width <= u32::MAX as i64) + .map(|width| width as u32); + Some(LengthHoist { + arr_id, + counter_id: bounded_idx_id, + op, + lhs_addend, + buffer_bounds_width_units, + }) } /// Inspect a `for` loop's condition and return `Some((counter_id, bound_id, diff --git a/crates/perry-codegen/src/type_analysis.rs b/crates/perry-codegen/src/type_analysis.rs index f6ec9a38c0..a8cba6fbf3 100644 --- a/crates/perry-codegen/src/type_analysis.rs +++ b/crates/perry-codegen/src/type_analysis.rs @@ -755,6 +755,381 @@ fn expression_has_numeric_length(ctx: &FnCtx<'_>, object: &Expr) -> bool { } } +fn native_rep_materializes_to_js_number(rep: &crate::native_value::NativeRep) -> bool { + matches!( + rep, + crate::native_value::NativeRep::I32 + | crate::native_value::NativeRep::I64 + | crate::native_value::NativeRep::U32 + | crate::native_value::NativeRep::U64 + | crate::native_value::NativeRep::USize + | crate::native_value::NativeRep::F64 + | crate::native_value::NativeRep::F32 + | crate::native_value::NativeRep::U8 + | crate::native_value::NativeRep::BufferLen + | crate::native_value::NativeRep::HandleId + ) +} + +fn pod_record_local_has_materialized_object(ctx: &FnCtx<'_>, 
local_id: u32) -> bool { + // Once a POD local has a materialized JS object path, later property + // reads may observe mutable boxed object state instead of native bytes. + ctx.native_rep_records.iter().any(|record| { + record.local_id == Some(local_id) && record.consumer == "pod_record_materialize_object" + }) +} + +pub(crate) fn pod_record_field_is_numeric(ctx: &FnCtx<'_>, object: &Expr, field: &str) -> bool { + let Expr::LocalGet(id) = object else { + return false; + }; + if pod_record_local_has_materialized_object(ctx, *id) { + return false; + } + ctx.pod_records + .get(id) + .and_then(|local| { + local + .layout + .fields + .iter() + .find(|candidate| candidate.name == field) + }) + .is_some_and(|field| native_rep_materializes_to_js_number(&field.native_rep)) +} + +fn collect_pod_numeric_field_read_locals(ctx: &FnCtx<'_>, expr: &Expr, out: &mut Vec<u32>) { + match expr { + Expr::PropertyGet { object, property } + if matches!(object.as_ref(), Expr::LocalGet(_)) + && pod_record_field_is_numeric(ctx, object, property) => + { + if let Expr::LocalGet(id) = object.as_ref() { + out.push(*id); + } + } + Expr::PropertyGet { object, .. } => collect_pod_numeric_field_read_locals(ctx, object, out), + Expr::PropertySet { object, value, .. } => { + collect_pod_numeric_field_read_locals(ctx, object, out); + collect_pod_numeric_field_read_locals(ctx, value, out); + } + Expr::IndexGet { object, index } => { + collect_pod_numeric_field_read_locals(ctx, object, out); + collect_pod_numeric_field_read_locals(ctx, index, out); + } + Expr::IndexSet { + object, + index, + value, + } => { + collect_pod_numeric_field_read_locals(ctx, object, out); + collect_pod_numeric_field_read_locals(ctx, index, out); + collect_pod_numeric_field_read_locals(ctx, value, out); + } + Expr::Binary { left, right, .. } | Expr::Compare { left, right, .. } => { + collect_pod_numeric_field_read_locals(ctx, left, out); + collect_pod_numeric_field_read_locals(ctx, right, out); + } + Expr::Unary { operand, ..
} | Expr::TypeOf(operand) | Expr::Void(operand) => { + collect_pod_numeric_field_read_locals(ctx, operand, out); + } + Expr::Logical { left, right, .. } => { + collect_pod_numeric_field_read_locals(ctx, left, out); + collect_pod_numeric_field_read_locals(ctx, right, out); + } + Expr::Conditional { + condition, + then_expr, + else_expr, + } => { + collect_pod_numeric_field_read_locals(ctx, condition, out); + collect_pod_numeric_field_read_locals(ctx, then_expr, out); + collect_pod_numeric_field_read_locals(ctx, else_expr, out); + } + Expr::Call { callee, args, .. } => { + collect_pod_numeric_field_read_locals(ctx, callee, out); + for arg in args { + collect_pod_numeric_field_read_locals(ctx, arg, out); + } + } + Expr::NativeMethodCall { object, args, .. } => { + if let Some(object) = object { + collect_pod_numeric_field_read_locals(ctx, object, out); + } + for arg in args { + collect_pod_numeric_field_read_locals(ctx, arg, out); + } + } + Expr::New { args, .. } | Expr::NewDynamic { args, .. } => { + for arg in args { + collect_pod_numeric_field_read_locals(ctx, arg, out); + } + } + Expr::Array(items) => { + for item in items { + collect_pod_numeric_field_read_locals(ctx, item, out); + } + } + Expr::Object(items) => { + for (_, item) in items { + collect_pod_numeric_field_read_locals(ctx, item, out); + } + } + _ => {} + } +} + +fn expr_may_materialize_pod_local(ctx: &FnCtx<'_>, expr: &Expr, target_id: u32) -> bool { + match expr { + Expr::LocalGet(id) => *id == target_id && ctx.pod_records.contains_key(id), + Expr::PropertyGet { object, property } + if matches!(object.as_ref(), Expr::LocalGet(id) if *id == target_id) + && ctx.pod_records.get(&target_id).is_some_and(|local| { + local + .layout + .fields + .iter() + .any(|field| field.name == *property) + }) => + { + false + } + Expr::PropertyGet { object, .. 
} => expr_may_materialize_pod_local(ctx, object, target_id), + Expr::PropertySet { + object, + property, + value, + } => { + let pod_field_set = matches!(object.as_ref(), Expr::LocalGet(id) if *id == target_id) + && ctx.pod_records.get(&target_id).is_some_and(|local| { + local + .layout + .fields + .iter() + .any(|field| field.name == *property) + }); + pod_field_set + || expr_may_materialize_pod_local(ctx, object, target_id) + || expr_may_materialize_pod_local(ctx, value, target_id) + } + Expr::Call { callee, args, .. } => { + expr_may_materialize_pod_local(ctx, callee, target_id) + || args + .iter() + .any(|arg| expr_may_materialize_pod_local(ctx, arg, target_id)) + } + Expr::NativeMethodCall { object, args, .. } => { + object + .as_ref() + .is_some_and(|object| expr_may_materialize_pod_local(ctx, object, target_id)) + || args + .iter() + .any(|arg| expr_may_materialize_pod_local(ctx, arg, target_id)) + } + Expr::IndexGet { object, index } => { + expr_may_materialize_pod_local(ctx, object, target_id) + || expr_may_materialize_pod_local(ctx, index, target_id) + } + Expr::IndexSet { + object, + index, + value, + } => { + expr_may_materialize_pod_local(ctx, object, target_id) + || expr_may_materialize_pod_local(ctx, index, target_id) + || expr_may_materialize_pod_local(ctx, value, target_id) + } + Expr::Binary { left, right, .. } + | Expr::Compare { left, right, .. } + | Expr::Logical { left, right, .. } => { + expr_may_materialize_pod_local(ctx, left, target_id) + || expr_may_materialize_pod_local(ctx, right, target_id) + } + Expr::Unary { operand, .. } | Expr::TypeOf(operand) | Expr::Void(operand) => { + expr_may_materialize_pod_local(ctx, operand, target_id) + } + Expr::Conditional { + condition, + then_expr, + else_expr, + } => { + expr_may_materialize_pod_local(ctx, condition, target_id) + || expr_may_materialize_pod_local(ctx, then_expr, target_id) + || expr_may_materialize_pod_local(ctx, else_expr, target_id) + } + Expr::New { args, .. 
} | Expr::NewDynamic { args, .. } => args + .iter() + .any(|arg| expr_may_materialize_pod_local(ctx, arg, target_id)), + Expr::Array(items) => items + .iter() + .any(|item| expr_may_materialize_pod_local(ctx, item, target_id)), + Expr::Object(items) => items + .iter() + .any(|(_, item)| expr_may_materialize_pod_local(ctx, item, target_id)), + _ => false, + } +} + +pub(crate) fn add_operands_have_pod_materialization_hazard( + ctx: &FnCtx<'_>, + left: &Expr, + right: &Expr, +) -> bool { + let mut right_pod_reads = Vec::new(); + collect_pod_numeric_field_read_locals(ctx, right, &mut right_pod_reads); + right_pod_reads + .into_iter() + .any(|id| expr_may_materialize_pod_local(ctx, left, id)) +} + +fn static_object_property_type(ctx: &FnCtx<'_>, object: &Expr, field: &str) -> Option { + match static_type_of(ctx, object)? { + HirType::Object(object_ty) => object_ty + .properties + .get(field) + .map(|property| property.ty.clone()), + _ => None, + } +} + +fn scalar_replaced_field_static_type( + ctx: &FnCtx<'_>, + object: &Expr, + field: &str, +) -> Option { + match object { + Expr::LocalGet(id) + if ctx + .scalar_replaced + .get(id) + .is_some_and(|fields| fields.contains_key(field)) => + { + declared_field_type(ctx, object, field) + .or_else(|| static_object_property_type(ctx, object, field)) + } + Expr::This => { + let target_id = ctx.scalar_ctor_target.last()?; + if !ctx + .scalar_replaced + .get(target_id) + .is_some_and(|fields| fields.contains_key(field)) + { + return None; + } + ctx.class_stack + .last() + .and_then(|class_name| class_field_declared_type(ctx, class_name, field)) + } + _ => None, + } +} + +pub(crate) fn scalar_replaced_field_is_raw_f64( + ctx: &FnCtx<'_>, + object: &Expr, + field: &str, +) -> bool { + scalar_replaced_field_static_type(ctx, object, field) + .as_ref() + .is_some_and(crate::typed_shape::type_is_raw_f64_candidate) +} + +pub(crate) fn scalar_replaced_field_raw_f64_store_state( + ctx: &FnCtx<'_>, + local_id: Option, + field: &str, + 
declared_raw_f64: bool, +) -> bool { + if !declared_raw_f64 { + return false; + } + + let field_note = format!("field={}", field); + let mut proven_raw = false; + for record in &ctx.native_rep_records { + if record.local_id != local_id || !record.notes.iter().any(|note| note == &field_note) { + continue; + } + match record.consumer.as_str() { + "scalar_object_field_store.raw_f64" => { + proven_raw = true; + } + "scalar_object_field_store" + if record.notes.iter().any(|note| note == "raw_f64_field=1") => + { + proven_raw = false; + } + _ => {} + } + } + proven_raw +} + +fn constant_array_index(index: &Expr) -> Option<usize> { + match index { + Expr::Integer(k) if *k >= 0 => Some(*k as usize), + Expr::Number(f) if f.is_finite() && *f >= 0.0 && f.fract() == 0.0 => Some(*f as usize), + _ => None, + } +} + +pub(crate) fn scalar_replaced_array_element_is_raw_f64( + ctx: &FnCtx<'_>, + object: &Expr, + index: &Expr, +) -> bool { + let Expr::LocalGet(id) = object else { + return false; + }; + let Some(k) = constant_array_index(index) else { + return false; + }; + if !ctx + .scalar_replaced_arrays + .get(id) + .is_some_and(|slots| k < slots.len()) + { + return false; + } + match static_type_of(ctx, object) { + Some(HirType::Array(elem)) => crate::typed_shape::type_is_raw_f64_candidate(elem.as_ref()), + Some(HirType::Tuple(elems)) => elems + .get(k) + .is_some_and(crate::typed_shape::type_is_raw_f64_candidate), + _ => false, + } +} + +fn type_has_numeric_pointer_free_array_layout_for_fallback(ty: &HirType) -> bool { + match ty { + HirType::Array(elem) => matches!(elem.as_ref(), HirType::Number | HirType::Int32), + HirType::Tuple(elems) => elems + .iter() + .all(|elem| matches!(elem, HirType::Number | HirType::Int32)), + HirType::Union(variants) => variants.iter().all(|variant| { + matches!(variant, HirType::Null | HirType::Void | HirType::Never) + || type_has_numeric_pointer_free_array_layout_for_fallback(variant) + }), + _ => false, + } +} + +pub(crate) fn
expr_may_return_boxed_value_from_raw_f64_fallback( + ctx: &FnCtx<'_>, + expr: &Expr, +) -> bool { + match expr { + Expr::PropertyGet { object, property } => receiver_class_name(ctx, object) + .and_then(|class_name| class_field_declared_type(ctx, &class_name, property)) + .as_ref() + .is_some_and(crate::typed_shape::type_is_raw_f64_candidate), + Expr::IndexGet { object, .. } => static_type_of(ctx, object) + .as_ref() + .is_some_and(type_has_numeric_pointer_free_array_layout_for_fallback), + _ => false, + } +} + fn is_fixed_width_buffer_numeric_read(method: &str) -> bool { matches!( method, @@ -827,6 +1202,42 @@ pub(crate) fn is_numeric_expr(ctx: &FnCtx<'_>, e: &Expr) -> bool { if property == "length" && expression_has_numeric_length(ctx, object) { return true; } + if let Expr::LocalGet(id) = object.as_ref() { + if ctx + .scalar_replaced + .get(id) + .is_some_and(|fields| fields.contains_key(property)) + { + let declared_raw_f64 = scalar_replaced_field_is_raw_f64(ctx, object, property); + return scalar_replaced_field_raw_f64_store_state( + ctx, + Some(*id), + property, + declared_raw_f64, + ); + } + } + if matches!(object.as_ref(), Expr::This) { + if let Some(target_id) = ctx.scalar_ctor_target.last().copied() { + if ctx + .scalar_replaced + .get(&target_id) + .is_some_and(|fields| fields.contains_key(property)) + { + let declared_raw_f64 = + scalar_replaced_field_is_raw_f64(ctx, object, property); + return scalar_replaced_field_raw_f64_store_state( + ctx, + Some(target_id), + property, + declared_raw_f64, + ); + } + } + } + if pod_record_field_is_numeric(ctx, object, property) { + return true; + } let Some(owner_class_name) = receiver_class_name(ctx, object) else { return false; }; @@ -1781,6 +2192,9 @@ pub(crate) fn static_type_of(ctx: &FnCtx<'_>, e: &Expr) -> Option { if property == "length" && expression_has_numeric_length(ctx, object) { return Some(HirType::Number); } + if pod_record_field_is_numeric(ctx, object, property) { + return Some(HirType::Number); + } 
if is_process_namespace_version_property(object, property) { return Some(HirType::String); } diff --git a/crates/perry-codegen/tests/native_proof_buffer_views.rs b/crates/perry-codegen/tests/native_proof_buffer_views.rs index e9d1ffe572..0aa3e18cc7 100644 --- a/crates/perry-codegen/tests/native_proof_buffer_views.rs +++ b/crates/perry-codegen/tests/native_proof_buffer_views.rs @@ -496,6 +496,14 @@ fn extern_call(name: &str, args: Vec, return_type: Type) -> Expr { ) } +fn extern_func_ref(name: &str, return_type: Type) -> Expr { + Expr::ExternFuncRef { + name: name.to_string(), + param_types: Vec::new(), + return_type, + } +} + fn native_library_opts(functions: Vec<(&str, Vec<&str>, &str)>) -> CompileOptions { let mut opts = empty_opts(); opts.native_library_functions = functions @@ -838,13 +846,11 @@ fn explicit_width_guard_proves_wide_buffer_read() { records.iter().any(|record| { record["expr_kind"] == "BufferNumericRead" && record["consumer"] == "BufferNumericRead.native_u32" - && record["bounds_state"]["guarded"]["guard_id"] - .as_str() - .is_some_and(|id| id.contains("width_4")) + && record["bounds_state"]["proven"]["proof"] == "loop_guard" && record["buffer_access"]["access_width_bytes"] == 4 && record["buffer_access"]["bounds_width_units"] == 4 }), - "expected i + 4 <= buf.length to guard a 4-byte native read:\n{artifact:#}" + "expected i + 4 <= buf.length to prove a 4-byte native read:\n{artifact:#}" ); } @@ -1246,6 +1252,225 @@ fn native_owned_unknown_call_escape_through_owner_alias_invalidates_views() { assert_typed_array_get_fallback_reason(&artifact, "missing_owner_root"); } +#[test] +fn native_owned_unknown_call_escape_inside_aggregate_invalidates_views() { + let artifact = compile_artifact_json( + "artifact_native_owned_unknown_escape_inside_aggregate.ts", + vec![ + native_arena_owner_let(1, "owner", int(64), false), + native_arena_view_let( + 2, + "view", + 1, + "Float64Array", + perry_hir::TYPED_ARRAY_KIND_FLOAT64, + int(0), + int(8), + ), + 
Stmt::Expr(extern_call( + "unknown_nested_escape", + vec![Expr::Array(vec![local(2)])], + Type::Number, + )), + Stmt::Return(Some(index_get(2, int(0)))), + ], + ); + assert_typed_array_get_fallback_reason(&artifact, "escaping_unowned_pointer"); +} + +#[test] +fn native_owned_call_spread_escape_invalidates_views() { + let artifact = compile_artifact_json( + "artifact_native_owned_call_spread_escape.ts", + vec![ + native_arena_owner_let(1, "owner", int(64), false), + native_arena_view_let( + 2, + "view", + 1, + "Float64Array", + perry_hir::TYPED_ARRAY_KIND_FLOAT64, + int(0), + int(8), + ), + Stmt::Expr(Expr::CallSpread { + callee: Box::new(Expr::ExternFuncRef { + name: "unknown_spread_escape".to_string(), + param_types: Vec::new(), + return_type: Type::Number, + }), + args: vec![perry_hir::CallArg::Spread(Expr::Array(vec![local(2)]))], + type_args: Vec::new(), + }), + Stmt::Return(Some(index_get(2, int(0)))), + ], + ); + assert_typed_array_get_fallback_reason(&artifact, "escaping_unowned_pointer"); +} + +#[test] +fn native_owned_proxy_apply_escape_invalidates_views() { + let artifact = compile_artifact_json( + "artifact_native_owned_proxy_apply_escape.ts", + vec![ + native_arena_owner_let(1, "owner", int(64), false), + native_arena_view_let( + 2, + "view", + 1, + "Float64Array", + perry_hir::TYPED_ARRAY_KIND_FLOAT64, + int(0), + int(8), + ), + Stmt::Expr(Expr::ProxyApply { + proxy: Box::new(extern_func_ref("unknown_proxy_apply_escape", Type::Any)), + args: vec![Expr::Array(vec![local(2)])], + }), + Stmt::Return(Some(index_get(2, int(0)))), + ], + ); + assert_typed_array_get_fallback_reason(&artifact, "escaping_unowned_pointer"); +} + +#[test] +fn native_owned_proxy_construct_escape_invalidates_views() { + let artifact = compile_artifact_json( + "artifact_native_owned_proxy_construct_escape.ts", + vec![ + native_arena_owner_let(1, "owner", int(64), false), + native_arena_view_let( + 2, + "view", + 1, + "Float64Array", + perry_hir::TYPED_ARRAY_KIND_FLOAT64, + int(0), + 
int(8), + ), + Stmt::Expr(Expr::ProxyConstruct { + proxy: Box::new(extern_func_ref("unknown_proxy_construct_escape", Type::Any)), + args: vec![Expr::Array(vec![local(2)])], + }), + Stmt::Return(Some(index_get(2, int(0)))), + ], + ); + assert_typed_array_get_fallback_reason(&artifact, "escaping_unowned_pointer"); +} + +#[test] +fn native_owned_reflect_apply_escape_invalidates_views() { + let artifact = compile_artifact_json( + "artifact_native_owned_reflect_apply_escape.ts", + vec![ + native_arena_owner_let(1, "owner", int(64), false), + native_arena_view_let( + 2, + "view", + 1, + "Float64Array", + perry_hir::TYPED_ARRAY_KIND_FLOAT64, + int(0), + int(8), + ), + Stmt::Expr(Expr::ReflectApply { + func: Box::new(extern_func_ref("unknown_reflect_apply_escape", Type::Any)), + this_arg: Box::new(Expr::Undefined), + args: Box::new(Expr::Array(vec![local(2)])), + }), + Stmt::Return(Some(index_get(2, int(0)))), + ], + ); + assert_typed_array_get_fallback_reason(&artifact, "escaping_unowned_pointer"); +} + +#[test] +fn native_owned_reflect_construct_escape_invalidates_views() { + let artifact = compile_artifact_json( + "artifact_native_owned_reflect_construct_escape.ts", + vec![ + native_arena_owner_let(1, "owner", int(64), false), + native_arena_view_let( + 2, + "view", + 1, + "Float64Array", + perry_hir::TYPED_ARRAY_KIND_FLOAT64, + int(0), + int(8), + ), + Stmt::Expr(Expr::ReflectConstruct { + target: Box::new(extern_func_ref( + "unknown_reflect_construct_escape", + Type::Any, + )), + args: Box::new(Expr::Array(vec![local(2)])), + new_target: Box::new(Expr::Undefined), + }), + Stmt::Return(Some(index_get(2, int(0)))), + ], + ); + assert_typed_array_get_fallback_reason(&artifact, "escaping_unowned_pointer"); +} + +#[test] +fn native_owned_js_call_value_escape_invalidates_views() { + let artifact = compile_artifact_json( + "artifact_native_owned_js_call_value_escape.ts", + vec![ + native_arena_owner_let(1, "owner", int(64), false), + native_arena_view_let( + 2, + "view", + 
1, + "Float64Array", + perry_hir::TYPED_ARRAY_KIND_FLOAT64, + int(0), + int(8), + ), + Stmt::Expr(Expr::JsCallValue { + callee: Box::new(extern_func_ref("unknown_js_call_value_escape", Type::Any)), + args: vec![Expr::Array(vec![local(2)])], + }), + Stmt::Return(Some(index_get(2, int(0)))), + ], + ); + assert_typed_array_get_fallback_reason(&artifact, "escaping_unowned_pointer"); +} + +#[test] +fn native_owned_static_method_v8_escape_invalidates_views() { + let mut opts = empty_opts(); + opts.namespace_imports.push("RemoteNs".to_string()); + opts.namespace_v8_specifiers + .insert("RemoteNs".to_string(), "remote:v8".to_string()); + let artifact = compile_artifact_json_for_module_with_opts( + module( + "artifact_native_owned_static_method_v8_escape.ts", + vec![ + native_arena_owner_let(1, "owner", int(64), false), + native_arena_view_let( + 2, + "view", + 1, + "Float64Array", + perry_hir::TYPED_ARRAY_KIND_FLOAT64, + int(0), + int(8), + ), + Stmt::Expr(Expr::StaticMethodCall { + class_name: "RemoteNs".to_string(), + method_name: "invoke".to_string(), + args: vec![Expr::Array(vec![local(2)])], + }), + Stmt::Return(Some(index_get(2, int(0)))), + ], + ), + opts, + ); + assert_typed_array_get_fallback_reason(&artifact, "escaping_unowned_pointer"); +} + #[test] fn native_owned_closure_capture_through_owner_alias_invalidates_views() { let artifact = compile_artifact_json( diff --git a/crates/perry-codegen/tests/native_proof_regressions.rs b/crates/perry-codegen/tests/native_proof_regressions.rs index 53b692ab1a..96db13caa1 100644 --- a/crates/perry-codegen/tests/native_proof_regressions.rs +++ b/crates/perry-codegen/tests/native_proof_regressions.rs @@ -622,7 +622,7 @@ fn artifact_schema_v6_records_consumed_native_facts_for_buffer_region() { ]; let artifact = compile_artifact_json("artifact_positive_buffer_region.ts", body); - assert_eq!(artifact["schema_version"], 11); + assert_eq!(artifact["schema_version"], 12); let records = artifact["records"].as_array().unwrap(); 
assert!( records.iter().any(|record| { @@ -655,7 +655,7 @@ fn artifact_schema_v6_records_rejected_facts_for_buffer_fallback() { ]; let artifact = compile_artifact_json("artifact_rejected_buffer_region.ts", body); - assert_eq!(artifact["schema_version"], 11); + assert_eq!(artifact["schema_version"], 12); let records = artifact["records"].as_array().unwrap(); assert!( records.iter().any(|record| { @@ -701,7 +701,7 @@ fn artifact_schema_v6_records_c_layout_pod_manifest() { ]; let artifact = compile_artifact_json("artifact_c_layout_pod_record.ts", body); - assert_eq!(artifact["schema_version"], 11); + assert_eq!(artifact["schema_version"], 12); assert_eq!(artifact["summary"]["pod_layout_count"], 1); assert_eq!(artifact["summary"]["pod_record_count"], 1); let layouts = artifact["pod_layouts"].as_array().unwrap(); @@ -1176,7 +1176,7 @@ fn artifact_schema_v6_records_pod_dynamic_write_fallback() { ]; let artifact = compile_artifact_json("artifact_c_layout_pod_dynamic_write.ts", body); - assert_eq!(artifact["schema_version"], 11); + assert_eq!(artifact["schema_version"], 12); assert!( artifact["records"] .as_array() @@ -1198,6 +1198,185 @@ fn artifact_schema_v6_records_pod_dynamic_write_fallback() { ); } +#[test] +fn pod_field_read_after_dynamic_materialization_uses_number_coerce() { + let packet_ty = pod_type(&[ + ("tag", Type::Named("PerryU32".to_string())), + ("gain", Type::Named("PerryF32".to_string())), + ]); + let body = vec![ + pod_let( + 1, + "packet", + packet_ty, + vec![("tag", int(7)), ("gain", number(1.5))], + ), + Stmt::Expr(Expr::PropertySet { + object: Box::new(local(1)), + property: "tag".to_string(), + value: Box::new(Expr::String("x".to_string())), + }), + Stmt::Return(Some(Expr::Binary { + op: BinaryOp::Sub, + left: Box::new(Expr::PropertyGet { + object: Box::new(local(1)), + property: "tag".to_string(), + }), + right: Box::new(int(1)), + })), + ]; + + let ir = compile_ir("pod_dynamic_materialized_read_coerce.ts", body); + assert!( + ir.contains("call 
double @js_number_coerce"), + "POD field reads after dynamic materialization must not feed boxed JSValue fallbacks into raw numeric arithmetic:\n{ir}" + ); +} + +#[test] +fn number_coerce_of_proven_numeric_loop_expression_skips_runtime_call() { + let body = vec![ + number_let(1, "sum", true, int(0)), + Stmt::For { + init: Some(Box::new(number_let(2, "i", true, int(0)))), + condition: Some(Expr::Compare { + op: CompareOp::Lt, + left: Box::new(local(2)), + right: Box::new(int(64)), + }), + update: Some(increment(2)), + body: vec![Stmt::Expr(Expr::LocalSet( + 1, + Box::new(add( + local(1), + Expr::NumberCoerce(Box::new(add(local(2), number(0.5)))), + )), + ))], + }, + Stmt::Return(Some(local(1))), + ]; + + let ir = compile_ir("number_coerce_numeric_loop_no_runtime_call.ts", body); + assert!( + !ir.contains("call double @js_number_coerce"), + "Number(i + 0.5) with a proven integer loop counter is already a primitive number:\n{ir}" + ); +} + +#[test] +fn number_coerce_of_numeric_array_fallback_keeps_runtime_call() { + let module = module_with_classes_and_params( + "number_coerce_numeric_array_fallback.ts", + Vec::new(), + vec![param(1, "values", Type::Array(Box::new(Type::Number)))], + Type::Number, + vec![Stmt::Return(Some(Expr::NumberCoerce(Box::new( + Expr::IndexGet { + object: Box::new(local(1)), + index: Box::new(int(0)), + }, + ))))], + ); + + let ir = compile_ir_for_module_with_opts(module, empty_opts()).unwrap(); + assert!( + ir.contains("call double @js_number_coerce"), + "Number(values[0]) must still coerce boxed numeric-array fallback values:\n{ir}" + ); +} + +#[test] +fn typed_array_f64_store_coerces_raw_numeric_array_fallback_value() { + let module = module_with_classes_and_params( + "typed_array_f64_store_coerces_numeric_array_fallback.ts", + Vec::new(), + vec![param(3, "values", Type::Array(Box::new(Type::Number)))], + Type::Number, + vec![ + native_arena_owner_let(1, "arena", int(64), false), + native_arena_view_let( + 2, + "out", + 1, + "Float64Array", 
+ perry_hir::TYPED_ARRAY_KIND_FLOAT64, + int(0), + int(8), + ), + Stmt::Expr(Expr::IndexSet { + object: Box::new(local(2)), + index: Box::new(int(0)), + value: Box::new(Expr::IndexGet { + object: Box::new(local(3)), + index: Box::new(int(0)), + }), + }), + Stmt::Return(Some(int(0))), + ], + ); + let ir = compile_ir_for_module_with_opts(module, empty_opts()).unwrap(); + assert!( + ir.contains("call double @js_number_coerce"), + "Float64Array native stores must coerce guarded numeric-array fallback values before raw storage:\n{ir}" + ); + assert!( + ir.contains("store double"), + "test must exercise the raw Float64Array store path:\n{ir}" + ); +} + +#[test] +fn scalar_replaced_raw_f64_field_store_keeps_numeric_array_fallback_boxed() { + let mut properties = std::collections::HashMap::new(); + properties.insert("gain".to_string(), prop(Type::Number)); + let packet_ty = Type::Object(ObjectType { + name: None, + properties, + property_order: Some(vec!["gain".to_string()]), + index_signature: None, + }); + let module = module_with_classes_and_params( + "scalar_field_store_keeps_numeric_array_fallback_boxed.ts", + Vec::new(), + vec![param(3, "values", Type::Array(Box::new(Type::Number)))], + Type::Number, + vec![ + Stmt::Let { + id: 2, + name: "packet".to_string(), + ty: packet_ty, + mutable: true, + init: Some(Expr::Object( + vec![("gain".to_string(), number(0.0))] + .into_iter() + .collect(), + )), + }, + Stmt::Expr(Expr::PropertySet { + object: Box::new(local(2)), + property: "gain".to_string(), + value: Box::new(Expr::IndexGet { + object: Box::new(local(3)), + index: Box::new(int(0)), + }), + }), + Stmt::Return(Some(Expr::PropertyGet { + object: Box::new(local(2)), + property: "gain".to_string(), + })), + ], + ); + let ir = compile_ir_for_module_with_opts(module, empty_opts()).unwrap(); + assert!( + ir.contains("call double @js_typed_feedback_array_index_get_fallback_boxed"), + "test must exercise a numeric-array get with a boxed fallback arm:\n{ir}" + ); + assert!( + 
!ir.contains("call double @js_array_numeric_value_to_raw_f64"), + "scalar raw-f64 fields must not canonicalize a possibly boxed fallback value into raw storage:\n{ir}" + ); +} + #[test] fn artifact_schema_v8_rejects_inexact_pod_initializer_values() { let packet_ty = pod_type(&[ @@ -1223,7 +1402,7 @@ fn artifact_schema_v8_rejects_inexact_pod_initializer_values() { ]; let artifact = compile_artifact_json("artifact_c_layout_pod_init_reject.ts", body); - assert_eq!(artifact["schema_version"], 11); + assert_eq!(artifact["schema_version"], 12); assert_eq!(artifact["summary"]["pod_layout_count"], 0); assert_eq!(artifact["summary"]["pod_record_count"], 0); assert!(artifact["pod_layouts"].as_array().unwrap().is_empty()); @@ -1274,7 +1453,7 @@ fn artifact_schema_v6_records_pod_pointerful_field_rejection() { ]; let artifact = compile_artifact_json("artifact_c_layout_pod_reject.ts", body); - assert_eq!(artifact["schema_version"], 11); + assert_eq!(artifact["schema_version"], 12); assert_eq!(artifact["summary"]["pod_layout_count"], 0); assert!(artifact["pod_layouts"].as_array().unwrap().is_empty()); assert!( @@ -1902,6 +2081,58 @@ fn artifact_records_numeric_array_f64_fast_paths_and_fallback_reasons() { ); } +#[test] +fn artifact_records_write_barrier_child_js_value_bits() { + let module = module_with_classes_and_params( + "artifact_write_barrier_js_value_bits.ts", + Vec::new(), + vec![ + param(1, "xs", Type::Array(Box::new(Type::Any))), + param(2, "key", Type::String), + param(3, "value", Type::Any), + ], + Type::Number, + vec![ + Stmt::Expr(Expr::IndexSet { + object: Box::new(local(1)), + index: Box::new(local(2)), + value: Box::new(local(3)), + }), + Stmt::Return(Some(int(0))), + ], + ); + + let artifact = compile_artifact_json_for_module(module); + let records = artifact["records"].as_array().unwrap(); + assert!( + records.iter().any(|record| { + record["expr_kind"] == "WriteBarrier" + && record["consumer"] == "write_barrier.child_bits" + && record["native_rep_name"] == 
"js_value_bits" + && record["native_value_state"] == "region_local" + && record["access_mode"].is_null() + && record["native_abi_type"].is_null() + }), + "expected production write-barrier js_value_bits record:\n{artifact:#}" + ); + assert!( + records.iter().any(|record| { + record["consumer"] == "lower_expr_native_js_value_bits" + && record["native_rep_name"] == "js_value_bits" + && record["llvm_ty"] == "i64" + && record["native_abi_type"].is_null() + }), + "expected production js_value_bits selector record:\n{artifact:#}" + ); + assert!( + artifact["summary"]["js_value_bits_count"] + .as_u64() + .unwrap_or(0) + >= 1, + "expected js_value_bits summary count:\n{artifact:#}" + ); +} + #[test] fn artifact_records_raw_numeric_class_field_f64_fast_paths_and_fallback_reasons() { let point = class(101, "Point", vec![class_field("x", Type::Number)]); diff --git a/crates/perry-codegen/tests/typed_feedback.rs b/crates/perry-codegen/tests/typed_feedback.rs index bda7faff29..ea9100e91f 100644 --- a/crates/perry-codegen/tests/typed_feedback.rs +++ b/crates/perry-codegen/tests/typed_feedback.rs @@ -283,9 +283,13 @@ fn typed_feedback_guards_direct_class_field_specialization() { property: "x".to_string(), value: Box::new(Expr::Number(7.0)), }), - Stmt::Return(Some(Expr::PropertyGet { - object: Box::new(Expr::LocalGet(1)), - property: "x".to_string(), + Stmt::Return(Some(Expr::Binary { + op: BinaryOp::Sub, + left: Box::new(Expr::PropertyGet { + object: Box::new(Expr::LocalGet(1)), + property: "x".to_string(), + }), + right: Box::new(Expr::Integer(1)), })), ], )); @@ -310,6 +314,10 @@ fn typed_feedback_guards_direct_class_field_specialization() { // js_class_field_set_fallback). 
assert!(ir.contains("call void @js_typed_feedback_record_fallback_call")); assert!(ir.contains("call double @js_object_get_field_by_name_f64")); + assert!( + ir.contains("call double @js_number_coerce"), + "class-field raw fallback must be coerced at numeric consumers:\n{ir}" + ); } #[test] @@ -584,8 +592,10 @@ fn typed_feedback_guards_array_index_specialization() { assert!(ir.contains("js_typed_feedback_array_index_set_fallback_boxed")); assert!(ir.contains("js_typed_feedback_numeric_array_index_get_guard")); assert!(ir.contains("js_typed_feedback_array_index_get_fallback_boxed")); - assert!(ir.contains("js_array_numeric_set_f64_unboxed")); - assert!(ir.contains("js_array_numeric_get_f64_unboxed")); + assert!(ir.contains("idxset.inbounds")); + assert!(ir.contains("store double")); + assert!(!ir.contains("call i32 @js_array_numeric_set_f64_unboxed")); + assert!(!ir.contains("call double @js_array_numeric_get_f64_unboxed")); } #[test] diff --git a/crates/perry-codegen/tests/typed_shape_descriptors.rs b/crates/perry-codegen/tests/typed_shape_descriptors.rs index 1b86972f18..98518e111c 100644 --- a/crates/perry-codegen/tests/typed_shape_descriptors.rs +++ b/crates/perry-codegen/tests/typed_shape_descriptors.rs @@ -444,8 +444,12 @@ fn bounded_integer_array_store_omits_layout_note_and_barrier() { let ir = ir_for(module); assert!( - ir.contains("call i32 @js_array_numeric_set_f64_unboxed"), - "bounded numeric array store should route through the raw-f64 payload helper" + ir.contains("idxset.bounded_numeric_fast") && ir.contains("store double"), + "bounded numeric array store should inline the guarded raw-f64 payload store" + ); + assert!( + !ir.contains("call i32 @js_array_numeric_set_f64_unboxed"), + "bounded numeric array store should not call the redundant raw-f64 set helper" ); assert!( ir.contains("call i32 @js_typed_feedback_numeric_array_index_set_guard"), diff --git a/crates/perry-runtime/src/array/header.rs b/crates/perry-runtime/src/array/header.rs index 
4d47ba9b08..402629e526 100644 --- a/crates/perry-runtime/src/array/header.rs +++ b/crates/perry-runtime/src/array/header.rs @@ -739,6 +739,11 @@ pub(crate) fn value_bits_to_number(value_bits: u64) -> Option<f64> { Some(canonical_raw_f64(f64::from_bits(value_bits))) } +#[no_mangle] +pub extern "C" fn js_array_numeric_value_to_raw_f64(value: f64) -> f64 { + value_bits_to_number(value.to_bits()).unwrap_or(f64::NAN) +} + #[inline] fn canonical_raw_f64(value: f64) -> f64 { if value.is_nan() { diff --git a/crates/perry-runtime/src/typed_feedback.rs b/crates/perry-runtime/src/typed_feedback.rs index bba2164b4f..175080bbf9 100644 --- a/crates/perry-runtime/src/typed_feedback.rs +++ b/crates/perry-runtime/src/typed_feedback.rs @@ -1027,10 +1027,7 @@ fn is_plain_number_bits(bits: u64) -> bool { } fn is_numeric_value_bits(bits: u64) -> bool { - matches!( - stable_value_kind(bits), - STABLE_VALUE_NUMBER | STABLE_VALUE_INT32 - ) + crate::array::value_bits_to_number(bits).is_some() } fn gc_header_for_user_addr(addr: usize) -> Option<*const crate::gc::GcHeader> { @@ -1089,6 +1086,42 @@ fn numeric_array_index_guard(arr: *const ArrayHeader, index: u32, require_in_bou && crate::array::js_array_is_numeric_f64_layout(arr) != 0 } +fn plain_array_index_set_guard( + arr: *const ArrayHeader, + index: u32, + require_in_bounds: bool, +) -> bool { + if !plain_array_index_guard(arr, index, require_in_bounds) { + return false; + } + let raw_addr = normalize_raw_object_addr(arr as u64); + let Some(header) = gc_header_for_user_addr(raw_addr) else { + return false; + }; + unsafe { + let flags = (*header)._reserved; + if flags & crate::gc::OBJ_FLAG_FROZEN != 0 { + return false; + } + let arr = raw_addr as *const ArrayHeader; + if index >= (*arr).length + && flags & (crate::gc::OBJ_FLAG_SEALED | crate::gc::OBJ_FLAG_NO_EXTEND) != 0 + { + return false; + } + } + true +} + +fn numeric_array_index_set_guard( + arr: *const ArrayHeader, + index: u32, + require_in_bounds: bool, +) -> bool { + 
plain_array_index_set_guard(arr, index, require_in_bounds) + && crate::array::js_array_is_numeric_f64_layout(arr) != 0 +} + fn numeric_array_push_guard(arr: *const ArrayHeader, value: f64) -> bool { let raw_addr = normalize_raw_object_addr(arr as u64); let Some(header) = gc_header_for_user_addr(raw_addr) else { @@ -1479,7 +1512,7 @@ pub extern "C" fn js_typed_feedback_plain_array_index_set_guard( let raw_addr = normalize_raw_object_addr(receiver.to_bits()); if !typed_feedback_enabled() { return (index >= 0 - && plain_array_index_guard( + && plain_array_index_set_guard( raw_addr as *const ArrayHeader, index as u32, require_in_bounds != 0, @@ -1498,7 +1531,7 @@ pub extern "C" fn js_typed_feedback_plain_array_index_set_guard( value_tag: stable_value_kind(value.to_bits()), }; let contract_valid = index >= 0 - && plain_array_index_guard( + && plain_array_index_set_guard( raw_addr as *const ArrayHeader, index as u32, require_in_bounds != 0, @@ -1528,7 +1561,7 @@ pub extern "C" fn js_typed_feedback_numeric_array_index_set_guard( if !typed_feedback_enabled() { return (index >= 0 && is_numeric_value_bits(value.to_bits()) - && numeric_array_index_guard( + && numeric_array_index_set_guard( raw_addr as *const ArrayHeader, index as u32, require_in_bounds != 0, @@ -1548,7 +1581,7 @@ pub extern "C" fn js_typed_feedback_numeric_array_index_set_guard( }; let contract_valid = index >= 0 && is_numeric_value_bits(value.to_bits()) - && numeric_array_index_guard( + && numeric_array_index_set_guard( raw_addr as *const ArrayHeader, index as u32, require_in_bounds != 0, @@ -1607,7 +1640,7 @@ pub extern "C" fn js_typed_feedback_numeric_array_push_guard( pub extern "C" fn js_typed_feedback_array_index_set_fallback_boxed( site_id: u64, receiver: f64, - index: i32, + index: f64, value: f64, ) -> f64 { record_fallback_call(site_id); @@ -1617,15 +1650,10 @@ pub extern "C" fn js_typed_feedback_array_index_set_fallback_boxed( return receiver; } - let index_value = index as f64; if 
crate::buffer::is_registered_buffer(raw_addr) || crate::typedarray::lookup_typed_array_kind(raw_addr).is_some() { - crate::array::js_array_set_index_or_string( - raw_addr as *mut ArrayHeader, - index_value, - value, - ); + crate::array::js_array_set_index_or_string(raw_addr as *mut ArrayHeader, index, value); return receiver; } @@ -1640,19 +1668,20 @@ pub extern "C" fn js_typed_feedback_array_index_set_fallback_boxed( crate::gc::GC_TYPE_ARRAY | crate::gc::GC_TYPE_LAZY_ARRAY => { let new_arr = crate::array::js_array_set_index_or_string( raw_addr as *mut ArrayHeader, - index_value, + index, value, ); crate::value::js_nanbox_pointer(new_arr as i64) } crate::gc::GC_TYPE_OBJECT | crate::gc::GC_TYPE_CLOSURE => { - let key = index.to_string(); - let key_ptr = crate::string::js_string_from_bytes(key.as_ptr(), key.len() as u32); - crate::object::js_object_set_field_by_name( - raw_addr as *mut ObjectHeader, - key_ptr, - value, - ); + let key_ptr = crate::value::js_jsvalue_to_string(index); + if !key_ptr.is_null() { + crate::object::js_object_set_field_by_name( + raw_addr as *mut ObjectHeader, + key_ptr, + value, + ); + } receiver } _ => receiver, diff --git a/crates/perry-runtime/src/typed_feedback/tests.rs b/crates/perry-runtime/src/typed_feedback/tests.rs index 821b9aae84..42976a5070 100644 --- a/crates/perry-runtime/src/typed_feedback/tests.rs +++ b/crates/perry-runtime/src/typed_feedback/tests.rs @@ -465,7 +465,7 @@ fn typed_feedback_non_bounded_array_set_guard_failure_uses_jsvalue_object_fallba let guard = js_typed_feedback_plain_array_index_set_guard(24, obj_box, 0, 99.0, 0); assert_eq!(guard, 0); - let returned = js_typed_feedback_array_index_set_fallback_boxed(24, obj_box, 0, 99.0); + let returned = js_typed_feedback_array_index_set_fallback_boxed(24, obj_box, 0.0, 99.0); assert_eq!(returned.to_bits(), obj_box.to_bits()); let key = crate::string::js_string_from_bytes(b"0".as_ptr(), 1); @@ -478,6 +478,64 @@ fn 
typed_feedback_non_bounded_array_set_guard_failure_uses_jsvalue_object_fallba assert_eq!(site.fallback_calls, 1); } +#[test] +fn typed_feedback_array_set_guards_reject_frozen_arrays() { + let _guard = TYPED_FEEDBACK_TEST_LOCK.lock().unwrap(); + reset_typed_feedback_for_tests(); + register(70, TypedFeedbackSiteKind::ArrayElement, "arr[i]="); + register(71, TypedFeedbackSiteKind::ArrayElement, "arr[i]="); + + let values = [1.0, 2.0]; + let arr = crate::array::js_array_from_f64(values.as_ptr(), values.len() as u32); + let arr_box = crate::value::js_nanbox_pointer(arr as i64); + crate::object::js_object_freeze(arr_box); + + assert_eq!( + js_typed_feedback_plain_array_index_set_guard(70, arr_box, 0, 99.0, 1), + 0 + ); + assert_eq!( + js_typed_feedback_numeric_array_index_set_guard(71, arr_box, 0, 99.0, 1), + 0 + ); + + let returned = js_typed_feedback_array_index_set_fallback_boxed(70, arr_box, 0.0, 99.0); + assert_eq!(returned.to_bits(), arr_box.to_bits()); + assert_eq!( + crate::array::js_array_get_f64(arr, 0).to_bits(), + 1.0f64.to_bits() + ); + + let snapshot = typed_feedback_snapshot(); + assert_eq!(snapshot.sites[0].guard_failures, 1); + assert_eq!(snapshot.sites[1].guard_failures, 1); +} + +#[test] +fn typed_feedback_array_set_boxed_fallback_preserves_original_index_value() { + let _guard = TYPED_FEEDBACK_TEST_LOCK.lock().unwrap(); + reset_typed_feedback_for_tests(); + register(72, TypedFeedbackSiteKind::ArrayElement, "arr[i]="); + + let obj = crate::object::js_object_alloc(0, 0); + let obj_box = f64::from_bits(crate::value::JSValue::pointer(obj as *const u8).bits()); + let key = crate::string::js_string_from_bytes(b"foo".as_ptr(), 3); + let key_value = crate::value::js_nanbox_string(key as i64); + + let returned = js_typed_feedback_array_index_set_fallback_boxed(72, obj_box, key_value, 77.0); + assert_eq!(returned.to_bits(), obj_box.to_bits()); + assert_eq!( + crate::object::js_object_get_field_by_name_f64(obj, key).to_bits(), + 77.0f64.to_bits() + ); + + let 
zero_key = crate::string::js_string_from_bytes(b"0".as_ptr(), 1); + assert_eq!( + crate::object::js_object_get_field_by_name_f64(obj, zero_key).to_bits(), + crate::value::TAG_UNDEFINED + ); +} + #[test] fn typed_feedback_numeric_array_get_guard_requires_numeric_layout() { let _guard = TYPED_FEEDBACK_TEST_LOCK.lock().unwrap(); @@ -534,6 +592,50 @@ fn typed_feedback_numeric_array_set_guard_requires_numeric_value_and_layout() { assert_eq!(site.fallback_calls, 0); } +#[test] +fn typed_feedback_numeric_array_guards_reject_registered_class_ref_bits() { + let _guard = TYPED_FEEDBACK_TEST_LOCK.lock().unwrap(); + reset_typed_feedback_for_tests(); + register(68, TypedFeedbackSiteKind::ArrayElement, "arr[i]="); + register(69, TypedFeedbackSiteKind::ArrayElement, "arr.push"); + + let class_id = 0x00C0_DE01; + unsafe { + crate::object::js_register_class_id(class_id); + } + let class_ref = f64::from_bits(crate::value::INT32_TAG | class_id as u64); + + let values = [1.0, 2.0]; + let arr = crate::array::js_array_from_f64(values.as_ptr(), values.len() as u32); + let arr_box = crate::value::js_nanbox_pointer(arr as i64); + + assert_eq!( + js_typed_feedback_numeric_array_index_set_guard(68, arr_box, 1, class_ref, 1), + 0 + ); + assert_eq!( + js_typed_feedback_numeric_array_push_guard(69, arr_box, class_ref), + 0 + ); + assert_eq!(crate::array::js_array_is_numeric_f64_layout(arr), 1); + + let snapshot = typed_feedback_snapshot(); + let set_site = snapshot + .sites + .iter() + .find(|site| site.site_id == 68) + .expect("set site"); + assert_eq!(set_site.guard_passes, 0); + assert_eq!(set_site.guard_failures, 1); + let push_site = snapshot + .sites + .iter() + .find(|site| site.site_id == 69) + .expect("push site"); + assert_eq!(push_site.guard_passes, 0); + assert_eq!(push_site.guard_failures, 1); +} + #[test] fn typed_feedback_numeric_array_push_guard_requires_room_numeric_value_and_layout() { let _guard = TYPED_FEEDBACK_TEST_LOCK.lock().unwrap(); diff --git 
a/crates/perry-runtime/src/typed_feedback/trace.rs b/crates/perry-runtime/src/typed_feedback/trace.rs index c67acbfb4f..4622163afe 100644 --- a/crates/perry-runtime/src/typed_feedback/trace.rs +++ b/crates/perry-runtime/src/typed_feedback/trace.rs @@ -380,7 +380,7 @@ mod keep_typed_feedback { #[used] static K13: extern "C" fn(u64, *mut ArrayHeader, u32, f64) = js_typed_feedback_array_set_f64; #[used] static K14: extern "C" fn(u64, *mut ArrayHeader, u32, f64) -> *mut ArrayHeader = js_typed_feedback_array_set_f64_extend; #[used] static K15: extern "C" fn(u64, f64, i32, f64, i32) -> i32 = js_typed_feedback_plain_array_index_set_guard; - #[used] static K16: extern "C" fn(u64, f64, i32, f64) -> f64 = js_typed_feedback_array_index_set_fallback_boxed; + #[used] static K16: extern "C" fn(u64, f64, f64, f64) -> f64 = js_typed_feedback_array_index_set_fallback_boxed; #[used] static K17: extern "C" fn(u64, *const ArrayHeader, u32) = js_typed_feedback_observe_array_element; #[used] static K18: extern "C" fn(u64, *mut ArrayHeader, *const crate::StringHeader, f64) -> *mut ArrayHeader = js_typed_feedback_array_set_string_key; #[used] static K19: extern "C" fn(u64, *mut ArrayHeader, f64, f64) -> *mut ArrayHeader = js_typed_feedback_array_set_index_or_string; diff --git a/scripts/check_file_size.sh b/scripts/check_file_size.sh index caa3835622..ee267dd5f4 100755 --- a/scripts/check_file_size.sh +++ b/scripts/check_file_size.sh @@ -304,6 +304,14 @@ crates/perry-ext-http-server/src/http2_server.rs # on/once iterator machinery into the existing `events/` submodule is tracked # under #1435 with the other module-size cleanups. crates/perry-stdlib/src/events.rs +# Representation-aware type-lowering work (#5291). These crossed the 2000-line +# gate as the raw-numeric fallback hardening + native-ABI hot-loop runtime gates +# expanded the type-analysis surface and its native-region proof tests. 
Splitting +# the per-concern analysis/verify helpers into sibling modules is tracked under +# #1435 with the other codegen file-size cleanups. +crates/perry-codegen/src/type_analysis.rs +crates/perry-codegen/src/native_value/verify.rs +crates/perry-codegen/tests/native_proof_regressions.rs EOF ) diff --git a/scripts/compiler_output_harness/capture.py b/scripts/compiler_output_harness/capture.py index 1534eab769..05c8102084 100644 --- a/scripts/compiler_output_harness/capture.py +++ b/scripts/compiler_output_harness/capture.py @@ -540,7 +540,10 @@ def verify_existing(args: argparse.Namespace) -> int: ir_after = after.read_text(encoding="utf-8") assembly = asm.read_text(encoding="utf-8") counters = structural_counters(ir_before, ir_after, assembly) - runtime_summary = runtime_counter_summary(None, counters) + benchmark = ( + manifest.get("benchmark") if isinstance(manifest.get("benchmark"), dict) else None + ) + runtime_summary = runtime_counter_summary(benchmark, counters) target = ( args.target or compile_plan.get("effective_target") @@ -553,7 +556,7 @@ def verify_existing(args: argparse.Namespace) -> int: ir_before=ir_before, ir_after=ir_after, assembly=assembly, - benchmark=None, + benchmark=benchmark, vectorization=vectorization, counters=counters, runtime_summary=runtime_summary, diff --git a/scripts/compiler_output_harness/spec.py b/scripts/compiler_output_harness/spec.py index ca6ae9fc71..ec8873ac77 100644 --- a/scripts/compiler_output_harness/spec.py +++ b/scripts/compiler_output_harness/spec.py @@ -53,6 +53,22 @@ def validate_workload_spec(data: dict[str, Any]) -> None: ) if not isinstance(workload.get("runtime_budgets"), dict): raise HarnessError(f"workload {name!r} runtime_budgets must be a table") + stdout_checks = workload.get("stdout_checks", []) + if not isinstance(stdout_checks, list): + raise HarnessError(f"workload {name!r} stdout_checks must be a list") + for check in stdout_checks: + if not isinstance(check, dict) or not check.get("name"): + 
raise HarnessError(f"workload {name!r} stdout_checks need names") + if any(key in check for key in ("contains", "contains_all", "contains_any")): + raise HarnessError( + f"workload {name!r} stdout check {check['name']!r} must not use " + "substring matching" + ) + if "equals" not in check and "line_equals" not in check: + raise HarnessError( + f"workload {name!r} stdout check {check['name']!r} must use " + "equals or line_equals" + ) native_rep_checks = workload.get("native_rep_checks") if native_rep_checks is not None: if not isinstance(native_rep_checks, dict): diff --git a/scripts/compiler_output_harness/verification.py b/scripts/compiler_output_harness/verification.py index 30a15416ac..28881fd24e 100644 --- a/scripts/compiler_output_harness/verification.py +++ b/scripts/compiler_output_harness/verification.py @@ -2,6 +2,7 @@ import json import re +from pathlib import Path from typing import Any from .analyzers import ( @@ -186,6 +187,10 @@ def _text_check_passes(text: str, check: dict[str, Any]) -> bool: if not function_text: return False text = function_text + if "equals" in check and text != str(check["equals"]): + return False + if "line_equals" in check and str(check["line_equals"]) not in text.splitlines(): + return False if "contains" in check and check["contains"] not in text: return False if "contains_all" in check and not all(part in text for part in check["contains_all"]): @@ -205,6 +210,20 @@ def _text_check_passes(text: str, check: dict[str, Any]) -> bool: return True +def _benchmark_run_stdout(run: dict[str, Any]) -> str: + stdout_path = run.get("stdout_path") + if stdout_path: + try: + return Path(stdout_path).read_text(encoding="utf-8") + except OSError: + pass + first = str(run.get("stdout_first") or "") + last = str(run.get("stdout_last") or "") + if last and last != first: + return first + last + return first + + def _function_text_containing(text: str, fragment: str) -> str: matches: list[str] = [] current: list[str] | None = None @@ -267,6 
+286,10 @@ def _access_mode_name(value: Any) -> str: return _state_name(value) +def _field_name(value: Any) -> str: + return _state_name(value) or str(value or "") + + def _is_unchecked_native_unknown_bounds(record: dict[str, Any]) -> bool: return ( _access_mode_name(record.get("access_mode")) == "unchecked_native" @@ -331,6 +354,24 @@ def _fact_matches(fact: Any, *, kind: Any = None, state: Any = None) -> bool: return True +def _fact_matches_spec(fact: Any, spec: dict[str, Any], prefix: str) -> bool: + if not _fact_matches( + fact, + kind=spec.get(f"{prefix}_fact_kind"), + state=spec.get(f"{prefix}_fact_state"), + ): + return False + reason = spec.get(f"{prefix}_fact_reason") + if reason is not None and _field_name(fact.get("reason")) != str(reason): + return False + fact_id_contains = spec.get(f"{prefix}_fact_id_contains") + if fact_id_contains is not None and str(fact_id_contains) not in str( + fact.get("fact_id") or "" + ): + return False + return True + + def _record_has_fact( record: dict[str, Any], field: str, @@ -354,9 +395,11 @@ def _record_matches_required(record: dict[str, Any], spec: dict[str, Any]) -> bo "block_label", "function", "materialization_reason", + "fallback_reason", + "native_value_state", ) for field in exact_fields: - if field in spec and str(record.get(field) or "") != str(spec[field]): + if field in spec and _field_name(record.get(field)) != str(spec[field]): return False contains_fields = ( ("consumer_contains", "consumer"), @@ -388,30 +431,32 @@ def _record_matches_required(record: dict[str, Any], spec: dict[str, Any]) -> bo record.get("alias_state"), spec["alias_state"], state_kind="alias" ): return False - if "consumed_fact_kind" in spec and not _record_has_fact( - record, - "consumed_facts", - kind=spec.get("consumed_fact_kind"), - state=spec.get("consumed_fact_state"), + if "consumed_fact_kind" in spec and not any( + _fact_matches_spec(fact, spec, "consumed") + for fact in record.get("consumed_facts", []) or [] ): return False - if 
"consumed_fact_state" in spec and "consumed_fact_kind" not in spec and not _record_has_fact( - record, - "consumed_facts", - state=spec.get("consumed_fact_state"), + if ( + "consumed_fact_state" in spec + and "consumed_fact_kind" not in spec + and not any( + _fact_matches_spec(fact, spec, "consumed") + for fact in record.get("consumed_facts", []) or [] + ) ): return False - if "rejected_fact_kind" in spec and not _record_has_fact( - record, - "rejected_facts", - kind=spec.get("rejected_fact_kind"), - state=spec.get("rejected_fact_state"), + if "rejected_fact_kind" in spec and not any( + _fact_matches_spec(fact, spec, "rejected") + for fact in record.get("rejected_facts", []) or [] ): return False - if "rejected_fact_state" in spec and "rejected_fact_kind" not in spec and not _record_has_fact( - record, - "rejected_facts", - state=spec.get("rejected_fact_state"), + if ( + "rejected_fact_state" in spec + and "rejected_fact_kind" not in spec + and not any( + _fact_matches_spec(fact, spec, "rejected") + for fact in record.get("rejected_facts", []) or [] + ) ): return False return True @@ -463,11 +508,22 @@ def add(name: str, passed: bool, detail: str) -> None: r for r in records if r.get("materialization_reason") - and ( - _state_name(r.get("materialization_reason")) - or str(r.get("materialization_reason") or "") - ) - not in allowed_reasons + and _field_name(r.get("materialization_reason")) not in allowed_reasons + ] + dynamic_fallbacks = [r for r in records if _is_dynamic_fallback(r)] + missing_fallback_reason = [ + r + for r in dynamic_fallbacks + if not _field_name(r.get("fallback_reason")) + or not _field_name(r.get("materialization_reason")) + ] + mismatched_fallback_reason = [ + r + for r in dynamic_fallbacks + if _field_name(r.get("fallback_reason")) + and _field_name(r.get("materialization_reason")) + and _field_name(r.get("fallback_reason")) + != _field_name(r.get("materialization_reason")) ] add( @@ -503,6 +559,16 @@ def add(name: str, passed: bool, detail: 
str) -> None: + " unexpected=" + json.dumps(unexpected_materializations[:5], sort_keys=True), ) + add( + "native_reps_dynamic_fallbacks_have_reasons", + not missing_fallback_reason, + json.dumps(missing_fallback_reason[:5], sort_keys=True), + ) + add( + "native_reps_dynamic_fallback_reasons_match_materialization", + not mismatched_fallback_reason, + json.dumps(mismatched_fallback_reason[:5], sort_keys=True), + ) for required in check_spec.get("require_records", []) or []: matches = [r for r in records if _record_matches_required(r, required)] @@ -612,6 +678,32 @@ def records_for_native_region(region: str) -> list[dict[str, Any]]: bool(bounded), f"{region} bounded_records={len(bounded)}", ) + consumed_rep_names = { + r.get("native_rep_name") + for r in region_records + if r.get("native_rep_name") in {"i32", "buffer_view", "u8"} + and _record_has_fact( + r, "consumed_facts", kind="representation", state="consumed" + ) + } + add( + f"native_reps_{region}_consumes_representation_facts", + {"i32", "buffer_view", "u8"}.issubset(consumed_rep_names), + f"{region} consumed_rep_names={sorted(consumed_rep_names)}", + ) + consumed_bounds = [ + r + for r in region_records + if r.get("native_rep_name") in {"buffer_view", "u8"} + and _record_has_fact( + r, "consumed_facts", kind="bounds", state="consumed" + ) + ] + add( + f"native_reps_{region}_consumes_bounds_facts", + bool(consumed_bounds), + f"{region} consumed_bounds_records={len(consumed_bounds)}", + ) same_region_records = records_for_native_region("same_buffer") if not same_region_records: @@ -630,6 +722,20 @@ def records_for_native_region(region: str) -> list[dict[str, Any]]: ] same_reps = {r.get("native_rep_name") for r in same_records} same_noalias = [r for r in same_region_records if r.get("emitted_noalias")] + same_consumed_reps = { + r.get("native_rep_name") + for r in same_region_records + if r.get("native_rep_name") in {"buffer_view", "u8"} + and _record_has_fact( + r, "consumed_facts", kind="representation", 
state="consumed" + ) + } + same_consumed_bounds = [ + r + for r in same_region_records + if r.get("native_rep_name") in {"buffer_view", "u8"} + and _record_has_fact(r, "consumed_facts", kind="bounds", state="consumed") + ] add( "native_reps_same_buffer_has_raw_buffer_view", "buffer_view" in same_reps and "u8" in same_reps, @@ -640,6 +746,16 @@ def records_for_native_region(region: str) -> list[dict[str, Any]]: not same_noalias, json.dumps(same_noalias[:5], sort_keys=True), ) + add( + "native_reps_same_buffer_consumes_representation_facts", + {"buffer_view", "u8"}.issubset(same_consumed_reps), + f"same_buffer consumed_rep_names={sorted(same_consumed_reps)}", + ) + add( + "native_reps_same_buffer_consumes_bounds_facts", + bool(same_consumed_bounds), + f"same_buffer consumed_bounds_records={len(same_consumed_bounds)}", + ) if workload == "h1_buffer_alias_negative": def records_in_function(fragment: str) -> list[dict[str, Any]]: @@ -713,6 +829,33 @@ def fallback_buffer_access(rows: list[dict[str, Any]]) -> list[dict[str, Any]]: for r in records if r.get("materialization_reason") } + dynamic_fallback_records = [r for r in records if _is_dynamic_fallback(r)] + dynamic_fallbacks_missing_reason = [ + r + for r in dynamic_fallback_records + if not _field_name(r.get("fallback_reason")) + or not _field_name(r.get("materialization_reason")) + ] + dynamic_fallbacks_missing_rejection = [ + r + for r in dynamic_fallback_records + if not ( + _record_has_fact(r, "rejected_facts", kind="bounds", state="missing") + or _record_has_fact( + r, "rejected_facts", kind="alias_noalias", state="missing" + ) + ) + ] + dynamic_fallbacks_missing_invalidation = [ + r + for r in dynamic_fallback_records + if not _record_has_fact( + r, + "rejected_facts", + kind="materialization_hazard", + state="invalidated", + ) + ] add( "native_reps_negative_denies_unsafe_noalias", bool(denied_noalias), @@ -728,6 +871,21 @@ def fallback_buffer_access(rows: list[dict[str, Any]]) -> list[dict[str, Any]]: 
bool(reasons), f"materialization_reasons={sorted(reasons)}", ) + add( + "native_reps_negative_dynamic_fallbacks_have_reasons", + not dynamic_fallbacks_missing_reason, + json.dumps(dynamic_fallbacks_missing_reason[:5], sort_keys=True), + ) + add( + "native_reps_negative_dynamic_fallbacks_reject_guards", + not dynamic_fallbacks_missing_rejection, + json.dumps(dynamic_fallbacks_missing_rejection[:5], sort_keys=True), + ) + add( + "native_reps_negative_dynamic_fallbacks_invalidate_hazards", + not dynamic_fallbacks_missing_invalidation, + json.dumps(dynamic_fallbacks_missing_invalidation[:5], sort_keys=True), + ) alias_local_records = records_for_native_region("alias_local") reassignment_records = records_for_native_region("reassignment_region") unknown_call_escape_records = records_for_native_region("unknown_call_escape") @@ -1058,19 +1216,38 @@ def add(name: str, passed: bool, detail: str, severity: str = "error") -> None: ) if benchmark is not None: + benchmark_runs = list(benchmark.get("runs", []) or []) add( "benchmark_exit_zero", - all(run.get("exit_code") == 0 for run in benchmark.get("runs", [])), + bool(benchmark_runs) + and all(run.get("exit_code") == 0 for run in benchmark_runs), "all benchmark runs exited zero", ) - benchmark_stdout = "\n".join( - str(run.get("stdout_first") or "") for run in benchmark.get("runs", []) - ) - for check in workload_info.get("stdout_checks", []) or []: + else: + benchmark_runs = [] + + stdout_checks = workload_info.get("stdout_checks", []) or [] + if stdout_checks and not benchmark_runs: + for check in stdout_checks: + add( + check["name"], + False, + f"{check.get('detail', check['name'])}: no benchmark stdout captured", + ) + if benchmark_runs: + for check in stdout_checks: + failed_runs = [ + int(run.get("run", index)) + for index, run in enumerate(benchmark_runs, start=1) + if not _text_check_passes(_benchmark_run_stdout(run), check) + ] add( check["name"], - _text_check_passes(benchmark_stdout, check), - check.get("detail", 
check["name"]), + not failed_runs, + ( + f"{check.get('detail', check['name'])}: " + f"checked_runs={len(benchmark_runs)} failed_runs={failed_runs}" + ), ) for budget in runtime_budget_results(workload, runtime_summary, workloads): diff --git a/tests/raw_numeric_object_fields.ts b/tests/raw_numeric_object_fields.ts index a792e82983..3511dd5f58 100644 --- a/tests/raw_numeric_object_fields.ts +++ b/tests/raw_numeric_object_fields.ts @@ -1,3 +1,5 @@ +"use strict"; + interface Point { x: number; y: number; diff --git a/tests/test_compiler_output_regression.py b/tests/test_compiler_output_regression.py index 479b4e52de..b95f7b379f 100644 --- a/tests/test_compiler_output_regression.py +++ b/tests/test_compiler_output_regression.py @@ -81,6 +81,7 @@ br label %for.body.2 for.body.2: %i = load i32, ptr %slot + store i32 %i, ptr %slot %ok = icmp slt i32 %i, %n %p0 = getelementptr i8, ptr %src, i32 %i %b = load i8, ptr %p0 @@ -119,13 +120,86 @@ def native_record(function="main", block="for.body.2", rep="i32", **overrides): "alias_state": None, "access_mode": None, "materialization_reason": None, + "fallback_reason": None, + "native_value_state": "region_local", "emitted_inbounds": False, "emitted_noalias": False, } row.update(overrides) + if row.get("native_rep_name") != "js_value": + row.setdefault("consumed_facts", []).append( + native_fact( + "representation", + "consumed", + str(row.get("native_rep_name") or "unknown"), + ) + ) + bounds_state = row.get("bounds_state") + if isinstance(bounds_state, dict) and "guarded" in bounds_state: + guard = bounds_state["guarded"] or {} + row.setdefault("consumed_facts", []).append( + native_fact( + "bounds", + "consumed", + str(guard.get("guard_id") or "guarded"), + ) + ) + elif isinstance(bounds_state, dict) and "proven" in bounds_state: + proof = bounds_state["proven"] or {} + row.setdefault("consumed_facts", []).append( + native_fact( + "bounds", + "consumed", + str(proof.get("proof") or "proven"), + ) + ) + if 
row.get("access_mode") == "dynamic_fallback": + row["fallback_reason"] = row.get("fallback_reason") or row.get( + "materialization_reason" + ) + row["native_value_state"] = "dynamic_fallback" + if row.get("bounds_state") is None or row.get("bounds_state") == "unknown": + row.setdefault("rejected_facts", []).append( + native_fact( + "bounds", + "missing", + "unknown", + row.get("materialization_reason"), + ) + ) + if row.get("alias_state") in {"unknown", "may_alias", None}: + row.setdefault("rejected_facts", []).append( + native_fact( + "alias_noalias", + "missing", + "unknown_or_may_alias", + row.get("materialization_reason"), + ) + ) + if row.get("materialization_reason"): + row.setdefault("rejected_facts", []).append( + native_fact( + "materialization_hazard", + "invalidated", + str(row.get("materialization_reason")), + row.get("materialization_reason"), + ) + ) + elif row.get("materialization_reason"): + row["native_value_state"] = "materialized" return row +def native_fact(kind, state, detail, reason=None): + return { + "fact_id": f"native_region.{kind}.test.{detail}", + "kind": kind, + "local_id": None, + "state": state, + "reason": reason, + } + + def raw_f64_layout_fact(state): return { "fact_id": f"native_region.raw_f64_layout.test.{state}", @@ -368,6 +442,99 @@ def numeric_array_native_records(): ]) +def numeric_arrays_inline_ir(): + return """ +define i32 @main() { +entry: + call i64 @js_array_numeric_push_f64_unboxed(i64 1, double 2.0) + %g = call i32 @js_typed_feedback_numeric_array_index_get_guard(i64 1, double 0.0, double 0.0, i32 0, i32 1) + %gc = icmp ne i32 %g, 0 + br i1 %gc, label %bidx.num.fast.1, label %bidx.num.fallback.2 + +bidx.num.fast.1: + %addr = add i64 1, 8 + %p = inttoptr i64 %addr to ptr + %v = load double, ptr %p, align 8 + br label %bidx.num.merge.3 + +bidx.num.fallback.2: + br label %bidx.num.merge.3 + +bidx.num.merge.3: + %sg = call i32 @js_typed_feedback_numeric_array_index_set_guard(i64 1, double 0.0, i32 0, double 3.0, i32 1) + 
%sc = icmp ne i32 %sg, 0 + br i1 %sc, label %idxset.bounded_numeric_fast.4, label %idxset.bounded_numeric_merge.5 + +idxset.bounded_numeric_fast.4: + %sval = fadd double 3.0, 0.0 + %saddr = add i64 1, 8 + %sp = inttoptr i64 %saddr to ptr + %sraw = call double @js_array_numeric_value_to_raw_f64(double %sval) + store double %sraw, ptr %sp, align 8 + br label %idxset.bounded_numeric_merge.5 + +idxset.bounded_numeric_merge.5: + ret i32 0 +} +""" + + +def h1_equivalence_native_records(): + region_ids = { + "direct_bounded": "h1_native_rep_equivalence_ts.module_init.direct_bounded", + "local_cast": "h1_native_rep_equivalence_ts.module_init.local_cast", + "helper_index": "h1_native_rep_equivalence_ts.module_init.helper_index", + "same_buffer": "h1_native_rep_equivalence_ts.incinplace.same_buffer", + } + blocks = { + "direct_bounded": "for.body.2", + "local_cast": "for.body.6", + "helper_index": "for.body.10", + "same_buffer": "for.body.2.i", + } + records = [] + proven = {"proven": {"proof": "loop_guard"}} + for name, region_id in region_ids.items(): + alias_state = "may_alias" if name == "same_buffer" else "no_alias_proven" + records.extend( + [ + native_record( + block=blocks[name], + rep="i32", + region_id=region_id, + bounds_state=proven, + ), + native_record( + block=blocks[name], + rep="buffer_view", + region_id=region_id, + bounds_state=proven, + alias_state=alias_state, + access_mode="unchecked_native", + ), + native_record( + block=blocks[name], + rep="u8", + region_id=region_id, + bounds_state=proven, + alias_state=alias_state, + access_mode="unchecked_native", + consumer="u8_load_zext_i32", + ), + native_record( + block=blocks[name], + rep="u8", + region_id=region_id, + bounds_state=proven, + alias_state=alias_state, + access_mode="unchecked_native", + consumer="u8_store_trunc_i32", + ), + ] + ) + return records + + class CompilerOutputRegressionTests(unittest.TestCase): def test_image_convolution_good_shape_passes(self): report = HARNESS.verify_artifacts( @@ 
-530,7 +697,19 @@ def test_numeric_arrays_requires_runtime_api_fallback_reasons(self): entry: call i64 @js_array_numeric_push_f64_unboxed(i64 1, double 2.0) call double @js_array_numeric_get_f64_unboxed(i64 1, i32 0) - call i32 @js_array_numeric_set_f64_unboxed(i64 1, i32 0, double 3.0) + %sg = call i32 @js_typed_feedback_numeric_array_index_set_guard(i64 1, double 0.0, i32 0, double 3.0, i32 1) + %sc = icmp ne i32 %sg, 0 + br i1 %sc, label %idxset.bounded_numeric_fast.4, label %idxset.bounded_numeric_merge.5 + +idxset.bounded_numeric_fast.4: + %sval = fadd double 3.0, 0.0 + %saddr = add i64 1, 8 + %sp = inttoptr i64 %saddr to ptr + %sraw = call double @js_array_numeric_value_to_raw_f64(double %sval) + store double %sraw, ptr %sp, align 8 + br label %idxset.bounded_numeric_merge.5 + +idxset.bounded_numeric_merge.5: ret i32 0 } """ @@ -671,6 +850,79 @@ def test_verify_existing_uses_analysis_ir_object_disassembly_and_manifest_plan(s report = (root / "structural-report.json").read_text(encoding="utf-8") self.assertIn("object_disassembly_present", report) + def test_verify_existing_uses_manifest_benchmark_stdout_for_stdout_checks(self): + with tempfile.TemporaryDirectory() as temp: + root = Path(temp) + ir = numeric_arrays_inline_ir() + (root / "llvm-before-opt.ll").write_text(ir, encoding="utf-8") + (root / "llvm-after-opt.analysis.ll").write_text(ir, encoding="utf-8") + (root / "object-disassembly.s").write_text(GOOD_ASM, encoding="utf-8") + (root / "native-reps.json").write_text( + json.dumps({"records": numeric_array_native_records()}), + encoding="utf-8", + ) + (root / "manifest.json").write_text( + json.dumps( + { + "benchmark": { + "runs": [ + { + "run": 1, + "exit_code": 0, + "stdout_first": "25\n", + } + ] + } + } + ), + encoding="utf-8", + ) + args = type( + "Args", + (), + { + "artifact_dir": str(root), + "workload": "numeric_arrays", + "gate": True, + "print_summary": False, + "target": None, + "clang_arg": None, + "fp_contract": None, + "expect_fma": 
"auto", + }, + )() + self.assertEqual(HARNESS.verify_existing(args), 0) + + def test_verify_existing_stdout_checks_fail_without_manifest_benchmark(self): + with tempfile.TemporaryDirectory() as temp: + root = Path(temp) + ir = numeric_arrays_inline_ir() + (root / "llvm-before-opt.ll").write_text(ir, encoding="utf-8") + (root / "llvm-after-opt.analysis.ll").write_text(ir, encoding="utf-8") + (root / "object-disassembly.s").write_text(GOOD_ASM, encoding="utf-8") + (root / "native-reps.json").write_text( + json.dumps({"records": numeric_array_native_records()}), + encoding="utf-8", + ) + args = type( + "Args", + (), + { + "artifact_dir": str(root), + "workload": "numeric_arrays", + "gate": True, + "print_summary": False, + "target": None, + "clang_arg": None, + "fp_contract": None, + "expect_fma": "auto", + }, + )() + self.assertEqual(HARNESS.verify_existing(args), 1) + report = (root / "structural-report.json").read_text(encoding="utf-8") + self.assertIn("numeric_arrays_checksum", report) + self.assertIn("no benchmark stdout captured", report) + def test_explicit_perry_path_is_repo_relative(self): resolved = HARNESS.resolve_perry("target/debug/perry") self.assertEqual(resolved, [str(REPO_ROOT / "target/debug/perry")]) @@ -762,6 +1014,31 @@ def test_workload_spec_rejects_missing_required_fields(self): } ) + def test_workload_spec_rejects_substring_stdout_checks(self): + with self.assertRaises(HARNESS.HarnessError): + HARNESS.validate_workload_spec( + { + "schema_version": 1, + "workloads": { + "bad": { + "source": "fixture.ts", + "kind": "numeric_loop", + "vectorization": { + "min_vectorized_loops": 0, + "allowed_missed_reason_kinds": [], + }, + "runtime_budgets": {}, + "stdout_checks": [ + { + "name": "bad_stdout", + "contains": "25", + } + ], + } + }, + } + ) + def test_parse_kept_paths_includes_compile_metadata(self): irs, objects, metadata, native_reps = HARNESS.parse_kept_paths( "[perry-codegen] kept LLVM IR: /tmp/a.ll\n" @@ -930,8 +1207,9 @@ def 
test_native_rep_unchecked_unknown_bounds_fails_gate(self): def test_generic_native_rep_checks_require_configured_records(self): # The numeric indexed read is inlined: a guarded fast block computes the # element pointer (inttoptr) and performs a direct `load double` instead - # of calling js_array_numeric_get_f64_unboxed. Push/set still go through - # their guarded raw-f64 helpers. + # of calling js_array_numeric_get_f64_unboxed. The indexed write + # canonicalizes the input and stores inline after its guard instead of + # calling the raw-f64 set helper. ir = """ define i32 @main() { entry: @@ -950,7 +1228,19 @@ def test_generic_native_rep_checks_require_configured_records(self): br label %bidx.num.merge.3 bidx.num.merge.3: - call i32 @js_array_numeric_set_f64_unboxed(i64 1, i32 0, double 3.0) + %sg = call i32 @js_typed_feedback_numeric_array_index_set_guard(i64 1, double 0.0, i32 0, double 3.0, i32 1) + %sc = icmp ne i32 %sg, 0 + br i1 %sc, label %idxset.bounded_numeric_fast.4, label %idxset.bounded_numeric_merge.5 + +idxset.bounded_numeric_fast.4: + %sval = fadd double 3.0, 0.0 + %saddr = add i64 1, 8 + %sp = inttoptr i64 %saddr to ptr + %sraw = call double @js_array_numeric_value_to_raw_f64(double %sval) + store double %sraw, ptr %sp, align 8 + br label %idxset.bounded_numeric_merge.5 + +idxset.bounded_numeric_merge.5: ret i32 0 } """ @@ -1012,13 +1302,71 @@ def test_generic_native_rep_checks_require_configured_records(self): ) self.assertEqual(report["status"], "pass", report["errors"]) + def test_stdout_checks_require_benchmark_data(self): + ir = numeric_arrays_inline_ir() + report = HARNESS.verify_artifacts( + workload="numeric_arrays", + ir_before=ir, + ir_after=ir, + assembly=GOOD_ASM, + benchmark=None, + vectorization={"vectorized_count": 0, "missed_count": 0, "analysis_count": 0}, + native_reps=[{"records": numeric_array_native_records()}], + ) + self.assertEqual(report["status"], "fail") + self.assertTrue( + any("numeric_arrays_checksum" in error for 
error in report["errors"]), + report["errors"], + ) + self.assertTrue( + any("no benchmark stdout captured" in error for error in report["errors"]), + report["errors"], + ) + + def test_stdout_checks_are_exact_for_every_run(self): + ir = numeric_arrays_inline_ir() + report = HARNESS.verify_artifacts( + workload="numeric_arrays", + ir_before=ir, + ir_after=ir, + assembly=GOOD_ASM, + benchmark={ + "runs": [ + {"run": 1, "exit_code": 0, "stdout_first": "25\n"}, + {"run": 2, "exit_code": 0, "stdout_first": "125\n"}, + ] + }, + vectorization={"vectorized_count": 0, "missed_count": 0, "analysis_count": 0}, + native_reps=[{"records": numeric_array_native_records()}], + ) + self.assertEqual(report["status"], "fail") + self.assertTrue( + any( + "numeric_arrays_checksum" in error and "failed_runs=[2]" in error + for error in report["errors"] + ), + report["errors"], + ) + def test_numeric_array_native_rep_checks_require_raw_layout_facts(self): ir = """ define i32 @main() { entry: call i64 @js_array_numeric_push_f64_unboxed(i64 1, double 2.0) call double @js_array_numeric_get_f64_unboxed(i64 1, i32 0) - call i32 @js_array_numeric_set_f64_unboxed(i64 1, i32 0, double 3.0) + %sg = call i32 @js_typed_feedback_numeric_array_index_set_guard(i64 1, double 0.0, i32 0, double 3.0, i32 1) + %sc = icmp ne i32 %sg, 0 + br i1 %sc, label %idxset.bounded_numeric_fast.4, label %idxset.bounded_numeric_merge.5 + +idxset.bounded_numeric_fast.4: + %sval = fadd double 3.0, 0.0 + %saddr = add i64 1, 8 + %sp = inttoptr i64 %saddr to ptr + %sraw = call double @js_array_numeric_value_to_raw_f64(double %sval) + store double %sraw, ptr %sp, align 8 + br label %idxset.bounded_numeric_merge.5 + +idxset.bounded_numeric_merge.5: ret i32 0 } """ @@ -1063,6 +1411,98 @@ def test_numeric_array_native_rep_checks_require_raw_layout_facts(self): fallback_report["errors"], ) + def test_numeric_array_native_rep_checks_require_fallback_reason(self): + ir = """ +define i32 @main() { +entry: + call i64 
@js_array_numeric_push_f64_unboxed(i64 1, double 2.0) + call double @js_array_numeric_get_f64_unboxed(i64 1, i32 0) + %sg = call i32 @js_typed_feedback_numeric_array_index_set_guard(i64 1, double 0.0, i32 0, double 3.0, i32 1) + %sc = icmp ne i32 %sg, 0 + br i1 %sc, label %idxset.bounded_numeric_fast.4, label %idxset.bounded_numeric_merge.5 + +idxset.bounded_numeric_fast.4: + %sval = fadd double 3.0, 0.0 + %saddr = add i64 1, 8 + %sp = inttoptr i64 %saddr to ptr + %sraw = call double @js_array_numeric_value_to_raw_f64(double %sval) + store double %sraw, ptr %sp, align 8 + br label %idxset.bounded_numeric_merge.5 + +idxset.bounded_numeric_merge.5: + ret i32 0 +} +""" + records = numeric_array_native_records() + for record in records: + if record.get("access_mode") == "dynamic_fallback": + record["fallback_reason"] = None + break + report = HARNESS.verify_artifacts( + workload="numeric_arrays", + ir_before=ir, + ir_after=ir, + assembly=GOOD_ASM, + benchmark={"runs": [{"exit_code": 0, "stdout_first": "25\n"}]}, + vectorization={"vectorized_count": 0, "missed_count": 0, "analysis_count": 0}, + native_reps=[{"records": records}], + ) + self.assertEqual(report["status"], "fail") + self.assertTrue( + any( + "native_reps_dynamic_fallbacks_have_reasons" in error + for error in report["errors"] + ), + report["errors"], + ) + + def test_numeric_array_native_rep_checks_require_fact_reason(self): + ir = """ +define i32 @main() { +entry: + call i64 @js_array_numeric_push_f64_unboxed(i64 1, double 2.0) + call double @js_array_numeric_get_f64_unboxed(i64 1, i32 0) + %sg = call i32 @js_typed_feedback_numeric_array_index_set_guard(i64 1, double 0.0, i32 0, double 3.0, i32 1) + %sc = icmp ne i32 %sg, 0 + br i1 %sc, label %idxset.bounded_numeric_fast.4, label %idxset.bounded_numeric_merge.5 + +idxset.bounded_numeric_fast.4: + %sval = fadd double 3.0, 0.0 + %saddr = add i64 1, 8 + %sp = inttoptr i64 %saddr to ptr + %sraw = call double @js_array_numeric_value_to_raw_f64(double %sval) + 
store double %sraw, ptr %sp, align 8 + br label %idxset.bounded_numeric_merge.5 + +idxset.bounded_numeric_merge.5: + ret i32 0 +} +""" + records = numeric_array_native_records() + for record in records: + if record.get("access_mode") == "dynamic_fallback": + for fact in record.get("rejected_facts", []): + if fact.get("kind") == "raw_f64_layout": + fact["reason"] = None + break + report = HARNESS.verify_artifacts( + workload="numeric_arrays", + ir_before=ir, + ir_after=ir, + assembly=GOOD_ASM, + benchmark={"runs": [{"exit_code": 0, "stdout_first": "25\n"}]}, + vectorization={"vectorized_count": 0, "missed_count": 0, "analysis_count": 0}, + native_reps=[{"records": records}], + ) + self.assertEqual(report["status"], "fail") + self.assertTrue( + any( + "native_reps_required_numeric_array_push_dynamic_fallback" in error + for error in report["errors"] + ), + report["errors"], + ) + def test_generic_native_rep_checks_reject_unexpected_materialization(self): ir = "define i32 @main() { entry: ret i32 0 }\n" report = HARNESS.verify_artifacts( @@ -1140,7 +1580,7 @@ def h1_alias_negative_records(self, length_records, mutated_records=None): consumer="BufferIndexGet.slow_path_i32", ), ] - return [ + records = [ native_record( function="aliasLocal", rep="buffer_view", @@ -1211,6 +1651,25 @@ def h1_alias_negative_records(self, length_records, mutated_records=None): *length_records, *mutated_records, ] + for record in records: + if record.get("access_mode") != "dynamic_fallback": + continue + reason = record.get("materialization_reason") or "unknown_bounds" + record["materialization_reason"] = reason + record["fallback_reason"] = record.get("fallback_reason") or reason + record["native_value_state"] = "dynamic_fallback" + if record.get("bounds_state") is None or record.get("bounds_state") == "unknown": + record.setdefault("rejected_facts", []).append( + native_fact("bounds", "missing", "unknown", reason) + ) + if record.get("alias_state") in {"unknown", "may_alias", None}: + 
record.setdefault("rejected_facts", []).append( + native_fact("alias_noalias", "missing", "unknown_or_may_alias", reason) + ) + record.setdefault("rejected_facts", []).append( + native_fact("materialization_hazard", "invalidated", str(reason), reason) + ) + return records def test_length_mismatch_unchecked_unknown_bounds_fails_gate(self): length_region = "h1_buffer_alias_negative_ts.lengthmismatch.length_mismatch" @@ -1633,6 +2092,42 @@ def test_native_region_materialization_fails_gate(self): any("native_reps_direct_bounded_no_materialization" in error for error in report["errors"]) ) + def test_h1_native_rep_equivalence_consumed_facts_pass_gate(self): + report = HARNESS.verify_artifacts( + workload="h1_native_rep_equivalence", + ir_before=H1_MIN_IR, + ir_after=H1_MIN_IR, + assembly=GOOD_ASM, + benchmark=None, + vectorization={"vectorized_count": 0, "missed_count": 0, "analysis_count": 0}, + native_reps=[{"records": h1_equivalence_native_records()}], + ) + self.assertEqual(report["status"], "pass", report["errors"]) + + def test_h1_native_rep_equivalence_requires_consumed_facts(self): + records = h1_equivalence_native_records() + direct_region = "h1_native_rep_equivalence_ts.module_init.direct_bounded" + for record in records: + if record.get("region_id") == direct_region: + record["consumed_facts"] = [] + report = HARNESS.verify_artifacts( + workload="h1_native_rep_equivalence", + ir_before=H1_MIN_IR, + ir_after=H1_MIN_IR, + assembly=GOOD_ASM, + benchmark=None, + vectorization={"vectorized_count": 0, "missed_count": 0, "analysis_count": 0}, + native_reps=[{"records": records}], + ) + self.assertEqual(report["status"], "fail") + self.assertTrue( + any( + "native_reps_direct_bounded_consumes_representation_facts" in error + for error in report["errors"] + ), + report["errors"], + ) + def test_benchmark_summary_reports_p95_and_stddev(self): summary = HARNESS.benchmark_summary( [