perf(codegen): integer-specialize `i < n` loop guards for any/untyped bounds by proggeramlug · Pull Request #5086 · PerryTS/perry

proggeramlug · 2026-06-13T11:44:12Z

Problem

A tight for (let i = 0; i < n; i++) integer loop whose bound n is not statically typed number — most commonly an any-typed or un-annotated value (e.g. a count from JSON.parse/untyped request data) — lowered its i < n / i <= n guard to a generic per-iteration comparison:

for.cond:
  %i  = load i32, ptr %ctr
  %id = sitofp i32 %i to double          ; vcvtsi2sd — every iteration
  %nd = load double, ptr %n              ; n kept as a NaN-boxed f64 on the stack
  %r  = call double @js_rel_lt(%id, %nd) ; callq — every iteration
  %b  = icmp eq i64 bitcast(%r), <TAG_TRUE>
  br i1 %b, label %body, label %exit

On the hot path of a compute kernel this is ~50× slower than an integer induction variable + icmp, and the call blocks SCEV / the loop vectorizer.

Why it looked arch-specific (it isn't)

The reporter observed this only on x86_64-linux, with arm64 "optimizing it correctly." But the emitted LLVM IR is byte-identical across both targets — only the target triple header line differs. js_rel_lt is a #[no_mangle] extern "C" runtime function in a separate compilation unit:

- macOS uses the default auto-optimize build, which rebuilds + inlines the runtime so LLVM folds js_rel_lt into the loop and the call vanishes → looks optimized.
- --target linux / Lambda links a prebuilt libperry_runtime.a with no cross-module inlining → the per-iteration callq survives → the ~50× hit.

The suboptimal IR was present on both arches; arm64 just masked it at link time. (A complementary follow-up — marking js_rel_* inlinable / enabling cross-module inlining on the prebuilt-archive path — would help every un-specialized hot comparison on Lambda, not just loop guards.)

Fix

A runtime-guarded i32 specialization extending the existing i < arr.length and i < n (number-typed) peepholes to any/untyped bounds:

- New classify_for_local_bound_dynamic matches i < n / i <= n where n is an accessible (unboxed, non-module-global) local whose static type is not number/int32.
- The loop head hoists, once, an is-number check (NaN-box tag test mirroring JSValue::is_number) plus fptosi(n).
- The cond block branches on that loop-invariant flag:
  - for.cond.fast → icmp slt i32 (no per-iteration sitofp, no call)
  - for.cond.slow → the generic js_rel_lt path, preserving full JS coercion semantics for non-number values.
- LLVM's LoopUnswitch peels the invariant branch into two loops at -O2+; even unswitched, the hot (is-number) path runs pure integer compares.
- Added the SHORT_STRING_TAG ABI-mirror constant to codegen's nanbox.rs.
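
The shape of the specialization can be modeled in plain Rust. This is a rough sketch, not the codegen itself: `JsValue`, `generic_lt`, and `sum_below` are hypothetical names, the enum stands in for the NaN-boxed representation, and the string coercion is a simplified stand-in for JS ToNumber. It assumes bounds comfortably inside the i32 range.

```rust
/// Simplified stand-in for a dynamically typed JS value (hypothetical; the
/// real runtime uses NaN-boxed f64s, not a Rust enum).
#[derive(Clone, Debug)]
pub enum JsValue {
    Number(f64),
    Str(String),
}

/// Generic `i < n` with a simplified ToNumber coercion on the bound.
/// Stands in for the per-iteration `js_rel_lt` call on the slow path.
fn generic_lt(i: i32, n: &JsValue) -> bool {
    let rhs = match n {
        JsValue::Number(x) => *x,
        // Unparseable strings become NaN, so the comparison is false.
        JsValue::Str(s) => s.trim().parse::<f64>().unwrap_or(f64::NAN),
    };
    f64::from(i) < rhs
}

/// Sums the counter values of `for (i = 0; i < n; i++)`. The bound's type is
/// tested once; the loop then runs as integer-only or fully generic.
pub fn sum_below(n: &JsValue) -> i64 {
    let mut sum = 0i64;
    let mut i = 0i32;
    if let JsValue::Number(x) = n {
        // Hoisted once. Rust's `as` saturates (NaN -> 0), unlike raw `fptosi`.
        let bound = *x as i32;
        while i < bound {
            sum += i64::from(i);
            i += 1;
        }
    } else {
        // Slow path: coercion is re-evaluated on every iteration.
        while generic_lt(i, n) {
            sum += i64::from(i);
            i += 1;
        }
    }
    sum
}
```

Calling `sum_below(&JsValue::Number(1_000_000.0))` takes the integer-only loop, while `sum_below(&JsValue::Str("3".to_string()))` takes the generic loop and still coerces the bound to 3.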

When the bound is a primitive number, hoisting fptosi(n) once carries the same documented trust-types trade-off as the static number-typed path (a non-integer float bound shifts the trip count by at most one).
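
A small Rust model makes the trade-off concrete. It is only an illustration under stated assumptions: `trips_f64` and `trips_hoisted_i32` are hypothetical helpers, and the bound is assumed to be finite and within i32 range.

```rust
/// Trip count of `for (let i = 0; i < n; i++)` with the compare done on doubles.
pub fn trips_f64(n: f64) -> u32 {
    let mut i = 0i32;
    let mut trips = 0u32;
    while f64::from(i) < n {
        i += 1;
        trips += 1;
    }
    trips
}

/// Trip count when the bound is truncated to i32 once, before the loop.
pub fn trips_hoisted_i32(n: f64) -> u32 {
    // Truncates toward zero, like `fptosi` does for in-range values.
    let bound = n as i32;
    let mut i = 0i32;
    let mut trips = 0u32;
    while i < bound {
        i += 1;
        trips += 1;
    }
    trips
}
```

For an integer-valued bound such as 3.0 both functions agree. For 3.5 the double compare admits `i = 3` while the truncated bound does not, so the hoisted loop runs one iteration fewer.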

Verification

- The any-bound guard now lowers to for.cond.fast: icmp slt i32 with no per-iteration sitofp/call; the entry block computes the is-number flag + fptosi(n) once.
- Runtime correctness: numeric any bound → fast path, correct sum (499999500000); "3" string bound → slow path, coerces to 3; <= bound → 15.
- Tests: perry-codegen 16/16; perry-hir + perry lib 522/0 plus integration suites green. The residual perry-runtime date/url failures and the issue_4909 HTTP-timeout test are pre-existing load/timezone flakes: they are unchanged with this patch stashed, and pass in isolation.

Touches only crates/perry-codegen (stmt/loops.rs, nanbox.rs); +version bump/changelog.

Summary by CodeRabbit

Bug Fixes
- Improved loop performance when loop bounds come from dynamic/untyped values, reducing per-iteration overhead and speeding execution.
Documentation
- Added changelog entry for v0.5.1164 describing the performance improvement and verification status.
Chores
- Version bumped to v0.5.1164.

coderabbitai · 2026-06-13T11:44:53Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

📥 Commits

Reviewing files that changed from the base of the PR and between b87afd2 and c564446.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (5)

CHANGELOG.md
CLAUDE.md
Cargo.toml
crates/perry-codegen/src/nanbox.rs
crates/perry-codegen/src/stmt/loops.rs

✅ Files skipped from review due to trivial changes (2)

Cargo.toml
CHANGELOG.md

🚧 Files skipped from review as they are similar to previous changes (3)

CLAUDE.md
crates/perry-codegen/src/nanbox.rs
crates/perry-codegen/src/stmt/loops.rs

📝 Walkthrough

This PR optimizes for-loop code generation with untyped (any) numeric bounds. It hoists a single is-number check and branches into a fast i32-compare path versus a slow coercion-preserving path, plus introduces an ABI-mirroring constant and bumps the workspace version to 0.5.1164.

Changes

Dynamic I32 Loop-Bound Optimization

| Layer / File(s) | Summary |
|---|---|
| Version and changelog documentation: `Cargo.toml`, `CLAUDE.md`, `CHANGELOG.md` | Workspace version incremented to 0.5.1164, with changelog documenting the loop-guard specialization for any-typed bounds and its verification status. |
| ABI constant for short-string tagging: `crates/perry-codegen/src/nanbox.rs` | Added `SHORT_STRING_TAG` constant to mirror the runtime's NaN-boxed tag band for short-string payload classification during codegen. |
| Dynamic i32 bound loop optimization: `crates/perry-codegen/src/stmt/loops.rs` | Extended for-loop lowering with `DynamicI32Bound` helper to carry hoisted is-number flag and fptosi-converted i32 bound. Hoisting logic allocates counter i32 slot if missing, computes is-number flag from nanbox tag bits, and precomputes i32 bound. Condition generation adds branching on hoisted flag: fast path uses i32 icmp, slow path falls back to full truthiness lowering. Added cleanup to remove dynamically allocated counter slots, and added `classify_for_local_bound_dynamic` to detect any-typed numeric bounds while excluding static cases. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🐰 A loop with bounds so shy,
We hoist its checks way up on high,
Fast paths race with i32 grace,
Slow paths keep every coercion in place,
Hops of joy — optimized pace!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The PR description is comprehensive, detailing the problem, root cause analysis, the fix implementation, and verification results. However, the CLAUDE.md and CHANGELOG.md were edited despite the template instructions forbidding this. | Remove edits to CLAUDE.md and CHANGELOG.md as per template instructions; maintainers handle version/changelog updates at merge time. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: a performance optimization that adds integer specialization for loop guards with any/untyped bounds. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/perry-codegen/src/stmt/loops.rs`:
- Around line 435-491: The dynamic fast path currently hoists is_number and
fptosi for a bound local without ensuring that bound_id is not mutated inside
the loop, so add a mutability/modified-local guard before creating
DynamicI32Bound: in the closure passed to and_then(|(counter_id, bound_id, op)|
{ ... }) (the classify_for_local_bound_dynamic path) check whether bound_id can
be written by the loop body or update (e.g. consult whichever
analysis/collection tracks locals modified by the loop such as a
loop-modified-locals set or a Local.is_mutable/modified-in-loop query) and
return None if it is mutated; only proceed to allocate flag_slot/bound_i32_slot
and return Some(DynamicI32Bound) when bound_id is guaranteed not to be changed
by the loop. Ensure this check uses the same identifying symbol bound_id and
prevents the one-time hoist when the local is mutable.
- Around line 480-483: The code currently does an unconditional fptosi on n_dbl
and stores it to bound_i32_slot, which can produce poison for
NaN/Inf/out-of-range values; instead, ensure the fptosi is only executed after
the finite/i32-range checks (i.e., inside the fast-path block or after the
flag_slot guard) — move the fptosi(&n_dbl, I32) and the store to bound_i32_slot
into the same basic block that is taken when the fast-number checks pass (or
emit a guarded conversion using the boolean guard that writes only when the
guard is true), leaving the pre-loop block to only allocate bound_i32_slot and
produce no fptosi result; update any uses that load bound_i32_slot (e.g., the
icmp compare) to occur in the fast block after the store so they never observe
an unguarded conversion.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between e092362 and b87afd2.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (5)

CHANGELOG.md
CLAUDE.md
Cargo.toml
crates/perry-codegen/src/nanbox.rs
crates/perry-codegen/src/stmt/loops.rs

coderabbitai · 2026-06-13T11:51:23Z

+    let dynamic_i32_bound: Option<DynamicI32Bound> =
+        if hoist_classification.is_none() && local_bound_classification.is_none() {
+            condition
+                .and_then(|cond| classify_for_local_bound_dynamic(cond, ctx))
+                .and_then(|(counter_id, bound_id, op)| {
+                    let bound_slot = ctx.locals.get(&bound_id).cloned()?;
+                    // Ensure an i32 counter slot exists (the Let site allocates
+                    // one for `integer_locals`, but allocate here if absent so
+                    // the fast path and Update stay in sync).
+                    let counter_i32_was_fresh =
+                        if !ctx.i32_counter_slots.contains_key(&counter_id) {
+                            let counter_slot = ctx.locals.get(&counter_id).cloned()?;
+                            let i32_slot = ctx.func.alloca_entry(I32);
+                            let cur_dbl = ctx.block().load(DOUBLE, &counter_slot);
+                            let cur_i32 = ctx.block().fptosi(DOUBLE, &cur_dbl, I32);
+                            ctx.block().store(I32, &cur_i32, &i32_slot);
+                            ctx.i32_counter_slots.insert(counter_id, i32_slot);
+                            true
+                        } else {
+                            false
+                        };
+                    // One-time `is-number` test, mirroring runtime
+                    // `JSValue::is_number`: a value is a number unless its tag
+                    // bits fall in the Perry-owned band [SHORT_STRING_TAG,
+                    // STRING_TAG].
+                    let n_dbl = ctx.block().load(DOUBLE, &bound_slot);
+                    let n_bits = ctx.block().bitcast_double_to_i64(&n_dbl);
+                    let tag = ctx.block().and(
+                        I64,
+                        &n_bits,
+                        &crate::nanbox::i64_literal(crate::nanbox::TAG_MASK),
+                    );
+                    let below = ctx.block().icmp_ult(
+                        I64,
+                        &tag,
+                        &crate::nanbox::i64_literal(crate::nanbox::SHORT_STRING_TAG),
+                    );
+                    let above = ctx.block().icmp_ugt(
+                        I64,
+                        &tag,
+                        &crate::nanbox::i64_literal(crate::nanbox::STRING_TAG),
+                    );
+                    let is_number = ctx.block().or(I1, &below, &above);
+                    let flag_slot = ctx.func.alloca_entry(I1);
+                    ctx.block().store(I1, &is_number, &flag_slot);
+                    // `fptosi(n)` is valid only on the fast (is-number) path.
+                    let bound_i32 = ctx.block().fptosi(DOUBLE, &n_dbl, I32);
+                    let bound_i32_slot = ctx.func.alloca_entry(I32);
+                    ctx.block().store(I32, &bound_i32, &bound_i32_slot);
+                    Some(DynamicI32Bound {
+                        counter_id,
+                        op,
+                        flag_slot,
+                        bound_i32_slot,
+                        counter_i32_was_fresh,
+                    })
+                })


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject the dynamic fast path when the bound local is mutable inside the loop.

This path hoists both is_number and fptosi(n) once, but it never proves that bound_id stays unchanged across the body/update. A loop like for (let i = 0; i < n; i++) { n = "0"; } will keep using the entry-time flag/bound and run too many iterations instead of re-reading n each trip.
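
The divergence can be reproduced with a small Rust model. This is an illustrative sketch only: `trips_rereading` and `trips_hoisted` are hypothetical helpers, and the reassignment `n = "0"` is modeled as writing the number 0.0 to keep the example short.

```rust
/// Re-reads the bound on every iteration, as the generic guard does.
pub fn trips_rereading(n: &mut f64, new_n: f64) -> u32 {
    let mut i = 0i32;
    let mut trips = 0u32;
    while f64::from(i) < *n {
        *n = new_n; // loop body assigns to the bound
        i += 1;
        trips += 1;
    }
    trips
}

/// Reads the bound once before the loop, as the hoisted fast path does.
pub fn trips_hoisted(n: &mut f64, new_n: f64) -> u32 {
    let bound = *n as i32; // snapshot taken at loop entry
    let mut i = 0i32;
    let mut trips = 0u32;
    while i < bound {
        *n = new_n; // this write is never observed by the guard
        i += 1;
        trips += 1;
    }
    trips
}
```

Starting from a bound of 5.0 with the body writing 0.0, the re-reading loop stops after one iteration, whereas the hoisted loop still runs five times.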

Suggested guard

         if hoist_classification.is_none() && local_bound_classification.is_none() {
             condition
                 .and_then(|cond| classify_for_local_bound_dynamic(cond, ctx))
                 .and_then(|(counter_id, bound_id, op)| {
+                    if stmts_mutate_local(body, bound_id)
+                        || update.is_some_and(|expr| expr_mutates_local(expr, bound_id))
+                    {
+                        return None;
+                    }
                     let bound_slot = ctx.locals.get(&bound_id).cloned()?;
                     // Ensure an i32 counter slot exists (the Let site allocates
                     // one for `integer_locals`, but allocate here if absent so

coderabbitai · 2026-06-13T11:51:23Z

+                    // `fptosi(n)` is valid only on the fast (is-number) path.
+                    let bound_i32 = ctx.block().fptosi(DOUBLE, &n_dbl, I32);
+                    let bound_i32_slot = ctx.func.alloca_entry(I32);
+                    ctx.block().store(I32, &bound_i32, &bound_i32_slot);


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

LLVM LangRef fptosi: what happens when converting NaN, ±Inf, or out-of-range double values to i32?

💡 Result:

In LLVM IR, the fptosi instruction converts a floating-point value to a signed integer by rounding towards zero [1][2]. If the source value cannot be represented in the destination integer type (NaN, ±Inf, or a value that is out of range), the result is a poison value [1][2]. Poison is LLVM's deferred form of undefined behavior and propagates through dependent instructions [1][2]. Because converting NaN or out-of-range values produces poison, using the result of such an fptosi in other operations can lead to undefined behavior for the program as a whole [3]. If you require defined behavior for these edge cases, such as clamping to the minimum or maximum representable integer, use the llvm.fptosi.sat intrinsic, which provides saturating conversion behavior [4][5].

Citations:

1: https://reviews.llvm.org/D47851

2: https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180604/559300.html

3: https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20250915/1734007.html

4: https://releases.llvm.org/15.0.0/docs/LangRef.html

5: https://releases.llvm.org/14.0.0/docs/LangRef.html

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Inspect the relevant region around the reported lines
FILE="crates/perry-codegen/src/stmt/loops.rs"
nl -ba "$FILE" | sed -n '430,540p'

Repository: PerryTS/perry

Length of output: 100

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Also search for the surrounding guard / fast-slow split context to see where the fptosi lands.
FILE="crates/perry-codegen/src/stmt/loops.rs"
rg -n "fptosi\$|is-number|NaN|fast|slow|primitive number|DOUBLE|I32" "$FILE"

Repository: PerryTS/perry

Length of output: 6052

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="crates/perry-codegen/src/stmt/loops.rs"
awk 'NR>=430 && NR<=510 {printf "%d:%s\n", NR, $0}' "$FILE" | sed -n '1,120p'

Repository: PerryTS/perry

Length of output: 4893

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="crates/perry-codegen/src/stmt/loops.rs"
awk 'NR>=440 && NR<=500 {printf "%d:%s\n", NR, $0}' "$FILE" | sed -n '1,200p'

Repository: PerryTS/perry

Length of output: 3534

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="crates/perry-codegen/src/stmt/loops.rs"
rg -n "for\\.cond\\.fast|for\\.cond\\.slow|cond_br\\(&flag|flag_slot|bound_slot|bound_i32_slot|fptosi\\(DOUBLE, &n_dbl, I32\$" "$FILE"

Repository: PerryTS/perry

Length of output: 1597

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="crates/perry-codegen/src/stmt/loops.rs"
# Show the fast/slow split logic around where flag is used
sed -n '540,660p' "$FILE"

Repository: PerryTS/perry

Length of output: 5550

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="crates/perry-codegen/src/stmt/loops.rs"
rg -n "fptosi\\.sat|llvm\\.fptosi\\.sat|fptosi\\(|freeze\\b|is_finite|isfinite|fcmp" "$FILE"

Repository: PerryTS/perry

Length of output: 1879

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="crates/perry-codegen/src/stmt/loops.rs"
# Find where the slow path computes the bound/compare to see if it still uses bound_i32_slot
rg -n "for\\.cond\\.slow|for\\.cond\\.fast|js_rel_|js_.*compare|CompareOp" "$FILE" -n

Repository: PerryTS/perry

Length of output: 1679

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="crates/perry-codegen/src/stmt/loops.rs"
# Show a wider window around the DynamicI32Bound usage
sed -n '560,640p' "$FILE"

Repository: PerryTS/perry

Length of output: 3949

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="crates/perry-codegen/src/stmt/loops.rs"
rg -n "classify_for_local_bound_dynamic" -S "$FILE"

Repository: PerryTS/perry

Length of output: 279

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="crates/perry-codegen/src/stmt/loops.rs"
# Print around the function definition and any helpers
sed -n '660,820p' "$FILE"

Repository: PerryTS/perry

Length of output: 6618

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="crates/perry-codegen/src/stmt/loops.rs"
# Show how the slow path computes/compares (look for compare helpers/calls)
rg -n "for\\.cond\\.slow|for\\.cond\\.fast|js_rel_|CompareOp::|js_.*rel|js_.*compare" "$FILE"

Repository: PerryTS/perry

Length of output: 1135

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="crates/perry-codegen/src/stmt/loops.rs"
# Wider window around DynamicI32Bound usage to confirm where bound_i32_slot is read
sed -n '560,640p' "$FILE"

Repository: PerryTS/perry

Length of output: 3949

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="crates/perry-codegen/src/stmt/loops.rs"
# Locate the classifier to see whether it excludes NaN/Inf/out-of-range (or only checks tag)
rg -n "classify_for_local_bound_dynamic" "$FILE"

Repository: PerryTS/perry

Length of output: 279

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="crates/perry-codegen/src/stmt/loops.rs"
# Print around potential helper definitions near the matcher result (best-effort)
# (If this misses, we'll narrow further after seeing line numbers from rg.)
sed -n '820,1040p' "$FILE"

Repository: PerryTS/perry

Length of output: 8246

Avoid speculatively fptosi-ing dynamic loop bounds (poison/UB risk)

At crates/perry-codegen/src/stmt/loops.rs:480-483, fptosi(DOUBLE, &n_dbl, I32) is computed unconditionally in the pre-loop block, then stored to bound_i32_slot even though only flag_slot gates use in the fast/slow condition. LLVM fptosi yields poison when the input is NaN/±Inf or out-of-range for i32, and the fast path subsequently loads that result for the icmp comparison.

// `fptosi(n)` is valid only on the fast (is-number) path.
let bound_i32 = ctx.block().fptosi(DOUBLE, &n_dbl, I32);
let bound_i32_slot = ctx.func.alloca_entry(I32);
ctx.block().store(I32, &bound_i32, &bound_i32_slot);

Strengthen the guard so only finite, i32-representable doubles reach fptosi (and/or emit the conversion only inside the fast block after those checks).
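
One way to picture the stronger guard is the range test below, written as ordinary Rust rather than emitted IR. `checked_i32_bound` is a hypothetical helper, not part of Perry. Note that Rust's own `as` cast already saturates, so this models only the condition that would gate the fast path, not LLVM's poison semantics.

```rust
/// Returns `Some(bound)` only when `n` is finite and its truncation toward
/// zero is representable as an i32; otherwise the caller should stay on the
/// generic (slow) path.
pub fn checked_i32_bound(n: f64) -> Option<i32> {
    // Truncation fits when -2^31 - 1 < n < 2^31. NaN fails both comparisons,
    // and +/-Inf fails one of them, so neither reaches the conversion.
    if n > -2147483649.0 && n < 2147483648.0 {
        Some(n as i32)
    } else {
        None
    }
}
```

With such a check folded into the hoisted flag, a bound like `Infinity` or `1e12` would fall back to the generic comparison instead of feeding an invalid conversion result to the integer compare.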

… bounds

Tight integer loops whose bound is not statically typed `number` (most commonly an `any`-typed or un-annotated count, e.g. out of `JSON.parse`) lowered their `i < n` / `i <= n` guard to a generic per-iteration comparison: `sitofp` the i32 counter back to a double, keep `n` as a NaN-boxed f64, and `call @js_rel_lt` every iteration. That is ~50x slower than an integer induction variable + `icmp`, and it blocks SCEV / the loop vectorizer.

The emitted IR is identical across x86_64 and arm64 (only the target-triple header differs). It looked arch-specific only because the macOS auto-optimize build inlines `js_rel_lt` and folds the call away, while the `--target linux` prebuilt-runtime build keeps the per-iteration `callq`, which is the cause of poor compute throughput on Lambda.

Fix: runtime-guarded i32 specialization extending the existing `classify_for_local_bound` peephole. New `classify_for_local_bound_dynamic` matches the shape; the loop head hoists, once, an `is-number` check (NaN-box tag test mirroring `JSValue::is_number`) + `fptosi(n)`; the cond block branches on that loop-invariant flag into `for.cond.fast` (`icmp slt i32`, no per-iteration sitofp/call) and `for.cond.slow` (generic `js_rel_lt`, preserving full JS coercion for non-number values). LoopUnswitch peels the invariant branch into two loops at -O2+. Added the `SHORT_STRING_TAG` ABI-mirror constant to codegen's nanbox.rs.

Verified: any-bound guards now lower to `icmp slt i32`; numeric any computes correctly via the fast path; string/<= cases keep correct coercion via the slow path. perry-codegen 16/16, perry-hir + perry 522/0 + integration green.

coderabbitai Bot reviewed Jun 13, 2026

proggeramlug force-pushed the feat/loop-guard-any-bound branch from b87afd2 to c564446 Compare June 13, 2026 12:20

proggeramlug merged commit 396e3ff into main Jun 13, 2026
13 of 14 checks passed

proggeramlug deleted the feat/loop-guard-any-bound branch June 13, 2026 13:13

proggeramlug mentioned this pull request Jun 13, 2026

fix(perry-ffi): unbreak cargo test -p perry-ffi (undefined js_register_ffi_handle_exists_probe) #5088

Merged
