[codex] Inline guarded numeric array payload access#5302
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughNumeric array index get and set fast paths are changed to compute element pointers inline (base + 8-byte header + idx×8) and emit direct DOUBLE load/store IR, removing the ChangesGuarded Numeric Array Direct Payload Access
sequenceDiagram
participant Codegen as lower_index_get/set
participant OldPath as Prior: runtime helper
participant NewPath as New: inline pointer arithmetic
rect rgba(200, 150, 100, 0.5)
Note over OldPath: Before: indirect call
Codegen->>OldPath: js_array_numeric_get/set_f64_unboxed()
OldPath-->>Codegen: f64 value or void
end
rect rgba(100, 200, 150, 0.5)
Note over NewPath: After: direct memory access
Codegen->>NewPath: hoist: element_ptr = base + 8-byte header + idx*8
NewPath-->>Codegen: element_ptr computed
Codegen->>Codegen: load/store DOUBLE to element_ptr
end
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
2b58bc2 to
ed71efd
Compare
ed71efd to
452b8d8
Compare
|
Caution Failed to replace (edit) comment. This is likely due to insufficient permissions or the comment being deleted. Error details |
f49be7c to
f1945a4
Compare
f1945a4 to
73d3794
Compare
…ined raw-f64 index access #5302 inlines the guarded numeric array index get/set raw-f64 payload access instead of routing through js_array_numeric_get/set_f64_unboxed helpers, but left the native-region-proof workload spec asserting the old helper-based shape. Update the numeric_arrays spec to match: - ir_check numeric_array_uses_unboxed_set now pins the inline guarded store (idxset.inbounds + store double) and asserts the helper call is elided, mirroring the existing get check. - require_records get/set consumers updated to numeric_array_index_get.raw_f64_load / numeric_array_index_set.raw_f64_store. Verified against the captured CI artifacts (verify --gate => 34/34 pass).
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@benchmarks/compiler_output/workloads.toml`:
- Line 636: The regex pattern in the workload check uses `%\w+` which is too
restrictive for LLVM IR identifiers and fails to match valid names containing
dots like `%tmp.1`. Broaden the SSA identifier matching pattern by replacing all
three occurrences of `%\w+` with `%[\w.]+` to allow dots in LLVM IR identifiers,
making the workload check more robust and less brittle to different IR output
variations.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: af58f009-0b3b-4b9a-bff0-a92a6d13c624
📒 Files selected for processing (1)
benchmarks/compiler_output/workloads.toml
| contains = "js_array_numeric_set_f64_unboxed" | ||
| detail = "numeric indexed write uses the guarded raw-f64 helper" | ||
| contains = "js_typed_feedback_numeric_array_index_set_guard" | ||
| regex = '''idxset\.inbounds\.\d+:[\s\S]*?inttoptr i64 %\w+ to ptr\s*\n\s*store double %\w+, ptr %\w+[^\n]*\n\s*br label %idxset\.merge''' |
There was a problem hiding this comment.
Broaden SSA identifier matching in the IR regex.
%\w+ is too restrictive for LLVM IR names and may fail on valid identifiers containing dots (e.g., %tmp.1), making this workload check flaky/overly brittle.
Suggested patch
-regex = '''idxset\.inbounds\.\d+:[\s\S]*?inttoptr i64 %\w+ to ptr\s*\n\s*store double %\w+, ptr %\w+[^\n]*\n\s*br label %idxset\.merge'''
+regex = '''idxset\.inbounds\.\d+:[\s\S]*?inttoptr i64 %[-a-zA-Z$._0-9]+ to ptr\s*\n\s*store double %[-a-zA-Z$._0-9]+, ptr %[-a-zA-Z$._0-9]+[^\n]*\n\s*br label %idxset\.merge'''📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| regex = '''idxset\.inbounds\.\d+:[\s\S]*?inttoptr i64 %\w+ to ptr\s*\n\s*store double %\w+, ptr %\w+[^\n]*\n\s*br label %idxset\.merge''' | |
| regex = '''idxset\.inbounds\.\d+:[\s\S]*?inttoptr i64 %[-a-zA-Z$._0-9]+ to ptr\s*\n\s*store double %[-a-zA-Z$._0-9]+, ptr %[-a-zA-Z$._0-9]+[^\n]*\n\s*br label %idxset\.merge''' |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@benchmarks/compiler_output/workloads.toml` at line 636, The regex pattern in
the workload check uses `%\w+` which is too restrictive for LLVM IR identifiers
and fails to match valid names containing dots like `%tmp.1`. Broaden the SSA
identifier matching pattern by replacing all three occurrences of `%\w+` with
`%[\w.]+` to allow dots in LLVM IR identifiers, making the workload check more
robust and less brittle to different IR output variations.
…ined raw-f64 index access #5302 inlines the guarded numeric array index get/set raw-f64 payload access instead of routing through js_array_numeric_get/set_f64_unboxed helpers, but left the native-region-proof workload spec asserting the old helper-based shape. Update the numeric_arrays spec to match: - ir_check numeric_array_uses_unboxed_set now pins the inline guarded store (idxset.inbounds + store double) and asserts the helper call is elided, mirroring the existing get check. - require_records get/set consumers updated to numeric_array_index_get.raw_f64_load / numeric_array_index_set.raw_f64_store. Verified against the captured CI artifacts (verify --gate => 34/34 pass).
d23d5bd to
787619a
Compare
|
Closing as superseded. Since this codex stack was cut, main advanced 52+ commits and independently evolved the same hot codegen/runtime paths. This PR is one link in a 13-deep linear stack that conflicts with diverged, correctness-sensitive codegen and was never reviewed; landing it would mean rebasing the whole stack. Per-PR judgement was done before closing. Specific: main already inlines the guard-proven raw-f64 numeric-array store (crates/perry-codegen/src/expr/index_set.rs:481 |
Summary
Performance
Stacked on #5295.
16_matrix_multiplyquick: 1842ms -> 1745ms, 5.3% faster10_nested_loopscompare median: 956ms -> 921ms, 3.7% fasterThe traced matrix module still declares
js_array_numeric_get_f64_unboxedandjs_array_numeric_set_f64_unboxed, but no longer emits call sites to those helpers on the guarded fast paths.Validation
cargo fmt --checkgit diff --checkcargo test -p perry-codegen --test typed_feedbackcargo test -p perry-codegen --test typed_shape_descriptorscargo test -p perry-codegen --test native_proof_regressions artifact_records_numeric_array_f64_fast_paths_and_fallback_reasonscargo test -p perry-codegen native_value::verify::testscargo build --releasePERRY_BIN=target/release/perry python3 tests/test_typed_feedback_runtime_evidence.pytests/test_benchmark_output_verifier.shtarget/release/perry compile --no-cache benchmarks/suite/16_matrix_multiply.ts -o /tmp/perry-matrix-direct-final --trace llvm --quiet./benchmarks/compare.sh --quick --runs 3 --warn-only --json-out /tmp/perry-direct-numeric-final-e816fc3e4.json./benchmarks/quick.shSummary by CodeRabbit
f64index get/set fast paths by inlining direct payload address handling and performing direct load/store of the value, with refreshed fast-path identifiers for better verification/recording.f64fast-path record names and the elimination of prior unboxed helper-call patterns.