[codex] Hoist invariant numeric array reads in inner loops #5312
andrewtdiz wants to merge 3 commits into
Closing as superseded. Since this codex stack was cut, main has advanced 52+ commits and independently evolved the same hot codegen/runtime paths. This PR is one link in a 13-deep linear stack that conflicts with diverged, correctness-sensitive codegen and was never reviewed; landing it would mean rebasing the whole stack. It was also flagged as borderline in per-PR review (narrow win and/or unrelated scope creep bundled in). Specifically: it adds a bespoke split-CFG (`for.prebody`/`for.cond.backedge`) for a narrow single-expression inner-loop shape, which is a high maintenance surface for a benchmark-specific win. A general LICM (loop-invariant code motion) pass is preferable.
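For readers unfamiliar with the loop shape under discussion, here is a minimal source-level sketch of the idea, assuming a plain `number[]` that the inner loop never writes to. The function names `sumNaive` and `sumHoisted` are hypothetical and do not come from this PR; the PR itself performs the rewrite in codegen, not in TypeScript source.

```typescript
// Illustration only: `sumNaive` and `sumHoisted` are hypothetical names.

// Before: arr[i] is re-read on every inner iteration even though
// neither `i` nor `arr` changes inside the inner loop.
function sumNaive(arr: number[]): number {
  let sum = 0;
  for (let i = 0; i < arr.length; i++) {
    for (let j = 0; j < arr.length; j++) {
      sum += arr[i] + arr[j];
    }
  }
  return sum;
}

// After: the invariant read is done once per outer iteration.
// This is only equivalent because the inner loop does not write to `arr`.
function sumHoisted(arr: number[]): number {
  let sum = 0;
  for (let i = 0; i < arr.length; i++) {
    const ai = arr[i]; // hoisted invariant read
    for (let j = 0; j < arr.length; j++) {
      sum += ai + arr[j];
    }
  }
  return sum;
}
```

Both functions perform the same additions in the same order, so they should agree exactly on any input. A general LICM pass would aim to find such invariant reads without special-casing this one loop shape.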
Summary

- Hoist invariant numeric `arr[i]` reads out of eligible inner `for (...; j < arr.length; j++)` loops

Benchmarks
Baseline is `ec79e68ff` on `codex/perry-i32-array-index-lowering`; current commit is `5895ca8e7`.

- `10_nested_loops`: 221ms -> 102ms median, sum `26991000000` throughout
- `perf stat` direct nested: 813.9M -> 422.6M cycles, 3.64B -> 1.84B instructions, 820.6M -> 415.4M branches
- `benchmarks/quick.sh`: nested_loops 202ms -> 124ms; matrix_multiply stayed in range at 387ms -> 390ms
- `benchmarks/compare.sh --quick --runs 3 --warn-only`: nested_loops 214ms -> 101ms; JSON at `/tmp/perry-compare-invariant-hoist-final.json`

Note: local Node.js cannot execute `.ts` benchmark inputs here, so Node columns/correctness checks were skipped by the harness. `benchmarks/baseline.json` is stale for this Linux environment, so compare was run with `--warn-only` and PR deltas use the captured local #5310 baseline.

Validation

- `cargo fmt --check`
- `git diff --check`
- `cargo test -p perry-codegen --test typed_feedback`
- `cargo test -p perry-codegen --test typed_shape_descriptors`
- `PERRY_BIN=target/release/perry python3 tests/test_typed_feedback_runtime_evidence.py`
- `tests/test_benchmark_output_verifier.sh`
- `cargo build --release`
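
To put the Benchmarks figures in relative terms, the before/after pairs can be converted into speedup ratios. The sketch below uses only numbers quoted in this description; the helper name `speedup` and the `rows` table are hypothetical and are not part of the benchmark harness.

```typescript
// Hypothetical helper: ratio > 1 means the "after" measurement is faster/smaller.
function speedup(before: number, after: number): number {
  return before / after;
}

// Before/after pairs copied from the Benchmarks section.
const rows: Array<[string, number, number]> = [
  ["10_nested_loops median (ms)", 221, 102],
  ["perf stat cycles (M)", 813.9, 422.6],
  ["perf stat instructions (B)", 3.64, 1.84],
  ["perf stat branches (M)", 820.6, 415.4],
  ["quick.sh nested_loops (ms)", 202, 124],
  ["quick.sh matrix_multiply (ms)", 387, 390],
  ["compare.sh nested_loops (ms)", 214, 101],
];

for (const [label, before, after] of rows) {
  console.log(`${label}: ${speedup(before, after).toFixed(2)}x`);
}
```

On these figures the nested-loop cases land at roughly 2x, while matrix_multiply stays at about 1x, which is consistent with the "stayed in range" remark.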