
[codex] Hoist typed feedback site registration#5295

Merged: proggeramlug merged 2 commits into main from codex/perry-performance-20260617 on Jun 17, 2026

@andrewtdiz andrewtdiz commented Jun 17, 2026

Contributor

Summary

Hoists typed-feedback site registration out of hot use sites and into function-entry setup. The guard, fallback, pass, and counter calls still run at the original use sites, but the per-site metadata registration now happens once per function entry instead of once per loop iteration.

Also makes the benchmark harness usable on this Linux runner by supporting /usr/bin/time -v RSS output and skipping Node columns when the installed Node cannot execute .ts benchmark inputs directly.
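
To make the RSS point concrete, here is a minimal sketch of pulling peak RSS out of `/usr/bin/time` output. It is written in Rust for illustration only; the scripts in this PR do the parsing with awk. The function name `peak_rss_kb` is hypothetical, and the two line formats are assumptions based on how GNU `time -v` (kilobytes) and BSD/macOS `time -l` (bytes) usually print this field.

```rust
/// Extract peak RSS in kilobytes from the stderr text of `/usr/bin/time`.
/// Hypothetical helper: not part of the PR, which parses with awk.
fn peak_rss_kb(time_output: &str) -> Option<u64> {
    for line in time_output.lines() {
        let line = line.trim();
        // Assumed GNU `time -v` format:
        //   "Maximum resident set size (kbytes): 20480"
        if let Some(rest) = line.strip_prefix("Maximum resident set size (kbytes):") {
            return rest.trim().parse().ok();
        }
        // Assumed BSD/macOS `time -l` format (value in bytes):
        //   "20971520  maximum resident set size"
        if let Some(value) = line.strip_suffix("maximum resident set size") {
            return value.trim().parse::<u64>().ok().map(|bytes| bytes / 1024);
        }
    }
    None
}
```

Normalizing both formats to kilobytes keeps the reported numbers comparable across operating systems, which is the property the harness change is after.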

Root cause

LLVM traces of benchmarks/suite/10_nested_loops.ts showed js_typed_feedback_register_site(...) inside the hot for.body.21 inner loop, emitted before each typed-feedback array guard. Static site metadata was therefore being re-registered on every loop iteration instead of once.
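
The effect of the hoist on call counts can be sketched with a toy model. This is not the codegen or the runtime; `Counters`, `run_unhoisted`, and `run_hoisted` are hypothetical names, and the counters only stand in for the registration and guard calls described above.

```rust
/// Hypothetical stand-in for the runtime: counts calls instead of doing work.
#[derive(Default)]
struct Counters {
    register_calls: u64,
    guard_checks: u64,
}

/// Registration emitted at the use site: it runs on every inner iteration.
fn run_unhoisted(outer: u64, inner: u64) -> Counters {
    let mut c = Counters::default();
    for _ in 0..outer {
        for _ in 0..inner {
            c.register_calls += 1; // static metadata re-registered each time
            c.guard_checks += 1; // the guard itself has to run here
        }
    }
    c
}

/// Registration hoisted to function entry: it runs once per call.
fn run_hoisted(outer: u64, inner: u64) -> Counters {
    let mut c = Counters::default();
    c.register_calls += 1; // entry setup, outside both loops
    for _ in 0..outer {
        for _ in 0..inner {
            c.guard_checks += 1; // guard stays at the original use site
        }
    }
    c
}
```

In this model the guard count is identical in both versions, while the registration count drops from `outer * inner` to one per function entry.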

Benchmark Results

Local baseline was captured from e816fc3 because benchmarks/baseline.json is stale for this Linux environment.

  • ./benchmarks/compare.sh --quick --runs 3 --warn-only --json-out /tmp/perry-final-e816fc3e4.json
    • 10_nested_loops: 3383ms -> 956ms, 71.7% faster
    • 02_loop_overhead: 74ms -> 74ms
    • 05_fibonacci: 262ms -> 261ms
    • 06_math_intensive: 70ms -> 69ms
    • 13_factorial: 96ms -> 94ms
  • ./benchmarks/quick.sh
    • 16_matrix_multiply: 6462ms -> 1842ms, 71.5% faster
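
For readers checking the arithmetic, the "% faster" figures above match a reduction in wall time, (before - after) / before. A small sketch, with hypothetical helper names, that reproduces them from the quoted timings:

```rust
/// Percentage reduction in wall time, rounded to one decimal place.
fn percent_reduction(before_ms: f64, after_ms: f64) -> f64 {
    let pct = (before_ms - after_ms) / before_ms * 100.0;
    (pct * 10.0).round() / 10.0
}

/// Speedup factor, before divided by after.
fn speedup(before_ms: f64, after_ms: f64) -> f64 {
    before_ms / after_ms
}
```

With the numbers from this run, `percent_reduction(3383.0, 956.0)` gives 71.7 and `percent_reduction(6462.0, 1842.0)` gives 71.5; both correspond to a speedup factor of roughly 3.5x.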

Node columns/correctness checks were skipped because local Node cannot run .ts benchmark inputs directly.

Validation

  • bash -n benchmarks/quick.sh
  • bash -n benchmarks/compare.sh
  • cargo fmt --check
  • git diff --check
  • cargo build --release
  • cargo test -p perry-codegen --test typed_feedback
  • PERRY_BIN=target/release/perry python3 tests/test_typed_feedback_runtime_evidence.py
  • tests/test_benchmark_output_verifier.sh
  • target/release/perry compile --no-cache benchmarks/suite/10_nested_loops.ts -o /tmp/perry-nested-loops-final --trace llvm --quiet
    • confirmed registration calls are in entry setup only and not in for.body.21
    • /tmp/perry-nested-loops-final produced nested_loops:963 and sum:26991000000
  • target/release/perry compile --no-cache benchmarks/suite/16_matrix_multiply.ts -o /tmp/perry-matrix-multiply-final --quiet && /tmp/perry-matrix-multiply-final
    • produced matrix_multiply:1778 and checksum:41079519680

Summary by CodeRabbit

  • Performance

    • Optimized compiler's typed feedback registration mechanism for improved benchmark performance.
  • Documentation

    • Added performance run documentation recording baseline and post-optimization benchmark results.
  • Chores

    • Enhanced benchmark scripts with improved TypeScript environment detection and refined peak memory measurement across operating systems.

@coderabbitai

coderabbitai Bot commented Jun 17, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 2f3898f5-293e-44eb-b587-715aec423adf

📥 Commits

Reviewing files that changed from the base of the PR and between 0f73bb0 and 00057f5.

📒 Files selected for processing (5)
  • PERF_RUN_LOG.md
  • benchmarks/compare.sh
  • benchmarks/quick.sh
  • crates/perry-codegen/src/expr/typed_feedback.rs
  • crates/perry-codegen/src/function.rs

📝 Walkthrough


Moves js_typed_feedback_register_site calls from per-guard-use-site block emission to a single function-entry setup region via a new LlFunction::entry_setup_call_void helper. Benchmark scripts gain cross-platform Node TypeScript runner detection, awk-based timing/RSS parsing, and consistent NODE_CMD wiring. A new perf log entry records the measured results.

Changes

Typed Feedback Registration Hoist

| Layer / File(s) | Summary |
|---|---|
| Function entry setup call helper: crates/perry-codegen/src/function.rs | Clarifies entry_post_init_setup placement docs and adds LlFunction::entry_setup_call_void to record FFI calls, format a typed LLVM call void IR line, and append it to the entry setup region. |
| Typed feedback registration emission: crates/perry-codegen/src/expr/typed_feedback.rs | emit_typed_feedback_register_site precomputes site_id, kind, and *_len argument strings then emits registration via ctx.func.entry_setup_call_void instead of ctx.block().call_void. |
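
The shape of such a helper might look like the sketch below. It is an assumption-laden model, not the code in crates/perry-codegen: `SketchFunction`, `block_call_void`, `format_call_void`, the field names, and the argument types are all invented for illustration. Only the method name `entry_setup_call_void` and the idea of appending a formatted `call void` line to an entry setup region come from the walkthrough.

```rust
/// Hypothetical, heavily simplified model of a function under construction.
#[derive(Default)]
struct SketchFunction {
    entry_setup: Vec<String>, // emitted once, at function entry
    body: Vec<String>,        // stands in for the current basic block
}

impl SketchFunction {
    /// Format a `call void` line and append it to the entry setup region.
    fn entry_setup_call_void(&mut self, callee: &str, args: &[(&str, &str)]) {
        self.entry_setup.push(format_call_void(callee, args));
    }

    /// Append a `call void` line to the current block (the pre-hoist path).
    fn block_call_void(&mut self, callee: &str, args: &[(&str, &str)]) {
        self.body.push(format_call_void(callee, args));
    }
}

/// Render (type, value) pairs as LLVM IR text, e.g. `call void @f(i64 7, i32 1)`.
fn format_call_void(callee: &str, args: &[(&str, &str)]) -> String {
    let rendered: Vec<String> = args
        .iter()
        .map(|(ty, val)| format!("{} {}", ty, val))
        .collect();
    format!("call void @{}({})", callee, rendered.join(", "))
}
```

Because the two regions are separate buffers, a use site deep inside a loop body can still append to `entry_setup`, which is what lets the registration move without restructuring the guard emission.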

Benchmark Harness Runner and RSS Updates

| Layer / File(s) | Summary |
|---|---|
| Node TS detection and cross-platform RSS measurement: benchmarks/compare.sh, benchmarks/quick.sh | Both scripts add detect_node_ts_runner to probe node vs node --experimental-strip-types, rewrite timing extraction with awk, and refactor RSS collection to use OS-specific /usr/bin/time flags with awk-based peak RSS parsing for macOS and Linux. |
| NODE_CMD wiring in measurement and results loop: benchmarks/compare.sh, benchmarks/quick.sh | Node RSS measurement in compare.sh uses ${NODE_CMD[@]}; quick.sh replaces `cut -d' |
| Performance run documentation: PERF_RUN_LOG.md | Adds the 2026-06-17 run entry with baseline and post-change benchmark commands, results for nested_loops and matrix_multiply, verification steps, and a PR link. |
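
The runner probe described for detect_node_ts_runner can be sketched as "try each candidate command in order, keep the first that can execute a .ts file, otherwise skip Node". The version below is a Rust illustration under that assumption, not a port of the shell function: the candidate order, the `runs_ts_file` helper, and the closure-based probe are hypothetical.

```rust
use std::process::{Command, Stdio};

/// Candidate command lines, tried in order (order is an assumption).
const CANDIDATES: &[&[&str]] = &[&["node"], &["node", "--experimental-strip-types"]];

/// Return the first candidate the probe accepts, or None to skip Node columns.
fn detect_node_ts_runner<F>(can_run_ts: F) -> Option<Vec<String>>
where
    F: Fn(&[&str]) -> bool,
{
    for cmd in CANDIDATES {
        if can_run_ts(*cmd) {
            return Some(cmd.iter().map(|s| s.to_string()).collect());
        }
    }
    None
}

/// Example probe: run the command on a .ts file and check the exit status.
/// A command that is missing or fails to start counts as "cannot run".
fn runs_ts_file(cmd: &[&str], ts_path: &str) -> bool {
    Command::new(cmd[0])
        .args(&cmd[1..])
        .arg(ts_path)
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}
```

A caller would pass something like `|cmd| runs_ts_file(cmd, "probe.ts")`, where `probe.ts` is a placeholder path, and treat `None` as the signal to omit the Node columns, matching the skipped Node results reported in this PR.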

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • PerryTS/perry#5084: Modifies the same emit_typed_feedback_register_site function in crates/perry-codegen/src/expr/typed_feedback.rs, controlling whether the registration call is emitted at all — directly upstream of the hoisting change in this PR.

Poem

🐇 Hop hop, no more per-guard repeat,
One call at function entry — neat!
The benchmarks run on Linux now,
With awk to parse the peak RSS row.
The loops and matrices go zoom,
And Perry's codegen finds more room! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically describes the main optimization: hoisting typed feedback site registration from hot paths to function entry. |
| Description check | ✅ Passed | The description covers summary, root cause, benchmark results, and comprehensive validation steps, but is missing required template sections like explicit 'Changes' bullets and test plan checkboxes. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@andrewtdiz andrewtdiz force-pushed the codex/perry-performance-20260617 branch from 6ca1800 to 8d953ca Compare June 17, 2026 04:53
@proggeramlug proggeramlug marked this pull request as ready for review June 17, 2026 07:05
@proggeramlug proggeramlug merged commit ff30d89 into main Jun 17, 2026
15 checks passed
@proggeramlug proggeramlug deleted the codex/perry-performance-20260617 branch June 17, 2026 08:40