docs(vcr-ra): RISC-V lever-parity scoping spike — map ARM perf levers to RV32 (#472, #242) by avrabe · Pull Request #484 · pulseengine/synth

avrabe · 2026-06-25T07:27:36Z

#472 scoping spike — frozen-safe (no codegen change)

Maps each ARM perf lever (landed v0.13–v0.15) to its RV32IMAC status, source-grounded in synth-backend-riscv/src/selector.rs and measured by .text size. The byte-changing ports are the explicitly-separate next gated steps.

Scope-changing finding

The port is 2 levers + 1 RISC-V-specific fold, not a 1:1 port of the three named ARM levers:

| ARM lever | RV32 status |
|---|---|
| cmp→select | N/A for RV32IMAC: no conditional-move (Zicond not in IMAC, no IT-predication); `lower_select` is already the minimal branchy form. |
| local-promotion | APPLIES (direct #390 analogue): non-param i32 locals are always frame-spilled (`sw/lw off(sp)`); port to s-register homing, leaf-only, carrying the #474 fallback from the start. |
| immediate-shift-fold | APPLIES (RV form): const shift amounts use register `sll/sra/srl` (`li tmp,N; sll`); fold to `slli/srli/srai` (ops already exist). |
| const-address-fold | APPLIES (RISC-V-specific): RV already holds the base in `s11` (no #468-style re-materialization), but const `lw/sw` addresses do `li addr; add tmp,s11,addr; lw/sw off(tmp)` instead of folding to `lw/sw (ADDR+off)(s11)`. |
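
To make the const-address-fold row concrete, here is a minimal Python model of the rewrite. It is a sketch, not the code in `selector.rs`: the function name, the `a0`/`t0`/`t1` registers, and the rule that the fold is only legal when `ADDR+off` fits the signed 12-bit `lw/sw` offset are assumptions added for illustration (the 12-bit range itself is the standard RV32I load/store immediate).

```python
# Illustrative model only; names are hypothetical, not taken from selector.rs.

IMM12_MIN, IMM12_MAX = -2048, 2047  # signed 12-bit offset field of RV32I lw/sw


def fold_const_address(addr: int, off: int):
    """Return assembly for a load from constant linear-memory address addr + off.

    Assumption: the fold applies only when the combined offset fits the
    12-bit immediate; otherwise the unfolded three-instruction form stays.
    """
    combined = addr + off
    if IMM12_MIN <= combined <= IMM12_MAX:
        # Folded form: one load directly off the s11 linear-memory base.
        return [f"lw a0, {combined}(s11)"]
    # Unfolded form described in the table: materialize, add base, load.
    return [
        f"li t0, {addr}",
        "add t1, s11, t0",
        f"lw a0, {off}(t1)",
    ]
```

Under these assumptions a small constant address collapses three instructions into one, while an address such as 4096 keeps the unfolded form because it does not fit the immediate.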

Measured `.text` (RV32 vs ARM)

| fixture | ARM | RV32 | headroom |
|---|---|---|---|
| `redundant_base_materialization` | 336 B | 120 B (30 insn) | const-addr-fold ~56 B |
| `leaf_caller_saved` | 200 B | 104 B | local-promotion (sw/lw traffic) |
| `shifts` | 188 B | 44 B | imm-shift-fold ~8 B |

Gated plan

Per-lever PRs (imm-shift-fold → const-address-fold → local-promotion), each flag-off → RV32 differential → qemu_riscv32/ESP32-C3 cycle gate → flip. Oracle gap noted: the RV32 path has no cargo byte-gate and no local RISC-V disassembler — the differential needs an RV32 execution harness + a small instruction decoder, built as part of step 1.
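
As a rough idea of what the "small instruction decoder" for step 1 could look like, the sketch below decodes only the six RV32I shift encodings relevant to the first lever. It is an assumption-laden illustration, not the project's decoder: the function name and the tuple output are hypothetical, and 16-bit compressed (C extension) encodings are deliberately rejected rather than decoded.

```python
def decode_shift(word: int):
    """Decode one 32-bit RV32I shift instruction.

    Returns (mnemonic, rd, rs1, rs2_or_shamt) with register numbers as ints,
    or None for anything that is not a 32-bit shift encoding.
    """
    if word & 0b11 != 0b11:
        return None  # 16-bit compressed encoding: out of scope for this sketch
    opcode = word & 0x7F
    rd = (word >> 7) & 0x1F
    funct3 = (word >> 12) & 0x7
    rs1 = (word >> 15) & 0x1F
    low5 = (word >> 20) & 0x1F  # shamt for OP-IMM, rs2 for OP
    funct7 = (word >> 25) & 0x7F

    if funct3 == 0b001 and funct7 == 0b0000000:
        base = "sll"
    elif funct3 == 0b101 and funct7 == 0b0000000:
        base = "srl"
    elif funct3 == 0b101 and funct7 == 0b0100000:
        base = "sra"
    else:
        return None

    if opcode == 0x13:  # OP-IMM: slli / srli / srai
        return (base + "i", rd, rs1, low5)
    if opcode == 0x33:  # OP: sll / srl / sra
        return (base, rd, rs1, low5)
    return None
```

For example, `decode_shift(0x00359513)` yields `("slli", 10, 11, 3)`, i.e. `slli a0, a1, 3`. A differential could use such a decoder to count immediate-form versus register-form shifts in each flag state.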

This is the frozen-safe measure-before-optimize scoping for #472's implementation, mirroring the #468 scoping doc that proved load-bearing.

🤖 Generated with Claude Code

… to RV32 (#472, #242)

Frozen-safe scoping spike (no codegen change) for the RISC-V lever port. Reads the RV32IMAC backend source and measures the per-function overhead, mapping each ARM perf lever to its RV32 status:

- cmp→select: N/A for RV32IMAC: no conditional-move (Zicond not in IMAC, no predication); `lower_select` is already the minimal branchy form.
- local-promotion: APPLIES (direct #390 analogue): non-param i32 locals are always frame-spilled (sw/lw off(sp)); port to s-register homing, leaf-only, carrying the #474 promotion-exhaustion fallback from the start.
- immediate-shift-fold: APPLIES (RV form): const shift amounts use the register sll/sra/srl (li tmp,N; sll); fold to slli/srli/srai (the ops already exist).
- const-address-fold: APPLIES (RISC-V-specific): RV already holds the linmem base in s11 (no base re-materialization, so #468's base-hoist half is N/A), but const lw/sw addresses do `li addr; add tmp,s11,addr; lw/sw off(tmp)` instead of folding to `lw/sw (ADDR+off)(s11)`.

Scope-changing finding: the port is 2 levers + 1 RISC-V-specific fold, not a 1:1 port of all three named ARM levers (cmp→select does not apply to RV32IMAC).

Measured .text (RV32 vs ARM): redundant_base 120B/30insn (const-addr-fold headroom ~56B), leaf_caller_saved 104B (local-promotion), shifts 44B (imm-shift-fold ~8B).

Lays out the gated per-lever implementation plan (each flag-off → RV32 differential → qemu_riscv32/ESP32-C3 cycle gate → flip) and notes the oracle gap: the RV32 path has no cargo byte-gate and no local RISC-V disassembler, so the differential needs an RV32 execution harness + a small instruction decoder, built as part of step 1.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

codecov · 2026-06-25T07:51:43Z

Codecov Report

✅ All modified and coverable lines are covered by tests.


…#472, #242) (#486)

Traceability sync: the VCR-* roadmap's update logs had drifted behind the shipped RISC-V lever-port prep. Records the RV32 lever-baseline slice (#472/#484/#485) under VCR-ORACLE-001, its accurate home (it already logs the RV32 oracle slices: the frozen-fixture byte gate and the cmp-select execution differential).

The entry captures:

- the three `*_baseline_472` selector tests pinning the current pre-lever RV32 codegen at the RiscVOp-stream level (const-address store unfolded, register-form shift, frame-spilled local), green today and flipping when each lever lands default-on, so a codegen change on the un-byte-gated RV32 path surfaces as a reviewed assertion update;
- the scoping finding that reshaped the port: cmp->select is N/A for RV32IMAC (no conditional-move), so it is local-promotion + immediate-shift-fold + a RISC-V-specific const-address-fold, not a 1:1 port.

Frozen-safe: a single description append + a `riscv` tag on an existing item; no status change, no new links. rivet validate clean (0 non-cross-repo errors under the CI gate).

The ARM perf levers' roadmap reconciliation is deliberately left to a focused pass rather than slotted into ambiguous homes here.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

…i/srli/srai, flag-off (#472, #242) (#487)

Port the first applicable ARM perf lever to the RV32 backend (scoped in #484). A constant shift `i32.shl/shr_u/shr_s (val) (i32.const N)` lowers as `addi tmp,zero,N ; sll/srl/sra rd,val,tmp`: the amount is materialized into a register, then consumed by the register-form shift. RV32 has immediate shift forms `slli/srli/srai` carrying the amount in the instruction, so folding a constant amount drops the `addi` (one instruction per constant shift).

`fold_const_shift` is a post-pass peephole (mirrors the ARM `fold_immediate_shifts` / `fold_uxth` scaffolding): for each `addi tmp,zero,N`, the windowed scan finds the consuming register shift and rewrites it to the immediate form, dropping the `addi` as a dead store.

Soundness:

* `rs1 != tmp` guard: dropping the `addi` must not remove the shift's input definition;
* the `addi` is removed only when it is a dead store: either the fold's destination IS `tmp` (the `slli` redefines it, reading only `rs1`) or `tmp` is dead after the shift (`rv_reg_dead_after`, the RV32 analogue of the ARM `reg_dead_by_redef`; an unmodeled op ⇒ can't-prove ⇒ keep);
* `shamt = N & 31` reproduces the register `sll`'s hardware low-5-bit mask = WASM's shift-mod-32, so amounts ≥ 32 and negative constants fold identically.

Only the single-`addi` const form (N in -2048..=2047, covering every meaningful amount 0..31) folds; a large constant via `lui+addi` stays a register shift.

Flag-off behind `SYNTH_RV_SHIFT_FOLD` (default off): with the env unset the output is byte-identical to the pre-lever baseline, so the frozen RV32 fixtures (control_step / signed_div_const) are unchanged, frozen-safe by construction. The on-target cycle win is validated before the default-on flip.

Oracle (scripts/repro/shift_fold.wat + shift_fold_riscv_differential.py): every exported function runs under unicorn UC_ARCH_RISCV in both flag states and matches wasmtime, including the mask cases (`shl33` << 33→1, `shlneg` << -1→31) and a VARIABLE shift (`shlvar`) that must NOT fold.

Non-vacuity: flag-on `.text` 168B→148B (−20B = exactly 5 const shifts folded); flag-off zero. 6 unit tests cover fold/decline (input-alias guard, live-after, dest==tmp), srl/sra, and the mask. Full RV32 suite (184) + frozen byte gate (ARM+RV32) green; fmt + clippy clean.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
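
The soundness conditions above can be exercised with a small Python model of the peephole. This is a sketch under stated assumptions, not the Rust `fold_const_shift`: ops are hypothetical 4-tuples `(name, rd, a, b)`, only an `addi` immediately followed by its consuming shift is considered (the real pass is described as a windowed scan), reaching the end of the stream counts as "cannot prove dead", and the `tmp != "zero"` guard is an addition of this model.

```python
# Illustrative model; the op encoding and helper names are hypothetical.
REG_SHIFT_TO_IMM = {"sll": "slli", "srl": "srli", "sra": "srai"}
REG_REG = {"add", "sll", "srl", "sra"}        # read both source registers
REG_IMM = {"addi", "slli", "srli", "srai"}    # read one source register


def reads(op):
    """Registers read by `op`, or None if the op is not modeled."""
    name, _rd, a, b = op
    if name in REG_REG:
        return {a, b}
    if name in REG_IMM:
        return {a}
    return None


def reg_dead_after(ops, start, reg):
    """True only if `reg` is overwritten before any read from index `start`."""
    for op in ops[start:]:
        used = reads(op)
        if used is None or reg in used:
            return False  # unmodeled op or a live read: cannot prove dead
        if op[1] == reg:
            return True   # redefined before being read
    return False          # end of stream: stay conservative


def fold_const_shift(ops):
    """Fold `addi tmp,zero,N ; sll/srl/sra rd,rs1,tmp` into the immediate form."""
    out = list(ops)
    i = 0
    while i + 1 < len(out):
        name, tmp, src, n = out[i]
        shift, rd, rs1, rs2 = out[i + 1]
        if (name == "addi" and src == "zero" and tmp != "zero"
                and shift in REG_SHIFT_TO_IMM
                and rs2 == tmp and rs1 != tmp            # input-alias guard
                and (rd == tmp or reg_dead_after(out, i + 2, tmp))):
            # N & 31 mirrors the hardware low-5-bit mask on the shift amount.
            out[i:i + 2] = [(REG_SHIFT_TO_IMM[shift], rd, rs1, n & 31)]
        i += 1
    return out
```

In this model each successful fold removes one `addi`; at 4 bytes per uncompressed instruction that is consistent with the reported −20B for 5 folded shifts. The mask cases behave as in the oracle: a constant 33 folds to `shamt` 1 and a constant -1 folds to `shamt` 31.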

avrabe merged commit ce50642 into main Jun 25, 2026
10 checks passed

avrabe deleted the vcr-ra/472-riscv-scoping branch June 25, 2026 07:51

This was referenced Jun 25, 2026

test(vcr-ra): RV32 lever baselines — pin pre-lever codegen for #472 levers (#472, #242) #485

Merged

docs(vcr): log RV32 lever-baseline + scoping slice in the roadmap (#472, #242) #486

Merged

avrabe mentioned this pull request Jun 25, 2026

feat(vcr-ra): RV32 immediate-shift-fold — const shift amount into slli/srli/srai, flag-off (#472, #242) #487

Merged
