feat(vcr-ra): RV32 const-address-fold — fold constant addr into the access immediate off s11, flag-off (#472, #242)#491

Merged
avrabe merged 1 commit into main from feat/rv32-const-addr-fold-472
Jun 25, 2026


avrabe (Contributor) commented Jun 25, 2026

What

Loop-4 step 2 of the RISC-V lever-parity port (#472), the RISC-V analogue of the ARM base-CSE address half (#468). An i32.load/store (i32.const ADDR) … lowers as addi a,zero,ADDR; add tmp,s11,a; lw/sw _,off(tmp). When ADDR+off fits the signed 12-bit access immediate, fold_const_addr collapses it to a single lw/sw _,(ADDR+off)(s11), dropping the add and the address addi: 2 instructions saved per constant-address access.
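The rewrite described above can be sketched as a match over a three-instruction window. This is a minimal illustration under stated assumptions, not the code in selector.rs: the `Reg` and `Inst` types and the `fits_imm12` and `fold_window` helpers are hypothetical names, and the liveness of the two temporaries after the window is assumed to be checked separately.

```rust
/// Hypothetical, simplified register and instruction model (not the real selector types).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Reg { Zero, S11, A0, T0, T1 }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Inst {
    Addi { rd: Reg, rs1: Reg, imm: i32 },
    Add { rd: Reg, rs1: Reg, rs2: Reg },
    Lw { rd: Reg, base: Reg, off: i32 },
    Sw { rs2: Reg, base: Reg, off: i32 },
}

/// True when `v` fits the signed 12-bit load/store immediate.
fn fits_imm12(v: i32) -> bool {
    (-2048..=2047).contains(&v)
}

/// Try to fold `addi a,zero,ADDR; add tmp,s11,a; lw/sw _,off(tmp)` into a
/// single access off s11. Returns None when the pattern or the range check fails.
/// Liveness of `a` and `tmp` after the window is NOT checked here.
fn fold_window(w: [Inst; 3]) -> Option<Inst> {
    // The address constant must be a single `addi a,zero,ADDR`.
    let (a, addr) = match w[0] {
        Inst::Addi { rd, rs1: Reg::Zero, imm } => (rd, imm),
        _ => return None,
    };
    // The add must use s11 as base and `a` as the address operand.
    let tmp = match w[1] {
        Inst::Add { rd, rs1: Reg::S11, rs2 } if rs2 == a => rd,
        _ => return None,
    };
    // Range-check the SUM, not the individual terms.
    let folded_off = |off: i32| addr.checked_add(off).filter(|s| fits_imm12(*s));
    match w[2] {
        Inst::Lw { rd, base, off } if base == tmp => {
            folded_off(off).map(|o| Inst::Lw { rd, base: Reg::S11, off: o })
        }
        // A store whose value register is one of the dropped temps cannot be folded.
        Inst::Sw { rs2, base, off } if base == tmp && rs2 != tmp && rs2 != a => {
            folded_off(off).map(|o| Inst::Sw { rs2, base: Reg::S11, off: o })
        }
        _ => None,
    }
}
```

In this sketch, a store to constant address 16 with offset 4 becomes a single `sw` with immediate 20 off s11, while an address of 2047 with offset 1 is declined because the sum leaves the 12-bit range.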

How

A post-pass peephole — the structural twin of the #487 shift fold. Soundness:

  • ADDR+off is range-checked as a SUM against [-2048, 2047] (each term already fits in 12 bits, but two in-range values can still sum out of range, e.g. 2047 + 1);
  • the add base must be s11 and its address operand must come from a single addi a,zero,ADDR (small constant; a lui+addi large address stays in the add form, out of v1 scope);
  • 3→1 rewrite, so both dropped temps must be dead — tmp (add result) read only by the access, and a (address constant) read only by the add (rv_reg_dead_after + an untouched-between-def-and-use check); a bounds check between the add and the access reads a and disqualifies the fold.
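The dead-temp condition in the last bullet can be illustrated with a straight-line scan. The description names `rv_reg_dead_after` but does not show its signature, so the `Op` type and the `dead_after` and `untouched_between` helpers below are hypothetical stand-ins. The sketch is also deliberately conservative: a register that is never redefined before the end of the slice is treated as live.

```rust
/// Hypothetical view of one instruction: registers it reads and the one it writes.
/// Registers are plain x-register numbers (t0 = 5, t1 = 6, a0 = 10, s11 = 27).
struct Op {
    reads: Vec<u8>,
    writes: Option<u8>,
}

/// Straight-line approximation: `reg` is dead after index `i` if it is
/// overwritten before any later read. Reaching the end of the slice without
/// a redefinition is treated as live (conservative).
fn dead_after(ops: &[Op], i: usize, reg: u8) -> bool {
    for op in &ops[i + 1..] {
        if op.reads.contains(&reg) {
            return false;
        }
        if op.writes == Some(reg) {
            return true;
        }
    }
    false
}

/// `reg` is neither read nor written strictly between indices `def` and `use_`.
fn untouched_between(ops: &[Op], def: usize, use_: usize, reg: u8) -> bool {
    ops[def + 1..use_]
        .iter()
        .all(|op| !op.reads.contains(&reg) && op.writes != Some(reg))
}
```

With a sequence such as `addi t0; add t1,s11,t0; sw a0,_(t1)` followed by redefinitions of t0 and t1, both temps pass the check. Inserting a bounds check that reads t0 between the add and the access makes `dead_after` return false for t0 at the add, which is the case the bullet says disqualifies the fold.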

Frozen-safe

Flag-off behind SYNTH_RV_ADDR_FOLD (default off ⇒ byte-identical to baseline). The frozen RV32 fixtures and const_addr_store_not_folded_baseline_472 (#485, the flip-marker) stay green. The on-target cycle win is validated before the default-on flip.

Oracle

scripts/repro/const_addr_fold_riscv_differential.py (reuses the redundant_base_materialization fixture): runs init_fields (7 constant-address stores) under unicorn UC_ARCH_RISCV in both flag states; the resulting linear memory is bit-identical to wasmtime.

init_fields: off=ok on=ok vs wasmtime MATCH
.text 120B -> 64B (-56B): ~14 instruction(s) folded across 7 const-addr stores
ORACLE: PASS
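The size delta in the log above can be cross-checked with a little arithmetic. The sketch assumes every removed instruction uses a 4-byte uncompressed RV32 encoding (no compressed 16-bit forms); `INST_BYTES` and `folded_instructions` are illustrative names, not part of the oracle script.

```rust
/// Bytes per uncompressed RV32 instruction (assumption: no 16-bit compressed encodings).
const INST_BYTES: u32 = 4;

/// Number of instructions removed, given .text sizes in bytes before and after the fold.
fn folded_instructions(before: u32, after: u32) -> u32 {
    before.saturating_sub(after) / INST_BYTES
}
```

For the reported sizes, `folded_instructions(120, 64)` gives 14, and 14 instructions over 7 stores is 2 per store, matching the `add` plus `addi` dropped per constant-address access.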

CI-gated as an isolated rv32-const-addr-fold-oracle job (mirrors the shift-fold oracle; emulation deps pip-installed in that job only). Verified locally with the exact CI invocation (debug binary).

Tests / gates

  • 5 unit tests: store/load fold, offset-sum, 12-bit range-guard decline, addr-reused decline.
  • RV32 suite (189) + frozen byte gate (ARM + RV32) green; fmt + clippy clean.

Part of epic #242 (VCR-*), continuing the #472 lever-parity slice after the imm-shift-fold (#487).

🤖 Generated with Claude Code

…ccess immediate off s11, flag-off (#472, #242)

Loop-4 step 2 of the RISC-V lever-parity port (#472), the RISC-V analogue of the
ARM base-CSE address half (#468). A `i32.load/store (i32.const ADDR) …` lowers as
`addi a,zero,ADDR; add tmp,s11,a; lw/sw _,off(tmp)`; when `ADDR+off` fits the
signed-12-bit access immediate, `fold_const_addr` collapses it to a single
`lw/sw _,(ADDR+off)(s11)`, dropping the `add` and the address `addi` — 2
instructions per constant-address access.

Post-pass peephole (the structural twin of the #487 shift fold). Soundness:
  * `ADDR+off` is range-checked as a SUM against [-2048, 2047] (each term
    already fits in 12 bits, but two in-range values can still sum out of range);
  * the `add` base must be s11 and its address operand must come from a single
    `addi a,zero,ADDR` (small constant; a `lui+addi` large address stays in the
    `add` form, out of v1 scope);
  * 3->1 rewrite, so BOTH dropped temps must be dead — `tmp` (add result) read
    only by the access, and `a` (address constant) read only by the `add`
    (rv_reg_dead_after + an untouched-between-def-and-use check); a bounds check
    between the add and the access reads `a` and disqualifies the fold.

Flag-off behind SYNTH_RV_ADDR_FOLD (default off => byte-identical to baseline, so
the frozen RV32 fixtures and `const_addr_store_not_folded_baseline_472` (#485)
stay green — frozen-safe). The on-target cycle win is validated before the flip.

Oracle (scripts/repro/const_addr_fold_riscv_differential.py, reusing the
redundant_base_materialization fixture): runs `init_fields` (7 constant-address
stores) under unicorn UC_ARCH_RISCV in both flag states; the resulting linear
MEMORY is bit-identical to wasmtime. Non-vacuity: .text 120B -> 64B (-56B = 14
instructions, 2 per store). CI-gated as an isolated `rv32-const-addr-fold-oracle`
job mirroring the shift-fold oracle. 5 unit tests (store/load fold, offset sum,
12-bit range guard, addr-reused decline). RV32 suite (189) + frozen byte gate
(ARM+RV32) green; fmt + clippy clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

codecov Bot commented Jun 25, 2026

Codecov Report

❌ Patch coverage is 78.77358% with 45 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crates/synth-backend-riscv/src/selector.rs | 78.77% | 45 Missing ⚠️ |

avrabe merged commit f86e508 into main on Jun 25, 2026
16 of 17 checks passed
avrabe deleted the feat/rv32-const-addr-fold-472 branch on June 25, 2026 14:18