perf: consecutive const-address stores re-materialize the linear-memory base every time #468

Description

@avrabe

Pattern

On the absolute / native-pointer store path, i32.store* (i32.const ADDR) V lowers to:

movw ip, #<base_lo>     ; e.g. 0x0100
movt ip, #<base_hi>     ; e.g. 0x2000   -> ip = 0x20000100 (linear-memory base)
add.w ip, ip, <addr>
str V, [ip]

The base is a loop-invariant compile-time constant, but it's re-materialized (movw+movt = 8 B, ~2 cyc) before every store. A straight-line run of N stores pays N base-materializations where 1 would do. This is common in struct/field initializers (a function that sets up several adjacent fields).

Repro (generic)

scripts/repro/redundant_base_materialization.wat — 7 consecutive const-address stores.

$ synth compile scripts/repro/redundant_base_materialization.wat -o /tmp/rbm.elf --target cortex-m4 --all-exports
$ arm-none-eabi-objdump -d -M force-thumb /tmp/rbm.elf | grep -c 'movw\s\+ip, #256'
7        # 7 stores -> 7 identical base materializations; 6 are redundant

Cost

Recoverable per straight-line run of N stores: ~(N-1) × 8 B and ~(N-1) × 2 cyc. For a 7-store initializer that's ~48 B and ~12 cyc in one function. The cost scales linearly with field count, so field-heavy init code pays the most.
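The arithmetic above can be written as a small helper for estimating other run lengths. This is only a sketch: the function name `recoverable_cost` and the constants are illustrative, the figures are the 8 B / ~2 cyc estimates quoted in this issue, and it ignores any prologue/epilogue cost of reserving a preserved register for the hoisted base.

```python
# Back-of-the-envelope estimate, using the per-materialization figures from this issue.
MOVW_MOVT_BYTES = 8   # movw + movt, two 32-bit Thumb-2 encodings
MOVW_MOVT_CYCLES = 2  # approximate; the real number needs on-target measurement


def recoverable_cost(num_stores: int) -> tuple[int, int]:
    """Return (bytes, cycles) saved if a straight-line run of
    num_stores const-address stores keeps only one base materialization."""
    redundant = max(num_stores - 1, 0)  # the first materialization is still needed
    return redundant * MOVW_MOVT_BYTES, redundant * MOVW_MOVT_CYCLES


print(recoverable_cost(7))  # (48, 12), matching the 7-store repro
```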

Fix direction

Hoist the constant base into a stable register once per straight-line region (const-CSE of the base), then address each store as add ip, base_reg, <addr>; str. Because ip/R12 is encoder scratch and is clobbered on every access, the hoist target must be a preserved register or the already-reserved memory-base register. This is the same const-CSE family as the existing redundant-const elimination, specialized to the 2-instruction MOVW/MOVT base.
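To make the intended transformation concrete, here is a minimal sketch of the hoist as a peephole over a toy instruction list. Everything in it is an assumption for illustration: the tuple-based IR, the name `hoist_base`, the choice of `r11` as the preserved register, and the conservative set of region-ending opcodes. It is not synth's actual IR or pass structure.

```python
# Toy model of the proposed hoist. Instructions are plain tuples, e.g.
# ("movw", "ip", 0x0100) or ("str", "r0", "ip"). This IR is hypothetical.

BASE_REG = "r11"  # hypothetical preserved register chosen as the hoist target
REGION_ENDERS = {"label", "b", "bl", "bx"}  # conservatively end a straight-line region


def hoist_base(code, base_reg=BASE_REG):
    """Rewrite movw/movt/add/str groups so the base is materialized into
    base_reg once per straight-line region instead of before every store."""
    out = []
    cached = None  # (lo, hi) currently held in base_reg, or None
    i = 0
    while i < len(code):
        w = code[i:i + 4]
        is_store_group = (
            len(w) == 4
            and w[0][:2] == ("movw", "ip")
            and w[1][:2] == ("movt", "ip")
            and w[2][:3] == ("add", "ip", "ip")
            and w[3][0] == "str" and w[3][2] == "ip"
        )
        if is_store_group:
            base = (w[0][2], w[1][2])
            if cached != base:
                # First store of the region (or a different base): materialize once.
                out.append(("movw", base_reg, base[0]))
                out.append(("movt", base_reg, base[1]))
                cached = base
            # ip stays the per-access scratch; only its source operand changes.
            out.append(("add", "ip", base_reg, w[2][3]))
            out.append(w[3])
            i += 4
            continue
        ins = code[i]
        # Forget the cached base at region boundaries or if base_reg is mentioned.
        if ins[0] in REGION_ENDERS or (len(ins) > 1 and ins[1] == base_reg):
            cached = None
        out.append(ins)
        i += 1
    return out


def store(addr, val="r0"):
    """Build the current 4-instruction lowering of one const-address store."""
    return [("movw", "ip", 0x0100), ("movt", "ip", 0x2000),
            ("add", "ip", "ip", addr), ("str", val, "ip")]


before = [ins for k in range(7) for ins in store(4 * k)]
after = hoist_base(before)
print(sum(1 for ins in after if ins[0] == "movw"))  # 1
```

In this model the 7-store run shrinks from 28 to 16 instructions, and a label or branch between stores forces a fresh materialization. A real implementation would also need to account for saving and restoring the preserved register when it is not already reserved.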

The change alters emitted bytes, so it would follow the established lever path: land flag-off (frozen-safe), gate on execution-differential and on-target cycle checks, then go default-on. Tracked under #390 / VCR-RA (epic #242).

Related

A DWARF .debug_line emitter already exists (--debug-line, shipped v0.12.0) for source-line stepping into generated code; richer variable-location info is the gated Tier-2 follow-on (VCR-DBG-001, blocked on the register allocator).
