
fix(runtime,codegen): demote unique strings stored into array elements#5548

Merged
proggeramlug merged 1 commit into PerryTS:main from machineloop:fix/array-element-string-demote
Jun 22, 2026

Conversation

@machineloop

@machineloop machineloop commented Jun 22, 2026

Contributor

Summary

Follow-up to #5533. That PR demoted heap-stored unique strings to shared for object-field stores (runtime_store_jsvalue_slot), but array-element stores reach the slot through several paths that don't share that choke point and were left uncovered. The same aliasing bug applies: a uniquely-owned (refcount==1) string written into an array element is aliased by the array, so a later in-place += on the source local mutates the stored element and corrupts it.

This applies the same tag-checked demote (js_string_addref_if_heap_string, introduced in #5533 — a no-op for small-string-optimized / non-string values) at each array store site.
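The aliasing described above can be reproduced in a small toy model. This is a sketch under stated assumptions, not Perry's runtime: `Heap`, `HeapStr`, `Handle`, and `store_then_grow` are hypothetical names, handles stand in for raw pointers, and the refcount bookkeeping is simplified. It only illustrates why a store that copies the pointer without an addref lets the in-place append fast path rewrite the stored element.

```rust
/// Toy heap string with a manually tracked refcount (a model, not Perry's layout).
struct HeapStr {
    refcount: u32,
    data: String,
}

/// Handles are indices into `Heap::strings`, standing in for raw pointers.
type Handle = usize;

struct Heap {
    strings: Vec<HeapStr>,
}

impl Heap {
    fn new() -> Self {
        Heap { strings: Vec::new() }
    }

    /// A fresh string starts uniquely owned (refcount == 1).
    fn alloc(&mut self, s: &str) -> Handle {
        self.strings.push(HeapStr { refcount: 1, data: s.to_string() });
        self.strings.len() - 1
    }

    /// The "demote": once the count is above 1 the string is treated as shared.
    fn addref(&mut self, h: Handle) {
        self.strings[h].refcount += 1;
    }

    /// Models `s += suffix` and returns the handle the local holds afterwards.
    fn append(&mut self, h: Handle, suffix: &str) -> Handle {
        if self.strings[h].refcount == 1 {
            // Fast path: unique owner, so grow the buffer in place.
            self.strings[h].data.push_str(suffix);
            h
        } else {
            // Shared: build a new string and drop the local's reference to the old one.
            let grown = format!("{}{}", self.strings[h].data, suffix);
            self.strings[h].refcount -= 1;
            self.alloc(&grown)
        }
    }
}

/// Stores a string into a one-element array, grows the source, and returns
/// (stored element, source local) as they read afterwards.
fn store_then_grow(demote: bool) -> (String, String) {
    let mut heap = Heap::new();
    let s = heap.alloc("payload");
    if demote {
        heap.addref(s);
    }
    let array = vec![s]; // the slot now holds the same handle as the local
    let s = heap.append(s, "+grown");
    (heap.strings[array[0]].data.clone(), heap.strings[s].data.clone())
}

fn main() {
    println!("without demote: {:?}", store_then_grow(false));
    println!("with demote:    {:?}", store_then_grow(true));
}
```

In this model the run without the demote reports the stored element as `payload+grown`, while the run with it leaves the element at `payload` and only the source local grows.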

Changes

Codegen — the inline fast paths that bypass the runtime store helpers:

  • crates/perry-codegen/src/expr/write_barrier.rs: emit_jsvalue_slot_store_on_block, the shared inline element-store emitter for array literals, push, and arr[i] =. Gated on the existing "value may be a heap pointer" flag, so numeric stores pay nothing.
  • crates/perry-codegen/src/stmt/let_stmt.rs: the scalar-replaced array-literal init (const a = [s]). The element is demoted where it's captured into its slot, before the deferred build (mirrors the object scalar-field demote).
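The two properties mentioned here, a tag-checked addref that does nothing for non-string values and a codegen-side gate that skips the call entirely for stores known to be numeric, can be sketched as follows. Everything in this block is an assumption made for illustration: the NaN-boxing constants, `box_heap_string`, the refcount table, and `emit_element_store` are hypothetical and are not taken from Perry's source.

```rust
use std::collections::HashMap;

// Hypothetical NaN-boxing layout, for illustration only.
const TAG_MASK: u64 = 0xFFFF_0000_0000_0000;
const HEAP_STRING_TAG: u64 = 0x7FFF_0000_0000_0000;
const PAYLOAD_MASK: u64 = 0x0000_FFFF_FFFF_FFFF;

/// Boxes a (fake) heap address as a heap-string value.
fn box_heap_string(addr: u64) -> u64 {
    HEAP_STRING_TAG | (addr & PAYLOAD_MASK)
}

/// Runtime side: bump the refcount only when the tag says "heap string".
/// Returns whether a count was actually changed.
fn addref_if_heap_string(bits: u64, refcounts: &mut HashMap<u64, u32>) -> bool {
    if (bits & TAG_MASK) != HEAP_STRING_TAG {
        return false; // numbers and other values fall through untouched
    }
    match refcounts.get_mut(&(bits & PAYLOAD_MASK)) {
        Some(rc) => {
            *rc += 1;
            true
        }
        None => false,
    }
}

/// Codegen side: the call is only emitted when the stored value may be a
/// heap pointer, so a statically numeric store emits just the slot write.
fn emit_element_store(value_may_be_heap_pointer: bool) -> Vec<&'static str> {
    let mut ops = Vec::new();
    if value_may_be_heap_pointer {
        ops.push("call addref_if_heap_string");
    }
    ops.push("store slot");
    ops
}

fn main() {
    let mut refcounts = HashMap::new();
    refcounts.insert(0x1000, 1u32);

    // A plain double is ignored by the tag check.
    println!("{}", addref_if_heap_string(1.5f64.to_bits(), &mut refcounts));
    // A boxed heap string gets its count bumped from 1 to 2.
    println!("{}", addref_if_heap_string(box_heap_string(0x1000), &mut refcounts));

    println!("{:?}", emit_element_store(false));
    println!("{:?}", emit_element_store(true));
}
```

The point of the split is that correctness rests on the runtime tag check, while the codegen flag is only a cost optimization for stores that cannot hold a pointer.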

Runtime — the paths codegen hands off to a store helper:

  • crates/perry-runtime/src/array/push_pop.rs: js_array_push_f64 (push realloc / forwarding / proxy paths).
  • crates/perry-runtime/src/array/indexing.rs: js_array_set_f64 / js_array_set_f64_extend (arr[i] =).
  • crates/perry-runtime/src/array/alloc.rs: js_array_from_values (outline array-literal construction).
  • crates/perry-runtime/src/array/splice_slice.rs: the splice inserted-items loop.

crates/perry/tests/string_append_heap_alias.rs gains 4 compile-run regression tests.

Internal reshuffles (sort, splice tail shift, slice copy) only move values already stored in an array — already shared — so no demote is needed there.

Related issue

Follow-up to #5533.

Test plan

cargo test -p perry --test string_append_heap_alias
cargo build --release -p perry-runtime -p perry-codegen

Each of the 4 new tests stores a non-SSO refcount==1 string into an array (literal / arr[i] = / push / splice) then grows the source, asserting the stored element is unchanged. Verified they fail without the demotes (the stored element shows the grown value) and pass with them — 8/8 in the suite, including the object-field cases from #5533.

  • cargo build --release clean (affected crates)
  • cargo test -p perry --test string_append_heap_alias passes (8/8)
  • Added compile-run regression tests
  • (if CLI / stdlib / runtime API changed) Updated docs/src/ — n/a (internal store behavior)
  • (if touching a platform UI backend) — n/a

Checklist

  • I have NOT bumped the workspace version or edited CLAUDE.md / CHANGELOG.md (maintainer handles these at merge)
  • My commits follow the loose feat: / fix: / docs: / chore: prefix convention used in the log
  • I've read CONTRIBUTING.md and agree to the Code of Conduct

Summary by CodeRabbit

  • Bug Fixes
    • Prevented heap string aliasing when storing into arrays by ensuring heap string values are reference-counted/demoted before array element writes across common write paths (array literal elements, direct indexing/setting, extending sets, push, and splice), so later += mutations no longer corrupt stored elements.
  • Tests
    • Added end-to-end regression tests covering heap-aliasing behavior for array literals (including full-outline mode), indexing, push, and splice, verifying stored values remain unchanged after += on the original string.

@coderabbitai

coderabbitai Bot commented Jun 22, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 5652c33c-d6e7-4f8a-809d-7db068b20d1e

📥 Commits

Reviewing files that changed from the base of the PR and between cac311a and f5a3245.

📒 Files selected for processing (7)
  • crates/perry-codegen/src/expr/write_barrier.rs
  • crates/perry-codegen/src/stmt/let_stmt.rs
  • crates/perry-runtime/src/array/alloc.rs
  • crates/perry-runtime/src/array/indexing.rs
  • crates/perry-runtime/src/array/push_pop.rs
  • crates/perry-runtime/src/array/splice_slice.rs
  • crates/perry/tests/string_append_heap_alias.rs
🚧 Files skipped from review as they are similar to previous changes (7)
  • crates/perry-runtime/src/array/alloc.rs
  • crates/perry-runtime/src/array/push_pop.rs
  • crates/perry-runtime/src/array/indexing.rs
  • crates/perry-codegen/src/expr/write_barrier.rs
  • crates/perry-runtime/src/array/splice_slice.rs
  • crates/perry-codegen/src/stmt/let_stmt.rs
  • crates/perry/tests/string_append_heap_alias.rs

📝 Walkthrough

Walkthrough

Adds js_string_addref_if_heap_string calls to every array element write path—js_array_from_values, js_array_set_f64, js_array_set_f64_extend, js_array_push_f64, and js_array_splice—and to two codegen emission paths (emit_jsvalue_slot_store_on_block_inner and lower_let scalar-replacement). Five regression tests are added covering each affected store mechanism and construction path.

Changes

Heap-string aliasing fix for array element stores

| Layer / File(s) | Summary |
| --- | --- |
| Runtime array element store addref calls: crates/perry-runtime/src/array/alloc.rs, crates/perry-runtime/src/array/indexing.rs, crates/perry-runtime/src/array/push_pop.rs, crates/perry-runtime/src/array/splice_slice.rs | Adds js_string_addref_if_heap_string at the start of js_array_from_values, js_array_set_f64, js_array_set_f64_extend, js_array_push_f64, and the js_array_splice insertion loop to demote uniquely-owned heap strings to shared before any slot write. |
| Codegen emit paths for heap-string addref: crates/perry-codegen/src/expr/write_barrier.rs, crates/perry-codegen/src/stmt/let_stmt.rs | In emit_jsvalue_slot_store_on_block_inner, emits the addref call for value_double when layout_note_needed is true. In lower_let's scalar-replacement path, conditionally emits the addref when an array literal element is a LocalGet of a String-typed local; import of expr_produces_non_pointer_bits_by_construction is added for the gating condition. |
| Regression tests: crates/perry/tests/string_append_heap_alias.rs | Adds five end-to-end regression tests covering array literal element store, index assignment, push, splice, and outlined array-literal construction path, each verifying the stored element is not corrupted by a subsequent += on the source string. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • PerryTS/perry#5098: Introduces the scalar-aware slot layout note path (js_gc_note_slot_layout_aware) in the same emit_jsvalue_slot_store_on_block_inner helper where this PR adds the js_string_addref_if_heap_string emission.
  • PerryTS/perry#5533: Implements the same "demote unique heap strings to shared" fix via js_string_addref_if_heap_string for runtime_store_jsvalue_slot and scalar-replaced field/property stores, directly preceding this PR's extension to array stores.
  • PerryTS/perry#5412: Introduces and centralizes js_array_from_values, the same function whose per-element loop this PR extends with the addref call.

Poem

🐇 A string once shared can't be mutated in place,
So I addref each element before it finds its space.
Array slot, push, splice — every path now aware,
A uniquely-owned buffer gets a ref to spare.
The snapshot stays frozen while the source carries on,
No aliasing bugs left when this rabbit hops along! 🌿

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: demoting unique strings to shared ownership when stored into array elements, which is the core purpose of this PR. |
| Description check | ✅ Passed | The description includes all required sections: a clear summary of the follow-up fix, detailed list of changes across codegen and runtime, related issue reference, comprehensive test plan with verification results, and completed checklist. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
crates/perry/tests/string_append_heap_alias.rs (1)

124-125: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Add a regression that forces the outlined literal store path (js_array_from_values).

Line 124 explicitly targets the small-literal lowering path, so this set still doesn’t pin the outlined array-literal construction path in this file.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/perry/tests/string_append_heap_alias.rs` around lines 124 - 125, The
current test targets only the small-literal lowering path that uses
js_array_alloc and js_array_push_f64, leaving the outlined array-literal
construction path uncovered. Add a new regression test after the existing test
that forces the outlined literal store path by using js_array_from_values
instead of the small-literal approach. This new test should create a scenario
that triggers the outlined array-literal construction behavior to ensure that
code path is properly tested and pinned in this test file.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/perry-codegen/src/stmt/let_stmt.rs`:
- Around line 411-416: The needs_string_demote gate in the let_stmt.rs file only
triggers when ctx.local_types.get(src_id) is exactly perry_types::Type::String,
but LocalGet values typed as Any or other non-exact types can still carry
uniquely-owned heap strings that need demotion to avoid alias corruption.
Broaden the matches! condition in needs_string_demote to include not just exact
String type checks but also type cases like Any that could potentially contain
heap strings, ensuring demote logic applies whenever a LocalGet value could
reference a uniquely-owned heap string regardless of its static type annotation.
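
One way to read this suggestion is sketched below. The `Type` enum and both predicate functions are hypothetical stand-ins rather than the actual `perry_types::Type` or the code in let_stmt.rs, and treating a local with no recorded type as "may hold a string" is an assumption of this sketch. Because the demote helper is described as a no-op for non-string values, widening the gate should only cost an extra call on locals that turn out not to hold a heap string.

```rust
/// Hypothetical static types; the real `perry_types::Type` has its own variants.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Type {
    String,
    Number,
    Boolean,
    Any,
}

/// Narrow gate: demote only when the local is statically typed as String.
fn needs_string_demote_exact(ty: Option<Type>) -> bool {
    matches!(ty, Some(Type::String))
}

/// Wider gate: also demote when the static type cannot rule out a string.
/// Treating a missing type (None) as "unknown" is an assumption of this sketch.
fn needs_string_demote_broad(ty: Option<Type>) -> bool {
    matches!(ty, Some(Type::String | Type::Any) | None)
}

fn main() {
    let cases = [
        Some(Type::String),
        Some(Type::Any),
        Some(Type::Number),
        Some(Type::Boolean),
        None,
    ];
    for ty in cases {
        println!(
            "{:?}: exact={} broad={}",
            ty,
            needs_string_demote_exact(ty),
            needs_string_demote_broad(ty)
        );
    }
}
```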


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 3d79a1cf-9327-4cac-b6f9-4d585ebfd17e

📥 Commits

Reviewing files that changed from the base of the PR and between f392dba and bebc0cb.

📒 Files selected for processing (7)
  • crates/perry-codegen/src/expr/write_barrier.rs
  • crates/perry-codegen/src/stmt/let_stmt.rs
  • crates/perry-runtime/src/array/alloc.rs
  • crates/perry-runtime/src/array/indexing.rs
  • crates/perry-runtime/src/array/push_pop.rs
  • crates/perry-runtime/src/array/splice_slice.rs
  • crates/perry/tests/string_append_heap_alias.rs

Comment thread crates/perry-codegen/src/stmt/let_stmt.rs Outdated
@machineloop machineloop force-pushed the fix/array-element-string-demote branch from bebc0cb to cac311a Compare June 22, 2026 13:53
@machineloop

Contributor Author

Addressed the outline-path coverage nitpick in cac311a: added unique_string_in_outlined_array_literal_is_not_corrupted, which forces the js_array_from_values path via PERRY_FULL_OUTLINE_IC=1 on an escaping literal (so it isn't scalar-replaced). Confirmed via the emitted IR that it routes through js_array_from_values, and the stored element stays intact — 9/9 in the suite.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/perry/tests/string_append_heap_alias.rs`:
- Around line 253-255: The test is not verifying that the compiled binary
executed successfully before attempting to parse its stdout. After calling
Command::new(&output).output() and storing the result in the run variable, add
an assertion to check run.status.success() before accessing run.stdout. This
will ensure that if the binary fails to run or returns a non-zero exit code, the
test will fail with an explicit status error rather than a confusing output
mismatch assertion.
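
A minimal sketch of the check being requested, using only `std::process::Command`. The helper name `checked_stdout` is hypothetical, and the example runs `sh` as a stand-in for the compiled test binary, so it assumes a Unix-like environment.

```rust
use std::process::Command;

/// Runs the command and returns its stdout only if it exited successfully.
/// A spawn failure or non-zero exit becomes an explicit error that includes
/// the exit status and stderr, instead of surfacing later as an output mismatch.
fn checked_stdout(cmd: &mut Command) -> Result<String, String> {
    let run = cmd
        .output()
        .map_err(|e| format!("failed to spawn: {}", e))?;
    if !run.status.success() {
        return Err(format!(
            "exited with {}; stderr: {}",
            run.status,
            String::from_utf8_lossy(&run.stderr)
        ));
    }
    Ok(String::from_utf8_lossy(&run.stdout).into_owned())
}

fn main() {
    // Successful run: stdout is safe to compare against the expected value.
    let ok = checked_stdout(Command::new("sh").args(["-c", "echo intact"]));
    println!("{:?}", ok);

    // Failing run: the partial stdout is never inspected.
    let bad = checked_stdout(Command::new("sh").args(["-c", "echo partial; exit 3"]));
    println!("{:?}", bad);
}
```

Inside a test, the same effect comes from `assert!(run.status.success(), ...)` placed directly after the `output()` call, or from calling `.expect(...)` on a helper like this one.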

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 90f04182-5511-4b64-a50c-de39af7b189e

📥 Commits

Reviewing files that changed from the base of the PR and between bebc0cb and cac311a.

📒 Files selected for processing (7)
  • crates/perry-codegen/src/expr/write_barrier.rs
  • crates/perry-codegen/src/stmt/let_stmt.rs
  • crates/perry-runtime/src/array/alloc.rs
  • crates/perry-runtime/src/array/indexing.rs
  • crates/perry-runtime/src/array/push_pop.rs
  • crates/perry-runtime/src/array/splice_slice.rs
  • crates/perry/tests/string_append_heap_alias.rs
🚧 Files skipped from review as they are similar to previous changes (6)
  • crates/perry-runtime/src/array/splice_slice.rs
  • crates/perry-runtime/src/array/indexing.rs
  • crates/perry-codegen/src/expr/write_barrier.rs
  • crates/perry-runtime/src/array/alloc.rs
  • crates/perry-runtime/src/array/push_pop.rs
  • crates/perry-codegen/src/stmt/let_stmt.rs

Comment thread crates/perry/tests/string_append_heap_alias.rs
Follow-up to PerryTS#5533, which demoted heap-stored unique strings to shared for
object-field stores (`runtime_store_jsvalue_slot`) but left array-element stores
uncovered. The same aliasing bug applies: a uniquely-owned (refcount==1) string
written into an array element is aliased by the array, so a later in-place `+=`
on the source local (`js_string_append`'s refcount==1 fast path) mutates the
stored element and corrupts it.

Array element stores reach the slot through several paths that don't share one
choke point, so apply the same tag-checked demote (`js_string_addref_if_heap_string`,
added in PerryTS#5533 — a no-op for small-string-optimized / non-string values) at each.

Codegen (the inline fast paths that bypass the runtime store helpers):
- `emit_jsvalue_slot_store_on_block` (write_barrier.rs): the shared inline
  element-store emitter for array literals, `push`, and `arr[i] =`. Gated on the
  existing "value may be a heap pointer" flag, so numeric stores pay nothing.
- the scalar-replaced array-literal init in let_stmt.rs (`const a = [s]`): demote
  the element where it is captured into its slot, before the deferred build —
  mirrors the object scalar-field demote.

Runtime (the paths codegen hands off to a store helper):
- `js_array_push_f64` (push realloc / forwarding / proxy paths; the grow path
  receives the already-demoted value).
- `js_array_set_f64` / `js_array_set_f64_extend` (`arr[i] =`).
- `js_array_from_values` (the outline array-literal construction path).
- the splice inserted-items loop.

Internal reshuffles (sort, splice tail shift, slice copy) only move values
already stored in an array — already shared — so no demote is needed there.

Tests: crates/perry/tests/string_append_heap_alias.rs gains 4 compile-run
regression cases (array literal, `arr[i] =`, push, splice). Each stores a
non-SSO refcount==1 string then grows the source; all fail without the demotes.
@machineloop machineloop force-pushed the fix/array-element-string-demote branch from cac311a to f5a3245 Compare June 22, 2026 14:06
@proggeramlug proggeramlug merged commit b86091d into PerryTS:main Jun 22, 2026
15 checks passed
proggeramlug added a commit that referenced this pull request Jun 23, 2026
…aths (unshift / fill / with / from_jsvalue) (#5567)

* fix(runtime): #5552 demote unique strings in remaining array insert paths

Follow-up to #5533 (object fields) and #5548 (the array store paths it
enumerated). A uniquely-owned (refcount==1) heap string written into an array
element aliases that slot, so a later in-place `s += x` on the source local
(`js_string_append`'s refcount==1 fast path) rewrites the stored element and
corrupts it. #5548 fixed the push / set / from_values / splice-insert paths; the
sibling insert/replace paths below do the same raw element write without the
demote.

Apply the same tag-checked `js_string_addref_if_heap_string` (no-op for SSO /
non-string, idempotent) before the element write at each:

- `js_array_unshift_f64` (covers `js_array_unshift_jsvalue` transitively) and the
  per-item loop in `js_array_unshift_variadic`.
- `js_array_fill` / `js_array_fill_range` (demote once before the fill loop — the
  source aliases every filled slot) and `js_array_fill_generic` (its
  object-receiver loop writes `value` into each index directly; the array
  receiver delegates to the two above, so the extra demote is idempotent).
- `js_array_with` (the replacement value stored into the new array's slot; the
  cloned elements are already shared).
- `js_array_from_jsvalue` (mixed-type literal construction — the JSValue sibling
  of the already-covered `js_array_from_values`).

Internal reshuffles (sort, splice tail shift, slice copy, copyWithin) only move
values already stored in an array — already shared — so no demote is needed.
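
As an illustration of the "demote once before the fill loop" point, here is a toy sketch. `HeapStr`, `demote`, and `fill` are hypothetical names and the refcount model is simplified; it is not the runtime's `js_array_fill`. It shows that a single demote before the loop covers every filled slot, and that a second demote leaves the string shared, which is the sense of "idempotent" assumed here.

```rust
/// Toy heap string: only the refcount matters for this sketch.
struct HeapStr {
    refcount: u32,
}

impl HeapStr {
    /// The in-place append fast path would only fire while this is true.
    fn is_unique(&self) -> bool {
        self.refcount == 1
    }

    /// Marks the string as shared. Calling it again keeps it shared.
    fn demote(&mut self) {
        self.refcount = self.refcount.saturating_add(1);
    }
}

/// Fills every slot with the same handle (an index into `heap`).
/// One demote up front is enough because all slots alias the same string.
fn fill(slots: &mut [usize], value: usize, heap: &mut [HeapStr]) {
    heap[value].demote();
    for slot in slots.iter_mut() {
        *slot = value;
    }
}

fn main() {
    let mut heap = vec![HeapStr { refcount: 1 }];
    let mut slots = [usize::MAX; 4];

    println!("unique before fill: {}", heap[0].is_unique());
    fill(&mut slots, 0, &mut heap);
    println!("unique after fill:  {}", heap[0].is_unique());

    // A delegating caller that demotes again does no harm.
    heap[0].demote();
    println!("unique after extra demote: {}", heap[0].is_unique());
}
```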

Tests: `string_append_heap_alias.rs` gains compile-run regressions for unshift /
fill / with, each confirmed to fail without the demote. `js_array_from_jsvalue`
is not emitted by codegen from any TypeScript source, so it gets a runtime unit
test (`array/tests.rs`) instead, also confirmed to fail without the demote.

Refs #5533, #5548.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(runtime): restore GC_STORE_AUDIT marker on unshift_variadic insert write

The #5552 demote line pushed the ptr::write past the proximity window of the
existing GC_STORE_AUDIT(BARRIERED) marker, failing the lint job's GC store-site
inventory check. Re-annotate the insert write directly.
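
The failure mode described here can be pictured with a toy proximity check. This is not the project's lint: the window size, the `unaudited_writes` function, and the sample snippets are all assumptions made for illustration. It only shows how inserting one line between a marker and its write can push the write out of the window, and how re-annotating the write directly fixes it.

```rust
/// Hypothetical window: a marker must appear on the write line or within
/// this many lines above it.
const WINDOW: usize = 2;

/// Returns the 1-based line numbers of `ptr::write(` calls with no
/// `GC_STORE_AUDIT` marker inside the window.
fn unaudited_writes(source: &str) -> Vec<usize> {
    let lines: Vec<&str> = source.lines().collect();
    let mut missing = Vec::new();
    for (i, line) in lines.iter().enumerate() {
        if !line.contains("ptr::write(") {
            continue;
        }
        let start = i.saturating_sub(WINDOW);
        let audited = lines[start..=i]
            .iter()
            .any(|l| l.contains("GC_STORE_AUDIT"));
        if !audited {
            missing.push(i + 1);
        }
    }
    missing
}

// Marker two lines above the write: inside the window.
const BEFORE: &str = "// GC_STORE_AUDIT(BARRIERED)\n\
let slot = base.add(i);\n\
ptr::write(slot, value);";

// The added demote line pushes the write three lines below the marker.
const AFTER_DEMOTE: &str = "// GC_STORE_AUDIT(BARRIERED)\n\
let slot = base.add(i);\n\
js_string_addref_if_heap_string(value);\n\
ptr::write(slot, value);";

// Re-annotating directly above the write restores the pairing.
const REANNOTATED: &str = "let slot = base.add(i);\n\
js_string_addref_if_heap_string(value);\n\
// GC_STORE_AUDIT(BARRIERED)\n\
ptr::write(slot, value);";

fn main() {
    println!("before:      {:?}", unaudited_writes(BEFORE));
    println!("after:       {:?}", unaudited_writes(AFTER_DEMOTE));
    println!("reannotated: {:?}", unaudited_writes(REANNOTATED));
}
```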

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Ralph <ralph@skelpo.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2 participants