fix(gc): close the #5029 conservative-scan x evacuation corruption — re-enable write-barrier stress tests
Four coordinated fixes, each addressing a measured failure mode of the
gc_write_barrier_stress suite (red since #4998 made explicit gc() run the
full conservative native-stack scan):
1. roots: conservative discoveries in the OLD generation are pin-only in
MINOR cycles (CONS_PINNED, no mark, no trace seed). A stale stack word
can resurrect a DEAD old object whose slots still point into long-swept
nursery memory; once fresh nursery blocks land on those freed ranges the
slots alias live young objects, and tracing/evacuating/rewriting through
them corrupts the heap (and produced the deterministic
missing_edges=7710 verifier signature on a dead 256 KB array backing).
Minors never sweep the old gen, so the mark is not needed for survival,
and a LIVE old object's real old->young edges are dirty-page-covered by
the write barriers (measured: ~60k barrier calls per inter-cycle window,
all landing correctly). FULL collections keep mark+trace (#4977).
2. verify/rewrite: rewrite_heap_objects and verify_heap_objects no longer
skip UNMARKED non-nursery objects. Being unmarked in a minor is the
normal state of a live old object, not a sign of death; old->old
references have no remembered-set coverage, so this walk is the only
pass that re-points an old referrer at an evacuated target before the
forwarding stubs are released (measured: 753 skipped stale referrers
per evacuating cycle).
3. remembered set: restore_surviving_dirty_coverage() re-derives kept
pages after remembered_set_clear from the SAME walk the old-young-edge
verifier uses, so a still-needed page can never be dropped (also closes
the copying-path re-remember gap where ptrs.decode_bits returns None
for freshly copied to-survivor children). External entries are
validated address-first (page classify / malloc registry) before any
header dereference, because the reclaim unit tests seed synthetic entries.
4. policy: old-page defrag (C4b compaction) is skipped on cycles that ran
the conservative stack scan. Conservative stack words cannot be
rewritten after a move and CONS_PINNED only covers direct discoveries;
the stress suite demonstrated a moved old object with an un-rewritten
referrer (shape-table lookups through it returned recycled memory).
Copying minors never run the conservative scan, so steady-state defrag
is unaffected. Follow-up to lift the gate: #5042 (codegen raw-pointer
globals, e.g. perry_class_keys_*, need mutable-root registration).
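
The two policy decisions in items 1 and 4 can be sketched as below. This is a
minimal illustration, not the runtime's code: every type and function name is
hypothetical, and whether FULL cycles also pin conservative discoveries is an
assumption (the message only states that they keep mark+trace).

```rust
// Hypothetical names throughout; a sketch of the policy described above.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Cycle {
    Minor,
    Full,
}

/// What the collector does with an OLD-generation object that was found
/// by the conservative native-stack scan.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct RootAction {
    pin: bool,        // set CONS_PINNED so the object is not moved
    mark: bool,       // set the mark bit
    trace_seed: bool, // push the object onto the trace worklist
}

/// Item 1: in a minor cycle an old-gen conservative discovery is pin-only,
/// so a stale stack word cannot cause tracing through a dead object's slots.
fn on_old_gen_conservative_hit(cycle: Cycle) -> RootAction {
    match cycle {
        Cycle::Minor => RootAction { pin: true, mark: false, trace_seed: false },
        // Assumption: full cycles pin as well as mark+trace.
        Cycle::Full => RootAction { pin: true, mark: true, trace_seed: true },
    }
}

/// Item 4: old-page defrag is skipped on any cycle that ran the
/// conservative stack scan, since stack words cannot be rewritten after a move.
fn may_defrag_old_pages(ran_conservative_scan: bool) -> bool {
    !ran_conservative_scan
}
```

Keeping both decisions as pure functions of the cycle kind and the scan flag
makes the gate easy to unit-test without constructing a heap.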
Validation: gc_write_barrier_stress 2/2 across repeated runs (re-enabled,
previously #[ignore]d); standalone repro matrix 23/23 clean across
FORCE_EVACUATE+VERIFY_EVACUATION, PERRY_CONSERVATIVE_STACK_SCAN=full,
default, PERRY_GEN_GC=0 and PERRY_GEN_GC_EVACUATE=0; gc unit suite
377/377; perry-runtime lib green (single known macOS date flake); perry
bin+integration suites green.