Skip to content

Commit bdb586d

Browse files
author
Ralph Küpper
committed
fix(gc): close the #5029 conservative-scan x evacuation corruption — re-enable write-barrier stress tests
Four coordinated fixes, each addressing a measured failure mode of the gc_write_barrier_stress suite (red since #4998 made explicit gc() run the Full conservative native-stack scan): 1. roots: conservative discoveries in the OLD generation are pin-only in MINOR cycles (CONS_PINNED, no mark, no trace seed). A stale stack word can resurrect a DEAD old object whose slots still point into long-swept nursery memory; once fresh nursery blocks land on those freed ranges the slots alias live young objects, and tracing/evacuating/rewriting through them corrupts the heap (and produced the deterministic missing_edges=7710 verifier signature on a dead 256 KB array backing). Minors never sweep the old gen, so the mark is not needed for survival, and a LIVE old object's real old->young edges are dirty-page-covered by the write barriers (measured: ~60k barrier calls per inter-cycle window, all landing correctly). FULL collections keep mark+trace (#4977). 2. verify/rewrite: rewrite_heap_objects and verify_heap_objects no longer skip UNMARKED non-nursery objects. Being unmarked in a minor is the normal state of a live old object, not a sign of death; old->old references have no remembered-set coverage, so this walk is the only pass that re-points an old referrer at an evacuated target before the forwarding stubs are released (measured: 753 skipped stale referrers per evacuating cycle). 3. remembered set: restore_surviving_dirty_coverage() re-derives kept pages after remembered_set_clear from the SAME walk the old-young-edge verifier uses, so a still-needed page can never be dropped (also closes the copying-path re-remember gap where ptrs.decode_bits returns None for freshly copied to-survivor children). External entries are validated address-first (page classify / malloc registry) before any header dereference - the reclaim unit tests seed synthetic entries. 4. policy: old-page defrag (C4b compaction) is skipped on cycles that ran the conservative stack scan. Conservative stack words cannot be rewritten after a move and CONS_PINNED only covers direct discoveries; the stress suite demonstrated a moved old object with an un-rewritten referrer (shape-table lookups through it returned recycled memory). Copying minors never run the conservative scan, so steady-state defrag is unaffected. Follow-up to lift the gate: #5042 (codegen raw-pointer globals, e.g. perry_class_keys_*, need mutable-root registration). Validation: gc_write_barrier_stress 2/2 across repeated runs (re-enabled, previously #[ignore]d); standalone repro matrix 23/23 clean across FORCE_EVACUATE+VERIFY_EVACUATION, PERRY_CONSERVATIVE_STACK_SCAN=full, default, PERRY_GEN_GC=0 and PERRY_GEN_GC_EVACUATE=0; gc unit suite 377/377; perry-runtime lib green (single known macOS date flake); perry bin+integration suites green.
1 parent a46ca1d commit bdb586d

6 files changed

Lines changed: 234 additions & 21 deletions

File tree

crates/perry-runtime/src/gc/copying.rs

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -994,6 +994,7 @@ pub(super) fn gc_collect_minor_copying_fast_path_with_eligibility(
994994
}
995995
remembered_set_clear();
996996
collector.sticky.restore();
997+
restore_surviving_dirty_coverage(&snapshot);
997998
let malloc_freed_bytes = if malloc_sweep_due {
998999
let phase_start = trace_phase_start(trace);
9991000
let freed = sweep_malloc_objects();

crates/perry-runtime/src/gc/cycle.rs

Lines changed: 25 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -565,15 +565,21 @@ impl RootScanCycleState {
565565
consider_evacuation: bool,
566566
budget: usize,
567567
allow_synchronous_scanners: bool,
568+
pin_only_old_conservative: bool,
568569
) -> bool {
569570
match self.subphase {
570571
RootScanSubphase::ConservativeStack => {
571572
if budget == 0 {
572573
return false;
573574
}
574575
let conservative_scan_decision = conservative_stack_scan_decision();
575-
let conservative_root_stats =
576-
mark_stack_roots_for_decision(valid_ptrs, conservative_scan_decision);
576+
// #5029: minors retain old-gen conservative discoveries
577+
// pin-only (no trace) — see try_mark_conservative_word.
578+
let conservative_root_stats = mark_stack_roots_for_decision(
579+
valid_ptrs,
580+
conservative_scan_decision,
581+
pin_only_old_conservative,
582+
);
577583
let conservative_pin_stats = if consider_evacuation
578584
&& matches!(
579585
conservative_scan_decision,
@@ -824,6 +830,9 @@ pub(super) struct GcCycleState {
824830
atomic_finalize: Option<AtomicFinalizeCycleState>,
825831
minor: Option<MinorCycleContext>,
826832
live_old_to_young_sticky: Option<StickyRememberedSet>,
833+
/// Dirty snapshot captured just before this cycle's remembered_set_clear
834+
/// begins, for the post-restore coverage repair (#5029).
835+
pre_clear_dirty_snapshot: Option<super::barrier::RememberedDirtySnapshot>,
827836
sweep_state: Option<IncrementalSweepState>,
828837
reclaim_state: Option<ReclaimCycleState>,
829838
sweep: Option<SweepTraceStats>,
@@ -854,6 +863,7 @@ impl GcCycleState {
854863
atomic_finalize: None,
855864
minor: None,
856865
live_old_to_young_sticky: None,
866+
pre_clear_dirty_snapshot: None,
857867
sweep_state: None,
858868
reclaim_state: None,
859869
sweep: None,
@@ -907,6 +917,7 @@ impl GcCycleState {
907917
evacuation_sticky: StickyRememberedSet::default(),
908918
}),
909919
live_old_to_young_sticky: None,
920+
pre_clear_dirty_snapshot: None,
910921
sweep_state: None,
911922
reclaim_state: None,
912923
sweep: None,
@@ -1079,6 +1090,7 @@ impl GcCycleState {
10791090
consider_evacuation,
10801091
budget.work_units,
10811092
allow_synchronous_scanners,
1093+
self.minor.is_some(),
10821094
);
10831095
trace_phase_record(&mut self.trace, phase_name, phase_start);
10841096
if done {
@@ -1415,6 +1427,14 @@ impl GcCycleState {
14151427
let clear = {
14161428
let reclaim_state =
14171429
self.reclaim_state.as_mut().expect("reclaim state exists");
1430+
if reclaim_state.remembered_set_clear.is_none()
1431+
&& self.pre_clear_dirty_snapshot.is_none()
1432+
{
1433+
// Snapshot the pre-clear dirty set so the
1434+
// post-restore repair can rescan it (#5029).
1435+
self.pre_clear_dirty_snapshot =
1436+
Some(super::barrier::remembered_dirty_snapshot());
1437+
}
14181438
reclaim_state
14191439
.remembered_set_clear
14201440
.get_or_insert_with(RememberedSetClearState::new)
@@ -1430,6 +1450,9 @@ impl GcCycleState {
14301450
if let Some(sticky) = self.live_old_to_young_sticky.as_ref() {
14311451
sticky.restore();
14321452
}
1453+
if let Some(snapshot) = self.pre_clear_dirty_snapshot.take() {
1454+
restore_surviving_dirty_coverage(&snapshot);
1455+
}
14331456
let reclaim_state =
14341457
self.reclaim_state.as_mut().expect("reclaim state exists");
14351458
reclaim_state.remembered_set_clear = None;

crates/perry-runtime/src/gc/mod.rs

Lines changed: 18 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -111,7 +111,24 @@ fn gc_collect_minor_with_trigger(trigger: GcTriggerSnapshot) -> GcCollectOutcome
111111
let current_rss_bytes = crate::process::get_rss_bytes();
112112
let evacuation_policy_allowed = gen_gc_evacuate_enabled();
113113
let force_evacuation = gc_force_evacuate_enabled();
114-
let old_page_selection = if evacuation_policy_allowed && old_to_young_tracking_complete() {
114+
// #5029: old-page defrag (C4b old-gen compaction) is skipped on cycles
115+
// that run the conservative native-stack scan. Conservative stack words
116+
// cannot be rewritten after a move, and per-object CONS_PINNED only
117+
// protects DIRECT discoveries — the stress suite demonstrated a moved
118+
// old object whose remaining referrer was not rewritten (clone shape
119+
// lookups through it returned recycled memory). Until every such
120+
// referrer surface is registered for rewrite, moving old objects is only
121+
// sound when all roots are precise. Copying minors (the steady-state
122+
// path) never run the conservative scan, so defrag keeps operating
123+
// there via its own policy.
124+
let conservative_scan_this_cycle = matches!(
125+
roots::conservative_stack_scan_decision(),
126+
roots::ConservativeStackScanDecision::Scan
127+
);
128+
let old_page_selection = if evacuation_policy_allowed
129+
&& old_to_young_tracking_complete()
130+
&& !conservative_scan_this_cycle
131+
{
115132
select_old_page_defrag_pages(force_evacuation)
116133
} else {
117134
OldPageDefragSelection::default()

crates/perry-runtime/src/gc/roots.rs

Lines changed: 88 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -434,9 +434,10 @@ pub extern "C" fn js_gc_register_global_root(ptr: i64) {
434434
pub(super) fn mark_stack_roots_for_decision(
435435
valid_ptrs: &ValidPointerSet,
436436
decision: ConservativeStackScanDecision,
437+
pin_only_old: bool,
437438
) -> ConservativeRootTraceStats {
438439
match decision {
439-
ConservativeStackScanDecision::Scan => mark_stack_roots_unchecked(valid_ptrs),
440+
ConservativeStackScanDecision::Scan => mark_stack_roots_unchecked(valid_ptrs, pin_only_old),
440441
ConservativeStackScanDecision::SkipDisabled => ConservativeRootTraceStats::default(),
441442
}
442443
}
@@ -447,6 +448,7 @@ pub(super) fn mark_stack_roots_for_decision(
447448
/// variables — codegen stores these as raw I64 words (not NaN-boxed) in registers and on stack.
448449
pub(super) fn mark_stack_roots_unchecked(
449450
valid_ptrs: &ValidPointerSet,
451+
pin_only_old: bool,
450452
) -> ConservativeRootTraceStats {
451453
let mut stats = ConservativeRootTraceStats::default();
452454
// Capture callee-saved registers into a buffer via setjmp.
@@ -476,7 +478,7 @@ pub(super) fn mark_stack_roots_unchecked(
476478

477479
// Scan the register buffer (covers callee-saved regs: x19-x28 on AArch64, rbx/rbp/r12-r15 on x86_64)
478480
for &word in &jmp_buf {
479-
if try_mark_value_or_raw(word, valid_ptrs) {
481+
if try_mark_conservative_word(word, valid_ptrs, pin_only_old) {
480482
stats.root_count += 1;
481483
}
482484
}
@@ -530,7 +532,7 @@ pub(super) fn mark_stack_roots_unchecked(
530532
options(nostack, preserves_flags),
531533
);
532534
for &word in &fp_regs {
533-
if try_mark_value_or_raw(word, valid_ptrs) {
535+
if try_mark_conservative_word(word, valid_ptrs, pin_only_old) {
534536
stats.root_count += 1;
535537
}
536538
}
@@ -566,7 +568,7 @@ pub(super) fn mark_stack_roots_unchecked(
566568
let mut addr = stack_top;
567569
while addr < stack_bottom {
568570
let word = unsafe { *(addr as *const u64) };
569-
if try_mark_value_or_raw(word, valid_ptrs) {
571+
if try_mark_conservative_word(word, valid_ptrs, pin_only_old) {
570572
stats.root_count += 1;
571573
}
572574
addr += 8;
@@ -579,6 +581,88 @@ pub(super) fn mark_stack_roots_unchecked(
579581
/// This is used for conservative scanning where Perry stores raw I64 pointers (for is_string/
580582
/// is_array/is_pointer/is_closure vars) alongside NaN-boxed F64 values.
581583
#[inline]
584+
/// Conservative-scan variant of `try_mark_value_or_raw` (#5029).
585+
///
586+
/// In a MINOR cycle, a conservative stack discovery that lives in the OLD
587+
/// generation is retained WITHOUT tracing: minors never sweep the old
588+
/// generation, so the mark is not needed for survival, and `CONS_PINNED`
589+
/// already excludes the object from every evacuation candidate set. Tracing
590+
/// it is actively unsound: a stale stack word (a dead frame slot from an
591+
/// earlier loop iteration) can resurrect a DEAD old object whose slots still
592+
/// hold pointers into long-swept nursery memory. Once fresh nursery blocks
593+
/// are allocated over those freed ranges, the resurrected object's slots
594+
/// alias live young objects; tracing / force-evacuating / rewriting through
595+
/// them corrupts the heap, and the old-young-edge verifier reports the same
596+
/// aliases as phantom missing remembered-set edges (the
597+
/// gc_write_barrier_stress signature: missing_edges=7710 on a dead 256 KB
598+
/// array backing). A LIVE old object loses nothing here: its real old→young
599+
/// edges are dirty-page-covered by the write barriers, which is the only
600+
/// mechanism a minor uses to find them anyway, and precise roots (shadow
601+
/// stack, module vars, registered scanners) still mark and trace as before.
602+
///
603+
/// FULL collections (`pin_only_old == false`) keep the old behavior: the old
604+
/// sweep frees unmarked old objects, so a stack-only-referenced old object
605+
/// must be marked for retention there (#4977).
606+
pub(super) fn try_mark_conservative_word(
607+
word: u64,
608+
valid_ptrs: &ValidPointerSet,
609+
pin_only_old: bool,
610+
) -> bool {
611+
if !pin_only_old {
612+
return try_mark_value_or_raw(word, valid_ptrs);
613+
}
614+
615+
// Resolve the candidate exactly like `try_mark_value_or_raw` —
616+
// NaN-boxed heap tags first, then the raw-I64 pointer fallback with the
617+
// interior-pointer (`enclosing_object`) lookup.
618+
let tag = word & TAG_MASK;
619+
let target = if tag == POINTER_TAG || tag == STRING_TAG || tag == BIGINT_TAG {
620+
let ptr_val = (word & POINTER_MASK) as usize;
621+
if ptr_val == 0 || !valid_ptrs.maybe_contains(ptr_val) || !valid_ptrs.contains(&ptr_val) {
622+
return false;
623+
}
624+
ptr_val
625+
} else {
626+
if !(0x1000..=0x0000_FFFF_FFFF_FFFF).contains(&word) {
627+
return false;
628+
}
629+
let raw_ptr = word as usize;
630+
if !valid_ptrs.maybe_contains(raw_ptr)
631+
&& raw_ptr.saturating_sub(0x10_0000) > valid_ptrs.range_max
632+
{
633+
return false;
634+
}
635+
if valid_ptrs.contains(&raw_ptr) {
636+
raw_ptr
637+
} else {
638+
match valid_ptrs.enclosing_object(raw_ptr) {
639+
Some(start) => start,
640+
None => return false,
641+
}
642+
}
643+
};
644+
645+
unsafe {
646+
let header = header_from_user_ptr(target as *const u8);
647+
if (*header).gc_flags & GC_FLAG_MARKED != 0 {
648+
return false;
649+
}
650+
if (*header).gc_flags & GC_FLAG_PINNED != 0 {
651+
return false;
652+
}
653+
if matches!(
654+
crate::arena::classify_heap_generation(target),
655+
crate::arena::HeapGeneration::Old
656+
) {
657+
// Pin-only retention: not moved, not traced, not marked.
658+
return pin_conservative_root_header(header);
659+
}
660+
(*header).gc_flags |= GC_FLAG_MARKED;
661+
push_mark_seed(header);
662+
}
663+
true
664+
}
665+
582666
pub(super) fn try_mark_value_or_raw(word: u64, valid_ptrs: &ValidPointerSet) -> bool {
583667
// First try NaN-boxed interpretation (POINTER_TAG / STRING_TAG / BIGINT_TAG)
584668
if try_mark_value(word, valid_ptrs) {

crates/perry-runtime/src/gc/verify.rs

Lines changed: 102 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -162,6 +162,84 @@ pub(super) unsafe fn remember_evacuated_old_copy_young_slots(
162162
});
163163
}
164164

165+
/// Post-cycle remembered-set repair (#5029): after `remembered_set_clear` +
166+
/// sticky restore, rescan every PRE-cycle dirty page (and external dirty
167+
/// entry) with the same slot predicate `verify_old_to_young_edges_covered`
168+
/// uses, and re-remember any slot that still points into the nursery. The
169+
/// from-scratch rebuild can disagree with the verifier across the cycle
170+
/// boundary (measured: ~130 covered pages at cycle entry, ~10 after the
171+
/// rebuild, verifier missing_edges=7710 on the next minor while the swept
172+
/// children were still referenced); deriving the kept set from the SAME walk
173+
/// the verifier performs makes dropping a still-needed page impossible by
174+
/// construction. Pages whose every slot now points old (the common case
175+
/// after evacuation rewrites) are still dropped, so the remembered set keeps
176+
/// shrinking as before.
177+
pub(super) fn restore_surviving_dirty_coverage(snapshot: &RememberedDirtySnapshot) {
178+
let mut sticky = StickyRememberedSet::default();
179+
// Mirror scan_remembered_dirty_slots_copying's scan_header guards: the
180+
// external dirty entries can carry headers the harness seeded
181+
// synthetically, and a dead entry may point at reclaimed memory — never
182+
// dereference before the plausibility check.
183+
let mut visit_parent = |header: *mut GcHeader| unsafe {
184+
if header.is_null() {
185+
return;
186+
}
187+
let arena_parent = plausible_gc_header(header, true);
188+
let malloc_parent = !arena_parent && plausible_gc_header(header, false);
189+
if !arena_parent && !malloc_parent {
190+
return;
191+
}
192+
if (*header).gc_flags & GC_FLAG_FORWARDED != 0 {
193+
return;
194+
}
195+
let user = (header as *mut u8).add(GC_HEADER_SIZE) as usize;
196+
if arena_parent
197+
&& !matches!(
198+
crate::arena::classify_heap_generation(user),
199+
crate::arena::HeapGeneration::Old
200+
)
201+
{
202+
return;
203+
}
204+
visit_gc_rewrite_slots(header, |slot| unsafe {
205+
if crate::weakref::is_weak_target_trace_slot(header, slot.slot) {
206+
return;
207+
}
208+
slot.record_layout_read();
209+
remember_evacuated_old_to_young_slot(&mut sticky, header, slot.slot);
210+
});
211+
};
212+
if !snapshot.dirty_old_pages.is_empty() {
213+
crate::arena::old_arena_walk_objects_on_pages(&snapshot.dirty_old_pages, |hp| {
214+
visit_parent(hp as *mut GcHeader);
215+
});
216+
}
217+
let mut seen_external = crate::fast_hash::new_ptr_hash_set();
218+
for &(_, header_addr) in &snapshot.external_dirty_entries {
219+
if !seen_external.insert(header_addr) {
220+
continue;
221+
}
222+
// External entries may be stale (or, in the GC unit tests,
223+
// synthetic). Establish that the address is dereference-safe
224+
// WITHOUT touching it: old/longlived arena pages are always
225+
// mapped; anything else must still be a registered malloc GC
226+
// object.
227+
let deref_safe = matches!(
228+
crate::arena::classify_heap_generation(header_addr),
229+
crate::arena::HeapGeneration::Old | crate::arena::HeapGeneration::Longlived
230+
) || MALLOC_STATE.with(|s| {
231+
s.borrow()
232+
.objects
233+
.iter()
234+
.any(|&h| h as usize == header_addr)
235+
});
236+
if deref_safe {
237+
visit_parent(header_addr as *mut GcHeader);
238+
}
239+
}
240+
sticky.restore();
241+
}
242+
165243
pub(super) fn rebuild_evacuated_old_to_young_remembered_set(
166244
evacuated_headers: &[*mut GcHeader],
167245
) -> StickyRememberedSet {
@@ -556,9 +634,22 @@ pub(super) fn rewrite_heap_objects(valid_ptrs: &ValidPointerSet) {
556634
if flags & GC_FLAG_FORWARDED != 0 {
557635
return;
558636
}
559-
// Skip dead objects — sweep is about to free them.
637+
// Skip dead NURSERY objects — this cycle's sweep frees them.
638+
// An UNMARKED object outside the nursery is NOT dead: a minor
639+
// cycle neither traces nor sweeps the old generation, so being
640+
// unmarked is the normal state of a live old object whose pages
641+
// are clean. Old→old references have no dirty-page coverage
642+
// (barriers only track old→young), which makes this walk the
643+
// ONLY pass that re-points an old referrer at an old-page
644+
// evacuation target. Skipping unmarked old objects left their
645+
// slots aimed at forwarding stubs that
646+
// `release_evacuated_original_forwarding_stubs` then released —
647+
// dangling pointers into reused memory (#5029).
560648
if flags & (GC_FLAG_MARKED | GC_FLAG_PINNED) == 0 {
561-
return;
649+
let user = (header as *mut u8).add(GC_HEADER_SIZE) as usize;
650+
if crate::arena::pointer_in_nursery(user) {
651+
return;
652+
}
562653
}
563654
rewrite_heap_object_fields(header, valid_ptrs);
564655
}
@@ -751,8 +842,16 @@ pub(super) fn verify_heap_objects(valid_ptrs: &ValidPointerSet) {
751842
if flags & GC_FLAG_FORWARDED != 0 {
752843
return;
753844
}
845+
// Mirror rewrite_heap_objects: only unmarked NURSERY objects are
846+
// dead this cycle. Unmarked old/longlived/malloc objects survive a
847+
// minor and must hold no stale forwarded references either — this
848+
// gate previously hid exactly the #5029 dangling old→old slots from
849+
// PERRY_GC_VERIFY_EVACUATION.
754850
if flags & (GC_FLAG_MARKED | GC_FLAG_PINNED) == 0 {
755-
return;
851+
let user = (header as *mut u8).add(GC_HEADER_SIZE) as usize;
852+
if crate::arena::pointer_in_nursery(user) {
853+
return;
854+
}
756855
}
757856
verify_heap_object_fields(header, valid_ptrs, "heap fields");
758857
};

0 commit comments

Comments
 (0)