
refactor(runtime): centralize handle-vs-pointer address classification (addr_class) + map.rs deref-order fix + CI audit#4899

Merged: proggeramlug merged 1 commit into main from addr-class-centralize on Jun 10, 2026

Conversation

@proggeramlug

Contributor

Centralizes Perry's "is this u64 a handle or a dereferenceable heap pointer?" checks into one audited module, crates/perry-runtime/src/value/addr_class.rs, eliminating the hand-re-typed magnitude literals behind the recurring Linux-only segfault class (#1843, #4004, #4665, #4800).

What's in the module

  • Named band constants with ownership docs: HANDLE_BAND_MAX = 0x100000, common registry [1, 0x40000), fetch [0x40000, 0xE0000), zlib [0xE0000, 0xF0000), proxy ids [0xF0000, 0x100000), and the raw-numeric Web Streams band [0x100000, 0x200000) (documented as the deliberate above-band exception).
  • Predicates: is_handle_band, is_small_handle, is_above_handle_band, is_proxy_id_band, is_stream_id_band, is_plausible_heap_addr.
  • Checked header read: try_read_gc_header(addr) — magnitude check (band + platform heap floor) FIRST, deref second.
  • is_valid_obj_ptr moved here verbatim (re-exported from object:: so its ~30 call sites compile unchanged) — the platform heap floor now lives with the band map.
  • perry-stdlib's band constants (fetch/mod.rs, zlib.rs, streams.rs, common/handle.rs) now derive from addr_class — single source of truth; the existing stdlib unit tests assert containment against the addr_class constants.
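As a rough illustration of the band map and predicates listed above, here is a minimal sketch. The boundary values and predicate names come from this description and the inventory table; the constant names other than HANDLE_BAND_MAX and PROXY_ID_BAND_START, and the band_owner helper, are hypothetical. is_plausible_heap_addr and try_read_gc_header are left out because they depend on the platform heap floor, whose value is not stated here.

```rust
// Sketch only; the real module is crates/perry-runtime/src/value/addr_class.rs.

pub const FETCH_BAND_START: u64 = 0x40000; // hypothetical name
pub const ZLIB_BAND_START: u64 = 0xE0000; // hypothetical name
pub const PROXY_ID_BAND_START: u64 = 0xF0000;
pub const HANDLE_BAND_MAX: u64 = 0x100000;
pub const STREAM_ID_BAND_END: u64 = 0x200000; // hypothetical name

/// True for anything below the band ceiling, including 0 (null).
pub fn is_handle_band(addr: u64) -> bool {
    addr < HANDLE_BAND_MAX
}

/// Same as `is_handle_band`, but excludes 0.
pub fn is_small_handle(addr: u64) -> bool {
    addr > 0 && addr < HANDLE_BAND_MAX
}

pub fn is_above_handle_band(addr: u64) -> bool {
    addr >= HANDLE_BAND_MAX
}

pub fn is_proxy_id_band(addr: u64) -> bool {
    (PROXY_ID_BAND_START..HANDLE_BAND_MAX).contains(&addr)
}

/// Web Streams ids sit deliberately above the handle band.
pub fn is_stream_id_band(addr: u64) -> bool {
    (HANDLE_BAND_MAX..STREAM_ID_BAND_END).contains(&addr)
}

/// Hypothetical helper: which sub-band inside the handle band owns `addr`.
pub fn band_owner(addr: u64) -> Option<&'static str> {
    match addr {
        0 => None,
        a if a < FETCH_BAND_START => Some("common registry"),
        a if a < ZLIB_BAND_START => Some("fetch"),
        a if a < PROXY_ID_BAND_START => Some("zlib"),
        a if a < HANDLE_BAND_MAX => Some("proxy ids"),
        _ => None,
    }
}
```

With these definitions, a value such as 0x40000 classifies as a small handle owned by fetch, while 0x100000 is above the handle band and falls in the stream-id band.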

Deref-ordering fix (the #4665 class)

map.rs is_registered_map dereferenced (addr - 8) as a GcHeader before consulting MAP_REGISTRY (a deliberate fast path profiled against the old SipHash registry — the registry has since moved to the Fibonacci-hash PtrHashSet, so the rationale is stale). That is exactly the ordering that segfaulted in #4665 for set.rs (Linux unmaps freed/foreign pages; macOS mimalloc retains them). It now mirrors set.rs: band check → registry (deref-free, authoritative) → header type confirm via try_read_gc_header.
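The ordering described above (band check, then registry, then header confirm) can be sketched as follows. This is not the code in map.rs: the GcHeader layout, the TYPE_MAP tag, the MapCell type, and the use of a std HashSet in place of MAP_REGISTRY are all assumptions made for illustration, and the real try_read_gc_header also applies a platform heap floor that is omitted here.

```rust
use std::collections::HashSet;

const HANDLE_BAND_MAX: u64 = 0x100000;
const GC_HEADER_SIZE: u64 = 8; // assumed from the (addr - 8) header read
const TYPE_MAP: u8 = 7; // hypothetical type tag

#[repr(C)]
pub struct GcHeader {
    pub obj_type: u8, // hypothetical layout
    pub pad: [u8; 7],
}

#[repr(C)]
pub struct MapCell {
    pub header: GcHeader,
    pub payload: u64,
}

/// Allocates a cell whose header is tagged as a map (illustration only).
pub fn new_map_cell() -> Box<MapCell> {
    Box::new(MapCell {
        header: GcHeader { obj_type: TYPE_MAP, pad: [0; 7] },
        payload: 0,
    })
}

/// Address of the payload, i.e. the value the runtime would pass around.
pub fn payload_addr(cell: &MapCell) -> u64 {
    (cell as *const MapCell as u64) + GC_HEADER_SIZE
}

/// Magnitude check first, dereference second.
///
/// # Safety
/// For out-of-band values, `addr - GC_HEADER_SIZE` must point at a live header.
pub unsafe fn try_read_gc_header(addr: u64) -> Option<u8> {
    if addr < HANDLE_BAND_MAX {
        return None; // handle or null: never touch memory
    }
    let header = (addr - GC_HEADER_SIZE) as *const GcHeader;
    Some(unsafe { (*header).obj_type })
}

pub fn is_registered_map(registry: &HashSet<u64>, addr: u64) -> bool {
    // 1. Band check: handles are rejected without any memory access.
    if addr < HANDLE_BAND_MAX {
        return false;
    }
    // 2. Registry lookup: deref-free and authoritative.
    if !registry.contains(&addr) {
        return false;
    }
    // 3. Header confirm: only reached for addresses the registry vouches for.
    matches!(unsafe { try_read_gc_header(addr) }, Some(TYPE_MAP))
}
```

The point of the ordering is that an unregistered or in-band candidate returns at step 1 or 2, so a stale or foreign address is never read.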

Inventory

Classification sites converted (all in this PR). "Sem change?" = any observable difference in release builds.

| Site(s) | Old check | New call | Sem change? |
| --- | --- | --- | --- |
| map.rs is_registered_map | < 0x100000 → header deref → registry | band check → registry first → try_read_gc_header | Yes: deref-order fix (no JS-visible change; removes UB on stale/garbage candidates) |
| set.rs is_registered_set | band literal; raw post-registry deref | predicate + try_read_gc_header | None (adds platform-floor validation on the post-registry confirm) |
| weakref.rs weak_class_id_from_receiver | < 0x100000 then raw deref | try_read_gc_header | Strengthened: adds is_valid_obj_ptr before the deref (rejects only values that were previously deref'd as garbage) |
| date.rs is_date_cell_addr, temporal is_temporal_cell_addr, object/exotic_expando.rs | < 0x100000 \|\| !is_valid_obj_ptr then deref | try_read_gc_header | None (identical checks, one place) |
| object_ops.rs exotic-desc read | < 0x100000 + < GC_HEADER_SIZE+0x1000 + !is_valid_obj_ptr | is_plausible_heap_addr | None (the 0x1000 check is subsumed by the band check) |
| global_this.rs, boxed_primitives.rs (×3), field_get_set.rs:3199 | < 0x100000 \|\| !is_valid_obj_ptr | is_plausible_heap_addr | None |
| value/truthy.rs raw-string-pointer probe | bits >= 0x10_0000 | is_above_handle_band | None (same constant; it was missed by every previous 0x100000 sweep because of the underscore separator, which is exactly the duplication hazard this PR removes) |
| proxy.rs PROXY_TAG_BASE | local 0x000F_0000 | = addr_class::PROXY_ID_BAND_START | None |
| Proxy-id band checks: field_get_set.rs (×2), delete_rest.rs (×2) | (0xF0000..0x100000).contains | is_proxy_id_band | None |
| Stream-id probes: field_get_set.rs (×2) | (0x100000..0x200000).contains | is_stream_id_band | None |
| ~50 further sites: symbol.rs (×6), closure/dynamic_props.rs is_closure_ptr, value/dynamic_object.rs, value/to_string.rs, builtins/{numbers,arithmetic×2,console,formatting×2,util_format}, typed_feedback.rs, json/{raw_json,reviver×2}, array/{is_array,iterator×2,flat_clone}, object/{field_set_by_name,field_get_set×8,native_call_method×8,native_module×4,instanceof×5,class_registry×2,descriptors,object_ops}, promise/{combinators×5,spec_combinators×2}, fs/mod.rs, stdlib fetch/headers.rs, events/constructors.rs | < 0x100000 / > 0 && < 0x100000 / >= 0x100000 (incl. null-check variants) | is_handle_band / is_small_handle / is_above_handle_band | None (bit-identical comparisons; is_handle_band covers the removed explicit null checks since 0 < 0x100000) |
| stdlib fetch/mod.rs, zlib.rs, streams.rs(+tests), common/handle.rs | locally-typed band constants | derive from addr_class constants | None (same values) |

CI guard against recurrence

scripts/addr_class_inventory.py + scripts/addr_class_allowlist.txt (same path-prefix | substring | justification format as the #4886 GC store-site gate), wired into the lint job (self-test + gate). Two rule classes, comments stripped:

  1. band-literal: word-bounded 0x100000 / 0xF0000 / 0x40000 / 0xE0000 / 0x200000 (incl. 0x4_0000-style) in perry-runtime/perry-stdlib code outside value/addr_class.rs. After this PR only two allowlisted exceptions remain: the #4740 test fixture (issue "Runtime SIGSEGV in closure_dynamic_prop_by_key (IC-miss property lookup) under repeated fetch + response property access"; raw in-band addresses asserted NOT to be deref'd) and the O_SYMLINK => 0x200000 fs constant (unrelated literal collision).
  2. gcheader-cast: as *const/mut GcHeader outside gc/ and addr_class.rs. 234 pre-existing casts across 104 files are grandfathered per-file with category justifications (arena internals / test self-allocations / call-site-guarded probes). New files doing raw casts fail the gate; the allowlist header directs edits in grandfathered files toward try_read_gc_header.
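To make the band-literal rule concrete, here is a minimal sketch of a word-bounded hex-literal matcher that tolerates underscore separators. The actual gate is the Python script scripts/addr_class_inventory.py, whose implementation is not shown in this PR text; the function name, the value-based comparison, and the crude `//` comment stripping below are assumptions for illustration.

```rust
/// Band boundaries named in the band-literal rule above.
const BAND_LITERALS: [u64; 5] = [0x100000, 0xF0000, 0x40000, 0xE0000, 0x200000];

/// Returns the hex literals on `line` whose value equals a band boundary.
/// Hypothetical helper: compares by value, so `0x10_0000` and `0x000F_0000`
/// are caught as well as the plain spellings.
pub fn find_band_literals(line: &str) -> Vec<String> {
    // Crude comment strip: drop everything after the first `//`.
    let code = line.split("//").next().unwrap_or("");
    let bytes = code.as_bytes();
    let mut hits = Vec::new();
    let mut i = 0;
    while i + 1 < bytes.len() {
        let starts_hex = bytes[i] == b'0' && (bytes[i + 1] == b'x' || bytes[i + 1] == b'X');
        // Left word boundary: previous byte is not an identifier character.
        let bounded_left =
            i == 0 || !(bytes[i - 1].is_ascii_alphanumeric() || bytes[i - 1] == b'_');
        if starts_hex && bounded_left {
            // Consume hex digits and underscore separators.
            let mut j = i + 2;
            while j < bytes.len() && (bytes[j].is_ascii_hexdigit() || bytes[j] == b'_') {
                j += 1;
            }
            // Right word boundary: a trailing suffix such as `u64` disqualifies.
            let bounded_right = j == bytes.len() || !bytes[j].is_ascii_alphanumeric();
            let digits: String = code[i + 2..j].chars().filter(|c| *c != '_').collect();
            if bounded_right {
                if let Ok(value) = u64::from_str_radix(&digits, 16) {
                    if BAND_LITERALS.contains(&value) {
                        hits.push(code[i..j].to_string());
                    }
                }
            }
            i = j;
        } else {
            i += 1;
        }
    }
    hits
}
```

Comparing parsed values rather than text is one way to avoid the underscore-separator miss described in the inventory for value/truthy.rs.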

Flagged as follow-ups (NOT changed — behavior contract requires a repro per site)

  • ~40 GcHeader probes guarded only by low floors (GC_HEADER_SIZE + 0x1000 / < 0x10000) rather than the handle band — e.g. dgram.rs, node_stream_dispatch.rs, tty.rs, dns.rs, node_repl.rs, perf_hooks.rs, event_target.rs, array/{header,species,generic,flat_clone,indexing}.rs, util_*. A fetch handle (first id 0x40000) passes these floors; whether one can actually reach each site is unproven, so they are grandfathered in the allowlist and listed here rather than changed.
  • set.rs/map.rs *_ptr_from_receiver_bits and proxy.rs accept raw-i64 candidates with a > 0x10000 floor (below the band) before routing into the (now-safe) registry checks.
  • node_v8.rs:116 (< 0x10000), bigint.rs (< 0x10000 floors) — same class.
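The gap between a low floor and the handle band can be shown with a few comparisons. This is a sketch under assumptions: GC_HEADER_SIZE is taken to be 8 (matching the (addr - 8) header read mentioned for map.rs), and the function names are hypothetical stand-ins for the inline checks at the grandfathered sites.

```rust
const GC_HEADER_SIZE: u64 = 8; // assumed value
const HANDLE_BAND_MAX: u64 = 0x100000;
const FIRST_FETCH_HANDLE: u64 = 0x40000;

/// Stand-in for sites guarded by `GC_HEADER_SIZE + 0x1000`.
pub fn passes_header_floor(addr: u64) -> bool {
    addr >= GC_HEADER_SIZE + 0x1000
}

/// Stand-in for sites guarded by a `< 0x10000` rejection.
pub fn passes_64k_floor(addr: u64) -> bool {
    addr >= 0x10000
}

/// The band check used by the converted sites.
pub fn passes_band_check(addr: u64) -> bool {
    addr >= HANDLE_BAND_MAX
}

/// True when a value would slip past both low floors yet still be a handle.
pub fn slips_past_low_floors(addr: u64) -> bool {
    passes_header_floor(addr) && passes_64k_floor(addr) && !passes_band_check(addr)
}
```

Under these assumptions, slips_past_low_floors(FIRST_FETCH_HANDLE) is true: the first fetch id clears both low floors but is still inside the handle band, which is why these sites are listed as follow-ups.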

Validation

  • cargo build --release -p perry-runtime -p perry-stdlib -p perry ✅ (caches cleared first)
  • New e2e crates/perry/tests/addr_class_handle_bands.rs ✅ covers the #4800 shape (fix(runtime): for-of/spread over a Headers handle no longer segfaults): a new Headers() handle through for-of / spread / typeof / Array.isArray / instanceof Map, next to live Map/Set brand checks and plain object/array/Date, exact-stdout asserted.
  • addr_class unit tests assert band layout + that try_read_gc_header returns None for the historical crash addresses (0x1008, 0x40000, 0x4000c, 0xF0000, …) without touching memory.
  • Local per-crate test run (-p perry-runtime -p perry-stdlib -p perry-codegen -p perry-hir -p perry-transform -p perry-parser -p perry-types -p perry) was still in flight when this PR was opened (passing so far, incl. the new e2e). Full-workspace --exclude runs don't link on macOS (perry-ext-http lib-test, pre-existing _js_fetch_with_options link wall, same as during #4881, fix(codegen): provenance-based transitive disqualification for integer locals (#4785 bug class)). CI's Linux cargo-test is the authoritative run.
  • Lint gates: cargo fmt --check, file-size gate, GC store-site inventory, and the new audit (self-test + gate) all green.
  • Platform caveat: developed and tested on macOS. The historical crashes in this class are Linux-only (mimalloc page retention masks bad derefs on macOS), so the map.rs deref-order fix cannot be runtime-verified locally; CI's Linux jobs are the real gate, as in #4665 (fix(runtime): honor JSON reviver prototype lookups).

…n into value/addr_class.rs

- New module owns every band boundary (common/fetch/zlib/proxy/stream) and
  the classification predicates (is_handle_band / is_small_handle /
  is_proxy_id_band / is_above_handle_band / is_plausible_heap_addr) plus a
  checked try_read_gc_header that magnitude-validates BEFORE dereferencing.
- is_valid_obj_ptr moved here (re-exported from object::) so the platform
  heap floor lives with the band map.
- ~70 hand-typed magnitude checks across perry-runtime/perry-stdlib replaced
  with the named predicates; perry-stdlib band constants now derive from
  addr_class (single source of truth, containment unit-tested).
- map.rs is_registered_map no longer derefs the GcHeader before the registry
  lookup (the #4665 Linux-segfault ordering, same fix set.rs already had).
- CI: scripts/addr_class_inventory.py (+ allowlist) added to the lint job —
  rejects new band literals and GcHeader casts outside addr_class.rs / gc/.
- e2e: crates/perry/tests/addr_class_handle_bands.rs exercises the #4800
  shape (Headers handle for-of/spread/typeof/instanceof next to real
  Map/Set/object/array/Date).
@proggeramlug proggeramlug merged commit f57b2db into main Jun 10, 2026
12 of 13 checks passed
@proggeramlug proggeramlug deleted the addr-class-centralize branch June 10, 2026 11:58