refactor(runtime): centralize handle-vs-pointer address classification (addr_class) + map.rs deref-order fix + CI audit#4899
Merged
Conversation
…n into value/addr_class.rs - New module owns every band boundary (common/fetch/zlib/proxy/stream) and the classification predicates (is_handle_band / is_small_handle / is_proxy_id_band / is_above_handle_band / is_plausible_heap_addr) plus a checked try_read_gc_header that magnitude-validates BEFORE dereferencing. - is_valid_obj_ptr moved here (re-exported from object::) so the platform heap floor lives with the band map. - ~70 hand-typed magnitude checks across perry-runtime/perry-stdlib replaced with the named predicates; perry-stdlib band constants now derive from addr_class (single source of truth, containment unit-tested). - map.rs is_registered_map no longer derefs the GcHeader before the registry lookup (the #4665 Linux-segfault ordering, same fix set.rs already had). - CI: scripts/addr_class_inventory.py (+ allowlist) added to the lint job — rejects new band literals and GcHeader casts outside addr_class.rs / gc/. - e2e: crates/perry/tests/addr_class_handle_bands.rs exercises the #4800 shape (Headers handle for-of/spread/typeof/instanceof next to real Map/Set/object/array/Date).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Centralizes Perry's "is this u64 a handle or a dereferenceable heap pointer?" checks into one audited module,
crates/perry-runtime/src/value/addr_class.rs, eliminating the hand-re-typed magnitude literals behind the recurring Linux-only segfault class (#1843, #4004, #4665, #4800).What's in the module
HANDLE_BAND_MAX = 0x100000, common registry[1, 0x40000), fetch[0x40000, 0xE0000), zlib[0xE0000, 0xF0000), proxy ids[0xF0000, 0x100000), and the raw-numeric Web Streams band[0x100000, 0x200000)(documented as the deliberate above-band exception).is_handle_band,is_small_handle,is_above_handle_band,is_proxy_id_band,is_stream_id_band,is_plausible_heap_addr.try_read_gc_header(addr)— magnitude check (band + platform heap floor) FIRST, deref second.is_valid_obj_ptrmoved here verbatim (re-exported fromobject::so its ~30 call sites compile unchanged) — the platform heap floor now lives with the band map.fetch/mod.rs,zlib.rs,streams.rs,common/handle.rs) now derive fromaddr_class— single source of truth; the existing stdlib unit tests assert containment against the addr_class constants.Deref-ordering fix (the #4665 class)
map.rs is_registered_mapdereferenced(addr - 8)as aGcHeaderbefore consultingMAP_REGISTRY(a deliberate fast path profiled against the old SipHash registry — the registry has since moved to the Fibonacci-hashPtrHashSet, so the rationale is stale). That is exactly the ordering that segfaulted in #4665 forset.rs(Linux unmaps freed/foreign pages; macOS mimalloc retains them). It now mirrorsset.rs: band check → registry (deref-free, authoritative) → header type confirm viatry_read_gc_header.Inventory
Classification sites converted (all in this PR). "Sem change?" = any observable difference in release builds.
map.rs is_registered_map< 0x100000→ header deref → registrytry_read_gc_headerset.rs is_registered_settry_read_gc_headerweakref.rs weak_class_id_from_receiver< 0x100000then raw dereftry_read_gc_headeris_valid_obj_ptrbefore the deref (rejects only values that were previously deref'd as garbage)date.rs is_date_cell_addr,temporal is_temporal_cell_addr,object/exotic_expando.rs< 0x100000 || !is_valid_obj_ptrthen dereftry_read_gc_headerobject_ops.rsexotic-desc read< 0x100000+< GC_HEADER_SIZE+0x1000+!is_valid_obj_ptris_plausible_heap_addrglobal_this.rs,boxed_primitives.rs(×3),field_get_set.rs:3199< 0x100000 || !is_valid_obj_ptris_plausible_heap_addrvalue/truthy.rsraw-string-pointer probebits >= 0x10_0000is_above_handle_bandproxy.rs PROXY_TAG_BASE0x000F_0000= addr_class::PROXY_ID_BAND_STARTfield_get_set.rs(×2),delete_rest.rs(×2)(0xF0000..0x100000).containsis_proxy_id_bandfield_get_set.rs(×2)(0x100000..0x200000).containsis_stream_id_bandsymbol.rs(×6),closure/dynamic_props.rs is_closure_ptr,value/dynamic_object.rs,value/to_string.rs,builtins/{numbers,arithmetic×2,console,formatting×2,util_format},typed_feedback.rs,json/{raw_json,reviver×2},array/{is_array,iterator×2,flat_clone},object/{field_set_by_name,field_get_set×8,native_call_method×8,native_module×4,instanceof×5,class_registry×2,descriptors,object_ops},promise/{combinators×5,spec_combinators×2},fs/mod.rs, stdlibfetch/headers.rs,events/constructors.rs< 0x100000/> 0 && < 0x100000/>= 0x100000(incl. null-check variants)is_handle_band/is_small_handle/is_above_handle_bandis_handle_bandcovers the removed explicit null checks since0 < 0x100000)fetch/mod.rs,zlib.rs,streams.rs(+tests),common/handle.rsaddr_classconstantsCI guard against recurrence
scripts/addr_class_inventory.py+scripts/addr_class_allowlist.txt(samepath-prefix | substring | justificationformat as the #4886 GC store-site gate), wired into the lint job (self-test + gate). Two rule classes, comments stripped:0x100000 / 0xF0000 / 0x40000 / 0xE0000 / 0x200000(incl.0x4_0000-style) in perry-runtime/perry-stdlib code outsidevalue/addr_class.rs. After this PR only two allowlisted exceptions remain: the Runtime SIGSEGV in closure_dynamic_prop_by_key (IC-miss property lookup) under repeated fetch + response property access #4740 test fixture (raw in-band addresses asserted NOT to be deref'd) and theO_SYMLINK => 0x200000fs constant (unrelated literal collision).as *const/mut GcHeaderoutsidegc/andaddr_class.rs. 234 pre-existing casts across 104 files are grandfathered per-file with category justifications (arena internals / test self-allocations / call-site-guarded probes). New files doing raw casts fail the gate; the allowlist header directs edits in grandfathered files towardtry_read_gc_header.Flagged as follow-ups (NOT changed — behavior contract requires a repro per site)
GC_HEADER_SIZE + 0x1000/< 0x10000) rather than the handle band — e.g.dgram.rs,node_stream_dispatch.rs,tty.rs,dns.rs,node_repl.rs,perf_hooks.rs,event_target.rs,array/{header,species,generic,flat_clone,indexing}.rs,util_*. A fetch handle (first id0x40000) passes these floors; whether one can actually reach each site is unproven, so they are grandfathered in the allowlist and listed here rather than changed.set.rs/map.rs*_ptr_from_receiver_bitsandproxy.rsaccept raw-i64 candidates with a> 0x10000floor (below the band) before routing into the (now-safe) registry checks.node_v8.rs:116(< 0x10000),bigint.rs(< 0x10000floors) — same class.Validation
cargo build --release -p perry-runtime -p perry-stdlib -p perry✅ (caches cleared first)crates/perry/tests/addr_class_handle_bands.rs✅ — the fix(runtime): for-of/spread over a Headers handle no longer segfaults #4800 shape:new Headers()handle through for-of / spread /typeof/Array.isArray/instanceof Map, next to live Map/Set brand checks and plain object/array/Date, exact-stdout asserted.addr_classunit tests assert band layout + thattry_read_gc_headerreturnsNonefor the historical crash addresses (0x1008,0x40000,0x4000c,0xF0000, …) without touching memory.-p perry-runtime -p perry-stdlib -p perry-codegen -p perry-hir -p perry-transform -p perry-parser -p perry-types -p perry) was still in flight when this PR was opened (passing so far, incl. the new e2e); full-workspace--excluderuns don't link on macOS (perry-ext-httplib-test, pre-existing_js_fetch_with_optionslink wall — same as during fix(codegen): provenance-based transitive disqualification for integer locals (#4785 bug class) #4881). CI's Linuxcargo-testis the authoritative run.cargo fmt --check, file-size gate, GC store-site inventory, and the new audit (self-test + gate) all green.