
Large-N instability fixed + validation suite as UI presets + it runs in a browser #4

Open
algorithm0r wants to merge 19 commits into gusjengis:origin from algorithm0r:harness/ui-presets

algorithm0r commented Jun 11, 2026

Hi Anthony — Chris here (with an AI assistant doing the legwork). You saw the demo videos; this is the code behind them — and as of tonight you can run the whole thing in a browser:

https://algorithm0r.github.io/Physics-Sim/

Chrome/Edge 113+ with hardware acceleration. Click the canvas once, then Space. The Presets menu holds the entire validation suite (T0–T10) plus two demo scenes — each loads the exact configuration the tests validated.

Your simulator was too good to leave stalled, so we picked up where you left off and went after the large-N blow-up. Nothing here is a rewrite — it's your architecture throughout, with surgical fixes.

The instability (what it actually was)

H3 — grid cell capacity collapse (the big one). grid_capacity() sizes cells from min_rad, but with variable_rad=false min_rad collapses to max_radius, so polydisperse scenes got cell_cap ≈ 11 instead of ~51. Overfull cells silently dropped grid insertions → missed contacts → energy injection that grew with N. Fix: derive variable_rad from the actual radius spread. Cell overflow at 10k particles went 3758 → 0 and total energy stopped growing at every N tested (up to 50k).
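
A minimal sketch of what "derive variable_rad from the actual radius spread" could look like on the host side. The function names and the relative tolerance `rel_tol` are hypothetical and not taken from the PR; the real grid_capacity() formula is not reproduced here.

```rust
/// Hypothetical helper (not from the PR): scan the radii once and report
/// (min_radius, max_radius, variable_rad). `rel_tol` is an assumed relative
/// tolerance below which the radii are treated as uniform.
pub fn radius_spread(radii: &[f32], rel_tol: f32) -> Option<(f32, f32, bool)> {
    let mut iter = radii.iter().copied();
    let first = iter.next()?; // no particles: nothing to size
    let (min_r, max_r) = iter.fold((first, first), |(lo, hi), r| (lo.min(r), hi.max(r)));
    let variable_rad = (max_r - min_r) > rel_tol * max_r;
    Some((min_r, max_r, variable_rad))
}

/// Radius that cell sizing should treat as the smallest particle:
/// the true minimum when radii vary, the single shared radius otherwise.
pub fn sizing_min_radius(radii: &[f32], rel_tol: f32) -> Option<f32> {
    radius_spread(radii, rel_tol)
        .map(|(min_r, max_r, variable_rad)| if variable_rad { min_r } else { max_r })
}
```

The point of deriving the flag from data is that a polydisperse scene can no longer be sized as if every particle had max_radius, whatever the UI toggle says.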

H8 — broad-phase clear/insert race. Clearing and inserting the grid in one dispatch races across workgroups; binning is now a third dispatch (2D_Grid_Insert.wgsl).

H1 — local damping. Your commented-out Cundall damping re-enabled behind a setting; the commented moment line needed abs(moment) or it injects rotational energy.

D1 — PFC-style viscous contact dashpot (new, gated by dashpot_beta, 0 = exactly legacy, proven byte-identical). Measured restitution matches the no-tension spring-dashpot ODE to RMS 3e-4.

Validation

Quantitative gates against closed-form/independent references: two-body suite (impact, oblique stick/slip kink at the predicted angle, slide→roll, angular momentum with unequal radii), bonded cantilever vs discrete beam theory (<0.7%), 50k-particle settling with strictly monotone energy decay, bulk tests (repose, packing fraction, oedometer modulus). Every stage re-verified by independent re-runs from artifacts.

The web port

Your index.html/index.js groundwork carried it most of the way. The rest was a strictness gauntlet — none of it logic errors, all things native wgpu 0.16 never enforced: @interpolate(flat) on integer vertex outputs; no writable storage in vertex stages; requiredLimits never forwarded by the 0.16 web backend (shimmed in index.js); browsers cap bind groups at 4 and storage buffers at 16/stage, so all pipelines got regrouped (your group 0–2 structure survived — the small uniforms merged into group 3); plus a few Tint pedantries. After every shader-touching change we re-ran a 1.2M-step bonded fracture scenario and required byte-identical output — the physics held through all of it.

Also in here: headless mode (--headless scenario.json out.csv) that drives your pipeline windowless and dumps per-step state — the whole Python validation harness runs against it (separate repo, happy to share).

Genuinely: this codebase was a pleasure to work in. The bones were right; it was one sizing bug away from working at scale.

🤖 Generated with Claude Code

algorithm0r and others added 15 commits June 9, 2026 17:17
build_output_stream gained a timeout parameter in cpal 0.15; pass None.
Remove unused Clone derive on AudioController (cpal::Stream is not Clone).
No physics touched.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Drives the existing GPU pipeline with a hidden window and no event loop:
scenario JSON -> settings -> WGPUProg::new -> state overwrite -> restore ->
compute loop -> update_state readback -> per-step CSV dump. No physics
changes; pads the empty-bonds sentinel to one null Bond (12 bytes) so the
array<Bond> binding validates with zero bonds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ction

- CSV dumps 9 cols/particle (adds fn, fs, moment from data[]) for Stage 1
- size st.grid from grid_info before restore(): State::new leaves a 1-word
  placeholder grid and restore() was shrinking the GPU grid buffer to 4 B,
  silently killing all collision detection (latent upstream bug)
- HEADLESS_DEBUG=1: dump grid_info, contact slots, data[], direct GPU grid
  and coll_cont readback for autopsy use

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
"aggregate": true scenario flag dumps whole-system rows: KE/PE energies,
max overlap, touching pairs, contact-slot and grid-cell overflow proxies
(CPU re-binning mirrors the shader's broad-phase geometry exactly),
max speed, NaN count. Validated against the T1 energy invariant (constant
to 0.005%). No shader files touched.
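
For illustration, a grid-cell overflow proxy of the kind described above can be sketched as a CPU re-binning pass. This is an assumption-laden simplification: each particle is binned into exactly one square cell by its centre, whereas the harness is stated to mirror the shader's broad-phase geometry exactly, which may differ. The name `cell_overflow` is hypothetical.

```rust
use std::collections::HashMap;

/// Count how many insertions would not fit if every grid cell could hold at
/// most `cell_cap` entries. Assumes one cell per particle, keyed by the
/// floor of the centre position divided by `cell_size`.
pub fn cell_overflow(positions: &[(f32, f32)], cell_size: f32, cell_cap: usize) -> usize {
    let mut counts: HashMap<(i64, i64), usize> = HashMap::new();
    for &(x, y) in positions {
        let key = ((x / cell_size).floor() as i64, (y / cell_size).floor() as i64);
        *counts.entry(key).or_insert(0) += 1;
    }
    // Entries beyond the capacity are the ones a fixed-size cell would drop.
    counts.values().map(|&n| n.saturating_sub(cell_cap)).sum()
}
```

A nonzero result means some neighbours would never be seen by the narrow phase, which is the failure path the H3 fix closes.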

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
grid_capacity() collapses min_rad to max_radius when variable_rad is
false, undersizing cell_cap (11 instead of 51 for the polydisperse T7
scenario). Overflowing grid cells silently drop insertions -> missed
contacts -> deep interpenetration -> energy injection at large N.
With the fix, T7 cell-overflow proxy goes 3758 -> 0 and total energy
decays (~0.31 E0) at every N in the sweep instead of growing to
2.33x E0 at N=10000. Stage-1 suite re-run: all green.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
storageBarrier() only synchronizes within a workgroup; with >256
particles the LOM kernel's grid clear in one workgroup could race
another workgroup's insert loop and wipe inserted neighbor entries.
New 2D_Grid_Insert.wgsl runs the insert loop as a separate dispatch
(implicit barrier after clear); LOM keeps integration + clear.

Empirical verdict: T7 sweep metrics unchanged vs the H3 fix alone
(within GPU run-to-run noise) -> the race was not measurably firing
on this hardware/scenario; fix retained as correctness hardening
since the failure mode is scheduler-dependent. Stage-1 suite: 9/9
PASS on this branch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Restores the commented-out damping block, gated by the existing
settings.local_damping / local_damping_alpha plumbing (off by default;
all Stage-1 scenarios run with it off and still pass 9/9). One
correction vs the original commented code: the moment line used
`moment * alpha * -sign(rot_vel)` without abs(moment), which would
inject rotational energy whenever moment and rot_vel have opposite
signs; now abs(moment), matching the translational form.
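
The sign argument can be checked with a small sketch. Both functions below are illustrative stand-ins written from the expressions quoted in this message, not copies of the shader code; zero velocity is handled explicitly because Rust's `signum` returns ±1 for ±0. Taking the quoted unguarded expression at face value, its power (term × rot_vel) is positive whenever the moment is negative, while the abs() form never does positive work.

```rust
/// Local (Cundall-style) damping term with the abs() guard:
/// always opposes the velocity, scaled by the force magnitude.
pub fn local_damping_term(force: f64, velocity: f64, alpha: f64) -> f64 {
    if velocity == 0.0 {
        return 0.0; // nothing to oppose
    }
    -alpha * force.abs() * velocity.signum()
}

/// The quoted expression without abs(): `moment * alpha * -sign(rot_vel)`.
pub fn unguarded_term(moment: f64, rot_vel: f64, alpha: f64) -> f64 {
    if rot_vel == 0.0 {
        return 0.0;
    }
    moment * alpha * -rot_vel.signum()
}
```

For example, with moment = -2, rot_vel = 3 and alpha = 0.7 the unguarded term is +1.4, so it pushes along the rotation and adds energy; the guarded term is -1.4.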

T7 with alpha=0.7: KE decays monotonically at every N to a uniform
~5e-6-of-peak jitter floor (no N-dependence left). The residual floor
tracks the impulsive wall law (D2) at the box floor, not instability.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… 3c)

- Scenario gains bond_type/bond_init/bond_tearing/bond stiffness+strength/
  moment_contribution_factor fields, wired straight into settings.physics;
  bond_init reuses upstream State::regen_bonds after the state overwrite.
- Aggregate dump now accounts parallel-bond elastic energy (normal vs bond
  reference length + per-slot shear & bending, mirroring
  linear_parallel_bonds), counts intact/broken bonds, and excludes bonded
  pairs from pe_elast (their compression is referenced to bond length).
  Three new CSV columns: pe_bond, bonds_intact, bonds_broken.
- snapshot_every: appends binary f32 LE frames (x, y, speed, intact bond
  count) to <out>.snap with a JSON sidecar, for video rendering.
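
A sketch of how one such frame might be packed and unpacked. The record layout here is an assumption: four little-endian f32 values per particle, interleaved, with the intact bond count stored as an f32. The actual .snap layout and the sidecar contents are not specified in this message, and both function names are hypothetical.

```rust
/// Pack one frame: each particle is [x, y, speed, intact_bond_count],
/// written as four little-endian f32 values (16 bytes per particle).
pub fn encode_frame(particles: &[[f32; 4]]) -> Vec<u8> {
    let mut out = Vec::with_capacity(particles.len() * 16);
    for p in particles {
        for v in p {
            out.extend_from_slice(&v.to_le_bytes());
        }
    }
    out
}

/// Inverse of `encode_frame`; returns None if the byte length is not a
/// whole number of 16-byte records.
pub fn decode_frame(bytes: &[u8]) -> Option<Vec<[f32; 4]>> {
    if bytes.len() % 16 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(16)
            .map(|rec| {
                let mut p = [0.0f32; 4];
                for (i, w) in rec.chunks_exact(4).enumerate() {
                    p[i] = f32::from_le_bytes([w[0], w[1], w[2], w[3]]);
                }
                p
            })
            .collect(),
    )
}
```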

Harness-only; no physics changed. Smoke test: 2-disk parallel bond conserves
E to 0.16% (half-step sampling); Stage-1 suite fresh 9/9 PASS on this branch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Local damping rectifies internal bond vibration into net lift on
free-flying bonded bodies (bonded block at alpha=0.7 hovers outright;
LOGBOOK 2026-06-10), so bonded drop scenes need damping off during
flight and on after contact. If the new scenario field is > 0, the run
starts with local_damping forced off and the host re-uploads the
collision-settings uniform at that step. No shader change; field absent
or 0 reproduces existing runs byte-identically (verified on T7b N=256).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
compute() already records gen_per_frame steps into one command buffer;
the headless loop was submitting one step at a time, paying per-submission
host/driver overhead 2.5M times per run. Batch up to 256 steps per
submission, breaking exactly at dump/snapshot/damping-switch boundaries.
Output byte-identical (verified T7b N=256). Throughput: 800->1141 steps/s
at N=50k, 2.4k->7.9k at N=10k (0.79x real time).
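
The batching rule can be sketched as a pure function of the step counter. All names and parameters below are hypothetical; a period of 0 is assumed to mean "feature disabled", and boundaries are assumed to fall on multiples of each period, so that a batch always ends exactly where the host needs to read back or re-upload.

```rust
/// Steps from `done` to the next multiple of `period`; None when disabled.
fn to_next_multiple(done: u64, period: u64) -> Option<u64> {
    if period == 0 { None } else { Some(period - done % period) }
}

/// Length of the next submission, given `done` completed steps.
pub fn next_batch_len(
    done: u64,
    total: u64,
    max_batch: u64,
    dump_every: u64,
    snapshot_every: u64,
    damping_on_step: u64,
) -> u64 {
    let mut len = max_batch.min(total.saturating_sub(done));
    if let Some(gap) = to_next_multiple(done, dump_every) {
        len = len.min(gap);
    }
    if let Some(gap) = to_next_multiple(done, snapshot_every) {
        len = len.min(gap);
    }
    if damping_on_step > done {
        len = len.min(damping_on_step - done); // stop where damping flips on
    }
    len
}

/// Full schedule of batch lengths for a run of `total` steps.
pub fn plan_batches(total: u64, max_batch: u64, dump_every: u64,
                    snapshot_every: u64, damping_on_step: u64) -> Vec<u64> {
    let mut done = 0;
    let mut batches = Vec::new();
    while done < total {
        let len = next_batch_len(done, total, max_batch, dump_every,
                                 snapshot_every, damping_on_step);
        if len == 0 {
            break; // guards against max_batch == 0
        }
        batches.push(len);
        done += len;
    }
    batches
}
```

Because the boundaries are unchanged and only the submission granularity differs, a schedule like this leaves the sequence of simulated steps identical to the one-step-per-submit loop.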

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Viscous normal damping at unbonded contacts in linear_model():
c_n = 2*beta*sqrt(m_eff*k_n_pair), separation rate from del_pos/dT,
no-tension floor, Coulomb cap on spring force only, fixed particles
treated as infinite mass. New setting dashpot_beta (uniform appended,
UI slider, headless JSON field); beta=0 verified byte-identical to
legacy on t1_normal_impact and h9_fall_N1024_u_a0.2.
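
As a rough CPU-side sketch of the normal-force law just described (not the WGSL in linear_model()): the function names are hypothetical, `closing_rate` is assumed positive while the particles approach, and the early return for beta = 0 is one way to keep the undamped path free of the dashpot term; the shader may achieve that differently.

```rust
/// Effective mass of a contact pair; a fixed particle counts as infinite mass.
pub fn effective_mass(m1: f64, m2: f64, fixed1: bool, fixed2: bool) -> f64 {
    match (fixed1, fixed2) {
        (true, true) => f64::INFINITY,
        (true, false) => m2,
        (false, true) => m1,
        (false, false) => m1 * m2 / (m1 + m2),
    }
}

/// c_n = 2 * beta * sqrt(m_eff * k_n); beta = 0 disables the dashpot.
pub fn dashpot_coefficient(beta: f64, m_eff: f64, k_n: f64) -> f64 {
    if beta == 0.0 {
        return 0.0; // also avoids 0 * inf when both particles are fixed
    }
    2.0 * beta * (m_eff * k_n).sqrt()
}

/// Spring plus dashpot with a no-tension floor: the contact can push,
/// never pull, even when the particles separate quickly.
pub fn normal_force(overlap: f64, closing_rate: f64, k_n: f64, c_n: f64) -> f64 {
    if overlap <= 0.0 {
        return 0.0; // not in contact
    }
    (k_n * overlap + c_n * closing_rate).max(0.0)
}
```

With overlap = 0.5, k_n = 100 and c_n = 10, a separation rate of 6 would give 50 - 60 = -10, which the floor clamps to 0; with c_n = 0 the same call returns the plain spring force 50.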

Spec: docs/AUTOPSY.md 2026-06-11 (DEM-GPU repo).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tage 5)

- scenario.rs: Scenario/loader factored VERBATIM out of headless.rs
  (parse from &str, wasm-clean). Headless re-uses it; refactor proven
  byte-identical on t0_verify, t5_rolling, and the 1.2M-step bonded
  vfy_t7b_N256_beta0 (bond_init path, 576 bonds, cmp exact).
- presets.rs: 14 embedded presets = T0-T10 configs of record + the two
  S5 demo scenes, exported from validation/scenarios/ by
  validation/export_ui_presets.py (single source of truth; never
  hand-edit presets/).
- settings.rs: Presets menu (hover blurbs state each test's validated
  result); client.rs: load_preset = apply_to_settings + reset +
  install_state (same loader the harness uses), ~real-time pacing,
  auto-framed view, starts paused.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- view.scale is NDC per world unit (Fit Bounds convention 2/vert_bound),
  not px/unit: first cut zoomed ~500x in -> blank screen on every preset.
- Frame the initial particle AABB (margin 1.2, span floor 30*max_r,
  capped at world bounds, centered via x_off/y_off = -center): scenarios
  deliberately park small scenes inside huge walls (T6: 38-unit chain in
  a 280-unit world).
- T6 preset swapped to t6_demo_g1 (same chain/bonds at FULL gravity:
  tip swings to -16.9, parks at -11.9; headless sanity run, no NaNs).
  The validated 0.002g gate config deflects 0.05 units = sub-pixel by
  design; blurb states this.
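
The framing rule in the first two bullets can be sketched as follows. The struct and function are hypothetical stand-ins for the real view state; the order in which the floor and cap are applied, the use of the larger AABB side as the span, and the inclusion of particle radii in the AABB are assumptions.

```rust
/// Hypothetical view state: `scale` is NDC units per world unit.
pub struct View {
    pub scale: f32,
    pub x_off: f32,
    pub y_off: f32,
}

/// Frame the particle AABB. Each particle is (x, y, radius);
/// `world_span` is the full extent of the world bounds.
pub fn frame_particles(particles: &[(f32, f32, f32)], world_span: f32) -> Option<View> {
    let &(x0, y0, r0) = particles.first()?;
    let (mut min_x, mut max_x) = (x0 - r0, x0 + r0);
    let (mut min_y, mut max_y) = (y0 - r0, y0 + r0);
    let mut max_r = r0;
    for &(x, y, r) in &particles[1..] {
        min_x = min_x.min(x - r);
        max_x = max_x.max(x + r);
        min_y = min_y.min(y - r);
        max_y = max_y.max(y + r);
        max_r = max_r.max(r);
    }
    let extent = (max_x - min_x).max(max_y - min_y);
    // Margin 1.2, floor at 30 * max_r, cap at the world bounds.
    let span = (extent * 1.2).max(30.0 * max_r).min(world_span);
    Some(View {
        scale: 2.0 / span, // NDC spans 2 units, so this is NDC per world unit
        x_off: -(min_x + max_x) / 2.0,
        y_off: -(min_y + max_y) / 2.0,
    })
}
```

For a 38-unit chain of radius-0.5 particles in a 280-unit world this gives a span of about 46.8 and a scale near 0.043, rather than a pixel-scale value hundreds of times larger.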

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- Cargo: native-dialog/cpal/copypasta moved to cfg(not(wasm32)) deps
  (no wasm backends); web-sys pinned =0.3.64 in Cargo.toml (wgpu
  0.16-era unstable WebGPU bindings; lockfile is gitignored);
  .cargo/config.toml carries --cfg=web_sys_unstable_apis (Anthony had
  it in Cargo.toml where cargo ignores it).
- File dialogs / clipboard / audio cfg-gated native-only.
- wgpu_config: web backend has no enumerate_adapters; wasm path now
  uses request_adapter (Anthony's commented-out draft, completed).
- window_init: 'self.window' in a constructor — wasm-only line that
  never compiled; fixed to 'window'.
- THE BIG ONE: current_monitor().unwrap() is a statically-known panic
  on the web backend (None) — LLVM dead-stripped the ENTIRE UI behind
  it (menus, presets, render). 60 Hz fallback; native unchanged.
- wasm-pack build --target web --out-name WGPU -> pkg/ (13.3 MB with
  embedded presets). Native rebuilt after all changes: harness output
  re-verified BYTE-IDENTICAL (t0, t5 vs verifier caches).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@algorithm0r

Author

Status note on the wasm commit (0f02c1e): builds and boots, but browser rendering is not yet verified — don't sink time into pkg/ yet. Verified so far: compiles for wasm32, WebGPU adapter acquired, winit event loop running, no panics in console, presets embedded. Not yet verified: an actual rendered frame (headless Chrome can't present WebGPU reliably, so the pixel check needs a real browser — happening shortly). Will follow up here either confirming it renders or with the fix. Native is unaffected: the harness exe was re-verified byte-identical after every change in that commit.

🤖 Generated with Claude Code

algorithm0r and others added 4 commits June 11, 2026 14:58
request_adapter() returns None when WebGPU is unavailable — most often
a non-secure context (plain http:// hides navigator.gpu entirely; needs
https:// or localhost), a non-WebGPU browser, or hardware acceleration
off. Say so in the panic message.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…limits shim

Three Dawn-enforced rules native wgpu 0.16 ignored:
- integral vertex outputs need @interpolate(flat) (4 render shaders)
- read-write storage + VERTEX visibility is illegal: BufferGroup/
  BufferUniform now carry a second read-only VERTEX_FRAGMENT bind
  group; render pipelines/passes use it, compute keeps read-write
  (visibility now COMPUTE-only). Render WGSL: var<storage, read>.
- wgpu 0.16 web backend never forwards requiredLimits -> index.js
  shims requestDevice (clamped to adapter); fixes the 10-storage-
  buffer compute stage.
debug.html = standalone console/GPU-error capture page.
Native after all changes: harness BYTE-IDENTICAL (t0, t5), app runs.

KNOWN REMAINING WALL: Dawn caps maxBindGroups at 4; pipelines here
use up to 8 (render) / 6 (collision compute). Web needs a bind-group
regroup. Diagnosed via validation/wasm_console.py (CDP console
harvester in the DEM-GPU repo).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Dawn (web) hard-caps maxBindGroups at 4 and storage buffers per shader
stage at 16; native wgpu 0.16 enforced neither. PHYSICS PROVEN
UNTOUCHED: t0/t1/t5 + the 1.2M-step bonded vfy_t7b_N256_beta0 re-run
BYTE-IDENTICAL vs the verifier caches after every change below.

- WGSL: physics misc group 3 = settings(u)/materials/data; editor io
  group 0 = input(u)/click_info; render misc group 0 = frame input(u)/
  render settings(u)/materials/selections/click_info/create input(u).
- Host: BindingSpec per-entry visibility (stages only count entries
  they actually use; usage census in comments); spec_layout/spec_group
  helpers; per-dispatch bind groups (misc/io/selprop/mov_sim/
  contact_lite) because update paths replace buffers; per-frame render
  misc bind group in client::render().
- grid_info -> uniform (16 B, read-only); acc + contact_pointers ->
  visibility NONE (unused by every shader).
- $ DIAGNOSTICS template: data[] accumulation (observer-only) stripped
  on wasm32 to fit the 16-storage-buffer compute cap; native expands
  identically (byte-identity above includes this).
- Tint strictness: storageBarrier removed from dead broad-phase kernel;
  atomicStore for grid insert payload; 'var length' builtin shadowing
  renamed; &&/|| parenthesized (Click, Particles, Creation).
- wasm console (validation/wasm_console.py): CLEAN at runtime — no
  validation errors over 30 s.
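
A usage census of the kind mentioned in the Host bullet can be sketched without any GPU types. Everything below is a hypothetical model, not the BindingSpec code from this branch: each entry records its group, whether it is a storage buffer, and which stages can see it. The limits are passed in as parameters; the values 4 and 16 used in the note that follows are the ones reported in this commit.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Stage {
    Vertex,
    Fragment,
    Compute,
}

/// Hypothetical description of one binding in a pipeline layout.
pub struct BindingEntry {
    pub group: u32,
    pub is_storage: bool,
    pub visible_to: Vec<Stage>, // empty = visibility NONE
}

/// Check bind-group count and per-stage storage-buffer count.
pub fn check_limits(
    entries: &[BindingEntry],
    max_bind_groups: u32,
    max_storage_per_stage: usize,
) -> Result<(), String> {
    if let Some(max_group) = entries.iter().map(|e| e.group).max() {
        if max_group + 1 > max_bind_groups {
            return Err(format!("uses {} bind groups, limit {}", max_group + 1, max_bind_groups));
        }
    }
    for stage in [Stage::Vertex, Stage::Fragment, Stage::Compute].iter() {
        // A stage only pays for storage entries it can actually see.
        let used = entries
            .iter()
            .filter(|e| e.is_storage && e.visible_to.contains(stage))
            .count();
        if used > max_storage_per_stage {
            return Err(format!("{:?} sees {} storage buffers, limit {}", stage, used, max_storage_per_stage));
        }
    }
    Ok(())
}
```

Calling `check_limits(&entries, 4, 16)` on a layout shows why per-entry visibility matters: an entry with empty visibility, or one visible only to the render stages, does not count against the compute stage's budget.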

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@algorithm0r

Author

Follow-up on the wasm status note: the web build now renders and runs, live at https://algorithm0r.github.io/Physics-Sim/ (Chrome/Edge 113+, hardware acceleration on — WebGPU needs a secure context, which Pages provides).

Getting there took a strictness gauntlet that might interest you, since none of it was wrong logic, just things native wgpu 0.16 never enforced and the browser does: @interpolate(flat) on integer vertex outputs; no writable storage in vertex stages; requiredLimits never forwarded by the 0.16 web backend (shimmed in index.js); and a few Tint pedantries (storageBarrier uniformity, atomicStore, a var length shadowing the builtin). The big one: browsers cap bind groups at 4 and storage buffers at 16/stage, so every pipeline's bindings got regrouped (your group 0–2 structure survived; the small uniforms merged into group 3).

The physics is provably untouched: after every shader-touching change we re-ran a 1.2M-step bonded fracture scenario and required byte-identical output against the validated baseline — it held through all of it (commits 096cf57, b2701ec).

Try the Presets menu in the browser — the whole validation suite plus the demo scenes, one click each. Click the canvas once, then Space to run.

🤖 Generated with Claude Code

algorithm0r changed the title from "Large-N instability diagnosed and fixed + validation suite (runnable as UI presets)" to "Large-N instability fixed + validation suite as UI presets + it runs in a browser" on Jun 12, 2026