diff --git a/docs/superpowers/plans/2026-05-07-wesl-conversion.md b/docs/superpowers/plans/2026-05-07-wesl-conversion.md new file mode 100644 index 0000000..0ecd046 --- /dev/null +++ b/docs/superpowers/plans/2026-05-07-wesl-conversion.md @@ -0,0 +1,1600 @@ +# WGSL → WESL Conversion + Shared Shader Library Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Convert the seven WGSL shaders under `src/services/gpu/shaders/` to WESL, extract a reusable `lib/` of shared modules, and uniformly split each shader into vertex/fragment/io files. + +**Architecture:** Build-time linking via `wesl-plugin` for Vite. Each renderer's TS file imports two pre-linked WGSL strings (one per stage) using the `?static` suffix. Library modules live under `src/services/gpu/shaders/lib/` as themed single-file modules (one file per logical group of related fns), e.g. `lib/math.wesl` holds saturate/rot2/sabs/toPolar/toRect/constants. WESL imports a function FROM a module rather than a function-as-module, so one-fn-per-file would force a verbose duplicated leaf in the import path (`lib::math::saturate::saturate`); grouping into one file matches the WESL idiom. Every shader-touching task ends with build + typecheck + full test suite + manual visual sanity check on the running dev server before commit, per the project's `wgsl-meticulous` convention. + +**Tech Stack:** TypeScript 5.x, Vite 5.x, raw WebGPU, WGSL, WESL (`wesl@^0.7.26`, `wesl-plugin@^0.6.74`), Vitest 1.x. No shader unit-test framework exists; verification = build green + 590+ existing tests stay green + visual identity check. + +**Spec:** `docs/superpowers/specs/2026-05-07-wesl-conversion-design.md` + +--- + +## Pre-flight reference (read once before starting) + +**WESL `?static` suffix semantics.** Build-time linker. 
`import s from './foo.wesl?static'` returns a `string` containing flat WGSL with all `import ... ;` statements resolved into top-level functions/structs (mangled where collisions exist). Zero runtime cost. The legacy `?raw` import returns the file's bytes verbatim — the migration replaces every `?raw` with `?static` so the linker runs on what was previously a self-contained string. + +**WESL import path syntax.** Inside `.wesl` files, imports look like `import package::lib::math::saturate;`: colons, not slashes, and no braces around a single item (snippets in this plan that pull several items from one module use `::{ a, b }`). After the import statement, `saturate` is a top-level identifier inside the importing file. Path resolution is relative to the configured root (this project: `src/services/gpu/shaders/`). Use `super::` for parent-relative paths (rare in this layout) and `as` for renaming on collision. + +**Sourcemaps caveat.** WGSL compile errors in Chrome will report line numbers in the **linked** WGSL output, not the source `.wesl`. Mitigation in this codebase: every shader module starts with a docblock identifying it (e.g. `// lib/math.wesl`), and `toneMapPass.ts` (and all renderers) log the linked WGSL alongside any `device.createShaderModule` failure in dev mode. Task 1 establishes that logging. + +**Project visual-verification rule.** Per `feedback_wgsl_meticulous.md`, no shader-touching task is marked complete until the implementer has visually compared the dev-server render to the previous render and confirmed identity. Tests are silent on shader correctness — visual is the only check. + +**WESL parser limitations discovered during Task 1 (2026-05-07).** The smoke test surfaced three concrete gotchas, and Task 3 added a fourth (item 4); all four affect every later task: + +1. **No backticks (`` ` ``) anywhere in shader source** — including inside `//` and `/* */` comments. The WESL parser tokenises the backtick character regardless of comment context and emits "expected a semicolon" errors. 
The didactic-comment style across the existing `.wgsl` files uses backticks heavily for inline code identifiers (335 occurrences across the 6 not-yet-converted shaders, 204 in `points.wgsl` alone). **Task 2's bulk rename must include a global `` ` `` → `'` substitution** in every shader file, applied as part of the same commit. The single-quote replacement preserves the visual intent (callout for an identifier) at the cost of the markdown-style aesthetic. If the WESL parser later fixes this, the substitution is mechanically reversible. + +2. **TypeScript subpath types via the tsconfig `types` array don't reliably resolve.** Adding `"wesl-plugin/suffixes"` to `compilerOptions.types` does not on its own make `import wgsl from './foo.wesl?static'` resolve to `string` under our `moduleResolution: "bundler"` setup. **A triple-slash reference in a project type file is required**, not optional. Task 1 ships `src/@types/wesl.d.ts` with `/// <reference types="wesl-plugin/suffixes" />`; later tasks reference this file rather than re-creating it. + +3. **Vitest does NOT inherit Vite plugins from `vite.config.ts`** when a standalone `vitest.config.ts` exists, because that file replaces `vite.config.ts` rather than extending it. Without explicit registration in `vitest.config.ts`, Vitest's SSR-transform pipeline tries to parse `.wesl` files as JavaScript and rolldown rejects them. Task 1 ships an updated `vitest.config.ts` that registers `wesl-plugin` directly. Later tasks should not modify this config unless adding new build extensions. + +4. **Self-package import prefix is the literal `package`, not the npm package name.** Verified during Task 3 (2026-05-07) — every snippet in this plan that reads `import skymap::lib::...` should be `import package::lib::...`. The wesl-plugin source (`PluginApi.ts`) calls `fileToModulePath(rootModuleName, "package", false)`, hard-coding the literal `"package"` as the root module's prefix; the official `wesl` README example uses the same form (`import package::colors::chartreuse;`). 
The npm `name` field (`"skymap"`) is reserved for cross-package imports if this project ever publishes a shader library to npm. **Read every later task's `skymap::...` snippet as `package::...` until those are amended in-place.** + +--- + +## Task 1: Tooling bootstrap (wesl-plugin + Vite + types) and convert toneMap + +**Files:** +- Modify: `package.json` (add deps) +- Create: `wesl.toml` (repo root) +- Modify: `tsconfig.json` (activate ambient `?static` types from `wesl-plugin/suffixes`) +- Modify: `vite.config.ts` +- Rename: `src/services/gpu/shaders/toneMap.wgsl` → `src/services/gpu/shaders/toneMap.wesl` +- Modify: `src/services/gpu/toneMapPass.ts` (import suffix + dev-mode link logging) + +- [ ] **Step 1.1: Add devDependencies** + +```bash +npm install --save-dev wesl@^0.7.26 wesl-plugin@^0.6.74 +``` + +Versions verified against the npm registry on 2026-05-07: `wesl-plugin` is still on the 0.6.x track (the original draft assumed 0.7.x, which doesn't exist on npm yet). The matching `wesl` runtime is `0.7.26`. Note: in this implementation pass the controller has already run `npm install` for the agent, so this step is a no-op record of what was added. Expected: lockfile updated, no peer-dep warnings beyond what existed before. + +- [ ] **Step 1.2: Create `wesl.toml` at repo root** + +The actual TOML schema (verified against `node_modules/wesl-plugin/dist/PluginExtension-DTjKL6rt.d.mts` on 2026-05-07) has flat top-level keys — no `[package]` table, no `name` field. The package name used as the prefix in WESL `import` paths comes from npm's `package.json` `name` (already `"skymap"`), which keeps a single source of truth. + +```toml +edition = "unstable_2025" +include = ["**/*.wesl", "**/*.wgsl"] +root = "src/services/gpu/shaders" +``` + +A short comment block in the file explains why we picked `?static` over `?link` — see the actual file for the full rationale. 
+ +- [ ] **Step 1.3: Activate ambient types for `?static` imports** + +`wesl-plugin` ships its own ambient module declarations at the subpath `wesl-plugin/suffixes` (see `node_modules/wesl-plugin/src/defaultSuffixTypes.d.ts` — declares `*?static` as `string`, plus stubs for `?link`, `?simple_reflect`, `?bindingLayout`). There is **no need** to hand-write `src/@types/wesl.d.ts`. Activate the shipped types by adding `"wesl-plugin/suffixes"` to `compilerOptions.types` in `tsconfig.json` — that matches the project's existing pattern (the array already lists `"node"`, `"@webgpu/types"`, `"vite/client"`). + +```jsonc +// tsconfig.json +"types": ["node", "@webgpu/types", "vite/client", "wesl-plugin/suffixes"] +``` + +- [ ] **Step 1.4: Wire `wesl-plugin` into `vite.config.ts`** + +Read `vite.config.ts` first to see the existing plugin array. The actual API splits the Vite plugin entry point from the build extensions: import the Vite-specific factory from `wesl-plugin/vite` and the `staticBuildExtension` from the package root, then pass the extension to the factory. + +```ts +import { staticBuildExtension } from 'wesl-plugin'; +import viteWesl from 'wesl-plugin/vite'; +// ... +plugins: [viteWesl({ extensions: [staticBuildExtension] }), react()], +``` + +Plugin order shouldn't matter for correctness; alphabetise as fits the existing arrangement. Do not change any other plugin or config. + +- [ ] **Step 1.5: Rename `toneMap.wgsl` → `toneMap.wesl`** + +```bash +git mv src/services/gpu/shaders/toneMap.wgsl src/services/gpu/shaders/toneMap.wesl +``` + +No content changes. WESL is a strict superset of WGSL. + +- [ ] **Step 1.6: Update `toneMapPass.ts` import + add dev-mode link logging** + +Read `src/services/gpu/toneMapPass.ts` to find the existing `?raw` import. Change: + +```ts +import wgsl from './shaders/toneMap.wgsl?raw'; +``` + +to: + +```ts +import wgsl from './shaders/toneMap.wesl?static'; +``` + +Then locate the `device.createShaderModule({ code: wgsl, ... })` call. 
Wrap shader compilation error logging so the linked WGSL is dumped in dev: + +```ts +const module = device.createShaderModule({ code: wgsl, label: 'toneMap' }); +if (import.meta.env.DEV) { + module.getCompilationInfo().then((info) => { + if (info.messages.some((m) => m.type === 'error')) { + // Browser error line numbers refer to the linked WGSL output, not + // source .wesl files. Log the linked source so we can map line + // numbers back manually until wesl-plugin gains sourcemap support. + console.groupCollapsed('[toneMap] linked WGSL (for error line lookup)'); + console.log(wgsl); + console.groupEnd(); + } + }); +} +``` + +(If `toneMapPass.ts` already creates the module without a `label`, add the label too — it shows up in `getCompilationInfo` messages and helps identify which shader errored.) + +- [ ] **Step 1.7: Build + typecheck + test** + +```bash +npm run typecheck && npm run build && npm test +``` + +Expected: all green. The build output's bundle size for shaders should be the same byte count as before (toneMap has no imports yet, so the linker's output is the same WGSL). + +- [ ] **Step 1.8: Visual sanity check** + +Confirm the dev server is running (`npm run dev`). Open the browser. The tone-mapped scene should look identical to before — same gamma curve, same colors. If anything looks different, stop and investigate; the linker has changed something it shouldn't have. + +- [ ] **Step 1.9: Commit** + +```bash +git add package.json package-lock.json wesl.toml tsconfig.json vite.config.ts \ + src/services/gpu/shaders/toneMap.wgsl src/services/gpu/shaders/toneMap.wesl \ + src/services/gpu/toneMapPass.ts +git commit -m "$(cat <<'EOF' +chore(shaders): bootstrap wesl-plugin tooling and convert toneMap + +Adds wesl + wesl-plugin (build-time linker) wired into Vite via the +?static import suffix. Renames toneMap.wgsl → toneMap.wesl as the +smoke-test shader; the linker output is identical WGSL until imports +are added in later tasks. 
Dev-mode shader-compile errors now log the +linked WGSL alongside the error, since wesl-plugin doesn't yet emit +sourcemaps that survive into Chrome's WGSL compiler diagnostics. + +Co-Authored-By: Claude Opus 4.7 +EOF +)" +``` + +--- + +## Task 2: Bulk rename remaining 6 shaders to .wesl + +**Files:** +- Rename: 6 shader files +- Modify: 6 renderer TS files (one import line each) + +- [ ] **Step 2.1: Rename shader files** + +```bash +cd src/services/gpu/shaders +git mv disks.wgsl disks.wesl +git mv filaments.wgsl filaments.wesl +git mv milkyWayImpostor.wgsl milkyWayImpostor.wesl +git mv points.wgsl points.wesl +git mv proceduralDisks.wgsl proceduralDisks.wesl +git mv quads.wgsl quads.wesl +cd - +``` + +- [ ] **Step 2.1b: Strip backticks from shader comments** + +Per the WESL parser limitations documented in the pre-flight reference, every backtick (`` ` ``) inside the shader files must be replaced with a single quote. The didactic-comment style uses backticks for inline-code callouts; single quotes preserve the visual cue while making the WESL parser happy. Apply across all 6 renamed files (toneMap was handled in task 1): + +```bash +for f in src/services/gpu/shaders/disks.wesl \ + src/services/gpu/shaders/filaments.wesl \ + src/services/gpu/shaders/milkyWayImpostor.wesl \ + src/services/gpu/shaders/points.wesl \ + src/services/gpu/shaders/proceduralDisks.wesl \ + src/services/gpu/shaders/quads.wesl; do + # Use perl rather than sed for portable in-place editing without backup files. + perl -i -pe "s/\`/'/g" "$f" +done +``` + +Verify zero backticks remain: + +```bash +grep -c '`' src/services/gpu/shaders/*.wesl +# Expected: every line ends with `:0` +``` + +This is the only content change in this task — every other byte of the shaders stays identical. Document the substitution in the commit message. + +- [ ] **Step 2.2: Update each renderer's import** + +For each of the 6 renderer TS files, change the `?raw` import to `?static` and update the file extension. 
Read each file first to find the exact line, then edit: + +| File | Old import | New import | +|---|---|---| +| `src/services/gpu/diskRenderer.ts` | `'./shaders/disks.wgsl?raw'` | `'./shaders/disks.wesl?static'` | +| `src/services/gpu/filamentRenderer.ts` | `'./shaders/filaments.wgsl?raw'` | `'./shaders/filaments.wesl?static'` | +| `src/services/gpu/milkyWayRenderer.ts` | `'./shaders/milkyWayImpostor.wgsl?raw'` | `'./shaders/milkyWayImpostor.wesl?static'` | +| `src/services/gpu/pointRenderer.ts` | `'./shaders/points.wgsl?raw'` | `'./shaders/points.wesl?static'` | +| `src/services/gpu/proceduralDiskRenderer.ts` | `'./shaders/proceduralDisks.wgsl?raw'` | `'./shaders/proceduralDisks.wesl?static'` | +| `src/services/gpu/quadRenderer.ts` | `'./shaders/quads.wgsl?raw'` | `'./shaders/quads.wesl?static'` | +| `src/services/gpu/pickRenderer.ts` | `'./shaders/points.wgsl?raw'` | `'./shaders/points.wesl?static'` | + +(Note: pickRenderer also imports `points.wgsl` — that's the second import to update. Total: 7 TS files modified, 6 shader files renamed.) + +- [ ] **Step 2.3: Build + typecheck + test** + +```bash +npm run typecheck && npm run build && npm test +``` + +Expected: all green. Each shader now goes through the WESL linker but still has zero imports, so output WGSL is byte-identical to source. + +- [ ] **Step 2.4: Visual sanity check** + +Reload the dev server. All renderers should produce identical visuals to before. Pan, zoom, rotate; toggle tier; click a galaxy to verify pickRenderer still works. Anything different = stop. + +- [ ] **Step 2.5: Commit** + +```bash +git add -u +git commit -m "$(cat <<'EOF' +chore(shaders): rename remaining 6 shaders .wgsl → .wesl + +Bulk rename. Each renderer's ?raw import becomes ?static so the WESL +linker runs on every shader. 
Output WGSL is byte-identical until +imports are introduced in later tasks, save for one mechanical content +change: backticks in comments are replaced with single quotes +project-wide because the WESL parser tokenises ` regardless of comment +context. The single-quote replacement preserves the visual intent of +the inline-code callouts and is mechanically reversible if the parser +later loosens up. + +Co-Authored-By: Claude Opus 4.7 +EOF +)" +``` + +--- + +## Task 3: Extract `lib/math.wesl` (math primitives module) + +> **Note (post-execution):** This task originally planned six single-function files under `lib/math/`, but WESL's import resolution treats the last segment of a path as the function name and the rest as the module path. With one-fn-per-file the working import becomes `import package::lib::math::saturate::saturate;` (duplicated leaf) instead of the cleaner `import package::lib::math::saturate;`. We collapsed the six files into a single `lib/math.wesl` with section-divider comments, which matches the WESL idiom. The task as committed creates one file (`lib/math.wesl`) instead of six and uses single-segment imports. + +**Files:** +- Create: `src/services/gpu/shaders/lib/math/constants.wesl` +- Create: `src/services/gpu/shaders/lib/math/rot2.wesl` +- Create: `src/services/gpu/shaders/lib/math/sabs.wesl` +- Create: `src/services/gpu/shaders/lib/math/saturate.wesl` +- Create: `src/services/gpu/shaders/lib/math/toPolar.wesl` +- Create: `src/services/gpu/shaders/lib/math/toRect.wesl` +- Modify: `src/services/gpu/shaders/milkyWayImpostor.wesl` (replace inline `rot`, `sabs`, `toPolar`, `toRect`) +- Modify: `src/services/gpu/shaders/points.wesl` (replace inline `clamp(x, 0, 1)` with `saturate`, where it appears) + +- [ ] **Step 3.1: Create the six math files** + +`src/services/gpu/shaders/lib/math/constants.wesl`: +```wgsl +// lib/math/constants.wesl — common scalar constants. 
+// +// Pulled out of points.wesl + milkyWayImpostor.wesl which had +// hand-typed '3.14159...' and '2.30258...' literals. Keeping these +// in one file gives us one place to add precision if we ever need +// f64-equivalent constants for compute shaders. + +const PI: f32 = 3.14159265358979; +const TAU: f32 = 6.28318530717958; +const LOG10: f32 = 2.30258509299404; // ln(10), for converting between log10 and ln +``` + +`src/services/gpu/shaders/lib/math/saturate.wesl`: +```wgsl +// lib/math/saturate.wesl — clamp(x, 0, 1). +// +// The 'clamp(x, 0.0, 1.0)' form recurs ~20× across the shaders; this +// gives us a named primitive. Recent WGSL revisions also define a +// builtin saturate(); verify that shadowing it with this explicit +// f32 version is intended before relying on it. + +fn saturate(x: f32) -> f32 { + return clamp(x, 0.0, 1.0); +} +``` + +`src/services/gpu/shaders/lib/math/rot2.wesl`: +```wgsl +// lib/math/rot2.wesl — 2D rotation of a point around the origin. +// +// Pulled from milkyWayImpostor.wesl's inline 'rot()'. Returned as +// a fresh vec2 (no in-place mutation) so it composes cleanly in +// expressions. + +fn rot2(p: vec2<f32>, a: f32) -> vec2<f32> { + let c = cos(a); + let s = sin(a); + return vec2(c * p.x - s * p.y, s * p.x + c * p.y); +} +``` + +`src/services/gpu/shaders/lib/math/sabs.wesl`: +```wgsl +// lib/math/sabs.wesl — smooth absolute value. +// +// 'sabs(x, k)' approximates 'abs(x)' but is C¹-continuous at x=0 +// for k > 0. Smaller 'k' → sharper corner (k = 0 recovers abs(x) +// exactly); larger 'k' → rounder corner, with sabs(0, k) = sqrt(k). +// Used by milkyWay's height function to avoid kinks in the +// derivative of disk thickness. + +fn sabs(x: f32, k: f32) -> f32 { + return sqrt(x * x + k); +} +``` + +`src/services/gpu/shaders/lib/math/toPolar.wesl`: +```wgsl +// lib/math/toPolar.wesl — Cartesian (x, y) → polar (r, θ). +// +// Returns vec2(r, theta) with theta in radians, range (-PI, PI]. + +fn toPolar(p: vec2<f32>) -> vec2<f32> { + return vec2(length(p), atan2(p.y, p.x)); +} +``` + +`src/services/gpu/shaders/lib/math/toRect.wesl`: +```wgsl +// lib/math/toRect.wesl — polar (r, θ) → Cartesian (x, y). +// +// Inverse of toPolar. p.x = r, p.y = theta. 
+ +fn toRect(p: vec2<f32>) -> vec2<f32> { + return vec2(p.x * cos(p.y), p.x * sin(p.y)); +} +``` + +- [ ] **Step 3.2: Replace `rot`, `sabs`, `toPolar`, `toRect` in `milkyWayImpostor.wesl`** + +Read `src/services/gpu/shaders/milkyWayImpostor.wesl`. At the top of the file (after any leading docblock), add: + +```wgsl +import package::lib::math::rot2; +import package::lib::math::sabs; +import package::lib::math::toPolar; +import package::lib::math::toRect; +``` + +Then **delete** the four inline function definitions: +- `fn toPolar(p: vec2<f32>) -> vec2<f32>` (around line 330) +- `fn toRect(p: vec2<f32>) -> vec2<f32>` (around line 334) +- `fn rot(p: vec2<f32>, a: f32) -> vec2<f32>` (around line 367) +- `fn sabs(x: f32, k: f32) -> f32` (around line 425) + +The function name `rot` becomes `rot2` everywhere it's called inside the file. Use a global find-replace within the file: `rot(` → `rot2(`. Be precise: no other identifier in this shader should end in `rot`, but verify with grep before replacing. + +```bash +grep -n "rot(" src/services/gpu/shaders/milkyWayImpostor.wesl +``` + +Expected: matches are all the call sites of the deleted `rot` function. Replace each with `rot2(`. + +- [ ] **Step 3.3: Replace `clamp(x, 0.0, 1.0)` with `saturate(x)` in points.wesl** + +Read `src/services/gpu/shaders/points.wesl`. Add the import near the top: + +```wgsl +import package::lib::math::saturate; +``` + +Find every occurrence of `clamp(<expr>, 0.0, 1.0)` and `clamp(<expr>, 0, 1)` in the file: + +```bash +grep -n "clamp(" src/services/gpu/shaders/points.wesl +``` + +Replace each `clamp(<expr>, 0.0, 1.0)` with `saturate(<expr>)` **only when** the second and third arguments are exactly `0.0, 1.0` or `0, 1`. Don't touch `clamp` calls with other bounds. + +(There may be ~5–10 such matches. The remaining `clamp` calls with non-[0,1] bounds stay as-is — `saturate` is specifically the [0,1] case.) + +- [ ] **Step 3.4: Build + typecheck + test** + +```bash +npm run typecheck && npm run build && npm test +``` + +Expected: all green. 
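Because the repo has no shader unit-test framework, the Task 3 primitives can also be mirrored on the CPU and checked numerically before the visual pass. The following is a minimal sketch, not part of the plan's file list: the `Vec2` type and `approxEq` helper are hypothetical, and the function bodies simply restate the WGSL above in TypeScript.

```ts
// CPU-side mirrors of the lib/math primitives (illustrative only; these
// helpers are assumptions for sanity-checking, not files in the repo).
type Vec2 = readonly [number, number];

// clamp(x, 0, 1), matching WGSL's min(max(x, low), high) definition.
function saturate(x: number): number {
  return Math.min(Math.max(x, 0), 1);
}

// Rotate p around the origin by angle a (radians, counter-clockwise).
function rot2(p: Vec2, a: number): Vec2 {
  const c = Math.cos(a);
  const s = Math.sin(a);
  return [c * p[0] - s * p[1], s * p[0] + c * p[1]];
}

// Smooth absolute value: sqrt(x² + k). At x = 0 it returns sqrt(k),
// so a larger k rounds the corner more.
function sabs(x: number, k: number): number {
  return Math.sqrt(x * x + k);
}

// Cartesian (x, y) → polar (r, theta).
function toPolar(p: Vec2): Vec2 {
  return [Math.hypot(p[0], p[1]), Math.atan2(p[1], p[0])];
}

// Polar (r, theta) → Cartesian (x, y).
function toRect(p: Vec2): Vec2 {
  return [p[0] * Math.cos(p[1]), p[0] * Math.sin(p[1])];
}

// Tolerance comparison for floating-point results.
function approxEq(a: number, b: number, eps = 1e-9): boolean {
  return Math.abs(a - b) <= eps;
}
```

A round trip such as `toRect(toPolar([3, 4]))` should land back on the input within floating-point tolerance, which is a cheap guard against a swapped `sin`/`cos` when the inline definitions are deleted.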
+ +- [ ] **Step 3.5: Visual sanity check** + +Reload dev server. Milky Way impostor + points pass should be visually identical. Spend ~30s panning around, especially near the Milky Way (where `sabs`/`rot2` actually fire) and at distance from origin (where `saturate` calls in points.wesl gate the depth fade). + +- [ ] **Step 3.6: Commit** + +```bash +git add src/services/gpu/shaders/lib/math/ \ + src/services/gpu/shaders/milkyWayImpostor.wesl \ + src/services/gpu/shaders/points.wesl +git commit -m "$(cat <<'EOF' +refactor(shaders): extract lib/math/ — saturate, rot2, sabs, toPolar, toRect, constants + +Six single-function modules under lib/math/, plus a constants file +for PI/TAU/LOG10. Replaces inline definitions in milkyWayImpostor +and the ~10 inline `clamp(x, 0, 1)` calls in points with named +`saturate()`. No semantic change. + +Co-Authored-By: Claude Opus 4.7 +EOF +)" +``` + +--- + +## Task 4: Extract `lib/camera.wesl` + +**Files:** +- Create: `src/services/gpu/shaders/lib/camera.wesl` +- Modify: each renderer shader that today rolls its own view/proj math + +- [ ] **Step 4.1: Inventory existing camera-uniform layouts** + +Before extracting, read each renderer's `Uniforms` struct to identify which fields are camera-related (`viewProj`, `view`, `proj`, `cameraPos`, `kPerZ`, `viewportPx`, `dpr`, etc.) vs. renderer-specific (e.g. `globalBrightness` in points; `cloudOpacity` is fade-related and stays in cloudFade later). Note any field-order differences between renderers. + +```bash +grep -n "^struct Uniforms" src/services/gpu/shaders/*.wesl +# Then read each one — they're at: +# disks.wesl:57, filaments.wesl:21, milkyWayImpostor.wesl:71, +# points.wesl:68, proceduralDisks.wesl:18, quads.wesl:21, toneMap.wesl:24 +``` + +Document the canonical `CameraUniforms` field order in the new module's docblock — this is the source of truth, all renderers must adopt this order. 
+ +- [ ] **Step 4.2: Create `lib/camera.wesl`** + +```wgsl +// lib/camera.wesl — shared camera uniform layout + projection helpers. +// +// CANONICAL FIELD ORDER. Bind groups across all renderers depend on +// these offsets matching exactly between TS-side struct writes and +// WGSL-side struct reads. Do NOT reorder fields without updating +// every renderer's TypedArray fill on the CPU side. +// +// Layout (WGSL layout rules; the struct itself is 16-byte aligned): +// offset 0: mat4x4<f32> viewProj (64 B) +// offset 64: mat4x4<f32> view (64 B) +// offset 128: mat4x4<f32> proj (64 B) +// offset 192: vec3<f32> cameraPos (12 B) + 4 B padding +// offset 208: vec2<f32> viewportPx (8 B, no padding after it) +// offset 216: f32 kPerZ +// offset 220: f32 dpr +// offset 224: f32 timeSec (for animated effects; renderers that +// don't need it leave it 0) +// offset 228: 12 B implicit tail padding (no declared field) +// Total: 240 bytes. + +struct CameraUniforms { + viewProj: mat4x4<f32>, + view: mat4x4<f32>, + proj: mat4x4<f32>, + cameraPos: vec3<f32>, + viewportPx: vec2<f32>, + kPerZ: f32, + dpr: f32, + timeSec: f32, +} + +// World-space → clip-space (homogeneous, w=1 input). +fn worldToClip(cam: CameraUniforms, p: vec3<f32>) -> vec4<f32> { + return cam.viewProj * vec4(p, 1.0); +} + +// Camera-to-point distance (Euclidean, i.e. radial rather than +// measured along the view axis). Useful for size-vs-distance scaling +// that must be linear, not 1/w. +fn worldEyeDepth(cam: CameraUniforms, p: vec3<f32>) -> f32 { + return length(cam.cameraPos - p); +} + +// Pixel size (in NDC units) of a kPerZ-defined world unit at the given +// eye-space depth. Inverse of: "1 NDC unit = how many pixels at this depth?" +// Used by the billboard library for screen-space-sized point sprites. +fn pixelSizeAt(cam: CameraUniforms, eyeDepth: f32) -> f32 { + return cam.kPerZ / max(eyeDepth, 0.001); +} +``` + +(Verify the field count and offsets against what the TS side actually writes — read `src/services/engine/engine.ts` or wherever the camera uniform buffer is filled. Adjust `viewportPx` / `dpr` / `timeSec` presence based on real usage.) 
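One way to do that verification without counting bytes by hand is to compute the offsets from WGSL's alignment and size rules. The sketch below is an assumption-laden helper, not existing project code: `structLayout`, `LAYOUT`, and `CAMERA_MEMBERS` are hypothetical names, and the table covers only the member types `CameraUniforms` uses.

```ts
// Alignment and size (in bytes) of the WGSL types used by CameraUniforms,
// per the WGSL "alignment and size" rules for f32-based types.
type WgslType = 'f32' | 'vec2f' | 'vec3f' | 'vec4f' | 'mat4x4f';

const LAYOUT: Record<WgslType, { align: number; size: number }> = {
  f32: { align: 4, size: 4 },
  vec2f: { align: 8, size: 8 },
  vec3f: { align: 16, size: 12 }, // 16-byte aligned but only 12 bytes wide
  vec4f: { align: 16, size: 16 },
  mat4x4f: { align: 16, size: 64 },
};

interface Member {
  name: string;
  type: WgslType;
}

function roundUp(multiple: number, n: number): number {
  return Math.ceil(n / multiple) * multiple;
}

// Each member starts at the next multiple of its own alignment; the struct
// size is rounded up to the largest member alignment.
function structLayout(members: readonly Member[]): {
  offsets: Record<string, number>;
  size: number;
} {
  const offsets: Record<string, number> = {};
  let cursor = 0;
  let structAlign = 1;
  for (const m of members) {
    const { align, size } = LAYOUT[m.type];
    cursor = roundUp(align, cursor);
    offsets[m.name] = cursor;
    cursor += size;
    structAlign = Math.max(structAlign, align);
  }
  return { offsets, size: roundUp(structAlign, cursor) };
}

// Field order copied from the CameraUniforms struct in Step 4.2.
const CAMERA_MEMBERS: readonly Member[] = [
  { name: 'viewProj', type: 'mat4x4f' },
  { name: 'view', type: 'mat4x4f' },
  { name: 'proj', type: 'mat4x4f' },
  { name: 'cameraPos', type: 'vec3f' },
  { name: 'viewportPx', type: 'vec2f' },
  { name: 'kPerZ', type: 'f32' },
  { name: 'dpr', type: 'f32' },
  { name: 'timeSec', type: 'f32' },
];

const cameraLayout = structLayout(CAMERA_MEMBERS);
```

For the struct as declared, this places `kPerZ` immediately after `viewportPx` (WGSL inserts no padding after a `vec2<f32>` when the next member is a scalar) and rounds the total up to 240 bytes. Compare the result against both the docblock and the CPU-side write sequence.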
+ +- [ ] **Step 4.3: Update each renderer shader** + +For each of the 7 shader files (`disks`, `filaments`, `milkyWayImpostor`, `points`, `proceduralDisks`, `quads`, `toneMap`): + +1. Add `import skymap::lib::camera::{ CameraUniforms, worldToClip, worldEyeDepth };` (and `pixelSizeAt` where used) to the top of the file. +2. Refactor the renderer's `Uniforms` struct so its first field is `cam: CameraUniforms` and renderer-specific fields follow. **Or**, if the renderer has only camera fields, replace the `Uniforms` struct entirely with `CameraUniforms`. +3. Replace inline `viewProj * vec4(p, 1.0)` with `worldToClip(u.cam, p)`. +4. Replace inline `length(u.cameraPos - p)` (or equivalent) with `worldEyeDepth(u.cam, p)`. + +This is a per-renderer commit. **Do these as 7 sub-commits**, one per renderer, so each diff is reviewable in isolation. + +For **each** renderer, after the shader change, also update the TypeScript side that fills the uniform buffer. Read the renderer's TS file to locate where the `Float32Array`/`DataView` write sequence happens — add or reorder writes to match the new `CameraUniforms` layout. The byte total must match the WGSL struct exactly. 
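The CPU-side write can be sketched as a single packing function shared by every renderer. This is an illustration under stated assumptions, not the project's actual fill code: `CameraState`, `FLOAT_INDEX`, and `packCameraUniforms` are hypothetical names, matrices are assumed to arrive as 16-element column-major arrays, and the float indices follow WGSL's layout rules for the struct as declared in Step 4.2.

```ts
// Hypothetical CPU-side view of the camera data; field names mirror the
// WGSL CameraUniforms struct.
interface CameraState {
  viewProj: ArrayLike<number>; // 16 floats, column-major
  view: ArrayLike<number>;
  proj: ArrayLike<number>;
  cameraPos: readonly [number, number, number];
  viewportPx: readonly [number, number];
  kPerZ: number;
  dpr: number;
  timeSec: number;
}

const CAMERA_UNIFORM_BYTES = 240;

// Byte offset / 4 for each field (192/4 = 48, 208/4 = 52, 216/4 = 54, ...).
const FLOAT_INDEX = {
  viewProj: 0,
  view: 16,
  proj: 32,
  cameraPos: 48, // index 51 is padding after the vec3
  viewportPx: 52,
  kPerZ: 54,
  dpr: 55,
  timeSec: 56, // indices 57..59 are tail padding
} as const;

function packCameraUniforms(
  cam: CameraState,
  out: Float32Array = new Float32Array(CAMERA_UNIFORM_BYTES / 4),
): Float32Array {
  if (out.byteLength !== CAMERA_UNIFORM_BYTES) {
    throw new RangeError(`expected ${CAMERA_UNIFORM_BYTES} bytes, got ${out.byteLength}`);
  }
  for (const m of [cam.viewProj, cam.view, cam.proj]) {
    if (m.length !== 16) throw new RangeError('matrices must have 16 elements');
  }
  out.set(cam.viewProj, FLOAT_INDEX.viewProj);
  out.set(cam.view, FLOAT_INDEX.view);
  out.set(cam.proj, FLOAT_INDEX.proj);
  out.set(cam.cameraPos, FLOAT_INDEX.cameraPos);
  out.set(cam.viewportPx, FLOAT_INDEX.viewportPx);
  out[FLOAT_INDEX.kPerZ] = cam.kPerZ;
  out[FLOAT_INDEX.dpr] = cam.dpr;
  out[FLOAT_INDEX.timeSec] = cam.timeSec;
  return out;
}
```

The returned array can be handed to `device.queue.writeBuffer` at the offset where the renderer's `cam` field starts. Keeping one packer avoids seven hand-maintained write sequences drifting apart.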
+ +The mechanical pattern per renderer (`<name>` is the shader's base name, `<Name>Renderer.ts` its TS counterpart; for toneMap that file is `toneMapPass.ts`): +``` +edit shaders/<name>.wesl # add import, restructure Uniforms struct, swap call sites +edit <Name>Renderer.ts # update CPU-side uniform write to match new layout +build + test + visual # gate +git add + commit # per-renderer sub-commit +``` + +- [ ] **Step 4.4: Per-renderer sub-commit checklist** + +Repeat for each of: `disks`, `filaments`, `milkyWayImpostor`, `points`, `proceduralDisks`, `quads`, `toneMap`: + +```bash +# After editing the .wesl + .ts pair for one renderer: +npm run typecheck && npm run build && npm test +# Visual check: reload dev server, focus on the affected renderer's output +git add src/services/gpu/shaders/<name>.wesl src/services/gpu/<Name>Renderer.ts +git commit -m "refactor(shaders): adopt lib/camera.wesl in <Name>Renderer" +``` + +(Final sub-commit, after all 7 renderers, also git-adds `lib/camera.wesl` itself if not already committed.) + +- [ ] **Step 4.5: Final verification after all renderers converted** + +```bash +npm run typecheck && npm run build && npm test +``` + +Expected: all green. Visual: every renderer should look identical to pre-task. The most likely failure mode is a struct-alignment bug — wrong CPU-side write order produces garbage uniforms and renders nothing or wildly wrong colors. + +--- + +## Task 5: Extract `lib/billboard.wesl` + +**Files:** +- Create: `src/services/gpu/shaders/lib/billboard.wesl` +- Modify: `points.wesl`, `quads.wesl`, `disks.wesl`, `proceduralDisks.wesl` + +- [ ] **Step 5.1: Inventory existing billboard expansion code** + +Each of the four billboard renderers has a near-identical block that: +1. Receives `vid: u32` (0..3, the vertex index of a unit quad). +2. Computes `cornerOffset = vec2((vid & 1u) == 0u ? -1.0 : 1.0, ...)` (pseudocode: WGSL has no ternary operator, so the real code uses `select` or a constant array). +3. Scales by a per-instance pixel- or world-size. +4. Adds the offset to the world-space center, projected via `viewProj`. + +Read each of the four files' `vs` entry points to locate the shared pattern. 
+ +- [ ] **Step 5.2: Create `lib/billboard.wesl`** + +```wgsl +// lib/billboard.wesl — view-aligned billboard expansion helpers. +// +// All four billboard renderers (points, quads, disks, proceduralDisks) +// take a unit-quad's '@builtin(vertex_index) vid: u32' and need to: +// 1. Map vid (0..3) → corner offset in [-1, +1]² (UV-style). +// 2. Multiply by a per-instance size. +// 3. Add to an instance's world-space center. +// +// The corner mapping uses a CCW triangle-strip order (vid=0 → +// bottom-left, 1 → bottom-right, 2 → top-left, 3 → top-right) so a +// 4-vertex 'triangle-strip' topology renders the quad as two +// triangles without an index buffer. + +import package::lib::camera::CameraUniforms; + +// Map vertex index 0..3 to its [-1, +1]² corner offset. +fn quadCorner(vid: u32) -> vec2<f32> { + let x = select(1.0, -1.0, (vid & 1u) == 0u); + let y = select(1.0, -1.0, (vid & 2u) == 0u); + return vec2(x, y); +} + +// Same mapping but as UV in [0, 1]², for fragment-shader UV coords. +fn quadUv(vid: u32) -> vec2<f32> { + let x = select(1.0, 0.0, (vid & 1u) == 0u); + let y = select(1.0, 0.0, (vid & 2u) == 0u); + return vec2(x, y); +} + +// Expand a screen-space-sized billboard. 'centerWS' is the instance +// center in world space, 'sizePx' is the desired diameter in pixels at +// the current viewport, and the result is a clip-space position. +// +// Internally: project center to clip, then add the corner offset +// scaled by sizePx / viewportPx and by centerClip.w. The w factor +// cancels the perspective divide, so the quad's screen size is +// constant regardless of distance. +fn expandBillboardScreen( + cam: CameraUniforms, + centerWS: vec3<f32>, + sizePx: f32, + vid: u32, +) -> vec4<f32> { + let centerClip = cam.viewProj * vec4(centerWS, 1.0); + let cornerNDC = quadCorner(vid) * (sizePx / cam.viewportPx) * centerClip.w; + return vec4(centerClip.xy + cornerNDC, centerClip.zw); +} + +// Expand a world-space-sized billboard. 
'sizeWS' is the desired +// diameter in world units, and the quad is view-aligned (faces the +// camera). Used for galaxy thumbnails, where the on-sky size is +// physically meaningful. +fn expandBillboardWorld( + cam: CameraUniforms, + centerWS: vec3<f32>, + sizeWS: f32, + vid: u32, +) -> vec4<f32> { + // View-aligned basis: x = camera-right, y = camera-up. + // These are the rows of the view matrix's rotation block, i.e. the + // columns of its inverse (camera-to-world) rotation. + let right = vec3(cam.view[0].x, cam.view[1].x, cam.view[2].x); + let up = vec3(cam.view[0].y, cam.view[1].y, cam.view[2].y); + let corner = quadCorner(vid) * sizeWS * 0.5; + let posWS = centerWS + right * corner.x + up * corner.y; + return cam.viewProj * vec4(posWS, 1.0); +} +``` + +- [ ] **Step 5.3: Replace inline expansion in each billboard renderer** + +For each of `points.wesl`, `quads.wesl`, `disks.wesl`, `proceduralDisks.wesl`: + +1. Add the relevant imports: + ```wgsl + import package::lib::billboard::{ quadCorner, quadUv, expandBillboardScreen, expandBillboardWorld }; + ``` +2. Inside the `vs` entry point, replace the manually-rolled corner+expansion math with the matching helper. Keep all other logic (color computation, fade, magnitude→intensity) untouched. +3. If the existing code uses a custom corner ordering, verify the new `quadCorner`'s [-1,+1]² output produces the same vertex layout — otherwise the quad will wind backward and disappear under back-face culling. + +This is per-renderer. Sub-commit each (`<name>` is the shader's base name): + +```bash +npm run typecheck && npm run build && npm test +# Visual: reload dev. Focus on the renderer just changed. +git add src/services/gpu/shaders/<name>.wesl +git commit -m "refactor(shaders): adopt lib/billboard.wesl in <name>" +``` + +(`disks.wesl` is the trickiest — its expansion uses the position-angle/inclination math, so leave the orientation parts untouched and only swap the corner-mapping primitives. `lib/orientation.wesl` in task 6 handles the rest.) 
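The corner-ordering check in item 3 can be done numerically before touching a shader. The sketch below mirrors `quadCorner` and `quadUv` in TypeScript and computes the signed area of both strip triangles in the [-1, +1]² corner space. The helper names (`signedArea`, `stripTriangleAreas`) are hypothetical, and the odd-triangle vertex swap reflects the usual triangle-strip convention, which is an assumption worth confirming against the pipeline's actual topology and `frontFace` settings.

```ts
type Vec2 = readonly [number, number];

// Mirror of the WGSL quadCorner: bit 0 selects x, bit 1 selects y.
function quadCorner(vid: number): Vec2 {
  const x = (vid & 1) === 0 ? -1 : 1;
  const y = (vid & 2) === 0 ? -1 : 1;
  return [x, y];
}

// Mirror of the WGSL quadUv: same bits, mapped to [0, 1]².
function quadUv(vid: number): Vec2 {
  const x = (vid & 1) === 0 ? 0 : 1;
  const y = (vid & 2) === 0 ? 0 : 1;
  return [x, y];
}

// Signed area of triangle (a, b, c); positive means counter-clockwise
// when y points up.
function signedArea(a: Vec2, b: Vec2, c: Vec2): number {
  return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]));
}

// Areas of the two triangles a 4-vertex strip produces. The second
// triangle swaps its first two vertices so both keep the same winding.
function stripTriangleAreas(corner: (vid: number) => Vec2): [number, number] {
  const v0 = corner(0);
  const v1 = corner(1);
  const v2 = corner(2);
  const v3 = corner(3);
  return [signedArea(v0, v1, v2), signedArea(v2, v1, v3)];
}

const [areaA, areaB] = stripTriangleAreas(quadCorner);
const sameWinding = Math.sign(areaA) === Math.sign(areaB) && areaA !== 0;
```

To compare against a renderer's existing ordering, pass its corner function to `stripTriangleAreas` and check that the signs agree with those of `quadCorner`. A sign flip is the "wind backward and disappear" case.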
+ +- [ ] **Step 5.4: Final verification** + +```bash +npm run typecheck && npm run build && npm test +``` + +Visual: thoroughly check points, quads (galaxy thumbnails near close approach), disks, and proceduralDisks. The failure mode here is a corner-ordering bug — quads disappear or invert. + +--- + +## Task 6: Extract `lib/orientation.wesl` + +**Files:** +- Create: `src/services/gpu/shaders/lib/orientation.wesl` +- Modify: `disks.wesl`, `proceduralDisks.wesl` + +- [ ] **Step 6.1: Read the duplicate code** + +```bash +sed -n '155,170p' src/services/gpu/shaders/disks.wesl +echo "---" +sed -n '150,170p' src/services/gpu/shaders/proceduralDisks.wesl +``` + +Confirm the two blocks are byte-for-byte equivalent (modulo identifier renames and comment style). Capture any genuine difference here in the commit message — usually there's none. + +- [ ] **Step 6.2: Create `lib/orientation.wesl`** + +```wgsl +// lib/orientation.wesl — galaxy disk orientation: position-angle + +// inclination → world-space major/minor axes. +// +// Background: the catalog gives us each galaxy's position-angle (PA, +// the angle from local north toward east, projected on the sky) and +// either an axis ratio b/a or a directly-measured inclination i. +// We need a 3D coordinate frame for the disk: a major axis on the +// plane of the sky, and a minor axis tilted toward the line-of-sight. +// +// Derivation (also lives in disks.wesl + proceduralDisks.wesl as +// commentary): +// 1. north_proj, east_proj: tangent-plane basis at the galaxy +// world position, north = +y projected onto the local sky tangent. +// 2. major = north_proj * cos(PA) + east_proj * sin(PA) +// 3. minor_in_sky = north_proj * (-sin(PA)) + east_proj * cos(PA) +// 4. minor_3d = minor_in_sky * cos(i) + losDir * sin(i) +// where losDir = unit vector from camera toward galaxy. +// +// Edge-on (axisRatio → 0, cosI → 0, sinI → 1) → minor_3d ≈ losDir. +// Face-on (axisRatio → 1, cosI → 1, sinI → 0) → minor_3d ≈ minor_in_sky. 
+ +struct DiskAxes { + major: vec3<f32>, + minor: vec3<f32>, +} + +// Build the disk's world-space axes. +// posWS: galaxy world position +// cameraPos: camera world position (defines line-of-sight) +// paRad: position angle in radians, from north toward east +// cosI, sinI: cosine and sine of the inclination angle. +// For a catalog axisRatio = b/a, cosI = axisRatio, +// sinI = sqrt(1 - axisRatio²). +fn diskAxes( + posWS: vec3<f32>, + cameraPos: vec3<f32>, + paRad: f32, + cosI: f32, + sinI: f32, +) -> DiskAxes { + let losDir = normalize(posWS - cameraPos); + + // Local tangent basis. North is global +y projected onto the plane + // perpendicular to losDir; east is north × losDir (right-handed). + let worldUp = vec3<f32>(0.0, 1.0, 0.0); + let northTangent = normalize(worldUp - losDir * dot(losDir, worldUp)); + let eastTangent = cross(northTangent, losDir); + + let cosPA = cos(paRad); + let sinPA = sin(paRad); + + let majorSky = northTangent * cosPA + eastTangent * sinPA; + let perpMajorSky = northTangent * (-sinPA) + eastTangent * cosPA; + let minor3D = perpMajorSky * cosI + losDir * sinI; + + return DiskAxes(majorSky, minor3D); +} +``` + +(Verify the exact derivation against the existing block — there's a chance one renderer uses a slightly different sign convention. If so, document and unify.) + +- [ ] **Step 6.3: Replace the inline block in `disks.wesl`** + +Read `disks.wesl` to locate the existing block (around lines 155–170). Add the import: + +```wgsl +import skymap::lib::orientation::{ DiskAxes, diskAxes }; +``` + +Replace the ~12 lines of inline math with a single call: + +```wgsl +let axes = diskAxes(instance.posWS, u.cam.cameraPos, instance.paRad, cosI, sinI); +let majorAxis = axes.major; +let minorAxis = axes.minor; +``` + +(Adjust local variable names to match what the existing `vs` body uses afterward.) + +- [ ] **Step 6.4: Replace the inline block in `proceduralDisks.wesl`** + +Same replacement, same import, same call shape. 
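
The edge-on and face-on limits described in the `lib/orientation.wesl` header can be sanity-checked on the CPU before relying on the visual pass. The following is a minimal TypeScript port of the drafted `diskAxes()`, written only for illustration: the vector helpers and the sample position are assumptions, and the port follows the draft above rather than whatever sign convention the live shaders turn out to use.

```typescript
type Vec3 = [number, number, number];

const sub = (a: Vec3, b: Vec3): Vec3 => [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
const add = (a: Vec3, b: Vec3): Vec3 => [a[0] + b[0], a[1] + b[1], a[2] + b[2]];
const scale = (a: Vec3, s: number): Vec3 => [a[0] * s, a[1] * s, a[2] * s];
const dot = (a: Vec3, b: Vec3): number => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
const cross = (a: Vec3, b: Vec3): Vec3 => [
  a[1] * b[2] - a[2] * b[1],
  a[2] * b[0] - a[0] * b[2],
  a[0] * b[1] - a[1] * b[0],
];
const normalize = (a: Vec3): Vec3 => scale(a, 1 / Math.sqrt(dot(a, a)));

// CPU port of diskAxes() from the lib/orientation.wesl draft.
function diskAxes(posWS: Vec3, cameraPos: Vec3, paRad: number, cosI: number, sinI: number) {
  const losDir = normalize(sub(posWS, cameraPos));
  const worldUp: Vec3 = [0, 1, 0];
  // North: world +y projected onto the plane perpendicular to the line of sight.
  const north = normalize(sub(worldUp, scale(losDir, dot(losDir, worldUp))));
  const east = cross(north, losDir);
  const major = add(scale(north, Math.cos(paRad)), scale(east, Math.sin(paRad)));
  const perpMajor = add(scale(north, -Math.sin(paRad)), scale(east, Math.cos(paRad)));
  const minor = add(scale(perpMajor, cosI), scale(losDir, sinI));
  return { major, minor, losDir };
}

// Sample galaxy (made-up position): axisRatio b/a = 0.5.
const axisRatio = 0.5;
const samplePos: Vec3 = [3, 2, -10];
const origin: Vec3 = [0, 0, 0];
const axes = diskAxes(samplePos, origin, Math.PI / 6, axisRatio, Math.sqrt(1 - axisRatio * axisRatio));
const edgeOn = diskAxes(samplePos, origin, Math.PI / 6, 0, 1);
const faceOn = diskAxes(samplePos, origin, Math.PI / 6, 1, 0);

console.log(axes, edgeOn.minor, faceOn.minor);
```

With these inputs the major and minor axes should come out unit-length and mutually perpendicular, the edge-on minor axis should coincide with the line of sight, and the face-on minor axis should lie in the plane of the sky.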
+ +- [ ] **Step 6.5: Build + typecheck + test + visual + commit** + +```bash +npm run typecheck && npm run build && npm test +``` + +Visual: focus on disks and procedural disks at close approach. Any galaxy with a known orientation (M31, M81, NGC 891) should still tilt correctly. Edge-on galaxies should still appear edge-on. + +```bash +git add src/services/gpu/shaders/lib/orientation.wesl \ + src/services/gpu/shaders/disks.wesl \ + src/services/gpu/shaders/proceduralDisks.wesl +git commit -m "$(cat <<'EOF' +refactor(shaders): extract lib/orientation.wesl + +Collapses the verbatim PA + inclination → 3D major/minor axis math +duplicated between disks.wesl and proceduralDisks.wesl. The two +blocks were byte-equal modulo identifier renames; both now call +the shared diskAxes() helper. + +Co-Authored-By: Claude Opus 4.7 +EOF +)" +``` + +--- + +## Task 7: Extract `lib/colorIndex.wesl` + +**Files:** +- Create: `src/services/gpu/shaders/lib/colorIndex.wesl` +- Modify: `points.wesl`, `proceduralDisks.wesl` + +- [ ] **Step 7.1: Read the duplicate `ramp` function** + +```bash +sed -n '650,705p' src/services/gpu/shaders/points.wesl +echo "---" +sed -n '210,220p' src/services/gpu/shaders/proceduralDisks.wesl +``` + +Confirm the two `fn ramp(t: f32) -> vec3` definitions are byte-equal (modulo formatting). The longer comment block above `points.wesl`'s ramp is documentation; preserve it on the new module. + +- [ ] **Step 7.2: Create `lib/colorIndex.wesl`** + +```wgsl +// lib/colorIndex.wesl — color-index → RGB ramp. +// +// Maps a normalised color index t ∈ [0, 1] to a color, where t=0 +// represents the bluest galaxies and t=1 the reddest. The ramp is a +// piecewise-linear interpolation through five anchor colors derived +// from real galaxy spectra (UV-bright spirals → red ellipticals). 
+// +// The mapping from catalog (g - i) or (B - V) to t happens on the CPU +// side (see src/data/colourIndex.ts) so this shader doesn't have to +// know which photometric system any given galaxy came from. +// +// Future work: a B-V → blackbody-temperature → RGB path would be +// physically more honest. Until then, this hand-tuned ramp matches +// what NASA-style press images use, which gives users the "right" +// expectation about galaxy color. + +fn ramp(t: f32) -> vec3<f32> { + // [PASTE THE EXISTING RAMP BODY HERE — copy from points.wesl + // verbatim. The function is ~50 lines of piecewise mix() calls + // between five anchor colors.] +} +``` + +(The implementer must paste the actual existing function body when extracting — do not re-derive the anchor colors from memory.) + +- [ ] **Step 7.3: Replace `ramp` in `points.wesl`** + +Add import: +```wgsl +import skymap::lib::colorIndex::ramp; +``` + +Delete the local `fn ramp` definition. Call sites (already named `ramp(...)`) need no change. + +- [ ] **Step 7.4: Replace `ramp` in `proceduralDisks.wesl`** + +Same pattern. + +- [ ] **Step 7.5: Build + visual + commit** + +```bash +npm run typecheck && npm run build && npm test +git add src/services/gpu/shaders/lib/colorIndex.wesl \ + src/services/gpu/shaders/points.wesl \ + src/services/gpu/shaders/proceduralDisks.wesl +git commit -m "refactor(shaders): extract lib/colorIndex.wesl" +``` + +(Visual: galaxy color distribution should be unchanged. Easiest check: zoom out to a wide view and observe the red/blue ratio matches before.) + +--- + +## Task 8: Extract `lib/cloudFade.wesl` + +**Files:** +- Create: `src/services/gpu/shaders/lib/cloudFade.wesl` +- Modify: `points.wesl`, `filaments.wesl` + +- [ ] **Step 8.1: Compare the duplicate `CloudUniforms` struct** + +```bash +sed -n '290,322p' src/services/gpu/shaders/points.wesl +echo "---" +sed -n '37,47p' src/services/gpu/shaders/filaments.wesl +``` + +Note any differences. 
Document them in the commit message; if they diverge, either unify (preferred) or split into two named structs. + +- [ ] **Step 8.2: Create `lib/cloudFade.wesl`** + +```wgsl +// lib/cloudFade.wesl — per-cloud fade uniform + apply helper. +// +// Each renderable point cloud has an opacity scalar in [0, 1] that +// drives a smooth fade-in/out as a tier swap progresses. The CPU side +// animates this between 0 and 1 using a smoothstep curve. +// +// The struct also includes a cloudId for picking-target encoding: +// the pickRenderer writes (cloudId, instanceIdx) into r32uint so a +// single readback distinguishes which cloud the user clicked. + +struct CloudUniforms { + opacity: f32, + cloudId: u32, + // pad to 16-byte alignment if needed by the bind-group layout + _pad0: f32, + _pad1: f32, +} + +fn applyCloudFade(color: vec4<f32>, cloud: CloudUniforms) -> vec4<f32> { + return vec4<f32>(color.rgb, color.a * cloud.opacity); +} +``` + +(Match the actual TS-side write layout. If `CloudUniforms` has more fields in the live code than this draft shows, copy them in.) + +- [ ] **Step 8.3: Replace the inline struct + fade application in `points.wesl` and `filaments.wesl`** + +For each file: + +```wgsl +import skymap::lib::cloudFade::{ CloudUniforms, applyCloudFade }; +``` + +Delete the local `struct CloudUniforms`. The bind-group binding (e.g. `@group(2) @binding(0) var<uniform> cloud: CloudUniforms;`) stays in the renderer file — only the type definition moves. + +Replace any inline `color * cloud.opacity` with `applyCloudFade(color, cloud)` where it appears as the final fade step. Note that `applyCloudFade` as drafted scales alpha only, whereas `color * cloud.opacity` on a `vec4<f32>` scales all four channels; if a call site relies on the latter (premultiplied alpha), adjust the helper or leave that site inline so the output stays identical. + +- [ ] **Step 8.4: Build + visual + commit** + +```bash +npm run typecheck && npm run build && npm test +git add src/services/gpu/shaders/lib/cloudFade.wesl \ + src/services/gpu/shaders/points.wesl \ + src/services/gpu/shaders/filaments.wesl +git commit -m "refactor(shaders): extract lib/cloudFade.wesl" +``` + +Visual: tier-swap animations should fade smoothly as before. 
Pick a tier transition that exercises both points and filaments fading. + +--- + +## Task 9: Extract `lib/masks.wesl` + +**Files:** +- Create: `src/services/gpu/shaders/lib/masks.wesl` +- Modify: `disks.wesl`, `quads.wesl`, `proceduralDisks.wesl`, `filaments.wesl` + +- [ ] **Step 9.1: Inventory the existing mask patterns** + +Three patterns recur across fragment shaders: + +| Pattern | Where | Purpose | +|---|---|---| +| `1.0 - smoothstep(inner, outer, r)` | disks `:191`, quads `:210`, proceduralDisks `:241` | Circular cutoff fade — soft edge of a disk/sprite | +| `smoothstep(lo, hi, lum)` | disks `:195`, quads `:230` | Luminance-keyed alpha — dim pixels become transparent | +| `smoothstep(0, fade, uv.y) * (1 - smoothstep(1-fade, 1, uv.y))` | filaments `:107` | Edge-band mask — fade in at 0 and out at 1 | + +- [ ] **Step 9.2: Create `lib/masks.wesl`** + +```wgsl +// lib/masks.wesl — common fragment-stage mask shapes. + +// Soft circular cutoff: 1 inside inner, 0 outside outer, smooth between. +// Used for disk/sprite edges. r is typically length(uv - 0.5) * 2 or +// length(uv - center) depending on the shader's UV convention. +fn circularMask(r: f32, inner: f32, outer: f32) -> f32 { + return 1.0 - smoothstep(inner, outer, r); +} + +// Luminance-keyed alpha: 0 below lo, 1 above hi, smooth between. +// Lets the renderer fade out very dim pixels rather than rendering +// them as gray noise. +fn lumAlpha(lum: f32, lo: f32, hi: f32) -> f32 { + return smoothstep(lo, hi, lum); +} + +// Edge-band mask along one UV axis. 0 at axis=0 and axis=1, 1 in the +// middle, with fade controlling the falloff width at each end. +// Used by filaments to taper line endpoints. +fn edgeBandMask(axis: f32, fade: f32) -> f32 { + return smoothstep(0.0, fade, axis) * (1.0 - smoothstep(1.0 - fade, 1.0, axis)); +} +``` + +- [ ] **Step 9.3: Replace inline masks in each fragment shader** + +For each of `disks.wesl`, `quads.wesl`, `proceduralDisks.wesl`, `filaments.wesl`: + +1. 
Add `import skymap::lib::masks::{ circularMask, lumAlpha, edgeBandMask };` (only the names actually used). +2. Replace each occurrence of the matching pattern with a call to the helper. Verify the parameters map to the helper's argument order — the existing inline forms might pass `outer, inner` instead of `inner, outer`. + +Per-shader sub-commit. + +- [ ] **Step 9.4: Final verification** + +```bash +npm run typecheck && npm run build && npm test +``` + +Visual: galaxy sprites should still have soft edges, dim pixels fade out as before, filament endpoints taper smoothly. + +--- + +## Task 10: Extract `lib/astro.wesl` + +**Files:** +- Create: `src/services/gpu/shaders/lib/astro.wesl` +- Modify: `points.wesl` + +- [ ] **Step 10.1: Locate the formulas in `points.wesl`** + +```bash +grep -n "5.0 \* (log\|pow(10" src/services/gpu/shaders/points.wesl +``` + +Two formulas: +- Distance modulus at `points.wesl:762` — `absMag = appMag - 5*log10(d_Mpc) - 25` +- Magnitude → intensity (search for `pow(10.0, -0.4`). + +- [ ] **Step 10.2: Create `lib/astro.wesl`** + +```wgsl +// lib/astro.wesl — astronomical magnitude conversions. + +import skymap::lib::math::constants::LOG10; + +// Distance modulus: convert apparent magnitude + distance to absolute +// magnitude. m - M = 5·log₁₀(d/10pc) — for d in Mpc this is +// M = m - 5·log₁₀(d_Mpc) - 25 +fn distanceModulus(appMag: f32, distMpc: f32) -> f32 { + return appMag - 5.0 * (log(distMpc) / LOG10) - 25.0; +} + +// Apparent magnitude → linear flux ratio. Pogson scale: each 5 mag +// step is a factor of 100 in flux, so flux ratio = 10^(-0.4·m). +// m=0 returns 1.0; brighter (smaller m) returns >1, dimmer <1. +fn appMagToIntensity(m: f32) -> f32 { + return pow(10.0, -0.4 * m); +} +``` + +- [ ] **Step 10.3: Replace inline formulas in `points.wesl`** + +Add `import skymap::lib::astro::{ distanceModulus, appMagToIntensity };`. + +Replace the inline `appMag - 5.0 * (log(dMpc) / LOG10) - 25.0` with `distanceModulus(appMag, dMpc)`. 
Replace `pow(10.0, -0.4 * m)` with `appMagToIntensity(m)`. + +- [ ] **Step 10.4: Build + visual + commit** + +```bash +npm run typecheck && npm run build && npm test +``` + +Visual: galaxy brightnesses should be unchanged. Easiest check: examine a known-bright galaxy (M31) — its apparent size and intensity should match before. + +```bash +git add src/services/gpu/shaders/lib/astro.wesl \ + src/services/gpu/shaders/points.wesl +git commit -m "refactor(shaders): extract lib/astro.wesl — distance modulus + magnitude→intensity" +``` + +--- + +## Task 11: Extract `lib/tonemap.wesl` + +**Files:** +- Create: `src/services/gpu/shaders/lib/tonemap.wesl` +- Modify: `toneMap.wesl` + +- [ ] **Step 11.1: Read the existing tone-mapping curves** + +```bash +sed -n '55,110p' src/services/gpu/shaders/toneMap.wesl +``` + +Five functions: `applyLinear`, `applyReinhard`, `applyAsinh`, `applyGamma2`, `applyAces`. + +- [ ] **Step 11.2: Create `lib/tonemap.wesl`** + +```wgsl +// lib/tonemap.wesl — tone-mapping curves. +// +// Each function maps a linear-space HDR color to a [0, 1] LDR color +// suitable for an sRGB display. Curves chosen to suit deep-space +// imagery where the dynamic range spans many orders of magnitude. + +// Identity. Useful as a debug or "bypass" pass. +fn applyLinear(c: vec3<f32>) -> vec3<f32> { + // [PASTE EXISTING IMPL] +} + +// Reinhard with white-point normalization. wsq = whitePoint². +fn applyReinhard(c: vec3<f32>, wsq: f32) -> vec3<f32> { + // [PASTE EXISTING IMPL] +} + +// asinh(k·x)/asinh(k) — natural fit for stellar magnitudes. +fn applyAsinh(c: vec3<f32>, k: f32) -> vec3<f32> { + // [PASTE EXISTING IMPL] +} + +// sqrt(saturate(c)) — quick gamma-2 approximation. +fn applyGamma2(c: vec3<f32>) -> vec3<f32> { + // [PASTE EXISTING IMPL] +} + +// ACES filmic curve. Standard cinema/CG tone-map. +fn applyAces(c: vec3<f32>) -> vec3<f32> { + // [PASTE EXISTING IMPL] +} +``` + +(Implementer pastes the actual function bodies. Don't re-derive ACES coefficients.) 
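
Two of the five curves are fully specified by their comments (`asinh(k·x)/asinh(k)` and `sqrt(saturate(c))`), so a few reference values can be tabulated on the CPU and compared against the pasted bodies if a curve looks off after extraction. The sketch below is a scalar TypeScript illustration of those two formulas only. It assumes the curves are applied per channel, which the plan does not state, and the function names and the choice of `k = 10` are hypothetical. Reinhard and ACES are left out on purpose, since their exact form lives in the existing shader.

```typescript
const saturate = (x: number): number => Math.min(1, Math.max(0, x));

// asinh(k·x) / asinh(k), as described in the applyAsinh comment.
// Maps 0 to 0 and 1 to 1; inputs above 1 are not clamped here.
function asinhCurve(x: number, k: number): number {
  return Math.asinh(k * x) / Math.asinh(k);
}

// sqrt(saturate(x)), as described in the applyGamma2 comment.
function gamma2Curve(x: number): number {
  return Math.sqrt(saturate(x));
}

// Tabulate a few sample inputs (k = 10 is an arbitrary example value).
for (const x of [0, 0.01, 0.25, 1, 4]) {
  console.log(x, asinhCurve(x, 10), gamma2Curve(x));
}
```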
+ +- [ ] **Step 11.3: Replace inline functions in `toneMap.wesl`** + +Add: +```wgsl +import skymap::lib::tonemap::{ applyLinear, applyReinhard, applyAsinh, applyGamma2, applyAces }; +``` + +Delete the five inline `fn apply*` definitions. The fragment-stage `fs` function calls (already named `applyReinhard(...)` etc.) need no change. + +- [ ] **Step 11.4: Build + visual + commit** + +```bash +npm run typecheck && npm run build && npm test +``` + +Visual: tone-map dropdown in the dev panel should still cycle through Linear / Reinhard / Asinh / Gamma2 / ACES with the same curves as before. Set each one and compare to memory of the previous look. + +```bash +git add src/services/gpu/shaders/lib/tonemap.wesl src/services/gpu/shaders/toneMap.wesl +git commit -m "refactor(shaders): extract lib/tonemap.wesl" +``` + +--- + +## Task 12: Extract `lib/util.wesl` (noise + raySphere + galactic + sRGB + pickEncode) + +**Files:** +- Create: `src/services/gpu/shaders/lib/util.wesl` +- Modify: `milkyWayImpostor.wesl`, `points.wesl` (the pick fragment), `toneMap.wesl` + +- [ ] **Step 12.1: Read the source functions** + +```bash +# Noise + ray-sphere + galactic + stars (in milkyWay) +grep -n "^fn " src/services/gpu/shaders/milkyWayImpostor.wesl +# Pick encoding (in points) +grep -n "vec4\|@location(0) vec4" src/services/gpu/shaders/points.wesl +# sRGB conversion (in toneMap, currently as part of gamma2) +grep -n "linearToSRGB\|srgbToLinear\|gamma" src/services/gpu/shaders/toneMap.wesl +``` + +- [ ] **Step 12.2: Create `lib/util.wesl`** + +```wgsl +// lib/util.wesl — orphan utility functions awaiting promotion. +// +// Each function in this module is currently used by exactly one +// shader. They live together to avoid a flurry of single-call-site +// modules; when a second consumer appears for any of them, that +// function graduates to its own file under lib//.wesl +// (matching the lib/math/ pattern). 
+ +// ── noise ───────────────────────────────────────────────────────── + +// Hash from 2D input to scalar in [0, 1). The constants come from the +// classic fract(sin(dot(p, vec2(12.9898, 78.233))) * 43758.5453) +// tradition; they're a hash, not a serious PRNG, but visually +// good enough for shader noise. +fn hash21(co: vec2<f32>) -> f32 { + // [PASTE existing rand() body from milkyWayImpostor.wesl] +} + +// 2D value noise with bilinear interpolation. tm is a phase offset. +fn valueNoise2(p: vec2<f32>, tm: f32) -> f32 { + // [PASTE existing noise1() body from milkyWayImpostor.wesl] +} + +// ── geometry ────────────────────────────────────────────────────── + +// Ray-sphere intersection. Returns vec2(tEnter, tExit); both +// negative if the ray misses or the sphere is behind the origin. +fn raySphere(ro: vec3<f32>, rd: vec3<f32>, center: vec3<f32>, radius: f32) -> vec2<f32> { + // [PASTE existing impl from milkyWayImpostor.wesl] +} + +// ── galactic frame ──────────────────────────────────────────────── + +// World-frame (equatorial-aligned) → galactic-frame rotation. +fn worldToGalactic(v: vec3<f32>) -> vec3<f32> { + // [PASTE existing impl from milkyWayImpostor.wesl] +} + +// Galactic-frame → renderer-frame (the Milky Way impostor's +// orientation in the scene). +fn galacticToShader(g: vec3<f32>) -> vec3<f32> { + // [PASTE existing impl from milkyWayImpostor.wesl] +} + +// ── sRGB ────────────────────────────────────────────────────────── + +// Linear → sRGB gamma. Currently used implicitly by toneMap's +// gamma-2 curve; isolating it makes the conversion available to +// any future post-process pass. 
+fn linearToSRGB(c: vec3<f32>) -> vec3<f32> { + let cutoff = vec3<f32>(0.0031308); + let lo = 12.92 * c; + let hi = 1.055 * pow(c, vec3<f32>(1.0 / 2.4)) - 0.055; + return select(hi, lo, c < cutoff); +} + +fn srgbToLinear(c: vec3<f32>) -> vec3<f32> { + let cutoff = vec3<f32>(0.04045); + let lo = c / 12.92; + let hi = pow((c + 0.055) / 1.055, vec3<f32>(2.4)); + return select(hi, lo, c < cutoff); +} + +// ── pick-target encoding ────────────────────────────────────────── + +// Encode a 32-bit instance ID into the r32uint pick-target format. +// The fragment shader writes vec4<u32>; only the .r channel is read +// back via copyTextureToBuffer. Keeping this in a function documents +// the wire format for future readback code. +fn encodePickId(idx: u32) -> vec4<u32> { + return vec4<u32>(idx, 0u, 0u, 0u); +} +``` + +- [ ] **Step 12.3: Replace call sites in `milkyWayImpostor.wesl`** + +Add: +```wgsl +import skymap::lib::util::{ hash21, valueNoise2, raySphere, worldToGalactic, galacticToShader }; +``` + +Delete the local definitions of `rand`, `noise1`, `raySphere`, `worldToGalactic`, `galacticToShader`. Rename call sites: `rand(` → `hash21(`, `noise1(` → `valueNoise2(`. Verify with grep. + +- [ ] **Step 12.4: Replace pick encoding in `points.wesl`** + +In the `fsPick` function, replace the inline `vec4<u32>(globalInstanceIdx, 0u, 0u, 0u)` (or whatever the existing form is) with `encodePickId(globalInstanceIdx)`. Add the import. + +- [ ] **Step 12.5: Final verification** + +```bash +npm run typecheck && npm run build && npm test +``` + +Visual: Milky Way impostor (fragment shader is the heaviest user — noise + raySphere + galactic). Pan around it; the procedural galaxy should look identical. Click a galaxy → pickRenderer → ensure selection still works. 
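
None of the steps in this task import the sRGB pair, so the visual check may not exercise it. One inexpensive safeguard is a CPU round-trip test of the same piecewise formulas. The TypeScript below is a scalar port of the `linearToSRGB` / `srgbToLinear` draft above, offered as a sketch; the sample grid of 1001 points is an arbitrary choice.

```typescript
// Scalar CPU ports of linearToSRGB / srgbToLinear from the lib/util.wesl draft.
function linearToSRGB(c: number): number {
  return c < 0.0031308 ? 12.92 * c : 1.055 * Math.pow(c, 1 / 2.4) - 0.055;
}

function srgbToLinear(c: number): number {
  return c < 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Largest |srgbToLinear(linearToSRGB(x)) - x| over an evenly spaced grid in [0, 1].
let maxRoundTripError = 0;
for (let i = 0; i <= 1000; i++) {
  const x = i / 1000;
  const err = Math.abs(srgbToLinear(linearToSRGB(x)) - x);
  maxRoundTripError = Math.max(maxRoundTripError, err);
}

console.log(maxRoundTripError);
```

A round-trip error far above floating-point noise would suggest that the two cutoffs or exponents were mistyped during extraction.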
+ +```bash +git add src/services/gpu/shaders/lib/util.wesl \ + src/services/gpu/shaders/milkyWayImpostor.wesl \ + src/services/gpu/shaders/points.wesl +git commit -m "refactor(shaders): extract lib/util.wesl — noise, raySphere, galactic, sRGB, pickEncode" +``` + +--- + +## Task 13: Split `points.wesl` into 4 files + +**Files:** +- Create: `src/services/gpu/shaders/points.io.wesl` +- Create: `src/services/gpu/shaders/points.vertex.wesl` +- Create: `src/services/gpu/shaders/points.color.fragment.wesl` +- Create: `src/services/gpu/shaders/points.pick.fragment.wesl` +- Delete: `src/services/gpu/shaders/points.wesl` +- Modify: `src/services/gpu/pointRenderer.ts`, `src/services/gpu/pickRenderer.ts` + +- [ ] **Step 13.1: Carve up the existing file** + +Read `src/services/gpu/shaders/points.wesl` to see what's there now (after tasks 3–12, it's smaller — most reusable code has been extracted to `lib/`). Identify four regions: + +1. **Shared types**: `struct Uniforms`, `struct CloudUniforms` (already imported), `struct PerVertex`, `struct VSOut`, plus any bind-group declarations. +2. **Vertex stage**: `@vertex fn vs(...)` — used by both color and pick paths. +3. **Color fragment**: `@fragment fn fs(in: VSOut) -> @location(0) vec4<f32>`. +4. **Pick fragment**: `@fragment fn fsPick(in: VSOut) -> @location(0) vec4<u32>`. + +- [ ] **Step 13.2: Create the four new files** + +`points.io.wesl`: +```wgsl +// points.io.wesl — shared type declarations + bind groups for the +// points/pick pair. Imported by all three companion files +// (points.vertex.wesl, points.color.fragment.wesl, points.pick.fragment.wesl). +// +// Pulling these out of points.vertex.wesl (where they could otherwise +// live) means both fragment files get them without re-declaring, +// which prevents accidental drift in the V→F interpolant struct. 
+ +import skymap::lib::camera::CameraUniforms; +import skymap::lib::cloudFade::CloudUniforms; + +struct Uniforms { + cam: CameraUniforms, + // [paste any remaining renderer-specific fields here] +} + +struct PerVertex { + // [paste from current points.wesl] +} + +struct VSOut { + // [paste from current points.wesl] +} + +// Bind groups (paste the @group / @binding declarations from the +// current file). All three companion files reference these. +@group(0) @binding(0) var<uniform> u: Uniforms; +@group(2) @binding(0) var<uniform> cloud: CloudUniforms; +// [etc.] +``` + +`points.vertex.wesl`: +```wgsl +import skymap::points::io::{ Uniforms, PerVertex, VSOut, u, cloud }; +import skymap::lib::camera::worldToClip; +import skymap::lib::billboard::expandBillboardScreen; +// [other imports the vs body uses, copied from the current top-of-file] + +@vertex +fn vs(/* paste signature */) -> VSOut { + // [paste existing vs body verbatim] +} +``` + +`points.color.fragment.wesl`: +```wgsl +import skymap::points::io::{ VSOut, u, cloud }; +import skymap::lib::cloudFade::applyCloudFade; +// [other imports] + +@fragment +fn fs(in: VSOut) -> @location(0) vec4<f32> { + // [paste existing fs body verbatim] +} +``` + +`points.pick.fragment.wesl`: +```wgsl +import skymap::points::io::VSOut; +import skymap::lib::util::encodePickId; + +@fragment +fn fsPick(in: VSOut) -> @location(0) vec4<u32> { + // [paste existing fsPick body verbatim] +} +``` + +(WESL imports of `var` bindings: verify the linker actually allows importing a binding declaration vs. requiring redeclaration. If not, the bind groups must live in each consuming file with identical `@group/@binding` numbers — a pattern WGSL itself supports without complaint as long as the numbers match.) + +- [ ] **Step 13.3: Delete the old `points.wesl`** + +```bash +git rm src/services/gpu/shaders/points.wesl +``` + +- [ ] **Step 13.4: Update `pointRenderer.ts`** + +Read the current file. 
Find the `import wgsl from './shaders/points.wesl?static'` line, plus the `device.createShaderModule` and `device.createRenderPipeline` calls. + +Replace the single import with two: + +```ts +import vsCode from './shaders/points.vertex.wesl?static'; +import fsCode from './shaders/points.color.fragment.wesl?static'; +``` + +Update the pipeline construction to use two modules: + +```ts +const vsModule = device.createShaderModule({ code: vsCode, label: 'points.vertex' }); +const fsModule = device.createShaderModule({ code: fsCode, label: 'points.color.fragment' }); + +device.createRenderPipeline({ + // ...existing layout/buffers/etc... + vertex: { module: vsModule, entryPoint: 'vs', buffers: [...] }, + fragment: { module: fsModule, entryPoint: 'fs', targets: [...] }, +}); +``` + +Apply the same dev-mode link-logging pattern used in task 1 to both modules. + +- [ ] **Step 13.5: Update `pickRenderer.ts`** + +Same pattern, but the fragment module imports the pick fragment file: + +```ts +import vsCode from './shaders/points.vertex.wesl?static'; +import fsCode from './shaders/points.pick.fragment.wesl?static'; +``` + +The vertex module is bit-identical to pointRenderer's — both renderers can either keep separate `createShaderModule` calls (simpler, no shared state) or coordinate to share one. **Use separate calls.** It's cheap and avoids cross-renderer coupling. + +- [ ] **Step 13.6: Final verification** + +```bash +npm run typecheck && npm run build && npm test +``` + +Visual: points pass renders identically. Click a galaxy — selection halo appears on the right galaxy (regression of the second-bug-class on the project's "things that have bitten us" list — selection-on-wrong-galaxy was caused by uniform-update races; this split eliminates that whole class). 
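
Because compile errors report line numbers in the linked WGSL rather than in the `.wesl` sources, the dev-mode logging is most useful when it prints the offending linked line next to each message. The TypeScript sketch below shows one way to format that. It is a hypothetical helper, not the Task 1 implementation: `formatShaderMessages` and `CompilationMessageLike` are invented names, and the interface is a local structural subset of WebGPU's `GPUCompilationMessage` so the sketch compiles without `@webgpu/types`.

```typescript
// Minimal structural subset of GPUCompilationMessage, declared locally
// (an assumption made so this sketch is self-contained).
interface CompilationMessageLike {
  type: "error" | "warning" | "info";
  lineNum: number; // 1-based line in the linked WGSL; 0 when no location is known
  linePos: number;
  message: string;
}

// Hypothetical helper: pair each compiler message with the linked-WGSL line
// it points at, plus one line of context on either side.
function formatShaderMessages(
  label: string,
  linkedCode: string,
  messages: CompilationMessageLike[],
): string[] {
  const lines = linkedCode.split("\n");
  return messages.map((m) => {
    const header = `[${label}] ${m.type} at ${m.lineNum}:${m.linePos}: ${m.message}`;
    if (m.lineNum < 1 || m.lineNum > lines.length) {
      return header; // no usable location, so report the message alone
    }
    const first = Math.max(1, m.lineNum - 1);
    const last = Math.min(lines.length, m.lineNum + 1);
    const context: string[] = [];
    for (let n = first; n <= last; n++) {
      const marker = n === m.lineNum ? ">" : " ";
      context.push(`${marker} ${n}: ${lines[n - 1]}`);
    }
    return `${header}\n${context.join("\n")}`;
  });
}

// Example with a made-up three-line shader and a made-up error.
const sampleReport = formatShaderMessages("points.vertex", "a\nb\nc", [
  { type: "error", lineNum: 2, linePos: 1, message: "boom" },
]);
console.log(sampleReport.join("\n"));
```

In a renderer, the messages would presumably come from `await vsModule.getCompilationInfo()` (a standard `GPUShaderModule` method), with `vsCode` passed as `linkedCode`; the module docblocks mentioned in the pre-flight notes then identify which `.wesl` file the flagged line came from.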
+ +```bash +git add -u +git commit -m "$(cat <<'EOF' +refactor(shaders): split points.wesl into vertex / color-fs / pick-fs / io + +Replaces the single 1485-line points.wesl with four files: +- points.io.wesl — shared structs + bind-group declarations +- points.vertex.wesl — @vertex fn vs (used by both renderers) +- points.color.fragment.wesl — @fragment fn fs (pointRenderer) +- points.pick.fragment.wesl — @fragment fn fsPick (pickRenderer) + +This replaces the planned `@if(PICK)` conditional-compilation +approach: with a vertex/fragment file split, the pick path is +just a different fragment module import — no preprocessor needed. + +Co-Authored-By: Claude Opus 4.7 +EOF +)" +``` + +--- + +## Task 14: Split `milkyWayImpostor.wesl` into 3 files + +**Files:** +- Create: `src/services/gpu/shaders/milkyWayImpostor.io.wesl` +- Create: `src/services/gpu/shaders/milkyWayImpostor.vertex.wesl` +- Create: `src/services/gpu/shaders/milkyWayImpostor.fragment.wesl` +- Delete: `src/services/gpu/shaders/milkyWayImpostor.wesl` +- Modify: `src/services/gpu/milkyWayRenderer.ts` + +Same pattern as task 13, but only one fragment file. + +- [ ] **Step 14.1: Carve up the file** + +Read the post-task-12 `milkyWayImpostor.wesl`. It now has structs + vs entry point + fs entry point + the procedural-galaxy helpers (`stars`, `height`, `galaxyNormal`, `shadeGalaxyDisk`, `renderGalaxy`). + +Decision: the procedural-galaxy helpers (~5 functions, ~150 lines) are fragment-stage only and not reusable elsewhere. Keep them in the fragment file rather than inventing a fourth file. If a future shader wants `renderGalaxy`, it graduates to `lib/` then. 
+ +- [ ] **Step 14.2: Create the three files** + +`milkyWayImpostor.io.wesl`: +```wgsl +import skymap::lib::camera::CameraUniforms; + +struct Uniforms { + cam: CameraUniforms, + // [other fields] +} + +struct VsOut { + // [paste] +} + +@group(0) @binding(0) var<uniform> u: Uniforms; +// [other bindings] +``` + +`milkyWayImpostor.vertex.wesl`: +```wgsl +import skymap::milkyWayImpostor::io::{ Uniforms, VsOut, u }; +import skymap::lib::camera::worldToClip; + +@vertex +fn vs(@builtin(vertex_index) vid: u32) -> VsOut { + // [paste vs body] +} +``` + +`milkyWayImpostor.fragment.wesl`: +```wgsl +import skymap::milkyWayImpostor::io::{ Uniforms, VsOut, u }; +import skymap::lib::util::{ raySphere, worldToGalactic, galacticToShader, hash21, valueNoise2 }; +import skymap::lib::math::{ rot2, sabs, toPolar, toRect }; +// [etc] + +// The procedural-galaxy helpers (stars, height, galaxyNormal, etc.) +// stay here — they're fragment-stage only and only this shader uses +// them. Promote to lib/ if a second consumer ever appears. + +fn stars(p_in: vec2<f32>) -> vec3<f32> { /* paste */ } +fn height(p: vec2<f32>, tm: f32) -> f32 { /* paste */ } +fn galaxyNormal(p: vec2<f32>, tm: f32) -> vec3<f32> { /* paste */ } +fn shadeGalaxyDisk(/* ... */) -> vec3<f32> { /* paste */ } +fn renderGalaxy(ro: vec3<f32>, rd: vec3<f32>, tm: f32) -> vec3<f32> { /* paste */ } + +@fragment +fn fs(in: VsOut) -> @location(0) vec4<f32> { + // [paste fs body] +} +``` + +- [ ] **Step 14.3: Delete old file + update renderer** + +```bash +git rm src/services/gpu/shaders/milkyWayImpostor.wesl +``` + +`milkyWayRenderer.ts`: +```ts +import vsCode from './shaders/milkyWayImpostor.vertex.wesl?static'; +import fsCode from './shaders/milkyWayImpostor.fragment.wesl?static'; +``` + +Update pipeline construction to use two modules. + +- [ ] **Step 14.4: Build + visual + commit** + +```bash +npm run typecheck && npm run build && npm test +``` + +Visual: zoom in on the Milky Way impostor — same procedural galaxy, same star field. 
Animate (the `tm` parameter) — the galaxy should wobble identically. + +```bash +git add -u +git commit -m "refactor(shaders): split milkyWayImpostor.wesl into vertex/fragment/io" +``` + +--- + +## Task 15: Split remaining 5 shaders into 3 files each + +**Files:** +- For each of `disks`, `filaments`, `proceduralDisks`, `quads`, `toneMap`: + - Create: `.io.wesl`, `.vertex.wesl`, `.fragment.wesl` + - Delete: `.wesl` + - Modify: `Renderer.ts` (or `toneMapPass.ts`) + +Same pattern as tasks 13–14, repeated for each small renderer. Each one is mechanical (these shaders are <300 lines each), so they're done as five sub-commits in one task. + +- [ ] **Step 15.1: Per-renderer template** + +For each renderer, in the order: `toneMap`, `filaments`, `disks`, `quads`, `proceduralDisks` (smallest to largest): + +1. Read the current `.wesl`. Identify: structs + bindings (→ io), `@vertex fn vs` (→ vertex), `@fragment fn fs` (→ fragment). +2. Create `.io.wesl`, `.vertex.wesl`, `.fragment.wesl` per the templates from tasks 13–14. +3. `git rm` the original `.wesl`. +4. Update the renderer's TS file: replace the single `?static` import with two, and update the pipeline construction to use two `GPUShaderModule`s. +5. Build + typecheck + test. +6. Visual: focus on this renderer's output. +7. Sub-commit: + ```bash + git add -u + git commit -m "refactor(shaders): split .wesl into vertex/fragment/io" + ``` + +- [ ] **Step 15.2: Final verification across all renderers** + +After all five sub-commits: + +```bash +npm run typecheck && npm run build && npm test +``` + +Comprehensive visual check: pan, zoom, rotate, click, tier-swap, tone-map curve cycle. Everything should look identical to before the entire 15-task plan started. 
+ +- [ ] **Step 15.3: Open PR** + +```bash +git push -u origin my-feature +gh pr create --title "WGSL → WESL conversion + shared shader library" --body "$(cat <<'EOF' +## Summary + +- Bootstraps `wesl-plugin` (build-time WESL→WGSL linker for Vite) and converts all 7 shaders from `.wgsl` to `.wesl`. +- Extracts a `lib/` of shared shader modules: `math/` (saturate, rot2, sabs, toPolar, toRect, constants), camera, billboard, orientation, colorIndex, cloudFade, masks, astro, tonemap, util. +- Uniformly splits every renderer shader into `.io.wesl` + `.vertex.wesl` + `.fragment.wesl`. `points` is special-cased with two fragment files (color + pick). +- Replaces the planned `@if(PICK)` conditional-compilation path with a clean two-fragment-file split for the points/pick renderer pair. + +Spec: `docs/superpowers/specs/2026-05-07-wesl-conversion-design.md` +Plan: `docs/superpowers/plans/2026-05-07-wesl-conversion.md` + +## Test plan + +- [x] `npm run typecheck` green +- [x] `npm run build` green +- [x] `npm test` green (590+ tests) +- [x] Visual: every renderer output identical to pre-PR +- [x] Visual: click-to-select still works (pickRenderer) +- [x] Visual: tier-swap fades smoothly (cloudFade) +- [x] Visual: tone-map dropdown cycles through all 5 curves correctly + +🤖 Generated with [Claude Code](https://claude.com/claude-code) +EOF +)" +``` + +--- + +## Self-review notes + +After all 15 tasks, verify against the spec: + +- [x] Section 1 (Goal) — all 7 shaders converted, lib/ extracted, vertex/fragment split done. +- [x] Section 2 (Why WESL) — three duplications collapsed: ramp (task 7), CloudUniforms (task 8), orientation (task 6). Single-file scale addressed: tasks 13–15 split. One-file-two-entry-points addressed: task 13. +- [x] Section 3 (Architecture) — every file in the spec's tree exists (or is deleted intentionally). +- [x] Section 4 (Library modules) — every immediate-win module extracted in tasks 4–11, math primitives in task 3, util staging in task 12. 
+- [x] Section 5 (Tooling) — wesl + wesl-plugin + wesl.toml + tsconfig types activation + Vite config in task 1. +- [x] Section 6 (Migration plan) — 15 tasks, matching the 15-task spec section. +- [x] Section 7 (Risks) — sourcemap-survival risk addressed by dev-mode link logging in task 1; struct-alignment risk addressed by canonical CameraUniforms layout in task 4; visual-verification gate present in every task. diff --git a/docs/superpowers/specs/2026-05-07-wesl-conversion-design.md b/docs/superpowers/specs/2026-05-07-wesl-conversion-design.md new file mode 100644 index 0000000..4321e50 --- /dev/null +++ b/docs/superpowers/specs/2026-05-07-wesl-conversion-design.md @@ -0,0 +1,149 @@ +# WGSL → WESL Conversion + Shared Shader Library — Design + +**Status:** Draft (2026-05-07) +**Owner:** @rulkens +**Branch:** `my-feature` (worktree: `.worktrees/my-feature`) + +## Goal + +Convert the seven hand-rolled WGSL shaders under `src/services/gpu/shaders/` to WESL, the WebGPU Shading Extended Language, and use its module-import system to extract a reusable shader library under `lib/`. The aim is to eliminate verbatim copy-paste between renderers (`ramp()`, `CloudUniforms`, the position-angle/inclination axis math), shrink the giant `points.wgsl` (1485 lines) by hoisting reusable building blocks out of it, and replace the runtime entry-point juggling between `pointRenderer` and `pickRenderer` with a clean per-stage file split. + +Non-goals: rewriting any rendering algorithm, changing the binary point-cloud format, changing pipeline descriptors beyond what the file split mechanically requires, or introducing runtime feature-flag toggling. WESL's full toolbox (linker conditionals, generics) is on the table; we use only what serves the immediate refactor. + +## Why WESL (and why now) + +WGSL has no module system. Every shader is a single self-contained string compiled into a `GPUShaderModule`. 
That's fine for a small renderer, but our shader code shows three concrete tax effects of the missing modularity: + +1. **Verbatim duplication.** `fn ramp(t: f32) -> vec3` is identical between `points.wgsl:652` and `proceduralDisks.wgsl:211`. The position-angle + inclination → 3D major/minor axis math at `disks.wgsl:158-166` is bytes-equal to `proceduralDisks.wgsl:154-166`. `struct CloudUniforms` lives in both `points.wgsl:292` and `filaments.wgsl:39`. Each duplicate is a maintenance liability; "fix the bug in both places" is already a thing in this code. +2. **One file, two entry points.** `points.wgsl` exposes both `fs` (color path, used by `pointRenderer`) and `fsPick` (pick-target path, used by `pickRenderer`). Both renderers `import wgsl from './shaders/points.wgsl?raw'` and select different `entryPoint:` strings on pipeline creation. The common code between them is real but the file is monolithic — there's no way to express "these two paths share this vertex stage but diverge at the fragment". +3. **Single-file scale.** `points.wgsl` is 1485 lines and `milkyWayImpostor.wgsl` is 774. Both are dominated by reusable primitives (color ramps, billboard expansion, value noise, ray–sphere intersection, galactic-frame rotation) that are stuck inside the file because there's no way to import them. + +WESL is a strict superset of WGSL — every existing `.wgsl` file is already a valid `.wesl` file — so the conversion is incremental and reversible. The toolchain is `wesl-plugin` for Vite (build-time linker, sourcemap-aware, HMR-compatible). At build time WESL modules are linked into a final WGSL string per import; production gets a flat WGSL bundle, dev gets HMR-reloaded modules. Runtime cost: zero. 
+ +## Architecture overview + +``` +src/services/gpu/shaders/ +├── lib/ +│ ├── math.wesl # PI/TAU/LOG10, saturate, rot2, sabs, +│ │ # toPolar, toRect — small primitives +│ │ # grouped one-per-section in one file +│ │ # (WESL imports a fn from a module, +│ │ # not a fn-as-module — so one-fn-per- +│ │ # file forces a verbose duplicated +│ │ # leaf, e.g. lib::math::saturate::saturate) +│ ├── astro.wesl # distance modulus, mag→intensity +│ ├── billboard.wesl # vid→corner, screen/world expansion +│ ├── camera.wesl # CameraUniforms, worldToClip, depth +│ ├── cloudFade.wesl # CloudUniforms + applyCloudFade +│ ├── colorIndex.wesl # ramp(), color-index → RGB +│ ├── masks.wesl # circularMask, lumAlpha, edgeBand +│ ├── orientation.wesl # PA + inclination → 3D axes +│ ├── tonemap.wesl # linear/reinhard/asinh/gamma2/aces +│ └── util.wesl # noise, raySphere, galactic, sRGB, +│ # pick-encode — staging area; promoted +│ # to lib//.wesl when a +│ # second consumer appears +├── points.io.wesl # struct VSOut, struct Uniforms +├── points.vertex.wesl # @vertex fn vs (shared color + pick) +├── points.color.fragment.wesl # @fragment fn fs (pointRenderer) +├── points.pick.fragment.wesl # @fragment fn fsPick (pickRenderer) +├── milkyWayImpostor.io.wesl +├── milkyWayImpostor.vertex.wesl +├── milkyWayImpostor.fragment.wesl +├── disks.io.wesl +├── disks.vertex.wesl +├── disks.fragment.wesl +├── filaments.io.wesl +├── filaments.vertex.wesl +├── filaments.fragment.wesl +├── proceduralDisks.io.wesl +├── proceduralDisks.vertex.wesl +├── proceduralDisks.fragment.wesl +├── quads.io.wesl +├── quads.vertex.wesl +├── quads.fragment.wesl +├── toneMap.io.wesl +├── toneMap.vertex.wesl +└── toneMap.fragment.wesl +``` + +The split rule is **uniform**: every shader is broken into a vertex file, a fragment file, and a `.io.wesl` file containing the V→F interpolant struct + uniform layouts that both stages import. `points` is a special case with two fragment variants (color + pick) sharing a vertex file. 
The uniformity costs slightly more files for the small shaders (`filaments`, `toneMap`, `disks`) where a single file would be navigable, but it pays off in predictability — every renderer's TS file imports the same shape (`.vertex.wesl?static` + `.fragment.wesl?static`), and the V→F interpolant contract for every shader has a single canonical source. + +## Library modules + +The `lib/` tree has three tiers, distinguished by whether they're solving real duplication today or staging future reuse. + +**Immediate-win modules** (each replaces existing duplicated code on extraction): + +- **`lib/camera.wesl`** — declares `CameraUniforms` (viewProj, view, proj, cameraPos, kPerZ, viewportPx) and helpers `worldToClip(p) -> vec4`, `worldEyeDepth(p) -> f32`, `pixelSizeAt(eyeDepth) -> f32`. Every renderer except `toneMap` currently rolls its own `viewProj * vec4(p, 1)` plus a per-shader copy of the kPerZ scaling logic. Consolidating fixes the second concrete bug class on the project's `things-that-have-bitten-us` list — the `queue.writeBuffer` race only happens because per-renderer uniform structs each have their own subtly different layouts to keep in sync. +- **`lib/billboard.wesl`** — unit-quad `vid -> corner` expansion (used by `points`, `quads`, `disks`, `proceduralDisks`), plus `expandBillboardScreen(centerWS, sizePx, vid)` (kPerZ-scaled, screen-aligned) and `expandBillboardWorld(centerWS, sizeWS, vid)` (world-space-sized, view-aligned). Each billboard shader currently writes its own version of this, with subtle differences that have caused alignment bugs. +- **`lib/orientation.wesl`** — given a galaxy's (positionWS, position-angle, inclination, axisRatio) plus the camera position, returns `(majorAxis3D, minorAxis3D)` in world space. The 9-line block at `disks.wgsl:158-166` and `proceduralDisks.wgsl:154-166` is byte-for-byte identical; this module is the first one extracted because the saving is unambiguous and the consolidation pays for the WESL setup work on its own. 
+- **`lib/colorIndex.wesl`** — exports `ramp(t: f32) -> vec3`, the duplicated piecewise color-index→RGB function. Future expansion slot for B−V→temperature→RGB if/when we move to a physically-grounded color model. +- **`lib/cloudFade.wesl`** — exports `CloudUniforms` and `applyCloudFade(opacity)`. Resolves the duplicate struct between `points` and `filaments`. +- **`lib/masks.wesl`** — `circularMask(uv, inner, outer) -> f32`, `lumAlpha(lum, lo, hi) -> f32`, `edgeBandMask(uv, fade) -> f32`. Each existing fragment shader hand-rolls a `1 - smoothstep(0.45, 0.5, r)` or similar; consolidating makes it consistent and clarifies which renderer uses which mask shape. +- **`lib/astro.wesl`** — `distanceModulus(appMag, dMpc) -> f32` (the `appMag - 5·log₁₀(d_Mpc) - 25` line currently inline at `points.wgsl:762`), `appMagToIntensity(m) -> f32` (the `pow(10, -0.4·m)` pattern), `LOG10` constant. Today's only consumer is `points`, but the formulas are the canonical astronomy primitives — pulling them into a single, comment-rich file makes them documentable and future-proof for any catalog/UI/debug shader that needs to convert between magnitude representations. +- **`lib/tonemap.wesl`** — `applyLinear`, `applyReinhard`, `applyAsinh`, `applyGamma2`, `applyAces`. Currently lives inside `toneMap.wgsl`. Pulling them out makes them reusable for any future post-process pass (bloom, motion blur, debug-tonemap previews) without `toneMap.wesl` becoming a transitive import. + +**Math primitives** (each in its own file under `lib/math/`, per the project's house rule): + +- **`lib/math/saturate.wesl`** — `fn saturate(x: f32) -> f32 { return clamp(x, 0.0, 1.0); }`. Currently written inline as `clamp(x, 0.0, 1.0)` ~20× across the shaders. +- **`lib/math/rot2.wesl`** — 2D rotation matrix builder; replaces the hand-rolled `cos·p.x − sin·p.y` and `sin·p.x + cos·p.y` lines that appear in `milkyWayImpostor.wgsl` and the position-angle code in `points.wgsl`. 
 +- **`lib/math/sabs.wesl`** — smooth absolute value with parameter `k`. Currently lives in `milkyWayImpostor.wgsl:425`. Generic enough to live with the other math primitives. +- **`lib/math/toPolar.wesl`** / **`lib/math/toRect.wesl`** — Cartesian↔polar (vec2). Currently in `milkyWayImpostor.wgsl:330-336`. +- **`lib/math/constants.wesl`** — `const PI = 3.14159...`, `const TAU = 6.28318...`, `const LOG10 = 2.30258...`. Tiny but eliminates the magic numbers that recur in points + milkyWay. + +The "one function per file" rule applies specifically to `lib/math/`. The other lib modules are themed cohesive units (camera *is* its uniform struct + its handful of helpers; splitting them into per-function files would obscure their interface), and they stay multi-function. + +**Future-proofing modules** (single call site today, generic utility — staged in `lib/util.wesl` until they earn their own file): + +`lib/util.wesl` consolidates the orphan utilities: `hash21(co)`, `valueNoise2(p)` (currently `rand`/`noise1` in milkyWay), `raySphere(ro, rd, center, r)` (currently in milkyWay), `worldToGalactic(v)` / `galacticToShader(g)` (galactic-frame rotations from milkyWay), `linearToSRGB` / `srgbToLinear` (currently implicit in `toneMap`'s gamma curve), and `encodePickId(idx)` / `decodePickId(v)` (currently inline in `points.wgsl:fsPick`). They live together until a real second consumer appears, at which point each graduates to its own `lib//.wesl` file (matching the `lib/math/` pattern). The util file is a staging area, not a permanent home. + +## Tooling + +- Add `wesl` and `wesl-plugin` as devDependencies, pinned to exact versions (`wesl` 0.7.26 and `wesl-plugin` 0.6.74), because both packages are sub-1.0 and we want predictable rebuilds. Wire `wesl-plugin` into `vite.config.ts`. The plugin registers a `?static` import suffix that runs the WESL linker at build time and returns the linked WGSL string — semantically equivalent to today's `?raw` import, but with imports resolved.
+- Add a `wesl.toml` at the repo root configuring the resolution root to `src/services/gpu/shaders/`, since the wesl-plugin default of `./shaders/` doesn't match this project's layout. +- Add `src/@types/wesl.d.ts` mirroring the existing `wgsl.d.ts`, declaring `*.wesl?static` as resolving to `string`. +- Rename `.wgsl` → `.wesl` across `src/services/gpu/shaders/`. Because WESL is a strict superset, no shader content changes are required for the rename itself — the build keeps producing identical pipelines until imports are added. +- Each renderer's TS file changes one line: `import shader from './shaders/foo.wgsl?raw'` becomes `import shader from './shaders/foo.wesl?static'`. The shape (string) is unchanged. Renderers that split into vertex/fragment modules go from one import to two, and `device.createRenderPipeline` is updated to pass two `GPUShaderModule`s — which matches WebGPU's native pipeline shape (vertex and fragment have always been separate fields; today both happen to point at the same module). +- Inside `.wesl` files, imports use WESL's `::` path syntax (not TypeScript brace syntax): `import package::lib::math::saturate;` makes `saturate` available as a top-level identifier. The leading `package::` is the literal placeholder for the project's own root package (verified in `wesl-plugin/src/PluginApi.ts` — `fileToModulePath(rootModuleName, "package", false)` — and matches the official `wesl` README example `import package::colors::chartreuse;`). Paths are resolved from the configured root (`src/services/gpu/shaders/`), so `package::lib::math::saturate` maps to `src/services/gpu/shaders/lib/math/saturate.wesl`. The npm package name (`skymap`) is **not** used as the prefix — that name is reserved for cross-package imports if this project ever publishes a shader library. + +## Migration plan (15 tasks) + +Each task is independently shippable. 
The build stays green throughout, the existing 590+ test suite stays green, and every shader-touching task ends with a manual visual sanity check on the running dev server before being marked complete (per the `wgsl-meticulous` project convention — shader edits never ship on confidence alone). + +1. **Tooling bootstrap.** Add `wesl` + `wesl-plugin` + Vite config + `wesl.toml` + `wesl.d.ts`. Convert `toneMap.wgsl` → `toneMap.wesl`, switch the `toneMapPass.ts` import from `?raw` to `?static`. Smoke-test: build, dev HMR, sourcemap line numbers in browser errors. Document the actual sourcemap behaviour in this commit so the rest of the plan can rely on it (per the research, expect sourcemaps **not** to survive into Chrome's WGSL compiler errors — mitigation is naming-discipline + a dev-mode log of the linked WGSL alongside any compile error). +2. **Bulk rename.** The remaining 6 shaders renamed `.wgsl` → `.wesl`, all `?raw` imports switched to `?static`. No content changes. Visual diff: nothing. +3. **Extract `lib/math/`.** Create the six single-function files. Replace inline `clamp(x, 0, 1)` with `saturate(x)` in shaders that already use it; replace the 2D rotation pattern in milkyWay with `rot2`. Constants pulled out into `constants.wesl`. Tests stay green; visual: identical. +4. **Extract `lib/camera.wesl`.** Replace each renderer's hand-rolled view/proj math with imports. One sub-commit per renderer to keep diffs reviewable. The camera uniform layout changes per renderer because some have additional renderer-specific fields — those move into a renderer-local struct that *contains* `CameraUniforms` rather than duplicating its fields. +5. **Extract `lib/billboard.wesl`.** Replace the unit-quad expansion + screen-space-sizing logic in `points`, `quads`, `disks`, `proceduralDisks`. Each replacement is mechanical; the win is removing the per-renderer subtle variations. +6. 
**Extract `lib/orientation.wesl`.** Collapses the verbatim PA+inclination duplicate between `disks` and `proceduralDisks`. Smallest commit, biggest readability win. +7. **Extract `lib/colorIndex.wesl`.** Collapses the `ramp()` duplicate between `points` and `proceduralDisks`. +8. **Extract `lib/cloudFade.wesl`.** Collapses the `CloudUniforms` + `applyCloudFade` duplicate between `points` and `filaments`. +9. **Extract `lib/masks.wesl`.** Pulls the circular / lum / edge-band masks out of `disks`, `quads`, `proceduralDisks`, `filaments`. +10. **Extract `lib/astro.wesl`.** Pulls the distance-modulus and magnitude→intensity formulas out of `points` into a documented module. +11. **Extract `lib/tonemap.wesl`.** The five tone-mapping functions move out of `toneMap.wesl`; the renderer entry shader becomes a thin import + entry-point file. +12. **Extract `lib/util.wesl`.** Consolidates noise, ray-sphere, galactic-frame, sRGB, and pick-encode utilities pulled out of `milkyWayImpostor`, `toneMap`, and `points` (the pick path). +13. **Split `points` into 4 files.** `points.io.wesl` (shared structs), `points.vertex.wesl` (shared `vs`), `points.color.fragment.wesl` (`fs` for `pointRenderer`), `points.pick.fragment.wesl` (`fsPick` for `pickRenderer`). `pointRenderer.ts` and `pickRenderer.ts` each import their respective vertex+fragment pair. This replaces the planned `@if(PICK)` approach with a cleaner two-file split — no conditional compilation needed. +14. **Split `milkyWayImpostor` into 3 files.** `milkyWayImpostor.io.wesl`, `milkyWayImpostor.vertex.wesl`, `milkyWayImpostor.fragment.wesl`. The fragment file is where most of the existing 774 lines end up (procedural galaxy, ray-sphere, noise) — but with `lib/util.wesl` already extracted in task 12, the file is dominated by genuine renderer-specific code rather than reusable primitives. +15. 
**Split remaining 5 shaders into 3 files each.** `disks`, `filaments`, `proceduralDisks`, `quads`, `toneMap` each get a `.io.wesl` + `.vertex.wesl` + `.fragment.wesl` triple. Each of the five splits is mechanical and small (the original files are 138–258 lines), so they're bundled into a single sweep with one sub-commit per renderer. Each renderer's TS file gains one extra `?static` import. + +## Risks + +**`wesl-plugin` maturity.** WESL is a young language and its Vite plugin is correspondingly young. Task 1 is the smoke test — if HMR, sourcemaps, or module resolution have rough edges that don't have a plugin-level fix, fall back to a small custom Vite plugin around `wesl-js` (the linker library, which is more stable than the all-in-one plugin). The fallback adds ~30 lines of plugin code to `vite.config.ts` but keeps the same build-time-link semantics. + +**Shader debugging line numbers.** Browser-side shader compilation errors will reference the linked WGSL output, not the source `.wesl` file. `wesl-plugin` advertises sourcemap support but it needs verification on Chrome's WebGPU compiler error path. If sourcemaps don't survive into browser error messages, mitigation is logging the linked WGSL alongside the error in dev mode — already a pattern this repo uses for catalog-format errors. + +**Subtle struct-layout drift.** When `CameraUniforms` moves from inline definitions across six renderers into `lib/camera.wesl`, any field-order divergence breaks bind groups silently — the GPU will read garbage instead of erroring. Mitigation is per-step diff review at the byte level, plus a one-time write-up of the canonical `CameraUniforms` field order in the module's docblock so that future changes happen in one place. The 590-test suite covers TS-side correctness but doesn't catch GPU-side struct-alignment bugs; visual sanity is the only check there. + +**Shader file is not unit-testable.** Tests are silent on shader correctness. 
Every shader-touching task is gated on a manual visual comparison ("does the rendered scene look identical to before?") on the running dev server, plus the standard test pass for the surrounding TS scaffolding. The `wgsl-meticulous` project memory enforces this. + +**Plan stays sequential, not parallel.** Tasks 4–12 each touch multiple renderers (each lib extraction sweeps across consumers) so they can't be parallelised by subagent. The throughput limit is one task per implementer per session, with visual review between. That's deliberate — the cost of a silent regression is high enough that batching gains aren't worth chasing. + +## Out of scope + +- Runtime feature-flag toggling (would require shipping `.wesl` source to the browser; we don't need it). +- Any procedural code change inside a shader (this is a refactor, not a redesign — the rendered output is byte-identical at every step). +- The `tools/` build pipeline (the catalog `.bin` format and the parsers under `tools/parsers/` are untouched). +- Migration of any future shader stages (compute, mesh) — none exist today; if they do later, they slot into the same lib structure with no design change required. +- A WESL coding-style guide or shared lint rules. The project's existing didactic-comments convention and `feedback_wgsl_meticulous` rule are sufficient guidance. 
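As a rough sketch of the dev-mode logging mitigation described under Risks (printing the linked WGSL next to a compile error so the browser's line number can be matched by eye), a formatter along these lines could be used. This is not the implementation of `createShaderModuleWithDevLog`, which this change does not show; `CompileMessageLike` and `formatCompileErrors` are hypothetical names, and the message fields mirror the subset of WebGPU's `GPUCompilationMessage` that the sketch relies on.

```typescript
// Hypothetical subset of WebGPU's GPUCompilationMessage (assumption: only
// these four fields are needed for the log output).
interface CompileMessageLike {
  type: 'error' | 'warning' | 'info';
  lineNum: number; // 1-based line in the LINKED WGSL; 0 when no location is known
  linePos: number;
  message: string;
}

// Render each error followed by a numbered excerpt of the linked WGSL,
// so a line number reported against the linked output is easy to locate.
function formatCompileErrors(
  label: string,
  linkedWgsl: string,
  messages: readonly CompileMessageLike[],
  context = 2,
): string {
  const lines = linkedWgsl.split('\n');
  const out: string[] = [];
  for (const msg of messages) {
    if (msg.type !== 'error') continue; // warnings and infos are skipped here
    out.push(`[${label}] ${msg.lineNum}:${msg.linePos} ${msg.message}`);
    if (msg.lineNum < 1) continue; // no source location to excerpt
    const first = Math.max(1, msg.lineNum - context);
    const last = Math.min(lines.length, msg.lineNum + context);
    for (let n = first; n <= last; n++) {
      const marker = n === msg.lineNum ? '>' : ' ';
      out.push(`${marker} ${String(n).padStart(4)} | ${lines[n - 1]}`);
    }
  }
  return out.join('\n');
}
```

In the browser, the messages would come from `await shaderModule.getCompilationInfo()`, and the formatted string would be logged only in dev builds. Because the excerpt is taken from the linked output, the per-module docblock comments are what tie an excerpt back to its source `.wesl` file.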
diff --git a/package-lock.json b/package-lock.json index 557a806..440f970 100644 --- a/package-lock.json +++ b/package-lock.json @@ -27,6 +27,8 @@ "typescript": "6.0.3", "vite": "8.0.10", "vitest": "4.1.5", + "wesl": "0.7.26", + "wesl-plugin": "0.6.74", "wrangler": "4.87.0" }, "engines": { @@ -1194,6 +1196,28 @@ "url": "https://opencollective.com/libvips" } }, + "node_modules/@jridgewell/gen-mapping": { + "version": "0.3.13", + "resolved": "https://registry.npmjs.org/@jridgewell/gen-mapping/-/gen-mapping-0.3.13.tgz", + "integrity": "sha512-2kkt/7niJ6MgEPxF0bYdQ6etZaA+fQvDcLKckhy1yIQOzaoKjBBjSj63/aLVjYE3qhRt5dvM+uUyfCg6UKCBbA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@jridgewell/sourcemap-codec": "^1.5.0", + "@jridgewell/trace-mapping": "^0.3.24" + } + }, + "node_modules/@jridgewell/remapping": { + "version": "2.3.5", + "resolved": "https://registry.npmjs.org/@jridgewell/remapping/-/remapping-2.3.5.tgz", + "integrity": "sha512-LI9u/+laYG4Ds1TDKSJW2YPrIlcVYOwi2fUC6xB43lueCjgxV4lffOCZCtYFiH6TNOX+tQKXx97T4IKHbhyHEQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "@jridgewell/gen-mapping": "^0.3.5", + "@jridgewell/trace-mapping": "^0.3.24" + } + }, "node_modules/@jridgewell/resolve-uri": { "version": "3.1.2", "resolved": "https://registry.npmjs.org/@jridgewell/resolve-uri/-/resolve-uri-3.1.2.tgz", @@ -1827,6 +1851,19 @@ "dev": true, "license": "BSD-3-Clause" }, + "node_modules/acorn": { + "version": "8.16.0", + "resolved": "https://registry.npmjs.org/acorn/-/acorn-8.16.0.tgz", + "integrity": "sha512-UVJyE9MttOsBQIDKw1skb9nAwQuR5wuGD3+82K6JgJlm/Y+KI92oNsMNGZCYdDsVtRHSak0pcV5Dno5+4jh9sw==", + "dev": true, + "license": "MIT", + "bin": { + "acorn": "bin/acorn" + }, + "engines": { + "node": ">=0.4.0" + } + }, "node_modules/assertion-error": { "version": "2.0.1", "resolved": "https://registry.npmjs.org/assertion-error/-/assertion-error-2.0.1.tgz", @@ -3321,6 +3358,22 @@ "pathe": "^2.0.3" } }, + "node_modules/unplugin": { + "version": "2.3.11", 
+ "resolved": "https://registry.npmjs.org/unplugin/-/unplugin-2.3.11.tgz", + "integrity": "sha512-5uKD0nqiYVzlmCRs01Fhs2BdkEgBS3SAVP6ndrBsuK42iC2+JHyxM05Rm9G8+5mkmRtzMZGY8Ct5+mliZxU/Ww==", + "dev": true, + "license": "MIT", + "dependencies": { + "@jridgewell/remapping": "^2.3.5", + "acorn": "^8.15.0", + "picomatch": "^4.0.3", + "webpack-virtual-modules": "^0.6.2" + }, + "engines": { + "node": ">=18.12.0" + } + }, "node_modules/vite": { "version": "8.0.10", "resolved": "https://registry.npmjs.org/vite/-/vite-8.0.10.tgz", @@ -3489,6 +3542,68 @@ } } }, + "node_modules/webpack-virtual-modules": { + "version": "0.6.2", + "resolved": "https://registry.npmjs.org/webpack-virtual-modules/-/webpack-virtual-modules-0.6.2.tgz", + "integrity": "sha512-66/V2i5hQanC51vBQKPH4aI8NMAcBW59FVBs+rC7eGHupMyfn34q7rZIE+ETlJ+XTevqfUhVVBgSUNSW2flEUQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/wesl": { + "version": "0.7.26", + "resolved": "https://registry.npmjs.org/wesl/-/wesl-0.7.26.tgz", + "integrity": "sha512-61iTpol7jy9iXiIN4T5x/1UFRrVFN5KUUKuBY3iE0e4Cr1Si0RF+0KCLnSGa/QNr2ZfYTs6dwdqgpBLDIR6iDQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/wesl-plugin": { + "version": "0.6.74", + "resolved": "https://registry.npmjs.org/wesl-plugin/-/wesl-plugin-0.6.74.tgz", + "integrity": "sha512-0xStBCryNYLLRitqumIcYNW3YqQL81u+9aiiJqL6GDHIefNLKXrEJlroP8chdoiK0BYFTC9FRBigKo5adTWjlw==", + "dev": true, + "dependencies": { + "unplugin": "^2.3.5", + "wesl": "0.7.26", + "wesl-reflect": "0.0.5" + }, + "peerDependencies": { + "@nuxt/kit": "^3", + "@nuxt/schema": "^3", + "esbuild": "*", + "rollup": "^3", + "vite": ">=3", + "webpack": "^4 || ^5" + }, + "peerDependenciesMeta": { + "@nuxt/kit": { + "optional": true + }, + "@nuxt/schema": { + "optional": true + }, + "esbuild": { + "optional": true + }, + "rollup": { + "optional": true + }, + "vite": { + "optional": true + }, + "webpack": { + "optional": true + } + } + }, + "node_modules/wesl-reflect": { + "version": "0.0.5", + "resolved": 
"https://registry.npmjs.org/wesl-reflect/-/wesl-reflect-0.0.5.tgz", + "integrity": "sha512-HG4dU7Bw82paVdU0jZU49W6/aGIrHlGt9zNjopWQyS4gzHJnpUfdsNM+fbCObts8kLPN89B7QAjnZGZmgYz0mw==", + "dev": true, + "dependencies": { + "wesl": "0.7.26" + } + }, "node_modules/why-is-node-running": { "version": "2.3.0", "resolved": "https://registry.npmjs.org/why-is-node-running/-/why-is-node-running-2.3.0.tgz", diff --git a/package.json b/package.json index fdf9108..472b5db 100644 --- a/package.json +++ b/package.json @@ -67,6 +67,8 @@ "typescript": "6.0.3", "vite": "8.0.10", "vitest": "4.1.5", + "wesl": "0.7.26", + "wesl-plugin": "0.6.74", "wrangler": "4.87.0" }, "dependencies": { diff --git a/src/@types/wesl.d.ts b/src/@types/wesl.d.ts new file mode 100644 index 0000000..c6c9d94 --- /dev/null +++ b/src/@types/wesl.d.ts @@ -0,0 +1,9 @@ +// Activate wesl-plugin's ambient declarations for `?static` etc. +// +// We import these via tsconfig.json `types: ["wesl-plugin/suffixes"]`, but +// that subpath form isn't reliably resolved by every TypeScript version +// when the compilerOptions are picked up by the editor / build separately. +// A triple-slash reference here is the belt-and-braces fallback that +// guarantees resolution from any compiler entry point. +/// <reference types="wesl-plugin/suffixes" /> +export {}; diff --git a/src/services/gpu/cloudFade.ts b/src/services/gpu/cloudFade.ts index 6a1d57c..aebd9a0 100644 --- a/src/services/gpu/cloudFade.ts +++ b/src/services/gpu/cloudFade.ts @@ -132,6 +132,7 @@ export class CloudFade { startNowMs: number = performance.now(), ) { this.buffer = device.createBuffer({ + label: 'cloudFade-uniform-buffer', // 16 bytes is WebGPU's minimum uniform-buffer alignment — even though // we only need 4 bytes for the f32 opacity, allocating less is a // validation error.
The shader's `_pad0/1/2` fields consume the @@ -140,6 +141,7 @@ export class CloudFade { usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST, }); this.bindGroup = device.createBindGroup({ + label: 'cloudFade-bg', layout: bindGroupLayout, entries: [{ binding: 0, resource: { buffer: this.buffer } }], }); diff --git a/src/services/gpu/diskRenderer.ts b/src/services/gpu/diskRenderer.ts index de1bdac..1a610d5 100644 --- a/src/services/gpu/diskRenderer.ts +++ b/src/services/gpu/diskRenderer.ts @@ -28,7 +28,8 @@ import type { mat4 } from 'gl-matrix'; import type { GpuContext } from '../../@types'; -import diskWgsl from './shaders/disks.wgsl?raw'; +import diskWgsl from './shaders/disks.wesl?static'; +import { createShaderModuleWithDevLog } from './shaderCompileLogger'; export type DiskInstance = { x: number; @@ -53,20 +54,27 @@ const FLOATS_PER_INSTANCE = 12; const BYTES_PER_INSTANCE = FLOATS_PER_INSTANCE * 4; /** - * 96-byte uniform layout (matches the WGSL `Uniforms` struct in disks.wgsl): + * 96-byte uniform layout (matches the WESL `Uniforms` struct in + * disks.wesl, which now extends the shared `CameraUniforms` prefix from + * `lib/camera.wesl`): * - * bytes 0..63 : viewProj mat4x4 (16 floats = 64 B) - * bytes 64..71 : viewport vec2 (2 floats = 8 B) - * bytes 72..79 : _pad0/_pad1 f32 × 2 (8 B; pads next vec3 to 16-B boundary) - * bytes 80..91 : camPos vec3 (3 floats = 12 B; vec3 needs 16-B alignment) - * bytes 92..95 : _pad2 f32 (4 B; trailing pad in camPos's vec4 quantum) + * bytes 0..63 : cam.viewProj mat4x4 (16 floats = 64 B) + * bytes 64..71 : cam.viewportPx vec2 (2 floats = 8 B) + * bytes 72..79 : cam._pad0 / _pad1 f32 × 2 (8 B; pads next vec3 to 16-B boundary) + * bytes 80..91 : camPos vec3 (3 floats = 12 B; vec3 needs 16-B alignment) + * bytes 92..95 : _pad2 f32 (4 B; trailing pad in camPos's vec4 quantum) * - * Total: 96 bytes — multiple of 16 ✓. 
This mirrors the QuadRenderer's - * revised layout (after the orbit-warp fix) so the two passes can share - * the same conceptual binding even though their consumers differ: - * QuadRenderer uses the trailing slot for `pxPerRad`, while DiskRenderer - * doesn't need pixel-radius math (the disk geometry sizes itself in - * world space) and leaves it as padding. + * Total: 96 bytes — multiple of 16 ✓. Byte-for-byte identical to the + * pre-CameraUniforms layout: the WESL refactor only renamed the prefix + * fields ('viewProj' → 'cam.viewProj', 'viewport' → 'cam.viewportPx', + * '_pad0/_pad1' → 'cam._pad0/_pad1') without moving any of the + * trailing renderer-specific bytes, so this CPU uploader didn't need to + * shift any offsets. This mirrors the QuadRenderer's revised layout + * (after the orbit-warp fix) so the two passes can share the same + * conceptual binding even though their consumers differ: QuadRenderer + * uses the trailing slot for `pxPerRad`, while DiskRenderer doesn't + * need pixel-radius math (the disk geometry sizes itself in world + * space) and leaves it as padding. 
*/ const UNIFORM_BYTES = 96; @@ -95,11 +103,14 @@ export class DiskRenderer { ], }); - const module = this.device.createShaderModule({ label: 'disks-wgsl', code: diskWgsl }); + const module = createShaderModuleWithDevLog(this.device, diskWgsl, 'disks'); this.pipeline = this.device.createRenderPipeline({ label: 'disk-pipeline', - layout: this.device.createPipelineLayout({ bindGroupLayouts: [this.bindGroupLayout] }), + layout: this.device.createPipelineLayout({ + label: 'disks-pipeline-layout', + bindGroupLayouts: [this.bindGroupLayout], + }), vertex: { module, entryPoint: 'vs', diff --git a/src/services/gpu/filamentRenderer.ts b/src/services/gpu/filamentRenderer.ts index 4263f07..529dec6 100644 --- a/src/services/gpu/filamentRenderer.ts +++ b/src/services/gpu/filamentRenderer.ts @@ -13,7 +13,7 @@ * indexBuffer (static) : 6 × uint16 → two-triangle quad * quadVertexBuffer (static) : 4 × vec2 → corner UVs * segmentInstanceBuffer : segmentCount × 8 × f32 → per-segment endpoints - * uniformBuffer : 32 bytes (viewProj + viewport + halfWidth) + * uniformBuffer : 96 bytes (CameraUniforms prefix + halfWidth + intensityScale + tail pad) * * Public API: * - new FilamentRenderer(device, format) @@ -22,21 +22,30 @@ * - clear() → drops the instance buffer * - destroy() → releases all GPU resources */ -import shaderSource from './shaders/filaments.wgsl?raw'; +import shaderSource from './shaders/filaments.wesl?static'; import type { FilamentCloud } from '../../@types/FilamentCloud'; import type { mat4 } from 'gl-matrix'; import { CloudFade } from './cloudFade'; +import { createShaderModuleWithDevLog } from './shaderCompileLogger'; const FLOATS_PER_SEGMENT = 8; // startxyz + startD + endxyz + endD -// Uniform block layout (std140-ish, WGSL host-shareable): -// viewProj mat4 = 64 bytes -// viewport vec2 = 8 bytes -// halfWidthPx f32 = 4 bytes -// _pad f32 = 4 bytes (round to 16-byte alignment) -// Total: 80 bytes. 
WebGPU rounds uniform-buffer sizes up to a multiple -// of 16, so 80 is already aligned — no extra padding needed. -const UNIFORM_BYTES = 80; +// Uniform block layout, mirroring 'struct Uniforms' in +// 'shaders/filaments.wesl'. The first 80 bytes are the shared +// 'CameraUniforms' prefix from 'shaders/lib/camera.wesl'; the +// renderer-specific scalars sit AFTER it in offsets 80..87. The +// trailing 8B pad rounds up to a 16-byte multiple — WebGPU would +// round the buffer size anyway, but writing the pad explicitly keeps +// the JS-side layout obvious and grep-able. +// +// offset 0..63 : viewProj mat4x4 (CameraUniforms.viewProj) +// offset 64..71 : viewportPx vec2 (CameraUniforms.viewportPx) +// offset 72..79 : _pad0, _pad1 2 × f32 (CameraUniforms reserved) +// offset 80..83 : halfWidthPx f32 +// offset 84..87 : intensityScale f32 +// offset 88..95 : _pad0, _pad1 2 × f32 (Uniforms tail pad) +// Total: 96 bytes. +const UNIFORM_BYTES = 96; /** * Build a flat per-segment instance array from a `FilamentCloud`. One @@ -116,9 +125,10 @@ export class FilamentRenderer { */ hdrFormat: GPUTextureFormat, ) { - const module = device.createShaderModule({ code: shaderSource }); + const module = createShaderModuleWithDevLog(device, shaderSource, 'filaments'); this.uniformBuffer = device.createBuffer({ + label: 'filaments-uniform-buffer', size: UNIFORM_BYTES, usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST, }); @@ -126,6 +136,7 @@ export class FilamentRenderer { // Static index buffer: two triangles forming the quad. const indices = new Uint16Array([0, 1, 2, 1, 3, 2]); this.indexBuffer = device.createBuffer({ + label: 'filaments-index-buffer', size: indices.byteLength, usage: GPUBufferUsage.INDEX | GPUBufferUsage.COPY_DST, }); @@ -134,12 +145,14 @@ export class FilamentRenderer { // Static quad-corner buffer: 4 vertices × vec2 = 32 bytes. 
const quadCorners = new Float32Array([0, 0, 1, 0, 0, 1, 1, 1]); this.quadVertexBuffer = device.createBuffer({ + label: 'filaments-quad-vertex-buffer', size: quadCorners.byteLength, usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST, }); device.queue.writeBuffer(this.quadVertexBuffer, 0, quadCorners); const bindGroupLayout = device.createBindGroupLayout({ + label: 'filaments-bgl-uniforms', entries: [ { binding: 0, @@ -155,6 +168,7 @@ export class FilamentRenderer { // never needs to see the opacity). Stored on the instance so the // lazily-created CloudFade can reuse it. this.cloudFadeBindGroupLayout = device.createBindGroupLayout({ + label: 'filaments-bgl-cloudFade', entries: [ { binding: 0, @@ -165,12 +179,15 @@ export class FilamentRenderer { }); this.bindGroup = device.createBindGroup({ + label: 'filaments-bg-uniforms', layout: bindGroupLayout, entries: [{ binding: 0, resource: { buffer: this.uniformBuffer } }], }); this.pipeline = device.createRenderPipeline({ + label: 'filaments-pipeline', layout: device.createPipelineLayout({ + label: 'filaments-pipeline-layout', bindGroupLayouts: [bindGroupLayout, this.cloudFadeBindGroupLayout], }), vertex: { @@ -231,6 +248,7 @@ export class FilamentRenderer { } this.instanceBuffer?.destroy(); this.instanceBuffer = this.device.createBuffer({ + label: 'filaments-instance-buffer', size: data.byteLength, usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST, }); @@ -262,20 +280,25 @@ export class FilamentRenderer { if (this.segmentCount === 0 || !this.instanceBuffer || !this.fade) return; // Pack uniforms. See UNIFORM_BYTES comment above for the byte layout. 
- // f32[0..15] viewProj (mat4) - // f32[16..17] viewport (vec2) - // f32[18] halfWidthPx - // f32[19] intensityScale (was: padding; the slot is already - // in the uniform buffer's footprint, repurposing it - // for the user-facing intensity slider doesn't grow - // the uniform's size or change its 16-byte alignment) + // f32[0..15] viewProj (mat4) — CameraUniforms.viewProj + // f32[16..17] viewportPx (vec2) — CameraUniforms.viewportPx + // f32[18..19] CameraUniforms reserved pad (left zero) + // f32[20] halfWidthPx — Uniforms.halfWidthPx (offset 80) + // f32[21] intensityScale — Uniforms.intensityScale (offset 84) + // f32[22..23] Uniforms tail pad (left zero) + // + // Adoption of the shared 'CameraUniforms' prefix moved the two + // scalars from f32-indices 18/19 to 20/21. The two reserved + // pad slots in CameraUniforms (f32[18..19]) stay zero: a write + // at the old 18/19 indices lands in padding the shader never + // reads, so halfWidthPx and intensityScale would silently read + // as zero. const buf = new ArrayBuffer(UNIFORM_BYTES); const f32 = new Float32Array(buf); f32.set(viewProj as Float32Array, 0); f32[16] = viewportPx[0]; f32[17] = viewportPx[1]; - f32[18] = halfWidthPx; - f32[19] = intensityScale; + f32[20] = halfWidthPx; + f32[21] = intensityScale; this.device.queue.writeBuffer(this.uniformBuffer, 0, buf); // Cloud-fade-in opacity for this frame. 
Steady-state (after the diff --git a/src/services/gpu/milkyWayRenderer.ts b/src/services/gpu/milkyWayRenderer.ts index 6d7b858..14db100 100644 --- a/src/services/gpu/milkyWayRenderer.ts +++ b/src/services/gpu/milkyWayRenderer.ts @@ -14,29 +14,45 @@ * * ### Uniform buffer ABI * - * 96 bytes total — padded to the same shape as the procedural-disk - * uniform layout so future refactors that share a uniform-pack helper - * across passes don't have to special-case this one: + * 112 bytes total — first 80 bytes are the shared `CameraUniforms` + * prefix from `lib/camera.wesl`, followed by the renderer-specific + * camera position + scalars + tail pad: * - * offset 0 | mat4x4 viewProj — vertex stage projects the - * world-anchored billboard - * offset 64 | vec2 viewport — UNUSED (ABI symmetry) - * offset 72 | f32 fadeAlpha — distance-based alpha, [0..1] - * offset 76 | f32 iTime — animation time (sec * 0.25) - * offset 80 | vec3 cameraPosWorld — drives both the vertex - * stage's view-aligned - * billboard basis and the - * fragment stage's - * synthetic-camera ray - * origin - * offset 92 | f32 _pad — alignment padding to 96 B + * offset 0 | mat4x4 cam.viewProj — vertex stage projects the + * world-anchored billboard + * offset 64 | vec2 cam.viewportPx — UNUSED here (ABI symmetry + * with peer renderers) + * offset 72 | f32 cam._pad0 — reserved by CameraUniforms + * offset 76 | f32 cam._pad1 — reserved by CameraUniforms + * offset 80 | vec3 cameraPosWorld — drives both the vertex + * stage's view-aligned + * billboard basis and the + * fragment stage's + * synthetic-camera ray + * origin + * offset 92 | f32 fadeAlpha — distance-based alpha [0..1] + * offset 96 | f32 iTime — animation time (sec * 0.25) + * offset 100 | f32 × 3 _pad — round struct up to 112 B * - * **viewProj is now load-bearing.** Earlier this pass emitted directly + * #### Why the field order changed (vs the pre-WESL-conversion layout) + * + * The previous layout placed `fadeAlpha` + `iTime` at offsets 
72/76, + * which collide with the `_pad0/_pad1` slots that `CameraUniforms` + * reserves. To embed `cam: CameraUniforms` as the first field we + * had to relocate the renderer-specific scalars after the cam block. + * `cameraPosWorld` (vec3, 16-byte alignment) lands naturally at + * offset 80 — the first 16-byte boundary after cam — and the two + * f32 scalars fall in at 92 / 96. CPU-side: `fadeAlpha` moved from + * f32 index 18 → 23, `iTime` moved from f32 index 19 → 24, + * `cameraPosWorld` stays at 20..22. + * + * **viewProj is load-bearing.** Earlier this pass emitted directly * in clip-space (slot 0 was kept "for ABI symmetry") and the impostor * was always full-screen regardless of camera distance. The * world-anchored billboard fixes that — the vertex stage projects each - * corner via viewProj so the quad's apparent angular size on screen - * scales as `2 * atan(milkyWayHalfExtent / cameraDistance)`. + * corner via `worldToClip(u.cam, p)` so the quad's apparent angular + * size on screen scales as `2 * atan(milkyWayHalfExtent / + * cameraDistance)`. * * **cameraPosWorld is also load-bearing.** Earlier the fragment stage * hard-coded `ro = vec3(0, 0.7, 2) * 0.75` for its synthetic camera @@ -46,8 +62,8 @@ * drive the raymarched render — orbiting reveals different aspects of * the spiral. * - * viewport stays unused: the fragment shader works in the impostor's - * local UV directly, never in pixel coordinates. + * `viewportPx` stays unused: the fragment shader works in the + * impostor's local UV directly, never in pixel coordinates. * * ### Why no instance vertex buffer? * @@ -60,7 +76,8 @@ * `@builtin(vertex_index)`. 
*/ -import wgsl from './shaders/milkyWayImpostor.wgsl?raw'; +import wgsl from './shaders/milkyWayImpostor.wesl?static'; +import { createShaderModuleWithDevLog } from './shaderCompileLogger'; type Init = { device: GPUDevice; @@ -70,11 +87,12 @@ type Init = { export class MilkyWayRenderer { /** * Public constant pinning the on-the-wire uniform buffer size. Must - * match the WGSL `Uniforms` struct's std140-ish layout (mat4 + vec2 + - * 2 f32 + 16 bytes padding = 96 bytes) byte-for-byte. Changing one - * without the other yields silent uniform-read corruption. + * match the WESL `Uniforms` struct's std140-ish layout + * (`CameraUniforms` 80 B + vec3 cameraPosWorld 12 B + 2 × f32 8 B + + * 12 B tail pad = 112 bytes) byte-for-byte. Changing one without + * the other yields silent uniform-read corruption. */ - static readonly UNIFORM_BUFFER_SIZE = 96; + static readonly UNIFORM_BUFFER_SIZE = 112; private device: GPUDevice; private pipeline: GPURenderPipeline; @@ -86,9 +104,10 @@ export class MilkyWayRenderer { const { device, format } = init; this.device = device; - const module = device.createShaderModule({ code: wgsl }); + const module = createShaderModuleWithDevLog(device, wgsl, 'milkyWay'); this.bindGroupLayout = device.createBindGroupLayout({ + label: 'milkyWay-bgl-uniforms', entries: [ { binding: 0, @@ -99,20 +118,24 @@ export class MilkyWayRenderer { }); this.uniformBuffer = device.createBuffer({ + label: 'milkyWay-uniform-buffer', size: MilkyWayRenderer.UNIFORM_BUFFER_SIZE, usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST, }); this.bindGroup = device.createBindGroup({ + label: 'milkyWay-bg-uniforms', layout: this.bindGroupLayout, entries: [{ binding: 0, resource: { buffer: this.uniformBuffer } }], }); const pipelineLayout = device.createPipelineLayout({ + label: 'milkyWay-pipeline-layout', bindGroupLayouts: [this.bindGroupLayout], }); this.pipeline = device.createRenderPipeline({ + label: 'milkyWay-pipeline', layout: pipelineLayout, vertex: { module, 
entryPoint: 'vs' }, fragment: { @@ -194,28 +217,41 @@ export class MilkyWayRenderer { iTimeSec: number, cameraPosWorld: [number, number, number], ): void { - // Pack uniforms into a 96-byte ArrayBuffer matching the WGSL + // Pack uniforms into a 112-byte ArrayBuffer matching the WESL // `Uniforms` struct layout. See the class doc-comment for the - // offset table. + // full offset table. const uniforms = new ArrayBuffer(MilkyWayRenderer.UNIFORM_BUFFER_SIZE); const f32 = new Float32Array(uniforms); - // mat4 viewProj (offsets 0..63 / floats 0..15) + // cam.viewProj — mat4 (offsets 0..63 / floats 0..15) f32.set(viewProj, 0); - // viewport (offsets 64..71 / floats 16..17) + // cam.viewportPx — vec2 (offsets 64..71 / floats 16..17). Unread + // by this pass but uploaded for ABI symmetry with the rest of the + // engine (every other renderer reads viewportPx for pxPerRad-style + // derivations). f32[16] = viewport[0]; f32[17] = viewport[1]; - // fadeAlpha (offset 72 / float 18) - f32[18] = fadeAlpha; - // iTime (offset 76 / float 19) - f32[19] = iTimeSec; - // cameraPosWorld (offsets 80..91 / floats 20..22). vec3 alignment - // is 16 bytes in the WGSL std140-ish layout, so the field starts - // at offset 80 (the next multiple of 16 after 76+4=80). Float 23 - // is the trailing pad and stays zero — the ArrayBuffer init takes - // care of it. + // cam._pad0/_pad1 (offsets 72..79 / floats 18..19) — reserved by + // CameraUniforms. Stays zero (ArrayBuffer init handles it). + // cameraPosWorld — vec3 (offsets 80..91 / floats 20..22). Float + // 22 is the third component of the vec3, NOT padding; the next + // 16-byte boundary is at offset 96, so the implicit padding sits + // at offset 92 in WGSL terms — but our layout repurposes that + // slot as the next field (fadeAlpha) since vec3 + f32 fits in a + // 16-byte chunk without extra alignment loss. 
f32[20] = cameraPosWorld[0]; f32[21] = cameraPosWorld[1]; f32[22] = cameraPosWorld[2]; + // fadeAlpha (offset 92 / float 23) — sits in the f32 slot + // immediately after the vec3, packing the vec3+f32 quad into + // bytes 80..95. + f32[23] = fadeAlpha; + // iTime (offset 96 / float 24). Note: this moved from float + // index 19 in the pre-CameraUniforms layout — the cam prefix + // now occupies 0..79 and the renderer-specific scalars sit + // after the cameraPosWorld vec3. + f32[24] = iTimeSec; + // Floats 25..27 are tail padding (offsets 100..111) rounding + // the struct size up to a 16-byte multiple. Stays zero. this.device.queue.writeBuffer(this.uniformBuffer, 0, uniforms); pass.setPipeline(this.pipeline); diff --git a/src/services/gpu/pickRenderer.ts b/src/services/gpu/pickRenderer.ts index 342989e..39b6425 100644 --- a/src/services/gpu/pickRenderer.ts +++ b/src/services/gpu/pickRenderer.ts @@ -37,9 +37,9 @@ * * The pick pipeline reuses the *same* vertex buffer and *same* uniform buffer as * the visual pass. The caller must ensure that the visual pass has already - * written its per-frame uniforms (viewProj, viewport, pointSizePx, brightness) - * before calling `pick()` — the pick pass reads the same values without - * re-uploading them. See the `pick()` JSDoc for the exact contract. + * written its per-frame uniforms (cam.viewProj, cam.viewportPx, pointSizePx, + * brightness, ...) before calling `pick()` — the pick pass reads the same + * values without re-uploading them. See the `pick()` JSDoc for the exact contract. 
* * ### Forgiveness radius * @@ -50,9 +50,10 @@ * @module */ -import shaderSrc from './shaders/points.wgsl?raw'; +import shaderSrc from './shaders/points.wesl?static'; import type { Source } from '../../data/sources'; import type { PointRenderer } from './pointRenderer'; +import { createShaderModuleWithDevLog } from './shaderCompileLogger'; // ─── Types ──────────────────────────────────────────────────────────────────── @@ -251,7 +252,7 @@ export function createPickRenderer( // We reuse the same WGSL source as PointRenderer. The shader file contains // both the `fs` (visual) and `fsPick` (picking) fragment entry points. // Here we select `fsPick`. - const module = device.createShaderModule({ code: shaderSrc }); + const module = createShaderModuleWithDevLog(device, shaderSrc, 'pick'); // ── Render pipeline ──────────────────────────────────────────────────────── // @@ -266,6 +267,7 @@ export function createPickRenderer( // `layout: 'auto'` reflects the bind group layout from the shader's @group/@binding // declarations. The single binding is @group(0) @binding(0) — the Uniforms buffer. const pipeline = device.createRenderPipeline({ + label: 'pick-pipeline', layout: 'auto', vertex: { @@ -344,6 +346,7 @@ export function createPickRenderer( // 4-byte texel, we must allocate at least 256 bytes. We never map this // buffer for writing — only MAP_READ is needed. const stagingBuffer = device.createBuffer({ + label: 'pick-staging-buffer', size: 256, usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST, }); @@ -397,6 +400,7 @@ export function createPickRenderer( // `RENDER_ATTACHMENT` — the render pass can write to it. // `COPY_SRC` — we copy a single pixel out of it after the pass. 
pickTexture = device.createTexture({ + label: 'pick-target', size: { width: w, height: h }, format: 'r32uint', usage: GPUTextureUsage.RENDER_ATTACHMENT | GPUTextureUsage.COPY_SRC, @@ -409,6 +413,7 @@ export function createPickRenderer( // Only `RENDER_ATTACHMENT` is needed — depth buffers are not typically read // back to the CPU, so no `COPY_SRC` here. depthTexture = device.createTexture({ + label: 'pick-depth', size: { width: w, height: h }, format: 'depth24plus', usage: GPUTextureUsage.RENDER_ATTACHMENT, @@ -480,8 +485,13 @@ export function createPickRenderer( // with the real selectedPacked, so we don't need to restore // anything afterward. // - // Layout: mat4 viewProj (64) + viewport (8) + pointSizePx (4) + - // brightness (4) → selectedPacked sits at byte offset 80. + // Layout (post-CameraUniforms refactor): the shared 80-byte + // 'CameraUniforms' prefix occupies bytes 0..79 (viewProj + viewportPx + // + two pad slots), so 'selectedPacked' sits at byte offset 80 — the + // SAME offset as before the refactor (pre-refactor was viewProj 64 + + // viewport 8 + pointSizePx 4 + brightness 4 = 80). The value at this + // offset is the packed (source, localIdx) u32, not an instance + // index — see PointRenderer.toGlobalIdx for the encoding. const SELECTED_PACKED_OFFSET = 80; const NONE_SENTINEL = new Uint32Array([0xffffffff]); device.queue.writeBuffer(sharedUniformBuffer, SELECTED_PACKED_OFFSET, NONE_SENTINEL); @@ -492,16 +502,24 @@ export function createPickRenderer( // full rationale. Pads the visual `pointSizePx` floor by a few extra // pixels so distant point-like galaxies become easier mouse targets // without growing them on screen. Same in-place mutation pattern as - // the SELECTED_INDEX write above — the next visual frame writes the + // the SELECTED_PACKED write above — the next visual frame writes the // real `pointSizePx` back, so the visual pass is unaffected. 
// - // Layout reminder: pointSizePx sits at byte offset 72 (mat4 viewProj - // = 64 + viewport vec2 = 8 → 72). Skipped entirely when the caller - // didn't supply pointSizePx — preserves the legacy "pick whatever the - // visual frame just wrote" contract for any test that constructs the - // renderer in isolation. + // Layout reminder (post-CameraUniforms refactor): pointSizePx now sits + // at byte offset 88 (cam: CameraUniforms = 80 B prefix + selectedPacked + // u32 + instanceIdOffset u32 = 88). It used to live at offset 72, but + // adopting the shared 'CameraUniforms' prefix (which reserves bytes + // 72..79 as '_pad0/_pad1') forced 'pointSizePx' + 'brightness' to + // move into the existing 8-byte alignment slack between + // 'instanceIdOffset' and the vec3-aligned 'camPosWorld'. See the + // 'Uniforms layout' doc-block in points.wesl for the migration + // diagram and the matching f32-index update in pointRenderer.ts. + // + // Skipped entirely when the caller didn't supply pointSizePx — + // preserves the legacy "pick whatever the visual frame just wrote" + // contract for any test that constructs the renderer in isolation. if (pointSizePx !== undefined) { - const POINT_SIZE_OFFSET = 72; + const POINT_SIZE_OFFSET = 88; const boostedSize = new Float32Array([pointSizePx + PICK_PADDING_PX]); device.queue.writeBuffer(sharedUniformBuffer, POINT_SIZE_OFFSET, boostedSize); } @@ -550,6 +568,7 @@ export function createPickRenderer( // next pick() call picks up the fresh handle without needing to // invalidate this PickRenderer. 
const bindGroup = device.createBindGroup({ + label: 'pick-bg-uniforms', layout: pipeline.getBindGroupLayout(0), entries: [{ binding: 0, resource: { buffer: sharedUniformBuffer } }], }); @@ -567,6 +586,7 @@ export function createPickRenderer( const cloudLayout = pipeline.getBindGroupLayout(1); for (const src of sourceList) { const cloudBindGroup = device.createBindGroup({ + label: `pick-bg-cloudFade-${src.source}`, layout: cloudLayout, entries: [{ binding: 0, resource: { buffer: src.cloudFadeBuffer } }], }); diff --git a/src/services/gpu/pointRenderer.ts b/src/services/gpu/pointRenderer.ts index 6135aa5..03efd49 100644 --- a/src/services/gpu/pointRenderer.ts +++ b/src/services/gpu/pointRenderer.ts @@ -81,13 +81,16 @@ import { type ComputeSchechterRatiosInput } from '../engine/computeSchechterRati import ComputeAngularWeightsWorker from '../engine/computeAngularWeights.worker?worker'; import { type ComputeAngularWeightsInput } from '../engine/computeAngularWeights'; -// `?raw` is a Vite-specific import suffix. It tells the bundler to import the -// file's content as a plain string rather than attempting to execute it as -// JavaScript. The WGSL source text ends up inlined in the JS bundle; at -// runtime we hand it to `device.createShaderModule({ code: shaderSrc })`. -// Without `?raw`, Vite would try to parse the .wgsl file as JS and fail. -import shaderSrc from './shaders/points.wgsl?raw'; +// `?static` is wesl-plugin's Vite import suffix. It runs the WESL linker at +// build time and hands us a plain WGSL string with all `import` statements +// resolved into top-level functions. We forward that string straight to +// `device.createShaderModule({ code: shaderSrc })`. The previous `?raw` +// suffix bypassed the linker entirely and worked only because the legacy +// .wgsl source was self-contained — once we extract shared modules under +// `shaders/lib/`, `?static` is required. 
+import shaderSrc from './shaders/points.wesl?static'; import { CloudFade } from './cloudFade'; +import { createShaderModuleWithDevLog } from './shaderCompileLogger'; // ─── Layout constants ───────────────────────────────────────────────────────── @@ -249,13 +252,14 @@ export const SELECTED_PACKED_BYTE_OFFSET = 80; * * The struct contains (offsets are byte offsets from the start of the buffer): * - * bytes 0..63 : viewProj mat4x4 (16 floats = 64 bytes) - * bytes 64..71 : viewport vec2 (2 floats) } - * bytes 72..75 : pointSizePx f32 (1 float) } 16 bytes (one vec4 slot) - * bytes 76..79 : brightness f32 (1 float) } + * bytes 0..63 : cam.viewProj mat4x4 (16 floats = 64 bytes) } CameraUniforms + * bytes 64..71 : cam.viewportPx vec2 (2 floats) } prefix from + * bytes 72..75 : cam._pad0 f32 (alignment slack) } lib/camera.wesl + * bytes 76..79 : cam._pad1 f32 (alignment slack) } (80 B total) * bytes 80..83 : selectedPacked u32 ← (selectedSource << 27) | selectedLocalIdx, or 0xFFFFFFFF * bytes 84..87 : sourceCode u32 ← per-draw source tag (5 bits used) - * bytes 88..95 : _pad0/_pad1 u32×2 (written as 0) ← alignment for the next vec3 slot + * bytes 88..91 : pointSizePx f32 (moved here from offset 72 — see Uniforms doc-block) + * bytes 92..95 : brightness f32 (moved here from offset 76 — see Uniforms doc-block) * bytes 96..107 : camPosWorld vec3 (3 floats) } 16 bytes (one vec4 slot) * bytes 108..111: pxPerRad f32 (1 float) } * bytes 112..115: highlightFallback u32 } @@ -279,15 +283,20 @@ export const SELECTED_PACKED_BYTE_OFFSET = 80; * * WGSL uniform buffers follow rules similar to std140 (see WGSL spec §13, * "Memory Layout"). Each member must be aligned to its alignment value: - * `vec3` requires 16-byte alignment, which is why the `_pad0/_pad1` - * pair sits between `sourceCode` and `camPosWorld` — without those - * eight bytes, `camPosWorld` would land at offset 88, breaking alignment - * and silently corrupting the camera position. 
+ * `vec3` requires 16-byte alignment, which is why we still need 8 + * bytes between `sourceCode` (offset 84) and `camPosWorld` (offset 96). + * The pre-CameraUniforms layout filled those 8 bytes with explicit + * `_pad0/_pad1` u32s; the post-refactor layout fills them with + * `pointSizePx` + `brightness` (formerly at offsets 72/76, which now + * belong to `CameraUniforms._pad0/_pad1`). Same number of bytes, same + * alignment — the displaced scalars simply moved into the existing pad slack. * - * The picker (`pickRenderer.ts`) writes `selectedPacked` (offset 80) + - * `sourceCode` (offset 84) for every per-source draw — see its `pick()` - * docblock for the per-source uniform-write pattern that lets the pick - * pass see the same packed identity space the visual pass does. + * The picker (`pickRenderer.ts`) writes `selectedPacked` (offset 80, + * UNCHANGED across the refactor) + `sourceCode` (offset 84) for every + * per-source draw — see its `pick()` docblock for the per-source + * uniform-write pattern that lets the pick pass see the same packed + * identity space the visual pass does. It also writes `pointSizePx` at + * offset 88 (moved from offset 72 by the CameraUniforms refactor). * * Task 15 added the trailing 16-byte slot for the orientation-visibility * toggles (`highlightFallback`, `realOnlyMode`). 
The two trailing u32 @@ -810,9 +819,10 @@ export class PointRenderer { private device: GPUDevice, format: GPUTextureFormat, ) { - const module = device.createShaderModule({ code: shaderSrc }); + const module = createShaderModuleWithDevLog(device, shaderSrc, 'points'); this.pipeline = device.createRenderPipeline({ + label: 'points-pipeline', layout: 'auto', vertex: { @@ -881,11 +891,13 @@ export class PointRenderer { }); this.uniformBuffer_internal = device.createBuffer({ + label: 'points-uniform-buffer', size: UNIFORM_BYTES, usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST, }); this.bindGroup = device.createBindGroup({ + label: 'points-bg-uniforms', layout: this.pipeline.getBindGroupLayout(0), entries: [{ binding: 0, resource: { buffer: this.uniformBuffer_internal } }], }); @@ -1012,6 +1024,7 @@ export class PointRenderer { } const buffer = this.device.createBuffer({ + label: `points-vertex-buffer-${source}`, size: interleaved.byteLength, usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST, }); @@ -1594,15 +1607,25 @@ export class PointRenderer { const f32 = new Float32Array(buf); const u32 = new Uint32Array(buf); + // Cam block (offsets 0..79) — viewProj + viewportPx + 2 reserved pads. + // f32[18] / f32[19] are the CameraUniforms '_pad0' / '_pad1' slots; the + // shared struct reserves them for vec3-alignment and they stay zero here. f32.set(viewProj, 0); - f32[16] = viewportPx[0]; - f32[17] = viewportPx[1]; - f32[18] = pointSizePx; - f32[19] = brightness; + f32[16] = viewportPx[0]; // cam.viewportPx.x at byte offset 64 + f32[17] = viewportPx[1]; // cam.viewportPx.y at byte offset 68 + // f32[18], f32[19] (cam._pad0, cam._pad1) stay zero. u32[20] = selectedPacked >>> 0; // selectedPacked at byte offset 80 - // u32[21..23] are pad bytes (sourceCode lives in the per-source - // @group(1) cloud bind group, not @group(0)). Float32Array starts - // zero-initialised so we don't need to write them explicitly. 
+ // u32[21] (offset 84) is the @group(0) _pad0 — sourceCode lives in + // the per-source @group(1) cloud bind group, not @group(0). + // ArrayBuffer starts zero-initialised so we don't need to write it. + // pointSizePx + brightness moved into f32[22]/f32[23] from f32[18]/f32[19] + // when the shared CameraUniforms prefix took over the first 80 bytes — + // they recycle the existing 8-byte alignment slack between the + // @group(0)-unused slot at offset 84 and the vec3-aligned camPosWorld + // at offset 96. See the 'Uniforms layout' doc-block in points.wesl + // and the matching POINT_SIZE_OFFSET = 88 in pickRenderer.ts. + f32[22] = pointSizePx; // bytes 88..91 + f32[23] = brightness; // bytes 92..95 f32[24] = camPosWorld[0]; // bytes 96..99 f32[25] = camPosWorld[1]; // bytes 100..103 f32[26] = camPosWorld[2]; // bytes 104..107 diff --git a/src/services/gpu/postProcess.ts b/src/services/gpu/postProcess.ts index 8ec2077..bc756c1 100644 --- a/src/services/gpu/postProcess.ts +++ b/src/services/gpu/postProcess.ts @@ -90,8 +90,14 @@ * monotonicity, asymptotic behaviour, and curve-specific shape. */ -import toneMapWgsl from './shaders/toneMap.wgsl?raw'; +// `?static` runs the WESL linker at build time and returns a flat WGSL +// string. For toneMap there are no `import` statements yet, so the +// linker output is byte-identical to the previous `?raw` import — but +// wiring `?static` here first makes this the smoke test for the whole +// wesl-plugin tooling chain. Later tasks add real imports. +import toneMapWgsl from './shaders/toneMap.wesl?static'; import { ToneMapCurve } from '../../data/toneMapCurve'; +import { createShaderModuleWithDevLog } from './shaderCompileLogger'; /** * Plain `{ width, height }` pair, kept local to this module. 
We @@ -198,6 +204,7 @@ export function createPostProcess( function allocateHdr(s: Size): void { if (hdrTexture) hdrTexture.destroy(); hdrTexture = device.createTexture({ + label: 'hdr-target', format: 'rgba16float', size: { width: s.width, height: s.height }, usage: GPUTextureUsage.RENDER_ATTACHMENT | GPUTextureUsage.TEXTURE_BINDING, @@ -208,7 +215,16 @@ export function createPostProcess( allocateHdr(size); // ── Tone-map pipeline (built once, lives until destroy) ─────────────── - const module = device.createShaderModule({ code: toneMapWgsl }); + // + // `label` shows up in `getCompilationInfo` diagnostics and in + // browser-devtools error reports, which makes it much easier to tell + // *which* shader broke when several modules fail in the same frame. + // The helper additionally dumps the linked WGSL on compile errors in + // dev mode — see `shaderCompileLogger.ts` for the rationale (Chrome's + // WGSL compiler reports error line numbers against the linked output + // that wesl-plugin produces, so the only way to map them back to a + // source file is to read the linked string ourselves). + const module = createShaderModuleWithDevLog(device, toneMapWgsl, 'toneMap'); // Why nearest, not linear? The HDR texture is the same resolution // as the swap chain (we resize it in lockstep) so the fullscreen @@ -217,6 +233,7 @@ export function createPostProcess( // work, and on some GPUs `linear` requires `'float32-filterable'` // even on rgba16float. `nearest` is universally supported. const sampler = device.createSampler({ + label: 'toneMap-sampler', magFilter: 'nearest', minFilter: 'nearest', }); @@ -224,11 +241,13 @@ export function createPostProcess( // Uniform layout: [exposure: f32, whitepointSq: f32, asinhSoftness: f32, // curve: u32] — 16 bytes total, naturally 16-byte aligned. 
const uniformBuffer = device.createBuffer({ + label: 'toneMap-uniform-buffer', size: 16, usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST, }); const bindGroupLayout = device.createBindGroupLayout({ + label: 'toneMap-bgl', entries: [ { binding: 0, visibility: GPUShaderStage.FRAGMENT, texture: { sampleType: 'float' } }, { binding: 1, visibility: GPUShaderStage.FRAGMENT, sampler: {} }, @@ -237,7 +256,11 @@ export function createPostProcess( }); const pipeline = device.createRenderPipeline({ - layout: device.createPipelineLayout({ bindGroupLayouts: [bindGroupLayout] }), + label: 'toneMap-pipeline', + layout: device.createPipelineLayout({ + label: 'toneMap-pipeline-layout', + bindGroupLayouts: [bindGroupLayout], + }), vertex: { module, entryPoint: 'vs' }, fragment: { module, @@ -272,6 +295,7 @@ export function createPostProcess( // bind a stale (destroyed) view. The cost is one allocation // per frame; trivial compared to the actual fullscreen blit. const bindGroup = device.createBindGroup({ + label: 'toneMap-bg', layout: bindGroupLayout, entries: [ { binding: 0, resource: hdrView! }, diff --git a/src/services/gpu/proceduralDiskRenderer.ts b/src/services/gpu/proceduralDiskRenderer.ts index 0eef592..c3f086e 100644 --- a/src/services/gpu/proceduralDiskRenderer.ts +++ b/src/services/gpu/proceduralDiskRenderer.ts @@ -11,8 +11,9 @@ * is just the JS-side pipeline wiring. 
*/ -import wgsl from './shaders/proceduralDisks.wgsl?raw'; +import wgsl from './shaders/proceduralDisks.wesl?static'; import type { ProceduralDiskInstance } from '../../@types/ProceduralDiskInstance'; +import { createShaderModuleWithDevLog } from './shaderCompileLogger'; const STRIDE_FLOATS = 12; // 3 vec4 per instance const STRIDE_BYTES = STRIDE_FLOATS * 4; @@ -37,9 +38,10 @@ export class ProceduralDiskRenderer { const { device, format } = init; this.device = device; - const module = device.createShaderModule({ code: wgsl }); + const module = createShaderModuleWithDevLog(device, wgsl, 'proceduralDisks'); this.bindGroupLayout = device.createBindGroupLayout({ + label: 'proceduralDisks-bgl-uniforms', entries: [ { binding: 0, @@ -52,20 +54,24 @@ export class ProceduralDiskRenderer { // Uniform layout matches diskRenderer / quadRenderer (mat4 + vec2 + // 2 padding f32 + vec3 + f32) — 96 bytes. this.uniformBuffer = device.createBuffer({ + label: 'proceduralDisks-uniform-buffer', size: 96, usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST, }); this.bindGroup = device.createBindGroup({ + label: 'proceduralDisks-bg-uniforms', layout: this.bindGroupLayout, entries: [{ binding: 0, resource: { buffer: this.uniformBuffer } }], }); const pipelineLayout = device.createPipelineLayout({ + label: 'proceduralDisks-pipeline-layout', bindGroupLayouts: [this.bindGroupLayout], }); this.pipeline = device.createRenderPipeline({ + label: 'proceduralDisks-pipeline', layout: pipelineLayout, vertex: { module, @@ -133,6 +139,7 @@ export class ProceduralDiskRenderer { this.vertexBuffer?.destroy(); const cap = Math.max(instances.length, 64); this.vertexBuffer = this.device.createBuffer({ + label: 'proceduralDisks-vertex-buffer', size: cap * STRIDE_BYTES, usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST, }); diff --git a/src/services/gpu/quadRenderer.ts b/src/services/gpu/quadRenderer.ts index 18f9ee9..a4ff4bf 100644 --- a/src/services/gpu/quadRenderer.ts +++ 
b/src/services/gpu/quadRenderer.ts @@ -15,7 +15,8 @@ import type { mat4 } from 'gl-matrix'; import type { GpuContext, QuadInstance } from '../../@types'; -import quadsWgsl from './shaders/quads.wgsl?raw'; +import quadsWgsl from './shaders/quads.wesl?static'; +import { createShaderModuleWithDevLog } from './shaderCompileLogger'; /** * Per-instance vertex attributes packed as 12 floats / 48 bytes: @@ -35,15 +36,25 @@ const FLOATS_PER_INSTANCE = 12; const BYTES_PER_INSTANCE = FLOATS_PER_INSTANCE * 4; /** - * 96-byte uniform layout (matches the WGSL `Uniforms` struct in quads.wgsl): + * 96-byte uniform layout, mirroring `struct Uniforms` in + * `shaders/quads.wesl`. The first 80 bytes are the shared + * `CameraUniforms` prefix from `shaders/lib/camera.wesl`; the + * renderer-specific `camPosWorld + pxPerRad` pair sits AFTER it + * starting at offset 80. * - * bytes 0..63 : viewProj mat4x4 (16 floats = 64 B) - * bytes 64..71 : viewport vec2 (2 floats = 8 B) - * bytes 72..79 : _pad0/_pad1 f32 × 2 (8 B; padding so the next vec3 lands on a 16-B boundary) - * bytes 80..91 : camPosWorld vec3 (3 floats = 12 B; vec3 needs 16-B alignment) - * bytes 92..95 : pxPerRad f32 (1 float = 4 B; fits the trailing slot of camPosWorld's 16-B vec4 quantum) + * bytes 0..63 : viewProj mat4x4 (CameraUniforms.viewProj) + * bytes 64..71 : viewportPx vec2 (CameraUniforms.viewportPx) + * bytes 72..79 : _pad0, _pad1 2 × f32 (CameraUniforms reserved) + * bytes 80..91 : camPosWorld vec3 (vec3 needs 16-B alignment, which 80 already provides) + * bytes 92..95 : pxPerRad f32 (fills the trailing slot of camPosWorld's 16-B vec4 quantum) * - * Total: 96 bytes — multiple of 16 ✓. + * Total: 96 bytes — multiple of 16, no tail pad needed. 
+ * + * Adopting `CameraUniforms` is a pure renaming at this layout: the + * shared prefix overlays the previous `viewProj + viewport + _pad0 + * + _pad1` region byte-for-byte, so f32-indices for camPosWorld / + * pxPerRad stay at 20..23 — the CPU writes below are unchanged from + * before adoption. * * `camPosWorld` and `pxPerRad` are used by the vertex stage to compute * each quad's apparent angular radius from its world-space distance to @@ -82,11 +93,14 @@ export class QuadRenderer { ], }); - const module = this.device.createShaderModule({ label: 'quads-wgsl', code: quadsWgsl }); + const module = createShaderModuleWithDevLog(this.device, quadsWgsl, 'quads'); this.pipeline = this.device.createRenderPipeline({ label: 'quad-pipeline', - layout: this.device.createPipelineLayout({ bindGroupLayouts: [this.bindGroupLayout] }), + layout: this.device.createPipelineLayout({ + label: 'quads-pipeline-layout', + bindGroupLayouts: [this.bindGroupLayout], + }), vertex: { module, entryPoint: 'vs', @@ -190,11 +204,20 @@ export class QuadRenderer { if (instances.length === 0) return; // Pack uniforms — see UNIFORM_BYTES doc-comment for the layout. + // f32[0..15] viewProj — CameraUniforms.viewProj + // f32[16..17] viewportPx — CameraUniforms.viewportPx + // f32[18..19] CameraUniforms reserved pad (left zero) + // f32[20..22] camPosWorld — Uniforms.camPosWorld (offset 80) + // f32[23] pxPerRad — Uniforms.pxPerRad (offset 92) + // + // The CameraUniforms reserved pad slots at f32[18..19] MUST stay + // reserved: packing a live field into them would put every later + // member at the wrong offset. `Float32Array` zero-initialises them, + // so we rely on that rather than writing explicit zeros. const uni = new Float32Array(UNIFORM_BYTES / 4); uni.set(viewProj as Float32Array, 0); uni[16] = viewportPx[0]; uni[17] = viewportPx[1]; - // uni[18], uni[19] are the _pad0/_pad1 zero slots (left zero by Float32Array init). 
uni[20] = camPosWorld[0]; // camPosWorld.x at byte offset 80 uni[21] = camPosWorld[1]; uni[22] = camPosWorld[2]; diff --git a/src/services/gpu/shaderCompileLogger.ts b/src/services/gpu/shaderCompileLogger.ts new file mode 100644 index 0000000..303c729 --- /dev/null +++ b/src/services/gpu/shaderCompileLogger.ts @@ -0,0 +1,44 @@ +/** + * Helper for creating a `GPUShaderModule` that logs the linked WGSL + * source alongside any compile-time error in dev mode. + * + * Why this exists: under wesl-plugin's `?static` import, what reaches + * `device.createShaderModule` is a *linked* WGSL string with all WESL + * imports resolved into top-level functions. Chrome's WGSL compiler + * reports error line numbers against THAT linked string, not the + * source `.wesl` modules — so when a compile error fires, the only + * way to map "error at line 142" back to a source file is to read the + * linked WGSL ourselves. + * + * The pattern: gate the dump on `import.meta.env.DEV` so production + * bundles strip the branch and don't ship the shader source twice + * (once as the module, once as a console log). `getCompilationInfo` + * returns a Promise; we don't await it, so module creation stays + * synchronous and the caller can keep building its pipeline. + * + * Until wesl-plugin gains sourcemap support, every renderer should + * route shader-module creation through this helper. Removing it later + * is a one-line edit (drop the wrapper, call createShaderModule + * directly) — keeping it in a single file means there's exactly one + * place to update if upstream changes. 
+ */ +export function createShaderModuleWithDevLog( + device: GPUDevice, + code: string, + label: string, +): GPUShaderModule { + const module = device.createShaderModule({ code, label }); + if (import.meta.env.DEV) { + void module.getCompilationInfo().then((info) => { + if (info.messages.some((m) => m.type === 'error')) { + // eslint-disable-next-line no-console + console.groupCollapsed(`[${label}] linked WGSL (for error line lookup)`); + // eslint-disable-next-line no-console + console.log(code); + // eslint-disable-next-line no-console + console.groupEnd(); + } + }); + } + return module; +} diff --git a/src/services/gpu/shaders/disks.wgsl b/src/services/gpu/shaders/disks.wesl similarity index 59% rename from src/services/gpu/shaders/disks.wgsl rename to src/services/gpu/shaders/disks.wesl index 4b7cba9..dc613bc 100644 --- a/src/services/gpu/shaders/disks.wgsl +++ b/src/services/gpu/shaders/disks.wesl @@ -10,7 +10,7 @@ // // ### Why this approach instead of "always face the camera" // -// The first cut of this shader built a basis from `camPos - center` and +// The first cut of this shader built a basis from 'camPos - center' and // squashed it by axisRatio. That made the disk plane track the camera, // so axisRatio became a 2D screen-space squash — visually identical to // the points-shader's elliptical billboard mask, with no real 3D @@ -54,16 +54,46 @@ // Each corner is placed at center + (corner.x * major + corner.y * // minor_3d) * halfSize, then projected via viewProj. +// CameraUniforms covers the canonical 80-byte prefix shared by every +// world-space renderer (viewProj + viewportPx + 8B pad). The disks +// vertex stage only consults 'viewProj' via 'worldToClip' — the disk +// math is camera-independent by design (see the docblock above) — so +// 'worldToNdc' / 'worldEyeDepth' aren't imported. Restraint matches +// what filaments + quads already adopted in this WESL conversion. 
+import package::lib::camera::CameraUniforms; +import package::lib::camera::worldToClip; +// Shared unit-quad helpers from 'lib/billboard.wesl' — replace the +// inline 'CORNERS' const + '(corner + 1) * 0.5' UV remap that used to +// live below. The orientation-aligned disk-plane basis (PA + +// inclination → 'major' / 'minor_3d' in 3D world space) stays +// renderer-specific; only the vertex-index → corner / UV lookups are +// shared. See the docblock at the top of 'lib/billboard.wesl' for why +// the orientation math is intentionally NOT pulled into this lib. +import package::lib::billboard::quadCorner; +import package::lib::billboard::quadUv; +// Disk-plane axis math (PA + inclination → world-space major/minor basis) +// is shared with 'proceduralDisks.wesl' via 'lib/orientation.wesl'. The +// inline derivation that used to live in this file's vs() body — the +// los/north/east/major/minor chain — is now the lib's 'diskAxes' fn, +// byte-equivalent for disks since the lib standardised on the wider +// '|dot(north, los)| > 0.99' (~8°) pole-fallback threshold that disks +// already used. See 'lib/orientation.wesl' for the camera-independence +// invariant + the pole-degeneracy discussion. +import package::lib::orientation::DiskAxes; +import package::lib::orientation::diskAxes; +// Shared fragment-stage mask shapes — see 'lib/masks.wesl' for the +// rationale (three smoothstep patterns recurred across four shaders, +// naming the shapes makes the intent visible at the call site). +import package::lib::masks::circularMask; +import package::lib::masks::lumAlpha; + struct Uniforms { - viewProj: mat4x4, - viewport: vec2, - _pad0: f32, - _pad1: f32, + cam: CameraUniforms, // camPos is preserved in the layout for ABI continuity with the JS // upload path, but the world-fixed disk math doesn't read it: the // disk's orientation is an intrinsic galaxy property, independent of - // where the camera sits. 
The camera contributes only via viewProj - // (which is also a uniform, see above). + // where the camera sits. The camera contributes only via + // 'cam.viewProj' (consumed by 'worldToClip'). camPos: vec3, _pad2: f32, }; @@ -88,18 +118,15 @@ struct VsOut { @group(0) @binding(1) var atlasTex: texture_2d; @group(0) @binding(2) var atlasSmp: sampler; -const CORNERS = array, 6>( - vec2(-1.0, -1.0), - vec2( 1.0, -1.0), - vec2( 1.0, 1.0), - vec2(-1.0, -1.0), - vec2( 1.0, 1.0), - vec2(-1.0, 1.0), -); - @vertex fn vs(@builtin(vertex_index) vid: u32, instance: InstanceIn) -> VsOut { - let corner = CORNERS[vid]; + // Unit-square corner offset in [-1, +1]² for this triangle-list + // vertex. Pulled from 'lib/billboard::quadCorner' so the (BL, BR, + // TR, BL, TR, TL) ordering is shared across all four billboard + // renderers — see the lib's docblock for the corner-ordering + // discussion. The corner here is in the disk's LOCAL 2D frame; the + // 3D placement happens via the (major, minor_3d) basis below. + let corner = quadCorner(vid); let center = instance.posSize.xyz; let halfSize = instance.posSize.w * 0.5; // Clamp axisRatio so an edge-on disk still produces a thin sliver @@ -109,61 +136,20 @@ fn vs(@builtin(vertex_index) vid: u32, instance: InstanceIn) -> VsOut { let paDeg = instance.orient.y; let paRad = paDeg * 3.14159265 / 180.0; - // ── Line of sight ──────────────────────────────────────────────────── - // - // Earth (the observer) is at world origin in this coordinate system — - // the build pipeline's `raDecZToCartesian` places galaxies relative - // to that point. losDir is therefore the direction from Earth to - // the galaxy. Note: this is NOT the camera direction — the disk's - // orientation must be camera-independent, otherwise orbiting would - // make the disk visibly rotate (which it shouldn't). 
- let losDir = normalize(center); - - // ── Sky tangent basis (north / east at the galaxy's position) ──────── - // - // The celestial north pole is at Dec = +90°, which the build - // pipeline maps to world-space (0, 0, 1). Project that vector onto - // the plane perpendicular to losDir to get the in-sky "north" - // direction at the galaxy's position. - // - // Singularity: when the galaxy is within ~8° of the celestial pole - // (|losDir.z| > 0.99), `northPole - dot(...) * losDir` shrinks to - // near-zero and normalize() amplifies floating-point noise. Fall - // back to seeding with world-y in that case — for the handful of - // real galaxies that close to the pole the resulting PA is still - // well-defined, just measured against a different (consistent) - // reference direction. - let northPole = vec3(0.0, 0.0, 1.0); - let nearPole = abs(dot(northPole, losDir)) > 0.99; - let seed = select(northPole, vec3(0.0, 1.0, 0.0), nearPole); - let north_proj = normalize(seed - dot(seed, losDir) * losDir); - let east_proj = cross(north_proj, losDir); - - // ── Major axis on the sky ──────────────────────────────────────────── - // - // Astronomical PA is measured east of north — increasing PA rotates - // the major axis from north toward east. - let cs = cos(paRad); - let sn = sin(paRad); - let major = north_proj * cs + east_proj * sn; - - // ── Tilt the disk's true minor axis out of the sky plane ───────────── - // - // For a face-on galaxy (axisRatio = 1, inclination i = 0°), the disk - // minor axis lies entirely in the sky plane perpendicular to major. - // For an edge-on galaxy (axisRatio = 0, i = 90°), the disk minor - // axis points along the line of sight. Interpolate using cos(i) = - // axisRatio: - // - // minor_3d = minor_in_sky · cos(i) + losDir · sin(i) + // ── Disk-plane basis (PA + inclination → world-space major / minor) ─ // - // This is the disk's REAL minor axis in 3D. 
When projected onto the - // sky plane, its sky-projection length is cos(i) = axisRatio — which - // matches the observed b/a, by definition. - let minor_in_sky = cross(losDir, major); + // The full derivation — line-of-sight from Earth-at-origin, sky-tangent + // (north, east) frame with pole fallback, PA-east-of-north rotation, + // and inclination tilt of the minor axis out of the sky plane — lives + // in 'lib/orientation.wesl'. See its docblock for the camera- + // independence invariant and the ~8°-from-pole degeneracy fallback. + // 'axisRatio' is clamped at the call site (above) BEFORE we feed it + // through as cosI, because the lib intentionally doesn't re-clamp. let cosI = axisRatio; let sinI = sqrt(max(0.0, 1.0 - cosI * cosI)); - let minor_3d = minor_in_sky * cosI + losDir * sinI; + let axes = diskAxes(center, paRad, cosI, sinI); + let major = axes.major; + let minor_3d = axes.minor; // Place the corner in world space using (major, minor_3d) as the // disk's basis. No squash needed — the basis vectors are already @@ -171,9 +157,13 @@ fn vs(@builtin(vertex_index) vid: u32, instance: InstanceIn) -> VsOut { // automatically when the camera projects them. let world = center + (major * corner.x + minor_3d * corner.y) * halfSize; var out: VsOut; - out.clipPos = u.viewProj * vec4(world, 1.0); + out.clipPos = worldToClip(u.cam, world); - let cornerUv = (corner + vec2(1.0, 1.0)) * 0.5; + // 'quadUv' returns the unit-square corner remapped to [0, 1]² (same + // vertex-index ordering as 'quadCorner' above). We then flip V so + // the texture isn't upside down — matches the atlas's top-down + // 'flipY: false' upload convention used by 'quads.wesl'. 
+ let cornerUv = quadUv(vid); let uvLocal = vec2(cornerUv.x, 1.0 - cornerUv.y); out.atlasUv = mix(instance.uvRect.xy, instance.uvRect.zw, uvLocal); out.cornerUv = cornerUv; @@ -188,14 +178,14 @@ fn fs(in: VsOut) -> @location(0) vec4 { // space, so the on-screen shape is a true ellipse from projection; // the mask just rounds the four corners of the (square) UV space. let r = length(in.cornerUv - vec2(0.5, 0.5)); - let mask = 1.0 - smoothstep(0.45, 0.5, r); + let mask = circularMask(r, 0.45, 0.5); // Brightness-derived alpha — same trick as quads.wgsl, lets the dark // sky in the cutout JPEG bleed transparent against the dot field. let lum = max(rgba.r, max(rgba.g, rgba.b)); - let lumAlpha = smoothstep(0.05, 0.30, lum); - let alpha = lumAlpha * mask * in.fadeAlpha; + let lumGate = lumAlpha(lum, 0.05, 0.30); + let alpha = lumGate * mask * in.fadeAlpha; // Discard near-transparent fragments so we don't waste blend - // bandwidth on near-zero contributions. See `quads.wgsl` for + // bandwidth on near-zero contributions. See 'quads.wgsl' for // the longer note — same reasoning applies here. if (alpha < 0.01) { discard; } return vec4(rgba.rgb * alpha, alpha); diff --git a/src/services/gpu/shaders/filaments.wgsl b/src/services/gpu/shaders/filaments.wesl similarity index 50% rename from src/services/gpu/shaders/filaments.wgsl rename to src/services/gpu/shaders/filaments.wesl index 9e921ad..0a9fcbe 100644 --- a/src/services/gpu/shaders/filaments.wgsl +++ b/src/services/gpu/shaders/filaments.wesl @@ -1,4 +1,4 @@ -// filaments.wgsl — instanced-quad line shader for the cosmic-web skeleton. +// filaments.wesl — instanced-quad line shader for the cosmic-web skeleton. // // One instance per filament SEGMENT (consecutive vertex pair within a // strip). The instance attributes are the segment's two endpoints + @@ -7,8 +7,8 @@ // between the two endpoints). // // Why instanced quads instead of native line topology? 
WebGPU's -// `topology: 'line-list'` always renders 1-pixel-wide lines on most -// platforms (no `setLineWidth` exists, by spec). For visible-from- +// 'topology: 'line-list'' always renders 1-pixel-wide lines on most +// platforms (no 'setLineWidth' exists, by spec). For visible-from- // orbit cosmic-web filaments we want anti-aliased thick lines with a // soft edge falloff — only the instanced-quad trick gives us that. // @@ -18,30 +18,61 @@ // uv.x picks startpoint vs endpoint; uv.y picks one side of the line vs // the other (mapped to ±half-width along the screen-space perpendicular). +// Shared CameraUniforms prefix + projection helpers. See +// 'lib/camera.wesl' for the canonical 80-byte layout (viewProj + +// viewportPx + 8B reserved pad). filaments only needs 'worldToClip' +// because both endpoints are projected the same way; the screen-space +// perpendicular math then operates in NDC and never touches a camera +// position, so 'worldEyeDepth' is intentionally NOT imported here. +import package::lib::camera::{ CameraUniforms, worldToClip }; +import package::lib::cloudFade::CloudUniforms; +import package::lib::cloudFade::applyCloudFade; +// Shared fragment-stage mask shapes — see 'lib/masks.wesl' for the +// rationale (three smoothstep patterns recurred across four shaders, +// naming the shapes makes the intent visible at the call site). +// 'edgeBandMask(axis, fade)' is the two-tailed soft-window pattern +// previously spelled inline as 'smoothstep(0, fade, x) * (1 - +// smoothstep(1-fade, 1, x))' below. +import package::lib::masks::edgeBandMask; + +// Renderer-specific Uniforms struct. +// +// The first 80 bytes are the shared 'CameraUniforms' prefix (viewProj +// at offset 0, viewportPx at offset 64, two reserved-pad f32s at +// 72/76). 
The next two f32 slots at offsets 80 and 84 hold this +// renderer's two scalar parameters; offsets 88..95 are an explicit +// 8-byte tail pad to round the struct up to a 16-byte multiple, which +// matches what WebGPU's uniform-buffer size rounding would do anyway +// but makes the layout grep-able from the JS side. +// +// Total uniform size: 96 bytes (was 80 — the embedded CameraUniforms +// pulls the shared prefix in, and the two scalars now sit AFTER it +// rather than overlapping with the old 'viewport + halfWidth + pad' +// region). The CPU-side uploader in 'filamentRenderer.ts' writes +// halfWidthPx at f32-index 20 (byte 80) and intensityScale at f32- +// index 21 (byte 84) — see UNIFORM_BYTES there. struct Uniforms { - viewProj : mat4x4, - viewport : vec2, // [w, h] in physical pixels - halfWidthPx : f32, // line half-width in pixels + cam : CameraUniforms, + halfWidthPx : f32, // line half-width in pixels (offset 80) // Per-frame uniform scale for the entire filament-pass output, [0..1]. - // Multiplied into the final pre-multiplied colour + alpha. Lives in - // the slot that used to be `pad0` — the byte layout is unchanged. - // Lets the user dim the cosmic-web overlay against the bright HDR - // catalogue when high-σ skeletons (with their longer, denser ridges) - // saturate to flat white under the tone-mapped pass. - intensityScale : f32, + // Multiplied into the final pre-multiplied colour + alpha. Lets the + // user dim the cosmic-web overlay against the bright HDR catalogue + // when high-σ skeletons (with their longer, denser ridges) saturate + // to flat white under the tone-mapped pass. + intensityScale : f32, // offset 84 + _pad0 : f32, // offset 88 — see struct comment above + _pad1 : f32, // offset 92 }; @group(0) @binding(0) var u : Uniforms; -// Per-cloud fade-in (CloudFade — see src/services/gpu/cloudFade.ts). 
One -// f32 opacity, written each frame from the JS side; multiplied into the -// fragment alpha so a freshly-uploaded skeleton glides in over ~600 ms. -struct CloudUniforms { - opacity : f32, - _pad0 : f32, - _pad1 : f32, - _pad2 : f32, -}; +// Per-cloud fade-in (CloudFade — see src/services/gpu/cloudFade.ts). +// 'CloudUniforms' is imported from 'lib/cloudFade.wesl' (shared with +// points.wesl). The CPU-side 'CloudFade' class produces the same +// 16-byte layout for every consumer, so the shared shader struct is +// honest: filaments doesn't read 'sourceCode' today, but the bytes +// are written either way and a future filament feature can opt in +// without renaming. See the lib's docblock for the full rationale. @group(1) @binding(0) var cloud : CloudUniforms; struct PerVertex { @@ -60,9 +91,13 @@ struct VSOut { @vertex fn vs(in : PerVertex) -> VSOut { - // Project both endpoints to clip space. - let aClip = u.viewProj * vec4(in.startPos, 1.0); - let bClip = u.viewProj * vec4(in.endPos, 1.0); + // Project both endpoints to clip space via the shared helper. We + // could call 'worldToNdc' twice and drop the perspective divide + // separately, but we still need 'aClip.w' / 'bClip.w' below to + // restore clip space after the pixel-space offset, so doing the + // divide locally is cheaper than projecting twice. + let aClip = worldToClip(u.cam, in.startPos); + let bClip = worldToClip(u.cam, in.endPos); // Choose which endpoint this corner uses (uv.x = 0 → start, 1 → end). let endpoint = select(aClip, bClip, in.uv.x > 0.5); @@ -77,7 +112,7 @@ fn vs(in : PerVertex) -> VSOut { // pixel width → NDC offset: (px / halfViewport) is the NDC-space length // of one pixel. Multiplied by halfWidthPx gives the half-width in NDC. - let halfWidthNdc = perp * (u.halfWidthPx / (u.viewport * 0.5)); + let halfWidthNdc = perp * (u.halfWidthPx / (u.cam.viewportPx * 0.5)); // uv.y in [0, 1] picks +halfWidth or -halfWidth. 
let sideSign = in.uv.y * 2.0 - 1.0; @@ -100,40 +135,41 @@ fn vs(in : PerVertex) -> VSOut { @fragment fn fs(in : VSOut) -> @location(0) vec4 { // Soft anti-aliased edge: uv.y ∈ [0, 1], peak at 0.5. - // smoothstep(0, 0.1, x) and (1 - smoothstep(0.9, 1, x)) carve a soft - // window around the centre. Multiplied together they give a - // perpendicular-distance falloff that fades to 0 at the line's edges. - let edgeFade = - smoothstep(0.0, 0.1, in.uv.y) * (1.0 - smoothstep(0.9, 1.0, in.uv.y)); + // 'edgeBandMask(axis, fade)' carves a soft window around the centre + // — fade=0.1 gives a 10%-on / 80%-flat / 10%-off shape. The two-tailed + // smoothstep pattern lives in 'lib/masks.wesl' for the longer + // rationale; this is the call site that motivated the helper's + // existence. + let edgeFade = edgeBandMask(in.uv.y, 0.1); // ── Density-aware brightness + tint ────────────────────────────── // // The per-vertex density attribute is min-max-normalised at build - // time (see `skeletonToFilamentCloud` in `tools/parsers/ndskl.ts`), - // so `in.density` ∈ [0, 1] across the whole catalogue: 0 = the + // time (see 'skeletonToFilamentCloud' in 'tools/parsers/ndskl.ts'), + // so 'in.density' ∈ [0, 1] across the whole catalogue: 0 = the // sparsest filament vertex, 1 = the densest. The vertex stage - // already linearly interpolates `startDensity` ↔ `endDensity` along + // already linearly interpolates 'startDensity' ↔ 'endDensity' along // the segment, so within a single filament the value rises smoothly // toward dense hub regions. // // Two simultaneous modulations: // - // * `densityBoost` ramps alpha from a visible floor (0.2) at + // * 'densityBoost' ramps alpha from a visible floor (0.2) at // low-density tendrils to full (1.0) at the brightest spine - // vertices. The `pow(d, 0.6)` gamma-correction stretches the + // vertices. 
The 'pow(d, 0.6)' gamma-correction stretches the // low end of the curve — without it, a near-linear ramp would // crush the dim 0.1–0.4 range to invisibility against the // tone-mapped HDR background. 0.6 is empirical; the eye reads // the resulting falloff as smooth. // - // * `tint` blends from a base soft purple at low density toward a + // * 'tint' blends from a base soft purple at low density toward a // brighter, slightly more white-blue purple at high density. // This adds a second visual axis (hue, not just brightness) so // the cosmic-web spine pops without needing the alpha alone to // carry the contrast. The two endpoints have similar luminance // so the tint shift reads as colour temperature, not glare. // - // Disclaimer: `density` here is the DTFE field value at the vertex, + // Disclaimer: 'density' here is the DTFE field value at the vertex, // NOT the per-filament robustness in σ (which is what DisPerSE's // persistence cut threshold uses). They're correlated — denser // ridges tend to be more persistent — but not identical. See the @@ -147,6 +183,13 @@ fn fs(in : VSOut) -> @location(0) vec4 { let hotTint = vec3(0.85, 0.75, 1.0); // bright, near-white-violet spine let tint = mix(baseTint, hotTint, in.density); - let alpha = edgeFade * 0.6 * densityBoost * u.intensityScale * cloud.opacity; + // Per-cloud fade-in (opacity uniform written each frame from JS). + // 'applyCloudFade' (lib/cloudFade.wesl) is the documented place that + // says 'never multiply opacity into RGB' — it's a scalar helper that + // folds opacity into 'alpha' alongside the other modulators here. 
+ let alpha = applyCloudFade( + edgeFade * 0.6 * densityBoost * u.intensityScale, + cloud.opacity, + ); return vec4(tint * alpha, alpha); // pre-multiplied alpha } diff --git a/src/services/gpu/shaders/lib/billboard.wesl b/src/services/gpu/shaders/lib/billboard.wesl new file mode 100644 index 0000000..c9612f4 --- /dev/null +++ b/src/services/gpu/shaders/lib/billboard.wesl @@ -0,0 +1,167 @@ +// lib/billboard.wesl — unit-quad corner mapping + screen-space billboard +// expansion helpers shared across instanced point/quad renderers. +// +// Four of the project's renderers (points, quads, disks, proceduralDisks) +// each draw one screen- or world-aligned quad per instance using the +// non-indexed 'triangle-list' topology — six vertices per quad, with the +// vertex stage looking up the corner offset from a small constant array +// keyed by '@builtin(vertex_index)'. The corner array was duplicated +// verbatim in three of those four shaders (only points used a slightly +// different ordering); the cornerUv remap '(corner + 1) * 0.5' was +// duplicated in all four. This module collapses that duplication into one +// canonical pair of helpers ('quadCorner', 'quadUv') and adds one +// renderer-agnostic expansion helper ('expandBillboardScreen') for the +// screen-pixel-sized billboard pattern that points uses. +// +// ## Why these three exports and no more +// +// The plan's Task 5 draft proposed a fourth helper, 'expandBillboardWorld', +// that would view-align a world-sized quad (camera-right + camera-up basis +// extracted from a 'view' matrix). Reading the actual call sites revealed +// that NONE of the four candidate renderers want a generic view-aligned +// world basis: +// +// - 'quads' builds its basis from the projected celestial-north +// direction at each galaxy's screen position (so the texture's +// north-up orientation tracks sky-north, not the camera). 
That's +// fundamentally renderer-specific math — see the long doc-block +// around 'NORTH_WORLD' / 'upClip' in 'quads.wesl'. +// - 'disks' and 'proceduralDisks' build their bases from the galaxy's +// intrinsic position-angle and inclination (camera-INDEPENDENT — the +// disk plane is a property of the galaxy in 3D space). That math +// belongs in 'lib/orientation.wesl' (Task 6), not here. +// - 'points' is screen-aligned (corner direction = pure screen X / Y), +// which 'expandBillboardScreen' below covers. +// +// And the canonical CameraUniforms struct in 'lib/camera.wesl' carries +// only 'viewProj' + 'viewportPx' — there is no separate 'view' matrix to +// extract a camera-right / camera-up basis from. Adding one purely to +// satisfy a generic helper that no real renderer would call would bloat +// the shared 80-byte prefix for every renderer's UBO upload, with no +// functional benefit. We therefore omit 'expandBillboardWorld'; if a +// future renderer ever wants the view-aligned pattern, it can re-derive +// the basis from columns of the inverse-viewProj or accept a basis pair +// as parameters. +// +// ## What about the corner-ordering mismatch +// +// The four renderers historically used two slightly different 6-vertex +// orderings of the same unit square (both render an identical filled +// quad under WebGPU's default 'cullMode: none'): +// +// - 'quads' / 'disks' / 'proceduralDisks': +// BL, BR, TR, BL, TR, TL +// - 'points': +// BL, BR, TL, TL, BR, TR +// +// Both are valid CCW-front-face triangulations of the same square; they +// produce identical fragment coverage and identical interpolated UVs at +// every pixel inside the square (linear interpolation across two +// triangles whose union is the same convex region depends only on the +// vertex values at the corners, not on which diagonal splits the square). 
+// We therefore unify on the (BL, BR, TR, BL, TR, TL) ordering used by 3 +// of the 4 callers and migrate 'points' to match — this lets 'points' +// use the same 'quadCorner(vid)' helper as everyone else, eliminating +// the const-array duplication entirely. + +// CameraUniforms carries 'viewportPx', which 'expandBillboardScreen' +// below needs to convert pixel sizes into clip-space offsets. +import package::lib::camera::CameraUniforms; + +// ── quadCorner ─────────────────────────────────────────────────────── +// +// Map 'vid' (the GPU's per-vertex index, 0..5 for a triangle-list quad) +// to the corner offset in [-1, +1]². The table is hard-coded into a +// 'switch' rather than a 'const array' lookup because some WGSL drivers +// generate a uniform-buffer constant-array load for 'array<vec2<f32>, +// N>' even when the index is a literal, whereas a switch resolves to +// pure register moves. Both spellings produce identical results; we +// pick the one with no chance of incurring memory traffic. +// +// Triangle ordering (CCW in y-up screen space): +// triangle 1: vid 0 (BL), vid 1 (BR), vid 2 (TR) +// triangle 2: vid 3 (BL), vid 4 (TR), vid 5 (TL) +// +// The shared diagonal runs BL ↔ TR. Together the two triangles cover the +// whole square, so the choice of diagonal is invisible in the fragment shader. 
+ +fn quadCorner(vid: u32) -> vec2<f32> { + switch vid { + case 0u: { return vec2<f32>(-1.0, -1.0); } // bottom-left + case 1u: { return vec2<f32>( 1.0, -1.0); } // bottom-right + case 2u: { return vec2<f32>( 1.0, 1.0); } // top-right + case 3u: { return vec2<f32>(-1.0, -1.0); } // bottom-left (repeat) + case 4u: { return vec2<f32>( 1.0, 1.0); } // top-right (repeat) + case 5u: { return vec2<f32>(-1.0, 1.0); } // top-left + default: { return vec2<f32>( 0.0, 0.0); } // unreachable + } +} + +// ── quadUv ─────────────────────────────────────────────────────────── +// +// Same vertex-index → corner mapping as 'quadCorner', but remapped from +// [-1, +1]² to [0, 1]² so callers that want a UV-style coordinate (for +// atlas sampling, radial-mask distances from the quad centre, etc.) can +// skip the inline '(corner + 1) * 0.5' remap. +// +// Why a separate helper rather than letting every caller write +// '(quadCorner(vid) + 1) * 0.5'? Three of four billboard renderers +// already do exactly that — extracting the named version turns a +// duplicated 25-character expression into a single function call and +// gives a single rename target if the UV convention ever changes (e.g. if a +// future renderer wants a flipped V). + +fn quadUv(vid: u32) -> vec2<f32> { + return (quadCorner(vid) + vec2<f32>(1.0, 1.0)) * 0.5; +} + +// ── expandBillboardScreen ──────────────────────────────────────────── +// +// Compute the clip-space XY offset that turns a projected billboard +// centre into a 'sizePx'-pixel-radius screen-aligned quad corner. The +// caller passes: +// +// cam — the shared CameraUniforms (for 'viewportPx') +// centerClip — already-projected centre, 'cam.viewProj * vec4<f32>(p, 1.0)' +// sizePx — desired billboard half-extent in screen pixels +// corner — the unit-quad corner from 'quadCorner(vid)' +// +// Returns the clip-space XY delta to ADD to 'centerClip.xy'. 
The full +// corner clip position is then: +// +// vec4<f32>(centerClip.xy + offset, centerClip.z, centerClip.w) +// +// ## Why it returns the offset rather than the full clip-space position +// +// 'points' applies a per-instance 'sizeScale' (8× radius for the +// selection halo) BEFORE the pixel→clip conversion, but AFTER the +// raw billboard sizing. Letting the helper return the delta lets the +// caller do its own scaling on either side of the call without +// having to thread additional parameters through. The three-line +// caller pattern is then: +// +// let corner = quadCorner(vid); +// let offset = expandBillboardScreen(u.cam, centerClip, sizePx, corner); +// out.clip = centerClip + vec4<f32>(offset, 0.0, 0.0); +// +// ## The 'centerClip.w' cancellation +// +// Clip-space X/Y span [-w, +w] before the perspective divide; one +// pixel of screen-space width therefore corresponds to a clip-space +// delta of '2 * centerClip.w / viewportPx.x' (the GPU divides the +// final clip XY by w to get NDC, which spans [-1, +1] across +// 'viewportPx' pixels). Multiplying by 'centerClip.w' cancels the +// perspective divide so the on-screen half-extent of the billboard is +// exactly 'sizePx' pixels regardless of how far the centre is from +// the camera — which is the whole point of a screen-aligned, pixel- +// sized billboard. + +fn expandBillboardScreen( + cam: CameraUniforms, + centerClip: vec4<f32>, + sizePx: f32, + corner: vec2<f32>, +) -> vec2<f32> { + let pxToClip = vec2<f32>(2.0 / cam.viewportPx.x, 2.0 / cam.viewportPx.y); + return corner * sizePx * pxToClip * centerClip.w; +} diff --git a/src/services/gpu/shaders/lib/camera.wesl b/src/services/gpu/shaders/lib/camera.wesl new file mode 100644 index 0000000..ac21b6f --- /dev/null +++ b/src/services/gpu/shaders/lib/camera.wesl @@ -0,0 +1,155 @@ +// lib/camera.wesl — shared camera uniform layout + projection helpers. 
+// +// Every renderer that touches world space ends up writing the same +// half-dozen lines: 'u.viewProj * vec4(p, 1.0)' to land in clip +// space, 'length(u.camPosWorld - p)' for an apparent-size factor, and +// so on. Centralising the uniform-struct prefix and the helper +// functions lets us: +// +// 1. Audit the camera math in one place — when something looks +// foreshortened wrong, there is exactly one definition of +// 'worldToClip' to blame. +// 2. Reuse the same byte layout across renderers, so a future +// "shared per-frame camera UBO bound at @group(2)" refactor (out +// of scope for Task 4 but plausible later) only has to touch the +// CPU-side uploader, not the WGSL. +// 3. Give human reviewers a hook: any new renderer should import +// from here rather than rolling its own viewProj wiring. +// +// ## Why this is so MINIMAL +// +// An earlier draft (see plan 2026-05-07-wesl-conversion.md, Task 4) +// proposed a much wider CameraUniforms with separate 'view' + 'proj' +// matrices, plus 'kPerZ', 'dpr', 'timeSec', etc. Reading the actual +// renderer Uniforms structs revealed: +// +// - None of the seven renderers uses 'view' or 'proj' separately; +// all of them only need the combined 'viewProj'. +// - 'kPerZ' and 'dpr' are not in any uniform struct today. +// - 'cameraPos' (under various spellings: 'camPosWorld', +// 'cameraPosWorld', 'camPos') lives at WILDLY different byte +// offsets across renderers — points puts six other f32/u32 +// fields between 'viewport' and 'camPosWorld'; milkyWayImpostor +// has 'fadeAlpha' + 'iTime' in those same slots; quads/disks +// pad those slots out with explicit '_pad0/_pad1'. Forcing all +// of them onto the same prefix would mean rewriting the +// points/milkyWayImpostor uniform layouts purely to satisfy a +// shared struct, with no functional benefit. 
The plan's +// guardrail ('fields used by only one renderer stay +// renderer-specific') applies a fortiori when the placement of +// a field varies between renderers. +// - 'filaments' has no camera position at all — its quads only +// need 'viewProj' for endpoint projection. +// +// What IS universal across all six camera-using renderers (everything +// except 'toneMap', which is a fullscreen pass without world-space +// math) is the prefix: +// +// offset 0: mat4x4 viewProj (64 B) +// offset 64: vec2 viewportPx (8 B) +// offset 72: f32 _pad0 (4 B; reserves the 16-B-align +// offset 76: f32 _pad1 slot used by every renderer +// between 'viewport' and the +// next vec3-aligned member). +// total: 80 bytes. +// +// 'cameraPos' is therefore NOT in CameraUniforms; the helpers that +// need it ('worldEyeDepth', any future 'pxPerRad'-style scaling) +// take it as an explicit 'vec3' parameter so renderers can +// continue to put it wherever their own layout demands. +// +// ## Byte layout (canonical) +// +// Renderers extending CameraUniforms (e.g. 'struct Uniforms { cam: +// CameraUniforms, brightness: f32, ... }') get: +// +// 0..63: viewProj +// 64..71: viewportPx +// 72..79: _pad0, _pad1 +// 80.. : renderer-specific fields, naturally 16-B aligned. +// +// The two trailing pads are NOT decorative — they reserve the slots +// the existing renderers' Uniforms structs already use to round +// 'viewport + 8B' up to a vec3 alignment boundary at offset 80. +// Without them, a renderer extending 'cam: CameraUniforms' with a +// vec3 next field would silently insert implicit padding that the +// CPU side might not match. Naming the bytes makes the JS-side +// upload obvious and grep-able. 
+ +struct CameraUniforms { + viewProj: mat4x4<f32>, + viewportPx: vec2<f32>, + _pad0: f32, + _pad1: f32, +}; + +// ── worldToClip ───────────────────────────────────────────────────── +// +// The single most-repeated camera operation: take a world-space point +// and produce its homogeneous clip-space coordinate. Every vertex +// stage that places anything by world position calls this once +// (sometimes twice, e.g. filaments projects two endpoints; quads +// projects center + a 'celestial-north' epsilon offset). +// +// Why pass 'cam' rather than capture a free-standing 'u'? WESL has +// no global state — the helper module can't see the renderer's +// '@group(0) @binding(0) var<uniform> u' binding. Taking 'cam' as a +// parameter means the helper is portable to any renderer's binding +// site and to future refactors that put the camera UBO at a +// different group/binding. +// +// Equivalent inline form: 'u.cam.viewProj * vec4(p, 1.0)'. + +fn worldToClip(cam: CameraUniforms, p: vec3<f32>) -> vec4<f32> { + return cam.viewProj * vec4(p, 1.0); +} + +// ── worldToNdc ────────────────────────────────────────────────────── +// +// Perspective-divided 2D position. Used by filaments to compute the +// screen-space tangent + perpendicular for thick-line expansion: it +// projects both segment endpoints, divides by w to get NDC, and +// builds a perpendicular from their delta. The perspective divide +// must happen BEFORE the subtraction (you can't subtract two clip +// vectors with different w and then divide — the result is gibberish +// in NDC). +// +// We split this out as a named helper because every renderer that +// does screen-space-aligned billboard math has historically rolled +// its own 'clip.xy / clip.w' line and called the result something +// different ('aNdc', 'centerNdc', 'upNdc'). One name + one +// definition keeps the post-projection step searchable. +// +// Note we drop the z component: NDC z is the depth used for the +// depth test, not for screen layout.
Callers who need it should use +// 'worldToClip' and divide explicitly. + +fn worldToNdc(cam: CameraUniforms, p: vec3<f32>) -> vec2<f32> { + let clip = cam.viewProj * vec4(p, 1.0); + return clip.xy / clip.w; +} + +// ── worldEyeDepth ─────────────────────────────────────────────────── +// +// Linear distance from the camera to a world-space point — the +// natural "how far away is this galaxy?" scalar that drives +// apparent-size billboards, thumbnail eligibility, and depth-fade +// gates. Linear (not 1/w) because every consumer multiplies it by +// an angular size or feeds it into a smoothstep — both of which +// expect proportional, not perspective-warped, distances. +// +// Why does 'cam' NOT carry the camera position? See the module +// header — cameraPos placement differs across renderer Uniforms +// structs, so the canonical CameraUniforms struct excludes it. The +// caller passes its renderer-specific 'u.camPosWorld' (or whatever +// it spelled the field) as 'eyeWS' here. +// +// Edge case: if the camera is exactly at p, the returned distance is +// 0. The two known consumers (apparent-size + thumbnail gating) +// already apply 'max(d, 0.001)' before dividing, so we don't clamp here — +// callers that need it know better than this generic helper how +// small a floor they want. + +fn worldEyeDepth(eyeWS: vec3<f32>, p: vec3<f32>) -> f32 { + return length(eyeWS - p); +} diff --git a/src/services/gpu/shaders/lib/cloudFade.wesl b/src/services/gpu/shaders/lib/cloudFade.wesl new file mode 100644 index 0000000..92c6e79 --- /dev/null +++ b/src/services/gpu/shaders/lib/cloudFade.wesl @@ -0,0 +1,83 @@ +// lib/cloudFade.wesl — shared per-cloud opacity uniform + fade helper. +// +// Each renderable cloud (per-source point cloud, the cosmic-web filament +// skeleton, future overlays) has a tiny 16-byte uniform buffer at +// '@group(1) @binding(0)' carrying the smoothstep-shaped fade-in +// opacity.
The CPU side that owns the buffer + bind group + per-frame +// 'writeBuffer' lives in 'src/services/gpu/cloudFade.ts'; this lib is +// the GPU half of the same contract. +// +// ## Why a SHARED struct +// +// The CPU-side 'CloudFade' class (services/gpu/cloudFade.ts) emits a +// SINGLE 16-byte layout — 'opacity: f32 + sourceCode: u32 + 8 bytes pad' +// — for every consumer. Both the points renderer and the filaments +// renderer bind buffers built by the same class, and the bytes are +// byte-identical regardless of consumer. A previous revision kept two +// shader-side structs (points named slot 1 'sourceCode', filaments +// named it '_pad0') to express the fact that filaments doesn't read +// sourceCode today. That divergence was fictional: it didn't change a +// single byte of the buffer or its alignment, it just made the shader +// declarations look different from each other while the CPU-side +// producer was identical. Keeping mismatched shader structs over an +// identical CPU layout is the kind of drift that quietly rots — when +// filaments DOES want to opt into sourceCode (e.g. to encode +// per-skeleton identity into a future filament pick output), the +// renaming + type-juggling would have to happen anyway. +// +// So: one struct, exported here, imported by both renderers. Filaments +// simply ignores 'sourceCode' in its fragment stage; nothing prevents +// it from reading the slot later. The pad fields are named '_pad1' / +// '_pad2' to keep them obviously-unused at the call site. +// +// ## Why a fn at all if it's a one-liner +// +// 'color * opacity' would mistakenly attenuate RGB; we only want to +// scale alpha. Newcomers have made this mistake before. The helper is +// intentionally scalar — 'applyCloudFade(alpha, opacity) -> f32' — to +// match how the call sites actually multiply opacity into a scalar +// alpha alongside other modulators ('angWeight', 'depthFade', +// 'pointAlphaMult', 'edgeFade', 'densityBoost', etc.). 
An earlier +// revision used a vec4-in/vec4-out signature; that forced both +// fragment stages to construct a 'vec4(rgb, alpha)', call the helper, +// then re-extract '.rgb' / '.a' for the premultiply step. Pure +// ceremony, no extra safety. The scalar form is honest about what's +// happening: opacity is one of several multiplicative alpha terms, +// and this helper is the documented place that says 'never let +// opacity attenuate RGB'. +// +// Call this BEFORE the premultiplied-alpha output step (i.e. before +// folding 'rgb * alpha' into the return value). Both points.wesl and +// filaments.wesl render with 'alphaMode: premultiplied' on the canvas, +// so each fragment ends with 'vec4(rgb * fadedAlpha, fadedAlpha)'. +// Multiplication is commutative, so factoring opacity into 'alpha' +// before the premultiply is identical to scaling the final pre- +// multiplied colour — but doing it here localises the docblock on +// what the opacity uniform actually means. + +struct CloudUniforms { + // 0 → fully transparent (just uploaded), 1 → fully opaque (steady + // state). Smoothstep-shaped on the CPU side over ~600 ms. + opacity: f32, + + // 5-bit Source enum value for this cloud. Read by the points vertex + // stage to compose '(sourceCode << 27u) | instance_index' for the + // selection-halo + pick-output paths. Filaments doesn't read this + // slot today, but the CPU producer ('CloudFade.writeFrame' in + // services/gpu/cloudFade.ts) writes it for every cloud regardless, + // so the byte is already there for a future filament feature to + // opt into. + sourceCode: u32, + + // Pad to 16-byte WebGPU-minimum uniform-buffer alignment. Never + // written from the CPU side; never read here. + _pad1: f32, + _pad2: f32, +}; + +// The alpha-only variant of cloud fade — never let opacity attenuate +// RGB. Trivially 'alpha * opacity', wrapped so the invariant has a +// single documented home. 
+fn applyCloudFade(alpha: f32, opacity: f32) -> f32 { + return alpha * opacity; +} diff --git a/src/services/gpu/shaders/lib/colorIndex.wesl b/src/services/gpu/shaders/lib/colorIndex.wesl new file mode 100644 index 0000000..1c02f24 --- /dev/null +++ b/src/services/gpu/shaders/lib/colorIndex.wesl @@ -0,0 +1,68 @@ +// lib/colorIndex.wesl — colour-index → RGB mapping shared across renderers. +// +// What lives here: the piecewise 'ramp(t)' that maps an SDSS-style g−r +// colour index to an RGB tint. The same mapping is used by the points +// pass (instanced billboards) and the procedural-disk pass (close-up +// galaxy impostor) so that a galaxy's tint is identical at every LOD — +// any drift between the two would show up as a colour pop the moment +// the disk impostor fades in over the point billboard. +// +// Why a dedicated file rather than folding 'ramp' into 'lib/math.wesl'? +// 'lib/math.wesl' is for context-free numeric primitives (saturate, +// rot2, sabs, polar conversions) — things any renderer might want. +// The colour ramp encodes a domain-specific decision: "this is how +// skymap converts catalogue colour-index to display RGB". That's a +// rendering-policy choice, not a math primitive. Splitting it into its +// own file keeps the import sites self-documenting ('colorIndex::ramp' +// reads as "the colour-index ramp", not "some math helper") and gives +// us a natural home for future colour-related helpers (e.g. an +// ln(L_K)→tint mapping for the procedural-disk bulge if we ever want +// hue to track luminosity). +// +// Why not inline copies in each shader? That's where this code started +// (see the deleted 'Mirror of points.wgsl ramp(t)' comment in +// proceduralDisks.wesl history). Two copies of the same anchor colours +// are an editing trap: a tweak to one without the other silently +// produces visible cross-LOD colour drift. Centralising eliminates the +// hazard. 
+// +// ── the ramp ───────────────────────────────────────────────────────── +// +// Maps SDSS g−r colour index 't' to an RGB tint. Piecewise: +// +// t ≤ 0 → blueWhite blend, parameter clamped at 0 +// (fully blue: hot quasars / O/B stars) +// 0 < t ≤ 1 → blueWhite blend, parameter in (0, 0.5] +// 1 < t ≤ 2 → whiteRed blend, parameter in (0.5, 1] +// t > 2 → fully red (M-type stars, red galaxies) +// +// Both blends share the same 's = saturate(t * 0.5)' parameter, so the +// two halves do NOT meet at white: at t = 1 (s = 0.5) the output steps +// from mid blue-white to mid white-red. Anchor colours: +// +// blue = (0.4, 0.6, 1.0) — hot blue +// white = (1.0, 0.95, 0.8) — warm white (shared anchor) +// red = (1.0, 0.5, 0.3) — cool red +// +// WGSL 'select(a, b, cond)' note: returns 'a' when cond is FALSE, 'b' +// when cond is TRUE — the reverse of the C-style ternary 'cond ? b : a' +// argument order. So 'select(blueWhite, whiteRed, t > 1.0)' returns +// blueWhite for t ≤ 1 and whiteRed for t > 1. Easy to invert; worth +// the comment. + +import package::lib::math::saturate; + +fn ramp(t: f32) -> vec3<f32> { + // s goes 0→1 as t goes 0→2; saturate stops it at 0 for negatives and 1 for t>2. + let s = saturate(t * 0.5); + + // Blue-to-white: hot blue (quasars, O/B stars) fading to a warm white. + let blueWhite = mix(vec3(0.4, 0.6, 1.0), vec3(1.0, 0.95, 0.8), s); + + // White-to-red: warm white fading to cool red (M-type stars, red galaxies). + let whiteRed = mix(vec3(1.0, 0.95, 0.8), vec3(1.0, 0.5, 0.3), s); + + // Pick the right half of the ramp: blue-white for t ≤ 1, white-red for t > 1. + // Remember: select(falseVal, trueVal, condition). + return select(blueWhite, whiteRed, t > 1.0); +} diff --git a/src/services/gpu/shaders/lib/masks.wesl b/src/services/gpu/shaders/lib/masks.wesl new file mode 100644 index 0000000..923bc74 --- /dev/null +++ b/src/services/gpu/shaders/lib/masks.wesl @@ -0,0 +1,87 @@ +// lib/masks.wesl — common fragment-stage mask shapes.
+// +// Three smoothstep patterns recurred across four fragment shaders +// (disks, quads, proceduralDisks, filaments), each with subtly different +// parameter ordering: 'smoothstep(inner, outer, r)' inverted via +// '1.0 - ...' for circular cutoffs in some files but spelled +// 'smoothstep(outer, inner, r)' (relying on smoothstep's symmetry under +// edge swap) in others; the same 'smoothstep(lo, hi, lum)' for +// luminance keying showing up at two call sites with identical +// thresholds; and the two-tailed 'smoothstep(0, fade, x) * (1 - +// smoothstep(1-fade, 1, x))' edge-band mask carved by hand in +// filaments. That subtle parameter-ordering variation made the +// copy-paste-bug risk real — a future renderer copying the wrong +// nearby form would silently get an inverted mask, and the visual +// regression would only show up when a galaxy happened to have +// 'r > outer' for the inverted-edges case. +// +// Naming the shapes makes the intent visible at the call site: you +// read 'circularMask(r, 0.4, 0.5)' and immediately know it's a soft +// disk edge (zero outside, one inside the inner radius), not a +// luminance gate or an edge-band falloff. The parameter order is +// fixed once here — '(value, inner, outer)' for circular, '(value, +// lo, hi)' for luminance, '(axis, fade)' for edge-band — and every +// caller is forced into that ordering, so the previous +// "is it 0.45 first or 0.5 first?" cognitive load disappears. +// +// Why these shapes live alongside 'lib/math' rather than in any +// renderer-specific module: they're pure mathematical patterns — +// no astronomy, no rendering state, no per-galaxy data — that any +// shader carving a soft alpha mask might want. 'lib/math' itself +// stays scoped to scalar primitives (saturate, rot2, sabs, polar +// conversions); the mask shapes are higher-level (they compose +// smoothstep) so they earn their own file rather than crowding the +// math module's docblock. 
+ +// ── circularMask ───────────────────────────────────────────────────── +// +// Soft disk-edge mask. Returns 1.0 where r <= inner, 0.0 where +// r >= outer, and a smoothstep ramp between. The '1.0 - smoothstep' +// inversion is folded in: callers pass the inner / outer radii in +// natural order ('inner < outer') and don't have to remember to flip +// the smoothstep edges to get the high-inside-low-outside shape. +// +// The edge-swapped form 'smoothstep(outer, inner, r)' evaluates to +// the same values under the Hermite formula: swapping the edges maps +// the interpolant t to 1 - t, and the polynomial satisfies +// h(1 - t) = 1 - h(t). Using the +// '1.0 - smoothstep(inner, outer, r)' form keeps the algebraic +// intent explicit and matches the form most call sites used before +// extraction. + +fn circularMask(r: f32, inner: f32, outer: f32) -> f32 { + return 1.0 - smoothstep(inner, outer, r); +} + +// ── lumAlpha ───────────────────────────────────────────────────────── +// +// Luminance gate for sky-subtraction-lite. Used by texture-sampling +// fragment stages (disks, quads) to drop near-black sky pixels in +// SDSS / DSS thumbnail JPEGs that ship with no alpha channel: passing +// 'max(rgba.r, max(rgba.g, rgba.b))' through this gate maps near-zero +// luminance to fully transparent and bright galaxy pixels to fully +// opaque. See the longer note in 'quads.wesl' for the +// project_thumbnail_quality.md context. + +fn lumAlpha(lum: f32, lo: f32, hi: f32) -> f32 { + return smoothstep(lo, hi, lum); +} + +// ── edgeBandMask ───────────────────────────────────────────────────── +// +// Two-tailed edge-band mask along a [0, 1] axis. Fades in from 0 over +// the first 'fade' units, stays at 1.0 in the middle band, and fades +// out to 0 over the last 'fade' units before 1.
Used by filaments to +// shape the perpendicular falloff of each line segment so the +// instanced-quad rendering ends with a soft anti-aliased edge instead +// of a hard pixel cliff. +// +// Caller passes 'axis' in [0, 1] (typically a quad-corner uv +// component); 'fade' controls how wide each tail is. fade = 0.1 +// produces a 10%-on / 80%-flat / 10%-off shape; fade = 0.5 collapses +// the flat middle and gives a triangular-ish ramp; fade > 0.5 makes +// the tails overlap (the mask never reaches 1.0) and is unsupported. + +fn edgeBandMask(axis: f32, fade: f32) -> f32 { + return smoothstep(0.0, fade, axis) * (1.0 - smoothstep(1.0 - fade, 1.0, axis)); +} diff --git a/src/services/gpu/shaders/lib/math.wesl b/src/services/gpu/shaders/lib/math.wesl new file mode 100644 index 0000000..d6f5b46 --- /dev/null +++ b/src/services/gpu/shaders/lib/math.wesl @@ -0,0 +1,124 @@ +// lib/math.wesl — small math primitives shared across the renderer. +// +// What lives here: scalar constants and pure functions that are used +// by more than one shader (or are likely to be), are short enough that +// a per-fn file would be more ceremony than signal, and don't have +// renderer-specific dependencies. The grouping is "math primitives"; +// it's not a kitchen sink — anything with a richer API (camera, +// billboard, color ramp) lives in its own file. +// +// Why one file rather than one-fn-per-file? An earlier draft of this +// module split each function into its own file under 'lib/math/'. +// WESL's import-resolution model treats the last segment of an +// import path as the function name and the rest as the module path, +// so 'import package::lib::math::saturate;' looks for a function +// 'saturate' inside a module at 'lib/math.wesl' — not at +// 'lib/math/saturate.wesl'. To preserve one-fn-per-file we'd have to +// write 'import package::lib::math::saturate::saturate;' (with the +// duplicated leaf), which is verbose noise.
Consolidating into one +// module keeps the import sites short and matches the WESL idiom. +// +// Each primitive keeps its own comment block; reading top to bottom +// here is equivalent to reading the previous per-file docblocks in +// sequence. + +// ── constants ──────────────────────────────────────────────────────── +// +// Pulled out of points.wesl + milkyWayImpostor.wesl, which had +// hand-typed '3.14159...' and '2.30258...' literals. Centralising +// gives us one place to bump precision if compute shaders ever need +// f64-equivalent constants. + +const PI: f32 = 3.14159265358979; +const TAU: f32 = 6.28318530717958; +const LOG10: f32 = 2.30258509299404; // ln(10), for converting between log10 and ln + +// ── saturate ───────────────────────────────────────────────────────── +// +// HLSL names 'clamp(x, 0.0, 1.0)' 'saturate'; WGSL also predeclares a +// 'saturate' builtin with that definition, which this f32 declaration shadows. The pattern recurs ~20× across +// the engine's shaders. Keeping it as a named lib primitive gives us: +// +// 1. A grep target — every "snap into [0, 1]" site is now +// 'saturate(...)' and trivially auditable. +// 2. Portability — the body matches the builtin's definition, so +// dropping this wrapper later leaves call sites untouched. +// 3. Reading clarity — 'saturate' encodes intent ("force into the +// visible alpha range") whereas 'clamp(x, 0, 1)' encodes only +// mechanism. + +fn saturate(x: f32) -> f32 { + return clamp(x, 0.0, 1.0); +} + +// ── rot2 ───────────────────────────────────────────────────────────── +// +// 2D rotation of a point around the origin. Pulled from +// milkyWayImpostor.wesl's inline 'rot()'. Renamed to 'rot2' so the +// bare name 'rot' is free for a future 'rot3' (axis-angle 3D rotation +// for camera-frame helpers); the "2" tag matches the WGSL habit of +// suffixing a dimensional hint to a name (cf. 'vec2', +// 'mat2x2').
+// +// Returned as a fresh vec2 (no in-place mutation, unlike the +// ShaderToy GLSL original which used 'inout vec2 p') so it composes +// cleanly inside expressions: 'sabs(rot2(p, t).x, 0.1)' just works. + +fn rot2(p: vec2<f32>, a: f32) -> vec2<f32> { + let c = cos(a); + let s = sin(a); + return vec2(c * p.x + s * p.y, -s * p.x + c * p.y); +} + +// ── sabs (smooth absolute value) ───────────────────────────────────── +// +// 'sabs(x, k)' approximates 'abs(x)' but is C¹-continuous at x = 0. +// The classic 'abs' has a kink at the origin; any analytic process +// that takes derivatives across that kink (height-field normals, +// implicit-surface gradients, anti-aliased silhouettes) ends up with +// a discontinuous result that visually shows as a sharp seam. +// 'sabs' replaces the kink with a smooth parabolic blend whose +// "knee" width is controlled by 'k'. +// +// Form: linear (= 'abs(x)') when '|x| >= k', parabolic with the same +// value + slope at the join when '|x| < k'. Larger 'k' → wider soft +// region → smoother but more rounded. k → 0 recovers exact 'abs(x)'. +// +// Implementation note: this is the GLSL macro +// LESS((.5/k)*x*x + k*.5, abs(x), abs(x) - k) +// where 'LESS(a, b, c) = mix(a, b, step(0, c))', i.e. 'c >= 0 ? b : a'. +// WGSL's 'select(a, b, cond)' inverts the C-style ternary operand +// order: 'select(a, b, cond) = cond ? b : a', which matches GLSL +// 'LESS' exactly. + +fn sabs(x: f32, k: f32) -> f32 { + let a = (0.5 / k) * x * x + k * 0.5; + let ax = abs(x); + return select(a, ax, ax >= k); +} + +// ── toPolar / toRect ───────────────────────────────────────────────── +// +// Cartesian (x, y) ↔ polar (r, θ). Pulled out of milkyWayImpostor +// where the 'twirl' arm distortion and the 'stars' density warp both +// round-trip points through polar coords to apply rotation-symmetric +// transformations (multiply θ by an angular frequency, scale r +// non-linearly, etc.).
+// +// Convention for both: vec2(r, theta) with theta in radians, range +// (-PI, PI] (the WGSL 'atan2' contract). 'toPolar' and 'toRect' are +// inverses for any finite vec2 input where length(p) > 0. +// +// Edge case: 'atan2(0, 0)' is implementation-defined in WGSL — most +// hardware returns 0 but a few return NaN. Callers that pass a +// possibly-zero point should guard upstream; we don't guard here +// because the per-call branch cost is wasted in the common case +// where the caller already knows p is non-zero. + +fn toPolar(p: vec2<f32>) -> vec2<f32> { + return vec2(length(p), atan2(p.y, p.x)); +} + +fn toRect(p: vec2<f32>) -> vec2<f32> { + return p.x * vec2(cos(p.y), sin(p.y)); +} diff --git a/src/services/gpu/shaders/lib/orientation.wesl b/src/services/gpu/shaders/lib/orientation.wesl new file mode 100644 index 0000000..234f7d0 --- /dev/null +++ b/src/services/gpu/shaders/lib/orientation.wesl @@ -0,0 +1,172 @@ +// lib/orientation.wesl — disk-plane axis math from on-sky position angle +// + inclination, shared between 'disks.wesl' (textured thumbnails) and +// 'proceduralDisks.wesl' (impostor brightness profile). +// +// Both renderers draw a galaxy as a 3D-oriented quad whose in-plane +// basis encodes the galaxy's intrinsic orientation in world space. That +// orientation is derived from two scalar inputs the catalog provides: +// +// - PA (position angle, east of north on the sky) → rotates the major +// axis around the line-of-sight. +// - axisRatio b/a → cos(i), where i is the disk's inclination relative +// to the sky plane. Face-on (i = 0°) → axisRatio = 1; edge-on +// (i = 90°) → axisRatio = 0. +// +// CRITICAL: this math is camera-INDEPENDENT. The disk's orientation is +// a property of the galaxy in 3D space, NOT of where the camera +// currently sits.
An earlier revision of disks.wesl (now its long +// header comment) built the basis from 'camPos - center', which made +// orbiting the camera visibly rotate the disk plane — exactly the bug +// world-space orientation was rewritten to fix. Do NOT add a +// 'cameraPos' parameter to anything in this file; if you find yourself +// reaching for one, re-read disks.wesl's header. The plan for this +// task initially proposed 'diskAxes(posWS, cameraPos, ...)'; the +// 'cameraPos' was dropped after grepping the call sites confirmed +// neither consumer reads it. +// +// We also intentionally do NOT 'import package::lib::camera::CameraUniforms' +// here. orientation.wesl pre-dates the per-frame camera in the GPU's +// dependency graph: the disk basis can be computed before the view +// matrix is even uploaded. Keeping the dependency direction one-way +// (renderers import from camera + orientation; orientation imports from +// nothing) means a future "shared per-frame camera UBO" refactor can +// touch lib/camera.wesl without rippling into orientation. +// +// ## API +// +// 'diskAxes(posWS, paRad, cosI, sinI) -> DiskAxes' returns the disk's +// major and minor axes as unit-length, orthogonal vec3s in world +// coords. Each consumer's vertex stage then places a unit-square corner +// at 'center + (corner.x * axes.major + corner.y * axes.minor) * halfSize'. +// +// Why split 'cosI' / 'sinI' across two parameters rather than passing +// 'axisRatio' and computing them inside? The 'sinI = sqrt(max(0.0, +// 1.0 - cosI*cosI))' line is one statement and the consumers already +// have the convention of clamping 'axisRatio' to a 0.05 floor before +// the trig (see disks.wesl line 120 + proceduralDisks.wesl line 119). +// Doing the clamp inside this lib would either silently re-clamp a +// value the caller already clamped, or it would require a second +// "raw vs clamped" parameter — both are noisier than just letting the +// caller compute the trig pair. 
The function reads as pure 3D +// geometry: PA + (cos(i), sin(i)) → (major, minor). +// +// ## Frame construction (right-handed, world-fixed) +// +// 1. losDir = normalize(posWS) +// Earth (the observer) is at world origin; losDir is the +// direction from Earth to the galaxy. +// 2. north_proj = normalize(seed - dot(seed, losDir) * losDir) +// Project the celestial north pole vector onto the sky tangent +// plane at the galaxy's position. Falls back to (0, 1, 0) only +// when the projection is numerically degenerate (length < 1e-4) +// — see "## Pole degeneracy" below. +// 3. east_proj = cross(north_proj, losDir) +// Right-handed 3-axis at the galaxy: (north_proj, east_proj, +// losDir). PA-east-of-north therefore agrees with the +// astronomical convention. +// 4. major = north_proj * cos(PA) + east_proj * sin(PA) +// 5. minor_in_sky = cross(losDir, major) +// 6. minor = minor_in_sky * cosI + losDir * sinI +// Tilts the disk's true minor axis out of the sky plane by +// inclination i. +// +// Face-on (cosI = 1, sinI = 0) → minor lies entirely in the sky plane → +// disk projects as a circle. Edge-on (cosI → 0, sinI → 1) → minor +// approaches losDir → disk is parallel to the line of sight and +// projects as a thin streak along the major axis. +// +// ## Pole degeneracy +// +// When the galaxy is within ~8° of the celestial pole, the seed +// vector (celestial north) becomes nearly parallel to losDir, so +// 'seed - dot(seed, losDir) * losDir' shrinks toward zero and a +// downstream normalize() amplifies floating-point noise. Two slightly +// different threshold conventions existed pre-extraction: +// +// - disks.wesl: 'abs(dot(northPole, losDir)) > 0.99' (~8° from pole), +// swap seed BEFORE the projection — wide, conservative. +// - proceduralDisks.wesl: 'length(northTangentRaw) < 1e-4' (~exactly +// at pole), swap result AFTER the projection — tight, only fires +// when the projection itself is degenerate.
+// +// We standardise on the proceduralDisks form (tight, post-projection +// length check). Most galaxies within 8° of the celestial pole still +// produce a usable in-sky north tangent — float math near the pole is +// fine until length(seed - dot(seed, los) * los) genuinely +// underflows, which only happens when |dot| is essentially 1. +// Falling back at the wider 0.99 threshold throws away real PA +// information for ~8° of sky around each pole; falling back only +// when the projection is numerically zero preserves PA fidelity for +// the handful of catalog galaxies that actually live up there. Both +// renderers fall back to seeding with world-y, so they still agree +// at the (much narrower) genuine-degeneracy region. +// +// ## Edge cases +// +// - axisRatio → 0 (edge-on): the consumers clamp to 0.05 BEFORE the +// trig pair so the quad doesn't collapse to a 1D line in the +// vertex stage. We don't re-clamp here. +// - posWS = (0, 0, 0) (galaxy literally at Earth): normalize() of +// the zero vector is implementation-defined. Real catalogs never +// emit such a row — even the closest galaxies are at non-zero +// distance — so we don't guard against it. + +struct DiskAxes { + // Unit-length, orthogonal world-space basis vectors. The disk plane + // is span(major, minor); the disk normal is 'cross(major, minor)' + // (which equals 'losDir * cosI - minor_in_sky * sinI', i.e. the + // line-of-sight direction tilted by 'i' away from the observer). + major: vec3<f32>, + minor: vec3<f32>, +}; + +fn diskAxes(posWS: vec3<f32>, paRad: f32, cosI: f32, sinI: f32) -> DiskAxes { + // ── Line of sight ────────────────────────────────────────────────── + let losDir = normalize(posWS); + + // ── Sky-tangent basis (north / east at the galaxy's position) ────── + // + // The celestial north pole lives at +Z in world coords (raDecZToCartesian + // maps Dec = +90° to (0, 0, 1)).
Project that onto the plane + // perpendicular to losDir to recover sky-north, with the world-Y + // fallback only when the projection is numerically degenerate (i.e. + // the galaxy sits essentially AT the celestial pole). See "## Pole + // degeneracy" above for why we use the tight post-projection length + // check rather than the wider pre-projection dot-product check. + let northPole = vec3(0.0, 0.0, 1.0); + let northRaw = northPole - dot(northPole, losDir) * losDir; + let northLen = length(northRaw); + let northProj = select( + northRaw / northLen, + vec3(0.0, 1.0, 0.0), + northLen < 1e-4, + ); + // 'cross(north, los)' (rather than 'cross(los, north)') keeps the + // (north, east, los) frame right-handed in the same sense the + // astronomical PA-east-of-north convention expects. Reversing the + // argument order would flip the sign of any non-zero PA's rotation. + let eastProj = cross(northProj, losDir); + + // ── Major axis on sky (PA-east-of-north rotation) ────────────────── + let major = northProj * cos(paRad) + eastProj * sin(paRad); + + // ── Minor axis tilted out of the sky plane by inclination ────────── + // + // 'minorInSky' is the in-sky perpendicular to the major axis. The + // tilt formula 'minorInSky * cosI + losDir * sinI' rotates that + // perpendicular AROUND the major axis by angle i. An equivalent + // route — build the disk normal first, then take 'cross(normal, + // major)' — flips the sign of the 'sinI * los' term (because + // 'cross(minorInSky, major) = -losDir' in this right-handed frame), + // tilting the disk in the OPPOSITE direction. At cosI ≈ 0.87 (i ≈ + // 30°) that sign flip shows as a visible ~30° rotation — historically + // a real bug in proceduralDisks.wesl that the inline derivation + // pre-extraction was carefully written to avoid. Don't reintroduce. 
+ let minorInSky = cross(losDir, major); + let minor = minorInSky * cosI + losDir * sinI; + + var axes: DiskAxes; + axes.major = major; + axes.minor = minor; + return axes; +} diff --git a/src/services/gpu/shaders/milkyWayImpostor.wgsl b/src/services/gpu/shaders/milkyWayImpostor.wesl similarity index 77% rename from src/services/gpu/shaders/milkyWayImpostor.wgsl rename to src/services/gpu/shaders/milkyWayImpostor.wesl index 29deb07..97448ad 100644 --- a/src/services/gpu/shaders/milkyWayImpostor.wgsl +++ b/src/services/gpu/shaders/milkyWayImpostor.wesl @@ -39,16 +39,16 @@ // Every other pass in this engine writes linear-light into the rgba16f // HDR target and the tone-map pass downstream applies the curve + // exposure + (sRGB conversion via swap-chain format). The original -// ShaderToy applied display-space gamma (`pow(col, 0.75)`), a contrast -// S-curve, a saturation pump, and a vignette in its `postProcess` +// ShaderToy applied display-space gamma ('pow(col, 0.75)'), a contrast +// S-curve, a saturation pump, and a vignette in its 'postProcess' // function — all of which are display-space operations that would // double-up with the engine's tone-map pass and produce muddy crushed // blacks. Those four operations are DELETED, not ported. // // ── Coordinate convention inside the fragment stage // -// The fragment receives `uv` in `[-1.05, 1.05]²` (the 5% bleed -// margin). We feed `uv` directly into the ShaderToy's `mainImage` +// The fragment receives 'uv' in '[-1.05, 1.05]²' (the 5% bleed +// margin). 
We feed 'uv' directly into the ShaderToy's 'mainImage' // equivalent as the "p" vector after aspect-ratio normalisation — // since the vertex stage already pre-stretches the quad in clip-space // to compensate for non-square viewports, the fragment shader sees a @@ -56,48 +56,114 @@ // // ── ShaderToy → WGSL specific notes // -// - GLSL `inout` parameters in `mod2(inout vec2 p, ...)` and -// `rot(inout vec2 p, ...)` become value-returning helpers that -// return the modified value (and a struct for `mod2`'s two-output +// - GLSL 'inout' parameters in 'mod2(inout vec2 p, ...)' and +// 'rot(inout vec2 p, ...)' become value-returning helpers that +// return the modified value (and a struct for 'mod2''s two-output // case). -// - The two `galaxy()` overloads (one taking `(vec2 p, float a, -// float z)` for the noise hatching, one taking `(vec2 p, vec3 ro, -// vec3 rd, float d)` for the full disk shading) collide in WGSL +// - The two 'galaxy()' overloads (one taking '(vec2 p, float a, +// float z)' for the noise hatching, one taking '(vec2 p, vec3 ro, +// vec3 rd, float d)' for the full disk shading) collide in WGSL // which has no overloading. We rename the four-arg overload to -// `shadeGalaxyDisk` and keep the three-arg one as `galaxy`. -// - `for (int i = 0; i < 11; ++i)` becomes `for (var i: i32 = 0; i -// < 11; i = i + 1)`. +// 'shadeGalaxyDisk' and keep the three-arg one as 'galaxy'. +// - 'for (int i = 0; i < 11; ++i)' becomes 'for (var i: i32 = 0; i +// < 11; i = i + 1)'. + +// ── Uniforms layout ───────────────────────────────────────────────── +// +// The first 80 bytes are the shared 'CameraUniforms' prefix from +// 'lib/camera.wesl' (viewProj + viewportPx + two reserved pad slots). +// Embedding it as a named member 'cam' rather than inlining the +// individual fields makes the cross-renderer audit story explicit: +// every renderer's first uniform field is now 'cam: CameraUniforms', +// and every projection call goes through 'worldToClip(u.cam, p)'. 
+// +// ## Why the field order changed +// +// The pre-WESL-conversion layout had this shape: +// +// offset 0 : viewProj (mat4) +// offset 64 : viewport (vec2) +// offset 72 : fadeAlpha (f32) <─── overlaps CameraUniforms +// offset 76 : iTime (f32) <─── ._pad0/._pad1 slots +// offset 80 : cameraPosWorld (vec3) +// offset 92 : _pad (f32) +// total: 96 bytes +// +// 'CameraUniforms' explicitly reserves bytes 72..79 as '_pad0/_pad1' +// (so renderers extending it can place their first vec3-aligned field +// at offset 80). The old impostor layout filled those bytes with +// 'fadeAlpha' + 'iTime' — which is fine for a hand-rolled struct but +// incompatible with embedding 'cam: CameraUniforms' as the first +// field. We can't drop CameraUniforms in as a pure prefix without +// moving the renderer-specific scalars somewhere else. +// +// Resolution: put 'cam: CameraUniforms' first (occupies 0..79), then +// pack the renderer-specific fields after. 'cameraPosWorld' goes at +// offset 80 (already 16-byte aligned, which vec3 requires), and the +// two f32 scalars 'fadeAlpha' + 'iTime' fall in naturally at 92 / 96. +// The trailing pad lands at 100..111 to round the struct size up to +// 112 bytes (a multiple of 16, the struct alignment for vec3-bearing +// types). +// +// ## New byte layout +// +// offset 0 : cam: CameraUniforms (80 B; viewProj + viewportPx +// + _pad0 + _pad1) +// offset 80 : cameraPosWorld (vec3; 12 B) +// offset 92 : fadeAlpha (f32; 4 B) +// offset 96 : iTime (f32; 4 B) +// offset 100 : _pad (12 B; round up to 112) +// total: 112 bytes +// +// CPU-side counterpart in 'milkyWayRenderer.ts' must write the +// floats at the matching f32 indices — see that file's 'draw' method +// for the new offset table. + +import package::lib::camera::{ CameraUniforms, worldToClip }; +// Math primitives must be imported at the top of the file alongside the +// camera import. 
wesl-plugin@0.6.74's linker only resolves imports it +// finds before code-emitting begins; if the imports sit further down +// (next to the call sites) the linker emits the source verbatim, the +// 'import' keyword reaches Chrome's WGSL parser, and the whole module +// fails to compile. Verified empirically — moving these four imports +// here fixed a regression where milkyWay's pipeline was invalidated as +// soon as fadeAlpha rose above zero (close-zoom = black screen). +import package::lib::math::rot2; +import package::lib::math::sabs; +import package::lib::math::toPolar; +import package::lib::math::toRect; struct Uniforms { - // World-space view-projection matrix. The vertex stage uses this to - // place the world-anchored impostor quad correctly in clip space — - // the impostor is centred at the world origin (Earth/Sun position - // in the catalogue's coordinate system) and its angular size scales - // naturally as the user moves the camera closer / further. - viewProj: mat4x4, - // Viewport (px) — UNUSED. Kept for ABI symmetry with the other GPU - // passes that all use it for pxPerRad-style derivations. - viewport: vec2, - // Distance-fade alpha pre-computed on the CPU - // (`utils/math/milkyWayFade.ts`). Multiplied into the fragment's - // emissive output and into alpha for premultiplied blend. - fadeAlpha: f32, - // iTime in seconds, scaled by 0.25 on the CPU before upload so the - // ShaderToy's internal `TIME = iTime*0.1` works out to a slow, - // alive-but-not-spinning rotation. - iTime: f32, + // Shared camera prefix — viewProj for clip-space projection plus + // the viewportPx slot every other renderer uses for pxPerRad math. + // The impostor doesn't actually read viewportPx (its fragment + // shader works in local UV directly); the field is present for + // ABI symmetry with the rest of the engine. + cam: CameraUniforms, // World-space camera position (Mpc). 
Used by the vertex stage to // build the view-aligned billboard basis (the impostor always faces // the camera, so the user never sees its rectangular edge), and by // the fragment stage to drive the ShaderToy's synthetic camera — // transformed into the galactic frame and divided by the Milky Way's - // physical half-extent, it becomes the `ro` parameter to the existing + // physical half-extent, it becomes the 'ro' parameter to the existing // raymarched render logic. As the user orbits the world origin, the // shader sees a corresponding rotation of its synthetic camera, so // the rendered spiral appears from different angles instead of // staying frozen in the original ShaderToy's hard-coded vantage. cameraPosWorld: vec3<f32>, - _pad: f32, + // Distance-fade alpha pre-computed on the CPU + // ('utils/math/milkyWayFade.ts'). Multiplied into the fragment's + // emissive output and into alpha for premultiplied blend. + fadeAlpha: f32, + // iTime in seconds, scaled by 0.25 on the CPU before upload so the + // ShaderToy's internal 'TIME = iTime*0.1' works out to a slow, + // alive-but-not-spinning rotation. Currently unread by the + // fragment stage (animation is locked off — see 'fs' below) but + // retained for ABI symmetry and future re-enablement. + iTime: f32, + _pad0: f32, + _pad1: f32, + _pad2: f32, }; @group(0) @binding(0) var<uniform> u: Uniforms; @@ -113,7 +179,7 @@ struct VsOut { // World-space position of this fragment's corresponding vertex. // The fragment stage interpolates this across the quad and // reconstructs the per-pixel world-space ray as - // `normalize(worldPos - cameraPosWorld)` — the actual ray that hits + // 'normalize(worldPos - cameraPosWorld)' — the actual ray that hits // this point on the impostor from the user's viewpoint. @location(1) worldPos: vec3<f32>, }; @@ -162,9 +228,9 @@ const CORNERS = array<vec2<f32>, 6>( // matching the 4× padding // convention used by every // other galaxy -// (`sizeWorld = diameterKpc * -// 4 / 1000` — see -// `points.wgsl:864`).
+// ('sizeWorld = diameterKpc * +// 4 / 1000' — see +// 'points.wgsl:864'). // // Together these scale the whole rendered Milky Way uniformly to ~2× // the previous size, putting it on equal visual footing with the @@ -201,7 +267,7 @@ fn worldToGalactic(v: vec3) -> vec3 { // Convert a galactic-frame vector (X=GC, Y=rotation, Z=NGP) into the // shader's local frame, where the disk lies in the y=0 plane and y is -// the disk normal. The original ShaderToy uses `(0.0 - ro.y)/rd.y` +// the disk normal. The original ShaderToy uses '(0.0 - ro.y)/rd.y' // as the disk-plane intersection, so its Y axis must be the disk // normal — which is the galactic Z (NGP) direction. // @@ -214,11 +280,11 @@ fn galacticToShader(g: vec3) -> vec3 { // ── Volumetric raymarch tunables (bulge + disc halo) ──────────────── // -// Both extra contributions in `renderGalaxy` (the central spherical +// Both extra contributions in 'renderGalaxy' (the central spherical // bulge and the thin-disc halo that fills in the inter-arm regions) // are short ray-marches of an analytical density profile. Pulling // the parameters into module-scope constants matches the convention -// used by `proceduralDisks.wgsl` and makes the visual knobs easy to +// used by 'proceduralDisks.wgsl' and makes the visual knobs easy to // tweak without diving into the loop body. // // Sampling cost is fixed: BULGE_STEPS + DISC_HALO_STEPS exp() calls @@ -273,7 +339,7 @@ fn vs(@builtin(vertex_index) vid: u32) -> VsOut { // perpendicular to the view direction so the user never sees its // rectangular edge, and size each corner offset by the Milky Way's // physical half-extent. Result: the impostor's angular size on - // screen scales as `2 * atan(halfExtent / cameraDistance)` — full + // screen scales as '2 * atan(halfExtent / cameraDistance)' — full // screen when the camera is right next to the origin, vanishing to // a dot when the camera is far away. 
This is the "right physical // size" the previous all-clip-space implementation lacked. @@ -286,10 +352,10 @@ fn vs(@builtin(vertex_index) vid: u32) -> VsOut { // produces a 3D-looking spiral from any vantage, so the orientation // of the BACKING quad doesn't affect the rendered look — only the // *synthetic camera* inside the shader does, and we drive that - // separately from `cameraPosWorld` in the fragment stage. + // separately from 'cameraPosWorld' in the fragment stage. let lookDir = normalize(-u.cameraPosWorld); // World-up reference for the cross-product basis. This MUST match - // the OrbitCamera's `lookAt` up-vector convention or the + // the OrbitCamera's 'lookAt' up-vector convention or the // billboard's basis tilts relative to the camera's actual screen // axes — the user-visible failure mode was "the bulge disappears // on one side when looking head-on to the disk", caused by the @@ -297,12 +363,12 @@ fn vs(@builtin(vertex_index) vid: u32) -> VsOut { // angular coverage didn't line up with the screen's rectangular // viewport. // - // OrbitCamera (`computeViewProj` in `orbitCamera.ts`) uses world - // +Y as the up reference for `mat4.lookAt`, with the orbit- + // OrbitCamera ('computeViewProj' in 'orbitCamera.ts') uses world + // +Y as the up reference for 'mat4.lookAt', with the orbit- // controls module clamping pitch to ±(π/2 − ε) to keep the lookAt // matrix non-degenerate. We mirror that exactly: worldUp = +Y, - // and the same pitch clamp upstream guarantees `cross(lookDir, - // +Y)` is non-degenerate so we never need to use the pole + // and the same pitch clamp upstream guarantees 'cross(lookDir, + // +Y)' is non-degenerate so we never need to use the pole // fallback in practice. The fallback is kept defensively for // the (currently impossible) case where pitch reaches the pole. 
let worldUp = vec3(0.0, 1.0, 0.0); @@ -317,34 +383,32 @@ fn vs(@builtin(vertex_index) vid: u32) -> VsOut { let worldPos = (c.x * right + c.y * up) * MILKY_WAY_HALFEXTENT_MPC; var out: VsOut; - out.clipPos = u.viewProj * vec4(worldPos, 1.0); + out.clipPos = worldToClip(u.cam, worldPos); out.uv = c; out.worldPos = worldPos; return out; } // ── Ported helpers (see Task 0 of the plan for the original GLSL) ──── +// +// 'rot2', 'sabs', 'toPolar', 'toRect' previously lived inline in this +// file. They've moved to lib/math.wesl since the same primitives are +// useful in any shader doing radial domain warps or smooth-distance +// fields. The semantics are unchanged; the only rename is GLSL's +// 'rot()' → 'rot2()' (see lib/math.wesl docblock for the why). const TWIRLY: f32 = 2.5; -fn toPolar(p: vec2<f32>) -> vec2<f32> { - return vec2(length(p), atan2(p.y, p.x)); -} - -fn toRect(p: vec2<f32>) -> vec2<f32> { - return p.x * vec2(cos(p.y), sin(p.y)); -} - -// GLSL's `mod2` mutated `p` in-place via `inout` and returned the cell -// index `c`. WGSL has no `inout`; we return both via a struct. +// GLSL's 'mod2' mutated 'p' in-place via 'inout' and returned the cell +// index 'c'. WGSL has no 'inout'; we return both via a struct. struct Mod2Out { p: vec2<f32>, c: vec2<f32>, }; fn mod2(p_in: vec2<f32>, size: vec2<f32>) -> Mod2Out { - // GLSL `mod` is the floored modulo; WGSL's `%` is truncated and - // `fract`-based. + // GLSL 'mod' is the floored modulo; WGSL's '%' is truncated and + // 'fract'-based.
Replicate the GLSL formula explicitly: // mod(x, y) = x - y * floor(x/y) let pPlusHalf = p_in + size * 0.5; let c = floor(pPlusHalf / size); @@ -364,12 +428,6 @@ fn noise1(p_in: vec2, tm: f32) -> f32 { return a * b * c * d; } -fn rot(p: vec2, a: f32) -> vec2 { - let c = cos(a); - let s = sin(a); - return vec2(c * p.x + s * p.y, -s * p.x + c * p.y); -} - fn twirl(p_in: vec2, a: f32, z: f32) -> vec2 { var pp = toPolar(p_in); pp.y = pp.y + pp.x * TWIRLY + a; @@ -406,7 +464,7 @@ fn stars(p_in: vec2) -> vec3 { var s = vec3(10000.0); for (var i: i32 = 0; i < 3; i = i + 1) { - p = rot(p, 0.5); + p = rot2(p, 0.5); let m = mod2(p, vec2(sz)); let r = rand(m.c); let o = -1.0 + 2.0 * vec2(r, fract(r * 1000.0)); @@ -418,16 +476,10 @@ fn stars(p_in: vec2) -> vec3 { } // SABS is a smooth absolute-value: linear far from zero, parabolic near -// zero with knee `k`. GLSL macro: `LESS((.5/k)*x*x+k*.5,abs(x),abs(x)-k)`. -// `LESS(a, b, c) = mix(a, b, step(0., c))` — i.e., `c >= 0` ? b : a. -// Substituting `c = abs(x) - k`: when `|x| >= k` use `abs(x)`, else use -// the parabolic blend. WGSL `select` does the same job. -fn sabs(x: f32, k: f32) -> f32 { - let a = (0.5 / k) * x * x + k * 0.5; - let ax = abs(x); - return select(a, ax, ax >= k); -} - +// zero with knee 'k'. GLSL macro: 'LESS((.5/k)*x*x+k*.5,abs(x),abs(x)-k)'. +// 'LESS(a, b, c) = mix(a, b, step(0., c))' — i.e., 'c >= 0' ? b : a. +// Substituting 'c = abs(x) - k': when '|x| >= k' use 'abs(x)', else use +// the parabolic blend. WGSL 'select' does the same job. fn height(p: vec2, tm: f32) -> f32 { let ang = atan2(p.y, p.x); let l = length(p); @@ -469,14 +521,14 @@ fn shadeGalaxyDisk(p_in: vec2, ro: vec3, rd: vec3, d: f32, tm: f3 // half-extent in shader units), well outside the natural disk // brightness. At those positions: // - // - `0.25 * pow(diff2, 4)` adds ~0.002 white per fragment from - // a Phong-like specular term that has no `h` gating. 
- // - `stars()` may divide by a near-zero `s.x` for some fragments, - // producing NaN that propagates through `tanh`/`mix`/`clamp`. + // - '0.25 * pow(diff2, 4)' adds ~0.002 white per fragment from + // a Phong-like specular term that has no 'h' gating. + // - 'stars()' may divide by a near-zero 's.x' for some fragments, + // producing NaN that propagates through 'tanh'/'mix'/'clamp'. // - The dust integral inside this function adds ~tiny dust haze // when h ≈ 0 makes ddust ≈ d. // - // Multiplying the final result by `exp(-5.5*l²)` (an earlier + // Multiplying the final result by 'exp(-5.5*l²)' (an earlier // attempt) wasn't enough — NaN times anything is still NaN, and // any value the GPU lifts to the HDR target via additive blending // can wash out catalog points behind the impostor's quad. The @@ -491,7 +543,7 @@ fn shadeGalaxyDisk(p_in: vec2, ro: vec3, rd: vec3, d: f32, tm: f3 // from the dim-tail terms (pow(0, near-zero), 1/s.x when s.x ≈ 0). // Cutting at 0.95 trades a slightly more abrupt outer disk edge // for guaranteed-zero output in the NaN-risk zone. - let p_check = rot(p_in, 0.5 * tm); + let p_check = rot2(p_in, 0.5 * tm); let l_check = length(p_check); if (l_check > 0.95) { return vec3(0.0); } @@ -540,7 +592,7 @@ fn shadeGalaxyDisk(p_in: vec2, ro: vec3, rd: vec3, d: f32, tm: f3 // The original ShaderToy was written for a hard-coded camera that // ALWAYS framed the galaxy with disk-plane intersections inside l ≈ // 1 shader unit. In that regime, every term in this function - // either naturally fades with `h` (which carries `exp(-5.5*l*l)`) + // either naturally fades with 'h' (which carries 'exp(-5.5*l*l)') // or never reaches a fragment outside the disk extent in the first // place. // @@ -551,13 +603,13 @@ fn shadeGalaxyDisk(p_in: vec2, ro: vec3, rd: vec3, d: f32, tm: f3 // extent. Two terms inside this function CONTRIBUTE non-trivially // out there even though they shouldn't: // - // 1. 
`0.25 * pow(diff2, 4)` — Phong-like specular off the disk's - // "surface". `diff2` depends only on the surface normal and + // 1. '0.25 * pow(diff2, 4)' — Phong-like specular off the disk's + // "surface". 'diff2' depends only on the surface normal and // light position; both are well-defined for any p, so the // term contributes ~0.002 per channel uniformly across the // whole disk plane. - // 2. The dust integral `0.7 * COL_DUST * (1 - exp(-2*t))` — `t` - // is non-zero whenever `ddust < d`, which happens for nearly + // 2. The dust integral '0.7 * COL_DUST * (1 - exp(-2*t))' — 't' + // is non-zero whenever 'ddust < d', which happens for nearly // every fragment when h ≈ 0 makes ddust ≈ d − ε. // // With pure additive blending, those tiny per-fragment @@ -568,8 +620,8 @@ fn shadeGalaxyDisk(p_in: vec2, ro: vec3, rd: vec3, d: f32, tm: f3 // exactly the impostor's quad shape; toggling the impostor off // restored the catalog underneath. // - // Fix: multiply the entire output by `exp(-5.5*l*l)` — the same - // disk-extent factor `height` already uses internally. Inside the + // Fix: multiply the entire output by 'exp(-5.5*l*l)' — the same + // disk-extent factor 'height' already uses internally. Inside the // disk (l ≤ 1) the multiplier is ~1 (unchanged); outside (l > 1) // it falls off rapidly so off-disk haze contributes effectively // zero. Edges of the impostor's quad are now invisible, only the @@ -590,11 +642,11 @@ fn renderGalaxy(ro: vec3, rd: vec3, tm: f32) -> vec3 { // ── Ray-marched soft bulge ─────────────────────────────────────── // - // The ShaderToy original used `1.7 * (1 - exp(-chord))` where - // `chord` is the geometric chord length through a uniform-density + // The ShaderToy original used '1.7 * (1 - exp(-chord))' where + // 'chord' is the geometric chord length through a uniform-density // sphere. That has TWO problems for a world-anchored impostor: // - // 1. Asymmetric truncation — the original `min(t0, t1)` clipped + // 1. 
Asymmetric truncation — the original 'min(t0, t1)' clipped // the chord to "above the disk plane" so the bulge looked // like a crescent. Fine for the ShaderToy's hard-coded // vantage, broken when the user can orbit and would see one @@ -602,7 +654,7 @@ fn renderGalaxy(ro: vec3, rd: vec3, tm: f32) -> vec3 { // // 2. Hard silhouette — uniform density inside, zero outside. // Chord goes to zero at the geometric edge with INFINITE - // slope (`chord = 2·sqrt(r² - b²)` near impact-parameter b + // slope ('chord = 2·sqrt(r² - b²)' near impact-parameter b // = r), so the rendered brightness has a sharp circular cut. // // Fix #1: drop the disk-plane truncation, use the full chord. @@ -640,7 +692,7 @@ fn renderGalaxy(ro: vec3, rd: vec3, tm: f32) -> vec3 { // ── Ray-marched thin-disc halo ─────────────────────────────────── // - // Even with the spiral-arm structure rendered by `shadeGalaxyDisk`, + // Even with the spiral-arm structure rendered by 'shadeGalaxyDisk', // the inter-arm regions and the disk's outer edges go to nearly // black against the HDR target. Adding a *very thin*, in-plane // Gaussian "haze" gives the disk a soft baseline glow so the arms @@ -656,7 +708,7 @@ fn renderGalaxy(ro: vec3, rd: vec3, tm: f32) -> vec3 { // ρ(p) = exp(-(p.x² + p.z²) / σ_r²) · exp(-p.y² / σ_y²) // // (The shader's coordinate convention has y as the disk normal — - // see `galacticToShader` above.) + // see 'galacticToShader' above.) let discHits = raySphere(ro, rd, vec3(0.0), DISC_HALO_INTEGRATION_RADIUS); var discOpticalDepth: f32 = 0.0; @@ -683,14 +735,14 @@ fn renderGalaxy(ro: vec3, rd: vec3, tm: f32) -> vec3 { @fragment fn fs(in: VsOut) -> @location(0) vec4 { - // Animation disabled — `tm` was the ShaderToy's TIME macro, fed into - // `rot(p, 0.5*tm)` for arm rotation and into `sin(... + tm * ...)` + // Animation disabled — 'tm' was the ShaderToy's TIME macro, fed into + // 'rot(p, 0.5*tm)' for arm rotation and into 'sin(... 
+ tm * ...)' // phase modulation in the noise/star samplers. Locking it to a // constant freezes the spiral pattern. Cosmic timescales make even // the original "slow but alive" animation physically nonsensical // (galaxy rotation periods are ~250 Myr); a static impostor reads // as a real photographic backdrop instead of a procedural toy. - // The `iTime` uniform is retained for ABI symmetry but unread. + // The 'iTime' uniform is retained for ABI symmetry but unread. let tm: f32 = 0.0; // Original mainImage: q = fragCoord/RESOLUTION; p = -1 + 2*q; p.x *= aspect. @@ -700,16 +752,16 @@ fn fs(in: VsOut) -> @location(0) vec4 { // edge of the impostor. // ── Synthetic camera driven by the user's REAL camera ─────────── // - // The original ShaderToy hard-coded a fixed `ro = vec3(0, 0.7, 2) - // * 0.75` and a synthesised perspective FOV via `2.5 * ww`. That + // The original ShaderToy hard-coded a fixed 'ro = vec3(0, 0.7, 2) + // * 0.75' and a synthesised perspective FOV via '2.5 * ww'. That // produced a frozen vantage regardless of where the user's orbit // camera actually was — the user reported "the galaxy is not moving // around when the camera is moving" precisely because of this. // // Replace it with a real-world ray: - // - `ro_world` is the user's camera position (already in skymap + // - 'ro_world' is the user's camera position (already in skymap // world coordinates, equatorial-cartesian). - // - `rd_world` is the per-fragment ray from the camera through + // - 'rd_world' is the per-fragment ray from the camera through // this fragment's world-space position (forwarded by the vertex // stage). Per-pixel reconstruction means the perspective is // correct even though the impostor is rendered onto a flat @@ -740,8 +792,8 @@ fn fs(in: VsOut) -> @location(0) vec4 { let col = renderGalaxy(ro, rd, tm); - // Pipeline blend is PURE ADDITIVE (`dstFactor: 'one'`). 
Each - // pixel adds `col × alpha` to the HDR target; dark fragments + // Pipeline blend is PURE ADDITIVE ('dstFactor: 'one''). Each + // pixel adds 'col × alpha' to the HDR target; dark fragments // contribute zero. let alpha = u.fadeAlpha; @@ -749,24 +801,24 @@ fn fs(in: VsOut) -> @location(0) vec4<f32> { // // The ported ShaderToy math has several near-singular operations // that can produce NaN at fragments where the camera ray hits - // edge-case geometry: `pow(si / s.x, 2.5)` divides by `s.x`, the + // edge-case geometry: 'pow(si / s.x, 2.5)' divides by 's.x', the // distance to the nearest random star sample, which can land at - // ≈ 0 for specific cell offsets; `pow(vec3(0.5)*h, exponent)` with + // ≈ 0 for specific cell offsets; 'pow(vec3(0.5)*h, exponent)' with // h ≈ 0 and a near-zero exponent component is implementation- - // defined; some `tanh`/`mix`/`clamp` paths propagate any NaN they + // defined; some 'tanh'/'mix'/'clamp' paths propagate any NaN they // encounter. In ADDITIVE blending, even one NaN pixel is fatal — // it lands on the HDR target as NaN, the next OVER-blended - // catalog point reads it back as `dst`, and the multiplication - // `dst * (1 - src_alpha)` poisons that pixel forever. Visually + // catalog point reads it back as 'dst', and the multiplication + // 'dst * (1 - src_alpha)' poisons that pixel forever. Visually // the user sees a "black ring" or "black square" tracking the // impostor's footprint. // - // WGSL has no `isnan` predicate, but exploits the IEEE-754 rule - // that NaN is never equal to itself: `x != x` is true iff x is + // WGSL has no 'isnan' predicate, but we can exploit the IEEE-754 rule + // that NaN is never equal to itself: 'x != x' is true iff x is // NaN. We use that to mask each component back to 0 if anything // upstream produced a NaN. Inf is also forced to zero (>1e30 // catches both +Inf and large-but-finite outliers from - // `pow(infinity, 2.5)` cases). + // 'pow(infinity, 2.5)' cases).
let isFinite = (col == col) & (abs(col) < vec3(1e30)); let safeCol = select(vec3(0.0), col, isFinite); return vec4(safeCol * alpha, alpha); diff --git a/src/services/gpu/shaders/points.wgsl b/src/services/gpu/shaders/points.wesl similarity index 74% rename from src/services/gpu/shaders/points.wgsl rename to src/services/gpu/shaders/points.wesl index 0d92dcf..c86c9c2 100644 --- a/src/services/gpu/shaders/points.wgsl +++ b/src/services/gpu/shaders/points.wesl @@ -32,24 +32,33 @@ // -------------------- // This shader is loaded by both PointRenderer (Task 10) and PickRenderer (Task 16), // which each select a different fragment entry point from this same module: -// - PointRenderer uses `vs` + `fs` → visual additive-blended render -// - PickRenderer uses `vs` + `fsPick` → offscreen r32uint picking pass +// - PointRenderer uses 'vs' + 'fs' → visual additive-blended render +// - PickRenderer uses 'vs' + 'fsPick' → offscreen r32uint picking pass // -// Both pipelines share the same vertex stage (`vs`) and the same shader module. +// Both pipelines share the same vertex stage ('vs') and the same shader module. // Having two fragment entry points in one file avoids duplicating the vertex // stage logic (billboard math, magnitude→intensity, colour ramp) while allowing // each pass to write to its own render-target format. // // The class: -// 1. Calls `device.createShaderModule({ code: wgslSource })` with this text. -// 2. Creates a `GPURenderPipeline` that references the `vs` and `fs` (or -// `fsPick`) entry points defined below. -// 3. Uploads a `Uniforms` struct (viewProj, viewport, pointSizePx, brightness) +// 1. Calls 'device.createShaderModule({ code: wgslSource })' with this text. +// 2. Creates a 'GPURenderPipeline' that references the 'vs' and 'fs' (or +// 'fsPick') entry points defined below. +// 3. Uploads a 'Uniforms' struct (viewProj, viewport, pointSizePx, brightness) // into a uniform buffer and binds it to @group(0) @binding(0). // 4. 
Uploads per-point data (position, magnitude, colorIndex) into a vertex // buffer configured for *instance stepping* (one record per point), while // @builtin(vertex_index) steps per-vertex (0..5 within each instance). -// 5. Calls `passEncoder.draw(6, pointCount)` to kick off the draw. +// 5. Calls 'passEncoder.draw(6, pointCount)' to kick off the draw. + +import package::lib::math::saturate; +import package::lib::camera::CameraUniforms; +import package::lib::camera::worldToClip; +import package::lib::billboard::quadCorner; +import package::lib::billboard::expandBillboardScreen; +import package::lib::cloudFade::CloudUniforms; +import package::lib::cloudFade::applyCloudFade; +import package::lib::colorIndex::ramp; // ─── uniforms ───────────────────────────────────────────────────────────────── @@ -65,31 +74,77 @@ // a GPUBindGroup pointing at the Uniforms buffer, and calls // passEncoder.setBindGroup(0, bindGroup) before drawing. -struct Uniforms { - // The combined view-projection matrix (4×4 f32, 64 bytes). - // Uploaded by PointRenderer from computeViewProj() (see orbitCamera.ts). - // WGSL uniform buffers follow std140-like alignment: mat4x4 is 64 bytes, - // naturally aligned to 16 bytes — no padding needed before it. - viewProj: mat4x4, - - // Canvas dimensions in physical pixels (after DPR scaling from device.ts). - // Stored as a vec2 because we divide by it below; integer division - // would lose precision. Alignment: vec2 = 8 bytes, aligned to 8. - viewport: vec2, - - // Desired radius of each point sprite in pixels. Larger = bigger glowing - // halos. Typical range 2.0–8.0. Alignment: f32 = 4 bytes. - pointSizePx: f32, - - // Global brightness multiplier in [0, 1]. Lets the UI dim/brighten all - // points without re-uploading point data. Alignment: f32 = 4 bytes. - // (The vec2 above took 8 bytes, so offset so far is 64+8+4+4 = 80 — still - // within a single 256-byte uniform block and no padding gaps needed here.) 
- brightness: f32, +// ── Uniforms layout (CameraUniforms-prefixed) ────────────────────── +// +// The first 80 bytes are the shared 'CameraUniforms' prefix from +// 'lib/camera.wesl' (viewProj + viewportPx + two reserved pad slots). +// Embedding it as a named member 'cam' rather than inlining viewProj +// + viewport keeps the cross-renderer audit story explicit: every +// renderer's first uniform field is now 'cam: CameraUniforms', and +// every projection call goes through 'worldToClip(u.cam, p)'. +// +// ## Why the field order changed +// +// The pre-WESL-conversion layout had this shape: +// +// offset 0 : viewProj (mat4) +// offset 64 : viewport (vec2) +// offset 72 : pointSizePx (f32) <─── overlaps CameraUniforms +// offset 76 : brightness (f32) <─── ._pad0/._pad1 slots +// offset 80 : selectedIndex (u32) +// offset 84 : instanceIdOffset (u32) +// offset 88 : _pad0 (u32) +// offset 92 : _pad1 (u32) +// offset 96 : camPosWorld (vec3) ... and on +// +// 'CameraUniforms' explicitly reserves bytes 72..79 as '_pad0/_pad1' +// (so renderers extending it can place their first vec3-aligned field +// at offset 80). The old points layout filled those bytes with +// 'pointSizePx' + 'brightness' — fine for a hand-rolled struct but +// incompatible with embedding 'cam: CameraUniforms' as a clean prefix. +// +// ## Resolution: swap with the existing _pad0/_pad1 +// +// Conveniently, the OLD layout already had two u32 padding words at +// offsets 88..95 (the alignment slack that vec3 requires before +// 'camPosWorld' at offset 96). Moving 'pointSizePx' and 'brightness' +// into those slots means EVERY field from offset 80 onward keeps its +// existing byte position — a strictly local refactor. The two +// scalars become f32 instead of u32 padding, but that's a CPU-side +// view change only; the WGSL struct already reserved 8 bytes there. 
+// +// The CPU-side picker ('pickRenderer.ts') and visual upload +// ('pointRenderer.ts') need their offsets updated for the two +// scalars, but 'selectedIndex' (offset 80) — the only field the +// picker writes mid-frame — stays exactly where it was. See those +// files' headers for the matching index-table updates. +// +// ## New byte layout +// +// offset 0 : cam: CameraUniforms (80 B; viewProj + viewportPx + 2 pads) +// offset 80 : selectedIndex (u32) +// offset 84 : instanceIdOffset (u32) +// offset 88 : pointSizePx (f32) <─── moved from offset 72 +// offset 92 : brightness (f32) <─── moved from offset 76 +// offset 96 : camPosWorld (vec3) +// offset 108 : pxPerRad (f32) +// offset 112 : highlightFallback (u32) +// offset 116 : realOnlyMode (u32) +// offset 120 : depthFadeEnabled (u32) +// offset 124 : _pad4 (u32) +// offset 128..159 : Malmquist block (biasMode, absMagLimit, ...) +// offset 160..175 : pxFadeStart, pxFadeEnd, _padFade0, _padFade1 +// total: 176 bytes (unchanged from the pre-refactor layout). - // The currently-selected point packed as `(sourceCode << 27) | localIdx`, - // or `0xFFFFFFFFu` when nothing is selected. The vertex shader recovers - // its own packed identity as `(cloud.sourceCode << 27u) | u32(instance_index)` +struct Uniforms { + // Shared camera prefix — viewProj for clip-space projection and + // viewportPx for the px→clip conversion in 'vs'. All renderer- + // specific scalars live AFTER this 80-byte block. + cam: CameraUniforms, + + // The currently-selected point packed as '(sourceCode << 27) | localIdx', + // or '0xFFFFFFFFu' when nothing is selected. The vertex shader recovers + // its own packed identity as '(cloud.sourceCode << 27u) | u32(instance_index)' // (sourceCode lives in the per-source @group(1) bind group — see // CloudUniforms below) and compares against this slot to decide whether // to enlarge for the selection ring. 
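Review note on the hunk above: the "New byte layout" comment implies a CPU-side offset table that has to change in lockstep with the struct. A minimal sketch of what that might look like follows, assuming a little-endian host. The names `POINTS_UNIFORM_OFFSETS`, `POINTS_UNIFORM_SIZE`, and `writePointScalars` are hypothetical and only illustrate the layout; the real constants live in `pointRenderer.ts` and `pickRenderer.ts` and may be organised differently. Field names follow the layout table in the comment (`selectedIndex`, `instanceIdOffset`) rather than the struct members.

```typescript
// Hypothetical offset table mirroring the "New byte layout" comment.
// All values are byte offsets into the 176-byte @group(0) uniform buffer.
const POINTS_UNIFORM_OFFSETS = {
  cam: 0,               // CameraUniforms prefix, 80 bytes
  selectedIndex: 80,    // u32, unchanged by the refactor
  instanceIdOffset: 84, // u32, unchanged by the refactor
  pointSizePx: 88,      // f32, moved from offset 72
  brightness: 92,       // f32, moved from offset 76
  camPosWorld: 96,      // vec3 of f32, 16-byte aligned
  pxPerRad: 108,        // f32, fills the vec3 tail
} as const;

const POINTS_UNIFORM_SIZE = 176;

// Illustrative helper: writes the three scalars discussed in the comment
// into a CPU-side staging buffer. Little-endian is assumed (true argument).
function writePointScalars(
  staging: ArrayBuffer,
  pointSizePx: number,
  brightness: number,
  selected: number,
): void {
  const view = new DataView(staging);
  view.setUint32(POINTS_UNIFORM_OFFSETS.selectedIndex, selected >>> 0, true);
  view.setFloat32(POINTS_UNIFORM_OFFSETS.pointSizePx, pointSizePx, true);
  view.setFloat32(POINTS_UNIFORM_OFFSETS.brightness, brightness, true);
}
```

With a table like this, bytes 72..79 (the `CameraUniforms` pad slots) are never written by the points renderer, which is the property the refactor is meant to guarantee.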
@@ -102,18 +157,18 @@ struct Uniforms { // construction: each source's packed identity space sits in its own // top-5-bit slice, so two galaxies in different surveys can never // collide on the same packed value. This replaces the prior - // running-sum `globalInstanceIdx` baked per-vertex; the parallel- + // running-sum 'globalInstanceIdx' baked per-vertex; the parallel- // upload race that scheme suffered from is structurally impossible - // here because there is no per-vertex baking — `instance_index` is the - // GPU's `@builtin` and `sourceCode` arrives via this same uniform + // here because there is no per-vertex baking — 'instance_index' is the + // GPU's '@builtin' and 'sourceCode' arrives via this same uniform // before each per-source draw. // // ### Sentinel // - // `0xFFFFFFFFu` is "no selection". The maximum legitimate packed - // value is `(31 << 27) | 0x07FFFFFF` = `0xFFFFFFFE` for the largest + // '0xFFFFFFFFu' is "no selection". The maximum legitimate packed + // value is '(31 << 27) | 0x07FFFFFF' = '0xFFFFFFFE' for the largest // hypothetical sourceCode + max localIdx — the +1 in the pick path is - // what keeps this sentinel below `0xFFFFFFFF`, but the SELECTION + // what keeps this sentinel below '0xFFFFFFFF', but the SELECTION // packing has no +1 (the localIdx range starts at 0 and we use a // separate sentinel) so collisions are still impossible at any // realistic scale. @@ -121,34 +176,50 @@ struct Uniforms { // ### Offset stability // // The four preceding fields (vec2 + f32 + f32 = 16 bytes) bring the - // running offset to 80 bytes, where `selectedPacked` sits. The picker - // (`pickRenderer.ts`) writes the sentinel here directly using a - // hard-coded byte offset of 80; adding fields *after* `selectedPacked` + // running offset to 80 bytes, where 'selectedPacked' sits. 
The picker + // ('pickRenderer.ts') writes the sentinel here directly using a + // hard-coded byte offset of 80; adding fields *after* 'selectedPacked' // is therefore safe. selectedPacked: u32, - // Three trailing u32 padding words to keep `selectedPacked` (offset 80) - // on the same 16-byte vec4 slot as the next vec3. Required because - // the next member (`camPosWorld`, a vec3) has alignment 16, so - // the struct would otherwise insert implicit padding here anyway — - // naming the bytes makes the JS-side upload obvious. - // - // The slot at offset 84 is intentionally unused at the @group(0) level: - // sourceCode lives in the per-source @group(1) `CloudUniforms` (see + // Slot at offset 84: intentionally unused at the @group(0) level. + // 'sourceCode' lives in the per-source @group(1) 'CloudUniforms' (see // below) so each cloud's bind group carries its OWN sourceCode value // and draws within one render pass don't race on the per-draw write. _pad0: u32, - _pad1: u32, - _pad2: u32, + + // Desired radius of each point sprite in pixels. Larger = bigger glowing + // halos. Typical range 2.0–8.0. + // + // Lives at byte offset 88 in the post-CameraUniforms layout. It used + // to live at offset 72, but adopting the shared 'CameraUniforms' prefix + // (which reserves bytes 72..79 as '_pad0/_pad1') forced 'pointSizePx' + // into the existing 8-byte alignment slack between offset 84 and the + // vec3-aligned 'camPosWorld' at offset 96. The pick pass overwrites + // this slot in-place to boost hover/click hit-radius (see + // 'pickRenderer.ts' POINT_SIZE_OFFSET). + pointSizePx: f32, + + // Global brightness multiplier in [0, 1]. Lets the UI dim/brighten all + // points without re-uploading point data. + // + // Lives at byte offset 92 in the post-CameraUniforms layout (was 76). 
+ // Together with 'pointSizePx' above, the two scalars recycle the + // alignment-slack bytes that the WGSL spec already required between + // 'selectedPacked' (offset 80) + the @group(0)-unused slot (offset 84) + // and the vec3-aligned 'camPosWorld' (offset 96) — same 8 bytes, + // different name. + brightness: f32, // ── APPARENT-SIZE BILLBOARD SIZING (added Task: galaxy disc sizing) ────── // // World-space camera position in Mpc. Used by the vertex stage to compute // the per-galaxy distance, which feeds the apparent-pixel-size calculation - // below. WGSL gives `vec3` an alignment of 16 — so this field starts - // at offset 96 (the previous _pad0/_pad1 brought us to a 16-byte boundary) + // below. WGSL gives 'vec3' an alignment of 16 — so this field starts + // at offset 96 (selectedIndex + instanceIdOffset + pointSizePx + brightness + // = 16 bytes after the 80-byte 'cam' block, landing on a 16-byte boundary) // and consumes 12 bytes of payload + 4 bytes of trailing padding before - // `pxPerRad`. + // 'pxPerRad'. // // Why a uniform and not a per-vertex attribute? The camera position is the // same for every instance in a frame. Per-vertex storage would burn ~10 MB @@ -157,12 +228,12 @@ struct Uniforms { camPosWorld: vec3, // Pixels-per-radian for the current viewport + camera FOV combination, - // pre-computed CPU-side as `viewport.y / (2 · tan(fovY / 2))`. Multiplying + // pre-computed CPU-side as 'viewport.y / (2 · tan(fovY / 2))'. Multiplying // an angular size (radians) by this scalar yields screen pixels — the // standard pinhole-camera relation, just packaged for cheap shader use. // // We pass it pre-divided rather than passing fovY and recomputing per - // vertex because `tan` is one of the more expensive intrinsics on mobile + // vertex because 'tan' is one of the more expensive intrinsics on mobile // GPUs and the result is frame-constant. 
pxPerRad: f32, @@ -171,39 +242,39 @@ struct Uniforms { // u32 booleans (0 / 1) controlling how the fragment shader treats // galaxies whose orientation came from the deterministic fallback rather // than a real photometric measurement. The fallback flag itself rides - // on the sign bit of the per-vertex `axisRatio` attribute (see the + // on the sign bit of the per-vertex 'axisRatio' attribute (see the // PerVertex doc). // - // - `highlightFallback`: when 1, multiply the tint of fallback rows by - // magenta `(1.0, 0.3, 1.0)` — a quick visual scan of which surveys + // - 'highlightFallback': when 1, multiply the tint of fallback rows by + // magenta '(1.0, 0.3, 1.0)' — a quick visual scan of which surveys // have real orientation coverage. - // - `realOnlyMode`: when 1, `discard` fallback fragments entirely so the + // - 'realOnlyMode': when 1, 'discard' fallback fragments entirely so the // user can see only galaxies with measured (b/a, PA). Useful for - // verifying the cross-match coverage as `npm run fetch-2mass-xsc` and - // `npm run fetch-hyperleda` populate their caches. + // verifying the cross-match coverage as 'npm run fetch-2mass-xsc' and + // 'npm run fetch-hyperleda' populate their caches. // // Two trailing u32s round the struct to a 16-byte boundary (vec4 slot). highlightFallback: u32, realOnlyMode: u32, // Per-galaxy camera-distance depth fade gate. When 1, the fragment stage - // multiplies alpha by `1 / (1 + (camDist / FALLOFF_HALF)²)`; when 0 the + // multiplies alpha by '1 / (1 + (camDist / FALLOFF_HALF)²)'; when 0 the // multiplication is skipped (equivalent to weight 1 everywhere). - // Repurposed from a former `_pad3` slot — sits with the other UI + // Repurposed from a former '_pad3' slot — sits with the other UI // boolean toggles so the byte layout reads sensibly. 
depthFadeEnabled: u32, _pad4: u32, // ── Malmquist-bias correction state (Task 2 of malmquist-bias plan) ───── // - // `biasMode` chooses which correction the vertex stage applies: + // 'biasMode' chooses which correction the vertex stage applies: // 0 = none — render every galaxy unchanged. // 1 = volume-limit — discard galaxies whose absolute magnitude is - // fainter (numerically larger) than `absMagLimit`. + // fainter (numerically larger) than 'absMagLimit'. // 2 = 1/V_max — Task 3: weight by inverse maximum-detection - // volume; needs `apparentMagLimit` and per-row + // volume; needs 'apparentMagLimit' and per-row // flux-limit data. // 3 = Schechter — Task 4: reweight by the expected Schechter - // luminosity function `phi(M; M*, alpha)`. + // luminosity function 'phi(M; M*, alpha)'. // // Modes 2 + 3 are reserved here so we don't have to grow the uniform // buffer again when Tasks 3 + 4 land — the shader fields are inert for @@ -225,10 +296,10 @@ struct Uniforms { // tail — so we round up to the next 16-byte boundary (32 bytes added // in total). // - // Task 4 (Schechter density correction): the four `schechter*` fields + // Task 4 (Schechter density correction): the four 'schechter*' fields // are written PER SOURCE between draw calls — each survey has its own // M*, α, m_lim, and pre-computed central-density normaliser N_ref. The - // φ* normalisation drops out of the ratio `N_ref / n(d)`, so it doesn't + // φ* normalisation drops out of the ratio 'N_ref / n(d)', so it doesn't // need a uniform slot. biasMode: u32, absMagLimit: f32, @@ -251,24 +322,24 @@ struct Uniforms { // additive accumulation. // // We therefore feed the same two thresholds into this shader and - // multiply the per-fragment alpha by `1 - smoothstep(start, end, sizePx)` + // multiply the per-fragment alpha by '1 - smoothstep(start, end, sizePx)' // before output. 
Outside [start, end] the multiplier is exactly 1.0 // (below start) or 0.0 (above end), so far-field rendering is byte- // for-byte identical to before this task and only galaxies actively // in the crossfade band feel the change. // - // Why alpha=0 instead of vertex-stage `discard` / clip-space cull? - // The pick fragment entry point (`fsPick`) shares the vertex stage with - // `fs`; if we collapsed the billboard to off-screen-clip when sizePx + // Why alpha=0 instead of vertex-stage 'discard' / clip-space cull? + // The pick fragment entry point ('fsPick') shares the vertex stage with + // 'fs'; if we collapsed the billboard to off-screen-clip when sizePx // exceeds end, the user would lose the ability to *click* a galaxy // whose visual representation has handed off to the procedural-disk // pass. Selection rings and click hit-testing both still want the // points-pass primitive to rasterise. Multiplying alpha by zero in - // `fs` keeps the vertex stage and pick stage untouched while making + // 'fs' keeps the vertex stage and pick stage untouched while making // the visual contribution invisible. // - // pxFadeStart / pxFadeEnd are populated by `pointRenderer.draw` from - // engine-side constants imported from `./thumbnailSubsystem` so the + // pxFadeStart / pxFadeEnd are populated by 'pointRenderer.draw' from + // engine-side constants imported from './thumbnailSubsystem' so the // two passes can never drift out of sync (a single source of truth). // The two trailing pads round the appended payload up to a 16-byte // boundary — without them the next vec3/vec4 a future task adds would @@ -283,62 +354,37 @@ struct Uniforms { // ─── per-cloud uniforms (Cloud fade-in) ─────────────────────────────────────── // -// One small uniform buffer per loaded source, set at draw time via -// `setBindGroup(1, entry.cloudBindGroup)` from the JS render loop. 
The -// only field we currently care about is `opacity` — the smoothstep-shaped -// 0→1 ramp that the JS side computes from `now() - fadeStartMs`. Multiplied -// into the visual fragment's final alpha so a freshly-uploaded cloud -// glides into view rather than popping. +// 'CloudUniforms' is imported from 'lib/cloudFade.wesl' (shared with +// filaments.wesl). See that file's docblock for field semantics + +// why the layout is deliberately identical across renderers. The +// notes below cover what's POINTS-specific. // -// Why a separate bind group rather than extending @group(0)? WebGPU's -// `queue.writeBuffer` ordering across submits in one frame is undefined — -// writing different opacity values to one shared buffer between draws -// would race. Per-cloud BUFFERS sidestep that entirely (one writeBuffer -// per buffer per frame, no overlap), which is exactly the existing -// "bake per-instance into the vertex buffer" pattern at a coarser -// granularity. See CLAUDE.md → "WebGPU `queue.writeBuffer` race". +// ### Why @group(1) instead of @group(0) // -// The pick fragment (`fsPick`) doesn't reference `cloud.opacity`, but -// the SHARED vertex stage reads `cloud.sourceCode` to compose each -// instance's packed identity (`(sourceCode << 27u) | instance_index`) -// for the pick output. WebGPU's pipeline layout reflects every binding -// any stage touches, so the pick pipeline auto-derives @group(1) too, -// and PickRenderer must bind a CloudFade-style group per source before -// each draw — see PickRenderer.pick(). -struct CloudUniforms { - /** 0 → fully transparent (just uploaded), 1 → fully opaque (steady state). */ - opacity: f32, - - // 5-bit Source enum value for this cloud. Set once at upload time - // (the JS `pointRenderer.upload` calls `entry.fade.setSourceCode(source)`). 
- // The vertex stage reads this slot to compose - // `myPacked = (sourceCode << 27u) | @builtin(instance_index)` — the - // per-instance packed identity used for both selection-halo - // comparison (`fs`) and pick output (`fsPick`). - // - // ### Why @group(1) instead of @group(0) - // - // The visual + pick passes draw every loaded survey in one render - // pass each, with one `pass.draw(6, count)` call per source. If - // sourceCode lived in the global @group(0) Uniforms, writing it - // between draws within one submit would NOT take effect — WebGPU - // sequences all `queue.writeBuffer` calls in a submit before any - // draw runs, so all draws would see the last-written value. That - // race is exactly what the prior revision's per-vertex - // `globalInstanceIdx` baking dodged. - // - // Putting sourceCode in the per-source @group(1) bind group makes - // every cloud have its OWN uniform buffer. Different buffers, - // different write destinations — the writes can't race because there - // is no shared destination across draws. Same architecture that - // already keeps `opacity` from racing. - sourceCode: u32, - - // Pad to 16-byte alignment — WebGPU's minimum uniform buffer size. - _pad1: f32, - _pad2: f32, -}; - +// The visual + pick passes draw every loaded survey in one render +// pass each, with one 'pass.draw(6, count)' call per source. If +// 'opacity' or 'sourceCode' lived in the global @group(0) Uniforms, +// writing between draws within one submit would NOT take effect — +// WebGPU sequences all 'queue.writeBuffer' calls in a submit before +// any draw runs, so all draws would see the last-written value. That +// race is exactly what the prior revision's per-vertex +// 'globalInstanceIdx' baking dodged. +// +// Putting these per-cloud values in the per-source @group(1) bind +// group makes every cloud have its OWN uniform buffer. 
Different +// buffers, different write destinations — the writes can't race +// because there is no shared destination across draws. See +// CLAUDE.md → "WebGPU 'queue.writeBuffer' race". +// +// ### Pick pipeline layout +// +// The pick fragment ('fsPick') doesn't reference 'cloud.opacity', but +// the SHARED vertex stage reads 'cloud.sourceCode' to compose each +// instance's packed identity ('(sourceCode << 27u) | instance_index') +// for the pick output. WebGPU's pipeline layout reflects every +// binding any stage touches, so the pick pipeline auto-derives +// @group(1) too, and PickRenderer must bind a CloudFade-style group +// per source before each draw — see PickRenderer.pick(). @group(1) @binding(0) var cloud: CloudUniforms; // ─── vertex attributes ──────────────────────────────────────────────────────── @@ -346,7 +392,7 @@ struct CloudUniforms { // These fields are filled from the *instance* vertex buffer — the buffer that // holds one record per catalog point, not one record per vertex. // -// On the JS side the pipeline descriptor's `vertex.buffers` array will contain +// On the JS side the pipeline descriptor's 'vertex.buffers' array will contain // an entry like: // // { arrayStride: 20, // 3×f32 (pos) + 1×f32 (mag) + 1×f32 (ci) = 20 bytes @@ -376,7 +422,7 @@ struct PerVertex { // Per-row K-correction coefficient (units: per unit redshift z). 
// - // Used by `vs` to convert observed colour to rest-frame: each survey + // Used by 'vs' to convert observed colour to rest-frame: each survey // measures a different colour pair with a different sensitivity to z, // so the K-correction strength varies per row rather than being a // global shader constant: @@ -385,27 +431,27 @@ struct PerVertex { // - 2MRS J−K → k ≈ 0.0/z (NIR is nearly redshift-invariant at z<0.1) // and the JS-side upload writes 0 alongside the colorIndex sentinel for // rows whose source-specific colour pair isn't measurable, so the - // sentinel branch in `vs` doesn't need to special-case kPerZ. + // sentinel branch in 'vs' doesn't need to special-case kPerZ. @location(3) kPerZ: f32, // Galaxy minor/major axis ratio b/a in (0, 1] — with the SIGN BIT // carrying the fallback-orientation flag. Real measurements are // always positive; the JS-side bake negates the value when the row's - // (b/a, PA) match the deterministic `fallbackOrientation` output. + // (b/a, PA) match the deterministic 'fallbackOrientation' output. // The fragment stage recovers both pieces in one read: // - // - `abs(axisRatio)` for the elliptical mask shape (the existing - // `axisRatio > 0.0` validity check stops working with negative - // reals, so the vertex stage forwards `abs(axisRatio)` through + // - 'abs(axisRatio)' for the elliptical mask shape (the existing + // 'axisRatio > 0.0' validity check stops working with negative + // reals, so the vertex stage forwards 'abs(axisRatio)' through // VSOut.axisRatio for the fragment to use directly). - // - `axisRatio < 0.0` for the fallback flag. + // - 'axisRatio < 0.0' for the fallback flag. // // ### Why sign-bit packing // // The previous revision rode the fallback flag on the high bit of a - // per-vertex `globalInstanceIdx u32`. That whole slot went away with + // per-vertex 'globalInstanceIdx u32'. 
That whole slot went away with // the (source, localIdx) packing refactor — the picker now derives - // its global identity from `(sourceCode << 27) | instance_index` + // its global identity from '(sourceCode << 27) | instance_index' // without any per-vertex baking. The fallback flag is a single bit; // rather than reintroduce a dedicated u32 slot just for it, we steal // the sign bit of axisRatio (always a positive value for real @@ -413,11 +459,11 @@ struct PerVertex { // // ### NaN handling (synthetic-fallback cloud) // - // The synthetic-fallback cloud (loaded when every real `.bin` fails to + // The synthetic-fallback cloud (loaded when every real '.bin' fails to // decode) ships its axisRatio array filled with NaN. WGSL's - // `abs(NaN)` returns NaN and `NaN < 0.0` is false, so the vertex + // 'abs(NaN)' returns NaN and 'NaN < 0.0' is false, so the vertex // stage routes synthetic rows through the existing "axisRatio > 0 - // is false" round-mask path with `isFallback = 0u`. + // is false" round-mask path with 'isFallback = 0u'. @location(4) axisRatio: f32, // Position angle in degrees, [0, 180). Rotates the squashed ellipse // around the billboard centre. East-of-north convention; we negate @@ -431,10 +477,10 @@ struct PerVertex { // in every row. @location(6) diameterKpc: f32, // Per-galaxy 1/V_max weight for Malmquist-bias correction. Baked at - // upload time as `clamp((dRef / dMax(M, m_lim))³, 0, 1)`. Read by + // upload time as 'clamp((dRef / dMax(M, m_lim))³, 0, 1)'. Read by // the fragment shader's intensity computation, but ONLY when - // `u.biasMode == 2u` (the 1/V_max literal in src/data/biasMode.ts); - // every other mode multiplies by 1.0 via `select`, so the four bias + // 'u.biasMode == 2u' (the 1/V_max literal in src/data/biasMode.ts); + // every other mode multiplies by 1.0 via 'select', so the four bias // modes stay independent and a/b-comparable from the SettingsPanel. // // Why per-vertex (not a uniform)? 
Each galaxy has a different M and @@ -442,7 +488,7 @@ struct PerVertex { // value across the whole survey — a strict information loss. @location(7) vMaxWeight: f32, - // Per-galaxy Schechter density-correction ratio = `clamp(N_ref / n(d), 0, 10)`, + // Per-galaxy Schechter density-correction ratio = 'clamp(N_ref / n(d), 0, 10)', // baked at upload time (originally introduced in commit 7a6d810 as a // per-fragment 200-step trapezoidal integral; that loop is gone now). // @@ -450,32 +496,32 @@ struct PerVertex { // // Each galaxy's distance from origin is fixed at upload time (the catalog // parser baked the linear-cosmology Cartesian position into the .bin). - // The Schechter integral at that distance — `n(d)` — depends only on the + // The Schechter integral at that distance — 'n(d)' — depends only on the // survey's selection function (M*, α, m_lim) and that fixed distance, so // its value is also fixed at upload time. The CPU computes it once per // galaxy and writes the resulting ratio here, mirroring exactly the - // pattern Task 3 used for `vMaxWeight`. + // pattern Task 3 used for 'vMaxWeight'. // // ── Why baking is *much* faster ───────────────────────────────────────── // // The original implementation ran the 200-step trapezoidal loop in the // FRAGMENT stage, costing ~3.5 M galaxies × ~6 fragments per billboard × - // 200 iterations ≈ 4 billion `pow + exp` evaluations per frame. With the - // ratio baked, mode 3 collapses to a single `f32` lookup and a multiply — - // identical cost to mode 2's `vMaxWeight` path. + // 200 iterations ≈ 4 billion 'pow + exp' evaluations per frame. With the + // ratio baked, mode 3 collapses to a single 'f32' lookup and a multiply — + // identical cost to mode 2's 'vMaxWeight' path. 
// // ── Numeric stability ─────────────────────────────────────────────────── // - // The CPU bake mirrors the shader's old clamp `clamp(ratio, 0, 10)` and - // also handles the degenerate-distance case (`nHere == 0` or NaN) by + // The CPU bake mirrors the shader's old clamp 'clamp(ratio, 0, 10)' and + // also handles the degenerate-distance case ('nHere == 0' or NaN) by // baking 0 — so far galaxies with no detectable density disappear in // mode 3 instead of going infinite/NaN. Visual output is unchanged from // the pre-bake implementation. @location(8) schechterRatio: f32, - // Per-galaxy HEALPix angular re-weight = `clamp(medianCount / localCount, - // 0.1, 10)` baked at upload time by `computeAngularWeights` (lazy, only - // when the user picks `BiasMode.AngularReweight`; default is 1.0). + // Per-galaxy HEALPix angular re-weight = 'clamp(medianCount / localCount, + // 0.1, 10)' baked at upload time by 'computeAngularWeights' (lazy, only + // when the user picks 'BiasMode.AngularReweight'; default is 1.0). // // ── Why per-vertex ────────────────────────────────────────────────────── // @@ -495,7 +541,7 @@ struct PerVertex { // assume isotropic angular completeness. Mode 4 is therefore an // *alternative*, not a multiplicative add-on — selecting it bypasses the // other modes' alpha modulation via the shader's - // `select(1.0, …, biasMode == 4u)` gate. + // 'select(1.0, …, biasMode == 4u)' gate. @location(9) angularDensityWeight: f32, }; @@ -522,15 +568,15 @@ struct VSOut { // The 0-based index of the catalog point (galaxy) this quad belongs to. // - // Used by `fsPick` (the picking fragment entry point) to write the instance - // ID into the r32uint pick texture. The visual `fs` entry point does NOT use + // Used by 'fsPick' (the picking fragment entry point) to write the instance + // ID into the r32uint pick texture. The visual 'fs' entry point does NOT use // this field — WGSL permits unused fragment inputs without error. // // WHY @interpolate(flat)? 
// Integer attributes (u32) MUST be declared with @interpolate(flat) in WGSL. // Floating-point attributes interpolate across the triangle by default; // integers cannot be meaningfully interpolated (they'd need to be cast to - // float, interpolated, then cast back — losing precision). `flat` tells the + // float, interpolated, then cast back — losing precision). 'flat' tells the // rasteriser to use the "provoking vertex" value unchanged for every fragment, // which is correct here: all 6 vertices of one instance share the same index. @location(3) @interpolate(flat) instanceIdx: u32, @@ -538,7 +584,7 @@ struct VSOut { // 1u when this instance is the selected point; 0u otherwise. // Flat-interpolated for the same reason as instanceIdx — it is a per-instance // boolean that must not be interpolated across the triangle. - // Used by the visual `fs` to apply the ring/halo selection highlight. + // Used by the visual 'fs' to apply the ring/halo selection highlight. @location(4) @interpolate(flat) selected: u32, // Galaxy disk axis ratio b/a in (0, 1], forwarded from the per-instance @@ -552,15 +598,15 @@ struct VSOut { // Pre-computed cosine and sine of the position-angle rotation used by the // fragment-stage elliptical mask. These were previously computed per - // fragment from `positionAngleDeg`, which meant 2 trig calls per pixel + // fragment from 'positionAngleDeg', which meant 2 trig calls per pixel // for every billboard — at default 2.5 px point size and ~3.5 M points - // that's tens of millions of trig calls per frame. Since `paRad` is + // that's tens of millions of trig calls per frame. Since 'paRad' is // per-instance constant, the cos/sin are too, and the rasteriser can // flat-interpolate them at zero per-fragment cost (one write per - // primitive, not per pixel). The value carried is `cos(-paRad)` / - // `sin(-paRad)` because the fragment rotates the UV by `-PA` (rotating + // primitive, not per pixel). 
The value carried is 'cos(-paRad)' / + // 'sin(-paRad)' because the fragment rotates the UV by '-PA' (rotating // the UV is the inverse of rotating the ellipse) — see the doc-comment - // at the top of `fs` for why we negate. + // at the top of 'fs' for why we negate. @location(6) @interpolate(flat) paCs: f32, @location(15) @interpolate(flat) paSn: f32, @@ -572,7 +618,7 @@ struct VSOut { @location(7) @interpolate(flat) isFallback: u32, // Origin-relative distance in Mpc, forwarded from the vertex stage - // (`length(p.position)`). Originally introduced for the Schechter mode 3 + // ('length(p.position)'). Originally introduced for the Schechter mode 3 // fragment-stage integral; the integral has since moved to upload-time // bake (see PerVertex.schechterRatio), so this field is no longer read by // any fragment entry point. Kept as a plumbing field so we can resurrect @@ -584,21 +630,21 @@ struct VSOut { @location(8) @interpolate(flat) dMpc: f32, // Per-galaxy Schechter density-correction ratio, forwarded from the - // per-instance attribute. Read in `fs` only when `u.biasMode == 3u` + // per-instance attribute. Read in 'fs' only when 'u.biasMode == 3u' // (the Schechter literal). Flat-interpolated for the same per-instance // constancy as the other flat u32/f32 attributes — every fragment of a // given billboard reads exactly the same ratio. @location(9) @interpolate(flat) schechterRatio: f32, // Per-galaxy HEALPix angular re-weight, forwarded from the per-instance - // attribute. Read in `fs` only when `u.biasMode == 4u` (the + // attribute. Read in 'fs' only when 'u.biasMode == 4u' (the // AngularReweight literal in src/data/biasMode.ts). Flat-interpolated // for the same per-instance constancy as schechterRatio — every fragment // of a given billboard reads exactly the same weight. @location(10) @interpolate(flat) angularDensityWeight: f32, // Distance from the camera to this galaxy in Mpc. 
Computed once in the - // vertex stage (`length(p.position - u.camPosWorld)`) and forwarded so the + // vertex stage ('length(p.position - u.camPosWorld)') and forwarded so the // fragment stage can apply a per-galaxy depth fade — galaxies far behind // the origin (camera-relative far side) contribute less alpha, which // tames the cumulative-overlap glow at the geometric origin where every @@ -611,30 +657,30 @@ struct VSOut { // can disable it by setting the falloff half-distance large. @location(11) @interpolate(flat) camDistMpc: f32, - // Pre-computed depth-fade multiplier `1 / (1 + (camDist/FALLOFF_HALF)²)`, - // gated by `u.depthFadeEnabled` (passes through 1.0 when the toggle is - // off). Previously computed per fragment from `camDistMpc`, but the + // Pre-computed depth-fade multiplier '1 / (1 + (camDist/FALLOFF_HALF)²)', + // gated by 'u.depthFadeEnabled' (passes through 1.0 when the toggle is + // off). Previously computed per fragment from 'camDistMpc', but the // value is per-instance constant so we bake it once per vertex and - // flat-interpolate. Same motivation as `paCs` / `paSn`: one mul + one + // flat-interpolate. Same motivation as 'paCs' / 'paSn': one mul + one // add + one div + one select per fragment over millions of fragments // becomes one of each per primitive. Cosmetic depth-attenuation curve; - // see `fs` for why this exists at all (additive emission shouldn't + // see 'fs' for why this exists at all (additive emission shouldn't // physically care about depth — but the alternative is a saturated // depth column through Earth that erases all visible structure). @location(12) @interpolate(flat) depthFade: f32, // Per-instance billboard radius in screen-space pixels (Task 8 of the - // procedural-disk-impostor plan). This is exactly the `sizePx` the + // procedural-disk-impostor plan). This is exactly the 'sizePx' the // vertex stage already computes for the apparent-size billboard math - // (`max(u.pointSizePx, apparentPxRadius)`). 
We forward it through + // ('max(u.pointSizePx, apparentPxRadius)'). We forward it through // VSOut so the fragment stage can fade the points-pass alpha across // the same [pxFadeStart, pxFadeEnd] band the procedural-disk pass - // fades IN over — see the `pxFadeStart` / `pxFadeEnd` doc-comment on + // fades IN over — see the 'pxFadeStart' / 'pxFadeEnd' doc-comment on // Uniforms for the rationale and the chosen alpha-zero strategy. // // Why @interpolate(flat)? All 6 vertices of a single billboard share - // the same `sizePx` (it's a function of per-instance state — galaxy - // distance, diameter, the floor `u.pointSizePx`, and a per-instance + // the same 'sizePx' (it's a function of per-instance state — galaxy + // distance, diameter, the floor 'u.pointSizePx', and a per-instance // selection scale). Linear interpolation across the quad would be a // wasted multiply per fragment and would produce floating-point // wobble in the crossfade boundary that flat-interp avoids. The @@ -643,7 +689,7 @@ struct VSOut { // wants. // // Note: this field is initialised to 0.0 along the volume-limit - // early-out path (see the `earlyOut` initialiser in `vs`). Those + // early-out path (see the 'earlyOut' initialiser in 'vs'). Those // primitives never rasterise (clip-space (2,2,2,1) is outside the // unit cube), so the value is purely a WGSL-spec requirement that // every VSOut field be initialised on every return path. @@ -653,83 +699,39 @@ struct VSOut { // ─── Schechter LF correction (Task 4 of malmquist-bias plan) ──────────────── // // Originally implemented as a per-fragment 200-step trapezoidal integral -// (commit 7a6d810). That cost ~3.5 M × 6 × 200 ≈ 4 billion `pow + exp` +// (commit 7a6d810). That cost ~3.5 M × 6 × 200 ≈ 4 billion 'pow + exp' // evaluations per frame — the slowest path in the fragment shader by an // order of magnitude. 
The integral now lives at upload time on the CPU
-// (see `expectedNumberDensity` in `src/utils/math/schechterDensity.ts`),
-// with the resulting ratio baked into the per-vertex `schechterRatio`
+// (see 'expectedNumberDensity' in 'src/utils/math/schechterDensity.ts'),
+// with the resulting ratio baked into the per-vertex 'schechterRatio'
 // attribute — identical algorithm, identical numeric output, but
 // evaluated once per galaxy at load instead of millions of times per
-// frame. See the `schechterRatio` doc-comment in the PerVertex struct
+// frame. See the 'schechterRatio' doc-comment in the PerVertex struct
 // above for the full per-vertex/per-fragment trade-off discussion.
 //
 // (Helper function removed — the fragment shader now reads
-// `p.schechterRatio` directly.)
+// 'p.schechterRatio' directly.)

 // ─── colour ramp ──────────────────────────────────────────────────────────────
-
-// Map SDSS g−r colour index to an RGB tint.
-//
-// The piecewise ramp runs: blue → white → red
 //
-//   t ≤ 0      → blueWhite blend from blue (0.4, 0.6, 1.0) toward white
-//   0 < t ≤ 1  → blueWhite blend — still in the blue-to-white half
-//   1 < t ≤ 2  → whiteRed blend from white (1.0, 0.95, 0.8) toward red
-//   t > 2      → fully red (1.0, 0.5, 0.3)
-//
-// Both blends share the same `s = clamp(t * 0.5, 0, 1)` parameter so that
-// the transition is smooth and uses the same 0→1 interpolation range.
-//
-// WGSL `select(a, b, cond)` — note the argument order:
-//   returns `a` when cond is FALSE, returns `b` when cond is TRUE.
-//   So select(blueWhite, whiteRed, t > 1.0)
-//   returns blueWhite when t ≤ 1.0, and whiteRed when t > 1.0.
-//   (This is the reverse of a typical ternary `cond ? b : a` — easy to get wrong.)
-
-fn ramp(t: f32) -> vec3<f32> {
-  // s goes 0→1 as t goes 0→2; clamp stops it at 0 for negatives and 1 for t>2.
-  let s = clamp(t * 0.5, 0.0, 1.0);
-
-  // Blue-to-white: hot blue (quasars, O/B stars) fading to a warm white.
-  let blueWhite = mix(vec3(0.4, 0.6, 1.0), vec3(1.0, 0.95, 0.8), s);
-
-  // White-to-red: warm white fading to cool red (M-type stars, red galaxies).
-  let whiteRed = mix(vec3(1.0, 0.95, 0.8), vec3(1.0, 0.5, 0.3), s);
-
-  // Pick the right half of the ramp: blue-white for t ≤ 1, white-red for t > 1.
-  // Remember: select(falseVal, trueVal, condition).
-  return select(blueWhite, whiteRed, t > 1.0);
-}
-
-// ─── quad corner offsets ──────────────────────────────────────────────────────
-
-// A triangle-list of 6 vertices forming one unit quad in [-1,+1]².
-//
-//   (-1,+1) ──── (+1,+1)
-//      │ ╲  tri2  │
-//      │  tri1  ╲ │
-//   (-1,-1) ──── (+1,-1)
-//
-// triangle 1: verts 0,1,2 → bottom-left, bottom-right, top-left
-// triangle 2: verts 3,4,5 → top-left, bottom-right, top-right
-//
-// Why not use an index buffer? An index buffer would let us share the 4 unique
-// corners and reference them via 6 indices — saving 2 redundant vertex shader
-// invocations per quad. For our case the saving is tiny (2 out of 6 = 33% fewer
-// vertex invocations, but each is extremely cheap), while index buffers add JS-
-// side boilerplate (GPUBuffer creation, pipeline indexFormat declaration,
-// drawIndexed call). The triangle-list approach is the simplest possible setup.
-
-const QUAD = array<vec2<f32>, 6>(
-  vec2(-1.0, -1.0), // 0 — bottom-left
-  vec2( 1.0, -1.0), // 1 — bottom-right
-  vec2(-1.0,  1.0), // 2 — top-left
-  vec2(-1.0,  1.0), // 3 — top-left (repeated for triangle 2)
-  vec2( 1.0, -1.0), // 4 — bottom-right (repeated for triangle 2)
-  vec2( 1.0,  1.0), // 5 — top-right
-);
+// 'ramp(t)' lives in 'lib/colorIndex.wesl' — imported at the top of
+// this file. Shared with the procedural-disk pass so a galaxy's tint
+// is identical at every LOD; see that file's docblock for the full
+// piecewise definition and anchor colours.
 // ─── vertex stage ─────────────────────────────────────────────────────────────
+//
+// The 6-vertex quad lookup that used to live here as a 'const QUAD'
+// array now comes from 'lib/billboard.wesl::quadCorner'. The library
+// uses the (BL, BR, TR, BL, TR, TL) ordering shared with the quads,
+// disks, and proceduralDisks renderers; both that ordering and the
+// (BL, BR, TL, TL, BR, TR) form points.wesl previously used are valid
+// CCW triangulations of the same unit square, and the points pipeline
+// runs with WebGPU's default 'cullMode: none' (see pointRenderer.ts
+// 'createRenderPipeline'), so the rendered output is byte-identical.
+// We deliberately don't use an index buffer for the same reason as
+// before: 2 redundant vertex invocations per quad is cheaper than the
+// JS-side boilerplate of 'GPUBuffer + indexFormat + drawIndexed'.

 // The vertex shader runs once per (instance, vertex) pair.
 // @builtin(vertex_index) cycles 0..5 within each instance (per-vertex)
@@ -751,10 +753,12 @@ fn vs(
   // clip = viewProj * [x, y, z, 1]
   // After this, clip.xyz/clip.w gives the NDC position (in [-1,+1]³ for x,y;
   // [0,1] for z with WebGPU's perspectiveZO convention).
-  let center = u.viewProj * vec4(p.position, 1.0);
+  // Routed through the shared 'worldToClip' helper from 'lib/camera.wesl'
+  // so every renderer's projection step is searchable under a single name.
+  let center = worldToClip(u.cam, p.position);

   // Fetch the quad corner for this vertex (in [-1,+1]²).
- let corner = QUAD[vi]; + let corner = quadCorner(vi); // ── Malmquist-bias gating (volume-limited mode) ────────────────────────── // @@ -765,34 +769,34 @@ fn vs( // // M = m - 5 · log10(d_Mpc) - 25 // - // The `+25` term comes from the unit choice: distance modulus textbooks - // write `M = m - 5·log10(d / 10pc) = m - 5·log10(d_pc) + 5`, and + // The '+25' term comes from the unit choice: distance modulus textbooks + // write 'M = m - 5·log10(d / 10pc) = m - 5·log10(d_pc) + 5', and // log10(1 Mpc / 10 pc) = log10(1e5) = 5, so converting the distance unit // from parsecs to megaparsecs adds 5·5 = 25 to the additive constant. - // Mirror of `absoluteFromApparent` in src/utils/math/distanceModulus.ts. + // Mirror of 'absoluteFromApparent' in src/utils/math/distanceModulus.ts. // - // WGSL has no `log10` intrinsic — only the natural log — so we divide + // WGSL has no 'log10' intrinsic — only the natural log — so we divide // by ln(10) ≈ 2.302585093. // - // ── Why a degenerate clip-space output instead of `discard`? ───────────── + // ── Why a degenerate clip-space output instead of 'discard'? ───────────── // - // `discard` is a *fragment-stage* keyword — it tells the rasteriser to + // 'discard' is a *fragment-stage* keyword — it tells the rasteriser to // throw away the current pixel. The vertex stage has no equivalent // statement; it must always return a clip-space position. The accepted // workaround is to emit a clip-space coordinate that lies outside the // unit cube ([-1, +1]³), so the GPU's clip+cull stage drops every - // primitive that touches the vertex. Setting `xyz = (2, 2, 2)` with - // `w = 1` puts the post-divide NDC at (2, 2, 2) — well outside the unit + // primitive that touches the vertex. 
Setting 'xyz = (2, 2, 2)' with + // 'w = 1' puts the post-divide NDC at (2, 2, 2) — well outside the unit // cube — and crucially does the same for *all 6 vertices* of the - // billboard quad (because `p.biasMode`, `p.absMagLimit`, and `dMpc` all + // billboard quad (because 'p.biasMode', 'p.absMagLimit', and 'dMpc' all // depend only on per-instance state, every vertex of the quad makes the // same decision). No fragment shader invocations get scheduled for the // discarded galaxy, so we save roughly the same work as a fragment-stage - // `discard` would have. The only wasted work is the six vertex + // 'discard' would have. The only wasted work is the six vertex // invocations themselves, which are cheap. // - // We gate this on `u.biasMode == 1u` (the VolumeLimited literal in - // src/data/biasMode.ts) so the default mode (`None == 0u`) is a single + // We gate this on 'u.biasMode == 1u' (the VolumeLimited literal in + // src/data/biasMode.ts) so the default mode ('None == 0u') is a single // u32 compare per vertex — effectively free. let dMpc = length(p.position); let LOG10 = 2.302585092994046; @@ -800,11 +804,11 @@ fn vs( // Recover the per-instance packed identity now so both the early-out // and the main path can share one source of truth. Bits 27..31 = the - // 5-bit `cloud.sourceCode` (this draw's survey, set per-source via the + // 5-bit 'cloud.sourceCode' (this draw's survey, set per-source via the // @group(1) bind group); bits 0..26 = the GPU's - // `@builtin(instance_index)` (local 0..count-1). This is the same + // '@builtin(instance_index)' (local 0..count-1). This is the same // value the pick fragment writes (with a +1 sentinel) and the same - // value `u.selectedPacked` is compared against. + // value 'u.selectedPacked' is compared against. 
let myPacked = (cloud.sourceCode << 27u) | ii; if (u.biasMode == 1u && absMag > u.absMagLimit) { @@ -828,7 +832,7 @@ fn vs( earlyOut.angularDensityWeight = 1.0; earlyOut.camDistMpc = 0.0; earlyOut.depthFade = 1.0; - // sizePx is plumbed for the procedural-disk crossfade-OUT in `fs` + // sizePx is plumbed for the procedural-disk crossfade-OUT in 'fs' // (Task 8); the early-out primitive never rasterises but WGSL // requires every VSOut field be initialised on every return path. earlyOut.sizePx = 0.0; @@ -839,13 +843,13 @@ fn vs( // // Determine whether this instance is the user-selected point. // - // `u.selectedPacked` is `(selectedSource << 27) | selectedLocalIdx` when - // a galaxy is pinned, or `0xFFFFFFFFu` when nothing is selected. We - // compare against this draw's `myPacked = (cloud.sourceCode << 27) | ii` so + // 'u.selectedPacked' is '(selectedSource << 27) | selectedLocalIdx' when + // a galaxy is pinned, or '0xFFFFFFFFu' when nothing is selected. We + // compare against this draw's 'myPacked = (cloud.sourceCode << 27) | ii' so // each source's identity range stays disjoint by construction (bits // 27..31 = source code, never overlap across surveys). // - // No more `realIdx & 0x7fffffffu` masking: the previous revision baked + // No more 'realIdx & 0x7fffffffu' masking: the previous revision baked // a global running-sum index per vertex with the high bit doubling as a // fallback flag. Both went away with the (source, localIdx) packing // refactor; the fallback flag now rides on the sign bit of axisRatio, @@ -857,7 +861,7 @@ fn vs( // is unmistakable — even a faint, magnitude-22 galaxy gets a visible halo. // Non-selected points keep the apparent-size radius. // - // We use `select(normalSize, selectedSize, isSelected)` — WGSL's ternary. + // We use 'select(normalSize, selectedSize, isSelected)' — WGSL's ternary. // Recall the argument order: select(falseValue, trueValue, condition). 
let sizeScale = select(1.0, 8.0, isSelected); @@ -865,7 +869,7 @@ fn vs( // // We want each galaxy's billboard to occupy its real angular footprint on // screen — a galaxy 5 Mpc away gets a much bigger disk than one 500 Mpc - // away — but never to vanish below `u.pointSizePx`, which acts as the + // away — but never to vanish below 'u.pointSizePx', which acts as the // far-field "still detectable as a glowing dot" floor. // // A galaxy approximated as a 30-kpc-diameter disk (the project's current @@ -876,7 +880,7 @@ fn vs( // = radius_Mpc / distance_Mpc // // for the small-angle range we care about (galaxies subtend at most a - // few degrees even when very close). Multiplying by `u.pxPerRad` + // few degrees even when very close). Multiplying by 'u.pxPerRad' // converts radians to screen pixels. // // Why max(floor, apparent) rather than just apparent? In the far field @@ -889,7 +893,7 @@ fn vs( // // Why 0.06 Mpc (= 60 kpc radius) rather than the physical 15 kpc radius? // Match the QuadRenderer's footprint. The thumbnail quad uses - // `sizeWorld = diameter_kpc * 4 / 1000 = 0.12 Mpc` total = 0.06 Mpc + // 'sizeWorld = diameter_kpc * 4 / 1000 = 0.12 Mpc' total = 0.06 Mpc // half-extent, with the visible galaxy body filling its central ~25% // and a soft alpha-fade in the surrounding tail (the cutout JPEG fades // to transparent away from the galaxy). Sizing the point billboard @@ -903,13 +907,13 @@ fn vs( // texture rather than the soft glow. // Per-galaxy radius in Mpc, derived from the per-instance diameterKpc // attribute. The 4× padding factor matches QuadRenderer's - // `sizeWorld = (diameterKpc / 1000) * 4`, so the soft glowing dot and + // 'sizeWorld = (diameterKpc / 1000) * 4', so the soft glowing dot and // the textured thumbnail occupy the same world-space footprint and // the load-fade transition is seamless. 
Algebra: // // radius_Mpc = (diameterKpc / 2) * 4 / 1000 = diameterKpc * 2 / 1000 // - // The `select` clamps pathological zero/NaN diameters back to the + // The 'select' clamps pathological zero/NaN diameters back to the // project-wide default — the build pipeline already guarantees a // finite positive value, but a corrupted .bin shouldn't black-hole // the whole sky. @@ -925,7 +929,7 @@ fn vs( // ── PIXEL-SIZE-IN-CLIP-SPACE CONVERSION ────────────────────────────────── // - // We want the billboard to be `sizePx` pixels in radius on screen, + // We want the billboard to be 'sizePx' pixels in radius on screen, // regardless of the point's clip-space depth. // // Clip space spans [-1, +1] in X and Y — a range of 2.0 in each direction. @@ -937,7 +941,7 @@ fn vs( // making the apparent size shrink with distance (points farther away look // smaller). This is exactly *wrong* for fixed-pixel billboards (we'd want // them constant on screen), so we cancel the divide by multiplying by w. - // For our distance-dependent `sizePx`, the same cancellation still applies: + // For our distance-dependent 'sizePx', the same cancellation still applies: // the math gives "this many screen pixels regardless of clip-space depth" // and the size variation comes from sizePx itself, not from perspective. // @@ -952,8 +956,13 @@ fn vs( // For the bare points we therefore keep the original screen-X/+Y // basis: stable through camera motion, and the ellipse mask uses // sky-PA without any screen-vs-sky reconciliation. - let pxToClip = vec2(2.0 / u.viewport.x, 2.0 / u.viewport.y); - let offset = corner * sizePx * sizeScale * pxToClip * center.w; + // The shared 'expandBillboardScreen' helper computes the clip-space + // delta for a 'sizePx'-pixel-radius screen-aligned billboard corner; + // we then post-multiply by 'sizeScale' so the per-instance 8× halo + // expansion (selection ring) keeps stacking on top of the base pixel + // size. 
See 'lib/billboard.wesl' for the centerClip.w / viewportPx + // cancellation derivation. + let offset = expandBillboardScreen(u.cam, center, sizePx, corner) * sizeScale; var out: VSOut; @@ -982,17 +991,17 @@ fn vs( // // colour_rest ≈ colour_obs − k · z // - // …where the coefficient `k` is *not* a single shader-wide constant. Each + // …where the coefficient 'k' is *not* a single shader-wide constant. Each // survey we render uses a different colour pair, and each pair has its own - // sensitivity to bandpass shift, so `k` lives in the per-instance vertex - // attribute `p.kPerZ` (baked at upload time per-source on the JS side): + // sensitivity to bandpass shift, so 'k' lives in the per-instance vertex + // attribute 'p.kPerZ' (baked at upload time per-source on the JS side): // // - SDSS u−g → k ≈ 3.0 (steep — u and g straddle the 4000 Å break) // - GLADE B−J → k ≈ 1.0 (modest — B touches a Balmer break, J is NIR) // - 2MRS J−K → k ≈ 0.0 (NIR is nearly z-invariant at z < 0.1) // // Why a per-vertex attribute and not a uniform? Per-row variability: - // each survey's k coefficient is fixed per draw, but `p.kPerZ` also + // each survey's k coefficient is fixed per draw, but 'p.kPerZ' also // carries 0 for the sentinel-colour-index rows that lack a measurable // colour pair, which a global uniform can't express. 4 bytes per // instance (≈10 MB for SDSS) — well worth it for correct colour. @@ -1043,14 +1052,14 @@ fn vs( // // ── 1/V_max alpha modulation (Task 3 of malmquist-bias plan) ───────────── // - // When `u.biasMode == 2u` (the `BiasMode.VMax` literal), multiply the - // intensity by the per-vertex `vMaxWeight` baked at upload time. This + // When 'u.biasMode == 2u' (the 'BiasMode.VMax' literal), multiply the + // intensity by the per-vertex 'vMaxWeight' baked at upload time. 
This // dims intrinsically-bright galaxies whose detectability volume V_max // greatly exceeds the reference volume V_ref — they're visible across // a much larger slice of space than their faint companions, so without // the down-weighting they'd over-represent themselves visually. // - // The `select(1.0, p.vMaxWeight, …)` keeps the OTHER three modes + // The 'select(1.0, p.vMaxWeight, …)' keeps the OTHER three modes // (None, VolumeLimited, Schechter) unchanged: each multiplies by 1.0, // so the volume-limited gating from above and the no-correction default // both render exactly as they did before this task. This is what makes @@ -1059,11 +1068,11 @@ fn vs( out.intensity = clamp((22.0 - p.magnitude) / 8.0, 0.05, 1.0) * u.brightness * vMaxAlpha; // Forward the per-instance packed identity to the pick fragment entry - // point (`fsPick`). Same value the visual stage already computed for - // `myPacked` above; `fsPick` adds the +1 sentinel and writes it into + // point ('fsPick'). Same value the visual stage already computed for + // 'myPacked' above; 'fsPick' adds the +1 sentinel and writes it into // the r32uint pick texture. // - // The visual `fs` entry point ignores this field — WGSL silently + // The visual 'fs' entry point ignores this field — WGSL silently // allows a fragment shader to declare fewer inputs than the vertex // shader outputs, as long as the @location values that *are* declared // match. We keep it here so both fragment entry points share one @@ -1074,13 +1083,13 @@ fn vs( // 1u = this instance is selected; 0u = normal point. out.selected = select(0u, 1u, isSelected); - // Forward the fallback flag for the highlight + hide toggles in `fs`. + // Forward the fallback flag for the highlight + hide toggles in 'fs'. out.isFallback = isFallbackFlag; // Forward the absolute axisRatio so the fragment stage's elliptical // mask uses the unsigned magnitude. 
Sign bit was the fallback flag - // (already extracted into `isFallbackFlag`); negative values would - // make the existing `axisRatio > 0.0` validity check trip on every + // (already extracted into 'isFallbackFlag'); negative values would + // make the existing 'axisRatio > 0.0' validity check trip on every // fallback row and collapse the ellipse mask to a circle. out.axisRatio = abs(p.axisRatio); @@ -1090,19 +1099,19 @@ fn vs( // where rotating the UV is the inverse of rotating the ellipse, with an // extra sign flip because astronomical PA is east-of-north — CCW on // sky — but our UV-y points down on screen). See the doc-comment at - // the top of `fs` for the full reasoning. + // the top of 'fs' for the full reasoning. let paRad = -p.positionAngleDeg * 3.14159265 / 180.0; out.paCs = cos(paRad); out.paSn = sin(paRad); // Forward origin-relative distance. Originally consumed by the Schechter // mode-3 fragment integral; that integral has moved to per-vertex bake, - // so `dMpc` is currently unused in the fragment stage but kept as a + // so 'dMpc' is currently unused in the fragment stage but kept as a // plumbed field for future distance-dependent fragment effects. out.dMpc = dMpc; // Forward the per-galaxy Schechter density ratio (baked at upload time). - // The vertex stage above already folded it into `out.intensity` for + // The vertex stage above already folded it into 'out.intensity' for // mode 3 — forwarding it through VSOut keeps the attribute available to // the fragment stage in case future tweaks (e.g. tint modulation) want // to read it. Costs nothing: with @interpolate(flat) the GPU writes @@ -1110,21 +1119,21 @@ fn vs( out.schechterRatio = p.schechterRatio; // Forward the per-galaxy HEALPix angular re-weight (baked at mode-4 - // toggle time, default 1.0). Read in `fs` only when `u.biasMode == 4u`; + // toggle time, default 1.0). Read in 'fs' only when 'u.biasMode == 4u'; // flat-interpolated through VSOut for per-instance constancy. 
out.angularDensityWeight = p.angularDensityWeight; // Forward camera-relative distance for the depth-fade in the fragment - // stage. The vertex stage already computed `distanceMpc` for the + // stage. The vertex stage already computed 'distanceMpc' for the // apparent-pixel-size calculation above; we just forward it here so the - // fragment doesn't need access to `u.camPosWorld` and another `length()`. + // fragment doesn't need access to 'u.camPosWorld' and another 'length()'. out.camDistMpc = distanceMpc; // Pre-compute the depth-fade multiplier here so the fragment doesn't // re-derive it for every pixel of every billboard. Curve: - // `1 / (1 + (camDist / FALLOFF_HALF)²)`. The 1000 Mpc half-distance + // '1 / (1 + (camDist / FALLOFF_HALF)²)'. The 1000 Mpc half-distance // matches the fragment-stage version this replaced — see the - // depth-fade doc-comment in `fs` for why this constant and why the + // depth-fade doc-comment in 'fs' for why this constant and why the // effect is cosmetic (additive emission shouldn't physically depth- // attenuate, but unbroken depth columns wash out the visible volume). let FALLOFF_HALF_MPC = 1000.0; @@ -1134,9 +1143,9 @@ fn vs( // Forward the per-instance billboard radius in screen-pixels so the // fragment stage can fade points-pass alpha across the procedural- - // disk crossfade band. See the `sizePx` doc-comment on VSOut and the - // `pxFadeStart` / `pxFadeEnd` doc-comment on Uniforms. No extra cost - // here — `sizePx` was already computed above for the billboard offset. + // disk crossfade band. See the 'sizePx' doc-comment on VSOut and the + // 'pxFadeStart' / 'pxFadeEnd' doc-comment on Uniforms. No extra cost + // here — 'sizePx' was already computed above for the billboard offset. out.sizePx = sizePx; return out; @@ -1145,7 +1154,7 @@ fn vs( // ─── fragment stage ─────────────────────────────────────────────────────────── // The fragment shader runs once per pixel covered by a rasterised triangle. 
-// `in.uv` has been interpolated from the three vertices — but since our quad +// 'in.uv' has been interpolated from the three vertices — but since our quad // corners all share the same tint and intensity, only uv varies meaningfully. @fragment @@ -1164,11 +1173,11 @@ fn fs(in: VSOut) -> @location(0) vec4 { // 1. Astronomical PA is measured east of north (counter-clockwise on // sky), but our UV-y points down on screen — a sign flip. // 2. Rotating the UV is the inverse of rotating the ellipse, so the - // target rotation `+PA` becomes a UV rotation of `-PA`. + // target rotation '+PA' becomes a UV rotation of '-PA'. // // The cs/sn pair is now pre-computed in the vertex stage and flat- - // interpolated. See the `paCs` / `paSn` doc-comment on VSOut for why: - // `paRad` is per-instance constant, so doing the trig once per primitive + // interpolated. See the 'paCs' / 'paSn' doc-comment on VSOut for why: + // 'paRad' is per-instance constant, so doing the trig once per primitive // (and reading it here) is much cheaper than the same trig per fragment // across millions of billboards. let cs = in.paCs; @@ -1184,7 +1193,7 @@ fn fs(in: VSOut) -> @location(0) vec4 { // situation we want every billboard to render as a circle, identical to // pre-orientation behaviour. // - // Trick: `NaN > 0.0` is false in WGSL, so the same comparison catches + // Trick: 'NaN > 0.0' is false in WGSL, so the same comparison catches // both NaN and the (shouldn't-happen) zero/negative case. When invalid, // we use safeAB = 1.0 → elliptic.y = rotated.y → circular r2 = original // dot(uv, uv). When valid, we clamp at 0.05 against a hypothetical @@ -1211,26 +1220,25 @@ fn fs(in: VSOut) -> @location(0) vec4 { // user reported "the point screen-aligned blob shows when the galaxy // is selected (not when unselected, which is the right behaviour)". 
// - // The fade trigger is the UNSCALED `in.sizePx` (vertex stage line 1070 - // forwards `sizePx` BEFORE applying the 8× `sizeScale`), so the fade + // The fade trigger is the UNSCALED 'in.sizePx' (vertex stage line 1070 + // forwards 'sizePx' BEFORE applying the 8× 'sizeScale'), so the fade // band aligns with the procedural-disk emission band on the underlying // galaxy footprint — not the inflated halo radius. Selecting a galaxy // does NOT change when its point fades; only what shape it draws below // the band. // // See the long fade-band doc-comment further down (now reached only by - // normal points) for the px-scaling rationale (`sizePx * 0.5`) and the + // normal points) for the px-scaling rationale ('sizePx * 0.5') and the // smoothstep choice. let apparentDiameterPx = in.sizePx * 0.5; - let fadeT = clamp( + let fadeT = saturate( (apparentDiameterPx - u.pxFadeStart) / (u.pxFadeEnd - u.pxFadeStart), - 0.0, 1.0, ); let pointAlphaMult = 1.0 - fadeT * fadeT * (3.0 - 2.0 * fadeT); // ── SELECTION RING vs NORMAL DISK ───────────────────────────────────────── // - // For the selected point we rendered a 3× larger billboard in `vs`, so the + // For the selected point we rendered a 3× larger billboard in 'vs', so the // UV space still spans [-1,+1]² but represents a physically bigger area. // We draw a hollow ring by: // 1. Discarding the outer region (r² > 1.0) → circular boundary. @@ -1251,20 +1259,20 @@ fn fs(in: VSOut) -> @location(0) vec4 { // ── Inner disk (the point itself) ────────────────────────────────────── // - // We scaled the billboard 8× in `vs`, so the original point's footprint + // We scaled the billboard 8× in 'vs', so the original point's footprint // occupies the inner 1/8 in linear distance — i.e. r² ≤ (1/8)² = 1/64 // ≈ 0.0156 in this scaled UV space. Inside that radius we render the // *normal* point disk so the user can still see the selected galaxy's // own brightness, not just the highlight ring around it. 
// - // CRITICAL: use the ELLIPTICAL `r2` (computed above from the rotated + - // squashed UV) here, NOT `r2_circ`. With the round mask the selected + // CRITICAL: use the ELLIPTICAL 'r2' (computed above from the rotated + + // squashed UV) here, NOT 'r2_circ'. With the round mask the selected // galaxy's inner shape would suddenly become a perfect circle, making // it look like the orientation collapsed on click. The elliptical r2 // gives us the same shape as the unselected point, just scaled 8×. // - // The alpha factor `exp(-r2 * 256)` is the original `exp(-r2 * 4)` - // remapped: at r² = 1/64, we want the same `exp(-4)` falloff the + // The alpha factor 'exp(-r2 * 256)' is the original 'exp(-r2 * 4)' + // remapped: at r² = 1/64, we want the same 'exp(-4)' falloff the // unscaled point would have, so we multiply r² by 64 (= 8²) before // applying the original ×4 coefficient → 256. if (r2 < 0.0156) { @@ -1285,16 +1293,16 @@ fn fs(in: VSOut) -> @location(0) vec4 { // // ── Constant-pixel stroke width ──────────────────────────────────────── // - // The earlier formulation used `r²_circ ∈ [0.72, 1.0]` — a band that's + // The earlier formulation used 'r²_circ ∈ [0.72, 1.0]' — a band that's // ~15 % of the billboard radius regardless of pixel size. When the // billboard is 200 px (close zoom on a big galaxy with the 8× selection // scale), 15 % is 30 px — a fat doughnut. We instead pick a target // stroke width in pixels and convert that into a fraction of the - // billboard radius using `in.sizePx * 8.0` (the actual on-screen halo + // billboard radius using 'in.sizePx * 8.0' (the actual on-screen halo // radius — in.sizePx is the unscaled value, sizeScale = 8 is applied // in vs). // - // `min(0.15, …)` caps the band fraction at the original 15 % so faint/ + // 'min(0.15, …)' caps the band fraction at the original 15 % so faint/ // far galaxies (where 8 × sizePx is small) don't end up with a stroke // wider than the billboard itself. 
let HALO_RADIUS_PX = in.sizePx * 8.0; @@ -1309,7 +1317,7 @@ fn fs(in: VSOut) -> @location(0) vec4 { // of the stroke at full intensity (the ring should look crisp) but // avoid the hard pixelated edge a binary test would produce. // - // We deliberately do NOT multiply by `pointAlphaMult` here — the + // We deliberately do NOT multiply by 'pointAlphaMult' here — the // ring is UI and stays at full intensity through the procedural- // disk crossfade band. The inner-disk case above (the galaxy's // own dot inside the 8× billboard) DOES fade because the @@ -1336,7 +1344,7 @@ fn fs(in: VSOut) -> @location(0) vec4 { // ── NORMAL POINT — solid disk with Gaussian falloff (now ELLIPTICAL) ────── // Discard fragments outside the oriented ellipse defined by axisRatio + PA. - // `r2` was computed from the rotated/squashed UV above, so this single + // 'r2' was computed from the rotated/squashed UV above, so this single // unit-radius test covers the elliptical mask without needing a separate // shape-specific check. if (r2 > 1.0) { discard; } @@ -1348,11 +1356,11 @@ fn fs(in: VSOut) -> @location(0) vec4 { // ── Schechter density correction (Task 4 of malmquist-bias plan) ──────── // - // Mode 3: modulate alpha by the per-galaxy ratio `clamp(N_ref / n(d), 0, 10)` - // baked at upload time into `in.schechterRatio`. Originally implemented as + // Mode 3: modulate alpha by the per-galaxy ratio 'clamp(N_ref / n(d), 0, 10)' + // baked at upload time into 'in.schechterRatio'. Originally implemented as // a 200-step trapezoidal integral evaluated PER FRAGMENT (commit 7a6d810); - // that loop is gone and the cost dropped from ~4 billion `pow + exp` per - // frame to a single multiply. See the `schechterRatio` doc-comment in + // that loop is gone and the cost dropped from ~4 billion 'pow + exp' per + // frame to a single multiply. 
See the 'schechterRatio' doc-comment in // the PerVertex struct for the bake-time clamp + degenerate-distance // handling — the ratio is already finite, already in [0, 10], and already // 0 when the survey can't see anything at this distance. No fragment-side @@ -1363,16 +1371,16 @@ fn fs(in: VSOut) -> @location(0) vec4 { // ── HEALPix angular re-weight (Task 8 of malmquist-bias plan) ──────────── // // Mode 4: modulate alpha by the per-galaxy ratio - // `clamp(medianCellCount / localCellCount, 0.1, 10)` baked at toggle time - // into `in.angularDensityWeight`. Down-weights galaxies in over-dense + // 'clamp(medianCellCount / localCellCount, 0.1, 10)' baked at toggle time + // into 'in.angularDensityWeight'. Down-weights galaxies in over-dense // angular cells (e.g., GLADE's SDSS-DR12 footprint at high z) and // up-weights galaxies in sparse cells (the rest of the sky at the same // shell), flattening the radial pencil-beam-jet artefacts the user - // reported. See `computeAngularWeights.ts` for the per-shell median + // reported. See 'computeAngularWeights.ts' for the per-shell median // pass that produces these weights. // // The bake is per-survey, never global, so SDSS's footprint can't - // contaminate GLADE's correction. Each `select` reads its own cloud's + // contaminate GLADE's correction. Each 'select' reads its own cloud's // weight via the per-vertex slot; the only thing the shader knows is // "use this weight when biasMode == 4u". let angWeight = select(1.0, in.angularDensityWeight, u.biasMode == 4u); @@ -1388,7 +1396,7 @@ fn fs(in: VSOut) -> @location(0) vec4 { // distance from the camera so the back half of the volume contributes // less, breaking up the saturation stack. // - // Curve: `weight = 1 / (1 + (camDist / FALLOFF_HALF)²)`. Smooth, finite + // Curve: 'weight = 1 / (1 + (camDist / FALLOFF_HALF)²)'. Smooth, finite // at d=0 (no divide-by-zero), monotonically decreasing. 
At
  // camDist = FALLOFF_HALF the weight is exactly 0.5 (so the half-distance
  // marks the half-power point). Constant chosen so a galaxy 1 Gpc from
@@ -1397,10 +1405,10 @@ fn fs(in: VSOut) -> @location(0) vec4<f32> {
  // stop the depth-column saturation that motivated this fix.
  //
  // The depth-fade multiplier is now pre-computed in the vertex stage and
- // flat-interpolated as `in.depthFade`. See the `depthFade` doc-comment
+ // flat-interpolated as 'in.depthFade'. See the 'depthFade' doc-comment
  // on VSOut: per-instance constant, so once per primitive instead of
  // once per fragment. The vertex stage already handles the
- // `u.depthFadeEnabled` gate, so this is unconditionally a multiply.
+ // 'u.depthFadeEnabled' gate, so this is unconditionally a multiply.
  //
  // Not physically correct (additive emission shouldn't care about
  // depth), but the alternative — letting the depth-column saturate
@@ -1411,9 +1419,9 @@ fn fs(in: VSOut) -> @location(0) vec4<f32> {
  // ── Procedural-disk crossfade-OUT (Task 8 of procedural-disk-impostor) ──
  //
  // The thumbnail subsystem's procedural-disk pass fades IN across the
- // [u.pxFadeStart, u.pxFadeEnd] band using `t * t * (3 - 2 * t)` (the
+ // [u.pxFadeStart, u.pxFadeEnd] band using 't * t * (3 - 2 * t)' (the
  // smoothstep cubic). We fade the points-pass OUT with the
- // *complementary* curve `1 - t * t * (3 - 2 * t)` — fully visible
+ // *complementary* curve '1 - t * t * (3 - 2 * t)' — fully visible
  // below pxFadeStart, fully invisible above pxFadeEnd, and a smooth
  // C¹-continuous handoff inside the band. Sum of the two curves is
  // identically 1.0 across the band, so the additive HDR contribution
@@ -1421,23 +1429,14 @@ fn fs(in: VSOut) -> @location(0) vec4<f32> {
  // double-counting (which is what happens with no fade-out — the
  // "double-bright donut" the user reported).
  //
- // `pointAlphaMult` is computed up at the top of `fs` so the
+ // 'pointAlphaMult' is computed up at the top of 'fs' so the
  // selection-ring branch can apply the same fade — see the comment
- // there for the px-scaling rationale (`sizePx * 0.5`) and why the
+ // there for the px-scaling rationale ('sizePx * 0.5') and why the
  // selection halo also has to fade out (otherwise selecting a galaxy
  // and zooming in would leave the 8× selection halo rendered on top
  // of the procedural-disk impostor).
  alpha = alpha * pointAlphaMult;
- // ── Cloud fade-in ──────────────────────────────────────────────────────────
- //
- // Multiply by the per-source opacity uniform (set per-frame from the JS
- // side based on time-since-upload). Steady-state opacity = 1.0, so this
- // is a no-op once a cloud has finished fading. See the CloudUniforms
- // docblock above for the full rationale; tl;dr a freshly-uploaded cloud
- // glides into view over ~500 ms instead of popping into existence.
- alpha = alpha * cloud.opacity;
-
  // Highlight fallback rows in magenta when the toggle is on. The 0.3 in
  // the green channel keeps fallback galaxies recognisable as "data-y"
  // rather than turning them into pure UI accents — they still render at
@@ -1447,11 +1446,24 @@ fn fs(in: VSOut) -> @location(0) vec4<f32> {
  // Scale the colour by the per-point intensity.
  let rgb = tintFinal * in.intensity;
+ // ── Cloud fade-in ──────────────────────────────────────────────────────────
+ //
+ // Multiply alpha by the per-source opacity uniform (set per-frame from the
+ // JS side based on time-since-upload). Steady-state opacity = 1.0, so this
+ // is a no-op once a cloud has finished fading. See the CloudUniforms
+ // docblock in lib/cloudFade.wesl for the full rationale; tl;dr a freshly-
+ // uploaded cloud glides into view over ~600 ms instead of popping.
+ //
+ // 'applyCloudFade' is a documented place that says 'never multiply
+ // opacity into RGB' — it takes the scalar alpha alongside opacity and
+ // returns the faded scalar alpha, leaving 'rgb' untouched.
+ alpha = applyCloudFade(alpha, cloud.opacity);
+
  // ── PREMULTIPLIED ALPHA ──────────────────────────────────────────────────
  //
  // We output (rgb * alpha, alpha) — "premultiplied alpha" — rather than
  // (rgb, alpha). This is *required* because the canvas was configured with
- // `alphaMode: 'premultiplied'` in device.ts.
+ // 'alphaMode: 'premultiplied'' in device.ts.
  //
  // In premultiplied alpha, the RGB channels already contain the result of
  // multiplying colour by opacity. The GPU blend equation for additive blending
@@ -1466,35 +1478,35 @@ fn fs(in: VSOut) -> @location(0) vec4<f32> {
  // compositing against the page, producing colours that are too dark.
  //
  // The additive blend mode itself is configured in the pipeline descriptor on
- // the JS side (Task 10) — specifically in the `targets[0].blend` descriptor.
+ // the JS side (Task 10) — specifically in the 'targets[0].blend' descriptor.
  return vec4(rgb * alpha, alpha);
 }
 // ─── pick fragment stage ──────────────────────────────────────────────────────
-// `fsPick` is the second fragment entry point in this file. A single WGSL
+// 'fsPick' is the second fragment entry point in this file. A single WGSL
 // shader module can contain multiple entry points of the same stage; each
-// `GPURenderPipeline` selects one via its `fragment.entryPoint` field.
+// 'GPURenderPipeline' selects one via its 'fragment.entryPoint' field.
 //
-// The pick pass renders into an `r32uint` offscreen texture (not the visible
+// The pick pass renders into an 'r32uint' offscreen texture (not the visible
 // swap-chain texture). Each fragment writes the *1-based* instance index of the
 // catalog point whose billboard covers that pixel.
The JS side reads a single
 // pixel from this texture under the cursor and converts it back to a 0-based
 // point index.
 //
 // WHY OFFSET BY 1?
-// The texture is cleared to 0 before the pass. If we wrote `instanceIdx`
+// The texture is cleared to 0 before the pass. If we wrote 'instanceIdx'
 // directly, instance 0 would be indistinguishable from the cleared background.
-// Instead we write `instanceIdx + 1`, so 0 always means "no hit" and any
+// Instead we write 'instanceIdx + 1', so 0 always means "no hit" and any
 // value ≥ 1 decodes to a valid point by subtracting 1.
 //
 // WHY A LARGER RADIUS (2.25 vs 1.0)?
 // A forgiveness radius of 1.5× lets the user pick a point without needing to
-// land exactly on its visual disk. The visual `fs` discards fragments where
-// r² > 1.0 (unit disk); `fsPick` discards fragments where r² > 2.25 (= 1.5²),
+// land exactly on its visual disk. The visual 'fs' discards fragments where
+// r² > 1.0 (unit disk); 'fsPick' discards fragments where r² > 2.25 (= 1.5²),
 // effectively making each pick billboard 1.5× larger than the visible one.
 //
-// NOTE: `fsPick` writes `vec4` to @location(0), which maps to an `r32uint`
+// NOTE: 'fsPick' writes 'vec4' to @location(0), which maps to an 'r32uint'
 // render target. The pipeline descriptor on the JS side declares the target
 // format as 'r32uint' and no blend state (integers cannot be blended).
@@ -1506,13 +1518,13 @@ fn fsPick(in: VSOut) -> @location(0) vec4<u32> {
  let r2 = dot(in.uv, in.uv);
  if (r2 > 2.25) { discard; }
- // Write `(sourceCode << 27 | instance_index) + 1` so background pixels
+ // Write '(sourceCode << 27 | instance_index) + 1' so background pixels
  // (cleared to 0) are distinguishable from a real hit. The +1 keeps 0
  // as the unambiguous "no hit" sentinel; even with sourceCode = 0 (the
  // Synthetic survey) and localIdx = 0 the written value is 1, never 0.
  //
- // `in.instanceIdx` was assembled in the vertex stage from
- // `(cloud.sourceCode << 27u) | @builtin(instance_index)`. The packing
+ // 'in.instanceIdx' was assembled in the vertex stage from
+ // '(cloud.sourceCode << 27u) | @builtin(instance_index)'. The packing
  // gives every survey a structurally-disjoint identity range (top 5
  // bits = source code, bottom 27 = local index ≤ 134M), so two
  // galaxies in different surveys can never collide on the same pick
diff --git a/src/services/gpu/shaders/proceduralDisks.wgsl b/src/services/gpu/shaders/proceduralDisks.wesl
similarity index 53%
rename from src/services/gpu/shaders/proceduralDisks.wgsl
rename to src/services/gpu/shaders/proceduralDisks.wesl
index 119c9b5..74ba2ab 100644
--- a/src/services/gpu/shaders/proceduralDisks.wgsl
+++ b/src/services/gpu/shaders/proceduralDisks.wesl
@@ -1,6 +1,6 @@
 // proceduralDisks.wgsl — 3D-oriented procedural galaxy impostors.
 //
-// Sibling pass to `disks.wgsl` (texture-based disks) and `points.wgsl`
+// Sibling pass to 'disks.wgsl' (texture-based disks) and 'points.wgsl'
 // (screen-aligned billboards). Renders every galaxy whose apparent
 // size exceeds 8 px (with a crossfade up to 14 px) as a 3D-oriented
 // quad shaded with a two-component brightness profile (Gaussian bulge
@@ -8,20 +8,68 @@
 // generates the shape entirely from per-fragment math.
 //
 // The vertex stage is structurally identical to disks.wgsl: we
-// construct an in-plane orthonormal basis from `axisRatio` (which
-// encodes inclination via `cos(i) = axisRatio` for thin disks) and
-// `positionAngleDeg` (east-of-north major-axis direction on the sky),
+// construct an in-plane orthonormal basis from 'axisRatio' (which
+// encodes inclination via 'cos(i) = axisRatio' for thin disks) and
+// 'positionAngleDeg' (east-of-north major-axis direction on the sky),
 // then offset the corner vertices into world space. See disks.wgsl
 // for the full derivation; we trust that derivation here and re-use
 // the math.
+// CameraUniforms covers the canonical 80-byte prefix shared by every
+// world-space renderer (viewProj + viewportPx + 8B pad). Like the
+// 'disks' sibling, this vertex stage is intentionally camera-independent
+// — the disk's 3D orientation is an intrinsic galaxy property, so the
+// only camera input the shader consumes is 'cam.viewProj' via
+// 'worldToClip'. We deliberately don't import 'worldToNdc' or
+// 'worldEyeDepth': they're unused here, and importing unused names
+// would just add noise for future readers grepping for camera-helper
+// usage.
+import package::lib::camera::CameraUniforms;
+import package::lib::camera::worldToClip;
+// Shared unit-quad helper from 'lib/billboard.wesl' — replaces the
+// inline 'CORNERS' const that used to live below. The orientation-
+// aligned disk-plane basis (PA + inclination → 'majorAxis' /
+// 'minorAxis' in 3D world space) stays renderer-specific; only the
+// vertex-index → corner lookup is shared. We don't import 'quadUv'
+// here: this pass uses the raw [-1, +1]² corner directly as the
+// fragment's radial coordinate (see 'out.uv = corner' below + the
+// 'length(in.uv)' read in 'fs'), so the [0, 1]² remap that 'quadUv'
+// performs would be wrong. See the docblock at the top of
+// 'lib/billboard.wesl' for why the orientation math is intentionally
+// NOT pulled into this lib.
+import package::lib::billboard::quadCorner;
+// Disk-plane axis math (PA + inclination → world-space major/minor basis)
+// is shared with 'disks.wesl' via 'lib/orientation.wesl'. The inline
+// derivation that used to live in this file's vs() body — the
+// los/north/east/major/minor chain plus the inclination tilt of the
+// minor axis out of the sky plane — is now the lib's 'diskAxes' fn,
+// byte-equivalent for proceduralDisks since the lib standardised on
+// the tight 'northLen < 1e-4' post-projection pole-fallback that
+// proceduralDisks already used.
See 'lib/orientation.wesl' for the
+// camera-independence invariant + the pole-degeneracy discussion.
+import package::lib::orientation::DiskAxes;
+import package::lib::orientation::diskAxes;
+import package::lib::colorIndex::ramp;
+// Shared fragment-stage mask shapes — see 'lib/masks.wesl' for the
+// rationale (three smoothstep patterns recurred across four shaders,
+// naming the shapes makes the intent visible at the call site). The
+// previous inline form here was 'smoothstep(1.0, 0.6, r)' which
+// relies on smoothstep's symmetry under edge inversion to produce
+// the high-inside-low-outside shape; 'circularMask(r, 0.6, 1.0)'
+// produces the same output and makes the inner/outer ordering
+// explicit.
+import package::lib::masks::circularMask;
+
 struct Uniforms {
- viewProj: mat4x4<f32>,
- viewport: vec2<f32>,
- _pad0: f32,
- _pad1: f32,
- // (unused in this shader; preserved for ABI continuity with the disk
- // pass — see disks.wgsl line 62-69 for the same pattern.)
+ cam: CameraUniforms,
+ // 'camPosWorld' + 'pxPerRad' are preserved in the layout for ABI
+ // continuity with the JS upload path (proceduralDiskRenderer.ts still
+ // writes them at offsets 80/92), but the world-fixed disk math
+ // doesn't read either — orientation derives from the line of sight
+ // 'normalize(pos)' (Earth → galaxy), not the camera, and the
+ // procedural shader has no apparent-size scaling that would consult
+ // 'pxPerRad'. Same intentional ABI-preservation pattern as
+ // disks.wesl line 65-74.
  camPosWorld: vec3<f32>,
  pxPerRad: f32,
 };
@@ -45,18 +93,18 @@ struct VsOut {
 @group(0) @binding(0) var<uniform> u: Uniforms;
-const CORNERS = array<vec2<f32>, 6>(
- vec2(-1.0, -1.0),
- vec2( 1.0, -1.0),
- vec2( 1.0, 1.0),
- vec2(-1.0, -1.0),
- vec2( 1.0, 1.0),
- vec2(-1.0, 1.0),
-);
-
 @vertex
 fn vs(@builtin(vertex_index) vid: u32, instance: InstanceIn) -> VsOut {
- let corner = CORNERS[vid];
+ // Unit-square corner offset in [-1, +1]² for this triangle-list
+ // vertex.
Pulled from 'lib/billboard::quadCorner' so the (BL, BR,
+ // TR, BL, TR, TL) ordering is shared across all four billboard
+ // renderers — see the lib's docblock for the corner-ordering
+ // discussion. The corner here is in the disk's LOCAL 2D frame; the
+ // 3D placement happens via the (majorAxis, minorAxis) basis below,
+ // and 'out.uv = corner' forwards it unchanged so the fragment can
+ // use 'length(in.uv)' as the radial distance for the brightness
+ // profile.
+ let corner = quadCorner(vid);
  // ── Disk-plane basis construction ────────────────────────────────────
  //
@@ -79,8 +127,8 @@ fn vs(@builtin(vertex_index) vid: u32, instance: InstanceIn) -> VsOut {
  // conventions for sky-east vs world-X.
  let pos = instance.posSize.xyz;
  // Half the full-extent value in posSize.w to match disks.wgsl line 104.
- // `posSize.w` is the FULL quad extent in Mpc (set at the emission site
- // in thumbnailSubsystem.ts to `(diameterKpc/1000) * 4`, the same
+ // 'posSize.w' is the FULL quad extent in Mpc (set at the emission site
+ // in thumbnailSubsystem.ts to '(diameterKpc/1000) * 4', the same
  // multiplier the points pass uses for GALAXY_RADIUS_MPC). Each
  // corner sits at ±halfWorld * basis, so the rendered quad spans
  // 2*halfWorld = posSize.w world units total — agreeing with both the
@@ -92,78 +140,22 @@ fn vs(@builtin(vertex_index) vid: u32, instance: InstanceIn) -> VsOut {
  let axisRatio = max(instance.orientation.x, 0.05);
  let paRad = instance.orientation.y * 3.14159265 / 180.0;
- // ── Line of sight (Earth → galaxy) ───────────────────────────────────
- //
- // Earth sits at the world origin in this coordinate system; `losDir`
- // is therefore the Earth-to-galaxy direction.
WORLD-FIXED: the disk's
- // orientation is an intrinsic property of the galaxy in 3D space and
- // must not depend on where the camera currently sits, otherwise
- // orbiting would visibly rotate the disk plane with the camera (the
- // exact bug `disks.wgsl` was rewritten to fix; see its header).
+ // ── Disk-plane basis (PA + inclination → world-space major / minor) ─
  //
- // Earth (origin) → galaxy direction. WORLD-FIXED, independent of camera
- // position, so orbiting reveals the true 3D inclination foreshortening
- // rather than rotating the disk with the camera. This mirrors
- // `disks.wgsl`'s `losDir = normalize(center)` derivation; see the long
- // header comment in that file for the full reasoning on why this is
- // emphatically NOT `pos - camPosWorld` (the bug fixed there).
- let los = normalize(pos);
-
- // Local sky-north and sky-east at the galaxy. We Gram-Schmidt
- // celestial-north (+Z world) against `los` to get the sky-north
- // tangent direction; sky-east is then los × sky-north.
- let CELESTIAL_NORTH = vec3(0.0, 0.0, 1.0);
- let northTangentRaw = CELESTIAL_NORTH - los * dot(CELESTIAL_NORTH, los);
- let northLen = length(northTangentRaw);
- // Pole degeneracy: if the line of sight is essentially along the
- // celestial pole, the sky-tangent has no defined "north". Fall back
- // to using world +Y as the in-plane reference. Picking +Y is
- // arbitrary but consistent (every pole-on viewing renders with the
- // same fallback orientation) and the loss of sky-PA fidelity at the
- // poles is invisible in practice.
- let northTangent = select(
- northTangentRaw / northLen,
- vec3(0.0, 1.0, 0.0),
- northLen < 1e-4,
- );
- // East-on-sky tangent. Argument order MATCHES disks.wgsl's
- // `east_proj = cross(north_proj, losDir)` so the (north, east, los)
- // frame is right-handed in the same sense and PA convention agrees
- // with the textured-thumbnail pass.
Reversing the cross flips the
- // sign of the resulting major-axis rotation for any non-zero PA,
- // which would visibly disagree with the thumbnail at the crossfade
- // boundary — that bug just got fixed; don't reintroduce it.
- let eastTangent = cross(northTangent, los);
-
- // Major axis on sky: rotate sky-north by PA toward sky-east.
- let majorSky = northTangent * cos(paRad) + eastTangent * sin(paRad);
- // Perpendicular-to-major in the sky-tangent plane. This matches
- // `disks.wgsl`'s `minor_in_sky = cross(losDir, major)` exactly — the
- // two passes MUST share basis math, otherwise their on-screen
- // ellipses disagree at the crossfade boundary.
- let perpMajorSky = cross(los, majorSky);
-
- // ── Tilt the minor axis out of the sky plane by inclination ──────────
- //
- // Inclination i with cos(i) = axisRatio. Face-on (axisRatio = 1,
- // sinI = 0) → minor lies entirely in the sky plane → projects as a
- // circle. Edge-on (axisRatio → 0, sinI → 1) → minor ≈ losDir → disk
- // is parallel to the line of sight and projects as a thin streak.
- //
- // This formula is identical to disks.wgsl line 166:
- // minor_3d = minor_in_sky * cosI + losDir * sinI
- // and that identity is load-bearing — an earlier revision built a
- // disk normal first, then took `cross(normal, major)` to recover the
- // minor axis. That route flips the sign of the `sinI * los` term
- // (because `cross(perpMajorSky, major) = -los` in this right-handed
- // frame), tilting the disk in the OPPOSITE direction from the
- // textured-thumbnail pass. At axisRatio ≈ 0.87 (i ≈ 30°) the visible
- // mismatch is a ~30° rotation against one axis vs. the thumbnail —
- // exactly the bug we just fixed; don't reintroduce it.
+ // The full derivation — line-of-sight from Earth-at-origin, sky-tangent
+ // (north, east) frame with pole fallback, PA-east-of-north rotation,
+ // and inclination tilt of the minor axis out of the sky plane — lives
+ // in 'lib/orientation.wesl'.
See its docblock for the camera-
+ // independence invariant and the tight post-projection 'northLen <
+ // 1e-4' pole-degeneracy fallback that this renderer historically used
+ // and the lib now standardises on. 'axisRatio' is clamped at the call
+ // site (above) BEFORE we feed it through as cosI, because the lib
+ // intentionally doesn't re-clamp.
  let cosI = axisRatio;
  let sinI = sqrt(max(0.0, 1.0 - cosI * cosI));
- let majorAxis = majorSky;
- let minorAxis = perpMajorSky * cosI + los * sinI;
+ let axes = diskAxes(pos, paRad, cosI, sinI);
+ let majorAxis = axes.major;
+ let minorAxis = axes.minor;
  // Quad corners in world space: centre + corner.x · major + corner.y · minor,
  // each scaled by the half-extent.
@@ -171,7 +163,7 @@ fn vs(@builtin(vertex_index) vid: u32, instance: InstanceIn) -> VsOut {
  let worldPos = pos + worldOffset;
  var out: VsOut;
- out.clipPos = u.viewProj * vec4(worldPos, 1.0);
+ out.clipPos = worldToClip(u.cam, worldPos);
  out.uv = corner;
  out.colourIndex = instance.extras.x;
  out.crossfadeAlpha = instance.extras.y;
@@ -202,18 +194,11 @@ const DISK_SCALE = 0.5;
 const BULGE_WEIGHT = 0.6;
 const DISK_WEIGHT = 0.4;
-// Mirror of points.wgsl's `ramp(t)`. Kept under the same name so a
-// grep for `ramp` finds both copies; kept in this file (rather than
-// shared via WGSL include — there is no include mechanism) so the
-// procedural-disk pass renders exactly the same colour as the
-// points pass for any given colour-index value. See
-// points.wgsl:601-633 for the full derivation.
-fn ramp(t: f32) -> vec3<f32> {
- let s = clamp(t * 0.5, 0.0, 1.0);
- let blueWhite = mix(vec3(0.4, 0.6, 1.0), vec3(1.0, 0.95, 0.8), s);
- let whiteRed = mix(vec3(1.0, 0.95, 0.8), vec3(1.0, 0.5, 0.3), s);
- return select(blueWhite, whiteRed, t > 1.0);
-}
+// 'ramp(t)' lives in 'lib/colorIndex.wesl' — imported at the top of
+// this file.
Sharing the function with the points pass guarantees
+// procedural-disk colour matches points colour exactly at any
+// colour-index value, so the disk impostor fade-in doesn't introduce
+// a visible cross-LOD hue shift.
 @fragment
 fn fs(in: VsOut) -> @location(0) vec4<f32> {
@@ -234,25 +219,25 @@ fn fs(in: VsOut) -> @location(0) vec4<f32> {
  // Soft edge fade. We're outputting LINEAR colour into an rgba16float
  // HDR target; the tone-map pass converts to sRGB later (see
  // toneMap.wgsl). The disk's exponential decay leaves a residual
- // `exp(-2) ≈ 0.135` at r = 1.0 in LINEAR space, which the gamma curve
+ // 'exp(-2) ≈ 0.135' at r = 1.0 in LINEAR space, which the gamma curve
  // brightens to ~42% display brightness — that's the hard edge the
  // user sees right before the discard.
  //
  // Apply a smoothstep edge fade from r = 0.6 (full disk) to r = 1.0
  // (zero alpha), and SQUARE it to compensate for the ~2.2-power gamma:
- // a value `x` in linear space becomes `x^(1/2.2) ≈ x^0.45` in display.
+ // a value 'x' in linear space becomes 'x^(1/2.2) ≈ x^0.45' in display.
  // To get display brightness that fades roughly linearly with r, the
- // linear intensity must fade as `(1-r')^2.2 ≈ (1-r')^2` (where r' is
- // the smoothstep parameter). Squaring `edgeFade` yields exactly that
- // `(smoothed)^2` shape — perceptually smooth in display space.
+ // linear intensity must fade as '(1-r')^2.2 ≈ (1-r')^2' (where r' is
+ // the smoothstep parameter). Squaring 'edgeFade' yields exactly that
+ // '(smoothed)^2' shape — perceptually smooth in display space.
  //
  // We multiply the alpha (not the rgb separately) because we use
  // premultiplied alpha additive blending; alpha is the brightness
  // gate, and fading it to 0 at r = 1.0 is what removes the hard edge.
- let edgeFade = smoothstep(1.0, 0.6, r);
+ let edgeFade = circularMask(r, 0.6, 1.0);
  let edgeFadeLinear = edgeFade * edgeFade;
  let alpha = intensity * in.crossfadeAlpha * edgeFadeLinear;
  // Premultiplied alpha — matches the project's blend mode (see
- // device.ts `alphaMode: 'premultiplied'`).
+ // device.ts 'alphaMode: 'premultiplied'').
  return vec4(tinted * alpha, alpha);
 }
diff --git a/src/services/gpu/shaders/quads.wgsl b/src/services/gpu/shaders/quads.wesl
similarity index 66%
rename from src/services/gpu/shaders/quads.wgsl
rename to src/services/gpu/shaders/quads.wesl
index 704b462..c2cdfe8 100644
--- a/src/services/gpu/shaders/quads.wgsl
+++ b/src/services/gpu/shaders/quads.wesl
@@ -1,4 +1,4 @@
-// quads.wgsl — billboard galaxy thumbnails sampled from a single atlas.
+// quads.wesl — billboard galaxy thumbnails sampled from a single atlas.
 //
 // Run after the existing point pass. Each instance is one textured
 // quad whose world-space center matches a galaxy and whose size is
@@ -6,32 +6,70 @@
 // the atlas texture + sampler in group(0) so the engine can swap the
 // bind group cheaply when more thumbnails arrive.
-// Camera + viewport. Shape mirrors the points-pass uniforms enough to
-// share the same conceptual binding even though several points-only
-// fields (brightness / selectedIndex / etc) aren't carried here.
+// Shared CameraUniforms prefix + projection helpers. See
+// 'lib/camera.wesl' for the canonical 80-byte layout (viewProj +
+// viewportPx + 8B reserved pad). quads needs 'worldToClip' for both
+// the galaxy center AND the celestial-north EPS offset used to
+// orient the billboard basis; it does NOT use 'worldEyeDepth'
+// because the per-quad distance calculation is hand-coded inline
+// (see 'distanceMpc' below) — pulling in the helper would just rename
+// 'length(toGalaxy)' without changing the math.
+import package::lib::camera::CameraUniforms;
+import package::lib::camera::worldToClip;
+// Shared unit-quad helpers from 'lib/billboard.wesl' — replace the
+// inline 'CORNERS' const + '(corner + 1) * 0.5' UV remap that used to
+// live below. The view-aligned celestial-north basis math stays
+// renderer-specific (see 'NORTH_WORLD' / 'upClip' below); only the
+// vertex-index → corner / UV lookups are shared.
+import package::lib::billboard::quadCorner;
+import package::lib::billboard::quadUv;
+// Shared fragment-stage mask shapes — see 'lib/masks.wesl' for the
+// rationale (three smoothstep patterns recurred across four shaders,
+// naming the shapes makes the intent visible at the call site).
+import package::lib::masks::circularMask;
+import package::lib::masks::lumAlpha;
+
+// Renderer-specific Uniforms struct.
+//
+// The first 80 bytes are the shared 'CameraUniforms' prefix
+// (viewProj at offset 0, viewportPx at offset 64, two reserved-pad
+// f32s at 72/76). The next 16 bytes carry this renderer's two
+// camera-derived but renderer-specific fields: 'camPosWorld' (a
+// vec3 needing 16-B alignment, which offset 80 already
+// provides) and 'pxPerRad' which fits in the trailing slot of
+// camPosWorld's 16-B vec4 quantum.
+//
+// Why are camPosWorld + pxPerRad NOT in CameraUniforms? See the
+// header in 'lib/camera.wesl' — cameraPos lives at wildly different
+// byte offsets across the points / milkyWayImpostor / quads / disks
+// renderers, so it stays renderer-specific. quads happens to put it
+// directly after the shared prefix, but a future shared per-frame
+// camera UBO refactor will keep that the right shape.
 //
-// `camPosWorld` + `pxPerRad` were added when the original
+// 'camPosWorld' + 'pxPerRad' were added when the original
 // "project-a-unit-X-offset" billboard sizing turned out to depend on
 // camera orientation (orbiting a galaxy made the quad visibly shrink
 // or grow as the world-X axis rotated relative to the view direction).
The replacement computes each quad's apparent angular radius from
-// `length(instance.posSize.xyz - camPosWorld)` and converts to screen
-// pixels via `pxPerRad = viewport.y / (2 · tan(fovY / 2))`. Identical
+// 'length(instance.posSize.xyz - camPosWorld)' and converts to screen
+// pixels via 'pxPerRad = viewport.y / (2 · tan(fovY / 2))'. Identical
 // approach to points.wgsl — see that file for the derivation.
+//
+// Total uniform size: 96 bytes (unchanged from before the
+// CameraUniforms adoption — the shared prefix exactly overlays the
+// previous 'viewProj + viewport + _pad0 + _pad1' region byte-for-
+// byte, so the CPU-side uploader's f32 indices do not move).
 struct Uniforms {
- viewProj: mat4x4<f32>,
- viewport: vec2<f32>,
- _pad0: f32,
- _pad1: f32,
- camPosWorld: vec3<f32>,
- pxPerRad: f32,
+ cam: CameraUniforms,
+ camPosWorld: vec3<f32>, // offset 80
+ pxPerRad: f32, // offset 92
 };
 // Per-instance attributes. Three vec4s — first packs (xyz, sizeWorld),
-// second is the uv rect, third carries the per-frame `fadeAlpha`
+// second is the uv rect, third carries the per-frame 'fadeAlpha'
 // multiplier produced by the engine (distance fade × load fade). All
 // naturally 16-byte aligned. The remaining three components of
-// `extras` are reserved padding for future per-instance flags.
+// 'extras' are reserved padding for future per-instance flags.
 struct InstanceIn {
  @location(0) posSize: vec4<f32>,
  @location(1) uvRect: vec4<f32>,
@@ -44,7 +82,7 @@ struct VsOut {
  // sub-rectangle of the 2048×2048 atlas.
  @location(0) atlasUv: vec2<f32>,
  // UV inside the corner-local [0, 1]² unit square — used to compute
- // the radial alpha mask in `fs`. Independent of atlasUv because the
+ // the radial alpha mask in 'fs'. Independent of atlasUv because the
  // atlas slot might not occupy the full corner range when slot UV
  // rectangles get clamped or padded. Threading both lets us decouple
  // "which texel to sample" from "where am I in the quad shape".
@@ -61,22 +99,14 @@ struct VsOut {
 @group(0) @binding(1) var atlasTex: texture_2d<f32>;
 @group(0) @binding(2) var atlasSmp: sampler;
-// Hard-coded quad corners. The vertex shader is invoked with
-// vertex_index 0..5 (two triangles), and we look up the corner from
-// this table. Saves an index buffer + a vertex buffer for static
-// geometry.
-const CORNERS = array<vec2<f32>, 6>(
- vec2(-1.0, -1.0),
- vec2( 1.0, -1.0),
- vec2( 1.0, 1.0),
- vec2(-1.0, -1.0),
- vec2( 1.0, 1.0),
- vec2(-1.0, 1.0),
-);
-
 @vertex
 fn vs(@builtin(vertex_index) vid: u32, instance: InstanceIn) -> VsOut {
- let corner = CORNERS[vid];
+ // Unit-square corner offset in [-1, +1]² for this triangle-list
+ // vertex. Pulled from 'lib/billboard::quadCorner' so the (BL, BR,
+ // TR, BL, TR, TL) ordering is shared across all four billboard
+ // renderers — see the lib's docblock for the corner-ordering
+ // discussion.
+ let corner = quadCorner(vid);
  // Project the world-space center first. We then offset the corner in
  // clip space by a fixed pixel half-extent. Historically this was done
@@ -88,7 +118,7 @@ fn vs(@builtin(vertex_index) vid: u32, instance: InstanceIn) -> VsOut {
  // dust-lane / disk orientation. We now build the billboard basis from
  // the projected celestial-north direction at each galaxy's screen
  // position so the texture's north tracks the world's celestial north.
- let centerClip = u.viewProj * vec4(instance.posSize.xyz, 1.0);
+ let centerClip = worldToClip(u.cam, instance.posSize.xyz);
  // ── ANGULAR-SIZE → PIXEL HALF-EXTENT ─────────────────────────────────────
  //
@@ -117,14 +147,14 @@ fn vs(@builtin(vertex_index) vid: u32, instance: InstanceIn) -> VsOut {
  //
  // Build the local +Y axis of the billboard from the projected celestial
  // north direction at this galaxy's screen position. Why?
In skymap's
- // world convention `raDecZToCartesian`, +Z is the celestial north pole
- // (Dec = +90°), so projecting `pos + EPS · (0,0,1)` and subtracting the
+ // world convention 'raDecZToCartesian', +Z is the celestial north pole
+ // (Dec = +90°), so projecting 'pos + EPS · (0,0,1)' and subtracting the
  // projected center gives the screen-space direction "toward sky north"
  // at the galaxy. Using that as the billboard's local +Y means:
  //
  // - The texture's content (which is north-up by SDSS / DSS source
  // convention) stays aligned with sky north as the camera rolls.
- // - The points-pass elliptical mask (which rotates by `-PA`, with PA
+ // - The points-pass elliptical mask (which rotates by '-PA', with PA
  // measured east-of-north) ends up consistent with the texture's
  // apparent orientation — both are anchored to projected north.
  //
@@ -135,17 +165,17 @@ fn vs(@builtin(vertex_index) vid: u32, instance: InstanceIn) -> VsOut {
  //
  // Edge case: at the celestial poles the world +Z direction projects to
  // (or very near) the same screen point as the galaxy center, so
- // `upPx ≈ 0` and `normalize` would blow up. We fall back to the
+ // 'upPx ≈ 0' and 'normalize' would blow up. We fall back to the
  // original screen-axis basis in that degenerate case. This is the only
  // place the old behaviour leaks through — and only for galaxies whose
  // line of sight is essentially parallel to the celestial north axis.
  let NORTH_WORLD = vec3(0.0, 0.0, 1.0);
  let EPS = 0.001;
- let upClip = u.viewProj * vec4(instance.posSize.xyz + NORTH_WORLD * EPS, 1.0);
+ let upClip = worldToClip(u.cam, instance.posSize.xyz + NORTH_WORLD * EPS);
  let centerNdc = centerClip.xy / centerClip.w;
  let upNdc = upClip.xy / upClip.w;
  let upNdcDelta = upNdc - centerNdc;
- let upPx = vec2(upNdcDelta.x * u.viewport.x * 0.5, upNdcDelta.y * u.viewport.y * 0.5);
+ let upPx = vec2(upNdcDelta.x * u.cam.viewportPx.x * 0.5, upNdcDelta.y * u.cam.viewportPx.y * 0.5);
  // Pole-degenerate fallback: when the projected-north delta vanishes,
  // use screen-X / screen-Y so the quad still renders (just unoriented).
@@ -155,7 +185,7 @@ fn vs(@builtin(vertex_index) vid: u32, instance: InstanceIn) -> VsOut {
  // +X (right) is a +90° rotation of +Y (up) in screen space. Image-space
  // y points down on screen, so the in-image right-of-north direction is
  // (upY, -upX) — same handedness as the points-pass UV convention so the
- // `-PA` rotation in points.wgsl agrees with the quad's local +Y meaning
+ // '-PA' rotation in points.wgsl agrees with the quad's local +Y meaning
  // "celestial north".
  let rightPxNorm = vec2(upPxNorm.y, -upPxNorm.x);
@@ -164,12 +194,12 @@ fn vs(@builtin(vertex_index) vid: u32, instance: InstanceIn) -> VsOut {
  let offsetPx = corner.x * halfPixels * rightPxNorm + corner.y * halfPixels * upPxNorm;
  // Convert pixels to clip-space half-extent. As in points.wgsl, we
- // multiply by `centerClip.w` to cancel the perspective divide so the
- // billboard ends up exactly `halfPixels` on screen regardless of
+ // multiply by 'centerClip.w' to cancel the perspective divide so the
+ // billboard ends up exactly 'halfPixels' on screen regardless of
  // depth.
   let offsetClip = vec2(
-    offsetPx.x * 2.0 / u.viewport.x,
-    offsetPx.y * 2.0 / u.viewport.y,
+    offsetPx.x * 2.0 / u.cam.viewportPx.x,
+    offsetPx.y * 2.0 / u.cam.viewportPx.y,
   ) * centerClip.w;

   var out: VsOut;
@@ -179,12 +209,13 @@ fn vs(@builtin(vertex_index) vid: u32, instance: InstanceIn) -> VsOut {
     centerClip.w,
   );

-  // UV: corner is in [-1, 1]; remap to [0, 1] then to the slot's atlas
-  // rect. Flip V so the texture isn't upside down — `flipY: false` on
-  // the atlas upload preserves the natural ImageBitmap orientation
-  // (top-down), and our UV convention here puts v=0 at the top of the
-  // atlas.
-  let cornerUv = (corner + vec2(1.0, 1.0)) * 0.5;
+  // UV: 'quadUv' returns the unit-square corner remapped to [0, 1]²
+  // (same vertex-index ordering as 'quadCorner' above). We then mix
+  // into the slot's atlas rect. Flip V so the texture isn't upside
+  // down — 'flipY: false' on the atlas upload preserves the natural
+  // ImageBitmap orientation (top-down), and our UV convention here
+  // puts v=0 at the top of the atlas.
+  let cornerUv = quadUv(vid);
   let uvLocal = vec2(cornerUv.x, 1.0 - cornerUv.y);
   out.atlasUv = mix(instance.uvRect.xy, instance.uvRect.zw, uvLocal);
   // Forward the corner-local UV to the FS so it can compute the
@@ -207,35 +238,35 @@ fn fs(in: VsOut) -> @location(0) vec4<f32> {
   // 0.4 → 0.5 gives a ~10% transition band, soft enough to look like
   // a Gaussian halo rather than a clipped circle.
   let r = length(in.cornerUv - vec2(0.5, 0.5));
-  let mask = 1.0 - smoothstep(0.4, 0.5, r);
+  let mask = circularMask(r, 0.4, 0.5);

   // ── BRIGHTNESS-DERIVED ALPHA (sky-subtraction lite) ──────────────────────
   //
   // SDSS / DSS cutout JPEGs ship with no alpha channel (rgba.a is always
   // 1.0), so they used to render their full sky background as an opaque
   // grey square against the dark dot field — the cosmetic complaint
-  // logged in `project_thumbnail_quality.md`.
+  // logged in 'project_thumbnail_quality.md'.
   //
   // Trick: use the maximum RGB component as alpha. A pure-black sky
   // pixel (max ≈ 0.02) becomes nearly transparent; a galaxy body
   // (max ≈ 0.4–0.9) stays opaque; saturated stars (max = 1.0) stay fully
-  // bright. `smoothstep` applies a soft threshold so the dimmest sky
+  // bright. 'smoothstep' applies a soft threshold so the dimmest sky
   // pixels vanish entirely instead of leaving a faint grey haze:
   //   - lum ≤ 0.05 → fully transparent (sky)
   //   - lum ≥ 0.30 → fully opaque (galaxy / star)
   // This is the "quick fix" stop-gap; the planned proper sky-subtraction
-  // (`project_thumbnail_quality.md` option B) will sample corner pixels
+  // ('project_thumbnail_quality.md' option B) will sample corner pixels
   // and subtract a per-cutout sky level.
   let lum = max(rgba.r, max(rgba.g, rgba.b));
-  let lumAlpha = smoothstep(0.05, 0.30, lum);
-  // `fadeAlpha` is the engine's per-frame fade multiplier — combines
+  let lumGate = lumAlpha(lum, 0.05, 0.30);
+  // 'fadeAlpha' is the engine's per-frame fade multiplier — combines
   // distance fade (galaxies smoothly grow in as they cross the
   // apparent-size threshold) and load fade (~400 ms ramp from
   // bitmap-arrival). Final alpha is the product of all three.
-  let alpha = lumAlpha * mask * in.fadeAlpha;
+  let alpha = lumGate * mask * in.fadeAlpha;
   // Discard near-transparent fragments — pure early-out
   // optimisation. The quad covers a full bounding box, but the
-  // soft circular mask + brightness-derived `lumAlpha` zero out the
+  // soft circular mask + brightness-derived 'lumAlpha' zero out the
   // corners and the dim-sky pixels. An additive blend of
   // (rgb * ~0, ~0) on top of the HDR target would contribute
   // basically nothing visible while still costing a blend-bandwidth
@@ -246,6 +277,6 @@ fn fs(in: VsOut) -> @location(0) vec4<f32> {
   // discard as just a cheap early-out.)
   if (alpha < 0.01) { discard; }
   // Premultiplied-alpha output — required by the project's blend
-  // configuration (see device.ts `alphaMode: 'premultiplied'`).
+  // configuration (see device.ts 'alphaMode: 'premultiplied'').
   return vec4(rgba.rgb * alpha, alpha);
 }
diff --git a/src/services/gpu/shaders/toneMap.wgsl b/src/services/gpu/shaders/toneMap.wesl
similarity index 87%
rename from src/services/gpu/shaders/toneMap.wgsl
rename to src/services/gpu/shaders/toneMap.wesl
index fe088b6..907585f 100644
--- a/src/services/gpu/shaders/toneMap.wgsl
+++ b/src/services/gpu/shaders/toneMap.wesl
@@ -5,20 +5,20 @@
 // entire viewport [-1,1]² with a 50 % overdraw budget that's free
 // because we never sample those off-screen pixels. No vertex buffer
 // required — the vertex shader synthesises positions directly from
-// `vertex_index`.
+// 'vertex_index'.
 //
-// Curves, branched on `u.curve`:
+// Curves, branched on 'u.curve':
 //   curve=0 — Linear / Clamp. Pre-HDR baseline, no tone mapping.
 //             Cluster cores blow out, filaments invisible. Reference.
-//   curve=1 — Reinhard-extended `c·(1 + c/W²) / (1+c)`. Smooth
+//   curve=1 — Reinhard-extended 'c·(1 + c/W²) / (1+c)'. Smooth
 //             roll-off near 1.0, "natural" look. Default.
-//   curve=2 — Asinh / Lupton 2004 `asinh(k·c) / asinh(k)`. Linear
+//   curve=2 — Asinh / Lupton 2004 'asinh(k·c) / asinh(k)'. Linear
 //             near zero, log-like at high values. Aggressively lifts
 //             dim regions — what SDSS's pipeline uses for filaments.
-//   curve=3 — Gamma 2.0 `sqrt(clamp(c, 0, 1))`. Simple midtone
+//   curve=3 — Gamma 2.0 'sqrt(clamp(c, 0, 1))'. Simple midtone
 //             lift; cheap, less surgical than asinh.
 //   curve=4 — ACES filmic (Narkowicz 2015 approximation):
-//             `(c·(2.51·c+0.03)) / (c·(2.43·c+0.59)+0.14)`, clamped.
+//             '(c·(2.51·c+0.03)) / (c·(2.43·c+0.59)+0.14)', clamped.
 //             Cinematic S-curve with shoulder + toe.

 struct ToneMapUniforms {
@@ -86,8 +86,8 @@ fn applyAsinh(c: vec3<f32>, k: f32) -> vec3<f32> {
   return clamp(y, vec3(0.0), vec3(1.0));
 }

-// Gamma 2.0 = sqrt of clamped input. WGSL `pow(c, 0.5)` works but
-// `sqrt` is the dedicated intrinsic and slightly faster.
+// Gamma 2.0 = sqrt of clamped input. WGSL 'pow(c, 0.5)' works but
+// 'sqrt' is the dedicated intrinsic and slightly faster.
 fn applyGamma2(c: vec3<f32>) -> vec3<f32> {
   return sqrt(clamp(c, vec3(0.0), vec3(1.0)));
 }
@@ -112,10 +112,10 @@ fn applyAces(c: vec3<f32>) -> vec3<f32> {
 fn fs(in: VSOut) -> @location(0) vec4<f32> {
   let hdr = textureSample(hdrTex, hdrSamp, in.uv).rgb;
   let scaled = hdr * u.exposure;
-  // Dynamic-uniform branch — `curve` is identical across all fragments
+  // Dynamic-uniform branch — 'curve' is identical across all fragments
   // in a frame, so the GPU's branch predictor handles this efficiently.
-  // We use a chain of `if`s rather than a `switch` because WGSL's
-  // `switch` semantics are slightly stricter (must end in `default`)
+  // We use a chain of 'if's rather than a 'switch' because WGSL's
+  // 'switch' semantics are slightly stricter (must end in 'default')
   // and the chain is cleaner with fall-through-impossible curves.
   var mapped: vec3<f32>;
   if (u.curve == 0u) {
@@ -132,7 +132,7 @@ fn fs(in: VSOut) -> @location(0) vec4<f32> {
     mapped = applyAces(scaled);
   }
   // Output is opaque — alpha doesn't matter because the swap-chain
-  // is configured `alphaMode: 'premultiplied'` and we just composited
+  // is configured 'alphaMode: 'premultiplied'' and we just composited
   // the entire scene already.
   return vec4(mapped, 1.0);
 }
diff --git a/tests/services/gpu/milkyWayRenderer.test.ts b/tests/services/gpu/milkyWayRenderer.test.ts
index 5a19162..46f100f 100644
--- a/tests/services/gpu/milkyWayRenderer.test.ts
+++ b/tests/services/gpu/milkyWayRenderer.test.ts
@@ -14,9 +14,15 @@ describe('MilkyWayRenderer', () => {

   it('exposes the documented uniform buffer size constant', () => {
     // The renderer uploads exactly UNIFORM_BUFFER_SIZE bytes per frame.
-    // Pinning this in a test ensures the WGSL `Uniforms` struct and
+    // Pinning this in a test ensures the WESL `Uniforms` struct and
     // the JS-side `ArrayBuffer(UNIFORM_BUFFER_SIZE)` allocation can
     // never silently drift.
-    expect(MilkyWayRenderer.UNIFORM_BUFFER_SIZE).toBe(96);
+    //
+    // Layout: CameraUniforms prefix (80 B) + cameraPosWorld vec3 (12 B)
+    // + fadeAlpha f32 (4 B) + iTime f32 (4 B) + 12 B tail pad = 112 B.
+    // Was 96 B before adopting `cam: CameraUniforms` from
+    // `lib/camera.wesl` — see `milkyWayRenderer.ts` doc-block for why
+    // the field order changed.
+    expect(MilkyWayRenderer.UNIFORM_BUFFER_SIZE).toBe(112);
   });
 });
diff --git a/tests/services/gpu/pickRenderer.test.ts b/tests/services/gpu/pickRenderer.test.ts
index cbc650b..c84135e 100644
--- a/tests/services/gpu/pickRenderer.test.ts
+++ b/tests/services/gpu/pickRenderer.test.ts
@@ -27,7 +27,14 @@ beforeAll(() => {
 function makeStubDevice(): GPUDevice {
   // Minimal stub — enough for createPickRenderer construction.
   return {
-    createShaderModule: vi.fn(() => ({})),
+    // PickRenderer + PointRenderer route shader-module creation through
+    // `createShaderModuleWithDevLog`, which calls `getCompilationInfo()`
+    // under `import.meta.env.DEV` (true by default in Vitest). Stub it
+    // out with a Promise-returning empty-messages response so the
+    // helper's `void module.getCompilationInfo()` doesn't throw.
+    createShaderModule: vi.fn(() => ({
+      getCompilationInfo: () => Promise.resolve({ messages: [] }),
+    })),
     createRenderPipeline: vi.fn(() => ({
       getBindGroupLayout: () => ({}),
     })),
@@ -102,7 +109,15 @@ describe('createPickRenderer', () => {
     const createBindGroupCalls: Array<{ layout: unknown; buffer: unknown }> = [];

     const device = {
-      createShaderModule: vi.fn(() => ({})),
+      // PickRenderer + PointRenderer both route shader-module creation
+      // through `createShaderModuleWithDevLog`, which calls
+      // `getCompilationInfo()` when `import.meta.env.DEV` is true
+      // (Vitest's default). The stub therefore must expose a
+      // Promise-returning `getCompilationInfo` so the helper doesn't
+      // throw on construction.
+      createShaderModule: vi.fn(() => ({
+        getCompilationInfo: () => Promise.resolve({ messages: [] }),
+      })),
       createRenderPipeline: vi.fn(() => {
         const idx = layoutsByPipeline.length;
         const layouts = {
diff --git a/tests/services/gpu/pointRenderer.test.ts b/tests/services/gpu/pointRenderer.test.ts
index 23a4303..884259a 100644
--- a/tests/services/gpu/pointRenderer.test.ts
+++ b/tests/services/gpu/pointRenderer.test.ts
@@ -106,7 +106,16 @@ function makeStubDevice(): GPUDevice {
   }) as unknown as GPUBuffer;

   return {
-    createShaderModule: () => ({}) as unknown as GPUShaderModule,
+    // PointRenderer routes shader-module creation through
+    // `createShaderModuleWithDevLog`, which calls `getCompilationInfo()`
+    // when `import.meta.env.DEV` is true (Vitest's default). The stub
+    // therefore must expose a Promise-returning `getCompilationInfo` —
+    // otherwise the helper throws on `module.getCompilationInfo is not
+    // a function`.
+    createShaderModule: () =>
+      ({
+        getCompilationInfo: () => Promise.resolve({ messages: [] }),
+      }) as unknown as GPUShaderModule,
     createRenderPipeline: () =>
       ({
         // `getBindGroupLayout` is invoked by the constructor when wiring the
diff --git a/tests/services/gpu/postProcess.test.ts b/tests/services/gpu/postProcess.test.ts
index bdcf421..50f495c 100644
--- a/tests/services/gpu/postProcess.test.ts
+++ b/tests/services/gpu/postProcess.test.ts
@@ -55,7 +55,16 @@ function mockDevice(): GPUDevice {
     })),
     createBuffer: vi.fn(() => ({ destroy: vi.fn() })),
     createSampler: vi.fn(() => ({})),
-    createShaderModule: vi.fn(() => ({})),
+    // postProcess wires a dev-mode getCompilationInfo logger after
+    // creating the shader module (so the linked WGSL is available when
+    // a compile error fires under wesl-plugin's `?static` linker, since
+    // browser error line numbers map to the linked output not the
+    // source). Vitest sets `import.meta.env.DEV = true` by default, so
+    // the mock has to expose getCompilationInfo even though we never
+    // assert on its output here.
+    createShaderModule: vi.fn(() => ({
+      getCompilationInfo: () => Promise.resolve({ messages: [] }),
+    })),
     createBindGroupLayout: vi.fn(() => ({})),
     createPipelineLayout: vi.fn(() => ({})),
     createRenderPipeline: vi.fn(() => ({})),
diff --git a/tsconfig.json b/tsconfig.json
index 7224d1c..07ca255 100644
--- a/tsconfig.json
+++ b/tsconfig.json
@@ -12,7 +12,7 @@
     "resolveJsonModule": true,
     "jsx": "react-jsx",
     "lib": ["ES2022", "DOM", "DOM.Iterable"],
-    "types": ["node", "@webgpu/types", "vite/client"]
+    "types": ["node", "@webgpu/types", "vite/client", "wesl-plugin/suffixes"]
   },
   "include": ["src", "tests"],
   "exclude": ["tools"]
diff --git a/vite.config.ts b/vite.config.ts
index 7bfa196..012cb65 100644
--- a/vite.config.ts
+++ b/vite.config.ts
@@ -1,8 +1,22 @@
-import { defineConfig } from 'vite';
 import react from '@vitejs/plugin-react';
+import { defineConfig } from 'vite';
+import { staticBuildExtension } from 'wesl-plugin';
+import viteWesl from 'wesl-plugin/vite';

+// `wesl-plugin/vite` registers a Vite plugin that intercepts imports of
+// `.wesl` (and `.wgsl`) files with recognised suffixes. We pass it just
+// the `staticBuildExtension`, which handles the `?static` suffix — it
+// runs the WESL linker at build time and returns a flat WGSL string,
+// preserving the existing `import x from './foo.wesl?static'` shape and
+// avoiding any runtime linker dependency. The alternative (`?link`)
+// would defer linking to runtime, which we don't need yet and which
+// would pull the `wesl` JS linker into the production bundle.
+//
+// `assetsInclude: ['**/*.wgsl']` is retained while a few `.wgsl` files
+// remain unmigrated; once Task 2 finishes the bulk rename it can be
+// dropped, but it's harmless until then.
 export default defineConfig({
-  plugins: [react()],
+  plugins: [viteWesl({ extensions: [staticBuildExtension] }), react()],
   server: { port: 5173 },
   assetsInclude: ['**/*.wgsl'],
 });
diff --git a/vitest.config.ts b/vitest.config.ts
index 8363e16..77be154 100644
--- a/vitest.config.ts
+++ b/vitest.config.ts
@@ -1,6 +1,14 @@
 import { defineConfig } from 'vitest/config';
+import { staticBuildExtension } from 'wesl-plugin';
+import viteWesl from 'wesl-plugin/vite';

+// Vitest doesn't auto-inherit plugins from vite.config.ts in our setup,
+// so we re-register wesl-plugin here. Without this, Vitest's SSR transform
+// pipeline tries to parse .wesl files as JavaScript and rolldown rejects
+// them as syntax errors. The plugin claims those imports first and emits
+// a string, which is what the existing tests expect.
 export default defineConfig({
+  plugins: [viteWesl({ extensions: [staticBuildExtension] })],
   test: {
     environment: 'node',
     include: ['tests/**/*.test.ts'],
diff --git a/wesl.toml b/wesl.toml
new file mode 100644
index 0000000..51191d8
--- /dev/null
+++ b/wesl.toml
@@ -0,0 +1,32 @@
+# WESL linker configuration.
+#
+# `wesl-plugin` reads this file at build time to find shader sources and
+# resolve `import` statements inside `.wesl` modules.
+#
+# Schema (verified against `node_modules/wesl-plugin/dist/PluginExtension-DTjKL6rt.d.mts`):
+#   - `edition`: WESL language version. `unstable_2025` is the current
+#     pre-1.0 edition; bump it (and audit syntax) when WESL stabilises.
+#   - `include`: glob patterns (relative to this file's directory) for
+#     finding `.wesl` source files.
+#   - `root`: base directory for relative paths inside discovered files.
+#
+# The package prefix used in WESL `import` paths (e.g.
+# `import package::lib::math::saturate;`) is the literal token `package`
+# — verified empirically. The npm package.json `name` ("skymap") does
+# NOT resolve as the prefix; only `package::` does.
+#
+# Why we picked `?static` over `?link`:
+#   - `?static` runs the linker at build time and emits a flat WGSL
+#     string. Zero runtime cost, identical shape to today's `?raw`
+#     import, and Chrome's `getCompilationInfo` still produces useful
+#     diagnostics (against the linked output — line numbers don't
+#     survive into the source `.wesl`, but each module's leading
+#     docblock keeps the mapping tractable).
+#   - `?link` defers linking to runtime via a `LinkParams` object.
+#     That would buy us conditional compilation against runtime feature
+#     flags, which we don't need yet — and would introduce a non-trivial
+#     runtime dependency on the `wesl` linker's JS bundle.
+
+edition = "unstable_2025"
+include = ["src/services/gpu/shaders/**/*.wesl"]
+root = "src/services/gpu/shaders"
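
**Reviewer sketch: where the 112 B comes from.** The `milkyWayRenderer.test.ts` comment lists the byte budget as prose. The arithmetic can be reproduced with the standard WGSL uniform layout rules (each member is placed at the next multiple of its alignment, and the struct size is rounded up to the largest member alignment). This is a minimal sketch, not project code: `structLayout` and `Member` are hypothetical names, and the 80-byte size and 16-byte alignment of `CameraUniforms` are assumptions taken from the test comment rather than read from `lib/camera.wesl`.

```typescript
// Sketch only: WGSL uniform-layout arithmetic for the MilkyWay uniforms.
// CameraUniforms = 80 B with 16 B alignment is an ASSUMPTION taken from
// the test comment; structLayout/Member are hypothetical helper names.
interface Member {
  name: string;
  size: number; // bytes
  align: number; // bytes
}

// Round value up to the next multiple of `multiple`.
const roundUp = (value: number, multiple: number): number =>
  Math.ceil(value / multiple) * multiple;

function structLayout(members: Member[]): {
  offsets: Record<string, number>;
  size: number;
} {
  const offsets: Record<string, number> = {};
  let cursor = 0;
  let structAlign = 1;
  for (const m of members) {
    cursor = roundUp(cursor, m.align); // place member at its alignment
    offsets[m.name] = cursor;
    cursor += m.size;
    structAlign = Math.max(structAlign, m.align);
  }
  // Struct size is padded to the largest member alignment.
  return { offsets, size: roundUp(cursor, structAlign) };
}

const milkyWayUniforms = structLayout([
  { name: 'cam', size: 80, align: 16 }, // assumed CameraUniforms prefix
  { name: 'cameraPosWorld', size: 12, align: 16 }, // vec3<f32>
  { name: 'fadeAlpha', size: 4, align: 4 }, // f32
  { name: 'iTime', size: 4, align: 4 }, // f32
]);

console.log(milkyWayUniforms.size, milkyWayUniforms.offsets);
```

Under those assumptions the members end at byte 100 and the struct pads to 112, which matches the "12 B tail pad" in the test comment.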
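
**Reviewer sketch: why every stub device needs `getCompilationInfo`.** The four test diffs all add the same stub method because the renderers route module creation through `createShaderModuleWithDevLog`. That helper's body is not part of this diff, so the version below is a guess at its shape: only the helper name and the `void module.getCompilationInfo()` call come from the test comments. The parameter list, the `isDev` flag (standing in for `import.meta.env.DEV`), and the `DeviceLike` / `ShaderModuleLike` types are assumptions made so the sketch runs without WebGPU typings.

```typescript
// Hypothetical sketch. Only the helper's name and its use of
// getCompilationInfo() are taken from the test comments; the signature,
// the isDev flag and the *Like types are assumptions for illustration.
interface CompilationMessageLike {
  type: string; // 'error' | 'warning' | 'info' in WebGPU
  message: string;
  lineNum: number;
}
interface ShaderModuleLike {
  getCompilationInfo(): Promise<{ messages: readonly CompilationMessageLike[] }>;
}
interface DeviceLike {
  createShaderModule(desc: { code: string; label?: string }): ShaderModuleLike;
}

function createShaderModuleWithDevLog(
  device: DeviceLike,
  code: string,
  label: string,
  isDev: boolean,
  log: (text: string) => void = console.error,
): ShaderModuleLike {
  const shaderModule = device.createShaderModule({ code, label });
  if (isDev) {
    // Fire and forget: a stub module without getCompilationInfo would
    // throw right here, which is what the updated test stubs avoid.
    void shaderModule.getCompilationInfo().then((info) => {
      const errors = info.messages.filter((m) => m.type === 'error');
      if (errors.length === 0) return;
      for (const e of errors) log(`[${label}] line ${e.lineNum}: ${e.message}`);
      // Line numbers refer to the linked WGSL, so print it alongside.
      log(`[${label}] linked WGSL follows:\n${code}`);
    });
  }
  return shaderModule;
}
```

With a helper of this shape, a stub returning `Promise.resolve({ messages: [] })` is enough for construction to succeed in tests, and nothing is logged.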
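
**Reviewer sketch: tone curves as scalar functions.** For the visual identity check on `toneMap.wesl`, it can help to have the curves from the header comment available on the CPU for spot values. The functions below transcribe the formulas quoted in that comment on a single channel. They are a sketch under assumptions: `W` (white point) and `k` (asinh gain) are the symbols used in the comment, the TypeScript function names are made up here, and the shader's own clamping may differ for the Reinhard branch, which is not shown in the diff.

```typescript
// Scalar mirrors of the formulas quoted in the toneMap header comment.
// Function names are hypothetical; W and k follow the comment's symbols.
const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// curve=1: Reinhard-extended, c·(1 + c/W²) / (1 + c). Not clamped here.
const reinhardExtended = (c: number, W: number): number =>
  (c * (1 + c / (W * W))) / (1 + c);

// curve=2: asinh(k·c) / asinh(k), clamped as in applyAsinh.
const asinhCurve = (c: number, k: number): number =>
  clamp01(Math.asinh(k * c) / Math.asinh(k));

// curve=3: Gamma 2.0, sqrt of the clamped input.
const gamma2 = (c: number): number => Math.sqrt(clamp01(c));

// curve=4: ACES (Narkowicz approximation), clamped.
const acesNarkowicz = (c: number): number =>
  clamp01((c * (2.51 * c + 0.03)) / (c * (2.43 * c + 0.59) + 0.14));

console.log(reinhardExtended(4, 4), gamma2(0.25), acesNarkowicz(0.5));
```

A useful sanity property of the Reinhard-extended form is that an input equal to the white point maps to exactly 1, so `W` directly sets which HDR value reaches full brightness.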