Rendering-next: FrameGraph, shader-driven descriptors, contact shadow#32
Merged
Conversation
The frame graph's bind group cache used per-name version counters (starting at 1 on creation), so two distinct textures like "wireframe/color" and "pathtracer/color" could share the same version — causing the cache to falsely match and reuse a bind group pointing to the wrong GPU resource. External resources (IBL views, samplers) had no version tracking at all (always 0), so switching stages left stale IBL texture views in cached bind groups. Fix: replace per-name counters with a global monotonic version counter on FrameGraph, and fingerprint external resources by pointer identity. Also clear stale texture refs in create_renderer() as a defensive measure.
Replace flat BindGroupEntry struct (8 optional fields, "exactly one set") with a std::variant<ManagedBufferBinding, ManagedTextureBinding, ExternalViewBinding, ExternalBufferBinding, SamplerBinding>. The allocate_bind_groups() fingerprint and resolution loops now use std::visit, making missing a variant case a compile error. Add 3 regression tests covering the staleness bug from c1b0265: - bind group rebuilds when texture name changes across frames - bind group rebuilds when external view changes - bind group rebuilds when sampler changes Ticket: bg-entry-variant
Add ContactShadowPass: fullscreen ray march against the depth buffer toward each light, producing a per-pixel shadow factor (R8Unorm). The forward shader reads it via bind group 3 and modulates direct lighting only (not IBL/emissive). Introduce light_iteration.slang abstracting the "for each light" loop — both forward.slang and contact_shadow.slang use it. When clustered lighting lands, only this module changes. Includes 64-light test scene (clustered_lighting_test.usda) for development and stress testing.
Replace all manual WGPUBindGroupLayoutEntry arrays with declarative OutputSlot descriptions. One declaration creates the layout at setup and fills the descriptor per-frame. Sampler slots auto-filled from format. Rename BindGroup → Descriptor throughout (no compat aliases). Add PassBuilder.descriptor() for static descriptor auto-set before execute. FallbackPool provides shared 1x1 fallback textures. Child passes own their consumer descriptors via OutputLayoutInfo. GBuffer gains consumer output for downstream passes.
SSAO and ContactShadow compose their layouts from GBuffer consumer_output_slots() via concatenation. PathTracerPass compute and IBL descriptor fills migrated from manual WGPUBindGroupEntry arrays to OutputLayoutInfo.build(). Zero manual descriptor fills remain.
Typestate design: separate Decl (declaration phase, no GPU handles) and compiled (GPU handles valid, accessed via ExecuteContext::get(handle)) types. Declaration-phase invalid state is unrepresentable. Handles are strong-typedef uint32_t indices into dense vectors. No unordered_map+unique_ptr for hot-path access — O(1) handle indexing, cache-friendly iteration. Strings are debug labels only, never used as hot-path lookup keys. FrameGraph owns: - Textures, buffers, descriptors (Frame/Persistent lifetime) - Samplers (deduplicated pool) - Shader modules (cached with hot-reload invalidation) - Render/compute pipelines (fingerprint-based caching, shader-version invalidation) - Bind group layouts - Persistent uploads (LTC tables, SSAO kernel/noise, IBL fallbacks) Pass interface: - ensure_initialized(device) replaces do_setup/is_ready - Passes cache handles as plain members, register once (first frame) - No per-frame string allocations for registered resources - Implicit liveness via PassBuilder/DescriptorBuilder usage Compiled resource safety: - ExecuteContext::get() asserts valid handle, current-frame liveness, and non-null compiled pointer - Deferred destruction on desc change / eviction keeps pre-compile references (e.g. ImGui draw data) valid through execute Removed: - OutputLayoutInfo, auto-sampler creation, BuildResource - TextureHandle/BufferHandle/DescriptorHandle (old indices) - TextureRef/BufferRef/DescriptorRef, ResourceRef<T> - CachedResource<T>, intrusive_ptr refcounting - LtcTextures class (textures + sampler in FG static cache) - All Ready structs, do_setup/do_renderer_setup, variant<monostate,Ready> - Declaration vectors (m_resources/m_buffer_resources/m_descriptor_resources) - allocate_textures/allocate_buffers/allocate_descriptors - unordered_map<string, unique_ptr<T>> storage for decls Cross-pass coupling (SSAO/ContactShadow → GBufferPass*) removed via static consumer_slots().
…t-caches] String-keyed caches (name→handle registries, shaders, BGLs, pipelines) now use boost::unordered_flat_map with transparent StringViewHash/Equal — lookups pass string_view directly and no longer allocate a std::string per find(). Pipeline builders gain a shader-version fast path: on a cache hit with matching shader_version, build() returns the cached pipeline without computing the full config fingerprint. Fingerprint is still checked on the slow path (shader invalidated or first build). Tracy zones added to FrameGraph hot-path methods (texture/buffer/descriptor registration, bind_group_layout, shader, sampler, add_pass, resize, import_buffer, descriptor factory, begin_frame) and to compile sub-phases (materialize_textures/buffers/descriptors, evict_unused). Pipeline builder build() uses PTS_ZONE_NAMED to distinguish cache-hit vs cache-miss paths.
…onfig-driven variants
Squashes four in-flight commits plus cleanup on dev/rendering-next.
DepTrackedCache (was 582133a)
- Generic dep-tracked cache with forced-dirty invalidation and stable
references; unifies FG/RenderWorld version tracking.
IShaderCompiler interface + EmbeddedCompiler (was ec439d0)
- Polymorphic shader-compilation interface; EmbeddedCompiler serves
pre-built embedded WGSL, used on WASM and as native error-fallback.
SlangCompiler backend (was 504d5bf)
- libslang backend with on-disk cache and mtime-based poll_dirty.
- Per-source-key mutexes for thread-safe compile.
- Drops PTS_SHADER_HOT_RELOAD plumbing; editor polls the compiler.
Config-driven shader variants (was 88cb87e, a51c818)
- slangc.shaders[] gains a variants: list with defines + filename suffix.
- shader_variants_codegen emits a constexpr (defines_canon -> suffix)
map consumed by EmbeddedCompiler.
- shader_variants_codegen fails loud on non-dict entries; unit tests
cover conflicting-suffix detection and implicit-base handling.
Cleanup (this ticket)
- Rename headers/sources to lowerCamelCase per project convention:
depTrackedCache.h, shaderCompiler.{h,cpp}, slangCompiler.{h,cpp}.
- Drop the hand-rolled SHA-256 in slangCompiler.cpp in favor of
boost::hash_combine over the cache-key inputs. Cache keys are not
security-sensitive; collision resistance among same-process inputs
is what matters, and format_version + defines + source + dep
hashes make collisions astronomically unlikely.
- Introduce ShaderKey { source, defines } flowing through
IShaderCompiler::compile, SlangCompiler / EmbeddedCompiler, and
FrameGraph's shader/shader_variant entry points. Provides
hash_value / operator== so it can be used as a map key directly.
Future variant axes (pso, material, vertex layout) become one-field
changes rather than API churn.
- Rename PTS_UNUSED -> UNUSED in diagnostics.h to match the rest of
the macro surface; replace (void) silencers in depTrackedCache and
testSlangCompiler.
- linux-tool-builds (Linux CI for tool subset, kills Windows-artifact smuggling) - cpp-shader-compiler-tool (replaces Python slangc.py with C++ IShaderCompiler CLI) Resume reference in tmp_plan/README.md.
Replaces the Python shader-build pipeline (slangc.py + shader_codegen.py + Jinja template + *.reflect.json sidecars) with a single C++ tool, pts_shaderc, that walks slang::ProgramLayout in-process and emits both WGSL and the *_metadata.h headers directly. Architecture - core/shaderc/ — new CMake target `core::shaderc`. Houses slangRuntime, slangReflection, slangMetadata, ShaderLoader, and diagnostics.h. Main core library links it; builds slang-free on Emscripten so diagnostics.h and ShaderLoader remain accessible there. - tools/pts_shaderc/ — plain CMake tool (no Conan wrapper). Links only core::shaderc, avoiding the Dawn/ImGui/USD dep tree. `compile` and `batch` subcommands. - tools/repo_tools/slangc.py — reduced to a thin driver invoking pts_shaderc for config resolution and variant enumeration. - shader_codegen.py, shader_metadata.h.j2, and *.reflect.json all deleted. Runtime - IPass and 4 direct-call sites (hello_triangle, testWebGpu, testPipelineBuilder, testAutoExposure) route through IShaderCompiler::compile() instead of get_resource<WGSL>. - Native WGSL embedding scoped to Emscripten only; native builds rely on SlangCompiler at runtime. editor.exe shrinks -207 KB, embedded_resources.h 796 KB -> 30 KB. Build infra - tools/docker/linux-tools.Dockerfile + build-tools.sh for local Linux tool-build iteration. - linux-ci: Emscripten job builds host tools natively on its own Linux runner instead of consuming Windows artifacts. - Slang bumped 2026.1 -> 2026.5.2 for the reflection API + wgsl depth texture fixes.
Make shader reflection the single source of truth for bind group layouts. The C++ OutputSlot DSL duplicated what the shader already declares — silent drift when a shader binding changed and C++ didn't match. - pts_shaderc metadata emitter dispatches on TypeReflection::Kind — ConstantBuffer/ParameterBlock, SamplerState, Resource (StructuredBuffer / Texture* / RWTexture*). Generates correct WGPUBindGroupLayoutEntry unions per binding, previously all emitted as buffer type. - Slang user attributes [DynamicBuffer], [NonFilterable], [NonFiltering] registered via IGlobalSession::addBuiltins. Metadata emitter reads them to set WGPU flags reflection can't infer (hasDynamicOffset, sampler Filtering vs NonFiltering, texture sampleType). - FrameGraph::bind_group_layout(name, WGPUBindGroupLayout) registers a caller-built layout and caches by name for dep-tracked versioning. Lookup-only bind_group_layout(name) fails loud if not pre-registered. - ~28 callsites across forward, pathtracer, editor, gbuffer, shadow, contact_shadow, ssao, tone-mapping migrated to <shader>::create_bind_group_layout_N(device). Consumer BGLs registered by the consuming renderer (forward registers shadow_map/consumer from forward shader's reflection; ShadowMapPass / ContactShadowPass look up by name). - iblResources stays open-coded — its compute shaders patch RWTexture2D storage format at runtime (rgba32float → rgba16float) which reflection cannot predict. - testSlangMetadata.cpp covers uniform/[DynamicBuffer]/texture+sampler/ StructuredBuffer RO/RW/RWTexture2D cases. - Deletes: core/include/core/rendering/outputLayout.h + core/src/rendering/outputLayout.cpp, OutputSlot class, consumer_slots() / producer_slots() static helpers, slot-list overloads of FrameGraph::bind_group_layout. Native Debug + Release builds green; 51/51 tests pass including editorSmoke (brdf_test, kitchen_set, shadow_test, primitives, test_cube) and ptSmoke_glass_test. Emscripten prebuild slangc.py has a pre-existing path-resolution bug flagged for a separate infra ticket.
Replaces all non-ASCII characters in project source with ASCII equivalents: - Box drawing (U+2500) -> '-' - Em dash (U+2014) -> '--' - En dash (U+2013) -> '-' - Right arrow (U+2192) -> '->' - Check mark (U+2713) -> '[OK]' - Cross mark (U+2717) -> '[FAIL]' - Approx equal (U+2248) -> '~=' - Degree sign (U+00B0) -> 'deg' - Plus-minus (U+00B1) -> '+/-' - Multiplication sign (U+00D7) -> 'x' - Windows-1252 em dash (0x97 in ltcData.h) -> '--' Runtime-visible changes: build log output uses [OK]/[FAIL] instead of Unicode check/cross marks. The string matching in build/__init__.py was updated to match. Adds ASCII-only-source coding rule to CLAUDE.md Code Conventions. Ticket: ascii-only-source
…DepTrackedSlotMap
New dense slot-map primitives with fat-pointer handles, tombstone+free-list
erase, and globally monotonic versions:
- SlotMap<K,V>: flat_map index, stable indices across erase/realloc,
Handle={container*,uint32_t}, span_raw() escape hatch for GPU iteration
- DepTrackedSlotMap<K,V>: composition over SlotMap<K,Tracked>, dep-snapshot
rebuild, get_or_build / get_or_build_with_replace / invalidate
Migration:
- FrameGraph caches (shader, BGL, pipeline, descriptor) -> DepTrackedSlotMap
with std::less<> for transparent string_view lookup
- RenderWorld m_meshes/m_objects/m_lights/m_cameras -> SlotMap<SdfPath,T>;
alloc()+set_prim_path() collapsed to insert(path,T{}); m_prim_slots removed
- SyncScope: alloc_*(path), free_*(path), mutate_*(idx, fn) replace old
alloc_*_slot()/free_*_slot()/write_*() + WriteGuard pattern
- All render passes, editor, tests migrated to span_raw()/for_each/at() API
- Delete depTrackedCache.h, Slot<T>, SlotVector<T>
Ticket: unified-slotmap
Rename wrapper-level identifiers across 15 files: - bg_decl / bg0_decl / bg1_decl -> desc_decl / desc0_decl / desc1_decl - bg / bg0 / bg1 -> desc / desc0 / desc1 - desc_group -> descriptor - Debug label strings updated to match (e.g. "bg0" -> "desc0") - Wrapper-level comments reworded; WGPU API boundary comments preserved Unchanged (out of scope): Descriptor::bind_group field (holds raw WGPUBindGroup), bind_group_layout() methods (return WGPUBindGroupLayout), slangMetadata BindGroup struct, raw WGPU API call sites in iblResources and hello_triangle, PropertyDescriptor types. Ticket: unify-descriptor-terminology
|
Important Review skippedToo many files! This PR contains 177 files, which is 27 over the limit of 150. ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: ⛔ Files ignored due to path filters (2)
📒 Files selected for processing (177)
You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
In --host-tools-only mode (CI Emscripten path), the main Release conan install hasn't run, so conanrun.sh doesn't exist. Previously slangc.py passed it unconditionally, triggering 'env_script not found' from ShellCommand.exec. The staged pts_shaderc has its own bundled deps and runs without the main build's env script.
The --host-tools-only pass (step before the Emscripten cross-compile in CI) already runs host-tool-dependent prebuild steps like slangc, staging the tool binary in the host build dir (e.g. _build/linux-x64/Release/). The Emscripten main build re-invoked those same steps, but pts_shaderc isn't in _build/emscripten/Release/bin/ -- so slangc failed with 'pts_shaderc not found'. Mirror the existing 'skip host-tools compilation phase on Emscripten' logic (line 511) for the prebuild-execute phase: filter out _HOST_TOOL_TARGETS entries when emscripten_build is True.
… main build" This reverts commit eb8f729.
This reverts commit ff7ab0e.
… search Two coupled fixes replace the earlier symptom patches (reverted in the prior two commits): 1. _resolve_pts_shaderc searches sibling platform build dirs too. The Emscripten main build can't find pts_shaderc under _build/emscripten/Release/ -- it was staged at _build/linux-x64/Release/bin/ by the preceding --host-tools-only pass. Search one level up (_build/<*>/<cfg>/bin/) to find it. 2. --host-tools-only now stages the tool's conanrun script next to the binary. Each host tool has its own isolated Conan graph with its own conanrun (different from the main build's). Staging it alongside lets slangc.py source it without depending on the main build's conan install. slangc.py now resolves conanrun in preference order: (a) next to the resolved pts_shaderc binary (bin/conanrun.*) (b) current build_dir/conanrun.* (native same-platform fallback) (c) None (RPATH handles it on Linux when neither is available) Windows needs (a) or (b) to find slang.dll via PATH. Linux can survive without either thanks to RPATH, but prefers (a)/(b) when present.
Unifies the layout between --host-tools-only and native Phase 1: after both paths finish building the host tool, conanrun.* is colocated with the binary in build_dir/bin/. slangc.py drops the build_dir fallback -- one convention, one location. Also hoist shutil/sys imports to module scope.
…lete
Template methods of SyncScope accessed non-dependent names
m_world.m_objects etc. where m_world is RenderWorld&. Because
RenderWorld was only forward-declared at that point in the header,
Clang (Emscripten emsdk) rejected the template at parse time:
error: member access into incomplete type 'RenderWorld'
m_world.m_objects.mutate_at(i, std::forward<Fn>(fn));
MSVC happened to defer name lookup and let it through. Move the
template bodies out of the class, below the RenderWorld definition
in the same header.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Long-running
dev/rendering-nextbranch squashing 17 commits of rendering infrastructure work. Major themes:FrameGraph overhaul
SlotMap/DepTrackedSlotMapreplacingDepTrackedCache+SlotVectorShader toolchain
core::shaderc+ standalonepts_shaderctoolIShaderCompilerinterface with Slang backend +EmbeddedCompilerfor WASMconfig.yaml+shader_variants_codegen.py)OutputSlotDSL)Descriptor API
DescriptorBuilderfluent API +FallbackPoolfor optional bindingsDescriptor;BindGrouponly where it refers to the raw WGPU APIadd_bilateral_blur()helper now backs both SSAO and contact shadowRendering
1/zinterpolation + grazing-angle bias; shared bilateral blur to denoiseQuality
--usddropdown sync fix: built-in scene combo no longer has a no-op first entry after startup-loading a custom stageTest plan
./repo build(native)./repo test-- all 52 tests pass locally./repo launch editor --capture-and-quit --usd assets/scenes/shadow_test.usda-- valid capture