Switch to CompilerCaching.jl #431

Draft
maleadt wants to merge 4 commits into main from tb/compilercaching

maleadt (Member) commented May 12, 2026

maleadt and others added 4 commits May 10, 2026 20:29
Replace `cached_compilation` with an `OpenCLResults` struct attached to each
`CodeInstance` via `CompilerCaching`: SPIR-V `obj` + entry name + `device_rng`
flag are session-portable (cached through precompilation), the runtime
library bitcode rides along via `GPUCompiler.bitcode`/`bitcode!`, and the
linked `cl.Kernel` is held in a small linear cache keyed by `cl.Context`.

The linear cache is needed because `cl.Kernel` is bound to the OpenCL
context that built it, but the cache partition (set by
`GPUCompiler.cache_owner` on the SPIRV target + OpenCL params) only covers
codegen-affecting features. Two contexts on the same device share the same
SPIR-V but need distinct kernel handles. Hot-path lookup is one field load +
one `===` compare — single-context users (the overwhelming majority) stay
at n=1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
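
The per-context kernel cache described in this commit message could look roughly like the sketch below. It is only an illustration of the idea, not the code in this PR: `FakeContext`, `FakeKernel`, `Results`, `link_kernel`, and `kernel_for!` are hypothetical stand-ins for `cl.Context`, `cl.Kernel`, `OpenCLResults`, and the actual linking step, and the sketch scans the whole list rather than special-casing the first entry.

```julia
# Hypothetical stand-in for cl.Context. Mutable, so `===` compares identity.
mutable struct FakeContext
    id::Int
end

# Hypothetical stand-in for cl.Kernel: a handle bound to one context.
struct FakeKernel
    ctx::FakeContext
    name::String
end

# Hypothetical stand-in for OpenCLResults: the portable part (entry name)
# plus a small list of context => kernel pairs.
struct Results
    entry::String
    kernels::Vector{Pair{FakeContext,FakeKernel}}
end
Results(entry) = Results(entry, Pair{FakeContext,FakeKernel}[])

# Counts how often the (assumed expensive) link step runs.
const link_count = Ref(0)

function link_kernel(ctx::FakeContext, entry::String)
    link_count[] += 1
    return FakeKernel(ctx, entry)
end

# Linear lookup by context identity; link and remember on a miss.
function kernel_for!(res::Results, ctx::FakeContext)
    for (c, k) in res.kernels
        c === ctx && return k
    end
    k = link_kernel(ctx, res.entry)
    push!(res.kernels, ctx => k)
    return k
end
```

With a single context the list never grows past one entry, so the scan ends after the first `===` comparison. A second context on the same device reuses the same `Results` but gets its own kernel handle.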
`something(lookup(...), compile_opencl!(...))` evaluated `compile_opencl!`
eagerly even on a cache hit, so every kernel launch silently re-ran the
full LLVM compile pipeline. Branch explicitly on the lookup result.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
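
The bug fixed by this commit comes from `something` being an ordinary function, so Julia evaluates all of its arguments before the call. The sketch below reproduces the pattern with hypothetical stand-ins (`lookup`, `compile!`, and a `Dict` cache, none of which are the real functions from this PR) and contrasts it with an explicit branch.

```julia
# Hypothetical cache and compile counter, for illustration only.
const cache = Dict{Symbol,String}()
const compile_count = Ref(0)

# Returns the cached value, or `nothing` on a miss.
lookup(key) = get(cache, key, nothing)

# Stand-in for the expensive compile step; stores and returns the result.
function compile!(key)
    compile_count[] += 1
    cache[key] = "binary for $key"
end

# Eager: `compile!(key)` is evaluated as an argument, even on a cache hit.
eager(key) = something(lookup(key), compile!(key))

# Explicit branch: `compile!` runs only when the lookup misses.
function lazy(key)
    hit = lookup(key)
    return hit === nothing ? compile!(key) : hit
end
```

Calling `lazy(:k)` twice compiles once, while every call to `eager(:k)` compiles again. On Julia 1.7 and later, the `@something` macro short-circuits and would be another way to avoid the eager evaluation.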