Skip to content

Add an experimental opaque closure type.#1853

Draft
maleadt wants to merge 7 commits into
mainfrom
tb/opaque_closures
Draft

Add an experimental opaque closure type.#1853
maleadt wants to merge 7 commits into
mainfrom
tb/opaque_closures

Conversation

@maleadt
Copy link
Copy Markdown
Member

@maleadt maleadt commented Apr 4, 2023

Opaque closures don't make that much sense in the CUDA environment (where we don't have a linker and need to recompile when the OC changes, also because we require specsig), but it can be useful to inline typed IR into a kernel and call it.

Current state of this PR is 1.12+ only (because of JuliaLang/julia#53219). It could be made more general if and when we decide to merge this.

Depends on JuliaGPU/GPUCompiler.jl#572

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 4, 2023

Codecov Report

❌ Patch coverage is 0% with 140 lines in your changes missing coverage. Please review.
✅ Project coverage is 0.69%. Comparing base (a011e73) to head (7782216).
⚠️ Report is 537 commits behind head on master.

Files with missing lines Patch % Lines
src/compiler/compilation.jl 0.00% 140 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #1853       +/-   ##
==========================================
- Coverage   71.83%   0.69%   -71.14%     
==========================================
  Files         155     154        -1     
  Lines       15013   14883      -130     
==========================================
- Hits        10785     104    -10681     
- Misses       4228   14779    +10551     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@pepijndevos
Copy link
Copy Markdown

How much of this is actually CUDA specific?

@maleadt
Copy link
Copy Markdown
Member Author

maleadt commented Apr 5, 2023

Not much. Only the fact that we need to be able to create a compiler invocation, which depends on backend-specific state (which device is active, etc): https://github.com/JuliaGPU/CUDA.jl/pull/1853/files#diff-ecbfaf5b99ab10dcafff5717c7cc5f856768e4313446fa3fb58b839a25b17cfcR317

@maleadt maleadt force-pushed the tb/opaque_closures branch from 8538870 to dab7f8a Compare April 14, 2023 11:15
@maleadt
Copy link
Copy Markdown
Member Author

maleadt commented May 24, 2023

Seems to cause some IR verification errors:

ERROR: LLVM error: Called function is not the same type as the call!
  call void bitcast (
      void ([1 x i64],
            { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }*,
            { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }*,
            [3 x float]*,
            double)* @julia_opaque_gpu_closure_9593
    to
      void ([1 x i64],
            { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }*,
            { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }*,
            { double, float },
            double)*)(
      [1 x i64] %state,
      { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }* nonnull %5,
      { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }* nonnull %6,
      { double, float } %.fca.1.insert,
      double %25), !dbg !300

@maleadt maleadt force-pushed the tb/opaque_closures branch from ae85f2e to 9cfd249 Compare June 15, 2023 18:02
@maleadt maleadt force-pushed the tb/opaque_closures branch from 9cfd249 to 92632e9 Compare June 29, 2023 14:09
@maleadt maleadt force-pushed the tb/opaque_closures branch from 92632e9 to 4511f85 Compare April 9, 2024 15:21
@maleadt maleadt added enhancement New feature or request cuda kernels Stuff about writing CUDA kernels. speculative Not sure about this one yet. labels Apr 9, 2024
@maleadt maleadt force-pushed the tb/opaque_closures branch from 3f9cde3 to 7782216 Compare April 17, 2024 11:40
id = length(GPUCompiler.deferred_codegen_jobs) + 1
GPUCompiler.deferred_codegen_jobs[id] = job
quote
ptr = ccall("extern deferred_codegen", llvmcall, Ptr{Cvoid}, (Int,), $id)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After #582 you should be able to just emit a gpuc.lookup(mi, oc, args...) or maybe gpuc.deferred(oc, args...)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cuda kernels Stuff about writing CUDA kernels. enhancement New feature or request speculative Not sure about this one yet.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants