Cublas Support #2830

Draft
dsding2 wants to merge 3 commits into EnzymeAD:main from dsding2:main
dsding2 commented Apr 22, 2026

Support for EnzymeAD/Enzyme-JAX#2444
Adds support for cublasSgemm, including caching and runtime shape propagation.

dsding2 marked this pull request as draft April 22, 2026 17:54
codecov Bot commented Apr 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 32.09%. Comparing base (b39a1fc) to head (d9adb21).
⚠️ Report is 1017 commits behind head on main.

❗ There is a different number of reports uploaded between BASE (b39a1fc) and HEAD (d9adb21).

HEAD has 94 fewer uploads than BASE.

| Flag | BASE (b39a1fc) | HEAD (d9adb21) |
|------|----------------|----------------|
|      | 99             | 5              |
@@             Coverage Diff             @@
##             main    #2830       +/-   ##
===========================================
- Coverage   68.16%   32.09%   -36.07%     
===========================================
  Files         109      188       +79     
  Lines       11779    30188    +18409     
===========================================
+ Hits         8029     9689     +1660     
- Misses       3750    20499    +16749     

```cpp
mlir::OpBuilder builder(module->getContext());

ParserConfig config(&context, /*verify_after_parse*/ true);
if (failed(parseSourceString(modstr, module->getBody(), config))) {
```
Member commented:

This parsing step is expensive, especially since this is the hot-loop function where we want low latency. Can we hide it behind the cache?

Author commented:

The problem is that the only mechanism I could find to encode shape info is through function argument annotations, which then have to be parsed to extract it. However, that extracted shape info is needed to know whether a cached entry is valid. If there were some other way to pass that information to the JIT pass, or at least to mark the calls that don't need dynamic shape information so they don't regress, then we could use caching.
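
To make the constraint above concrete, here is a minimal sketch of a cache whose key includes the runtime shapes, so that a hit is valid only when both the module text and the shapes match. It assumes the shapes are already available without parsing, which is exactly the part that is unresolved in this thread. `CompiledKernel`, `Shapes`, and `ShapeKeyedCache` are hypothetical names for illustration, not types from this PR or from Enzyme.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a compiled executable; not a type from this PR.
struct CompiledKernel {
  int id;
};

// Runtime shapes for every argument, e.g. {{m, k}, {k, n}} for a GEMM.
using Shapes = std::vector<std::vector<int64_t>>;

// Hypothetical cache: the key combines the module text with the runtime
// shapes, so an entry is reused only when both are identical.
class ShapeKeyedCache {
public:
  using CompileFn =
      std::function<CompiledKernel(const std::string &, const Shapes &)>;

  // `compile` stands for the expensive parse-and-compile path and is
  // invoked only on a cache miss.
  const CompiledKernel &getOrCompile(const std::string &moduleText,
                                     const Shapes &shapes,
                                     const CompileFn &compile) {
    Key key{moduleText, shapes};
    auto it = cache_.find(key);
    if (it == cache_.end())
      it = cache_.emplace(std::move(key), compile(moduleText, shapes)).first;
    return it->second;
  }

  std::size_t size() const { return cache_.size(); }

private:
  using Key = std::pair<std::string, Shapes>;
  std::map<Key, CompiledKernel> cache_;
};
```

With a key like this, a call with unchanged shapes skips the compile callback entirely, while a call with a new shape compiles once and adds a second entry.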

Member commented:

Could we instead add two arguments to reactantXLAExec, `ShapeInfo* runtimeArgs` and `runtimeArgsSize`, and have the codegen emit those?

To do so, we should change `llvm::SmallVector<int64_t, 2> shape;` to a pointer and a size, so that it works cleanly as a C API.
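
A rough sketch of what a pointer-plus-size layout could look like, under stated assumptions: the field names `dims` and `rank`, and the helper `copyShapes`, are hypothetical, and the real `ShapeInfo` in this PR may carry other fields. The actual signature of reactantXLAExec is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical C-compatible layout: a pointer plus a length replaces
// llvm::SmallVector<int64_t, 2>. Field names are assumptions.
extern "C" {
struct ShapeInfo {
  const int64_t *dims; // not owned; must stay valid for the duration of the call
  size_t rank;         // number of entries in dims
};
}

// Hypothetical helper for the receiving side: copies the shapes passed
// across the C boundary into owned vectors (for example, to build a cache key).
std::vector<std::vector<int64_t>> copyShapes(const ShapeInfo *runtimeArgs,
                                             size_t runtimeArgsSize) {
  std::vector<std::vector<int64_t>> out;
  out.reserve(runtimeArgsSize);
  for (size_t i = 0; i < runtimeArgsSize; ++i)
    out.emplace_back(runtimeArgs[i].dims,
                     runtimeArgs[i].dims + runtimeArgs[i].rank);
  return out;
}
```

Because the struct holds only a raw pointer and a length, generated code can fill it in from stack arrays without depending on any LLVM container type.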

Author commented:

Yeah, that might be possible. I'll try to look into it.
