Cublas Support #2830
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@             Coverage Diff              @@
##             main    #2830       +/-   ##
===========================================
- Coverage   68.16%   32.09%   -36.07%
===========================================
  Files         109      188      +79
  Lines       11779    30188   +18409
===========================================
+ Hits         8029     9689    +1660
- Misses       3750    20499   +16749
mlir::OpBuilder builder(module->getContext());

ParserConfig config(&context, /*verify_after_parse*/ true);
if (failed(parseSourceString(modstr, module->getBody(), config))) {
This parsing step is expensive, especially since this is the hot-loop function where we want low latency. Can we hide it behind the cache?
The problem is that the only mechanism I could figure out to encode shape info is through function arg annotations, which then requires parsing to extract. However, you need that extracted shape info to know whether the caching is valid. If there were some other way to pass that information to the JIT pass or at least mark the calls that don't need dynamic shape information to avoid regression, then we could utilize caching.
Could we instead add two arguments to reactantXLAExec, ShapeInfo* runtimeArgs and runtimeArgsSize, and have the codegen emit those?
To do so, we should change llvm::SmallVector<int64_t, 2> shape; to a pointer and a size, which makes it nicer for a C API.
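A rough sketch of what that suggestion could look like, to make the idea concrete. Everything here is an assumption for illustration: the ShapeInfo field names (dims, rank), the makeCacheKey helper, and the lookupOrCompile stand-in are hypothetical and not taken from the PR. The point is only that once shapes arrive as a plain pointer and size, a cache key can be computed without parsing the module string first.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical C-compatible layout: a pointer and a size replace
// llvm::SmallVector<int64_t, 2>, so the struct can cross a C API boundary.
extern "C" {
struct ShapeInfo {
  const int64_t *dims; // not owned; points at `rank` extents
  size_t rank;         // number of dimensions
};
}

// Hypothetical helper: derive a cache key from the module string plus the
// runtime shapes. No MLIR parsing is needed to compute it.
inline std::string makeCacheKey(const std::string &modstr,
                                const ShapeInfo *runtimeArgs,
                                size_t runtimeArgsSize) {
  std::string key = modstr;
  for (size_t i = 0; i < runtimeArgsSize; ++i) {
    key += '|'; // separates arguments
    for (size_t d = 0; d < runtimeArgs[i].rank; ++d) {
      key += std::to_string(runtimeArgs[i].dims[d]);
      key += 'x'; // separates extents within one argument
    }
  }
  return key;
}

// Stand-in for the compiled-executable cache. The int is a placeholder for a
// compiled handle; compileCount records how often the slow path ran.
struct ExecCache {
  std::unordered_map<std::string, int> entries;
  int compileCount = 0;

  int lookupOrCompile(const std::string &modstr, const ShapeInfo *runtimeArgs,
                      size_t runtimeArgsSize) {
    std::string key = makeCacheKey(modstr, runtimeArgs, runtimeArgsSize);
    auto it = entries.find(key);
    if (it != entries.end())
      return it->second; // hit: parse and compile are skipped
    // Miss: this is where the expensive parse + compile would happen.
    int handle = ++compileCount;
    entries.emplace(std::move(key), handle);
    return handle;
  }
};
```

With this shape, repeated calls with identical runtime shapes take the fast path, and only a new shape combination pays for parsing.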
Yeah, that might be possible. I'll try to look into it.
Support for EnzymeAD/Enzyme-JAX#2444
Adds support for cublasSgemm, including caching and runtime shape propagation.