-
Notifications
You must be signed in to change notification settings - Fork 429
[BugFix] Fix reduce_sum with clear=False not accumulating correctly #1778
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
👋 Hi! Thank you for contributing to the TileLang project. Please remember to run We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀 |
📝 WalkthroughWalkthroughInsert conditional guarded copies from output to reducer fragment in reduce macro when buffers are shared and the reduction is non-clearing (clear=False), ensuring existing output values are included before performing the reduction. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Important Action Needed: IP Allowlist UpdateIf your organization protects your Git platform with IP whitelisting, please add the new CodeRabbit IP address to your allowlist:
Failure to add the new IP will result in interrupted reviews. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tilelang/language/reduce_op.py (1)
80-93:⚠️ Potential issue | 🟠 MajorMissing parallel fix for
is_fragment(buffer) and is_shared(out)branch.This branch also creates a temporary
red_frag_outfragment but doesn't copy existingoutvalues whenclear=False, matching the original bug pattern. The same fix should be applied here for consistency.🐛 Proposed fix
elif is_fragment(buffer) and is_shared(out): red_frag_out = alloc_fragment(out.shape, out.dtype) IRBuilder.name(out.name + "_frag", red_frag_out) + if not clear: + copy(out, red_frag_out) + tir.call_intrin( "handle", tir.op.Op.get(_REDUCE_OP_KEY),
🧹 Nitpick comments (1)
tilelang/language/customize.py (1)
56-66: Type hintobjectfor dtype is overly permissive.While the functional change is fine, using
objectas a type hint loses specificity. Consider using a more precise type union if TVM has a specific dtype class.♻️ Suggested type hint
-def view(src: Buffer, shape: list[PrimExpr] | tuple[PrimExpr, ...] | None = None, dtype: str | object | None = None) -> Buffer: +def view(src: Buffer, shape: list[PrimExpr] | tuple[PrimExpr, ...] | None = None, dtype: str | None = None) -> Buffer:If TVM dtype objects need to be accepted, consider importing and using the specific type (e.g.,
tvm.DataTypeor similar) instead ofobject.
|
pre-commit.ci autofix |
|
@ShaobinChen-AH Thanks! Could you add the corresponding tests and remove the annotations part (Likely some changes in: #1773)? |
5f67135 to
f5c67cf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tilelang/language/reduce_op.py`:
- Around line 52-54: The fragment→shared path currently allocates red_frag_out
without preserving existing out when clear=False; fix it by pre-seeding
red_frag_out with the current out (i.e., call copy(out, red_frag_out)) in the
fragment→shared branch before red_frag_out is used/allocated so it mirrors the
shared→shared behavior; update the logic around the red_frag_out variable in
reduce_op.py so that when clear is False you perform copy(out, red_frag_out)
prior to further writes/aggregation.
edccfc2 to
f427076
Compare
f427076 to
edccfc2
Compare
|
@ShaobinChen-AH I added the testcases, you can pull the code and have a review :) |
I have removed the annotations part |
ok, thanks, I will review it |
Fixes #1666
Problem:
When using
reduce_sumwithclear=False, the function should accumulate results onto existing values in the output buffer. However, the code was creating a new temporary fragment buffer without copying existing values, causing accumulation to start from zero.Root Cause:
In
reduce_op.py, whenis_shared(buffer) and is_shared(out), the code creates temporary fragment buffers but only copies from input buffer, not from output buffer whenclear=False.Fix:
Copy existing values from output buffer to temporary fragment buffer when
clear=False, before performing the reduction.Testing:
test_clear_issue.pynow passes withOut=2048(expected)clear=Trueandclear=Falsecases work correctlySummary by CodeRabbit