Fix horizontal fusion missed due to duplicate inputs not being eliminated#2445
Agent-Logs-Url: https://github.com/diku-dk/futhark/sessions/b5912493-dec0-4968-83e1-d48e4a551868 Co-authored-by: athas <55833+athas@users.noreply.github.com>
WilliamDue
left a comment
I think the problem can be solved more cleanly and safely by deduplicating inp_r in fuseSuperScrema, maybe at the end of fuseSuperScrema or by applying another function afterwards.
| form <- tryIdentityPost form' | ||
| pure (new_inp, form, new_out) | ||
| form'' <- tryIdentityPost form' | ||
| let (deduped_inp, deduped_form) = dedupInput new_inp form'' |
I think this is probably not the sanest way to solve the problem, actually. The deduplication should probably happen at the end of fuseSuperScrema; I am pretty sure the duplicates are introduced in the definition of inp_r.
Done in 86d4568. The dedupInput call is now at the end of fuseSuperScrema, applied to inp_r and lam1 (using a lightweight ScremaForm wrapper since dedupInput operates on that type). The fuseScrema function is back to its original simpler form.
|
I am quickly gonna try it out to see if I am correct. |
|
Actually @athas, do you mind if I ask @copilot to do it? |
|
Knock yourself out. |
|
@copilot Your current change does deduplication at the end of the construction of the fused Screma. I think it is possible to do the deduplication only at the construction of the fused SuperScrema in |
Agent-Logs-Url: https://github.com/diku-dk/futhark/sessions/a00ec7b9-8558-4780-88b4-0aea249f2c4f Co-authored-by: WilliamDue <79925124+WilliamDue@users.noreply.github.com>
Changed in 86d4568 — deduplication now happens at the end of |
|
@WilliamDue Looks like it works. Is this in greater accordance with your vision? |
|
Yes I believe it is correct, maybe a bit hacky using dedupInput but correct I am sure. |
After horizontal fusion, the combined screma can end up with duplicate entries in its input list when two SOACs share the same input array. Without deduplication, the fused screma looks structurally "heavier" than it should, causing the fusion pass to miss subsequent fusion opportunities in the same iteration.
Changes
src/Futhark/Optimise/Fusion/Screma.hs: Call dedupInput at the end of fuseSuperScrema, applied to inp_r and lam1 before the SuperScrema is returned. This is where duplicates are actually introduced (inp_r = inp_p <> inp_c_real), and deduplicating here removes extra lambda parameters by replacing them with let-binding aliases of the canonical parameter.
tests/fusion/horizontal-fusion-shared-input.fut: Regression test for the pattern from the issue: two outer maps over a shared array W, each computing a dot product against different matrices, followed by an elementwise combination.
Without the fix the total Screma count in the SOACS pipeline is 4; with it, the two innermost reductions fuse horizontally and the count drops to 3.
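The aliasing idea described above can be sketched in isolation. This is a minimal illustration, not Futhark's actual dedupInput: the names Param, Array and dedupInputs are hypothetical, and plain strings stand in for the compiler's IR types. It keeps the first lambda parameter bound to each input array and reports every later parameter over the same array as an alias of that canonical one.

```haskell
import qualified Data.Map.Strict as M

-- Hypothetical stand-ins; the real pass operates on Futhark IR types.
type Param = String
type Array = String

-- | Keep the first parameter bound to each input array. Every later
-- parameter that reads the same array is dropped from the input list
-- and returned as an (alias, canonical) pair instead.
dedupInputs :: [(Param, Array)] -> ([(Param, Array)], [(Param, Param)])
dedupInputs = go M.empty
  where
    go _ [] = ([], [])
    go seen ((p, arr) : rest) =
      case M.lookup arr seen of
        -- Array already has a canonical parameter: record an alias.
        Just canon ->
          let (kept, aliases) = go seen rest
           in (kept, (p, canon) : aliases)
        -- First time we see this array: p becomes its canonical parameter.
        Nothing ->
          let (kept, aliases) = go (M.insert arr p seen) rest
           in ((p, arr) : kept, aliases)
```

For example, `dedupInputs [("x","W"),("a","A"),("y","W"),("b","B")]` evaluates to `([("x","W"),("a","A"),("b","B")], [("y","x")])`: the second read of W disappears from the inputs, and y would be rebound to x by a let-binding at the top of the lambda body.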