Expand pass by ramonwirsch · Pull Request #740 · daisytuner/docc

ramonwirsch · 2026-06-08T13:00:32Z

Reworked the expansion pipeline into a single MathExpansionPass.

Warning: the pass does not currently clear the analysis cache between expansions. Each expand call is therefore responsible for not leaving behind any cached analysis that is stale for the library nodes expanded after it.
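To make the hazard concrete, here is a minimal sketch of a cached analysis going stale after a graph mutation. `AnalysisCache`, `node_count`, and the map used as a "graph" are hypothetical stand-ins for illustration only; they are not the docc analysis manager or its API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy cache (NOT the docc API): memoizes one "analysis",
// here simply the number of nodes in a toy graph.
class AnalysisCache {
public:
    // Returns the cached result, computing it only on first use
    // or after invalidate() was called.
    int node_count(const std::map<std::string, int>& graph) {
        if (!valid_) {
            cached_ = static_cast<int>(graph.size());
            valid_ = true;
        }
        return cached_;
    }

    // Must be called after the graph was mutated, otherwise
    // node_count() keeps returning the old value.
    void invalidate() { valid_ = false; }

private:
    bool valid_ = false;
    int cached_ = 0;
};
```

If an expand adds a node and nobody calls `invalidate()`, the next reader of the cache sees the old count. That is the kind of staleness the warning refers to.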

* Collects all math nodes and attempts to expand them in reverse execution order, to minimize the effect of each expansion on the analyses needed by the others.
* Does not clear any analyses, which makes it fragile. This should not matter for the current expand implementations, but it is slightly dangerous until we have removed the need for ScopeAnalysis or can clear analyses more granularly.
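The traversal order described above can be sketched as follows. This is an illustration under assumptions, not the code in this PR: `MathNode` and `expand_in_reverse` are hypothetical names, and the expand step is passed in as a callback instead of being a method on a real library node.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a math library node (NOT the docc type).
struct MathNode {
    std::string name;
    bool expandable;
};

// Walks the collected nodes from last to first in execution order and
// calls `expand` on each expandable node. Returns the names of the nodes
// whose expansion succeeded, in the order they were expanded.
std::vector<std::string> expand_in_reverse(
    const std::vector<MathNode>& nodes_in_execution_order,
    const std::function<bool(const MathNode&)>& expand) {
    assert(expand && "expand callback must be set");
    std::vector<std::string> expanded;
    for (auto it = nodes_in_execution_order.rbegin();
         it != nodes_in_execution_order.rend(); ++it) {
        if (it->expandable && expand(*it)) {
            expanded.push_back(it->name);
        }
    }
    return expanded;
}
```

Going back to front means a node is rewritten before anything that executes ahead of it is touched, which is one way to read the "minimize effects on other analysis" goal.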

daisytuner · 2026-06-08T13:25:23Z

Daisytuner Report - mlir_torch_models (chamomile)

@@                                   Benchmarks                                   @@
=====================================================================================
  Benchmark              Time        ΔTime       Thr         Energy      ΔEnergy     
=====================================================================================
# resnet18_torch         19.85 s     -0.40%      N/A         3787.49 J   -3.96%      
# resnet18_docc_none     17.96 s     -0.27%      N/A         4575.06 J   -3.77%      
# resnet18_docc_sequential 17.57 s   +1.21%      N/A         4460.60 J   -2.32%      
# resnet18_docc_openmp   23.62 s     -0.75%      N/A         6696.29 J   -4.00%      
# resnet18_docc_cuda     5.49 s      -0.56%      N/A         1013.57 J   -4.47%

daisytuner · 2026-06-08T14:03:02Z

Daisytuner Report - python_npbench (zinnia)

@@                                   Benchmarks                                   @@
=====================================================================================
  Benchmark              Time        ΔTime       Thr         Energy      ΔEnergy     
=====================================================================================
# adi_numpy              1.32 s      +1.18%      N/A         134.26 J    +0.48%      
# adi_omp                2.53 s      -1.55%      N/A         315.31 J    -2.16%      
# adi_cuda               3.64 s      -2.11%      N/A         366.08 J    -2.58%      
# adi_seq_tuning         2.59 s      +1.18%      N/A         241.49 J    +0.16%      
# atax_numpy             2.15 s      -0.45%      N/A         227.71 J    -1.15%      
# atax_omp               3.13 s      +1.11%      N/A         404.67 J    +1.61%      
# atax_cuda              4.28 s      -0.17%      N/A         450.22 J    -0.71%      
# atax_seq_tuning        2.94 s      +2.01%      N/A         293.25 J    +0.08%      
# gemm_numpy             1.22 s      +0.26%      N/A         197.19 J    -0.25%      
# gemm_omp               1.16 s      +0.16%      N/A         170.10 J    +0.05%      
# gemm_cuda              10.67 s     -0.53%      N/A         1039.48 J   -1.17%      
# gemm_seq_tuning        1.16 s      +0.24%      N/A         169.31 J    -0.42%      
# gesummv_numpy          1.74 s      -0.43%      N/A         251.84 J    -1.08%      
# gesummv_omp            2.09 s      -4.61%      N/A         344.35 J    -6.01%      
# gesummv_cuda           5.28 s      -0.67%      N/A         734.41 J    -2.21%      
# gesummv_seq_tuning     5.31 s      -1.74%      N/A         681.63 J    -1.33%      
# gemver_numpy           1.08 s      -1.04%      N/A         168.31 J    -1.86%      
# gemver_omp             944.97 ms   -0.89%      N/A         126.74 J    -1.87%      
# gemver_cuda            2.53 s      -0.53%      N/A         273.29 J    -1.16%      
# gemver_seq_tuning      1.70 s      -1.05%      N/A         145.88 J    -2.25%      
# k2mm_numpy             1.19 s      -0.71%      N/A         198.44 J    -1.20%      
# k2mm_omp               3.58 s      -1.24%      N/A         662.24 J    -3.60%      
# k2mm_cuda              12.73 s     +0.02%      N/A         1236.17 J   -0.75%      
# k2mm_seq_tuning        3.00 s      -1.97%      N/A         405.61 J    -1.90%      
# k3mm_numpy             1.02 s      -0.41%      N/A         183.57 J    -1.10%      
# k3mm_omp               5.61 s      +0.16%      N/A         961.56 J    +0.36%      
# k3mm_cuda              18.38 s     -0.82%      N/A         1776.55 J   -1.51%      
# k3mm_seq_tuning        4.96 s      +0.37%      N/A         697.70 J    -1.37%      
# mvt_numpy              2.43 s      -0.44%      N/A         254.68 J    -1.12%      
# mvt_omp                2.78 s      -0.44%      N/A         295.35 J    -1.10%      
# mvt_cuda               3.44 s      -0.28%      N/A         358.95 J    -1.03%      
# mvt_seq_tuning         2.79 s      +0.23%      N/A         295.73 J    -0.56%      
# symm_numpy             781.37 ms   -0.47%      N/A         82.18 J     -1.06%      
# symm_omp               1.02 s      -0.40%      N/A         123.68 J    -2.16%      
# symm_seq_tuning        1.88 s      +1.71%      N/A         155.42 J    -0.23%      
# syr2k_numpy            878.78 ms   +0.66%      N/A         91.41 J     +0.12%      
# syr2k_omp              935.37 ms   -2.78%      N/A         106.90 J    -2.70%      
# syr2k_cuda             1.61 s      +0.36%      N/A         171.40 J    -0.48%      
# syr2k_seq_tuning       932.00 ms   -1.15%      N/A         106.50 J    -1.50%      
# syrk_numpy             768.69 ms   -0.56%      N/A         80.83 J     -1.23%      
# syrk_omp               785.18 ms   +0.37%      N/A         92.15 J     -0.57%      
# syrk_cuda              1.41 s      -1.23%      N/A         153.07 J    -1.63%      
# syrk_seq_tuning        783.85 ms   -0.20%      N/A         92.17 J     -0.69%      
# trmm_numpy             890.48 ms   +1.72%      N/A         92.49 J     +0.74%      
# trmm_omp               754.34 ms   -0.37%      N/A         96.23 J     -1.38%      
# trmm_seq_tuning        1.63 s      -4.75%      N/A         134.52 J    -3.85%

ramonwirsch · 2026-06-09T08:53:42Z

This cannot work without larger rewrites.
Expanding a math node can currently emit further MathNodes, which requires recursive expansion. However, we do not keep metadata about the outcome of an expansion (the newly produced blocks or sequence members that would have to be scanned for recursive expansion).
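One possible shape for the missing metadata is a result object that each expand returns, which a worklist then consumes. The sketch below is an assumption about how such a rewrite could look, not something this PR implements. `ExpansionResult`, `Rules`, `expand_one`, and `expand_all` are hypothetical names, and nodes are reduced to plain strings.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <string>
#include <vector>

// Hypothetical expansion metadata: whether the expand applied and which
// new math nodes it produced (these still need to be visited).
struct ExpansionResult {
    bool applied = false;
    std::vector<std::string> new_math_nodes;
};

// Toy rule table: node name -> math nodes emitted by expanding it.
// Assumed to be acyclic; the step limit below guards against violations.
using Rules = std::map<std::string, std::vector<std::string>>;

ExpansionResult expand_one(const Rules& rules, const std::string& node) {
    ExpansionResult result;
    auto it = rules.find(node);
    if (it != rules.end()) {
        result.applied = true;
        result.new_math_nodes = it->second;
    }
    return result;
}

// Expands until no node produces further expandable nodes.
// Returns the names of all nodes that were expanded, in visiting order.
std::vector<std::string> expand_all(const Rules& rules,
                                    const std::vector<std::string>& initial) {
    const int kMaxSteps = 10000;  // arbitrary safety limit for this sketch
    int steps = 0;
    std::deque<std::string> worklist(initial.begin(), initial.end());
    std::vector<std::string> expanded;
    while (!worklist.empty()) {
        ++steps;
        assert(steps <= kMaxSteps && "expansion rules appear to be cyclic");
        std::string node = worklist.front();
        worklist.pop_front();
        ExpansionResult result = expand_one(rules, node);
        if (!result.applied) continue;
        expanded.push_back(node);
        // The returned metadata tells the driver what to scan next.
        for (const auto& child : result.new_math_nodes) {
            worklist.push_back(child);
        }
    }
    return expanded;
}
```

With this shape the driver never has to rescan the whole graph: it only visits what each expansion reports as newly created.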

ramonwirsch added 2 commits June 8, 2026 13:23

873c6a9: Also migrated sdfg-json-to-c.cpp to not using the expansion pipeline anymore

37383f1: toStr() for ReduceNodes and giving PyStructuredSDFG access to the output_dir for additional dumping

ramonwirsch wants to merge 3 commits into main from expand-pass
