Couple of minor fixes #3151

Merged: maleadt merged 2 commits into main from tb/fixes on May 22, 2026
maleadt commented May 22, 2026

Encountered while testing JuliaGPU/GPUCompiler.jl#801

maleadt merged commit 29b9b22 into main on May 22, 2026
1 of 2 checks passed
maleadt deleted the tb/fixes branch on May 22, 2026 13:36
codecov Bot commented May 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 16.41%. Comparing base (2fe75d6) to head (87f6ea0).
⚠️ Report is 6 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3151      +/-   ##
==========================================
+ Coverage   16.40%   16.41%   +0.01%     
==========================================
  Files         124      124              
  Lines        9827     9827              
==========================================
+ Hits         1612     1613       +1     
+ Misses       8215     8214       -1     
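
As a sanity check on the diff above, the coverage percentages can be recomputed from the Hits and Lines rows. This is a minimal sketch, assuming coverage is simply hits divided by lines; the function name `coverage_percent` is made up for illustration and is not part of Codecov.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, assuming coverage = hits / lines."""
    return 100.0 * hits / lines


# Hits and Lines are taken from the coverage diff above.
base = coverage_percent(1612, 9827)  # main
head = coverage_percent(1613, 9827)  # #3151

print(f"base {base:.2f}%  head {head:.2f}%  delta {head - base:+.2f}%")
```

One additional covered line out of 9827 is enough to move the rounded figure from 16.40% to 16.41%.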

@github-actions github-actions Bot left a comment

CUDA.jl Benchmarks

Benchmark suite Current: 87f6ea0 Previous: d96bb17 Ratio
array/accumulate/Float32/1d 100372 ns 99804 ns 1.01
array/accumulate/Float32/dims=1 75337 ns 74695 ns 1.01
array/accumulate/Float32/dims=1L 1575673 ns 1575172 ns 1.00
array/accumulate/Float32/dims=2 140411 ns 140050 ns 1.00
array/accumulate/Float32/dims=2L 653433 ns 652746 ns 1.00
array/accumulate/Int64/1d 117165 ns 116667 ns 1.00
array/accumulate/Int64/dims=1 78987 ns 78138 ns 1.01
array/accumulate/Int64/dims=1L 1684256 ns 1683352 ns 1.00
array/accumulate/Int64/dims=2 152414 ns 151632 ns 1.01
array/accumulate/Int64/dims=2L 959616 ns 958723 ns 1.00
array/broadcast 20180 ns 19844 ns 1.02
array/construct 1190.4 ns 1172.2 ns 1.02
array/copy 16855 ns 16637 ns 1.01
array/copyto!/cpu_to_gpu 214555 ns 213824 ns 1.00
array/copyto!/gpu_to_cpu 281396 ns 278790 ns 1.01
array/copyto!/gpu_to_gpu 10516 ns 10365 ns 1.01
array/iteration/findall/bool 131514 ns 130581 ns 1.01
array/iteration/findall/int 146391 ns 144724 ns 1.01
array/iteration/findfirst/bool 79983 ns 79128 ns 1.01
array/iteration/findfirst/int 81393 ns 80847 ns 1.01
array/iteration/findmin/1d 67573 ns 66285 ns 1.02
array/iteration/findmin/2d 102222 ns 101434 ns 1.01
array/iteration/logical 191542 ns 188254 ns 1.02
array/iteration/scalar 65820 ns 64586 ns 1.02
array/permutedims/2d 50055 ns 49748 ns 1.01
array/permutedims/3d 50389 ns 50091 ns 1.01
array/permutedims/4d 50010 ns 49706 ns 1.01
array/random/rand/Float32 12069 ns 12116 ns 1.00
array/random/rand/Int64 23655 ns 23371 ns 1.01
array/random/rand!/Float32 9065 ns 7960.333333333333 ns 1.14
array/random/rand!/Int64 20443 ns 20391 ns 1.00
array/random/randn/Float32 35724 ns 34710 ns 1.03
array/random/randn!/Float32 25197 ns 23894 ns 1.05
array/reductions/mapreduce/Float32/1d 32973 ns 32647 ns 1.01
array/reductions/mapreduce/Float32/dims=1 38239 ns 37819 ns 1.01
array/reductions/mapreduce/Float32/dims=1L 50611 ns 50203 ns 1.01
array/reductions/mapreduce/Float32/dims=2 55743 ns 55479 ns 1.00
array/reductions/mapreduce/Float32/dims=2L 67149 ns 67023 ns 1.00
array/reductions/mapreduce/Int64/1d 40006 ns 39410 ns 1.02
array/reductions/mapreduce/Int64/dims=1 40968 ns 40857 ns 1.00
array/reductions/mapreduce/Int64/dims=1L 86466 ns 86396 ns 1.00
array/reductions/mapreduce/Int64/dims=2 58285 ns 57852 ns 1.01
array/reductions/mapreduce/Int64/dims=2L 82753 ns 82413 ns 1.00
array/reductions/reduce/Float32/1d 33105 ns 32758 ns 1.01
array/reductions/reduce/Float32/dims=1 38205 ns 38011 ns 1.01
array/reductions/reduce/Float32/dims=1L 50604 ns 50349 ns 1.01
array/reductions/reduce/Float32/dims=2 55738 ns 55642 ns 1.00
array/reductions/reduce/Float32/dims=2L 67711 ns 67383 ns 1.00
array/reductions/reduce/Int64/1d 40025 ns 39537 ns 1.01
array/reductions/reduce/Int64/dims=1 40854 ns 40656 ns 1.00
array/reductions/reduce/Int64/dims=1L 86512 ns 86280 ns 1.00
array/reductions/reduce/Int64/dims=2 58144 ns 57865 ns 1.00
array/reductions/reduce/Int64/dims=2L 82594 ns 82279 ns 1.00
array/reverse/1d 16935 ns 16600 ns 1.02
array/reverse/1dL 67851 ns 67647 ns 1.00
array/reverse/1dL_inplace 65305 ns 65318 ns 1.00
array/reverse/1d_inplace 8493.666666666666 ns 8340.333333333334 ns 1.02
array/reverse/2d 20247 ns 19869 ns 1.02
array/reverse/2dL 72198 ns 71763 ns 1.01
array/reverse/2dL_inplace 65113 ns 65099 ns 1.00
array/reverse/2d_inplace 9649 ns 9659 ns 1.00
array/sorting/1d 2715667 ns 2713733 ns 1.00
array/sorting/2d 1068413 ns 1062891 ns 1.01
array/sorting/by 3268189 ns 3280831 ns 1.00
cuda/synchronization/context/auto 1136.9 ns 1138.4 ns 1.00
cuda/synchronization/context/blocking 915.4411764705883 ns 962.5652173913044 ns 0.95
cuda/synchronization/context/nonblocking 6014.6 ns 5944.2 ns 1.01
cuda/synchronization/stream/auto 963.7368421052631 ns 997.8333333333334 ns 0.97
cuda/synchronization/stream/blocking 793.7373737373738 ns 821.8235294117648 ns 0.97
cuda/synchronization/stream/nonblocking 6031.833333333333 ns 5912.4 ns 1.02
integration/byval/reference 143155 ns 143240 ns 1.00
integration/byval/slices=1 145167 ns 145261 ns 1.00
integration/byval/slices=2 283399 ns 283622 ns 1.00
integration/byval/slices=3 422065 ns 421946 ns 1.00
integration/cudadevrt 101583 ns 101870 ns 1.00
integration/volumerhs 9898016 ns 9906591 ns 1.00
kernel/indexing 12720 ns 12574 ns 1.01
kernel/indexing_checked 13641 ns 13327 ns 1.02
kernel/launch 2031.888888888889 ns 2037.111111111111 ns 1.00
kernel/occupancy 689.4344827586207 ns 696.7364864864865 ns 0.99
kernel/rand 14409 ns 14332 ns 1.01
latency/import 3860507731 ns 3854422204 ns 1.00
latency/precompile 4626112507 ns 4627097887 ns 1.00
latency/ttfp 4507617572 ns 4514039137 ns 1.00
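
The Ratio column appears to be the Current time divided by the Previous time, rounded to two decimals, so values above 1.00 would mean the current commit ran slower. That reading is inferred from the numbers in the table, not stated in the report. A minimal sketch that recomputes a few rows under that assumption (the helper name `ratio` is made up for illustration):

```python
def ratio(current_ns: float, previous_ns: float) -> float:
    """Assumed definition of the Ratio column: current / previous."""
    return current_ns / previous_ns


# (current, previous) timings in nanoseconds, copied from the table above.
rows = {
    "array/random/rand!/Float32": (9065, 7960.333333333333),
    "cuda/synchronization/context/blocking": (915.4411764705883, 962.5652173913044),
    "latency/import": (3860507731, 3854422204),
}

for name, (current, previous) in rows.items():
    print(f"{name}: {ratio(current, previous):.2f}")
```

Under this assumption the three rows reproduce the reported ratios of 1.14, 0.95, and 1.00.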

This comment was automatically generated by workflow using github-action-benchmark.
