Added Base.similar methods for CuSparseMatrixCOO and BSR #3114

Open
rainerrodrigues wants to merge 5 commits into JuliaGPU:main from rainerrodrigues:add-sparse-similar
Conversation

@rainerrodrigues
Contributor

This PR adds the missing `Base.similar` methods for `CuSparseMatrixCOO` and `CuSparseMatrixBSR`, so that calling `similar` on these types no longer falls back to converting them to dense CPU arrays.

Fixes #3061
Fixes #3055
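
For readers unfamiliar with what a sparse `similar` method usually does, here is a minimal CPU-only sketch. `ToyCOO` is a hypothetical stand-in defined only for this illustration; it is not the CUDA.jl type, and the methods below are not the ones in this PR. The sketch assumes one plausible semantic: keep the sparsity pattern and allocate uninitialized value storage, optionally with a new element type.

```julia
# Hypothetical COO container for illustration only (not CuSparseMatrixCOO).
struct ToyCOO{Tv,Ti<:Integer} <: AbstractMatrix{Tv}
    rowInd::Vector{Ti}      # row index of each stored entry
    colInd::Vector{Ti}      # column index of each stored entry
    nzVal::Vector{Tv}       # stored values
    dims::NTuple{2,Int}     # matrix dimensions
end

Base.size(A::ToyCOO) = A.dims

# Scalar lookup so the generic AbstractArray machinery (e.g. printing) works.
# Returns the first stored entry at (i, j), or zero if none is stored.
function Base.getindex(A::ToyCOO{Tv}, i::Integer, j::Integer) where {Tv}
    for k in eachindex(A.nzVal)
        (A.rowInd[k] == i && A.colInd[k] == j) && return A.nzVal[k]
    end
    return zero(Tv)
end

# Same element type: copy the index vectors, allocate fresh value storage.
Base.similar(A::ToyCOO{Tv,Ti}) where {Tv,Ti} =
    ToyCOO{Tv,Ti}(copy(A.rowInd), copy(A.colInd), similar(A.nzVal), A.dims)

# New element type T: same pattern, value storage of element type T.
Base.similar(A::ToyCOO{Tv,Ti}, ::Type{T}) where {Tv,Ti,T} =
    ToyCOO{T,Ti}(copy(A.rowInd), copy(A.colInd), similar(A.nzVal, T), A.dims)

A = ToyCOO([1, 2], [1, 3], [1.5, 2.5], (2, 3))
B = similar(A)            # stays sparse, same pattern, values uninitialized
C = similar(A, Float32)   # stays sparse, element type changed
```

The point of the sketch is that the result stays in the sparse container type instead of going through the generic `AbstractArray` fallback, which is the behavior the PR description refers to.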

Comment thread lib/cusparse/src/array.jl Outdated
@kshyatt
Member

kshyatt commented Apr 21, 2026

Also, can some tests be added?

Contributor

@github-actions github-actions Bot left a comment

CUDA.jl Benchmarks

Details
| Benchmark suite | Current: 54f12f6 | Previous: 0ad0204 | Ratio |
|---|---|---|---|
| array/accumulate/Float32/1d | 101051 ns | 101066 ns | 1.00 |
| array/accumulate/Float32/dims=1 | 76800 ns | 76791 ns | 1.00 |
| array/accumulate/Float32/dims=1L | 1585883 ns | 1585160.5 ns | 1.00 |
| array/accumulate/Float32/dims=2 | 144087 ns | 142797.5 ns | 1.01 |
| array/accumulate/Float32/dims=2L | 658505 ns | 657399 ns | 1.00 |
| array/accumulate/Int64/1d | 119114 ns | 118248.5 ns | 1.01 |
| array/accumulate/Int64/dims=1 | 80706 ns | 79631 ns | 1.01 |
| array/accumulate/Int64/dims=1L | 1695925.5 ns | 1694201.5 ns | 1.00 |
| array/accumulate/Int64/dims=2 | 156668 ns | 155693 ns | 1.01 |
| array/accumulate/Int64/dims=2L | 962510 ns | 961345 ns | 1.00 |
| array/broadcast | 20728 ns | 20364 ns | 1.02 |
| array/construct | 1274.7 ns | 1262.1 ns | 1.01 |
| array/copy | 18167 ns | 17799 ns | 1.02 |
| array/copyto!/cpu_to_gpu | 216068 ns | 213023 ns | 1.01 |
| array/copyto!/gpu_to_cpu | 283881 ns | 281078 ns | 1.01 |
| array/copyto!/gpu_to_gpu | 10895 ns | 10549.833333333332 ns | 1.03 |
| array/iteration/findall/bool | 135031 ns | 133910 ns | 1.01 |
| array/iteration/findall/int | 148800 ns | 148076 ns | 1.00 |
| array/iteration/findfirst/bool | 81943 ns | 80909 ns | 1.01 |
| array/iteration/findfirst/int | 83874 ns | 82938 ns | 1.01 |
| array/iteration/findmin/1d | 85714 ns | 82262 ns | 1.04 |
| array/iteration/findmin/2d | 114989 ns | 113580 ns | 1.01 |
| array/iteration/logical | 200307.5 ns | 199223 ns | 1.01 |
| array/iteration/scalar | 66077.5 ns | 67075 ns | 0.99 |
| array/permutedims/2d | 52804.5 ns | 51959 ns | 1.02 |
| array/permutedims/3d | 53100 ns | 52150 ns | 1.02 |
| array/permutedims/4d | 51570 ns | 51448.5 ns | 1.00 |
| array/random/rand/Float32 | 13073 ns | 12962 ns | 1.01 |
| array/random/rand/Int64 | 24669 ns | 24057 ns | 1.03 |
| array/random/rand!/Float32 | 8569.5 ns | 9735.5 ns | 0.88 |
| array/random/rand!/Int64 | 21401 ns | 21218 ns | 1.01 |
| array/random/randn/Float32 | 37534.5 ns | 43055 ns | 0.87 |
| array/random/randn!/Float32 | 30935 ns | 28025 ns | 1.10 |
| array/reductions/mapreduce/Float32/1d | 35251 ns | 33732 ns | 1.05 |
| array/reductions/mapreduce/Float32/dims=1 | 39956 ns | 49005 ns | 0.82 |
| array/reductions/mapreduce/Float32/dims=1L | 51342 ns | 51002 ns | 1.01 |
| array/reductions/mapreduce/Float32/dims=2 | 56407 ns | 57573.5 ns | 0.98 |
| array/reductions/mapreduce/Float32/dims=2L | 69581 ns | 66928.5 ns | 1.04 |
| array/reductions/mapreduce/Int64/1d | 43074 ns | 42242 ns | 1.02 |
| array/reductions/mapreduce/Int64/dims=1 | 44013 ns | 48347 ns | 0.91 |
| array/reductions/mapreduce/Int64/dims=1L | 87302 ns | 86835 ns | 1.01 |
| array/reductions/mapreduce/Int64/dims=2 | 59675 ns | 60601 ns | 0.98 |
| array/reductions/mapreduce/Int64/dims=2L | 84777 ns | 83761 ns | 1.01 |
| array/reductions/reduce/Float32/1d | 35039 ns | 34140.5 ns | 1.03 |
| array/reductions/reduce/Float32/dims=1 | 40196 ns | 39851 ns | 1.01 |
| array/reductions/reduce/Float32/dims=1L | 51440 ns | 51248 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2 | 56612 ns | 58473 ns | 0.97 |
| array/reductions/reduce/Float32/dims=2L | 70037 ns | 67693 ns | 1.03 |
| array/reductions/reduce/Int64/1d | 42966 ns | 42152 ns | 1.02 |
| array/reductions/reduce/Int64/dims=1 | 42138.5 ns | 43607.5 ns | 0.97 |
| array/reductions/reduce/Int64/dims=1L | 87276 ns | 86832 ns | 1.01 |
| array/reductions/reduce/Int64/dims=2 | 59434 ns | 60330 ns | 0.99 |
| array/reductions/reduce/Int64/dims=2L | 84738 ns | 83394 ns | 1.02 |
| array/reverse/1d | 18008.5 ns | 17656 ns | 1.02 |
| array/reverse/1dL | 68718 ns | 68244 ns | 1.01 |
| array/reverse/1dL_inplace | 65781 ns | 65690 ns | 1.00 |
| array/reverse/1d_inplace | 10294.666666666666 ns | 8405.333333333334 ns | 1.22 |
| array/reverse/2d | 20925 ns | 20682 ns | 1.01 |
| array/reverse/2dL | 73178 ns | 72823 ns | 1.00 |
| array/reverse/2dL_inplace | 65875 ns | 65656 ns | 1.00 |
| array/reverse/2d_inplace | 10734 ns | 9850 ns | 1.09 |
| array/sorting/1d | 2735825 ns | 2735008 ns | 1.00 |
| array/sorting/2d | 1076226 ns | 1068540 ns | 1.01 |
| array/sorting/by | 3327795 ns | 3304170 ns | 1.01 |
| cuda/synchronization/context/auto | 1183.8 ns | 1131.2 ns | 1.05 |
| cuda/synchronization/context/blocking | 957.2058823529412 ns | 882.3673469387755 ns | 1.08 |
| cuda/synchronization/context/nonblocking | 7488.299999999999 ns | 8698.3 ns | 0.86 |
| cuda/synchronization/stream/auto | 1031.3 ns | 994.6 ns | 1.04 |
| cuda/synchronization/stream/blocking | 839.1375 ns | 825.1954022988506 ns | 1.02 |
| cuda/synchronization/stream/nonblocking | 8112.8 ns | 7316.4 ns | 1.11 |
| integration/byval/reference | 143931 ns | 143767 ns | 1.00 |
| integration/byval/slices=1 | 146072.5 ns | 145850 ns | 1.00 |
| integration/byval/slices=2 | 284794 ns | 284678 ns | 1.00 |
| integration/byval/slices=3 | 423418 ns | 423481 ns | 1.00 |
| integration/cudadevrt | 102495 ns | 102411.5 ns | 1.00 |
| integration/volumerhs | 23430349 ns | 23457202 ns | 1.00 |
| kernel/indexing | 13301 ns | 13137 ns | 1.01 |
| kernel/indexing_checked | 13955 ns | 13819 ns | 1.01 |
| kernel/launch | 2313.4444444444443 ns | 2083.777777777778 ns | 1.11 |
| kernel/occupancy | 745.082191780822 ns | 668.00625 ns | 1.12 |
| kernel/rand | 14310 ns | 14198 ns | 1.01 |
| latency/import | 3836824344.5 ns | 3845206986.5 ns | 1.00 |
| latency/precompile | 4634396037 ns | 4633948752 ns | 1.00 |
| latency/ttfp | 4409220668.5 ns | 4453935633 ns | 0.99 |

This comment was automatically generated by workflow using github-action-benchmark.

Comment thread lib/cusparse/src/array.jl
Contributor Author

@rainerrodrigues rainerrodrigues left a comment

@kshyatt Hi, can you check if this is suitable and extensive enough for testing?

@maleadt
Member

maleadt commented May 14, 2026

Same as #3119, you seem to have many unrelated changes in here that cause CI failures.

Comment thread lib/cusparse/src/array.jl
# Julia's `sparse()` constructor and SciPy/CuPy. For Bool we OR instead of sum,
# also matching `sparse()`, since Bool + Bool doesn't stay Bool.
sum_duplicate(a, b) = a + b
sum_duplicate(a::Bool, b::Bool) = a | b
Member

More unrelated stuff... Please rebase that out.
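
As context for the `sum_duplicate` lines quoted in this thread, the following CPU-only sketch shows why a `Bool` method is needed: in Julia, `true + true` promotes to `Int`, whereas the `SparseArrays.sparse` constructor combines duplicate `Bool` entries with `|`. The two `sum_duplicate` definitions are copied from the quoted diff; everything else is illustrative and runs without a GPU.

```julia
using SparseArrays

# Definitions as quoted in the review thread.
sum_duplicate(a, b) = a + b
sum_duplicate(a::Bool, b::Bool) = a | b

# Plain addition of Bools promotes to Int, so the result is no longer a Bool.
plain = true + true

# The Bool method keeps the element type.
combined = sum_duplicate(true, true)

# sparse() on the CPU: duplicates at (1, 1) are combined with | for Bool
# values and with + for other element types.
A = sparse([1, 1], [1, 1], [true, true], 2, 2)
B = sparse([1, 1], [1, 1], [1.0, 2.0], 2, 2)
```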

Development

Successfully merging this pull request may close these issues.

Missing sparse array methods for CuSparseMatrixCOO and CuSparseMatrixBSR
[CUSPARSE] Missing appropriate similar methods

3 participants