Thank you for providing such a great tool. Could you also publish reference results for various GPU architectures? We are not sure whether our GPUs are configured optimally, and I sometimes find it difficult to reproduce the benchmarks published by NVIDIA.
I ran the device_to_device_memcpy_read_ce test on 6 A6000 devices and got the following result:
nvbandwidth Version: v0.8
Built from Git version: v0.8
CUDA Runtime Version: 13000
CUDA Driver Version: 13000
Driver Version: 580.95.05
...
memcpy CE GPU(row) -> GPU(column) bandwidth (GB/s)
          0         1         2         3         4         5
0       N/A     26.34     26.35     21.95     22.64     21.80
1     26.34       N/A     26.34     21.86     22.62     22.63
2     26.34     26.35       N/A     22.36     21.87     22.17
3     21.82     21.82     22.26       N/A     26.34     26.35
4     22.64     22.64     21.81     26.35       N/A     26.34
5     21.86     22.67     22.01     26.34     26.35       N/A
SUM device_to_device_memcpy_read_ce 715.53
Does this look about right to you?
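To make the pattern in the matrix above easier to see, here is a small Python sketch that splits the measured values into "same group" and "cross group" pairs. The grouping of GPUs 0-2 and 3-5 is only an assumption inferred from the numbers themselves, not something confirmed by the system topology, and the helper names (`group_of`, `summarize`) are hypothetical.

```python
from statistics import mean

# Bandwidth matrix copied from the nvbandwidth output above (GB/s).
# None marks the N/A diagonal entries.
matrix = [
    [None, 26.34, 26.35, 21.95, 22.64, 21.80],
    [26.34, None, 26.34, 21.86, 22.62, 22.63],
    [26.34, 26.35, None, 22.36, 21.87, 22.17],
    [21.82, 21.82, 22.26, None, 26.34, 26.35],
    [22.64, 22.64, 21.81, 26.35, None, 26.34],
    [21.86, 22.67, 22.01, 26.34, 26.35, None],
]

# Assumption: two groups of three GPUs, guessed from the block pattern
# in the measurements. The real topology may differ.
groups = [{0, 1, 2}, {3, 4, 5}]


def group_of(gpu, groups):
    """Return the index of the group containing this GPU."""
    for index, members in enumerate(groups):
        if gpu in members:
            return index
    raise ValueError(f"GPU {gpu} is not in any group")


def summarize(matrix, groups):
    """Split measured pairs into intra-group and cross-group bandwidths."""
    intra, cross = [], []
    for src, row in enumerate(matrix):
        for dst, bandwidth in enumerate(row):
            if bandwidth is None:
                continue  # skip the N/A diagonal
            same = group_of(src, groups) == group_of(dst, groups)
            (intra if same else cross).append(bandwidth)
    return {
        "intra_count": len(intra),
        "cross_count": len(cross),
        "intra_mean": mean(intra),
        "cross_mean": mean(cross),
        "cross_min": min(cross),
        "cross_max": max(cross),
        "total": sum(intra) + sum(cross),
    }


summary = summarize(matrix, groups)
for key, value in summary.items():
    print(f"{key}: {value:.2f}" if isinstance(value, float) else f"{key}: {value}")
```

With this grouping, the pairs inside each group average about 26.34 GB/s, while the cross-group pairs average about 22.19 GB/s (ranging from 21.80 to 22.67). The total of the printed values comes to about 715.56, slightly above the reported SUM of 715.53, presumably because the table entries are rounded to two decimals.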