Summary
Add support for distributing benchmark workloads across multiple GPUs, splitting batches and aggregating results to demonstrate multi-GPU scaling.
Motivation
Many workstations and servers have multiple GPUs installed. The current implementation targets a single accelerator at a time. Multi-GPU fan-out would demonstrate near-linear scaling for embarrassingly parallel workloads and provide a realistic picture of what production GPU-accelerated systems look like. This is especially relevant for domains like financial simulation and AI inference where multi-GPU setups are common.
Acceptance Criteria
- Detect all available GPUs at startup and list them (name, memory, compute capability)
- Add a `--multi-gpu` CLI flag that distributes work across all available GPUs of the same type
- Split each batch evenly across available GPUs, execute in parallel, and merge results
- Report per-GPU and aggregate throughput metrics
- Report scaling efficiency (e.g. 2 GPUs achieving a 1.9× speedup = 95% efficiency)
- Gracefully handle systems with only one GPU (run normally with a note)
- Handle heterogeneous GPU configurations (different models) with appropriate warnings
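The scaling-efficiency metric above is just speedup divided by GPU count. Since the project itself is C#/ILGPU, the following is only a language-neutral sketch in Python of that calculation; the function name and signature are illustrative, not part of the existing codebase.

```python
def scaling_efficiency(single_gpu_throughput: float,
                       multi_gpu_throughput: float,
                       gpu_count: int) -> float:
    """Fraction of ideal linear scaling actually achieved.

    Example from the acceptance criteria: 2 GPUs reaching 1.9x the
    single-GPU throughput gives 1.9 / 2 = 0.95, i.e. 95% efficiency.
    """
    speedup = multi_gpu_throughput / single_gpu_throughput
    return speedup / gpu_count
```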
Technical Notes
- ILGPU's `Context` can enumerate multiple devices; use this for discovery
- Each GPU will need its own `Accelerator` instance and memory buffers
- Synchronisation between GPUs happens on the host side (no direct GPU-to-GPU transfers needed for this use case)
- Consider using `Task.WhenAll` or similar for parallel dispatch across GPUs
- Batch splitting should account for uneven division (last GPU gets remainder)