
feat: Add AMD ROCm GPU support alongside CUDA #1

Open
alexhegit wants to merge 1 commit into NVlabs:main from alexhegit:main

@alexhegit

Summary

Add explicit AMD ROCm 7.2 GPU support via uv extras while preserving the existing CUDA workflow.

Changes

  • pyproject.toml: Add a rocm extra with ROCm 7.2 index routing; declare the rocm and end2end extras as conflicting (mutually exclusive); remove torch-backend=auto in favor of explicit [tool.uv.sources] index routing; relax the torch upper bound to allow torch 2.12+ ROCm wheels
  • AGENTS.md: New file with ROCm support documentation

Verified on AMD Radeon PRO W7900 (48GB VRAM) + ROCm 7.2

| Test | Result |
|---|---|
| torch==2.12.0+rocm7.2 GPU detection | Pass (detects W7900 correctly) |
| ptv3_vanilla forward on GPU | Pass (output shape [2, 512]) |
| Point cloud outlier removal on GPU | Pass |
| 93/96 pytest tests | Pass (3 skipped = integration) |
| GPU memory reporting | Pass (shows 44.98 GiB) |
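
A check along the lines of the GPU detection and memory reporting rows might look like the sketch below. It is not the script used for the results above. It relies on the fact that ROCm builds of PyTorch expose the GPU through the `torch.cuda` API and set `torch.version.hip`; the helper names `classify_backend` and `bytes_to_gib` are illustrative.

```python
def classify_backend(hip_version, cuda_version, gpu_available):
    """Classify a PyTorch build from its version attributes (illustrative helper)."""
    if not gpu_available:
        return "cpu"
    if hip_version is not None:
        return "rocm"  # ROCm builds set torch.version.hip
    if cuda_version is not None:
        return "cuda"
    return "unknown"


def bytes_to_gib(n_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


try:
    import torch
except ImportError:  # keep the sketch importable on machines without PyTorch
    torch = None

if torch is not None:
    backend = classify_backend(
        getattr(torch.version, "hip", None),
        torch.version.cuda,
        torch.cuda.is_available(),
    )
    print("backend:", backend)
    if backend != "cpu":
        # On ROCm the device is still queried through the torch.cuda namespace.
        props = torch.cuda.get_device_properties(0)
        print(f"{props.name}: {bytes_to_gib(props.total_memory):.2f} GiB")
```

The memory figure is printed in GiB, so it will not match a nominal capacity quoted in GB digit for digit.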

What Works on ROCm

| Feature | Command |
|---|---|
| Base inference (point clouds, meshes, scenes) | `uv sync --extra rocm` |
| All demo scripts | `uv run python scripts/demo_*.py` |
| ZMQ server/client | `uv sync --extra rocm --extra serve` |
| Gripper config wizard | `uv run python scripts/gripper_config_wizard.py` |
| Tests | `uv run pytest -m "not end2end"` |

What Does NOT Work on ROCm

| Feature | Blocker |
|---|---|
| End-to-end pipeline | nvidia-curobo, newton, warp are CUDA-only |
| Docker training | NGC base image is NVIDIA-only |
| pointnet2_ops (PointNet++) | Gracefully degrades; use the default ptv3_vanilla backbone |
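
The graceful degradation noted for pointnet2_ops follows a common optional-dependency pattern, sketched below. This is an assumption about the general shape, not the repository's actual code: the function names and the backbone key `"pointnet2"` are hypothetical, and the real project may only emit a warning rather than switch backbones automatically.

```python
import importlib.util
import warnings

DEFAULT_BACKBONE = "ptv3_vanilla"


def module_available(name: str) -> bool:
    """Report whether a top-level module can be imported, without importing it."""
    return importlib.util.find_spec(name) is not None


def select_backbone(requested: str, has_pointnet2_ops: bool) -> str:
    """Fall back to the default backbone when the CUDA-only extension is missing.

    "pointnet2" is a hypothetical backbone key used only for this sketch.
    """
    if requested == "pointnet2" and not has_pointnet2_ops:
        warnings.warn(
            "pointnet2_ops is not installed; falling back to ptv3_vanilla",
            RuntimeWarning,
        )
        return DEFAULT_BACKBONE
    return requested


backbone = select_backbone("pointnet2", module_available("pointnet2_ops"))
print("using backbone:", backbone)
```

Probing with `find_spec` avoids importing the extension at module load time, so a missing build does not raise before the fallback can be applied.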

Usage

# CUDA (unchanged)
uv sync

# AMD ROCm
uv sync --extra rocm

Conflicts

The rocm and end2end extras are mutually exclusive — uv enforces this automatically.
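
uv performs this check itself during resolution, so nothing below is needed in practice. The sketch only illustrates the rule being enforced; the `CONFLICTS` list and the helper name are illustrative and do not reflect uv's internals.

```python
# Hypothetical conflict declaration mirroring the rule stated above.
CONFLICTS = [{"rocm", "end2end"}]


def violated_conflicts(requested, conflicts=CONFLICTS):
    """Return each conflict group with two or more members among the requested extras."""
    requested = set(requested)
    return [
        sorted(group & requested)
        for group in conflicts
        if len(group & requested) >= 2
    ]


print(violated_conflicts(["rocm", "serve"]))    # → []
print(violated_conflicts(["rocm", "end2end"]))  # → [['end2end', 'rocm']]
```

Combining rocm with serve passes because serve is not part of any declared conflict group.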

Add explicit ROCm 7.2 support via uv extras while preserving the existing
CUDA workflow (uv sync picks up PyPI CUDA wheels on Linux).

Changes:
- pyproject.toml: Add rocm extra with ROCm 7.2 index routing, conflict
  resolution (rocm + end2end are mutually exclusive), remove torch-backend=auto
  in favor of explicit [tool.uv.sources] index routing, relax torch upper
  bound to allow torch 2.12+ ROCm wheels
- AGENTS.md: Add ROCm support documentation with what works / what doesn't

Verified on AMD Radeon PRO W7900 (48GB VRAM) + ROCm 7.2:
- torch==2.12.0+rocm7.2 detects GPU correctly
- ptv3_vanilla backbone forward pass runs on GPU
- 93/96 tests pass (3 skipped = integration tests needing live server)
- pointnet2_ops gracefully degrades (warning only)
- All demo scripts, ZMQ server/client, gripper wizard work

What works on ROCm:
- Base inference (point clouds, meshes, scenes)
- All demo scripts (demo_object_pc, demo_scene_pc, demo_object_mesh)
- ZMQ server/client (uv sync --extra rocm --extra serve)
- Gripper config wizard (browser GUI)
- Tests (uv run pytest -m "not end2end")

What does NOT work on ROCm:
- End-to-end pipeline (nvidia-curobo, newton, warp are CUDA-only)
- Docker training (NGC base image is NVIDIA-only)
- pointnet2_ops PointNet++ backbone (gracefully degrades, use ptv3_vanilla)

Usage:
  uv sync              # CUDA (unchanged)
  uv sync --extra rocm # AMD ROCm

Signed-off-by: AMD <amd@amd.com>
Signed-off-by: AlexHe99 <alehe@amd.com>