[ROCm] cmake: support PYTORCH_FOUND_HIP for torch 2.13 native HIP language support #43881

Merged
ywang96 merged 11 commits into vllm-project:main from nemanjaudovic:fix-hip-found-torch213 on May 30, 2026


@nemanjaudovic
Contributor

@nemanjaudovic nemanjaudovic commented May 28, 2026

torch 2.13 switched from the legacy FindHIP.cmake module to CMake's native enable_language(HIP) in commit pytorch/pytorch@d921fd0. The new approach sets PYTORCH_FOUND_HIP but never sets HIP_FOUND, which causes vllm's CMakeLists.txt to fall through to the fatal error path.

Fix the HIP detection condition in vllm's CMakeLists.txt to also accept PYTORCH_FOUND_HIP.

More details about the problem

Building vllm with PyTorch 2.13 on ROCm fails with:

CMake Error at CMakeLists.txt:169 (message):
  Can't find CUDA or HIP installation.

This happens even when HIP/ROCm is correctly installed and torch itself was built with ROCm support.

Root Cause

PyTorch 2.13 migrated from the legacy FindHIP.cmake module to CMake's native HIP language support (enable_language(HIP)) in pytorch/pytorch@d921fd0. The motivation was Windows support, where FindHIP.cmake's bash-based linker helper doesn't work.

The old LoadHIP.cmake (torch ≤2.12) called find_package(HIP MODULE), which ran FindHIP.cmake and set HIP_FOUND=TRUE as a side effect of CMake's find_package convention. The new LoadHIP.cmake (torch 2.13) bypasses FindHIP.cmake entirely and sets only PYTORCH_FOUND_HIP=TRUE.

vllm's CMakeLists.txt checked only HIP_FOUND to detect HIP. Because torch 2.13 never sets that variable, the check fell through to the fatal error on line 169.
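
To confirm which detection variables a given torch version actually sets, a small diagnostic can be dropped into a CMakeLists.txt after Torch's CMake package has been loaded. This is only a sketch for local debugging and is not part of this PR; the exact placement depends on where the project loads Torch.

```cmake
# Diagnostic sketch (not part of the PR): print the state of the
# GPU-detection variables discussed above. Place it after the point
# where the project loads Torch's CMake package.
foreach(var HIP_FOUND PYTORCH_FOUND_HIP CUDA_FOUND)
  if(DEFINED ${var})
    message(STATUS "${var} = ${${var}}")
  else()
    message(STATUS "${var} is not defined")
  endif()
endforeach()
```

Based on the description above, a ROCm build against torch 2.13 should report PYTORCH_FOUND_HIP as true while HIP_FOUND is unset or false.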

Fix

Extend the HIP detection condition to also accept PYTORCH_FOUND_HIP, which is set by both torch 2.12 and 2.13:

# Before
if (NOT HIP_FOUND AND CUDA_FOUND)
...
elseif(HIP_FOUND)

# After
if (NOT HIP_FOUND AND NOT PYTORCH_FOUND_HIP AND CUDA_FOUND)
...
elseif(HIP_FOUND OR PYTORCH_FOUND_HIP)

After this fix vllm builds correctly.
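
The branch logic can be exercised in isolation with a small CMake script. The following is a minimal sketch, assuming the three variables are passed on the command line; the helper variable VLLM_HIP_DETECTED and the file name detect_gpu.cmake are hypothetical, and the STATUS messages stand in for the real branch bodies that the snippet above elides.

```cmake
# detect_gpu.cmake (hypothetical file name): standalone sketch of the
# detection logic after the fix. VLLM_HIP_DETECTED is an illustrative
# helper variable and does not appear in the PR.
if(HIP_FOUND OR PYTORCH_FOUND_HIP)
  # Either the legacy FindHIP.cmake path or the torch 2.13 path found HIP.
  set(VLLM_HIP_DETECTED TRUE)
else()
  set(VLLM_HIP_DETECTED FALSE)
endif()

if(NOT VLLM_HIP_DETECTED AND CUDA_FOUND)
  message(STATUS "Selected GPU backend: CUDA")
elseif(VLLM_HIP_DETECTED)
  message(STATUS "Selected GPU backend: HIP")
else()
  message(FATAL_ERROR "Can't find CUDA or HIP installation.")
endif()
```

Running it in script mode, for example cmake -DPYTORCH_FOUND_HIP=TRUE -P detect_gpu.cmake, takes the HIP branch even though HIP_FOUND is never defined, which is the torch 2.13 situation described above. Running it with none of the variables defined reaches the fatal error.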

@mergify
Contributor

mergify Bot commented May 28, 2026

Hi @nemanjaudovic, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?
mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

…upport

torch 2.13 switched from FindHIP.cmake (legacy) to CMake native
enable_language(HIP) in commit pytorch/pytorch@d921fd0.
The new approach sets PYTORCH_FOUND_HIP but never sets HIP_FOUND, which
caused vllm's CMakeLists.txt to fall through to the fatal error path.

Fix the HIP detection condition in vllm's CMakeLists.txt to also accept PYTORCH_FOUND_HIP.

Signed-off-by: nemanjaudovic <nudovic@amd.com>
@nemanjaudovic nemanjaudovic force-pushed the fix-hip-found-torch213 branch from afb2ff1 to 7542584 Compare May 28, 2026 14:23
Member

@Harry-Chen Harry-Chen left a comment


Thanks!

@Harry-Chen Harry-Chen added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) May 28, 2026
@Harry-Chen
Member

CC @tjtanaa do you want to take a look?

@Harry-Chen Harry-Chen enabled auto-merge (squash) May 29, 2026 06:33
Member

@tjtanaa tjtanaa left a comment


Looks fine to me, given that the code compiles in AMD CI.

@nemanjaudovic
Contributor Author

It seems that CI is getting stuck on some unrelated tests.

@ywang96 ywang96 disabled auto-merge May 30, 2026 05:16
@ywang96 ywang96 merged commit c0056b1 into vllm-project:main May 30, 2026
167 of 179 checks passed
@github-project-automation github-project-automation Bot moved this from Todo to Done in AMD May 30, 2026

Labels

ci/build
ready (ONLY add when PR is ready to merge/full CI is needed)
rocm (Related to AMD ROCm)

Projects

Status: Done


4 participants