Local CUDA settings not propagated to Pkg.test #2545

@JBlaschke

I am helping a NERSC user develop a package on Perlmutter that depends on CUDA. We have found that the Pkg.test() environment does not pick up the project-wide CUDA configuration.

Background

At NERSC we set JULIA_LOAD_PATH to :/global/common/software/nersc/n9/julia/environments/1.10.4/gnu (or similar). That environment contains the following LocalPreferences.toml:

# MPI stuff omitted for brevity

[CUDA_Runtime_jll]
local = "true"
version = "12.2"

This way we set the CUDA.jl runtime version globally on the system to match the version installed by the vendor:

 $ julia --project=@. -e "import CUDA; CUDA.versioninfo()"
CUDA runtime 12.2, local installation
CUDA driver 12.6
NVIDIA driver 535.216.1, originally for CUDA 12.2

CUDA libraries:
- CUBLAS: 12.2.1
- CURAND: 10.3.3
- CUFFT: 11.0.8
- CUSOLVER: 11.5.0
- CUSPARSE: 12.1.1
- CUPTI: 2023.2.0 (API 20.0.0)
- NVML: 12.0.0+535.216.1

Julia packages:
- CUDA: 5.4.3
- CUDA_Driver_jll: 0.9.2+0
- CUDA_Runtime_jll: 0.14.1+0
- CUDA_Runtime_Discovery: 0.3.5

Toolchain:
- Julia: 1.10.4
- LLVM: 15.0.7

Preferences:
- CUDA_Runtime_jll.version: 12.2
- CUDA_Runtime_jll.local: true

1 device:
  0: NVIDIA A100-PCIE-40GB (sm_80, 39.391 GiB / 40.000 GiB available)

The Problem

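The package the user is developing uses CUDA. If I add CUDA.versioninfo() to the unit tests and run them through Pkg.test(), CUDA.jl reports runtime 12.5 from an artifact installation instead of the local 12.2 runtime, and the Preferences section is missing from the output: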

$ julia --project=@. -e "import Pkg; Pkg.test()"
# Temporary env setup omitted for brevity
     Testing Running tests...
┌ Warning: CUDA runtime library `libcublasLt.so.12` was loaded from a system path, `/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/math_libs/12.2/lib64/libcublasLt.so.12`.
│
│ This may cause errors. Ensure that you have not set the LD_LIBRARY_PATH
│ environment variable, or that it does not contain paths to CUDA libraries.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/Tl08O/src/initialization.jl:219
┌ Warning: CUDA runtime library `libnvJitLink.so.12` was loaded from a system path, `/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/lib64/libnvJitLink.so.12`.
│
│ This may cause errors. Ensure that you have not set the LD_LIBRARY_PATH
│ environment variable, or that it does not contain paths to CUDA libraries.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/Tl08O/src/initialization.jl:219
┌ Warning: CUDA runtime library `libcusparse.so.12` was loaded from a system path, `/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/math_libs/12.2/lib64/libcusparse.so.12`.
│
│ This may cause errors. Ensure that you have not set the LD_LIBRARY_PATH
│ environment variable, or that it does not contain paths to CUDA libraries.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/Tl08O/src/initialization.jl:219
CUDA runtime 12.5, artifact installation
CUDA driver 12.6
NVIDIA driver 535.216.1, originally for CUDA 12.2

CUDA libraries:
- CUBLAS: 12.2.1
- CURAND: 10.3.6
- CUFFT: 11.2.3
- CUSOLVER: 11.6.3
- CUSPARSE: 12.5.1
- CUPTI: 2024.2.1 (API 23.0.0)
- NVML: 12.0.0+535.216.1

Julia packages:
- CUDA: 5.4.3
- CUDA_Driver_jll: 0.9.2+0
- CUDA_Runtime_jll: 0.14.1+0

Toolchain:
- Julia: 1.10.4
- LLVM: 15.0.7

1 device:
  0: NVIDIA A100-PCIE-40GB (sm_80, 39.391 GiB / 40.000 GiB available)

I then tried adding the LocalPreferences.toml to the Pkg.test environment, and also tried adding:

[preferences.CUDA_Runtime_jll]
local = "true"
version = "12.2"

to the test's Project.toml. Neither worked.
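To narrow down where the preferences get lost, it may help to print what the test process actually sees. The following is a minimal diagnostic sketch, not a fix: it assumes the cause is that the test subprocess runs with a different load path than the interactive session, which this report does not confirm. `preference_files` is a hypothetical helper introduced here for illustration; the other calls are standard Julia Base functions.

```julia
# Diagnostic sketch: paste at the top of test/runtests.jl, then run the same
# code with `julia --project=@.` and compare the two outputs.
# `preference_files` is a hypothetical helper, not part of CUDA.jl or Pkg.

# Return every (Julia)LocalPreferences.toml located next to an entry of `load_path`.
function preference_files(load_path = Base.load_path())
    found = String[]
    for entry in load_path
        # Entries can be project files or directories; inspect the containing directory.
        dir = isdir(entry) ? entry : dirname(entry)
        for name in ("JuliaLocalPreferences.toml", "LocalPreferences.toml")
            file = joinpath(dir, name)
            isfile(file) && push!(found, file)
        end
    end
    return found
end

println("JULIA_LOAD_PATH = ", get(ENV, "JULIA_LOAD_PATH", "(unset)"))
println("Active project  = ", Base.active_project())
println("Load path:")
foreach(p -> println("  ", p), Base.load_path())
println("Preference files found on the load path:")
foreach(p -> println("  ", p), preference_files())
```

If the NERSC environment directory is absent from the load path printed inside Pkg.test(), that would suggest the global LocalPreferences.toml is simply not visible to the test process.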

How can I either force CUDA.jl to use the system-wide preferences, or tell a unit test that relies on CUDA.jl to use a specific runtime version?

I was going to post this in the official Pkg.jl repo, but wanted to get @maleadt's opinion first.

Labels: bug