TensorFlow with CUDA: RTX 5xxx series isn't supported (CUDA_ERROR_INVALID_HANDLE) #16

@jayjayhust

CUDA/cuDNN version

12.9

GPU model and memory

RTX 5070 Ti (16 GB)

Command

uv run playground/open_duck_mini_v2/runner.py 

Relevant log output

(mujoco) jay@USER-20250603TE:~/Open_Duck_Playground$ uv run playground/open_duck_mini_v2/runner.py
2025-06-17 09:03:38.354798: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1750122218.405515    2972 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1750122218.420688    2972 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1750122218.514032    2972 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1750122218.514061    2972 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1750122218.514063    2972 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1750122218.514064    2972 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
xml: /home/jay/Open_Duck_Playground/playground/open_duck_mini_v2/xmls/scene_flat_terrain.xml
actuators: ['left_hip_yaw', 'left_hip_roll', 'left_hip_pitch', 'left_knee', 'left_ankle', 'neck_pitch', 'head_pitch', 'head_yaw', 'head_roll', 'right_hip_yaw', 'right_hip_roll', 'right_hip_pitch', 'right_knee', 'right_ankle']
joints: ['floating_base', 'left_hip_yaw', 'left_hip_roll', 'left_hip_pitch', 'left_knee', 'left_ankle', 'neck_pitch', 'head_pitch', 'head_yaw', 'head_roll', 'right_hip_yaw', 'right_hip_roll', 'right_hip_pitch', 'right_knee', 'right_ankle']
backlash joints: []
actuator joints ids: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
actuator joints dict: {'left_hip_yaw': 1, 'left_hip_roll': 2, 'left_hip_pitch': 3, 'left_knee': 4, 'left_ankle': 5, 'neck_pitch': 6, 'head_pitch': 7, 'head_yaw': 8, 'head_roll': 9, 'right_hip_yaw': 10, 'right_hip_roll': 11, 'right_hip_pitch': 12, 'right_knee': 13, 'right_ankle': 14}
floating qpos addr: 0 qvel addr: 0
[Poly ref data] Processing ...
[Poly ref data] Done processing
xml: /home/jay/Open_Duck_Playground/playground/open_duck_mini_v2/xmls/scene_flat_terrain.xml
actuators: ['left_hip_yaw', 'left_hip_roll', 'left_hip_pitch', 'left_knee', 'left_ankle', 'neck_pitch', 'head_pitch', 'head_yaw', 'head_roll', 'right_hip_yaw', 'right_hip_roll', 'right_hip_pitch', 'right_knee', 'right_ankle']
joints: ['floating_base', 'left_hip_yaw', 'left_hip_roll', 'left_hip_pitch', 'left_knee', 'left_ankle', 'neck_pitch', 'head_pitch', 'head_yaw', 'head_roll', 'right_hip_yaw', 'right_hip_roll', 'right_hip_pitch', 'right_knee', 'right_ankle']
backlash joints: []
actuator joints ids: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
actuator joints dict: {'left_hip_yaw': 1, 'left_hip_roll': 2, 'left_hip_pitch': 3, 'left_knee': 4, 'left_ankle': 5, 'neck_pitch': 6, 'head_pitch': 7, 'head_yaw': 8, 'head_roll': 9, 'right_hip_yaw': 10, 'right_hip_roll': 11, 'right_hip_pitch': 12, 'right_knee': 13, 'right_ankle': 14}
floating qpos addr: 0 qvel addr: 0
[Poly ref data] Processing ...
[Poly ref data] Done processing
Observation size: 101
PPO params: {'action_repeat': 1, 'batch_size': 256, 'clipping_epsilon': 0.2, 'discounting': 0.97, 'entropy_cost': 0.005, 'episode_length': 1000, 'learning_rate': 0.0003, 'max_grad_norm': 1.0, 'normalize_observations': True, 'num_envs': 8192, 'num_evals': 15, 'num_minibatches': 32, 'num_resets_per_eval': 1, 'num_timesteps': 150000000, 'num_updates_per_batch': 4, 'reward_scaling': 1.0, 'unroll_length': 20}
/home/jay/Open_Duck_Playground/.venv/lib/python3.11/site-packages/jax/_src/interpreters/xla.py:112: RuntimeWarning: overflow encountered in cast
  return np.asarray(x, dtypes.canonicalize_dtype(x.dtype))
-----------
STEP: 0 reward: 14.808151245117188 reward_std: 12.227242469787598
-----------
Saving checkpoint (step: 0): /home/jay/Open_Duck_Playground/checkpoints/2025_06_17_090411_0
 === EXPORT ONNX ===
W0000 00:00:1750122251.239234    2972 gpu_device.cc:2430] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
W0000 00:00:1750122251.241327    2972 gpu_device.cc:2430] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
I0000 00:00:1750122251.241767    2972 gpu_device.cc:2019] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 708 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 5070 Ti, pci bus id: 0000:01:00.0, compute capability: 12.0
(101,) (101,)
2025-06-17 09:04:11.336528: W tensorflow/compiler/mlir/tools/kernel_gen/tf_gpu_runtime_wrappers.cc:40] 'cuModuleLoadData(&module, data)' failed with 'CUDA_ERROR_INVALID_PTX'

2025-06-17 09:04:11.336557: W tensorflow/compiler/mlir/tools/kernel_gen/tf_gpu_runtime_wrappers.cc:40] 'cuModuleGetFunction(&function, module, kernel_name)' failed with 'CUDA_ERROR_INVALID_HANDLE'

2025-06-17 09:04:11.336740: W tensorflow/core/framework/op_kernel.cc:1844] INTERNAL: 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast<CUstream>(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE'
(101,) (101,)
2025-06-17 09:04:11.340111: W tensorflow/compiler/mlir/tools/kernel_gen/tf_gpu_runtime_wrappers.cc:40] 'cuModuleLoadData(&module, data)' failed with 'CUDA_ERROR_INVALID_PTX'

2025-06-17 09:04:11.340135: W tensorflow/compiler/mlir/tools/kernel_gen/tf_gpu_runtime_wrappers.cc:40] 'cuModuleGetFunction(&function, module, kernel_name)' failed with 'CUDA_ERROR_INVALID_HANDLE'

2025-06-17 09:04:11.340151: W tensorflow/core/framework/op_kernel.cc:1844] INTERNAL: 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast<CUstream>(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE'
Traceback (most recent call last):
  File "/home/jay/Open_Duck_Playground/playground/open_duck_mini_v2/runner.py", line 64, in <module>
    main()
  File "/home/jay/Open_Duck_Playground/playground/open_duck_mini_v2/runner.py", line 60, in main
    runner.train()
  File "/home/jay/Open_Duck_Playground/playground/common/runner.py", line 114, in train
    _, params, _ = train_fn(
                   ^^^^^^^^^
  File "/home/jay/Open_Duck_Playground/.venv/lib/python3.11/site-packages/brax/training/agents/ppo/train.py", line 692, in train
    policy_params_fn(current_step, make_policy, params)
  File "/home/jay/Open_Duck_Playground/playground/common/runner.py", line 78, in policy_params_fn
    export_onnx(
  File "/home/jay/Open_Duck_Playground/playground/common/export_onnx.py", line 105, in export_onnx
    example_output = tf_policy_network(example_input)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jay/Open_Duck_Playground/.venv/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/jay/Open_Duck_Playground/playground/common/export_onnx.py", line 69, in call
    inputs = (inputs - self.mean) / self.std
              ~~~~~~~^~~~~~~~~~~
tensorflow.python.framework.errors_impl.InternalError: Exception encountered when calling MLP.call().

{{function_node __wrapped__Sub_device_/job:localhost/replica:0/task:0/device:GPU:0}} 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast<CUstream>(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE' [Op:Sub] name:

Arguments received by MLP.call():
  • inputs=tf.Tensor(shape=(1, 101), dtype=float32)

Does anyone have any idea how to solve this problem?
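
One thing that may be worth trying, going by the log above: JAX training itself appears to run (STEP 0 completes and a checkpoint is saved), and the failure only shows up when TensorFlow runs the `Sub` op on GPU:0 inside `export_onnx`, right after it warns that it was not built with kernel binaries for compute capability 12.0. If that reading is right, keeping TensorFlow off the GPU might sidestep the error, since the export only needs a small forward pass. A minimal sketch, assuming it is called before TensorFlow initializes its devices (for example near the top of `runner.py`); the helper name `hide_gpus_from_tensorflow` is hypothetical and not part of the repo:

```python
def hide_gpus_from_tensorflow() -> bool:
    """Ask TensorFlow to ignore all GPUs so its ops run on the CPU.

    Returns True if the GPUs were hidden, False if TensorFlow is not
    importable or its runtime was already initialized.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return False  # TensorFlow not available, nothing to configure

    try:
        # An empty device list hides every GPU from TensorFlow only.
        # This setting does not touch JAX, which is configured separately.
        tf.config.set_visible_devices([], "GPU")
    except RuntimeError:
        # Raised when TensorFlow has already initialized its devices.
        return False
    return True


if __name__ == "__main__":
    print("TensorFlow GPUs hidden:", hide_gpus_from_tensorflow())
```

This is untested on an RTX 50-series card and is only a workaround; a TensorFlow build that ships kernels for compute capability 12.0 should make it unnecessary.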
