Bug Description
The Stable Diffusion WebUI app from the Olares Market does not work on the Olares One hardware (RTX 5090 Mobile GPU). Every image generation attempt fails with:
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Root Cause (confirmed via diagnosis)
The SD WebUI container image ships with torch 2.3.0 + CUDA 12.1, which only supports up to sm_90. The RTX 5090 uses Blackwell architecture (sm_120) and requires torch built against CUDA 12.8+ with sm_120 kernels.
Confirmed by running inside the container shell:
$ pip show torch | grep -i version
Version: 2.3.0
$ python -c "import torch; print(torch.version.cuda); print(torch.cuda.get_arch_list())"
12.1
['sm_50', 'sm_60', 'sm_61', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90']
No sm_120 in the arch list means this torch build has no compiled kernels for Blackwell GPUs, which is exactly the "no kernel image" error above.
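The diagnosis above can be reproduced programmatically. A minimal sketch (the helper name `device_is_supported` is ours, not a torch API; the two torch calls whose outputs it compares are `torch.cuda.get_arch_list()` and `torch.cuda.get_device_capability(0)`):

```python
def device_is_supported(arch_list, capability):
    """Return True if a torch build's compiled kernel list covers a GPU.

    arch_list:  output of torch.cuda.get_arch_list(), e.g. ['sm_86', 'sm_90']
    capability: output of torch.cuda.get_device_capability(0), e.g. (12, 0)
    """
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list

# The container's torch 2.3.0+cu121 arch list vs. the RTX 5090 (Blackwell, sm_120):
ships_with = ['sm_50', 'sm_60', 'sm_61', 'sm_70',
              'sm_75', 'sm_80', 'sm_86', 'sm_90']
print(device_is_supported(ships_with, (12, 0)))  # False -> "no kernel image" error
print(device_is_supported(ships_with, (8, 6)))   # True  -> an Ampere card would work
```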
Additional Blockers Found
- Read-only filesystem: The container's /opt/conda/ is read-only. pip uninstall torch fails with PermissionError. This makes it impossible to replace torch inside the running container.
- YAML command overrides don't work: Even when modifying the deployment YAML to run pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu128, pip sees "Requirement already satisfied" from the read-only system torch and skips the install.
- Workaround partially works but breaks other deps: Installing torch nightly cu128 to /tmp/torchfix and using PYTHONPATH=/tmp/torchfix successfully loads torch 2.12+cu128 with sm_120 support, and CUDA tensor operations pass. However, the new torch pulls in numpy 2.x, which breaks the container's tensorflow, gradio, kornia, and other packages compiled against numpy 1.x, causing the WebUI to crash on startup.
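The numpy breakage is the usual numpy 1.x/2.x ABI split: extensions compiled against 1.x headers crash under 2.x. A minimal sketch of a guard the workaround script could run before launching the WebUI (the function name `numpy_abi_compatible` is ours, purely illustrative):

```python
def numpy_abi_compatible(version: str) -> bool:
    """Packages compiled against numpy 1.x (tensorflow, kornia, ...) break
    under numpy 2.x, so the workaround must keep numpy below 2.0.
    `version` is the string from numpy.__version__."""
    major = int(version.split(".")[0])
    return major < 2

print(numpy_abi_compatible("1.26.4"))  # True  -> safe with the image's packages
print(numpy_abi_compatible("2.1.0"))   # False -> expect WebUI startup crashes
```

In practice this means pinning `"numpy<2"` when installing the torch nightly into the /tmp/torchfix overlay, so the overlay's numpy does not shadow the ABI the image's packages were built against.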
Environment
- Hardware: Olares One (Intel Core Ultra 9 275HX, NVIDIA RTX 5090 Mobile 24GB)
- Olares OS: 1.12.4
- App: Stable Diffusion WebUI (from Olares Market)
- Container torch: 2.3.0+cu121
- Container Python: 3.10 (Conda-based)
- Host CUDA: 12.8+ (Olares 1.12.0 release notes confirm "CUDA support extended to 12.9")
What Works
The RTX 5090 GPU works correctly at the host level. nvidia-smi detects it, HAMI GPU scheduler assigns it to the container, and when torch nightly cu128 is manually installed to a separate path, CUDA operations succeed:
$ PYTHONPATH=/tmp/torchfix python -c "import torch; t=torch.randn(4,4,device='cuda'); print(torch.cuda.get_device_name(0)); print(t@t.T); print('PASS')"
NVIDIA GeForce RTX 5090 Laptop GPU
tensor([...], device='cuda:0')
PASS
Expected Behavior
The SD WebUI app should work out of the box on Olares One, which ships with an RTX 5090.
Suggested Fix
Rebuild the SD WebUI container image with:
- Base image: nvidia/cuda:12.8.0-devel-ubuntu22.04 (or newer)
- PyTorch: stable release with cu128 (or nightly cu128)
- numpy < 2 (to maintain compatibility with existing packages)
- No xformers (use --opt-sdp-attention instead, as xformers crashes on Blackwell)
Alternatively, provide a separate SD WebUI image tagged for Blackwell/RTX 50-series GPUs.
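For reference, the rebuild could start from a Dockerfile along these lines. This is a hedged sketch only: the apt package set, the WebUI install step, and the launch command are placeholders, not the Market image's actual build.

```dockerfile
# Sketch of a Blackwell-capable base for the SD WebUI image (details are assumptions).
FROM nvidia/cuda:12.8.0-devel-ubuntu22.04

RUN apt-get update && apt-get install -y python3 python3-pip git && \
    rm -rf /var/lib/apt/lists/*

# cu128 wheels include sm_120 kernels for RTX 50-series (Blackwell).
RUN pip3 install --pre torch torchvision \
    --index-url https://download.pytorch.org/whl/nightly/cu128

# Keep numpy 1.x: tensorflow/gradio/kornia in the image are built against it.
RUN pip3 install "numpy<2"

# SD WebUI install step omitted; launch without xformers on Blackwell:
# CMD ["python3", "launch.py", "--opt-sdp-attention"]
```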