fix: resolve UID 1000 conflict with ubuntu 24.04 user in CUDA base images - Working GPU transcription on Blackwell #463

Open: user22300 wants to merge 5 commits into rishikanthc:main from user22300:main

user22300 commented Jun 12, 2026

Problem

Ubuntu 24.04 ships with a default ubuntu user at UID/GID 1000. Both CUDA Dockerfiles create appuser at UID 10001 to avoid this conflict, but the docker-entrypoint.sh script defaults PUID/PGID to 1000. The mismatch causes persistent permission errors on volume mounts, which in turn lead to cascading errors on Blackwell systems.

Fix

Remove the default ubuntu user at UID 1000 before creating appuser at UID 1000, consistent with what the entrypoint script expects.
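
To make the conflict concrete, here is a minimal Python sketch (not part of this PR) that looks up which account already holds a given UID in passwd-format text. The sample `ubuntu` entry and the helper names `uid_owner` and `drop_user` are illustrative assumptions; only the UID/GID 1000 value comes from the description above.

```python
from typing import Optional

# Illustrative passwd-format content. The exact fields of the ubuntu entry
# are an assumption; only UID/GID 1000 comes from the PR description.
SAMPLE_PASSWD = """\
root:x:0:0:root:/root:/bin/bash
ubuntu:x:1000:1000:Ubuntu:/home/ubuntu:/bin/bash
"""


def uid_owner(passwd_text: str, uid: int) -> Optional[str]:
    """Return the account name that holds `uid`, or None if the UID is free."""
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # passwd format: name:password:UID:GID:gecos:home:shell
        fields = line.split(":")
        if len(fields) >= 3 and fields[2].isdigit() and int(fields[2]) == uid:
            return fields[0]
    return None


def drop_user(passwd_text: str, name: str) -> str:
    """Return passwd text without the entry for `name` (a stand-in for userdel)."""
    kept = [l for l in passwd_text.splitlines() if l.split(":")[0] != name]
    return "\n".join(kept) + "\n"
```

Inside a container, `uid_owner(open("/etc/passwd").read(), 1000)` would show whether UID 1000 is already taken before appuser is created.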

Files changed

  • Dockerfile.cuda
  • Dockerfile.cuda.12.9

Tested and confirmed working on

  • Latest main branch (Commit bdb8838), image built locally on Windows 11, Docker Desktop (WSL2), RTX 5090 (Blackwell)

…r in CUDA base images on Ubuntu 24.04

Ubuntu 24.04 ships with a default ubuntu user at UID/GID 1000 which conflicts with appuser creation, causing persistent permission errors on volume mounts. Fix removes the ubuntu user first before creating appuser at UID 1000.

user22300 commented Jun 12, 2026

Author

Can I also just say thank you to Rishikanth for Scriberr. Truly appreciate your work, such a meaningful contribution to the open source community. I wish you the very best in your job search :)

Add explanation of why this fork exists, identify to users that this repo will not be maintained and is intended to be merged with original repo.
user22300 changed the title from "fix: resolve UID 1000 conflict with ubuntu user in CUDA base images" to "fix: resolve UID 1000 conflict with ubuntu user in CUDA 12.9 base images for Blackwell GPUs" on Jun 12, 2026
user22300 changed the title from "fix: resolve UID 1000 conflict with ubuntu user in CUDA 12.9 base images for Blackwell GPUs" to "fix: resolve UID 1000 conflict with ubuntu 24.04 user in CUDA base images - Working GPU transcription on Blackwell" on Jun 12, 2026
@lyricalvanity


Hi @rishikanthc — first off, sorry to hear about the layoff, and genuinely best of luck with the search. Thanks for building Scriberr; it's become a real part of my workflow. No rush on any of the below — just documenting it for whenever you (or anyone) has bandwidth.
I've got Scriberr running fully on Blackwell now — RTX Pro 4000 (sm_120) — with the complete pipeline working: large-v3 transcription, wav2vec2 alignment, and pyannote diarization all completing cleanly on GPU. Getting there meant clearing two distinct Blackwell blockers, and I think they're the two things the other folks in #451, #387, etc. are hitting:

  1. UID conflict: the appuser/UID 1000 vs ubuntu-24.04 ownership problem. Looks like @user22300's PR #463 already addresses this, and it matches what I had to work around (cache/venv ownership failing under appuser). 👍
  2. Hardcoded CUDA index (cu126): this is the one I haven't seen called out yet. The PyTorch index resolves to download.pytorch.org/whl/cu126, and cu126 wheels don't ship sm_120 kernels, so torch installs "successfully" but then throws "no kernel image is available for execution on the device" at the first real CUDA op (for me, torch._cudnn_rnn_flatten_weight during alignment). Blackwell needs the cu128 build (CUDA 12.8+).
    The fix is pointing the index at cu128, but there's a subtlety worth flagging. The cu126 reference appears in two kinds of places: the embedded pyproject.toml for each env (WhisperX, pyannote, etc.), and a separate bare cu126 token that feeds the whl/%s index template used on the WhisperX path specifically. Fixing only the pyprojects leaves WhisperX still pulling cu126, which is an easy trap. Once both are on cu128, torch.cuda.get_arch_list() shows sm_120 and everything runs.
    Verified: torch held at 2.8.0, only the CUDA build swapped cu126→cu128. Host was driver 596.36 / CUDA 13.2, Docker on WSL2, NVIDIA Container Toolkit.
    Ideal long-term fix is probably making the index selectable (env var, or detect compute capability at runtime) so it's not hardcoded for any single GPU generation — happy to put together a PR along those lines if it'd help, just say the word. Thanks again. 🙏
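
A minimal sketch of what a selectable index could look like, assuming a Python helper that picks the wheel tag from an environment override first and the GPU's compute capability second. The variable name `TORCH_CUDA_TAG`, the function names, the `https://` prefix, and the capability threshold are all assumptions for illustration; only the `whl/%s` template shape and the cu126/cu128 tags come from the comment above.

```python
import os
from typing import Mapping, Optional, Tuple

# Template shape (whl/%s) is taken from the comment above; the https scheme is assumed.
INDEX_TEMPLATE = "https://download.pytorch.org/whl/%s"

# Hypothetical override variable; this name is not defined by the project.
OVERRIDE_VAR = "TORCH_CUDA_TAG"


def select_cuda_tag(
    capability: Optional[Tuple[int, int]],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick a wheel tag: an explicit override wins, else decide by capability."""
    env = os.environ if env is None else env
    override = env.get(OVERRIDE_VAR, "").strip()
    if override:
        return override
    # Assumption: compute capability 10.0 and above is treated as
    # Blackwell-class and routed to cu128. The thread itself only
    # confirms sm_120 (capability 12.0).
    if capability is not None and capability >= (10, 0):
        return "cu128"
    # Assumption: fall back to cu126 when no newer GPU is detected.
    return "cu126"


def index_url(tag: str) -> str:
    """Expand the wheel tag into a full index URL."""
    return INDEX_TEMPLATE % tag
```

With this shape, `index_url(select_cuda_tag((12, 0)))` would resolve to the cu128 index unless the override variable says otherwise, so both the pyproject path and the WhisperX template could read the tag from one place.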

user22300 commented Jun 15, 2026

Author

Hi there,

Glad the UID fix is working for you.

Regarding the cu126/cu128 issue you raised, I checked my Dockerfile.cuda.12.9 in my fork made for this PR, and it has no cu126 references at all, with PYTORCH_CUDA_VERSION=cu128 set throughout.

I've been running Whisper large-v3 transcription successfully on my RTX 5090 without hitting the CUDA kernel issue you described, which suggests the Blackwell Dockerfile in this repo may already handle the cu128 routing correctly when built from the current main branch (commit bdb8838).
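
One way to compare the two setups is to check, inside each container, whether the installed torch build lists kernels for the GPU's compute capability. A minimal sketch, assuming torch is importable in the environment being tested; the helper names are hypothetical, while `torch.cuda.get_arch_list()` is the call mentioned earlier in this thread.

```python
from typing import Iterable, Tuple


def arch_name(capability: Tuple[int, int]) -> str:
    """Turn a (major, minor) compute capability into an sm_XY name."""
    major, minor = capability
    return f"sm_{major}{minor}"


def has_kernels_for(arch_list: Iterable[str], capability: Tuple[int, int]) -> bool:
    """True if the build lists a native kernel target for this capability.

    Simplification: only an exact sm_ match counts. The list may also
    contain compute_XX (PTX) entries, which this sketch ignores.
    """
    return arch_name(capability) in set(arch_list)


def report() -> str:
    """Describe the local GPU; needs torch and a visible CUDA device."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if not torch.cuda.is_available():
        return "no CUDA device visible"
    capability = torch.cuda.get_device_capability(0)
    archs = torch.cuda.get_arch_list()
    status = "present" if has_kernels_for(archs, capability) else "MISSING"
    return f"{arch_name(capability)} kernels {status} in {archs}"
```

Running `print(report())` in the environment that performs alignment, not only the one that performs transcription, would show whether both paths ended up on a build that targets the GPU.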

If you'd like, try a test build from my fork at https://github.com/user22300/Scriberr/tree/main, which has the UID fix applied. It would be really useful to know whether that resolves both of the above blockers for you, as it has for me.

As an aside, your point about making the CUDA index selectable via env var is a great long-term suggestion and IMHO genuinely merits a separate PR.

Hope this helps :)

EDIT:

One thing worth checking: which Dockerfile did you build from?

The repo has two CUDA Dockerfiles:

  • Dockerfile.cuda (targeting legacy GTX 10-series through RTX 40-series, using cu126 wheels)
  • Dockerfile.cuda.12.9 (targeting Blackwell, using cu128 wheels)

The docker-compose.build.blackwell.yml file explicitly references Dockerfile.cuda.12.9, so building with that compose file should pull the correct cu128 wheels automatically. If you built with a different compose file, or specified Dockerfile.cuda directly, that would explain the cu126/cu128 issue you encountered. My RTX 5090 build using Dockerfile.cuda.12.9 has no cu126 references and large-v3 transcription is working cleanly.
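
A quick way to confirm which wheel tags a given Dockerfile references is to scan its text. This is a minimal sketch with hypothetical helper names; it only counts tokens that look like cuNNN and makes no claim about how the project's build actually consumes them.

```python
import re
from collections import Counter
from pathlib import Path
from typing import Dict

# Matches CUDA wheel tags such as cu126 or cu128 as whole words.
CUDA_TAG = re.compile(r"\bcu\d{3}\b")


def cuda_tags(text: str) -> Dict[str, int]:
    """Count each cuNNN tag that appears in the given text."""
    return dict(Counter(CUDA_TAG.findall(text)))


def cuda_tags_in_file(path: str) -> Dict[str, int]:
    """Same as cuda_tags, but reads the text from a file on disk."""
    return cuda_tags(Path(path).read_text(encoding="utf-8"))
```

For example, `cuda_tags_in_file("Dockerfile.cuda.12.9")` returning only a cu128 key would match the description above, while any cu126 key would point at a leftover reference.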
