fix: resolve UID 1000 conflict with ubuntu 24.04 user in CUDA base images - Working GPU transcription on Blackwell #463
user22300 wants to merge 5 commits into
…r in CUDA base images on Ubuntu 24.04
Ubuntu 24.04 ships with a default ubuntu user at UID/GID 1000, which conflicts with appuser creation and causes persistent permission errors on volume mounts. The fix removes the ubuntu user before creating appuser at UID 1000.
Can I also just say thank you to Rishikanth for Scriberr. Truly appreciate your work, such a meaningful contribution to the open source community. I wish you the very best in your job search :)
Add an explanation of why this fork exists; tell users that this repo will not be maintained and is intended to be merged into the original repo.
Hi @rishikanthc, first off, sorry to hear about the layoff, and genuinely best of luck with the search. Thanks for building Scriberr; it's become a real part of my workflow. No rush on any of the below; I'm just documenting it for whenever you (or anyone) has bandwidth.
Hi there,

Glad the UID fix is working for you.

Regarding the cu126/cu128 issue you raised: I checked Dockerfile.cuda.12.9 in the fork I made for this PR, and it has no cu126 references at all, with PYTORCH_CUDA_VERSION=cu128 set throughout. I've been running Whisper large-v3 transcription successfully on my RTX 5090 without hitting the CUDA kernel issue you described, which suggests the Blackwell Dockerfile in this repo may already handle the cu128 routing correctly when built from the current main branch (commit bdb8838).

If you want to test, you can build from my fork at https://github.com/user22300/Scriberr/tree/main, which has the UID fix applied. It would be really useful to know whether that resolves both of the above blockers for you, as it has for me.

As an aside, your point about making the CUDA index selectable via env var is a great long-term suggestion and IMHO genuinely merits a separate PR.

Hope this helps :)

EDIT: One thing worth checking: which Dockerfile did you build from? The repo has two CUDA Dockerfiles: Dockerfile.cuda and Dockerfile.cuda.12.9. The docker-compose.build.blackwell.yml file explicitly references Dockerfile.cuda.12.9, so building with that compose file should pull the correct cu128 wheels automatically. If you built using a different compose file, or by directly specifying Dockerfile.cuda instead, that would explain the cu126/cu128 issue you encountered. My RTX 5090 build using Dockerfile.cuda.12.9 has no cu126 references, and large-v3 transcription is working cleanly.
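For anyone who wants to repeat the "no cu126 references" check on their own checkout, here is a minimal shell sketch. The helper name count_tag_refs is hypothetical (it is not part of Scriberr), and it only counts matching lines in whatever file you point it at; it does not inspect the wheels that actually get installed.

```shell
# Hypothetical helper (not part of the repo): count the lines in a file
# that mention a given PyTorch CUDA wheel tag such as cu126 or cu128.
count_tag_refs() {
  tag="$1"
  file="$2"
  # grep -c prints the number of matching lines. It exits with status 1
  # when there are none, so "|| true" keeps this usable under "set -e".
  grep -c -- "$tag" "$file" || true
}

# Example usage from the repository root (file name taken from this thread):
#   count_tag_refs cu126 Dockerfile.cuda.12.9
#   count_tag_refs cu128 Dockerfile.cuda.12.9
```

A count of 0 for cu126 alongside a non-zero count for cu128 would match what is described above for Dockerfile.cuda.12.9. Running the same check against Dockerfile.cuda may help confirm which file a given build actually used.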
Problem

Ubuntu 24.04 ships with a default ubuntu user at UID/GID 1000. Both CUDA Dockerfiles create appuser at UID 10001 to avoid this conflict, but the docker-entrypoint.sh script uses PUID/PGID 1000 by default, causing a mismatch that results in persistent permission errors on volume mounts. This leads to cascading errors on Blackwell systems.

Fix

Remove the default ubuntu user at UID 1000 before creating appuser at UID 1000, consistent with what the entrypoint script expects.

Files changed

- Dockerfile.cuda
- Dockerfile.cuda.12.9

Tested and confirmed working on
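
The order of operations in the fix can be illustrated with a small dry-run sketch in shell. This is an assumption about the shape of the change, not the exact commands in the PR: the helper name plan_appuser_fix is hypothetical, and it only prints the userdel, groupadd and useradd commands that would be needed, without executing them. It also assumes that userdel removes the user's private group as well (the usual Ubuntu setting); if it does not, the groupadd step would need the old group removed first.

```shell
# Hypothetical dry-run helper: print the commands that would free a UID
# and create appuser there. Nothing is executed, so no root is needed.
plan_appuser_fix() {
  target_uid="${1:-1000}"
  passwd_file="${2:-/etc/passwd}"

  # Find the account that currently holds the target UID, if any.
  owner=$(awk -F: -v uid="$target_uid" '$3 == uid { print $1; exit }' "$passwd_file")

  # Another account (for example ubuntu) holds the UID: remove it first.
  if [ -n "$owner" ] && [ "$owner" != "appuser" ]; then
    echo "userdel -r $owner"
  fi

  # Create appuser at the target UID/GID unless it is already there.
  if [ "$owner" != "appuser" ]; then
    echo "groupadd -g $target_uid appuser"
    echo "useradd -u $target_uid -g $target_uid -m appuser"
  fi
}

# Example usage inside a container built from the base image:
#   plan_appuser_fix 1000 /etc/passwd
```

Printing the plan instead of running it keeps the sketch safe to try anywhere. In an image build, the printed commands would run as root in a single step before any volume paths are chowned.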