added docker for blackwell arch support #41
base: main
Conversation
Walkthrough

Docker containerization infrastructure is added: a Python 3.10-slim base image with system and Python dependencies, plus PyTorch nightly CUDA 12.6 wheels. Inference logic is updated to handle tuple returns from vc_single, and a safe-globals configuration is added for hubert model loading.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant App as Application
    participant Infer as infer_file()
    participant VC as vc_single()
    participant Model as Model
    App->>Infer: call infer_file()
    Infer->>VC: invoke vc_single()
    VC->>Model: process audio
    alt Returns Tuple
        Model-->>VC: (info, (times, wav_opt))
        VC-->>Infer: tuple result
        Infer->>Infer: Extract wav_opt from tuple
        alt wav_opt is None
            Infer->>Infer: Raise RuntimeError
        else wav_opt valid
            Infer-->>App: Return wav data
        end
    else Returns Direct Array
        Model-->>VC: wav_opt array
        VC-->>Infer: direct result
        Infer-->>App: Return wav data
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

The changes span configuration (Dockerfile, docker-compose), documentation, and logic modifications. The Docker files are boilerplate-heavy with standard setup patterns. The inference tuple handling and safe-globals changes introduce localized logic that requires careful verification of correctness, but the complexity is contained to specific functions without widespread impact.
Pre-merge checks: ❌ 1 failed check (warning), ✅ 2 passed.
Actionable comments posted: 1
🧹 Nitpick comments (7)
rvc_python/infer.py (1)

145-156: Good error handling, with minor style improvements needed.

The tuple unpacking logic correctly handles multiple return formats from vc_single. However, static analysis flags two minor style issues:

- The times variable is unpacked but never used (line 149)
- Long error messages should be defined outside the exception class (line 151)

Apply this diff to address the static analysis warnings:

```diff
  # Handle error case where vc_single returns a tuple (info, (times, wav_opt))
  if isinstance(result, tuple) and len(result) == 2:
      info, audio_data = result
      if isinstance(audio_data, tuple):
-         times, wav_opt = audio_data
+         _, wav_opt = audio_data
          if wav_opt is None:
-             raise RuntimeError(f"Voice conversion failed: {info}")
+             error_msg = f"Voice conversion failed: {info}"
+             raise RuntimeError(error_msg)
      else:
          wav_opt = audio_data
  else:
      wav_opt = result
```
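For reference, the same defensive-unpacking pattern as a self-contained sketch (a minimal illustration, not the project's actual infer_file; the helper name extract_wav is hypothetical):

```python
# Sketch: normalize vc_single-style return values, assuming it may yield either
# a bare waveform array or an (info, (times, wav_opt)) tuple on error paths.
def extract_wav(result):
    if isinstance(result, tuple) and len(result) == 2:
        info, audio_data = result
        if isinstance(audio_data, tuple):
            _, wav_opt = audio_data  # timing info is discarded
            if wav_opt is None:
                error_msg = f"Voice conversion failed: {info}"
                raise RuntimeError(error_msg)
            return wav_opt
        return audio_data  # (info, wav_opt) shape without timings
    return result  # plain array return
```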
rvc_python/modules/vc/utils.py (1)

22-25: Clarify the comment about "unsafe globals".

The implementation correctly enables loading of fairseq models by adding Dictionary to safe globals. However, the comment "Temporarily allow unsafe globals" is misleading: Dictionary from fairseq is a legitimate data structure used by the model, not an unsafe object.

Consider updating the comment to be more accurate:

```diff
- # Temporarily allow unsafe globals for fairseq models
+ # Enable safe loading of fairseq models that use Dictionary objects
  from fairseq.data.dictionary import Dictionary
  torch.serialization.add_safe_globals([Dictionary])
```
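For background on why the allow-list is needed, a minimal sketch under the assumption of PyTorch ≥ 2.4 (where torch.serialization.add_safe_globals exists) and a weights_only load path that rejects classes not on the list; the checkpoint filename is hypothetical:

```python
import torch
from fairseq.data.dictionary import Dictionary

# weights_only loading only deserializes allow-listed globals; fairseq hubert
# checkpoints embed a Dictionary instance, so register it before loading.
torch.serialization.add_safe_globals([Dictionary])

ckpt = torch.load("hubert_base.pt", map_location="cpu", weights_only=True)  # hypothetical path
```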
README.md (1)

266-274: Consider adding prerequisite information for the Docker setup.

The Docker usage instructions are clear and concise. However, users unfamiliar with NVIDIA Docker setup might benefit from additional context about prerequisites.

Consider adding a brief note about prerequisites:

```markdown
### Docker (nvidia blackwell support)

You can run the rvc-python codebase in docker for easier debugging & contributing. It was implemented to help support newer rtx 5000 series hardware, but works with CPU as well.

**Prerequisites:**
- Docker and Docker Compose installed
- For GPU support: NVIDIA Docker runtime and compatible drivers (CUDA 12.6+)

Run:
```

This would help users ensure they have the necessary setup before attempting to use Docker.
Dockerfile (1)
14-14: Consider reducing the CUDA architecture list to optimize build time and image size.

TORCH_CUDA_ARCH_LIST includes architectures from Maxwell (5.0) to Blackwell (12.0). If this Docker image is specifically for newer hardware (Blackwell support), you could reduce the list to only recent architectures to decrease compilation time and image size.

For example, to support only Ampere, Hopper, and Blackwell:

```diff
-ENV TORCH_CUDA_ARCH_LIST="5.0;6.0;7.0;7.5;8.0;8.6;9.0;12.0"
+ENV TORCH_CUDA_ARCH_LIST="8.0;8.6;9.0;12.0"
```

However, if broad hardware compatibility is desired, the current list is appropriate.
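To pick the entries empirically, a small sketch using standard PyTorch device-query APIs (assumes a CUDA-enabled build on the host):

```python
import torch

# Print each visible GPU's compute capability so TORCH_CUDA_ARCH_LIST can be
# trimmed to the architectures actually deployed (Blackwell reports 12.0).
if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        major, minor = torch.cuda.get_device_capability(i)
        print(f"GPU {i} ({torch.cuda.get_device_name(i)}): compute {major}.{minor}")
else:
    print("No CUDA device visible")
```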
docker-compose.yml (3)
3-3: Consider changing the restart policy for development use.

The restart: always policy will cause the container to automatically restart on failure or system reboot. Since the README indicates this Docker setup is "for easier debugging & contributing," a less aggressive restart policy might be more appropriate.

For development, consider:

```diff
-    restart: always
+    restart: "no"
```

Or use unless-stopped if you want the container to persist across reboots but not restart on failures:

```diff
-    restart: always
+    restart: unless-stopped
```
16-19: Consider removing duplicate environment variables.

The environment variables TORCH_CUDA_ARCH_LIST and PYTORCH_CUDA_ALLOC_CONF are set in both the Dockerfile (lines 14-16) and docker-compose.yml (lines 18-19). This duplication creates a maintenance burden.

Since these variables are already defined in the Dockerfile, consider removing them from docker-compose.yml:

```diff
     environment:
       - PYTHONUNBUFFERED=1
-      - TORCH_CUDA_ARCH_LIST="5.0;6.0;7.0;7.5;8.0;8.6;9.0;12.0"
-      - PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
```

However, if you need the flexibility to override these values at runtime, keeping them in docker-compose.yml is acceptable.
28-28: Update the example command to use the actual mount paths.

The example command uses path/to/model.pth, but based on the volume mount configuration (./models/:/models/), models should be referenced from /models/ inside the container.

Update the example to reflect the actual mount structure:

```diff
-  # python -m rvc_python cli -i input.wav -o output.wav -mp path/to/model.pth -de cuda:0
+  # python -m rvc_python cli -i input.wav -o output.wav -mp /models/model_name/model.pth -de cuda:0
```

This aligns with the README instruction to "add your models to the ./models directory."
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- Dockerfile (1 hunks)
- README.md (1 hunks)
- docker-compose.yml (1 hunks)
- rvc_python/infer.py (2 hunks)
- rvc_python/modules/vc/utils.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
rvc_python/infer.py (1)
rvc_python/modules/vc/modules.py (1)
vc_single(142-217)
🪛 Ruff (0.14.1)
rvc_python/infer.py
149-149: Unpacked variable times is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
151-151: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (1)
Dockerfile (1)
16-16: Flagged typo in environment variable value could not be confirmed.

The comment claims there is a typo (expandabled_segments instead of expandable_segments), which would cause PyTorch to ignore the memory allocation configuration. However, line 16 as provided reads:

```dockerfile
ENV PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
```

which is already the correct spelling ("expandable_segments"), so there appears to be nothing to fix. Double-check the actual file content to verify.

Likely an incorrect or invalid review comment.
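One way to settle it is to check inside the running container; a trivial sketch using only the standard library:

```python
import os

# Print the allocator config exactly as PyTorch will see it.
print(os.environ.get("PYTORCH_CUDA_ALLOC_CONF"))  # expect "expandable_segments:True"
```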
The comment below anchors to these Dockerfile lines:

```dockerfile
RUN pip install -r requirements.txt

RUN pip install --pre torch torchaudio --index-url https://download.pytorch.org/whl/nightly/cu126
```
🧩 Analysis chain
Verify if PyTorch nightly is required for Blackwell support.
Using --pre to install nightly PyTorch builds introduces stability risks, as nightly versions can contain bugs and breaking changes.
Please verify whether CUDA 12.6 and Blackwell (compute capability 12.0) support is available in stable PyTorch releases. If stable releases now support Blackwell, consider using a pinned stable version instead.
If nightly is truly required, consider:

- Adding a comment explaining why nightly is necessary
- Pinning to a specific nightly build for reproducibility, e.g. torch==2.6.0.dev20251022+cu126
🏁 Script executed:

```bash
#!/bin/bash
# Check if stable PyTorch supports CUDA 12.6 and Blackwell architecture
echo "Checking PyTorch stable releases for CUDA 12.6 support..."
curl -s https://download.pytorch.org/whl/torch_stable.html | grep -i "cu126" | head -5
```

Length of output: 57
Blackwell support requires CUDA 12.8, not 12.6: update the Dockerfile.

PyTorch 2.7 introduces support for NVIDIA's Blackwell GPU architecture and ships pre-built wheels for CUDA 12.8, which is the CUDA version Blackwell requires.

Using --pre torch torchaudio --index-url https://download.pytorch.org/whl/nightly/cu126 will not provide proper Blackwell support: CUDA 12.x-compiled binaries only run on the new GPUs if they were compiled with explicit support for the architecture, and the publicly available cu126 wheels may not include it.

Since PyTorch 2.7 has been released with Blackwell support and pre-built CUDA 12.8 wheels, update the Dockerfile to use cu128:

```dockerfile
RUN pip install --pre torch torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
```

Alternatively, if stable releases are acceptable, use the official stable cu128 wheels without --pre.
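Whichever index is chosen, a quick runtime check with standard PyTorch APIs can confirm the installed wheel was actually compiled for Blackwell:

```python
import torch

# cu128 wheels report CUDA 12.8; a Blackwell-capable build lists sm_120.
print(torch.version.cuda)          # e.g. "12.8"
print(torch.cuda.get_arch_list())  # should include "sm_120" for Blackwell
```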
🤖 Prompt for AI Agents
In Dockerfile around line 12, the pip install line uses the cu126 nightly index
which lacks proper NVIDIA Blackwell (CUDA 12.8) support; update the wheel index
from cu126 to cu128 (or switch to the stable cu128 wheels) so PyTorch/Torchaudio
are installed with CUDA 12.8-compatible binaries; modify the RUN pip install
command to reference the cu128 index URL or remove --pre and point to the
appropriate stable cu128 package source.
Summary by CodeRabbit

New Features
- Docker containerization (Dockerfile and docker-compose) with support for newer NVIDIA hardware.

Documentation
- README section describing Docker usage.

Bug Fixes
- Inference now handles tuple returns from vc_single; safe-globals registration enables hubert model loading.