Pull Request Overview
This PR refactors the Python dependency installation in the GPU Dockerfile by splitting a single pip install command into multiple separate RUN commands and adds cleanup steps for apt cache.
- Splits monolithic pip install into multiple individual RUN commands
- Adds apt cleanup commands to reduce image size
- Separates pip upgrade from dependency installation
RUN pip install "nicegui>=2.0.0" "nicegui-highcharts>=0.2.0"
RUN pip install "vllm>=0.11.0"
Remove trailing whitespace on lines 27 and 28.
Suggested change:
RUN pip install "nicegui>=2.0.0" "nicegui-highcharts>=0.2.0"
RUN pip install "vllm>=0.11.0"
RUN pip install "datasets"
RUN pip install "pandas" "huggingface_hub" "hf_transfer" "tqdm"
RUN pip install "accelerate" "safetensors"
RUN pip install "nicegui>=2.0.0" "nicegui-highcharts>=0.2.0"
RUN pip install "vllm>=0.11.0"
RUN pip install "transformers>=4.52.0"
RUN pip install "llmcompressor"
Splitting pip install commands into multiple RUN statements creates additional Docker layers and increases build time. Each RUN command creates a new layer, and splitting dependencies that could be installed together results in redundant pip overhead. Consider grouping related dependencies into fewer RUN commands (e.g., data processing libraries together, UI libraries together) to optimize layer caching while reducing total layers.
Suggested change:
RUN pip install \
    "datasets" \
    "pandas" \
    "huggingface_hub" \
    "hf_transfer" \
    "tqdm" \
    "accelerate" \
    "safetensors" \
    "nicegui>=2.0.0" \
    "nicegui-highcharts>=0.2.0" \
    "vllm>=0.11.0" \
    "transformers>=4.52.0" \
    "llmcompressor"
RUN pip install "nicegui>=2.0.0" "nicegui-highcharts>=0.2.0"
RUN pip install "vllm>=0.11.0"
Remove trailing whitespace on line 29.
Suggested change:
RUN pip install "vllm>=0.11.0"
Root Cause #1: Invalid parameter name 'dataset_split'
- LLM-Compressor's oneshot() expects 'splits' not 'dataset_split'
- The parameter must be a dict: {"calibration": split_name}
- This was causing: ValueError: Some keys are not used by the HfArgumentParser: ['dataset_split']

Root Cause #2: Inverted symmetric/zero_point logic
- AWQ config had: "symmetric": config.zero_point
- Correct: "symmetric": not config.zero_point
- When zero_point=True → asymmetric quantization → symmetric=False
- When zero_point=False → symmetric quantization → symmetric=True
- This bug caused incorrect quantization configuration

Changes:
- AWQ: Fixed splits parameter format (line 151)
- AWQ: Fixed symmetric parameter logic (line 111)
- NVFP4: Fixed splits parameter format (line 225)

Both quantizers now use the correct LLM-Compressor API format.

Tested with: llmcompressor 0.8.1, wikitext dataset
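
The two fixes described in this commit message can be sketched in isolation. This is a minimal illustration, not the project's actual code: `QuantConfig`, `awq_weight_args`, and `calibration_kwargs` are hypothetical names, and the real call to `oneshot()` is left out so the snippet runs without LLM-Compressor installed. Only the `splits` dict shape and the `symmetric`/`zero_point` relationship are taken from the message.

```python
from dataclasses import dataclass


@dataclass
class QuantConfig:
    """Hypothetical stand-in for the project's quantizer config object."""
    zero_point: bool
    dataset_split: str = "train"


def awq_weight_args(config: QuantConfig) -> dict:
    # A zero point implies asymmetric quantization, so `symmetric`
    # is the negation of `zero_point` (the bug passed it through as-is).
    return {"symmetric": not config.zero_point}


def calibration_kwargs(config: QuantConfig) -> dict:
    # Per the commit message, the split is passed under `splits` as a
    # dict keyed by "calibration", not as a bare `dataset_split` string.
    return {"splits": {"calibration": config.dataset_split}}


if __name__ == "__main__":
    cfg = QuantConfig(zero_point=True, dataset_split="train[:512]")
    print(awq_weight_args(cfg))
    print(calibration_kwargs(cfg))
```

Keeping both mappings in small pure functions makes them easy to cover with unit tests, which would have caught the inverted flag before it reached the quantizer.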
better for caching