Skip to content

break up packages into docker layers#2

Merged
j4ys0n merged 1 commit into
mainfrom
initial
Nov 3, 2025
Merged

break up packages into docker layers#2
j4ys0n merged 1 commit into
mainfrom
initial

Conversation

@j4ys0n
Copy link
Copy Markdown
Contributor

@j4ys0n j4ys0n commented Nov 3, 2025

better for caching

Copilot AI review requested due to automatic review settings November 3, 2025 06:18
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the Python dependency installation in the GPU Dockerfile by splitting a single pip install command into multiple separate RUN commands and adds cleanup steps for apt cache.

  • Splits monolithic pip install into multiple individual RUN commands
  • Adds apt cleanup commands to reduce image size
  • Separates pip upgrade from dependency installation

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread docker/Dockerfile.gpu
Comment on lines +27 to +30
RUN pip install "nicegui>=2.0.0" "nicegui-highcharts>=0.2.0"

RUN pip install "vllm>=0.11.0"

Copy link

Copilot AI Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove trailing whitespace on lines 27 and 28.

Suggested change
RUN pip install "nicegui>=2.0.0" "nicegui-highcharts>=0.2.0"
RUN pip install "vllm>=0.11.0"
RUN pip install "nicegui>=2.0.0" "nicegui-highcharts>=0.2.0"
RUN pip install "vllm>=0.11.0"

Copilot uses AI. Check for mistakes.
Comment thread docker/Dockerfile.gpu
Comment on lines +21 to 34
RUN pip install "datasets"

RUN pip install "pandas" "huggingface_hub" "hf_transfer" "tqdm"

RUN pip install "accelerate" "safetensors"

RUN pip install "nicegui>=2.0.0" "nicegui-highcharts>=0.2.0"

RUN pip install "vllm>=0.11.0"

RUN pip install "transformers>=4.52.0"

RUN pip install "llmcompressor"

Copy link

Copilot AI Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting pip install commands into multiple RUN statements creates additional Docker layers and increases build time. Each RUN command creates a new layer, and splitting dependencies that could be installed together results in redundant pip overhead. Consider grouping related dependencies into fewer RUN commands (e.g., data processing libraries together, UI libraries together) to optimize layer caching while reducing total layers.

Suggested change
RUN pip install "datasets"
RUN pip install "pandas" "huggingface_hub" "hf_transfer" "tqdm"
RUN pip install "accelerate" "safetensors"
RUN pip install "nicegui>=2.0.0" "nicegui-highcharts>=0.2.0"
RUN pip install "vllm>=0.11.0"
RUN pip install "transformers>=4.52.0"
RUN pip install "llmcompressor"
RUN pip install \
"datasets" \
"pandas" \
"huggingface_hub" \
"hf_transfer" \
"tqdm" \
"accelerate" \
"safetensors" \
"nicegui>=2.0.0" \
"nicegui-highcharts>=0.2.0" \
"vllm>=0.11.0" \
"transformers>=4.52.0" \
"llmcompressor"

Copilot uses AI. Check for mistakes.
Comment thread docker/Dockerfile.gpu

RUN pip install "nicegui>=2.0.0" "nicegui-highcharts>=0.2.0"

RUN pip install "vllm>=0.11.0"
Copy link

Copilot AI Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove trailing whitespace on line 29.

Suggested change
RUN pip install "vllm>=0.11.0"
RUN pip install "vllm>=0.11.0"

Copilot uses AI. Check for mistakes.
@j4ys0n
Copy link
Copy Markdown
Contributor Author

j4ys0n commented Nov 3, 2025

Automated review 🤖

Summary of Changes
This PR restructures the GPU Dockerfile to split Python dependency installations into separate RUN commands. This improves Docker layer caching by isolating each dependency group, making builds faster when only some dependencies change.

Key Changes & Positives

  • 🟢 Split pip install commands into individual layers for better caching.
  • 🟢 Removed redundant apt-get clean and rm -rf in one line.
  • 🟢 Improved readability and maintainability of dependency installation steps.

Potential Issues & Recommendations

  1. Issue / Risk: Multiple RUN commands increase the number of layers and may slightly increase image size due to intermediate layers.
    Impact: Slight overhead in image size and build time.
    Recommendation: Consider using pip install with all packages in one command to reduce layers, unless caching benefits outweigh this.
    Status: 🟡 Needs review

  2. Issue / Risk: Dependency versions may conflict if not carefully managed across multiple RUN commands.
    Impact: Risk of inconsistent or broken environments.
    Recommendation: Validate that all dependencies are compatible when split across layers.
    Status: 🟡 Needs review

Language/Framework Checks

  • Python: No issues detected in dependency handling or versioning.
  • Docker: Layering strategy is valid but may increase image size slightly.

Security & Privacy

  • No security concerns introduced.

Build/CI & Ops

  • No CI or build changes included.

Tests

  • No test changes included.

Approval Recommendation
Approve with caveats

  • Verify that splitting dependencies doesn't introduce version conflicts.
  • Confirm that the increased layer count is acceptable for caching benefits.

@j4ys0n j4ys0n merged commit b15379f into main Nov 3, 2025
7 checks passed
@j4ys0n j4ys0n deleted the initial branch November 3, 2025 06:25
j4ys0n pushed a commit that referenced this pull request Nov 4, 2025
Root Cause #1: Invalid parameter name 'dataset_split'
- LLM-Compressor's oneshot() expects 'splits' not 'dataset_split'
- The parameter must be a dict: {"calibration": split_name}
- This was causing: ValueError: Some keys are not used by the
  HfArgumentParser: ['dataset_split']

Root Cause #2: Inverted symmetric/zero_point logic
- AWQ config had: "symmetric": config.zero_point
- Correct: "symmetric": not config.zero_point
- When zero_point=True → asymmetric quantization → symmetric=False
- When zero_point=False → symmetric quantization → symmetric=True
- This bug caused incorrect quantization configuration

Changes:
- AWQ: Fixed splits parameter format (line 151)
- AWQ: Fixed symmetric parameter logic (line 111)
- NVFP4: Fixed splits parameter format (line 225)

Both quantizers now use the correct LLM-Compressor API format.

Tested with: llmcompressor 0.8.1, wikitext dataset
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants