A project demonstrating how to fine-tune the unsloth/gemma-3-1b-it model for Python code generation using Unsloth's optimized LoRA implementation.
- Overview
- Model on Hugging Face
- Workflow
- 🚀 Reproduce the Training
- ⚙️ Project Structure
- 🧠 Fine-Tuning Methodology
- 🔧 Configuration
- 🤝 Contributing
- 📄 License
This repository provides a complete pipeline for fine-tuning a small, powerful language model (gemma-3-1b-it) to specialize in generating Python code. It leverages several key technologies:
- Unsloth: An optimized training pipeline that, according to the Unsloth project, enables up to 2x faster training and up to 60% less memory usage.
- LoRA (Low-Rank Adaptation): A parameter-efficient fine-tuning (PEFT) technique that dramatically reduces computational and storage costs.
- GGUF: The final model is exported to this format, making it highly portable and efficient for inference on a wide range of hardware.
The final GGUF model, as well as the LoRA adapters, are available for direct download from the Hugging Face Hub.
The project follows a clear, three-step process from training to a deployable model.
graph TD;
A[Base Model: gemma-3-1b-it] --> C{main.py};
B[Dataset: code_instructions_120k_alpaca] --> C;
C -- LoRA Fine-Tuning --> D[LoRA Adapters in /outputs];
D --> E{export_gguf.py};
A --> E;
E -- Merge & Quantize --> F[Final Model: merged_model.Q8_0.gguf];
D --> G{test_model.py};
G -- Run Inference --> H[Test Output];
Assuming you have cloned the repository and installed the dependencies from requirements.txt, you can replicate the fine-tuning process with the following steps:
- Prepare the Dataset: Ensure your training data, `python-codes.json`, is in the project root. See the Dataset Configuration section for details on the required format.
- Run Fine-Tuning: Execute the main training script with `python main.py`. This saves the trained LoRA adapters in the `outputs/` directory.
- Export to GGUF: Merge the adapters with the base model and convert to GGUF with `python export_gguf.py`. Note: You may need to update the `model_name` path in `export_gguf.py` to point to your latest checkpoint from the `outputs/` directory.
- Test the Model: Run a quick inference test with `python test_model.py` to validate the fine-tuned model. Note: As with the export script, you may need to update the checkpoint path in `test_model.py`.
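The notes above ask you to point the export and test scripts at the latest checkpoint. A minimal sketch of locating it automatically is shown here. It assumes checkpoints are saved as `outputs/checkpoint-<step>`, which is the default naming of the Hugging Face `Trainer`; the helper name `latest_checkpoint` is hypothetical and not part of this repository.

```python
import re
from pathlib import Path
from typing import Optional


def latest_checkpoint(output_dir: str = "outputs") -> Optional[Path]:
    """Return the checkpoint directory with the highest step number, or None.

    Assumption: checkpoints are directories named "checkpoint-<step>".
    """
    root = Path(output_dir)
    if not root.is_dir():
        return None

    pattern = re.compile(r"^checkpoint-(\d+)$")
    candidates = []
    for child in root.iterdir():
        match = pattern.match(child.name)
        if child.is_dir() and match:
            # Compare by numeric step so checkpoint-200 beats checkpoint-90.
            candidates.append((int(match.group(1)), child))

    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]


if __name__ == "__main__":
    print(latest_checkpoint())
```

The returned path could then be used as the value of `model_name` in `export_gguf.py` or as the checkpoint path in `test_model.py`, if those scripts are adapted to accept it.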
- `main.py`: The core script for fine-tuning the model.
- `export_gguf.py`: Merges LoRA adapters and exports the model to the GGUF format.
- `test_model.py`: A utility script to run inference for testing purposes.
- `requirements.txt`: A list of all Python dependencies.
- `python-codes.json`: The example training dataset.
- `outputs/`: The default directory where trained model checkpoints (adapters) are saved.
This project uses LoRA (Low-Rank Adaptation), a technique that freezes the pre-trained model weights and injects trainable rank-decomposition matrices.
Key Benefits:
- High Efficiency: Requires significantly less VRAM and trains much faster than full fine-tuning.
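- Small Footprint: The output is just the adapter weights, typically megabytes rather than the gigabytes of a full model. The exact size depends on the rank and on which modules are targeted.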
- Modularity: Easily swap adapters to switch the model's specialized task.
The use of Unsloth further enhances this process, providing major speed and memory optimizations.
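To make the idea of frozen weights plus trainable rank-decomposition matrices concrete, here is a toy sketch of a single LoRA-adapted linear layer in plain Python. It is an illustration of the general technique, not the code used by Unsloth or by `main.py`; the function names and the tiny dimensions are assumptions chosen for readability.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, r):
    """Compute y = W x + (alpha / r) * B (A x).

    W is the frozen pre-trained weight; only A (r x d_in) and
    B (d_out x r) would receive gradient updates during fine-tuning.
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


# Toy sizes (hypothetical, far smaller than any real model layer).
d_in, d_out, r, alpha = 8, 6, 2, 4
rng = random.Random(0)

W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # trainable
B = [[0.0] * r for _ in range(d_out)]                               # trainable, zero-initialized

x = [1.0] * d_in

# With B initialized to zero, the adapted layer starts out identical to the base layer.
assert lora_forward(W, A, B, x, alpha, r) == matvec(W, x)
```

Because the update is stored separately as `A` and `B`, swapping adapters amounts to swapping these two small matrices while `W` stays untouched.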
To use your own data, you must format it as a JSON file containing a list of objects. Each object must have instruction, input, and output keys.
- Example Format (`your_data.json`): `[ { "instruction": "Create a Python function to find the max of two numbers.", "input": "a = 10, b = 20", "output": "def max_num(a, b):\n return max(a, b)" } ]`
- Update `main.py`: Change the `load_dataset` call to point to your file, replacing `load_dataset("python-codes.json")` with `load_dataset("path/to/your_data.json")`.
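Before training on your own file, it can help to check that every record has the required keys. The following is a minimal sketch using only the standard library; the function names `validate_records` and `validate_file` are hypothetical helpers, not part of this repository.

```python
import json

REQUIRED_KEYS = ("instruction", "input", "output")


def validate_records(records):
    """Return a list of problems; an empty list means the structure looks valid."""
    if not isinstance(records, list):
        return ["top-level JSON value must be a list of objects"]

    problems = []
    for index, record in enumerate(records):
        if not isinstance(record, dict):
            problems.append(f"record {index}: expected an object")
            continue
        for key in REQUIRED_KEYS:
            if key not in record:
                problems.append(f"record {index}: missing key '{key}'")
    return problems


def validate_file(path):
    """Load a JSON dataset file and validate its records."""
    with open(path, encoding="utf-8") as handle:
        return validate_records(json.load(handle))


if __name__ == "__main__":
    sample = [
        {
            "instruction": "Create a Python function to find the max of two numbers.",
            "input": "a = 10, b = 20",
            "output": "def max_num(a, b):\n    return max(a, b)",
        }
    ]
    print(validate_records(sample))  # an empty list means no problems were found
```

Running `validate_file("path/to/your_data.json")` before `python main.py` would surface missing keys early rather than partway through training.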
The training hyperparameters are configured in the `TrainingArguments` object in `main.py`.
| Parameter | Value | Description |
|---|---|---|
| `per_device_train_batch_size` | `2` | Number of samples per GPU per batch. |
| `gradient_accumulation_steps` | `4` | Steps to accumulate gradients for a larger effective batch size (8). |
| `warmup_steps` | `10` | Steps for a learning rate warm-up phase to stabilize training. |
| `num_train_epochs` | `3` | Total number of times to iterate over the dataset. |
| `learning_rate` | `2e-4` | The starting learning rate. |
| `optim` | `"adamw_8bit"` | Memory-efficient AdamW optimizer. |
| `output_dir` | `"outputs"` | Directory to save model checkpoints. |
| `save_strategy` | `"epoch"` | Save a checkpoint at the end of each epoch. |
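The effective batch size of 8 follows from multiplying the per-device batch size by the accumulation steps. The sketch below works through that arithmetic and estimates the number of optimizer steps. The dataset size of 1,000 examples is a hypothetical figure for illustration, and the step count is approximate because trainers differ in how they round the final partial batch.

```python
import math


def effective_batch_size(per_device: int, grad_accum: int, num_gpus: int = 1) -> int:
    """Samples contributing to each optimizer update."""
    return per_device * grad_accum * num_gpus


def approx_optimizer_steps(num_examples: int, per_device: int, grad_accum: int,
                           epochs: int, num_gpus: int = 1) -> int:
    """Rough estimate of total optimizer updates over the whole run."""
    per_step = effective_batch_size(per_device, grad_accum, num_gpus)
    return math.ceil(num_examples / per_step) * epochs


# Values from the hyperparameter table, on a single GPU.
batch = effective_batch_size(per_device=2, grad_accum=4)

# Hypothetical dataset of 1,000 examples trained for 3 epochs.
steps = approx_optimizer_steps(num_examples=1000, per_device=2, grad_accum=4, epochs=3)

print(batch)  # 8
print(steps)  # 375
```

With these assumed numbers, the 10 warm-up steps would cover roughly the first 3% of training.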
The LoRA parameters are configured in the `setup_lora` function in `main.py`.
| Parameter | Value | Description |
|---|---|---|
| `r` | `64` | The rank of the LoRA matrices. Higher means more capacity. |
| `lora_alpha` | `128` | The scaling factor for the LoRA updates (typically 2 * r). |
| `target_modules` | `[...]` | The specific model layers (e.g., `q_proj`, `v_proj`) to apply LoRA to. |
| `lora_dropout` | `0` | Dropout rate for LoRA layers. 0 is optimized in Unsloth. |
| `bias` | `"none"` | Specifies that bias parameters are not trained (an Unsloth optimization). |
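To see what `r` and `lora_alpha` imply numerically, the following sketch computes the standard LoRA scaling factor (`lora_alpha / r`) and the number of trainable parameters added to one adapted weight matrix. The 1024 x 1024 layer shape is a hypothetical example, not the actual shape of any layer in gemma-3-1b-it, and the helper names are illustrative.

```python
def lora_scale(lora_alpha: int, r: int) -> float:
    """Standard LoRA multiplies the low-rank update by lora_alpha / r."""
    return lora_alpha / r


def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in A (r x d_in) plus B (d_out x r) for one adapted matrix."""
    return r * (d_in + d_out)


# Values from the LoRA parameter table.
r, lora_alpha = 64, 128
print(lora_scale(lora_alpha, r))  # 2.0

# Hypothetical 1024 x 1024 projection layer, for illustration only.
d_in = d_out = 1024
adapter_params = lora_trainable_params(d_in, d_out, r)
full_params = d_in * d_out
print(adapter_params)                # 131072
print(full_params)                   # 1048576
print(adapter_params / full_params)  # 0.125
```

Doubling `r` doubles the adapter size for every targeted module, so the rank and the length of `target_modules` together determine how large the saved adapters are.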
Contributions are welcome! Please feel free to submit a pull request or open an issue if you have suggestions for improvements.
- Fork the Project
- Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
- Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the Branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is distributed under the MIT License. See LICENSE for more information.