This repository contains code for fine-tuning the DeepSeek-R1-Distill-Llama-8B model using Unsloth, LoRA, and Hugging Face tools. The model has been fine-tuned on customer service and general chat datasets and is open-sourced for further fine-tuning on domain-specific datasets.
I've taken the open-source model deepseek-ai/DeepSeek-R1-Distill-Llama-8B (loaded via Unsloth in 4-bit as unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit) and fine-tuned it on several datasets:
- taskydata/baize_chatbot
- MohammadOthman/mo-customer-support-tweets-945k
- bitext/Bitext-customer-support-llm-chatbot-training-dataset
The training was done in three steps. After fine-tuning, I merged the trained weights with the base model and pushed the complete model to my Hugging Face model repository: Aeshp/deepseekR1tunedchat.
- The model is released under the MIT license, allowing free use in applications and further fine-tuning
- This model may sometimes hallucinate, as is common with language models
- For effective fine-tuning, users should have a sufficient amount of high-quality data to avoid overfitting
Example_dataset/: Sample datasets for testing and demonstrationFinetune_base/: Python scripts for fine-tuning the base modelfineTune_nb_base/: Jupyter notebooks for fine-tuning the base modelfinetune_checkpoint/: Python scripts for fine-tuning from the checkpoint modelfinetune_checkpoint_nb/: Jupyter notebooks for fine-tuning from the checkpoint modelhuggingface_push/: Scripts for pushing models to Hugging Face Hubload_data/: Utilities for loading and preprocessing datasetsModelCall/: Scripts for model inference and deployment
-
Clone the repository:
git clone https://github.com/Aeshp/deepseekR1finetune.git cd deepseekR1finetune -
Choose your workflow:
- For base model fine-tuning:
cd Finetune_base - For checkpoint model fine-tuning:
cd finetune_checkpoint - For notebooks (Jupyter/Colab): Use the corresponding notebook directories
- For base model fine-tuning:
-
Install requirements:
pip install -r requirements.txt
cd Finetune_base
python unsloth_deepseek_basetune.pycd finetune_checkpoint
python deepseekR1tunedchat.pyDuring training, you can monitor progress using TensorBoard:
tensorboard --logdir outputs/runs/Then open http://localhost:6006 in your browser to view training metrics.
To create your own dataset for fine-tuning, follow this format:
{
"instruction": "Your instruction here",
"input": "Additional context (optional)",
"output": "The expected model output",
"text": "Combined prompt + response template"
}Convert your data to JSONL format and split it into train and test sets.
After fine-tuning, run inference with:
cd finetune_checkpoint
python inferance.pyYou can modify the prompt and parameters in the script to suit your needs.
cd huggingface_push
python push_only_weights.pycd huggingface_push
python merge_weights_and_push.pyRemember to update the repository name and login with your Hugging Face token.
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Llama-8B
Quantized:
- unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- bitsandbytes
- 8B
license: mit
language:
- en
datasets:
- taskydata/baize_chatbot
- MohammadOthman/mo-customer-support-tweets-945k
- bitext/Bitext-customer-support-llm-chatbot-training-dataset
new_version: Aeshp/deepseekR1tunedchat
pipeline_tag: text-generation
library_name: transformers
This model is fine-tuned on customer service and chatbot data and is ready to be fine-tuned on any specific domain dataset. To learn how to tune it further, refer to this GitHub repository: Aeshp/deepseekR1finetune.
The model uses the base model deepseek-ai/DeepSeek-R1-Distill-Llama-8B and is loaded in 4-bit as unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit. It has been tuned on the datasets mentioned above and pushed with merged weights to the base model.
- deepseek-ai/DeepSeek-R1-Distill-Llama-8B
- deepseek-ai/DeepSeek-R1
- meta-llama/Meta-Llama-3-8B
- unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit
This project is licensed under the MIT License - see the LICENSE file for details.