DeepSeekR1 Fine-Tuning and Inference Codebase

This repository contains code for fine-tuning the DeepSeek-R1-Distill-Llama-8B model using Unsloth, LoRA, and Hugging Face tools. The model has been fine-tuned on customer service and general chat datasets and is open-sourced for further fine-tuning on domain-specific datasets.

📋 Overview

I took the open-source model deepseek-ai/DeepSeek-R1-Distill-Llama-8B (loaded via Unsloth in 4-bit as unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit) and fine-tuned it on three datasets:

- taskydata/baize_chatbot
- MohammadOthman/mo-customer-support-tweets-945k
- bitext/Bitext-customer-support-llm-chatbot-training-dataset

Training was done in three steps. After fine-tuning, I merged the trained weights with the base model and pushed the complete model to my Hugging Face model repository: Aeshp/deepseekR1tunedchat.

⚠️ Important Notes

The model is released under the MIT license, which allows free use in applications and further fine-tuning.
Like other language models, this model may sometimes hallucinate.
To avoid overfitting during further fine-tuning, use a sufficiently large, high-quality dataset.

🗂️ Repository Structure

Example_dataset/: Sample datasets for testing and demonstration
Finetune_base/: Python scripts for fine-tuning the base model
fineTune_nb_base/: Jupyter notebooks for fine-tuning the base model
finetune_checkpoint/: Python scripts for fine-tuning from the checkpoint model
finetune_checkpoint_nb/: Jupyter notebooks for fine-tuning from the checkpoint model
huggingface_push/: Scripts for pushing models to Hugging Face Hub
load_data/: Utilities for loading and preprocessing datasets
ModelCall/: Scripts for model inference and deployment

🚀 Getting Started

Environment Setup

Clone the repository:

git clone https://github.com/Aeshp/deepseekR1finetune.git
cd deepseekR1finetune

Choose your workflow:
- For base model fine-tuning: cd Finetune_base
- For checkpoint model fine-tuning: cd finetune_checkpoint
- For notebooks (Jupyter/Colab): use the corresponding notebook directory (fineTune_nb_base or finetune_checkpoint_nb)

Install requirements:
```
pip install -r requirements.txt
```

Fine-tuning Workflows

Base Model Fine-tuning

cd Finetune_base
python unsloth_deepseek_basetune.py

Checkpoint Model Fine-tuning

cd finetune_checkpoint
python deepseekR1tunedchat.py

Using TensorBoard for Monitoring

During training, you can monitor progress using TensorBoard:

tensorboard --logdir outputs/runs/

Then open http://localhost:6006 in your browser to view training metrics.

Dataset Creation

To create your own dataset for fine-tuning, follow this format:

{
  "instruction": "Your instruction here",
  "input": "Additional context (optional)",
  "output": "The expected model output",
  "text": "Combined prompt + response template"
}

Convert your data to JSONL format and split it into train and test sets.
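As a rough illustration of the format above, the sketch below builds records, makes a train/test split, and writes each split as JSONL. The Alpaca-style template, the helper names (`build_record`, `split_records`, `write_jsonl`), the 90/10 ratio, and the output file names are assumptions for illustration; the repository's own scripts may use a different template.

```python
import json
import random

# Assumed Alpaca-style template; the repository's actual "text" template may differ.
TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)


def build_record(instruction, output, input_text=""):
    """Return one example with the four fields described above."""
    record = {"instruction": instruction, "input": input_text, "output": output}
    record["text"] = TEMPLATE.format(**record)
    return record


def split_records(records, test_fraction=0.1, seed=42):
    """Shuffle a copy of the records and return (train, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def write_jsonl(path, records):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")


# Hypothetical toy data, only to show the flow end to end.
examples = [
    build_record(f"Answer customer question {i}", f"Reply {i}") for i in range(10)
]
train, test = split_records(examples)
write_jsonl("train.jsonl", train)
write_jsonl("test.jsonl", test)
```

Fixing the shuffle seed keeps the split reproducible, so evaluation numbers from different runs stay comparable.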

Running Inference

After fine-tuning, run inference with:

cd finetune_checkpoint
python inferance.py

You can modify the prompt and parameters in the script to suit your needs.
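For readers who want a feel for what such a script might do, here is a minimal sketch built on the Hugging Face `transformers` API. It is not the repository's `inferance.py`, whose contents are not shown here. The prompt template, the helper names (`build_prompt`, `generate_reply`), and the generation parameters are assumptions; only the model ID `Aeshp/deepseekR1tunedchat` comes from this README.

```python
def build_prompt(instruction, input_text=""):
    """Format a request with an assumed Alpaca-style template."""
    prompt = f"### Instruction:\n{instruction}\n\n"
    if input_text:
        prompt += f"### Input:\n{input_text}\n\n"
    return prompt + "### Response:\n"


def generate_reply(instruction, model_id="Aeshp/deepseekR1tunedchat",
                   max_new_tokens=256, temperature=0.7):
    """Load the merged model and generate one reply (downloads the weights)."""
    # Imported lazily so the prompt helper can be used without these packages.
    # device_map="auto" additionally requires the `accelerate` package.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=temperature,
        )
    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_reply("How do I reset my password?")` downloads the model on first use, so it needs enough disk space and accelerator memory for an 8B-parameter model in bfloat16.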

Pushing to Hugging Face Hub

Push Only Weights

cd huggingface_push
python push_only_weights.py

Push Merged Model with Base Weights

cd huggingface_push
python merge_weights_and_push.py

Remember to update the repository name in the script and log in with your Hugging Face token before pushing.
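One way to supply the token without hard-coding it is to read it from an environment variable. The sketch below is not the repository's push script: the `HF_TOKEN` variable name and the helper names (`get_hf_token`, `push_folder`) are assumptions, while `HfApi.create_repo` and `HfApi.upload_folder` are standard `huggingface_hub` calls.

```python
import os


def get_hf_token(env_var="HF_TOKEN"):
    """Return the Hugging Face token from the environment, or fail clearly."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to a Hugging Face token with write access.")
    return token


def push_folder(repo_id, folder_path):
    """Create the model repo if needed and upload a saved model folder."""
    # Imported lazily so the token helper works without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi(token=get_hf_token())
    api.create_repo(repo_id, repo_type="model", exist_ok=True)
    api.upload_folder(repo_id=repo_id, folder_path=folder_path, repo_type="model")
```

A call such as `push_folder("your-username/your-model", "outputs/merged")` would then upload a locally saved model directory; both arguments here are placeholders.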

📊 Model Details

base_model:
- deepseek-ai/DeepSeek-R1-Distill-Llama-8B
Quantized:
- unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- bitsandbytes
- 8B
license: mit
language:
- en
datasets:
- taskydata/baize_chatbot
- MohammadOthman/mo-customer-support-tweets-945k
- bitext/Bitext-customer-support-llm-chatbot-training-dataset
new_version: Aeshp/deepseekR1tunedchat
pipeline_tag: text-generation
library_name: transformers

This model is fine-tuned on customer service and chatbot data and can be fine-tuned further on any domain-specific dataset. To learn how to do so, refer to the GitHub repository Aeshp/deepseekR1finetune.

The base model is deepseek-ai/DeepSeek-R1-Distill-Llama-8B, loaded in 4-bit as unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit. It was fine-tuned on the datasets listed above, and the fine-tuned weights were merged into the base model before being pushed.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.
