πŸ§‘β€πŸ’» CodeT5 for Code-to-Docstring Generation

This mini-project for the NLP course (Master 1, AI, Semester 2) applies and fine-tunes the CodeT5 model to generate natural-language docstrings from source code snippets.

In addition to implementing the model, I studied and summarized the CodeT5 research paper to understand the architecture, training methodology, and evaluation metrics used in the original work.


πŸ“Œ Project Overview

Automatic code documentation is an important challenge in software engineering. Large models such as CodeT5 already achieve strong performance in code summarization tasks. However, fine-tuning on a carefully prepared dataset can adapt the model to a specific domain and improve the quality of generated docstrings.

In this project:

  • I fine-tuned CodeT5-small on a dataset of 40k code-docstring pairs (see the training sketch after this list).
  • I compared pre-trained performance vs. fine-tuned performance using multiple metrics.
  • I implemented an interactive CLI tool where users can paste code and instantly get generated docstrings.
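
For reference, the sketch below shows how such a fine-tuning run can be set up with Hugging Face Transformers. It is a minimal illustration under stated assumptions, not the exact training script used here: the data file path, the column names (code, docstring), and the hyperparameters are all placeholders.

# Hypothetical fine-tuning sketch; paths, columns, and hyperparameters are illustrative
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("Salesforce/codet5-small")

# Assumes a JSONL file with "code" and "docstring" fields
dataset = load_dataset("json", data_files="data/train.jsonl")["train"]

def preprocess(batch):
    inputs = tokenizer(batch["code"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["docstring"], max_length=128, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="./checkpoints",
        per_device_train_batch_size=16,
        num_train_epochs=3,
        learning_rate=5e-5,
        save_strategy="epoch",
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()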

βš™οΈ Installation

# Clone repository
git clone https://github.com/diaazg/code-comment.git
cd code-comment

# Create environment
conda create -n codet5-docstring python=3.10 -y
conda activate codet5-docstring

# Install dependencies
pip install -r requirements.txt

πŸš€ Usage

Run the interactive CLI tool:

python main.py --model_dir ./checkpoints --device mps  # use cuda, mps, or cpu depending on your device
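
The real entry point is main.py in this repository; the snippet below is only a rough sketch of the kind of generate loop such a CLI wraps, assuming the checkpoint directory holds a fine-tuned CodeT5 model. Everything beyond the two flags shown above is an assumption.

# Illustrative sketch of an interactive generate loop (not the actual main.py)
import argparse

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

parser = argparse.ArgumentParser()
parser.add_argument("--model_dir", default="./checkpoints")
parser.add_argument("--device", default="cpu")  # cpu, cuda, or mps
args = parser.parse_args()

tokenizer = AutoTokenizer.from_pretrained(args.model_dir)
model = AutoModelForSeq2SeqLM.from_pretrained(args.model_dir).to(args.device)
model.eval()

while True:
    code = input("Paste code (empty line to quit): ")
    if not code.strip():
        break
    inputs = tokenizer(code, return_tensors="pt", truncation=True, max_length=512).to(args.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=128, num_beams=4)
    print(tokenizer.decode(output[0], skip_special_tokens=True))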

πŸ“Š Evaluation

I evaluated on 4,000 samples before and after fine-tuning.

| Metric         | Pre-trained CodeT5 | Fine-tuned |
| -------------- | ------------------ | ---------- |
| ROUGE-1        | ~0.05              | ~0.34      |
| METEOR         | ~0.027             | ~0.24      |
| BERTScore (F1) | ~0.78              | ~0.87      |

βœ… Fine-tuning led to a substantial improvement, especially in semantic similarity (BERTScore), showing that the model better captures the meaning of reference docstrings.
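
All three metrics are available through the Hugging Face evaluate library. The sketch below is a minimal example with toy inputs, not the project's evaluation script; preds and refs stand in for the lists of generated and reference docstrings.

# Metric sketch; preds/refs are placeholder lists of generated and reference docstrings
import evaluate

preds = ["Return the sum of two numbers."]
refs = ["Returns the sum of a and b."]

rouge = evaluate.load("rouge")
meteor = evaluate.load("meteor")
bertscore = evaluate.load("bertscore")

print(rouge.compute(predictions=preds, references=refs)["rouge1"])
print(meteor.compute(predictions=preds, references=refs)["meteor"])

# BERTScore returns per-sample scores; average the F1 values
bs = bertscore.compute(predictions=preds, references=refs, lang="en")
print(sum(bs["f1"]) / len(bs["f1"]))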

πŸ™ Acknowledgments

This project is based on the CodeT5 model introduced in:

Wang et al., CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation, EMNLP 2021.
