Fine-Tuned GPT-2 Model for Song Lyrics Generation

This repository contains a Jupyter notebook implementing a fine-tuned version of the GPT-2 model for generating song lyrics. The model was trained on a dataset of lyrics from various American singers to produce creative and coherent song lyrics based on user prompts.

Overview

This project leverages the GPT-2 language model, fine-tuned specifically for the task of lyric generation. By training on a diverse collection of American song lyrics, the model has learned patterns, styles, and structures common in popular music, enabling it to generate unique and contextually relevant lyrics.

Dataset

The model was trained on a dataset compiled from song lyrics by a variety of American singers. This dataset was processed and tokenized into a format suitable for the GPT-2 architecture. The tokenized data is stored in the combined_artists_tokenized.csv file.
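
As a rough illustration of what "tokenized data stored in a CSV" can look like in practice, the sketch below parses cells that hold a stringified list of token IDs back into Python lists. The column names (`artist`, `tokens`) and the two sample rows are hypothetical stand-ins; the actual layout of combined_artists_tokenized.csv may differ, so adjust the column name accordingly.

```python
import ast
import csv
import io

# Hypothetical two-row sample; the real file's columns and values may differ.
SAMPLE_CSV = (
    "artist,tokens\n"
    'A,"[40, 1842, 345]"\n'
    'B,"[464, 1755, 318, 1862]"\n'
)


def parse_token_cell(cell):
    """Convert a stringified list such as "[40, 1842]" into a list of ints."""
    tokens = ast.literal_eval(cell)  # safe parsing of Python literals only
    if not isinstance(tokens, list):
        raise ValueError(f"expected a list of token IDs, got {type(tokens).__name__}")
    return [int(t) for t in tokens]


def load_token_lists(handle, column="tokens"):
    """Read every row from an open CSV handle and parse the token column."""
    reader = csv.DictReader(handle)
    return [parse_token_cell(row[column]) for row in reader]


# An in-memory file keeps the example runnable without the real dataset.
token_lists = load_token_lists(io.StringIO(SAMPLE_CSV))
```

To read the real file, the same function could be called on `open("combined_artists_tokenized.csv", newline="")` once the correct column name is known.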

Training Process

The fine-tuning process involved adapting a pre-trained GPT-2 model (gpt2) using the collected lyric dataset. Key steps included:

Data Preparation: Lyrics were loaded, converted from token strings to lists, and organized into a PyTorch Dataset (LyricsDataset). Sequences were padded or truncated to a fixed length (1024 tokens).
Model and Tokenizer: The standard gpt2 model and tokenizer from the transformers library were utilized.
Configuration: Training was performed using the AdamW optimizer and a linear learning rate scheduler over 3 epochs with a batch size of 4.
Hardware: Training was configured to use a GPU if available, otherwise defaulting to CPU.
Model Saving: The fine-tuned model and tokenizer were saved based on validation loss to the fine_tuned_gpt2_lyrics directory.
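
The data preparation and configuration steps above can be sketched in plain Python. This is not the notebook's code: it only illustrates padding or truncating to 1024 tokens and the arithmetic behind a linear learning rate schedule over 3 epochs with a batch size of 4. The padding ID (GPT-2's end-of-text ID, 50256, since GPT-2 has no dedicated padding token), the base learning rate, and the zero warmup are assumptions, as the README does not state them.

```python
import math

MAX_LENGTH = 1024     # fixed sequence length stated in the steps above
PAD_TOKEN_ID = 50256  # assumption: reuse GPT-2's end-of-text ID as padding


def pad_or_truncate(tokens, max_length=MAX_LENGTH, pad_id=PAD_TOKEN_ID):
    """Return (input_ids, attention_mask), both exactly max_length long."""
    kept = list(tokens[:max_length])  # truncate sequences that are too long
    n_pad = max_length - len(kept)    # number of padding positions needed
    input_ids = kept + [pad_id] * n_pad
    attention_mask = [1] * len(kept) + [0] * n_pad  # 0 marks padding
    return input_ids, attention_mask


def total_training_steps(num_examples, batch_size=4, epochs=3):
    """Optimizer steps when the last partial batch of each epoch is kept."""
    return math.ceil(num_examples / batch_size) * epochs


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical numbers: 10 training examples and a base rate of 5e-5.
steps = total_training_steps(num_examples=10)
schedule = [linear_lr(s, steps, base_lr=5e-5) for s in range(steps + 1)]
```

The attention mask matters when padding reuses a real token ID: it lets the model ignore the padded positions rather than treat them as end-of-text markers.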

How to Use the Model

To generate song lyrics using the fine-tuned model via the provided Jupyter notebook:

Clone the Repository: Obtain a copy of this repository.
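Install Dependencies: Install the required libraries (e.g., pandas, numpy, torch, transformers, tqdm, scikit-learn; note that the package imported as sklearn is installed under the name scikit-learn). If the repository does not provide a requirements.txt file, create one listing these packages, then run: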
```
pip install -r requirements.txt
```
Open the Notebook: Launch Jupyter Notebook or JupyterLab and open lyrical-gpt2.ipynb in the FINE_TUNE directory.
Execute Cells: Run the notebook cells sequentially.
- Initial cells handle setup, data loading, and preprocessing.
- Training cells (Epochs 1-3) can be run to train the model. This step is time-consuming. If the fine_tuned_gpt2_lyrics directory exists from a previous run, these cells can be skipped.
- Final cells load the saved model and tokenizer and provide the lyric generation functionality.
Generate Lyrics: In the final cell block, you will be prompted to enter a starting phrase. The model will then generate a continuation of the lyrics.
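
For readers who prefer to generate lyrics outside the notebook, the following is a minimal sketch of loading the saved model and sampling a continuation with the transformers library. It assumes the fine_tuned_gpt2_lyrics directory already exists in the working directory; the function name and the sampling settings (`max_new_tokens`, `top_k`, `top_p`, `temperature`) are illustrative choices, not values taken from the notebook.

```python
def generate_lyrics(prompt, model_dir="fine_tuned_gpt2_lyrics", max_new_tokens=100):
    """Load the fine-tuned model from model_dir and continue the prompt."""
    # Imported lazily so that defining this helper does not load the heavy libraries.
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    # Use a GPU if available, otherwise fall back to CPU, as in training.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    tokenizer = GPT2Tokenizer.from_pretrained(model_dir)
    model = GPT2LMHeadModel.from_pretrained(model_dir).to(device)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,    # sample instead of greedy decoding
            top_k=50,          # assumed sampling settings; tune to taste
            top_p=0.95,
            temperature=0.9,
            pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
        )
    # The decoded text includes the original prompt followed by the continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_lyrics("Walking down the highway")` would return the prompt followed by a sampled continuation; because sampling is random, the output differs from run to run.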

Demonstration

A demonstration video (Sample.mp4) is included in this repository to showcase the model's lyric generation capabilities.

Note on Video Display on GitHub: Direct embedding of video files using HTML <video> tags is generally not supported on the GitHub website for security reasons.

To view the demonstration video:

Download: Click on the Sample.mp4 file in the repository file list to download and play it locally.
Direct Link: Access the video directly via this link: Sample.mp4.

A screenshot (demo.png) in the repository illustrates the lyric generation process within the Jupyter notebook.

Conclusion

This fine-tuned GPT-2 model provides a tool for automated song lyric generation, trained on a dataset of American singers' lyrics. We encourage users to experiment with the model and explore its creative potential.

For questions or feedback, please feel free to reach out.
