Transformer Quote Generator

A custom Transformer encoder-decoder implemented in PyTorch that generates quotes in a given style or category. The encoder receives a style tag (e.g., love, life, inspirational), and the decoder is trained to generate a quote that is contextually relevant and consistent with that style.

Dataset

The model is trained on the Quotes-500k Dataset from Kaggle.

Columns:
- quote
- author
- category: multiple tags

Data Processing:
- Trained on a randomized subset of ~200,000 entries.
- Targeted Styles: The model uses the first tag from the category column as the source input.
- Quality Control: Extremely long or short quotes were excluded during preprocessing to maintain structural consistency and prevent padding-related noise.
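
The preprocessing steps above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function name `prepare_pairs`, the word-count bounds (`min_words`, `max_words`), the comma-separated tag format, and the fixed seed are all assumptions.

```python
import random


def prepare_pairs(rows, min_words=4, max_words=40, subset_size=200_000, seed=0):
    """Turn raw dataset rows into (style, quote) training pairs.

    rows: iterable of dicts with "quote" and "category" keys.
    The length bounds and the comma-separated tag format are assumptions.
    """
    pairs = []
    for row in rows:
        quote = (row.get("quote") or "").strip()
        raw_tags = (row.get("category") or "").split(",")
        tags = [t.strip().lower() for t in raw_tags if t.strip()]
        if not quote or not tags:
            continue  # skip rows with no text or no tag
        n_words = len(quote.split())
        if n_words < min_words or n_words > max_words:
            continue  # drop extremely short or long quotes
        pairs.append((tags[0], quote))  # the first tag is the encoder input
    # Randomized subset, capped at subset_size entries.
    rng = random.Random(seed)
    rng.shuffle(pairs)
    return pairs[:subset_size]
```

Each returned pair holds the source text for the encoder (the style tag) and the target text for the decoder (the quote).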

Features

- Custom Transformer: Built from scratch, following the "Attention Is All You Need" paper.
- Optimized Architecture: Model dimension set to 256 to reduce memorization (overfitting) while keeping generations varied.
- Advanced Sampling: Supports Top-K and temperature sampling for diverse, natural-sounding text generation.
- Automated Evaluation: Integrated perplexity, BLEU, and METEOR metrics for performance tracking.
- TensorBoard Integration: Real-time monitoring of training loss, validation loss, and evaluation metrics.
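
To illustrate how Top-K and temperature sampling fit together, here is a minimal pure-Python sketch. It is not the code from `test.py`; the function name `sample_top_k` and its default values are assumptions, and a real implementation would usually operate on PyTorch tensors instead of lists.

```python
import math
import random


def sample_top_k(logits, k=50, temperature=1.0, rng=random):
    """Sample one token index from raw logits using temperature and Top-K."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature scaling: values below 1 sharpen the distribution,
    # values above 1 flatten it.
    scaled = [x / temperature for x in logits]
    # Keep only the indices of the k highest-scoring tokens.
    top = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:k]
    # Softmax over the surviving logits, shifted by the max for stability.
    peak = max(scaled[i] for i in top)
    weights = [math.exp(scaled[i] - peak) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]
```

With `k=1` this reduces to greedy decoding (always the highest-scoring token). Larger `k` and higher `temperature` trade consistency for variety.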

Architecture & Hyperparameters

After experimenting with larger configurations, the following settings were selected for training stability:

| Parameter | Value |
| --- | --- |
| Model Dimension ($d_{model}$) | 256 |
| Heads ($h$) | 8 |
| Layers ($N$) | 6 Encoder / 6 Decoder |
| Context Size | 96 tokens |
| Dropout | 0.3 |
| Label Smoothing | 0.15 |
| Batch Size | 128-256 |
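
As a rough sketch, the values above could be collected in a single configuration object. The class name `ModelConfig` and its field names are hypothetical and need not match what `config.py` actually defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical container for the hyperparameters listed above."""

    d_model: int = 256
    n_heads: int = 8
    n_layers: int = 6  # used for both the encoder and the decoder stack
    context_size: int = 96
    dropout: float = 0.3
    label_smoothing: float = 0.15
    batch_size: int = 128  # the listed range is 128 to 256

    def __post_init__(self):
        # Multi-head attention splits d_model evenly across the heads.
        if self.d_model % self.n_heads != 0:
            raise ValueError("d_model must be divisible by n_heads")

    @property
    def head_dim(self) -> int:
        """Dimension each attention head works on."""
        return self.d_model // self.n_heads
```

With the listed values, each attention head works on 256 / 8 = 32 dimensions.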

Training Analysis (Addressing Overfitting)

Initially, a larger model ($d_{model}=512$) overfit noticeably after the 10th epoch: validation loss began to diverge from training loss.

Solution:

- Reduced model capacity to 256 dimensions.
- Increased dropout to 0.3.
- Added a learning rate scheduler (ReduceLROnPlateau), which lowers the learning rate when the monitored loss stops improving, to stabilize the later epochs.
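
The idea behind a reduce-on-plateau schedule can be shown with a simplified pure-Python stand-in. This is not PyTorch's `torch.optim.lr_scheduler.ReduceLROnPlateau` and not the repository's training code: the class name `PlateauLR` and the `factor` and `patience` values are assumptions, and options such as thresholds and cooldown are left out.

```python
class PlateauLR:
    """Simplified illustration of a reduce-on-plateau learning-rate schedule."""

    def __init__(self, lr, factor=0.5, patience=2, min_lr=1e-6):
        self.lr = lr
        self.factor = factor      # multiplier applied on each reduction
        self.patience = patience  # epochs without improvement to tolerate
        self.min_lr = min_lr
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Call once per epoch with the validation loss; returns the current LR."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        # Reduce the LR only after more than `patience` epochs without improvement.
        if self.bad_epochs > self.patience:
            self.lr = max(self.lr * self.factor, self.min_lr)
            self.bad_epochs = 0
        return self.lr
```

In a training loop, `step` would be called with the validation loss at the end of every epoch, and the returned value applied to the optimizer.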

Installation

Kaggle Setup (GPU P100, Internet ON)

Essential Environment Fixes

!pip install -q "protobuf==3.20.3" --force-reinstall
!pip install -q evaluate datasets

Repo Management

!git clone https://github.com/aannjjiiccaa/transformer.git
%cd transformer

Visualization & Training

%load_ext tensorboard
%tensorboard --logdir runs/quotes

!python train.py

Local Inference Setup

After training the model on Kaggle, follow these steps to run it locally:

Download Model Assets

Move the following files from your Kaggle output to your local project directory:
- best_model.pt -> place in weights/
- tokenizer_src.json and tokenizer_tgt.json -> place in the root folder.

Environment Setup

Create a virtual environment to keep your dependencies isolated:
- Create venv
```
python -m venv venv
```
- Activate venv
  - Windows: .\venv\Scripts\activate
  - Linux/macOS: source venv/bin/activate

Install Requirements

Install the necessary libraries for inference:
```
pip install torch tokenizers
```
Run the Generator

Start the interactive inference session:
```
python inference.py
```

Tips:

- Categories: The model was trained on specific tags (e.g., love, life, inspirational). Try those first for the best results.
- Exit: Type exit to close the program.
- Performance: The script loads the weights once at startup, so subsequent generations do not pay the loading cost again.
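
The load-once pattern described in these tips could look roughly like this. It is a sketch rather than the contents of `inference.py`: the function `run_session` and the `generate` callable are hypothetical, with `generate` standing in for a model that was loaded a single time before the loop starts.

```python
def run_session(generate, lines, out=print):
    """Feed category names to `generate` until "exit" is entered.

    generate: callable taking a category string and returning a quote;
              it stands in for a model loaded once before the loop.
    lines:    iterable of user inputs.
    out:      callable used to display each generated quote.
    Returns the number of quotes generated.
    """
    count = 0
    for line in lines:
        category = line.strip().lower()
        if category == "exit":
            break
        if not category:
            continue  # ignore empty input
        out(generate(category))
        count += 1
    return count
```

For interactive use, `lines` could be `iter(input, "exit")`, which keeps prompting until the user types exit.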

Project Structure

- model.py: Core Transformer architecture (Attention, MultiHead, Encoder/Decoder).
- dataset.py: Data loading and custom Tokenizer logic.
- train.py: Training loop with validation and checkpointing.
- config.py: Centralized hyperparameters and path management.
- test.py: Inference engine with Top-K sampling logic.
- inference.py: Interactive CLI application for generating quotes by category.
