
aannjjiiccaa/transformer


Transformer Quote Generator

A custom Transformer Encoder-Decoder implementation in PyTorch that generates quotes in a requested style or category. The encoder receives a style tag (e.g., love, life, inspirational), and the decoder is trained to produce a quote that is relevant to that tag and stylistically consistent with it.

Dataset

The model is trained on the Quotes-500k Dataset from Kaggle.

  • Columns:
    • quote
    • author
    • category - multiple tags
  • Data Processing:
    • Trained on a randomized subset of ~200,000 entries.
    • Targeted Styles: The model uses the first tag from the category column as the source input.
    • Quality Control: Extremely long or short quotes were excluded during preprocessing to maintain structural consistency and prevent padding-related noise.
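The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the comma separator for tags, the word-count bounds `MIN_WORDS` and `MAX_WORDS`, and the function names are all assumptions, since the README does not state the real cutoffs.

```python
import random

# Assumed length bounds in whitespace-separated words (hypothetical values).
MIN_WORDS = 4
MAX_WORDS = 60


def first_tag(category: str) -> str:
    """Return the first tag of a comma-separated category string, lowercased."""
    return category.split(",")[0].strip().lower()


def build_pairs(rows, subset_size, seed=0):
    """Build (style, quote) pairs, drop out-of-range quotes, sample a random subset."""
    pairs = []
    for row in rows:
        quote = (row.get("quote") or "").strip()
        style = first_tag(row.get("category") or "")
        n_words = len(quote.split())
        # Skip rows with no tag or with a quote that is too short or too long.
        if not style or not (MIN_WORDS <= n_words <= MAX_WORDS):
            continue
        pairs.append((style, quote))
    rng = random.Random(seed)
    rng.shuffle(pairs)
    return pairs[:subset_size]


# Made-up rows, for illustration only.
rows = [
    {"quote": "The only way out is through the work itself.",
     "author": "Unknown", "category": "inspirational, life"},
    {"quote": "Too short.", "author": "Unknown", "category": "life"},
    {"quote": "Love grows when it is given away freely.",
     "author": "Unknown", "category": "love, life"},
]
pairs = build_pairs(rows, subset_size=200_000)
```

Each resulting pair uses the style tag as the encoder input and the quote as the decoder target.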

Features

  • Custom Transformer: Built from scratch based on the "Attention Is All You Need" paper.
  • Optimized Architecture: Model dimension reduced to 256 to limit memorization (overfitting) while keeping generations varied.
  • Advanced Sampling: Supports Top-K and Temperature sampling for diverse and natural text generation.
  • Automated Evaluation: Integrated Perplexity, BLEU and METEOR metrics for performance tracking.
  • TensorBoard Integration: Real-time monitoring of Training and Validation Loss and Evaluations.
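To make the sampling feature concrete, here is a small framework-free sketch of Top-K sampling with temperature. It only illustrates the technique; the function name `sample_top_k` and its defaults are assumptions, and the repository's own implementation presumably operates on PyTorch tensors.

```python
import math
import random


def sample_top_k(logits, k=10, temperature=1.0, rng=random):
    """Sample one token id from the k highest logits after temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    k = max(1, min(k, len(logits)))
    # Indices of the k largest logits; everything else gets zero probability.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Temperature below 1 sharpens the distribution, above 1 flattens it.
    scaled = [logits[i] / temperature for i in top]
    # Numerically stable softmax weights over the surviving logits.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]


logits = [2.0, 0.5, -1.0, 3.0]  # made-up scores for a 4-token vocabulary
token_id = sample_top_k(logits, k=2, temperature=0.8, rng=random.Random(0))
```

With `k=1` the function reduces to greedy decoding, while a larger `k` or a higher temperature trades predictability for diversity.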

Architecture & Hyperparameters

After experimenting with a larger model (see Training Analysis), the following configuration was selected for stable training:

| Parameter | Value |
| --- | --- |
| Model Dimension ($d_{model}$) | 256 |
| Heads ($h$) | 8 |
| Layers ($N$) | 6 Encoder / 6 Decoder |
| Context Size | 96 tokens |
| Dropout | 0.3 |
| Label Smoothing | 0.15 |
| Batch Size | 128 - 256 |
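For reference, the values above could be collected in a small configuration object like the one below. This is a sketch only: the class and field names are hypothetical and are not taken from the repository's `config.py`. It also makes one derived quantity explicit, the per-head dimension $d_{model} / h$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical container for the hyperparameters listed in the README."""
    d_model: int = 256
    num_heads: int = 8
    num_layers: int = 6          # used for both the encoder and the decoder stack
    context_size: int = 96       # maximum sequence length in tokens
    dropout: float = 0.3
    label_smoothing: float = 0.15
    batch_size: int = 128        # the README gives a range of 128 - 256

    def __post_init__(self):
        # Multi-head attention splits d_model evenly across the heads.
        if self.d_model % self.num_heads != 0:
            raise ValueError("d_model must be divisible by num_heads")

    @property
    def head_dim(self) -> int:
        """Dimension of each attention head."""
        return self.d_model // self.num_heads


cfg = ModelConfig()
print(cfg.head_dim)  # 256 / 8 = 32
```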

Training Analysis (Addressing Overfitting)

Initially, a larger model ($d_{model}=512$) overfit significantly after the 10th epoch, at which point Validation Loss began to diverge from Training Loss.

Solution:

  1. Reduced model capacity to 256 dimensions.
  2. Increased Dropout to 0.3.
  3. Implemented a Learning Rate Scheduler (ReduceLROnPlateau) to handle the steep loss gradients seen in later epochs.
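The scheduler from step 3 can be wired up roughly as shown here. `torch.optim.lr_scheduler.ReduceLROnPlateau` is the real PyTorch class, but the `factor` and `patience` values, the stand-in model, and the validation losses are assumptions made for illustration; the README does not state the settings used in `train.py`.

```python
import torch

model = torch.nn.Linear(4, 2)  # stand-in for the Transformer
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Halve the learning rate once validation loss has stopped improving
# for more than `patience` consecutive epochs (assumed values).
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=2
)

val_losses = [3.0, 2.5, 2.4, 2.4, 2.4, 2.4]  # made-up per-epoch validation losses
lrs = []
for val_loss in val_losses:
    # A real epoch would run training and validation here before this call.
    scheduler.step(val_loss)
    lrs.append(optimizer.param_groups[0]["lr"])
```

Unlike step-based schedulers, `ReduceLROnPlateau.step` takes the monitored metric as an argument, so it is called once per epoch after validation.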

Installation

Kaggle Setup (GPU P100, Internet ON)

  1. Essential Environment Fixes
     !pip install -q "protobuf==3.20.3" --force-reinstall
     !pip install -q evaluate datasets
  2. Repo Management
     !git clone https://github.com/aannjjiiccaa/transformer.git
     %cd transformer
  3. Visualization & Training
     %load_ext tensorboard
     %tensorboard --logdir runs/quotes

     !python train.py

Local Inference Setup

After training the model on Kaggle, follow these steps to run it locally:

  1. Download Model Assets

    Move the following files from your Kaggle output to your local project directory:

    • best_model.pt -> place in weights/
    • tokenizer_src.json and tokenizer_tgt.json -> place in the root folder.
  2. Environment Setup

    Create a virtual environment to keep your dependencies isolated:

    • Create venv
    python -m venv venv
    • Activate venv
      • Windows: .\venv\Scripts\activate
      • Linux/macOS: source venv/bin/activate
  3. Install Requirements

    Install the necessary libraries for inference:

    pip install torch tokenizers
  4. Run the Generator

    Start the interactive inference session:

    python inference.py
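Before starting the generator, it can help to confirm that the files from step 1 are where the script expects them. The helper below is a hypothetical convenience, not part of the repository; it only assumes the file locations listed in step 1.

```python
from pathlib import Path

# Locations taken from step 1, relative to the project root.
REQUIRED_ASSETS = [
    "weights/best_model.pt",
    "tokenizer_src.json",
    "tokenizer_tgt.json",
]


def missing_assets(project_root="."):
    """Return the required asset paths that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in REQUIRED_ASSETS if not (root / rel).is_file()]


missing = missing_assets()
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All assets found.")
```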

Tips:

  • Categories: The model was trained on specific tags (e.g. love, life, inspirational). Try those first for the best results.
  • Exit: Simply type exit to close the program.
  • Performance: The script is optimized to load the weights once, so subsequent generations are near-instant.
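The load-once behaviour and the exit command described above follow a common pattern, sketched here under assumptions. `run_session` and the `generate` callable are hypothetical names; in a real script `generate` would wrap a model that was loaded a single time before the loop starts.

```python
def run_session(generate, read=input, write=print):
    """Prompt for categories and print quotes until the user types 'exit'."""
    while True:
        category = read("Category (or 'exit'): ").strip().lower()
        if category == "exit":
            break
        if category:  # ignore empty input
            write(generate(category))


# Scripted demo so the sketch runs without a trained model or a keyboard.
scripted = iter(["love", "", "exit"])
outputs = []
run_session(
    generate=lambda category: f"<quote about {category}>",
    read=lambda prompt: next(scripted),
    write=outputs.append,
)
```

Because the expensive model loading happens outside the loop, each iteration only pays for generation.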

Project Structure

  • model.py: Core Transformer architecture (Attention, MultiHead, Encoder/Decoder).

  • dataset.py: Data loading and custom Tokenizer logic.

  • train.py: Training loop with validation and checkpointing.

  • config.py: Centralized hyperparameters and path management.

  • test.py: Inference engine with Top-K sampling logic.

  • inference.py: An interactive CLI application that allows users to generate quotes by category.
