Quantum GPT (Hybrid QNN-NanoGPT)

A hybrid quantum-classical implementation of a Generative Pre-trained Transformer (GPT). This project adapts Andrej Karpathy's nanoGPT architecture by replacing the classical linear layers in the self-attention mechanism with Variational Quantum Circuits (VQCs) built with PennyLane.

🚀 Scientific Concept

In a standard Transformer, each attention head projects the token embeddings into Query, Key, and Value spaces using learned weight matrices ($W_Q, W_K, W_V$).

In this Quantum-Hybrid architecture, we replace these dense layers with a parameterized quantum evolution:

$$ x \xrightarrow{\text{Adapter}} z \in \mathbb{R}^n \xrightarrow{R(\phi)} |\psi(z)\rangle \xrightarrow{U(\theta)_{\text{entangle}}} \langle Z \rangle \to y $$

Where:

Adapter: A classical bottleneck layer that compresses the high-dimensional embedding $x$ into a vector $z$ with one entry per qubit ($n$ qubits).
$R(\phi)$: Angle embedding that encodes the classical vector $z$ into the quantum state $|\psi(z)\rangle$ through single-qubit rotations.
$U(\theta)$: A sequence of trainable entangling layers (Strongly Entangling Layers) with parameters $\theta$.
$\langle Z \rangle$: Pauli-Z expectation values, which form the projected vector $y$.

Why?

This architecture lets us study whether the high-dimensional Hilbert space and quantum interference can capture semantic relationships with fewer parameters than classical linear layers, despite the constraints of simulating NISQ-scale circuits. More broadly, it serves as a testbed for the expressivity of quantum circuits in a sequence modeling task.

Note: We employ a Quantum Bottleneck architecture. High-dimensional classical embeddings are projected down to a lower-dimensional quantum latent space via a trainable adapter, processed by the VQC, and projected back. This maintains computational feasibility while exploiting quantum interference.

📂 Project Structure

quantum-transformer/
├── checkpoints/                # Saved models
├── data/                       # Input text data
├── src/                        # Source code
│   ├── config.py               # Hyperparameters & flags
│   ├── dataset.py              # Tokenizer & Dataloader
│   ├── model.py                # Transformer Architecture
│   └── quantum_layers.py       # PennyLane Circuits & Hybrid Layers
├── main.py                     # Entry point (Train/Generate)
└── requirements.txt            # Dependencies

🛠️ Installation

Clone the repository:

git clone https://github.com/lorenzomaiuri-dev/quantum-gpt.git
cd quantum-gpt

Install dependencies:

pip install -r requirements.txt

⚡ Usage

Training

To train the model on the Shakespeare dataset (included in data/):

python main.py --mode train

Note: Quantum simulation is CPU-intensive. The default configuration uses a "Quantum Bottleneck" (4-8 qubits) to keep training times feasible on consumer hardware.

Generation

To generate text using the trained checkpoint:

python main.py --mode generate

⚙️ Configuration

You can modify hyperparameters in src/config.py:

# Quantum Settings
USE_QUANTUM = True      # Set False to use standard Linear Layers
N_QUBITS = 4            # Number of qubits per head
N_QLAYERS = 2           # Depth of the quantum circuit

🧠 Architecture Details

Embedding Dimension: 8 (scaled down for simulation speed)
Heads: 2
Qubits per Head: 4
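
To get a feel for the parameter budget these numbers imply, the arithmetic can be written out as below. This is a back-of-the-envelope sketch under stated assumptions: the head size is taken as embedding dimension divided by heads, the classical projection is bias-free as in nanoGPT, and the hybrid variant is assumed to use a biased linear adapter in and out around the circuit. The actual totals depend on how src/quantum_layers.py is written.

```python
def linear_params(n_in: int, n_out: int, bias: bool = True) -> int:
    """Trainable parameters of a dense layer."""
    return n_in * n_out + (n_out if bias else 0)


def strongly_entangling_params(n_layers: int, n_qubits: int) -> int:
    """Strongly Entangling Layers use 3 rotation angles per qubit per layer."""
    return n_layers * n_qubits * 3


N_EMBD, N_HEAD = 8, 2
N_QUBITS, N_QLAYERS = 4, 2
head_size = N_EMBD // N_HEAD  # assumed: 8 / 2 = 4

# One classical projection (Q, K, or V) of a single head, no bias.
classical = linear_params(N_EMBD, head_size, bias=False)

# Circuit weights alone.
vqc_only = strongly_entangling_params(N_QLAYERS, N_QUBITS)

# Assumed hybrid projection: adapter in + circuit + projection out.
hybrid = (
    linear_params(N_EMBD, N_QUBITS)
    + vqc_only
    + linear_params(N_QUBITS, head_size)
)

print(f"classical projection: {classical}")
print(f"circuit weights only: {vqc_only}")
print(f"hybrid incl. adapters: {hybrid}")
```

Under these assumptions the circuit itself is smaller than the dense projection, but the classical adapters around it add parameters of their own, so any parameter-efficiency comparison should state which of the two counts it uses.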

📊 Preliminary Results (Coming Soon)

Planned comparison between classical (64 params) and hybrid quantum (4 qubits) attention heads:

Loss Convergence: Comparing training stability.
Parameter Efficiency: Can quantum circuits learn with fewer parameters?
Runtime Analysis: Quantifying the overhead of quantum simulation.

🙏 Acknowledgements

Andrej Karpathy for the original nanoGPT and the accompanying video lecture.
Xanadu for the PennyLane library used for quantum machine learning.
