phnazari/quark

Quark

Python 3.12+ | PyTorch | flash-linear-attention | License: MIT

A minimal playground for language modeling research. The goal is to provide a clean, hackable base for training and experimenting with GPT-style models without the overhead of a large framework. Quark ships with a standard pre-norm transformer (causal attention + SwiGLU MLP) and a training pipeline built on Hydra, W&B, and DDP. The pipeline is adapted from PlainLM.
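
To make "pre-norm transformer (causal attention + SwiGLU MLP)" concrete, here is a minimal PyTorch sketch of one such block. It is not taken from the Quark source: the class names, the use of `nn.LayerNorm`, the 4x hidden width, and the default sizes are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwiGLU(nn.Module):
    """Gated MLP: down(silu(gate(x)) * up(x))."""

    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)
        self.up = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.silu(self.gate(x)) * self.up(x))


class CausalSelfAttention(nn.Module):
    """Multi-head self-attention where position t only sees positions <= t."""

    def __init__(self, dim: int, n_heads: int):
        super().__init__()
        assert dim % n_heads == 0, "dim must be divisible by n_heads"
        self.n_heads = n_heads
        self.qkv = nn.Linear(dim, 3 * dim, bias=False)
        self.proj = nn.Linear(dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # (batch, time, dim) -> (batch, heads, time, head_dim)
        q, k, v = (z.view(b, t, self.n_heads, -1).transpose(1, 2) for z in (q, k, v))
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.proj(y.transpose(1, 2).reshape(b, t, d))


class Block(nn.Module):
    """Pre-norm residual block: normalize, apply sublayer, add back to the input."""

    def __init__(self, dim: int = 64, n_heads: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)  # assumption: the repo may use a different norm
        self.attn = CausalSelfAttention(dim, n_heads)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = SwiGLU(dim, 4 * dim)  # assumption: 4x hidden width

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.attn(self.norm1(x))
        x = x + self.mlp(self.norm2(x))
        return x
```

"Pre-norm" refers to normalizing the input of each sublayer rather than its output, so the residual path stays an unnormalized sum of sublayer contributions.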

Setup

Clone the repo with submodules (required for flash-linear-attention):

git clone --recurse-submodules https://github.com/philippnazari/quark

If you already cloned without --recurse-submodules, fetch the submodules with:

git submodule update --init --recursive

Then install dependencies:

uv sync

For development (adds ruff and pre-commit), install the dev extra and then the git hooks:

uv sync --extra dev
uv run pre-commit install

Data

Download and tokenize the FineWeb-Edu sample-10BT subset into chunked Arrow files. This only needs to run once; the result is reused across training runs.

.venv/bin/python -m data.datasets.prepare \
  --dataset_path HuggingFaceFW/fineweb-edu \
  --dataset_name sample-10BT \
  --tokenizer gpt2 \
  --seq_length 256 \
  --out_path data/fineweb10B \
  --n_tokens_valid 10000000
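
As a rough illustration of what `--seq_length` and `--n_tokens_valid` could mean, the sketch below cuts a token stream into fixed-length sequences and holds out a validation slice. The function names are hypothetical, and the rules it applies (drop the trailing partial chunk, take validation from the front) are assumptions; the actual `data.datasets.prepare` script may pack documents or split differently.

```python
from typing import Iterable, Iterator


def chunk_tokens(token_stream: Iterable[int], seq_length: int) -> Iterator[list[int]]:
    """Yield consecutive sequences of exactly seq_length tokens.

    A trailing partial chunk is dropped (an assumption of this sketch).
    """
    buf: list[int] = []
    for tok in token_stream:
        buf.append(tok)
        if len(buf) == seq_length:
            yield buf
            buf = []


def split_train_valid(
    sequences: Iterable[list[int]], seq_length: int, n_tokens_valid: int
) -> tuple[list[list[int]], list[list[int]]]:
    """Hold out the first n_tokens_valid // seq_length sequences for validation."""
    n_valid = n_tokens_valid // seq_length
    seqs = list(sequences)
    return seqs[n_valid:], seqs[:n_valid]


# Toy example: 10 tokens, sequences of 4, one sequence held out.
train, valid = split_train_valid(chunk_tokens(range(10), 4), seq_length=4, n_tokens_valid=4)
```

Under this sketch's rule, the values in the command above would hold out 10000000 // 256 = 39062 sequences for validation.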

Training

Config is managed by Hydra (configs/). All keys can be overridden from the CLI.

Train the default transformer:

.venv/bin/python train.py

Train DeltaNet:

.venv/bin/python train.py model=delta_net

Train GLA (Gated Linear Attention):

.venv/bin/python train.py model=gla

Scale to multiple GPUs with DDP:

torchrun --standalone --nproc_per_node=4 train.py model=delta_net

For example, train DeltaNet with a custom learning rate and step budget:

.venv/bin/python train.py model=delta_net training.lr=1e-4 training.steps_budget=10000
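
The dotted `key=value` form addresses a nested config entry. The standard-library sketch below mimics that merge on a plain dict to show the semantics; it is not Hydra's implementation, `apply_overrides` is a hypothetical helper, and the base values are made up. It only models value overrides: `model=delta_net` selects a config group, which Hydra handles separately.

```python
import ast
import copy


def parse_value(text: str):
    """Interpret numbers and booleans; leave everything else as a string."""
    if text.lower() in ("true", "false"):
        return text.lower() == "true"
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_overrides(cfg: dict, overrides: list[str]) -> dict:
    """Return a copy of cfg with dotted key=value overrides applied.

    Unknown keys raise KeyError, mirroring the fact that Hydra rejects
    overrides for keys that do not exist unless they are explicitly added.
    """
    out = copy.deepcopy(cfg)
    for item in overrides:
        key, sep, raw = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        *parents, leaf = key.split(".")
        node = out
        for part in parents:
            node = node[part]  # KeyError if the path does not exist
        if leaf not in node:
            raise KeyError(key)
        node[leaf] = parse_value(raw)
    return out


# Hypothetical base values, for illustration only.
base = {"training": {"lr": 3e-4, "steps_budget": 1000}}
resolved = apply_overrides(base, ["training.lr=1e-4", "training.steps_budget=10000"])
```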

Print the fully resolved config without running:

.venv/bin/python train.py --cfg job

Training logs to W&B and optionally saves checkpoints to out_dir/exp_name (configured in configs/config.yaml).

W&B

Before the first run, authenticate:

uv run wandb login

The W&B project name and run name are set in configs/config.yaml:

logging:
  wandb_project: quark        # project name on wandb.ai
  wandb_log: true

checkpoint:
  exp_name: my_experiment     # also used as the run name in W&B
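
One plausible way these keys could map onto the `wandb.init` arguments is sketched below. `wandb_init_kwargs` is a hypothetical helper, and treating `wandb_log: false` as `mode="disabled"` is an assumption; the real `train.py` may skip W&B initialization entirely in that case.

```python
def wandb_init_kwargs(cfg: dict) -> dict:
    """Build keyword arguments for wandb.init from the config keys shown above."""
    return {
        "project": cfg["logging"]["wandb_project"],
        "name": cfg["checkpoint"]["exp_name"],  # exp_name doubles as the run name
        "mode": "online" if cfg["logging"]["wandb_log"] else "disabled",
    }


cfg = {
    "logging": {"wandb_project": "quark", "wandb_log": True},
    "checkpoint": {"exp_name": "my_experiment"},
}
kwargs = wandb_init_kwargs(cfg)
```

The result would then be passed as `wandb.init(**kwargs)`.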
