NeuCLIP: Efficient Large-Scale CLIP Training with Neural Normalizer Optimization

Paper link: OpenReview

How can we efficiently estimate the normalization term in the contrastive loss?

  • We study the problem of estimating the normalization term in the contrastive loss (i.e., the sum of exponentiated similarities with all negative samples).
  • Via convex analysis, we reformulate the contrastive loss for each sample as a minimization problem with an auxiliary variable representing its log-normalizer.
  • We then leverage a compact neural network to predict the log-normalizers, which is justified by a variational analysis.
  • We design an alternating optimization algorithm, named NeuCLIP, that jointly trains the CLIP model and the auxiliary network.
  • We conduct extensive experiments on various datasets to validate the effectiveness of NeuCLIP.
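The reformulation in the second bullet rests on a standard convex-analysis identity: for any u > 0, log u = min over a of (a + u·exp(−a) − 1), with the minimum attained at a = log u. The sketch below checks this numerically; the similarity scores and variable names are illustrative assumptions, not values or identifiers from the NeuCLIP code.

```python
import math

# Variational form of the log-normalizer:
# for any u > 0,  log u = min_a ( a + u * exp(-a) - 1 ),
# attained at a = log u. Here u plays the role of a sample's
# normalizer Z = sum_j exp(s_j) over (made-up) similarity scores.

def variational_objective(a, u):
    """Upper bound on log(u), tight when a = log(u)."""
    return a + u * math.exp(-a) - 1.0

# A toy normalizer from hypothetical similarity scores.
sims = [0.9, 0.1, -0.3, 0.5]
u = sum(math.exp(s) for s in sims)

# The bound is tight at a* = log(u) ...
a_star = math.log(u)
assert abs(variational_objective(a_star, u) - math.log(u)) < 1e-12

# ... and strictly larger at any other a.
for a in (a_star - 1.0, a_star + 0.5):
    assert variational_objective(a, u) > math.log(u)

print("minimizing over a recovers log Z =", a_star)
```

Because the minimizer of this objective is exactly the log-normalizer, predicting the auxiliary variable with a small network (and alternating its updates with the CLIP model's) sidesteps summing over all negatives at every step.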


Experiment Results

Comparison with baselines: In the following figure, we present the Datacomp average performance (left), ImageNet & variants performance (middle), and retrieval performance (right) of different methods trained on DFN-14M.

[Figure: comparison of NeuCLIP with baseline methods on DFN-14M]

Getting Started

Environment Setup

To set up the training environment:

  1. Clone this repository:
    git clone https://github.com/Optimization-AI/NeuCLIP.git
    cd NeuCLIP
  2. Create a new environment:
    conda create -n fastclip python=3.11
    conda activate fastclip
    pip install -r requirements-training.txt

Training

We provide a sample Slurm script to run NeuCLIP on DFN-14M.

Sample script to run NeuCLIP on DFN-14M using 8 GPUs (2 nodes and 4 GPUs per node)
#!/bin/bash
#SBATCH --time=2-00:00:00
#SBATCH --mem=120G
#SBATCH --nodes=2
#SBATCH --gres=gpu:4
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=6
#SBATCH --job-name=neuclip
#SBATCH --partition=gpu
#SBATCH --output=./job_output/%x_%j.log

source ~/.bashrc
conda activate fastclip

master_addr=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)
export MASTER_ADDR=$master_addr
export MASTER_PORT=12805

export CUDA_VISIBLE_DEVICES='0,1,2,3'
export PYTHONPATH="$PYTHONPATH:$PWD/src"
export HUGGINGFACE_HUB_CACHE='./checkpoints/huggingface'

srun python -u src/training/main.py \
    --save-frequency 1 \
    --train-data './datasets/dfn2b/medium/shards/0000{0000..1926}.tar' \
    --train-num-samples 13710637 --data_size 19270000 \
    --warmup 500 \
    --batch-size 512 \
    --epochs 24 \
    --workers 6 \
    --model ViT-B-32 \
    --name neuclip \
    --seed 2026 \
    --wd 0.2 \
    --local-loss \
    --fastclip --multiply_tau --temperature_scheme global_learnable --temperature 0.07 \
    --lr 5e-4 --lr_tau 6.25e-5 --lr_tau_scheduler step_thresh --rho 11.0 --fastclip_eps 1e-6 \
    --gamma 0.42 --gamma_schedule cosine --gamma_decay_epochs 24 \
    --npn --npn_lr 1.0 --npn_num_protos 4096 --npn_repetition 10 --npn_restart_iter 500

Evaluation

We use the Datacomp benchmark to evaluate the performance of trained models. We refer users to the Datacomp GitHub repository for detailed instructions on running the evaluation. Alternatively, we provide a modified fork that simplifies the process. To run the evaluation, first prepare the environment, clone the repository, and download the evaluation datasets:

# create the evaluation environment
env_name='fastclip_eval'
conda create -n "$env_name" python=3.11
conda activate "$env_name"
pip install -r requirements-eval.txt

# clone the datacomp repository
git clone -b project git@github.com:xywei00/datacomp.git

# download the evaluation datasets to `./datasets/datacomp`
python ./datacomp/download_evalsets.py ./datasets/datacomp

To evaluate a trained CLIP ViT-B/32 model at epoch 24, run the following command:

# train_output_dir should be the one containing 'checkpoints', 'out.log', etc.
train_output_dir='./logs/name'
data_dir='./datasets/datacomp'
arch='ViT-B-32'
epoch=24

python ./datacomp/evaluate.py --train_output_dir "${train_output_dir}" --data_dir "${data_dir}" --epoch "${epoch}" --arch "${arch}"

Citing NeuCLIP

If you find NeuCLIP useful in your research, please consider citing the following paper:

@inproceedings{wei2026neuclip,
  title={Neu{CLIP}: Efficient Large-Scale {CLIP} Training with Neural Normalizer Optimization},
  author={Xiyuan Wei and Chih-Jen Lin and Tianbao Yang},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=WoMMSVZHfP}
}
