This repository contains a three-stage training pipeline:
- Baseline YOLO training (standard Ultralytics training)
- Contrastive pretraining (YOLO image encoder + LLM text encoder, CLIP-style)
- LLM-guided fine-tuning (modified Ultralytics trainer)
## Requirements

- Python 3.8+
- PyTorch
- Ultralytics (official + modified version for Stage 3)
## Installation

Install dependencies (example):

```bash
pip install ultralytics torch torchvision torchaudio
```

## Stage 1: Baseline YOLO Training

Train a vanilla YOLO detector using the official Ultralytics pipeline.
```python
from ultralytics import YOLO

model = YOLO("yolo11m.yaml")
train_results = model.train(
    data="/chest.yaml",
    epochs=500,
    imgsz=640,
    device="0",
)
```

`chest.yaml` should define `train`, `val`, and `names`. `yolo11m.yaml` can be replaced with other YOLO configs depending on compute resources.
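A minimal `chest.yaml` might look like the following (the paths and class names here are illustrative placeholders, not the repo's actual dataset layout):

```yaml
# Illustrative Ultralytics dataset config; adapt paths and names to your data.
path: /data/chest          # dataset root
train: images/train        # training images, relative to path
val: images/val            # validation images, relative to path
names:
  0: nodule
  1: pneumonia
  2: cardiomegaly
```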
## Stage 2: Contrastive Pretraining

We perform CLIP-style contrastive pretraining between image features (YOLO backbone) and text features (LLM encoder) using PadChest ROI-level data.
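The CLIP-style objective aligns each ROI image embedding with its paired sentence embedding via a symmetric InfoNCE loss. A minimal sketch (function and variable names are illustrative, not the repo's API; the temperature matches the `--temperature 0.07` flag):

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb: torch.Tensor,
                          text_emb: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired (image, text) embeddings."""
    # L2-normalize so dot products become cosine similarities.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    # (B, B) similarity matrix, scaled by temperature.
    logits = image_emb @ text_emb.t() / temperature
    # Matching image/text pairs lie on the diagonal.
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)   # image -> text direction
    loss_t2i = F.cross_entropy(logits.t(), targets)  # text -> image direction
    return (loss_i2t + loss_t2i) / 2

# Toy usage with random features standing in for encoder outputs.
img = torch.randn(16, 512)
txt = torch.randn(16, 512)
loss = clip_contrastive_loss(img, txt)
```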
```bash
python pretraining.py \
    --csv_path /root/autodl-tmp/dataset/PadChest-GR-yolo-6labels/roi256/roi256_box_sentence.csv \
    --batch_size 16 \
    --epochs 20 \
    --lr 5e-5 \
    --weight_decay 1e-2 \
    --temperature 0.07 \
    --device cuda:0 \
    --textencoder llama2 \
    --llama_rep Llama-2-7b-chat-hf \
    --context True \
    --context_length 8 \
    --n_prompts 2
```

## Stage 3: LLM-Guided Fine-Tuning

Fine-tune the YOLO detector using LLM-guided features.
```bash
python train.py
```

- The Ultralytics trainer has been modified to support text-guided modules.
- Pretrained weights from Stage 2 can be loaded for initialization.
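Initializing from Stage 2 typically means copying only the checkpoint tensors whose names and shapes match the detector's backbone, leaving the detection heads randomly initialized. A hedged sketch of that filtering step (the helper and the toy modules are illustrative, not the repo's actual loading code):

```python
import torch
import torch.nn as nn

def load_matching_weights(model: nn.Module, ckpt_state: dict) -> int:
    """Copy checkpoint tensors into `model` wherever name and shape match.

    Returns the number of tensors loaded. Non-strict loading like this lets a
    contrastively pretrained backbone initialize a detector whose heads have
    no counterpart in the checkpoint.
    """
    model_state = model.state_dict()
    matched = {k: v for k, v in ckpt_state.items()
               if k in model_state and v.shape == model_state[k].shape}
    model.load_state_dict(matched, strict=False)
    return len(matched)

# Toy demonstration: a "backbone" checkpoint partially initializes a larger model.
backbone = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 8))
detector = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 8),
                         nn.Linear(8, 2))  # extra "head" layer stays random
n = load_matching_weights(detector, backbone.state_dict())
```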
## Quick Start

Baseline training:

```bash
python -c "from ultralytics import YOLO; YOLO('yolo11m.yaml').train(data='/chest.yaml', epochs=500, imgsz=640, device='0')"
```

Contrastive pretraining:

```bash
python pretraining.py --device cuda:0
```

LLM-guided fine-tuning:

```bash
python train.py
```