CLIP for classification using description.

Introduction

CLIP is trained with huge of (image, caption) paired data. Therefore we can assume that the model, CLIP, has its “general” knowledge between image and text captured from nature.

Consequently, CLIP is the most beloved model for zero-shot image or text tasks. However it is not feasible to apply for specific “subtle” tasks. For instance, classifying what type of wallpaper fault an image is certainly hard for CLIP as it is not “general”.

Therefore we need to fine-tune CLIP model to apply it for specific tasks. It may is done by following Finetune like you pretrain: Improved finetuning of zero-shot vision models.

Here is the things.

Implemented code to use CLIP of HuggingFace model.
Use Pytorch Lightning to fully utilize machine power.
Constrain training procedure using user-defined descriptions.

Installation

conda env create -f environments.yaml

Usage

Create your dataset for classification.

The dataset has to follow below structure.

train_dataset_dir
├── class0
├── class1
├── class2
...
├── classN-3
├── classN-2
└── classN-1

Open cfg.py and write a path of train_dataset_dir (e.g. /home/path/to/dset)

Open info/class_descriptions.yaml and Write descriptions for “every” each class as follow

"가구수정": separation of wallpaper that is attached next to furniture with compartments, a defect that occurs in places like built-in wardrobes or drawers
"걸레받이수정": gap that has occurred between the mopboard and the wallpaper, mopboard is a material used to connect the side of the wallpaper and the floor, and this describes a defect that has occurred around that area
"곰팡이": blue mold that occurs between the wallpaper surface or moldings and the wallpaper. The defect arises over time in damp conditions or due to water leakage
...
classK: descrption for classK
...
"틈새과다": excessive gaps that have occurred between the wallpaper surface and moldings
"피스": tearing that occurred in the wallpaper surface due to improperly installed screws
"훼손": damage to the wallpaper surface itself and the damage occurring between the wallpaper and moldings

Train!
```
python train.py
```

See CFG in cfg.py

It governs to every training procedure.

class CFG():
	# model 
  batch_size = 1 # for now, batch size is fixed to 1
  img_transform_size_W = img_transform_size_H = 512
  num_classes = -1 # automatically calculated by train_dataset_dir
  label_smoothing = 0.1 # use label smoothing
  
  sim_weight = 0.5 # weight that multiplied to similarity loss proposed in paper CLIP.
  fc_weight = 0.5  # weight that multiplied to classificatio loss

	# optimizer setting you should 
  lr = 1e-6
  optim_betas = (0.9,0.98)
  optim_eps = 1e-8
  optim_weight_decay = 0.05
  temperature = 1.072508 # (exp(t)), t=0.07 from CLIP paper
  
  # dataset
  test_size = 0.2
  train_transforms = transforms.Compose([
                      transforms.Resize((img_transform_size_W, img_transform_size_H)),
                      transforms.TrivialAugmentWide(),
                      transforms.ToTensor(),
                  ])
  val_transforms = transforms.Compose([
                      transforms.Resize((img_transform_size_W, img_transform_size_H)),
                      transforms.ToTensor(),
                  ])
  train_dataset_dir = "/path/to/training/dataset/of/yours"
  class_description_yaml_file = "/path/to/class_description/yaml/written/in/step4"

References

https://dacon.io/competitions/official/236082/data

https://www.analyticsvidhya.com/blog/2021/06/5-techniques-to-handle-imbalanced-data-for-a-classification-problem/

https://ratsgo.github.io/insight-notes/docs/interpretable/smoothing

https://arxiv.org/pdf/2311.12065.pdf

https://sebastianraschka.com/blog/2023/data-augmentation-pytorch.html

https://youtu.be/jzJHgL9VGV4?si=R_FQq_XerelO6tRB

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
info		info
models		models
src		src
.gitignore		.gitignore
README.md		README.md
cfg.py		cfg.py
environments.yaml		environments.yaml
inference.ipynb		inference.ipynb
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CLIP for classification using description.

Introduction

Installation

Usage

See CFG in cfg.py

References

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

CLIP for classification using description.

Introduction

Installation

Usage

See CFG in cfg.py

References

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages