$\texttt{ModSCAN}$

This is the official public repository of the paper $\texttt{ModSCAN}$: Measuring Stereotypical Bias in Large Vision-Language Models from Vision and Language Modalities.
All future updates will be released here first.

Be careful! This repository may contain potentially unsafe information. User discretion is advised.

How to use this repository?

A. Install and set up the environment

Clone this repository.
Prepare the Python environment:

conda create -n modscan python=3.10 -y
conda activate modscan
cd PATH_TO_THE_REPOSITORY
bash prepare.sh

B. Generate and/or download our benchmark dataset.

Dataset for the Vision Modality Task

Download the official UTKFace dataset here to the directory datasets/UTKFace/.
Command to process the UTKFace dataset for the vision modality task:

python datasets_process/process_UTKFace.py \
--target gender \
--mitigation sr

$target is the stereotypical attribute being evaluated.
$mitigation is the method, if any, used to reduce stereotypical bias.
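For orientation, UTKFace stores its labels in the image filenames, in the form `[age]_[gender]_[race]_[date&time].jpg`. The sketch below shows one way to read the attribute for a given `--target` from such a filename. It is only an illustration under that assumption; `parse_utkface_filename` and `FaceLabel` are hypothetical names, and this is not the code in `datasets_process/process_UTKFace.py`.

```python
from pathlib import Path
from typing import NamedTuple, Optional

# Label codes as documented by the UTKFace dataset.
GENDER = {0: "male", 1: "female"}
RACE = {0: "White", 1: "Black", 2: "Asian", 3: "Indian", 4: "Others"}


class FaceLabel(NamedTuple):
    age: int
    gender: str
    race: str


def parse_utkface_filename(path: str) -> Optional[FaceLabel]:
    """Hypothetical helper: decode labels from a UTKFace filename.

    Returns None when the filename does not follow the
    [age]_[gender]_[race]_[date&time] pattern.
    """
    parts = Path(path).name.split("_")
    if len(parts) < 4:
        return None  # some files in the dataset lack a field
    try:
        age, gender, race = int(parts[0]), int(parts[1]), int(parts[2])
    except ValueError:
        return None
    if gender not in GENDER or race not in RACE:
        return None
    return FaceLabel(age, GENDER[gender], RACE[race])


label = parse_utkface_filename(
    "datasets/UTKFace/25_1_0_20170116174525125.jpg.chip.jpg"
)
if label is not None:
    # getattr(label, "gender") mirrors choosing an attribute via --target.
    print(getattr(label, "gender"))
```

Skipping malformed filenames rather than raising keeps a batch run going when a few images lack a label field.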

Dataset for the Language Modality Task

Download our self-generated (SelfGen) dataset here to the directory datasets/SelfGen/, or generate it yourself with the model described in the original paper (all details are provided in the paper).
To add the vision debiasing prompt to the SelfGen images, run the command:

python datasets_process/process_SelfGen.py
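Before running the processing or evaluation scripts, it can help to confirm that both datasets are where the steps above place them. The following is a minimal sketch, assuming the two directories named above sit under the repository root; `missing_dataset_dirs` is a hypothetical helper and is not part of this repository.

```python
from pathlib import Path
from typing import List, Sequence

# Directory names taken from the download steps above.
EXPECTED_DIRS = ("datasets/UTKFace", "datasets/SelfGen")


def missing_dataset_dirs(
    repo_root: str, expected: Sequence[str] = EXPECTED_DIRS
) -> List[str]:
    """Hypothetical helper: list expected dataset directories that are absent."""
    root = Path(repo_root)
    return [d for d in expected if not (root / d).is_dir()]


# Check relative to the current working directory (the repository root).
missing = missing_dataset_dirs(".")
if missing:
    print("Missing dataset directories:", ", ".join(missing))
else:
    print("All dataset directories found.")
```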

C. Start your own journey: Now you can evaluate your own large vision-language models using our benchmark datasets. Good luck!

A Practical Example on LLaVA-v1.5

Prepare the environment for the LLaVA-1.5 model by following the steps on the LLaVA-1.5 website.
Command to evaluate the vision modality task on UTKFace:

python eval/query_UTKFace_multiple_faces_1_occupation.py \
--target gender \
--key occupations \
--mode None

$target is the evaluated stereotypical attribute.
$key is the evaluated stereotypical scenario.
$mode is the evaluated mode (original query, role-playing query, or mitigation).

Command to evaluate the language modality task on SelfGen:

python eval/query_SelfGen_1_scene_1_attribute.py \
--target gender \
--mode None

$target is the evaluated stereotypical attribute.
$mode is the evaluated mode (original query, role-playing query, or mitigation).
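When running several evaluations, the two commands above can be assembled programmatically. The sketch below builds the argument lists using only the script paths, flags, and values shown in this README; `build_eval_command` is a hypothetical helper, and no other flag values are assumed to exist.

```python
import shlex
import sys
from typing import List, Optional


def build_eval_command(
    script: str, target: str, mode: str, key: Optional[str] = None
) -> List[str]:
    """Hypothetical helper: assemble an evaluation command as an argv list."""
    cmd = [sys.executable, script, "--target", target]
    if key is not None:
        # --key is only shown for the UTKFace (vision modality) script.
        cmd += ["--key", key]
    cmd += ["--mode", mode]
    return cmd


commands = [
    build_eval_command(
        "eval/query_UTKFace_multiple_faces_1_occupation.py",
        target="gender",
        mode="None",
        key="occupations",
    ),
    build_eval_command(
        "eval/query_SelfGen_1_scene_1_attribute.py",
        target="gender",
        mode="None",
    ),
]

for cmd in commands:
    # Print only; pass cmd to subprocess.run(cmd, check=True) to execute.
    print(shlex.join(cmd))
```

Building an argv list instead of a shell string avoids quoting problems if a path or value contains spaces.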
