Trustworthy and Fair SkinGPT-R1 for Democratizing Dermatological Reasoning across Diverse Ethnicities
SkinGPT-R1 is a dermatological reasoning vision-language model. 🩺✨
The Chinese University of Hong Kong, Shenzhen
- We will soon release the SkinGPT-R1-7B weights.
```text
SkinGPT-R1/
├── checkpoints/
├── environment.yml
├── inference/
│   ├── full_precision/
│   └── int4_quantized/
├── requirements.txt
└── README.md
```
This repo provides full-precision inference, INT4 quantized inference, multi-turn chat, and FastAPI serving.
environment.yml is a Conda environment definition file for reproducing the recommended runtime environment.
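For orientation, a minimal sketch of what such a file looks like. The dependency list here is illustrative, not the repository's actual one; only the environment name and Python version are taken from the manual setup below:

```yaml
name: skingpt-r1
channels:
  - defaults
dependencies:
  - python=3.10.20   # matches the manual setup below
  - pip
  - pip:
      - -r requirements.txt   # illustrative; the real file pins its own packages
```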
From scratch:

```bash
git clone https://huggingface.co/yuhos16/SkinGPT-R1
cd SkinGPT-R1
conda env create -f environment.yml
conda activate skingpt-r1
```

Manual setup:

```bash
git clone https://huggingface.co/yuhos16/SkinGPT-R1
cd SkinGPT-R1
conda create -n skingpt-r1 python=3.10.20 -y
conda activate skingpt-r1
pip install -r requirements.txt
```
- Use the repository `./checkpoints` directory as the model weights directory.
- Prepare a test image, for example `./test_images/lesion.jpg`.
- Run a first test.
Full precision:

```bash
bash inference/full_precision/run_infer.sh --image ./test_images/lesion.jpg
```

INT4:

```bash
bash inference/int4_quantized/run_infer.sh --image_path ./test_images/lesion.jpg
```

| Mode | Full Precision | INT4 Quantized |
|---|---|---|
| Single image | `bash inference/full_precision/run_infer.sh --image ./test_images/lesion.jpg` | `bash inference/int4_quantized/run_infer.sh --image_path ./test_images/lesion.jpg` |
| Multi-turn chat | `bash inference/full_precision/run_chat.sh --image ./test_images/lesion.jpg` | `bash inference/int4_quantized/run_chat.sh --image ./test_images/lesion.jpg` |
| API service | `bash inference/full_precision/run_api.sh` | `bash inference/int4_quantized/run_api.sh` |
Default API ports:
- Full precision: `5900`
- INT4 quantized: `5901`
Notes
- On multi-GPU servers, prepend commands with `CUDA_VISIBLE_DEVICES=0` if you want to pin one GPU.
- RTX 50 series should use the default `sdpa` path.
- A100 / RTX 3090 / RTX 4090 / H100 can also try `flash_attention_2` if their CUDA stack supports it.
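The notes above amount to a simple backend choice per GPU. A sketch in Python, where the helper name `pick_attn_implementation` is ours and the string values follow the Hugging Face `attn_implementation` convention (an assumption about how the inference scripts are wired):

```python
def pick_attn_implementation(gpu_name: str) -> str:
    """Pick an attention backend from a GPU model-name string.

    Mirrors the notes above: A100 / RTX 3090 / RTX 4090 / H100 may try
    "flash_attention_2" when the CUDA stack supports it, while everything
    else, including the RTX 50 series, stays on the default "sdpa" path.
    """
    flash_capable = ("A100", "RTX 3090", "RTX 4090", "H100")
    if any(name in gpu_name for name in flash_capable):
        return "flash_attention_2"
    return "sdpa"  # safe default, including RTX 50 series
```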
Both API services expose the same endpoints:
```text
POST /v1/upload/{state_id}
POST /v1/predict/{state_id}
POST /v1/reset/{state_id}
POST /diagnose/stream
GET  /health
```
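A minimal client-side sketch that assembles these endpoint URLs against the default ports listed above. The `SkinGPTClient` name is hypothetical; you would then POST to `upload_url(...)` with an HTTP library of your choice:

```python
from dataclasses import dataclass


@dataclass
class SkinGPTClient:
    """Builds endpoint URLs for a running SkinGPT-R1 API service."""

    host: str = "localhost"
    port: int = 5900  # full-precision default; use 5901 for the INT4 service

    def _url(self, path: str) -> str:
        return f"http://{self.host}:{self.port}{path}"

    def upload_url(self, state_id: str) -> str:
        return self._url(f"/v1/upload/{state_id}")

    def predict_url(self, state_id: str) -> str:
        return self._url(f"/v1/predict/{state_id}")

    def reset_url(self, state_id: str) -> str:
        return self._url(f"/v1/reset/{state_id}")

    def stream_url(self) -> str:
        return self._url("/diagnose/stream")

    def health_url(self) -> str:
        return self._url("/health")
```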
This project is for research and educational use only. It is not a substitute for professional medical advice, diagnosis, or treatment.
This repository is released under the MIT License. See LICENSE for details.

