# RadVLM: Vision Language Models for Radiology Report Generation - A Reasoning and Knowledge Graph Retrieval Augmented Generation Approach

This is the official repository for "RadVLM: Vision Language Models for Radiology Report Generation" by the DBIS group at RWTH Aachen University (Yongli Mou*, Antonia Gustke, and Stefan Decker).
RadVLM is a research project focused on enhancing radiology report generation using Vision-Language Models (VLMs). It integrates reasoning and knowledge graph retrieval to improve accuracy and contextual understanding. The repository provides tools for dataset preprocessing, model training, and evaluation, along with pre-trained models and benchmarks.
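To illustrate the knowledge-graph retrieval-augmented generation idea in the abstract sense, here is a minimal, self-contained sketch. The toy graph, triple schema, and helper names (`build_index`, `retrieve_context`, `augment_prompt`) are hypothetical illustrations, not the repository's actual API:

```python
# Illustrative sketch of KG-based retrieval augmentation (hypothetical names,
# not the RadVLM API): look up facts about findings mentioned in a query and
# prepend them to the generation prompt.
from collections import defaultdict

# A tiny knowledge graph of radiology concepts as (head, relation, tail) triples.
TRIPLES = [
    ("cardiomegaly", "indicated_by", "enlarged cardiac silhouette"),
    ("cardiomegaly", "associated_with", "pulmonary edema"),
    ("pleural effusion", "indicated_by", "blunted costophrenic angle"),
]

def build_index(triples):
    """Index triples by head entity for fast neighborhood lookup."""
    index = defaultdict(list)
    for head, rel, tail in triples:
        index[head].append((rel, tail))
    return index

def retrieve_context(index, finding):
    """Return knowledge-graph facts about one finding as plain sentences."""
    return [f"{finding} {rel.replace('_', ' ')} {tail}"
            for rel, tail in index.get(finding, [])]

def augment_prompt(task, findings, index):
    """Prepend the retrieved facts to the task description, RAG-style."""
    facts = [fact for f in findings for fact in retrieve_context(index, f)]
    context = "\n".join(f"- {fact}" for fact in facts)
    return f"Known facts:\n{context}\n\nTask: {task}"

index = build_index(TRIPLES)
print(augment_prompt("Write the impression section.", ["cardiomegaly"], index))
```

In the actual system, the retrieved facts would come from a full radiology knowledge graph and the augmented prompt would be passed to the vision-language model alongside the image.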
## Installation

```bash
git clone https://github.com/MouYongli/RadVLM.git
cd RadVLM
```

Because DeepSeek-VL2 and Qwen2.5-VL require different versions of their dependencies, we install them in separate conda environments.

```bash
export PROJECT_ROOT=$(pwd)
```

- DeepSeek-VL2
```bash
cd $PROJECT_ROOT/baselines
mkdir deepseek
# Clone the DeepSeek-VL2 repository
git clone https://github.com/deepseek-ai/DeepSeek-VL2.git
mv DeepSeek-VL2/* deepseek
rm -rf DeepSeek-VL2
# Update requirements.txt in the deepseek folder
cp requirements.deepseek.txt deepseek/requirements.txt
# Update pyproject.toml in the deepseek folder
cp pyproject.deepseek.toml deepseek/pyproject.toml
# Install dependencies and the deepseek-vl2 package
cd deepseek
# Create a new conda environment for DeepSeek-VL2
conda create --name deepseekenv python=3.10
conda activate deepseekenv
pip install -r requirements.txt
pip install -e .
# Install PyTorch with CUDA 12.6
# For other versions, please refer to https://pytorch.org/get-started/locally, for example:
# pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
pip install torch torchvision torchaudio
```

- Qwen2.5-VL
```bash
cd $PROJECT_ROOT/baselines
mkdir qwen
conda create --name qwenenv python=3.10
conda activate qwenenv
cp requirements.qwen.txt qwen/requirements.txt
cd qwen
pip install -r requirements.txt
```

- Our project and dependencies
```bash
cd $PROJECT_ROOT
conda activate deepseekenv
pip install -e .
conda activate qwenenv
pip install -e .
```

## Datasets

- MIMIC-CXR: https://physionet.org/content/mimic-cxr/2.0.0/
- CheXpert
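MIMIC-CXR reports are free text with labeled sections such as FINDINGS and IMPRESSION. As a minimal sketch of the kind of preprocessing involved, the snippet below splits a report into sections by its headers; the function name and the set of headers handled are assumptions, and the repository's own data scripts may differ:

```python
# Illustrative sketch (assumed helper, not the repository's preprocessing code):
# split a MIMIC-CXR-style free-text report into its labeled sections.
import re

SECTION_RE = re.compile(
    r"^\s*(FINDINGS|IMPRESSION|INDICATION|COMPARISON):", re.MULTILINE
)

def split_report(text):
    """Map each section header found in the report to its body text."""
    sections = {}
    matches = list(SECTION_RE.finditer(text))
    for i, m in enumerate(matches):
        start = m.end()
        # A section's body runs until the next header (or the end of the report).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[m.group(1)] = text[start:end].strip()
    return sections

report = """
INDICATION: Shortness of breath.
FINDINGS: The cardiac silhouette is enlarged. No pleural effusion.
IMPRESSION: Cardiomegaly without acute pulmonary process.
"""

print(split_report(report)["IMPRESSION"])
```

Real MIMIC-CXR reports vary in which sections are present, so downstream code should treat every section as optional.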
## Usage

Here's an example of how to use the model:

```python
from radvlm.models.modeling_radvlm import RadVLM

model = RadVLM.load_pretrained("base")
```

## Project Structure

```
📦 RadVLM
├── 📁 data           # Sample datasets and preprocessing scripts
├── 📁 models         # Pre-trained models and checkpoints
├── 📁 notebooks      # Jupyter notebooks with tutorials
├── 📁 docs           # Documentation and API references
├── 📁 experiments    # Experimental configurations, logs and results
├── 📁 src            # Core implementation of foundation models
└── README.md         # Project description
```
## Benchmarks

| Model    | Accuracy |
|----------|----------|
| Baseline | xx       |
| Ours     | xx       |

More benchmarks are available in the research paper.
## License

This project is licensed under the MIT License. See the LICENSE file for details.
## Citation

If you use this project in your research, please cite:

```bibtex
@article{mou2025radvlm,
  author  = {Mou, Yongli and Gustke, Antonia and Decker, Stefan},
  title   = {XXX},
  journal = {XXX},
  year    = {202X}
}
```