Welcome to BaMCo, a novel framework for multimodal, knowledge-driven biomedical Visual Question Answering (VQA). This repository contains the implementation of the BaMCo paper, accepted to MICCAI 2025.
```bash
git clone https://github.com/yaziciz/BaMCo.git
cd BaMCo
conda env create -f environment.yml
conda activate bamco
```

- Place your datasets under the appropriate folders in `KSpace/Datasets/`, or use the predefined datasets: Slake, PathVQA, and VQA-RAD (a quick folder check is sketched below).
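As a rough sanity check (an assumption about the layout, not code from the repository), the snippet below reports whether the predefined dataset folders are present under `KSpace/Datasets/`; the exact folder names expected by the data loaders may differ.

```python
# Hypothetical sanity check: confirm the dataset folders referenced above exist.
# The folder names below are assumptions; consult the dataset loaders for the exact layout.
from pathlib import Path

datasets_root = Path("KSpace/Datasets")
for name in ("Slake", "PathVQA", "VQA-RAD"):
    status = "found" if (datasets_root / name).is_dir() else "missing"
    print(f"{name}: {status}")
```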
- **VQA Model:** Download `pytorch_model_best.bin` from the Hugging Face BaMCo Collection and place it in `VQA/src/checkpoints/`.
- **Knowledge Encoder:** Download `<Dataset>_KnowledgeSpace.pt` from the Google Drive Knowledge Space Weights and place it in `KSpace/src/checkpoints/`.
- Edit `main.py` in both `VQA/src/` and `KSpace/src/` to point to the correct checkpoint files, as described in the respective `Readme.md` files in each `checkpoints/` directory (a loading sanity check is sketched after this list).
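Once the weights are in place, they should deserialize with `torch.load`. The snippet below is a minimal sketch assuming the paths above and the Slake knowledge space; the actual loading logic lives in the two `main.py` files and may differ.

```python
# Minimal sketch (not the repository's loading code): verify that the downloaded
# checkpoints deserialize from the locations described above.
import torch

vqa_ckpt_path = "VQA/src/checkpoints/pytorch_model_best.bin"
# "Slake" is an example; use the knowledge space file matching your dataset.
kspace_ckpt_path = "KSpace/src/checkpoints/Slake_KnowledgeSpace.pt"

vqa_state = torch.load(vqa_ckpt_path, map_location="cpu")
kspace_state = torch.load(kspace_ckpt_path, map_location="cpu")

print(f"VQA checkpoint entries: {len(vqa_state)}")
print(f"Knowledge encoder checkpoint entries: {len(kspace_state)}")
```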
- **KSpace:** Scripts for constructing and encoding biomedical knowledge sources.
- **VQA:** End-to-end VQA pipeline, including data loading, model training, evaluation, and inference.
- **Checkpoints:** Store and manage pretrained model weights for both the knowledge encoders and the VQA models.
We appreciate your interest! If you use or refer to BaMCo in your research, please cite us. The citation will be updated soon!
```bibtex
@inproceedings{BaMCo_MICCAI2025,
  title     = {BaMCo: Balanced Multimodal Contrastive Learning for Knowledge-Driven Medical VQA},
  author    = {Ziya Ata Yazici and Hazım Kemal Ekenel},
  booktitle = {International Conference on Medical Image Computing and Computer-Assisted Intervention},
  year      = {2025}
}
```

For questions, issues, or contributions, please open an issue or pull request on GitHub.