
neusymlab/DiffGap


DiffGap

Official implementation for "[Bridging the Gap between Learning and Inference for Diffusion-Based Molecule Generation]".

Installation

To create the virtual environment, use the following command.

conda env create -f env.yml

Alternatively, set it up step by step following the modified guidance in TargetDiff Installation.

Python < 3.10 is required for compatibility with Vina.
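A minimal sketch of a guard for the version constraint above, which could be run before installing or launching anything. The helper name `python_is_supported` is hypothetical and not part of this repository.

```python
import sys


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True when the interpreter is older than Python 3.10.

    Hypothetical helper: it only encodes the "Python < 3.10" constraint
    stated in this README and checks nothing about Vina itself.
    """
    major_minor = tuple(version_info[:2])
    return major_minor < (3, 10)


if __name__ == "__main__":
    if not python_is_supported():
        print("Warning: this interpreter is Python 3.10 or newer.")
```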

Data

The data preparation follows TargetDiff. For more details, please refer to the repository of TargetDiff.

Usage

We use pipeline.py to wrap the whole pipeline of training, sampling, and evaluation for both projects.

python -m pipeline <configs> <sampling_results> [train|sample|eval] [-c resume_from_checkpoint_for_training]
# python -m pipeline configs/training.yml sampling_results/reproduce # run the whole pipeline
# python -m pipeline configs/sampling.yml sampling_results/reproduce sample # start the pipeline from sampling
# python -m pipeline "no matter" sampling_results/reproduce eval # evaluation only; the config argument can be any placeholder
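The optional stage argument appears to select where the pipeline starts, with later stages still running. The sketch below illustrates that behavior only. It is not the actual logic of pipeline.py, and the names `STAGES` and `stages_to_run` are hypothetical.

```python
from typing import List, Optional

# Stage order taken from the usage line: train, then sample, then eval.
STAGES = ["train", "sample", "eval"]


def stages_to_run(start: Optional[str] = None) -> List[str]:
    """Return the stages to execute, beginning at `start`.

    Assumption: omitting the stage runs the whole pipeline, and naming a
    stage runs that stage plus every stage after it.
    """
    if start is None:
        return list(STAGES)
    if start not in STAGES:
        raise ValueError(f"unknown stage {start!r}; expected one of {STAGES}")
    return STAGES[STAGES.index(start):]


if __name__ == "__main__":
    for choice in (None, "sample", "eval"):
        print(choice, "->", stages_to_run(choice))
```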

Alternatively, you can run the script for each stage manually, as in TargetDiff or BindDM.

We removed the {train,sample,evaluate}.py files from BindDM because they are just copies of the {train,sample,evaluate}_diffusion.py files in scripts.

We also provide a script for plotting and for computing metrics such as High Affinity and Diversity. It relies only on metrics_-1.pt, the meta file generated by evaluation.

For more metrics, please refer to <binddm/scripts/jsd_summary.py> after reshaping the meta file with eval_result_reshape.py.

These meta files and checkpoints are released on Hugging Face (HF).
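As a rough illustration of the two metrics named above, the sketch below uses their common definitions in pocket-conditioned generation benchmarks: High Affinity as the fraction of generated molecules whose Vina score beats the reference ligand, and Diversity as one minus the mean pairwise Tanimoto similarity. These definitions, the strict `<` comparison, and the use of plain integer sets as fingerprints are all assumptions. The code does not read metrics_-1.pt and is not the repository's script.

```python
from itertools import combinations
from typing import Iterable, Sequence, Set


def high_affinity(gen_scores: Iterable[float], ref_score: float) -> float:
    """Fraction of generated molecules with a lower (better) Vina score
    than the reference ligand. The strict '<' is an assumption."""
    scores = list(gen_scores)
    if not scores:
        raise ValueError("need at least one generated score")
    return sum(s < ref_score for s in scores) / len(scores)


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def diversity(fingerprints: Sequence[Set[int]]) -> float:
    """One minus the mean pairwise Tanimoto similarity.

    Assumption: each fingerprint is a set of on-bit indices; in practice
    these would come from a cheminformatics toolkit.
    """
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        raise ValueError("need at least two fingerprints")
    mean_sim = sum(tanimoto(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_sim


if __name__ == "__main__":
    # Toy values for illustration only, not results from the paper.
    print(high_affinity([-8.0, -6.0, -9.5, -7.0], ref_score=-7.5))
    print(diversity([{1, 2, 3}, {2, 3, 4}, {1, 2, 3}]))
```

Both functions operate on one pocket at a time. Reporting a single number for a benchmark would additionally require averaging over pockets.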

PDBbind

We conduct additional experiments on PDBbind.

Download and unzip the PDBbind refined set from https://www.pdbbind-plus.org.cn/download (PDBbind v2020 id=3).

Prepare the data:

python dataset_prepare.py pdbbind PDBbind_refined_2020.tgz 100

# in BindDM
cd binddm
PYTHONPATH=. python scripts/data_preparation/clean_pdbbind.py PDBbind_refined_2020 --num_workers 64
PYTHONPATH=. python scripts/data_preparation/split_pl_dataset.py --path PDBbind_refined_2020 --dst PDBbind_refined_2020_pocket10_split.pt --fixed_split PDBbind_refined_2020/split_by_name.pt

Evaluate the dataset (test-set baseline):

# in BindDM
PYTHONPATH=. python scripts/eval_testset.py PDBbind_refined_2020_test --docking_mode vina_dock

Sample and evaluate:

# bd_pdbbind.yaml is for BindDM
# gbd_pdbbind.yaml is for BindDM+Ours (DiffGap)
python pipeline.py configs/bd_pdbbind.yaml sampling_results/binddm_pdbbind sample -p PDBbind_refined_2020_test

Citation

@misc{liu2024gapdiff,
      title={Bridging the Gap between Learning and Inference for Diffusion-Based Molecule Generation}, 
      author={Peidong Liu and Wenbo Zhang and Xue Zhe and Jiancheng Lv and Xianggen Liu},
      year={2024},
      eprint={2411.05472},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2411.05472}, 
}
