zadorlab/PySIDT

PySIDT is a Python package for training and running inference on subgraph isomorphic decision trees (SIDTs) for molecular property prediction as described in Johnson et al. 2025.

SIDTs are graph-based decision trees whose nodes are each associated with a molecular substructure. Inference proceeds by descending a target molecular structure down the tree, following nodes whose substructures are subgraph isomorphic to part of the target, and making the prediction from the final (most specific) node matched. SIDTs can perform significantly better than deep neural network based approaches on smaller datasets (<10,000 datapoints). Because they are trees of molecular substructures, SIDTs are inherently readable and easy to visualize and analyze. They are also straightforward to extend and retrain, facilitate uncertainty estimation, and enable integration of expert knowledge.
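
The descent described above can be illustrated with a minimal, self-contained sketch. This is not PySIDT's API: the `Node` class, the `matches`, `descend`, and `predict` functions, and the toy tree are all hypothetical. To keep the example short, a molecule is represented as a set of element labels and a subset check stands in for a real subgraph isomorphism test. Taking the first matching child at each level is also an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable, List


@dataclass
class Node:
    """Hypothetical tree node: a substructure pattern plus a stored prediction."""
    name: str
    pattern: FrozenSet[str]  # stand-in for a molecular substructure
    value: float             # prediction made if this is the final node matched
    children: List["Node"] = field(default_factory=list)


def matches(pattern: FrozenSet[str], molecule: FrozenSet[str]) -> bool:
    # ASSUMPTION: a subset check replaces a true subgraph isomorphism test.
    return pattern <= molecule


def descend(root: Node, molecule: FrozenSet[str]) -> Node:
    """Walk down from the root, moving to the first child that matches,
    and return the deepest (most specific) node reached."""
    node = root
    while True:
        for child in node.children:
            if matches(child.pattern, molecule):
                node = child
                break
        else:
            # No child matches, so this node is the most specific match.
            return node


def predict(root: Node, molecule: Iterable[str]) -> float:
    return descend(root, frozenset(molecule)).value


# Toy tree: the root matches everything; children are increasingly specific.
root = Node("Root", frozenset(), 1.0, [
    Node("Root_C", frozenset({"C"}), 2.0, [
        Node("Root_C_O", frozenset({"C", "O"}), 3.0),
    ]),
    Node("Root_N", frozenset({"N"}), 4.0),
])
```

With this toy tree, `predict(root, {"C", "O", "H"})` descends Root, Root_C, Root_C_O and returns the value stored at Root_C_O, while a molecule matching no child falls back to the root's value.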

The SIDT technique was originally developed in Johnson and Green 2024. This implementation incorporates uncertainty prepruning, as detailed in Pang et al. 2024.

Documentation for PySIDT is available here.

Installation from source

  • Install PySIDT from source
    • git clone https://github.com/zadorlab/PySIDT.git
    • cd PySIDT
    • conda env create -f environment.yml
    • conda activate pysidt_env
    • pip install -e .

Install molecule from source to customize atomtypes

  • Install molecule from source
    • git clone https://github.com/ReactionMechanismGenerator/molecule.git
    • cd molecule
    • conda activate pysidt_env
    • make
    • pip install -e .
