kaileyhh/betafold

BetaFold: an integer linear program for $\beta$-sheet folding prediction

Abstract

Determining a protein's three-dimensional structure is critical for understanding its function, but experimental methods such as X-ray crystallography are time-consuming and expensive, which makes it infeasible to evaluate protein structures quickly at large scale. Advanced deep learning models such as AlphaFold2 offer high accuracy, but they require significant computational power and large training datasets, and they can struggle with "orphan" proteins or subtle mutations. Integer linear programming (ILP) methods trade this fine-grained accuracy for a lower computational cost, and their key advantage is that they need no pretraining whatsoever. Here, we introduce BetaFold, an ILP-based method for predicting the structure of $\beta$-sheet-heavy protein sequences. We focus on $\beta$-sheets because they are held together by hydrogen bonds, which makes them predictable by HP models. Unlike existing linear programming models, BetaFold supports the prediction of quaternary structure, that is, the interaction between multiple strands or amino acid sequences. Furthermore, we integrate electrostatic interactions alongside the primary hydrophobic-polar (HP) lattice model to construct more biologically accurate folds. We evaluated BetaFold on 155 unique sequences from the Protein Data Bank (PDB); predictions achieved a Root Mean Square Deviation (RMSD) in the 6-15 range, comparable to state-of-the-art linear programming benchmarks.
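
For readers unfamiliar with the HP lattice model mentioned above, the following is a minimal sketch of the objective such models optimize: the number of hydrophobic (H) residues that end up adjacent on the lattice without being neighbors in the chain. It is not BetaFold's formulation. It assumes a 2D square lattice, omits the electrostatic and quaternary terms, and replaces the ILP solver with brute-force enumeration, which is only feasible for very short sequences. All names (`fold_coords`, `hp_contacts`, `best_fold`) are hypothetical.

```python
from itertools import product

# Unit moves on a 2D square lattice (an assumption; the README does not
# specify which lattice BetaFold uses).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold_coords(moves):
    """Return lattice coordinates for a move string, or None if the walk
    revisits a site (a valid fold must be self-avoiding)."""
    pos = (0, 0)
    coords = [pos]
    for m in moves:
        dx, dy = MOVES[m]
        pos = (pos[0] + dx, pos[1] + dy)
        if pos in coords:
            return None
        coords.append(pos)
    return coords


def hp_contacts(sequence, coords):
    """Count H-H pairs that are lattice neighbors but not chain neighbors."""
    count = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):
            if sequence[i] == "H" and sequence[j] == "H":
                dist = (abs(coords[i][0] - coords[j][0])
                        + abs(coords[i][1] - coords[j][1]))
                if dist == 1:
                    count += 1
    return count


def best_fold(sequence):
    """Brute-force stand-in for the ILP: try every move string and keep the
    self-avoiding fold with the most H-H contacts."""
    best_score, best_moves = -1, None
    for moves in product("UDLR", repeat=len(sequence) - 1):
        coords = fold_coords(moves)
        if coords is None:
            continue
        score = hp_contacts(sequence, coords)
        if score > best_score:
            best_score, best_moves = score, "".join(moves)
    return best_score, best_moves
```

For example, `hp_contacts("HPPH", fold_coords("URD"))` scores the U-shaped fold that brings the two H residues together, while the straight fold `"RRR"` yields no contacts. An ILP formulation encodes the same self-avoidance and contact conditions as linear constraints over binary variables, so that a solver can search the space instead of enumerating it.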
