langbnj/alphasync

alphasync

AlphaSync (https://alphasync.stjude.org) is an updated AlphaFold structure database synchronised with UniProt.

AlphaSync predicts new structures to stay up-to-date with the latest UniProt release, and it additionally enhances all structures with residue-level data such as solvent accessibility and atom-level non-covalent contacts. A preprint manuscript describing AlphaSync is available at https://www.biorxiv.org/content/10.1101/2025.03.12.642845v1.

Please note

This repository provides the structure prediction and processing pipeline behind AlphaSync. It is not yet optimised for local deployment. It currently requires a local SQL server to be set up and is optimised for an LSF HPC environment with a local Singularity container of AlphaFold 2.3.2. We have tentative plans for a more portable containerised version in future.

Initial setup

  • Requires Lahuta (https://bisejdiu.github.io/lahuta) for residue-residue contacts. Lahuta is not yet publicly released, but should be soon.
  • Install Python packages (see individual scripts for imports)
  • Install DSSP to make sure mkdssp is available (https://github.com/PDB-REDO/dssp)
  • Update blang_mysql.py with SQL connection details
  • Create the tables using sql/sql_create_statements.sql and import the .sql files
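
Before running the pipeline, it can help to confirm that the external executables named above are actually on PATH. The following is a minimal sketch, not part of the repository: the function name missing_tools is hypothetical, and the only tool it checks by default is mkdssp, since that is the one executable the setup steps name explicitly.

```python
import shutil


def missing_tools(tools=("mkdssp",)):
    """Return the names in `tools` that cannot be found on PATH.

    Hypothetical helper for illustration; the default list contains only
    mkdssp because that is the executable named in the setup steps.
    """
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list from missing_tools() means mkdssp was found. Other executables can be checked by passing their names, for example missing_tools(("mkdssp", "singularity")).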

To update

  • Run run.py
    • Downloads structures from AlphaFold Protein Structure Database (AFDB) via FTP and GCS
    • Parses structures
    • Calculates RSA/dihedrals/contacts
    • Maps sequences to structures
  • Run run.py -alphasync
    • Refreshes protein sequence and proteome data from UniProt REST API and FTP
    • Submits AlphaFold structure prediction jobs as needed
    • Parses structures
    • Calculates RSA/dihedrals/contacts
    • Maps sequences to structures
    • The alphasync_compact SQL tables can then be migrated to the web server (code available on request)
    • Repeat for new UniProt releases
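
The two invocations above could be chained from a small wrapper, sketched below. This is an illustration under assumptions rather than part of the repository: it assumes the commands are launched from the repository root, that the second step should run only if the first succeeds, and that the current Python interpreter is the right one. The names UPDATE_STEPS, build_commands and run_update are hypothetical.

```python
import subprocess
import sys

# The two invocations listed above, in the order given.
UPDATE_STEPS = [
    ["run.py"],
    ["run.py", "-alphasync"],
]


def build_commands(python=sys.executable):
    """Return the full command line for each update step."""
    return [[python, *step] for step in UPDATE_STEPS]


def run_update(dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    check=True stops the sequence if a step exits with a non-zero status
    (an assumption about the desired behaviour, not documented upstream).
    """
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling run_update() only prints the commands. Calling run_update(dry_run=False) executes them in order and raises subprocess.CalledProcessError if a step fails.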

Acknowledgements

The code in input/alphasync/alphafold_tools is modified slightly from https://github.com/google-deepmind/alphafold, licensed under the Apache 2.0 license. The main change is a split into CPU- and GPU-based steps for more efficient parallelisation, similar to AlphaFold 3.
