This repository provides a complete pipeline for calculating the binding free energy of molecular systems using alchemical free energy methods. The workflow combines molecular dynamics (MD) simulations, free energy perturbation (FEP), and Bennett acceptance ratio (BAR) analysis to compute binding affinities for host-guest complexes.
The pipeline implements an annihilation method to calculate binding free energies:
- Guest in solution: Annihilate the guest's interactions in a solvated environment
- Host-guest complex: Annihilate the guest's interactions within the host-guest complex
- Binding free energy: Calculate the difference between these two processes
The workflow uses OpenMM for MD simulations with alchemical transformations, applying a lambda schedule to annihilate (turn off) van der Waals and electrostatic interactions. Free energy differences are calculated using the Bennett Acceptance Ratio (BAR) method via pymbar.
This project currently assumes the host is hydroxypropyl beta cyclodextrin (HP-BCD).
- The repository includes host files in
HP-BCD/and the workflow explicitly loadshp-bcd.pdbandhp-bcd.xml. - Defaults and templates are tailored to HP-BCD, for example:
Workflow.pyuseshp-bcd.xmlduring solvation/setup and sets Host restraint Group 1 to a curated HP-BCD index list by default (see Workflow Automatic Selections and Defaults).Prepare.pycopieshp-bcd.pdb,hp-bcd.xml, and related files into template directories.- Docked complex examples and paths use HP-BCD naming conventions.
Using a different host requires programmatic changes to this codebase (not just swapping files). At minimum you will need to:
- Replace HP-BCD-specific file paths and filenames in
Workflow.pyandPrepare.py. - Update default restraint atom selections (e.g., the
1-217host index range) and any host-size assumptions. - Provide the correct host PDB/XML/PRM files and adjust directory names and job templates as needed.
If you plan to generalize beyond HP-BCD, consider adding configuration options for host files, host index ranges for restraints, and template selection, and then updating the documentation accordingly.
- Linux/Unix environment (tested on cluster systems)
- Python 3.x
- GPU with CUDA or OpenCL support (OpenMM platform)
- Production, BAR, and Energy scripts request the CUDA platform by default; Equil and Solvate let OpenMM choose the default platform unless you override it
- CPU platform may work for small tests but is not recommended for production
This project uses a pinned conda environment defined in environment.yml (with a small pip section). The following packages are included:
-
OpenMM (with OpenMM-Setup)
Molecular dynamics simulation engine -
pymbar (version >= 4)
Free energy calculations and BAR analysis https://pymbar.readthedocs.io/en/stable/ -
MDTraj
Trajectory analysis and manipulation https://www.mdtraj.org/1.9.8.dev0/index.html -
PDBFixer
PDB file preparation and solvation https://github.com/openmm/pdbfixer -
Open Babel (Python bindings)
Used to read SDF and write XYZ for docked guest coordinates
https://open-babel.readthedocs.io/ -
intspan
Interval handling utilities https://pypi.org/project/intspan/ -
Standard scientific Python stack
- numpy
- scipy
- (typically included with conda environment)
- Force Field X (FFX): Version 1.0.0 is bundled in the
ffx-1.0.0/directory. Used for force field parameterization and file conversions. If a new version is needed, download from the official Force Field X website. - A minimum Java version of 21 is required to run this bundled tool:
If not already installed, follow the official Miniconda installation guide.
git clone <repository_url>
cd binding_free_energyconda env create -f environment.ymlThis will create a conda environment named binding_free_energy (as defined in the file) with pinned versions for Python, OpenMM, MDTraj, PDBFixer, and other dependencies. A small set of packages is installed via pip as specified under the pip: section.
conda activate binding_free_energyNotes:
- To update an existing environment to match the file:
conda env update -f environment.yml --prune - OpenMM platforms: the environment provides a CUDA toolkit. Select your platform at runtime. Production, BAR, and Energy scripts explicitly request CUDA by default; Equil and Solvate let OpenMM choose the default platform. To force a platform globally, set
OPENMM_DEFAULT_PLATFORM(e.g.,CUDAorOpenCL). - Java requirement for FFX: Force Field X tools require Java 21. If Java 21 is not available on your system, install it (e.g.,
conda install -c conda-forge openjdk=21) or use a system-wide JDK 21.
The workflow requires prepared input files for both the guest molecule and host-guest complex:
<guest>.xyz: Atomic coordinates from PolType (placed inGuests/directory)<guest>.prm: Force field parameters from PolType<guest>.pdb: PDB format structure (generated byPrepare.py)<guest>.xml: OpenMM force field XML (generated byPrepare.py)
hp-bcd.pdb: Host molecule structurehp-bcd.xml: Host force field parameters- Docked structure: SDF file containing docked guest coordinates (host pose from
hp-bcd.pdbis used as-is)
Note: The current workflow is tailored to HP-BCD as the host; using a different host requires code changes (see "Host Assumption and Limitations").
Purpose: Main orchestrator for the complete binding free energy calculation workflow.
Usage:
python Workflow.py --guest_name <name> [options]Required Arguments:
--guest_name: Name of guest molecule (must match XYZ file inGuests/directory)
Optional Arguments:
--setup_only <true|false>: Run setup only without submitting jobs (default:false)--start_at <method>: Start workflow at specific step (options:setup_directories,submit_equil,submit_thermo,submit_bar,collect_energy)--run_equilibration <true|false>: Whether to run equilibration (default:true)--alchemical_atoms <list>: Comma-separated atom indices for alchemical transformation (auto-detected from guest if not specified)--restraint_type <COM|BORESCH>: Restraint model to use (default:COM)--restraint_atoms1 <list>: Comma-separated indices for restraint group 1--restraint_atoms2 <list>: Comma-separated indices for restraint group 2--submission_system <SGE|SLURM>: Job scheduler to target for job scripts and submission (default:SGE)
- Use
--submission_systemto select the scheduler targeted by job scripts and templates.SGE(default): templates are loaded fromJobFiles/SGESLURM: templates are loaded fromJobFiles/SLURM
- The workflow copies the appropriate
equil.job,thermo.job, andbar.jobfrom the selected template folder into your working directories.
- If
--alchemical_atomsis not provided, the workflow auto-detects the alchemical atom range from the first line ofGuests/<NAME>.xyzand sets it to1-<N>. - For the Host_Guest leg, if restraint atoms are not provided:
- Group 1 defaults to a host index list tailored to HP-BCD:
11,16,17,20,23,26,31,32,39,40,51,63,64,70,71,82,94,95,101,102,113,125,126,132,133,144,156,157,163,164,175,187,188,191,198 - Group 2 is auto-selected using the built-in
calculate_restraint_subsectionfunction with a 5.0 Å cutoff around the guest. The indices are reported to the console.
- Group 1 defaults to a host index list tailored to HP-BCD:
Full Workflow (Default Behavior):
To run the full workflow including job submission:
python Workflow.py --guest_name DiflunisalSetup Only::
The workflow supports a --setup_only flag that allows you to run the workflow to only set up directories and files without submitting any jobs or scripts.
python Workflow.py --guest_name Diflunisal --setup_only trueNext Steps After Setup-Only:
After running in setup-only mode, you can:
-
Review the generated files in:
workflow/<GuestName>/Guest_Workflow/workflow/<GuestName>/Host_Guest_Workflow/
-
Manually submit jobs if needed, or
-
Continue the workflow from the equilibration step:
python Workflow.py --guest_name Diflunisal --start_at submit_equil
output*.dcd: Trajectory files in DCD formatoutput*.xml: Simulation state files for restartoutput*.log: Simulation log files*_solv.pdb: Solvated structures
- BAR free energy differences printed to stdout
- TODO: Add file-based output for BAR results and final binding free energies
Purpose: Convert guest molecule files from PolType output to OpenMM-compatible formats and merge with host structures.
Usage:
python Prepare.py --guest_file <xyz> --prm_file <prm> --host_file_dir <dir> --target_dir <dir> --docked_file <sdf>Required Arguments:
--guest_file: Guest XYZ file from PolType--prm_file: PRM parameter file from PolType--host_file_dir: Directory containing host PDB and XML files--target_dir: Output directory for prepared files--docked_file: SDF file containing the docked guest coordinates (only guest coordinates are used; host pose fromhp-bcd.pdbis unchanged)
Note on coordinate handling:
- The docked SDF should contain the guest conformer pose from docking. Only the guest coordinates are taken from this file.
Prepare.pyreplaces the guest PDB coordinates with those from the SDF before creating the host-guest complex. The host pose fromhp-bcd.pdbremains unchanged.- After merging host and guest, no additional post-merge coordinate copying is performed.
Purpose: Add explicit solvent to a molecular system using PDBFixer and perform energy minimization.
Usage:
python Solvate.py --pdb_file <pdb> --forcefield_file <xml> [<xml> ...]Required Arguments:
--pdb_file: Input PDB file--forcefield_file: One or more OpenMM XML force field files
Output: Creates <filename>_solv.pdb with solvated and minimized structure.
Purpose: Run equilibration MD simulation.
Usage:
python Equil.py --pdb_file <pdb> --forcefield_file <xml> [options]Common Arguments:
--pdb_file: Input PDB structure--forcefield_file: Force field XML file(s)--nonbonded_method: Method for nonbonded interactions (NoCutoff,CutoffNonPeriodic,PME,Ewald)--nonbonded_cutoff: Cutoff distance in nm (default: 1.0)--num_steps: Number of simulation steps--step_size: Integration time step in fs--use_restraints: Enable harmonic restraints--restraint_type: Restraint model to use (COMorBORESCH, default:COM)--restraint_atoms_1,--restraint_atoms_2: Atom indices for restraints--restraint_constant: Force constant for restraints (kcal/mol/Ų)--restraint_lower_distance,--restraint_upper_distance: Flat-bottom restraint distances (Å)
Purpose: Run production MD simulation with alchemical transformations.
Usage:
python Production.py --pdb_file <pdb> --forcefield_file <xml> --vdw_lambda <value> --elec_lambda <value> --alchemical_atoms <list> [options]Additional Arguments:
--vdw_lambda: Van der Waals lambda value (0.0 to 1.0)--elec_lambda: Electrostatic lambda value (0.0 to 1.0)--alchemical_atoms: Comma-separated atom indices to transform
Purpose: Calculate free energy differences between adjacent lambda windows using Bennett Acceptance Ratio method.
Usage:
python BAR.py --traj_i <dcd> --traj_ip1 <dcd> --pdb_file <pdb> --forcefield_file <xml> --vdw_lambda_i <value> --vdw_lambda_ip1 <value> --elec_lambda_i <value> --elec_lambda_ip1 <value> --alchemical_atoms <list> [options]Required Arguments:
--traj_i: Trajectory file at lambda i (DCD format)--traj_ip1: Trajectory file at lambda i+1--pdb_file: Reference PDB structure--forcefield_file: Force field XML file(s)--vdw_lambda_i,--vdw_lambda_ip1: VDW lambda values--elec_lambda_i,--elec_lambda_ip1: Electrostatic lambda values--alchemical_atoms: Atoms undergoing transformation
Optional Arguments:
--start: Starting frame index (default: 0)--stop: Ending frame index (default: all frames)--step_size: Frame sampling interval (default: 1)--nonbonded_method: Nonbonded method--nonbonded_cutoff: Cutoff distance--restraint_lambda_i,--restraint_lambda_ip1: Restraint scaling parameters for states i and i+1 (default: 1.0)
Output: Prints forward/reverse FEP and BAR free energy differences with uncertainties.
Purpose: Calculate potential energy of a system at specific alchemical lambda values.
Usage:
python Energy.py --pdb_file <pdb> --forcefield_file <xml> --vdw_lambda <value> --elec_lambda <value> --alchemical_atoms <list> [options]Supports the same optional restraint flags as Equil.py (e.g., --use_restraints, --restraint_type, --restraint_atoms_1, --restraint_atoms_2, etc.). By default, this script requests the CUDA platform.
Purpose: Calculate analytical corrections for harmonic restraint contributions to binding free energy.
Usage:
python alchemistry/Freefix.py <ri> <fi> <ro> <fo> <temp>Arguments:
ri: Inner flat-bottom radius (Å)fi: Inner force constant (kcal/mol/Ų)ro: Outer flat-bottom radius (Å)fo: Outer force constant (kcal/mol/Ų)temp: Temperature (K)
Note: Currently implemented with HARMONIC method; BORESCH method not yet implemented.
-
OpenMM Platform Issues (OpenCL/CUDA)
- Production, BAR, and Energy request CUDA by default. Verify available platforms:
python - <<'PY'\nimport openmm\nprint([openmm.Platform.getPlatform(i).getName() for i in range(openmm.Platform.getNumPlatforms())])\nPY
- To force CUDA (if installed): set
OPENMM_DEFAULT_PLATFORM=CUDAor modify the script to use CUDA. - If CUDA not found, ensure CUDA toolkit and OpenMM CUDA package are correctly installed.
- Production, BAR, and Energy request CUDA by default. Verify available platforms:
-
Restraint Warnings
- If you see "WARNING! Restraints triggered" messages, the system may have drifted outside restraint boundaries
- Consider adjusting restraint parameters or checking system stability
-
Force Field X Errors
- Ensure
ffx-1.0.0/binis accessible - Check Java installation for FFX execution
- Ensure
-
Memory Issues
- Large systems may require significant GPU memory
- Reduce trajectory output frequency or system size if needed
TODO: Add contribution guidelines
TODO: No license file is currently present in the repository. Please add appropriate license information.
TODO: Add citation information for publications using this workflow.
TODO: Add contact information for questions and support.
This workflow uses the following software packages:
- OpenMM for molecular dynamics simulations
- pymbar for free energy analysis
- MDTraj for trajectory processing
- Force Field X for force field parameterization
TODO: Add relevant literature references for:
- Alchemical free energy methods
- Bennett Acceptance Ratio method
- Annihilation approach
- Force field parameterization
Last Updated: 2026-04-06 13:02