
Lifelong Machine Learning Potentials

Introduction

A lifelong machine learning potential (lMLP) is a representation of the potential energy surface for arbitrary systems with uncertainty quantification, which can be fine-tuned and extended in a rolling fashion. Hence, it unites accuracy, efficiency, and flexibility.

This software enables lMLP training and prediction.

Installation

The module lmlp can be installed using pip once the repository has been cloned.

git clone <lmlp-repository>
cd <lmlp-repository>
python3 -m pip install .

Users without superuser privileges can install the package in a virtual environment or with the --user flag. If there is no space left on the device for TMPDIR, prefix python3 with TMPDIR=<PATH>, where <PATH> is a directory with more space for temporary files.
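For example, a standard virtual environment workflow looks as follows; the environment name lmlp-env is arbitrary.

python3 -m venv lmlp-env
source lmlp-env/bin/activate
python3 -m pip install .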

For higher performance, Intel SVML and TBB can be exploited.

pip install icc-rt tbb

If they are installed via pip, please make sure that the respective pip library path is added to the environment variable LD_LIBRARY_PATH. The pip installation location is shown by pip -V. Only the leading part /.../lib/ of this path needs to be added to LD_LIBRARY_PATH. We recommend adding the export statement to your ~/.bashrc file as well.

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/.../lib/   # replace ... by actual library path
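For example, if pip -V prints a hypothetical installation location such as

pip 23.0 from /home/user/.local/lib/python3.10/site-packages/pip (python 3.10)

the leading part to add is /home/user/.local/lib/:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/user/.local/lib/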

Usage

Training

  1. Prepare the episodic memory file, descriptor file, and supplemental potential file (see the examples and tools on GitHub).
  2. Adjust settings in input_lmlp.py (see examples).
  3. Run input_lmlp.py.
python3 -u input_lmlp.py > output.dat

In the generalization setting file, several lMLPs can be combined for an ensemble prediction by adding the names of the respective generalization files to the ensemble list and their test RMSEs to the test_rmse list (see tools), as sketched below. The other properties in the generalization setting file must match for all lMLPs in the ensemble.
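For illustration, the corresponding entries in the generalization setting file could look like the following sketch, assuming the Python list syntax of the example setting files; the names and values in angle brackets are placeholders.

ensemble = ['<generalization_file_1>', '<generalization_file_2>']   # generalization files of the individual lMLPs
test_rmse = [<test_rmse_1>, <test_rmse_2>]   # corresponding test RMSEs of the individual lMLPs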

Prediction

import numpy as np
import lmlp

# Required input
generalization_setting_file = '...'
elements = np.array([...], dtype=str)   # shape: (n_atoms)
positions = np.array([...])   # shape: (n_atoms, 3), unit: Angstrom

# Optional input
lattice = np.array([...])   # shape: (3, 3), unit: Angstrom
atomic_classes = np.array([...], dtype=int)   # shape: (n_atoms), values: 1 -> QM atom, 2 -> MM atom
atomic_charges = np.array([...])   # shape: (n_atoms), unit: elementary charge

# Initialize lMLP calculator
lMLP = lmlp.lMLP_calculator(
    generalization_setting_file, uncertainty_scaling=2.0,
    uncertainty_thresholds=(30.0, 60.0, 180.0), active_learning_file=None,
    active_learning_thresholds=(3.0, 6.0, 18.0), active_learning_max_number=250,
    active_learning_min_step_difference=5, active_learning_min_atom_distance=0.65)

# Simple energy and forces prediction requires elements and positions
energy, forces = lMLP.predict(elements, positions)   # energy unit: eV, forces unit: eV/Ang

# In addition lattice can be provided for periodic systems,
# atomic_classes and atomic_charges are required for QM/MM predictions,
# name will be assigned to the structure in the active learning output file,
# calc_forces enables forces calculation (only available properties are returned),
# calc_uncertainty enables uncertainty quantification (only available properties are returned)
energy, forces, energy_uncertainty, forces_uncertainty = lMLP.predict(
    elements, positions, lattice=lattice, atomic_classes=atomic_classes,
    atomic_charges=atomic_charges, name=None, calc_forces=True, calc_uncertainty=True)
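As a self-contained illustration, the following sketch predicts energy and forces for a single gas-phase water molecule. It assumes that the keyword arguments of lMLP_calculator can be left at their default values; the generalization setting file path must point to an actual trained lMLP.

import numpy as np
import lmlp

generalization_setting_file = '...'   # replace by the path to an actual generalization setting file

# Gas-phase water molecule (non-periodic, no QM/MM)
elements = np.array(['O', 'H', 'H'], dtype=str)
positions = np.array([[0.0000,  0.0000,  0.1173],
                      [0.0000,  0.7572, -0.4692],
                      [0.0000, -0.7572, -0.4692]])   # unit: Angstrom

lMLP = lmlp.lMLP_calculator(generalization_setting_file)
energy, forces = lMLP.predict(elements, positions)
print(energy)   # total energy, unit: eV
print(forces)   # shape: (3, 3), unit: eV/Ang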

The number of threads used by Numba during prediction can be specified by the environment variable NUMBA_NUM_THREADS. The default is no parallelization.

NUMBA_NUM_THREADS=4 python3 prediction_lmlp.py

Numba and PyTorch perform just-in-time compilation the first time the respective code is executed. The compiled Numba functions are cached for future use. Just-in-time compilation can be turned off by setting NUMBA_JIT=0 and PYTORCH_JIT=0.
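For example, both just-in-time compilations can be disabled for a single run by prefixing the command, analogous to NUMBA_NUM_THREADS above.

NUMBA_JIT=0 PYTORCH_JIT=0 python3 prediction_lmlp.py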

License and Copyright Information

The module lmlp is distributed under the BSD 3-Clause "New" or "Revised" License. For more license and copyright information, see the file LICENSE.txt.

How to Cite

When publishing results obtained with lmlp, please cite the respective release as archived on Zenodo (DOI: 10.5281/zenodo.7912831, 10.5281/zenodo.8192948).

In addition, we kindly request that you cite M. Eckhoff, M. Reiher, Lifelong Machine Learning Potentials, J. Chem. Theory Comput. 2023, 19, 3509-3525, when working with lMLPs, and M. Eckhoff, M. Reiher, CoRe optimizer: an all-in-one solution for machine learning, Mach. Learn.: Sci. Technol. 2024, 5, 015018, when working with the CoRe optimizer.

Support and Contact

In case you encounter any problems or bugs, please write a message to lifelong_ml@phys.chem.ethz.ch.
