Rémi Schlama · Joshua W. Sin · Ryan P. Burwood · Kurt Püntener · Raphael Bigler · Philippe Schwaller · *Chem* · 2026 · Article 103035
AlphaSwarm is an ML-guided particle swarm optimisation (PSO) framework for the multi-objective optimisation of chemical reaction conditions. It couples physically intuitive swarm dynamics with Gaussian-process surrogates to efficiently navigate vast, mixed continuous/categorical reaction spaces, and supports both in silico benchmarking against high-throughput-experimentation (HTE) datasets and closed-loop experimental campaigns. The methodology and results are described in our paper published open access in Chem (2026).
- Installation
- Quickstart
- Usage
- Development details
- Questions and support
- How to cite
- License
Install uv with the command:

```shell
pipx install uv
```

Create the environment:

```shell
uv sync
```

and activate it:

```shell
source .venv/bin/activate
```

Alternatively, you can use conda / mamba to create the environment and install all required packages (the setup used for all benchmarks and experiments):

```shell
git clone https://github.com/schwallergroup/alphaswarm.git
cd alphaswarm
conda env create -f environment.yml
# mamba env create -f environment.yml
python -m pip install -e .
```

After installation, save the following as `quickstart.toml`; it points at one of the virtual-experiment CSVs shipped in `data/benchmark/`:
```toml
file_path = "data/benchmark/ni_suzuki_virtual_benchmark.csv"
y_columns = ["AP yield", "AP selectivity"]
exclude_columns = ["ligand", "solvent", "precursor", "base", "cosolvent"]
seed = 42
n_iter = 3
n_particles = 24
init_method = "sobol"
algo = "alpha-pso"
objective_function = "weighted_sum"
results_path = "results"  # where per-run CSVs + summary.json are written

[pso_params]
c_1 = 1.0
c_2 = 1.0
c_a = 1.0
w = 1.0
n_particles_to_move = [0, 0]

[obj_func_params]
weights = [1.0, 1.0]
noise = 0.0

[model_config]
kernel = "MaternKernel"
kernel_params = "default"
training_iter = 1000
```

Then launch a 3-iteration, 24-particle alpha-pso run against the Ni-Suzuki virtual benchmark:
```shell
alphaswarm benchmark quickstart.toml
```

Results are logged to the terminal and persisted to `results/<algo>_seed<seed>_<timestamp>/` as five files: `scores.csv`, `raw_scores.csv`, `global_best.csv`, `final_positions.csv`, and `summary.json`. See the Usage section for every available option and for the experimental-campaign workflow.
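The quickstart config seeds the swarm with `init_method = "sobol"`. To get a feel for what such an initialisation looks like, here is a small sketch using SciPy's quasi-Monte Carlo module (SciPy is used here purely for illustration and is not AlphaSwarm's API; the dimension of 5 is an arbitrary stand-in for the number of reaction features):

```python
from scipy.stats import qmc

# 24 quasi-random points (one per particle) in a 5-dimensional unit cube.
# Sobol points cover the space more evenly than i.i.d. uniform samples;
# SciPy emits a harmless warning when n is not a power of 2.
sampler = qmc.Sobol(d=5, scramble=True, seed=42)
positions = sampler.random(n=24)  # shape (24, 5), values in [0, 1)

print(positions.shape)
```

Each row is one particle's starting position in the normalised feature space; AlphaSwarm then maps these coordinates back to concrete reaction conditions.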
To run a benchmark, define a configuration file, e.g. benchmark.toml:
```toml
file_path = "data/benchmark/ni_suzuki_virtual_benchmark.csv"  # Path to the dataset with features and targets
y_columns = ["AP yield", "AP selectivity"]  # Columns containing the objective values
exclude_columns = ["ligand", "solvent", "precursor", "base", "cosolvent"]  # (Optional) Columns to exclude from the feature set used for modelling, usually text data
seed = 42                            # Seed for reproducibility
n_iter = 3                           # Number of iterations
n_particles = 24                     # Number of particles (batch size)
init_method = "sobol"                # Initialisation method (random, sobol, LHS, halton)
algo = "alpha-pso"                   # Algorithm to use (canonical-pso, alpha-pso, qnehvi, sobol, qei, qpi, qucb)
objective_function = "weighted_sum"  # Objective function to use (weighted_sum, weighted_power, ...)
results_path = "results"             # (Optional) Directory where the per-run results folder is written (default: "results")

[pso_params]                         # Only for canonical-pso and alpha-pso
c_1 = 1.0                            # Cognitive parameter
c_2 = 1.0                            # Social parameter
c_a = 1.0                            # ML parameter
w = 1.0                              # Inertia parameter
n_particles_to_move = [0, 0]         # Number of particles moved directly to the ML predictions at each iteration after initialisation (list length = n_iter - 1)

[obj_func_params]
weights = [1.0, 1.0]                 # Weights for the weighted-sum objective function
noise = 0.0                          # Noise added to the objectives

[model_config]
kernel = "MaternKernel"              # Kernel to use (MaternKernel, KMaternKernel)
kernel_params = "default"            # Kernel parameters
training_iter = 1000                 # Number of training iterations
```

Note that top-level keys such as `objective_function` and `results_path` must appear before the first `[table]` header, otherwise TOML scopes them into that table.

Then, run the benchmark with the following command:
```shell
alphaswarm benchmark benchmark.toml
```

Each run creates a subfolder `<results_path>/<algo>_seed<seed>_<timestamp>/` containing:
| File | Contents |
|---|---|
| `scores.csv` | Long-format aggregated (scalarised) score per (iteration, particle). |
| `raw_scores.csv` | Per-objective raw scores per (iteration, particle), one column per entry in `y_columns`. |
| `global_best.csv` | Best aggregated score seen at each iteration (convergence curve). |
| `final_positions.csv` | Final particle positions in the normalised feature space, one row per particle. |
| `summary.json` | Run metadata: algorithm, seed, `n_iter`, `n_particles`, `y_columns`, best score and position. |
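The "aggregated (scalarised) score" above is produced by the configured objective function. As an illustration of what a `weighted_sum` scalarisation with optional Gaussian noise could look like (a hypothetical reimplementation sketch, not AlphaSwarm's actual `objective_functions` code):

```python
import numpy as np

def weighted_sum(objectives, weights, noise=0.0, rng=None):
    """Collapse an (n_particles, n_objectives) array into one score per particle."""
    rng = rng if rng is not None else np.random.default_rng(0)
    y = np.asarray(objectives, dtype=float)
    if noise > 0.0:
        # Perturb the raw objectives before aggregating, mimicking the `noise` setting.
        y = y + rng.normal(scale=noise, size=y.shape)
    return y @ np.asarray(weights, dtype=float)

# Two particles, two objectives (e.g. yield and selectivity), equal weights.
scores = weighted_sum([[0.8, 0.9], [0.5, 0.7]], weights=[1.0, 1.0])
print(scores)  # approximately [1.7, 1.2]
```

With `weights = [1.0, 1.0]` both objectives count equally; skewing the weights biases the swarm toward one objective.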
To run an experimental campaign, you need a chemical space file (.csv) that contains your reaction features. This file must also include a column named rxn_id for the reaction identifiers.
If your file includes non-feature columns describing reaction conditions (e.g., catalyst, base, solvent), you must list them in the configuration file under the exclude_columns parameter. This ensures they are excluded from the model's feature set. The config files used in the manuscript accompanying this repository can be found in the /configs directory.
An example of a configuration file is shown below:
> [!WARNING]
> The chemical space features for the experimental campaign must be normalised between 0 and 1.
> Normalisation can be done with the `normalise_features(...)` function.
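If you build your own chemical space file, the scaling can be as simple as a per-column min-max transform. The sketch below is a hypothetical stand-in for the package's `normalise_features(...)` helper (whose actual signature and behaviour may differ):

```python
import numpy as np

def minmax_normalise(X):
    """Scale each feature column to [0, 1]; constant columns map to 0."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (X - lo) / span

# e.g. temperature (degC) and catalyst loading (mol%) on very different scales
X = np.array([[25.0, 0.5], [60.0, 2.0], [100.0, 5.0]])
Xn = minmax_normalise(X)
print(Xn.min(), Xn.max())  # 0.0 1.0
```

Note that the scaling must be fitted on the whole chemical space, not per plate, so that positions remain comparable across iterations.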
```toml
chemical_space_file = "data/experimental_campaigns/example/chemical_space.csv"  # Path to the chemical space
exclude_columns = ["ligand", "solvent", "precursor", "base"]  # (Optional) Columns to exclude from the input features, usually text data (rxn_id is automatically excluded)
iteration_number = 1           # Number of iterations (1 for the initialisation)
seed = 42                      # Random seed for reproducibility
n_particles = 96               # Number of particles (batch size)
init_method = "sobol"          # Initialisation method (random, sobol, LHS, halton)
algo = "alpha-pso"             # Algorithm to use (canonical-pso, alpha-pso, qnehvi, sobol, qei, qpi, qucb)
objective_columns = ["AP yield", "AP selectivity"]  # Columns specifying the objectives

# Suggestions path/file format
pso_suggestions_path = "data/experimental_campaigns/example/pso_plate_suggestions"  # Output path for the PSO suggestions
pso_suggestions_format = "PSO_plate_{}_suggestions.csv"  # File-name format of the PSO suggestions

# Experimental/training data path/file format
experimental_data_path = "data/experimental_campaigns/example/pso_training_data"  # Path to the experimental data
experimental_data_format = "PSO_plate_{}_train.csv"  # File-name format of the training data

[pso_params]                   # Only for canonical-pso and alpha-pso
c_1 = 1.0                      # Cognitive parameter
c_2 = 1.0                      # Social parameter
c_a = 1.0                      # ML parameter
w = 1.0                        # Inertia parameter
n_particles_to_move = [0]      # Number of particles moved directly to the ML predictions at each iteration after initialisation (list length = iteration_number - 1)

[model_config]
kernel = "MaternKernel"        # Kernel to use (MaternKernel, KMaternKernel)
kernel_params = "default"      # Kernel parameters
training_iter = 1000           # Number of training iterations
```

As in the benchmark config, top-level keys must appear before the first `[table]` header, otherwise TOML scopes them into that table.

Then, run the experimental campaign with the following command:

```shell
alphaswarm experimental experimental.toml
```

The package is structured as follows:
```text
alphaswarm/
├── LICENSE                      # MIT License file
├── README.md                    # Installation and usage instructions
├── tox.ini                      # Configuration file for tox (testing)
├── pyproject.toml               # Project configuration file
├── environment.yml              # Configuration file for conda environment
├── data/
│   ├── benchmark/               # Virtual experiments for benchmarking
│   │   ├── buchwald_virtual_benchmark.csv
│   │   ├── ni_suzuki_virtual_benchmark.csv
│   │   ├── sulfonamide_virtual_benchmark.csv
│   │   └── experimental_data/   # Experimental data for training emulators
│   │       ├── buchwald_train_data.csv
│   │       ├── ni_suzuki_train_data.csv
│   │       └── sulfonamide_train_data.csv
│   ├── experimental_campaigns/
│   │   └── pso_suzuki/          # Example of an experimental campaign
│   │       ├── chemical_spaces/             # Chemical spaces
│   │       │   └── pso_suzuki_chemical_space.csv
│   │       ├── configs/                     # Config .toml files used to obtain experimental suggestions
│   │       │   ├── pso_suzuki_iter_1.toml
│   │       │   └── ...
│   │       ├── pso_plate_suggestions/       # Experimental suggestions
│   │       │   ├── PSO_suzuki_plate_1_suggestions.csv
│   │       │   └── ...
│   │       └── pso_training_data/           # Training data (experimental results)
│   │           ├── PSO_suzuki_plate_1_train.csv
│   │           └── ...
│   └── HTE_datasets/            # Experimental HTE datasets in SURF format
│       ├── pd_sulfonamide_SURF.csv
│       └── pd_suzuki_SURF.csv
├── src/
│   └── alphaswarm/
│       ├── __about__.py
│       ├── __init__.py
│       ├── cli.py               # Command line interface tools
│       ├── configs.py           # Configurations for benchmark and experimental campaigns
│       ├── metrics.py           # Metrics for the benchmark
│       ├── objective_functions.py  # Objective functions for the benchmark
│       ├── pso.py               # Main PSO algorithm
│       ├── swarms.py            # Particle and Swarm classes
│       ├── acqf/                # Acquisition functions
│       │   ├── acqf.py
│       │   └── acqfunc.py
│       ├── models/              # Surrogate models
│       │   └── gp.py            # Gaussian Process models
│       └── utils/
│           ├── logger.py        # Logger for the package
│           ├── moo_utils.py     # Utilities for multi-objective optimisation
│           ├── tensor_types.py  # Type definitions for tensors
│           └── utils.py         # General utilities
└── tests/                       # Unit tests
```
All data is stored in the data/ directory. The benchmark/ directory contains the virtual experiments used for benchmarking. The experimental_campaigns/ directory contains the chemical spaces and the experimental data for the experimental campaigns.
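In a closed-loop campaign, the `{}` placeholder in `pso_suggestions_format` and `experimental_data_format` is filled with the plate/iteration index (an assumption inferred from the example file names such as `PSO_suzuki_plate_1_suggestions.csv`). A tiny sketch of the resulting per-iteration bookkeeping:

```python
# Hypothetical illustration of how the format strings pair suggestion files
# (written by alphaswarm) with training files (written by you, after running
# the suggested plate in the lab) across iterations.
pso_suggestions_format = "PSO_plate_{}_suggestions.csv"
experimental_data_format = "PSO_plate_{}_train.csv"

for iteration in (1, 2, 3):
    suggestions = pso_suggestions_format.format(iteration)
    training = experimental_data_format.format(iteration)
    print(suggestions, "->", training)
```

Each new iteration therefore reads all previously collected `*_train.csv` files as training data and emits the next `*_suggestions.csv` plate.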
See developer instructions
To install the package with its test dependencies, run either:

```shell
pip install -e ".[test]"
```

or, with uv and PEP 735 dependency groups:

```shell
uv sync --group test
```

To run style checks:

```shell
uv pip install pre-commit
pre-commit run -a
```

Ruff is used for linting. To run it:

```shell
ruff check src/ --fix
```

To run the tests:

```shell
uv pip install tox
python -m tox -r -e py312
```

Tensor shapes can be checked using jaxtyping. To enable shape checking, set the `TYPECHECK` environment variable to 1 and run the code normally:

```shell
export TYPECHECK=1
```

- Bug reports and feature requests: please open an issue on the GitHub issue tracker.
- Academic enquiries: contact the corresponding authors listed in the publication.
If you use AlphaSwarm in your research, please cite our paper published open access in Chem:
Schlama, R.; Sin, J. W.; Burwood, R. P.; PΓΌntener, K.; Bigler, R.; Schwaller, P. Swarm intelligence for chemical reaction optimization. Chem 2026, 103035. https://doi.org/10.1016/j.chempr.2026.103035
BibTeX:

```bibtex
@article{schlamaSwarmIntelligenceChemical2026,
  title = {Swarm Intelligence for Chemical Reaction Optimization},
  author = {Schlama, Rémi and Sin, Joshua W. and Burwood, Ryan P. and Püntener, Kurt and Bigler, Raphael and Schwaller, Philippe},
  date = {2026-04-17},
  journaltitle = {Chem},
  volume = {0},
  number = {0},
  publisher = {Elsevier},
  issn = {2451-9294, 2451-9308},
  doi = {10.1016/j.chempr.2026.103035},
  url = {https://www.cell.com/chem/abstract/S2451-9294(26)00101-4},
}
```

Released under the MIT License. If you use AlphaSwarm in your work, please also cite our paper.

