AlphaSwarm: ML-guided Particle Swarm Optimisation for Chemical Reaction Conditions

Graphical abstract (Chem, 2026)

Rémi Schlama · Joshua W. Sin · Ryan P. Burwood · Kurt Püntener · Raphael Bigler · Philippe Schwaller

Chem · 2026 · Article 103035 · 10.1016/j.chempr.2026.103035

Read the paper


About

AlphaSwarm is an ML-guided particle swarm optimisation (PSO) framework for the multi-objective optimisation of chemical reaction conditions. It couples physically intuitive swarm dynamics with Gaussian-process surrogates to efficiently navigate vast, mixed continuous/categorical reaction spaces, and supports both in silico benchmarking against high-throughput-experimentation (HTE) datasets and closed-loop experimental campaigns. The methodology and results are described in our paper published open access in Chem (2026).
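The alpha-PSO idea can be pictured as the canonical velocity rule extended with an attraction toward the surrogate model's predicted optimum. Below is a minimal pure-Python sketch, assuming features normalised to [0, 1] and the c_1/c_2/c_a/w parameters exposed in the config files later in this README; the packaged pso.py is the authoritative implementation, so treat this only as an illustration:

```python
import random

def pso_update(position, velocity, personal_best, global_best, ml_best,
               w=1.0, c_1=1.0, c_2=1.0, c_a=1.0):
    """One hypothetical velocity/position update for a single particle.

    The canonical cognitive (c_1) and social (c_2) terms are augmented
    with an attraction (c_a) toward the surrogate model's predicted
    optimum (ml_best). Positions are clamped to [0, 1], mirroring the
    normalised feature space used by AlphaSwarm.
    """
    new_x, new_v = [], []
    for x, v, pb, gb, mb in zip(position, velocity, personal_best,
                                global_best, ml_best):
        r1, r2, r3 = (random.random() for _ in range(3))
        v_next = (w * v
                  + c_1 * r1 * (pb - x)   # pull toward particle's own best
                  + c_2 * r2 * (gb - x)   # pull toward swarm's best
                  + c_a * r3 * (mb - x))  # pull toward the GP's prediction
        new_v.append(v_next)
        new_x.append(min(1.0, max(0.0, x + v_next)))
    return new_x, new_v

random.seed(0)
x, v = pso_update([0.5, 0.5], [0.0, 0.0], [0.6, 0.4], [0.8, 0.2], [0.7, 0.9])
print(x, v)
```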


🚀 Installation

Install uv with the command

pipx install uv

Create the environment with the following command

uv sync

and activate the environment

source .venv/bin/activate

Alternatively, you can use conda / mamba to create the environment and install all required packages (this is the installation used for all benchmarks and experiments in the paper):

git clone https://github.com/schwallergroup/alphaswarm.git
cd alphaswarm
conda env create -f environment.yml
# mamba env create -f environment.yml
python -m pip install -e .

⚡ Quickstart

After installation, save the following as quickstart.toml; it points at one of the virtual-experiment CSVs shipped in data/benchmark/:

file_path = "data/benchmark/ni_suzuki_virtual_benchmark.csv"
y_columns = ["AP yield", "AP selectivity"]
exclude_columns = ["ligand", "solvent", "precursor", "base", "cosolvent"]

seed = 42
n_iter = 3
n_particles = 24
init_method = "sobol"
algo = "alpha-pso"

objective_function = "weighted_sum"
results_path = "results"  # where per-run CSVs + summary.json are written

[pso_params]
c_1 = 1.0
c_2 = 1.0
c_a = 1.0
w = 1.0
n_particles_to_move = [0, 0]

[obj_func_params]
weights = [1.0, 1.0]
noise = 0.0

[model_config]
kernel = "MaternKernel"
kernel_params = "default"
training_iter = 1000

Then launch a 3-iteration, 24-particle alpha-pso run against the Ni-Suzuki virtual benchmark:

alphaswarm benchmark quickstart.toml

Results are logged to the terminal and persisted to results/<algo>_seed<seed>_<timestamp>/ as five files: scores.csv, raw_scores.csv, global_best.csv, final_positions.csv, and summary.json. See 📖 Usage for every available option and for the experimental-campaign workflow.
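The convergence curve in global_best.csv can be inspected with the standard library alone. The snippet below parses a hypothetical file in memory; the real column names and summary.json fields may differ from the ones assumed here:

```python
import csv
import io
import json

# Hypothetical contents of global_best.csv -- the real header may differ.
example_csv = "iteration,global_best\n1,42.0\n2,57.5\n3,61.2\n"

rows = list(csv.DictReader(io.StringIO(example_csv)))
curve = [float(r["global_best"]) for r in rows]

# The global best can only improve, so the curve is non-decreasing.
assert curve == sorted(curve)

# summary.json-style metadata could be assembled the same way.
summary = {"algo": "alpha-pso", "n_iter": len(curve), "best_score": curve[-1]}
print(json.dumps(summary))
```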

📖 Usage

🖥️ Benchmark

To run a benchmark, define a configuration file, e.g. benchmark.toml:

file_path = "data/benchmark/ni_suzuki_virtual_benchmark.csv"  # Path to the dataset with features and target
y_columns = ["AP yield", "AP selectivity"]  # Columns containing the objectives values
exclude_columns = ["ligand", "solvent", "precursor", "base", "cosolvent"]  # (Optional) Columns to exclude from the feature set used for modelling, usually contains text data

seed = 42  # Seed for reproducibility
n_iter = 3  # Number of iterations
n_particles = 24  # Number of particles (batch size)
init_method = "sobol"  # Initialisation method (random, sobol, LHS, halton)
algo = "alpha-pso"  # Algorithm to use (canonical-pso, alpha-pso, qnehvi, sobol, qei, qpi, qucb)

objective_function = "weighted_sum"  # Objective function to use (weighted_sum, weighted_power, ...)
results_path = "results"  # (Optional) Directory where the per-run results folder is written (default: "results")

[pso_params]  # only for (canonical-pso, alpha-pso)
c_1 = 1.0  # Cognitive parameter
c_2 = 1.0  # Social parameter
c_a = 1.0  # ML parameter
w = 1.0  # Inertia parameter
n_particles_to_move = [0, 0]  # Number of particles to move directly to ML predictions at each iteration after initialisation (list length = n_iter - 1)

[obj_func_params]
weights = [1.0, 1.0]  # Weights for the weighted sum objective function
noise = 0.0  # Noise to add to the objectives

[model_config]
kernel = "MaternKernel"  # Kernel to use (MaternKernel, KMaternKernel)
kernel_params = "default"  # Kernel parameters
training_iter = 1000  # Number of iterations for training
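The weighted_sum objective scalarises the y_columns into a single score per particle. A minimal sketch of that scalarisation, using the weights and noise entries from [obj_func_params]; the packaged objective_functions.py may additionally rescale or sign-flip objectives, so this is illustrative only:

```python
import random

def weighted_sum(objectives, weights, noise=0.0):
    """Scalarise objective values, e.g. [AP yield, AP selectivity].

    `noise` mimics the [obj_func_params] noise entry: the standard
    deviation of Gaussian noise added to emulate experimental error.
    """
    noisy = [o + random.gauss(0.0, noise) for o in objectives]
    return sum(w * o for w, o in zip(weights, noisy))

print(weighted_sum([80.0, 95.0], weights=[1.0, 1.0]))  # 175.0 with noise = 0.0
```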

Then, run the benchmark with the following command:

alphaswarm benchmark benchmark.toml

Each run creates a subfolder <results_path>/<algo>_seed<seed>_<timestamp>/ containing:

scores.csv: Long-format aggregated (scalarised) score per (iteration, particle).
raw_scores.csv: Per-objective raw scores per (iteration, particle), one column per entry in y_columns.
global_best.csv: Best aggregated score seen at each iteration (convergence curve).
final_positions.csv: Final particle positions in the normalised feature space, one row per particle.
summary.json: Run metadata (algorithm, seed, n_iter, n_particles, y_columns, best score and position).
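init_method controls where the first batch of particles lands in the normalised cube. As an illustration of the LHS option, here is a pure-Python Latin hypercube sketch; the package presumably relies on an established QMC implementation, and sobol and halton follow the same [0, 1) convention:

```python
import random

def latin_hypercube(n_particles, n_dims, seed=42):
    """Hypothetical sketch of Latin hypercube sampling.

    Each dimension is split into n_particles equal strata, one sample
    is drawn per stratum, and the strata are shuffled independently per
    dimension, so every particle starts in a distinct slice of every
    feature.
    """
    rng = random.Random(seed)
    points = [[0.0] * n_dims for _ in range(n_particles)]
    for d in range(n_dims):
        strata = list(range(n_particles))
        rng.shuffle(strata)
        for i, s in enumerate(strata):
            points[i][d] = (s + rng.random()) / n_particles
    return points

init = latin_hypercube(n_particles=24, n_dims=5)
print(len(init), "particles in", len(init[0]), "dimensions")
```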

🧪 Experimental campaign

To run an experimental campaign, you need a chemical space file (.csv) that contains your reaction features. This file must also include a column named rxn_id for the reaction identifiers.

If your file includes non-feature columns describing reaction conditions (e.g., catalyst, base, solvent), you must list them in the configuration file under the exclude_columns parameter. This ensures they are excluded from the model's feature set. The config files used in the manuscript accompanying this repository can be found in the /configs directory.

An example of a configuration file is shown below:

Warning

The chemical space features for the experimental campaign must be normalised between 0 and 1. Normalisation can be done with the normalise_features(...) function.

chemical_space_file = "data/experimental_campaigns/example/chemical_space.csv"  # Path to the chemical space
exclude_columns = ["ligand", "solvent", "precursor", "base"]  # (Optional) Columns to exclude from the input features, usually columns containing text data (rxn_id is automatically excluded)

iteration_number = 1  # Number of iterations (1 for the initialisation)

seed = 42  # Random seed for reproducibility
n_particles = 96  # Number of particles (batch size)
init_method = "sobol"  # Initialisation method (random, sobol, LHS, halton)
algo = "alpha-pso"  # Algorithm to use (canonical-pso, alpha-pso, qnehvi, sobol, qei, qpi, qucb)

objective_columns = ["AP yield", "AP selectivity"]  # Columns specifying the objectives

# Suggestions path/file format
pso_suggestions_path = "data/experimental_campaigns/example/pso_plate_suggestions"  # Output path for the PSO suggestions
pso_suggestions_format = "PSO_plate_{}_suggestions.csv"  # File-name format of the PSO suggestions
# Experimental/Training data path/file format
experimental_data_path = "data/experimental_campaigns/example/pso_training_data"  # Path to the experimental data
experimental_data_format = "PSO_plate_{}_train.csv"  # File-name format of the training data

[pso_params]  # only for (canonical-pso, alpha-pso)
c_1 = 1.0  # Cognitive parameter
c_2 = 1.0  # Social parameter
c_a = 1.0  # ML parameter
w = 1.0  # Inertia parameter
n_particles_to_move = [0]  # Number of particles to move directly to ML predictions at each iteration after initialisation (list length = iteration_number - 1)

[model_config]
kernel = "MaternKernel"  # Kernel to use (MaternKernel, KMaternKernel)
kernel_params = "default"  # Kernel parameters
training_iter = 1000  # Number of iterations for training

Then, save the configuration as experimental.toml and run the campaign with the following command:

alphaswarm experimental experimental.toml
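Before the first iteration, the chemical-space features must be min-max scaled to [0, 1]. Below is a sketch of that preprocessing on an in-memory CSV; the packaged normalise_features(...) helper's exact signature is not shown here, so the function and column names are illustrative:

```python
import csv
import io

def normalise_rows(rows, exclude):
    """Min-max scale every numeric column to [0, 1].

    Excluded (text) columns and rxn_id are left untouched. This is a
    hypothetical stand-in for the packaged normalise_features(...)
    helper, not its actual implementation.
    """
    skip = set(exclude) | {"rxn_id"}
    cols = [c for c in rows[0] if c not in skip]
    lo = {c: min(float(r[c]) for r in rows) for c in cols}
    hi = {c: max(float(r[c]) for r in rows) for c in cols}
    for r in rows:
        for c in cols:
            span = hi[c] - lo[c]
            r[c] = (float(r[c]) - lo[c]) / span if span else 0.0
    return rows

raw = "rxn_id,ligand,temperature\nr1,L1,25\nr2,L2,100\n"
rows = normalise_rows(list(csv.DictReader(io.StringIO(raw))), exclude=["ligand"])
print(rows[0]["temperature"], rows[1]["temperature"])  # 0.0 1.0
```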

Package structure

The package is structured as follows:

πŸ“alphaswarm/
    β”œβ”€β”€ LICENSE  # MIT License file
    β”œβ”€β”€ README.md  # Installation and usage instructions
    |── tox.ini  # Configuration file for tox (testing)
    β”œβ”€β”€ pyproject.toml  # Project configuration file
    β”œβ”€β”€ environment.yml # Configuration file for conda environment
    β”œβ”€β”€ data/
    β”‚Β Β  β”œβ”€β”€ benchmark/  # Contains the virtual experiments for benchmarking
    β”‚Β Β  β”‚Β Β  β”œβ”€β”€ buchwald_virtual_benchmark.csv
    β”‚Β Β  β”‚Β Β  β”œβ”€β”€ ni_suzuki_virtual_benchmark.csv
    β”‚   β”‚   β”œβ”€β”€ sulfonamide_virtual_benchmark.csv
    β”‚Β Β  β”‚Β Β  └── experimental_data/  # Contains the experimental data for training emulators
    β”‚Β Β  β”‚Β Β   Β Β  β”œβ”€β”€ buchwald_train_data.csv
    β”‚Β Β  β”‚Β Β   Β Β  β”œβ”€β”€ ni_suzuki_train_data.csv
    β”‚   β”‚       └── sulfonamide_train_data.csv
    β”‚Β Β  β”œβ”€β”€ experimental_campaigns/
    β”‚Β Β  β”‚Β Β  └── pso_suzuki/  # Example of an experimental campaign
    β”‚Β Β  β”‚Β Β      β”œβ”€β”€ chemical_spaces/  # Contains the chemical spaces
    β”‚Β Β  β”‚Β Β      β”‚   └── pso_suzuki_chemical_space.csv
    β”‚   β”‚       β”œβ”€β”€ configs/  # Contains the config .toml files use to obtain experimental suggestions
    β”‚   β”‚       β”‚   β”œβ”€β”€ pso_suzuki_iter_1.toml
    β”‚   β”‚Β Β  Β Β   β”‚Β Β  ...
    β”‚Β Β  β”‚Β Β      β”œβ”€β”€ pso_plate_suggestions/  # Contains the experimental suggestions
    β”‚Β Β  β”‚Β Β   Β Β  |   β”œβ”€β”€ PSO_suzuki_plate_1_suggestions.csv
    β”‚Β Β  β”‚Β Β   Β Β  |   ...
    β”‚Β Β  β”‚Β Β      └── pso_training_data/  # Contains the training data (experimental results)
    β”‚Β Β  β”‚Β Β          β”œβ”€β”€ PSO_suzuki_plate_1_train.csv
    β”‚Β Β  β”‚Β Β          ...
    β”‚   │── HTE_datasets/ # Contains the experimental HTE datasets in SURF format
    β”‚   β”‚   β”œβ”€β”€ pd_sulfonamide_SURF.csv
    β”‚   β”‚   └── pd_suzuki_SURF.csv
    β”œβ”€β”€ src/
    β”‚Β Β  └── alphaswarm/
    β”‚Β Β      β”œβ”€β”€ __about__.py
    β”‚Β Β      β”œβ”€β”€ __init__.py
    β”‚Β Β      β”œβ”€β”€ cli.py  # Command line interface tools
    β”‚Β Β      β”œβ”€β”€ configs.py  # Configurations for benchmark and experimental campaigns
    β”‚Β Β      β”œβ”€β”€ metrics.py  # Metrics for the benchmark
    β”‚Β Β      β”œβ”€β”€ objective_functions.py  # Objective functions for the benchmark
    β”‚Β Β      β”œβ”€β”€ pso.py  # Main PSO algorithm
    β”‚Β Β      β”œβ”€β”€ swarms.py  # Particle and Swarm classes
    β”‚Β Β      β”œβ”€β”€ acqf/  # Acquisition functions
    β”‚Β Β      β”‚Β Β  β”œβ”€β”€ acqf.py
    β”‚Β Β      β”‚Β Β  └── acqfunc.py
    β”‚Β Β      β”œβ”€β”€ models/  # Surrogate models
    β”‚Β Β      β”‚Β Β  └── gp.py  # Gaussian Process models
    β”‚Β Β      └── utils/
    β”‚Β Β          β”œβ”€β”€ logger.py  # Logger for the package
    β”‚Β Β          β”œβ”€β”€ moo_utils.py  # Utilities for multi-objective optimisation
    β”‚Β Β          β”œβ”€β”€ tensor_types.py  # Type definitions for tensors
    β”‚Β Β          └── utils.py  # General utilities
    └── tests/  # Contains all the unit tests
    

All data is stored in the data/ directory. The benchmark/ directory contains the virtual experiments used for benchmarking. The experimental_campaigns/ directory contains the chemical spaces and the experimental data for the experimental campaigns.

πŸ› οΈ Development details


To install the package with its test dependencies, run either:

pip install -e ".[test]"

or, with uv and PEP 735 dependency groups:

uv sync --group test

To run style checks:

uv pip install pre-commit
pre-commit run -a

Run style checks, coverage, and tests

Ruff is used for linting. Run the lint checks (with automatic fixes) with:

ruff check src/ --fix

To test:

uv pip install tox
python -m tox -r -e py312

Tensor shapes can be checked using jaxtyping. To check the shapes, set the TYPECHECK environment variable to 1 and run code normally:

export TYPECHECK=1

💬 Questions and support

  • Bug reports and feature requests: please open an issue on the GitHub issue tracker.
  • Academic enquiries: contact the corresponding authors listed in the publication.

📚 How to cite

If you use AlphaSwarm in your research, please cite our paper published open access in Chem:

Schlama, R.; Sin, J. W.; Burwood, R. P.; PΓΌntener, K.; Bigler, R.; Schwaller, P. Swarm intelligence for chemical reaction optimization. Chem 2026, 103035. https://doi.org/10.1016/j.chempr.2026.103035

BibTeX:

@article{schlamaSwarmIntelligenceChemical2026,
  title = {Swarm Intelligence for Chemical Reaction Optimization},
  author = {Schlama, Rémi and Sin, Joshua W. and Burwood, Ryan P. and Püntener, Kurt and Bigler, Raphael and Schwaller, Philippe},
  date = {2026-04-17},
  journaltitle = {Chem},
  volume = {0},
  number = {0},
  publisher = {Elsevier},
  issn = {2451-9294, 2451-9308},
  doi = {10.1016/j.chempr.2026.103035},
  url = {https://www.cell.com/chem/abstract/S2451-9294(26)00101-4},
}

📄 License

Released under the MIT License. If you use AlphaSwarm in your work, please also cite our paper.
