Discover the hidden equations inside your data.
DaeFinder is a Scientific Machine Learning toolkit that recovers Differential Algebraic Equations (DAEs) from noisy measurements using a sparse-optimization framework — the SODAs algorithm.
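As a point of reference, one common textbook way to write a DAE is the semi-explicit form below. The symbols are illustrative and not taken from DaeFinder: $x$ collects the variables that have a time derivative, $z$ the remaining variables, $f$ the dynamics, and $g$ the algebraic constraints.

```latex
\begin{aligned}
\dot{x}(t) &= f\bigl(x(t), z(t)\bigr), \\
0 &= g\bigl(x(t), z(t)\bigr).
\end{aligned}
```

In the standard enzyme-kinetics mechanism, for example, conservation of total enzyme, $E + ES = E_0$, plays the role of a constraint $g = 0$.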
Many real-world systems, such as chemical reaction networks, power grids, and mechanical systems, are governed by a mix of differential equations (how things change in time) and algebraic constraints (relationships that always hold). DaeFinder automatically discovers and untangles both directly from time-series data:
- 🧩 Decouples algebraic & dynamic relations — finds the constraints and the ODEs.
- 🪶 Model-agnostic by design — plug in any estimator that implements fit() and score() (linear models, regularized regressors, or your own custom optimizer).
- 🔇 Robust to noise — built-in smoothing and derivative estimation (splines, Savitzky–Golay).
- 🧮 Rich feature engineering — polynomial libraries, sparse feature coupling, and SVD analysis.
- ⚡ Parallel discovery — fit many candidate relations concurrently.
- ✅ Tested & current — runs on the latest scientific-Python stack (see Compatibility).
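The fit()/score() contract mentioned in the feature list can be illustrated with a small stand-alone estimator. The sketch below is hypothetical: the class name ThresholdedLeastSquares, its parameters, and the thresholding strategy (least squares with small coefficients repeatedly zeroed out) are assumptions made for illustration. It does not show how DaeFinder expects a custom estimator to be passed in, and it fits no intercept.

```python
import numpy as np


class ThresholdedLeastSquares:
    """Hypothetical sparse regressor exposing fit() and score()."""

    def __init__(self, threshold=0.1, max_iter=10):
        self.threshold = threshold
        self.max_iter = max_iter

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        # Start from the ordinary least-squares solution.
        coef = np.linalg.lstsq(X, y, rcond=None)[0]
        for _ in range(self.max_iter):
            # Zero out small coefficients, then refit on the surviving columns.
            small = np.abs(coef) < self.threshold
            coef[small] = 0.0
            active = ~small
            if not active.any():
                break
            coef[active] = np.linalg.lstsq(X[:, active], y, rcond=None)[0]
        self.coef_ = coef
        return self

    def predict(self, X):
        return np.asarray(X, dtype=float) @ self.coef_

    def score(self, X, y):
        # Coefficient of determination (R^2), the usual scikit-learn convention.
        y = np.asarray(y, dtype=float)
        residual = y - self.predict(X)
        ss_res = float(residual @ residual)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        return 1.0 - ss_res / ss_tot


# Synthetic check: only columns 0 and 2 actually contribute to y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] - 3.0 * X[:, 2] + 0.01 * rng.normal(size=200)

model = ThresholdedLeastSquares(threshold=0.1).fit(X, y)
print(model.coef_)
print(model.score(X, y))
```

Returning self from fit() and storing the result in coef_ follows the scikit-learn convention, which is a reasonable default when an estimator has to interoperate with other tools.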
If DaeFinder supports your research, please cite the SODAs paper:
M. Jayadharan, C. Catlett, A. N. Montanari, and N. M. Mangan, "SODAs: Sparse Optimization for the Discovery of Differential and Algebraic Equations." Proc. A 1 May 2026; 482 (2337): 20250201. https://doi.org/10.1098/rspa.2025.0201
DaeFinder requires Python 3.9+.
pip install DaeFinder
Install the latest development version from source:
git clone https://github.com/mjayadharan/DAE-FINDER_dev.git
cd DAE-FINDER_dev
pip install -e .
Dependencies (installed automatically): numpy, scipy, pandas, sympy, scikit-learn, matplotlib, joblib.
import pandas as pd
from daeFinder import PolyFeatureMatrix, AlgModelFinder
# Your measurements: one column per state variable, one row per time point.
data = pd.DataFrame({"S": ..., "E": ..., "ES": ..., "P": ...})
# 1) Build a candidate library of nonlinear (polynomial) features.
library = PolyFeatureMatrix(degree=2).fit_transform(data)
# 2) Discover sparse algebraic relations among the library terms.
finder = AlgModelFinder(model_id="lasso", alpha=0.01)
finder.fit(library, scale_columns=True)
# ...or fit every candidate in parallel:
finder.fit(library, scale_columns=True, parallelize=True, num_cpu=8)
# 3) Rank and inspect the strongest recovered relations.
print(finder.best_models(num=5))
Working from noisy data? Smooth it and estimate derivatives first:
from daeFinder import smooth_data
smoothed = smooth_data(data, smooth_method="spline", noise_perc=2, derr_order=1)
See the Examples/ folder for full, runnable walkthroughs.
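To make the idea of smoothing and derivative estimation concrete, here is a minimal stand-alone sketch that uses SciPy's Savitzky–Golay filter on a noisy sine wave. It is independent of smooth_data, whose internals are not shown here. The signal, noise level, window length, and polynomial order are all assumptions chosen for illustration.

```python
import numpy as np
from scipy.signal import savgol_filter

# Noisy samples of sin(t) on a uniform grid (illustrative data only).
t = np.linspace(0.0, 2.0 * np.pi, 400)
dt = t[1] - t[0]
rng = np.random.default_rng(1)
noisy = np.sin(t) + 0.01 * rng.normal(size=t.size)

# Fit a local cubic in a sliding 61-point window.
smooth = savgol_filter(noisy, window_length=61, polyorder=3)
# deriv=1 returns the first derivative of the local fit; delta sets the grid spacing.
deriv = savgol_filter(noisy, window_length=61, polyorder=3, deriv=1, delta=dt)

# Compare against the known signal and derivative away from the boundaries,
# where the filter has a full window of neighbours on both sides.
interior = slice(40, -40)
err_smooth = np.max(np.abs(smooth[interior] - np.sin(t[interior])))
err_deriv = np.max(np.abs(deriv[interior] - np.cos(t[interior])))
print(err_smooth, err_deriv)
```

The window length trades noise suppression against bias: wider windows average out more noise but can flatten sharp features, so the value used here should be treated as a starting point rather than a recommendation.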
DaeFinder follows the SODAs pipeline:
- Smooth & differentiate noisy measurements to obtain clean state variables and their time derivatives.
- Construct a candidate library of nonlinear terms (e.g. polynomial features, coupled features).
- Sparsely regress library terms against one another to surface algebraic constraints, then against derivatives to surface the dynamics.
- Refine & simplify the discovered relations into interpretable symbolic equations.
DaeFinder is continuously tested across the modern scientific-Python stack:
| Component | Supported |
|---|---|
| Python | 3.9 – 3.14 |
| NumPy | 1.x and 2.x |
| pandas | 2.x and 3.x |
| scikit-learn | ≥ 1.2 (incl. 1.9) |
A GitHub Actions matrix runs the full test suite on every supported Python
version. (Recent releases resolved Python 3.13+ exec()/PEP 667, NumPy 2,
pandas 3 copy-on-write, and scikit-learn ≥ 1.7 incompatibilities.)
The package ships with a comprehensive regression suite under tests/.
pip install -r tests/requirements-test.txt
pytest # run everything
./tests/run_tests.sh # run all tests + save a timestamped report
See tests/README.md for how to run the suite, read its reports, and add new tests.
Step-by-step notebooks live in Examples/, covering:
- A guided walkthrough of the discovery pipeline.
- Chemical reaction networks (Michaelis–Menten enzyme kinetics).
- Nonlinear & double pendulums.
- Power-grid networks.
Some examples need extra data or tools. Data files are included in the repository (download the relevant folders). The power-grid example also requires Matpower 6.0 for power-flow calculations.
Manu Jayadharan · Christina Catlett · Arthur Montanari · Grace Hooper · Niall Mangan · Finn Hagerty · Yuxiang Feng
Developed with the Mangan Group at Northwestern University.
Contributions are welcome! Whether it's a bug report, a feature, a new example, or a related research collaboration, please open an issue or contact the authors or the Mangan Group.
- Manu Jayadharan — manu.jayadharan@gmail.com · manu.jayadharan@northwestern.edu
- Niall Mangan — niall.mangan@northwestern.edu