Organism-agnostic MATLAB platform for mechanistic metabolic route discovery,
multi-omics ranking, and phenotype simulation from any genome-scale model.
Given a Genome-Scale Metabolic Model (GEM) and a substrate→product phenotype (e.g. azo dye decolorization, alcohol production, antibiotic degradation), ELEMENT automatically:
- Discovers K mechanistic route hypotheses from the GEM using sparse MILP with guaranteed stoichiometric balance (S·v = 0)
- Simulates elementary enzymatic kinetics — no Quasi-Steady-State Approximation
- Fits a population-level bridge linking molecular kinetics to observable phenotype
- Ranks hypotheses with a multi-omics composite score (C1–C10 + C_dyn)
- Iterates with accumulative cuts to explore diverse route topologies
- Exports publication-quality figures, tables and a full audit report
To make the governing mathematics visible from the start, ELEMENT v2 uses two complementary core equations:
This equation couples molecular route flux with relative biomass to produce the observable phenotype:
Authorship note: The EFM formulation documented in ELEMENT is presented as an original methodological contribution of the author within the ELEMENT program. Reuse, adaptation, or discussion of this formulation should cite ELEMENT.
EFM variables and parameters:
-
$D(t)$ : normalized observable phenotype at time$t$ ; for example, decolorization fraction, transformed substrate fraction, or accumulated product scaled to 0-1. -
$y_{max}$ : maximum reachable observable amplitude. It maps the integrated molecular-population signal to the experimental ceiling. -
$H_{pop}(t)$ : cumulative population-weighted molecular activity up to time$t$ . Larger values indicate more integrated biological work. -
$v(\tau)$ : route-specific molecular flux or effective transformation rate predicted by the elementary ODE at time$\tau$ . -
$X(\tau)$ : biomass or effective population signal at time$\tau$ . -
$X_0$ : initial biomass used to normalize the population trajectory. -
$X(\tau)/X_0$ : relative biological capacity. Values above 1 indicate growth relative to the inoculum; values below 1 indicate loss of active biomass. -
$t$ : current evaluation time. -
$\tau$ : dummy integration variable used to accumulate biological activity from 0 to$t$ .
ELEMENT also reuses the WF92 surrogate as a biological transformation equation for biomass and condition-specific substrate/product targets:
Authorship note: The ETB formulation as implemented and parameterized in WF92/ELEMENT is presented as an original methodological contribution of the author within the WF92 and ELEMENT program stack. Reuse, adaptation, or discussion of this formulation should cite ELEMENT.
ETB variables and parameters:
-
$D_{ETB}(t)$ : transformed fraction predicted by the WF92 surrogate, bounded between 0 and 1. -
$\phi$ : cumulative effective transformation drive before explicit delay and inhibition penalties are applied. -
$A$ : amplitude term that converts effective capacity into transformation potential. -
$T$ : temporal activation term. It starts near 0 and approaches 1 as the system leaves the lag region. -
$B$ : delay and shape penalty term. Larger values suppress early transformation and shift the rise of the curve. -
$I$ : inhibition term driven by the initial target concentration. -
$V_{max}$ : maximum nominal transformation capacity in the surrogate. -
$X_{eff}$ : effective biological capacity used by the surrogate. Depending on the use case, it can represent biomass, active biomass, or another calibrated activity proxy. -
$K_s$ : saturation constant for effective capacity. It controls how quickly the effect of$X_{eff}$ approaches saturation. -
$a_f$ : scale factor for the amplitude term. -
$bf_f$ : curvature exponent for the amplitude term. Higher values make$A$ respond more nonlinearly to$\phi$ . -
$k_f$ : activation rate constant controlling how fast$T$ approaches 1. -
$k_2$ : steepness parameter for the denominator transition in$B$ . -
$t_{10f}$ : delay coefficient that sets when transformation starts to rise appreciably. -
$C_0$ : initial substrate or target concentration. -
$\alpha_f$ : exponent controlling how strongly the initial concentration shifts the delay term. -
$K_{if}$ : inhibition constant. Larger values weaken the penalty imposed by$C_0$ . -
$b$ : inhibition exponent controlling how sharply inhibition grows as$C_0/K_{if}$ increases. -
$C_{amp}$ : amplitude used when the measured observable increases instead of decreases.
Operational mapping:
The full skill specification is available at .github/skills/element-ode-engine-v2/SKILL.md.
ELEMENT adopts the framework of Ullah et al. (2006) for three-step elementary kinetics without Quasi-Steady-State Approximation:
The ODE system solved by ode15s for each reaction in the route:
Default parameters (organism-agnostic, justified for unbiased route ranking):
| Parameter | Value | Meaning |
|---|---|---|
| 1000 mM⁻¹h⁻¹ | Binding rate constant | |
| 150 h⁻¹ | Reverse (dissociation) rate | |
| 150 h⁻¹ | Catalytic rate | |
| 3×10⁻⁴ mM | Initial enzyme pool | |
| 3×10⁻³ mM | Metabolite background pool |
Design rationale: Uniform
$k_{\text{cat}}$ ensures hypothesis ranking is driven by stoichiometric topology, not by uncertain enzyme-specific constants. Implicit$K_m = k_{\text{rev}}/k_{\text{bind}} = 0.15$ mM is physiologically reasonable. See METHODS.md for full justification.
The molecular ODE produces species concentrations $P_{\text{mol}}$. The observable phenotype
-
$y_{max} \in [0,1]$ : maximum decolorization asymptote (single free parameter) -
$X(t)/X_0$ : normalized biomass (OD600 timeseries) -
$\lambda_{mol}$ : molecular flux scale factor (fitted by non-linear least squares) - CI 95% via GEKKO/bootstrapping
For condition-specific target reconstruction, ELEMENT v2 also exposes the Biological Transformation Equation (ETB) reused from WF92 to describe biomass or substrate/product transformation:
The full v2 skill specification is available at .github/skills/element-ode-engine-v2/SKILL.md.
The bridge R² is reported as C2 in the composite score.
For each gene
where:
-
$\log_2\text{FC}(g)$ : log₂ fold-change from RNA-seq, proteomics (iBAQ ratio), or RT-qPCR -
$\log_2\text{FC}_{\text{cap}} = 2.0$ : clamp to avoid outlier dominance -
$w_{\text{src}}$ : source reliability weight (RT-qPCR > proteomics > RNA-seq)
The expression component C3 for a route is:
Missing data sources degrade gracefully — C3 is computed from available evidence.
subject to
| Component | Description | Default |
|---|---|---|
| C1 | ODE convergence & substrate consumption | 0.10 |
| C2 | Bridge R² (phenotype fit quality) | 0.20 |
| C3 | Multi-omics expression agreement | 0.25 |
| C4 | FVA rigidity (route flux variance) | 0.08 |
| C5 | Phenotype shape (sigmoidal coherence) | 0.12 |
| C6 | FSEOF flux-sum enrichment | 0.05 |
| C7 | Reporter metabolite enrichment | 0.05 |
| C8 | OptMDF thermodynamic feasibility | 0.05 |
| C9 | Carbon continuity in route | 0.05 |
| C10 | pFBA parsimony | 0.05 |
| C_dyn | NMR dynamics R² (substrate + product) | 0.10 |
Quality gate: Score ≥ 0.65 → PASS
Routes are discovered by solving a sparse MILP on the stoichiometric matrix:
Diversity is enforced by accumulative integer cuts after each iteration
This guarantees that no previously discovered route is repeated across the entire search history.
ELEMENT/
├── run_element.m ← Universal launcher (start here)
├── inputs/
│ ├── element_config.json ← Single configuration file (edit this)
│ ├── gem/ ← Genome-scale metabolic model (SBML)
│ ├── omics/
│ │ ├── proteomics/ ← iBAQ quantitative proteomics TSV
│ │ ├── rnaseq/ ← DESeq2 differential expression TSV
│ │ └── rtqpcr/ ← RT-qPCR validation panel TSV
│ ├── nmr/ ← NMR timeseries CSVs (substrate + product)
│ ├── specialty_reactions_db.json ← Extended reactions database
│ ├── dye_molecules_metadata.json
│ ├── electron_carriers_db.json
│ └── species_xref.json
├── matlab/ ← 64 core MATLAB scripts
│ ├── run_iterative_loop_element.m ← Main loop
│ ├── config_reference_v2.m ← Config parser + defaults
│ ├── discover_routes_v2.m ← MILP route discovery
│ ├── build_S_elementary.m ← Stoichiometric matrix builder (MECATE v2)
│ ├── solve_3step.m ← ODE elementary kinetics solver
│ ├── run_phase3_bridge_efm.m ← Bridge fitting
│ ├── run_phase4_multiomics.m ← Multi-omics coupling
│ ├── run_phase6_scoring.m ← Score computation
│ ├── run_phase7_export_docs_lc6.m ← Export figures & report
│ └── ... ← All 64 scripts included
├── outputs/ ← Generated automatically
│ ├── iter_0/ ... iter_N/
│ └── best/ ← Best scoring iteration
├── docs/
│ ├── METHODS.md ← Full mathematical methods
│ ├── SCORE_COMPONENTS.md ← Scoring system documentation
│ └── INPUT_FORMAT.md ← Data format specifications
├── INSTALL.md
├── CITATION.cff
└── LICENSE
| Tool | Version | Required |
|---|---|---|
| MATLAB | R2021b+ | ✅ Required |
| COBRA Toolbox | Latest | ✅ Required |
| IBM CPLEX | 12.10+ | ⚡ Recommended (5–10× speedup) |
| GLPK | (via COBRA) | ✅ Free fallback |
# 1. Clone ELEMENT
git clone https://github.com/jramirezgen/ELEMENT.git
cd ELEMENT
# 2. Clone COBRA Toolbox (if not already installed)
git clone https://github.com/opencobra/cobratoolbox.git
# 3. In MATLAB, initialize COBRA once:
# >> addpath(genpath('cobratoolbox')); initCobraToolbox(false)% The config already points to the included GEM and omics data
% Just run:
run_elementExpected output:
[ELEMENT] ELEMENT v1.0 — Shewanella xiamenensis LC6
[ELEMENT] Substrate: MO → Sulfanilic acid / catechol
[ELEMENT] MAX_ITER=10 | K_ROUTES=3 | PATIENCE=5 | TARGET_C3=0.55
[ELEMENT] ===== ITERATION 0 (no cuts) =====
...
[ELEMENT] ★★★ BEST ITER: 11 (C3=0.4414) ★★★
[ELEMENT] Final score: 0.7199 [PASS ≥ 0.65]
[ELEMENT] Results in: outputs/
- Edit
inputs/element_config.json— changeproject,gem.path,substrate_product_pair, andomicspaths - Place your GEM in
inputs/gem/(SBML Level 3, FBC v2) - Place your omics in
inputs/omics/{proteomics,rnaseq,rtqpcr}/(see docs/INPUT_FORMAT.md) - Run
run_element
Minimal run (GEM only, no omics — scores C3, C4, C7, C8, C10 will be NaN):
{
"project": { "id": "my_organism", "organism": "My Organism" },
"gem": { "path": "inputs/gem/my_model.xml" },
"substrate_product_pair": { "substrate_met_id": "cpd_X[e0]", "product_met_id": "cpd_Y[c0]" },
"omics": { "gene_rxn_mapping_tsv": "", "proteomics_8h_tsv": "" },
"experimental_data": { "decol_tsv": "inputs/my_timeseries.tsv" }
}The included example represents the first validated application of ELEMENT:
| Metric | Value |
|---|---|
| Best iteration | 11 / 16 executed |
| Score total | 0.7199 ✅ PASS |
| Bridge R² | 0.868 |
| C3 expression agreement | 0.441 |
| C_dyn (NMR) | 0.814 |
| φ decolorization | 99.7% |
| GEM | 2472 reactions, 2226 species, 2112 genes |
| Runtime | ~24 min (CPLEX, 16-core) |
Key mechanistic finding: In Methyl Orange degradation, only AzoR/ARS1 is strongly up-regulated (RNA-seq log₂FC = +6.05; RT-qPCR = +6.89), while Mtr/CymA electron transport chain genes are DOWN-regulated — rejecting the EET hypothesis and supporting direct NADH-mediated azo reduction → catechol → β-ketoadipate → TCA.
flowchart TD
A[GEM + Config JSON] --> B[Phase 0: GEM validation]
B --> C[Phase 1: MILP route discovery\nS·v=0, K hypotheses, accumulative cuts]
C --> D[Phase 2: Elementary ODE\nE+S⇌ES→E+P via ode15s]
D --> E[Phase 3: Bridge fitting\nD=ymax·1-exp-Hpop]
E --> F[Phase 4: Multi-omics coupling\nRNA-seq + Proteomics + RT-qPCR]
F --> G[Phase 6: Composite scoring\nC1-C10 + C_dyn]
G --> H{Score ≥ 0.65?}
H -- YES --> I[Phase 7: Export figures + report\nPASS ✅]
H -- NO --> J[Add accumulative cut\nNext iteration]
J --> C
See docs/INPUT_FORMAT.md for complete specifications.
Decolorization timeseries (exp_decol.tsv):
time_h replicate abs_pct
0 1 0.000
3 1 0.312
6 1 0.687
Proteomics (proteomics_Xh.tsv):
gene_id iBAQ_condition iBAQ_control
EJGNCN_00123 1.45e6 2.01e5
RNA-seq (rnaseq.tsv):
gene_id log2FoldChange padj
EJGNCN_00123 2.34 0.001
If you use ELEMENT in your research, please cite:
@software{ramirez2025element,
author = {Ramirez Quispe, Jefferson David and
Ramirez Roca, Pablo Sergio and
Reyes, Bryan Cesar and
Susuki Pacheco, Asami and
Huamantalla Huamán, Magerlyn and
Herrera Quiñones, Josemaria and
Beraun Gasco, Piero},
title = {{ELEMENT}: Enzymatic-Level Equation-based Mechanistic
Elementary-route Network Toolkit},
year = {2025},
version = {1.0.0},
publisher = {GitHub},
url = {https://github.com/jramirezgen/ELEMENT},
institution = {Universidad Nacional Mayor de San Marcos (UNMSM),
Laboratorio de Microbiología Molecular y Biotecnología,
Grupo de Investigación: Genómica Funcional de
Microorganismos y Biorremediación}
}See also CITATION.cff.
Jefferson David Ramirez Quispe, B.Sc.1, Pablo Sergio Ramirez Roca, Ph.D.1, Bryan Cesar Reyes, M.Sc.1, Asami Susuki Pacheco, B.Sc.1, Magerlyn Huamantalla Huamán, M.Sc.1, Josemaria Herrera Quiñones, B.Sc.1, and Piero Beraun Gasco, M.Sc.1
1 Laboratorio de Microbiología Molecular y Biotecnología, Grupo de Investigación en Genómica Funcional de Microorganismos y Biorremediación, Universidad Nacional Mayor de San Marcos (UNMSM), Lima, Perú
Apache License 2.0 — see LICENSE.