⚗️ ELEMENT

Enzymatic-Level Equation-based Mechanistic Elementary-route Network Toolkit

Organism-agnostic MATLAB platform for mechanistic metabolic route discovery,
multi-omics ranking, and phenotype simulation from any genome-scale model.

What ELEMENT does

Given a Genome-Scale Metabolic Model (GEM) and a substrate→product phenotype (e.g. azo dye decolorization, alcohol production, antibiotic degradation), ELEMENT automatically:

Discovers K mechanistic route hypotheses from the GEM using sparse MILP with guaranteed stoichiometric balance (S·v = 0)
Simulates elementary enzymatic kinetics — no Quasi-Steady-State Approximation
Fits a population-level bridge linking molecular kinetics to observable phenotype
Ranks hypotheses with a multi-omics composite score (C1–C10 + C_dyn)
Iterates with accumulative cuts to explore diverse route topologies
Exports publication-quality figures, tables and a full audit report

Mathematical Framework

Core Equations

To make the governing mathematics visible from the start, ELEMENT v2 uses two complementary core equations:

1. Physiological-Molecular Equation (EFM)

This equation couples molecular route flux with relative biomass to produce the observable phenotype:

Authorship note: The EFM formulation documented in ELEMENT is presented as an original methodological contribution of the author within the ELEMENT program. Reuse, adaptation, or discussion of this formulation should cite ELEMENT.

$$ \begin{aligned} D(t) &= y_{max}\,\bigl(1 - e^{-H_{pop}(t)}\bigr),\\ H_{pop}(t) &= \int_{0}^{t} v(\tau)\,\dfrac{X(\tau)}{X_0}\,d\tau \end{aligned} $$

EFM variables and parameters:

$D(t)$: normalized observable phenotype at time $t$; for example, decolorization fraction, transformed substrate fraction, or accumulated product scaled to 0-1.
$y_{max}$: maximum reachable observable amplitude. It maps the integrated molecular-population signal to the experimental ceiling.
$H_{pop}(t)$: cumulative population-weighted molecular activity up to time $t$. Larger values indicate more integrated biological work.
$v(\tau)$: route-specific molecular flux or effective transformation rate predicted by the elementary ODE at time $\tau$.
$X(\tau)$: biomass or effective population signal at time $\tau$.
$X_0$: initial biomass used to normalize the population trajectory.
$X(\tau)/X_0$: relative biological capacity. Values above 1 indicate growth relative to the inoculum; values below 1 indicate loss of active biomass.
$t$: current evaluation time.
$\tau$: dummy integration variable used to accumulate biological activity from 0 to $t$.
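
As a minimal sketch of how the EFM could be evaluated on sampled data (written in Python for illustration; ELEMENT itself is MATLAB, and the function name and input values here are hypothetical), the integral for $H_{pop}(t)$ can be accumulated with the trapezoidal rule:

```python
import math

def efm_phenotype(times, flux, biomass, y_max):
    """Return D(t) = y_max * (1 - exp(-H_pop(t))) on the sampled time grid.

    H_pop(t) is the running trapezoidal integral of v(tau) * X(tau) / X_0.
    """
    x0 = biomass[0]  # X_0: initial biomass used for normalization
    integrand = [v * x / x0 for v, x in zip(flux, biomass)]
    h_pop = 0.0
    curve = [0.0]  # H_pop at the first sample is 0, so D starts at 0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        h_pop += 0.5 * (integrand[i] + integrand[i - 1]) * dt
        curve.append(y_max * (1.0 - math.exp(-h_pop)))
    return curve

# Illustrative inputs (not ELEMENT outputs): constant flux and constant biomass.
times = [0.0, 2.0, 4.0, 6.0, 8.0]   # h
flux = [0.5] * 5                    # v(tau), 1/h
biomass = [0.1] * 5                 # X(tau), e.g. OD600
D = efm_phenotype(times, flux, biomass, y_max=0.95)
print([round(d, 4) for d in D])
```

With constant flux and biomass the integral reduces to $H_{pop}(t) = v\,t$, so the curve is a plain saturating exponential that approaches $y_{max}$.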

2. Biological Transformation Equation (ETB)

ELEMENT also reuses the WF92 surrogate as a biological transformation equation for biomass and condition-specific substrate/product targets:

Authorship note: The ETB formulation as implemented and parameterized in WF92/ELEMENT is presented as an original methodological contribution of the author within the WF92 and ELEMENT program stack. Reuse, adaptation, or discussion of this formulation should cite ELEMENT.

$$ \begin{aligned} D_{ETB}(t) &= \min\left(1, \max\left(0, \dfrac{A\,T}{B\,I}\right)\right),\\ \phi &= \dfrac{V_{max}\,X_{eff}}{K_s + X_{eff}}\,t,\\ A &= a_f\,\max(\phi,0)^{bf_f},\\ T &= 1 - e^{-k_f t},\\ B &= 100\,\bigl[1 + e^{-k_2(\phi/100 - t_{10f}\,C_0^{\alpha_f})}\bigr],\\ I &= 1 + \left(\dfrac{C_0}{K_{if}}\right)^b \end{aligned} $$

ETB variables and parameters:

$D_{ETB}(t)$: transformed fraction predicted by the WF92 surrogate, bounded between 0 and 1.
$\phi$: cumulative effective transformation drive before explicit delay and inhibition penalties are applied.
$A$: amplitude term that converts effective capacity into transformation potential.
$T$: temporal activation term. It starts near 0 and approaches 1 as the system leaves the lag region.
$B$: delay and shape penalty term. Larger values suppress early transformation and shift the rise of the curve.
$I$: inhibition term driven by the initial target concentration.
$V_{max}$: maximum nominal transformation capacity in the surrogate.
$X_{eff}$: effective biological capacity used by the surrogate. Depending on the use case, it can represent biomass, active biomass, or another calibrated activity proxy.
$K_s$: saturation constant for effective capacity. It controls how quickly the effect of $X_{eff}$ approaches saturation.
$a_f$: scale factor for the amplitude term.
$bf_f$: curvature exponent for the amplitude term. Higher values make $A$ respond more nonlinearly to $\phi$.
$k_f$: activation rate constant controlling how fast $T$ approaches 1.
$k_2$: steepness parameter for the denominator transition in $B$.
$t_{10f}$: delay coefficient that sets when transformation starts to rise appreciably.
$C_0$: initial substrate or target concentration.
$\alpha_f$: exponent controlling how strongly the initial concentration shifts the delay term.
$K_{if}$: inhibition constant. Larger values weaken the penalty imposed by $C_0$.
$b$: inhibition exponent controlling how sharply inhibition grows as $C_0/K_{if}$ increases.
$C_{amp}$: amplitude used when the measured observable increases instead of decreases.

Operational mapping:

$$ C(t)= \begin{cases} C_0\,\bigl(1 - D_{ETB}(t)\bigr), & \text{decreasing observable},\\ C_0 + C_{amp}\,D_{ETB}(t), & \text{increasing observable}. \end{cases} $$
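
The ETB terms and the operational mapping can be evaluated term by term. The following Python sketch is illustrative only: the function names are hypothetical and every parameter value is a placeholder chosen to produce a readable curve, not a calibrated WF92/ELEMENT value.

```python
import math

def etb_fraction(t, x_eff, c0, p):
    """Evaluate D_ETB(t) from the phi, A, T, B and I terms of the ETB."""
    phi = p["V_max"] * x_eff / (p["K_s"] + x_eff) * t
    A = p["a_f"] * max(phi, 0.0) ** p["bf_f"]
    T = 1.0 - math.exp(-p["k_f"] * t)
    delay = p["t_10f"] * c0 ** p["alpha_f"]
    B = 100.0 * (1.0 + math.exp(-p["k_2"] * (phi / 100.0 - delay)))
    I = 1.0 + (c0 / p["K_if"]) ** p["b"]
    return min(1.0, max(0.0, A * T / (B * I)))

def etb_concentration(t, x_eff, c0, p, increasing=False, c_amp=0.0):
    """Map D_ETB(t) to a concentration using the operational mapping."""
    d = etb_fraction(t, x_eff, c0, p)
    return c0 + c_amp * d if increasing else c0 * (1.0 - d)

# Placeholder parameters (assumptions for illustration only).
params = {
    "V_max": 50.0, "K_s": 0.5, "a_f": 1.0, "bf_f": 1.2, "k_f": 0.5,
    "k_2": 5.0, "t_10f": 0.5, "alpha_f": 1.0, "K_if": 10.0, "b": 2.0,
}
for t in (0.0, 1.0, 2.0, 4.0, 10.0):
    d = etb_fraction(t, x_eff=0.5, c0=1.0, p=params)
    c = etb_concentration(t, x_eff=0.5, c0=1.0, p=params)
    print(f"t={t:5.1f} h  D_ETB={d:.3f}  C={c:.3f}")
```

The outer `min`/`max` clamp is what keeps the output inside 0 to 1 once $A\,T$ outgrows $B\,I$ at late times.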

The full skill specification is available at .github/skills/element-ode-engine-v2/SKILL.md.

1 — Elementary Enzymatic Kinetics (Phase 2)

ELEMENT adopts the framework of Ullah et al. (2006) for three-step elementary kinetics without Quasi-Steady-State Approximation:

$$E + S \;\overset{k_{\text{bind}}}{\underset{k_{\text{rev}}}{\rightleftharpoons}}\; ES \xrightarrow{\;k_{\text{cat}}\;} E + P$$

The ODE system solved by ode15s for each reaction in the route:

$$\frac{d[E]}{dt} = -k_{\text{bind}}[E][S] + (k_{\text{rev}} + k_{\text{cat}})[ES]$$

$$\frac{d[S]}{dt} = -k_{\text{bind}}[E][S] + k_{\text{rev}}[ES]$$

$$\frac{d[ES]}{dt} = k_{\text{bind}}[E][S] - (k_{\text{rev}} + k_{\text{cat}})[ES]$$

$$\frac{d[P]}{dt} = k_{\text{cat}}[ES]$$

Default parameters (organism-agnostic, justified for unbiased route ranking):

Parameter	Value	Meaning
$k_{\text{bind}}$	1000 mM⁻¹h⁻¹	Binding rate constant
$k_{\text{rev}}$	150 h⁻¹	Reverse (dissociation) rate
$k_{\text{cat}}$	150 h⁻¹	Catalytic rate
$E_0$	3×10⁻⁴ mM	Initial enzyme pool
$[bg]$	3×10⁻³ mM	Metabolite background pool

Design rationale: Uniform $k_{\text{cat}}$ ensures hypothesis ranking is driven by stoichiometric topology, not by uncertain enzyme-specific constants. The implied dissociation constant $K_d = k_{\text{rev}}/k_{\text{bind}} = 0.15$ mM, which corresponds to a Michaelis constant $K_m = (k_{\text{rev}} + k_{\text{cat}})/k_{\text{bind}} = 0.30$ mM, is physiologically reasonable. See METHODS.md for full justification.
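
To illustrate the four ODEs with the default parameters above, here is a small Python sketch for a single reaction. It swaps ode15s for an explicit fixed-step RK4 integrator so that it has no dependencies. The step size is an assumption chosen to stay inside RK4's stability limit for these rate constants, and the substrate concentration and time horizon are illustrative.

```python
K_BIND, K_REV, K_CAT = 1000.0, 150.0, 150.0   # defaults from the table above
E0 = 3e-4                                     # initial enzyme pool, mM

def rhs(state):
    """Right-hand side of the elementary E + S <-> ES -> E + P system."""
    e, s, es, p = state
    bind = K_BIND * e * s
    release = (K_REV + K_CAT) * es
    return (-bind + release, -bind + K_REV * es, bind - release, K_CAT * es)

def rk4_step(state, dt):
    """One classical Runge-Kutta step (a stand-in for ode15s, not a stiff solver)."""
    def shifted(k, a):
        return tuple(x + a * dx for x, dx in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def simulate(s0, t_end, dt=5e-4):
    """Integrate from (E0, s0, 0, 0) to t_end (hours) and return the final state."""
    state = (E0, s0, 0.0, 0.0)
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, dt)
    return state

e, s, es, p = simulate(s0=1.0, t_end=1.0)     # 1 mM substrate, 1 h (illustrative)
k_d = K_REV / K_BIND                          # dissociation constant, mM
k_m = (K_REV + K_CAT) / K_BIND                # Michaelis constant, mM
print(f"P = {p:.4f} mM, E + ES = {e + es:.2e} mM, K_d = {k_d}, K_m = {k_m}")
```

Two conservation checks are useful when testing any solver for this system: $[E] + [ES]$ should stay at $E_0$, and $[S] + [ES] + [P]$ should stay at the initial substrate concentration.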

2 — Population-Level Bridge (Phase 3)

The molecular ODE produces species concentrations $P_{\text{mol}}$. The observable phenotype $D(t)$ (e.g. % decolorization) is linked via a bridge equation based on cumulative biological capacity:

$$ \begin{aligned} D(t) &= y_{max}\,\bigl(1 - e^{-H_{pop}(t)}\bigr),\\ H_{pop}(t) &= \lambda_{mol}\,\int_{0}^{t} \dfrac{X(\tau)}{X_0}\,d\tau \end{aligned} $$

$y_{max} \in [0,1]$: maximum decolorization asymptote (single free parameter)
$X(t)/X_0$: normalized biomass (OD600 timeseries)
$\lambda_{mol}$: molecular flux scale factor (fitted by non-linear least squares)
95% confidence intervals via GEKKO/bootstrapping
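
A least-squares fit of the bridge can be sketched as follows. This is not ELEMENT's fitting routine: it is a dependency-free Python illustration that solves $y_{max}$ in closed form for each trial $\lambda_{mol}$ and then searches $\lambda_{mol}$ with a log-spaced grid followed by golden-section refinement. The function names, search range, and synthetic data are assumptions, and confidence intervals are not computed.

```python
import math

def cumulative_capacity(times, biomass):
    """Running trapezoidal integral of X(tau)/X_0 at each sample time."""
    x0 = biomass[0]
    total, out = 0.0, [0.0]
    for i in range(1, len(times)):
        total += 0.5 * (biomass[i] + biomass[i - 1]) / x0 * (times[i] - times[i - 1])
        out.append(total)
    return out

def project_ymax(lam, cap, observed):
    """For a fixed lambda_mol, return the least-squares y_max in [0, 1] and the SSE."""
    g = [1.0 - math.exp(-lam * c) for c in cap]
    y_max = sum(d * v for d, v in zip(observed, g)) / sum(v * v for v in g)
    y_max = min(1.0, max(0.0, y_max))
    sse = sum((d - y_max * v) ** 2 for d, v in zip(observed, g))
    return y_max, sse

def fit_bridge(times, biomass, observed):
    """Return (y_max, lambda_mol, R^2) for the bridge equation."""
    cap = cumulative_capacity(times, biomass)
    grid = [10 ** (-3 + 4 * i / 59) for i in range(60)]        # 1e-3 to 10
    best = min(range(60), key=lambda i: project_ymax(grid[i], cap, observed)[1])
    lo, hi = grid[max(best - 1, 0)], grid[min(best + 1, 59)]
    ratio = (math.sqrt(5) - 1) / 2
    while hi - lo > 1e-9:                                      # golden-section search
        a, b = hi - ratio * (hi - lo), lo + ratio * (hi - lo)
        if project_ymax(a, cap, observed)[1] < project_ymax(b, cap, observed)[1]:
            hi = b
        else:
            lo = a
    lam = 0.5 * (lo + hi)
    y_max, sse = project_ymax(lam, cap, observed)
    mean = sum(observed) / len(observed)
    sst = sum((d - mean) ** 2 for d in observed)
    return y_max, lam, 1.0 - sse / sst

# Synthetic check: data generated with y_max = 0.9 and lambda_mol = 0.3.
times = [0, 2, 4, 6, 8, 10, 12]
biomass = [0.10, 0.14, 0.22, 0.35, 0.50, 0.60, 0.65]
truth = [0.9 * (1 - math.exp(-0.3 * c)) for c in cumulative_capacity(times, biomass)]
y_fit, lam_fit, r2 = fit_bridge(times, biomass, truth)
print(f"y_max={y_fit:.4f}  lambda_mol={lam_fit:.4f}  R2={r2:.6f}")
```

The returned $R^2$ plays the role of the bridge fit quality that the composite score reports as C2.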

For condition-specific target reconstruction, ELEMENT v2 also exposes the Biological Transformation Equation (ETB) reused from WF92 to describe biomass or substrate/product transformation:

$$D_{ETB}(t) = \min\left(1, \max\left(0, \dfrac{A\,T}{B\,I}\right)\right)$$

$$ C(t)= \begin{cases} C_0\,\bigl(1 - D_{ETB}(t)\bigr), & \text{decreasing observable},\\ C_0 + C_{amp}\,D_{ETB}(t), & \text{increasing observable}. \end{cases} $$

The full v2 skill specification is available at .github/skills/element-ode-engine-v2/SKILL.md.

The bridge R² is reported as C2 in the composite score.

3 — Multi-Omics Expression Coupling (Phase 4)

For each gene $g$ in a candidate route, the expression score $\mathcal{E}(g)$ is computed as:

$$\mathcal{E}(g) = \text{sign}(\Delta) \cdot \dfrac{|\log_2\text{FC}(g)|}{\log_2\text{FC}_{\text{cap}}} \cdot w_{\text{src}}$$

where:

$\log_2\text{FC}(g)$: log₂ fold-change from RNA-seq, proteomics (iBAQ ratio), or RT-qPCR
$\log_2\text{FC}_{\text{cap}} = 2.0$: clamp to avoid outlier dominance
$w_{\text{src}}$: source reliability weight (RT-qPCR > proteomics > RNA-seq)

The expression component C3 for a route is:

$$C_3 = \frac{1}{|G|} \sum_{g \in G} \mathcal{E}(g) \cdot \mathbf{1}[\text{direction matches}]$$

Missing data sources degrade gracefully — C3 is computed from available evidence.
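
The scoring rule can be sketched in Python under several stated assumptions. The source weights are hypothetical, since only their ordering is given above. $\text{sign}(\Delta)$ is taken to be the sign of the fold-change. "Direction matches" is taken to mean that the fold-change has the expected sign. Genes without evidence are skipped rather than counted as zero. Apart from the AzoR RT-qPCR value quoted in the validation section, gene names and values are invented.

```python
import math

LOG2FC_CAP = 2.0
# Hypothetical reliability weights: only RT-qPCR > proteomics > RNA-seq is specified.
SOURCE_WEIGHT = {"rtqpcr": 1.0, "proteomics": 0.8, "rnaseq": 0.6}

def expression_score(log2fc, source):
    """E(g): signed, capped and source-weighted expression evidence for one gene."""
    clamped = min(abs(log2fc), LOG2FC_CAP)           # cap to limit outlier dominance
    sign = math.copysign(1.0, log2fc) if log2fc != 0 else 0.0
    return sign * clamped / LOG2FC_CAP * SOURCE_WEIGHT[source]

def c3(route_genes, evidence, expected_direction):
    """Average E(g) over genes with data, counting only direction matches."""
    scores = []
    for gene in route_genes:
        if gene not in evidence:
            continue                                 # missing data: gene is skipped
        log2fc, source = evidence[gene]
        matches = log2fc * expected_direction[gene] > 0
        scores.append(expression_score(log2fc, source) if matches else 0.0)
    return sum(scores) / len(scores) if scores else float("nan")

route = ["azoR", "geneB", "geneC", "geneD"]          # geneB..geneD are made up
evidence = {
    "azoR": (6.89, "rtqpcr"),                        # RT-qPCR value from the LC6 example
    "geneB": (1.0, "rnaseq"),
    "geneC": (-1.5, "rnaseq"),                       # expected up, observed down
}
expected = {"azoR": +1, "geneB": +1, "geneC": +1, "geneD": +1}
print(round(c3(route, evidence, expected), 4))
```

Here azoR saturates at the cap and contributes 1.0, geneB contributes 0.3, geneC contributes 0 because its direction does not match, and geneD is ignored for lack of data.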

4 — Composite Scoring System

$$\text{Score}_{\text{total}} = \sum_{k=1}^{10} w_k \, C_k + w_{\text{dyn}} \, C_{\text{dyn}}$$

subject to $\sum w_k + w_{\text{dyn}} = 1$. When a component is NaN, its weight is redistributed proportionally among the remaining components.

Component	Description	Default $w$
C1	ODE convergence & substrate consumption	0.10
C2	Bridge R² (phenotype fit quality)	0.20
C3	Multi-omics expression agreement	0.25
C4	FVA rigidity (route flux variance)	0.08
C5	Phenotype shape (sigmoidal coherence)	0.12
C6	FSEOF flux-sum enrichment	0.05
C7	Reporter metabolite enrichment	0.05
C8	OptMDF thermodynamic feasibility	0.05
C9	Carbon continuity in route	0.05
C10	pFBA parsimony	0.05
C_dyn	NMR dynamics R² (substrate + product)	0.10

Quality gate: Score ≥ 0.65 → PASS
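
A Python sketch of the weighted sum with proportional NaN redistribution follows. It is an illustration, not ELEMENT's scoring code. Because the default weights listed in the table add up to 1.10, the sketch divides by the total weight of the available components, which both normalizes the weights and redistributes the weight of NaN components. Component values other than C2 and C_dyn are invented.

```python
import math

DEFAULT_WEIGHTS = {
    "C1": 0.10, "C2": 0.20, "C3": 0.25, "C4": 0.08, "C5": 0.12,
    "C6": 0.05, "C7": 0.05, "C8": 0.05, "C9": 0.05, "C10": 0.05,
    "C_dyn": 0.10,
}
PASS_THRESHOLD = 0.65

def composite_score(components, weights=DEFAULT_WEIGHTS):
    """Weighted mean over non-NaN components; their weights are rescaled to sum to 1."""
    available = {k: v for k, v in components.items() if not math.isnan(v)}
    total_weight = sum(weights[k] for k in available)
    if total_weight == 0:
        return float("nan")
    return sum(weights[k] * v for k, v in available.items()) / total_weight

nan = float("nan")
# GEM-only style run: C3, C4, C7, C8 and C10 are NaN, the rest are illustrative.
components = {
    "C1": 0.9, "C2": 0.868, "C3": nan, "C4": nan, "C5": 0.7, "C6": 0.5,
    "C7": nan, "C8": nan, "C9": 0.6, "C10": nan, "C_dyn": 0.814,
}
score = composite_score(components)
print(f"score = {score:.4f}", "PASS" if score >= PASS_THRESHOLD else "FAIL")
```

Dividing by the available weight means a run with missing evidence is scored on the same 0 to 1 scale as a complete run, so the 0.65 gate keeps its meaning.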

5 — Iterative Route Discovery (MILP + Accumulative Cuts)

Routes are discovered by solving a sparse MILP on the stoichiometric matrix:

$$\min_{v} \; \mathbf{1}^\top |v| \quad \text{s.t.} \quad S \cdot v = 0, \; v_{\text{sub}} \leq -\epsilon, \; v_{\text{prod}} \geq \epsilon$$

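Diversity is enforced by accumulative integer cuts after each iteration $i$, where $y_j \in \{0, 1\}$ indicates whether reaction $j$ is active and $\mathcal{R}_i$ is the set of reactions in the route found at iteration $i$:

$$\sum_{j \in \mathcal{R}_i} y_j \leq |\mathcal{R}_i| - 1 \quad \forall i \in \{0, \ldots, n_{\text{prev}}\}$$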

This guarantees that no previously discovered route is repeated across the entire search history.
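
The effect of accumulating cuts can be shown on a toy network. The Python sketch below replaces the MILP solver with brute-force enumeration of reaction subsets in order of increasing size, which is only feasible for very small networks. The six-reaction network, the unit-flux balance check, and the function names are all hypothetical.

```python
from itertools import combinations

# Toy network (hypothetical): each reaction converts one metabolite into another.
REACTIONS = {
    "r1": ("A", "B"), "r2": ("B", "P"), "r3": ("A", "C"),
    "r4": ("C", "D"), "r5": ("D", "P"), "r6": ("B", "D"),
}

def is_balanced_route(active, substrate, product):
    """With unit flux on each active reaction, internal metabolites must net to zero."""
    net = {}
    for name in active:
        src, dst = REACTIONS[name]
        net[src] = net.get(src, 0) - 1
        net[dst] = net.get(dst, 0) + 1
    internal_ok = all(v == 0 for m, v in net.items() if m not in (substrate, product))
    return internal_ok and net.get(substrate, 0) == -1 and net.get(product, 0) == 1

def violates_cut(active, previous_routes):
    """Cut i requires sum_{j in R_i} y_j <= |R_i| - 1, so containing all of R_i is forbidden."""
    return any(len(route & active) > len(route) - 1 for route in previous_routes)

def discover_routes(k, substrate="A", product="P"):
    """Return up to k distinct routes, sparsest first, adding one cut per route found."""
    found = []
    names = sorted(REACTIONS)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            active = frozenset(combo)
            if violates_cut(active, found):
                continue
            if is_balanced_route(active, substrate, product):
                found.append(active)          # this route becomes a new cut
                if len(found) == k:
                    return found
    return found

routes = discover_routes(k=3)
print([sorted(r) for r in routes])
```

Because each cut excludes every reaction set that contains a previous route in full, supersets of an earlier route are rejected along with exact repeats.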

Architecture

ELEMENT/
├── run_element.m                  ← Universal launcher (start here)
├── inputs/
│   ├── element_config.json        ← Single configuration file (edit this)
│   ├── gem/                       ← Genome-scale metabolic model (SBML)
│   ├── omics/
│   │   ├── proteomics/            ← iBAQ quantitative proteomics TSV
│   │   ├── rnaseq/                ← DESeq2 differential expression TSV
│   │   └── rtqpcr/                ← RT-qPCR validation panel TSV
│   ├── nmr/                       ← NMR timeseries CSVs (substrate + product)
│   ├── specialty_reactions_db.json   ← Extended reactions database
│   ├── dye_molecules_metadata.json
│   ├── electron_carriers_db.json
│   └── species_xref.json
├── matlab/                        ← 64 core MATLAB scripts
│   ├── run_iterative_loop_element.m   ← Main loop
│   ├── config_reference_v2.m          ← Config parser + defaults
│   ├── discover_routes_v2.m           ← MILP route discovery
│   ├── build_S_elementary.m           ← Stoichiometric matrix builder (MECATE v2)
│   ├── solve_3step.m                  ← ODE elementary kinetics solver
│   ├── run_phase3_bridge_efm.m        ← Bridge fitting
│   ├── run_phase4_multiomics.m        ← Multi-omics coupling
│   ├── run_phase6_scoring.m           ← Score computation
│   ├── run_phase7_export_docs_lc6.m   ← Export figures & report
│   └── ...                            ← All 64 scripts included
├── outputs/                       ← Generated automatically
│   ├── iter_0/ ... iter_N/
│   └── best/                      ← Best scoring iteration
├── docs/
│   ├── METHODS.md                 ← Full mathematical methods
│   ├── SCORE_COMPONENTS.md        ← Scoring system documentation
│   └── INPUT_FORMAT.md            ← Data format specifications
├── INSTALL.md
├── CITATION.cff
└── LICENSE

Quick Start

Prerequisites

Tool	Version	Required
MATLAB	R2021b+	✅ Required
COBRA Toolbox	Latest	✅ Required
IBM CPLEX	12.10+	⚡ Recommended (5–10× speedup)
GLPK	(via COBRA)	✅ Free fallback

Installation

# 1. Clone ELEMENT
git clone https://github.com/jramirezgen/ELEMENT.git
cd ELEMENT

# 2. Clone COBRA Toolbox (if not already installed)
git clone https://github.com/opencobra/cobratoolbox.git

# 3. In MATLAB, initialize COBRA once:
#    >> addpath(genpath('cobratoolbox')); initCobraToolbox(false)

Running the included example (S. xiamenensis LC6, Methyl Orange)

% The config already points to the included GEM and omics data
% Just run:
run_element

Expected output:

[ELEMENT] ELEMENT v1.0 — Shewanella xiamenensis LC6
[ELEMENT] Substrate: MO → Sulfanilic acid / catechol
[ELEMENT] MAX_ITER=10 | K_ROUTES=3 | PATIENCE=5 | TARGET_C3=0.55
[ELEMENT] ===== ITERATION 0 (no cuts) =====
...
[ELEMENT] ★★★ BEST ITER: 11 (C3=0.4414) ★★★
[ELEMENT] Final score: 0.7199  [PASS ≥ 0.65]
[ELEMENT] Results in: outputs/

Adapting to your organism

Edit inputs/element_config.json — change project, gem.path, substrate_product_pair, and omics paths
Place your GEM in inputs/gem/ (SBML Level 3, FBC v2)
Place your omics in inputs/omics/{proteomics,rnaseq,rtqpcr}/ (see docs/INPUT_FORMAT.md)
Run run_element

Minimal run (GEM only, no omics — scores C3, C4, C7, C8, C10 will be NaN):

{
  "project":              { "id": "my_organism", "organism": "My Organism" },
  "gem":                  { "path": "inputs/gem/my_model.xml" },
  "substrate_product_pair": { "substrate_met_id": "cpd_X[e0]", "product_met_id": "cpd_Y[c0]" },
  "omics":                { "gene_rxn_mapping_tsv": "", "proteomics_8h_tsv": "" },
  "experimental_data":    { "decol_tsv": "inputs/my_timeseries.tsv" }
}
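
Before launching a run, a config like the one above can be checked for empty or missing fields. The Python sketch below is a convenience illustration, not part of ELEMENT. The list of required fields is an assumption inferred from the minimal example, and the authoritative defaults are in config_reference_v2.m.

```python
import json

# Required fields inferred from the minimal example (an assumption).
REQUIRED_FIELDS = {
    "project": ["id", "organism"],
    "gem": ["path"],
    "substrate_product_pair": ["substrate_met_id", "product_met_id"],
    "experimental_data": ["decol_tsv"],
}

def missing_fields(config):
    """Return dotted names of required fields that are absent or empty."""
    missing = []
    for section, keys in REQUIRED_FIELDS.items():
        for key in keys:
            if not config.get(section, {}).get(key):
                missing.append(f"{section}.{key}")
    return missing

config = json.loads("""
{
  "project": { "id": "my_organism", "organism": "My Organism" },
  "gem": { "path": "inputs/gem/my_model.xml" },
  "substrate_product_pair": { "substrate_met_id": "cpd_X[e0]", "product_met_id": "cpd_Y[c0]" },
  "omics": { "gene_rxn_mapping_tsv": "", "proteomics_8h_tsv": "" },
  "experimental_data": { "decol_tsv": "inputs/my_timeseries.tsv" }
}
""")
print(missing_fields(config) or "config looks complete")
```

Empty omics paths are deliberately not flagged, since the minimal run leaves them blank.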

Biological Validation — S. xiamenensis LC6

The included example represents the first validated application of ELEMENT:

Metric	Value
Best iteration	11 / 16 executed
Score total	0.7199 ✅ PASS
Bridge R²	0.868
C3 expression agreement	0.441
C_dyn (NMR)	0.814
φ decolorization	99.7%
GEM	2472 reactions, 2226 species, 2112 genes
Runtime	~24 min (CPLEX, 16-core)

Key mechanistic finding: In Methyl Orange degradation, only AzoR/ARS1 is strongly up-regulated (RNA-seq log₂FC = +6.05; RT-qPCR = +6.89), while the Mtr/CymA electron transport chain genes are down-regulated. This pattern rejects the extracellular electron transfer (EET) hypothesis and supports direct NADH-mediated azo reduction → catechol → β-ketoadipate → TCA.

Pipeline Overview

flowchart TD
    A[GEM + Config JSON] --> B[Phase 0: GEM validation]
    B --> C[Phase 1: MILP route discovery\nS·v=0, K hypotheses, accumulative cuts]
    C --> D[Phase 2: Elementary ODE\nE+S⇌ES→E+P via ode15s]
    D --> E[Phase 3: Bridge fitting\nD=ymax·1-exp-Hpop]
    E --> F[Phase 4: Multi-omics coupling\nRNA-seq + Proteomics + RT-qPCR]
    F --> G[Phase 6: Composite scoring\nC1-C10 + C_dyn]
    G --> H{Score ≥ 0.65?}
    H -- YES --> I[Phase 7: Export figures + report\nPASS ✅]
    H -- NO --> J[Add accumulative cut\nNext iteration]
    J --> C

Input Data Formats

See docs/INPUT_FORMAT.md for complete specifications.

Decolorization timeseries (exp_decol.tsv):

time_h  replicate  abs_pct
0       1          0.000
3       1          0.312
6       1          0.687
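
A file in this layout can be reduced to one mean value per time point before fitting. The Python sketch below assumes whitespace- or tab-separated columns with the header names shown above. The replicate 2 rows are invented for the example, and docs/INPUT_FORMAT.md remains the authoritative specification.

```python
from collections import defaultdict

def mean_by_time(text):
    """Average abs_pct over replicates for each time_h value."""
    rows = [line.split() for line in text.strip().splitlines()]
    header, data = rows[0], rows[1:]
    t_col, y_col = header.index("time_h"), header.index("abs_pct")
    groups = defaultdict(list)
    for row in data:
        groups[float(row[t_col])].append(float(row[y_col]))
    return {t: sum(v) / len(v) for t, v in sorted(groups.items())}

# Replicate 1 rows come from the sample above; replicate 2 rows are hypothetical.
sample = """time_h\treplicate\tabs_pct
0\t1\t0.000
3\t1\t0.312
6\t1\t0.687
0\t2\t0.000
3\t2\t0.300
6\t2\t0.701
"""
means = mean_by_time(sample)
print(means)
```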

Proteomics (proteomics_Xh.tsv):

gene_id         iBAQ_condition   iBAQ_control
EJGNCN_00123    1.45e6           2.01e5
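
For the proteomics row above, the log₂ fold-change follows from the iBAQ ratio. The Python sketch below assumes the ratio is taken as condition over control with no pseudocount. The function name is hypothetical, and zero intensities would need separate handling.

```python
import math

def ibaq_log2fc(ibaq_condition, ibaq_control):
    """log2 fold-change from an iBAQ condition/control ratio (no pseudocount)."""
    return math.log2(ibaq_condition / ibaq_control)

row = "EJGNCN_00123    1.45e6           2.01e5".split()
gene_id, log2fc = row[0], ibaq_log2fc(float(row[1]), float(row[2]))
print(gene_id, round(log2fc, 3))
```

The result is about 2.85, which exceeds the $\log_2\text{FC}_{\text{cap}} = 2.0$ used in the expression score, so this gene would enter the score at the cap.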

RNA-seq (rnaseq.tsv):

gene_id         log2FoldChange   padj
EJGNCN_00123    2.34             0.001

Citation

If you use ELEMENT in your research, please cite:

@software{ramirez2025element,
  author    = {Ramirez Quispe, Jefferson David and
               Ramirez Roca, Pablo Sergio and
               Reyes, Bryan Cesar and
               Susuki Pacheco, Asami and
               Huamantalla Huamán, Magerlyn and
               Herrera Quiñones, Josemaria and
               Beraun Gasco, Piero},
  title     = {{ELEMENT}: Enzymatic-Level Equation-based Mechanistic
               Elementary-route Network Toolkit},
  year      = {2025},
  version   = {1.0.0},
  publisher = {GitHub},
  url       = {https://github.com/jramirezgen/ELEMENT},
  institution = {Universidad Nacional Mayor de San Marcos (UNMSM),
                 Laboratorio de Microbiología Molecular y Biotecnología,
                 Grupo de Investigación: Genómica Funcional de
                 Microorganismos y Biorremediación}
}

Authors

Jefferson David Ramirez Quispe, B.Sc.¹, Pablo Sergio Ramirez Roca, Ph.D.¹, Bryan Cesar Reyes, M.Sc.¹, Asami Susuki Pacheco, B.Sc.¹, Magerlyn Huamantalla Huamán, M.Sc.¹, Josemaria Herrera Quiñones, B.Sc.¹, and Piero Beraun Gasco, M.Sc.¹

¹ Laboratorio de Microbiología Molecular y Biotecnología, Grupo de Investigación en Genómica Funcional de Microorganismos y Biorremediación, Universidad Nacional Mayor de San Marcos (UNMSM), Lima, Perú

License

Apache License 2.0 — see LICENSE.

ELEMENT v1.0.0 · Universidad Nacional Mayor de San Marcos · 2025
