ELEMENT — Mathematical Methods

1. Route Discovery (Phase 1)

1.1 Sparse MILP Formulation

ELEMENT discovers metabolic routes as sparse flux vectors from the GEM stoichiometric matrix $S \in \mathbb{R}^{m \times n}$ by solving:

$$\min_{v,\, y} \; \mathbf{1}^\top y$$

subject to:

$$S \cdot v = 0$$

$$v_j \in [-U_j \, y_j,\; U_j \, y_j] \quad \forall j$$

$$v_{\text{sub}} \leq -\epsilon, \quad v_{\text{prod}} \geq \epsilon$$

$$y_j \in \{0,1\}, \quad \epsilon = 10^{-4}$$

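where $y_j$ is a binary indicator for reaction $j$: $y_j = 0$ forces $v_j = 0$, and $y_j = 1$ allows flux up to the bound $U_j$ in either direction. Minimizing $\mathbf{1}^\top y$ therefore minimizes the number of active reactions, which yields a sparse route.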
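As a minimal sketch of how the constraints fit together, the following pure-Python checker tests whether a candidate pair $(v, y)$ satisfies the formulation above. The four-reaction toy network, the bound `U = 1000`, and the function name `is_feasible` are illustrative assumptions and not part of ELEMENT; an actual run would hand the formulation to a MILP solver rather than check candidates by hand.

```python
EPS = 1e-4  # epsilon from the formulation above

def is_feasible(S, v, y, U, sub, prod, eps=EPS, tol=1e-9):
    """Check a candidate (v, y) against the sparse-route constraints."""
    # Steady state: every row of S . v must vanish.
    for row in S:
        if abs(sum(s * vj for s, vj in zip(row, v))) > tol:
            return False
    # Indicator coupling: y_j is binary and |v_j| <= U_j * y_j.
    for vj, yj, Uj in zip(v, y, U):
        if yj not in (0, 1) or abs(vj) > Uj * yj + tol:
            return False
    # Substrate exchange must be negative, product exchange positive.
    return v[sub] <= -eps and v[prod] >= eps

# Hypothetical toy network. Rows: metabolites A and B.
# Columns: r0 = A exchange, r1 = A -> B, r2 = B exchange, r3 = A -> B (parallel).
S = [[-1, -1, 0, -1],
     [0, 1, -1, 1]]
U = [1000.0] * 4

v = [-1.0, 1.0, 1.0, 0.0]  # take up A, convert via r1, secrete B
y = [1, 1, 1, 0]           # r3 is switched off
ok = is_feasible(S, v, y, U, sub=0, prod=2)
objective = sum(y)         # 1^T y, the number of active reactions
```

Splitting the conversion flux across `r1` and `r3` while leaving `y[3] = 0` fails the indicator constraint, which is what forces the solver to pay for every reaction it uses.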

1.2 Integer Cuts for Route Diversity

After discovering route $\mathcal{R}_i$ at iteration $i$, the following constraint is added:

$$\sum_{j \in \mathcal{R}_i} y_j \leq |\mathcal{R}_i| - 1$$

Cuts are cumulative: all previous routes $\{\mathcal{R}_0, \mathcal{R}_1, \ldots, \mathcal{R}_{i-1}\}$ remain excluded in every later iteration.
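The effect of the cut can be illustrated with a short sketch. Routes are represented here as sets of reaction indices, and the helper names `violates_cut` and `allowed` are hypothetical, chosen for illustration only.

```python
def violates_cut(y, route):
    """True if y activates every reaction of a previously found route."""
    return sum(y[j] for j in route) > len(route) - 1

def allowed(y, previous_routes):
    """A candidate is admissible only if it satisfies all accumulated cuts."""
    return not any(violates_cut(y, r) for r in previous_routes)

previous_routes = [{0, 1, 2}]  # R_0, found in the first iteration

same_route = [1, 1, 1, 0]      # reuses R_0 exactly
superset = [1, 1, 1, 1]        # R_0 plus one extra reaction
alternative = [1, 0, 1, 1]     # swaps reaction 1 for reaction 3

results = [allowed(c, previous_routes)
           for c in (same_route, superset, alternative)]
```

Note that a cut of this form rejects not only the route itself but also any candidate that contains all of its reactions, so later routes must drop at least one reaction from each earlier route.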

1.3 eFlux2 Context Weighting

MILP costs are weighted by transcriptomic context (eFlux2 method):

$$c_j = \frac{1}{1 + \bar{w}_j}, \quad \bar{w}_j = \frac{1}{|G_j|} \sum_{g \in G_j} \frac{|\text{LFC}(g)|}{\text{LFC}_{\text{cap}}}$$

where $G_j$ is the set of genes associated with reaction $j$ through its GPR rule and $\text{LFC}(g)$ is the log fold change of gene $g$. Reactions with strong expression evidence receive a lower cost and are therefore more likely to enter the route.
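A worked sketch of the cost formula follows. Several choices are assumptions made for illustration: the default `lfc_cap=2.0` borrows the value stated in Section 4.2, a reaction with no GPR genes is given unit cost, and genes missing from the expression table are treated as having zero fold change.

```python
def reaction_cost(genes, lfc, lfc_cap=2.0):
    """c_j = 1 / (1 + w_bar_j), with w_bar_j the mean of |LFC(g)| / LFC_cap."""
    if not genes:
        return 1.0  # assumption: no GPR genes means no evidence, so unit cost
    # Assumption: a gene absent from the table contributes zero evidence.
    w_bar = sum(abs(lfc.get(g, 0.0)) / lfc_cap for g in genes) / len(genes)
    return 1.0 / (1.0 + w_bar)

# Hypothetical log fold changes for three genes.
lfc = {"g1": 2.0, "g2": -1.0, "g3": 0.0}

c_strong = reaction_cost(["g1", "g2"], lfc)  # w_bar = (1.0 + 0.5) / 2 = 0.75
c_weak = reaction_cost(["g3"], lfc)          # w_bar = 0
c_orphan = reaction_cost([], lfc)            # no GPR rule
```

Because the formula uses $|\text{LFC}(g)|$, up- and down-regulated genes lower the cost equally; direction only enters later, in the per-gene score.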


2. Elementary Kinetics ODE (Phase 2)

2.1 3-Step Elementary Model

Adopted from [Ullah et al. (2006), IEE Proc.-Syst. Biol. 153(6):425-432]:

$$\frac{d[E]}{dt} = -k_1[E][S] + (k_2 + k_3)[ES]$$

$$\frac{d[S]}{dt} = -k_1[E][S] + k_2[ES]$$

$$\frac{d[ES]}{dt} = k_1[E][S] - (k_2 + k_3)[ES]$$

$$\frac{d[P]}{dt} = k_3[ES]$$

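Notation: $k_1 = k_{\text{bind}}$, $k_2 = k_{\text{rev}}$, $k_3 = k_{\text{cat}}$; $[E]$, $[S]$, $[ES]$ and $[P]$ are the concentrations of free enzyme, substrate, enzyme-substrate complex and product.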
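The four equations can be integrated directly for a single reaction. The sketch below uses a fixed-step fourth-order Runge-Kutta scheme in plain Python, which is a stand-in chosen for portability and is adequate only for the mild, illustrative rate constants used here; it is not the stiff solver the pipeline relies on. All numerical values are assumptions.

```python
def rhs(state, k1, k2, k3):
    """Right-hand side of the 3-step model for state (E, S, ES, P)."""
    E, S, ES, P = state
    bind = k1 * E * S
    return (-bind + (k2 + k3) * ES,
            -bind + k2 * ES,
            bind - (k2 + k3) * ES,
            k3 * ES)

def shift(state, deriv, h):
    """Return state + h * deriv, component by component."""
    return tuple(s + h * d for s, d in zip(state, deriv))

def rk4(state, dt, steps, k1, k2, k3):
    """Advance the state by `steps` classical Runge-Kutta steps of size dt."""
    for _ in range(steps):
        a = rhs(state, k1, k2, k3)
        b = rhs(shift(state, a, 0.5 * dt), k1, k2, k3)
        c = rhs(shift(state, b, 0.5 * dt), k1, k2, k3)
        d = rhs(shift(state, c, dt), k1, k2, k3)
        state = tuple(s + dt / 6.0 * (da + 2 * db + 2 * dc + dd)
                      for s, da, db, dc, dd in zip(state, a, b, c, d))
    return state

# Illustrative values only: E0 = 1, S0 = 10, no complex or product at t = 0.
start = (1.0, 10.0, 0.0, 0.0)
E, S, ES, P = rk4(start, dt=0.01, steps=5000, k1=1.0, k2=0.5, k3=0.3)

enzyme_total = E + ES         # conserved: equals E0
substrate_total = S + ES + P  # conserved: equals S0
```

The two conserved totals follow from the equations themselves ($d[E]/dt + d[ES]/dt = 0$ and $d[S]/dt + d[ES]/dt + d[P]/dt = 0$) and make a convenient sanity check on any implementation.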

2.2 Stoichiometric Matrix Extension (MECATE v2)

For a route with reactions $R_1, R_2, \ldots, R_n$ and species $S_1, \ldots, S_m$, ELEMENT builds the ODE system dynamically:

  1. Parse GEM for route reactions
  2. Extract stoichiometry $S_{\text{route}} \in \mathbb{R}^{m_{\text{route}} \times n_{\text{route}}}$
  3. For each substrate/intermediate, apply the 3-step ODE with shared enzyme pools

The system size grows linearly with route length: $O(n_{\text{route}})$, or about $4 \cdot n_{\text{route}}$ ODEs.

2.3 Solver Settings

  • Solver: MATLAB ode15s (stiff, variable step)
  • Relative tolerance: $10^{-6}$, Absolute tolerance: $10^{-9}$
  • Time span: $t \in [0, T_{\text{max}}]$ hours (configurable, default 48 h)

3. Population Bridge (Phase 3)

3.1 Biological Capacity Integral

The cumulative biological capacity $H_{\text{pop}}(t)$ integrates normalized biomass:

$$H_{\text{pop}}(t) = \lambda_{\text{mol}} \int_0^t \frac{X(\tau)}{X_0} \, d\tau$$

where $X(t)$ is the OD600 timeseries and $\lambda_{\text{mol}}$ is a scale factor.

3.2 Bridge Equation

$$D(t) = y_{\max} \cdot \left(1 - e^{-H_{\text{pop}}(t)}\right)$$
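A minimal sketch of the capacity integral and bridge equation on sampled data is given below. It assumes that $X_0$ is the first OD600 sample and uses the trapezoidal rule for the integral; the time points, OD600 values, and parameter values are invented for illustration.

```python
import math

def capacity_integral(times, X, lam_mol):
    """H_pop at each sample time, using the trapezoidal rule on X(t) / X_0."""
    X0 = X[0]  # assumption: X_0 is the first OD600 sample
    H, acc = [0.0], 0.0
    for i in range(1, len(times)):
        acc += 0.5 * (X[i] + X[i - 1]) / X0 * (times[i] - times[i - 1])
        H.append(lam_mol * acc)
    return H

def bridge(H, y_max):
    """D(t) = y_max * (1 - exp(-H_pop(t))) at each sample."""
    return [y_max * (1.0 - math.exp(-h)) for h in H]

# Hypothetical growth curve (hours, OD600).
times = [0.0, 2.0, 4.0, 8.0, 24.0, 48.0]
od600 = [0.05, 0.08, 0.2, 0.9, 2.5, 2.6]

H = capacity_integral(times, od600, lam_mol=0.005)
D = bridge(H, y_max=1.0)
```

Since $X(t) > 0$, $H_{\text{pop}}$ is non-decreasing, so $D(t)$ starts at zero, rises monotonically, and saturates at $y_{\max}$.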

Fitting: $y_{\max}$ and $\lambda_{\text{mol}}$ are estimated by non-linear least squares, with $D_{\text{model}}(t)$ given by the bridge equation above:

$$\min_{y_{\max},\, \lambda_{\text{mol}}} \sum_i \left(D_{\text{exp}}(t_i) - D_{\text{model}}(t_i)\right)^2$$

95% confidence intervals are obtained by bootstrap resampling ($N = 200$).
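One way to see the structure of this fit: for a fixed $\lambda_{\text{mol}}$ the model is linear in $y_{\max}$, so the optimal $y_{\max}$ has a closed form and only $\lambda_{\text{mol}}$ needs a search. The sketch below exploits this with a simple grid over $\lambda_{\text{mol}}$. It is a simplified stand-in for a general non-linear least-squares solver, it omits the bootstrap step, and the data are synthetic, generated from known parameters purely to check the recovery.

```python
import math

def fit_bridge(I, D_exp, lam_grid):
    """Grid search over lambda_mol with the closed-form best y_max at each value.

    I holds the unscaled integrals of X / X_0 up to each t_i, so H_pop = lam * I.
    Returns (sse, y_max, lam) for the best grid point.
    """
    best = None
    for lam in lam_grid:
        f = [1.0 - math.exp(-lam * Ii) for Ii in I]
        denom = sum(fi * fi for fi in f)
        if denom == 0.0:
            continue  # model is identically zero for this lambda
        # Linear least squares in y_max: minimise sum (D_i - y_max * f_i)^2.
        y_max = sum(d * fi for d, fi in zip(D_exp, f)) / denom
        sse = sum((d - y_max * fi) ** 2 for d, fi in zip(D_exp, f))
        if best is None or sse < best[0]:
            best = (sse, y_max, lam)
    return best

# Synthetic, noise-free data from y_max = 2.0 and lambda_mol = 0.005.
I = [0.0, 2.6, 8.2, 52.2, 596.2, 1820.2]
D_exp = [2.0 * (1.0 - math.exp(-0.005 * Ii)) for Ii in I]

lam_grid = [k * 0.001 for k in range(1, 21)]
sse, y_max_hat, lam_hat = fit_bridge(I, D_exp, lam_grid)
```

A bootstrap interval could be layered on top by resampling the $(t_i, D_{\text{exp}}(t_i))$ pairs with replacement and repeating the fit, although the resampling scheme actually used is not specified here.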


4. Multi-Omics Coupling (Phase 4)

4.1 Source Hierarchy

| Source | Weight $w_{\text{src}}$ | Notes |
|---|---|---|
| RT-qPCR | 1.0 | Highest reliability |
| Proteomics iBAQ | 0.85 | Protein-level evidence |
| RNA-seq | 0.70 | Transcript-level evidence |

4.2 Per-Gene Score

$$\mathcal{E}(g) = \text{sign}(\Delta) \cdot \min\!\left(\frac{|\text{LFC}(g)|}{\text{LFC}_{\text{cap}}}, 1\right) \cdot w_{\text{src}}(g)$$

where $\text{LFC}_{\text{cap}} = 2.0$ and $\text{sign}(\Delta) \in \{+1, -1\}$ reflects whether the observed direction matches the expected direction for the route (up for substrate consumption enzymes, down for competing branches).

4.3 Route C3 Score

$$C_3 = \frac{1}{|G_{\text{route}}|} \sum_{g \in G_{\text{route}}} \mathcal{E}(g)$$

For multi-timepoint proteomics:

$$C_3 = 0.5 \cdot C_{3,8h} + 0.3 \cdot C_{3,24h} + 0.2 \cdot C_{3,\text{temporal}}$$
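The scoring chain from per-gene evidence to the route-level $C_3$ can be sketched as follows. The source weights and $\text{LFC}_{\text{cap}}$ come from the definitions above; the function names, the boolean `matches_expected` (taken here to mean $\text{sign}(\Delta) = +1$ when true and $-1$ otherwise), and the example genes are assumptions for illustration.

```python
SOURCE_WEIGHT = {"RT-qPCR": 1.0, "Proteomics iBAQ": 0.85, "RNA-seq": 0.70}
LFC_CAP = 2.0

def gene_score(lfc, source, matches_expected):
    """Per-gene score: sign * min(|LFC| / LFC_cap, 1) * w_src."""
    sign = 1.0 if matches_expected else -1.0
    return sign * min(abs(lfc) / LFC_CAP, 1.0) * SOURCE_WEIGHT[source]

def route_c3(scores):
    """Route score: mean of the per-gene scores."""
    return sum(scores) / len(scores)

def combined_c3(c3_8h, c3_24h, c3_temporal):
    """Weighted combination used for multi-timepoint proteomics."""
    return 0.5 * c3_8h + 0.3 * c3_24h + 0.2 * c3_temporal

# Hypothetical route with three genes: (LFC, source, direction matches?).
evidence = [
    (3.1, "RT-qPCR", True),           # capped at 1.0, weight 1.0
    (1.0, "RNA-seq", True),           # 0.5 * 0.70
    (-0.5, "Proteomics iBAQ", False), # -(0.25 * 0.85)
]
scores = [gene_score(*e) for e in evidence]
c3 = route_c3(scores)
c3_multi = combined_c3(0.4, 0.2, 0.1)
```

Because each per-gene score lies in $[-1, 1]$ and the weights in the multi-timepoint formula sum to one, both versions of $C_3$ also stay within $[-1, 1]$.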


5. NMR Dynamics Scoring (C_dyn)

When NMR timeseries are available for both substrate and product:

$$C_{\text{dyn}} = 0.5 \cdot R^2_{\text{sub}}(t) + 0.5 \cdot r_{\text{Pearson}}(\text{SA}(t))$$

where $R^2_{\text{sub}}$ is the coefficient of determination for bridge-predicted vs NMR-measured substrate decay, and $r_{\text{Pearson}}$ is the Pearson correlation for the product appearance curve.
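A small sketch of the score is shown below. It assumes that the Pearson term compares the predicted and measured product curves, and all time series are invented for illustration; the function names are hypothetical.

```python
import math

def r_squared(obs, pred):
    """Coefficient of determination of pred against obs."""
    mean = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def c_dyn(sub_obs, sub_pred, prod_obs, prod_pred):
    """Equal-weight combination of substrate R^2 and product correlation."""
    return 0.5 * r_squared(sub_obs, sub_pred) + 0.5 * pearson(prod_obs, prod_pred)

# Hypothetical NMR measurements and model predictions.
sub_obs = [10.0, 7.0, 4.0, 2.0, 1.0]
prod_obs = [0.0, 3.0, 6.0, 8.0, 9.0]

perfect = c_dyn(sub_obs, sub_obs, prod_obs, prod_obs)
imperfect = c_dyn(sub_obs, [9.0, 8.0, 5.0, 2.0, 0.0],
                  prod_obs, [2.0 * p for p in prod_obs])
```

The second call shows an asymmetry between the two terms: rescaling the product prediction leaves the Pearson term at 1, whereas any mismatch in the substrate prediction lowers $R^2_{\text{sub}}$.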


6. References

  1. Ullah M, Schmidt H, Cho K-H, Bhatt DL. (2006). Towards a systems-level understanding of Staphylococcus aureus biofilm formation. IEE Proc.-Syst. Biol. 153(6):425–432. https://doi.org/10.1049/ip-syb:20050024

  2. Heirendt L, et al. (2019). Creation and analysis of biochemical constraint-based models using the COBRA Toolbox v.3.0. Nat. Protoc. 14:639–702.

  3. Lewis NE, Hixson KK, Conrad TM, Lerman JA, Charusanti P, Polpitiya AD, Adkins JN, Schramm G, Purvine SO, Lopez-Ferrer D, Weitz KK, Elfenbein R, Conrad RS, Saunders CR, Palsson BØ. (2010). Omic data from evolved E. coli are consistent with computed optimal growth from genome-scale models. Mol. Syst. Biol. 6:390.