# Pathway Analysis¶

## Overview of Pathway Analysis¶

Selection of enzymes on the level of pathways is a relatively understudied topic, but has recently come into focus, especially driven by synthetic biology and metabolic engineering applications. In a series of papers from the 90s, Michael Mavrovouniotis layed out the foundation for synthetic pathway design [1], thermodynamic inference of single reactions based on group contribution [2] [3], and how to identify thermodynamic bottlenecks in entire pathways [4]. The dependence of entire pathway feasibility on the aqueous conditions was explored extensively by Vojinović and von Stockar [5], mainly demonstrating it for the glycolysis pathway. Further advancements in these methods have been presented by Hatzimanikatis et al. [6] and recently reviewed by Wang et al. [7].

eQuilibrator can be used to analyze your pathway of interest using one out of two methods: Max-min Driving Force (MDF) and Enzyme Cost Minimization (ECM). If you are not familiar with any of these methods, we recommend trying MDF first, as it is simpler and does not require any information beyond thermodynamics (i.e. reaction Gibbs energies) which is what eQuilibrator is designed for. If you want to try using ECM, note that you will be asked to provide kinetic parameters for all enzymes in your pathway (or, at least, some approximate values).

## Using eQuilibrator to profile your pathway¶

This process has three steps:

### 1. Generating an SBtab model of your pathway¶

Definition of all the pathway reactions, equilibrium constants for all reactions and concentration bounds for all metabolites and cofactors in a tab-delimited file format that can be edited in Excel or similar spreadsheet applications. eQuilibrator can generate an SBtab description of your pathway from a simplified CSV file (like this one) which defines only the reactions (in free text, as you would in the eQuilibrator search box) and their relative fluxes – separated by a comma. You can then choose some of the global parameters given in the form, namely the minimum and maximum allowed concentrations, pH, and ionic strength. By default, eQuilibrator uses cofactor concentrations chosen in [8], so the user-defined bounds are only relevant for non-cofactors. Finally, you can decide which analysis method to use (Max-min Driving Force or Enzyme Cost Minimization). When you press “Build”, eQuilibrator will parse all these reactions, calculate their ΔrG’° and generate a single SBtab file containing all the relevant information required for the pathway analysis. This file should be compatible with other applications using SBtab.

### 2. Verifying and editing the model¶

Before calculating the MDF or ECM, it is always a good idea to check the SBtab file. eQuilibrator does not always parse free-text reactions perfectly, so it is important to double check that compounds and reactions output are correct.

It is important to keep the concentrations of cofactors (like ATP, CoA and NADH) fixed because these concentrations are homeostatically maintained by host organisms. If you do not fix these concentrations to reasonable, physiologically-relevant values, their concentrations will be optimized (within the provided range), just like all other metabolites. In glycolysis, for example, this would could the ATP concentration as low as allowed (e.g. 1 μM), making ATP synthesis and glycolysis as a whole appear much more favorable than they should. In this case the problem is obvious: maintaining a very low ATP concentration is good for producing ATP from ADP, but catastrophic for the cell because ATP-dependent reactions will operate slowly and with very little driving force.

You can also edit the reaction ΔrG’° values, which are recorded in the SBtab file as equilibrium constants (Keq). This may be useful if your pathway contains a reaction whose ΔrG’° eQuilibrator cannot calculate (for example iron redox reactions are problematic for eQuilibrator). When performing pathway analysis, eQuilibrator verifies that the ΔrG’° are consistent with the first law of thermodynamics, i.e. that they could arise from compound formation energies ΔfG’° that are internally consistant with the stoichiometry of your pathway.

Finally, if you chose ECM analysis in step 1, the SBtab file will also contain a table of kinetic parameters, filled with default values. At this stage, you must change these values to ones obtained from other sources, for example BRENDA database is a great resource for kinetic data.

### 3. Performing pathway analysis¶

Now that your SBtab file is ready, you can simply “Browse” and select it from your file system, then press “ANALYZE” and wait paitently for eQuilibrator to generate a report.

## Max-min Driving Force¶

The Max-min Driving Force (MDF) framework was developed in Noor et al. 2014 [8] and is designed to help metabolic engineers select between alternative pathways for achieving the same or similar metabolic goals. Typically, metabolic engineers must express several heterologous enzymes in a non-native host in order to establish a pathway and often choose the pathway in the absence of good data on the kinetics of pathway enzymes. In this context, traditional metabolic control analysis (MCA) is difficult to apply for two reasons:

1. Reliable kinetic data is required to calculate control coefficients.
2. Engineers usually do not have fine-grained control over enzyme expression and activity, even in well-studied model organisms.

Rather than focusing on the relationship between enzyme levels and pathway flux, the MDF considers a pathway’s stoichiometry and thermodynamics and asks whether it is likely to support high flux in cellular conditions. The MDF of a pathway is a metric of how thermodynamically favorable a pathway can be in physiological conditions, considering allowable metabolite and cofactor concentrations, as well as pH and ionic strength. The value of the MDF is the smallest -ΔrG’ obtained by any pathway reaction when metabolite concentrations are chosen to make all pathway reactions as favorable as possible (-ΔrG’ as positive as possible). If the MDF is sufficiently high, the pathway contains no thermodynamic bottlenecks that would hamper its operation in vivo.

MDF is solved using a simple Linear Program (LP):

\begin{eqnarray} \text{maximize} & B \\ \text{subject to} & -\Delta_r \mathbf{G}' & \geq B \\ & \Delta_r \mathbf{G}' &= \Delta_r \mathbf{G}'^\circ + RT \cdot S^\top \cdot \mathbf{x} \\ & \ln(C_{min}) &\leq \mathbf{x} \leq \ln(C_{max}) \end{eqnarray}

where $$\mathbf{x}$$ is the vector of all metabolite log-concentrations, S is the stoichiometric matrix, Cmin and Cmax are vectors with the given concentration lower and upper bounds, R is the gas constant and T is the temperature. The optimal B is the Max-min Driving Force, given in units of kJ/mol.

This approach has several practical advantages over MCA for the purposes of metabolic engineering. First, enzyme kinetic properties are laborious to measure and differ between organisms and isozymes, but no kinetic data is required to calculate the MDF. Second, as the MDF accounts for pH, ionic strength and allowed concentration ranges, it is simple to model the effect of these parameters on the MDF. Finally, as it can be difficult to control the exact expression level of enzymes within cells, the MDF helps identify pathways that are less sensitive to the levels of their constituent enzymes.

## Enzyme Cost Minimization¶

A newer method, that optimizes the total cost of required enzyme for supporting a pathway, called Enzyme Cost Minimization (ECM) was published by Noor et al. in 2016 [9]. Rather than using a purely thermodynamic criterion to evaluate pathway efficiency (which is what MDF does), here we consider a full kinetic model for each enzyme-catalyzed reaction. For example, a single-substrate single-product reaction would be characterized by the reversible Michaelis-Menten rate law:

(1)$v(s, p, E) = E ~ \frac{k_{cat}^+ ~ s/K_s - k_{cat}^- ~ p/K_p}{1 + s/K_s + p/K_p}$

where s, p, and E are the concentrations of the substrate, product, and enzyme respectively. Since the steady-state fluxes are given as an input (and so are all the kinetic parameters $$k_{cat}$$ and $$K_M$$), we can solve for E as a function of s and p:

(2)$E(s, p) = v ~ \frac{1 + s/K_s + p/K_p}{k_{cat}^+ ~ s/K_s - k_{cat}^- ~ p/K_p}$

The solution to equation (2) is what we call the enzyme demand (although it could have a more complex form for reactions with more substrates and products). When summing up all enzyme demands for the different reactions in the pathway, we get the total enzyme cost:

(3)$q(\mathbf{x}) = \sum_i h_{E_i} E_i(\mathbf{x})$

Note, that this is a weighted sum, with coefficients $$h_{E_i}$$ which we denote enzyme burden. By default they are defined as the protein molecular weights. And as before, $$\mathbf{x}$$ is the vector of all metabolite log-concentrations (i.e. a generalization of (s, p) for all reactions). The logarithmic scale is important, since we facilitates our proof that the total enzyme cost is a convex function and therefore can be easily optimized. Essentially, ECM is the process of running a convex optimization problem defined by $$q(\mathbf{x})$$. Nevertheless, convex optimization is typically slower than linear programming, therefore ECM takes a bit longer to calculate compared to MDF.

## References¶

 [1] Mavrovouniotis, M., Stephanopoulos, G., Stephanopoulos, G., 1992. “Synthesis of biochemical production routes.” Computers & Chemical Engineering 16, 605–619
 [2] Mavrovouniotis, L.M., 1990. “Group contributions for estimating standard Gibbs energies of formation of biochemical compounds in aqueous solution.” Biotechnol. Bioeng. 36, 1070–1082
 [3] Mavrovouniotis, M.L., 1991. “Estimation of standard Gibbs energy changes of biotransformations.” J. Biol. Chem 266, 14440–14445.
 [4] Mavrovouniotis, M.L., 1996. “Duality theory for thermodynamic bottlenecks in bioreaction pathways.” Chemical Engineering Science 51, 1495–1507.
 [5] Vojinović & U. von Stockar, “Influence of uncertainties in pH, pMg, activity coefficients, metabolite concentrations, and other factors on the analysis of the thermodynamic feasibility of metabolic pathways” Biotechnology and Bioengineering 103, 780–795 (2009)
 [6] Hatzimanikatis, V., Li, C., Ionita, J.A., Henry, C.S., Jankowski, M.D., Broadbelt, L.J., 2005. “Exploring the diversity of complex metabolic networks” Bioinformatics 21, 1603–1609
 [7] Wang, C. Y. Ng, S. Dash, C. D. Maranas, “Exploring the combinatorial space of complete pathways to chemicals” Biochemical Society Transactions (2018) DOI:10.1042/BST20170272
 [8] (1, 2) Noor, A. Bar-Even, A. Flamholz, E. Reznik, W. Liebermeister, R. Milo, “Pathway thermodynamics highlights kinetic obstacles in central metabolism” PLoS Comput Biol (2014) 10:e1003483
 [9] Noor, A. Flamholz, A. Bar-Even, D. Davidi, R. Milo, W. Liebermeister, “The Protein Cost of Metabolic Fluxes: Prediction from Enzymatic Rate Laws and Cost Minimization” PLoS Comput Biol (2016) 11:e1005167