Abstract
A key consideration in metabolic engineering is the determination of the fluxes of metabolites within the cell. This determination provides an unambiguous description of metabolism before and/or after engineering interventions. Here, we present a computational framework that combines constraint-based modeling with isotopic label tracing on a large scale. When cells are fed a growth substrate with certain carbon positions labeled with 13C, the distribution of this label in the intracellular metabolites can be calculated based on the known biochemistry of the participating pathways. Most labeling studies focus on skeletal representations of central metabolism and ignore many flux routes that could contribute to the observed isotopic labeling patterns. In contrast, our approach investigates the importance of carrying out isotopic labeling studies using a more comprehensive reaction network consisting of 350 fluxes and 184 metabolites in Escherichia coli, including global metabolite balances on cofactors such as ATP, NADH, and NADPH. The proposed procedure is demonstrated on an E. coli strain engineered to produce amorphadiene, a precursor to the anti-malarial drug artemisinin. The cells were grown in continuous culture on glucose containing 20% [U-13C]glucose; the measurements were made using GC-MS performed on 13 amino acids extracted from the cells. We identify flux distributions for which the calculated labeling patterns agree well with the measurements, supporting the accuracy of the network reconstruction. Furthermore, we explore the robustness of the flux calculations to variability in the experimental MS measurements and highlight the key experimental measurements necessary for flux determination. Finally, we discuss the effect of reducing the model and shed light on the customization of the developed computational framework to other systems.
Keywords: Metabolic flux analysis, Isotope labeling, Constraint-based modeling, Nonlinear optimization, Statistical analysis
1. Introduction and Objectives
In recent years, high-throughput methods have enabled rapid advances in the rate of data generation in genomics, transcriptomics, and proteomics. Despite a number of advances in experimental measurement and analysis techniques, metabolic flux elucidation (i.e., metabolic flux analysis (MFA) (Vallino and Stephanopoulos, 1993)) has not yet become as commonplace as the above-mentioned ‘omics analyses. This is because flux elucidation is context-specific, requiring sensitive NMR and/or gas chromatography-mass spectrometry (GC-MS) measurements, accurate metabolic network reconstructions, and powerful computational techniques to match experimental observables with underlying models.
The elucidation of metabolic fluxes is important for a number of reasons. First, the set of fluxes through a cell’s metabolic pathways is a key descriptor of its physiology (Nielsen, 2003) and a basis for evaluating mechanisms for metabolic engineering (Bailey, 1991; Stephanopoulos and Vallino, 1991). While gene transcripts, protein levels and metabolite concentrations tend to vary in a seemingly unpredictable fashion, metabolic fluxes are the only invariant that seems to capture a cell’s metabolic state effectively. They provide the only unambiguous means for pinpointing the effect of engineering interventions (e.g., knock-outs/ins, up/down regulations) on cellular metabolism, indicating their effectiveness and suggesting additional engineering strategies. While optimization-based hypotheses such as biomass yield maximization (Varma and Palsson, 1994), minimization of metabolic adjustments (MOMA) (Segre et al., 2002) and others have in certain cases led to accurate flux predictions, they cannot always be relied upon to deduce the correct flux distributions. Flux measurements (especially internal ones) are needed to provide a reference state (i.e., for MOMA), test the effectiveness of different maximization hypotheses and provide clues as to how to construct new ones.
The potential of using 13C-labeled isotopes to elucidate metabolic pathways has long been recognized in the metabolic engineering community. Early efforts relied on NMR spectra of metabolites, which were related to the underlying pathways used to create them (Jeffrey et al., 1991). NMR spectra could also be used to elucidate the flux through metabolic pathways (Bacher et al., 1998; Kelleher, 2001). NMR-based recapitulations of central metabolism fluxes in Escherichia coli were accomplished using uniformly labeled glucose (Szyperski, 1995). This work was later expanded to resolve central metabolism fluxes in E. coli under a variety of conditions (Sauer et al., 1999). The central metabolic fluxes and assumptions about reversibility in Bacillus subtilis were also explored using NMR data that were analyzed within an isotope isomer (isotopomer) balancing framework (Dauner et al., 2001). These experimental protocols were subsequently employed to examine the changes in flux distributions in E. coli when pyruvate kinase activity was removed (Emmerling et al., 2002).
The use of NMR is hampered by the fact that it is time consuming and requires high concentrations of the metabolites. Alternatively, fluxes can be deduced by generating GC-MS spectra of primarily amino acids (and other metabolites) to de-convolute the metabolic flux distributions. Based on the observed fragment weight patterns the distribution of isotopes can then be deduced. One of the earliest such contributions involved using 13C-labeling MFA GC-MS to determine the central metabolism fluxes for Corynebacterium glutamicum (Marx et al., 1996) using enrichment data for eleven amino acids. Similar approaches were used for profiling strains from successive generations of C. glutamicum (Wittmann and Heinzle, 2002). Notably 13C-labeling based MFA has also been informative when characterizing regulatory mutations (Van Dien et al., 2003).
GC-MS and/or NMR spectral information must first be mapped onto metabolic fluxes before any elucidation procedure can be deployed. One of the first such modeling contributions was the introduction of atom mapping matrices (AMMs) (Zupke and Stephanopoulos, 1994), which track the transfer of carbon atoms from reactants to products. This concept was subsequently generalized in the form of isotopomer mapping matrices (IMMs) by Schmidt et al. (1997). The use of IMMs enables the formulation of all isotopomer mass balances of a metabolite pool as a closed-form set of nonlinear algebraic equations. The variables in these representations include the metabolic fluxes and the isotopomer distribution vectors (IDVs) that quantify the fraction of each metabolite present in a particular isotopomer form. These modeling developments were used to elucidate central metabolic maps of E. coli (Schmidt et al., 1999).
A potential problem with the use of IMMs is that even for a given flux distribution the identification of the underlying IDVs yields a set of equations that remain nonlinear. The elegant cumomer concept (Wiechert et al., 1999) was later introduced to first prove that there exists a unique IDV assignment that satisfies any given feasible flux distribution and subsequently devise an IDV identification procedure by solving a cascade of linear equations. However, the nonlinear coupling between metabolic fluxes and IDVs remains. An alternative method for reducing the dimensional space of the isotopomer problem is the concept of the theoretical bondomer (van Winden et al., 2002). A limitation of this method of analysis is that it requires the use of a single, uniformly labeled substrate. Forbes et al. (2001) also introduced the isotopomer path tracing concept, which identifies all isotopomer paths that produce an observable isotopomer. Most recently, the Elementary Metabolite Unit (EMU) framework has been developed, which dramatically reduces the number of variables necessary to calculate the mass isotopomer distribution of a measured metabolite (Antoniewicz et al., 2006b).
The modeling developments described above link isotopomer fractions (codified through a variety of different variable sets) and metabolic flux information into a set of algebraic equations that allow for the straightforward identification of IDVs given a feasible flux distribution. The next step is to solve for the fluxes that best explain the observable data. Most approaches rely on gradient-based searches that minimize the sum of the squares of the differences between measurements and model predictions. These include the Levenberg-Marquardt algorithm (Zhao and Shimizu, 2003), the generalized reduced gradient method (Klapa et al., 2003) and trust region methods (Yang et al., 2004). Efforts to decrease the computation time led to the development of analytical derivation techniques for the Jacobian matrix (Wittmann and Heinzle, 2002). Alternatively, evolutionary algorithms such as simulated annealing, employed by Schmidt et al. (1999), attempt to avoid being trapped in local minima. A potential shortcoming of these approaches is that by relying on essentially local optimization procedures they may get trapped in sub-optimal solutions. To remedy this, global optimization approaches relying on branch and bound (e.g., BARON (Tawarmalani and Sahinidis, 2004), a deterministic global optimization package) coupled with convex relaxation of the problem have recently been used to quantify the fluxes (Ghosh et al., 2006; Ghosh et al., 2005; Riascos et al., 2005).
While the mathematical rigor gained by pinpointing the globally optimal solution is noteworthy, the practical benefit of this is diluted by the fact that experimental error in the measurements may render the true flux distribution a sub-optimal solution. This motivates the importance of assessing and quantifying the impact of measurement uncertainty on all obtained solutions. Earlier on, linearized statistics were used (Dauner et al., 2001; Emmerling et al., 2002; Wiechert and de Graaf, 1997). In addition, Monte-Carlo stochastic simulation has been used to examine how the elucidated fluxes change when uncertainty in the form of normally distributed noise is added to the data (Forbes et al., 2001; Wittmann and Heinzle, 2002; Zhao and Shimizu, 2003). More recently new techniques were introduced that allow for determination of the upper and lower bounds that do not use local estimates of the standard deviations and can therefore be non-uniform (Antoniewicz et al., 2006a). Instead, the flux whose sensitivity is under investigation is increased step-by-step until a threshold χ2 value is reached.
A key feature common to all studies so far is that they focus on relatively small subsets of cellular metabolism such as the tricarboxylic acid (TCA) cycle, the pentose phosphate pathway (PPP), or core metabolism, using a coarse description of metabolism with 15 to 75 reactions at most. This lumping of reaction steps not only obscures detail but could also lead to the erroneous conclusion that the available data are sufficient to elucidate a unique flux distribution. Notably, when lumped metabolic models are projected onto large-scale models, significant ambiguity in flux allocation is revealed. Inherently, any bias in the generation of the lumped metabolic model is thus propagated when the flux distribution is elucidated. In addition, considering individual parts of metabolism in isolation from the rest may lead to the elucidation of flux distributions that are not physiologically relevant. A consequence of this is the inability to handle cofactor balancing, which in many cases is the bottleneck in the generation of the desired product. These limitations motivated the construction, in this paper, of a detailed isotopomer mapping reaction network consisting of 238 reactions (350 fluxes) and 184 metabolites in Escherichia coli, which fully accounts for cofactor balancing and biomass drain requirements.
When faced with a much larger isotope mapping model, the computational challenge of resolving fluxes becomes much more pronounced, notably in the determination of the global optimum. To this end, we have devised an iterative optimization procedure for identifying all flux distributions that are (local) minima of the least-squares discrepancy problem. By exhaustively identifying all solutions we can then use statistical analysis and biological insight to home in on the physiologically relevant one(s). Finally, by feeding all identified solutions to an F-test based significance procedure, confidence levels can be assigned to different flux distributions that are supported by the data.
The specific experimental system examined is part of a larger effort to develop strains of E. coli for the production of terpenoid compounds. Terpenoids are a class of isoprenoids often isolated from plants, and are currently used for a variety of applications including anti-cancer and antimicrobial drugs. For example, artemisinin is a powerful natural antimalarial drug first isolated from wormwood (Dhingra et al., 2000), and Taxol is a drug effective in cancer treatment extracted from the Pacific yew (Cragg, 1998). Because compounds such as these are normally produced in extremely small quantities, purification from biological material is difficult and resource-consuming. Furthermore, chemical synthesis of terpenoids is expensive and inefficient (Avery et al., 1992). Production of these compounds in a microbial host would eliminate many of these problems, and a strain of E. coli producing amorphadiene, the precursor to artemisinin, as a model terpenoid compound has recently been developed (Martin et al., 2003). Flux analysis was performed in order to better understand how the metabolic physiology of the organism changes with the introduction of heterologous pathways, and to evaluate how productivity could be improved.
In the following sections we first describe the isotopomer model construction procedure, followed by the optimization procedure and statistical analysis techniques. Comprehensive results are then presented for the amorphadiene-producing strain, followed by a discussion and summary.
2. Materials and Methods
2.1. Experimental System
The labeling experiments were performed in the chemostat with amorphadiene-producing strains of E. coli. The amorphadiene-producing strain (Martin et al., 2003) harbored three plasmids: pMevT, expressing an exogenous pathway for the synthesis of mevalonate from acetyl-CoA; pMBIS, expressing an exogenous pathway for the conversion of mevalonate to the isoprenoid precursor IPP as well as the native E. coli isoprenoid genes idi and ispA; and pADS, expressing the amorphadiene synthase gene for the conversion of farnesyl diphosphate to amorphadiene.
Chemostat cultivations were carried out in mineral salts media (Neidhardt et al., 1974) supplemented with thiamine, iron and micronutrients, using a 1-L benchtop fermentor (Sartorius, Göttingen, Germany) at a working volume of approximately 600 mL. The carbon source used was glucose, supplied at 20% uniformly 13C-labeled (U-13C-glucose), with an overall concentration of 39.44 mM. pH was controlled at 7.0 by the addition of NaOH, the temperature was set at 37°C, and the vessel was aerated with sterile air. The agitation rate was set at 500 rpm and the dissolved oxygen level was at >50% of saturation at all times. The vessel was inoculated with 30 mL of culture already grown for several generations at the appropriate label concentration. Once a stable optical density was reached, steady state was ensured by waiting an additional 3 vessel volume changes before sampling was begun. The flow rate was maintained to give a dilution rate of 0.19 hr−1, and the average OD600 of the culture during the sampling period was 3.9. After taking each 50 mL sample, the vessel was immediately filled to the original volume using fresh media, and the system was allowed to return to steady state (3 volume changes) before the next sample.
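As a rough orientation, the waiting time implied by three vessel volume changes can be worked out from the stated dilution rate. The symbol $t_{\text{wait}}$ is introduced here only for illustration, and the calculation assumes a constant working volume, so that one volume change takes $1/D$:

```latex
t_{\text{wait}} = \frac{3}{D} = \frac{3}{0.19\ \mathrm{hr^{-1}}} \approx 15.8\ \mathrm{hr}
```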
Samples were harvested, total protein was extracted and hydrolyzed, and the resulting amino acids were derivatized with N-(tert-butyldimethylsilyl)-N-methyl-trifluoroacetamide (TBDMS-FA) (Siddiquee et al., 2004). Derivatized amino acids were analyzed using a Hewlett-Packard HP 5971A gas chromatograph-quadrupole mass selective detector (electron impact), equipped with a DB-5 column (Agilent Technologies). Each derivatized sample was injected 4 times, thus providing at least 8 total data points for the calculation of means and variances. Raw mass isotopomer data were corrected for naturally occurring 13C in the derivatization reagents and non-carbon isotopes in the entire fragment using an infinite dimensional matrix calculus method (Wahl et al., 2004). In addition, the aqueous phase of culture supernatants was assayed for residual glucose (2300 STAT Plus glucose analyzer, YSI Life Sciences, Yellow Springs, Ohio) and acetate (kit from R-Biopharm AG, Darmstadt, Germany), while the concentration of amorphadiene was determined in the organic phase. Amorphadiene was measured by first diluting 10 μl of the dodecane phase in 990 μl ethyl acetate, and then quantifying by GC-MS against an amorphadiene standard.
2.2. Construction of Large-scale Isotopomer Mapping Model
The isotopomer model was constructed using as a starting point the large-scale stoichiometric model of Escherichia coli metabolism, iJR904 (Reed et al., 2003), which has been successfully applied to predict the phenotypes of various E. coli strains, both wild-type and mutant, under certain conditions (Fong and Palsson, 2004; Ibarra et al., 2002). The iJR904 model contains 931 intracellular reactions although many of these reactions do not contribute to the labeling patterns of the examined amino acids. Accordingly, in this study, we developed a smaller, though still biologically comprehensive, metabolic model for the flux elucidation. The model reduction procedure first involved removing from iJR904 all blocked reactions defined as reactions that cannot carry flux during aerobic growth on glucose due to stoichiometric limitations (Burgard et al., 2004). Examples of blocked reactions include transporter reactions for components absent from the media such as xylose, glycerol, or fructose, and reactions involved in their utilization. This enabled the removal of nearly one third of the reactions from iJR904. In addition, reactions in the cell envelope, membrane lipid, nucleotide, and cofactor biosynthetic pathways were not explicitly included in the model enabling the removal of approximately another one third of the reactions from iJR904. However, because these biosynthetic pathways can affect amino acid labeling patterns by draining precursor metabolites away from central metabolism, the biomass equation, which was based largely on the one described in (Edwards and Palsson, 2000), was modified to account for these additional drains on central metabolism. Lastly, we assumed that the catabolic reactions in the nucleotide salvage pathways do not contribute significantly to the labeling patterns of the amino acids.
The model constructed for this study contained 238 reactions (including biomass) and 184 metabolites, as well as 32 exchange fluxes. This total includes a set of reactions absent from iJR904 that enable amorphadiene production via the non-native mevalonate pathway (Martin et al., 2003). Reversible reactions were broken into their forward and backward components to enable the investigation of how reaction reversibility affects the labeling patterns of the amino acids. The network contained 80 such reversible reactions, bringing the total number of flux-describing variables to 350. The model included all reactions of Embden-Meyerhof-Parnas (EMP) and Entner-Doudoroff (ED) glycolysis, the tricarboxylic acid (TCA) cycle, and the pentose phosphate pathway (PPP). In addition, all of the anaplerotic reactions and amino acid biosynthesis and degradation pathways were included. Finally, the model enforced the explicit balancing of all metabolic cofactors (e.g., ATP, NADH, NADPH) by including reactions for energy generation via substrate-level and oxidative phosphorylation as well as transhydrogenase activity.
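The bookkeeping behind the 350 flux variables can be sketched as follows. The counts are taken from the text; reading the total as reactions plus exchange fluxes plus one extra backward variable per reversible reaction is an assumption that happens to reproduce the stated number. The helper `split_net_flux` and its parameterization are hypothetical illustrations, not part of the published model.

```python
# Counts quoted in the text.
N_REACTIONS = 238    # reactions, including biomass
N_EXCHANGE = 32      # exchange fluxes
N_REVERSIBLE = 80    # reactions split into forward and backward components

# Assumed accounting: each reversible reaction adds one backward-flux variable.
n_flux_variables = N_REACTIONS + N_EXCHANGE + N_REVERSIBLE
print(n_flux_variables)  # 350


def split_net_flux(net_flux, backward_flux):
    """Return nonnegative (forward, backward) fluxes for a reversible step.

    Hypothetical helper: given a net flux and a chosen backward flux,
    the forward flux is whatever makes forward - backward equal net_flux.
    """
    forward = net_flux + backward_flux
    if forward < 0 or backward_flux < 0:
        raise ValueError("forward and backward fluxes must be nonnegative")
    return forward, backward_flux


print(split_net_flux(2.0, 1.5))  # (3.5, 1.5)
```

Splitting each reversible step this way keeps every variable nonnegative, which is convenient when the labeling balances depend on the forward and backward fluxes separately rather than on the net flux alone.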
The Atom Mapping Matrices (AMMs), which describe the transfer of carbon atoms from the reactants to products (Zupke and Stephanopoulos, 1994), were calculated using Pipeline Pilot™ (SciTegic Inc., San Diego, CA), a high-throughput data analysis and mining system for chemoinformatic applications, using a structural matching algorithm. AMMs are matrices where the columns represent carbon atoms in the reactant and the rows represent carbon atoms in the product. The numbering scheme is consistent with the order of atoms represented in the molecule (.mol) file. There is one AMM per reactant-product pair. The input to the program was a list of reactions with associated reactants and products and their KEGG ID numbers (http://www.genome.jp/kegg/ligand.html). The program then extracted the appropriate molecule files (.mol format) and calculated the predicted AMM as well as a score indicating the quality of the match. The above procedure was applied for the reactions in the E. coli model described above, and the results checked manually based on known biochemistry. Approximately 80% of the auto-generated AMMs were found to be correct while the remaining were subsequently corrected manually. These corrected AMMs are summarized in compact atom transformation form, using letters to represent the carbon atoms, in the Supplemental Material.
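To make the AMM layout concrete, the following minimal sketch encodes the pyruvate dehydrogenase step (pyruvate to acetyl-CoA and CO2), where the carboxyl carbon of pyruvate is released as CO2 and the remaining two carbons form the acetyl group. The carbon ordering used here is an illustrative assumption and need not match the .mol-file numbering used in the study; `apply_amm` is a hypothetical helper name.

```python
# Rows = product carbons, columns = reactant carbons (pyruvate C1, C2, C3).
AMM_PYR_TO_ACCOA = [
    [0, 1, 0],  # acetyl C1 originates from pyruvate C2
    [0, 0, 1],  # acetyl C2 originates from pyruvate C3
]
AMM_PYR_TO_CO2 = [
    [1, 0, 0],  # the CO2 carbon originates from pyruvate C1
]


def apply_amm(amm, reactant_labels):
    """Multiply an AMM by a 0/1 labeling vector of the reactant carbons."""
    return [sum(a * x for a, x in zip(row, reactant_labels)) for row in amm]


pyruvate = [1, 0, 0]  # 13C at C1 only (illustrative input)
print(apply_amm(AMM_PYR_TO_ACCOA, pyruvate))  # [0, 0]
print(apply_amm(AMM_PYR_TO_CO2, pyruvate))    # [1]
```

One matrix per reactant-product pair, as in the text, means that a reaction with two products is described by two AMMs that share the same columns.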
Isotopomer mapping matrices (IMMs) were then calculated directly from the AMMs using the algorithm introduced by (Van Dien et al., 2003). IMMs indicate the possible product isotopomers that can be created from each reactant isotopomer (Schmidt et al., 1997). IMMs were used in the isotopomer balance equations to determine the isotopomer distribution vectors (IDVs) for each metabolite. The numbering scheme shown in Figure 1 is used for the isotopomers.
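A minimal sketch of how an IMM can be derived from an AMM for a single reactant-product pair is given below. It uses plain enumeration rather than the published algorithm, and it assumes a binary isotopomer index in which bit c of the index is set when carbon c (zero-based) is labeled; this indexing is an assumption and may differ from the numbering of Figure 1. The AMM is the illustrative pyruvate to acetyl-CoA mapping, and `build_imm` and `map_idv` are hypothetical names.

```python
def build_imm(amm):
    """Enumerate reactant isotopomers and record the product isotopomer of each."""
    n_prod, n_react = len(amm), len(amm[0])
    # source[p] = reactant carbon that becomes product carbon p
    source = [row.index(1) for row in amm]
    imm = [[0] * (2 ** n_react) for _ in range(2 ** n_prod)]
    for k_react in range(2 ** n_react):
        k_prod = 0
        for p, r in enumerate(source):
            if (k_react >> r) & 1:   # reactant carbon r is labeled
                k_prod |= 1 << p     # so product carbon p is labeled
        imm[k_prod][k_react] = 1
    return imm


def map_idv(imm, idv):
    """Product IDV contribution obtained from a reactant IDV."""
    return [sum(m * x for m, x in zip(row, idv)) for row in imm]


amm_pyr_to_accoa = [[0, 1, 0], [0, 0, 1]]
imm = build_imm(amm_pyr_to_accoa)      # 4 x 8 matrix

# Illustrative pyruvate pool: 80% unlabeled, 20% fully labeled.
pyruvate_idv = [0.8, 0, 0, 0, 0, 0, 0, 0.2]
print(map_idv(imm, pyruvate_idv))      # [0.8, 0.0, 0.0, 0.2]
```

Each column of the resulting matrix contains exactly one nonzero entry, because every reactant isotopomer yields exactly one product isotopomer for a given reactant-product pair.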
Figure 1. Pictorial representation of the variable Iik and the parameter MDV for a sample molecule.
For a molecule, i, that contains 3 carbons there are 2^3 = 8 different labeling patterns that make up the k members of the isotope fraction Iik. The relative fraction of each is contained in Iik. During GC-MS, the molecule is derivatized and can be fragmented into different-sized species, f, that are then analyzed. The mass distribution vector, MDV, contains the relative fractions of each fragment that contain m labeled carbons. Because only the total mass of each instance of a fragment is determined, the mass distribution vector is made by collecting various isotopomers by mass, as shown for two different fragments.
The final step of the initial analysis was to calculate the mass distribution vector (MDV) for all observed products from the mass spectrometry data (Wittmann and Heinzle, 1999). For each amino acid that was detected, elements of the IDV with the same number of labeled carbons were summed to yield an element of the MDV, as shown in Figure 1. For amino acid fragments (for example, a common fragment is missing the carboxy-terminal carbon), the procedure was modified to include only the relevant carbon atoms. The process was facilitated by a Matlab program that makes use of matrix algebra (Van Dien et al., 2003; Wittmann and Heinzle, 1999).
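The summation described above can be sketched in a few lines. This is not the Matlab program used in the study; the binary isotopomer indexing (bit c set when zero-based carbon c is labeled), the example IDV values, and the choice of carbon 0 as the removed carboxyl carbon are all illustrative assumptions.

```python
def mdv_from_idv(idv, fragment_carbons):
    """Collect isotopomer fractions by the number of labeled carbons
    among the carbons retained in the fragment."""
    mdv = [0.0] * (len(fragment_carbons) + 1)
    for k, fraction in enumerate(idv):
        m = sum((k >> c) & 1 for c in fragment_carbons)
        mdv[m] += fraction
    return mdv


# Hypothetical IDV of a 3-carbon molecule (8 isotopomers, sums to 1).
idv = [0.70, 0.05, 0.03, 0.02, 0.04, 0.01, 0.05, 0.10]

whole = mdv_from_idv(idv, [0, 1, 2])   # intact carbon skeleton
fragment = mdv_from_idv(idv, [1, 2])   # fragment lacking carbon 0

print([round(x, 2) for x in whole])     # [0.7, 0.12, 0.08, 0.1]
print([round(x, 2) for x in fragment])  # [0.75, 0.1, 0.15]
```

The fragment MDV has one entry fewer than the whole-molecule MDV because it can contain at most two labeled carbons.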
2.3. Mathematical Analysis of Flux Elucidation
The problem of calculating the fluxes and labeling patterns for a given set of GC-MS data is formulated as a nonlinear optimization problem (FluxCalc) that is given in the Appendix. This problem minimizes the sum of the variance-weighted squared differences between the experimental data and the calculated values for the MDVs of the measured amino acid fragments by solving isotope balances (see Figure 2). (FluxCalc) is solved multiple times using CONOPT version 3 accessed within the GAMS modeling environment. The uniformly labeled glucose was modeled to be 99% isotopically pure, with the 12C impurities being equally distributed among the carbons. Only isotope forms having five labeled carbons were considered for the impurities; each of the six possibilities was considered equally likely. The initial flux distributions are provided by solving (FluxInit), which generates a set of random feasible flux distributions and is also briefly described in the Appendix; it is solved using CPLEX version 10, also accessed within GAMS. Given the nonconvex nature of (FluxCalc), the size of the resulting formulation and the variability in the experimental MDV, we pursue the identification of as many locally optimal solutions as possible.
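The objective can be illustrated on a single fragment. The sketch below assumes the standard form of a variance-weighted least-squares term, the sum over mass entries of ((measured − simulated)/σ)², and uses the Val-159 values reported in Table 2. It evaluates only the contribution of one fragment for a fixed simulated MDV; it is not the (FluxCalc) formulation itself, and whether every mass entry enters the published objective is not stated here, so the number is purely illustrative.

```python
def weighted_ssr(measured, simulated, sigma):
    """Sum of squared residuals, each scaled by its measurement error."""
    return sum(((m - s) / e) ** 2 for m, s, e in zip(measured, simulated, sigma))


# Val-159 entries from Table 2 (mol %), M+0 through M+4.
val159_meas = [61.9, 5.3, 27.6, 2.1, 3.1]
val159_sigma = [0.6, 0.6, 0.5, 0.4, 0.4]
val159_sim = [62.2, 5.6, 27.9, 1.2, 3.1]

contribution = weighted_ssr(val159_meas, val159_sim, val159_sigma)
print(round(contribution, 2))  # 5.92
```

Most of this value comes from the M+3 entry, whose residual of 0.9 mol % is large relative to its 0.4 mol % error, which shows how the weighting emphasizes precisely measured entries.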
Figure 2. The isotope labeling of species is governed by an isotopomer distribution vector balance and the isotope mapping matrix.
Two reactions, j1 and j2, form the species E; the first is unimolecular and the second is bimolecular. E is also consumed by two reactions, j3 and j4; again, the first is unimolecular and the second is bimolecular. The isotope labeling of E is given by IEk. Each carbon-containing compound has an associated isotope fraction for each labeling pattern, Iik. The isotope labeling balance of the system is at the top, with the first term of the right-hand side of Eq. 3 on the left and the second term on the right. The symbols Π and ⊗ are used for term-by-term multiplication. The isotope labeling, Iik, and the isotope mapping matrix that describes the contributions from the k′ isotope forms of the reactant i′ to the k isotope forms of product E are shown for the two production reactions. Note that the unimolecular reaction j1 creates bilinear terms, and the bimolecular reaction j2 creates trilinear terms, as indicated. For the consumption terms, on the right, only the overall labeling patterns of E are considered.
2.4. Probability distribution of least squares objective function
The iterative procedure described in the previous section identifies many different local optima with potentially highly varying flux distributions. Due to the variability and imprecision of the mass spectrometry measurements that yield the experimental MDVs, the key question is how many (if any) of these flux distributions are statistically indistinguishable representations of the metabolic state of the system. To address this question, validation of the fit was performed using a χ2-based goodness-of-fit test.
2.5. Identification of lower/upper bounds on elucidated fluxes given an objective value cutoff
Upon identifying the top-scoring solution, a confidence level (e.g., 95%) may be imposed using the method of Antoniewicz et al. (2006a) to determine an objective function cutoff. The flux ranges can be calculated by maximizing/minimizing each net flux and exchange flux separately subject to the constraints of formulation (FluxCalc) along with the following limit on the objective function: z(l) ≤ zcutoff, to obtain the upper and lower bounds ($v_j^{\max}$ and $v_j^{\min}$, respectively). We refer to this formulation as (FluxRange). Like (FluxCalc), (FluxRange) is nonconvex. Therefore it is solved iteratively from multiple initial guesses and only the best solutions are recorded. Fluxes whose maximization and minimization under (FluxRange) yield exactly the same value are referred to as fully resolved. Fluxes whose (FluxRange) bounds are identical to the upper and lower bounds found by the relaxed problem, in which only overall mass balance is considered ($v_j^{\max,\mathrm{MB}}$ and $v_j^{\min,\mathrm{MB}}$, respectively), are referred to as unresolved, whereas fluxes that yield narrower ranges are referred to as partially resolved. These flux ranges were quantified by defining the variable d, the degree of resolution:

$$ d = 1 - \frac{v_j^{\max} - v_j^{\min}}{v_j^{\max,\mathrm{MB}} - v_j^{\min,\mathrm{MB}}} \qquad (1) $$
The higher the value of d is for a flux, the greater its resolution is. Thus, the degree of resolution is equal to zero for those fluxes that are unresolved, between zero and one for those that are partially resolved, and equal to one for those that are fully resolved.
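The three cases can be checked numerically with a small sketch. It assumes that d is one minus the ratio of the (FluxRange) interval width to the width of the mass-balance-only interval, which is the simplest form consistent with the description above; the function name, argument names, and the example bounds are hypothetical.

```python
def degree_of_resolution(lower, upper, lower_relaxed, upper_relaxed):
    """Assumed form: d = 1 - (upper - lower) / (upper_relaxed - lower_relaxed)."""
    relaxed_width = upper_relaxed - lower_relaxed
    if relaxed_width == 0:
        # The flux is already fixed by mass balance alone (assumed convention).
        return 1.0
    return 1.0 - (upper - lower) / relaxed_width


# Hypothetical flux bounds on a 0-10 mass-balance range.
print(degree_of_resolution(2.0, 2.0, 0.0, 10.0))   # 1.0  fully resolved
print(degree_of_resolution(0.0, 10.0, 0.0, 10.0))  # 0.0  unresolved
print(degree_of_resolution(2.0, 7.0, 0.0, 10.0))   # 0.5  partially resolved
```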
3. Results and Discussion
Three macroscopic measurements from the chemostat experiment were used to constrain the system. These measurements comprise the initial glucose concentration minus the residual glucose concentration (6.1 g/L ± 0.2 g/L), the cell density (1.6 g/L ± 0.2 g/L), and the final amorphadiene concentration (0.077 g/L ± 0.005 g/L), all of which were taken in replicate. The dilution rate was held constant at 0.19 hr−1 ± 0.00 hr−1. The glucose entering the system was assumed to be converted to biomass, amorphadiene, CO2, or acetate. Although the fluxes towards biomass and amorphadiene were fixed based on the experimental measurements, the employed metabolic model was free to partition the fluxes towards CO2 and acetate while attempting to match the observed labeling patterns of the amino acids. The measured acetate flux (10.2–12.8 mM/hr) was not used as an explicit constraint on the system, but rather was left as a final consistency check. All computational results are reported using a basis glucose uptake rate of 10 mM/hr.
Using GC-MS, we observed the mass distributions (MDV) for fragments of 13 amino acids collected from two samples taken at different time points. Each derivatized sample was injected four times and the standard deviations for each sample and an overall standard deviation were calculated. In general, the overall standard deviations of the mass distributions among replicate measurements were less than the instrument error of 0.4%, which is consistent with previous estimates of instrument error (Wittmann et al., 2002). Next, the measurement error was estimated as the quadrature addition of the instrument error and the standard deviation (Taylor, 1990). After correcting for naturally occurring 13C, we validated the internal consistency of the MDV distributions of the fragments for each amino acid.
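Quadrature addition of the two error sources amounts to taking the square root of the sum of their squares. The sketch below uses the 0.4% instrument error quoted above; the replicate standard deviation of 0.3 is a hypothetical value chosen only to give a round result, and the function name is illustrative.

```python
import math

INSTRUMENT_ERROR = 0.4  # instrument error quoted in the text (%)


def measurement_error(replicate_sd, instrument_error=INSTRUMENT_ERROR):
    """Combine instrument error and replicate standard deviation in quadrature."""
    return math.hypot(instrument_error, replicate_sd)


print(round(measurement_error(0.3), 3))  # 0.5
print(round(measurement_error(0.0), 3))  # 0.4
```

Because the instrument error enters every estimate, the combined error can never fall below 0.4, which is consistent with the smallest errors listed in Table 2.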
The 32 fragments used for MFA are listed in Table 1. These fragments contain 162 mass entries giving 162−32 = 130 independent mass measurements for use in the MFA. The mean values of the experimental mass measurements and corresponding estimates of the measurement error are given in Table 2. The best computationally derived MDV values (weighted residual sum of squares = 56.4) are also given for each fragment. Although the unweighted differences are quite small, we performed a goodness-of-fit analysis. The isotope model contains 130 free fluxes. Using direct experimental flux measurements, 3 internal fluxes (glucose uptake, amorphadiene production, and biomass production) and 22 exchange fluxes were set, giving a total of 105 independent flux variables for (FluxCalc) to determine. A χ2 distribution with 130−105=25 degrees of freedom yields a maximum allowable objective value of 37.7, at the 95% confidence level. Thus the model fit is statistically not acceptable (P(χ2[25]<56.4)=0.9997) implying significant errors in the measurements. As seen in the table, most of the error lies in the serine and lysine fragments. A normal probability plot of the weighted residuals was linear.
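The goodness-of-fit numbers quoted above can be reproduced with a short standard-library sketch. The χ2 cumulative distribution is evaluated through the series expansion of the regularized lower incomplete gamma function, and the 95% cutoff is found by bisection. This is an illustrative check, not the statistical software used in the study, and the function names are hypothetical.

```python
import math


def chi2_cdf(x, dof):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_quantile(p, dof):
    """Invert the CDF by bisection on a bracket that safely contains the answer."""
    lo, hi = 0.0, 10.0 * dof + 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


dof = 130 - 105                      # independent measurements minus free fluxes
cutoff = chi2_quantile(0.95, dof)
print(dof, round(cutoff, 1))         # 25 37.7
print(chi2_cdf(56.4, dof) > 0.999)   # True
```

The best objective value of 56.4 lies well above the 95% cutoff, which is the basis for the statement that the fit is not statistically acceptable.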
Table 1.
Fragments of the amino acids experimentally measured by GC-MS and simulated using the IMM models
| Amino acid^a | Monitored ions | Amino acid carbon atoms^b | Fragment |
|---|---|---|---|
| Val-159 | 186–192 | 1-2-4-5 | M–C7H15O2Si |
| Val-85 | 260–266 | 1-2-4-5 | M–C5H9O |
| Val-57 | 288–295 | 1-2-3-4-5 | M–C4H9 |
| Gly-85 | 218–221 | 2 | M–C5H9O |
| Gly-57 | 246–249 | 1-2 | M–C4H9 |
| Ala-85 | 232–235 | 1-3 | M–C5H9O |
| Ala-57 | 260–264 | 1-2-3 | M–C4H9 |
| Glu-159 R | 330–335 | 1-2-4-5 | M–C7H15O2Si |
| Glu-85 R | 404–409 | 1-2-4-5 | M–C5H9O |
| Glu-57 R | 432–438 | 1-2-3-4-5 | M–C4H9 |
| Asp-159 R | 316–320 | 1-2-4 | M–C7H15O2Si |
| Asp-57 R | 418–423 | 1-2-3-4 | M–C4H9 |
| Asp-15 R | 460–465 | 1-2-3-4 | M–CH3 |
| Met-159 | 218–223 | 1-3-4-5 | M–C7H15O2Si |
| Met-85 | 292–297 | 1-3-4-5 | M–C5H9O |
| Met-57 | 320–325 | 1-2-3-4-5 | M–C4H9 |
| Leu-159 | 200–205 | 1-2-4-5-6 | M–C7H15O2Si |
| Leu-85 | 274–280 | 1-2-4-5-6 | M–C5H9O |
| Ile-159 | 200–205 | 1-2-4-5-6 | M–C7H15O2Si |
| Ile-85 | 274–280 | 1-2-4-5-6 | M–C5H9O |
| Pro-159 | 184–189 | 1-3-4-5 | M–C7H15O2Si |
| Ser-159 R | 288–291 | 1-3 | M–C7H15O2Si |
| Ser-85 R | 362–365 | 1-3 | M–C5H9O |
| Ser-57 R | 390–394 | 1-2-3 | M–C4H9 |
| Thr-85 R | 376–379 | 1-2-4 | M–C5H9O |
| Thr-57 | 290–294 | 1-2-3-4 | M–C4H9 |
| Thr-57 R | 404–408 | 1-2-3-4 | M–C4H9 |
| Phe-159 | 234–243 | 1-2-3-4-5-6-7-9 | M–C7H15O2Si |
| Phe-85 | 308–317 | 1-2-3-4-5-6-7-9 | M–C5H9O |
| Phe-57 | 336–345 | 1-2-3-4-5-6-7-8-9 | M–C4H9 |
| Lys-159 R | 329–335 | 1-2-4-5-6 | M–C7H15O2Si |
^a The number after the amino acid is the size of the fragment removed from the molecule with both N and C groups derivatized. An R following a fragment denotes that the R-group is also derivatized.
^b The carbon numbering follows that in the .mol file from Pipeline Pilot™. For fragments that have lost a carbon, the removed carbon is that of the carboxyl group.
Table 2.
Experimentally measured and simulated mass distributions (mol %) of amino acid fragments
| Fragment | | M+0 | M+1 | M+2 | M+3 | M+4 | M+5 | M+6 | M+7 | M+8 | M+9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Val-159 | meas^a | 61.9 ± 0.6 | 5.3 ± 0.6 | 27.6 ± 0.5 | 2.1 ± 0.4 | 3.1 ± 0.4 | | | | | |
| | sim^b | 62.2 | 5.6 | 27.9 | 1.2 | 3.1 | | | | | |
| Val-85 | meas | 61.7 ± 0.5 | 6.0 ± 0.6 | 27.8 ± 0.5 | 1.4 ± 0.4 | 3.1 ± 0.4 | | | | | |
| | sim | 62.2 | 5.6 | 27.9 | 1.2 | 3.1 | | | | | |
| Val-57 | meas | 59.6 ± 0.6 | 7.5 ± 0.5 | 16.4 ± 0.7 | 12.9 ± 0.6 | 1.1 ± 0.4 | 2.6 ± 0.4 | | | | |
| | sim | 59.5 | 7.5 | 16.1 | 13.1 | 1.1 | 2.7 | | | | |
| Gly-85 | meas | 80.9 ± 0.5 | 19.2 ± 0.5 | | | | | | | | |
| | sim | 80.6 | 19.4 | | | | | | | | |
| Gly-57 | meas | 77.8 ± 0.5 | 5.6 ± 0.5 | 16.7 ± 0.4 | | | | | | | |
| | sim | 77.9 | 5.6 | 16.5 | | | | | | | |
| Ala-85 | meas | 78.7 ± 0.5 | 3.6 ± 0.6 | 17.8 ± 0.5 | | | | | | | |
| | sim | 78.9 | 3.5 | 17.6 | | | | | | | |
| Ala-57 | meas | 75.0 ± 0.8 | 6.2 ± 0.8 | 3.7 ± 0.8 | 15.1 ± 0.7 | | | | | | |
| | sim | 75.5 | 6.1 | 3.4 | 15.1 | | | | | | |
| Glu-159 R | meas | 58.1 ± 0.7 | 13.5 ± 1.1 | 23.3 ± 0.7 | 2.9 ± 0.5 | 2.2 ± 0.4 | | | | | |
| | sim | 58.2 | 13.4 | 23.4 | 2.8 | 2.2 | | | | | |
| Glu-85 R | meas | 58.1 ± 0.8 | 13.6 ± 1.2 | 23.4 ± 0.7 | 2.7 ± 0.5 | 2.1 ± 0.4 | | | | | |
| | sim | 58.2 | 13.4 | 23.4 | 2.8 | 2.2 | | | | | |
| Glu-57 R | meas | 51.9 ± 0.7 | 15.3 ± 0.8 | 21.4 ± 0.4 | 8.1 ± 0.4 | 2.3 ± 0.4 | 1.1 ± 0.4 | | | | |
| | sim | 52.1 | 15.1 | 21.5 | 7.9 | 2.3 | 1.0 | | | | |
| Asp-159 R | meas | 65.7 ± 0.6 | 16.1 ± 0.7 | 11.9 ± 0.4 | 6.2 ± 0.5 | | | | | | |
| | sim | 66.1 | 16.2 | 11.8 | 5.9 | | | | | | |
| Asp-57 R | meas | 60.7 ± 0.5 | 16.5 ± 0.5 | 10.6 ± 0.6 | 9.6 ± 0.5 | 2.7 ± 0.4 | | | | | |
| | sim | 60.9 | 16.5 | 10.4 | 9.7 | 2.5 | | | | | |
| Asp-15 R | meas | 60.8 ± 0.6 | 16.3 ± 0.5 | 10.6 ± 0.6 | 9.6 ± 0.5 | 2.8 ± 0.5 | | | | | |
| | sim | 60.9 | 16.5 | 10.4 | 9.7 | 2.5 | | | | | |
| Met-159 | meas | 53.6 ± 0.5 | 25.8 ± 0.5 | 12.7 ± 0.5 | 6.9 ± 0.5 | 1.1 ± 0.4 | | | | | |
| | sim | 53.3 | 25.9 | 12.6 | 7.0 | 1.1 | | | | | |
| Met-85 | meas | 53.2 ± 1.0 | 25.8 ± 0.6 | 12.8 ± 0.5 | 7.0 ± 0.6 | 1.2 ± 0.7 | | | | | |
| | sim | 53.3 | 25.9 | 12.6 | 7.0 | 1.1 | | | | | |
| Met-57 | meas | 49.2 ± 0.6 | 24.9 ± 0.6 | 11.6 ± 0.5 | 9.8 ± 0.5 | 4.0 ± 0.5 | 0.5 ± 0.4 | | | | |
| | sim | 49.1 | 25.1 | 11.6 | 9.8 | 3.9 | 0.5 | | | | |
| Leu-159 | meas | 50.5 ± 0.5 | 15.9 ± 0.6 | 23.9 ± 0.5 | 6.3 ± 0.4 | 2.9 ± 0.4 | 0.6 ± 0.4 | | | | |
| | sim | 50.1 | 16.6 | 23.6 | 6.4 | 2.7 | 0.6 | | | | |
| Leu-85 | meas | 50.3 ± 0.5 | 16.4 ± 0.5 | 23.7 ± 0.5 | 6.3 ± 0.4 | 2.8 ± 0.4 | 0.6 ± 0.4 | | | | |
| | sim | 50.1 | 16.6 | 23.6 | 6.4 | 2.7 | 0.6 | | | | |
| Ile-159 | meas | 52.0 ± 0.5 | 15.1 ± 0.7 | 21.7 ± 0.4 | 7.9 ± 0.5 | 2.3 ± 0.4 | 1.0 ± 0.4 | | | | |
| | sim | 52.1 | 15.3 | 21.5 | 7.9 | 2.3 | 1.0 | | | | |
| Ile-85 | meas | 51.5 ± 0.5 | 15.5 ± 0.7 | 21.7 ± 0.4 | 8.0 ± 0.4 | 2.3 ± 0.4 | 1.0 ± 0.4 | | | | |
| | sim | 52.1 | 15.3 | 21.5 | 7.9 | 2.3 | 1.0 | | | | |
| Pro-159 | meas | 58.8 ± 0.7 | 13.3 ± 1.1 | 23.1 ± 0.7 | 2.6 ± 0.5 | 2.1 ± 0.5 | | | | | |
| | sim | 58.2 | 13.4 | 23.4 | 2.8 | 2.2 | | | | | |
| Ser-159 R | meas | 77.5 ± 0.7 | 6.2 ± 0.8 | 16.3 ± 0.7 | | | | | | | |
| | sim | 78.3 | 4.7 | 17.0 | | | | | | | |
| Ser-85 R | meas | 77.4 ± 0.8 | 6.2 ± 0.9 | 16.4 ± 0.7 | | | | | | | |
| | sim | 78.3 | 4.7 | 17.0 | | | | | | | |
| Ser-57 R | meas | 75.0 ± 0.5 | 6.1 ± 0.5 | 3.7 ± 0.5 | 15.2 ± 0.5 | | | | | | |
| | sim | 75.7 | 5.7 | 3.4 | 15.2 | | | | | | |
| Thr-85 R | meas | 66.0 ± 0.5 | 16.4 ± 0.7 | 11.7 ± 0.5 | 5.9 ± 0.5 | | | | | | |
| | sim | 66.0 | 16.4 | 11.8 | 5.8 | | | | | | |
| Thr-57 | meas | 61.0 ± 0.5 | 16.5 ± 0.6 | 10.3 ± 0.5 | 9.5 ± 0.5 | 2.7 ± 0.4 | | | | | |
| | sim | 60.8 | 16.6 | 10.5 | 9.6 | 2.5 | | | | | |
| Thr-57 R | meas | 60.6 ± 0.5 | 16.7 ± 0.5 | 10.5 ± 0.6 | 9.6 ± 0.5 | 2.6 ± 0.4 | | | | | |
| | sim | 60.8 | 16.6 | 10.5 | 9.6 | 2.5 | | | | | |
| Phe-159 | meas | 43.6 ± 0.7 | 10.7 ± 0.7 | 20.9 ± 0.7 | 9.5 ± 0.7 | 8.8 ± 0.7 | 3.1 ± 0.7 | 2.7 ± 0.7 | 0.4 ± 0.7 | 0.3 ± 0.7 | |
| | sim | 43.4 | 11.3 | 19.9 | 9.3 | 9.1 | 3.4 | 2.9 | 0.4 | 0.3 | |
| Phe-85 | meas | 43.4 ± 0.7 | 10.8 ± 0.7 | 21.0 ± 0.7 | 9.4 ± 0.7 | 8.7 ± 0.7 | 3.1 ± 0.7 | 2.6 ± 0.7 | 0.3 ± 0.7 | 0.6 ± 0.7 | |
| | sim | 43.4 | 11.3 | 19.9 | 9.3 | 9.1 | 3.4 | 2.9 | 0.4 | 0.3 | |
| Phe-57 | meas | 42.4 ± 0.5 | 11.3 ± 0.4 | 12.3 ± 0.4 | 16.7 ± 0.4 | 8.5 ± 0.4 | 4.1 ± 0.4 | 2.9 ± 0.4 | 1.3 ± 0.4 | 0.3 ± 0.4 | 0.3 ± 0.4 |
| | sim | 42.0 | 11.6 | 12.1 | 16.1 | 9.0 | 4.2 | 3.0 | 1.4 | 0.4 | 0.3 |
| Lys-159 R | meas | 51.9 ± 0.8 | 15.5 ± 1.1 | 21.6 ± 0.7 | 7.9 ± 0.7 | 2.2 ± 0.7 | 1.1 ± 0.7 | | | | |
| | sim | 50.1 | 17.8 | 21.8 | 7.2 | 2.3 | 0.7 | | | | |
^a meas are the values measured experimentally, reported as mean ± experimental error. The mass distributions are corrected for naturally occurring isotopes.
^b sim are the values simulated by the isotope model, reported for the lowest objective (56.4).
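To illustrate how individual fragments contribute to the objective, the sketch below recomputes a weighted sum of squared residuals for three fragments using values copied from Table 2. It assumes that each residual is scaled by the reported measurement error, i.e. ((meas − sim)/error)²; because the simulated values in the table are rounded to one decimal place, the results are only approximate, and the variable names are illustrative.

```python
# Each entry is (measured mean, measurement error, simulated value) in mol %,
# copied from Table 2 for three representative fragments.
fragments = {
    "Val-57": [(59.6, 0.6, 59.5), (7.5, 0.5, 7.5), (16.4, 0.7, 16.1),
               (12.9, 0.6, 13.1), (1.1, 0.4, 1.1), (2.6, 0.4, 2.7)],
    "Ser-159 R": [(77.5, 0.7, 78.3), (6.2, 0.8, 4.7), (16.3, 0.7, 17.0)],
    "Lys-159 R": [(51.9, 0.8, 50.1), (15.5, 1.1, 17.8), (21.6, 0.7, 21.8),
                  (7.9, 0.7, 7.2), (2.2, 0.7, 2.3), (1.1, 0.7, 0.7)],
}

def weighted_sse(entries):
    """Sum of squared residuals, each scaled by its measurement error
    (assumed form of the weighted objective)."""
    return sum(((meas - sim) / err) ** 2 for meas, err, sim in entries)

contributions = {name: weighted_sse(entries) for name, entries in fragments.items()}

# Rank fragments from largest to smallest contribution.
ranking = sorted(contributions, key=contributions.get, reverse=True)
```

With these inputs the lysine fragment contributes roughly 10.9 and the serine fragment roughly 5.8, compared with about 0.4 for Val-57, which is consistent with the observation that most of the error lies in the serine and lysine fragments.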
Given that no single solution emerged as a clear-cut candidate for the correct flux distribution, we set out to determine whether our model contained degrees of freedom that had no bearing on the carbon labeling and thus made the goodness-of-fit test too conservative. First, we tested whether the observed SSE values were indicative of true elucidation of metabolic fluxes rather than an artifact of the numerical scaling of the problem. To do this, we generated random but feasible flux distributions, calculated the corresponding SSE values, and contrasted them against those of the solutions obtained with (FluxCalc). As seen in Figure 3A, the root mean square error for the inferred fluxes was always much smaller than that for randomly generated feasible fluxes. This lack of overlap suggested that the low SSE values were not caused by the numerical scaling of the problem but instead were driven by correctly apportioning fluxes in accordance with the observed experimental data, and that the goodness-of-fit test might have been too conservative.
Figure 3. Inferred fluxes and propagation of error.
Panel A) shows that the objective values determined for inferred fluxes do not overlap those from randomly generated feasible fluxes, suggesting that fitting the data requires correctly apportioning fluxes in accordance with the observed experimental data. The same total number of solutions occurs in each histogram. Panel B) shows that these solutions cluster tightly for the most part and that many of the solutions could be statistically acceptable once experimental error is propagated onto the objective values. Those to the left of the maximal χ2 value (solid vertical line) are retained for use as inputs into (FluxRange), whereas those to the right are discarded.
Thus, the next step was to characterize the tightness of elucidation (i.e., resolvability) of each flux in the network. First, we calculated the variability in all fluxes subject to only network stoichiometry, the three fixed macroscopic measurements (i.e., glucose, biomass, and amorphadiene), and the fixed exchange fluxes. This involved solving the (FluxRange) formulation without imposing the isotopomer balance constraints. The flux ranges for central metabolism (see Figure 4A) are provided in the third and fourth columns of Table 3 while the ranges for the entire network are available as supplementary material. Table 4 reveals that 125 fluxes of the total 270 net fluxes are fixed by stoichiometry alone. However, as expected, nearly all central metabolic fluxes span wide ranges of values motivating the need for 13C-based MFA.
Figure 4. Flux degree of resolution.
The proportion of fluxes with improved (higher) degrees of resolution considerably increases for the reduced model (open rectangles), as compared with the full model (closed rectangles), once ambiguities in the flux elucidations are identified and the network is selectively pruned with biological insight.
Table 3.
Flux ranges
| Abbreviation^c | Subsystem | Relaxed model^a,b min | Relaxed model max | Full model min | Full model max | Reduced model min | Reduced model max | Basic model min | Basic model max |
|---|---|---|---|---|---|---|---|---|---|
| ICL | Anaplerotic reactions | 0.0 | 12.3 | 0.0 | 1.0 | 0.0 | 0.4 | 0.0 | 0.0 |
| MALS | Anaplerotic reactions | 0.0 | 12.3 | 0.0 | 1.0 | 0.0 | 0.4 | 0.0 | 0.0 |
| ME1x | Anaplerotic reactions | 0.0 | 75.9 | 0.0 | 4.0 | 0.0 | 3.7 | 0.0 | 0.0 |
| PPA | Anaplerotic reactions | 0.8 | 129.4 | 0.8 | 107.2 | 0.8 | 80.1 | 0.8 | 83.1 |
| PPC | Anaplerotic reactions | 0.0 | 130.1 | 1.1 | 6.8 | 1.1 | 5.5 | 1.4 | 5.5 |
| PPCK | Anaplerotic reactions | 0.0 | 128.7 | 0.0 | 5.4 | 0.0 | 4.0 | 0.0 | 4.0 |
| ACONT | Citrate Cycle (TCA) | 0.5 | 12.8 | 0.8 | 4.3 | 0.9 | 3.2 | 0.9 | 3.1 |
| AKGD | Citrate Cycle (TCA) | 0.0 | 12.3 | 0.0 | 2.7 | 0.0 | 2.7 | 0.1 | 2.6 |
| CITL | Citrate Cycle (TCA) | 0.0 | 128.7 | 0.0 | 128.6 | 0.0 | 128.7 | 0.0 | 0.0 |
| CS | Citrate Cycle (TCA) | 0.5 | 141.0 | 0.8 | 109.4 | 1.4 | 82.4 | 0.9 | 3.1 |
| FRD2 | Citrate Cycle (TCA) | 0.0 | 68.7 | 0.0 | 58.6 | 0.0 | 0.0 | 0.0 | 0.0 |
| FRD3 | Citrate Cycle (TCA) | 0.0 | 68.7 | 0.0 | 58.6 | 4.0 | 41.5 | 0.0 | 0.0 |
| FUM | Citrate Cycle (TCA) | 0.4 | 281.5 | 1.0 | 230.0 | 0.7 | 14.3 | 0.8 | 5.5 |
| ICDHy | Citrate Cycle (TCA) | 0.5 | 12.8 | 1.1 | 3.2 | 0.5 | 3.2 | 0.9 | 3.1 |
| MDH | Citrate Cycle (TCA) | −59.1 | 281.5 | −1.4 | 230.0 | −1.1 | 14.3 | 0.8 | 5.5 |
| SUCD1i | Citrate Cycle (TCA) | 0.0 | 77.3 | 2.6 | 60.5 | 5.1 | 43.8 | 0.4 | 2.6 |
| SUCOAS | Citrate Cycle (TCA) | −12.3 | 18.4 | −4.0 | 0.1 | −2.6 | 11.6 | −2.5 | 0.0 |
| ENO | Glycolysis/Gluconeogenesis | −44.3 | 18.8 | −40.3 | 18.4 | 16.5 | 18.4 | 16.6 | 18.8 |
| FBA | Glycolysis/Gluconeogenesis | −0.3 | 9.6 | 0.2 | 9.6 | 7.7 | 9.6 | 7.9 | 9.6 |
| FBP | Glycolysis/Gluconeogenesis | 0.0 | 128.7 | 0.0 | 107.3 | 0.0 | 79.8 | 0.0 | 0.0 |
| G1PP | Glycolysis/Gluconeogenesis | 0.0 | 128.7 | 0.0 | 107.3 | 0.0 | 0.0 | 0.0 | 0.0 |
| GAPD | Glycolysis/Gluconeogenesis | −32.3 | 19.0 | −20.1 | 19.0 | 17.2 | 19.0 | 17.3 | 19.0 |
| GLCP | Glycolysis/Gluconeogenesis | 0.0 | 128.7 | 0.0 | 107.3 | 0.0 | 0.0 | 0.0 | 0.0 |
| GLCS1 | Glycolysis/Gluconeogenesis | 0.1 | 128.7 | 0.1 | 107.4 | 0.1 | 0.1 | 0.1 | 0.1 |
| GLGC | Glycolysis/Gluconeogenesis | 0.1 | 128.7 | 0.1 | 107.4 | 0.1 | 0.1 | 0.1 | 0.1 |
| HEX1 | Glycolysis/Gluconeogenesis | 0.0 | 136.2 | 0.0 | 115.9 | 0.0 | 10.0 | 0.0 | 10.0 |
| PDH | Glycolysis/Gluconeogenesis | 0.0 | 27.1 | 0.0 | 16.4 | 0.0 | 16.4 | 0.0 | 15.3 |
| PFK | Glycolysis/Gluconeogenesis | 0.0 | 138.3 | 0.0 | 116.4 | 7.8 | 88.3 | 7.9 | 9.6 |
| PGI | Glycolysis/Gluconeogenesis | 0.0 | 9.9 | 0.0 | 9.9 | 4.4 | 9.9 | 4.7 | 9.9 |
| PGK | Glycolysis/Gluconeogenesis | −32.3 | 19.0 | −20.1 | 19.0 | 17.2 | 19.0 | 17.3 | 19.0 |
| PGM | Glycolysis/Gluconeogenesis | −44.3 | 18.8 | −42.3 | 18.4 | 16.5 | 18.4 | 16.6 | 18.8 |
| PPS | Glycolysis/Gluconeogenesis | 0.0 | 128.7 | 0.0 | 108.4 | 0.0 | 79.8 | 0.0 | 0.0 |
| PYK | Glycolysis/Gluconeogenesis | 0.0 | 144.2 | 0.0 | 112.3 | 3.5 | 92.7 | 4.6 | 16.6 |
| TPI | Glycolysis/Gluconeogenesis | −41.6 | 9.6 | −29.4 | 9.6 | 7.8 | 9.6 | 7.9 | 9.6 |
| ATPS4r | Oxidative phosphorylation | −24.2 | 108.4 | −24.2 | 106.6 | −24.2 | 60.3 | −24.2 | 63.5 |
| CYTBO3 | Oxidative phosphorylation | 27.9 | 77.2 | 31.1 | 77.2 | 31.1 | 43.7 | 31.1 | 44.2 |
| FDH2 | Oxidative phosphorylation | 0.0 | 27.1 | 0.0 | 27.1 | 0.0 | 15.3 | 0.0 | 15.3 |
| LDH_D2 | Oxidative phosphorylation | 0.0 | 68.7 | 0.0 | 54.5 | 0.0 | 0.0 | 0.0 | 0.0 |
| NADH10 | Oxidative phosphorylation | 0.0 | 68.7 | 0.0 | 58.8 | 0.0 | 0.0 | 0.0 | 0.0 |
| NADH12 | Oxidative phosphorylation | 0.0 | 68.7 | 0.0 | 54.8 | 0.0 | 35.0 | 0.0 | 42.1 |
| NADH6 | Oxidative phosphorylation | 0.0 | 68.7 | 0.0 | 53.7 | 0.0 | 35.0 | 0.0 | 42.1 |
| NADH7 | Oxidative phosphorylation | 0.0 | 68.7 | 0.0 | 58.8 | 0.0 | 0.0 | 0.0 | 0.0 |
| NADH8 | Oxidative phosphorylation | 0.0 | 68.7 | 0.0 | 58.8 | 3.0 | 41.5 | 0.0 | 0.0 |
| NADH9 | Oxidative phosphorylation | 0.0 | 68.7 | 0.0 | 58.3 | 0.0 | 0.0 | 0.0 | 0.0 |
| POX | Oxidative phosphorylation | 0.0 | 27.1 | 0.0 | 16.1 | 0.0 | 15.3 | 0.0 | 0.0 |
| SUCD4 | Oxidative phosphorylation | −0.1 | 77.2 | 2.3 | 60.4 | 5.0 | 43.7 | 0.3 | 44.2 |
| THD2 | Oxidative phosphorylation | 0.0 | 257.3 | 0.0 | 214.8 | 0.0 | 159.7 | 0.0 | 164.7 |
| THD5 | Oxidative phosphorylation | 0.0 | 272.6 | 0.0 | 229.6 | 0.0 | 163.9 | 0.0 | 169.9 |
| TRDR | Oxidative phosphorylation | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
| EDA | Pentose Phosphate Cycle | 0.0 | 9.9 | 0.0 | 9.9 | 0.0 | 0.0 | 0.0 | 0.0 |
| G6PDHy | Pentose Phosphate Cycle | 0.0 | 9.9 | 0.0 | 9.9 | 0.0 | 5.5 | 0.0 | 5.2 |
| PGDH | Pentose Phosphate Cycle | 0.0 | 9.9 | 0.0 | 5.5 | 0.0 | 5.5 | 0.0 | 5.2 |
| PGDHy | Pentose Phosphate Cycle | 0.0 | 9.9 | 0.0 | 9.9 | 0.0 | 0.0 | 0.0 | 0.0 |
| PGL | Pentose Phosphate Cycle | 0.0 | 9.9 | 0.0 | 9.9 | 0.0 | 5.5 | 0.0 | 5.2 |
| RPE | Pentose Phosphate Cycle | −0.2 | 6.4 | −0.2 | 4.5 | −0.2 | 3.4 | −0.2 | 3.2 |
| RPI | Pentose Phosphate Cycle | −3.5 | −0.2 | −2.6 | −0.2 | −2.1 | −0.2 | −1.9 | −0.2 |
| TAL | Pentose Phosphate Cycle | −0.1 | 3.2 | −0.1 | 2.3 | −0.1 | 1.8 | −0.1 | 1.7 |
| TKT1 | Pentose Phosphate Cycle | −0.1 | 3.2 | −0.1 | 2.3 | −0.1 | 1.8 | −0.1 | 1.7 |
| TKT2 | Pentose Phosphate Cycle | −0.2 | 3.1 | −0.2 | 1.7 | −0.2 | 1.6 | −0.2 | 1.6 |
| ACt6 | Transport, Extracellular | −12.3 | 0.0 | −11.6 | 0.0 | −11.5 | −8.4 | −11.5 | 0.0 |
| GLCpts | Transport, Extracellular | 0.0 | 10.0 | 0.0 | 10.0 | 0.0 | 10.0 | 0.0 | 10.0 |
| GLCt2 | Transport, Extracellular | 0.0 | 10.0 | 0.0 | 10.0 | 0.0 | 10.0 | 0.0 | 10.0 |
| AMDNt | Amorphadiene | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
| BIOMASS | Biomass | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
^a All flux values have units of mM/hr and are relative to a glucose uptake rate of 10 mM/hr.
^b The relaxed model only uses stoichiometric data and not the 13C labeling data and serves as the allowable bounds for the fluxes. Shown here is the relaxed model of the full model.
^c Flux abbreviations correspond to the names in (Reed et al. 2003) and are shown in Figure 6.
Table 4.
Flux resolution classification (number of fluxes in each model)
| Classification | Full | Reduced | Basic |
|---|---|---|---|
| Set by experimental conditions | 3 | 3 | 3 |
| Pruned by experimental conditions/reductions | 22 | 51 | 61 |
| Stoichiometrically fixed | 100 | 94 | 96 |
| Fully Resolved | 0 | 3 | 0 |
| Partially Resolved | 131 | 110 | 88 |
| Unresolved | 14 | 9 | 22 |
We next reran formulation (FluxRange) after including the isotopomer balancing constraints to determine (i) which fluxes (if any) were fully resolved; (ii) how many of them were only partially resolved; and (iii) whether any remaining fluxes in the model were completely unaffected by the isotopomer data. A cut-off objective value of 66.0 was determined for (FluxRange) as the 95th percentile using an F-test statistic (Kleijn et al., 2006). The calculated flux ranges for central metabolism under this scenario are provided in the fifth and sixth columns of Table 3, while the remaining flux ranges are included as supplementary material. The identified classifications of the fluxes based on their resolution status are summarized in Table 4. Of the 145 net fluxes that were not fixed by stoichiometry, 131 were partially resolved, with bound ranges spanning from 0.2% to 98% of the flux range allowable by stoichiometry alone. These ranges can be characterized by defining the degree of resolution, as shown in Figure 5. Examples of partially resolved fluxes with significantly narrower (and more biologically realistic) ranges after the imposition of the isotopomer balancing constraints included the pentose phosphate pathway reactions (RPE, RPI, TKT1, TKT2, and TAL), the TCA cycle reactions (ACONT, ICDHy, AKGD, SUCOAS), and the anaplerotic reactions (ICL, MALS, PPC, PPCK).
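As a concrete illustration of how a constrained range compares with the range allowed by stoichiometry alone, the sketch below computes the width of each constrained range as a fraction of the relaxed width for three fluxes from Table 3. The function name `width_fraction` and the choice of fluxes are ours; this ratio is offered as one plausible way to express the percentages quoted above and is not necessarily the exact definition of the degree of resolution used in the figure.

```python
# (min, max) flux ranges in mM/hr copied from Table 3.
ranges = {
    "PPC": {"relaxed": (0.0, 130.1), "full": (1.1, 6.8), "reduced": (1.1, 5.5)},
    "ICL": {"relaxed": (0.0, 12.3), "full": (0.0, 1.0), "reduced": (0.0, 0.4)},
    "GAPD": {"relaxed": (-32.3, 19.0), "full": (-20.1, 19.0), "reduced": (17.2, 19.0)},
}

def width_fraction(constrained, relaxed):
    """Width of the constrained range divided by the stoichiometric width."""
    relaxed_width = relaxed[1] - relaxed[0]
    if relaxed_width == 0:
        return 0.0  # flux already fixed by stoichiometry
    return (constrained[1] - constrained[0]) / relaxed_width

fractions = {
    name: {
        model: width_fraction(bounds[model], bounds["relaxed"])
        for model in ("full", "reduced")
    }
    for name, bounds in ranges.items()
}
```

For example, PPC in the full model retains about 4% of its stoichiometric range, while GAPD retains about 76% in the full model but only about 4% in the reduced model.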
Figure 5. Central metabolism flux maps.

The flux map of central metabolism for the full model (panel A) shows the metabolite names (sans-serif typeface) and reaction names (smaller, serif typeface). Irreversible reactions have single, filled arrows. Reversible reactions have two arrows: the forward direction is filled and the backward direction is unfilled. In comparison, most of the reactions are irreversible in the smaller basic model (panel B).
Closer investigation of columns five and six of Table 3 reveals that additional manual intervention is needed when resolving fluxes based on a large model, particularly in cases where there exist pathways with redundant labeling patterns. For example, the labeling pattern of pyruvate is the same regardless of whether glucose is metabolized via Embden-Meyerhof-Parnas (EMP) glycolysis (GAPD, PGK, PGM, ENO, and PYK) or via the methylglyoxal pathway (MGSA, LGTHL, GLYOX, and LDH_D2). In fact, one feasible, though obviously unrealistic, flux distribution involved a high gluconeogenic flux (~20 mM/hr) coupled to an even higher flux (~40 mM/hr) through the methylglyoxal pathway. In cases such as these, care must be taken to prune the model using biological intuition without compromising its ability to match the experimental labeling observations. Although this involves making assumptions about the current metabolic system, it provides a systematic way of cataloging these assumptions so that they can later be verified via expression measurements, enzyme assays, or additional labeling studies with different labeled substrates. Candidate reactions for removal must have (i) flux ranges that start at zero and (ii) no effect on the objective value of any (FluxCalc) solution upon removal. Thus, by using information from the large-scale model, reactions can be identified whose removal does not affect the quality of agreement with the data. Removing 33 fluxes in effectively redundant pathways (see Table 5), selected using biological intuition and the requirement of not worsening the objective, led to the derivation of a ‘reduced’ model. We note that, with the removal of these reactions, which have no bearing on the effective carbon labeling given the substrate labeling, the model contains fewer degrees of freedom. There are 99 free fluxes in the reduced model, giving a total of 74 independent flux variables for (FluxCalc) to determine. An updated goodness-of-fit test (having 130−74=56 degrees of freedom) is passed, with a maximum allowable objective of 74.4. Thus, we can accept the reduced model as statistically valid (P(χ2[54]<56.2)=0.6075).
Table 5.
Fluxes manually set or pruned
| Abbreviation^a | Value | Model |
|---|---|---|
| EX_glc | −10 | All |
| BIOMASS_Ec | 0.46 | All |
| EX_amdm | 0.1114 | All |
| EX_ala-L, EX_arg-L, EX_asn-L, EX_asp-L, EX_cys-L, EX_etoh, EX_for, EX_gln-L, EX_glu-L, EX_gly, EX_ile-L, EX_lac-L, EX_leu-L, EX_lys-L, EX_met-L, EX_phe-L, EX_pro-L, EX_ser-L, EX_succ, EX_thr-L, EX_tyr-L, EX_val-L | 0 | All |
| MGSA, LGTHL, GLYOX, LDH_D2 | 0 | reduced, basic |
| PGDHY, EDA | 0 | reduced, basic |
| FRD2 | 0 | reduced, basic |
| ADHEr_b, ADHEr_f | 0 | reduced, basic |
| LDH_D_b, LDH_D_f | 0 | reduced, basic |
| G1PP | 0 | reduced, basic |
| METAT, ADMDC, SPMS, MTAN, MTRK, MTRI, MDRPD, DKMPPD, DKMPD2, P5CR, UNK3 | 0 | reduced, basic |
| GLCP | 0 | reduced, basic |
| ASPT | 0 | reduced, basic |
| SERD_L | 0 | reduced, basic |
| TRPAS1 | 0 | reduced, basic |
| FTHFD | 0 | reduced, basic |
| NADH7, NADH9, NADH10 | 0 | reduced, basic |
| PGMT_f | 0 | reduced, basic |
| VALTA_f | 0 | reduced, basic |
| ME1x | 0 | basic |
| ICL, MALS | 0 | basic |
| POX | 0 | basic |
| FBP | 0 | basic |
| PPS | 0 | basic |
| CITL | 0 | basic |
| NADH8, FRD3 | 0 | basic |
| THRA_f, THRA_b | 0 | basic |
| FBA_b, GAPD_b, PGK_b, PGM_b, ENO_b, SUCOAS_f, ICDHy_b, ACONT_b, G6PDHy_b, ASPTA1_f, PGI_b | 0 | basic |
^a Flux abbreviations correspond to the names in (Reed et al. 2003).
As shown in Figure 3B, most of the 521 inferred optima had objective values in the range 56 to 107, with most (all but 100) outside the significance level of 74.4. The lowest objective value obtained was 56.4, which occurred six times (for comparison, 56.5 occurred 20 times). A total of 49 different optimum values were observed when rounding to 0.1; 12 of these were below the statistical cutoff. Given the plethora of identified local minima with very similar objective function values acceptable by χ2 criteria but with quite different flux distributions, the next step was to deduce flux ranges in equivalent solutions given the experimental error in the measurements. This result further validated the decision to identify as many statistically equivalent solutions as possible rather than homing in on a single solution that may miss many of the attributes of the physiologically relevant one.
Reapplication of the (FluxRange) formulation, using an updated cutoff of 60.6 owing to the increased degrees of freedom, led to far more biologically realistic ranges for the glycolytic fluxes GAPD, PGK, PGM, and ENO, as shown in columns seven and eight of Table 3. Note that this reduced model also depends upon the specific experimental labeling conditions; that is, a flux that is redundant for one experimental condition may not be generically so. For instance, the Entner-Doudoroff (ED) pathway was set to zero because the current experiment was performed with a mixture of uniformly labeled and unlabeled glucose, which cannot distinguish between ED and Embden-Meyerhof-Parnas (EMP) glycolysis. This assumption led to a tighter flux range for the oxidative branch of the pentose phosphate pathway (G6PDHy, PGL, PGDH). Improvement in the overall degree of resolution was also observed for this reduced model, as seen in Figure 5. We again observed that some fluxes that had consistent values across the optimum solutions (e.g., ICL had a value of zero) could in fact take a range of values once measurement noise was accounted for, emphasizing the importance of range calculations. Of particular note is the acetate transport flux. Although the allowable stoichiometric range remained the same as for the full model (from 0 to 12.3 mM/hr), the allowable range for the isotopomer model shrank to 8.4 to 11.5 mM/hr. This narrow range was in good agreement with the experimentally observed values (10.2–12.8 mM/hr) and thus served as a validation of the model.
Finally, we tested the value added by using a relatively large-scale model of metabolism in contrast to existing metabolic representations used in similar efforts (Christensen and Nielsen, 1999; Fischer et al., 2004; Ghosh et al., 2005). Instead of choosing one such representation, we decided to simply prune additional reactions from our model to make it resemble previously employed metabolic representations. In contrast with the reduced model described above, the construction of the ‘basic’ model was not based on using information from the elucidation in the large model to decide which reactions can be removed. A pictorial representation of the basic model is shown in Figure 4B. Specifically, we further pruned 26 transformations (see Table 5) consisting primarily of reactions typically assumed inactive for aerobic growth on glucose, such as fumarate reductase (FRD) and phosphoenolpyruvate synthase (PPS), and also anaplerotic reactions such as malic enzyme (ME1x) and the glyoxylate shunt (ICL). Furthermore, many reactions were treated as irreversible, as was assumed in previous studies. In doing so, the residual sum of squares for the measured labeling patterns of the amino acids worsened from 56.4 to 559 in the best case, indicating clearly that significant information is lost when neglecting potentially active pathways present in the large-scale model. Notably, all solutions to the (FluxCalc) problem were well outside the goodness-of-fit cut-off objective value used in the previous scenarios, and the model was also statistically unacceptable (P(χ2[80]<558)=1.0). A new F-test cut-off was calculated and implemented for (FluxRange) as before. Using the basic model and the new objective value cut-off, many partially resolved fluxes had tighter ranges than for the complete model, as shown in columns nine and ten of Table 3. Although this may seem to be a desirable attribute, in reality it means that, by excluding some reactions, the smaller model cannot discern certain flux distributions that lead to wider ranges in the elucidated fluxes, and thus potentially useful information is lost (Table 4). The impact of removing each reaction individually on the objective was also examined. Removal of FRD had the largest impact, resulting in a 24% increase in objective value, but removal of ME1x and the assumption of irreversibility for PGM and ENO also caused 9% increases by not allowing changes in the labeling of PEP by the TCA cycle to affect the labeling of the 3PG amino acids. Other reactions had smaller effects individually.
Summary
In this paper we introduced a large-scale isotopomer mapping model accounting for 238 reactions, 350 fluxes and 184 metabolites, which is at least four-fold larger than existing mapping models. This model is available as supplementary material. The novel aspect of this model is the simultaneous presence of explicit cofactor balancing and biomass drain used in conjunction with isotope data. While others have previously introduced models that included cofactor balances or isotope data (van Winden et al., 2003), their simultaneous use was never exploited for flux elucidation. Constructing the isotopomer mapping model by reducing a genome-scale one enabled the seamless integration of MFA results with FBA analyses. The model was adapted for the amorphadiene-producing strain (Martin et al., 2003), where fragments from 13 amino acids were analyzed using GC-MS. We found that many (i.e., tens) of different local minima fit the experimental data equally well. This motivated the use of a goodness-of-fit test to determine significance. We found that many of the local minima were statistically indistinguishable from one another, implying that, given the available data, significant ambiguity remained in flux elucidation. This ambiguity was subsequently quantified by identifying lower and upper bounds on all the fluxes given a required level of fit with the experimental data. Using these results as a starting point, we removed unneeded, unobservable, or effectively redundant pathways (from the perspective of the experimental data) from the model while making sure that their removal did not worsen the fit. Subsequently, with fewer equivalent routes, the model was found to adequately describe the data. The advantage of starting with a more complete model and subsequently pruning inactive pathways was quantitatively demonstrated by simply removing pathways typically absent in existing isotopomer mapping models. This led to a statistically significantly worse fit and allowed us to pinpoint the reactions whose removal was not warranted by the available data.
It is important to note that the methods described here could be extended to take advantage of the EMU framework (Antoniewicz et al., 2006b) for the rapid evaluation of the labeling patterns using multiple isotopic tracers by coupling it to the optimization procedure described herein, similar to the very recent work on a 1,3-propanediol system (Antoniewicz et al., 2007). Also, the developed isotopomer model is not restricted to isotope data generated from GC-MS. For example, NMR data could also be incorporated by mapping the Iik predictions onto the NMR data in a way analogous to that used to link the Iik with the MDV values obtained from the GC-MS data. The NMR data may include the fractional abundance of 13C atoms at each position or the relative amount of 13C-13C and 13C-12C carbons through COSY NMR. Furthermore, the Iik predictions could also be linked to positional isotopomer data generated by methods such as Fourier Transform-Ion Cyclotron Resonance Mass Spectrometry (FT-ICR MS) (Sleno et al., 2005), which would enable even more accurate metabolic flux elucidations by providing more specific flux labeling information.
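The link between isotopomer fractions and the two kinds of measurements mentioned above can be sketched in a few lines. The example below is illustrative only: it assumes a hypothetical three-carbon metabolite, an invented isotopomer distribution, and a binary encoding in which bit c−1 of the isotopomer index k is set when carbon c is 13C. Natural isotope abundance is ignored.

```python
def labeled_count(k, carbons):
    """Number of 13C atoms among the given carbons (1-based) in isotopomer k."""
    return sum((k >> (c - 1)) & 1 for c in carbons)

def mdv_from_idv(idv, carbons):
    """Mass distribution (M+0, M+1, ...) of a fragment retaining `carbons`."""
    mdv = [0.0] * (len(carbons) + 1)
    for k, fraction in enumerate(idv):
        mdv[labeled_count(k, carbons)] += fraction
    return mdv

def positional_enrichment(idv, n_carbons):
    """Fraction of molecules carrying 13C at each carbon position,
    the kind of quantity an NMR measurement could report."""
    return [
        sum(f for k, f in enumerate(idv) if (k >> c) & 1)
        for c in range(n_carbons)
    ]

# Hypothetical isotopomer distribution for a three-carbon metabolite.
n = 3
idv = [0.0] * (2 ** n)
idv[0b000] = 0.75  # unlabeled
idv[0b001] = 0.03  # C1 labeled
idv[0b010] = 0.02  # C2 labeled
idv[0b100] = 0.01  # C3 labeled
idv[0b011] = 0.02  # C1 and C2 labeled
idv[0b110] = 0.02  # C2 and C3 labeled
idv[0b111] = 0.15  # fully labeled

mdv_full = mdv_from_idv(idv, [1, 2, 3])   # whole molecule
mdv_fragment = mdv_from_idv(idv, [2, 3])  # fragment that lost carbon 1
enrichment = positional_enrichment(idv, n)
```

The same isotopomer vector thus yields a mass distribution for any fragment (as measured by GC-MS) and a per-position enrichment, which is why additional data types can be attached to the model without changing the underlying balances.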
In the present study, the inability to uniquely resolve fluxes was a consequence of the limited set of derivatized metabolites (i.e., only 13 metabolites), the limited set of measured fragments, the choice of labeled substrate, and the multiple routes with identical labeling patterns afforded by the large-scale mapping model. While not explicitly covered in this paper, the proposed framework can be used to systematically identify how many and which additional metabolites to measure and what substrate carbon labeling patterns to use to uniquely recapitulate all fluxes in the network. For example, the inability to tightly resolve the branch point between the pentose phosphate pathway and glycolysis in this experiment was due to the use of uniformly labeled glucose. Using glucose labeled at position 1 instead would have enabled the identification of the PPP/glycolysis branch ratio. Note that the isotopomer mapping model can be used to evaluate the changes in fluxes throughout the network upon addition of exogenous genes or deletions made during the engineering of pathways by simply appending or removing the corresponding functionalities from the model. Finally, our study demonstrated that even if a smaller mapping model is ultimately used, it is important to start with a more complete reconstruction on which to base the decisions about which reactions can be eliminated without compromising the quality of agreement with the data.
Supplementary Material
Acknowledgments
This work was supported by NIH Phase I SBIR #R43 RR020263-01, by DOE DE-FG02-05ER25684, and by a grant from the Institute for OneWorld Health through the generous support of the Bill & Melinda Gates Foundation.
5. Appendix
In the context of metabolic flux analysis (MFA) the description of a metabolic network requires the definition of the following sets, variables, and parameters.
Sets
I = {i} = set of metabolites
J = {j} = set of reactions
JR = {j} = set of reversible reactions
IF = {i} = metabolites present in growth medium
IE = {i} = metabolites that can cross cell boundaries
Parameters
Sij = stoichiometric matrix
Variables
νj = fluxes
bi = exchange fluxes
The set I contains all of the metabolites present while set J enumerates all reactions composing the metabolic network. The set IF is all of the metabolites that are contained in the growth medium, and set IE contains all metabolites that can cross cellular boundaries. Sij is the stoichiometric coefficient of metabolite i in reaction j. νj quantifies the rate of reaction j, and bi is the rate of transport (active or passive) of metabolite i across cellular boundaries. Reversible reactions are replaced by the difference of the corresponding pair of exchange reactions thus maintaining positivity of all reaction steps present in the model:
| (1) |
Using the principle of stoichiometric analysis along with the application of a pseudo-steady-state hypothesis to the intracellular metabolites (Vallino and Stephanopoulos, 1993), an overall flux balance can be written as follows:
| (2) |
Metabolites that do not cross the cell's boundaries have no exchange flux; that is, bi = 0, ∀ i ∉ IE. For metabolites that can cross the boundaries, a positive value of bi denotes entry into the cell and a negative value denotes exit from the cell.
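As a minimal illustration of the balance described here, the sketch below evaluates exchange fluxes for a hypothetical three-metabolite network. The network, the flux values, the names `S`, `v`, and `I_E`, and the sign convention (net production plus b_i equals zero, so that uptake gives a positive b_i) are assumptions made for illustration and are not taken from the E. coli model.

```python
# Hypothetical toy network: A -> B (r1) and B <-> C (r2, reversible).
# The reversible step r2 is split into two non-negative fluxes.
S = {
    "A": {"r1": -1},
    "B": {"r1": 1, "r2_fwd": -1, "r2_bwd": 1},
    "C": {"r2_fwd": 1, "r2_bwd": -1},
}
v = {"r1": 10.0, "r2_fwd": 12.0, "r2_bwd": 2.0}  # all fluxes >= 0
I_E = {"A", "C"}  # metabolites allowed to cross the cell boundary


def net_production(S, v):
    """Sum of S_ij * v_j over all reactions j, for every metabolite i."""
    return {i: sum(coef * v[j] for j, coef in row.items()) for i, row in S.items()}


def exchange_fluxes(S, v):
    """Assumed convention: sum_j S_ij v_j + b_i = 0, so b_i > 0 means uptake."""
    return {i: -p for i, p in net_production(S, v).items()}


b = exchange_fluxes(S, v)
net_r2 = v["r2_fwd"] - v["r2_bwd"]  # net flux of the reversible step

# Metabolites outside I_E must be balanced internally (b_i = 0).
balanced = all(abs(b[i]) < 1e-9 for i in S if i not in I_E)
```

In this toy case A is taken up, C is secreted, and B is balanced by the intracellular reactions alone.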
When 13C substrate labeling is introduced, an additional layer of detail is needed to fully characterize the network. This information includes the substrate(s) labeling patterns and descriptions of the fate of the carbon atoms in each reaction. We express the labeling patterns using the concepts of isotopomer distribution vectors (IDVs) (Schmidt et al., 1997; Wittmann and Heinzle, 2002) and isotopomer mapping matrices (IMMs) (see subsection 2.2). The following additional sets, parameters and variables are required for the mathematical quantification.
Sets
K ={k} = set of isotopomers
N = {n} = the number of carbon atoms
Parameters
IMM = isotopomer mapping matrix
Variables
Iik = isotopomer distribution vector
The set K enumerates all possible labeling patterns for a given metabolite. Variable Iik, referred to as the isotopomer distribution vector, is defined as the fraction of metabolite i that exists in the isotopomer form k. There are at most 2^n isotopomers for each metabolite containing n carbons, and the sum of all Iik over k for a given metabolite i is equal to one. Parameter IMM links the specific isotopomer form k′ of reactant i′ that contributes to the formation of product i in isotopomer form k through reaction j. The corresponding entry of IMM for such an index combination is equal to one unless it refers to a symmetric molecule i′ (e.g., succinate). For symmetric molecules, IMMs have fractional entries to account for the fact that an isotopomer k′ may map to more than one isotopomer k of the same product molecule i. Given the above definitions, the mass balance for every metabolite i in isotopomer form k can be written as follows:
| (3) |
in which the product symbol is used for term-by-term multiplication. Figure 1 pictorially illustrates the origin of each term present in the isotopomer balance equation. The balance equation acts as a mixing/splitting node where all generation terms of metabolite i in isotopomer form k through the different reactions j are aggregated and then channeled to all consuming reactions at a fixed isotopomer fraction. Note that the generation terms give rise to all of the nonlinearities in the balances. Unimolecular reactions yield products of metabolic fluxes and isotopomer fractions (bilinear terms), while bimolecular reactions contribute trilinear terms due to the presence of two separate isotopomer fractions in the product. Thus, the isotopomer balances abstracted through Eq. (3) relate the Iik of each intracellular metabolite to the fluxes νj, the IMMs, and the isotopomer distribution in the feedstock (Schmidt et al., 1997).
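The ideas in this passage (isotopomer enumeration, fractional entries for symmetric molecules, trilinear generation terms, and the mixing node) can be sketched on a toy scale. The code below is an illustrative assumption, not the mapping model used in this study: isotopomers are represented as 0/1 tuples, symmetry is modeled as a 50/50 split between a pattern and its reversed carbon order, and the function names and numerical values are hypothetical.

```python
from itertools import product


def isotopomers(n):
    """All 2**n labeling patterns of an n-carbon metabolite (1 marks 13C)."""
    return list(product((0, 1), repeat=n))


def symmetrize(idv, n):
    """Symmetric molecule: half of each fraction goes to the reversed pattern."""
    out = {k: 0.0 for k in isotopomers(n)}
    for pattern, frac in idv.items():
        out[pattern] += 0.5 * frac
        out[pattern[::-1]] += 0.5 * frac
    return out


def condense(idv_a, idv_b):
    """Bimolecular step A + B -> P, carbons of A followed by carbons of B.
    Each product fraction is a product of two reactant fractions."""
    return {ka + kb: fa * fb
            for (ka, fa), (kb, fb) in product(idv_a.items(), idv_b.items())}


def mix(sources):
    """Mixing node: flux-weighted average of the incoming IDVs.
    `sources` is a list of (flux, idv) pairs producing the same metabolite."""
    total = sum(flux for flux, _ in sources)
    keys = set().union(*(idv.keys() for _, idv in sources))
    return {k: sum(flux * idv.get(k, 0.0) for flux, idv in sources) / total
            for k in keys}


# A 4-carbon symmetric molecule labeled only at carbon 1.
succ = symmetrize({(1, 0, 0, 0): 1.0}, n=4)

# Two 1-carbon reactants condense (flux 3); an unlabeled route (flux 1) also makes P.
a = {(0,): 0.8, (1,): 0.2}
b = {(0,): 0.5, (1,): 0.5}
p = mix([(3.0, condense(a, b)), (1.0, {(0, 0): 1.0})])
```

Multiplying the flux by the two reactant fractions inside `mix` is what produces the trilinear terms mentioned above; a unimolecular route would contribute only a flux times a single fraction.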
Experimental techniques such as mass spectrometry are used to obtain isotope labeling data. However, instead of providing the isotopomer distribution Iik directly, mass spectrometry provides raw data on the mass distribution vectors (MDVs) for each measured metabolite. Because mass spectrometry essentially acts as a molecular-level scale that deduces the weight distribution of each measured metabolite, MDVs contain information only about the total number of labeled carbons in a fragment generated by the mass spectrometer, not their specific locations. These weight differences arise from the fact that different isotopomer forms may contain different numbers of labeled carbons. Each entry of the MDV contains a group of isotopomers that all have the same mass (Wittmann and Heinzle, 1999). Linking the isotopomer distribution vectors and the mass distribution vectors requires the following additional definitions.
Sets
F = {f} = set of fragments
M = {m} = set of mass fractions
R = {r} = set of replicates
IM = {i} = set of measured metabolites
Parameters
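isotopomer grouping matrix (links isotopomer k of metabolite i to mass fraction m of fragment f)
standard deviation (experimental error of the measured mass distribution)
Variables
MDV = mass distribution vector (for metabolite i, fragment f, and mass fraction m)
mean MDV = mean mass distribution vector over the replicates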
Here set F contains all possible fragments generated upon ionization for a measured metabolite i. Set IM is the set of measured metabolites, which in the current study are 13 amino acids. Set M represents the mass fractions observed for any given fragment f, as shown in the illustrative example in Figure 2. The set R represents all measurement replicates, as described above in the experimental section. The isotopomer grouping matrix links the specific isotopomer form k of metabolite i with the mass fraction m associated with fragment f of metabolite i. For a given fragment, all of the isotopomers that have the same number of labeled carbon atoms are grouped together into the same MDV entry, as shown for the example in Figure 2. The mean mass distribution vector is simply the arithmetic mean, over the replicates, of the mass fraction m for a given fragment f of metabolite i, and the standard deviation is the corresponding experimental error.
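The grouping of isotopomers into mass fractions can be illustrated with a short sketch. The function name `mdv_from_idv`, the tuple representation of isotopomers, and the example distribution are assumptions for illustration; the sketch also ignores natural isotope abundance and any atoms added during derivatization.

```python
def mdv_from_idv(idv, fragment_carbons):
    """Group isotopomers by the number of labeled carbons kept in a fragment.

    idv              -- dict mapping 0/1 tuples (one entry per carbon) to fractions
    fragment_carbons -- zero-based indices of the carbons retained in the fragment
    Returns a list whose entry m is the fraction with m labeled carbons.
    """
    mdv = [0.0] * (len(fragment_carbons) + 1)
    for pattern, frac in idv.items():
        m = sum(pattern[c] for c in fragment_carbons)
        mdv[m] += frac
    return mdv


# Hypothetical 3-carbon metabolite.
idv = {
    (0, 0, 0): 0.5,
    (1, 0, 0): 0.2,
    (0, 1, 1): 0.2,
    (1, 1, 1): 0.1,
}

full = mdv_from_idv(idv, (0, 1, 2))   # fragment keeping all three carbons
tail = mdv_from_idv(idv, (1, 2))      # fragment that has lost the first carbon
```

In the second fragment the isotopomers (0,0,0) and (1,0,0) fall into the same mass fraction, which shows why positional information is lost in the MDV.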
Using the notation listed above, the problem of calculating the fluxes and labeling patterns for a given set of GC-MS data is formulated as a nonlinear optimization problem (FluxCalc).
| (4) |
subject to
| (1) |
| (3) |
| (5) |
| (6) |
| (7) |
In (FluxCalc), the objective function z in Eq. (4) is the sum of the discrepancies between the experimental data (exp) and the calculated values (calc) for the MDVs of the measured amino acid fragments. Eq. (1) enforces mass balance on metabolites and Eq. (3) does the same for individual isotopomers. The mapping of MDV onto Iik is performed by Eq. (5). Eq. (6) constrains all the fluxes to be non-negative and further restricts them between lower and upper bounds; except for fluxes whose values are fixed, these bounds were 0 and 100 times the observed glucose uptake rate. Eq. (7) enforces that each Iik remains between zero and one and that their sum is equal to one for each metabolite i.
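The parts of (FluxCalc) that follow directly from the prose might be written along the following lines. This is a hedged reconstruction rather than the typeset equations: the squared, standard-deviation-weighted form of the objective is an assumption (the text only states that z sums the discrepancies), as are the symbol names IGM for the isotopomer grouping matrix, \sigma_{ifm} for the standard deviation, and the exp/calc superscripts. The metabolite balance and the isotopomer balance of Eq. (3) are not reproduced here.

```latex
\begin{align}
\min \; z &= \sum_{i \in I_M} \sum_{f \in F} \sum_{m \in M}
  \left( \frac{\overline{MDV}^{\,exp}_{ifm} - MDV^{calc}_{ifm}}{\sigma_{ifm}} \right)^{2} \tag{4} \\
MDV^{calc}_{ifm} &= \sum_{k \in K} IGM_{ifmk}\, I_{ik},
  \quad \forall\, i \in I_M,\ f \in F,\ m \in M \tag{5} \\
LB_j &\le \nu_j \le UB_j, \quad \forall\, j \in J \tag{6} \\
0 &\le I_{ik} \le 1, \qquad \sum_{k \in K} I_{ik} = 1, \quad \forall\, i \in I \tag{7}
\end{align}
```

Under this reading, Eq. (5) is linear in Iik, so the nonconvexity of the problem comes entirely from the isotopomer balances.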
Given the nonconvex nature of (FluxCalc), the size of the resulting formulation, and the variability in the experimental MDVs, we pursue the identification of as many locally optimal solutions as possible. To facilitate the enumeration of local optima, the additional formulation (FluxInit) was constructed to generate a set of random feasible flux distributions that are used as initial points for the solution of the (FluxCalc) problem. (FluxInit) minimally perturbs an original randomly chosen flux distribution so as to satisfy all metabolite balances. The resulting linear optimization problem is as follows.
| (8) |
subject to
| (2) |
| (9) |
| (6) |
Here, Eq. (8) minimizes the sum of the distances from feasibility for each flux νj, encoded within variable ej. Eq. (9) defines variable ej to be equal to the absolute value of the difference between νj and a random reference flux, whose value is drawn from a uniform random distribution within the prespecified lower and upper bounds LBj and UBj. The fluxes are constrained as in (FluxCalc) by Eq. (6). The overall mass balance is applied as before by Eq. (2). The overall solution procedure is listed in a step-wise manner below.
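Before turning to the steps, one way to write (FluxInit) as a linear program is sketched here. It is a reconstruction from the description above and not the typeset formulation: the symbol \nu_j^{rand} for the random reference flux and the sign convention of the metabolite balance are assumptions. The absolute value in Eq. (9) is replaced by the standard pair of linear inequalities, which is tight at the optimum because each e_j is minimized.

```latex
\begin{align}
\min \; & \sum_{j \in J} e_j \tag{8} \\
\text{subject to} \quad
& \sum_{j \in J} S_{ij}\, \nu_j + b_i = 0, \quad \forall\, i \in I \tag{2} \\
& e_j \ge \nu_j - \nu_j^{rand}, \qquad e_j \ge \nu_j^{rand} - \nu_j, \quad \forall\, j \in J \tag{9} \\
& LB_j \le \nu_j \le UB_j, \quad \forall\, j \in J \tag{6}
\end{align}
```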
Step 1: Populate model parameters Sij, IMM, Iik for i ∈ IF, UBj, and LBj based on the experimental system setup and metabolic network abstraction. Set iteration counter Iter to zero.
Step 2: Initialize the random reference flux of each reaction using a random number generator with a uniform distribution, such that it lies between LBj and UBj, ∀ j ∈ J.
Step 3: Solve (FluxInit) to determine an initial feasible flux distribution νj.
Step 4: Initialize Iik by setting Ii,k=1 = 1, ∀ i ∉ IF; Ii,k>1 = 0, ∀ i ∉ IF.
Step 5: Solve the nonlinear optimization problem (FluxCalc) to generate a (local) solution νj, Iik, and the calculated MDVs.
Step 6: Store solutions for current iteration, update iteration counter Iter ← Iter + 1, and return to Step 2 if the maximum iterations limit is not exceeded.
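The multi-start logic of the steps above can be sketched on a toy problem. Everything in this block is a stand-in chosen for illustration: the one-dimensional objective replaces (FluxCalc), projected gradient descent replaces the local nonlinear solver, and the feasibility repair of (FluxInit) reduces to drawing the start within its bounds. None of these choices reflects the solvers used in the study.

```python
import random


def objective(x):
    """Toy nonconvex objective with two local minima; not a flux model."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def gradient(x):
    """Analytical derivative of the toy objective."""
    return 4.0 * x * (x * x - 1.0) + 0.3


def local_solve(x0, lb, ub, step=0.01, iters=5000):
    """Projected gradient descent, a simple stand-in for a local NLP solver."""
    x = x0
    for _ in range(iters):
        x = min(ub, max(lb, x - step * gradient(x)))
    return x


def multistart(n_starts, lb, ub, seed=0):
    """Repeat: random start within bounds, local solve, store the solution."""
    rng = random.Random(seed)
    solutions = []
    for _ in range(n_starts):
        x0 = rng.uniform(lb, ub)             # Step 2: random point in [lb, ub]
        x = local_solve(x0, lb, ub)          # Step 5: local optimization
        solutions.append((objective(x), x))  # Step 6: store and iterate
    return sorted(solutions)                 # best objective value first


solutions = multistart(n_starts=10, lb=-2.0, ub=2.0)
best_z, best_x = solutions[0]
distinct_optima = sorted({round(x, 2) for _, x in solutions})
```

Collecting the distinct local optima, rather than keeping only the best one, mirrors the motivation given earlier for enumerating as many local solutions as possible.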
References
- 1. Antoniewicz MR, et al. Determination of confidence intervals of metabolic fluxes estimated from stable isotope measurements. Metab Eng. 2006a;20:20. doi: 10.1016/j.ymben.2006.01.004.
- 2. Antoniewicz MR, et al. Elementary metabolic units (EMU): A novel framework for modeling isotopic distributions. Metab Eng. 2006b. doi: 10.1016/j.ymben.2006.09.001.
- 3. Antoniewicz MR, et al. Metabolic flux analysis in a nonstationary system: fed-batch fermentation of a high yielding strain of E. coli producing 1,3-propanediol. Metab Eng. 2007. doi: 10.1016/j.ymben.2007.01.003.
- 4. Avery MA, et al. Stereoselective total synthesis of (+)-artemisinin, the antimalarial constituent of Artemisia annua L. J Am Chem Soc. 1992;114:974–979.
- 5. Bacher A, et al. Elucidation of novel biosynthetic pathways and metabolite flux patterns by retrobiosynthetic NMR analysis. FEMS Microbiology Reviews. 1998;22:567–598.
- 6. Bailey JE. Toward a science of metabolic engineering. Science. 1991;252:1668–75. doi: 10.1126/science.2047876.
- 7. Burgard AP, et al. Flux coupling analysis of genome-scale metabolic network reconstructions. Genome Res. 2004;14:301–12. doi: 10.1101/gr.1926504.
- 8. Christensen B, Nielsen J. Isotopomer analysis using GC-MS. Metab Eng. 1999;1:282–90. doi: 10.1006/mben.1999.0117.
- 9. Cragg GM. Paclitaxel (Taxol): a success story with valuable lessons for natural product drug discovery and development. Med Res Rev. 1998;18:315–31. doi: 10.1002/(sici)1098-1128(199809)18:5<315::aid-med3>3.0.co;2-w.
- 10. Dauner M, et al. Metabolic flux analysis with a comprehensive isotopomer model in Bacillus subtilis. Biotechnol Bioeng. 2001;76:144–56. doi: 10.1002/bit.1154.
- 11. Dhingra V, et al. Current status of artemisinin and its derivatives as antimalarial drugs. Life Sci. 2000:279–300. doi: 10.1016/s0024-3205(99)00356-2.
- 12. Edwards JS, Palsson BO. The Escherichia coli MG1655 in silico metabolic genotype: its definition, characteristics, and capabilities. Proc Natl Acad Sci U S A. 2000;97:5528–33. doi: 10.1073/pnas.97.10.5528.
- 13. Emmerling M, et al. Metabolic flux responses to pyruvate kinase knockout in Escherichia coli. J Bacteriol. 2002;184:152–64. doi: 10.1128/JB.184.1.152-164.2002.
- 14. Fischer E, et al. High-throughput metabolic flux analysis based on gas chromatography-mass spectrometry derived 13C constraints. Anal Biochem. 2004;325:308–16. doi: 10.1016/j.ab.2003.10.036.
- 15. Fong SS, Palsson BO. Metabolic gene-deletion strains of Escherichia coli evolve to computationally predicted growth phenotypes. Nat Genet. 2004;36:1056–8. doi: 10.1038/ng1432.
- 16. Forbes NS, et al. Using isotopomer path tracing to quantify metabolic fluxes in pathway models containing reversible reactions. Biotechnol Bioeng. 2001;74:196–211. doi: 10.1002/bit.1109.
- 17. Ghosh S, et al. A three-level problem-centric strategy for selecting NMR precursor labeling and analytes. Metab Eng. 2006. doi: 10.1016/j.ymben.2006.05.001.
- 18. Ghosh S, et al. Closing the loop between feasible flux scenario identification for construct evaluation and resolution of realized fluxes via NMR. Computers & Chemical Engineering. 2005;29:459–466.
- 19. Ibarra RU, et al. Escherichia coli K-12 undergoes adaptive evolution to achieve in silico predicted optimal growth. Nature. 2002;420:186–9. doi: 10.1038/nature01149.
- 20. Jeffrey FM, et al. 13C-NMR: a simple yet comprehensive method for analysis of intermediary metabolism. Trends Biochem Sci. 1991;16:5–10. doi: 10.1016/0968-0004(91)90004-f.
- 21. Kelleher JK. Flux Estimation Using Isotopic Tracers: Common Ground for Metabolic Physiology and Metabolic Engineering. Metabolic Engineering. 2001;3:100–110. doi: 10.1006/mben.2001.0185.
- 22. Klapa MI, et al. Systematic quantification of complex metabolic flux networks using stable isotopes and mass spectrometry. Eur J Biochem. 2003;270:3525–42. doi: 10.1046/j.1432-1033.2003.03732.x.
- 23. Kleijn RJ, et al. 13C-labeled gluconate tracing as a direct and accurate method for determining the pentose phosphate pathway split ratio in Penicillium chrysogenum. Appl Environ Microbiol. 2006;72:4743–54. doi: 10.1128/AEM.02955-05.
- 24. Martin VJ, et al. Engineering a mevalonate pathway in Escherichia coli for production of terpenoids. Nat Biotechnol. 2003;21:796–802. doi: 10.1038/nbt833.
- 25. Marx A, et al. Determination of the fluxes in the central metabolism of Corynebacterium glutamicum by nuclear magnetic resonance spectroscopy combined with metabolite balancing. Biotechnology and Bioengineering. 1996;49:111–129. doi: 10.1002/(SICI)1097-0290(19960120)49:2<111::AID-BIT1>3.0.CO;2-T.
- 26. Neidhardt FC, et al. Culture medium for enterobacteria. J Bacteriol. 1974;119:736–47. doi: 10.1128/jb.119.3.736-747.1974.
- 27. Nielsen J. It Is All about Metabolic Fluxes. J Bacteriol. 2003;185:7031–7035. doi: 10.1128/JB.185.24.7031-7035.2003.
- 28. Reed JL, et al. An expanded genome-scale model of Escherichia coli K-12 (iJR904 GSM/GPR). Genome Biol. 2003;4:R54. doi: 10.1186/gb-2003-4-9-r54.
- 29. Riascos CAM, et al. A global optimization approach for metabolic flux analysis based on labeling balances. Computers & Chemical Engineering. 2005;29:447–458.
- 30. Sauer U, et al. Metabolic flux ratio analysis of genetic and environmental modulations of Escherichia coli central carbon metabolism. J Bacteriol. 1999;181:6679–88. doi: 10.1128/jb.181.21.6679-6688.1999.
- 31. Schmidt K, et al. Modeling isotopomer distributions in biochemical networks using isotopomer mapping matrices. Biotechnology and Bioengineering. 1997;55:831–840. doi: 10.1002/(SICI)1097-0290(19970920)55:6<831::AID-BIT2>3.0.CO;2-H.
- 32. Schmidt K, et al. Quantitative analysis of metabolic fluxes in Escherichia coli, using two-dimensional NMR spectroscopy and complete isotopomer models. J Biotechnol. 1999;71:175–89. doi: 10.1016/s0168-1656(99)00021-8.
- 33. Segre D, et al. Analysis of optimality in natural and perturbed metabolic networks. Proc Nat Acad Sci USA. 2002;99. doi: 10.1073/pnas.232349399.
- 34. Siddiquee KA, et al. Metabolic flux analysis of pykF gene knockout Escherichia coli based on 13C-labeling experiments together with measurements of enzyme activities and intracellular metabolite concentrations. Appl Microbiol Biotechnol. 2004;63:407–17. doi: 10.1007/s00253-003-1357-9.
- 35. Sleno L, et al. Assigning product ions from complex MS/MS spectra: The importance of mass uncertainty and resolving power. Journal of the American Society for Mass Spectrometry. 2005;16:183–198. doi: 10.1016/j.jasms.2004.10.001.
- 36. Stephanopoulos G, Vallino JJ. Network rigidity and metabolic engineering in metabolite overproduction. Science. 1991;252:1675–81. doi: 10.1126/science.1904627.
- 37. Szyperski T. Biosynthetically directed fractional 13C-labeling of proteinogenic amino acids. An efficient analytical tool to investigate intermediary metabolism. Eur J Biochem. 1995;232:433–48. doi: 10.1111/j.1432-1033.1995.tb20829.x.
- 38. Tawarmalani M, Sahinidis NV. Global optimization of mixed-integer nonlinear programs: A theoretical and computational study. Mathematical Programming. 2004;99:563–591.
- 39. Vallino JJ, Stephanopoulos G. Metabolic flux distributions in Corynebacterium glutamicum during growth and lysine overproduction. Biotechnology and Bioengineering. 1993;41:633–646. doi: 10.1002/bit.260410606.
- 40. Van Dien SJ, et al. Quantification of central metabolic fluxes in the facultative methylotroph Methylobacterium extorquens AM1 using 13C-label tracing and mass spectrometry. Biotechnol Bioeng. 2003;84:45–55. doi: 10.1002/bit.10745.
- 41. van Winden WA, et al. Metabolic flux and metabolic network analysis of Penicillium chrysogenum using 2D [13C, 1H] COSY NMR measurements and cumulative bondomer simulation. Biotechnol Bioeng. 2003;83:75–92. doi: 10.1002/bit.10648.
- 42. van Winden WA, et al. Correcting mass isotopomer distributions for naturally occurring isotopes. Biotechnol Bioeng. 2002;80:477–9. doi: 10.1002/bit.10393.
- 43. Varma A, Palsson BO. Stoichiometric flux balance models quantitatively predict growth and metabolic by-product secretion in wild-type Escherichia coli W3110. Appl Environ Microbiol. 1994;60:3724–31. doi: 10.1128/aem.60.10.3724-3731.1994.
- 44. Wahl SA, et al. New tools for mass isotopomer data evaluation in (13)C flux analysis: mass isotope correction, data consistency checking, and precursor relationships. Biotechnol Bioeng. 2004;85:259–68. doi: 10.1002/bit.10909.
- 45. Wiechert W, de Graaf AA. Bidirectional Reaction Steps in Metabolic Networks: I. Modeling and Simulation of Carbon Isotope Labeling Experiments. Biotechnol Bioeng. 1997;55:101–117. doi: 10.1002/(SICI)1097-0290(19970705)55:1<101::AID-BIT12>3.0.CO;2-P.
- 46. Wiechert W, et al. Bidirectional reaction steps in metabolic networks: III. Explicit solution and analysis of isotopomer labeling systems. Biotechnol Bioeng. 1999;66:69–85.
- 47. Wittmann C, et al. In vivo analysis of intracellular amino acid labelings by GC/MS. Anal Biochem. 2002;307:379–82. doi: 10.1016/s0003-2697(02)00030-1.
- 48. Wittmann C, Heinzle E. Mass spectrometry for metabolic flux analysis. Biotechnol Bioeng. 1999;62:739–750. doi: 10.1002/(sici)1097-0290(19990320)62:6<739::aid-bit13>3.0.co;2-e.
- 49. Wittmann C, Heinzle E. Genealogy profiling through strain improvement by using metabolic network analysis: metabolic flux genealogy of several generations of lysine-producing corynebacteria. Appl Environ Microbiol. 2002;68:5843–59. doi: 10.1128/AEM.68.12.5843-5859.2002.
- 50. Yang TH, et al. Metabolic network simulation using logical loop algorithm and Jacobian matrix. Metab Eng. 2004;6:256–67. doi: 10.1016/j.ymben.2004.02.002.
- 51. Zhao J, Shimizu K. Metabolic flux analysis of Escherichia coli K12 grown on 13C-labeled acetate and glucose using GC-MS and powerful flux calculation method. J Biotechnol. 2003;101:101–17. doi: 10.1016/s0168-1656(02)00316-4.
- 52. Zupke C, Stephanopoulos G. Modeling of Isotope Distributions and Intracellular Fluxes in Metabolic Networks Using Atom Mapping Matrices. Biotechnol Prog. 1994;10:489–498.