Abstract
Flux balance analysis (FBA) of a genome-scale metabolic model allows calculation of intracellular fluxes by optimizing an objective function, such as maximization of cell growth, under given constraints, and has found numerous applications in the field of systems biology and biotechnology. Due to the underdetermined nature of the system, however, it has limitations such as inaccurate prediction of fluxes and existence of multiple solutions for an optimal objective value. Here, we report a strategy for accurate prediction of metabolic fluxes by FBA combined with systematic and condition-independent constraints that restrict the achievable flux ranges of grouped reactions by genomic context and flux-converging pattern analyses. Analyses of three types of genomic contexts, conserved genomic neighborhood, gene fusion events, and co-occurrence of genes across multiple organisms, were performed to suggest a group of fluxes that are likely on or off simultaneously. The flux ranges of these grouped reactions were constrained by flux-converging pattern analysis. FBA of the Escherichia coli genome-scale metabolic model was carried out under several different genotypic (pykF, zwf, ppc, and sucA mutants) and environmental (altered carbon source) conditions by applying these constraints, which resulted in flux values that were in good agreement with the experimentally measured 13C-based fluxes. Thus, this strategy will be useful for accurately predicting the intracellular fluxes of large metabolic networks when their experimental determination is difficult.
Keywords: Escherichia coli, flux balance analysis, grouping reaction constraints, 13C-based flux, genome-scale metabolic model
The accumulation of omics data, including genome, transcriptome, proteome, metabolome, and fluxome, provides an opportunity to understand cellular physiology and characteristics at multiple levels (1, 2). Among them, the fluxome profiling allows quantification of metabolic fluxes, which collectively represent the cellular metabolic characteristics. For successful metabolic engineering, it is important to understand and predict the changes of metabolic fluxes after modifying interesting genes, pathways, and regulatory circuits. On the basis of systems-level understanding of cellular status through fluxome profiling, rational and systematic development of an improved strain becomes possible (3–6).
One of the popular methods that has been used to examine cellular metabolic fluxes and predict their changes in perturbed conditions is flux balance analysis (FBA) (7–12). To calculate the optimal flux distribution in the multidimensional flux solution space of metabolic network, FBA optimizes an objective function, such as maximizing cell growth rate, under pseudosteady state with mass balance and other constraints (13). Given the fact that the information used to reconstruct the metabolic networks is incomplete, multiple equivalent solutions can be computed for the states of these networks (14). One method that has been presented to overcome the problem of multiple solutions is to explore the flux solution space by flux variability analysis (FVA). However, this approach still works with a flux solution space that is too large compared with the biologically feasible flux solution space of a real organism (15–17). Thus, additional information and procedures, such as providing more constraints, are needed to narrow down the set of candidate solutions.
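The optimization and the multiple-solution problem described above can be illustrated on a toy network. The following is a minimal sketch, not the genome-scale model used in this study: the four-reaction network, its bounds, and the helper name `fva` are hypothetical, and SciPy's `linprog` stands in for whatever solver was actually used. Two parallel reactions (v1 and v2) carry flux from A to B, so the optimal growth rate is unique while the split between v1 and v2 is not.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (rows: metabolites A, B; columns: v0..v3)
# v0: uptake -> A, v1: A -> B, v2: A -> B (parallel route), v3: B -> biomass
S = np.array([[1, -1, -1, 0],
              [0, 1, 1, -1]], dtype=float)
bounds = [(0, 10)] * 4

# FBA: maximize v3 subject to S v = 0 (linprog minimizes, so negate)
c = np.zeros(4)
c[3] = -1.0
res = linprog(c, A_eq=S, b_eq=np.zeros(2), bounds=bounds, method="highs")
growth = -res.fun

def fva(j, fraction=0.999999):
    """Min and max of flux j while keeping v3 at (nearly) the optimal growth."""
    A_ub = -np.eye(4)[[3]]              # -v3 <= -fraction * growth
    b_ub = [-fraction * growth]
    e = np.zeros(4)
    e[j] = 1.0
    kw = dict(A_ub=A_ub, b_ub=b_ub, A_eq=S, b_eq=np.zeros(2),
              bounds=bounds, method="highs")
    lo = linprog(e, **kw).fun
    hi = -linprog(-e, **kw).fun
    return lo, hi

print("optimal growth:", growth)
print("FVA range of v1:", fva(1))
```

In this sketch the FVA range of v1 spans the whole interval allowed by its bounds even though growth is fixed at its optimum, which is the kind of ambiguity that additional constraints are meant to remove.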
One example of a suitable constraint that can be applied is the experimentally measured metabolic flux data obtained from 13C-based flux analysis and fermentation data (18–21). Other constraints that can be applied include transcriptome data under various environmental changes (22–26), thermodynamic constraints (27–30), and molecular crowding of various biomolecules in limited cytoplasmic space (31). However, to apply these condition-dependent and objective function-specific constraints to the models, biologically meaningful knowledge and information on environmental and genetic conditions are required.
Methods that do not depend on such condition-specific information have therefore been actively pursued. Previously, gene neighborhood methods were used for gap filling in a metabolic network, finding new metabolic modules and linking them to the genome, reconstructing pathway networks, and refining the accuracy of metabolic flux prediction (32). Also, the coupling of fluxes was previously studied in genome-scale networks using methods such as the flux coupling finder (33, 34). These methods were used primarily to investigate the topology of the metabolic network and to supplement operon prediction tools.
Here we present a strategy that uses information from genomic context and flux-converging pattern analyses to generate new constraints for more accurate simulation of the metabolic network. Three types of genomic context analyses, namely conserved neighborhood, gene fusion, and co-occurrence, were performed (Fig. 1A). The conserved neighborhood analysis identifies genes that are repeatedly found in close proximity on various genomes. The gene fusion analysis captures events in which genes that are separate in other organisms form a hybrid gene. The co-occurrence analysis predicts the presence or absence of linked proteins across organisms (35). Flux-converging pattern analysis restricts the range of attainable flux distributions by considering the number of carbons in the metabolites that participate in the reactions and the patterns of the fluxes from a carbon source in the metabolic network, using the concept of the flux-converging metabolite (Fig. 1B). These were all considered together in generating grouping reaction constraints, which represent reactions whose genes have a good chance of being simultaneously on/off under the influence of the same regulatory mechanisms and of showing similar expression patterns under given genetic and/or environmental perturbations (23). The grouping reaction constraints were incorporated during the FBA of the Escherichia coli genome-scale metabolic model under various genetic and environmental conditions. Flux distributions obtained by this approach showed good agreement with those experimentally determined by 13C-based flux analysis, which suggests its general usefulness in more accurately determining metabolic fluxes by FBA.
Fig. 1.
Schematic illustration of flux balance analysis with constraints that group functionally and physically related reactions on the basis of genomic context and flux-converging pattern analyses. (A) By considering protein–protein associations based on genomic context analyses consisting of conserved neighborhood, gene fusion, and co-occurrence, significantly related reactions were organized in a group. The binary variable, y, is 0 or 1. vj is the flux of reaction j, and v1 and v2 belong to the same group. vj,min and vj,max are the lower and upper bounds for the flux of reaction j. (B) The flux ranges of the reactions grouped by genomic context analyses were further constrained by flux-converging pattern analyses based on the carbon number of the metabolites that participate in each reaction (Ca) and the number of flux-converging metabolites passed through from a carbon source (Jb), which is glucose in this study. A metabolite could be split into different metabolites or created by combining two metabolites. Circles of different colors define reaction groups that have a similar scale of flux values, determined by stoichiometric carbon balances for metabolites and fluxes through the metabolic pathway. The flux-converging metabolite is defined as a metabolite that sums fluxes originating from a reaction that produces two or more metabolites into parallel pathways. δ denotes a constant defining the flux level of reactions in a reaction group. v1n and v2n are the normalized fluxes of reactions 1 and 2, obtained by dividing each flux by the uptake rate of the carbon source, such as glucose. (C) To restrict the flux solution space of the Escherichia coli genome-scale model, the simultaneous on/off constraints (Con/off) from genomic context analyses and the flux scale constraints (Cscale) from flux-converging pattern analyses were applied to the E. coli central metabolism.
(D) Grouping of reactions using genomic context analysis and clustering of the grouped reactions using flux-converging pattern analysis in the form of CxJy. Cx indicates the carbon number of metabolites that participate in each reaction and Jy indicates the type of fluxes through flux-converging metabolites from a carbon source. The red metabolites are flux-converging metabolites. The flux-converging metabolites correspond to metabolites where two split reactions combine, indicated by bold, transparent, and red arrows, such as glyceraldehyde-3-phosphate, which converges fluxes split by fructose-bisphosphate aldolase. The flux-converging metabolites categorize Jy into four types, denoted as JA, JB, JC, and JD, where each subscript indicates that the given flux from a carbon source has passed through zero, one, two, or three flux-converging metabolites, respectively. The subscript E, which specifically indicates that the flux has once converged into pyruvate, is placed next to subscripts of J, including A, B, C, or D. Rectangles of different colors define reaction groups that are likely on or off simultaneously, determined by genomic context analysis. The CxJy’s assigned to a reaction by following different pathways from glucose are separated by a slash.
Results
Grouping of Reactions in E. coli Central Metabolism by Genomic Context Analysis.
Genomic context and flux-converging pattern analyses were performed to group functionally and physically related reactions (Fig. 1C). Here, grouping reaction constraints for FBA, which consist of simultaneous on/off (Con/off) and flux scale (Cscale) constraints, were focused on reactions of central carbon metabolism because it is the most interconnected core metabolism wherein, in contrast to most linear biosynthetic pathways, multiple regulations and controls are in operation to properly operate the metabolism (21). Thus, it is expected that appropriately constraining the reaction fluxes of central carbon metabolism would lead to accurate prediction of flux distribution in the whole metabolism.
By considering protein–protein associations on the basis of genomic context analysis, significantly related reactions were organized in a group (Fig. 1). After the assignment of association scores for each predictor of functional associations of proteins, the final combined score, S, between any pair of proteins was calculated as
$$S = 1 - \prod_{i} \left(1 - S_i\right) \qquad [1]$$
where Si denotes the score of each predictor i. This combined score is usually higher than the individual subscores. After considering the criteria of minimal score from 0.7 to 0.95 to organize the groups that are composed of significantly related reactions, the minimal score of 0.75 that showed good prediction fidelity compared with experimental data was chosen (Table S1). The individual and combined scores were calculated to predict protein–protein associations, which can be obtained from a database of known and predicted protein interactions, STRING 8.0 (35). Consequently, E. coli central metabolism appeared to be composed of nine reaction groups from genomic context analysis (Tables S1 and S2). Reactions in the same group by genomic context analysis receive Con/off during FBA as follows:
The binary variable, y, switches the reactions of each group on or off simultaneously (Eq. 2). The binary variables within a group take the same value and are multiplied by the upper and lower bounds of the reactions (Eq. 3 and Eq. 4). vj is the flux of reaction j, and v1 and v2 belong to the same group. vj,min and vj,max are the lower and upper bounds for the flux of reaction j.
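One reading of the on/off constraints described above is y1 = y2 together with vj,min·yj ≤ vj ≤ vj,max·yj for each grouped reaction j. The following is a minimal sketch of that reading as a mixed-integer linear program. The five-reaction network, its bounds, the function name `solve`, and the choice to model a knockout by fixing y1 = 0 are all assumptions made for illustration, and SciPy's `milp` (SciPy 1.9 or later) is used as a stand-in solver.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Hypothetical toy network (rows: metabolites A, B; columns: v0..v4)
# v0: uptake -> A, v1: A -> B, v2: A -> B (grouped with v1),
# v3: A -> B (low-capacity bypass, not grouped), v4: B -> biomass
S = np.array([[1, -1, -1, -1, 0],
              [0, 1, 1, 1, -1]], dtype=float)
vmin = np.zeros(5)
vmax = np.array([10, 10, 10, 4, 10], dtype=float)

def solve(knock_out_v1=False):
    n = 7                                   # variables: v0..v4, y1, y2
    c = np.zeros(n)
    c[4] = -1.0                             # maximize biomass flux v4
    rows, lb, ub = [], [], []
    for r in S:                             # steady state: S v = 0
        rows.append(np.concatenate([r, [0, 0]])); lb.append(0); ub.append(0)
    rows.append([0, 0, 0, 0, 0, 1, -1])     # same group: y1 - y2 = 0
    lb.append(0); ub.append(0)
    for v_idx, y_idx in [(1, 5), (2, 6)]:
        up = np.zeros(n); up[v_idx] = 1; up[y_idx] = -vmax[v_idx]
        rows.append(up); lb.append(-np.inf); ub.append(0)   # v <= vmax * y
        lo = np.zeros(n); lo[v_idx] = 1; lo[y_idx] = -vmin[v_idx]
        rows.append(lo); lb.append(0); ub.append(np.inf)    # v >= vmin * y
    y1_max = 0 if knock_out_v1 else 1       # assumption: knockout fixes y1 = 0
    bounds = Bounds(np.concatenate([vmin, [0, 0]]),
                    np.concatenate([vmax, [y1_max, 1]]))
    res = milp(c,
               constraints=LinearConstraint(np.array(rows, dtype=float), lb, ub),
               integrality=np.array([0] * 5 + [1] * 2),
               bounds=bounds)
    return res.x[:5], res.x[5:]

v_wt, y_wt = solve()
v_ko, y_ko = solve(knock_out_v1=True)
print("wild type biomass flux:", v_wt[4])
print("knockout biomass flux:", v_ko[4], "with v1, v2 =", v_ko[1], v_ko[2])
```

In this sketch, switching off v1 also switches off its group partner v2, so the remaining flux has to pass through the bypass. Plain FBA with only v1 removed would instead reroute everything through v2 and predict no change.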
The reactions that constitute the same predicted group usually belong to an identical functional category of metabolism. However, some predicted groups consist of reactions belonging to different functional categories (Table S2). In group 3 of Table S2, although pyruvate dehydrogenase (PDH) bridges between glycolysis and TCA cycle by providing acetyl-CoA, PDH was predicted to be related to and grouped with reactions in the TCA cycle. Additionally, glucokinase (GLK) and glucose-6-phosphate isomerase in glycolysis are found in group 5 with reactions in the pentose phosphate (PP) pathway and the Entner–Doudoroff (ED) pathway. In the utilization of glucose, although most of glucose is transported and phosphorylated by the phosphoenolpyruvate:sugar phosphotransferase system (PTS) in E. coli and most other bacteria, the expression of glk encoding GLK reaction is still active (36). In contrast to FBA without grouping reaction constraints, which predicts only PTS as a glucose importer, FBA with grouping reaction constraints predicted that both PTS and GLK reactions are in operation, which is consistent with experimental observation.
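The score combination and thresholding used to form the groups earlier in this section can be sketched as follows. This is an illustration under stated assumptions: the combined score is taken to follow the form commonly used for STRING evidence channels, S = 1 − ∏(1 − Si), and groups are formed as connected components of pairs whose combined score reaches the 0.75 cutoff. The reaction names, subscores, and function names are hypothetical and are not taken from Table S1 or S2.

```python
from math import prod

def combined_score(subscores):
    """Combine per-predictor scores, assuming S = 1 - prod(1 - S_i)."""
    return 1.0 - prod(1.0 - s for s in subscores)

def group_reactions(pair_scores, threshold=0.75):
    """Union reactions whose combined pair score is >= threshold (assumed rule)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    for (a, b), subs in pair_scores.items():
        find(a); find(b)                    # register both reactions
        if combined_score(subs) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for x in parent:
        groups.setdefault(find(x), set()).add(x)
    return [g for g in groups.values() if len(g) > 1]

# Made-up subscores (neighborhood, fusion, co-occurrence) for illustration only
pairs = {
    ("R1", "R2"): (0.5, 0.4, 0.3),
    ("R2", "R3"): (0.9,),
    ("R1", "R4"): (0.2, 0.1),
}
print(group_reactions(pairs))
```

Note that three modest subscores of 0.5, 0.4, and 0.3 combine to a score above the cutoff, which reflects the statement that the combined score is usually higher than the individual subscores.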
Restricting the Flux Ranges of Grouped Reactions by Flux-Converging Pattern Analysis.
Following the genomic context analyses, reactions in each reaction group were further clustered into 35 sets using flux-converging pattern analysis, to constrain each set of reactions to have a similar scale of flux values. This analysis is based on the idea that the scale of flux values is considerably affected by two elements: the number of carbons (Cx) in a metabolite that participates in the reaction (Rj) and the flux-converging pattern of fluxes from a carbon source (Jy) (SI Text and Fig. 1). For example, the C6 metabolite fructose-1,6-bisphosphate is split into two C3 metabolites, glyceraldehyde-3-phosphate and dihydroxyacetone phosphate, by fructose-bisphosphate aldolase in glycolysis. This would cause the reactions downstream of fructose-bisphosphate aldolase that metabolize C3 metabolites to have 2-fold higher flux values than the reactions upstream of it. Also, the flux-converging metabolites in the metabolic pathway heavily influence the scale of flux values as fluxes propagate from an initial carbon source, and thus the number of such flux-converging metabolites is considered. The flux-converging metabolite is defined as a metabolite that sums fluxes originating from a reaction that produces two or more metabolites into parallel pathways (Fig. 1 and Fig. S1). These flux-converging metabolites categorize Jy into four types, denoted as JA, JB, JC, and JD, where each subscript indicates that the flux originating from a carbon source has passed through 0, 1, 2, or 3 flux-converging metabolites, respectively. Analysis of the flux-converging pattern is terminated if the flux returns to a metabolite it has already passed.
There were three flux-converging metabolites, glyceraldehyde-3-phosphate, pyruvate, and malate, found in E. coli central metabolism. Among them, pyruvate deserves further attention as it causes more complex changes of flux distribution, compared with the other two flux-converging metabolites; fluxes split from 6-phospho-D-gluconate eventually converge into pyruvate, but one of them goes through glyceraldehyde-3-phosphate, which is itself a flux-converging metabolite. Thus, subscript E, which specifically indicates that the flux has once converged into pyruvate, is placed next to subscripts of J, including A, B, C, or D. Although each reaction could be assigned several CxJy’s at the same time by following different pathways from a carbon source, reactions having a similar scale of flux values showed the same CxJy’s. Furthermore, it should be noted that Jy is specifically assigned for the case of using glucose as a carbon source and should be modified accordingly if a different carbon source is used (see below). Taken together, reactions in the same group from genomic context analyses were further clustered into the same set if they have the same CxJy in E. coli central metabolism with glucose as a carbon source (Fig. 1). Reactions in the same set from flux-converging pattern analysis will not only receive simultaneous on/off constraints, but also be constrained to have a similar scale of flux values, which is Cscale, whereas reactions that were not clustered into a set within each reaction group are only subjected to on/off constraints. In other words, the Cscale from flux-converging pattern analysis further constrains the flux ranges of the reactions grouped by genomic context analysis. The mathematical expression of the Cscale is detailed in SI Text.
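Because the exact expression of Cscale is given only in SI Text, the following is a sketch of one possible linear form, not the published formulation. It assumes that two reactions i and j in the same set must satisfy |vi − vj| ≤ δ·vuptake, that is, their fluxes normalized to the carbon source uptake rate differ by at most δ. The toy network, the value δ = 0.1, and the helper names `scale_rows` and `flux_range` are hypothetical.

```python
import numpy as np
from itertools import combinations
from scipy.optimize import linprog

def scale_rows(n_rxn, members, uptake, delta):
    """Rows of A_ub encoding |v_i - v_j| <= delta * v_uptake for a set (assumed form)."""
    rows = []
    for i, j in combinations(members, 2):
        for sign in (1.0, -1.0):
            row = np.zeros(n_rxn)
            row[i], row[j], row[uptake] = sign, -sign, -delta
            rows.append(row)
    return np.array(rows)

# Hypothetical toy network: v0 uptake -> A, v1 and v2: A -> B, v3: B -> biomass
S = np.array([[1, -1, -1, 0],
              [0, 1, 1, -1]], dtype=float)
bounds = [(10, 10), (0, 10), (0, 10), (0, 10)]   # uptake fixed at 10

def flux_range(j, A_ub=None):
    """Min and max of flux j, optionally with the scale constraints applied."""
    b_ub = None if A_ub is None else np.zeros(len(A_ub))
    e = np.zeros(4)
    e[j] = 1.0
    kw = dict(A_ub=A_ub, b_ub=b_ub, A_eq=S, b_eq=np.zeros(2),
              bounds=bounds, method="highs")
    return linprog(e, **kw).fun, -linprog(-e, **kw).fun

A_scale = scale_rows(4, members=[1, 2], uptake=0, delta=0.1)
print("v1 range without scale constraint:", flux_range(1))
print("v1 range with scale constraint:   ", flux_range(1, A_scale))
```

In this toy case the attainable range of v1 shrinks from the full interval allowed by its bounds to a narrow band around half of the uptake rate, which is the kind of restriction that a flux scale constraint is intended to impose on reactions sharing the same CxJy.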
FBA with Grouping Reaction Constraints Predicts the Changes of Flux Patterns upon Gene Knockouts.
FVA was first performed to estimate the maximal and minimal flux values of each reaction to explore the change of flux solution space under the given condition with respect to 13C-based measurements. The changes in flux patterns in response to several perturbations compared with a control flux solution space were studied by using two simple determinants called flux bias, Vavg, and flux capacity, lsol; the former is defined as the average value of the maximal and minimal flux values of a reaction, and the latter is the distance between them (SI Text). To systematically characterize the changes in flux values, flux changes in response to perturbations were categorized into nine types on the basis of the combinations of positive and negative changes in Vavg and lsol in comparison with the control flux distribution (Table S3). In this study, the control flux solution space was the flux distribution of the wild-type E. coli cultured in glucose minimal medium under aerobic conditions. The flux values from FBA and 13C-based experiments are relative fluxes that were normalized to the respective carbon source uptake rates. Details on the determination of flux pattern changes by considering the increase or decrease of Vavg and lsol are described in SI Text.
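The two determinants defined above can be computed directly from the FVA minimum and maximum of each reaction. The sketch below follows the definitions in the text (Vavg as the average of the two values, lsol as the distance between them) and classifies a change by the sign of each difference, which yields the nine combinations mentioned above. The tolerance, the function names, and the example ranges are hypothetical, and the mapping of sign pairs to the type numbers of Table S3 is deliberately left out because that table is not reproduced here.

```python
def flux_bias(v_min, v_max):
    """V_avg: average of the minimal and maximal flux values from FVA."""
    return (v_max + v_min) / 2.0

def flux_capacity(v_min, v_max):
    """l_sol: distance between the maximal and minimal flux values."""
    return v_max - v_min

def sign(x, tol=1e-6):
    """Return +1, 0, or -1, treating |x| <= tol as no change."""
    return (x > tol) - (x < -tol)

def change_pattern(control, perturbed, tol=1e-6):
    """Signs of the changes in (V_avg, l_sol) relative to the control range."""
    d_avg = flux_bias(*perturbed) - flux_bias(*control)
    d_cap = flux_capacity(*perturbed) - flux_capacity(*control)
    return sign(d_avg, tol), sign(d_cap, tol)

# Made-up (min, max) relative flux ranges in % of the carbon uptake rate
print(change_pattern(control=(0.0, 10.0), perturbed=(5.0, 20.0)))    # both increase
print(change_pattern(control=(40.0, 60.0), perturbed=(40.0, 60.0)))  # no change
```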
Intracellular fluxes can be strongly affected by genetic perturbations, such as gene knockouts (37–40), and hence our approach was validated using gene knockouts. FBA with grouping reaction constraints was performed on four representative gene knockout mutant strains: the pykF (pyruvate kinase in glycolytic pathway), zwf (glucose 6-phosphate dehydrogenase in PP pathway), ppc (anaplerotic phosphoenolpyruvate carboxylase), and sucA (oxoglutarate dehydrogenase in TCA cycle) mutants. According to the results of 13C-based flux analysis of the pykF mutant strain and the wild-type strain, the glycolytic, malate dehydrogenase (MDH) in TCA cycle, and acetate production fluxes decreased, whereas the fluxes in the PP pathway and the anaplerotic pathway increased when the pykF gene was knocked out (37). However, results obtained by FBA did not show such changes. During the FBA of the pykF knockout strain, an alternative route for the flux was found to compensate for the deleted pyruvate kinase (PYK) reaction, resulting in no change in glycolysis. The reactions in the FBA results for the pykF knockout mutant all belonged to type 9 (Table S3), implying that pykF deletion had no effect on the flux distribution. Thus, FBA alone does not seem to be suitable for determining the flux distribution in this case. Next, the grouping reaction constraints were applied during the FBA. PYK is grouped with other reactions in glycolysis (group 4 in Table S2), and thus the other reactions in group 4 are also affected upon the knockout of the pykF gene, as shown by their decreased flux values (Table S2 and Fig. 2). FBA with grouping reaction constraints predicted that the fluxes through reactions in the PP pathway and the malic enzyme reaction in the anaplerotic pathway increased, whereas the glycolytic, MDH in TCA cycle, and acetate production fluxes decreased, compared with the wild type based on Vavg.
These results are in agreement with those of the 13C-based flux and fermentation profile of the pykF knockout mutant (Fig. 2) (37), suggesting that the FBA with grouping reaction constraints allows more accurate determination of the fluxes. Furthermore, the lsol’s of reactions in the ED pathway and the PP pathway were found to be increased to supply the deficient pyruvate pool caused by the removal of the PYK reaction. Also, the lsol of PDH, an enzyme connecting the ED pathway with the TCA cycle, and the lsol’s of the beginning reactions of the TCA cycle increased. The type 1 reactions, implying the increase of the Vavg’s and lsol’s of reactions, were 43.59% of reactions in the E. coli central metabolism of the pykF knockout mutant (Fig. 2 and Table S3). This percentage of type 1 reactions results from the high fluxes through the PP pathway, anaplerotic pathway, and TCA cycle in the pykF knockout mutant. The type 4 and 5 reactions, implying the decrease of the Vavg’s of reactions, were 2.56 and 51.28% of reactions in the E. coli central metabolism of the pykF knockout mutant, respectively, and mainly comprised the MDH reaction and neighboring reactions in the TCA cycle and those in glycolysis, all of which showed lower flux values upon pykF knockout. The type 9 reaction corresponds to the isocitrate lyase reaction in the glyoxylate shunt, comprising 2.56% of the considered reactions in E. coli central metabolism. These changes of flux distribution were further confirmed by the analysis based on the random sampling of glucose uptake rate having normal distribution (Fig. S2).
Fig. 2.
Comparison of 13C-based flux values to those calculated from FBA with or without grouping reaction constraints in wild-type E. coli and pykF knockout mutant strains on glucose minimal medium under aerobic conditions and flux distribution changes for pykF knockout in E. coli central metabolism. The color gradient and squares indicate the values of Vavg. The red lines indicate that the fluxes show the increment determined by flux bias (Vavg), compared with control fluxes, which were from E. coli wild type aerobically grown in glucose minimal medium. The blue lines indicate the opposite. Thickness of the lines denotes flux capacities (lsol) after environmental perturbation, compared with control fluxes; three types of thickness were used to indicate the changes in flux capacity. The thickest line indicates the increased flux capacity, the thinnest line the decreased flux capacity, and the medium thickness refers to no changes in flux capacity. Each graph represents the 13C-based fluxes and solution spaces of each flux predicted with or without grouping reaction constraints for wild type and pykF mutant. The y axis in each graph indicates the relative flux (%) that is normalized to the carbon source uptake rate. The 13C-based fluxes are represented in (I) wild type and (II) pykF mutant. The fluxes predicted by FBA without grouping reaction constraints are represented in (III) wild type and (IV) pykF mutant. The fluxes predicted by FBA with grouping reaction constraints are represented in (V) wild type and (VI) pykF mutant. The bars in each embedded graph denote the range between the maximal and minimal values of each reaction computed from FVA. The graph for succinyl-CoA synthetase is not shown because its flux values were outside the range of biologically feasible values in the case of FBA without grouping reaction constraints. In the case of reversible reactions, the fluxes corresponding to the direction represented in the pathway were considered. 
The 13C-based fluxes were evaluated with 90% confidence limits obtained from statistical analysis (37).
Flux patterns predicted by FBA with grouping reaction constraints for other gene knockout mutants also showed good agreement with 13C-based fluxes of corresponding strains. For the zwf knockout mutant, in which glucose 6-phosphate dehydrogenase in the PP pathway was disabled, FBA with grouping reaction constraints predicted increased Vavg’s through glycolysis, TCA cycle and acetate production reaction, and decreased Vavg’s of the PP pathway, which is consistent with results of the 13C-based metabolic flux analysis. In both methods, higher Vavg’s through TCA cycle were observed to complement the shortage of reducing power (NADPH) caused by the disruption of the PP pathway (Fig. S3) (38). For the ppc (phosphoenolpyruvate carboxylase) and sucA (2-oxoglutarate dehydrogenase) knockouts, FBA with grouping reaction constraints also yielded consistent predictions, compared with 13C-based fluxes (Figs. S4 and S5) (39, 40). In the case of the gnd and zwf mutants, the changes of the 13C-based flux range (95% confidence intervals) were in good agreement with those of the predicted flux ranges obtained using the grouping reaction constraints (Fig. S6). As a result, FBA with grouping reaction constraints well predicted changes in flux distribution caused by various single-gene knockouts, which suggests its useful application in in silico gene targeting for metabolic engineering.
FBA with Grouping Reaction Constraints Predicts the Changes of Flux Patterns upon Altering the Carbon Source from Glucose to Acetate.
The changes of flux patterns using glucose and acetate as carbon sources were predicted by FBA with grouping reaction constraints, and the results were compared with the experimentally measured fluxes (Fig. S7) (38). When acetate was used as a carbon source, the direction of the fluxes through glycolysis was reversed. In addition to the redirection of the pathway, the activation of the glyoxylate shunt and anaplerotic phosphoenolpyruvate carboxykinase reaction toward phosphoenolpyruvate changed the flux distribution dramatically compared with that observed with glucose as a carbon source. FBA with grouping reaction constraints also successfully predicted the use of the glyoxylate shunt and anaplerotic reaction toward phosphoenolpyruvate, and the fluxes were redirected toward glycolysis on the basis of Vavg and lsol. Most of the fluxes that originated from acetate uptake went through the TCA cycle and the glyoxylate shunt. The Vavg’s and lsol’s of the redirected reactions in glycolysis and Vavg’s of glyoxylate shunt reactions showed increments implying a higher likelihood of fluxes going through the newly directed pathways. The high Vavg’s of the reactions in the TCA cycle were consistent with those flux patterns experimentally observed with acetate as a carbon source.
The organic acid production rates predicted with grouping reaction constraints showed very small Vavg’s, which is also consistent with the batch fermentation profiles obtained using acetate as a carbon source (Fig. S8). The type 1 and 2 reactions covered 21.62 and 8.11%, respectively, of reactions in E. coli central metabolism when the carbon source was changed from glucose to acetate, which is because of the amplified redirected reactions and the reactions in the glyoxylate shunt (Fig. S7 and Table S3). The type 5 reactions mainly correspond to reactions involved in organic acid production, the PP pathway, and the TCA cycle, comprising 70.27% of all considered central reactions. Thus, FBA with grouping reaction constraints allowed more accurate prediction of the changes in flux patterns when different carbon sources were used.
Assessment of the Prediction Accuracy of FBA with Grouping Reaction Constraints.
Having seen that FBA with grouping reaction constraints can predict flux patterns better than simple FBA, the prediction accuracies of the two methods were compared with the experimentally determined 13C-based fluxes. Here, the Euclidean distance was used to quantify the overall agreement of the calculated fluxes with experimentally determined fluxes (SI Text). More specifically, the Euclidean distance was maximized or minimized with the constrained optimal cell growth rate under the given condition to estimate a range of possible Euclidean distances resulting from the multiple solutions of computationally calculated fluxes. Consequently, the maximal and minimal Euclidean distances that were estimated with and without grouping reaction constraints allowed us to see the improvement of predictions in the presence of grouping reaction constraints. The minimal distances obtained with grouping reaction constraints were shorter than those without grouping reaction constraints. Additionally, the distances between the best (minimal Euclidean distance) and the worst (maximal Euclidean distance) possible predictions with grouping reaction constraints were remarkably reduced, compared with those without grouping reaction constraints (Fig. 3). These results mean that the predicted flux values became much more similar to those experimentally determined for all six perturbations examined by applying grouping reaction constraints. In conclusion, FBA with grouping reaction constraints allows more accurate prediction of fluxes during the simulation of the genome-scale metabolic model, which is typically a highly underdetermined system.
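The distance measure used for this assessment can be sketched as follows. The precise definition, including which reactions enter the comparison, is given in SI Text. This sketch assumes the plain unweighted Euclidean distance between two vectors of relative fluxes, and the flux values and function name are made up for illustration.

```python
import numpy as np

def euclidean_distance(predicted, measured):
    """Unweighted Euclidean distance between two flux vectors of equal length."""
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    if predicted.shape != measured.shape:
        raise ValueError("flux vectors must cover the same reactions")
    return float(np.sqrt(np.sum((predicted - measured) ** 2)))

# Made-up relative fluxes (% of carbon source uptake) for three reactions
measured = [100.0, 70.0, 30.0]
candidate_a = [100.0, 66.0, 33.0]
candidate_b = [100.0, 40.0, 60.0]
print(euclidean_distance(candidate_a, measured))
print(euclidean_distance(candidate_b, measured))
```

Among the alternative optimal flux distributions, the one closest to the measurement gives the minimal distance and the one farthest gives the maximal distance, so a narrower gap between the two indicates a smaller set of alternative solutions.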
Fig. 3.
Assessment of the prediction accuracy of FBA with grouping reaction constraints using Euclidean distance between computationally calculated fluxes and 13C-based fluxes. Shown are Euclidean distances of E. coli wild type on glucose minimal medium under aerobic conditions (A) with grouping reaction constraints and (B) without grouping reaction constraints, E. coli pykF mutant on glucose minimal medium under aerobic conditions (C) with grouping reaction constraints and (D) without grouping reaction constraints, and E. coli wild type on acetate minimal medium under aerobic conditions (E) with grouping reaction constraints and (F) without grouping reaction constraints.
Discussion
FBA-based simulation of genome-scale models has been valuable in predicting metabolic flux distributions and understanding the metabolic characteristics under various genotypic and environmental conditions. However, the accuracy of FBA in predicting the metabolic fluxes is limited due to the incomplete information available; e.g., there is broad flux solution space causing multiple intracellular flux solutions for a given optimal objective state. The key issue to overcome such problems is to reduce the flux solution space of the genome-scale model by using additional constraints (22–31). Even though these constraints are invaluable in improving the prediction, they require complex information, such as transcriptional regulation and a signaling mechanism, and are condition dependent. We thus aimed at developing a strategy that generates additional information to be incorporated, in the form of constraints, to improve the accuracy in predicting the metabolic fluxes.
Performance of FBA could also be improved by adopting appropriate objective functions under specific circumstances. Schuetz et al. (41) systematically evaluated several objective functions under specific conditions and showed that the prediction accuracy could be significantly improved by selecting a suitable objective function. Yet, this approach still requires the selection of an appropriate objective function according to the specific condition being investigated. Consequently, the objective of this study was to examine whether the prediction of metabolic fluxes in pseudosteady state can be improved by applying condition- and objective function-independent constraints derived from genomic features and reaction stoichiometry.
Recently, the complexity of transcription unit architecture was investigated by several groups (42–44). The presence of these transcription units results in a more complicated model of gene expression, compared with the classical operon model. A transcription unit can contain a gene group that is transcribed to an mRNA and corresponds to its respective metabolic reaction. However, the transcription unit architecture does not account for the influences of other functionally related genes that are not found in the same transcription unit or operon. The grouping reaction constraints defined in this study consider co-occurrence and gene fusion, in addition to the conserved neighborhood relationship. As a result, the grouping reaction constraints encompass a broader scope that includes the operons and transcription units of functionally related genes, which leads to the assumption that the reactions in the same group are affected by similar regulation. Thus, grouping of such significantly related reactions by genomic context analysis provides us with a condition-independent constraint that incorporates complex regulations, such as transcriptional regulation, in a simple way. In addition, applying additional flux scale constraints within the groups generated by genomic context analysis to the model could improve the fidelity of the predicted metabolic fluxes by further restricting the range of flux solution space. From an engineering point of view, predicting the metabolic fluxes and their changes under various perturbed conditions is important in designing the strategies for strain improvement. The existence of alternative functional states to the predicted metabolic fluxes can be explained from a biological point of view, often described as biological robustness (14, 45, 46). Thus, we investigated the metabolic fluxes and their changes under perturbed conditions using FVA, which can calculate alternative functional states for a given condition.
In this paper, we report a simple yet generally applicable strategy for performing FBA more accurately by providing condition- and objective function-independent constraints. The flux solution space can be reduced by applying grouping reaction constraints generated from information readily obtainable from the genome sequence and the stoichiometric network. Thus, this strategy will be generally useful for accurately predicting metabolic fluxes at the genome scale when their experimental determination is difficult.
Materials and Methods
FBA with Grouping Reaction Constraints.
FBA was used to maximize or minimize an objective function under different constraints. The FBA with grouping reaction constraints framework was established by applying simultaneous on/off and flux scale constraints to FBA, as described in Results and as detailed in SI Text.
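As an illustration only, the following minimal sketch checks whether a candidate flux vector satisfies the steady-state mass balance and a simultaneous on/off condition for grouped reactions. It is not the implementation described in SI Text. The toy network, the reaction group, the threshold eps, and the function names are all hypothetical.

```python
from typing import Dict, List, Sequence


def is_steady_state(S: Sequence[Sequence[float]], v: Sequence[float],
                    tol: float = 1e-9) -> bool:
    """Return True if S·v = 0 (within tol) for every metabolite row of S."""
    return all(
        abs(sum(s_ij * v_j for s_ij, v_j in zip(row, v))) <= tol
        for row in S
    )


def groups_on_off_consistent(v: Sequence[float],
                             groups: Dict[str, List[int]],
                             eps: float = 1e-6) -> bool:
    """Return True if, in every group, all reactions are active or all inactive.

    A reaction is treated as active when |flux| > eps (an assumed threshold).
    """
    for members in groups.values():
        active = [abs(v[j]) > eps for j in members]
        if any(active) and not all(active):
            return False
    return True


# Hypothetical toy network with two parallel routes from A to B:
#   R0: -> A,  R1: A -> B,  R2: A -> B,  R3: B ->
S = [
    [1, -1, -1, 0],   # metabolite A
    [0, 1, 1, -1],    # metabolite B
]
# Assume genomic context analysis placed R1 and R2 in the same group.
groups = {"g1": [1, 2]}

v_split = [10.0, 7.0, 3.0, 10.0]    # both grouped reactions carry flux
v_single = [10.0, 10.0, 0.0, 10.0]  # only R1 carries flux

for name, v in [("v_split", v_split), ("v_single", v_single)]:
    print(name, is_steady_state(S, v), groups_on_off_consistent(v, groups))
```

Both vectors satisfy mass balance, but only v_split also satisfies the on/off condition. This shows how a grouping constraint can exclude flux distributions that stoichiometry alone would allow. In an actual optimization, such a condition would have to be encoded inside the solver, for example with binary variables, rather than checked after the fact.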
Fermentations and Analytical Procedures.
Batch fermentations were carried out in a 6.6-L Bioflo 3000 fermenter (New Brunswick Scientific) containing 2 L of M9 minimal medium with 20 g/L of glucose or 10 g/L of acetate at 37 °C under aerobic conditions. The pH was controlled at 6.8 by automatic feeding of NH4OH or HCl. Cell growth was monitored by measuring the absorbance at 600 nm (OD600) using an Ultrospec 3000 spectrophotometer (Pharmacia Biotech). Glucose concentration was measured using a glucose analyzer (model 2700 STAT; Yellow Springs Instrument). The concentrations of glucose and organic acids were determined by HPLC (ProStar 210; Varian) equipped with UV/visible-light (ProStar 320; Varian) and refractive index (Shodex RI-71) detectors. Further details of the fermentations and analytical procedures are given in SI Text.
Acknowledgments
We thank Hyun Uk Kim and Seung Bum Sohn for their help in manuscript preparation. We also thank Dr. Byung Kwan Cho for his helpful discussion. This work was supported by the Korean Systems Biology Research Project (20100002164) of the Ministry of Education, Science, and Technology through the National Research Foundation of Korea. Further support by the World Class University Program (R32-2009-000-10142-0) through the National Research Foundation of Korea funded by the Ministry of Education, Science, and Technology is appreciated.
Footnotes
The authors declare no conflict of interest.
*This article is a PNAS Direct Submission. B.Ø.P. is a guest editor invited by the Editorial Board.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1003740107/-/DCSupplemental.
References
1. Lee SY, Lee DY, Kim TY. Systems biotechnology for strain improvement. Trends Biotechnol. 2005;23:349–358. doi: 10.1016/j.tibtech.2005.05.003.
2. Joyce AR, Palsson BO. The model organism as a system: Integrating ‘omics’ data sets. Nat Rev Mol Cell Biol. 2006;7:198–210. doi: 10.1038/nrm1857.
3. Park JH, Lee SY. Towards systems metabolic engineering of microorganisms for amino acid production. Curr Opin Biotechnol. 2008;19:454–460. doi: 10.1016/j.copbio.2008.08.007.
4. Krömer JO, Sorgenfrei O, Klopprogge K, Heinzle E, Wittmann C. In-depth profiling of lysine-producing Corynebacterium glutamicum by combined analysis of the transcriptome, metabolome, and fluxome. J Bacteriol. 2004;186:1769–1784. doi: 10.1128/JB.186.6.1769-1784.2004.
5. Deshpande R, Yang TH, Heinzle E. Towards a metabolic and isotopic steady state in CHO batch cultures for reliable isotope-based metabolic profiling. Biotechnol J. 2009;4:247–263. doi: 10.1002/biot.200800143.
6. Becker J, Klopprogge C, Wittmann C. Metabolic responses to pyruvate kinase deletion in lysine producing Corynebacterium glutamicum. Microb Cell Fact. 2008;7:8. doi: 10.1186/1475-2859-7-8.
7. Price ND, Reed JL, Palsson BO. Genome-scale models of microbial cells: Evaluating the consequences of constraints. Nat Rev Microbiol. 2004;2:886–897. doi: 10.1038/nrmicro1023.
8. Kim HU, Kim TY, Lee SY. Metabolic flux analysis and metabolic engineering of microorganisms. Mol Biosyst. 2008;4:113–120. doi: 10.1039/b712395g.
9. Park JH, Lee KH, Kim TY, Lee SY. Metabolic engineering of Escherichia coli for the production of L-valine based on transcriptome analysis and in silico gene knockout simulation. Proc Natl Acad Sci USA. 2007;104:7797–7802. doi: 10.1073/pnas.0702609104.
10. Lee KH, Park JH, Kim TY, Kim HU, Lee SY. Systems metabolic engineering of Escherichia coli for L-threonine production. Mol Syst Biol. 2007;3:149. doi: 10.1038/msb4100196.
11. Jamshidi N, Palsson BO. Using in silico models to simulate dual perturbation experiments: Procedure development and interpretation of outcomes. BMC Syst Biol. 2009;3:44. doi: 10.1186/1752-0509-3-44.
12. Feist AM, Palsson BO. The growing scope of applications of genome-scale metabolic reconstructions using Escherichia coli. Nat Biotechnol. 2008;26:659–667. doi: 10.1038/nbt1401.
13. Edwards JS, Ibarra RU, Palsson BO. In silico predictions of Escherichia coli metabolic capabilities are consistent with experimental data. Nat Biotechnol. 2001;19:125–130. doi: 10.1038/84379.
14. Reed JL, Palsson BO. Genome-scale in silico models of E. coli have multiple equivalent phenotypic states: Assessment of correlated reaction subsets that comprise network states. Genome Res. 2004;14:1797–1805. doi: 10.1101/gr.2546004.
15. Mahadevan R, Schilling CH. The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metab Eng. 2003;5:264–276. doi: 10.1016/j.ymben.2003.09.002.
16. Park JM, Kim TY, Lee SY. Constraints-based genome-scale metabolic simulation for systems metabolic engineering. Biotechnol Adv. 2009;27:979–988. doi: 10.1016/j.biotechadv.2009.05.019.
17. Puchałka J, et al. Genome-scale reconstruction and analysis of the Pseudomonas putida KT2440 metabolic network facilitates applications in biotechnology. PLoS Comput Biol. 2008;4:e1000210. doi: 10.1371/journal.pcbi.1000210.
18. Blank LM, Kuepfer L, Sauer U. Large-scale 13C-flux analysis reveals mechanistic principles of metabolic network robustness to null mutations in yeast. Genome Biol. 2005;6:R49. doi: 10.1186/gb-2005-6-6-r49.
19. Fischer E, Sauer U. Large-scale in vivo flux analysis shows rigidity and suboptimal performance of Bacillus subtilis metabolism. Nat Genet. 2005;37:636–640. doi: 10.1038/ng1555.
20. Kim TY, Lee SY. Accurate metabolic flux analysis through data reconciliation of isotope balance-based data. J Microbiol Biotechnol. 2006;16:1139–1143.
21. Sauer U. Metabolic networks in motion: 13C-based flux analysis. Mol Syst Biol. 2006;2:62. doi: 10.1038/msb4100109.
22. Barrett CL, Herring CD, Reed JL, Palsson BO. The global transcriptional regulatory network for metabolism in Escherichia coli exhibits few dominant functional states. Proc Natl Acad Sci USA. 2005;102:19103–19108. doi: 10.1073/pnas.0505231102.
23. Covert MW, Knight EM, Reed JL, Herrgard MJ, Palsson BO. Integrating high-throughput and computational data elucidates bacterial networks. Nature. 2004;429:92–96. doi: 10.1038/nature02456.
24. Shlomi T, Eisenberg Y, Sharan R, Ruppin E. A genome-scale computational study of the interplay between transcriptional regulation and metabolism. Mol Syst Biol. 2007;3:101. doi: 10.1038/msb4100141.
25. Covert MW, Xiao N, Chen TJ, Karr JR. Integrating metabolic, transcriptional regulatory and signal transduction models in Escherichia coli. Bioinformatics. 2008;24:2044–2050. doi: 10.1093/bioinformatics/btn352.
26. Lee JM, Gianchandani EP, Eddy JA, Papin JA. Dynamic analysis of integrated signaling, metabolic, and regulatory networks. PLoS Comput Biol. 2008;4:e1000086. doi: 10.1371/journal.pcbi.1000086.
27. Feist AM, et al. A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol. 2007;3:121. doi: 10.1038/msb4100155.
28. Henry CS, Broadbelt LJ, Hatzimanikatis V. Thermodynamics-based metabolic flux analysis. Biophys J. 2007;92:1792–1805. doi: 10.1529/biophysj.106.093138.
29. Kümmel A, Panke S, Heinemann M. Systematic assignment of thermodynamic constraints in metabolic network models. BMC Bioinformatics. 2006;7:512. doi: 10.1186/1471-2105-7-512.
30. Yang F, Qian H, Beard DA. Ab initio prediction of thermodynamically feasible reaction directions from biochemical network stoichiometry. Metab Eng. 2005;7:251–259. doi: 10.1016/j.ymben.2005.03.002.
31. Beg QK, et al. Intracellular crowding defines the mode and sequence of substrate uptake by Escherichia coli and constrains its metabolic activity. Proc Natl Acad Sci USA. 2007;104:12663–12668. doi: 10.1073/pnas.0609845104.
32. Breitling R, Vitkup D, Barrett MP. New surveyor tools for charting microbial metabolic maps. Nat Rev Microbiol. 2008;6:156–161. doi: 10.1038/nrmicro1797.
33. Burgard AP, Nikolaev EV, Schilling CH, Maranas CD. Flux coupling analysis of genome-scale metabolic network reconstructions. Genome Res. 2004;14:301–312. doi: 10.1101/gr.1926504.
34. Papin JA, Reed JL, Palsson BO. Hierarchical thinking in network biology: The unbiased modularization of biochemical networks. Trends Biochem Sci. 2004;29:641–647. doi: 10.1016/j.tibs.2004.10.001.
35. Jensen LJ, et al. STRING 8—a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res. 2009;37(Database issue):D412–D416. doi: 10.1093/nar/gkn760.
36. Meyer D, Schneider-Fresenius C, Horlacher R, Peist R, Boos W. Molecular characterization of glucokinase from Escherichia coli K-12. J Bacteriol. 1997;179:1298–1306. doi: 10.1128/jb.179.4.1298-1306.1997.
37. Al Zaid Siddiquee K, Arauzo-Bravo MJ, Shimizu K. Metabolic flux analysis of pykF gene knockout Escherichia coli based on 13C-labeling experiments together with measurements of enzyme activities and intracellular metabolite concentrations. Appl Microbiol Biotechnol. 2004;63:407–417. doi: 10.1007/s00253-003-1357-9.
38. Zhao J, Baba T, Mori H, Shimizu K. Effect of zwf gene knockout on the metabolism of Escherichia coli grown on glucose or acetate. Metab Eng. 2004;6:164–174. doi: 10.1016/j.ymben.2004.02.004.
39. Li M, Ho PY, Yao S, Shimizu K. Effect of sucA or sucC gene knockout on the metabolism in Escherichia coli based on gene expressions, enzyme activities, intracellular metabolite concentrations and metabolic fluxes by 13C-labeling experiments. Biochem Eng J. 2006;30:286–296. doi: 10.1016/j.jbiotec.2005.09.016.
40. Peng L, Arauzo-Bravo MJ, Shimizu K. Metabolic flux analysis for a ppc mutant Escherichia coli based on 13C-labelling experiments together with enzyme activity assays and intracellular metabolite measurements. FEMS Microbiol Lett. 2004;235:17–23. doi: 10.1016/j.femsle.2004.04.003.
41. Schuetz R, Kuepfer L, Sauer U. Systematic evaluation of objective functions for predicting intracellular fluxes in Escherichia coli. Mol Syst Biol. 2007;3:119. doi: 10.1038/msb4100162.
42. Cho BK, et al. The transcription unit architecture of the Escherichia coli genome. Nat Biotechnol. 2009;27:1043–1049. doi: 10.1038/nbt.1582.
43. Güell M, et al. Transcriptome complexity in a genome-reduced bacterium. Science. 2009;326:1268–1271. doi: 10.1126/science.1176951.
44. Sharma CM, et al. The primary transcriptome of the major human pathogen Helicobacter pylori. Nature. 2010;464:250–255. doi: 10.1038/nature08756.
45. Raamsdonk LM, et al. A functional genomics strategy that uses metabolome data to reveal the phenotype of silent mutations. Nat Biotechnol. 2001;19:45–50. doi: 10.1038/83496.
46. Fong SS, Marciniak JY, Palsson BO. Description and interpretation of adaptive evolution of Escherichia coli K-12 MG1655 by using a genome-scale in silico metabolic model. J Bacteriol. 2003;185:6400–6408. doi: 10.1128/JB.185.21.6400-6408.2003.