Abstract
Large-scale biochemical models are of increasing sizes due to the consideration of interacting organisms and tissues. Model reduction approaches that preserve the flux phenotypes can simplify the analysis and predictions of steady-state metabolic phenotypes. However, existing approaches either restrict functionality of reduced models or do not lead to significant decreases in the number of modelled metabolites. Here, we introduce an approach for model reduction based on the structural property of balancing of complexes that preserves the steady-state fluxes supported by the network and can be efficiently determined at genome scale. Using two large-scale mass-action kinetic models of Escherichia coli, we show that our approach results in a substantial reduction of 99% of metabolites. Applications to genome-scale metabolic models across kingdoms of life result in up to 55% and 85% reduction in the number of metabolites when arbitrary and mass-action kinetics is assumed, respectively. We also show that predictions of the specific growth rate from the reduced models match those based on the original models. Since steady-state flux phenotypes from the original model are preserved in the reduced, the approach paves the way for analysing other metabolic phenotypes in large-scale biochemical networks.
Subject terms: Computational biology and bioinformatics, Biochemical reaction networks, Network topology
Introduction
Advances in phenotyping, quantitative genetics methods, and systems biology have helped characterize function of genes embedded in different types of biochemical networks1–3. Generation of large-scale stoichiometric models, capturing the metabolism of individual cell types and their interactions in an organism or communities1, has propelled the development of computational approaches for prediction and analysis of metabolic phenotypes4. The sheer size of genome-scale metabolic networks indisputably leads to computational and numerical challenges when employing them to predict and analyse metabolic phenotypes, particularly when these networks are endowed with enzyme kinetics5–7. Against this background we ask: Can large-scale biochemical models be reduced in an unbiased, i.e. fully automated fashion, while still guaranteeing that the steady-state flux phenotypes are preserved in the resulting models of smaller size?
There already exist several frameworks and theories to reduce biochemical models8,9. These approaches can be categorized based on different criteria: (i) whether they only employ the structure of the network (i.e. stoichiometry of biochemical reactions)9–13 or also consider the kinetics of reaction fluxes14–23, (ii) whether they approximate or provide one-to-one correspondence of the steady-state properties between the original and reduced network10,23–25, and (iii) whether the reduction process is fully automated, i.e. unbiased, or semi-automated (i.e. requires user input)9.
Reduction approaches rooted in the constraint-based modelling framework have been successfully employed to arrive at reduced models that approximate steady-state flux properties of large-scale metabolic networks. One group comprises approaches based on user-specified input, consisting of pre-selected metabolites, reactions, and functions (e.g. biomass production) that must be present or supported by the reduced model and, thus, result in reduced models biased by the user’s input24–26. Other approaches in this group start with user-defined subsystems that are subsequently connected (and further refined) to obtain a minimal model that can fulfil a pre-specified function27. In contrast, using ideas from reaction coupling28, a recent unbiased approach provides an efficient reduction of stoichiometric models by reaction removal, while ensuring one-to-one correspondence between the elementary flux modes of the original and reduced model10. Since these approaches do not consider kinetics of reaction flux, they remain silent about the effects that the reduction has on the concentrations of the remaining metabolites.
Approaches for reduction of models with mass action kinetics are particularly important to solve or approximate different reaction mechanisms (e.g. Michaelis–Menten, Hill)16,29. They usually invoke the quasi-steady state assumption (QSSA), whereby the modelled components can be grouped into those with small concentrations, referred to as intermediates, and those with large concentrations (i.e. substrates and products), such that the network displays two (or more) timescales. Invoking QSSA implies that the intermediates, that change on the fast time scale, are expressed as functions of the slow components16. Therefore, these intermediates are eliminated from the original network and, then, the reduced network, capturing the slow dynamics, is further analysed. Application of QSSA is not straightforward since it requires data on kinetic parameters that are challenging to obtain at a genome-scale level30. Therefore, such approaches cannot be analytically carried out29, and often involve approximations that lead to inaccuracies of estimations.
One unbiased technique to eliminate a component in mass action networks requires that the component appears only as a sole substrate or product in the reactions in which it participates23. Assuming steady state, the concentration of such a component can be readily expressed as a function of the concentrations of the remaining components. This approach guarantees correspondence between the conservation laws of the original and reduced network and the respective numbers of positive steady states23. This idea has been generalized to allow linear removal of some subsets of components that appear only with stoichiometry at most one in any reaction of a given network endowed with mass action kinetics14. Yet, structural conditions ensuring that the steady-state flux spaces of the reduced network are subset of the reduced network by removal of components of any stoichiometry in networks endowed with arbitrary kinetics remain elusive.
Here we identify such a structural property for model reduction and show that it can be efficiently computed and applied in genome-scale metabolic networks from organisms across the kingdoms of life. The condition allows consideration of biochemical constraints and can, therefore, be invoked with or without assumptions on optimized biological function. The key advantage of the structural condition is that it can be applied to large-scale biochemical networks endowed with mass action kinetics, and under some restrictions, to networks endowed with arbitrary kinetics, while guaranteeing that the steady-state flux phenotypes of the original model are preserved in the reduced.
Results and discussion
Balancing of complexes as a structural condition for network reduction
To illustrate the condition, we first introduce some key concepts from stoichiometric modelling of biochemical networks. A biochemical network is composed of irreversible reactions through which biochemical species acting as substrates are transformed into products. The biochemical network on Fig. 1a is composed of ten reactions that transform six species. The network structure is described by nodes that denote complexes, corresponding to the left- and right-hand sides of the considered reactions, and edges representing the reactions. Therefore, the network on Fig. 1a contains eight complexes connected by ten reactions. The incoming and outgoing neighbourhoods of a complex are given by the complexes to which it is directly connected via incoming and outgoing reactions, respectively. For instance, the incoming neighbourhood of complex 2B is composed of complexes D and F, and the outgoing neighbourhood comprises A + E and F. The stoichiometric matrix of the network, , is then given by the product of the matrix, , describing the species composition of complexes and the incidence matrix, , of the corresponding directed graph (“Methods”, Supplementary Fig. S1)31,32. In addition, the reactions are weighted by non-negative numbers which correspond to fluxes of a steady-state flux distribution, , which satisfies . The steady-state condition implies that the concentrations of species are invariant in time, i.e. the species are balanced, whereby the production and consumption rates are the same.
An analogous condition on balancing can be defined for a single complex: Given a biochemical network, a complex is balanced in a set of steady-state flux distributions if the sum of fluxes of its incoming and outgoing reactions is the same for every flux distribution in . Clearly, complexes that act as sinks or sources, that only have incoming or outgoing reactions, respectively, cannot be balanced in a network without blocked reactions (i.e. reactions which carry no flux in every steady state the network supports). Further, a complex is considered trivially balanced if it includes a species that appears in no other complex in the network. This result is a consequence of the balancing of species due to the steady state assumption. A balanced complex that includes species all of which appear in other complexes are termed non-trivially balanced. For instance, complex C cannot be balanced as it is a sink, while D, A + E, and F are trivially balanced since species D, E, and F appear only in these complexes that have both incoming and outgoing reactions. Note that our definition of trivially balanced complexes extends the notion of an intermediate species in other reduction approaches23. Finally, 2A is a non-trivially balanced complex, since A appears in two complexes of which one, A + E, is trivially balanced. In fact, by considering only the steady-state conditions, it is the interlinking of species into complexes that contributes to the formation of balanced complexes.
For a set of steady-state flux distributions, every balanced complex can be readily identified by constraint-based modelling (“Methods”). The approach amounts to determining that the minimum and maximum total fluxes around a complex are zero for all steady-state flux distributions in the set . These conditions can be verified by linear programming that is efficient even for large-scale networks (“Methods”). The identification of balanced complexes can be further streamlined by easy screening of trivially balanced as well as sink and source complexes.
Networks in which all complexes are balanced are termed complex balanced, and they can be identified by calculating the structural property of deficiency34. Seminal theoretical results pinpoint that complex balanced networks do not exhibit exotic properties, like multistationarity and oscillations. However, it has not yet been addressed how one can identify individual balanced complexes and what the implications of their presence in the network are with respect to the capacity to exhibit particular phenotypic properties.
We next ask if balanced complexes are relevant for reduction of networks which are not complex balanced. To answer this question, we first consider the scenario with a network endowed with arbitrary kinetics. If a balanced complex has only one outgoing reaction, then the flux of this reaction can be expressed as a sum of the fluxes of incoming reactions. The balanced complex can then be safely removed since the flux of the outgoing reaction can be substituted with the respective sum of fluxes of the incoming reactions for the balanced complex. This is concomitant to introducing a bipartite directed clique, connecting each complex in the incoming neighbourhood with the complex in the outgoing neighbourhood (Fig. 1b,d). Note that any loop edges introduced by this transformation can be removed, since they do not contribute to the balancing equations. For instance, A + E, F, and 2A can be removed as balanced complexes, resulting in the network on Fig. 1f; however, this cannot be done for the balanced complex D since it has more than one outgoing reaction. Since the removal of complexes in this way results in expressing one flux as a sum of others, the reduced network is unique and does not depend on the order of removing the balanced complexes. A similar idea has been used in the unbiased efficient reduction of stoichiometric models10, however, based on the coupling of reactions around balanced species and without the rewiring of reactions.
Attempting to remove a balanced complex with more than one outgoing reaction fails in general; however, under the constraint that all outgoing reactions are fully coupled (i.e. their flux ratio in every steady state is a unique constant) all outgoing reaction fluxes collapse into one and the balanced complex can be removed. Interestingly, this condition is trivially met for networks endowed with mass action kinetics, since the ratio of fluxes of reactions outgoing from any complex is given by the ratio of the respective rate constants. Importantly, this removal can be carried out even if values for the rate constants are not specified, since the ratio is ensured to be constant. Therefore, under mass action, a balanced complex can be removed by introducing a bipartite directed clique of reactions with rescaled kinetic constants (see “Methods”, Fig. 1c,e). For instance, the balanced complex D can be removed, like the other balanced complexes in the network, resulting in a mass action network with fewer complexes and species, and reactions with modified kinetic parameters, as shown in Fig. 1g.
The usage of balanced complexes to reduce networks with mass action kinetics generalizes the removal of trivially balanced complexes in which species appear in a single complex with stoichiometry of one23. However, the proposed approach allows removal of species that participate in any number of complexes with any stoichiometry, provided they appear only in balanced complexes. It also provides a sound extension of a reduction approach in which complexes are–arbitrarily–assumed to be balanced18—and leads to biases in the model reduction since the complexes are not demonstrated to be balanced. In our approach, the conservation laws in the original network hold in the reduced, but not necessarily vice versa (see “Methods”). Therefore, if the original network has at most one positive steady state for any rate constants and values for the conservation moieties, then the reduced model also has at most one positive steady state for the rescaled rate constants23. Hence, analysis of the reduced model can help identify if multistationarity for the concentration of remaining species is precluded in the original, larger model if seminal, well-established conditions apply34. Note that, the removal of balanced complexes may result in removal of species, but may lead to an increase in the number of reactions. Most importantly, the steady-state concentration of any removed species in the process of model reduction can be expressed as a function of the concentrations of the species in the reduced model (Supplementary Fig. S2).
Reduction of a large-scale kinetic model of Escherichia coli metabolism
To verify that the proposed approach could lead to reduction of metabolic networks endowed with mass action kinetics, we used the genome-scale7 and the so-called core35 kinetic metabolic models of E. coli. The core model is composed of 828 species, 1474 irreversible reactions, and 1190 complexes, while the genome-scale model consist of 3002 species, 5237 irreversible reactions, and 4371 complexes. Relying solely on the structure of the network, we identified 484 and 1639 balanced complexes in the core and genome-scale kinetic models, respectively, of which 77.7% and 75.4% were trivially balanced. For instance, the complexes 'Phosphoenolpyruvate + FBP-Phosphoenolpyruvate-complex’, ‘ACALD-acetaldehyde-NAD-CoA-complex', and ‘3-phospho-glycerate + PGK-3-phospho-glycerate-complex' are trivially balanced since the respective enzyme–substrate biochemical complexes appear only in these network complexes. In addition, the complex ‘3-phospho-glycerate + ICL’ is non-trivially balanced, since the species 3-phospho-glycerate appears only in two complexes (‘3-phospho-glycerate + ICL’ and ‘3-phospho-glycerate + PGK-3-phospho-glycerate-complex' x) and the complex ‘3-phospho-glycerate + PGK-3-phospho-glycerate-complex' is trivially balanced. From the 484 and 1639 balanced complexes 23% and 26% had only one outgoing reaction; for instance, ‘l-aspartate + PRASCS-ATP-complex’, ‘ATP + PRASCS’, and ‘glucose-6-phosphate + FBP’ are balanced complexes with such properties (see Supplementary Figure S3). The removal of such balanced complexes led to a decrease in the number of species by ~ 14% in comparison to both investigated kinetic models. Both models consider elementary reaction steps and, therefore, model species can be of three types: metabolite, enzyme or metabolite-enzyme biochemical complex. We observed that ~ 95% of the species removed denoted species that represent metabolite-enzyme biochemical complexes. This is in line with the commonly applied assumption in deriving kinetics that are based on mass action (e.g. Michaelis–Menten), and is due to the fact that metabolite-enzyme complexes, as species, participate in single network complexes that are consequently trivially balanced (as illustrated in the examples above). When considering the additional restriction due to mass action kinetics, we could apply the approach in a second round of reduction. This led to the identification of 694 and 2708 additional non-trivial balanced complexes in the core and genome-scale kinetic model, respectively. Removal of these balanced complexes led to ~ 99% decrease in the number of species in comparison to the original model for the core as well as the genome-scale model. Therefore, the approach does lead to a substantial reduction in the number of species even when only the structure of the network was employed.
The reduced core kinetic model comprises: fumarate, succinate as well as ubiquinone-8 and ubiquinol-8, interconverted by fumarate reductase (FRD2) and succinate dehydrogenase (SUCDi) together with the related substrate-enzyme complexes (Fig. 2). In the reduced genome-scale model extracellular pyruvate, acetate, and l-glucose as well as enzymes exchanging those metabolites with the environment extend the set of species around fumarate reductase and succinate dehydrogenase that are found in the reduced core kinetic model (Fig. 2). Therefore, the steady-state concentrations of all other metabolites in these kinetic models can be expressed as functions of the few species that are present in the reduced model.
Reduction of large-scale stoichiometric metabolic networks
Motivated by the findings from the reduction of the core and genome-scale kinetic models of E. coli metabolism, we next employed the proposed approach to inspect the reduction in the number of species in genome-scale metabolic models of different organisms. To this end, we used the metabolic networks of twelve organisms to compare and contrast the reductions for the following scenarios: (i) all reactions are assumed to be reversible in contrast to the case when irreversibility constraints, included in the original models, are used, (ii) all reactions follow arbitrary kinetics or their fluxes are described by mass action kinetics, (iii) balanced complexes are determined with respect to the set of steady-state flux distributions compatible with reversibility in comparison to steady-state flux distributions at optimal specific growth rate. Note that reversible reactions are split into two irreversible reactions before applying the approach.
With arbitrary kinetics of reaction fluxes, the general observation was that invoking irreversibility led to only a small increase (< 7%) in the number of removed species across all organisms (Fig. 3a, Supplementary Table S1). The model of M. barkeri is an exception to this finding, with 30% increase in species reduction when considering reaction irreversibility. In addition, imposing constraints on biomass had negligible additional effect on the balanced complexes in majority of the networks, except in the models of P. putida and N. pharaonis, for which there was an increase by 26 and 30% in the number of removed species, respectively (Fig. 3a). This finding demonstrated that the balanced complexes are a property of the network structure and steady-state constraint, rather than due to optimality conditions imposed.
Consideration of mass action kinetics led to an increase in model reduction of, on average, 26% in comparison to the case of arbitrary kinetics when all reactions are considered reversible (Fig. 3a,b). In contrast to the case of arbitrary kinetics, the assumptions of reaction irreversibility has a large effect on the reduction in the number of species in comparison to the case when all reactions are assumed to be reversible (Fig. 3b). On average, we observed 23% increase in species reduction when reaction irreversibility is considered. Finally, imposing constraints on biomass had no effect on the balanced complexes and, thus, on the number of removed species, when mass action was considered (Fig. 3b).
Altogether, when irreversibility for reactions in the original model was considered, the proposed approach led to no reduction in the number of species in the Arabidopsis thaliana core, M. musculus, N. paharaonis and P. putida model and reached up to 55.1% reduction in the model of M. barkeri. On average we observed 18% of species being removed across all considered networks. Instead, when mass action kinetics was considered, we found a reduction between 24.3% for E. coli to 85.7% in T. maritima, with an average of 44% across the considered models. These results demonstrated that substantial reduction in the number of removed species is possible, while ensuring that the key network properties at steady state are matched between the original and reduced network. In addition, the reduction eliminated up to 17 (12%) species which enter complexes with stoichiometry larger than one (Supplementary Table S2), which is not possible with the existing approaches.
As a general note, the differences in the reduction of the considered models is due to the differences in the number and positioning of balanced complexes with the specific motif of having one outgoing–many incoming reactions (or vice versa) in the specific scenarios considered. In comparison to the reduction of the large-scale mass action model of E. coli, its counterpart with arbitrary kinetics can be reduced to a smaller degree since its balanced complexes that have the motif structure for reduction are subset of such balanced complexes when mass action is employed. Furthermore, the differences between the scenarios are due to the additional constraints imposed in each of the scenarios.
Next, we investigate compartment-wise species reduction between the original and reduced models, considering reaction irreversibility and mass action kinetics. In doing this analysis, we determined the balanced complexes in the entire network, and then investigated the effect of the reduction by using information about compartments in the respective models. The general observation was that models comprising extracellular space show large reduction for this compartment. The reduction ranged from 26% for N. pharaonis to 100% in T. maritima (Supplementary Table S3). On average, the reduction over the ten models including extracellular space as a compartment was 65.5%. The average reduction for the cytoplasmic part of bacterial and archaea models was 49%. Compartmented models of S. cerevisiae, C. reinhardtii and A. thaliana showed largest reduction in mitochondria, chloroplast and cytosol, respectively (Supplementary Table S3).
Using the two genome-scale metabolic models of E. coli and S. cerevisiae we compared the set of metabolites in the original and reduced models using Jaccard similarity (see “Methods”). The original models show Jaccard similarity of 0.27 for the set of metabolites, while the reduced models have Jaccard similarity of 0.26. A similar observation was made comparing bacterial models of E. coli, M. tuberculosis, and N. pharaonis for which at least 95% of the metabolite names could be translated to a common name space. Here, we found Jaccard similarity across the set of metabolites of 0.19 for the original models and 0.16 in the reduced models. These finding demonstrated that the reduction keeps the differences between the original models largely unchanged in comparison to the differences between the reduced models across organisms.
Predictions of specific growth rate from reduced models
To showcase the benefit of the reduction approach, we considered the reduced models that still include the metabolites (i.e. species) participating in the biomass reactions from the original model (with specified irreversibility). This allows us to compare the simulated specific growth rate between the original and reduced models. The resulting reduced models included 4% fewer metabolites in the model of A. thaliana to 28% in the model of M. musculus (Fig. 4b). Our findings demonstrate that optimal specific growth rate predicted by the original models are exact match for those of the reduced models (Fig. 4a). Thus, this analysis showcases that the proposed property can be used to remove some well-defined balanced complexes (e.g. those that do not include the building blocks of biomass) without affecting specific growth rates, thus, allowing more targeted applications.
Analyses of flux variability and essentiality of reactions
Next, we employed the reduced models used in the prediction of specific growth rates to investigate the difference in the ranges derived from flux variability analysis36. To this end, we calculated the ranges of reactions that are present in the reduced and original model following standard procedure (36“Methods”). We found that seven of the reduced models showed the same flux ranges at the optimal specific growth rate determined from FBA as those in the original model. In the remaining five models of C. reinhardtii, M. acetivorans, M. tuberculosis, T. maritima and S. cerevisiae, 71% to 97% of reactions shared between reduced and original model had the same flux ranges at the optimal specific growth rate determined from FBA as those in the original model (Supplementary Fig. 4). For the remaining reactions, we observed that in C. reinhardtii 94% showed lower flux ranges, while all remaining reactions for T. maritima showed larger flux ranges. In models of M. acetivorans, M. tuberculosis and T. maritima we found that 47% to 76% of the remaining reactions showed larger flux ranges in the reduced models in comparison to the original. The reason why fluxes of some reactions are larger is because the steady state flux distributions of the original network are also steady state flux distributions in the reduced network; however, the reduced network may admit additional steady states (see “Methods”). These findings demonstrate that the model reduction leaves the flux variability properties (largely) unchanged.
We also inspected the extent to which the reduction affects the predictions of reaction essentiality. To this end, we determined the essential reactions in the reactions shared between the reduced and original models. We found that every reaction that is shared between an original model and its reduced counterpart and is essential in the original model is also essential in the reduced. In addition, the set of reactions introduced during model reduction, e.g. reaction , introduced by removal of balanced complex from the path , contains no essential reactions for the models of A. niger and N. pharaonis. In contrast, for the models of M. acetivorans and T. maritima all reactions in these sets are essential. For the remaining models, the percentage of essential reactions in the set of introduced reactions is between 19 and 50%. The essentiality of reactions introduced during model reduction likely results from lumping of at least one essential reaction in the original model. Therefore, we conclude that the proposed reduction does not alter the findings regarding the essentiality of shared reactions and can be used to infer biological role of reactions.
Conclusion
The constraint-based modelling framework provides a powerful suite of approaches to study the genotype-to-phenotype mapping not only in a single cell, but also across multiple unicellular organisms in a microbial community and interconnected tissues in a multi-cellular organism. However, these advanced applications are associated to increases in the model size which lead to computational and numerical issues in predicting phenotypes of interest. Here, we proposed a fully automated (i.e. unbiased) approach for reduction of models with arbitrary or mass action kinetics at steady state. The approach is based on identifying balanced complexes which contain either a single outgoing reaction, in the case of arbitrary kinetics, or more than one outgoing reaction, for mass action kinetics. We also show that the structural condition can be efficiently identified in large-scale networks, assuming they operate at steady state. Moreover, such balanced complexes can be safely removed from the network as their removal translates into identification of substitution of variables which preserves the steady-state flux distributions of the original model. In addition, the variable substitution can be graphically represented by rewriting the network, leading to a reduced network, and can be carried out without knowledge of kinetic parameters (in the case of mass action kinetics). Most importantly, if a species appears only in balanced complexes, it can be removed from the network and, under the assumption of mass action kinetic, its steady-state concentration can be represented as functions of the steady-state concentrations of the metabolites that remain in the reduced network.
Our extensive analysis of genome-scale kinetic models endowed with mass action show that more than 99% of the metabolites can be removed following the proposed reduction. In addition, since the approach is constraint-based, it also allows us to examine if reaction reversibility or assumption on cellular tasks (e.g. growth) that are optimized have an effect on the size of the reduced model, in terms of number of species. The analysis of genome-scale metabolic models across kingdoms of life show that reaction reversibility and assumption of kinetics have the largest effect on the size of the reduced models, leading to up to 85% reduction in the number of species. Nevertheless, we show that if the reduction is performed such that no metabolite participating in the biomass reaction is removed, the reduced model is not affected with respect to predicted growth phenotypes.
We would like to emphasize that once the balanced complexes are identified, it is a choice of the modeller which balanced complexes should be removed in the specific analysis case. For instance, we showed that the removal of the balanced complexes, that do not include species that participate in a biomass reactions, does not affect the prediction of specific growth rates. In this scenario, we also showed that the results from flux variability analysis and predictions of essential reactions in reduced and original models are in very good agreement.
Since the proposed approach is based on a property that can be efficiently determined even in large-scale networks, it provides the means for reduction of multi-tissue and microbial community models. Moreover, it opens the possibility to study the relationship of key properties, such as robustness and multistationarity, between the original and reduced models. However, these applications will have to ensure imposing additional constraints on the removal of the balanced complexes that do not affect the conservation laws in the network. Finally, our approach opens the possibility for unbiased model reduction by seeking to identify other types of structural motifs in large-scale metabolic networks.
Methods
Models and their processing
The genome-scale metabolic models of twelve organisms (Supplementary Table S1), were obtained from their original publications37–48. The blocked reactions, reactions with absolute flux values less than 10–9 mmol/gDW/h, in the metabolic network were determined by Flux Variability Analysis36 and were removed from the original models. Each reversible reaction was split into two irreversible reactions. Two cases for the irreversible reactions originally declared in the model were considered: (1) all were treated as reversible (and were split, as aforementioned) or (2) were maintained as irreversible. The lower bounds for the irreversible reactions were set to zero, while the upper bounds were fixed to the maximum of the upper bounds in the original model. Optimum biomass was determined per Flux Balance Analysis49 with the assumed reaction reversibility.
Identification of balanced complexes
Let denote the non-negative matrix of complexes, with rows denoting species and columns representing complexes. The entry denotes the stoichiometry with which species enters the complex . Let denote the incidence matrix of the directed graph with nodes representing complexes and edges denoting reactions. The rows of denote the complexes and its column stand for the reactions. Since the graph is directed, each column of has precisely one − 1 and one 1 entry, corresponding to the substrate and product complexes of the respective reaction. The stoichiometric matrix is then given by .
The sum of fluxes around the complex is given by the th entry of the vector , denoted by . A complex is balanced in the set of flux distributions if for every , it holds that . This condition can be verified by determining that the minimum and maximum values of equal to 0 for each complex , separately. The latter can be ensured by solving two linear programs:
s.t.
The dependence of balanced complexes on the flux bounds is addressed in follow-up studies50.
Removal of balanced complexes
Suppose that the complex is balanced and that it participates in incoming and outgoing reactions as a product and substrate complex, respectively. Let … and … denote the substrate and product complexes of the incoming and outgoing reactions.
Let us assume that , i.e., the complex is incident on only one outgoing reaction. Due to the balancing of , . The removal of the complex without affecting the steady-state fluxes entails: (i) removal of the reactions incident on from the original network and (ii) insertion of reactions with a substrate complex from … and a product complex . This amounts to substituting every occurrence of by , preserving steady-state flux solutions (i.e. the reduced network includes all steady-state flux solutions of the original).
Let us now assume that and that the network is endowed with mass action kinetics. The flux of a reaction with complex as a substrate is given by , where and denotes the stoichiometric coefficient of species in the complex . Due to complex balancing, then, . The removal of the complex without affecting the steady-state values of the complexes entails: (i) removal of the reactions incident on from the original network and (ii) insertion of reactions with a substrate complex from … and a product complex from …. The rate constant of the reaction with and as substrate and product complexes, respectively, is given by . This amounts to substituting every occurrence of by in , preserving the steady-state flux solutions with respect to (rather than fluxes, in the case for arbitrary kinetics, above). Note that is a vector collecting over all complexes, while is the matrix with as many rows as reactions and as many columns as complexes, whose entry corresponds to the rate constant of the reaction having complex as a substrate or zero, otherwise.
Preservation of steady states in the reduction process
Let index any species/metabolite and be any steady state flux distribution. It follows from the steady state property that ; hence , where is the stoichiometric coefficient by which metabolite participates in complex .
Let us consider a single elimination step in this process, in which a particular balanced complex, say , is being removed as explained above. Due to the specific way this removal was defined, the algebraic sum of fluxes for any complex in the reduced network, that is, remains intact, while this sum for the removed complex , that is, equals zero like that of any other balanced complex. It follows that the steady state equation for metabolite , that is, still holds after the elimination.
Since the same argument holds for any arbitrary metabolite , it follows that the whole metabolic network will remain at steady state after each removal step. Therefore, any steady state of the original network is conserved at every elimination step and hence, all along the way up to the last reduced model.
Correspondence of conservation laws
The removal of a balanced complex corresponds structurally to substituting of any directed path of two reactions, and , on which the balanced complex is the middle node, with another reaction, given by reaction vector . Clearly, . It follows that every new reaction in the reduced network lies in the column span of . Denoting the stoichiometry matrix for the reduced network by , yields .Therefore, for every conservation law, that satisfies , one also has , which means the conservation law is preserved in the reduced network. . However, a conservation law in the reduced network may not necessarily correspond to one of the original network, unless further constraints are imposed on the balanced complexes being removed.
Predictions of specific growth rates
We calculate optimal specific growth rates by flux balance analysis (FBA) in models with considered irreversibility and compare the predictions before and after removal of balanced complexes. Blocked reactions are removed from the original network and each reversible reaction is split into two irreversible reactions. To avoid cases where no biomass production is ensured in the original model after removal of blocked reactions due to numerical issues, we set the lower bound of the biomass reaction to 5% of the optimal biomass obtained from the model including blocked reactions. In addition, we ensure that the biomass reaction, whose flux is optimized during FBA, is still included in the reduced model. Therefore, we only remove balanced complexes that do not include metabolites (i.e. species) participating in the biomass reactions from the original model.
Analysis of flux variability and essentiality of reactions
We conducted flux variability analysis36 at 99% of optimal biomass for all reactions that appear in both the original and reduced models considering any kinetic. We then classified the reactions into those that do and do not show differences in the flux ranges between each original and corresponding reduced model. Similar analysis was conducted with respect to reaction essentiality: A reaction that appears in both an original and corresponding reduced model was knocked out and the effect on growth was examined. The reactions were classified as those that are essential or not in each model.
Supplementary Information
Acknowledgements
Z.N and D.L are supported by the Max Planck Society.
Author contributions
Conceptualization: Z.N., data collection: A.K., formal analysis: A.K., investigation: A.K., Z.N., software: A.K., P.W., validation: A.K., writing – original draft: Z.N., writing – reviewing and editing: A.K., P.W., D.L., Z.N.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Data availability
The approach is implemented and available together with all data needed to reproduce the findings at https://github.com/ankueken/network_reduction_by_balanced_complexes.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-021-96835-1.
References
- 1.Fang X, Lloyd CJ, Palsson BO. Reconstructing organisms in silico: Genome-scale models and their emerging applications. Nat. Rev. Microbiol. 2020 doi: 10.1038/s41579-020-00440-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Silverbush D, Sharan R. A systematic approach to orient the human protein–protein interaction network. Nat. Commun. 2019 doi: 10.1038/s41467-019-10887-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Thompson D, Regev A, Roy S. Comparative analysis of gene regulatory networks: From network reconstruction to evolution. Ann. Rev. Cell Dev. Biol. 2015 doi: 10.1146/annurev-cellbio-100913-012908. [DOI] [PubMed] [Google Scholar]
- 4.Bordbar A, Monk JM, King ZA, Palsson BO. Constraint-based models predict metabolic and associated cellular functions. Nat. Rev. Genet. 2014;15:107–120. doi: 10.1038/nrg3643. [DOI] [PubMed] [Google Scholar]
- 5.Stanford NJ, et al. Systematic construction of kinetic models from genome-scale metabolic networks. PLoS ONE. 2013 doi: 10.1371/journal.pone.0079195. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Chindelevitch L, Trigg J, Regev A, Berger B. An exact arithmetic toolbox for a consistent and reproducible structural analysis of metabolic network models. Nat. Commun. 2014 doi: 10.1038/ncomms5893. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Khodayari A, Maranas CD. A genome-scale Escherichia coli kinetic metabolic model k-ecoli457 satisfying flux data for multiple mutant strains. Nat. Commun. 2016;7:13806. doi: 10.1038/ncomms13806. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Radulescu O, Gorban AN, Zinovyev A, Noel V. Reduction of dynamical biochemical reactions networks in computational biology. Front. Genet. 2012 doi: 10.3389/fgene.2012.00131. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Singh D, Lercher MJ. Network reduction methods for genome-scale metabolic models. Cell. Mol. Life Sci. 2020 doi: 10.1007/s00018-019-03383-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Tefagh M, Boyd SP. Metabolic network reductions. bioRxiv. 2019 doi: 10.1101/499251. [DOI] [Google Scholar]
- 11.Sambamoorthy G, Raman K. MinReact: A systematic approach for identifying minimal metabolic networks. Bioinformatics. 2020 doi: 10.1093/bioinformatics/btaa497. [DOI] [PubMed] [Google Scholar]
- 12.Jonnalagadda S, Srinivasan R. An efficient graph theory based method to identify every minimal reaction set in a metabolic network. BMC Syst. Biol. 2014 doi: 10.1186/1752-0509-8-28. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Clarke BL. General method for simplifying chemical networks while preserving overall stoichiometry in reduced mechanisms. J. Chem. Phys. 1992 doi: 10.1063/1.463911. [DOI] [Google Scholar]
- 14.Sáez M, Wiuf C, Feliu E. Graphical reduction of reaction networks by linear elimination of species. J. Math. Biol. 2017 doi: 10.1007/s00285-016-1028-y. [DOI] [PubMed] [Google Scholar]
- 15.Feliu E, Wiuf C. Variable elimination in chemical reaction networks with mass-action kinetics. SIAM J. Appl. Math. 2012 doi: 10.1137/110847305. [DOI] [PubMed] [Google Scholar]
- 16.Feliu, E., Lax, C., Walcher, S. & Wiuf, C. Quasi-steady state and singular perturbation reduction for reaction networks with non-interacting species. (2019).
- 17.Radulescu O, Gorban AN, Zinovyev A, Lilienbaum A. Robust simplifications of multiscale biochemical networks. BMC Syst. Biol. 2008 doi: 10.1186/1752-0509-2-86. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Rao S, der Schaft A, van Eunen K, van Bakker BM, Jayawardhana B. A model reduction method for biochemical reaction networks. BMC Syst. Biol. 2014 doi: 10.1186/1752-0509-8-52. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Roussel MR, Fraser SJ. Invariant manifold methods for metabolic model reduction. Chaos. 2001 doi: 10.1063/1.1349891. [DOI] [PubMed] [Google Scholar]
- 20.Sunnåker M, Cedersund G, Jirstrand M. A method for zooming of nonlinear models of biochemical systems. BMC Syst. Biol. 2011 doi: 10.1186/1752-0509-5-140. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Danø S, Madsen MF, Schmidt H, Cedersund G. Reduction of a biochemical model with preservation of its basic dynamic properties. FEBS J. 2006 doi: 10.1111/j.1742-4658.2006.05485.x. [DOI] [PubMed] [Google Scholar]
- 22.Schmidt H, Madsen MF, Danø S, Cedersund G. Complexity reduction of biochemical rate expressions. Bioinformatics. 2008 doi: 10.1093/bioinformatics/btn035. [DOI] [PubMed] [Google Scholar]
- 23.Feliu E, Wiuf C. Simplifying biochemical models with intermediate species. J. R. Soc. Interface. 2013 doi: 10.1098/rsif.2013.0484. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Röhl A, Bockmayr A. A mixed-integer linear programming approach to the reduction of genome-scale metabolic networks. BMC Bioinform. 2017 doi: 10.1186/s12859-016-1412-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Erdrich P, Steuer R, Klamt S. An algorithm for the reduction of genome-scale metabolic network models to meaningful core models. BMC Syst. Biol. 2015 doi: 10.1186/s12918-015-0191-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Ataman M, Hernandez Gardiol DF, Fengos G, Hatzimanikatis V. redGEM: Systematic reduction and analysis of genome-scale metabolic reconstructions for development of consistent core metabolic models. PLoS Comput. Biol. 2017 doi: 10.1371/journal.pcbi.1005444. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Baroukh C, Muñoz-Tamayo R, Steyer JP, Bernard O. DRUM: A new framework for metabolic modeling under non-balanced growth. Application to the carbon metabolism of unicellular microalgae. PLoS ONE. 2014 doi: 10.1371/journal.pone.0104499. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Burgard AP, Nikolaev EV, Schilling CH, Maranas CD. Flux coupling analysis of genome-scale metabolic network reconstructions. Genome Res. 2004 doi: 10.1101/gr.1926504. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Pantea C, Gupta A, Rawlings JB, Craciun G. The QSSA in chemical kinetics: As taught and as practiced. Nat. Comput. Ser. 2014 doi: 10.1007/978-3-642-40193-0_20. [DOI] [Google Scholar]
- 30.Davidi D, et al. Global characterization of in vivo enzyme catalytic rates and their correspondence to in vitro kcat measurements. Proc. Natl. Acad. Sci. USA. 2016;113:3401–3406. doi: 10.1073/pnas.1514240113. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Müller S, Regensburger G. Generalized mass action systems: Complex balancing equilibria and sign vectors of the stoichiometric and kinetic-order subspaces. SIAM J. Appl. Math. 2012 doi: 10.1137/110847056. [DOI] [Google Scholar]
- 32.Neigenfind J, Grimbs S, Nikoloski Z. On the relation between reactions and complexes of (bio)chemical reaction networks. J. Theor. Biol. 2013 doi: 10.1016/j.jtbi.2012.10.016. [DOI] [PubMed] [Google Scholar]
- 33.Shinar G, Feinberg M. Structural sources of robustness in biochemical reaction networks. Science. 2010;327:1389–1391. doi: 10.1126/science.1183372. [DOI] [PubMed] [Google Scholar]
- 34.Feinberg M. Chemical reaction network structure and the stability of complex isothermal reactors—I. The deficiency zero and deficiency one theorems. Chem. Eng. Sci. 1987;42:2229–2268. doi: 10.1016/0009-2509(87)80099-4. [DOI] [Google Scholar]
- 35.Khodayari A, Zomorrodi AR, Liao JC, Maranas CD. A kinetic model of Escherichia coli core metabolism satisfying multiple sets of mutant flux data. Metab. Eng. 2014;25:50–62. doi: 10.1016/j.ymben.2014.05.014. [DOI] [PubMed] [Google Scholar]
- 36.Gudmundsson S, Thiele I. Computationally efficient flux variability analysis. BMC Bioinform. 2010 doi: 10.1186/1471-2105-11-489. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Andersen MR, Nielsen ML, Nielsen J. Metabolic model integration of the bibliome, genome, metabolome and reactome of Aspergillus niger. Mol. Syst. Biol. 2008 doi: 10.1038/msb.2008.12. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Arnold A, Nikoloski Z. Bottom-up metabolic reconstruction of Arabidopsis and its application to determining the metabolic costs of enzyme production. Plant Physiol. 2014;165:1380–1391. doi: 10.1104/pp.114.235358. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Zhang Y, et al. Three-dimensional structural view of the central metabolic network of thermotoga maritima. Science. 2009 doi: 10.1126/science.1174671. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Lu H, et al. A consensus S. cerevisiae metabolic model Yeast8 and its ecosystem for comprehensively probing cellular metabolism. Nat. Commun. 2019 doi: 10.1038/s41467-019-11581-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Imam S, et al. A refined genome-scale reconstruction of Chlamydomonas metabolism provides a platform for systems-level analyses. Plant J. 2015 doi: 10.1111/tpj.13059. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Orth JD, et al. A comprehensive genome-scale reconstruction of Escherichia coli metabolism-2011. Mol. Syst. Biol. 2011 doi: 10.1038/msb.2011.65. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Benedict MN, Gonnerman MC, Metcalf WW, Price ND. Genome-scale metabolic reconstruction and hypothesis testing in the methanogenic archaeon Methanosarcina acetivorans C2A. J. Bacteriol. 2012 doi: 10.1128/JB.06040-11. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Feist AM, Scholten JCM, Palsson B, Brockman FJ, Ideker T. Modeling methanogenesis with a genome-scale metabolic reconstruction of Methanosarcina barkeri. Mol. Syst. Biol. 2006 doi: 10.1038/msb4100046. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Quek LE, Nielsen LK. On the reconstruction of the Mus musculus genome-scale metabolic network model. Genome Inform. 2008 doi: 10.1142/9781848163324_0008. [DOI] [PubMed] [Google Scholar]
- 46.Fang X, Wallqvist A, Reifman J. Development and analysis of an in vivo-compatible metabolic network of Mycobacterium tuberculosis. BMC Syst. Biol. 2010 doi: 10.1186/1752-0509-4-160. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Gonzalez O, et al. Characterization of growth and metabolism of the haloalkaliphile Natronomonas pharaonis. PLoS Comput. Biol. 2010 doi: 10.1371/journal.pcbi.1000799. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Nogales J, Palsson B, Thiele I. A genome-scale metabolic reconstruction of Pseudomonas putida KT2440: iJN746 as a cell factory. BMC Syst. Biol. 2008 doi: 10.1186/1752-0509-2-79. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Orth JD, Thiele I, Palsson BØO. What is flux balance analysis? Nat. Biotechnol. 2010;28:245–248. doi: 10.1038/nbt.1614. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Langary D, Küken A, Nikoloski Z. The unraveling of balanced complexes in metabolic networks. bioRxiv. 2021 doi: 10.1101/2021.03.23.436554. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
The approach is implemented and available together with all data needed to reproduce the findings at https://github.com/ankueken/network_reduction_by_balanced_complexes.