Author manuscript; available in PMC: 2013 Feb 28.
Published in final edited form as: Science. 2012 Aug 31;337(6098):1101–1104. doi: 10.1126/science.1216861

Network context and selection in the evolution to enzyme specificity

Hojung Nam 1,*, Nathan E Lewis 1,3,‡,*, Joshua A Lerman 2, Dae-Hee Lee 1, Roger L Chang 2, Donghyuk Kim 1, Bernhard O Palsson 1
PMCID: PMC3536066  NIHMSID: NIHMS428337  PMID: 22936779

Abstract

Enzymes are thought to have evolved highly specific catalytic activities from promiscuous ancestral proteins. By analyzing a genome-scale model of Escherichia coli metabolism, we found that 37% of its enzymes act on a variety of substrates and catalyze 65% of the known metabolic reactions. However, it is not apparent why these generalist enzymes remain. Here, we show that there are marked differences between generalist enzymes and specialist enzymes, known to catalyze a single chemical reaction on one particular substrate in vivo. Specialist enzymes (i) are frequently essential, (ii) maintain higher metabolic flux, and (iii) require more regulation of enzyme activity to control metabolic flux in dynamic environments than do generalist enzymes. Furthermore, these properties are conserved in Archaea and Eukarya. Thus, the metabolic network context and environmental conditions influence enzyme evolution toward high specificity.


Ancestral enzymes are proposed to have exhibited broad substrate specificity and low catalytic efficiency (1). Through mutation, duplication, and horizontal gene transfer, gene families diversified and promiscuous enzymes apparently were refined to exhibit specific and more efficient catalytic abilities (2, 3). Thus, today’s metabolic enzymes are commonly assumed to be “specialists,” having evolved to catalyze one reaction on a unique primary substrate in an organism. However, some enzymes are “generalists” that promiscuously catalyze reactions on a variety of substrates in vivo (2) or exhibit multifunctionality by catalyzing multiple classes of reactions, often at different active sites (4). Thus, a fundamental question arises: Why do some enzymes evolve to become specialists, whereas others retain generalist characteristics? By analyzing enzyme functions and properties in experimental data and in silico metabolic network models, we show that the in vivo biochemical network context in which an enzyme resides may influence the evolution of enzyme specificity.

How many metabolic enzymes are generalists? To answer this question, we used a comprehensive reconstruction of the Escherichia coli K-12 MG1655 metabolic network, which accounts for the metabolic functions of 1260 gene products (28% of the predicted and experimentally validated open reading frames in E. coli) (5), which contribute to 1081 enzyme complexes analyzed in this study. In the reconstruction, we define a reaction as a unique set of substrates that are chemically transformed into a unique set of products. With this definition, we classified 677 enzymes as specialists because they catalyze one unique reaction and 404 as generalists because they catalyze multiple reactions. Thus, we estimate that 37% of metabolic enzymes in E. coli are generalists, most of which exhibit substrate promiscuity (fig. S1A). Furthermore, specialist and generalist enzymes catalyze 454 and 859 metabolic reactions, respectively, distributed across many metabolic subsystems (Fig. 1, A and B). Thus, contrary to the textbook view of enzymes as “specific catalysts,” generalist enzymes have a prominent role in E. coli, catalyzing at least 65% of the nonspontaneous metabolic reactions.
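The classification rule described above can be sketched in a few lines. This is only an illustration, not the authors' pipeline: the function name `classify_enzymes` and the toy enzyme-to-reaction mapping are hypothetical, and taking the sum of specialist and generalist reactions as the denominator for the reaction share is an assumption. Under that assumption, the counts quoted in the text reproduce the reported 37% and 65%.

```python
def classify_enzymes(enzyme_reactions):
    """Split enzymes into specialists (one unique reaction) and generalists (several)."""
    specialists, generalists = set(), set()
    for enzyme, reactions in enzyme_reactions.items():
        n_unique = len(set(reactions))
        if n_unique == 1:
            specialists.add(enzyme)
        elif n_unique > 1:
            generalists.add(enzyme)
    return specialists, generalists


# Hypothetical toy mapping of enzyme -> catalyzed reactions (not from the paper).
toy_network = {
    "enzA": ["R1"],
    "enzB": ["R2", "R3", "R4"],
    "enzC": ["R5"],
}
specialists, generalists = classify_enzymes(toy_network)

# Counts quoted in the text.
n_specialists, n_generalists = 677, 404
n_serxns, n_gerxns = 454, 859

# Assumption: shares are taken over the sum of the two classes.
generalist_enzyme_share = n_generalists / (n_specialists + n_generalists)
generalist_reaction_share = n_gerxns / (n_serxns + n_gerxns)

print(f"generalist enzymes:   {generalist_enzyme_share:.1%}")
print(f"generalist reactions: {generalist_reaction_share:.1%}")
```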

Fig. 1.

(A) Specialist and generalist genes and proteins and their associated reactions were enumerated in E. coli metabolism. (B) Several metabolic subsystems were enriched in specialist enzyme reactions (SERxns) or generalist enzyme reactions (GERxns) in E. coli (hypergeometric P ≤ 0.05). (C) Reaction flux magnitudes were rank-ordered and binned in histograms for each unique media condition. A heat map was used to visualize histograms for all 174 media conditions (columns) with each row representing bins spanning the given flux rank ranges (y axis). Color intensity shows histogram bin height, corresponding to the percentage of reactions in the bin. Example histograms are provided to the right for one representative condition. SERxns tend to have a higher flux, but low-flux SERxns are enriched in enzymes that synthesize low-abundance essential cell components, such as cofactors and prosthetic groups (fig. S4C). (D) Genes for specialist enzymes are more frequently essential in vivo. (E) In silico, few essential GERxns were identified for growth on glucose minimal medium. (F) For all 174 simulated growth conditions, SERxns are significantly enriched among in silico-predicted reactions essential for growth, representing 56% of the essential reactions (inset).

We performed several network-wide analyses to provide additional support for our estimates and the classification. First, we found that almost all genes in the network have been well-characterized and studied in more than 61,727 published studies (fig. S1D). Second, we found no correlation between our classification and knowledge depth, i.e., neither specialist nor generalist enzymes had been studied in more depth (fig. S1E). Third, our generalist enzymes probably did not include many latent promiscuous reactions measured in vitro that are unlikely to occur in vivo, because 85% of the generalist enzyme reactions (GERxns) were active in silico in common growth conditions. This is the same percentage seen for specialist enzyme reactions (SERxns) (fig. S2). Fourth, because enzyme classification may vary with further study, we tested the sensitivity of the results presented in this work. We found the results to be qualitatively robust to improvements in the metabolic network from the discovery of new enzymes, variations in enzyme classification, and the exclusion of promiscuous enzymes or multifunctional enzymes from the generalist class (fig. S3). Although transporter reactions were not included in the groups of SERxns or GERxns, their inclusion would not qualitatively change the results in this work (fig. S3). Thus, the classification and results from our subsequent analysis are robust.

Why are so many generalist enzymes evolutionarily retained, whereas others became specialists? Demands for higher metabolic flux may provide an evolutionary selective pressure to enhance an enzyme’s catalytic rate and reduce the required enzyme concentration. However, catalytic improvements for one substrate of a generalist enzyme can suppress other catalytic activities (6). To determine if specialists maintain higher flux, we estimated the steady-state metabolic flux rates (7) for all E. coli enzymes using a genome-scale metabolic network model. We employed a Markov chain Monte Carlo sampling method (8) to simulate flux on 174 media conditions with different nutrient compositions (9). For each growth condition, the median flux for each reaction was rank-ordered to determine the relative flux among reactions.
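The rank-ordering step can be illustrated with a minimal sketch, assuming that flux samples for one growth condition are already available as a list per reaction. The sampling itself is not shown. The function name, the toy values, and the choice to rank by the median of absolute flux are assumptions made for illustration; ties are broken by input order.

```python
from statistics import median


def rank_reactions_by_median_flux(samples):
    """Rank reactions within one condition; rank 1 has the largest median |flux|.

    samples: dict mapping reaction id -> list of sampled flux values.
    """
    median_flux = {
        rxn: median(abs(v) for v in values) for rxn, values in samples.items()
    }
    ordered = sorted(median_flux, key=lambda rxn: median_flux[rxn], reverse=True)
    return {rxn: position + 1 for position, rxn in enumerate(ordered)}


# Hypothetical flux samples for a single growth condition.
toy_samples = {
    "R1": [5.0, 6.0, 7.0],
    "R2": [-10.0, -9.0, -11.0],  # reverse direction, large magnitude
    "R3": [0.1, 0.2, 0.0],
}
ranks = rank_reactions_by_median_flux(toy_samples)
print(ranks)
```

Ranking within each condition, rather than comparing raw flux values, keeps conditions with very different overall uptake rates comparable.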

Across all simulated growth conditions, SERxns maintained higher flux than GERxns (Fig. 1C and fig. S4). Gene duplications may have been fixed in the population when specialization occurred to increase activity of high-flux enzymes. Higher activity would permit lower enzyme concentrations, thereby offsetting the cost of duplication (10). Consistent with this reasoning, kcat values are significantly higher for high-flux specialist enzymes than for all other enzymes (fig. S5C, Wilcoxon P = 2.8 × 10⁻⁷).

Although flux level may contribute to enzyme specialization, gene essentiality may also contribute. High substrate affinity for essential enzymes could mitigate substrate competition in the synthesis of necessary biomass components, irrespective of flux level. Consistent with this hypothesis, we found that essential enzymes have lower Km values and therefore higher substrate affinity (fig. S5F, Wilcoxon P = 1.1 × 10⁻¹¹). Furthermore, specialist enzymes are enriched among experimentally determined essential genes (11) (hypergeometric P = 8.65 × 10⁻⁵, Fig. 1D). In silico simulation also demonstrated that cell growth rarely directly depends on flux through generalist enzymes (Fig. 1E), whereas many SERxns were essential for growth across all 174 tested media conditions (Fig. 1F and fig. S6).
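Enrichment statements of this kind rest on the upper tail of the hypergeometric distribution. The sketch below shows that calculation using only the standard library. The function name and the toy counts are hypothetical and are not the counts behind the reported P values.

```python
from math import comb


def hypergeom_enrichment_p(population, successes, draws, observed):
    """Upper-tail probability P(X >= observed) for a hypergeometric variable.

    population: total number of items (e.g., all enzymes)
    successes:  items carrying the label of interest (e.g., essential)
    draws:      size of the tested subset (e.g., specialists)
    observed:   labelled items found in the subset
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical toy numbers: 10 enzymes, 5 essential, 4 specialists,
# 3 of which are essential.
p_value = hypergeom_enrichment_p(population=10, successes=5, draws=4, observed=3)
print(round(p_value, 4))  # 55/210, about 0.2619
```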

Gene essentiality (12, 13) and reaction fluxes often vary (8, 14, 15) because natural environments are dynamic and nutrient concentrations fluctuate in the microbial microenvironment (16). The need to regulate reaction flux in dynamic environments could induce gene duplication and enzyme specialization in order to simplify the combinatorial complexity of regulating multiple reactions on a single enzyme (e.g., see serine hydroxymethyltransferase in fig. S7). To test this hypothesis, we identified enzymes that will require more metabolic regulation in dynamic environments by simulating changes in carbon source and electron acceptors for E. coli. For each substrate shift, the model predicted whether reaction flux should increase or decrease, and these predictions were consistent with measured differential gene expression (fig. S8) (17).

Across all shifts in growth media, there was a considerable difference in the percentages of active SERxns and GERxns that significantly changed their flux between growth conditions (Fig. 2A). SERxn fluxes were often more than twice as likely to change as GERxn fluxes. Thus, flux through SERxns is considerably more sensitive to environmental change, whereas GERxn fluxes vary less. To examine if this is a general property, we simulated 15,051 pairwise environmental shifts. In 96% of these shifts, SERxns changed more frequently than GERxns (Fig. 2B). This difference was strongest for environmental shifts that cause more than 8% of the reactions to change flux (fig. S9). Because SERxns are subject to greater flux changes in nutritionally dynamic environments, it seems that duplication may have occurred to allow more focused regulation of fluxes. This duplication would be reinforced as the enzymes enhance their catalytic specificity.
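The figure of 15,051 shifts equals the number of unordered pairs among 174 conditions. The sketch below checks that count and illustrates the per-shift comparison on toy data. The fixed change threshold, the function names, and the toy flux values are assumptions; the study assessed significant flux changes from sampled flux distributions, which this simplified version does not reproduce.

```python
from itertools import combinations
from math import comb

# 174 media conditions give C(174, 2) unordered pairwise shifts.
N_CONDITIONS = 174
n_shifts = comb(N_CONDITIONS, 2)  # 15051, matching the number in the text


def fraction_changed(flux_a, flux_b, reactions, threshold=1e-6):
    """Fraction of active reactions whose flux differs between two conditions."""
    active = [r for r in reactions if flux_a[r] != 0 or flux_b[r] != 0]
    if not active:
        return 0.0
    changed = sum(abs(flux_a[r] - flux_b[r]) > threshold for r in active)
    return changed / len(active)


def share_of_shifts_ser_changes_more(fluxes, serxns, gerxns):
    """Share of pairwise shifts in which SERxns change more often than GERxns."""
    shifts = list(combinations(sorted(fluxes), 2))
    wins = sum(
        fraction_changed(fluxes[a], fluxes[b], serxns)
        > fraction_changed(fluxes[a], fluxes[b], gerxns)
        for a, b in shifts
    )
    return wins / len(shifts)


# Hypothetical fluxes for three conditions; S* are SERxns, G* are GERxns.
toy_fluxes = {
    "glc": {"S1": 10.0, "S2": 5.0, "G1": 1.0, "G2": 1.0},
    "ace": {"S1": 2.0, "S2": 5.0, "G1": 1.0, "G2": 1.0},
    "gly": {"S1": 6.0, "S2": 3.0, "G1": 1.0, "G2": 2.0},
}
share = share_of_shifts_ser_changes_more(toy_fluxes, ["S1", "S2"], ["G1", "G2"])
print(n_shifts, share)
```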

Fig. 2.

(A) Phenotypic measurements, such as substrate uptake rates (9), were acquired and used to parameterize the model to predict the percentage of reactions that change flux in four nutritional shifts. (B) A systematic computational screen of 15,051 shifts between 174 carbon substrates shows that SERxns tend to change more frequently. By rank-ordering shifts based on the number of enzyme-catalyzed reaction fluxes that change, the difference is particularly clear for shifts that cause more reactions to change. Most cases in which there is only a weak difference involve shifts between two similar primary carbon substrates, as measured by their Tanimoto coefficients (inset; Tanimoto coefficients are averaged across sets of 100 shifts).

In dynamic environments, metabolic flux can be regulated through metabolite-protein interactions or posttranslational modifications (PTMs) (18, 19). We quantified the association of metabolic regulation with enzyme specificity, using a few hundred metabolite-mediated regulatory interactions obtained from the EcoCyc database and enzyme PTMs from mass spectrometry studies in E. coli (9). Allosteric, uncompetitive, and noncompetitive regulatory interactions are enriched among specialists (hypergeometric P = 9 × 10⁻⁴), as were PTMs (hypergeometric P = 5 × 10⁻³). Metabolic regulation was less prevalent among generalists, consistent with the decreased need to change flux through their reactions in dynamic environments. Moreover, fluxes for reactions catalyzed by the same generalist often covary, thereby reducing requirements for more complex regulation (fig. S10).

To further assess the association of specificity with regulation, we quantified how frequently each reaction changed flux across all 15,051 simulated media shifts. K-means clustering elucidated three dominant reaction clusters (Fig. 3A). Two clusters show frequent changes in flux, and these were enriched in specialists, particularly those associated with central and amino acid metabolism (Fig. 3B). The reaction cluster with few changes in flux was significantly enriched in generalists (Fig. 3C). PTMs and small-molecule-mediated allosteric regulation were enriched within the cluster experiencing the most change in flux (hypergeometric P = 5 × 10⁻³), but depleted from the cluster dominated by generalists (hypergeometric P = 3 × 10⁻³; Fig. 3D and fig. S11). Thus, enzymes exhibiting more extensive metabolic regulation tend to have evolved increased enzyme specificity.
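As an illustration of the clustering step, the sketch below runs a plain Lloyd-style k-means with k = 3 on a small binary matrix in which each row is a reaction and each column a media shift (1 = flux changed). The matrix, the hand-picked initial centroids, and the function name are hypothetical; the text does not state which k-means implementation or initialization was used.

```python
def kmeans(points, k, init_indices, max_iter=100):
    """Minimal Lloyd's algorithm; returns (labels, centroids)."""
    centroids = [list(points[i]) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical change matrix: 6 reactions x 6 media shifts.
change_matrix = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
]
# Assumption: initial centroids are chosen by hand for reproducibility.
labels, centroids = kmeans(change_matrix, k=3, init_indices=[0, 2, 4])

# Mean change frequency per cluster, from most to least sensitive.
frequency = [sum(centroid) / len(centroid) for centroid in centroids]
print(labels, [round(f, 3) for f in frequency])
```

With real data, the result of k-means depends on initialization, so repeated runs with different starting centroids would be a sensible check.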

Fig. 3.

(A) Clustering reactions that change (blue) or do not change (white) across 15,051 different media shifts (x axis) yields three distinct groups, (B) which are each enriched in unique metabolic subsystems. (C) Specialist enzymes are enriched in more sensitive clusters, whereas generalist enzymes are enriched in the cluster with few flux changes. (D) The number of PTMs (acetylation, phosphorylation, and/or succinylation) on enzymes increases with sensitivity of clusters.

The aforementioned properties show how enzyme specificity correlates with holistic functions of the E. coli metabolic network. However, these properties should be conserved if they influence selection of enzyme specificity in protein evolution. Thus, we examined their conservation using genome-scale metabolic models of microbes from the other domains of life, including the archaeon Methanosarcina barkeri (20) and the eukaryotes Saccharomyces cerevisiae (21) and Chlamydomonas reinhardtii (22). Similar to E. coli, the three organisms contain numerous generalist enzymes. Common growth conditions were simulated for each organism to estimate metabolic flux. In each organism, specialist enzymes maintained a higher flux on average than generalist enzymes. Moreover, when environmental shifts were simulated for each organism, generalist enzymes were less likely to change flux between growth conditions (fig. S12). Even as microbes diversified, high flux and a need for focused regulation in varying environments remained as general features of specialist enzymes.

It is generally believed that highly promiscuous ancestral enzymes eventually evolved to become specific and highly efficient (1). However, many current enzymes are only moderately efficient (23), and there are numerous generalists. Thus, evolution has not converged to a point where metabolic enzymes are all specialists. Our results suggest that this convergence has been hindered in part by the lower essentiality, smaller flux, and reduced regulatory requirements of generalist enzymes, including those that are multifunctional and those exhibiting substrate promiscuity (figs. S3B and S4C). The specialization of these enzymes may not provide adequate fitness advantages to offset the cost of gene duplication and maintenance (10) that accompanies the separation of catalytic functions into several specialists. In addition, these selective pressures may not influence some classes of enzymes if their generalist activities are desirable, such as in the degradation and clearance of diverse toxins (24) or the synthesis of structural lipids or glycoconjugates. However, our results suggest that many metabolic enzymes will specialize when an environmental change elicits a fitness challenge that causes a generalist to contribute to the high-flux (8) or essential biomass-producing core (25) of metabolism, or if new environmental fluctuations require more focused regulation of flux. Preliminary analysis suggests that potential examples of this divergence include serine hydroxymethyltransferase and its isozyme LtaE (fig. S7) or pyruvate formate lyase and TdcE (see supplementary material).

Our results demonstrate that the metabolic network, as a whole, supports organismal survival and influences cell physiology in a given environment. By analyzing the functions of its pathways and using biomolecular networks to integrate many disparate data types into a coherent whole, we show that systems biology allows the elucidation of selection pressures that are not apparent at the level of a single enzyme (26–29).

Supplementary Material

Combined Supp
Database

Acknowledgments

We thank D. Zielinski for insightful discussion on this work. This work was supported by NIH, NSF, and U.S. Department of Energy grants 2R01GM057089-13, NSF GK-12 742551, DE-SC0004917, and DE-FG02-09ER25917. Data are available at the NCBI Gene Expression Omnibus (GEO) database (GSE34631).

Footnotes

Supplementary Materials:

Materials and Methods

Supporting Text and Figs. S1 to S17

Tables S1 to S2

Caption for databases S1

References (30–78)

References and Notes

1. Jensen RA. Enzyme recruitment in evolution of new function. Annu Rev Microbiol. 1976;30:409. doi: 10.1146/annurev.mi.30.100176.002205.
2. Khersonsky O, Tawfik DS. Enzyme promiscuity: a mechanistic and evolutionary perspective. Annu Rev Biochem. 2010;79:471. doi: 10.1146/annurev-biochem-030409-143718.
3. Innan H, Kondrashov F. The evolution of gene duplications: classifying and distinguishing between models. Nat Rev Genet. 2010;11:97. doi: 10.1038/nrg2689.
4. Khersonsky O, Malitsky S, Rogachev I, Tawfik DS. Role of chemistry versus substrate binding in recruiting promiscuous enzyme functions. Biochemistry. 2011;50:2683. doi: 10.1021/bi101763c.
5. Feist AM, et al. A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol. 2007;3:121. doi: 10.1038/msb4100155.
6. Aharoni A, et al. The ‘evolvability’ of promiscuous protein functions. Nat Genet. 2005;37:73. doi: 10.1038/ng1482.
7. Lewis NE, Nagarajan H, Palsson BO. Constraining the metabolic genotype-phenotype relationship using a phylogeny of in silico methods. Nat Rev Microbiol. 2012;10:291. doi: 10.1038/nrmicro2737.
8. Almaas E, Kovacs B, Vicsek T, Oltvai ZN, Barabasi AL. Global organization of metabolic fluxes in the bacterium Escherichia coli. Nature. 2004;427:839. doi: 10.1038/nature02289.
9. Materials and methods are available as supporting material on Science Online.
10. Wagner A. Energy costs constrain the evolution of gene expression. J Exp Zool B Mol Dev Evol. 2007;308B:322. doi: 10.1002/jez.b.21152.
11. Baba T, et al. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol. 2006;2:2006.0008. doi: 10.1038/msb4100050.
12. Papp B, Pal C, Hurst LD. Metabolic network analysis of the causes and evolution of enzyme dispensability in yeast. Nature. 2004;429:661. doi: 10.1038/nature02636.
13. Deutscher D, Meilijson I, Kupiec M, Ruppin E. Multiple knockout analysis of genetic robustness in the yeast metabolic network. Nat Genet. 2006;38:993. doi: 10.1038/ng1856.
14. Bordel S, Agren R, Nielsen J. Sampling the solution space in genome-scale metabolic networks reveals transcriptional regulation in key enzymes. PLoS Comput Biol. 2010;6:e1000859. doi: 10.1371/journal.pcbi.1000859.
15. Schuetz R, Kuepfer L, Sauer U. Systematic evaluation of objective functions for predicting intracellular fluxes in Escherichia coli. Mol Syst Biol. 2007;3:119. doi: 10.1038/msb4100162.
16. Gur E, Biran D, Ron EZ. Regulated proteolysis in Gram-negative bacteria - how and when? Nat Rev Microbiol. 2011;9:839. doi: 10.1038/nrmicro2669.
17. Lewis NE, Cho BK, Knight EM, Palsson BO. Gene expression profiling and the use of genome-scale in silico models of Escherichia coli for analysis: providing context for content. J Bacteriol. 2009;191:3437. doi: 10.1128/JB.00034-09.
18. Zhang Z, et al. Identification of lysine succinylation as a new post-translational modification. Nat Chem Biol. 2011;7:58. doi: 10.1038/nchembio.495.
19. Gerosa L, Sauer U. Regulation and control of metabolic fluxes in microbes. Curr Opin Biotechnol. 2011;22:566. doi: 10.1016/j.copbio.2011.04.016.
20. Feist AM, Scholten JC, Palsson BO, Brockman FJ, Ideker T. Modeling methanogenesis with a genome-scale metabolic reconstruction of Methanosarcina barkeri. Mol Syst Biol. 2006;2:2006.0004. doi: 10.1038/msb4100046.
21. Mo ML, Palsson BO, Herrgard MJ. Connecting extracellular metabolomic measurements to intracellular flux states in yeast. BMC Syst Biol. 2009;3:37. doi: 10.1186/1752-0509-3-37.
22. Chang RL, et al. Metabolic network reconstruction of Chlamydomonas offers insight into light-driven algal metabolism. Mol Syst Biol. 2011;7:518. doi: 10.1038/msb.2011.52.
23. Bar-Even A, et al. The moderately efficient enzyme: evolutionary and physicochemical trends shaping enzyme parameters. Biochemistry. 2011;50:4402. doi: 10.1021/bi2002289.
24. Morar M, Wright GD. The genomic enzymology of antibiotic resistance. Annu Rev Genet. 2010;44:25. doi: 10.1146/annurev-genet-102209-163517.
25. Almaas E, Oltvai ZN, Barabasi AL. The activity reaction core and plasticity of metabolic networks. PLoS Comput Biol. 2005;1:e68. doi: 10.1371/journal.pcbi.0010068.
26. Copley SD. Toward a systems biology perspective on enzyme evolution. J Biol Chem. 2012;287:3. doi: 10.1074/jbc.R111.254714.
27. Papp B, Notebaart RA, Pal C. Systems-biology approaches for predicting genomic evolution. Nat Rev Genet. 2011;12:591. doi: 10.1038/nrg3033.
28. Nam H, Conrad TM, Lewis NE. The role of cellular objectives and selective pressures in metabolic pathway evolution. Curr Opin Biotechnol. 2011;22:595. doi: 10.1016/j.copbio.2011.03.006.
29. Carbonell P, Lecointre G, Faulon JL. Origins of specificity and promiscuity in metabolic networks. J Biol Chem. 2011;286:43994. doi: 10.1074/jbc.M111.274050.
