Abstract
Recent studies proposed that enzyme active sites induce evolutionary constraints at long distances. The physical origin of such long-range evolutionary coupling is unknown. Here, I use a recent biophysical model of evolution to study the relationship between physical and evolutionary couplings on a diverse data set of monomeric enzymes. I show that evolutionary coupling is not universally long range. Rather, its range varies widely among enzymes, from 2 to 20 Å. Furthermore, the evolutionary coupling range of an enzyme is not informative about the underlying physical coupling, which is short range for all enzymes. Instead, evolutionary coupling range is determined by functional selection pressure.
Significance
Until recently, only residues near enzyme active sites were thought to be evolutionarily constrained. However, recent studies proposed that active sites induce long-range evolutionary constraints. This seems to conflict with the common finding that physical couplings in proteins are short range, which raises the question of how short-range physical couplings may cause long-range evolutionary couplings. Here, I show that the function that maps physical coupling into evolutionary coupling depends on functional selection pressure. Under weak selection, both couplings are similarly short range; under strong selection, short-range physical coupling is nonlinearly turned into long-range evolutionary coupling. Thus, because selection pressure varies hugely among enzymes, evolutionary coupling range varies widely, from very short (2 Å) to very long (20 Å).
Main text
As enzymes evolve, different sites evolve at different rates. The main reason for such variation of evolutionary rate among sites within proteins is selection for stability (1, 2, 3, 4). Until recently, activity constraints were thought to affect just the few residues directly involved in catalysis and their immediate neighbors (1, 5). However, recent studies have reported that active sites influence evolutionary rates at long distances, slowing down the evolution of residues far from the active site (6, 7, 8, 9, 10). It seems reasonable to assume that such long-range evolutionary coupling could result from long-range physical couplings, such as those involved in allosteric mutations (11, 12). This would align with the notion that enzymes are evolutionarily designed to optimize long-range coupling (13, 14, 15). However, except perhaps for a few allosteric residues, I would expect physical coupling to be a short-range, exponentially decreasing function of distance, as is characteristic of indirect couplings transmitted through the contact network (14, 16, 17, 18, 19, 20). If this is the case, long-range evolutionary coupling would be left in need of an explanation.
The aim of this work is to verify whether physical coupling is short-range and, in that case, to study how such a short-range physical coupling may give rise to a long-range evolutionary coupling. To this end, I use the Stability-Activity Model of enzyme evolution, MSA (21). I previously showed that this model reproduces quantitatively the observed slow increase of evolutionary rate with distance from the active site that led to the proposal of long-range evolutionary couplings (7). This makes MSA suitable for exploring the physical underpinnings of long-range evolutionary coupling.
Before describing the model, I start with some definitions. The evolutionary rate, K, is the number of amino acid substitutions per unit time along an evolutionary trajectory. The neutral rate is the rate for the case in which all mutations are neutral; ω is the rate relative to the neutral rate; and the evolutionary slowdown is the relative slowdown with respect to the neutral evolution case. As selection pressure increases, evolution slows down: K and ω decrease, and the slowdown increases. K, ω, and the slowdown contain exactly the same information. Therefore, in what follows, to measure the effect of selection on evolution, I will mostly use the evolutionary slowdown.
With the help of the MSA model, I define physical and evolutionary coupling measures, and derive the formula that relates them. The MSA model is described in detail in the Supporting material, Section 1 and Section 3, and in (21). MSA predicts that the evolutionary slowdown of an enzyme residue r due to selection on stability and activity is given by the following equation (Eq. S19):
[Equation 1; see Eq. S19 in the Supporting material]
Equation 1 contains the mutational changes of folding free energy and of activation free energy, two positive parameters that represent selection pressure on stability and on activity, respectively, and an average over mutations. It relates the evolutionary slowdown of residue r to the effects of mutating this residue on stability and activity. Because here I am only interested in selection on activity, I consider a hypothetical scenario in which selection on stability is turned off. With selection on stability turned off in Eq. 1, it follows that the evolutionary slowdown of site r due to selection on activity is given by the following equation (Eq. S20):
[Equation 2; see Eq. S20 in the Supporting material]
The mutational activation free energy change is due to the distortion of the active site caused by mutating residue r (21) (Eq. S45; Supporting material, Section 1.1 and Section 3.3). Therefore, this free energy change represents the physical coupling between the enzyme’s active site and residue r. This coupling causes the evolutionary slowdown of residue r due to selection on activity, which, therefore, can be considered a measure of evolutionary coupling. Thus, Eq. 2 governs how evolutionary coupling depends on physical coupling and on functional selection pressure. For notational simplicity, I will drop the explicit reference to residue r whenever possible.
I studied the relationship between physical and evolutionary couplings on a data set of monomeric enzymes used previously (21) (Supporting material, Section 1.2). Briefly, for each protein, I calculated the mutational changes of folding free energy and of activation free energy for all mutations at all residues using the linearly forced elastic network model. Then, I obtained the two selection parameters of the model by fitting MSA predictions to empirical rates. Finally, I calculated the residue-dependent evolutionary couplings and physical couplings. For details of the calculation, see Supporting material, Section 1.1. I consider MSA to be validated in a previous study (21). However, for completeness, in Supporting material, Section 2, I show the excellent agreement between MSA predictions and empirical rates (Supporting material, Section 2.1) and I discuss the adequacy of using the linearly forced elastic network model to calculate the two free energy changes (Supporting material, Section 2.2). In what follows, I focus on the study of physical and evolutionary couplings.
For clarity, I start by considering three illustrative examples (Fig. 1). I measure coupling range using the distance at which coupling is half of the maximum. Physical coupling is similar for the three examples (1OYG, 1QK2, and 1PMI): it decreases exponentially with increasing distance, and it has very short range (Fig. 1 A). In contrast, evolutionary coupling varies among the examples: it is not an exponential but a sigmoid, and its range varies widely (Fig. 1 B). It will be shown below that this variation of evolutionary coupling range is due to the variation of the functional selection pressure acting on these three enzymes (the activity selection parameter is 29.2, 80.2, and 800 for 1OYG, 1QK2, and 1PMI, respectively).
Figure 1.
Coupling between the active site and other residues for three illustrative examples. The three cases shown are the enzymes with Protein Data Bank identity codes 1OYG, 1QK2, and 1PMI. (A) Physical coupling, as measured by the change in activation free energy that results from mutations; each point corresponds to a protein site; a site’s value is the average of this change over mutations at that site; the smooth line is an exponential fit as a function of d, the distance from the closest active-site residue, measured in Å; the point at which coupling is half of its maximum is displayed in black; the insets show the three-dimensional (3D) protein structures colored from yellow to red according to increasing physical coupling. (B) Evolutionary coupling, as measured by the relative slowdown of evolution due to selection on activity; each point corresponds to a protein site; the smooth line is a function fit to the points; the point at which coupling is half of the maximum is displayed in black; the insets show the 3D protein structures colored from yellow to red according to increasing evolutionary coupling. For the sake of comparison, couplings are scaled so that the maximum of each smooth fit is 1. 3D images were made with https://3dproteinimaging.com/protein-imager (22).
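The half-maximum range used in Fig. 1 can be illustrated with a minimal sketch. It assumes an exponential of the form c0·exp(−d/d0), for which the half-maximum distance is d0·ln 2. The functional form, the names `fit_exponential` and `half_maximum_distance`, the log-linear least-squares fit, and the synthetic data are all assumptions made for illustration; they are not the fitting procedure or the data of this work.

```python
import math

def fit_exponential(distances, couplings):
    """Fit coupling = c0 * exp(-d / d0) by least squares on ln(coupling)."""
    n = len(distances)
    logs = [math.log(c) for c in couplings]
    mean_d = sum(distances) / n
    mean_log = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in distances)
    sxy = sum((d - mean_d) * (y - mean_log) for d, y in zip(distances, logs))
    slope = sxy / sxx                      # equals -1 / d0
    intercept = mean_log - slope * mean_d  # equals ln(c0)
    return math.exp(intercept), -1.0 / slope

def half_maximum_distance(d0):
    """Distance at which c0 * exp(-d / d0) falls to half of its value at d = 0."""
    return d0 * math.log(2.0)

# Synthetic, noise-free couplings (hypothetical values, not from the data set).
distances = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
couplings = [1.5 * math.exp(-d / 2.5) for d in distances]

c0, d0 = fit_exponential(distances, couplings)
print(f"c0 = {c0:.3f}, d0 = {d0:.3f} Å, "
      f"half-maximum distance = {half_maximum_distance(d0):.3f} Å")
```

For a pure exponential the range depends only on the decay length d0, not on the amplitude c0, which is why scaling the couplings does not change the range.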
Fig. 2 shows the distance dependence of physical and evolutionary couplings for all the enzymes studied. Physical coupling is a short-range exponential decline with distance, very similar for all cases (Fig. 2 A). In contrast, the distance dependence of evolutionary coupling varies among proteins, from a short-range exponential decline to a long-range sigmoidal decline (Fig. 2 B). Because physical coupling is very similar for all enzymes, from Eq. 2, it follows that the variation of evolutionary coupling among enzymes must be determined by the parameter that represents selection pressure on activity. This is confirmed in Fig. 2 C, which shows that evolutionary coupling range is independent of physical coupling range, and in Fig. 2 D, which shows that the variation of evolutionary coupling range is almost completely explained by this selection pressure parameter.
Figure 2.
Evolutionary coupling range increases with selection pressure on activity. Coupling between the active site and other residues for each protein of a data set of 157 monomeric enzymes of diverse sizes, structures, and functions. (A) Physical coupling, as measured by the activation free energy change averaged over mutations at the mutated site; each line is the smooth fit for one protein of the data set, as a function of d, the distance from the closest active-site residue in Å. (B) Evolutionary coupling, as measured by the relative slowdown due to selection on activity; each line is the smooth fit for one protein of the data set. Couplings are scaled so that the maximum of each smooth fit is 1. (C) Range of evolutionary coupling, measured by the distance at which evolutionary coupling becomes half of its maximum, versus range of physical coupling, measured by the distance at which physical coupling becomes half of its maximum. (D) Range of evolutionary coupling versus functional selection pressure, measured by the model parameter that represents selection pressure on activity. In (C) and (D), each point represents one protein, ρ is Spearman’s correlation coefficient, p is the p-value, and the lines are local regression fits.
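Panels (C) and (D) of Fig. 2 report Spearman’s correlation coefficient, which is the Pearson correlation of the ranks of the two variables. The following is a minimal sketch of that calculation, not the analysis code of this work: the function names and the five data points are hypothetical, and the p-value is not computed.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical selection pressures and coupling ranges (Å), one pair per protein.
pressures = [1.0, 10.0, 50.0, 200.0, 1000.0]
ranges = [2.1, 4.5, 4.4, 12.0, 19.5]
rho = spearman(pressures, ranges)
print(f"Spearman rho = {rho:.2f}")
```

Because only ranks enter, the coefficient is insensitive to the strongly nonlinear scale of the selection pressure parameter, which is a reasonable property when that parameter spans several orders of magnitude.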
Thus, according to Fig. 2, evolutionary coupling range varies widely among enzymes, from 2 to 20 Å, as a result of the variation of the activity selection parameter over more than four orders of magnitude. This huge variation of the parameter represents a variation of the functional selection pressure under which enzymes evolve (Supporting material, Section 2.3). The variation of range can be explained by Eq. 2, which nonlinearly maps a short-range, exponentially decreasing physical coupling into a sigmoidally decreasing evolutionary coupling whose range increases with selection pressure (Supporting material, Section 2.4). In summary, the fundamental finding of this work is that long-range evolutionary coupling is not due to enzymes being particularly well designed for long-range physical coupling; rather, it is a consequence of a nonlinear amplification of physical coupling under strong functional selection pressure.
The previous findings provide a mechanism that explains how functional constraints slow down enzyme evolution. A priori, the decrease of the rate of evolution with increasing selection pressure could be uniformly distributed among all residues. However, this work implies a different picture: increasing functional selection pressure increases the range of influence of the enzyme’s active site on other residues (Fig. 2 D); as this range increases, more sites become functionally constrained (Fig. 3 A) and the active site becomes more tightly coupled to the rest of the protein (Fig. 3 B); as a result, the enzyme’s evolution slows down (Fig. 3 C). In this work, I have derived the activity selection parameter from patterns of rate variation among sites within proteins. However, the model predicts that this parameter should also influence rate variation among proteins, which would connect rate variation within proteins with rate variation among proteins. Further work is needed to test this important prediction.
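The nonlinear amplification described above can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that the slowdown saturates as 1 − exp(−a·x), where x is an exponentially decaying physical coupling and a is a selection pressure parameter. This saturating form, the decay length of 3 Å, and all function names are assumptions of the sketch and are not taken from Eq. 2.

```python
import math

def physical_coupling(d, d0=3.0):
    """Exponentially decaying physical coupling, equal to 1 at d = 0 (assumed)."""
    return math.exp(-d / d0)

def evolutionary_coupling(d, a, d0=3.0):
    """Assumed saturating map from physical coupling to slowdown."""
    return 1.0 - math.exp(-a * physical_coupling(d, d0))

def half_maximum_distance(coupling, d_max=60.0, step=0.01):
    """First grid distance at which the coupling is at most half its d = 0 value."""
    half = 0.5 * coupling(0.0)
    n_steps = int(round(d_max / step))
    for i in range(n_steps + 1):
        d = i * step
        if coupling(d) <= half:
            return d
    return math.inf

pressures = [0.1, 1.0, 10.0, 100.0, 1000.0]
ranges = [
    half_maximum_distance(lambda d, a=a: evolutionary_coupling(d, a))
    for a in pressures
]

print(f"physical range: {half_maximum_distance(physical_coupling):.2f} Å")
for a, r in zip(pressures, ranges):
    print(f"a = {a:7.1f}  evolutionary range = {r:5.2f} Å")
```

Under this assumed form, weak selection leaves the evolutionary range close to the physical range, whereas strong selection saturates the slowdown near the active site and pushes the half-maximum point outward, even though the physical coupling is unchanged.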
Figure 3.
Increasing evolutionary coupling range slows down enzyme evolution. Evolutionary coupling range is measured by the distance at which evolutionary coupling is half its maximum. (A) Increase of the fraction of functionally constrained sites; the number of activity-constrained residues was calculated from the per-residue values of the slowdown due to selection on activity. So defined, this number, which varies between 1 and the total number of sites, measures how distributed over sites the slowdown is. (B) Increase of overall enzyme coupling, measured by the slowdown due to selection on activity averaged over residues. (C) Predicted decrease of the protein rate of evolution, measured by the relative rate ω averaged over residues; ρ is Spearman’s correlation coefficient, p is the p-value. In all panels, each point represents one protein and lines are local regression fits.
To finish, I mention two further research directions suggested by this work. First, for enzymes, functional selection pressure depends on metabolic role (23, 24, 25, 26). Specifically, the main functional constraints are enzyme-specific metabolic flow and enzyme essentiality (24). Therefore, these properties should correlate with the activity selection parameter of this work, and, as a consequence, metabolic role should affect evolutionary coupling range. This prediction should be verified. Second, these findings suggest the intriguing possibility of manipulating evolutionary coupling range by adjusting selection pressure, which could be explored using enzyme evolution experiments.
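The formula for the number of activity-constrained residues is not reproduced in the caption above. One measure with the stated bounds (1 when a single site carries all the slowdown, the total number of sites when the slowdown is spread evenly) is the inverse participation ratio of the normalized per-site slowdowns. The sketch below uses it only as an assumed stand-in; the function name and the input values are hypothetical.

```python
def effective_number_of_sites(slowdowns):
    """Inverse participation ratio of per-site slowdowns (assumed measure)."""
    total = sum(slowdowns)
    weights = [s / total for s in slowdowns]  # normalized so they sum to 1
    return 1.0 / sum(w * w for w in weights)

# Hypothetical per-site slowdowns for a four-residue toy protein.
concentrated = [1.0, 0.0, 0.0, 0.0]   # one site carries all the slowdown
uniform = [0.5, 0.5, 0.5, 0.5]        # slowdown spread evenly over sites
intermediate = [0.9, 0.6, 0.2, 0.05]

for name, values in [("concentrated", concentrated),
                     ("uniform", uniform),
                     ("intermediate", intermediate)]:
    print(f"{name}: {effective_number_of_sites(values):.2f} effective sites")
```

Dividing such a number by the total number of sites gives a fraction of constrained sites that can be compared across proteins of different sizes.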
Data availability
The data and code underlying this article are available in Zenodo, at http://doi.org/10.5281/zenodo.4309233.
Acknowledgments
This work was supported by Consejo Nacional de Investigaciones Científicas y Técnicas (grant number PIP 112 201501 00385 CO) and by Agencia Nacional de Promoción Científica y Tecnológica (grant number PICT-2016-4209).
Editor: Robert Best.
Footnotes
Supporting material can be found online at https://doi.org/10.1016/j.bpj.2021.08.042.
Supporting citations
References (27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38) appear in the Supporting materials and methods.
References
- 1. Echave J., Spielman S.J., Wilke C.O. Causes of evolutionary rate variation among protein sites. Nat. Rev. Genet. 2016;17:109–121. doi: 10.1038/nrg.2015.18.
- 2. Echave J., Wilke C.O. Biophysical models of protein evolution: understanding the patterns of evolutionary sequence divergence. Annu. Rev. Biophys. 2017;46:85–103. doi: 10.1146/annurev-biophys-070816-033819.
- 3. Bastolla U., Dehouck Y., Echave J. What evolution tells us about protein physics, and protein physics tells us about evolution. Curr. Opin. Struct. Biol. 2017;42:59–66. doi: 10.1016/j.sbi.2016.10.020.
- 4. Goldstein R.A., Pollock D.D. Sequence entropy of folding and the absolute rate of amino acid substitutions. Nat. Ecol. Evol. 2017;1:1923–1930. doi: 10.1038/s41559-017-0338-9.
- 5. Bartlett G.J., Porter C.T., Thornton J.M. Analysis of catalytic residues in enzyme active sites. J. Mol. Biol. 2002;324:105–121. doi: 10.1016/s0022-2836(02)01036-7.
- 6. Dean A.M., Neuhauser C., Golding G.B. The pattern of amino acid replacements in alpha/beta-barrels. Mol. Biol. Evol. 2002;19:1846–1864. doi: 10.1093/oxfordjournals.molbev.a004009.
- 7. Jack B.R., Meyer A.G., Wilke C.O. Functional sites induce long-range evolutionary constraints in enzymes. PLoS Biol. 2016;14:e1002452. doi: 10.1371/journal.pbio.1002452.
- 8. Sharir-Ivry A., Xia Y. Nature of long-range evolutionary constraint in enzymes: insights from comparison to pseudoenzymes with similar structures. Mol. Biol. Evol. 2018;35:2597–2606. doi: 10.1093/molbev/msy177.
- 9. Sharir-Ivry A., Xia Y. Using pseudoenzymes to probe evolutionary design principles of enzymes. Evol. Bioinform. Online. 2019;15:1176934319855937. doi: 10.1177/1176934319855937.
- 10. Sharir-Ivry A., Xia Y. Non-catalytic binding sites induce weaker long-range evolutionary rate gradients than catalytic sites in enzymes. J. Mol. Biol. 2019;431:3860–3870. doi: 10.1016/j.jmb.2019.07.019.
- 11. Perica T., Kondo Y., Teichmann S.A. Evolution of oligomeric state through allosteric pathways that mimic ligand binding. Science. 2014;346:1254346. doi: 10.1126/science.1254346.
- 12. Guarnera E., Berezovsky I.N. On the perturbation nature of allostery: sites, mutations, and signal modulation. Curr. Opin. Struct. Biol. 2019;56:18–27. doi: 10.1016/j.sbi.2018.10.008.
- 13. Flechsig H. Design of elastic networks with evolutionary optimized long-range communication as mechanical models of allosteric proteins. Biophys. J. 2017;113:558–571. doi: 10.1016/j.bpj.2017.06.043.
- 14. Naganathan A.N. Modulation of allosteric coupling by mutations: from protein dynamics and packing to altered native ensembles and function. Curr. Opin. Struct. Biol. 2019;54:1–9. doi: 10.1016/j.sbi.2018.09.004.
- 15. Campitelli P., Modi T., Ozkan S.B. The role of conformational dynamics and allostery in modulating protein evolution. Annu. Rev. Biophys. 2020;49:267–288. doi: 10.1146/annurev-biophys-052118-115517.
- 16. Maslov S., Ispolatov I. Propagation of large concentration changes in reversible protein-binding networks. Proc. Natl. Acad. Sci. USA. 2007;104:13655–13660. doi: 10.1073/pnas.0702905104.
- 17. Maslov S., Sneppen K., Ispolatov I. Spreading out of perturbations in reversible reaction networks. New J. Phys. 2007;9:273. doi: 10.1088/1367-2630/9/8/273.
- 18. Rajasekaran N., Suresh S., Naganathan A.N. A general mechanism for the propagation of mutational effects in proteins. Biochemistry. 2017;56:294–305. doi: 10.1021/acs.biochem.6b00798.
- 19. Rajasekaran N., Naganathan A.N. A self-consistent structural perturbation approach for determining the magnitude and extent of allosteric coupling in proteins. Biochem. J. 2017;474:2379–2388. doi: 10.1042/BCJ20170304.
- 20. Rajasekaran N., Sekhar A., Naganathan A.N. A universal pattern in the percolation and dissipation of protein structural perturbations. J. Phys. Chem. Lett. 2017;8:4779–4784. doi: 10.1021/acs.jpclett.7b02021.
- 21. Echave J. Beyond stability constraints: a biophysical model of enzyme evolution with selection on stability and activity. Mol. Biol. Evol. 2019;36:613–620. doi: 10.1093/molbev/msy244.
- 22. Tomasello G., Armenia I., Molla G. The Protein Imager: a full-featured online molecular viewer interface with server-side HQ-rendering capabilities. Bioinformatics. 2020;36:2909–2911. doi: 10.1093/bioinformatics/btaa009.
- 23. Boucher J.I., Bolon D.N.A., Tawfik D.S. Quantifying and understanding the fitness effects of protein mutations: laboratory versus nature. Protein Sci. 2016;25:1219–1226. doi: 10.1002/pro.2928.
- 24. Aguilar-Rodríguez J., Wagner A. Metabolic determinants of enzyme evolution in a genome-scale bacterial metabolic network. Genome Biol. Evol. 2018;10:3076–3088. doi: 10.1093/gbe/evy234.
- 25. Alvarez-Ponce D., Sabater-Muñoz B., Fares M.A. Essentiality is a strong determinant of protein rates of evolution during mutation accumulation experiments in Escherichia coli. Genome Biol. Evol. 2016;8:2914–2927. doi: 10.1093/gbe/evw205.
- 26. Poyatos J.F. Genetic buffering and potentiation in metabolism. PLoS Comput. Biol. 2020;16:e1008185. doi: 10.1371/journal.pcbi.1008185.
- 27. Echave J. Evolutionary divergence of protein structure: the linearly forced elastic network model. Chem. Phys. Lett. 2008;457:413–416.
- 28. Echave J., Fernández F.M. A perturbative view of protein structural variation. Proteins. 2010;78:173–180. doi: 10.1002/prot.22553.
- 29. Ming D., Wall M.E. Allostery in a coarse-grained model of protein dynamics. Phys. Rev. Lett. 2005;95:198103. doi: 10.1103/PhysRevLett.95.198103.
- 30. Furnham N., Holliday G.L., Thornton J.M. The Catalytic Site Atlas 2.0: cataloging catalytic sites and residues identified in enzymes. Nucleic Acids Res. 2014;42:D485–D489. doi: 10.1093/nar/gkt1243.
- 31. Webb E.C. Academic Press; San Diego, CA: 1992. Enzyme Nomenclature.
- 32. McCandlish D.M., Stoltzfus A. Modeling evolution using the probability of fixation: history and implications. Q. Rev. Biol. 2014;89:225–252. doi: 10.1086/677571.
- 33. Echave J., Jackson E.L., Wilke C.O. Relationship between protein thermodynamic constraints and variation of evolutionary rates among sites. Phys. Biol. 2015;12:025002. doi: 10.1088/1478-3975/12/2/025002.
- 34. Huang T.-T., del Valle Marcos M.L., Echave J. A mechanistic stress model of protein evolution accounts for site-specific evolutionary rates and their relationship with packing density and flexibility. BMC Evol. Biol. 2014;14:78. doi: 10.1186/1471-2148-14-78.
- 35. Marcos M.L., Echave J. Too packed to change: side-chain packing and site-specific substitution rates in protein evolution. PeerJ. 2015;3:e911. doi: 10.7717/peerj.911.
- 36. Stein R.L. Wiley; Hoboken, NJ: 2011. Kinetics of Enzyme Action: Essential Principles for Drug Hunters.
- 37. Schowen R.L. In: Gandour R.D., Schowen R.L., editors. Springer; 1978. Chapter 20. Catalytic power and transition-state stabilization; pp. 77–144. (Transition States of Biochemical Processes).
- 38. Micheletti C., Carloni P., Maritan A. Accurate and efficient description of protein vibrational dynamics: comparing molecular dynamics and Gaussian models. Proteins. 2004;55:635–645. doi: 10.1002/prot.20049.