Biophysical Journal. 2021 Sep 2;120(20):4320–4324. doi: 10.1016/j.bpj.2021.08.042

Evolutionary coupling range varies widely among enzymes depending on selection pressure

Julian Echave
PMCID: PMC8553639  PMID: 34480927

Abstract

Recent studies proposed that enzyme-active sites induce evolutionary constraints at long distances. The physical origin of such long-range evolutionary coupling is unknown. Here, I use a recent biophysical model of evolution to study the relationship between physical and evolutionary couplings on a diverse data set of monomeric enzymes. I show that evolutionary coupling is not universally long-range. Rather, range varies widely among enzymes, from 2 to 20 Å. Furthermore, the evolutionary coupling range of an enzyme does not inform on the underlying physical coupling, which is short range for all enzymes. Rather, evolutionary coupling range is determined by functional selection pressure.

Significance

Until recently, only residues near enzyme-active sites were thought to be evolutionarily constrained. However, recent studies proposed that active sites induce long-range evolutionary constraints. This seems to conflict with the common finding that physical couplings in proteins are short range. This raises the question of how short-range physical couplings may cause long-range evolutionary couplings. Here, I show that the function that maps physical coupling into evolutionary coupling depends on functional selection pressure. Under weak selection, both couplings are similarly short ranged; under strong selection, short-range physical coupling is nonlinearly turned into long-range evolutionary coupling. Thus, due to a huge variation of selection pressure, evolutionary coupling range varies widely among enzymes, from very short (2 Å) to very long (20 Å).

Main text

As enzymes evolve, different sites evolve at different rates. The main reason for such variation of evolutionary rate among sites within proteins is selection for stability (1, 2, 3, 4). Until recently, activity constraints were thought to affect just the few residues directly involved in catalysis and their immediate neighbors (1,5). However, recent studies have reported that active sites influence evolutionary rates at long distances, slowing down the evolution of residues as distant as 30 Å (6, 7, 8, 9, 10). It seems reasonable to assume that such long-range evolutionary coupling could result from long-range physical couplings, such as those involved in allosteric mutations (11,12). This would align with the notion that enzymes are evolutionarily designed to optimize long-range coupling (13, 14, 15). However, except perhaps for a few allosteric residues, I would expect physical coupling to be a typical short-range, exponentially decreasing function of distance characteristic of indirect through-the-contact-network couplings (14,16, 17, 18, 19, 20). If this is the case, it would leave long-range evolutionary coupling begging explanation.

The aim of this work is to verify whether physical coupling is short-range and, in that case, to study how such a short-range physical coupling may give rise to a long-range evolutionary coupling. To this end, I use the Stability-Activity Model of enzyme evolution, MSA (21). I previously showed that this model reproduces quantitatively the observed slow increase of evolutionary rate with distance from the active site that led to the proposal of long-range evolutionary couplings (7). This makes MSA suitable for exploring the physical underpinnings of long-range evolutionary coupling.

Before describing the model, I start with some definitions. The evolutionary rate, $K$, is the number of amino acid substitutions per unit time along an evolutionary trajectory; $K_0$ is the rate for the case in which all mutations are neutral; $\omega = K/K_0$ is the rate relative to the neutral rate; and $1-\omega = (K_0-K)/K_0$ is the relative slowdown with respect to the neutral evolution case. As selection pressure increases, evolution slows down: $K$ and $\omega$ decrease, and $1-\omega$ (the relative slowdown) increases. $K$, $\omega$, and $1-\omega$ contain exactly the same information. Therefore, in what follows, to measure the effect of selection on evolution, I will mostly use $1-\omega$, the evolutionary slowdown.

With the help of the MSA model, I define physical and evolutionary coupling measures, and derive the formula that relates them. The MSA model is described in detail in the Supporting material, Section 1 and Section 3, and in (21). MSA predicts that the evolutionary slowdown of an enzyme residue r due to selection on stability and activity is given by the following equation (Eq. S19):

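$$1-\omega(r) = 1 - \left\langle \min\left(1, e^{-a_S \Delta\Delta G(r)}\right)\, \min\left(1, e^{-a_A \Delta\Delta G^{*}(r)}\right) \right\rangle, \qquad (1)$$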

where $\Delta\Delta G(r)$ and $\Delta\Delta G^{*}(r)$ are, respectively, the mutational changes of folding free energy and activation free energy; $a_S$ and $a_A$ are positive parameters that represent selection pressure on stability and activity, respectively; and $\langle\cdot\rangle$ stands for averaging over mutations. Equation 1 relates the evolutionary slowdown of residue $r$ to the effects of mutating this residue on stability and activity. Because here I am only interested in selection on activity, I consider a hypothetical scenario in which selection on stability is turned off. Setting $a_S=0$ in Eq. 1, it follows that the evolutionary slowdown of site $r$ due to selection on activity is given by the following equation (Eq. S20):

$$1-\omega_A(r) = 1 - \left\langle \min\left(1, e^{-a_A \Delta\Delta G^{*}(r)}\right) \right\rangle. \qquad (2)$$

The mutational activation free energy change $\Delta\Delta G^{*}(r)$ is due to the distortion of the active site caused by mutating residue $r$ (21) (Eq. S45; Supporting material, Section 1.1 and Section 3.3). Therefore, $\Delta\Delta G^{*}(r)$ represents the physical coupling between the enzyme’s active site and residue $r$. This coupling causes the slowdown $1-\omega_A(r)$, which can therefore be considered a measure of evolutionary coupling. Thus, Eq. 2 governs how evolutionary coupling ($1-\omega_A$) depends on physical coupling ($\Delta\Delta G^{*}$) and functional selection pressure ($a_A$). For notational simplicity, I will drop the explicit reference to residue $r$ whenever possible.
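Equations 1 and 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the paper: the function names, the free energy values, and the selection parameters below are hypothetical, and the free energy changes are taken to be in whatever units make the products with $a_S$ and $a_A$ dimensionless.

```python
import math

def truncated_factor(a, ddg):
    """min(1, exp(-a * ddg)), written as exp(min(0, -a * ddg)) to avoid overflow."""
    return math.exp(min(0.0, -a * ddg))

def slowdown(ddg_fold, ddg_act, a_s, a_a):
    """Eq. 1: 1 - omega(r), averaging over the mutations of one residue.

    ddg_fold[i] and ddg_act[i] are the folding and activation free energy
    changes of mutation i at this residue.
    """
    terms = [truncated_factor(a_s, g) * truncated_factor(a_a, g_act)
             for g, g_act in zip(ddg_fold, ddg_act)]
    return 1.0 - sum(terms) / len(terms)

def slowdown_activity(ddg_act, a_a):
    """Eq. 2: Eq. 1 with selection on stability switched off (a_S = 0)."""
    return slowdown([0.0] * len(ddg_act), ddg_act, 0.0, a_a)

# Hypothetical activation free energy changes for the mutations at one residue.
ddg_act_example = [0.00, 0.01, 0.02, 0.05]
for a_a in (1.0, 10.0, 100.0):  # hypothetical selection pressures
    print(a_a, round(slowdown_activity(ddg_act_example, a_a), 3))
```

With the physical coupling held fixed, the printed slowdown grows with the selection parameter, which is the dependence that Eq. 2 encodes.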

I studied the relationship between physical and evolutionary couplings on a data set of monomeric enzymes used previously (21) (Supporting material, Section 1.2). Briefly, for each protein, I calculated $\Delta\Delta G$ and $\Delta\Delta G^{*}$ for all mutations at all residues using the linearly forced elastic network model. Then, I obtained the model parameters $a_S$ and $a_A$ by fitting MSA predictions to empirical rates. Finally, I calculated the residue-dependent evolutionary couplings, $1-\omega_A$, and physical couplings, $\Delta\Delta G^{*}$. For details of the calculation, see Supporting material, Section 1.1. I consider MSA to be validated in a previous study (21). However, for completeness, in Supporting material, Section 2, I show the excellent agreement between MSA predictions and empirical rates (Supporting material, Section 2.1) and I discuss the adequacy of using the linearly forced elastic network model to calculate $\Delta\Delta G$ and $\Delta\Delta G^{*}$ (Supporting material, Section 2.2). In what follows, I focus on the study of physical and evolutionary couplings.

For clarity, I start by considering three illustrative examples (Fig. 1). I measure coupling range using $d_{1/2}$, the distance at which coupling is half of the maximum. Physical coupling ($\Delta\Delta G^{*}$) is similar for the three examples: it decreases exponentially with increasing distance, and it has very short range ($d_{1/2}^{\mathrm{phys}}$ is 2.1, 2.0, and 2.0 Å for 1OYG, 1QK2, and 1PMI, respectively; Fig. 1 A). In contrast, evolutionary coupling ($1-\omega_A$) varies among the examples: it is not an exponential but a sigmoid, and its range varies widely ($d_{1/2}^{\mathrm{evo}}$ is 5.4, 8.7, and 15.6 Å for 1OYG, 1QK2, and 1PMI, respectively; Fig. 1 B). It will be shown below that this variation of evolutionary coupling range is due to the variation of functional selection pressure acting on these three enzymes ($a_A$ is 29.2, 80.2, and 800 for 1OYG, 1QK2, and 1PMI, respectively).
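The half-range $d_{1/2}$ can be computed in closed form if one assumes the smooth fit functions used for the figures, an exponential $b_0 e^{-b_1 d}$ for physical coupling and $1 - e^{-c_0 e^{-c_1 d}}$ for evolutionary coupling, and if "the maximum" is taken to be the value of the fit at $d=0$. The sketch below rests on those assumptions; the function names and parameter values are hypothetical and are not taken from the paper's code.

```python
import math

def half_range_physical(b1):
    """d_1/2 of b0 * exp(-b1 * d); it does not depend on b0."""
    return math.log(2.0) / b1

def half_range_evolutionary(c0, c1):
    """d_1/2 of 1 - exp(-c0 * exp(-c1 * d)), relative to its value at d = 0."""
    half = 0.5 * -math.expm1(-c0)      # half of the value at d = 0
    target = -math.log1p(-half)        # required value of c0 * exp(-c1 * d)
    return math.log(c0 / target) / c1

# Hypothetical decay constant giving a 2 Angstrom physical range.
b1 = math.log(2.0) / 2.0
print(round(half_range_physical(b1), 2))
for c0 in (0.01, 1.0, 10.0, 100.0):    # hypothetical amplitudes
    print(c0, round(half_range_evolutionary(c0, b1), 2))
```

For small `c0` the evolutionary half-range approaches the physical one, and it grows as `c0` increases, even though the decay constant is the same in every case.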

Figure 1.


Coupling between the active site and other residues for three illustrative examples. The three cases shown are the enzymes with Protein Data Bank identity codes 1OYG, 1QK2, and 1PMI. (A) Physical coupling, as measured by the change in activation free energy that results from mutations; each point corresponds to a protein site; a site’s $\Delta\Delta G^{*}$ is the average of $\Delta\Delta G^{*}$ over mutations; the smooth line is an exponential fit $\Delta\Delta G^{*}=b_0 e^{-b_1 d}$, where $d$ is the distance from the closest active-site residue, measured in Å; the point for which coupling is half of its maximum is displayed in black; the insets show the three-dimensional (3D) protein structures colored from yellow to red according to increasing physical coupling. (B) Evolutionary coupling, as measured by $1-\omega_A$, the relative slowdown of evolution due to selection on activity; each point corresponds to a protein site; the smooth line is a function $1-\omega_A=1-e^{-c_0 e^{-c_1 d}}$, fit to the points; the point for which coupling is half of the maximum is displayed in black; the insets show the 3D protein structures colored from yellow to red according to increasing evolutionary coupling. For the sake of comparison, couplings are scaled so that the smooth fits are 1 at $d=0$. 3D images were made with https://3dproteinimaging.com/protein-imager (22). To see the figure in color, go online.

Fig. 2 shows the distance-dependence of physical and evolutionary couplings for all the enzymes studied. Physical coupling is a short-range exponential decline with distance, very similar for all cases (Fig. 2 A). In contrast, the distance-dependence of evolutionary coupling varies among proteins, from a short-range exponential decline to a long-range sigmoidal decline (Fig. 2 B). Because physical coupling is very similar for all enzymes, it follows from Eq. 2 that the variation of evolutionary coupling among enzymes must be determined by the selection parameter $a_A$. This is confirmed in Fig. 2 C, which shows that evolutionary coupling range is independent of physical coupling range, and in Fig. 2 D, which shows that the variation of evolutionary coupling range is almost completely explained by the selection pressure parameter $a_A$.

Figure 2.


Evolutionary coupling range increases with selection pressure on activity. Coupling between the active site and other residues for each protein of a data set of 157 monomeric enzymes of diverse sizes, structures, and functions. (A) Physical coupling, as measured by $\Delta\Delta G^{*}$, the activation free energy change averaged over mutations at the mutated site; each line is the smooth fit $\Delta\Delta G^{*}=b_0 e^{-b_1 d}$ for one protein of the data set, where $d$ is the distance from the closest active site residue in Å. (B) Evolutionary coupling, as measured by $1-\omega_A$, the relative slowdown due to selection on activity; each line is the smooth fit $1-\omega_A=1-e^{-c_0 e^{-c_1 d}}$ for one protein of the data set; couplings are scaled so that all smooth fits are 1 at $d=0$. (C) Range of evolutionary coupling measured by $d_{1/2}^{\mathrm{evo}}$, the distance at which $1-\omega_A$ becomes half of its maximum, versus the range of physical coupling, measured by $d_{1/2}^{\mathrm{phys}}$, the distance at which $\Delta\Delta G^{*}$ becomes half of its maximum. (D) Range of evolutionary coupling measured by $d_{1/2}^{\mathrm{evo}}$ versus functional selection pressure measured by model parameter $a_A$. In (C) and (D), each point represents one protein, $\rho$ is the Spearman correlation coefficient, $p$ is the p-value, and the lines are local regression fits. To see the figure in color, go online.

Thus, according to Fig. 2, evolutionary coupling range varies widely among enzymes, from 1.9 to 19.7 Å, as a result of the variation of parameter $a_A$ over more than four orders of magnitude, from $6\times10^{-2}$ to $2\times10^{3}$. This huge variation of $a_A$ represents a variation of the functional selection pressure under which enzymes evolve (Supporting material, Section 2.3). This variation can be explained by Eq. 2, which nonlinearly maps a short-range, exponentially decreasing physical coupling into a sigmoidally decreasing evolutionary coupling whose range increases with selection pressure (Supporting material, Section 2.4). In summary, the fundamental finding of this work is that long-range evolutionary coupling is not due to enzymes being particularly well designed for long-range physical coupling; rather, it is a consequence of a nonlinear amplification of physical coupling under strong functional selection pressure.
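As a rough illustration of this nonlinear mapping, the two limiting regimes can be worked out under simplifying assumptions that go beyond the text: physical coupling is taken to be exactly exponential and non-negative, $\Delta\Delta G^{*}(d)=b_0 e^{-b_1 d}$; the average over mutations in Eq. 2 is replaced by a single representative value per site; and the half-range is measured relative to the value at $d=0$. The shorthand $c_0=a_A b_0$ is introduced here for convenience.

```latex
% Eq. 2 under the assumptions above
1-\omega_A(d) = 1 - \exp\!\left(-c_0\, e^{-b_1 d}\right), \qquad c_0 = a_A b_0 .

% Half-range: solve 1-\omega_A(d_{1/2}) = \tfrac{1}{2}\left(1-e^{-c_0}\right)
d_{1/2}^{\mathrm{evo}} = \frac{1}{b_1}\,
  \ln\!\left[\frac{c_0}{-\ln\!\left(\frac{1+e^{-c_0}}{2}\right)}\right].

% Weak selection (c_0 \to 0): -\ln\!\left(\tfrac{1+e^{-c_0}}{2}\right) \approx \tfrac{c_0}{2}
d_{1/2}^{\mathrm{evo}} \to \frac{\ln 2}{b_1} = d_{1/2}^{\mathrm{phys}} .

% Strong selection (c_0 \gg 1): -\ln\!\left(\tfrac{1+e^{-c_0}}{2}\right) \to \ln 2
d_{1/2}^{\mathrm{evo}} \approx \frac{1}{b_1}\ln\!\left(\frac{a_A b_0}{\ln 2}\right)
  = d_{1/2}^{\mathrm{phys}}\,\log_2\!\left(\frac{a_A b_0}{\ln 2}\right).
```

Within this simplified picture, the evolutionary range coincides with the physical range under weak selection and grows only logarithmically with $a_A$ under strong selection, which would be in line with a variation of $a_A$ over several orders of magnitude producing roughly a tenfold variation in range.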

The previous findings provide a mechanism that explains how functional constraints slow down enzyme evolution. A priori, the decrease of the rate of evolution with increasing selection pressure could be uniformly distributed among all residues. However, this work implies a different picture: increasing functional selection pressure increases the range of influence of the enzyme’s active site on other residues (Fig. 2 D); as this range increases, more sites become functionally constrained (Fig. 3 A) and the active site becomes more tightly coupled to the rest of the protein (Fig. 3 B); as a result, the enzyme’s evolution slows down (Fig. 3 C). In this work, I have derived $a_A$ from patterns of rate variation among sites within proteins. However, the model predicts that $a_A$ should also influence rate variation among proteins, which would connect rate variation within proteins with rate variation among proteins. Further work is needed to test this important prediction.

Figure 3.


Increasing evolutionary coupling range slows down enzyme evolution. Evolutionary coupling range is measured by $d_{1/2}^{\mathrm{evo}}$, the distance at which evolutionary coupling is half its maximum. (A) Increase of the fraction of functionally constrained sites; the number of activity-constrained residues was calculated using $n_A=e^{-\sum_r p_r \ln p_r}$, where $r$ denotes residue and $p_r=(1-\omega_A(r))/\sum_{r'}(1-\omega_A(r'))$. So defined, $n_A$, which varies between 1 and the total number of sites, measures how distributed over sites $1-\omega_A$ is. (B) Increase of overall enzyme coupling, measured by $1-\omega_A$ averaged over residues. (C) Predicted decrease of the protein rate of evolution, measured by the relative rate $\omega$ averaged over residues; $\rho$ is Spearman’s correlation coefficient, $p$ is the p-value. In all panels, each point represents one protein and lines are local regression fits. To see the figure in color, go online.
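The number of activity-constrained residues defined in the caption is the exponential of the Shannon entropy of the normalized slowdowns. A minimal sketch follows, with a hypothetical function name and made-up input values; it is not the paper's code.

```python
import math

def effective_constrained_sites(slowdowns):
    """n_A = exp(-sum_r p_r ln p_r), with p_r the normalized 1 - omega_A(r)."""
    total = sum(slowdowns)
    probs = [s / total for s in slowdowns]
    # Sites with zero slowdown contribute nothing (p ln p -> 0 as p -> 0).
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return math.exp(entropy)

# Made-up per-site slowdowns for a 6-residue toy protein.
concentrated = [0.9, 0.0, 0.0, 0.0, 0.0, 0.0]   # one constrained site
spread = [0.3, 0.3, 0.3, 0.3, 0.3, 0.3]         # constraint shared equally
print(effective_constrained_sites(concentrated))
print(effective_constrained_sites(spread))
```

The two toy inputs reach the bounds mentioned in the caption: 1 when the slowdown sits on a single site, and the total number of sites when it is spread evenly.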

To finish, I mention another two research directions suggested by this work. First, for enzymes, functional selection pressure depends on metabolic role (23, 24, 25, 26). Specifically, the main functional constraints are enzyme-specific metabolic flow and enzyme essentiality (24). Therefore, these properties should correlate with parameter $a_A$ of this work, and, as a consequence, metabolic role should affect evolutionary coupling range. This prediction should be verified. Second, these findings indicate the intriguing possibility of manipulating evolutionary coupling range by adjusting selection pressure, which could be explored using enzyme evolution experiments.

Data availability

The data and code underlying this article are available in Zenodo, at http://doi.org/10.5281/zenodo.4309233.

Acknowledgments

This work was supported by Consejo Nacional de Investigaciones Científicas y Técnicas (grant number PIP 112 201501 00385 CO) and by Agencia Nacional de Promoción Científica y Tecnológica (grant number PICT-2016-4209).

Editor: Robert Best.

Footnotes

Supporting material can be found online at https://doi.org/10.1016/j.bpj.2021.08.042.

Supporting citations

References (27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38) appear in the Supporting materials and methods.

Supporting material

Document S1. Supporting materials and methods and Figs S1–S8.
mmc1.pdf (558.9KB, pdf)
Document S2. Article plus supporting material
mmc2.pdf (2.4MB, pdf)

References

1. Echave J., Spielman S.J., Wilke C.O. Causes of evolutionary rate variation among protein sites. Nat. Rev. Genet. 2016;17:109–121. doi: 10.1038/nrg.2015.18.
2. Echave J., Wilke C.O. Biophysical models of protein evolution: understanding the patterns of evolutionary sequence divergence. Annu. Rev. Biophys. 2017;46:85–103. doi: 10.1146/annurev-biophys-070816-033819.
3. Bastolla U., Dehouck Y., Echave J. What evolution tells us about protein physics, and protein physics tells us about evolution. Curr. Opin. Struct. Biol. 2017;42:59–66. doi: 10.1016/j.sbi.2016.10.020.
4. Goldstein R.A., Pollock D.D. Sequence entropy of folding and the absolute rate of amino acid substitutions. Nat. Ecol. Evol. 2017;1:1923–1930. doi: 10.1038/s41559-017-0338-9.
5. Bartlett G.J., Porter C.T., Thornton J.M. Analysis of catalytic residues in enzyme active sites. J. Mol. Biol. 2002;324:105–121. doi: 10.1016/s0022-2836(02)01036-7.
6. Dean A.M., Neuhauser C., Golding G.B. The pattern of amino acid replacements in alpha/beta-barrels. Mol. Biol. Evol. 2002;19:1846–1864. doi: 10.1093/oxfordjournals.molbev.a004009.
7. Jack B.R., Meyer A.G., Wilke C.O. Functional sites induce long-range evolutionary constraints in enzymes. PLoS Biol. 2016;14:e1002452. doi: 10.1371/journal.pbio.1002452.
8. Sharir-Ivry A., Xia Y. Nature of long-range evolutionary constraint in enzymes: insights from comparison to pseudoenzymes with similar structures. Mol. Biol. Evol. 2018;35:2597–2606. doi: 10.1093/molbev/msy177.
9. Sharir-Ivry A., Xia Y. Using pseudoenzymes to probe evolutionary design principles of enzymes. Evol. Bioinform. Online. 2019;15:1176934319855937. doi: 10.1177/1176934319855937.
10. Sharir-Ivry A., Xia Y. Non-catalytic binding sites induce weaker long-range evolutionary rate gradients than catalytic sites in enzymes. J. Mol. Biol. 2019;431:3860–3870. doi: 10.1016/j.jmb.2019.07.019.
11. Perica T., Kondo Y., Teichmann S.A. Evolution of oligomeric state through allosteric pathways that mimic ligand binding. Science. 2014;346:1254346. doi: 10.1126/science.1254346.
12. Guarnera E., Berezovsky I.N. On the perturbation nature of allostery: sites, mutations, and signal modulation. Curr. Opin. Struct. Biol. 2019;56:18–27. doi: 10.1016/j.sbi.2018.10.008.
13. Flechsig H. Design of elastic networks with evolutionary optimized long-range communication as mechanical models of allosteric proteins. Biophys. J. 2017;113:558–571. doi: 10.1016/j.bpj.2017.06.043.
14. Naganathan A.N. Modulation of allosteric coupling by mutations: from protein dynamics and packing to altered native ensembles and function. Curr. Opin. Struct. Biol. 2019;54:1–9. doi: 10.1016/j.sbi.2018.09.004.
15. Campitelli P., Modi T., Ozkan S.B. The role of conformational dynamics and allostery in modulating protein evolution. Annu. Rev. Biophys. 2020;49:267–288. doi: 10.1146/annurev-biophys-052118-115517.
16. Maslov S., Ispolatov I. Propagation of large concentration changes in reversible protein-binding networks. Proc. Natl. Acad. Sci. USA. 2007;104:13655–13660. doi: 10.1073/pnas.0702905104.
17. Maslov S., Sneppen K., Ispolatov I. Spreading out of perturbations in reversible reaction networks. New J. Phys. 2007;9:273. doi: 10.1088/1367-2630/9/8/273.
18. Rajasekaran N., Suresh S., Naganathan A.N. A general mechanism for the propagation of mutational effects in proteins. Biochemistry. 2017;56:294–305. doi: 10.1021/acs.biochem.6b00798.
19. Rajasekaran N., Naganathan A.N. A self-consistent structural perturbation approach for determining the magnitude and extent of allosteric coupling in proteins. Biochem. J. 2017;474:2379–2388. doi: 10.1042/BCJ20170304.
20. Rajasekaran N., Sekhar A., Naganathan A.N. A universal pattern in the percolation and dissipation of protein structural perturbations. J. Phys. Chem. Lett. 2017;8:4779–4784. doi: 10.1021/acs.jpclett.7b02021.
21. Echave J. Beyond stability constraints: a biophysical model of enzyme evolution with selection on stability and activity. Mol. Biol. Evol. 2019;36:613–620. doi: 10.1093/molbev/msy244.
22. Tomasello G., Armenia I., Molla G. The Protein Imager: a full-featured online molecular viewer interface with server-side HQ-rendering capabilities. Bioinformatics. 2020;36:2909–2911. doi: 10.1093/bioinformatics/btaa009.
23. Boucher J.I., Bolon D.N.A., Tawfik D.S. Quantifying and understanding the fitness effects of protein mutations: laboratory versus nature. Protein Sci. 2016;25:1219–1226. doi: 10.1002/pro.2928.
24. Aguilar-Rodríguez J., Wagner A. Metabolic determinants of enzyme evolution in a genome-scale bacterial metabolic network. Genome Biol. Evol. 2018;10:3076–3088. doi: 10.1093/gbe/evy234.
25. Alvarez-Ponce D., Sabater-Muñoz B., Fares M.A. Essentiality is a strong determinant of protein rates of evolution during mutation accumulation experiments in Escherichia coli. Genome Biol. Evol. 2016;8:2914–2927. doi: 10.1093/gbe/evw205.
26. Poyatos J.F. Genetic buffering and potentiation in metabolism. PLoS Comput. Biol. 2020;16:e1008185. doi: 10.1371/journal.pcbi.1008185.
27. Echave J. Evolutionary divergence of protein structure: the linearly forced elastic network model. Chem. Phys. Lett. 2008;457:413–416.
28. Echave J., Fernández F.M. A perturbative view of protein structural variation. Proteins. 2010;78:173–180. doi: 10.1002/prot.22553.
29. Ming D., Wall M.E. Allostery in a coarse-grained model of protein dynamics. Phys. Rev. Lett. 2005;95:198103. doi: 10.1103/PhysRevLett.95.198103.
30. Furnham N., Holliday G.L., Thornton J.M. The Catalytic Site Atlas 2.0: cataloging catalytic sites and residues identified in enzymes. Nucleic Acids Res. 2014;42:D485–D489. doi: 10.1093/nar/gkt1243.
31. Webb E.C. Academic Press; San Diego, CA: 1992. Enzyme Nomenclature.
32. McCandlish D.M., Stoltzfus A. Modeling evolution using the probability of fixation: history and implications. Q. Rev. Biol. 2014;89:225–252. doi: 10.1086/677571.
33. Echave J., Jackson E.L., Wilke C.O. Relationship between protein thermodynamic constraints and variation of evolutionary rates among sites. Phys. Biol. 2015;12:025002. doi: 10.1088/1478-3975/12/2/025002.
34. Huang T.-T., del Valle Marcos M.L., Echave J. A mechanistic stress model of protein evolution accounts for site-specific evolutionary rates and their relationship with packing density and flexibility. BMC Evol. Biol. 2014;14:78. doi: 10.1186/1471-2148-14-78.
35. Marcos M.L., Echave J. Too packed to change: side-chain packing and site-specific substitution rates in protein evolution. PeerJ. 2015;3:e911. doi: 10.7717/peerj.911.
36. Stein R.L. Wiley; Hoboken, NJ: 2011. Kinetics of Enzyme Action: Essential Principles for Drug Hunters.
37. Schowen R.L. In: Gandour R.D., Schowen R.L., editors. Springer; 1978. Chapter 20. Catalytic power and transition-state stabilization; pp. 77–144. (Transition States of Biochemical Processes).
38. Micheletti C., Carloni P., Maritan A. Accurate and efficient description of protein vibrational dynamics: comparing molecular dynamics and Gaussian models. Proteins. 2004;55:635–645. doi: 10.1002/prot.20049.


Articles from Biophysical Journal are provided here courtesy of The Biophysical Society
