Abstract
Design of proteins with desired thermal properties is important for scientific and biotechnological applications. Here we developed a theoretical approach to predict the effect of mutations on protein stability from non-equilibrium unfolding simulations. We establish a relative measure based on apparent simulated melting temperatures that is independent of simulation length and, under certain assumptions, proportional to equilibrium stability, and we justify this theoretical development with extensive simulations and experimental data. Using our new method based on all-atom Monte-Carlo unfolding simulations, we carried out a saturating mutagenesis of Dihydrofolate Reductase (DHFR), a key target of antibiotics and chemotherapeutic drugs. The method predicted more than 500 stabilizing mutations, several of which were selected for detailed computational and experimental analysis. We find a highly significant correlation of r = 0.65–0.68 between predicted and experimentally determined melting temperatures and unfolding denaturant concentrations for WT DHFR and 42 mutants. The correlation between energy of the native state and experimental denaturation temperature was much weaker, indicating the important role of entropy in protein stability. The most stabilizing point mutation was D27F, which is located in the active site of the protein, rendering it inactive. However, for the remaining mutations outside the active site, we observed a weak yet statistically significant positive correlation between thermal stability and catalytic activity, indicating the lack of a stability-activity tradeoff for DHFR. By combining stabilizing mutations predicted by our method, we created a highly stable, catalytically active E. coli DHFR mutant with a measured denaturation temperature 7.2°C higher than WT. Prediction results for DHFR and several other proteins indicate that computational approaches based on unfolding simulations are useful as a general technique to discover stabilizing mutations.
Author Summary
All-atom molecular simulations have provided valuable insight into the workings of molecular machines and the folding and unfolding of proteins. However, commonly employed molecular dynamics simulations suffer from a limitation in accessible time scale, making it difficult to model large-scale unfolding events in a realistic amount of simulation time without employing unrealistically high temperatures. Here, we describe a rapid all-atom Monte Carlo simulation approach to simulate unfolding of the essential bacterial enzyme Dihydrofolate Reductase (DHFR) and all possible single point-mutants. We use these simulations to predict which mutants will be more thermodynamically stable (i.e., reside more often in the native folded state vs. the unfolded state) than the wild-type protein, and we confirm our predictions experimentally, creating several highly stable and catalytically active mutants. Thermally stable active engineered proteins can be used as a starting point in directed evolution experiments to evolve new functions on the background of this additional “reservoir of stability.” The stabilized enzyme may be able to accumulate a greater number of destabilizing yet functionally important mutations before unfolding, protease digestion, and aggregation abolish its activity.
Introduction
Protein stability is an important determinant of organismal fitness and is central to the process of enzyme design for industrial applications [1–3]. Most proteins must be folded to carry out their functions in vitro or in vivo. In addition, non-functional aggregation of unfolded or partially-unfolded proteins can have a deleterious effect on the fitness of an organism and can lead to protein aggregation diseases, which include Alzheimer’s and Huntington’s, in humans [4–6]. Aggregation of poorly folded proteins can also hamper protein production for research and technological purposes [7].
While most mutations in a natural protein are destabilizing [8,9], biological proteins are not generally at their highest possible stability; some mutations will stabilize a protein, increasing the equilibrium population of the folded state [10–12]. This stabilization can be achieved by either slowing the rate of unfolding or speeding the rate of folding, depending on the role of the mutated residue in the folding nucleation process [13,14]. The unfolding temperature, Tm, at which the free energies of the folded and unfolded states coincide (ΔG = 0), serves as a common measure of protein stability. Tm is obtainable by experiment and, in theory, from simulation, although current molecular dynamics simulations are limited in their ability to capture full folding or unfolding trajectories of most proteins (except very small fast-folding domains [15]) in a tractable amount of simulation time [16].
Several computational methods to predict protein stability or changes in stability upon mutation have been developed and tested [17–19]. However, the performance of these popular methods is still relatively weak [20–22]. Other existing techniques to rationally design proteins with improved stability have involved optimization of charge-charge interactions [23], saturation mutagenesis of residues with high crystallographic B-factors [24], methods based on protein simulation and calculation of free energies [25–27] and comparison to homologous proteins including the ultra-stable proteins of thermophiles [28,29]. We reasoned that better predictions of mutant stability might be obtained by evaluating the unfolding temperature Tm in realistic yet computationally tractable simulations of protein unfolding.
Here, we use a Monte Carlo protein unfolding approach (MCPU) with an all-atom simulation method and knowledge-based potential developed earlier in our lab [16,30,31] to simulate unfolding and predict melting temperatures for all possible single point mutants of E. coli Dihydrofolate Reductase (DHFR). DHFR is an essential enzyme in bacteria and higher organisms, and it is an important target of antibiotics [32] and anti-cancer drugs [33,34]. Its moderate size (18 kDa) makes it amenable to both simulation and experiment. As described in the Materials and Methods section, the Monte Carlo move set consists of rotations about torsional angles. At high temperature, the higher entropy of unfolded states overcomes the increase in energy due to loss of favorable contacts and torsional preferences, leading to unfolding. We experimentally determine melting temperatures and catalytic activities for several predicted stabilizing mutants, and for mutants combining multiple stabilizing mutations. Our approach allows us to identify several stabilized mutants of DHFR, and our prediction method marks an improvement over existing stability predictors such as Eris [19], FoldX [17], and PopMusic [18]. Simulations of non-DHFR proteins likewise indicated that our method is useful as a general approach to simulate protein unfolding and select stabilizing mutations.
Results
Predicting the effects of mutations on protein stability from non-equilibrium unfolding simulations
Ideally, protein stability for any sequence should be predicted in all-atom equilibrium simulations that cover multiple folding-unfolding events to determine equilibrium populations of various states of the protein. However, despite recent progress in ab initio simulations of protein folding [15] this goal is not attainable for proteins of realistic size and biological relevance. Currently, non-equilibrium unfolding simulations are within reach for sufficiently large proteins and the question arises whether such simulations can be used to assess mutational effects on protein stability, which is an equilibrium property. The following analysis provides an affirmative answer to this question, under certain assumptions. Although the idea of obtaining equilibrium free energy differences from non-equilibrium measurements is not new [35], and protein stabilities have been calculated from molecular dynamics simulations using the Jarzynski equality, e.g., [36–38], such simulations require application of an external steering force; in the present paper we report the use of multi-temperature Monte-Carlo unfolding simulations in obtaining protein stabilities.
Assuming two-state unfolding kinetics [39–42] we can estimate the characteristic time required to cross the unfolding free energy barrier (in fact it is the time spent in the native state waiting for sufficient thermal fluctuation to cross the barrier) as:
$$\tau_{f \to u} = \tau_0 \exp\left(\frac{\Delta G^{\#}}{k_B T}\right) \qquad (1)$$
where $\tau_{f \to u}$ is the first-passage time from the folded to the unfolded state, ΔG# is the free energy barrier between the folded state and the transition state for unfolding (see Fig. 1), τ0 is the elementary time constant, $k_B$ is the Boltzmann constant, and T is the temperature. When the simulation time τsim approaches $\tau_{f \to u}$, unfolding events are observed in simulation. The apparent “melting temperature” $T_m^{app}$, i.e., the temperature at which unfolding events occur in simulations, therefore depends on the simulation time τsim according to Eq. (1):
$$T_m^{app} = \frac{\Delta G^{\#}}{k_B \ln\left(\tau_{sim}/\tau_0\right)} \qquad (2)$$
This analysis suggests that non-equilibrium first passage unfolding simulations are not suitable to predict the temperature at which a protein would unfold at equilibrium. However, the effect of mutations on stability can be predicted from unfolding simulations. To see this, we note that the mutational effect on protein stability ΔΔG is related to the change in the unfolding free energy barrier ΔΔG#, the difference between the WT barrier height and the mutant barrier height, shown in Fig. 1:
$$\Delta\Delta G^{\#}_i = \left(1 - \varphi_i\right) \Delta\Delta G_i \qquad (3)$$
where i denotes the mutated amino acid and φi is the φ-value for residue i, which determines the fraction of interactions that this residue forms in the folding/unfolding transition state [40,43,44]. We therefore obtain
$$\Delta T_{m,i}^{app} = \frac{\Delta\Delta G^{\#}_i}{k_B \ln\left(\tau_{sim}/\tau_0\right)} = \frac{\left(1 - \varphi_i\right) \Delta\Delta G_i}{k_B \ln\left(\tau_{sim}/\tau_0\right)} \qquad (4)$$
where $\Delta T_{m,i}^{app}$ is the shift in apparent unfolding temperature upon a specific mutation in the i-th residue. Introducing the relative (to WT) unfolding temperature $T_{r,i} = \Delta T_{m,i}^{app} / T_{m,WT}^{app}$ we get
$$T_{r,i} = \frac{\Delta T_{m,i}^{app}}{T_{m,WT}^{app}} = \left(1 - \varphi_i\right) \frac{\Delta\Delta G_i}{\Delta G^{\#}_{WT}} \qquad (5)$$
i.e., the mutational shift in observed unfolding temperature, normalized to the observed unfolding temperature of the wild type under the same simulation conditions, does not depend on the simulation length, provided that the simulation is sufficiently equilibrated in the native basin so that the rules of transition state theory apply. The analysis of extensive kinetic and equilibrium data for multiple proteins shows that for the majority of mutations (except for a small fraction of residues that participate in the folding nucleus) φi ≈ 0.24 with remarkable accuracy and consistency [45]. We therefore get
$$T_{r,i} \approx 0.76\, \frac{\Delta\Delta G_i}{\Delta G^{\#}_{WT}} \qquad (6)$$
i.e., $T_{r,i}$ is independent of simulation time and proportional to the equilibrium free energy effect of mutations, provided that simulations have equilibrated in the native basin of attraction.
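The argument above can be checked numerically with a minimal sketch. It assumes an Arrhenius-like first-passage time, τ0·exp(ΔG#/kT), a temperature-independent barrier, and a mutational change in the unfolding barrier of (1 − φ)·ΔΔG. The function names, the reduced units (k = 1, τ0 = 1), and the barrier values are illustrative assumptions, not quantities taken from the simulations.

```python
import math


def apparent_tm(barrier, tau_sim, tau0=1.0, k_b=1.0):
    """Temperature at which tau0 * exp(barrier / (k_b * T)) equals tau_sim."""
    return barrier / (k_b * math.log(tau_sim / tau0))


def relative_shift(ddg, phi, barrier_wt, tau_sim, tau0=1.0):
    """Mutational shift in apparent Tm, normalized by the WT apparent Tm.

    Assumes the unfolding barrier changes by (1 - phi) * ddg upon mutation.
    """
    barrier_mut = barrier_wt + (1.0 - phi) * ddg
    t_wt = apparent_tm(barrier_wt, tau_sim, tau0)
    t_mut = apparent_tm(barrier_mut, tau_sim, tau0)
    return (t_mut - t_wt) / t_wt


if __name__ == "__main__":
    # Hypothetical barrier (20) and stability change (1), in units of k_b * T.
    for tau_sim in (2e6, 2e7):
        t_abs = apparent_tm(20.0, tau_sim)
        t_rel = relative_shift(ddg=1.0, phi=0.24, barrier_wt=20.0, tau_sim=tau_sim)
        print(f"tau_sim={tau_sim:.0e}  apparent Tm={t_abs:.4f}  relative shift={t_rel:.4f}")
```

In this toy setting the absolute apparent Tm drops as the simulation gets longer, while the relative shift stays fixed because the logarithmic factor cancels in the ratio.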
Monte Carlo protein unfolding simulation
We ran MCPU on DHFR (PDB ID: 4DFR) at a range of temperatures to generate simulated unfolding curves. Unfolding steps of a sample trajectory are shown in Fig. 2, and a flowchart of the simulation method is shown in S1 Fig. The protein was subjected to a brief MD energy minimization, beginning from the WT crystallographic native state, followed by unfolding simulations at each of 32 different temperatures using all-atom Monte-Carlo (see Materials and Methods section). As shown in S2–S4 Figs, the RMSD and total energy increased and the number of contacts decreased as each simulation proceeded, and with increasing temperature. (Here, temperature is given in arbitrary simulation units.) Plots of RMSD and contact number vs. temperature showed sigmoidal behavior, with a clearly identifiable transition temperature, and the melting temperature (Tm) could be determined by fitting to a sigmoidal function (Fig. 3). Plots of energy vs. temperature (S5 Fig) were sigmoid-like, but with an additional rise in energy at low to intermediate temperatures, perhaps indicating pre-melting to a dry-molten globule state with loosened side chains but native-like topology [46,47]. This deviation from sigmoidal behavior becomes clearer as the simulation length is increased (S6 Fig).
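As an illustration of how a transition temperature can be read off such a curve, the following sketch fits a four-parameter logistic function to an observable (for example RMSD) sampled at a set of temperatures. The functional form, the coarse-to-fine grid search, and all names are assumptions made for this example; the fitting procedure actually used is described in Materials and Methods and may differ.

```python
import math


def logistic(t, tm, width):
    """Sigmoid rising from 0 to 1 with midpoint tm and transition width `width`."""
    z = max(-60.0, min(60.0, (t - tm) / width))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_plateaus(temps, values, tm, width):
    """For fixed (tm, width) the model y = a + b * s(T) is linear in a and b,
    so the best baseline a and amplitude b follow from ordinary least squares."""
    s = [logistic(t, tm, width) for t in temps]
    n = len(s)
    s_mean, y_mean = sum(s) / n, sum(values) / n
    sxx = sum((si - s_mean) ** 2 for si in s)
    sxy = sum((si - s_mean) * (yi - y_mean) for si, yi in zip(s, values))
    b = sxy / sxx if sxx > 0.0 else 0.0
    a = y_mean - b * s_mean
    sse = sum((a + b * si - yi) ** 2 for si, yi in zip(s, values))
    return a, b, sse


def fit_tm(temps, values, n_grid=40, n_refine=5):
    """Coarse-to-fine grid search over (tm, width); returns (tm, width, a, b)."""
    lo, hi = min(temps), max(temps)
    span = hi - lo
    tm_lo, tm_hi = lo, hi
    w_lo, w_hi = span / 500.0, span / 2.0
    best = None
    for _ in range(n_refine):
        for i in range(n_grid + 1):
            tm = tm_lo + (tm_hi - tm_lo) * i / n_grid
            for j in range(n_grid + 1):
                w = w_lo + (w_hi - w_lo) * j / n_grid
                a, b, sse = fit_plateaus(temps, values, tm, w)
                if best is None or sse < best[0]:
                    best = (sse, tm, w, a, b)
        # Shrink the search window around the current best point.
        _, tm, w, _, _ = best
        d_tm = (tm_hi - tm_lo) / n_grid
        d_w = (w_hi - w_lo) / n_grid
        tm_lo, tm_hi = tm - d_tm, tm + d_tm
        w_lo, w_hi = max(w - d_w, span / 5000.0), w + d_w
    return best[1:]


if __name__ == "__main__":
    # Synthetic curve at 32 temperatures (arbitrary units), not simulation data.
    temps = [0.9 + 0.03 * i for i in range(32)]
    rmsd = [2.0 + 8.0 * logistic(t, 1.36, 0.04) for t in temps]
    tm, width, baseline, amplitude = fit_tm(temps, rmsd)
    print(f"fitted Tm = {tm:.3f}, width = {width:.3f}")
```

The grid search is used only to keep the sketch dependency-free; a standard nonlinear least-squares routine would serve the same purpose.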
Computational identification of stabilizing single point mutations
All possible single point mutations of DHFR (159 × 19 = 3,021) were simulated with the Monte Carlo protein unfolding simulation protocol. The simulated Tm values were calculated as described above. Of the 3,021 mutations, 523 (17.3%) were predicted to have a stabilizing effect according to all three metrics (energy, contacts, and RMSD), while 42.1% had a destabilizing effect according to all three metrics. These predictions are in good agreement with statistical analysis of published experimental data and FoldX predictions [8,12]. The simulated Tm values evaluated using RMSD, total energy, and number of contacts are strongly correlated, as shown in Fig. 4A. The distribution of predicted melting temperatures (averaged over the three metrics) for all 3,021 point mutants is shown in Fig. 4B. Next, we selected a subset of predicted stabilizing mutations for in-depth computational and experimental analysis. To that end, we identified the loci where multiple mutations were consistently predicted to be stabilizing, and at each such locus we selected the mutation predicted to be most stabilizing. This yielded the 23 predicted stabilizing single point mutants shown in S1 Table. Furthermore, five stabilizing mutations at different sites within DHFR, shown in Fig. 5, were combined to form the multiple mutants listed in Table 1, with the rationale that combining individual stabilizing mutations often yields more stable proteins; these mutants were likewise subjected to computational and experimental analysis.
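The size of the saturating mutagenesis set follows directly from the sequence length. A minimal sketch of the enumeration is given below; the helper name and the toy sequence are hypothetical, and the labels follow the wild-type residue, 1-based position, new residue convention used in the text (e.g., T113V).

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def single_point_mutants(sequence):
    """Return labels such as 'T113V' for every single point mutant of `sequence`."""
    mutants = []
    for position, wild_type in enumerate(sequence, start=1):
        for residue in AMINO_ACIDS:
            if residue != wild_type:
                mutants.append(f"{wild_type}{position}{residue}")
    return mutants


if __name__ == "__main__":
    toy_sequence = "MISLIA"  # short placeholder sequence, used for illustration only
    labels = single_point_mutants(toy_sequence)
    print(len(labels), labels[:3])
    # For a 159-residue protein the same enumeration gives 159 * 19 mutants.
    print(159 * (len(AMINO_ACIDS) - 1))
```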
Table 1. Simulated and experimental results for WT DHFR and selected single and multiple mutants.
Mutation(s) | Tm (DSC) | Cm (CD) | kcat | kcat/KM | Simulated Tm |
---|---|---|---|---|---|
WT | 54.1 | 3.09 | 24.60 | 14.07 | 1.358 ± 0.004 |
T113V | 58.0 | 3.28 | 13.67 | 10.86 | 1.389 ± 0.004 |
Q108D | 55.7 | 3.18 | 24.60 | 10.35 | 1.361 ± 0.004 |
S138Y | 55.6 | 3.33 | 24.51 | 9.33 | 1.366 ± 0.004 |
D116F | 55.5 | 3.43 | 24.80 | 9.53 | 1.369 ± 0.004 |
T68N | 55.5 | 3.26 | 29.36 | 13.32 | 1.367 ± 0.003 |
E120P | 55.3 | 3.25 | 30.02 | 13.91 | 1.371 ± 0.004 |
T68N,Q108D,T113V,E120P,S138Y | 61.3 | 3.52 | 32.63 | 12.20 | 1.400 ± 0.004 |
T113V,E120P,S138Y | 58.5 | 3.49 | 31.13 | 13.10 | 1.384 ± 0.004 |
T68N,Q108D,E120P,S138Y | 56.4 | 3.47 | 22.80 | 10.94 | 1.377 ± 0.003 |
T68N,Q108D | 55.8 | 3.14 | 17.99 | 15.24 | 1.366 ± 0.003 |
E120P,S138Y | 55.6 | 3.29 | 16.01 | 10.81 | 1.371 ± 0.004 |
Note: The data were averaged over 50 replications. 2,000,000 MC steps were simulated in total, and the last 1,000,000 steps were used to calculate Tm.
Units: Tm (DSC), °C; Cm, M; kcat, s−1; kcat/KM, s−1 μM−1. Simulated Tm is given in arbitrary simulation units.
Computational test of the theoretical analysis
First, we test two predictions that emerge from the theoretical analysis of unfolding simulations. The first prediction is that the apparent unfolding temperature decreases as the length of the unfolding simulation increases (Equation 2). Second, and most importantly, the mutational change in relative (normalized to WT) apparent unfolding temperature is a) robust with respect to simulation time, provided that simulations have equilibrated in the native basin, and b) directly proportional to the effect of mutations on equilibrium protein stability (Equation 6). We test these predictions using MCPU simulations and experiment.
We carried out two sets of MCPU simulations of different lengths: 2,000,000 and 20,000,000 steps for the 23 predicted stabilizing mutants, 15 mutants studied previously by experiment [48] (the complete set of single mutants is listed in S1 Table), and the 5 stabilizing multiple mutants combining individual mutations listed in Table 1, and compared their predicted absolute and relative simulated unfolding temperatures (Fig. 6). Indeed both predictions of our theoretical analysis are confirmed, i.e., the apparent unfolding temperature decreases with simulation time (Fig. 6A) while the relative unfolding temperature is remarkably independent of simulation time (Fig. 6B). We note that due to the nature of the energy function used in our simulations, there is no obvious mapping of simulation temperature to real absolute temperature (i.e., in Celsius or Kelvin). Conversion of simulation temperature to physical temperature would require use of experimental data (e.g., WT unfolding temperature and deviation of temperatures over all mutants) and therefore would not provide a completely simulation- or theory-based prediction. Furthermore, as noted above, the apparent absolute value of the transition temperature in the Monte-Carlo unfolding approach depends on simulation time. Therefore, we used the relative melting temperature, $T_r$ (the mutational shift in apparent unfolding temperature normalized by the apparent unfolding temperature of WT), when comparing simulation results with experimental results.
As mentioned, the simulated Tm values evaluated using RMSD, total energy, and number of contacts are strongly correlated in our simulations as shown in Fig. 4A and S1 Table. In what follows we define the computational unfolding temperature Tm as averaged over Tm values determined using these three criteria.
Experimental characterization of predicted mutants
We cloned, expressed, and purified the 23 single point mutants of DHFR listed in S1 Table, as well as the multiple mutants listed in Table 1 (see Materials and Methods). The biophysical properties of the mutants were measured and compared with WT DHFR, as shown in S2 Table. As many studies have shown that oligomerization can alter protein stability [23,48,49], we first tested whether mutations induce oligomerization and/or aggregation using the gel filtration method [48,50] and light scattering. The results indicated that all of the 23 mutants were monomeric at studied concentrations except for E154V, which appeared aggregation-prone. We excluded E154V from the subsequent analysis.
As shown in S2 Table, all single mutants are catalytically active except for D27F. D27 is known to be a key catalytic residue of E. coli DHFR [51].
For each mutant we obtained two measures of stability: the apparent melting temperature determined by Differential Scanning Calorimetry (DSC) and the urea midpoint unfolding concentration (Cm) determined by monitoring chemical denaturation by Circular Dichroism (CD) with subsequent fitting to a two-state model (see Materials and Methods). The two measures of stability were highly correlated, despite the fact that thermal unfolding was irreversible (S7 Fig). Of the selected 22 single point mutations, 10 were stabilizing according to their Tm or Cm values (S2 Table). Given that most random mutations are destabilizing, with only a small fraction (less than 18%) stabilizing [8,12], this result is statistically significant (p = 0.002 under the null hypothesis that mutations are random) and indicates that MCPU is an effective method for selecting stability-enhancing mutations.
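The enrichment can be gauged with a simple one-sided binomial tail. The sketch below assumes that each of the 22 tested mutations is independently stabilizing with probability 0.18 under the null hypothesis; both the choice of test and the exact background rate are assumptions of this example, so the value it prints need not match the reported p-value exactly.

```python
import math


def binomial_tail(k, n, p):
    """Probability of observing at least k successes in n independent trials."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # 10 stabilizing mutations out of 22 tested; assumed background rate 0.18.
    p_value = binomial_tail(10, 22, 0.18)
    print(f"one-sided binomial p-value: {p_value:.4f}")
```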
As expected, combinations of single stabilizing mutations led to more stable multiple mutant variants [24,25,52], as predicted by simulation. In particular, the stability of the quintuple mutant (T68N,Q108D,T113V,E120P,S138Y) was found to be substantially higher than that of the wild-type protein (Table 1), with Tm 7.2°C higher than WT and Cm, the urea concentration at the mid-unfolding point, 0.43 M higher than WT. All multiple mutants were catalytically active, and the quintuple mutant and triple mutant (T113V,E120P,S138Y) were found to be more catalytically active than WT. We note that while combination of stabilizing mutations generally increases stability, the effect is less than additive (S8 Fig); for instance, the quintuple mutant is about 2.4°C less stable than predicted under the assumption of additive ΔTm (a 7.2°C stability increase vs. a predicted 9.6°C).
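The additivity comparison can be reproduced from the DSC values in Table 1. The short sketch below sums the single-mutant shifts and compares them with the measured shifts of the multiple mutants; the variable and function names are illustrative.

```python
TM_WT = 54.1  # DSC melting temperature of WT (°C), Table 1

# DSC melting temperatures of the single mutants (°C), Table 1.
SINGLE_TM = {"T68N": 55.5, "Q108D": 55.7, "T113V": 58.0, "E120P": 55.3, "S138Y": 55.6}

# DSC melting temperatures of the multiple mutants (°C), Table 1.
MULTI_TM = {
    ("T68N", "Q108D", "T113V", "E120P", "S138Y"): 61.3,
    ("T113V", "E120P", "S138Y"): 58.5,
    ("T68N", "Q108D", "E120P", "S138Y"): 56.4,
    ("T68N", "Q108D"): 55.8,
    ("E120P", "S138Y"): 55.6,
}


def additive_shift(mutations):
    """Tm shift expected if single-mutant shifts simply add up."""
    return sum(SINGLE_TM[m] - TM_WT for m in mutations)


def observed_shift(mutations):
    """Measured Tm shift of the multiple mutant relative to WT."""
    return MULTI_TM[tuple(mutations)] - TM_WT


if __name__ == "__main__":
    for mutations in MULTI_TM:
        expected = additive_shift(mutations)
        measured = observed_shift(mutations)
        print(f"{','.join(mutations)}: additive {expected:.1f}, "
              f"observed {measured:.1f}, shortfall {expected - measured:.1f}")
```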
We computationally predicted relative unfolding temperatures of 15 DHFR mutants published earlier [48] and added these mutants to the set for analysis, resulting in 42 mutants in total. The correlation coefficient between experimental relative Tm and simulated relative Tm for the 42 mutants was about 0.65, as shown in Fig. 7A. To address the issue that neither simulated Tm nor DSC measurements are strictly at equilibrium, we plotted the relation between simulated Tm and an equilibrium measurement of stability, chemical denaturation by urea. The denaturation mid-transition urea concentration Cm and the computationally determined unfolding temperature exhibit a slightly higher correlation of r = 0.68 (Fig. 7B), demonstrating that our non-equilibrium simulation method shows good agreement with the equilibrium measurement of urea denaturation, as predicted by Equation 6.
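For readers who wish to repeat this kind of comparison, the sketch below computes a Pearson correlation between simulated and DSC melting temperatures. It uses only the 12 entries of Table 1, which are a selected subset dominated by stabilized variants, so the coefficient it prints is not expected to match the values reported for the full 42-mutant set. Because the relative Tm is a linear rescaling of Tm, the Pearson coefficient is the same whether absolute or relative values are used.

```python
import math

# (DSC Tm in °C, simulated Tm in arbitrary units) for the entries of Table 1.
TABLE1 = [
    (54.1, 1.358), (58.0, 1.389), (55.7, 1.361), (55.6, 1.366),
    (55.5, 1.369), (55.5, 1.367), (55.3, 1.371), (61.3, 1.400),
    (58.5, 1.384), (56.4, 1.377), (55.8, 1.366), (55.6, 1.371),
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    experimental = [row[0] for row in TABLE1]
    simulated = [row[1] for row in TABLE1]
    print(f"Pearson r (Table 1 subset only): {pearson_r(experimental, simulated):.2f}")
```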
We also used the dataset to evaluate the effect of the number of replications and the number of MC steps on the performance of the method. As shown in Fig. 8A, the prediction accuracy is sensitive to the number of replications. To achieve reliable Tm predictions, at least 20 replications should be used. However, the number of MC steps did not greatly affect prediction accuracy, provided simulations were run for at least ~200,000 steps (see Fig. 8B). In the context of the theory developed in the earlier section, “Predicting the effects of mutations on protein stability from non-equilibrium unfolding simulations,” this initial period may allow time for equilibration within the native basin, after which simulation length does not appreciably affect the consistency of results with equilibrium stability measurements.
Stability and activity do not trade-off for DHFR
It has been proposed that stability imposes a constraint on protein function, leading to stability-activity tradeoffs [53,54]. Our data, however, paint a different picture for DHFR: a weak positive correlation between Tm and kcat or kcat/KM (r = 0.46, p = 0.02 and r = 0.41, p = 0.03, respectively), with one notable outlier, D27F, where the stabilizing mutation lies directly in the active site (Fig. 9). The D27F mutant has high thermal stability but, as noted above, is not catalytically active, indicating that there is in fact a stability-activity trade-off for this active-site residue.
Evolutionary analysis
Using an alignment of 290 bacterial DHFRs, we determined the DHFR consensus sequence (S9 Fig). Mutation of a non-consensus residue to the consensus residue generally resulted in protein stabilization [29]. In 4/16 of the experimentally stabilizing mutations, a residue was changed to the consensus residue, while only 2/29 destabilizing mutations resulted from a change to consensus. Likewise, in 18/29 destabilizing mutations, a residue was changed away from the consensus residue, while this was true for only 5/16 of stabilizing mutations.
Simulated melting temperatures by residue
We compared the minimum and maximum simulated Tm values obtainable by mutating a single residue to any of the 19 other amino acids (Fig. 10A). There is a weak positive correlation between minimum and maximum melting temperatures (r = 0.30, p = 10⁻⁴). Apparently, protein loci where mutations can cause significant stabilization are statistically less susceptible to destabilizing mutations and vice versa, which may be expected: once a residue is already at its most stabilizing amino acid variant, the protein cannot be stabilized further by mutation. Distinct outliers correspond to the loci with the strongest stabilizing or destabilizing effects of mutations. Interestingly, these outliers, which may represent structural weak spots in DHFR, tend to fall on the interface connecting the C-terminal beta hairpin and the rest of the protein (Fig. 10B). This is in fact the interface that is the first to dissociate in the Monte Carlo simulations (see Fig. 2).
Comparison with other methods
We compared our computational DHFR predictions with four popular approaches to predict the effect of a mutation on protein stability: FoldX [17], Eris [26], PopMusic [55], and SDM [56] (S3 Table). MCPU performs better than these methods on DHFR mutants. PopMusic also performs well, with a highly statistically significant r = 0.55 between prediction and experiment; however, a limitation of this method is that it can consider only single point mutations. To further evaluate MCPU performance we tested it on four additional proteins from four different SCOP structural classes: the Cro repressor protein from bacteriophage lambda (PDB ID 5CRO), the B. subtilis major cold shock protein (1CSP), E. coli Thioredoxin (2TRX), and Gln-25 ribonuclease T1 from Aspergillus oryzae (1RN1). Our predictions were compared with Eris and SDM. We did not compare MCPU results with FoldX and PopMusic, as these mutations were included in the training datasets for the two methods. As shown in Table 2, the correlation coefficient between MCPU predictions and the experimental Tm values, averaged over all proteins, is about 0.71, which is higher than that provided by Eris (−0.05), for which predictions were quite poor for both DHFR and other proteins, and SDM (0.63). If we consider only the binary prediction of whether a mutation is stabilizing or destabilizing, MCPU correctly predicts 11 out of 16 mutations, while Eris and SDM correctly classify 9 and 8 mutations, respectively.
Table 2. Simulation results on non-DHFR proteins.
SCOP class | Length | PDB | Mutant | Experimental Tm (°C) | MCPU | Eris | SDM | Native energy |
---|---|---|---|---|---|---|---|---|
All alpha proteins | 66 | 5CRO | Y26D | 54.0 | 0.878 | −6.958 | −0.690 | −2044.4 |
 | 66 | 5CRO | Y26H | 49.5 | 0.869 | −3.425 | −0.660 | −1908.2 |
 | 66 | 5CRO | Y26L | 46.0 | 0.868 | −3.458 | 0.300 | −1891.8 |
 | 66 | 5CRO | WT-5CRO | 39.5 | 0.869 | 0.000 | 0.000 | −1886.0 |
 | 66 | 5CRO | Y26W | 37.5 | 0.871 | −1.200 | 0.350 | −1872.7 |
All beta proteins | 67 | 1CSP | A46E | 48.6 | 1.015 | 1.203 | 0.020 | −1624.9 |
 | 67 | 1CSP | E3L | 62.7 | 1.033 | 2.271 | −0.460 | −1536.7 |
 | 67 | 1CSP | E3R | 69.6 | 1.023 | 1.638 | −0.650 | −1878.0 |
 | 67 | 1CSP | E66L | 66.4 | 1.042 | 2.510 | 1.320 | −1568.1 |
 | 67 | 1CSP | WT-1CSP | 53.6 | 1.022 | 0.000 | 0.000 | −1596.3 |
Alpha and beta proteins (a/b) | 109 | 2TRX | D26I | 98.0 | 1.135 | 2.847 | 4.290 | −2478.5 |
 | 109 | 2TRX | WT-2TRX | 87.0 | 1.107 | 0.000 | 0.000 | −2558.4 |
 | 109 | 2TRX | T66L | 85.0 | 1.124 | 3.540 | 2.180 | −2552.0 |
 | 109 | 2TRX | T77V | 82.0 | 1.124 | −0.980 | 1.960 | −2541.9 |
 | 109 | 2TRX | C35A | 73.0 | 1.104 | −14.670 | −2.040 | −2574.7 |
Alpha and beta proteins (a+b) | 125 | 1RN1 | V16A | 44.5 | 0.887 | 2.033 | −1.530 | −1934.1 |
 | 125 | 1RN1 | V16S | 36.9 | 0.876 | 3.027 | −4.410 | −1920.1 |
 | 125 | 1RN1 | V78S | 34.6 | 0.878 | 3.870 | −4.450 | −1943.8 |
 | 125 | 1RN1 | V89S | 29.6 | 0.877 | 3.414 | −4.100 | −1904.8 |
 | 125 | 1RN1 | WT-1RN1 | 51.5 | 0.899 | 0.000 | 0.000 | −1929.7 |
Error number | | | | | 5 | 7 | 8 | 7 |
Error rate | | | | | 0.313 | 0.438 | 0.500 | 0.438 |
r | | | | | 0.708 | −0.053 | 0.635 | −0.348 |
Error number and error rate describe the number and fraction of mutations not predicted in the correct direction (stabilizing vs. destabilizing)
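The direction-of-effect count for MCPU can be recomputed from the Table 2 values, as in the sketch below. It compares the sign of the simulated Tm shift with the sign of the experimental Tm shift, each taken relative to the corresponding wild type. Counting a predicted shift of exactly zero as a miss is an assumption of this example; it is the reading under which the tabulated error number is recovered.

```python
# protein -> {variant: (experimental Tm in °C, MCPU simulated Tm)}, from Table 2.
TABLE2 = {
    "5CRO": {"WT": (39.5, 0.869), "Y26D": (54.0, 0.878), "Y26H": (49.5, 0.869),
             "Y26L": (46.0, 0.868), "Y26W": (37.5, 0.871)},
    "1CSP": {"WT": (53.6, 1.022), "A46E": (48.6, 1.015), "E3L": (62.7, 1.033),
             "E3R": (69.6, 1.023), "E66L": (66.4, 1.042)},
    "2TRX": {"WT": (87.0, 1.107), "D26I": (98.0, 1.135), "T66L": (85.0, 1.124),
             "T77V": (82.0, 1.124), "C35A": (73.0, 1.104)},
    "1RN1": {"WT": (51.5, 0.899), "V16A": (44.5, 0.887), "V16S": (36.9, 0.876),
             "V78S": (34.6, 0.878), "V89S": (29.6, 0.877)},
}


def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def direction_errors(table):
    """Count mutants whose predicted direction differs from the experimental one."""
    errors, total = 0, 0
    for variants in table.values():
        tm_wt, sim_wt = variants["WT"]
        for name, (tm, sim) in variants.items():
            if name == "WT":
                continue
            total += 1
            if sign(sim - sim_wt) != sign(tm - tm_wt):
                errors += 1
    return errors, total


if __name__ == "__main__":
    errors, total = direction_errors(TABLE2)
    print(f"MCPU direction errors: {errors}/{total} (rate {errors / total:.3f})")
```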
Entropy of the native state is an important contributor to stability
The theoretical analysis of the unfolding simulations relates the effect of mutations on the equilibrium between folded and unfolded states to the effect of mutations on free energy of the folded and transition states. It is widely believed that in the low-entropy folded state energetic factors dominate. If so that would imply that we can get an equally good correlation between prediction and experiment by estimating the mutational effect on energy of the native state as is the case for most empirical methods. To that end we evaluated the correlation between the energy of the minimized (after long MC equilibration) native state and the experimental Tm and found only a weak correlation with experimental melting temperatures (Table 2, last column), indicating that protein entropy, which is accounted for in the MCPU, in addition to enthalpy, is important in determining protein stability.
Discussion
Estimates of protein stability using Molecular Dynamics are computationally prohibitive for all but the smallest protein domains. However, using MCPU we were able to efficiently explore stabilities of all possible point mutants for an essential enzyme of a typical size (159 amino acids) in a manageable amount of computational time (approx. one hour for every 1,000,000 MC steps). Although the use of rapid Monte Carlo simulations reduces simulation time and allows for a greater number of replicates, our method to predict stability effects of mutations based on non-equilibrium unfolding simulations represents a general approach that could be modified for use with conventional MD simulations, especially given the current rate of improvement in simulation speed and accuracy.
Since our method involves protein unfolding simulations and not equilibrium simulations of both folding and unfolding processes, we expect it to be especially useful for predicting mutations that mostly affect the rate of protein unfolding as highlighted in our theoretical analysis. Low φ-value residues, which acquire contacts with other residues late in the folding process and lose contacts early in the unfolding process [14] constitute the majority of residues in proteins, with φ-value roughly constant around 0.24 as noted in [45]. Combining this observation with assumptions of transition state theory, we found that for the majority of residues (those not part of the folding nucleus [14,57] exhibiting anomalously high φ-values) the observed simulation Tm relative to WT is proportional to the equilibrium stability change ΔΔG, as verified by simulation and experiment. We establish that relative Tm is independent of simulation length, demonstrating that non-equilibrium simulations can in fact be used to quantify relative protein stability.
Many of the experimentally verified stabilizing mutations in DHFR predicted by MCPU are found in the C-terminal beta hairpin region, which is the first to unfold in simulations, prior to the main unfolding event encompassing the entire structure (see Fig. 2). It has been shown that the source of ultra-stability in hyperthermophiles generally arises from slowing the unfolding rate, rather than increasing the folding rate [28], so our method may be particularly suitable for discovering biologically relevant stabilizing mutations. In addition, our results might be particularly applicable to in vivo studies, where protease digestion and/or aggregation proceed from the partially-unfolded state. We note, however, that some stabilizing residues predicted by MCPU lie in the region of the protein that is late to unfold in simulations, including I61V, which raises the experimental melting temperature by 1.7°C. These mutants, along with the destabilized outlier I155A for which relative Tm depends on simulation length (Fig. 6), are appealing candidates for further study, as they may reflect a breakdown in the simplifying assumptions of 2-state kinetic theory for proteins.
It has been hypothesized that there exists a tradeoff between enzyme activity and stability, since certain regions of an enzyme must be sufficiently flexible to promote catalysis [53,54]. This conclusion was reached in [53,58], based on the exploration of stability effects of mutations in the active site of beta-lactamase [53] and rubisco [58]. Fersht and coauthors also found several stabilizing mutations in the active site of Barnase rendering the protein inactive [59]. While we observe a similar effect with the D27F mutation in DHFR, Fig. 9 shows that exploring only mutations in the active site provides a biased view of the tradeoff between activity and stability. Rather, the vast majority of mutations throughout the protein show a qualitatively opposite trend. The likely explanation of the distinction between an apparent tradeoff when mutations are made in the active site and the opposite trend for mutations outside of the active site is that “carving” an active site requires special selection of catalytic amino acids, which could indeed have a destabilizing effect overall. However, our observation of a small positive correlation argues against an obligate relation between global protein dynamics and activity for DHFR, at least for the aspects of dynamics that are correlated with stability. Warshel and colleagues reached a similar conclusion in their theoretical analysis of the role of dynamics for DHFR and other proteins in [60]. This point has likewise been made by Bloom et al. [11], who noted that a number of proteins have been stabilized experimentally without loss of activity, and Taverna and Goldstein argued that marginal stability is an inherent property of proteins due to the high dimensionality of sequence space and not due to a requirement of reduced stability in order to generate sufficient flexibility [61].
A straightforward explanation for the weak yet statistically significant positive correlation between activity and stability observed in our case might be that more stable proteins have greater effective concentration of the folded (i.e. active) form. It is also important to note that a weak yet statistically significant positive correlation between activity and stability for DHFR can be revealed only when stabilizing mutations are included in the analysis. Our earlier study [48] analyzed a smaller set of primarily destabilizing mutants and did not reveal any statistically significant trend (positive or negative) in the stability-activity relation for DHFR.
The development of highly stabilized DHFR mutants through our combined in silico/in vitro approach opens up promising avenues for new in vivo studies. It has been postulated that protein stability places a fundamental constraint on the evolutionary pathways available to a protein [29,62], which has particular significance in the development of antibiotic resistance: higher protein stability can provide the microorganism with an increased capacity to evolve to evade antibiotic drugs [63] or, more generally, with capacity to evolve new functions [62]. We plan to use an approach developed in our lab [48] to endogenously introduce stabilized DHFR mutants into the bacterial chromosome and we will evaluate mutant fitness relative to wild-type using growth rates and competition experiments. These experiments will allow us to assess whether an evolutionary trade-off exists between stability and fitness in vivo, particularly in the presence of antibiotics.
We plan to apply MCPU to predict stability effects of mutations in proteins other than DHFR, in particular to develop highly stabilized mutants. Comprehensive experimental analysis of fitness and/or stability effects of mutations [64] could be useful in assessing the predictive capabilities of this method. In addition to predicting mutant stabilities, MCPU can provide atomic-detail molecular trajectories to rationalize the stability effects of mutations; such analysis is left to future study.
Materials and Methods
Monte Carlo simulations
We employed an all-atom Monte Carlo simulation program incorporating a knowledge-based potential, described in previous publications [16,31,65]. Briefly, the energy function is a sum of contact energy, hydrogen-bonding, torsional angle, and sidechain torsional terms, with an additional term describing orientation of nearby aromatic residues. The move set consists of rotations about ϕ, ψ, and χ dihedral angles, with bonds and angles held fixed. Moves are accepted or rejected according to the Metropolis criterion.
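The acceptance rule named above can be illustrated with a minimal sketch. It assumes simulation units in which the Boltzmann constant is 1, and the function name `metropolis_accept` is hypothetical; it is not taken from the MCPU program.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng=random):
    """Return True if a trial move with energy change delta_e is accepted.

    Assumes reduced units (k_B = 1), so temperature and energy share units.
    """
    if delta_e <= 0.0:
        return True  # downhill or neutral moves are always accepted
    # Uphill moves are accepted with Boltzmann probability exp(-dE / T)
    return rng.random() < math.exp(-delta_e / temperature)
```

In a full simulation this test would follow each trial rotation about a ϕ, ψ, or χ dihedral angle, with the move reverted when the function returns False.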
Mutations were introduced into the protein using the program Modeller v9.2 [66]. An initial minimization was carried out in NAMD [67] for 5,000 steps, using the default minimization algorithm and par_all27_prot_lipid.inp parameter file (without waters). An additional minimization step was carried out by running the Monte Carlo simulation program at low temperature (0.100 in simulation units) for 2,000,000 steps. A 2,000,000-step simulation was then run at each of 32 temperatures, averaging over all 2,000,000 steps to obtain Energy, RMSD, and number of contacts. These results were averaged over 50 simulations, for each temperature. Data was then plotted and fit to a sigmoid to obtain the computationally-predicted melting temperature, for each of Energy, RMSD, and number of contacts. To assess dependence of melting temperature on simulation length, longer simulations of 20,000,000 steps were carried out with 30 replications, averaging over the final 2,000,000 steps. For DHFR, 1,000,000 steps took approximately one hour of simulation time, on a single CPU.
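The averaging described above (a time average within each run, then an average over replicate runs at each temperature) can be sketched as follows. The data layout and the name `average_over_replicas` are assumptions made for illustration only.

```python
from statistics import mean


def average_over_replicas(trajectories):
    """Average an observable (e.g. RMSD) per temperature.

    trajectories: dict mapping temperature -> list of replicas, where each
    replica is a list of per-step values of the observable.
    Returns a dict mapping temperature -> mean over replicas of the
    per-replica time average.
    """
    return {
        temp: mean(mean(replica) for replica in replicas)
        for temp, replicas in trajectories.items()
    }


# Toy input: two temperatures, with two and one replicas respectively
toy = {0.5: [[1.0, 3.0], [2.0, 4.0]], 1.0: [[10.0, 10.0]]}
averaged = average_over_replicas(toy)
```

The resulting temperature-to-average mapping is the kind of curve that would then be fit to a sigmoid.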
Bioinformatics Analysis
DHFR protein sequences from 290 bacterial species were aligned using the MUSCLE program via its online server [68]. MATLAB, with the Bioinformatics Toolbox, was used to create sequence logo representations and to determine the consensus sequence.
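As an illustration of what a consensus sequence is, the sketch below applies a simple majority rule to each alignment column. This rule is an assumption: the MATLAB routine used in the study may score columns differently, and the function name `consensus_sequence` is hypothetical.

```python
from collections import Counter


def consensus_sequence(aligned, gap="-"):
    """Most frequent non-gap residue in each column of an alignment."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    consensus = []
    for column in zip(*aligned):
        counts = Counter(res for res in column if res != gap)
        # A column containing only gaps is reported as a gap
        consensus.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(consensus)


# Toy alignment of three short sequences (not real DHFR data)
toy_consensus = consensus_sequence(["MISLI", "MISLA", "M-TLI"])
```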
Effect of number of replications on simulation accuracy
We evaluated the effect of the number of MC simulation replications on the prediction results. As shown in S9 Fig, the prediction accuracy is sensitive to the number of replications, but converges to a constant value after approximately 20 replications. In addition, we saw that increasing the number of MC steps beyond 2,000,000 steps does not increase prediction accuracy when the protein has been simulated with at least 20 replications, despite the fact that not all simulations have converged by 2,000,000 steps (S2 Fig to S4 Fig).
Simulation analysis
Sigmoidal fits were performed with the “Sigmoidal, 4PL” module of the software program Prism 6. The sigmoid function has the form:
Y = Bottom + (Top-Bottom)/(1+10^((LogIC50-X)*HillSlope))
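The sigmoid above can be written out directly, which makes it easy to see that Y sits halfway between Bottom and Top when X equals the midpoint parameter (LogIC50 in Prism's notation). We assume here that X is the simulation temperature and that this midpoint is what is read off as the apparent melting temperature. The grid search below is only a crude stand-in for Prism's nonlinear regression, and both function names are hypothetical.

```python
def four_pl(x, bottom, top, midpoint, hill_slope):
    """Four-parameter logistic in the form given above (midpoint = LogIC50)."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((midpoint - x) * hill_slope))


def fit_midpoint(xs, ys, midpoints, hill_slopes):
    """Least-squares grid search over candidate midpoints and slopes.

    Simplifying assumption: Bottom and Top are fixed to the observed
    minimum and maximum of ys rather than fitted.
    """
    bottom, top = min(ys), max(ys)
    best = None
    for m in midpoints:
        for h in hill_slopes:
            sse = sum((four_pl(x, bottom, top, m, h) - y) ** 2
                      for x, y in zip(xs, ys))
            if best is None or sse < best[0]:
                best = (sse, m, h)
    return best[1], best[2]


# Synthetic curve at 32 temperatures with a known midpoint of 0.9
temps = [0.1 + 0.05 * i for i in range(32)]
values = [four_pl(t, 0.0, 1.0, 0.9, 10.0) for t in temps]
tm, slope = fit_midpoint(temps, values,
                         [0.5 + 0.01 * i for i in range(101)],
                         [2.0, 5.0, 10.0, 20.0])
```

An observable that decreases with temperature, such as the number of contacts, corresponds to a negative HillSlope in this form.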
Method availability
The tool is accessible from the Shakhnovich lab website: http://faculty.chemistry.harvard.edu/shakhnovich/software
Site-directed protein mutagenesis of DHFR
The wild-type dhfr gene was cloned into a pET24 expression vector under the inducible T7 promoter, then transformed into BL21(DE3) cells [69]. Single point mutations of DHFR were constructed using a two-step PCR-mutagenesis strategy [70], in which the template for the PCR was the plasmid of WT DHFR. The multiple-mutant variants of DHFR were constructed by the same method as the single point mutants, except that the PCR template was the plasmid of the corresponding dhfr mutant. To verify the dhfr mutations, DNA sequencing was performed at GENEWIZ, Inc. (MA, U.S.). The verified plasmids were transformed into competent E. coli BL21(DE3) cells for expression.
Protein expression and purification
WT DHFR and all mutants used in this study were cloned into a pET24 expression vector and overexpressed in the BL21(DE3) pLys E. coli strain.
A single colony of the transformed E. coli carrying the wild-type or mutant dhfr was cultured in Luria-Bertani liquid medium containing 50 μg/mL kanamycin (LB-kana) at 30°C overnight, then inoculated into fresh LB-kana (1:100 dilution) and incubated again at 30°C. When the OD600 of the culture reached 0.6, isopropyl β-D-1-thiogalactopyranoside (final concentration, 0.4 mM) was added. Cultures were incubated for an additional 12–16 h at 25°C. The cells were then collected by centrifugation and disrupted by sonication. The recombinant proteins were purified with Ni-NTA Superflow (QIAGEN, U.S.) according to the manufacturer’s instructions. The collected protein sample was then run on a Superdex 75pg column and desalted with a desalting column on an ÄKTA protein purification system (GE Healthcare, U.S.). The final concentration of the purified protein was determined using the BCA protein assay kit (PIERCE CHEMICAL, USA) or the NanoDrop instrument (GE Healthcare, U.S.).
Enzyme kinetics
DHFR kinetic parameters were measured by progress-curve kinetics, essentially as described [69,71]. A stopped-flow apparatus (RX.2000 Rapid Kinetics system, Applied Photophysics, UK) was used with absorbance monitoring at 340 nm, under single-turnover conditions. NADPH was preincubated with DHFR for 5 min in syringe 1 at 25°C in a thermostated syringe compartment, and the reaction was then initiated by rapidly mixing the contents with dihydrofolate (DHF) from syringe 2. The final assay conditions were 25 nM DHFR, 120 μM NADPH(D), and 25 μM DHF in MTEN buffer (50 mM 2-(N-morpholino)ethanesulfonic acid, 25 mM tris(hydroxymethyl)aminomethane, 25 mM ethanolamine, and 100 mM sodium chloride, pH 7.6). The kinetic parameters (kcat and KM) were derived from progress-curve analysis using Global Kinetic Explorer [72].
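To illustrate how kcat and KM shape a progress curve, the sketch below numerically integrates a single-substrate Michaelis-Menten rate law with a fixed-step fourth-order Runge-Kutta scheme. This is a simplified stand-in, not the model fitted in Global Kinetic Explorer: it assumes saturating NADPH and treats DHF as the only varying substrate. The kcat and KM values in the example are illustrative placeholders, not measured values, and the function names are hypothetical.

```python
def mm_rate(s, e0, kcat, km):
    """d[S]/dt for Michaelis-Menten kinetics (substrate consumption)."""
    return -kcat * e0 * s / (km + s)


def progress_curve(s0, e0, kcat, km, t_end, n_steps=1000):
    """Integrate the substrate concentration with fixed-step RK4.

    Returns a list of (time, [S]) pairs, including the initial point.
    """
    dt = t_end / n_steps
    s = s0
    curve = [(0.0, s0)]
    for i in range(1, n_steps + 1):
        k1 = mm_rate(s, e0, kcat, km)
        k2 = mm_rate(s + 0.5 * dt * k1, e0, kcat, km)
        k3 = mm_rate(s + 0.5 * dt * k2, e0, kcat, km)
        k4 = mm_rate(s + dt * k3, e0, kcat, km)
        s += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        curve.append((i * dt, s))
    return curve


# 25 uM DHF and 25 nM enzyme as in the assay; kcat (1/s) and KM (uM)
# are illustrative placeholders, not results from this study
curve = progress_curve(s0=25.0, e0=0.025, kcat=10.0, km=1.0, t_end=60.0)
```

Fitting then amounts to adjusting kcat and KM until the simulated curve matches the measured absorbance trace.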
Stability measurements
Thermal stability was characterized by differential scanning calorimetry (DSC), essentially as described in references [69,73]. Briefly, DHFR proteins in Buffer A (10 mM potassium-phosphate buffer pH 7.8 supplemented with 0.2 mM EDTA and 1 mM beta-mercaptoethanol) were subjected to a temperature increase of 1°C/min from 20 to 90°C (nano-DSC, TA Instruments, U.S.), and the evolution of heat was recorded as a differential power between reference (buffer A) and sample (120 μM protein in buffer A) cells. The resulting thermogram (after buffer subtraction) was used to derive apparent thermal transition midpoints (Tm app). Thermal unfolding appeared irreversible for all DHFR proteins tested [48], and the two-state scaled model provided in NanoAnalyze software (TA Instruments, U.S.) was used to fit the Tm app value. The mutants constructed in this study and the ones published earlier [48] were measured with different DSC instruments with slightly different calibration, leading to a small offset of about 2°C for the WT DHFR relative to the earlier published data [48].
Urea unfolding was used to measure stability of the DHFR mutants against chemical denaturation. Proteins (25 μM in buffer A) were diluted in urea (0.2 mM increments up to a final urea concentration between 0 and 6 M), preequilibrated overnight at 25°C for 3 hours, and the change in the folded fraction was monitored by a circular dichroism signal at far-uv wavelength (221 nm) at 25°C (J-710 spectropolarimeter, Jasco). Fitting to a two-state model was used to derive the chemical transition midpoint (Cm).
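The role of the transition midpoint in a two-state fit can be sketched as follows. The text above states only that a two-state model was used; the linear dependence of unfolding free energy on denaturant concentration, ΔG = m (Cm − [urea]), is an assumption here, and the Cm and m values in the example are illustrative, not fitted results.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def fraction_folded(urea, cm, m_value, temp_k=298.15):
    """Two-state folded fraction at a given urea concentration (M).

    Assumes the unfolding free energy varies linearly with denaturant:
    dG_unfold = m_value * (cm - urea), with m_value in kcal/(mol M).
    """
    dg_unfold = m_value * (cm - urea)
    return 1.0 / (1.0 + math.exp(-dg_unfold / (R_KCAL * temp_k)))


# Illustrative titration from 0 to 6 M urea (Cm and m are placeholders)
titration = [(0.5 * i, fraction_folded(0.5 * i, cm=3.0, m_value=2.0))
             for i in range(13)]
```

By construction, the folded fraction is exactly one half when the urea concentration equals Cm, which is why Cm serves as a measure of stability against chemical denaturation.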
Supporting Information
Acknowledgments
We thank Bharat Adkar and Shimon Bershtein for help with the purification and biophysical characterization of the DHFR mutants and Muyoung Heo for help and guidance in the use of the Monte Carlo simulations.
Data Availability
All relevant data are within the paper and its Supporting Information files.
Funding Statement
This work was supported by NIH grant RO1 GM06870 (EIS) www.nih.gov; NIH Biophysics training grant (JCW) www.nih.gov; National High Technology Research and Development Program of China (863 Program, 2013AA102804) http://www.most.gov.cn/eng/programmes1/200610/t20061009_36225.htm; and the National Natural Science Foundation of China (NSFC, Grant no. 31371748) (JT) http://www.nsfc.gov.cn. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Serohijos AW, Shakhnovich EI (2014) Merging molecular mechanism and evolution: theory and computation at the interface of biophysics and evolutionary population genetics. Curr Opin Struct Biol 26C: 84–91.
- 2. Liberles DA, Teichmann SA, Bahar I, Bastolla U, Bloom J, et al. (2012) The interface of protein structure, protein biophysics, and molecular evolution. Protein Sci 21: 769–785. doi: 10.1002/pro.2071
- 3. Shakhnovich E (2006) Protein folding thermodynamics and dynamics: where physics, chemistry, and biology meet. Chem Rev 106: 1559–1588.
- 4. Aguzzi A, O'Connor T (2010) Protein aggregation diseases: pathogenicity and therapeutic perspectives. Nat Rev Drug Discov 9: 237–248. doi: 10.1038/nrd3050
- 5. Dobson CM (2003) Protein folding and misfolding. Nature 426: 884–890.
- 6. Chiti F, Dobson CM (2009) Amyloid formation by globular proteins under native conditions. Nat Chem Biol 5: 15–22. doi: 10.1038/nchembio.131
- 7. Chennamsetty N, Voynov V, Kayser V, Helk B, Trout BL (2009) Design of therapeutic proteins with enhanced stability. Proc Natl Acad Sci U S A 106: 11937–11942. doi: 10.1073/pnas.0904191106
- 8. Zeldovich KB, Chen PQ, Shakhnovich EI (2007) Protein stability imposes limits on organism complexity and speed of molecular evolution. Proceedings of the National Academy of Sciences of the United States of America 104: 16152–16157.
- 9. Kumar MD, Bava KA, Gromiha MM, Prabakaran P, Kitajima K, et al. (2006) ProTherm and ProNIT: thermodynamic databases for proteins and protein-nucleic acid interactions. Nucleic Acids Res 34: D204–206.
- 10. Duarte NC, Becker SA, Jamshidi N, Thiele I, Mo ML, et al. (2007) Global reconstruction of the human metabolic network based on genomic and bibliomic data. Proceedings of the National Academy of Sciences of the United States of America 104: 1777–1782.
- 11. Bloom JD, Labthavikul ST, Otey CR, Arnold FH (2006) Protein stability promotes evolvability. Proc Natl Acad Sci U S A 103: 5869–5874.
- 12. Tokuriki N, Stricher F, Schymkowitz J, Serrano L, Tawfik DS (2007) The Stability Effects of Protein Mutations Appear to be Universally Distributed. J Mol Biol 369: 1318–1332.
- 13. Fersht AR (1995) Optimization of rates of protein folding: the nucleation-condensation mechanism and its implications. Proc Natl Acad Sci U S A 92: 10869–10873.
- 14. Abkevich VI, Gutin AM, Shakhnovich EI (1994) Specific nucleus as the transition state for protein folding: evidence from the lattice model. Biochemistry 33: 10026–10036.
- 15. McCarthy MI, Abecasis GR, Cardon LR, Goldstein DB, Little J, et al. (2008) Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nature Reviews Genetics 9: 356–369. doi: 10.1038/nrg2344
- 16. Yang JS, Wallin S, Shakhnovich EI (2008) Universality and diversity of folding mechanics for three-helix bundle proteins. Proc Natl Acad Sci U S A 105: 895–900. doi: 10.1073/pnas.0707284105
- 17. Bar-Even A, Noor E, Lewis NE, Milo R (2010) Design and analysis of synthetic carbon fixation pathways. Proc Natl Acad Sci U S A 107: 8889–8894. doi: 10.1073/pnas.0907176107
- 18. Gilis D, Rooman M (2000) PoPMuSiC, an algorithm for predicting protein mutant stability changes: application to prion proteins. Protein Eng 13: 849–856.
- 19. Yin S, Ding F, Dokholyan NV (2007) Eris: an automated estimator of protein stability. Nat Methods 4: 466–467.
- 20. Khan S, Vihinen M (2010) Performance of protein stability predictors. Hum Mutat 31: 675–684. doi: 10.1002/humu.21242
- 21. Potapov V, Cohen M, Schreiber G (2009) Assessing computational methods for predicting protein stability upon mutation: good on average but not in the details. Protein Eng Des Sel 22: 553–560. doi: 10.1093/protein/gzp030
- 22. Thiltgen G, Goldstein RA (2012) Assessing predictors of changes in protein stability upon mutation using self-consistency. PLoS One 7: e46084. doi: 10.1371/journal.pone.0046084
- 23. Neher RA, Shraiman BI (2011) Genetic draft and quasi-neutrality in large facultatively sexual populations. Genetics 188: 975–996. doi: 10.1534/genetics.111.128876
- 24. Reetz MT, Carballeira JD, Vogel A (2006) Iterative saturation mutagenesis on the basis of B factors as a strategy for increasing protein thermostability. Angew Chem Int Ed Engl 45: 7745–7751.
- 25. Tian J, Wang P, Huang L, Chu X, Wu N, et al. (2013) Improving the thermostability of methyl parathion hydrolase from Ochrobactrum sp. M231 using a computationally aided method. Appl Microbiol Biotechnol 97: 2997–3006. doi: 10.1007/s00253-012-4411-7
- 26. Alfarano P, Varadamsetty G, Ewald C, Parmeggiani F, Pellarin R, et al. (2012) Optimization of designed armadillo repeat proteins by molecular dynamics simulations and NMR spectroscopy. Protein Sci 21: 1298–1314. doi: 10.1002/pro.2117
- 27. Korkegian A, Black ME, Baker D, Stoddard BL (2005) Computational thermostabilization of an enzyme. Science 308: 857–860.
- 28. Luke KA, Higgins CL, Wittung-Stafshede P (2007) Thermodynamic stability and folding of proteins from hyperthermophilic organisms. FEBS J 274: 4023–4033.
- 29. Mustonen V, Lassig M (2009) From fitness landscapes to seascapes: non-equilibrium dynamics of selection and adaptation. Trends Genet 25: 111–119. doi: 10.1016/j.tig.2009.01.002
- 30. Xu JB, Huang L, Shakhnovich EI (2011) The ensemble folding kinetics of the FBP28 WW domain revealed by an all-atom Monte Carlo simulation in a knowledge-based potential. Proteins-Structure Function and Bioinformatics 79: 1704–1714. doi: 10.1002/prot.22993
- 31. Yang JS, Chen WW, Skolnick J, Shakhnovich EI (2007) All-atom ab initio folding of a diverse set of proteins. Structure 15: 53–63.
- 32. Boehr DD, McElheny D, Dyson HJ, Wright PE (2006) The dynamic energy landscape of dihydrofolate reductase catalysis. Science 313: 1638–1642.
- 33. Neradil J, Pavlasova G, Veselska R (2012) New mechanisms for an old drug; DHFR- and non-DHFR-mediated effects of methotrexate in cancer cells. Klin Onkol 25 Suppl 2: 2S87–92.
- 34. El-Subbagh HIH, G.S.; El-Messery S.M., Al-Rashood S.T.; Al-Omary F.A.; Abulfadl Y.S.; Shabayek M.I. (2014) Nonclassical antifolates, part 5. Benzodiazepine analogs as a new class of DHFR inhibitors: synthesis, antitumor testing and molecular modeling study. Eur J Med Chem 3: 234–245.
- 35. Jarzynski C (1997) Equilibrium free-energy differences from nonequilibrium measurements: A master-equation approach. Phys Rev E Stat Nonlin Soft Matter Phys 56: 5018–5035.
- 36. Imparato A, Luccioli S, Torcini A (2007) Reconstructing the free-energy landscape of a mechanically unfolded model protein. Phys Rev Lett 99: 168101.
- 37. Echeverria I, Amzel LM (2010) Helix propensities calculations for amino acids in alanine based peptides using Jarzynski's equality. Proteins 78: 1302–1310. doi: 10.1002/prot.22649
- 38. Schulten KT, E.; Park S.; Khalili-Araghi F. (2003) Free energy calculation from steered molecular dynamics simulations using Jarzynski's equality. Journal of Chemical Physics 119: 3559–3566.
- 39. Eyre-Walker A, Woolfit M, Phelps T (2006) The distribution of fitness effects of new deleterious amino acid mutations in humans. Genetics 173: 891–900.
- 40. Fersht A (1999) Structure and mechanism in protein science: a guide to enzyme catalysis and protein folding. New York: W.H. Freeman. xxi, 631 p.
- 41. Shakhnovich EI, Finkelstein AV (1989) Theory of cooperative transitions in protein molecules. I. Why denaturation of globular protein is a first-order phase transition. Biopolymers 28: 1667–1680.
- 42. Gutin AM, Abkevich VI, Shakhnovich EI (1998) A protein engineering analysis of the transition state for protein folding: simulation in the lattice model. Fold Des 3: 183–194.
- 43. Matouschek A, Kellis JT Jr., Serrano L, Fersht AR (1989) Mapping the transition state and pathway of protein folding by protein engineering. Nature 340: 122–126.
- 44. Fersht AR, Sato S (2004) Phi-value analysis and the nature of protein-folding transition states. Proc Natl Acad Sci USA 101: 7976–7981.
- 45. Naganathan AN, Munoz V (2010) Insights into protein folding mechanisms from large scale analysis of mutational effects. Proc Natl Acad Sci U S A 107: 8611–8616. doi: 10.1073/pnas.1000988107
- 46. Baldwin RL, Frieden C, Rose GD (2010) Dry molten globule intermediates and the mechanism of protein unfolding. Proteins 78: 2725–2737. doi: 10.1002/prot.22803
- 47. Finkelstein AV, Shakhnovich EI (1989) Theory of cooperative transitions in protein molecules. II. Phase diagram for a protein molecule in solution. Biopolymers 28: 1681–1694.
- 48. Bershtein S, Mu W, Shakhnovich EI (2012) Soluble oligomerization provides a beneficial fitness effect on destabilizing mutations. Proc Natl Acad Sci U S A 109: 4857–4862. doi: 10.1073/pnas.1118157109
- 49. Bershtein S, Mu WM, Serohijos AWR, Zhou JW, Shakhnovich EI (2013) Protein Quality Control Acts on Folding Intermediates to Shape the Effects of Mutations on Organismal Fitness. Mol Cell 49: 133–144. doi: 10.1016/j.molcel.2012.11.004
- 50. Karr JR, Sanghvi JC, Macklin DN, Gutschow MV, Jacobs JM, et al. (2012) A whole-cell computational model predicts phenotype from genotype. Cell 150: 389–401. doi: 10.1016/j.cell.2012.05.044
- 51. Ohmae E, Miyashita Y, Tate S, Gekko K, Kitazawa S, et al. (2013) Solvent environments significantly affect the enzymatic function of Escherichia coli dihydrofolate reductase: comparison of wild-type protein and active-site mutant D27E. Biochim Biophys Acta 1834: 2782–2794. doi: 10.1016/j.bbapap.2013.09.024
- 52. Reetz MT, Prasad S, Carballeira JD, Gumulya Y, Bocola M (2010) Iterative saturation mutagenesis accelerates laboratory evolution of enzyme stereoselectivity: rigorous comparison with traditional methods. J Am Chem Soc 132: 9144–9152. doi: 10.1021/ja1030479
- 53. Beadle BM, Shoichet BK (2002) Structural bases of stability-function tradeoffs in enzymes. J Mol Biol 321: 285–296.
- 54. DePristo MA, Weinreich DM, Hartl DL (2005) Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet 6: 678–687.
- 55. Milo R, Last RL (2012) Achieving Diversity in the Face of Constraints: Lessons from Metabolism. Science 336: 1663–1667. doi: 10.1126/science.1217665
- 56. Worth CL, Preissner R, Blundell TL (2011) SDM: a server for predicting effects of mutations on protein stability and malfunction. Nucleic Acids Res 39: W215–222. doi: 10.1093/nar/gkr363
- 57. Itzhaki LS, Otzen DE, Fersht AR (1995) The structure of the transition state for folding of chymotrypsin inhibitor 2 analysed by protein engineering methods: evidence for a nucleation-condensation mechanism for protein folding. J Mol Biol 254: 260–288.
- 58. Studer RA, Christin PA, Williams MA, Orengo CA (2014) Stability-activity tradeoffs constrain the adaptive evolution of RubisCO. Proc Natl Acad Sci U S A.
- 59. Fersht AR, Matouschek A, Serrano L (1992) The folding of an enzyme. I. Theory of protein engineering analysis of stability and pathway of protein folding. J Mol Biol 224: 771–782.
- 60. Adamczyk AJ, Cao J, Kamerlin SC, Warshel A (2011) Catalysis by dihydrofolate reductase and other enzymes arises from electrostatic preorganization, not conformational motions. Proc Natl Acad Sci U S A 108: 14115–14120. doi: 10.1073/pnas.1111252108
- 61. Taverna DM, Goldstein RA (2002) Why are proteins marginally stable? Proteins 46: 105–109.
- 62. Romero PA, Arnold FH (2009) Exploring protein fitness landscapes by directed evolution. Nat Rev Mol Cell Biol 10: 866–876. doi: 10.1038/nrm2805
- 63. Brown NG, Pennington JM, Huang W, Ayvaz T, Palzkill T (2010) Multiple global suppressors of protein stability defects facilitate the evolution of extended-spectrum TEM beta-lactamases. J Mol Biol 404: 832–846. doi: 10.1016/j.jmb.2010.10.008
- 64. Jacquier H, Birgy A, Le Nagard H, Mechulam Y, Schmitt E, et al. (2013) Capturing the mutational landscape of the beta-lactamase TEM-1. Proc Natl Acad Sci U S A.
- 65. Xu J, Huang L, Shakhnovich EI (2011) The ensemble folding kinetics of the FBP28 WW domain revealed by an all-atom Monte Carlo simulation in a knowledge-based potential. Proteins 79: 1704–1714. doi: 10.1002/prot.22993
- 66. Eswar N, Webb B, Marti-Renom MA, Madhusudhan MS, Eramian D, et al. (2006) Comparative protein structure modeling using Modeller. Curr Protoc Bioinformatics Chapter 5: Unit 5 6.
- 67. Phillips JC, Braun R, Wang W, Gumbart J, Tajkhorshid E, et al. (2005) Scalable molecular dynamics with NAMD. J Comput Chem 26: 1781–1802.
- 68. Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32: 1792–1797.
- 69. Bershtein S, Mu WM, Shakhnovich EI (2012) Soluble oligomerization provides a beneficial fitness effect on destabilizing mutations. Proc Natl Acad Sci U S A 109: 4857–4862. doi: 10.1073/pnas.1118157109
- 70. Kirsch RD, Joly E (1998) An improved PCR-mutagenesis strategy for two-site mutagenesis or sequence swapping between related genes. Nucleic Acids Res 26: 1848–1850.
- 71. Kim HS, Damo SM, Lee SY, Wemmer D, Klinman JP (2005) Structure and hydride transfer mechanism of a moderate thermophilic dihydrofolate reductase from Bacillus stearothermophilus and comparison to its mesophilic and hyperthermophilic homologues. Biochemistry 44: 11428–11439.
- 72. Johnson KA, Simpson ZB, Blom T (2009) FitSpace explorer: an algorithm to evaluate multidimensional parameter space in fitting kinetic data. Anal Biochem 387: 30–41. doi: 10.1016/j.ab.2008.12.025
- 73. Lopez MM, Makhatadze GI (2002) Differential scanning calorimetry. Methods Mol Biol 173: 113–119.