Author manuscript; available in PMC: 2015 Jul 17.
Published in final edited form as: Proteins. 2009 Nov 15;77(3):670–684. doi: 10.1002/prot.22481

Effective approach for calculations of absolute stability of proteins using focused dielectric constants

Spyridon Vicatos 1, Maite Roca 1,2, Arieh Warshel 1,*
PMCID: PMC4505752  NIHMSID: NIHMS130076  PMID: 19856460

Abstract

The ability to predict the absolute stability of proteins based on their corresponding sequence and structure is a problem of great fundamental and practical importance. In this work, we report an extensive refinement and validation of our recent approach (Roca et al., FEBS Lett 2007;581:2065–2071) for predicting absolute values of protein stability ΔGfold. This approach employs the semimacroscopic protein dipole Langevin dipole method in its linear response approximation version (PDLD/S-LRA) while using the best fitted values of the dielectric constants εp and εeff for the self energy and charge–charge interactions, respectively. The method is validated on a diverse set of 45 proteins. It is found that the best fitted values of both dielectric constants are around 40. However, the self energy of internal residues and the charge–charge interactions of Lys have to be treated with care, using somewhat lower values of εp and εeff. The predictions of ΔGfold reported here have an average error of only 1.8 kcal/mole compared to the observed values, making our method very promising for estimating protein stability. It also provides valuable insight into the complex electrostatic phenomena taking place in folded proteins.

Keywords: protein stability, folding energy, dielectric constants, electrostatics in proteins

INTRODUCTION

The ability to predict physical and chemical properties of proteins, given their sequence and folded tertiary structure, is of crucial importance for the study of enzymes.1–3 Computational approaches for predicting the thermal stability of proteins, i.e., the free energy difference between their folded and unfolded states, have yet to emerge and be validated. That is, despite the progress in the development of models for studying the folding of proteins4–12 there are still major problems in predicting protein stability by either microscopic or macroscopic models. For example, we lack a clear understanding of the magnitude of electrostatic contributions to thermal stability and to the overall folding free energy. Similarly, the values of the dielectric constants to be used and the contribution of the ionizable residues to the stability of a folded protein are only a few of the questions still unanswered.

Discretized continuum studies13,14 have suggested that charged and polar groups lead to destabilization of a folded protein. Other studies, however, have indicated that protein stability is far more complicated than originally thought, and that charged residues do not necessarily destabilize the protein core. Quite the opposite, they tend to increase protein stability.15–18 Even the general idea in continuum studies that internal ionizable residues tend to destabilize the protein core is now under reevaluation.19 Overall there is a growing realization that electrostatic energy is related to stability in general, and that electrostatic interactions usually stabilize the native states of proteins quite significantly. However, the exact role of electrostatic interactions in protein stability is obscured by the competition between desolvation penalties, stabilization by local protein dipoles, and charge–charge interaction.

Recently, we introduced an approach20 that evaluates the electrostatic contribution to protein stability by selecting relatively high values of dielectric constants for charge–charge interactions (εeff) and for self energy (εp). This approach determines the absolute stability of a given protein based on (a) the use of the semimacroscopic protein dipole Langevin dipole (PDLD/S) method in its linear response approximation version (the PDLD/S-LRA method) for the self energy and (b) the use of high values of εp and εeff. Our preliminary studies20 indicated that a major part of the absolute folding energy is reproduced by the electrostatic energy evaluated with a large εeff and, more surprisingly, with a large εp. The contribution of the screened charge–charge interactions is consistent with recent findings (e.g., Makhatadze and coworkers21–23 as well as Garcia Moreno and coworkers24,25 and Pace and coworkers26–28), as far as the relative contribution of surface groups is concerned. However, our finding appeared to be more general, with implications for the energetics of internal groups. The corresponding physical picture appeared to be quite intriguing, but more validation is essential. Moreover, the predictive value of our approach has not been fully validated. The present work explores this model in a much more extensive and systematic way and establishes its general validity. We focus on the challenge of finding the best fitted values for the dielectric constants and on the effectiveness of the method in predicting protein stability.

METHODS

The method described in this work focuses on the electrostatic contribution to the folding free energy, assuming that this is the main contribution. The method was described previously by Roca et al.,20 and we consider below the main points.

Our starting point is the folding paths of Figure 1, where the system can move from the unfolded state (A) to the folded state (E) along two pathways. In the first, (A) → (B) → (C) → (D) → (E), the unfolded protein becomes uncharged (transferring the charges to solution through the proper proton transfer process), folds to its neutral folded form (C), and then moves to the final folded charged form (E) through an uncharged structure (D) that is similar to that of (E). In the second path, the charged protein folds directly by moving from (A) to (E). The electrostatic contribution to folding, ΔGfoldelec, of a given protein at a specific ionization state and at a given pH is evaluated by29:

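$$
\Delta G_{\mathrm{fold}}^{\mathrm{elec}} = \Delta G_{f}^{\mathrm{elec}} - \Delta G_{uf}^{\mathrm{elec}} = -2.3RT\sum_{i} Q_i^{(f)}\left(pK_{i,\mathrm{int}}^{p}(\varepsilon_p) - \mathrm{pH}\right) + 166\sum_{i\neq j}\frac{Q_i^{(f)}Q_j^{(f)}}{r_{ij}^{(f)}\,\varepsilon_{\mathrm{eff}}\!\left(r_{ij}^{(f)}\right)} + 2.3RT\sum_{i} Q_i^{(uf)}\left(pK_{a,i}^{w} - \mathrm{pH}\right) - 166\sum_{i\neq j}\frac{Q_i^{(uf)}Q_j^{(uf)}}{\varepsilon_{\mathrm{water}}\,r_{ij}^{(uf)}} \tag{1}
$$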

where uf and f designate the unfolded and folded states, respectively; Qi is the charge of the ith residue in the folded (f) or unfolded (uf) state; εp is the dielectric constant used in the semimacroscopic calculation of the intrinsic pKa; pKi,intp(εp) is the intrinsic pKa of the ith residue in its given protein state when all other residues are neutral, at a given εp; and pKa,iw is the pKa of the ith residue in water. Finally, rij is the distance between residues i and j, while εeff is the effective dielectric constant for charge–charge interactions inside the protein and εwater is the effective dielectric constant for charge–charge interactions in water.

Figure 1.


The thermodynamic cycle for the folding of a charged protein. Steps A → B and B → C correspond, respectively, to the uncharging of the unfolded protein and the folding of the uncharged protein. Step C → D corresponds to the change of the uncharged protein from its equilibrium structure to the structure of the folded charged protein. Step A → E corresponds to a direct folding of the ionized protein. The figure defines the free energy terms used in this work.

As clarified elsewhere,29,30 εp determines the “self energy” and the corresponding intrinsic pKa of each charged group. This parameter is not related to the response of the protein to electric field but to the method used in the calculations and to the elements included explicitly in the simulation system. Basically, εp reflects all the effects that are not included explicitly in the calculations of the self energy.30 Similarly, εeff is a parameter that reflects the compensation of the gas phase charge–charge interaction by the reorganization of the solvent and the protein.29

The folding free energy also includes non-electrostatic contributions such as configuration entropy and hydrophobic contributions. These contributions depend on the path used in Figure 1. For example, we can write according to Figure 1,

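$$
\Delta G_{\mathrm{fold}} = \Delta G_{f}^{\mathrm{elec}} - \Delta G_{uf}^{\mathrm{elec}} + \Delta G_{uf\to f}^{\mathrm{uncharged}} = \Delta G_{\mathrm{fold}}^{\mathrm{elec}} + \Delta G_{uf\to f}^{\mathrm{uncharged}} \tag{2}
$$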

In this description, the electrostatic terms represent the charging process in the folded state and the uncharging process in the unfolded state, while the non-electrostatic term is entirely associated with the folding of the fully uncharged protein. The use of this equation for mutations of ionized residues of a given protein allows us to focus only on ΔGfoldelec.

Eq. (1) describes the electrostatic contribution to the folding energy of the given protein, and the corresponding value depends, of course, upon the choice of εp and εeff. The non-electrostatic term ΔGuffuncharged may well have different values for different proteins, and there is no current computational method that can provide a reliable estimate of this quantity. However, if we use the direct (A) → (E) path of Figure 1 we get a different picture of the calculated ΔGfold, since now the entire folding process can be described in terms of the corresponding change in charge–charge interactions, using

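$$
\Delta G_{\mathrm{fold}} \cong 166\sum_{i\neq j}\frac{Q_i Q_j}{\varepsilon_{\mathrm{eff}}(r_{ij})}\left[\frac{1}{r_{ij}^{(f)}} - \frac{1}{r_{ij}^{(uf)}}\right] \tag{3}
$$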

That is, now the non-electrostatic effects may be absorbed in the effective dielectric for the work of bringing the charged groups from the unfolded to the folded state in the (A) → (E) path of Figure 1. In other words, the (B) → (D) step involves non-electrostatic contributions that may be significant, while the step (D) → (E) is formally a well defined electrostatic step, which includes, of course, structural reorganization. The dielectric that reflects the structural reorganization in the (D) → (E) step may be modified by considering the free energy of the (B) → (D) step and requiring that this free energy be reproduced by Eq. (1). Now if the effect of the (B) → (D) step is not large, or if it is somehow correlated with the trend in the electrostatic free energy, we will have a uniform dielectric that works for all proteins. Conversely, if this is not the case we will not be able to use such an approximation. It is this philosophy, though with some assumptions, that leads us to the next important step of our method. Since our concept of a dielectric constant is somewhat complex we refer the reader to Refs. 29–31 for further clarification.

If the ΔGuffuncharged term is somehow small, reflecting compensation between the entropy and the hydrophobic effect, we may examine the approximation:

$$
\Delta G_{\mathrm{fold}} \cong \Delta G_{\mathrm{fold}}^{\mathrm{elec}}(\varepsilon_p, \varepsilon_{\mathrm{eff}}) \tag{4}
$$

where (εp, εeff) is the set of fitted values of εp and εeff that successfully reproduces the folding energy. In this case we obtain:

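$$
\Delta G_{\mathrm{fold}}^{\mathrm{elec}} = \Delta G_{\mathrm{fold}}^{\mathrm{elec}}(\varepsilon_p, \varepsilon_{\mathrm{eff}}) \cong -2.3RT\sum_{i} Q_i\left(pK_{i,\mathrm{int}}^{p}(\varepsilon_p) - pK_{i,w}^{w}\right) + 166\sum_{i\neq j} Q_i Q_j\left[\frac{1}{r_{ij}^{(f)}\,\varepsilon_{\mathrm{eff}}^{(f)}} - \frac{1}{80\,r_{ij}^{(uf)}}\right] \tag{5}
$$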

where uf and f designate the unfolded and folded states; Qi is the charge of the ith residue in the folded (f) state only; pKi,intp(εp) is the intrinsic pKa of the ith residue calculated with the chosen (fitted) εp rather than the original εp of Eq. (1); and pKi,ww is the pKa of the ith residue in water. Rigorously, we should use Qi(f), but since the contribution from the unfolded state will be neglected here we keep the notation Qi. We also use 80 instead of εwater, since the difference is trivial and the effect of this term is small. Finally, rij is the distance between residues i and j in the folded (f) and unfolded (uf) states. Here, εeff(f) is the chosen effective dielectric constant for the charge–charge interactions. It should be pointed out that εeff(f) in Eq. (5) is taken here as independent of the distance rij. In later sections, we discuss exceptions where the choice of εeff(f) depends on the identity of residues i and j as well as on their distance rij. At any rate, the first term in Eq. (5) represents the change of self energy upon moving a charge from water to its site in the folded protein. That is,

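$$
-2.3RT\sum_{i} Q_i\left(pK_{i,\mathrm{int}}^{p}(\varepsilon_p) - pK_{i,w}^{w}\right) = \Delta\Delta G_{\mathrm{sol}}^{w\to p}\left(Q_i = 0 \to Q_i = Q_i^{0}\right) \tag{6}
$$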

The second term in Eq. (5) represents the effect of charge–charge interaction.

Adopting the approximation of Eq. (5) reduces the complicated and difficult problem of calculating the absolute folding energy of a protein to that of finding the best fitted values of εp and εeff(f), which provide the best estimate of ΔGfold. This idea is based on the fact that semimacroscopic models can implicitly represent physical and chemical properties even though they lack an explicit representation of the phenomena taking place. Of course, such an approach can work only if the given model retains the main physics of the real system. Overall, we expect the best fitted values of εp and εeff(f) to be higher than the values of 4 to 8 used, for example, to calculate accurate pKa’s in proteins.32

Equation (5) can be further simplified, by assuming that rij(uf) is in general much larger than rij(f), and that εeff(f) is smaller than 80. Thus, Eq. (5) is finally approximated by:

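$$
\Delta G_{\mathrm{fold}}^{\mathrm{elec}}\!\left(\varepsilon_p, \varepsilon_{\mathrm{eff}}^{(f)}\right) = -2.3RT\sum_{i} Q_i\left(pK_{i,\mathrm{int}}^{p}(\varepsilon_p) - pK_{i,w}^{w}\right) + 166\sum_{i\neq j} Q_i Q_j\left[\frac{1}{r_{ij}^{(f)}\,\varepsilon_{\mathrm{eff}}^{(f)}}\right] = -2.3RT\sum_{i} Q_i\left(pK_{i,\mathrm{int}}^{p}(\varepsilon_p) - pK_{i,w}^{w}\right) + 166\sum_{i\neq j} Q_i Q_j\left[\frac{1}{r_{ij}\,\varepsilon_{\mathrm{eff}}}\right] \tag{7}
$$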

In Eq. (7) we drop the notation (f) from the variables rij(f) and εeff(f), and we will use rij and εeff as the distance and the corresponding dielectric constant for interactions between residues i and j in the folded state. This simplification may overlook cases where the contribution from the unfolded protein is significant. This might be the case when the unfolded protein has a compact structure, as implied by the studies of Raleigh and coworkers33 or when electrostatic interactions in the unfolded state play a significant role in folding kinetics, as suggested by Pace, Trefethen, and coworkers.26,27 However, it is possible that the contribution from the unfolded configuration is relatively small, and the best way to judge this issue is to explore the validity of Eq. (7) in a large number of test cases.
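As an illustration of how Eq. (7) can be evaluated once the charges, intrinsic pKa values, and inter-residue distances are available, the following minimal Python sketch sums the self-energy and charge–charge terms. It is not the MOLARIS implementation: the function name, the use of Cartesian charge centers as input, the temperature of 300 K, and the toy numbers are all assumptions made for this example.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def delta_g_fold_elec(charges, pk_int, pk_water, coords, eps_eff, temperature=300.0):
    """Return the Eq. (7) estimate in kcal/mol.

    charges  : Q_i of each ionizable residue in the folded state
    pk_int   : intrinsic pKa of each residue at the chosen eps_p
    pk_water : pKa of each residue in water
    coords   : (x, y, z) charge centers in Angstrom (hypothetical input)
    """
    rt23 = 2.3 * R_KCAL * temperature
    # Self-energy term: -2.3RT * sum_i Q_i (pK_int,i - pK_w,i)
    self_term = -rt23 * sum(
        q * (p_int - p_w) for q, p_int, p_w in zip(charges, pk_int, pk_water)
    )
    # Charge-charge term: 166 * sum over ordered pairs i != j of Q_i Q_j / (r_ij eps_eff)
    pair_sum = 0.0
    for i, (qi, xi) in enumerate(zip(charges, coords)):
        for j, (qj, xj) in enumerate(zip(charges, coords)):
            if i != j:
                pair_sum += qi * qj / (math.dist(xi, xj) * eps_eff)
    return self_term + 166.0 * pair_sum


# Toy two-residue example (illustrative numbers, not taken from this work)
example = delta_g_fold_elec(
    charges=[-1.0, 1.0],
    pk_int=[5.0, 10.0],
    pk_water=[4.0, 10.4],
    coords=[(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    eps_eff=40.0,
)
```

In this toy case the two unfavorable pKa shifts contribute a positive self-energy term, which is largely offset by the attractive screened interaction of the ion pair at 5 Å.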

The intrinsic pKa’s were evaluated with the PDLD/S-LRA method according to the standard protocol, using the POLARIS module and the ENZYMIX force field of the MOLARIS program.20,36 Each protein studied in this work was first solvated by the surface constrained all atom solvent (SCAAS) model,35,36 and all the ionizable groups (Asp, Glu, Lys, Arg, but not His) at pH = 7 were assigned a charge equal to 50% of their full charge in the ionized state (this was considered the best procedure for the initial relaxation). The resulting system (protein and waters) was equilibrated by running a 100 ps molecular dynamics simulation with a 1 fs time step at 300 K. The subsequent PDLD/S-LRA calculations of the pKi,intp(εp) of each ionizable residue started with a 10 ps equilibration run, followed by 25 runs of 2 ps each (starting from different configurations) on both the charged and uncharged states of the given residue. The resulting calculated pKi,int values were then averaged to evaluate the final pKi,intp(εp). The total simulation time depends on the size and the number of ionizable groups of the given protein. For example, it took 22 hours on 4 dual core nodes (Dual Intel P4 3.0 GHz, 2 GB Memory) to evaluate all the pKi,intp(εp) of Snase (protein 1EY0, size 136 residues, which contains 48 ionizable residues).

Once the intrinsic pKa’s were evaluated, the ionization states of the protein residues at pH = 7 were determined by using a Monte Carlo approach described previously.36 This procedure was repeated at different εp,εeff and provided the charges (the Qi) of the ionized residues as a function of the given dielectric constants. Using the pKi,intp(εp) and the Qi in Eq. (7) provided ΔGfoldelec(εp,εeff) as a function of εp and εeff.

RESULTS

Initial estimate of the best fitted values of εp and εeff

In this work, we moved to a much more extensive benchmark than the one used in our previous work,20 considering all the proteins listed in Table I. Our test set included 45 proteins with various sequence sizes, folds, and function.

Table I.

The benchmark considered in this work

| No. | Name | PDB ID | Number of residues | Type | Source |
|---|---|---|---|---|---|
| 1 | SSO7d | 1SSO | 62 | monomer | 37 |
| 2 | Thioredoxin | 2TRX | 108 | monomer | 37 |
| 3 | Barstar | 1A19 | 89 | monomer | 37 |
| 4 | Apoflavodoxin | 1FTG | 168 | monomer | 37 |
| 5 | λ-Repressor | 1LMB | 87 | monomer | 37 |
| 6 | Snase | 1EY0 | 136 | monomer | 37 |
| 7 | BsHpr, Phosphotransferase | 2HID | 87 | monomer | 37 |
| 8 | FeCyt b562 | 1QPU | 106 | monomer | 37,38 |
| 9 | Arc Repressor | 1ARQ | 53 | dimer | 37,39 |
| 10 | Aspartate Amilotransferase | 1VPE | 398 | monomer | 3 |
| 11 | Chey | 1TMY | 118 | monomer | 3 |
| 12 | GDH Domain II | 1B26 (A02) | 234 | monomer | 2,3,37 |
| 13 | Histidine Phosphocarrier | 1Y4Y | 87 | monomer | 3 |
| 14 | Phosphotransferase, Histidine containing protein | 2HPR | 87 | monomer | 3 |
| 15 | Ribosomal | 1H7M | 99 | monomer | 34 |
| 16 | RNase H* | 1JXB | 147 | monomer | 3,40 |
| 17 | Ferridoxin | 1VJW | 59 | monomer | 3 |
| 18 | O-Methyl Guanine DNA methyltransferase | 1MGT | 169 | monomer | 3 |
| 19 | Sac7d | 1WD0 | 66 | monomer | 2,3 |
| 20 | Histone^a | 1BFM | 134 | dimer | 2,41 |
| 21 | PFRD-XC4 | 1QCV | 53 | monomer | 42 |
| 22 | Staphylococcal Nuclease I92K^a | 1TR5 | 130 | monomer | 43 |
| 23 | Staphylococcal Nuclease I92E | 1TR5 | 130 | monomer | 43 |
| 24 | Staphylococcal Nuclease^a | 1TR5 | 130 | monomer | 43 |
| 25 | Bs CSP | 1CSP | 67 | monomer | 20,44 |
| 26 | Bc CSP | 1C9O | 66 | monomer | 20,45 |
| 27 | Tm CSP | 1G6P | 61 | monomer | 2,20 |
| 28 | Ubiquitin D21N^a | 1AAR | 76 | monomer | 20,46 |
| 29 | Ubiquitin F45W | 1AAR | 76 | monomer | 20,47 |
| 30 | Ubiquitin K27A^a | 1AAR | 76 | monomer | 20,46 |
| 31 | Tm DHFR | 1CZ3 | 164 | dimer | 20,48 |
| 32 | Ec DHFR | 1RX2 | 159 | monomer | 20 |
| 33 | Ribonuclease | 1X1P | 212 | monomer | 2,49 |
| 34 | Glucanase C | 1CX1 | 153 | monomer | 50 |
| 35 | Phospholipid A2 | 1P2P | 124 | monomer | 51 |
| 36 | Trypsin Proteinase | 2CI2 | 65 | monomer | 52 |
| 37 | Interleukin | 1IOB | 153 | monomer | 53 |
| 38 | Adhesion transferase Y92E^a | 1K40 | 126 | monomer | 54 |
| 39 | Complement | 1GKG | 136 | monomer | 4 |
| 40 | Cytochrome b5 Rat | 1B5M | 84 | monomer | 55 |
| 41 | Gene V DNA Binding | 1VQB | 86 | monomer | 56 |
| 42 | Glu Transferase | 1GSD | 208 | dimer | 57 |
| 43 | Growth Factor | 1FGA | 124 | monomer | 58 |
| 44 | Isomerase | 1HTI | 248 | dimer | 59 |
| 45 | Ribosomal S6 | 1RIS | 97 | dimer | 60 |

^a The structure used to calculate ΔGfoldelec was obtained by mutating the corresponding PDB structure reported in the third column of this table.

We started with a rough estimate of the best fitted set of εp,εeff for the prediction of ΔGfold. This was done first for a single protein (λ-Repressor, PDB ID 1LMB) by the approach illustrated in Figure 2.

Figure 2.


The calculated values of ΔGfoldelec as a function of εp and εeff for λ-Repressor. The best fitted values of εp and εeff are taken as the values that give the best agreement between ΔGfoldelec(εp,εeff) and the observed ΔGfold. The observed stability for this protein is −4.6 kcal/mole. As seen from the figure, these values are around 35 ≤ εp ≤ 45 and 35 ≤ εeff ≤ 45.

To clarify the nature of Figure 2, it is useful to start, for example, with the evaluation of ΔGfoldelec(εp,εeff) at a specific point (e.g., εp = 20 and εeff = 30). In this case, we first used εp = 20 and evaluated the intrinsic pKa values for all the ionizable residues. In the next step, we used Eq. (7) with εeff = 30 and obtained ΔGfoldelec(εp,εeff) = ΔGfoldelec(20,30) = 4.27 kcal/mol. This value was then assigned to the corresponding point in Figure 2. The same procedure was repeated for other values of εp and εeff until the figure was completed. Next, we identified the values of ΔGfoldelec(εp,εeff) that gave the best agreement with the observed ΔGfold (around −4.6 kcal/mole for λ-Repressor) and thereby identified the optimal dielectric constants. In the case of Figure 2, the best fitted values were found to be in the vicinity of εp = εeff = 40.
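The single-protein scan described above can be sketched as a simple grid search. This is only an illustration under stated assumptions: `best_fit_dielectrics` is a hypothetical helper, and `toy_calc_dg` is an arbitrary stand-in for the actual PDLD/S-LRA and Eq. (7) evaluation, chosen so that the toy surface passes through the observed value of −4.6 kcal/mol at one grid point.

```python
def best_fit_dielectrics(calc_dg, dg_obs, eps_p_values, eps_eff_values):
    """Scan a grid and return (eps_p, eps_eff, dG) closest to the observed dG."""
    best = None
    for eps_p in eps_p_values:
        for eps_eff in eps_eff_values:
            dg = calc_dg(eps_p, eps_eff)
            if best is None or abs(dg - dg_obs) < abs(best[2] - dg_obs):
                best = (eps_p, eps_eff, dg)
    return best


# Hypothetical stand-in for the PDLD/S-LRA + Eq. (7) evaluation of one protein
def toy_calc_dg(eps_p, eps_eff):
    return -0.05 * eps_p - 0.065 * eps_eff


grid = range(20, 50, 5)  # 20, 25, ..., 45
best_eps_p, best_eps_eff, best_dg = best_fit_dielectrics(toy_calc_dg, -4.6, grid, grid)
```

In a real application the callable would wrap the full calculation for each grid point, which is by far the most expensive step, so the grid is best kept coarse.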

After applying the above approach to all 45 proteins of our test set, we considered several alternative ways of analyzing the corresponding results. The first and simplest analysis, which is summarized in Figure 3, was based on allowing both εp and εeff to be either 35 or 40, and choosing the value of ΔGfoldelec(εp,εeff) that minimizes the difference |ΔGfoldelec(εp,εeff) − ΔGfoldobs|. The corresponding results were used to generate Figure 3(A). This type of treatment will be referred to here as getting the “best fitted” ΔGfoldelec(εp,εeff). The same procedure was followed in the generation of Figure 3(B), but this time with 8 ≤ εp ≤ 20 and 8 ≤ εeff ≤ 20.

Figure 3.


The correlation between the best fitted ΔGfoldelec(εp,εeff) and ΔGfold for best fitted values of εp and εeff in a restricted range. The calculations considered all the proteins in the benchmark of Table I and take the value of ΔGfoldelec(εp,εeff) that gives best fitted results for the two allowed values of εp and εeff (see text for the procedure used for selecting the best fitted values). The two values considered in Figure 3(A) are εp, εeff = 35, 40 whereas in Figure 3(B) we consider εp, εeff = 8, 20. As discussed in the text, the calculated best fitted values of ΔGfoldelec(εp,εeff) diverge greatly from the corresponding observed values. The average error between observed ΔGfold and best fitted ΔGfoldelec(εp,εeff) is 3.5 kcal/mole, and 7.3 kcal/mole for Figure 3(A) and 3(B), respectively. Some points of Figure 3(B) are outside the scale of this figure.

As seen from both Figures 3(A) and 3(B), there is significant disagreement between the best fitted ΔGfoldelec(εp,εeff) and the observed ΔGfold for many proteins. However, from these initial trials it appears that with the use of high values of εp and εeff of around 40, the absolute difference between observed and calculated stability, |ΔGfoldelec(εp,εeff) − ΔGfold|, is significantly smaller than the difference obtained with other dielectric constants (see the example in Figure 3(B), where the discrepancy is much larger in the region where 8 ≤ εp ≤ 20 and 8 ≤ εeff ≤ 20).

Improving the model

In the next stage of the refinement of our model, we tried to further search for the best fitted values for the dielectric constants. This was done by considering the dielectrics that minimize the function

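$$
F_{\mathrm{error}}(\varepsilon_p, \varepsilon_{\mathrm{eff}}) = \sum_{i}\left|\Delta G_{\mathrm{fold}}^{(i)\,\mathrm{elec}}(\varepsilon_p, \varepsilon_{\mathrm{eff}}) - \Delta G_{\mathrm{fold}}^{(i)\,\mathrm{obs}}\right| \tag{8}
$$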

where i runs over all 45 test proteins, and Ferror(εp,εeff) is the sum of the absolute errors between the calculated and observed stabilities for specific values of εp and εeff. The resulting surface is shown in Figures 4(A) and 4(B).
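The error function of Eq. (8) and the search for its minimum can be sketched in a few lines of Python. The dictionary layout (one list of calculated stabilities per grid point) and the three-protein toy data are assumptions made purely for illustration; they are not values from this work.

```python
def f_error(calculated, observed):
    """Eq. (8): sum of absolute errors over all proteins (kcal/mol)."""
    return sum(abs(c - o) for c, o in zip(calculated, observed))


def minimize_f_error(surface, observed):
    """Return the (eps_p, eps_eff) key whose calculated set minimizes Eq. (8)."""
    return min(surface, key=lambda point: f_error(surface[point], observed))


# Toy data for three hypothetical proteins (not values from Table III)
observed = [-4.6, -8.0, -6.2]
surface = {
    (20, 20): [-1.0, -3.5, -2.0],
    (40, 40): [-4.5, -7.9, -6.3],
    (40, 20): [-9.0, -12.0, -10.5],
}
best_point = minimize_f_error(surface, observed)
```

Because the sum uses absolute values rather than squares, a single badly predicted protein influences the location of the minimum less strongly than it would in a least-squares fit.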

Figure 4.


The surface (A) and contour plots (B) of Ferror(εp,εeff). In these plots it is clearly shown that the best fitted values for the dielectric constants follow approximately the line εp=εeff, and that for εp=εeff=40 and above, the error reaches its minimum plateau.

As seen from the figures, the best fitted values of εp and εeff follow approximately a line where εp=εeff, and the global minimum occurs approximately where εp=εeff=40. Thus, it is concluded that the best fitted dielectric constants needed for the prediction of protein stability are around εp=εeff=40.

To improve the performance of our model, we started with a global analysis of the reasons for the disagreement between the calculated and observed values shown in Figure 3. This analysis focused first on the location of the ionized residues in the cases that did not produce satisfactory results. This was done by evaluating the distance from each ionized group to the closest water molecule, using the grid created by MOLARIS in the initial process of generating the SCAAS water sphere (see Refs. 29 and 36 for earlier studies along this line). The use of the water grid in the identification of internal and external residues is illustrated in Figure 5.

Figure 5.


Illustrating our procedure for defining internal groups. The protein is surrounded by a cubic grid of water molecules and the distance between the water oxygen and the given group is used to determine whether the group is internal or external.

The above grid analysis indicated that the main problem is associated with internal groups, and led us to modify our approach, with the working hypothesis that in the case of truly internal groups it is better to use a lower εp while still using a large εeff. A residue was defined as internal when both of the following conditions were met: (a) the shortest distance between a grid point and a heavy atom of the given residue is greater than or equal to the threshold value reported in Table II, which depends upon the type of the residue, and (b) the given residue has a low number of water molecules (five or fewer) within a radius of 5 Å from its geometrical center. Condition (b) was used mainly as a verification of condition (a).

Table II.

The Threshold Distances Between the Terminal Heavy Atom of the Given Amino Acid and the Closest Grid Point (The Closest Water Molecule)

Residue type Threshold (Å)
ASP 5.0
GLU 5.2
HIS 5.2
LYS 5.5
ARG 5.7

For example, a Glu residue is considered to be internal when there are five or fewer water molecules within a distance of 5 Å from its geometrical center, and the closest water is 5.2 Å or more from its furthest heavy atom, which in this case is one of the two oxygen atoms of its side chain.
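The two conditions, together with the thresholds of Table II, can be expressed as a short classification routine. The sketch below assumes that the distance to the closest grid point and the count of waters within 5 Å have already been measured; the function and argument names are hypothetical.

```python
# Threshold distances (Angstrom) from Table II
THRESHOLD_ANGSTROM = {"ASP": 5.0, "GLU": 5.2, "HIS": 5.2, "LYS": 5.5, "ARG": 5.7}

MAX_NEARBY_WATERS = 5  # condition (b): five or fewer waters within 5 Angstrom


def is_internal(residue_type, closest_grid_distance, waters_within_5a):
    """Apply conditions (a) and (b) to decide whether a residue is internal."""
    far_from_water = closest_grid_distance >= THRESHOLD_ANGSTROM[residue_type]
    few_waters = waters_within_5a <= MAX_NEARBY_WATERS
    return far_from_water and few_waters


glu_buried = is_internal("GLU", 5.2, 5)    # meets both conditions
glu_exposed = is_internal("GLU", 5.1, 2)   # fails the distance threshold
lys_hydrated = is_internal("LYS", 6.0, 6)  # fails the water-count check
```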

If a single ionizable residue is identified as an internal residue, then the intrinsic pKa for this residue is chosen to take a value between the calculated pKi,intp for εp=8 and the pKi,intp for εp=20. For example, if a tested protein has 20 ionizable residues, and residue 17 is identified as internal, then in Eq. (7), for residue i = 17 the pKa value used will be:

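$$
pK_{17,\mathrm{int}}^{p}(\varepsilon_p = \mathrm{fixed}) \cong \frac{pK_{17,\mathrm{int}}^{p}(\varepsilon_p = 8) + pK_{17,\mathrm{int}}^{p}(\varepsilon_p = 20)}{2} \tag{9}
$$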

Now Eq. (7) for this case (protein with 20 ionizable residues and one internal residue) becomes:

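$$
\Delta G_{\mathrm{fold}}^{\mathrm{elec}}(\varepsilon_p, \varepsilon_{\mathrm{eff}}) = -2.3RT\sum_{\substack{i=1 \\ i\neq 17}}^{20} Q_i\left(pK_{i,\mathrm{int}}^{p}(\varepsilon_p) - pK_{i,w}^{w}\right) - 2.3RT\,Q_{17}\left(pK_{17,\mathrm{int}}^{p}(\varepsilon_p = \mathrm{fixed}) - pK_{17,w}^{w}\right) + 166\sum_{i\neq j} Q_i Q_j\left[\frac{1}{r_{ij}\,\varepsilon_{\mathrm{eff}}}\right] \tag{10}
$$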

In some cases we identified internal ionizable residues with a very large difference between the calculated pKi,intp(εp) and pKi,ww. In these cases we found it useful to use an even lower value of εp. Thus we further refined our procedure and used an intrinsic pKa that corresponds to εp = 8 when the difference pKi,intp(εp) − pKi,ww, for values of εp between 4 and 8, was positive and higher than 4 for Asp and Glu, or negative and lower than −3 for Lys and Arg.
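The selection of the intrinsic pKa for an internal residue can be summarized in a small routine. This is a sketch under explicit assumptions: the midpoint of Eq. (9) is used as the default, and the pKa shift is tested at εp = 8 only (one point in the 4 to 8 range mentioned above); the function name and the example pKa values are hypothetical.

```python
ACIDS = {"ASP", "GLU"}
BASES = {"LYS", "ARG"}


def internal_pk_int(residue_type, pk_eps8, pk_eps20, pk_water):
    """Pick the intrinsic pKa used in Eq. (10) for an internal residue.

    Assumption: the shift is tested with the eps_p = 8 value only.
    """
    shift = pk_eps8 - pk_water
    strongly_shifted = (residue_type in ACIDS and shift > 4.0) or (
        residue_type in BASES and shift < -3.0
    )
    if strongly_shifted:
        return pk_eps8  # use the eps_p = 8 value directly
    return 0.5 * (pk_eps8 + pk_eps20)  # midpoint, as in Eq. (9)


glu_strong = internal_pk_int("GLU", pk_eps8=9.0, pk_eps20=6.0, pk_water=4.3)
glu_mild = internal_pk_int("GLU", pk_eps8=7.0, pk_eps20=5.0, pk_water=4.3)
lys_strong = internal_pk_int("LYS", pk_eps8=6.0, pk_eps20=9.0, pk_water=10.4)
```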

Assigning lower εp values and using the corresponding pKi,intp(εp) for the internal ionizable residues resulted in a significant increase in the accuracy of the calculated ΔGfold for our protein test set. Now the majority of the proteins showed excellent ΔGfold predictions for dielectrics of εp = εeff ≈ 40. However, we also observed that a significant fraction of the tested proteins gave more accurate results for 35 ≤ εp ≤ 40 and 20 ≤ εeff ≤ 25. The results for both cases are shown in Figures 6(A) and 6(B), respectively.

Figure 6.


The correlation between ΔGfoldelec and ΔGfoldobs after using lower εp for the internal residues. Both Figures 6(A) and 6(B) use the same representation as in Figure 3. In this case, the test proteins fall into two main categories: (a) the one shown in Figure 6(A), where the best fitted ΔGfoldelec(εp,εeff) is in the range 35 ≤ εp ≤ 40, 35 ≤ εeff ≤ 40, and (b) the one shown in Figure 6(B), where the best fitted ΔGfoldelec(εp,εeff) is in the range 35 ≤ εp ≤ 40, 20 ≤ εeff ≤ 25. The majority of the test proteins fall into the first category.

Some of the results obtained after the specialized treatment of the internal residues were still puzzling. That is, although the majority of the proteins tested gave excellent results for εp=εeff=40, there was still a significant number of proteins where ΔGfoldelec(εp,εeff) was substantially lower than ΔGfold (see Fig. 6B). However, the fact that in both cases the value of εp was around 40 indicated that the problem is due to εeff, which in certain cases produced an underestimation of ΔGfold. To decipher this non-obvious behavior of εeff, we concentrated on cases where point mutations resulted in “shifting” the best fitted values of εp and εeff from the 35 ≤ εp ≤ 40 and 35 ≤ εeff ≤ 40 range to the 35 ≤ εp ≤ 40 and 20 ≤ εeff ≤ 25 range, and vice versa. This was found in the case of the CSP, where we examined three proteins with high sequence similarity and only a few point mutations. In the case of bs-CSP we obtained the best results in the 35 ≤ εp ≤ 40 and 20 ≤ εeff ≤ 25 range, in the case of bc-CSP we obtained the best results in the 35 ≤ εp ≤ 40 and 25 ≤ εeff ≤ 30 range, while in the case of tm-CSP we obtained the best results in the 35 ≤ εp ≤ 40 and 35 ≤ εeff ≤ 40 range. The analysis of these data indicated that the interaction of a Lys residue with charged residues at 10 Å or less leads to the above shift. Thus, we modified the εeff for the interaction of Lys with negatively charged residues by using:

$$
(\varepsilon_{\mathrm{eff}})_{\mathrm{shifted}} = 0.8\,\varepsilon_{\mathrm{eff}} \tag{11}
$$

for a distance rij ≤ 10 Å.

This special treatment is rationalized by the fact that Lys residues can move their center quite easily and thus can adjust their interaction with counter ions (see analysis in our previous work61).
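The Lys-specific scaling of Eq. (11) amounts to choosing a pair-dependent dielectric constant in the charge–charge sum. The sketch below is one possible way to encode it; the helper name is hypothetical, and treating only ionized Asp and Glu as the negatively charged partners is an assumption of this example.

```python
NEGATIVE_RESIDUES = {"ASP", "GLU"}  # assumed set of negatively charged partners


def pair_eps_eff(res_i, res_j, r_ij, eps_eff, cutoff=10.0, scale=0.8):
    """Return the dielectric constant to use for the i-j charge-charge term.

    Lys interacting with a negatively charged residue at r_ij <= cutoff
    (Angstrom) uses the shifted value of Eq. (11); all other pairs use eps_eff.
    """
    lys_with_anion = (res_i == "LYS" and res_j in NEGATIVE_RESIDUES) or (
        res_j == "LYS" and res_i in NEGATIVE_RESIDUES
    )
    if lys_with_anion and r_ij <= cutoff:
        return scale * eps_eff
    return eps_eff


close_pair = pair_eps_eff("LYS", "GLU", 8.0, 40.0)   # shifted
far_pair = pair_eps_eff("GLU", "LYS", 12.0, 40.0)    # beyond the cutoff
like_pair = pair_eps_eff("LYS", "ARG", 6.0, 40.0)    # no negative partner
```

A lower dielectric constant strengthens the attractive Lys ion-pair term, which is the direction expected if the flexible Lys side chain can move closer to its counter ion than the static structure suggests.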

Using both the modified εp and εeff for Lys allowed us finally to obtain much better results, as shown in Figure 7, and in the corresponding columns in Table III. Now the agreement between the best fitted ΔGfoldelec(εp,εeff) and ΔGfold is excellent.

Figure 7.


The best fitted ΔGfoldelec(εp,εeff) and observed ΔGfold for the tested proteins, after the use of lower εp for the internal residues and lower εeff for Lys with rij ≤ 10 Å. In this case, all proteins show best fitted values of ΔGfoldelec(εp,εeff) when the dielectric constants are in the range 35 ≤ εp ≤ 40, 35 ≤ εeff ≤ 40. The difference between the best fitted ΔGfoldelec(εp,εeff) and the observed ΔGfold has decreased significantly compared to the previous cases of Figures 3 and 6. (For 45 proteins, the average difference is reduced to 0.7 kcal/mole.)

Table III.

Predicted ΔGfold for the Series of Proteins Considered in this Work^a

| No. | Protein | ΔGfoldelec | ΔGfoldobs | Abs. error | Best εp | Best εeff | ΔḠfoldcalc | Abs. error |
|---|---|---|---|---|---|---|---|---|
| 1 | SSO7d | 7.9 | 8.0 | 0.1 | 40 | 40 | 8.5 | 0.5 |
| 2 | Thioredoxin^b | 4.4 | 9.0 | 4.6 | 40 | 35 | 3.1 | 5.9 |
| 3 | Barstar | 5.9 | 5.7 | 0.2 | 35 | 35 | 4.9 | 0.8 |
| 4 | Apoflavodoxin | 4.3 | 4.3 | 0.0 | 35 | 40 | 5.0 | 0.7 |
| 5 | λ-Repressor | 4.5 | 4.6 | 0.1 | 40 | 40 | 4.8 | 0.2 |
| 6 | Snase | 6.3 | 6.2 | 0.1 | 35 | 40 | 8.0 | 1.8 |
| 7 | BsHpr, Phosphotransferase | 3.1 | 4.0 | 0.9 | 40 | 35 | 2.1 | 1.9 |
| 8 | FeCyt b562 | 7.7 | 5.2 | 2.5 | 35 | 40 | 9.7 | 4.5 |
| 9 | Arc Repressor | 4.3 | 4.6 | 0.3 | 40 | 35 | 1.7 | 2.9 |
| 10 | Aspartate Amilotransferase | 26.8 | 28.9 | 2.1 | 40 | 40 | 31.0 | 2.1 |
| 11 | Chey | 11.6 | 9.5 | 2.1 | 35 | 40 | 13.5 | 4.0 |
| 12 | GDH Domain II | 5.6 | 4.9 | 0.7 | 35 | 35 | 4.2 | 0.7 |
| 13 | Histidine Phosphocarrier | 5.4 | 8.2 | 2.8 | 40 | 35 | 4.5 | 3.7 |
| 14 | Phosphotransferase, Histidine containing protein | 5.3 | 5.2 | 0.1 | 40 | 35 | 4.0 | 1.2 |
| 15 | Ribosomal | 7.4 | 10.7 | 3.3 | 40 | 35 | 6.4 | 4.3 |
| 16 | RNase H* | 7.1 | 7.5 | 0.4 | 40 | 35 | 6.5 | 1.0 |
| 17 | Ferridoxin^b | 3.8 | 9.3 | 5.5 | 40 | 35 | 3.2 | 6.1 |
| 18 | O-Methyl Guanine DNA methyltransferase | 10.2 | 10.2 | 0.0 | 35 | 35 | 9.1 | 1.1 |
| 19 | Sac7d | 7.8 | 7.4 | 0.4 | 40 | 40 | 9.0 | 1.6 |
| 20 | Histone | 7.8 | 7.2 | 0.6 | 40 | 35 | 7.6 | 0.4 |
| 21 | PFRD-XC4 | 2.1 | 3.2 | 1.1 | 40 | 35 | 1.2 | 2.0 |
| 22 | Staphylococcal Nuclease I92K | 1.5 | 2.3 | 0.8 | 40 | 40 | 2.8 | 0.5 |
| 23 | Staphylococcal Nuclease I92E | 3.8 | 4.2 | 0.4 | 40 | 35 | 2.2 | 2 |
| 24 | Staphylococcal Nuclease | 11.5 | 11.5 | 0.0 | 35 | 35 | 10.3 | 1.2 |
| 25 | Bs CSP | 1.1 | 1.2 | 0.1 | 35 | 35 | 0.8 | 0.4 |
| 26 | Bc CSP | 5.2 | 5.0 | 0.2 | 40 | 35 | 4.2 | 0.8 |
| 27 | Tm CSP | 6.8 | 6.5 | 0.3 | 35 | 40 | 8.0 | 1.5 |
| 28 | Ubiquitin D21N | 5.1 | 6.1 | 1.0 | 40 | 35 | 3.9 | 2.2 |
| 29 | Ubiquitin F45W | 6.9 | 7.4 | 0.5 | 40 | 35 | 5.9 | 1.5 |
| 30 | Ubiquitin K27A | 4.3 | 4.4 | 0.1 | 40 | 40 | 4.8 | 0.4 |
| 31 | Tm DHFR | 30.4 | 30.1 | 0.3 | 40 | 35 | 25.9 | 4.2 |
| 32 | Ec DHFR | 6.5 | 6.0 | 0.5 | 35 | 35 | 4.9 | 1.1 |
| 33 | Ribonuclease | 11.2 | 10.5 | 0.7 | 40 | 40 | 13.1 | 2.6 |
| 34 | Glucanase C^b | 0.0 | 0.0 | 0.0 |  |  | 0.0 | 0.0 |
| 35 | Phospholipid A2 | 7.8 | 6.5 | 1.3 | 35 | 40 | 9.1 | 2.6 |
| 36 | Trypsin Proteinase | 7.7 | 7.5 | 0.2 | 35 | 35 | 6.8 | 0.7 |
| 37 | Interleukin | 9.7 | 9.1 | 0.6 | 35 | 35 | 8.9 | 0.2 |
| 38 | Adhesion transferase Y92E | 7.5 | 8.0 | 0.5 | 40 | 40 | 8.4 | 0.4 |
| 39 | Complement | 2.6 | 2.7 | 0.1 | 35 | 35 | 2.5 | 0.2 |
| 40 | Cytochrome b5 Rat | 3.5 | 3.2 | 0.3 | 35 | 35 | 2.3 | 0.9 |
| 41 | Gene V DNA Binding | 7.6 | 9.0 | 1.4 | 40 | 35 | 6.6 | 2.4 |
| 42 | Glu Transferase | 13.6 | 14.4 | 0.8 | 35 | 35 | 10.2 | 4.2 |
| 43 | Growth Factor | 0.0 | 0.0 | 0.0 |  |  | 0.0 |  |
| 44 | Isomerase | 19.2 | 19.3 | 0.1 | 40 | 35 | 13.7 | 5.6 |
| 45 | Ribosomal S6 | 11.1 | 8.0 | 3.1 | 35 | 40 | 12.9 | 4.9 |
^a All free energies reported are in kcal/mol. The third column reports the experimental values of the absolute stabilities of the tested proteins. The last two columns give the predicted stability according to Eq. (12) and the corresponding error. The sources of the observed values are given in Table I.

^b Proteins containing a disulfide bond (which is not treated by the present method) where the corresponding “missing” energy is around 5 kcal/mole.

Using the results described in Figure 7 to calculate the sum of errors Ferror(εp,εeff) introduced in Eq. (8), we again evaluated the best fitted values of the dielectric constants by determining the region of (εp,εeff) where Ferror(εp,εeff) is minimized. This treatment led to the surface described in Figures 8(A) and 8(B). The best fitted region again follows approximately the line εp=εeff. The main difference between Figures 4 and 8 is that the region with εp=εeff now corresponds to a much deeper valley (indicating a minimum in the function Ferror(εp,εeff)).
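The search for the best fitted region can be illustrated by a simple grid scan. The snippet below is only a minimal sketch, not the authors' code. It assumes that Ferror is a sum of absolute deviations between calculated and observed stabilities (Eq. (8) is not reproduced in this section), and `toy_dg_elec`, the protein names, and the observed values are hypothetical stand-ins for actual PDLD/S-LRA calculations and experimental data.

```python
from itertools import product


def f_error(dg_elec, observed, eps_p, eps_eff):
    """Sum of absolute deviations (kcal/mol) at one (eps_p, eps_eff) point.

    Assumption: Ferror is taken here as a sum of absolute errors over the test set.
    """
    return sum(abs(dg_elec(name, eps_p, eps_eff) - dg_obs)
               for name, dg_obs in observed.items())


def best_dielectrics(dg_elec, observed, grid):
    """Return the (eps_p, eps_eff) grid point that minimizes f_error."""
    return min(product(grid, grid),
               key=lambda pair: f_error(dg_elec, observed, *pair))


# Hypothetical observed stabilities (kcal/mol) for two made-up proteins.
observed = {"protein_A": 5.0, "protein_B": 8.0}


def toy_dg_elec(name, eps_p, eps_eff):
    """Hypothetical model whose error grows away from (40, 40)."""
    return observed[name] + 0.1 * (abs(eps_p - 40) + abs(eps_eff - 40))


grid = [20, 25, 30, 35, 40, 45, 50]
best = best_dielectrics(toy_dg_elec, observed, grid)
```

In an actual application, `toy_dg_elec` would be replaced by the calculated ΔGfoldelec(εp,εeff) for each protein, and the full error surface (rather than only its minimum) would be inspected, as in Figure 8.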

Figure 8.


The surface (A) and contour plots (B) of Ferror(εp,εeff) obtained after refinement of εp for internal groups and εeff for Lys. The plots show clearly that the best fitted values for the dielectric constants follow approximately the line εp=εeff, with a global minimum around (εp,εeff)=(40,40). This means that values of ΔGfoldelec(εp,εeff) close to ΔGfold are obtained when εp and εeff are both high and εp=εeff.

Table III summarizes the calculated results for the best fitted dielectric constants. As seen from the table, we obtained quite accurate results. Most of the proteins tested show an error of around 1 kcal/mole, and only a few have errors above 2 kcal/mole, while still not exceeding 3 kcal/mole. The average error for the best fitted values of ΔGfoldelec(εp,εeff) (shown in the fourth column of Table III) is only 0.7 kcal/mole.

The ΔGfold value reported in Table III for Glucanase C is zero, since the protein is unstable according to Ref. 50. In this case, it is found experimentally that the protein is unstable without the Cys6–Cys27 disulfide bond. Thus, our finding that ΔGfoldelec(εp,εeff) is positive for all dielectric constants is consistent with the experimental observation, since our calculations do not include the disulfide bond. Acidic fibroblast growth factor is also unstable under physiological conditions. It is reported62,63 that, without certain polyanions, the protein remains relatively unstable.

Concerning Thioredoxin64,37 and Ferredoxin,3,65 our method underestimated the observed stability by about 5 kcal/mole. However, each of these two proteins contains one disulfide bond, and since a disulfide bond stabilizes a protein by approximately 5 kcal/mole,66,67 we consider our method to be successful in predicting their absolute stabilities.
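As a rough check of this argument, one can add the approximate disulfide contribution to the calculated value for Ferredoxin. The numbers below read the Ferredoxin row of Table III as a calculated value of 3.8 kcal/mole and an observed value of 9.3 kcal/mole, and take the estimate of about 5 kcal/mole per disulfide bond at face value. The symbol ΔG_SS is introduced here only for illustration.

$$\Delta G_{fold}^{elec} + \Delta G_{SS} \approx 3.8 + 5.0 = 8.8\ \text{kcal/mole}, \qquad |9.3 - 8.8| = 0.5\ \text{kcal/mole}$$

With this correction, the residual deviation for Ferredoxin falls within the range of errors found for the proteins without disulfide bonds.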

Predicting absolute stabilities of proteins

This work so far has shown that the calculated absolute stability of a protein can be correlated quite accurately with the corresponding observed value, by choosing best fitted values of εp and εeff in the region 35 ≤ εp ≤ 40; 35 ≤ εeff ≤ 40.

This procedure, however, is not fully unbiased, since the “best fitted” values are by construction the ones that give the best result. Thus, it is important to select a well defined scheme for actually predicting the relevant stability. After considering several options, we concluded that a reasonable prediction can be obtained by using:

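$$\Delta \bar{G}_{fold}^{calc} = \frac{\Delta G_{fold}^{elec}(35,35) + \Delta G_{fold}^{elec}(35,40)}{4} + \frac{\Delta G_{fold}^{elec}(40,35) + \Delta G_{fold}^{elec}(40,40)}{4} \qquad (12)$$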

The corresponding results for ΔG¯foldcalc are shown in the right columns of Table III and in Figure 9. As seen from the figure, the agreement is reasonable (an average error of 1.8 kcal/mole), although less impressive than that of Figure 7. However, in this case, we are not selecting any best fitted values and have a robust way of predicting protein stability.
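Eq. (12) is simply the arithmetic mean of ΔGfoldelec over the four corner points of the best fitted region. The short sketch below illustrates this averaging and the evaluation of an average absolute error. The function names and the numerical values in `example` are hypothetical and are not taken from Table III.

```python
# The four (eps_p, eps_eff) points that enter Eq. (12).
GRID_POINTS = [(35, 35), (35, 40), (40, 35), (40, 40)]


def predicted_stability(dg_elec_by_point):
    """Average the calculated stabilities (kcal/mol) over the four grid points."""
    return sum(dg_elec_by_point[point] for point in GRID_POINTS) / 4.0


def mean_absolute_error(predicted, observed):
    """Average absolute deviation between predicted and observed stabilities."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


# Hypothetical calculated values for one protein at the four grid points.
example = {(35, 35): 8.0, (35, 40): 9.0, (40, 35): 7.0, (40, 40): 8.0}
dg_calc = predicted_stability(example)
```

Because the same four points are used for every protein, no protein-specific fitting enters the prediction.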

Figure 9.


Calculated and observed folding energies for our entire benchmark. The calculations are done using Eq. (12), rather than by choosing the best fitted values of ΔGfoldelec(εp,εeff), as was done in Figure 7.

Dependence of absolute stabilities on the size and fold of the protein

During the validation of our model, we examined the possible dependence of absolute stability on factors such as protein size and fold. To investigate such dependence, we chose a test set that included proteins of widely different sizes and folds. Comparisons of both the observed and the calculated stability with the size and fold of the proteins did not show any particular correlation. For example, there are many small proteins in the test set shown in Table I that have relatively large values of observed and calculated absolute stabilities (for example SSO7D, Histidine Phosphocarrier, and Sac7d), but at the same time there are many that show low values (for example Barstar, PFRD-XC4, and Bs CSP). The same trend is observed for large proteins (for example staphylococcal nuclease, isomerase, Tm DHFR, and aspartate amilotransferase show large values of absolute stability, while GDH Domain II, staphylococcal nuclease I92K, and Ec DHFR have low values). Comparison of fold and absolute stability also did not show any correlation; i.e., whether the structures were mainly alpha, mainly beta, amorphous, or alpha-beta did not make an important difference in absolute stability.

Finally, no correlation between best fitted values of εp and εeff and the size or fold has been observed.

A practical and powerful approach for stabilizing proteins

The present approach should provide a powerful new way of stabilizing proteins by systematic mutations, and this is expected to help augment existing strategies (e.g., Reetz et al.68). We can of course just use Eq. (7) and systematically examine different mutants. This is done as a demonstration in Table IV. However, a more effective approach is obtained by differentiating Eq. (7) and evaluating the gradient of the free energy with respect to the charges of the protein residues.

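$$\frac{\partial \Delta G_{fold}^{elec}}{\partial Q_i} \approx -2.3RT\left(pK_{i,int}^{p} - pK_{i,w}^{w}\right) + 332\sum_{j \neq i} Q_j \left[\frac{1}{r_{ij}\varepsilon_{ij}}\right] \qquad (13)$$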

Table IV.

Calculated ΔGfoldelec for the Set of Dielectrics εp=40 and εeff=40 for Different Proposed Mutations in the Wild Type Lip A (a)

Ionized to non polar mutations of the wild type Lip A
Structure ΔGfoldelec
Wild type −7.62 (b)
Arg142Ala −9.27
Lys23Ala −9.61
Lys70Ala −9.65
Lys88Ala −9.80
Arg107Ala −10.15
Arg33Ala −10.31
Lys23Ala//Arg33Ala −10.64
Lys23Ala//Arg33Ala//Arg107Ala −12.43

Non polar to ionized mutations of the wild type Lip A
Structure ΔGfoldelec
Wild type −7.62 (b)
Ala38Asp −8.53
Ala132Asp −8.96
Ala105Asp −9.39
Ala15Asp −9.52
Ala75Asp −9.77
Ala81Asp −9.97
Ala20Asp −10.78
Ala68Asp −10.89
Ala146Asp −11.03
Ala97Asp −11.21
Ala113Asp −11.63

Charged to non polar mutations of the most stable mutant (variant XI) of Lip A
Structure ΔGfoldelec
Variant XI −10.30 (b)
Lys32Ala −10.71
Asp133Ala −11.08
Lys70Ala −11.16
Lys23Ala//Lys70Ala −11.32

(a) Energies in kcal/mol.

(b) Taken from Ref. 69.

The gradient can then be used in predicting the change of charges that will stabilize the protein, using

$$\Delta Q_i = -\alpha\left(\frac{\partial \Delta G_{fold}^{elec}}{\partial Q_i}\right) \qquad (14)$$

(where α is a scaling factor). By repeating the procedure iteratively, we can predict the charge configurations that will increase the stability. Our preliminary validations of the above approach included a demonstration that we can reproduce the energetics of known mutants starting from the native protein, or from other mutants. Experimental validation of our approach should be extremely useful.
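The iterative procedure of Eqs. (13) and (14) can be sketched as follows. This is only an illustrative toy, not the authors' implementation. It assumes that the step is taken against the gradient (so that ΔGfoldelec decreases), that a single effective dielectric `eps_eff` replaces the pair-specific εij, that charges are treated as continuous variables clipped to the range [−1, 1], and that all coordinates and pKa values are hypothetical inputs.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)


def stability_gradient(i, charges, coords, pk_int, pk_wat,
                       eps_eff=40.0, temp=298.15):
    """Approximate gradient of the folding free energy for site i, Eq. (13).

    Assumption: one effective dielectric eps_eff is used for all pairs.
    Distances are in angstroms and the result is in kcal/mol per unit charge.
    """
    rt = R_KCAL * temp
    self_term = -2.3 * rt * (pk_int[i] - pk_wat[i])
    pair_term = sum(
        charges[j] / (math.dist(coords[i], coords[j]) * eps_eff)
        for j in range(len(charges)) if j != i
    )
    return self_term + 332.0 * pair_term


def update_charges(charges, coords, pk_int, pk_wat, alpha=0.1, steps=1):
    """Apply the charge update of Eq. (14) repeatedly.

    Assumptions: the step is taken against the gradient, all gradients are
    evaluated before any charge is changed, and charges stay in [-1, 1].
    """
    q = list(charges)
    for _ in range(steps):
        grads = [stability_gradient(i, q, coords, pk_int, pk_wat)
                 for i in range(len(q))]
        q = [max(-1.0, min(1.0, qi - alpha * g)) for qi, g in zip(q, grads)]
    return q


# Hypothetical two-site system: a +1 charge and a neutral site 5 angstroms away.
charges = [1.0, 0.0]
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
pk = [7.0, 7.0]  # identical intrinsic and water pKa values, so the self term vanishes

grad_site1 = stability_gradient(1, charges, coords, pk, pk)
new_charges = update_charges(charges, coords, pk, pk)
```

In this toy system, the neutral site next to the positive charge is driven toward a negative charge, which is the expected qualitative behavior. A practical scheme would still have to map the continuous charge changes onto actual mutations.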

DISCUSSION

This work attempted to develop and validate a method for predicting absolute folding free energies. The main idea behind our approach is the hypothesis that the folding free energy is mainly determined by electrostatic free energies. Thus, the working hypothesis is that fitting the relevant dielectric constants will give a highly predictive model. The specific strategy involved evaluating the intrinsic pKa by the PDLD/S–LRA method with a given εp and evaluating charge–charge interactions with a dielectric εeff.

The overall best fitted dielectric constants were found to be in the range of 35–40 for both εp and εeff. However, a significant improvement was obtained by using a lower value of εp for internal groups and by reducing εeff for Lys groups.

As seen from Figure 9, the agreement between calculated and observed stability is reasonable, although slightly less impressive than that of Figure 7. However, in this case, we are not selecting any best fitted values of εp and εeff, and we have a robust way of predicting protein stability. In fact, it is quite possible that the present method can be further improved by, for example, reducing εeff for internal ion pairs, or by other physically based refinements.

The validity of our method can be best judged from its performance on the diverse test set used. To the best of our knowledge, such a performance in the prediction of absolute folding energies has not been obtained by other approaches. Here, one can argue that the method is empirical and thus may reflect effects that are not related to the electrostatic free energy. We believe, however, that the method consistently captures electrostatic effects, and we discussed these issues in our previous work (Ref. 20). That is, using Figure 1, we rationalized the high εp even for internal groups by pointing out that following the A → E path corresponds to the folding of a polyelectrolyte, where the energetics of a buried ionized group in the protein interior is reflected in the effective dielectric for charge–charge interactions. We also note that using a large εeff is fully consistent with our previous concepts.29 Furthermore, we addressed the puzzling finding that εp for calculating pKa’s is small (when evaluated consistently while considering the local protein relaxation30) while εp for folding calculations is relatively large. This issue was explored in the specific case of ubiquitin, where it was concluded that εp in pKa calculations reflects only the relaxation in the D → E step (the protein is near its structure in the charged configuration). On the other hand, εp in folding calculations should reflect implicitly the compensating protein pre-organization in the C → E process (see Figure 5 in Ref. 20). Apparently, since this effect is not included explicitly, we have to use a large value of εp.

The present work involved a much more extensive benchmark than the one used in Ref. 20 and thus allowed us to refine our ideas about εp and εeff. We found that the idea of using large dielectrics is confirmed as far as surface groups are concerned. However, for internal groups, it appears that a compromise that involves εp ≤ 20 (which is still quite large) is required.

One may argue that the identification of our calculated folding free energy with electrostatic energy is not fully justified, since the effective dielectric reflects non-electrostatic contributions. However, this argument overlooks the logical definition of energy contributions in biophysics. That is, if a given effect scales according to the charges of the system, it can and should be defined as an electrostatic contribution. For example, the free energy change associated with turning a charge on or off (e.g., in redox and pKa processes) is rigorously defined as electrostatic free energy, despite the fact that the corresponding charging process is associated with a response of the system that reflects all the forces and interactions in the system. The same is true for the charging of an ion in water, which reflects the reorientation of the water molecules and even the water–water van der Waals interactions. On the other hand, if the folding energy were due to the size of different residues, rather than their charges, it would be logical to state that the folding energy reflects steric repulsion. Alternatively, if we were able to correlate the absolute folding free energy with the hydrophobicity of the residues, we would be able to say that the folding free energy is determined by the hydrophobic effect. The same is true with regard to configurational entropy effects. This does not mean that hydrophobic and entropic effects do not contribute to the folding free energy. However, the present study indicates that these contributions may cancel each other in some way. More specifically, it has long been argued that hydrophobic effects contribute in a major way to protein stability (e.g., Refs. 70, 71). While there is overwhelming evidence that such effects are a crucial part of the folding free energy, we are not aware of any clear correlation between the number or position of the hydrophobic residues and the observed absolute protein stability. Thus, although the reason for the lack of correlation is unclear (it may be due to a compensation by configurational entropy), it seems to us that (until proved otherwise) we should accept the fact that the absolute protein stability is not correlated with the markers of hydrophobic free energy.

One may also claim that our effective dielectric reflects non-electrostatic contributions such as the hydrophobic contributions in the (B) → (D) step of our cycle. This argument is reasonable as far as the cycle (A) → (E) is concerned, but less reasonable when Eq. (3) captures the trend in the observed folding energy. At any rate, as long as we cannot establish a correlation between hydrophobic markers and absolute stability, it would be hard to attribute the folding energy to such effects.

It must be stated here that the present work has not established that the correlation with the electrostatic free energy is perfect. Obvious examples are the effects of SS bonds in the proteins thioredoxin, ferredoxin, and glucanase C. Cases of interaction with a ligand were not explored carefully here by considering the relevant electrostatic contributions. Another interesting case is that of residues of staphylococcal nuclease,43 where the deviation between calculated and observed folding energy is most probably due to the hydrophobic contribution. However, the overall trend is clearly and overwhelmingly best correlated with the electrostatic free energy.

We would also like to mention that it is very likely that we will identify cases where the dielectric constant will not follow our idealized rules. For example, there could be cases where we will have to use a higher εeff, when a large number of neighboring charged residues create the equivalent of a high ionic strength. There could also be cases where εeff at close distance will have to be decreased (this may be handled by employing a distance dependent dielectric30). To explore and analyze such cases, we will have to focus on more comparative mutational studies, in particular between thermophilic and mesophilic enzymes. Further experimental and theoretical studies are clearly needed. Fortunately, recent experimental studies provide a remarkable benchmark for such detailed studies.19 At any rate, regardless of the formal justification of our approach, its power can be exploited in predicting and analyzing protein stabilities.

As mentioned in the Results section, we have not found a correlation between ΔGfold and the size of the protein or other characteristic properties. We did observe, however, that regardless of the size or the fold of a protein, a single mutation that turns a charge on or off may have a large impact on stability. This is clearly shown in the case of the staphylococcal nuclease mutations,43 where a single mutation of a non-charged residue into lysine reduced the protein’s stability by the enormous amount of around 9 kcal/mole. Therefore, we consider the charges of a protein to be a more important factor in determining the overall stability than the size or the fold. Obviously, various protein sizes and folds may reflect the amount and type of certain ionizable residues within a protein, but in the end it is the charges of those ionizable residues that define the stability.

It may be useful to point out here that Gurd and coworkers72 invoked the idea that the stability of proteins could be evaluated by considering the ΔGij obtained with an εeff of 40 or larger. However, this was an almost obvious conclusion for those who accepted the Tanford–Kirkwood model,73 which overlooked the self energy term and the destabilization expected from this term in non polar regions of the protein (see discussion in Warshel et al.74). The interesting finding that seems to emerge from the present and other works is that, despite the fact that charges should not be stable in non polar environments, the effective environment around charges in proteins behaves as a relatively polar environment.

It is significant, however, to note that Eq. (3) does not produce perfect results [and the same is true for Eq. (7)] when εp and εeff are equal. In some cases, we had to use an εp that is significantly smaller than 40, and this reflects the effect of the self energy. Yet this effect is much smaller than what would be deduced from the assumption that the charged groups are in a partially non polar environment. In fact, as established in our previous work (Ref. 20) and the present work, the best fitted εp is much larger than the one that is required to obtain the observed pKa’s. The reasons for this have been discussed at great length in Ref. 20, and they are associated with the compensation in the (D) → (E) step in Figure 1.

Another interesting issue is the acid unfolding effect.75 Here again, we are dealing with a fully electrostatic effect, exactly as is the case in redox processes, since the unfolding upon charging of ionizable residues reflects exactly the free energy of the charging process, which always involves structural rearrangements. In other words, it does not matter that the unfolding process is opposed by hydrophobic forces; the overall energetics is rigorously the charging free energy and thus an electrostatic free energy. This is very different from the (B) → (D) free energy, which is a non-electrostatic energy as it does not involve any change in charge. At any rate, predicting the energetics of the acid unfolding is clearly a task that should be accomplished by our formulation.

The present approach focuses on absolute folding energy and is likely to work well on the less challenging problem of exploring mutational effects. The limited test cases of mutational effects examined in the present work provide encouraging results, but further studies are needed. Here, we have a very extensive set of relevant experiments (e.g., Refs. 43–46, 58).

ACKNOWLEDGMENTS

The authors thank USC’s High Performance Computing and Communication Center (HPCC) for computer time.

Grant sponsor: NIH; Grant number: GM24492

REFERENCES

  • 1. Jaenicke R. Protein stability and molecular adaptation to extreme conditions. Eur J Biochem. 1991;202:715–728. doi: 10.1111/j.1432-1033.1991.tb16426.x.
  • 2. Luke KA, Higgins CL, Wittung-Stafshede P. Thermodynamic stability and folding of proteins from hyperthermophilic organisms. FEBS J. 2007;274:4023–4033. doi: 10.1111/j.1742-4658.2007.05955.x.
  • 3. Razvi A, Scholtz JM. Lessons in stability from thermophilic proteins. Protein Sci. 2006;15:1569–1578. doi: 10.1110/ps.062130306.
  • 4. Clark NS, Dodd I, Mossakowska DE, Smith RA, Gore MG. Folding and conformational studies on SCR1-3 domains of human complement receptor 1. Protein Eng. 1996;9:877–884. doi: 10.1093/protein/9.10.877.
  • 5. Dill KA. Dominant forces in protein folding. Biochemistry. 1990;29:7133–7155. doi: 10.1021/bi00483a001.
  • 6. Fan ZZ, Hwang JK, Warshel A. Using simplified protein representation as a reference potential for all-atom calculations of folding free energy. Theor Chem Acc. 1999;103:77–80.
  • 7. Go N. Theoretical-studies of protein folding. Annu Rev Biophys Bioeng. 1983;12:183–210. doi: 10.1146/annurev.bb.12.060183.001151.
  • 8. Karanicolas J, Brooks CL. The structural basis for biphasic kinetics in the folding of the WW domain from a formin-binding protein: Lessons for protein design? Proc Natl Acad Sci USA. 2003;100:3954–3959. doi: 10.1073/pnas.0731771100.
  • 9. Khalili M, Liwo A, Scheraga HA. Kinetic studies of folding of the B-domain of staphylococcal protein A with molecular dynamics and a united-residue (UNRES) model of polypeptide chains. J Mol Biol. 2006;355:536–547. doi: 10.1016/j.jmb.2005.10.056.
  • 10. Levitt M, Warshel A. Computer-simulation of protein folding. Nature. 1975;253:694–698. doi: 10.1038/253694a0.
  • 11. Onuchic JN, Wolynes PG, Lutheyschulten Z, Socci ND. Toward an outline of the topography of a realistic protein-folding funnel. Proc Natl Acad Sci USA. 1995;92:3626–3630. doi: 10.1073/pnas.92.8.3626.
  • 12. Snow CD, Nguyen N, Pande VS, Gruebele M. Absolute comparison of simulated and experimental protein-folding dynamics. Nature. 2002;420:102–106. doi: 10.1038/nature01160.
  • 13. Hendsch ZS, Tidor B. Do salt bridges stabilize proteins—a continuum electrostatic analysis. Protein Sci. 1994;3:211–226. doi: 10.1002/pro.5560030206.
  • 14. Honig B, Yang AS. Free-energy balance in protein-folding. Adv Protein Chem. 1995;46:27–58. doi: 10.1016/s0065-3233(08)60331-9.
  • 15. Schwehm JM, Fitch CA, Dang BN, Garcia-Moreno EB, Stites WE. Changes in stability upon charge reversal and neutralization substitution in staphylococcal nuclease are dominated by favorable electrostatic effects. Biochemistry. 2003;42:1118–1128. doi: 10.1021/bi0266434.
  • 16. Whitten ST, Garcia-Moreno EB. pH dependence of stability of staphylococcal nuclease: evidence of substantial electrostatic interactions in the denatured state. Biochemistry. 2000;39:14292–14304. doi: 10.1021/bi001015c.
  • 17. Giletto A, Pace CN. Buried, charged, non-ion-paired aspartic acid 76 contributes favorably to the conformational stability of ribonuclease T-1. Biochemistry. 1999;38:13379–13384. doi: 10.1021/bi991422s.
  • 18. Xiao L, Honig B. Electrostatic contributions to the stability of hyperthermophilic proteins. J Mol Biol. 1999;289:1435–1444. doi: 10.1006/jmbi.1999.2810.
  • 19. Isom DG, Cannon BR, Castaneda CA, Robinson A, Garcia-Moreno EB. High tolerance for ionizable residues in the hydrophobic interior of proteins. Proc Natl Acad Sci USA. 2008;105:17784–17788. doi: 10.1073/pnas.0805113105.
  • 20. Roca M, Messer B, Warshel A. Electrostatic contributions to protein stability and folding energy. FEBS Lett. 2007;581:2065–2071. doi: 10.1016/j.febslet.2007.04.025.
  • 21. Strickler SS, Gribenko AV, Gribenko AV, Keiffer TR, Tomlinson J, Reihle T, Loladze VV, Makhatadze GI. Protein stability and surface electrostatics: a charged relationship. Biochemistry. 2006;45:2761–2766. doi: 10.1021/bi0600143.
  • 22. Gribenko AV, Makhatadze GI. Role of the charge-charge interactions in defining stability and halophilicity of the CspB proteins. J Mol Biol. 2007;366:842–856. doi: 10.1016/j.jmb.2006.11.061.
  • 23. Schweiker KL, Zarrine-Afsar A, Davidson AR, Makhatadze GI. Computational design of the Fyn SH3 domain with increased stability through optimization of surface charge charge interactions. Protein Sci. 2007;16:2694–2702. doi: 10.1110/ps.073091607.
  • 24. Baran KL, Chimenti MS, Schlessman JL, Fitch CA, Herbst KJ, Garcia-Moreno BE. Electrostatic effects in a network of polar and ionizable groups in staphylococcal nuclease. J Mol Biol. 2008;379:1045–1062. doi: 10.1016/j.jmb.2008.04.021.
  • 25. Lee KK, Fitch CA, Garcia-Moreno EB. Distance dependence and salt sensitivity of pairwise, coulombic interactions in a protein. Protein Sci. 2002;11:1004–1016. doi: 10.1110/ps.4700102.
  • 26. Pace CN. Single surface stabilizer. Nat Struct Biol. 2000;7:345–346. doi: 10.1038/75100.
  • 27. Trefethen JM, Pace CN, Scholtz JM, Brems DN. Charge-charge interactions in the denatured state influence the folding kinetics of ribonuclease Sa. Protein Sci. 2005;14:1934–1938. doi: 10.1110/ps.051401905.
  • 28. Trevino SR, Gokulan K, Newsom S, Thurlkill RL, Shaw KL, Mitkevich VA, Makarov AA, Sacchettini JC, Scholtz JM, Pace CN. Asp79 makes a large, unfavorable contribution to the stability of RNase Sa. J Mol Biol. 2005;354:967–978. doi: 10.1016/j.jmb.2005.09.091.
  • 29. Warshel A, Sharma PK, Kato M, Parson WW. Modeling electrostatic effects in proteins. Biochim Biophys Acta. 2006;1764:1647–1676. doi: 10.1016/j.bbapap.2006.08.007.
  • 30. Schutz CN, Warshel A. What are the dielectric ‘constants’ of proteins and how to validate electrostatic models. Proteins. 2001;44:400–417. doi: 10.1002/prot.1106.
  • 31. Sham YY, Muegge I, Warshel A. The effect of protein relaxation on charge-charge interactions and dielectric constants of proteins. Biophys J. 1998;74:1744–1753. doi: 10.1016/S0006-3495(98)77885-3.
  • 32. Sham YY, Chu ZT, Warshel A. Consistent calculations of pKa’s of ionizable residues in proteins: semi-microscopic and microscopic approaches. J Phys Chem B. 1997;101:4458–4472.
  • 33. Shan B, Bhattacharya S, Eliezer D, Raleigh DP. The low-pH unfolded state of the C-terminal domain of the ribosomal protein L9 contains significant secondary structure in the absence of denaturant but is no more compact than the low-pH urea unfolded state. Biochemistry. 2008;47:9565–9573. doi: 10.1021/bi8006862.
  • 34. Wong KB, Lee CF, Chan SH, Leung TY, Chen YW, Bycroft M. Solution structure and thermal stability of ribosomal protein L30e from hyperthermophilic archaeon Thermococcus celer. Protein Sci. 2003;12:1483–1495. doi: 10.1110/ps.0302303.
  • 35. King G, Warshel A. A surface constrained all-atom solvent model for effective simulations of polar solutions. J Chem Phys. 1989;91:3647–3661.
  • 36. Lee FS, Chu ZT, Warshel A. Microscopic and semimicroscopic calculations of electrostatic energies in proteins by the POLARIS and ENZYMIX programs. J Comp Chem. 1993;14:161–185.
  • 37. Kumar S, Tsai CJ, Nussinov R. Maximal stabilities of reversible two-state proteins. Biochemistry. 2002;41:5359–5374. doi: 10.1021/bi012154c.
  • 38. Feng YQ, Sligar SG. Effect of heme binding on the structure and stability of Escherichia coli apocytochrome b562. Biochemistry. 1991;30:10150–10155. doi: 10.1021/bi00106a011.
  • 39. Bowie JU, Sauer RT. Equilibrium dissociation and unfolding of the Arc repressor dimer. Biochemistry. 1989;28:7139–7143. doi: 10.1021/bi00444a001.
  • 40. Hollien J, Marqusee S. A thermodynamic comparison of mesophilic and thermophilic ribonucleases H. Biochemistry. 1999;38:3831–3836. doi: 10.1021/bi982684h.
  • 41. Li WT, Grayling RA, Sandman K, Edmondson S, Shriver JW, Reeve JN. Thermodynamic stability of archaeal histones. Biochemistry. 1998;37:10563–10572. doi: 10.1021/bi973006i.
  • 42. Strop P, Mayo SL. Contribution of surface salt bridges to protein stability. Biochemistry. 2000;39:1251–1255. doi: 10.1021/bi992257j.
  • 43. Nguyen DM, Leila Reynald R, Gittis AG, Lattman EE. X-ray and thermodynamic studies of staphylococcal nuclease variants I92E and I92K: insights into polarity of the protein interior. J Mol Biol. 2004;341:565–574. doi: 10.1016/j.jmb.2004.05.066.
  • 44. Jacob M, Holtermann G, Perl D, Reinstein J, Schindler T, Geeves MA, Schmid FX. Microsecond folding of the cold shock protein measured by a pressure-jump technique. Biochemistry. 1999;38:2882–2891. doi: 10.1021/bi982487i.
  • 45. Mueller U, Perl D, Schmid FX, Heinemann U. Thermal stability and atomic-resolution crystal structure of the Bacillus caldolyticus cold shock protein. J Mol Biol. 2000;297:975–988. doi: 10.1006/jmbi.2000.3602.
  • 46. Went HM, Jackson SE. Ubiquitin folds through a highly polarized transition state. Protein Eng Des Sel. 2005;18:229–237. doi: 10.1093/protein/gzi025.
  • 47. Khorasanizadeh S, Peters ID, Butt TR, Roder H. Folding and stability of a tryptophan-containing mutant of ubiquitin. Biochemistry. 1993;32:7054–7063. doi: 10.1021/bi00078a034.
  • 48. Dams T, Jaenicke R. Stability and folding of dihydrofolate reductase from the hyperthermophilic bacterium Thermotoga maritima. Biochemistry. 1999;38:9169–9178. doi: 10.1021/bi990635e.
  • 49. Mukaiyama A, Takano K, Haruki M, Morikawa M, Kanaya S. Kinetically robust monomeric protein from a hyperthermophile. Biochemistry. 2004;43:13859–13866. doi: 10.1021/bi0487645.
  • 50. Creagh AL, Koska J, Johnson PE, Tomme P, Joshi MD, McIntosh LP, Kilburn DG, Haynes CA. Stability and oligosaccharide binding of the N1 cellulose-binding domain of Cellulomonas fimi endoglucanase CenC. Biochemistry. 1998;37:3529–3537. doi: 10.1021/bi971983o.
  • 51. Janssen MJ, van de Wiel WA, Beiboer SH, van Kampen MD, Verheij HM, Slotboom AJ, Egmond MR. Catalytic role of the active site histidine of porcine pancreatic phospholipase A2 probed by the variants H48Q, H48N and H48K. Protein Eng. 1999;12:497–503. doi: 10.1093/protein/12.6.497.
  • 52. Jackson SE, Moracci M, elMasry N, Johnson CM, Fersht AR. Effect of cavity-creating mutations in the hydrophobic core of chymotrypsin inhibitor 2. Biochemistry. 1993;32:11259–11269. doi: 10.1021/bi00093a001.
  • 53. Chrunyk BA, Evans J, Lillquist J, Young P, Wetzel R. Inclusion body formation and protein stability in sequence variants of interleukin-1 beta. J Biol Chem. 1993;268:18053–18061.
  • 54. Zhou Z, Feng H, Bai Y. Detection of a hidden folding intermediate in the focal adhesion target domain: Implications for its function and folding. Proteins. 2006;65:259–265. doi: 10.1002/prot.21107.
  • 55. Cowley AB, Altuve A, Kuchment O, Terzyan S, Zhang X, Rivera M, Benson DR. Toward engineering the stability and hemin-binding properties of microsomal cytochromes b5 into rat outer mitochondrial membrane cytochrome b5: examining the influence of residues 25 and 71. Biochemistry. 2002;41:11566–11581. doi: 10.1021/bi026005l.
  • 56. Sandberg WS, Terwilliger TC. Engineering multiple properties of a protein by combinatorial mutagenesis. Proc Natl Acad Sci USA. 1993;90:8367–8371. doi: 10.1073/pnas.90.18.8367.
  • 57. Hornby JA, Luo JK, Stevens JM, Wallace LA, Kaplan W, Armstrong RN, Dirr HW. Equilibrium folding of dimeric class mu glutathione transferases involves a stable monomeric intermediate. Biochemistry. 2000;39:12336–12344. doi: 10.1021/bi000176d.
  • 58. Mach H, Ryan JA, Burke CJ, Volkin DB, Middaugh CR. Partially structured self-associating states of acidic fibroblast growth factor. Biochemistry. 1993;32:7703–7711. doi: 10.1021/bi00081a015.
  • 59. Mainfroid V, Mande SC, Hol WG, Martial JA, Goraj K. Stabilization of human triosephosphate isomerase by improvement of the stability of individual alpha-helices in dimeric as well as monomeric forms of the protein. Biochemistry. 1996;35:4110–4117. doi: 10.1021/bi952692n.
  • 60. Chen L, Cabrita GJ, Otzen DE, Melo EP. Stabilization of the ribosomal protein S6 by trehalose is counterbalanced by the formation of a putative off-pathway species. J Mol Biol. 2005;351:402–416. doi: 10.1016/j.jmb.2005.05.056.
  • 61. Cutler RL, Davies AM, Creighton S, Warshel A, Moore GR, Smith M, Mauk AG. Role of arginine-38 in regulation of the cytochrome c oxidation-reduction equilibrium. Biochemistry. 1989;28:3188–3197. doi: 10.1021/bi00434a012.
  • 62. Copeland RA, Ji H, Halfpenny AJ, Williams RW, Thompson KC, Herber WK, Thomas KA, Bruner MW, Ryan JA, Marquis-Omer D, Sanyal G, Sitrin RD, Yamazaki S, Middaugh CR. The structure of human acidic fibroblast growth factor and its interaction with heparin. Arch Biochem Biophys. 1991;289:53–61. doi: 10.1016/0003-9861(91)90441-k.
  • 63. Gospodarowicz D, Cheng J. Heparin protects basic and acidic FGF from inactivation. J Cell Physiol. 1986;128:475–484. doi: 10.1002/jcp.1041280317.
  • 64. Katti SK, LeMaster DM, Eklund H. Crystal structure of thioredoxin from Escherichia coli at 1.68 A resolution. J Mol Biol. 1990;212:167–184. doi: 10.1016/0022-2836(90)90313-B.
  • 65. Macedo-Ribeiro S, Darimont B, Sterner R, Huber R. Small structural changes account for the high thermostability of 1[4Fe-4S] ferredoxin from the hyperthermophilic bacterium Thermotoga maritima. Structure. 1996;4:1291–1301. doi: 10.1016/s0969-2126(96)00137-2.
  • 66. Matsumura M, Matthews BW. Stabilization of functional proteins by introduction of multiple disulfide bonds. Methods Enzymol. 1991;202:336–356. doi: 10.1016/0076-6879(91)02018-5.
  • 67. Zavodszky P, Kardos J, Svingor A, Petsko GA. Adjustment of conformational flexibility is a key event in the thermal adaptation of proteins. Proc Natl Acad Sci USA. 1998;95:7406–7411. doi: 10.1073/pnas.95.13.7406.
  • 68.Reetz MT, Carballeira JD, Vogel A. Iterative saturation mutagenesis on the basis of B factors as a strategy for increasing protein thermostability. Angew Chem Int Ed Engl. 2006;45:7745–7751. doi: 10.1002/anie.200602795. [DOI] [PubMed] [Google Scholar]
  • 69.Jaeger KE, Dijkstra BW, Reetz MT. Bacterial biocatalysts: molecular biology, three-dimensional structures, and biotechnological applications of lipases. Annu Rev Microbiol. 1999;53:315–351. doi: 10.1146/annurev.micro.53.1.315. [DOI] [PubMed] [Google Scholar]
  • 70.Privalov PL. Stability of proteins: small globular proteins. Adv Protein Chem. 1979;33:167–241. doi: 10.1016/s0065-3233(08)60460-x. [DOI] [PubMed] [Google Scholar]
  • 71.Creighton TE. Protein folding. Biochem J. 1990;270:1–16. doi: 10.1042/bj2700001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Matthew JB, Gurd FR, Garcia-Moreno B, Flanagan MA, March KL, Shire SJ. pH-dependent processes in proteins. CRC Crit Rev Biochem. 1985;18:91–197. doi: 10.3109/10409238509085133. [DOI] [PubMed] [Google Scholar]
  • 73.Tanford C, Kirkwood JG. Theory of protein titration curves. I. General equations for impenetrable spheres. J Am Chem Soc. 1957;79:5333. [Google Scholar]
  • 74.Warshel A, Russell ST, Churg AK. Macroscopic models for studies of electrostatic interactions in proteins: limitations and applicability. Proc Natl Acad Sci USA. 1984;81:4785–4789. doi: 10.1073/pnas.81.15.4785. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Perutz MF. Electrostatic effects in proteins. Science. 1978;201:1187–1191. doi: 10.1126/science.694508. [DOI] [PubMed] [Google Scholar]
