Abstract
In this paper we use the results from all-atom MD simulations of proteins and peptides to assess the individual contributions of charged atomic groups to the enthalpic stability of the native state of globular proteins, and to investigate how the distribution of charged atomic groups in terms of solvent accessibility relates to protein enthalpic stability. The contributions of charged groups are calculated by comparing nonbonded interaction energy terms from equilibrium simulations of charged amino acid dipeptides in water (the “unfolded state”) and charged amino acids in globular proteins (the “folded state”). Contrary to expectation, the analysis shows that many buried, charged atomic groups contribute favorably to protein enthalpic stability. The strongest enthalpic contributions favoring the folded state come from the carboxylate (COO−) groups of either Glu or Asp. The contributions from Arg guanidinium groups are generally somewhat stabilizing, while NH3+ groups from Lys contribute little toward stabilizing the folded state. The average enthalpic gain due to the transfer of a methyl group in an apolar amino acid from solution to the protein interior is described for comparison. Notably, charged groups that are less exposed to solvent contribute more favorably to protein native-state enthalpic stability than charged groups that are solvent exposed. While solvent reorganization/release contributes favorably to folding for all charged atomic groups, the variation in folded state stability among proteins comes mainly from the change in the nonbonded interaction energy of charged groups between the unfolded and folded states. A key outcome is that the calculated enthalpic stabilization is found to be inversely proportional to the excess charge density on the surface, in support of a hypothesis proposed previously.
Keywords: electrostatic interaction and folded state stability of globular proteins, enthalpic contribution from charged atomic groups, natural selection of buried charge distribution
1 Introduction
Variation in solvent accessible areas, packing density, or van der Waals and electrostatic interactions has traditionally been used to account for changes in protein stability upon amino acid mutations or under different environmental conditions. A quantitative understanding of the factors contributing to protein native-state stability remains an important goal, with benefits for advancing protein design efforts and understanding human disease. Studies to define the origins of structural stability are particularly relevant because loss of stability has been implicated as a causative factor in disease-causing mutations. A decrease in protein stability by 1 to 3 kcal/mol can be a major factor contributing to monogenic human disease (1–3).
At physiological temperatures, the native structure of most globular proteins is only marginally stable. The folded state is enthalpically but not entropically favored relative to the unfolded random-coil state; however, the enthalpy and entropy of unfolding on a per-residue basis vary in magnitude among globular proteins (4). The variation in the residue-specific entropy of unfolding is understood to arise primarily from differences in amino acid composition of buried residues, and can be rationalized in terms of changes in solvation of protein surface area and hydrophobicity (5). In contrast, the physical origin of the specific enthalpy of unfolding is not well understood (6,7).
Recently, a computational analysis of protein compressibility showed a correlation involving the distribution of charged atoms between the surface and interior of the protein (8). Together, the results implied an unexpected relationship between the enthalpy of unfolding and this charge distribution, whereby the per-residue change in enthalpy upon protein unfolding, ΔHU, decreases (the folded state is less stable) as the fraction of charged atoms located on the protein surface increases. We postulated that the variation among proteins in ΔHU is due to differences in the excess of surface charged atoms, but in a counter-intuitive way: a more uniform distribution of charge between the surface and interior stabilizes protein structure. Here, we test this postulate by estimating the enthalpic stability from all-atom molecular dynamics (MD) simulations.
The change in solution free energy upon protein folding has three primary sources: a) the favorable decrease in free energy due to the formation of intraprotein interactions; b) the change in free energy due to the disruption of amino acid–water interactions (or desolvation) of all amino acids that are buried in the protein core or only partially exposed to water in the protein folded state, and c) the change in the free energy of the solvent due to solvent reorganization and structural collapse. The focus of this investigation is to estimate the enthalpic contribution of charged amino acids, Asp, Glu, Arg, and Lys, to protein folded state stability as a function of solvent accessibility and the spatial distribution of charged groups.
While there is general agreement that burial of hydrophobic amino acids in the protein interior is a major driving force for protein folding and stability, the contributions of charged atomic groups and hydrogen bonding to protein folded state stability have been much debated. Total or partial desolvation of charged and polar groups upon protein folding is widely regarded as a limiting step in protein folding. Some researchers (9–14) have favored a stabilizing contribution from salt bridges and side chain–side chain hydrogen bonds (15), while others have contended that hydrogen bonding and salt bridges make an unfavorable contribution (16–18). Transfer of a salt bridge between charged atomic groups from water to a nonpolar environment was found to be destabilizing by 10 to 16 kcal/mol (19). Nonetheless, the increase in the number of salt bridges in the native structure of thermophilic proteins has been regarded as an indicator that salt bridges could stabilize the protein folded state (6, 20) and that intraprotein electrostatic interactions are responsible for the increased stability of thermophilic proteins (21).
Quantitative measurements of salt bridge contributions to the stability of proteins also yield contradictory results. Salt bridges stabilize the folded state of T4 lysozyme by 0.5 to 5 kcal/mol depending on the experimental system and the location of the salt bridge (22, 23). Certain salt bridges substantially stabilize the folded state of T4 lysozyme, but ionic interactions at the solvent exposed surface contribute little or nothing to stabilization. In other cases surface salt bridges were found to add several kcal/mol to protein stability (24). On the other hand, buried salt bridges were reported to destabilize a coiled-coil protein and Arc repressor protein by 2 to 4 kcal/mol (25,26). Thus experimental studies of salt bridges support opposing conclusions of stabilization and destabilization of the protein folded state.
In an earlier study of protein compressibility (8), we linked the enthalpy of unfolding per residue to the excess charge on the protein surface and found that globular proteins that have lower excess charge (more charges buried) are enthalpically more stable in the folded state. The excess charge, δxs, is the difference between protein surface charge density and the charge density in the protein interior. The inverse relationship between the excess charge and the change in solution enthalpy upon protein unfolding was observed for experimental data from a set of 16 globular proteins. We hypothesized that protein folded state stabilization is partially due to a gain in enthalpy from favorable electrostatic interactions between buried charges as a result of optimal charge distributions evolved to stabilize buried charge in globular proteins. A goal of the present work is to test this hypothesis and investigate the origin of the empirical correlation between the distribution of buried charged groups in native proteins and protein folded state stability.
We use the results from all-atom MD simulations of proteins and peptides to assess the individual contributions of charged atomic groups to the enthalpic stability of the native state of globular proteins, and to investigate how the distribution of charged atomic groups in the protein tertiary structure influences their contribution to protein enthalpic stability. The net contribution of each charged atomic group is estimated as a sum of two components: a) the change in the nonbonded interaction energy of charged amino acids between the unfolded state, modeled by solvated dipeptides, and the folded state of globular proteins; and b) the change in the enthalpy of the hydration water due to the collapse of the hydration shells formed by charged atomic groups and the subsequent release of certain water molecules to the bulk solvent in the process of protein folding. The concept is illustrated in Figure 1, where a carboxylate group from Glu (in gold) is shown in a protein folded state environment (on the left side of the scheme) and in an unfolded state environment (on the right). The analysis shows that most native, charged atomic groups make significant contributions to protein enthalpic stability. While solvent reorganization/release contributes favorably to folding for all charged atomic groups, the variation in folded state stability among proteins comes mainly from the change in the nonbonded interaction energy of charged amino acids between the unfolded and folded states.
The carboxylate (COO−) groups from Glu and Asp contribute most to folded state stability. The guanidinium groups from Arg are somewhat stabilizing, while the NH3+ group from Lys stabilizes the folded state only marginally. The average enthalpic gain due to the transfer of a methyl group in an apolar amino acid from solution to the protein interior is described for comparison. The trend in stabilization by type of charged group is consistent with the distribution of the relative numbers of charged groups as a function of the degree of solvent exposure. Charged groups that are less exposed to solvent contribute more to protein native state enthalpic stability than charged groups that are exposed, and the calculated cumulative effect of charge is inversely proportional to the excess charge density, δxs.
2 Theory and Computational Methods
2.1 Model Systems
The contributions of individual atomic groups to the enthalpic stability of the folded state were assessed from a comparison of their enthalpy in the folded and unfolded states. The atomic groups considered here are the side-chain carboxylate groups of Asp and Glu (COO−); the guanidinium group of Arg (Gu); the side-chain ammonium group of Lys (NH3+); and the charged Ca2+ ions in α-lactalbumin and trypsin. The apolar group is the methyl group, -CH3, in Ala, Val, Leu and Ile. The unfolded state of the protein is modeled here as charged or apolar dipeptides, CH3-CO-XXX-NH-CH3, where XXX is Glu, Asp, Arg, Lys, Ala, Val, Leu or Ile, and the corresponding dipeptides are glud, aspd, argd, lysd, alad, vald, leud and iled, respectively. This definition assumes that the unfolded state is a random coil, an ensemble of conformations in which the amino acid chain is highly hydrated and individual residues do not interact with each other. The backbone dihedral angles of the dipeptides are not fixed; they exhibit both α and β/PPII (stretched) conformations with relative populations specific to each dipeptide.
The folded state corresponds to the native states of four globular proteins: trypsin, ribonuclease A, hen egg-white lysozyme and α-lactalbumin. The total number of charged atomic groups in the four proteins is 104. Of these, 28% are COO− (Asp), 16% are COO− (Glu), 17% are guanidinium groups from Arg and the remaining 38% are NH3+ groups from Lys. In vertebrates, 23% of amino acids are charged amino acids (27), and the relative composition within the charged category is 26% Asp, 25% Glu, 18% Arg and 31% Lys. Our set of proteins is therefore reasonably representative.
2.2 Change in Enthalpy upon Protein Unfolding
The net contribution of each atomic group to the change in enthalpy upon protein unfolding, ΔHU, is a sum of two quantities: a) the change in the enthalpy of the charged group, g, due to a change in the local environment for transfer from the folded state to the unfolded state, $\Delta H_{U,\mathrm{env}}^{g}$; and b) the change in the enthalpy of the hydration water molecules that are released to the bulk upon protein folding, $\Delta H_{U,\mathrm{solv}}^{g}$:
$$\Delta H_{U}^{g} = \Delta H_{U,\mathrm{env}}^{g} + \Delta H_{U,\mathrm{solv}}^{g} \qquad (1)$$
If ΔHU > 0, the atomic group contributes favorably to protein folding. The sum of ΔHU over all atomic groups in a protein corresponds to the unfolding enthalpy.
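2.2.1 Local Environment of Charged Groups, $\Delta H_{U,\mathrm{env}}^{g}$
The environmental term, $\Delta H_{U,\mathrm{env}}^{g}$, of an atomic group is estimated from the difference between the time-average nonbonded energy (a sum of the electrostatic and van der Waals terms) in the unfolded state modeled by the dipeptide solution, $\langle E_{nb}^{g} \rangle_{\mathrm{dipep}}$, and the corresponding energy in the folded state, $\langle E_{nb}^{g} \rangle_{\mathrm{prot}}$:
$$\Delta H_{U,\mathrm{env}}^{g} = \langle E_{nb}^{g} \rangle_{\mathrm{dipep}} - \langle E_{nb}^{g} \rangle_{\mathrm{prot}} \qquad (2)$$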
where < … > represents the time average. In this approximation, the difference between the covalent bonding energies in the folded and unfolded states is assumed to be negligible.
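The nonbonded energy of an atomic group, $\langle E_{nb}^{g} \rangle$, in either the native protein or dipeptide solution, is a time average of the pairwise sum over the van der Waals and electrostatic interactions of all atoms in the atomic group, $n_g$, with the protein (or dipeptide) and the solvent molecules within a specified cutoff distance:
$$\langle E_{nb}^{g} \rangle = \left\langle \sum_{i=1}^{n_g} \sum_{j=1}^{m} \left[ E_{\mathrm{vdW}}(r_{ij}) + E_{\mathrm{elec}}(r_{ij}) \right] \right\rangle \qquad (3)$$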
where m is the number of all atoms within the cutoff distance.
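As an illustration of the pairwise sum in Eq. 3, the following is a minimal Python sketch and not the CHARMM implementation used in the study. It assumes a plain distance cutoff with no Ewald summation and no bonded exclusions, a Lennard-Jones 12-6 form with geometric-mean well depths and summed half-radii, and an approximate Coulomb conversion constant. The dictionary keys (`xyz`, `q`, `eps`, `rmin_half`) and all function names are hypothetical.

```python
import math

# Approximate Coulomb conversion constant in kcal*Angstrom/(mol*e^2) (assumed value).
COULOMB_K = 332.0636


def pair_energy(qi, qj, eps_ij, rmin_ij, r):
    """Electrostatic plus Lennard-Jones 12-6 energy (kcal/mol) of one atom pair at distance r."""
    elec = COULOMB_K * qi * qj / r
    s6 = (rmin_ij / r) ** 6
    vdw = eps_ij * (s6 * s6 - 2.0 * s6)
    return elec + vdw


def group_energy(group_atoms, env_atoms, cutoff=14.0):
    """Sum pair energies between the atoms of a group and all environment atoms within the cutoff."""
    total = 0.0
    for a in group_atoms:
        for b in env_atoms:
            r = math.dist(a["xyz"], b["xyz"])
            if 0.0 < r <= cutoff:
                eps = math.sqrt(a["eps"] * b["eps"])      # geometric-mean well depth (assumption)
                rmin = a["rmin_half"] + b["rmin_half"]    # summed half-radii (assumption)
                total += pair_energy(a["q"], b["q"], eps, rmin, r)
    return total


def time_average_energy(frames, cutoff=14.0):
    """Average the group energy over snapshots; each frame is a (group_atoms, env_atoms) pair."""
    energies = [group_energy(g, env, cutoff) for g, env in frames]
    return sum(energies) / len(energies)
```

In practice the per-snapshot energies would come from the MD package itself; the sketch only makes explicit which pairs enter the sum and how the time average is formed.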
2.2.2 Solvent Reorganization, $\Delta H_{U,\mathrm{solv}}^{g}$
The second term in Eq. 1, $\Delta H_{U,\mathrm{solv}}^{g}$, is the change in solvation water enthalpy due to changes in the solvation shells around charged groups upon protein folding. The released water molecules become bulk water molecules in the peptide solution. Water molecules in the vicinity of proteins are “reorganized” with respect to the bulk water, and the corresponding radial distribution functions show structure up to 8 Å away from charged atomic groups (28). Three hydration shells are formed: the first hydration shell extends from 0 to 3.2 Å, a second shell is evident between 3.2 and 6 Å, and a less pronounced third hydration shell is formed between 6 and 8 Å.
To calculate the change in hydration water enthalpy and account for its release from the dipeptide surface in the unfolded state, we estimate the average numbers of water molecules in the first, second and third hydration shells for each charged group in the dipeptide solution ($N_{w}^{1}$, $N_{w}^{2}$, $N_{w}^{3}$), and the difference in the nonbonded energy of a water molecule in each hydration shell relative to bulk water: $\Delta h_{w}^{1}$, $\Delta h_{w}^{2}$ and $\Delta h_{w}^{3}$. The average change in enthalpy for each water molecule in hydration shell k, $\Delta h_{w}^{k}$, is calculated as follows:
$$\Delta h_{w}^{k} = \langle E_{nb}^{w,k} \rangle - \langle E_{nb}^{w,\mathrm{bulk}} \rangle, \quad k = 1, 2, 3 \qquad (4)$$
where $\langle E_{nb}^{w,k} \rangle$ and $\langle E_{nb}^{w,\mathrm{bulk}} \rangle$ are the average nonbonded energies of a hydration-shell water molecule and a bulk water molecule, respectively, in the dipeptide solution. $\langle E_{nb} \rangle$ is calculated according to Eq. 3. These quantities are obtained by post-processing the trajectories. $N_{w}^{1}$ is the average number of water molecules in a spherical shell extending to 3.2 Å from a charged group in the peptide solution. To estimate $N_{w}^{2}$, we calculated the number of water molecules in a spherical shell between 3.2 and 6 Å around the charged group, excluding water molecules that belong to the first hydration shell of other polar or apolar groups. In the same manner, $N_{w}^{3}$ is calculated to include water molecules between 6 and 8 Å from the charged group, excluding water molecules that are in the first hydration shell of apolar or polar groups. The maximum change in solvation water enthalpy for release of all hydration water molecules from the surface of the charged group when the protein folds, $\Delta H_{U,\mathrm{solv}}^{g,\max}$, is estimated as:
$$\Delta H_{U,\mathrm{solv}}^{g,\max} = \sum_{k=1}^{3} N_{w}^{k} \, \Delta h_{w}^{k} \qquad (5)$$
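For charged groups that are not completely buried upon protein folding, the actual solvent reorganization term, $\Delta H_{U,\mathrm{solv}}^{g}$, is less than its maximum value for complete burial, $\Delta H_{U,\mathrm{solv}}^{g,\max}$. The solvent reorganization contribution can be estimated by factoring in the fractional degree of charge burial, 1 − SE, where SE is the fractional exposure of the charged group to solvent (a definition of SE is given below). Therefore:
$$\Delta H_{U,\mathrm{solv}}^{g} = (1 - SE) \, \Delta H_{U,\mathrm{solv}}^{g,\max} \qquad (6)$$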
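The bookkeeping of Eqs. 1, 2 and 4 to 6 can be sketched in a few lines of Python. This is an illustrative sketch only. The function names are hypothetical, SE is passed as a percentage and converted to a fraction, and the sign convention (shell energy minus bulk energy, so that a positive value favors folding) is an assumption inferred from the statement that ΔHU > 0 favors the folded state.

```python
def water_enthalpy_change(e_shell, e_bulk):
    """Per-water enthalpy difference: hydration-shell water relative to bulk water (kcal/mol)."""
    return e_shell - e_bulk


def max_solvent_term(n_waters, e_shells, e_bulk):
    """Maximum solvent term for a fully buried group: shell populations times per-water changes."""
    return sum(n * water_enthalpy_change(e, e_bulk) for n, e in zip(n_waters, e_shells))


def solvent_term(dh_solv_max, se_percent):
    """Scale the maximum solvent term by the degree of burial, 1 - SE (SE given in percent)."""
    return (1.0 - se_percent / 100.0) * dh_solv_max


def unfolding_enthalpy(e_dipep, e_prot, dh_solv_max, se_percent):
    """Total group contribution: environmental term (dipeptide minus protein) plus solvent term."""
    return (e_dipep - e_prot) + solvent_term(dh_solv_max, se_percent)
```

For example, with the maximum solvent term of 2.4 kcal/mol reported for the Asp carboxylate and a hypothetical group that is 25% solvent exposed, `solvent_term(2.4, 25.0)` gives 1.8 kcal/mol.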
2.3 Fractional Solvent Exposure for Atomic Groups
The nonbonded energy of each atomic group depends on its specific microenvironment, the sum of all the interactions with the neighboring atomic groups. We distinguish “environments” according to the location of the charged groups: from fully buried in the protein to gradually exposed to the solvent on the protein surface. The peptide environment is taken here to be similar to the environment encountered by the charged groups in the unfolded protein.
The fractional degree of exposure to solvent, SE (in %), is the ratio of the time average solvent accessible surface area exposed by an atomic group in the folded protein, < SAg,prot >, and the time average solvent accessible surface area exposed by the same charged group in the corresponding dipeptide, < SAg,dipep >:
$$SE = \frac{\langle SA_{g,\mathrm{prot}} \rangle}{\langle SA_{g,\mathrm{dipep}} \rangle} \times 100 \qquad (7)$$
The time average SA for each atomic group is calculated from the time series of MD simulations using CHARMM (29) with a probe radius of 1.4 Å (using the algorithm of Lee and Richards (30)).
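A minimal Python sketch of Eq. 7 and of the assignment of a group to a solvent exposure category is given here for illustration. The function names are hypothetical, the surface area time series are assumed to be plain lists of per-snapshot values, and the treatment of values exactly on a category boundary (assigned to the lower category) is an assumption.

```python
from statistics import fmean

# Upper bound (in %) and label of each solvent exposure category.
SE_BINS = [
    (20.0, "SE0-20"),
    (40.0, "SE20-40"),
    (60.0, "SE40-60"),
    (80.0, "SE60-80"),
    (100.0, "SE80-100"),
]


def fractional_exposure(sa_protein, sa_dipeptide):
    """SE in percent: time-average SA in the folded protein over time-average SA in the dipeptide."""
    return 100.0 * fmean(sa_protein) / fmean(sa_dipeptide)


def se_category(se_percent):
    """Return the exposure category; values outside 0-100% are clamped before binning."""
    clamped = min(max(se_percent, 0.0), 100.0)
    for upper, label in SE_BINS:
        if clamped <= upper:
            return label
    return SE_BINS[-1][1]
```

Clamping is included because a group could in principle expose slightly more surface in a protein than in the reference dipeptide, which would give SE above 100%.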
2.4 Molecular Dynamics Simulations
Equilibrium MD simulations for four proteins and 8 dipeptides were performed using CHARMM (29). A solvated Ca2+ ion, which is found buried in both trypsin and α-lactalbumin, was also simulated in the “unfolded” state. The main simulation boxes contained 3500 to 5000 explicit water molecules (1100 water molecules were included in the dipeptide and ion solutions) as well as charge-neutralizing counterions. Additional Na+ and Cl− ions were added up to a final salt concentration of 0.166 M. Periodic boundary conditions and the particle mesh Ewald method (31) for calculating electrostatic interactions were used with a 2 fs time step for the integration of the equations of motion, in an NPT ensemble. Charged atomic groups have protonation states corresponding to pH 7, with the exception of Glu35 in HEW lysozyme. Other details of the simulations were as reported earlier (32, 33). Equilibrium MD trajectories of 2 ns were post-processed, and snapshots 10 fs apart were used to calculate the nonbonded energies of charged groups.
Distributions of nonbonded energies are shown in Figure 2 for the COO− group in glud, calculated with a 10, 12, or 14 Å cutoff distance. The distributions are well converged. The time-averaged nonbonded energy is −180.0 kcal/mol for 10 Å, −182.5 kcal/mol for 12 Å and −184.0 kcal/mol for 14 Å. Our analysis uses the time-average nonbonded energy values, which, for the well converged Enb distributions observed in this study, are effectively equivalent to the most likely nonbonded energy; average nonbonded energies are within 1 kcal/mol of the most likely energies. Nonbonded energy cutoffs were set to 14 Å and all images within 18 Å were included in the calculation.
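The comparison between the time-average and the most likely nonbonded energy can be sketched as follows. This is an illustrative Python sketch, not the analysis code used in the study: the most likely energy is taken as the center of the most populated histogram bin, and the bin width and function names are assumptions.

```python
import math
from collections import Counter
from statistics import fmean


def most_likely_energy(energies, bin_width=0.5):
    """Center of the most populated histogram bin of the energy distribution."""
    counts = Counter(math.floor(e / bin_width) for e in energies)
    best_bin, _ = counts.most_common(1)[0]
    return (best_bin + 0.5) * bin_width


def mean_vs_mode(energies, bin_width=0.5):
    """Return the time average, the most likely energy, and their absolute difference."""
    mean = fmean(energies)
    mode = most_likely_energy(energies, bin_width)
    return mean, mode, abs(mean - mode)
```

A difference well below the spread of the distribution, such as the 1 kcal/mol bound quoted above, indicates that either quantity can be used in the analysis.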
3 Results
3.1 Distribution of Charged Groups as a Function of Solvent Exposure
The positions of the four types of charged atomic groups in the protein structure in terms of solvent exposure (Eq. 7) are summarized in Table 1. The level of solvent exposure, SE, for charged atomic groups is divided into five categories: less than or equal to 20% exposed (SE0–20), between 20 and 40% (SE20–40), between 40 and 60% (SE40–60), between 60 and 80% (SE60–80), and between 80 and 100% (SE80–100) solvent exposed. The fraction of charged atomic groups (in %) is the ratio between the number of charged atomic groups of type i in each SE range and the total number of atomic groups of type i, multiplied by 100.
Table 1.
Group | Protein | Ng | SE0–20 | SE20–40 | SE40–60 | SE60–80 | SE80–100
---|---|---|---|---|---|---|---
COO− (ASP) | TRPS | 6 | 4 | 0 | 1 | 0 | 1
COO− (ASP) | RBNA | 5 | 2 | 1 | 0 | 2 | 0
COO− (ASP) | HLYSO | 7 | 1 | 3 | 2 | 1 | 0
COO− (ASP) | ALACTA | 11 | 3 | 3 | 1 | 3 | 1
COO− (ASP) | ΣNg | 29 | 10 | 7 | 4 | 6 | 2
COO− (ASP) | Fraction (%) | | 34 | 24 | 14 | 21 | 7
COO− (GLU) | TRPS | 4 | 2 | 0 | 1 | 0 | 1
COO− (GLU) | RBNA | 5 | 0 | 2 | 1 | 1 | 1
COO− (GLU) | HLYSO | 1 | 0 | 0 | 0 | 1 | 0
COO− (GLU) | ALACTA | 7 | 0 | 2 | 2 | 0 | 3
COO− (GLU) | ΣNg | 17 | 2 | 4 | 4 | 2 | 5
COO− (GLU) | Fraction (%) | | 12 | 24 | 24 | 12 | 28
Gu (ARG) | TRPS | 2 | 0 | 1 | 0 | 1 | 0
Gu (ARG) | RBNA | 4 | 1 | 1 | 1 | 1 | 0
Gu (ARG) | HLYSO | 11 | 1 | 1 | 3 | 3 | 3
Gu (ARG) | ALACTA | 1 | 0 | 0 | 0 | 0 | 1
Gu (ARG) | ΣNg | 18 | 2 | 3 | 4 | 5 | 4
Gu (ARG) | Fraction (%) | | 11 | 17 | 22 | 28 | 22
NH3+ (LYS) | TRPS | 14 | 1 | 1 | 4 | 6 | 2
NH3+ (LYS) | RBNA | 10 | 1 | 1 | 4 | 1 | 3
NH3+ (LYS) | HLYSO | 5 | 0 | 0 | 1 | 4 | 0
NH3+ (LYS) | ALACTA | 11 | 1 | 1 | 2 | 6 | 1
NH3+ (LYS) | ΣNg | 40 | 3 | 3 | 11 | 17 | 6
NH3+ (LYS) | Fraction (%) | | 7.5 | 7.5 | 27.5 | 42.5 | 15
All Crg | ΣΣNg | 104 | 17 | 17 | 23 | 30 | 17
All Crg | Fraction (%) | - | 17 | 16 | 23 | 30 | 16
The distribution of the charged groups by type as a function of solvent exposure is shown in Figure 3. For the proteins studied here, the carboxylate group of Asp has the highest percentage of buried groups: 58% are less than 40% solvent exposed, i.e., fall within SE0–40 (see also Tables 3S through 6S). By contrast, Lys NH3+ groups are largely solvent exposed; 42.5% fall within the range SE60–80, while only 15% have less than 40% SE. These data indicate that, on average, the basic amino acids Lys and Arg tend to position themselves closer to the protein-water interface than the acidic amino acids Asp and Glu.
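The percentages discussed above follow directly from the group counts in Table 1. The short Python sketch below recomputes them from the ΣNg rows; the variable and function names are hypothetical. Values computed from the raw counts can differ by about one percentage point from the rounded entries in the table and text.

```python
SE_RANGES = ["SE0-20", "SE20-40", "SE40-60", "SE60-80", "SE80-100"]

# Number of charged groups per solvent exposure range (summed over the four proteins, Table 1).
COUNTS = {
    "Asp": [10, 7, 4, 6, 2],
    "Glu": [2, 4, 4, 2, 5],
    "Arg": [2, 3, 4, 5, 4],
    "Lys": [3, 3, 11, 17, 6],
}


def percent_by_range(counts):
    """Percentage of groups of one type falling in each solvent exposure range."""
    total = sum(counts)
    return [100.0 * n / total for n in counts]


def percent_below_40(counts):
    """Percentage of groups of one type that are less than 40% solvent exposed."""
    return 100.0 * (counts[0] + counts[1]) / sum(counts)


if __name__ == "__main__":
    for group, counts in COUNTS.items():
        per_range = ", ".join(f"{p:.1f}" for p in percent_by_range(counts))
        print(f"{group}: {per_range} | SE < 40%: {percent_below_40(counts):.1f}")
```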
3.2 Unfolded State Protein Enthalpy
The time-averaged nonbonded energies and < SA > values for the charged groups in aspd, glud, argd and lysd, the methyl groups in alad, vald, leud and iled, and the Ca2+ ion were evaluated from distributions obtained from 2 ns MD simulations (Table 2S, Supplementary Material). The methyl group values range from −13.2 kcal/mol for leud to 2.2 kcal/mol for alad. Thus, methyl groups of vald, leud and iled (γ) have negative energies, while the side chains in alad and iled (δ) have weakly unfavorable nonbonded interactions with their environment. The charged groups have strong, favorable interaction energies ranging from −72 kcal/mol for the guanidinium group in argd to −201 kcal/mol for the carboxylate group in aspd. The carboxylate group in glud has a nonbonded interaction energy of −184 kcal/mol and the NH3+ group in lysd −86 kcal/mol.
3.3 Contributions of Apolar Groups to the Enthalpy of Unfolding from the Change in Local Environment
To provide a basis of comparison for the enthalpy of charged groups, the enthalpy change for apolar methyl groups upon protein folding was evaluated from the difference between the nonbonded interaction energy of methyl groups in apolar dipeptide molecules (alad, vald, leud, iled) and that of the corresponding apolar amino acids, Ala, Val, Leu and Ile, in folded proteins (Eq. 2). The average enthalpy change for a methyl group is 1.3 kcal/mol and varies with the amino acid in which the methyl group is embedded as well as its position in the protein structure (Table 1S, Supplementary Material). The enthalpic gain for a side chain methyl group ranges from 0.8 to 1.5 kcal/mol for Ala, from 0 to 0.6 kcal/mol for Val, from 1 to 1.9 kcal/mol for Ile, and from 0.6 to 1.1 kcal/mol for Leu. The variation observed here is consistent with the site-specific differences in contributions of apolar groups to protein stability from experiment (10), whereby the local environment, as characterized by the number of methyl and methylene groups in the neighborhood of a specific methyl group, was found to influence the free energy change upon folding.
3.4 Contributions to the Enthalpy of Unfolding Due to Solvent Reorganization
The change in the number of bound water molecules resulting from the increase in protein SA upon unfolding is accounted for by the solvent reorganization term, which is calculated from the nonbonded energy differences between a water molecule in pure solvent and one in each hydration shell, together with an estimate of the number of hydration water molecules (see Theory and Computational Methods). The per-water values for each hydration shell, and the maximum change in solvent enthalpy due to reorganization for a fully buried charged group, are listed in Table 7S, Supplementary Material. The per-water values depend on the nonbonded energy of a water molecule in pure solvent. The maximum change in solvent enthalpy ranges from 1.1 to 2.4 kcal/mol.
While water molecules in the first hydration shell of charged groups have lower nonbonded energies than bulk water molecules (by 1 to 3 kcal/mol, Table 7S) due to strong charge-dipole interactions between the charged atomic groups and water molecular dipoles, water molecules in the second and third hydration shells have unsatisfied hydrogen bonding capabilities, so their nonbonded energies are greater than that of bulk water. The combined result of the integrated effects for the loss of hydration shells is a net favorable enthalpy gain upon protein folding. The contribution of solvent reorganization/release favors protein folding but differs among charged atomic groups; the maximum gain in enthalpy upon complete burial of a charged group is for the COO− group of Asp, which contributes 2.4 kcal/mol towards the folded state stability. Complete burial of the COO− group of Glu, or of the charged groups of arginine and lysine, yields a gain in solvent enthalpy of 1.7, 1.9 and 1.1 kcal/mol, respectively. Estimates from vald for the maximum change in the enthalpy due to water molecules near apolar groups (Table 7S, last row) indicate that the change is similar in magnitude (approximately 2.4 kcal/mol) to that for charged groups, even though the mechanism that generates the change is different. Water molecules in the first and third hydration shells of apolar groups have a less favorable nonbonded interaction energy than average waters, while water molecules in the second hydration shell have a more favorable interaction energy.
3.5 Charged Group Enthalpy of Unfolding, ΔHU
The enthalpic contributions to protein unfolding (Eq. 1), summed over the charged groups from Asp, Glu, Arg or Lys residues in each of the four proteins, are shown in Table 2 as a function of SE. The individual residue contributions are provided in Tables 3S to 6S (Supplementary Material) along with residue values for the nonbonded energy, < SA >, SE, and the environmental and solvent reorganization terms. Values for ΔHU summed over residue type in each of the four proteins (Table 2) are largely positive in sign, so that the enthalpic contribution from charged groups generally stabilizes the folded state. Integrated over all degrees of solvent exposure (i.e. SE0–100), the residue sums for ΔHU are positive in all but two cases: Glu residues in HLYSO (−0.8 kcal/mol) and Lys residues in ALACTA (−1.8 kcal/mol).
Table 2.
Group | Protein | ΔHU (kcal/mol), SE 0–100 | SE 0–20 | SE 20–40 | SE 40–60 | SE 60–80 | SE 80–100
---|---|---|---|---|---|---|---
g = COO− (ASP) | TRPS | +30.7 | +16.0 | 0 | +14.5 | 0 | +0.2
g = COO− (ASP) | RBNA | +33.1 | +8.4 | +14.0 | +0.9 | +9.8 | 0.0
g = COO− (ASP) | HLYSO | +33.8 | +9.9 | 8.7 | +14.3 | +0.9 | 0.0
g = COO− (ASP) | ALACTA | +38.4 | +29.9 | +15.8 | +1.3 | −2.9 | −5.7
g = COO− (ASP) | ΣΔHU | +136.0 | +64.2 | +38.5 | +31.0 | +7.8 | −5.5
g = COO− (ASP) | ΣNg | 29 | 10 | 7 | 4 | 6 | 2
g = COO− (ASP) | Average per group | +4.7 | +6.4 | +5.5 | +7.8 | +1.3 | −2.8
g = COO− (GLU) | TRPS | +27.9 | +24.4 | 0.0 | +4.4 | 0.0 | +0.9
g = COO− (GLU) | RBNA | +18.5 | 0.0 | +19.1 | −2.1 | −2.3 | +3.8
g = COO− (GLU) | HLYSO | −0.8 | 0.0 | 0.0 | 0.0 | −0.8 | 0.0
g = COO− (GLU) | ALACTA | +1.6 | 0.0 | +3.0 | +3.9 | 0.0 | −5.3
g = COO− (GLU) | ΣΔHU | +49.0 | +24.4 | +22.1 | +6.2 | −3.1 | −0.6
g = COO− (GLU) | ΣNg | 17 | 2 | 4 | 4 | 2 | 5
g = COO− (GLU) | Average per group | +2.9 | +12.2 | +5.5 | +1.4 | −1.6 | −0.1
g = Gu (ARG) | TRPS | +2.1 | 0.0 | 0.1 | 0.0 | +2.0 | 0.0
g = Gu (ARG) | RBNA | +9.0 | +2.5 | +3.4 | +2.8 | +0.3 | 0.0
g = Gu (ARG) | HLYSO | +12.6 | +5.0 | +1.6 | +7.8 | +1.7 | −1.6
g = Gu (ARG) | ALACTA | +5.2 | 0 | 0 | 0 | 0 | +5.2
g = Gu (ARG) | ΣΔHU | +28.9 | +7.5 | +5.1 | +10.6 | +4.0 | +3.6
g = Gu (ARG) | ΣNg | 18 | 2 | 3 | 4 | 5 | 4
g = Gu (ARG) | Average per group | +1.6 | +3.8 | +1.7 | +2.7 | +0.8 | +0.9
g = NH3+ (LYS) | TRPS | +8.7 | +2.4 | −1.3 | +6.3 | −0.2 | +1.7
g = NH3+ (LYS) | RBNA | +0.6 | −1.3 | −3.3 | +5.7 | −1.0 | +0.6
g = NH3+ (LYS) | HLYSO | +1.7 | 0.0 | 0.0 | +1.6 | 0.1 | 0.0
g = NH3+ (LYS) | ALACTA | −1.8 | +0.1 | −3.2 | −0.1 | +3.5 | −2.1
g = NH3+ (LYS) | ΣΔHU | +9.2 | +1.2 | −7.8 | +13.5 | +2.4 | +0.2
g = NH3+ (LYS) | ΣNg | 40 | 3 | 3 | 11 | 17 | 6
g = NH3+ (LYS) | Average per group | +0.2 | +0.4 | −2.6 | +1.2 | | 0.03
All | ΣΣΔHU | 223.1 | 97.3 | 57.9 | 61.3 | 11.1 | −2.3
All | ΣΣNg | 104 | 17 | 17 | 23 | 30 | 17
All Ave | Average per group | 2.1 | 5.7 | 3.4 | 2.7 | 0.4 | −0.2
The global average of ΔHU over all charged groups is 2.1 kcal/mol (Table 2, last row), a combination of a contribution from protein environmental effects and a contribution from solvent reorganization (Tables 8S and 9S, Supplementary Material). These charged group global averages are similar to the corresponding averages for methyl groups. Nonetheless, the variation among charged groups is considerably larger than that among methyl groups (Tables 3S to 6S). While methyl groups in apolar amino acids contribute favorably to the enthalpy of the protein folded state irrespective of solvent exposure (Table 1S), the charged atomic groups make both favorable and unfavorable contributions to protein folding.
The positive solvent reorganization values listed for individual charged groups from the four proteins in Tables 3S to 6S also indicate that formation of hydration shells at the surface of the unfolded protein increases enthalpy. Thus, in addition to the well known increase in entropy upon the release of hydration water molecules during folding, the loss of hydration water molecules also favors folding for enthalpic reasons, as noted previously (35). Nonetheless, the solvent reorganization term is generally smaller in magnitude (at most 2.4 kcal/mol) than the environmental term, which ranges from −5 to 13 kcal/mol.
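The per-group averages quoted from Table 2 are simply the summed ΔHU divided by the number of groups. The following Python sketch reproduces that arithmetic from the SE 0–100 totals in Table 2; the dictionary layout and function name are hypothetical.

```python
# (summed unfolding enthalpy in kcal/mol, number of groups) for SE 0-100, taken from Table 2.
TABLE2_TOTALS = {
    "Asp COO-": (136.0, 29),
    "Glu COO-": (49.0, 17),
    "Arg Gu": (28.9, 18),
    "Lys NH3+": (9.2, 40),
}


def per_group_average(total_dh, n_groups):
    """Average contribution of one charged group to the unfolding enthalpy (kcal/mol)."""
    return total_dh / n_groups


averages = {name: per_group_average(dh, n) for name, (dh, n) in TABLE2_TOTALS.items()}
grand_total = sum(dh for dh, _ in TABLE2_TOTALS.values())
grand_count = sum(n for _, n in TABLE2_TOTALS.values())
global_average = per_group_average(grand_total, grand_count)

if __name__ == "__main__":
    for name, value in averages.items():
        print(f"{name}: {value:.1f} kcal/mol per group")
    print(f"All charged groups: {global_average:.1f} kcal/mol per group")
```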
3.5.1 Carboxylate Group in Asp
Asp residues exhibit the strongest enthalpic interactions stabilizing the folded state. ΔHU summed over all Asp residues (SE from 0 to 100) is greater than 30 kcal/mol for each of the four proteins (Table 2). Based on these representative globular proteins, carboxylate groups from Asp contribute on average 4.7 kcal/mol per group to protein folded-state stability (Table 2). The fully buried COO− group of Asp176 contributes ΔHU = 9.9 kcal/mol toward stabilization of the TRPS folded state. This gain in enthalpic stability is more than twice the average enthalpic gain due to the transfer of a CH3 group from water to the hydrophobic core of a protein (see above). As an example of the opposite effect on stability, we note that a different buried COO− group in TRPS, that of Asp171 (Table 3S), has ΔHU = −2.6 kcal/mol and contributes unfavorably to the folded state of trypsin.
3.5.2 Carboxylate Group in Glu
Glu carboxylate groups have an average ΔHU of 2.9 kcal/mol per group (Table 2) and can strongly stabilize tertiary structure, although to a somewhat lesser extent than those of Asp. The COO− group in glud has a weaker interaction with the solvent (−184 kcal/mol) than its counterpart in aspd (−201 kcal/mol). This difference is likely due to the more hydrophobic nature of glutamic acid (it has an extra CH2 group). As part of a protein, however, the COO− group of glutamic acid can contribute as much favorable enthalpy as that of an aspartic acid. The COO− group of a buried glutamic acid, Glu52, contributes 13.7 kcal/mol to the enthalpy of protein unfolding, and thus favors folding almost four times as much as an average methyl group, while the surface exposed charged group from Glu167 contributes only 0.9 kcal/mol.
3.5.3 Guanidinium Group in Arg
The average contribution of the guanidinium group of Arg residues to protein stability is slightly favorable, at 1.6 kcal/mol (Table 2). Protein folding induces both positive and negative changes in enthalpy, with values for different Arg residues ranging from +5.2 kcal/mol for the charged group from Arg11 in α-lactalbumin (Table 6S) to −2.2 kcal/mol for the charged group from Arg128 in HEW lysozyme.
3.5.4 The NH3+ Group in Lys
Unlike other charged groups, the average contribution of the NH3+ group of Lys is close to negligible: 0.2 kcal/mol per group (Table 2). The folded-state environments are energetically similar to that of solvated Lys (Table 8S).
3.6 Charged Atomic Groups and Solvent Exposure
The group average contributions from different charged group types with given solvent exposure (Table 2), , are plotted as a function of solvent exposure in Figure 4. The major stabilizing contributions are from the buried COO− groups of Glu (red) and Asp (black) with SE less than 60%. In particular, the COO− groups from Glu, which are less than 20% exposed to solvent, contribute an average 12.2 kcal/mol. By contrast, groups with more than 60% of their surface exposed to solvent have on a much smaller contribution to protein stability. The trend in for types of charged groups is roughly consistent with their frequency of occurrence as a function of SE. Comparison of Figures 3 and 4 finds higher frequency for greater stabilization in the folded state, as expected. 58% of Asp, 36% of Glu and 28% of Arg charged groups are more than 40% buried, while 85% of Lys charged groups are largely exposed to water.
The cumulative contribution of all charged groups of a protein to ΔHU as a function of solvent exposure is shown in Figure 5. For the four proteins studied here, the cumulative contribution increases up to SE 40–60%, then plateaus or decreases slightly with the addition of the charged groups near the surface. Overall, charged groups contribute 43.4 kcal/mol to the stability of ALACTA, 49.2 kcal/mol to that of HLYSO, 61.3 kcal/mol to that of RBNA, and 71.4 kcal/mol to that of TRPS.
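The cumulative curve described above can be illustrated with a minimal sketch, assuming each charged group is represented by a pair of solvent exposure (in percent) and enthalpic contribution (in kcal/mol). The function name, the bin edges and the example values are hypothetical and are not taken from the tables of this study; they only show how a curve that rises and then levels off arises when near-surface groups add little or slightly negative enthalpy.

```python
from itertools import accumulate


def cumulative_by_exposure(groups, bin_edges=(20, 40, 60, 80, 100)):
    """Sum per-group contributions into SE bins, then accumulate from buried to exposed.

    groups: iterable of (solvent_exposure_percent, contribution_kcal_per_mol).
    Returns a list of (upper_bin_edge, cumulative_contribution).
    """
    per_bin = [0.0] * len(bin_edges)
    for se, dh in groups:
        for i, edge in enumerate(bin_edges):
            if se <= edge:
                per_bin[i] += dh
                break
        else:
            # No bin accepted this value, so the exposure is out of range.
            raise ValueError(f"solvent exposure {se} exceeds the last bin edge")
    return list(zip(bin_edges, accumulate(per_bin)))


# Hypothetical values for illustration only (not data from this study).
example = [(5, 12.0), (15, 8.0), (35, 4.0), (55, 3.0), (75, -1.0), (95, 0.5)]
for edge, total in cumulative_by_exposure(example):
    print(f"SE <= {edge:3d}%: {total:5.1f} kcal/mol")
```

With these illustrative numbers the running total rises through the 40–60% bin and then changes little, which is the qualitative shape reported for Figure 5.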
3.7 Correlation between the Charged-Group Enthalpy and the Excess Charge Density, δxs
In an earlier paper (8) we linked protein folded state stability to the excess charge on the protein surface, and found that globular proteins with lower excess charge, or a more uniform charge distribution, have a more positive unfolding enthalpy per residue and thus are enthalpically more stable. The negative correlation between excess charge, δxs, and experimental ΔHU values was shown for a set of 16 globular proteins (see Figure 5 in (8)). To establish a causal relationship in this correlation, we examine here whether enthalpies calculated from only charged groups exhibit this trend. Figure 6 shows the total contribution of charged groups, and the contribution from environmental change only, as a function of δxs for all charged groups in the four proteins as determined in this study. The inverse correlation observed here for the enthalpy from charged residues alone supports the postulate that differences among proteins in the specific enthalpy of unfolding are due to the spatial distribution of charge. Because most of the stabilizing effect related to charged groups comes from groups that are largely buried, the relationship observed with experimental data, showing increased protein stability with decreasing excess charge, is at least in part due to optimized nonbonded interactions between charged groups and their environment in the native folded state.
It is interesting to note that the contribution from solvent reorganization, even though substantial (approximately 22 kcal/mol for each protein), does not influence the variation in stability among proteins, as shown by the two almost parallel best-fit lines in Figure 6. The correlation coefficient is −0.89 for the linear fit shown as the upper line in Figure 6 and −0.93 for the fit shown as the lower line. Therefore, when comparing relative stabilities of globular proteins, the enthalpy gain due to solvent release around charged groups appears to provide an approximately constant enhancement to the specific enthalpy of unfolding.
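The reasoning behind the parallel fits can be made concrete with a short sketch: if a strictly constant term is added to every y-value, a least-squares line keeps its slope and correlation coefficient and only its intercept shifts. The code below is an idealized illustration of that limit, not a reproduction of Figure 6. The function name and all numerical values (the δxs values, the per-residue enthalpies and the offset) are hypothetical placeholders; in the actual data the offset is only approximately constant, which is why the two reported coefficients differ slightly.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept, plus Pearson r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical placeholder values for four proteins (not data from this study).
delta_xs = [0.01, 0.02, 0.03, 0.04]   # excess charge density
env_only = [0.40, 0.33, 0.30, 0.22]   # environment-only enthalpy per residue
offset = 0.15                         # idealized constant solvent-release term
total = [v + offset for v in env_only]

print(linear_fit(delta_xs, env_only))
print(linear_fit(delta_xs, total))
```

Both calls return the same slope and the same negative r; only the intercept differs, by the value of the offset.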
4 Discussion and Conclusions
The energetic analysis of protein charged groups reported here was motivated by a surprising empirical correlation observed between the folded-state spatial distribution of charged groups and the residue-specific enthalpic stability of globular proteins (8). We sought to determine whether the enthalpic contribution of charged groups to ΔHU increasingly favors the folded state as the fraction of buried charged residues increases. If so, the tertiary structures of proteins with more uniform charge distributions should provide a basis for learning what is needed to reduce the substantial energetic penalty that arises from the removal of charged groups from an aqueous environment. Such information can be exploited in future protein design. A key outcome of this study is that the estimated enthalpy due to charged groups for different proteins was found to be inversely related to the excess surface charge δxs (Figure 6). The results thus support both the hypothesis that buried charge lowers native protein enthalpy and a causal basis for the correlation (8).
Since the publication by Murphy, Privalov and Gill (4) correlating heat capacity with the entropy and enthalpy of protein unfolding, the basis for differences among globular proteins in the entropy per residue of unfolding is known to be the amount of buried apolar surface area (34). Nevertheless, proteins with large amounts of buried apolar surface do not gain stability despite the larger entropy favoring folding, and the origin of the compensating enthalpy was unclear (7, 34, 36). Studies of compressibility (8, 37, 38) identified relationships with the unfolding enthalpy and entropy similar to those of heat capacity, and suggested a rationale based on charge distribution for variations in the enthalpy of native, folded proteins (8). The calculated energies reported here confirm this rationale by demonstrating that enthalpic stability is gained by strong, favorable electrostatic interactions generated when charged groups are located in the interior as well as on the protein surface. Increasing the number of buried charged groups necessarily opposes burying apolar surface, and thus changes in specific enthalpy by this mechanism would be compensated by opposing ones in the entropy per residue. Further, attributing differences in the specific thermodynamic properties of proteins to charged residues is reasonable because the variation in the fraction of charged residues is relatively large compared to that for uncharged polar and apolar residues (8).
We find that a number of buried charged groups make significant, favorable contributions to structural stability. The main argument against structural stabilization by charges buried in a protein interior is the large penalty for desolvation, supported by experimental measures of the free energy of transfer between water and organic solvents (19). On the other hand, it is well recognized that protein interiors are not a homogeneous, hydrophobic medium; numerous uncharged polar groups and some charged polar groups compose electrostatic networks and specific local environments shielded from solvent (34, 39). The question is whether the local environments can compensate sufficiently for the loss of water around a charged group. This study finds that the set of buried charged groups from four globular proteins exhibits a large range of values that span both favorable and unfavorable contributions to the total enthalpy of unfolding. The contributions of individual charged groups range from 14 kcal/mol to −3 kcal/mol (supplemental Tables 3–6), with the average over all groups examined here equal to 2.1 kcal/mol (Table 2), near the value of 1.3 kcal/mol averaged over CH3 apolar groups (supplemental Table 1). The different types of charged groups vary in their tendency to be buried; carboxylate groups are the most likely to be buried while amino groups are the least likely. Moreover, the interactions of buried Glu and Asp carboxylate groups are found to be overall lower in energy than those of buried Arg and Lys basic groups (Table 2). Indeed, the most favorable contributions for COO− groups occur for fully buried groups, SE 0–20%. Thus, many environments are indeed sufficient to overcome the desolvation penalty, and the total contribution for all charged groups favors the folded state.
What is the explanation for the variations in the enthalpic contribution for a given type of charged group with similar solvent exposure? While a rigorous explanation is likely complex, we suggest that a simple assessment of the local electrostatic environment may explain how a charged group is accommodated upon protein folding. The number of charged groups and the net charge, Qtot, were evaluated for a spherical region of a given radius (5 Å in the examples below) centered on the group of interest. The conjecture is that a small local net charge, generated from two or more groups in addition to the central group, creates a “good neighborhood,” and the atomic group is likely to provide a positive contribution to the enthalpy of folding. This premise is supported by several examples in trypsin. Glu52 is buried yet has one of the largest favorable contributions to the protein folded state enthalpy: +12 kcal/mol (Table 3S). A 5 Å-radius sphere surrounding this group includes five charged groups (Arg49, Asp53, Glu59, Glu62, Arg99) and a Ca2+ ion, with a net charge of Qtot = 0, making a good neighborhood for embedding a charged group. The carboxylate group of Glu62 from the same cluster of charges contributes +9 kcal/mol; its local neighborhood contains six charged groups (Arg49, Glu52, Asp53, Glu59, Arg99, Lys125) and has a total charge of Qtot = −1e. On the other hand, Asp171 contributes −5 kcal/mol, and this destabilization derives from six charged groups (Lys125, Lys136, Lys170, Asp176, Lys200) within 5 Å but with a net positive charge of +2e. A more extensive analysis is underway to establish a quantitative relationship between the structural features of a local environment and the contribution to the folded state stability.
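The net charges quoted for the three trypsin neighborhoods can be checked with simple bookkeeping. The sketch below is only an arithmetic illustration under stated assumptions: each group carries an integer formal charge (Arg and Lys +1, Asp and Glu −1, the calcium ion +2), the central group is counted in the sum, and the neighbor lists are exactly the residues named in the text. The function names and the "Ca" label are hypothetical; no distances or structures are used.

```python
# Assumed integer formal charges (in units of e) for the group types named in the text.
FORMAL_CHARGE = {"Arg": 1, "Lys": 1, "Asp": -1, "Glu": -1, "Ca": 2}


def group_type(label):
    """Strip the residue number, e.g. 'Glu52' -> 'Glu'; 'Ca' is returned unchanged."""
    return label.rstrip("0123456789")


def net_charge(central, neighbors):
    """Net formal charge of the central group plus the listed neighbors."""
    return sum(FORMAL_CHARGE[group_type(g)] for g in [central, *neighbors])


# Neighbor lists copied from the text; "Ca" stands for the Ca2+ ion.
neighborhoods = {
    "Glu52": ["Arg49", "Asp53", "Glu59", "Glu62", "Arg99", "Ca"],
    "Glu62": ["Arg49", "Glu52", "Asp53", "Glu59", "Arg99", "Lys125"],
    "Asp171": ["Lys125", "Lys136", "Lys170", "Asp176", "Lys200"],
}

for central, neighbors in neighborhoods.items():
    print(f"{central}: Qtot = {net_charge(central, neighbors):+d} e")
```

Under these assumptions the sums reproduce the values stated above: 0 for Glu52, −1e for Glu62 and +2e for Asp171.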
Experimental analysis of the effect of buried charged amino acids on protein structural stability has generated conflicting conclusions. Experimental mutagenesis of buried charged groups has an intrinsic limitation in that substitution of a given residue, or of a residue pair forming a salt bridge, can disrupt the structure and affect the local interactions of many atomic groups, charged and uncharged. As a result, factors other than simple removal of charge come into play. By contrast, the computational approach based on MD simulations taken here allows a direct analysis of the energetics.
The occurrence of both positive and negative enthalpic contributions explains certain conflicting results from mutagenesis studies; depending on which salt bridges are mutated, measurements would indicate either stabilizing or destabilizing effects. In addition, engineering charged residues into a structure for stability likely requires long-range analysis of the full structure rather than simple pairwise mutation of proximal residues with appropriate geometry for substitution of a salt bridge. Our study suggests that attempts to design buried charged groups into a protein (24, 40, 41) can benefit from consideration of the carefully crafted network of interactions that has evolved in native proteins with uniform charge distribution, perhaps through the development of parameters like the local net charge Qtot.
Supplementary Material
References
- 1. Wang Z, Moult J. SNPs, protein structure, and disease. Hum Mutat. 2001;17:263–70. doi: 10.1002/humu.22.
- 2. Partridge AW, Therien AG, Deber CM. Missense mutations in transmembrane domains of proteins: phenotypic propensity of polar residues for human disease. Proteins. 2004;54:648–56. doi: 10.1002/prot.10611.
- 3. Yue P, Li Z, Moult J. Loss of protein structure stability as a major causative factor in monogenic disease. J Mol Biol. 2005;353:459–73. doi: 10.1016/j.jmb.2005.08.020.
- 4. Murphy KP, Privalov PL, Gill SJ. Common features of protein unfolding and dissolution of hydrophobic compounds. Science. 1990;247:559–561. doi: 10.1126/science.2300815.
- 5. Yang AS, Sharp KA, Honig BV. Analysis of the heat capacity dependence of protein folding. J Mol Biol. 1992;227:889–890. doi: 10.1016/0022-2836(92)90229-d.
- 6. Xiao L, Honig BH. Electrostatic contributions to the stability of hyperthermophilic proteins. J Mol Biol. 1999;289:1435–1444. doi: 10.1006/jmbi.1999.2810.
- 7. Lazaridis T, Archontis G, Karplus M. Enthalpic contribution to protein stability: Insights from atom-based calculations and statistical mechanics. Adv Protein Chem. 1995;47:231–306. doi: 10.1016/s0065-3233(08)60547-1.
- 8. Dadarlat VM, Post CB. Adhesive-cohesive model for protein compressibility: an alternative perspective on stability. Proc Natl Acad Sci USA. 2003;100:14778–14783. doi: 10.1073/pnas.2434157100.
- 9. Perutz MF. Electrostatic effects in proteins. Science. 1978;201:1187–1191. doi: 10.1126/science.694508.
- 10. Serrano L, Kellis J, Cann P, Matouschek A, Fersht AR. The folding of an enzyme. II. Substructure of barnase and the contribution of different interactions to protein stability. J Mol Biol. 1992;224:783–804. doi: 10.1016/0022-2836(92)90562-x.
- 11. Marqusee S, Baldwin RL. Helix stabilization by Glu−...Lys+ salt bridges in short peptides of de novo design. Proc Natl Acad Sci USA. 1987;84:8898–8902. doi: 10.1073/pnas.84.24.8898.
- 12. Habermann SM, Murphy KP. Energetics of hydrogen bonding in proteins: a model compound study. Protein Sci. 1996;5:1229–1239. doi: 10.1002/pro.5560050702.
- 13. Myers JK, Pace CN. Hydrogen bonding stabilizes globular proteins. Biophys J. 1996;71:2033–2039. doi: 10.1016/S0006-3495(96)79401-8.
- 14. Grimsley GR, Shaw KL, Fee LR, Alston RW, Huyghues-Despointes BM, Thurlkill RL, Scholtz JM, Pace CN. Increasing protein stability by altering long-range coulombic interactions. Protein Sci. 1999;8:1843–1849. doi: 10.1110/ps.8.9.1843.
- 15. Ragone R. Hydrogen-bonding classes in proteins and their contribution to the unfolding reaction. Protein Sci. 2001;10:2075–2082. doi: 10.1110/ps.09201.
- 16. Dill KA. Dominant forces in protein folding. Biochemistry. 1990;29:7133–7155. doi: 10.1021/bi00483a001.
- 17. Hendsch ZS, Tidor B. Do salt bridges stabilize proteins? A continuum electrostatic analysis. Protein Sci. 1994;3:211–226. doi: 10.1002/pro.5560030206.
- 18. Honig BH, Yang AS. Free energy balance in protein folding. Adv Protein Chem. 1995;46:27–58. doi: 10.1016/s0065-3233(08)60331-9.
- 19. Honig BH, Hubbell WL. Stability of salt bridges in membrane proteins. Proc Natl Acad Sci USA. 1984;81:5412–5416. doi: 10.1073/pnas.81.17.5412.
- 20. Kumar S, Tsai CJ, Nussinov R. Maximal stabilities of reversible two-state proteins. Biochemistry. 2002;41:5359–5374. doi: 10.1021/bi012154c.
- 21. Szilágyi A, Závodszky P. Structural differences between mesophilic, moderately thermophilic and extremely thermophilic protein subunits: Results of a comprehensive survey. Structure. 2000;8:493–504. doi: 10.1016/s0969-2126(00)00133-7.
- 22. Anderson DE, Becktel WJ, Dahlquist FW. pH-induced denaturation of proteins: a single salt bridge contributes 3–5 kcal/mol to the free energy of folding of T4 lysozyme. Biochemistry. 1990;29:2403–8. doi: 10.1021/bi00461a025.
- 23. Sun DP, Sauer U, Nicholson H, Matthews BW. Contributions of engineered surface salt bridges to the stability of T4 lysozyme determined by directed mutagenesis. Biochemistry. 1991;30:7143–7153. doi: 10.1021/bi00243a015.
- 24. Perl D, Mueller U, Heinemann U, Schmid FX. Two exposed amino acid residues confer thermostability on a cold shock protein. Nat Struct Mol Biol. 2000;7:380–383. doi: 10.1038/75151.
- 25. Schneider JP, Lear JD, DeGrado WF. A designed buried salt bridge in a heterodimeric coiled coil. J Am Chem Soc. 1997;119:5742–5743.
- 26. Waldburger CD, Schildbach JF, Sauer RT. Are buried salt bridges important for protein stability and conformational specificity? Nat Struct Biol. 1995;2:122–128. doi: 10.1038/nsb0295-122.
- 27. King JL, Jukes TH. Non-Darwinian evolution. Science. 1969;164:788–798. doi: 10.1126/science.164.3881.788.
- 28. Dadarlat VM, Post CB. Decomposition of protein experimental compressibility into intrinsic and hydration shell contributions. Biophys J. 2006;91:4544–4554. doi: 10.1529/biophysj.106.087726.
- 29. Brooks B, Bruccoleri R, Olafson B, States D, Swaminathan S, Karplus M. CHARMM: A program for macromolecular energy, minimization and dynamics calculations. J Comput Chem. 1983;4:187–217.
- 30. Lee B, Richards FM. The interpretation of protein structures: estimation of static accessibility. J Mol Biol. 1971;55:379–400. doi: 10.1016/0022-2836(71)90324-x.
- 31. Darden TA, York DM, Pedersen LG. Particle mesh Ewald: An N log(N) method for Ewald sums in large systems. J Chem Phys. 1993;98:10089–10092.
- 32. Dadarlat VM, Post CB. Insights into protein compressibility from molecular dynamics simulations. J Phys Chem B. 2001;105:715–724.
- 33. Dadarlat VM. Potentials of mean force (PMFs) for the interaction of blocked alanine dipeptide molecules in water and gas phase from MD simulations. Biophys J. 2005;89:1433–1445. doi: 10.1529/biophysj.104.054130.
- 34. Honig BH. Protein folding: From the Levinthal paradox to structure prediction. J Mol Biol. 1999;293:283–293. doi: 10.1006/jmbi.1999.3006.
- 35. Cheng Y, Rossky P. Surface topography dependence of biomolecular hydrophobic hydration. Nature. 1998;392:696–699. doi: 10.1038/33653.
- 36. Makhatadze GI, Privalov PL. Energetics of protein structure. Adv Protein Chem. 1995;47:307–425. doi: 10.1016/s0065-3233(08)60548-3.
- 37. Phelps DK, Post CB. A novel basis for capsid stabilization by antiviral compounds. J Mol Biol. 1995;254:544–551. doi: 10.1006/jmbi.1995.0637.
- 38. Phelps DK, Rossky PJ, Post CB. Influence of an antiviral compound on the temperature dependence of viral protein flexibility and packing: a molecular dynamics study. J Mol Biol. 1998;276:331–337. doi: 10.1006/jmbi.1997.1542.
- 39. Warshel A. Electrostatic origin of the catalytic power of enzymes and the role of preorganized active sites. J Biol Chem. 1998;273:27035–38. doi: 10.1074/jbc.273.42.27035.
- 40. Godoy-Ruiz R, Perez-Jimenez R, Ibarra-Molero B, Sanchez-Ruiz JM. Relation between protein stability, evolution and structure, as probed by carboxylic acid mutations. J Mol Biol. 2004;336:313–318. doi: 10.1016/j.jmb.2003.12.048.
- 41. Godoy-Ruiz R, Perez-Jimenez R, Ibarra-Molero B, Sanchez-Ruiz JM. A stability pattern of protein hydrophobic mutations that reflects evolutionary structural optimization. Biophys J. 2005;89:3320–3331. doi: 10.1529/biophysj.105.067025.