Abstract
Characterization of solvent preferences of proteins is essential to the understanding of solvent effects on protein structure and stability. Although it is generally believed that solvent preferences at distinct loci of a protein surface may differ, quantitative characterization of local protein solvation has remained elusive. In this study, we show that local solvation preferences can be quantified over the entire protein surface from extended molecular dynamics simulations. By subjecting microsecond trajectories of two proteins (lysozyme and antibody fragment D1.3) in 4 M glycerol to rigorous statistical analyses, solvent preferences of individual protein residues are quantified by local preferential interaction coefficients. Local solvent preferences for glycerol vary widely from residue to residue and may change as a result of protein side-chain motions that are slower than the longest intrinsic solvation timescale of ∼10 ns. Differences of local solvent preferences between distinct protein side-chain conformations predict solvent effects on local protein structure in good agreement with experiment. This study extends the application scope of preferential interaction theory and enables molecular understanding of solvent effects on protein structure through comprehensive characterization of local protein solvation.
Introduction
Cosolvents such as denaturants, salts, amino acids, polyols, and sugars play an important role in many protein processes involving protein folding, stabilization, and association (1–3). This is because the addition of cosolvents to aqueous protein solutions commonly alters the equilibrium between protein conformations (4–6). Over the past decades, a rigorous thermodynamic framework has been developed that relates cosolvent effects on protein conformations with solvent preferences of the protein surface (4,7–19). This framework—which is often referred to as “preferential interaction theory”—stipulates that adding cosolvent to a protein solution will shift the protein toward conformations with a greater degree of preferential solvation by the cosolvent. Preferential solvation of a protein is quantified by the preferential interaction coefficient ΓXP (4,20). Because every solvent molecule at the protein surface contributes to ΓXP (7,11,21), detailed characterization of protein solvation is required for molecular understanding of solvent effects on protein conformations.
Preferential interactions reflect relative preferences of the protein surface for either cosolvent or water, and are manifested in local concentration ratios of cosolvent and water that are either greater (preferential solvation), smaller (preferential hydration), or equal (neutral solvation) with respect to the bulk solvent (4). Because protein surfaces comprise physically and chemically distinct surface loci, protein solvation in a mixed solvent can be conceived of as an ensemble of preferentially hydrated, solvated, and neutral solvent regions near the protein surface (4,22). Just like a protein displays a mosaic of heterogeneous surface loci to the solvent, the solvent surrounding a protein is expected to form a mosaic of solvent regions with varying degrees of preferential interactions.
Unfortunately, characterization of the protein solvation mosaic has remained elusive to this day. Although various spectroscopy- and NMR-based techniques have revealed many aspects of protein solvation, these techniques do not allow quantifying local solvation preferences (23–25). On the other hand, equilibrium techniques such as vapor pressure osmometry and dialysis-densitometry allow the measurement of the preferential interaction coefficient ΓXP of the ensemble-average of all protein conformations in a solvent mixture (4,26,27). However, equilibrium techniques cannot quantify differences of ΓXP between distinct protein conformations and are unable to resolve local preferential interactions at distinct protein surface loci.
One approach to understand solvent effects on protein conformations was pioneered by Tanford (22), who quantified thermodynamic solvent effects on smaller constituent groups of a protein molecule and hypothesized the additivity of individual contributions of the constituent groups. Group additivity has proven to be a useful assumption for quantifying the effects of a number of cosolvents on protein folding (28–30); however, other studies reported solvent effects on proteins for which group additivity is not valid (31–33). Hence, more research is needed to determine the range of cosolvents and the extent of conformational changes for which group additivity holds.
Solvent effects on protein (un)folding events have been directly observed from extended molecular-dynamics (MD) simulations (34–38), and solvent effects on the free energy landscape of peptides and mini-proteins have been quantified through advanced sampling techniques (39–41). These techniques have provided a wealth of information on the folding mechanisms of proteins, but they have also generated widely differing views on the molecular mechanisms by which cosolvents alter protein conformations. Perhaps the most prevalent reason for this disagreement is the difficulty of differentiating cause and effect between concurrent changes of protein structure and protein solvation. Another reason for the longstanding disagreement could be that these studies provide little or no insight into local solvent preferences at distinct loci of the protein surface.
An alternative approach for obtaining detailed molecular understanding of solvent effects on protein conformations is to quantify differences of ΓXP values between specific protein conformations. ΓXP of a specific protein conformation can be calculated from MD simulations of a single protein conformation with well-defined atom coordinates or an ensemble of protein conformations corresponding to a particular protein state (21,42–46). The solvent-induced equilibrium shift between two protein conformations can then be quantified from the difference of ΓXP values between the respective protein conformations (42,46). Because ΓXP values are calculated by summing up solvent molecules at all loci of the protein surface (21), this approach could, in principle, elucidate the role of individual solvent molecules and protein surface loci in solvent-induced conformational changes. However, even though previous studies have shown that ΓXP values calculated from MD simulations run for tens of nanoseconds are generally in good agreement with experiment (21,42–45), simulation times in this range turned out to be insufficient to quantify local preferential interaction coefficients (47).
Convergence of local preferential interaction coefficients from MD simulations is perceived as a formidable challenge because ΓXP is intrinsically “a fluctuation difference and is prone to large levels of noise” (48) and because “dissection of the influence of different protein group contributions is not straightforward” (49). Key issues that hinder convergence are limitations in computational power and a lack of insight into the pertinent timescales of protein solvation. To this day, these issues continue to curtail the potential of computational studies of solvent effects on biomolecules, and molecular insight has been mostly limited to inferences from nonconverged protein simulations and simulations of smaller model compounds such as amino acids and peptides (49–53).
In this study, key issues that have curtailed the potential of computational studies of solvent effects on proteins are addressed. We demonstrate that the protein solvation mosaic can be characterized over the entire protein surface from MD simulations that are sufficiently longer than the longest intrinsic solvation timescale, which is ∼10 ns. Local preferential interaction coefficients are quantified for all protein residues of two proteins (lysozyme and antibody fragment D1.3) in 4 M glycerol, and cosolvent effects on local protein conformations are predicted from changes in local preferential interactions between distinct protein side-chain conformations.
Methods
Molecular-dynamics simulations
All-atom molecular simulations are performed for two proteins, Hen Egg-white Lysozyme (HEL) and the antibody fragment Fv D1.3 (54), which are explicitly solvated in a box of water with 4 M glycerol (see Table S1 in the Supporting Material). The CHARMM22 parameter set (55) is used to model protein atoms, water is modeled by the TIP3-model (56), and force-field parameters for glycerol are based on the carbohydrate parameters developed by Ha et al. (57). Crystallographic structures for HEL and D1.3 are taken from PDB:1VFB (54) and the setup of each simulation is carried out with CHARMM version c32b2 (58). Simulations of the solvated proteins are run for 1.3 μs at constant pressure and temperature (1 atm and 298 K) with NAMD version 2.7 (59), as described previously in Vagenende et al. (47).
In a previous study, we pointed out that conformational changes of the backbone generally occur within nanoseconds along divergent trajectories and result in protein conformations with significantly different preferential interaction coefficients (42). Because this study aims to determine converged characteristics of the protein solvation mosaic, coordinates of the protein backbone are constrained with respect to the crystal structure. To investigate effects of side-chain motions, an additional simulation whereby all protein atom coordinates are constrained is performed for HEL (see Table S1). Unless indicated otherwise, the results in this study are based on the simulations with constrained backbone coordinates and free side-chain motions.
Computation of preferential interaction coefficients
The global preferential interaction coefficient ΓXP can be calculated from MD simulations by counting water and cosolvent molecules within a distance R from the protein surface (21,47):
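$$\Gamma_{XP}(R) = \left\langle n_{XP}(r<R) - n_{WP}(r<R)\,\frac{n_X - n_{XP}(r<R)}{n_W - n_{WP}(r<R)} \right\rangle_{\tau_{\mathrm{run}}} \qquad (1)$$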
In the above equation, brackets 〈⋅〉τrun refer to the time average over the entire simulation time τrun; nX and nW are the total number of cosolvent and water molecules in the simulation box; and nXP(r < R) and nWP(r < R) are the number of cosolvent and water molecules for which the center of mass falls within a radial distance R from the protein van der Waals surface. The global preferential interaction coefficient ΓXP(R) reaches a plateau from 5 Å onwards (see Fig. S1 in the Supporting Material). This indicates that preferential interactions of a protein in a mixture of water and glycerol are confined to the solvent region within 5 Å from the protein van der Waals surface.
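A minimal sketch of how Eq. 1 could be evaluated, assuming that per-frame counts of cosolvent and water molecules within R have already been extracted from a trajectory. The function name and the toy counts are illustrative and are not taken from the simulations described here.

```python
from statistics import fmean

def gamma_xp(n_xp_frames, n_wp_frames, n_x_total, n_w_total):
    """Time-averaged preferential interaction coefficient (Eq. 1).

    n_xp_frames, n_wp_frames: per-frame counts of cosolvent and water
    molecules within distance R of the protein surface.
    n_x_total, n_w_total: total cosolvent and water molecules in the box.
    """
    per_frame = []
    for n_xp, n_wp in zip(n_xp_frames, n_wp_frames):
        # Cosolvent-to-water ratio of the solvent outside R in this frame.
        bulk_ratio = (n_x_total - n_xp) / (n_w_total - n_wp)
        per_frame.append(n_xp - n_wp * bulk_ratio)
    return fmean(per_frame)

# Toy counts (hypothetical): three frames of a box with 400 glycerol and
# 4000 water molecules. The result is negative, i.e. preferential hydration.
example = gamma_xp([40, 42, 38], [500, 490, 510], 400, 4000)
```

When the local cosolvent-to-water ratio equals the bulk ratio the function returns zero (neutral solvation), and adding a single cosolvent molecule to the local region raises the result by slightly more than one unit.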
Residue-based preferential interaction coefficients ΓiXP are calculated by assigning each solvent molecule within 5 Å from the protein surface to its closest protein residue i (21):
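$$\Gamma^{i}_{XP} = \left\langle n^{i}_{XP}(r<5\,\mathrm{\AA}) - n^{i}_{WP}(r<5\,\mathrm{\AA})\,\frac{n_X - n_{XP}(r<5\,\mathrm{\AA})}{n_W - n_{WP}(r<5\,\mathrm{\AA})} \right\rangle_{\tau_{\mathrm{run}}} \qquad (2)$$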
In the above equation, niXP (r < 5 Å) and niWP (r < 5 Å) are the number of cosolvent and water molecules within 5 Å from the protein surface that are closer to residue i than to any other residue of the protein. Residues with, on average, less than 1 water molecule and less than 0.1 glycerol molecule are considered to have limited solvent accessibility. Because each solvent molecule is assigned to only one residue, residue-based preferential interaction coefficients are additive and the following equation is automatically met:
$$\Gamma_{XP} = \sum_{i} \Gamma^{i}_{XP} \qquad (3)$$
The additivity of residue-based preferential interaction coefficients implies that an independent change of ΓiXP for any of the protein residues results in an identical change of the global preferential interaction coefficient ΓXP, and therefore codetermines the effect of cosolvent on the chemical potential of the protein.
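The additivity property can be illustrated with a short sketch. It assumes that each solvent molecule within 5 Å has already been labeled with its type and its closest residue; the data layout, function names, and toy frames are hypothetical.

```python
from collections import defaultdict

def residue_gammas(frames, n_x_total, n_w_total):
    """Residue-based coefficients from per-frame solvent assignments.

    frames: list of frames; each frame lists one (kind, residue) pair per
    solvent molecule within 5 A of the protein, where kind is "X"
    (cosolvent) or "W" (water) and residue is the closest residue.
    """
    totals = defaultdict(float)
    for frame in frames:
        n_xp = sum(1 for kind, _ in frame if kind == "X")
        n_wp = len(frame) - n_xp
        bulk_ratio = (n_x_total - n_xp) / (n_w_total - n_wp)
        for kind, residue in frame:
            # Each molecule contributes to exactly one residue.
            totals[residue] += 1.0 if kind == "X" else -bulk_ratio
    return {res: total / len(frames) for res, total in totals.items()}

def global_gamma(frames, n_x_total, n_w_total):
    """Global coefficient computed directly from the same frames."""
    values = []
    for frame in frames:
        n_xp = sum(1 for kind, _ in frame if kind == "X")
        n_wp = len(frame) - n_xp
        values.append(n_xp - n_wp * (n_x_total - n_xp) / (n_w_total - n_wp))
    return sum(values) / len(values)

# Two toy frames (hypothetical assignments, not simulation data).
toy_frames = [
    [("X", "Lys13"), ("W", "Lys13"), ("W", "Asp18"), ("W", "Gly49")],
    [("W", "Lys13"), ("X", "Asp18"), ("X", "Gly49"), ("W", "Gly49")],
]
per_residue = residue_gammas(toy_frames, n_x_total=40, n_w_total=400)
```

Because the same bulk ratio is used for every residue within a frame, the residue values sum to the global value up to floating-point rounding.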
To quantify local solvation preferences of groups of residues, we calculate regional preferential interaction coefficients by assigning solvent molecules to a particular residue group if the residue that is closest to the solvent molecule belongs to that residue group. For a residue group consisting of residues [a, b,…], we get:
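$$\Gamma^{[a,b,\ldots]}_{XP} = \left\langle \sum_{i\in[a,b,\ldots]} n^{i}_{XP}(r<5\,\mathrm{\AA}) - \sum_{i\in[a,b,\ldots]} n^{i}_{WP}(r<5\,\mathrm{\AA})\,\frac{n_X - n_{XP}(r<5\,\mathrm{\AA})}{n_W - n_{WP}(r<5\,\mathrm{\AA})} \right\rangle_{\tau_{\mathrm{run}}} \qquad (4)$$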
The prime symbol (′) is used to refer to all protein residues excluding residues [a, b,…], and we get
$$\Gamma_{XP} = \Gamma^{[a,b,\ldots]}_{XP} + \Gamma^{[a,b,\ldots]'}_{XP} \qquad (5)$$
Although regional preferential interaction coefficients of a residue group are simply the sum of the residue-based preferential interaction coefficients of the residues of that group, they can be determined with higher precision because solvent fluctuations near neighboring residues are generally negatively correlated.
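The gain in precision follows from the variance of a sum, Var(a + b) = Var(a) + Var(b) + 2 Cov(a, b), which is smaller than the sum of the individual variances when the covariance is negative. The sketch below uses invented block values for two neighboring residues to show the effect; the numbers are not simulation results.

```python
from statistics import fmean, pvariance

def covariance(a, b):
    """Population covariance of two equally long series."""
    mean_a, mean_b = fmean(a), fmean(b)
    return fmean([(x - mean_a) * (y - mean_b) for x, y in zip(a, b)])

# Hypothetical block averages for two neighboring residues: when a solvent
# molecule moves from one residue to the other, one value rises while the
# other falls.
gamma_a = [0.30, 0.10, 0.25, 0.05, 0.20, 0.15]
gamma_b = [-0.20, 0.00, -0.10, 0.05, -0.05, -0.05]
gamma_group = [a + b for a, b in zip(gamma_a, gamma_b)]

var_if_uncorrelated = pvariance(gamma_a) + pvariance(gamma_b)
var_group = pvariance(gamma_group)
cov_ab = covariance(gamma_a, gamma_b)
```

For these toy series the covariance is negative, so the variance of the group value is smaller than the sum of the two residue variances.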
Statistical analysis
Statistical analysis of any property A is performed based on plots of p(τB) (47,60):
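$$p(\tau_B) = \frac{\tau_B}{\tau_{\mathrm{rec}}}\,\frac{\sigma^2\langle A\rangle_{\tau_B}}{\sigma^2\langle A\rangle_{\tau_{\mathrm{rec}}}} \qquad (6)$$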
In the above equation, τrec and τB are the trajectory recording period and the variable block time, respectively, and σ²〈A〉τrec and σ²〈A〉τB are the variances calculated for the respective times. The block time τB for which p(τB) reaches a plateau indicates the correlation time beyond which block time averages 〈A〉τB become statistically independent (60). If p(τB) remains at the plateau value for all block times, the standard error σ〈A〉 is estimated from the average plateau value. If this is not the case, an upper limit of σ〈A〉 is obtained from the average of σ〈A〉τB for the four largest block times after it has been verified that σ〈A〉τB continually decreases for increasing block times.
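A block-averaging analysis of this kind can be sketched as follows. The sketch assumes that p(τB) has the usual statistical-inefficiency form (block length in recording periods, times the variance of the block averages, divided by the variance of the recorded samples); the function names and the synthetic series are illustrative only.

```python
import random
from statistics import fmean, pvariance, variance

def block_averages(samples, block_len):
    """Averages over consecutive, non-overlapping blocks of block_len samples."""
    n_blocks = len(samples) // block_len
    return [fmean(samples[k * block_len:(k + 1) * block_len])
            for k in range(n_blocks)]

def p_of_block(samples, block_len):
    """Assumed form of p: block length (in recording periods) times the
    variance of the block averages, over the variance of the raw samples."""
    blocks = block_averages(samples, block_len)
    return block_len * pvariance(blocks) / pvariance(samples)

def standard_error(samples, block_len):
    """Standard error of the mean, treating the blocks as independent."""
    blocks = block_averages(samples, block_len)
    return (variance(blocks) / len(blocks)) ** 0.5

# Synthetic series (not simulation data): each random value is held for five
# recording periods, so samples are correlated over about five frames.
rng = random.Random(0)
series = [value for _ in range(2000) for value in [rng.gauss(0.0, 1.0)] * 5]
p_values = {b: p_of_block(series, b) for b in (1, 50, 100, 200)}
```

For this synthetic series p equals 1 at a block length of one sample and levels off near the built-in correlation length of five recording periods once blocks are much longer than that, which is the plateau behavior the analysis looks for.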
Local concentration maps and characteristic residence times
Local concentrations are calculated based on the solvent occupancy of a three-dimensional grid with grid size 1.5 Å and visualized with the software VMD 1.9 (61), as described previously in Vagenende et al. (47). Cutoff values are selected such that local concentration maps depict all solvent regions with local solvent concentrations greater than the respective bulk solvent concentrations cα, bulk.
Residence times of solvent molecules near the protein surface (r < 5 Å) and survival functions for glycerol and water, i.e., NXP(t) and NWP(t), are calculated as described previously (47). Characteristic residence times are obtained by fitting NXP(t) and NWP(t) to the following function:
$$N(t) = n_1\,e^{-t/\tau_1} + n_2\,e^{-t/\tau_2} + n_3\,e^{-t/\tau_3} + c \qquad (7)$$
Equation 7 allows for good fitting of NXP(t) and NWP(t) for both HEL and D1.3 (see Fig. S2).
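As an illustration, the fitted survival function can be evaluated from the parameters in Table 1. The sketch assumes that Eq. 7 is a sum of three exponential decays plus a constant, which is the form implied by the parameter names n1, τ1, n2, τ2, n3, τ3, and c; the function and variable names are hypothetical.

```python
from math import exp

def survival(t_ns, n1, tau1, n2, tau2, n3, tau3, c):
    """Assumed form of Eq. 7: three exponential decays plus a constant."""
    return (n1 * exp(-t_ns / tau1)
            + n2 * exp(-t_ns / tau2)
            + n3 * exp(-t_ns / tau3)
            + c)

# Glycerol near HEL with constrained backbone (values from Table 1).
HEL_GLYCEROL_BACKBONE = dict(n1=15.1, tau1=0.3, n2=66.1, tau2=1.3,
                             n3=2.3, tau3=10.3, c=0.9)

# At t = 0 the function equals the sum of the amplitudes plus the constant.
n_at_zero = survival(0.0, **HEL_GLYCEROL_BACKBONE)

# Share of the decaying part carried by the slowest component after 10 ns.
slow_term = HEL_GLYCEROL_BACKBONE["n3"] * exp(-10.0 / HEL_GLYCEROL_BACKBONE["tau3"])
decaying_part = survival(10.0, **HEL_GLYCEROL_BACKBONE) - HEL_GLYCEROL_BACKBONE["c"]
slow_share_at_10ns = slow_term / decaying_part
```

With these parameters the two faster components have almost fully decayed after 10 ns, so the remaining time dependence is dominated by the slowest component.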
To estimate the surface density of specific surface loci, protein solvent-accessible surface areas are calculated with VMD 1.9 (61) using a probe radius of 1.4 Å. The average solvent-accessible surface areas of HEL and D1.3 are 7248 Å² and 11,118 Å², respectively.
Results
Global protein solvation
Molecular-dynamics simulations of two proteins, HEL and D1.3, in a mixture of water and 4 M glycerol are run for 1.3 μs and subjected to extensive statistical analyses. For both proteins, the global preferential interaction coefficient ΓXP has a distinctive correlation time of ∼10 ns (Fig. 1). This time corresponds with the longest characteristic residence time of glycerol (Table 1). Glycerol molecules with characteristic residence times of ∼10 ns reside at specific protein surface loci that form multiple hydrogen bonds with glycerol (47), such as near HEL-residues Lys13 and Asp18 (Fig. 2). This surface locus is occupied by glycerol for approximately half of the time, and, within a time period of 200 ns, there are three instances whereby glycerol stays longer than 5 ns at this locus (Fig. 2 and see Movie S1 in the Supporting Material). Because each glycerol molecule increases ΓXP by nearly one unit (Eq. 1), temporal changes in glycerol occupancy of such loci cause proportional fluctuations of ΓXP on a timescale of ∼10 ns. We conclude therefore that the distinctive correlation time of ΓXP at ∼10 ns (Fig. 1) results from temporal changes of solvent occupancy of protein surface loci that form multiple hydrogen bonds with glycerol.
Table 1.
Solvent | Protein | Constraint | n1 [−] | τ1 [ns] | n2 [−] | τ2 [ns] | n3 [−] | τ3 [ns] | c [−] |
---|---|---|---|---|---|---|---|---|---|
Glycerol | D1.3 | Backbone | 21.5 | 0.3 | 91.7 | 1.4 | 10.3 | 8.6 | 0.8 |
Glycerol | HEL | Backbone | 15.1 | 0.3 | 66.1 | 1.3 | 2.3 | 10.3 | 0.9 |
Glycerol | HEL | All | 14.4 | 0.3 | 62.5 | 1.4 | 3.1 | 11.3 | 0.3 |
Water | D1.3 | Backbone | 1200.9 | 0.4 | 67.7 | 2.0 | 9.2 | 22.5 | 2.0 |
Water | HEL | Backbone | 831.6 | 0.3 | 16.0 | 2.0 | 3.4 | 17.1 | 2.4 |
Water | HEL | All | 781.5 | 0.3 | 30.6 | 2.0 | 5.3 | 8.7 | 3.5 |
Parameters obtained by fitting NXP(t) and NWP(t) according to Eq. 7.
If the slowest timescale of protein solvation is 10 ns, preferential interaction coefficients could be quantified with statistical significance based on MD simulations that are at least 10 times longer, i.e., 100 ns (44). Although this is the case for D1.3, block time averages of ΓXP for HEL are correlated over time intervals exceeding 100 ns (Fig. 1). Long correlation times of ΓXP for HEL are also reflected in the increasing number of glycerol molecules near the protein surface at simulation times >700 ns (see Table S2 and Fig. S3). Nevertheless, variances σ〈ΓXP〉τB for HEL continue to decrease for larger block times (see Fig. S4) and an upper limit of the standard deviation of ΓXP can be determined (see Table S1).
Local protein solvation
The protein solvation mosaic, which is the ensemble of preferentially solvated, preferentially hydrated, and neutral solvent regions at the protein surface, can be visualized by local concentration maps. In contrast to previous studies that used shorter simulations (47), local concentration maps converge within 100 ns trajectories (see Fig. S5). Local concentration maps from microsecond simulations reveal that most solvent regions near the protein surface of HEL are either preferentially hydrated or preferentially solvated, and very few solvent regions do not have any preference for either solvent (Fig. 3 A). The majority of the mapped solvent regions rapidly vanish for increasing cutoff values (Fig. 3, B and C), which indicates that solvent preferences of most protein surface loci are weak. A limited number of solvent regions have a significantly higher degree of preferential interactions (Fig. 3 B) and the highest degree of preferential interactions is found for glycerol in the catalytic binding pocket of HEL (Fig. 3 C). Similar observations are made for local concentration maps of D1.3 (see Fig. S6).
Local protein solvation is further characterized by quantifying local preferential interaction coefficients ΓiXP for all residues of HEL (Fig. 4 and see Fig. S7) and D1.3 (see Fig. S8). Local preferential interaction coefficients ΓiXP of solvent-accessible protein residues differ widely and range from significantly negative values (i.e., strong preferential hydration) to significantly positive values (i.e., strong preferential solvation by glycerol). For most protein residues, correlation times of ΓiXP are smaller than 100 ns, but for ∼10% of the residues, correlation times are significantly longer (Fig. 4 and see Fig. S8). Interestingly, slow convergence of ΓXP for HEL is caused by local solvation changes near three of the 13 residues with slow convergence, i.e., Gly49, Ile58, and Arg68 (Fig. 1). Moreover, slow local solvation changes disappear and ΓXP converges in 10 ns when protein side-chain motions are constrained (see Fig. S9). We conclude, therefore, that slow solvation changes affecting ΓXP of HEL on timescales exceeding 100 ns are related to protein side-chain motions near Gly49, Ile58, and Arg68.
Interdependence of protein solvation and side-chain motions
A single instance whereby side-chain motions affect protein solvation is observed near Gly49 and Arg68 of HEL. At the start of the simulation, the side chain of Arg68 quickly changes from conformation A with its guanidinium-group contacting the carboxyl group of Gly49 (Fig. 5 A) to conformation B with its guanidinium-group separated from Gly49 by Arg45 (Fig. 5 B). At 860 ns, Arg68 switches again from conformation B to conformation A. This conformational change coincides with an increase of ΓiXP for both Arg68 and Gly49 (Fig. 5 C). Changes in protein solvation near these residues are also evident from local concentration maps: when Arg68 adopts conformation A, solvent region 1 is preferentially hydrated and solvent region 2 is preferentially solvated by glycerol (Fig. 5 D), but when Arg68 adopts conformation B, both solvent regions 1 and 2 are preferentially hydrated near the protein surface and preferentially solvated by glycerol further from the protein surface (Fig. 5 E). Thus, local preferential interaction coefficients and concentration maps concordantly confirm that side-chain motions of Arg68 affect local protein solvation.
The sum of ΓiXP for Arg68 and Gly49 is significantly larger for Arg68 conformation A (0.17 ± 0.06) than for conformation B (−0.23 ± 0.07). Based on the thermodynamic principles of preferential interactions applied to aqueous glycerol solutions (42), a difference of ΓiXP between conformation A and B of 0.4 results in a glycerol-induced shift of the relative free energies of conformation A and B in favor of conformation A by ∼1 kJ/mol. This prediction is in good agreement with the range of free energy shifts between local protein conformations in aqueous glycerol solutions derived from hydrogen-exchange experiments (62).
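The magnitude of this estimate can be reproduced with a short calculation. The sketch assumes a simplified relation, ΔΔG ≈ −RT ΔΓ, which follows if the preferential interaction coefficient scales linearly with cosolvent concentration and the cosolvent activity is treated as ideal; the relation actually used is described in (42) and may differ. The error propagation additionally assumes that the two uncertainties are independent.

```python
from math import sqrt

R_KJ_PER_MOL_K = 8.314e-3   # gas constant in kJ/(mol K)
TEMPERATURE_K = 298.0       # simulation temperature

# Summed coefficients for Arg68 and Gly49 in the two Arg68 conformations.
gamma_a, err_a = 0.17, 0.06
gamma_b, err_b = -0.23, 0.07

delta_gamma = gamma_a - gamma_b
# Assumption: the two uncertainties are independent.
delta_gamma_err = sqrt(err_a ** 2 + err_b ** 2)

# Assumed simplified relation (see lead-in): ddG = -RT * dGamma.
# A negative value favors conformation A in the presence of glycerol.
shift_kj_per_mol = -R_KJ_PER_MOL_K * TEMPERATURE_K * delta_gamma
```

Under these assumptions the shift has a magnitude close to 1 kJ/mol in favor of conformation A, consistent with the estimate in the text.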
Similar to Gly49 and Arg68, local preferential interaction coefficients of 10 other residues of HEL have long correlation times (>100 ns) because of slow side-chain motions (Fig. 4). However, unlike Gly49 and Arg68, side-chain motions near these residues only affect the proximity to distinct solvent regions without affecting local protein solvation. The corresponding changes of ΓiXP values of these residues are consequently such that their overall contributions to the global preferential interaction coefficient ΓXP cancel out (Eq. 3). That most side-chain motions do not considerably affect protein solvation is also evidenced by the limited effects of constraining side-chain motions on (1) characteristic residence times of solvent at the protein surface (Table 1); (2) the global preferential interaction coefficient (see Table S1); and (3) local concentration maps (see Fig. S10).
Solvation of a protein-binding pocket
A unique case of local protein solvation is observed in the catalytic binding pocket of HEL near Ile58. When protein side chains are constrained, this binding pocket contains only water. However, for the simulation of HEL with unconstrained side-chain coordinates, a glycerol molecule enters the binding pocket at 80 ns and remains there for ∼500 ns before leaving the binding pocket (Fig. 6). Subsequently, several other glycerol molecules populate the binding pocket with residence times of ∼100 ns, and the binding pocket is occupied by glycerol for >90% of the simulation time (Fig. 6 C, and see Movie S2). Glycerol in the binding pocket adopts two major binding orientations: orientation A is such that its O-atoms are in contact with the protein and its C-atoms point toward the solvent (Fig. 6 A), whereas for orientation B the direction of O- and C-atoms is reversed (Fig. 6 B). The switch from glycerol orientation A to orientation B around the middle of the simulation coincides with the increase of residue-based preferential interaction coefficient ΓiXP for Ile58 (Fig. 6 C). This indicates that local solvation preferences of a protein can be affected by slow (>100 ns) orientational changes of cosolvent molecules at unique binding pockets.
Discussion
In this study we have characterized the protein solvation mosaic—i.e., the ensemble of preferentially hydrated, preferentially solvated, and neutral solvent regions—over the entire surface of a specific protein conformation in a mixed solvent. This was achieved by performing extended (1.3 μs) classical all-atom MD simulations of a protein in a mixed solvent with constrained protein coordinates. We found that correlation times of preferential interaction coefficients are governed by the longest characteristic residence time of glycerol of ∼10 ns, which arises from glycerol molecules forming multiple hydrogen bonds with specific protein surface loci. Solvation timescales of ∼10 ns were also reported for proteins in mixtures of water and other cosolvents such as urea (63) and arginine (43). This suggests that the slowest intrinsic timescale for protein solvation in mixed solvents is generally ∼10 ns. As a corollary, simulations need to be sufficiently longer than 10 ns to obtain multiple statistically independent block-time averages of ΓXP, and we find that a minimum simulation time of ∼100 ns is needed to quantify global and local preferential interactions of a specific protein conformation with statistical significance.
Interestingly, Ma et al. (51) found that the minimum simulation time to obtain convergence of ΓXP for a triglycine peptide in aqueous urea solutions was also ∼100 ns. These authors used multiple 2–3 ns simulations, whereas we used a single extended simulation per protein. Although both simulation schemes appear equally effective for quantifying local preferential interaction coefficients and equally efficient in their use of computational resources, multiple shorter simulations can arguably be completed in a shorter time. This apparent advantage loses much of its significance when one considers that a 100 ns simulation of a medium-size protein (∼25 kDa) in a solvent box only takes ∼5 days on a standard high-performance computing cluster with ∼100 CPUs. However, an important difference between both simulation schemes is that only continuous extended simulations allow characterizing residence of solvent molecules at the protein surface. This is a considerable advantage because characteristic residence times determine convergence times of preferential interactions and reveal pertinent protein solvation characteristics (42,47).
Specific protein surface loci forming multiple hydrogen bonds with the cosolvent have a surprisingly low affinity for cosolvent. For example, a specific protein surface locus that forms up to six hydrogen bonds with a single glycerol molecule is only occupied by glycerol for half of the time (Fig. 2). Assuming a Langmuir binding model, this surface locus binds glycerol with a dissociation constant Kd of ∼4 M. Thus, multiple hydrogen bonding at this surface locus results in specific glycerol orientations without pronounced increases of the binding affinity. This is in agreement with empirical observations that hydrogen bonds in molecular recognition processes primarily convey specificity rather than affinity (33). Taking into account the average number of glycerol molecules with long characteristic residence times (i.e., n3 in Table 1), the solvent-accessible surface areas of the simulated proteins, and an average locus occupancy by glycerol of 50%, we find that the surface density of specific loci that form multiple hydrogen bonds with glycerol is typically in the range 0.5–2.0 × 10⁻³ Å⁻². This corresponds with several specific surface loci for smaller proteins and tens of specific surface loci for larger proteins. Specific surface loci forming multiple hydrogen bonds with a cosolvent molecule will therefore significantly contribute to ΓXP, and local solvation preferences at such loci are expected to affect overall cosolvent effects on protein processes.
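The surface-density estimate can be checked with a short calculation. The sketch assumes that the number of specific loci equals the number of slowly exchanging glycerol molecules (n3 in Table 1, backbone-constrained simulations) divided by the average occupancy; the function name is hypothetical.

```python
def locus_surface_density(n_slow, occupancy, sasa_A2):
    """Specific loci per square angstrom.

    n_slow: average number of slowly exchanging cosolvent molecules (n3).
    occupancy: assumed average fraction of time a locus is occupied.
    sasa_A2: solvent-accessible surface area in square angstroms.
    """
    n_loci = n_slow / occupancy
    return n_loci / sasa_A2

# n3 values from Table 1 and surface areas from the Methods section.
density_hel = locus_surface_density(2.3, 0.5, 7248.0)
density_d13 = locus_surface_density(10.3, 0.5, 11118.0)
```

Both values fall within the quoted range, with HEL near the lower end and D1.3 near the upper end.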
Most protein surface loci have a similar affinity for glycerol as for water (Fig. 3). However, several protein surface loci, including loci forming multiple hydrogen bonds with glycerol, have a higher affinity for glycerol with dissociation constants Kd as low as 2 M. By far the greatest binding affinity for glycerol is found in the catalytic binding pocket of HEL, which is occupied by glycerol for ∼90% of the time (Fig. 6 C). The high occupancy of the binding pocket by glycerol, which is in good agreement with crystal structures of HEL that resolve glycerol in the binding pocket (64), corresponds with a dissociation constant Kd of ∼0.4 M. The clear distinction in timescales and binding affinities between the catalytic binding pocket of HEL and the rest of the protein surface prompts us to differentiate between general protein solvation, which is determined by low binding affinities (Kd > 2 M) and characteristic solvent residence times <20 ns, versus high-affinity protein solvation, which is determined by substantially higher binding affinities (Kd < 0.5 M) and residence times >100 ns.
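The dissociation constants quoted above follow from the Langmuir relation θ = c / (c + Kd), rearranged to Kd = c (1 − θ) / θ. The sketch below applies it to the occupancies reported for a glycerol concentration of 4 M; the function name is illustrative.

```python
def langmuir_kd(concentration_molar, occupancy):
    """Dissociation constant from fractional occupancy of a single site.

    Uses theta = c / (c + Kd), i.e. Kd = c * (1 - theta) / theta.
    """
    if not 0.0 < occupancy < 1.0:
        raise ValueError("occupancy must lie strictly between 0 and 1")
    return concentration_molar * (1.0 - occupancy) / occupancy

# Locus near Lys13 and Asp18: occupied about half of the time in 4 M glycerol.
kd_hbond_locus = langmuir_kd(4.0, 0.5)

# Catalytic binding pocket of HEL: occupied about 90% of the time.
kd_binding_pocket = langmuir_kd(4.0, 0.9)
```

An occupancy of 50% gives a Kd equal to the bulk concentration, and an occupancy of 90% gives a Kd roughly one order of magnitude lower.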
Despite the high binding-affinity of the catalytic binding pocket of HEL, glycerol only enters this pocket for the first time after 80 ns. This clearly demonstrates that shorter simulation times and simulations that fail to demonstrate convergence behavior of local preferential interactions may lead to erroneous conclusions on local solvation preferences of proteins. Hence, careful evaluation is needed of local protein solvation data from shorter simulations or simulations during which the protein undergoes large conformation changes. This is especially critical when using classical MD simulations to identify binding pockets and estimate the affinity of druglike molecules (65).
After entering the binding pocket, glycerol predominantly adopts two distinct binding orientations and repeatedly undergoes orientational changes on timescales on the order of ∼100 ns. Such slow orientational changes do not converge within microsecond classical molecular dynamics simulations, and other computational methods such as 3D-RISM based methods (66) could be more appropriate to characterize solvent orientations in high-affinity binding pockets. By constraining cosolvent molecules at high-affinity sites with respect to predominant binding orientations, local solvation preferences of the binding pocket could then be characterized by classical MD simulations. In this manner, the structural and dynamical properties of the protein solvation mosaic could be determined over the entire protein surface, even near high-affinity pockets. Alternatively, if the protein process of interest does not involve the binding pocket, high-affinity binding pockets could be excluded by considering regional preferential interaction coefficients of all protein residues outside of the binding pockets.
Local preferential interaction coefficients may differ for protein residues of the same amino-acid type. For example, Lys13 of HEL is preferentially solvated by glycerol (ΓiXP = 0.12 ± 0.04) whereas Lys116 of HEL is preferentially hydrated (ΓiXP = −0.05 ± 0.03) (Fig. 4). Local preferential interaction coefficients of these two residues differ because of adjacent residues: Lys13 and its adjacent residues cooperatively bind glycerol (Fig. 2), whereas protein residues near Lys116 do not cooperatively interact with the solvent. Cooperative interactions of adjacent protein groups with solvent molecules depend on the size and nature of the constituent groups that partition the protein surface. Although smaller constituent groups allow characterization of local protein solvation at higher resolution, local preferential interaction coefficients of smaller protein constituent groups become increasingly dependent on solvent interactions with adjacent groups and information on solvent interactions of individual constituent groups becomes more convoluted.
Results from our simulations of two proteins (lysozyme and antibody fragment D1.3) in 4 M glycerol agree well with available experimental data on (1) characteristic residence times of solvent molecules at the protein surface; (2) the high occupancy of the catalytic cleft of HEL by glycerol; and (3) the glycerol-induced shift of the free energy between local protein conformations. Previously, we also demonstrated quantitative agreement of ΓXP values for lysozyme in aqueous glycerol between simulation and experiment (42). This indicates that force fields used in this study are appropriate for characterizing local protein solvation in aqueous glycerol solutions.
Conclusions
We have shown that local solvation preferences can be quantified over the entire protein surface from extended MD simulations of specific protein conformations in mixed solvents. This has been achieved by combining analysis methods with gradually increasing resolution, starting with the statistical analysis of the global preferential interaction coefficient ΓXP and the identification of protein solvation timescales, followed by the characterization of local concentration maps and residue-based preferential interaction coefficients, and complemented with the inspection of the trajectories of specific solvent molecules and protein surface loci. This methodology allows quantifying the contribution of individual solvent molecules and protein residues to ΓXP, and therefore expands the scope of preferential interaction theory by directly linking atom-level details of local protein solvation with thermodynamic solvent effects on protein processes.
From the statistical analysis of solvent fluctuations at the protein surface, we found that the slowest intrinsic solvation timescale of a protein in a mixture of water and cosolvent is determined by long residence times of cosolvent molecules at specific protein surface loci that can form multiple hydrogen bonds with a cosolvent molecule. The slowest intrinsic solvation timescale is generally ∼10 ns, and simulations that quantify local solvation preferences of a constrained protein conformation need to be longer than 100 ns. Most protein side-chain motions do not significantly affect protein solvation, although slow (>100 ns) side-chain motions of a single protein residue can lead to significant changes of local solvent preferences. Another singular event that affects local solvation preferences is the slow reorientation of cosolvent molecules at protein surface loci with high cosolvent affinity, such as observed for glycerol in the catalytic binding pocket of lysozyme.
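The practical consequence of a ∼10 ns solvation timescale is that consecutive trajectory frames are strongly correlated, so a naive standard error understates the uncertainty of a time-averaged coefficient. One common way to account for this is block averaging, sketched below as an illustration only; it is not claimed to be the statistical procedure used in this study. The function name and the toy series are hypothetical, and the block length is assumed to be chosen longer than the slowest solvation timescale.

```python
from math import sqrt
from statistics import mean, stdev


def block_standard_error(series, block_len):
    """Standard error of the mean of a correlated series via block averaging.

    series: per-frame values of an observable (for example, a per-frame
        preferential interaction estimate).
    block_len: number of frames per block; assumed to span more than the
        slowest correlation time so that block means are nearly independent.
    """
    n_blocks = len(series) // block_len
    if n_blocks < 2:
        raise ValueError("need at least two full blocks")
    # Mean of each non-overlapping block; trailing frames are discarded.
    block_means = [
        mean(series[i * block_len:(i + 1) * block_len]) for i in range(n_blocks)
    ]
    return stdev(block_means) / sqrt(n_blocks)


# Toy series in which neighboring frames are correlated in pairs.
toy = [0, 0, 1, 1, 0, 0, 1, 1]
print(block_standard_error(toy, block_len=1))  # treats frames as independent
print(block_standard_error(toy, block_len=2))  # blocks match the correlation
```

For a trajectory much longer than the correlation time, the estimate typically rises with block length and then levels off; the plateau value is the one to report.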
Because convergence of local solvent preferences of proteins in mixed solvents can be achieved on standard high-performance clusters, computational characterization of local protein solvation emerges as an accessible high-resolution technique for studying solvent effects on proteins. Granted the availability of accurate force fields, this technique enables direct identification of physicochemical properties that determine solvent preferences without assuming group additivity. Even for a relatively simple cosolvent like glycerol, solvent preferences are remarkably heterogeneous at distinct protein residues. Detailed characterization of local protein solvation, therefore, appears indispensable to further understanding of the molecular mechanisms by which solvents affect protein structure.
Acknowledgments
This research was supported in part by the National Science Foundation through TeraGrid resources provided by the Texas Advanced Computing Center under grant No. TG-MCB100058, and by the Biomedical Research Council of A*STAR, Singapore.
References
- 1. Cohen F.E., Kelly J.W. Therapeutic approaches to protein-misfolding diseases. Nature. 2003;426:905–909. doi: 10.1038/nature02265.
- 2. Kamerzell T.J., Esfandiary R., Volkin D.B. Protein-excipient interactions: mechanisms and biophysical characterization applied to protein formulation development. Adv. Drug Deliv. Rev. 2011;63:1118–1159. doi: 10.1016/j.addr.2011.07.006.
- 3. Ohtake S., Kita Y., Arakawa T. Interactions of formulation excipients with proteins in solution and in the dried state. Adv. Drug Deliv. Rev. 2011;63:1053–1073. doi: 10.1016/j.addr.2011.06.011.
- 4. Timasheff S.N. Control of protein stability and reactions by weakly interacting cosolvents: the simplicity of the complicated. Adv. Protein Chem. 1998;51:355–432. doi: 10.1016/s0065-3233(08)60656-7.
- 5. López C.J., Fleissner M.R., Hubbell W.L. Osmolyte perturbation reveals conformational equilibria in spin-labeled proteins. Protein Sci. 2009;18:1637–1652. doi: 10.1002/pro.180.
- 6. Flores Jiménez R.H., Do Cao M.-A., Cafiso D.S. Osmolytes modulate conformational exchange in solvent-exposed regions of membrane proteins. Protein Sci. 2010;19:269–278. doi: 10.1002/pro.305.
- 7. Kirkwood J.G., Buff F.P. The statistical mechanical theory of solutions. I. J. Chem. Phys. 1951;19:774–777.
- 8. Casassa E.F., Eisenberg H. Thermodynamic analysis of multicomponent solutions. Adv. Protein Chem. 1964;19:287–395. doi: 10.1016/s0065-3233(08)60191-6.
- 9. Wyman J., Jr. Linked functions and reciprocal effect in hemoglobin: a second look. Adv. Protein Chem. 1964;19:223–286. doi: 10.1016/s0065-3233(08)60190-4.
- 10. Tanford C. Extension of the theory of linked functions to incorporate the effects of protein hydration. J. Mol. Biol. 1969;39:539–544. doi: 10.1016/0022-2836(69)90143-0.
- 11. Schellman J.A. Solvent denaturation. Biopolymers. 1978;17:1305–1322.
- 12. Smith P.E. Cosolvent interactions with biomolecules: relating computer simulation data to experimental thermodynamic data. J. Phys. Chem. B. 2004;108:18716–18724.
- 13. Shimizu S., Boon C.L. The Kirkwood-Buff theory and the effect of cosolvents on biochemical reactions. J. Chem. Phys. 2004;121:9147–9155. doi: 10.1063/1.1806402.
- 14. Schurr J.M., Rangel D.P., Aragon S.R. A contribution to the theory of preferential interaction coefficients. Biophys. J. 2005;89:2258–2276. doi: 10.1529/biophysj.104.057331.
- 15. Shulgin I.L., Ruckenstein E. A protein molecule in an aqueous mixed solvent: fluctuation theory outlook. J. Chem. Phys. 2005;123:054909. doi: 10.1063/1.2011388.
- 16. Shimizu S., Matubayasi N. Preferential hydration of proteins: a Kirkwood-Buff approach. Chem. Phys. Lett. 2006;420:518–522.
- 17. Smith P.E. Chemical potential derivatives and preferential interaction parameters in biological systems from Kirkwood-Buff theory. Biophys. J. 2006;91:849–856. doi: 10.1529/biophysj.105.078790.
- 18. Kang M., Smith P.E. Preferential interaction parameters in biological systems by Kirkwood-Buff theory and computer simulation. Fluid Phase Equilib. 2007;256:14–19.
- 19. Jiao Y., Smith P.E. Fluctuation theory of molecular association and conformational equilibria. J. Chem. Phys. 2011;135:014502. doi: 10.1063/1.3601342.
- 20. Record M.T., Jr., Zhang W.T., Anderson C.F. Analysis of effects of salts and uncharged solutes on protein and nucleic acid equilibria and processes: a practical guide to recognizing and interpreting polyelectrolyte effects, Hofmeister effects, and osmotic effects of salts. Adv. Protein Chem. 1998;51:281–353. doi: 10.1016/s0065-3233(08)60655-5.
- 21. Baynes B.M., Trout B.L. Proteins in mixed solvents: a molecular-level perspective. J. Phys. Chem. B. 2003;107:14058–14067.
- 22. Tanford C. Isothermal unfolding of globular proteins in aqueous urea solutions. J. Am. Chem. Soc. 1964;86:2050–2059.
- 23. Chen X., Sagle L.B., Cremer P.S. Urea orientation at protein surfaces. J. Am. Chem. Soc. 2007;129:15104–15105. doi: 10.1021/ja075034m.
- 24. Harries D., Rosgen J. A practical guide on how osmolytes modulate macromolecular properties. In: Biophysical Tools for Biologists, Vol. 1: In Vitro Techniques. Methods in Cell Biology Series, Vol. 84. Academic Press, Elsevier; London, Amsterdam, Oxford, Burlington, San Diego: 2008. pp. 679–735.
- 25. Hilser V.J. Structural biology: finding the wet spots. Nature. 2011;469:166–167. doi: 10.1038/469166a.
- 26. Courtenay E.S., Capp M.W., Record M.T., Jr. Vapor pressure osmometry studies of osmolyte-protein interactions: implications for the action of osmoprotectants in vivo and for the interpretation of “osmotic stress” experiments in vitro. Biochemistry. 2000;39:4455–4471. doi: 10.1021/bi992887l.
- 27. Schneider C.P., Trout B.L. Investigation of cosolute-protein preferential interaction coefficients: new insight into the mechanism by which arginine inhibits aggregation. J. Phys. Chem. B. 2009;113:2050–2058. doi: 10.1021/jp808042w.
- 28. Auton M., Bolen D.W. Predicting the energetics of osmolyte-induced protein folding/unfolding. Proc. Natl. Acad. Sci. USA. 2005;102:15065–15068. doi: 10.1073/pnas.0507053102.
- 29. O’Brien E.P., Ziv G., Thirumalai D. Effects of denaturants and osmolytes on proteins are accurately predicted by the molecular transfer model. Proc. Natl. Acad. Sci. USA. 2008;105:13403–13408. doi: 10.1073/pnas.0802113105.
- 30. Guinn E.J., Pegram L.M., Record M.T., Jr. Quantifying why urea is a protein denaturant, whereas glycine betaine is a protein stabilizer. Proc. Natl. Acad. Sci. USA. 2011;108:16932–16937. doi: 10.1073/pnas.1109372108.
- 31. Robinson D.R., Jencks W.P. Effect of compounds of urea-guanidinium class on activity coefficient of acetyltetraglycine ethyl ester and related compounds. J. Am. Chem. Soc. 1965;87:2462–2470. doi: 10.1021/ja01089a028.
- 32. Roseman M.A. Hydrophilicity of polar amino acid side-chains is markedly reduced by flanking peptide bonds. J. Mol. Biol. 1988;200:513–522. doi: 10.1016/0022-2836(88)90540-2.
- 33. Bissantz C., Kuhn B., Stahl M. A medicinal chemist’s guide to molecular interactions. J. Med. Chem. 2010;53:5061–5084. doi: 10.1021/jm100112j.
- 34. Bennion B.J., Daggett V. The molecular basis for the chemical denaturation of proteins by urea. Proc. Natl. Acad. Sci. USA. 2003;100:5142–5147. doi: 10.1073/pnas.0930122100.
- 35. Das A., Mukhopadhyay C. Atomistic mechanism of protein denaturation by urea. J. Phys. Chem. B. 2008;112:7903–7908. doi: 10.1021/jp800370e.
- 36. Stumpe M.C., Grubmüller H. Polar or apolar—the role of polarity for urea-induced protein denaturation. PLOS Comput. Biol. 2008;4:e1000221. doi: 10.1371/journal.pcbi.1000221.
- 37. Stumpe M.C., Grubmüller H. Urea impedes the hydrophobic collapse of partially unfolded proteins. Biophys. J. 2009;96:3744–3752. doi: 10.1016/j.bpj.2009.01.051.
- 38. Hua L., Zhou R.H., Berne B.J. Urea denaturation by stronger dispersion interactions with proteins than water implies a 2-stage unfolding. Proc. Natl. Acad. Sci. USA. 2008;105:16928–16933. doi: 10.1073/pnas.0808427105.
- 39. Canchi D.R., Paschek D., García A.E. Equilibrium study of protein denaturation by urea. J. Am. Chem. Soc. 2010;132:2338–2344. doi: 10.1021/ja909348c.
- 40. Saladino G., Pieraccini S., Sironi M. Metadynamics study of a β-hairpin stability in mixed solvents. J. Am. Chem. Soc. 2011;133:2897–2903. doi: 10.1021/ja105030m.
- 41. Berteotti A., Barducci A., Parrinello M. Effect of urea on the β-hairpin conformational ensemble and protein denaturation mechanism. J. Am. Chem. Soc. 2011;133:17200–17206. doi: 10.1021/ja202849a.
- 42. Vagenende V., Yap M.G.S., Trout B.L. Mechanisms of protein stabilization and prevention of protein aggregation by glycerol. Biochemistry. 2009;48:11084–11096. doi: 10.1021/bi900649t.
- 43. Shukla D., Trout B.L. Preferential interaction coefficients of proteins in aqueous arginine solutions and their molecular origins. J. Phys. Chem. B. 2011;115:1243–1253. doi: 10.1021/jp108586b.
- 44. Shukla D., Schneider C.P., Trout B.L. Complex interactions between molecular ions in solution and their effect on protein stability. J. Am. Chem. Soc. 2011;133:18713–18718. doi: 10.1021/ja205215t.
- 45. Schneider C.P., Shukla D., Trout B.L. Effects of solute-solute interactions on protein stability studied using various counterions and dendrimers. PLoS One. 2011;6:e27665. doi: 10.1371/journal.pone.0027665.
- 46. Canchi D.R., García A.E. Backbone and side-chain contributions in protein denaturation by urea. Biophys. J. 2011;100:1526–1533. doi: 10.1016/j.bpj.2011.01.028.
- 47. Vagenende V., Yap M.G.S., Trout B.L. Molecular anatomy of preferential interaction coefficients by elucidating protein solvation in mixed solvents: methodology and application for lysozyme in aqueous glycerol. J. Phys. Chem. B. 2009;113:11743–11753. doi: 10.1021/jp903413v.
- 48. Hu C.Y., Kokubo H., Pettitt B.M. Backbone additivity in the transfer model of protein solvation. Protein Sci. 2010;19:1011–1022. doi: 10.1002/pro.378.
- 49. Horinek D., Netz R.R. Can simulations quantitatively predict peptide transfer free energies to urea solutions? Thermodynamic concepts and force field limitations. J. Phys. Chem. A. 2011;115:6125–6136. doi: 10.1021/jp1110086.
- 50. Roccatano D. Computer simulations study of biomolecules in non-aqueous or cosolvent/water mixture solutions. Curr. Protein Pept. Sci. 2008;9:407–426. doi: 10.2174/138920308785132686.
- 51. Ma L., Pegram L., Cui Q. Preferential interactions between small solutes and the protein backbone: a computational analysis. Biochemistry. 2010;49:1954–1962. doi: 10.1021/bi9020082.
- 52. Kokubo H., Hu C.Y., Pettitt B.M. Peptide conformational preferences in osmolyte solutions: transfer free energies of decaalanine. J. Am. Chem. Soc. 2011;133:1849–1858. doi: 10.1021/ja1078128.
- 53. Stumpe M.C., Grubmüller H. Interaction of urea with amino acids: implications for urea-induced protein denaturation. J. Am. Chem. Soc. 2007;129:16126–16131. doi: 10.1021/ja076216j.
- 54. Bhat T.N., Bentley G.A., Poljak R.J. Bound water molecules and conformational stabilization help mediate an antigen-antibody association. Proc. Natl. Acad. Sci. USA. 1994;91:1089–1093. doi: 10.1073/pnas.91.3.1089.
- 55. MacKerell A.D., Bashford D., Karplus M. All-atom empirical potential for molecular modeling and dynamics studies of proteins. J. Phys. Chem. B. 1998;102:3586–3616. doi: 10.1021/jp973084f.
- 56. Jorgensen W.L., Chandrasekhar J., Klein M.L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 1983;79:926–935.
- 57. Ha S.N., Giammona A., Brady J.W. A revised potential-energy surface for molecular mechanics studies of carbohydrates. Carbohydr. Res. 1988;180:207–221. doi: 10.1016/0008-6215(88)80078-8.
- 58. Brooks B.R., Bruccoleri R.E., Karplus M. CHARMM—a program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem. 1983;4:187–217.
- 59. Phillips J.C., Braun R., Schulten K. Scalable molecular dynamics with NAMD. J. Comput. Chem. 2005;26:1781–1802. doi: 10.1002/jcc.20289.
- 60. Allen M.P., Tildesley D.J. Computer Simulation of Liquids. Clarendon Press; Oxford, UK: 1987.
- 61. Humphrey W., Dalke A., Schulten K. VMD: visual molecular dynamics. J. Mol. Graph. 1996;14:33–38, 27–28. doi: 10.1016/0263-7855(96)00018-5.
- 62. Gregory R.B. The influence of glycerol on hydrogen isotope exchange in lysozyme. Biopolymers. 1988;27:1699–1709. doi: 10.1002/bip.360271102.
- 63. Lindgren M., Sparrman T., Westlund P.-O. A combined molecular dynamic simulation and urea 14N NMR relaxation study of the urea-lysozyme system. Spectrochim. Acta A Mol. Biomol. Spectrosc. 2010;75:953–959. doi: 10.1016/j.saa.2009.11.054.
- 64. Ueno T., Abe S., Watanabe Y. Elucidation of metal-ion accumulation induced by hydrogen bonds on protein surfaces by using porous lysozyme crystals containing Rh(III) ions as the model surfaces. Chemistry. 2010;16:2730–2740. doi: 10.1002/chem.200903269.
- 65. Seco J., Luque F.J., Barril X. Binding site detection and druggability index from first principles. J. Med. Chem. 2009;52:2363–2371. doi: 10.1021/jm801385d.
- 66. Imai T., Oda K., Kidera A. Ligand mapping on protein surfaces by the 3D-RISM theory: toward computational fragment-based drug design. J. Am. Chem. Soc. 2009;131:12430–12440. doi: 10.1021/ja905029t.