Abstract
The 36 residue helical subdomain of the villin headpiece, HP36, is one of the smallest cooperatively folded proteins, folding on the microsecond timescale. The domain is an extraordinarily popular model system for both experimental and computational studies of protein folding. The structure of HP36 has been solved using X-ray crystallography and NMR spectroscopy, with the resulting structures exhibiting differences in helix packing, van der Waals contacts and hydrogen bonding. It is important to determine the solution structure of HP36 with as much accuracy as possible since this structure is widely used as a reference for simulations and experiments. We complement the existing data by using all-atom molecular dynamics simulations with explicit solvent to evaluate which of the experimental models is the better representation of HP36 in solution. After 50 ns of simulation initiated with the NMR structure, we observed that the protein spontaneously adopts structures with a backbone conformation, core packing and C-capping motif on the third helix that are more consistent with the crystal structure. We also examined hydrogen bonding and sidechain packing interactions between D44 and R55 and between F47 and R55 respectively, which were observed in the crystal structure but not present in the NMR-based solution structure. Simulations showed large fluctuations in the distance between D44 and R55, while the distance between F47 and R55 remained stable, suggesting the formation of a cation-pi interaction between those residues. Experimental double mutant cycles confirmed that the F47/R55 pair has a larger energetic coupling than the D44/R55 interaction. Overall, these combined experimental and computational studies show that the X-ray crystal structure is the better reference structure for HP36 in solution at neutral pH. Our analysis also shows how detailed molecular dynamics simulations can help bridge the gap between NMR and crystallographic methods.
The villin headpiece helical subdomain (HP36), the C-terminal portion of the villin headpiece, is the shortest naturally occurring sequence which has been shown to fold cooperatively (Figure 1). Infrared temperature jump(1), laser fluorescence(2, 3) and NMR lineshape analysis(4) techniques have measured the folding of HP36 to occur on the microsecond time scale. Its rapid folding, small size and simple topology of three helices have made this domain an extremely popular system for experimental(1-13) and computational(14-30) studies. Much of this work relies on using the folded structure as a reference and thus the accuracy of the known HP36 structure is of particular importance.
Several structures have been solved for HP36, one by NMR and the others by X-ray crystallographic methods(6, 7). These structures vary in the hydrophobic core packing, interhelical H-bonds and in the length of the helices. In addition, two potentially important sidechain contacts differ significantly between the NMR and X-ray structures: F47/R55 (4.3 Å (X-ray) and 6.3 Å (NMR)) and D44/R55 (2.7 Å (X-ray) and 7.9 Å (NMR)) (Figure 2a & b). In the X-ray structure, the F47/R55 pair forms a van der Waals contact which could be particularly stabilizing as a cation-pi interaction, while D44 and R55 form a hydrogen bond (D44-Oδ1 and R55-Nε). Neither contact is present in the NMR structure. These differences may arise from differences in the HP36 sequence used in the two sets of experiments, although this seems unlikely. The crystallographic study employed the N68H mutant of HP36, which also lacks the N-terminal methionine incorporated by the expression system used for the NMR study (note that we adopt the typical numbering convention(5, 6) for HP36, in which L42 follows the N-terminal methionine). Another possible reason for the structural differences could be the variation of experimental conditions such as pH or temperature. There was significant deviation in the pH between structural determinations; the NMR structure was solved at pH 3.7 in contrast to the more neutral conditions of the crystallography experiment (pH 6.7). An alternate explanation for the observed structural differences is that they arise from methodological limitations; these frequently give rise to differences in structures of the same protein solved using different techniques. In general, NMR structures are less precise than X-ray structures, particularly if only homonuclear methods are used. Nevertheless, X-ray structures can suffer from effects due to crystal packing; the resulting contacts may have a local influence on conformational preferences.
The small size of HP36 and its correspondingly large surface area to volume ratio could make crystal contacts play an important role. On the other hand, crystallographic data is often collected at low temperatures which might result in the dampening of thermal motions that are present under physiological conditions.
Many computational studies have used HP36 as a model system for development and validation of protein folding methods and for optimization of force field parameters(14-30). If the native reference structure is not correct, the basis of these studies may not be valid. For example, the structure of the first helix and the C-terminus vary in the ensemble of NMR structures(6, 27) and many MD studies have therefore neglected these regions of the experimental structure when evaluating their success. Nevertheless, most simulations are performed at neutral pH and thus it is not clear if the simulations should be compared to the NMR structure from pH 3.7. A better structural model for neutral conditions would be invaluable for further work in understanding the folding and stability of this important model system for protein folding.
Accurate computational studies can provide an alternate method to study conformational behavior and alleviate the uncertainty about which structure is the better representation of the folded state in solution. In principle, molecular dynamics (MD) simulations can supply detailed information with spatial and time resolution that exceed the ability of NMR and X-ray experiments, providing insight into the role of specific interactions that may not be readily accessible through experiments that probe averages over rapidly interconverting ensembles.
Here, we conducted all-atom MD simulations in explicit solvent using the NMR structure of HP36 in order to gain insight into the details of the folded state in solution. The simulation diverges from the initial NMR structure and spontaneously adopts a structure with much greater similarity to the X-ray structure, arguing that the X-ray structure is a more accurate representation of the structure in solution at neutral pH. In addition, two residue pairs, D44/R55 and F47/R55, spontaneously formed contacts during the simulation, with the F47/R55 pair appearing to be more stable. These interactions were reported in the crystal structure but were not present in the ensemble of structures generated by the NMR studies. Thus we conclude that the F47/R55 pair may play an important role in stabilizing HP36 in solution. We acknowledge that simulation models can be limited in accuracy and any predictions should be tested through direct experimentation. In order to validate our computational observations, we employed an experimental double mutant cycle analysis. The results are consistent with our simulation data, and suggest that the interaction between F47 and R55 plays a role in stabilizing the native state through a cation-pi interaction. Overall, the results show how properly validated MD simulations can provide an avenue to test the stability and validity of structural models that were derived from experimental data.
Materials and Methods
Computational
The numbering system corresponds to that used for the full length villin headpiece, with the sequence M41–F76 (MLSDEDFKAVFGMTRSAFANLPLWKQQNLKKEKGLF). HP36 has free N and C-termini that were modeled in the charged state. This sequence and termini correspond to those used in the experimental studies. All sidechains for Asp, Glu, Lys, and Arg were charged during the simulation. All calculations employed Amber version 8(31) and used the ff99SB modification(32) of the Amber ff99 force field(33, 34). SHAKE(35) was used to constrain bonds involving hydrogen. The time step was 2 fs. The temperature was maintained using the weak coupling algorithm(36) with a target temperature of 37 °C (310 K) and the pressure was equilibrated to 1 atm. All production simulations were performed using the NVT ensemble. An independent simulation using the NPT ensemble provided similar results (data not shown).
Solvation plays a key role in biomolecular structural preferences and thus accurate treatment of solvation is essential for the investigation of structural propensities in simulations. Explicit solvent models can be highly effective, particularly when water has non-bulk properties and interacts directly with the solute.(37) Implicit models such as the semi-analytical Generalized Born model (GB)(38) are attractive because they are computationally less expensive and can converge more rapidly than simulations in explicit water due to lack of solvent viscosity. While GB has been widely used for protein folding studies by a number of groups, other investigators have reported poor results including secondary structural bias and ion pairing issues.(39-41) Our previous studies on fragments of HP36 have shown that the use of explicit water produced results which were much more consistent with experimental trends than those obtained with implicit solvent(30). Consequently, we used explicit solvent in our simulations of HP36, in a truncated octahedral box using periodic boundary conditions with Particle Mesh Ewald (PME)(42) and a direct space cutoff of 8 Å. In order to investigate the influence of long-range periodicity, two additional simulations were run: one with the Isotropic Periodic Sum (IPS)(43) non-lattice method with a cutoff of 8 Å, and another with an atom-based nonbonded cutoff of 12 Å with no smoothing function. Simulations were initiated from the NMR structure (PDB ID 1VII) surrounded by 2327 TIP3P(44) water molecules and equilibrated at 310 K for 50 ps with harmonic restraints on solute atoms, followed by minimization with gradually reduced positional restraints and three 5 ps MD simulations with gradually reduced restraints. The production simulations of the NMR structure were 50 ns in length for each of two PME simulations, which used different random seeds for assignment of velocities, and 30 ns each for the IPS and cutoff simulations.
As a control, the X-ray structure (PDB ID 1YRF) was set up with the same number of water molecules and equilibrated in a similar fashion. This simulation was run for 30 ns using PME.
Data analysis
The last 5 ns of the simulation were used for cluster analysis and DSSP calculations. Cluster analysis was performed with Moil-view using all atoms as a similarity criterion with average linkage. Clusters were formed with the bottom-up approach using a similarity cutoff of 2.5 Å. DSSP analysis and calculation of distances, RMSD values, and radius of gyration were done using the ptraj module in Amber. Distances between sidechains were calculated using selected heavy atoms as indicated in the text. PMFs were calculated according to equation (3):
$$\Delta G_i = -RT \ln\left(\frac{N_i}{N_0}\right) \qquad (3)$$
where Ni is the population of a particular histogram bin along the reaction coordinates that were employed and N0 is the population of the most populated bin. Error bars were estimated for the PMF by averaging two independent simulations and subsequently subtracting the PMF of an individual simulation from the average PMF.
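The histogram-based PMF and its error estimate can be illustrated with a short sketch. It assumes the usual Boltzmann inversion, PMF_i = -RT ln(Ni/N0), at the simulation temperature of 310 K; the function names and the toy distances are hypothetical and are not taken from the original analysis, which was carried out with ptraj.

```python
import numpy as np

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def histogram_pmf(values, bin_edges, temperature=310.0):
    """Boltzmann-invert a histogram: PMF_i = -RT ln(N_i / N_0).

    N_0 is the count in the most populated bin, so the lowest PMF value is 0.
    Bins that were never sampled are reported as +inf.
    """
    counts, edges = np.histogram(values, bins=bin_edges)
    centers = 0.5 * (edges[:-1] + edges[1:])
    pmf = np.full(counts.shape, np.inf)
    occupied = counts > 0
    pmf[occupied] = -R_KCAL * temperature * np.log(counts[occupied] / counts.max())
    return centers, pmf


def pmf_with_error(pmf_a, pmf_b):
    """Average two independent PMFs; error = |average - individual run|.

    Assumes both runs sampled every bin (finite PMF values).
    """
    pmf_a = np.asarray(pmf_a, dtype=float)
    pmf_b = np.asarray(pmf_b, dtype=float)
    mean = 0.5 * (pmf_a + pmf_b)
    return mean, np.abs(mean - pmf_a)


# Toy data: hypothetical sidechain distances in Å, not from the trajectories.
toy = np.array([2.7] * 100 + [5.0] * 50)
centers, pmf = histogram_pmf(toy, bin_edges=[2.0, 4.0, 6.0])
print(centers, pmf)
```

With only two runs, the deviation of either run from the average is the same, so the error bar is symmetric by construction.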
Cloning, expression and purification
The plasmid (pET3a-NTL9-FXa-HP36) containing the gene for HP36 was prepared as described previously.(45, 46) The primers for generation of the mutants D44N, F47L and R55M were from Operon. The fusion proteins with coded amino acid substitutions were expressed and purified as described previously.(45) For the purification of the fusion proteins of F47L and F47LR55M, ethylenediaminetetraacetic acid and phenylmethylsulfonyl fluoride were added to a final concentration of 1 mM to prevent degradation by proteases during ion exchange chromatography. Cleavage of WT HP36, D44N, F47L, R55M, D44NR55M and F47LR55M from the fusion proteins was carried out by incubating with Factor Xa in buffer (50 mM Tris-HCl, 100 mM NaCl and 5 mM CaCl2, pH 8.0) for 16 hours. The conditions were as follows: WT HP36, D44N, R55M and D44NR55M, 2 units of Factor Xa at 23 °C; F47L, 1 unit of Factor Xa at 4 °C; F47LR55M, 0.5 units of Factor Xa at 4 °C. The solutions were purified by RP-HPLC as described. All proteins were more than 95% pure. The identities of the proteins were confirmed by electrospray ionization mass spectrometry (ESI MS). The expected and observed molecular weights were as follows: WT HP36, expected 4190.9, observed 4190.1; D44N, expected 4188.9, observed 4189.0; F47L, expected 4155.8, observed 4157.0; R55M, expected 4164.9, observed 4166.4; D44NR55M, expected 4163.9, observed 4164.3; F47LR55M, expected 4130.8, observed 4130.2.
Circular Dichroism (CD) Spectroscopy
All circular dichroism (CD) experiments were carried out on Aviv 62A DS and 202SF circular dichroism spectrophotometers. All samples for CD experiments were prepared in 10 mM sodium acetate, 150 mM sodium chloride buffer solution at pH 5.0. The protein concentrations were determined from absorbance measurements using the method of Pace & coworkers.(47) The absorbance was measured at 280 nm in 6 M guanidine hydrochloride, 20 mM sodium phosphate pH 6.5, with an extinction coefficient of 5690 M-1 cm-1. Thermal unfolding and urea denaturation experiments were carried out in a 1 cm quartz cuvette by monitoring the signal at 222 nm. Thermal unfolding experiments were performed from 2 °C to 98 °C with a 2 °C interval. The concentrations of urea were determined by measuring the refractive index. The concentration of urea was increased from 0 M to 10 M in steps of ∼0.25 M. Urea denaturation experiments were performed at 25 °C. Urea denaturation experiments in the osmolyte trimethylamine N-oxide (TMAO) for F47L and F47LR55M were also performed at 25 °C. The concentrations of TMAO were determined by measuring the refractive index.(48)
All thermal and urea denaturations were analyzed by a non-linear least squares curve fitting equation using SigmaPlot (SPSS Inc.):
(4) |
(5) |
The free energy of unfolding is assumed to be linearly proportional to the concentration of denaturant:
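$$\Delta G^{\circ}_{U}([\mathrm{urea}]) = \Delta G^{\circ}_{U}(\mathrm{H_2O}) + m\,[\mathrm{urea}] \qquad (6)$$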
where ΔG°U([urea]) is the apparent free energy for the native to denatured transition; ΔG°U(H2O) is the free energy of unfolding in the absence of denaturant; m is the slope of the linear dependence of ΔG°U([urea]) on the denaturant concentration, and the m-value provides quantitative information about the change in buried surface area upon unfolding; T is the temperature; R is the gas constant; an and bn represent the intercept and slope of the pre-transition; ad and bd represent the intercept and slope of the post-transition. The fraction of unfolded state is:
(7) |
(8) |
Results
Simulations of the NMR structure
Figure 3 shows the backbone RMSD versus time and RMSD distributions calculated during the last 5 ns for selected regions of HP36 during the simulation. The RMSD is shown relative to both the NMR and X-ray structures. At the end of the equilibration period, the backbone RMSD (residues L42-L75) to each experimental structure was ∼2.0 Å (Figure 3a). At 8 ns, a structural transition occurred, causing the overall backbone RMSD (X-ray) to drop 1.0 Å below the RMSD (NMR). This greater similarity to the X-ray structure persisted throughout the remainder of the simulation.
In Figure 3b, the RMSD relative to the X-ray structure of the region containing helix-3 (residues P62-F76) demonstrates even more clearly a switch during the simulation from similarity to the initial NMR structure to a greater similarity to the X-ray structure, as indicated by a reduction in the RMSD to the X-ray structure from 3 - 4 Å to 0.5 − 1.0 Å. Clearly, the simulation shows the inclination of HP36 to sample structures with a backbone similar to the X-ray structure despite being initiated with the NMR solution structure. The RMSD values for the two other helices remained stable and also showed a clear preference for the X-ray structure (Figure S1a and S1b).
In order to investigate the source of the large reduction in RMSD relative to the X-ray structure, a best fit alignment was performed on residues 61 to 74 to compare the differences before and after the structural transition. In Figure 4a, the NMR, X-ray and simulation structures are shown. The conformations of the C-terminus differ significantly between the X-ray and the NMR structure. The simulation structure spontaneously converts from the conformation in the NMR structure to that in the X-ray structure, concomitant with formation of three hydrogen bonds that stabilize the observed conformation. G74 forms a C-capping interaction with K70 and K71 at the end of helix-3, along with an additional hydrogen bond formed between K70 and L75. Figure 4b shows the time evolution of these hydrogen bond distances. At the beginning of the simulation, all three distances are 4 - 9 Å. At 8 ns, the distances are reduced to 2 - 3 Å, indicating formation of the hydrogen bonds that may play an important role in stabilizing the C-terminal helix. Importantly, all three hydrogen bonds are present in the X-ray structure but absent in the NMR structure (Figure 4a).
Dictionary of Secondary Structure of Proteins (DSSP)(49) analysis was employed to characterize the secondary structure in the simulation in order to facilitate comparisons with the X-ray and NMR structures (Figure 5). In the simulations, helix-1 spans the same 8 residues as found in the X-ray structure (D44 to F51), while the NMR structure contained only a five residue helix from D44-K48. Thus the simulation significantly extends the length of the first helix, in agreement with the X-ray structure. Overall, the locations of helices 2 and 3 in the sequence are similar in the NMR and X-ray structures, although helix-2 is one residue shorter in the NMR structure (residues R55 to F58 for the NMR vs. R55 to A59 for the X-ray). In the simulation, helix-2 appears consistent with both experimental structures; full α-helical content is sampled for residues 55 through 58, with partial helical content (∼50%) observed for A59. This may indicate that the C-terminus of the longer helix in the X-ray structure frays at the temperature of the NMR experiment. In both the NMR and X-ray structures, the α-helical content is the same for helix-3 (L63-K72). The simulations sample the same helix, with residue K73 sampling a partial population of helical structures. As noted above, the simulation spontaneously adopts a C-capping motif for this helix that is present in the X-ray structure. Overall, the α-helical content of the structures in the simulation is in much better agreement with the X-ray structure, particularly in helix-1.
All-atom cluster analysis was used to generate a representative simulation structure using the last 5 nanoseconds of the trajectory. This structure has backbone and all-atom RMSD values relative to the X-ray structure of 1.5 and 2.7 Å (residues 42 to 75), while the RMSD values relative to the initial NMR structure were higher (2.3 Å (backbone) and 3.3 Å (all-atom)). Figure 6 shows all three structures after best-fit of the backbone from residues 42 to 62 (helices 1 and 2). Notably, the X-ray and simulation structures have a very similar spatial arrangement of their phenylalanine cores. In contrast to the X-ray and simulation structures, the NMR structure has F51 shifted more into the core. Thus, the backbone and core of the protein in the solution simulation possess structural features that are much more similar to the X-ray structure despite being initiated from the NMR structure.
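The best-fit RMSD values quoted in this section can be illustrated with a minimal sketch of a superposition-based RMSD. It uses the Kabsch algorithm in NumPy, which is an assumption made for illustration; the values in the text were computed with the ptraj module in Amber, and the function name and random test coordinates below are hypothetical.

```python
import numpy as np


def best_fit_rmsd(mobile, reference):
    """RMSD between two (N, 3) coordinate sets after optimal superposition.

    Both sets are centered, the optimal rotation is found with the Kabsch
    algorithm (SVD of the covariance matrix), and the RMSD is computed over
    the same atoms that were used for the fit.
    """
    mobile = np.asarray(mobile, dtype=float)
    reference = np.asarray(reference, dtype=float)
    p = mobile - mobile.mean(axis=0)        # remove translation
    q = reference - reference.mean(axis=0)
    u, _, vt = np.linalg.svd(p.T @ q)       # covariance matrix H = P^T Q
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0.0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    fitted = p @ rotation.T                 # rotate every atom of the mobile set
    return float(np.sqrt(np.mean(np.sum((fitted - q) ** 2, axis=1))))


# Hypothetical coordinates: a rigidly rotated and shifted copy gives RMSD ~ 0.
rng = np.random.default_rng(0)
ref = rng.normal(size=(10, 3))
angle = 0.7
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle), np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
moved = ref @ rot_z.T + np.array([1.0, -2.0, 3.0])
print(best_fit_rmsd(moved, ref))
```

In practice the fit and the RMSD are often evaluated over different atom selections (for example, fitting on helices 1 and 2 and reporting the RMSD of the whole backbone); this sketch uses one selection for both.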
Structural similarities to the NMR Family
Given the diversity among the family of structures solved using the NMR data, it is reasonable to expect that some of them may be more similar than others to the X-ray structure. Figure S2 shows the backbone RMSD as compared to the X-ray, simulation and NMR average structures for each structure in the NMR family. Overall, the individual NMR structures are all more similar to the NMR average than to the X-ray structure (average RMSD values of 1.7 and 2.4 Å respectively). The RMSD values of the three individual helices demonstrate similar differences. However, some of the individual members of the NMR family are similar to the X-ray and simulation structures, especially in helix 1. According to DSSP, 7 out of the 29 members of the NMR family sample alpha helical conformations at V50 (data not shown), which is outside of the helical region in the average NMR structure. This suggests that extension of helix-1 beyond the range seen in the average structure remains consistent with the NMR family. However, the overall backbone of the X-ray and the simulation structure differs from all of the structures in the NMR family (Figure S2).
Specific sidechain interactions
There are several specific sidechain interactions which differ in the NMR and X-ray structures. In the X-ray structure, R55 forms a van der Waals interaction with F47 and an interhelical sidechain-sidechain hydrogen bond with D44 (D44-Oδ1 and R55-Nε); both interactions are absent in the NMR structure. In Figure 7a & b, the simulation structure was aligned with the X-ray structure to highlight the similarities in the interaction of those particular sidechains. Since the simulation structure is a single snapshot, we also investigated the behavior of these contacts as a function of time during the MD run, observing fluctuations in both cases (Figure 8). In both the X-ray and the simulation structure, the H-bond distance between D44 and R55 is 2.7 Å, in contrast with the much longer distance of 7.9 Å in the NMR structure. This specific contact also samples a range of distances from 6.7 Å to 11.6 Å in the family of NMR structures (Figure S3). During the simulation, this hydrogen bond is broken and re-formed multiple times, suggesting that a reasonable description of the equilibrium distance distribution has been sampled (Figure 8a). We used histogram analysis to calculate the potential of mean force (PMF) for the pair to quantify the stability of the contact in the native state. While two free energy minima are located at the hydrogen bonding distance, two other local minima at 5.0 and 7.0 Å have relative energies of less than 0.6 kcal/mol compared to the contact pair (Figure 8b). Thus breaking this contact is expected to be a readily accessible thermal fluctuation. The stability of the contact between F47 and R55 was evaluated by measuring the distance from the Cγ of F47 to the Nε of R55 (Figure 8c). This distance had comparable values in the simulation and X-ray structures (4.7 and 4.3 Å, respectively), while a much longer distance of 6.3 Å is observed in the average NMR structure.
Only 2 structures in the entire NMR family sample a contact distance of less than 5.5 Å (Figure S3). In contrast with the D44/R55 pair, the PMF for formation of the F47/R55 contact shows only a single minimum at 5.5 Å (Figure 8d). Overall, this suggests that R55 has a much more stable interaction with F47 than the salt bridge that it forms with D44.
Simulations of the X-ray structure
Figure S4 shows the backbone RMSD versus time and RMSD distributions calculated during the simulation starting from the X-ray structure. The RMSD is shown relative to the X-ray, NMR and simulation (from NMR) structures. After equilibration, the simulation samples backbone conformations (S43-L75) with an average RMSD relative to the X-ray structure of 1.5 Å and remains quite stable through the 30 ns duration. Overall, there is a preference to adopt structures comparable to the simulation structure discussed above rather than the NMR structure (the RMSD compared to the simulation-equilibrated NMR structure is 1.5 Å below the RMSD to the original NMR structure). Individual helices demonstrate comparable preferences for the X-ray and simulation structures (data not shown). Hence, the simulations starting from the NMR and X-ray structures both converge to a common simulation structure that is much closer to the X-ray structure than to the NMR structure.
Experimental investigation of the putative sidechain interactions
While simulations can provide a detailed view of molecular structure and dynamics, many approximations are involved, necessitating validation through experimentation. A set of single and double mutants was prepared in order to probe the putative sidechain interactions involving D44/R55 and F47/R55. D44 was mutated to Asn, F47 to Leu and R55 to Met. Thermal unfolding experiments were performed for wildtype HP36 (WT HP36) and for each of the mutants at pH 5.0 (Figure 9a, Table 1). WT HP36 has a transition midpoint (Tm) of 73.0 °C, while all the variants show a lower melting temperature. The Tm values of D44N, F47L, R55M, D44NR55M and F47LR55M are 57.8 °C, 45.6 °C, 67.3 °C, 55.4 °C and 35.3 °C respectively. From the thermal unfolding curves, at 25 °C, 22 % of the population of F47L and 40 % of the population of F47LR55M are unfolded.
Table 1.
protein | Tm (°C) | ΔH° (Tm) (kcal mol-1) | ΔG°U (H2O) (kcal mol-1) | m (kcal mol-1 M-1) |
---|---|---|---|---|
WT HP36 | 73.0 | 31.8 | 3.22 | -0.52 |
D44N | 57.8 | 32.1 | 2.48 | -0.55 |
F47L | 45.6 | 15.8 | 0.52 (a) | -0.45 (b) |
R55M | 67.3 | 26.3 | 2.19 | -0.43 |
D44NR55M | 55.4 | 27.4 | 1.74 | -0.44 |
F47LR55M | 35.3 | 9.8 | 0.19-0.28 (c) | N/A |
(a) ΔG°U (H2O) of F47L is extrapolated from urea denaturation in different TMAO concentrations.
(b) m is the average value of the m from urea denaturation in different TMAO concentrations.
(c) ΔG°U (H2O) of F47LR55M is calculated from the Gibbs-Helmholtz equation using ΔC°P values ranging from 0.30-0.70 kcal mol-1 K-1.
Urea denaturation experiments were also carried out in 10 mM sodium acetate and 150 mM sodium chloride at 25 °C to determine the free energy of unfolding. The estimated free energy for unfolding (ΔG°U) was 3.22 kcal mol-1 for WT HP36, 2.48 kcal mol-1 for D44N, 2.19 kcal mol-1 for R55M and 1.74 kcal mol-1 for D44NR55M (Figure 9b, Table 1). The F47L and F47LR55M mutants were so unstable that the native baseline was not observed (Figure 9b) and the unfolding free energy could not be accurately measured by urea denaturation. Thermal and urea denaturation experiments showed that F47L and F47LR55M are partially unfolded in the absence of urea at 25 °C. Previous studies have shown that TMAO can stabilize partially or completely unfolded proteins.(50) Therefore the combination of urea denaturation and TMAO stabilization can be utilized to estimate the stability of marginally stable proteins. In order to determine the unfolding free energy of F47L and F47LR55M, we performed urea denaturation experiments in increasing TMAO concentrations. For F47L, the titration curves show good pre- and post-transitions in different TMAO concentrations (Figure 10a). With increasing TMAO concentrations, the urea denaturation curves shifted to higher urea concentrations. The free energy of unfolding at each TMAO concentration was measured: ΔG°U ranges from 1.27 kcal mol-1 in 1.62 M TMAO to 1.67 kcal mol-1 in 2.50 M TMAO (Table S1). Mello and coworkers(50) have shown that the free energy of unfolding depends linearly on TMAO concentration. The extrapolated ΔG°U of F47L at 0 M TMAO was estimated to be 0.52 kcal mol-1 at 25°C (Figure 10b), which is in reasonable agreement with the value estimated from the thermal unfolding curve.
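The linear extrapolation to 0 M TMAO can be illustrated with a rough sketch. It uses only the two end-point values quoted in the text (1.27 kcal mol-1 at 1.62 M and 1.67 kcal mol-1 at 2.50 M TMAO); the full data set is in Table S1 and is not reproduced here, so this two-point line is an assumption made for illustration and will not exactly match the reported fit. The function name is hypothetical.

```python
import numpy as np


def extrapolate_to_zero_osmolyte(tmao_molar, dg_unfold):
    """Fit dG_U = slope * [TMAO] + intercept and return (slope, intercept).

    The intercept is the estimated unfolding free energy at 0 M TMAO.
    """
    slope, intercept = np.polyfit(tmao_molar, dg_unfold, deg=1)
    return float(slope), float(intercept)


# Only the two end points quoted in the text (kcal/mol vs. M TMAO).
tmao = [1.62, 2.50]
dg = [1.27, 1.67]
slope, dg_zero = extrapolate_to_zero_osmolyte(tmao, dg)
print(f"slope = {slope:.2f} kcal/mol/M, dG_U(0 M TMAO) = {dg_zero:.2f} kcal/mol")
```

The two-point line gives an intercept of about 0.53 kcal mol-1, close to the 0.52 kcal mol-1 obtained from the full titration series.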
Unfortunately the same strategy could not be applied to the F47LR55M double mutant. High TMAO concentrations are necessary to stabilize the protein to detect the pre-transition, but comparatively high urea concentrations are needed to observe the post-transition. Therefore it is very difficult to find conditions where full unfolding curves could be measured. Thus, we extrapolated from the thermal unfolding data using the Gibbs-Helmholtz equation:
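$$\Delta G^{\circ}_{U}(T) = \Delta H^{\circ}(T_m) - \frac{T\,\Delta H^{\circ}(T_m)}{T_m} - \Delta C^{\circ}_{P}\left[(T_m - T) + T\ln\left(\frac{T}{T_m}\right)\right] \qquad (1)$$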
This calculation requires knowledge of the heat capacity change, ΔC°P. HP36 is small, resulting in a very broad differential scanning calorimetry (DSC) transition, which makes it very difficult to determine the heat capacity change accurately by DSC. From the literature, the value of ΔC°P of unfolding is expected to be about 0.012 kcal mol-1 K-1 per residue of protein.(51) To a first approximation, the ΔC°P for HP36 can therefore be estimated as 36 × 0.012 ≈ 0.43 kcal mol-1 K-1. Another small 41-residue helical protein, the peripheral subunit-binding domain, has a ΔC°P value of 0.43 kcal mol-1 K-1 (52), suggesting that the estimate for HP36 is reasonable. In order to check whether the value of ΔC°P significantly affects the results, we used heat capacities ranging from 0.30 to 0.70 kcal mol-1 K-1 to calculate ΔG°U. F47LR55M has a measured Tm of 35.3 °C and ΔH°(Tm) of 9.5 kcal mol-1, and the resulting calculated ΔG°U of F47LR55M at 25 °C ranged from 0.19 to 0.28 kcal mol-1 depending on the value of ΔC°P used (Table S2). The value of ΔG°U estimated from the Gibbs-Helmholtz equation is in good agreement with the fraction unfolded determined directly from the fit to the thermal melt.
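As a rough numerical check of this extrapolation, the sketch below evaluates the standard Gibbs-Helmholtz expression for F47LR55M at 25 °C and converts the result into a two-state fraction unfolded. The form of the equation, the two-state assumption and the function names are assumptions made for illustration; the input uses the ΔH°(Tm) of 9.5 kcal mol-1 quoted in the text (Table 1 lists 9.8), so the output should only be expected to approximate the reported range.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def gibbs_helmholtz(temp_k, tm_k, dh_tm, dcp):
    """dG_U(T) = dH(Tm) * (1 - T/Tm) - dCp * [(Tm - T) + T ln(T/Tm)]."""
    return dh_tm * (1.0 - temp_k / tm_k) - dcp * (
        (tm_k - temp_k) + temp_k * math.log(temp_k / tm_k)
    )


def fraction_unfolded(dg_unfold, temp_k):
    """Two-state fraction unfolded, f_U = K / (1 + K) with K = exp(-dG/RT)."""
    k_unfold = math.exp(-dg_unfold / (R_KCAL * temp_k))
    return k_unfold / (1.0 + k_unfold)


T_25C = 298.15
TM_F47L_R55M = 35.3 + 273.15   # K, from the thermal melt
DH_TM = 9.5                    # kcal/mol, value quoted in the text

for dcp in (0.30, 0.43, 0.70):  # kcal mol^-1 K^-1, range explored in the text
    dg = gibbs_helmholtz(T_25C, TM_F47L_R55M, DH_TM, dcp)
    print(f"dCp = {dcp:.2f}: dG_U = {dg:.2f} kcal/mol, "
          f"f_U = {fraction_unfolded(dg, T_25C):.2f}")
```

With ΔC°P = 0.43 kcal mol-1 K-1 this gives a fraction unfolded of about 0.40 at 25 °C, in line with the 40 % read from the thermal unfolding curve.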
The coupling free energy between the D44 or F47 sidechain and the R55 sidechain, ΔΔG°coupling, was calculated using equation (2a), where ΔG°WT and ΔG°R55M are the free energies of unfolding for the wild type protein and the R55M single mutant; ΔG°Single represents the D44N or F47L single mutant; and ΔG°Double represents the D44NR55M or F47LR55M double mutant.
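$$\Delta\Delta G^{\circ}_{coupling} = \left(\Delta G^{\circ}_{WT} - \Delta G^{\circ}_{Single}\right) - \left(\Delta G^{\circ}_{R55M} - \Delta G^{\circ}_{Double}\right) \qquad (2a)$$
The relationship can be rearranged to a simpler form:
$$\Delta\Delta G^{\circ}_{coupling} = \Delta G^{\circ}_{WT} - \Delta G^{\circ}_{Single} - \Delta G^{\circ}_{R55M} + \Delta G^{\circ}_{Double} \qquad (2b)$$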
Using the ΔG°U values (Table 1) measured from experiments, the coupling free energy between the D44 sidechain and the R55 sidechain was close to zero (0.29±0.20 kcal mol-1). In contrast, the coupling free energy between the F47 sidechain and the R55 sidechain ranged from 0.70±0.20 to 0.79±0.20 kcal mol-1. The different estimates arise from using different ΔC°P values to calculate ΔG°Double. The analysis shows that there is a non-zero coupling between F47 and R55.
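The double mutant cycle arithmetic can be reproduced directly from the ΔG°U values in Table 1. The sketch below assumes the coupling energy is defined as ΔG°WT − ΔG°Single − ΔG°R55M + ΔG°Double, which is the form that recovers the values reported above; the function and variable names are hypothetical.

```python
def coupling_free_energy(dg_wt, dg_single, dg_r55m, dg_double):
    """Double mutant cycle: ddG = dG_WT - dG_Single - dG_R55M + dG_Double."""
    return dg_wt - dg_single - dg_r55m + dg_double


# Unfolding free energies from Table 1 (kcal/mol).
DG_WT, DG_R55M = 3.22, 2.19
DG_D44N, DG_D44N_R55M = 2.48, 1.74
DG_F47L = 0.52
DG_F47L_R55M_RANGE = (0.19, 0.28)  # depends on the dCp used in the extrapolation

d44_r55 = coupling_free_energy(DG_WT, DG_D44N, DG_R55M, DG_D44N_R55M)
f47_r55 = [coupling_free_energy(DG_WT, DG_F47L, DG_R55M, dg)
           for dg in DG_F47L_R55M_RANGE]

print(f"D44/R55 coupling: {d44_r55:.2f} kcal/mol")
print(f"F47/R55 coupling: {f47_r55[0]:.2f} to {f47_r55[1]:.2f} kcal/mol")
```

A coupling energy near zero means the effects of the two mutations are additive, that is, the two sidechains do not interact energetically in the folded state relative to the unfolded state.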
Discussion
The explicit water MD simulation starting from the NMR structure showed a clear preference to sample structures with much greater similarity to the X-ray structure, as indicated by RMSD values, DSSP analysis, packing of the phenylalanine core, formation of a C-capping motif on helix-3 and adoption of specific contacts between side chains. Double mutant cycle experiments were performed and demonstrated clear coupling between F47 and R55. It is apparent that these residues are not interacting in the NMR structure but appear to do so in the X-ray structure. Based on free energies calculated from MD simulations and obtained experimentally through double mutant cycles, the F47/R55 contact appears to be a stronger interaction than the proposed salt bridge between D44 and R55. Hence, the van der Waals interaction seen in the X-ray structure appears to play an important role in stabilizing the solution structure of HP36. The coupling free energy between the D44 sidechain and the R55 sidechain is small, only 0.29±0.20 kcal mol-1. The F47 sidechain to R55 sidechain coupling free energy is 0.70±0.20 to 0.79±0.20 kcal mol-1. These results are consistent with the simulation results showing that F47/R55 interacts strongly and that the stability of the D44/R55 pair is lower than the thermal energy.
Previous studies by Frank et al.(8) have shown the importance of each phenylalanine in stabilizing the core of the protein. Interestingly, the F47LR55M double mutant is even less stable than these single Phe mutants, which suggests that the sidechain of R55 also plays a key role in stabilizing the structure. It is likely that the optimum packing of the three phenylalanines in the core is enhanced by R55 because it helps to shield the core with its long sidechain and also forms a cation-pi interaction. Cation-pi interactions can be important for folding and thermostability of various proteins and protein ligand systems.(53-55) For the single mutant, R55M, the Tm dropped by 6 °C, showing that more than just a bulky sidechain is required at position 55. In the majority of villin sequences, Lys is found as a conservative substitution in place of R55.(56) This suggests that the charge is important for stabilizing the structure, but as the simulations and double mutant experiments indicate, the importance of this charge at position 55 does not arise from formation of an ion pair with D44 as observed in the crystal structure. It is worth noting, however, that in the X-ray structure D44 appears to be involved in a network of interactions including a hydrogen bond to the backbone carbonyl of L42. Backbone-sidechain interactions cannot be probed by double mutant cycle analysis.
These simulations are models, and as with any model there are limitations, especially in the interpretation of results. Realistic, detailed simulations come at a high computational cost that must often be balanced against the need for extensive conformational sampling. Computational models continue to improve; the Amber and CHARMm force fields have been used extensively enough to identify weaknesses (32, 57) such as overstabilization of secondary structure elements. In the simulations that we report here, this type of systematic error might contribute to the extension of alpha-helices that we observed, although we specifically addressed secondary structure bias in the development of the parameter set that was used for all of the present simulations.(32) It has also been noted that the use of PME to calculate long-range electrostatics imposes long-range periodicity that may result in artifacts from a crystal-like environment (58-60). In the present case, simulations with two alternate treatments of long-range interactions (including undesirable direct truncation) led to essentially the same conclusion: the simulations adopt a structure in better agreement with the crystal structure than with the NMR structure. Thus there is no evidence that the present results are an artifact of PME.
Previous work by van der Spoel and Lindahl (15) reported a series of simulations of the NMR structure of HP36. These authors noted a modest degree of sensitivity to force field, water model, and protonation state. In their simulations, they noted larger structural fluctuations in the region connecting helices 1 and 2 as compared to the rest of the molecule. This observation is consistent with our results, which indicate this linker as one region in which the simulations spontaneously adopt a conformation more consistent with the crystal structure. At that time, there was no way for van der Spoel and Lindahl to determine whether this larger fluctuation resulted from a conversion toward the crystal structure, which was reported two years later. Importantly, van der Spoel and Lindahl also noted the importance of taking into account the pH of the experiment when running simulations of HP36. Upon protonation of the glutamic acid side chains in the starting structure, the resulting simulation displayed a greater correlation to the chemical shift and J-coupling results, which were originally measured at a pH of 3.7. This further suggests that one must be cautious in the quantitative comparison of simulations at neutral pH to experimental data obtained at low pH.
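The sensitivity to pH can be illustrated with the Henderson-Hasselbalch relation. The sketch below assumes a model-compound pKa of about 4.25 for a glutamic acid side chain; this is a textbook value, not one measured or reported in this work, and pKa values in a folded protein can shift appreciably. The function and constant names are hypothetical.

```python
def fraction_protonated(ph, pka):
    """Fraction of an acidic group in its protonated (neutral) form,
    from the Henderson-Hasselbalch relation."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Assumed model-compound pKa for a Glu side chain (not from the paper).
GLU_PKA_ASSUMED = 4.25

for ph in (3.7, 7.0):
    frac = fraction_protonated(ph, GLU_PKA_ASSUMED)
    print(f"pH {ph}: about {frac:.1%} of Glu side chains protonated")
```

Under this assumption, most glutamic acid side chains are protonated at pH 3.7 and essentially none at pH 7, which is why the choice of protonation state matters when comparing simulations with data collected at low pH.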
In summary, the results from our simulations and experiments show that the recently published X-ray structure is a more accurate representation of the structure in solution at neutral pH than the NMR structure at low pH. Importantly, the simulations also indicated that a salt bridge between R55 and D44 observed in the low-temperature crystal structure was thermally unstable, in contrast to the stable interaction between R55 and F47 in the simulation. Experimental double mutant analysis confirmed that the interaction free energy of the salt bridge was small, and that the F47-R55 pair likely plays an important role in stabilizing the protein via a cation-pi interaction. The analysis presented here shows how the combination of molecular dynamics simulations and experimental measurements can be used to develop a better understanding of the structural properties of proteins in solution.
Supplementary Material
Acknowledgments
We gratefully acknowledge computer time at NCSA through MCA02N028 and financial support from the National Institutes of Health (NIH GM6167803 (CS) and GM54233 (DPR)). Additional computer time and assistance were provided by Roberto Gomperts and the SGI Engineering group. We thank C. James McKnight for helpful discussions.
Support for this project was provided by the National Institutes of Health (GM61678 to CLS), by the Donors of the Petroleum Research Fund of the American Chemical Society, ACS-PRF (DPR), and by the National Computational Science Alliance Grant MCA02N028 (C.S.), which provided resources at NCSA.
Abbreviations
- HP36
the villin headpiece helical subdomain
- DSSP
Dictionary of Secondary Structure of Proteins
- RMSD
Root Mean Squared Deviation
- GB
Generalized Born
- PME
Particle Mesh Ewald
- IPS
Isotropic Periodic Sum
- MD
Molecular Dynamics
- PMF
Potential of Mean Force
- D44N
the Asp44 to Asn mutant of WT HP36
- F47L
the Phe47 to Leu mutant of WT HP36
- R55M
the Arg55 to Met mutant of WT HP36
- D44NR55M
the Asp44 to Asn, Arg55 to Met double mutant of WT HP36
- F47LR55M
the Phe47 to Leu, Arg55 to Met double mutant of WT HP36
- RP-HPLC
reverse phase high performance liquid chromatography
- ESI-MS
electrospray ionization mass spectrometry
- CD
circular dichroism
- TMAO
trimethylamine N-oxide
- ΔG°U
the free energy of unfolding at 25°C
- ΔC°P
heat capacity change of protein unfolding
References
- 1. Brewer SH, Vu DM, Tang Y, Li Y, Franzen S, Raleigh DP, Dyer RB. Effect of modulating unfolded state structure on the folding kinetics of the villin headpiece subdomain. Proc Natl Acad Sci U S A. 2005;102:16662–16667. doi: 10.1073/pnas.0505432102.
- 2. Kubelka J, Eaton WA, Hofrichter J. Experimental tests of villin subdomain folding simulations. J Mol Biol. 2003;329:625–630. doi: 10.1016/s0022-2836(03)00519-9.
- 3. Buscaglia M, Kubelka J, Eaton WA, Hofrichter J. Determination of ultrafast protein folding rates from loop formation dynamics. J Mol Biol. 2005;347:657–664. doi: 10.1016/j.jmb.2005.01.057.
- 4. Wang M, Tang Y, Sato S, Vugmeyster L, McKnight CJ, Raleigh DP. Dynamic NMR line-shape analysis demonstrates that the villin headpiece subdomain folds on the microsecond time scale. J Am Chem Soc. 2003;125:6032–6033. doi: 10.1021/ja028752b.
- 5. McKnight CJ, Doering DS, Matsudaira PT, Kim PS. A thermostable 35-residue subdomain within villin headpiece. J Mol Biol. 1996;260:126–134. doi: 10.1006/jmbi.1996.0387.
- 6. McKnight CJ, Matsudaira PT, Kim PS. NMR structure of the 35-residue villin headpiece subdomain. Nat Struct Biol. 1997;4:180–184. doi: 10.1038/nsb0397-180.
- 7. Chiu TK, Kubelka J, Herbst-Irmer R, Eaton WA, Hofrichter J, Davies DR. High-resolution x-ray crystal structures of the villin headpiece subdomain, an ultrafast folding protein. Proc Natl Acad Sci U S A. 2005;102:7517–7522. doi: 10.1073/pnas.0502495102.
- 8. Frank BS, Vardar D, Buckley DA, McKnight CJ. The role of aromatic residues in the hydrophobic core of the villin headpiece subdomain. Protein Sci. 2002;11:680–687. doi: 10.1110/ps.22202.
- 9. Vugmeyster L, Trott O, McKnight CJ, Raleigh DP, Palmer AG., 3rd Temperature-dependent dynamics of the villin headpiece helical subdomain, an unusually small thermostable protein. J Mol Biol. 2002;320:841–854. doi: 10.1016/S0022-2836(02)00537-5.
- 10. Tang Y, Rigotti DJ, Fairman R, Raleigh DP. Peptide models provide evidence for significant structure in the denatured state of a rapidly folding protein: the villin headpiece subdomain. Biochemistry. 2004;43:3264–3272. doi: 10.1021/bi035652p.
- 11. Havlin RH, Tycko R. Probing site-specific conformational distributions in protein folding with solid-state NMR. Proc Natl Acad Sci U S A. 2005;102:3284–3289. doi: 10.1073/pnas.0406130102.
- 12. Kubelka J, Chiu TK, Davies DR, Eaton WA, Hofrichter J. Sub-microsecond protein folding. J Mol Biol. 2006;359:546–553. doi: 10.1016/j.jmb.2006.03.034.
- 13. Tang Y, Goger MJ, Raleigh DP. NMR characterization of a peptide model provides evidence for significant structure in the unfolded state of the villin headpiece helical subdomain. Biochemistry. 2006;45:6940–6946. doi: 10.1021/bi052484n.
- 14. De Mori GM, Colombo G, Micheletti C. Study of the Villin headpiece folding dynamics by combining coarse-grained Monte Carlo evolution and all-atom molecular dynamics. Proteins. 2005;58:459–471. doi: 10.1002/prot.20313.
- 15. van der Spoel D, Lindahl E. Brute-Force Molecular Dynamics Simulations of Villin Headpiece: Comparison with NMR Parameters. J Phys Chem B. 2003;107:11178–11187.
- 16. Duan Y, Kollman PA. Pathways to a protein folding intermediate observed in a 1-microsecond simulation in aqueous solution. Science. 1998;282:740–744. doi: 10.1126/science.282.5389.740.
- 17. Duan Y, Wang L, Kollman PA. The early stage of folding of villin headpiece subdomain observed in a 200-nanosecond fully solvated molecular dynamics simulation. Proc Natl Acad Sci U S A. 1998;95:9897–9902. doi: 10.1073/pnas.95.17.9897.
- 18. Shen MY, Freed KF. All-atom fast protein folding simulations: the villin headpiece. Proteins. 2002;49:439–445. doi: 10.1002/prot.10230.
- 19. Fernandez A, Shen MY, Colubri A, Sosnick TR, Berry RS, Freed KF. Large-scale context in protein folding: villin headpiece. Biochemistry. 2003;42:664–671. doi: 10.1021/bi026510i.
- 20. Zagrovic B, Snow CD, Shirts MR, Pande VS. Simulation of folding of a small alpha-helical protein in atomistic detail using worldwide-distributed computing. J Mol Biol. 2002;323:927–937. doi: 10.1016/s0022-2836(02)00997-x.
- 21. Sullivan DC, Kuntz ID. Conformation spaces of proteins. Proteins. 2001;42:495–511.
- 22. Sullivan DC, Kuntz ID. Protein Folding as Biased Conformational Diffusion. J Phys Chem B. 2002;106:3255–3262.
- 23. Islam SA, Karplus M, Weaver DL. Application of the diffusion-collision model to the folding of three-helix bundle proteins. J Mol Biol. 2002;318:199–215. doi: 10.1016/S0022-2836(02)00029-3.
- 24. Ripoll DR, Vila JA, Scheraga HA. Folding of the villin headpiece subdomain from random structures. Analysis of the charge distribution as a function of pH. J Mol Biol. 2004;339:915–925. doi: 10.1016/j.jmb.2004.04.002.
- 25. Bandyopadhyay S, Chakraborty S, Balasubramanian S, Bagchi B. Sensitivity of polar solvation dynamics to the secondary structures of aqueous proteins and the role of surface exposure of the probe. J Am Chem Soc. 2005;127:4071–4075. doi: 10.1021/ja042847r.
- 26. Herges T, Wenzel W. Free-energy landscape of the villin headpiece in an all-atom force field. Structure. 2005;13:661–668. doi: 10.1016/j.str.2005.01.018.
- 27. Jayachandran G, Vishal V, Pande VS. Using massively parallel simulation and Markovian models to study protein folding: examining the dynamics of the villin headpiece. J Chem Phys. 2006;124:164902. doi: 10.1063/1.2186317.
- 28. Trebst S, Troyer M, Hansmann UHE. Optimized parallel tempering simulations of proteins. The Journal of Chemical Physics. 2006;124:174903. doi: 10.1063/1.2186639.
- 29. Zagrovic B, van Gunsteren WF. Comparing atomistic simulation data with the NMR experiment: how much can NOEs actually tell us? Proteins. 2006;63:210–218. doi: 10.1002/prot.20872.
- 30. Wickstrom L, Okur A, Song K, Hornak V, Raleigh DP, Simmerling CL. The unfolded state of the villin headpiece helical subdomain: computational studies of the role of locally stabilized structure. J Mol Biol. 2006;360:1094–1107. doi: 10.1016/j.jmb.2006.04.070.
- 31. Case D, Cheatham T, III, Darden T, G, G, Luo R, Merz KM, Jr, Onufriev A, Simmerling C, Wang B, Woods RJ. The Amber biomolecular simulation programs. 2005;26:1668–1688. doi: 10.1002/jcc.20290.
- 32. Hornak V, Abel R, Okur A, Strockbine B, Roitberg A, Simmerling C. Comparison of multiple Amber force fields and development of improved protein backbone parameters. Proteins. 2006;65:712–725. doi: 10.1002/prot.21123.
- 33. Cornell WD, Cieplak P, Bayly CI, Gould IR, Merz KM, Jr, Ferguson DM, Spellmeyer DC, Fox T, Caldwell JW, Kollman PA. A second generation force field for the simulation of proteins, nucleic acids, and organic molecules. Journal of the American Chemical Society. 1995;117:5179–5197.
- 34. Wang J, Cieplak P, Kollman PA. How well does a restrained electrostatic potential (RESP) model perform in calculating conformational energies of organic and biological molecules? 2000;21:1049–1074.
- 35. Ryckaert JP, Ciccotti G, Berendsen HJC. Numerical integration of the cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. Journal of Computational Physics. 1977;23:327–341.
- 36. Berendsen HJC, Postma JPM, van Gunsteren WF, DiNola A, Haak JR. Molecular dynamics with coupling to an external bath. Journal of Chemical Physics. 1984;81:3684–3690.
- 37. Shirts MR, Pande VS. Solvation free energies of amino acid side chain analogs for common molecular mechanics water models. The Journal of Chemical Physics. 2005;122:134508. doi: 10.1063/1.1877132.
- 38. Still WC, Tempczyk A, Hawley RC, Hendrickson T. Semianalytical treatment of solvation for molecular mechanics and dynamics. Journal of the American Chemical Society. 1990;112:6127–6129.
- 39. Zhou R. Free energy landscape of protein folding in water: Explicit vs implicit solvent. Proteins: Structure, Function, and Genetics. 2003;53:148–161. doi: 10.1002/prot.10483.
- 40. Pitera JW, Swope W. Understanding folding and design: Replica-exchange simulations of “Trp-cage” miniproteins. PNAS. 2003;100:7587–7592. doi: 10.1073/pnas.1330954100.
- 41. Zhou R, Berne BJ. Can a continuum solvent model reproduce the free energy landscape of a beta -hairpin folding in water? PNAS. 2002;99:12777–12782. doi: 10.1073/pnas.142430099.
- 42. Darden T, York D, Pedersen L. Particle mesh Ewald: an N.log(N) method for Ewald sums in large systems. Journal of Chemical Physics. 1993;98:10089–10092.
- 43. Wu XW, Brooks BR. Isotropic periodic sum: A method for the calculation of long-range interactions. Journal of Chemical Physics. 2005;122:-. doi: 10.1063/1.1836733.
- 44. Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML. Comparison of simple potential functions for simulating liquid water. Journal of Chemical Physics. 1983;79:926–935.
- 45. Bi Y, Tang Y, Raleigh DP, Cho JH. Efficient high level expression of peptides and proteins as fusion proteins with the N-terminal domain of L9: application to the villin headpiece helical subdomain. Protein Expr Purif. 2006;47:234–240. doi: 10.1016/j.pep.2005.10.023.
- 46. Simmerling C, Elber R, Zhang J. MOIL-View - A Program for Visualization of Structure and Dynamics of Biomolecules and STO- A Program for Computing Stochastic Paths, in Modelling of Biomolecular Structure and Mechanisms. In: Pullman A, editor. Modeling of Biomolecular Structure and Mechanism. Kluwer; Netherlands: 1995. pp. 241–265.
- 47. Pace CN, Vajdos F, Fee L, Grimsley G, Gray T. How to measure and predict the molar absorption coefficient of a protein. Protein Sci. 1995;4:2411–2423. doi: 10.1002/pro.5560041120.
- 48. Wang A, Bolen DW. A naturally occurring protective system in urea-rich cells: mechanism of osmolyte protection of proteins against urea denaturation. Biochemistry. 1997;36:9101–9108. doi: 10.1021/bi970247h.
- 49. Kabsch W, Sander C. Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 1983;22:2577–2637. doi: 10.1002/bip.360221211.
- 50. Mello CC, Barrick D. Measuring the stability of partly folded proteins using TMAO. Protein Sci. 2003;12:1522–1529. doi: 10.1110/ps.0372903.
- 51. Fersht A. Structure and mechanism in protein science: a guide to enzyme catalysis and protein folding. W. H. Freeman and company; New York: 1999.
- 52. Spector S, Young P, Raleigh DP. Nativelike structure and stability in a truncation mutant of a protein minidomain: the peripheral subunit-binding domain. Biochemistry. 1999;38:4128–4136. doi: 10.1021/bi982915k.
- 53. Gallivan JP, Dougherty DA. Cation-pi interactions in structural biology. Proc Natl Acad Sci U S A. 1999;96:9459–9464. doi: 10.1073/pnas.96.17.9459.
- 54. Shi Z, Olson CA, Bell AJJ, Kallenbach NR. Stabilization of alpha-helix structure by polar side-chain interactions: Complex salt bridges, cation-pi interactions, and C-H…O H-bonds. Biopolymers. 2001;60:366–380. doi: 10.1002/1097-0282(2001)60:5<366::AID-BIP10177>3.0.CO;2-5.
- 55. Dougherty DA. Cation-pi interactions in chemistry and biology: a new view of benzene, Phe, Tyr, and Trp. Science. 1996;271:163–168. doi: 10.1126/science.271.5246.163.
- 56. Vermeulen W, Vanhaesebrouck P, Van Troys M, Verschueren M, Fant F, Goethals M, Ampe C, Martins JC, Borremans FAM. Solution structures of the C-terminal headpiece subdomains of human villin and advillin, evaluation of headpiece F-actin-binding requirements. 2004:1276–1287. doi: 10.1110/ps.03518104.
- 57. Okur A, Strockbine B, Hornak V, Simmerling C. Using PC clusters to evaluate the transferability of molecular mechanics force fields for proteins. Journal of Computational Chemistry. 2003;24:21–31. doi: 10.1002/jcc.10184.
- 58. Hunenberger PH, McCammon JA. Effect of artificial periodicity in simulations of biomolecules under Ewald boundary conditions: a continuum electrostatics study. Biophysical Chemistry. 1999;78:69–88. doi: 10.1016/s0301-4622(99)00007-1.
- 59. Hunenberger PH, McCammon JA. Ewald artifacts in computer simulations of ionic solvation and ion-ion interaction: A continuum electrostatics study. Journal of Chemical Physics. 1999;110:1856–1872.
- 60. Weber W, Hunenberger PH, McCammon JA. Molecular dynamics simulations of a polyalanine octapeptide under Ewald boundary conditions: Influence of artificial periodicity on peptide conformation. Journal of Physical Chemistry B. 2000;104:3668–3675.