Protein folding is evidently not a random process given the speed and reproducibility of folding in vivo;1 yet how a given polypeptide sequence translates into the globular structure of a fully folded protein remains unclear,2 particularly with respect to the role that water plays in this process.3 One frequently occurring folding pattern in proteins is the β-turn,4 where the amino acid sequences that give rise to these turns are thought to nucleate folding.5 The question remains, however, whether it is the mere presence of certain amino acids that initiates the formation of β-turns or whether water plays a fundamental role in this process.
The hydrophilic/hydrophobic nature of peptides and proteins in physiological solutions can be probed using neutron diffraction enhanced by isotopic substitution (NDIS). NDIS can directly address structural interactions between water and biomolecules in solution6–8—the physical milieu in which these life-giving molecules must operate.
The glycine-proline-glycine sequence in the peptide GPG-NH2 is known to occur in β-turns in proteins.9, 10 Its structure in aqueous solution (Figure 1) has been assessed using NDIS in concert with NMR spectroscopy and both molecular dynamics (MD) and empirical potential structural refinement (EPSR) simulations. This unique combination of techniques allows for structural interactions between GPG-NH2 and water to be investigated on the atomic scale (10⁻¹⁰ m, 1 Å), the scale of hydrogen-bonding interactions, yielding a full assessment of the role that water plays in peptide conformation in solution and, importantly, how this relates to peptide folding.
Radial distribution functions (g(r)s)—which show the average distances in solution—for water atoms (Hw/Ow) around the Gly1-Pro2 peptide bond oxygen (O1) and the Gly3-Cap4 peptide bond oxygen (O3) from EPSR and MD are shown in Figure 2 a. The EPSR simulation contained a mixture of cis and trans GPG-NH2 molecules in a ratio corresponding to that measured by 1H NMR spectroscopy and the MD g(r) functions are from two simulations—one which contained only cis peptides and one containing only trans peptides.
The reduction of intensity in the first peak of the gO1-Hw(r) compared to the gO3-Hw(r) and the average coordination from these peaks shows fewer Hw-O1 hydrogen bonds (1.1) than O3-Hw bonds (1.7). This indicates that the Gly1-Pro2 peptide bond oxygen is not fully hydrated with respect to both the Gly3-Cap4 peptide bond oxygen and to previous measurements of C=O hydration.8
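Coordination numbers such as those quoted above are conventionally obtained by integrating 4πρr²g(r) up to the first minimum of g(r). The following is a minimal Python sketch of that integration, not the analysis code used in this work; the function name, the structureless test g(r), and the density value are assumptions introduced for illustration.

```python
import math

def coordination_number(r, g, rho, r_max):
    """Integrate 4*pi*rho*r^2*g(r) from r[0] up to r_max (trapezoid rule).

    r     : ascending distances in angstrom
    g     : g(r) values tabulated at those distances
    rho   : number density of the neighbouring site (atoms per cubic angstrom)
    r_max : upper limit, usually the first minimum of g(r)
    """
    total = 0.0
    for i in range(1, len(r)):
        if r[i] > r_max:
            break
        f_lo = 4.0 * math.pi * rho * r[i - 1] ** 2 * g[i - 1]
        f_hi = 4.0 * math.pi * rho * r[i] ** 2 * g[i]
        total += 0.5 * (f_lo + f_hi) * (r[i] - r[i - 1])
    return total

# Sanity check with a structureless fluid (g = 1 everywhere): the result
# should approach the number of atoms in a sphere, (4/3)*pi*rho*r_max^3.
r = [0.01 * i for i in range(301)]   # 0 to 3 angstrom
g = [1.0] * len(r)
rho_hw = 0.0668                      # assumed: 2 H per water at ~0.0334 molecules per cubic angstrom
n = coordination_number(r, g, rho_hw, 2.5)
```

With a real g(r), the same call with `r_max` set at the first minimum gives the average number of neighbours in the first shell; the result is sensitive to where that minimum is placed.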
Spatial density functions,11 which show the most probable location of water molecules around O1 and O3, are shown in Figure 2 b and c. Here, water molecules around O1 are preferentially located directly above the C=O group, in a fairly tight distribution, indicating highly directed hydrogen bonding from water to this oxygen. In contrast, O3 shows a much broader distribution, similar to that seen for acetylcholine in aqueous solution where the C=O group in this neurotransmitter is highly accessible to the bulk water solvent.7
That there are more highly directed water molecules around O1 is further evident on comparison of the gO1-Ow(r) and gO3-Ow(r) functions in Figure 2 a; the gO1-Ow(r) shows a shallower minimum after the first peak compared to the gO3-Ow(r). This shallow minimum in the O1-Ow function gives rise to a small peak at about 3.5 Å in the MD simulations and a smaller peak at the same distance in the EPSR simulations, while O3-Ow shows the more usual effect of water around fully hydrated oxygen, with a single sharp peak at about 3 Å. Interestingly, the O1-Ow hydration in the MD simulations looks similar when the molecule is cis or trans, suggesting that the proline ring has a steric influence on the O1 hydration shell, regardless of the Gly1-Pro2 peptide bond conformation.
The 1H NMR spectrum of GPG-NH2 gives an 85:15 mixture of trans:cis conformations about the Gly1-Pro2 peptide bond, as expected for cationic proline-containing peptides in solution.12 Interestingly, the terminal hydrogen atoms on the -NH2 of GPG-NH2, Hn4, and Hn4′ shown in Figure 3 a, also show distinct peaks corresponding to cis and trans conformers, even though this group is separated from the Gly1-Pro2 peptide bond by seven bonds.
The region of the 1H–13C HSQC spectrum for GPG-NH2 glycine Hα–Cα correlations is shown in Figure 3 b. The peaks at about 43 ppm in the 13C dimension are from Gly1 and the peaks at about 45 ppm from Gly3. For Gly1, the single peak at 4.15 ppm in the 1H dimension is from the trans conformer; the two Hα are equivalent, indicating significant conformational averaging likely arising from nearly free rotation about the Cα-C=O bond (ψ angle) when GPG-NH2 is trans. In the cis conformer, two distinct Hα peaks are observed for Gly1 at 3.99 and 3.78 ppm in Figure 3 b, indicating restricted rotation about ψ. The opposite pattern is observed for Gly3: a single Hα peak is observed for Gly3 when Gly1-Pro2 is cis, while a pair of Hα peaks, at 3.97 and 3.93 ppm, is observed when Gly1-Pro2 is trans, indicating that the conformation of Gly3 is more constrained when this bond is trans.
This restricted rotation for Gly1 in cis GPG-NH2 molecules is likely the result of steric clash between the NH3+ terminus and the rest of the GPG-NH2 molecule. Gly3, on the other hand, experiences more restricted rotation when the Gly1-Pro2 bond is in its more dominant trans configuration, indicating that Gly3 shows preferred orientations in this conformation. This preferred orientation in GPG-NH2 must be due to an interaction with the rest of the molecule not present in the cis form. The observation of distinct peaks for Hn4/Hn4′ in the cis and trans conformers (Figure 3 a) is further evidence of a difference in the behavior of Gly3 in the cis and trans GPG-NH2 conformers in solution.
Figure 4 shows the average intra-peptide radial distribution function g(r) between O1 and the NH2 terminal hydrogen (Hn4) from EPSR and MD simulations. Trans GPG-NH2 shows three broad Hn4-O1 peak maxima at around 2, 4, and 6.5 Å in the MD simulations, whereas the cis GPG-NH2 molecules are fully extended in solution and show only a large broad peak at around 8 Å. The MD trans g(r) indicates that GPG-NH2 has some association between its Gly1 and Cap4 ends; EPSR shows shorter distances for the second and third of these peaks (at about 3.1 Å and 6 Å) and no peak at the shortest distance. The peak at 2 Å in the MD indicates a direct hydrogen bond between O1 and Hn4, whereas the peak at 3–4 Å is indicative of a more highly ordered interaction between Hn4 and O1 that does not arise from direct hydrogen bonding between the C=O and N-H groups.
The unique hydration structure seen in Figure 2 around O1, coupled with the distinct distances observed in Figure 4, is suggestive of a water-mediated hydrogen bonding motif between the NH2 and Gly1-Pro2 peptide bond oxygen. In the MD simulations, the reduction of hydration around O1 can be partially explained by the NH2 group directly bonding to O1, replacing some of the water molecules which would be present if the Gly1 C=O group were fully solvent accessible. However, this accounts for only 3 % of the molecules and there are no direct C=O⋅⋅⋅H–N interactions apparent in the EPSR. An alternative explanation to direct intra-peptide bonding is that the more highly ordered water molecules around O1 mediate the Cap4-Gly1 interactions in solution. The O1-Ow SDF in Figure 2 b is consistent with this view, as the nearest-neighbor water molecules are preferentially oriented directly above O1 compared with the more highly solvent accessible O3 oxygen. Bridging water molecules above the Gly1 peptide oxygen would almost certainly displace hydrating water molecules, giving rise to the unique O1-Ow hydration observed in Figure 2 a.
The coordination number of the first peak (at 3.1 Å) in the EPSR fits to the NDIS data indicates that roughly 17 % of the molecules are likely to be bridged by one water molecule forming an O1⋅⋅⋅Hw-Ow⋅⋅⋅Hn4 interaction. The trans MD molecules show roughly the same coordination at a somewhat larger distance, although the exact number of molecules which are bound in this manner is difficult to assess as the g(r) functions in Figure 4 also account for GPG-NH2 conformations that may not contain mediating waters between O1 and Hn4.
Although at first glance the MD and EPSR intra-peptide O1-Hn4 distances indicative of O1⋅⋅⋅Hw-Ow⋅⋅⋅Hn4 interactions appear remarkably different at 4.0 Å and 3.1 Å, respectively, both of these distances lead to fairly similar water-mediated molecular conformations of GPG-NH2 as shown in Figure 4 b. In this figure, the O1-Ow and O1-Hn4 distances were set to the values of the peak maxima in Figures 2 and 4 and the Hn4-Ow distances to 1.9 Å for EPSR and 2.0 Å for MD, the values of the first peak maxima in the gHn4-Ow(r)s. The differences in water orientation between the single water-mediated O1⋅⋅⋅Hw-Ow⋅⋅⋅Hn4 interactions in Figure 4 b may be indicative of slightly different energetic configurations of these interactions from MD versus EPSR simulations. Interestingly, the average O1⋅⋅⋅Ow⋅⋅⋅Hn4 angle is 82° for EPSR simulations and 110° for MD simulations. By comparison, the value for pure water is 86° (Ow⋅⋅⋅Ow⋅⋅⋅Hw; see the Supporting Information) when considering only hydrogen-bonding interactions between molecules. It should be noted that it is possible that the peptides themselves will adopt slightly different conformations in solution to compensate for these potentially higher-energy water-mediated configurations, thus leading to small changes in the overall energy of the system. It should also be noted that MD shows similar configurations to EPSR in solution, as there is still an appreciable amount of density in the MD intra-peptide g(r) at 3.1 Å.
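The relation between the quoted distances and the quoted O1⋅⋅⋅Ow⋅⋅⋅Hn4 angles can be illustrated with the law of cosines. The sketch below is an illustration only, not the procedure used to obtain the averages above: it takes the Hn4-Ow and O1-Hn4 peak positions from the text and assumes an O1-Ow distance of 2.78 Å for both models (the MD average reported for single-water-mediated bonds; applying it to EPSR as well is an assumption). The function name is hypothetical.

```python
import math

def bridge_angle(d_o1_ow, d_ow_hn4, d_o1_hn4):
    """Angle at Ow (degrees) in the O1...Ow...Hn4 triangle, via the law of cosines."""
    cos_theta = (d_o1_ow ** 2 + d_ow_hn4 ** 2 - d_o1_hn4 ** 2) / (2.0 * d_o1_ow * d_ow_hn4)
    return math.degrees(math.acos(cos_theta))

D_O1_OW = 2.78  # angstrom; assumed to hold for both EPSR and MD

angle_epsr = bridge_angle(D_O1_OW, 1.9, 3.1)  # Hn4-Ow and O1-Hn4 peak positions, EPSR
angle_md = bridge_angle(D_O1_OW, 2.0, 4.0)    # same quantities, MD
print(f"EPSR: {angle_epsr:.1f} deg, MD: {angle_md:.1f} deg")
```

Under these assumptions the peak-position geometry gives angles within a few degrees of the reported ensemble averages, which is as much agreement as can be expected when an angle built from peak maxima is compared with an average over configurations.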
Figure 5 shows the probability distribution from the trans-GPG-NH2 of directly hydrogen-bonded (C=O⋅⋅⋅H–N) molecules compared with those bonded by water-mediated interactions through one or two water molecules from the MD simulations, normalized to the total number of GPG-NH2 molecules in solution. To generate these distributions, only the peptides which have strongly correlated interactions bound directly to water molecules were included; O1-Hn4 distances were discounted if there were no water molecules bound through both O1-Hw and Ow-Hn4 interactions. Roughly 16 % of the molecules are either directly bound or bound by one or two water-mediated hydrogen bonds, where the average O1-Ow distance is 2.78 Å for Hn4-O1 single-water-mediated bonds, consistent with the distances observed in the gO1-Ow(r) in Figure 2 a.
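The kind of selection described above can be sketched as a simple geometric classification of a single peptide snapshot. This is a hypothetical illustration, not the criterion used for Figure 5: the 2.5 Å H⋅⋅⋅O cutoff, the data layout, and the function name are all assumptions, periodic boundaries are ignored, and only the direct and single-water cases are covered.

```python
import math

H_BOND_CUTOFF = 2.5  # angstrom, H...O distance; an assumed geometric criterion

def classify_contact(o1, hn4, waters, cutoff=H_BOND_CUTOFF):
    """Label one peptide snapshot as 'direct', 'one-water' or 'none'.

    o1, hn4 : (x, y, z) positions of the Gly1 carbonyl oxygen and an NH2 hydrogen
    waters  : list of dicts with keys 'Ow', 'Hw1', 'Hw2', each an (x, y, z) tuple
    """
    # Direct C=O...H-N contact
    if math.dist(o1, hn4) <= cutoff:
        return "direct"
    # One bridging water: it donates a hydrogen to O1 and accepts Hn4 at Ow
    for w in waters:
        donates_to_o1 = min(math.dist(w["Hw1"], o1), math.dist(w["Hw2"], o1)) <= cutoff
        accepts_from_hn4 = math.dist(w["Ow"], hn4) <= cutoff
        if donates_to_o1 and accepts_from_hn4:
            return "one-water"
    return "none"

# Toy configuration with made-up coordinates (angstrom)
bridge = {"Ow": (2.8, 0.0, 0.0), "Hw1": (1.85, 0.0, 0.0), "Hw2": (3.1, 0.9, 0.0)}
label = classify_contact(o1=(0.0, 0.0, 0.0), hn4=(3.0, 1.9, 0.0), waters=[bridge])
```

Counting the labels over all peptides and frames, and dividing by the total number of peptides, would give fractions comparable in spirit to the percentages quoted above.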
Many theories of how proteins fold are centered around the “hydrophobic effect” where the expulsion of water from hydrophobic amino acid side chains is thought to drive a structural collapse leading to a fully folded functional protein in vivo.13 Even though this effect is often cited as the dominant force in protein folding and association, this prevailing view has been recently challenged.6, 14 Previous work has also suggested that hydrophilic interactions could, in fact, be more important to the folding process than hydrophobic ones.15
It is certainly true that water should play some role as proteins fold in aqueous solutions. In this work all of the experimental and computational methods used indicate that water acts as a guide or mediator for the nucleation of folding by virtue of hydrogen-bonding interactions rather than only by a process of “de-wetting” or hydrophobic elimination, which has been observed in simulations of a β-hairpin-forming peptide.16 When GPG-NH2 is trans, a water-mediated hydrogen bond between the Gly1 O1 and Cap4 Hn4 appears to act as a nucleation point for folding before, perhaps, being finally eliminated in the fully folded protein. This indicates that water may play a dual role in β-turn formation, where perhaps water nucleates or initiates folding by mediating the formation of an i+3 hydrogen bond. This electrostatic water bridge would allow the hydrophobic amino acid side chains to be in close enough proximity to one another for the subsequent hydrophobic collapse to occur, leading ultimately to the formation of a functional, globular protein.
In both EPSR and MD, trans GPG-NH2 molecules are for the most part at least partially folded as the intra-peptide distances in Figure 4 suggest, as opposed to the cis conformers that are fully extended in solution. The contribution of water to the initiation of folding is also reflected in the unique hydration structure around the C=O (O1) oxygen (Figure 2). If this C=O oxygen were fully solvent accessible, a similar hydration to the O3 C=O oxygen would be expected. The only other factor which might lead to this unique hydration would be large-scale association between separate GPG-NH2 molecules. However, large-scale aggregation was excluded by small-angle neutron scattering (SANS) measurements and aggregation was not evident in the simulations (see the Supporting Information). Importantly, a number of the GPG-NH2 intra-peptide contacts in these partially folded states appear to be mediated by water molecules, which perhaps provide the impetus for a nucleation mechanism of folding.
Anfinsen hypothesized that certain “portions of a protein chain that can serve as nucleation sites for folding will be those that can ‘flicker’ in and out of the conformation that they occupy in the final protein.”17 That these peptides are “flickering” in and out of a suitable conformation is evident as the average structure in solution presented here only shows a relatively small proportion of peptides being bound by water mediation at any given time. In much the same way as GPG-NH2 is in solution, the bridging water molecules will likely also “flicker” in and out of the exact configuration which leads to a mediating hydrogen bond joining the C=O⋅⋅⋅H–N groups of the peptide.
That C=O⋅⋅⋅H–N bonds are necessary for the formation of a β-turn was first identified by crystallographic techniques.18 However, as more crystallographic data appeared in the literature for a variety of peptides and proteins which contain turn motifs,19 this hydrogen bonding interaction was often discounted, as the distances or alignment between the C=O and H–N groups were deemed unsuitable for hydrogen bond formation to occur.10, 20 In light of the data presented here, an alternative explanation may be that the C=O⋅⋅⋅H–N contacts are stabilized by water-mediated hydrogen bonds, as a large number of β-turns in proteins are located on the protein surface,21 leaving them exposed to the surrounding water solvent.
For GPG-NH2 in water, hydrogen-bonding interactions appear to be the primary driving force in inducing this common β-turn sequence to fold. It is highly likely that hydrophilic forces are just as important in driving protein folding as the hydrophobic effect in solution, especially for the initiation of this process in vivo.
Experimental Section
Glycyl-L-prolyl-glycinamide⋅HCl (GPG-NH2⋅HCl) was purchased from Bachem (Bubendorf, CH) and was used without further purification. Details of the sample preparation for NDIS and NMR are found in the Supporting Information. NMR measurements were performed on 500 and 750 MHz spectrometers (Oxford) controlled by GE/Omega software and equipped with a home-built triple-resonance pulsed-field-gradient probe head.
NDIS measurements were performed on 1 m GPG-NH2⋅HCl solutions using the SANDALS instrument at the ISIS Facility (STFC, UK). The EPSR22 modeling boxes contained 20 GPG-NH3+ ions, 20 Cl− ions, and 1160 water molecules and the “seed” potentials were modified from the MD potentials. The peptide bonds were constrained to be planar and the cis/trans ratio was fixed to the NMR value. The MD cis and trans simulations each contained 64 GPG-NH3+ ions, 64 Cl− ions, and 3712 water molecules. GPG-NH3+ and Cl− ions were modeled using the CHARMM force field and TIP3P water molecules for this force field.23 Water bonds and angles were constrained using the SHAKE algorithm24 and simulations were conducted using GROMACS.25 The ensemble-averaged site-site radial distribution functions (g(r)s) and the SDFs11 were calculated from EPSR molecular assemblies. Details of EPSR and MD simulations are shown in the Supporting Information.
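As a consistency check on the box compositions given above, the solute-to-water ratio of each simulation box can be converted to a molality. The sketch below is illustrative only; it assumes the molar mass of light water (isotopically substituted samples would differ) and an 85:15 trans:cis split applied to the 20-peptide EPSR box, and the function name is hypothetical.

```python
WATER_MOLAR_MASS_KG = 0.018015  # kg per mol of H2O; light water assumed

def molality(n_solute, n_water, water_molar_mass_kg=WATER_MOLAR_MASS_KG):
    """Moles of solute per kilogram of water for a box with the given molecule counts."""
    return n_solute / (n_water * water_molar_mass_kg)

epsr_molality = molality(20, 1160)  # EPSR box: 20 peptides, 1160 waters
md_molality = molality(64, 3712)    # MD boxes: 64 peptides, 3712 waters

# Number of cis peptides an 85:15 trans:cis ratio would imply in the EPSR box
n_cis_epsr = round(20 * 0.15)
n_trans_epsr = 20 - n_cis_epsr

print(f"EPSR {epsr_molality:.2f} m, MD {md_molality:.2f} m, "
      f"EPSR box {n_trans_epsr} trans / {n_cis_epsr} cis")
```

Both boxes hold 58 water molecules per peptide, so they describe the same concentration, a little under 1 mol per kg of water under the stated assumption.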
References
- 1. Karplus M. Folding Des. 1997;2:S69–S75. doi: 10.1016/s1359-0278(97)00067-9.
- 2. Dill KA, MacCallum JL. Science. 2012;338:1042–1046. doi: 10.1126/science.1219021.
- 3. Fuchs PFJ, Bonvin AMJJ, Bochicchio B, Pepe A, Alix AJP, Tamburro AM. Biophys. J. 2006;90:2745–2759. doi: 10.1529/biophysj.105.074401.
- 4a. Toniolo C, Benedetti E. Crit. Rev. Biochem. Mol. Biol. 1980;9:1–44. doi: 10.3109/10409238009105471.
- 4b. Chou K-C. Anal. Biochem. 2000;286. doi: 10.1006/abio.2000.4757.
- 4c. Panasik N, Fleming PJ, Rose GD. Protein Sci. 2005;14. doi: 10.1110/ps.051625305.
- 5a. McCallister EL, Alm E, Baker D. Nat. Struct. Biol. 2000;7:669–673. doi: 10.1038/77971.
- 5b. Thukral L, Smith JC, Daidone I. J. Am. Chem. Soc. 2009;131. doi: 10.1021/ja9064365.
- 6. McLain SE, Soper AK, Daidone I, Smith JC, Watts A. Angew. Chem. 2008;120:9199–9202; Angew. Chem. Int. Ed. 2008;47. doi: 10.1002/anie.200802679.
- 7a. Hargreaves R, Bowron DT, Edler K. J. Am. Chem. Soc. 2011;133:16524–16536. doi: 10.1021/ja205804k.
- 7b. Mancinelli R, Bruni F, Ricci MA, Imberti S. J. Chem. Phys. 2013;138. doi: 10.1063/1.4807601.
- 7c. Hayes R, Imberti S, Warr CG, Atkin R. Angew. Chem. 2013;125; Angew. Chem. Int. Ed. 2013;52. doi: 10.1002/anie.201209273.
- 8. Hulme EC, Soper AK, McLain SE, Finney JL. Biophys. J. 2006;91:2371–2380. doi: 10.1529/biophysj.106.089185.
- 9. Guruprasad K, Rajkumar S. J. Biosci. 2000;25:143–156.
- 10. Lewis PN, Momany FA, Scheraga HA. Proc. Natl. Acad. Sci. USA. 1971;68:2293–2297. doi: 10.1073/pnas.68.9.2293.
- 11. McLain SE, Soper AK, Luzar A. J. Chem. Phys. 2006;124:074502. doi: 10.1063/1.2170077.
- 12. Grathwohl C, Wüthrich K. Biopolymers. 1976;15:2025–2041. doi: 10.1002/bip.1976.360151012.
- 13. Nicholls A, Sharp KA, Honig B. Proteins Struct. Funct. Genet. 1991;11:281–296. doi: 10.1002/prot.340110407.
- 14. Ben-Naim A. Open J. Biophys. 2011;1:1–7.
- 15a. Ben-Naim A. Molecular Theory of Water and Aqueous Solutions: Role of Water in Protein Folding, Self-Assembly and Molecular Recognition Pt. II. Singapore: World Scientific; 2011. p. 480.
- 15b. Ben-Naim A. The Protein Folding Problem and its Solutions. Singapore: World Scientific; 2013.
- 16. Daidone I, Ulmschneider MB, Di Nola A, Amadei A, Smith JC. Proc. Natl. Acad. Sci. USA. 2007;104:15230–15235. doi: 10.1073/pnas.0701401104.
- 17. Anfinsen CB. Science. 1973;181:223–230. doi: 10.1126/science.181.4096.223.
- 18. Venkatachalam CM. Biopolymers. 1968;6:1425–1436. doi: 10.1002/bip.1968.360061006.
- 19. Crawford JL, Lipscomb WN, Schellman CG. Proc. Natl. Acad. Sci. USA. 1973;70:538–542. doi: 10.1073/pnas.70.2.538.
- 20. Richardson JS. In: Advances in Protein Chemistry, Vol. 34. Anfinsen CB, Richards JT, editors. New York: Academic Press; 1981. pp. 167–339.
- 21. Kuntz ID. J. Am. Chem. Soc. 1972;94:4009–4012. doi: 10.1021/ja00766a060.
- 22. Soper AK. Mol. Simul. 2012;38:1171–1185.
- 23a. MacKerell AD, Jr. J. Comp. Chem. 2004;25:1584–1604. doi: 10.1002/jcc.20082.
- 23b. MacKerell AD, Jr, Bashford D, Bellott M, Dunbrack RL, Jr, Evanseck JD, Field MJ, Fischer S, Gao J, Guo H, Ha S, Joseph-McCarthy D, Kuchnir L, Kuczera K, Lau FTK, Mattos C, Michnick S, Ngo T, Nguyen DT, Prodhom B, Reiher WE, III, Roux B, Schlenkrich M, Smith JC, Stote R, Straub J, Watanabe M, Wiorkiewicz-Kuczera J, Yin D, Karplus M. J. Phys. Chem. B. 1998;102. doi: 10.1021/jp973084f.
- 23c. Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML. J. Chem. Phys. 1983;79.
- 24. Ryckaert J-P, Ciccotti G, Berendsen HJC. J. Comput. Phys. 1977;23:327–341.
- 25. Hess B, Kutzner C, van der Spoel D, Lindahl E. J. Chem. Theory Comput. 2008;4:435–447. doi: 10.1021/ct700301q.