Protein molecules generally fold into precise three-dimensional shapes. Although of direct interest to biochemists, the question of how folding occurs has attracted a much broader audience of scientists, ranging from traditional chemical scientists to computer scientists and physicists. The very meaning of the question of protein folding has also changed over time. To the descriptive scientist it may be sufficient to assert that folding occurs on a time scale no slower than protein biosynthesis, and that the information required to find the precise three-dimensional shape is contained in the one-dimensional sequence. Although exceptions to these generalizations have recently begun to emerge in the study of prion-associated diseases (1), the generalizations hold widely enough to allow us to treat the folding process as a black box for transcribing one-dimensional information into three-dimensional structures. It is, however, necessary to probe deeper into the mechanism if the prediction of protein structure from sequence and the design of truly novel protein-like molecules are to be achieved. These goals are of great practical significance in biology and medicine.
The question of the mechanism of folding was once thought to be entirely analogous to the question of mechanism in intermediary metabolism or classical organic chemistry. In those problems, the small number of participating species and the relatively specific routes by which they interconvert, owing to the large scale of covalent energy barriers compared with thermal energies, mean that a small number of fairly discrete chemical steps can be isolated. This is the classic notion of a protein folding pathway with a series of discrete intermediates. Such discrete intermediates do occur in the late stages of protein folding, and to a great extent, the chemical kinetic details of these interconversions have been catalogued (2).
However, to answer the practical questions of structure prediction and design, one must go a considerable distance beyond this phenomenology—a new viewpoint on folding is required.
This new viewpoint is that of the chemical physicists rather than the classical chemists (3). The chemical physics view brings the problem into much closer connection with the underlying forces and the underlying microscopic events. This view has required a new set of theoretical ideas, new computational techniques, and major advances in experimental methodology.
Energy landscape theory provides the theoretical framework, asserting that a full understanding of the folding process requires a global overview of the landscape. The folding landscape of a protein resembles a partially rough funnel riddled with traps where the protein can transiently reside (Fig. 1). There is no unique pathway but a multiplicity of convergent folding routes toward the native state (3–7). Although we focus on developments from our groups in this paper, several other groups have participated in developing this new view.
The importance of a funnel landscape can be seen by contrasting random heteropolymeric molecules and proteins. Random heteropolymers have an underlying driving force to collapse but do not adopt well-defined three-dimensional structures because of the conflict or frustration of different interaction energies. Instead, they exist in an ensemble of dissimilar low-energy structures and move among this multitude of states, jumping over barriers between adjacent minima, giving multiexponential kinetics. Proteins, on the other hand, have a single manifold of structurally similar collapsed low-energy structures. When the slope toward the native state is dominant over the ruggedness of the landscape, folding kinetics is exponential and fast. The essence of the funnel landscape idea is competition between the tendency toward the folded state and trapping because of ruggedness. This competition is measured by the ratio between the folding temperature (Tf) and the glass temperature (Tg). Good folding sequences fold rapidly on minimally frustrated landscapes with large values of Tf/Tg (3, 5, 7).
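As a rough illustration of how the ratio Tf/Tg measures this competition, the sketch below uses the random energy model approximation familiar from the landscape literature (refs. 5 and 19). That approximation is an assumption brought in here and is not spelled out in this section. The ruggedness is taken to be a Gaussian energy spread `delta_e`, the configurational entropy of the collapsed states is `s0` (in units of k_B), and the native state lies an energy `gap` below the mean of the collapsed states. The function names and the numerical values are hypothetical.

```python
import math


def glass_temperature(delta_e: float, s0: float) -> float:
    """Random energy model estimate of Tg (k_B = 1): Tg = delta_e / sqrt(2 * s0)."""
    return delta_e / math.sqrt(2.0 * s0)


def folding_temperature(gap: float, delta_e: float, s0: float) -> float:
    """Temperature at which the native state and the unfolded ensemble have equal free energy.

    Solves -gap = -delta_e**2 / (2 * T) - T * s0 and returns the root that lies above Tg.
    """
    discriminant = gap * gap - 2.0 * s0 * delta_e * delta_e
    if discriminant < 0.0:
        raise ValueError("gap too small: native state is not below the random ground state")
    return (gap + math.sqrt(discriminant)) / (2.0 * s0)


def tf_over_tg(gap: float, delta_e: float, s0: float) -> float:
    """Ratio Tf/Tg; larger values correspond to a less frustrated landscape."""
    return folding_temperature(gap, delta_e, s0) / glass_temperature(delta_e, s0)


if __name__ == "__main__":
    # Hypothetical parameters: ruggedness 1.0, entropy 50 k_B, increasing stability gap.
    for gap in (10.0, 12.0, 15.0):
        print(gap, round(tf_over_tg(gap, delta_e=1.0, s0=50.0), 3))
```

In this approximation the ratio equals 1 when the native energy only matches the lowest energy expected for a random sequence, and it grows as the gap widens relative to the ruggedness, which is the sense in which good folders have large Tf/Tg.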
Minimally frustrated sequences not only fold fast at relevant temperatures but are robust folders, only weakly dependent on minor variations of the folding environment or on mutations. Robustness is essential for biology. Minor changes in pH or temperature, or mutations, will affect the native configuration by favoring other low-energy structures, which in a funnel-like landscape are very similar to the original one. Energy landscape theory suggests a diversity of folding scenarios that has been observed by computer simulations of minimalist models (see, for example, references in refs. 3, 8, 11, and 15–18), allowing connections to studies of real proteins (19, 20).
A deeper understanding of how the landscape relates to particular properties of a protein’s sequence or final folded topology requires detailed molecular calculations. Such calculations often use explicit atomic level models of the protein and surrounding solvent together with molecular dynamics and biased sampling methods to map the free energy landscape onto coordinates describing folding progress (21). These permit detailed atomic information about folding to be related to experimentally measurable quantities. Atomistic studies may be compared with general predictions of landscape theories.
In Fig. 1 we display the folding free energy profile for the α/β protein GB1 under conditions strongly favoring the native state (22). Folding of this protein is dominated by an initial collapse without significant formation of native interactions, followed by folding toward the native state without significant change in size. This behavior contrasts with the mechanism seen for some small helical proteins that are even less frustrated (14, 23). These fold with a commensurate reduction in size and gain in tertiary interactions. Thus, two apparently different mechanisms are used by proteins that differ in their final folded topology. Such behavior can be mimicked with minimalist lattice models by changing the balance of native and non-native (hydrophobic) attractive interactions (24).
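A toy version of such a minimalist model can make the two order parameters concrete. The sketch below is not the model of ref. 24. It assumes a 9-residue chain on a two-dimensional square lattice, with a hypothetical energy `eps_native` for each native contact and a weaker `eps_nonnative` for any other contact. It reports the energy, the fraction of native contacts, and the squared radius of gyration of a conformation.

```python
def contacts(conf):
    """Non-bonded nearest-neighbour pairs (i, j) with j >= i + 3 on a square lattice."""
    found = set()
    for i in range(len(conf)):
        for j in range(i + 3, len(conf)):
            dx = abs(conf[i][0] - conf[j][0])
            dy = abs(conf[i][1] - conf[j][1])
            if dx + dy == 1:
                found.add((i, j))
    return found


def energy(conf, native_contacts, eps_native=-1.0, eps_nonnative=-0.3):
    """Sum of contact energies; the two eps values set the native/non-native balance."""
    total = 0.0
    for pair in contacts(conf):
        total += eps_native if pair in native_contacts else eps_nonnative
    return total


def fraction_native(conf, native_contacts):
    """Fraction of native contacts present in the conformation."""
    return len(contacts(conf) & native_contacts) / len(native_contacts)


def radius_of_gyration_sq(conf):
    """Mean squared distance of the beads from their centroid."""
    n = len(conf)
    cx = sum(x for x, _ in conf) / n
    cy = sum(y for _, y in conf) / n
    return sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in conf) / n


# Hypothetical native fold: a compact 3 x 3 "snake".
NATIVE = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1), (0, 2), (1, 2), (2, 2)]
# Hypothetical collapsed but mostly non-native fold: a 3 x 3 spiral.
COLLAPSED = [(1, 1), (1, 0), (0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)]
NATIVE_CONTACTS = contacts(NATIVE)

if __name__ == "__main__":
    for name, conf in (("native", NATIVE), ("collapsed", COLLAPSED)):
        print(name, energy(conf, NATIVE_CONTACTS),
              fraction_native(conf, NATIVE_CONTACTS), radius_of_gyration_sq(conf))
```

The two conformations have the same size but very different native content. Making `eps_nonnative` more attractive stabilizes collapsed non-native states of this kind, whereas setting it to zero leaves only native contacts to drive compaction.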
Also directly connected to phenomenological models of folding is the distribution of stabilizing native interactions in the transition region of folding (20). For helical proteins (at 300 K), this distribution is rather broad and centered around 50% occupation of the native contacts. The interactions that stabilize the protein at this point in folding are distributed throughout the structure. For the α/β protein GB1, the distribution is more bimodal: more native interactions are formed with the highest and lowest probabilities than with probabilities near 50%.
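The quantity discussed here, the occupation probability of each native contact within an ensemble of transition-region conformations, could be tabulated along the following lines. This is a minimal sketch: the representation of each conformation as a set of formed contacts, the function names, and the tiny example ensemble are all placeholders and are not data from refs. 20 or 22.

```python
def occupation_probabilities(ensemble, native_contacts):
    """Fraction of conformations in which each native contact is formed."""
    n = len(ensemble)
    return {c: sum(c in conf for conf in ensemble) / n for c in sorted(native_contacts)}


def histogram(probabilities, n_bins=4):
    """Count contacts per equal-width probability bin on [0, 1]."""
    counts = [0] * n_bins
    for p in probabilities:
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        counts[index] += 1
    return counts


# Placeholder input: three native contacts and four conformations (made-up values).
NATIVE_SET = {(0, 3), (1, 4), (2, 5)}
ENSEMBLE = [
    {(0, 3), (1, 4)},
    {(0, 3)},
    {(0, 3), (2, 5)},
    {(0, 3), (1, 4)},
]

if __name__ == "__main__":
    probs = occupation_probabilities(ENSEMBLE, NATIVE_SET)
    print(probs)
    print(histogram(probs.values()))
```

A distribution of the broad kind described for helical proteins would put most of its counts in the central bins, whereas a bimodal one would put them in the first and last bins.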
Simulations with detailed atomic models are extremely intensive numerically, so the number and size of systems that can be studied are limited. It is therefore difficult for such simulations to provide a general view of folding dynamics. Here models of intermediate resolution are sought that make it possible to simulate the folding process directly and to explore broadly the connections between thermodynamics and kinetics (9, 18, 25–27).
Experiments also have been devised to explore the landscape. Methods such as dynamic NMR, protein engineering, laser-initiated folding, and ultrafast mixing are being used (28–30). During the past 5 years it has become possible to study “burst phase kinetics,” which cannot be resolved within the dead time of conventional stopped-flow experiments. In some cases, e.g., apomyoglobin, substantial secondary structure and nativelike tertiary contacts are observed to have already formed. Such burst phases could represent kinetics with a small or negligible activation barrier. Even in the case where kinetics is limited by activation, much structure formation occurs by the subsequent downhill dynamics to the native state. These downhill processes must be studied via fast folding experiments.
One experiment uses cold denaturation in supercooled water to prepare unfolded proteins (30). An infrared laser pulse then raises the temperature by 10–30°C in 10 ns by directly heating the water, placing the unfolded chain in an environment conducive to folding. The refolding is monitored by a series of UV pulses spaced 14 ns apart. The UV light excites tryptophan fluorescence, for which specific quenching mechanisms are introduced to serve as structural markers. Experiments on horse apomyoglobin show that the cold-denatured state has some residual structure in the GH helices and loop; the N-terminal A helix is largely destabilized at low temperature. The fast refolding experiment indicates that formation of the AGH core of apomyoglobin is an extremely efficient process (30). Refolding of phosphoglycerate kinase, a 415-residue two-domain enzyme, displays multiscale kinetics on a time scale from 10 to 6,000 microseconds, indicating the roughness of the free energy landscape. Experiments are beginning to build up a phase diagram of folding kinetics that can be used to test and refine theoretical models.
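One simple way to see why multiscale kinetics signals a rough landscape is to compare a single-exponential relaxation with a superposition of exponentials from parallel populations that escape at different rates. The sketch below is an assumption-laden illustration and not the analysis of ref. 30. The two time constants are taken from the ends of the range quoted above, the equal amplitudes are invented, and the diagnostic (the ratio of the times at which the unfolded population reaches e^-2 and e^-1) is a hypothetical choice.

```python
import math


def survival(t, rates, weights):
    """Unfolded population at time t for parallel first-order processes (weights sum to 1)."""
    return sum(w * math.exp(-k * t) for k, w in zip(rates, weights))


def time_to_reach(level, rates, weights):
    """Time at which the survival curve falls to `level` (0 < level < 1), by bisection."""
    hi = 1.0
    while survival(hi, rates, weights) > level:
        hi *= 2.0
    lo = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if survival(mid, rates, weights) > level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def stretch_ratio(rates, weights):
    """t(e^-2) / t(e^-1): exactly 2 for one exponential, larger for a mixture of rates."""
    t1 = time_to_reach(math.exp(-1.0), rates, weights)
    t2 = time_to_reach(math.exp(-2.0), rates, weights)
    return t2 / t1


if __name__ == "__main__":
    # Times in microseconds; rates are inverse time constants.
    print(stretch_ratio([1.0 / 100.0], [1.0]))                   # single exponential
    print(stretch_ratio([1.0 / 10.0, 1.0 / 6000.0], [0.5, 0.5])) # two-scale mixture
```

Any mixture of decaying exponentials gives a ratio of at least 2, so a value well above 2 indicates that more than one time scale contributes to the relaxation.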
Acknowledgments
This work was supported by National Science Foundation Grant MCB-9603839 and National Institutes of Health Grants GM48807, RR12255, and GM44557.
References
- 1. Harrison P M, Bamborough P, Daggett V, Prusiner S B, Cohen F C. Curr Opin Struct Biol. 1997;7:53–59. doi: 10.1016/s0959-440x(97)80007-3.
- 2. Kim P S, Baldwin R L. Annu Rev Biochem. 1990;59:631–660. doi: 10.1146/annurev.bi.59.070190.003215.
- 3. Onuchic J N, Luthey-Schulten Z, Wolynes P G. Annu Rev Phys Chem. 1997;48:545–600. doi: 10.1146/annurev.physchem.48.1.545.
- 4. Gō N. J Stat Phys. 1983;30:413–423.
- 5. Bryngelson J D, Wolynes P G. Proc Natl Acad Sci USA. 1987;84:7524–7528. doi: 10.1073/pnas.84.21.7524.
- 6. Bryngelson J D, Wolynes P G. J Phys Chem. 1989;93:6902–6915.
- 7. Leopold P E, Montal M, Onuchic J N. Proc Natl Acad Sci USA. 1992;89:8721–8725. doi: 10.1073/pnas.89.18.8721.
- 8. Dill K A, Chan H S. Nat Struct Biol. 1997;4:10–19. doi: 10.1038/nsb0197-10.
- 9. Guo Z, Thirumalai D. J Mol Biol. 1996;263:323–343. doi: 10.1006/jmbi.1996.0578.
- 10. Šali A, Shakhnovich E, Karplus M. J Mol Biol. 1994;235:1614–1636. doi: 10.1006/jmbi.1994.1110.
- 11. Scheraga H A. Protein Sci. 1992;1:691–693. doi: 10.1002/pro.5560010515.
- 12. Zwanzig R. Proc Natl Acad Sci USA. 1995;92:9801–9804. doi: 10.1073/pnas.92.21.9801.
- 13. Pande V S, Grosberg A Y, Tanaka T. Proc Natl Acad Sci USA. 1994;91:12972–12975. doi: 10.1073/pnas.91.26.12972.
- 14. Boczko E M, Brooks C L, III. Science. 1995;269:393–396. doi: 10.1126/science.7618103.
- 15. Guo Z Y, Thirumalai D. Biopolymers. 1995;36:83–102.
- 16. Mirny L A, Abkevich V, Shakhnovich E I. Folding Design. 1996;1:103–116. doi: 10.1016/S1359-0278(96)00019-3.
- 17. Friedrichs M S, Goldstein R A, Wolynes P G. J Mol Biol. 1991;222:1013–1034. doi: 10.1016/0022-2836(91)90591-s.
- 18. Guo Z, Brooks C L, III. Biopolymers. 1997;42:745–757. doi: 10.1002/(sici)1097-0282(199712)42:7<745::aid-bip1>3.0.co;2-t.
- 19. Onuchic J N, Wolynes P G, Luthey-Schulten Z, Socci N D. Proc Natl Acad Sci USA. 1995;92:3626–3630. doi: 10.1073/pnas.92.8.3626.
- 20. Wolynes P G, Schulten Z L, Onuchic J. Chem Biol. 1996;3:425–432. doi: 10.1016/s1074-5521(96)90090-3.
- 21. Sheinerman F B, Brooks C L, III. J Mol Biol. 1998;278:439–455. doi: 10.1006/jmbi.1998.1688.
- 22. Sheinerman F B, Brooks C L, III. Proc Natl Acad Sci USA. 1998;95:1562–1567. doi: 10.1073/pnas.95.4.1562.
- 23. Guo Z, Brooks C L, III, Boczko E. Proc Natl Acad Sci USA. 1997;94:10161–10166. doi: 10.1073/pnas.94.19.10161.
- 24. Socci N D, Onuchic J N. J Chem Phys. 1995;103:4732–4744.
- 25. Nymeyer H, García A E, Onuchic J N. Proc Natl Acad Sci USA. 1998;95:5921–5928. doi: 10.1073/pnas.95.11.5921.
- 26. Shea J-E, Nochomovitz Y D, Guo Z, Brooks C L, III. J Chem Phys. 1998; in press.
- 27. Hardin C, Luthey-Schulten Z A, Wolynes P G. Proteins Struct Funct Genet. 1998; in press.
- 28. Fersht A R. Curr Opin Struct Biol. 1997;7:3–9. doi: 10.1016/s0959-440x(97)80002-4.
- 29. Eaton W A, Munoz V, Thompson P, Chan C K, Hofrichter J. Curr Opin Struct Biol. 1997;7:10–14. doi: 10.1016/s0959-440x(97)80003-6.
- 30. Gruebele M, Sabelko J, Ballew R M, Ervin J. Acc Chem Res. 1998; in press.