Abstract
How do proteins fold, and why do they fold in that way? This Perspective integrates earlier and more recent advances over the 50-y history of the protein folding problem, emphasizing unambiguously clear structural information. Experimental results show that, contrary to prior belief, proteins are multistate rather than two-state objects. They are composed of separately cooperative foldon building blocks that can be seen to repeatedly unfold and refold as units even under native conditions. Similarly, foldons are lost as units when proteins are destabilized to produce partially unfolded equilibrium molten globules. In kinetic folding, the inherently cooperative nature of foldons predisposes the thermally driven amino acid-level search to form an initial foldon and subsequent foldons in later assisted searches. The small size of foldon units, ∼20 residues, resolves the Levinthal time-scale search problem. These microscopic-level search processes can be identified with the disordered multitrack search envisioned in the “new view” model for protein folding. Emergent macroscopic foldon–foldon interactions then collectively provide the structural guidance and free energy bias for the ordered addition of foldons in a stepwise pathway that sequentially builds the native protein. These conclusions reconcile the seemingly opposed new view and defined pathway models; the two models account for different stages of the protein folding process. Additionally, these observations answer the “how” and the “why” questions. The protein folding pathway depends on the same foldon units and foldon–foldon interactions that construct the native structure.
Keywords: protein folding, hydrogen exchange, protein structure
Proteins must fold to their active native state when they emerge from the ribosome and when they repeatedly unfold and refold during their lifetime (1, 2). The folding process is difficult (3, 4) and potentially dangerous (5). Biological health depends on its success and disease on its failure. However, more than 50 y after the formative demonstration that protein folding is a straightforward biophysical process (6), there is still no general agreement on the overarching questions of how proteins fold and why they fold in that way. Given this uncertainty, one is not sure how to even think about many related biophysical and biological problems.
Early in the history of the folding field, experimentalists simply assumed that proteins fold through distinct intermediate states in a distinct pathway (Fig. 1A), as seen for classical biochemical pathways. Following Anfinsen’s demonstration that proteins can fold all by themselves without outside help (6), Levinthal perceived that no undirected folding process would be able to find the native structure by random searching through the vast number of structural options (3, 4). Proteins must solve the problem, he believed, by folding through predetermined pathways, although one had no clue how or why that should occur.
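The scale of Levinthal's argument can be illustrated with a rough count. The sketch below is not taken from the cited work. It assumes, purely for illustration, three backbone conformations per residue and a sampling rate of 10^13 conformations per second, and it compares an exhaustive search over a hypothetical 100-residue chain with one over a segment of about 20 residues, the foldon size quoted in the abstract.

```python
def exhaustive_search_seconds(n_residues,
                              states_per_residue=3,      # assumed, illustrative
                              samples_per_second=1e13):  # assumed, illustrative
    """Time needed to visit every conformation once under the stated assumptions."""
    n_conformations = states_per_residue ** n_residues
    return n_conformations / samples_per_second

# Hypothetical 100-residue protein searched as a single unit.
whole_chain_s = exhaustive_search_seconds(100)

# A foldon-sized segment of about 20 residues searched on its own.
foldon_s = exhaustive_search_seconds(20)

print(f"100 residues: {whole_chain_s:.2e} s")
print(f" 20 residues: {foldon_s:.2e} s")
```

Under these assumptions the whole-chain search takes on the order of 10^34 s, whereas the 20-residue search completes in well under a millisecond. The absolute numbers depend entirely on the assumed parameters; only the contrast in scale is the point.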
Fig. 1.
(A) The classical view of a defined folding pathway, and (B) the new view of multiple routes through a funneled landscape. Reprinted with permission from ref. 13. Dashed line in A illustrates the insertion of an optional error-dependent kinetic barrier, which can affect some population fraction and not others and thus mimic multipathway folding.
A realization of the inability to equilibrate to a common structure (3, 4) and the ensemble nature of partially folded forms led the theoretical community to a very different, more statistical “new view” (7–11). It was inferred that proteins must fold to their unique native state through multiple unpredictable routes and intermediate conformations. Another prominent inference configured the Anfinsen thermodynamic hypothesis (6) in terms of a funnel-shaped energy landscape diagram (Fig. 1B), which pictures that proteins must fold energetically downhill (the Z axis) and shrink in conformational extent (the generalized XY plane) as they go (9, 12–14). To fill out the landscape picture, classical rate-determining kinetic barriers are often replaced by qualitative concepts such as ruggedness, frustration, and traps, and major species by deep wells, all forming a kind of metalanguage known as “energy landscape theory” (10, 15–17). The graphic funnel picture is a generic representation, independent of structural and thermodynamic detail and equally applicable to any protein, RNA, or other compact polymer. Although it provides no constraints that would exclude any realistic folding scenario, even a defined pathway model, it has been widely interpreted to require that proteins fold through many independent pathways.
R. L. Baldwin took up the challenge and led the field in a multiyear effort to experimentally define kinetic folding intermediates and pathways (18–20). In a thoughtful protein folding review 20 y ago, Baldwin considered the disparate insights available at the time from both theory and experiment (21). He highlighted uncertainties in the experimental evidence for classical pathways. Kinetic folding intermediates seemed to form asynchronously over a range of time scales. Equilibrium analogs of folding intermediates called molten globules yielded mixed results, sometimes agreeing with kinetic folding information and sometimes not. Baldwin’s article served to alert the experimental protein folding community to the new view of heterogeneous folding and helped to establish the current paradigm of a multipath funneled energy landscape.
The distinction between the classical view of a more or less single pathway through defined intermediates and the disordered many-pathway new view has broad significance for the understanding of protein biophysics and biological function. The question could be resolved by determining experimentally the structure of the intermediate forms that bridge between unfolded and native states in real proteins, but this effort has turned out to be exceptionally difficult. The usual methods, crystallography and NMR, cannot define partial structures that form and decay in less than 1 s. Experimentalists have been forced to depend on spectroscopic methods (fluorescence, CD, IR) that can follow kinetic folding in real time but are blind to the specifics of structure and so allow the possibility of alternative folding mechanisms. Theorists have attempted to avoid these difficulties by simulating the folding process in computers. Theory-based computer simulations can be remarkably powerful. For example, one can compute the path of a multiton rocket through 150 million miles of free space to a pinpoint landing on Mars. The equations that govern space flight are known precisely (22), computer power is ample, and the track to be controlled is clear. Computing the structural journey of minuscule protein molecules through submicrons of space has proved to be more difficult. The computer power required to track the folding process at the level of thermally driven residue-level dynamics is immense. The forces that direct protein folding are delicately balanced, interlocking, and not describable in exact terms. The reaction path(s) to be mined from the mass of computer data are unknown.
For both the classical and new view models, Fig. 1 implies that the structure of folding intermediates and their pathway connections might be determined in three different ways: (i) as intermediates that reach significant occupancy during kinetic folding; (ii) as conformationally excited states that exist at their equilibrium Boltzmann level in the high free energy space above the native protein; and (iii) as modified molten globule forms made by destabilizing the native protein so that higher energy states become the lowest free energy equilibrium form. Experimental advances have accomplished all three of these approaches. At any time point during kinetic folding, a briefly present folding intermediate can be marked in a structure-sensitive way by hydrogen exchange (HX) pulse labeling and defined by later analysis. The structure of partially folded states minimally populated in the high free energy space can be determined by HX labeling, sulfhydryl labeling, and NMR methods. Partially unfolded molten globules can be labeled in a structure-sensitive way by hydrogen–deuterium (H-D) exchange and analyzed later by site-resolved NMR in the reformed native state or directly by mass spectrometry.
These advances now make it possible to determine the structure and properties of intermediate protein folding states and their pathway connections and so place the study of folding pathways on the solid ground of structural biology. Experiment can now ask whether proteins fold through a limited number of distinct obligatory intermediate structures in an ordered kinetic sequence as suggested in Fig. 1A, or through a heterogeneous collection of independent multiply parallel forms and routes as in Fig. 1B, or through some other combination of conformations.
Intermediates During Kinetic Folding
HX Pulse Labeling and NMR Analysis.
It first became possible to obtain detailed structural information on briefly present protein folding intermediates with the development of the HX pulse labeling method (23, 24). The initially unfolded and D-exchanged protein is mixed into folding conditions and then, at various times during folding, is subjected to a short, selective D to H exchange labeling pulse. The protein folds to the native state, and D vs. H placement is analyzed by NMR to identify amide sites that were already protected (still D-labeled) or not yet protected (H-labeled) at the time of the labeling pulse. The results provide a series of snapshots during the time course over which folding converts identifiable main chain amides to a protected H-bonded condition. The results will detect intermediates that encounter a sizeable kinetic barrier and so reach significant population.
Initial results obtained for cytochrome c (Cyt c; Fig. 2) showed that approximately half of the molecules form their sequentially remote but structurally contiguous N- and C-terminal helical segments early (12 ms), suggesting the formation of a specific on-pathway native-like intermediate. However, Baldwin’s review (21) emphasized the asynchrony in these kinetic results; some of the molecules protect their N- and C-terminal helices early, whereas others do so at later times, along with other regions. Other proteins similarly studied have often yielded analogous results. This heterogeneous behavior conflicts with a well-defined sequential pathway model and seems more consistent with the new view of different routes, rates, and traps.
Fig. 2.
Initial HX NMR pulse labeling results for Cyt c (24). A brief D to H labeling pulse imposed after various folding times was used to track the increasing protection (decreasing H-labeling) of individual residues and the segments that they represent. The results suggested early formation of a native-like N/C bihelical folding intermediate. Baldwin’s review (21) noted the kinetic asynchrony, with the N- and C-terminal helical segments in different molecules folding at different rates. Later work shows that the asynchrony is caused by protein aggregation and by HX pulse breakthrough due to back-unfolding of the transiently populated intermediate during the H-labeling pulse (50 ms).
One now knows that the heterogeneous folding seen in kinetic HX pulse labeling experiments can be due to previously unrecognized experimental issues. One problem concerns the tendency of refolding proteins to transiently aggregate, especially at the high concentrations used to facilitate the preparation of samples for NMR analysis (25, 26). Another unexpected effect was revealed in a sophisticated analysis of the HX pulse labeling experiment, which showed that intermediates populated during kinetic folding may repeatedly unfold and refold on a fast time scale. Sites that are already folded and protected can nevertheless become H-labeled during the intense high-pH interrogation pulse even with only a single reversible unfolding during the pulse (the so-called EX1 HX regime; 50-ms pulse, back-unfolding rate 12 s⁻¹ for Fig. 2) (27, 28). Other HX NMR pulse labeling studies have been compromised by similar aggregation and HX EX1 behavior, and also by the inability to differentiate mixtures of states due to the ensemble averaging that occurs when NMR is used to obtain a single measurement for each individual residue.
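The size of the back-unfolding effect can be estimated with a short calculation. The sketch below assumes the simplest EX1 picture: unfolding is first order at a constant rate, every unfolding event during the pulse labels the exposed sites, and labeled sites stay labeled after refolding. The function name is hypothetical. The 50-ms pulse and the 12 s⁻¹ rate are the values quoted above; the 10-ms pulse is included only as a comparison.

```python
import math

def ex1_labeled_fraction(unfolding_rate_per_s, pulse_s):
    """Fraction of already-protected molecules that unfold at least once,
    and so become labeled, during the pulse (idealized EX1 limit)."""
    return 1.0 - math.exp(-unfolding_rate_per_s * pulse_s)

# Values quoted in the text for the Cyt c experiment of Fig. 2.
frac_50ms = ex1_labeled_fraction(12.0, 0.050)

# Same back-unfolding rate with a shorter, 10-ms pulse for comparison.
frac_10ms = ex1_labeled_fraction(12.0, 0.010)

print(f"50-ms pulse: {frac_50ms:.2f} labeled")   # about 0.45
print(f"10-ms pulse: {frac_10ms:.2f} labeled")   # about 0.11
```

In this idealized estimate, nearly half of the molecules that had already formed the intermediate would appear unprotected after a 50-ms pulse, which is enough to mimic heterogeneous folding. Shortening the pulse reduces the spurious labeling several-fold.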
HX Pulse Labeling and MS Analysis.
A recently developed variant of the HX pulse labeling experiment can produce a more explicit description of the kinetic folding process. The new technology replaces NMR analysis with a mass spectrometry technique (HX MS) that allows folding experiments at 1,000-fold lower concentration and thus excludes aggregation. As before, the unfolded and D-exchanged protein is mixed into folding conditions and is subjected to a D to H exchange labeling pulse after various folding times. The labeling pulse can be adjusted to avoid or to study the back-unfolding behavior of transiently populated intermediates. To terminate labeling and prepare for analysis, the selectively labeled protein is plunged into slow HX conditions (low pH and temperature), then cleaved into short fragments, and the fragments are separated and analyzed by fast HPLC and mass spectrometry. The two examples so far published, illustrated in Figs. 3 and 4, provide detailed pathway information.
Fig. 3.
Pulse labeling HX MS results for maltose binding protein (29). (A) The time-dependent folding (HX protection) of 116 highest-quality MBP peptide fragments representing different protein regions. Black kinetic curves show the slow time course for folding of peptide fragments that are most protected in the initial collapse. (B–D) Representative HX-labeled MS fragments from different protein regions (colored) define the separate folding steps, display their concerted two-state nature, measure their formation rates, and show that the entire protein population (>95%) experiences the same steps. (E) The course of folding. On dilution from denaturant into folding conditions, MBP rapidly collapses into a heterogeneous polyglobular state (SAXS envelope reconstruction in gray) with widespread low level HX protection, then slowly folds (kinetic curves in A) through an initial native-like intermediate (blue, τ = 7 s) and later kinetically unresolved steps (green, gray, red; τ ∼60 s to 120 s; fastest green segments shown in C and E). Mutations known to greatly slow folding (stars) are all within the 7 s intermediate.
Fig. 4.
Pulse labeling HX MS results for Ribonuclease H (30). (A and B) Kinetic curves for time-dependent HX protection of peptide fragments that define the blue, green, yellow, and red foldons. (C–F) HX MS pulse labeling results for representative peptide fragments show the time course and two-state concerted nature of foldon folding steps, and that the entire protein population (>95%) experiences the same sequence of concerted steps in a single dominant pathway. The yellow foldon does not reach complete protection because of partial labeling due to back-unfolding during the 10-ms labeling pulse, which helps to distinguish the yellow foldon from the green foldon, along with the small difference in their formation rates seen in the renormalized kinetic phases (A, Inset).
When the large (370 residues) two-domain and aggregation-prone maltose binding protein (MBP) is diluted into folding conditions at <1 μM concentration, it does not aggregate, but it does rapidly collapse into a dynamic polyglobular state with heterogeneous low-level HX protection (Fig. 3) (29). This condition might be expected to spawn multiple folding routes as in the new view model, but it does not. The microsecond and millisecond time scales pass with no indication of native-like structure formation, perhaps because conformational searching in the collapsed state is difficult. Ultimately, the entire protein population assembles sequentially remote segments into a specific native-like intermediate with a single exponential time constant of 7 s (blue in Fig. 3). Other peptides then report on later folding events that move to the native state over a broader time scale (60–120 s), suggesting several folding steps, but their kinetics are too compressed to allow clear resolution. These experiments largely avoided the back-unfolding HX labeling artifact by using a short labeling pulse (12 ms). Longer pulses (up to 42 ms) allowed the back-unfolding of the weakly protected regions in the initially collapsed form to be studied (29). Higher protection seems to correlate with the amphipathic nature of different segments and their tendency to form helical structure. The more protected segments (black curves in Fig. 3) are not the ones that form the emergent 7 s native-like foldon.
The same technology was able to resolve the entire folding trajectory of Ribonuclease H (155 residues; Fig. 4) in structural and temporal detail (30). The overlapping peptide MS results allow transiently formed intermediates to be defined at near amino acid resolution. In each case they are composed of sets of residues that form well-defined H-bonded elements in the native protein (foldons). The results display a stepwise assembly of the native structure, first helix A + strand 4 (blue in Fig. 4), then the neighboring helix D + strand 5 (green), then the interacting B/C helix (yellow), and finally the terminal segments (red). The yellow foldon does not reach complete protection because of some back-unfolding (∼20%) during the 10-ms HX labeling pulse which, fortuitously, helps to distinguish the yellow and green foldons along with the small difference in their formation rates seen in the renormalized kinetic phases (Fig. 4, Inset).
We used the HX MS method to reexamine the ambiguous kinetic folding results of Cyt c measured before by HX NMR (Fig. 2). Low folding concentration (2 μM) avoided the previous transient aggregation problem, and a short labeling pulse (10 ms rather than 50 ms) minimized spurious labeling due to back-unfolding during the pulse. The results confirm that all of the proteins fold and dock their two terminal helices in a single early step (∼12 ms), and the rest of the native structure folds later. A method for studying kinetic folding intermediates at equilibrium, known as native state HX, described below, independently confirms this result and elucidates the entire subsequent folding pathway.
Unlike all ensemble-level measurements including HX NMR, these pulse labeling HX MS results provide snapshots that show the structurally different populations that are already formed and not yet formed at any time point during kinetic folding rather than a potentially misleading population average. The HX MS data show that folding occurs in a stepwise manner and that each kinetic step is individually two-state, representing the cooperative formation of an additional folding unit (foldon) (MS data in Figs. 3 and 4). The time-dependent MS data show that, once each foldon unit is formed, it remains in place as subsequent foldons are added, demonstrating a stepwise buildup through distinct, progressively more folded forms. Essentially the entire refolding population joins synchronously in the same stepwise sequence of intermediate structures, indicating a single dominant folding pathway. The data show explicitly that less than 5% of the protein population folds through any other pathway(s). However, other Cyt c results do detect minimal branching in the special case where the prior structure can support two different but essentially equivalent subsequent steps; either step can occur before the other (31).
These results support a picture of protein folding in which the entire protein population folds through the same distinct intermediates and kinetic barriers in the same defined pathway, as in Fig. 1A. A seminal observation is that the intermediates form by assembling pieces of the native protein, called foldons.
Other Kinetic Studies.
A large fraction of the protein folding literature is directed at finding the determinants of folding rates. Prominent issues, highlighted by reviewers, concern the nucleation–condensation model, the φ value analysis method, and two-state folding. Is the distinct pathway model consistent with current kinetic information?
The nucleation–condensation model suggests that folding is initiated by a nucleation event that potentiates subsequent structural consolidation (32, 33). The φ value analysis method attempts to define the parts of a protein that gain structure in the initial rate-limiting transition state, the nucleating event, by measuring the effect of specific mutations on folding rate (34). The usual result, that φ values are small and fractional (∼0.3 ± 0.2) (35), can be explained either by multiple pathways or by the likelihood that flexible partially folded structures can accommodate disruptive mutations more easily than the rigid native state. Thus, implications for the question of one pathway vs. many are ambiguous. A related ψ value analysis method, although much less used, is more definitive. It finds the same distinct partially formed native-like structure for the entire folding population (36). These results favor the distinct pathway hypothesis.
Many proteins, especially small ones, tend to fold and unfold in a kinetically two-state manner, each with a single exponential rate. The same kinetic barrier is rate-limiting in both folding and unfolding directions, and the ratio of the two rates gives the correct equilibrium stability constant. In this case, intermediates will not be seen to populate either before or after the barrier, whether they exist or not, and the usual kinetic folding experiment simply cannot distinguish whether separate pathway steps do or do not occur. For example, the defined pathway model in Fig. 1A will produce two-state kinetic folding and unfolding (and linear chevron plots) in the absence of the inserted misfolding barrier noted. Unfortunately, the absence of explicit evidence for multiple kinetic steps is often taken, incorrectly, as evidence for their absence. Even so, the observation of the same folding rate for the whole protein population tends to favor a single common pathway rather than multiple independent paths.
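The two-state relations referred to here can be written out explicitly. The following is a minimal sketch of standard two-state kinetics, not an analysis from the cited work. The rate constants are hypothetical values chosen for illustration, and the function names are likewise invented for this example.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def two_state_summary(k_fold, k_unfold, temp_k=298.15):
    """Return (observed relaxation rate, equilibrium constant, unfolding free energy)."""
    k_obs = k_fold + k_unfold            # the single observable exponential rate
    k_eq = k_fold / k_unfold             # [native]/[unfolded] at equilibrium
    dg_unfold = R_KCAL * temp_k * math.log(k_eq)  # stability in kcal/mol
    return k_obs, k_eq, dg_unfold

def fraction_native(t, k_fold, k_unfold):
    """Native fraction at time t for a population that starts fully unfolded."""
    k_obs = k_fold + k_unfold
    return (k_fold / k_obs) * (1.0 - math.exp(-k_obs * t))

# Hypothetical rates: folding at 1000 per second, unfolding at 0.001 per second.
k_obs, k_eq, dg = two_state_summary(1000.0, 1e-3)
print(f"k_obs = {k_obs:.3f} s^-1, K = {k_eq:.3g}, dG_unfold = {dg:.2f} kcal/mol")
```

With these hypothetical rates the single observed relaxation rate is essentially the folding rate, and the implied stability is about 8.2 kcal/mol. Nothing in these expressions reports on intermediates that lie before or after the rate-limiting barrier, which is why two-state kinetics alone cannot decide the pathway question.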
Thus, much of available kinetic information is unable to distinguish alternative pathway behaviors, although some observations can be deemed supportive of the distinct pathway model.
Multiple Pathways and Misfolding.
A small number of other optically measured kinetic results have been thought to support multiple pathways. The conflict is often due to the chance occurrence of partial misfolding, which inserts an optional kinetic barrier into the folding pathway, differently affecting the folding of different population fractions (37). In this case kinetic folding will appear heterogeneous and asynchronous, even when all of the molecules fold through the same sequence of intermediate structures. This barrier-based problem is common and has greatly confused protein folding studies. Known optional errors include aggregation (26), partial proline mis-isomerization (38), incorrect disulfide pairing (39), nonnative hydrophobic clustering (40), and partial heme mis-ligation (24). (Note: The term “misfolding” has become associated with amyloid formation; we use it in a more general sense.)
In a prime example, folding experiments on the large TIM barrel protein α-Trp synthase found several kinetically distinct population fractions and intermediates, suggesting four parallel folding tracks (41). Subsequent work found that each additional track could be suppressed, one at a time, by mutational replacement of one or more proline residues or by addition of a prolyl isomerase (42), as expected for a defined pathway interrupted in some fraction of the folding population by optional mis-isomerized proline barriers. In similar work, multiple kinetic folding phases observed by optical methods for hen egg lysozyme (43, 44) and Staphylococcal nuclease (45) were also fit by the authors to multiple pathway models, but it was shown that the data can be fit at least as well by a single pathway in which some fraction of the molecules experiences an error that slows their folding (37, 46). In the absence of structural information, it is not possible to distinguish between a multiple pathway interpretation and a given pathway with optional barriers. This can be seen intuitively by considering the insertion of an optional barrier at any step in a well-defined pathway as in Fig. 1A (dashed line).
The common occurrence of on-pathway optional errors has led to other incorrect suggestions: that well-populated kinetic intermediates are grossly misfolded artifacts rather than constructive on-pathway structures with some particular misfolding error; that visible intermediates hinder rather than promote folding because visible intermediates and slowed folding occur together. Other literature results have been interpreted in terms of multiple pathways, either during unfolding at conditions far from native, or during folding but potentially confounded by ensemble averaging, or by complex spectroscopic phases that allow different interpretations, as well as by spurious barriers due to optional errors. In all of these cases, the structural information that is necessary to support a definitive conclusion is absent.
More definitive information comes from the kinetic HX MS experiments illustrated above, which do document a distinct pathway, and from a number of equilibrium-based methods described in the following, which have been able to reveal multiple native-like partially folded on-pathway intermediates, even when simple folding seems to be kinetically two-state.
Intermediates Observed at Equilibrium
Intermediates as Conformationally Excited States.
An experiment called equilibrium native state HX, explained in Fig. 5, first detected and described cooperative foldon units (2, 47). The experiment uses low concentrations of denaturant (or other destabilant) to promote sizeable unfolding reactions to the point where they come to dominate the H-exchange of the amides that they expose. The results, reproduced in Fig. 5A, showed that specific structural elements of Cyt c (Fig. 6) repeatedly unfold and refold, accessing partially unfolded high energy states with ΔG° of 4–13 kcal/mol above the native state, corresponding to steady-state populations between 10⁻³ and 10⁻⁹ of the total protein. These results identify foldon unfolding units in terms of their detailed residue composition, specify the free energy of the partially unfolded states relative to the native state, and can measure unfolding and refolding rates.
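The link between these free energies and the quoted populations is the Boltzmann relation, population ≈ exp(−ΔG/RT) relative to the native state. The sketch below evaluates it at the two ends of the quoted range. It assumes a temperature of 30 °C, the condition given later in the text for these experiments, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def relative_population(dg_kcal_per_mol, temp_c=30.0):
    """Equilibrium population of a state lying dg above the native state,
    expressed relative to the native population."""
    temp_k = temp_c + 273.15
    return math.exp(-dg_kcal_per_mol / (R_KCAL * temp_k))

pop_low = relative_population(4.0)    # lowest-lying partially unfolded state
pop_high = relative_population(13.0)  # highest-lying partially unfolded state

print(f" 4 kcal/mol: {pop_low:.1e}")
print(f"13 kcal/mol: {pop_high:.1e}")
```

At 30 °C the calculation gives roughly 1 × 10⁻³ for 4 kcal/mol and a few times 10⁻¹⁰ for 13 kcal/mol, within about an order of magnitude of the range quoted above. States this sparsely populated are invisible to most methods, which is why their detection by HX is notable.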
Fig. 5.
Initial equilibrium native state HX NMR results for Cyt c (47). (A–D) HX rates of many individual Cyt c residues, measured by NMR as a function of low levels of added denaturant far below the melting transition, are plotted in terms of the free energy of the exposure reaction that determines each amide HX rate. HX governed by a small local fluctuation is insensitive to denaturant and produces a horizontal curve. HX determined by a large unfolding reaction is sharply promoted by denaturant and can come to dominate the exchange of the residues that it exposes. The residues that join each cooperative unfolding (large slope) specify the identity of that unfolding unit. The intercept of each HX isotherm defines the free energy level of each PUF at zero denaturant; the slope relates to its surface exposure. These data identified four large unfolding units (foldons), coded as blue, green, yellow, and red. The less definitive red foldon and the infrared foldon not seen here (gray in Fig. 6) were better defined in later work. (E) The free energy levels of the PUFs produced by the individual cooperative unfolding reactions place them on a free energy ladder. The data in A–D specify the identity of each foldon unfolding unit but do not specify the complete PUF produced by each unfolding. Therefore, one cannot tell whether the foldons unfold independently or sequentially or in some other manner. A series of stability labeling experiments defined the PUFs shown (far right). They constitute a stepwise unfolding and refolding pathway, as in Fig. 1A.
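The shape of the HX isotherms described in this caption can be reproduced with a simple two-process model: exchange occurs either through a denaturant-insensitive local fluctuation or through a cooperative unfolding whose free energy falls linearly with denaturant. This is a sketch of that commonly used model under stated assumptions, not the authors' fitting procedure. All parameter values are hypothetical, and the temperature is taken as 30 °C.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def dg_hx(denaturant_molar, dg_local, dg_unfold, m_value, temp_k=303.15):
    """Apparent HX free energy (kcal/mol) for one amide.

    dg_local  : free energy of the local fluctuation (denaturant independent)
    dg_unfold : free energy of the cooperative unfolding at zero denaturant
    m_value   : denaturant dependence of the unfolding, kcal/(mol*M)
    """
    rt = R_KCAL * temp_k
    k_local = math.exp(-dg_local / rt)
    k_unfold = math.exp(-(dg_unfold - m_value * denaturant_molar) / rt)
    # The two opening reactions add; the more populated one dominates.
    return -rt * math.log(k_local + k_unfold)

# Hypothetical amide: 6 kcal/mol local opening, 10 kcal/mol foldon unfolding, m = 3.
curve = [(d, dg_hx(d, 6.0, 10.0, 3.0)) for d in (0.0, 0.5, 1.0, 1.5, 2.0)]
for conc, dg in curve:
    print(f"{conc:.1f} M: {dg:.2f} kcal/mol")
```

At low denaturant the curve is flat near the local-fluctuation value. Once the unfolding free energy drops below it, the curve bends and follows the sloped line, so extrapolating the sloped portion to zero denaturant gives the free energy of the unfolded form and the slope reports its denaturant sensitivity.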
Fig. 6.
The foldon construction of Ribonuclease H and Cyt c. The order of folding is blue, green, yellow, and red, and finally gray for the large bottom Cyt c loop.
However, these results do not fully identify the different partially unfolded forms (PUFs). At each intermediate state, sites that have already exchanged to D are invisible (NMR); one cannot tell whether they are structured or not in the given intermediate. Therefore, one cannot tell whether the different foldons simply unfold independently or in a pathway sequence, as posed in Fig. 5E. This is unlike the kinetic HX MS experiments in Figs. 3 and 4, where the pulse labeling approach directly provides a snapshot of the folded condition of all of the residues during the folding process. Ultimately, a series of “stability labeling” experiments showed that the high energy states seen for Cyt c do represent a quantized stepwise series of progressively more unfolded PUFs, as pictured in Fig. 5E (48). In the unfolding direction, the transition to each higher energy PUF unfolds one more foldon in a sequential pathway manner. Because these experiments were done under equilibrium native conditions (pD 7, 30 °C), each uphill unfolding step must be matched by an equivalent refolding step. The downhill sequence defines a stepwise sequential folding pathway.
In detailed confirmation, these equilibrium results identified the same N/C bihelical foldon (blue) as did the kinetic pulse labeling experiment. The pulse labeling experiment places this state as first in the folding sequence; the native state HX experiment places it as last in the unfolding sequence. The initially folded N/C bihelical PUF accumulates in Cyt c kinetic folding when it encounters a histidine to heme mis-ligation barrier; both peripheral histidines of Cyt c are placed on and therefore block formation of the green foldon segment, which is programmed to fold next. An independent kinetic mode native state HX experiment showed that the various foldons unfold in the kinetic order shown in the rising ladder in Fig. 5E. The unfolding rate for the first unfolding step (by EX1 HX) accurately matches the independently measured Cyt c global unfolding rate in two-state unfolding conditions (49).
Distinct native-like pathway intermediates have been found for other proteins by HX pulse labeling, native state HX, and non-HX methods. Silverman and Harbury (50) designed a proteomics method to measure the reactivity of 25 cysteine SH side chains in triose phosphate isomerase, analogous to the native state HX experiment. The experiment identified three partial unfolding reactions, and stability labeling experiments showed that they stack up in an unfolding/refolding sequence, as for Cyt c. Sekhar and Kay (51) used NMR relaxation dispersion to identify individual partially unfolded forms in several small, supposedly two-state proteins, and supported their role as defined folding pathway intermediates.
All of these results are fully consistent with a classical folding pathway with each intermediate PUF separated from its neighbors by the folding or unfolding of one more foldon.
Molten Globules as Lowest Energy Folding Intermediates.
In his 1995 new view paper (21), Baldwin considered the then-current status of the molten globule hypothesis. Earlier thinking shaped by protein denaturation studies had supposed that proteins are highly cooperative two-state structures and can only occupy, at equilibrium, either their fully native or fully unfolded condition (although see ref. 52). Ptitsyn and coworkers and others found that certain destabilizing conditions, especially low pH, could induce a new, more dynamic, and somewhat expanded protein form, with loose tertiary structure but often considerable secondary structure, which came to be called the molten globule (53–55). Ptitsyn suggested that molten globules represent equilibrium analogs of kinetic folding intermediates. In some cases, HX NMR connected the secondary structure with native-like helical elements, consistent with the Ptitsyn hypothesis. However, Baldwin (21) compared this proposal with expectations from classical and theoretical models, and again here ambiguity prevailed. For example, the pH 4 equilibrium molten globule of apomyoglobin contains the very same native-like A, G, and H helices found by HX pulse labeling during kinetic folding, in line with the Ptitsyn hypothesis, but a Cyt c molten globule contains all three of its native helical segments and not just the N/C bihelical intermediate observed in kinetic HX pulse labeling.
Later work connects the enigmatic structural character of molten globules with the foldon construction of native proteins. A number of partially structured proteins have now been prepared by synthesis or mutation, whether guided by foreknowledge of kinetic folding intermediates or not (56). They mimic natively structured pieces of the native protein. Most incisively, Bai and coworkers (57) used native-state HX to define partially unfolded intermediates of apoCyt b562, and then inserted mutations that selectively destabilize the native state. In the present terms, one or more of the less stable foldons, on the lower rungs of the energy state ladder (as in Fig. 5E), were made to remain unfolded, which caused some higher energy partly unfolded state to become the dominant lowest energy form. Feng et al. solved the structures of two re-engineered versions by NMR and found that both are close mimics of the folding intermediates indicated by native state HX. They are both partially folded and clearly native-like, although they energy minimize to produce some nonnative distortions that shield otherwise exposed hydrophobic side chains.
These results provide a clear picture of the structure of an authentic folding intermediate. They also explain the molten globule ambiguities described in Baldwin’s new view article (21) and elsewhere. A molten globule may emulate, as a free-standing equilibrium species, any one of the quantized intermediate PUFs seen by the kinetic and equilibrium methods just described, depending on how lower energy states are destabilized. Evidently the foldon concept has broad applicability for understanding the range of protein structures.
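The energy-ladder argument can be made concrete with a small Boltzmann calculation. The sketch below is illustrative only: the state names, the free energy values, the size of the mutational penalty, and the temperature are all hypothetical choices and are not taken from the apoCyt b562 work. It shows how penalizing only the states that contain a destabilized foldon can turn a partially unfolded form into the dominant equilibrium species.

```python
import math

RT = 1.987e-3 * 298.0  # kcal/mol at 298 K (temperature is an assumption)

# Hypothetical free energies (kcal/mol) relative to the native state N.
# Each rung up the ladder has one more foldon unfolded.
LADDER = {"N": 0.0, "PUF1": 4.0, "PUF2": 7.0, "U": 10.0}


def populations(free_energies):
    """Boltzmann populations for a dict of state -> free energy (kcal/mol)."""
    weights = {s: math.exp(-g / RT) for s, g in free_energies.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


def destabilize(free_energies, affected_states, penalty):
    """Raise the free energy of states that still contain the mutated foldon."""
    return {s: g + (penalty if s in affected_states else 0.0)
            for s, g in free_energies.items()}


wild_type = populations(LADDER)
# A mutation in the least stable foldon penalizes only N, the one state
# in which that foldon is folded (6 kcal/mol is an illustrative value).
mutant = populations(destabilize(LADDER, {"N"}, 6.0))

print(max(wild_type, key=wild_type.get))  # N
print(max(mutant, key=mutant.get))        # PUF1
```

With these assumed numbers the mutant ensemble is dominated by the first partially unfolded form, which is the sense in which a molten globule can emulate one rung of the ladder.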
Protein Folding in Silico
The ability to simulate protein folding has been hampered by the immense computer power necessary, by imperfect force fields, and by the difficulty of discerning a meaningful course of events (reaction coordinate) within the vast data files generated. Until recently, most efforts attempted to evade the computational problems by using simplified nonphysical force fields and models. These efforts have not found cooperative foldons or discrete foldon-dependent pathways.
In one exception, Weinkam et al. (58) simulated the folding course of a Cyt c mimic without side chains using a modified Gō model. The computer was initially told what the target native structure looks like, and the calculation was instructed to assign more favorable energy as the mock residues draw closer to their normal partners. In addition, a multiatom cooperativity term was added, and outsized influence was given to the heme. The presumed shape and properties of the folding landscape did not enter the calculation except for the energetically downhill tendency. These instructions caused foldon units to emerge and associate to produce a stepwise folding pathway resembling that seen in the Cyt c experiments. This success was considered to show that the experimental Cyt c result depends especially on the influence of the heme group, but other proteins with no prosthetic group are now known to fold through distinct intermediates and pathways. The significance of the Cyt c calculation is that it helps to identify the factors that determine the foldon-based behavior. As for any mathematical derivation, the factors that determine the output result must be coded into the initial premises. In the mock Cyt c simulation, the important factor seems to be the added cooperativity term, as emphasized in the foldon hypothesis.
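For readers unfamiliar with Gō-type models, the following toy sketch shows the general idea of rewarding native contacts and adding a nonadditive cooperativity term. It is not the model of Weinkam et al.: the step-function contact definition, the cutoff, the form of the cooperative term, the parameter values, and the six-bead hairpin are all assumptions made for illustration.

```python
import math
from itertools import combinations


def native_contacts(native_coords, cutoff=6.5, min_separation=3):
    """Residue pairs (i, j) that lie within `cutoff` in the target structure."""
    n = len(native_coords)
    return [(i, j) for i, j in combinations(range(n), 2)
            if j - i >= min_separation
            and math.dist(native_coords[i], native_coords[j]) <= cutoff]


def go_energy(coords, contacts, cutoff=6.5, epsilon=1.0, coop=0.5):
    """Toy Go-type energy: only contacts present in the native structure count."""
    n_formed = sum(math.dist(coords[i], coords[j]) <= cutoff
                   for i, j in contacts)
    pairwise = -epsilon * n_formed
    # Assumed nonadditive term: grows with the number of pairs of formed
    # contacts, so contacts stabilize one another (cooperativity).
    cooperative = (-coop * epsilon * n_formed * (n_formed - 1) / 2
                   / max(len(contacts), 1))
    return pairwise + cooperative


# Hypothetical six-bead hairpin used as the "native" target.
hairpin = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
           (7.6, 5.0, 0.0), (3.8, 5.0, 0.0), (0.0, 5.0, 0.0)]
extended = [(3.8 * i, 0.0, 0.0) for i in range(6)]

contacts = native_contacts(hairpin)
e_native = go_energy(hairpin, contacts)
e_extended = go_energy(extended, contacts)
```

In this sketch the native hairpin scores lower in energy than the extended chain, and the cooperative term makes the last contacts formed worth more than the first, which is the qualitative ingredient the text singles out.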
A new generation of theoretical analysis with real proteins in realistic force fields and enhanced computer capabilities is overcoming the calculational difficulties in other ways (59–61). Accumulating results from these approaches tend to reproduce the foldon-based distinct pathway picture.
Discussion
This article considers the fundamental questions of protein folding, previously answered so differently by the classical and new view models. How do proteins fold, and why do they fold in that way? Extensive experience with the folding problem over a 50-y period has shown that clear structural information on the intermediate states that bridge between the unfolded and native states will be required. Experimentation has developed three useful approaches. Folding intermediates can be studied as significantly populated forms during kinetic folding, or as conformationally excited forms present at equilibrium under native conditions, or as equilibrium molten globule forms. Structural results from these different approaches converge on the same conclusions.
The Foldon Hypothesis.
In all of these observations, cooperative foldon units play a pivotal role. Foldon units were first discovered and characterized in the initial native state HX experiment (2, 47). The experiment showed that native Cyt c at equilibrium under native conditions repeatedly unfolds and refolds. A series of experiments showed that the foldon unfolding reactions occur in a sequential pathway-like manner (48) rather than independently (Fig. 5). That chain of research was rather complex; it developed over a period of years and has evidently been difficult for most investigators to follow. However, reversible partial unfolding and refolding steps have now been seen in various ways for many proteins, and they have often been connected to the protein folding process. Most pointedly, a recently advanced HX MS capability made it possible to observe matching behavior as it occurs during kinetic folding for MBP (29), RNase H (30), and Cyt c, as just described. In all cases one sees that unfolding and refolding proceed in steps that subtract or add one more native-like cooperative foldon unit. The detailed foldon construction of Cyt c and RNase H is illustrated in Fig. 6. Both fold by first forming their blue foldon, then an immediately adjacent foldon to form the blue + green PUF, and so on.
The centrally important point is this: contrary to previous belief, proteins are multistate objects built from separately cooperative foldon units. This fundamental insight leads to a foldon-based hypothesis that suggests the “how” and the “why” of protein folding. The cooperative foldon construction of proteins predisposes them to unfold and refold through foldon-determined steps. The discrete steps produce an ordered repeatable macroscopic folding pathway because previously formed foldons tend to guide and stabilize the formation of incoming foldons that they are designed to interact with in the native protein.
Time and Energy.
A successful folding model must resolve major questions concerning folding time and energy. Levinthal pointed out that the vast array of protein conformations in unfolded space cannot simply reequilibrate and reach the unique native state by an undirected random search in any reasonable time (3, 4). Early theoretical work therefore focused on the downhill energetic drive and the many independent routes that heterogeneity and microscopic thermal searching alone seemed to require. The new view answer to the “why” question is that, from the microscopic point of view, there seems to be no other viable choice.
Experimental work recounted here reveals an emergent macroscopic behavior that provides a previously unrecognized mechanism. Random search does not have to carry the protein all of the way to the native state. It only needs to accomplish the formation of a first native-like foldon. This process is thermodynamically downhill and is guided by the inherent cooperativity of native foldon units. Present information indicates that the first-formed foldon tends to be stable in the context of the rest of the protein (27, 62). The still-unfolded regions can shield and energy minimize unfavorably exposed groups, as in the molten globule situation described before. The time scale for forming a first foldon unit by an unguided search, perhaps two segments ∼20 residues in length, is shorter by far than for a reference 100-residue protein [$3^{100}/(2 \times 3^{20}) \sim 10^{40}$]. The formation of subsequent foldons must proceed by way of similar microscopic searching but in a more guided way analogous to the process of “folding upon binding.” The concept that proteins start folding by forming a native-like structural nucleus has been widely accepted (33). This minimal structure can be sufficient to seed subsequent foldon–foldon interaction steps in a sequence of more guided searches that follow through, rapidly, to the native target.
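The bracketed comparison can be checked with a few lines of arithmetic. The sketch below assumes the usual Levinthal-style counting of three conformations per residue and treats the first foldon as two independent 20-residue segments; the function and variable names are illustrative.

```python
import math

STATES_PER_RESIDUE = 3  # Levinthal-style assumption


def log10_conformations(n_residues, states=STATES_PER_RESIDUE):
    """log10 of the number of conformations for an n-residue segment."""
    return n_residues * math.log10(states)


# Whole 100-residue protein: 3**100 conformations.
whole_protein = log10_conformations(100)

# First foldon as two separate 20-residue searches: 2 * 3**20.
first_foldon = math.log10(2) + log10_conformations(20)

# Orders of magnitude by which the foldon search space is smaller.
log10_ratio = whole_protein - first_foldon
print(f"search space reduced by about 10^{log10_ratio:.0f}")
```

Under these assumptions the reduction comes out near 38 orders of magnitude, in the same range as the rough estimate quoted in the text.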
Does this process have the energetic bias necessary to select specific folding steps and drive folding to completion in a short time? Zwanzig et al. (63) calculated that a free energy bias of 2 kT toward correct interactions is necessary for a folding sequence to complete on a time scale of seconds. It should be appreciated that this degree of bias, more than 1 kcal/mol, is unreasonable at the individual residue level. A single residue has very low probability for finding its correct native partners in a sea of nonnative alternatives. Certainly, microscopic thermal searching must underlie any structure formation process. However, given the required energy bias computed by Zwanzig et al., it seems that microscopic-level searching alone cannot swiftly reach the native state.
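Two numbers in this argument can be sketched numerically. The first is the conversion of 2 kT into kcal/mol at an assumed temperature of 298 K. The second uses a simplified biased-search estimate of the form τ ≈ (1 + K)^N / (N k0) with K = ν exp(−U/kT); this form and the parameter values (N = 100 bonds, ν = 2 incorrect states per bond, k0 = 10^9 per second) are assumptions chosen for illustration and should be checked against ref. 63 before being relied on.

```python
import math

R_KCAL = 1.987e-3  # gas constant, kcal/(mol K)


def bias_kcal_per_mol(n_kt, temperature_k=298.0):
    """Convert a bias expressed in units of kT into kcal/mol."""
    return n_kt * R_KCAL * temperature_k


def biased_search_time_s(n_bonds, wrong_states, bias_kt, k0_per_s):
    """Assumed simplified estimate: tau ~ (1 + K)**N / (N * k0)."""
    k_eq = wrong_states * math.exp(-bias_kt)  # K = nu * exp(-U/kT)
    return (1.0 + k_eq) ** n_bonds / (n_bonds * k0_per_s)


two_kt = bias_kcal_per_mol(2.0)                       # about 1.2 kcal/mol
unbiased = biased_search_time_s(100, 2, 0.0, 1.0e9)   # astronomically long
biased = biased_search_time_s(100, 2, 2.0, 1.0e9)     # roughly seconds or less

print(f"2 kT = {two_kt:.2f} kcal/mol")
print(f"unbiased search: {unbiased:.1e} s, biased search: {biased:.1e} s")
```

With these assumed parameters, a bias of 2 kT per step shortens the search from an astronomically long time to a fraction of a second, consistent with the qualitative point made above.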
By contrast, in a more macroscopic foldon-based scenario each correct native-like choice is driven by the collective energy of many interaction sites held stereochemically in a native-like geometry in partner foldons. This mechanism has been described before as sequential stabilization (48). It is analogous to the well-known folding upon binding process, except that here the incoming disordered segment is advantageously tethered to its already structured partner. The macroscopic foldon-level factors provide both the qualitative structural basis and the quantitative energetic bias required to rapidly and repeatably select discrete determinate pathway steps in competition with all of the other possible alternatives.
Conclusions
The supposed conflict between the classical and new views can be resolved by the realization that they touch on different but equally essential parts of the folding mechanism. Laboratory experiment is able to discern macroscopic molecular behavior, but it is blind to the microscopic thermally driven amino acid-level searching behavior that has been the domain of theoretical analysis. The disordered microscopic multitrack search envisioned in the paradigmatic new view model describes the initial stage amino acid-level search to form cooperative native-like foldon structures, but not the final native state. Experiment displays an emergent foldon-based macroscopic behavior that provides the structural guidance and free energy bias for the ordered stepwise formation of discrete native-like intermediates in a folding pathway that leads to the native state.
Folding in moderately small, separately cooperative units may be necessary for proteins to fold at all. A much larger step size would confront the Levinthal time scale problem; much smaller steps cannot assemble the energy bias required by the Zwanzig criterion for fast folding. Thus, as before for the microscopic view, it may be that there is no other viable choice. Efficient folding may well require foldon-based protein folding pathways. However, here a related constraint enters. Because the essential folding intermediates closely duplicate native structure, as perhaps they must in a reasonable pathway sequence, it seems that the same requirement has reciprocally shaped the foldon-based nature of native protein structure. In respect to foldon-based folding and foldon-based native structure, it seems that each necessitates the other, and that protein-based biology may require both.
Acknowledgments
We thank K. A. Dill, M. F. Gellert, S. Lund-Katz, V. S. Pande, M. C. Phillips, G. D. Rose, T. R. Sosnick, A. Szabo, A. J. Wand, and members of our laboratory for helpful contributions. This work was supported by National Institutes of Health Grant R01 GM031847, National Science Foundation Grant MCB1020649, and the Mathers Charitable Foundation.
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
References
- 1. Bai Y, Englander JJ, Mayne L, Milne JS, Englander SW. Thermodynamic parameters from hydrogen exchange measurements. Methods Enzymol. 1995;259:344–356. doi: 10.1016/0076-6879(95)59051-x.
- 2. Bai Y, Englander SW. Future directions in folding: The multi-state nature of protein structure. Proteins. 1996;24(2):145–151. doi: 10.1002/(SICI)1097-0134(199602)24:2<145::AID-PROT1>3.0.CO;2-I.
- 3. Levinthal C. Are there pathways for protein folding. J Chim Phys. 1968;65:44–45.
- 4. Levinthal C. Mossbauer Spectroscopy in Biological Systems. Proceedings University of Illinois Bulletin. University of Illinois Press; Urbana, IL: 1969. How to fold graciously; pp. 22–24.
- 5. Luheshi LM, Crowther DC, Dobson CM. Protein misfolding and disease: From the test tube to the organism. Curr Opin Chem Biol. 2008;12(1):25–31. doi: 10.1016/j.cbpa.2008.02.011.
- 6. Anfinsen CB, Haber E, Sela M, White FH., Jr The kinetics of formation of native ribonuclease during oxidation of the reduced polypeptide chain. Proc Natl Acad Sci USA. 1961;47:1309–1314. doi: 10.1073/pnas.47.9.1309.
- 7. Sali A, Shakhnovich E, Karplus M. How does a protein fold? Nature. 1994;369(6477):248–251. doi: 10.1038/369248a0.
- 8. Bryngelson JD, Onuchic JN, Socci ND, Wolynes PG. Funnels, pathways, and the energy landscape of protein folding: A synthesis. Proteins. 1995;21(3):167–195. doi: 10.1002/prot.340210302.
- 9. Dill KA, Chan HS. From Levinthal to pathways to funnels. Nat Struct Biol. 1997;4(1):10–19. doi: 10.1038/nsb0197-10.
- 10. Plotkin SS, Onuchic JN. Understanding protein folding with energy landscape theory. Part I: Basic concepts. Q Rev Biophys. 2002;35(2):111–167. doi: 10.1017/s0033583502003761.
- 11. Kussell E, Shimada J, Shakhnovich EI. Side-chain dynamics and protein folding. Proteins. 2003;52(2):303–321. doi: 10.1002/prot.10426.
- 12. Leopold PE, Montal M, Onuchic JN. Protein folding funnels: A kinetic approach to the sequence-structure relationship. Proc Natl Acad Sci USA. 1992;89(18):8721–8725. doi: 10.1073/pnas.89.18.8721.
- 13. Wolynes PG, Onuchic JN, Thirumalai D. Navigating the folding routes. Science. 1995;267(5204):1619–1620. doi: 10.1126/science.7886447.
- 14. Oliveberg M, Wolynes PG. The experimental survey of protein-folding energy landscapes. Q Rev Biophys. 2005;38(3):245–288. doi: 10.1017/S0033583506004185.
- 15. Onuchic JN, Luthey-Schulten Z, Wolynes PG. Theory of protein folding: The energy landscape perspective. Annu Rev Phys Chem. 1997;48:545–600. doi: 10.1146/annurev.physchem.48.1.545.
- 16. Onuchic JN, Nymeyer H, García AE, Chahine J, Socci ND. The energy landscape theory of protein folding: Insights into folding mechanisms and scenarios. Adv Protein Chem. 2000;53:87–152. doi: 10.1016/s0065-3233(00)53003-4.
- 17. Plotkin SS, Onuchic JN. Understanding protein folding with energy landscape theory. Part II: Quantitative aspects. Q Rev Biophys. 2002;35(3):205–286. doi: 10.1017/s0033583502003785.
- 18. Kim PS, Baldwin RL. Specific intermediates in the folding reactions of small proteins and the mechanism of protein folding. Annu Rev Biochem. 1982;51:459–489. doi: 10.1146/annurev.bi.51.070182.002331.
- 19. Kim PS, Baldwin RL. Intermediates in the folding reactions of small proteins. Annu Rev Biochem. 1990;59:631–660. doi: 10.1146/annurev.bi.59.070190.003215.
- 20. Baldwin RL. The search for folding intermediates and the mechanism of protein folding. Annu Rev Biophys. 2008;37:1–21. doi: 10.1146/annurev.biophys.37.032807.125948.
- 21. Baldwin RL. The nature of protein folding pathways: the classical versus the new view. J Biomol NMR. 1995;5(2):103–109. doi: 10.1007/BF00208801.
- 22. Withers P. Landing spacecraft on Mars and other planets: An opportunity to apply introductory physics. Am J Phys. 2013;81:565–569.
- 23. Udgaonkar JB, Baldwin RL. NMR evidence for an early framework intermediate on the folding pathway of ribonuclease A. Nature. 1988;335(6192):694–699. doi: 10.1038/335694a0.
- 24. Roder H, Elöve GA, Englander SW. Structural characterization of folding intermediates in cytochrome c by H-exchange labelling and proton NMR. Nature. 1988;335(6192):700–704. doi: 10.1038/335700a0.
- 25. Nawrocki JP, Chu R-A, Pannell LK, Bai Y. Intermolecular aggregations are responsible for the slow kinetics observed in the folding of cytochrome c at neutral pH. J Mol Biol. 1999;293(5):991–995. doi: 10.1006/jmbi.1999.3226.
- 26. Silow M, Oliveberg M. Transient aggregates in protein folding are easily mistaken for folding intermediates. Proc Natl Acad Sci USA. 1997;94(12):6084–6086. doi: 10.1073/pnas.94.12.6084.
- 27. Krishna MM, Lin Y, Mayne L, Englander SW. Intimate view of a kinetic protein folding intermediate: Residue-resolved structure, interactions, stability, folding and unfolding rates, homogeneity. J Mol Biol. 2003;334(3):501–513. doi: 10.1016/j.jmb.2003.09.070.
- 28. Krishna MMG, Hoang L, Lin Y, Englander SW. Hydrogen exchange methods to study protein folding. Methods. 2004;34(1):51–64. doi: 10.1016/j.ymeth.2004.03.005.
- 29. Walters BT, Mayne L, Hinshaw JR, Sosnick TR, Englander SW. Folding of a large protein at high structural resolution. Proc Natl Acad Sci USA. 2013;110(47):18898–18903. doi: 10.1073/pnas.1319482110.
- 30. Hu W, et al. Stepwise protein folding at near amino acid resolution by hydrogen exchange and mass spectrometry. Proc Natl Acad Sci USA. 2013;110(19):7684–7689. doi: 10.1073/pnas.1305887110.
- 31. Krishna MM, Maity H, Rumbley JN, Englander SW. Branching in the sequential folding pathway of cytochrome c. Protein Sci. 2007;16(9):1946–1956. doi: 10.1110/ps.072922307.
- 32. Sosnick TR, Mayne L, Englander SW. Molecular collapse: The rate-limiting step in two-state cytochrome c folding. Proteins. 1996;24(4):413–426. doi: 10.1002/(SICI)1097-0134(199604)24:4<413::AID-PROT1>3.0.CO;2-F.
- 33. Fersht AR. Nucleation mechanisms in protein folding. Curr Opin Struct Biol. 1997;7(1):3–9. doi: 10.1016/s0959-440x(97)80002-4.
- 34. Fersht AR, Sato S. Phi-value analysis and the nature of protein-folding transition states. Proc Natl Acad Sci USA. 2004;101(21):7976–7981. doi: 10.1073/pnas.0402684101.
- 35. Naganathan AN, Muñoz V. Insights into protein folding mechanisms from large scale analysis of mutational effects. Proc Natl Acad Sci USA. 2010;107(19):8611–8616. doi: 10.1073/pnas.1000988107.
- 36. Pandit AD, Krantz BA, Dothager RS, Sosnick TR. Characterizing protein folding transition States using Psi-analysis. Methods Mol Biol. 2007;350:83–104. doi: 10.1385/1-59745-189-4:83.
- 37. Krishna MM, Englander SW. A unified mechanism for protein folding: predetermined pathways with optional errors. Protein Sci. 2007;16(3):449–464. doi: 10.1110/ps.062655907.
- 38. Wedemeyer WJ, Welker E, Scheraga HA. Proline cis-trans isomerization and protein folding. Biochemistry. 2002;41(50):14637–14644. doi: 10.1021/bi020574b.
- 39. Song MC, Scheraga HA. Formation of native structure by intermolecular thiol-disulfide exchange reactions without oxidants in the folding of bovine pancreatic ribonuclease A. FEBS Lett. 2000;471(2-3):177–181. doi: 10.1016/s0014-5793(00)01386-7.
- 40. Klein-Seetharaman J, et al. Long-range interactions within a nonnative protein. Science. 2002;295(5560):1719–1722. doi: 10.1126/science.1067680.
- 41. Bilsel O, Zitzewitz JA, Bowers KE, Matthews CR. Folding mechanism of the alpha-subunit of tryptophan synthase, an alpha/beta barrel protein: Global analysis highlights the interconversion of multiple native, intermediate, and unfolded forms through parallel channels. Biochemistry. 1999;38(3):1018–1029. doi: 10.1021/bi982365q.
- 42. Wu Y, Matthews CR. Proline replacements and the simplification of the complex, parallel channel folding mechanism for the alpha subunit of Trp synthase, a TIM barrel protein. J Mol Biol. 2003;330(5):1131–1144. doi: 10.1016/s0022-2836(03)00723-x.
- 43. Bieri O, Wildegger G, Bachmann A, Wagner C, Kiefhaber T. A salt-induced kinetic intermediate is on a new parallel pathway of lysozyme folding. Biochemistry. 1999;38(38):12460–12470. doi: 10.1021/bi9909703.
- 44. Radford SE, Dobson CM, Evans PA. The folding of hen lysozyme involves partially structured intermediates and multiple pathways. Nature. 1992;358(6384):302–307. doi: 10.1038/358302a0.
- 45. Kamagata K, Sawano Y, Tanokura M, Kuwajima K. Multiple parallel-pathway folding of proline-free Staphylococcal nuclease. J Mol Biol. 2003;332(5):1143–1153. doi: 10.1016/j.jmb.2003.07.002.
- 46. Bédard S, Krishna MM, Mayne L, Englander SW. Protein folding: Independent unrelated pathways or predetermined pathway with optional errors. Proc Natl Acad Sci USA. 2008;105(20):7182–7187. doi: 10.1073/pnas.0801864105.
- 47. Bai Y, Sosnick TR, Mayne L, Englander SW. Protein folding intermediates: Native-state hydrogen exchange. Science. 1995;269(5221):192–197. doi: 10.1126/science.7618079.
- 48. Englander SW, Mayne L, Krishna MM. Protein folding and misfolding: Mechanism and principles. Q Rev Biophys. 2007;40(4):287–326. doi: 10.1017/S0033583508004654.
- 49. Hoang L, Bedard S, Krishna MM, Lin Y, Englander SW. Cytochrome c folding pathway: Kinetic native-state hydrogen exchange. Proc Natl Acad Sci USA. 2002;99(19):12173–12178. doi: 10.1073/pnas.152439199.
- 50. Silverman JA, Harbury PB. The equilibrium unfolding pathway of a (beta/alpha)8 barrel. J Mol Biol. 2002;324(5):1031–1040. doi: 10.1016/s0022-2836(02)01100-2.
- 51. Sekhar A, Kay LE. NMR paves the way for atomic level descriptions of sparsely populated, transiently formed biomolecular conformers. Proc Natl Acad Sci USA. 2013;110(32):12867–12874. doi: 10.1073/pnas.1305688110.
- 52. Mayne L, Englander SW. Two-state vs. multistate protein unfolding studied by optical melting and hydrogen exchange. Protein Sci. 2000;9(10):1873–1877. doi: 10.1110/ps.9.10.1873.
- 53. Ptitsyn OB. How the molten globule became. Trends Biochem Sci. 1995;20(9):376–379. doi: 10.1016/s0968-0004(00)89081-7.
- 54. Ptitsyn OB. Molten globule and protein folding. Adv Protein Chem. 1995;47:83–229. doi: 10.1016/s0065-3233(08)60546-x.
- 55. Arai M, Kuwajima K. Role of the molten globule state in protein folding. Adv Protein Chem. 2000;53:209–282. doi: 10.1016/s0065-3233(00)53005-8.
- 56. Peng ZY, Wu LC. Autonomous protein folding units. Adv Protein Chem. 2000;53:1–47. doi: 10.1016/s0065-3233(00)53001-0.
- 57. Feng H, Zhou Z, Bai Y. A protein folding pathway with multiple folding intermediates at atomic resolution. Proc Natl Acad Sci USA. 2005;102(14):5026–5031. doi: 10.1073/pnas.0501372102.
- 58. Weinkam P, Zong C, Wolynes PG. A funneled energy landscape for cytochrome c directly predicts the sequential folding route inferred from hydrogen exchange experiments. Proc Natl Acad Sci USA. 2005;102(35):12401–12406. doi: 10.1073/pnas.0505274102.
- 59. Lindorff-Larsen K, Piana S, Dror RO, Shaw DE. How fast-folding proteins fold. Science. 2011;334(6055):517–520. doi: 10.1126/science.1208351.
- 60. Pande VS. Understanding protein folding using Markov state models. Adv Exp Med Biol. 2014;797:101–106. doi: 10.1007/978-94-007-7606-7_8.
- 61. Adhikari AN, Freed KF, Sosnick TR. Simplified protein models: Predicting folding pathways and structure using amino acid sequences. Phys Rev Lett. 2013;111(2):028103. doi: 10.1103/PhysRevLett.111.028103.
- 62. Hughson FM, Wright PE, Baldwin RL. Structural characterization of a partly folded apomyoglobin intermediate. Science. 1990;249(4976):1544–1548. doi: 10.1126/science.2218495.
- 63. Zwanzig R, Szabo A, Bagchi B. Levinthal’s paradox. Proc Natl Acad Sci USA. 1992;89(1):20–22. doi: 10.1073/pnas.89.1.20.