Proceedings of the National Academy of Sciences of the United States of America. 2003 Mar 25;100(7):3555–3556. doi: 10.1073/pnas.0830965100

Meeting halfway on the bridge between protein folding theory and experiment

Vijay S Pande
PMCID: PMC152957  PMID: 12657736

Understanding the mechanism of protein folding is a notoriously difficult problem for both experiment and simulation (1, 2). In this issue of PNAS, through a combination of theory (Karanicolas and Brooks; ref. 3) and experiment (Nguyen et al.; ref. 4), progress toward the bridge between simulation and experiment is beautifully demonstrated. Indeed, led by predictions made by Karanicolas and Brooks, Nguyen et al. demonstrate the utility of a new generation of model systems for protein folding in which a small model protein (the WW domain) can be tuned from simple two-state behavior to more complex three-state folding merely by changing the temperature. Because such qualitative differences in the folding pathway can be induced through relatively subtle changes in the experimental conditions, this WW domain makes an ideal test system for our understanding of protein folding in terms of theory, simulation, and one's conceptualization of the relevant physical forces driving folding.

Typically, proteins are categorized as two-state (single exponential kinetics) or three-state (biexponential kinetics). Indeed, since Jackson and Fersht first demonstrated two-state folding kinetics in CI2 (5), much attention has been paid to two-state proteins as a simpler model system, a natural first challenge before going on to more complex, multistate kinetic folders. Moreover, a common rule of thumb is that small proteins (<100 residues) have simpler, two-state kinetics, whereas large proteins are likely to have more complex multistate folding.
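For reference, the two kinetic categories are commonly written as the relaxation laws below. This is a generic textbook form rather than anything taken from the companion articles; the symbols $S(t)$ (observed signal), $S_\infty$ (equilibrium signal), $A$ (amplitudes), and $\tau$ (time constants) are notation assumed here.

```latex
% Two-state folder: one observable relaxation phase
S(t) = S_\infty + A\, e^{-t/\tau}

% Three-state folder: two observable phases, with \tau_1 < \tau_2
S(t) = S_\infty + A_1\, e^{-t/\tau_1} + A_2\, e^{-t/\tau_2}
```

In this notation, losing the slow phase corresponds to $A_2 \to 0$, which leaves the single exponential form.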

The behavior of the FBP WW domain is unusual, not as an exception to this rule of size, but for its ability to switch between the two- and three-state worlds. Nguyen et al. (4) examined the folding of the FBP WW domain, a stable three-stranded β-sheet protein, by laser temperature jump measurements. They examined several variants of this domain, including several single point and truncation mutants. The most significant result of their work is the finding that above the transition midpoint (e.g., at 65°C), single fast exponential kinetics were seen with a time constant of <15 μs. On the other hand, below the midpoint temperature (e.g., at 35°C), biexponential kinetics were seen with time constants of ≈30 μs and >900 μs. Thus, it appears that as one raises the temperature above physiological conditions, the kinetics change qualitatively, and the slow phase disappears. What could be the cause of this slow phase and of its disappearance? By using mutagenesis experiments, Nguyen et al. (4) make some compelling suggestions, but at present one's desire for understanding folding with atomic detail cannot be fully satisfied by experiment alone.

Indeed, because of their complementary strengths and weaknesses, it is natural to use a combination of simulation and experiment to understand protein folding (1, 6–10). Experiments on these systems are extremely difficult and are limited primarily to yielding folding rates and thermodynamics. With a well-constructed experiment, one can use rates and free energies to decipher mechanistic properties, as demonstrated by Φ analysis pioneered by Fersht and coworkers (11). However, it is clear that even with experimental methods like Φ analysis, an understanding at atomistic detail is still elusive, and it is natural to complement these experiments with detailed, atomistic simulations (11), which have the capability of yielding everything one could want: atomistic detail and femtosecond temporal resolution.
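To make concrete how rates and free energies are combined in Φ analysis, the standard definition of a folding Φ value is sketched below. The subscripts D (denatured), N (native), and ‡ (transition state) are notation assumed here, and the rate expression additionally assumes that the mutation leaves the kinetic prefactor unchanged.

```latex
% Phi value for a mutation: change in the folding activation free energy
% divided by the change in equilibrium stability
\Phi_F = \frac{\Delta\Delta G_{\ddagger\text{-}D}}{\Delta\Delta G_{N\text{-}D}},
\qquad
\Delta\Delta G_{\ddagger\text{-}D} = -RT \ln\!\left(\frac{k_f^{\mathrm{mut}}}{k_f^{\mathrm{wt}}}\right)
```

A value of $\Phi_F$ near 1 is conventionally read as native-like structure at the mutated site in the transition state, and a value near 0 as denatured-like structure there.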

However, this level of simulation detail comes at a steep computational price. Typically, one can simulate only the nanosecond time scale on the fastest computers. Because typical proteins fold on the millisecond to subsecond time scale, this gap between simulation and experimental time scales has appeared to put simulations out of reach of experimentally testable systems. To help bridge this divide, experimentalists have taken up the challenge to discover (6, 7, 12) or create (13, 14) small, fast-folding proteins that are amenable to simulation time scales. With experimentalists studying systems that push the “speed limit” (15, 16) on folding kinetics, orders of magnitude faster than the previous record less than a decade ago (12) and reaching single microsecond folding time constants, the gauntlet has been thrown down for theorists to devise means to push the limits of their own methodology for simulating folding kinetics.

One approach to circumvent this apparent limitation is to examine systems in which dynamics occurs on nanosecond time scales. For example, Fersht and coworkers have experimental data supporting the observation that unfolding of the α-helical Engrailed Homeodomain (EngHd) near 100°C occurs on the nanosecond time scale (6). In collaboration with Daggett and coworkers, they have combined this experimental data on EngHd unfolding with unfolding simulations at 100°C to gain atomistic information about the unfolding pathway at room temperatures. The end goal of course would be to use this unfolding pathway data to suggest the nature of the folding pathway at room temperature. Unfolding at high temperatures has been successful in reproducing Φ values in several proteins (11). However, because the FBP WW domain explicitly exhibits qualitatively different kinetics as one raises the temperature, it is unclear whether elevated temperature methods would be applicable for this system.

Another approach has been to develop novel algorithms to take advantage of the statistical nature of folding kinetics. Because two-state folding kinetics are dominated by a single rate-limiting step, the distribution of folding times is well modeled by an exponential distribution. If this rate-limiting step dominates significantly (17, 18), then one can examine folding kinetics statistically by examining a large ensemble of trajectories that are ≈1% of the folding time constant in length (e.g., tens of nanoseconds). To get sufficient statistics, one of course needs tens of thousands of such simulations, requiring exceedingly large-scale computational resources. However, through the use of novel grid-computing based methods (9, 19–23) and a network of over 30,000 PCs, Pande and collaborators have simulated hundreds of microseconds, resulting in tens of successfully folded trajectories and accurate, quantitative predictions of folding rates and stabilities (7, 9, 19–22). These methods to statistically sample folding time distributions are, however, clearly complicated by non-single-exponential behavior (17, 18), and although there have been suggestions for means to tackle multistate folding (17, 22), this has yet to be achieved.
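The statistical argument in this paragraph can be illustrated with a toy calculation. The sketch below is not the grid-computing method itself: it draws folding times directly from an exponential distribution instead of running molecular dynamics, and the function names and the numerical values (a 10 μs time constant, 0.1 μs trajectories, 10,000 runs) are hypothetical choices for illustration only.

```python
import math
import random


def expected_folded_fraction(t_sim, tau):
    """Probability that one trajectory of length t_sim folds,
    assuming exponentially distributed folding times with time constant tau."""
    return 1.0 - math.exp(-t_sim / tau)


def estimate_rate(n_folded, n_total, t_sim):
    """Estimate the folding rate k = 1/tau from the folded fraction
    by inverting f = 1 - exp(-k * t_sim)."""
    if not 0 <= n_folded < n_total:
        raise ValueError("need 0 <= n_folded < n_total")
    fraction = n_folded / n_total
    return -math.log(1.0 - fraction) / t_sim


def simulate_ensemble(n_total, t_sim, tau, seed=0):
    """Toy stand-in for an ensemble of short simulations: count how many
    exponentially distributed folding times fall within t_sim."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n_total) if rng.expovariate(1.0 / tau) <= t_sim)


if __name__ == "__main__":
    tau = 10.0    # hypothetical folding time constant, in microseconds
    t_sim = 0.1   # each trajectory is 1% of tau
    n_total = 10_000
    n_folded = simulate_ensemble(n_total, t_sim, tau)
    print("folded trajectories:", n_folded)
    print("estimated rate (1/us):", estimate_rate(n_folded, n_total, t_sim))
```

For trajectories much shorter than the time constant, the folded fraction is approximately t_sim/tau, so only about 1% of the runs are expected to fold. This is why tens of thousands of trajectories are needed to observe tens of folding events, and why the inversion above is no longer valid once the folding time distribution is not a single exponential.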

A third approach is to use a so-called “minimalist” model (1, 8, 24–28). These models are attempts to strip down the simulation to the bare essentials, including the relevant physical forces, but no more. Apart from the obvious gains in computational tractability, such simulations allow one to directly ask, “what are the relevant aspects of proteins that lead to the experimentally observed folding?” Even if the model disagrees with experiment, one still has learned much. Indeed, much has been learned from even more minimalist lattice models (1). Of course, because of the minimalist nature of these models, comparison to experiment is vital, as this is the only way to discern whether the model was a success, and thus, whether the aspects used to construct the models were sufficient.

Indeed, recently Karanicolas and Brooks (3), also in this issue of PNAS, have examined the FBP domains by using a minimalist model and a mixture of kinetic and thermodynamic methods and have made predictions (before the work of Nguyen et al.; ref. 4) regarding the nature of the folding mechanism. Specifically, Karanicolas and Brooks shed light on the mysterious two- to three-state switch. They suggest that most of the protein can fold independently and quickly with the fast phase and that the slow phase is created by nonnative, misregistered contacts in loop 2. This is consistent with Nguyen et al.'s results, especially the reemergence of three-state behavior when Leu-26 in loop 2 is mutated. Thus, Karanicolas and Brooks have demonstrated that even subtle features of folding can be captured in current minimalist models.

What can we learn from recent successes in bridging experiment and theory? Because a connection between simulation and experiment has been made by using a variety of computational methods, it is natural to ask “what do all of these simulation methods have in common?” One commonality is the role of the topology of the modeled protein. Although interactions are clearly somewhat different from experiment at elevated temperatures or in minimalist model representations, what is common are the polymeric aspects, in particular the role of topology and geometry. Inspired by Plaxco et al.'s quantitative correlation between folding rate and contact order for two-state folders (29), many theorists have been looking to see how aspects of topology and geometry (in the sense found in contact order) may be playing a critical role in determining the folding mechanism.
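As a reminder of what contact order measures, the sketch below computes relative contact order as it is usually defined: the mean sequence separation between contacting residues, divided by the chain length. The function name and the input format (a list of residue index pairs) are assumptions made for illustration; deciding which residues count as being in contact, typically by a distance cutoff on a native structure, is left outside the sketch.

```python
def relative_contact_order(contacts, n_residues):
    """Relative contact order: mean sequence separation |i - j| over all
    contacting residue pairs, normalized by the chain length.

    contacts   -- iterable of (i, j) residue index pairs (hypothetical input)
    n_residues -- number of residues in the chain
    """
    contacts = list(contacts)
    if not contacts or n_residues <= 0:
        raise ValueError("need at least one contact and a positive chain length")
    total_separation = sum(abs(j - i) for i, j in contacts)
    return total_separation / (len(contacts) * n_residues)


# Made-up contact lists for a 20-residue chain: the first is dominated by
# local contacts, the second by contacts between residues far apart in sequence.
local_contacts = [(1, 4), (5, 8), (9, 12)]
nonlocal_contacts = [(1, 18), (2, 17), (3, 16)]
```

A chain whose contacts are mostly local in sequence yields a low value, whereas one whose contacts join distant parts of the chain yields a high value. This is the sense in which contact order summarizes topology.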

Although much recent theoretical attention has been paid to topology, it is interesting to consider a second possibility: the role of sterics. Recent theoretical work by the groups of Rose (30–32), Shortle (33), and Pande (19, 20, 34) suggests that local sterics may play a significant role in biasing a protein through a particular pathway to its particular native state. Steric-based algorithms have also played an important role in the structure prediction methods recently suggested by these groups. Indeed, the importance of the role of sterics as well as that of conformational averaging (34) may also be a natural explanation for the unusual effect seen in the native-like behavior of the unfolded state (35, 36).

In summary, the hypothesis that local (steric) and global (topology) issues may play a central role in protein folding is intriguing. The ability to gain qualitative and at times quantitative agreement with experiment suggests that the current generation of models is in many ways “good enough.” Perhaps the challenge that lies ahead for the folding community is reaching an understanding of why these models are agreeing reasonably well with experiment. With this determined, we would finally attain the ultimate goal – an atomic understanding of the relevant driving forces for folding.

Acknowledgments

I thank Stefan Larson, Young Min Rhee, Eric Sorin, and Bojan Zagrovic for critical comments on this manuscript.

Footnotes

See companion articles on pages 3948 and 3954.

References

