Protein Science : A Publication of the Protein Society. 2003 Jan;12(1):17–26. doi: 10.1110/ps.0220003

The topomer search model: A simple, quantitative theory of two-state protein folding kinetics

Dmitrii E Makarov 1, Kevin W Plaxco 2
PMCID: PMC2312397  PMID: 12493824

Abstract

Most small, single-domain proteins fold with the uncomplicated, single-exponential kinetics expected for diffusion on a smooth energy landscape. Despite this energetic smoothness, the folding rates of these two-state proteins span a remarkable million-fold range. Here, we review the evidence in favor of a simple, mechanistic description, the topomer search model, which quantitatively accounts for the broad scope of observed two-state folding rates. The model, which stipulates that the search for those unfolded conformations with a grossly correct topology is the rate-limiting step in folding, fits observed rates with a correlation coefficient of ∼0.9 using just two free parameters. The fitted values of these parameters, the pre-exponential attempt frequency and a measure of the difficulty of ordering an unfolded chain, are consistent with previously reported experimental constraints. These results suggest that the topomer search process may dominate the relative barrier heights of two-state protein-folding reactions.

Keywords: Contact order, diffusion-collision, nucleation


The folding kinetics of most simple, single-domain proteins is well fitted as a single-exponential, two-state process (Jackson and Fersht 1991; Guijarro et al. 1998; Jackson 1998; Plaxco et al. 1999), even at the lowest temperatures accessible to experiment (Gillespie and Plaxco 2000). This observation confirms that the rapid, biologically relevant folding rates that distinguish naturally occurring proteins are associated with a smooth energy landscape lacking both significant discrete traps (well-populated intermediates and misfolded states) and fine-scale heterogeneous roughness (Bryngelson and Wolynes 1987; Bryngelson et al. 1995; Onuchic et al. 1997). In the absence of these complications, folding is free to progress unimpeded to the native state with the greatest possible speed (Dill and Chan 1997; Dobson et al. 1998; Dinner et al. 2000). But this observation raises a question: if the folding energy landscapes of two-state proteins are generally smooth, why do some simple proteins fold a million times more rapidly than others (van Nuland et al. 1998; Wittung-Stafshede et al. 1999)? Here, we describe a simple, near-first-principles model that quantitatively accounts for this well-established experimental observation.

Although evolution can presumably smooth the energy landscape arbitrarily, there is one aspect of protein chemistry that selective pressures cannot optimize, namely, a polypeptide is a covalent chain that cannot cross through itself. An unavoidable consequence of this connectivity is that the rate with which unfolded polypeptides diffuse between distinct topologies is limited, and thus, even imaginary proteins with perfectly smooth energy landscapes will exhibit varying folding rates due to topological frustration (Clementi et al. 2000) and the difficulty of diffusing into the correct, native topology. Consistent with this hypothesis, by the mid- to late nineties, numerous authors had suggested the search for the correct gross topology may be an important contributor to the folding barrier (Sosnick et al. 1994; Gross 1996; Sosnick et al. 1996; Guo et al. 1997; Kolinski et al. 1998; Sheinerman and Brooks 1998; Socci et al. 1998; Bergasa-Caceres et al. 1999; Debe et al. 1999; Shea et al. 1999).

In 1998, we serendipitously discovered that a simple, empirical measure of topological complexity is highly correlated with the experimentally observed folding rates of two-state proteins (Plaxco et al. 1998, 2000). The measure of topology in question, termed relative contact order, is simply the average sequence separation between all pairs of residues in contact in the native structure relative to the total length of the protein. The surprising strength of this correlation [r ≈ 0.9; all correlation coefficients in this review are for linearized equations. This correlation coefficient, r, thus reflects the significance of the linear relationship between log (kf) and contact order. The square of the correlation coefficient, r^2, is a measure of the fraction of all of the variance in the data set that is captured by the model.] demonstrates that, against the background of the smooth energy landscapes of two-state proteins, this perhaps naïve measure captures in excess of three-fourths of the variance in reported (log) folding rates.
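As a concrete illustration of the definition above, the following minimal Python sketch computes relative contact order from a list of contacting residue pairs. The function name and the input format (pairs of residue indices) are assumptions made for illustration; how contacts are identified in the native structure is left to the caller.

```python
def relative_contact_order(contacts, chain_length):
    """Average sequence separation of contacting pairs, divided by chain length.

    contacts     -- iterable of (i, j) residue index pairs in contact (hypothetical input format)
    chain_length -- total number of residues in the protein
    """
    contacts = list(contacts)
    if not contacts:
        raise ValueError("at least one contact is required")
    if chain_length <= 0:
        raise ValueError("chain_length must be positive")
    # Sum of sequence separations |j - i| over all contacting pairs.
    total_separation = sum(abs(j - i) for i, j in contacts)
    # Mean separation, expressed as a fraction of the chain length.
    return total_separation / (len(contacts) * chain_length)
```

For example, two contacts with sequence separations of 10 and 20 residues in a 50-residue chain give a mean separation of 15 residues and thus a relative contact order of 0.3.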

Whereas the contact order-rate relationship hints at the mechanistic underpinnings of the folding reaction, the measure has not lent itself to any simple, quantitative reconciliation with first principles models of the process. For example, because contact order is related to the sequence separation between contacting residues, it has been suggested that it relates to the entropic cost of the loop closures required to surmount the rate-limiting step in folding (Plaxco et al. 1998; Alm and Baker 1999a; Galzitskaya and Finkelstein 1999; Fersht 2000). Unfortunately, however, loop closure entropy is proportional to the logarithm of loop length rather than loop length per se (Jacobson and Stockmayer 1950) and the average log (separation) between contacting residues is more poorly correlated with rates than is contact order as originally defined (K.W. Plaxco, unpubl.). Similarly, relative contact order (the average contact separation in terms of fraction of total peptide length) predicts rates significantly more accurately than absolute measures of the average sequence separation of contacting residues (Grantcharova et al. 2001; Ivankov et al. 2002). This produces the counterintuitive result that, of two proteins with the same average contact separation, the longer protein folds faster. Observations such as these lead inevitably to the possibility that contact order predicts rates, not because it is directly related to the underlying mechanism of folding, but because it is a proxy for some other, physically more reasonable parameter. Consistent with this suggestion, a number of additional, empirical measures of topology correlate approximately equally well with folding rates. These include the number of sequence-distant contacts per residue (Gromiha and Selvaraj 2001), the fraction of contacts that are sequence distant (Mirny and Shakhnovich 2001), and the total contact distance (Zhou and Zhou 2002).

Motivated by the quantitative dependence of kinetics on topology, several groups have attempted to define mechanistic models of folding that predict rates with accuracy equal to or surpassing that of these empirical relationships. One approach is based on calculating the loop-entropy cost of sequentially creating the stabilizing interactions that define the native state (Alm and Baker 1999b; Muñoz and Eaton 1999; Grantcharova et al. 2001; Ivankov and Finkelstein 2001). Whereas these models have achieved real success in predicting folding kinetics, their relative complexity and slightly poorer correlation with experiment again raise the question of whether the entropic cost of specific loop closures really underlies the (perhaps deceptively) simple relationship between a protein’s topology and the rate with which it folds.

The topomer search model

The topomer search model provides a simple, alternative explanation for the topology-rate relationship. This model postulates that relative barrier heights are dominated by the diffusive search for the set of unfolded conformations that share a common, global topology with the native state (i.e., are in the native topomer; Debe et al. 1999), and that once this is achieved, the rate-limiting step has been surmounted and specific native contacts rapidly zipper to form the fully folded protein (Fig. 1). The model implies that the various empirical, topological metrics correlate with rates because they correlate with the probability of the unfolded chain diffusing into this native topomer. Here, we review the simple, quantitative arguments in support of the topomer search model of two-state folding.

Figure 1.

The essence of the topomer search model is that the rate with which an unfolded polymer diffuses between distinct topologies is much slower than the rate with which local structural elements zipper (and, critically, unzip). It is well established that the formation of helices, loops, and other sequence-local structural elements (B to A transition) is significantly faster than the rate-limiting step in folding. Because the free energy of these partially folded states (A) is almost invariably ≥0 for two-state proteins, their disruption (A to B transition) is more rapid still. Given these constraints, the rate-limiting step in folding might be the slow, large-scale diffusion to find the set of conformations (C) close enough to the native topology that they can zipper deeply into the stable, native well (E) without requiring slow, large-scale topological rearrangements. Central to this argument is the suggestion that, whereas the formation of specific, native-like interactions may be necessary in order to surmount the rate-limiting step (D), they are neither sufficient (A) nor the dominant determinant of relative barrier heights. Here, we review the quantitative, experimental evidence in support of this model of two-state folding.

Why might the topomer search process be the dominant contributor to the folding barrier? A potential answer to this question arises as a consequence of two experimental observations. The first is that the formation of helices, hairpins, loops, and other local structures is orders of magnitude more rapid than the rate-limiting step in folding (Hagen et al. 1996; Muñoz et al. 1997; Thompson et al. 1997; Bieri et al. 1999; Eaton et al. 2000; Lapidus et al. 2000). The second is that the folding free energies of such isolated structural elements, and of almost all of the partially folded and misfolded states of single-domain proteins, are near or above zero (for review, see Flanagan et al. 1992; Ladurner et al. 1997; Camarero et al. 2001). Because local zippering is rapid, all of the conformers in the denatured ensemble will rapidly sample sequence-local elements of native structure (Fig. 1; transition B to A). For the vast majority of unfolded conformations, however, this zippering will stall when it reaches a point at which the formation of additional native structure would require massive, potentially slow rearrangement of the polypeptide chain (Fig. 1, state A). Because this partially folded state is unstable, it ruptures at least as rapidly as it formed (Fig. 1, transition A to B), liberating the chain to diffuse into a new topology (Fig. 1, transition B to C). Only if the new topology is grossly similar to the native topology (i.e., is in the native topomer; Fig. 1, state C) can the zippering of local structure proceed deep into the native well and the chain become trapped (Fig. 1, transition C to E). The observation of a strong correlation between topology and rates suggests that the slow, diffusive search for unfolded conformations that can zipper directly into the native well (i.e., that are in the native topomer) is the dominant contributor to the folding barrier. More precisely, we have suspected that contact order predicts rates because it correlates with the probability of a random, diffusive search finding the native topomer (Gillespie and Plaxco 2000; Millet et al. 2002).

The demonstration that contact order—or any of the many related topological parameters—correlates with the probability of finding the native topomer would provide critical support for this hypothesis. On first inspection, however, one might think that determining the probability of a chain diffusing into a given topomer is an overwhelmingly complex exercise in conditional probabilities (Fig. 2A); the probability of bringing a given residue pair into proximity may depend acutely on which other pairs are already ordered (Chan and Dill 1990). The critical question is whether a simple mathematical description exists that reasonably approximates this complex set of conditional probabilities and accurately predicts the probability of achieving the native topomer.

Figure 2.

(A) The probability of bringing a sequence-distant native pair into proximity depends on the set of other pairs that are already in proximity. (B) Gaussian chain simulations indicate, however, that once a few (∼3) sequence-distant pairs are in proximity, the probability of forming any additional pair is well approximated as a constant (probability CD ≈ probability EF). This, in turn, suggests that the probability of achieving a given topomer is approximately exponentially related to QD, the number of sequence-distant pairs of residues that must be brought into proximity in order to define it.

To date, two groups have attempted the search for a simple description of the probability of finding the native topomer. The first, Debe and Goddard, used rather detailed simulations to estimate numerically the number of distinct topologies available to an unfolded polypeptide chain (Debe et al. 1999), and arrived at a first principles model that accurately predicts (r ≈ 0.8–0.9) the folding rates of the non-helical two-state proteins (Debe and Goddard 1999). However, this model fails to predict the folding rates of predominantly helical, two-state proteins. More recently, we have described a rather simpler and still more general version of the topomer search model that accurately predicts the folding rates of all classes of two-state proteins (Makarov et al. 2002).

Our model stems from simulations of the properties of inert, Gaussian chains. These simulations demonstrate that, due to two simplifying effects, a straightforward approximation describes the probability of a random-coil polymer adopting a given gross topology. The first simplifying effect is that, because the probability of sequence-neighboring residues being in proximity is high (their locations are highly correlated), the probability of forming the native topomer is dominated by pairs of residues that are distant in the sequence. Thus, sequence-local interactions contribute little to the probability of being in the native topomer. The second is that the probability of ordering the chain is well described by a mean-field approximation. That is, once a sufficient number of sequence-distant pairs of residues are brought into proximity, the remaining ordering events become independent of the precise nature of the pre-existing order (i.e., become independent of one another), and the probability of each of these orderings becomes approximately constant. The nature of this approximation can be understood in qualitative terms by considering the ordering of a native pair in a chain with a significant amount of pre-existing, native-like order (Fig. 2B). In such a situation, it is plausible that the entropic cost of bringing such an additional native pair into proximity to form a bundle of residues in the native topomer could be described in terms of the bulk characteristics of this bundle instead of its precise structure (Flory 1956; Gutin and Shakhnovich 1994; Plotkin et al. 1996; Shoemaker and Wolynes 1999). The simplest approximation of the probability of forming the native topomer would then be to replace the unique probability of ordering each specific pair by the average probability of ordering all pairs. Numerical simulations of the exactly solvable Gaussian chain model provide quantitative support for this qualitative argument (Makarov and Metiu 2002; Makarov et al. 2002).

If, as suggested by Gaussian chain simulations, the probability of bringing each additional sequence-distant pair into proximity in the unfolded state is constant, then the probability that the unfolded polypeptide is in a given topomer, P(QD), is given by

$$P(Q_D) = \gamma \langle K \rangle^{Q_D} \qquad (1)$$

in which QD is the number of sequence-distant pairs whose proximity defines the topomer, <K> is the average equilibrium constant for residue pairs being in proximity (and is less than unity) and γ is a proportionality constant. We note that this probability is proportional to <K>^QD rather than equal to it; this is because, as suggested by the Gaussian chain studies, the additional entropic cost associated with the formation of the first few ordered pairs results in a prefactor that is less than unity (Makarov et al. 2002). Because of this, Equation 1 is only a valid approximation when QD is sufficiently large (in practice greater than ∼3). We also note that <K> may depend generally on the length of the chain; that is, whereas P(QD) has approximately an exponential dependence on the number of sequence-distant pairs that must be brought into proximity, this dependence may be different for different chain lengths. For the present, we will ignore the length dependence of <K>, and will return to these considerations later in the review.

The topomer search model predicts that the rate-limiting step in two-state folding is the formation of a conformation in which every residue is roughly in proximity to the residues that it contacts in the native state. We thus have the prediction that, by analogy to transition state theory, folding rates (kf) should scale approximately as

$$k_f \approx \kappa Q_D \, \gamma \langle K \rangle^{Q_D} \qquad (2)$$

in which κQD is the attempt frequency (proportional to QD due to the QD possible pairs of native residues that can be ordered), and γ<K>^QD ≈ exp(−ΔG/k_BT) is the equilibrium constant for the formation of the native topomer. This relationship is reminiscent of the contact order-rate relationship (kf exponentially related to contact order). Nevertheless, the physical meaning of QD, the number of sequence-distant native pairings that define the native topomer, differs fundamentally from that of the earlier, entirely empirical measure.
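To make the scaling in Equation 2 concrete, here is a minimal Python sketch that evaluates it and its linearized form. The default parameter values are the fitted values reported in the Figure 3 caption (κγ = 3800 s^−1 and <K> = 0.86); the function names are hypothetical, and the expression should only be trusted for QD greater than about 3, where the mean-field approximation is said to hold.

```python
import math

KAPPA_GAMMA = 3800.0  # s^-1; fitted product kappa*gamma quoted for Figure 3
K_AVG = 0.86          # fitted average equilibrium constant <K> quoted for Figure 3


def predicted_folding_rate(q_d, kappa_gamma=KAPPA_GAMMA, k_avg=K_AVG):
    """Equation 2: k_f ~ (kappa * Q_D) * gamma * <K>**Q_D, in s^-1."""
    return kappa_gamma * q_d * k_avg ** q_d


def log10_rate_per_pair(q_d, kappa_gamma=KAPPA_GAMMA, k_avg=K_AVG):
    """Linearized form: log(k_f / Q_D) = log(kappa*gamma) + Q_D * log(<K>)."""
    return math.log10(kappa_gamma) + q_d * math.log10(k_avg)


if __name__ == "__main__":
    # Illustrative Q_D values only; these are not taken from any specific protein.
    for q_d in (5, 20, 40, 60):
        print(q_d, predicted_folding_rate(q_d))
```

Because <K> is less than unity, log (kf/QD) falls linearly with QD, which is the straight-line relationship fitted in Figure 3.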

Testing the topomer search model

The prediction that folding rates relate to QD provides a means of testing the topomer search model. To perform this test, however, we must define QD in terms of experimental observables. This is readily done if we assume that any pair of sequence-distant residues (separated by more than lc residues) that are in contact in the native state (i.e., within a cutoff distance, rc) must be in proximity to form the native topomer. The precise values of rc and lc, however, are not well constrained by the model. Typical choices are for rc to reflect pairs of Cα atoms that approach to within 6–8 Å in the native state (the model is independent of specific chemical interactions and thus ignores side chains) and for lc to lie in the range of 4–12 residues [i.e., 0.5–1.5 times reported persistence lengths (Schwalbe et al. 1997; Penkett et al. 1998)]. Fortunately, the topomer search model is rather insensitive to the precise details of how these native pairs (and thus QD) are defined; the QD values obtained across this wide range of parameters are all strongly correlated with one another and, critically, with experimentally observed folding rates. For example, if lc = 12 residues and rc = 6 Å, we obtain the statistically significant (r = 0.88), predictive correlation illustrated in Figure 3. Thus, this simple model captures in excess of three-fourths of the variance in our kinetic data set using only two fitted parameters (<K> and the product κγ).
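The counting rule described above can be sketched in a few lines of Python. This is an illustrative reading of the definition rather than the authors' code: the function name, the input format (a list of Cα coordinates in Å, one per residue, in sequence order), and the choice to treat "within rc" as less than or equal to rc are all assumptions.

```python
import math


def count_sequence_distant_pairs(ca_coords, l_c=12, r_c=6.0):
    """Count Q_D: residue pairs separated by more than l_c residues in sequence
    whose C-alpha atoms lie within r_c angstroms of each other.

    ca_coords -- sequence-ordered list of (x, y, z) C-alpha coordinates in angstroms
    l_c       -- sequence-separation cutoff in residues (4-12 is the range given in the text)
    r_c       -- distance cutoff in angstroms (6-8 is the range given in the text)
    """
    n = len(ca_coords)
    q_d = 0
    for i in range(n):
        # j - i > l_c, i.e. only sequence-distant partners are considered.
        for j in range(i + l_c + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= r_c:
                q_d += 1
    return q_d
```

Given coordinates parsed from a structure file by whatever means is convenient, calling `count_sequence_distant_pairs(coords, l_c=12, r_c=6.0)` corresponds to the parameter choice used for Figure 3.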

Figure 3.

Observed folding rates (kf) are highly correlated with the number of sequence-distant native pairs (QD), as predicted by the topomer search model. The observed correlation [across a previously established database of simple, single-domain proteins (Plaxco et al. 2000)] is excellent, r = 0.88, and indicates that this simple model captures more than three-fourths of the variance in (log) observed two-state folding rates. Illustrated is the fit to a linear version of Equation 2, log (kf/QD) = log (κγ) + QD log <K>, with fit parameters κγ = 3800 s^−1 and <K> = 0.86. QD is determined as described in the text.

In addition to successfully predicting folding rates, the topomer search model also successfully predicts when it will fail to predict folding rates. For example, Equation 2 is only a valid approximation if more than approximately three sequence-distant native pairs have been brought into proximity (Fig. 2); if fewer than four native pairs are ordered, the mean-field approximation breaks down. If a protein adopts a very simple native topology (i.e., a sequence-local topology for which QD < 4), the behavior of Gaussian chains suggests that it should fold more rapidly than would be expected based on the naive application of this approximation. Three such proteins—the engrailed homeodomain, protein A, and the villin headpiece (Mayor et al. 2000; Myers and Oas 2001; D. Raleigh, pers. comm.)—have been characterized, and consistent with this prediction, all fold at least an order of magnitude more rapidly than simple, topology-based calculations would suggest (for review, see Islam et al. 2002). This limitation, however, affects only two-state proteins for which QD ≲3. Within the remaining set of two-state proteins, Equation 2 is successful in quantitatively predicting folding rates.

The model parameters

The fitted parameters in the topomer search model are physically reasonable. The relationship between QD and folding rates stems from first principles arguments that allow us to assign meaning to the slope and intercept of the relationship and to test their validity experimentally. The value of these fitted parameters depends only weakly on how QD is defined, and the range of these parameters suggested by the model are consistent with a number of experimental and simulations-based studies of folding and the denatured ensemble.

Despite the potentially significant approximation that all two-state folding reactions exhibit the same κγ irrespective of, for example, chain length (Portman et al. 2001; Kaya and Chan 2002), the pre-exponential produced by the model is physically reasonable. We base this assertion on the following first-principles argument. The attempt frequency, κQD, is the rate of moving residue pairs into or out of proximity (Makarov et al. 2002). Assuming this is a purely entropic event, κ is the rate with which sequence-distant pairs diffuse apart and is given by (Szabo et al. 1980)

$$\kappa \sim \frac{D}{d^{2}} \qquad (3)$$

in which D ≈ 4 × 10^−7 cm^2/s is the loop-closure diffusion coefficient (Hagen et al. 1997) and d is the characteristic distance at which a residue pair is no longer in sufficient proximity to rapidly zipper. Whereas the precise value of d is unclear, it must lie between 6 Å and 24 Å (respectively, the typical distance between residues in physical contact and the typical dimensions of a single domain protein). Across this range of d, κ ≈ 10^8 s^−1, and as the fitted value of κγ is ∼3800 s^−1 (Fig. 3), γ ≈ 4 × 10^−5. Because γ arises due to the extra entropy associated with the first few ordering events, this suggests that an additional R ln γ ≈ −85 J/(mol·K) can be assigned to this step in the topomer search process. Consistent with the arguments presented above (that the mean-field approximation becomes valid after ∼3 sequence-distant pairs have been ordered), this is comparable with approximately three times the entropic cost of closing a typical 12–25 residue loop (Poland and Scheraga 1965).
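The arithmetic in this paragraph can be checked with a short Python sketch. It takes the quoted attempt frequency (κ ≈ 10^8 s^−1) and the fitted product (κγ ≈ 3800 s^−1) as inputs; the function name is hypothetical and the gas constant is the standard value, not a number from the text.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol K), standard value of R


def prefactor_and_entropy(kappa_gamma, kappa):
    """Return (gamma, R*ln(gamma)) from the fitted product kappa*gamma and an estimate of kappa."""
    gamma = kappa_gamma / kappa
    entropy = GAS_CONSTANT * math.log(gamma)  # J/(mol K); negative because gamma < 1
    return gamma, entropy


if __name__ == "__main__":
    gamma, entropy = prefactor_and_entropy(kappa_gamma=3800.0, kappa=1e8)
    print(f"gamma = {gamma:.1e}")
    print(f"R ln(gamma) = {entropy:.0f} J/(mol K)")
```

With these inputs the sketch gives γ of about 4 × 10^−5 and R ln γ of about −85 J/(mol·K), matching the values quoted above. Because the entropy depends only logarithmically on κ, an order-of-magnitude uncertainty in κ shifts it by roughly 19 J/(mol·K).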

It is more difficult to ascertain whether the value of <K> is reasonable. The value obtained from fitting experimental folding rates is ∼0.8–0.9, depending on the choice of lc and rc. These values imply that, once more than approximately three sequence distant native pairs have been brought into proximity (and Equation 2 becomes a valid approximation), any remaining native pairs have an ∼45% chance (corresponding to an equilibrium constant of 0.8–0.9) of being in proximity in the unfolded molecule. Whereas this may suggest that the unfolded state is relatively well-ordered, an important consideration is that the model defines proximity as any orientation in which elements can collide to form native contacts more rapidly than the rate-limiting step in folding. As the rate-limiting step in folding is orders of magnitude slower than the rate of loop closure, proximity need not imply that two residues are particularly close in space. Indeed, this is precisely how the topomer search model solves Levinthal’s paradox; whereas the number of conformations in the native topomer is small relative to the total number of conformations available in the unfolded ensemble, it is enormously larger than unity. Because of this, the entropic cost of finding the native topomer may be reasonable even in the absence of native-like interactions that may favor this set of conformations. That said, recent experimental (Hodsdon and Frieden 2001; Plaxco and Gross 2001; Shortle and Ackerman 2001; Baldwin 2002; Klein-Seetharaman et al. 2002) and simulations-based (Choy and Forman-Kay 2000; Zagrovic et al. 2002) reports of residual long-range order in the equilibrium denatured state are consistent with the seemingly high value of <K>. If, as suggested by these studies, the denatured state adopts a native-like topology, then it is perhaps not surprising that any given sequence distant native pair has, on average, a ∼45% chance of being in proximity.
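The correspondence between an equilibrium constant of 0.8–0.9 and a roughly 45% chance of proximity can be sketched as follows. The conversion assumes that <K> is the ratio of the in-proximity to the out-of-proximity population for a pair, so that the probability is K/(1 + K); this two-state reading and the function name are assumptions for illustration.

```python
def proximity_probability(k_eq):
    """Probability that a pair is in proximity, assuming K = p_in / p_out (two-state picture)."""
    if k_eq < 0:
        raise ValueError("equilibrium constant must be non-negative")
    return k_eq / (1.0 + k_eq)


if __name__ == "__main__":
    for k_eq in (0.8, 0.86, 0.9):
        print(k_eq, round(proximity_probability(k_eq), 3))
```

Under this assumption, K = 0.8 and K = 0.9 correspond to probabilities of about 0.44 and 0.47, which bracket the ∼45% figure quoted in the text.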

With these considerations, we now have all of the elements required to draw a complete picture of the topomer search model of two-state protein folding. The fitted value of <K> argues that, once the first few sequence-distant native pairs are in proximity, about half of the remaining sequence-distant pairs are likely to be in the correct topomeric state. That is, they are in sufficient proximity that they rapidly sample (and because of the relative instability of partially folded states, “unsample”) their native interactions. The pre-exponential suggests that these correctly oriented elements will be rapidly fluctuating out of—and incorrectly oriented elements back into—the correct topomeric state. The rate-limiting step in folding is then the set of random fluctuations that simultaneously brings every element in the chain into the native topomer. Once this is achieved, the rate-limiting step is surmounted and specific native contacts rapidly and productively zipper to form the fully folded protein.

The relationship between rates and contact order

The relationship between rates and contact order thus appears to arise indirectly. That is, the behavior of Gaussian chains suggests that QD defines the probability of achieving the native topomer, and thus defines folding rates, and that contact order predicts rates not because it is related to the folding mechanism per se but because it is a proxy for QD. Although a strong correlation between contact order and QD for most proteins renders it difficult to prove this hypothesis directly, recent counterexamples provide significant evidence in support of it. For example, circular permutation of the S6 domain allows us to distinguish between the two parameters: whereas permutation significantly alters the protein’s contact order, it does not significantly alter QD (Miller et al. 2002). Consistent with the predictions of the topomer search model, it has been reported recently that these permutations do not significantly alter folding rates (Lindberg et al. 2001). Similarly, the covalent circularization of a protein should significantly alter its contact order (presumably, one counts the shortest covalent path between contacting residues), leading to orders of magnitude rate accelerations. The topomer search model, in contrast, predicts relatively small rate accelerations, because circularization pre-orders only one sequence-distant native pair. This will reduce the entropic cost of the first few ordering events by, at most, about one-third, increasing γ, and thus folding rates, by no more than a factor of 10. Consistent with this prediction, the relevant, reported circularizations produce only three- to sevenfold rate enhancements (Otzen and Fersht 1998; Grantcharova and Baker 2001; Camarero et al. 2001).

Native interactions and the topomer search

The topomer search model ignores the contributions of native-like interactions to the rate-limiting step in folding, obviously a potentially significant omission. For example, the strong, perfectly exponential denaturant dependencies of folding rates demonstrate that the folding transition state contains interactions similar to those that stabilize the native state (Plaxco et al. 2000). This suggestion is further supported by reports that native-state stability is an important determinant of the relative folding rates of topologically similar proteins (Guijarro et al. 1998; Clarke et al. 1999). Moreover, exhaustive mutagenesis studies (termed φ-value analysis) have firmly established that many side chains are in near-native environments during the rate-limiting step in folding (for review, see Fersht 1997). It is thus abundantly clear that, in addition to the topomer search process, the formation of specific, native interactions also contributes to the relative free energy of the folding transition state. However, despite its studied lack of specific, nucleating interactions (it ignores all chemistry), the topomer search model captures three-fourths of the variance in the log of relative two-state folding rates. This suggests that, although specific, native-like interactions are an obligatory feature of the folding transition state (Fig. 1, C to D transition), these interactions are neither sufficient to ensure folding nor the dominant determinant of relative barrier heights.

Of course, the topomer search model need not completely ignore the energetically favorable interactions that may exist in the folding transition state; they are spun into the factor <K> (Makarov et al. 2002). That is, any stabilizing interactions that bias the chain toward the native geometry will increase the average probability of a native-like orientation of structural elements. As noted above, however, the observed value of <K> may be reasonable even in the absence of significant stabilizing interactions simply because proximity only implies “close enough to collide more rapidly than the rate-limiting step.” As the rate-limiting step in folding is slow (relative to loop closure rates), “close enough” may, in reality, be rather distant, and thus, energetically favorable interactions are not necessarily required to generate <K> ∼0.8–0.9 and the rapid folding rates this produces.

Relationship to previous folding models

The topomer search model unifies several previous models of protein-folding kinetics. For example, the topomer search model is grounded in the energy landscape picture of protein folding (Bryngelson and Wolynes 1987; Bryngelson et al. 1995; Dill and Chan 1997; Onuchic et al. 1997; Dobson et al. 1998; Dinner et al. 2000); it is precisely because the energy landscapes of two-state proteins are exceedingly smooth that the topomer search, rather than diffusion over a rough landscape or escape from discrete traps, defines the folding barrier (Sosnick et al. 1994; Debe et al. 1999; Gillespie and Plaxco 2000; Millet et al. 2002). Notably, the energy landscape of the topomer search process itself is smooth; recent studies of the rate with which sequence-distant residue pairs are brought into proximity in unfolded cytochrome c demonstrate that inter-residue interactions (i.e., energetic roughness) do not control large-scale conformational diffusion even under native conditions (Hagen et al. 2001).

The topomer search model can also be considered a limiting (albeit simple, general, and easily quantified) case of the hierarchical folding models (Rose 1979; Baldwin and Rose 1999). The diffusion-collision model, for example, stipulates that protein folding occurs via the diffusive, hierarchical assembly of more-or-less preformed elements of secondary structure (Karplus and Weaver 1979; Zhou and Karplus 1999; Myers and Oas 2001). The topomer search model, in contrast, stipulates that, except for those few, rapidly folding proteins for which QD < 4 (see Islam et al. 2002), the sampling of local structure is orders of magnitude more rapid than the sampling of topomers. For most two-state proteins, the barrier is thus largely defined by the latter, with the sampling of local structural elements playing a much lesser role in determining relative folding rates.

How can the topomer search model be improved?

The strong correlation between Equation 2 and observed two-state folding rates suggests that, despite its seemingly excessive simplicity, the topomer search model captures the dominant contributor to relative barrier heights. There is, nevertheless, clearly room to improve the model’s accuracy and generalizability. Here, we discuss likely future efforts in these directions.

Chain-length dependence

Numerous theoretical studies suggest that both the pre-exponential (via the diffusion coefficient) and the activation barrier (due to the entropic cost of the search) of folding are strong functions of chain length, N (Thirumalai 1995; Gutin et al. 1996; Zhdanov 1998; Debe et al. 1999). Most models predict that folding rates scale exponentially with N with a large, negative exponent (i.e., longer chains fold more slowly). No statistically significant length dependence is evident, however, in the experimentally observed folding rates of simple, single-domain proteins (Plaxco et al. 1998, 2000), perhaps because the effects of differing topologies overwhelm the more subtle length-rate relationship. The topomer search model provides a convenient opportunity to account for the effects of topological variations and thus investigate the length dependence of folding independently of topology. When this is done, a statistically significant relationship between rates and N arises, but in the counterintuitive direction: longer proteins tend to fold more rapidly than predicted. This leads to a small but statistically significant improvement in the correlation of log(kf) with QD·N^α versus QD alone (Fig. 4) via the equation

Figure 4.

The empirical addition of length dependence to the topomer search model produces a small but statistically significant improvement in its predictive value. The addition also produces the counterintuitive prediction that a longer protein generally folds more rapidly than a shorter one with an equivalent number of sequence-distant native pairs. Shown here is the fit to a linearized version of Equation 4, log(kf/QD) = log(β) + QD·N^α·log(J), with α set to −1. The correlation is relatively insensitive to the precise value of α, producing a correlation coefficient of 0.92–0.93 for all α over the range −0.5 to −1.

$$k_f = \beta\, Q_D\, J^{Q_D N^{\alpha}} \qquad (4)$$

in which α is a negative number in the range of −0.5 to −1.0 (r = ∼0.92–0.93 over this range), J is a constant of magnitude < 1, and β is a constant analogous to κγ. Because J and α are interdependent variables, it is impossible to pinpoint the value of α more precisely. It is clear, however, that α is negative, leading to the counterintuitive result that, all other parameters being equal, longer proteins fold more rapidly.
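The length-corrected relationship can be sketched numerically. The following is a minimal illustration, not the authors' fitting procedure: it assumes Equation 4 has the form kf = β·QD·J^(QD·N^α), as implied by the linearized expression in the Figure 4 legend, and it fits that linearized form by ordinary least squares for a fixed α. The function names and every numerical value are hypothetical and chosen only to exercise the code; none are measured rates.

```python
import numpy as np

def log10_kf(q_d, n, alpha, log10_j, log10_beta):
    """log10 of the folding rate, assuming kf = beta * QD * J**(QD * N**alpha)."""
    q_d = np.asarray(q_d, dtype=float)
    n = np.asarray(n, dtype=float)
    return log10_beta + np.log10(q_d) + q_d * n**alpha * log10_j

def fit_linearized(q_d, n, log10_kf_obs, alpha):
    """Fit log(kf/QD) = log(beta) + (QD * N**alpha) * log(J) for a fixed alpha.

    Returns (log10_j, log10_beta, r), where r is the Pearson correlation.
    """
    q_d = np.asarray(q_d, dtype=float)
    n = np.asarray(n, dtype=float)
    x = q_d * n**alpha
    y = np.asarray(log10_kf_obs, dtype=float) - np.log10(q_d)
    slope, intercept = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]
    return slope, intercept, r

# Hypothetical inputs, for illustration only (not experimental data).
Q_D = np.array([10, 25, 40, 60, 80])
N = np.array([60, 70, 90, 100, 110])
ALPHA, LOG10_J, LOG10_BETA = -1.0, -6.0, 6.0

synthetic = log10_kf(Q_D, N, ALPHA, LOG10_J, LOG10_BETA)
fitted_log10_j, fitted_log10_beta, r = fit_linearized(Q_D, N, synthetic, ALPHA)

# Same QD, different chain length: the longer chain gets the higher predicted rate.
short_chain = log10_kf(30, 60, ALPHA, LOG10_J, LOG10_BETA)
long_chain = log10_kf(30, 120, ALPHA, LOG10_J, LOG10_BETA)
```

Because J < 1 (log J < 0) and α < 0, increasing N at fixed QD shrinks the magnitude of the negative exponent, which is the counterintuitive length dependence described above. Fitting at several fixed values of α, rather than fitting α and J jointly, reflects the interdependence of the two parameters noted in the text.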

It is not hard to rationalize this length dependence in the context of the topomer search model. It is consistent with the generalization of the model in which the mean equilibrium constant for the ordering of native pairs is dependent on chain length

$$\langle K \rangle = J^{N^{\alpha}} \qquad (5)$$

with an exponent, α, that is negative. A possible origin of this relationship (and the counterintuitive length dependence it gives rise to) is crowding. That is, if a sequence-distant interaction occurs, on average, once every 5 residues along the chain, steric and geometric constraints may render the native topomer more difficult to achieve than if, on average, sequence-distant interactions occur only every 10 residues. Critically, the Gaussian chain model is unlikely to capture crowding correctly, as it rather poorly mimics the stiffness of an unfolded polypeptide and entirely ignores excluded volume interactions. This suggests that simulations of more realistic chains are in order if we are to verify the validity of this currently empirical correction.
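The connection between the length-dependent equilibrium constant and the empirical rate expression can be made explicit with a short derivation. This is a sketch under two stated assumptions: that Equation 2 has the form k_f = κγ Q_D ⟨K⟩^{Q_D}, and that Equation 5 has the form ⟨K⟩ = J^{N^α}. Both forms are inferred from the definitions in the text (β is described as analogous to κγ, and J as a constant smaller than 1).

$$
\begin{aligned}
k_f &= \kappa\gamma\, Q_D\, \langle K\rangle^{Q_D} && \text{(assumed form of Equation 2)}\\
\langle K\rangle &= J^{N^{\alpha}} && \text{(assumed form of Equation 5)}\\
\Rightarrow\; k_f &= \kappa\gamma\, Q_D\, J^{Q_D N^{\alpha}} && \text{(Equation 4, with } \beta \text{ playing the role of } \kappa\gamma\text{)}\\
\frac{d\ln\langle K\rangle}{dN} &= \alpha\, N^{\alpha-1}\ln J \;>\; 0 && \text{for } \alpha<0,\; 0<J<1
\end{aligned}
$$

The last line shows why a negative α yields faster folding for longer chains at fixed Q_D: both α and ln J are negative, so ⟨K⟩ grows with N.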

The mean-field approximation

A second concern is that the mean-field approximation is simply that, an approximation. It is certain that the inclusion of additional parameters (beyond simply counting the number of sequence-distant native pairs) will be required in order to define the probability of achieving a given topomer more accurately. The equilibrium constant for bringing sequence-distant native pairs into proximity, for example, is at least a weak function of the chain length separating the pair from itself and from other, preordered pairs. This effect may be illustrated by studies in which the extension of solvent-exposed loops slows folding rates; such extension does not significantly alter QD, but does change the accuracy of the approximation that <K> is a constant. That said, the effect of extending a loop by less than lc residues is relatively subtle; extensions of 10–13 residues reduce rates by less than a factor of 4 (Ladurner and Fersht 1997; Viguera and Serrano 1997; Grantcharova et al. 2000). Only the longest reported loop extensions (e.g., a 59-residue loop inserted in an artificially engineered, monomeric arc repressor) produce significant changes in two-state folding rates (Robinson and Sauer 1998).
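The observation that loop extension changes sequence separations without changing QD can be illustrated with a toy count. The sketch below is an assumption-laden illustration, not the procedure used to compute QD in the studies cited: the contact list is invented, the helper names are hypothetical, and the threshold l_c = 12 is an assumed illustrative value for the minimum sequence separation of a "sequence-distant" pair.

```python
def count_sequence_distant_pairs(contacts, l_c=12):
    """Count native contacts (i, j) whose sequence separation exceeds l_c."""
    return sum(1 for i, j in contacts if abs(i - j) > l_c)

def insert_loop(contacts, after_residue, length):
    """Renumber contacts after inserting `length` residues following `after_residue`."""
    def shift(residue):
        return residue + length if residue > after_residue else residue
    return [(shift(i), shift(j)) for i, j in contacts]

# Hypothetical native contact list (residue index pairs); not from any real protein.
native_contacts = [(2, 30), (5, 28), (10, 40), (15, 18), (20, 35)]

# Insert a 10-residue loop after residue 32.
extended_contacts = insert_loop(native_contacts, after_residue=32, length=10)

q_d_before = count_sequence_distant_pairs(native_contacts)
q_d_after = count_sequence_distant_pairs(extended_contacts)
separations_before = [abs(i - j) for i, j in native_contacts]
separations_after = [abs(i - j) for i, j in extended_contacts]
```

In this toy case the count of sequence-distant pairs is the same before and after the insertion, while the separations of the pairs that span the insertion site grow. A model that only counts pairs therefore cannot register the change, which is the limitation of treating <K> as a constant. A local contact that straddles the insertion site could cross the l_c threshold and shift the count slightly, consistent with the statement that QD is not significantly altered rather than strictly invariant.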

Native interactions

A potentially more serious omission is that the topomer search model ignores all of the detailed chemical interactions that define the native state. As noted above, the topomer search is only part of the folding barrier, and the inclusion of specific, stabilizing interactions is critical if we are to develop a more predictive model of folding kinetics. Recent experimental results, however, suggest that the native-like interactions occurring in the folding transition state are rather plastic (i.e., can be altered substantially without significantly altering folding rates), and thus, their effect on folding kinetics may prove difficult to model accurately (Grantcharova and Baker 2001; Nauli et al. 2001). Nevertheless, progress has already been reported on this front for the folding of topologically simple proteins (QD < 4), for which native-like interactions play the greatest role in defining relative rates (Myers and Oas 2001; Islam et al. 2002).

Non-two-state folding

Further generalization of the model to fit non-two-state proteins may also prove difficult. The topomer search model is rooted in the observation that the folding energy landscape of two-state proteins is extremely smooth and, in the absence of energetic roughness, the connectivity-induced difficulty of the topomer search dominates relative barrier heights. In contrast, non-two-state folding necessarily implies that well-populated intermediates dominate the folding landscape, leading to deviations from single-exponential kinetics. Under these circumstances, folding kinetics could be defined by the rate of escape from these intermediate states rather than by the rate of topomer sampling (Sosnick et al. 1994; Debe et al. 1999; Millet et al. 2002). As the free energy of these states is defined by specific chemical interactions, predicting the kinetics with which they are escaped will probably not prove as simple as describing the kinetics of the topomer search.

Conclusions

The topomer search model stipulates that the random, diffusive process by which an unfolded polypeptide achieves its native topomer dominates the relative folding rates of two-state proteins. This native topomer is defined as the set of conformations in which every pair of residues in contact in the native state is in sufficient proximity that the residues can collide and form native interactions more rapidly than the relatively slow rate-limiting step in folding. Simulations of the diffusion of an inert, Gaussian chain indicate that the probability of such an occurrence relates simply to the number of sequence-distant residue pairs required to define the native topomer. Consistent with this result, the experimentally observed folding rates of two-state proteins correlate strongly with QD, the number of sequence-distant residue pairs in contact in the native state. The predictive value of this result supports the argument that the topomer search process is the dominant contributor to the relative barrier heights of two-state protein folding reactions.

Acknowledgments

The quantitative topomer search model was originally developed in collaboration with our colleagues Horia Metiu and Craig Keller and was motivated in part by the pioneering work of Derek Debe and William Goddard. The authors would also like to acknowledge numerous informative discussions with David Baker, Buzz Baldwin, Hue Sun Chan, Ken Dill, Chris Dobson, Carl Frieden, Blake Gillespie, Michael Gross, Jim Hu, Bob Matthews, Vijay Pande, Rohit Pappu, George Rose, David Shortle, and Tobin Sosnick.

Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.0220003.

References

  1. Alm, E. and Baker, D. 1999a. Matching theory and experiment in protein folding. Curr. Opin. Struct. Biol. 9 189–196.
  2. Alm, E. and Baker, D. 1999b. Prediction of protein-folding mechanisms from free-energy landscapes derived from native structures. Proc. Natl. Acad. Sci. 96 11305–11310.
  3. Baldwin, R.L. 2002. Protein folding: Making a network of hydrophobic clusters. Science 295 1657–1658.
  4. Baldwin, R.L. and Rose, G.D. 1999. Is protein folding hierarchic? II. Folding intermediates and transition states. Trends Biochem. Sci. 24 77–83.
  5. Bergasa-Caceres, F., Ronneberg, T.A., and Rabitz, H.A. 1999. Sequential collapse model for protein folding pathways. J. Phys. Chem. B 103 9749–9758.
  6. Bieri, O., Wirz, J., Hellrung, B., Schutkowski, M., Drewello, M., and Kiefhaber, T. 1999. The speed limit for protein folding measured by triplet–triplet energy transfer. Proc. Natl. Acad. Sci. 96 9597–9601.
  7. Bryngelson, J.D. and Wolynes, P.G. 1987. Spin-glasses and the statistical-mechanics of protein folding. Proc. Natl. Acad. Sci. 84 7524–7528.
  8. Bryngelson, J.D., Onuchic, J.N., Socci, N.D., and Wolynes, P.G. 1995. Funnels, pathways, and the energy landscape of protein-folding: a synthesis. Prot. Struct. Func. Gen. 21 167–195.
  9. Camarero, J.A., Fushman, D., Sato, S., Giriat, I., Cowburn, D., Raleigh, D.P., and Muir, T.W. 2001. Rescuing a destabilized protein fold through backbone cyclization. J. Mol. Biol. 308 1045–1062.
  10. Chan, H.S. and Dill, K.A. 1990. The effects of internal constraints on the configurations of chain molecules. J. Chem. Phys. 92 3118–3135.
  11. Choy, W.Y. and Forman-Kay, J. 2000. Calculation of ensembles of structures representing the unfolded state of an SH3 domain. J. Mol. Biol. 308 1011–1032.
  12. Clarke, J., Cota, E., Fowler, S.B., and Hamill, S.J. 1999. Folding studies of immunoglobulin-like β-sandwich proteins suggest that they share a common folding pathway. Structure 7 1145–1153.
  13. Clementi, C., Nymeyer, H., and Onuchic, J.N. 2000. Topological and energetic factors: What determines the structural details of the transition state ensemble and ‘en-route’ intermediates for protein folding? An investigation for small globular proteins. J. Mol. Biol. 298 937–953.
  14. Debe, D.A. and Goddard, W.A. 1999. First principles prediction of protein folding rates. J. Mol. Biol. 294 619–625.
  15. Debe, D.A., Carlson, M.J., and Goddard, W.A. 1999. The topomer-sampling model of protein folding. Proc. Natl. Acad. Sci. 96 2596–2601.
  16. Dill, K.A. and Chan, H.S. 1997. From Levinthal to pathways to funnels. Nat. Struct. Biol. 4 10–19.
  17. Dinner, A.R., Sali, A., Smith, L.J., Dobson, C.M., and Karplus, M. 2000. Understanding protein folding via free-energy surfaces from theory and experiment. Trends Biochem. Sci. 25 331–339.
  18. Dobson, C.M., Sali, A., and Karplus, M. 1998. Protein folding: A perspective from theory and experiment. Ang. Chem. Int. Ed. 37 868–893.
  19. Eaton, W.A., Munoz, V., Hagen, S.J., Jas, G.S., Lapidus, L.J., Henry, E.R., and Hofrichter, J. 2000. Fast kinetics and mechanisms in protein folding. Annu. Rev. Biophys. Biomol. Struct. 29 327–359.
  20. Fersht, A.R. 1997. Nucleation mechanisms in protein folding. Curr. Opin. Struct. Biol. 7 3–9.
  21. Fersht, A.R. 2000. Transition-state structure as a unifying basis in protein-folding mechanisms: Contact order, chain topology, stability, and the extended nucleus mechanism. Proc. Natl. Acad. Sci. 97 1525–1529.
  22. Flanagan, J.M., Kataoka, M., Shortle, D., and Engelman, D.M. 1992. Truncated staphylococcal nuclease is compact but disordered. Proc. Natl. Acad. Sci. 89 748–752.
  23. Flory, P.J. 1956. Theory of elastic mechanisms in fibrous proteins. J. Am. Chem. Soc. 78 5222–5234.
  24. Galzitskaya, O.V. and Finkelstein, A.V. 1999. A theoretical search for folding/unfolding nuclei in three-dimensional protein structures. Proc. Natl. Acad. Sci. 96 11299–11304.
  25. Gillespie, B. and Plaxco, K.W. 2000. Non-glassy kinetics in the folding of a simple, single domain protein. Proc. Natl. Acad. Sci. 97 12014–12019.
  26. Grantcharova, V.P. and Baker, D. 2001. Circularization changes the folding transition state of the src SH3 domain. J. Mol. Biol. 306 555–563.
  27. Grantcharova, V.P., Riddle, D.S., and Baker, D. 2000. Long-range order in the src SH3 folding transition state. Proc. Natl. Acad. Sci. 97 7084–7089.
  28. Grantcharova, V.P., Alm, E.J., Baker, D., and Horowitz, A.L. 2001. Mechanisms of protein folding. Curr. Opin. Struct. Biol. 11 70–82.
  29. Gromiha, M.M. and Selvaraj, S. 2001. Comparison between long-range interactions and contact order in determining the folding rate of two-state proteins: Application of long-range order to folding rate prediction. J. Mol. Biol. 310 27–32.
  30. Gross, M. 1996. Linguistic analysis of protein folding. FEBS Lett. 390 249–252.
  31. Guijarro, J.I., Morton, C.J., Plaxco, K.W., Campbell, I.D., and Dobson, C.M. 1998. Folding kinetics of the SH3 domain of PI3 by real-time NMR and optical techniques. J. Mol. Biol. 275 657–667.
  32. Guo, Z.Y., Brooks, C.L., and Boczko, E.M. 1997. Exploring the folding free energy surface of a three-helix bundle protein. Proc. Natl. Acad. Sci. 94 10161–10166.
  33. Gutin, A.M. and Shakhnovich, E.I. 1994. Statistical mechanics of polymers with distance constraints. J. Chem. Phys. 100 5290–5293.
  34. Gutin, A.M., Abkevich, V.I., and Shakhnovich, E.I. 1996. Chain length scaling of protein folding time. Phys. Rev. Lett. 77 5433–5436.
  35. Hagen, S.J., Hofrichter, J., Szabo, A., and Eaton, W.A. 1996. Diffusion-limited contact formation in unfolded cytochrome c: Estimating the maximum rate of protein folding. Proc. Natl. Acad. Sci. 93 11615–11617.
  36. Hagen, S.J., Hofrichter, J., and Eaton, W.A. 1997. Rate of intrachain diffusion of unfolded cytochrome c. J. Phys. Chem. B 101 2352–2365.
  37. Hagen, S.J., Carswell, C.W., and Sjolander, E.M. 2001. Rate of intrachain contact formation in an unfolded protein: Temperature and denaturant effects. J. Mol. Biol. 305 1161–1171.
  38. Hodsdon, M.E. and Frieden, C. 2001. Intestinal fatty acid binding protein: The folding mechanism as determined by NMR studies. Biochemistry 40 732–742.
  39. Islam, S.A., Karplus, M., and Weaver, D.L. 2002. Application of the diffusion-collision model to the folding of three-helix bundle proteins. J. Mol. Biol. 318 199–215.
  40. Ivankov, D.N. and Finkelstein, A.V. 2001. Theoretical study of a landscape of protein folding-unfolding pathways. Folding rates at midtransition. Biochemistry 40 9957–9961.
  41. Jackson, S.E. 1998. How do small single domain proteins fold? Fold. Des. 3 R81–R91.
  42. Jackson, S.E. and Fersht, A.R. 1991. The folding of chymotrypsin inhibitor-2. 1. Evidence for a two-state transition. Biochemistry 30 10428–10435.
  43. Jacobson, H. and Stockmayer, W.H. 1950. Intramolecular reaction in polycondensations. I. The theory of linear systems. J. Chem. Phys. 18 1600–1606.
  44. Karplus, M. and Weaver, D.L. 1979. Diffusion-collision model for protein folding. Biopolymers 18 1421–1437.
  45. Kaya, H. and Chan, H.S. 2002. Towards a consistent modeling of protein thermodynamic and kinetic cooperativity: How applicable is the transition state picture to folding and unfolding? J. Mol. Biol. 315 899–909.
  46. Klein-Seetharaman, J., Oikawa, M., Grimshaw, S.B., Wirmer, J., Duchardt, E., Ueda, T., Imoto, T., Smith, L.J., Dobson, C.M., and Schwalbe, H. 2002. Long-range interactions within a nonnative protein. Science 295 1719–1722.
  47. Kolinski, A., Galazka, W., and Skolnick, J. 1998. Monte Carlo studies of the thermodynamics and kinetics of reduced protein models: Application to small helical, β, and α/β proteins. J. Chem. Phys. 108 2608–2617.
  48. Ladurner, A.G. and Fersht, A.R. 1997. Glutamine, alanine, or glycine repeats inserted into the loop of a protein have minimal effects on stability and folding rates. J. Mol. Biol. 273 330–337.
  49. Ladurner, A.G., Itzhaki, L.S., Gay, G.D., and Fersht, A.R. 1997. Complementation of peptide fragments of the single domain protein chymotrypsin inhibitor 2. J. Mol. Biol. 273 317–329.
  50. Lapidus, L.J., Eaton, W.A., and Hofrichter, J. 2000. Measuring the rate of intramolecular contact formation in polypeptides. Proc. Natl. Acad. Sci. 97 7220–7225.
  51. Lindberg, M.O., Tangrot, J., Otzen, D.E., Dolgikh, D.A., Finkelstein, A.V., and Oliveberg, M. 2001. Folding of circular permutants with decreased contact order: General trend balanced by protein stability. J. Mol. Biol. 314 891–900.
  52. Makarov, D.E. and Metiu, H. 2002. A model for the kinetics of protein folding: Kinetic Monte Carlo simulations and analytical results. J. Chem. Phys. 116 5205–5216.
  53. Makarov, D.E., Keller, C.A., Plaxco, K.W., and Metiu, H. 2002. How the folding rate constant of simple-single domain proteins depends on number of native contacts. Proc. Natl. Acad. Sci. 99 3535–3539.
  54. Mayor, U., Johnson, C.M., Daggett, V., and Fersht, A.R. 2000. Protein folding and unfolding in microseconds to nanoseconds by experiment and simulation. Proc. Natl. Acad. Sci. 97 13518–13522.
  55. Miller, E.J., Fischer, K.F., and Marqusee, S. 2002. Experimental evaluation of topological parameters determining protein-folding rates. Proc. Natl. Acad. Sci. 99 10359–10363.
  56. Millet, I.S., Townsley, L., Chiti, F., Doniach, S., and Plaxco, K.W. 2002. Equilibrium collapse and the kinetic ‘foldability’ of proteins. Biochemistry 41 321–325.
  57. Mirny, L. and Shakhnovich, E. 2001. Protein folding theory: From lattice to all-atom models. Annu. Rev. Biophys. Biomol. Struct. 30 361–396.
  58. Muñoz, V. and Eaton, W.A. 1999. A simple model for calculating the kinetics of protein folding from three-dimensional structures. Proc. Natl. Acad. Sci. 96 11311–11316.
  59. Muñoz, V., Thompson, P.A., Hofrichter, J., and Eaton, W.A. 1997. Folding dynamics and mechanism of β-hairpin formation. Nature 390 196–199.
  60. Myers, J.K. and Oas, T.G. 2001. Preorganized secondary structure as an important determinant of fast protein folding. Nat. Struct. Biol. 8 552–558.
  61. Nauli, S., Kuhlman, B., and Baker, D. 2001. Computer-based redesign of a protein folding pathway. Nat. Struct. Biol. 8 602–605.
  62. Onuchic, J.N., Luthey-Schulten, Z., and Wolynes, P.G. 1997. Theory of protein folding: The energy landscape perspective. Annu. Rev. Phys. Chem. 48 545–600.
  63. Otzen, D.E. and Fersht, A.R. 1998. Folding of circular and permuted chymotrypsin inhibitor 2: Retention of the folding nucleus. Biochemistry 37 8139–8146.
  64. Penkett, C.J., Redfield, C., Jones, J.A., Dodd, I., Hubbard, J., Smith, R.A.G., Smith, L.J., and Dobson, C.M. 1998. Structural and dynamical characterization of a biologically active unfolded fibronectin-binding protein from Staphylococcus aureus. Biochemistry 37 17054–17067.
  65. Plaxco, K.W. and Gross, M. 2001. Unfolded, yes, but random? Never! Nat. Struct. Biol. 8 659–670.
  66. Plaxco, K.W., Simons, K.T., and Baker, D. 1998. Contact order, transition state placement and the refolding rates of single domain proteins. J. Mol. Biol. 277 985–994.
  67. Plaxco, K.W., Millett, I.S., Segel, D.J., Doniach, S., and Baker, D. 1999. Polypeptide chain collapse can occur concomitantly with the rate limiting step in protein folding. Nat. Struct. Biol. 6 554–557.
  68. Plaxco, K.W., Simons, K.T., Ruczinski, I., and Baker, D. 2000. Topology, stability, sequence, and length: defining the determinants of two-state protein folding kinetics. Biochemistry 39 11177–11183.
  69. Plotkin, S.S., Wang, J., and Wolynes, P.G. 1996. Correlated energy landscape model for finite, random heteropolymers. Phys. Rev. E 53 6271–6296.
  70. Poland, D.C. and Scheraga, H.A. 1965. Statistical mechanics of noncovalent bonds in polyamino acids. 8. Covalent loops in proteins. Biopolymers 3 379–385.
  71. Portman, J.J., Takada, S., and Wolynes, P.G. 2001. Microscopic theory of protein folding rates. II. Local reaction coordinates and chain dynamics. J. Chem. Phys. 114 5082–5096.
  72. Robinson, C.R. and Sauer, R.T. 1998. Optimizing the stability of single-chain proteins by linker length and composition mutagenesis. Proc. Natl. Acad. Sci. 95 5929–5934.
  73. Rose, G.D. 1979. Hierarchic organization of domains in globular-proteins. J. Mol. Biol. 134 447–470.
  74. Schwalbe, H., Fiebig, J.M., Buck, M., Jones, J.A., Grimshaw, S.B., Spencer, A., Glaser, S.J., Smith, L.J., and Dobson, C.M. 1997. Structural and dynamical properties of a denatured protein. Heteronuclear 3D NMR experiments and theoretical simulations of lysozyme in 8 M urea. Biochemistry 36 8977–8991.
  75. Shea, J.E., Onuchic, J.N., and Brooks, C.L. 1999. Exploring the origins of topological frustration: Design of a minimally frustrated model of fragment B of protein A. Proc. Natl. Acad. Sci. 96 12512–12517.
  76. Sheinerman, F.B. and Brooks, C.L. 1998. Molecular picture of folding of a small α/β protein. Proc. Natl. Acad. Sci. 95 1562–1567.
  77. Shoemaker, B.A. and Wolynes, P.G. 1999. Exploring structures in protein folding funnels with free energy functionals: The denatured ensemble. J. Mol. Biol. 287 657–674.
  78. Shortle, D. and Ackerman, M.S. 2001. Persistence of native-like topology in a denatured protein in 8 M urea. Science 293 487–489.
  79. Socci, N.D., Onuchic, J.N., and Wolynes, P.G. 1998. Protein folding mechanisms and the multidimensional folding funnel. Prot. Struct. Func. Gen. 32 136–158.
  80. Sosnick, T.R., Mayne, L., Hiller, R., and Englander, S.W. 1994. The barriers in protein folding. Nat. Struct. Biol. 1 149–156.
  81. Sosnick, T.R., Mayne, L., and Englander, S.W. 1996. Molecular collapse: the rate-limiting step in two-state cytochrome c folding. Proteins 24 413–426.
  82. Szabo, A., Schulten, K., and Schulten, Z. 1980. 1st passage time approach to diffusion controlled reactions. J. Chem. Phys. 72 4350–4357.
  83. Thirumalai, D. 1995. From minimal models to real proteins: Time scales for protein folding. J. Physique I 5 1457–1467.
  84. Thompson, P.A., Eaton, W.A., and Hofrichter, J. 1997. Laser temperature jump study of the helix⇌coil kinetics of an alanine peptide interpreted with a ‘kinetic zipper’ model. Biochemistry 36 9200–9210.
  85. Van Nuland, N.A.J., Chiti, F., Taddei, N., Raugei, G., Ramponi, G., and Dobson, C.M. 1998. Slow folding of muscle acylphosphatase in the absence of intermediates. J. Mol. Biol. 283 883–891.
  86. Viguera, A.R. and Serrano, L. 1997. Loop length, intramolecular diffusion and protein folding. Nat. Struct. Biol. 4 939–946.
  87. Wittung-Stafshede, P., Lee, J.C., Winkler, J.R., and Gray, H.B. 1999. Cytochrome b562 folding triggered by electron transfer: Approaching the speed limit for formation of a four-helix-bundle protein. Proc. Natl. Acad. Sci. 96 6587–6590.
  88. Zagrovic, B., Snow, C., Khaliq, S., Shirts, M., and Pande, V. 2002. Native-like mean structure in the unfolded ensemble of small proteins. J. Mol. Biol. (in press).
  89. Zhdanov, V.P. 1998. Folding time of ideal β-sheets vs. chain length. Europhys. Lett. 42 577–581.
  90. Zhou, H.Y. and Zhou, Y.Q. 2002. Folding rate prediction using total contact distance. Biophys. J. 82 458–463.
  91. Zhou, Y.Q. and Karplus, M. 1999. Interpreting the folding kinetics of helical proteins. Nature 401 400–403.

Articles from Protein Science : A Publication of the Protein Society are provided here courtesy of The Protein Society
