Biophysics and Physicobiology. 2016 Nov 18;13:281–293. doi: 10.2142/biophysico.13.0_281

Cooperativity and modularity in protein folding

Masaki Sasai 1, George Chikenji 1, Tomoki P Terada 1
PMCID: PMC5221511  PMID: 28409080

Abstract

A simple statistical mechanical model proposed by Wako and Saitô has explained many aspects of protein folding surprisingly well. This model was systematically applied to multiple proteins by Muñoz and Eaton and has since been referred to as the Wako-Saitô-Muñoz-Eaton (WSME) model. The success of the WSME model in explaining the folding of many proteins has verified the hypothesis that folding is dominated by native interactions, which makes the energy landscape globally biased toward the native conformation. Using the WSME and other related models, Saitô emphasized the importance of the hierarchical pathway in protein folding; folding starts with the creation of contiguous segments having a native-like configuration and proceeds as growth and coalescence of these segments. The Φ-values calculated for barnase with the WSME model suggested that segments contributing to the folding nucleus are similar to the structural modules defined by the pattern of native atomic contacts. The WSME model was extended to explain the folding of multi-domain proteins having a complex topology, which opened the way to comprehensively understanding the folding process of multi-domain proteins. The WSME model was also extended to describe allosteric transitions, indicating that the allosteric structural movement does not occur as a deterministic sequential change between two conformations but as a stochastic diffusive motion over the dynamically changing energy landscape. The statistical mechanical viewpoint on folding, as highlighted by the WSME model, has been renewed in the context of modern methods and ideas, and will continue to provide insights on equilibrium and dynamical features of proteins.

Keywords: WSME model, energy landscape, statistical mechanics


Understanding protein folding is a fascinating problem of biomolecular self-organization, and it is a prerequisite for comprehending the reactions and interactions of proteins. An important method for delineating the folding problem is through a simple statistical mechanical model. The model was proposed by Wako and Saitô in 1978 [1,2] by extending classical models of helix-coil transitions [3,4] to many-bodied heterogeneous cases. However, the model was not widely accepted until quantitative comparison between the model results and the experimental data became possible.

Around 1990–2000, three important advances changed the researchers’ viewpoint. The first advance was the progress in statistical mechanics of complex systems such as spin glasses and neural networks. Accordingly, a complex system’s behavior could be described as a competition between its tendency to be trapped into one of extensively many disordered states and its tendency to globally drift along the energy landscape toward an ordered functional state. Applying this notion to protein folding revealed that the global structure of the folding energy landscape is a key to explaining the experimental results [5]. The second advance was the experimental observation of the folding rates of systematically derived mutant proteins, which led to the Φ-value analysis technique to reveal structures of the transition state ensemble of folding [6,7]. The third advance was the drastic increase in computational power, which facilitated not only large-scale simulations with realistic models but also the quick and accurate evaluation of folding mechanisms with simplified models. Combining these advances, theoretical models of the energy landscape of folding were introduced to explain and predict the experimentally observed Φ-values and other quantities, which led to the innovative cooperation between theories and experiments and promoted a paradigm shift in folding studies [8,9]. The model developed by Wako and Saitô was “re-discovered” in 1999 by Muñoz and Eaton [10], and this model has since made a significant contribution to the advancement in folding studies.

A major advantage of this model is that the partition function can be exactly calculated from the model Hamiltonian [11,12]; the exact calculation allows us to obtain a transparent picture of free-energy landscapes, pathways, and rates of folding. The model was at first criticized as quantitatively invalid [13]. However, such invalidity was due to the particular approximation used in the calculation, and the problem disappeared when the exact solution of the model was used. Since then, the Wako-Saitô-Muñoz-Eaton (WSME) model has been widely applied in calculating pathways [14–23] and kinetics [14,19,23–25] of folding as well as in explaining mechanical unfolding [26,27], amyloidosis [28], and allosteric transitions and functions [29–32]. In this review, we discuss the physics behind the WSME model and its applications to folding and other intriguing biophysical problems.

The WSME Model and Cooperativity

In the WSME model, a protein conformation is described by a set of Ising-like variables, $\{m_i\}$: $m_i=1$ when the dihedral angles of the backbone chain at the $i$th residue have values similar to those in the native conformation, and $m_i=0$ otherwise. The WSME Hamiltonian is defined as a function of $\{m_i\}$ as

$$H_{\mathrm{WSME}}(\{m_i\}) = \sum_{i=1}^{N-1}\sum_{j=i+1}^{N} \varepsilon_{ij}\,\Delta_{ij} \prod_{k=i}^{j} m_k, \tag{1}$$

where $N$ is the total number of residues in the protein and $\Delta_{ij}$ represents the pattern of native contacts: $\Delta_{ij}=1$ when the residues $i$ and $j$ are in contact in the native conformation and $\Delta_{ij}=0$ otherwise. $\varepsilon_{ij}<0$ represents the strength of the attractive native interactions, for which we may use $\varepsilon_{ij} \approx -0.3$ to $-1.5$ kcal·mol$^{-1}$ depending on the extent of the atomic contacts between the residues $i$ and $j$ in the native conformation [23]. The partition function is calculated as

$$Z_{\mathrm{WSME}}(n) = \mathrm{Tr}_n \exp\left(-\frac{H_{\mathrm{WSME}}(\{m_i\})}{k_{\mathrm{B}}T} - \sum_{i=1}^{N}\sigma_i m_i\right). \tag{2}$$

Here, $0\le n\le 1$ is an order parameter of folding: $n=0$ when the chain is completely disordered, and $n=1$ when the structure is identical to that determined via X-ray or NMR analysis. $\mathrm{Tr}_n$ is a sum under the constraint $M=\sum_{i=1}^{N}m_i=Nn$, that is, $\mathrm{Tr}_n=\sum_{m_1=0,1}\sum_{m_2=0,1}\cdots\sum_{m_N=0,1}\delta_{M,Nn}$, where $\delta_{M,Nn}$ is a Kronecker delta. $-\sigma_i$ represents the reduction of entropy upon structure ordering at the residue $i$, and we may use $\sigma_i k_{\mathrm{B}}\approx 2$–$3$ cal·mol$^{-1}$·K$^{-1}$ [23]. From Eq. 2, we can calculate the free energy, $F(n)=-k_{\mathrm{B}}T\log Z_{\mathrm{WSME}}(n)$, which is the one-dimensional free-energy landscape represented as a function of $n$. The expression of Eq. 2 can be easily extended to the two-dimensional version, $Z_{\mathrm{WSME}}(n_1,n_2)$, with the corresponding free-energy landscape, $F(n_1,n_2)$, by introducing the two-dimensional folding order parameter $(n_1,n_2)$ with $n_1=\sum_{i=1}^{N_1}m_i/N_1$, $n_2=\sum_{i=N_1+1}^{N}m_i/N_2$, and $N_1+N_2=N$ [15,16,20,23]; the higher-dimensional representation is also feasible [20].
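As a small illustration of Eq. 2 and of F(n), the sketch below evaluates the constrained trace by brute-force enumeration of all 2^N states of a toy chain. This is not the exact solution method of [11,12]; enumeration is only practical for very short chains. The contact map, the uniform value of σ, and the names `wsme_energy`, `free_energy_profile`, `N_RES`, `CONTACTS`, `KT`, and `SIGMA` are assumptions made for illustration and are not taken from the original work.

```python
import math
from itertools import product


def wsme_energy(m, contacts):
    """Eq. 1: add eps_ij for each native contact whose segment i..j is fully native-like."""
    return sum(
        eps
        for (i, j), eps in contacts.items()
        if all(m[k] == 1 for k in range(i, j + 1))
    )


def free_energy_profile(n_res, contacts, sigma, kT):
    """Return a list F with F[M] = -kT * log Z(M), where M = sum_i m_i = N * n.

    Z(M) is the constrained trace of Eq. 2, evaluated by enumerating all 2**N states.
    """
    z = [0.0] * (n_res + 1)
    for m in product((0, 1), repeat=n_res):
        big_m = sum(m)
        z[big_m] += math.exp(-wsme_energy(m, contacts) / kT - sigma * big_m)
    return [-kT * math.log(z_m) for z_m in z]


# Hypothetical toy chain: 8 residues (0-based indices) with contacts between i and i+3
N_RES = 8
CONTACTS = {(i, i + 3): -1.0 for i in range(N_RES - 3)}  # eps_ij in kcal/mol (illustrative)
KT = 0.6      # roughly k_B * T near 300 K, in kcal/mol
SIGMA = 1.2   # dimensionless sigma_i, assumed uniform; sigma * k_B is about 2.4 cal/mol/K

profile = free_energy_profile(N_RES, CONTACTS, SIGMA, KT)
for big_m, f in enumerate(profile):
    print(f"n = {big_m / N_RES:.3f}  F = {f:+.3f} kcal/mol")
```

The two end points are easy to check by hand: at n=0 there is a single state with zero energy, so F=0, and at n=1 there is a single state in which all five toy contacts are formed, so F equals the total contact energy plus k_BT·σ·N.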

The WSME model is based on two major assumptions. First, it does not consider non-native interactions. Since only native interactions are explicitly considered in Eq. 1, the energy monotonically decreases as the chain approaches the native conformation, i.e., the energy landscape has a global bias toward the native conformation. This global bias has been considered as a characteristic of sequences selected by evolution to meet consistency between local and global structures [33] or to show minimally frustrated interactions [5]. The model with such a global bias was first considered by Gō and his colleagues [34–36], and the WSME model belongs to a class of such “Gō-like models”.

Another significant assumption in the model is that a native interaction occurs only within the “island” of a native-like configuration; the $\varepsilon_{ij}$ term in Eq. 1 has a nonzero contribution to $H_{\mathrm{WSME}}$ only when the consecutive segment from residues $i$ through $j$ assumes native-like configurations, satisfying $m_i m_{i+1}\cdots m_{j-1}m_j=1$. This assumption is illustrated in Figure 1, where intra-segment native interactions are effective (Fig. 1A), but interactions are ineffective when an intervening residue takes the “wrong” direction (Fig. 1B). This assumption seems plausible when we consider that the residues should form a local ordered structure through compact atomic packing of residue side chains. Such local structural ordering should be represented as a cooperative many-residue correlation given by $m_i m_{i+1}\cdots m_{j-1}m_j=1$ and not as a naive summation of pairwise correlations.
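The effect of the product term in Eq. 1 can be seen in a minimal sketch that contrasts the situations of Fig. 1A and Fig. 1B. The six-residue chain, the single contact, its energy value, and the function name `wsme_energy` are hypothetical choices made only for illustration; residues are indexed from 0.

```python
def wsme_energy(m, contacts):
    """Eq. 1 for a configuration m (list of 0/1) and a dict {(i, j): eps_ij} of native contacts."""
    energy = 0.0
    for (i, j), eps in contacts.items():
        # prod_{k=i..j} m_k is 1 only if every residue from i through j is native-like
        if all(m[k] == 1 for k in range(i, j + 1)):
            energy += eps
    return energy


# Hypothetical chain with one native contact between residues 1 and 4
contacts = {(1, 4): -1.0}  # kcal/mol, illustrative value

island = [0, 1, 1, 1, 1, 0]  # residues 1..4 form a contiguous native-like segment (Fig. 1A)
broken = [0, 1, 1, 0, 1, 0]  # residue 3 is non-native and interrupts the segment (Fig. 1B)

e_island = wsme_energy(island, contacts)  # the contact contributes its energy
e_broken = wsme_energy(broken, contacts)  # the contact contributes nothing
print(e_island, e_broken)
```

Although residues 1 and 4 are both native-like in the second configuration, the contact is not counted, which is the many-residue cooperativity that distinguishes this Hamiltonian from a sum of pairwise terms.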

Figure 1.

The native interaction in the WSME model. Residues in the native-like configuration are shown with white circles, and residues in non-native configurations are shown with filled circles. A) The native interaction (a blue dashed line) between the residues within a contiguous native-like segment is taken into account in the WSME model. B) The interaction becomes ineffective when an intervening residue is in the non-native configuration. C) If the linker chain connecting two native-like segments is long enough, a number of residues with random configurations can compensate each other to allow two segments to reach the positions where native interactions are effective. This type of interaction, however, is not taken into account in the WSME model. D) Interactions as in C can be suitably calculated with the WSME Hamiltonian if we consider that the N- and C-termini are connected by a virtual link, as explained in the section “The eWSME Model for Multi-domain Proteins”.

With these two assumptions, contiguous native-like segments are energetically stabilized. Therefore, as illustrated in Figure 2, folding starts with the creation of short segments with the native-like configuration and proceeds through growth and coalescence of these segments into a larger region to assume the native conformation. We should note that there are combinatorially many ways of segment creation, growth, and coalescence, and the statistical weight of these different pathways is evaluated with the WSME model to explain the distribution of folding pathways observed in the ensemble of protein molecules.

Figure 2.

The hierarchical process of protein folding. Folding starts with the creation of contiguous segments with a native-like configuration. After nucleation, folding proceeds as those segments grow and coalesce into larger regions to reach native conformation.

The WSME model quantitatively explains free-energy landscapes, pathways, Φ-values, and kinetic rates of the folding of various proteins [14–23]. In Figure 3, an example result is shown for the B domain of protein A (BdpA). As shown in Figure 3A, BdpA is a small 60-residue, α-helical protein comprising three helices: H1, H2, and H3. BdpA demonstrates a two-state folding transition between the unfolded and native states [37]. The two-dimensional free-energy landscape F(n1, n2) was calculated, where n1 is the order parameter of folding for the N-terminal half, and n2 is the one for the C-terminal half. In F(n1, n2) of Figure 3B, we find two basins: one at a small n1 and a small n2, which corresponds to the unfolded state, and the other at (n1, n2)≈(0.95, 0.96), which corresponds to the native state. In this landscape, we find two saddles with similar free-energy heights; therefore, BdpA has two dominant transition states, TS1 and TS2, in this representation. Along the pathway through TS1, the helix H1 folds earlier than H3, whereas along the pathway through TS2, H3 folds earlier than H1. The Φ-values were calculated at TS1 and TS2 with the WSME model. Here, the Φ-value represents the frequency of structure formation at each residue in the transition state ensemble. By averaging the Φ-values at the two TSs with the respective weights of the Boltzmann factor, the average Φ-values are calculated and compared with the observed ones in Figure 3C, which shows good agreement between the calculated and observed data. The existence of two TSs having almost equivalent free-energy heights is due to the symmetrical native conformation of BdpA, as shown in Figure 3A, and a subtle difference in the experimental conditions or settings of the simulation model should break this symmetry and change the relative heights of TS1 and TS2. The results of several simulation studies are conflicting on which helix, H1 or H3, folds earlier [38], but the WSME model provides a clear explanation of the reason for this disagreement: a symmetrical native conformation brings about competing multiple pathways of folding, and the detailed simulation condition or the parameter setting modulates the relative statistical importance of these pathways.
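The Boltzmann-weighted averaging of Φ-values over two transition states can be sketched as follows. The per-residue values, the free-energy heights, and the function name `boltzmann_average_phi` are hypothetical and serve only to show the arithmetic; they are not the BdpA results.

```python
import math


def boltzmann_average_phi(phi_by_ts, free_energies, kT):
    """Average per-residue values over transition states with weights exp(-F / kT)."""
    f_min = min(free_energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(f - f_min) / kT) for f in free_energies]
    total = sum(weights)
    n_res = len(phi_by_ts[0])
    return [
        sum(w * phi[r] for w, phi in zip(weights, phi_by_ts)) / total
        for r in range(n_res)
    ]


# Hypothetical three-residue example: TS1 has the N-terminal side ordered, TS2 the C-terminal side
phi_ts1 = [0.8, 0.6, 0.1]
phi_ts2 = [0.1, 0.6, 0.8]

# Equal barrier heights give the plain arithmetic mean of the two profiles
phi_equal = boltzmann_average_phi([phi_ts1, phi_ts2], [3.0, 3.0], kT=0.6)

# Raising TS2 by a few kT makes the average approach the TS1 profile
phi_biased = boltzmann_average_phi([phi_ts1, phi_ts2], [3.0, 6.0], kT=0.6)
print(phi_equal, phi_biased)
```

The second call illustrates the point made above about symmetry breaking: a modest change in the relative heights of the two saddles shifts the averaged profile strongly toward one pathway.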

Figure 3.

Application of the WSME model to the B domain of Staphylococcal protein A (BdpA). A) Native conformation of BdpA (Protein Data Bank (PDB) code: 1bdd). B) Two-dimensional free-energy landscape, F(n1, n2), calculated with the WSME model, where n1 is the folding order parameter of the N-terminal half, and n2 is the one of the C-terminal half. A contour is drawn every 0.5kBT. F(n1, n2) has two basins: the unfolded state basin (n1≈0.3, n2≈0.3) and the basin of the native state (n1≈1.0, n2≈1.0). Two transition states, TS1 and TS2, are shown; there are two dominant pathways of folding, which proceed through TS1 and TS2. C) Comparison of the calculated and observed Φ-values. The calculated values are shown with a line, and the observed values [37] are shown as green squares with error bars. Bars on the bottom represent the positions of α helices. Modified from Figures 1, 3, and 5 of [15].

Another example is shown for barnase in Figures 4 and 5. Barnase is a 110-residue α+β protein (Fig. 4A), and its folding proceeds via an intermediate state [6]. The two-dimensional free-energy landscape F(n1, n2) was calculated by disregarding two structurally unresolved residues with N1=54 and N2=54; therefore, n1 is the order parameter of folding for the N-terminal half and n2 is the one for the C-terminal half. In F(n1, n2) of Figure 4B, a dominant intermediate state is represented by a basin at a large n2 and a small n1 value, indicating that the C-terminal half is more structurally ordered than the N-terminal half is in the intermediate state. There are two transition states, TS1 between the unfolded and intermediate states, and TS2 between the intermediate and native states. In Figure 5, the calculated Φ-values at TS1 and TS2 are compared with the experimentally observed values [39,40], showing a good agreement between the WSME results and the observed data. In barnase, as shown in Figure 5, the Φ-value shows a large change around the boundaries of the structural modules, which are defined by the geometrical pattern of the native contacts [41–45]. This interesting feature will be discussed later in the Discussion section.

Figure 4.

Application of the WSME model to barnase. A) Native conformation of barnase from Bacillus amyloliquefaciens (PDB code: 1a2p). B) Two-dimensional free-energy landscape, F(n1, n2), calculated with the WSME model, where n1 is the order parameter of folding of the N-terminal half, and n2 is the one of the C-terminal half. A contour is drawn every 2kBT. F(n1, n2) has four basins: the basin of the unfolded state (n1≈0.2, n2≈0.2), the basin of the native state (n1≈1.0, n2≈1.0), and two basins of intermediate states, I1 (n1≈0.2, n2≈0.8) and I2 (n1≈0.9, n2≈0.2). Saddles around the basin I1 are much lower in free energy than those around I2; therefore, the pathway through I1 is the dominant pathway, and I1 is the dominant intermediate. I2 could be detected as an off-pathway intermediate. Along the dominant pathway, there are two transition states, TS1 and TS2. Modified from Figure 14 of [20] with permission.

Figure 5.

Calculated and observed Φ-values at the two transition states, TS1 and TS2, of barnase. Lines shaded with gray correspond to the calculated Φ-values with the WSME model. Dots are the experimentally observed values [39,40]. Red arrows are boundaries of modules defined by the pattern of atomic contacts in the native conformation [44,45]. Bars shown on the bottom represent secondary structure elements, helices (blue) and strands (yellow). Modified from Figure 15 of [20] with permission.

As in the above examples, the WSME model explained the experimentally observed data of many proteins, which strongly suggests that the two major assumptions made in developing the WSME model, dominance of native interactions and the local cooperative formation of the native-like configuration, are indeed valid assumptions. The dominance of native interactions was also recently shown [21,46] using folding trajectories of all-atom simulations performed by Shaw’s group [47–49]. Comparing the folding trajectories of all-atom simulations and the WSME results, it was shown that the much simpler WSME model quantitatively explains the all-atom results [21]. The dominance of native interactions can be interpreted as follows. When we consider the atomic details of a short molecular dynamics trajectory of the picosecond time-scale, there would be no distinction between native and non-native interactions; both have the same physical origin as electrostatic, hydrophobic, or van der Waals interactions. However, when we consider a microsecond or longer process, the non-native interactions are only transiently formed within that process; also, the lifetime of native interactions is much longer due to the multi-residue cooperativity forming the local ordered structure. Then, we can approximate the long-term process using only the native interactions. The dominance of native interactions and the resulting globally biased energy landscape were first assumed by Gō and his colleagues to explain the two-state feature of folding transitions [33,34]. It was re-formulated later to explain how the trapping into the non-native states is prevented as well as how the Levinthal paradox is resolved in the energy landscape perspective [5,8]. Here, the dominance of native interactions in folding has been clearly supported by the results of the quantitative analyses of experimental data and all-atom simulations, and the WSME model has played an important role in these analyses.

By regarding the dominance of native interactions as the 0th order description, non-native interactions should determine the next order description. Thus, non-native interactions should bring about the off-pathway intermediates in the folding process or work as “friction” in the course of folding [50]; non-native interactions may also destabilize the native conformation to some extent to make the structure flexible to meet functional requirements [51]. Understanding the role of non-native interactions in long-term dynamics remains an important and challenging problem.

In the WSME model, contiguous native-like segments are emphasized so that interactions such as those shown in Figure 1B or C are neglected. Within a single-domain structure, this approximation seems reasonable. To make the native interaction between residues belonging to two segments separated by residues with the non-native configuration effective, as shown in Figure 1C, the multiple intervening residues in the linker between two segments must follow multiple non-native directions to compensate for “incorrect” directions and to recover the “correct” orientation between residues having the native interaction. This flexible structural adjustment of the linker chain is a necessary condition to make the interaction effective, but such flexible adjustment is rare in a single domain when the linker is short. Therefore, the assumption made for the WSME model is considered appropriate at least for describing the folding process of single-domain proteins. Indeed, the validity of the WSME model was shown for single-domain proteins [14,15,17–21], but further careful argument is necessary to describe multi-domain proteins, particularly when they have a nontrivial topological arrangement of domains, as discussed in the next section.

The eWSME Model for Multi-domain Proteins

Many proteins show all-or-none two-state transitions between the folded and unfolded states, but in 1978, Wako and Saitô [2] suggested the presence of an intermediate state for lysozyme based on the calculated heterogeneous size distribution of contiguous native-like segments. In the 1980s, clear experimental evidence was discovered for the folding intermediates, which were referred to as the molten globule states [52]. Particularly, the folding process of typical small multi-domain proteins, such as α-lactalbumin and lysozyme, was analyzed. It was shown that, in these example proteins, the intermediate state in the equilibrium three-state transition is very similar to the intermediate state that appears on the kinetic folding pathway, suggesting the pivotal role of the molten globule state in protein folding. Furthermore, the structure of the molten globule state is heterogeneous and composed of ordered and disordered parts, whereas the degrees of compaction and side-chain packing largely depend on the protein species. To obtain a unified picture of the diversity of the molten globule state, extending the WSME model to describe generic multi-domain proteins by taking account of native interactions, as illustrated in Figure 1C, is strongly desired. The need for considering native interactions between residues separated by others with non-native configuration is evident particularly for proteins having topologically complex structures, as shown in Figure 6.

Figure 6.

Examples of multi-domain proteins with non-trivial topology. A) Dihydrofolate reductase (DHFR) (PDB code: 1rx1) has two domains, DLD and ABD. B) Adenylate kinase (AdK) (PDB code: 4ake) has three domains, CORE, NMP, and LID. Topological connectivity of the chain is illustrated at the bottom.

Dihydrofolate reductase (DHFR), a 159-residue α/β protein, for example, has two domains, the discontinuous loop domain (DLD) and the adenosine-binding domain (ABD), as shown in Figure 6A; the ABD is a continuous domain comprising the residues 38–106, and the DLD is a discontinuous domain comprising the N-terminal part (residues 1–37) and the C-terminal part (residues 107–159). Therefore, native interactions between the N- and C-terminal parts in the DLD are expected to form even when the intervening ABD is disordered, which is just the case illustrated in Figure 1C. A convenient way to consider such interactions is to introduce a virtual link connecting the N- and C-termini (Fig. 1D) and to apply the WSME Hamiltonian to this virtually closed ring to derive the partition function Zring. Using Zring, the extended WSME (eWSME) partition function is defined by

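$$Z_{\mathrm{eWSME}}(n) = Z_{\mathrm{WSME}}(n) + \left(Z_{\mathrm{ring}}(n) - Z_{\mathrm{WSME}}(n)\right) e^{S_{\mathrm{ring}}(n)/k_{\mathrm{B}}}, \tag{3}$$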

where $S_{\mathrm{ring}}(n)<0$ is the entropic reduction arising from the constraint to place the N- and C-termini at a distance determined by the native conformation, which can be estimated assuming that the disordered parts of the chain under the $n$ constraint behave as fragments with random configurations [23]. $Z_{\mathrm{eWSME}}$ is smoothly interpolated between $Z_{\mathrm{WSME}}$ and $Z_{\mathrm{ring}}$: $Z_{\mathrm{eWSME}}\approx Z_{\mathrm{WSME}}$ when the entropic reduction is significant, as $S_{\mathrm{ring}}\ll 0$, and $Z_{\mathrm{eWSME}}\approx Z_{\mathrm{ring}}$ when the entropic reduction is negligible, as $S_{\mathrm{ring}}\approx 0$. $Z_{\mathrm{eWSME}}$ incorporates, with suitable statistical weights, both local multi-residue correlations as in $Z_{\mathrm{WSME}}$ and native interactions separated by intervening non-native residues; also, it is exactly calculable.
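The interpolation in Eq. 3 and its two limits can be checked numerically with a few lines. The function name `z_ewsme` and the numerical values of the partition functions are hypothetical; only the formula itself comes from Eq. 3.

```python
import math


def z_ewsme(z_wsme, z_ring, s_ring_over_kb):
    """Eq. 3: Z_WSME + (Z_ring - Z_WSME) * exp(S_ring / k_B), with S_ring <= 0."""
    return z_wsme + (z_ring - z_wsme) * math.exp(s_ring_over_kb)


# Hypothetical values of the two partition functions at some fixed n
Z_WSME, Z_RING = 10.0, 12.0

no_penalty = z_ewsme(Z_WSME, Z_RING, 0.0)       # S_ring close to 0: ring closure costs nothing
large_penalty = z_ewsme(Z_WSME, Z_RING, -50.0)  # S_ring far below 0: closure is suppressed
halfway = z_ewsme(Z_WSME, Z_RING, math.log(0.5))
print(no_penalty, large_penalty, halfway)
```

With these inputs the first call returns the ring value, the second is indistinguishable from the open-chain value, and the third lies midway between them.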

The two-dimensional free-energy folding landscape of DHFR calculated with this eWSME model is shown in Figure 7A [23]. Here, the two-dimensional space is defined by the parameters MDLD= ∑i∈DLD mi and MABD= ∑i∈ABD mi. This landscape has basins at (MDLD, MABD)≈(30, 30), which is the basin of the unfolded state (U); at (MDLD, MABD)≈(30, 69) (the basin denoted by IA); at (MDLD, MABD)≈(70, 69) (the basin IB); at (MDLD, MABD)≈(90, 35) (the basin Iα); and at (MDLD, MABD)≈(90, 69) (the basin of the native state, N). In IA, the ABD is folded and the DLD is unfolded, whereas, in Iα, the DLD is folded and the ABD is unfolded. The basin Iα has lower free energy than IA; however, Iα is separated from U by a higher free-energy barrier than IA. Therefore, we can expect that molecules starting from U pass through IA to proceed along the pathway U→IA→IB→N. This was confirmed by numerically following the kinetic change of {mi} with a Monte Carlo simulation in which the effective eWSME energy of Eq. 4 was used for the Metropolis criterion.

Figure 7.

Free-energy landscape and kinetics of DHFR folding calculated by the eWSME model. A) Free-energy landscape of DHFR folding represented in the two-dimensional space of MDLD and MABD. The landscape has basins corresponding to the unfolded state U, the native state N, and the intermediates, IA, IB, and Iα. B–D) Evolution of the population of 200 molecules simulated with the Monte Carlo calculation at B) $3.3\times10^{5}\,t_0$, C) $1.6\times10^{6}\,t_0$, and D) $3.0\times10^{6}\,t_0$, where $t_0$ is a unit of time in simulation. Reproduced from [23].

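$$E_{\mathrm{eWSME}}(\{m_i\}) = -k_{\mathrm{B}}T \log\left(e^{-H_{\mathrm{WSME}}/k_{\mathrm{B}}T} + \left(e^{-H_{\mathrm{ring}}/k_{\mathrm{B}}T} - e^{-H_{\mathrm{WSME}}/k_{\mathrm{B}}T}\right) e^{S_{\mathrm{ring}}/k_{\mathrm{B}}}\right) + k_{\mathrm{B}}T\sum_{i}\sigma_i m_i. \tag{4}$$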

The kinetic evolution of the DHFR molecules’ population on the two-dimensional space is shown in Figure 7B–D. These panels show that the population indeed proceeds along the folding pathway U→IA→IB→N by sequentially visiting the intermediate states IA and IB. This pathway agrees with the observed pathway and kinetics of folding [53]. This sequential pathway is preferred due to the high free-energy barrier between U and Iα, which prevents folding trajectories from branching to Iα. This barrier arises from the large entropy decrease, which brings together the discontinuous parts to form DLD. In other words, the topological complexity of DHFR is the reason for this simple sequential pathway of folding. It should also be noted that the free-energy barrier between N and Iα is predicted to be low, leading to structural fluctuations, including the partial unfolding/folding of the ABD that can be important for the function of DHFR in the native state.
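A minimal sketch of how the effective energy of Eq. 4 can enter a Metropolis step is given below. The text specifies only that a Metropolis criterion with this energy was used; the acceptance rule here is the textbook form, and the move set, the way Hring and Sring are obtained for a given {mi}, and all numerical values are left as assumptions. The names `effective_energy` and `metropolis_accept` are hypothetical.

```python
import math
import random


def effective_energy(h_wsme, h_ring, s_ring_over_kb, sigma_sum, kT):
    """Eq. 4 for one configuration {m_i}; sigma_sum stands for sum_i sigma_i * m_i."""
    w_wsme = math.exp(-h_wsme / kT)
    w_ring = math.exp(-h_ring / kT)
    mixed = w_wsme + (w_ring - w_wsme) * math.exp(s_ring_over_kb)
    return -kT * math.log(mixed) + kT * sigma_sum


def metropolis_accept(e_old, e_new, kT, rng=random):
    """Textbook Metropolis rule: accept downhill moves, uphill ones with exp(-dE / kT)."""
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / kT)


# Hypothetical energies (kcal/mol) for one configuration with four native-like residues
KT = 0.6
e_closed = effective_energy(-3.0, -5.0, 0.0, 4.0, KT)    # no entropic cost: ring energy applies
e_open = effective_energy(-3.0, -5.0, -50.0, 4.0, KT)    # large cost: open-chain energy applies
print(e_closed, e_open, metropolis_accept(e_open, e_closed, KT))
```

The two limiting cases mirror those of Eq. 3: without an entropic cost the effective energy reduces to Hring plus the entropic term, and with a large cost it reduces to HWSME plus the entropic term.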

We should note that the topological complexity of DHFR can be resolved by circular permutation. By connecting the N- and C-termini and disconnecting the linker part of the chain between DLD and ABD, both ABD and DLD become continuous domains comprising continuous parts of the chain. The free-energy change due to this circular permutation was calculated by the eWSME model and is shown in Figure 8A. This circular permutation increases the free energy at around IB and lowers the free energy at the barrier between U and Iα. Then, the kinetic evolution of DHFR molecules’ population branches into two pathways, U→IA→IB→N and U→Iα→N, as indicated by the Monte Carlo results of Figure 8B–D. In this way, the simplification of the DHFR topology through circular permutation brings about the complex folding behavior. This complex folding behavior is consistent with the observed folding kinetics of the circular permutant [54].

Figure 8.

Free-energy landscape and folding kinetics of the circular permutant of DHFR calculated by the eWSME model. A) Difference in the free-energy landscape between the wild type and the circular permutant of DHFR. B–D) Evolution of the population of 200 molecules simulated with the Monte Carlo calculation at B) $3.3\times10^{5}\,t_0$, C) $1.6\times10^{6}\,t_0$, and D) $3.0\times10^{6}\,t_0$, where $t_0$ is a unit of time in simulation. Reproduced from [23].

Further extension of the WSME model is possible for proteins with more complex topologies, and we here outline this idea. Adenylate kinase (AdK), for example, has three domains: CORE (residues 1–29, 68–117, and 161–214), NMP (residues 30–67), and LID (residues 118–160), as shown in Figure 6B. We define the virtual ring closures at residues 29 and 68 (closure-1), 117 and 161 (closure-2), and 1 and 214 (closure-3). The WSME partition function $Z_{\mathrm{ring}}^{(i)}$ is calculated by assuming only one closure for $i=1$, 2, or 3, $Z_{\mathrm{ring}}^{(ij)}$ is calculated for two closures with $ij=12$, 23, or 31, and $Z_{\mathrm{ring}}^{(123)}$ is calculated for three closures. Then, $Z_{\mathrm{eWSME}}$ is calculable from the WSME Hamiltonian as

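$$\begin{aligned}
Z_{\mathrm{eWSME}} = {} & Z_{\mathrm{WSME}} + \left(Z_{\mathrm{ring}}^{(1)} - Z_{\mathrm{WSME}}\right)A_1(1-A_2)(1-A_3) \\
& + \left(Z_{\mathrm{ring}}^{(2)} - Z_{\mathrm{WSME}}\right)A_2(1-A_3)(1-A_1) + \left(Z_{\mathrm{ring}}^{(3)} - Z_{\mathrm{WSME}}\right)A_3(1-A_1)(1-A_2) \\
& + \left(Z_{\mathrm{ring}}^{(12)} - Z_{\mathrm{WSME}}\right)A_1A_2(1-A_3) + \left(Z_{\mathrm{ring}}^{(23)} - Z_{\mathrm{WSME}}\right)A_2A_3(1-A_1) \\
& + \left(Z_{\mathrm{ring}}^{(31)} - Z_{\mathrm{WSME}}\right)A_3A_1(1-A_2) + \left(Z_{\mathrm{ring}}^{(123)} - Z_{\mathrm{WSME}}\right)A_1A_2A_3,
\end{aligned} \tag{5}$$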

where $A_i=\exp(S_{\mathrm{ring}}^{(i)}/k_{\mathrm{B}})$ is a factor representing the entropy reduction due to the closure-$i$, which could be estimated by evaluating the probability that the two sites in a Gaussian chain are located at the closure distance from each other, under the constraint of a given pattern of $\{m_i\}$. In this way, the eWSME model can be directly applied to proteins with various topologies; exploring the folding mechanisms of multi-domain proteins from a unified perspective is an important avenue of folding studies.
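Eq. 5 has a regular structure: each non-empty subset of closures contributes its ring partition function, weighted by the product of A_i over the closures in the subset and (1−A_j) over those outside it. The sketch below writes the sum in that form for an arbitrary number of closures; the function name `z_ewsme_multi`, the dictionary layout, and the numerical values are hypothetical.

```python
from itertools import combinations


def z_ewsme_multi(z_wsme, z_ring, a):
    """Eq. 5 as a sum over non-empty subsets of closures.

    z_ring -- dict mapping a frozenset of closure labels to Z_ring for exactly those closures
    a      -- dict mapping closure label i to A_i = exp(S_ring(i) / k_B), with 0 <= A_i <= 1
    """
    labels = sorted(a)
    total = z_wsme
    for size in range(1, len(labels) + 1):
        for subset in combinations(labels, size):
            weight = 1.0
            for label in labels:
                weight *= a[label] if label in subset else 1.0 - a[label]
            total += (z_ring[frozenset(subset)] - z_wsme) * weight
    return total


# Hypothetical partition functions for three closures
Z_WSME = 10.0
Z_RING = {
    frozenset({1}): 12.0, frozenset({2}): 11.0, frozenset({3}): 15.0,
    frozenset({1, 2}): 14.0, frozenset({2, 3}): 17.0, frozenset({1, 3}): 18.0,
    frozenset({1, 2, 3}): 21.0,
}

all_open = z_ewsme_multi(Z_WSME, Z_RING, {1: 0.0, 2: 0.0, 3: 0.0})
all_closed = z_ewsme_multi(Z_WSME, Z_RING, {1: 1.0, 2: 1.0, 3: 1.0})
one_closure = z_ewsme_multi(Z_WSME, {frozenset({1}): 12.0}, {1: 0.25})
print(all_open, all_closed, one_closure)
```

With a single closure the expression reduces to Eq. 3, and the limits A_i=0 and A_i=1 for all closures recover the open-chain and fully closed partition functions, respectively.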

The aWSME Model for Protein Allostery

The classical view of protein folding, wherein folding proceeds along a definite pathway [55], was replaced by the modern energy landscape picture, which describes protein folding as fluctuating diffusive motions over a globally biased energy landscape. Energy landscape methods have shown that the folding pathway and transition state ensemble are determined by the statistical features of the distributed fluctuating trajectories; these methods enabled the quantitative understanding of protein folding and guided methods of protein engineering [8]. The energy landscape perspective should be important not only for protein folding but also for protein conformational change, wherein fluctuations and diversity of trajectories are significant. Particularly, the energy landscape description should be necessary for understanding allosteric transitions [56–58].

An allosteric transition is a change in the distribution of a protein’s structure triggered by a chemical or physical perturbation [59], which is often an essential step for proteins to exert their functions. Although the classical view of allosteric transition is based on the picture of a deterministic sequential structural change [60], motions in allosteric transition should bear flexible stochastic fluctuations that may allow diversely different transition trajectories, as in protein folding, which should be quantitatively assessed by energy landscape methods. For this purpose, the WSME model can be extended to describe the energy landscape of allosteric transitions.

Here, we assume that a protein shows two different low-energy conformations in the native state. To be more specific, we consider the case that one is the active (A) conformation, which has the higher affinity to bind a partner protein, and the other is the inactive (I) conformation, which has the lower affinity to bind it. The dominant conformation, around which the protein structure fluctuates, switches from I to A upon binding of a ligand or through chemical modification such as phosphorylation of the protein. We should note that the following theoretical scheme is also applicable to cases other than this I-A structural change, whenever a transition between two low-energy conformations is concerned. We assume that mi can take three values, A, I, and D; mi=A or I when the ith residue takes the configuration similar to that found in the A or I conformation, respectively, and mi=D when the residue takes a disordered non-native configuration. Here, for mathematical convenience in calculating the partition function from the Hamiltonian, we use a redundant expression of either mi=A or mi=I for the residue with the configuration common to A and I [31].

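The contact patterns in the native conformations are expressed as $\Delta^{A}_{ij}$ and $\Delta^{I}_{ij}$; $\Delta^{A\,(\text{or } I)}_{ij}=1$ when the residues $i$ and $j$ are in contact in the A (or I) conformation and $\Delta^{A\,(\text{or } I)}_{ij}=0$ otherwise. $\Delta^{C}_{ij}=\Delta^{A}_{ij}\Delta^{I}_{ij}$ represents the contact pattern that is common to A and I. $\tilde{\Delta}^{A}_{ij}=\Delta^{A}_{ij}(1-\Delta^{C}_{ij})$ and $\tilde{\Delta}^{I}_{ij}=\Delta^{I}_{ij}(1-\Delta^{C}_{ij})$ are the contact patterns that are specific to A and I, respectively. We define the functions $P^{A}_{k}(m_k)$, $P^{I}_{k}(m_k)$, and $P^{0}_{k}(m_k)$ by $P^{A}_{k}(A)=1$, $P^{A}_{k}(I)=P^{A}_{k}(D)=0$, $P^{I}_{k}(I)=1$, $P^{I}_{k}(A)=P^{I}_{k}(D)=0$, and $P^{0}_{k}(m_k)=P^{A}_{k}(m_k)+P^{I}_{k}(m_k)$.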
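The definitions above translate directly into set operations and indicator functions. In the sketch below, contact patterns are stored as sets of residue pairs rather than 0/1 matrices, so the product of the A and I patterns becomes a set intersection; this representation, the example contacts, and the names `split_contacts`, `p_a`, `p_i`, and `p_0` are assumptions made for illustration.

```python
def split_contacts(delta_a, delta_i):
    """Return (common, a_specific, i_specific) from the contact sets of the A and I conformations."""
    common = delta_a & delta_i        # Delta^C = Delta^A * Delta^I
    a_specific = delta_a - common     # Delta^A * (1 - Delta^C)
    i_specific = delta_i - common     # Delta^I * (1 - Delta^C)
    return common, a_specific, i_specific


def p_a(m):
    """P^A: 1 if the residue state is 'A', else 0."""
    return 1 if m == "A" else 0


def p_i(m):
    """P^I: 1 if the residue state is 'I', else 0."""
    return 1 if m == "I" else 0


def p_0(m):
    """P^0 = P^A + P^I: 1 for any ordered state ('A' or 'I'), 0 for 'D'."""
    return p_a(m) + p_i(m)


# Hypothetical contact sets for a short chain (0-based residue pairs)
delta_a = {(0, 3), (1, 4), (2, 6)}
delta_i = {(0, 3), (1, 5)}

common, a_specific, i_specific = split_contacts(delta_a, delta_i)
print(sorted(common), sorted(a_specific), sorted(i_specific))
print([p_0(state) for state in "AID"])
```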

Then, the WSME Hamiltonian for allosteric transition (the aWSME Hamiltonian) is

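$$H_{\mathrm{aWSME}}(\alpha,\{m_i\}) = V_\alpha(\{m_i\}) + \sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\varepsilon_{ij}\left(\Delta_{ij}^{\mathrm{C}}\prod_{k=i}^{j}P_k^{0}(m_k) + \tilde{\Delta}_{ij}^{\mathrm{A}}\prod_{k=i}^{j}P_k^{\mathrm{A}}(m_k) + \tilde{\Delta}_{ij}^{\mathrm{I}}\prod_{k=i}^{j}P_k^{\mathrm{I}}(m_k)\right), \tag{6}$$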

where $\alpha$ distinguishes ligand binding/unbinding or phosphorylation/dephosphorylation, and $V_\alpha(\{m_i\})$ represents the local interactions between the bound ligand and surrounding residues or those around the phosphorylated site [31]. The first term in the summation on the right-hand side of Eq. 6 is the energy decrease due to the many-residue correlation to form native-like segments, and the second and third terms represent the energy decrease due to the many-residue correlation to form A-like and I-like segments, respectively. We define the order parameter $n$ of folding and the order parameter $x$ of allostery as $n=\sum_{i=1}^{N}P_i^{0}(m_i)/N$ and $x=M_{\mathrm{A}}/N_{\mathrm{A}}$, respectively. Here, $M_{\mathrm{A}}$ is the number of residues assuming the configuration specific to the A conformation, and $N_{\mathrm{A}}$ is the maximal value of $M_{\mathrm{A}}$, so that $(x, n)=(0, 1)$ is the I conformation, $(x, n)=(1, 1)$ is the A conformation, and $(x, n)=(0, 0)$ is the completely disordered state. The partition function $Z_{\mathrm{aWSME}}(\alpha, x, n)$ and the two-dimensional free-energy landscape $F_\alpha(x, n)=-k_{\mathrm{B}}T\log Z_{\mathrm{aWSME}}(\alpha, x, n)$ are exactly calculable from $H_{\mathrm{aWSME}}$. See [32] for a more detailed explanation of the model.
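To make the structure of Eq. 6 and of the landscape concrete, the following Python sketch evaluates the energy of one state sequence and builds a free-energy table by brute-force enumeration of a very short toy chain. It is not the exact calculation used in [31,32]; enumeration is swapped in here only because it is easy to read. Several simplifications are assumptions of this example: the term Vα is omitted, every residue is treated as A/I-specific (so NA = N and the redundant A/I labeling does not arise), no extra statistical weight is assigned to the D state, and all names and toy parameters are hypothetical.

```python
import itertools
import math
from collections import defaultdict


def awsme_energy(states, eps, delta_C, tilde_A, tilde_I):
    """Energy of Eq. 6 for one sequence of states, with V_alpha omitted (toy assumption)."""
    N = len(states)
    energy = 0.0
    for i in range(N - 1):
        for j in range(i + 1, N):
            segment = states[i:j + 1]
            # The products over k = i..j of P^0, P^A, P^I are all-or-nothing tests.
            prod_0 = all(m in ("A", "I") for m in segment)
            prod_A = all(m == "A" for m in segment)
            prod_I = all(m == "I" for m in segment)
            energy += eps[i][j] * (delta_C[i][j] * prod_0
                                   + tilde_A[i][j] * prod_A
                                   + tilde_I[i][j] * prod_I)
    return energy


def free_energy_landscape(N, eps, delta_C, tilde_A, tilde_I, kT=1.0):
    """Brute-force F keyed by (M_A, M_native); x = M_A / N and n = M_native / N."""
    Z = defaultdict(float)
    for states in itertools.product("AID", repeat=N):
        M_A = states.count("A")
        M_native = M_A + states.count("I")
        weight = math.exp(-awsme_energy(states, eps, delta_C, tilde_A, tilde_I) / kT)
        Z[(M_A, M_native)] += weight
    return {key: -kT * math.log(z) for key, z in Z.items()}


# Toy 3-residue chain with uniform contact energy -1 (in units of kT).
EPS = [[-1.0] * 3 for _ in range(3)]
DELTA_C = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]   # common contact (0, 1)
TILDE_A = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]   # A-specific contact (0, 2)
TILDE_I = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]   # I-specific contact (1, 2)

landscape = free_energy_landscape(3, EPS, DELTA_C, TILDE_A, TILDE_I)
```

The enumeration visits 3^N states, so it is usable only for very short chains; it is meant to show what is being summed, not how to sum it efficiently.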

Figure 9 illustrates the allosteric transition of an example protein, the bacterial nitrogen regulatory protein C (NtrC). The distribution of the NtrC structures is dominated by the A conformation when the residue Asp54 is phosphorylated, and by the I conformation when it is dephosphorylated. Figure 10 shows Fα(x, n) calculated with the aWSME model. Although the most stable structure in Fdephos(x, n) is the I conformation at (x, n) ≈ (0, 1), a low free-energy valley extends from the I to the A conformation with metastable basins at (x, n) ≈ (0.2, 0.97), (0.55, 0.97), and (0.75, 0.97), demonstrating that the dephosphorylated NtrC should exhibit large structural fluctuation. The NtrC molecules within the valley bear A-like features, which transiently appear as fluctuations, though the most stable structure is the I conformation. As shown in Figure 11, this structural fluctuation explains the observed Rex values derived from the R1, R2, and NOE relaxation data of NMR [61].

Figure 9.


Allosteric transition of NtrC. Upon phosphorylation of Asp54, the NtrC structure switches from a state around the inactive (I) conformation (PDB code: 1dc7) to another state around the active (A) conformation (PDB code: 1dc8). Asp54 is shown with blue colored spheres. “3445 face” (the region comprises helices and strands, α3, β4, α4, and β5) is colored red. Reproduced from [31].

Figure 10.


Free-energy landscape Fα(x, n) of allosteric transition of NtrC calculated with the aWSME model. x is the order parameter of allosteric transition and n is the order parameter of folding transition. (x, n)=(0, 1) is the I conformation, (1, 1) is the A conformation, and (0, 0) is the completely disordered state. A) Fdephos(x, n) in the dephosphorylated state and B) Fphos(x, n) in the phosphorylated state. C) and D) are closeups of A) and B), respectively, at n≈1. Contour is drawn for every 2kBT. Reproduced from [31].

Figure 11.


Pre-existing structural fluctuation of NtrC. (Top) The parameter ξA showing the extent of the A-like structure development in the dephosphorylated state. ξA calculated with the aWSME model under the constraint of each fixed x and n=1 is plotted in gray scale. Even in conformations near the I conformation with small x, the A-like structure appears as a fluctuation around the 3445 face. (Bottom) Rex observed in the relaxation measurement of NMR in the dephosphorylated state [61] are shown with red dots. Rex is larger than a threshold for the blue dots [61]. Reproduced from [31].

As shown in Figure 10, when Asp54 is phosphorylated, a basin that does not exist in Fdephos(x, n) appears at (x, n) = (0.95, 0.97) in Fphos(x, n). Therefore, the conformation close to A becomes most stable upon phosphorylation. The large fluctuation between A and I in the dephosphorylated state shows that the transition from I to A can be regarded as the selection of pre-existing A-like conformations, but the shift from (0.75, 0.97) to (0.95, 0.97) shows that the “induced-fit” works during the last step of this transition. Thus, the aWSME model reveals that the mixed mechanisms of conformation selection and induced fit regulate the allosteric transition of NtrC.

The large structural fluctuation in the dephosphorylated state is due to the entropic gain at intermediate x. In the intermediate x regime, multiple A- or I-like segments coexist in the chain, and a large number of mosaic patterns of these segments are possible; this large number of structures is the reason for the large entropy in this regime. In other words, the multitude of fluctuating trajectories with similar energies is the reason for the flat free-energy landscape and the large fluctuation along the x coordinate at n≈1. Such entropic gain is not taken into account by conventional simulations based on the classical picture, which assumes a unique, definite transition pathway. Thus, the results of the WSME model reveal the importance of fluctuating movement over the energy landscape. It should be noted that in the problem of allostery, the landscape itself is modified by binding/unbinding of an effector such as the phosphate group, inducing the dynamical transition Fdephos → Fphos. To emphasize this aspect, we would argue that the “dynamical energy landscape view” is important for analyzing protein allostery and functions.
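A crude counting argument illustrates why the entropy is largest at intermediate x. The sketch below assumes, purely for illustration, that each of NA residues can independently be A-like or I-like at n = 1, so that the number of mosaic patterns with MA A-like residues is the binomial coefficient C(NA, MA). The aWSME model itself weights contiguous segments through the Hamiltonian, so this independence assumption and the function name `mosaic_entropy` are not part of the original analysis.

```python
import math


def mosaic_entropy(N_A, M_A):
    """Entropy S/k_B = log C(N_A, M_A) of residue-level A/I mosaics (toy assumption)."""
    return math.log(math.comb(N_A, M_A))


# Entropy profile along x = M_A / N_A for a hypothetical N_A = 10.
N_A = 10
profile = [mosaic_entropy(N_A, M_A) for M_A in range(N_A + 1)]
x_at_max = profile.index(max(profile)) / N_A
```

Under this assumption the entropy vanishes at the pure I (x = 0) and pure A (x = 1) ends and is maximal at x = 0.5, which is the qualitative origin of the flat valley described above.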

Finally, we note that the aWSME model can be applied to the folding problem, when competition between the native conformation and an off-pathway intermediate state with a distinct non-native structure dominates the folding process [62,63]. The aWSME model is applicable to this problem using these native and non-native conformations in place of the A and I conformations in the above analysis.

Discussion: Cooperativity and Modularity

Prof. Nobuhiko Saitô emphasized the importance of the hierarchical pathway of protein folding through the development of the WSME model and the related models of secondary structure formation [64–66]. In this hierarchical picture, “islands” or local native-like contiguous segments are spontaneously formed at the early stage of folding, and folding proceeds through growth and coalescence of these segments through long-range interactions. Saitô suggested that the segments formed first should typically be secondary structure elements (SSEs), such as α-helices or β-strands, and that these SSEs are packed with hydrophobic interactions in the later stage of folding [64–66]. However, in many cases, the loop regions contain hydrogen bonds or other interactions as densely as SSEs do, so that local structures including loops can be energetically stabilized similarly to SSEs. Therefore, segments that include loops could also be formed during the early stage of folding. A well-known example of a loop at which the folding reaction initiates is the distal hairpin loop of src SH3 [67]. The above discussion suggests that we should carefully examine which parts of the protein fold during the early stage of the folding process. Importantly, the statistical weights of the different folding pathways can be compared with the WSME model by taking into account the balance between energy and entropy, so that quantitative comparison between the experiments and the WSME results should help to solve this problem.

Local segments, which could be identified as units of a protein’s substructure, have been defined and analyzed from several viewpoints. A notable approach is the geometrical analysis; using the contact pattern in the native conformation, “modules” were defined as units of the substructure [41]. Gō showed that the boundaries of these modules coincide with the boundaries of exons of example proteins [42,43], which suggested that modern proteins were formed through shuffling of modules in the evolutionary history. Barnase, for example, comprises six modules, M1, M2, ..., M6, and their boundaries are at residues 24, 52, 73, 88, and 98 [44,45]. In Figure 5, these module boundaries are compared with the calculated and observed Φ-values at two transition states, TS1 and TS2. Meanwhile, when we examine an ensemble of numerous protein molecules, those molecules diffusively move on the energy landscape to diversely trace different trajectories so that the transition state, in which the folding nucleus is formed, is not dominated by a unique structure, but should be described as an ensemble of many heterogeneous structures. The Φ-values represent the average tendency to form the ordered structure at each residue in this transition state ensemble.

We found distinct dips in the calculated Φ-values at residues 72–73 and 89–90 at TS1, and at 20–23, 46, 72–73, 77–78, and 87–89 at TS2, showing a rough correlation between the module boundaries and the Φ-value boundaries. Through this comparison, we see that in the nucleus formation in TS1, M1 (residues 1–24) and M2 (residues 25–52) are disordered, M3 (residues 53–73) and M6 (residues 99–110) have a small but finite probability of structure formation, and M4 (residues 74–88) and M5 (residues 89–98) have intermediate levels of probability of folding. In another stage of nucleus formation in TS2, M1 has an intermediate level of probability of folding, M2 is disordered, and M3–M6 have higher probabilities of folding. Although the correspondence is not exact, this comparison suggests that module-like segments are formed at the transition states of barnase as cooperative structure formation units.
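The comparison between Φ-value dips and module boundaries can be made explicit with a small Python sketch. The boundary residues and dip positions are those quoted in the text; the function names and the choice of sequence distance as the measure of agreement are assumptions of this illustration, not the procedure used in the original analysis.

```python
from bisect import bisect_left

# Last residue of modules M1..M5 of barnase, as quoted in the text.
BARNASE_BOUNDARIES = [24, 52, 73, 88, 98]

# Dips in the calculated Phi-values, as (first, last) residue ranges from the text.
TS1_DIPS = [(72, 73), (89, 90)]
TS2_DIPS = [(20, 23), (46, 46), (72, 73), (77, 78), (87, 89)]


def module_of(residue, boundaries=BARNASE_BOUNDARIES):
    """Return the module label (M1..M6) that contains a 1-based residue number."""
    return "M%d" % (bisect_left(boundaries, residue) + 1)


def distance_to_boundary(dip, boundaries=BARNASE_BOUNDARIES):
    """Smallest sequence distance between a dip range and any module boundary."""
    first, last = dip
    return min(
        0 if first <= b <= last else min(abs(b - first), abs(b - last))
        for b in boundaries
    )


ts1_distances = [distance_to_boundary(d) for d in TS1_DIPS]  # [0, 1]
ts2_distances = [distance_to_boundary(d) for d in TS2_DIPS]  # [1, 6, 0, 4, 0]
```

Most dips fall within one residue of a module boundary, while the dips at residue 46 and at 77–78 lie inside modules, which is one way to read the statement that the correspondence is rough rather than exact.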

Energetic analysis is another method to define the subunits. Using a knowledge-based potential, the units of cooperative folding, foldons, were defined as segments that show the maximal energy gap between ordered and disordered structures [68,69]. For barnase, the foldons’ boundaries do not exactly match with those of the modules; however, there is a correlation between them; foldon-1 corresponds to M1, foldon-2 corresponds to M2, and foldon-3 corresponds to a part extending from M3 to M6 [68]. With this terminology, foldon-3 is folded with a large probability, foldon-1 is folded with a modest probability, and foldon-2 is almost unfolded at TS2 of barnase.

Comparing multiple proteins showed that there are correlations among modules, exons, and foldons, but the correspondence is not perfect and deviations specific to proteins were reported [68,69]. To elucidate the correlation and deviation of these differently defined local segments, the comprehensive comparison of different types of proteins is necessary. As shown in the above discussion, the Φ-value analysis with the WSME model should be useful for interpreting the results of such a comparison.

At a larger scale, local cooperative structures, foldons or modules, are assembled into the native conformation in a further cooperative way. A question at this scale is how such long-range cooperative assembly is realized. Here, geometrical analysis sheds light on this problem. One of the present authors developed an efficient non-sequential structure alignment software, MICAN [70], and demonstrated that the spatial arrangements of SSEs of numerous different proteins can be precisely superposed on each other if we disregard both the chain direction in SSEs and the manner in which those SSEs are connected by chains [70,71]. An example of a non-sequential structure alignment by MICAN is shown in Figure 12. Indeed, approximately 80% of the fold representatives defined in the SCOP database [72] share the same spatial arrangement of SSEs with other folds [71]. Because it is widely accepted that proteins with different folds are very unlikely to be evolutionarily related, this frequent sharing of the same SSE arrangement suggests that particular SSE arrangements were evolutionarily selected as liquid-crystal-like configurations, which satisfy the chemical or physical requirements for interactions. With the same SSE arrangement, the non-local interactions in native conformations can be similarly stable, but local interactions can exhibit significantly different stabilities, depending on the connectivity of the SSEs. In addition, differences in the chain connectivity can modify the entropy reduction process along the folding funnel. To elucidate the relative importance of local versus non-local interactions as well as the role of entropy in the SSE assembly, it would be interesting to compare folding pathways for a set of proteins that share the same SSE arrangement but have different topologies. For such a purpose, the WSME model would play an important role, as suggested by the successful description of the folding pathways of both the wild type and the circular permutant of DHFR [23].

Figure 12.


An example of a non-sequential structure alignment. A) Structure of Q8ZRJ2 (PDB code: 2es9), B) structure of the eukaryotic clamp loader (PDB code: 1sxj), and C) the superimposition of Q8ZRJ2 and the eukaryotic clamp loader obtained by the non-sequential alignment program MICAN [70]. In A–C, the structurally equivalent regions are drawn with the same color. It can be clearly seen that all helices are well superimposed if both the chain direction and the connectivity are ignored. D and E are two-dimensional diagrams of protein topology of Q8ZRJ2 (A) and eukaryotic clamp loader (B), respectively. F) Correspondence relation of helices obtained by MICAN. Reproduced from [73] with permission.

In conclusion, we address the implications of the coarse-grained modeling studies discussed in this review. Protein folding is a complex molecular process affected by various atomic interactions. Non-native interactions, particularly non-native disulfide bonds, slow down the folding process. Isomerization of proline or other residues affects the folding/unfolding rates. Cooperative exclusion of water molecules and the concomitant hydrophobic packing in each local part affect the height and position of the barrier in the free-energy landscape of folding. Some of these features, such as the effects of non-native interactions and proline isomerization, have been explicitly considered in the kinetic description using the WSME model [23]. Here, we emphasize that important aspects of these atomic features are represented in a coarse-grained way that is compatible with the core assumption of the WSME model, namely the cooperativity in forming local structural modules and in assembling those local structures, as indicated by the agreement between the WSME results and the observed data. Therefore, the analyses of modularity and cooperativity with the WSME model provide guidelines on how to represent the effects of atomic interactions in a coarse-grained way to construct models of complex problems, such as allostery dynamics [57]. Such coarse-graining methods should also provide insights into protein evolution, the development of techniques for protein structure prediction, and protein engineering. Finally, this approach using simplified statistical mechanical models, which was pioneered by Saitô, should continue to play an important role in the modern field of protein biophysics.

Significance.

Statistical mechanical models have made a significant contribution to elucidating the physics of protein folding. In particular, a simple theoretical model proposed by Wako and Saitô has explained quantitative features of pathways, transition-state ensembles, and intermediates of folding of a variety of proteins. This review explains how the physical principles of protein folding were revealed by this model, and discusses the application of the model to the folding of multi-domain proteins with topologically complex conformations and to the problems in allosteric transitions.

Acknowledgment

This study was supported by JSPS KAKENHI Grant Number JP16H02217, CREST of the Japan Science and Technology Agency, and Riken Pioneering Project “Cellular Evolution”.

Footnotes

Conflicts of Interest

The authors declare no competing financial interest.

Author Contributions

M. S., G. C. and T. P. T. co-wrote the manuscript.

References

  • 1.Wako H, Saitô N. Statistical mechanical theory of the protein conformation. I. General considerations and the application to homopolymers. J Phys Soc Jpn. 1978;44:1931–1938. [Google Scholar]
  • 2.Wako H, Saitô N. Statistical mechanical theory of the protein conformation. II. Folding pathway for protein. J Phys Soc Jpn. 1978;44:1939–1945. [Google Scholar]
  • 3.Lifson S, Roig A. On the theory of helix-coil transition in polypeptides. J Chem Phys. 1961;34:1963–1974. [Google Scholar]
  • 4.Poland D, Scheraga HA. Phase transitions in one dimension and the helix-coil transition in polyamino acids. J Chem Phys. 1966;45:1456–1463. doi: 10.1063/1.1727785. [DOI] [PubMed] [Google Scholar]
  • 5.Bryngelson JD, Onuchic JN, Socci ND, Wolynes PG. Funnels, pathways, and the energy landscape of protein folding: a synthesis. Proteins. 1995;21:167–195. doi: 10.1002/prot.340210302. [DOI] [PubMed] [Google Scholar]
  • 6.Fersht A. Structure and Mechanism in Protein Science: A Guide to Enzyme Catalysis and Protein Folding. Freeman; New York: 1999. [Google Scholar]
  • 7.Fersht A. Transition-state structure as a unifying basis in protein-folding mechanisms: Contact order, chain topology, stability, and the extended nucleus mechanism. Proc Natl Acad Sci USA. 2000;97:1525–1529. doi: 10.1073/pnas.97.4.1525. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Onuchic JN, Wolynes PG. Theory of protein folding. Curr Opin Struct Biol. 2004;14:70–75. doi: 10.1016/j.sbi.2004.01.009. [DOI] [PubMed] [Google Scholar]
  • 9.Daggett V, Fersht A. The present view of the mechanism of protein folding. Nat Rev Mol Cell Biol. 2003;4:497–502. doi: 10.1038/nrm1126. [DOI] [PubMed] [Google Scholar]
  • 10.Muñoz V, Eaton WA. A simple model for calculating the kinetics of protein folding from three-dimensional structures. Proc Natl Acad Sci USA. 1999;96:11311–11316. doi: 10.1073/pnas.96.20.11311. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Bruscolini P, Pelizzola A. Exact solution of the Muñoz-Eaton model for protein folding. Phys Rev Lett. 2002;88:258101. doi: 10.1103/PhysRevLett.88.258101. [DOI] [PubMed] [Google Scholar]
  • 12.Pelizzola A. Exactness of the cluster variation method and factorization of the equilibrium probability for the Wako-Saitô-Muñoz-Eaton model of protein folding. J Stat Mech. 2005:P11010. [Google Scholar]
  • 13.Karanicolas J, Brooks CL., III The importance of explicit chain representation in protein folding models: An examination of Ising-like models. Proteins. 2003;53:740–747. doi: 10.1002/prot.10459. [DOI] [PubMed] [Google Scholar]
  • 14.Henry ER, Eaton WA. Combinatorial modeling of protein folding kinetics: free energy profiles and rates. Chem Phys. 2004;307:163–185. [Google Scholar]
  • 15.Itoh K, Sasai M. Flexibly varying folding mechanism of a nearly symmetrical protein: B domain of protein A. Proc Natl Acad Sci USA. 2006;103:7298–7303. doi: 10.1073/pnas.0510324103. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Itoh K, Sasai M. Cooperativity, connectivity, and folding pathways of multidomain proteins. Proc Natl Acad Sci USA. 2008;105:13865–13870. doi: 10.1073/pnas.0804512105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Kubelka J, Henry ER, Cellmer T, Hofrichter J, Eaton WA. Chemical, physical, and theoretical kinetics of an ultrafast folding protein. Proc Natl Acad Sci USA. 2008;105:18655–18662. doi: 10.1073/pnas.0808600105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Nelson ED, Grishin NV. Folding domain B of protein A on a dynamically partitioned free energy landscape. Proc Natl Acad Sci USA. 2008;105:1489–1493. doi: 10.1073/pnas.0705707105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Yu W, Chung K, Cheon M, Heo M, Han K-H, Ham S, Chang I. Cooperative folding kinetics of BBL protein and peripheral subunit-binding domain homologues. Proc Natl Acad Sci USA. 2008;105:2397–2402. doi: 10.1073/pnas.0708480105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Itoh K, Sasai M. Multidimensional theory of protein folding. J Chem Phys. 2009;130:145104. doi: 10.1063/1.3097018. [DOI] [PubMed] [Google Scholar]
  • 21.Henry ER, Best RB, Eaton WA. Comparing a simple theoretical model for protein folding with all-atom molecular dynamics simulations. Proc Natl Acad Sci USA. 2013;110:17880–17885. doi: 10.1073/pnas.1317105110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Sivanandan S, Naganathan AN. A disorder-induced domino-like destabilization mechanism governs the folding and functional dynamics of the repeat protein Iκ Bα. PLoS Comput Biol. 2013;9:e1003403. doi: 10.1371/journal.pcbi.1003403. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Inanami T, Terada TP, Sasai M. Folding pathway of a multidomain protein depends on its topology of domain connectivity. Proc Natl Acad Sci USA. 2014;111:15969–15974. doi: 10.1073/pnas.1406244111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Zamparo M, Pelizzola A. Rigorous results on the local equilibrium kinetics of a protein folding model. J Stat Mech. 2006:P12009. [Google Scholar]
  • 25.Zamparo M, Pelizzola A. Kinetics of the Wako-Saitô-Muñoz-Eaton model of protein folding. Phys Rev Lett. 2006;97:068106. doi: 10.1103/PhysRevLett.97.068106. [DOI] [PubMed] [Google Scholar]
  • 26.Imparato A, Pelizzola A, Zamparo M. Ising-like model for protein mechanical unfolding. Phys Rev Lett. 2007;98:148102. doi: 10.1103/PhysRevLett.98.148102. [DOI] [PubMed] [Google Scholar]
  • 27.Imparato A, Pelizzola A. Mechanical unfolding and refolding pathways of ubiquitin. Phys Rev Lett. 2008;100:158104. doi: 10.1103/PhysRevLett.100.158104. [DOI] [PubMed] [Google Scholar]
  • 28.Zamparo M, Trovato A, Maritan A. Simplified exactly solvable model for β-amyloid aggregation. Phys Rev Lett. 2010;105:108102. doi: 10.1103/PhysRevLett.105.108102. [DOI] [PubMed] [Google Scholar]
  • 29.Itoh K, Sasai M. Dynamical transition and proteinquake in photoactive yellow protein. Proc Natl Acad Sci USA. 2004;101:14736–14741. doi: 10.1073/pnas.0402978101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Itoh K, Sasai M. Coupling of functioning and folding: photoactive yellow protein as an example system. Chem Phys. 2004;307:121–127. [Google Scholar]
  • 31.Itoh K, Sasai M. Entropic mechanism of large fluctuation in allosteric transition. Proc Natl Acad Sci USA. 2010;107:7775–7780. doi: 10.1073/pnas.0912978107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Itoh K, Sasai M. Statistical mechanics of protein allostery: roles of backbone and side-chain structural fluctuations. J Chem Phys. 2011;134:125102. doi: 10.1063/1.3565025. [DOI] [PubMed] [Google Scholar]
  • 33.Gō N. Theoretical studies of protein folding. Annu Rev Biophys Bioeng. 1983;12:183–210. doi: 10.1146/annurev.bb.12.060183.001151. [DOI] [PubMed] [Google Scholar]
  • 34.Taketomi H, Ueda Y, Gō N. Studies on protein folding, unfolding and fluctuations by computer simulations 1. The effect of specific amino acid sequence represented by specific inter-unit interactions. Int J Pept Protein Res. 1975;7:445–459. [PubMed] [Google Scholar]
  • 35.Gō N, Abe H. Noninteracting local-structure model of folding and unfolding transition in globular proteins. I. Formulation. Biopolymers. 1981;20:991–1011. doi: 10.1002/bip.1981.360200511. [DOI] [PubMed] [Google Scholar]
  • 36.Abe H, Gō N. Noninteracting local-structure model of folding and unfolding transition in globular proteins. II. Application to two-dimensional lattice proteins. Biopolymers. 1981;20:1013–1031. doi: 10.1002/bip.1981.360200512. [DOI] [PubMed] [Google Scholar]
  • 37.Sato S, Religa TL, Daggett V, Fersht AR. Testing protein-folding simulations by experiment: B domain of protein A. Proc Natl Acad Sci USA. 2004;101:6952–6956. doi: 10.1073/pnas.0401396101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Wolynes PG. Latest folding game results: Protein A barely frustrates computationalists. Proc Natl Acad Sci USA. 2004;101:6837–6838. doi: 10.1073/pnas.0402034101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Serrano L, Matouschek A, Fersht AR. The folding of an enzyme. III. Structure of the transition state for unfolding of barnase analysed by a protein engineering procedure. J Mol Biol. 1992;224:805–818. doi: 10.1016/0022-2836(92)90563-y. [DOI] [PubMed] [Google Scholar]
  • 40.Salvatella X, Dobson CM, Fersht AR, Vendruscolo M. Determination of the folding transition states of barnase by using ΦI-value-restrained simulations validated by double mutant ΦIJ-values. Proc Natl Acad Sci USA. 2005;102:12389–12394. doi: 10.1073/pnas.0408226102. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Gō M. Modular structural units, exons, and function in chicken lysozyme. Proc Natl Acad Sci USA. 1983;80:1964–1968. doi: 10.1073/pnas.80.7.1964. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Gō M, Nosaka M. Protein architecture and the origin of introns. Cold Spring Harb Symp Quant Biol. 1987;52:915–924. doi: 10.1101/sqb.1987.052.01.100. [DOI] [PubMed] [Google Scholar]
  • 43.Gō M. Correlation of DNA exonic regions with protein structural units in haemoglobin. Nature. 1981;291:90–92. doi: 10.1038/291090a0. [DOI] [PubMed] [Google Scholar]
  • 44.Yanagawa H, Yoshida K, Torigoe C, Park J-S, Sato K, Shirai T, Gō M. Protein anatomy: Functional roles of barnase module. J Biol Chem. 1993;268:5861–5865. [PubMed] [Google Scholar]
  • 45.Noguti T, Sakakibara H, Gō M. Localization of hydrogen-bonds within modules in barnase. Proteins. 1993;16:357–363. doi: 10.1002/prot.340160405. [DOI] [PubMed] [Google Scholar]
  • 46.Best RB, Hummer G, Eaton WA. Native contacts determine protein folding mechanisms in atomistic simulations. Proc Natl Acad Sci USA. 2013;110:17874–17879. doi: 10.1073/pnas.1311599110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Lindorff-Larsen K, Piana S, Dror RO, Shaw DE. How fast-folding proteins fold. Science. 2011;334:517–520. doi: 10.1126/science.1208351. [DOI] [PubMed] [Google Scholar]
  • 48.Piana S, Lindorff-Larsen K, Shaw DE. Protein folding kinetics and thermodynamics from atomistic simulation. Proc Natl Acad Sci USA. 2012;109:17845–17850. doi: 10.1073/pnas.1201811109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Piana S, Lindorff-Larsen K, Shaw DE. Atomic-level description of ubiquitin folding. Proc Natl Acad Sci USA. 2013;110:5915–5920. doi: 10.1073/pnas.1218321110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Borgia A, Wensley BG, Soranno A, Nettels D, Borgia MB, Hoffmann A, et al. Localizing internal friction along the reaction coordinate of protein folding by combining ensemble and single-molecule fluorescence spectroscopy. Nat Commun. 2012;3:1195. doi: 10.1038/ncomms2204. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Ferreiro DU, Komives EA, Wolynes PG. Frustration in biomolecules. Q Rev Biophys. 2014;47:285–363. doi: 10.1017/S0033583514000092. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Arai M, Kuwajima K. Role of the molten globule state in protein folding. Adv Protein Chem. 2000;53:209–282. doi: 10.1016/s0065-3233(00)53005-8. [DOI] [PubMed] [Google Scholar]
  • 53.Arai M, Iwakura M, Matthews CR, Bilsel O. Micro-second subdomain folding in dihydrofolate reductase. J Mol Biol. 2011;410:329–342. doi: 10.1016/j.jmb.2011.04.057. [DOI] [PubMed] [Google Scholar]
  • 54.Texter FL, Spencer DB, Rosenstein R, Matthews CR. Intramolecular catalysis of a proline isomerization reaction in the folding of dihydrofolate reductase. Biochemistry. 1992;31:5687–5691. doi: 10.1021/bi00140a001. [DOI] [PubMed] [Google Scholar]
  • 55.Baldwin RL. The nature of protein folding pathways: The classical versus the new view. J Biomol NMR. 1995;5:103–109. doi: 10.1007/BF00208801. [DOI] [PubMed] [Google Scholar]
  • 56.Boehr DD, McElheny D, Dyson HJ, Wright PE. The dynamic energy landscape of dihydrofolate reductase catalysis. Science. 2006;313:1638–1642. doi: 10.1126/science.1130258. [DOI] [PubMed] [Google Scholar]
  • 57.Terada TP, Kimura T, Sasai M. Entropic mechanism of allosteric communication in conformational transitions of dihydrofolate reductase. J Phys Chem B. 2013;117:12864–12877. doi: 10.1021/jp402071m. [DOI] [PubMed] [Google Scholar]
  • 58.Tsai C-J, Nussinov R. A unified view of “How allostery works”. PLoS Comput Biol. 2014;10:e1003394. doi: 10.1371/journal.pcbi.1003394. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Motlagh HN, Wrabl JO, Li J, Hilser VJ. The ensemble nature of allostery. Nature. 2014;508:331–339. doi: 10.1038/nature13001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Vreede J, Juraszek J, Bolhuis PG. Predicting the reaction coordinates of millisecond light-induced conformational changes in photoactive yellow protein. Proc Natl Acad Sci USA. 2010;107:2397–2402. doi: 10.1073/pnas.0908754107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Volkman BF, Lipson D, Wemmer DE, Kern D. Two-state allosteric behavior in a single-domain signaling protein. Science. 2001;291:2429–2433. doi: 10.1126/science.291.5512.2429. [DOI] [PubMed] [Google Scholar]
  • 62.Hamada D, Segawa S, Goto Y. Non-native α-helical intermediate in the refolding of β-lactoglobulin, a predominantly β-sheet protein. Nat Struct Biol. 1996;3:868–873. doi: 10.1038/nsb1096-868. [DOI] [PubMed] [Google Scholar]
  • 63.Borgia A, Kemplen KR, Borgia MB, Soranno A, Shammas S, Wunderlich B, et al. Transient misfolding dominates multidomain protein folding. Nat Commun. 2015;6:8861. doi: 10.1038/ncomms9861. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Wako H, Saitô N, Scheraga HA. Statistical mechanical treatment of α-helices and extended structures in proteins with inclusion of short- and medium-range interactions. J Protein Chem. 1983;2:221–249. [Google Scholar]
  • 65.Saitô N, Shigaki T, Kobayashi Y, Yamamoto M. Mechanism of protein folding: I. General considerations and refolding of myoglobin. Proteins. 1988;3:199–207. doi: 10.1002/prot.340030308. [DOI] [PubMed] [Google Scholar]
  • 66.Saitô N, Kobayashi Y. Physical foundation of protein architecture. Int J Modern Phys B. 1999;13:2431–2529. [Google Scholar]
  • 67.Grantcharova VP, Riddle DS, Santiago JV, Baker D. Important role of hydrogen bonds in the structurally polarized transition state for folding of the src SH3 domain. Nat Struct Biol. 1998;5:714–720. doi: 10.1038/1412. [DOI] [PubMed] [Google Scholar]
  • 68.Panchenko AR, Luthey-Schulten Z, Wolynes PG. Foldons, protein structural modules, and exons. Proc Natl Acad Sci USA. 1996;93:2008–2013. doi: 10.1073/pnas.93.5.2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Panchenko AR, Luthey-Schulten Z, Cole R, Wolynes PG. The foldon universe: A survey of structural similarity and self-recognition of independently folding units. J Mol Biol. 1997;272:95–105. doi: 10.1006/jmbi.1997.1205. [DOI] [PubMed] [Google Scholar]
  • 70.Minami S, Sawada K, Chikenji G. MICAN: a protein structure alignment algorithm that can handle Multiple-chains, Inverse alignments, Cα only models, Alternative alignments, and Non-sequential alignments. BMC Bioinformatics. 2013;14:24. doi: 10.1186/1471-2105-14-24. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Minami S, Sawada K, Chikenji G. How a spatial arrangement of secondary structure elements is dispersed in the universe of protein folds. PLoS ONE. 2014;9:e107959. doi: 10.1371/journal.pone.0107959. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Andreeva A, Howorth D, Chandonia J, Brenner S, Hubbard T, Chothia C, et al. Data growth and its impact on the SCOP database: new developments. Nucleic Acids Res. 2008;36:D419–D425. doi: 10.1093/nar/gkm993. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Minami S. thesis. Nagoya University; 2015. [Google Scholar]

Articles from Biophysics and Physicobiology are provided here courtesy of The Biophysical Society of Japan
