Proceedings of the National Academy of Sciences of the United States of America. 2013 Nov 27;110(51):20515–20520. doi: 10.1073/pnas.1320483110

Free energy landscapes for initiation and branching of protein aggregation

Weihua Zheng a,b, Nicholas P Schafer b,c, Peter G Wolynes a,b,c,1
PMCID: PMC3870682  PMID: 24284165

Significance

This study uses a predictive protein-folding simulation model to map the free energy landscapes of fused oligomeric constructs and to quantify the conditions under which these constructs spontaneously misfold. Constructs of this type have been used to probe the early stages of aggregation in the laboratory. Oligomeric species may be the toxic agents in misfolding-related diseases. The critical structures that initiate aggregation are shown to depend on specific sequence signals and thermodynamic conditions. Our results also suggest that branching due to the presence of multiple amyloidogenic segments may determine the morphology of protein aggregates.

Keywords: energy landscape theory, branched aggregates, gelation

Abstract

Experiments on artificial multidomain protein constructs have probed the early stages of aggregation processes, but structural details of the species that initiate aggregation remain elusive. Using the associative-memory, water-mediated, structure and energy model known as AWSEM, a transferable coarse-grained protein model, we performed simulations of fused constructs composed of up to four copies of the Titin I27 domain or its mutant I27* (I59E). Free energy calculations enable us to quantify the conditions under which such multidomain constructs will spontaneously misfold. Consistent with experimental results, the dimer of I27 is found to be the smallest spontaneously misfolding construct. Our results show how structurally distinct misfolded states can be stabilized under different thermodynamic conditions, and this result provides a plausible link between the single-molecule misfolding experiments under native conditions and aggregation experiments under denaturing conditions. The conditions for spontaneous misfolding are determined by the interplay among temperature, effective local protein concentration, and the strength of the interdomain interactions. Above the folding temperature, fusing additional domains to the monomer destabilizes the native state, and the entropically stabilized amyloid-like state is favored. Because it is primarily energetically stabilized, the domain-swapped state is more likely to be important under native conditions. Both protofibril-like and branching structures are found in annealing simulations starting from extended structures, and these structures suggest a possible connection between the existence of multiple amyloidogenic segments in each domain and the formation of branched, amorphous aggregates as opposed to linear fibrillar structures.


Mature fibrils are a most striking feature of protein aggregation-related diseases, but how these structures relate to disease pathology remains an open question (1–4). Recent evidence supports the notion that in some cases fibers are by-products of a process that starts with oligomers that are themselves toxic and pathogenic. It is then vitally important to understand the early stages of aggregation that are invisible to many experimental techniques. Molecular simulations can be useful for proposing candidate misfolded structures and other oligomeric species. These initial seeds may or may not end up becoming incorporated into a fiber or an amorphous, nonfibrillar aggregate. In this study, we show how free energy calculations using a coarse-grained protein model can be used to determine the minimal number of domains needed for a fused construct to spontaneously misfold under different thermodynamic conditions. These simulations based on a transferable potential capable of low-resolution structure prediction (5, 6) also allow us to obtain detailed information about the misfolded structures, which can serve as the bridge between low oligomeric species and insoluble aggregates.

The fundamental question of how individual protein domains fold to nearly unique structures in a short amount of time has been answered in the form of the Principle of minimal frustration (7), and a quantitative understanding of this critical self-organization process has grown out of the framework of the energy landscape theory (ELT) of protein folding (8). Although initially these ideas were investigated using analytical models (9) and further tested with relatively simple lattice models (10), the degree of detail and realism of the models that have now been optimized using ELT makes them useful for practical tasks such as de novo prediction of protein structures (5) and prediction of binding interfaces (6). Protein folding theory provides a solid framework on which to build our understanding of misfolding (11). Misfolding and aggregation are bigger problems when the concentration of protein domains is high. Many proteins in nature are, in fact, multidomain proteins (12), consisting of multiple covalently linked but (semi)independently folding domains. In such systems, the local concentration of protein domains is always high because of their covalent linkages. Artificial multidomain protein constructs are therefore good systems for probing the early stages of aggregation.

The question of the size of the smallest thermodynamically stable misfolded species is fundamental to understanding the aggregation process as a whole. Despite the experimental difficulties, recent progress has been made using isolated artificial multidomain protein constructs (13). Using these constructs allows the experimentalist to control precisely the number of interacting domains. These low-oligomer misfolding experiments differ from aggregation experiments where only the average concentration of a sample is fixed. In aggregation experiments, the structural and thermodynamic details of the transiently populated species that seed aggregation are hidden beneath a cascade of events. Aggregation experiments that monitor the concentration of macroscopic aggregates in bulk (12, 14) typically observe a lag phase and then a fast growth phase after aggregation has been seeded. The size of the critical initiating species can be roughly inferred by repeating these types of experiments using constructs with a variable number of domains, but the wide range of time and spatial scales involved makes these inferences nontrivial. In this study we focus our attention on systems that have been studied in aggregation experiments, but use simulation techniques that are more akin to studies using isolated fused constructs in hopes that we can make a connection between these two levels of observation.

Previously, equilibrium simulations using a lattice model with the Miyazawa–Jernigan potential (15) have suggested that the melting temperature decreases as the concentration increases, and misfolding is entropically driven near the folding temperature of a single chain. Off-lattice native structure-based models have also been used to probe the folding of multidomain proteins (16) as well as to understand the biophysical consequences of tethering in multidomain proteins (17). An early use of lattice models for studying the initial stages of aggregation yielded a phase diagram including six phases with varying degrees of order present at different temperatures and concentrations (18). The state-of-the-art coarse-grained models used to study protein aggregation have been reviewed recently (19). Here we investigate the equilibrium thermodynamics of multidomain protein folding and misfolding using an optimized, transferable, and coarse-grained protein force field named AWSEM for associative-memory, water-mediated, structure and energy model. We recently used AWSEM to clarify the multidomain protein misfolding at the level of a single pair of neighboring domains (20). In this paper, we extend our study of the I27 protein to larger constructs with up to four domains. We study how the free energy landscape changes as the number of domains increases. We quantify the relative stability between the native and the misfolded states via equilibrium umbrella sampling (21) simulations to infer the minimal number of domains needed for spontaneous misfolding under different conditions of temperature and strength of interdomain interactions. We also perform simulated annealing simulations to characterize the possible misfolded structures of oligomers. Our analysis suggests a connection between the existence of multiple amyloidogenic segments per domain and a preference for forming amorphous branched aggregates as opposed to linear, fibril-like structures.

Results

We investigated the monomer, as well as the fused dimer, trimer, and tetramer constructs of the 27th immunoglobulin-like (Ig) domain of human muscle protein titin (I27; Protein Data Bank ID code 1TIU) using the AWSEM protein model. These constructs will be referred to as the wild-type systems Inline graphic. In this notation, the subscript indicates the number of domains in the construct. Each pair of consecutive domains was connected using a four-residue glycine linker. Simulation results for these wild-type constructs were compared with those for the mutant constructs Inline graphic, wherein a single point mutation (I59E) was made to each domain to weaken the strong amyloidogenic segment (56HILILH61). This mutation was introduced in a previous study and is based on a calculation using the AWSEM energy function (20). A second, weaker amyloidogenic segment (11VEVFVGETAHFEIE24) found in the wild type is still present in the mutant constructs.

The Melting Temperature Decreases as the Number of Domains Increases.

The melting temperatures (Tm) of the wild-type systems and those of the mutant systems are plotted as a function of the number of fused domains in Fig. 1. Because these constructs are not strictly two-state systems, the Tm for each system is defined by the location of the peak of the specific heat curve as a function of the temperature. As the number of domains in the fused construct increases, the melting temperatures of the wild-type and mutant systems both decrease moderately, but at different rates. The decrease of the melting temperature as the number of domains increases was also observed in Cellmer et al.’s (15) study of multidomain protein folding in a lattice model using the Miyazawa–Jernigan potential. For the wild-type systems, the biggest drop in Tm occurs when Ndomain changes from 1 to 2; after that, the change in Tm with the addition of each domain diminishes. Without a significant change in this trend, Tm will stay above zero for all Ndomain. Therefore, the types of misfolded structures formed in our simulations do not ultimately correspond to states that are more energetically stable than the fully folded state in the limit of large Ndomain. For the mutant systems, there are only minor changes in Tm as Ndomain increases, until Ndomain increases from 3 to 4, where a larger change in Tm does occur. Note that the melting temperature of Inline graphic is the same as that of Inline graphic. The overall change of Tm is smaller for the mutant systems than for the wild-type systems. For values of Inline graphic, the value of Tm for the mutant construct is also expected to stay above that for the wild-type construct because the mutant, having only a weaker amyloidogenic fragment, has weaker interdomain interactions.
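As an illustration of the definition of Tm used above, the following minimal Python sketch estimates a specific heat from energy fluctuations and takes the melting temperature as the location of its peak. The source does not state how the specific heat curves were computed, so the fluctuation formula, the function names, and the reduced units (k_B = 1) are assumptions made here for illustration only.

```python
from statistics import pvariance


def specific_heat(energies, temperature, k_b=1.0):
    """Fluctuation estimate Cv = (<E^2> - <E>^2) / (k_B * T^2).

    `energies` are potential energies sampled at one temperature.
    Reduced units (k_b = 1) are an assumption of this sketch.
    """
    return pvariance(energies) / (k_b * temperature ** 2)


def melting_temperature(temperatures, cv_values):
    """Return the temperature at which the specific heat curve peaks."""
    peak_index = max(range(len(cv_values)), key=lambda i: cv_values[i])
    return temperatures[peak_index]


# Hypothetical specific heat curve on a coarse temperature grid.
temps = [0.8, 0.9, 1.0, 1.1]
cv_curve = [1.0, 3.0, 7.5, 2.0]
tm = melting_temperature(temps, cv_curve)  # temperature of the Cv peak
```

On a real data set the temperature grid would need to be fine enough (or the curve interpolated) for the peak position to be meaningful.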

Fig. 1.


The melting temperatures (Tm) of the wild-type constructs (●) and those of the mutant systems (■) are plotted as a function of the number of fused domains Ndomain. The Tm for each system is defined by the location of the peak of the specific heat curve as a function of the temperature and is in units of the folding temperature of the I27 monomer Inline graphic.

The free energy profiles as a function of the fraction of total native contacts QD for both the wild-type and the mutant systems are shown in Fig. S1. QD is the average fraction of native contacts across all domains. The exact formula used to calculate QD is given in SI Methods. I27 and I27* monomers have the same folding temperature Tf despite the single point mutation. At Tf, the misfolded states (with Inline graphic) become more and more populated as the number of domains increases. We observed that the energy gap per residue between the native state and the misfolded ensemble decreases as the number of domains increases, which contributes to the decreased Tm and the increased population of misfolded states. Judging by the change of Tm, Inline graphic and Inline graphic appear to be the critical sizes for misfolding in the wild-type and the mutant systems, respectively. The strong amyloid-like interactions between two amyloidogenic segments in different domains, present only in the wild-type systems, appear to be the main cause of the change of the energy landscape when more domains are introduced. A combination of residual amyloid-like interactions and domain-swapped contacts is responsible for the change in the mutant systems when Inline graphic.
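The idea of averaging native-contact fractions over domains can be sketched as follows. The exact formula for QD is given in SI Methods and may differ; the simple contact-counting definition, the function names, and the representation of contacts as residue-index pairs are assumptions of this sketch.

```python
def domain_q(native_contacts, formed_contacts):
    """Fraction of one domain's native contacts that are currently formed.

    Contacts are (i, j) residue-index pairs; this counting definition
    is an assumption, not the formula from SI Methods.
    """
    native = set(native_contacts)
    return len(native & set(formed_contacts)) / len(native)


def q_d(per_domain_native, per_domain_formed):
    """Average the per-domain native-contact fraction over all domains."""
    q_values = [
        domain_q(native, formed)
        for native, formed in zip(per_domain_native, per_domain_formed)
    ]
    return sum(q_values) / len(q_values)


# Hypothetical trimer: one folded, one half-folded, one unfolded domain.
native = [[(1, 5), (2, 6)]] * 3
formed = [[(1, 5), (2, 6)], [(1, 5)], []]
value = q_d(native, formed)  # (1.0 + 0.5 + 0.0) / 3
```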

Strong Interdomain Interactions Between Unfolded Domains Drive Misfolding for Inline graphic.

The free energy surfaces of the dimers and trimers are plotted with respect to QD and the number of interdomain contacts Ninter in Fig. 2. Each free energy surface is shown at its respective melting temperature. States with at least two unfolded domains have significantly more interdomain interactions than do states with no unfolded domains or only a single unfolded domain. When there are more than two unfolded domains, unfolding each additional domain increases the number of interdomain contacts. Structures with QD values greater than 0.6 consist of completely and independently folded domains; an example of one such structure is shown above the free energy surface of the mutant trimer. The fully folded structures are always found to be the minimum energy structures sampled in our simulations, although depending on the temperature the native folded structures may not be the lowest free energy structure. In the wild-type protein, the states with at least two unfolded domains frequently form an amyloid-like interaction using the strongly amyloidogenic segment 56HILILH61. An example of such a structure is shown above the free energy profile of the wild-type trimer. This state is largely stabilized by entropy. The presence of these amyloid-like interactions shifts the landscape toward misfolding: in the wild-type system, the global barrier to misfolding corresponds to the formation of the first amyloid-like interface. By contrast, the highest barrier for complete misfolding in the mutant systems corresponds to unfolding the last folded domain. These two transitions coincide for the dimer systems, but the barrier is significantly higher for the wild-type system, where the amyloid-like interaction is present. Kinetic models that are fit to experimental data of fibril formation show that the rate of elongation of a fibril is oftentimes much larger than that of seeding the fibril (22). 
Our observations are consistent with such a higher rate for elongation compared with initiation of fibril formation. In Fig. S2, we show the 2D free energy profiles of Inline graphic and Inline graphic along with the expectation values of the number of self-recognition contacts at the monomer’s folding temperature Tf. Self-recognition contacts are defined as contacts between two residues at the same position in the monomer sequence from two different domains. Very energetically stable self-recognizing segments are typically at the core of amyloids. At the folding temperature of the monomer, the dominant basin in the free energy landscape of Inline graphic is stabilized largely by entropy but also by a relatively large number of self-recognition interactions, particularly those coming from the amyloidogenic segment 56HILILH61. In contrast, all four states with different numbers of folded (F) and unfolded (U) domains, UUU, UUF, UFF and FFF, are thermally accessible in the landscape of Inline graphic, which lacks this strong amyloidogenic segment.
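The definition of self-recognition contacts given above (the same sequence position in two different domains, in contact) can be made concrete with a short sketch. The distance cutoff of 8.0, the use of one coordinate per residue, and the function name are assumptions; the source does not specify how a contact is detected.

```python
from itertools import combinations
from math import dist


def count_self_recognition_contacts(domains, cutoff=8.0):
    """Count contacts between identical sequence positions in different domains.

    `domains` is a list of per-domain coordinate lists, one (x, y, z)
    point per residue, all domains having the same length. The cutoff
    value is a hypothetical choice for this sketch.
    """
    n_positions = len(domains[0])
    if any(len(domain) != n_positions for domain in domains):
        raise ValueError("all domains must have the same number of residues")
    count = 0
    for domain_a, domain_b in combinations(domains, 2):
        for position in range(n_positions):
            if dist(domain_a[position], domain_b[position]) < cutoff:
                count += 1
    return count


# Two hypothetical three-residue domains; positions 0 and 2 are in contact.
dom_a = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
dom_b = [(0.0, 5.0, 0.0), (10.0, 20.0, 0.0), (20.0, 7.9, 0.0)]
n_contacts = count_self_recognition_contacts([dom_a, dom_b])
```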

Fig. 2.


The 2D free energy profiles of Inline graphic (Upper Left), Inline graphic (Upper Right), Inline graphic (Lower Left), and Inline graphic (Lower Right) are plotted at their respective melting temperatures. The free energy profiles (in units of kilocalories per mole) are shown as a function of the number of interdomain contacts Inline graphic and QD. Example structures of the UUF (two domains unfolded, one folded) state and the FFF (all domains folded) state are shown above the profiles of Inline graphic and Inline graphic, respectively. Note that UUF does not refer specifically to the first two domains being unfolded, but rather to any one of the domains being folded and the other two unfolded or misfolded. The particular structure shown above the Inline graphic profile shows one well-folded domain and two domains that have misfolded and formed the strong amyloid-like interaction. This structure is typical of the UUF state in the wild-type system. In all cases, the number of interdomain interactions increases steadily as the number of unfolded domains increases, although the number of interdomain contacts is always fewer in the case of the mutant constructs.

Conditions for Spontaneous Misfolding: Temperature, Effective Local Protein Concentration, and Interdomain Interactions.

To determine the conditions for spontaneous misfolding, we calculated the relative free energy difference ΔF between the native and the misfolded states (or between the native and unfolded states for the single domain systems). ΔF is plotted as a function of the number of fused domains for three different temperatures, Inline graphic, Inline graphic, and Inline graphic in Fig. 3. These temperatures correspond to native conditions Inline graphic, the melting temperature of the monomer Inline graphic, and denaturing conditions Inline graphic. Note that for the constructs with Inline graphic, the unfolded state, which has a small number of interdomain and intradomain contacts, has a negligible population in this temperature range compared with the native ensemble and the misfolded ensemble. The ΔF values for the wild-type systems are plotted as solid lines. The ΔF values for the mutant systems are plotted as dashed lines. In Fig. 3, a data point having a positive value along the ordinate indicates that the native state of that system is more stable than the misfolded state. When Ndomain increases, thus raising the effective local protein concentration, the native state becomes stabilized or destabilized depending on the temperature and strength of interdomain interactions. For the wild-type system, when the temperature is above Tf, so that the monomer readily unfolds and refolds, fusing more domains to the constructs always leads to the formation of misfolded states that are entropically more stable than the native state. The larger the number of domains, the more stable the misfolded state becomes. At temperatures somewhat lower than Tf, the native state is instead stabilized by the addition of a second domain. At Tf (in green), the wild-type dimer construct spontaneously misfolds, as does the mutant tetramer. The mutant dimer and trimer constructs maintain marginal stability. 
Our result for the wild-type construct is consistent with the aggregation experiments on I27 under denaturing conditions [28% Trifluoroethanol (TFE)] done by Wright et al. (12) and Borgia et al. (14), in which the number of monomers involved in the critical step for initial aggregation was inferred to be 2.
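The sign convention for the relative free energy difference (positive when the native state is more stable) can be illustrated with a two-state population estimate. The paper obtains its free energies from umbrella sampling; the population-ratio formula below, the function name, and the use of an absolute temperature in kelvin are assumptions of this sketch.

```python
from math import log

# Boltzmann constant in kcal/(mol*K).
K_B_KCAL = 0.0019872041


def delta_f(p_native, p_misfolded, temperature_k):
    """Free energy of the misfolded state relative to the native state.

    Computed as -k_B * T * ln(P_misfolded / P_native), so a positive
    value means the native state is the more stable one.
    """
    return -K_B_KCAL * temperature_k * log(p_misfolded / p_native)


# Hypothetical populations: 90% native, 10% misfolded at 300 K.
example = delta_f(0.9, 0.1, 300.0)  # positive: native state favored
```

Swapping the two populations flips the sign, which corresponds to a construct that spontaneously misfolds.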

Fig. 3.


The relative free energy difference ΔF (in units of kilocalories per mole) between the misfolded state Inline graphic and the native state Inline graphic is plotted as a function of Ndomain at various temperatures. The ΔF values of the wild-type systems are plotted with circles and solid lines. The ΔF values of the mutant systems are plotted with squares and dashed lines. The results at different temperatures are distinguished by color: Inline graphic are in blue, green, and orange, respectively. A point with a positive value along the y axis indicates that the native state is more stable than the misfolded state. The effect of increasing Ndomain on the relative stability of the native and misfolded states depends on the temperature. When Inline graphic, adding more domains destabilizes the native state. When Inline graphic, adding one more domain to the monomer stabilizes the native state. The turnover in ΔF as Ndomain increases is more evident in the wild-type system. When Inline graphic (green), the misfolded state first becomes significantly more stable than the native state for the wild-type dimer and for the mutant tetramer, as indicated by red arrows.

Simulated Annealing of the Fused Systems: Ndomain = 2 and Ndomain = 4 Are Critical for Misfolding of I27 and I27*, Respectively.

To characterize structurally metastable misfolded structures, we performed simulated annealing molecular dynamics simulations. These simulations are analogous to refolding experiments wherein the environmental conditions are rapidly changed from denaturing to native conditions. All protein constructs were simulated starting from a totally extended structure. The temperature was slowly annealed from slightly above the folding temperature to below it Inline graphic for efficient folding. The same annealing schedule was used for all systems. Forty independent annealing simulations were run for each system, and the final snapshots were collected for structural analysis. In Fig. S3, the fraction of misfolded domains as a function of Ndomain is plotted for both the wild-type and mutant systems. The wild-type and mutant monomers both have a low fraction of misfolded domains. For the wild-type constructs, Inline graphic is a critical point above which misfolding dominates. This result from the annealing studies is consistent with our equilibrium simulation results shown in Fig. 3; for the mutants, increasing Ndomain has a more gradual effect on the rate of misfolding. With each domain that is added, the amount of misfolding increases, and at Inline graphic almost all domains misfold. The frequencies of interdomain contacts formed in the final structures of the annealing simulations for the wild-type and mutant trimers are summarized using contact maps in Fig. S4.

Self-Recognition of Short Amyloidogenic Segments Is the Main Cause of Misfolding in Simulated Annealing Simulations.

The wild-type misfolded structures contain a significant number of self-recognition contacts that are highly localized within the two amyloidogenic segments. The second segment (56HILILH61) forms a stronger amyloid-like pairing with itself so that those contacts form much more frequently than do contacts involving the other, weaker, amyloidogenic segment (11VEVFVGETAHFEIE24). We note here that within natural protein sequences, amyloidogenic segments are common but not universal. The local nature of the strong amyloid-like interaction is consistent with the largely entropic stabilization of the misfolded species seen in our previous study (20); this contrasts with the popular notion that all peptides are capable of participating in amyloid-like interactions and that, therefore, the amyloid state is the most energetically stable state for all protein sequences. Misfolded structures of the mutant have residual self-recognition interactions between copies of the weaker amyloidogenic segment and also have a significant number of domain-swapped contacts. It is interesting to note that the two distinct amyloidogenic segments actually form a native pair of β-strands in fully folded I27. In the simulations of the mutant trimer, these two segments from different domains often form a β-strand pair, which sometimes leads to the formation of a fully domain-swapped structure across two domains. The domain-swapped pairing of these two segments is rarely observed in the final annealed structures of the wild-type constructs due to the formation of strong amyloid-like interactions between copies of 56HILILH61 from distinct domains.

Topology of the Misfolded Structures.

In the sequence of I27, there are two amyloidogenic segments according to our amylometer calculation (20), one of which is weaker than the other. It is interesting to view the misfolded multidomain structures mentioned above as arising from multiple independent polymer units that can be cross-linked to other chains via these amyloid-like interactions. Because β-strands can be linked to two other β-strands via backbone hydrogen bonding, a single domain can be thought of as a polymer unit having two reactive groups (23, 24) per amyloidogenic segment. In the case where there is only a single amyloidogenic segment, each chain can only form two cross-links. Therefore, the polymerization proceeds linearly through a combination of elongation and breakup events, and later through the association of protofibril structures into mature fibrils. However, when the number of reactive groups per chain exceeds two, as is the case here, branched structures become possible and completely ordered fibrils must compete with these other structures. Topological representations of linear protofibril-like structures and branched structures selected from simulations of the wild-type trimer and tetramer are shown in Fig. S5. Among the final structures obtained by simulated annealing of the trimer, protofibril-like structures and branched structures are observed 47.5% and 5% of the time, respectively. For the tetramer, protofibril-like and branched structures are observed 30% and 32.5% of the time, respectively. We also made the point mutation V11K to the wild-type tetramer to reduce the interactions only between the weaker amyloidogenic segments. The percentage of observed protofibril-like and branched structures in the tetramer changed to 47.5% and 7.5%, respectively, suggesting the additional weaker amyloidogenic segment is needed for extensive branching. The fibrillar and branched structures are shown in Fig. 4. 
The growth kinetics and molecular weight distribution of branched and linear aggregates are likely to be different, and this difference should be discernible in experiment. It is interesting to note that in the experiments of Borgia et al. (14) on I27, amorphous aggregates form, and the rate of aggregation appears to be proportional to the surface area of the aggregate. No macroscopic fibrils were observed in these experiments. Nonetheless, some linear protofibril-like structures are still found in our folding simulations of the tetramer Inline graphic.
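The counting argument behind branching, and the run counts implied by the reported percentages, can be checked with a few lines of Python. The function names are hypothetical, and applying the figure of 40 annealing runs per system to the V11K tetramer is an assumption, since the source states this number only for the constructs described earlier.

```python
def reactive_groups(n_amyloidogenic_segments, links_per_strand=2):
    """Each β-strand can hydrogen bond to two neighboring strands."""
    return n_amyloidogenic_segments * links_per_strand


def branching_possible(n_amyloidogenic_segments):
    """Branched structures require more than two reactive groups per chain."""
    return reactive_groups(n_amyloidogenic_segments) > 2


def runs_from_percentage(percentage, n_runs=40):
    """Convert a reported percentage into a number of annealing runs."""
    return round(percentage / 100 * n_runs)


# Reported (protofibril-like, branched) percentages of final structures.
reported = {
    "wild-type trimer": (47.5, 5.0),
    "wild-type tetramer": (30.0, 32.5),
    "V11K tetramer": (47.5, 7.5),
}
counts = {
    name: tuple(runs_from_percentage(p) for p in pair)
    for name, pair in reported.items()
}
```

Under these assumptions the percentages correspond to whole numbers of runs, for example 12 protofibril-like and 13 branched structures for the wild-type tetramer.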

Fig. 4.


Two of the misfolded structures of the wild-type tetramer observed at the end of the annealing simulations are shown. (Left) Typical protofibril-like structure. The four short segments that are colored orange are the amyloidogenic segments (56HILILH61). In this structure, the amyloidogenic segments form a single in-register, parallel β-sheet. (Right) Typical branching structure. The weaker amyloidogenic segment (11VEVFVGETAHFEIE24) is colored in blue. This branching structure corresponds to the topology shown in Fig. S5D Lower Center.

Discussion

Factors Affecting Spontaneous Misfolding.

The human muscle protein Titin consists of 244 individually folded domains, and these domains unfold when the protein is stretched and refold when the tension is removed (25). It has been suggested that possible misfolding in multidomain proteins is ameliorated by reducing the sequence identity between neighboring domains (12). The previous multidomain misfolding study using AWSEM showed that at the molecular level, by lowering the sequence identity, the formation of strong interdomain interactions, including both the domain-swapped and the self-recognition contacts, is reduced (20). For multidomain proteins like Titin, which may undergo frequent unfolding and refolding, the lack of strong domain-swapped contacts and amyloid-like self-recognition interactions seems to be required for reliable refolding. In the artificial constructs Inline graphic and Inline graphic, both ample domain-swapped contacts and strong amyloid-like interactions exist across the domains. The refolding in the annealing simulations becomes more and more unreliable when the number of domains increases. Mutating away one of the stronger amyloid-like segments alleviates the misfolding considerably for Inline graphic.

We also explored the effect of temperature and fusing additional domains on stabilizing and destabilizing the native state. As shown in our previous study (20), the misfolded state for the wild-type dimer is only entropically favored over the native state; therefore, at higher temperatures, above the Tf of the monomer, the misfolded state is more stable than the native state, but not at low temperature. The misfolded states of the trimer and tetramer are even more stable. When the temperature is high enough, however, a fully unfolded ensemble becomes populated, and both the populations of the folded and misfolded states are negligible. At low temperature, where the native state is stable, adding a second domain actually stabilizes the native state, consistent with experiments (26), presumably due to crowding effects (27). Therefore, there is a limited range of temperatures that favor the misfolded state in our simulations. It is interesting to note that it has been observed in experiments that misfolding and aggregation only prevail for an intermediate range of denaturant concentration. In this respect, denaturant and temperature have similar effects on the population of the native, misfolded, and unfolded states.

Our simulations show how multiple, structurally distinct misfolded states can be stabilized under different thermodynamic conditions and therefore provide a plausible link between the aggregation experiments of Wright et al. (12) and the single-molecule misfolding experiments of Borgia et al. (28). In experiments on the I27 fused dimer under native conditions, a misfolded state can form during refolding (28) and is found to be very kinetically stable. The kinetic stability of the misfolded state may come from the strong local amyloid-like interactions or from domain swapping. If amyloid-like interactions are the dominant cause of misfolding under native conditions, how long the misfolded state takes to refold would depend on the strength of these interactions. We would therefore expect the misfolded state to be short-lived or absent during refolding of the mutant systems. Different misfolded structures may be preferred under different conditions, e.g., high vs. low denaturant concentration. Domain swapping is more likely to be important in misfolding under native conditions than it is under denaturing conditions. The misfolded state that seeds aggregation under denaturing conditions must be largely entropically stabilized and is therefore likely to be the amyloid-like state that dominates at high temperature and is found to be the dominant cause of misfolding in our simulated annealing simulations.

Critical Number of Domains for Misfolding.

The wide range of temporal and spatial scales involved in aggregation makes individual phases of aggregation difficult to probe in experiment. Thus, there is not yet agreement about the size of the species that initiate aggregation, much less their structures. Indeed, both of these aspects are likely to be system dependent. A critical number of domains for misfolding can be inferred through aggregation experiments by making constructs having increasing numbers of domains and monitoring lag times and macroscopic aggregation rates, but these experiments are difficult to interpret quantitatively. Experiments on isolated fused constructs promise to clarify the initiation of aggregation. Fusing different numbers of molecules together and keeping these constructs isolated from each other allows for precise control of local protein concentration and the phase of aggregation. These types of experiments yield the relative free energies of the folded and misfolded states. In the experiments of Liu et al. (13) on fused constructs of U1A, the critical number of domains for spontaneous misfolding was found to be four. This misfolded species could correspond to the formation of a specific stable structure, the formation of a critical number of nonspecific interactions, or a mixture of the two. Making a definitive distinction between these cases will necessarily depend on further structural characterization using experiments and simulation. Furthermore, we have shown that the critical number of domains for spontaneous misfolding for a system depends on the environmental conditions, so care must be taken when comparing experiments performed with, for example, different concentrations of denaturant.

Aggregation experiments performed on I27 fused constructs suggest that the critical misfolded species is the dimer (12). Our simulations are more closely analogous to the experiments of Liu et al. (13) on isolated fused constructs. Corresponding laboratory studies have not yet been published for I27. The bulk aggregation experiments that have been used to infer the critical number of domains for misfolding were done in strongly denaturing conditions (28% TFE). We simulate denaturing conditions by raising the temperature above the folding temperature of the monomer. Under these conditions, our calculations agree with the critical number of domains for misfolding inferred from the aggregation experiments. Interestingly, our simulations show that this critical number of domains is sensitive to the strength of a particular type of interdomain interaction: mutating away the strong amyloid-like interaction in I27 allows the dimer and trimer to remain marginally stable at the folding temperature of the monomer. The tetramer appears to be a critical size for the mutant construct insofar as the relative population of the misfolded state increases abruptly when going from the I27* trimer to the I27* tetramer; this is in harmony with the observation that misfolding completely dominates in the annealing simulations of the I27* tetramer, similar to what was found for wild-type I27.

Ordered vs. Amorphous Aggregation.

Borgia et al. (14) note that no fibers were observed during the aggregation of I27. Instead, the aggregate that does form remains amorphous over several months, and the kinetic data can be fitted well using a model wherein the rate of aggregation is proportional to the surface area of the aggregate. Aggregates of I27, however, do bind dyes of the type that stain fiber structures, indicating that some fiber-like structure may be present on the nanoscale (12). One reason why I27 may prefer an amorphous morphology to forming a well-defined fiber is the presence of multiple amyloidogenic elements in its sequence. If these amyloidogenic segments serve as strong cross-links between chains while the rest of the structure remains relatively disordered, the aggregate will be a branched, gel-like structure containing β-sheet nanocrystals embedded in an amorphous matrix. Such structures are not unprecedented—the proteins that compose spider silk also have multiple amyloidogenic segments and are thought to form cross-links in the form of β-sheet nanocrystals (29). The formation of gels through amyloid-like interactions can be probed using microrheology experiments (30). In these experiments, a “precocious” change in the mechanical properties of the protein solution was observed before the formation of fibrils, which likely corresponds to a gel-like phase wherein cross-links are formed using amyloid-like interactions. Such structures even have precedent in nature, having been found to work as sieve-like permeability barriers (31). Given the importance of understanding the kinetics and mechanisms of oligomer and insoluble aggregate formation, further studies should include investigations of branched and gel-like structures in aggregation processes as well as the well-defined fibrils.
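
To make the notion of surface-area-limited growth concrete, the following Python sketch integrates a toy rate law in which the aggregate mass M grows at a rate proportional to its surface area. It assumes a compact, roughly spherical aggregate so that the area scales as M^(2/3), ignores depletion of soluble protein, and uses a hypothetical rate constant k; it is not the fitted model of Borgia et al. (14).

```python
def aggregate_mass_exact(t, m0, k):
    """Closed-form solution of dM/dt = k * M**(2/3) with M(0) = m0.

    Separating variables gives M(t) = (m0**(1/3) + k*t/3)**3.
    """
    return (m0 ** (1.0 / 3.0) + k * t / 3.0) ** 3


def aggregate_mass_euler(t_end, m0, k, steps=20000):
    """Forward-Euler integration of the same rate law, as a numerical check."""
    dt = t_end / steps
    m = m0
    for _ in range(steps):
        # Growth rate proportional to surface area, taken as M**(2/3).
        m += k * m ** (2.0 / 3.0) * dt
    return m


if __name__ == "__main__":
    # Hypothetical parameters in arbitrary units.
    m0, k, t_end = 8.0, 0.3, 10.0
    print(aggregate_mass_exact(t_end, m0, k))
    print(aggregate_mass_euler(t_end, m0, k))
```

In this toy model the cube root of the mass grows linearly in time, which distinguishes surface-limited growth of a compact aggregate from the growth of a fiber that elongates only at its ends.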

Acknowledgments

We thank Jane Clarke and Martin Gruebele for helpful suggestions upon a critical reading of the manuscript. Support for this work was provided by National Institute of General Medical Sciences Grants R01 GM44557 and P01 GM071862 (to W.Z. and N.P.S.); the D. R. Bullard-Welch Chair at Rice University; and the Data Analysis and Visualization Cyberinfrastructure funded by National Science Foundation Grant OCI-0959097.

Footnotes

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1320483110/-/DCSupplemental.

References

  1. Ross CA, Poirier MA. Protein aggregation and neurodegenerative disease. Nat Med. 2004;10(Suppl):S10–S17. doi: 10.1038/nm1066.
  2. Chiti F, Dobson CM. Protein misfolding, functional amyloid, and human disease. Annu Rev Biochem. 2006;75:333–366. doi: 10.1146/annurev.biochem.75.101304.123901.
  3. Benilova I, Karran E, De Strooper B. The toxic Aβ oligomer and Alzheimer’s disease: An emperor in need of clothes. Nat Neurosci. 2012;15(3):349–357. doi: 10.1038/nn.3028.
  4. Berthelot K, Cullin C, Lecomte S. What does make an amyloid toxic: Morphology, structure or interaction with membrane? Biochimie. 2013;95(1):12–19. doi: 10.1016/j.biochi.2012.07.011.
  5. Davtyan A, et al. AWSEM-MD: Protein structure prediction using coarse-grained physical potentials and bioinformatically based local structure biasing. J Phys Chem B. 2012;116(29):8494–8503. doi: 10.1021/jp212541y.
  6. Zheng W, Schafer NP, Davtyan A, Papoian GA, Wolynes PG. Predictive energy landscapes for protein-protein association. Proc Natl Acad Sci USA. 2012;109(47):19244–19249. doi: 10.1073/pnas.1216215109.
  7. Bryngelson JD, Wolynes PG. Spin glasses and the statistical mechanics of protein folding. Proc Natl Acad Sci USA. 1987;84(21):7524–7528. doi: 10.1073/pnas.84.21.7524.
  8. Bryngelson JD, Onuchic JN, Socci ND, Wolynes PG. Funnels, pathways, and the energy landscape of protein folding: A synthesis. Proteins Struct Funct Bioinform. 1995;21(3):167–195. doi: 10.1002/prot.340210302.
  9. Bryngelson JD, Wolynes PG. Intermediates and barrier crossing in a random energy model (with applications to protein folding). J Phys Chem. 1989;93(19):6902–6915.
  10. Onuchic JN, Wolynes PG, Luthey-Schulten Z, Socci ND. Toward an outline of the topography of a realistic protein-folding funnel. Proc Natl Acad Sci USA. 1995;92(8):3626–3630. doi: 10.1073/pnas.92.8.3626.
  11. Dobson CM. Protein folding and misfolding. Nature. 2003;426(6968):884–890. doi: 10.1038/nature02261.
  12. Wright CF, Teichmann SA, Clarke J, Dobson CM. The importance of sequence diversity in the aggregation and evolution of proteins. Nature. 2005;438(7069):878–881. doi: 10.1038/nature04195.
  13. Liu F, Gruebele M. Mapping an aggregation nucleus one protein at a time. J Phys Chem Lett. 2009;1:16–19.
  14. Borgia MB, Nickson AA, Clarke J, Hounslow MJ. A mechanistic model for amorphous protein aggregation of immunoglobulin-like domains. J Am Chem Soc. 2013;135(17):6456–6464. doi: 10.1021/ja308852b.
  15. Cellmer T, Bratko D, Prausnitz JM, Blanch H. Protein-folding landscapes in multichain systems. Proc Natl Acad Sci USA. 2005;102(33):11692–11697. doi: 10.1073/pnas.0505342102.
  16. Wang Y, Chu X, Suo Z, Wang E, Wang J. Multidomain protein solves the folding problem by multifunnel combined landscape: Theoretical investigation of a Y-family DNA polymerase. J Am Chem Soc. 2012;134(33):13755–13764. doi: 10.1021/ja3045663.
  17. Arviv O, Levy Y. Folding of multidomain proteins: Biophysical consequences of tethering even in apparently independent folding. Proteins. 2012;80(12):2780–2798. doi: 10.1002/prot.24161.
  18. Dima RI, Thirumalai D. Exploring protein aggregation and self-propagation using lattice models: Phase diagram and kinetics. Protein Sci. 2002;11(5):1036–1049. doi: 10.1110/ps.4220102.
  19. Wu C, Shea J-E. Coarse-grained models for protein aggregation. Curr Opin Struct Biol. 2011;21(2):209–220. doi: 10.1016/j.sbi.2011.02.002.
  20. Zheng W, Schafer NP, Wolynes PG. Frustration in the energy landscapes of multidomain protein misfolding. Proc Natl Acad Sci USA. 2013;110(5):1680–1685. doi: 10.1073/pnas.1222130110.
  21. Kumar S, Rosenberg JM, Bouzida D, Swendsen RH, Kollman PA. The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method. J Comput Chem. 1992;13(8):1011–1021.
  22. Lee C-C, Nayak A, Sethuraman A, Belfort G, McRae GJ. A three-stage kinetic model of amyloid fibrillation. Biophys J. 2007;92(10):3448–3458. doi: 10.1529/biophysj.106.098608.
  23. Flory PJ. Constitution of three-dimensional polymers and the theory of gelation. Rubber Chem Technol. 1942;15(4):812–819.
  24. Stockmayer WH. Theory of molecular size distribution and gel formation in branched-chain polymers. J Chem Phys. 1943;11:45–55.
  25. Minajeva A, Kulke M, Fernandez JM, Linke WA. Unfolding of titin domains explains the viscoelastic behavior of skeletal myofibrils. Biophys J. 2001;80(3):1442–1451. doi: 10.1016/S0006-3495(01)76116-4.
  26. Rounsevell RW, Steward A, Clarke J. Biophysical investigations of engineered polyproteins: Implications for force data. Biophys J. 2005;88(3):2022–2029. doi: 10.1529/biophysj.104.053744.
  27. Cheung MS, Klimov D, Thirumalai D. Molecular crowding enhances native state stability and refolding rates of globular proteins. Proc Natl Acad Sci USA. 2005;102(13):4753–4758. doi: 10.1073/pnas.0409630102.
  28. Borgia MB, et al. Single-molecule fluorescence reveals sequence-specific misfolding in multidomain proteins. Nature. 2011;474(7353):662–665. doi: 10.1038/nature10099.
  29. Keten S, Xu Z, Ihle B, Buehler MJ. Nanoconfinement controls stiffness, strength and mechanical toughness of beta-sheet crystals in silk. Nat Mater. 2010;9(4):359–367. doi: 10.1038/nmat2704.
  30. Manno M, Giacomazza D, Newman J, Martorana V, San Biagio PL. Amyloid gels: Precocious appearance of elastic properties during the formation of an insulin fibrillar network. Langmuir. 2010;26(3):1424–1426. doi: 10.1021/la903340v.
  31. Ader C, et al. Amyloid-like interactions within nucleoporin FG hydrogels. Proc Natl Acad Sci USA. 2010;107(14):6281–6285. doi: 10.1073/pnas.0910163107.
