Abstract
Protein refolding/misfolding to an alternative form plays an aetiologic role in many diseases in humans, including Alzheimer's disease, the systemic amyloidoses, and the prion diseases. Here we have discovered that such refolding can occur readily for a simple lattice model of proteins in a propagatable manner without designing for any particular alternative native state. The model uses a simple contact energy function for interactions between residues and does not consider the peculiarities of polypeptide geometry. In this model, under conditions where the normal (N) native state is marginally stable or unstable, two chains refold from the N native state to an alternative multimeric energetic minimum comprising a single refolded conformation that can then propagate itself to other protein chains. The only requirement for efficient propagation is that a two-faced mode of packing must be in the ground state as a dimer (a higher-energy state for this packing leads to less efficient propagation). For random sequences, these ground-state dimeric configurations tend to have more β-sheet-like extended structure than almost any other sort of dimeric ground-state assembly. This implies that propagating states (such as for prions) are β-sheet rich because the only likely propagating forms are β-sheet rich. We examine the details of our simulations to see to what extent the observed properties of prion propagation can be predicted by a simple protein folding model. The formation of the alternative state in the present model shows several distinct features of amyloidogenesis and of prion propagation. For example, an analog of the phenomenon of conformationally distinct strains in prions is observed. We find a parallel between 'glassy' behavior in liquids and the formation of a propagatable state in proteins. This is the first report of simulation of conformational propagation using any heteropolymer model. The results imply that some (but not most) small protein sequences must maintain a sequence signal that resists refolding to propagatable alternative native states and that the ability to form such states is not limited to polypeptides (or reliant on regular hydrogen bonding per se) but can occur for other protein-like heteropolymers.
Keywords: Amyloid, propagation, prion, protein folding, simulation
Protein misfolding to an alternative form underlies a variety of degenerative disorders in humans. These include amyloidoses such as Alzheimer's disease and senile systemic amyloidosis (Kelly 1996, Kelly 1998; Selkoe 1997) and also the prion diseases (Harrison et al. 1997; Cohen and Prusiner 1998).
In amyloidoses, the N native soluble form of a protein undergoes a conformational change and assembles into an amyloid fibril. There appears to be a direct causal link between the formation of such amyloid fibrils and the pathogenesis of these diseases. For example, in Alzheimer's disease, A-β peptide is metastable in its soluble state and becomes more structured, with β-sheet forming on appearance of an amyloidogenic intermediate (Kelly 1996, Kelly 1998; Selkoe 1997). Lansbury and coworkers have shown that the Alzheimer's amyloid assembly process is modular, with at least two quaternary structural intermediates during amyloid filament assembly (Harper et al. 1997). However, it has also been shown that Alzheimer's amyloid can grow through the irreversible binding of monomers to the ends of a fibril (Lomakin et al. 1996). In contrast to Alzheimer's disease, the amyloidogenic intermediate of other amyloid disease proteins (such as in nonneuropathic lysozyme amyloidosis [Booth et al. 1997] or transthyretin-based systemic amyloidoses [Lai et al. 1997]) appears to be less structured than the N native state. Recently, amyloid fibrils have been made from proteins independent of any immediate disease context (Chiti et al. 1999; Jimenez et al. 1999).
A prion is an alternative propagatable conformation for a protein that refolds from the N native state. The prion particle forms under the appropriate conditions and, in a cellular or intact animal system, converts copies of the N native conformation of the protein to the alternative conformation. In the prion diseases of humans and other mammals, the cellular prion protein PrPC changes conformation to a disease-causing form PrPSc that is rich in β-sheet (Pan et al. 1993). PrPSc is the obligatory and major component of the infectious prion particle (Prusiner 1982). The minimal size of the PrPSc particle appears to be a dimer, as deduced from ionizing radiation studies (Bellinger-Kawahara et al. 1988). Although PrPSc can assemble into amyloid-like rods following partial proteolytic degradation to form PrP(27–30) (Nguyen et al. 1995), amyloid formation is not required for infectivity (Wille et al. 1996). The ability of prions to infect organisms and direct their self-propagation is a principal distinction between prions and other amyloidogenic particles. Three other prions have been discovered: [PSI] and [URE3] in the yeast Saccharomyces cerevisiae and [Het-s] in the fungus Podospora anserina (Lindquist 1997; Wickner 1997). Although the prion proteins are central actors in replication, additional cellular factors appear to play a role in the replication process, including 'Protein X' in the case of PrP (Kaneko et al. 1997) and the chaperone Hsp104 for [PSI] (Chernoff et al. 1995). Strains of prions for PrPSc have been identified that have distinct disease incubation times and neurohistopathology (Harrison et al. 1997; Cohen and Prusiner 1998). A growing body of work indicates that the biological properties of up to eight different prion strains are encrypted in the PrPSc tertiary structure (Bessen et al. 1995; Telling et al. 1996; Collinge et al. 1996; Safar et al. 1998). There is also evidence for distinct strains of the [PSI] prion (Derkatch et al. 1996).
Simple lattice models have helped in understanding many properties of protein folding and evolution (Leopold et al. 1992; Chan and Dill 1991, Chan and Dill 1994; Sali et al. 1994; Bryngelson et al. 1995; Li et al. 1996; Klimov and Thirumalai 1996; Dill and Chan 1997; Govindarajan and Goldstein 1997; Bornberg-Bauer and Chan 1999; Buchler and Goldstein 1999; Pande and Rokhsar 1999; Dinner and Karplus 1999). A number of lattice studies have investigated various aspects of misfolding or folding to alternative conformations. In a previous lattice model study, we showed that less stable proteins intrinsically have a greater chance of encrypting alternative native states as multimers (such as those that occur for prions) and that the hydrophobicity of a protein sequence has no bearing on the existence of an alternative native state (Harrison et al. 1999). Shakhnovich and coworkers have studied what happens when two alternative conformations occupy the ground state for a protein sequence (a scenario that may be relevant to prion formation; Abkevich et al. 1998). They found that, under denaturing conditions, the conformation with more local contacts is folded to first. Dinner and Karplus (1998) reported an example of a model protein chain for a lattice model that folds recurrently and consistently to a metastable native state. (Similar behavior has been simulated previously for an off-lattice model for the folding of a β-barrel [Honeycutt and Thirumalai 1990].) They found that the barrier from the kinetic metastable native state to the thermodynamic lowest-energy state was chiefly entropic. There have been several lattice studies on aggregation in proteins (Broglia et al. 1998; Gupta et al. 1998; Istrail et al. 1999). Gupta et al. (1998) found that despite the simplicity of their two-dimensional (2D) model, aggregates tended to be formed from native protein contacts. Broglia et al. (1998) noted that aggregated states for two chains form from local contacts that occur early in the normal folding process between strongly interacting residues. Istrail et al. (1999) reported that the propensity to aggregate depends not simply on the sequence composition but also on the proportion of long runs of hydrophobic or hydrophilic residues in the sequence. A recent study (Giugliarelli et al. 2000) focused on a 2D superlattice of protein chains. It showed that the number of sequences that have compact, soluble native states is greatest at a residue interaction potential that gives protein-like hydrophobicities and for which the number of prion-forming sequences is of the same order.
How can a propagatable alternative conformation arise for a protein? To what extent can the features of prion propagation come together in a simple model of protein folding, without accessory factors? We attempt to address these questions in this article, using a simple model of protein folding. We report the first simulation of propagation of an alternative conformation using any protein model. This occurs without designing the alternative conformation. Intriguingly, the model shows some features of both amyloidogenesis and prion formation. It provides insights into possible mechanisms for propagation that may be particularly relevant to the non-PrP (yeast and fungus) prions.
Results
Searching for an alternative conformation that propagates
Our chief goal was to investigate how a propagating alternative conformation might arise in a simple 2D lattice model of protein folding. Toward this end, we first performed enumerations of assemblies of two 16-mer 2D lattice chains for random sequences (described in Materials and Methods). We looked for lowest-energy modes of packing in which each member of the dimer assembly has two binding faces (depicted schematically in Fig. 1). Only one such mode of packing was observed, which we term an R state (Figs. 1a, 2). This occurs for ∼6% of random sequences. It is reminiscent of the stacking evident for type-1 pilus assembly in Escherichia coli (Choudhury et al. 1999). The packing enables stacking of multiple chains of the same conformation end to end. The R2 dimer configurations have one notable distinguishing feature: They adopt a more extended structure than other lowest-energy dimeric assemblies. (The R2 dimer assemblies are significantly more extended, as determined by a Mann-Whitney U-test [Hollander and Wolfe 1973; p = 0.013 that their more extended nature in the sample of 70 sequences is random]; see Materials and Methods.) An extended secondary structure is defined by two succeeding chain bonds that travel in the same direction on the lattice. Extended secondary structure is thus β-sheet-like. The proportion of these extended secondary structures in the R2 assemblies is 0.70 (±0.13), compared to 0.36 (±0.14) in general. Amyloid and prions have extensive β structure (Pan et al. 1993; Sunde et al. 1997; Taylor et al. 1999; King et al. 1997). The stacked β-sheet-like R state that we have found may thus be the lattice analog of the amyloid protofilament, as its stacking occurs only along one axis of lattice space.
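The definition of extended structure given above can be sketched in a few lines of Python. The function name, the coordinate representation, and the normalization by the number of consecutive bond pairs are assumptions made for illustration; the text specifies only that an extended element is two succeeding chain bonds that travel in the same direction.

```python
from typing import List, Tuple

Coord = Tuple[int, int]


def extended_fraction(chain: List[Coord]) -> float:
    """Fraction of consecutive bond pairs that point in the same direction.

    `chain` is a list of 2D lattice coordinates in sequence order.
    Normalizing by the number of bond pairs is an assumption of this sketch.
    """
    # Bond vectors between successive residues.
    bonds = [(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in zip(chain, chain[1:])]
    # Each pair of succeeding bonds is one candidate extended element.
    pairs = list(zip(bonds, bonds[1:]))
    if not pairs:
        return 0.0
    straight = sum(1 for first, second in pairs if first == second)
    return straight / len(pairs)


# Hypothetical 6-residue chain: a straight run followed by a turn back.
example_chain = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (2, 1)]
fraction = extended_fraction(example_chain)
```

For the hypothetical chain shown, two of the four bond pairs are straight, so the fraction is 0.5; a fully straight chain gives 1.0.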
Fig. 1.
Two packing modes that give two binding faces per chain. The symbols are meant to represent distinct binding surfaces schematically and do not correspond to lattice residue types. (a) Packing mode 1 (observed here), (b) packing mode 2 (not observed here).
Fig. 2.
Gallery of key conformations and configurations for the model protein sequence studied. The four residue types are represented as follows: white disc for P, black disc for H, a light gray–colored ▵ symbol for A, and a dark gray–colored ▿ symbol for B. The total energy for each assembly is indicated here: (a) The N state for the model protein sequence (Etotal = −34). (b) The R-state dimer assembly for the model protein sequence (Etotal = −87). (c) The other dimer ground-state assembly for the model protein sequence (Etotal = −87). (d) The R3 assembly for the model protein sequence (Etotal = −147). (e) The domain-swapped variant of the model protein sequence R2 and R3 assemblies (Etotal = −86 and −146, respectively). (f) The lowest-energy assembly of the normal native (N) state for two chains for the model protein (denoted 2Nlowest in the text) (Etotal = −86).
Helical structure for our model can be defined as runs of at least two consecutive bends with the contact pattern i → i + 3, i + 2 → i + 5, and so on (basically the definition used previously by Chan and Dill 1991). We did not find any R states with helical structure in our initial sample of 70 sequences. A previous study showed that aggregatability relies on the composition of a sequence in terms of 'runs' of hydrophobic and hydrophilic residues (Istrail et al. 1999). We therefore examined the sequences that have an R state for any obvious trends in sequence composition but found none.
Two sequences studied for propagation
To investigate propagation, extensive folding and refolding simulations were performed for a model protein sequence that has a lowest-energy R2 dimer in its ground state arising from the dimeric enumerations. (The proportion of extended structure for this R2 dimer of the model protein sequence is 0.71.) A much smaller number of simulations was performed for a random sequence. The construction of these two sequences is described below in Materials and Methods. The N native states of both the model protein and random sequences are illustrated in Figures 2a and 3a. These are the native states at infinite dilution. The encoded N state of the model protein (Fig. 2a) has an obvious hydrophobic core.
Fig. 3.
Gallery of key conformations and configurations for the random sequence studied. The four residue types are represented as for Figure 2. The total energy for each assembly is indicated here: (a) The N state for the random sequence (Etotal = −27). (b) The R-state dimer assembly for the random sequence (Etotal = −71). (c) The second dimer ground-state assembly for the random sequence (Etotal = −71). (d) The third dimer ground-state assembly for the random sequence (Etotal = −71). (e) The R3 assembly for the random sequence (Etotal = −120). (f) The second trimer ground-state assembly for the random sequence (Etotal = −120).
Monte Carlo (MC) dynamics simulations are performed to study three processes of protein refolding. These processes are given by the reactions
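$$2\mathrm{N} \rightarrow \mathrm{R}_2 \tag{1}$$
$$\mathrm{N} + \mathrm{R}_2 \rightarrow \mathrm{R}_3 \tag{2}$$
$$3\mathrm{N} \rightarrow \mathrm{R}_3 \tag{3}$$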
where N is the normal native state and Rn is the refolded self-similar assembly of n chains. Process (1) is spontaneous refolding to an R2 dimer. Process (2) is templated propagation. In terms of prion phenomena, spontaneous refolding corresponds to an initial event of sporadic prion formation (Cohen et al. 1994; Cohen and Prusiner 1998), and templated propagation is a subsequent event of conversion to the alternative conformation (Cohen et al. 1994; Cohen and Prusiner 1998). Process (3) is the spontaneous event that corresponds to process (2).
We also consider the protein folding reaction to the N state starting from a randomly chosen unfolded conformation. This can be considered as protein folding at infinite dilution.
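Monte Carlo dynamics on lattice chains are commonly driven by a Metropolis acceptance rule, in which the only control parameter is the ratio of the contact energy unit to the temperature (ɛ/T). The sketch below assumes that rule; the move set and the exact acceptance criterion used in these simulations are described in Materials and Methods and are not reproduced here, and the function and parameter names are hypothetical.

```python
import math
import random


def metropolis_accept(delta_e: float, eps_over_t: float, rng: random.Random) -> bool:
    """Decide whether to accept a trial move (Metropolis rule, assumed here).

    `delta_e` is the energy change in units of eps (negative = more favorable).
    `eps_over_t` is the interaction strength eps/T.
    """
    if delta_e <= 0:
        # Downhill or neutral moves are always accepted.
        return True
    # Uphill moves are accepted with Boltzmann probability.
    return rng.random() < math.exp(-delta_e * eps_over_t)


def acceptance_rate(delta_e: float, eps_over_t: float, trials: int, seed: int = 0) -> float:
    """Empirical acceptance frequency for a fixed energy change."""
    rng = random.Random(seed)
    accepted = sum(metropolis_accept(delta_e, eps_over_t, rng) for _ in range(trials))
    return accepted / trials
```

Under this rule, a larger ɛ/T makes the loss of a favorable contact less likely to be accepted, which is the sense in which a higher interaction strength corresponds to less denaturing conditions.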
The parameter that was varied in our simulations is the interresidue interaction strength (ɛ/T), as in previous simulation work (Chan and Dill 1994, Chan and Dill 1998). Higher interaction strength (ɛ/T) implies less denaturing conditions (lower denaturant concentration). We used two interaction strengths (ɛ/T) in simulations for the model protein: one at which ΔGNfold for folding to the N state is ∼0.0 (the fractional population f of the N state for normal folding at infinite dilution is 0.5), and another at which the N state would normally be denatured (f = 0.2). These are denoted (ɛ/T)_{f=0.5} and (ɛ/T)_{f=0.2}, respectively.
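The link between the fractional population f and the folding free energy can be illustrated with a short calculation. This is a sketch under a two-state assumption (every chain is either in the N state or not), which the text does not state explicitly; the function name is hypothetical and the free energy is expressed in units of T.

```python
import math


def folding_free_energy(f_native: float) -> float:
    """Free energy of folding in units of T, assuming two states.

    With f the native-state population, the assumed relation is
    dG_fold / T = ln((1 - f) / f), so f = 0.5 gives dG_fold = 0.
    """
    if not 0.0 < f_native < 1.0:
        raise ValueError("f_native must lie strictly between 0 and 1")
    return math.log((1.0 - f_native) / f_native)


dg_marginal = folding_free_energy(0.5)   # marginally stable N state
dg_denatured = folding_free_energy(0.2)  # N state disfavored (positive value)
```

Under this assumption, f = 0.5 corresponds to a folding free energy of zero, and f = 0.2 to a positive value of ln 4 in units of T, meaning the N state is unstable relative to the rest of the ensemble.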
Lowest-energy states for sequences from simulations
We established the complete dimeric ground states of our two sequences from an examination of all of the refolding trajectories involving two chains. We have not found any assemblies of lower energy than the R-state assemblies for each sequence either for two or for three chains. For the model protein sequence, the ground-state conformations (interaction energy E = −87) for the dimeric free-energy surface are the self-similar R2 dimer (Fig. 2b) and another dimeric assembly that has one chain in the N state (Fig. 2c). For the random sequence (E = −71), there are three ground-state dimeric assemblies (Fig. 3b–d). For both examples, the interaction energy at one interface for the R conformation is ∼40% of the total in the R2 dimer (Einterface = −33 for the model protein and −27 for the random sequence). The ground-state conformations for the three-chain simulations were determined in the same manner and are also shown (Fig. 2d, Fig. 3e–f; total energy E = −147 for the model protein sequence, and E = −120 for the random sequence).
What happens in the dimeric refolding trajectories?
Having obtained the dimeric ground-state conformation(s), we examined the trajectories for dimeric refolding in more detail at (ɛ/T)_{f=0.5} = 0.99 for the model protein. A trajectory for the refolding reaction 2N → R2 is illustrated for the model protein sequence at this ɛ/T value (Fig. 4a–d). The average behavior of each variable for all trajectories is inset in each panel of the figure. As noted in Materials and Methods, the simulation protocol for the refolding reactions, in some sense, mimics a molecular crowding effect, for example, as might occur for GPI-anchored proteins in rafts (Simons and Ikonen 1997). The behavior of four variables over the course of the trajectory was monitored. These are the number of contacts in common with the N state (denoted CN), the total energy for the system, the total number of contacts, and the total number of contacts in common with the interface of the R state (Fig. 4a–d). A good indicator of the progress of the refolding reaction is clearly the number of contacts in common with the interface of the R state (which is denoted CRinter). The mean value of CRinter during the simulations before conversion to R2 is 0.6 (±1.7) and is 6.0 (±1.7) afterward (this is the value of CRinter to which all of the simulations eventually converge [Fig. 4d]). In the illustrated example (Fig. 4), there are three distinct periods: First, the two chains quickly find the lowest-energy assembly of the normal N state (denoted 2Nlowest; Fig. 4a). This has an energy of −86 dimensionless units, one unit higher than that of the R-state dimer assembly (Fig. 2f). This first period is characterized by a lack of fluctuation in all four variables (Fig. 4). Second, there is a transient period of substantial unfolding and disassembly of the two chains in which higher-energy species are sampled, which is followed by a third period of intermediate energy fluctuation in which the R-state dimer is found. This mechanism of disassembly and unfolding is common to most of the trajectories (five of six) that first get stuck in the 2Nlowest configuration. This refolding process bears some analogy to the folding process observed for smaller HP-model proteins (Chan and Dill 1994), with the 2N state as an apparent kinetically trapped intermediate similar to those predicted for single-chain folding (Bryngelson et al. 1995; Bryngelson and Wolynes 1987, Bryngelson and Wolynes 1989; Thirumalai and Woodson 1996). The average behavior (over all trajectories) of the total number of contacts and the total energy indicates a rapid (within 625,000,000 moves) convergence to sampling high-contact low-energy assemblies (Fig. 4b–c). However, these retain substantial similarity to the N state until ∼1.25 × 10^11 moves have elapsed (Fig. 4a), whereafter the simulations gradually converge to a value of CN midway between 8 (= CN for the R2 dimer) and 11 (= CN for its partner ground-state assembly).
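Progress variables such as the number of R-state interface contacts can be computed by comparing the interchain contacts of the current configuration with those of a reference assembly. The following is a minimal sketch of that bookkeeping; the function names and the representation of a contact as a pair of residue indices are assumptions made for illustration, and the handling of the two chains being interchangeable is omitted for brevity.

```python
from typing import List, Set, Tuple

Coord = Tuple[int, int]
Contact = Tuple[int, int]  # (residue index in chain A, residue index in chain B)


def interchain_contacts(chain_a: List[Coord], chain_b: List[Coord]) -> Set[Contact]:
    """Residue pairs from different chains on nearest-neighbor lattice sites."""
    index_in_b = {site: j for j, site in enumerate(chain_b)}
    contacts: Set[Contact] = set()
    for i, (x, y) in enumerate(chain_a):
        for neighbor in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            j = index_in_b.get(neighbor)
            if j is not None:
                contacts.add((i, j))
    return contacts


def shared_contacts(current: Set[Contact], reference: Set[Contact]) -> int:
    """Number of current contacts that also occur in the reference interface."""
    return len(current & reference)


# Hypothetical toy example: two straight 3-residue chains lying side by side.
toy_a = [(0, 0), (1, 0), (2, 0)]
toy_b = [(0, 1), (1, 1), (2, 1)]
toy_reference = {(0, 0), (2, 2), (1, 2)}
toy_count = shared_contacts(interchain_contacts(toy_a, toy_b), toy_reference)
```

In the toy example the two chains form three interchain contacts, two of which appear in the hypothetical reference interface.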
Fig. 4.
Example trajectory at (ɛ/T) = 0.99 for the two-chain refolding process 2N → R2. The fluctuations in the following variables are given for a two-chain simulation of 10^10 Monte Carlo moves (with the average behavior over all trajectories inset): (a) The number of N-state contacts (CN); the dotted line indicates the value of CN for the dimeric ground-state assembly (with energy = −87) that is nonpropagating. (b) The total energy; (c) the total number of contacts; (d) the number of R-state interface contacts (CRinter).
In total, we completed 57 of these 10^10-move trajectories at (ɛ/T)_{f=0.5} = 0.99 for two chains (Table 2). Thirty-three successfully refold to the R2 assembly (i.e., the reaction 2N → R2). Importantly, most (27 of 33) of these successful trajectories do not encounter the 2Nlowest configuration. The lowest-energy assembly of the N state has none of the interface contacts that occur for the R state, and any assembly of the N state has at most two such contacts. This suggests that selection for stable assemblies of the normal native state may be a design factor that could help to avoid this misfolding. However, as the model protein sequence already has a very low energy packing of its N conformation (2Nlowest, E = −86), this can be expected to be a minor effect.
Table 1.
Matrix of interactions for the four-letter alphabet
| | H* | P | A | B | 
| H | −4ɛ | −2ɛ | −ɛ | −ɛ | 
| P | −2ɛ | −3ɛ | −2ɛ | −2ɛ | 
| A | −ɛ | −2ɛ | 0 | −5ɛ | 
| B | −ɛ | −2ɛ | −5ɛ | 0 | 
The residue symbols {H, P, A, B} are defined in the text. Each entry is the contact energy (energetic score) assigned to the interaction between residues of the two given types.
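To show how the matrix in Table 1 is applied, the sketch below evaluates the energy of a single chain on a square lattice. It assumes that a contact is any pair of residues on nearest-neighbor lattice sites that are not adjacent along the chain, which is the usual convention for lattice models but is not spelled out in this section; the function and variable names are hypothetical, and energies are in units of ɛ.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]

# Contact energies from Table 1, in units of eps.
CONTACT_ENERGY: Dict[str, Dict[str, int]] = {
    "H": {"H": -4, "P": -2, "A": -1, "B": -1},
    "P": {"H": -2, "P": -3, "A": -2, "B": -2},
    "A": {"H": -1, "P": -2, "A": 0, "B": -5},
    "B": {"H": -1, "P": -2, "A": -5, "B": 0},
}


def chain_energy(sequence: str, coords: List[Coord]) -> int:
    """Sum the Table 1 energies over all non-bonded nearest-neighbor pairs."""
    if len(sequence) != len(coords):
        raise ValueError("sequence and coordinates must have equal length")
    index_at = {site: i for i, site in enumerate(coords)}
    if len(index_at) != len(coords):
        raise ValueError("chain overlaps itself")
    energy = 0
    for i, (x, y) in enumerate(coords):
        # Look only right and up so that each lattice edge is counted once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbor)
            if j is not None and abs(i - j) > 1:
                energy += CONTACT_ENERGY[sequence[i]][sequence[j]]
    return energy


# Hypothetical 4-residue chain wrapped around a unit square:
# residues 1 and 4 form the only non-bonded contact.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
```

For the toy square, the sequence HPPH scores −4 (one H–H contact) and AHHB scores −5 (one A–B contact). The same sum, extended to pairs of residues on different chains, gives the total energy of a multichain assembly.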
We also examined the relationship between the R2 dimer and its ground-state assembly counterpart (shown in Fig. 2b,c). These two dimeric ground-state configurations for the model protein sequence readily interconvert in our two-chain simulations. The occupancy of both conformations after conversion is ∼15% at (ɛ/T)_{f=0.5} = 0.99. All trajectories that encounter one ground-state assembly also find the other. This interconvertibility is lost once the R conformation is propagated to a third chain to form an R3 assembly (see below).
Conversion efficiency and conversion time for dimeric refolding reaction (2N → R2)
The conversion efficiency (i.e., the proportion of successful refoldings) and the mean first conversion time for the refolding of two chains were monitored for the model protein at two values of (ɛ/T) (Table 2). At (ɛ/T)_{f=0.2}, there is ready formation of the minimal R-state dimer. Under these more denaturing conditions, initial R-state dimer formation is ∼15 times slower than the monomeric folding process to the N state. Indeed, the simulations reach equilibrium, with the R state reverting and re-converting 1,022 times over the course of the 50 trajectories of 10^10 moves each. At (ɛ/T)_{f=0.5}, where the ΔGNfold for the normal folding process to the N state is ∼0.0, there is no such reversion to the initial 2N state. At this higher interaction strength, only ∼58% (33 of 57) of the simulations lead to R-state dimer formation (Table 2). From those successful simulations, we can deduce that the mean conversion time is at least ∼7.5 × 10^8 attempted moves per chain, only moderately slower (∼10-fold) than folding to the N state.
Table 2.
Mean first conversion times for the three processes studied for the model protein sequence
| Process | Interaction strength (ɛ/T) | Number of 10^10-move runs that succeed | Mean conversion time (in attempted moves per chain) | 
| Spontaneous two-chain | 0.71 | 50/50 | 2.8 × 10^7 | 
| Spontaneous three-chain | 0.71 | 46/46 | 7.7 × 10^8 (4.5 × 10^8 to R2 dimer) | 
| Templated three-chain | 0.71 | 50/50 | 1.4 × 10^8 | 
| Spontaneous two-chain | 0.99 | 33/57 | >7.5 × 10^8 | 
| Spontaneous three-chain | 0.99 | 2/42 (9/42 reach R2 dimer) | >2.5 × 10^9 (>4.8 × 10^8 to R2 dimer) | 
| Templated three-chain | 0.99 | 17/42 | >3.7 × 10^8 | 
| Folding at infinite dilution | 0.71 | 50/50 | 2.0 × 10^6 | 
| Folding at infinite dilution | 0.99 | 50/50 | 7.7 × 10^7 | 
An effective equilibrium constant and the change in free energy on conversion for the process 2N → R2 were calculated from the simulations at (ɛ/T)_{f=0.2}. From a simple ratio of the mean reversion time to the mean conversion time for the 1,022 reversions (R2 → 2N) and re-conversions (2N → R2) observed, we can calculate ΔG(2N → R2) = −0.13 at (ɛ/T)_{f=0.2} = 0.71. This favors the R2 dimer assembly under the simulation conditions.
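The free-energy estimate above follows from treating the ratio of mean times as an effective equilibrium constant. The sketch below makes that step explicit; it assumes that ΔG is reported in units of T and that K is the mean reversion time divided by the mean conversion time. The individual mean times are not given in this section, so the example only recovers the ratio implied by the reported ΔG. The function names are hypothetical.

```python
import math


def delta_g_from_times(mean_conversion_time: float, mean_reversion_time: float) -> float:
    """Free-energy change for 2N -> R2 in units of T (assumed convention).

    K = (rate of conversion) / (rate of reversion)
      = mean_reversion_time / mean_conversion_time, and dG = -ln K.
    """
    k_eq = mean_reversion_time / mean_conversion_time
    return -math.log(k_eq)


def implied_time_ratio(delta_g: float) -> float:
    """Ratio of mean reversion time to mean conversion time implied by dG."""
    return math.exp(-delta_g)


ratio = implied_time_ratio(-0.13)  # about 1.14
```

Under these assumptions, ΔG = −0.13 corresponds to the R2 dimer persisting only about 14% longer, on average, than the time taken to form it, which is consistent with the frequent reversion and re-conversion observed at this interaction strength.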
Propagation of the R conformation
Does the alternative low-energy conformation propagate? To address this question for the model protein, we performed a set of simulations for three chains, with two chains assembled as an R2 dimer initially and a third chain in the N conformation in contact with them (Table 2). These are templated simulations. As a comparison, we also performed simulations that started with three chains in the N state (spontaneous simulations; Table 2). An R2 dimer converts a third chain from the N state to the same R conformation with complete fidelity under the more denaturing conditions at (ɛ/T)_{f=0.2} (50 of 50 simulations). This templated conversion is about five times slower than (2N → R2) dimer formation but is about five times faster than the corresponding spontaneous process starting from 3N (Table 2). The effects of excluded volume or excessive molecular crowding are evident in two ways: First, the rate to dimer formation in the spontaneous three-chain simulations (starting from 3N) is ∼15 times slower with a third chain initially in direct contact. Second, the (3N → R3) process is not the sum of the (2N → R2) and (N + R2 → R3) processes at (ɛ/T)_{f=0.2} (Table 2).
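The rate comparisons in this paragraph can be checked directly against the mean conversion times in Table 2 for ɛ/T = 0.71. The short script below is only an arithmetic check; the dictionary keys are labels chosen here for illustration.

```python
# Mean conversion times at eps/T = 0.71 from Table 2 (attempted moves per chain).
times = {
    "folding_to_N": 2.0e6,
    "2N_to_R2": 2.8e7,
    "3N_to_R2": 4.5e8,   # dimer formation within the spontaneous three-chain runs
    "3N_to_R3": 7.7e8,
    "N_plus_R2_to_R3": 1.4e8,
}

ratios = {
    # Dimer formation versus monomeric folding (text: about 15 times slower).
    "dimer_vs_folding": times["2N_to_R2"] / times["folding_to_N"],
    # Templated conversion versus dimer formation (text: about 5 times slower).
    "templated_vs_dimer": times["N_plus_R2_to_R3"] / times["2N_to_R2"],
    # Spontaneous three-chain versus templated (text: about 5 times slower).
    "spontaneous_vs_templated": times["3N_to_R3"] / times["N_plus_R2_to_R3"],
    # Dimer formation with a third chain present (text: about 15 times slower).
    "crowded_dimer_vs_dimer": times["3N_to_R2"] / times["2N_to_R2"],
}

# If the steps were independent, 3N -> R3 would take roughly the sum of
# the 2N -> R2 and N + R2 -> R3 times.
additive_estimate = times["2N_to_R2"] + times["N_plus_R2_to_R3"]
excess_factor = times["3N_to_R3"] / additive_estimate
```

The ratios come out at roughly 14, 5, 5.5, and 16, in line with the approximate factors quoted in the text, and the observed 3N → R3 time is about 4.6 times the additive estimate of 1.68 × 10^8.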
We found that the existence of a two-faced mode of packing in the ground state as a dimer is sufficient to produce such propagation at a stronger interaction strength where the ΔGNfold of normal N-state folding is marginal (∼0.0). At (ɛ/T)_{f=0.5}, coopting a third chain to the R conformation has comparable efficiency and rate of refolding to initial R-dimer formation (2N → R2). The corresponding spontaneous process is rare (2 of 42 simulations), although 9 of 42 of these trajectories reach R2 formation (Table 2). None of the three-chain simulations revert to the original normal native conditions at the stronger interaction strength, (ɛ/T)_{f=0.5}.
This process of propagation to a third chain is autocatalytic in a simple sense (i.e., the conversion time to a stable R conformation is infinite in the absence of the R2 template). Also, the corresponding spontaneous refolding process for three chains is slower at both ɛ/T values considered. From our limited simulations here at two ɛ/T values, it seems that lower ɛ/T values (corresponding to more denaturing conditions) make the R2 dimer formation process and the propagation process to a third chain faster. This is consistent with general observations for a wide variety of amyloidogenesis mechanisms (e.g., lysozyme-based amyloidogenesis [Booth et al. 1997]).
Sequence mutant studies
We performed simulations to test whether four single-site mutant sequences for the model protein that do not have the R2 dimer in their dimer ground state can propagate the R conformation to a third chain (Table 3). The residue positions at which these sequences differ from the original sequence are indicated in the table. Twenty propagation simulations (process 2; N + R2 → R3) were performed for each sequence at (ɛ/T) = 0.99. Three of the mutants do not propagate, and the fourth mutant propagates inefficiently (Table 3). The two propagating sequences (the original sequence and mutant 2) both have the R3 assembly in their three-chain ground state. However, what distinguishes the inefficient propagator is the lack of the R2 dimer in its two-chain ground state. These results suggest that the R2 dimer should be in the dimeric ground state to enable efficient propagation of the R conformation to a third chain.
Table 3.
Sequence mutants for the model protein that do not have the R2 assembly in the dimer ground state
| Sequence name | Sequence^a | Is the R2 dimer in the two-chain ground state? | Is the R3 assembly in the three-chain ground state? | Is there propagation N + R2 → R3? | 
| Original sequence | AHHABPBHHBHHABHH | Yes | Yes | Yes (17/42) | 
| Mutant 0 | AHHABBBHHBHHABHH | No | No | No (0/20) | 
| Mutant 1 | AHHABPBHHBHHBBHH | No | No | No (0/20) | 
| Mutant 2 | AHHABPBHPBHHABHH | No | Yes | Yes (4/20) | 
| Mutant 3 | AHHABPBHHPHHABHH | No | No | No (0/20) | 
^a Each mutant differs from the original sequence at a single residue position: 6 (mutant 0), 13 (mutant 1), 9 (mutant 2), and 10 (mutant 3).
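The sites at which each mutant in Table 3 differs from the original sequence can be recovered by a position-by-position comparison. The helper below is a small illustrative sketch (the function name is hypothetical); positions are numbered from 1.

```python
from typing import Dict, List, Tuple

ORIGINAL = "AHHABPBHHBHHABHH"
MUTANTS: Dict[str, str] = {
    "Mutant 0": "AHHABBBHHBHHABHH",
    "Mutant 1": "AHHABPBHHBHHBBHH",
    "Mutant 2": "AHHABPBHPBHHABHH",
    "Mutant 3": "AHHABPBHHPHHABHH",
}


def differing_positions(reference: str, variant: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference residue, variant residue) triples."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (i + 1, ref, var)
        for i, (ref, var) in enumerate(zip(reference, variant))
        if ref != var
    ]


substitutions = {name: differing_positions(ORIGINAL, seq) for name, seq in MUTANTS.items()}
```

Each mutant carries exactly one substitution: P6B (mutant 0), A13B (mutant 1), H9P (mutant 2), and B10P (mutant 3).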
A domain-swapped variant in the simulations
Examination of the fluctuation in energy over the course of the trajectories at the stronger interaction strength (ɛ/T)_{f=0.5} reveals an example of domain swapping. In domain swapping, a segment of one protein chain is replaced by the corresponding segment of a second protein chain (Bennett et al. 1995). This swapping may be either symmetric (e.g., as recently described for nitric oxide synthase [Crane et al. 1999]) or serial (as has been suggested for protein aggregation [Bennett et al. 1995]). Domain swapping has also been suggested as a mechanism for the distinction of prion strain conformations (Cohen and Prusiner 1998).
Here, serial domain swapping occurs between multiple chains in an interlocking manner (Fig. 2e). The R-state assembly for the model protein sequence has internal energy E = −87 for two chains and E = −147 for three chains. A serially domain-swapped variant of the R state arises for the same sequence that has E = −86 and E = −146 for two and three chains, respectively (Fig. 2e; denoted R2DS and R3DS, respectively). This configuration has the same interface contacts (eight in number) as the R2 dimer, plus an additional interchain contact that corresponds to an intrachain contact in R2. This gives a total of nine interface contacts, the highest number for a dimeric assembly in the simulations here. (It is theoretically possible for two chains to have up to 16 interfacial contacts between each other. However, these high-energy configurations are not sampled under our effectively high-concentration conditions and are energetically much less stable.) For two and three chains, the R-conformation assembly and its domain-swapped counterpart interconvert readily at (ɛ/T)_{f=0.2}. For two chains at (ɛ/T)_{f=0.5}, every simulation that finds the ground-state dimer conformations also finds the R2DS configuration. Intriguingly, R2DS is almost always found via the other (nonpropagating) ground-state dimer assembly (95% of trajectories). However, interconversion between the two forms is not observed for three chains at the higher interaction strength, (ɛ/T)_{f=0.5}, over the course of any trajectory.
This domain swapping in the simulations arises logically as a theoretical analogy of the prion-strain phenomenon (see Discussion).
Limited simulations for the random sequence
To verify that the basic premise of our investigation is not peculiar to the model protein sequence chosen, we performed a smaller number of simulations for a random sequence (for processes [1–3] listed above). Simulations were performed at (ɛ/T)_{f=0.10} = 0.99 for the random sequence (this is (ɛ/T)_{f=0.50} for the model protein sequence). We feel it is more appropriate to compare the sequences for this value of ɛ/T, as at this ɛ/T the better-designed model protein is marginally stable (as most real proteins are marginally stable, this would reflect a dominant environment in the cell). With the random sequence, R2 dimer formation is observed (24 of 25 simulations), as well as propagation to a third chain to produce an R3 assembly (15 of 22 simulations). Conversion thus occurs more readily for this random sequence than for the model protein sequence. Less-stable or destabilized sequences for many amyloidogenic proteins are more susceptible to amyloid formation (e.g., Booth et al. 1997) and are also predicted to be intrinsically more likely to encrypt an alternative multimeric ground-state conformation (Harrison et al. 1999). This result suggests that they are also more susceptible kinetically. Simulation of (re)folding to the R state is not feasible at an interaction strength at which this random sequence folds stably ((ɛ/T)_{f=0.50} = 1.79), owing to time constraints.
Two equilibrium scenarios at two extremes of concentration
We examined the behavior of the fractional population of important conformations for the two extremes of concentration in the simulations: infinite dilution and high effective concentration of chains for the model protein sequence. The two sets of population curves indicate two very distinct equilibrium scenarios (Fig. 5). The fractional population curves of the N-state and R-state conformations for protein folding at infinite dilution show that the R-state conformation is rare at equilibrium (Fig. 5a,b) at any value of T/ɛ (f < 0.0016; maximum f at T/ɛ = 1.63). For our two-chain folding simulations, the equilibrium probability for the R2 dimer is <0.5 because it shares the ground state with another two-chain assembly (Fig. 5c). The contribution of the domain-swapped variant of R2 to the overall R-state probability is small. The behavior of the population probability of any N dimer is also illustrated in Figure 5c. Interestingly, the 2N state becomes more favored than the R state at low interaction strength, (T/ɛ) > 1.54, indicating that the 2N state has a higher degree of configurational diversity. However, under such denaturing conditions, both states are disfavored compared with numerous more unfolded configurations.
Fig. 5.
The fractional population f of conformations/configurations versus (T/ɛ) and mean energy versus (T/ɛ). The densities of states used in calculating the fractional populations for specific conformations were derived from the simulations at (ɛ/T) = 0.71. (a) The fractional population f of the N-state conformation for protein folding at infinite dilution. (b) The fractional population f of the R conformation (single chain) for protein folding at infinite dilution. (c) The fractional population f of the 2N and R2 states for protein refolding at high concentration. (d) The mean energy for the kinetics simulations versus the ideal equilibrium values. A set of 25 simulations of 10¹⁰ attempted moves was performed for each kinetic data point starting from the 2N state. For the kinetic curve, the two chains sample various assemblies of the normal native N state, interconverting between them more slowly the less denaturing the conditions are.
An analogy with glassy behavior
The dimer system does not actually reach a demonstrable equilibrium at (T/ɛ) < 0.99 for the model protein sequence between the conformational space near the 2N state and the conformational space around the R2 state. This is demonstrated by the curve in Figure 5d. Here, the mean energy for simulations at a range of T/ɛ values is compared to the extrapolated values for the system at equilibrium. This curve bears some analogy to a hallmark feature discussed by Kauzmann (1948) for the enthalpy difference between a glassy state nN and a crystalline alternative state Rn. In analogy to Kauzmann's scenario of decreasing temperature, the Rn state is more stable but is increasingly unlikely to be reached the more native the conditions (i.e., higher interaction strength, less denaturing conditions). The two curves in Figure 5d start to deviate from each other below the point at which the N state would be marginally stable at infinite dilution (f ∼ 0.5).
A dimeric free-energy surface
We constructed a free-energy landscape or surface from trajectories for two chains at equilibrium (see Materials and Methods). The number of R-state interface contacts (CRinter) between the two chains in a dimer assembly, the most appropriate indicator of the progress of the refolding reaction, was used as one axis of the free-energy surface. The total number of contacts (Ctotal) is used as the orthogonal reaction coordinate to construct a contour plot (Fig. 6a; (ɛ/T)_{f=0.5} = 0.99). Other surface coordinates are possible, but we have used the total number of contacts as this has previously been used by other workers (e.g., Dinner et al. 2000) and as monomeric folding reactions almost always finally progress toward more compact assemblies (Chan and Dill 1994, Chan and Dill 1998). Surfaces like this one that are based on thermodynamic reaction coordinates should not be overinterpreted kinetically (Chan and Dill 1998; Dinner and Karplus 1999). Nonetheless, they are useful for elucidating thermodynamic relationships among different states and suggest viable kinetic interpretation.
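The two reaction coordinates described above can be illustrated with a short sketch. It is not the authors' code: the function names, the data layout (each chain as a list of lattice coordinates), and the `reference_interface` set standing in for the R2 interface contacts are all assumptions made for illustration. The sketch also ignores the relabeling symmetry of the two identical chains.

```python
def contacts(chains, min_separation=3):
    """Return all contacts as sorted pairs ((chain_a, res_i), (chain_b, res_j)).

    Two residues are in contact when they sit on adjacent lattice sites and
    are either in different chains or at least `min_separation` apart in
    sequence (i.e., more than two residue placings distant).
    """
    site = {}
    for c, coords in enumerate(chains):
        for i, xy in enumerate(coords):
            site[xy] = (c, i)
    # Every residue must occupy its own lattice site (excluded volume).
    assert len(site) == sum(len(coords) for coords in chains)

    found = set()
    for (x, y), (c, i) in site.items():
        # Look only right and up so each adjacent pair is visited once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            if neighbour in site:
                d, j = site[neighbour]
                if c != d or abs(i - j) >= min_separation:
                    found.add(tuple(sorted(((c, i), (d, j)))))
    return found


def reaction_coordinates(chains, reference_interface):
    """Return (C_R_inter, C_total) for an assembly of chains.

    `reference_interface` is a hypothetical set of interchain contacts that
    defines the R-state interface; it must use the same pair format as
    `contacts`.
    """
    all_contacts = contacts(chains)
    interchain = {pair for pair in all_contacts if pair[0][0] != pair[1][0]}
    return len(interchain & reference_interface), len(all_contacts)


# Toy example: two straight 4-residue chains lying side by side.
toy_chains = [
    [(0, 0), (1, 0), (2, 0), (3, 0)],
    [(0, 1), (1, 1), (2, 1), (3, 1)],
]
toy_reference = {((0, 0), (1, 0)), ((0, 3), (1, 3)), ((0, 1), (1, 2))}
toy_coordinates = reaction_coordinates(toy_chains, toy_reference)
```

In the toy example, two of the three reference contacts are present and the assembly has four contacts in total, so the pair of coordinates is (2, 4).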
Fig. 6.
Free-energy surface and profile. The densities of states used in calculating this surface and profile were derived from the simulations at (ɛ/T) = 0.71. (a) A two-dimensional free-energy surface (see text for details). Highest free energies are indicated by the color red, lowest free energies by green. The labels (A–D) on the surface indicate the following: A is the lowest-energy 2N dimer (2Nlowest); B is a low free-energy cluster of structures accessible by a fluctuation of the other nonpropagating ground-state dimer located at C; C is the low point on the free-energy surface for the other nonpropagating ground-state dimer configuration; D is the low point on the free-energy surface populated by R2 and its domain-swapped variant R2DS. (b) Profile of free energy and mean energy as a function of Cinter. The entropy contribution is given by the difference between these two curves.
The key low-lying features of the surface are labeled (Fig. 6a). The slope of the surface becomes steeper as CRinter increases (Fig. 6a). This indicates that there is some entropic contribution to the barrier to R2 conversion that arises from fewer extended assemblies of chains in the vicinity of the R2 state than for the 2N state. However, at low denaturant ([ɛ/T] ≳ 0.99), the profile is dominated by a small number of specific assemblies along the CRinter coordinate (Fig. 6a,b), and one sees the same low points for both the mean energy and the free energy of the system (Fig. 6b). Where CRinter = 0, dimeric assemblies of the N state dominate where the total number of contacts is high. The lowest free-energy assembly of the N state has CRinter = 0. Assemblies of 2N have at most two such contacts. Before conversion to R2, the chains are restricted to sampling various different 2N assemblies and fluctuations of them, not progressing beyond CRinter = 2. The alternative ground-state dimer (with E = −87) is close to the R2 dimer at CRinter = 6. These two assemblies are observed to interconvert readily (see above). The low point at CRinter = 5 is caused by a fluctuation of the nonpropagating ground-state dimer assembly (with CRinter = 6) and an ensemble of conformations with higher internal energy (E = −85). The domain-swapped variant of the R state contributes to the free-energy low point at CRinter = 8. With decreasing ɛ/T (more denaturing conditions), these specific free-energy wells become less dominant. This is because less contribution to the free energy comes from the enthalpic interaction of two chains (e.g., at [ɛ/T] = 0.25, the mean interaction enthalpy between two chains is −3.0 [±8.7]).
Figure 6b indicates the trend in free energy and mean energy as a function of CRinter at a high ɛ/T value (= 1.21). This profile corresponds to one axis of the free-energy surface. It further demonstrates the domination of specific dimer assemblies and also indicates that the conversion process cannot simply be explained by a drive toward greater compactness. At ΔGNfold ∼ 0.0 for the present system for the model protein sequence, the entropic component (configurational and conformational) of the free energy does not significantly contribute to the total free energy along this coordinate. Under such conditions that excessively favor the native state N at infinite dilution ([ɛ/T] > 0.99), there is no refolding (from CRinter = 0 to CRinter = 8) over the course of our longest trajectories (10¹⁰ MC moves), indicating that the predominantly enthalpic refolding barriers are too high and too numerous to cross.
Discussion
A propagating β-sheet-rich state
We have discovered a conformational propagation mechanism in a simple lattice model of protein folding. The propagation has some similarities to amyloidogenesis and to prion formation. The lattice model used does not have any specific features of polypeptide geometry and only relies on a rudimentary contact-based energy function. We have not performed any sequence design to derive the alternative propagating conformations. The basic requirement for an efficiently propagating state is that a self-similar mode of two-faced packing must be in the ground state as a dimer (a higher-energy level appears not to be sufficient). These modes of packing (termed R states) have more extended β-sheet-like structure than almost any other sort of dimeric assembly and occur for ∼6% of random sequences. They also allow the stacking of chains end to end. An example of such packing is observed in type 1 pilus assembly in E. coli (Choudhury et al. 1999). Amyloid and prions have extensive β structure (Pan et al. 1993; King et al. 1997; Sunde et al. 1997; Taylor et al. 1999). A propagatable helical assembly is conceivable; e.g., a sequence could have an N state with the contact map ([first contacting residue, second contacting residue]: [1,4], [1,16], [3,6], [4,13], [5,8], [5,12], [9,12], [11,14], [13,16]). It could then have an R-state monomer comprising a single, long helix: (1,4), (3,6), (5,8), and so forth, although we did not find any R states that had helical content (they are presumably rarer). The possibility of such a helical propagating state arising in a larger sample of sequences could be a further topic of investigation for this model. The stacked β-sheet-like R state is thus a lattice analog of the amyloid protofilament, as its stacking occurs only along one axis of lattice space. A dimeric R2 template can convert a third chain from the N state to form an R3 assembly in a catalytic fashion. (There is a rate enhancement in the conversion of a third chain to the R conformation by a dimer template, compared to the corresponding spontaneous process.) As the rate of decay for the R3 assembly is far less than the rate of conversion to it, we expect that the R conformation can further propagate to Rn assemblies, forming amyloid-like filaments.
The only likely propagatable alternative conformations (i.e., that have two binding faces) that we have found in this model are β-sheet rich. This indicates that β-sheet-rich amyloid- and prion-like alternative conformations are intrinsic to encoded structures and that other forms are unlikely. They can arise for a contact-based energy function without any explicit consideration of separate hydrophobic and hydrogen-bonding terms in the energy function. Also, for some amyloidogenesis/prion formation mechanisms, there might be extensive β structure simply because the lowest-energy propagating assemblies accessible for a small number of chains are among the most extended assemblies possible.
Prion-like features and implications
Although our simulations do not reproduce a replicated particle, they nonetheless have a number of key prion-like features. In addition to the β-sheet-like content of the alternative form (that is observed for PrP and the two yeast prions [Pan et al. 1993; King et al. 1997; Taylor et al. 1999]), a minimal alternative conformation of an R2 dimer is observed. This was observed for PrPSc by ionizing radiation studies (Bellinger-Kawahara et al. 1988). In the present simulations, the R monomer is a rare equilibrium unfolding intermediate under dilute conditions. For PrP, although there is some evidence that a mildly protease-resistant β-sheet-containing form is observable at acid pH (Jackson et al. 1999), other evidence suggests that spontaneous conversion to PrPSc is coupled with dimer formation at neutral pH (but not multimer formation of a higher order; Post et al. 1998). Such a coupling of conversion and dimer formation is observed here.
Although a single move to detach a chain from an Rn assembly would be very unlikely in the present simulations, disintegration of an Rn assembly (of greater than three chains) by a combination of moves may arise in simulations of longer duration. As the interaction enthalpy required for conversion of a third chain by a dimer (here 40% of the total individual chain interactions) is too much to enable ready subsequent release of the converted chain, our model predicts that extra factors (such as Protein X in PrP prion propagation; Kaneko et al. 1997) should play a role in facilitating detachment. Assessment of the rates and mechanisms of Rn polymer disintegration is beyond the scope of this article. In a more extensive study, for other model examples not studied here, the interfacial binding energy per additional R unit may be much less than in the present examples and, thus, more likely to give a distribution of Rn polymer sizes (as predicted by Masel et al. 1999). If the binding enthalpy per additional chain were less, the resulting higher rate of detachment for longer simulations under denaturing conditions might produce replication-competent particles. Such a mechanism might be more applicable to the non-PrP prions. (For PrP specifically, PrP prion infectivity is separable from amyloid formation in prion disease [Wille et al. 1996].)
To what extent can an analog of the prion strain phenomenon arise for a protein without the need for accessory factors? Distinct strain conformations of PrPSc are linked to the different prion disease histopathologies and incubation times (Cohen and Prusiner 1998). They have different proteinase-K resistant fragment sizes and antibody binding profiles (Collinge et al. 1996; Telling et al. 1996; Safar et al. 1998). Strain conformations may also occur for the [PSI] yeast prion (Derkatch et al. 1996). There is recent evidence that different modes of metal ion binding are associated with PrP prion strain conformations (Wadsworth et al. 1999). Also, different strains may have different glycosylation patterns (Collinge et al. 1996). Domain swapping has been suggested as a way to obtain different prion strain conformations (Cohen and Prusiner 1998). We have observed a domain-swapped variant (named RnDS above) of the Rn assembly for either two or three chains for the model protein sequence. Its interaction energy is only one dimensionless unit higher than that of the lowest-energy Rn assemblies for two and three chains. Although it is not in the ground state, it remains propagatable because of its relationship with the propagatable Rn assembly. This domain swapping arises logically as a theoretical analogy of the prion strain phenomenon here. Our observation suggests that strain conformations may be a general property of polymeric aggregates, including amyloid. The difference in the interconversion behavior for the two-chain and three-chain simulations (under less denaturing conditions) indicates that the likelihood of interconversion between an Rn assembly and its domain-swapped variant will decrease with increasing value of n, with the domain-swapped form as the minor strain variant needing a larger inoculum of chains to propagate its configuration. For the random sequence, there are no propagatable variants of the Rn assembly.
A number of other features in the present model are notable as amyloid-like. Conversion occurs more readily under more denaturing conditions for the monomer, as for many amyloidogenesis mechanisms (e.g., lysozyme-based amyloidosis; Funahashi et al. 1996; Booth et al. 1997). The propagation process is essentially irreversible once a large enough assembly (here, three chains) is attained. However, there does not appear to be a well-defined monomeric amyloidogenic intermediate before dimer formation in the present simulations, as described for some amyloidogenesis mechanisms (Kelly 1998).
With the present limited interaction scheme, our model does not have any explicit scoring term for hydrogen bonding, although one can define secondary structure in the conformations of the heteropolymers (see above and Chan and Dill 1991; Li et al. 1996). Our results may indicate that hydrogen bonding per se is not the underlying factor for an amyloid state for heteropolymers. In some observed amyloid fibrils, experimental data indicate that there is substantial stabilization of the fibril from hydrophobic interactions. In transthyretin fibrils, for example, a model is implied by high-angle X-ray diffraction data that has four β-sheets packed against each other with hydrophobic interactions orthogonal to the hydrogen bonding (Blake and Serpell 1996; Sunde et al. 1997). Such a fibril model comprising four to six β-sheets also agrees with X-ray data for five other amyloids (Sunde et al. 1997).
The free-energy surface for dimer refolding to a propagatable state
The free-energy surface for dimer formation was characterized under conditions of molecular crowding at high effective concentration. It indicates a split (re)folding landscape leading at either extreme to an alternative multimeric minimum (termed the R state) or assemblies of the N state. Starting from 2N, the chains sample various different assemblies of the N state and fluctuations of them before conversion to the R2 configuration occurs (they are restricted to a small region of the free-energy surface). Refolding from the lowest-energy assembly of the N state (2Nlowest) to the R2 state usually requires transient unfolding and disassembly of the two chains. Refolding to the R2 state is more successful if 2Nlowest is avoided. The 2N state is an apparent kinetically trapped intermediate similar to those predicted for single-chain folding (Bryngelson et al. 1995). This refolding behavior is similar to the folding process observed for smaller HP-model proteins (Chan and Dill 1994).
A key point here is that, at high ɛ/T (less denaturing conditions) where the normal native state is at least marginally stable, the dimeric refolding observed is dominated by specific assemblies (as described in detail above and in Fig. 6). There appears to be some entropic contribution to the barrier for conversion to R2 that arises from the fewer extended assemblies of two chains that are accessible from the R2 configuration. This is discernible from the increased slope of the free-energy surface as CRinter increases. However, the same low points are observed along the CRinter reaction coordinate for the mean energy as for the free energy, indicating domination by 2Nlowest, R2, and a small number of other two-chain assemblies.
Sequence design issues for the model protein
Small proteins fold on a time scale of seconds or less (Fedorov and Baldwin 1999). Many troublesome misfolders are small protein chains or peptides (Kelly 1998). Suppose we assume that our folding time to the N state is physiological; that is, the mean first passage time is equivalent to an in vivo folding timescale of ∼10 sec. Then, at the higher interaction strength (ɛ/T)_{f=0.5} = 0.99, up to ∼58% of the population of molecules could refold to the R state in ∼100 sec, a time scale short enough to be physiologically relevant. This gives a lower bound for the dimeric conversion process. Excessive molecular crowding will tend to lengthen this time scale by an order of magnitude (this is discernible from the spontaneous propagation simulations involving three chains). This nonetheless strongly suggests that for some (but not most) small proteins or peptides that must function under conditions of high concentration (e.g., in GPI-anchored rafts, as in the case of PrP), there must be strong selection pressure against detrimental R-state formation, resulting either in a sequence signal that retards R-state formation or in speedy clearance mechanisms that remove R-state precursors or aggregates.
We have shown that a solution of the normal native nN state for a protein can be considered as a glassy state and the Rn state as the crystalline form, which becomes increasingly difficult to relax to the less denaturing the conditions. This is evident in our model system from the behavior of the mean energy as a function of ɛ/T (Fig. 5d). However, this phenomenon does not suffice to prevent conversion to a propagatable alternative native state at the ɛ/T value where the N state would be marginally stable at infinite dilution or under more denaturing conditions.
One sequence design factor that would retard formation of the R2 state would be a mutation that lifts the R2 dimer out of the two-chain ground state. Simulations on single-site sequence mutants of our model protein sequence suggest that the R2 dimer must be in the dimeric ground state to enable efficient propagation of the R conformation. These mutation experiments indicate a thermodynamic component in amyloidogenic/prion-promoting sequence mutations: Where the mutant sequence is rarer for its encoded protein (and there is thus less evolutionary pressure on designing a sufficient barrier to conversion), the mutation may primarily act by pulling the R2 assembly down into the dimeric ground state. Such details of our simulations will help in designing simulation of propagation with a more detailed model, such as one based on polypeptide geometry, either on- or off-lattice. This mutation mechanism may be relevant to PrP as an additional factor in prion disease promotion, as not all PrP prion-disease-causing mutations have been shown to be destabilizing to the PrPC form (Liemann and Glockshuber 1999). Also, the fact that a single-site mutant can drastically change the possible lowest-energy aggregate morphology in our model parallels an experiment on a single-site mutant of the immunoglobulin light chain in immunoglobulin light chain amyloidosis (Helms and Wetzel 1996).
Another possible sequence design factor that may help to retard the formation of propagatable alternative native states Rn might be to select for stable assemblies of the N state during evolution, in addition to selecting for a stable monomeric N state. However, the fact that a relatively low-energy assembly for the normal native N state exists for the model protein in our study (only +1 dimensionless energy unit relative to the R2 assembly) suggests that this would not be sufficient. In addition, assemblies of the N state that are too stable may also lead to solubility problems. Other design issues relate to the nature of the dimeric ground state. Our examples (both random and model-protein sequences) show that one can have competing configurations in the dimeric ground state yet still have a propagating R state maintained. For the model protein sequence, the fact that one of the chains in this alternative ground-state assembly is in the N conformation shows that this does not suffice to prevent propagation of an R state.
We found that the random sequence, which is much more denatured at (ɛ/T) = 0.99, converts more readily to the R state. Less stable or destabilized sequences for many amyloidogenic proteins are more susceptible to amyloid formation (e.g., lysozyme-based amyloidosis; Funahashi et al. 1996; Booth et al. 1997) and are also predicted to be intrinsically more likely to encrypt an alternative multimeric ground-state conformation (Harrison et al. 1999). This result suggests that they are also more susceptible kinetically.
Materials and methods
Sequences and enumeration of conformations
Sequences are configured on a 2D square lattice, as described previously (Chan and Dill 1991, Chan and Dill 1994; Harrison et al. 1999). They are made from a four-letter alphabet of residue types: {H, P, A, and B}. H and P are hydrophobic and polar, whereas A and B pair best with residue types B and A, respectively. Each residue can make contact with another residue on the lattice that is more than two residue placings distant along the sequence or that is in a different chain. These contacts are given an energetic score in units of ɛ (ɛ > 0; Table 1). This residue alphabet and scoring matrix have been used previously (Kaffe-Abramovich and Unger 1998) and can be considered a simplification of the Miyazawa and Jernigan contact matrix (Miyazawa and Jernigan 1996; Wang and Wang 1999). The total energy (E) for a conformation is simply the sum of its individual contact energies. To find the lowest-energy state for a sequence as a monomer, we enumerate the energies of all possible conformations. Sequences that have a single lowest-energy conformation as a monomer are considered candidates for model proteins for this residue alphabet. We initially generated a random sample of 10,000 16-mer sequences for this four-letter residue alphabet. Experiments on the foldability of random sequences suggest that only a very small fraction of randomly generated sequences can fold to a protein-like globule with a unique conformation (Davidson et al. 1995). A rather large proportion (38.5%) of the random sequences for our four-letter alphabet has a single conformation of lowest energy. Therefore, a further screening is used to select model proteins (Abkevich et al. 1998).
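The contact-energy sum and the exhaustive enumeration described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the scores in `CONTACT_SCORE` are hypothetical placeholders (they are not the values of Table 1), the function names are invented here, and the enumeration removes only rotational symmetry (by fixing the first bond), so mirror-image conformations are counted separately.

```python
# HYPOTHETICAL contact scores in units of epsilon (negative = favourable).
# These are placeholders for illustration, NOT the values of Table 1.
CONTACT_SCORE = {
    frozenset("H"): -2.0,   # H-H contact
    frozenset("AB"): -2.0,  # A-B contact
    frozenset("HA"): -1.0,  # H-A contact
}


def contact_energy(sequence, coords, min_separation=3):
    """Sum the contact scores of one chain on the 2D square lattice."""
    assert len(sequence) == len(coords)
    assert len(set(coords)) == len(coords), "conformation must be self-avoiding"
    position = {xy: i for i, xy in enumerate(coords)}
    energy = 0.0
    for i, (x, y) in enumerate(coords):
        # Look only right and up so each adjacent pair is visited once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = position.get(neighbour)
            if j is not None and abs(i - j) >= min_separation:
                pair = frozenset((sequence[i], sequence[j]))
                energy += CONTACT_SCORE.get(pair, 0.0)
    return energy


def enumerate_conformations(n):
    """Yield every self-avoiding walk of n residues with the first bond along +x."""
    def extend(walk, occupied):
        if len(walk) == n:
            yield tuple(walk)
            return
        x, y = walk[-1]
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                walk.append(nxt)
                occupied.add(nxt)
                yield from extend(walk, occupied)
                walk.pop()
                occupied.remove(nxt)

    if n == 1:
        yield ((0, 0),)
        return
    start = [(0, 0), (1, 0)]
    yield from extend(start, set(start))


def lowest_energy_conformations(sequence):
    """Return (lowest energy, list of conformations that attain it)."""
    best, states = None, []
    for coords in enumerate_conformations(len(sequence)):
        energy = contact_energy(sequence, coords)
        if best is None or energy < best:
            best, states = energy, [coords]
        elif energy == best:
            states.append(coords)
    return best, states
```

For a 16-mer the same loop applies, although a pure-Python enumeration is slow; a practical screen would also treat mirror images as a single conformation before testing for a unique ground state.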
We set out to generate a model protein to study. Out of the initial random sample of sequences, a further selection of 70 random sequences was chosen that have a unique ground-state conformation. From this set of sequences, we picked one native-state conformation randomly and generated a variety of 20 sequences for it using the sequence design procedure of Shakhnovich and coworkers (Abkevich et al. 1998). This procedure may be viewed as mimicking the pressures of natural selection. For each of these sequences, we performed a further set of enumerations to find low-energy homodimeric assemblies. All possible assemblies that comprise a copy of each possible conformation docked against itself were enumerated. We examined the homodimer enumerations to look for modes of packing that give two binding faces per chain (Fig. 1). This type of assembly is termed an R state. We found one sequence that had the mode of packing in Figure 1a in its dimer ground state. This is the model protein sequence in our study. Its sequence is AHHABPBHHBHHABHH. Its encoded structure has an obvious hydrophobic core (Fig. 2a).
In addition, we performed these homodimeric enumerations for the complete set of 70 random sequences. We found that 4 of the 70 (6%) random sequences have the sort of two-faced packing, as indicated in Figure 1a. Other modes of two-faced packing (such as that in Fig. 1b) are conceivable but were not observed in the enumerations and so are not studied here. We also chose one of these four sequences to study (denoted random sequence). Its sequence is BHPPPPHBAPPPPHHA.
Folding simulation protocol
The MC moves considered in the (re)folding dynamics simulations are crankshafts, end-flips, corner-flips, and pivots (or `rigid-body rotations') as described previously (Chan and Dill 1994). Because our goal is to deduce general coarse-grained principles rather than to provide detailed predictions, for simplicity, all moves are assigned equal probability, as in previous studies (Chan and Dill 1994, Chan and Dill 1998). We have designed a multichain kinetic model to mimic physiological molecular crowding, which allows for the physical possibility that even in a crowded environment chains can sometimes be detached while undergoing kinetic changes in close proximity to other chains. In our multichain simulations, chains are initialized in randomly chosen configurations such that each chain has at least one contact with one other chain. In addition, in these multichain simulations, a small fraction (5%) of all attempted moves are translational, whereby the entire chain is rigidly displaced one lattice spacing in one of the four possible directions (as in Gupta et al. 1998). Hence chains can diffuse together or drift apart. To model molecular crowding, after every 1 × 10⁶ attempted MC moves (i.e., on average every 5 × 10⁴ attempted translation moves per chain), every chain is recentered to a region near the origin of the simulation box (where the simulation starts). Chains are recentered en masse with their relative positions maintained if they are in contact, whereas chains that are not in contact are assigned randomly to be in contact with at least one other chain. This recentering algorithm ensures that chains would not drift too far apart and would always have ample opportunities to interact with one another.
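A minimal sketch of the rigid translation move and of en-masse recentering may help make the protocol above concrete. All names here are hypothetical, chains are assumed to be lists of integer lattice coordinates, and the sketch covers only the simplest case: it does not implement the internal moves (crankshafts, end-flips, corner-flips, pivots) or the random reassignment of chains that have lost contact.

```python
import random

DIRECTIONS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def choose_move_type(rng, translation_fraction=0.05):
    """Pick a translation with probability 5%, otherwise an internal move."""
    return "translation" if rng.random() < translation_fraction else "internal"


def try_translation(chains, index, direction):
    """Rigidly shift chain `index` by one lattice spacing.

    Returns the new list of chains, or None if the shifted chain would
    overlap a site occupied by another chain (the move is then rejected).
    """
    dx, dy = direction
    moved = [(x + dx, y + dy) for x, y in chains[index]]
    occupied = {xy for k, chain in enumerate(chains) if k != index for xy in chain}
    if any(xy in occupied for xy in moved):
        return None
    return chains[:index] + [moved] + chains[index + 1:]


def recenter(chains):
    """Shift all chains together so that their centroid lies near the origin.

    Relative positions are preserved, which corresponds to the case in
    which the chains are in contact with one another.
    """
    sites = [xy for chain in chains for xy in chain]
    cx = sum(x for x, _ in sites) // len(sites)
    cy = sum(y for _, y in sites) // len(sites)
    return [[(x - cx, y - cy) for x, y in chain] for chain in chains]


pair = [[(0, 0), (1, 0)], [(0, 1), (1, 1)]]
blocked = try_translation(pair, 1, (0, -1))   # would land on chain 0
apart = try_translation(pair, 1, (0, 1))      # chains drift apart
```

In a full simulation, an accepted translation would still be subject to the same acceptance test as any other move, because it can break or form interchain contacts.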
The standard Metropolis MC criterion (Metropolis et al. 1953) is used to accept or reject moves. The Boltzmann constant is set to unity for simplicity. Fifty simulations are studied for each sequence for each process unless otherwise stated in the text. Folding to the N state at infinite dilution is studied for up to 5 × 10⁸ attempted moves. These simulations are started from a randomly chosen conformation. Time to fold initially from the unfolded state to the N state is termed the mean first passage time. Refolding from the N state is simulated for 10¹⁰ attempted moves. Time to refold initially from the N state to the R state is called the mean first conversion time. Model time is defined as the number of attempted MC moves per chain. All attempted moves contribute to the model time regardless of whether they succeed or fail, as required for the system to equilibrate to a Boltzmann distribution. Our model time is equal to the MC time step in Gupta et al. (1998) times the length of a model chain but differs from the definition of model time of Sali et al. (1994), as they did not include attempted moves forbidden by excluded volume. Simulations are monitored beyond the mean first passage time or mean first conversion time to calculate equilibrium properties. Over the course of all of our simulations, we never find assemblies of lower energy than the R-state assemblies for either two or three chains. The conversion efficiency is simply the proportion of simulations that result in refolding to the Rn assembly of n chains.
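The Metropolis criterion with the Boltzmann constant set to unity can be written compactly in terms of the interaction strength ɛ/T. The sketch below assumes that energy changes are expressed in units of ɛ, as the contact energies are; the function names are illustrative only.

```python
import math
import random


def acceptance_probability(delta_e, eps_over_t):
    """Metropolis acceptance probability for an energy change `delta_e`.

    `delta_e` is in units of epsilon and `eps_over_t` is the interaction
    strength epsilon/T (Boltzmann constant set to 1).
    """
    if delta_e <= 0:
        return 1.0  # downhill and neutral moves are always accepted
    return math.exp(-delta_e * eps_over_t)


def metropolis_accept(delta_e, eps_over_t, rng):
    """Accept or reject one attempted move using the random generator `rng`."""
    return rng.random() < acceptance_probability(delta_e, eps_over_t)
```

Whether a move is accepted or rejected, the attempt still advances the model time, consistent with the definition of model time given above.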
The interaction strength parameter (ɛ/T) and calculation of free energy
Kinetic simulations are performed at different interaction strengths, denoted ɛ/T, where T is temperature and ɛ is the interaction energy unit. The interaction energy ɛ is assumed to be enthalpic. Entropic contributions in our system originate entirely from conformational and configurational freedom (both translational and rotational). Hence, ɛ/T measures the relative importance of enthalpy versus entropy in the free-energy balance of the model system. Interaction strength (ɛ/T) has been the parameter varied in other work previously (Chan and Dill 1994, Chan and Dill 1998). The variation of ɛ/T may be viewed as modeling changes in the interaction energy ɛ or temperature T. However, without considering the effects of temperature dependence of ɛ (which should represent a potential of mean force) and of the rate of attempted MC moves—that is, the MC clock—it is more appropriate to interpret the variation of ɛ/T in this study as representing changes in denaturing condition of the solvent at an essentially constant physiological temperature (Chan and Dill 1998).
We perform MC dynamics simulations chiefly at two ɛ/T values, (ɛ/T)_{f=0.5} = 0.99 and (ɛ/T)_{f=0.2} = 0.71, unless otherwise stated. Here, (ɛ/T)_{f=x} is the interaction strength at which the fractional Boltzmann population f for normal N-state (single-chain) folding at infinite dilution is x. The change in free energy on folding to the N state is denoted ΔGNfold and is derived as in Harrison et al. (1999) with two modifications. First, the α variable in this previous work is now replaced by 1/T. Second, instead of restricting to compact conformations, we now consider all accessible conformations in the determination of free energy.
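The definition of (ɛ/T)_{f=x} can be illustrated numerically. The sketch below assumes that a density of states for the single chain is available as a mapping from energy (in units of ɛ) to the number of conformations at that energy; the toy values in `TOY_LEVELS` are invented for illustration and do not describe either sequence studied here. The interaction strength that gives a target native population is then found by bisection.

```python
import math


def native_fraction(levels, native_energy, eps_over_t, native_degeneracy=1):
    """Fractional Boltzmann population of the native state.

    `levels` maps energy (units of epsilon) to the number of conformations
    with that energy, including the native conformation itself.
    """
    # Energies are shifted by the native energy to keep the exponentials small.
    z = sum(g * math.exp(-(e - native_energy) * eps_over_t)
            for e, g in levels.items())
    return native_degeneracy / z


def interaction_strength_for_fraction(levels, native_energy, target,
                                      lo=0.0, hi=20.0, tol=1e-10):
    """Find eps/T at which the native population equals `target`.

    Bisection is valid because the population rises monotonically with
    eps/T when the native state is the unique lowest-energy level.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if native_fraction(levels, native_energy, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# HYPOTHETICAL density of states, for illustration only.
TOY_LEVELS = {-9: 1, -7: 40, -5: 900, -3: 12000, 0: 500000}
x_half = interaction_strength_for_fraction(TOY_LEVELS, -9, 0.5)
x_fifth = interaction_strength_for_fraction(TOY_LEVELS, -9, 0.2)
```

As expected, a lower target population corresponds to a weaker interaction strength (more denaturing conditions), so `x_fifth` is smaller than `x_half`.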
An effective free-energy surface for two chains at a given ɛ/T is calculated by determining the densities of states for the dimer assemblies using a modification of the weighted histogram analysis method (Boczko and Brooks 1993). The entropy and Boltzmann-average energy as a function of two reaction coordinates are calculated by a method similar to that of Sali et al. (1994).
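Once densities of states are in hand, the free energy, Boltzmann-average energy, and entropy along a reaction coordinate follow from standard statistical mechanics. The sketch below assumes the density of states is already known as a mapping from (coordinate value, energy) to a count; it does not implement the weighted histogram analysis itself, and the function name and data layout are illustrative assumptions. A single coordinate is used for brevity; a pair of coordinates can be used as the key in the same way.

```python
import math
from collections import defaultdict


def free_energy_profile(density, eps_over_t):
    """Return {q: (free_energy, mean_energy, entropy)} along a coordinate q.

    `density` maps (q, E) to the number of states with coordinate value q
    and energy E (in units of epsilon). Free and mean energies are in units
    of epsilon; the entropy is dimensionless (Boltzmann constant set to 1).
    """
    t = 1.0 / eps_over_t
    e_min = min(e for _, e in density)  # shift to avoid large exponentials
    z = defaultdict(float)
    ez = defaultdict(float)
    for (q, e), g in density.items():
        weight = g * math.exp(-(e - e_min) * eps_over_t)
        z[q] += weight
        ez[q] += e * weight

    profile = {}
    for q in z:
        free = e_min - t * math.log(z[q])   # F(q) = -T ln Z(q)
        mean = ez[q] / z[q]                 # Boltzmann-average energy at q
        profile[q] = (free, mean, (mean - free) / t)
    return profile
```

Here the entropy at each coordinate value is recovered from the difference between the mean energy and the free energy, so wells where the two curves coincide are dominated by a few specific low-energy assemblies.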
Recently, two other 2D square lattice approaches have been used to model the kinetics of protein aggregation. Gupta et al. (1998) studied 25–40 HP model chains (20-mers) in a simulation box whose size was determined by the protein concentration. In the model of Istrail et al. (1999), two HP (or HP/solvent) model chains (16-mers) are folded independently, and their aggregability is tested periodically during the folding process. Though we consider at most three chains, our approach is physically more akin to that of Gupta et al. (1998) because our chains can potentially run into one another at every attempted MC move; they do not fold independently for sustained durations as in Istrail et al. (1999). The approach of Gupta et al. (1998) is best suited to modeling in vitro aggregation experiments on protein solutions with given concentrations. However, our recentering algorithm is a rudimentary model of molecular crowding, which can be more complex than a concentrated protein solution because anchored molecules are sometimes involved. There is no explicit input of concentration information in our approach, though a certain effective concentration may correspond to a particular assigned interval between recentering steps at one particular interaction strength (see above). However, because the molecular crowding effects here are modeled kinetically, the relation between the variation of any effective concentration and the variation of the interval between recentering steps (as well as the variation in interaction strength) may be rather complex.
Acknowledgments
P.M.H., F.E.C., and S.B.P. were funded by grants from the National Institutes of Health. The research of H.S.C. is supported partially by Medical Research Council of Canada Grant MT-15323 and a Premier's Research Excellence Award (Ontario). H.S.C. is a Canada Research Chair.
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
Article and publication are at www.proteinscience.org/cgi/doi/10.1110/ps.38701.
References
- Abkevich, V.I., Gutin, A.M., and Shakhnovich, E.I. 1998. Theory of kinetic partitioning in protein folding with possible application to prions. Proteins Struct. Funct. Genet. 31 335–344.
- Bellinger-Kawahara, C.G., Kempner, E., Groth, D., Gabizon, R., and Prusiner, S.B. 1988. Scrapie prion liposomes and rods exhibit target sizes of 55,000 Da. Virology 164 537–541.
- Bennett, M.J., Schlunegger, M.P., and Eisenberg, D. 1995. 3D Domain swapping: A mechanism for oligomer assembly. Protein Sci. 4 2455–2468.
- Bessen, R.A., Kocisko, D.A., Raymond, G.J., Nandan, S., Lansbury, P.T., and Caughey, B. 1995. Non-genetic propagation of strain-specific properties of scrapie prion protein. Nature 375 698–700.
- Blake, C. and Serpell, L. 1996. Synchrotron X-ray studies suggest that the core of the transthyretin amyloid fibril is a continuous β-sheet helix. Structure 4 989–998.
- Boczko, E.M. and Brooks, C.L. 1993. Constant temperature free energy surfaces for physical and chemical processes. J. Phys. Chem. 97 4509–4513.
- Booth, D.R., Sunde, M., Bellotti, V., Robinson, C.V., Hutchinson, W., Fraser, P.E., Hawkins, P.N., Dobson, C.M., Radford, S.E., and Blake, C.C. 1997. Instability, unfolding and aggregation of human lysozyme variants underlying amyloid fibrillogenesis. Nature 385 787–793.
- Bornberg-Bauer, E. and Chan, H.S. 1999. Modeling evolutionary landscapes: Mutational stability, topology and superfunnels in sequence space. Proc. Natl. Acad. Sci. 96 10689–10694.
- Broglia, R.A., Tiana, G., Pasquali, S., Roman, H.E., and Vigezzi, E. 1998. Folding and aggregation of designed proteins. Proc. Natl. Acad. Sci. 95 12930–12933. [Erratum 96: 10943 (1999).]
- Bryngelson, J.D. and Wolynes, P.G. 1987. Spin glasses and the statistical mechanics of protein folding. Proc. Natl. Acad. Sci. 84 7524–7528.
- ———. 1989. Intermediates and barrier crossing in a random energy model (with applications to protein folding). J. Phys. Chem. 93 6902–6915.
- Bryngelson, J.D., Onuchic, J.N., Socci, N.D., and Wolynes, P.G. 1995. Funnels, pathways and the energy landscape of protein folding: A synthesis. Proteins Struct. Funct. Genet. 21 167–195.
- Buchler, N.E.G. and Goldstein, R.A. 1999. Universal correlation between energy gap and foldability for the random energy model and lattice proteins. J. Chem. Phys. 111 6599–6609.
- Chan, H.S. and Dill, K.A. 1991. Sequence space soup of proteins and copolymers. J. Chem. Phys. 95 3775–3787.
- ———. 1994. Transition states and folding dynamics of proteins and heteropolymers. J. Chem. Phys. 100 9238–9257.
- ———. 1998. Protein folding in the landscape perspective: Chevron plots and non-Arrhenius kinetics. Proteins Struct. Funct. Genet. 30 2–33.
- Chernoff, Y.O., Lindquist, S., Ono, B.I., Inge-Vechtomov, S.G., and Liebman, S.W. 1995. Role of the chaperone Hsp104 in propagation of the yeast prion-like factor [psi+]. Science 268 880–884.
- Chiti, F., Webster, P., Taddei, N., Clark, A., and Dobson, C.M. 1999. Designing conditions for in vitro formation of amyloid protofilaments and fibrils. Proc. Natl. Acad. Sci. 96 3590–3594.
- Choudhury, D., Thompson, A., Stojanoff, V., Langermann, S., Pinkner, J., Hultgren, S.J., and Knight, S.D. 1999. X-ray structure of the FimC-FimH chaperone-adhesin complex from uropathogenic E. coli. Science 285 1061–1066.
- Cohen, F.E. and Prusiner, S.B. 1998. Pathologic conformations of prion proteins. Annu. Rev. Biochem. 67 793–819.
- Cohen, F.E., Pan, K.M., Huang, Z., Baldwin, M., Fletterick, R., and Prusiner, S.B. 1994. Structural clues to prion replication. Science 264 530–531.
- Collinge, J., Sidle, K.C., Meads, J., Ironside, J., and Hill, A.F. 1996. Molecular analysis of prion strain variation and the aetiology of new variant CJD. Nature 383 685–690.
- Crane, B., Rosenfeld, R.J., Arvai, A.S., and Ghosh, D.K. 1999. N-terminal domain swapping and metal-ion binding in nitric oxide synthase. EMBO J. 18 6271–6281.
- Davidson, A.R., Lumb, K.J., and Sauer, R.T. 1995. Cooperatively folded proteins in random sequence libraries. Nature Struct. Biol. 2 856–864.
- Derkatch, I.L., Chernoff, Y.O., Kushnirov, V.V., Inge-Vechtomov, S.G., and Liebman, S.W. 1996. Genesis and variability of [PSI] prion factors in S. cerevisiae. Genetics 144 1375–1386.
- Dill, K.A. and Chan, H.S. 1997. From Levinthal to pathways to funnels. Nature Struct. Biol. 4 10–19.
- Dinner, A.R. and Karplus, M. 1998. A metastable state in folding simulations of a protein model. Nature Struct. Biol. 5 236–241.
- ———. 1999. Is protein unfolding the reverse of protein folding? A lattice simulation analysis. J. Mol. Biol. 292 403–419.
- Dinner, A.R., Sali, A., Smith, R., Dobson, C.M., and Karplus, M. 2000. Understanding protein folding via free-energy surfaces from theory and experiment. Trends Biochem. Sci. 25 331–339.
- Fedorov, A.N. and Baldwin, T.O. 1999. Process of biosynthetic folding determines the rapid formation of native structure. J. Mol. Biol. 294 579–586.
- Funahashi, J., Takano, K., Ogasahara, K., and Yamagata, Y. 1996. The structure, stability and folding process of amyloidogenic mutant human lysozyme. J. Biochem. 120 1216–1223.
- Giugliarelli, G., Micheletti, C., Banavar, J.R., and Maritan, A. 2000. Compactness, aggregation and prionlike behavior of protein: A lattice model study. J. Chem. Phys. 113 5072–5077.
- Govindarajan, S. and Goldstein, R.A. 1997. Evolution of model proteins on a foldability landscape. Proteins Struct. Funct. Genet. 29 461–466.
- Gupta, P., Hall, C.K., and Voegler, A.C. 1998. Effect of denaturant and protein concentrations upon protein refolding and aggregation: A simple lattice model. Protein Sci. 7 2642–2652.
- Harper, J.D., Wong, S.S., Lieber, C.M., and Lansbury, P.T. 1997. Observation of metastable A-β amyloid protofibrils by atomic force microscopy. Chem. Biol. 4 119–125.
- Harrison, P.M., Bamborough, P., Daggett, V., Prusiner, S.B., and Cohen, F.E. 1997. The prion folding problem. Curr. Opin. Struct. Biol. 7 53–59.
- Harrison, P.M., Chan, H.S., Prusiner, S.B., and Cohen, F.E. 1999. Thermodynamics of model prions and its implications for the problem of prion protein folding. J. Mol. Biol. 286 593–606.
- Helms, L.R. and Wetzel, R. 1996. Specificity of abnormal assembly in immunoglobulin light-chain deposition and amyloidosis. J. Mol. Biol. 257 77–86.
- Hollander, M. and Wolfe, D.A. 1973. Nonparametric statistical methods. Wiley, New York.
- Honeycutt, J.D. and Thirumalai, D. 1990. Metastability of the folded states of globular proteins. Proc. Natl. Acad. Sci. 87 3526–3529.
- Istrail, S., Schwartz, R., and King, J. 1999. Lattice simulations of aggregation funnels for protein folding. J. Comp. Biol. 6 143–162.
- Jackson, G.S., Hosszu, L.L., Power, A., Hill, A.F., Kenney, J., Saibil, H., Craven, C.J., Waltho, J.P., Clarke, A.R., and Collinge, J. 1999. Reversible conversion of monomeric human prion protein between native and fibrillogenic conformations. Science 283 1935–1937.
- Jimenez, J.L., Guijarro, J.L., Orlova, E., Zurdo, J., and Dobson, C.M. 1999. Cryo-electron microscopy structure of an SH3 amyloid fibril and model of the molecular packing. EMBO J. 18 815–821.
- Kaffe-Abramovich, T. and Unger, R. 1998. A simple model for evolution of proteins towards the global minimum of free energy. Folding Design 3 389–399.
- Kaneko, K., Zulianello, L., Scott, M., Cooper, C.M., Wallace, A., James, T.L., Cohen, F.E., and Prusiner, S.B. 1997. Evidence for protein X binding to a discontinuous epitope on the cellular prion protein during prion propagation. Proc. Natl. Acad. Sci. 94 10069–10074.
- Kauzmann, W. 1948. The nature of the glassy state and the behavior of liquids at low temperatures. Chem. Rev. 43 219–256.
- Kelly, J.W. 1996. Alternative conformations of amyloidogenic proteins govern their behavior. Curr. Opin. Struct. Biol. 6 11–17.
- ———. 1998. The alternative conformations of amyloidogenic proteins and their multi-step assembly pathways. Curr. Opin. Struct. Biol. 8 101–106.
- King, C.Y., Tittmann, P., Gross, H., Gebert, R., and Wuthrich, K. 1997. Prion-inducing domain 2–114 of yeast Sup35 protein transforms in vitro into amyloid-like filaments. Proc. Natl. Acad. Sci. 94 6618–6622.
- Klimov, D.K. and Thirumalai, D. 1996. Factors governing the foldability of proteins. Proteins Struct. Funct. Genet. 26 411–441.
- Lai, Z.H., McCulloch, J., Lashuel, H., and Kelly, J.W. 1997. Guanidine-HCl-induced denaturation and refolding of transthyretin exhibits a marked hysteresis. Biochemistry 36 10230–10239.
- Leopold, P.E., Montal, M., and Onuchic, J.N. 1992. Protein folding funnels: A kinetic approach to the sequence structure relationship. Proc. Natl. Acad. Sci. 89 8721–8725.
- Li, H., Helling, R., Tang, C., and Wingreen, N. 1996. Emergence of preferred structures in a simple model of protein folding. Science 273 666–669.
- Liemann, S. and Glockshuber, R. 1999. Influence of amino acid substitutions related to inherited human prion diseases on the thermodynamic stability of the cellular prion protein. Biochemistry 38 3258–3267.
- Lindquist, S. 1997. Mad cows meet psi-chotic yeast: The expansion of the prion hypothesis. Cell 89 495–498.
- Lomakin, A., Chung, D.S., Benedek, G.B., Kirschner, D.A., and Teplow, D.B. 1996. On the nucleation and growth of amyloid beta protein fibrils: Detection of nuclei and quantification of rate constants. Proc. Natl. Acad. Sci. 93 1125–1129.
- Masel, J., Jansen, V., and Nowak, M. 1999. Quantifying the kinetic parameters of prion replication. Biophys. Chem. 77 139–152.
- Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., and Teller, E. 1953. Equation of state calculations by fast computing machines. J. Chem. Phys. 21 1087–1092.
- Miyazawa, S. and Jernigan, R.L. 1996. Residue–residue potential with a favorable contact term and an unfavorable packing density term for simulation and threading. J. Mol. Biol. 256 623–644.
- Nguyen, J.L., Inouye, H., Baldwin, M.A., Fletterick, R., Cohen, F.E., Prusiner, S.B., and Kirschner, D. 1995. X-ray diffraction of scrapie prion rods and peptides. J. Mol. Biol. 252 412–422.
- Pan, K.M., Baldwin, M., Nguyen, J., Gasset, M., Cohen, F.E., and Prusiner, S.B. 1993. Conversion of α-helices into β-sheets features in the formation of the scrapie prion protein. Proc. Natl. Acad. Sci. 90 10962–10966.
- Pande, V.S. and Rokhsar, D.S. 1999. Folding pathway of a lattice model for proteins. Proc. Natl. Acad. Sci. 96 1273–1278.
- Post, K., Pitschke, M., Schafer, O., Wille, H., Appel, T.R., Kirsch, D., Mehlhorn, I., Serban, H., Prusiner, S.B., and Riesner, D. 1998. Rapid acquisition of β-sheet structure in the prion protein prior to multimer formation. Biol. Chem. 379 1307–1317.
- Prusiner, S.B. 1982. Novel proteinaceous particles cause scrapie. Science 216 136–144.
- Safar, J., Wille, H., Itri, V., Groth, D., Serban, H., Torchia, M., Cohen, F.E., and Prusiner, S.B. 1998. Eight prion strains have PrPSc molecules with different conformations. Nat. Med. 4 1157–1165.
- Sali, A., Shakhnovich, E.I., and Karplus, M. 1994. How does a protein fold? Nature 369 248–251.
- Selkoe, D.J. 1997. Alzheimer's disease: Genotypes, phenotypes and treatments. Science 275 630–631.
- Simons, K. and Ikonen, E. 1997. Functional rafts in cell membranes. Nature 387 569–572.
- Sunde, M., Serpell, L.C., Bartlam, M., Fraser, P.E., Pepys, M.B., and Blake, C.C. 1997. Common core structure of amyloid fibrils by synchrotron X-ray diffraction. J. Mol. Biol. 273 729–739.
- Taylor, K.L., Cheng, N., Williams, R.W., Steven, A.C., and Wickner, R.B. 1999. Prion domain initiation of amyloid formation in vitro from native Ure2p. Science 283 1339–1343.
- Telling, G.C., Parchi, P., DeArmond, S.J., Cortelli, P., Montagna, P., Gabizon, R., Mastrianni, J., Lugaresi, E., Gambetti, P., and Prusiner, S.B. 1996. Evidence for the conformation of the pathologic isoform of the prion protein enciphering and propagating prion diversity. Science 274 2079–2082.
- Thirumalai, D. and Woodson, S.A. 1996. Kinetics of folding of proteins and RNA. Acc. Chem. Res. 29 433–439.
- Wadsworth, J.D., Hill, A.F., Joiner, S., Jackson, G.S., Clarke, A.R., and Collinge, J. 1999. Strain-specific prion-protein conformation determined by metal ions. Nature Cell Biol. 1 55–59.
- Wang, J. and Wang, W. 1999. A computational approach to simplifying the protein folding alphabet. Nature Struct. Biol. 6 1033–1038.
- Wickner, R. 1997. A new prion controls fungal cell fusion incompatibility. Proc. Natl. Acad. Sci. 94 10012–10014.
- Wille, H., Zhang, G.F., Baldwin, M.A., Cohen, F.E., and Prusiner, S.B. 1996. Separation of scrapie prion infectivity from PrP amyloid polymers. J. Mol. Biol. 259 608–621.













