HFSP Journal. 2008 Oct 15;2(6):365–377. doi: 10.2976/1.2991513

Studying the folding of multidomain proteins

Sarah Batey, Adrian A Nickson, Jane Clarke
PMCID: PMC2645590  PMID: 19436439

Abstract

There have been relatively few detailed comprehensive studies of the folding of protein domains (or modules) in the context of their natural covalently linked neighbors. This is despite the fact that a significant proportion of the proteome consists of multidomain proteins. In this review we highlight some key experimental investigations of the folding of multidomain proteins to draw attention to the difficulties that can arise in analyzing such systems. The evidence suggests that interdomain interactions can significantly affect stability, folding, and unfolding rates. However, preliminary studies suggest that folding pathways are unaffected—to this extent domains can be truly considered to be independent folding units. Nonetheless, it is clear that interactions between domains cannot be ignored, in particular when considering the effects of mutations.


Most folding studies have concentrated on the folding of small, globular domains that have been principally, but not exclusively, isolated from much larger, multidomain proteins. Analysis of genomes has demonstrated that a significant proportion of proteins actually contain more than one domain (Apic et al., 2001; Ekman et al., 2005; Gerstein, 1998; Liu and Rost, 2004; Teichmann et al., 1999). In the context of this review the term domain is defined as a “structural, functional, and evolutionary component of proteins, which can often be expressed as a single unit” (Murzin et al., 1995). This definition distinguishes multidomain proteins from proteins such as lysozyme, which have two structural domains, but the individual domains are not stable alone and are not found in other protein contexts [and thus are defined together as a single domain in both the SCOP and Pfam databases (http://scop.mrc-lmb.cam.ac.uk and http://pfam.sanger.ac.uk/)]. Furthermore, this definition does not include the linear repeat proteins made up of small repeating units which cannot be expressed singly (see Kloss et al., 2008; Main et al., 2005, for recent reviews). Approximately 40%–65% of prokaryotic proteins contain multiple domains and the proportion is even higher in eukaryotic proteins (∼65%–80%). Thus, in vivo, most domains fold in the context of a much larger protein, with neighboring domains. The domains in a multidomain protein often have significant interfaces and can either be attached by short, structured linkers or by longer flexible linkers. The linkers themselves may be important for the protein’s function (Gokhale and Khosla, 2000).

There have been few systematic studies of the folding of domains in the context of larger, multidomain systems. Some of the earlier studies have been discussed in depth in a detailed review by Jaenicke (1999), and a more comprehensive review of the folding and evolution of multidomain proteins was published more recently by Han et al. (2007). This work does not aim to be a comprehensive account of all the work that has been done. Instead we aim to illustrate what has been learned about how multidomain protein systems should be analyzed. A few seminal and careful studies have highlighted some general experimental problems in the analysis of multidomain proteins, which may affect the conclusions drawn.

In principle, in a two domain protein system there are three regimes which can occur:

(1) Each domain is entirely independent of the other. In this case the stability of each domain will be unaffected by its neighbor and the folding and unfolding rate constants will be the same whether the domain is alone or in the tandem pair.

(2) The two domains interact in the native state only. In this case the native state of both domains will be stabilized by the interaction in the fully folded protein. If this is the only effect, then the unfolding rate constants of each domain will be slowed but the folding rate constants will be unaffected.

(3) Folding of one domain catalyzes the folding of the second domain. In this case the folding rate constant of the second domain will be affected.

Note that, additionally, non-native interactions between the two domains can have the effect of slowing folding significantly.

DISTINGUISHING SPECIFIC FROM NON-SPECIFIC EFFECTS

(i) Deciding on domain boundaries. This is nontrivial. It has been established for some time that cutting a domain “too short” can lead to a loss in stability of the domain (Hamill et al., 1998; Pfuhl et al., 1997). This arises from the difficulty of establishing precise domain boundaries. Domain boundaries may be determined using sequence alignments, by analysis of proteolytic fragments, by determining regions that crystallize, or by examining crystal or nuclear magnetic resonance (NMR) structures; however, none of these methods is perfect. The best solution is probably by alignment of sequences, although even this may not be successful. In Ig-like domains, for instance, the terminal A and G strands show most variation and are often not used in sequence alignment algorithms, so domain boundaries are hard to define. Structural alignments have to be used in this case.

The 16th alpha helical repeat domain of chicken brain α-spectrin (R16) has been the subject of a number of studies into the effect of domain boundaries on stability, and illustrates the difficulty of deciding where domains start and end.

  • There are different opinions as to which residues in R16 should be “counted” as making up the entire repeat. Pfam (Bateman et al., 2004) defines the repeating unit as 105 residues (with a single one-residue linker between repeats) and gives the boundaries of R16 as residues 1769–1873. MacDonald and co-workers define a 106-residue repeat as 1772–1876 (MacDonald et al., 1994; MacDonald and Pozharski, 2001). In our work we have used the Pfam definition extended at one end to include the N-terminal linking residue (1768–1873).

  • However, MacDonald showed that addition of flanking residues at the N-terminus (but not the C-terminus) increased the stability of R16 significantly (1763–1876) (MacDonald and Pozharski, 2001). Thus, to ensure we were studying a complete, not a truncated domain, we extended the Pfam domain by several residues at both the N- and C-terminus (1764–1876) (Scott et al., 2004a).

  • A comparison of the NMR and crystal structures of R16 complicates the case further. The NMR structure of R16 (residues 1772–1869) shows that there is significant fraying of both the N- and C-terminal helices, so that only 98 residues (1772–1869) are structured (Pascual et al., 1997). However, in the crystal structure of R16 with its neighboring domains, R15 and R17, both the N- and C-terminal helices of R16 are contiguous with the neighboring C- and N-terminal helices respectively (Grum et al., 1999; Kusunoki et al., 2004). This is compelling evidence to suggest that neighboring domain interactions may be important.

Importantly, it has been shown that domains, even when well defined, can “overlap” thermodynamically. That is, some residues can be considered, structurally, to belong to two neighboring domains and they play a role in the stabilization of both. This is well illustrated by considering the case of two fibronectin type III (fnIII) domains from human fibronectin. When first studied, FNfn9 (the 9th fnIII domain of human fibronectin) was shown to be very unstable alone (1 kcal mol−1), but appeared to be significantly stabilized by its neighbor FNfn10 (Spitzfaden et al., 1997). However, when a new construct of FNfn9 was examined, lengthened by just two residues, the stability of FNfn9 was found to be independent of FNfn10 (Steward et al., 2002). It is clear that the two residues at the “end” of FNfn9 and the “start” of FNfn10 belong to both domains [Fig. 1a].

Figure 1. Multidomain proteins discussed in this review.


The domains are colored as follows: N-terminal domain, blue; C-terminal domain, red; Middle domain [(e) and (f)], green. In (a) and (d) the linker region between the N- and C-domains is colored green. (a) Crystal structure of fibronectin type III domains FNfn9 and FNfn10 from human fibronectin (pdb code 1fnf). The two-residue linker that is the end of FNfn9 and the start of FNfn10 is shown in green. This linker is important for the stability of both domains. The domains themselves do not interact detectably. (b) The structure of γB-crystallin (pdb code 1amm). There is a large interface between the two domains which stabilizes the folded domains. (c) Semliki Forest virus protein (SFVP, pdb code 1vcq). The two domains fold sequentially. (d) bsPGK (pdb code 1php). The linking helix (green) was not included in either single domain construct. The data suggest that this linking helix is important for the stability of the C-domain. (e) Crystal structure of the 15th, 16th, and 17th spectrin repeat domains from chicken brain α-spectrin (pdb code 1u4q). The C-helix of R15 forms a continuous helix with the A-helix of R16, while the C-helix of R16 forms a continuous helix with the A-helix of R17. (f) The three domain protein trigger factor (pdb code 1w26). An extension of the N-terminal domain forms important contacts with the C-terminal domain.

A final complication is that artificially shortening a domain could bring a charged N- or C-terminus too close to the folded domain, and this itself could be problematic (Hamill et al., 1998).

Thus, when investigating a domain in the presence and absence of its natural neighbor(s) it is important to ensure that the domain boundaries have been chosen correctly.

(ii) Even where domain boundaries have been chosen with care, nonspecific effects can be important in stabilization of a domain. Interestingly, we have found that entirely non-natural extensions to spectrin domains can significantly alter their stability, but only at one terminus (Randles et al., 2008).

  • An extension of a spectrin domain at the N-terminus by a natural folded neighboring domain stabilizes the protein significantly and no other extension at this terminus has any effect on stability. This is exactly what one might expect. In contrast, the effect of extension at the C-terminus of spectrin domains gives quite different results.

  • Addition of a folded neighbor at the C-terminus adds significant stability, presumably through specific, natural interdomain interactions, again as one might expect.

  • However, any long extension at the C-terminus of spectrin domains increases their stability by ∼1 kcal mol−1. These extensions include unfolded natural neighboring domains, long fragments of the neighboring domains and even folded non-natural neighbors such as the all-β sheet immunoglobulin (Ig) domain titin I27.

Such nonspecific effects need to be taken into account when computing the effects of neighboring domains in multidomain proteins [see Eq. 3 below].

DETERMINING THE EQUILIBRIUM STABILITY OF A MULTIDOMAIN PROTEIN

A careful, detailed analysis of the thermodynamics of multidomain proteins, in particular where determined using calorimetry, has been published earlier (Freire et al., 1992). Here we discuss problems that can arise from the use of equilibrium denaturation experiments, and importantly we consider the effect that a simple extension (by an unfolded neighboring domain) may have on the thermodynamics of the system.

Consider a two domain protein, A-B.

(1) Where the two domains are entirely independent then the stability of a multidomain protein (ΔGTOT) is simply the sum of the stabilities of the constituent domains

$$\Delta G_{\mathrm{TOT}} = \Delta G_{A} + \Delta G_{B}. \qquad (1)$$

(2) Where two domains interact through native state: native state interactions only, then the total stability will be the sum of the individual stabilities, plus the interaction energy between them, ΔGINT

$$\Delta G_{\mathrm{TOT}} = \Delta G_{A} + \Delta G_{B} + \Delta G_{\mathrm{INT}}. \qquad (2)$$

(3) If the domains are also stabilized by an unfolded neighboring domain (ΔGchain), then the total stability of the protein will be

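$$\Delta G_{\mathrm{TOT}} = \Delta G_{A} + \Delta G_{B} + \Delta G_{\mathrm{INT}} + \Delta G_{\mathrm{chain}}^{Ab} + \Delta G_{\mathrm{chain}}^{aB}, \qquad (3)$$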

where superscript Ab denotes the effect of unfolded domain B on the stability of folded domain A and superscript aB denotes the effect of unfolded domain A on the stability of folded domain B.

EQUILIBRIUM EXPERIMENTS, PRACTICAL ASPECTS: UNEXPECTED m-VALUES ARE THE KEY

While the theory of the stability of multidomain proteins is straightforward, even in the simplest of situations the stability of such a protein may be extraordinarily difficult to determine experimentally. When an equilibrium unfolding experiment is performed on a simple, single-domain two-state system, an m-value is determined, which is the dependence of the free energy of unfolding (ΔG) on denaturant concentration. The m-value reflects the change in solvent accessible surface area (SASA) between the denatured state (D) and the folded state (F), and is related to the size of the protein (Myers et al., 1995). The experiment also measures [d]50%, which is the concentration of denaturant at which D and F are of equal energy (ΔG=0) i.e., where D and F are equally populated so that 50% of the molecules are folded and 50% are unfolded. The free energy of unfolding, at 0 M denaturant, is the product of m and [d]50% (Pace, 1986),

$$\Delta G = m \times [d]_{50\%}.$$
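As a minimal sketch of how this relation is used, the following Python snippet assumes that ΔG varies linearly with denaturant concentration, so that ΔG([d]) = m([d]50% − [d]), and converts ΔG into a two-state folded population. The parameter values (m = 2.0 kcal mol−1 M−1, [d]50% = 3.0 M), the temperature of 25 °C, and the function names are hypothetical and chosen only for illustration.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def delta_g(denaturant, m_value, d50):
    """Free energy of unfolding (kcal/mol) at a denaturant concentration (M),
    assuming the linear dependence dG = m * ([d]50% - [d])."""
    return m_value * (d50 - denaturant)


def fraction_folded(denaturant, m_value, d50, temperature=298.15):
    """Population of F in a two-state D <=> F equilibrium."""
    # Equilibrium constant for unfolding, K_U = [D]/[F] = exp(-dG / RT)
    k_unfold = math.exp(-delta_g(denaturant, m_value, d50) / (R_KCAL * temperature))
    return 1.0 / (1.0 + k_unfold)


# Hypothetical single domain (illustrative values, not from the review)
m_value, d50 = 2.0, 3.0
print(delta_g(0.0, m_value, d50))          # stability in water = m * [d]50% -> 6.0
print(fraction_folded(d50, m_value, d50))  # at the midpoint -> 0.5
```

At 0 M denaturant the function returns the product m × [d]50%, and at [d] = [d]50% the two states are equally populated, as described in the text.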

(i) Independent domains: where $\Delta G_{\mathrm{TOT}} = \Delta G_{A} + \Delta G_{B}$

(a) If the two domains have [d]50% values which are sufficiently separated then two transitions will be observed [Fig. 2a]. Individual m, [d]50% and ΔG values can be determined, and compared with the isolated domains, and thus ΔGTOT can be determined accurately.

Figure 2. Four possible equilibrium denaturant profiles for multidomain proteins.


Unfolding transitions of the individual domains are shown in red and blue and the multidomain protein is shown in black. (a) Independent domains with significantly separate [d]50%. Two separate unfolding transitions can be seen in the multidomain protein. (b) Independent domains with similar [d]50%. Only one transition can be observed in the multidomain protein; however, the m-value (the slope) is significantly lower than that of the individual domains. In this case the individual domain transitions can only be determined if the spectroscopic properties are different. (c) Interacting domains with no significantly populated intermediate. There is one observable unfolding transition with an m-value approximately that of the sum of the two individual m-values ($m_{\mathrm{TOT}} \approx m_{A} + m_{B}$). (d) Interacting domains with a populated unfolding intermediate. Only one transition is observed; however, the m-value is significantly lower than the sum of the two individual m-values. (a) and (b) use modeled data, (c) and (d) use actual data from spectrin R1617 (Batey et al., 2005) and R1516 (Batey and Clarke, 2006), respectively.

(b) If the [d]50% values of the two domains are close to each other then a single, apparently cooperative (two-state) transition is observed. The apparent [d]50% will be between the individual [d]50% values but, importantly, the apparent m-value will be lower than the m-values of either of the individual constituent domains [Fig. 2b]. In this case the apparent total free energy of unfolding $\Delta G_{\mathrm{TOT}}^{\mathrm{app}} < \Delta G_{A} + \Delta G_{B}$.

(c) If the two domains have identical stabilities and identical m-values, and behave completely independently then a single transition will be observed, reflecting the individual (identical) stability of the two domains (Arora et al., 2006). This is the case for titin I27, where the equilibrium data for a single domain is identical to that of a chain containing eight copies of the I27 domain (Rounsevell et al., 2005).

(d) Note that if the two domains have different spectroscopic properties it may be possible to determine individual domain stabilities even where the transitions overlap.
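Case (b) can be illustrated numerically. The sketch below models two independent two-state domains with close [d]50% values, averages their unfolding curves (assuming equal spectroscopic amplitudes), and then reads off the apparent [d]50% and apparent m-value that a two-state interpretation would give. The apparent m-value is estimated from the slope at the midpoint (for a true two-state curve this slope equals m/(4RT)) rather than by a least-squares fit. All parameter values, the temperature of 25 °C, and the function names are hypothetical.

```python
import math

RT = 1.987e-3 * 298.15  # kcal/mol at an assumed temperature of 25 degrees C


def fraction_unfolded(d, m_value, d50):
    """Two-state fraction unfolded, with dG = m * ([d]50% - [d])."""
    return 1.0 / (1.0 + math.exp(m_value * (d50 - d) / RT))


def combined_signal(d, domains):
    """Normalized signal of independent domains, assuming equal amplitudes."""
    return sum(fraction_unfolded(d, m, d50) for m, d50 in domains) / len(domains)


def apparent_two_state_parameters(domains, lo=0.0, hi=10.0):
    """Return (apparent [d]50%, apparent m-value) for the combined curve."""
    # Bisection for the concentration at which the combined signal is 0.5
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if combined_signal(mid, domains) < 0.5:
            lo = mid
        else:
            hi = mid
    d50_app = 0.5 * (lo + hi)
    # Central-difference slope at the midpoint; two-state slope = m / (4 RT)
    h = 1e-4
    slope = (combined_signal(d50_app + h, domains)
             - combined_signal(d50_app - h, domains)) / (2.0 * h)
    return d50_app, 4.0 * RT * slope


# Hypothetical domains A and B as (m-value, [d]50%) pairs
domains = [(2.0, 3.0), (2.0, 4.0)]
d50_app, m_app = apparent_two_state_parameters(domains)
dg_app = m_app * d50_app                       # apparent stability
dg_sum = sum(m * d50 for m, d50 in domains)    # true sum of domain stabilities
print(f"apparent [d]50% = {d50_app:.2f} M, apparent m = {m_app:.2f}")
print(f"apparent dG = {dg_app:.1f}, dG_A + dG_B = {dg_sum:.1f} kcal/mol")
```

With these illustrative inputs the apparent midpoint falls between the two individual midpoints, the apparent m-value is lower than that of either domain, and the apparent ΔG is well below ΔGA+ΔGB.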

(ii) Interacting domains: where $\Delta G_{\mathrm{TOT}} \neq \Delta G_{A} + \Delta G_{B}$

(e) Where two domains interact it may still be possible to see two transitions if one domain is significantly less stable than the other. Importantly, while it is possible to use these data to determine the stability of the less stable domain in the presence of its folded neighbor, these data can only report on the stability of the more stable domain in the presence of its unfolded neighbor. Clever mutagenesis (take care! NOT in the region where the two domains interact) may be able to reverse the order of the domain stabilities, thereby allowing the remaining stabilities to be elucidated. Only then can ΔGTOT be determined.

If a single transition only is observed, then there are two possibilities:

(f) If the overall m-value observed is approximately equal to the sum of the m-values of the constituent domains this suggests that the entire protein is behaving as a single cooperative unit at equilibrium (i.e., no intermediate, with one domain folded and the other unfolded, is populated at equilibrium) [Fig. 2c] (Batey et al., 2005). This can be confirmed by reconstructing the equilibrium curve from kinetic data (see below) (Batey and Clarke, 2006). Furthermore, if neither domain is stable in isolation then a true cooperative transition will also be observed. In this case, again, the m-value should reflect the size of the entire protein.

(g) If the equilibrium transition looks cooperative, but the overall m-value observed is significantly lower than the total m-value of the entire protein (i.e., less than mA+mB) this suggests that the protein is not behaving as a single cooperative unit: an intermediate is accumulating. In this case the apparent ΔGapp [=m(apparent)×[d]50%(apparent)] will be lower than the correct ΔGTOT, sometimes by a significant amount. An extreme example is the two-domain protein spectrin R1516. The entire protein unfolds by an apparent two-state transition, with an apparent [d]50% higher than either of the constituent domains, R15 or R16, but with an m-value that is unchanged [Fig. 2d]. This gives an apparent stability for the two-domain protein that is less than the sum of the two domains [8.7 versus (6.8+6.2) kcal mol−1] and is significantly less than the true ΔGTOT, estimated to be ∼17 kcal mol−1 (Batey and Clarke, 2006).

MEASURING THE KINETICS OF MULTIDOMAIN PROTEINS

[To make this review accessible to non-specialist readers we have included a brief guide to kinetic analysis of protein folding including a discussion of chevron plots as supplementary material. See Supplementary Materials (EPAPS).] There are relatively few truly systematic investigations of the folding kinetics of multidomain proteins. These can be complicated by the difficulty of analyzing multiphasic kinetics and the studies nearly always rely on using both single- and double-jump (interrupted refolding/unfolding) stopped-flow kinetics. It is often necessary to analyze both CD and fluorescence amplitudes, and mutation may be necessary to help determine which phase belongs to which domain. In some systems folding and/or unfolding phases may be kinetically silent.

LEARNING FROM CASE STUDIES

The complex nature of the analysis of the folding of multidomain proteins is best illustrated using a few specific examples of careful, detailed studies.

γB-crystallin: a relatively straightforward case

Jaenicke and co-workers carried out some of the first detailed studies on the folding of multidomain proteins using the two-domain protein γB-crystallin. They studied both the two-domain construct and its constituent N- and C-terminal domains. These studies are summarized in detail by Jaenicke (1999). The domains have a large interface and clearly interact in the native state [Fig. 1b]. Equilibrium unfolding is a two-step process, with a populated intermediate (case e, above). The C-terminal domain is only partially folded alone (ΔG∼1 kcal mol−1) but stabilized by ∼4 kcal mol−1 by interaction with the N-terminal domain (ΔGINT=4 kcal mol−1). The equilibrium intermediate (I) is a species with the C-domain unfolded and the N-domain folded. The stability of the N-domain in the wild-type protein is the same as the stability of the isolated N-domain (i.e., the N-domain is not stabilized by unfolded C-domain, $\Delta G_{\mathrm{chain}}^{Nc} = 0$).

In terms of kinetics, two folding and two unfolding phases can be observed, which produce two chevron plots, one representing the D⇄I transition and the other the I⇄F transition. Thus, equilibrium and kinetic data are consistent and allow the folding pathway to be determined

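$$\mathrm{D}(nc) \underset{\text{slow}}{\overset{\text{fast}}{\rightleftharpoons}} \mathrm{I}(Nc) \underset{\text{fast}}{\overset{\text{slow}}{\rightleftharpoons}} \mathrm{F}(NC),$$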

where, as before, upper case N and C refer to folded N- and C-domains, respectively, and lower case n and c to unfolded domains. As we will see, the relative rate constants of the formation and subsequent folding/unfolding of I are important. In both the folding and unfolding reactions presented here a fast phase is followed by a slower phase; hence, all phases can be observed in a single-jump folding or unfolding experiment. Note that the kinetics of the isolated domains were not compared to those in the wild-type protein, and the effect of an unfolded N-domain on the stability ($\Delta G_{\mathrm{chain}}^{nC}$) or folding of the C-domain could not be determined in these experiments.

Semliki Forest virus protein: silent kinetic phases

The two-domain protein from Semliki Forest virus (SFVP) [Fig. 1c] is known to fold cotranslationally (Nicola et al., 1999), and Kiefhaber and co-workers have shown that the complete protein folds very rapidly in vitro (τ=50 ms) (Sanchez et al., 2004). SFVP has three Trp residues, which all reside in the C-terminal domain. Equilibrium experiments show an apparent two-state transition, with complete agreement between CD and fluorescence and where the m-value is consistent with the m-value expected for a protein of this size, i.e., at equilibrium the protein appears to fold and unfold as a single cooperative unit.

In the kinetics several refolding phases are observed, but there is only one unfolding phase. Using double-jump (interrupted unfolding) experiments, Kiefhaber and co-workers demonstrated unequivocally that the slower, minor refolding phases result from proline isomerization. More interestingly, there is a lag in the formation of fully folded protein, and roll-over in the refolding arm of the chevron shows evidence for formation of an early intermediate, which does not involve a change in tryptophan fluorescence. This intermediate was therefore presumed to be a species with a folded N-terminal domain, since folding of this domain is not associated with any change in fluorescence. Thus, the folding pathway of SFVP is

$$\mathrm{D}(nc) \rightarrow \mathrm{I}(Nc) \rightarrow \mathrm{F}(NC).$$

Critically important for the discussion here, the authors of this manuscript demonstrate that where only one true folding phase is observed, it is not possible to determine which step is rate determining (formation of I or formation of F).
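One way to see why the rate-determining step cannot be assigned is to consider a simplified, irreversible scheme D → I → F, a reasonable approximation only under strongly native conditions. The sketch below uses the standard analytical solution for consecutive first-order reactions; it is an illustration under that assumption, not the analysis used by the authors, and the rate constants and function name are hypothetical.

```python
import math


def folded_fraction(t, k1, k2):
    """Fraction of F at time t for irreversible D -> I -> F (k1 != k2),
    starting from 100% D at t = 0."""
    return 1.0 - (k2 * math.exp(-k1 * t) - k1 * math.exp(-k2 * t)) / (k2 - k1)


# Hypothetical rate constants in s^-1 (not taken from the SFVP study)
k_fast, k_slow = 200.0, 20.0

for t in (0.0, 0.001, 0.005, 0.05, 0.5):
    first_fast = folded_fraction(t, k_fast, k_slow)  # D -> I fast, I -> F slow
    first_slow = folded_fraction(t, k_slow, k_fast)  # D -> I slow, I -> F fast
    single_exp = 1.0 - math.exp(-k_slow * t)         # no intermediate, for comparison
    print(f"t = {t:5.3f} s  F = {first_fast:.4f}  "
          f"swapped = {first_slow:.4f}  single exponential = {single_exp:.4f}")
```

The expression for F is symmetric in the two rate constants, so the time course of formation of F (including the initial lag relative to a single exponential) is identical whichever step is slow. A probe that reports only on formation of F therefore cannot distinguish the two possibilities.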

Unfortunately, in these studies of SFVP the two domains were not studied independently, and so it is not clear whether the two domains interact to stabilize one another and whether the folding of the N-domain catalyzes the folding of the C-domain. However, the authors suggest that “rapid structure formation in the N-terminal domain might provide a template for efficient and rapid formation of the complete three-dimensional structure.”

bsPGK: independent folding; folding and unfolding intermediates are not the same species

There are a number of forms of the two-domain protein phosphoglycerate kinase (PGK), which differ in their folding behavior. Data for the yeast form suggest that the two domains interact despite the small domain interface (Osvath et al., 2005), whereas in the thermostable form from Bacillus stearothermophilus (bsPGK) the two domains appear to behave as independent units (Parker et al., 1996). Clarke and co-workers studied the folding of bsPGK in detail, in a beautifully executed study (Parker et al., 1996). A pseudowild-type protein (bsPGK) with a single buried Trp in the C-domain was used to simplify the analysis, and was compared to the two isolated domains. In this investigation the interdomain bridging helix was not included in either isolated domain [Fig. 1d]. At equilibrium a single two-state transition is observed when using Trp fluorescence as a probe, reflecting the unfolding of the C-domain. By CD a biphasic transition is seen reflecting the unfolding of the two domains, with the N-domain unfolding at a lower concentration of denaturant than the C-domain. Thus, at equilibrium the intermediate that accumulates is a protein with the N-domain unfolded and the C-domain folded (nC).

When Trp fluorescence was used to monitor refolding, the rate of folding was identical to the rate of regain of enzymatic activity, suggesting that the rate-limiting step in folding is formation of the C-domain. If folding is followed using an excitation wavelength of 270 nm (thus allowing both Tyr and Trp fluorescence to be monitored) an additional fast folding phase can be observed, which can be ascribed to the folding of the N-domain. All the domain assignments were confirmed by comparison with the individual domains. In the unfolding experiments, two unfolding phases could also be observed: a fast phase, associated with the unfolding of the C-domain, and a slow phase associated with the unfolding of the N-domain. The analysis suggests that, in the case of bsPGK, the two domains fold and unfold independently. This means that there is no compulsory “order” to the folding and unfolding events, so that both fast and slow phases can be observed in single-jump stopped-flow experiments. However, since the N-domain both folds and unfolds faster than the C-domain the intermediate that is populated is different in folding and unfolding. The authors describe this independent folding as “random order”

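$$\text{Folding: } \mathrm{D}(nc) \rightarrow \mathrm{I}(Nc) \rightarrow \mathrm{F}(NC),$$
$$\text{Unfolding: } \mathrm{F}(NC) \rightarrow \mathrm{I}(nC) \rightarrow \mathrm{D}(nc).$$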

Comparison of the equilibrium stability of the isolated N-domain with the N-domain in bsPGK suggests that there are few interactions between the two domains. Since in the equilibrium experiment the N-domain unfolds at lower denaturant concentrations (i.e., in the presence of the folded C-domain) this must suggest that the interaction energy (ΔGINT) is insignificant. However, the C-domain alone is significantly less stable than the C-domain in the intact protein. This observation suggests that the C-domain interacts with the linking helix that was not present in either isolated domain, illustrating the point that regions outside the traditional domain boundaries cannot be ignored.

The final complication in this analysis comes from the surprising observation that the isolated N-terminal domain actually folds somewhat more rapidly (∼ eightfold) than the same domain in the intact protein. Both the C- and the N-terminal domains, when in isolation, fold through collapsed, partly folded, transiently populated intermediates before reaching the fully folded state (as evidenced by rollover in the folding limbs of the chevron plots). Parker et al. (1996) argue compellingly that nonproductive interactions between these intermediates cause the slower folding of the N-terminal domain when placed in the full-length protein.

Spectrin domains: explaining strange m-values and hidden kinetic phases

Spectrin domains are three-helix bundle proteins joined by linking helices, such that the C-terminal helix of one domain is contiguous with the N-terminal helix of the following domain [Fig. 1e]. It was demonstrated by MacDonald and co-workers that spectrin domain pairs unfold at higher temperatures and at higher denaturant concentrations than the component domains alone (MacDonald and Pozharski, 2001). This “cooperative” behavior has been associated with the presence of the linking helix, as the interface between the two domains is relatively small (Han et al., 2007). Additionally, mutations that disrupt this linking helix have been shown to disrupt interdomain cooperativity (Batey et al., 2005; Johnson et al., 2007).

We have examined the folding of pairs of domains taken from chicken brain α-spectrin, R1516 (Batey and Clarke, 2006) (with linked domains R15 and R16) and R1617 (Batey et al., 2005; 2006). Usefully, R16 has 2 Trp residues, while R15 and R17 have only one, so that relative fluorescence changes on unfolding are greater for R16 than for the other domains.

Equilibrium behavior: In R1516 the tandem construct unfolds at a higher denaturant concentration than either constituent domain, suggesting that the domains are stabilized. However, the m-value of the two-domain protein is the same as that of the individual domains. Although the fluorescence and CD data overlay, the low m-value is a clear indicator that the protein is not unfolding as a single cooperative unit at equilibrium [Fig. 2d].

In R1617 the equilibrium behavior is different. The tandem domain also unfolds at a higher urea concentration than either R16 or R17 alone, but in this case the m-value increases [to ∼1.6×(mR16+mR17)] and the fluorescence and CD data overlay, suggesting a more cooperative transition [Fig. 2c].

Determining the interaction energies: A mutational study using a variety of mutants of the R17 domain in R1617 allowed interaction energies to be determined (Batey et al., 2005). Weakly destabilizing mutations showed a single transition but with a lower m-value and, in some cases, a loss in coincidence between fluorescence and CD data. However, highly destabilizing mutations resulted in a biphasic equilibrium transition, due to the population of an intermediate with R17 unfolded and R16 folded. From these data it was possible to show that R17 is stabilized by ∼3 kcal mol−1 by folded R16. We later showed that unfolded R16 did not affect the stability of R17 ($\Delta G_{\mathrm{chain}}^{\mathrm{R16uR17f}} = 0$), which makes ΔGINT∼3 kcal mol−1. From the same data it was also clear that R16 is stabilized by ∼1 kcal mol−1 by unfolded R17 ($\Delta G_{\mathrm{chain}}^{\mathrm{R16fR17u}} = 1$ kcal mol−1). Thus, where

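$$\Delta G_{\mathrm{TOT}}^{\mathrm{R1617}} = \Delta G_{\mathrm{R16}} + \Delta G_{\mathrm{R17}} + \Delta G_{\mathrm{INT}} + \Delta G_{\mathrm{chain}}^{\mathrm{R16fR17u}} + \Delta G_{\mathrm{chain}}^{\mathrm{R16uR17f}},$$

then the total free energy of the system,

$$\Delta G_{\mathrm{TOT}}^{\mathrm{R1617}} = 6.4 + 6.0 + 3.0 + 1.0 + 0 = 16.4\ \mathrm{kcal\,mol^{-1}}.$$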
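The bookkeeping of Eq. 3 can be written as a small helper. This is only a sketch: the function and argument names are hypothetical, and the numerical inputs are the R1617 values quoted above (in kcal mol−1).

```python
def total_stability(dg_a, dg_b, dg_int=0.0, dg_chain_ab=0.0, dg_chain_ba=0.0):
    """Total stability of a two-domain protein A-B following Eq. 3 (kcal/mol).

    dg_a, dg_b   : stabilities of the isolated domains
    dg_int       : native-state interaction energy between the folded domains
    dg_chain_ab  : stabilization of folded A by unfolded B
    dg_chain_ba  : stabilization of folded B by unfolded A
    With the last three terms left at zero this reduces to Eq. 1.
    """
    return dg_a + dg_b + dg_int + dg_chain_ab + dg_chain_ba


# R1617 values quoted in the text: A = R16, B = R17
independent = total_stability(6.4, 6.0)                    # Eq. 1 only
full = total_stability(6.4, 6.0, dg_int=3.0,
                       dg_chain_ab=1.0, dg_chain_ba=0.0)   # Eq. 3
print(round(independent, 1))  # 12.4
print(round(full, 1))         # 16.4
```

The difference between the two results (4.0 kcal mol−1) is the part of the total stability that would be missed if the domains were treated as independent.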

KINETICS

Simple folding kinetics of R1516 (Batey and Clarke, 2006): Two refolding phases are observed. The first is assigned to the folding of R15 which folds at the same rate as the isolated domain. R16 then folds with a rate constant for folding that has been increased (∼30 fold) by the presence of prefolded R15. Only a single unfolding phase (unfolding of R16) is observed in single jump experiments, but interrupted refolding experiments reveal the unfolding of the R15 domain, which is slowed ∼35 fold by the presence of unfolded R16. Thus, R1516 folds by a sequential mechanism, with an intermediate comprising R15 folded and R16 unfolded

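$$\mathrm{D(R15uR16u)} \underset{\text{fast}}{\overset{\text{fast}}{\rightleftharpoons}} \mathrm{I(R15fR16u)} \underset{\text{slow}}{\overset{\text{slow}}{\rightleftharpoons}} \mathrm{F(R15fR16f)}.$$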

Importantly these results suggest that when R16 is folded, the R15 domain is stabilized significantly, so that it unfolds at least as slowly as R16. Notice that here, in contrast to the bsPGK case above, the folding and unfolding are strictly ordered and folding and unfolding are the reverse of each other.

Complex folding kinetics of R1617: The analysis of the kinetics of R1617 (determined using a combination of double-jump experiments, amplitude analysis in fluorescence and CD modes and carefully selected mutants) is very complex and so beyond the scope of this review (Batey et al., 2006). Briefly, there is one observable folding and unfolding transition below 5.5 M urea, and a collapsed unstable intermediate (I1) is populated early in the folding of R1617. However, overall, the folding mechanism is similar to that observed for R1516, with the N-terminal domain (R16) folding first, and with the unfolding reaction being the reverse of folding

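$$\mathrm{D(R16uR17u)} \rightleftharpoons \mathrm{I_{1}(collapsed)} \underset{\text{fast}}{\overset{\text{slow}}{\rightleftharpoons}} \mathrm{I_{2}(R16fR17u)} \underset{\text{slow}}{\overset{\text{fast}}{\rightleftharpoons}} \mathrm{F(R16fR17f)}.$$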

For both R1516 and R1617 the kinetic data could be used to determine all terms in Eq. 3, allowing ΔGTOT to be evaluated.

Explaining the equilibrium m-values: Despite having a similar folding mechanism, we showed (above and Fig. 2) that the equilibrium unfolding behavior of R1516 and R1617 was very different. In R1516 the equilibrium m-value is the same as that of the individual domains, whereas in R1617 the m-value is close to the sum of the two individual domains: i.e., R1617 appears to show two-state (all-or-none) behavior at equilibrium, whereas this is not the case in R1516. The kinetic data for R1516 and R1617 could be used to reconstruct equilibrium curves (Fig. 3) and explain this observation (Batey and Clarke, 2006). In R1617, the folding intermediate (R16fR17u) does not accumulate to any significant extent, so the entire protein unfolds, at equilibrium, as an apparently single cooperative unit [Figs. 3a, 3c]. In contrast, in R1516 an intermediate (R15fR16u) accumulates, resulting in an equilibrium curve with a low m-value [Figs. 3b, 3d]. This study illustrates that careful analysis of equilibrium m-values can give information about the population of intermediates in multidomain proteins, even where CD and fluorescence data coincide, and that a complete kinetic analysis is essential for understanding the folding of domains within a multidomain context.
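The idea of reconstructing equilibrium populations from kinetic rate constants can be sketched for a sequential scheme D ⇄ I ⇄ F. The snippet below assumes that the logarithm of each rate constant depends linearly on denaturant concentration (as in the linear arms of a chevron plot) and obtains the populations from the ratios of forward and reverse rate constants. All rate constants, slopes, and names are hypothetical; they are not the published values for R1516 or R1617, and the authors' actual procedure may differ in detail.

```python
import math


def rate(k_water, m_k, d):
    """Rate constant at denaturant concentration d (M), assuming ln k is linear in [d]."""
    return k_water * math.exp(m_k * d)


def populations(d, step1, step2):
    """Equilibrium fractions (D, I, F) for D <=> I <=> F.

    Each step is (k_fold_water, m_fold, k_unfold_water, m_unfold);
    step1 describes D <=> I and step2 describes I <=> F.
    """
    kf1, mf1, ku1, mu1 = step1
    kf2, mf2, ku2, mu2 = step2
    K1 = rate(kf1, mf1, d) / rate(ku1, mu1, d)  # [I]/[D]
    K2 = rate(kf2, mf2, d) / rate(ku2, mu2, d)  # [F]/[I]
    total = 1.0 + K1 + K1 * K2
    return 1.0 / total, K1 / total, K1 * K2 / total


def max_intermediate(step1, step2, d_max=8.0, n=81):
    """Largest equilibrium population of I over a grid of denaturant concentrations."""
    grid = [d_max * i / (n - 1) for i in range(n)]
    return max(populations(d, step1, step2)[1] for d in grid)


# Hypothetical case 1: I is much more stable than D, so I accumulates
accumulating = ((100.0, -1.5, 1e-3, 0.5), (10.0, -1.5, 1e-2, 0.5))
# Hypothetical case 2: I is less stable than D, so I never accumulates
non_accumulating = ((10.0, -1.5, 100.0, 0.5), (1000.0, -1.5, 1e-4, 0.5))

print(f"max I population, case 1: {max_intermediate(*accumulating):.3f}")
print(f"max I population, case 2: {max_intermediate(*non_accumulating):.5f}")
```

In the first case a substantial fraction of the molecules occupies the intermediate near the midpoint, which broadens the equilibrium transition and lowers the apparent m-value. In the second case the intermediate is never significantly populated, so the protein behaves as a single cooperative unit at equilibrium.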

Figure 3. The equilibrium properties of the spectrin tandem repeat domains R1516 and R1617 can be explained using kinetic data.

(a) Equilibrium populations of the folded (red), intermediate (black), and unfolded (blue) species of R1516 determined from the kinetic rate constants. (b) Equilibrium populations of the folded (red), intermediate (black), and unfolded (blue) species of R1617 determined from the kinetic rate constants. (c) Modeled (open circles) and actual (closed circles) equilibrium denaturation curves of R1516 from CD (black) and fluorescence (red) measurements. (d) Modeled (open circles) and actual (closed circles) equilibrium denaturation curves of R1617 from CD (black) and fluorescence (red) measurements. Data taken from (Batey and Clarke, 2006).

DO DOMAINS FOLD BY THE SAME PATHWAY WITHIN A MULTIDOMAIN PROTEIN?

It is important to specify what is meant by the term “pathway” in this context. Here we refer to the order of formation of structure between the unfolded and folded states of an individual domain; the most commonly studied species are the transition state and any populated intermediates. If the transition state structure changes significantly when an isolated domain is inserted into a multidomain context, then this is an indication of a “different pathway.” Changes in the rate constants for folding and unfolding, or changes in the populations of partly structured species do not necessarily indicate a different pathway. In order to address this question definitively, the folding pathway of a domain has to be studied in detail both in the isolated domain and in the context of its neighboring domain.

Where two domains are independent, their folding pathways should be independent of their context. In bsPGK, for instance, both the N- and C-domains fold via an intermediate, and both also fold via (apparently) the same intermediate in the intact protein (Parker et al., 1996). The complication is that the partially folded intermediates form transient, weak, “off-pathway” interactions, which actually slow folding. This does not, however, suggest that the domains fold via different pathways in the multidomain protein. Similarly, changes in fluorescence profiles of yeast PGK as the domains fold might simply reflect the different environments of the Trp residues in the domains when they fold in the multidomain protein (Osvath et al., 2005).

Where proteins interact in the native state, slowing unfolding, it is possible that this might alter the unfolding pathway by stabilizing a region of the protein that normally unfolds early. If a domain also folds faster when placed in a multidomain protein, this suggests that the stabilization of a neighboring domain is conferred not only on the native state of the protein, but also the transition state for folding. In this case it is possible that the protein folds via the same transition state (which has been stabilized by the presence of a neighboring domain), but it may also be that the structure of the transition state itself is altered, i.e., the pathway of folding is changed.

To our knowledge, there is only one definitive study of the effect of a folded neighboring domain on the folding pathway. We studied the folding (and unfolding) of R16 in the R1516 construct (Batey and Clarke, 2008). R16 folds faster and unfolds more slowly in the presence of (prefolded) R15 than it does alone. This study was comprehensive; 56 mutations were made in the 106-residue R16 domain, at 39 positions. The folding and unfolding of each of these mutant proteins were followed in R1516 where, in all cases, R15 folded first and unfolded last. This allowed Φ values to be determined at 35 positions. These were compared with Φ values previously determined at the same positions in isolated R16 (Scott et al., 2004b). R16 folds via a high-energy intermediate (Scott and Clarke, 2005), and so it is possible to determine the Φ values for both an “early” and a “late” transition state (from folding and unfolding data, respectively). So a total of 70 Φ values were compared. The pattern of Φ values was identical for both transition states, whether R16 is folding in the presence of folded R15 or in isolation. Thus, the folding pathway of the R16 domain is determined entirely by its own primary sequence: it is not influenced by interactions with its neighbor. In other words, the energy landscape of this domain is maintained in the multidomain protein. The primary sequence alone codes for the folding pathway (landscape) and the Anfinsen principle holds (Anfinsen, 1973).
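For readers unfamiliar with Φ-value analysis, the following minimal sketch shows the standard definition for a folding transition state: the change in activation free energy on mutation, obtained from the ratio of wild-type to mutant folding rate constants, divided by the change in overall stability. It assumes a single rate-limiting barrier and a fixed temperature of 298 K; the function names and example numbers are hypothetical, and the sketch does not reproduce the two-transition-state treatment used for R16.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def ddg_activation(kf_wt, kf_mut, temperature=298.0):
    """Change in the folding activation free energy on mutation (kcal/mol)."""
    return R_KCAL * temperature * math.log(kf_wt / kf_mut)

def phi_folding(kf_wt, kf_mut, ddg_stability, temperature=298.0):
    """Phi value: fraction of the stability change already felt at the transition state.

    ddg_stability is the loss of native-state stability on mutation (kcal/mol);
    values near 0 suggest an unstructured site at the transition state,
    values near 1 suggest native-like structure.
    """
    return ddg_activation(kf_wt, kf_mut, temperature) / ddg_stability

# Hypothetical example: a mutation that slows folding 3-fold and
# destabilizes the domain by 1.5 kcal/mol.
print(round(phi_folding(90.0, 30.0, 1.5), 2))
```

Comparing such values position by position, for the isolated domain and for the same domain next to a folded neighbor, is the kind of comparison described above.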

Although this study is only a single example, it is perhaps what one should expect. During evolution, new proteins arise from domain shuffling, and domains that can still fold efficiently will be more successful than domains which require a reengineering of their protein folding landscapes in order for the domains to fold in their new context. Thus, the landscapes both of single and multidomain proteins are funnel-like with reduced frustration (Onuchic et al., 1997).

EFFECT OF MUTATION ON THE INTERACTION ENERGY OF MULTIDOMAIN PROTEINS

In a multidomain protein, mutations may have a number of effects. As an example, take a two-domain protein, A-B, where the domains interact.

The free energy of this protein is described by Eq. 3, which we reproduce here:

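$$\Delta G_{\mathrm{TOT}} = \Delta G_{\mathrm{A}} + \Delta G_{\mathrm{B}} + \Delta G_{\mathrm{INT}} + \Delta G_{\mathrm{chain}}^{\mathrm{Ab}} + \Delta G_{\mathrm{chain}}^{\mathrm{aB}}.$$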

If there is a mutation in domain A that destabilizes domain A but does not affect the interactions between the domains, then the total change in the free energy of the protein is

$$\Delta\Delta G_{\mathrm{TOT}} = \Delta\Delta G_{\mathrm{A}},$$

where $\Delta\Delta G_{\mathrm{A}}$ is the change in stability of domain A.

If, however, the mutation also affects the interface between the native state of domains A and B, then the effect is larger

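$$\Delta\Delta G_{\mathrm{TOT}} = \Delta\Delta G_{\mathrm{A}} + \Delta\Delta G_{\mathrm{INT}},$$

where $\Delta\Delta G_{\mathrm{INT}}$ is the change in the interaction energy between the domains.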

An even more drastic scenario can exist. In chicken brain alpha-spectrin R1617 we have made a mutation (Glu to Pro) that is analogous to a disease-causing mutation in human spectrin, as it places a Pro in the linker helix between two spectrin domains (Giorgi et al., 2001). This mutation destabilizes R16 alone ($\Delta\Delta G_{\mathrm{R16}}$) by only about 1 kcal mol⁻¹. However, it also disrupts all interactions between the two domains (unpublished data).

In this case

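$$\Delta\Delta G_{\mathrm{TOT}} = \Delta\Delta G_{\mathrm{R16}} + \Delta\Delta G_{\mathrm{INT}} + \Delta\Delta G_{\mathrm{chain}}^{\mathrm{R16_fR17_u}} + \Delta\Delta G_{\mathrm{chain}}^{\mathrm{R16_uR17_f}}.$$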

Since in wild-type R1617 the interaction energy between the two domains ($\Delta G_{\mathrm{INT}}$) is ∼3 kcal mol⁻¹ and $\Delta G_{\mathrm{chain}}^{\mathrm{R16_fR17_u}}$ is ∼1 kcal mol⁻¹ ($\Delta G_{\mathrm{chain}}^{\mathrm{R16_uR17_f}} = 0$), the total destabilization caused by this apparently minor mutation is ∼5 kcal mol⁻¹, much greater than the initial estimate. Thus, in estimating the effect of pathogenic mutations it is important to take into account the effect the mutation may have on the entire protein, not just on the domain in isolation.
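As a worked check of this estimate, the approximate values quoted above can be substituted term by term. This assumes that the mutation abolishes the wild-type interaction and chain terms completely, so that each ΔΔG term equals the corresponding wild-type ΔG in magnitude:

```latex
\begin{aligned}
\Delta\Delta G_{\mathrm{TOT}}
  &= \Delta\Delta G_{\mathrm{R16}} + \Delta\Delta G_{\mathrm{INT}}
   + \Delta\Delta G_{\mathrm{chain}}^{\mathrm{R16_fR17_u}}
   + \Delta\Delta G_{\mathrm{chain}}^{\mathrm{R16_uR17_f}} \\
  &\approx (1 + 3 + 1 + 0)\ \mathrm{kcal\ mol^{-1}} \\
  &= 5\ \mathrm{kcal\ mol^{-1}}.
\end{aligned}
```

On these numbers, only about one fifth of the total destabilization would be detected by studying the mutation in isolated R16.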

WHEN THERE ARE MORE THAN TWO DOMAINS

Even fewer three-domain protein systems have been studied.

Trigger factor: Trigger factor is a three-domain protein, N-M-C [Fig. 1f]. All three domains are essential for function, and the folding of the entire protein and of a number of fragments has been investigated (Zarnt et al., 1997). The results suggest that the contiguous N- and M-domains can be considered as energetically independent folding units, which do not interact significantly. There are weakly stabilizing interactions between the M- and C-domains that are apparently important for solubilizing the C-domain and allowing it to fold. From an examination of the structure colored by the domain boundaries used in the study, this might reflect the difficulty of domain boundary selection [in this case chosen as protease digestion fragments; see Fig. 1f]. Interestingly, the data suggest that the strongest interactions are between the N- and C-domains. An analysis of the structure shows that these interact through a flexible extension of the N-domain [Fig. 1f]. The authors suggest that these N-C interactions orient the three domains correctly, so that an active conformation is obtained. The effect of these interactions on the kinetics of domain folding was not examined in this study.

Tandem array proteins: Some proteins, such as spectrin, titin, and fibronectin, are composed of a number of domains in tandem array. The linkage between the domains is fairly stiff so that while each domain is in contact with its direct neighbor, it does not interact with other domains in the same molecule (Improta et al., 1998; Kusunoki et al., 2004; Leahy et al., 1996). In the case of the domains in the I-band of titin, the domains appear to be essentially independent of each other (Scott et al., 2002). The stability of any domain is the same whether examined as an isolated construct or in the presence of its N- and C-terminal neighbors, i.e., ΔGINT=0 and ΔGchain=0. In this case, we proposed, the I-band of titin could be considered the simple “sum of its parts”:

$$\Delta G_{\mathrm{TOT}} = \sum \Delta G_{\mathrm{individual}}.$$

In the case of spectrin domains, each domain interacts with, and is stabilized by, its direct neighbors, but appears to be unaffected by any neighbor at a position i±2. Thus, in the extended three-domain construct R151617, R15 folds and unfolds exactly as it does in R1516. Likewise R17 folds and unfolds, in R151617, exactly as it does in R1617. R16, however, is stabilized by both its neighbors (Fig. 4) and is more stable in R151617 than it is alone or in either two-domain construct. Thus

$$\Delta G_{\mathrm{TOT}} = \sum \Delta G_{\mathrm{individual}} + \sum \Delta G_{\mathrm{INT}} + \sum \Delta G_{\mathrm{chain}}.$$

Figure 4. Kinetics of the three-domain protein spectrin R151617; in this case domains are affected only by their direct neighbors.

The kinetics of R15 (purple), R16 (red), and R17 (blue) shown as individual domains (closed circles), in R1516 (open circles), in R1617 (open triangles), and in R151617 (closed triangles). Top row shows the kinetics of R15 (alone and in the two- and three-domain proteins, R1516 and R151617, respectively). The unfolding of R15 is slowed by its neighbor, R16, in R1516. There is no additional effect on either folding or unfolding rates from the addition of R17 to form R151617. Middle row shows the kinetics of R16 (alone, in the two-domain proteins R1516 and R1617, and in the three-domain protein, R151617). R16 is stabilized in R1516 by an increase in folding rate and a decrease in unfolding rate. In R1617, the folding rate of R16 is slowed due to the population of a collapsed intermediate; the unfolding rate is also decreased. In R151617 the folding rate of R16 is changed by two opposing effects: the presence of folded R15 increases the folding rate, whereas the presence of R17 still leads to the population of the collapsed intermediate. The unfolding of R16 in R151617 is slower than in R1516 or R1617 due to the additive effect of the two neighboring domains. The bottom row shows the kinetics of R17 (alone, in the two-domain protein R1617, and in the three-domain protein R151617). R17 is stabilized in R1617 by an increase in folding rate and a decrease in unfolding rate. The kinetics of R17 in R151617 are exactly the same as those in R1617; therefore the addition of R15 has no effect on R17.

Effects of mutations in one domain on the stability of other domains: Take a three-domain protein A-B-C, where each domain, as in spectrin R151617, only interacts with its direct neighbors. Can domain A exert indirect effects on domain C? It has recently been demonstrated, in the case of a protein with multiple natively unfolded domains, that stabilization of domain A, for instance by binding of a ligand, could induce folding of domain B, which in turn could induce folding in domain C (Hilser and Thompson, 2007). In this case, it has been proposed that the cooperativity between the domains could explain allosteric effects in such proteins.

In a natively folded protein, such as spectrin R151617, is there a case where a mutation in domain A would affect the stability of domain C? A severe mutation in domain A, which disrupted all interactions between A and B, might cause A to become unfolded. In this case domain B could also be destabilized significantly. However, so long as B is stable alone (or stable and fully folded in the presence of domain C), this would not significantly affect the population of folded B. Thus, loss of folded domain A would not disrupt stabilizing interactions between domains B and C, and domain C will be unaffected. In large multidomain proteins it might be advantageous for each domain to be individually stable to avoid catastrophic domino effects upon mutation.

Similarly, we have previously suggested that for the Ig domains in the I-band of titin, it is an advantage for each domain to be independent of its neighbors (Scott et al., 2002). The I-band of titin is responsible for passive elasticity in muscle. It is subject to cycles of mechanical stress and it has been suggested that under extremes of force the titin molecule may lengthen by unfolding of just a few domains (Minajeva et al., 2001). As long as the domains are independent, unfolding of one domain will not make the unfolding of a direct neighbor more likely. Indeed, the arrangement of weak and strong domains in titin suggests that near neighbors are unlikely to unfold simultaneously (Li et al., 2002; Scott et al., 2002). Upon release of any mechanical stress, this segregation of unfolded domains will allow each domain to refold efficiently, free of the chance of misfolding with an adjacent unfolded chain.

FOLDING OF MULTIDOMAIN PROTEINS IN VIVO

Efficient folding is vital, not only to produce a supply of functional folded protein, but also to prevent formation of misfolded or partly folded species, which may be prone to aggregation. In principle, formation of such non-native states is more likely when folding a multidomain protein due to a higher local concentration of protein. The protein luciferase, for instance, folds far more efficiently in vivo or in an in vitro translation system than upon refolding of the entire protein following denaturation, even when a full complement of chaperones is included in the in vitro refolding system (reviewed in detail in Frydman, 2001). There is compelling evidence to suggest that multidomain proteins fold cotranslationally, domain by domain. It has been suggested that this will lead to more efficient folding by preventing formation of misfolded states that may (a) slow folding (as is seen in bsPGK) or (b) result in aggregation. It has also been proposed that translational pausing may play a role in facilitating the efficient folding of multidomain proteins in vivo (Thanaraj and Argos, 1996). The pause in translation may allow one domain to fold before production of its neighbor, again avoiding interdomain misfolding. It is also possible, given our results on spectrin domains, that the prior folding of an N-terminal domain may catalyze the folding of its C-terminal neighbor (a folded R16, for instance, increases the rate of R17 folding from 30 to 1000 s⁻¹), again resulting in more efficient folding. Interestingly, there is evidence that eukaryotic translation machinery is significantly more efficient in the production of correctly folded multidomain proteins than that within prokaryotic systems. This has been related to the observation that multidomain proteins make up a larger proportion of the eukaryotic proteome and that multidomain proteins in eukaryotes are, on average, larger with more domains than prokaryotic proteins (Frydman, 2001).

It is important to recall, in the context of this discussion, that domains in a protein may fold and unfold a number of times during their lifetime. Take spectrin domains as an example, again. Spectrin is a component of the cytoskeleton of the red blood cell where it mediates membrane elasticity. The average lifetime of a red blood cell exceeds three months and during this time the membrane experiences significant mechanical deformation as it passes through the circulatory system. Thus, it is likely that during their lifetime, the spectrin domains unfold and refold many times, away from the translation machinery of the cell. The rate constant of unfolding of a spectrin domain is slowed significantly by each of its neighbors, decreasing the likelihood (frequency) of unfolding. R17, for example, unfolds some 30 times more slowly in the presence of folded R16—dramatically increasing the half-life of unfolding from 30 min to around 15 h. Furthermore, recovery of the native state will be significantly more rapid in the multidomain protein, decreasing the time spent in the vulnerable unfolded state where the protein is susceptible to aggregation or degradation.
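The half-lives quoted above follow from first-order kinetics, where the half-life is ln 2 divided by the rate constant. The short sketch below checks that a 30-fold decrease in the unfolding rate constant converts a 30 min half-life into roughly 15 h. The function names are illustrative, and the calculation assumes simple single-exponential unfolding.

```python
import math

def half_life(rate_constant):
    """Half-life (s) of a first-order process with the given rate constant (s^-1)."""
    return math.log(2) / rate_constant

def rate_from_half_life(t_half):
    """First-order rate constant (s^-1) corresponding to a half-life in seconds."""
    return math.log(2) / t_half

# Unfolding of R17 alone: half-life of about 30 min (value quoted in the text).
k_alone = rate_from_half_life(30 * 60)

# In the presence of folded R16, unfolding is about 30 times slower.
k_with_neighbor = k_alone / 30

hours = half_life(k_with_neighbor) / 3600
print(f"half-life with folded neighbor: {hours:.1f} h")
```

Because the half-life scales inversely with the rate constant, any fold-change in the unfolding rate translates directly into the same fold-change in the time a domain spends folded between unfolding events.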

CONCLUSIONS

Too little work has been done in this area to elucidate in detail the effects that interdomain interactions may have in a multidomain protein. By analogy with protein complexes it seems to be “obvious” that where the interface between domains is large, the interaction between them will confer more stability than where the interface is small (Levy et al., 2008). Indeed there is evidence to support this in a study of the stabilities of Ig domains in antibody fragments [see Rothlisberger et al. (2005) and detailed discussion in Han et al. (2007)]. However, the interaction energy will also depend on the nature of the interface, and the specificity of the interactions between domains. If the activity (or stability) of the protein relies on specific positioning of the two domains, then the interface is likely to be more important and thus it is more likely to confer stability than a nonspecific interface.

However, in the helical spectrin domains any nonspecific extension at the C-terminus of a domain (even a non-natural all-beta protein) is stabilizing. Is spectrin alone in this? No previous studies have suggested that simple incorporation of a domain into a long protein could in itself be stabilizing. Perhaps that is because such an effect is not expected and so has not been looked for (we came across this by accident ourselves). We suggested that this stabilization might arise because any extension prevents fraying of helices, and that such an interaction is lost when a proline residue is incorporated in the linker. Does this mean that this effect is a phenomenon that will be restricted to, or indeed be common in, all-alpha proteins?

Is there any way one might predict what kind of interactions between domains are likely to promote (speed) the folding of a neighboring domain? There is evidence to suggest that the nature of the interface or linker is important. It is intriguing to note that the most compelling case for significant interdomain effects on the folding rates of proteins is in spectrin domains, where the interface is relatively small but a continuous helix stretches from one domain to the other. Would a continuous β-sheet be equally competent to promote folding of a neighboring domain?

The bsPGK and luciferase examples give a clear indication that the folding of domains within their multidomain context might be slowed when compared to the domains in isolation. In R1617 there is formation of a collapsed intermediate that also slows folding. This could potentially be a problem. Is there any evidence that domains that are commonly found in a multidomain environment are more “foldable”? If this is the case, what determines the “foldability” in this context? We construct artificial multidomain constructs for use in atomic force microscopy experiments, where we express multiple copies of the same domain together in a single chain—a so-called polyprotein (Carrion-Vazquez et al., 1999). We have found that some perfectly stable, soluble domains such as barnase simply cannot be expressed in these polyproteins, whereas others such as titin domains can be expressed very easily (Best et al., 2001). Have some domains evolved to allow efficient folding in a multidomain context? Some domains, such as immunity proteins, are never found in a multidomain context, whereas others, such as fnIII domains, are virtually never found alone. For the domains that are common in multidomain environments, are there specific features of their folding energy landscape that allow this? Han et al. published an analysis of a number of proteins, the folding of which has been studied in detail, which may allow a starting point for future studies of this question [see supplementary data (Han et al., 2007)].

In summary, there are many unanswered questions posed within this review which provide scope for much work. We suggest that the use of structural and bioinformatics data will provide many interesting targets to allow these specific questions to be addressed.

SUPPORTING INFORMATION

Supplemental Content S1.pdf

ACKNOWLEDGMENTS

SB, AAN, and JC were all supported by the Wellcome Trust (Grant No. 064417∕Z∕01). JC is a Wellcome Trust Senior Research Fellow.

References

  1. Anfinsen, C B (1973). “Principles that govern the folding of protein chains.” Science 181, 223–230. doi:10.1126/science.181.4096.223
  2. Apic, G, Gough, J, and Teichmann, S A (2001). “Domain combinations in archaeal, eubacterial and eukaryotic proteomes.” J. Mol. Biol. 310, 311–325.
  3. Arora, P, Hammes, G G, and Oas, T G (2006). “Folding mechanism of a multiple-independently folding domain protein: double B domain of protein A.” Biochemistry 45, 12312–12324.
  4. Bateman, A, et al. (2004). “The Pfam protein families database.” Nucleic Acids Res. 32, D138–D141. doi:10.1093/nar/gkh121
  5. Batey, S, and Clarke, J (2006). “Apparent cooperativity in the folding of multidomain proteins depends on the relative rates of folding of the constituent domains.” Proc. Natl. Acad. Sci. U.S.A. 103, 18113–18118.
  6. Batey, S, and Clarke, J (2008). “The folding pathway of a single domain in a multidomain protein is not affected by its neighboring domain.” J. Mol. Biol. 378, 297–301.
  7. Batey, S, Randles, L G, Steward, A, and Clarke, J (2005). “Cooperative folding in a multidomain protein.” J. Mol. Biol. 349, 1045–1059. doi:10.1016/j.jmb.2005.04.028
  8. Batey, S, Scott, K A, and Clarke, J (2006). “Complex folding kinetics of a multidomain protein.” Biophys. J. 90, 2120–2130.
  9. Best, R B, Li, B, Steward, A, Daggett, V, and Clarke, J (2001). “Can non-mechanical proteins withstand force? Stretching barnase by atomic force microscopy and molecular dynamics simulation.” Biophys. J. 81, 2344–2356.
  10. Carrion-Vazquez, M, Oberhauser, A F, Fowler, S B, Marszalek, P E, Broedel, S E, Clarke, J, and Fernandez, J M (1999). “Mechanical and chemical unfolding of a single protein: a comparison.” Proc. Natl. Acad. Sci. U.S.A. 96, 3694–3699. doi:10.1073/pnas.96.7.3694
  11. Ekman, D, Bjorklund, A K, Frey-Skott, J, and Elofsson, A (2005). “Multi-domain proteins in the three kingdoms of life: orphan domains and other unassigned regions.” J. Mol. Biol. 348, 231–243.
  12. See EPAPS Document No. E-HJFOA5-2-007807 for supplemental material. This document can be reached through a direct link in the online article’s HTML reference section or via the EPAPS homepage (http://www.aip.org/pubservs/epaps.html).
  13. Freire, E, Murphy, K P, Sanchez-Ruiz, J M, Galisteo, M L, and Privalov, P L (1992). “The molecular basis of cooperativity in protein folding. Thermodynamic dissection of interdomain interactions in phosphoglycerate kinase.” Biochemistry 31, 250–256.
  14. Frydman, J (2001). “Folding of newly translated proteins in vivo: the role of molecular chaperones.” Annu. Rev. Biochem. 70, 603–647. doi:10.1146/annurev.biochem.70.1.603
  15. Gerstein, M (1998). “How representative are the known structures of the proteins in a complete genome? A comprehensive structural census.” Folding Des. 3, 497–512.
  16. Giorgi, M, Cianci, C D, Gallagher, P G, and Morrow, J S (2001). “Spectrin oligomerization is cooperatively coupled to membrane assembly: a linkage targeted by many hereditary hemolytic anemias.” Exp. Mol. Pathol. 70, 215–230.
  17. Gokhale, R S, and Khosla, C (2000). Curr. Opin. Chem. Biol. 4, 22–27.
  18. Grum, V L, Li, D, MacDonald, R I, and Mondragon, A (1999). “Structures of two repeats of spectrin suggest models of flexibility.” Cell 98, 523–535. doi:10.1016/S0092-8674(00)81980-7
  19. Hamill, S J, Meekhof, A E, and Clarke, J (1998). “The effect of boundary selection on the stability and folding of the third fibronectin type III domain from human tenascin.” Biochemistry 37, 8071–8079.
  20. Han, J H, Batey, S, Nickson, A A, Teichmann, S A, and Clarke, J (2007). “The folding and evolution of multidomain proteins.” Nat. Rev. Mol. Cell Biol. 8, 319–330.
  21. Hilser, V J, and Thompson, E B (2007). “Intrinsic disorder as a mechanism to optimise allosteric coupling in proteins.” Proc. Natl. Acad. Sci. U.S.A. 104, 8311–8315.
  22. Improta, S, Krueger, J K, Gautel, M, Atkinson, R A, Lefevre, J-F, Moulton, S, Trewhella, J, and Pastore, A (1998). “The assembly of immunoglobulin-like modules in titin: implications for muscle elasticity.” J. Mol. Biol. 284, 761–777. doi:10.1006/jmbi.1998.2028
  23. Jaenicke, R (1999). “Stability and folding of domain proteins.” Prog. Biophys. Mol. Biol. 71, 155–241.
  24. Johnson, C P, Gaetani, M, Ortiz, V, Bhasin, N, Harper, S, Gallagher, P G, Speicher, D W, and Discher, D E (2007). “Pathogenic proline mutation in the linker between spectrin repeats: disease caused by spectrin unfolding.” Blood 108, 3538–3543.
  25. Kloss, E, Courtemanche, N, and Barrick, D (2008). “Repeat-protein folding: new insights into origins of cooperativity, stability and topology.” Arch. Biochem. Biophys. 469, 83–99.
  26. Kusunoki, H, Minasov, G, MacDonald, R I, and Mondragon, A (2004). “Independent movement, dimerization and stability of tandem repeats of chicken brain alpha-spectrin.” J. Mol. Biol. 344, 495–511. doi:10.1016/j.jmb.2004.09.019
  27. Leahy, D J, Aukhil, I, and Erickson, H P (1996). “2.0 Å crystal structure of a four-domain segment of human fibronectin encompassing the RGD loop and synergy region.” Cell 84, 155–164.
  28. Levy, E D, Erba, E B, Robinson, C V, and Teichmann, S A (2008). “Assembly reflects evolution of protein complexes.” Nature (London) 453, 1262–1266.
  29. Li, H B, Linke, W A, Oberhauser, A F, Carrion-Vazquez, M, Kerkvliet, J G, Lu, H, Marszalek, P E, and Fernandez, J M (2002). “Reverse engineering of the giant muscle protein titin.” Nature (London) 418, 998–1002.
  30. Liu, J, and Rost, B (2004). “CHOP: parsing proteins into structural domains.” Nucleic Acids Res. 32, W569–W571.
  31. MacDonald, R I, Musacchio, A, Holmgren, R A, and Saraste, M (1994). “Invariant tryptophan at a shielded site promotes folding of the conformational unit of spectrin.” Proc. Natl. Acad. Sci. U.S.A. 91, 1299–1303.
  32. MacDonald, R I, and Pozharski, E V (2001). “Free energies of urea and of thermal unfolding show that two tandem repeats of spectrin are thermodynamically more stable than a single repeat.” Biochemistry 40, 3974–3984.
  33. Main, E R, Lowe, A R, Mochrie, S G, Jackson, S E, and Regan, L (2005). “A recurring theme in protein engineering: the design, stability and folding of repeat proteins.” Curr. Opin. Struct. Biol. 15, 464–471.
  34. Minajeva, A, Kulke, M, Fernandez, J M, and Linke, W A (2001). “Unfolding of titin domains explains the viscoelastic behavior of skeletal myofibrils.” Biophys. J. 80, 1442–1451.
  35. Murzin, A G, Brenner, S E, Hubbard, T, and Chothia, C (1995). “SCOP: a structural classification of proteins database for the investigation of sequences and structures.” J. Mol. Biol. 247, 536–540. doi:10.1006/jmbi.1995.0159
  36. Myers, J K, Pace, C N, and Scholtz, J M (1995). “Denaturant m-values and heat capacity changes: relation to changes in accessible surface areas of protein unfolding.” Protein Sci. 4, 2138–2148.
  37. Nicola, A V, Chen, W, and Helenius, A (1999). “Co-translational folding of an alphavirus capsid protein in the cytosol of living cells.” Nat. Cell Biol. 1, 341–345.
  38. Onuchic, J N, Luthey-Schulten, Z, and Wolynes, P G (1997). “Theory of protein folding: the energy landscape perspective.” Annu. Rev. Phys. Chem. 48, 545–600. doi:10.1146/annurev.physchem.48.1.545
  39. Osvath, S, Kohler, G, Zavodszky, P, and Fidy, J (2005). “Asymmetric effect of domain interactions on the kinetics of folding in yeast phosphoglycerate kinase.” Protein Sci. 14, 1609–1616.
  40. Pace, C N (1986). “Determination and analysis of urea and guanidinium hydrochloride denaturation curves.” Methods Enzymol. 131, 266–280.
  41. Parker, M J, Spencer, J, Jackson, G S, Burston, S G, Hosszu, L L, Craven, C J, Waltho, J P, and Clarke, A R (1996). “Domain behavior during the folding of a thermostable phosphoglycerate kinase.” Biochemistry 35, 15740–15752.
  42. Pascual, J, Pfuhl, M, Walther, D, Saraste, M, and Nilges, M (1997). “Solution structure of the spectrin repeat: a left-handed antiparallel triple-helical coiled-coil.” J. Mol. Biol. 273, 740–751. doi:10.1006/jmbi.1997.1344
  43. Pfuhl, M, Improta, S, Politou, A S, and Pastore, A (1997). “When a module is also a domain: the role of the N terminus in the stability and the dynamics of immunoglobulin domains from titin.” J. Mol. Biol. 265, 242–256.
  44. Randles, L G, Batey, S, Steward, A, and Clarke, J (2008). “Distinguishing specific and nonspecific interdomain interactions in multidomain proteins.” Biophys. J. 94, 622–628.
  45. Rothlisberger, D, Honegger, A, and Pluckthun, A (2005). “Domain interactions in the Fab fragment: a comparative evaluation of the single-chain Fv and Fab format engineered with variable domains of different stability.” J. Mol. Biol. 347, 773–789.
  46. Rounsevell, R W S, Steward, A, and Clarke, J (2005). “Biophysical investigations of engineered polyproteins: implications for force data.” Biophys. J. 88, 2022–2029.
  47. Sanchez, I E, Morillas, M, Zobeley, E, Kiefhaber, T, and Glockshuber, R (2004). “Fast folding of the two-domain semliki forest virus capsid protein explains co-translational proteolytic activity.” J. Mol. Biol. 338, 159–167.
  48. Scott, K A, Batey, S, Hooton, K A, and Clarke, J (2004a). “The folding of spectrin domains. I: wild-type domains have the same stability but very different kinetic properties.” J. Mol. Biol. 344, 195–205.
  49. Scott, K A, and Clarke, J (2005). “Spectrin R16: broad energy barrier or sequential transition states?” Protein Sci. 14, 1617–1629.
  50. Scott, K A, Randles, L G, and Clarke, J (2004b). “The folding of spectrin domains II: phi-value analysis of R16.” J. Mol. Biol. 344, 207–221.
  51. Scott, K A, Steward, A, Fowler, S B, and Clarke, J (2002). “Titin; a multidomain protein that behaves as the sum of its parts.” J. Mol. Biol. 315, 819–829. doi:10.1006/jmbi.2001.5260
  52. Spitzfaden, C, Grant, R P, Mardon, H J, and Campbell, I D (1997). “Module-module interactions in the cell binding region of fibronectin: stability, flexibility and specificity.” J. Mol. Biol. 265, 565–579.
  53. Steward, A, Adhya, S, and Clarke, J (2002). “Sequence conservation in Ig-like domains: the role of highly conserved proline residues in the fibronectin type III superfamily.” J. Mol. Biol. 318, 935–940.
  54. Teichmann, S A, Chothia, C, and Gerstein, M (1999). “Advances in structural genomics.” Curr. Opin. Struct. Biol. 9, 390–399.
  55. Thanaraj, T A, and Argos, P (1996). “Ribosome-mediated translational pause and protein domain organization.” Protein Sci. 5, 1594–1612.
  56. Zarnt, T, Tradler, T, Stoller, G, Scholz, C, Schmid, F X, and Fischer, G (1997). “Modular structure of the trigger factor required for high activity in protein folding.” J. Mol. Biol. 271, 827–837.


Articles from HFSP Journal are provided here courtesy of HFSP Publishing.
