Abstract
We present here a structural and mechanistic description of how a protein changes its fold and function, mutation by mutation. Our approach was to create 2 proteins that (i) are stably folded into 2 different folds, (ii) have 2 different functions, and (iii) are very similar in sequence. In this simplified sequence space we explore the mutational path from one fold to another. We show that an IgG-binding, 4β+α fold can be transformed into an albumin-binding, 3-α fold via a mutational pathway in which neither function nor native structure is completely lost. The stabilities of all mutants along the pathway are evaluated, key high-resolution structures are determined by NMR, and an explanation of the switching mechanism is provided. We show that the conformational switch from 4β+α to 3-α structure can occur via a single amino acid substitution. On one side of the switch point, the 4β+α fold is >90% populated (pH 7.2, 20 °C). A single mutation switches the conformation to the 3-α fold, which is >90% populated (pH 7.2, 20 °C). We further show that a bifunctional protein exists at the switch point with affinity for both IgG and albumin.
Keywords: evolution, NMR, protein design, protein folding
Protein molecules are capable of self-organizing into 3D topologies that create biologic functions. The fundamental principles of how the sequence of amino acids in a protein determines its structure remain poorly understood, however, despite its central importance to biology. The primary approach to the “folding problem” has been to determine a detailed structural and energetic description of the equilibrium between the native state and the random population of disordered, unfolded states. It is well known that the equilibrium between folded and unfolded can be radically shifted in either direction with a few mutations. There is accumulating evidence, however, that a few mutations sometimes can dramatically shift the equilibrium into new tertiary (and/or quaternary) structures (1, 2). Understanding the capacity of a protein to acquire a completely different structure as a result of minor mutagenic perturbation is central to understanding both protein folding in general and more specifically how new protein structures and functions evolve. Most natural proteins populate only the native state significantly, with ΔGunfolding ≥5 kcal/mol. It is also generally assumed that many mutations are required to shift the equilibrium such that ΔGunfolding for some alternative state is ≥5 kcal/mol. This assumption underpins most bioinformatics methods, in fact. Most mutations in a protein that increase its propensity toward an alternative fold destabilize the original fold. Thus, it seems intuitive that a pathway of single amino acid substitutions would result in a long series of mutants that would be unfolded before enough folding information accumulates to significantly populate an alternative fold. Both natural and engineered examples demonstrate, however, that the sequence space separating 2 proteins with different structures can be quite small (3–5). 
To understand this seemingly paradoxical situation, one needs to methodically examine the sequence space separating 2 stable folds. In concept this is simple. One begins with 2 stable proteins of similar size but different folds and mutates one to be more like the other until a switch in structure occurs. In practice this approach is not trivial, however. Any mutation in a protein will change the context of other amino acids. This is the essence of the folding problem. Our approach, therefore, was to create a simplified sequence space in which the mutational path from one fold to another can be explored and shifts in the equilibrium between the 2 folded states (and unfolded states) can be measured as a function of mutation.
Previously we and others have studied the structure, folding, and stability of 2 binding domains of Streptococcus protein G (6). Protein G contains 2 types of domains that bind to serum proteins in blood: the GA domain of 45 structured amino acids, which binds to human serum albumin (HSA) (7, 8), and the GB domain of 56 structured amino acids, which binds to the constant (Fc) region of IgG (9, 10). The natural versions of GA and GB domains share no significant sequence homology and have different folds, 3-α and 4β+α, respectively. From these studies we have been able to create high-identity versions of GA and GB, which have wild-type stabilities and binding function but which are 77% identical. These proteins are denoted GA77 and GB77. GA77 binds to HSA with a Kd = 100 nM and has a ΔGunfolding of 5 kcal/mol (20 °C, 0.1 M KPO4, pH 7.2) (11). Amino acids 1–8 and 54–56 are disordered in GA77. The remaining 45 aa are well ordered in a 3-α helix bundle (12). GB77 binds to the constant (Fc) region of IgG with a Kd = 100 nM and has a ΔGunfolding of 5 kcal/mol (20 °C, 0.1 M KPO4, pH 7.2) (13, 14). All 56 aa of GB77 are well ordered in a 4-stranded β-sheet with an α-helix connecting strands 2 and 3 (12). The proteins were engineered such that the IgG and HSA binding epitopes are encoded in both proteins. The IgG-binding epitope is functional in the 4β+α fold and latent in the 3-α fold, whereas the albumin-binding epitope is functional in the 3-α fold and latent in the 4β+α fold. This results in an experimental system in which unmasking the latent function is linked to a switch in conformation (Fig. 1). This work is described in refs. 3, 4, 11, 12, and 15. The fact that GA77 and GB77 are so close in mutational space greatly simplifies subsequent searches of the sequence space that separates them. The context problem is not eliminated but is reduced to a practicable level. This allowed a systematic exploration of the sequence space separating these 2 functional folds.
Results
The positions of nonidentity between the GA77 and GB77 proteins are shown in Fig. 1. Our approach was to explore binary sequence space (choice of either the GA or GB amino acid at positions of nonidentity). Obviously making 13 sequential substitutions in any order for the corresponding amino acid at a position of nonidentity will result in a conformational switch. Finding the path with the fewest unstructured intermediates required a systematic approach, however. We first analyzed all 13 single-site mutants in GA77 and GB77. We were able to produce and purify folded proteins for approximately half of these mutants. Proteins were purified using an affinity-cleavage tag system that we developed (16), essentially as described in ref. 3. The system enabled the rapid, standardized purification of mutant proteins, even of low stability. Mutants were characterized by circular dichroism (CD) to assess secondary structure (Fig. S1), thermal denaturation by CD to assess stability (Fig. S2), the ability to bind HSA and IgG to assess function, and by 2D heteronuclear single quantum correlation (HSQC) spectra using NMR to assess tertiary structure. Monomeric state was established using size exclusion chromatography and multiple-angle laser-light scattering. High-resolution structures were determined by standard 3D NMR methods for key proteins. Midpoints of thermal denaturation (TM) (0.1 M KPO4, pH 7.2, 20 °C) are reported for all folded mutants in Table 1. We found that every position in one fold or the other could be mutated without uncoding the native structure. Key mutants in the pathway to a conformational switch are shown in Fig. 2. A heteromorphic pair of 88% identity (GA88 and GB88b) was found in which stability and function remain similar to some naturally occurring IgG and HSA binding domains. 
Assembling heteromorphic pairs of greater than 88% identity was also possible, although additional mutation causes stability to fall below the threshold observed for most natural proteins. The pair GA91 and GB91 have a ΔGunfolding of >3 kcal/mol at 20 °C. Both proteins show undiminished binding affinity to their respective ligands and are monomeric.
Table 1.
TM | 9 | 12 | 18 | 20 | 24 | 25 | 30 | 33 | 45 | 49 | 50 | 51 | 52 | Variant |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Mutants with GA 3-α fold | ||||||||||||||
77.5° | L | A | K | L | G | I | I | I | L | I | L | K | A | GA77 |
65.8° | L | A | K | L | A* | I | I | I | L | I | L | K | A | |
67.3° | L | A | K | L | G | T* | I | I | Y | I | L | K | A | |
62.5° | L | A | K | L | G | I | F* | I | L | I | L | K | A | |
60.2° | L | A | K | L | G | I | I | I | Y* | I | L | K | A | |
65.5° | L | A | K | L | G | I | I | I | L | I | K* | K | A | |
71.6° | L | A | K | L | G | I | I | I | L | I | L | T* | A | |
75.3° | L | A | K | L | G | I | I | I | L | I | L | K | F* | |
69.4° | L | A | K | L | G | I | I | I | L | I | L | T* | F* | GA88 |
61.5° | L | A | K | L | G | T* | I | I | L | I | L | T* | F* | GA91 |
50.0° | L | A | K | L | G | T* | I | I | L | I | K* | T* | F* | GA95 |
37.0° | L | A | K | L | G | T* | F* | I | L | I | K* | T* | F* | GA98 |
Mutants with GB 4β+α fold | ||||||||||||||
62.4° | G | L | T | A | A | T | F | Y | Y | T | K | T | F | GB77 |
63.8° | L* | A* | T | A | A | T | F | Y | Y | T | K | T | F | |
55.8° | G | L | K* | A | A | T | F | Y | Y | T | K | T | F | |
49.9° | G | L | T | L* | A | T | F | Y | Y | T | K | T | F | |
58.3° | G | L | T | A | G* | T | F | Y | Y | T | K | T | F | |
58.0° | G | L | T | A | A | I* | F | Y | Y | T | K | T | F | |
61.9° | G | L | T | A | A | T | F | I* | Y | T | K | T | F | |
60.2° | G | L | T | A | A | T | F | Y | Y | I* | K | T | F | |
57.5° | L* | A* | T | A | G* | T | F | Y | Y | I* | K | T | F | GB88b |
49.3° | L* | A* | K* | A | G* | T | F | Y | Y | I* | K | T | F | GB91 |
48.7° | L* | A* | K* | A | G* | T | F | I* | Y | I* | K | T | F | GB95 |
37.0° | L* | A* | K* | L* | G* | T | F | Y | Y | I* | K | T | F | |
35.0° | L* | A* | K* | L* | G* | T | F | I* | Y | I* | K | T | F | GB98 |
*Amino acid substitution.
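The variant names can be checked against the residues listed in Table 1. The following is a minimal sketch, assuming that the 43 positions not listed in the table are identical in every variant and that percent identity is taken over all 56 residues; the names `POSITIONS`, `VARIANTS`, `differing_positions`, and `percent_identity` are illustrative and do not come from the original work.

```python
# Residues at the 13 positions of nonidentity, copied from Table 1.
POSITIONS = [9, 12, 18, 20, 24, 25, 30, 33, 45, 49, 50, 51, 52]
VARIANTS = {
    "GA77":  "LAKLGIIILILKA",
    "GA88":  "LAKLGIIILILTF",
    "GA91":  "LAKLGTIILILTF",
    "GA95":  "LAKLGTIILIKTF",
    "GA98":  "LAKLGTFILIKTF",
    "GB77":  "GLTAATFYYTKTF",
    "GB88b": "LATAGTFYYIKTF",
    "GB91":  "LAKAGTFYYIKTF",
    "GB95":  "LAKAGTFIYIKTF",
    "GB98":  "LAKLGTFIYIKTF",
}


def differing_positions(a, b):
    """Return the residue numbers at which variants a and b differ."""
    return [pos for pos, x, y in zip(POSITIONS, VARIANTS[a], VARIANTS[b]) if x != y]


def percent_identity(a, b, length=56):
    """Percent identity over the full domain (assumed length: 56 residues)."""
    return 100.0 * (length - len(differing_positions(a, b))) / length


PAIRS = [("GA77", "GB77"), ("GA88", "GB88b"), ("GA91", "GB91"),
         ("GA95", "GB95"), ("GA98", "GB98")]

for ga, gb in PAIRS:
    diffs = differing_positions(ga, gb)
    print(f"{ga}/{gb}: {len(diffs)} differences at {diffs}, "
          f"{percent_identity(ga, gb):.1f}% identical")
```

Under these assumptions the pairs differ at 13, 7, 5, 3, and 1 positions, respectively, which is consistent with the identity levels used in the variant names.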
The binary mutational space separating GA91 and GB91 comprises only 32 sequences (2 residue choices at each of the 5 positions of nonidentity). We constructed, expressed, and purified 17 of these variants to effectively sample the sequence space (Tables 1 and S1). The main observations are as follows. The mutation Y33I has little effect on the stability of the 4β+α fold (−0.1 kcal/mol), and the mutation L50K has the least effect on the stability of the 3-α fold (−1.0 kcal/mol). Mutation at position 20, 33, or 45 is not tolerated in the 3-α fold in any of the 32 contexts (Table S1). Mutation at position 30, 45, or 50 is not tolerated in the 4β+α fold in any of the contexts. The only position that cannot be changed in either fold (without unfolding it) is position 45. Eight of the 17 proteins were predominantly folded into one of the native structures: 4 were 3-α and 4 were 4β+α. All 9 “unfolded” proteins were purified and can be seen by CD to have significant secondary structure content. The exact nature of this residual structure can probably be determined in the future using 3D NMR techniques.
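The size of this binary space follows directly from Table 1. As a rough illustration, the sketch below enumerates every combination of GA-type or GB-type residue at the 5 positions where GA91 and GB91 differ; the names `CHOICES` and `binary_space` are hypothetical, and the code only lists sequences without predicting which of them fold.

```python
from itertools import product

# (GA91 residue, GB91 residue) at each position of nonidentity, from Table 1.
CHOICES = {
    20: ("L", "A"),
    30: ("I", "F"),
    33: ("I", "Y"),
    45: ("L", "Y"),
    50: ("L", "K"),
}


def binary_space(choices):
    """Yield every GA/GB residue combination as a {position: residue} dict."""
    positions = sorted(choices)
    for combo in product(*(choices[p] for p in positions)):
        yield dict(zip(positions, combo))


variants = list(binary_space(CHOICES))
print(len(variants))   # number of sequences in the binary space
print(variants[0])     # all GA-type residues (GA91)
print(variants[-1])    # all GB-type residues (GB91)
```

The first and last entries correspond to GA91 and GB91 themselves; the remaining 30 are the intermediates from which the 17 characterized variants were drawn.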
The variants denoted GA95 and GB95 differ only at positions 20, 30, and 45, yet are fully folded, with ΔGunfolding of ≈3 kcal/mol at 20 °C (Fig. S2). Both proteins show binding affinity to their respective ligands (KD <1 μM) and are monomeric. High-resolution structures of these proteins were determined to better understand how so few amino acids control the conformational switch. This is discussed in detail below. Statistics for the GB95 and GA95 ensembles of 20 structures are shown in Table S2. Protein Data Bank accession codes for GA95 and GB95 are 2KDL and 2KDM, respectively.
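The relation between ΔGunfolding and the population of the folded state can be made concrete with a short calculation. This is a minimal sketch, assuming a simple two-state equilibrium between one folded state and the unfolded state, which is a simplification of the three-state situation (3-α, 4β+α, unfolded) considered in this work; the function names and the gas-constant value in kcal/(mol·K) are supplied here for illustration.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_folded(dg_unfolding, temp_c=20.0):
    """Folded-state population for a two-state model, given dG of unfolding in kcal/mol."""
    rt = R_KCAL * (temp_c + 273.15)
    return 1.0 / (1.0 + math.exp(-dg_unfolding / rt))


def dg_for_fraction(fraction, temp_c=20.0):
    """dG of unfolding (kcal/mol) that gives the requested folded-state population."""
    rt = R_KCAL * (temp_c + 273.15)
    return rt * math.log(fraction / (1.0 - fraction))


for dg in (1.0, 3.0, 5.0):
    print(f"dG = {dg:.1f} kcal/mol -> {100 * fraction_folded(dg):.2f}% folded")
print(f"90% folded requires dG = {dg_for_fraction(0.90):.2f} kcal/mol")
```

Under these assumptions, a ΔGunfolding of 3 kcal/mol at 20 °C corresponds to a folded population above 99%, and a population of 90% corresponds to roughly 1.3 kcal/mol.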
Mutation of I30F in GA95 (GA98) and A20L in GB95 (GB98) leads to a heteromorphic pair differing at only 1 amino acid. GA98 is folded into the 3-α conformation (>90% populated, pH 7.2, 20 °C), and GB98 is folded into the 4β+α conformation (>90% populated, pH 7.2, 20 °C). The CD spectra of GA98 and GB98 are essentially identical to the spectra of their parent GA and GB proteins (Fig. S1). The assigned HSQC spectra of GA98 and GB98 are compared in Fig. 3. GA98 exhibits diminished affinity for HSA (KD >1 μM) but has acquired affinity for IgG (KD <1 μM). GB98 binds tightly to IgG but not HSA. The ability of GA98 to bind IgG as well as HSA may reveal a hidden propensity to switch into the 4β+α conformation and unmask the IgG-binding epitope. The population of the 4β+α conformation is too low to detect in the unliganded state of GA98, but IgG binding may shift the equilibrium away from 3-α and toward 4β+α. It is also possible that GA98 binds IgG via a new mechanism. This will be sorted out using NMR to map the binding epitope (17). In either case, exploration of binary sequence space shows that one functional protein can switch into a completely different conformation with a different function via a mutational pathway in which neither function nor native structure is completely lost. The GA98 and GB98 proteins are only marginally stable, but restoration of close to wild-type stability and function in either direction can be attained with only 3 additional mutations (e.g., to GA88 and GB88b).
Overall Change in Topology.
High-resolution structures were determined for GA95 and GB95 and provide insight into the structural determinants of the conformational switch. Amino acid substitutions at 20, 30, and 45 shift the equilibrium from >99% 4β+α to >99% 3-α. Amino acids 1–8 are disordered in GA95 and form the central β1 strand in GB95 (Fig. 4). Amino acids 9–23 form helix 1 in GA95 and in GB95 form the turn between β1 and β2, the β2 strand, and the turn between β2 and the central helix. Amino acids 27–33 are helical in GA95. In GB95 the helix is slightly longer: 24–37. Amino acids 39–51 form helix 3 in GA95 and in GB95 form the turn between the central helix and β3, the β3 strand, the turn between β3 and β4, and the first part of the β4 strand. Amino acids 52–56 are disordered in GA95 and form the end of the β4 strand in GB95. Except for common helical residues 27–33, every amino acid has undergone a change in secondary structure in the switch from 3-α to 4β+α.
Hydrophobic Interactions.
The hydrophobic core of GA95 comprises A12, A16, and L20 from α1; I30, I33, and A36 from α2; and V42, K46, and I49 from α3 (Fig. 5). The aliphatic side chain of K46 contributes to packing of the hydrophobic core with the ammonium group protruding into solvent. A16, L20, I30, I33, and I49 form a tight network of mutually dependent hydrophobic contacts. L45 is 35% buried and actually not highly conserved among natural GA domains. Its principal contacts are with boundary residues Y29, L32, and the core residue I33. The hydrophobic core of GB95 comprises Y3, L5, and L7 from β1; A16 and A20 from β2; A26, F30, and A34 from the α-helix; W43 and Y45 from β3; and F52 and V54 from β4. The Y3, L5, F30, and F52 cluster accounts for ≈50% of the hydrophobic core. F30 interacts primarily with Y3 and L5 in β1 and F52 in β4. Y45 is packed against F52 and also forms a very strong H-bond with D47, which raises the pKa of Y45 to >12.0 (18). The Y45, D47, and F52 interactions within the β3-β4 hairpin help give it the independent stability observed in previous studies of protein G mutants (19). Seven critical hydrophobic residues in the 3-α core (A12, A16, I33, A36, V42, K46, and I49) are preserved in the 4β+α fold of GB95. Nine critical residues in the 4β+α fold (Y3, L5, L7, A16, A26, A34, W43, F52, and V54) are preserved in the 3-α fold of GA95. Buried residues in one fold are frequently more solvent exposed in the other fold (Fig. S3).
The Critical Role of N- and C-Termini in Switching.
One can consider the protein in 2 parts: amino acids 9–51, which are fully structured in both folds, and the 2 ends (1–8 and 52–56), which are unstructured in 3-α but form β strands in GB95 (Fig. 6). In the switch from 3-α to 4β+α, hydrophobic interactions in the core of the helix bundle are released and replaced with mutually exclusive hydrophobic interactions between the central helix (A26, F30, and A34) and the central β-strands: β1 (Y3, L5, and L7) and β4 (F52 and V54). Thus there is a competition between formation of the helix bundle and recruitment of the N- and C-terminal residues to form a β-sheet with an alternative hydrophobic core. When 20 = L, 30 = I, and 45 = L, their preferred binding partners are in the hydrophobic core of the 3-α fold (e.g., A16, I33, and I49). When 20 = A, 30 = F, and 45 = Y, their preferred binding partners are 3, 5, and 7 in the β1 strand and 52 and 54 in the β4 strand. The binary sequence space between GA95 and GB95 is only 8 combinations. In this sequence space, L vs. Y at 45 is critical to which way the balance tips. When 45 = L, variants are unfolded if 20 = A. If 45 = L and 20 = L the protein will remain predominantly 3-α, even when I30 becomes F. Likewise, when 45 = Y, variants are unfolded if 30 = I. If 45 = Y and 30 = F, the protein will remain predominantly 4β+α, even when A20 becomes L. Thus the switch can be distilled down to 1 amino acid.
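The rules stated above for the 8 combinations at positions 20, 30, and 45 can be written out as a small lookup. This sketch only restates the qualitative outcomes reported in the text for the GA95/GB95 background; the function name `predominant_state` and the state labels are illustrative, and nothing here is a predictive model.

```python
from itertools import product


def predominant_state(r20, r30, r45):
    """Reported outcome for a given residue choice at positions 20, 30, and 45."""
    if r45 == "L":
        # With L45, the 3-alpha fold persists only if position 20 is L.
        return "3-alpha" if r20 == "L" else "unfolded"
    if r45 == "Y":
        # With Y45, the 4beta+alpha fold persists only if position 30 is F.
        return "4beta+alpha" if r30 == "F" else "unfolded"
    raise ValueError("position 45 must be L or Y in this binary space")


table = {
    (r20, r30, r45): predominant_state(r20, r30, r45)
    for r20, r30, r45 in product("LA", "IF", "LY")
}

for (r20, r30, r45), state in table.items():
    print(f"20={r20} 30={r30} 45={r45}: {state}")
```

Read this way, GA98 (L20, F30, L45) and GB98 (L20, F30, Y45) sit on opposite sides of the switch and differ only at position 45.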
Thermodynamic Linkage.
To fully understand the energetics of the switch it is critical to appreciate that (i) the 2 alternative folds are thermodynamically linked, and (ii) this linkage develops only as the context of the alternative fold develops. This is best illustrated by the variations at 45 and 52. L45/Y45 residues have ≈35% solvent accessibility in both folds. In GA95, L45 is located in the α3-helix and has no direct interactions with F52. As noted above, Y45 is located in the β3-strand of GB95 and is packed against F52. Neither L45Y nor A52F mutations are catastrophic in the 3-α fold (Table 1). In GA77, the L45Y mutation costs 1.5 kcal/mol and the A52F mutation costs only 0.3 kcal/mol in ΔGunfolding. Either the Y45L or the F52A mutation strongly destabilizes the 4β+α fold, however. The ultimate tertiary structure is strongly influenced by the stability of the β3-β4 hairpin structure. When Y45 and F52 occur together, the β3-β4 hairpin is favored. In terms of logical operators this is an AND gate. Other combinations at 45 and 52 favor 3-α. A switch can result from the additive effects of destabilization of the 3-α fold by Y45 (−1.5 kcal/mol) and stabilization of the 4β+α fold when Y45 and F52 interact (+2 kcal/mol).
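The AND-gate description can be spelled out as a truth table. The sketch below is purely illustrative: `hairpin_favored` is a hypothetical helper that encodes only the qualitative statement in the text, not a measured free energy.

```python
def hairpin_favored(r45, r52):
    """True only when Y45 and F52 occur together (the AND condition in the text)."""
    return r45 == "Y" and r52 == "F"


truth_table = {
    (r45, r52): hairpin_favored(r45, r52)
    for r45 in ("L", "Y")
    for r52 in ("A", "F")
}

for (r45, r52), favored in truth_table.items():
    outcome = "beta3-beta4 hairpin favored" if favored else "3-alpha favored"
    print(f"45={r45} 52={r52}: {outcome}")
```

Only one of the four combinations satisfies the condition, which is why either the Y45L or the F52A substitution alone is enough to remove the hairpin preference.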
The default propensity of the isolated 9–51 sequence in GB98 is 3-α. The effects of short N- and C-terminal sequences can override the independent propensity. It has been demonstrated previously with short peptides that local propensities can be overcome by the larger protein context (20, 21), but here the propensity of an independently stable, 45-aa domain is overcome by 2 short sequences on its ends. The importance of the ends is not manifest until the context of the 4β+α fold develops, however. For example, the N- (1–8) and C- (52–56) termini of GB were substituted early in the engineering of GA. The changes are neutral to the stability of 3-α until the switch point is approached in mutational space. However, it should not be generally assumed that unstructured N- and C-termini are neutral to stability. In earlier work, we found that replacing 8 unstructured amino acids on the N terminus of another 3-α protein (staphylococcal protein A) with 8 aa from GB resulted in a loss in stability of 3 kcal/mol (4).
The importance of context is also revealed in the Y vs. L choice at position 45. In GA98, the L45Y mutation will switch conformation to 4β+α. In GA91 and GA95, where insufficient context is present to support a full switch to 4β+α, L45Y causes unfolding. The destabilization of 3-α may result from stabilization of a “nonnative” (β3-β4 hairpin) structure. This is consistent with a general phenomenon in protein folding. The changes in stability upon mutation result from interactions in both the native state and the unfolded state (e.g., ref. 22). In this case the “unfolded” state of 3-α probably includes the β3-β4 hairpin. In general, effects of mutation on the unfolded state are difficult to predict because specific structural elements in the unfolded state are unknown.
Bypassing the Unfolded State.
The fact that sequential mutations can switch the conformational preference of a protein without populating the unfolded state seems at first paradoxical. Once dual propensities develop, however, a critical point is reached, and mutations tend to populate the alternative state rather than the unfolded state. This is an extreme case of another well-established phenomenon in folding. Denaturation with heat or a chaotropic agent such as urea or Gu-HCl generally produces a highly disordered unfolded state characterized by exposed hydrophobic groups and high conformational entropy (23). When mutation is the denaturant (instead of heat or chaotropic solvent), the “unfolded” state is often a limited ensemble of conformations characterized by high secondary structure content, although usually not a well-defined tertiary structure. The persistence of order in mutants of low stability is consistent with basic principles: (i) proteins collapse in water into compact structures; (ii) compactness drives acquisition of secondary structure; and (iii) final structural propensities are a subtle mix of many compensating weak forces. The important general point is that many destabilizing mutations will create propensities for new folds because certain nonnative states are strongly preferred over others. New propensity for a specific tertiary structure generally may not be discernible in a molten population, but even weak propensity may lead to short mutational pathways to stable, alternative topologies. Unfolded mutants such as L45Y in GA91 and GA95 populate neither the 3-α nor 4β+α folds detectably but may contain certain structural elements of each. For example, β3-β4 hairpin structure may be forming but not β1-β2. This type of hybrid propensity may lead to short mutational paths to yet a third type of fold, were we to expand choices of amino acids and positions.
Discussion
We demonstrate that fold and function can be switched through a mutational pathway in which function is never lost and in which the unfolded state is never significantly populated. The switch from an IgG-binding, predominantly 4β+α fold into a bifunctional, predominantly 3-α fold can occur via a single amino acid substitution. The switch was achieved despite constraints common to protein engineering efforts: (i) defined end points for fold and function; and (ii) examination of a restricted (i.e., binary) sequence space. Nature does not have predetermined destinations in fold/function space or operate in restricted mutational space. Even so, numerous engineering efforts have demonstrated the potential of proteins to switch folds. This topic was reviewed in 2006 (2). There are also natural examples of fold switching. Classic examples of major conformational changes include influenza virus hemagglutinin upon interaction with the host cell (24), serpins upon proteolysis of a loop (25), and prion proteins that change from a mainly α-helical benign form (PrPC) to an infectious state (PrPSc) with increased β-sheet content (26). Another recent example is the chemokine lymphotactin, which has 2 populated conformational states under physiologic conditions: the canonical chemokine fold (3-stranded β-sheet and C-terminal α-helix) in equilibrium with an all β-sheet dimer (27). Each conformation has a different binding function. In addition, natural Cro transcription factors exist with 40% identity but different structures for the 25 aa at their C-termini (28).
Most examples of natural and engineered fold switching involve a contextual change in the protein monomer. The most common contextual change is a change in quaternary structure. For example, mutation of several core residues in protein G can result in switching from a stable monomer to a stable tetramer with many of the native secondary structural elements (29). In the prion protein example, the driving force for the conformational switch from monomer to aggregate is extensive interaction of the edges of outer β-strands in the aggregated state. Metal binding has also been demonstrated to mediate major changes in secondary, tertiary, and quaternary structure in computationally designed proteins (30–32). Although our results show a switch from one monomeric fold into another, the energetic situation is similar in some ways to conformation switching driven by ligand binding. In our case the “binding” of the N- and C-termini to form the central strands of the β-sheet in the 4β+α fold is the primary energetic driving force for the switch.
Our work also shows that latent binding function can be linked to alternative folding propensity. In fact, our selection for high-identity alternative folds relied on migration from IgG-binding to albumin-binding function (3). Latent functionality likely exists in nature but is difficult to discern because an alternative function is revealed only in an unknown alternative fold. We would suggest that when latent function exists in an accessible alternative fold, functional migration is likely. We have demonstrated in our artificial system that there is no automatic penalty associated with evolving alternative propensity or latent function. We suggest that nature will explore sequence space when there is no penalty for doing so—that is, nature will follow any functional path. Small changes in amino acid sequence can cause a protein to veer toward propensity for another fold and on occasion switch folds, resulting in functional migration. The fact that fold and function can switch through short mutational pathways suggests that functional annotation based on sequence similarity should be regarded circumspectly. Even so, similar sequence does generally imply similar structure. If a fold switch occurs in nature, sequences rapidly diverge thereafter. Lymphotactin and the Cro transcription factors seem to be 2 natural proteins that have been “caught in the act,” however (27, 28).
Fold switching seems to be more probable between some folds than others (33). Certain structural aspects of GA and GB make them amenable to switching. Many core residues in each fold are largely solvent exposed in the other fold, allowing significant folding information for both folds to coexist in a single protein. The degree to which hydrophobic core positions overlap in any 2 arbitrary folds of similar size will vary widely. This means that there may be some predictability as to what folds can come close in sequence space. Given the limited number of fold families, this observation may provide a coarse guide to how proteins migrate through fold space in the process of evolution.
Materials and Methods
Mutagenesis.
Site-directed mutants were made using QuikChange (Stratagene) mutagenesis kits according to the manufacturer's instructions.
Protein Expression and Purification.
GA and GB variants were cloned into the vector pG58 (16), which encodes an engineered subtilisin prosequence as an N-terminal fusion domain, and the resulting fusion proteins were purified using an affinity-cleavage tag system that we developed (16), essentially as described in ref. 3. A commercial version of the purification system is available through Bio-Rad Laboratories (Profinity eXact Purification System). Minimal medium (14) was used for 15N and 13C labeling. Soluble cell extract of prodomain (eXact tag) fusion protein was injected on a 5-mL S189 column at 5 mL/min to allow binding and then washed with 10 column volumes of 100 mM KPO4, pH 7.2 to remove impurities (16). To cleave and elute the purified target protein, 6 mL of 10 mM sodium azide, 100 mM KPO4, pH 7.2 was injected at 0.5 mL/min. The purified protein was then concentrated to 0.2 to 0.3 mM, as required for NMR analysis. All proteins are monomeric under these conditions. This was demonstrated by using gel filtration with an in-line multiple angle laser light scattering detector (miniDAWN TREOS; Wyatt Technology).
Binding to IgG and HSA.
IgG and HSA were immobilized on GE HT1 columns containing NHS-activated agarose resin according to the manufacturer's instructions. Binding of mutants was carried out in 0.1 M KPO4, pH 7.2 by injecting 1 mL of a 1-mg/mL solution at 0.5 mL/min. Washing was with 15 mL of 0.1 M KPO4, pH 7.2 at 1 mL/min. Elution was with 6 mL of 0.1 M phosphoric acid.
NMR Spectroscopy.
15N- and 13C/15N-labeled samples were prepared at 0.2–0.3 mM concentrations in 100 mM KPO4 buffer (pH 7.2) containing 10 mM sodium azide and 5–10% D2O. NMR spectra were collected on a Bruker AVANCE 600 MHz spectrometer fitted with a z-axis gradient 1H/13C/15N triple resonance cryoprobe. Spectra were recorded at 15 °C for GA95 and 20 °C for GB95. NMRPipe (34) was used for data processing, and analysis was done with Sparky (35). Standard triple resonance experiments were used to obtain NMR assignments. Main chain assignments were made from HNCACB, CBCA(CO)NH, HBHA(CO)NH, and HNCO spectra, whereas side chains were assigned with 3D (H)C(CO)NH-TOCSY, 3D H(CCO)NH-TOCSY, 2D CBHD, and 2D CBHE experiments. Interproton distance information was derived from 3D 15N NOESY and aliphatic and aromatic 3D 13C NOESY spectra with mixing times of 100 and 150 ms. For additional information see SI Text.
Acknowledgments.
We thank David Shortle and George Rose for helpful discussion; Yang Shen, David Baker, and Ad Bax for making results of computational analysis available before publication; and Jane Ladner for laser light scattering analysis. This work was supported by National Institutes of Health Grant GM62154.
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
See Commentary on page 21011.
This article contains supporting information online at www.pnas.org/cgi/content/full/0906408106/DCSupplemental.
References
- 1.Davidson AR. A folding space odyssey. Proc Natl Acad Sci USA. 2008;105:2759–2760. doi: 10.1073/pnas.0800030105. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Ambroggio XI, Kuhlman B. Design of protein conformational switches. Curr Opin Struct Biol. 2006;16:525–530. doi: 10.1016/j.sbi.2006.05.014. [DOI] [PubMed] [Google Scholar]
- 3.Alexander PA, He Y, Chen Y, Orban J, Bryan PN. The design and characterization of two proteins with 88% sequence identity but different structure and function. Proc Natl Acad Sci USA. 2007;104:11963–11968. doi: 10.1073/pnas.0700922104. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Alexander PA, Rozak DA, Orban J, Bryan PN. Directed evolution of highly homologous proteins with different folds by phage display: Implications for the protein folding code. Biochemistry. 2005;44:14045–14054. doi: 10.1021/bi051231r. [DOI] [PubMed] [Google Scholar]
- 5.Dalal S, Regan L. Understanding the sequence determinants of conformational switching using protein design. Protein Sci. 2000;9:1651–1659. doi: 10.1110/ps.9.9.1651. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Fahnestock SR, Alexander P, Nagle J, Filpula D. Gene for an immunoglobulin-binding protein from a Group G Streptococcus. J Bacteriol. 1986;167:870–880. doi: 10.1128/jb.167.3.870-880.1986. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Falkenberg C, Bjork L, Åkerstrom B. Localization of the binding site for streptococcal Protein G on human serum albumin. Identification of a 55-kilodalton protein G binding albumin fragment. Biochemistry. 1992;31:1451–1457. doi: 10.1021/bi00120a023. [DOI] [PubMed] [Google Scholar]
- 8. Frick IM, et al. Convergent evolution among immunoglobulin G-binding bacterial proteins. Proc Natl Acad Sci USA. 1992;89:8532–8536. doi: 10.1073/pnas.89.18.8532.
- 9. Myhre EB, Kronvall G. Heterogeneity of nonimmune immunoglobulin Fc reactivity among gram-positive cocci: Description of three major types of receptors for human immunoglobulin G. Infect Immun. 1977;17:475–482. doi: 10.1128/iai.17.3.475-482.1977.
- 10. Reis KJ, Ayoub EM, Boyle MDP. Streptococcal Fc receptors. II. Comparison of the reactivity of a receptor from a group C streptococcus with staphylococcal protein A. J Immunol. 1984;132:3098–3102.
- 11. Rozak DA, et al. Using offset recombinant polymerase chain reaction to identify functional determinants in a common family of bacterial albumin binding domains. Biochemistry. 2006;45:3263–3271. doi: 10.1021/bi051926s.
- 12. He Y, Chen Y, Alexander P, Bryan PN, Orban J. NMR structures of two designed proteins with high sequence identity but different fold and function. Proc Natl Acad Sci USA. 2008;105:14412–14417. doi: 10.1073/pnas.0805857105.
- 13. Orban J, Alexander P, Bryan P. Hydrogen-deuterium exchange in the free and immunoglobulin G-bound Protein G B-domain. Biochemistry. 1994;33:5702–5710. doi: 10.1021/bi00185a006.
- 14. Alexander P, Fahnestock S, Lee T, Orban J, Bryan P. Thermodynamic analysis of the folding of the Streptococcal Protein G IgG-binding domains B1 and B2: Why small proteins tend to have high denaturation temperatures. Biochemistry. 1992;31:3597–3603. doi: 10.1021/bi00129a007.
- 15. He Y, et al. Structure, dynamics, and stability variation in bacterial albumin binding modules: Implications for species specificity. Biochemistry. 2006;45:10102–10109. doi: 10.1021/bi060409m.
- 16. Ruan B, Fisher KE, Alexander PA, Doroshko V, Bryan PN. Engineering subtilisin into a fluoride-triggered processing protease useful for one-step protein purification. Biochemistry. 2004;43:14539–14546. doi: 10.1021/bi048177j.
- 17. He Y, Chen Y, Rozak DA, Bryan PN, Orban J. An artificially evolved albumin binding module facilitates chemical shift epitope mapping of GA domain interactions with phylogenetically diverse albumins. Protein Sci. 2007;16:1490–1494. doi: 10.1110/ps.072799507.
- 18. Khare D, et al. pKa measurements from nuclear magnetic resonance for the B1 and B2 immunoglobulin G-binding domains of protein G: Comparison with calculated values for nuclear magnetic resonance and x-ray structures. Biochemistry. 1997;36:3580–3589. doi: 10.1021/bi9630927.
- 19. Sari N, Alexander P, Bryan PN, Orban J. Structure and dynamics of an acid-denatured protein G mutant. Biochemistry. 2000;39:965–977. doi: 10.1021/bi9920230.
- 20. Minor DL, Jr, Kim PS. Context-dependent secondary structure formation of a designed protein sequence. Nature. 1996;380:730–734. doi: 10.1038/380730a0.
- 21. Anderson TA, Cordes MH, Sauer RT. Sequence determinants of a conformational switch in a protein structure. Proc Natl Acad Sci USA. 2005;102:18344–18349. doi: 10.1073/pnas.0509349102.
- 22. Flanagan JM, Kataoka M, Fujisawa T, Engelman DM. Mutations can cause large changes in the conformation of a denatured protein. Biochemistry. 1993;32:10359–10370. doi: 10.1021/bi00090a011.
- 23. Dill KA, Shortle D. Denatured states of proteins. Annu Rev Biochem. 1991;60:795–825. doi: 10.1146/annurev.bi.60.070191.004051.
- 24. Skehel JJ, Wiley DC. Receptor binding and membrane fusion in virus entry: The influenza hemagglutinin. Annu Rev Biochem. 2000;69:531–569. doi: 10.1146/annurev.biochem.69.1.531.
- 25. Gooptu B, Lomas DA. Conformational pathology of the serpins: Themes, variations, and therapeutic strategies. Annu Rev Biochem. 2009;78:147–176. doi: 10.1146/annurev.biochem.78.082107.133320.
- 26. Harrison PM, Chan HS, Prusiner SB, Cohen FE. Thermodynamics of model prions and its implications for the problem of prion protein folding. J Mol Biol. 1999;286:593–606. doi: 10.1006/jmbi.1998.2497.
- 27. Tuinstra RL, et al. Interconversion between two unrelated protein folds in the lymphotactin native state. Proc Natl Acad Sci USA. 2008;105:5057–5062. doi: 10.1073/pnas.0709518105.
- 28. Roessler CG, et al. Transitive homology-guided structural studies lead to discovery of Cro proteins with 40% sequence identity but different folds. Proc Natl Acad Sci USA. 2008;105:2343–2348. doi: 10.1073/pnas.0711589105.
- 29. Kirsten Frank M, Dyda F, Dobrodumov A, Gronenborn AM. Core mutations switch monomeric protein GB1 into an intertwined tetramer. Nat Struct Biol. 2002;9:877–885. doi: 10.1038/nsb854.
- 30. Ambroggio XI, Kuhlman B. Computational design of a single amino acid sequence that can switch between two distinct protein folds. J Am Chem Soc. 2006;128:1154–1161. doi: 10.1021/ja054718w.
- 31. Cerasoli E, Sharpe BK, Woolfson DN. ZiCo: A peptide designed to switch folded state upon binding zinc. J Am Chem Soc. 2005;127:15008–15009. doi: 10.1021/ja0543604.
- 32. Hori Y, Sugiura Y. Effects of Zn(II) binding and apoprotein structural stability on the conformation change of designed antennafinger proteins. Biochemistry. 2004;43:3068–3074. doi: 10.1021/bi035742u.
- 33. Meyerguz L, Kleinberg J, Elber R. The network of sequence flow between protein structures. Proc Natl Acad Sci USA. 2007;104:11627–11632. doi: 10.1073/pnas.0701393104.
- 34. Delaglio F, et al. NMRPipe: A multidimensional spectral processing system based on UNIX pipes. J Biomol NMR. 1995;6:277–293. doi: 10.1007/BF00197809.
- 35. Goddard TD, Kneller DG. SPARKY 3. San Francisco: University of California; 2004.
- 36. Lejon S, Frick IM, Björck L, Wikström M, Svensson S. Crystal structure and biological implications of a bacterial albumin binding module in complex with human serum albumin. J Biol Chem. 2004;279:42924–42928. doi: 10.1074/jbc.M406957200.
- 37. Derrick JP, Wigley DB. Crystal structure of a streptococcal protein G domain bound to an Fab fragment. Nature. 1992;359:752–754. doi: 10.1038/359752a0.