Protein Science : A Publication of the Protein Society. 2015 Dec 26;25(3):662–675. doi: 10.1002/pro.2860

Clusters of isoleucine, leucine, and valine side chains define cores of stability in high‐energy states of globular proteins: Sequence determinants of structure and stability

Sagar V Kathuria 1, Yvonne H Chan 1, R Paul Nobrega 1, Ayşegül Özen 1, C Robert Matthews 1
PMCID: PMC4815418  PMID: 26660714

Abstract

Measurements of protection against exchange of main chain amide hydrogens (NH) with solvent hydrogens in globular proteins have provided remarkable insights into the structures of rare high‐energy states that populate their folding free‐energy surfaces. Lacking, however, has been a unifying theory that rationalizes these high‐energy states in terms of the structures and sequences of their resident proteins. The Branched Aliphatic Side Chain (BASiC) hypothesis has been developed to explain the observed patterns of protection in a pair of TIM barrel proteins. This hypothesis supposes that the side chains of isoleucine, leucine, and valine (ILV) residues often form large hydrophobic clusters that very effectively impede the penetration of water to their underlying hydrogen bond networks and, thereby, enhance the protection against solvent exchange. The linkage between the secondary and tertiary structures enables these ILV clusters to serve as cores of stability in high‐energy partially folded states. Statistically significant correlations between the locations of large ILV clusters in native conformations and strong protection against exchange for a variety of motifs reported in the literature support the generality of the BASiC hypothesis. The results also illustrate the necessity to elaborate this simple hypothesis to account for the roles of adjacent hydrocarbon moieties in defining stability cores of partially folded states along folding reaction coordinates.

Keywords: protein stability, HX‐NMR, BASiC hypothesis, hydrophobic clusters


Abbreviations

NH

main chain amide hydrogen

BASiC

branched aliphatic side chain

ILV

isoleucine leucine valine

vdW

van der Waals

HX

hydrogen exchange

H‐bond

hydrogen bond

FIRST

floppy inclusions and rigid substructure topography

GNM

Gaussian network model

BEST

biology using ensemble‐based structural thermodynamics

αTS

alpha subunit of tryptophan synthase

sIGPS

indole‐3‐glycerolphosphate synthase from S. solfataricus

HX‐MS

HX mass spectrometry

AA

amino acid

TP

true positives

FP

false positives

FN

false negatives

TPR

random true positives

FPR

random false positives

FNR

random false negatives

SAB

surface area buried

CSU

contacts of structural units

SNase

Staphylococcal nuclease

RNase H

ribonuclease H

hFGF‐1

human acidic fibroblast growth factor

Introduction

It has been over 50 years since Walter Kauzmann published his seminal review of protein denaturation reactions.1 His analysis of the various factors involved in stabilizing the native conformation of proteins anticipated important roles for the hydrophobic effect and hydrogen bonding. Based upon solubility data for nonpolar analogs of side chains and the propensity of detergents and organic solvents to denature proteins, he reasoned that aliphatic and aromatic side chains would be preferentially sequestered in the interior of proteins. The denaturation of proteins by guanidine HCl and urea, analogs of the peptide linkage, was taken as evidence for an important contribution from main chain–main chain hydrogen bonds. His views were subsequently verified when the first crystal structure of a protein, myoglobin, appeared shortly thereafter.2 Although a lively debate has ensued about the relative importance of the hydrophobic effect and hydrogen bonding to stability,3, 4, 5, 6 it is well accepted that buried and tightly packed nonpolar side chains are crucial to the stabilization of the native, functional forms of globular proteins.

The contributions of individual nonpolar side chains to the stability of the native conformations of numerous proteins have been examined using mutational analysis.7, 8, 9 Replacements of buried nonpolar residues with alanine almost invariably decrease the stability of the native conformation, reflecting the tight packing in the native conformation and the global connectivity of the numerous noncovalent interactions that result in highly cooperative unfolding reactions. The molecular underpinnings of these observations reflect both the relative differences in the transfer free energies of the side chains from nonpolar solvents to water10, 11 and the loss in the van der Waals (vdW) interactions when large nonpolar side chains are replaced by smaller counterparts. At the point where the polypeptide chain reaches the native conformation, essentially all of the buried side chains contribute to its stability.

Where the buried side chains do distinguish themselves is in the formation of partially folded high‐energy states that exist in equilibrium with the native state. Structural information on these rare states has been inferred by measurements of the protection of main chain NHs against exchange with solvent.12 Hydrogen exchange (HX) followed by NMR is a powerful tool to study the hydrogen bond (H‐bond) network at the level of individual amino acids in these high‐energy states.13 In favorable cases, H‐bond networks in one or more partially folded states, whose size decreases with increasing free energy, have been observed.14, 15, 16, 17 The most resistant NHs to exchange are often located in sequential elements of secondary structure whose side chains interact in the native conformation. Although the implied structures of these marginally populated states have been mapped for a large number of proteins,18 there has not yet been a unifying hypothesis that rationalizes these structures in terms of their constituent side chains.

Insights into Sequence Correlates of Stability Cores from TIM Barrel Proteins

Recent results from hydrogen exchange experiments on a pair of structurally conserved (βα)8, TIM barrel proteins, the alpha subunit of tryptophan synthase from E. coli (αTS)19, 20 and the indole‐3‐glycerolphosphate synthase from S. solfataricus (sIGPS),16 provided significant insights into the role of sequence in determining the cores of stability. Native‐state HX analysis of αTS revealed a core of strong protection at the N‐terminus, (βα)1‐4 [Fig. 1(A)].20 Comparison with the energies of an early off‐pathway intermediate and a late on‐pathway intermediate on the folding free‐energy surface led to the association of the observed HX protection with the off‐pathway species. This assignment was confirmed by a mutational analysis of 10 isoleucines, leucines, and valines in the same N‐terminal region,19 participants in a very large, 31‐residue cluster of these branched aliphatic side chains. Replacement of ILVs on either the exterior of the β‐barrel or the interior of the helical shell completely eliminated the off‐pathway species; replacements at other buried ILV positions destabilized but did not eliminate this intermediate. Native‐state HX mass spectrometry (HX‐MS) analysis of similar early off‐pathway21 and late on‐pathway16 species in the sIGPS TIM barrel showed strong protection in peptic peptides whose ILV side chains form a pair of large clusters in the α2(βα)3‐5β6 region of the structure [Fig. 1(B)].

Figure 1.

HX protection patterns in TIM barrels. (A) Two‐dimensional contact map of the ILV clusters in αTS (PDB: 1BKS).20 The circles represent contacts made between any two ILV residues represented on the X‐ and Y‐axes. The size of the circle is proportional to the amount of surface area buried by the contacting residues. Three clusters are color coded as black, magenta, and green circles. The blue box represents the stability core of the protein. The purple box encompasses the most strongly protected region and represents structured regions in the off‐pathway intermediate, IBP.20 The black cluster corresponds to the structured regions of the on‐pathway intermediate, I1.19 (B) Two‐dimensional contact map of the ILV clusters in sIGPS (PDB: 2C3Z).77 The two clusters are color coded as black and magenta circles. The blue box encompasses the most strongly protected region and represents structured regions in the on‐pathway Ia intermediate. The red box encompasses additional regions that are at least moderately protected and represents structured regions in the on‐pathway Ib intermediate. The purple box represents the densest region of sequence local contacts in the protein (11 of the 22 residues are ILV), and corresponds to the protection pattern in the early, off‐pathway intermediate, IBP.16 The folding mechanisms determined for the two proteins21, 78 are shown above the corresponding contact map.

The close association of large clusters of ILV side chains with the cores of stability in partially folded states of TIM barrel proteins16, 19 has led to the proposal of the Branched Aliphatic amino acid Side Chain (BASiC) Hypothesis. In this article, we describe the rationale for the BASiC hypothesis and explore its generality for 21 globular proteins with a variety of topologies. We do so by comparing the structures of their partially folded high‐energy states, determined by native‐state HX NMR, with their ILV clusters. We support this analysis with a detailed discussion of the protection patterns of four well‐studied globular proteins in the context of their ILV clusters.

Background and Development of the BASiC Hypothesis

Because the HX protection in partially folded states of the TIM barrel proteins studied is (a) not uniform throughout the barrel and (b) not associated with a specific set of βα‐repeat units in these two proteins, it is clear that the protection does not simply reflect the (βα)8 topology. Although the ILV composition of β‐strands in (βα)8 barrels, ∼40%,22 is significantly more than the overall composition of globular proteins, ∼20%, the protection against exchange occurred preferentially in segments corresponding to large clusters of ILV side chains (10 or more ILV side chains that mutually bury more than 500 Å2), not to those in smaller clusters dispersed throughout the structures (see Supporting Information). The preferential protection of NHs against exchange with solvent in the vicinity of ILV clusters led us to reconsider the role of hydrophobicity in stabilizing proteins.

The hydrophobic effect

Hydrophobicities of amino acids are typically determined by measuring the partitioning of the amino acids between a nonpolar solvent and water. The nonpolar solvents have included cyclohexane and 1‐octanol, among others,23 and, depending upon the nonaqueous phase, the relative hydrophobicities differ to some extent from scale to scale (Supporting Information, Fig. S1). In addition to this inherent variability, the difficulty in applying any single scale is its inability to accurately reflect the complex and heterogeneous interior of a protein. Although isoleucine, leucine, and valine are among the top‐ranked hydrophobic residues on all of the scales,11, 24 the aromatics—phenylalanine, tryptophan, and tyrosine—are commonly accepted to be the most hydrophobic amino acids. On the basis of the empirical hydrophobicities of the individual side chains, therefore, it was not obvious why large clusters of ILV side chains might be responsible for preferential HX protection in partially folded states of αTS and sIGPS.

Chandler et al.25 have speculated that the docking of hydrophobic surfaces in protein folding and/or association reactions may be preceded by dewetting transitions. All‐atom molecular dynamics simulations in explicit solvent by Berne, Zhou and colleagues demonstrated that a dramatic decrease in the water density indeed occurs between the docking surfaces of two pairs of chains (the four α‐helix complex of melittin26 was separated into two‐chain pairs) as they approach each other. A similar docking of the hydrophobic interfaces for the two domains in a dioxygenase, BphC from Pseudomonas sp., however, only showed a small decrease in the density of the intervening solvent prior to docking.27 The dioxygenase interface is a heterogeneous mixture of nonpolar side chains, whereas the melittin interface is almost entirely composed of ILV side chains: 4 leucines, 3 isoleucines, 2 valines, and 1 tryptophan per chain. A subsequent molecular dynamics analysis of dewetting transitions in other multimeric protein complexes showed a preference in dewetting for those whose interface is dominated by ILV residues.28 A similar study of the density of water between the helical shell of αTS and its internal β‐strands found that rapid fluctuations occur between a dry and a wet state when they are held apart at a distance of 4 Å. The instability of the water phase is significantly higher between segments of the protein that correspond to a large ILV cluster.29

The BASiC hypothesis

The molecular basis of the BASiC hypothesis resides in the uniquely unfavorable interactions of saturated hydrocarbons with water.4 Radzicka and Wolfenden11 measured the partitioning of side chain analogs of the 20 naturally occurring amino acids between the vapor phase and water. Only alanine, glycine, isoleucine, leucine, and valine have a positive free energy of transfer from the vapor phase to water. All other side chain analogs, including those for phenylalanine, tyrosine, and tryptophan, spontaneously dissolve in water. In the context of globular proteins, however, the sec‐butyl, iso‐butyl, and iso‐propyl analogs for isoleucine, leucine, and valine, respectively, will not only be more effective than the methyl group of alanine or the hydrogen of glycine at excluding water from the main chain NH group, but also have a greater capacity for stabilizing vdW interactions. The unique properties of ILV residues are evident in their preferential sequestration in the interior of globular proteins24, 30 and, for isoleucine and valine, in the enhanced protection against exchange with solvent for their associated NHs in simple peptides.31 The absence of water removes a source of competition for the H‐bond between the N—H donor and C=O acceptor and decreases the dielectric constant of the surrounding medium. As a result, the strength of the main chain–main chain hydrogen bond is increased.

Interestingly, a further consequence of burying any residue in the interior of proteins is the ∼10% decrease in the volume of its peptide moiety.5 The decrease in the volume of the peptide group, in turn, will force a closer packing of the aliphatic side chains, enhancing their vdW interactions. Such a tightly‐packed ILV cluster can, in turn, more effectively exclude water from the underlying hydrogen bond network. The resulting synergy between the tertiary and secondary structures, based upon the exclusion of water from the polypeptide backbone, can significantly enhance the cooperativity of the folding reaction, a hallmark of globular proteins. Other nonpolar side chains, for example, phenylalanine, tyrosine, or tryptophan, or the hydrocarbon portions of polar side chains, for example, those linking the charged groups in lysine or arginine to the Cα carbon that are lower on the Wolfenden hydrophobicity scale, may also be associated with these clusters. The aliphatic components of these side chains can potentially play a supportive role in stabilizing the ILV clusters, and the aromatic components could serve as an amphipathic interface between the ILV cores and the solvent in partially folded states.

Results

The wealth of information available on the native‐state exchange properties of proteins from different topological groups made it possible to test the validity of the BASiC hypothesis. A database of 34 proteins with residue‐specific native‐state HX‐NMR information was developed from the literature to test this hypothesis (Supporting Information, Table S1).

Residue composition in the immediate vicinity of protected NHs

The residue composition within a 4 Å shell around the protected NHs shows that this region is preferentially enriched in the hydrophobic side chains of isoleucine, leucine, phenylalanine, and valine (Fig. 2). Serine, proline, and glutamic acid side chains, by contrast, are enriched around unprotected NHs (Supporting Information, Fig. S2). Only a subset of the polar and charged residues, arginine, aspartic acid, asparagine, and histidine, appears to have a slightly higher occurrence within a 4 Å shell around unprotected NHs (Supporting Information, Fig. S3). The remaining residues, alanine, cysteine, glutamine, lysine, methionine, threonine, tryptophan, and tyrosine, are equally likely to occur around either protected or unprotected NHs (Supporting Information, Fig. S4). Significance levels determined by the Mann–Whitney–Wilcoxon Rank test are reported in Supporting Information, Table S2.
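The ratio plotted for each protein in Fig. 2 can be sketched as follows. This is a minimal illustration, assuming the residues whose side chains lie within 4 Å of the protected (or unprotected) NHs have already been collected from the structure; that geometric step is not shown. The function name `shell_enrichment` and the one‐letter‐code string inputs are hypothetical choices for illustration and are not taken from the original analysis.

```python
from collections import Counter


def shell_enrichment(sequence, shell_residues, residue_type):
    """Frequency of residue_type in the shell divided by its frequency
    in the whole protein (values > 1 indicate enrichment in the shell).

    sequence       -- one-letter codes for the entire protein
    shell_residues -- one-letter codes of residues found within the shell
                      (assumed to be precomputed from the structure)
    residue_type   -- one-letter code of the residue type of interest
    """
    if not sequence or not shell_residues:
        raise ValueError("sequence and shell_residues must be non-empty")
    whole_fraction = Counter(sequence)[residue_type] / len(sequence)
    if whole_fraction == 0:
        raise ValueError("residue_type does not occur in the protein")
    shell_fraction = Counter(shell_residues)[residue_type] / len(shell_residues)
    return shell_fraction / whole_fraction


# Toy example: leucine is 20% of the protein but 50% of the shell.
ratio = shell_enrichment("AAAALLGGGG", "LLAG", "L")
```

Computing this ratio separately for the protected and unprotected shells of each protein yields two distributions per residue type, which is the kind of paired data a rank test such as the Mann–Whitney–Wilcoxon test compares.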

Figure 2.

Relative composition around main chain NHs. Box plots of the distribution in 34 proteins of a side chain type within a 4 Å shell around protected main chain NHs (grey) or unprotected main chain NHs (cross‐hatch), as a ratio of its composition in the entire protein, are shown for ILV and F side chains. These four side chain types are significantly more likely to surround a protected NH than an unprotected NH. Significance values are listed in Supporting Information, Table S2. The lower and upper limits of the box represent the first and third quartiles, respectively. The black line in the middle of the box represents the median of the distribution. The whiskers of the box plot represent the 10th and 90th percentiles. The outliers are represented as filled circles.

Distribution of residue types in relation to protected NHs

More than half of the hydrophobic residues (CFILWYV) are within 4 Å of protected NHs (Supporting Information, Fig. S5A). Side chains of CMYW residues are more likely to be within 4 Å of unprotected residues than protected residues (Supporting Information, Fig. S5B). Of the residues within the protected shell, 25–35% of the ILV residues also have an unprotected NH within 4 Å (Supporting Information, Fig. S5C). For CFWY residues, this overlap is larger; that is, more than 50% of the CFWY residues within 4 Å of a protected residue also have an unprotected residue within 4 Å (Supporting Information, Fig. S5C). This distribution pattern suggests that while the core of the proteins is composed of hydrophobic residues, only half of the ILV residues participate in this core. By contrast, the majority of CFWY residues form a boundary between the protected core and the solvent‐accessible unprotected region of the protein (see Discussion). Separated from this core cluster of ILV residues, the remaining ILV residues are distributed throughout the protein.

ILV clusters in TIM barrel proteins

The preferential association of strong protection against HX with ILV side chains was originally observed in a TIM barrel protein, αTS.20 That HX‐NMR study revealed, moreover, that the protection was not uniformly distributed across ILV residues, but rather was associated with a previously identified19 large cluster of branched aliphatic side chains. The analysis led to a definition of a cluster that required a minimum of 12 side chains, each of which had at least two mutual contacts within a 4.2 Å vdW interaction distance. This approach, however, did not directly address the issue of the contact surface area with water, a metric for the hydrophobic effect,32, 33 and it suffered from a significant sensitivity to the arbitrarily chosen vdW cut‐off distance.

We developed a metric based upon buried surface area and refined it across a set of 55 TIM barrel proteins. The network of interactions between the ILV residues was analyzed as a function of the amount of surface area buried (SAB) between pairs of side chains using the Contacts of Structural Units (CSU) algorithm.34 By examining the dependence of the cluster definition on the pairwise SAB, it was determined that an SAB of 10 Å2 between two ILV side chains led to a stable and robust definition of the clusters (Supporting Information, Fig. S6).35 Insights into the minimum size of an ILV cluster that might offer protection against HX were obtained from several previous MD simulations of model systems. Simulations on hydrocarbon side chain analogs36 and amyloidogenic peptides37, 38 revealed that a stable cluster required an SAB of ∼500 Å2. We now define clusters with the potential to offer protection against HX as requiring a minimum of 10 ILV side chains that mutually bury more than 500 Å2. These ILV clusters, determined from the structure of the native state, are presumed to play key roles in stabilizing partially folded, high‐energy states in equilibrium with the native state. Their presence is revealed by protection against hydrogen exchange with solvent.
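The cluster definition above can be read as a graph problem: ILV side chains are nodes, pairwise contacts burying enough surface area are edges, and clusters are connected components that pass the size and burial filters. The sketch below illustrates that reading only. It assumes the pairwise SAB values have already been computed (for example by a contact program such as CSU), that a contact counts when its SAB is at least 10 Å2, and that the "mutually buried" area of a cluster is the sum of its pairwise SAB values. The function name `ilv_clusters` and its parameters are hypothetical, and the exact bookkeeping used in the original work may differ.

```python
from collections import defaultdict


def ilv_clusters(pair_sab, min_pair_sab=10.0, min_size=10, min_total_sab=500.0):
    """Group ILV residues into clusters from pairwise buried surface areas.

    pair_sab -- dict mapping (residue_i, residue_j) to the surface area (in Å^2)
                buried between those two ILV side chains (assumed precomputed)
    Returns a list of (sorted_residue_ids, total_sab) for clusters with at
    least min_size members that bury more than min_total_sab in total.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Keep only contacts that meet the pairwise threshold and merge their ends.
    kept = {pair: sab for pair, sab in pair_sab.items() if sab >= min_pair_sab}
    for i, j in kept:
        parent[find(i)] = find(j)

    # Collect members and summed SAB per connected component.
    members = defaultdict(set)
    total = defaultdict(float)
    for (i, j), sab in kept.items():
        root = find(i)
        members[root].update((i, j))
        total[root] += sab

    return [
        (sorted(members[root]), total[root])
        for root in members
        if len(members[root]) >= min_size and total[root] > min_total_sab
    ]


# Toy example: a chain of 11 residues (10 contacts of 60 Å^2 each) plus a
# small 3-residue group attached only through a weak 5 Å^2 contact.
pairs = {(i, i + 1): 60.0 for i in range(1, 11)}
pairs[(11, 20)] = 5.0
pairs[(20, 21)] = 40.0
pairs[(21, 22)] = 40.0
clusters = ilv_clusters(pairs)
```

In the toy example the weak contact is discarded, so the three‐residue group stays separate and is filtered out for being too small, leaving only the 11‐residue cluster.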

It is interesting to note that neither the size nor the location of ILV clusters in TIM barrel proteins is conserved, despite the common kinetic folding mechanism.19, 21 This observation suggests that while the TIM barrel architecture requires the formation of a large cluster of ILV residues in on‐pathway intermediates, evolution does not constrain its location or extent.

Test for the validity of the BASiC hypothesis

Only 21 of the 34 proteins in the database bury a significant amount of surface area and possess clusters as defined by the above method. The remaining proteins are small domains and most have chain lengths of fewer than 100 amino acid residues. For these small proteins, mechanisms other than the burial of a large number of hydrophobic side chains are employed to stabilize the native structure. Where ILV clusters were present, we determined that a shell of 6 Å around the side chains of the cluster encompasses the associated backbone NHs. All the main chain NHs within this 6 Å shell around the cluster and at least 1 Å from the surface of the protein were predicted to be protected.
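The prediction rule in the preceding paragraph amounts to two geometric filters, sketched below. The sketch assumes that coordinates are available for each main chain NH and for the side chain atoms of the ILV cluster, and that a per‐residue distance from the protein surface has been computed beforehand. How that distance was obtained is not stated in the text, so it is simply an input here. The function name `predict_protected` and its arguments are hypothetical.

```python
import math


def predict_protected(nh_coords, cluster_atoms, depth, shell=6.0, min_depth=1.0):
    """Predict which main chain NHs are protected.

    nh_coords     -- dict: residue id -> (x, y, z) of its main chain NH
    cluster_atoms -- list of (x, y, z) for side chain atoms of the ILV cluster
    depth         -- dict: residue id -> distance of the NH from the protein
                     surface in Å (assumed to be precomputed elsewhere)
    An NH is predicted protected if it lies within `shell` Å of any cluster
    atom and at least `min_depth` Å from the surface.
    """
    predicted = []
    for residue, xyz in nh_coords.items():
        near_cluster = any(math.dist(xyz, atom) <= shell for atom in cluster_atoms)
        buried = depth[residue] >= min_depth
        if near_cluster and buried:
            predicted.append(residue)
    return sorted(predicted)


# Toy example with a single cluster atom at the origin.
nh = {1: (3.0, 0.0, 0.0), 2: (7.0, 0.0, 0.0), 3: (0.0, 5.0, 0.0)}
depths = {1: 2.0, 2: 3.0, 3: 0.5}
hits = predict_protected(nh, [(0.0, 0.0, 0.0)], depths)
```

Residue 2 fails the shell test and residue 3 fails the surface test, so only residue 1 is predicted to be protected.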

Two measures of any prediction are (1) the correct identification of strongly protected NHs and (2) a low frequency of false positives. These two measures, termed sensitivity and specificity,39 respectively, are represented for the 21 proteins in Figure 3. On average, this method of predicting cores of protein stability has a high sensitivity score (captures an average of 70% of the strongly protected NHs across 21 proteins). The specificity is lower, ∼55%, meaning that the BASiC Hypothesis overpredicts by a factor of ∼2. The examples discussed below illustrate some of the reasons for the limitation of this method.
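The two scores can be computed from sets of predicted and observed protected NHs, as in the sketch below. The formulas are assumptions, since the text does not state them. Sensitivity is taken as TP/(TP + FN). The score called specificity is taken as TP/(TP + FP), which is consistent with the statement that a value of ∼55% corresponds to overprediction by a factor of ∼2; the conventional definition, TN/(TN + FP), would additionally require a count of true negatives. The function name `prediction_scores` is hypothetical.

```python
def prediction_scores(predicted, observed):
    """Return (sensitivity, specificity) for a set of predicted protected NHs.

    predicted -- residue ids predicted to be protected
    observed  -- residue ids found experimentally to be strongly protected
    sensitivity = TP / (TP + FN)
    specificity = TP / (TP + FP)   (assumed definition, see accompanying text)
    """
    predicted, observed = set(predicted), set(observed)
    tp = len(predicted & observed)   # predicted and observed
    fp = len(predicted - observed)   # predicted but not observed
    fn = len(observed - predicted)   # observed but not predicted
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, specificity


# Toy example: 4 predictions, 3 observed protected NHs, 2 in common.
sens, spec = prediction_scores({1, 2, 3, 4}, {1, 2, 5})
```

With this definition, a score of 0.5 means that half of the predicted NHs are not observed to be protected, that is, the number of predictions is twice the number of correct ones.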

Figure 3.

Measures of success for the prediction of protection patterns. (A) Sensitivity and (B) specificity. The broken line represents the value obtained for random distribution. See Supporting Information, Table S1 for PDB identifiers.

Representative Examples of the Application of the BASiC Hypothesis to Globular Proteins

Staphylococcal nuclease

The high‐energy states in staphylococcal nuclease (SNase), an α + β protein [Fig. 4(A)], have been examined by native‐state HX‐NMR.17 The most strongly protected core of the protein involves β4, β5 and the immediately contiguous parts of β3 and α2, behaving as one structural unit in a high‐energy state. However, distal regions of α2 (L103 and V104) and β2 (L25) also demonstrate equally strong protection and likely contribute to the stability of this intermediate [Fig. 4(B)]. Application of the BASiC hypothesis [Fig. 4(C)] shows that a cluster of 13 ILV side chains contributes to the hydrophobic core. This ILV cluster intimately links β1, β2, β4, β5, and α2. The cluster also involves a single contact with α1 and with β3, and, as such, is the foundation for the packing of the native structure. Eight of the nine ILVs for which protection factors were measured have NHs with ΔG°HX values exceeding 8 kcal mol−1, including the α2 and β2 contributions to the high‐energy state.17 Twenty other NHs surrounding the ILV cluster also display protection against exchange.
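For readers unfamiliar with the link between protection factors and the free energies quoted above, the standard relation in the EX2 exchange limit is ΔG°HX = RT ln(PF), where PF is the ratio of the intrinsic to the observed exchange rate. The sketch below applies that relation. The temperature of 298.15 K is an assumption for illustration and is not taken from the cited SNase study, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def dg_hx(protection_factor, temperature_k=298.15):
    """Free energy of exchange (kcal/mol) from a protection factor, EX2 limit."""
    if protection_factor <= 0:
        raise ValueError("protection factor must be positive")
    return R_KCAL * temperature_k * math.log(protection_factor)


def protection_factor(dg_kcal, temperature_k=298.15):
    """Protection factor corresponding to a free energy of exchange (kcal/mol)."""
    return math.exp(dg_kcal / (R_KCAL * temperature_k))


# Protection factor implied by 8 kcal/mol at the assumed temperature.
pf_8 = protection_factor(8.0)
```

At the assumed temperature, a ΔG°HX of 8 kcal mol−1 corresponds to a protection factor on the order of 10^5 to 10^6, meaning exchange is slowed by that factor relative to an unprotected amide.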

Figure 4.

(A) Cartoon representation of the crystal structure of staphylococcal nuclease (PDB: 1SNP).79 The elements of secondary structure are colored, helices in cyan, strands in magenta, and loops in salmon. (B) The HX protection pattern. The backbone nitrogens of protected residues are shown as colored spheres, with blue corresponding to those that are most protected.17 Those in green form the next level of protection, followed by yellow and red. (C) The ILV cluster in SNase. The grey spheres represent the hydrophobic cluster made up of ILV residues. The proximity of the highly protected residues with the ILV core is apparent.

Mutations in the β‐sheet region have previously been shown40 to result in increases in the m‐value of chemical denaturation reactions, indicative of the destabilization of a partially folded, high‐energy state in SNase.

Ribonuclease H

Ribonuclease H (RNase H), an α/β protein [Fig. 5(A)], provides another example where an ILV cluster serves as a subset of side chains associated with the HX‐NMR‐determined stability core.14 Main chain NHs in αA and αD are most strongly protected against exchange [Fig. 5(B)]. These helices dock on each other, are rich in ILV residues, and are central to the structure of this protein. ILV cluster analysis shows that 16 ILVs form a large buried cluster in RNase H and that these side chains are derived from all 4 α‐helices and 4 of the 5 β‐strands [Fig. 5(C)]. Comparison with the HX protection map reveals that the NHs of all 16 ILVs are protected to some degree and 8 exchange through a global unfolding mechanism. The very strong protection for αA and αD may reflect the large number of intrafacial ILV contacts for their resident side chains. In a recent study, using pulse‐quench HX followed by mass spectrometry (HX‐MS),41 αA and the immediately following β4 are the first to be protected during refolding. These two elements of secondary structure are at the core of the ILV cluster. The next stage in development of structure proceeds by recruiting αD and β5 and is analogous to the high‐energy state seen in the native state HX‐NMR experiment. The inclusion of the intervening sequence, αB and αC, completes the formation of an “Icore” intermediate that spans the entire ILV cluster.

Figure 5.

(A) Cartoon representation of the crystal structure of ribonuclease H (PDB: 2RN2).80 (B) The HX protection pattern. The backbone nitrogens of protected NHs14 are shown as colored spheres. The color scheme is the same as in Fig. 4(B). (C) The ILV cluster in RNase H. The legend is the same as in Fig. 4(C).

Apo‐myoglobin

Sperm whale myoglobin is a 153‐residue helical heme‐binding protein whose apo form (without the heme moiety, apoMb) has been extensively studied by Wright et al. [Fig. 6(A)]. The residues protected against exchange in the native [Fig. 6(B)] and intermediate [Fig. 6(C)] states were identified by populating the species at pH 6.0 and pH 4.2, respectively.42 Two ILV clusters can be identified in the crystal structure of holo‐Mb43 [Fig. 6(D)]. The larger cluster comprises 19 ILV residues from αA, αB, αE, αG, and αH, while the smaller cluster comprises 7 ILV residues from αE, αF, αG, and αH. In holo‐Mb, these two clusters pack against the heme group, which possibly stabilizes the smaller cluster. The NHs protected against exchange in the native state42 surround and include the large ILV cluster. A few weakly protected residues are associated with the smaller ILV cluster, albeit in the regions proximal to the large cluster. The loss of protection in αC and αD and partly in αB and αE in the acid intermediate42 suggests that while the core of the large ILV cluster remains unperturbed, the structures in the periphery of the cluster are destabilized in the acid intermediate.

Figure 6.

(A) Cartoon representation of the crystal structure of apo‐myoglobin (PDB: 1MBC).43 Experimentally determined protection patterns42 of the (B) native state and (C) intermediate are shown as colored spheres. The color scheme is the same as in Fig. 4(B). (D) The ILV clusters in apo‐myoglobin. The grey spheres represent the large cluster of ILV residues and the small cluster is shown as grey dots.

Human acidic fibroblast growth factor (hFGF‐1)

hFGF‐1 is a β‐barrel protein [Fig. 7(A)], thought to adopt a partially unfolded state for transport across cell membranes.44, 45, 46, 47 An equilibrium intermediate, possibly representative of that partially unfolded state, is populated in the presence of a low concentration of guanidine HCl (0.96 M).47

Figure 7.

(A) Cartoon representation of the human acidic fibroblast growth factor 1, hFGF‐1 (PDB: 1JQZ).81 (B) The HX protection pattern. The backbone nitrogens of protected NHs48 are shown as colored spheres. The color scheme is the same as in Fig. 4(B). (C) The ILV cluster in hFGF‐1. The legend is the same as in Fig. 4(C). The two phenylalanine residues that border the ILV cluster are shown as green lines. F99 is contributed by β VIII and F146 by β XII.

An inspection of the HX‐NMR protection patterns of the native protein [Fig. 7(B)] reveals that the strongest protection forms a band around the interior of the entire β‐barrel, with the loops exchanging either within the dead‐time of the experiment or shortly thereafter. The stability core of the native protein includes β I–III, β VI, β VII, β X, and β XII. β XI and, to a smaller extent, the solvent‐exposed NHs of β IV show the lowest protection factors. These two strands are known binding sites for physiological substrates. The BASiC algorithm identifies one large cluster of 12 ILV residues that spans the entire N‐terminal half of the protein (β I to β VII) and includes two residues from β X [Fig. 7(C)]. β VIII and β XII border the ILV cluster and are recruited into the stability core through phenylalanine residues (F99 and F146).

Although the overall structure of the protein in the native and intermediate states is similar, the dynamics in the C‐terminal region (β IX, β X, and β XI) and, to a lesser extent, in β IV are increased in the intermediate.48 The remaining strands continue to offer protection, albeit to a lesser degree than in the native state. The stability core of this intermediate was found to include β II, β VIII, and β XII. The most significant differences between the HX patterns of the native and intermediate species are in the C‐terminal strands that are furthest from the ILV cluster; these include β IX and β XI and, to a lesser extent, β X. The increased importance of β VIII in the core of the intermediate suggests that a structural rearrangement in that region may occur either during translocation or during refolding to the native state.48

Discussion

The composition of side chains surrounding amide hydrogens protected against exchange with solvent in all four major structural classes of globular proteins49 is significantly skewed toward ILV and F side chains (Fig. 2). Of these core residues, the ILV side chains are secluded from unprotected NHs, whereas the F side chains are more likely to also have unprotected NHs within 4 Å (Supporting Information, Fig. S5C). This suggests that phenylalanine side chains tend to be found at the periphery of the protected core, where they would serve as an interface between the stability core and the solvent in a partially folded state. A 6 Å shell surrounding ILV clusters that contain at least 10 side chains and bury more than 500 Å2 of surface area captures > 70% of the NHs protected against exchange. However, this metric overpredicts by a factor of ∼2, resulting in a specificity of 55%.

These and other observations on the distribution of hydrophobic side chains (Supporting Information, Fig. S5) can be captured in a cartoon format (Fig. 8). A subset of the ILV residues forms a network of interactions that stabilizes the core of the protein. This core cluster is bounded by aromatic side chains, primarily phenylalanine and, to a smaller extent, tyrosine and tryptophan, and cysteine. The remaining ILV side chains are distributed elsewhere in the structure, along with the remaining aromatic side chains.

Figure 8.

Conceptual representation of protein structure and distribution of hydrophobic residues. A hypothetical cross‐section of a model protein is shown with the protected NHs at its core, represented by the filled brown circle. Nearly half of the ILV residues form a large ILV cluster, represented by the black outline. A 6 Å shell around this cluster encompasses, on average, ∼70% of the protected core, but extends out further than the core (∼55% specificity). The remaining half of the ILV residues form small clusters in the interior of the protein that bury <500 Å² and are not expected to offer protection in high‐energy states of proteins. Half of the remaining hydrophobic residues, namely the aromatic side chains (F, W, and Y, represented here by the light blue outlines) and C, form an amphipathic layer around the main ILV cluster that could serve as a phase boundary in high‐energy states. The remaining aromatic side chains are distributed elsewhere in the protein, potentially mediating functional roles of the protein.

Comparison with other algorithms that predict HX protection patterns

An algorithm to identify the rigid core of proteins, floppy inclusions, and rigid substructure topography (FIRST),50 uses an all‐atom model to define constraints on the topology. A simulated thermal denaturation then systematically removes weak hydrogen bonds until only the most rigid core of the protein remains. This method is very effective at identifying the stable core of the protein at the secondary structure level.39 The FIRST algorithm was supplemented by including local motions in a coarse‐grained Gaussian Network Model (GNM) analysis that identifies “kinetically hot” residues as peaks in the fastest vibrational modes.51 The FIRST method provided “sensitivity” to the position of the core, while the GNM fast modes provided “specificity” in distinguishing protected from unprotected residues in the identified core.

Another algorithm, COREX/BEST, developed by Freire, Hilser and colleagues,52, 53 calculates probabilities of populating an exhaustive sampling of partially unfolded structures to determine the potential for each residue to undergo HX. The algorithm uses a sliding window to divide the protein into a manageable number of states, with a defined number of residues in either native‐like or non‐native conformations. Since the number of possible conformations of each non‐native residue in a state would compound the number of sub‐states by several orders of magnitude, a statistical thermodynamic approach is used to determine the stability of each state. The known dependence of changes in enthalpy, solvation entropy, and heat capacity on the changes in buried surface area of polar and nonpolar residues, and the empirically determined values for the conformational entropy change of each residue,54, 55 are used to determine the energetic impact of each non‐native residue in a given state. The thermodynamic stability of each state, thus calculated, is used to determine the probability of the state occurring relative to all the other states generated in the original ensemble. The thermodynamic stability of each residue is then determined by the ratio of the summed probability of structures in which it appears in a native‐like conformation to the summed probability of structures in which it appears in a non‐native conformation. The lower the thermodynamic stability of a residue, the higher the likelihood of it undergoing HX with solvent.
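The residue‐level bookkeeping described above can be illustrated with a minimal Python sketch. It is not the COREX/BEST implementation: the function name, the input layout (a list of state free energies paired with the set of native‐like residues in each state), and the choice to report the natural logarithm of the native/non‐native probability ratio are all assumptions made for illustration. The free energies themselves, which COREX/BEST derives from surface‐area‐based energetics, are simply taken as given here.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def residue_log_stabilities(states, n_residues, temperature_k=298.15):
    """Hypothetical helper: per-residue ln(stability constant) from an ensemble.

    states: list of (delta_g_kcal, native_like) pairs, where delta_g_kcal is the
    free energy of the state relative to a common reference and native_like is
    the set of residue indices that are native-like in that state.
    """
    rt = R_KCAL * temperature_k
    # Boltzmann weight of each state, normalized to a probability.
    weights = [math.exp(-dg / rt) for dg, _ in states]
    partition = sum(weights)
    probs = [w / partition for w in weights]

    log_stability = []
    for j in range(n_residues):
        p_native = sum(p for p, (_, nat) in zip(probs, states) if j in nat)
        p_non_native = sum(p for p, (_, nat) in zip(probs, states) if j not in nat)
        if p_non_native == 0.0:
            # Residue is native-like in every enumerated state.
            log_stability.append(math.inf)
        elif p_native == 0.0:
            log_stability.append(-math.inf)
        else:
            log_stability.append(math.log(p_native / p_non_native))
    return log_stability


# Toy ensemble of three states for a three-residue chain (illustrative values).
toy_states = [(0.0, {0, 1, 2}), (2.0, {0, 1}), (4.0, {0})]
log_k = residue_log_stabilities(toy_states, n_residues=3)
```

In this toy ensemble, a residue that is native‐like in more of the low‐energy states receives a larger value, which corresponds to a lower predicted likelihood of exchange.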

A comparison of the predictions of the GNM, COREX/BEST, and BASiC algorithms with the experimentally determined protection patterns (Fig. 9) suggests that all three techniques are sensitive to the regions of protection in the protein (70–80%),39, 52, 53 but the degree of specificity varies from protein to protein and from one technique to another. A consensus set of residues predicted to be protected by all three techniques (Fig. 9) improves the likelihood of a positive match with experimental data by ∼15% over the BASiC algorithm alone. All these approaches are modeled on inter‐residue interactions in the native state and perhaps respond to a set of inter‐related phenomena. GNM is directly sensitive to packing densities, COREX/BEST to the statistical thermodynamic properties of each residue, and BASiC to the specific sequence composition of the hydrophobic core. One might suppose that higher packing densities would enhance van der Waals interactions between nonpolar side chains, linking all three algorithms. The BASiC hypothesis differentiates itself by relating the HX protection patterns to the underlying sequence of the proteins. This feature may be useful in the de novo design of hyperstable proteins by biasing the interior to favor the formation of ILV clusters.56, 57
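The consensus set can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and it assumes that the GNM and COREX/BEST outputs have been converted to per‐residue scores in which a higher value means "more likely protected" (for GNM, for example, a negated fluctuation). Following the Figure 9 caption, the number of residues selected from each scored method is matched to the number predicted by BASiC.

```python
def top_n(scores, n):
    """Return the indices of the n highest-scoring residues as a set."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:n])


def consensus_protected(basic_pred, gnm_scores, corex_scores):
    """Residues predicted to be protected by all three methods.

    basic_pred: set of residue indices predicted by the BASiC approach.
    gnm_scores, corex_scores: per-residue scores, higher = more protected
    (an assumed convention, not the native output of either tool).
    """
    n = len(basic_pred)
    return basic_pred & top_n(gnm_scores, n) & top_n(corex_scores, n)


# Toy example with six residues (illustrative values only).
basic = {1, 2, 3}
gnm = [0.1, 0.9, 0.8, 0.2, 0.7, 0.0]
corex = [0.0, 0.5, 0.9, 0.8, 0.1, 0.2]
shared = consensus_protected(basic, gnm, corex)
```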

Figure 9.

Sequence comparison of experimental and predicted protection against HX for (A) staphylococcal nuclease, (B) ribonuclease H, (C) apo‐myoglobin, and (D) hFGF‐1. The experimentally determined protected residues are highlighted in red on the sequence, line E. The residues predicted to have the least flexibility by GNM, line G, the residues predicted to have the highest thermodynamic stability by COREX/BEST, line C, and the residues predicted to be protected by the BASiC algorithm, line B, are highlighted in red on the sequence of the protein. The cutoff percentile for selecting the protected residues by the GNM and COREX/BEST algorithms was based on the number of residues predicted by BASiC for each protein. The secondary structure elements are represented above the sequence of the protein (α helices in red, β strands in green, and loops in blue).82 Residues that are predicted to be protected by all three algorithms are marked with an asterisk below them; the asterisk is colored red if the residue is also protected in the experimental data.

Generality of the BASiC hypothesis

The BASiC hypothesis was developed to explain the favored formation of intramolecular clusters of branched aliphatic side chains in partially folded states of globular proteins. The concept can readily be extended to other common motifs and to intermolecular complexes that stabilize protein–protein interactions. For example, ILV clusters are apparent in ankyrin repeat proteins,58 leucine‐rich repeat proteins,59 left‐handed β‐helix proteins,60 and de novo designed βα‐repeat proteins.56 Protein–protein interactions can also be mediated by ILV clusters. Examples include leucine zipper coiled coils,61 calmodulin/peptide complexes,62 SNARE complexes responsible for protein trafficking,63 dimerization domains of Hsp90,64 and IRF5,65 and chaperone/client interactions for GroEL66 and DnaK.67 ILV clusters may also be involved in the formation of amyloids thought to be responsible for human pathologies68, 69 and infectious prions.70 All these potential applications would serve to reemphasize the role of the amino acid sequence in folding and stability, as originally envisioned by Anfinsen et al.

Materials and Methods

Databases

A database of 71 nonredundant TIM barrel structures created by Gromiha et al.71 was employed to define characteristic properties of ILV clusters. A database of site‐specific native‐state HX‐NMR information was created from a list of 29 proteins compiled by Rader and Bahar39 and by Li and Woodward.72 This dataset was expanded to 34 proteins with results from the recent literature (Supporting Information, Table S1).

Definition of protection against hydrogen exchange

Main‐chain NHs that were not fully exchanged with solvent over the initial collection time of a ¹⁵N–¹H HSQC experiment, typically 15–30 min, were defined as protected against hydrogen exchange.

Determination of side chain composition around NHs

The side chains that are within 4 Å of each NH were determined from the crystal structure and classified as those surrounding protected NHs (set P) and those surrounding unprotected NHs (set U). The percentage distribution of an amino acid (AA) type in each set was determined for each protein (e.g., percent Ile in set P of protein A = number of Ile in set P of protein A times 100 divided by the total number of Ile in protein A). The percentage overlap of an AA type between the two sets was determined for each protein (e.g., percent overlap of Ile for protein A = number of Ile residues that occur in both set P and set U of protein A times 100 divided by the total number of Ile in protein A). The percentage composition of each AA type within each of the two sets was determined for each protein (e.g., percent composition of Ile in set P of protein A = number of Ile in set P of protein A times 100 divided by the total number of residues in set P of protein A).
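The three percentages defined above can be written out as a short Python sketch. The function name and data layout are assumptions made for illustration: residues are identified by integer IDs, `residue_types` maps each ID to a one‐letter amino acid code, and `set_p` and `set_u` hold the IDs of residues whose side chains lie within 4 Å of protected and unprotected NHs, respectively. Identifying those neighbors from a crystal structure is outside the scope of the sketch.

```python
def composition_stats(residue_types, set_p, set_u, aa):
    """Percent distribution, overlap, and composition for one AA type.

    residue_types: dict mapping residue id -> one-letter code (whole protein).
    set_p, set_u: sets of residue ids surrounding protected / unprotected NHs.
    aa: one-letter code of the amino acid type of interest.
    """
    total_aa = sum(1 for t in residue_types.values() if t == aa)
    n_p = sum(1 for r in set_p if residue_types[r] == aa)
    n_u = sum(1 for r in set_u if residue_types[r] == aa)
    n_both = sum(1 for r in set_p & set_u if residue_types[r] == aa)

    def pct(part, whole):
        # Return 0 rather than dividing by zero when a set or AA type is absent.
        return 100.0 * part / whole if whole else 0.0

    return {
        "pct_distribution_P": pct(n_p, total_aa),
        "pct_distribution_U": pct(n_u, total_aa),
        "pct_overlap": pct(n_both, total_aa),
        "pct_composition_P": pct(n_p, len(set_p)),
        "pct_composition_U": pct(n_u, len(set_u)),
    }


# Toy protein with six residues (illustrative only).
types = {1: "I", 2: "I", 3: "L", 4: "F", 5: "I", 6: "K"}
stats_ile = composition_stats(types, set_p={1, 2, 3}, set_u={2, 4, 6}, aa="I")
```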

The relative distributions were determined as the ratio of the percentage composition of an AA type in a set (set P or set U) of a protein to the percentage composition of that AA type in the entire protein. The significance of the difference between the relative distributions of each AA type in set P and set U was assessed using the Mann–Whitney–Wilcoxon test.73
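As a rough illustration of this comparison, the sketch below computes a relative distribution and the Mann–Whitney U statistic in plain Python. The function names and the toy values are hypothetical. Only the U statistic is computed; obtaining a p‐value requires the null distribution of U, for which a statistics library routine such as `scipy.stats.mannwhitneyu` would normally be used.

```python
def relative_distribution(pct_in_set, pct_in_protein):
    """Ratio of an AA type's percent composition in a set to that in the protein."""
    return pct_in_set / pct_in_protein


def mann_whitney_u(x, y):
    """U statistic for sample x versus sample y.

    Counts the pairs (a, b) with a from x and b from y in which a > b;
    ties contribute one half.
    """
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Toy relative distributions of one AA type across three proteins.
rel_p = [1.5, 2.0, 1.8]   # set P, one value per protein (illustrative)
rel_u = [0.5, 0.9, 1.5]   # set U, one value per protein (illustrative)
u_p = mann_whitney_u(rel_p, rel_u)
u_u = mann_whitney_u(rel_u, rel_p)
```

The two U values always sum to the product of the sample sizes, which provides a quick consistency check on the calculation.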

Cluster definition and prediction of protected NHs

The contact surface area between residues was calculated using the method described by Sobolev et al.,34 and a contact between any two ILV residues that buries more than 10 Å² of surface area in either partner was considered to be a structurally significant contact. A network of such contacts that comprised more than 10 side chains and buried more than 500 Å² was considered to be an ILV cluster capable of providing protection against HX. All the NHs inside a 6 Å shell around the cluster and not within 1 Å of the surface of the protein were predicted to be protected.
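The clustering step amounts to finding connected components in a contact graph and then applying the size and area thresholds. The Python sketch below shows one way this could be done. It is not the authors' implementation: the function names and input formats are hypothetical, the per‐contact buried areas are assumed to have been computed beforehand (for example, by the method of Sobolev et al.), and the total buried area of a cluster is taken here as the sum of the areas buried on each member by its significant contacts, which is an assumption about how the 500 Å² criterion is evaluated.

```python
from collections import defaultdict


def ilv_clusters(contacts, min_contact=10.0, min_size=10, min_buried=500.0):
    """Group ILV residues into clusters via significant pairwise contacts.

    contacts: iterable of (res_a, res_b, area_a, area_b), where area_a and
    area_b are the surface areas (in square angstroms) buried on each partner.
    Returns a list of (member_set, total_buried_area) for qualifying clusters.
    """
    graph = defaultdict(set)
    buried = defaultdict(float)
    for a, b, area_a, area_b in contacts:
        # Significant if the contact buries > min_contact in either partner.
        if max(area_a, area_b) > min_contact:
            graph[a].add(b)
            graph[b].add(a)
            buried[a] += area_a
            buried[b] += area_b

    clusters, seen = [], set()
    for start in list(graph):
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        total = sum(buried[r] for r in component)
        if len(component) > min_size and total > min_buried:
            clusters.append((component, total))
    return clusters


def predicted_protected(nh_info, shell=6.0, surface_margin=1.0):
    """nh_info: dict residue -> (distance to cluster, distance to surface)."""
    return {r for r, (d_cluster, d_surface) in nh_info.items()
            if d_cluster <= shell and d_surface > surface_margin}
```

A call such as `ilv_clusters(contact_list)` followed by `predicted_protected(nh_distances)` would reproduce the two stages described in the text, given precomputed contact areas and distances.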

Sensitivity and specificity

The pattern of protection predicted by the above method was compared with the experimentally determined protection pattern. True positives (TP) were defined as those NHs that occur in both the predicted and experimental data sets. False positives (FP) were defined as those that are in the predicted data set but not in the experimental data set. False negatives (FN) were defined as those that appear in the experimental data set but not in the predicted data set.

Given the number of residues in the predicted data set and the experimental data set, the number of TP expected by random chance (TPR) was determined by multiplying the total number of residues by the product of the fractions represented by each set, that is, total number of residues times (number of protected residues predicted divided by total number of residues) times (number of protected residues experimentally determined divided by the total number of residues). The number of FP expected by random chance (FPR) was determined as TP + FP − TPR. The number of FN expected by random chance (FNR) was determined as TP + FN − TPR.

The sensitivity of the prediction was defined as the ratio of the number of correctly predicted NHs to the total number of NHs protected as experimentally determined, TP/(TP + FN). The specificity of the prediction was defined as the ratio of the number of correctly predicted NHs to the total number of NHs predicted to be protected, TP/(TP + FP). Similarly, the sensitivity and specificity of a random sampling were determined using the corresponding calculated variables, TPR, FPR, and FNR.
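These definitions can be collected into one small Python function. The sketch follows the verbal definitions: sensitivity is the fraction of experimentally protected NHs that were predicted, TP/(TP + FN), and specificity is the fraction of predicted NHs that are experimentally protected, TP/(TP + FP). The function name and the use of sets of residue indices are assumptions for illustration, and both sets are assumed to be non‐empty.

```python
def prediction_metrics(predicted, experimental, n_residues):
    """Sensitivity and specificity of a prediction and of a random sampling.

    predicted, experimental: non-empty sets of residue indices.
    n_residues: total number of residues in the protein.
    """
    tp = len(predicted & experimental)
    fp = len(predicted - experimental)
    fn = len(experimental - predicted)

    # Expected counts if the same number of residues were picked at random.
    tpr = n_residues * (len(predicted) / n_residues) * (len(experimental) / n_residues)
    fpr = tp + fp - tpr
    fnr = tp + fn - tpr

    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tp / (tp + fp),
        "random_sensitivity": tpr / (tpr + fnr),
        "random_specificity": tpr / (tpr + fpr),
    }


# Toy example: 20 residues, 4 predicted, 6 protected experimentally.
metrics = prediction_metrics({1, 2, 3, 4}, {3, 4, 5, 6, 7, 8}, n_residues=20)
```

For the toy input, the prediction recovers 2 of the 6 experimentally protected residues, and 2 of its 4 predictions are correct, whereas random sampling of the same size would be expected to yield 1.2 true positives.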

Gaussian network model

The protein structure is represented as an elastic network of nodes, in which each node represents an α‐carbon atom and its contacts with other α‐carbon atoms within a 7 Å cut‐off radius are represented by Hookean springs of uniform force constant. The high‐frequency (fast) mode fluctuations were calculated, and an average of the 10 fastest modes was used as a predictor of kinetic hot spots.51
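A minimal GNM calculation along these lines can be sketched with NumPy. This is an illustration under stated assumptions, not the code used in the study: the function name is hypothetical, the network is assumed to be connected, and the "average of the 10 fastest modes" is taken here as the average of the squared eigenvector components weighted by the inverse eigenvalue, which is a common convention but is not spelled out in the text.

```python
import numpy as np


def gnm_fast_mode_profile(ca_coords, cutoff=7.0, n_fast=10):
    """Per-residue fluctuation profile from the fastest GNM modes.

    ca_coords: (N, 3) array-like of alpha-carbon coordinates in angstroms.
    Returns an array of length N; peaks suggest kinetically hot residues.
    """
    coords = np.asarray(ca_coords, dtype=float)
    n = len(coords)

    # Pairwise distances and contact map (self-contacts excluded).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    contact = (dist <= cutoff) & ~np.eye(n, dtype=bool)

    # Kirchhoff (connectivity) matrix: degree on the diagonal, -1 per contact.
    kirchhoff = np.diag(contact.sum(axis=1)).astype(float) - contact.astype(float)

    # eigh returns eigenvalues in ascending order; the fastest modes are last.
    eigvals, eigvecs = np.linalg.eigh(kirchhoff)
    k = min(n_fast, n - 1)  # skip the zero mode of a connected network
    fast_vals = eigvals[-k:]
    fast_vecs = eigvecs[:, -k:]

    # Weighted average of squared mode shapes (weights = 1 / eigenvalue).
    weights = 1.0 / fast_vals
    return (fast_vecs ** 2 * weights).sum(axis=1) / weights.sum()


# Toy input: 12 pseudo-residues spaced 3.8 angstroms apart along a line.
toy_coords = [(3.8 * i, 0.0, 0.0) for i in range(12)]
profile = gnm_fast_mode_profile(toy_coords)
```

Because each eigenvector is normalized, the returned profile sums to one, so values can be compared across residues as fractional contributions to the fast‐mode motion.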

COREX/BEST

The online tool at http://best.bio.jhu.edu/BEST/ was used to determine the COREX/BEST predictions for the stability of each residue. The exhaustive enumeration mode was employed with a window size of 8 residues and a minimum window size of 4 (for boundary conditions) to generate the protein ensemble. The entropy weighting factor was determined using the known stability of each protein (SNase, 8 kcal mol⁻¹ at 15°C;74 RNaseH, 10 kcal mol⁻¹ at 25°C;75 apo‐myoglobin, 6.7 kcal mol⁻¹ at 25°C;76 and hFGF‐1, 5 kcal mol⁻¹ at 25°C48).

ILV cluster analysis

A web‐based tool for calculating clusters of different residues is available at http://biotools.umassmed.edu/ccss/ccssv2/basic.cgi

Supporting information

Supporting Information

Acknowledgments

The authors would like to thank Noah Cohen for his help with data mining; Jill Zitzewitz and Osman Bilsel for valuable discussions and editing the manuscript; Can Kayatekin for helpful discussions; and Ramakrishna Vadrevu, Ying Wu, and Zhenyu Gu for their insights in the TIM barrel data analysis. A special thanks to David Lapointe and Rahul Metta for developing and hosting the web‐based version of the cluster algorithm.

C. Robert Matthews is the recipient of the Protein Society 2015 Carl Brändén Award.

Author Contributions: The BASiC hypothesis was conceived by CRM. SVK developed the database. SVK, YC, and RPN analyzed the database. AÖ analyzed the GNM results. CRM and SVK wrote the manuscript.

References

  • 1. Kauzmann W (1959) Some factors in the interpretation of protein denaturation. Adv Protein Chem 14:1–63.
  • 2. Kendrew JC (1961) The three‐dimensional structure of a protein molecule. Sci Am 205:96–110.
  • 3. Pace CN (2001) Polar group burial contributes more to protein stability than nonpolar group burial. Biochemistry 40:310–313.
  • 4. Southall NT, Dill KA, Haymet ADJ (2002) A view of the hydrophobic effect. J Phys Chem B 106:521–533.
  • 5. Schell D, Tsai J, Scholtz JM, Pace CN (2006) Hydrogen bonding increases packing density in the protein interior. Proteins Struct Funct Bioinform 63:278–282.
  • 6. Pace CN, Fu H, Fryar KL, Landua J, Trevino SR, Shirley BA, Hendricks MM, Iimura S, Gajiwala K, Scholtz JM, Grimsley GR (2011) Contribution of hydrophobic interactions to protein stability. J Mol Biol 408:514–528.
  • 7. Matouschek A, Fersht AR (1991) Protein engineering in analysis of protein folding pathways and stability. Methods Enzymol 202:82–112.
  • 8. Shortle D (1992) Mutational studies of protein structures and their stabilities. Q Rev Biophys 25:205–250.
  • 9. Srivastava AK, Sauer RT (2002) Mutational studies of protein stability and folding of the hyperstable MYL Arc repressor variant. Biophys Chem 101–102:35–42.
  • 10. Nozaki Y, Tanford C (1971) The solubility of amino acids and two glycine peptides in aqueous ethanol and dioxane solutions. Establishment of a hydrophobicity scale. J Biol Chem 246:2211–2217.
  • 11. Radzicka A, Wolfenden R (1988) Comparing the polarities of the amino acids: side‐chain distribution coefficients between the vapor phase, cyclohexane, 1‐octanol, and neutral aqueous solution. Biochemistry 27:1664–1670.
  • 12. Bai Y, Englander SW (1996) Future directions in folding: the multi‐state nature of protein structure. Proteins Struct Funct Bioinform 24:145–151.
  • 13. Krishna MM, Hoang L, Lin Y, Englander SW (2004) Hydrogen exchange methods to study protein folding. Methods 34:51–64.
  • 14. Chamberlain AK, Handel TM, Marqusee S (1996) Detection of rare partially folded molecules in equilibrium with the native conformation of RNaseH. Nat Struct Biol 3:782–787.
  • 15. Rojsajjakul T, Wintrode P, Vadrevu R, Robert Matthews C, Smith DL (2004) Multi‐state unfolding of the alpha subunit of tryptophan synthase, a TIM barrel protein: insights into the secondary structure of the stable equilibrium intermediates by hydrogen exchange mass spectrometry. J Mol Biol 341:241–253.
  • 16. Gu Z, Zitzewitz JA, Matthews CR (2007) Mapping the structure of folding cores in TIM barrel proteins by hydrogen exchange mass spectrometry: the roles of motif and sequence for the indole‐3‐glycerol phosphate synthase from Sulfolobus solfataricus. J Mol Biol 368:582–594.
  • 17. Bedard S, Mayne LC, Peterson RW, Wand AJ, Englander SW (2008) The foldon substructure of staphylococcal nuclease. J Mol Biol 376:1142–1154.
  • 18. Brockwell DJ, Radford SE (2007) Intermediates: ubiquitous species on folding energy landscapes? Curr Opin Struct Biol 17:30–37.
  • 19. Wu Y, Vadrevu R, Kathuria S, Yang X, Matthews CR (2007) A tightly packed hydrophobic cluster directs the formation of an off‐pathway sub‐millisecond folding intermediate in the alpha subunit of tryptophan synthase, a TIM barrel protein. J Mol Biol 366:1624–1638.
  • 20. Vadrevu R, Wu Y, Matthews CR (2008) NMR analysis of partially folded states and persistent structure in the alpha subunit of tryptophan synthase: implications for the equilibrium folding mechanism of a 29‐kDa TIM barrel protein. J Mol Biol 377:294–306.
  • 21. Gu Z, Rao MK, Forsyth WR, Finke JM, Matthews CR (2007) Structural analysis of kinetic folding intermediates for a TIM barrel protein, indole‐3‐glycerol phosphate synthase, by hydrogen exchange mass spectrometry and Go model simulation. J Mol Biol 374:528–546.
  • 22. Janin J, Chothia C (1980) Packing of α‐helices onto β‐pleated sheets and the anatomy of α/β proteins. J Mol Biol 143:95–128.
  • 23. Biswas KM, DeVido DR, Dorsey JG (2003) Evaluation of methods for measuring amino acid hydrophobicities and interactions. J Chromatogr A 1000:637–655.
  • 24. Rose GD, Wolfenden R (1993) Hydrogen bonding, hydrophobicity, packing, and protein folding. Annu Rev Biophys Biomol Struct 22:381–415.
  • 25. Liu P, Huang X, Zhou R, Berne BJ (2005) Observation of a dewetting transition in the collapse of the melittin tetramer. Nature 437:159–162.
  • 26. ten Wolde PR, Chandler D (2002) Drying‐induced hydrophobic polymer collapse. Proc Natl Acad Sci USA 99:6539–6543.
  • 27. Zhou R, Huang X, Margulis CJ, Berne BJ (2004) Hydrophobic collapse in multidomain protein folding. Science 305:1605–1609.
  • 28. Hua L, Huang XH, Liu P, Zhou RH, Berne BJ (2007) Nanoscale dewetting transition in protein complex folding. J Phys Chem B 111:9069–9077.
  • 29. Das P, Kapoor D, Halloran KT, Zhou R, Matthews CR (2013) Interplay between drying and stability of a TIM barrel protein: a combined simulation‐experimental study. J Am Chem Soc 135:1882–1890.
  • 30. Pace CN, Trevino S, Prabhakaran E, Scholtz JM (2004) Protein structure, stability and solubility in water and other solvents. Philos Trans R Soc Lond B Biol Sci 359:1225–1234; discussion 1234−1225.
  • 31. Bai Y, Englander SW (1994) Hydrogen bond strength and beta‐sheet propensities: the role of a side chain blocking effect. Proteins 18:262–266.
  • 32. Lee B, Richards FM (1971) The interpretation of protein structures: estimation of static accessibility. J Mol Biol 55:379–400.
  • 33. Sharp KA, Nicholls A, Friedman R, Honig B (1991) Extracting hydrophobic free energies from experimental data: relationship to protein folding and theoretical models. Biochemistry 30:9686–9697.
  • 34. Sobolev V, Sorokine A, Prilusky J, Abola EE, Edelman M (1999) Automated analysis of interatomic contacts in proteins. Bioinformatics 15:327–332.
  • 35. Kathuria SV, Day IJ, Wallace LA, Matthews CR (2008) Kinetic traps in the folding of beta/alpha‐repeat proteins: CheY initially misfolds before accessing the native conformation. J Mol Biol 382:467–484.
  • 36. Raschke TM, Tsai J, Levitt M (2001) Quantification of the hydrophobic interaction by simulations of the aggregation of small hydrophobic solutes in water. Proc Natl Acad Sci USA 98:5965–5969.
  • 37. Zanuy D, Ma B, Nussinov R (2003) Short peptide amyloid organization: stabilities and conformations of the islet amyloid peptide NFGAIL. Biophys J 84:1884–1894.
  • 38. Hills RD Jr, Brooks CL III (2007) Hydrophobic cooperativity as a mechanism for amyloid nucleation. J Mol Biol 368:894–901.
  • 39. Rader AJ, Bahar I (2004) Folding core predictions from network models of proteins. Polymer 45:659–668.
  • 40. Shortle D, Stites WE, Meeker AK (1990) Contributions of the large hydrophobic amino acids to the stability of staphylococcal nuclease. Biochemistry 29:8033–8041.
  • 41. Hu W, Walters BT, Kan ZY, Mayne L, Rosen LE, Marqusee S, Englander SW (2013) Stepwise protein folding at near amino acid resolution by hydrogen exchange and mass spectrometry. Proc Natl Acad Sci USA 110:7684–7689.
  • 42. Hughson FM, Wright PE, Baldwin RL (1990) Structural characterization of a partly folded apomyoglobin intermediate. Science 249:1544–1548.
  • 43. Kuriyan J, Wilz S, Karplus M, Petsko GA (1986) X‐ray structure and refinement of carbon‐monoxy (Fe II)‐myoglobin at 1.5 Å resolution. J Mol Biol 192:133–154.
  • 44. Dabora JM, Sanyal G, Middaugh CR (1991) Effect of polyanions on the refolding of human acidic fibroblast growth factor. J Biol Chem 266:23637–23640.
  • 45. Wiedlocha A, Madshus IH, Mach H, Middaugh CR, Olsnes S (1992) Tight folding of acidic fibroblast growth factor prevents its translocation to the cytosol with diphtheria toxin as vector. EMBO J 11:4835–4842.
  • 46. Mach H, Middaugh CR (1995) Interaction of partially structured states of acidic fibroblast growth factor with phospholipid membranes. Biochemistry 34:9913–9920.
  • 47. Srimathi T, Kumar TK, Chi YH, Chiu IM, Yu C (2002) Characterization of the structure and dynamics of a near‐native equilibrium intermediate in the unfolding pathway of an all beta‐barrel protein. J Biol Chem 277:47507–47516.
  • 48. Chi YH, Kumar TK, Chiu IM, Yu C (2002) Identification of rare partially unfolded states in equilibrium with the native conformation in an all beta‐barrel protein. J Biol Chem 277:34941–34948.
  • 49. Murzin AG, Brenner SE, Hubbard T, Chothia C (1995) SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 247:536–540.
  • 50. Jacobs DJ, Rader AJ, Kuhn LA, Thorpe MF (2001) Protein flexibility predictions using graph theory. Proteins 44:150–165.
  • 51. Haliloglu T, Bahar I, Erman B (1997) Gaussian dynamics of folded proteins. Phys Rev Lett 79:3090–3093.
  • 52. Hilser VJ, Freire E (1996) Structure‐based calculation of the equilibrium folding pathway of proteins. Correlation with hydrogen exchange protection factors. J Mol Biol 262:756–772.
  • 53. Hilser VJ, Garcia‐Moreno EB, Oas TG, Kapp G, Whitten ST (2006) A statistical thermodynamic model of the protein ensemble. Chem Rev 106:1545–1558.
  • 54. Lee KH, Xie D, Freire E, Amzel LM (1994) Estimation of changes in side chain configurational entropy in binding and folding: general methods and application to helix formation. Proteins 20:68–84.
  • 55. D'Aquino JA, Gomez J, Hilser VJ, Lee KH, Amzel LM, Freire E (1996) The magnitude of the backbone conformational entropy change in protein folding. Proteins 25:143–156.
  • 56. Koga N, Tatsumi‐Koga R, Liu G, Xiao R, Acton TB, Montelione GT, Baker D (2012) Principles for designing ideal protein structures. Nature 491:222–227.
  • 57. Huang PS, Feldmeier K, Parmeggiani F, Fernandez Velasco DA, Hocker B, Baker D (2015) De novo design of a four‐fold symmetric TIM‐barrel protein with atomic‐level accuracy. Nat Chem Biol [VOL:PAGE #S].
  • 58. Mello CC, Barrick D (2004) An experimentally determined protein folding energy landscape. Proc Natl Acad Sci USA 101:14102–14107.
  • 59. Kloss E, Barrick D (2009) C‐terminal deletion of leucine‐rich repeats from YopM reveals a heterogeneous distribution of stability in a cooperatively folded protein. Protein Sci 18:1948–1960.
  • 60. Renn JP, Clark PL (2008) A conserved stable core structure in the passenger domain beta‐helix of autotransporter virulence proteins. Biopolymers 89:420–427.
  • 61. Hirst JD, Vieth M, Skolnick J, Brooks CL III (1996) Predicting leucine zipper structures from sequence. Protein Eng 9:657–662.
  • 62. Adey NB, Kay BK (1996) Identification of calmodulin‐binding peptide consensus sequences from a phage‐displayed random peptide library. Gene 169:133–134.
  • 63. Chen YA, Scheller RH (2001) SNARE‐mediated membrane fusion. Nat Rev Mol Cell Biol 2:98–106.
  • 64. Wayne N, Lai Y, Pullen L, Bolon DN Modular control of cross‐oligomerization: analysis of superstabilized Hsp90 homodimers in vivo. J Biol Chem 285:234–241.
  • 65. Chen W, Lam SS, Srinath H, Jiang Z, Correia JJ, Schiffer CA, Fitzgerald KA, Lin K, Royer WE Jr (2008) Insights into interferon regulatory factor activation from the crystal structure of dimeric IRF5. Nat Struct Mol Biol 15:1213–1220.
  • 66. Xu Z, Horwich AL, Sigler PB (1997) The crystal structure of the asymmetric GroEL‐GroES‐(ADP)7 chaperonin complex. Nature 388:741–750.
  • 67. Liu Q, Hendrickson WA (2007) Insights into Hsp70 chaperone activity from a crystal structure of the yeast Hsp110 Sse1. Cell 131:106–120.
  • 68. Nelson R, Sawaya MR, Balbirnie M, Madsen AO, Riekel C, Grothe R, Eisenberg D (2005) Structure of the cross‐beta spine of amyloid‐like fibrils. Nature 435:773–778.
  • 69. Paravastu AK, Leapman RD, Yau WM, Tycko R (2008) Molecular structural basis for polymorphism in Alzheimer's beta‐amyloid fibrils. Proc Natl Acad Sci USA 105:18349–18354.
  • 70. Wasmer C, Lange A, Van Melckebeke H, Siemer AB, Riek R, Meier BH (2008) Amyloid fibrils of the HET‐s(218‐289) prion form a beta solenoid with a triangular hydrophobic core. Science 319:1523–1526.
  • 71. Gromiha MM, Pujadas G, Magyar C, Selvaraj S, Simon I (2004) Locating the stabilizing residues in (alpha/beta)8 barrel proteins based on hydrophobicity, long‐range interactions, and sequence conservation. Proteins 55:316–329.
  • 72. Li R, Woodward C (1999) The hydrogen exchange core and protein folding. Protein Sci 8:1571–1590.
  • 73. Mann HB, Whitney DR (1947) On a test of whether one of two random variables is stochastically larger than the other. Ann Math Statist 18:50–60.
  • 74. Maki K, Cheng H, Dolgikh DA, Shastry MC, Roder H (2004) Early events during folding of wild‐type staphylococcal nuclease and a single‐tryptophan variant studied by ultrarapid mixing. J Mol Biol 338:383–400.
  • 75. Rosen LE, Kathuria SV, Matthews CR, Bilsel O, Marqusee S (2015) Non‐native structure appears in microseconds during the folding of E. coli RNase H. J Mol Biol 427:443–453.
  • 76. Loh SN, Kay MS, Baldwin RL (1995) Structure and stability of a second molten globule intermediate in the apomyoglobin folding pathway. Proc Natl Acad Sci USA 92:5446–5450.
  • 77. Schneider B, Knochel T, Darimont B, Hennig M, Dietrich S, Babinger K, Kirschner K, Sterner R (2005) Role of the N‐terminal extension of the (betaalpha)8‐barrel enzyme indole‐3‐glycerol phosphate synthase for its fold, stability, and catalytic activity. Biochemistry 44:16405–16412.
  • 78. Bilsel O, Zitzewitz JA, Bowers KE, Matthews CR (1999) Folding mechanism of the alpha‐subunit of tryptophan synthase, an alpha/beta barrel protein: global analysis highlights the interconversion of multiple native, intermediate, and unfolded forms through parallel channels. Biochemistry 38:1018–1029.
  • 79. Truckses DM, Somoza JR, Prehoda KE, Miller SC, Markley JL (1996) Coupling between trans/cis proline isomerization and protein stability in staphylococcal nuclease. Protein Sci 5:1907–1916.
  • 80. Katayanagi K, Miyagawa M, Matsushima M, Ishikawa M, Kanaya S, Nakamura H, Ikehara M, Matsuzaki T, Morikawa K (1992) Structural details of ribonuclease H from Escherichia coli as refined to an atomic resolution. J Mol Biol 223:1029–1052.
  • 81. Brych SR, Blaber SI, Logan TM, Blaber M (2001) Structure and stability effects of mutations designed to increase the primary sequence symmetry within the core region of a beta‐trefoil. Protein Sci 10:2587–2599.
  • 82. Porollo AA, Adamczak R, Meller J (2004) POLYVIEW: a flexible visualization tool for structural and functional annotations of proteins. Bioinformatics 20:2460–2462.

Articles from Protein Science : A Publication of the Protein Society are provided here courtesy of The Protein Society