Significance
Orthologous proteins from the three superkingdoms have conserved their structures and functions over evolutionary time. We ask whether their folding mechanisms and the structures of their partially folded states are similarly conserved, using bacterial and archaeal representatives of the IGPS TIM barrel enzyme. Comparison of circular dichroism and fluorescence spectroscopic studies reveals a highly conserved mechanism, and hydrogen–deuterium exchange mass spectrometry analyses highlight similar cores of stability in regions dominated by clusters of branched aliphatic side chains. A bioinformatics analysis of hundreds of IGPS sequences from each superkingdom shows a very highly conserved sequence, V/ILLI, that nucleates the formation of a misfolded, microsecond intermediate and has existed since the last universal common ancestor of the IGPS family of proteins.
Keywords: protein folding, protein evolution, TIM barrel orthologs, hydrogen deuterium exchange, mass spectrometry
Abstract
The amino acid sequences of proteins have evolved over billions of years, preserving their structures and functions while responding to evolutionary forces. Are there conserved sequence and structural elements that preserve the protein folding mechanisms? The functionally diverse and ancient (βα)1–8 TIM barrel motif may answer this question. We mapped the complex six-state folding free energy surface of a ∼3.6 billion y old, bacterial indole-3-glycerol phosphate synthase (IGPS) TIM barrel enzyme by equilibrium and kinetic hydrogen–deuterium exchange mass spectrometry (HDX-MS). HDX-MS on the intact protein reported exchange in the native basin and the presence of two thermodynamically distinct on- and off-pathway intermediates in slow but dynamic equilibrium with each other. Proteolysis revealed protection in a small (α1β2) and a large cluster (β5α5β6α6β7) and that these clusters form cores of stability in Ia and Ibp. The strongest protection in both states resides in β4α4 with the highest density of branched aliphatic side chain contacts in the folded structure. Similar correlations were observed previously for an evolutionarily distinct archaeal IGPS, emphasizing a key role for hydrophobicity in stabilizing common high-energy folding intermediates. A bioinformatics analysis of IGPS sequences from the three superkingdoms revealed an exceedingly high hydrophobicity and surprising α-helix propensity for β4, preceded by a highly conserved βα-hairpin clamp that links β3 and β4. The conservation of the folding mechanisms for archaeal and bacterial IGPS proteins reflects the conservation of key elements of sequence and structure that first appeared in the last universal common ancestor of these ancient proteins.
Proteins are indispensable workhorses of cellular machinery whose functional diversity is defined by their final folded conformations. The folding pathway of a protein is determined by its energy landscape, whose map is encoded in the amino acid sequence. Partially folded states on the landscape often contain elements of the native topology and connect the nascent unfolded polypeptide chain to the functional folded conformation (1, 2). Proteins and their folding pathways have evolved over billions of years, responding to evolutionary forces such as mutation and natural selection (3–5). Orthologs, proteins that have diverged from a common ancestor but share a common structure and function, provide vehicles for exploring the impact of evolution on folding pathways and the intermediates that guide the folding to the native conformation.
The functionally diverse (βα)1–8 TIM barrel motif is an ideal candidate to decipher evolutionary constraints on protein folding pathways. The motif supports a wide variety of essential enzymatic transformations in all three superkingdoms of life (6–8) and is one of the 10 ancestral protein folds that were instrumental in the transition from the RNA–protein world, through the last universal common ancestor of life (LUCA), to the present complex DNA–RNA–protein world (9, 10). The βα-repeat architecture produces a cylindrical β-barrel core and an amphipathic α-helical shell whose loops between the β-strands and subsequent α-helices form the canonical active site of this very large family of enzymes. Although the pairwise sequence conservation across the family of TIM barrels is typically ∼30%, their folding mechanisms are complex and highly conserved (11). Folding intermediates, both on the productive folding pathway and as misfolded kinetic traps, have been observed for candidate TIM barrels from several bacterial and archaeal organisms (11–16). The divergence of these two superkingdoms, which occurred ∼4 billion y ago, right after life arose, speaks to the robustness of the TIM barrel folding mechanism across the span of evolutionary time.
We have previously examined the relationships between sequence, structure, and fitness in a yeast-based competition assay for three thermophilic indole-3-glycerol phosphate synthase (IGPS) orthologs from the TIM barrel family (17). Significant correlations between the archaeal Sulfolobus solfataricus (SsIGPS) and the bacterial Thermotoga maritima (TmIGPS) and Thermus thermophilus (TtIGPS) proteins revealed that both sequence and structure are critical in defining their fitness landscapes. This observation and the conservation of TIM barrel folding mechanisms motivated the hypothesis that the sequences of TIM barrel orthologs from archaeal and bacterial organisms also conserve the structures of their folding intermediates. If valid, this hypothesis would provide detailed insights into the constraints that TIM barrel structure and function impose on the enormous sequence space available in ∼4 billion y of evolution (18, 19). We have previously mapped the structures of the on- and off-pathway intermediates for SsIGPS by hydrogen–deuterium exchange mass spectrometry (HDX-MS) (15, 16), providing an archaeal reference for the present study of a bacterial ortholog (SI Appendix, Fig. S1).
Comparison of the structures of the folding intermediates and folding mechanisms for S. solfataricus and T. maritima IGPS confirmed our hypothesis. A bioinformatics analysis of thousands of nonredundant IGPS sequences from the bacterial, archaeal, and eukaryotic superkingdoms revealed the conservation of three adjacent structural elements that form a nucleus responsible for defining the folding free energy surface of the IGPS family of TIM barrel proteins. We conclude that the folding mechanism of the IGPS TIM barrel, including the structures of key partially folded states, arose in the LUCA and has persisted for ∼4 billion y.
Results
The Folding Mechanism Is Conserved in Bacterial and Archaeal IGPS TIM Barrels.
Equilibrium experiments.
We examined the equilibrium and kinetic folding properties of TmIGPS in the denaturant guanidine hydrochloride (GdnHCl), using circular dichroism (CD) and tryptophan fluorescence (FL) spectroscopy to monitor the formation and disruption of secondary and tertiary structure (Fig. 1). Surprisingly, the titrations of TmIGPS with GdnHCl took 9 d to reach equilibrium at 25 °C and pH 7.2 (SI Appendix, Fig. S2). The CD results at 222 nm revealed a shoulder at ∼2 M GdnHCl and followed an apparent three-state mechanism, N ↔ Ieq ↔ U, similar to other TIM barrel proteins (13, 14). By contrast, FL spectroscopy only detected the N ↔ Ieq transition (Fig. 1A). The Ieq state retains ∼50% of the far-ultraviolet (UV) CD signal, but the α6 (W194) and α8 (W250) helices containing the two tryptophans appear to be unfolded. The population of Ieq is highest at 1.8 M GdnHCl, ∼80% of the population, and the transition to the unfolded state is complete by ∼5 M GdnHCl (Fig. 1B). To enhance the precision of the thermodynamic parameters from each technique, we independently fitted the CD ellipticities between 200 and 260 nm to a three-state model and the FL emission data between 305 and 450 nm to a two-state model. The CD data yielded a free energy for the N ↔ Ieq transition of 5.4 ± 0.1 kcal ⋅ mol−1, and 4.1 ± 0.2 kcal ⋅ mol−1 for the Ieq ↔ U transition (SI Appendix, Table S1). FL spectroscopy yielded an apparent free energy for the N ↔ Ieq transition of 4.9 ± 0.6 kcal ⋅ mol−1, in excellent agreement with the CD measurement.
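The population calculation behind a three-state N ↔ Ieq ↔ U analysis can be sketched as follows. This is a minimal illustration, not the fitting procedure used for Fig. 1: it assumes the common linear extrapolation model, in which each free energy decreases linearly with denaturant concentration. The two free energies are the CD values quoted above; the m-values are not reported in this section, so the numbers used for them here are hypothetical placeholders.

```python
import math

R_KCAL = 1.987e-3   # gas constant, kcal mol^-1 K^-1
T_KELVIN = 298.15   # 25 degrees C, as in the titrations


def three_state_populations(denaturant_molar, dg_ni, m_ni, dg_iu, m_iu):
    """Return fractional populations (N, I, U) for N <-> I <-> U.

    Each step is assumed to follow the linear extrapolation model:
    dG([D]) = dG(H2O) - m * [D].
    """
    rt = R_KCAL * T_KELVIN
    k_ni = math.exp(-(dg_ni - m_ni * denaturant_molar) / rt)  # [I]/[N]
    k_iu = math.exp(-(dg_iu - m_iu * denaturant_molar) / rt)  # [U]/[I]
    partition = 1.0 + k_ni + k_ni * k_iu
    return (1.0 / partition, k_ni / partition, k_ni * k_iu / partition)


# Free energies (kcal/mol) from the CD fit quoted in the text.
DG_NI, DG_IU = 5.4, 4.1
# HYPOTHETICAL m-values (kcal mol^-1 M^-1), chosen only for illustration.
M_NI, M_IU = 4.5, 1.6

for conc in (0.0, 1.8, 5.0):
    f_n, f_i, f_u = three_state_populations(conc, DG_NI, M_NI, DG_IU, M_IU)
    print(f"{conc:.1f} M GdnHCl: N={f_n:.3f}  I={f_i:.3f}  U={f_u:.3f}")
```

With real m-values from the fit, the same function would give the population curves of the kind plotted in Fig. 1B; with the placeholder values it only shows the qualitative behavior (native at 0 M, intermediate dominant near 1.8 M, unfolded at 5 M).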
Kinetic experiments.
To obtain a complete picture of the folding reaction, we complemented the equilibrium results with an analysis of the kinetic unfolding and refolding properties of TmIGPS over a time range of milliseconds to hours. The kinetic traces were fitted to one or two exponential functions, and the log10 of the observed relaxation times were plotted as a function of the final GdnHCl concentration (Fig. 1C). TmIGPS folds in the submillisecond time frame to a kinetic intermediate, Ibp, with substantial secondary structure (Fig. 1A). The denaturant dependence of the Ibp ellipticity is coincident with the Ieq ↔ U transition, implying an apparent stability comparable to Ieq (Fig. 1A). Unfortunately, aggregation below 1 M GdnHCl precluded a quantitative analysis of its stability. Stopped-flow FL (SF-FL) revealed a reaction on the hundreds-of-milliseconds time scale whose acceleration at increasing denaturant concentration indicates an unfolding-like reaction for Ibp (Fig. 1C). The final, rate-limiting step in refolding accelerates exponentially with decreasing GdnHCl to reach an extrapolated relaxation time of 74 s in the absence of denaturant. The unfolding SF-FL kinetic traces were fitted with two exponential functions and describe a pair of denaturant-dependent phases between 2 and 4.5 M GdnHCl. The major (∼95%) slow and minor (∼5%) fast unfolding phases merged into a single phase whose relaxation time rolls over at ≥5 M GdnHCl. Although the major FL unfolding phase is also detected by CD, the minor unfolding phase is not.
We interpret the kinetic results to support a six-state folding mechanism for TmIGPS (Scheme 1), very similar to those for other IGPS TIM barrels (11). In the TmIGPS folding mechanism, the unfolded, U state initially collapses to an off-pathway intermediate, Ibp, that must at least partially unfold to access the first on-pathway intermediate, Ia. The conversion of Ia to the second on-pathway intermediate, Ib, is the rate-limiting step in folding, rendering the subsequent faster Ib to N reaction undetectable. The denaturant dependence of the ellipticity of Ia, the dominant species after 10 s of folding, is coincident with that for Ibp, demonstrating comparable stabilities for the off- and on-pathway intermediates (Fig. 1C). For unfolding, the N to Ib reaction is the minor fast phase between 2 and 4.5 M GdnHCl (Fig. 1C). The major slow unfolding phase corresponds to the rate-limiting conversion of Ib to Ia. The N to N′ unfolding reaction, revealed by a rollover in the relaxation times at high denaturant concentrations, is a distinguishing feature of the TmIGPS mechanism. The very weak denaturant dependence of the N to N′ phase indicates a very small change in the buried surface area, justifying the N′ designation. As will be demonstrated in the HDX-MS experiment, Ieq is a composite of the Ia and Ibp species.
HDX-MS Confirms and Expands the Folding Mechanism of TmIGPS.
Equilibrium experiments.
The CD and FL experiments were useful in defining a folding free energy surface for TmIGPS, but neither is capable of providing insights into the structures of the partially folded states on that surface. HDX-MS can provide a global assessment of the protection of backbone amide hydrogens (NHs) against exchange with solvent deuterium for the intact protein (20). The H-to-D exchange behavior of TmIGPS was monitored after equilibration at different GdnHCl concentrations (0 to 6 M) for 9 d. After equilibration, deuterium-labeled samples were quenched and loaded on a home-built HDX device and then detected by electrospray ionization mass spectrometry (ESI-MS) (SI Appendix, Fig. S3). The m/z peaks for the +28 charge state and the number of exchanged backbone NHs obtained after Gaussian fitting of ESI-MS data for the +28 charge state are shown (Figs. 2 and 3).
In the native state (N) at 0 M GdnHCl, TmIGPS has exchanged 34 NHs (m/z 890.0) with deuterium in comparison to its undeuterated state (m/z 888.8). The N peak shifts smoothly from m/z 890.0 (34 Da) to m/z 890.4 (46 Da) with increasing denaturant concentration, reflecting the transition from the N to the N′ state (Figs. 2 and 3). The N′ state disappears completely at ∼1.8 M GdnHCl, the same concentration reported by the equilibrium FL experiment (Fig. 1B). At 1.2 M GdnHCl, where N and Ieq are populated (Fig. 1B), a new peak appears at m/z 893.5 (130 Da) that grows in intensity as the N′ peak diminishes (Fig. 2). Further increases in GdnHCl concentration reveal the new peak to be a pair of overlapping peaks, a broad peak at m/z 892.7 (109 Da) and a narrow peak at m/z 893.6 (132 Da). The pair of peaks shifts smoothly to higher m/z up to 2.3 M GdnHCl (SI Appendix, Table S2). At the same time, the peak area of the lower m/z, broader peak decreases as the area of the higher m/z, narrower peak increases (Fig. 3). By 2.8 M GdnHCl, where the U state is highly populated, only a single narrow peak is apparent (m/z 894.8, 167 Da). This peak shifts smoothly to a higher m/z up to 6 M GdnHCl (m/z 895.3, 180 Da).
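The relation between the quoted m/z values and the mass increases in daltons can be illustrated with a short calculation. At a fixed charge state z, a shift in m/z corresponds to a mass change of z × Δ(m/z), and each H-to-D substitution adds roughly 1 Da. The sketch below assumes that the undeuterated +28 peak (m/z 888.8) is the reference; because the m/z values in the text are rounded to one decimal place, the results should agree with the quoted uptakes only to within a few daltons.

```python
def deuterium_uptake_da(mz_labeled, mz_reference, charge):
    """Mass increase (Da) implied by an m/z shift at a fixed charge state."""
    return (mz_labeled - mz_reference) * charge


REFERENCE_MZ = 888.8  # undeuterated +28 peak quoted in the text
CHARGE = 28

# m/z values quoted in the text for two of the species
peaks = [("N at 0 M GdnHCl", 890.0), ("unfolded at 2.8 M GdnHCl", 894.8)]

for label, mz in peaks:
    uptake = deuterium_uptake_da(mz, REFERENCE_MZ, CHARGE)
    print(f"{label}: about {uptake:.1f} Da")
```

The function name and the choice of example peaks are illustrative only; the Gaussian peak fitting that produced the reported values is not reproduced here.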
Kinetic experiments.
The assignment of the pair of m/z peaks between 1.1 and 2.8 M GdnHCl was obtained by a kinetic HDX-MS experiment based on the TmIGPS folding mechanism (Scheme 1). After 10 s of refolding at 0.8 M GdnHCl, TmIGPS occupies only the Ia state, following escape from the Ibp trap and prior to the rate-limiting step in folding (SI Appendix, Fig. S4). A single peak with a narrow width is observed at m/z 893.0 in the kinetic refolding HDX-MS experiment, assigning the narrow peak to Ia and the broad peak to Ibp in the equilibrium HDX-MS experiment (SI Appendix, Table S3). The deuterium labeling of the Ib state was obtained by a kinetic unfolding HDX-MS experiment. TmIGPS was unfolded and pulse labeled with 1.5 M deuterated D2O/GdnHCl between 10 and 1,800 s (SI Appendix, Fig. S5A). The conversion of Ib to Ia at 1.5 M GdnHCl occurs with a time constant of ∼2.7 h (Fig. 1C), ensuring that the kinetic unfolding HDX-MS data only reflected the protection against H-to-D exchange by the Ib state. The N′ to Ib transition occurs with a time constant of ∼300 s (SI Appendix, Fig. S5B), consistent with the observed faster unfolding FL phase (Fig. 1C).
Although the HDX-MS data shown in Figs. 2 and 3 were collected under equilibrium conditions, the results are informative about the limits of the kinetic processes that link the various species on the TmIGPS folding free energy landscape. The exchange behavior of backbone NHs with deuterium is controlled by the rate constants for the opening (kop) and the closing (kcl) reactions that expose the backbone NH to solvent, and the intrinsic rate constant for exchange (kex) of an exposed backbone NH under the experimental conditions (20). Under the EX1 limit, kcl << kex, and the exchange is controlled by kop. Under the EX2 limit, kcl >> kex, and exchange is controlled by kop/kcl, i.e., by the free energy difference between the open and closed states, ∆G° = −RT ln(kop/kcl). At pH 7.2 and 25 °C, the average kex for amide hydrogens is ∼1 s−1 (20), defining EX1 processes as those with kcl << 1 s−1 and EX2 processes as those with kcl >> 1 s−1. The consequences for the HDX-MS data are that EX1 processes result in coordinated peak area changes for the protonated and deuterated states, and EX2 processes result in a continuous change of the m/z value between the protonated and deuterated states. The assignments of the steps in the kinetic scheme for TmIGPS folding are shown (Scheme 1 and Fig. 2). EX2 processes reflect the undetected fast refolding of Ib to N′ and the N′ to N transitions. As expected, the very slow rate-limiting refolding step from Ia to Ib is controlled by an EX1 process that accounts for the simultaneous appearance of the N/N′ peak and the overlapping Ia/Ibp peak between 1.2 and 1.7 M GdnHCl (Figs. 2–4). Surprisingly, another EX1 process links the high-energy states, Ia and Ibp, between 1.2 and 2.5 M GdnHCl. The source of the structural differences between these two high-energy states and the Ib state can be determined by mapping the HDX protection on the amino acid sequence.
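The two exchange limits can be made concrete with a small numerical sketch. It assumes the standard steady-state expression for the two-step opening/exchange scheme, k_obs = kop·kex/(kcl + kex), which holds when kop << kcl; that expression is textbook background rather than something stated in this section. The factor-of-10 cutoff used to label a regime, and all rate constants in the example, are arbitrary illustrative choices.

```python
import math

R_KCAL = 1.987e-3  # gas constant, kcal mol^-1 K^-1


def observed_exchange_rate(k_op, k_cl, k_ex):
    """Steady-state exchange rate (s^-1); assumes k_op << k_cl."""
    return k_op * k_ex / (k_cl + k_ex)


def opening_free_energy(k_op, k_cl, temperature_k=298.15):
    """Free energy of opening (kcal/mol): dG = -RT ln(k_op / k_cl)."""
    return -R_KCAL * temperature_k * math.log(k_op / k_cl)


def exchange_regime(k_cl, k_ex, factor=10.0):
    """Label the limit; the factor-of-10 threshold is an arbitrary assumption."""
    if k_cl * factor <= k_ex:
        return "EX1"
    if k_cl >= factor * k_ex:
        return "EX2"
    return "mixed"


K_EX = 1.0  # s^-1, the approximate intrinsic rate quoted in the text

# Hypothetical rate constants (s^-1) chosen to land in each limit
slow_closing = dict(k_op=1e-4, k_cl=1e-3)  # closing much slower than exchange
fast_closing = dict(k_op=1e-2, k_cl=1e3)   # closing much faster than exchange

for name, rates in (("slow closing", slow_closing), ("fast closing", fast_closing)):
    k_obs = observed_exchange_rate(rates["k_op"], rates["k_cl"], K_EX)
    print(name, exchange_regime(rates["k_cl"], K_EX), f"k_obs={k_obs:.2e} s^-1")
```

In the slow-closing case the observed rate approaches kop, and in the fast-closing case it approaches (kop/kcl)·kex, which is why only the latter reports on the free energy of opening.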
Mapping the Structures of Folding Intermediates in TmIGPS with HDX-MS.
Equilibrium experiments.
HDX-MS results on the intact protein were useful in linking the folding intermediates in TmIGPS detected by equilibrium and kinetic optical experiments (CD and FL) and equilibrium HDX-MS experiments, but they cannot provide insights into their structures. Proteolytic digestion of the pulse-labeled protein from 0 to 5 M GdnHCl concentrations provides the desired information. The equilibration protocol, normalization controls, and deuterium-pulse labeling and quenching procedures were similar to those of the HDX-MS experiments on the intact protein (SI Appendix, Fig. S6). After quenching, TmIGPS samples were proteolytically digested on a chilled online pepsin digestion column. The resulting peptides were separated by ultra-performance liquid chromatography (UPLC) and analyzed by ESI-MS. A total of 70 overlapping peptides, covering 97% of the TmIGPS amino acid sequence, were analyzed (SI Appendix, Fig. S7).
The peptides were sorted into four different classes based on their H-to-D exchange mechanism and protection in intermediates. Representative spectra (Fig. 4) and titration curves (Fig. 5) for the selected set of peptides are shown. The fitted isotopic envelope of representative peptides at every GdnHCl concentration is shown in SI Appendix, Dataset S1.
Class I: The majority of the backbone NHs are exchanged within the 10 s deuterium pulse at 0 M GdnHCl. The remaining protection is lost via an EX2 mechanism with the formation of the N′ state at 1.1 M GdnHCl (Fig. 5A).
Class II: After the initial rapid exchange of a few backbone NHs via an EX2 mechanism, these peptides are protected in N′ and exchange their remaining backbone NHs via an EX1 mechanism with Ia and/or Ibp (Fig. 5A).
Class III: 15 to 40% of the backbone NHs in these peptides are protected in Ia and/or Ibp and exchange out in the U state via an EX2 mechanism (Fig. 5B).
Class IV: 45 to 60% of the NHs are protected against HDX in Ia and/or Ibp and exchange out in U via an EX2 mechanism for peptides spanning residues 128 to 149 (β4α4) (Fig. 5C). Compared to the Class III peptides, the higher apparent midpoint for the first transition in uptake implies a stronger protection for β4α4.
The four classes of peptides are shown in Fig. 6A and mapped onto the ribbon diagram of TmIGPS in Fig. 6B. The protection against exchange in Ia and/or Ibp is located in the α1β2 and β5α5β6α6β7 segments (Class III) and the β4α4 segment (Class IV). These regions, when mapped on a two-dimensional (2D) contact plot of isoleucine, leucine, and valine (ILV) side chains (Fig. 6C), show a strong correlation between clusters of ILV side chains and protection against exchange in these intermediates. Notably, the β4α4 segment has the highest density of ILV contacts. A similar correlation has been observed previously for SsIGPS [Fig. 6D (15)], emphasizing a key role for hydrophobicity in stabilizing folding intermediates. The absence of protection in β1 and β8 for Ia and Ibp means that the β-barrel has opened via the rate-limiting step from Ib to Ia. Consequently, the significant density of ILV contacts found in the 2D maps for these regions in the native state does not exist in Ia and Ibp intermediates.
Kinetic experiments.
The protection patterns in the individual N and N′ states could be mapped at equilibrium at 0 and 1.1 M GdnHCl. Kinetic experiments were required to isolate the Ia and Ib species and map their HDX protection patterns (see above). The peptides from N′ uniformly exhibit greater deuteration (i.e., more exposed to solvent) than their counterparts in the N state, suggesting a general loosening of the structure (SI Appendix, Fig. S8 A and B). Ib has a closed β-barrel and a nearly intact α-helical shell (SI Appendix, Fig. S9 A and B). The segments spanning α2β3α3 and α8′α8, however, experience significantly greater exchange, indicating them as the principal sites of annealing to reach the N state. In the mixture of Ia and Ibp at equilibrium, the β-barrel is open and only the α1β2 and the β4α4β5α5β6α6β7 segments offer protection against exchange (Class III and Class IV peptides) (Fig. 6B and SI Appendix, Fig. S8C). These segments in TmIGPS also contain βα-hairpin clamps (SI Appendix, Fig. S10) known to stabilize TIM barrel proteins (21, 22). Although aggregation in stopped-flow refolding experiments precluded a direct structural analysis of Ibp, peptide mapping of Ia revealed strong protection against exchange in β2 and β4α4β5 under strongly folding conditions (SI Appendix, Fig. S9C). The comparison of Ia with the mixture of Ia and Ibp at equilibrium in 1.8 M GdnHCl shows that Ibp also protects α1 and α5β6α6β7. We note that the lower mean m/z value for Ibp (i.e., less deuterated/more protected) is accompanied by a broader distribution than seen for the N, N′, Ia, and U states (Fig. 2). Ibp appears to be stabilized by a structurally different ensemble than its counterparts in the folding mechanism.
Bioinformatic Analysis of Hydrophobicity in the IGPS Family of TIM Barrels across Evolution.
The striking correlation between the HDX protection patterns in bacterial TmIGPS and archaeal SsIGPS, especially the strongest protection in the β4α4 segment (Fig. 6 C and D), motivated a bioinformatics analysis of the hydrophobicity of thousands of sequences of IGPS orthologs from the bacterial, archaeal, and eukaryotic superkingdoms. We used the Kyte–Doolittle hydropathy score (23) and a rolling five-residue window to calculate the hydrophobicity in members of a large and well-distributed IGPS TIM barrel family. The Kyte–Doolittle hydrophobicity scale was chosen as it closely mimics the dominant role of ILV side chains reflected in Fig. 6 C and D.
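A rolling-window hydropathy profile of the kind described above can be sketched in a few lines. The residue values are the published Kyte–Doolittle scale; the choice to average (rather than sum) over the five-residue window is an assumption, since the text does not specify it, and the input sequence is a made-up toy string, not an IGPS sequence.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=5):
    """Mean Kyte-Doolittle score for every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


# HYPOTHETICAL toy sequence for illustration only (not a real IGPS sequence)
toy = "GADKRVILLIAESGS"
profile = hydropathy_profile(toy)

# Index of the most hydrophobic window
peak = max(range(len(profile)), key=profile.__getitem__)
print(f"most hydrophobic window: {toy[peak:peak + 5]} (mean score {profile[peak]:.2f})")
```

Applied to an alignment, the same per-sequence profile would be computed for each ortholog and then averaged position by position; that alignment step is outside the scope of this sketch.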
The observed patterns reflect the periodicity of the hydrophobic β-strands, as expected for the (βα)1–8 TIM barrel architecture (Fig. 7 A–C). The site of highest hydrophobicity is strongly conserved in the β4 strand, and its score is >3 SDs higher than the mean hydrophobicity for each of the three superkingdoms. Strikingly, the hydrophobicity score for the adjacent β3 strand is the lowest of the eight β-strands and close to mean for the entire sequence. The sequence logos show that ILV residues are highly favored in the β4 strand for all three superkingdoms (Fig. 7 D–F), demonstrating a defining role for the branched aliphatic side chains in the folding of the IGPS family of proteins that spans over a billion years of evolution (SI Appendix, Fig. S11).
Although branched aliphatic side chains are often associated with β-strands, leucine favors α-helices. The β4 strand shows strong conservation of consecutive leucine residues at equivalent positions in all three superkingdoms (SI Appendix, Fig. S12). The net effect for the sequences corresponding to β4 is a surprising tendency toward α-helix formation (SI Appendix, Fig. S13). The sequence logo for β3 shows a strong preference for valine/isoleucine at the first position and leucine at the second (SI Appendix, Fig. S14). The preference for arginine and lysine at the third and fourth positions accounts for the low hydrophobicity of β3 (Fig. 7). K110 (β3) is absolutely conserved across all members of the IGPS class of enzymes because it plays a key role in the active site architecture (17). A final surprise in the bioinformatics analysis was the strong conservation of the glycine-alanine-aspartic acid (GAD) βα-hairpin clamp linking β3 and β4 in all three superkingdoms (Fig. 7 D–F and SI Appendix, Fig. S12). A previous saturation mutagenesis analysis of this region in SsIGPS, TmIGPS, and TtIGPS revealed a long-range, allosteric connection between the βα-hairpin clamp and the enzymatic active site at the opposite end of the β-barrel (17). Taken together, the conservation of these sequence features reflects essential and orthogonal roles in the folding and function of the IGPS family of proteins.
Discussion
Structure of Folding Intermediates in TmIGPS.
The equilibrium and kinetic HDX-MS experiments on TmIGPS revealed structural insights for all of the partially folded states on the folding free energy surface (Fig. 6B and SI Appendix, Figs. S8 and S9). The N′ state maintains its TIM barrel fold, with fraying at the ends of α-helices and β-strands accounting for its reduced protection. The Ib state has a closed β-barrel; however, the α2β3α3 and α8′α8 segments framing the active site become dynamic and exchange their amide hydrogens with deuterium. Although the β-barrel is open in both the Ia and Ibp states, Ia only offers strong protection in β2 and β4α4β5. We were not able to directly assess the protection of Ibp because the protein aggregates under stopped-flow conditions. However, the simultaneous presence of Ia and Ibp at equilibrium implies that Ibp also offers protection in α1β2 and β4α4β5α5β6α6β7. Despite the lack of a direct contact between the sequence-separated protected segments in the Ia and Ibp states, the coincident loss in protection at increasing GdnHCl concentrations (Fig. 5 B and C) is evidence for their cooperative transition to the unfolded state. However, the uniquely strong, Class IV, protection of the β4α4 segment observed in the equilibrium experiments (Figs. 4 and 5 B and C) shows that the β4α4β5 and/or β4α4β5α5β6α6β7 modules do not unfold in a two-state fashion. The selective protection of a single βα element in a structure with eightfold βα symmetry highlights the role of sequence in protection against exchange.
We have earlier proposed that large clusters formed by aliphatic side chains of ILV inhibit water penetration and hydrogen exchange in partially folded states and form cores of stability in other βα proteins (14, 24, 25). The 2D contact map for ILV side chains in the bacterial TmIGPS reveals two ILV clusters, markedly similar to its evolutionarily distinct ortholog, archaeal SsIGPS (Fig. 6 C and D) (15). In both cases, a large cluster is primarily located in the C-terminal half of the barrel, highly dense in the β4α4 region, and includes a few contacts that link the N and C termini. For TmIGPS, a small cluster is localized in the N-terminal half of the barrel and spans the α1β2α2β3α3 segments. A corresponding cluster in SsIGPS spans β2α2β3α3. For both TmIGPS and SsIGPS, side chain–main chain hydrogen bonds between even numbered β-strands and their preceding odd-numbered counterparts, β1β2 and β3β4, form stabilizing βα-hairpin clamps (21). In TmIGPS, the core of stability in intermediates (Ia and Ibp) is defined by protection against exchange in a small (α1β2) and a large cluster (β4α4β5α5β6α6β7). The strongest protection is in the β4α4 module that has the highest density of ILV contacts in the folded TmIGPS structure. These observations are strikingly similar to those for the archaeal ortholog SsIGPS (15, 16) and make a strong case for the Branched Aliphatic Side Chains hypothesis (BASiC) as a major determinant of TIM barrel folding reactions (24).
A Conserved Nonnative Folding Nucleus in the IGPS TIM Barrels?
The exceedingly high hydrophobicity and predicted helix propensity of β4 in the IGPS family of TIM barrels (Fig. 7 A–C and SI Appendix, Figs. S12 and S13) may provide an explanation for the puzzling observation of the uniquely strong, Class IV, protection for the segment corresponding to β4α4 in the native structure. How could a single β-strand offer protection against exchange in the absence of its adjacent partners, β3 and/or β5? Taken together, these bioinformatics and experimental observations may find a common explanation in the formation of a helical hairpin from β4 and α4 in Ibp for both TmIGPS and SsIGPS. We speculate that the folding dynamics of α-helices, which occur in hundreds of nanoseconds, would allow the β4 and α4 segments to sample helical structures that could rapidly associate into a helical hairpin. The hairpin would offer protection against exchange in nascent β4 and α4 via intrahelical H-bonds and be stabilized by the formation of a nonnative ILV hydrophobic cluster between the side chains of the β4 and α4 segments. The hairpin would appear in less than a millisecond and be sufficiently stable to drive the formation of Ibp. This putative nonnative structure could serve as a nucleus to recruit adjacent elements under folding conditions and offer protection against exchange for the α5β6α6β7 segment. Productive folding would require back-tracking to disrupt Ibp and enable the formation of a native-like β4/α4/β5 trio in Ia, where β4 and β5 would offer mutual protection against exchange. The broad distribution of protection against exchange in Ibp (Fig. 2), in the context of the narrower distributions for N, N′, Ia, and U, implies an alternative structural ensemble, possibly a molten globule stabilized by the hydrophobic effect (26, 27). Although aggregation precluded direct measurement of protection in Ibp, the remarkable resistance to exchange for β4α4 in the presence of high concentrations of GdnHCl (Figs. 4 and 5) and its local connectivity make a strong argument for its independent and early formation as a folding nucleus that drives the folding of both Ia and Ibp.
The conjecture that β-strands might initially fold as α-helices is consistent with previous HDX studies of the alpha subunit of Trp synthase (αTS), also a TIM barrel; RNase H, an α + β protein; and β-lactoglobulin (βLG), which predominantly contains β structure. An HDX-NMR study on αTS found strong protection in a continuous string of five residues, rich in ILVs, for a pair of adjacent β-strand segments (28), and a β-strand and adjacent α-helix formed the first segment to be protected in the refolding of RNase H (29). A complementary FL refolding study of RNase H revealed nonnative structure in the microsecond time range, consistent with an early misfolding reaction (30). The transient helix formation in the early folding intermediate of βLG was first detected by the increased levels of nonnative α-helical CD signal on the millisecond time scale (31, 32). Later, ultra-rapid mixing techniques in conjunction with Trp fluorescence and HDX-NMR were used to characterize the structural and dynamic properties of the partially helical compact state of the early refolding intermediate in βLG (33).
Why Conserve a Nonnative Helical Hairpin and the Preceding βα-Hairpin Clamp?
It is striking that we observed the same folding mechanism and the same structural elements in the folding intermediates for TIM barrels from both archaea and bacteria, which diverged ∼4 billion y ago. Further, the sequence signature for this biophysical feature, the conserved ILLI motif in β4, has been maintained for billions of years of evolution in archaea, bacteria, and eukaryotes. This observation provides strong evidence that the folding mechanism of IGPS TIM barrels first appeared in the last universal common ancestor of these ancient proteins and has persisted for billions of years.
The sequence density of ILV residues in the β4 segment is responsible for its extreme hydrophobicity. If not protected by a rapidly forming local structure, even if nonnative, it could nucleate by intermolecular interactions leading to aggregation. However, over ∼4 billion y, it might have been expected to either evolve to a less hydrophobic sequence and/or surrender its primary nucleation role in folding to another βα element. The answer may be that this putative nonnative structure equilibrates within seconds with the on-pathway, Ia, intermediate on the productive path to the native state. Synthesis on the ribosome is slower, 5 to 20 aa/s (34), allowing sufficient time for the sequence to escape the kinetic trap in Ibp as the protein extrudes from the tunnel and encounters the trigger factor chaperone. In addition, the helical propensity of the β4 sequence in Ibp may protect this very hydrophobic sequence from undergoing intermolecular interactions that can lead to aggregation or the formation of amyloid fibers (33). In another scenario, the transient helical structure formed by the β4 sequence in Ibp might be essential to disfavor the nonnative pairing of β4 with other β-strands early in folding. Thus, the β4 sequence would not experience an evolutionary pressure to eliminate a nonnative structure. Once this initial step on the folding pathway was established, it may have constrained the further evolution of the sequence without introducing off-pathway or misfolded intermediates. If correct, the very high conservation of the hydrophobic sequences for β4 in all three superkingdoms implies a set of mutations that became fixed in the LUCA and continues to dictate the initial events in the folding of the IGPS family of TIM barrel proteins. It would be interesting to know if TIM barrel paralogues have different nucleation sites that also persist over evolutionary time. For example, Escherichia coli αTS protects β2 and β3 against exchange in a high-energy state (28). Is it the case that once a strong nucleation sequence appears in TIM barrels, it becomes fixed throughout evolutionary time? Further experiments are required to answer this question.
The strong conservation of the βα-hairpin clamp may be explained by its role at the final stage of folding, when the native conformation appears, as observed in a previous mutational analysis of a similar βα-hairpin clamp in αTS (21). There, the βα-hairpin clamp stabilizes both the rate-limiting transition state and the native conformation. However, it is intriguing to speculate that the βα-hairpin clamp may also play a transient role early in folding by colocating the pair of branched aliphatic side chain amino acids at the N terminus of β3 with the ILV-rich β4 segment proposed to form a helical pair with α4. The putative early and known late roles in folding for the GAD sequence may explain its very high conservation.
IGPS Folding Free Energy Surfaces: Landscapes or Foldons?
From the perspective of polymer physics and statistical mechanics, Zwanzig (35) long ago pointed out that the rapid formation of local biases toward the native structure in an unfolded polypeptide chain, when coupled with their assembly in a myriad of ways into higher-order structures, is sufficient to drive folding reactions that occur well within a biological time frame. Landscape Theory (36), built on that observation, describes a funnel-like energy landscape that would allow for many possible pathways to proceed from the unfolded manifold of microstates to the native state. Concurrent with this development were the experiments of Englander and colleagues (37), which were interpreted in terms of the progressive development of native structure through the ordered, sequential assembly of simple elements of structure referred to as foldons. In effect, the foldon concept defined a tightly prescribed pathway from the unfolded state to the native state. A lively debate about these two diametrically opposed views of folding reactions continues to the present (38–40).
The folding mechanism of the IGPS family of TIM barrel proteins is not well described by either the Landscape Theory or the Foldon Model. The eightfold βα symmetry does not result in eight comparable folding modules to initiate folding, as might be expected from the simplest view of Landscape Theory. Rather, the (βα)4 module is highly favored after only a few microseconds, comparable to the folding times of small proteins and domains from larger proteins (41). In effect, the funnel narrows considerably to reach the Ibp state, a transient species that matures through a series of subsequent steps to reach the native conformation. Surprisingly, the peculiar details of the sequence appear to result in the formation of a nonnative structure that must at least partially unfold to allow access to the productive folding pathway. The structure of the subsequent on-pathway intermediate involves elements from several βα modules, likely stabilized by a cluster of branched aliphatic side chains. As reported in the present study, these intermediates are largely conserved across the bacterial and archaeal superkingdoms, speaking to a robust folding pathway with a defined set of partially folded states whose structures appear to be stabilized by the hydrophobic effect. If the Foldon Model is operative, it must exert its effects on an ensemble of microstates within the energy wells of these intermediates, as a pseudoequilibrium prior to the transition to the next state on the folding pathway. For both the Ia and Ibp intermediates, the sequence corresponding to the (βα)4 segment is the core of stability around which adjacent elements of structure condense.
Perspective.
The conservation of the folding mechanism for a pair of IGPS TIM barrels from bacterial and archaeal organisms reflects the conservation of the structures of off- and on-pathway intermediates across evolutionary time. Although the examined sequences are only 30% identical, the active site residues and key elements of the sequence are very highly conserved. The conserved folding elements arose in the LUCA and, in the absence of selective pressure, have become fixed and define this family of TIM barrel proteins. We speculate that other TIM barrel families have conserved but different nucleation sites that are also rich in sequence-local ILV residues [e.g., αTS (28, 42)]. Examination of their sequences might not only provide insight into their early folding events but also be a fingerprint distinguishing various families of this ubiquitous fold.
The de novo design of TIM barrels, a quest for >25 y (43, 44), has thus far relied on tethering identical (βα)1–4 halves (45) or four repeating βαβα units (43). The asymmetry observed in the highly favored aliphatic sequences for (βα)4 in IGPS TIM barrels suggests that design algorithms might benefit from the lessons of nature to achieve efficient and rapid folding while avoiding aggregation.
Materials and Methods
Protein Expression and Purification.
Recombinant TmIGPS with a ∆1–31 deletion, corresponding to helix-00, and a Cys101Ser mutation in the crystal structure [PDB: 1I4N (46)] was expressed in E. coli strain BL21 Codonplus (DE3)RIL. TmIGPS without the His6-tag was purified by using TEV protease and a series of chromatographic steps. TmIGPS purity (>98%) was confirmed by SDS-PAGE and by ESI-MS on a Synapt G2-Si (Waters Corporation, Milford, MA) quadrupole time-of-flight (Q-TOF) ESI mass spectrometer.
Equilibrium and Kinetic Folding Studies with CD and FL Spectroscopy.
CD and tryptophan fluorescence experiments were done for a range of GdnHCl concentrations at pH 7.2 and 25 °C. The buffer in all experiments contained 10 mM potassium phosphate and 10 mM KCl. Far-UV CD data at steady state were collected from 260 nm to 200 nm by using a quartz cuvette of 5 mm pathlength. Three replicate CD spectra were collected and averaged. The equilibrium emission spectra after excitation at 280 nm were collected between 300 and 450 nm at a 1 nm interval and averaged over three traces. Manual mixing was used to initiate slow unfolding and refolding kinetics of TmIGPS. The change in ellipticity as a function of time was monitored at 222 nm in a quartz cuvette of 5 mm pathlength. The time-dependent change in fluorescence emission at 320 nm was measured after excitation at 280 nm. The dead time of the manual mixing experiments was 3 s, and the instrument response time was about 5 s. Fast unfolding and refolding kinetics were monitored with stopped-flow instruments. CD data were collected at 222 nm with a dead time of 5 ms. Stopped-flow fluorescence experiments were performed with a dead time of 2 ms. The excitation wavelength was 280 nm, and the emission was monitored using a 320 nm cutoff filter.
Intact HDX-MS Experiments.
The H-to-D exchange behavior of intact TmIGPS was monitored after equilibration for 9 d at different GdnHCl concentrations. After equilibration, we applied a 1:20 pulse of deuterated D2O/GdnHCl at the same GdnHCl concentration for 10 s at pD 7.2 and 25 °C. The ∼95% deuterated solution was quenched by a 1:5 dilution with 200 mM potassium phosphate on ice to reduce the pH to 2.5. A small volume of ice-cold protonated 7 M GdnHCl at pH 2.5 (0.2% formic acid) was added to the quenched samples so that all samples contained ∼1 M GdnHCl before loading. The 50 µL quenched samples containing ∼620 ng intact TmIGPS were injected manually on a home-built HDX module. Chromatographic separations were performed using a Waters Acquity UPLC fitted with a Waters C4 BEH (300 Å, 1.7 µm, 2.1 mm × 50 mm) column interfaced to a Waters Synapt G2-Si ESI mass spectrometer operating in the positive ion electrospray mode. Three blank LC-MS runs with 50% isopropanol injection were used to minimize the carryover between TmIGPS samples. OriginPro and Savuka software were used for manual data analysis.
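The labeling and GdnHCl-adjustment steps above involve simple mixing arithmetic, which the following Python sketch makes explicit. It is only an illustration under stated assumptions, not the procedure used in this study: the function names are hypothetical, "1:N" is read as one part sample plus N parts labeling buffer, the labeling buffer is taken as fully deuterated, and volumes are assumed to be additive.

```python
def label_fraction(sample_parts: float, label_parts: float) -> float:
    """Fraction of labeling buffer in the mixture after a pulse.

    Assumes "1:N" means 1 part sample + N parts fully deuterated buffer.
    """
    return label_parts / (sample_parts + label_parts)


def stock_volume_for_target(current_conc: float, current_vol: float,
                            stock_conc: float, target_conc: float) -> float:
    """Volume of denaturant stock needed to raise a sample to target_conc.

    Solves (c*V + s*x) / (V + x) = t for x, assuming additive volumes.
    Concentrations share one unit (e.g., M); volumes share one unit (e.g., µL).
    """
    if not (current_conc <= target_conc < stock_conc):
        raise ValueError("target must lie between current and stock concentrations")
    return current_vol * (target_conc - current_conc) / (stock_conc - target_conc)


if __name__ == "__main__":
    # A 1:20 pulse under the stated reading of the ratio
    print(f"deuterated fraction after 1:20 pulse: {label_fraction(1, 20):.3f}")
    # Hypothetical example: 60 µL at 0.5 M GdnHCl adjusted to 1 M with 7 M stock
    print(f"7 M stock to add: {stock_volume_for_target(0.5, 60.0, 7.0, 1.0):.1f} µL")
```

Under this reading, a 1:20 pulse gives a deuterated fraction of 20/21, consistent with the ∼95% stated above; reading "1:20" as 1 part in 20 total gives 19/20, which is also ∼95%.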
Peptide Level HDX-MS Experiments.
The GdnHCl equilibration for 9 d, followed by the pulse-labeling (1:12) and quenching (1:5) steps, was similar to that in the intact-level experiments. Quenched samples (50 µL) containing ∼800 ng intact TmIGPS were injected manually on a home-built HDX module, where TmIGPS was digested in a cooled online immobilized pepsin column (Waters Enzymate BEH 300 Å, 5 µm, 2.1 mm × 30 mm). Cleaved peptides were trapped on a Waters C18 BEH VanGuard precolumn (300 Å, 1.7 µm, 2.1 mm × 5 mm) and separated using a Waters C18 BEH (300 Å, 1.7 µm, 1 mm × 100 mm) column using the Waters Acquity UPLC-Synapt G2-Si interface as described above. Three blank LC-MS runs with 50% isopropanol injection were used to minimize the carryover between TmIGPS samples. The generation of the peptide list was automated (Waters PLGS), while the search, validation, and fitting of peptides in HDX-MS experiments were semiautomated [ExMS2 (47) and HX-Express (48)].
Bioinformatics Analysis of IGPS Family of TIM Barrels.
IGPS amino acid sequences were downloaded from the Pfam database (49) (id: PF00218), and sequences with >95% identity were culled. Sequences from each superkingdom were aligned separately. The gaps were removed from each sequence, and hydrophobicity was calculated on the Kyte-Doolittle scale with a 5-residue rolling window. The hydrophobicity values were plotted for positions in the alignment that correspond to the reference sequence used for each superkingdom. The mean hydrophobicity for each position was calculated and plotted along with the SD. Crystal structures of TmIGPS [PDB: 1I4N (46)] and SsIGPS [PDB: 2C3Z (50)] were used as references to follow the secondary structures for bacterial and archaeal sequences. Sequence logos were generated from the aligned sequences using the online server WebLogo3 (51). Secondary structures were predicted by the online server JPred4 (52).
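The rolling-window hydrophobicity calculation described above can be sketched in Python as follows. This is a minimal illustration rather than the script used in this study: the function name `kd_profile` is hypothetical, the window is averaged (an assumption, since the text does not state whether a mean or sum was used), and only windows that fit fully within the sequence are scored. The hydropathy values are those of the Kyte-Doolittle scale (23).

```python
from statistics import mean

# Kyte-Doolittle hydropathy values for the 20 standard amino acids (ref. 23)
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def kd_profile(aligned_seq: str, window: int = 5) -> list[float]:
    """Mean Kyte-Doolittle hydropathy for each full window of a sequence.

    Alignment gaps ("-") are removed first. Nonstandard residue codes
    raise KeyError. Returns an empty list if the ungapped sequence is
    shorter than the window.
    """
    seq = aligned_seq.replace("-", "").upper()
    scores = [KD[aa] for aa in seq]
    return [mean(scores[i:i + window])
            for i in range(len(scores) - window + 1)]


if __name__ == "__main__":
    # Illustrative gapped fragment containing the V/ILLI motif
    for value in kd_profile("GAD-VILLIA"):
        print(f"{value:.2f}")
```

Per-position means and SDs across a superkingdom would then be computed over these profiles after mapping each window back to the corresponding position of the reference sequence, a step not shown here.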
Acknowledgments
We thank all members of the laboratory of C.R.M. for helpful discussions. We thank Zhong-Yuan Kan and S. Walter Englander, University of Pennsylvania, Philadelphia, PA, USA, for help with the ExMS2 software. We thank Lizz Bartlett, Biophysical Characterization Facility, University of Massachusetts, Amherst, MA, USA, for the use of the stopped-flow CD spectrophotometer. This work was supported by NSF Grant MCB 1517888 (to C.R.M.) and NIH Grant GM23303 (to C.R.M.).
Footnotes
The authors declare no competing interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2019571118/-/DCSupplemental.
Data Availability
All study data are included in the article and/or supporting information.
References
- 1. Baldwin R. L., Intermediates in protein folding reactions and the mechanism of protein folding. Annu. Rev. Biochem. 44, 453–475 (1975).
- 2. Scott K. A., Daggett V., Folding mechanisms of proteins with high sequence identity but different folds. Biochemistry 46, 1545–1556 (2007).
- 3. Newton M. S., Arcus V. L., Gerth M. L., Patrick W. M., Enzyme evolution: Innovation is easy, optimization is complicated. Curr. Opin. Struct. Biol. 48, 110–116 (2018).
- 4. Nickson A. A., Clarke J., What lessons can be learned from studying the folding of homologous proteins? Methods 52, 38–50 (2010).
- 5. Smock R. G., Yadid I., Dym O., Clarke J., Tawfik D. S., De novo evolutionary emergence of a symmetrical protein is shaped by folding constraints. Cell 164, 476–486 (2016).
- 6. Brändén C.-I., The TIM barrel—the most frequently occurring folding motif in proteins. Curr. Opin. Struct. Biol. 1, 978–983 (1991).
- 7. Nagano N., Orengo C. A., Thornton J. M., One fold with many functions: The evolutionary relationships between TIM barrel families based on their sequences, structures and functions. J. Mol. Biol. 321, 741–765 (2002).
- 8. Carstensen L., et al., Conservation of the folding mechanism between designed primordial (βα)8-barrel proteins and their modern descendant. J. Am. Chem. Soc. 134, 12786–12791 (2012).
- 9. Goldman A. D., Samudrala R., Baross J. A., The evolution and functional repertoire of translation proteins following the origin of life. Biol. Direct 5, 15 (2010).
- 10. Reisinger B., et al., Evidence for the existence of elaborate enzyme complexes in the Paleoarchean era. J. Am. Chem. Soc. 136, 122–129 (2014).
- 11. Forsyth W. R., Bilsel O., Gu Z., Matthews C. R., Topology and sequence in the folding of a TIM barrel protein: Global analysis highlights partitioning between transient off-pathway and stable on-pathway folding intermediates in the complex folding mechanism of a (betaalpha)8 barrel of unknown function from B. subtilis. J. Mol. Biol. 372, 236–253 (2007).
- 12. Bilsel O., Zitzewitz J. A., Bowers K. E., Matthews C. R., Folding mechanism of the alpha-subunit of tryptophan synthase, an alpha/beta barrel protein: Global analysis highlights the interconversion of multiple native, intermediate, and unfolded forms through parallel channels. Biochemistry 38, 1018–1029 (1999).
- 13. Forsyth W. R., Matthews C. R., Folding mechanism of indole-3-glycerol phosphate synthase from Sulfolobus solfataricus: A test of the conservation of folding mechanisms hypothesis in (beta(alpha))(8) barrels. J. Mol. Biol. 320, 1119–1133 (2002).
- 14. Gangadhara B. N., Laine J. M., Kathuria S. V., Massi F., Matthews C. R., Clusters of branched aliphatic side chains serve as cores of stability in the native state of the HisF TIM barrel protein. J. Mol. Biol. 425, 1065–1081 (2013).
- 15. Gu Z., Zitzewitz J. A., Matthews C. R., Mapping the structure of folding cores in TIM barrel proteins by hydrogen exchange mass spectrometry: The roles of motif and sequence for the indole-3-glycerol phosphate synthase from Sulfolobus solfataricus. J. Mol. Biol. 368, 582–594 (2007).
- 16. Gu Z., Rao M. K., Forsyth W. R., Finke J. M., Matthews C. R., Structural analysis of kinetic folding intermediates for a TIM barrel protein, indole-3-glycerol phosphate synthase, by hydrogen exchange mass spectrometry and Gō model simulation. J. Mol. Biol. 374, 528–546 (2007).
- 17. Chan Y. H., Venev S. V., Zeldovich K. B., Matthews C. R., Correlation of fitness landscapes from three orthologous TIM barrels originates from sequence and structure constraints. Nat. Commun. 8, 14614 (2017).
- 18. Povolotskaya I. S., Kondrashov F. A., Sequence space and the ongoing expansion of the protein universe. Nature 465, 922–926 (2010).
- 19. Worth C. L., Gong S., Blundell T. L., Structural and functional constraints in the evolution of protein families. Nat. Rev. Mol. Cell Biol. 10, 709–720 (2009).
- 20. Bai Y., Milne J. S., Mayne L., Englander S. W., Primary structure effects on peptide group hydrogen exchange. Proteins 17, 75–86 (1993).
- 21. Yang X., Kathuria S. V., Vadrevu R., Matthews C. R., Betaalpha-hairpin clamps brace betaalphabeta modules and can make substantive contributions to the stability of TIM barrel proteins. PLoS One 4, e7179 (2009).
- 22. Yang X., Vadrevu R., Wu Y., Matthews C. R., Long-range side-chain-main-chain interactions play crucial roles in stabilizing the (betaalpha)8 barrel motif of the alpha subunit of tryptophan synthase. Protein Sci. 16, 1398–1409 (2007).
- 23. Kyte J., Doolittle R. F., A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157, 105–132 (1982).
- 24. Kathuria S. V., Chan Y. H., Nobrega R. P., Ozen A., Matthews C. R., Clusters of isoleucine, leucine, and valine side chains define cores of stability in high-energy states of globular proteins: Sequence determinants of structure and stability. Protein Sci. 25, 662–675 (2016).
- 25. Radzicka A., Wolfenden R., Comparing the polarities of the amino acids: Side-chain distribution coefficients between the vapor phase, cyclohexane, 1-octanol, and neutral aqueous solution. Biochemistry 27, 1664–1670 (1988).
- 26. Okabe T., Tsukamoto S., Fujiwara K., Shibayama N., Ikeguchi M., Delineation of solution burst-phase protein folding events by encapsulating the proteins in silica gels. Biochemistry 53, 3858–3866 (2014).
- 27. Kuwajima K., The molten globule, and two-state vs. non-two-state folding of globular proteins. Biomolecules 10, 407 (2020).
- 28. Vadrevu R., Falzone C. J., Matthews C. R., Partial NMR assignments and secondary structure mapping of the isolated alpha subunit of Escherichia coli tryptophan synthase, a 29-kD TIM barrel protein. Protein Sci. 12, 185–191 (2003).
- 29. Hu W., et al., Stepwise protein folding at near amino acid resolution by hydrogen exchange and mass spectrometry. Proc. Natl. Acad. Sci. U.S.A. 110, 7684–7689 (2013).
- 30. Rosen L. E., Kathuria S. V., Matthews C. R., Bilsel O., Marqusee S., Non-native structure appears in microseconds during the folding of E. coli RNase H. J. Mol. Biol. 427, 443–453 (2015).
- 31. Kuwajima K., Yamaya H., Sugai S., The burst-phase intermediate in the refolding of beta-lactoglobulin studied by stopped-flow circular dichroism and absorption spectroscopy. J. Mol. Biol. 264, 806–822 (1996).
- 32. Kuwajima K., Yamaya H., Miwa S., Sugai S., Nagamura T., Rapid formation of secondary structure framework in protein folding studied by stopped-flow circular dichroism. FEBS Lett. 221, 115–118 (1987).
- 33. Kuwata K., et al., Structural and kinetic characterization of early folding events in beta-lactoglobulin. Nat. Struct. Biol. 8, 151–155 (2001).
- 34. Riba A., et al., Protein synthesis rates and ribosome occupancies reveal determinants of translation elongation rates. Proc. Natl. Acad. Sci. U.S.A. 116, 15023–15032 (2019).
- 35. Zwanzig R., Szabo A., Bagchi B., Levinthal’s paradox. Proc. Natl. Acad. Sci. U.S.A. 89, 20–22 (1992).
- 36. Wolynes P. G., Onuchic J. N., Thirumalai D., Navigating the folding routes. Science 267, 1619–1620 (1995).
- 37. Hu W., Kan Z.-Y., Mayne L., Englander S. W., Cytochrome c folds through foldon-dependent native-like intermediates in an ordered pathway. Proc. Natl. Acad. Sci. U.S.A. 113, 3809–3814 (2016).
- 38. Eaton W. A., Wolynes P. G., Theory, simulations, and experiments show that proteins fold by multiple pathways. Proc. Natl. Acad. Sci. U.S.A. 114, E9759–E9760 (2017).
- 39. Englander S. W., Mayne L., The case for defined protein folding pathways. Proc. Natl. Acad. Sci. U.S.A. 114, 8253–8258 (2017).
- 40. Baldwin R. L., Clash between energy landscape theory and foldon-dependent protein folding. Proc. Natl. Acad. Sci. U.S.A. 114, 8442–8443 (2017).
- 41. Englander S. W., Mayne L., The nature of protein folding pathways. Proc. Natl. Acad. Sci. U.S.A. 111, 15873–15880 (2014).
- 42. Wu Y., Vadrevu R., Yang X., Matthews C. R., Specific structure appears at the N terminus in the sub-millisecond folding intermediate of the alpha subunit of tryptophan synthase, a TIM barrel protein. J. Mol. Biol. 351, 445–452 (2005).
- 43. Huang P. S., et al., De novo design of a four-fold symmetric TIM-barrel protein with atomic-level accuracy. Nat. Chem. Biol. 12, 29–34 (2016).
- 44. Löffler P., Schmitz S., Hupfeld E., Sterner R., Merkl R., Rosetta:MSF: A modular framework for multi-state computational protein design. PLOS Comput. Biol. 13, e1005600 (2017).
- 45. Höcker B., Lochner A., Seitz T., Claren J., Sterner R., High-resolution crystal structure of an artificial (betaalpha)(8)-barrel protein designed from identical half-barrels. Biochemistry 48, 1145–1147 (2009).
- 46. Knöchel T., Pappenberger A., Jansonius J. N., Kirschner K., The crystal structure of indoleglycerol-phosphate synthase from Thermotoga maritima. Kinetic stabilization by salt bridges. J. Biol. Chem. 277, 8626–8634 (2002).
- 47. Kan Z. Y., Ye X., Skinner J. J., Mayne L., Englander S. W., ExMS2: An integrated solution for hydrogen-deuterium exchange mass spectrometry data analysis. Anal. Chem. 91, 7474–7481 (2019).
- 48. Guttman M., Weis D. D., Engen J. R., Lee K. K., Analysis of overlapped and noisy hydrogen/deuterium exchange mass spectra. J. Am. Soc. Mass Spectrom. 24, 1906–1912 (2013).
- 49. El-Gebali S., et al., The Pfam protein families database in 2019. Nucleic Acids Res. 47, D427–D432 (2019).
- 50. Schneider B., et al., Role of the N-terminal extension of the (betaalpha)8-barrel enzyme indole-3-glycerol phosphate synthase for its fold, stability, and catalytic activity. Biochemistry 44, 16405–16412 (2005).
- 51. Crooks G. E., Hon G., Chandonia J. M., Brenner S. E., WebLogo: A sequence logo generator. Genome Res. 14, 1188–1190 (2004).
- 52. Drozdetskiy A., Cole C., Procter J., Barton G. J., JPred4: A protein secondary structure prediction server. Nucleic Acids Res. 43, W389–W394 (2015).
- 53. Matthews C. R., Effect of point mutations on the folding of globular proteins. Methods Enzymol. 154, 498–511 (1987).