Highlights
• In-depth sequence analysis reveals that the protein fold universe is more evolutionarily connected than previously assumed.
• Short ancestral fragments are observed to have propagated to many modern proteins and hint at possible evolutionary pathways.
• Experimental reconstruction of such events by chimeragenesis and directed evolution allows evolutionary relationships to be tested.
• Detailed knowledge of folding landscapes helps to understand evolutionary history and improve protein engineering.
• The ubiquitous and versatile TIM-barrel fold is a model system to explore evolution, folding, and design.
Abstract
Proteins are chief actors in life that perform a myriad of exquisite functions. This diversity has been enabled through the evolution and diversification of protein folds. Analysis of sequences and structures strongly suggests that numerous protein pieces have been reused as building blocks and propagated to many modern folds. This information can be traced to understand how the protein world has diversified. In this review, we discuss the latest advances in the analysis of protein evolutionary units, and we use as a model system one of the most abundant and versatile topologies, the TIM-barrel fold, to highlight the existing common principles that interconnect protein evolution, structure, folding, function, and design.
Current Opinion in Structural Biology 2021, 68:94–104
This review comes from a themed issue on Sequences and topology
Edited by Nir Ben-Tal and Andrei N Lupas
For a complete overview see the Issue and the Editorial
Available online 13th January 2021
https://doi.org/10.1016/j.sbi.2020.12.007
0959-440X/© 2020 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Introduction
Structural and functional diversity in modern proteins is the result of diversification and optimization processes over the course of evolution. Studying these processes is useful to evaluate how different molecular mechanisms, like duplication and recombination, shape biophysical properties in proteins. Sequence and structural analyses suggest that numerous protein pieces, considered as evolutionary units, have been reused and combined to create higher complexity. In this context, what are the reasons for the recurring success of some of these units? What is their role in protein fold diversification? And how can we use the accumulated information to further our protein design goals?
In this review, we try to unravel these mysteries by integrating different perspectives and approaches (Figure 1). We first discuss the current views of evolutionary units (Section ‘Current views of evolutionary units’). Then, we use the TIM-barrel fold as a model system to analyze how our knowledge of the protein-based world is enhanced by the integration of evolutionary analysis (Section ‘Evolutionary events: fragments and natural TIM-barrel proteins’), experimental recreation of evolutionary events (Section ‘Recreating evolutionary events in the lab: chimeragenesis and directed evolution’), folding-function-fitness studies (Section ‘Three f determinants in TIM-barrel evolution: folding, function, and fitness’), and protein design approaches (Section ‘Learning from nature towards protein design’). We illustrate how these studies pave the way to a detailed description of existing structure-folding-function-fitness relationships and also boost the design of new proteins with novel molecular properties.
Figure 1.
Schematic overview of the relationships between protein fold evolution, experimental characterization, and design approaches discussed in this review. The upper part of the figure shows how evolutionary units are reused through different molecular mechanisms to diversify protein folds. Experimental reconstruction of different evolutionary pathways and the analysis of folding, function, and fitness determinants in evolution increase our knowledge of the protein-based world and allow navigating from Nature to protein design as shown in the bottom part.
Current views of evolutionary units
Look at any protein and you are bound to find pieces that appear to have been reused either in different proteins or as the modules in a repeat protein. Clearly, reuse of sequences is ubiquitous within the natural fold space, as was suggested early on [1,2]. For protein scientists this raises the question: how many of these pieces are there and what makes them so successful?
The structural annotation of proteins typically includes consulting at least one of the major databases SCOP, CATH or ECOD [3, 4, 5] to append additional information on evolutionary relationships. Molecular evolution studies have shown that different forces and mechanisms such as mutations, duplications, recombinations, deletions, and circular permutations drive the diversification of the protein-based world [6,7]. These mechanisms also hold true for events in the subdomain regime.
In recent years there have been several approaches to define subdomain units as distinguishable building blocks (Figure 2). For example, an evolutionary relationship between the TIM-barrel and flavodoxin-like folds based on a 40-residue fragment was identified by sequence searches [8]. In a large-scale approach, Alva et al. identified and defined the reuse of elements within all modern proteins [9]. They generated a vocabulary of 40 subdomain fragments of up to 38 residues, which occur within a great number of different folds. Subsequent efforts to expand on these initial fragments led to the description of themes – reused fragments of at least 35 residues [10••]. A theme is defined whenever a sensitive sequence search using HHsearch suggests remote homology.
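The length criterion described above can be sketched as a simple filter over pairwise search hits. This is only an illustration: the `Hit` record, the probability cutoff, and the restriction to hits between different folds are assumptions made here, not the published pipeline, and the fold labels are placeholders rather than real search results.

```python
from dataclasses import dataclass
from typing import List

# Length cutoff quoted in the text for themes.
MIN_THEME_LENGTH = 35
# ASSUMPTION: illustrative probability cutoff; the review does not state one.
MIN_PROBABILITY = 70.0


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one profile-profile alignment between two domains."""
    query_fold: str
    target_fold: str
    aligned_length: int   # number of aligned residues
    probability: float    # homology probability in percent (0-100)


def theme_candidates(hits: List[Hit]) -> List[Hit]:
    """Keep long, confident hits that connect two different folds."""
    return [
        h for h in hits
        if h.aligned_length >= MIN_THEME_LENGTH
        and h.probability >= MIN_PROBABILITY
        and h.query_fold != h.target_fold
    ]


# Toy input with placeholder fold labels (not real search results).
toy_hits = [
    Hit("fold_A", "fold_B", 40, 92.0),   # long, confident, cross-fold
    Hit("fold_A", "fold_B", 20, 95.0),   # too short
    Hit("fold_A", "fold_C", 50, 30.0),   # low probability
    Hit("fold_B", "fold_B", 60, 99.0),   # same fold, not cross-fold reuse
]
kept = theme_candidates(toy_hits)
```

Only the first toy hit passes all three conditions; in practice the cutoffs would have to be taken from the respective publications.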
Figure 2.
Current subdomain classification approaches. Shown is the generation of available subdomain databases including the different input, data processing, and final output. While Fragment/Themes are continuous sequences and are defined by HMM-profile comparisons and structural alignments, TERMs are non-continuous and focus on contact maps for classification. In contrast, EFLs combine information from structure, sequence and function, but are limited by existing annotation of functional sites.
Along the same lines, Ferruz et al. expanded the fragment universe applying a set of filters to ensure the fragments are related, but not restricting their length [11••]. This generated a dataset of over eight million hits, which are summarized in the Fuzzle database (https://fuzzle.uni-bayreuth.de). When the dataset is visualized as a network, a major component is observed that includes many hits between folds thought to be ancestral, reinforcing earlier observations on different datasets [12,13]. This might hint not only at a common evolutionary history, but also at the existence of a favorable set of rules for protein folding, function, and fitness.
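To make the notion of a major component concrete, the following minimal sketch treats folds as nodes and shared-fragment hits as edges, then finds the connected components by breadth-first search. The edge list uses placeholder labels and is not taken from Fuzzle; the function name is introduced here for illustration only.

```python
from collections import defaultdict, deque
from typing import Iterable, List, Set, Tuple


def connected_components(edges: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Return the connected components of an undirected graph, largest first."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)

    seen: Set[str] = set()
    components: List[Set[str]] = []
    for start in graph:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Placeholder fold-fold hits (hypothetical, for illustration only).
toy_edges = [
    ("fold_A", "fold_B"),
    ("fold_B", "fold_C"),
    ("fold_C", "fold_D"),
    ("fold_E", "fold_F"),
]
components = connected_components(toy_edges)
major_component = components[0]
```

In this toy network the major component links four of the six folds, while the remaining two form a separate, smaller component.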
Another description by Berezovsky defines elementary functional loops (EFLs) [14]. These EFLs describe stretches of proteins with a specific sequence profile thought to be defined by the polymer nature of the polypeptide as reviewed recently [15]. Combining this with information on the conservation of structure and function indicates which elements might have proven successful in a primordial peptide stage of evolution. This concept has been employed, for example, in the nucleotide binding database (NBDB), which contains EFLs involved in binding nucleotide-containing ligands [16]. Phosphate binding signatures obtained from this database were applied in the design of a P-loop protein, testing the role of polymer physics in the emergence of basic units of proteins [17].
A fourth view, which focuses on protein fold space rather than on the evolutionary aspect, is that of tertiary structural motifs (TERMs) [18]. TERMs are 5–56 residue-long, discontinuous structural entities that are generated solely by comparing their environment. While TERMs focus primarily on conserved structural environments, a comparison of motifs generated by simulated evolution on TERMs and those of their natural counterparts showed that TERMs were able to accurately describe nature-like sequence variation.
These examples, whether based on structural information alone, on sensitive in-depth sequence analysis, or on a combination of the two, point to one conclusion: there is a subset of successful ancestral sequences that have been propagated to many modern folds and persist to this day.
Evolutionary events: fragments and natural TIM-barrel proteins
The previous section showed that, even after a considerable timespan, we can detect evolutionary relationships in modern proteins. Can we decode the underlying mechanisms of conservation of subdomain fragments in natural proteins? This general question has been explored by analyzing the evolution of different protein folds, for which the TIM barrel is a model system (Figure 3). This fold is regarded as one of the oldest and encompasses a wide variety of known protein functions [19, 20, 21]. Its canonical fold consists of a central eight-stranded, parallel β-barrel surrounded by eight α-helices forming the eponymous (βα)8-barrel structure. It has previously been shown that subdomain parts of the TIM-barrel fold are an excellent model both to probe the role of subdomain events and to explore the evolution of the fold [22].
Figure 3.
Summary of recent central studies that interconnect the evolution, its experimental reconstruction, folding, and design of TIM-barrel proteins as discussed in this review.
In a recent endeavor, Kadamuri et al. theorized that a set of βαβ sequences exists within the TIM-barrel fold-space, which would be autonomously folding units [23]. While there are not yet any reports of natural βαβ motifs folding in isolation, investigating the subdomain folding regime in TIM barrels might reveal crucial steps to improve the creation of novel proteins and help elucidate the evolution of protein domains themselves.
A study by Michalska et al. on the structural flexibility of naturally occurring TIM barrels reported a 3D-domain swap of an (αβ)2 element within a tryptophan synthase structure [24]. A similar event has recently been observed in a crystal structure of the archaeal chemotaxis protein CheY [25]. An analysis of alternative splicing events of (βα)8 barrels within the human genome also showed that a considerable fraction are expressed only as subdomains, which are thought to assemble into a complete barrel with their complementary partners [26]. These observations hint at a flexible subdomain composition within α/β proteins. This concept has been explored experimentally, as discussed further in Section ‘Recreating evolutionary events in the lab: chimeragenesis and directed evolution’.
When Prakash and Bateman analyzed the variation of TIM-barrel domain boundaries, they found what they propose to be domain atrophy [27]. This rare event is characterized by a loss of core secondary structure features that is potentially detrimental to domain stability. While it is still not clear why such events become evolutionarily fixed, a possible rescue of stability appears to be the formation of protein-protein interactions, for example, in homodimers.
All these examples of subdomain evolutionary events in the TIM-barrel fold point to one thing: some proteins have a propensity to swap subdomain elements. To really gauge whether this subdomain recombination played, or still plays, an important role in the diversification of proteins, more protein folds need to be examined. Understanding the common principles that govern this process could help improve our knowledge of protein stability, folding, function and evolution.
Recreating evolutionary events in the lab: chimeragenesis and directed evolution
The enormous diversity of protein structures and functions can be interpreted as the result of a massive experiment that has been carried out by Nature in a sustained way for millions of years, whose results are observed in the broad number of protein sequences and structures. In the previous section, we discussed that diverse evolutionary events in natural proteins allow the expansion of the protein fold space. Now, we focus on how some of these evolutionary events can be recreated in the laboratory through chimeragenesis and directed evolution. Both approaches offer a good alternative to test evolutionary and thermodynamic hypotheses and also to generate novel proteins (Figure 3).
Newton et al. explored the evolution of the TIM-barrel enzyme HisA using directed evolution techniques [28••]. They follow up on the innovation-amplification-divergence model previously proposed as an explanation of how gene duplication leads to proteins with new functions [29]. They show how beneficial substitutions selected during real-time evolution can result in manifold changes in enzyme function and bacterial fitness. The results emphasize the importance of loop mutability and confirm the TIM barrel as an inherently evolvable protein scaffold.
The current evolutionary hypothesis about the emergence of the TIM-barrel fold is that it evolved from duplication and fusion events of a half barrel, that is, a (βα)4, or even smaller units [30, 31, 32, 33]. This possible pathway has been tested computationally and experimentally by analyzing sequence, structural, and folding properties [32,34,35]. Following this idea, Sharma et al. engineered and characterized active and stable chimeric TIM barrels of two distantly related glycosyl hydrolases, demonstrating that half-barrel domains from different sources can assemble and adopt the pre-evolved function [36]. Likewise, Almeida et al. tested the idea that (βα)4 halves are self-contained evolutionary units, independent of their size and internal symmetry. They introduced mutations in the inter-half contacts of a β-glucosidase to obtain independent half barrels that unfold cooperatively [37]. Further, Wang et al. identified physicochemical properties from a set of non-redundant TIM-barrel proteins that strongly support the existence of recurring βα and αβ motifs in this fold [38]. In addition, using a conserved αβα element as a recombination site, they created a chimeric protein from two different TIM barrels, highlighting the potential of recurring motifs as naturally optimized interfaces to engineer well-folded chimeras.
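At the sequence level, the two operations discussed above, duplication and fusion of a half barrel and recombination of halves from different parents, reduce to simple string manipulations. The sketch below is a toy illustration only: the function names and the placeholder strings are invented here, real crossover points would have to be chosen from a sequence or structure alignment, and nothing in the code says whether the resulting protein would fold.

```python
def duplicate_and_fuse(half: str, linker: str = "") -> str:
    """Model gene duplication followed by fusion: two copies joined by an optional linker."""
    return half + linker + half


def recombine(n_donor: str, c_donor: str, n_end: int, c_start: int) -> str:
    """Build a chimera from the N-terminal part of one parent and the C-terminal part of another.

    n_end   : number of residues taken from the start of n_donor
    c_start : index in c_donor where the C-terminal part begins
    """
    if not 0 <= n_end <= len(n_donor):
        raise ValueError("n_end lies outside n_donor")
    if not 0 <= c_start <= len(c_donor):
        raise ValueError("c_start lies outside c_donor")
    return n_donor[:n_end] + c_donor[c_start:]


# Placeholder strings, not real protein sequences: upper case marks the
# first half of each parent, lower case the second half.
parent_a = "AAAAaaaa"
parent_b = "BBBBbbbb"

fused = duplicate_and_fuse(parent_a[:4], linker="GG")
chimera = recombine(parent_a, parent_b, n_end=4, c_start=4)
```

Here `fused` models a full barrel built from two copies of the same half, and `chimera` combines the first half of one parent with the second half of the other.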
Inspired by TIM-barrel modularity, Lapidoth et al. designed highly active and stable enzymes by creating fragments of structurally conserved sites of two unrelated TIM-barrel families and then assembled them to create a large set of combinatorial backbones [39••]. The reported computational approach mimics natural evolutionary processes such as recombinations, insertions, deletions, and mutations, but it is more radical than these individual events since all of them are applied simultaneously to modify the protein fitness. As will be discussed in the last section (Learning from nature towards protein design), this method could be extended to create new biocatalysts by combining more distantly related families.
Apart from recombination events within a protein fold, recombination of heterologous structural motifs of unrelated folds is possible. Although difficult to detect in Nature, the idea can be tested in the laboratory and might be used to design proteins with novel biophysical properties [21]. In this context, ElGamacy et al. engineered an asymmetric dRP lyase fold fusing two heterologous and unrelated supersecondary structures. After interface optimization the approach generated a stable chimera with high precision to the original design [40•].
Similarly, we have used chimeragenesis in the past to elucidate evolutionary relationships of several α/β folds and design new proteins. Chimeras built combining parts of the flavodoxin-like proteins CheY or NarL with a piece of the TIM barrel HisF demonstrate that (βα)8-barrel proteins can be constructed by recombining a large repertoire of natural protein fragments from distantly related folds [8,41, 42, 43]. This interchangeability offers a great opportunity to retrace early evolutionary steps. Following up on this, Toledo-Patiño et al. found sequence-based evidence that the singleton HemD-like fold emerged from the flavodoxin-like fold [44•]. To test the hypothesized path, consisting of insert-assisted segment swap, gene duplication, and fusion, these evolutionary events were experimentally reverted, yielding well-folded and stable proteins. The results strongly support the emergence of the HemD-like fold from flavodoxin-like proteins and highlight the importance of duplication and fusion as evolutionary events that allow the creation of complex proteins. These experimental reconstructions of possible evolutionary events fit well with the bioinformatic studies on protein fragments as discussed in section ‘Current views of evolutionary units’. Databases such as Fuzzle [11••] provide many starting points for similar evolutionary explorations and open new ways to use already existing sequences in protein design. Fragments identified in Fuzzle can be used directly in the tool Protlego (https://hoecker-lab.github.io/protlego/) for automated chimera design and analysis [45].
Three f determinants in TIM-barrel evolution: folding, function, and fitness
The evolutionary study of biophysical determinants is useful to evaluate the role evolution has on the physical properties of proteins and informs us on how changes in the amino acid sequence shaped function in a specific fold [46]. In this section, we focus on recent advances to understand the biophysical basis underlying the success of the TIM-barrel fold as one of the most robust and versatile scaffolds.
The TIM-barrel fold provides a good architecture to explore how folding mechanisms have been conserved or diverged during evolution (Figure 3). In this context, Halloran et al. analyzed on a molecular level the earliest events in the folding of a TIM-barrel protein [47••]. Experimental and computational approaches revealed that the kinetic intermediate commonly observed in TIM barrels is dominated by a native-like structure in the central region of the sequence. They determined the rate-limiting step in the folding pathway to be the frustration encountered by the competition between the N-terminus or C-terminus to close the internal β-barrel. Also analyzing TIM-barrel proteins, Romero-Romero et al. studied and compared the folding pathway of eukaryotic homologous triosephosphate isomerases. Structural and biophysical analysis suggested that interfacial water molecules and water-mediated interactions could modulate the number of equilibrium intermediates, and therefore, the folding pathway in this enzyme family [48].
TIM-barrel proteins are notable for their diversity in catalytic activities. The broad presence of this topology in different enzymes has led to the assumption that the TIM-barrel fold played a central role in early evolution of catalysis. In a bioinformatic study, Goldman et al. showed by comparing the functional diversity of different protein folds that TIM-barrel proteins use the broadest range of enzymatic cofactors, including some putatively ancient cofactors [49•]. This supports the idea that the TIM barrel represented an ideal scaffold to facilitate the transition from ribozymes, peptides, and abiotic catalysts to modern protein-mediated metabolism.
Likewise, in terms of protein flexibility and enzymatic catalysis, Richard recently discussed why the selection and optimization of protein folds with multiple flexible loops, such as the TIM-barrel topology, is favored during enzyme evolution [50•]. He proposes that in TIM barrels the exploration of many different conformations during loop movement provides a potential starting point for the evolution of a new enzyme activity and allows the conformational changes needed in floppy enzymes. Also related to protein flexibility, but in the context of stability and evolution, Quezada et al. analyzed the molecular basis of the kinetic stability differences of two related triosephosphate isomerases and engineered new functional TIM-barrel enzymes with fine-tuned stabilities [51,52•]. They found a correlation between thermal flexibility and kinetic stability, suggesting how evolution has reached a balance between function and stability in cell-relevant timescales.
The evolution of protein folding, function, and fitness can be seen as a walk through sequence space, in the same way as was described 50 years ago by evolutionary biologist John Maynard Smith in his seminal work about natural selection and the concept of protein space [53]. Generally, each of these steps can be evaluated in terms of protein fitness, a measure of the effect that a property produces on the overall fitness of an organism. Following this logic, in two subsequent works the Matthews lab performed a quantitative description of the fitness landscape of distant orthologous TIM-barrel proteins to understand their evolutionary dynamics [54••,55]. They detected that the fitness landscapes are correlated and influenced by long-range epistatic interactions, and that these landscapes can be translocated in sequence space as a result of TIM-barrel fold plasticity.
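One common way to express epistasis between two substitutions is the deviation of the double mutant from the additive expectation of the two single mutants. The sketch below uses that convention as an assumption; the review does not specify how epistasis was quantified in the cited studies, and the fitness values are invented for illustration.

```python
def pairwise_epistasis(f_wt: float, f_a: float, f_b: float, f_ab: float) -> float:
    """Epistasis under an additive model (an assumed convention).

    Returns the observed effect of the double mutant minus the sum of the
    two single-mutant effects, all measured relative to the wild type.
    Zero means the substitutions act independently; a negative value means
    the double mutant is less fit than expected.
    """
    observed_double_effect = f_ab - f_wt
    expected_additive_effect = (f_a - f_wt) + (f_b - f_wt)
    return observed_double_effect - expected_additive_effect


# Invented fitness values on an arbitrary scale (wild type = 1.0).
epsilon = pairwise_epistasis(f_wt=1.0, f_a=0.8, f_b=0.9, f_ab=0.5)
no_interaction = pairwise_epistasis(f_wt=1.0, f_a=0.8, f_b=0.9, f_ab=0.7)
```

With these toy numbers the first double mutant is about 0.2 fitness units worse than the additive expectation, whereas the second matches it up to floating-point rounding. Fitness is often analyzed on a logarithmic scale, in which case the same formula applies to log-transformed values.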
The three f determinants in evolution discussed in this section have also been analyzed in other protein folds. Examples from recent years include discussions between the Makhatadze and Sanchez-Ruiz labs about the evolutionary validity of the minimal frustration hypothesis through the experimental characterization of ancestrally reconstructed proteins and extant homologous members of the thioredoxin family [56, 57, 58]. Also involving α/β proteins, Kukic et al. explored how the folding rates of Procarboxypeptidase A2 can be modulated during evolution by modification of the so-called nucleation-condensation mechanism [59]. Moreover, the Marqusee lab has made a substantial effort to understand how evolutionary pressures modify folding landscapes and tune kinetic and thermodynamic stability by characterizing one of the oldest protein folds, the RNase H-like superfamily [60, 61, 62, 63]. Other interesting works are the analysis of the influence of folding energies on the fitness of β-lactamases [64], the study of protein folding and fitness landscapes of amidases [65], the analysis of cotranslational folding and fitness of an integral membrane protein [66], and the evolutionary history of myoglobins [67]. The information obtained both on TIM barrels and other folds has revealed unanticipated details in protein molecular evolution, thereby increasing our understanding of sequence-folding-fitness relationships, which also has relevant implications for protein design.
Learning from nature towards protein design
In the previous sections we discussed the evolution of protein folds from smaller units and provided examples recreating such evolutionary events with respect to folding, function, and fitness. Just as protein engineering has been used to test evolutionary hypotheses, the gained knowledge can also be used to design new proteins. Initial protein design strategies were mostly based on parametrization of well understood folds or supersecondary structures. But in the last decades many powerful algorithms were developed to predict protein structures and design new proteins, as has been recently reviewed [68].
One of the most widely used design software suites, Rosetta, uses 3-residue and 9-residue long fragments from known protein structures to sample the backbone in ab initio predictions [69,70]. Those fragments are much smaller than the previously described evolutionary units [9,10••,11••,14]; however, they can still carry information about possible conformations. Additionally, some algorithms use evolutionary mechanisms as inspiration. The SEWING algorithm, for instance, incorporates the current understanding that proteins emerged by recombination and duplication of smaller fragments: sets of structures meeting predefined requirements are generated by recombination of small structural motifs [71]. The more recently developed program dTERMen uses the previously described TERMs by matching them to the target design and thereby determines sequence preferences [72]. Also, the approach from Lapidoth et al. mentioned previously is inspired by Nature and mimics evolution during the design process [39••]. The fully automated method combines recombination, insertion, deletion, and mutation events in a non-sequential manner. Initially a predefined set of structures is partitioned and then assembled into combinatorial backbones, which are finally subjected to a complete sequence redesign. During this process conserved sites and residues necessary for catalysis or folding can be excluded from the design. In contrast to other enzyme design approaches, it has the advantage that no transition state has to be modelled, a step that is computationally expensive. This method was applied to homologous TIM barrels but could possibly be extended to more distantly related proteins, thereby creating new biocatalysts. While this approach, which is based on existing structures, can diversify enzyme function, it will not create proteins from scratch.
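The partition-and-assemble step described above is, at its core, a Cartesian product over the alternative fragments available at each segment position. The following sketch shows only that enumeration; the function name and segment labels are hypothetical, and the actual method additionally involves backbone modelling and full sequence redesign, which are not represented here.

```python
from itertools import product
from typing import List, Sequence, Tuple


def combinatorial_backbones(segment_options: Sequence[Sequence[str]]) -> List[Tuple[str, ...]]:
    """Enumerate every backbone obtained by picking one fragment per segment position."""
    return [tuple(choice) for choice in product(*segment_options)]


# Hypothetical fragment labels: each inner list holds the alternatives
# available for one structurally conserved segment position.
segment_options = [
    ["famA_seg1", "famB_seg1"],
    ["famA_seg2", "famB_seg2", "famB_seg2_alt"],
    ["famA_seg3", "famB_seg3"],
]
backbones = combinatorial_backbones(segment_options)
```

The number of backbones is the product of the number of alternatives per position (here 2 × 3 × 2 = 12), which explains why even a modest fragment set yields a large combinatorial library.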
The complete de novo design of proteins is a task that has been explored and progressed increasingly in recent years, fueled by technical advances in structure determination, modelling, and computation. A further increase in de novo designed proteins could be observed after Koga et al. defined rules for the design of idealized protein topologies, as recently reviewed [73]. The value of these design rules, which relate foldability of a tertiary structure to the connection between secondary structure elements [74], in combination with improvements in design algorithms can be traced in the design progression of de novo TIM barrels.
Several attempts were made to design a symmetric TIM barrel from scratch to understand what makes this protein fold so successful (Figure 3). In the early 1990s, first symmetric designs were created using statistical information about barrel geometries and amino acid frequencies from the few known TIM-barrel structures [75, 76, 77, 78, 79, 80]. However, those parameters were not sufficient to achieve designs with natural-like properties, as all exhibited molten-globule-like states. With an increasing number of TIM-barrel structures, geometric parameters were improved, and newly emerging algorithms were applied to sequence design and created all-atom models. In this way, the Martial lab was able to improve previous designs and create natural-like proteins [81,82]. Later, the solubility of one of those designs was improved by directed evolution and the three-dimensional structure was determined: it differed from the intended TIM barrel and resembled a Rossmann-like fold [83]. Using the previously described rules for idealized topologies, Nagarajan et al. created four-fold symmetric TIM-barrel backbones [84]. Using folding simulations, they determined hydrogen bond networks and enrichment of polar residues in the pore as important features regarding the folding pathway. Those findings were applied during iterative sequence design and resulted in soluble proteins showing cooperative unfolding transitions, though structural studies indicated a molten globule.
In the meantime, Huang et al. also applied the rules from Koga et al. to design a four-fold symmetric TIM barrel [85•]. Their approach sampled backbones with different secondary structure lengths using predefined geometric restrictions followed by iterative sequence design enforcing sidechain-backbone hydrogen bonds. A circularly permuted variant, sTIM11, was expressed in soluble form and the design was validated by solving its three-dimensional structure. Further analysis revealed a significantly lower conformational stability compared to natural TIM barrels. In a modular approach, a collection of stabilized variants (DeNovoTIMs) was designed by improving hydrophobic packing [86••]. Structural and folding analysis showed that epistatic effects allow navigating an unexplored region of the stability landscape of natural proteins. One of these DeNovoTIMs was already used in a successful recombination with a de novo designed ferredoxin protein and engineered to bind lanthanides [87]. In another recent study, Wiese et al. extended sTIM11 by successfully incorporating a rationally designed small α-helix into a βα loop [88]. These works are first steps towards diversifying and ultimately functionalizing de novo TIM barrels.
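The two sequence-level ideas in this paragraph, four-fold symmetry and circular permutation, can be illustrated with plain string operations. This is a toy sketch with placeholder strings and function names chosen here; it does not reproduce the actual design protocol or the sTIM11 sequence.

```python
def symmetric_repeat(unit: str, copies: int = 4) -> str:
    """Build an idealized symmetric sequence by repeating one unit `copies` times."""
    return unit * copies


def circular_permutation(sequence: str, new_start: int) -> str:
    """Rotate a sequence so that position `new_start` becomes the new N-terminus.

    Conceptually, the original termini are joined and new termini are opened
    at `new_start`; the order of residues around the circle is unchanged.
    """
    if not sequence:
        return sequence
    new_start %= len(sequence)
    return sequence[new_start:] + sequence[:new_start]


# Placeholder repeat unit, not a real (βα)2 sequence.
unit = "ABCD"
barrel = symmetric_repeat(unit)              # four identical quarters
permuted = circular_permutation(barrel, 2)   # termini moved inside a quarter
rotated_by_unit = circular_permutation(barrel, len(unit))
```

For a perfectly symmetric repeat, rotating by exactly one unit returns the same sequence, so only permutation points within a unit produce a distinct variant.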
The progression in the design of a TIM barrel reflects nicely the improvements of protein design in the last 30 years. Throughout all design approaches, a symmetric topology was targeted because, despite rapidly increasing computational resources, the modelling of large proteins is still time-consuming. Further, this process shows how important it is to understand a protein fold in detail and to know which interactions are essential for its stabilization. In this context, it would be interesting to analyze the design from Figueroa et al. [82] in detail and determine why this design acquired a different fold than intended [83]. Such analysis is important to improve our understanding and find deficiencies in current protein design strategies.
Additionally, protein design opens a door not only to increase and test our knowledge about folding, function, and fitness, but also to compare the properties of de novo proteins with naturally occurring ones. In this way, studies have shown that de novo proteins exhibit more complex folding pathways than natural proteins, as indicated for one of the first de novo designed proteins Top7, a βα protein [89]. This differs from natural small proteins which show high cooperativity in folding and a smooth free energy surface. In addition, the study of another small de novo protein Di-III_14, an IF3-like protein, revealed a more complex folding pathway than initially assumed [90••]. In-depth mutational and folding analysis revealed that electrostatic and hydrophobic networks affect the energy surface of this protein. Based on those findings, it was proposed to limit the number of charged amino acids, avoid charge segregation, and use a more diverse set of nonpolar side chains in future protein designs. Overall, these studies demonstrate that as we expand our exploration into sequence space by designing de novo proteins, we also expand our understanding of the molecular and physicochemical determinants that shaped and still modulate the protein-based world.
Conclusion and outlook
The study of protein evolution requires the integrated analysis of protein structure and stability, as well as folding, function, and fitness of proteins. There is clear evidence that modern diverse protein folds evolved via reuse of smaller units, which have been identified and described in recent years. Evolution of protein folds from smaller units via duplication has long been described, but recombination is also increasingly explored as an important mechanism. Understanding how protein diversity could emerge via these mechanisms is essential to learn how stable and functional proteins evolved and might be designed.
The ubiquitous TIM-barrel fold has been used in several studies to investigate its evolution, folding, and design. Explorations of the fold’s evolutionary history and experiments recreating evolutionary events have revealed how recombination of recurring fragments can lead to new proteins and enzymes. These studies go hand in hand with detailed analyses of protein folding and determination of fitness landscapes of TIM barrels. Moreover, this knowledge has already been applied to the design of de novo TIM barrels, illustrating how the connection between evolution, folding, and design closes into a cycle and how analysis of designed proteins can help us understand the biophysical properties of proteins even better. Altogether, these recent studies have significantly increased our understanding of the evolution of sequence-structure-function relationships, enabling us to access new protein space through design.
Conflict of interest statement
Nothing declared.
References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest
Acknowledgements
We gratefully acknowledge financial support from the Alexander von Humboldt Foundation and the Bayer Science & Education Foundation (Humboldt-Bayer Research Fellowship for Postdoctoral Researchers to S.R.R.), from the European Research Council (ERC Consolidator grant 647548 ‘Protein Lego’), and from the Volkswagenstiftung (grant 94747). Figures were created with BioRender.com and PyMOL.
References
- 1. Eck R.V., Dayhoff M.O. Evolution of the structure of ferredoxin based on living relics of primitive amino acid sequences. Science. 1966;152:363–366. doi: 10.1126/science.152.3720.363.
- 2. Fetrow J.S., Godzik A. Function driven protein evolution. A possible proto-protein for the RNA-binding proteins. Pac Symp Biocomput. 1998.
- 3. Andreeva A., Howorth D., Chothia C., Kulesha E., Murzin A.G. SCOP2 prototype: a new approach to protein structure mining. Nucleic Acids Res. 2014;42:D310–D314. doi: 10.1093/nar/gkt1242.
- 4. Sillitoe I., Dawson N., Lewis T.E., Das S., Lees J.G., Ashford P., Tolulope A., Scholes H.M., Senatorov I., Bujan A. CATH: expanding the horizons of structure-based functional annotations for genome sequences. Nucleic Acids Res. 2019;47:D280–D284. doi: 10.1093/nar/gky1097.
- 5. Cheng H., Schaeffer R.D., Liao Y., Kinch L.N., Pei J., Shi S., Kim B.H., Grishin N.V. ECOD: an evolutionary classification of protein domains. PLoS Comput Biol. 2014;10. doi: 10.1371/journal.pcbi.1003926.
- 6. Ohta T. Mechanisms of molecular evolution. Philos Trans R Soc B Biol Sci. 2000;355:1623–1626. doi: 10.1098/rstb.2000.0724.
- 7. Sikosek T., Chan H.S. Biophysics of protein evolution and evolutionary protein biophysics. J R Soc Interface. 2014;11:20140419. doi: 10.1098/rsif.2014.0419.
- 8. Farías-Rico J.A., Schmidt S., Höcker B. Evolutionary relationship of two ancient protein superfolds. Nat Chem Biol. 2014;10:710–715. doi: 10.1038/nchembio.1579.
- 9. Alva V., Söding J., Lupas A.N. A vocabulary of ancient peptides at the origin of folded proteins. eLife. 2015;4. doi: 10.7554/eLife.09410.
- 10••. Nepomnyachiy S., Ben-Tal N., Kolodny R. Complex evolutionary footprints revealed in an analysis of reused protein segments of diverse lengths. Proc Natl Acad Sci U S A. 2017;114:11703–11708. doi: 10.1073/pnas.1707642114. In this bioinformatic analysis the reuse of segments, which are similar in sequence and structure, is described. These segments, called themes, are subdomain elements with reuse traced to the amino acid position, highlighting the impact of protein evolution.
- 11••. Ferruz N., Lobos F., Lemm D., Toledo-Patino S., Farías-Rico J.A., Schmidt S., Höcker B. Identification and analysis of natural building blocks for evolution-guided fragment-based protein design. J Mol Biol. 2020;432:3898–3914. doi: 10.1016/j.jmb.2020.04.013. Following an all-vs-all comparison of protein domain sequences, reused protein fragments were compiled into a network. This customizable network of fragments not only carries evolutionary significance, but can also function as a starting point for protein design by recombination.
- 12. Alva V., Remmert M., Biegert A., Lupas A.N., Söding J. A galaxy of folds. Protein Sci. 2010;19:124–130. doi: 10.1002/pro.297.
- 13. Nepomnyachiy S., Ben-Tal N., Kolodny R. Global view of the protein universe. Proc Natl Acad Sci U S A. 2014;111:11691–11696. doi: 10.1073/pnas.1403395111.
- 14. Berezovsky I.N., Guarnera E., Zheng Z. Basic units of protein structure, folding, and function. Prog Biophys Mol Biol. 2017;128:85–99. doi: 10.1016/j.pbiomolbio.2016.09.009.
- 15. Berezovsky I.N. Towards descriptor of elementary functions for protein design. Curr Opin Struct Biol. 2019;58:159–165. doi: 10.1016/j.sbi.2019.06.010.
- 16. Zheng Z., Goncearenco A., Berezovsky I.N. Nucleotide binding database NBDB - a collection of sequence motifs with specific protein-ligand interactions. Nucleic Acids Res. 2016;44:D301–D307. doi: 10.1093/nar/gkv1124.
- 17. Romero Romero M.L., Yang F., Lin Y.-R., Toth-Petroczy A., Berezovsky I.N., Goncearenco A., Yang W., Wellner A., Kumar-Deshmukh F., Sharon M. Simple yet functional phosphate-loop proteins. Proc Natl Acad Sci U S A. 2018;115:E11943–E11950. doi: 10.1073/pnas.1812400115.
- 18. MacKenzie C.O., Zhou J., Grigoryan G. Tertiary alphabet for the observable protein structural universe. Proc Natl Acad Sci U S A. 2016;113:E7438–E7447. doi: 10.1073/pnas.1607178113.
- 19. Banner D.W., Bloomer A.C., Petsko G.A., Phillips D.C., Pogson C.I., Wilson I.A., Corran P.H., Furth A.J., Milman J.D., Offord R.E. Structure of chicken muscle triose phosphate isomerase determined crystallographically at 2.5 Å resolution: using amino acid sequence data. Nature. 1975;255:609–614. doi: 10.1038/255609a0.
- 20. Nagano N., Orengo C.A., Thornton J.M. One fold with many functions: the evolutionary relationships between TIM barrel families based on their sequences, structures and functions. J Mol Biol. 2002;321:741–765. doi: 10.1016/s0022-2836(02)00649-6.
- 21. Sterner R., Höcker B. Catalytic versatility, stability, and evolution of the (βα)8-barrel enzyme fold. Chem Rev. 2005;105:4038–4055. doi: 10.1021/cr030191z.
- 22. Höcker B. Design of proteins from smaller fragments-learning from evolution. Curr Opin Struct Biol. 2014;27:56–62. doi: 10.1016/j.sbi.2014.04.007.
- 23. Kadamuri R.V., Irukuvajjula S.S., Vadrevu R. βαβ super-secondary motifs: sequence, structural overview, and pursuit of potential autonomously folding βαβ sequences from (β/α)8/TIM barrels. In: Methods in Molecular Biology. Humana Press Inc; 2019. pp. 221–236.
- 24. Michalska K., Kowiel M., Bigelow L., Endres M., Gilski M., Jaskolski M., Joachimiak A. 3D domain swapping in the TIM barrel of the α subunit of Streptococcus pneumoniae tryptophan synthase. Acta Crystallogr Sect D Struct Biol. 2020;76:166–175. doi: 10.1107/S2059798320000212.
- 25. Paithankar K.S., Enderle M., Wirthensohn D.C., Miller A., Schlesner M., Pfeiffer F., Rittner A., Grininger M., Oesterhelt D. Structure of the archaeal chemotaxis protein CheY in a domain-swapped dimeric conformation. Acta Crystallogr Sect F Struct Biol Commun. 2019;75:576–585. doi: 10.1107/S2053230X19010896.
- 26. Ochoa-Leyva A., Montero-Morán G., Saab-Rincón G., Brieba L.G., Soberón X. Alternative splice variants in TIM barrel proteins from human genome correlate with the structural and evolutionary modularity of this versatile protein fold. PLoS One. 2013;8. doi: 10.1371/journal.pone.0070582.
- 27. Prakash A., Bateman A. Domain atrophy creates rare cases of functional partial protein domains. Genome Biol. 2015;16:88. doi: 10.1186/s13059-015-0655-8.
- 28••. Newton M.S., Guo X., Söderholm A., Näsvall J., Lundström P., Andersson D.I., Selmer M., Patrick W.M. Structural and functional innovations in the real-time evolution of new (βα)8 barrel enzymes. Proc Natl Acad Sci U S A. 2017;114:4727–4732. doi: 10.1073/pnas.1618552114. In this work, a real-time evolution analysis is performed to understand how new TIM-barrel enzymes lead to phenotype and organismal fitness changes. The study details the structural and functional innovations of the HisA enzyme towards generalist or specialist activities providing clues about evolution from atomic to whole-organism levels.
- 29. Näsvall J., Sun L., Roth J.R., Andersson D.I. Real-time evolution of new genes by innovation, amplification, and divergence. Science. 2012;338:384–387. doi: 10.1126/science.1226521.
- 30. Lang D., Thoma R., Henn-Sax M., Sterner R., Wilmanns M. Structural evidence for evolution of the β/α barrel scaffold by gene duplication and fusion. Science. 2000;289:1546–1550. doi: 10.1126/science.289.5484.1546.
- 31. Gerlt J.A., Raushel F.M. Evolution of function in (β/α)8-barrel enzymes. Curr Opin Chem Biol. 2003;7:252–264. doi: 10.1016/s1367-5931(03)00019-x.
- 32. Höcker B., Claren J., Sterner R. Mimicking enzyme evolution by generating new (betaalpha)8-barrels from (betaalpha)4-half-barrels. Proc Natl Acad Sci U S A. 2004;101:16448–16453. doi: 10.1073/pnas.0405832101.
- 33. Claren J., Malisi C., Höcker B., Sterner R. Establishing wild-type levels of catalytic activity on natural and artificial (βα)8-barrel protein scaffolds. Proc Natl Acad Sci U S A. 2009;106:3704–3709. doi: 10.1073/pnas.0810342106.
- 34. Höcker B., Beismann-Driemeyer S., Hettwer S., Lustig A., Sterner R. Dissection of a (βα)8-barrel enzyme into two folded halves. Nat Struct Biol. 2001;8:32–36. doi: 10.1038/83021.
- 35. Seitz T., Bocola M., Claren J., Sterner R. Stabilisation of a (βα)8-barrel protein designed from identical half barrels. J Mol Biol. 2007;372:114–129. doi: 10.1016/j.jmb.2007.06.036.
- 36. Sharma P., Kaila P., Guptasarma P. Creation of active TIM barrel enzymes through genetic fusion of half-barrel domain constructs derived from two distantly related glycosyl hydrolases. FEBS J. 2016;283:4340–4356. doi: 10.1111/febs.13927.
- 37. Almeida V.M., Frutuoso M.A., Marana S.R. Search for independent (β/α)4 subdomains in a (β/α)8 barrel β-glucosidase. PLoS One. 2018;13. doi: 10.1371/journal.pone.0191282.
- 38. Wang J.J., Zhang T., Liu R., Song M., Wang J.J., Hong J., Chen Q., Liu H. Recurring sequence-structure motifs in (βα)8-barrel proteins and experimental optimization of a chimeric protein designed based on such motifs. Biochim Biophys Acta - Proteins Proteomics. 2017;1865:165–175. doi: 10.1016/j.bbapap.2016.11.001.
- 39••. Lapidoth G., Khersonsky O., Lipsh R., Dym O., Albeck S., Rogotner S., Fleishman S.J. Highly active enzymes by automated combinatorial backbone assembly and sequence design. Nat Commun. 2018;9. doi: 10.1038/s41467-018-05205-5. A design approach is reported that uses protein fragments to create combinatorial backbones by applying evolutionary mechanisms during the assembly. This is applied to two homologous TIM-barrel families to build enzymes, which either restore or even increase the parental activities.
- 40•. ElGamacy M., Coles M., Lupas A. Asymmetric protein design from conserved supersecondary structures. J Struct Biol. 2018;204:380–387. doi: 10.1016/j.jsb.2018.10.010. The authors explored the combination of heterologous structural motifs as a new potential mechanism in protein fold evolution. Replacement of an αα-hairpin motif from an unrelated family followed by interface optimization led to a stable, well-folded dRP-lyase protein verified by NMR.
- 41. Bharat T.A.M., Eisenbeis S., Zeth K., Höcker B. A βα-barrel built by the combination of fragments from different folds. Proc Natl Acad Sci U S A. 2008;105:9942–9947. doi: 10.1073/pnas.0802202105.
- 42. Eisenbeis S., Proffitt W., Coles M., Truffault V., Shanmugaratnam S., Meiler J., Höcker B. Potential of fragment recombination for rational design of proteins. J Am Chem Soc. 2012;134:4019–4022. doi: 10.1021/ja211657k.
- 43. Shanmugaratnam S., Eisenbeis S., Höcker B. A highly stable protein chimera built from fragments of different folds. Protein Eng Des Sel. 2012;25:699–703. doi: 10.1093/protein/gzs074.
- 44•. Toledo-Patiño S., Chaubey M., Coles M., Höcker B. Reconstructing the remote origins of a fold singleton from a flavodoxin-like ancestor. Biochemistry. 2019;58:4790–4793. doi: 10.1021/acs.biochem.9b00900. A hypothetical evolutionary pathway is reconstructed supporting the emergence of the HemD-like fold from the flavodoxin-like fold. This work exemplifies how protein fold evolution can be reconstructed in the lab using modern day proteins.
- 45. Ferruz N., Noske J., Höcker B. Protlego: a python package for the analysis and design of chimeric proteins. bioRxiv. 2020. doi: 10.1101/2020.10.04.325555.
- 46. Harms M.J., Thornton J.W. Evolutionary biochemistry: revealing the historical and physical causes of protein properties. Nat Rev Genet. 2013;14:559–571. doi: 10.1038/nrg3540.
- 47••. Halloran K.T., Wang Y., Arora K., Chakravarthy S., Irving T.C., Bilsel O., Brooks C.L., Matthews C.R. Frustration and folding of a TIM barrel protein. Proc Natl Acad Sci U S A. 2019;116:16378–16383. doi: 10.1073/pnas.1900880116. This work combines computational simulations with experiments to monitor the folding pathway of indole-3-glycerol phosphate synthase. A common structurally-stable intermediate is formed early in folding and it can be concluded that the rate-limiting step in the folding pathway is the closing of the β-barrel.
- 48. Romero-Romero S., Becerril-Sesín L.A., Costas M., Rodríguez-Romero A., Fernández-Velasco D.A. Structure and conformational stability of the triosephosphate isomerase from Zea mays. Comparison with the chemical unfolding pathways of other eukaryotic TIMs. Arch Biochem Biophys. 2018;658:66–76. doi: 10.1016/j.abb.2018.09.022.
- 49•. Goldman A.D., Beatty J.T., Landweber L.F. The TIM barrel architecture facilitated the early evolution of protein-mediated metabolism. J Mol Evol. 2016;82:17–26. doi: 10.1007/s00239-015-9722-8. This bioinformatic work studies the role of the TIM-barrel fold in early evolution. Analysis of function, cofactor usage, and metabolic pathways suggested that TIM-barrel proteins participated in the transition from non-peptidic catalysis to protein-mediated metabolism.
- 50•. Richard J.P. Protein flexibility and stiffness enable efficient enzymatic catalysis. J Am Chem Soc. 2019;141:3320–3331. doi: 10.1021/jacs.8b10836. The author discusses how protein flexibility is an important factor in TIM-barrel enzymes to find a balance between substrate-binding energy and catalysis. It is suggested that selection of flexible loops provides a starting point for the evolution and divergence of new enzyme activities.
- 51. Quezada A.G., Díaz-Salazar A.J., Cabrera N., Pérez-Montfort R., Piñeiro Á., Costas M. Interplay between protein thermal flexibility and kinetic stability. Structure. 2017;25:167–179. doi: 10.1016/j.str.2016.11.018.
- 52•. Quezada A.G., Cabrera N., Piñeiro Á., Díaz-Salazar A.J., Díaz-Mazariegos S., Romero-Romero S., Pérez-Montfort R., Costas M. A strategy based on thermal flexibility to design triosephosphate isomerase proteins with increased or decreased kinetic stability. Biochem Biophys Res Commun. 2018;503:3017–3022. doi: 10.1016/j.bbrc.2018.08.087. Two closely-related triosephosphate isomerases are used to explore the correlation between protein thermal flexibility and kinetic stability. Based on MD simulations and DSC experiments, the authors designed new functional TIM-barrel enzymes with fine-tuned kinetic stabilities.
- 53. Smith J.M. Natural selection and the concept of a protein space. Nature. 1970;225:563–564. doi: 10.1038/225563a0.
- 54••. Chan Y.H., Venev S.V., Zeldovich K.B., Matthews C.R. Correlation of fitness landscapes from three orthologous TIM barrels originates from sequence and structure constraints. Nat Commun. 2017;8:1–12. doi: 10.1038/ncomms14614. A mutational scanning approach is used to analyze the conservation of the fitness landscapes in three orthologous indole-3-glycerol phosphate synthases. It was found that their fitness landscapes are correlated, influenced by epistasis, and translocate sequence space due to the plasticity of the TIM-barrel fold.
- 55. Chan Y.H., Zeldovich K.B., Matthews C.R. An allosteric pathway explains beneficial fitness in yeast for long-range mutations in an essential TIM barrel enzyme. Protein Sci. 2020;29:1911–1923. doi: 10.1002/pro.3911.
- 56. Tzul F.O., Vasilchuk D., Makhatadze G.I. Evidence for the principle of minimal frustration in the evolution of protein folding landscapes. Proc Natl Acad Sci U S A. 2017;114:E1627–E1632. doi: 10.1073/pnas.1613892114.
- 57. Candel A.M., Romero-Romero M.L., Gamiz-Arco G., Ibarra-Molero B., Sanchez-Ruiz J.M. Fast folding and slow unfolding of a resurrected Precambrian protein. Proc Natl Acad Sci U S A. 2017;114:E4122–E4123. doi: 10.1073/pnas.1703227114.
- 58. Gamiz-Arco G., Risso V.A., Candel A.M., Inglés-Prieto A., Romero-Romero M.L., Gaucher E.A., Gavira J.A., Ibarra-Molero B., Sanchez-Ruiz J.M. Non-conservation of folding rates in the thioredoxin family reveals degradation of ancestral unassisted-folding. Biochem J. 2019;476:3631–3647. doi: 10.1042/BCJ20190739.
- 59. Kukic P., Pustovalova Y., Camilloni C., Gianni S., Korzhnev D.M., Vendruscolo M. Structural characterization of the early events in the nucleation-condensation mechanism in a protein folding process. J Am Chem Soc. 2017;139:6899–6910. doi: 10.1021/jacs.7b01540.
- 60. Hart K.M., Harms M.J., Schmidt B.H., Elya C., Thornton J.W. Thermodynamic system drift in protein evolution. PLoS Biol. 2014;12:e1001994. doi: 10.1371/journal.pbio.1001994.
- 61. Lim S.A., Marqusee S. The burst-phase folding intermediate of ribonuclease H changes conformation over evolutionary history. Biopolymers. 2018;109. doi: 10.1002/bip.23086.
- 62. Lim S.A., Hart K.M., Harms M.J., Marqusee S. Evolutionary trend toward kinetic stability in the folding trajectory of RNases H. Proc Natl Acad Sci U S A. 2016;113:13045–13050. doi: 10.1073/pnas.1611781113.
- 63. Lim S.A., Bolin E.R., Marqusee S. Tracing a protein’s folding pathway over evolutionary time using ancestral sequence reconstruction and hydrogen exchange. eLife. 2018;7. doi: 10.7554/eLife.38369.
- 64. Yang J., Naik N., Patel J.S., Wylie C.S., Gu W., Huang J., Marty Ytreberg F., Naik M.T., Weinreich D.M., Rubenstein B.M. Predicting the viability of beta-lactamase: how folding and binding free energies correlate with beta-lactamase fitness. PLoS One. 2020;15. doi: 10.1371/journal.pone.0233509.
- 65. Faber M.S., Wrenbeck E.E., Azouz L.R., Steiner P.J., Whitehead T.A. Impact of in vivo protein folding probability on local fitness landscapes. Mol Biol Evol. 2019;36:2764–2777. doi: 10.1093/molbev/msz184.
- 66. Choi H.-K., Min D., Kang H., Ju Shon M., Rah S.-H., Chan Kim H., Jeong H., Choi H.-J., Bowie J.U., Yoon T.-Y. Watching helical membrane proteins fold reveals a common N-to-C-terminal folding pathway. Science. 2019;366:1150–1156. doi: 10.1126/science.aaw8208.
- 67. Isogai Y., Imamura H., Nakae S., Sumi T., Takahashi K.I., Nakagawa T., Tsuneshige A., Shirai T. Tracing whale myoglobin evolution by resurrecting ancient proteins. Sci Rep. 2018;8:16883. doi: 10.1038/s41598-018-34984-6.
- 68. Korendovych I.V., DeGrado W.F. De novo protein design, a retrospective. Q Rev Biophys. 2020;53. doi: 10.1017/S0033583519000131.
- 69. Leaver-Fay A., Tyka M., Lewis S.M., Lange O.F., Thompson J., Jacak R., Kaufman K., Renfrew P.D., Smith C.A., Sheffler W. Rosetta3: an object-oriented software suite for the simulation and design of macromolecules. Methods Enzymol. 2011;487:545–574. doi: 10.1016/B978-0-12-381270-4.00019-6.
- 70. Alford R.F., Leaver-Fay A., Jeliazkov J.R., O’Meara M.J., DiMaio F.P., Park H., Shapovalov M.V., Renfrew P.D., Mulligan V.K., Kappel K. The rosetta all-atom energy function for macromolecular modeling and design. J Chem Theory Comput. 2017;13:3031–3048. doi: 10.1021/acs.jctc.7b00125.
- 71. Jacobs T.M., Williams B., Williams T., Xu X., Eletsky A., Federizon J.F., Szyperski T., Kuhlman B. Design of structurally distinct proteins using strategies inspired by evolution. Science. 2016;352:687–690. doi: 10.1126/science.aad8036.
- 72. Zhou J., Panaitiu A.E., Grigoryan G. A general-purpose protein design framework based on mining sequence-structure relationships in known protein structures. Proc Natl Acad Sci U S A. 2020;117:1059–1068. doi: 10.1073/pnas.1908723117.
- 73. Koga R., Koga N. Consistency principle for protein design. Biophys Physicobiology. 2019;16:304–309. doi: 10.2142/biophysico.16.0_304.
- 74. Koga N., Tatsumi-Koga R., Liu G., Xiao R., Acton T.B., Montelione G.T., Baker D. Principles for designing ideal protein structures. Nature. 2012;491:222–227. doi: 10.1038/nature11600.
- 75. Goraj K., Renard A., Martial J.A. Synthesis, purification and initial structural characterization of octarellin, a de novo polypeptide modelled on the α/β-barrel proteins. Protein Eng Des Sel. 1990;3:259–266. doi: 10.1093/protein/3.4.259.
- 76. Tanaka T., Hayashi M., Kimura H., Oobatake M., Nakamura H. De novo design and creation of a stable artificial protein. Biophys Chem. 1994;50:47–61. doi: 10.1016/0301-4622(94)85019-4.
- 77. Tanaka T., Kuroda Y., Kimura H., Kidokoro S.I., Nakamura H. Cooperative deformation of a de novo designed protein. Protein Eng Des Sel. 1994;7:969–976. doi: 10.1093/protein/7.8.969.
- 78. Tanaka T., Kimura H., Hayashi M., Fujiyoshi Y., Fukuhara K.I., Nakamura H. Characteristics of a de novo designed protein. Protein Sci. 1994;3:419–427. doi: 10.1002/pro.5560030306.
- 79. Houbrechts A., Moreau B., Abagyan R., Mainfroid V., Préaux G., Lamproye A., Poncin A., Goormaghtigh E., Ruysschaert J.M., Martial J.A. Second-generation octarellins: Two new de novo (β/α)8 polypeptides designed for investigating the influence of β-residue packing on the α/β-barrel structure stability. Protein Eng Des Sel. 1995;8:249–259. doi: 10.1093/protein/8.3.249.
- 80. Beauregard M., Goraj K., Goffin V., Heremans K., Goormaghtigh E., Ruysschaert J.M., Martial J.A. Spectroscopic investigation of structure in octarellin (a de novo protein designed to adopt the α/β-barrel packing). Protein Eng Des Sel. 1991;4:745–749. doi: 10.1093/protein/4.7.745.
- 81. Offredi F., Dubail F., Kischel P., Sarinski K., Stern A.S., Van de Weerdt C., Hoch J.C., Prosperi C., François J.M., Mayo S.L. De novo backbone and sequence design of an idealized α/β-barrel protein: evidence of stable tertiary structure. J Mol Biol. 2003;325:163–174. doi: 10.1016/s0022-2836(02)01206-8.
- 82. Figueroa M., Oliveira N., Lejeune A., Kaufmann K.W., Dorr B.M., Matagne A., Martial J.A., Meiler J., Van de Weerdt C. Octarellin VI: using rosetta to design a putative artificial (β/α)8 protein. PLoS One. 2013;8. doi: 10.1371/journal.pone.0071858.
- 83. Figueroa M., Sleutel M., Vandevenne M., Parvizi G., Attout S., Jacquin O., Vandenameele J., Fischer A.W., Damblon C., Goormaghtigh E. The unexpected structure of the designed protein Octarellin V.1 forms a challenge for protein structure prediction tools. J Struct Biol. 2016;195:19–30. doi: 10.1016/j.jsb.2016.05.004.
- 84. Nagarajan D., Deka G., Rao M. Design of symmetric TIM barrel proteins from first principles. BMC Biochem. 2015;16:18. doi: 10.1186/s12858-015-0047-4.
- 85•. Huang P.S., Feldmeier K., Parmeggiani F., Fernández-Velasco D.A., Höcker B., Baker D. De novo design of a four-fold symmetric TIM-barrel protein with atomic-level accuracy. Nat Chem Biol. 2016;12:29–34. doi: 10.1038/nchembio.1966. This work reports the first successful de novo designed TIM barrel. The design approach comprises determination of geometric restrictions, backbone generation and iterative sequence design. Experimental characterization showed well-folded proteins and the intended topology for the construct sTIM11.
- 86••. Romero-Romero S., Costas M., Silva D.-A., Kordes S., Rojas-Ortega E., Tapia C., Guerra Y., Shanmugaratnam S., Rodríguez-Romero A., Baker D. Epistasis on the stability landscape of de novo TIM barrels explored by a modular design approach. bioRxiv. 2020. doi: 10.1101/2020.09.29.319103. A modular design approach was used to create a family of stabilized sTIM11 variants by improving hydrophobic packing. Detailed analysis showed that unexplored regions of the stability landscape are accessed. This landscape is shaped by epistatic effects arising from improved hydrophobic clusters.
- 87. Caldwell S.J., Haydon I.C., Piperidou N., Huang P.-S., Bick M.J., Sjöström H.S., Hilvert D., Baker D., Zeymer C. Tight and specific lanthanide binding in a de novo TIM barrel with a large internal cavity designed by symmetric domain fusion. Proc Natl Acad Sci U S A. 2020;117:30362–30369. doi: 10.1073/pnas.2008535117.
- 88. Wiese G., Shanmugaratnam S., Höcker B. Extension of a de novo TIM barrel with a rationally designed secondary structure element. bioRxiv. 2020. doi: 10.1101/2020.10.16.342774.
- 89. Watters A.L., Deka P., Corrent C., Callender D., Varani G., Sosnick T., Baker D. The highly cooperative folding of small naturally occurring proteins is likely the result of natural selection. Cell. 2007;128:613–624. doi: 10.1016/j.cell.2006.12.042.
- 90••. Basak S., Paul Nobrega R., Tavella D., Deveau L.M., Koga N., Tatsumi-Koga R., Baker D., Massi F., Matthews C.R. Networks of electrostatic and hydrophobic interactions modulate the complex folding free energy surface of a designed βα protein. Proc Natl Acad Sci U S A. 2019;116:6806–6811. doi: 10.1073/pnas.1818744116. This work analyses the complex folding pathway of the de novo protein Di-III_14. Electrostatic and hydrophobic networks are identified as possible modulators and their contribution specified by mutational analysis. These findings have implications for future protein design strategies.