Proteins perform a spectrum of functions inside the cell, ranging from energy utilization to enzymatic activity to signaling, as well as structural and mechanical roles, among many others. Substantial research efforts throughout the years have focused on understanding these phenomena at the molecular level as well as the evolutionary processes that influenced their development. Part of these efforts includes the elucidation of the fundamental principles that give shape to protein structures. Since function is encoded in the three-dimensional (3D) arrangement of proteins, identifying the principles of protein folding, structure, and dynamics has been a priority for research in protein science for many years. Inscribed on the last blackboard of Richard Feynman is the phrase “What I cannot create, I cannot understand” (1). This way of thinking has inspired scientists not to be satisfied with revealing fundamental principles but to apply them to create novel proteins with desired properties. The work of Wei et al. (2) in PNAS is a relevant example of such efforts to demonstrate understanding of biophysical principles of protein structure and dynamics by designing a protein that can change its shape under specific solution conditions.
The field of protein design is challenged by the enormous space of potential protein sequences and their folds. Because of this, most efforts have focused on studying the extant plethora of sequences and structures found in nature that resulted from millions of years of evolutionary history. Using natural proteins and their sequences as templates to engineer new proteins provides a key reduction of search space and a basis for both structural features and functional behavior. This concept has been explored successfully both experimentally and theoretically in the field of directed evolution (3) and computational template-based modeling of protein structure (4). Such has been the impact of the use of evolutionary principles to engineer proteins that, in 2018, Frances Arnold was awarded the Nobel Prize in Chemistry for her pioneering contributions to designing novel enzymes using directed evolution. Since the field of protein structure prediction is closely related to the idea of elucidating novel proteins, advances in this field have been fundamental, too. In particular, evolutionary information has made important contributions, especially through the identification of coevolutionary signals encoded in protein sequences that inform models about amino acid interactions in 3D space (5–8). For example, in my group, we have used evolutionary signals to engineer the response of hybrid repressors for synthetic biology applications (9). Combining this with the extensive theoretical developments of protein folding (10) and computational modeling of protein free energy calculations (11, 12) has given rise to a clearer pathway for de novo protein design (13).
De novo protein design differs from the advances described above because it aims to explore a space that has not been previously visited in the evolutionary process but in which fundamental physical and chemical principles still hold. The work of Wei et al. (2) belongs to this effort to create desired properties from the exploration of first principles. In their research, they build on an extensive framework for protein modeling and design spearheaded by contributions of the Baker laboratory and members of the Institute for Protein Design at the University of Washington (12, 14, 15). In a past study, the group created a de novo six-helix bundle, that is, a protein complex consisting of identical alpha helices that interact to form structures analogous to elongated rose bundles (14). Wei et al. (2) aim to solve the problem of creating a protein that could be energetically stable in more than one conformation. Although substantial conformational changes upon mutation via protein engineering have been achieved in the past, notably an IgG-binding protein with a 4β+α fold transformed into an albumin-binding 3α fold (16), the work of Wei et al. (2) pushes the boundaries of de novo design by building on top of a synthetic protein.
A relevant aspect of the work of Wei et al. (2) is the use of computational tools and experimental validations to guide the design process. A nonexpert might be tempted to think about this process as something that occurs primarily in a computer before going into “production” in a wet laboratory. However, a better analogy is the concept of computer-aided design and that of the “design spiral” used industrially. This process involves a series of iterative and interconnected computer-based and physical production (in this case experimental) stages that are evaluated and refined until a final product is achieved. A brief summary of the design process is illustrated in Fig. 1.
Fig. 1.
Designing conformational transitions in proteins. (A) A subunit of interacting helices is used as a template for optimization. The outer helix is cut, and a loop is added to connect the two helices. (B) The new flipping helix can rearrange upon calcium ion binding and reach an extended conformation. The length and residue identities of helices and loops are important design parameters. (C) Hydrogen bond and hydrophobic layers are designed to control packing and stability of states. (D) Ion coordination at the hinge promotes the elongated state of the trimeric complex. (E) The final design includes a protein that is compact (short) if no calcium is present or is a long conformation when calcium is bound.
First, a six-helix bundle protein was used as a model (14), inspired by the large conformational changes of the viral protein hemagglutinin (HA) (17), which changes from elongated to compact structures through an order–disorder transition during membrane fusion in the process of infection (18). Then, using six helices to build a base, interfaces and hinges were designed by introducing loops between the outer and inner helices and by cutting the outer helices (Fig. 1A) to allow flipping and elongation (Fig. 1B). Initially, this process involved the creation of two proteins that were stable in either a short conformation or a long conformation. A computational pipeline including tools like HBNet (14), RosettaRemodel (19), and RosettaScripts (15) was required to evaluate the energetics of the structural states, but intertwined experimental validations for solubility, and CD spectra for helical structure and trimerization, were also needed. Methodologies like small-angle X-ray scattering were also useful for discriminating between states. Next, fine tuning of hinge length and helical interfaces was used to modulate short and long states. Another important parameter was the length of the flipping helices (Fig. 1B), together with the goal of designing equal stability for both states.
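The goal of equal stability for both states can be made concrete with a two-state Boltzmann model. The sketch below is an illustration only and is not the energy function or pipeline used by Wei et al.: it assumes exactly two states (short and long), a free energy difference defined as G_long minus G_short in kcal/mol, and a default temperature of 298 K. The function name `long_state_fraction` and the example values are hypothetical.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def long_state_fraction(delta_g, temperature=298.0):
    """Equilibrium fraction of molecules in the long state (two-state model).

    delta_g is G_long - G_short in kcal/mol. This is an assumed,
    illustrative input and not a value reported in the paper.
    """
    # p_long = exp(-G_long/RT) / (exp(-G_short/RT) + exp(-G_long/RT))
    #        = 1 / (1 + exp(delta_g / RT))
    return 1.0 / (1.0 + math.exp(delta_g / (R_KCAL * temperature)))


if __name__ == "__main__":
    # Hypothetical free energy differences, chosen only to show the trend.
    for dg in (-2.0, 0.0, 2.0):
        frac = long_state_fraction(dg)
        print(f"dG = {dg:+.1f} kcal/mol -> long-state fraction = {frac:.3f}")
```

In this simplified picture, a free energy difference of zero gives equal populations of the two states, so a modest additional stabilization of one state is enough to shift the balance toward it.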
A crucial design element was the introduction of layers of hydrogen bond networks (A layers) in combination with hydrophobic layers (X layers) (Fig. 1C). Wei et al. (2) found that, by creating mixtures of such layers, they could promote the presence of one state versus the other. For instance, more A layers favored the long state, and glycines in the hinge reduced favorable contributions to the long-state free energy. After testing several permutations, first with structure prediction methodologies and then by experimentally testing for aggregation, they identified a smaller set of candidates for further refinement. In order to match computational predictions with in vitro conformations, the group also used X-ray crystallography and NMR to compare the structures of their designed proteins with the ones measured by experiment. After obtaining states that were stable, the authors turned to optimizing the short-/long-state free energy difference by using an idea from natural proteins, where changes in stability occur upon ion coordination, for example, in the human cytomegalovirus glycoprotein B (20). Next, the team decided to change hinge residues to allow ion coordination and also to have secondary structure ambiguity between helix and coil, using computational methods for secondary structure prediction (Fig. 1D). This design decision would allow the protein to explore both states as opposed to settling for a single one. Finally, crystallization and NMR were used to identify states in the presence and absence of calcium, which was the ion coordinated at the hinge regions. The end result included a protein with one hydrogen bond layer and two hydrophobic layers that changes conformation depending on the calcium concentration. If calcium is present, the protein favors the long state, while, in its absence, the short state is preferred (Fig. 1E).
The crystal structures of the short state show impressive agreement with the computational predictions, with a root-mean-square deviation of about 1.2 Å.
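For reference, the root-mean-square deviation quoted above is the square root of the mean squared distance between matched atoms in two structures. The following is a minimal sketch, assuming the two coordinate sets have already been optimally superposed (for example by a Kabsch fit, which is not shown) and list the same atoms in the same order. The function name `rmsd` and the toy coordinates are hypothetical and are not data from Wei et al.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD, in the input units, between two pre-superposed coordinate lists.

    Each list holds (x, y, z) tuples for the same atoms in the same order.
    No superposition is performed here; that step is assumed to be done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates in angstroms (hypothetical): every atom is displaced
    # by exactly 1 angstrom between the two sets.
    model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    crystal = [(0.0, 1.0, 0.0), (3.8, 0.0, 1.0), (7.6, -1.0, 0.0)]
    print(f"RMSD = {rmsd(model, crystal):.2f} angstroms")
```

Because the value depends on which atoms are compared and how the structures are superposed, an RMSD figure is most informative when those choices are reported alongside it.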
Taken together, the work of Wei et al. (2) allows us to draw several conclusions about the state of the art of protein design. 1) These results would be impossible without advances in computational modeling of protein structures and the availability of large computational resources and architectures. However, the complete process still requires significant human intervention, expert knowledge, and extensive use of experimental techniques. 2) Ideally, de novo protein design does not require previous knowledge of functional proteins found in nature to reach a target molecule. In practice, we still find guidance from natural proteins, as in the case of the viral HA protein that inspired the work of Wei et al. 3) Reaching stable structures via protein design is challenging, but devising a sequence that explores different energetically favorable states is remarkable. Previous work has focused on moderate conformational changes (21) or changes in oligomeric states (22). Showcasing the feasibility of this approach is an important contribution to the field. 4) A number of applications of proteins with engineered conformational plasticity seem plausible. Examples include biosensors, engineered membrane fusion proteins for controlled drug delivery, and the creation of novel materials. Finally, just as Feynman hinted, “creating” also helps us understand how a few changes in the protein sequence can lead to important molecular features that might propel adaptations and the ultimate survival of an organism (or virus). This phenomenon is one of the most fascinating aspects of molecular evolution; therefore, a glimpse of how this can be achieved à la carte is particularly meaningful.
Acknowledgments
My research in protein design and evolution is supported by grants from the National Institute of General Medical Sciences of NIH (Grant R35GM133631) and the Division of Molecular and Cellular Biosciences of NSF (Grant MCB-1943442).
Footnotes
The author declares no competing interest.
See companion article, “Computational design of closely related proteins that adopt two well-defined but structurally divergent folds,” 10.1073/pnas.1914808117.
References
- 1. Caltech Archives, Richard Feynman’s blackboard at time of his death. Caltech Archives (1988). http://archives-dc.library.caltech.edu/islandora/object/ct1%3A483. Accessed March 29, 2020.
- 2. Wei K. Y., et al., Computational design of closely related proteins that adopt two well-defined but structurally divergent folds. Proc. Natl. Acad. Sci. U.S.A. 117, 7208–7215 (2020).
- 3. Dougherty M. J., Arnold F. H., Directed evolution: New parts and optimized function. Curr. Opin. Biotechnol. 20, 486–491 (2009).
- 4. Waterhouse A., et al., SWISS-MODEL: Homology modelling of protein structures and complexes. Nucleic Acids Res. 46, W296–W303 (2018).
- 5. Morcos F., et al., Direct-coupling analysis of residue coevolution captures native contacts across many protein families. Proc. Natl. Acad. Sci. U.S.A. 108, E1293–E1301 (2011).
- 6. Marks D. S., Hopf T. A., Sander C., Protein structure prediction from sequence variation. Nat. Biotechnol. 30, 1072–1080 (2012).
- 7. Ovchinnikov S., et al., Improved de novo structure prediction in CASP11 by incorporating coevolution information into Rosetta. Proteins 84 (suppl. 1), 67–75 (2016).
- 8. Schaarschmidt J., Monastyrskyy B., Kryshtafovych A., Bonvin A., Assessment of contact predictions in CASP12: Co-evolution and deep learning coming of age. Proteins 86 (suppl. 1), 51–66 (2018).
- 9. Dimas R. P., Jiang X. L., Alberto de la Paz J., Morcos F., Chan C. T. Y., Engineering repressors with coevolutionary cues facilitates toggle switches with a master reset. Nucleic Acids Res. 47, 5449–5463 (2019).
- 10. Onuchic J. N., Luthey-Schulten Z., Wolynes P. G., Theory of protein folding: The energy landscape perspective. Annu. Rev. Phys. Chem. 48, 545–600 (1997).
- 11. Hansen N., van Gunsteren W. F., Practical aspects of free-energy calculations: A review. J. Chem. Theor. Comput. 10, 2632–2647 (2014).
- 12. Leaver-Fay A., et al., ROSETTA3: An object-oriented software suite for the simulation and design of macromolecules. Methods Enzymol. 487, 545–574 (2011).
- 13. Huang P. S., Boyken S. E., Baker D., The coming of age of de novo protein design. Nature 537, 320–327 (2016).
- 14. Boyken S. E., et al., De novo design of protein homo-oligomers with modular hydrogen-bond network-mediated specificity. Science 352, 680–687 (2016).
- 15. Fleishman S. J., et al., RosettaScripts: A scripting language interface to the Rosetta macromolecular modeling suite. PLoS One 6, e20161 (2011).
- 16. Alexander P. A., He Y., Chen Y., Orban J., Bryan P. N., A minimal sequence code for switching protein structure and function. Proc. Natl. Acad. Sci. U.S.A. 106, 21149–21154 (2009).
- 17. Skehel J. J., Wiley D. C., Receptor binding and membrane fusion in virus entry: The influenza hemagglutinin. Annu. Rev. Biochem. 69, 531–569 (2000).
- 18. Lin X., et al., Order and disorder control the functional rearrangement of influenza hemagglutinin. Proc. Natl. Acad. Sci. U.S.A. 111, 12049–12054 (2014).
- 19. Huang P. S., et al., RosettaRemodel: A generalized framework for flexible backbone protein design. PLoS One 6, e24109 (2011).
- 20. Burke H. G., Heldwein E. E., Crystal structure of the human cytomegalovirus glycoprotein B. PLoS Pathog. 11, e1005227 (2015). Correction in: PLoS Pathog. 11, e1005300 (2015).
- 21. Davey J. A., Damry A. M., Goto N. K., Chica R. A., Rational design of proteins that exchange on functional timescales. Nat. Chem. Biol. 13, 1280–1285 (2017).
- 22. Lizatovic R., et al., A de novo designed coiled-coil peptide with a reversible pH-induced oligomerization switch. Structure 24, 946–955 (2016).

