Computational and Structural Biotechnology Journal. 2012 Sep 28;2:e201209002. doi: 10.5936/csbj.201209002

Computational approaches for rational design of proteins with novel functionalities

Manish Kumar Tiwari a,, Ranjitha Singh a,, Raushan Kumar Singh a, In-Won Kim a, Jung-Kul Lee a,b,*
PMCID: PMC3962203  PMID: 24688643

Abstract

Proteins are the most multifaceted macromolecules in living systems and have various important functions, including structural, catalytic, sensory, and regulatory functions. Rational design of enzymes is a great challenge to our understanding of protein structure and physical chemistry and has numerous potential applications. Protein design algorithms have been applied to design or engineer proteins that fold, fold faster, catalyze, catalyze faster, signal, and adopt preferred conformational states. The field of de novo protein design, although only a few decades old, is beginning to produce exciting results. Developments in this field are already having a significant impact on biotechnology and chemical biology. The application of powerful computational methods for functional protein designing has recently succeeded at engineering target activities. Here, we review recently reported de novo functional proteins that were developed using various protein design approaches, including rational design, computational optimization, and selection from combinatorial libraries, highlighting recent advances and successes.

Keywords: De novo protein design, computational protein design, designed therapeutic proteins, metalloproteins, ROSETTA, K* algorithm, DEZYMER, ORBIT

Introduction

Proteins, polymers of amino acids, are the main building blocks and functional molecules of the cell. They are the most multifaceted macromolecules in living systems and have various important functions, including structural, catalytic, sensory, and regulatory functions. The ability of amino acid sequences to fold into well-defined structures makes these numerous roles possible. The collection of data regarding protein sequences is rapidly growing, with approximately 6 million entries in the Universal Protein Resource (UniProt) knowledgebase at present [18]. To completely understand the function of a protein, knowledge of its three-dimensional structure is essential. Unfortunately, experimental structure determination is only possible for a small fraction of these proteins [2, 8–10], with only approximately 2% having experimentally verified structural annotation at present. For the remaining 98%, prediction of the structure is the only alternative. Therefore, the structural characterization of proteins is a major goal in computational biology [18].

Advances in molecular modeling have expanded the area of computational protein design, from creating new proteins based on known protein sequences present in nature to designing new proteins that fold into a specific structure or perform a specific function. Before attempting protein design by computational methods, one should understand the underlying physical principles governing the folding, stability, and function of a protein. For decades, researchers have followed a perturbation- or alteration-based paradigm to determine the functionality of a protein. This method relies on the generation of hundreds or thousands of protein mutants, coupled with selective pressure to identify variants with desired properties. In computational protein design, by contrast, one aims at a design-based paradigm. In a design-based paradigm, biologists combine design methods for problem solving with computational modeling techniques to predict the success of their designs. This paradigm has been effective for the creation and implementation of new ideas and inventions. It is used to identify the boundaries of possible designs and to eliminate impossible, impractical, inefficient, or otherwise undesirable designs that would have been difficult to identify using the alteration-based paradigm. In a structure-based computational method, a computational or mathematical framework is constructed by taking into consideration the evolution, stability, and function of a protein. The designed proteins are then checked experimentally for their specific function. If the designed proteins exhibit all these characteristics, then it can be concluded that the mathematical model or framework captures the essence of a protein. 
On the other hand, if the experiments do not work, one can learn from the failures to modify the model and create a new one, which will ultimately serve the final goal of computationally designing a novel viable protein. Protein design from scratch is thus the most precise way of testing our knowledge of how natural proteins implement their functions.

Engineering proteins with improved functionality or novel applications has been experimentally achieved by screening of large mutant libraries. However, most of these efforts neither yield quantitative design principles nor reveal the structural features that support the desired function. Computational protein design has helped overcome these drawbacks. With reliable structural predictions [11–14], protein stability at the desired conditions, and accurate description of intermolecular interactions (protein–protein interactions [15, 16] and DNA–protein interactions [17]), the technique of computationally designing proteins has been one of the fast-emerging trends in biotechnology and biomedicine. Furthermore, computational protein design has achieved significant breakthroughs, for example, in the design of novel biocatalysts [18–21] and biosensors for non-natural molecules, redesign of proteins with improved binding affinity [22], redesign of proteins with greater binding specificity [16, 23, 24], and design of proteins capable of binding non-biological cofactors [25] (Figure 1).

Figure 1.


Computationally designed structures and enzymes. (A) A novel Top7 globular protein fold with atomic-level accuracy [34]. (B) Designed SspB adaptor protein [15]. (C) Redesigned endonuclease DNA binding [17]. The redesigned enzyme binds and cleaves the redesigned recognition site ∼10,000 times more effectively than does the wild-type enzyme, with a level of target discrimination comparable to the original endonuclease. (D) A novel retro-aldol enzyme designed within a TIM-barrel scaffold [20].

Enzyme design presents a huge challenge, not only in the de novo design of catalysts for which no natural counterparts are known, but also in the design of multipurpose enzymes, which may have a wide range of biotechnological applications in fields such as industrial organic synthesis and metabolic engineering [26–29]. This review mainly discusses the strengths and recent successes of computational protein design approaches. We also summarize advancements of design methodology and the application of protein design strategies over the past few years. Other recent reviews can provide additional background and perspective [30–33].

Rational computational design

The creation of biocatalysts from scratch enables scientists and engineers to build synthetic enzymes for a series of different chemical reactions, e.g., retro-aldol reaction [20] and Kemp elimination [21]. It also presents a testing ground for our fundamental understanding of the complexities of protein structure and function. Computational protein design starts with the coordinates of a protein main chain and uses a force field to identify sequences and geometries of amino acids that are optimal for stabilizing the backbone geometry [35]. Even for small proteins, the number of possible sequences far exceeds that which can be thoroughly searched. The development of powerful search algorithms to find optimal solutions has provided a major stimulus to the field [36]. Computational protein design requires correlation of structural predictions and experimental stability. Artificial enzymes have been developed with varying degrees of computational involvement, which includes de novo enzymes, where both the protein topology and the active site are built from scratch [20, 34, 37, 38].

De novo active-site design

The introduction of amino acid residues in the form of active site residues into the existing scaffolds is essential for computationally designed enzyme catalysis. These active site residues of the enzymes are responsible for enhancing the chemical reactions by lowering the activation barrier via stabilization of the transition state [39]. Accurate modeling of important forces in the active site requires quantum mechanical (QM) calculations [38]. Potential binding pockets capable of binding tightly to the transition state and retaining the desired geometry of the functional groups are identified within different protein scaffolds. Using geometry-based identification, the transition state is matched with the binding site and the position of the transition state and the catalytic side chains are optimized. Finally, the remaining residues for tight binding of the transition state are designed and the designs are ranked on the basis of transition state binding energy and catalytic geometry. Although the simultaneous design of structure and catalysis promises to broadly expand the scope of artificial enzymes, this area is still in its infancy.
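The link between transition state stabilization and rate acceleration can be made concrete with a short calculation. The sketch below is not taken from the reviewed studies; it assumes simple transition-state theory with identical pre-exponential factors for the catalyzed and uncatalyzed reactions, so that the fold acceleration is exp(ΔΔG‡/RT). The function names and the default temperature are illustrative choices.

```python
import math

# Gas constant in kcal/(mol*K); the temperature default of 298.15 K is an assumption.
R_KCAL = 1.987204e-3


def rate_enhancement(barrier_reduction_kcal, temperature_k=298.15):
    """Fold rate acceleration expected from lowering the activation
    free energy by `barrier_reduction_kcal` (kcal/mol)."""
    return math.exp(barrier_reduction_kcal / (R_KCAL * temperature_k))


def barrier_reduction(fold_enhancement, temperature_k=298.15):
    """Inverse relation: barrier lowering (kcal/mol) that corresponds
    to a given fold rate acceleration."""
    return R_KCAL * temperature_k * math.log(fold_enhancement)


if __name__ == "__main__":
    # Stabilization needed for an illustrative ten-million-fold acceleration.
    print(f"{barrier_reduction(1e7):.2f} kcal/mol")
```

Under these assumptions, each additional 1.4 kcal/mol of transition state stabilization at room temperature corresponds to roughly one order of magnitude in rate, which gives a sense of the energetic accuracy a design force field needs.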

Computational techniques have been used to design novel metal binding sites into proteins [40–42]. Nascent metalloenzymes with a variety of oxygen redox chemistries have been generated by leaving one of the primary coordination spheres of the metal unligated by the protein [43, 44]. The diverse and powerful chemistry of metals makes metalloprotein design a promising approach to enzyme design [45]. Early pioneering work included the modeling of iron with one primary coordination sphere ligated with dioxygen, and a number of metalloproteins were designed in the thioredoxin fold [44]. These proteins were experimentally shown to bind iron and to catalyze a variety of oxygen chemistries. Bolon and Mayo modeled a high-energy state of histidine-catalyzed p-nitrophenyl acetate (PNPA) hydrolysis as a series of side chain rotamers [18], following a method analogous to that used for the design of catalytic antibodies. Escherichia coli thioredoxin [46] was selected as a scaffold because of its favorable expression properties, thermodynamic stability [47], and successful history in computational design [44]. To model this reaction computationally, a composite side chain composed of histidine covalently linked to PNPA was introduced and sampled conformationally around accessible bond rotations. To facilitate substrate binding and recognition, amino acids adjacent to the His-PNPA site were allowed to mutate to alanine. The conformations of His-PNPA and surrounding side chains were optimized using Dead End Elimination [48]. This was followed by the synthesis of the top two scoring candidates, protozyme design (PZD) 1 and PZD2. PZD2 demonstrated significant rate enhancements over the uncatalyzed reaction and saturation kinetics with increasing substrate concentration. 
Although initial extension of these computational methods to the design of a triose phosphate isomerase turned out to be unsuccessful, many important ideas put forth in these studies were incorporated into the recent, successful design of chemically ambitious artificial enzymes. Recently a strategy was devised to design an organophosphate hydrolase starting from a functionally diverse set of mononuclear zinc-containing metalloenzyme scaffolds [49]. For the computational design of organophosphate hydrolysis activity, a dinuclear bacterial organophosphate hydrolase, a zinc-containing enzyme, was selected as the template. Computational design of mononuclear zinc-containing active sites successfully identified a set of mutations in an adenosine deaminase that confer on it the target organophosphate hydrolysis activity (Figure 2). It was found that only four simultaneous mutations were required for the emergence of organophosphate hydrolysis activity in the deaminase. The engineered zinc-containing mouse adenosine deaminase catalyzed the hydrolysis of a model organophosphate with a catalytic efficiency (kcat/Km) of ∼10⁴ M⁻¹ s⁻¹, representing a net increase in activity greater than 10⁷-fold.

Figure 2.


Computational design of an organophosphate hydrolase. Engineered zinc-containing mouse adenosine deaminase PT3.1 design crystal structure, with catalytic residues in yellow [49].

In parallel with computational advances, active-site designs continue to progress using rational, intuition-based strategies. Using protein and substrate engineering, esterase activity was successfully introduced into human carbonic anhydrase (HCAIII) [50]. The affinity of HCAIII for benzenesulfonamide-containing molecules was used to model a substrate such that the scissile bond was positioned within a cleft in the protein. Grafting a His dyad from previous de novo helix-loop-helix designs resulted in an HCAIII variant with enhanced esterase activity over wild-type. The general application of these approaches promises interesting future results, including the ability to design proteins to catalyze reactions that are inaccessible by natural enzymes.

Computational tools and de novo enzyme design

Computational protein design tools to date have been useful for engineering proteins with a wide range of functions [17, 20, 21, 25, 33, 51, 52]. Many of these successes rely on fixed backbone approaches that maintain the backbone conformations seen in the original high-resolution crystal structures and focus on remodeling only the side chains [53, 54]. Computational protein design programs typically contain two major components: an energy or scoring function to evaluate how well a particular amino acid sequence fits a given scaffold, and a search function that samples sequences as well as backbone and side chain conformations. Energy functions for protein design often contain a combination of physics-based and knowledge-based terms [55].
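The division of labor between a scoring function and a search function can be illustrated with a deliberately small toy. The sketch below is not the scoring function or sampler of any program named in this review; the burial pattern, the alphabet, the one-term "energy", and all function names are hypothetical, and the search is a plain Metropolis Monte Carlo over sequences on a fixed scaffold.

```python
import math
import random

# Hypothetical fixed scaffold: True marks a buried position, False an exposed one.
BURIED = [True, False, True, True, False, False, True, False]
HYDROPHOBIC = set("AVLIFM")
ALPHABET = "AVLIFMSTNQDEKR"


def score(sequence):
    """Toy knowledge-based energy (lower is better): -1 when a residue's
    polarity matches the burial of its position, +1 otherwise."""
    energy = 0.0
    for residue, buried in zip(sequence, BURIED):
        energy += -1.0 if (residue in HYDROPHOBIC) == buried else 1.0
    return energy


def design(n_steps=2000, temperature=0.5, seed=0):
    """Search component: Metropolis Monte Carlo over single-residue
    substitutions, returning the best sequence seen and its energy."""
    rng = random.Random(seed)
    seq = [rng.choice(ALPHABET) for _ in BURIED]
    current = score(seq)
    best_seq, best = list(seq), current
    for _ in range(n_steps):
        pos = rng.randrange(len(seq))
        old = seq[pos]
        seq[pos] = rng.choice(ALPHABET)
        trial = score(seq)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if trial <= current or rng.random() < math.exp((current - trial) / temperature):
            current = trial
            if current < best:
                best_seq, best = list(seq), current
        else:
            seq[pos] = old  # reject: restore the previous residue
    return "".join(best_seq), best


if __name__ == "__main__":
    sequence, energy = design()
    print(sequence, energy)
```

In a real design program the scoring function would combine many physics-based and knowledge-based terms and the move set would include rotamer and, in some protocols, backbone changes; the toy only shows how the two components are kept separate.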

Exceptional progress is seen in the de novo design of enzymes catalyzing chemical reactions for which a natural biocatalyst does not exist. Researchers have devised different computational tools to assist the design and engineering of proteins with desired catalytic properties, aiming to improve the catalytic efficiency of the designed biocatalysts. Programs like METAL SEARCH [36, 42, 56, 57], DEZYMER [45], ORBIT [58], and ROSETTA [59] have laid a strong foundation and launched the development of de novo design of enzymes. METAL SEARCH uses an “on-the-fly binning” algorithm. Binning can be defined as a process of mapping continuous values into categorical values or bins. Binning can amplify data effects and can also reduce the effort required for exception detection by providing a sampling approach. METAL SEARCH helps design tetrahedrally coordinated metal binding sites in proteins of known structure. The program assumes fixed backbones, uses rotamers in the initial stages of the search, and uses simple geometric criteria for evaluating potential sites. Using the backbone coordinates of a protein, the program identifies sets of four residues that might form tetrahedral sites if the wild-type amino acids were replaced by cysteine or histidine. The program has been used for the introduction of zinc binding sites in the designed 4-helix bundle protein α4 and in the B1 domain of streptococcal protein G, and in both cases, the tetrahedral coordination of a bound metal ion has been confirmed. DEZYMER, on the other hand, is a much more versatile program than METAL SEARCH. It is a molecular model building program that builds new ligand binding sites, including metal sites, into a protein of known 3D structure by altering only the sequence and the side-chain structure of the protein, leaving the backbone fold intact by definition. 
This program enabled computer-aided modeling of sites with pre-defined geometry, providing a general method for the design of ligand-binding sites and enzyme active sites, which can then be tested experimentally. Using a crystal structure as the starting point, the program can help maximize the stability of a target state by optimizing the side chain metal-ligand geometries.
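The kind of simple geometric criterion that site-search programs apply to a candidate tetrahedral metal site can be sketched as follows. This is an illustration only and does not reproduce the actual tests used by METAL SEARCH or DEZYMER; the function name, the distance window (1.9 to 2.5 Å), and the angular tolerance (15°) are assumptions chosen for the example.

```python
import itertools
import math

IDEAL_TETRAHEDRAL_ANGLE = 109.47  # degrees


def _vector(a, b):
    """Vector pointing from point a to point b."""
    return tuple(y - x for x, y in zip(a, b))


def _norm(v):
    return math.sqrt(sum(x * x for x in v))


def _angle_deg(u, v):
    cos = sum(a * b for a, b in zip(u, v)) / (_norm(u) * _norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def is_tetrahedral(metal, ligands, dist_range=(1.9, 2.5), angle_tol=15.0):
    """Return True if four ligand atoms sit at plausible distances from the
    metal and all six ligand-metal-ligand angles are near 109.5 degrees.
    Thresholds are illustrative assumptions, not published parameters."""
    if len(ligands) != 4:
        return False
    bonds = [_vector(metal, atom) for atom in ligands]
    if any(not dist_range[0] <= _norm(b) <= dist_range[1] for b in bonds):
        return False
    return all(
        abs(_angle_deg(u, v) - IDEAL_TETRAHEDRAL_ANGLE) <= angle_tol
        for u, v in itertools.combinations(bonds, 2)
    )


if __name__ == "__main__":
    s = 2.1 / math.sqrt(3)
    ideal = [(s, s, s), (s, -s, -s), (-s, s, -s), (-s, -s, s)]
    print(is_tetrahedral((0.0, 0.0, 0.0), ideal))
```

A search over a real structure would generate the ligand atom positions from side chain rotamers placed on the fixed backbone and then filter candidate four-residue combinations with a test of this kind.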

Dahiyat and Mayo, via an automated algorithm called ORBIT (optimization of rotamers by iterative techniques), introduced a cyclical protein design strategy by coupling theory, computation, and experimental testing. By using a rotamer description of the side chains, they implemented a fast discrete search algorithm, based on the Dead-End Elimination (DEE) theorem, to rapidly find a globally optimal sequence in its ideal geometry from the vast number of possible solutions. DEE is a powerful algorithm capable of reducing the search space for structure-based protein design by a combinatorial factor. By using a fixed backbone template, a rotamer library, and a potential energy function, DEE identifies and prunes rotamer choices that are provably not part of the Global Minimum Energy Conformation (GMEC), effectively eliminating the majority of the conformations that must be subsequently enumerated to obtain the GMEC [48, 60]. Since the discovery of the DEE algorithm in 1992, several major theoretical and practical improvements have matured the method into a promising tool in the fields of protein modeling and design.
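The pruning step of dead-end elimination can be illustrated on a toy problem. The sketch below applies the original elimination criterion: a rotamer r at position i is removed if some alternative t at the same position has a worst-case energy that is still lower than the best-case energy of r. It is not the ORBIT implementation; the three-position energy tables and all function names are made up for the example, and the brute-force enumeration is included only to check that the GMEC survives pruning.

```python
import itertools


def pair_energy(pair_e, i, r, j, s):
    """Pairwise energy between rotamer r at position i and rotamer s at position j."""
    return pair_e[(i, j)][r][s] if i < j else pair_e[(j, i)][s][r]


def total_energy(assignment, self_e, pair_e):
    """Energy of one full assignment (one rotamer index per position)."""
    n = len(assignment)
    energy = sum(self_e[i][assignment[i]] for i in range(n))
    for i, j in itertools.combinations(range(n), 2):
        energy += pair_e[(i, j)][assignment[i]][assignment[j]]
    return energy


def dee_prune(self_e, pair_e):
    """Iterate the original DEE criterion until no more rotamers can be removed.
    Returns, for each position, the set of surviving rotamer indices."""
    n = len(self_e)
    alive = [set(range(len(rotamers))) for rotamers in self_e]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            others = [j for j in range(n) if j != i]
            for r in sorted(alive[i]):
                # Best energy r could possibly contribute.
                best_r = self_e[i][r] + sum(
                    min(pair_energy(pair_e, i, r, j, s) for s in alive[j]) for j in others
                )
                for t in sorted(alive[i] - {r}):
                    # Worst energy the competitor t could possibly contribute.
                    worst_t = self_e[i][t] + sum(
                        max(pair_energy(pair_e, i, t, j, s) for s in alive[j]) for j in others
                    )
                    if best_r > worst_t:
                        alive[i].discard(r)  # r can never be in the GMEC
                        changed = True
                        break
    return alive


# Hypothetical energies: 3 positions x 3 rotamers, pair terms within [-1, 1].
SELF_E = [[0.0, 5.0, 1.0], [0.0, 0.5, 6.0], [0.2, 0.0, 7.0]]
PAIR_E = {
    (0, 1): [[-0.5, 0.2, 0.0], [0.3, -0.1, 0.4], [0.1, 0.0, -0.8]],
    (0, 2): [[0.0, -0.3, 0.5], [0.2, 0.2, -0.6], [-0.4, 0.1, 0.0]],
    (1, 2): [[0.1, -0.2, 0.0], [-0.7, 0.3, 0.2], [0.0, 0.5, -0.9]],
}

ALIVE = dee_prune(SELF_E, PAIR_E)
GMEC = min(
    itertools.product(*(range(len(r)) for r in SELF_E)),
    key=lambda a: total_energy(a, SELF_E, PAIR_E),
)

if __name__ == "__main__":
    print("surviving rotamers:", ALIVE)
    print("brute-force GMEC:", GMEC)
```

Later refinements of DEE use tighter criteria and handle rotamer pairs; the toy only shows why pruning is safe: every removed rotamer is dominated by an alternative regardless of what the other positions do.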

ROSETTA, a milestone for protein design, is a suite of computational tools developed in the laboratory of David Baker. The most widely used computational protein design tool, Rosetta was originally developed for de novo fold prediction [11, 12, 61]. But it has been expanded to include methods for design, docking, experimental determination of structure from partial datasets, protein-protein interaction design and prediction, enzyme design, RNA structure prediction and protein-DNA interaction prediction and design [17, 20, 21, 25, 51, 52]. The Rosetta de novo enzyme design protocol has been used to design enzyme catalysts for a variety of chemical reactions, and in principle can be applied to any arbitrary chemical reaction of interest. The process has four stages: i) choice of a catalytic mechanism and corresponding minimal model active site, ii) identification of sites in a set of scaffold proteins where this minimal active site can be realized, iii) optimization of the identities of the surrounding residues for stabilizing interactions with the transition state and primary catalytic residues, and iv) evaluation and ranking the resulting designed sequences. Stages two through four of this process can be carried out with the Rosetta package, while stage one needs to be done externally [62]. The code is developed by the RosettaCommons. This working collaborative is composed of over 15 academic groups and thus the code is being applied to a very wide diversity of problems [63]. Recently, ROSETTA was used to develop artificial enzymes that catalyzed retro-aldol reaction [20] and Kemp elimination [21]. These designs were impressive in the extent to which the relationship between structure and reactivity was modeled and characterized.

Kemp elimination and retro-aldol reactions are considered milestones in the field of biocatalyst design. Kemp elimination is a model reaction for proton transfer from carbon. Seven rounds of random mutagenesis and shuffling of the designed enzyme KE07 (PDB accession code 2RKX) resulted in a >200-fold increase in catalytic efficiency and >1000 catalytic turnovers. KE07 is based on the triose phosphate isomerase barrel scaffold of the Thermotoga maritima derived thermostable imidazole-3-glycerolphosphate synthase (HisF). Some of the mutations introduced during directed evolution were localized in the residues adjacent to designed positions; some changed the flexibility of the region neighboring the active site; others adjusted the pKa of the catalytic residues. The position and functional role of these mutations provide important insight into the strengths and shortcomings of current designs, which need to be understood in order to match the efficiency of natural catalysts in the future. The retro-aldol work, on the other hand, targeted retro-aldolases as biocatalysts and involved the breaking of carbon–carbon bonds in a non-natural substrate. The design required implementation of a hashing methodology in ROSETTA to improve the algorithm in order to meet the requirements for a multistep reaction. The designs spanned a broad range of protein folds, and 32 out of 72 experimentally characterized designs showed detectable retro-aldolase activity.

Though Rosetta has been shown to be capable of designing active enzymes in various cases [20, 21, 51], in each case the best designed proteins had only very modest activity, while many of the designs tested had no activity at all. Thus, while this protocol constitutes a powerful tool in the development of novel catalysts, success is by no means guaranteed [62]. In spite of the remarkable success, several shortcomings and opportunities for improvement exist, some of which are briefly summarized here: i) to increase the quality of designs, it is beneficial to include as many interactions in the theozyme as possible, and concurrently to run matching for as many scaffolds as possible; ii) the enzyme design protocol so far considers only one state of the reactant, or one snapshot of the reaction trajectory, which means that Rosetta will try to design a sequence that optimally stabilizes this state while ignoring the other states that also occur along the reaction coordinate; and iii) ranking and selection of designs could be improved by the development of faster, more thorough computational examination methods [62]. Recently, Jacak et al. also pointed out that in many cases the Rosetta scoring function fails to prevent large hydrophobic clusters on the surface of proteins, even though the overall amino-acid composition of the protein surface is not significantly different from that of other soluble proteins [64]. Despite the limitations listed above, what distinguishes Rosetta's computational approach is that it is capable of generating catalytic activity from an inert scaffold, whereas most experimental methods, such as directed evolution approaches, rely on an existing catalytic activity as a starting point.

Several other novel algorithms or methods were also developed for de novo design of enzymes. A method for designing a new de novo protein was developed where one could search suitable scaffolds directly in the Protein Data Bank (PDB) [65]. Triose phosphate isomerase enzyme was used for the authentication of this method. Another novel algorithm based on dead-end elimination was useful for identifying minimized global minimum energy conformations and for the filtering of ensemble-based scoring [66]. A fast and precise algorithm that identified point mutations responsible for changing the net charge of the enzyme was also developed. This was done by bearing in mind that the enzyme maintained its fold and activity for redesigning. This resulted in the change in the pKa values of the catalytic residues placed into the putative catalytic sites [67]. The initial de novo designs could be additionally fine-tuned by simulations employing transition states [68, 69].

Another open-source, freely available suite of programs for computational structure-based protein design is OSPREY (Open Source Protein REdesign for You), developed in the lab of Bruce Donald at Duke University. It identifies protein mutants that possess desired target properties (e.g., improved stability, switch of substrate specificity, etc.). OSPREY can also be used for predicting small-molecule drug inhibitors. OSPREY incorporates a set of several different algorithmic modules for structure-based protein design, including a number of powerful Dead-End Elimination (DEE) algorithms and the ensemble-based K* algorithm for protein-ligand binding prediction. This suite allows the incorporation of continuous protein side-chain and continuous or discrete backbone flexibility, while maintaining provable guarantees with respect to the input model (input structure, rotamer library, and energy function) for a given protein design problem. OSPREY also includes many extensions and improvements to the DEE framework (e.g., minDEE, iMinDEE, K*, DACS, BD, BRDEE). These extensions improve efficiency and allow the modeling of molecular flexibility. OSPREY includes the K* module, which is a provably-good ε-approximation algorithm for computing binding constants (KD) over molecular ensembles of the bound and unbound states of a protein:ligand complex using minimized DEE/A* (namely, minDEE/A*/K*) [70, 71]. In 2009, Chen et al. reported a computational structure-based redesign of the 65-kDa phenylalanine adenylation domain of the nonribosomal peptide synthetase (NRPS) enzyme gramicidin S synthetase A (GrsA-PheA) for a set of noncognate substrates [70]. They applied the K* algorithm [66, 72] to predict mutations to the active site of GrsA-PheA to switch the enzyme specificity from the wild-type Phe toward the target noncognate substrates Leu, Arg, Glu, Lys, and Asp [70] and to predict mutations in dihydrofolate reductase from methicillin-resistant Staphylococcus aureus (MRSA) [73].
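The idea of scoring binding over ensembles rather than single structures can be sketched in a few lines. The example below assumes the commonly stated form of the K* score, a ratio of Boltzmann-weighted partition functions of the bound complex to those of the free protein and free ligand; it does not call OSPREY and does not implement the ε-approximation or the DEE/A* enumeration that generates the conformations. The function names, the value of RT, and the energies are illustrative assumptions.

```python
import math

RT = 0.592  # kcal/mol, approximately room temperature (assumed value)


def partition_function(energies, rt=RT):
    """Boltzmann-weighted sum over the energies (kcal/mol) of an ensemble
    of conformations for one state."""
    return sum(math.exp(-energy / rt) for energy in energies)


def kstar_score(bound, free_protein, free_ligand, rt=RT):
    """Ensemble-based binding score: q_bound / (q_protein * q_ligand).
    Larger values indicate tighter predicted binding."""
    return partition_function(bound, rt) / (
        partition_function(free_protein, rt) * partition_function(free_ligand, rt)
    )


if __name__ == "__main__":
    # Hypothetical ensembles for two candidate mutants against the same ligand.
    ligand = [0.0, 0.4]
    mutant_a = kstar_score([-6.0, -5.5, -5.1], [0.0, 0.3], ligand)
    mutant_b = kstar_score([-6.2], [0.0, 0.3], ligand)
    print(f"mutant A: {mutant_a:.3g}  mutant B: {mutant_b:.3g}")
```

The usage example hints at why ensembles matter: a mutant with a slightly worse single best conformation can still score comparably or better when several low-energy bound conformations contribute to the partition function.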

Amy E. Keating and her coworkers at Massachusetts Institute of Technology have been working on how the interaction properties of proteins (alpha-helical coiled coils and Bcl-2 family proteins) are encoded in their sequences and structures. Multicoil2, an algorithm that predicts both the location and oligomerization state (two versus three helices) of coiled coils in protein sequences, was developed by Keating's group. It combines the pairwise correlations of the previous Multicoil method with the flexibility of Hidden Markov Models (HMMs) in a Markov Random Field (MRF). This new method significantly improves oligomer-state prediction, as well as coiled-coil detection, over the algorithms Multicoil and Paircoil2 [74, 75]. The performance of Multicoil2 is especially notable in the twilight zone of sequence identity, where HMM profile-based methods typically fail [76]. Recently, DeBartolo et al. in Keating's laboratory used a combination of experimental assays and computational models to explore Bcl-2 homology 3 (BH3) peptide interactions with Bcl-2 family receptors [77]. They evaluated a novel structure-based protein–protein interaction statistical potential called STATIUM that can score interactions of BH3-like peptides with all five Bcl-2 receptors and is rapid enough to evaluate data sets containing more than 10⁶ sequences in less than 1 s. The very general structure-based STATIUM model shows remarkably good performance compared to the experimentally derived position-specific scoring matrix (PSSM) models. STATIUM demonstrates great potential for evaluating candidate protein–protein interactions and can be used to complement other structure-based modeling techniques such as Rosetta, DFIRE, or MM/PBSA that require accurate construction of all-atom models [78–81].

A recent milestone in the field of computational enzyme design has been the de novo design of a Diels-Alderase (DA) [51]. David Baker and his group had previously used the Rosetta computational design methodology to design novel enzymes [20, 21] that catalyze bond-breaking reactions. With the DA reaction, they set out to establish bimolecular bond-forming reactions. The concept required carbon-carbon bond formation between two separate substrates, catalyzing an intermolecular Diels-Alder reaction that requires the concomitant binding of two substrates in their proper relative orientation in order to accelerate the reaction and impart stereoselectivity. Quantum mechanical (QM) simulation was employed to create a comprehensive theozyme library (∼10¹⁹ variants), which was fitted into a library of protein scaffolds by the RosettaMatch software. The ∼10¹⁹ active site configurations were reduced to about 10⁶ possible protein scaffolds. Optimization of these protein scaffolds led to 84 protein designs. The 84 designed proteins were then expressed in E. coli and tested for catalytic behavior in the Diels-Alder reaction, resulting in the identification of two candidates having detectable DA activity. The catalytic efficiency of these two synthetic DAs matched the performance of catalytic antibodies raised for Diels-Alder cycloadditions, and the enzymes exhibited stereoselectivity and substrate specificity. X-ray crystallography confirmed that the structure matched the design for the most active of the enzymes, and binding site substitutions reprogrammed the substrate specificity. Recently, an increased DA activity through backbone remodeling was achieved using the computer game called Foldit [82–85]. The active-site loops of a computationally designed enzyme, DA_20_10 [51], that catalyzes the Diels-Alder reaction, were remodeled. DA_20_10 catalyzed the well-studied reaction between 4-carboxybenzyl trans-1,3-butadiene-1-carbamate (diene) and N,N-dimethylacrylamide. 
After several iterations of design and characterization, a 24-residue helix-turn-helix motif, including a 13-residue insertion, was generated that increased the enzyme activity >18-fold. Using this game, Baker and coworkers aimed to exploit human problem-solving skills to overcome limitations of computer-designed proteins that computers cannot solve alone.

The de novo designed enzymes are functional but do not match natural catalysts in their efficiency. Their catalytic efficiency is still many orders of magnitude below the natural enzymes [86]. Current models still tend to lag behind laboratory-evolved variants in catalytic performance. While some experimental optimization is possible by directed evolution, refinements in the design algorithm will likely yield further improvements in the accuracy of structure predictions and hence provide superior catalytic performance. Separately, the integration of protein dynamics in future simulations might deliver additional functional enhancement and at the same time provide an excellent testing ground for assessing its relevance to biocatalysis.

Engineering novel metalloproteins

Metalloproteins are proteins containing metal atoms or clusters. They are responsible for a diverse range of important biological functions and are involved in all vital cellular activities. The metal contained within a metalloprotein may be an isolated ion or may be coordinated with a non-protein organic compound. For example, hemoproteins contain porphyrin. Alternatively, the metal is co-coordinated with a side chain of the protein and an inorganic non-metallic ion, such as in the case of iron-sulfur clusters. Metalloenzymes occur in all six Enzyme Commission (EC) classes, accounting for 44% of oxidoreductases, 40% of transferases, 39% of hydrolases, 36% of lyases, 36% of isomerases, and 59% of ligases [87]. Metal ions add new functionality to proteins and help catalyze some of the most difficult biological reactions. Furthermore, with their varying redox states and geometric arrangements, metal ions enhance protein reactions. Probably for these reasons, metal-binding proteins account for about 50% of all proteins [88].

A promising approach to metalloprotein design is the knowledge and utilization of the diverse and powerful chemistry of metals [89]. Early pioneering work included the development of a computational method to identify protein sites capable of using side chains to complex metal atoms [45]. A number of metalloproteins were designed in the thioredoxin fold by modeling iron with one primary coordination sphere ligated with dioxygen [44]. These proteins were experimentally shown to bind iron and to catalyze a variety of oxygen chemistries. One such class of these metal binding proteins is the ‘Duo-Ferri’ (DF) series of maquettes, which was developed to mimic di-iron proteins [90]. The DF maquettes bind two iron atoms and can also bind other metal ions (Zn, Co and Mn) with a stoichiometry of two ions per protein [90, 91].

The emphasis of design as a tool has expanded from understanding the important characteristics and functions of naturally occurring metalloproteins to designing functionally active, novel artificial metalloproteins. Designing metalloproteins has proven to be more challenging than designing non-metalloproteins. However, most metal-binding sites are highly chromatic and display distinctive magnetic properties, which makes it easier to characterize a designed metalloprotein using metal-based spectroscopic techniques and thus shortens design cycles. Owing also to advances in biophysical, computational and structural biology, the field of metalloprotein design has enjoyed much success recently [92]. Designing and engineering novel metalloproteins is therefore an important test of our ability to design proteins. One of the most important developments in metalloprotein design has been the design and engineering of a novel metal-binding site into a native protein with a characteristic scaffold. This achievement has given rise to novel proteins catalyzing some of the most complicated biological reactions. Another advantage of choosing metal-binding sites as targets for protein design is the rich spectroscopic data available for evaluating the design process.

Extensive study of metalloproteins has been carried out using biochemical techniques such as site-directed mutagenesis [93–97]. The loss of function accompanied by certain mutations allows the identification of residues essential for function. Although serving a different purpose, the same mutagenesis techniques can be used in metalloprotein design to impart new function into a protein scaffold by introducing residues that bind metal ions. One of the most effective approaches in the design and engineering of novel metalloproteins is the redesign of existing metal-binding sites to new sites possessing totally different structural and functional properties. This approach can be best illustrated using heme proteins. Heme proteins catalyze a variety of reactions ranging from electron transfer, small molecule transport and sensing, to oxygen activation. Redesign of heme proteins, from one type into another, provides a test of the known factors governing the structure and function of a heme protein, and allows the direct comparison of two different heme proteins in the same framework. Heme proteins have been redesigned by varying the axial or proximal ligand, by redesigning the distal side of the heme and/or by redesigning one type of heme protein into another type.

On the basis of sequence, mechanistic, and structural information, and using a novel SIAFE (simultaneous incorporation and adjustment of functional elements) process, the active site of the glyoxalase II (GlyII) αβ/βα metallohydrolase scaffold was reconstructed to bind cefotaxime, a typical substrate of metallo-β-lactamase (MBL), and to catalyze its hydrolysis [98]. To achieve β-lactamase activity, the metal-binding sites of GlyII had to undergo complex redesign. In addition to heme, other metal ions/cofactors have been engineered into helical bundles by introducing metal-binding ligands at specific locations to mimic those in native proteins. Examples are the Cys2His2 ligand set found in zinc-finger proteins [99] and the His3 set in carbonic anhydrase [100].

Introduction of metal-binding sites into a protein location where no native metal-binding site is found has provided insight into the structural features common to the metal-binding sites of template and target proteins. Using structural homology between the template protein (which contains no metal ions) and the target metalloprotein, metal-binding sites can be introduced into the template protein at the positions corresponding to those in which they are found in the metalloprotein. Using this strategy, new Zn(II)-binding sites were introduced into charybdotoxin [101] and retinol-binding protein [102] to mimic carbonic anhydrase. These developments in the field of metalloproteins led to the design of dinuclear metal-binding sites and metal clusters. For example, CuA centres were designed into the cupredoxins azurin [103, 104] and amicyanin [105, 106] by replacing the copper-binding loop of the cupredoxin with the corresponding loop of cytochrome c oxidase (COX; also known as CcO), which is structurally homologous. A series of His3Fe sites were introduced into thioredoxin in various environments, classified as grooves, shallow pockets and a deep pocket, allowing the effect of the protein microenvironment on “nascent” enzymatic activity to be studied [43, 44].

Sequence-homology modeling and molecular dynamics simulation were employed to assemble the presumptive active site metal complex of nitric oxide reductase (NOR) in sperm whale myoglobin (Mb) [107]. NOR is a metalloenzyme in the denitrification pathway of anaerobic bacteria, and is a key enzyme in the nitrogen cycle that is critical for all life [108]. Difficulties in obtaining the enzyme in high yield, and the lack of a three-dimensional structure, have hampered structural and mechanistic studies of NORs. Assembling the proposed catalytic site, consisting of a heme and a putative FeB site, in the Mb scaffold established a strong model system for exploring the spectroscopic properties and validating the catalytic function of the hypothetical metal complex. A hydrophobic pocket near the heme cofactor was remodeled in order to establish the new non-heme Fe2+ binding site. The remodeling was done by introducing three histidines and one glutamate, predicted to be ligands in the active site of NOR, into the distal pocket of Mb. A crystal structure of the designed protein confirmed the heme/non-heme FeB center predicted by the minimized computer model. The designed protein also exhibited NO reduction activity, and thus modeled both the structure and function of NOR, offering the insight that the active site glutamate is required for both iron binding and activity. The designed protein served as an excellent model for mechanistic studies of NOR. Engineering of the metal-binding sites of metalloproteins, together with the study of native enzymes, has enabled identification of the structural features that are necessary to confer the structure and function of these enzymes.

Using a “metal-first” approach [109, 110], a four-helix bundle protein was designed to bind a four-iron four-sulfur (Fe4S4) cluster in its hydrophobic core. This is particularly noteworthy given that natural Fe4S4-binding proteins are not α-helical and generally bind the ligand in flexible loops [111]. Recently, Kuhlman and his group devised a strategy of introducing a metal-binding site at the target interface in order to promote protein interaction (Figure 3). To this end, they computationally designed a metal-mediated homodimer, MID1 (metal interface design 1), with high affinity and orientation preference. The design of the symmetric metal-mediated interface involved four steps: i) examination of two-residue cysteine/histidine zinc-binding sites using the RosettaMatch algorithm, during which about 600 monomer scaffold surfaces were scanned; ii) grafting of all pairs of two-residue zinc-binding sites for a given scaffold onto the surface, and conversion of the monomer to a C2-symmetric dimer by rotation, which identified 500,000 designable starting structures among the 600 scaffolds using their Rosetta protocol; iii) iteration of symmetric interface design, using Monte Carlo simulated annealing, with symmetric backbone minimization; and iv) identification of the best design models on the basis of two primary metrics: computed binding energy, excluding the contribution from zinc (ΔGbind), and binding energy per unit of interface surface area (ΔGbind/ΔSASA). The eight best models were then tested experimentally. The final MID1 design succeeded only after the crystal structures of previous iterations in the design process had been taken into account. In the absence of metal, the MID1 design dimerized only weakly and with two types of nonspecific orientations. In the presence of metal, the desired binding orientation was achieved with high affinity, despite minor discrepancies at the atomic level between the computational model and the crystal structure [112]. De novo protein design thus provides an attractive approach for modeling the active sites of metalloproteins. The design strategy presented here enabled the conversion of an enzyme in the metallohydrolase superfamily into a new family member with a different catalytic function, providing experimental support for the divergent evolution of mechanistically diverse family enzymes.
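The final selection step described above (ranking models by ΔGbind and by ΔGbind/ΔSASA) can be illustrated with a minimal Python sketch. This is not the authors' Rosetta protocol: the `DesignModel` class, the example values used with it, and the rule for combining the two metrics (sum of the per-metric ranks) are assumptions introduced purely for illustration, since the text names the two metrics but not how they were weighed against each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignModel:
    """One candidate interface design (hypothetical container, not a Rosetta type)."""
    name: str
    dg_bind: float      # computed binding energy without the zinc contribution; more negative is better
    delta_sasa: float   # interface surface area buried on binding, in square angstroms

    @property
    def dg_per_area(self) -> float:
        # Binding energy per unit of buried interface area (dG_bind / dSASA).
        return self.dg_bind / self.delta_sasa


def rank_designs(models, top_n=8):
    """Return the top_n models by the sum of their ranks on the two metrics.

    Assumption: both metrics are weighted equally; ties are broken by dg_bind.
    """
    by_dg = sorted(models, key=lambda m: m.dg_bind)
    by_density = sorted(models, key=lambda m: m.dg_per_area)
    rank_dg = {m.name: i for i, m in enumerate(by_dg)}
    rank_density = {m.name: i for i, m in enumerate(by_density)}
    combined = sorted(
        models,
        key=lambda m: (rank_dg[m.name] + rank_density[m.name], m.dg_bind),
    )
    return combined[:top_n]


# Made-up values for illustration only; they are not data from the study.
candidates = [
    DesignModel("model_a", dg_bind=-20.0, delta_sasa=1000.0),
    DesignModel("model_b", dg_bind=-15.0, delta_sasa=500.0),
    DesignModel("model_c", dg_bind=-5.0, delta_sasa=800.0),
    DesignModel("model_d", dg_bind=-25.0, delta_sasa=1000.0),
]
for model in rank_designs(candidates, top_n=2):
    print(model.name, model.dg_bind, round(model.dg_per_area, 4))
```

Normalizing by buried area penalizes designs that reach a favorable total energy only by burying a very large interface, which is presumably why both metrics were considered together.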

Figure 3.

Computationally designed protein-protein interactions with high affinity and desired orientation. (A) The symmetric homodimer design with two interface zinc sites each coordinated by four histidines at i, i + 4 positions on each helix [112]. A Rosetta-based approach for the rational design of a protein monomer to form a zinc-mediated, symmetric homodimer. Incorporating metal-binding sites at the target interface may be one approach for increasing affinity and specifying the binding mode. (B) Metal interface design, named MID1 (NESG target ID OR37), forms a tight dimer in the presence of zinc (MID1-zinc) with a dissociation constant <30 nM.
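To put the reported dissociation constant in energetic terms, the following Python sketch converts a Kd into a standard binding free energy using ΔG° = RT ln(Kd / 1 M), and a fold change in affinity into the corresponding ΔΔG. The temperature of 298.15 K and the 1 M standard state are assumptions made for this example; the conditions of the original measurement are not stated here.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant in mol/L.

    Uses dG = RT ln(Kd / 1 M); the temperature default is an assumption.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def ddg_from_fold_change(fold_increase, temperature_k=298.15):
    """Change in binding free energy (kcal/mol) for a fold increase in affinity."""
    if fold_increase <= 0:
        raise ValueError("fold change must be positive")
    return -R_KCAL * temperature_k * math.log(fold_increase)


# Kd = 30 nM is the upper bound quoted for MID1-zinc in the caption above.
print(round(binding_free_energy(30e-9), 2))
# A 200-fold gain in affinity is the lower bound listed for zinc binding in Table 1.
print(round(ddg_from_fold_change(200), 2))
```

Under these assumptions, a Kd of 30 nM corresponds to roughly -10.3 kcal/mol, and a 200-fold increase in affinity to a stabilization of about 3.1 kcal/mol. Because the reported values are bounds, these figures are bounds as well.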

Design and engineering of therapeutic proteins

Computational design holds great potential for the development of new protein-based therapeutics with novel modes of action. The method of systematic and quantitative engineering strategies for protein optimization is now being replaced by computational protein design methods. Antibodies are the predominant class of computationally designed proteins that are used as therapeutics. Rationally designed antibody molecules catalyze numerous chemical transformations, including many that cannot be achieved by standard chemical methods. Using computational methods different hapten [113] and antibody [114] design strategies have been developed. These strategies include a transition state analog approach. However, the catalytic efficiency of the resulting molecules has been low relative to natural enzymes [115].

In a significant advance, catalytic antibodies that utilize a nucleophilic mechanism were selected by reactive immunization and resulted in efficient catalysts [115]. Rather than a transition state analog, a mechanism-based inhibitor was used to elicit the immune response. Antibodies that formed stable covalent attachments to the suicide inhibitor were effectively selected. This method was employed in the selection of an efficient abzyme (catalytic antibody) with a nucleophilic lysine for aldol condensations [116]. The efficiency of this aldolase demonstrates the effectiveness of covalent catalysis. The ability to select for powerful catalytic groups and active sites with high transition-state specificity could theoretically yield more efficient catalysts.

Antibodies have several drawbacks despite their significant clinical success. They are large, which limits their entry into tumors and tissues, require expensive manufacturing and handling facilities, and often cause undesired effector functions. Although smaller antibody fragments have been developed, they are often associated with weaker binding than the intact antibody, they can exhibit lower stability, and they might expose immunogenic epitopes that were previously masked [117]. This has led to the development of ligand- and receptor-based agonists or antagonists with therapeutic potential.

A program that utilizes the information embedded in a protein structure to optimize the function of a protein, including its activity, binding affinity and specificity, stability, expression level, and potency, was developed by Xencor. Xencor's Protein Design Automation (PDA™) technology [58, 118–121] couples computational design algorithms, which generate quality sequence diversity, with experimental high-throughput screening to discover proteins with improved properties. In order to capture the relationships between protein sequence, structure and function accurately, the computational program uses atomic-level scoring functions, side chain rotamer sampling and advanced optimization methods. Another method was used to design proteins that bind a conserved surface patch on the stem of the influenza hemagglutinin (HA) from the 1918 H1N1 pandemic virus [52]. Two of the designed proteins (HB36 and HB80), after affinity maturation, were found to bind H1 and H5 HAs with low nanomolar affinity. HB80 was demonstrated to inhibit the HA fusogenic conformational changes induced at low pH. Such designed binding proteins may be useful for both diagnostics and therapeutics (Figure 4).

Figure 4.

Design of novel binding proteins. (A) Crystal structure of the HB36.3-SC1918/H1 complex. (B) Close-up view of the SC1918 HA-HB36.3 interface [52].

The most effective way to design protein drugs is to use computational methods in conjunction with functional screening techniques. In silico methods can explore much larger portions of sequence space than can be accessed experimentally, and can be used to design targeted libraries that are enriched in functional sequences [122–124]. Although computational design holds great potential for the development of new protein-based therapeutics with novel modes of action, many challenges remain.
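The gap between sequence space and experimentally accessible library sizes can be made concrete with a short calculation. The sketch below counts the sequences obtained by fully randomizing n positions over the 20 canonical amino acids (20^n) and finds how many positions a library of a given size could cover exhaustively. The library size of 10^9 variants is an illustrative assumption, not a figure taken from this review.

```python
def sequence_space_size(n_positions, alphabet_size=20):
    """Number of distinct sequences when n_positions are fully randomized."""
    return alphabet_size ** n_positions


def max_fully_sampled_positions(library_size, alphabet_size=20):
    """Largest n such that alphabet_size**n does not exceed library_size."""
    n = 0
    while alphabet_size ** (n + 1) <= library_size:
        n += 1
    return n


assumed_library_size = 10 ** 9  # hypothetical experimental library size

for n in (5, 10, 20):
    print(n, sequence_space_size(n))

print(max_fully_sampled_positions(assumed_library_size))
```

Randomizing only ten positions already gives about 10^13 sequences, so under the assumed library size no more than six positions could be sampled exhaustively. This is the motivation for computationally designed, targeted libraries.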

Conclusions

Computationally designed proteins offer promise in many areas of research, from basic biology to applications in industrial organic synthesis and biomedicine. There has been remarkable progress in the field of rational enzyme design. This field has evolved from the design of proteins with improved features, such as thermostability, catalytic activity, better metal affinity, substrate specificity, or stereoselectivity, to the design of novel proteins and folds ab initio. The field has made exciting progress, designing proteins with new structures and functions. In early 1998, a novel sequence that folded into a naturally occurring zinc finger structure was computed. Later, in 2003, an exceptionally stable protein called Top7, which has a sequence and structure unrelated to any known protein, was designed using ROSETTA. This achievement, demonstrating the power to create a brand new protein, encouraged the scientific community to design novel proteins with atomic-level accuracy. In 2010, a non-natural aldolase capable of catalyzing reactions of non-natural substrates was designed. The aldol reaction constitutes one of the most powerful tools for the formation of carbon-carbon bonds both in nature and in the laboratory. In 2012, the most recent accomplishment in computational design was an increase in Diels-Alderase (DA) activity through backbone remodeling, which was achieved using a game called Foldit. Table 1 lists a number of different enzymes that have been developed using computational protein design approaches over the past few years.

Table 1.

Summary of computationally designed biocatalysts with novel functions

| Target | Protein design goal | Methodology (computational) | Methodology (experimental) | Metric(s) | Conclusions | Ref. |
|---|---|---|---|---|---|---|
| Diels-Alderase | Biocatalyst for intermolecular Diels-Alder reaction (novel catalysis) | QM/MM simulations, RosettaMatch and Design software | Site-directed mutagenesis | kcat = 0.036 min⁻¹; kcat/Km = 0.455 M⁻¹ s⁻¹ | Stereoselective Diels-Alderase, functional performance matches catalytic antibodies | [51] |
| Kemp eliminase | Novel catalysis | De novo design via Rosetta | Directed evolution | kcat = 1.37 s⁻¹; kcat/Km = 2590 M⁻¹ s⁻¹ | >200-fold increase in kcat/Km | [21] |
| Nitric oxide reductase | Reconstitute active site of NOR in myoglobin | VMD software (molecular modeling) | Site-directed mutagenesis | Yield of N2O production by Fe(II)-FeBMb was estimated to be ∼30% | Functional model of NOR | [107] |
| Gramicidin S synthetase A (GrsA-PheA) | Substrate specificity from Phe to Leu, Arg, Lys, Glu, or Asp | K* algorithm (mutagenesis with rotamer library, flexible backbone and dynamic ligand substrate) | Site-directed mutagenesis | kcat = 0.85 min⁻¹; kcat/Km = 159.86 mM⁻¹ min⁻¹ | 600-fold specificity shift for Phe→Leu due to changes in Km values | [70] |
| Mouse adenosine deaminase | Hydrolysis of a model organophosphate | Rosetta Design, RosettaMatch | Directed evolution | kcat/Km ∼10⁴ M⁻¹ s⁻¹ | Enhanced activity by ∼2,500-fold | [49] |
| Rab4-binding domain of rabenosyn | Protein monomer to form a zinc-mediated, symmetric homodimer | RosettaMatch | Site-directed mutagenesis | Tight binding (Kd < 30 nM) | Zinc binding leads to a >200-fold increase in binding affinity | [112] |
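The kinetic metrics in Table 1 are reported in mixed units (s⁻¹ and min⁻¹, M⁻¹ and mM⁻¹), which makes direct comparison awkward. The following Python sketch converts them to common units and back-calculates the Km implied by a reported kcat and kcat/Km pair. It assumes simple single-substrate Michaelis-Menten kinetics and that both numbers in a row come from the same fit; the helper names are hypothetical, and the bimolecular Diels-Alderase is deliberately left out because it has a separate Km for each substrate.

```python
SECONDS_PER = {"s": 1.0, "min": 60.0}
MOLAR_PER = {"M": 1.0, "mM": 1e-3, "uM": 1e-6}


def rate_per_second(value, time_unit):
    """Convert a turnover number given per time_unit into s^-1."""
    return value / SECONDS_PER[time_unit]


def efficiency_per_molar_second(value, conc_unit, time_unit):
    """Convert kcat/Km given in conc_unit^-1 time_unit^-1 into M^-1 s^-1."""
    return value / MOLAR_PER[conc_unit] / SECONDS_PER[time_unit]


def implied_km_molar(kcat_per_s, efficiency_per_m_s):
    """Km (mol/L) implied by kcat and kcat/Km, assuming Michaelis-Menten kinetics."""
    return kcat_per_s / efficiency_per_m_s


# Kemp eliminase row: kcat = 1.37 s^-1, kcat/Km = 2590 M^-1 s^-1
km_kemp = implied_km_molar(1.37, 2590.0)

# GrsA-PheA row: kcat = 0.85 min^-1, kcat/Km = 159.86 mM^-1 min^-1
kcat_grsa = rate_per_second(0.85, "min")
eff_grsa = efficiency_per_molar_second(159.86, "mM", "min")
km_grsa = implied_km_molar(kcat_grsa, eff_grsa)

print(f"Kemp eliminase: implied Km = {km_kemp * 1e3:.2f} mM")
print(f"GrsA-PheA: kcat/Km = {eff_grsa:.0f} M^-1 s^-1, implied Km = {km_grsa * 1e6:.1f} uM")
```

Expressed this way, the implied Km values are about 0.53 mM for the Kemp eliminase and about 5.3 µM for the GrsA-PheA variant, and their catalytic efficiencies are of similar magnitude (roughly 2.6 × 10³ M⁻¹ s⁻¹) even though the turnover numbers differ by about two orders of magnitude.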

Despite all this progress and these breakthroughs in the field of enzymology, designed or engineered enzyme catalysts remain inferior to naturally evolved enzyme catalysts in terms of activity and efficiency. One example is the artificially designed Kemp eliminase, although its catalytic efficiency improved >200-fold when it was subjected to directed evolution, which resulted in a protein with multiple mutations. Accurate structure modeling, protein stability, and intermolecular interaction optimization remain the major challenges in the field of computational protein design. Each of these major barriers has received significant attention in the past few years, and many artificial protein designs have been produced as a result. In addition, incorporation of the protein backbone and of ligand exchange between the active site and the solvent into computational design methods will help improve artificially designed proteins. Although there is still a long way to go, with improvements to algorithms and increases in computing power, exciting progress is being made in both prediction and design. The success obtained by rational computational design is extremely encouraging, demonstrating that protein design represents a fundamental tool for understanding protein folding and interaction. Rational computational design promises a great positive impact on both the biotechnology and the therapeutic industries, thus revolutionizing the fields of molecular biology and biomedicine.

Acknowledgements

This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2009-0070463 and 2009-0077718).

Competing Interests

The authors have declared that no competing interests exist.

References

  • 1.Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, et al. (2005) Genome sequencing in microfabricated high-density picolitre reactors. Nature 437: 376–380 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Todd AE, Marsden RL, Thornton JM, Orengo CA (2005) Progress of structural genomics initiatives: an analysis of solved target structures. J Mol Biol 348: 1235–1260 [DOI] [PubMed] [Google Scholar]
  • 3.George RA, Spriggs RV, Bartlett GJ, Gutteridge A, MacArthur MW, et al. (2005) Effective function annotation through catalytic residue conservation. Proc Natl Acad Sci U S A 102: 12299–12304 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Sillitoe I, Dibley M, Bray J, Addou S, Orengo C (2005) Assessing strategies for improved superfamily recognition. Protein Sci 14: 1800–1810 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Lee D, Redfern O, Orengo C (2007) Predicting protein function from sequence and structure. Nat Rev Mol Cell Biol 8: 995–1005 [DOI] [PubMed] [Google Scholar]
  • 6.Liolios K, Tavernarakis N, Hugenholtz P, Kyrpides NC (2006) The Genomes On Line Database (GOLD) v.2: a monitor of genome projects worldwide. Nucleic Acids Res 34: D332–334 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL (2006) GenBank. Nucleic Acids Res 34: D16–20 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Dessailly BH, Nair R, Jaroszewski L, Fajardo JE, Kouranov A, et al. (2009) PSI-2: structural genomics to cover protein domain family space. Structure 17: 869–881 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Chandonia JM, Brenner SE (2006) The impact of structural genomics: expectations and outcomes. Science 311: 347–351 [DOI] [PubMed] [Google Scholar]
  • 10.Levitt M (2009) Nature of the protein universe. Proc Natl Acad Sci U S A 106: 11079–11084 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Bradley P, Malmstrom L, Qian B, Schonbrun J, Chivian D, et al. (2005) Free modeling with Rosetta in CASP6. Proteins 61 Suppl 7: 128–134 [DOI] [PubMed] [Google Scholar]
  • 12.Bonneau R, Baker D (2001) Ab initio protein structure prediction: progress and prospects. Annu Rev Biophys Biomol Struct 30: 173–189 [DOI] [PubMed] [Google Scholar]
  • 13.Jin W, Kambara O, Sasakawa H, Tamura A, Takada S (2003) De novo design of foldable proteins with smooth folding funnel: automated negative design and experimental verification. Structure 11: 581–590 [DOI] [PubMed] [Google Scholar]
  • 14.Jaramillo A, Wernisch L, Hery S, Wodak SJ (2001) Automatic procedures for protein design. Comb Chem High Throughput Screen 4: 643–659 [DOI] [PubMed] [Google Scholar]
  • 15.Bolon DN, Grant RA, Baker TA, Sauer RT (2005) Specificity versus stability in computational protein design. Proc Natl Acad Sci U S A 102: 12724–12729 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Joachimiak LA, Kortemme T, Stoddard BL, Baker D (2006) Computational design of a new hydrogen bond network and at least a 300-fold specificity switch at a protein-protein interface. J Mol Biol 361: 195–208 [DOI] [PubMed] [Google Scholar]
  • 17.Ashworth J, Havranek JJ, Duarte CM, Sussman D, Monnat RJJr, et al. (2006) Computational redesign of endonuclease DNA binding and cleavage specificity. Nature 441: 656–659 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Bolon DN, Mayo SL (2001) Enzyme-like proteins by computational design. Proc Natl Acad Sci U S A 98: 14274–14279 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Kaplan J, DeGrado WF (2004) De novo design of catalytic proteins. Proc Natl Acad Sci U S A 101: 11566–11570 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Jiang L, Althoff EA, Clemente FR, Doyle L, Rothlisberger D, et al. (2008) De novo computational design of retro-aldol enzymes. Science 319: 1387–1391 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Rothlisberger D, Khersonsky O, Wollacott AM, Jiang L, DeChancie J, et al. (2008) Kemp elimination catalysts by computational enzyme design. Nature 453: 190–195 [DOI] [PubMed] [Google Scholar]
  • 22.Lazar GA, Dang W, Karki S, Vafa O, Peng JS, et al. (2006) Engineered antibody Fc variants with enhanced effector function. Proc Natl Acad Sci U S A 103: 4005–4010 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Ogata K, Jaramillo A, Cohen W, Briand JP, Connan F, et al. (2003) Automatic sequence design of major histocompatibility complex class I binding peptides impairing CD8+ T cell recognition. J Biol Chem 278: 1281–1290 [DOI] [PubMed] [Google Scholar]
  • 24.Shifman JM, Choi MH, Mihalas S, Mayo SL, Kennedy MB (2006) Ca2+/calmodulin-dependent protein kinase II (CaMKII) is activated by calmodulin with two bound calciums. Proc Natl Acad Sci U S A 103: 13968–13973 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Cochran FV, Wu SP, Wang W, Nanda V, Saven JG, et al. (2005) Computational de novo design and characterization of a four-helix bundle protein that selectively binds a nonbiological cofactor. J Am Chem Soc 127: 1346–1347 [DOI] [PubMed] [Google Scholar]
  • 26.Jurgens C, Strom A, Wegener D, Hettwer S, Wilmanns M, et al. (2000) Directed evolution of a (beta alpha)8-barrel enzyme to catalyze related reactions in two different metabolic pathways. Proc Natl Acad Sci U S A 97: 9925–9930 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Canada KA, Iwashita S, Shim H, Wood TK (2002) Directed evolution of toluene ortho-monooxygenase for enhanced 1-naphthol synthesis and chlorinated ethene degradation. J Bacteriol 184: 344–349 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Bornscheuer UT, Kazlauskas RJ (2004) Catalytic promiscuity in biocatalysis: using old enzymes to form new bonds and follow new pathways. Angew Chem Int Ed Engl 43: 6032–6040 [DOI] [PubMed] [Google Scholar]
  • 29.Kazlauskas RJ (2005) Enhancing catalytic promiscuity for biocatalysis. Curr Opin Chem Biol 9: 195–201 [DOI] [PubMed] [Google Scholar]
  • 30.Saven JG (2011) Computational protein design: engineering molecular diversity, nonnatural enzymes, nonbiological cofactor complexes, and membrane proteins. Curr Opin Chem Biol 15: 452–457 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Pantazes RJ, Grisewood MJ, Maranas CD (2011) Recent advances in computational protein design. Curr Opin Struct Biol 21: 467–472 [DOI] [PubMed] [Google Scholar]
  • 32.Samish I, MacDermaid CM, Perez-Aguilar JM, Saven JG (2011) Theoretical and computational protein design. Annu Rev Phys Chem 62: 129–149 [DOI] [PubMed] [Google Scholar]
  • 33.Havranek JJ (2010) Specificity in computational protein design. J Biol Chem 285: 31095–31099 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Kuhlman B, Dantas G, Ireton GC, Varani G, Stoddard BL, et al. (2003) Design of a novel globular protein fold with atomic-level accuracy. Science 302: 1364–1368 [DOI] [PubMed] [Google Scholar]
  • 35.Bolon DN, Voigt CA, Mayo SL (2002) De novo design of biocatalysts. Curr Opin Chem Biol 6: 125–129 [DOI] [PubMed] [Google Scholar]
  • 36.Desjarlais JR, Clarke ND (1998) Computer search algorithms in protein modification and design. Curr Opin Struct Biol 8: 471–475 [DOI] [PubMed] [Google Scholar]
  • 37.Nanda V (2008) Do-it-yourself enzymes. Nat Chem Biol 4: 273–275 [DOI] [PubMed] [Google Scholar]
  • 38.Nanda V, Koder RL (2010) Designing artificial enzymes by intuition and computation. Nat Chem 2: 15–24 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Garcia-Viloca M, Gao J, Karplus M, Truhlar DG (2004) How enzymes work: analysis by modern rate theory and computer simulations. Science 303: 186–195 [DOI] [PubMed] [Google Scholar]
  • 40.Hellinga HW, Caradonna JP, Richards FM (1991) Construction of new ligand binding sites in proteins of known structure. II. Grafting of a buried transition metal binding site into Escherichia coli thioredoxin. J Mol Biol 222: 787–803 [DOI] [PubMed] [Google Scholar]
  • 41.Robertson DE, Farid RS, Moser CC, Urbauer JL, Mulholland SE, et al. (1994) Design and synthesis of multi-haem proteins. Nature 368: 425–432 [DOI] [PubMed] [Google Scholar]
  • 42.Klemba M, Gardner KH, Marino S, Clarke ND, Regan L (1995) Novel metal-binding proteins by design. Nat Struct Biol 2: 368–373 [DOI] [PubMed] [Google Scholar]
  • 43.Pinto AL, Hellinga HW, Caradonna JP (1997) Construction of a catalytically active iron superoxide dismutase by rational protein design. Proc Natl Acad Sci U S A 94: 5562–5567 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Benson DE, Wisz MS, Hellinga HW (2000) Rational design of nascent metalloenzymes. Proc Natl Acad Sci U S A 97: 6292–6297 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Hellinga HW, Richards FM (1991) Construction of new ligand binding sites in proteins of known structure. I. Computer-aided modeling of sites with pre-defined geometry. J Mol Biol 222: 763–785 [DOI] [PubMed] [Google Scholar]
  • 46.Holmgren A (1985) Thioredoxin. Annu Rev Biochem 54: 237–271 [DOI] [PubMed] [Google Scholar]
  • 47.Ladbury JE, Wynn R, Hellinga HW, Sturtevant JM (1993) Stability of oxidized Escherichia coli thioredoxin and its dependence on protonation of the aspartic acid residue in the 26 position. Biochemistry 32: 7526–7530 [DOI] [PubMed] [Google Scholar]
  • 48.Desmet J, De Maeyer M, Hazes B, Lasters I (1992) The dead-end elimination theorem and its use in protein side-chain positioning. Nature 356: 539–542 [DOI] [PubMed] [Google Scholar]
  • 49.Khare SD, Kipnis Y, Greisen PJr, Takeuchi R, Ashani Y, et al. (2012) Computational redesign of a mononuclear zinc metalloenzyme for organophosphate hydrolysis. Nat Chem Biol 8: 294–300 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Host GE, Razkin J, Baltzer L, Jonsson BH (2007) Combined enzyme and substrate design: grafting of a cooperative two-histidine catalytic motif into a protein targeted at the scissile bond in a designed ester substrate. Chembiochem 8: 1570–1576 [DOI] [PubMed] [Google Scholar]
  • 51.Siegel JB, Zanghellini A, Lovick HM, Kiss G, Lambert AR, et al. (2010) Computational design of an enzyme catalyst for a stereoselective bimolecular Diels-Alder reaction. Science 329: 309–313 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Fleishman SJ, Whitehead TA, Ekiert DC, Dreyfus C, Corn JE, et al. (2011) Computational design of proteins targeting the conserved stem region of influenza hemagglutinin. Science 332: 816–821 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Shah PS, Hom GK, Ross SA, Lassila JK, Crowhurst KA, et al. (2007) Full-sequence computational design and solution structure of a thermostable protein variant. J Mol Biol 372: 1–6 [DOI] [PubMed] [Google Scholar]
  • 54.Dantas G, Kuhlman B, Callender D, Wong M, Baker D (2003) A large scale test of computational protein design: folding and stability of nine completely redesigned globular proteins. J Mol Biol 332: 449–460 [DOI] [PubMed] [Google Scholar]
  • 55.Gordon DB, Marshall SA, Mayo SL (1999) Energy functions for protein design. Curr Opin Struct Biol 9: 509–513 [DOI] [PubMed] [Google Scholar]
  • 56. Clarke ND, Yuan SM (1995) Metal search: a computer program that helps design tetrahedral metal-binding sites. Proteins 23: 256–263
  • 57. Regan L, Clarke ND (1990) A tetrahedral zinc(II)-binding site introduced into a designed protein. Biochemistry 29: 10878–10883
  • 58. Dahiyat BI, Mayo SL (1996) Protein design automation. Protein Sci 5: 895–903
  • 59. Zanghellini A, Jiang L, Wollacott AM, Cheng G, Meiler J, et al. (2006) New algorithms and an in silico benchmark for computational enzyme design. Protein Sci 15: 2785–2794
  • 60. Lasters I, Desmet J (1993) The fuzzy-end elimination theorem: correctly implementing the side chain placement algorithm based on the dead-end elimination theorem. Protein Eng 6: 717–722
  • 61. Simons KT, Bonneau R, Ruczinski I, Baker D (1999) Ab initio protein structure prediction of CASP III targets using ROSETTA. Proteins Suppl 3: 171–176
  • 62. Richter F, Leaver-Fay A, Khare SD, Bjelic S, Baker D (2011) De novo enzyme design using Rosetta3. PLoS One 6: e19230
  • 63. Renfrew PD, Campbell G, Strauss CE, Bonneau R (2011) The 2010 Rosetta developers meeting: macromolecular prediction and design meets reproducible publishing. PLoS One 6: e22431
  • 64. Jacak R, Leaver-Fay A, Kuhlman B (2012) Computational protein design with explicit consideration of surface hydrophobic patches. Proteins 80: 825–838
  • 65. Zhu X, Lai L (2009) A novel method for enzyme design. J Comput Chem 30: 256–267
  • 66. Georgiev I, Lilien RH, Donald BR (2008) The minimized dead-end elimination criterion and its application to protein redesign in a hybrid scoring and search algorithm for computing partition functions over molecular ensembles. J Comput Chem 29: 1527–1542
  • 67. Tynan-Connolly BM, Nielsen JE (2007) Redesigning protein pKa values. Protein Sci 16: 239–249
  • 68. Pan Y, Gao D, Yang W, Cho H, Zhan CG (2007) Free energy perturbation (FEP) simulation on the transition states of cocaine hydrolysis catalyzed by human butyrylcholinesterase and its mutants. J Am Chem Soc 129: 13537–13543
  • 69. Zheng F, Yang W, Ko MC, Liu J, Cho H, et al. (2008) Most efficient cocaine hydrolase designed by virtual screening of transition states. J Am Chem Soc 130: 12148–12155
  • 70. Chen CY, Georgiev I, Anderson AC, Donald BR (2009) Computational structure-based redesign of enzyme activity. Proc Natl Acad Sci U S A 106: 3764–3769
  • 71. Gainza P, Roberts KE, Donald BR (2012) Protein design using continuous rotamers. PLoS Comput Biol 8: e1002335
  • 72. Lilien RH, Stevens BW, Anderson AC, Donald BR (2005) A novel ensemble-based scoring and search algorithm for protein redesign and its application to modify the substrate specificity of the gramicidin synthetase A phenylalanine adenylation enzyme. J Comput Biol 12: 740–761
  • 73. Frey KM, Georgiev I, Donald BR, Anderson AC (2010) Predicting resistance mutations using protein design algorithms. Proc Natl Acad Sci U S A 107: 13707–13712
  • 74. Wolf E, Kim PS, Berger B (1997) MultiCoil: a program for predicting two- and three-stranded coiled coils. Protein Sci 6: 1179–1189
  • 75. McDonnell AV, Jiang T, Keating AE, Berger B (2006) Paircoil2: improved prediction of coiled coils from sequence. Bioinformatics 22: 356–358
  • 76. Trigg J, Gutwin K, Keating AE, Berger B (2011) Multicoil2: predicting coiled coils and their oligomerization states from sequence in the twilight zone. PLoS One 6: e23519
  • 77. Debartolo J, Dutta S, Reich L, Keating AE (2012) Predictive bcl-2 family binding models rooted in experiment or structure. J Mol Biol 422: 124–144
  • 78. Zhang C, Liu S, Zhou H, Zhou Y (2004) An accurate, residue-level, pair potential of mean force for folding and binding based on the distance-scaled, ideal-gas reference state. Protein Sci 13: 400–411
  • 79. Raveh B, London N, Zimmerman L, Schueler-Furman O (2011) Rosetta FlexPepDock ab-initio: simultaneous folding, docking and refinement of peptides onto their receptors. PLoS One 6: e18934
  • 80. King CA, Bradley P (2010) Structure-based prediction of protein-peptide specificity in Rosetta. Proteins 78: 3437–3449
  • 81. Honig B, Nicholls A (1995) Classical electrostatics in biology and chemistry. Science 268: 1144–1149
  • 82. Eiben CB, Siegel JB, Bale JB, Cooper S, Khatib F, et al. (2012) Increased Diels-Alderase activity through backbone remodeling guided by Foldit players. Nat Biotechnol 30: 190–192
  • 83. Khatib F, Cooper S, Tyka MD, Xu K, Makedon I, et al. (2011) Algorithm discovery by protein folding game players. Proc Natl Acad Sci U S A 108: 18949–18953
  • 84. Khatib F, DiMaio F, Cooper S, Kazmierczyk M, Gilski M, et al. (2011) Crystal structure of a monomeric retroviral protease solved by protein folding game players. Nat Struct Mol Biol 18: 1175–1177
  • 85. Cooper S, Khatib F, Treuille A, Barbero J, Lee J, et al. (2010) Predicting protein structures with a multiplayer online game. Nature 466: 756–760
  • 86. Wolfenden R, Snider MJ (2001) The depth of chemical time and the power of enzymes as catalysts. Acc Chem Res 34: 938–945
  • 87. Andreini C, Bertini I, Cavallaro G, Holliday GL, Thornton JM (2008) Metal ions in biological catalysis: from enzyme databases to general principles. J Biol Inorg Chem 13: 1205–1218
  • 88. Thomson AJ, Gray HB (1998) Bio-inorganic chemistry. Curr Opin Chem Biol 2: 155–158
  • 89. Benson DE, Wisz MS, Hellinga HW (1998) The development of new biotechnologies using metalloprotein design. Curr Opin Biotechnol 9: 370–376
  • 90. Lombardi A, Summa CM, Geremia S, Randaccio L, Pavone V, et al. (2000) Retrostructural analysis of metalloproteins: application to the design of a minimal model for diiron proteins. Proc Natl Acad Sci U S A 97: 6298–6305
  • 91. Torres Martin de Rosales R, Faiella M, Farquhar E, Que L Jr, Andreozzi C, et al. (2010) Spectroscopic and metal-binding properties of DF3: an artificial protein able to accommodate different metal ions. J Biol Inorg Chem 15: 717–728
  • 92. Lu Y, Yeung N, Sieracki N, Marshall NM (2009) Design of functional metalloproteins. Nature 460: 855–862
  • 93. He QY, Mason AB, Pakdaman R, Chasteen ND, Dixon BK, et al. (2000) Mutations at the histidine 249 ligand profoundly alter the spectral and iron-binding properties of human serum transferrin N-lobe. Biochemistry 39: 1205–1210
  • 94. Schmidt AM, Muller HN, Skerra A (1996) A Zn(II)-binding site engineered into retinol-binding protein exhibits metal-ion specificity and allows highly efficient affinity purification with a newly designed metal ligand. Chem Biol 3: 645–653
  • 95. Marino SF, Regan L (1999) Secondary ligands enhance affinity at a designed metal-binding site. Chem Biol 6: 649–655
  • 96. Matzapetakis M, Pecoraro VL (2005) Site-selective metal binding by designed alpha-helical peptides. J Am Chem Soc 127: 18229–18233
  • 97. Tiwari MK, Singh RK, Singh R, Jeya M, Zhao H, et al. (2012) Role of conserved glycine in zinc-dependent medium chain dehydrogenase/reductase superfamily. J Biol Chem 287: 19429–19439
  • 98. Park HS, Nam SH, Lee JK, Yoon CN, Mannervik B, et al. (2006) Design and evolution of new catalytic activity with an existing protein scaffold. Science 311: 535–538
  • 99. Klemba M, Regan L (1995) Characterization of metal binding by a designed protein: single ligand substitutions at a tetrahedral Cys2His2 site. Biochemistry 34: 10094–10100
  • 100. Handel T, DeGrado WF (1990) De novo design of a Zn2+-binding protein. J Am Chem Soc 112: 6710–6711
  • 101. Vita C, Roumestand C, Toma F, Menez A (1995) Scorpion toxins as natural scaffolds for protein engineering. Proc Natl Acad Sci U S A 92: 6404–6408
  • 102. Muller HN, Skerra A (1994) Grafting of a high-affinity Zn(II)-binding site on the beta-barrel of retinol-binding protein results in enhanced folding stability and enables simplified purification. Biochemistry 33: 14126–14135
  • 103. Hay M, Richards JH, Lu Y (1996) Construction and characterization of an azurin analog for the purple copper site in cytochrome c oxidase. Proc Natl Acad Sci U S A 93: 461–464
  • 104. Robinson H, Ang MC, Gao YG, Hay MT, Lu Y, et al. (1999) Structural basis of electron transfer modulation in the purple CuA center. Biochemistry 38: 5677–5683
  • 105. Dennison C, Vijgenboom E, de Vries S, van der Oost J, Canters GW (1995) Introduction of a CuA site into the blue copper protein amicyanin from Thiobacillus versutus. FEBS Lett 365: 92–94
  • 106. Jones LH, Liu A, Davidson VL (2003) An engineered CuA Amicyanin capable of intermolecular electron transfer reactions. J Biol Chem 278: 47269–47274
  • 107. Yeung N, Lin YW, Gao YG, Zhao X, Russell BS, et al. (2009) Rational design of a structural and functional nitric oxide reductase. Nature 462: 1079–1082
  • 108. Wasser IM, de Vries S, Moenne-Loccoz P, Schroder I, Karlin KD (2002) Nitric oxide in biological denitrification: Fe/Cu metalloenzyme and metal complex NO(x) redox chemistry. Chem Rev 102: 1201–1234
  • 109. Nanda V, Rosenblatt MM, Osyczka A, Kono H, Getahun Z, et al. (2005) De novo design of a redox-active minimal rubredoxin mimic. J Am Chem Soc 127: 5804–5805
  • 110. Summa CM, Lombardi A, Lewis M, DeGrado WF (1999) Tertiary templates for the design of diiron proteins. Curr Opin Struct Biol 9: 500–508
  • 111. Grzyb J, Xu F, Weiner L, Reijerse EJ, Lubitz W, et al. (2010) De novo design of a non-natural fold for an iron-sulfur protein: alpha-helical coiled-coil with a four-iron four-sulfur cluster binding site in its central core. Biochim Biophys Acta 1797: 406–413
  • 112. Der BS, Machius M, Miley MJ, Mills JL, Szyperski T, et al. (2012) Metal-mediated affinity and orientation specificity in a computationally designed protein homodimer. J Am Chem Soc 134: 375–385
  • 113. Xu Y, Yamamoto N, Janda KD (2004) Catalytic antibodies: hapten design strategies and screening methods. Bioorg Med Chem 12: 5247–5268
  • 114. Hilvert D (2000) Critical analysis of antibody catalysis. Annu Rev Biochem 69: 751–793
  • 115. Stewart JD, Benkovic SJ (1995) Transition-state stabilization as a measure of the efficiency of antibody catalysis. Nature 375: 388–391
  • 116. Wagner J, Lerner RA, Barbas CF 3rd (1995) Efficient aldolase catalytic antibodies that use the enamine mechanism of natural enzymes. Science 270: 1797–1800
  • 117. Honegger A (2008) Engineering antibodies for stability and efficient folding. Handb Exp Pharmacol: 47–68
  • 118. Dahiyat BI, Gordon DB, Mayo SL (1997) Automated design of the surface positions of protein helices. Protein Sci 6: 1333–1337
  • 119. Dahiyat BI, Mayo SL (1997) De novo protein design: fully automated sequence selection. Science 278: 82–87
  • 120. Filikov AV, Hayes RJ, Luo P, Stark DM, Chan C, et al. (2002) Computational stabilization of human growth hormone. Protein Sci 11: 1452–1461
  • 121. Luo P, Hayes RJ, Chan C, Stark DM, Hwang MY, et al. (2002) Development of a cytokine analog with enhanced stability using computational ultrahigh throughput screening. Protein Sci 11: 1218–1226
  • 122. Humphris EL, Kortemme T (2008) Prediction of protein-protein interface sequence diversity using flexible backbone computational protein design. Structure 16: 1777–1788
  • 123. Chica RA, Moore MM, Allen BD, Mayo SL (2010) Generation of longer emission wavelength red fluorescent proteins using computationally designed libraries. Proc Natl Acad Sci U S A 107: 20257–20262
  • 124. Nannemann DP, Kaufmann KW, Meiler J, Bachmann BO (2010) Design and directed evolution of a dideoxy purine nucleoside phosphorylase. Protein Eng Des Sel 23: 607–616

Articles from Computational and Structural Biotechnology Journal are provided here courtesy of Research Network of Computational and Structural Biotechnology
