Protein Science : A Publication of the Protein Society. 1994 Oct;3(10):1871–1882. doi: 10.1002/pro.5560031026

A quantitative methodology for the de novo design of proteins.

S E Brenner, A Berry
PMCID: PMC2142604  PMID: 7849602

Abstract

We have developed a general quantitative methodology for designing proteins de novo, which automatically produces sequences for any given plausible protein structure. The method incorporates statistical information, a theoretical description of protein structure, and motifs described in the literature. A model system embodying a portion of the quantitative methodology has been used to design many protein sequences for the phage 434 Cro and fibronectin type III domain folds, as well as several other structures. Residue sequences selected by this prototype share no significant identity with any natural protein. Nonetheless, 3-dimensional models of the designed sequences appear generally plausible. When examined using secondary structure prediction methods and profile analysis, the designed sequences generally score considerably better than the natural ones. The designed sequences are also in reasonable agreement with a sequence template. This quantitative methodology is likely to be capable of successfully designing new proteins and yielding fundamental insights about the determinants of protein structure.
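As a small illustration of the identity comparison mentioned in the abstract, the sketch below computes raw percent identity between two pre-aligned sequences. It is not the authors' procedure: the function name `percent_identity`, the gap convention (`-`), and the choice to skip gapped columns are all assumptions made for this example, and raw identity is not a test of statistical significance of the kind a database search would report.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of compared alignment columns holding identical residues.

    Assumption: both inputs are already aligned, equal-length strings
    in which '-' marks a gap. Columns with a gap in either sequence
    are skipped rather than counted as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1

    # With no comparable columns, report zero identity.
    return 100.0 * matches / compared if compared else 0.0
```

A designed sequence would be aligned to each natural sequence of interest before calling this function; other conventions (for example, dividing by the full alignment length) give different values, so the denominator should be stated whenever identities are reported.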
