Protein Science : A Publication of the Protein Society. 2000 Jun;9(6):1106–1119. doi: 10.1110/ps.9.6.1106

Prediction of amino acid sequence from structure.

K Raha, A M Wollacott, M J Italia, J R Desjarlais
PMCID: PMC2144664  PMID: 10892804

Abstract

We have developed a method for the prediction of an amino acid sequence that is compatible with a three-dimensional backbone structure. Using only a backbone structure of a protein as input, the algorithm is capable of designing sequences that closely resemble natural members of the protein family to which the template structure belongs. In general, the predicted sequences are shown to have multiple sequence profile scores that are dramatically higher than those of random sequences, and sometimes better than some of the natural sequences that make up the superfamily. As anticipated, highly conserved but poorly predicted residues are often those that contribute to the functional rather than structural properties of the protein. Overall, our analysis suggests that statistical profile scores of designed sequences are a novel and valuable figure of merit for assessing and improving protein design algorithms.
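To make the idea of a sequence profile score concrete, the following is a minimal sketch and not the authors' implementation. It builds a position-specific log-odds profile from a small alignment and scores candidate sequences against it. The toy alignment, the uniform background frequencies, the pseudocount of 1, and the function names `build_profile` and `profile_score` are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment, pseudocount=1.0):
    """Build per-column log-odds scores from an ungapped alignment.

    Assumes all sequences have equal length and a uniform background
    frequency (1/20) for every amino acid; both are simplifications.
    """
    length = len(alignment[0])
    background = 1.0 / len(AMINO_ACIDS)
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment)
        # Pseudocounts keep unseen residues from scoring minus infinity.
        column = {
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        profile.append(column)
    return profile


def profile_score(profile, sequence):
    """Sum the log-odds score of each residue at its aligned position."""
    return sum(column[aa] for column, aa in zip(profile, sequence))


# Hypothetical four-residue "family" used only to exercise the functions.
toy_family = ["ACDE", "ACDF", "ACEE"]
profile = build_profile(toy_family)

family_like = profile_score(profile, "ACDE")  # resembles the family
unrelated = profile_score(profile, "WWWW")    # shares no residues with it
print(family_like > unrelated)
```

Under these assumptions, a sequence that resembles the family members scores above zero and an unrelated sequence scores below it, which is the sense in which a profile score can serve as a figure of merit for a designed sequence.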

