Protein Science : A Publication of the Protein Society. 1999 Apr;8(4):750–759. doi: 10.1110/ps.8.4.750

Factors limiting the performance of prediction-based fold recognition methods.

X de la Cruz 1, J M Thornton 1
PMCID: PMC2144320  PMID: 10211821

Abstract

In the past few years, a new generation of fold recognition methods has been developed, in which the classical sequence information is combined with information obtained from secondary structure and, sometimes, accessibility predictions. The results are promising, indicating that this approach may compete with potential-based methods (Rost B et al., 1997, J Mol Biol 270:471-480). Here we present a systematic study of the different factors contributing to the performance of these methods, in particular when applied to the problem of fold recognition of remote homologues. Our results indicate that secondary structure and accessibility prediction methods have reached an accuracy level where they are not the major factor limiting the accuracy of fold recognition. The pattern degeneracy problem is confirmed as the major source of error of these methods. On the basis of these results, we study three different options to overcome these limitations: normalization schemes, mapping of the coil state into the different zones of the Ramachandran plot, and post-threading graphical analysis.
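To make the idea of combining sequence information with predicted secondary structure concrete, the following is a minimal sketch rather than the authors' method. It scores a global (Needleman-Wunsch style) alignment in which each aligned pair contributes a sequence term plus a weighted secondary-structure term. The identity-based scoring, the linear gap penalty, the three-state strings (H, E, C), and the names `combined_score`, `align_score`, and `w_ss` are all assumptions made for illustration.

```python
def combined_score(a_res, b_res, a_ss, b_ss, w_ss=1.0):
    """Toy per-position score: sequence term plus weighted secondary-structure term.

    Assumption: simple identity scoring (+1 match, -1 mismatch) stands in for a
    real substitution matrix and a real structure-state matrix.
    """
    seq_term = 1.0 if a_res == b_res else -1.0
    ss_term = 1.0 if a_ss == b_ss else -1.0
    return seq_term + w_ss * ss_term


def align_score(query_seq, query_ss, templ_seq, templ_ss, gap=-2.0, w_ss=1.0):
    """Best global alignment score of a query against one template.

    query_ss holds predicted states for the query; templ_ss holds observed
    states for the template. Both use one character per residue (assumption).
    """
    if len(query_seq) != len(query_ss) or len(templ_seq) != len(templ_ss):
        raise ValueError("sequence and secondary-structure strings must match in length")
    n, m = len(query_seq), len(templ_seq)
    # prev[j] is the best score aligning the first i-1 query residues
    # with the first j template residues.
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            match = prev[j - 1] + combined_score(
                query_seq[i - 1], templ_seq[j - 1],
                query_ss[i - 1], templ_ss[j - 1], w_ss,
            )
            cur[j] = max(match, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev[m]
```

Setting `w_ss=0.0` reduces the score to the sequence term alone, which gives a simple way to see how much the structural term contributes. Because the structural term depends only on the pattern of states, two unrelated proteins with similar secondary-structure patterns receive similar structural scores in this sketch, which is one way to picture the pattern degeneracy problem mentioned in the abstract.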


Selected References

These references are in PubMed. This may not be the complete list of references from this article.

  1. Aurora R., Rose G. D. Seeking an ancient enzyme in Methanococcus jannaschii using ORF, a program based on predicted secondary structure comparisons. Proc Natl Acad Sci U S A. 1998 Mar 17;95(6):2818–2823. doi: 10.1073/pnas.95.6.2818.
  2. Bernstein F. C., Koetzle T. F., Williams G. J., Meyer E. F., Jr, Brice M. D., Rodgers J. R., Kennard O., Shimanouchi T., Tasumi M. The Protein Data Bank: a computer-based archival file for macromolecular structures. J Mol Biol. 1977 May 25;112(3):535–542. doi: 10.1016/s0022-2836(77)80200-3.
  3. Bowie J. U., Lüthy R., Eisenberg D. A method to identify protein sequences that fold into a known three-dimensional structure. Science. 1991 Jul 12;253(5016):164–170. doi: 10.1126/science.1853201.
  4. Bryant S. H., Altschul S. F. Statistics of sequence-structure threading. Curr Opin Struct Biol. 1995 Apr;5(2):236–244. doi: 10.1016/0959-440x(95)80082-4.
  5. Bryant S. H. Evaluation of threading specificity and accuracy. Proteins. 1996 Oct;26(2):172–185. doi: 10.1002/(SICI)1097-0134(199610)26:2<172::AID-PROT7>3.0.CO;2-I.
  6. Bryant S. H., Lawrence C. E. An empirical energy function for threading protein sequence through the folding motif. Proteins. 1993 May;16(1):92–112. doi: 10.1002/prot.340160110.
  7. Feng Z. K., Sippl M. J. Optimum superimposition of protein structures: ambiguities and implications. Fold Des. 1996;1(2):123–132. doi: 10.1016/s1359-0278(96)00021-1.
  8. Fischel-Ghodsian F., Mathiowitz G., Smith T. F. Alignment of protein sequences using secondary structure: a modified dynamic programming method. Protein Eng. 1990 Jul;3(7):577–581. doi: 10.1093/protein/3.7.577.
  9. Fischer D., Eisenberg D. Protein fold recognition using sequence-derived predictions. Protein Sci. 1996 May;5(5):947–955. doi: 10.1002/pro.5560050516.
  10. Godzik A. The structural alignment between two proteins: is there a unique answer? Protein Sci. 1996 Jul;5(7):1325–1338. doi: 10.1002/pro.5560050711.
  11. Jones D. T., Taylor W. R., Thornton J. M. A new approach to protein fold recognition. Nature. 1992 Jul 2;358(6381):86–89. doi: 10.1038/358086a0.
  12. Kabsch W., Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 1983 Dec;22(12):2577–2637. doi: 10.1002/bip.360221211.
  13. Kocher J. P., Rooman M. J., Wodak S. J. Factors influencing the ability of knowledge-based potentials to identify native sequence-structure matches. J Mol Biol. 1994 Feb 4;235(5):1598–1613. doi: 10.1006/jmbi.1994.1109.
  14. Lander E. S. The new genomics: global views of biology. Science. 1996 Oct 25;274(5287):536–539. doi: 10.1126/science.274.5287.536.
  15. Lemer C. M., Rooman M. J., Wodak S. J. Protein structure prediction by threading methods: evaluation of current techniques. Proteins. 1995 Nov;23(3):337–355. doi: 10.1002/prot.340230308.
  16. Marchler-Bauer A., Bryant S. H. A measure of success in fold recognition. Trends Biochem Sci. 1997 Jul;22(7):236–240. doi: 10.1016/s0968-0004(97)01078-5.
  17. Marchler-Bauer A., Levitt M., Bryant S. H. A retrospective analysis of CASP2 threading predictions. Proteins. 1997;Suppl 1:83–91. doi: 10.1002/(sici)1097-0134(1997)1+<83::aid-prot12>3.3.co;2-2.
  18. Martin A. C., MacArthur M. W., Thornton J. M. Assessment of comparative modeling in CASP2. Proteins. 1997;Suppl 1:14–28. doi: 10.1002/(sici)1097-0134(1997)1+<14::aid-prot4>3.3.co;2-f.
  19. Milburn D., Laskowski R. A., Thornton J. M. Sequences annotated by structure: a tool to facilitate the use of structural information in sequence analysis. Protein Eng. 1998 Oct;11(10):855–859. doi: 10.1093/protein/11.10.855.
  20. Murzin A. G., Brenner S. E., Hubbard T., Chothia C. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol. 1995 Apr 7;247(4):536–540. doi: 10.1006/jmbi.1995.0159.
  21. Needleman S. B., Wunsch C. D. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol. 1970 Mar;48(3):443–453. doi: 10.1016/0022-2836(70)90057-4.
  22. Ouzounis C., Sander C., Scharf M., Schneider R. Prediction of protein structure by evaluation of sequence-structure fitness. Aligning sequences to contact profiles derived from three-dimensional structures. J Mol Biol. 1993 Aug 5;232(3):805–825. doi: 10.1006/jmbi.1993.1433.
  23. Rice D. W., Eisenberg D. A 3D-1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence. J Mol Biol. 1997 Apr 11;267(4):1026–1038. doi: 10.1006/jmbi.1997.0924.
  24. Rice D. W., Fischer D., Weiss R., Eisenberg D. Fold assignments for amino acid sequences of the CASP2 experiment. Proteins. 1997;Suppl 1:113–122. doi: 10.1002/(sici)1097-0134(1997)1+<113::aid-prot15>3.3.co;2-3.
  25. Rost B., Sander C. Prediction of protein secondary structure at better than 70% accuracy. J Mol Biol. 1993 Jul 20;232(2):584–599. doi: 10.1006/jmbi.1993.1413.
  26. Rost B., Schneider R., Sander C. Protein fold recognition by prediction-based threading. J Mol Biol. 1997 Jul 18;270(3):471–480. doi: 10.1006/jmbi.1997.1101.
  27. Russell R. B., Barton G. J. Multiple protein sequence alignment from tertiary structure comparison: assignment of global and residue confidence levels. Proteins. 1992 Oct;14(2):309–323. doi: 10.1002/prot.340140216.
  28. Russell R. B., Copley R. R., Barton G. J. Protein fold recognition by mapping predicted secondary structures. J Mol Biol. 1996 Jun 14;259(3):349–365. doi: 10.1006/jmbi.1996.0325.
  29. Russell R. B., Saqi M. A., Bates P. A., Sayle R. A., Sternberg M. J. Recognition of analogous and homologous protein folds--assessment of prediction success and associated alignment accuracy using empirical substitution matrices. Protein Eng. 1998 Jan;11(1):1–9. doi: 10.1093/protein/11.1.1.
  30. Russell R. B., Saqi M. A., Sayle R. A., Bates P. A., Sternberg M. J. Recognition of analogous and homologous protein folds: analysis of sequence and structure conservation. J Mol Biol. 1997 Jun 13;269(3):423–439. doi: 10.1006/jmbi.1997.1019.
  31. Sali A., Potterton L., Yuan F., van Vlijmen H., Karplus M. Evaluation of comparative protein modeling by MODELLER. Proteins. 1995 Nov;23(3):318–326. doi: 10.1002/prot.340230306.
  32. Samudrala R., Pedersen J. T., Zhou H. B., Luo R., Fidelis K., Moult J. Confronting the problem of interconnected structural changes in the comparative modeling of proteins. Proteins. 1995 Nov;23(3):327–336. doi: 10.1002/prot.340230307.
  33. Sippl M. J., Weitckus S. Detection of native-like models for amino acid sequences of unknown three-dimensional structure in a data base of known protein conformations. Proteins. 1992 Jul;13(3):258–271. doi: 10.1002/prot.340130308.
  34. Smith T. F., Waterman M. S. Identification of common molecular subsequences. J Mol Biol. 1981 Mar 25;147(1):195–197. doi: 10.1016/0022-2836(81)90087-5.
  35. Swindells M. B., MacArthur M. W., Thornton J. M. Intrinsic phi, psi propensities of amino acids, derived from the coil regions of known structures. Nat Struct Biol. 1995 Jul;2(7):596–603. doi: 10.1038/nsb0795-596.
  36. Taylor W. R., Orengo C. A. Protein structure alignment. J Mol Biol. 1989 Jul 5;208(1):1–22. doi: 10.1016/0022-2836(89)90084-3.
  37. Westhead D. R., Collura V. P., Eldridge M. D., Firth M. A., Li J., Murray C. W. Protein fold recognition by threading: comparison of algorithms and analysis of results. Protein Eng. 1995 Dec;8(12):1197–1204. doi: 10.1093/protein/8.12.1197.

Articles from Protein Science : A Publication of the Protein Society are provided here courtesy of The Protein Society
