Protein Science. 1996 May;5(5):947–955. doi: 10.1002/pro.5560050516

Protein fold recognition using sequence-derived predictions.

D Fischer, D Eisenberg
PMCID: PMC2143416  PMID: 8732766

Abstract

In protein fold recognition, one assigns a probe amino acid sequence of unknown structure to one of a library of target 3D structures. Correct assignment depends on effective scoring of the probe sequence for its compatibility with each of the target structures. Here we show that, in addition to the amino acid sequence of the probe, sequence-derived properties of the probe sequence (such as the predicted secondary structure) are useful in fold assignment. The additional measure of compatibility between probe and target is the level of agreement between the predicted secondary structure of the probe and the known secondary structure of the target fold. That is, we recommend a sequence-structure compatibility function that combines previously developed compatibility functions (such as the 3D-1D scores of Bowie et al. [1991] or sequence-sequence replacement tables) with the predicted secondary structure of the probe sequence. The effect on fold assignment of adding predicted secondary structure is evaluated here by using a benchmark set of proteins (Fischer et al., 1996a). The 3D structures of the probe sequences of the benchmark are actually known, but are ignored by our method. The results show that the inclusion of the predicted secondary structure improves fold assignment by about 25%. The results also show that, if the true secondary structure of the probe were known, correct fold assignment would increase by an additional 8-32%. We conclude that incorporating sequence-derived predictions significantly improves assignment of sequences to known 3D folds. Finally, we apply the new method to assign folds to sequences in the SWISSPROT database; six fold assignments are given that are not detectable by standard sequence-sequence comparison methods; for two of these, the fold is known from X-ray crystallography and the fold assignment is correct.
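The combined compatibility function described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no scoring values, weights, gap penalties, or alignment details, so the toy substitution function, the `ss_weight` parameter, the match/mismatch bonuses, and the linear gap penalty below are all assumptions chosen for illustration. The sketch adds a secondary-structure agreement term to a residue-level score and uses the sum inside a plain global (Needleman-Wunsch style) alignment.

```python
def toy_subst(a, b):
    """Hypothetical residue score: +2 for identity, -1 otherwise.
    A real application would use a replacement table or 3D-1D scores."""
    return 2.0 if a == b else -1.0


def ss_agreement(pred_ss, known_ss, match=1.0, mismatch=-1.0):
    """Score agreement between the probe's predicted secondary structure state
    and the target's known state (e.g. 'H', 'E', 'C'). Values are assumed."""
    return match if pred_ss == known_ss else mismatch


def combined_score(probe_res, target_res, probe_ss, target_ss,
                   subst=toy_subst, ss_weight=1.0):
    """Residue-level compatibility plus weighted secondary-structure agreement."""
    return subst(probe_res, target_res) + ss_weight * ss_agreement(probe_ss, target_ss)


def global_align_score(probe_seq, probe_ss, target_seq, target_ss,
                       subst=toy_subst, ss_weight=1.0, gap=-2.0):
    """Best global alignment score under the combined function.
    Uses a linear gap penalty (an assumption) and two rolling DP rows."""
    n, m = len(probe_seq), len(target_seq)
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            s = combined_score(probe_seq[i - 1], target_seq[j - 1],
                               probe_ss[i - 1], target_ss[j - 1],
                               subst, ss_weight)
            cur[j] = max(prev[j - 1] + s,   # align the two positions
                         prev[j] + gap,     # gap in the target
                         cur[j - 1] + gap)  # gap in the probe
        prev = cur
    return prev[m]


def best_fold(probe_seq, probe_ss, library, **kwargs):
    """Return the name of the library entry with the highest score.
    `library` maps a fold name to a (sequence, known_ss) pair."""
    return max(library, key=lambda name: global_align_score(
        probe_seq, probe_ss, library[name][0], library[name][1], **kwargs))
```

Setting `ss_weight=0.0` reduces the score to the residue-level term alone, which gives a simple way to compare assignments with and without the predicted secondary structure on the same probe and library.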

Selected References

These references are in PubMed. This may not be the complete list of references from this article.

  1. Bairoch A., Boeckmann B. The SWISS-PROT protein sequence data bank. Nucleic Acids Res. 1992 May 11;20 (Suppl):2019–2022. doi: 10.1093/nar/20.suppl.2019.
  2. Bernstein F. C., Koetzle T. F., Williams G. J., Meyer E. F., Jr, Brice M. D., Rodgers J. R., Kennard O., Shimanouchi T., Tasumi M. The Protein Data Bank: a computer-based archival file for macromolecular structures. J Mol Biol. 1977 May 25;112(3):535–542. doi: 10.1016/s0022-2836(77)80200-3.
  3. Bryant S. H., Lawrence C. E. An empirical energy function for threading protein sequence through the folding motif. Proteins. 1993 May;16(1):92–112. doi: 10.1002/prot.340160110.
  4. Fischer D., Rice D., Bowie J. U., Eisenberg D. Assigning amino acid sequences to 3-dimensional protein folds. FASEB J. 1996 Jan;10(1):126–136. doi: 10.1096/fasebj.10.1.8566533.
  5. Fischer D., Tsai C. J., Nussinov R., Wolfson H. A 3D sequence-independent representation of the protein data bank. Protein Eng. 1995 Oct;8(10):981–997. doi: 10.1093/protein/8.10.981.
  6. Godzik A., Kolinski A., Skolnick J. Topology fingerprint approach to the inverse protein folding problem. J Mol Biol. 1992 Sep 5;227(1):227–238. doi: 10.1016/0022-2836(92)90693-e.
  7. Jones D., Thornton J. Protein fold recognition. J Comput Aided Mol Des. 1993 Aug;7(4):439–456. doi: 10.1007/BF02337560.
  8. Lemer C. M., Rooman M. J., Wodak S. J. Protein structure prediction by threading methods: evaluation of current techniques. Proteins. 1995 Nov;23(3):337–355. doi: 10.1002/prot.340230308.
  9. Matsuo Y., Nishikawa K. Protein structural similarities predicted by a sequence-structure compatibility method. Protein Sci. 1994 Nov;3(11):2055–2063. doi: 10.1002/pro.5560031118.
  10. Needleman S. B., Wunsch C. D. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol. 1970 Mar;48(3):443–453. doi: 10.1016/0022-2836(70)90057-4.
  11. Ouzounis C., Sander C., Scharf M., Schneider R. Prediction of protein structure by evaluation of sequence-structure fitness. Aligning sequences to contact profiles derived from three-dimensional structures. J Mol Biol. 1993 Aug 5;232(3):805–825. doi: 10.1006/jmbi.1993.1433.
  12. Pearl L., O'Hara B., Drew R., Wilson S. Crystal structure of AmiC: the controller of transcription antitermination in the amidase operon of Pseudomonas aeruginosa. EMBO J. 1994 Dec 15;13(24):5810–5817. doi: 10.1002/j.1460-2075.1994.tb06924.x.
  13. Pearson W. R., Lipman D. J. Improved tools for biological sequence comparison. Proc Natl Acad Sci U S A. 1988 Apr;85(8):2444–2448. doi: 10.1073/pnas.85.8.2444.
  14. Rost B., Sander C. Prediction of protein secondary structure at better than 70% accuracy. J Mol Biol. 1993 Jul 20;232(2):584–599. doi: 10.1006/jmbi.1993.1413.
  15. Sippl M. J. Knowledge-based potentials for proteins. Curr Opin Struct Biol. 1995 Apr;5(2):229–235. doi: 10.1016/0959-440x(95)80081-6.
  16. Sippl M. J., Weitckus S. Detection of native-like models for amino acid sequences of unknown three-dimensional structure in a data base of known protein conformations. Proteins. 1992 Jul;13(3):258–271. doi: 10.1002/prot.340130308.
  17. Smith T. F., Waterman M. S. Identification of common molecular subsequences. J Mol Biol. 1981 Mar 25;147(1):195–197. doi: 10.1016/0022-2836(81)90087-5.
  18. Wilmanns M., Eisenberg D. Three-dimensional profiles from residue-pair preferences: identification of sequences with beta/alpha-barrel fold. Proc Natl Acad Sci U S A. 1993 Feb 15;90(4):1379–1383. doi: 10.1073/pnas.90.4.1379.
  19. Yi T. M., Lander E. S. Recognition of related proteins by iterative template refinement (ITR). Protein Sci. 1994 Aug;3(8):1315–1328. doi: 10.1002/pro.5560030818.

Articles from Protein Science : A Publication of the Protein Society are provided here courtesy of The Protein Society
