Proceedings of the National Academy of Sciences of the United States of America. 1995 Sep 12;92(19):8700–8704. doi: 10.1073/pnas.92.19.8700

Prediction of protein folding class using global description of amino acid sequence.

I Dubchak, I Muchnik, S R Holbrook, S H Kim
PMCID: PMC41034  PMID: 7568000

Abstract

We present a method for predicting protein folding class based on a global description of the protein chain and a voting process. The best descriptors were selected by a computer-simulated neural network trained on a database consisting of 83 folding classes. Protein-chain descriptors include the overall composition, transition, and distribution of amino acid attributes, such as relative hydrophobicity, predicted secondary structure, and predicted solvent exposure. Cross-validation testing was performed on 15 of the largest classes. The test shows that proteins were assigned to the correct class (correct positive prediction) with an average accuracy of 71.7%, whereas the inverse prediction of proteins as not belonging to a particular class (correct negative prediction) was 90–95% accurate. When tested on the 254 structures used in this study, the top two predictions contained the correct class in 91% of the cases.
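The composition, transition, and distribution descriptors named in the abstract can be illustrated for a single attribute, relative hydrophobicity. The sketch below is a minimal illustration, not the authors' implementation: the three-group residue partition, the percentage scaling, the choice of distribution checkpoints (first, 25%, 50%, 75%, and last residue of each group), and all function names are assumptions, since the abstract does not define them.

```python
import math
from itertools import combinations

# Assumed three-group hydrophobicity partition (illustrative; not stated in the abstract).
GROUPS = {
    "P": "RKEDQN",   # polar
    "N": "GASTPHY",  # neutral
    "H": "CLVIMFW",  # hydrophobic
}
RESIDUE_TO_GROUP = {aa: g for g, aas in GROUPS.items() for aa in aas}


def encode(sequence):
    """Map each residue to its group label, skipping unrecognized letters."""
    return [RESIDUE_TO_GROUP[aa] for aa in sequence.upper() if aa in RESIDUE_TO_GROUP]


def composition(labels):
    """Percentage of residues that fall in each group."""
    n = len(labels)
    return {g: 100.0 * labels.count(g) / n for g in GROUPS}


def transition(labels):
    """Percentage of adjacent pairs that switch between two given groups (either order)."""
    pairs = list(zip(labels, labels[1:]))
    result = {}
    for a, b in combinations(GROUPS, 2):
        count = sum(1 for x, y in pairs if {x, y} == {a, b})
        result[a + b] = 100.0 * count / len(pairs)
    return result


def distribution(labels):
    """Chain position (as % of length) of the first, 25%, 50%, 75%, and last residue of each group."""
    n = len(labels)
    result = {}
    for g in GROUPS:
        positions = [i + 1 for i, x in enumerate(labels) if x == g]
        if not positions:
            result[g] = [0.0] * 5
            continue
        marks = []
        for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
            k = max(1, math.ceil(frac * len(positions)))  # k-th occurrence of the group
            marks.append(100.0 * positions[k - 1] / n)
        result[g] = marks
    return result


def ctd_descriptors(sequence):
    """Flatten the three descriptor families into one 21-element feature vector."""
    labels = encode(sequence)
    if len(labels) < 2:
        raise ValueError("need at least two recognized residues")
    comp, trans, dist = composition(labels), transition(labels), distribution(labels)
    vector = list(comp.values()) + list(trans.values())
    for g in GROUPS:
        vector.extend(dist[g])
    return vector
```

Under these assumptions one attribute yields 3 composition, 3 transition, and 15 distribution values. The same functions could be applied to predicted secondary structure or predicted solvent exposure by swapping in a different three-state labeling of the chain.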


