Protein Science: A Publication of the Protein Society. 1996 Nov;5(11):2298–2310. doi: 10.1002/pro.5560051116

Identification and application of the concepts important for accurate and reliable protein secondary structure prediction.

R D King 1, M J Sternberg 1
PMCID: PMC2143286  PMID: 8931148

Abstract

A method for predicting protein secondary structure from multiply aligned homologous sequences is presented, with an overall per-residue three-state accuracy of 70.1%. There are two aims: to obtain high accuracy by identifying a set of concepts important for prediction and then applying linear statistics; and to provide insight into the folding process. The important concepts in secondary structure prediction are identified as: residue conformational propensities, sequence edge effects, moments of hydrophobicity, position of insertions and deletions in aligned homologous sequences, moments of conservation, auto-correlation, residue ratios, secondary structure feedback effects, and filtering. Explicit use of edge effects, moments of conservation, and auto-correlation is new to this paper. The relative importance of the concepts used in prediction was analyzed by stepwise addition of information and examination of weights in the discrimination function. The simple and explicit structure of the prediction allows the method to be reimplemented easily. The accuracy of a prediction is predictable a priori. This permits evaluation of the utility of the prediction: 10% of the chains predicted were identified correctly as having a mean accuracy of > 80%. Existing high-accuracy prediction methods are "black-box" predictors based on complex nonlinear statistics (e.g., neural networks in PHD: Rost & Sander, 1993a). For medium- to short-length chains (≥ 90 and < 170 residues), the prediction method is significantly more accurate (P < 0.01) than the PHD algorithm (probably the most commonly used algorithm). In combination with PHD, an algorithm is formed that is significantly more accurate than either method, with an estimated overall three-state accuracy of 72.4%, the highest accuracy reported for any prediction method.
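Two quantities named in the abstract can be illustrated with a short sketch: the per-residue three-state accuracy (often written Q3) and a moment of hydrophobicity over a sequence window. The sketch below is an assumption-laden illustration, not the authors' implementation: the function names are hypothetical, the moment is computed in the commonly used Eisenberg style, the Kyte-Doolittle hydropathy scale is an assumed choice (the abstract does not name a scale), and the per-residue angles of 100° (helix-like) and 180° (strand-like) are conventional values rather than ones stated in the abstract.

```python
import math

# Kyte-Doolittle hydropathy values. ASSUMPTION: the abstract does not say
# which hydrophobicity scale the authors used.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state (e.g. H, E or C)
    matches the observed state. Hypothetical helper name."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    if not predicted:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(predicted)


def hydrophobic_moment(window: str, angle_deg: float) -> float:
    """Mean hydrophobic moment of a sequence window.

    Each residue contributes a vector of length h_i (its hydropathy)
    at angle i * angle_deg; the moment is the length of the vector sum
    divided by the window length. ASSUMPTION: Eisenberg-style
    definition, not necessarily the one used in the paper.
    """
    if not window:
        raise ValueError("window must be non-empty")
    delta = math.radians(angle_deg)
    sin_sum = sum(KYTE_DOOLITTLE[r] * math.sin(i * delta)
                  for i, r in enumerate(window))
    cos_sum = sum(KYTE_DOOLITTLE[r] * math.cos(i * delta)
                  for i, r in enumerate(window))
    return math.hypot(sin_sum, cos_sum) / len(window)


if __name__ == "__main__":
    # One mismatch in eight residues.
    print(q3_accuracy("HHHEECCC", "HHHEEECC"))
    # An alternating hydrophobic/polar pattern has a larger moment at a
    # strand-like periodicity (180 degrees) than at a helix-like one (100).
    print(hydrophobic_moment("VKVKVKVK", 180.0))
    print(hydrophobic_moment("VKVKVKVK", 100.0))
```

A linear method of the kind the abstract describes could take values like these, computed per window, as inputs to a discrimination function; how the authors actually weight and combine them is given in the full text, not in this sketch.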

Selected References

These references are in PubMed. This may not be the complete list of references from this article.

  1. Benner S. A., Cohen M. A., Gerloff D. Correct structure prediction? Nature. 1992 Oct 29;359(6398):781. doi: 10.1038/359781a0.
  2. Benner S. A., Gerloff D. L. Predicting the conformation of proteins. Man versus machine. FEBS Lett. 1993 Jun 28;325(1-2):29–33. doi: 10.1016/0014-5793(93)81408-r.
  3. Benner S. A., Gerloff D. Patterns of divergence in homologous proteins as indicators of secondary and tertiary structure: a prediction of the structure of the catalytic domain of protein kinases. Adv Enzyme Regul. 1991;31:121–181. doi: 10.1016/0065-2571(91)90012-b.
  4. Biou V., Gibrat J. F., Levin J. M., Robson B., Garnier J. Secondary structure prediction: combination of three different methods. Protein Eng. 1988 Sep;2(3):185–191. doi: 10.1093/protein/2.3.185.
  5. Bryson J. W., Betz S. F., Lu H. S., Suich D. J., Zhou H. X., O'Neil K. T., DeGrado W. F. Protein design: a hierarchic approach. Science. 1995 Nov 10;270(5238):935–941. doi: 10.1126/science.270.5238.935.
  6. Chou P. Y., Fasman G. D. Prediction of protein conformation. Biochemistry. 1974 Jan 15;13(2):222–245. doi: 10.1021/bi00699a002.
  7. Colloc'h N., Etchebest C., Thoreau E., Henrissat B., Mornon J. P. Comparison of three algorithms for the assignment of secondary structure in proteins: the advantages of a consensus assignment. Protein Eng. 1993 Jun;6(4):377–382. doi: 10.1093/protein/6.4.377.
  8. Garnier J., Osguthorpe D. J., Robson B. Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins. J Mol Biol. 1978 Mar 25;120(1):97–120. doi: 10.1016/0022-2836(78)90297-8.
  9. Geourjon C., Deléage G. SOPM: a self-optimized method for protein secondary structure prediction. Protein Eng. 1994 Feb;7(2):157–164. doi: 10.1093/protein/7.2.157.
  10. Horovitz A., Matthews J. M., Fersht A. R. Alpha-helix stability in proteins. II. Factors that influence stability at an internal position. J Mol Biol. 1992 Sep 20;227(2):560–568. doi: 10.1016/0022-2836(92)90907-2.
  11. Jenny T. F., Benner S. A. Evaluating predictions of secondary structure in proteins. Biochem Biophys Res Commun. 1994 Apr 15;200(1):149–155. doi: 10.1006/bbrc.1994.1427.
  12. Kneller D. G., Cohen F. E., Langridge R. Improvements in protein secondary structure prediction by an enhanced neural network. J Mol Biol. 1990 Jul 5;214(1):171–182. doi: 10.1016/0022-2836(90)90154-E.
  13. Lim V. I. Algorithms for prediction of alpha-helical and beta-structural regions in globular proteins. J Mol Biol. 1974 Oct 5;88(4):873–894. doi: 10.1016/0022-2836(74)90405-7.
  14. Mehta P. K., Heringa J., Argos P. A simple and fast approach to prediction of protein secondary structure from multiply aligned sequences with accuracy above 70%. Protein Sci. 1995 Dec;4(12):2517–2525. doi: 10.1002/pro.5560041208.
  15. Muggleton S., King R. D., Sternberg M. J. Protein secondary structure prediction using logic-based machine learning. Protein Eng. 1992 Oct;5(7):647–657. doi: 10.1093/protein/5.7.647.
  16. Padmanabhan S., Marqusee S., Ridgeway T., Laue T. M., Baldwin R. L. Relative helix-forming tendencies of nonpolar amino acids. Nature. 1990 Mar 15;344(6263):268–270. doi: 10.1038/344268a0.
  17. Qian N., Sejnowski T. J. Predicting the secondary structure of globular proteins using neural network models. J Mol Biol. 1988 Aug 20;202(4):865–884. doi: 10.1016/0022-2836(88)90564-5.
  18. Richardson J. S., Richardson D. C. Amino acid preferences for specific locations at the ends of alpha helices. Science. 1988 Jun 17;240(4859):1648–1652. doi: 10.1126/science.3381086.
  19. Robson B., Suzuki E. Conformational properties of amino acid residues in globular proteins. J Mol Biol. 1976 Nov 5;107(3):327–356. doi: 10.1016/s0022-2836(76)80008-3.
  20. Rost B., Sander C. Prediction of protein secondary structure at better than 70% accuracy. J Mol Biol. 1993 Jul 20;232(2):584–599. doi: 10.1006/jmbi.1993.1413.
  21. Rost B., Sander C., Schneider R. Redefining the goals of protein secondary structure prediction. J Mol Biol. 1994 Jan 7;235(1):13–26. doi: 10.1016/s0022-2836(05)80007-5.
  22. Russell R. B., Barton G. J. The limits of protein secondary structure prediction accuracy from multiple sequence alignment. J Mol Biol. 1993 Dec 20;234(4):951–957. doi: 10.1006/jmbi.1993.1649.
  23. Solovyev V. V., Salamov A. A. Predicting alpha-helix and beta-strand segments of globular proteins. Comput Appl Biosci. 1994 Dec;10(6):661–669. doi: 10.1093/bioinformatics/10.6.661.
  24. Wako H., Blundell T. L. Use of amino acid environment-dependent substitution tables and conformational propensities in structure prediction from aligned sequences of homologous proteins. II. Secondary structures. J Mol Biol. 1994 May 20;238(5):693–708. doi: 10.1006/jmbi.1994.1330.
  25. White S. H. Amino acid preferences of small proteins. Implications for protein stability and evolution. J Mol Biol. 1992 Oct 20;227(4):991–995. doi: 10.1016/0022-2836(92)90515-l.
  26. Williams R. W., Chang A., Juretić D., Loughran S. Secondary structure predictions and medium range interactions. Biochim Biophys Acta. 1987 Nov 26;916(2):200–204. doi: 10.1016/0167-4838(87)90109-9.
  27. Yi T. M., Lander E. S. Protein secondary structure prediction using nearest-neighbor methods. J Mol Biol. 1993 Aug 20;232(4):1117–1129. doi: 10.1006/jmbi.1993.1464.
  28. Zhang X., Mesirov J. P., Waltz D. L. Hybrid system for protein secondary structure prediction. J Mol Biol. 1992 Jun 20;225(4):1049–1063. doi: 10.1016/0022-2836(92)90104-r.
  29. Zvelebil M. J., Barton G. J., Taylor W. R., Sternberg M. J. Prediction of protein secondary structure and active sites using the alignment of homologous sequences. J Mol Biol. 1987 Jun 20;195(4):957–961. doi: 10.1016/0022-2836(87)90501-8.

Articles from Protein Science : A Publication of the Protein Society are provided here courtesy of The Protein Society
