Abstract
Fuzzy cluster analysis has been applied to the 20 amino acids, using 65 physicochemical properties as the basis for classification. The clustering products, the fuzzy sets (i.e., classical sets with associated membership functions), provide a new measure of amino acid similarity for use in protein folding studies. This work demonstrates that fuzzy sets of simple molecular attributes, when assigned to the amino acid residues in a protein's sequence, can predict the secondary structure of that sequence with reasonable accuracy. An approach is presented for discriminating standard folding states, using near-optimum information splitting in half-overlapping segments of the sequence of assigned membership functions. The method is applied to a nonredundant set of 252 proteins and yields approximately 73% matching for correctly predicted and correctly rejected residues, with an overall success rate of approximately 60% for correctly recognized residues in three folding states: alpha-helix, beta-strand, and coil. The most useful attributes for discriminating these states appear to be related to size, polarity, and thermodynamic factors. Van der Waals volume, apparent average thickness of surrounding molecular free volume, and a measure of dimensionless surface electron density can explain approximately 95% of the prediction results. Hydrogen bonding and hydrophobicity indices do not yet enable clear clustering and prediction.
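To make the notion of a fuzzy set with membership functions concrete, the following is a minimal sketch of fuzzy clustering. The abstract does not name the clustering algorithm, so the standard fuzzy c-means update rules are assumed here; they may differ from the procedure actually used. The function names, the fuzzifier `m = 2`, the choice of starting centers, and the six two-dimensional `toy` vectors are all hypothetical placeholders and are not amino acid property data.

```python
import math


def memberships(point, centers, m):
    """Membership degrees of one point in every cluster (they sum to 1)."""
    dists = [math.dist(point, c) for c in centers]
    if 0.0 in dists:
        # The point sits exactly on a center: assign it there in full.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
    total = sum(inv)
    return [v / total for v in inv]


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-9):
    """Standard fuzzy c-means (assumed here, not taken from the article)."""
    n, dim = len(points), len(points[0])
    # Simplifying assumption: start from evenly spaced input points.
    step = (n - 1) / max(n_clusters - 1, 1)
    centers = [list(points[round(k * step)]) for k in range(n_clusters)]
    for _ in range(n_iter):
        u = [memberships(p, centers, m) for p in points]
        new_centers = []
        for k in range(n_clusters):
            # Each center is the mean of all points weighted by u**m.
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            new_centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, [memberships(p, centers, m) for p in points]


# Hypothetical placeholder vectors (NOT real physicochemical properties).
toy = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
       [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centers, u = fuzzy_c_means(toy, n_clusters=2)
labels = [row.index(max(row)) for row in u]
```

Each row of `u` plays the role of a membership function evaluated for one item: rather than a hard label, every item receives a degree of belonging to each cluster. In the setting described above, the items would be the 20 amino acids and each vector would hold the 65 property values.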
