Abstract
We describe a neural network system that predicts the locations of transmembrane helices in integral membrane proteins. By using evolutionary information as input, the method significantly improved on a previously published neural network prediction method that had been based on single-sequence information. The input data were derived from multiple alignments for each position in a window of 13 adjacent residues: amino acid frequency, conservation weights, number of insertions and deletions, and position of the window with respect to the ends of the protein chain. Additional inputs were the amino acid composition and length of the whole protein. A rigorous cross-validation test on 69 proteins with experimentally determined locations of transmembrane segments yielded an overall two-state per-residue accuracy of 95%. About 94% of all segments were predicted correctly. When applied to known globular proteins as a negative control, the network system incorrectly predicted fewer than 5% of globular proteins as having transmembrane helices. The method was applied to all 269 open reading frames from the complete yeast VIII chromosome. For 59 of these, at least two transmembrane helices were predicted. Thus, about one-fourth of all proteins from yeast VIII are predicted to contain one transmembrane helix, and some 20% are predicted to contain more than one.
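The window-based input and the per-residue score described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the zero padding at the chain ends, and the toy alignment are assumptions made for illustration. Only the amino acid frequency part of the input is shown. Conservation weights, insertion and deletion counts, window position, and the global composition and length inputs are omitted.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
WINDOW = 13  # window width stated in the abstract


def column_frequencies(alignment):
    """Return per-column amino acid frequencies for equal-length aligned sequences."""
    profile = []
    for col in range(len(alignment[0])):
        # Count only the 20 standard residues; gaps and other symbols are skipped.
        counts = Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)
        total = sum(counts.values())
        profile.append([counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS])
    return profile


def window_features(profile, center, window=WINDOW):
    """Concatenate the frequency vectors of the window centred on `center`.

    Positions beyond the chain ends are filled with zeros (an assumption;
    the abstract does not say how chain ends are encoded).
    """
    half = window // 2
    features = []
    for pos in range(center - half, center + half + 1):
        if 0 <= pos < len(profile):
            features.extend(profile[pos])
        else:
            features.extend([0.0] * len(AMINO_ACIDS))
    return features


def two_state_accuracy(observed, predicted):
    """Fraction of residues whose two-state label (e.g. helix or not) matches."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return correct / len(observed)


# Toy alignment of three hypothetical homologous sequences.
toy_alignment = ["MKLLV", "MKILV", "MRLLV"]
toy_profile = column_frequencies(toy_alignment)
toy_features = window_features(toy_profile, center=0)
print(len(toy_features))  # 13 positions x 20 amino acids = 260 values
```

In this sketch each residue yields 260 frequency values, which would be fed to a network together with the other inputs listed in the abstract. `two_state_accuracy` corresponds to the per-residue measure: the share of residues assigned to the correct one of two states.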