Abstract
Previously, we introduced a neural network system that predicts the locations of transmembrane helices (HTMs) from evolutionary profiles (PHDhtm; Rost B, Casadio R, Fariselli P, Sander C, 1995, Protein Sci 4:521-533). Here, we describe an improvement and an extension of that system. The improvement is achieved by a dynamic programming-like algorithm that optimizes helices compatible with the neural network output. The extension is the prediction of topology (orientation of the first loop region with respect to the membrane) by applying to the refined prediction the observation that positively charged residues are more abundant in intra-cytoplasmic regions. Furthermore, we introduce a method to reduce the number of false positives, i.e., proteins falsely predicted to contain membrane helices. The evaluation of prediction accuracy is based on a cross-validation and a double-blind test set (131 proteins in total). The final method appears to be more accurate than other published methods: (1) For almost 89% (±3%) of the test proteins, all HTMs are predicted correctly. (2) For more than 86% (±3%) of the proteins, topology is predicted correctly. (3) We define reliability indices that correlate with prediction accuracy: for one half of the proteins, segment accuracy rises to 98%; and for two-thirds, accuracy of topology prediction is 95%. (4) The fraction of proteins for which HTMs are predicted falsely is below 2% (±1%). Finally, the method is applied to 1,616 sequences of Haemophilus influenzae. We predict 19% of the genome sequences to contain one or more HTMs. This appears to be lower than what we predicted previously for yeast chromosome VIII (about 25%).
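The topology step summarized above can be illustrated with a minimal sketch. It is not the published PHDhtm procedure: it assumes the commonly cited "positive-inside" form of the charge rule (loops richer in lysine and arginine lie on the cytoplasmic side), works on a single sequence rather than an evolutionary profile, and uses hypothetical names (`loop_regions`, `predict_topology`) and a helix format (0-based, end-exclusive index pairs) chosen only for illustration.

```python
POSITIVE = set("KR")  # assumption: only lysine and arginine count as positive


def loop_regions(seq_len, helices):
    """Return the non-helical segments before, between, and after the helices."""
    loops = []
    prev_end = 0
    for start, end in sorted(helices):
        loops.append((prev_end, start))
        prev_end = end
    loops.append((prev_end, seq_len))
    return loops


def predict_topology(sequence, helices):
    """Guess the side of the first loop from the positive-charge difference.

    Loops alternate sides of the membrane, so even-indexed loops share the
    side of the first loop and odd-indexed loops lie on the opposite side.
    Returns (orientation, charge_difference).
    """
    counts = [0, 0]
    for i, (start, end) in enumerate(loop_regions(len(sequence), helices)):
        counts[i % 2] += sum(1 for aa in sequence[start:end] if aa in POSITIVE)
    diff = counts[0] - counts[1]
    if diff > 0:
        orientation = "in"    # first loop predicted cytoplasmic
    elif diff < 0:
        orientation = "out"   # first loop predicted extra-cytoplasmic
    else:
        orientation = "undetermined"
    return orientation, diff


# Toy example: one 20-residue helix flanked by a positively charged N-terminus.
toy_sequence = "MKKRK" + "L" * 20 + "DDEES"
print(predict_topology(toy_sequence, [(5, 25)]))
```

In a sketch like this, the absolute charge difference could serve as a crude confidence score, in the spirit of the reliability indices mentioned above; how the published indices are actually defined is not stated in the abstract.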
Selected References
These references are in PubMed. This may not be the complete list of references from this article.
- Altschul S. F., Gish W., Miller W., Myers E. W., Lipman D. J. Basic local alignment search tool. J Mol Biol. 1990 Oct 5;215(3):403–410. doi: 10.1016/S0022-2836(05)80360-2.
- Argos P., Rao J. K., Hargrave P. A. Structural prediction of membrane-bound proteins. Eur J Biochem. 1982 Nov 15;128(2-3):565–575. doi: 10.1111/j.1432-1033.1982.tb07002.x.
- Bairoch A., Boeckmann B. The SWISS-PROT protein sequence data bank: current status. Nucleic Acids Res. 1994 Sep;22(17):3578–3580.
- Bernstein F. C., Koetzle T. F., Williams G. J., Meyer E. F., Jr, Brice M. D., Rodgers J. R., Kennard O., Shimanouchi T., Tasumi M. The Protein Data Bank: a computer-based archival file for macromolecular structures. J Mol Biol. 1977 May 25;112(3):535–542. doi: 10.1016/s0022-2836(77)80200-3.
- Bjørbaek C., Foërsom V., Michelsen O. The transmembrane topology of the a [corrected] subunit from the ATPase in Escherichia coli analyzed by PhoA protein fusions. FEBS Lett. 1990 Jan 15;260(1):31–34. doi: 10.1016/0014-5793(90)80058-q.
- Boyd D., Beckwith J. The role of charged amino acids in the localization of secreted and membrane proteins. Cell. 1990 Sep 21;62(6):1031–1033. doi: 10.1016/0092-8674(90)90378-r.
- Casadio R., Fariselli P., Taroni C., Compiani M. A predictor of transmembrane alpha-helix domains of proteins based on neural networks. Eur Biophys J. 1996;24(3):165–178. doi: 10.1007/BF00180274.
- Casari G., Andrade M. A., Bork P., Boyle J., Daruvar A., Ouzounis C., Schneider R., Tamames J., Valencia A., Sander C. Challenging times for bioinformatics. Nature. 1995 Aug 24;376(6542):647–648. doi: 10.1038/376647a0.
- Cornette J. L., Cease K. B., Margalit H., Spouge J. L., Berzofsky J. A., DeLisi C. Hydrophobicity scales and computational techniques for detecting amphipathic structures in proteins. J Mol Biol. 1987 Jun 5;195(3):659–685. doi: 10.1016/0022-2836(87)90189-6.
- Cowan S. W., Rosenbusch J. P. Folding pattern diversity of integral membrane proteins. Science. 1994 May 13;264(5161):914–916. doi: 10.1126/science.8178151.
- Dalbey R. E. Positively charged residues are important determinants of membrane protein topology. Trends Biochem Sci. 1990 Jul;15(7):253–257. doi: 10.1016/0968-0004(90)90047-f.
- Degli Esposti M., Crimi M., Venturoli G. A critical evaluation of the hydropathy profile of membrane proteins. Eur J Biochem. 1990 May 31;190(1):207–219. doi: 10.1111/j.1432-1033.1990.tb15566.x.
- Donnelly D., Overington J. P., Ruffle S. V., Nugent J. H., Blundell T. L. Modeling alpha-helical transmembrane domains: the calculation and use of substitution tables for lipid-facing residues. Protein Sci. 1993 Jan;2(1):55–70. doi: 10.1002/pro.5560020106.
- Edelman J. Quadratic minimization of predictors for protein secondary structure. Application to transmembrane alpha-helices. J Mol Biol. 1993 Jul 5;232(1):165–191. doi: 10.1006/jmbi.1993.1375.
- Eisenberg D., Schwarz E., Komaromy M., Wall R. Analysis of membrane and surface protein sequences with the hydrophobic moment plot. J Mol Biol. 1984 Oct 15;179(1):125–142. doi: 10.1016/0022-2836(84)90309-7.
- Engelman D. M., Steitz T. A., Goldman A. Identifying nonpolar transbilayer helices in amino acid sequences of membrane proteins. Annu Rev Biophys Biophys Chem. 1986;15:321–353. doi: 10.1146/annurev.bb.15.060186.001541.
- Esposti M. D., De Vries S., Crimi M., Ghelli A., Patarnello T., Meyer A. Mitochondrial cytochrome b: evolution and structure of the protein. Biochim Biophys Acta. 1993 Jul 26;1143(3):243–271. doi: 10.1016/0005-2728(93)90197-n.
- Fariselli P., Casadio R. HTP: a neural network-based method for predicting the topology of helical transmembrane domains in proteins. Comput Appl Biosci. 1996 Feb;12(1):41–48. doi: 10.1093/bioinformatics/12.1.41.
- Fleischmann R. D., Adams M. D., White O., Clayton R. A., Kirkness E. F., Kerlavage A. R., Bult C. J., Tomb J. F., Dougherty B. A., Merrick J. M. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science. 1995 Jul 28;269(5223):496–512. doi: 10.1126/science.7542800.
- Henderson R., Baldwin J. M., Ceska T. A., Zemlin F., Beckmann E., Downing K. H. Model for the structure of bacteriorhodopsin based on high-resolution electron cryo-microscopy. J Mol Biol. 1990 Jun 20;213(4):899–929. doi: 10.1016/S0022-2836(05)80271-2.
- Hucho F., Görne-Tschelnokow U., Strecker A. Beta-structure in the membrane-spanning part of the nicotinic acetylcholine receptor (or how helical are transmembrane helices?). Trends Biochem Sci. 1994 Sep;19(9):383–387. doi: 10.1016/0968-0004(94)90116-3.
- Johnston M., Andrews S., Brinkman R., Cooper J., Ding H., Dover J., Du Z., Favello A., Fulton L., Gattung S. Complete nucleotide sequence of Saccharomyces cerevisiae chromosome VIII. Science. 1994 Sep 30;265(5181):2077–2082. doi: 10.1126/science.8091229.
- Jones D. T., Taylor W. R., Thornton J. M. A model recognition approach to the prediction of all-helical membrane protein structure and topology. Biochemistry. 1994 Mar 15;33(10):3038–3049. doi: 10.1021/bi00176a037.
- Kabsch W., Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 1983 Dec;22(12):2577–2637. doi: 10.1002/bip.360221211.
- Karlin S., Altschul S. F. Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proc Natl Acad Sci U S A. 1990 Mar;87(6):2264–2268. doi: 10.1073/pnas.87.6.2264.
- Kreusch A., Schulz G. E. Refined structure of the porin from Rhodopseudomonas blastica. Comparison with the porin from Rhodobacter capsulatus. J Mol Biol. 1994 Nov 11;243(5):891–905. doi: 10.1006/jmbi.1994.1690.
- Kyte J., Doolittle R. F. A simple method for displaying the hydropathic character of a protein. J Mol Biol. 1982 May 5;157(1):105–132. doi: 10.1016/0022-2836(82)90515-0.
- Lewis M. J., Chang J. A., Simoni R. D. A topological analysis of subunit alpha from Escherichia coli F1F0-ATP synthase predicts eight transmembrane segments. J Biol Chem. 1990 Jun 25;265(18):10541–10550.
- Manoil C., Beckwith J. A genetic approach to analyzing membrane protein topology. Science. 1986 Sep 26;233(4771):1403–1408. doi: 10.1126/science.3529391.
- Nilsson I., von Heijne G. Fine-tuning the topology of a polytopic membrane protein: role of positively and negatively charged amino acids. Cell. 1990 Sep 21;62(6):1135–1141. doi: 10.1016/0092-8674(90)90390-z.
- O'Hara P. J., Sheppard P. O., Thøgersen H., Venezia D., Haldeman B. A., McGrane V., Houamed K. M., Thomsen C., Gilbert T. L., Mulvihill E. R. The ligand-binding domain in metabotropic glutamate receptors is related to bacterial periplasmic binding proteins. Neuron. 1993 Jul;11(1):41–52. doi: 10.1016/0896-6273(93)90269-w.
- Park K., Perczel A., Fasman G. D. Differentiation between transmembrane helices and peripheral helices by the deconvolution of circular dichroism spectra of membrane proteins. Protein Sci. 1992 Aug;1(8):1032–1049. doi: 10.1002/pro.5560010809.
- Persson B., Argos P. Prediction of transmembrane segments in proteins utilising multiple sequence alignments. J Mol Biol. 1994 Mar 25;237(2):182–192. doi: 10.1006/jmbi.1994.1220.
- Rost B., Casadio R., Fariselli P., Sander C. Transmembrane helices predicted at 95% accuracy. Protein Sci. 1995 Mar;4(3):521–533. doi: 10.1002/pro.5560040318.
- Rost B., Sander C. Progress of 1D protein structure prediction at last. Proteins. 1995 Nov;23(3):295–300. doi: 10.1002/prot.340230304.
- Rost B., Sander C., Schneider R. PHD--an automatic mail server for protein secondary structure prediction. Comput Appl Biosci. 1994 Feb;10(1):53–60. doi: 10.1093/bioinformatics/10.1.53.
- Rost B., Sander C. Secondary structure prediction of all-helical proteins in two states. Protein Eng. 1993 Nov;6(8):831–836. doi: 10.1093/protein/6.8.831.
- Rost B., Sander C. Structure prediction of proteins--where are we now? Curr Opin Biotechnol. 1994 Aug;5(4):372–380. doi: 10.1016/0958-1669(94)90045-0.
- Rost B. TOPITS: threading one-dimensional predictions into three-dimensional structures. Proc Int Conf Intell Syst Mol Biol. 1995;3:314–321.
- Sander C., Schneider R. Database of homology-derived protein structures and the structural meaning of sequence alignment. Proteins. 1991;9(1):56–68. doi: 10.1002/prot.340090107.
- Sander C., Schneider R. The HSSP database of protein structure-sequence alignments. Nucleic Acids Res. 1994 Sep;22(17):3597–3599.
- Sipos L., von Heijne G. Predicting the topology of eukaryotic membrane proteins. Eur J Biochem. 1993 May 1;213(3):1333–1340. doi: 10.1111/j.1432-1033.1993.tb17885.x.
- Stokes D. L., Taylor W. R., Green N. M. Structure, transmembrane topology and helix packing of P-type ion pumps. FEBS Lett. 1994 Jun 6;346(1):32–38. doi: 10.1016/0014-5793(94)00297-5.
- Taylor W. R., Jones D. T., Green N. M. A method for alpha-helical integral membrane protein fold prediction. Proteins. 1994 Mar;18(3):281–294. doi: 10.1002/prot.340180309.
- von Heijne G. A new method for predicting signal sequence cleavage sites. Nucleic Acids Res. 1986 Jun 11;14(11):4683–4690. doi: 10.1093/nar/14.11.4683.
- von Heijne G. Control of topology and mode of assembly of a polytopic membrane protein by positively charged residues. Nature. 1989 Oct 5;341(6241):456–458. doi: 10.1038/341456a0.
- von Heijne G. Membrane proteins: the amino acid composition of membrane-penetrating segments. Eur J Biochem. 1981 Nov;120(2):275–278. doi: 10.1111/j.1432-1033.1981.tb05700.x.