Nucleic Acids Research. 1995 May 11;23(9):1632–1639. doi: 10.1093/nar/23.9.1632

Identification of ribosome binding sites in Escherichia coli using neural network models.

D Bisant, J Maizel
PMCID: PMC306908  PMID: 7784221

Abstract

This study investigated the use of neural networks to identify Escherichia coli ribosome binding sites. Recognizing these sites from primary sequence data is difficult because multiple determinants define them. In addition, secondary structure plays a significant role in determining the site, and this information is difficult to include in the models. Efforts to solve this problem have so far yielded poor results. A new compilation of E. coli ribosome binding sites was generated for this study, and feedforward backpropagation networks were applied to their identification. Perceptrons were also applied, because they had been the best available method since 1982. The performance of all neural networks and perceptrons was evaluated by ROC analysis. The neural network recognized these sites significantly better than the previous best method, finding fewer than half as many false positives when both models were adjusted to find an equal number of actual sites. The best neural network used an input window of 101 nucleotides and a single hidden layer of 9 units. Both the neural network and the perceptron trained on the new compilation performed better than the original perceptron published by Stormo et al. in 1982.
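The architecture and the comparison criterion described above can be illustrated with a minimal sketch. It is not the authors' implementation. Only the 101-nucleotide window and the 9 hidden units come from the abstract; the one-hot encoding, the sigmoid units, the weight initialization, and the names `one_hot`, `init_network`, `forward`, and `false_positives_at` are assumptions made for illustration. Backpropagation training is omitted.

```python
import math
import random

WINDOW = 101      # input window size reported in the abstract
HIDDEN = 9        # hidden units reported in the abstract
ALPHABET = "ACGT" # assumed encoding order, 4 inputs per nucleotide


def one_hot(window):
    """Encode a 101-nt window as a flat vector of 404 values (assumed encoding).

    U is treated as T; any other symbol (e.g. N) becomes four zeros.
    """
    if len(window) != WINDOW:
        raise ValueError(f"expected {WINDOW} nucleotides, got {len(window)}")
    vec = []
    for base in window.upper().replace("U", "T"):
        vec.extend(1.0 if base == b else 0.0 for b in ALPHABET)
    return vec


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_network(seed=0):
    """Create small random weights for a 404 -> 9 -> 1 network (hypothetical init)."""
    rng = random.Random(seed)
    n_in = WINDOW * len(ALPHABET)
    w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(HIDDEN)]
    b_hidden = [0.0] * HIDDEN
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(HIDDEN)]
    b_out = 0.0
    return w_hidden, b_hidden, w_out, b_out


def forward(net, x):
    """Feedforward pass: one hidden layer, one output score in (0, 1)."""
    w_hidden, b_hidden, w_out, b_out = net
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def false_positives_at(scores_pos, scores_neg, n_sites):
    """Count non-sites scoring at or above the threshold that recovers n_sites true sites."""
    threshold = sorted(scores_pos, reverse=True)[n_sites - 1]
    return sum(s >= threshold for s in scores_neg)
```

Calling `false_positives_at` with the same `n_sites` for two models gives the kind of matched comparison the abstract describes: both models are held to the same number of recovered sites and then compared on false positives. Sweeping `n_sites` from 1 to the number of known sites traces out the points of an ROC-style curve.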


Selected References

These references are in PubMed. This may not be the complete list of references from this article.

  1. Bilofsky H. S., Burks C. The GenBank genetic sequence data bank. Nucleic Acids Res. 1988 Mar 11;16(5):1861–1863. doi: 10.1093/nar/16.5.1861.
  2. Borisova G. P., Volkova T. M., Berzin V., Rosenthal G., Gren E. J. The regulatory region of MS2 phage RNA replicase cistron. IV. Functional activity of specific MS2 RNA fragments in formation of the 70 S initiation complex of protein biosynthesis. Nucleic Acids Res. 1979;6(5):1761–1774. doi: 10.1093/nar/6.5.1761.
  3. Brunak S., Engelbrecht J., Knudsen S. Prediction of human mRNA donor and acceptor sites from the DNA sequence. J Mol Biol. 1991 Jul 5;220(1):49–65. doi: 10.1016/0022-2836(91)90380-o.
  4. Coleman J., Inouye M., Nakamura K. Mutations upstream of the ribosome-binding site affect translational efficiency. J Mol Biol. 1985 Jan 5;181(1):139–143. doi: 10.1016/0022-2836(85)90332-8.
  5. Davis M. A., Simons R. W., Kleckner N. Tn10 protects itself at two levels from fortuitous activation by external promoters. Cell. 1985 Nov;43(1):379–387. doi: 10.1016/0092-8674(85)90043-1.
  6. Dunn J. J., Buzash-Pollert E., Studier F. W. Mutations of bacteriophage T7 that affect initiation of synthesis of the gene 0.3 protein. Proc Natl Acad Sci U S A. 1978 Jun;75(6):2741–2745. doi: 10.1073/pnas.75.6.2741.
  7. Gillam S., Astell C. R., Jahnke P., Hutchison C. A., 3rd, Smith M. Construction and properties of a ribosome-binding site mutation in gene E of phi X174 bacteriophage. J Virol. 1984 Dec;52(3):892–896. doi: 10.1128/jvi.52.3.892-896.1984.
  8. Gold L., Pribnow D., Schneider T., Shinedling S., Singer B. S., Stormo G. Translational initiation in prokaryotes. Annu Rev Microbiol. 1981;35:365–403. doi: 10.1146/annurev.mi.35.100181.002053.
  9. Gupta S. L., Chen J., Schaefer L., Lengyel P., Weissman S. M. Nucleotide sequence of a ribosome attachment site of bacteriophage f2 RNA. Biochem Biophys Res Commun. 1970 Jun 5;39(5):883–888. doi: 10.1016/0006-291x(70)90406-7.
  10. Hindley J., Staples D. H. Sequence of a ribosome binding site in bacteriophage Q-beta-RNA. Nature. 1969 Dec 6;224(5223):964–967. doi: 10.1038/224964a0.
  11. Holbrook S. R., Muskal S. M., Kim S. H. Predicting surface exposure of amino acids from protein sequence. Protein Eng. 1990 Aug;3(8):659–665. doi: 10.1093/protein/3.8.659.
  12. Hui A., Hayflick J., Dinkelspiel K., de Boer H. A. Mutagenesis of the three bases preceding the start codon of the beta-galactosidase mRNA and its effect on translation in Escherichia coli. EMBO J. 1984 Mar;3(3):623–629. doi: 10.1002/j.1460-2075.1984.tb01858.x.
  13. Iserentant D., Fiers W. Secondary structure of mRNA and efficiency of translation initiation. Gene. 1980 Apr;9(1-2):1–12. doi: 10.1016/0378-1119(80)90163-8.
  14. Ivanov I. G., Alexandrova R., Dragulev B., Leclerc D., Saraffova A., Maximova V., Abouhaidar M. G. Efficiency of the 5'-terminal sequence (omega) of tobacco mosaic virus RNA for the initiation of eukaryotic gene translation in Escherichia coli. Eur J Biochem. 1992 Oct 1;209(1):151–156. doi: 10.1111/j.1432-1033.1992.tb17271.x.
  15. Jansone I., Berzin V., Gribanov V., Gren E. J. The regulatory region of MS2 phage RNA replicase cistron. III. Characterization of fragments resulting from S1 nuclease digestion. Nucleic Acids Res. 1979;6(5):1747–1760. doi: 10.1093/nar/6.5.1747.
  16. Kneller D. G., Cohen F. E., Langridge R. Improvements in protein secondary structure prediction by an enhanced neural network. J Mol Biol. 1990 Jul 5;214(1):171–182. doi: 10.1016/0022-2836(90)90154-E.
  17. Lodish H. F. Translational control of protein synthesis. Annu Rev Biochem. 1976;45:39–72. doi: 10.1146/annurev.bi.45.070176.000351.
  18. Lukashin A. V., Anshelevich V. V., Amirikyan B. R., Gragerov A. I., Frank-Kamenetskii M. D. Neural network models for promoter recognition. J Biomol Struct Dyn. 1989 Jun;6(6):1123–1133. doi: 10.1080/07391102.1989.10506540.
  19. Matteucci M. D., Heyneker H. L. Targeted random mutagenesis: the use of ambiguously synthesized oligonucleotides to mutagenize sequences immediately 5' of an ATG initiation codon. Nucleic Acids Res. 1983 May 25;11(10):3113–3121. doi: 10.1093/nar/11.10.3113.
  20. McGregor M. J., Flores T. P., Sternberg M. J. Prediction of beta-turns in proteins using neural networks. Protein Eng. 1989 May;2(7):521–526. doi: 10.1093/protein/2.7.521.
  21. Meistrell M. L. Evaluation of neural network performance by receiver operating characteristic (ROC) analysis: examples from the biotechnology domain. Comput Methods Programs Biomed. 1990 May;32(1):73–80. doi: 10.1016/0169-2607(90)90087-p.
  22. Min Jou W., Haegeman G., Ysebaert M., Fiers W. Nucleotide sequence of the gene coding for the bacteriophage MS2 coat protein. Nature. 1972 May 12;237(5350):82–88. doi: 10.1038/237082a0.
  23. Munson L. M., Stormo G. D., Niece R. L., Reznikoff W. S. lacZ translation initiation mutations. J Mol Biol. 1984 Aug 25;177(4):663–683. doi: 10.1016/0022-2836(84)90043-3.
  24. Muskal S. M., Holbrook S. R., Kim S. H. Prediction of the disulfide-bonding state of cysteine in proteins. Protein Eng. 1990 Aug;3(8):667–672. doi: 10.1093/protein/3.8.667.
  25. O'Neill M. C., Chiafari F. Escherichia coli promoters. II. A spacing class-dependent promoter search protocol. J Biol Chem. 1989 Apr 5;264(10):5531–5534.
  26. O'Neill M. C. Consensus methods for finding and ranking DNA binding sites. Application to Escherichia coli promoters. J Mol Biol. 1989 May 20;207(2):301–310. doi: 10.1016/0022-2836(89)90256-8.
  27. O'Neill M. C. Training back-propagation neural networks to define and detect DNA-binding sites. Nucleic Acids Res. 1991 Jan 25;19(2):313–318. doi: 10.1093/nar/19.2.313.
  28. Petersen C. The functional stability of the lacZ transcript is sensitive towards sequence alterations immediately downstream of the ribosome binding site. Mol Gen Genet. 1987 Aug;209(1):179–187. doi: 10.1007/BF00329856.
  29. Qian N., Sejnowski T. J. Predicting the secondary structure of globular proteins using neural network models. J Mol Biol. 1988 Aug 20;202(4):865–884. doi: 10.1016/0022-2836(88)90564-5.
  30. Ray P. N., Pearson M. L. Evidence for post-transcriptional control of the morphogenetic genes of bacteriophage lambda. J Mol Biol. 1974 May 5;85(1):163–175. doi: 10.1016/0022-2836(74)90135-1.
  31. Ray P. N., Pearson M. L. Functional inactivation of bacteriophage lambda morphogenetic gene in RNA. Nature. 1975 Feb 20;253(5493):647–650. doi: 10.1038/253647a0.
  32. Ringquist S., Shinedling S., Barrick D., Green L., Binkley J., Stormo G. D., Gold L. Translation initiation in Escherichia coli: sequences within the ribosome-binding site. Mol Microbiol. 1992 May;6(9):1219–1229. doi: 10.1111/j.1365-2958.1992.tb01561.x.
  33. Roberts T. M., Kacich R., Ptashne M. A general method for maximizing the expression of a cloned gene. Proc Natl Acad Sci U S A. 1979 Feb;76(2):760–764. doi: 10.1073/pnas.76.2.760.
  34. Rost B., Sander C. Improved prediction of protein secondary structure by use of sequence profiles and neural networks. Proc Natl Acad Sci U S A. 1993 Aug 15;90(16):7558–7562. doi: 10.1073/pnas.90.16.7558.
  35. Rudd K. E., Miller W., Werner C., Ostell J., Tolstoshev C., Satterfield S. G. Mapping sequenced E.coli genes by computer: software, strategies and examples. Nucleic Acids Res. 1991 Feb 11;19(3):637–647. doi: 10.1093/nar/19.3.637.
  36. Scherer G. F., Walkinshaw M. D., Arnott S., Morré D. J. The ribosome binding sites recognized by E. coli ribosomes have regions with signal character in both the leader and protein coding segments. Nucleic Acids Res. 1980 Sep 11;8(17):3895–3907. doi: 10.1093/nar/8.17.3895.
  37. Selker E., Yanofsky C. Nucleotide sequence of the trpC-trpB intercistronic region from Salmonella typhimurium. J Mol Biol. 1979 May 15;130(2):135–143. doi: 10.1016/0022-2836(79)90422-4.
  38. Shine J., Dalgarno L. The 3'-terminal sequence of Escherichia coli 16S ribosomal RNA: complementarity to nonsense triplets and ribosome binding sites. Proc Natl Acad Sci U S A. 1974 Apr;71(4):1342–1346. doi: 10.1073/pnas.71.4.1342.
  39. Snyder E. E., Stormo G. D. Identification of coding regions in genomic DNA sequences: an application of dynamic programming and neural networks. Nucleic Acids Res. 1993 Feb 11;21(3):607–613. doi: 10.1093/nar/21.3.607.
  40. Steege D. A. 5'-Terminal nucleotide sequence of Escherichia coli lactose repressor mRNA: features of translational initiation and reinitiation sites. Proc Natl Acad Sci U S A. 1977 Oct;74(10):4163–4167. doi: 10.1073/pnas.74.10.4163.
  41. Steitz J. A., Jakes K. How ribosomes select initiator regions in mRNA: base pair formation between the 3' terminus of 16S rRNA and the mRNA during initiation of protein synthesis in Escherichia coli. Proc Natl Acad Sci U S A. 1975 Dec;72(12):4734–4738. doi: 10.1073/pnas.72.12.4734.
  42. Steitz J. A. Polypeptide chain initiation: nucleotide sequences of the three ribosomal binding sites in bacteriophage R17 RNA. Nature. 1969 Dec 6;224(5223):957–964. doi: 10.1038/224957a0.
  43. Stormo G. D., Schneider T. D., Gold L. M. Characterization of translational initiation sites in E. coli. Nucleic Acids Res. 1982 May 11;10(9):2971–2996. doi: 10.1093/nar/10.9.2971.
  44. Stormo G. D., Schneider T. D., Gold L., Ehrenfeucht A. Use of the 'Perceptron' algorithm to distinguish translational initiation sites in E. coli. Nucleic Acids Res. 1982 May 11;10(9):2997–3011. doi: 10.1093/nar/10.9.2997.
  45. Thanaraj T. A., Pandit M. W. An additional ribosome-binding site on mRNA of highly expressed genes and a bifunctional site on the colicin fragment of 16S rRNA from Escherichia coli: important determinants of the efficiency of translation-initiation. Nucleic Acids Res. 1989 Apr 25;17(8):2973–2985. doi: 10.1093/nar/17.8.2973.
  46. Vieth M., Koliński A., Skolnick J., Sikorski A. Prediction of protein secondary structure by neural networks: encoding short and long range patterns of amino acid packing. Acta Biochim Pol. 1992;39(4):369–392.
  47. de Smit M. H., van Duin J. Translational initiation on structured messengers. Another role for the Shine-Dalgarno interaction. J Mol Biol. 1994 Jan 7;235(1):173–184. doi: 10.1016/s0022-2836(05)80024-5.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
