Nucleic Acids Research. 1992 Aug 25;20(16):4331–4338. doi: 10.1093/nar/20.16.4331

An assessment of neural network and statistical approaches for prediction of E. coli promoter sites.

P B Horton, M Kanehisa
PMCID: PMC334144  PMID: 1508724

Abstract

We have constructed a perceptron type neural network for E. coli promoter prediction and improved its ability to generalize with a new technique for selecting the sequence features shown during training. We have also reconstructed five previous prediction methods and compared the effectiveness of those methods and our neural network. Surprisingly, the simple statistical method of Mulligan et al. performed the best amongst the previous methods. Our neural network was comparable to Mulligan's method when false positives were kept low and better than Mulligan's method when false negatives were kept low. We also showed the correlation between the prediction rates of neural networks achieved by previous researchers and the information content of their data sets.
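
The abstract does not say how the information content of a data set was measured. As a rough illustration only, the sketch below uses one common definition: for a set of aligned sequences, sum over positions the difference between the maximum entropy (2 bits for DNA) and the observed Shannon entropy of that position. The function name `information_content`, the uniform background, and the absence of a small-sample correction are assumptions made here, not details taken from the paper.

```python
import math
from collections import Counter


def information_content(aligned_seqs, alphabet="ACGT"):
    """Total information content (bits) of equal-length aligned sequences.

    Assumes a uniform background over `alphabet` and applies no
    small-sample correction; both are simplifying assumptions.
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")

    max_bits = math.log2(len(alphabet))  # 2 bits for a 4-letter alphabet
    total = 0.0
    for col in range(length):
        # Frequency of each base observed at this position.
        counts = Counter(seq[col] for seq in aligned_seqs)
        n = sum(counts.values())
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        total += max_bits - entropy
    return total


if __name__ == "__main__":
    # Two identical hexamers: every position is fully conserved.
    print(information_content(["TATAAT", "TATAAT"]))
```

Under this definition a fully conserved hexamer contributes 2 bits per position (12 bits in total), while a position where all four bases are equally frequent contributes nothing.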

Selected References

These references are in PubMed. This may not be the complete list of references from this article.

  1. Alexandrov N. N., Mironov A. A. Application of a new method of pattern recognition in DNA sequence analysis: a study of E. coli promoters. Nucleic Acids Res. 1990 Apr 11;18(7):1847–1852. doi: 10.1093/nar/18.7.1847.
  2. Collis C. M., Molloy P. L., Both G. W., Drew H. R. Influence of the sequence-dependent flexure of DNA on transcription in E. coli. Nucleic Acids Res. 1989 Nov 25;17(22):9447–9468. doi: 10.1093/nar/17.22.9447.
  3. Demeler B., Zhou G. W. Neural network optimization for E. coli promoter prediction. Nucleic Acids Res. 1991 Apr 11;19(7):1593–1599. doi: 10.1093/nar/19.7.1593.
  4. Gentz R., Bujard H. Promoters recognized by Escherichia coli RNA polymerase selected by function: highly efficient promoters from bacteriophage T5. J Bacteriol. 1985 Oct;164(1):70–77. doi: 10.1128/jb.164.1.70-77.1985.
  5. Harley C. B., Reynolds R. P. Analysis of E. coli promoter sequences. Nucleic Acids Res. 1987 Mar 11;15(5):2343–2361. doi: 10.1093/nar/15.5.2343.
  6. Harr R., Häggström M., Gustafsson P. Search algorithm for pattern match analysis of nucleic acid sequences. Nucleic Acids Res. 1983 May 11;11(9):2943–2957. doi: 10.1093/nar/11.9.2943.
  7. Hawley D. K., McClure W. R. Compilation and analysis of Escherichia coli promoter DNA sequences. Nucleic Acids Res. 1983 Apr 25;11(8):2237–2255. doi: 10.1093/nar/11.8.2237.
  8. Hirst J. D., Sternberg M. J. Prediction of ATP/GTP-binding motif: a comparison of a perceptron type neural network and a consensus sequence method [corrected]. Protein Eng. 1991 Aug;4(6):615–623. doi: 10.1093/protein/4.6.615.
  9. Holley L. H., Karplus M. Protein secondary structure prediction with a neural network. Proc Natl Acad Sci U S A. 1989 Jan;86(1):152–156. doi: 10.1073/pnas.86.1.152.
  10. Horwitz M. S., Loeb L. A. DNA sequences of random origin as probes of Escherichia coli promoter architecture. J Biol Chem. 1988 Oct 15;263(29):14724–14731.
  11. Jacquet M. A., Ehrlich R., Reiss C. In vivo gene expression directed by synthetic promoter constructions restricted to the -10 and -35 consensus hexamers of E. coli. Nucleic Acids Res. 1989 Apr 25;17(8):2933–2945. doi: 10.1093/nar/17.8.2933.
  12. Jacquet M. A., Reiss C. Transcription in vivo directed by consensus sequences of E.coli promoters: their context heavily affects efficiencies and start sites. Nucleic Acids Res. 1990 Mar 11;18(5):1137–1143. doi: 10.1093/nar/18.5.1137.
  13. Kneller D. G., Cohen F. E., Langridge R. Improvements in protein secondary structure prediction by an enhanced neural network. J Mol Biol. 1990 Jul 5;214(1):171–182. doi: 10.1016/0022-2836(90)90154-E.
  14. Koroleva O. N., Drutsa V. L. In vivo promoter activity of the synthetic Pribnow box. FEBS Lett. 1991 Jan 28;278(2):207–210. doi: 10.1016/0014-5793(91)80118-m.
  15. Lukashin A. V., Anshelevich V. V., Amirikyan B. R., Gragerov A. I., Frank-Kamenetskii M. D. Neural network models for promoter recognition. J Biomol Struct Dyn. 1989 Jun;6(6):1123–1133. doi: 10.1080/07391102.1989.10506540.
  16. Mulligan M. E., Hawley D. K., Entriken R., McClure W. R. Escherichia coli promoter sequences predict in vitro RNA polymerase selectivity. Nucleic Acids Res. 1984 Jan 11;12(1 Pt 2):789–800. doi: 10.1093/nar/12.1part2.789.
  17. Nakata K., Kanehisa M., Maizel J. V. Jr. Discriminant analysis of promoter regions in Escherichia coli sequences. Comput Appl Biosci. 1988 Aug;4(3):367–371. doi: 10.1093/bioinformatics/4.3.367.
  18. O'Neill M. C. Consensus methods for finding and ranking DNA binding sites. Application to Escherichia coli promoters. J Mol Biol. 1989 May 20;207(2):301–310. doi: 10.1016/0022-2836(89)90256-8.
  19. O'Neill M. C. Escherichia coli promoters. I. Consensus as it relates to spacing class, specificity, repeat substructure, and three-dimensional organization. J Biol Chem. 1989 Apr 5;264(10):5522–5530.
  20. O'Neill M. C. Training back-propagation neural networks to define and detect DNA-binding sites. Nucleic Acids Res. 1991 Jan 25;19(2):313–318. doi: 10.1093/nar/19.2.313.
  21. Qian N., Sejnowski T. J. Predicting the secondary structure of globular proteins using neural network models. J Mol Biol. 1988 Aug 20;202(4):865–884. doi: 10.1016/0022-2836(88)90564-5.
  22. Staden R. Computer methods to locate signals in nucleic acid sequences. Nucleic Acids Res. 1984 Jan 11;12(1 Pt 2):505–519. doi: 10.1093/nar/12.1part2.505.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
