Nucleic Acids Research. 1991 Jan 25;19(2):313–318. doi: 10.1093/nar/19.2.313

Training back-propagation neural networks to define and detect DNA-binding sites.

M C O'Neill
PMCID: PMC333596  PMID: 2014171

Abstract

A three-layered back-propagation neural network was trained to recognize E. coli promoters of the 17-base spacing class. To this end, the network was presented with 39 promoter sequences and derivatives of those sequences as positive inputs; random sequences of 60% A + T content and sequences containing 2 promoter-down point mutations were used as negative inputs. The entire promoter sequence of 58 bases, approximately -50 to +8, was entered as input. The network was asked to associate an output of 1.0 with promoter sequence input and 0.0 with non-promoter input. Generally, after 100,000 input cycles, the network was virtually perfect in classifying the training set. A trained network was about 80% effective in recognizing 'new' promoters that were not in the training set, with a false positive rate below 0.1%. Network searches on pBR322 and on the lambda genome were also performed. Overall, the results were somewhat better than those of the best rule-based procedures. The trained network can be analyzed for both its choice of base and its relative weighting, positive and negative, at each position of the sequence. This method, which requires only appropriate input/output training pairs, can be used to define and search for any DNA regulatory sequence for which there are sufficient exemplars.
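The training setup described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the author's implementation: the abstract does not give the input encoding, the number of hidden units, the learning rate, or the error function, so the one-hot encoding (four inputs per base), the four hidden units, the rate of 0.5, and the squared-error update below are all assumptions. The positive examples are toy sequences with two hexamers planted 17 bases apart at assumed positions, standing in for the 39 real promoters; the mutated negatives mentioned in the abstract are omitted, and `toy_positive` and `train_toy_demo` are hypothetical helper names.

```python
import math
import random

BASES = "ACGT"
SEQ_LEN = 58  # window of 58 bases, roughly -50 to +8, as in the abstract


def one_hot(seq):
    """Encode a DNA string as a flat list with four 0/1 inputs per base."""
    return [1.0 if base == b else 0.0 for base in seq for b in BASES]


def random_sequence(length, at_fraction, rng):
    """Random sequence whose expected A + T content is at_fraction."""
    return "".join(
        rng.choice("AT") if rng.random() < at_fraction else rng.choice("CG")
        for _ in range(length)
    )


def toy_positive(rng):
    """Toy stand-in for a promoter: two hexamers planted 17 bases apart."""
    seq = list(random_sequence(SEQ_LEN, 0.6, rng))
    seq[15:21] = "TTGACA"  # assumed placement, for illustration only
    seq[38:44] = "TATAAT"  # starts at 21 + 17 = 38, i.e. a 17-base spacer
    return "".join(seq)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ThreeLayerNet:
    """Input layer, one hidden layer, and a single sigmoid output unit."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        """One back-propagation update on a single example (squared error)."""
        hidden, out = self.forward(x)
        delta_out = (out - target) * out * (1.0 - out)
        for j, h in enumerate(hidden):
            # Hidden-unit error uses the output weight before it is updated.
            delta_h = delta_out * self.w2[j] * h * (1.0 - h)
            self.w2[j] -= lr * delta_out * h
            self.b1[j] -= lr * delta_h
            row = self.w1[j]
            for i, xi in enumerate(x):
                if xi:  # only active one-hot inputs carry a gradient
                    row[i] -= lr * delta_h * xi
        self.b2 -= lr * delta_out
        return out


def train_toy_demo(seed=0, n_each=20, epochs=40):
    """Train on toy data; return mean output for positives and negatives."""
    rng = random.Random(seed)
    data = [(one_hot(toy_positive(rng)), 1.0) for _ in range(n_each)]
    data += [(one_hot(random_sequence(SEQ_LEN, 0.6, rng)), 0.0)
             for _ in range(n_each)]
    net = ThreeLayerNet(4 * SEQ_LEN, n_hidden=4, rng=rng)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, target in data:
            net.train_step(x, target)
    pos = [net.forward(x)[1] for x, t in data if t == 1.0]
    neg = [net.forward(x)[1] for x, t in data if t == 0.0]
    return sum(pos) / len(pos), sum(neg) / len(neg)


if __name__ == "__main__":
    pos_mean, neg_mean = train_toy_demo()
    print(f"mean output, toy positives: {pos_mean:.3f}")
    print(f"mean output, random negatives: {neg_mean:.3f}")
```

With this encoding, each first-layer weight belongs to one base at one position, which is one way to read the abstract's remark that a trained network can be inspected for its choice of base and its positive or negative weighting at each position.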



Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
