Abstract
A general data study of eukaryotic promoter sequences from widely different species is presented. Mammalian promoters with known transcription initiation sites formed the largest subclass of the data, and for this group neural network algorithms were trained to predict the location of the initiation site in a test set. The prediction accuracy of this local method was higher than would be expected from the known non-local structure of eukaryotic promoters. Subsequent analysis revealed, in addition to the consensus sequences of the two known important subregions (the TATA-box, TATAAA, and the Cap-signal, CA), a CT-signal positioned on average seven nucleotides downstream of the transcription initiation site. The consensus of the CT-signal is CTNCNG. The details of this core promoter element were revealed by multiple alignment; the element had previously been described only in a few isolated examples.
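As an illustration of what the reported consensus means in practice, the following minimal sketch scans a window downstream of a known initiation site for matches to CTNCNG, treating N as any nucleotide. It is not the neural network or multiple-alignment procedure used in the study. The function names, the toy sequence, the `max_offset` window size, and the convention that the offset counts from the initiation site to the first base of the match are all assumptions made for this example.

```python
import re

# IUPAC-style lookup: N matches any nucleotide, other letters match themselves.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def consensus_to_regex(consensus):
    """Translate a consensus such as CTNCNG into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


def find_ct_signal(sequence, tss_index, consensus="CTNCNG", max_offset=15):
    """Return downstream offsets (relative to the initiation site) of consensus matches.

    tss_index is the 0-based position of the initiation site in `sequence`.
    max_offset is a hypothetical window size chosen for illustration only.
    """
    pattern = consensus_to_regex(consensus)
    seq = sequence.upper()
    hits = []
    for offset in range(1, max_offset + 1):
        start = tss_index + offset
        # fullmatch with pos/endpos tests exactly one consensus-length window;
        # windows running past the end of the sequence simply fail to match.
        if pattern.fullmatch(seq, start, start + len(consensus)):
            hits.append(offset)
    return hits


# Toy sequence (invented for this example): TATA-box, spacer, Cap-signal CA,
# then a CT-signal-like hexamer starting seven bases after the A of CA.
toy = "TATAAAGGGGCAGTTGGACTGCAGTT"
hits = find_ct_signal(toy, tss_index=11)
```

A plain pattern match like this only reports exact agreement with the consensus; a weight matrix would be the more usual choice when partial matches should be scored.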
