Abstract
A computer tool is described for the comparison, analysis and search of genetic signals. The method is based on sequence consensus matrices. It assumes that a genetic signal (such as a promoter or an enhancer) is composed of several signal blocks separated from one another by variable distances. A set of programs is presented to perform the analysis. The result of such an analysis is a description of the investigated signal, including a matrix for each signal block, the distances between blocks and the distribution of the values. Programs are also provided to search for a signal using the results of a previous analysis. The method can align large sets of sequences within a few minutes and check the quality of the alignment. An analysis of E. coli promoters is provided as an example.
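The idea of a signal made of matrix-described blocks separated by a variable distance can be illustrated with a minimal Python sketch. This is not the authors' software: the function names, the log-odds scoring with a uniform background, the pseudocount, and the toy site lists are all assumptions made for illustration. The two blocks loosely mimic the well-known -35 (TTGACA) and -10 (TATAAT) hexamers of E. coli promoters, with an assumed spacer range of 15 to 19 bases.

```python
import math

BASES = "ACGT"

def build_matrix(sites, pseudocount=1.0):
    """Build a log-odds matrix from equal-length aligned sites.

    Assumption: uniform background (0.25 per base) and an additive pseudocount.
    """
    matrix = []
    for pos in range(len(sites[0])):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append({b: math.log2((counts[b] / total) / 0.25) for b in BASES})
    return matrix

def score_block(matrix, seq, start):
    """Sum the matrix scores of the bases in seq beginning at start."""
    return sum(col[seq[start + i]] for i, col in enumerate(matrix))

def best_two_block_hit(seq, m1, m2, min_gap, max_gap):
    """Return (score, start_of_block1, gap) for the best-scoring placement
    of block 1, a spacer of min_gap..max_gap bases, and block 2."""
    best = None
    for start1 in range(len(seq) - len(m1) + 1):
        s1 = score_block(m1, seq, start1)
        for gap in range(min_gap, max_gap + 1):
            start2 = start1 + len(m1) + gap
            if start2 + len(m2) > len(seq):
                break  # larger gaps would run past the end of the sequence
            total = s1 + score_block(m2, seq, start2)
            if best is None or total > best[0]:
                best = (total, start1, gap)
    return best

# Toy training sites (hypothetical, for illustration only).
block1 = build_matrix(["TTGACA", "TTGACA", "TTTACA", "TTGATA"])
block2 = build_matrix(["TATAAT", "TATAAT", "TATGAT", "TAAAAT"])

# Hypothetical test sequence: block 1 at offset 6, a 17-base spacer, then block 2.
sequence = "GCGCGC" + "TTGACA" + "C" * 17 + "TATAAT" + "GGG"
score, start1, gap = best_two_block_hit(sequence, block1, block2, 15, 19)
print(f"block 1 at {start1}, spacer {gap}, score {score:.2f}")
```

The exhaustive scan over start positions and spacer lengths is chosen here for clarity only; the abstract does not state how the published programs perform the search or the alignment.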