Nucleic Acids Research. 1995 Dec 11;23(23):4878–4884. doi: 10.1093/nar/23.23.4878

MatInd and MatInspector: new fast and versatile tools for detection of consensus matches in nucleotide sequence data.

K Quandt, K Frech, H Karas, E Wingender, T Werner
PMCID: PMC307478  PMID: 8532532

Abstract

The identification of potential regulatory motifs in new sequence data is increasingly important for experimental design. Those motifs are commonly located by matches to IUPAC strings derived from consensus sequences. Although this method is simple and widely used, a major drawback of IUPAC strings is that they necessarily remove much of the information originally present in the set of sequences. Nucleotide distribution matrices retain most of the information and are thus better suited to evaluate new potential sites. However, sufficiently large libraries of pre-compiled matrices are a prerequisite for practical application of any matrix-based approach and are just beginning to emerge. Here we present a set of tools for molecular biologists that allows generation of new matrices and detection of potential sequence matches by automatic searches with a library of pre-compiled matrices. We also supply a large library (> 200) of transcription factor binding site matrices that has been compiled on the basis of published matrices as well as entries from the TRANSFAC database, with emphasis on sequences with experimentally verified binding capacity. Our search method includes position weighting of the matrices based on the information content of individual positions and calculates a relative matrix similarity. We show several examples suggesting that this matrix similarity is useful in estimating the functional potential of matrix matches and thus provides a valuable basis for designing appropriate experiments.
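The scoring idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the code assumes that each position is weighted by its Shannon information content (in bits, for four bases) and that the relative matrix similarity is the weighted score of the candidate site divided by the weighted score of the best possible site. The function names, the toy count matrix, and the 0.85 threshold are all hypothetical.

```python
import math

BASES = "ACGT"


def information_content(column):
    """Shannon information content (bits) of one column of base counts.

    Assumption: this stands in for the position weighting mentioned in the
    abstract. 2.0 bits means fully conserved, 0.0 means uniform.
    """
    total = sum(column.values())
    ic = 2.0  # log2(4), the maximum for a four-letter alphabet
    for base in BASES:
        p = column.get(base, 0) / total
        if p > 0:
            ic += p * math.log2(p)
    return ic


def matrix_similarity(matrix, site):
    """Relative similarity of `site` to `matrix`, in the range 0..1.

    A value of 1.0 is reached only when every position of the site carries
    the most frequent base of the corresponding matrix column.
    """
    if len(site) != len(matrix):
        raise ValueError("site and matrix must have the same length")
    score = 0.0
    best = 0.0
    for column, base in zip(matrix, site):
        weight = information_content(column)
        score += weight * column.get(base, 0)   # unknown bases count as 0
        best += weight * max(column.values())
    return score / best if best > 0 else 0.0


def scan(matrix, sequence, threshold=0.85):
    """Slide the matrix along one strand and report windows above threshold."""
    sequence = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        similarity = matrix_similarity(matrix, window)
        if similarity >= threshold:
            hits.append((start, window, similarity))
    return hits


# Hypothetical count matrix built from 10 aligned sites (consensus TATA);
# the third position is less conserved than the others.
TOY_MATRIX = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 2, "C": 0, "G": 0, "T": 8},
    {"A": 10, "C": 0, "G": 0, "T": 0},
]

if __name__ == "__main__":
    for start, window, similarity in scan(TOY_MATRIX, "GGTATACC"):
        print(start, window, round(similarity, 3))
```

In this sketch a mismatch at the weakly conserved third position (`TAAA`) lowers the similarity less than a mismatch at a fully conserved position (`GATA`). An IUPAC consensus string cannot make that distinction, which is the information loss the abstract refers to.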



Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
