FIGURE 2.
Identification of conserved upstream sequence elements. (A) Enumerative search for over-represented 8-mers within conserved upstream regions. Next to each consensus sequence are the number of instances of that sequence in conserved C. elegans blocks, allowing for zero or one mismatch to the consensus or its reverse complement, and the number of distinct upstream sequences containing these instances. The Z-score of consensus motif A was 29.0 and that of motif B was 14.7. As a control, a search in equally sized, randomly generated sequences yielded a Z-score of 11.2. (B) Application of the MEME local alignment algorithm to the complete 2000-bp upstream sequence sets. Shown are the pictograms (http://genes.mit.edu/pictogram.html) computed from the sequences that MEME used in the alignment for C. elegans (E-value of 3.0e-24) and C. briggsae (E-value of 1.5e-37). Both methods identify a highly similar motif as the most significant one. (C) Histograms of the locations of the best hit per sequence to the motifs given in B, in bins of 100 bp.
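The counting rule described in (A) can be illustrated with a minimal Python sketch. It counts windows that match a consensus or its reverse complement with at most one mismatch, and reports how many distinct sequences contain a hit. The function names (`count_instances`, `z_score`) are hypothetical, and the Z-score shown is an assumption: the caption does not state how the background expectation was obtained, so the sketch simply standardizes an observed count against counts from a set of background sequences.

```python
from statistics import mean, pstdev

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def count_instances(consensus, sequences, max_mismatch=1):
    """Count windows within max_mismatch of the consensus or its reverse
    complement. Returns (total instances, sequences with at least one hit)."""
    k = len(consensus)
    targets = (consensus, reverse_complement(consensus))
    n_instances = 0
    n_sequences = 0
    for seq in sequences:
        hits = 0
        for i in range(len(seq) - k + 1):
            window = seq[i:i + k]
            if any(hamming(window, t) <= max_mismatch for t in targets):
                hits += 1
        n_instances += hits
        n_sequences += hits > 0
    return n_instances, n_sequences


def z_score(observed, background_counts):
    """Assumed definition: (observed - mean) / standard deviation of the
    counts obtained from background (e.g. randomized) sequence sets."""
    mu = mean(background_counts)
    sigma = pstdev(background_counts)
    return (observed - mu) / sigma
```

For example, `count_instances("AAAAAAAA", ["AAAAAAAAAA", "CCCCCCCCCC", "TTTTTTTTTT"])` counts the three overlapping windows in the first sequence and, through the reverse complement, the three in the third. Overlapping windows are counted separately here, which is a simplification that the original analysis may or may not share.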