MEME version 3.5.7 (Release date: 2007-12-17 16:56:19 -0800 (Mon, 17 Dec 2007))
For further information on how to interpret these results or to get a copy of the MEME software please access http://meme.nbcr.net.
This file may be used as input to the MAST algorithm for searching sequence databases for matches to groups of motifs. MAST is available for interactive use and downloading at http://meme.nbcr.net.
If you use this program in your research, please cite:
Timothy L. Bailey and Charles Elkan, "Fitting a mixture model by expectation maximization to discover motifs in biopolymers", Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, pp. 28-36, AAAI Press, Menlo Park, California, 1994.
DATAFILE= regThr74Top30_600.fasta
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
chrX_19458413_19458814   1.0000    402  chrX_4547813_4548214     1.0000    402
chrX_18682713_18683114   1.0000    402  chrX_18209813_18210114   1.0000    302
chrX_7909113_7909614     1.0000    502  chrX_11626413_11626814   1.0000    402
chrX_12924113_12924614   1.0000    502  chrX_3972813_3973214     1.0000    402
chrX_13570113_13570514   1.0000    402  chrX_1897613_1898014     1.0000    402
chrX_21134413_21134714   1.0000    302  chrX_14662213_14662614   1.0000    402
chrX_329513_329814       1.0000    302  chrX_8990013_8990514     1.0000    502
chrX_11009413_11009714   1.0000    302  chrX_17654313_17654614   1.0000    302
chrX_13130813_13131314   1.0000    502  chrX_21664513_21665014   1.0000    502
chrX_5534913_5535414     1.0000    502  chrX_10308313_10308814   1.0000    502
chrX_16154213_16154514   1.0000    302  chrX_5174313_5174614     1.0000    302
chrX_19324113_19324614   1.0000    502  chrX_13171613_13172014   1.0000    402
chrX_3706313_3706614     1.0000    302  chrX_13175713_13176014   1.0000    302
chr3R_2919913_2920214    1.0000    302  chrX_1755713_1756114     1.0000    402
chrX_1075413_1075714     1.0000    302  chrX_18329413_18329914   1.0000    502
This information can also be useful in the event you wish to report a problem with the MEME software.

command: meme regThr74Top30_600.fasta -dna -mod anr -nmotifs 5 -minw 6 -maxw 50 -dir /Users/tobiasst

model:  mod=           anr    nmotifs=         5    evt=           inf
object function=  E-value of product of p-values
width:  minw=            6    maxw=           50    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       50    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           11860    N=              30
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.279 C 0.225 G 0.216 T 0.281
Background letter frequencies (from dataset with add-one prior applied):
A 0.279 C 0.225 G 0.216 T 0.281
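The "data:" line reports n= 11860 total positions across N= 30 sequences. As a rough consistency check, the sketch below recomputes both numbers from the sequence names alone. It assumes (this is not stated in the output) that each name has the form chromosome_start_end with inclusive coordinates, so that length = end - start + 1. The helper name length_from_name is hypothetical.

```python
# Sequence names copied from the training set table above.
SEQUENCE_NAMES = [
    "chrX_19458413_19458814", "chrX_4547813_4548214",
    "chrX_18682713_18683114", "chrX_18209813_18210114",
    "chrX_7909113_7909614", "chrX_11626413_11626814",
    "chrX_12924113_12924614", "chrX_3972813_3973214",
    "chrX_13570113_13570514", "chrX_1897613_1898014",
    "chrX_21134413_21134714", "chrX_14662213_14662614",
    "chrX_329513_329814", "chrX_8990013_8990514",
    "chrX_11009413_11009714", "chrX_17654313_17654614",
    "chrX_13130813_13131314", "chrX_21664513_21665014",
    "chrX_5534913_5535414", "chrX_10308313_10308814",
    "chrX_16154213_16154514", "chrX_5174313_5174614",
    "chrX_19324113_19324614", "chrX_13171613_13172014",
    "chrX_3706313_3706614", "chrX_13175713_13176014",
    "chr3R_2919913_2920214", "chrX_1755713_1756114",
    "chrX_1075413_1075714", "chrX_18329413_18329914",
]


def length_from_name(name):
    """Derive a sequence length from a name like 'chrX_100_401'.

    Assumption: the last two underscore-separated fields are inclusive
    start and end coordinates.
    """
    _chrom, start, end = name.rsplit("_", 2)
    return int(end) - int(start) + 1


total_length = sum(length_from_name(name) for name in SEQUENCE_NAMES)
print(len(SEQUENCE_NAMES), total_length)
```

Under that naming assumption, the derived lengths agree with the Length column and their sum agrees with the reported n.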
Time 24.12 secs.
Time 47.31 secs.
Time 69.91 secs.
Time 91.48 secs.
Time 112.05 secs.
CPU: p338i-005.win.med.uni-muenchen.de
MOTIFS
For each motif that it discovers in the training set, MEME prints the following information:
Summing the information content for each position in the motif gives the total information content of the motif (shown in parentheses to the left of the diagram). The total information content, in bits, is approximately equal to the log likelihood ratio divided by the product of the number of occurrences and ln(2). The total information content is a measure of how useful the motif is for database searches. As a rule, a motif must contain at least log_2(N) bits of information to be useful for searching a database of N sequences. For example, to search a database containing 100,000 sequences effectively for occurrences of a single motif, the motif should have an IC of at least log_2(100,000) = 16.6 bits. Motifs with lower information content are still useful when a family of sequences shares more than one motif, since they can be combined in multiple-motif searches (using MAST).
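The relationships in this paragraph can be sketched in a few lines of Python. This is an illustration, not MEME's own code: the function names are hypothetical, and the per-position formula (relative entropy of the motif column against the background frequencies) is an assumption chosen to be consistent with the log likelihood ratio relation stated above.

```python
import math


def information_content(pspm, background):
    """Total information content (bits) of a motif.

    pspm: list of dicts mapping letter -> probability, one dict per position.
    background: dict mapping letter -> background frequency.
    Assumption: per-position IC is sum_a p_a * log2(p_a / f_a).
    """
    total = 0.0
    for column in pspm:
        for letter, p in column.items():
            if p > 0.0:  # a zero probability contributes nothing
                total += p * math.log2(p / background[letter])
    return total


def ic_from_llr(llr, nsites):
    """Approximate total IC (bits) from the log likelihood ratio (nats)."""
    return llr / (nsites * math.log(2))


def min_bits_for_database(n_sequences):
    """Rule-of-thumb minimum IC (bits) for searching n_sequences."""
    return math.log2(n_sequences)


print(round(min_bits_for_database(100_000), 1))
```

For a database of 100,000 sequences the threshold function reproduces the 16.6 bits quoted in the text.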
Multilevel TTATGTGAACGACGTCACACT
consensus AA T A G A GA AA
sequence T C TT T
You can convert these blocks to PSSMs (position-specific scoring matrices), LOGOS (color representations of the motifs), phylogeny trees and search them against a database of other blocks by pasting everything from the "BL" line to the "//" line (inclusive) into the Multiple Alignment Processor. If you include the -print_fasta switch on the command line, MEME prints the motif sites in FASTA format instead of BLOCKS format.
Note: The probability p used to compute the PSSM is not exactly the same as the corresponding value in the Position Specific Probability Matrix (PSPM). The values of p used to compute the PSSM take into account the motif prior, whereas the values in the PSPM are just the observed frequencies of letters in the motif sites.
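To make the distinction concrete, the sketch below builds one PSPM row from observed counts and one PSSM row from prior-adjusted probabilities. Everything beyond the note itself is an assumption made for illustration: the pseudocount form (counts plus b times the background frequency), the 100 * log2(p / f) scaling, and the function names are not taken from this output and may differ from what MEME actually computes. The value b = 0.01 and the background frequencies are copied from the settings listed earlier in this file.

```python
import math

# Background letter frequencies as printed in this file.
BACKGROUND = {"A": 0.279, "C": 0.225, "G": 0.216, "T": 0.281}


def pspm_row(counts):
    """Observed letter frequencies at one motif position (no prior)."""
    total = sum(counts.values())
    return {letter: c / total for letter, c in counts.items()}


def pssm_row(counts, background, b=0.01, scale=100):
    """Log-odds scores at one motif position, with a simple prior.

    Assumptions (illustrative only): p = (count + b * f) / (total + b)
    and score = scale * log2(p / f), rounded to an integer.
    """
    total = sum(counts.values())
    row = {}
    for letter, c in counts.items():
        f = background[letter]
        p = (c + b * f) / (total + b)
        row[letter] = round(scale * math.log2(p / f))
    return row


counts = {"A": 0, "C": 0, "G": 0, "T": 8}  # hypothetical site counts
print(pspm_row(counts))
print(pssm_row(counts, BACKGROUND))
```

The point of the sketch is that a letter never observed at a position has probability exactly 0 in the PSPM row, yet still receives a finite (strongly negative) score in the PSSM row because the prior keeps p above zero.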
Note: Earlier versions of MEME gave the posterior probabilities--the probability after applying a prior on letter frequencies--rather than the observed frequencies. These versions of MEME also gave the number of possible positions for the motif rather than the actual number of occurrences. The output from these earlier versions of MEME can be distinguished by "n=" rather than "nsites=" in the line preceding the matrix.
[sequence_name combined_p-value number_of_motif_occurrences [motif_number start_of_motif position_p-value]+]+
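A reader for this record layout could look like the sketch below. It assumes the fields are whitespace-separated and that number_of_motif_occurrences gives the count of (motif_number, start_of_motif, position_p-value) triples that follow. The class and function names are hypothetical, and the sample input is made up purely to exercise the parser; it is not taken from this run.

```python
from typing import List, NamedTuple


class Occurrence(NamedTuple):
    motif_number: int
    start: int
    position_p_value: float


class SequenceRecord(NamedTuple):
    name: str
    combined_p_value: float
    occurrences: List[Occurrence]


def parse_records(text):
    """Parse whitespace-separated records in the layout shown above."""
    tokens = text.split()
    records = []
    i = 0
    while i < len(tokens):
        name = tokens[i]
        combined = float(tokens[i + 1])
        n_occurrences = int(tokens[i + 2])
        i += 3
        occurrences = []
        for _ in range(n_occurrences):
            occurrences.append(
                Occurrence(int(tokens[i]), int(tokens[i + 1]), float(tokens[i + 2]))
            )
            i += 3
        records.append(SequenceRecord(name, combined, occurrences))
    return records


# Made-up sample input, for illustration only.
sample = "seqA 1.2e-05 2 1 10 3.4e-06 2 57 1.0e-04 seqB 0.5 1 3 7 0.01"
for record in parse_records(sample):
    print(record.name, record.combined_p_value, len(record.occurrences))
```

Reading the occurrence count first, rather than guessing where a record ends, keeps the parser unambiguous even when a sequence name happens to look like a number.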