MEME

MEME - Motif discovery tool

MEME version 3.5.7 (Release date: 2007-12-17 16:56:19 -0800 (Mon, 17 Dec 2007))

For further information on how to interpret these results or to get a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching sequence databases for matches to groups of motifs. MAST is available for interactive use and downloading at http://meme.nbcr.net.

REFERENCE

If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan, "Fitting a mixture model by expectation maximization to discover motifs in biopolymers", Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, pp. 28-36, AAAI Press, Menlo Park, California, 1994.

TRAINING SET

DATAFILE= regThr74Top30_600.fasta
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length  
-------------            ------ ------  -------------            ------ ------  
chrX_19458413_19458814   1.0000    402  chrX_4547813_4548214     1.0000    402  
chrX_18682713_18683114   1.0000    402  chrX_18209813_18210114   1.0000    302  
chrX_7909113_7909614     1.0000    502  chrX_11626413_11626814   1.0000    402  
chrX_12924113_12924614   1.0000    502  chrX_3972813_3973214     1.0000    402  
chrX_13570113_13570514   1.0000    402  chrX_1897613_1898014     1.0000    402  
chrX_21134413_21134714   1.0000    302  chrX_14662213_14662614   1.0000    402  
chrX_329513_329814       1.0000    302  chrX_8990013_8990514     1.0000    502  
chrX_11009413_11009714   1.0000    302  chrX_17654313_17654614   1.0000    302  
chrX_13130813_13131314   1.0000    502  chrX_21664513_21665014   1.0000    502  
chrX_5534913_5535414     1.0000    502  chrX_10308313_10308814   1.0000    502  
chrX_16154213_16154514   1.0000    302  chrX_5174313_5174614     1.0000    302  
chrX_19324113_19324614   1.0000    502  chrX_13171613_13172014   1.0000    402  
chrX_3706313_3706614     1.0000    302  chrX_13175713_13176014   1.0000    302  
chr3R_2919913_2920214    1.0000    302  chrX_1755713_1756114     1.0000    402  
chrX_1075413_1075714     1.0000    302  chrX_18329413_18329914   1.0000    502

COMMAND LINE SUMMARY

This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme regThr74Top30_600.fasta -dna -mod anr -nmotifs 5 -minw 6 -maxw 50 -dir /Users/tobiasst 

model:  mod=           anr    nmotifs=         5    evt=           inf
object function=  E-value of product of p-values
width:  minw=            6    maxw=           50    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       50    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           11860    N=              30
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.279 C 0.225 G 0.216 T 0.281 
Background letter frequencies (from dataset with add-one prior applied):
A 0.279 C 0.225 G 0.216 T 0.281

MOTIF 1 width = 28 sites = 13 llr = 233 E-value = 8.4e-010

NAME	START	P-VALUE				SITES
			Simplified	A	:	1	1	:	:	:	:	1	2	1	:	:	2	:	2	:	:	:	2	1	:	:	2	2	:	2	:	2
			pos.-specific	C	:	5	4	4	:	8	:	7	:	7	4	8	:	8	2	a	2	3	:	1	:	8	:	5	2	:	3	6
			probability	G	3	2	2	6	4	:	2	1	1	2	:	:	4	:	1	:	3	:	3	:	2	2	8	2	2	6	4	:
			matrix	T	7	2	4	:	6	2	8	2	8	1	6	2	5	2	6	:	5	7	5	8	8	1	:	:	6	2	3	2
			Information content diagram (25.9 bits total; vertical axis 0.0 to 2.2 bits)
			Multilevel		`T`	`C`	`C`	`G`	`T`	`C`	`T`	`C`	`T`	`C`	`T`	`C`	`T`	`C`	`T`	`C`	`T`	`T`	`T`	`T`	`T`	`C`	`G`	`C`	`T`	`G`	`G`	`C`
			consensus		`G`	`G`	`T`	`C`	`G`		`G`				`C`		`G`	`T`			`G`	`C`	`G`					`A`	`G`	`T`	`C`	`T`
			sequence			`T`															`C`		`A`					`G`			`T`

chrX_8990013_8990514	189	4.11e-15	`TCAATCCGTA`		`T`	`C`	`C`	`G`	`T`	`C`	`T`	`C`	`T`	`C`	`T`	`C`	`G`	`C`	`T`	`C`	`T`	`T`	`T`	`T`	`T`	`C`	`G`	`C`	`G`	`G`	`T`	`C`	`GCGGTTAGAA`
chrX_10308313_10308814	474	6.78e-12	`TTTCTCTCTG`		`T`	`T`	`T`	`G`	`T`	`C`	`T`	`G`	`T`	`C`	`T`	`C`	`T`	`C`	`T`	`C`	`T`	`T`	`A`	`T`	`T`	`C`	`G`	`C`	`T`	`G`	`T`	`C`	`T`
chrX_18329413_18329914	181	1.07e-10	`CATCCCGCCA`		`T`	`C`	`G`	`C`	`G`	`C`	`T`	`A`	`T`	`C`	`T`	`C`	`G`	`C`	`T`	`C`	`T`	`T`	`A`	`T`	`T`	`C`	`G`	`C`	`T`	`T`	`C`	`C`	`GCTCTATACC`
chrX_5174313_5174614	122	2.12e-10	`GCATCCACGT`		`T`	`C`	`T`	`G`	`G`	`T`	`G`	`C`	`T`	`C`	`C`	`C`	`T`	`C`	`T`	`C`	`C`	`C`	`G`	`T`	`T`	`C`	`G`	`G`	`T`	`G`	`G`	`T`	`GTCCAGCGAT`
chrX_10308313_10308814	424	3.29e-10	`GTTCGTTCTC`		`T`	`T`	`T`	`C`	`T`	`C`	`T`	`C`	`T`	`C`	`C`	`C`	`T`	`C`	`T`	`C`	`T`	`T`	`T`	`T`	`T`	`G`	`A`	`C`	`C`	`G`	`T`	`C`	`TATGTCTGTC`
chrX_12924113_12924614	12	1.02e-09	`TACACGCAGC`		`G`	`C`	`C`	`C`	`T`	`C`	`T`	`C`	`T`	`C`	`T`	`C`	`T`	`T`	`T`	`C`	`T`	`T`	`T`	`T`	`T`	`C`	`A`	`G`	`C`	`A`	`G`	`C`	`AACGGCAACA`
chrX_3706313_3706614	203	4.38e-09	`CCGCCCTATG`		`G`	`C`	`C`	`G`	`T`	`C`	`T`	`C`	`A`	`C`	`T`	`C`	`A`	`C`	`C`	`C`	`G`	`C`	`T`	`T`	`G`	`C`	`G`	`A`	`T`	`G`	`C`	`A`	`AACGCCTGGC`
chrX_13175713_13176014	72	7.21e-09	`GCCGCATGTG`		`T`	`T`	`T`	`G`	`T`	`C`	`T`	`C`	`G`	`C`	`T`	`C`	`T`	`T`	`A`	`C`	`C`	`T`	`G`	`T`	`T`	`C`	`G`	`A`	`T`	`T`	`G`	`T`	`GTCATCAATC`
chrX_13175713_13176014	36	8.46e-09	`ACCATCCGTT`		`T`	`G`	`C`	`C`	`T`	`C`	`T`	`C`	`T`	`G`	`T`	`C`	`G`	`C`	`G`	`C`	`C`	`C`	`G`	`C`	`T`	`G`	`G`	`C`	`G`	`G`	`G`	`C`	`CGCATGTGTT`
chrX_3706313_3706614	159	4.15e-08	`ATAGGGGTTA`		`T`	`A`	`T`	`G`	`G`	`C`	`T`	`T`	`T`	`A`	`C`	`C`	`G`	`C`	`T`	`C`	`T`	`C`	`T`	`T`	`T`	`C`	`G`	`G`	`G`	`A`	`C`	`T`	`TGCAGTCCGC`
chrX_10308313_10308814	140	4.15e-08	`CGAGGGATGC`		`G`	`G`	`C`	`G`	`G`	`C`	`G`	`T`	`T`	`G`	`C`	`C`	`A`	`C`	`C`	`C`	`G`	`T`	`G`	`T`	`G`	`C`	`G`	`A`	`T`	`T`	`G`	`C`	`GCTGAACACT`
chrX_1075413_1075714	233	8.75e-08	`TTGCTCAACT`		`G`	`G`	`G`	`G`	`G`	`T`	`T`	`C`	`A`	`C`	`C`	`T`	`T`	`C`	`T`	`C`	`G`	`T`	`A`	`A`	`T`	`C`	`G`	`C`	`T`	`G`	`T`	`A`	`GAATCCGCCA`
chrX_12924113_12924614	184	9.83e-08	`CACAATTTGA`		`T`	`C`	`A`	`C`	`T`	`C`	`G`	`C`	`T`	`T`	`T`	`T`	`G`	`T`	`A`	`C`	`G`	`T`	`T`	`T`	`T`	`T`	`G`	`C`	`T`	`G`	`C`	`C`	`TAAATCGCGT`

Motif 1 block diagrams

Name	Lowest p-value
chrX_8990013_8990514	4.1e-15
chrX_10308313_10308814	4.2e-08
chrX_18329413_18329914	1.1e-10
chrX_5174313_5174614	2.1e-10
chrX_12924113_12924614	9.8e-08
chrX_3706313_3706614	4.4e-09
chrX_13175713_13176014	8.5e-09
chrX_1075413_1075714	8.8e-08

SCALE: sequence positions 1 to 475, tick marks every 25.

Motif 1 in BLOCKS format

Motif 1 position-specific scoring matrix

Motif 1 position-specific probability matrix

Motif 1 regular expression

[TG][CGT][CT][GC][TG]C[TG]CTC[TC]C[TG][CT]TC[TGC][TC][TGA]TTCG[CAG][TG][GT][GCT][CT]

Time 24.12 secs.

MOTIF 2 width = 29 sites = 8 llr = 171 E-value = 3.9e-004

NAME	START	P-VALUE				SITES
			Simplified	A	1	4	:	:	:	8	4	6	:	9	:	a	1	6	:	6	:	a	4	4	:	5	5	8	:	9	1	1	5
			pos.-specific	C	5	5	:	3	3	1	3	:	1	:	:	:	3	1	:	:	:	:	:	:	:	5	1	:	4	:	:	3	:
			probability	G	4	:	3	5	8	1	4	:	4	1	a	:	5	1	a	3	a	:	:	6	a	:	4	:	1	:	5	4	5
			matrix	T	:	1	8	3	:	:	:	4	5	:	:	:	1	1	:	1	:	:	6	:	:	:	:	3	5	1	4	3	:
			Information content diagram (30.9 bits total; vertical axis 0.0 to 2.2 bits)
			Multilevel		`C`	`C`	`T`	`G`	`G`	`A`	`A`	`A`	`T`	`A`	`G`	`A`	`G`	`A`	`G`	`A`	`G`	`A`	`T`	`G`	`G`	`A`	`A`	`A`	`T`	`A`	`G`	`G`	`A`
			consensus		`G`	`A`	`G`	`C`	`C`		`G`	`T`	`G`				`C`			`G`			`A`	`A`		`C`	`G`	`T`	`C`		`T`	`C`	`G`
			sequence					`T`			`C`																					`T`

chrX_19458413_19458814	166	3.77e-12	`GGTGCAGCGT`		`G`	`C`	`T`	`G`	`G`	`A`	`A`	`A`	`T`	`A`	`G`	`A`	`G`	`C`	`G`	`A`	`G`	`A`	`T`	`G`	`G`	`C`	`A`	`T`	`C`	`A`	`T`	`C`	`A`	`GAGACGGGAC`
chrX_329513_329814	161	2.09e-11	`TCAGTTGCTC`		`C`	`C`	`T`	`G`	`G`	`A`	`G`	`A`	`C`	`A`	`G`	`A`	`A`	`A`	`G`	`A`	`G`	`A`	`T`	`G`	`G`	`C`	`A`	`T`	`C`	`A`	`T`	`T`	`A`	`GTAAGCACTC`
chrX_1075413_1075714	115	4.36e-11	`CGCTGGTAGA`		`C`	`A`	`T`	`G`	`G`	`A`	`C`	`A`	`G`	`A`	`G`	`A`	`T`	`A`	`G`	`A`	`G`	`A`	`T`	`A`	`G`	`A`	`C`	`A`	`T`	`A`	`G`	`A`	`G`	`TAAGACGCGA`
chrX_14662213_14662614	224	7.09e-11	`AACGTAGGTA`		`C`	`A`	`T`	`T`	`C`	`A`	`A`	`T`	`T`	`A`	`G`	`A`	`C`	`A`	`G`	`G`	`G`	`A`	`T`	`A`	`G`	`C`	`G`	`A`	`T`	`A`	`T`	`G`	`A`	`AGATGACTCG`
chrX_14662213_14662614	258	1.22e-10	`TATGAAGATG`		`A`	`C`	`T`	`C`	`G`	`G`	`C`	`T`	`T`	`A`	`G`	`A`	`G`	`G`	`G`	`A`	`G`	`A`	`A`	`G`	`G`	`C`	`A`	`A`	`T`	`A`	`G`	`C`	`G`	`ATTCAAAGCG`
chrX_3706313_3706614	241	2.52e-10	`AACGCCTGGC`		`C`	`A`	`G`	`C`	`C`	`C`	`G`	`A`	`T`	`A`	`G`	`A`	`G`	`A`	`G`	`T`	`G`	`A`	`T`	`G`	`G`	`A`	`G`	`A`	`G`	`A`	`G`	`G`	`G`	`AGCCGTCTCT`
chrX_1755713_1756114	24	2.72e-10	`CGTGCGGAGT`		`G`	`C`	`G`	`G`	`G`	`A`	`A`	`A`	`G`	`G`	`G`	`A`	`C`	`A`	`G`	`G`	`G`	`A`	`A`	`A`	`G`	`A`	`G`	`A`	`C`	`T`	`G`	`G`	`G`	`AGAGAAAGTA`
chrX_13570113_13570514	244	4.19e-10	`AGATCAGAAA`		`G`	`T`	`T`	`T`	`G`	`A`	`G`	`T`	`G`	`A`	`G`	`A`	`G`	`T`	`G`	`A`	`G`	`A`	`A`	`G`	`G`	`A`	`A`	`A`	`T`	`A`	`A`	`T`	`A`	`TAAATAACTG`

Motif 2 block diagrams

Name	Lowest p-value
chrX_19458413_19458814	3.8e-12
chrX_329513_329814	2.1e-11
chrX_1075413_1075714	4.4e-11
chrX_14662213_14662614	7.1e-11
chrX_3706313_3706614	2.5e-10
chrX_1755713_1756114	2.7e-10
chrX_13570113_13570514	4.2e-10

SCALE: sequence positions 1 to 375, tick marks every 25.

Motif 2 in BLOCKS format

Motif 2 position-specific scoring matrix

Motif 2 position-specific probability matrix

Motif 2 regular expression

[CG][CA][TG][GCT][GC]A[AGC][AT][TG]AGA[GC]AG[AG]GA[TA][GA]G[AC][AG][AT][TC]A[GT][GCT][AG]

Time 47.31 secs.

MOTIF 3 width = 15 sites = 22 llr = 227 E-value = 6.1e-001

NAME	START	P-VALUE				SITES
			Simplified	A	9	5	a	:	9	6	4	1	7	3	a	3	5	2	6
			pos.-specific	C	:	2	:	:	:	:	1	:	:	3	:	:	:	:	:
			probability	G	:	:	:	:	:	:	1	:	3	:	:	:	:	:	1
			matrix	T	:	2	:	a	1	4	4	9	:	4	:	7	5	8	3
			Information content diagram (14.9 bits total; vertical axis 0.0 to 2.2 bits)
			Multilevel		`A`	`A`	`A`	`T`	`A`	`A`	`A`	`T`	`A`	`T`	`A`	`T`	`A`	`T`	`A`
			consensus			`C`				`T`	`T`		`G`	`A`		`A`	`T`	`A`	`T`
			sequence			`T`								`C`

chrX_16154213_16154514	259	3.66e-08	`TTTTTATAAT`		`A`	`A`	`A`	`T`	`A`	`A`	`A`	`T`	`A`	`C`	`A`	`T`	`T`	`T`	`A`	`CTTGCGCTCA`
chrX_17654313_17654614	191	3.55e-07	`ACTAACATTC`		`A`	`C`	`A`	`T`	`A`	`T`	`A`	`T`	`A`	`C`	`A`	`T`	`A`	`T`	`A`	`TATTACACAT`
chrX_19458413_19458814	320	3.55e-07	`TTGATGTCCT`		`A`	`C`	`A`	`T`	`A`	`T`	`A`	`T`	`A`	`C`	`A`	`T`	`A`	`T`	`A`	`TGGGCATTAA`
chrX_11009413_11009714	221	5.82e-07	`TATAAATACA`		`A`	`A`	`A`	`T`	`A`	`A`	`T`	`T`	`A`	`T`	`A`	`T`	`T`	`A`	`A`	`ATATATTGTA`
chrX_11009413_11009714	135	5.82e-07	`ACAAAAAGCA`		`A`	`A`	`A`	`T`	`A`	`T`	`A`	`T`	`A`	`T`	`A`	`A`	`T`	`T`	`A`	`ACAATACATA`
chrX_1897613_1898014	27	1.32e-06	`TTAATGCACG`		`A`	`C`	`A`	`T`	`A`	`A`	`G`	`T`	`A`	`C`	`A`	`T`	`A`	`T`	`A`	`TCGGTCTCCC`
chrX_11009413_11009714	255	1.51e-06	`ATAACATTAT`		`A`	`T`	`A`	`T`	`A`	`A`	`A`	`T`	`A`	`T`	`A`	`T`	`A`	`A`	`A`	`AAGAAAACCC`
chrX_8990013_8990514	444	1.91e-06	`GGTGATAAAG`		`A`	`A`	`A`	`T`	`A`	`T`	`A`	`T`	`A`	`C`	`A`	`T`	`A`	`T`	`G`	`TACATTAGAT`
chrX_13171613_13172014	243	2.37e-06	`TGGCGTAAAA`		`A`	`A`	`A`	`T`	`A`	`A`	`A`	`T`	`A`	`A`	`A`	`A`	`A`	`A`	`A`	`AAACAGCAGG`
chrX_13570113_13570514	274	3.60e-06	`GAAATAATAT`		`A`	`A`	`A`	`T`	`A`	`A`	`C`	`T`	`G`	`A`	`A`	`T`	`A`	`T`	`T`	`TCGGCTTTCT`
chrX_13130813_13131314	37	8.23e-06	`GCTTACTGTG`		`A`	`T`	`A`	`T`	`A`	`A`	`T`	`A`	`A`	`T`	`A`	`T`	`A`	`T`	`T`	`ATATATTATA`
chrX_1075413_1075714	43	1.14e-05	`GGGTGTACAT`		`A`	`T`	`A`	`T`	`A`	`C`	`T`	`T`	`G`	`T`	`A`	`T`	`A`	`T`	`A`	`TACATGTGTG`
chrX_14662213_14662614	113	1.14e-05	`TTTTCAATTG`		`A`	`A`	`A`	`T`	`A`	`A`	`T`	`A`	`A`	`G`	`A`	`T`	`T`	`T`	`A`	`AGTTCATGGG`
chrX_21134413_21134714	177	1.33e-05	`AGTGTTTATT`		`A`	`C`	`A`	`T`	`A`	`A`	`A`	`T`	`A`	`T`	`A`	`G`	`T`	`T`	`T`	`GTCCGCGTTT`
chrX_13130813_13131314	103	1.56e-05	`AATATTAAAT`		`A`	`A`	`A`	`T`	`T`	`A`	`T`	`T`	`A`	`C`	`A`	`A`	`A`	`T`	`T`	`ATTGAAAATC`
chrX_17654313_17654614	210	2.39e-05	`CATATATATT`		`A`	`C`	`A`	`C`	`A`	`T`	`A`	`T`	`G`	`A`	`A`	`T`	`A`	`T`	`A`	`CATTCTATGA`
chrX_13171613_13172014	310	2.56e-05	`AACAATTTAT`		`T`	`T`	`A`	`T`	`A`	`T`	`T`	`T`	`A`	`A`	`A`	`T`	`T`	`T`	`A`	`TTTTAATTTG`
chrX_1897613_1898014	198	2.93e-05	`CCTGCCGCAA`		`A`	`A`	`A`	`T`	`A`	`A`	`G`	`T`	`G`	`T`	`T`	`T`	`T`	`T`	`A`	`CTTTGATAAC`
chrX_13570113_13570514	314	2.93e-05	`TACGCATTTA`		`C`	`A`	`A`	`T`	`A`	`A`	`T`	`T`	`G`	`A`	`A`	`T`	`A`	`A`	`A`	`TAACTATTAC`
chrX_18682713_18683114	57	3.55e-05	`ATGCTGCTTG`		`A`	`A`	`A`	`T`	`A`	`T`	`C`	`A`	`A`	`A`	`A`	`A`	`T`	`T`	`T`	`ACCATAGTTA`
chrX_13570113_13570514	352	3.77e-05	`CTTATAACCC`		`A`	`A`	`A`	`T`	`T`	`T`	`T`	`T`	`G`	`T`	`A`	`A`	`T`	`T`	`T`	`ATAATAATTA`
chrX_12924113_12924614	432	4.51e-05	`TACAAATATA`		`A`	`T`	`A`	`T`	`A`	`A`	`C`	`T`	`A`	`T`	`A`	`A`	`T`	`A`	`G`	`ATTTTTCAAA`

Motif 3 block diagrams

Name	Lowest p-value
chrX_16154213_16154514	3.7e-08
chrX_17654313_17654614	2.4e-05
chrX_19458413_19458814	3.5e-07
chrX_11009413_11009714	1.5e-06
chrX_1897613_1898014	2.9e-05
chrX_8990013_8990514	1.9e-06
chrX_13171613_13172014	2.6e-05
chrX_13570113_13570514	3.8e-05
chrX_13130813_13131314	8.2e-06
chrX_1075413_1075714	1.1e-05
chrX_14662213_14662614	1.1e-05
chrX_21134413_21134714	1.3e-05
chrX_18682713_18683114	3.5e-05
chrX_12924113_12924614	4.5e-05

SCALE: sequence positions 1 to 475, tick marks every 25.

Motif 3 in BLOCKS format

Motif 3 position-specific scoring matrix

Motif 3 position-specific probability matrix

Motif 3 regular expression

A[ACT]ATA[AT][AT]T[AG][TAC]A[TA][AT][TA][AT]

Time 69.91 secs.

MOTIF 4 width = 15 sites = 19 llr = 202 E-value = 6.0e+000

NAME	START	P-VALUE				SITES
			Simplified	A	1	:	1	1	2	:	5	:	:	2	:	1	1	:	3
			pos.-specific	C	:	3	6	2	2	8	1	:	a	1	7	3	2	2	7
			probability	G	8	7	2	6	6	1	3	9	:	6	3	1	5	8	:
			matrix	T	2	:	1	1	:	1	2	1	:	2	:	6	3	:	:
			Information content diagram (15.3 bits total; vertical axis 0.0 to 2.2 bits)
			Multilevel		`G`	`G`	`C`	`G`	`G`	`C`	`A`	`G`	`C`	`G`	`C`	`T`	`G`	`G`	`C`
			consensus			`C`			`A`		`G`			`T`	`G`	`C`	`T`		`A`
			sequence						`C`

chrX_10308313_10308814	113	1.65e-08	`CAGCGGCCGA`		`G`	`G`	`T`	`G`	`G`	`C`	`A`	`G`	`C`	`G`	`C`	`T`	`G`	`G`	`C`	`TGCGAGGGAT`
chrX_5534913_5535414	468	5.06e-08	`GTCAACAAAA`		`G`	`G`	`C`	`A`	`G`	`C`	`G`	`G`	`C`	`G`	`C`	`C`	`G`	`G`	`C`	`GTCAGTATCA`
chrX_7909113_7909614	406	2.32e-07	`GGCGGCTGGT`		`G`	`G`	`C`	`G`	`G`	`C`	`G`	`G`	`C`	`T`	`C`	`C`	`G`	`C`	`C`	`ATGTCCGGCA`
chrX_7909113_7909614	139	2.31e-06	`CGCAATACTG`		`G`	`C`	`C`	`G`	`C`	`C`	`A`	`T`	`C`	`G`	`C`	`T`	`G`	`G`	`A`	`CAAACGGATG`
chrX_10308313_10308814	98	3.44e-06	`ACTAATTATC`		`G`	`C`	`C`	`G`	`A`	`C`	`A`	`G`	`C`	`G`	`G`	`C`	`C`	`G`	`A`	`GGTGGCAGCG`
chrX_13570113_13570514	148	3.44e-06	`CTGAATAGTA`		`G`	`G`	`T`	`G`	`G`	`C`	`T`	`G`	`C`	`A`	`C`	`T`	`C`	`G`	`C`	`TAGCGGTGAA`
chrX_13130813_13131314	449	3.79e-06	`GGCGTCATCT`		`T`	`C`	`G`	`G`	`C`	`C`	`G`	`G`	`C`	`G`	`G`	`T`	`G`	`G`	`C`	`AAGAAGGCCG`
chrX_18209813_18210114	196	4.56e-06	`GGGCTCTGGC`		`G`	`C`	`C`	`G`	`C`	`C`	`T`	`G`	`C`	`G`	`C`	`C`	`T`	`C`	`C`	`GCCTCCAGTT`
chrX_18209813_18210114	181	4.56e-06	`GATTGAATGT`		`G`	`G`	`G`	`C`	`G`	`G`	`G`	`G`	`C`	`T`	`C`	`T`	`G`	`G`	`C`	`GCCGCCTGCG`
chrX_5174313_5174614	1	6.02e-06			`T`	`G`	`C`	`T`	`C`	`C`	`A`	`G`	`C`	`G`	`C`	`C`	`T`	`G`	`C`	`CCAAGATTTA`
chrX_11626413_11626814	75	6.02e-06	`GCAATCGCCT`		`G`	`C`	`G`	`G`	`A`	`C`	`A`	`G`	`C`	`G`	`C`	`A`	`G`	`G`	`A`	`CAGCGGTAAT`
chrX_18209813_18210114	29	6.02e-06	`AGAAGAACAG`		`G`	`G`	`A`	`G`	`G`	`C`	`G`	`G`	`C`	`G`	`C`	`G`	`T`	`G`	`A`	`GTGCCGGCGC`
chrX_13171613_13172014	50	6.57e-06	`TTAGCCGGAC`		`G`	`C`	`A`	`G`	`G`	`C`	`C`	`G`	`C`	`G`	`G`	`T`	`T`	`G`	`C`	`TGGCTTGACT`
chrX_1755713_1756114	66	8.54e-06	`GAAAGTAAAG`		`G`	`G`	`C`	`C`	`G`	`C`	`G`	`G`	`C`	`A`	`G`	`T`	`G`	`C`	`A`	`GCATACAAAT`
chrX_1897613_1898014	276	8.54e-06	`AAACGGGAGT`		`A`	`G`	`C`	`G`	`A`	`C`	`A`	`G`	`C`	`C`	`C`	`T`	`G`	`G`	`C`	`GGCGCCATTG`
chrX_16154213_16154514	132	1.01e-05	`CGCGTTTCGC`		`T`	`G`	`C`	`T`	`G`	`C`	`T`	`G`	`C`	`G`	`G`	`T`	`C`	`G`	`C`	`AGCAATTGTT`
chrX_12924113_12924614	160	1.01e-05	`GACCGCGGGC`		`G`	`G`	`C`	`G`	`G`	`T`	`A`	`T`	`C`	`A`	`C`	`T`	`G`	`G`	`C`	`ACAATTTGAT`
chrX_7909113_7909614	427	1.65e-05	`CGCCATGTCC`		`G`	`G`	`C`	`A`	`G`	`C`	`A`	`G`	`C`	`T`	`C`	`A`	`A`	`G`	`C`	`TCCTACTCGT`
chrX_3972813_3973214	210	1.78e-05	`TGCAACACTC`		`G`	`G`	`C`	`C`	`A`	`G`	`A`	`G`	`C`	`T`	`G`	`T`	`T`	`G`	`C`	`TGTGCTGCTC`

Motif 4 block diagrams

Name	Lowest p-value
chrX_10308313_10308814	3.4e-06
chrX_5534913_5535414	5.1e-08
chrX_7909113_7909614	1.6e-05
chrX_13570113_13570514	3.4e-06
chrX_13130813_13131314	3.8e-06
chrX_18209813_18210114	4.6e-06
chrX_5174313_5174614	6e-06
chrX_11626413_11626814	6e-06
chrX_13171613_13172014	6.6e-06
chrX_1755713_1756114	8.5e-06
chrX_1897613_1898014	8.5e-06
chrX_16154213_16154514	1e-05
chrX_12924113_12924614	1e-05
chrX_3972813_3973214	1.8e-05

SCALE: sequence positions 1 to 475, tick marks every 25.

Motif 4 in BLOCKS format

Motif 4 position-specific scoring matrix

Motif 4 position-specific probability matrix

Motif 4 regular expression

G[GC]CG[GAC]C[AG]GC[GT][CG][TC][GT]G[CA]

Time 91.48 secs.

MOTIF 5 width = 14 sites = 8 llr = 113 E-value = 9.4e+000

NAME	START	P-VALUE				SITES
			Simplified	A	:	6	3	a	:	6	:	9	:	8	:	8	:	8
			pos.-specific	C	9	:	8	:	a	:	a	:	3	:	a	:	a	:
			probability	G	1	:	:	:	:	1	:	:	:	3	:	1	:	3
			matrix	T	:	4	:	:	:	3	:	1	8	:	:	1	:	:
			Information content diagram (20.3 bits total; vertical axis 0.0 to 2.2 bits)
			Multilevel		`C`	`A`	`C`	`A`	`C`	`A`	`C`	`A`	`T`	`A`	`C`	`A`	`C`	`A`
			consensus			`T`	`A`			`T`			`C`	`G`				`G`
			sequence

chrX_1755713_1756114	350	4.71e-09	`AGATCATACA`		`C`	`A`	`C`	`A`	`C`	`A`	`C`	`A`	`T`	`A`	`C`	`A`	`C`	`A`	`GGCGCACATC`
chrX_21134413_21134714	223	7.51e-08	`AATAAATTCC`		`C`	`A`	`C`	`A`	`C`	`T`	`C`	`A`	`T`	`G`	`C`	`A`	`C`	`A`	`CACTCATGTT`
chrX_1755713_1756114	135	1.88e-07	`ATTGTGCCTA`		`C`	`A`	`C`	`A`	`C`	`A`	`C`	`A`	`T`	`A`	`C`	`T`	`C`	`G`	`CGTAGGCAGC`
chrX_18329413_18329914	461	2.45e-07	`ACGACTACTA`		`C`	`T`	`A`	`A`	`C`	`A`	`C`	`A`	`C`	`A`	`C`	`A`	`C`	`A`	`CGATTTTGAA`
chrX_5534913_5535414	298	2.45e-07	`TCTCGATCGC`		`C`	`A`	`C`	`A`	`C`	`A`	`C`	`T`	`T`	`G`	`C`	`A`	`C`	`A`	`TAAACACCTA`
chrX_18329413_18329914	230	3.21e-07	`ATCGCATTCT`		`C`	`T`	`C`	`A`	`C`	`T`	`C`	`A`	`T`	`A`	`C`	`G`	`C`	`A`	`AGGACACACA`
chrX_13130813_13131314	215	3.83e-07	`ATAACGACTG`		`C`	`T`	`A`	`A`	`C`	`G`	`C`	`A`	`T`	`A`	`C`	`A`	`C`	`A`	`TCGCCAATTT`
chrX_18329413_18329914	246	4.92e-07	`CATACGCAAG`		`G`	`A`	`C`	`A`	`C`	`A`	`C`	`A`	`C`	`A`	`C`	`A`	`C`	`G`	`AGGTCGGCAC`

Motif 5 block diagrams

Name	Lowest p-value
chrX_1755713_1756114	1.9e-07
chrX_21134413_21134714	7.5e-08
chrX_18329413_18329914	3.2e-07
chrX_5534913_5535414	2.4e-07
chrX_13130813_13131314	3.8e-07

SCALE: sequence positions 1 to 475, tick marks every 25.

Motif 5 in BLOCKS format

Motif 5 position-specific scoring matrix

Motif 5 position-specific probability matrix

Motif 5 regular expression

C[AT][CA]AC[AT]CA[TC][AG]CAC[AG]

Time 112.05 secs.

SUMMARY OF MOTIFS

Combined block diagrams: non-overlapping sites with p-value < 0.0001

Name	Combined p-value
chrX_19458413_19458814	6.96e-10
chrX_4547813_4548214	9.38e-01
chrX_18682713_18683114	4.86e-01
chrX_18209813_18210114	7.12e-02
chrX_7909113_7909614	2.69e-03
chrX_11626413_11626814	7.06e-03
chrX_12924113_12924614	9.86e-08
chrX_3972813_3973214	9.14e-03
chrX_13570113_13570514	2.80e-09
chrX_1897613_1898014	7.62e-04
chrX_21134413_21134714	1.50e-06
chrX_14662213_14662614	6.48e-08
chrX_329513_329814	1.37e-06
chrX_8990013_8990514	2.00e-11
chrX_11009413_11009714	7.14e-03
chrX_17654313_17654614	4.24e-03
chrX_13130813_13131314	4.82e-06
chrX_21664513_21665014	4.92e-02
chrX_5534913_5535414	5.32e-07
chrX_10308313_10308814	2.13e-10
chrX_16154213_16154514	3.58e-05
chrX_5174313_5174614	1.30e-07
chrX_19324113_19324614	5.46e-01
chrX_13171613_13172014	9.32e-04
chrX_3706313_3706614	9.93e-11
chrX_13175713_13176014	2.92e-04
chr3R_2919913_2920214	6.19e-01
chrX_1755713_1756114	1.89e-11
chrX_1075413_1075714	2.79e-12
chrX_18329413_18329914	1.09e-08

SCALE: sequence positions 1 to 475, tick marks every 25.

Motif summary in machine readable format.

Stopped because nmotifs = 5 reached.


CPU: p338i-005.win.med.uni-muenchen.de

EXPLANATION OF MEME RESULTS

The MEME results consist of:

The version of MEME and the date it was released.
The reference to cite if you use MEME in your research.
A description of the sequences you submitted (the "training set") showing the name, "weight" and length of each sequence.
The command line summary detailing the parameters with which you ran MEME.
Information on each of the motifs MEME discovered, including:
1. A summary line showing the width, number of occurrences, log likelihood ratio and statistical significance of the motif.
2. A simplified position-specific probability matrix.
3. A diagram showing the degree of conservation at each motif position.
4. A multilevel consensus sequence showing the most conserved letter(s) at each motif position.
5. The occurrences of the motif sorted by p-value and aligned with each other.
6. Block diagrams of the occurrences of the motif within each sequence in the training set.
7. The motif in BLOCKS or FASTA format.
8. A position-specific scoring matrix (PSSM) for use by the MAST database search program.
9. The position specific probability matrix (PSPM) describing the motif.
10. A regular expression describing the motif.
A summary of motifs showing an optimized (non-overlapping) tiling of all of the motifs onto each of the sequences in the training set.
The reason why MEME stopped and the name of the CPU on which it ran.
This explanation of how to interpret MEME results.

MOTIFS

For each motif that it discovers in the training set, MEME prints the following information:

Summary Line
This line gives the width (`width'), number of occurrences in the training set (`sites'), log likelihood ratio (`llr') and E-value of the motif. Each motif describes a pattern of a fixed width--no gaps are allowed in MEME motifs. MEME numbers the motifs consecutively from one as it finds them, and it usually finds the most statistically significant (low E-value) motifs first. The statistical significance of a motif is based on its log likelihood ratio, its width and number of occurrences, the background letter frequencies (given in the command line summary), and the size of the training set. The E-value is an estimate of the expected number of motifs with the given log likelihood ratio (or higher), and with the same width and number of occurrences, that one would find in a similarly sized set of random sequences. (In random sequences each position is independent, with letters chosen according to the background letter frequencies.) The log likelihood ratio is the logarithm of the ratio of the probability of the occurrences of the motif given the motif model (likelihood given the motif) to their probability given the background model (likelihood given the null model). (Normally the background model is a 0-order Markov model using the background letter frequencies, but higher order Markov models may be specified via the -bfile option to MEME.)
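As a rough illustration of the log likelihood ratio described above, the sketch below scores a set of aligned sites against a 0-order background, using natural logarithms and the observed letter frequencies as the motif model. The function name `log_likelihood_ratio` and the toy sites are assumptions made for this example; MEME's own computation may differ in detail (for instance in how priors are applied).

```python
import math
from collections import Counter

def log_likelihood_ratio(sites, background):
    """Sum of ln(p / f) over every letter of every aligned site.

    p is the observed frequency of the letter in its motif column and
    f is its background frequency (0-order background model).
    """
    width = len(sites[0])
    llr = 0.0
    for pos in range(width):
        counts = Counter(site[pos] for site in sites)
        for letter, count in counts.items():
            p = count / len(sites)
            llr += count * math.log(p / background[letter])
    return llr

# Toy data (hypothetical, not taken from the report).
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
print(log_likelihood_ratio(["AC", "AC"], uniform))
```

With two identical sites of width 2 and a uniform background, every letter contributes ln(4), so the total is 4 * ln(4).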
Simplified Position-Specific Probability Matrix
MEME motifs are represented by position-specific probability matrices that specify the probability of each possible letter appearing at each possible position in an occurrence of the motif. In order to make it easier to see which letters are most likely in each of the columns of the motif, the simplified motif shows the letter probabilities multiplied by 10 rounded to the nearest integer ("a" means 10). Zeros are replaced by ":" (the colon) for readability.
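The simplification rule can be sketched in a few lines. The helper name `simplify_row` is hypothetical, and rounding halves upward is an assumption (the text only says "rounded to the nearest integer"). The example input is the letter C at the first three positions of motif 5, computed from its eight listed sites.

```python
def simplify_row(probabilities):
    """Render one letter's probabilities as simplified-matrix symbols."""
    symbols = []
    for p in probabilities:
        digit = int(p * 10 + 0.5)  # nearest integer, halves rounded up (assumed)
        if digit == 0:
            symbols.append(":")    # zeros are shown as a colon
        elif digit == 10:
            symbols.append("a")    # "a" stands for 10
        else:
            symbols.append(str(digit))
    return "".join(symbols)

# C occurs in 7/8, 0/8 and 6/8 of the motif 5 sites at positions 1 to 3.
print(simplify_row([7 / 8, 0 / 8, 6 / 8]))
```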

Information Content Diagram

The information content diagram provides an idea of which positions in the motif are most highly conserved. Each column (position) in a motif can be characterized by the amount of information it contains (measured in bits). Highly conserved positions in the motif have high information; positions where all letters are equally likely have low information. (The information content is relative to the background letter frequencies which are given in the command line summary section.) The diagram is printed so that each column lines up with the same column in the simplified position-specific probability matrix above it. Columns in the information content diagram are colored according to the majority category of the letters occurring in that column of the alignment. If no letter category has frequency above 0.5, the column in the diagram is colored black. For DNA sequences, the letter categories contain one letter each. For proteins, the categories are based on the biochemical properties of the various amino acids. The categories and their colors are:

NUCLEIC ACIDS	COLOR
A	RED
C	BLUE
G	ORANGE
T	GREEN

AMINO ACIDS	COLOR	PROPERTIES
A, C, F, I, L, V, W and M	BLUE	Most hydrophobic [Kyte and Doolittle, 1982]
NQST	GREEN	Polar, non-charged, non-aliphatic residues
DE	MAGENTA	Acidic
KR	RED	Positively charged
H	PINK
G	ORANGE
P	YELLOW
Y	TURQUOISE

J. Kyte and R. Doolittle, 1982. "A Simple Method for Displaying the Hydropathic Character of a Protein", J. Mol Biol. 157, 105-132.

Summing the information content for each position in the motif gives the total information content of the motif (shown in parentheses to the left of the diagram). The total information content is approximately equal to the log likelihood ratio divided by the product of the number of occurrences and ln(2); for motif 1 in this report, 233 / (13 * ln(2)) is about 25.9 bits. The total information content gives a measure of the usefulness of the motif for database searches. For a motif to be useful for database searches, it must as a rule contain at least log_2(N) bits of information, where N is the number of sequences in the database being searched. For example, to effectively search a database containing 100,000 sequences for occurrences of a single motif, the motif should have an IC of at least 16.6 bits. Motifs with lower information content are still useful when a family of sequences shares more than one motif, since they can be combined in multiple motif searches (using MAST).
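The two rules of thumb in this paragraph are easy to check numerically. The sketch below uses the `llr`, `sites` and information content values printed in the summary lines of this report; the function names are made up for the example.

```python
import math

def information_content_bits(llr, sites):
    """Approximate total information content: llr / (sites * ln 2)."""
    return llr / (sites * math.log(2))

def min_bits_for_database(num_sequences):
    """Rule-of-thumb minimum information content: log2(N) bits."""
    return math.log2(num_sequences)

# motif number -> (llr, sites, information content reported by MEME)
reported = {
    1: (233, 13, 25.9),
    2: (171, 8, 30.9),
    3: (227, 22, 14.9),
    4: (202, 19, 15.3),
    5: (113, 8, 20.3),
}

for motif, (llr, sites, ic) in reported.items():
    estimate = information_content_bits(llr, sites)
    print(f"motif {motif}: estimated {estimate:.1f} bits, reported {ic} bits")

print(f"100,000 sequences need about {min_bits_for_database(100_000):.1f} bits")
```

The estimates agree with the reported values only approximately, because the `llr` values in the summary lines are rounded to integers.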

Multilevel Consensus Sequence
The multilevel consensus sequence corresponding to the motif is an aid in remembering and understanding the motif. It is calculated from the motif position-specific probability matrix as follows. Separately for each column of the motif, the letters in the alphabet are sorted in decreasing order by the probability with which they are expected to occur in that position of motif occurrences. The sorted letters are then printed vertically with the most probable letter on top. Only letters with probabilities of 0.2 or higher at that position in the motif are printed. As an example, the multilevel consensus sequence of motif 1 in the sample output is:
```
Multilevel TTATGTGAACGACGTCACACT
consensus  AA  T A G A GA     AA
sequence           T C TT     T
```
This multilevel consensus sequence says several things about the motif. First, the most likely form of the motif can be read from the top line as TTATGTGAACGACGTCACACT. Second, only the letter A has a probability of 0.2 or higher in position 3 of the motif, whereas both T and A have a probability of 0.2 or higher in position 1, and so on. Third, a rough approximation of the motif can be made by converting the multilevel consensus sequence into a regular expression for the motif.
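The procedure described above can be sketched as follows. The function name, the matrix layout (one row per motif position, columns in alphabet order) and the tie-breaking by alphabet order are assumptions for this illustration, and the small matrix is invented rather than taken from the report.

```python
def multilevel_consensus(pspm, alphabet="ACGT", threshold=0.2):
    """Return the consensus lines, most probable letters on the first line."""
    columns = []
    for row in pspm:
        # Sort letters by decreasing probability (ties keep alphabet order).
        ranked = sorted(zip(alphabet, row), key=lambda pair: -pair[1])
        columns.append([letter for letter, p in ranked if p >= threshold])
    depth = max(len(column) for column in columns)
    lines = []
    for level in range(depth):
        line = "".join(
            column[level] if level < len(column) else " " for column in columns
        )
        lines.append(line.rstrip())
    return lines

# Hypothetical 3-position motif; each row lists P(A), P(C), P(G), P(T).
toy = [
    [0.1, 0.6, 0.3, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.5, 0.0, 0.0, 0.5],
]
for line in multilevel_consensus(toy):
    print(line)
```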
Occurrences of the Motif
MEME displays the occurrences (sites) of the motif in the training set. The sites are shown aligned with each other, and the ten sequence positions preceding and following each site are also shown. Each site is identified by the name of the sequence where it occurs, the strand (if both strands of DNA sequences are being used), and the position in the sequence where the site begins. When the DNA strand is specified, `+' means the sequence in the training set, and `-' means the reverse complement of the training set sequence. (For `-' strands, the `start' position is actually the position on the positive strand where the site ends.) The sites are listed in order of increasing p-value, so the most statistically significant sites come first. The p-value of a site is computed from the match score of the site with the position-specific scoring matrix for the motif. The p-value gives the probability of a random string (generated from the background letter frequencies) having the same match score or higher. (This is referred to as the position p-value by the MAST algorithm.)
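One standard way to obtain such a p-value is to build the exact score distribution of a random string by dynamic programming over the motif positions. The sketch below does this for integer scores and a 0-order background; it illustrates the definition only and is not claimed to be the procedure MEME or MAST actually uses. The function name and the toy scoring matrix are assumptions.

```python
from collections import defaultdict

def score_pvalue(pssm, background, observed_score):
    """P(score of a random string >= observed_score).

    pssm is a list with one dict per motif position, mapping each letter
    to an integer score; background maps each letter to its frequency.
    """
    distribution = {0: 1.0}  # score -> probability after 0 positions
    for column in pssm:
        updated = defaultdict(float)
        for score, prob in distribution.items():
            for letter, letter_score in column.items():
                updated[score + letter_score] += prob * background[letter]
        distribution = updated
    return sum(p for s, p in distribution.items() if s >= observed_score)

# Toy matrix (hypothetical): each position scores 1 for A and 0 otherwise.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
toy_pssm = [{"A": 1, "C": 0, "G": 0, "T": 0}] * 2
print(score_pvalue(toy_pssm, uniform, 2))  # both positions must be A
print(score_pvalue(toy_pssm, uniform, 1))  # at least one A
```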
Block Diagrams of Motif Occurrences
The occurrences of the motif in the training set sequences are shown with MAST-style block diagrams. One diagram is printed for each sequence showing all the occurrences of the motif in that sequence. The sequences are sorted by the lowest p-value among all occurrences of the motif in a given sequence. (The p-value of an occurrence is the probability of a single random subsequence the length of the motif, generated according to the 0-order background model, having a score at least as high as the score of the occurrence.) When the DNA strand is specified, `+' means the motif appears from left to right on the sequence, and `-' means the motif appears from right to left on the complementary strand. A sequence position scale is shown at the end of each table of block diagrams. Very long sequences are shown with thick lines connecting the motifs and are not drawn to scale.
Motif in BLOCKS format or FASTA format
For use with BLOCKS tools, MEME prints the occurrences of the motif in BLOCKS format.
You can convert these blocks to PSSMs (position-specific scoring matrices), LOGOS (color representations of the motifs), phylogeny trees and search them against a database of other blocks by pasting everything from the "BL" line to the "//" line (inclusive) into the Multiple Alignment Processor. If you include the -print_fasta switch on the command line, MEME prints the motif sites in FASTA format instead of BLOCKS format.
Position-Specific Scoring Matrix
The position-specific scoring matrix corresponding to the motif is printed for use by database search programs such as MAST. This matrix is a log-odds matrix calculated by taking 100 times the log (base 2) of the ratio p/f at each position in the motif, where p is the probability of a particular letter at that position in the motif, and f is the background frequency of the letter (given in the command line summary section). This is the same matrix that is used above in computing the p-values of the occurrences of the motif in the Occurrences of the Motif and Block Diagrams of Motif Occurrences sections. The scoring matrix is printed "sideways"--columns correspond to the letters in the alphabet (in the same order as shown in the simplified motif) and rows correspond to the positions of the motif, position one first. The scoring matrix is preceded by a line starting with "log-odds matrix:" and containing the length of the alphabet, width of the motif, number of characters in the training set, the scoring threshold (obsolete) and the motif E-value.
Note: The probability p used to compute the PSSM is not exactly the same as the corresponding value in the Position Specific Probability Matrix (PSPM). The values of p used to compute the PSSM take into account the motif prior, whereas the values in the PSPM are just the observed frequencies of letters in the motif sites.
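The log-odds formula can be written out directly. In the sketch below the function names are hypothetical, scores are rounded to integers as an assumption, and every p must be greater than zero; as the note above says, MEME derives p with the motif prior applied, which this example does not attempt to reproduce. The background frequencies are the ones listed in the command line summary of this report.

```python
import math

def log_odds_score(p, f):
    """100 * log2(p / f), rounded to an integer; requires p > 0."""
    return round(100 * math.log2(p / f))

def pssm_from_probabilities(rows, background, alphabet="ACGT"):
    """One output row per motif position, columns in alphabet order."""
    return [
        [log_odds_score(p, background[letter]) for letter, p in zip(alphabet, row)]
        for row in rows
    ]

background = {"A": 0.279, "C": 0.225, "G": 0.216, "T": 0.281}

# Hypothetical single motif position with prior-adjusted probabilities.
position = [0.05, 0.85, 0.05, 0.05]
print(pssm_from_probabilities([position], background))
```

Letters that are more likely in the motif than in the background get positive scores, and letters that are less likely get negative scores.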
Position-Specific Probability Matrix
The motif itself is a position-specific probability matrix giving, for each position in the pattern, the observed frequency ("probability") of each possible letter. The probability matrix is printed "sideways"--columns correspond to the letters in the alphabet (in the same order as shown in the simplified motif) and rows correspond to the positions of the motif, position one first. The motif is preceded by a line starting with "letter-probability matrix:" and containing the length of the alphabet, width of the motif, number of occurrences of the motif, and the E-value of the motif.
Note: Earlier versions of MEME gave the posterior probabilities--the probability after applying a prior on letter frequencies--rather than the observed frequencies. These versions of MEME also gave the number of possible positions for the motif rather than the actual number of occurrences. The output from these earlier versions of MEME can be distinguished by "n=" rather than "nsites=" in the line preceding the matrix.
Regular Expression
This is the multilevel consensus expressed as a regular expression for convenience. Regular expressions can be used for searching against sequences (using, for example, PatMatch), but the search accuracy will usually be better with the PSSM (using, for example, MAST). MEME regular expressions are interpreted as follows: single letters match that letter; groups of letters in square brackets match any of the letters in the group.
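Because the expressions use only literal letters and bracketed groups, they can be handed unchanged to most regular expression engines. The sketch below uses Python's `re` module with the motif 5 expression and its first listed site (with the ten flanking positions on each side); the helper `find_motif` is a name chosen for this example. A lookahead is used so that overlapping matches are reported too.

```python
import re

MOTIF_5 = "C[AT][CA]AC[AT]CA[TC][AG]CAC[AG]"

def find_motif(regex, sequence):
    """Return 1-based start positions of all matches, overlapping included."""
    pattern = re.compile("(?=(" + regex + "))")
    return [match.start() + 1 for match in pattern.finditer(sequence)]

# First motif 5 site in chrX_1755713_1756114 with its flanking positions.
left_flank = "AGATCATACA"
site = "CACACACATACACA"
right_flank = "GGCGCACATC"
window = left_flank + site + right_flank

# Positions are relative to this 34-letter window, not to the full sequence.
print(find_motif(MOTIF_5, window))
```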
Motif Summary Tiling
The motif summary tiling is done using the same algorithm as used by MAST. The motif occurrences shown in the motif summary may not be exactly the same as those reported in each motif section, because only occurrences with a position p-value below 0.0001 that do not overlap other, more significant motif occurrences are shown. The format of the machine readable motif-summary is:
```
[sequence_name combined_p-value number_of_motif_occurrences [motif_number start_of_motif position_p-value]+]+
```
See the documentation for MAST output for the definition of position and combined p-values.
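A record in this format can be split into its parts as sketched below. The sketch assumes one whitespace-separated record per line, which the format description does not state explicitly; the function name and the sample record are invented for illustration and do not come from this report.

```python
def parse_motif_summary(line):
    """Parse one record: name, combined p-value, count, then triples."""
    tokens = line.split()
    name = tokens[0]
    combined_p = float(tokens[1])
    count = int(tokens[2])
    fields = tokens[3:]
    if len(fields) != 3 * count:
        raise ValueError("expected %d occurrence triples" % count)
    occurrences = [
        (int(fields[i]), int(fields[i + 1]), float(fields[i + 2]))
        for i in range(0, len(fields), 3)
    ]  # (motif_number, start_of_motif, position_p-value)
    return name, combined_p, occurrences

# Hypothetical record with two motif occurrences.
record = "seq1 1.0e-05 2 1 10 1.0e-06 3 50 2.0e-05"
print(parse_motif_summary(record))
```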
