2005 Sep 1;33(15):e135. doi: 10.1093/nar/gni131

Table 1.

Results of MOST evaluation on yeast datasets

Dataset	yst01g	yst02g	yst03m	yst04r	yst05r	yst06g	yst08r	yst09g	Total	Average
Number of sequences	8	4	8	6	3	7	11	16
Sequence length	1000	500	500	1000	500	500	1000	1000
Number of known signals	6	5	18	7	4	7	14	13	47
Phase 1: solid words extracted	65	255	286	162	337	214	88	40
Phase 2: clusters (80% similarity threshold)	50	126	141	84	157	111	46	31
Phase 3A: ext. clusters (95% threshold for the MV function)	50	123	137	81	154	106	44	29
Phase 3B: ext. clusters (80% threshold for the MV function)	50	126	141	84	157	110	46	31
Phase 1: signals found in solid words	2	5	14	6	4	7	10	12	38
Phase 2: signals found in clusters	2	5	14	6	4	7	10	12	38
Phase 3A: signals found in ext. clusters	0	5	10	7	3	5	7	6	30
Phase 3B: signals found in ext. clusters	1	5	16	7	4	7	9	11	40
Phase 1: sensitivity	0.33	1.00	0.78	0.86	1.00	1.00	0.71	0.92		0.83
Phase 2: sensitivity	0.33	1.00	0.78	0.86	1.00	1.00	0.71	0.92		0.83
Phase 3A: sensitivity	0.00	1.00	0.56	1.00	0.75	0.71	0.50	0.46		0.67
Phase 3B: sensitivity	0.17	1.00	0.89	1.00	1.00	1.00	0.64	0.85		0.84
Phase 2: maximal cluster
Maximum number of signals per cluster	1	4	9	5	3	6	8	8	28
Number of maximal clusters	2	1	1	1	1	1	1	1
Sensitivity	0.17	0.80	0.50	0.71	0.75	0.86	0.57	0.62		0.63
Phase 3A: maximal cluster
Maximum number of signals per cluster	0	1	5	4	3	5	4	3	18
Number of maximal clusters	-	11	3	2	1	1	1	4
Sensitivity	0.00	0.20	0.28	0.57	0.75	0.71	0.29	0.23		0.42
Phase 3B: maximal cluster
Maximum number of signals per cluster	1	3	8	3	3	6	4	7	24
Number of maximal clusters	1	1	1	3	1	1	2	1
Sensitivity	0.17	0.60	0.44	0.43	0.75	0.86	0.29	0.54		0.54

Datasets are identified by the names originally used by Tompa et al. For each dataset, the number of sequences included, the length of the promoter sequences and the number of known signals are reported. Rows four to seven report the results of the successive MOST analysis steps and, for the third step, the results under two different conditions. The following eight rows show the number and the proportion (sensitivity) of known signals per dataset that are represented in the results of those analysis steps. The last part of the table shows, for each dataset, the number and the proportion (sensitivity) of known signals represented in the maximal cluster.
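
The sensitivity rows can be recomputed from the counts in the table as the number of known signals found divided by the number of known signals in the dataset. The sketch below is only an illustration of that arithmetic, with the values transcribed from Table 1; the names `KNOWN`, `FOUND`, `sensitivity` and `summary` are hypothetical and are not part of MOST. Recomputing in this way, the printed Total and Average columns agree with the first six datasets (yst01g to yst06g), with the average taken as the mean of the per-dataset sensitivities. This is an observation from the numbers, not something the caption states.

```python
# Counts transcribed from Table 1, in column order yst01g ... yst09g.
DATASETS = ["yst01g", "yst02g", "yst03m", "yst04r",
            "yst05r", "yst06g", "yst08r", "yst09g"]
KNOWN = [6, 5, 18, 7, 4, 7, 14, 13]  # number of known signals
FOUND = {  # known signals represented in the results of each step
    "Phase 1": [2, 5, 14, 6, 4, 7, 10, 12],
    "Phase 2": [2, 5, 14, 6, 4, 7, 10, 12],
    "Phase 3A": [0, 5, 10, 7, 3, 5, 7, 6],
    "Phase 3B": [1, 5, 16, 7, 4, 7, 9, 11],
}


def sensitivity(found, known):
    """Per-dataset sensitivity: signals found / known signals."""
    return [f / k for f, k in zip(found, known)]


def summary(found, known, n=6):
    """Total found and mean sensitivity over the first n datasets.

    n=6 is an assumption inferred from the printed totals, which
    match the sum over yst01g ... yst06g.
    """
    sens = sensitivity(found, known)[:n]
    return sum(found[:n]), sum(sens) / n


if __name__ == "__main__":
    for phase, found in FOUND.items():
        per_dataset = ["%.2f" % s for s in sensitivity(found, KNOWN)]
        total, mean = summary(found, KNOWN)
        print(phase, per_dataset, total, "%.2f" % mean)
```

The same two functions apply to the maximal-cluster rows by substituting the "Maximum number of signals per cluster" counts for the `FOUND` values.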