2005 Sep 1;33(15):e135. doi: 10.1093/nar/gni131

Table 1.

Results of MOST evaluation on yeast datasets

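| Dataset | yst01g | yst02g | yst03m | yst04r | yst05r | yst06g | yst08r | yst09g | Total | Average |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Number of sequences | 8 | 4 | 8 | 6 | 3 | 7 | 11 | 16 | | |
| Sequence length | 1000 | 500 | 500 | 1000 | 500 | 500 | 1000 | 1000 | | |
| Number of known signals | 6 | 5 | 18 | 7 | 4 | 7 | 14 | 13 | 47 | |
| Phase 1: solid words extracted | 65 | 255 | 286 | 162 | 337 | 214 | 88 | 40 | | |
| Phase 2: clusters (80% similarity threshold) | 50 | 126 | 141 | 84 | 157 | 111 | 46 | 31 | | |
| Phase 3A: ext. clusters (95% threshold for the MV function) | 50 | 123 | 137 | 81 | 154 | 106 | 44 | 29 | | |
| Phase 3B: ext. clusters (80% threshold for the MV function) | 50 | 126 | 141 | 84 | 157 | 110 | 46 | 31 | | |
| Phase 1: signals found in solid words | 2 | 5 | 14 | 6 | 4 | 7 | 10 | 12 | 38 | |
| Phase 2: signals found in clusters | 2 | 5 | 14 | 6 | 4 | 7 | 10 | 12 | 38 | |
| Phase 3A: signals found in ext. clusters | 0 | 5 | 10 | 7 | 3 | 5 | 7 | 6 | 30 | |
| Phase 3B: signals found in ext. clusters | 1 | 5 | 16 | 7 | 4 | 7 | 9 | 11 | 40 | |
| Phase 1: sensitivity | 0.33 | 1.00 | 0.78 | 0.86 | 1.00 | 1.00 | 0.71 | 0.92 | | 0.83 |
| Phase 2: sensitivity | 0.33 | 1.00 | 0.78 | 0.86 | 1.00 | 1.00 | 0.71 | 0.92 | | 0.83 |
| Phase 3A: sensitivity | 0.00 | 1.00 | 0.56 | 1.00 | 0.75 | 0.71 | 0.50 | 0.46 | | 0.67 |
| Phase 3B: sensitivity | 0.17 | 1.00 | 0.89 | 1.00 | 1.00 | 1.00 | 0.64 | 0.85 | | 0.84 |
| Phase 2, maximal cluster: maximum number of signals per cluster | 1 | 4 | 9 | 5 | 3 | 6 | 8 | 8 | 28 | |
| Phase 2, maximal cluster: number of maximal clusters | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | | |
| Phase 2, maximal cluster: sensitivity | 0.17 | 0.80 | 0.50 | 0.71 | 0.75 | 0.86 | 0.57 | 0.62 | | 0.63 |
| Phase 3A, maximal cluster: maximum number of signals per cluster | 0 | 1 | 5 | 4 | 3 | 5 | 4 | 3 | 18 | |
| Phase 3A, maximal cluster: number of maximal clusters | - | 11 | 3 | 2 | 1 | 1 | 1 | 4 | | |
| Phase 3A, maximal cluster: sensitivity | 0.00 | 0.20 | 0.28 | 0.57 | 0.75 | 0.71 | 0.29 | 0.23 | | 0.42 |
| Phase 3B, maximal cluster: maximum number of signals per cluster | 1 | 3 | 8 | 3 | 3 | 6 | 4 | 7 | 24 | |
| Phase 3B, maximal cluster: number of maximal clusters | 1 | 1 | 1 | 3 | 1 | 1 | 2 | 1 | | |
| Phase 3B, maximal cluster: sensitivity | 0.17 | 0.60 | 0.44 | 0.43 | 0.75 | 0.86 | 0.29 | 0.54 | | 0.54 |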

Datasets are identified by the names originally used by Tompa et al. For each dataset, the number of sequences included, the length of the promoter sequences, and the number of known signals are reported. Rows four to seven describe the results obtained at the different MOST analysis steps and, for the third step, under two different conditions. The following eight rows show the number and the proportion (sensitivity) of known signals per dataset that are represented in the results of those analysis steps. The last part of the table shows, for each dataset, the number and the proportion (sensitivity) of known signals represented in the maximal cluster.
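
As the legend states, sensitivity is the proportion of known signals that are represented in the results of a given step. The following is a minimal sketch of that arithmetic for the Phase 1 row, using the counts copied from Table 1. The helper name `sensitivity` and the variable names are hypothetical and not part of MOST; rounding to two decimals is an assumption made to match the table's presentation.

```python
# Counts copied from Table 1, in dataset column order.
DATASETS = ["yst01g", "yst02g", "yst03m", "yst04r",
            "yst05r", "yst06g", "yst08r", "yst09g"]
KNOWN_SIGNALS = [6, 5, 18, 7, 4, 7, 14, 13]   # "Number of known signals"
PHASE1_FOUND = [2, 5, 14, 6, 4, 7, 10, 12]    # "Phase 1: signals found in solid words"


def sensitivity(found, known):
    """Proportion of known signals recovered (hypothetical helper, not MOST code)."""
    if known <= 0:
        raise ValueError("number of known signals must be positive")
    return found / known


# Per-dataset sensitivity, rounded to two decimals as in the table.
phase1_sensitivity = {
    name: round(sensitivity(found, known), 2)
    for name, found, known in zip(DATASETS, PHASE1_FOUND, KNOWN_SIGNALS)
}

for name, value in phase1_sensitivity.items():
    print(f"{name}: {value:.2f}")
```

The same helper applies to the other rows by swapping in the corresponding counts, for example the maximum number of signals per cluster for the maximal-cluster sensitivities.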