Table 2.
Summary for most significant groups of conserved word pairs
| Most significant word pair in consensus group |
| Group | Conserved word pairs (compilation of overlapping words) | Known transcription factors or motifs | χ² (Bonferroni-corrected p-value) | Median of min distance | Number of TCRs | Expression conditions with significant gene subsets (FDR significance) |
| 1 | G[AC]GATGAG, TGAAAATTTT | PAC, RRPE | 240.6 (10⁻⁴⁹) | 19±0.5 | 162 | Repressed in multiple environmental stresses (10⁻⁶) |
| 2 | ANTGAAA, GAAAAWT | RRPE (Overlap) | 96.9 (2 × 10⁻¹⁶) | 43±11 | 68 | Repressed in multiple environmental stresses (10⁻⁶) |
| 3 | CTCCCC, CCCTTA | Msn2/4p-like, (Overlap) | 53.8 (5 × 10⁻⁷) | 28±3.7 | 15 | Induced in multiple environmental stresses (10⁻⁶) |
| 4 | GGCGGGC, GTGGCA | Ume6p, Rpn4p | 43.7 (9 × 10⁻⁵) | 48±16 | 25 | Cadmium, diamide (10⁻⁴); MMS, heat shock (10⁻³) |
| 5 | CCTTTT, GAGAAA | Msn2/4p, Hsf1p | 56.2 (2 × 10⁻⁷) | 54±5.4 | 69 | Heat shock (10⁻⁴) |
| 6 | CCGCCG, ACCCCA | Ume6p, Mig1p | 41.9 (2 × 10⁻⁴) | 17±1.5 | 14 | Stationary phase (10⁻⁶) |
| 7 | CCGCGG, CGGAAA | Pdr1/3p, Unknown | 111 (2 × 10⁻¹⁹) | 44±12 | 21 | Diamide (10⁻³) |
| 8 | RACGCG, RCGAAA | Swi6p/Mbp1p, Swi4/6p | 83.0 (7 × 10⁻¹³) | 33±5.0 | 33 | Cell cycle, G1 phase (10⁻⁶) |
| 9 | GCACGTGC, ACTGTGGC | Cbf1p \| Pho4p, Met31/32p | 37.4 (2 × 10⁻³) | 22±2.5 | 22 | Cadmium (10⁻⁶) |
| 10 | T[AT]TTGTT, TGTTTAC | Fkh1/2p (Overlap) | 51.1 (2 × 10⁻⁶) | 57±6.9 | 48 | Cell cycle (10⁻³) |
| 11 | TTTGTT, TTTTTY | Fkh1/2p, TnC | 37.6 (2 × 10⁻³) | 49±4.4 | 267 | Late nitrogen depletion (10⁻³) |
| 12 | CCGATA, TCGTTT | Hap1p, Ecm22p \| Upc2p | 36.2 (4 × 10⁻³) | 41±5.9 | 28 | Ergosterol inhibition (10⁻⁴); MMS (DNA damage) (10⁻³) |
| 13 | TCGTTT, TATTGTT | Rox1p, Ecm22p \| Upc2p | 58.8 (4 × 10⁻⁸) | 55±0.5 | 69 | Early menadione (10⁻³) |
| 14 | TGACTC, TCTTTT | Gcn4, TnC | 35.6 | 59±9.1 | 63 | Amino-acid starvation (10⁻⁵) |
Statistics are listed for one representative word pair for each group of overlapping word pairs, numbered as in Figure 4. Multiple transcription factors that may bind the same sequence motif are separated by |. To summarize the close spacing between conserved word pairs, we report the median of the distribution of minimum distances in S. cerevisiae ± the standard deviation of the medians of the distributions of minimum distances in all four Saccharomyces genomes.
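The "Median of min distance" column can be illustrated with a minimal sketch. It assumes that the minimum distance is taken per promoter as the smallest gap between any occurrence of the two words, and that the ± term is a sample standard deviation; the table note specifies neither. The function names, the genome labels other than S. cerevisiae, and all positions are hypothetical toy values, not data from the study.

```python
from statistics import median, stdev

def min_distance(positions_a, positions_b):
    """Smallest absolute distance between any occurrence of word A and word B."""
    return min(abs(a - b) for a in positions_a for b in positions_b)

def median_min_distance(promoters):
    """Median, over promoters, of the per-promoter minimum word-pair distance."""
    return median(min_distance(a, b) for a, b in promoters)

def summarize(genomes, reference):
    """Return (median in the reference genome, SD of the medians across all genomes)."""
    medians = {name: median_min_distance(p) for name, p in genomes.items()}
    # Assumption: sample standard deviation (n - 1 denominator).
    return medians[reference], stdev(list(medians.values()))

# Hypothetical toy input: for each genome, a list of promoters, each given as
# (start positions of word A, start positions of word B).
genomes = {
    "S. cerevisiae": [([10, 80], [30]), ([5], [24, 90]), ([40], [58])],
    "genome_B": [([12], [32]), ([7], [27]), ([50], [70])],
    "genome_C": [([0], [18]), ([3], [21]), ([9], [28])],
    "genome_D": [([1], [20]), ([2], [21]), ([3], [23])],
}

ref_median, spread = summarize(genomes, "S. cerevisiae")
print(f"{ref_median}±{spread:.1f}")
```

The sketch reports the reference-genome median and the spread of the four per-genome medians as separate numbers, matching the "median ± standard deviation" layout of the column.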