Table 3.
Stepwise linear regression statistics
| Condition | Word | S. cerevisiae M fc | S. cerevisiae p-value | S. cerevisiae ΔR2 | Three or more genomes M fc | Three or more genomes p-value | Three or more genomes ΔR2 | Wang et al. [20] p-value | Conlon et al. [21] p-value |
|---|---|---|---|---|---|---|---|---|---|
| Amino-acid starvation 0.5 h | AAATTT | -0.165 | < 2.0e-16 | 3.1% | -0.293 | < 2.0e-16 | 6.6% | 1.3e-37 | 2.0e-07 |
| | GATGAG | - | - | - | -0.333 | 6.7e-16 | 4.1% | 1.6e-30 | 8.5e-03 |
| | AAGGGG | 0.209 | 3.2e-14 | 1.8% | 0.455 | < 2.0e-16 | 3.5% | 1.6e-22 | 7.9e-05 |
| | TGTGGC | 0.094 | 1.1e-03 | 0.6% | 0.283 | 2.9e-07 | 1.6% | 5.0e-07 | 8.3e-13 |
| | CCCTTA | 0.300 | 2.0e-16 | 1.7% | 0.363 | < 2.0e-16 | 1.4% | 3.2e-06 | 3.5e-03 |
| | TGACTC | 0.229 | 4.6e-10 | 0.8% | 0.311 | 2.2e-11 | 1.0% | 4.6e-01 | 1.1e-03 |
| | AAATTT • GATGAG | - | - | - | -0.266 | 9.1e-09 | 0.5% | - | - |
| | CACGTG | 0.045 | 3.5e-01 | 0.5% | 0.146 | 8.6e-03 | 0.5% | 1.9e-07 | 1.2e-08 |
| | CACGTG • TGTGGC | 0.443 | 3.8e-10 | 0.5% | 0.749 | 1.0e-12 | 0.9% | - | - |
| | GTGAAA | -0.066 | 1.1e-03 | 0.3% | -0.082 | 4.6e-03 | 0.1% | - | - |
| | TCTTTT | -0.022 | 2.3e-02 | 0.1% | - | - | - | - | - |
| | Total ΔR2 | | | 9.6% | | | 20.2% | | |
| Stationary phase YPD 10 h | AAATTT | -0.218 | < 2.0e-16 | 3.2% | -0.377 | < 2.0e-16 | 5.8% | 5.5e-39 | N/R |
| | AAGGGG | 0.233 | 1.7e-11 | 0.9% | 0.591 | < 2.0e-16 | 4.0% | 4.5e-26 | N/R |
| | CCCTTA | 0.460 | < 2.0e-16 | 3.7% | 0.579 | < 2.0e-16 | 2.2% | 3.0e-07 | N/R |
| | GATGAG | - | - | - | -0.242 | 4.1e-06 | 1.8% | 4.4e-18 | N/R |
| | ACCCCA | 0.224 | 3.0e-03 | 0.3% | 0.459 | 1.5e-06 | 1.0% | - | N/R |
| | AAATTT • GATGAG | - | - | - | -0.287 | 1.7e-06 | 0.4% | - | N/R |
| | CCGCCG | 0.333 | 5.1e-07 | 0.8% | 0.208 | 1.5e-02 | 0.3% | - | N/R |
| | ACCCCA • CCGCCG | 0.294 | 1.8e-02 | 0.1% | 0.807 | 5.6e-05 | 0.3% | - | N/R |
| | GTGAAA | -0.090 | 4.2e-04 | 0.2% | -0.122 | 1.0e-03 | 0.2% | - | N/R |
| | Total ΔR2 | | | 9.4% | | | 16.0% | | |
| Terbinafine 3 h | TGACTC | 0.162 | < 2.0e-16 | 3.5% | 0.261 | < 2.0e-16 | 5.1% | 1.3e-14 | N/R |
| | TCGTTT | 0.071 | < 2.0e-16 | 2.0% | 0.132 | < 2.0e-16 | 3.3% | 2.5e-24 | N/R |
| | TGAAAC | -0.055 | 1.3e-12 | 1.1% | -0.077 | 9.50e-11 | 0.9% | 4.0e-03 | N/R |
| | GATGAG | -0.029 | 1.7e-03 | 0.3% | -0.047 | 6.70e-06 | 0.4% | - | N/R |
| | AAGGGG | 0.025 | 1.1e-02 | 0.1% | 0.050 | 5.40e-04 | 0.3% | 2.4e-01 | N/R |
| | CCGATA | -0.008 | 6.5e-01 | 0.1% | 0.004 | 8.6e-01 | 0.1% | - | N/R |
| | CCGATA • TCGTTT | 0.080 | 9.4e-06 | 0.3% | 0.146 | 3.2e-07 | 0.5% | - | N/R |
| | CCCTTA | -0.021 | 5.0e-02 | 0.1% | -0.038 | 1.1e-02 | 0.1% | - | N/R |
| | Total ΔR2 | | | 7.6% | | | 10.8% | | |
Words and pairwise interaction terms are reported in the order in which they were selected by the stepwise linear regression procedure performed on conserved words. The influence terms (Mf), associated p-values, and increases in R-squared were computed using the statistical package R [51]. Wang et al. [20] and Conlon et al. [21] previously fit regression models using sequence features derived from S. cerevisiae. The p-values of the most similar sequence features in their regression models are reported in the last two columns, respectively, where available; sequence features that were more significant in this analysis are indicated in bold. Dashes indicate sequence features that were not significant in the Wang et al. [20] or Conlon et al. [21] analyses. 'N/R' indicates gene-expression data that were not analyzed by Conlon et al. [21].
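
The order-of-selection and ΔR2 columns can be illustrated with a minimal sketch of greedy forward selection, in which the term that raises R-squared the most is added at each step. This is not the authors' R code. The function names, the toy word counts and fold changes, and the `min_gain` stopping threshold are all assumptions made for illustration; the original analysis may have used a different entry criterion, such as a p-value cutoff.

```python
# Illustrative sketch only: greedy forward selection by R^2 gain.
# All names, data, and the min_gain threshold are hypothetical.

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def r_squared(columns, y):
    """R^2 of a least-squares fit of y on an intercept plus the given columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)]
           for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    beta = solve(xtx, xty)
    mean_y = sum(y) / n
    ss_res = sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
                 for row, yi in zip(design, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def forward_stepwise(features, y, min_gain=0.01):
    """Return (term, delta R^2) pairs in the order the terms were selected."""
    selected, steps, current = [], [], 0.0
    remaining = dict(features)
    while remaining:
        base = [features[name] for name in selected]
        gains = {name: r_squared(base + [col], y) - current
                 for name, col in remaining.items()}
        best = max(gains, key=gains.get)
        if gains[best] < min_gain:  # assumed stopping rule
            break
        selected.append(best)
        current += gains[best]
        steps.append((best, gains[best]))
        del remaining[best]
    return steps

# Toy data: per-gene word counts and a noise-free log fold change.
aaattt = [0, 1, 2, 0, 1, 2, 0, 1]
gatgag = [1, 0, 0, 1, 1, 0, 1, 0]
features = {
    "AAATTT": aaattt,
    "GATGAG": gatgag,
    # Pairwise interaction term: product of the two word counts.
    "AAATTT • GATGAG": [p * q for p, q in zip(aaattt, gatgag)],
}
fold_change = [-0.3 * p - 0.2 * q for p, q in zip(aaattt, gatgag)]

steps = forward_stepwise(features, fold_change)
for term, gain in steps:
    print(f"{term}: delta R^2 = {gain:.1%}")
```

On this toy data the two single words are selected in turn, and the interaction term is rejected because it adds nothing once both words are in the model. With real expression data each selected term would contribute a smaller ΔR2, as in the table.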