Table 1.
Predicted Sites That Were Experimentally Tested
Predicted pairs with ≥0 separation

Ranking | Site 1, site 2 | Predicted regions | Probability | Separation |
---|---|---|---|---|
1 | ArgR, ArgR | aroP-pdhR | 9.1 × 10⁻¹² | 3 bp |
8 | LexA, LexA | arsR | 6.7 × 10⁻⁷ | 0-30 bp |
9 | GalR, CRP | ppa-ytfQ | 1.6 × 10⁻⁶ | 0 bp |

Predictions from analysis of overlapping sitesᵃ

Ranking | Site 1, site 2 | Predicted regions | Significance index | Separation |
---|---|---|---|---|
7 | PhoB, PhoB | dinJ-yafL, yqeF-yqeG | 2.3 × 10⁻¹² | 0 bp |
14 | MetJ, MetJ | ybdH-ybdL | 6.0 × 10⁻¹² | 0 bp |

Predicted sites are ranked according to the probability of obtaining the observed number of hits for the most overrepresented bin or spacing, given the number expected by chance for that particular bin or spacing (“separation”). Predictions coming from our analysis of pairs separated by ≥0 bp and predictions from our analysis of overlapping sites are treated separately (see Methods).
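As a rough illustration of this ranking criterion, the sketch below scores each spacing bin by the chance of seeing at least the observed number of hits and then picks the most overrepresented bin. The Poisson null model is an assumption made only for illustration; the actual model is the one described in Methods. The function names and the example counts are hypothetical.

```python
import math


def poisson_upper_tail(observed, expected, max_terms=500):
    """P(X >= observed) for X ~ Poisson(expected).

    ASSUMPTION: a Poisson null model; the paper's Methods may use another.
    The upper tail is summed directly so that very small probabilities
    (on the order of 1e-12) are not lost to rounding in 1 - CDF.
    The sum is truncated after max_terms terms, which is adequate when
    the expected count is small.
    """
    if observed <= 0:
        return 1.0
    if expected <= 0:
        return 0.0
    # First term, P(X = observed), computed in log space.
    log_term = -expected + observed * math.log(expected) - math.lgamma(observed + 1)
    term = math.exp(log_term)
    total = 0.0
    k = observed
    for _ in range(max_terms):
        total += term
        k += 1
        term *= expected / k  # P(X = k) derived from P(X = k - 1)
    return min(total, 1.0)


def most_overrepresented_bin(observed_counts, expected_counts):
    """Return (bin_label, probability) for the bin with the smallest tail probability."""
    scored = {
        label: poisson_upper_tail(observed_counts[label], expected_counts[label])
        for label in observed_counts
    }
    best = min(scored, key=scored.get)
    return best, scored[best]


# Hypothetical counts for one pair of factors, keyed by separation bin.
observed = {"0 bp": 6, "3 bp": 1}
expected = {"0 bp": 0.2, "3 bp": 0.2}
best_bin, probability = most_overrepresented_bin(observed, expected)
print(best_bin, probability)
```

Pairs would then be ranked by this per-pair minimum probability, smallest first.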
ᵃ The yqeF-yqeG and dinJ-yafL IGRs each contain three adjacent 11-bp predicted PhoB sites; the ybdH-ybdL IGR contains four adjacent 8-bp predicted MetJ sites. For PhoB and MetJ, we searched the genome with a matrix consisting of two adjacent sites, because a matrix consisting of a single site is not specific enough to be useful. Thus, the triplets of sites presented above showed up in our analysis as overlapping dimers of sites (two 22-bp matrix hits overlapping by 11 bp in the case of PhoB, or two 16-bp matrix hits overlapping by 8 bp in the case of MetJ). In addition, we constructed a highly specific 33-bp matrix from all known and footprinted triplets of PhoB sites, which identifies a very small number of sites in the genome, including the known sites and the dinJ-yafL and yqeF-yqeG IGRs. We also constructed a highly specific 24-bp matrix from all known and footprinted triplets of MetJ sites, which identifies only one new IGR in the genome in addition to the known ones (the ybdH-ybdL IGR). This matrix actually predicts two 24-bp sites in the ybdH-ybdL upstream region, offset by 8 bp (i.e., a 32-bp pattern consisting of four consecutive motif instances). The fourth 8-bp site is weak, however; it does not show up when searching with the shorter 16-bp matrix. PhoB and MetJ also both showed up as highly significant in our spacing analysis with the pattern consisting of two dimers of sites separated by 0 bp (i.e., four adjacent sites: a 44-bp pattern for PhoB with a probability of 4.2 × 10⁻⁶, or a 32-bp pattern for MetJ with a probability of 1.1 × 10⁻⁷). However, only known sites contributed to these spacing patterns; thus, there are no new predictions fitting this pattern. For both PhoB and MetJ, there are more footprinted sites with triplets of adjacent sites than with strings of four adjacent sites. Our predictions in the table were based on significant triplets of sites.
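The arithmetic behind "overlapping dimers" can be sketched as follows: each hit of a two-site matrix spans two adjacent sites, so two hits whose start positions differ by exactly one site length together cover three adjacent sites. The function name, the coordinates, and the decision to ignore strand are assumptions made for illustration only, not the authors' implementation.

```python
def adjacent_site_runs(dimer_starts, site_len):
    """Group dimer-matrix hits whose starts differ by exactly one site length.

    Each hit spans two adjacent sites (2 * site_len bp). A chain of m hits,
    each shifted by site_len from the previous one, covers m + 1 adjacent
    sites. Returns a list of (run_start, n_sites, run_length_bp) tuples.
    ASSUMPTION: all hits are on the same strand and given as start coordinates.
    """
    runs = []
    chain_start = None
    prev = None
    hits_in_chain = 0
    for start in sorted(set(dimer_starts)):
        if prev is not None and start - prev == site_len:
            hits_in_chain += 1  # this hit extends the current chain by one site
        else:
            if chain_start is not None:
                n_sites = hits_in_chain + 1
                runs.append((chain_start, n_sites, n_sites * site_len))
            chain_start = start
            hits_in_chain = 1
        prev = start
    if chain_start is not None:
        n_sites = hits_in_chain + 1
        runs.append((chain_start, n_sites, n_sites * site_len))
    return runs


# Two 22-bp PhoB dimer hits overlapping by 11 bp (hypothetical coordinates)
# form one run of three adjacent 11-bp sites spanning 33 bp.
print(adjacent_site_runs([100, 111], site_len=11))
```

With `site_len=8`, two 16-bp hits whose starts differ by 8 bp are grouped the same way into a 24-bp triplet, matching the MetJ case described above.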