Zwir et al. 10.1073/pnas.0408238102.

Supporting Information

Files in this Data Supplement:

Supporting Text
Supporting Figure 3
Supporting Table 3
Supporting Figure 4
Supporting Figure 5




Supporting Figure 3

Fig. 3. Using gene promoter scan (GPS) to build profiles of PhoP-regulated promoters. Profiles are groups of promoters sharing common sets of features (i.e., gene expression levels, PhoP box submotifs, the orientation and distance of the PhoP box relative to the RNA polymerase site, class of σ70 promoter, the presence of potential binding sites for 60-plus transcription factors, and whether the position of the PhoP box suggests a promoter is activated or repressed). The identification of a feature in a promoter is based on measuring the degree of matching between a promoter instance and a model that represents that feature, which results in a vector of [0, 1] values where 1 (red) corresponds to maximum matching, and 0 (green) corresponds to the absence of the feature. The profile identification process initially groups promoters into subsets by using fuzzy clustering (1). Individual genes are allowed to have more than one promoter because more than one PhoP box candidate can be identified in an intergenic region. In addition, promoters for the same gene in different genomes are considered separately (black and blue colored names correspond to genes in the Escherichia coli and Salmonella genomes, respectively). (A) Expression analysis resulted in three major groups, where E1 and E2 correspond to up-regulated genes and E3 harbors down-regulated genes. (B) Motif analysis resulted in four groups (M1–M4), which are detailed in Fig. 2C (M0 corresponds to PhoP-regulated genes that do not have a PhoP binding site). (C) The PhoP box could be present in the opposite (O1) or the same (O2) orientation as the regulated ORF (O0 corresponds to PhoP-regulated genes that do not have a PhoP binding site). (D) Promoter analysis revealed three groups (P1–P3) corresponding to types and location of σ70 promoters.
Lanes: 1, close class II; 2, close class I; 3, medium class II; 4, medium class I; 5, remote class II; 6, remote class I (P0 corresponds to PhoP-regulated genes that do not have a PhoP binding site). (E) Presence of transcription factor binding sites in PhoP-regulated promoters. Lanes: 1, OxyR; 2, FruR; 3, DeoR; 4, MalT; 5, MelR; 6, CytR; 7, GlpR; 8, ArcA; 9, FNR; 10, RcsB; 11, Fur; 12, ArgR; 13, RhaS; 14, AraC; 15, CRP; 16, DnaA; 17, YhiW; 18, Lrp; 19, NarL; 20, FIS; 21, IHF; 22, OmpR; 23, PmrA (I0 corresponds to PhoP-regulated genes that do not have a PhoP binding site). Colors indicate the plausibility of interaction with the PhoP box based on evaluation of the distance between the sites (i.e., green, low; red, high). (F) Activated/repressed analysis discriminates among three groups (A1–A3) corresponding to activated, repressed, and divergently transcribed genes, respectively. (G and H) Profiles were created incrementally by combining promoters from A–F. New profiles are created whenever adding a feature generates inconsistencies in the profiles being combined (light blue ring in GI and GII). The profile () (light blue ring in GI) is partitioned into three profiles when the motif feature is introduced (open light blue rings in GII). Therefore, the tan-colored ball in HI illustrates a level-3 profile, encompassing promoters that belong to the same expression group (E1), PhoP submotif (M2), and promoter class (P2). New profiles are built by hierarchical combination of ancestor profiles and are evaluated as a tradeoff between the significance of the intersection of the ancestor profiles (green, low; red, high), instead of the number of shared promoters (circle size), and the number and quality of shared features. The figure illustrates only a substructure of the complete hierarchical representation of profiles generated by GPS.


1. Gasch, A. P. & Eisen, M. B. (2002) Genome Biol. 3, RESEARCH0059.
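
The legend above describes each promoter as a vector of [0, 1] matching degrees that is grouped by fuzzy clustering, so a promoter can belong partly to more than one group. As an illustration only, the following sketch computes memberships with the standard fuzzy c-means formula. The function name, the fuzzifier m = 2, the Euclidean distance, and the two-feature example vectors are assumptions made for this sketch and are not taken from the GPS method.

```python
import math


def fuzzy_memberships(point, centroids, m=2.0):
    """Return the degree of membership of `point` in each cluster.

    Uses the standard fuzzy c-means membership formula:
        u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1))
    where d_i is the Euclidean distance from the point to centroid i.
    The memberships are in [0, 1] and sum to 1.
    """
    dists = [math.dist(point, c) for c in centroids]

    # A point lying exactly on one or more centroids belongs only to those.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]

    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
        for d_i in dists
    ]


# Hypothetical feature vector (two matching degrees in [0, 1]) and
# two hypothetical cluster centroids.
promoter = [0.9, 0.1]
centroids = [[1.0, 0.0], [0.0, 1.0]]
memberships = fuzzy_memberships(promoter, centroids)
print(memberships)
```

With these made-up values the promoter belongs almost entirely to the first cluster while keeping a small membership in the second, which is the property that lets a single promoter contribute to several groups.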





Supporting Figure 4

Fig. 4. Examples of profiles inferred by GPS. Each profile is defined by the features shared by its promoters (e.g., ), where the superscript denotes the hierarchical level of the association and the subscript indicates the profiles being combined. The profile prototypes are locally rebuilt at each level of the feature space for each promoter association and type of feature. The features are arranged as a unit-interval vector that can be interpreted as the degree/probability of matching between features and promoter instances and are then linked by fuzzy/probabilistic operators (1, 2). Profile quality is evaluated as a tradeoff between the extent of the profile, measured as the probability of intersection PI of the ancestor profiles that were merged to generate it (green, low; red, high), and the quality of matching between promoters and features of a profile, and the number of such shared features, measured as the distance between promoters and the centroids (red lines) of the profile by the similarity of intersection SI (small circle size or narrow line, low; big circle size or thick line, high). Similarity is measured in a unit-interval [0,1] scale. Promoters for the same gene in different genomes are considered separately (black and blue colored names correspond to genes in the E. coli and Salmonella genomes, respectively). (A) Profile corresponding to promoters that belong to the same expression group (E1), PhoP box submotif (M2), and promoter class (P1). It harbors not only the prototypical PhoP-regulated phoP and mgtA promoters but also the yhiW promoter, which was not known to be under PhoP control. (B) Profile that comprises promoters that share the promoter class (P2), PhoP box orientation (O1), and regulatory interactions (I3).

1. Beer, M. A. & Tavazoie, S. (2004) Cell 117, 185-198.

2. Pedrycz, W., Bonissone, P. P. & Ruspini, E. H. (1998) Handbook of Fuzzy Computation (Institute of Physics, Bristol, U.K.).
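
The legend states that feature memberships are linked by fuzzy/probabilistic operators but does not say which ones. As a hedged illustration, the sketch below combines two ancestor profiles with two common choices of fuzzy intersection, the minimum t-norm and the product t-norm. The function names, the 0.5 cutoff, and the promoter names and membership values are all hypothetical, and the code does not reproduce the PI or SI measures used in the figure.

```python
def t_norm_min(a, b):
    """Minimum t-norm: the classical fuzzy intersection."""
    return min(a, b)


def t_norm_product(a, b):
    """Product t-norm: intersection of independent probabilities."""
    return a * b


def combine_profiles(profile_a, profile_b, t_norm=t_norm_min, threshold=0.5):
    """Intersect two profiles given as {promoter: membership in [0, 1]}.

    A promoter is kept in the combined profile only if it appears in both
    ancestors and its combined membership reaches `threshold`.
    """
    combined = {}
    for name in sorted(profile_a.keys() & profile_b.keys()):
        degree = t_norm(profile_a[name], profile_b[name])
        if degree >= threshold:
            combined[name] = degree
    return combined


# Hypothetical memberships of three placeholder promoters in an
# expression group and in a submotif group.
expression_group = {"promoter_1": 0.9, "promoter_2": 0.8, "promoter_3": 0.3}
motif_group = {"promoter_1": 0.7, "promoter_2": 0.4, "promoter_3": 0.9}

by_min = combine_profiles(expression_group, motif_group)
by_product = combine_profiles(expression_group, motif_group, t_norm_product)
print(by_min)
print(by_product)
```

In this toy case only promoter_1 matches both ancestor profiles strongly enough to survive the intersection under either operator. The product t-norm always gives a value no larger than the minimum, so it is the stricter of the two for a fixed cutoff.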





Supporting Figure 5

Fig. 5. Profiles that identify and distinguish groups of acid resistance genes regulated by PhoP. (A) Genes were grouped by their expression similarity by using clustering: E1 and E2 consist of up-regulated genes, and E3 harbors down-regulated genes. The number of promoters in each group is illustrated by the circle size. (B) Two profiles distinguish genes from the same expression group E2. One profile contains several acid resistance structural genes, such as dps and gadA, that do not have a recognizable PhoP box. A second profile harbors the PhoP box-containing promoters of the acid resistance regulatory genes yhiE and yhiW, which share the promoter class and distance of the PhoP box to the RNA polymerase site (P3). (C) Finally, one profile comprises a different set of acid resistance structural genes, including hdeD and hdeAB, which are predicted to have a class II promoter with a PhoP box close to the RNA polymerase site (P1); another two profiles distinguish between the acid resistance regulatory genes yhiE and yhiW, suggesting that they may be regulated in a different fashion.