Table 1.
Name | Description | Score | Instance | Alignment | Footprint | Distribution (COG) | Distribution [38] |
Minus strand | |||||||
STM1273 | Putative nitric oxide reductase | 0.848436 | CTTAATGTTTTCTTAAT | / | / | 1000 | All Salmonella only |
STM2132 | Pseudogene; frameshift; putative RBS for STM2133 | 0.814252 | TTTTAGATTCACTTAAT | / | / | 1000 | Some or all Salmonella only |
STM4596 | Paralog of E. coli ORF, hypothetical protein (AAC73478.1); BLAST hit to putative inner membrane protein | 0.806962 | TTTAATATTCACTTAAA | / | / | 1000 | Some Salmonella only |
STM3131 | Putative cytoplasmic protein; putative RBS for STM3130; putative first gene of operon with STM3130 (putative hypothetical protein) | 0.801641 | CTTAATTTTTACTTATT | / | / | 1000 | All Salmonella only |
STM1020 | Gifsy-2 prophage | 0.791616 | CTTATTGTTAAGTCAAT | / | / | 1000 | Other distributions |
stdA | STM3029; paralog of E. coli putative fimbrial-like protein (AAC73813.1); BLAST hit to putative fimbrial-like protein | 0.788548 | CAAAACATTAACTTAAT | / | / | 1000 | Subspecies 1 only? |
ugd | STM2080; S. typhimurium UDP-glucose 6-dehydrogenase | 0.781719 | CTCAGAATTAACTTAAT | m | + | 1100 | All nine genomes |
sinR | STM0304; S. typhimurium SINR protein. (SW:SINR_SALTY) transcriptional regulator | 0.780204 | CTTGATATCATCTTAAT | / | / | | Subspecies 1 only |
STM3131 | Putative cytoplasmic protein; putative RBS for STM3130; putative first gene of operon with STM3130 (putative hypothetical protein) | 0.772846 | CTTAATACTCACATTAT | / | / | 1000 | Other distributions |
STM4413 | Putative imidazolonepropionase and related amidohydrolases; putative RBS for STM4412; first gene of operon with STM4412 (D-galactonate transport) | 0.771153 | GTGAATGTTAAATTAAT | / | / | 1000 | Some or all Salmonella only |
ybdO | STM0606; ortholog of E. coli putative transcriptional regulator LYSR-type (AAC73704.1); BLAST hit to putative transcriptional regulator, LysR family | 0.769839 | CTTAATGTAGAGTTTAT | m | + | 1110 | All Salmonella only |
oraA | STM2828; ortholog of E. coli regulator, OraA protein (AAC75740.1); BLAST hit to regulator | 0.766748 | CTTGATGGTAATTTAAC | m | - | 1110 | All nine genomes |
sdhC | STM0732; Ortholog of E. coli succinate dehydrogenase, cytochrome b556 (AAC73815.1); Putative RBS for sdhD; first gene of putative operon encoding succinate dehydrogenase | 0.765950 | CTTATTATTCCCTTAAG | / | / | 1000 | All nine genomes |
ycaR | STM0987; Ortholog of E. coli ORF, hypothetical protein (AAC74003.1); BLAST hit to putative inner membrane protein; Putative RBS for kdsB; first gene of a putative operon with kdsB (CMP-3-deoxy-D-manno-octulosonate transferase) | 0.765889 | TTCAATATTAACATAAT | / | / | 1000 | All nine genomes |
lasT | STM4600; Ortholog of E. coli ORF, hypothetical protein (AAC77356.1); BLAST hit to putative tRNA/rRNA methyltransferase | 0.765754 | ATTTAGGATAATTTAAT | nd | / | 1110 | All nine genomes |
STM2137 | Putative cytoplasmic protein | 0.764036 | TTTAACCTTAATTTAAT | nd | / | 1100 | Some Salmonella only |
STM1672 | Putative cytoplasmic protein | 0.762904 | ATTAATAGTCACTTATT | / | / | 1000 | Subspecies 1 only? |
gcvA | STM2982; Ortholog of E. coli positive regulator of gcv operon (AAC75850.1); first gene of putative operon (gcvA, ygdD, ygdE containing a SAM-dependent methyltransferase) | 0.761166 | CTTAATGTCGAATGAAT | m | + | 1111 | All nine genomes |
ycgO | STM1801; Ortholog of E. coli ORF, hypothetical protein (AAC74275.1); BLAST hit to putative CPA1 family, Na:H transport protein | 0.760685 | TTTAACATTAACATAAT | m | + | 1110 | All nine genomes? |
STM2287 | Paralog of E. coli putative sulfatase/phosphatase (AAC75329.1); BLAST hit to putative cytoplasmic protein | 0.759519 | CTTATTATTCACATAAC | / | / | 1000 | Some or all Salmonella only? |
yebW | STM1852; Ortholog of E. coli ORF, hypothetical protein (AAC74907.1); BLAST hit to putative inner membrane lipoprotein | 0.754895 | CTCAATGTTAACTACTT | / | / | 1000 | All nine genomes? |
STM0897 | Hypothetical protein Fels-1 prophage | 0.754468 | CGTAAGGCTCTTTTAAT | / | / | 1000 | Some Salmonella only |
lpfA | STM3640; S. typhimurium long polar fimbrial protein A precursor; first gene of a putative fimbriae synthesis operon | 0.753228 | ATTAAGAATAAATTAAT | / | / | 1000 | Other distributions |
Plus strand | |||||||
yjdB* | STM4293; S. typhimurium hypothetical 61.6 kDa protein in basS/pmrA-adiY intergenic region. (SW:YJDB_SALTY) putative integral membrane protein; Putative RBS for basR; first gene of the putative operon (yjdB basR basS) | 0.930146 | CTTAAGGTTCACTTAAT | m | + | 1111 | All nine genomes |
ugd | STM2080; S. typhimurium UDP-glucose 6-dehydrogenase | 0.913666 | CTTAATATTAACTTAAT | m | + | 1100 | All nine genomes |
yfbE/ais | STM2297; Ortholog of E. coli putative enzyme (AAC75313.1); first gene of the yfbE operon; shared intergenic with ais | 0.912660 | CTTAATGTTAATTTAAT | m | + | 1111 | All nine genomes? |
STM1269*/STM1268 | Putative chorismate mutase; intergenic shared with STM1268 | 0.888478 | CTTAATGTTATCTTAAT | / | / | 1000 | All Salmonella only |
STM0692 | Paralog of E. coli nitrogen assimilation control protein (AAC75050.1); putative transcriptional regulator, LysR family | 0.814773 | CTTGATGTTGATTTAAT | / | / | 1000 | All Salmonella only |
ybjG/mdfA* | STM0865; Ortholog of E. coli orf, hypothetical protein (AAC73928.1); putative permease; intergenic shared with mdfA (multidrug translocase) | 0.810981 | CTTTAAGGTTAATTTAA | m | + | 1111 | All nine genomes |
STM2901 | Hypothetical protein putative cytoplasmic protein; located downstream of pathogenicity island 1 | 0.803712 | CTTAATATCAATATAAT | / | / | 1000 | Other distributions |
yhjC/yhjB | STM3607; Ortholog of E. coli putative transcriptional regulator LysR-type (AAC76546.1); intergenic shared with yhjB (putative transcriptional regulator) | 0.796967 | TTGAATATTAATTTAAT | nd | / | 1110 | All nine genomes? |
yjbE/pgi | STM4222; Ortholog of E. coli orf, hypothetical protein (AAC76996.1); BLAST hit to putative outer membrane protein; first gene of the putative operon (yjbE, yjbF, yjbG, yjbH) consisting of putative outer membrane (lipo)proteins; intergenic shared with pgi (glucosephosphate isomerase) | 0.791181 | TTTAATTTTAACTTATT | / | / | 1000 | All nine genomes? |
yibD* | STM3707; Ortholog of E. coli putative regulator (AAC76639.1); BLAST hit to putative glycosyltransferase | 0.790879 | CTTAATAGTTTCTTAAT | m | + | 1100 | Other distributions |
STM1926/flhC | Putative cytoplasmic protein; Putative RBS for STM1926; first gene of a putative operon with yecG (putative universal stress protein); shared intergenic with flhC and flhD (flagellar transcriptional activator) | 0.790699 | CCTAATGTTCACTTTTT | / | / | 1000 | Some or all Salmonella only |
STM0334/STM0335 | Putative cytoplasmic protein; shared intergenic with STM0335 | 0.789514 | TTTCATATTCATTTAAT | / | / | 1000 | Some Salmonella only |
ybdN | STM0605; Ortholog of E. coli orf, hypothetical protein (AAC73703.1); BLAST hit to putative 3-phosphoadenosine 5-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase; Putative RBS for ybdM; first gene of a putative operon with ybdM (hypothetical transcriptional regulator) | 0.788778 | ATTAATATAAATTTAAT | nd | / | 1100 | All nine genomes? |
glgB | STM3538; Ortholog of E. coli 1,4-alpha-glucan branching enzyme (AAC76457.1); BLAST hit to 1,4-alpha-glucan branching enzyme; Putative RBS for glgX; putative first gene of operon involved in glycogen synthesis | 0.779808 | TTTAAGGGTAGCTTAAT | m | - | 1111 | All nine genomes |
leuO | STM0115; S. typhimurium probable activator protein in leuabcd operon. (SW:LEUO_SALTY) putative transcriptional regulator (LysR family) | 0.776490 | ATTAATGTTAACTTTTT | m | - | 1111 | All nine genomes |
STM0343 | Paralog of E. coli orf, hypothetical protein (AAC75237.1); BLAST hit to AAC75237.1 identity in aa 10 - 512 putative Diguanylate cyclase/phosphodiesterase domain | 0.774271 | ATTAATGTTACTTTAGT | nd | / | 1100 | Subspecies 1 only |
orf242 | STM1390; S. typhimurium ORF242 (gi|4456866) putative regulatory proteins, merR family | 0.773644 | CTTAGTCTTCATTTGAT | / | / | 1000 | Other distributions |
STM1868A/mig-3 | Lytic enzyme; intergenic shared with mig-3 (phage assembly protein) | 0.773462 | CTTAATGATTATTTATT | / | / | 1000 | ? |
STM2763/STM2726 | Paralog of E. coli prophage CP4-57 integrase (AAC75670.1); BLAST hit to putative integrase; intergenic shared with STM2726 (putative inner membrane) | 0.772053 | ATTAATGTCCATTTAGT | / | / | 1000 | S. typhimurium only |
pntA | STM1479; Ortholog of E. coli pyridine nucleotide transhydrogenase, alpha subunit (AAC74675.1); Blast hit to AAC74675.1 pyridine nucleotide transhydrogenase (proton pump), alpha subunit; Putative RBS for pntB; first gene of the putative operon (pntA, pntB) | 0.770547 | TTTAATGTTAATTTCTT | m | - | 1111 | All nine genomes |
STM0057/cit2 | Putative citrate-sodium symport; intergenic shared with citC2 (citrate lyase synthetase) | 0.767968 | CTCATGGTTCATTGAAT | nd | / | 1110 | Other distributions |
yrbF | STM3313; Ortholog of E. coli putative ATP-binding component of a transport system (AAC76227.1); Blast hit to AAC76227.1 putative ABC superfamily (atp_bind) transport protein; Putative RBS for yrbE; RegulonDB:STMS1H003330; first gene of putative yrb operon (ABC transporter) | 0.766758 | CCTAATTTTGACTTTAT | m | + | 1111 | All nine genomes |
yejG | STM2220; Paralog of E. coli orf, hypothetical protein (AAC75242.1); Blast hit to putative cytoplasmic protein | 0.767099 | CTTTATGTTTATTTTAT | m | + | 1111 | All nine genomes |
slsA | STM3761; putative inner membrane protein | 0.765418 | CTTTATGTTATTTAAAT | nd | / | 1110 | Other distributions |
yhcN | STM3361; Ortholog of E. coli orf, hypothetical protein (AAC76270.1); Blast hit to putative outer membrane protein | 0.764452 | ATTAGTGTATACTTAAT | m | + | 1111 | All nine genomes? |
yceP | STM1161; Ortholog of E. coli orf, hypothetical protein (AAC74144.1); Blast hit to putative cytoplasmic protein | 0.764191 | TTTATTGTTCATATAAT | m | + | 1100 | All nine genomes |
STM4098 | putative arylsulfate sulfotransferase | 0.763003 | TCTAATATTTATTTAAT | nd | / | 1100 | Subspecies 1 only? |
stfA | STM0195; S. typhimurium major fimbrial subunit StfA | 0.762241 | ATCAATTTTAATTTAAT | / | / | 1000 | Some Salmonella only |
atpF | STM3869; Ortholog of E. coli membrane-bound ATP synthase, F0 sector, subunit b (AAC76759.1); Blast hit to membrane-bound ATP synthase, F0 sector, subunit b; Putative RBS for atpH; first gene of a putative operon encoding putative ATP synthase | 0.760841 | CAGAAGGTTAACTAGAT | m | + | 1111 | All nine genomes |
yegH/wza | STM2119; Ortholog of E. coli putative transport protein (AAC75124.1); Blast hit to putative inner membrane protein; intergenic shared with wza (putative polysaccharide export protein) | 0.760004 | ATTAATATTAAATGAAT | m | - | 1111 | All nine genomes |
yjgD/argI | STM4470; S. typhimurium hypothetical protein in argI-miaE intergenic region (ORF15.6). (SW:YJGD_SALTY) putative cytoplasmic protein; Putative binding site for ArgR; shared intergenic regions with argI (arginine ornithine transferase); first gene of a putative operon with miaE (tRNA hydroxylase) | 0.759514 | ATTAAAATTCACTTTAT | m | + | 1111 | All nine genomes |
sseJ/STM1630* | STM1631; S. typhimurium secreted effector; regulated by SPI-2; shared intergenic with STM1630 (putative inner membrane protein) | 0.758303 | CTTAAGAAATATTTAAT | / | / | 1000 | Some Salmonella only |
csrA | STM2826; S. typhimurium carbon storage regulator | 0.756990 | CTTAGGTTTAACAGAAT | m | + | 1111 | All nine genomes |
dinP/yafK | STM0313; Ortholog of E. coli damage-inducible protein P; putative tRNA synthetase (AAC73335.1); Blast hit to AAC73335.1 DNA polymerase IV, devoid of proofreading, damage-inducible protein P; intergenic shared with yafKJ (periplasmic protein, putative amido transferase) | 0.756938 | CATACTGTACACTTAAA | m | + | 1111 | All nine genomes |
STM0346 | Putative outer membrane protein; Homolog of ail and ompX | 0.756369 | CATTAGGTGCTCTTAAT | / | / | 1000 | Some Salmonella only |
ybfA/STM0707 | STM0708; Ortholog of E. coli orf, hypothetical protein (AAC73793.1); Blast hit to putative periplasmic protein; intergenic shared with STM0707 (hypothetical protein) | 0.754265 | ATTAGTATTAATTTAAC | m | + | 1111 | All nine genomes? |
yncD/STM1587 | STM1587; Ortholog of E. coli putative outer membrane receptor for iron transport (AAC74533.1); Blast hit to paral putative outer membrane receptor; intergenic shared with STM1586 (putative receptor) | 0.754063 | CATTTTCTTAACTTAAT | m | - | 1100 | All nine genomes |
yafC/STM0275 | STM0256; Ortholog of E. coli putative transcriptional regulator LysR-type (AAC73313.1); Blast hit to putative transcriptional regulator, LysR family; intergenic shared with STM0275 (drug efflux protein) | 0.753257 | CAAAATATCAATTTAAT | m | - | 1111 | Other distributions |
Name: name of the gene in the S. typhimurium genome (NC_003197). For genes that are divergently transcribed and share an intergenic region, the gene for which the motif is detected on the plus strand is listed first and the gene for which the motif is on the minus strand is listed after the slash.
Description: annotation of the encoded proteins and genome location of the genes (derived from GenBank and Sanger annotation).
Score: normalized score assigned to the respective motif by MotifLocator.
Instance: instance of the motif as detected in the respective intergenic sequence.
Distribution (COG): distribution of the protein as determined by our analysis. The distribution is given as a binary profile that indicates the presence (1) or absence (0) of the protein in species (serovars) of, respectively, Salmonella, E. coli, Shigella and Yersinia (for example, 1111: protein present in all four species; 1000: protein present in Salmonella species only).
Distribution [38]: distribution of the protein encoded by the corresponding gene in nine bacterial genomes as determined by McClelland et al. [38]. Proteins having close homologs in at least one Salmonella strain but not in E. coli or K. pneumoniae are indicated by 'some Salmonella only'. Genes that have close homologs in all genomes are indicated by 'all nine genomes'. Other combinations are indicated by 'other distributions'. A question mark indicates that the authors were not certain about the statement. Differences between the distribution determined by McClelland et al. and the one determined by our analysis are due to the different selection criteria used to identify close homologs (see Materials and methods).
Alignment: indicates whether the intergenic regions in the dataset could be locally aligned (nd, no local alignment detected that contained the original sequence of S. typhimurium; m, local alignment detected; /, the dataset only contained homologs from Salmonella species, so local alignments were considered noninformative).
Footprint: denotes whether the PmrA motif is conserved in the close homologs (+, the retrieved putative PmrA motif is conserved; -, the intergenic sequences of the orthologs could be locally aligned but the PmrA motif was not part of the conserved regions).
Most promising PmrAB targets that contained a PmrA motif matching the PmrA consensus (Figure 4) are in bold face. PmrA motifs that are experimentally validated in this study are indicated by an asterisk.
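The binary profile in the Distribution (COG) column can be read mechanically. The following is a minimal Python sketch of that decoding, assuming the profile is always a four-character string whose positions follow the order given in the legend (Salmonella, E. coli, Shigella, Yersinia). The names `GENERA` and `decode_profile` are hypothetical and are not part of the original analysis.

```python
from typing import List

# Position order as stated in the legend; the tuple name is illustrative only.
GENERA = ("Salmonella", "E. coli", "Shigella", "Yersinia")


def decode_profile(profile: str) -> List[str]:
    """Return the groups in which the protein is present for a profile such as '1100'."""
    # Reject anything that is not exactly four characters drawn from '0' and '1'.
    if len(profile) != len(GENERA) or set(profile) - {"0", "1"}:
        raise ValueError(f"expected four characters of 0/1, got {profile!r}")
    # Keep the group name wherever the corresponding position is '1'.
    return [genus for genus, bit in zip(GENERA, profile) if bit == "1"]


if __name__ == "__main__":
    for example in ("1111", "1100", "1000"):
        print(example, decode_profile(example))
```

For instance, the profile 1100 listed for ugd decodes to presence in Salmonella and E. coli and absence in Shigella and Yersinia.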