TABLE 3.
SSH^a sequence and sequence type | Length (bp) | G+C content (%)^b | Presence in sequenced genome^c: Bpm | Bt | Bm | Best BLASTX match; comments (GenBank accession no.) | % Identity | Length (no. of amino acids) | E value |
---|---|---|---|---|---|---|---|---|---|
Sequences present in strain 338 but not 520 | |||||||||
Recombination related | |||||||||
338-B3 (DQ351720) | 1,097 | 55.7 | − | − | − | DNA helicase-related protein (Xanthomonas campestris) (NP_637459) | 32 | 364 | 2e−45 |
338-2D1 | 760 | 57.8 | − | − | − | DNA helicase-related protein (Xanthomonas campestris) (NP_637459); different region of same protein as 338-B3 | 72 | 248 | 7e−83 |
338-B20 | 333 | 54.1 | − | − | − | Uncharacterized protein (Rubrivivax gelatinosus) (ZP_00241526) | 80 | 110 | 4e−43 |
338-B20 (second match) | | | | | | Membrane proteins, DNA recombination protein RmuC (Salmonella and others) (NP_457782) | 54 | 110 | 5e−26 |
Bacteriophage related | |||||||||
338-2C9 | 509 | 51.3 | − | p | − | Putative transmembrane protein (Ralstonia solanacearum) (NP_520413) | 74 | 50 | 1e−15 |
338-2C9 (second match) | | | | | | DNA methylase of bacteriophage ΦE125 (B. thailandensis) (AAL47559) | 96 | 28 | 3e−8 |
Transcriptional regulators | |||||||||
338-B7 | 482 | 54.8 | +1 | − | − | DeoR family transcriptional regulator in RD6/GI5 (BPSL0939) | 99 | 160 | 4.5e−82 |
338-2C3 | 305 | 57.4 | − | − | − | Transcriptional regulator (Ralstonia eutropha) (ZP_00169018) | 47 | 59 | 2e−13 |
Enzymes | |||||||||
338-2A12 | 283 | 57.6 | − | − | − | Maleylacetate reductase (Ralstonia sp.) (AAS87585) | 55 | 68 | 4e−16 |
338-2D9 | 480 | 59.2 | − | − | − | Alcohol dehydrogenase (Polaromonas sp.) (ZP_00364129) | 72 | 159 | 5e−60 |
Hypothetical proteins | |||||||||
338-B8 | 425 | 50.6 | − | − | − | Hypothetical protein (Escherichia coli O157:H7) (NP_313283) | 32 | 145 | 3e−13 |
338-2C4 | 616 | 52.6 | − | − | − | Hypothetical protein (Rhodopseudomonas palustris) (NP_949350) | 31 | 217 | 5e−16 |
338-B4 | 282 | 53.9 | +1 | − | − | Hypothetical protein in RD6/GI5 (BPSL0942) | 100 | 88 | 2e−42 |
338-2D7 | >624 | 48.9 | − | − | − | Hypothetical protein (Chromobacterium violaceum) (AAQ61798) | 39 | 94 | 3e−10 |
No significant BLASTX matches | |||||||||
338-B1 | 190 | 50.0 | − | − | − | ||||
338-B16 | 374 | 48.1 | − | − | − | ||||
338-2A7 | 292 | 44.5 | − | − | − | ||||
338-2B2 | 429 | 50.4 | − | − | − | ||||
338-2B4 | 333 | 51.1 | +2 | + | + | ||||
338-2B7 | 426 | 57.5 | − | − | − | ||||
338-2B10 | 282 | 43.3 | − | − | − | ||||
338-2D3 | 331 | 52.3 | − | − | − | ||||
Sequences present in strain 520 but not 338 | |||||||||
Mobile elements | |||||||||
520-E15 | 337 | 54.0 | − | − | − | Putative transposase (Burkholderia fungorum) (ZP_00283626) | 75 | 29 | 1e−7 |
520-E18 | 335 | 59.4 | +2p | − | p | Putative transposase (BPSS2148) | 97 | 49 | 8e−21 |
520-E33 | 158 | 58.2 | +2 | − | − | Putative transposase (BPSS2148); different region but same protein as 520-E18 | 100 | 52 | 8.7e−23 |
520-2F1 | 420 | 59.5 | +2 | − | − | Putative transposase (BPSS2148); different region but same protein as 520-E18 and 520-E33 | 99 | 128 | 2e−66 |
Secretion related | |||||||||
520-E12 | 765 | 53.9 | − | − | − | Hypothetical SecA-related protein (Photobacterium profundum) (YP_133346) | 48 | 218 | 2e−53 |
Lipoprotein | |||||||||
520-E44 | 202 | 56.4 | +1 | + | − | Putative lipoprotein (BPSL2045) | 97 | 45 | 5e−19 |
Enzymes | |||||||||
520-E19 | 759 | 53.9 | − | − | − | Appr-1-p processing enzyme family (Nitrosomonas europaea) (NP_841411) | 77 | 127 | 5e−53 |
520-E19 (second match) | | | | | | Conserved hypothetical protein (Synechocystis spp.) (NP_942395) | 65 | 119 | 3e−43 |
520-E1 | 374 | 50.0 | +1p | − | p | Conserved hypothetical protein (Synechocystis spp.) (NP_942395) | 68 | 29 | 4e−5 |
520-E1 (second match) | | | | | | Appr-1-p processing enzyme family (Nitrosomonas europaea) (NP_841411) | 56 | 30 | 0.008 |
520-2F8 | 314 | 56.7 | +1 | p | + | Molybdopterin oxidoreductase (BPSL2207) | 100 | 54 | 3e−25 |
Hypothetical or uncharacterized proteins | |||||||||
520-2E7 | 314 | 60.2 | − | − | − | Uncharacterized protein (Microbulbifer degradans) (ZP_00318360) | 55 | 103 | 7e−25 |
520-2F2 | 773 | 54.1 | +1 | − | − | Hypothetical protein in RD7/GI6 (BPSL1146); variation in the C terminus | 80 | 111 | 7e−40 |
520-2E10 (DQ351721) | 519 | 54.0 | +1p | + | − | Hypothetical protein (BPSL2048) | 49 | 123 | 6e−26 |
520-E35 (DQ351716) | 308 | 52.9 | − | + | − | Hypothetical protein (BPSL2048A) | 59 | 101 | 3e−24 |
520-2G6 | 373 | 53.9 | − | − | − | Hypothetical protein (B. mallei) (YP_105718) | 47 | 113 | 1e−23 |
No significant BLASTX matches | |||||||||
520-E16 | 134 | 57.0 | − | − | − | ||||
520-E10 | 529 | 54.6 | − | − | − | ||||
520-2E1 | 233 | 52.8 | − | − | − | ||||
520-2F11 | 602 | 54.5 | − | − | − | ||||
520-2F6 | 814 | 50.7 | − | − | − | ||||
Sequences present in strains 338 and 520 | |||||||||
338-2D10 (DQ351718) | 370 | 56.2 | − | + | − | Bacteriophage protein from Φ1026b (B. pseudomallei 1026b) (NP_945078) | 87 | 100 | 3e−47 |
338-2D10 (second match) | | | | | | Bacteriophage protein from ΦE125 (B. thailandensis) (NP_536399) | 84 | 100 | 6e−45 |
338-2B9 | 663 | 47.1 | +2 | − | − | Putative exported protein (BPSS0658) | 100 | 162 | 4e−78 |
520-E36 (DQ351719) | 150 | 58.0 | − | − | − | Putative transposase (Streptomyces avermitilis) (NP_821845) | 48 | 45 | 0.023 |
520-2G9 (DQ351717) | 159 | 57.2 | − | − | − | ISRSO16 transposase ORFB (R. solanacearum) (NP_523187) | 80 | 51 | 4e−18 |
338-B14 | 181 | 54.1 | − | − | − | Hypothetical protein (Methylococcus capsulatus) (YP_115042) | 90 | 31 | 6e−9 |
338-B18 | 206 | 60.2 | +2 | − | − | Hypothetical protein in GI14 (BPSS0655) | 98 | 68 | 4e−34 |
338-2A1 | >630 | 47.0 | +2 | p | + | Hypothetical proteins (BPSS1753) | 100 | 68 | 2e−33 |
520-E42 | 303 | 54.1 | +1 | − | − | Hypothetical protein in GI7 (BPSL1385) | 100 | 71 | 7e−34 |
No significant BLASTX matches | |||||||||
338-B12 | 692 | 55.6 | − | − | − | ||||
338-2C5 | 429 | 51.3 | +1 | − | + | Overlaps BPSL2558 by 10 bp but lies mainly in the gap between BPSL2558 and BPSL2559 |
^a GenBank accession numbers are indicated in parentheses for those novel sequences used in the VAT analysis.
^b G+C content of the subtracted sequence.
^c The presence (+) or absence (−) of the subtracted sequence, based on >90% sequence identity by BLASTN, is indicated for the genome-sequenced strains of B. pseudomallei (Bpm), B. thailandensis (Bt), and B. mallei (Bm). For B. pseudomallei, the number of the matching chromosome is indicated. p, a partial match, where the match does not extend over the entire length of the subtracted sequence.
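
The G+C and presence columns can be illustrated with a minimal Python sketch. It is not the procedure used to build the table. The function names `gc_content` and `presence_call` are hypothetical, and so is the rule that a match counts as complete only when the aligned length equals the query length; the footnotes give only the >90% identity criterion and the definition of a partial match.

```python
def gc_content(seq: str) -> float:
    """Percent G+C of a nucleotide sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def presence_call(identity_pct: float, aligned_len: int, query_len: int) -> str:
    """Classify one best BLASTN hit as '+', 'p', or '-'.

    Assumptions (not stated in the table): the caller passes the identity
    and aligned length of the best hit, or zeros when there is no hit, and
    a hit is 'complete' only if it spans the whole subtracted sequence.
    The table prints a minus sign; ASCII '-' is used here for simplicity.
    """
    if identity_pct <= 90.0:      # footnote criterion: >90% identity
        return "-"
    if aligned_len >= query_len:  # match covers the entire sequence
        return "+"
    return "p"                    # high identity, but only a partial match
```

For example, a 335-bp subtracted sequence whose best hit has 97% identity over 200 bp would be scored `p` under these assumptions.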