TABLE 3.
Sequence analysis of CH014-specific subtracted DNA fragments.
| Group and fragment name | Sequence size (bp) | Protein with similarity<sup>a</sup> | Organism | % Nucleotide identity | % Amino acid identity | Location<sup>b</sup> | EcoRI-digested fragment size(s) (kb) | HindIII-digested fragment size(s) (kb)<sup>c</sup> |
|---|---|---|---|---|---|---|---|---|
| I | | | | | | | | |
| S-1 | 327 | Transport protein E-HlyB | EDL933 (pO157) | 96 | 98 | pl | ND<sup>d</sup> | ND |
| S-2 | 305 | Unknown protein | Shigella flexneri (pWR100) | 92 | 86 | pl | ND | >10 |
| | | Unknown protein L7076 | EDL933 (pO157) | 88 | 85 | | | |
| S-3 | 307 | Tissue invasion protein SepA | Shigella flexneri (pWR100) | | 37 | pO91:H21 | 9 | 8 |
| | | Serine protease EspP | EDL933 (pO157) | | 26 | | | |
| S-4 | 160 | Probable IS transposase | Shigella flexneri (pWR100) | 81 | 57 | pO91:H21 | 10 | ND |
| S-5 | 336 | Copy number control protein CopB | EPEC (pB171) | 99 | 98 | chr | >10 | >10 |
| | | Copy number control protein CopB | EDL933 (pO157) | 93 | 60 | | | |
| S-6 | 281 | Helicase TraI | E. coli (plasmid F) | 97 | 97 | pl | >10 | 10 |
| S-7 | 193 | Unknown protein ORF2 | E. coli (plasmid IncF) | 89 | 75 | pl | 9 | 8 |
| S-8 | 287 | Type I restriction enzyme EcoR124II R HsdR | E. coli (plasmid R124/3) | 95 | 95 | chr | >10 | 8 |
| II | | | | | | | | |
| S-9 | 324 | Probable capsid portal protein GpQ | Phage P2 | 94 | 99 | pl | 5 | ND |
| S-10 | 327 | H tail component | Phage λ | 99 | 98 | chr | >10 | ND |
| | | CP-933X tail component | EDL933 | 97 | 97 | | | |
| S-11 | 234 | DNA-packaging protein | Phage λ | 95 | 89 | chr | 2.5 | ND |
| | | CP-933X DNA-packaging protein | EDL933 | 94 | 89 | | | |
| S-12 | 288 | CP-933X-V head-tail adapter | EDL933 | 98 | 98 | chr | >10, >10 | >10, >10 |
| S-13 | 302 | CP-933M, -N, -V, -X unknown protein | EDL933 | 99 | 97 | chr | >10, >10 | >10, >10 |
| S-14 | 123 | CP-933M, -N, -V portal protein | EDL933 | 95 | 68 | chr | >10, >10 | >10, >10 |
| S-15 | 277 | CP-933M, -N, -V unknown protein | EDL933 | 97 | 96 | chr | ND | >10, >10 |
| S-16 | 631 | CP-933M DicA similar protein | EDL933 | | 50 | chr | <1 | ND |
| S-17 | 224 | CP-933N, -O, -U unknown protein | EDL933 | 93 | 90 | chr | 2.5 | ND |
| S-18 | 295 | CP-933U unknown protein | EDL933 | 99 | 98 | chr | 10 | >10 |
| S-19 | 234 | CP-933C terminase | EDL933 | 86 | 94 | chr | >10, >10 | ND |
| S-20 | 299 | CP-933C terminase | EDL933 | 95 | 97 | chr | >10, >10 | ND |
| S-21 | 300 | CP-933C head maturation protease | EDL933 | 95 | 98 | chr | >10, >10 | ND |
| III | | | | | | | | |
| S-22 | 1,342 | O island no. 154 fimbrial usher protein | EDL933 | 78 | 76 | chr | 5 | 10 |
| | | Fimbrial usher protein LpfC | Salmonella typhimurium | | 50 | | | |
| S-23 | 279 | O island no. 134 DNA processing protein | EDL933 | 98 | 97 | chr | 8 | ND |
| S-24 | 300 | O island no. 133 unknown protein | EDL933 | 91 | 93 | chr | 8 | ND |
| IV | | | | | | | | |
| S-25 | 220 | General secretion pathway protein EpsK | Vibrio cholerae | | 36 | chr | 8 | >10 |
| S-26 | 340 | Peptide synthetase McyG | “Microcystis aeruginosa” | | 40 | chr | 10, 8 | ND |
| S-27 | 241 | | | | | pO91:H21 | 7 | >10 |
| S-28 | 285 | | | | | chr | 10 | ND |
| S-29 | 305 | | | | | chr | 10 | >10, 8 |
| S-30 | ND | | | | | chr | 8 | 8 |
| S-31 | ND | | | | | pO91:H21 | 10 | 5 |
| S-32 | ND | | | | | chr | >10, >10 | >10, >10 |

<sup>a</sup> Determined by comparison of sequences and coding regions with the EMBL and GenBank DNA databases. BLAST and BLASTX network services were used. IS, insertion sequence.

<sup>b</sup> chr, chromosome located; pl, plasmid located; pO91:H21, O91:H21-specific plasmid located.

<sup>c</sup> Two values mean two different-size fragments.

<sup>d</sup> ND, not determined.