Infection and Immunity. 2001 May;69(5):3271–3285. doi: 10.1128/IAI.69.5.3271-3285.2001

Complete DNA Sequence and Analysis of the Large Virulence Plasmid of Shigella flexneri

Malabi M Venkatesan 1,*, Marcia B Goldberg 2,*, Debra J Rose 3, Erik J Grotbeck 3, Valerie Burland 3, Frederick R Blattner 3
Editor: V J DiRita
PMCID: PMC98286  PMID: 11292750

Abstract

The complete sequence analysis of the 210-kb Shigella flexneri 5a virulence plasmid was determined. Shigella spp. cause dysentery and diarrhea by invasion and spread through the colonic mucosa. Most of the known Shigella virulence determinants are encoded on a large plasmid that is unique to virulent strains of Shigella and enteroinvasive Escherichia coli; these known genes account for approximately 30 to 35% of the virulence plasmid. In the complete sequence of the virulence plasmid, 286 open reading frames (ORFs) were identified. An astonishing 153 (53%) of these were related to known and putative insertion sequence (IS) elements; no known bacterial plasmid has previously been described with such a high proportion of IS elements. Four new IS elements were identified. Fifty putative proteins show no significant homology to proteins of known function; of these, 18 have a G+C content of less than 40%, typical of known virulence genes on the plasmid. These 18 constitute potentially unknown virulence genes. Two alleles of shet2 and five alleles of ipaH were also identified on the plasmid. Thus, the plasmid sequence suggests a remarkable history of IS-mediated acquisition of DNA across bacterial species. The complete sequence will permit targeted characterization of potential new Shigella virulence determinants.


Shigella spp. continue to be a major health problem worldwide, causing an estimated 1 million deaths and 163 million cases of dysentery annually (29), predominantly in children younger than 5 years of age in developing countries. Shigella spp. cause bacillary dysentery in humans by invading and replicating in epithelial cells of the colon, causing an intense inflammatory reaction, characterized by abscess formation and ulceration, which damages the colonic epithelium.

Shigella entry into susceptible host cells requires the coordinated expression of numerous genes that are activated in response to environmental cues (29, 68). Upon contact with host cells, the bacteria secrete factors (IpaB, IpaC, and IpaA) that induce large membrane ruffles on the host cell surface associated with cytoskeletal rearrangements that lead to internalization of bacteria into the host cell. Once internalized, Shigella spp. release factors that cause lysis of the phagocytic vacuole, thereby releasing the bacteria into the cell cytoplasm, where the bacterial surface protein VirG (IcsA) assembles actin tails that propel the bacteria through the cell cytoplasm and into adjacent cells. Additional bacterial factors lead to release of proinflammatory cytokines, osmotic leak of the mucosal epithelium (ShET1 and ShET2), and, in macrophages, induction of cell death (IpaB) (20, 38, 75).

Most work on the molecular pathogenesis of Shigella has been carried out in S. flexneri serotypes 2a and 5a. The entire complement of genes critical for invasion of epithelial cells is contained on a large 220-kb plasmid, termed the virulence plasmid or the invasion plasmid, which is present in all pathogenic strains (68). The virulence plasmid is known to carry a locus of genes (ipa-mxi-spa) that encode proteins involved in invasion of mammalian cells; this locus has homologs in Salmonella, Yersinia, enteropathogenic Escherichia coli, the plant pathogens Ralstonia solanacearum and Xanthomonas campestris, and the flagellar assembly loci of Salmonella enterica serovar Typhimurium (21). In addition to those mentioned above, several other virulence plasmid proteins have been previously characterized (68).

Together, the known genes account for approximately 30 to 35% of the virulence plasmid. Sequence analysis of the entire virulence plasmid was undertaken to permit identification of all plasmid-based proteins, including potential new determinants of virulence, to facilitate the understanding of evolutionary assembly of the plasmid and mechanisms that generate spontaneous deletions which lead to strain avirulence, and to permit comparisons of virulence plasmids of different Shigella serotypes and strains and of different bacteria that vary in virulence properties. Herein, initial analysis of the virulence plasmid from S. flexneri 5a is presented.

MATERIALS AND METHODS

Bacterial strains and plasmids.

The virulence plasmid pWR501, which was sequenced in this study, had been previously derived from the native virulence plasmid pWR100 (39). In brief, S. flexneri wild-type serotype 5a strain M90T, harboring pWR100, was crossed with E. coli MC1061 carrying pMT999, a Tn501-labeled, self-transmissible, temperature-sensitive plasmid. The cross resulted in M90T carrying a cointegrate of pWR100 and pMT999, which was subsequently mobilized into E. coli 395-1 by conjugation. Following passage of this strain at 42°C with appropriate antibiotic selection, a pWR100 derivative from which pMT999 had been lost and in which the Tn501 integration was retained was isolated and designated pWR501 (39). This E. coli strain, which lacks the two smaller Shigella plasmids, was the source for the isolation of virulence plasmid DNA. Plasmid DNA was isolated using previously published methods (70) and was subjected to two cycles of CsCl banding and purification.

Library construction and sequencing.

Plasmid DNA was sheared by nebulization and size fractionated by agarose gel electrophoresis (31) to obtain DNA fragments in the range from 0.7 to 2.0 kb. Purified, end-repaired fragments were subcloned into M13Janus (7), forming a random library. Clones were sequenced using Prism dye terminator or BigDye terminator chemistry (P-E Applied Biosystems, Foster City, Calif.). Data were collected on ABI377 sequencers. Sequence reads were assembled by Seqman II (DNASTAR, Madison, Wis.). Finishing methods included sequencing of opposite ends of linking clones, PCR-based techniques, and primer walking.

Sequence analysis.

The sequence was searched for potential open reading frames (ORFs) by Genemark (5). Each ORF was initially searched against the GenPept protein database using the DeCypher II system, to determine identity or match with known proteins. After subsequent inspection and further analysis of many of the ORFs, annotations were created. Each gene was assigned a unique identifier (ID; S number).

Annotation.

Sequence comparisons were performed using the amino acid sequences predicted for the identified ORFs unless otherwise indicated. Significant homology was defined as greater than 30% identity over greater than 60% of both the query sequence and the target sequence, as has been used in other genome analysis projects (3). An exception was applied in the analysis of insertion sequence (IS) elements: because we did not want to miss fragments of IS elements, we applied the criterion of greater than 50% identity over greater than 60% of the query sequence only. Based on sequence analysis and the extent of similarity to hypothetical proteins, potential ORFs were categorized as detailed below. In a few specific circumstances, we designated protein similarity on lower levels of amino acid identity than defined by these criteria; these cases and all other exceptions to the criteria are noted in the text where appropriate. The clusters of orthologous genes (COG) algorithm (COGnitor [65]) was used to identify families to which predicted proteins are related, and ProfileScan and Prosite Scan algorithms were used for the identification of potential functional domains.
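The two homology thresholds described above can be expressed as simple predicates. The following is a minimal sketch, not the authors' pipeline: the function names and the representation of identity and coverage as percentages are assumptions made for illustration, and the strict "greater than" comparisons follow the wording of this section.

```python
def is_significant_homology(identity_pct, query_cov_pct, target_cov_pct):
    """General criterion: >30% identity over >60% of BOTH query and target."""
    return identity_pct > 30.0 and query_cov_pct > 60.0 and target_cov_pct > 60.0


def is_significant_is_homology(identity_pct, query_cov_pct):
    """IS-element criterion: >50% identity over >60% of the query only.

    Coverage of the target is deliberately ignored so that fragments of
    IS elements are not missed.
    """
    return identity_pct > 50.0 and query_cov_pct > 60.0


# Hypothetical alignments (illustrative numbers only, not from the paper).
print(is_significant_homology(35.0, 80.0, 75.0))   # passes the general criterion
print(is_significant_homology(35.0, 80.0, 40.0))   # fails: target coverage too low
print(is_significant_is_homology(55.0, 80.0))      # passes the IS criterion
```

Dropping the target-coverage requirement for IS elements is what allows a truncated copy, which aligns to only part of a full-length transposase, to be scored as related.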

Nucleotide sequence accession number.

The sequence described here has been assigned GenBank accession number AF348706.

RESULTS AND DISCUSSION

Overview of annotation.

The fully assembled circular sequence of pWR501, including the Tn501 selectable marker, consisted of 221,851 bp, 213,491 bp of which was specific to the Shigella virulence plasmid. Initial analysis of the sequence identified 293 potential ORFs (Fig. 1), 286 of which were Shigella plasmid derived and 7 of which belonged to Tn501. The 8,360-bp Tn501 insert, which had been previously introduced into pWR501 (see Materials and Methods), was located between sequence coordinates 198713 and 207065, with a 7-bp CCTAAAG target site duplication near the pWR501 replicon.

FIG. 1.


Circular map of the plasmid. Outer ring depicts ORFs and their orientations, color coded according to functional category: 1, identical or essentially identical to known virulence-associated proteins (red); 2, homologous to known pathogenesis-associated proteins (pink); 3, highly homologous to IS elements or transposases (blue); 4, weakly homologous to IS elements or transposases (light blue); 5, homologous to proteins involved in replication, plasmid maintenance, or other DNA metabolic functions (yellow); 6, no significant similarity to any protein or ORF in the database (brown); 7, homologous or identical to conserved hypothetical ORFs, i.e., proteins of unknown function (orange); and 8, Tn501 insertion-associated genes (green). The second ring shows complete IS elements. The third ring graphs G+C content, calculated for each ORF and plotted around the mean value for all ORFs, with each value color coded for the corresponding ORF. Scale is in base pairs. The figure was generated by Genescene (DNASTAR).

pWR501 is a mosaic of potential pathogenesis-associated genes, IS elements, maintenance genes, and unknown ORFs. Of the 286 Shigella-derived potential ORFs, 54 (19%) encode known Shigella proteins (category 1, indicated by red banding in Fig. 1). Thirty-seven of these are located within a 32-kb cluster of uninterrupted ORFs, previously described, constituting the ipa-mxi-spa loci or pathogenicity island (68). The remaining 17 are distributed throughout the plasmid and include five alleles of ipaH and one allele each of icsA (virG), virA, icsP (sopA), virF, virK, msbB, sepA, ipgH, shet2, phoN-Sf, trcA, and an apyrase gene.

Four previously unidentified ORFs (1.4% of pWR501 ORFs) encode proteins that have significant sequence similarity to known virulence-associated proteins (category 2, indicated by pink banding in Fig. 1); these are homologs of S. flexneri ShET2, the Salmonella serovar Typhimurium intracellular growth and virulence determinant MkaD, the E. coli lipopolysaccharide biosynthesis-related protein RfbU, and a UDP-sugar hydrolase (16, 18, 38, 62).

Fifty-two percent of pWR501 ORFs (153 ORFs) are known and new IS elements (reviewed in reference 32) as well as unknown IS elements (categories 3 and 4, indicated by blue and light blue in Fig. 1). Among these, 20 complete known and at least 4 new putative IS elements were identified. The majority of the IS element ORFs are partial copies and, together with the complete IS elements, are arranged in blocks separated from non-IS related ORFs. In some regions, the arrangement of the IS elements is complex, with fragments of one IS element ORF interdigitated and often fused with another IS element or fragment thereof, forming a mosaic pattern (Fig. 2).

FIG. 2.


Schematic representation of IS elements flanking virulence-associated genes on pWR501. (A) The 32-kb ipa-mxi-spa region; (B) the virA-virG region; (C) the virF region. Sequence coordinates are indicated above each map. frag, fragment; put tnp, putative transposase; seq, sequence; URF, unknown ORF; inc, incomplete. (D) The region downstream of virF (left side, as shown in panel C) shown at the base pair level to demonstrate that the nucleotide marking the end of one IS element or IS fragment is the start of another. The relevant bases are indicated in bold underline.

In pWR501, most virulence-associated genes, including the ipa-mxi-spa operons, the virG gene, and the ShET2 toxin gene, are flanked by one or more such mosaics of IS element ORFs (Fig. 2). In addition, many of the unknown ORFs (discussed below) are flanked by IS element ORFs and have G+C content of less than 40%. Based on this genetic organization, the recombination events that led to the acquisition of many or most genetic loci and the assembly of the large virulence plasmid almost certainly involved IS-mediated events. To date, no known plasmid in any bacterial species has been described with this degree of IS element content.

Coincident with the extremely high content of IS element-related ORFs on pWR501 is an extremely low density of ORFs that encode proteins predicted or known to be involved in other (i.e., non-IS element-related) functions in comparison with other plasmids. Thus, while pWR501 contains 0.62 non-IS element-related genes per kb, the Y. pestis LCD plasmid contains 0.87 per kb (61 ORFs over 70,509 nucleotides) (45) and the E. coli O157:H7 plasmid contains 0.91 per kb (84 ORFs over 92,077 nucleotides) (8). From a different perspective, 48% of pWR501 ORFs are non-IS element related, whereas 82% of Y. pestis LCD plasmid (45) and 87% of E. coli O157:H7 plasmid (8) ORFs are non-IS element related. Furthermore, bacterial chromosomes contain relatively few IS elements. For example, only 2% of the ORFs on the E. coli chromosome are thought to encode transposons, phage, or plasmids, and the overall density of genes outside these three categories is 0.906 per kb (4,201 ORFs over 4,639,221 nucleotides) (3). Thus, pWR501 appears to be, and to have been, capable of extraordinary IS element-related recombination events, which undoubtedly contributed to the unique evolution of Shigella.
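The gene densities quoted above are ORF counts divided by sequence length in kilobases. The sketch below reproduces them. The ORF counts and lengths for the Y. pestis plasmid, the E. coli O157:H7 plasmid, and the E. coli chromosome are those given in the text. For pWR501 the numerator and denominator are not stated explicitly, so 286 − 153 = 133 non-IS ORFs over the 213,491 bp of Shigella-specific sequence is an assumption, chosen because it is consistent with the reported 0.62 per kb.

```python
def genes_per_kb(n_orfs, length_bp):
    """Gene density as ORFs per kilobase of sequence."""
    return n_orfs / (length_bp / 1000.0)


# pWR501 inputs are an inference (see lead-in); the others are from the text.
pwr501_density = genes_per_kb(286 - 153, 213_491)
y_pestis_density = genes_per_kb(61, 70_509)
o157_plasmid_density = genes_per_kb(84, 92_077)
ecoli_chromosome_density = genes_per_kb(4_201, 4_639_221)

print(round(pwr501_density, 2))            # 0.62
print(round(y_pestis_density, 2))          # 0.87
print(round(o157_plasmid_density, 2))      # 0.91
print(round(ecoli_chromosome_density, 3))  # 0.906
```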

Twenty-five ORFs (9% of pWR501 ORFs) showed significant homology to proteins involved in plasmid replication and DNA metabolic functions (category 5, indicated by yellow banding in Fig. 1). The replicon region, along with the origin of replication (ori) site, is almost identical to the R100 replicon (40, 43). Other proteins in this category include two sets of toxin-antitoxin genes or protein addiction modules, including the ccdAB module of plasmid F and several homologs of full-sized or partial copies of a reverse transcriptase associated with group II introns (1, 33).

Fifty putative ORFs (17% of pWR501 ORFs) had insufficient homology to known proteins to be assigned a putative function. These constitute the unknown ORFs, of which 33 had no significant similarity to any protein or ORF in the database (category 6, indicated by brown in Fig. 1), and 17 had significant similarity to conserved hypothetical ORFs, i.e., proteins of unknown function (category 7, indicated by orange in Fig. 1). Many of those in the latter group have been identified in other bacterial sequencing projects.

A SalI restriction map of the virulence plasmid of a serotype 2a strain (pMYSH6000) had been previously constructed by Sasakawa et al. (53, 54). The map of pMYSH6000 assembled from that analysis differs grossly from that of pWR501 in that the ipa-mxi-spa loci are in inverse orientation. The SalI restriction map predicted from the sequence of pWR501 matches that seen upon restriction analysis of pWR501 DNA (data not shown). However, the pWR501 SalI restriction map agrees only in part with the SalI restriction map of pMYSH6000. Approximately 60% of the individual SalI fragments of pWR501 are of sizes determined for individual fragments of pMYSH6000 (e.g., pMYSH6000 fragments C, F, G, I, and J), but many fragments are of different sizes, and the total number of fragments differs (23 in pMYSH6000, versus 22 in pWR501). These differences suggest that the virulence plasmids from these two serotypes have diverged in overall organization as well as in nucleotide sequence.
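A restriction map can be predicted from a circular sequence by locating every recognition site and measuring the distances between consecutive cut positions around the circle. The following is a minimal sketch of that idea, not the software used in this study. SalI recognizes GTCGAC and cuts after the first G; the function name and the toy sequence are hypothetical.

```python
SALI_SITE = "GTCGAC"  # SalI recognition sequence; cleavage is G^TCGAC


def circular_digest(seq, site=SALI_SITE, cut_offset=1):
    """Return fragment lengths (largest first) for a circular sequence.

    An empty list means the circle contains no site and stays uncut.
    """
    seq = seq.upper()
    n = len(seq)
    # Append the first len(site)-1 bases so that a site spanning the
    # arbitrary origin of the circular map is still found.
    extended = seq + seq[: len(site) - 1]
    cuts = set()
    start = extended.find(site)
    while start != -1:
        cuts.add((start + cut_offset) % n)
        start = extended.find(site, start + 1)
    cuts = sorted(cuts)
    fragments = []
    for i, cut in enumerate(cuts):
        nxt = cuts[(i + 1) % len(cuts)]
        # A single cut linearizes the circle: one fragment of full length.
        fragments.append(((nxt - cut) % n) or n)
    return sorted(fragments, reverse=True)


# Hypothetical 22-bp circle with two SalI sites.
toy = "AAGTCGACTTTTTTGTCGACCC"
print(circular_digest(toy))  # [12, 10]
```

Comparing the sorted fragment lengths from such a prediction with sizes estimated from a gel is the kind of check described above for pWR501, and the same list can be compared between two plasmids.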

IS elements.

Of the 153 ORFs related to IS element ORFs, 114 (39%) have homology to known IS element ORFs (32; www-IS.biotoul.fr/is/is_family.html); of these, 71 belong to members of the IS3 family, which includes many IS elements in addition to IS3. Among the complete IS elements, 20 have been previously identified in Shigella or other organisms; these include five copies of IS629, four copies of IS1294, two copies of IS2, three copies of IS600, three copies of iso-IS1, one copy of IS4, one copy of IS630, and one copy of IS911 (Table 1). An IS element was considered partial or incomplete when the homology of the element in pWR501 did not extend to and include the inverted repeat of the target sequence, when it should be present. Most of the IS element-related ORFs on pWR501 are partial copies of IS elements that are truncated at the ends, contain internal deletions, or contain insertions of other IS elements within their sequence, reflecting extensive genomic rearrangement following acquisition, a pattern that has also been seen in the enteropathogenic E. coli plasmid pB171 (67) and the Y. pestis plasmid pCD1 (45). Among all IS element ORFs on pWR501, those that have not been previously identified on Shigella virulence plasmids include IS10, IS21, IS100, IS110, IS150, IS1294, IS1328, and IS1353. Detailed characterization of the IS element ORFs can be found at http://www.genome.wisc.edu/.

TABLE 1.

IS element-related ORFs on pWR501 (a)

| Type of IS element | Length (bp) | No. of ORFs/complete element | No. of ORFs in pWR501 | No. of complete elements in pWR501 |
|---|---|---|---|---|
| Known families (b) | | | | |
| IS1 | 808 | 2 | 15 | 3 |
| IS2 | 1,327 | 2 | 10 | 2 |
| IS3 | 1,258 | 2 | 7 | 0 |
| IS4 | 1,426 | 1 | 2 | 1 |
| IS10 | 1,329 | 1 | 1 | 0 |
| IS21 | 2,131 | 2 | 1 | 0 |
| IS91 | 1,829 | 2 | 4 | 0 |
| IS100 | 1,954 | 2 | 6 | 0 |
| IS150 | 1,443 | 2 | 3 | 0 |
| IS600 | 1,264 | 2 | 21 | 3 |
| IS629 | 1,310 | 3 | 27 | 5 |
| IS630 | 1,158–1,200 | 1 | 3 | 1 |
| IS911 | 1,250 | 2 | 2 | 1 |
| IS1294 | 1,688 | 1 | 9 | 4 |
| IS1328 | 1,359 | 1 | 1 | 0 |
| IS1353 | 1,613 | 2 | 2 | 0 |
| Putative new IS elements (c) | | | | |
| ISSfl1 | 929 | 1 | 6 | 1 |
| ISSfl2 | 1,373–1,376 | 1 | 2 | 2 |
| ISSfl3 | 1,301 | 1 | 1 | 1 |
| ISSfl4 | 2,728 | 3 | 17 | 2 |
| Subtotal | | | 140 | 26 |
| Unknown ORFs | | | 13 | |
| Total | | | 153 | |

(a) A detailed characterization of the IS elements on pWR501 has been posted at http://www.genome.wisc.edu/

(b) pWR501 ORFs with ≥50% amino acid identity over ≥60% of the length of the query sequence only (significant homology).

(c) pWR501 ORFs with some homology (less than significant) to known or putative transposases.

Four new types of putative IS elements were identified in pWR501 and were designated ISSfl1, ISSfl2, ISSfl3, and ISSfl4. Their identification was based on the following criteria: the sizes of the ORFs were comparable to ORF sizes of known IS elements, the protein sequence similarity was less than 55% to these known elements, and inverted and/or direct repeats were present at either end of the putative element. At present, there are no data as to whether these putative IS elements are transposable. Both complete and incomplete copies of these new putative IS elements are present in pWR501, comprising six new complete elements and 26 ORFs in all. In addition, 13 ORFs showed less than significant similarity to known IS elements or putative transposases (Table 1, unknown ORFs). None of the putative transposases showed the characteristic features of new IS elements; however, a more detailed investigation of their sequence remains to be done.
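The counts in this paragraph can be cross-checked against Table 1 by summing its per-element columns. The sketch below transcribes the "No. of ORFs in pWR501" and "No. of complete elements in pWR501" columns from Table 1; the variable and function names are assumptions made for illustration.

```python
# (ORFs in pWR501, complete elements in pWR501), transcribed from Table 1.
KNOWN_FAMILIES = {
    "IS1": (15, 3), "IS2": (10, 2), "IS3": (7, 0), "IS4": (2, 1),
    "IS10": (1, 0), "IS21": (1, 0), "IS91": (4, 0), "IS100": (6, 0),
    "IS150": (3, 0), "IS600": (21, 3), "IS629": (27, 5), "IS630": (3, 1),
    "IS911": (2, 1), "IS1294": (9, 4), "IS1328": (1, 0), "IS1353": (2, 0),
}
PUTATIVE_NEW = {
    "ISSfl1": (6, 1), "ISSfl2": (2, 2), "ISSfl3": (1, 1), "ISSfl4": (17, 2),
}
UNKNOWN_ORFS = 13


def tally(table):
    """Sum the ORF and complete-element columns of one table section."""
    orfs = sum(n_orfs for n_orfs, _ in table.values())
    complete = sum(n_complete for _, n_complete in table.values())
    return orfs, complete


known_orfs, known_complete = tally(KNOWN_FAMILIES)
new_orfs, new_complete = tally(PUTATIVE_NEW)

print(known_orfs, known_complete)               # 114 20
print(new_orfs, new_complete)                   # 26 6
print(known_orfs + new_orfs + UNKNOWN_ORFS)     # 153
```

The sums agree with the text: 114 ORFs related to known IS elements, 26 ORFs and 6 complete copies of the putative new elements, and 153 IS-related ORFs in all.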

The first new putative IS element, ISSfl1, consists of ORFs S0204 and S0203, is 929 bp in length, has 20-bp inverted repeats, and generates a 3-bp TTC direct repeat at the point of insertion. S0204 and S0203 show similarity to OrfA and OrfB of IS1650 at the amino acid level but not at the nucleotide level. Elsewhere on pWR501, S0078-S0079 and S0101-S0102 constitute incomplete copies of ISSfl1, consisting of bp 6 to 869 and 4 to 827, respectively, of ISSfl1. The second putative new IS element, ISSfl2, is present in two almost identical copies, S0055 and S0128. They encode proteins with 58 to 59% identity to the Streptomyces coelicolor IS110 transposase, but as with ISSfl1, they have no significant homology at the nucleotide level, except for approximately 55 nucleotides in the center of the element, which are similar to bp 1071 to 1127 of IS110. A direct repeat sequence CCCCTATTA is seen at either end of S0055 and S0128, identifying and duplicating the target site for insertion. The third putative new IS element, ISSfl3, consists of S0034, which has 33% identity at the amino acid level to E. coli IS10 transposase, without homology at the nucleotide level. ISSfl3 is inserted within a Y. pestis IS100 sequence, with duplication of sequences at the insertion site. Sequences 98% identical at the amino acid level to S0034 have also been identified in the enteropathogenic E. coli EAF plasmid pB171 (Orf5 and Orf28 [67]).

The fourth putative new IS element, ISSfl4, is homologous to the recently identified ISEc8, located on the chromosome of enterohemorrhagic E. coli O157:H7, adjacent to the locus of enterocyte effacement pathogenicity island (44, 57). These elements are members of the IS66 family, which has also been identified in Rhizobium sp., Agrobacterium sp., and Sinorhizobium meliloti (57). Homologous ORFs are also present on the enteropathogenic E. coli plasmid pB171 (orf49 to orf51 [67]). ISEc8 (and ISSfl4) consists of three ORFs transcribed in the same direction and flanked by 11- to 22-bp imperfect inverted repeats and an 8- or 9-bp target duplication site. On pWR501, three loci consisting of ORFs homologous to all three ISEc8 ORFs are present (S0023 to S0025, S0116 to S0119, and S0216 to S0219). In addition, six partial copies of the third ORF alone are distributed throughout the plasmid (S0008, S0038, S0072, S0081, S0090, and S0230). Within the loci containing all three ORFs, the third ORF appears to have undergone an internal deletion in one case (S0116-S0117) and a frameshift in another (S0216-S0217). Two of the ISSfl4 sequences (S0116 to S0119 and S0216 to S0219) are flanked by an exact 11-bp inverted repeat GTAAGCGCCCC, which is present 71 bp upstream of the ATG start site of the first ORF of the element (S0119 and S0219) and 21 bp downstream of the last ORF (S0116 and S0216). ISSfl4 is the largest IS element on pWR501, 2,730 bp, which is comparable in size to ISEc8 (2,443 bp). The ORFs belonging to ISSfl4 show variably 36 to 62% identity over long stretches to the ISEc8 ORFs.

Several IS elements of one type show direct repeats at each end of the insertion, while others belonging to the same group do not. Those that show target site duplication may have been acquired more recently than those that do not. For example, only two of the five copies of IS629 and two of three iso-IS1 elements have direct repeats. IS1294 elements have 4-bp direct repeats of the sequence CTTG, although the duplication occurs not as the result of target site duplication but as a result of insertion site specificity (CTTG), with the duplicate copy of CTTG being the end of IS1294 itself (66). Recent work with the IS1294 found on the ColD-like resistance plasmid pUB2380 indicates that the IS1294 element mediates not only its own transposition but also that of sequences adjacent to it in a transposition mechanism resembling rolling circle replication with single-stranded DNA intermediates (66). pWR501 contains four complete copies of IS1294.
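Whether an element is bounded by a direct repeat can be checked by comparing the bases immediately upstream of its first base with those immediately downstream of its last base. The following is a minimal sketch under stated assumptions: coordinates are 0-based and half-open on a linear string (wrap-around on a circular plasmid is not handled), and the function name and toy sequence are hypothetical. The TTC repeat in the example mirrors the 3-bp duplication reported for ISSfl1.

```python
def flanking_direct_repeat(seq, start, end, max_len=12):
    """Return the longest identical stretch flanking seq[start:end].

    The k bases ending at `start` are compared with the k bases beginning
    at `end`, for k = 1..max_len. An empty string means no repeat found.
    """
    longest = ""
    for k in range(1, max_len + 1):
        if start - k < 0 or end + k > len(seq):
            break
        left = seq[start - k:start]
        right = seq[end:end + k]
        if left == right:
            longest = left
    return longest


# Hypothetical element (ACGTACGTAC) inserted with a 3-bp TTC duplication.
toy = "GGG" + "TTC" + "ACGTACGTAC" + "TTC" + "AAA"
print(flanking_direct_repeat(toy, 6, 16))  # TTC
```

A sequence comparison of this kind cannot by itself distinguish a true target site duplication from a repeat that arises for another reason, as with the CTTG repeats of IS1294 described above.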

A substantial portion of horizontally transferred genes, characterized by atypical nucleotide composition or patterns of codon usage, are associated with plasmid, phage, or transposon-related sequences, including IS elements (42). The regions adjacent to these horizontally transferred genes often contain remnants of these mobile elements (42). The map of pWR501 indicates a unique arrangement of known virulence-related genes and unknown ORFs separated by blocks of IS element-related ORFs. The blocks of IS elements presumably reflect the history of IS element-mediated acquisition of the virulence genes and other ORFs that have contributed to the assembly and evolution of the virulence plasmid.

For example, sequences homologous to the ipa-mxi-spa gene cluster (ipaJ to spa40) are seen in other bacteria, including the yscM-yopD gene cluster on the Yersinia low-Ca2+-response (LCR) plasmid pCD1 (45). It is noteworthy that a partial copy of the Yersinia IS100 (bp 1 to 1051 of the IS element's 1,954 bp) borders one end of the Shigella ipa-mxi-spa loci and defines one end of the region of low G+C content characteristic of this gene cluster. At the other end of the ipa-mxi-spa loci, approximately 50 bp downstream of the stop codon for spa ORF11, which is adjacent to spa40, is another sequence of 139 bp (sequence coordinates 129144 to 129282) identical to the portion of the Yersinia LCR plasmid that borders the insertion of an IS285 into the LCR plasmid. The 139-bp sequence includes the 28-bp left inverted repeat of IS285 and 100 bases into orf2 of the IS element. The end of the 139-bp sequence in pWR501 also signals the end of the low-G+C ipa-mxi-spa region. The 139-bp sequence is therefore a remnant of a mobile genetic element that was possibly involved in the horizontal transfer of the ipa-mxi-spa genes into pWR501. It is interesting to note that an IS100 has recently been described adjacent to yscM on pCD1, and two partial IS285 elements flank the yscM-yopD gene cluster in Yersinia (45). Homologs of the Yersinia IS100 element have also been identified on the 154-kb native plasmid pAV511 of the bean pathogen Pseudomonas syringae pathovar phaseolicola, where they are associated with a pathogenicity island that contains homologs of the Shigella ipa-mxi-spa genes and the Yersinia yscM-yopD genes (27). The lack of IS elements within the ipa-mxi-spa pathogenicity island suggests that the entire locus was acquired in a single recombination event. Furthermore, like the yscM-yopD cluster, the ipa-mxi-spa cluster lacks a characteristic of most pathogenicity islands, the presence of flanking tRNA genes (49). Absence of tRNA genes may have resulted from genomic rearrangements following gene transfer; alternatively, acquisition of the locus may not have involved insertion into tRNA genes.

A predictable consequence of the presence of such a high density of IS element sequences on the Shigella virulence plasmid is the observed predisposition to frequent genomic alterations. For example, the construction of S. flexneri 2a vaccine SC602 by targeted deletion of virG (12) was accompanied by an unexpected recombination between two IS629 sequences that flanked the region, resulting in a larger than expected deletion, which includes virA as well as virG (M. M. Venkatesan, unpublished data). A spontaneous avirulent isolate of S. flexneri 5 (M90T-A3) contains a deletion of approximately 70 kb that includes both the ipa-mxi-spa locus and the virG-virA locus (9), a segment flanked by IS629 elements (Fig. 1). Similar deletions have been described for the T-32 ISTRATI vaccine strain (71) and the chromosomal she locus in a serotype 2a strain (48). Another type of IS element-mediated alteration that has been described for the Shigella virulence plasmid is the spontaneous insertion of an IS1 element within the virF gene, which converted a virulent strain to an avirulent one (36).

G+C composition.

The G+C content of genes and gene flanking regions has been used as a marker for phylogenetic origin of genes or gene clusters, with those genes or gene clusters that have anomalous base composition being thought to have been acquired more recently by horizontal gene transfer. The average G+C content of pWR501 is 47.6%, similar to the estimated G+C content of the Shigella chromosome, 50.02% (G. Plunkett III et al., personal communication). In contrast, essentially all of the characterized virulence-associated genes encoded on the virulence plasmid have markedly lower G+C composition, in the range of 30 to 35% (Fig. 3).

FIG. 3.


Plot of G+C content (y axis) of each ORF on pWR501 (x axis). Unknown ORFs and selected known ORFs with G+C content of less than 40% are labeled.

The overall G+C content of the ipa-mxi-spa region is 35%. While the G+C content of the Yersinia yscM-yopD region (44.8%) is similar to that of both the entire LCR plasmid and the Y. pestis chromosome (45), the G+C content of the sequenced region of pAV511 is 54%, significantly lower than the overall figures of 59 to 61% reported for pathovars of P. syringae (27). The homology of the virulence genes and flanking sequences of the pathogenicity islands among Shigella, Yersinia, and P. syringae plasmids indicates a common ancestry of these pathogenicity islands, with subsequent evolutionary changes in gene composition and arrangements. Based on G+C composition alone, it is tempting to speculate that the Shigella ipa, mxi, and spa genes were acquired much later in evolution than the Yersinia genes. Alternatively, Yersinia may have acquired the region from an organism with a similar G+C content.

Altogether, 66 ORFs (22.5% of pWR501 ORFs) have a G+C content of less than 40%; of these, 38 are found as a block within the ipa-mxi-spa loci, and 28 are distributed throughout the remainder of the plasmid (Fig. 3). Sixteen of these 28 are found in clusters of two or three genes of low G+C content, suggesting that they might have been acquired as a block of genes from a single donor organism. Four of these 28, virF (S0051), shet2 (S0097), virA (S0191), and sopA (S0292), are known virulence factors (68). Most of the remainder encode hypothetical proteins that lack significant similarity to any known protein, many of which are of moderate size (200 to 500 amino acids [aa]). This raises the possibility that, like the known virulence genes on pWR501, many of these hypothetical ORFs encode virulence factors and were acquired by lateral transfer. These therefore potentially represent as yet uncharacterized recently acquired virulence determinants, some of which may have been acquired after the emergence of the evolutionary group from which pWR501 was isolated.
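The screen for ORFs with a G+C content of less than 40% amounts to computing the fraction of G and C bases in each coding sequence and applying a threshold. The sketch below illustrates this under stated assumptions: the function names and the two toy sequences are hypothetical, and ambiguous bases are simply excluded from the denominator, which may differ from the software used to produce Fig. 3.

```python
def gc_percent(seq):
    """Percent G+C of a nucleotide sequence, counting only A, C, G, and T."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    return 100.0 * gc / acgt if acgt else 0.0


def low_gc_orfs(orfs, threshold=40.0):
    """Names of ORFs whose G+C content is below the threshold (percent)."""
    return [name for name, seq in orfs.items() if gc_percent(seq) < threshold]


# Hypothetical ORF sequences, far shorter than real coding sequences.
toy_orfs = {"orfA": "ATATATGC", "orfB": "GGCCAT"}
print(gc_percent(toy_orfs["orfA"]))  # 25.0
print(low_gc_orfs(toy_orfs))         # ['orfA']
```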

Three pWR501 loci have G+C contents of 60% or greater (Fig. 3). The first of these is S0214, which lies within a block of relatively high G+C content genes that is homologous to an E. coli plasmid ColIb-P9 locus. The second high-G+C-content locus is S0264, which lies immediately adjacent to a group of genes predicted to be involved in plasmid transfer and stability functions, most similar to those of plasmid R100 (see “The replicon,” below). The third locus is S0269 to S0274, within Tn501.

Unknown ORFs.

ORFs that either showed no significant similarity to any protein or ORF in the database or were homologous or identical to conserved hypothetical ORFs (proteins of unknown function) were jointly defined as unknown ORFs and are listed in Table 2. A discussion of several interesting features of these ORFs follows.

TABLE 2.

Unknown ORFsa

| ORF | Begin cds | End cds | First codon | G+C content (%) | No. of aa | Organism of protein to which most similar | Homolog protein | Extent of similarity |
|---|---|---|---|---|---|---|---|---|
| S0002 | 485 | 949 | ATG | 51.0 | 154 | No significant homology | | |
| S0003 | 1176 | 2042 | ATG | 32.8 | 288 | No significant homology | | |
| S0005 | 3272 | 3526 | ATG | 42.3 | 84 | No significant homology | | |
| S0006 | 4441 | 3470 | TTG | 31.2 | 323 | No significant homology | | |
| S0007 | 4742 | 4344 | GTG | 32.3 | 132 | No significant homology | | |
| S0011 | 6793 | 7110 | ATG | 49.4 | 105 | No significant homology | | |
| S0016 | 11662 | 11084 | ATG | 31.4 | 192 | No significant homology | | |
| S0030 | 18654 | 19331 | ATG | 35.5 | 225 | S. flexneri | ShET2 (565 aa) (b) | 22% identical over 129 aa |
| S0031 | 20037 | 19813 | ATG | 51.6 | 74 | No significant homology | | |
| S0062 | 44909 | 44427 | ATG | 45.8 | 160 | Mycobacterium tuberculosis | Rv0919 (hypothetical protein) (166 aa) | 47% identical over 148 aa |
| S0063 | 45202 | 44900 | ATG | 44.9 | 100 | M. tuberculosis | Rv0918 (hypothetical protein) (158 aa) (c) | 33% identical over 77 aa |
| S0066 | 48838 | 47884 | ATG | 32.6 | 484 | No significant homology | | |
| S0088 | 67022 | 67312 | GTG | 41.9 | 96 | No significant homology | | |
| S0098 | 76039 | 74627 | ATG | 31.8 | 470 | No significant homology | | |
| S0099 | 76793 | 76494 | ATG | 53.0 | 99 | No significant homology | | |
| S0100 | 77192 | 76803 | ATG | 48.7 | 129 | No significant homology | | |
| S0103 | 78806 | 78333 | ATG | 35.0 | 157 | S. typhimurium | HilC (b) (295 aa) | 31% identical over 128 aa |
| | | | | | | E. coli | PerA (b) (274 aa) | 34% identical over 119 aa |
| S0104 | 79222 | 78794 | ATG | 32.9 | 142 | No significant homology | | |
| S0105 | 80088 | 79450 | GTG | 50.2 | 212 | E. coli K-12 | Hypothetical protein | 55% identical over 212 aa |
| S0111 | 83759 | 83274 | ATG | 43.1 | 161 | M. tuberculosis | Rv0919 (hypothetical protein) (166 aa) | 50% identical over 157 aa |
| S0112 | 84082 | 83747 | TTG | 39.9 | 111 | M. tuberculosis | Rv0918 (hypothetical protein) (158 aa) (c) | 38% identical over 81 aa |
| S0120 | 88198 | 87914 | ATG | 47.7 | 94 | No significant homology | | |
| S0121 | 88260 | 88439 | ATG | 37.8 | 59 | No significant homology | | |
| S0122 | 88540 | 89994 | ATG | 31.2 | 484 | No significant homology | | |
| S0141 | 109413 | 109057 | ATG | 32.8 | 118 | No significant homology | | |
| S0176 | 134729 | 134313 | GTG | 48.9 | 138 | No significant homology | | |
| S0177 | 136045 | 135269 | ATG | 30.9 | 258 | M. bovis | TfpB (261 aa) (b) | 28% identical over 237 aa |
| S0179 | 137371 | 137201 | ATG | 51.6 | 56 | No significant homology | | |
| S0193 | 147529 | 147786 | ATG | 39.1 | 85 | No significant homology | | |
| S0194 | 147755 | 147952 | GTG | 47.5 | 65 | No significant homology | | |
| S0205 | 155688 | 155287 | GTG | 41.8 | 133 | No significant homology | | |
| S0208 | 157575 | 158342 | GTG | 53.0 | 255 | E. coli plasmid ColIb-P9 | YccB (308 aa) (d) | 92% identical over 231 aa |
| S0209 | 158288 | 158545 | GTG | 56.6 | 85 | E. coli plasmid ColIb-P9 | YccB (308 aa) (d) | 92% identical over 84 aa |
| S0210 | 158917 | 159600 | ATG | 56.0 | 227 | E. coli plasmid ColIb-P9 | YcdB (227 aa) | 97% identical over 227 aa |
| S0211 | 159601 | 159822 | ATG | 53.1 | 73 | E. coli plasmid ColIb-P9 | YceA (73 aa) | 100% identical over 73 aa |
| S0212 | 159834 | 160019 | ATG | 57.0 | 61 | E. coli plasmid ColIb-P9 | YceB (144 aa) (d) | 90% identical over 61 aa |
| S0213 | 160053 | 160268 | ATG | 57.4 | 71 | E. coli plasmid ColIb-P9 | YceB (144 aa) (d) | 81% identical over 70 aa |
| S0214 | 160313 | 161083 | ATG | 60.7 | 256 | E. coli plasmid ColIb-P9 | YcfA (256 aa) | 93% identical over 256 aa |
| S0225 | 168238 | 167600 | ATG | 36.6 | 212 | No significant homology | | |
| S0235 | 173855 | 174845 | GTG | 50.0 | 96 | E. coli plasmid ColIb-P9 | YacA (89 aa) | 92% identical over 89 aa |
| | | | | | | Vibrio cholerae | RelB (b) (122 aa) | 26% identity over 74 aa |
| S0236 | 174219 | 174368 | ATG | 52.0 | 49 | E. coli plasmid ColIb-P9 | YacB (c) (93 aa) | 97% identical over 32 aa at amino terminus |
| S0237 | 174397 | 174987 | ATG | 35.7 | 196 | No significant homology | | |
S0238 175234 175055 GTG 47.2 59 No significant homology
S0249 182574 179662 TTG 50.9 970 M. tuberculosis UvrD2b 23% identical over 310 aa near amino terminus
S0250 183057 183899 ATG 40.0 280 S. flexneri YSH6000 virulence plasmid Shf (280 aa) >99% identical over 280 aa
Enteroaggregative E. coli plasmid pAA Shf (280 aa) 95% identical over 280 aa
Enterohemorrhagic E. coli O157:H7 plasmid L7026 (hypothetical protein) (273 aa) 70% identical over 273 aa
S0264 196924 197159 ATG 61.5 70 Plasmid R100 YigA (70 aa) 73% identical over 70 aa
S0265 197451 197645 TTG 58.2 64 Plasmid R100 YigB (153 aa)d 95% identical over 64 aa
S0276 207312 207479 ATG 51.2 55 Plasmid R100 YigB (153 aa)d 96% identical over 55 aa
S0282 212528 212340 ATG 56.6 62 No significant homology
S0290 218962 218378 TTG 33.8 194 No significant homology
a Defined as those that either showed no significant similarity to any protein or ORF in the database or were homologous or identical to conserved hypothetical ORFs (proteins of unknown function). “Significant similarity” was defined as greater than 30% identity over at least 60% of the query sequence and 60% of the target sequence. cds, sequence coordinates.

b Although the extent of similarity did not meet the criteria for “significant,” the homolog alignment was included because of the potential relevance of the particular gene to Shigella pathogenesis.

c Although the extent of similarity did not meet the criteria for “significant,” the homolog alignment was included because of its link to the adjacent ORF.

d Based on alignments, S0208 appears to be frame shifted from S0209, and S0212 appears to be frame shifted from S0213; in review of the sequence, no sequencing errors could be detected that would explain these apparent frame shifts. In addition, S0265 and S0276 appear to have evolved from a single ORF that was disrupted by inserted sequences and had an internal deletion.
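
The threshold for “significant similarity” given in the footnotes to Table 2 can be expressed as a simple predicate. The sketch below is a minimal Python illustration, not the procedure used for the annotation; it assumes that the reported alignment length can serve as a proxy for coverage of both the query and the target (that is, gaps are ignored), and the function and parameter names are hypothetical.

```python
def is_significant(identity_pct: float, aligned_len: int,
                   query_len: int, target_len: int,
                   min_identity: float = 30.0,
                   min_coverage: float = 0.60) -> bool:
    """Apply the Table 2 criterion: identity greater than 30% over at least
    60% of the query sequence and at least 60% of the target sequence.

    Assumption: aligned_len approximates the covered residues in both
    sequences (alignment gaps are ignored).
    """
    return (identity_pct > min_identity
            and aligned_len / query_len >= min_coverage
            and aligned_len / target_len >= min_coverage)


# Values taken from Table 2.
print(is_significant(47, 148, 160, 166))  # S0062 vs. Rv0919
print(is_significant(33, 77, 100, 158))   # S0063 vs. Rv0918 (footnote c)
print(is_significant(22, 129, 225, 565))  # S0030 vs. ShET2 (footnote b)
```

Under these assumptions the first alignment meets the criterion, whereas the second fails on target coverage and the third fails on percent identity, in line with the footnote annotations.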

Unknown ORFs with no significant similarity to hypothetical proteins.

The longest unknown ORF is S0249, which putatively encodes a 970-aa protein that has less than significant homology to the Mycobacterium tuberculosis probable DNA helicase II homolog UvrD2 (700 aa [11]), largely between residues 335 and 422. While M. tuberculosis UvrD2 has not been extensively characterized, E. coli uvrD encodes a 720-aa protein with 3′-5′ DNA helicase activity (6) that is nonessential for viability but is required for methyl-directed mismatch repair and nucleotide excision repair and is believed to participate in recombination and DNA replication. S0249 belongs to the superfamily of DNA/RNA helicases (COG0210), as determined by COGnitor for identification of families to which predicted proteins are related (65). S0249 has an ATP/GTP binding site motif A (P loop) at residues 156 to 163 (GSAGSGKT), similar to the ATP binding motif in UvrD proteins (ProfileScan algorithm). The ATP binding motif is a glycine-rich region, which typically forms a flexible loop between a beta strand and an alpha helix and interacts with one of the phosphate groups of the nucleotide. S0249 also has an ankyrin repeat 2 motif at residues 721 to 753. Ankyrin repeats are tandemly repeated modules of 33 aa seen primarily in eukaryotic proteins, where they play a role in protein-protein interactions (4). The few known examples from prokaryotes and viruses are thought to be the result of horizontal gene transfer (4). S0249 could represent a helicase with a C-terminal end capable of interaction with other accessory proteins involved in DNA replication, excision, and/or repair mechanisms.
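
The P-loop assignment can be illustrated with a simple pattern search. The Python sketch below uses a regular expression equivalent to the PROSITE ATP/GTP-binding site motif A pattern, [AG]-x(4)-G-K-[ST], rather than the ProfileScan algorithm used in this study. The test sequence is a hypothetical filler with the S0249 motif GSAGSGKT placed at residues 156 to 163; it is not the S0249 sequence, and the function name is an assumption.

```python
import re

# Regular-expression form of the PROSITE motif A pattern [AG]-x(4)-G-K-[ST].
# A lookahead is used so that overlapping matches are also reported.
P_LOOP = re.compile(r"(?=([AG].{4}GK[ST]))")


def find_p_loops(protein: str):
    """Return (start, end, motif) tuples with 1-based inclusive positions."""
    return [(m.start() + 1, m.start() + 8, m.group(1))
            for m in P_LOOP.finditer(protein.upper())]


# Hypothetical sequence: 155 filler residues, then the motif reported
# for S0249, then 10 more filler residues.
toy_protein = "M" + "L" * 154 + "GSAGSGKT" + "L" * 10
print(find_p_loops(toy_protein))
```

A pattern match of this kind only indicates a candidate nucleotide-binding site; it does not by itself establish helicase activity.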

ORF S0141, located between icsB and ipgD within the ipa-mxi-spa loci, has not been previously described in the literature. The predicted protein has no significant homology to any hypothetical protein. The ATG start site and the first nine amino acids precede the start codon of ipgD, which is transcribed in the opposite orientation. It remains to be determined whether S0141 is expressed in vivo.

S0103 has less than significant homology to HilC (SprA) of Salmonella serovar Typhimurium and PerA of E. coli, both members of the AraC/XylS family of transcriptional regulators, known to bind DNA via a helix-turn-helix (HTH) motif (23, 56). In enteropathogenic E. coli, PerA activates the eaeA gene, encoding intimin. HilC regulates hilA expression, which in turn activates the expression of invasion-associated genes. A sequence that is 52% similar to the HTH motif characteristic of this family of proteins was detected in S0103 at residues 54 to 155 (Prosite Scan ID PS01124). HTH DNA binding motifs were originally identified in bacterial proteins. They have since also been found in eukaryotic DNA-binding proteins. It is not known what genes might be regulated by S0103.

Unknown ORFs with significant similarity to previously described proteins of unknown function.

Adjacent to and in the opposite orientation of S0249 is an ORF (S0250) that encodes a 280-aa protein with 95% identity to Shigella shf (48). In the previously published sequence of shf, this locus was thought to contain two ORFs, shf1 (predicted to encode a 133-aa protein) and shf2 (predicted to encode a 145-aa protein). The pWR501 sequence lacks a single T nucleotide that is present after base 770 of the published sequence. The absence of this nucleotide in pWR501 generates a single ORF encoding a 280-aa protein. COG analysis places S0250 in the family of xylanases/chitin deacetylases (COG0726), which function in carbohydrate transport and metabolism.
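
The effect of a single-nucleotide difference on ORF prediction can be shown with a toy example. The Python sketch below translates a short hypothetical sequence with and without one extra T; the sequences are invented for illustration and are unrelated to shf, and the helper names are assumptions. The standard genetic code is used.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_orf(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical sequence in which an extra T after base 6 creates an
# in-frame stop codon (ATG GCT TAA ...).
with_extra_t = "ATGGCTTAAGCTGGATGGCTAA"
# Removing that T restores a longer reading frame.
without_t = with_extra_t[:6] + with_extra_t[7:]

print(translate_orf(with_extra_t))  # short product
print(translate_orf(without_t))     # longer product
```

In the same way, the presence or absence of one base in a sequencing read can split one predicted ORF into two or merge two into one.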

S0250 forms an operon with S0251, S0252, and S0253, which encode rfbU (capU), virK, and msbB, respectively. RfbU is a UDP-sugar hydrolase and has been described in Vibrio cholerae as an accessory protein required for O-antigen biosynthesis (18). MsbB is an acyltransferase involved in fatty acyl modification of the O antigen (60). msbB genes are also present on the Shigella chromosome and in the E. coli genome. Mutations in msbB genes result in lipopolysaccharide that has reduced toxicity (60). In Shigella, VirK mutants have less virG mRNA than wild type, suggesting the involvement of virK in posttranscriptional regulation of virG expression (37). It is interesting to consider that VirK, being encoded within the shf-capU-msbB operon, may have a role in O-antigen biosynthesis. A locus of three genes with significant homology and similar in organization to pWR501 genes shf, rfbU(capU), and virK is also seen in the enteroaggregative E. coli plasmid pAA2 (J. R. Czeczulin, T. R. Whittam, I. R. Henderson, and J. P. Nataro, submitted for publication) and E. coli O157:H7 plasmid pO157 (8).

Two loci containing seven (S0208 to S0214) and two (S0235 and S0236) ORFs are similar to unknown ORFs on plasmid ColIb-P9 (G. Sampei and K. Mizobuchi, unpublished data). Of note, ColIb-P9 was originally described in Shigella sonnei. ORF S0210 shows 41% identity to 135 residues of a 304-aa adenine methyltransferase (EcoVIII modification methylase), which contains an N6-adenine-specific methylase signature motif between residues 24 and 30 (ILTDPPY) (Prosite Scan ID PS00092). DNA methyltransferases modify DNA within a recognition sequence and in most cases are associated with a restriction endonuclease. The combination of these two activities protects native bacterial DNA from cleavage and facilitates cleavage of foreign DNA. In bacteria, DNA methylation may also be involved in transposon movement and DNA mismatch repair and may have important roles in virulence (26). Salmonella serovar Typhimurium strains with mutations in DNA adenine methylase show abnormalities in protein secretion, host cell invasion, and M-cell cytotoxicity (22). It is not known whether the DNA methylase activity of S0210 is associated with a restriction endonuclease.

The identification of several unknown ORFs, many of which have interesting homologies and many of which may be involved in virulence, permits targeted investigation of genes that may have subtle albeit important roles in Shigella pathogenesis. Most of the virulence genes that have been previously characterized were identified using in vitro assays designed to evaluate the ability of bacteria to invade and multiply within epithelial cells. Many of the unknown ORFs in pWR501 may have been missed in these screens by virtue of not being essential to either invasion or intracellular multiplication. They may nevertheless modulate the efficiency of invasion and intercellular dissemination or be otherwise important to pathogenesis. Further genetic and phenotypic analysis of each of these genes will permit characterization of each one's role in pathogenesis.

Known ORFs.

S0042 (405 aa) is 47% identical and 62% similar over 387 aa to a reverse transcriptase/maturase (RT) from Sinorhizobium meliloti (419 aa) that is associated with a group II intron (33). S0113 and S0114 are homologous to short regions of the RT, apparently fragments (50 and 47 aa, respectively), and S0199 and S0200 represent a second copy of the RT element that is frameshifted. Group II introns are self-splicing RNAs linked to mobile genetic elements, related to the nuclear introns of pre-mRNA. They are commonly found in organellar genes of lower eukaryotes and plants but have recently been described as associated with IS elements in many prokaryotes, including E. coli (13). In addition to their ribozyme core, some group II introns encode proteins with reverse transcriptase activity associated with endonuclease activity. In pWR501, the RT encoded by S0042 appears to contain the seven domains characteristic of active RTs (33). The S. meliloti group II intron with the encoded RT is inserted within an IS element. The association of group II introns with IS elements ensures the spread and maintenance of the introns. S0042 does not appear to be within an IS element, although there is an IS629 element 700 bp upstream of the coding sequence and a putative transposase downstream. Furthermore, the presence of multiple copies of RT in pWR501 indicates that they might originally have been acquired within IS elements. The IS629 sequence is immediately adjacent to a 170-bp sequence that is homologous to sequence on the enteropathogenic E. coli plasmid pB171, suggesting DNA exchange between pWR501 and pB171. Splicing efficiency of the S. meliloti group II intron requires expression of the RT protein, which suggests that it has a maturase activity (33). S0042 contains the carboxy-terminal domain that specifies maturase activity but lacks the zinc finger domain that specifies endonuclease activity.

A virulence plasmid-encoded operon that contributes to enhanced survival and mutation frequencies, impCAB, has previously been characterized in S. flexneri strain SA100 (50). In pWR501, the impCAB operon is missing; only the first 176 bp are present, beginning at sequence coordinate 157595. impCAB is similar to the chromosomal umuDC operons of E. coli and Salmonella serovar Typhimurium and the plasmid mucAB operon; all function in DNA repair following radiation- and chemical-induced mutagenesis.

The invasion locus of Shigella is a pathogenicity island-like cluster that consists of 38 ORFs (S0130 to S0167) of the ipa-mxi-spa operons within a stretch of 32 kb of the virulence plasmid (beginning with the start codon of ipaJ at sequence coordinate 97092 [S0130] and ending with the stop codon of spa-orf11 [S0167] at sequence coordinate 129091). Genes within this locus are critical for Shigella invasion of mammalian cells, although certain genes outside this region are required for optimal invasion of tissue culture cells. This region has been previously mapped and sequenced, and genes within this region have been extensively characterized (46). Notably, on pWR501, the orientation of the entire gene cluster is inverse of what had been previously published (17). Potential explanations for this inversion include (i) strain differences due to a true inversion during evolution that may have been enhanced by the presence of flanking IS elements or (ii) an artifact of the cloning approach that was used in the prior sequencing projects. Strain-to-strain differences in the arrangement of virulence genes have been previously described (2).

Two, and perhaps three, ShET2-like toxin genes are present on pWR501. The ShET2 enterotoxin (S0097, 566 aa) is present on the virulence plasmids of all species of Shigella (38). S0012 (533 aa) represents a second allele of the ShET2 gene, with 40% identity to ShET2 over more than 90% of its length (Fig. 4). In addition, S0030 (266 aa) is 22% identical to ShET2 over 55% of its length (Table 2 and Fig. 4).

FIG. 4.

Alignment of ShET2 alleles on pWR501. The ShET2 toxin (S0097) is compared with the second copy of ShET2 (S0012) and a smaller ORF (S0030) with which it bears similarity. The similarities are indicated as for BLAST searches.

There are five complete alleles of ipaH on pWR501, designated ipaH1.4, ipaH2.5, ipaH4.5, ipaH7.8, and ipaH9.8 (69). A single T residue at pWR501 sequence coordinate 213802, within the coding sequence of ipaH1.4, a single T residue at pWR501 sequence coordinate 41551, within the coding sequence of ipaH2.5, and a single G residue at pWR501 sequence coordinate 61847, within the coding sequence of ipaH7.8, were missed previously, leading to truncation of ipaH1.4 and ipaH2.5 at their 3′ ends and ipaH7.8 at its 5′ end. Thus, our sequence predicts that ipaH1.4, ipaH2.5, ipaH4.5, ipaH7.8, and ipaH9.8 encode proteins of 575, 563, 574, 565, and 545 aa, respectively. The amino-terminal halves are variable, while the carboxy-terminal halves are conserved (Fig. 5), as described previously (69). The variable amino-terminal halves contain a leucine-rich repeat (delimited by asterisks in Fig. 5) (28). The carboxy-terminal 13 residues of IpaH7.8 and IpaH9.8 are absent in IpaH2.5 and IpaH4.5 and altered in IpaH1.4. The G+C content of these five alleles is remarkable in that the amino-terminal halves are lower in G+C content than the conserved carboxy-terminal halves of all five alleles, which suggests a modular unit whose two halves might have been acquired separately during evolution and subsequently fused. ipaH1.4 and ipaH2.5 are more similar to each other than to the other three alleles. A schematic representation of the amino-terminal alignment and the degree of sequence identity among the five ipaH alleles is shown in Fig. 5. Although the function of each ipaH allele is unknown, transcription of four of them, but not that of the ipaBCDA and mxi operons, was markedly increased during growth in the presence of Congo red and in an ipaD mutant, two conditions under which secretion through the Mxi-Spa machinery is enhanced (15). Transcription of the ipaH genes was also transiently activated upon entry into epithelial cells. Recently, IpaH7.8 was shown to facilitate the escape of Shigella from phagocytic vacuoles of mouse macrophages and human monocytes (19).

FIG. 5.

The five IpaH alleles of pWR501. (A) Alignment of the amino-terminal halves of the five IpaH alleles, using the Clustal method with PAM250 residue weight table (MEGALIGN algorithm; DNASTAR software). Beginning with the sequence LADAV (residues 307 to 311 of IpaH1.4), the sequences are conserved among all five alleles; the carboxy termini are not shown. Asterisks, approximate extent of the leucine-rich repeat sequence; arrow, beginning of region conserved in all five alleles. (B) Sequence identity and divergence among the five IpaH alleles. Percent divergence is calculated by comparing sequence pairs in relation to the phylogeny reconstructed by MEGALIGN, whereas percent identity is determined for individual pairs without regard to phylogenetic relationship. (C) G+C content distribution of IpaH7.8. Scale is in base pairs.

The replicon.

pWR501 has a replicon region highly homologous to that of plasmid R100, which is within the RepFIIA family of replicons. There are six known incompatibility (Inc) groups within this family; plasmid R100 belongs to the IncFII group. In pWR501, the replicon extends from sequence coordinates 208523 to 210861 and contains the essential replication elements present in R100, including an ori and a G site (single-strand initiation site), which directs the priming of single-stranded DNA templates for leading and/or lagging strand synthesis (Fig. 6) (43, 64). Of note, the R100 plasmid was initially isolated from an S. flexneri 2a strain.

FIG. 6.

Replicon of pWR501. (A) Replicon region and flanking DNA sequences (sequence coordinates 187081 to 214607), including the insertion site of Tn501. (B) Expanded view of the replicon shown in panel A (sequence coordinates 207312 to 214912), indicating the position of inc RNA, ori, and the G site. Distances are indicated in base pairs below each map.

The frequency of replication is regulated by control of the synthesis of the plasmid-specific replication initiation protein RepA1, which binds to the plasmid ori and assembles a replication complex (43). The principal copy control elements are (i) an antisense RNA, inc RNA (or CopA in plasmid R1), which is constitutively expressed and rapidly turned over and which exerts negative control on the repA1 transcript (40); (ii) RepA2, a trans-acting repressor of the repA1 promoter; and (iii) repA6, whose expression disrupts the binding of inc RNA to the repA1 transcript (73). All of these elements are present on pWR501. To our knowledge, no function has been ascribed to repA4.

The pWR501 replicon is more than 75% identical to the R100 replicon at both nucleotide and amino acid levels. Classification of replicons is largely based on the extent of similarity of the sequence of RepA1 proteins. pWR501 RepA1 (251 aa in length) is 76% identical to residues 35 to 285 of R100 RepA1 (285 aa in length). Phylogenetic analysis of the RepA1 proteins from several different replicons indicated that pWR501 RepA1 is most closely related to RepA1 from R100 (data not shown).

repA2 of pWR501 is divergent from repA2 of R100 but is identical to that of R1 (an IncFII plasmid), as has been shown for another S. flexneri serotype 5a virulence plasmid, pWR110 (59); interestingly, the three sequences are identical at the protein level. The RepA2 target site is in a region that lacks homology, indicating that the RepA2 proteins and their targets are plasmid specific and not group specific (41).

Of note, pWR110 was shown to be compatible with IncFII plasmids and only weakly incompatible with IncFI plasmids (59). The pWR501 incompatibility determinant, located within the inc gene, is identical to that of pWR110 (59); therefore, the incompatibility profiles of the two plasmids would be predicted to be the same. The inc genes of pWR110 (59) and pWR501 (this study) differ slightly from the inc genes of the IncFII plasmids R1 and R100 and the IncFI plasmids P307 and ColV2-K94. Thus, while the replicon of pWR501 is most similar to that of IncFII plasmids, it does not fall within the FII incompatibility group, suggesting that the few differences in nucleotide sequence within the inc gene may give rise to significant differences in incompatibility.

The ori region of R100 is located within a 167-bp sequence downstream from repA1 that contains approximately 75 bp of high AT content; a similar high-AT-content sequence is observed in the putative ori of pWR501, which lies within a 362-bp segment between the stop codon of repA1 and the start codon of repA4 (sequence coordinates 210012 to 210373) (Fig. 6) (63). Base pairs 93 to 258 of this 362-bp sequence are 95% identical to the 167-bp ori sequence of R100. A dnaA box (TTATCCACA) is located within the ori sequence, and a G site, analogous to the single-strand initiation site of phages, is located toward the 3′ end of repA4, as in R100. Within the G site are three blocks of conserved nucleotides, one of which provides the starting point for primer RNA synthesis (63); all three are conserved in pWR501. pWR501 homology to the R100 replicon ends 80 bp upstream of tir in R100 (Sampei and Mizobuchi, submitted for publication).
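
Short sequence elements such as the dnaA box can be located by searching both strands of a sequence. The Python sketch below is an illustration under stated assumptions and is not the method used in this study: the toy sequence is hypothetical, the function names are invented, and coordinates are treated as 1-based and inclusive. It also recomputes the length of the putative ori segment from the coordinates given above.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif_both_strands(seq: str, motif: str):
    """Return sorted (1-based start, strand) hits for an exact motif match.

    A '-' hit means the reverse complement of the motif occurs at that
    position on the strand supplied.
    """
    seq, motif = seq.upper(), motif.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start + 1, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


def span_length(begin: int, end: int) -> int:
    """Length in bp of an interval with 1-based inclusive coordinates."""
    return abs(end - begin) + 1


DNAA_BOX = "TTATCCACA"
# Hypothetical sequence carrying one forward and one reverse-complement copy.
toy = "GGG" + DNAA_BOX + "CCC" + reverse_complement(DNAA_BOX) + "GGG"
print(find_motif_both_strands(toy, DNAA_BOX))
print(span_length(210012, 210373))  # putative ori segment of pWR501
```

Exact matching is the simplest choice here; a search that tolerates mismatches would be needed to detect degenerate copies of the box.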

Apart from the essential replicon, pWR501 has multiple loci homologous to sequences known to be involved in plasmid segregation and stable maintenance: parA and parB of the P1 bacteriophage partitioning system (S0039 and S0040), stbB and stbA of the R100 plasmid (S0206 and S0207), ccdA and ccdB (also known as proteins H and G) of the F plasmid (S0232 and S0233), and mvpA and mvpT (S0259 and S0258). P1 ParA and ParB act on the cis-acting element parS to promote faithful segregation of the plasmid during the bacterial cell cycle (14). The ParA proteins of pWR501 and P1 are 75% identical and the ParB proteins are 59% identical over most of their lengths, and an AT-rich parS site with an inverted imperfect 20-bp repeat structure is present immediately downstream of the parB stop codon. Plasmid R100 StbA and StbB, along with an essential upstream cis-acting element, mediate stable plasmid inheritance (61). StbA and StbB of pWR501 are 43 and 29% identical to R100 StbA and StbB over more than 80% of the target sequences, although pWR501 StbA is predicted to be twice as long as R100 StbA. An AT-rich sequence immediately upstream of stbA may constitute a cis-acting site. The redundancy of plasmid segregation and stable maintenance systems and their distribution around the plasmid may enhance the stable inheritance of a single-copy plasmid, such as pWR501, over many generations. In addition, the redundancy may suggest not only that maintenance of the plasmid is important to Shigella virulence but that maintenance of plasmid derivatives that carry large deletions is important. The Shigella virulence plasmid is known to spontaneously delete large segments (34), and the observed distribution of segregation and maintenance loci might permit maintenance of derivatives that had deleted segments containing one or more of these loci, which may permit maintenance of virulence despite loss of some sequence. 
Toxin-antitoxin systems, which encode a stable cytotoxin and an unstable antitoxin, mediate postsegregational killing of daughter cells that lack plasmid. pWR501 contains one recently described toxin-antitoxin system (the mvp locus, S0258-S0259 [55]), one that has been characterized on F plasmid (S0232-S0233), and remnants of a third (S0205). S0232 and S0233 encode homologs of F plasmid CcdB (toxin) and CcdA (antitoxin) (1). Immediately downstream of stbB, S0205 encodes a protein homologous to E. coli RelB, the toxin encoded by the relBE toxin-antitoxin locus (25).

ColE1 plasmids resolve plasmid multimers into monomers by site-specific recombination involving the cer site (58), in conjunction with chromosomally encoded argR, pepC, and xerC. pWR501 contains a 129-bp sequence (sequence coordinates 175196 to 175067) that is similar to bp 111 to 240 of the 284-bp ColE1 cer site, and therefore contains the ArgR binding site, but lacks the right arm of the recombination site. Plasmid R1 has a significantly truncated cer site (only 44 bp long) that is thought to function by recombination with a related site in the terminus of the E. coli chromosome (10), which suggests that the pWR501 site may also be functional. Of note, a homolog of ColE1 mob9 protein is encoded within the pWR501 cer site (S0238).

Transfer genes.

A block of genes located between sequence coordinates 189271 and 196816 is similar to known transfer genes. These include ORFs with significant homology to a truncation of traD and the complete sequences of traI, traX, and finO. In pWR501, this block has the same genetic organization as the corresponding genes in plasmid R100 (74), including two downstream ORFs of unknown function, yigA and yigB, followed by the replicon, raising the possibility that the entire region was acquired by horizontal transfer from R100. The remainder of the large block of transfer genes that is located upstream in R100 and other conjugative plasmids is absent on pWR501, and the traI gene has an internal frameshift. Whether pWR501 once had the full complement of transfer genes and subsequently lost most of them or whether only a portion of the transfer genes was acquired cannot be determined. Experimental data have shown that the virulence plasmid is not capable of self-transfer by conjugation but can be conjugated in the presence of conjugative plasmids (52).

Evolution of the Shigella virulence plasmid.

Recent genetic analyses suggest that shigellae do not constitute a distinct genus as traditionally believed but rather are within the genus of E. coli, much like the pathogenic E. coli (47). These analyses indicate that Shigella emerged from E. coli seven or eight independent times during evolution, leading to three clusters of Shigella, each of which contains serotypes from multiple traditional species, and four or five additional forms, each of which contains one traditional serotype (47). The three main Shigella clusters are estimated to have evolved 35,000 to 270,000 years ago, which predates the development of agriculture and makes shigellosis one of the early infectious diseases of humans (47).

The defining event each time Shigella arose was almost certainly the acquisition of an historical precursor of the current-day virulence plasmid. The data also suggest that the loss of specific catabolic pathways (inability to utilize lactose and mucate and to decarboxylate lysine), loss of motility, and expansion of O-antigen diversity that are characteristic of Shigella strains occurred more recently than the acquisition of the plasmid (47). In addition, curli loci have been insertionally inactivated in Shigella (51). Since the plasmid was acquired at distinct times, one would predict that differences reflecting the evolution of the plasmid could be obtained by genetic comparison of virulence plasmids of the seven different Shigella evolutionary groups. Subsequent to the acquisition of the virulence plasmid, divergence of Shigella clones from E. coli would involve clonal divergence (accumulation of mutations by base substitution), horizontal transfer of genetic material from other species, and loss of gene sequences that interfere with pathogenicity (35, 42).

Certain horizontal gene transfer events have been key to the evolution of Shigella. A quintessential feature of Shigella is its ability to invade mammalian cells and access the cell cytoplasm, defining a niche unique among enteric gram-negative bacteria, with the exception of enteroinvasive E. coli. Thus, the acquisition and evolution of the ipa-mxi-spa pathogenicity island, which encodes all of the genes required for cell invasion and phagolysosomal lysis, permitted a major alteration in pathogenesis (24). Likewise, the acquisition of virG (icsA), which mediates actin assembly on Shigella, and virF and virB, the regulators of the virG and ipa-mxi-spa loci, were key to the emergence of Shigella. Since all Shigella serotypes contain these loci, they were probably all present on the prototypic virulence plasmid.

It is not known which ORFs were acquired subsequent to the acquisition of the plasmid by the first Shigella strain. One might predict that access to an intracellular niche would present the organism with selective pressure to acquire certain factors that would not have provided selective advantage previously. These might include factors that permit resistance to the cellular immune response of the host or utilization of nutrients present in the host cytoplasm. Of note, the cellular immune response is ineffective against Shigella in animal models (72), and Shigella-specific cytotoxic T lymphocytes have not been isolated from convalescent individuals. In addition, factors that permit the bacterium to optimize its lifestyle in the human colon may also have been acquired after acquisition of the prototypic virulence plasmid. An example of this is the acquisition by horizontal transfer of O-antigen genes, such as those present on the virulence plasmid of S. sonnei, and subsequent inactivation of native O-antigen genes (30). Serotypic diversity due to the variations in O antigen is seen among Shigella strains. Such diversity likely facilitates evasion of the host humoral immune response. Further analysis of the function of many ORFs of unknown function on pWR501 as well as comparative analysis of virulence plasmids from other Shigella serotypes will allow a more complete characterization of the evolution of the plasmid.

ACKNOWLEDGMENTS

We thank Guy Plunkett III for assistance with DNA analysis and helpful discussions, Vessela Ivanova for assistance in the isolation of DNA and analysis of the replicon, Sara Klink, George F. Mayhew, Guy Plunkett III, and Helen J. Wing for help with confirming the sequence assembly, Nicole Perna for critical reading of the manuscript, Luther Lindler for many helpful discussions, and the sequencing teams of the Genome Center of Wisconsin for excellent technical work.

The sequencing work was supported by NIH/NIAID, and the sequence annotation was supported by NIH grant AI43562 (M.B.G.).

ADDENDUM IN PROOF

Buchrieser et al. (C. Buchrieser, P. Glaser, C. Rusniok, H. Nedjeri, H. d'Hauteville, F. Kunst, P. Sansonetti, and C. Parsot, Mol. Microbiol. 38:760–771, 2000) have also recently sequenced and analyzed the Shigella flexneri virulence plasmid.

REFERENCES

  • 1. Bahassi E M, O'Dea M H, Allali N, Messens J, Gellert M, Couturier M. Interactions of CcdB with DNA gyrase. Inactivation of GyrA, poisoning of the gyrase-DNA complex, and the antidote action of CcdA. J Biol Chem. 1999;274:10936–10944. doi: 10.1074/jbc.274.16.10936.
  • 2. Benjelloun-Touimi Z, Tahar M S, Montecucco C, Sansonetti P J, Parsot C. SepA, the 110 kDa protein secreted by Shigella flexneri: two domain structure and proteolytic activity. Microbiology. 1998;144:1815–1822. doi: 10.1099/00221287-144-7-1815.
  • 3. Blattner F R, Plunkett G, Bloch C A, Perna N T, Burland V, Riley M, Collado-Vides J, Glasner J D, Rode C K, Mayhew G F, Gregor J, Davis N W, Kirkpatrick H A, Goeden M A, Rose D J, Mau B, Shao Y. The complete genome sequence of Escherichia coli K-12. Science. 1997;277:1453–1462. doi: 10.1126/science.277.5331.1453.
  • 4. Bork P. Hundreds of ankyrin-like repeats in functionally diverse proteins: mobile modules that cross phyla horizontally? Proteins. 1993;17:363–374. doi: 10.1002/prot.340170405.
  • 5. Borodovsky M, McIninch J. GeneMark: parallel gene recognition for both DNA strands. Comput Chem. 1993;17:123–133.
  • 6. Bruand C, Ehrlich S D. UvrD-dependent replication of rolling-circle plasmids in Escherichia coli. Mol Microbiol. 2000;35:204–210. doi: 10.1046/j.1365-2958.2000.01700.x.
  • 7. Burland V, Daniels D L, Plunkett G, Blattner F R. Genome sequencing on both strands: the Janus strategy. Nucleic Acids Res. 1993;21:3385–3390. doi: 10.1093/nar/21.15.3385.
  • 8. Burland V, Shao Y, Perna N T, Plunkett G, Sofia H J, Blattner F R. The complete DNA sequence and analysis of the large virulence plasmid of Escherichia coli O157:H7. Nucleic Acids Res. 1998;26:4196–4204. doi: 10.1093/nar/26.18.4196.
  • 9. Buysse J M, Venkatesan M, Mills J A, Oaks E V. Molecular characterization of a trans-acting, positive effector (ipaR) of invasion plasmid antigen synthesis in Shigella flexneri serotype 5. Microb Pathog. 1990;8:197–211. doi: 10.1016/0882-4010(90)90047-t.
  • 10. Clerget M. Site-specific recombination promoted by a short DNA segment of plasmid R1 and by a homologous segment in the terminus region of the Escherichia coli chromosome. New Biol. 1991;3:780–788.
  • 11. Cole S T, Brosch R, Parkhill J, Garnier T, Churcher C, Harris D, Gordon S V, Eiglmeier K, Gas S, Barry C E, Tekaia F, Badcock K, Basham D, Brown D, Chillingworth T, Connor R, Davies R, Devlin K, Feltwell T, Gentles S, Hamlin N, Holroyd S, Hornsby T, Jagels K, Barrell B G, et al. Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature. 1998;393:537–544. doi: 10.1038/31159.
  • 12. Coster T, Hoge C W, VandeVerg L, Hartman A B, Oaks E V, Venkatesan M M, Cohen D, Robin G, Fontaine-Thompson A, Sansonetti P J, Hale T L. Vaccination against shigellosis with attenuated Shigella flexneri 2a strain SC602. Infect Immun. 1999;67:3437–3443. doi: 10.1128/iai.67.7.3437-3443.1999.
  • 13. Cousineau B, Lawrence S, Smith D, Belfort M. Retrotransposition of a bacterial group II intron. Nature. 2000;404:1018–1021. doi: 10.1038/35010029.
  • 14. Davis M A, Radnedge L, Martin K A, Hayes F, Youngren B, Austin S J. The P1 ParA protein and its ATPase activity play a direct role in the segregation of plasmid copies to daughter cells. Mol Microbiol. 1996;21:1029–1036. doi: 10.1046/j.1365-2958.1996.721423.x.
  • 15. Demers B, Sansonetti P J, Parsot C. Induction of type III secretion in Shigella flexneri is associated with differential control of transcription of genes encoding secreted proteins. EMBO J. 1998;17:2894–2903. doi: 10.1093/emboj/17.10.2894.
  • 16. Edwards C J, Innes D J, Burns D M, Beacham I R. UDP-sugar hydrolase isoenzymes in Salmonella enterica and Escherichia coli: silent alleles of ushA in related strains of group I Salmonella isolates, and of ushB in wild-type and K12 strains of E. coli, indicate recent and early silencing events, respectively. FEMS Microbiol Lett. 1993;114:293–298. doi: 10.1111/j.1574-6968.1993.tb06588.x.
  • 17. Egile C, d'Hauteville H, Parsot C, Sansonetti P J. SopA, the outer membrane protease responsible for polar localization of IcsA in Shigella flexneri. Mol Microbiol. 1997;23:1063–1073. doi: 10.1046/j.1365-2958.1997.2871652.x.
  • 18. Fallarino A, Mavrangelos C, Stroeher U H, Manning P A. Identification of additional genes required for O-antigen biosynthesis in Vibrio cholerae O1. J Bacteriol. 1997;179:2147–2153. doi: 10.1128/jb.179.7.2147-2153.1997.
  • 19. Fernandez-Prada C M, Hoover D L, Tall B D, Hartman A B, Kopelowitz J, Venkatesan M M. Shigella flexneri IpaH7.8 facilitates escape of virulent bacteria from the endocytic vacuoles of mouse and human macrophages. Infect Immun. 2000;68:3608–3619. doi: 10.1128/iai.68.6.3608-3619.2000.
  • 20. Fernandez-Prada C M, Hoover D L, Tall B D, Venkatesan M M. Human monocyte-derived macrophages infected with virulent Shigella flexneri in vitro undergo a rapid cytolytic event similar to oncosis but not apoptosis. Infect Immun. 1997;65:1486–1496. doi: 10.1128/iai.65.4.1486-1496.1997.
  • 21. Galan J E, Collmer A. Type III secretion machines: bacterial devices for protein delivery into host cells. Science. 1999;284:1322–1328. doi: 10.1126/science.284.5418.1322.
  • 22. Garcia-DelPortillo F, Pucciarelli M G, Casadesus J. DNA adenine methylase mutants of Salmonella typhimurium show defects in protein secretion, cell invasion, and M cell cytotoxicity. Proc Natl Acad Sci USA. 1999;96:11578–11583. doi: 10.1073/pnas.96.20.11578.
  • 23. Gomez-Duarte O G, Kaper J B. A plasmid-encoded regulatory region activates chromosomal eaeA expression in enteropathogenic Escherichia coli. Infect Immun. 1995;63:1767–1776. doi: 10.1128/iai.63.5.1767-1776.1995.
  • 24. Groisman E A, Ochman H. Pathogenicity islands: bacterial evolution in quantum leaps. Cell. 1996;87:791–794. doi: 10.1016/s0092-8674(00)81985-6.
  • 25. Gronlund H, Gerdes K. Toxin-antitoxin systems homologous with relBE of Escherichia coli plasmid P307 are ubiquitous in prokaryotes. J Mol Biol. 1999;285:1401–1415. doi: 10.1006/jmbi.1998.2416.
  • 26. Heithoff D M, Sinsheimer R L, Low D A, Mahan M J. An essential role for DNA adenine methylation in bacterial virulence. Science. 1999;284:967–970. doi: 10.1126/science.284.5416.967.
  • 27. Jackson R W, Athanassopoulos E, Tsiamis G, Mansfield J W, Sesma A, Arnold D L, Gibbon M J, Murillo J, Taylor J D, Vivian A. Identification of a pathogenicity island, which contains genes for virulence and avirulence, on a large native plasmid in the bean pathogen Pseudomonas syringae pathovar phaseolicola. Proc Natl Acad Sci USA. 1999;96:10875–10880. doi: 10.1073/pnas.96.19.10875.
  • 28. Kajava A V. Structural diversity of leucine-rich repeat proteins. J Mol Biol. 1998;277:519–527. doi: 10.1006/jmbi.1998.1643.
  • 29. Kotloff K L, Winickoff J P, Ivanoff B, Clemens J D, Swerdlow D L, Sansonetti P J, Adak G K, Levine M M. Global burden of Shigella infections: implications for vaccine development and implementation of control strategies. Bull W H O. 1999;77:651–666.
  • 30. Lai V, Wang L, Reeves P R. Escherichia coli clone Sonnei (Shigella sonnei) had a chromosomal O-antigen gene cluster prior to gaining its current plasmid-borne O-antigen genes. J Bacteriol. 1998;180:2983–2986. doi: 10.1128/jb.180.11.2983-2986.1998.
  • 31. Mahillon J, Kirkpatrick H A, Kijenski H L, Bloch C A, Rode C K, Mayhew G F, Rose D J, Plunkett G, Burland V, Blattner F R. Subdivision of Escherichia coli K-12 genome for sequencing: manipulation and DNA sequence of transposable elements introducing unique restriction sites. Gene. 1998;223:47–54. doi: 10.1016/s0378-1119(98)00365-5.
  • 32. Mahillon J, Leonard C, Chandler M. IS elements as constituents of bacterial genomes. Res Microbiol. 1999;150:675–687. doi: 10.1016/s0923-2508(99)00124-2.
  • 33. Martinez-Abarca F, Zekri S, Toro N. Characterization and splicing in vivo of a Sinorhizobium meliloti group II intron associated with particular insertion sequences of the IS630-Tc1/IS3 retrotransposon superfamily. Mol Microbiol. 1998;28:1295–1306. doi: 10.1046/j.1365-2958.1998.00894.x.
  • 34. Maurelli A T, Blackmon B, Curtiss R. Loss of pigmentation in Shigella flexneri 2a is correlated with loss of virulence and virulence-associated plasmid. Infect Immun. 1984;43:397–401. doi: 10.1128/iai.43.1.397-401.1984.
  • 35. Maurelli A T, Fernandez R E, Bloch C A, Rode C K, Fasano A. “Black holes” and bacterial pathogenicity: a large genomic deletion that enhances the virulence of Shigella spp. and enteroinvasive Escherichia coli. Proc Natl Acad Sci USA. 1998;95:3943–3948. doi: 10.1073/pnas.95.7.3943.
  • 36. Mills J A, Venkatesan M, Baron L S, Buysse J M. Spontaneous insertion of an IS1-like element into the virF gene is responsible for avirulence in opaque colonial variants of Shigella flexneri 2a. Infect Immun. 1992;60:175–182. doi: 10.1128/iai.60.1.175-182.1992.
  • 37. Nakata N, Tobe T, Fukuda I, Suzuki T, Komatsu K, Yoshikawa M, Sasakawa C. The absence of a surface protease, OmpT, determines the intercellular spreading ability of Shigella: the relationship between the ompT and kcpA loci. Mol Microbiol. 1993;9:459–468. doi: 10.1111/j.1365-2958.1993.tb01707.x.
  • 38. Nataro J P, Seriwatana J, Fasano A, Maneval D R, Guers L D, Noriega F, Dubovsky F, Levine M M, Morris J G. Identification and cloning of a novel plasmid-encoded enterotoxin of enteroinvasive Escherichia coli and Shigella strains. Infect Immun. 1995;63:4721–4728. doi: 10.1128/iai.63.12.4721-4728.1995.
  • 39. Newland J W, Hale T L, Formal S B. Genotypic and phenotypic characterization of an aroD deletion-attenuated Escherichia coli K12-Shigella flexneri hybrid vaccine expressing S. flexneri 2a somatic antigen. Vaccine. 1992;10:766–776. doi: 10.1016/0264-410x(92)90512-i.
  • 40. Nordstrom K, Wagner E G. Kinetic aspects of control of plasmid replication by antisense RNA. Trends Biochem Sci. 1994;19:294–300. doi: 10.1016/0968-0004(94)90008-6.
  • 41. Nordstrom M, Nordstrom K. Control of replication of FII plasmids: comparison of the basic replicons and of the copB systems of plasmids R100 and R1. Plasmid. 1985;13:81–87. doi: 10.1016/0147-619x(85)90060-5.
  • 42. Ochman H, Lawrence J G, Groisman E A. Lateral gene transfer and the nature of bacterial innovation. Nature. 2000;405:299–304. doi: 10.1038/35012500.
  • 43. Ohtsubo H, Ryder T B, Maeda Y, Armstrong K, Ohtsubo E. DNA replication of the resistance plasmid R100 and its control. Adv Biophys. 1986;21:115–132. doi: 10.1016/0065-227x(86)90018-3.
  • 44. Perna N T, Mayhew G F, Posfai G, Elliott S, Donnenberg M S, Kaper J B, Blattner F R. Molecular evolution of a pathogenicity island from enterohemorrhagic Escherichia coli O157:H7. Infect Immun. 1998;66:3810–3817. doi: 10.1128/iai.66.8.3810-3817.1998.
  • 45. Perry R D, Straley S C, Fetherston J D, Rose D J, Gregor J, Blattner F R. DNA sequence and analysis of the low-Ca2+-response plasmid pCD1 of Yersinia pestis KIM5. Infect Immun. 1998;66:4611–4623. doi: 10.1128/iai.66.10.4611-4623.1998.
  • 46. Philpott D J, Edgeworth J D, Sansonetti P J. The pathogenesis of Shigella flexneri infection: lessons from in vitro and in vivo studies. Philos Trans R Soc Lond B. 2000;355:575–586. doi: 10.1098/rstb.2000.0599.
  • 47. Pupo G M, Lan R, Reeves P R. Multiple independent origins of Shigella clones of Escherichia coli and convergent evolution of many of their characteristics. Proc Natl Acad Sci USA. 2000;97:10567–10572. doi: 10.1073/pnas.180094797.
  • 48. Rajakumar K, Luo F, Sasakawa C, Adler B. Evolutionary perspective on a composite Shigella flexneri 2a virulence plasmid-borne locus comprising three distinct genetic elements. FEMS Microbiol Lett. 1996;144:13–20. doi: 10.1111/j.1574-6968.1996.tb08502.x.
  • 49. Rakin A, Noelting C, Schubert S, Heesemann J. Common and specific characteristics of the high-pathogenicity island of Yersinia enterocolitica. Infect Immun. 1999;67:5265–5274. doi: 10.1128/iai.67.10.5265-5274.1999.
  • 50. Runyen-Janecky L J, Hong M, Payne S M. The virulence plasmid-encoded impCAB operon enhances survival and induced mutagenesis in Shigella flexneri after exposure to UV radiation. Infect Immun. 1999;67:1415–1423. doi: 10.1128/iai.67.3.1415-1423.1999.
  • 51. Sakellaris H, Hannink N K, Rajakumar K, Bulach D, Hunt M, Sasakawa C, Adler B. Curli loci of Shigella spp. Infect Immun. 2000;68:3780–3783. doi: 10.1128/iai.68.6.3780-3783.2000.
  • 52. Sansonetti P J, Kopecko D J, Formal S B. Involvement of a plasmid in the invasive ability of Shigella flexneri. Infect Immun. 1982;35:852–860. doi: 10.1128/iai.35.3.852-860.1982.
  • 53. Sasakawa C, Kamata K, Sakai T, Murayama S Y, Makino S, Yoshikawa M. Molecular alteration of the 140-megadalton plasmid associated with loss of virulence and Congo red binding activity in Shigella flexneri. Infect Immun. 1986;51:470–475. doi: 10.1128/iai.51.2.470-475.1986.
  • 54. Sasakawa C, Makino S, Kamata K, Yoshikawa M. Isolation, characterization, and mapping of Tn5 insertions into the 140-megadalton invasion plasmid defective in the mouse Sereny test in Shigella flexneri 2a. Infect Immun. 1986;54:32–36. doi: 10.1128/iai.54.1.32-36.1986.
  • 55. Sayeed S, Reaves L, Radnedge L. The stability region of the large virulence plasmid of Shigella flexneri encodes an efficient postsegregational killing system. J Bacteriol. 2000;182:2416–2421. doi: 10.1128/jb.182.9.2416-2421.2000.
  • 56. Schechter L M, Damrauer S D, Lee C A. Two AraC/XylS family members can independently counteract the effect of repressing sequences upstream of the hilA promoter. Mol Microbiol. 1999;32:629–642. doi: 10.1046/j.1365-2958.1999.01381.x.
  • 57. Schneiker S, Kosier B, Puhler A, Selbitschka W. The Sinorhizobium meliloti insertion sequence (IS) element ISRm14 is related to a previously unrecognized IS element located adjacent to the Escherichia coli locus of enterocyte effacement (LEE) pathogenicity island. Curr Microbiol. 1999;39:274–281. doi: 10.1007/s002849900459.
  • 58. Sharpe M E, Chatwin H M, Macpherson C, Withers H L, Summers D K. Analysis of the ColE1 stability determinant Rcd. Microbiology. 1999;145:2135–2144. doi: 10.1099/13500872-145-8-2135.
  • 59. Silva R M, Saadi S, Maas W K. A basic replicon of virulence-associated plasmids of Shigella spp. and enteroinvasive Escherichia coli is homologous with a basic replicon in plasmids of IncF groups. Infect Immun. 1988;56:836–842. doi: 10.1128/iai.56.4.836-842.1988.
  • 60. Somerville J E, Cassiano L, Bainbridge B, Cunningham M D, Darveau R P. A novel Escherichia coli lipid A mutant that produces antiinflammatory lipopolysaccharide. J Clin Investig. 1996;97:359–365. doi: 10.1172/JCI118423.
  • 61. Tabuchi A, Min Y N, Womble D D, Rownd R H. Autoregulation of the stability operon of IncFII plasmid NR1. J Bacteriol. 1992;174:7629–7634. doi: 10.1128/jb.174.23.7629-7634.1992.
  • 62. Taira S, Riikonen P, Saarilahti H, Sukupolvi S, Rhen M. The mkaC virulence gene of the Salmonella serovar typhimurium 96 kb plasmid encodes a transcriptional activator. Mol Gen Genet. 1991;228:381–384. doi: 10.1007/BF00260630.
  • 63. Tanaka K, Rogi T, Hiasa H, Miao D M, Honda Y, Nomura N, Sakai H, Komano T. Comparative analysis of functional and structural features in the primase dependent priming signals, G sites, from phages and plasmids. J Bacteriol. 1994;176:3606–3613. doi: 10.1128/jb.176.12.3606-3613.1994.
  • 64. Tanaka K, Sakai H, Honda Y, Nakamura T, Higashi A, Komano T. Structural and functional features of cis-acting sequences in the basic replicon of plasmid ColIb-P9. Nucleic Acids Res. 1992;20:2705–2710. doi: 10.1093/nar/20.11.2705.
  • 65. Tatusov R L, Galperin M Y, Natale D A, Koonin E V. The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res. 2000;28:33–36. doi: 10.1093/nar/28.1.33.
  • 66. Tavakoli N, Comanducci A, Dodd H M, Lett M C, Albiger B, Bennett P. IS1294, a DNA element that transposes by RC transposition. Plasmid. 2000;44:66–84. doi: 10.1006/plas.1999.1460.
  • 67. Tobe T, Hayashi T, Han C G, Schoolnik G K, Ohtsubo E, Sasakawa C. Complete DNA sequence and structural analysis of the enteropathogenic Escherichia coli adherence factor plasmid. Infect Immun. 1999;67:5455–5462. doi: 10.1128/iai.67.10.5455-5462.1999.
  • 68. Tran van Nhieu G, Sansonetti P J. Mechanism of Shigella entry into epithelial cells. Curr Opin Microbiol. 1999;2:51–55. doi: 10.1016/s1369-5274(99)80009-5.
  • 69. Venkatesan M M, Buysse J M, Hartman A. Sequence variation in two ipaH genes of Shigella flexneri and homology to the LRG-like family of proteins. Mol Microbiol. 1991;5:2435–2445. doi: 10.1111/j.1365-2958.1991.tb02089.x.
  • 70. Venkatesan M M, Buysse J M, Kopecko D J. Characterization of invasion plasmid antigen genes (ipaBCD) from Shigella flexneri. Proc Natl Acad Sci USA. 1988;85:9317–9321. doi: 10.1073/pnas.85.23.9317.
  • 71. Venkatesan M M, Fernandez-Prada C, Buysse J M, Formal S B, Hale T L. Virulence phenotype and genetic characteristics of the T-32 Istrati Shigella flexneri vaccine strain. Vaccine. 1991;9:358–363. doi: 10.1016/0264-410x(91)90064-d.
  • 72. Way S S, Borczuk A C, Goldberg M B. Thymic independence of adaptive immunity to the intracellular pathogen Shigella flexneri serotype 2a. Infect Immun. 1999;67:3970–3979. doi: 10.1128/iai.67.8.3970-3979.1999.
  • 73. Wu R, Wang X, Womble D D, Rownd R H. Expression of the repA1 gene of IncFII plasmid NR1 is translationally coupled to expression of an overlapping leader peptide. J Bacteriol. 1992;174:7620–7628. doi: 10.1128/jb.174.23.7620-7628.1992.
  • 74. Yoshioka Y, Fujita Y, Ohtsubo E. Nucleotide sequence of the promoter-distal region of the tra operon of plasmid R100. J Mol Biol. 1990;214:39–53. doi: 10.1016/0022-2836(90)90145-C.
  • 75. Zychlinsky A, Kenny B, Menard R, Prevost M-C, Holland I B, Sansonetti P J. IpaB mediates macrophage apoptosis induced by Shigella flexneri. Mol Microbiol. 1994;11:619–627. doi: 10.1111/j.1365-2958.1994.tb00341.x.

Articles from Infection and Immunity are provided here courtesy of American Society for Microbiology (ASM)
