Abstract
Current bacterial taxonomy is mostly based on phenotypic criteria, which may yield misleading interpretations in classification and identification. As a result, bacteria not closely related may be grouped together as a genus or species. For pathogenic bacteria, incorrect classification or misidentification could be disastrous. There is therefore an urgent need for appropriate methodologies to classify bacteria according to phylogeny and corresponding new approaches that permit their rapid and accurate identification. For this purpose, we have devised a strategy enabling us to resolve phylogenetic clusters of bacteria by comparing their genome structures. These structures were revealed by cleaving genomic DNA with the endonuclease I-CeuI, which cuts within the 23S ribosomal DNA (rDNA) sequences, and by mapping the resulting large DNA fragments with pulsed-field gel electrophoresis. We tested this experimental system on two representative bacterial genera: Salmonella and Pasteurella. Among Salmonella spp., I-CeuI mapping revealed virtually indistinguishable genome structures, demonstrating a high degree of structural conservation. Consistent with this, 16S rDNA sequences are also highly conserved among the Salmonella spp. In marked contrast, the Pasteurella strains have very different genome structures among and even within individual species. The divergence of Pasteurella was also reflected in 16S rDNA sequences and far exceeded that seen between Escherichia and Salmonella. Based on this diversity, the Pasteurella haemolytica strains we analyzed could be divided into 14 phylogenetic groups and the Pasteurella multocida strains could be divided into 9 groups. If criteria for defining bacterial species or genera similar to those used for Salmonella and Escherichia coli were applied, the striking phylogenetic diversity would allow bacteria in the currently recognized species of P. multocida and P. haemolytica to be divided into different species, genera, or even higher ranks. On the other hand, strains of Pasteurella ureae and Pasteurella pneumotropica are very similar to those of P. multocida in both genome structure and 16S rDNA sequence and should be regarded as strains within this species. We conclude that large-scale genome structure can be a sensitive indicator of phylogenetic relationships and that, therefore, I-CeuI-based genomic mapping is an efficient tool for probing the phylogenetic status of bacteria.
Bacterial genera or species, as defined by current taxonomic approaches (18, 44), may include bacteria that in fact are not closely related phylogenetically. For example, the DNA G+C content of accepted Bacillus species varies from 32 to 69 mol% (8), although species of a genus are not normally expected to differ from one another by more than 10 mol%. This situation results largely from the lack of appropriate methodologies for classifying bacteria on a phylogenetic basis. Consequent errors in bacterial identification could be quite misleading both in basic research and in practical applications. Therefore, the ability to efficiently distinguish phylogenetic groups of bacteria within an apparent taxonomic species is desirable, especially for the correct diagnosis and timely treatment of infectious diseases caused by bacterial pathogens. Molecular techniques such as RNA and DNA or protein sequencing have been extremely useful in establishing the phylogenetic relationships among organisms and in discriminating bacteria by phylogeny. However, such sequence analyses provide information for only a tiny (though important) portion of the genome, and different choices of sequences for analysis can lead to remarkably variable conclusions (10, 21). Furthermore, it can be difficult to distinguish recently diverged bacteria solely on the basis of limited DNA or RNA sequence analysis. For example, 16S rRNA (or ribosomal DNA [rDNA]) sequencing (16), currently the most frequently used technique for phylogenetic studies of bacteria, is efficient only in resolving genera and higher taxa; for ranks lower than genus it loses resolution.
The physical structure of bacterial genomes as revealed by endonuclease mapping, on the other hand, could provide an alternative parameter to be employed in phylogenetic studies (26, 29). The mapping approach involves the use of endonucleases such as I-CeuI (17), XbaI, AvrII, and others (24, 25). The genomic DNA fragments generated are then separated according to size by pulsed-field gel electrophoresis (PFGE). The maps thus established reveal the size and geometry of the genome, the copy number and genomic distribution of rrn operons, and major genomic events, e.g., insertions, duplications, deletions, inversions, and translocations of DNA segments (29–31). Closely related bacteria can have highly similar genome structures; the similarity decreases and eventually vanishes as the phylogenetic relationship becomes more distant.
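As a minimal sketch of how a complete I-CeuI digest can be summarized, the snippet below assumes a circular chromosome in which every cleavage site lies in one rrn operon, so that the number of fragments equals the rrn copy number and the fragment sizes sum to the genome size. The function name and the fragment sizes are hypothetical illustrations, not data from this study.

```python
def summarize_complete_digest(fragment_sizes_kb):
    """Summarize a complete I-CeuI digest of a circular genome.

    Assumption: each I-CeuI site lies in one rrn operon, so a circular
    chromosome cut at n sites yields n fragments.
    """
    if not fragment_sizes_kb:
        raise ValueError("at least one fragment is required")
    return {
        "rrn_copies": len(fragment_sizes_kb),
        "genome_size_kb": sum(fragment_sizes_kb),
    }


# Hypothetical fragment sizes (kb), for illustration only.
example = summarize_complete_digest([1200, 600, 400, 250, 100, 50])
print(example)
```

For a linear replicon the fragment count would exceed the site count by one, so the circularity assumption has to be checked against the map itself.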
In this study, we examine the relatedness of bacterial populations within taxonomic species by revealing and comparing their genome structures. We focus on two genera of pathogenic bacteria: Salmonella, which consists of more than 2,300 known species, or serovars (38, 39), that display a highly conserved genetic background (9, 19, 22, 23), and Pasteurella, which consists of about 25 species of both genetically and biologically more diverse bacteria (3, 7, 11, 36, 40). Our data indicate that the analysis of genome structure readily provides information useful to the elucidation and identification of phylogenetic clusters.
MATERIALS AND METHODS
Bacterial strains.
The Salmonella strains were obtained from J. Lederberg (LT2) (45), R. K. Selander (RKS strains) (1, 5, 6), and M. Y. Popoff (CIP8231 and 156-87) and are listed in Table 1. Pasteurella strains were obtained from the American Type Culture Collection, W. Albritton, S. Lundberg, and R. Lo and are listed in Table 2.
TABLE 1.
Salmonella strains
Strain no. | Original no. | Subgenus (species) | Antigenic formula | Source |
---|---|---|---|---|
LT2 | LT2 | Salmonella I (S. typhimurium) | 1,4,[5],12:i:1,2 | |
RKS4993 | ATCC 9150 | Salmonella I (S. paratyphi A) | 1,2,12:a:− | |
RKS3222 | DMS155/76 | Salmonella I (S. paratyphi B) | 1,4,[5],12:b:1,2 | Human |
RKS3614 | IP87/87 | Salmonella I (S. java) | 1,4,[5],12:b:1,2 | Cow |
RKS5078 | | Salmonella I (S. gallinarum) | 1,9,12:−:− | |
RKS53 | CDCSSU7998 | Salmonella I (S. enteritidis) | 1,9,12:g,m:− | |
RKS2985 | CDC151-85 | Salmonella II | 58:d:z6 | Human |
RKS2993 | CDC3472-64 | Salmonella II | 42:f:g,t:− | |
RKS2990 | 3940-62 | Salmonella II | 47:b:1,5 | Human |
RKS3007 | 42-87 | Salmonella II | 60:g,m,t:z6 | |
RKS2980 | CDC346-86 | Salmonella IIIa | 62:z4,z23:− | Corn snake |
RKS2983 | CDC409-85 | Salmonella IIIa | 62:z36:− | Human |
RKS2982 | 187-87 | Salmonella IIIa | 48:z4,z24:− | Human |
RKS2984 | 98-84 | Salmonella IIIa | 18:z4,z32:− | Human |
RKS2978 | CDC156-87 | Salmonella IIIb | 501,2,3:k:z | Human |
CIP8231 | CIP8231 | Salmonella IIIb | | |
156-87 | 156-87 | Salmonella IIIb | | |
RKS2975 | 128-87 | Salmonella IIIb | 48:i:z | Human |
RKS3027 | CDC287-86 | Salmonella IV | 16:z4,z32:− | Human |
RKS3022 | 484-85 | Salmonella IV | 48:g,z51:− | Human |
RKS3025 | 145-86 | Salmonella IV | 43:z4,z23:− | Vacuum cleaner |
RKS3026 | 208-86 | Salmonella IV | 16:z4,z32:− | Iguana |
RKS3041 | CDC750-72 | Salmonella V (S. bongori) | 66:z41:− | Frog |
RKS3044 | CDC2703-76 | Salmonella V (S. bongori) | 48:z41:− | Parakeet |
RKS3040 | 327-80 | Salmonella V (S. bongori) | 66:z65:− | Fishmeal |
RKS3042 | 1308-83 | Salmonella V (S. bongori) | 48:a:− | Human |
RKS2995 | CDC1363-65 | Salmonella VI | 45:a:e,n,x | |
RKS3052 | 4603-68 | Salmonella VI | 11:b:1,7 | |
RKS3053 | 1411-60 | Salmonella VI | 1,6,14,25:a:e,n,x | Coconut |
RKS3055 | 2131-71 | Salmonella VI | 41:b:1,7 | Opossum |
RKS3014 | CDC5039-68 | Salmonella VII | 40:z4,z24:[z39] | Human |
RKS3804 | 513/69 | Salmonella VII | 41:z4,z23:− | |
TABLE 2.
Pasteurella strains
Species and strain no. | Alternative no. | Host | Serotype | Genome group | GenBank nucleotide sequence accession no. |
---|---|---|---|---|---|
P. haemolytica | | | | | |
H099 | ATCC 33367 | Sheep | T3 | Pha1 | AF176279 |
H100 | ATCC 33368 | Sheep | T4 | Pha1 | |
H106 | ATCC 33374 | Sheep | | Pha1 | AF176280 |
H156 | | Sheep | | Pha1 | |
H157 | | Sheep | | Pha1 | |
H158 | | Sheep | | Pha1 | |
H160 | | Sheep | | Pha1 | AF176281 |
H161 | | Bighorn | | Pha1 | |
H162 | | Bighorn | | Pha1 | |
H163 | | Bighorn | | Pha1 | |
H164 | | Bighorn | | Pha1 | |
H159 | | Sheep | | Pha1 | |
H059 | | Cattle | | Pha2 | AF176282 |
H044 | | Cattle | A1 | Pha3 | |
H046 | | Cattle | A1 | Pha3 | |
H102 | ATCC 33371 | Sheep | A7 | Pha4 | AF176283 |
H103 | ATCC 33372 | Sheep | A8 | Pha5 | AF176284 |
H104 | ATCC 33373 | Sheep | A9 | Pha6 | |
H094 | | Cattle | A1 | Pha6 | |
H105 | ATCC 33369 | Sheep | A5 | Pha7 | |
H093 | | Cattle | A1 | Pha8 | |
H095 | | Cattle | A1 | Pha9 | |
H096 | | Cattle | A1 | Pha10 | |
H196 | | Cattle | A1 | Pha10 | |
H045 | | Cattle | A1 | Pha11 | AF176285 |
H187 | | Cattle | | Pha12 | AF176286 |
H188 | | Cattle | | Pha13 | AF176287 |
H060 | | Sheep | | Pha14 | AF176288 |
P. multocida | | | | | |
H135 | NCTC10322 | | | Pmu1 | AF176289 |
H134 | | Human | | Pmu1 | |
H048 | | Bovine | | Pmu1 | |
H063 | | Chicken | | Pmu1 | |
H064 | | Pig | | Pmu1 | |
H119 | | Bovine | | Pmu1 | |
H121 | | Bovine | | Pmu1 | |
H122 | | Bovine | | Pmu1 | |
H125 | | Bovine | | Pmu1 | |
H126 | | Bovine | | Pmu1 | |
H127 | | Bovine | | Pmu1 | |
H236 | | Pig | | Pmu1 | AF176290 |
H229 | | Pig | | Pmu2 | AF176291 |
H061 | | Mallard duck | | Pmu2 | |
H062 | | Turkey | | Pmu2 | |
H065 | | Mallard duck | | Pmu2 | |
H118 | | Bovine | | Pmu2 | |
H130 | | Human | | Pmu2 | |
H140 | ATCC 15742 | Turkey | | Pmu2 | |
H047 | | Pig | | Pmu2 | |
H139 | ATCC 11039 | Domestic fowl | | Pmu2 | AF176292 |
H232 | | Pig | | Pmu2 | |
H233 | | Pig | | Pmu2 | |
H234 | | Pig | | Pmu2 | |
H116 | | Bovine | | Pmu3 | AF176293 |
H117 | | Bovine | | Pmu3 | AF176294 |
H120 | | Bovine | | Pmu3 | |
H239 | | Pig | | Pmu4 | AF176295 |
H235 | | Pig | | Pmu5 | AF176296 |
H231 | | Pig | | Pmu6 | AF176297 |
H129 | | Bovine | | Pmu7 | AF176298 |
H123 | | Bovine | | Pmu7 | AF176299 |
H154 | | Bovine | | Pmu8 | AF176300 |
P. pneumotropica | | | | | |
M10 | | | | Pmu1 | |
P. ureae | | | | | |
M14 | | | | Pmu9 | AF176301 |
Enzymes and chemicals.
I-CeuI was purchased from New England BioLabs; proteinase K was from Boehringer Mannheim. Most other chemicals were from the Sigma Chemical Co.
PFGE methods and genomic mapping.
Preparation of intact genomic DNA, endonuclease cleavage of DNA in agarose blocks, and separation of the DNA fragments by PFGE were as described previously (27, 29). PFGE was performed with the Bio-Rad contour-clamped homogeneous electric field (CHEF) mapper, Bio-Rad CHEF DRII, or Hoefer Hula electrophoresis system. Genomic mapping methods with I-CeuI were as described and modified previously (26, 33).
16S rDNA sequencing and data analysis.
Partial nucleotide sequencing of 16S rDNA, from nucleotides 800 to 1450 (Escherichia coli K-12 numbering), was carried out for 23 representative Pasteurella strains. Genomic DNA was used directly as templates for the sequencing after isolation with proteinase K, fragmentation by sonication, and purification with the Qiagen kit. The following primers were used: Pas1, 5′TACGG(C/T)TACCTTGTTACGACT3′ (for nucleotides 1130 to 1450), and Pas2, 5′TCTCCTTTGAGTTCCCGA3′ (for nucleotides 800 to 1130). The generated 16S rDNA sequences and some reference sequences obtained from GenBank were analyzed with the PHYLIP programs (13, 14).
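Primer Pas1 contains one degenerate position, written as (C/T). A minimal sketch of how such a notation expands into the concrete oligonucleotides it represents is given below; the helper name is hypothetical, and only the parenthesized "X/Y" notation used in the text is handled.

```python
import itertools
import re


def expand_degenerate(primer):
    """Expand '(X/Y)' alternatives in a primer into all concrete sequences."""
    # Split into literal runs and parenthesized alternatives.
    parts = re.split(r"(\([ACGT](?:/[ACGT])+\))", primer)
    choices = []
    for part in parts:
        if part.startswith("("):
            choices.append(part[1:-1].split("/"))
        elif part:
            choices.append([part])
    return ["".join(combo) for combo in itertools.product(*choices)]


PAS1 = "TACGG(C/T)TACCTTGTTACGACT"  # primer Pas1 as written in the text
variants = expand_degenerate(PAS1)
print(variants)
```

A primer without degenerate positions, such as Pas2, is returned unchanged as a single sequence.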
Nucleotide sequence accession numbers.
The GenBank accession numbers for the 23 16S rDNA segments sequenced in this study are given in Table 2.
RESULTS
Construction of genome maps by complete and incomplete I-CeuI cleavages of genomic DNA.
To explore the possible utility of structural analysis in the comparison of diverse bacterial genomes, we elected first to validate the method with a group of closely related bacteria and then to apply it to bacteria for which greater diversity might be anticipated. Two very special features of I-CeuI make it an excellent endonuclease for bacterial genomic mapping: (i) it directly reveals the copy number of rrn operons because it cleaves exclusively within the 23S rRNA gene (26, 35) and (ii) it easily generates incomplete cleavage products at a wide range of concentrations. The incomplete cleavage bands on a PFGE gel help determine the neighboring relationships among completely cleaved DNA fragments. Figure 1A shows the I-CeuI cleavage pattern of genomic DNA of Salmonella typhimurium LT2 on a PFGE gel, with the incomplete as well as the complete cleavage bands indicated; Fig. 1B shows the genome map based on the data from Fig. 1A.
FIG. 1.
Construction of an I-CeuI genome map. (A) I-CeuI cleavage pattern of genomic DNA of S. typhimurium LT2, with bands and their sizes indicated (complete cleavage products labeled with larger letters than the incomplete ones). Lane MW, molecular size markers. (B) Circular genome map based on complete as well as incomplete cleavage data in panel A. PFGE: CHEF Mapper, 30 s ramping to 120 s, 6 V/cm, 120°, 18 h.
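The use of incomplete cleavage bands to infer neighboring fragments can be sketched as follows. This is a simplified illustration, not the mapping procedure of the cited protocols: it only considers partial products made of two fragments, and the function name, size tolerance, and all sizes are hypothetical.

```python
from itertools import combinations


def candidate_neighbors(fragments_kb, partial_bands_kb, tolerance_kb=10):
    """Return fragment pairs whose summed size matches a partial-digest band."""
    pairs = []
    for (a, size_a), (b, size_b) in combinations(sorted(fragments_kb.items()), 2):
        total = size_a + size_b
        if any(abs(total - band) <= tolerance_kb for band in partial_bands_kb):
            pairs.append((a, b, total))
    return pairs


# Hypothetical sizes (kb), not measurements from this study.
fragments = {"A": 1200, "B": 600, "C": 400, "D": 250}
partial_bands = [1000, 650, 1450]
for a, b, total in candidate_neighbors(fragments, partial_bands):
    print(f"{a}+{b} = {total} kb")
```

When two different pairs sum to nearly the same size, the assignment is ambiguous and would need to be resolved with additional partial products or other evidence.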
Genome structure of Salmonella strains.
Based on the extraordinary similarity in genome structure among Salmonella species within subgenus I (29, 32) and even between Salmonella and E. coli (26, 42), a comparable level of similarity was expected for Salmonella strains in all eight subgenera. Figure 2A shows the PFGE patterns of I-CeuI-cleaved genomic DNA of representative strains of the eight Salmonella subgenera, and Fig. 2B shows the maps. There is substantial similarity in the sizes and map positions of the seven I-CeuI fragments among the 32 strains shown, although subtle but unique differences among the subgenera exist. For example, a small fragment C (450 kb) is a unique feature of all subgenus IIIa strains (Fig. 2A and B). Occasionally, unusual fragment sizes were detected in strains of any of the eight subgenera, probably as a result of genomic reorganizations (e.g., the larger B or G fragments in lanes 9, 15, 17, 18, and 31). Because similarities in genome structure among species of the same subgenera are already too great to distinguish readily among them and because our earlier data had demonstrated identical genome structure among strains of the same species, e.g., S. typhimurium (29), S. paratyphi A (28), S. paratyphi B, S. java, and many other Salmonella species (23a), we chose not to present data for strains of individual Salmonella species here. Complete 16S rRNA sequences from ca. 30 Salmonella strains covering more than 20 Salmonella species and from E. coli K-12, obtained from public databases, have more than 98% identity, both among the Salmonella subgenera and between Salmonella and E. coli (data not shown), demonstrating the genetic homogeneity of Salmonella as well as the great similarity between Salmonella and E. coli. These data therefore suggest that not only is sequence divergence between Salmonella and E. coli modest but also that large-scale organizational patterns of the genomes are well conserved (Fig. 2B).
FIG. 2.
Genome structure of Salmonella spp. (A) I-CeuI cleavage patterns of representative Salmonella spp. of all eight subgenera; (B) genome maps based on data in panel A. The circular maps are presented here as linear for more convenient comparisons among them, with only a part of fragment A shown. PFGE: CHEF DRII, 30 s ramping to 120 s, 170 V, 20 h.
Genome structure of Pasteurella strains.
To extend our analysis, we next considered members of Pasteurella spp., for which taxonomy has been more problematic (3). We focused on two Pasteurella species, Pasteurella haemolytica and Pasteurella multocida, the leading causative agents of pasteurellosis in livestock, including hemorrhagic septicemia in cattle and pneumonic infections in sheep and goats. We also included examples of Pasteurella ureae and Pasteurella pneumotropica for comparison. I-CeuI cleavage of genomic DNA generated six bands in all Pasteurella strains except H103, which has seven, possibly resulting from a duplication of the 60-kb fragment (see Fig. 4). All Pasteurella strains have circular genomes, as exemplified by P. haemolytica H099 in Fig. 3, with linear maps presented in Fig. 4B for more convenient comparisons among them. P. haemolytica and P. multocida have very different overall I-CeuI cleavage patterns (Fig. 4A) and genome maps (Fig. 4B). Even within species, differences in I-CeuI cleavage pattern and genome map among the strains are also very obvious. Some strains look very similar when the complete cleavage bands are compared; however, their different incomplete cleavage bands indicate different arrangements of the DNA segments (Fig. 4). Altogether, we made genome maps for 63 Pasteurella strains, including 28 identified as P. haemolytica, 33 as P. multocida, and one each as P. ureae and P. pneumotropica (Table 2). We then tried to divide the Pasteurella strains into groups according to their similarity in genome structure, so that genomes with the same or very similar complete as well as incomplete I-CeuI cleavage patterns were grouped together (Fig. 4 and Table 2).
FIG. 3.
I-CeuI genome map of P. haemolytica H099. (A) PFGE pattern of genomic DNA after I-CeuI cleavage; (B) circular genome map based on the data in panel A. The map was established based on several PFGE runs under different pulsing conditions to resolve all size ranges. In this particular PFGE gel, for example, the resolution is good for fragments of ca. 500 kb and smaller but not for fragment A and its combinations with fragment B or F, which are resolved under other conditions. PFGE: CHEF DRII, 10 s ramping to 90 s, 160 V, 20 h; 10 s ramping to 30 s, 160 V, 6 h; 80 s ramping to 120 s, 120 V, 14 h.
FIG. 4.
Genome structure of Pasteurella spp. (A) I-CeuI cleavage patterns of representative Pasteurella strains. Fragment designations and sizes are given for strain H059 in lane 1. Note that fragments C (60 kb) and F (40 kb) of H059 are shown as four bands on this PFGE gel. This reflects artifactual retardation of some bands that is occasionally seen near the margins of some gels, especially with smaller fragments. (B) Genome maps based on data in panel A. The circular maps are presented here as linear ones for more convenient comparisons. PFGE: CHEF DRII, 10 s ramping to 100 s, 160 V, 20 h; 50 s ramping to 60 s, 160 V, 6 h; 10 s ramping to 20 s, 160 V, 3 h.
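The grouping of strains by similarity of cleavage pattern can be illustrated with a minimal sketch. It compares only complete-digest fragment sizes, whereas the grouping described in the text also takes incomplete cleavage patterns (fragment order) into account; the function names, the tolerance, and all strain names and sizes are hypothetical.

```python
def same_pattern(sizes_a, sizes_b, tolerance_kb=20):
    """True if two digests have the same number of fragments with similar sizes."""
    if len(sizes_a) != len(sizes_b):
        return False
    return all(abs(x - y) <= tolerance_kb
               for x, y in zip(sorted(sizes_a), sorted(sizes_b)))


def group_strains(profiles, tolerance_kb=20):
    """Greedily assign each strain to the first group whose founder it matches."""
    groups = []  # each group is a list of strain names; first entry is the founder
    for strain, sizes in profiles.items():
        for group in groups:
            if same_pattern(profiles[group[0]], sizes, tolerance_kb):
                group.append(strain)
                break
        else:
            groups.append([strain])
    return groups


# Hypothetical strains and fragment sizes (kb).
profiles = {
    "s1": [1100, 500, 300, 150, 60, 40],
    "s2": [1110, 495, 300, 150, 60, 40],
    "s3": [1500, 600, 300, 120, 60, 40],
}
print(group_strains(profiles))
```

Because assignment is greedy and founder-based, the result can depend on the order in which strains are listed; a real analysis would likely need a more careful comparison.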
The P. haemolytica complex.
The 28 P. haemolytica strains were divided into 14 genome groups (Pha groups). They all have a large I-CeuI fragment, 1.1 Mb or larger. Group 1 contains 12 strains that have nearly identical genome maps that are very different from those of all the other P. haemolytica strains. They are biotype T strains, and it has recently been suggested that they be elevated to a new species, Pasteurella trehalosi (43). Group 2 contains only one strain in our collection, and it has a genome structure totally different from that of strains of either group 1 or groups 3 to 14. Groups 3 to 14 each contain one or two strains which have very similar I-CeuI cleavage patterns, indicating similar genetic contents, but different genome maps. H103 in group 5 is a very special case: it has an extra 60-kb fragment probably resulting from a genomic duplication; otherwise it is similar to the rest of the cluster comprising Pha groups 3 to 14.
As shown in Fig. 4B, the Pha groups differ not only in genome structure but also in genome size: Pha groups 1 and 2 have small genomes, 2,130 and 2,190 kb, respectively, and Pha groups 3 to 14 have larger genomes, around 2,600 kb.
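The size differences quoted above can be expressed as percentages with a short calculation. The sketch below uses the genome sizes given in the text and, as an assumption, expresses each difference relative to the larger genome; taking the smaller genome as the reference instead gives about 22% for Pha group 1 versus groups 3 to 14.

```python
def percent_difference(size_a_kb, size_b_kb):
    """Difference between two genome sizes as a percentage of the larger one."""
    larger, smaller = max(size_a_kb, size_b_kb), min(size_a_kb, size_b_kb)
    return 100.0 * (larger - smaller) / larger


# Sizes quoted in the text: Pha group 1, Pha group 2, and roughly
# 2,600 kb for Pha groups 3 to 14.
pha1, pha2, pha3_14 = 2130, 2190, 2600
print(round(percent_difference(pha1, pha3_14), 1))
print(round(percent_difference(pha2, pha3_14), 1))
```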
The P. multocida complex.
To determine whether the diversity seen in the P. haemolytica complex might also exist in other Pasteurella spp., we undertook comparable analyses of P. multocida. The 33 P. multocida strains we analyzed were also quite diverse and were divided into eight groups (Pmu groups). Group 1 has 12 strains and contains the type strain, NCTC10322 (H135). Groups 1, 2 (12 strains), 3 (3 strains), and 4 to 6 (1 strain each) have somewhat similar I-CeuI cleavage patterns, although the lengths of some homologous fragments could be variable (Fig. 4). In contrast, groups 7 (two strains) and 8 (one strain) are dramatically different in genome structure from the majority of P. multocida strains. Interestingly, the P. pneumotropica strain, M10, has a genome structure that is very similar to those of group 1 P. multocida strains and therefore was included with them. The P. ureae strain, M14, was also placed in the P. multocida complex as the sole strain of group 9 because of its great similarity in genome structure to most P. multocida strains.
The striking diversity in genome structure among strains of the same species (e.g., Pha group 1 or 2 versus Pha groups 3 to 14 and Pmu groups 7 and 8 versus groups 1 to 6) and, in contrast, the great similarity among some strains of P. multocida, P. pneumotropica, and P. ureae make it critical to establish the phylogenetic relationships among these bacteria. Molecular methods such as 16S rDNA sequencing would provide data that may either support or challenge our hypothesis that phylogenetically closely related bacteria usually have similar genome structures.
16S rDNA sequence analysis of Pasteurella strains.
We therefore determined portions of the 16S rDNA sequences for 23 representative Pasteurella strains and compared them. We chose bases 1100 to 1450 (E. coli numbering) because this segment is one of the divergent regions of the molecule among different bacteria but not the most divergent region (23a). We tried to avoid sequencing and comparing the most constant as well as the most divergent regions of the 16S rDNA molecules to estimate phylogenetic relationships among these bacteria at a certain level. For example, the overall sequence identity of the chosen segment between the P. haemolytica and P. multocida complexes is about 95%, whereas this same segment is completely identical between E. coli and S. typhimurium (and among many other Salmonella species; data not shown). These data support the notion deriving from the I-CeuI analyses that Pasteurella species are indeed quite diverse phylogenetically. Figure 5 presents 240 bases that contain most of the divergent parts of the compared sequences.
FIG. 5.
Comparison of 16S rDNA sequences among representative Pasteurella strains with that of E. coli K-12 included as a reference. Bases 1151 to 1390 (E. coli numbering) are compared here.
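Sequence identity values of the kind reported here can be computed from a pairwise alignment as sketched below. The sketch assumes the two sequences are already aligned to equal length and skips gapped columns, which is one of several possible conventions; the function name and the example sequences are hypothetical and are not the sequences determined in this study.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical 20-base aligned segments with a single mismatch.
ref = "ACGTACGTACGTACGTACGT"
query = "ACGTACGTACGAACGTACGT"
print(percent_identity(ref, query))
```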
Within the P. haemolytica complex, the compared 16S rDNA sequences are up to 4% divergent among Pha groups 1, 2, and 3 to 14, with Pha groups 3 to 14 having an identical sequence in this segment of 16S rDNA. The Pha group 1 strains form a very tight cluster and stand out from all other P. haemolytica strains, supporting their phylogenetic status revealed by genome structure and their recent reclassification as a novel Pasteurella species, P. trehalosi (43) (see Fig. 6). The P. multocida complex is also strikingly divergent among the strains: H154 of Pmu group 8 is about 5% divergent from both the P. haemolytica complex and the other genome groups of the P. multocida complex, with all the other strains in this complex being 2% different or less. Pmu groups 1, 2, 4, and 7 are identical in this segment of 16S rDNA, indicating very close phylogenetic relationships among them. Groups 3, 5, and 6 are all about 2% different from the cluster comprising groups 1, 2, 4, and 7. Interestingly, M14, the P. ureae strain, is only 1% different from groups 1, 2, 4, and 7. M10, the P. pneumotropica strain, is in Pmu group 1 and has the same 16S rDNA sequence as other strains of group 1. In the phylogenetic tree in Fig. 6, which was constructed based on the 16S rDNA data, E. coli and S. typhimurium are included as an outgroup. In the tree, two Pasteurella clusters are formed, one for the P. haemolytica complex and one for the P. multocida complex, with the two being connected by H154, the Pmu group 8 strain.
FIG. 6.
Phylogenetic tree of the Pasteurella genome groups based on their 16S rDNA data, with E. coli and Salmonella included as an outgroup.
DISCUSSION
To resolve the phylogenetic status of bacterial populations in the same taxonomic groupings, we tested our genome structural analysis methodology on two bacterial genera, Salmonella and Pasteurella. With Salmonella, we wished to confirm and extend our previous observations that bacterial genome structure can be stable over long evolutionary times (26, 29) and that closely related bacteria tend to have similar genome structures. Several lines of evidence indicate that E. coli and Salmonella may have diverged over 100 million years ago (12, 15, 37); some of the eight Salmonella subgenera may also have diverged for similarly long evolutionary times. Even so, their rDNA sequences and genome structures are hardly distinguishable, with very rare exceptions (30, 31). Thus, it would appear that large-scale genomic organization as we display it here evolves very slowly in at least some bacteria.
Pasteurella is one of the four recognized genera of the bacterial family Pasteurellaceae, the taxonomy of which has been a very confusing area of study for several decades. Specific and generic names have been changed back and forth frequently for many of the taxa in this family, a situation resulting mainly from the lack of a methodology for assessing the phylogenetic status of these bacteria reliably, efficiently, and conveniently. The situation within individual Pasteurella species is equally confusing: it is not known how phylogenetically diverse the bacteria within a taxonomic grouping might be. Our genome structure analysis technology thus appears to provide a method of resolving members of a bacterial taxon into distinct groups on a phylogenetic basis by a simple comparison of rrn-delimited fragment sizes and a relative ordering of these fragments along the genome (Fig. 4). Nevertheless, it is possible that some unusual genomic rearrangements, especially those not mediated by rrn operons as seen in Pmu group 7 strains, may obscure the phylogenetic groupings. In these cases, PFGE analysis should be used in conjunction with rDNA data to obtain more robust interpretations. While our method is insensitive to the accumulation of point mutations in the genome, it easily reveals large genomic insertions, deletions, and other recombinational events that may occur during the evolutionary diversification of the bacteria.
In close correspondence with our evidence of structural diversity, the multiple genome groups that were revealed in Pasteurella by I-CeuI mapping are also highly divergent in 16S rDNA sequence (Fig. 5). If the same criteria for defining bacterial species as those used for E. coli and Salmonella taxonomy are applied, then on the basis of 16S rRNA sequence information and other types of information including genome structure, at least some of these genome groups, e.g., Pha groups 1 and 2 and Pmu group 8 (Fig. 4 and 6), should have the taxonomic status of genus or higher. Some other genome groups, such as Pha groups 3 to 14, however, are more closely related to one another; in fact, their behavior is similar to that of S. typhi strains (31) in that they have nearly identical genetic contents though with occasional differences in the ordering of fragments, presumably reflecting recombination events arising at rrn operons (31). The situations for both Pasteurella Pha groups 3 to 14 and S. typhi are consistent with the adopt-adapt model of bacterial speciation (34), which hypothesizes that bacteria speciate from ancestors by acquisition of novel genetic material (adopt) followed by adaptive genomic reorganization (adapt). By this hypothesis, Pha groups 3 to 14 might be the products of independent genomic reorganizations in individual lineages of a P. haemolytica ancestor through homologous recombinations between rrn operons in the process of genomic rebalancing (30). In this sense, strains of Pha groups 3 to 14 may still be members of the same phylogenetic species, having identical biological properties and dwelling in the same or similar niches.
The situation in the P. multocida complex is similar. Except for Pmu groups 7 and 8, all P. multocida strains have similar I-CeuI cleavage patterns, with most homologous bands among the strains recognizable by size only (and confirmed by DNA hybridization; data not shown). However, strains of Pmu groups 1 to 6 are much more diverse in terms of the lengths of both individual I-CeuI fragments and whole genomes than are strains of Pha groups 3 to 14. Difference in genome size means different genetic contents; therefore, strains of Pmu groups 1 to 6 may no longer be members of the same phylogenetic species, although they could still be very closely related phylogenetically. Strains of Pmu groups 1 to 6 could be the products of recent evolutionary events, e.g., gain or loss of genetic material, with homologous I-CeuI fragments being organized in different ways in the Pmu groups as a mechanism of compensation for the genomic imbalance caused by the gain or loss of DNA (34). Pmu group 7 is a special case: it has a PFGE pattern and a genome map that seemingly do not resemble those of any other P. multocida strains. However, its 16S rDNA is identical to those of the majority of P. multocida strains, a situation that might be explained by genomic rearrangement through several crossovers. The hypothesized genomic evolution of P. multocida groups 1 to 7 is shown in Fig. 7.
FIG. 7.
Hypothesized genomic evolution of P. multocida genome groups. Recombinations rearrange the genomes, but the starting events are assumed to be gain or loss of genomic DNA (see text for details of the hypothesis). This drawing considers only the “net” results, ignoring other possible genomic reorganizations that might have occurred in between. In these genomic reorganizations, at least one rule might have to be observed: transcription of rrn genes has to be in the directions away from, not toward, the origin of replication (30).
Of the known mechanisms by which bacteria diverge, differences in genome size might be of the greatest significance, as shown in the comparisons among the P. haemolytica strains: their genomes could be up to 20% different (compare strains of Pha group 1 or 2 with those of Pha groups 3 to 14). Consistent with these observations, recent studies show that various natural isolates of E. coli may also differ by up to 20% in genome size (2, 20, 41). Horizontal transfer of novel genes or gene clusters that contribute to pathogenicity likely accounts for much of this variation (41), with clustered insertions distributed at preferred genomic sites (2). Thus, these data suggest that macrovariation in genome structure may be a sensitive marker of both phylogeny and phenotype.
Genomic change over time, therefore, may not be a smooth process. Rather, it could be a quite saltatory process, depending on whether the change involves vertical or horizontal inheritance, because stochastic events can rapidly and significantly change bacterial genome structure (size and geometry), as well as phenotypic properties. For example, the acquisition of specific loci in pathogenicity islands of various E. coli isolates may be an important contributor to the divergence of these bacteria in both genome structure and the specific diseases they cause (4). In this sense, our genome structure analysis techniques can better resolve phylogenetic groupings of bacteria whose changes in genome structure have resulted from horizontal acquisition of genetic material than those whose changes have resulted from a vertical mode. Our data for Salmonella, Pasteurella, and other bacteria (30) imply that the horizontal mode dominates in nature in the divergence and speciation of bacteria.
Little information about P. pneumotropica and P. ureae as species is available. The P. pneumotropica strain has a 16S rDNA sequence and a genome map that are identical to those of P. multocida genome group 1 strains, implying either (i) that this strain is a misidentification, (ii) that P. pneumotropica should be combined with P. multocida, or (iii) that P. pneumotropica may have recently speciated, not having sufficient time to permit divergence in 16S rDNA. The P. ureae strain, M14, is interesting: it may have just begun diverging from P. multocida both in 16S rDNA (99% similarity) and in genome structure (a 1-Mb fragment which is not seen in P. multocida strains).
Strain H154 was identified as P. multocida, but phylogenetically it is equally distant from P. haemolytica and P. multocida, as judged by 16S rDNA sequence. Its genome structure also differs from those of P. multocida strains. In fact, it may resemble an ancestral strain from which both P. haemolytica and P. multocida might have diverged.
I-CeuI is the only endonuclease so far known to directly reveal these conserved aspects of the bacterial genome. A PFGE gel can contain as many as 60 lanes, and a gel run can typically be finished in 2 or 3 days. In theory, therefore, this technique allows one to construct more than 100 genome maps within a week with one PFGE machine, making it possible to carry out systematic genomic comparisons involving thousands of bacterial strains. Unlike 16S rDNA and rRNA sequences, which vary continuously among bacteria, genome structure differs in a clear-cut manner among phylogenetic groups of bacteria (compare, for example, Pha groups 1, 2, and 3 to 14). The identification of phylogenetic taxa of bacteria may therefore eventually be based on genome structure. Finally, the genome structure methodology may help elucidate mechanisms of bacterial evolution. Quantitative measures of genomic organization and structural relatedness will further establish the utility of this methodology in bacterial phylogenetics and taxonomy.
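The throughput estimate above can be checked with a short calculation. The Python sketch below is only a back-of-the-envelope model under stated assumptions: gel runs are sequential on a single machine, each strain occupies one lane, and marker lanes and repeat digests are ignored. The function name and parameters are hypothetical and are not part of the original method.

```python
import math


def maps_per_week(lanes_per_gel, days_per_run, days_available=7.0,
                  lanes_per_strain=1, machines=1):
    """Upper-bound number of genome maps obtainable in the time available.

    Assumes sequential runs and one lane per strain (a simplification).
    """
    complete_runs = math.floor(days_available / days_per_run)
    strains_per_gel = lanes_per_gel // lanes_per_strain
    return complete_runs * strains_per_gel * machines


# Figures from the text: up to 60 lanes per gel, 2 or 3 days per run.
print(maps_per_week(60, 3))  # 120 (two complete runs in a week)
print(maps_per_week(60, 2))  # 180 (three complete runs in a week)
```

Under these simplifying assumptions both run times give more than 100 maps per week, in line with the estimate in the text. Reserving lanes for size markers would lower the figures somewhat.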
ACKNOWLEDGMENTS
We thank Gui-Rong Liu and Glenis Wiebe for technical assistance.
This work was supported by a grant from the Medical Research Council of Canada to R.N.J., grants from the Natural Sciences and Engineering Research Council of Canada and grant RO1AI34829 from the National Institute of Allergy and Infectious Diseases of the National Institutes of Health to K.E.S., and a grant from the Ruth Rannie Memorial Foundation to A.B.S.
REFERENCES
- 1. Beltran P, Plock S A, Smith N H, Whittam T S, Old D C, Selander R K. Reference collection of strains of the Salmonella typhimurium complex from natural populations. J Gen Microbiol. 1991;137:601–606. doi: 10.1099/00221287-137-3-601.
- 2. Bergthorsson U, Ochman H. Distribution of chromosome length variation in natural isolates of Escherichia coli. Mol Biol Evol. 1998;15:6–16. doi: 10.1093/oxfordjournals.molbev.a025847.
- 3. Bisgaard M. Taxonomy of the family Pasteurellaceae Pohl 1981. In: Donachie W, Lainson F A, Hodgson J C, editors. Haemophilus, Actinobacillus, and Pasteurella. New York, N.Y: Plenum Publishing Corp.; 1995. pp. 1–8.
- 4. Boyd E F, Hartl D L. Chromosomal regions specific to pathogenic isolates of Escherichia coli have a phylogenetically clustered distribution. J Bacteriol. 1998;180:1159–1165. doi: 10.1128/jb.180.5.1159-1165.1998.
- 5. Boyd E F, Wang F-S, Beltran P, Plock S A, Nelson K, Selander R K. Salmonella reference collection B (SARB): strains of 37 serovars of subspecies I. J Gen Microbiol. 1993;139:1125–1132. doi: 10.1099/00221287-139-6-1125.
- 6. Boyd E F, Wang F-S, Whittam T S, Selander R K. Molecular genetic relationships among the salmonellae. Appl Environ Microbiol. 1996;62:804–808. doi: 10.1128/aem.62.3.804-808.1996.
- 7. Carter G R. Genus I. Pasteurella Trevisan 1887. In: Krieg N R, Holt J G, editors. Bergey’s manual of systematic bacteriology. Baltimore, Md: Williams & Wilkins Co.; 1984. pp. 552–557.
- 8. Claus D, Berkeley R C W. Genus Bacillus Cohn 1872, 174AL*. In: Sneath P H A, Mair N S, Sharpe M E, Holt J G, editors. Bergey’s manual of systematic bacteriology. Baltimore, Md: Williams & Wilkins; 1986. pp. 1105–1139.
- 9. Crosa J H, Brenner D J, Ewing W H, Falkow S. Molecular relationships among the salmonellae. J Bacteriol. 1973;115:307–315. doi: 10.1128/jb.115.1.307-315.1973.
- 10. Deckert G, Warren P V, Gaasterland T, Young W G, Lenox A L, Graham D E, Overbeek R, Snead M A, Keller M, Aujay M, Huber R, Feldman R A, Short J M, Olsen G J, Swanson R V. The complete genome of the hyperthermophilic bacterium Aquifex aeolicus. Nature. 1998;392:353–358. doi: 10.1038/32831.
- 11. Dewhirst F E, Paster B J, Olsen I, Fraser G J. Phylogeny of 54 representative strains of species in the family Pasteurellaceae as determined by comparison of 16S rRNA sequences. J Bacteriol. 1992;174:2002–2013. doi: 10.1128/jb.174.6.2002-2013.1992.
- 12. Doolittle R F, Feng D F, Tsang S, Cho G, Little E. Determining divergence times of the major kingdoms of living organisms with a protein clock. Science. 1996;271:470–477. doi: 10.1126/science.271.5248.470.
- 13. Felsenstein J. Confidence limits on phylogenies: an approach using the bootstrap. Evolution. 1985;39:783–791. doi: 10.1111/j.1558-5646.1985.tb00420.x.
- 14. Felsenstein J. PHYLIP (Phylogeny Inference Package). Seattle: University of Washington; 1993.
- 15. Feng D F, Cho G, Doolittle R F. Determining divergence times with a protein clock: update and reevaluation. Proc Natl Acad Sci USA. 1997;94:13028–13033. doi: 10.1073/pnas.94.24.13028.
- 16. Fox G E, Stackebrandt E, Hespell R B, Gibson J, Maniloff J, Dyer T A, Wolfe R S, Balch W E, Tanner R S, Magrum L J, Zablen L B, Blackemore R, Gupta R, Bonen L, Lewis B J, Stahl D A, Luehrsen K R, Chen K N, Woese C R. The phylogeny of prokaryotes. Science. 1980;209:457–463. doi: 10.1126/science.6771870.
- 17. Gauthier A, Turmel M, Lemieux C. A group I intron in the chloroplast large subunit rRNA gene of Chlamydomonas eugametos encodes a double-strand endonuclease that cleaves the homing site of this intron. Curr Genet. 1991;19:43–47. doi: 10.1007/BF00362086.
- 18. Goodfellow M, O’Donnell A G. Roots of bacterial systematics. In: Goodfellow M, O’Donnell A G, editors. Handbook of new bacterial systematics. San Diego, Calif: Academic Press; 1993. pp. 3–56.
- 19. Hook E W. Salmonella species (including typhoid fever). In: Mandell G L, Douglas R G Jr, Bennett J E, editors. Principles and practices of infectious diseases. 2nd ed. New York, N.Y: John Wiley & Sons; 1979. pp. 1256–1269.
- 20. Hurtado A, Rodriguez-Valera F. Accessory DNA in the genomes of representatives of the Escherichia coli reference collection. J Bacteriol. 1999;181:2548–2554. doi: 10.1128/jb.181.8.2548-2554.1999.
- 21. Karlin S, Weinstock G M, Brendel V. Bacterial classifications derived from recA protein sequence comparisons. J Bacteriol. 1995;177:6881–6893. doi: 10.1128/jb.177.23.6881-6893.1995.
- 22. Kauffmann F. The bacteriology of Enterobacteriaceae. Baltimore, Md: Williams & Wilkins; 1966.
- 23. Le Minor L. Typing of Salmonella species. Eur J Clin Microbiol Infect Dis. 1988;7:214–218. doi: 10.1007/BF01963091.
- 23a. Liu S-L. Unpublished data.
- 24. Liu S-L, Hessel A, Sanderson K E. The XbaI-BlnI-CeuI genomic cleavage map of Salmonella enteritidis shows an inversion relative to Salmonella typhimurium LT2. Mol Microbiol. 1993;10:655–664. doi: 10.1111/j.1365-2958.1993.tb00937.x.
- 25. Liu S-L, Hessel A, Sanderson K E. The XbaI-BlnI-CeuI genomic cleavage map of Salmonella typhimurium LT2 determined by double digestion, end-labelling, and pulsed-field gel electrophoresis. J Bacteriol. 1993;175:4104–4120. doi: 10.1128/jb.175.13.4104-4120.1993.
- 26. Liu S-L, Hessel A, Sanderson K E. Genomic mapping with I-CeuI, an intron-encoded endonuclease, specific for genes for ribosomal RNA, in Salmonella spp., Escherichia coli, and other bacteria. Proc Natl Acad Sci USA. 1993;90:6874–6878. doi: 10.1073/pnas.90.14.6874.
- 27. Liu S-L, Sanderson K E. A physical map of the Salmonella typhimurium LT2 genome made by using XbaI analysis. J Bacteriol. 1992;174:1662–1672. doi: 10.1128/jb.174.5.1662-1672.1992.
- 28. Liu S-L, Sanderson K E. The chromosome of Salmonella paratyphi A is inverted by recombination between rrnH and rrnG. J Bacteriol. 1995;177:6585–6592. doi: 10.1128/jb.177.22.6585-6592.1995.
- 29. Liu S-L, Sanderson K E. I-CeuI reveals conservation of the genome of independent strains of Salmonella typhimurium. J Bacteriol. 1995;177:3355–3357. doi: 10.1128/jb.177.11.3355-3357.1995.
- 30. Liu S-L, Sanderson K E. Rearrangements in the genome of the bacterium Salmonella typhi. Proc Natl Acad Sci USA. 1995;92:1018–1022. doi: 10.1073/pnas.92.4.1018.
- 31. Liu S-L, Sanderson K E. Highly plastic chromosomal organization in Salmonella typhi. Proc Natl Acad Sci USA. 1996;93:10303–10308. doi: 10.1073/pnas.93.19.10303.
- 32. Liu S-L, Sanderson K E. Homologous recombination between rrn operons rearranges the chromosome in host-specialized species of Salmonella. FEMS Microbiol Lett. 1998;164:275–281. doi: 10.1111/j.1574-6968.1998.tb13098.x.
- 33. Liu S-L, Sanderson K E. Physical analysis of the Salmonella typhimurium genome. In: Williams P H, Ketley J, Salmond G, editors. Methods in microbiology. New York, N.Y: Academic Press; 1998. pp. 371–381.
- 34. Liu S-L, Sanderson K E, Johnston R N. Unpublished data.
- 35. Marshall P, Lemieux C. Cleavage pattern of the homing endonuclease encoded by the fifth intron in the chloroplast subunit rRNA-encoding gene of Chlamydomonas eugametos. Gene. 1991;104:1241–1245. doi: 10.1016/0378-1119(91)90256-b.
- 36. Mutters R, Mannheim W, Bisgaard M. Taxonomy of the group. In: Adlam C, Rutler J M, editors. Pasteurella and pasteurellosis. London, United Kingdom: Academic Press, Inc.; 1989. pp. 3–34.
- 37. Ochman H, Wilson A C. Evolutionary history of enteric bacteria. In: Neidhardt F C, Ingraham J L, Low K B, Magasanik B, Schaechter M, Umbarger H E, editors. Escherichia coli and Salmonella typhimurium: cellular and molecular biology. Washington, D.C.: American Society for Microbiology; 1987. pp. 1649–1654.
- 38. Popoff M Y, Bockemuel J, McWhorter-Murlin A. Supplement 1992 (no. 36) to the Kauffmann-White Scheme. Res Microbiol. 1993;144:495–498. doi: 10.1016/0923-2508(93)90058-a.
- 39. Reeves M W, Evins G M, Heiba A A, Plikaytis B D, Farmer J J III. Clonal nature of Salmonella typhi and its genetic relatedness to other salmonellae as shown by multilocus enzyme electrophoresis, and proposal of Salmonella bongori comb. nov. J Clin Microbiol. 1989;27:313–320. doi: 10.1128/jcm.27.2.313-320.1989.
- 40. Robert L D, Arkinsaw S, Selander R K. Evolutionary genetics of Pasteurella haemolytica isolates recovered from cattle and sheep. Infect Immun. 1997;65:3585–3593. doi: 10.1128/iai.65.9.3585-3593.1997.
- 41. Rode C K, Melkerson-Watson L J, Johnson A T, Bloch C A. Type-specific contributions to chromosome size differences in Escherichia coli. Infect Immun. 1999;67:230–236. doi: 10.1128/iai.67.1.230-236.1999.
- 42. Sanderson K E, Hessel A, Liu S-L, Rudd K E. The genetic map of Salmonella typhimurium, edition VIII. In: Neidhardt F C, Curtiss III R, Ingraham J L, Lin E C C, Low K B, Magasanik B, Reznikoff W S, Riley M, Schaechter M, Umbarger H E, editors. Escherichia coli and Salmonella: cellular and molecular biology. 2nd ed. Washington, D.C.: ASM Press; 1996. pp. 1903–1999.
- 43. Sneath P H A, Stevens M. Actinobacillus rossii sp. nov., Actinobacillus seminis sp. nov., nom rev., Pasteurella bettii sp. nov., Pasteurella lymphangitidis sp. nov., Pasteurella mairi sp. nov., and Pasteurella trehalosi sp. nov. Int J Syst Bacteriol. 1990;40:148–153. doi: 10.1099/00207713-40-2-148.
- 44. Stanley J T, Krieg N R. Bacterial classification. I. Classification of procaryotic organisms: an overview. In: Holt J G, editor. Bergey’s manual of systematic bacteriology. Baltimore, Md: Williams & Wilkins; 1984. pp. 1–4.
- 45. Zinder N, Lederberg J. Genetic exchange in Salmonella. J Bacteriol. 1952;64:679–699. doi: 10.1128/jb.64.5.679-699.1952.