Abstract
Salmonella enterica serovar Typhi is a clone with a low level of variation. We developed a molecular typing method for serovar Typhi using 38 genome-wide single-nucleotide polymorphisms (SNPs) as markers detected by PCR-restriction enzyme digestion. The 73 worldwide serovar Typhi isolates studied were separated into 23 SNP profiles and four distinct genetic groups. Serovar Typhi isolates expressing the unique flagellar antigen z66 were found to cluster together and branch off from the ancestral group, suggesting that serovar Typhi was initially monophasic with only an H1 antigen and subsequently gained the z66 antigen. Typing using the 38 SNPs gave a discriminatory power of 0.87, and a minimum of 16 SNPs may be used to achieve the same level of differentiation. The SNP typing method we developed will be a valuable tool for global epidemiology studies of serovar Typhi.
Typhoid fever, a serious systemic disease, is caused by Salmonella enterica serovar Typhi and is endemic in countries where problems of hygiene and sanitation remain unresolved. Annually, there are more than 17 million cases of typhoid fever, with around 600,000 deaths (41). Genetic diversity among serovar Typhi strains has been studied extensively using molecular techniques such as pulsed-field gel electrophoresis (14, 16, 20, 35, 36), ribotyping (7, 17, 22), IS200 typing (37), amplified fragment length polymorphism analysis (21), and random amplification of polymorphic DNA analysis (26, 33). A major drawback of these techniques is that the relationships derived from the data they provide do not necessarily reflect the true evolutionary relationships of the isolates.
Two population genetics studies showed that serovar Typhi is a highly homogeneous clone. Multilocus enzyme electrophoresis of 24 metabolic enzymes revealed only two major electrophoretic types (27), and multilocus sequence typing (MLST) of seven housekeeping genes found only three base substitutions in a total of 3,336 bp analyzed and divided 26 serovar Typhi isolates into four sequence types (13). Therefore, there is insufficient variation for either multilocus enzyme electrophoresis or MLST to be useful for the determination of relationships among isolates or for global epidemiological studies.
To facilitate global epidemiology studies and to establish the evolutionary relationships within the serovar Typhi clone, there is a need for a molecular method that is cheap, discriminative, simple, and reproducible for the large-scale typing of isolates. Single-nucleotide polymorphisms (SNPs) are potential markers and have been used to type several pathogens, including Escherichia coli O157:H7 (42), Bacillus anthracis (25), Mycobacterium tuberculosis (8, 11), and Yersinia pestis (1). The discovery of SNPs is facilitated by the sequencing of more than one genome from the same clone. The completed genome sequences of serovar Typhi strains CT18 and Ty2 (4, 24) allowed us to explore the differences between them and to identify SNPs suitable for typing. We selected 37 SNPs that could be differentiated by the presence or absence of a restriction enzyme site to analyze a collection of worldwide Typhi isolates and showed that SNP typing is a good tool for genotyping and determining evolutionary relationships of global serovar Typhi isolates.
Strains.
Seventy-two worldwide serovar Typhi isolates, differing in localities and years of isolation, were obtained from the Salmonella Genetic Stock Centre, University of Calgary, Calgary, Canada, and one isolate was obtained from Imperial College London, London, United Kingdom, giving 73 isolates in total (Table 1).
TABLE 1.
List of strains used in this study
Cluster | SNP profile | Strain name | Genome type | Phage type | Locality | Year of isolation | Status of z66 flagellar antigen^a | Haplotype^b |
---|---|---|---|---|---|---|---|---|
I | 1 | ST1106 | 4 | D1 | Malaysia | 1987 | − | |
I | 1 | 414Ty | 3 | I+IV | Australia | 1981 | + | 59 |
I | 2 | ST145 | 3 | I+IV | Malaysia | 1994 | + | |
I | 2 | 26T37 | 2 | I+IV | British Columbia, Canada | 1994 | + | |
I | 2 | In20 | 9 | A | Indonesia | 1992 | + | 59 |
I | 2 | 417Ty | 22 | I+IV | New Caledonia | 1982 | + | 59 |
I | 2 | 418Ty | 3 | I+IV | The Netherlands | 1988 | + | |
I | 2 | 420Ty | 3 | UT | Japan | 1982 | + | 59 |
I | 2 | 425Ty | 3 | I+IV | | | + | |
I | 2 | 444Ty | 3 | I+IV | | | + | |
I | 2 | 445Ty | 3 | | | | + | |
I | 2 | 446Ty | 3 | I+IV | | | + | |
I | 2 | 701Ty | 27 | | | | + | |
I | 2 | 702Ty | 3 | | | | + | |
I | 3 | CDC3137-73 | 6 | K1 | India | | − | 42 |
I | 3 | 26T30 | 3 | I+IV | Quebec, Canada | 1994 | − | |
I | 3 | 26T32 | 24 | I+IV | Quebec, Canada | 1994 | − | |
I | 4 | 423Ty | 3 | I+IV | Australia | 1981 | + | |
I | 5 | 415Ty | 3 | UT | The Netherlands | 1982 | + | |
I | 5 | 416Ty | 3 | UT | Japan | 1982 | + | 59 |
I | 5 | 419Ty | 3 | I+IV | The Netherlands | 1988 | + | |
I | 5 | 421Ty | 3 | UT | France | 1984 | + | |
II | 6 | 3125 | 3 | 46 | Chile | 1983 | − | 50 |
II | 7 | Tp1 | 25 | A | Dakar, Senegal | 1988 | − | 39 |
II | 7 | ST24A | 3 | DVS | Malaysia | 1986 | − | 16 |
II | 8 | Tp2 | 17 | UT | Dakar, Senegal | 1988 | − | 39 |
II | 9 | CT18 | | | Vietnam | 1994 | − | 1 |
III | 10 | CDC3434-73 | 22 | G1 | Peru | | − | 52 |
III | 10 | CDC1707-81 | 3 | UT | Liberia | | − | 81 |
III | 10 | CDC382-82 | 13 | M1 | Marshall Islands | | − | |
III | 10 | R1167 | 19 | A | | | − | |
III | 10 | ST60 | 2 | C4 | Malaysia | 1986 | − | 50 |
III | 10 | ST24B | 3 | DVS | Malaysia | 1986 | − | |
III | 10 | In15 | 3 | D2 | Indonesia | 1994 | − | 8 |
III | 10 | PL27566 | 26 | M1 | | 1994 | − | |
III | 10 | PL73203 | 2 | A | | | − | |
III | 10 | 26T6 | 30 | UT | British Columbia, Canada | 1994 | − | |
III | 10 | 26T9 | 16 | B1 | Manitoba, Canada | 1994 | − | |
III | 10 | 26T17 | 4 | B1 | British Columbia, Canada | 1994 | − | |
III | 10 | 26T19 | 5 | A | Alberta, Canada | 1994 | − | |
III | 10 | 26T40 | 19 | M3 | British Columbia, Canada | 1994 | − | |
III | 10 | 26T49 | 11 | B1 | British Columbia, Canada | 1994 | − | |
III | 10 | 26T50 | 8 | I+IV | Alberta, Canada | 1994 | − | |
III | 10 | 26T51 | 28 | DVS | British Columbia, Canada | 1994 | − | |
III | 10 | 26T56 | 23 | F1 | Quebec, Canada | 1994 | − | |
III | 10 | 3126 | 3 | 46 | Chile | 1983 | − | |
III | 10 | PNG32 | 2 | D2 | Papua New Guinea | 1994 | − | 8 |
III | 10 | TYT1668 | 21 | M1 | Chile | | − | 76 |
III | 10 | TYT1677 | 5 | F8 | Chile | | − | |
III | 10 | 422Mar92 | | | Zaire | 1992 | − | 6 |
III | 11 | CC6 | 7 | A | Thailand | 1995 | − | 50 |
III | 11 | CC7 | 7 | A | Thailand | 1995 | − | |
III | 12 | 26T12 | 6 | O | Manitoba, Canada | 1994 | − | |
III | 13 | CDC1196-74 | 6 | A | Mexico | | − | 11 |
III | 13 | CDC9032-85 | 23 | UT | Taiwan | | − | 50 |
III | 13 | T189 | 11 | N | Thailand | 1990 | − | 42 |
III | 13 | In24 | 3 | C3 | Indonesia | 1992 | − | 14 |
III | 14 | R1962 | 1 | UT | Alberta, Canada | 1993 | − | 42 |
III | 14 | IP.E88 353 | | UT | Dakar, Senegal | | − | |
III | 14 | IP.E88 374 | | UT | Dakar, Senegal | | − | |
III | 16 | TYT1669 | 6 | UT | Chile | | − | 52 |
III | 23 | 26T38 | 14 | E1 | British Columbia, Canada | 1994 | − | |
III | 23 | 3123 | 3 | | Chile | 1983 | − | |
IV | 15 | ST1 | 18 | I+IV | Indonesia | | − | 52 |
IV | 17 | T202 | 3 | UT | Thailand | 1990 | − | 52 |
IV | 18 | 25T-40 | 4 | E1 | British Columbia, Canada | 1993 | − | |
IV | 18 | R1637 | 14 | E2 | | | − | |
IV | 18 | ST1002 | 9 | E1 | Malaysia | 1987 | − | 52 |
IV | 18 | ST309 | 3 | E1 | Malaysia | 1987 | − | |
IV | 19 | Ty2 | 9 | E1 | | | − | 10 |
IV | 20 | 25T-36 | 29 | E1 | Alberta, Canada | 1993 | − | |
IV | 21 | 26T24 | 2 | E1 | Ontario, Canada | 1994 | − | |
IV | 22 | 25T-44 | 2 | E1 | Ontario, Canada | 1993 | − | |
^a +, present; −, absent. Strains ST145, 26T37, and In20 were found to carry z66 by PCR.
^b Haplotype according to the study by Roumagnac et al. (30).
PCR, restriction enzyme digestion, and DNA sequencing.
The PCR mixture contained 0.2 μl of DNA template (∼20 ng), 0.2 μl of each forward and reverse primer (concentration, 30 pmol/μl; Sigma-Aldrich), 0.2 μl of 10 mM (each) deoxynucleoside triphosphates, 2 μl of 10× Taq polymerase PCR buffer (New England Biolabs), 0.125 μl (1.25 U) of Taq polymerase (New England Biolabs), and MilliQ water added to adjust the final volume to 20 μl. The PCR product (15 μl) was digested with 1 U of restriction enzyme at 37°C for 2 h and subsequently run on a 2% agarose gel in Tris-borate-EDTA buffer (31).
The 20-μl PCR sequencing mixture contained 1 μl of BigDye (version 3.1; Applied Biosystems), 20 ng of the purified PCR product, 3.5 μl of 5× PCR sequencing buffer (Applied Biosystems), 1 μl of forward primer (concentration, 3.2 pmol/μl; Sigma-Aldrich), and MilliQ water. Unincorporated dye was removed by ethanol precipitation. The sequencing reaction mixtures were resolved on an ABI 3730 automated DNA sequence analyzer (Applied Biosystems) at the sequencing facility of the School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, Australia.
Selection of SNPs for typing.
From the comparison of the full-genome sequences of serovar Typhi CT18 (accession no. AL513382) and Ty2 (accession no. AE014613) (4, 24) by using BLAST tools available from the Australian National Genetic Information Service, 253 single-copy genes carrying SNPs with non-insertion-and-deletion variation were found. The number of polymorphisms within these genes ranged from 1 to 8 bases, and the polymorphisms accounted for a total of 285 base substitutions, with the majority of the genes (239) having a single-base substitution. Of the 285 polymorphisms, 111 were synonymous SNPs (sSNPs) and 174 were nonsynonymous SNPs (nsSNPs). It is interesting that the number of nsSNPs was more than 50% greater than the number of sSNPs, since deleterious nsSNPs are expected to be eliminated from populations rather quickly. This circumstance seems to be a general phenomenon, as similar observations in other bacterial clones have been made previously (1, 8, 11, 25). In M. tuberculosis, 65% of the SNPs are nsSNPs (8), while 58% are nsSNPs in B. anthracis (25). The likely explanation is that the time frame is too short for many of the nsSNPs, particularly the mildly deleterious ones, to be removed by purifying selection (28).
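The distinction between sSNPs and nsSNPs drawn above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_snp` and the example codons are hypothetical, the standard genetic code is used, and the reading frame of each codon is assumed to be known. It is not the procedure used in the study.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon: str, position: int, new_base: str) -> str:
    """Return 'sSNP' if the substitution keeps the amino acid, else 'nsSNP'.

    `position` is the 0-based index of the substituted base within the codon.
    """
    mutated = codon[:position] + new_base + codon[position + 1:]
    same_amino_acid = CODON_TABLE[codon] == CODON_TABLE[mutated]
    return "sSNP" if same_amino_acid else "nsSNP"


# Hypothetical codons, chosen only to illustrate the two outcomes.
print(classify_snp("GCT", 2, "C"))  # GCT (Ala) -> GCC (Ala): synonymous
print(classify_snp("GAT", 2, "A"))  # GAT (Asp) -> GAA (Glu): nonsynonymous
```

Applied to every substitution in the 253 genes, a classifier of this kind would yield the sSNP and nsSNP counts that the comparison with other clones relies on.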
Thirty-six genes were selected for SNP typing using the seven most economical 4-base restriction enzymes to discriminate 37 SNPs, of which 17 were sSNPs and 20 were nsSNPs. AluI was utilized to differentiate eight SNPs, HhaI and HaeIII were used for seven each, HpaII was utilized for four, NlaIII and RsaI were used for five each, and TacI was used for one SNP. All except two of the 37 SNPs were the single SNP present in a gene. SNP 18 and SNP 19, an sSNP and an nsSNP, respectively, were located in the same gene, yehU, encoding a putative two-component-system sensor kinase. There were six polymorphic sites scattered along the 1,686 bp of yehU, four of which resulted in a change of the amino acid.
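The selection criterion, that the two alleles of an SNP differ in the presence or absence of a restriction site, can be illustrated with a short Python sketch. The recognition sequences below are the standard ones for six of the enzymes named above and are not given in the article; the amplicon sequences and the helper names `count_sites` and `distinguishing_enzymes` are hypothetical.

```python
# Recognition sequences for six of the 4-base cutters named in the text
# (standard values; an assumption here, since the article does not list them).
RECOGNITION_SITES = {
    "AluI": "AGCT",
    "HhaI": "GCGC",
    "HaeIII": "GGCC",
    "HpaII": "CCGG",
    "NlaIII": "CATG",
    "RsaI": "GTAC",
}


def count_sites(seq: str, site: str) -> int:
    """Count occurrences of a recognition site, allowing overlaps."""
    return sum(seq.startswith(site, i) for i in range(len(seq) - len(site) + 1))


def distinguishing_enzymes(allele_a: str, allele_b: str) -> list:
    """Enzymes whose number of sites differs between two amplicon alleles."""
    return [
        enzyme
        for enzyme, site in RECOGNITION_SITES.items()
        if count_sites(allele_a, site) != count_sites(allele_b, site)
    ]


# Hypothetical amplicons differing by one base (C -> T) inside an AluI site.
allele_1 = "TTGACAGCTGGATCA"
allele_2 = "TTGACAGTTGGATCA"
print(distinguishing_enzymes(allele_1, allele_2))  # ['AluI']
```

An SNP for which this list is empty cannot be typed by digestion with these enzymes and would need another detection method.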
SNP typing.
The 73 serovar Typhi isolates were typed for the 37 SNPs, of which 17 SNPs were shared by two or more isolates while 16 and 4 were found to be unique to CT18 and Ty2, respectively. It is unclear why there were four times more unique SNPs in CT18 than in Ty2. An additional SNP was discovered upon analyzing SNP 18. Sequencing of the PCR product revealed that two isolates (CC6 and CC7) had the same base as Ty2 at the site of SNP 18 but that a single base change 360 bases downstream created a new restriction enzyme site. This SNP was designated SNP 38, making a total of 38 SNPs. In SNP 37 (A-for-G substitution), CT18 has an A base according to the genome data (24) but had the same digestion pattern as Ty2, indicating that it has the same base (G) as Ty2, which was confirmed by sequencing. This finding may be due to an error in the CT18 genome sequence (24) or to a mutation in our CT18 isolate. Nevertheless, the base A allele was present in eight isolates.
The 73 serovar Typhi isolates were grouped into 23 SNP profiles (Table 2). Twelve profiles, including the profiles of CT18 and Ty2, were represented by only one isolate. The other 11 profiles were shared by two or more isolates, with SNP profile 10 being the most common, shared by 23 isolates. It is interesting that isolate 422Mar92, belonging to a unique MLST sequence type (sequence type 8) previously thought to be restricted to African isolates (13), also fell into this largest SNP profile.
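Assigning isolates to SNP profiles amounts to grouping isolates with identical allele patterns. A minimal sketch follows, using made-up isolate names and four-SNP patterns rather than the study data; the function name `assign_profiles` is hypothetical.

```python
from collections import defaultdict


def assign_profiles(genotypes: dict) -> dict:
    """Group isolates with identical allele strings into numbered profiles.

    Profiles are numbered in order of first appearance, which is an arbitrary
    choice for this sketch and not the numbering used in Table 2.
    """
    by_pattern = defaultdict(list)
    for isolate, alleles in genotypes.items():
        by_pattern[alleles].append(isolate)
    return {number: members
            for number, members in enumerate(by_pattern.values(), start=1)}


# Hypothetical isolates typed at four SNPs.
toy_genotypes = {
    "iso1": "ACGT",
    "iso2": "ACGT",
    "iso3": "ACGA",
    "iso4": "TCGT",
    "iso5": "ACGA",
}
toy_profiles = assign_profiles(toy_genotypes)
singletons = [n for n, members in toy_profiles.items() if len(members) == 1]
print(toy_profiles)
print(singletons)
```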
TABLE 2.
SNP profiles of the serovar Typhi isolates
Cluster or characteristic | SNP profile | Total no. of isolates | Base at or characteristic of SNP^a no. 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Reference base | A | C | C | T | G | C | C | G | G | C | T | G | G | C | G | C | G | C | C | C | T | G | G | G | C | A | C | C | A | C | A | A | C | G | C | G | A | C | ||
I | 1 | 2 | . | . | . | . | . | . | . | . | . | . | G | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | T | . | . | . |
2 | 12 | . | . | . | . | . | . | . | . | . | . | G | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | T | . | G | . | |
3 | 3 | . | . | . | . | . | . | . | A | . | . | G | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | T | . | G | . | |
4 | 1 | . | . | . | . | . | . | . | . | . | . | G | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | T | A | . | . | |
5 | 4 | . | . | . | . | . | . | . | . | . | . | G | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | T | A | G | . | |
II | 6 | 1 | . | T | T | C | . | . | . | A | . | . | . | . | . | . | . | . | . | . | . | . | . | A | . | . | . | . | . | . | . | T | . | . | . | . | . | . | G | . |
7 | 2 | . | T | T | C | . | . | . | A | . | . | . | . | . | . | . | . | . | . | . | . | . | A | . | . | . | . | . | . | . | T | . | . | . | . | . | . | . | . | |
8 | 1 | . | T | . | . | A | . | . | A | . | . | . | . | . | . | . | . | A | . | . | . | . | A | . | . | . | . | . | . | . | T | . | . | . | . | . | . | . | . | |
9 (CT18) | 1 | . | T | T | C | A | T | T | A | A | T | G | . | . | T | T | . | . | T | T | T | C | A | . | A | . | G | . | T | . | T | G | . | T | . | T | . | G | . | |
III | 10 | 23 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | G | . |
11 | 2 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | G | T | |
12 | 1 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | T | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | G | . | |
13 | 4 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | |
14 | 3 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | A | G | . | |
16 | 1 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | T | . | . | . | . | . | . | . | . | T | . | . | . | . | . | . | . | . | . | . | A | . | . | |
23 | 2 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | T | . | . | . | . | . | . | . | . | . | . | . | G | . | |
IV | 15 | 1 | . | . | . | . | . | . | . | . | . | . | . | A | A | . | . | T | A | . | . | . | . | . | . | . | T | . | T | . | . | . | . | . | . | A | . | A | . | . |
17 | 1 | . | . | . | . | . | . | . | . | . | . | . | A | A | . | . | . | A | . | . | . | . | . | . | . | T | . | T | . | . | . | . | . | . | A | . | . | . | . | |
18 | 4 | . | . | . | . | . | . | . | . | . | . | . | A | A | . | . | T | A | . | . | . | . | . | . | . | T | . | T | . | . | . | . | . | . | A | . | A | G | . | |
19 (Ty2) | 1 | G | . | . | . | . | . | . | . | . | . | . | A | A | . | . | T | A | . | . | . | . | . | A | . | T | . | T | . | C | . | . | G | . | A | . | A | G | . | |
20 | 1 | . | . | . | . | . | . | . | . | . | . | . | A | A | . | . | . | A | . | . | . | . | . | . | . | T | . | T | . | . | . | . | . | . | A | . | A | G | . | |
21 | 1 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | A | . | . | . | . | . | . | . | T | . | . | . | . | . | . | . | . | . | . | . | G | . | |
22 | 1 | . | . | . | . | . | . | . | . | . | . | . | . | A | . | . | T | A | . | . | . | . | . | . | . | T | . | . | . | . | . | . | . | . | . | . | . | G | . | |
Informativeness^b | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * | ||||||||||||||||||||||
Type^c | N | N | S | S | S | S | N | S | N | N | N | S | S | N | S | N | S | S | N | N | N | S | S | N | N | N | N | N | N | S | S | S | N | S | S | N | N | S | ||
Ancestral base^d | A | C | C | T | G | C | C | G | G | C | T | T | G | C | G | C | G | C | C | C | T | G | G | G | C | A | C | C | A | C | A | G | C | G | C | G | G | C | ||
Inclusion in minimal set^e | * | * | * | * | * | * | * | * | * | * | * | * | * | * | * |
^a Underlined bases were confirmed by sequencing of the representative isolate for each SNP profile. Bases in bold support the corresponding cluster. Dots represent bases identical to the reference bases.
^b Asterisks indicate informative SNPs, i.e., those corresponding to two alleles each shared by two or more isolates.
^c N refers to nsSNPs and S refers to sSNPs.
^d The ancestral base for each SNP was derived from the consensus of eight non-serovar Typhi strains: serovar Choleraesuis strain SC-B67 (GenBank accession no. AE017220), serovar Paratyphi A strain SARB42 (accession no. CP000026), serovar Typhimurium strain LT2 (accession no. AE006468), serovar Paratyphi B strain SPB7 (ftp://genome.wustl.edu/pub/seqmgr/bacterial/salmonella), serovar Enteritidis strain PT4 and serovar Gallinarum strain 287/91 (ftp://sanger.ac.uk/pub/pathogens/Salmonella), serovar Pullorum (http://www.salmonella.org/genomics/spu.dbs), and S. enterica subsp. V strain 12149 (ftp://ftp.sanger.ac.uk/pub/pathogens/Salmonella).
^e Asterisks indicate the SNPs included in the minimal set of SNPs required to identify all the SNP profiles assigned in this study (see the text for details).
Phylogenetic relationships.
The relationships of the isolates were determined through a number of analyses (Fig. 1). Initially, we used PAUP (34) to construct a maximum-parsimony tree from the SNP data, and 574 most-parsimonious trees of equal lengths were found. The consensus tree derived from these trees revealed four major clusters (Fig. 1; also see Fig. S1 in the supplemental material), and the trees differed mostly in the branching patterns within the clusters. The division between clusters was supported by alleles unique to or uniformly present in the cluster (Table 2), suggesting that these clusters were genuine genetic groups. Cluster I was supported by two SNPs, base G and T alleles at SNPs 11 and 35, respectively; cluster II was supported by four SNPs (SNPs 2, 8, 22, and 30); and cluster IV was supported by two SNPs (SNPs 17 and 25). However, there were no unique SNPs to support cluster III.
FIG. 1.
Phylogenetic relationships of serovar Typhi SNP profiles. Within the circles are SNP profile numbers. The relative sizes of the circles (not to scale) illustrate the sizes (numbers of isolates) of the SNP profiles. The numbers on the branches are the total number of SNP differences between the connecting nodes. The gain (+) or loss (−) of the z66 antigen is marked on the branches. Clusters are labeled with Roman numerals. Each shaded area represents a CC. SNP profiles of CT18 and Ty2 are labeled and indicated with arrows.
We then used eBURST (6) to identify the most closely related SNP profiles as clonal complexes (CCs) by using the MLST terminology, based on the principle that SNP profiles differing by one SNP arose by a single mutation and thus originated from the same clone. eBURST revealed four CCs (see Fig. S2 in the supplemental material) and six ungrouped profiles, 8, 9, 16, 17, 19, and 22. The latter were considered to be singletons as they had differences of more than one SNP from the four CCs or from one another. CC1 consisted of SNP profiles 10 to 14, 21, and 23, with SNP profile 10 identified as the founder of the CC. CC2 consisted of SNP profiles 1 to 5, and CC3 consisted of SNP profiles 15, 18, and 20, with SNP profiles 2 and 18, respectively, predicted to be the founders of these CCs. CC4 was the smallest, having only two SNP profiles, 6 and 7, and it was not determinable which SNP profile was the founder.
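The grouping principle described above (profiles that differ at a single SNP belong to the same complex, and the founder is the profile with the most single-SNP variants) can be sketched as follows. This is a simplified illustration and not the eBURST implementation; the profile names and six-SNP patterns are made up, and a tie for the largest number of single-SNP variants is reported as an undetermined founder.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of SNP positions at which two profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def clonal_complexes(profiles: dict):
    """Group profiles linked by single-SNP differences.

    Returns (complexes, singletons); each complex is (members, founder),
    with founder set to None when it cannot be determined.
    """
    variants = {name: set() for name in profiles}
    for a, b in combinations(profiles, 2):
        if hamming(profiles[a], profiles[b]) == 1:
            variants[a].add(b)
            variants[b].add(a)

    seen, complexes, singletons = set(), [], []
    for name in profiles:
        if name in seen:
            continue
        # Depth-first walk over single-SNP links to collect one component.
        stack, members = [name], set()
        while stack:
            current = stack.pop()
            if current in members:
                continue
            members.add(current)
            stack.extend(variants[current] - members)
        seen |= members
        if len(members) == 1:
            singletons.append(name)
            continue
        most = max(len(variants[m]) for m in members)
        top = sorted(m for m in members if len(variants[m]) == most)
        founder = top[0] if len(top) == 1 else None
        complexes.append((sorted(members), founder))
    return complexes, singletons


# Hypothetical profiles typed at six SNPs.
toy = {
    "P1": "AAAAAA", "P2": "AAAAAT", "P3": "AATAAA", "P4": "AAACAA",
    "P5": "GGGAAA", "P6": "GGGAAT", "P7": "CCCCCC",
}
complexes, singletons = clonal_complexes(toy)
print(complexes)
print(singletons)
```

In the toy data the second complex has two members with one variant each, so no founder is assigned, which mirrors the situation reported for CC4.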
A minimum spanning tree (MST) was constructed using Arlequin version 3.1 (5) to visualize the overall relationships of the profiles (Fig. 1). The MST groupings were consistent with the four clusters observed in the maximum-parsimony consensus tree. The MST showed that SNP profile 10 was the ancestral profile, connecting to the out-group by two changes, indicating that cluster III arose first and that the other three clusters emerged from cluster III. Most SNP profiles were linked to only one other SNP profile. However, SNP profiles 4, 16, and 17 showed equal distances from two or more SNP profiles, and alternative connections for these SNP profiles were represented on the tree as networks. The four CCs identified by eBURST were consistent with the phylogenetic clustering with the exception of the results for SNP profile 21. SNP profile 21 was assigned to CC1 by eBURST but belongs to cluster IV. However, there is no real conflict, as SNP profile 21 was the founding member of cluster IV. Cluster I was more homogeneous than the other clusters, as it was represented by CC2 only, while the other clusters contained more divergent members in addition to CCs.
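How an MST links profiles by the smallest number of SNP differences can be sketched with Prim's algorithm on pairwise Hamming distances. This is an illustrative stand-in and not Arlequin's implementation; the profiles are made up, and the sketch returns a single tree, so equally short alternative links (drawn as networks in Fig. 1) are not reported.

```python
import heapq


def hamming(a: str, b: str) -> int:
    """Number of SNP positions at which two profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(profiles: dict, root: str) -> list:
    """Prim's algorithm; returns edges as (from, to, number of SNP differences)."""
    in_tree = {root}
    edges = []
    heap = [(hamming(profiles[root], seq), root, name)
            for name, seq in profiles.items() if name != root]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(profiles):
        distance, src, dst = heapq.heappop(heap)
        if dst in in_tree:
            continue  # a shorter link to this profile was already used
        in_tree.add(dst)
        edges.append((src, dst, distance))
        for name, seq in profiles.items():
            if name not in in_tree:
                heapq.heappush(heap, (hamming(profiles[dst], seq), dst, name))
    return edges


# Hypothetical profiles typed at six SNPs; "A" plays the role of the root.
toy = {"A": "AAAAAA", "B": "AAAAAT", "C": "AAAATT", "D": "CCAAAA"}
tree = minimum_spanning_tree(toy, root="A")
print(tree)
print(sum(weight for _, _, weight in tree))
```

Rooting the tree at an ancestral or out-group profile, as in the analysis above, gives the edges a direction from which the order of emergence of clusters can be read.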
The phylogenetic tree allowed us to determine whether there was any association of phylogenetic clustering with genome types, defined by the arrangement of I-CeuI fragments (22), and/or phage types, determined based on sensitivity or resistance to Vi phages (3) (Fig. 1 and Table 1). The phage type is largely independent from the genome type, as shown in other studies (15). Nevertheless, the occurrence of a particular combination of genome type and phage type was predominant in two phylogenetic clusters. Most of the isolates belonging to cluster I had genome type 3 and phage type I+IV. Although genome type 3 dominated, there were also other genome types in the cluster. For example, SNP profile 2 contained genome types 22 and 27. However, these two genome types were likely to have been derived from the predominant genome type 3, as each required only a single genomic rearrangement (18). Cluster IV contained all isolates with phage type E1 and its variant E2 used in this study but had no dominant genome type. Clusters II and III had no apparent association with particular genome or phage types. Cluster II contained three genome types and four phage types, while cluster III had the most variable characteristics, with 18 and 16 different genome and phage types, respectively. No association of phylogenetic clusters with the years of isolation and/or the localities of isolation was found. The 19 Canadian isolates analyzed were scattered into three of the four clusters. Isolates from cluster III spanned all five regions represented in the study: Africa, America, Asia, Europe, and Oceania. This finding suggests that major serovar Typhi clones have spread globally. Large-scale typing of isolates from different regions by a genotyping method such as the SNP typing technique developed in this study will help further elucidate any spatial or temporal clustering of serovar Typhi clones.
Parallel changes.
The sites of six SNPs, 8, 11, 17, 35, 36, and 37, seem to have undergone parallel changes across two or more independent lineages. For example, the allele base A of SNP 8 supporting cluster II was also present in SNP profile 3 of cluster I, and the two alleles of SNP 37 were present in all four clusters. As alleles were initially deduced from restriction enzyme digestion, we confirmed the base changes of the alleles concerned by sequencing the representative isolates. Thus, the polymorphisms observed resulted from parallel changes due to either mutation or recombination. However, it will be difficult to determine whether the changes were due to mutation or recombination. We have recently shown that recombination is frequent within S. enterica subspecies I (23). Recombination within a serovar may also be frequent, and the parallel changes observed are likely to be due to recombination. It is interesting that four of the six SNPs involved were nonsynonymous, and selection pressure may play a role in driving some of these parallel changes. However, none of the genes is known to be related to virulence. Note that, although the T allele of SNP 25 was shared by all SNP profiles of cluster IV and two profiles of cluster III, this allele is not considered to be the result of a parallel change because the two SNP profiles of cluster III were in direct line with the emergence of cluster IV.
Origin of isolates expressing the z66 flagellar antigen.
Although most serovar Typhi isolates are monophasic for the expression of the flagellar antigen encoded by the fliC gene at the H1 locus, some Indonesian isolates have an additional z66 flagellar antigen, which was first described in 1981 (10) and was thought to be encoded by fljB at the H2 locus (19). The z66 flagellar antigen is now known to be encoded by a gene in the fljBA-like operon not located in the H2 locus (12). All the serovar Typhi isolates used in this study were typed for the presence of the z66 flagellar antigen gene by PCR using the primers described by Huang et al. (12). Among the 73 serovar Typhi isolates, 15 were known from serotyping results to express z66, and all 15 were confirmed to be positive for z66 by PCR (Table 1; data not shown). An additional three isolates were found to carry the z66 flagellar antigen gene by PCR. The z66-positive isolates had SNP profile 1, 2, 4, or 5 and were all grouped into cluster I. It appears that isolates expressing flagellar antigen z66 had a single origin in cluster I. However, one isolate from SNP profile 1 and all SNP profile 3 isolates did not have the z66 flagellar antigen. Presumably, these isolates had lost the gene. The presence of the z66 flagellar antigen only in cluster I suggests that serovar Typhi was originally monophasic, having only an H1 antigen, and then gained a new phase 2 flagellin gene-like operon only recently during the divergence of cluster I. This new flagellin gene is more similar to H27 fliC of E. coli than to other H antigen genes of S. enterica (12) and is located on a linear plasmid, as shown recently (2), which further supports the hypothesis of the recent acquisition of the z66 antigen through lateral transfer. 
The findings in this study suggest that the earlier hypothesis that serovar Typhi first adapted to humans in Indonesia and was initially diphasic, with an H2 locus encoding the z66 flagellar antigen, before spreading globally from Indonesia is less likely to be correct (9).
It was unclear whether isolates carrying the z66 antigen had a selective advantage, as the antigen had not been stably maintained within cluster I. The H antigen is part of the cell surface and one of the targets of the host immune system and, thus, is under intense selection pressure to change (40). The z66 antigen may be an advantage, considering the high incidence of typhoid fever in Indonesia. However, findings from earlier studies showing that z66 isolates are found almost exclusively in Indonesia and do not spread widely across the globe (38, 39) argue against this hypothesis. It is possible that the coexistence of monophasic and diphasic serovar Typhi isolates in Indonesia is a result of balancing selection to maintain the genetic diversity of the flagellar antigen in serovar Typhi (32).
The discriminatory power of SNP typing was determined using Simpson's index of diversity (D) calculated with an in-house program, MLEECOMP (available upon request). The D value for this study was 0.870. We compared the D value of SNP typing to those of MLST (13) and ribotyping (15), both of which used global isolates, as no comparable data set is available for pulsed-field gel electrophoresis, the “gold standard” for the comparison of the powers of typing methods (29). MLST and ribotyping had D values of 0.503 and 0.873, respectively. The SNP typing method developed in this study had a considerably higher discriminatory power than MLST but a power similar to that of ribotyping. However, the power of SNP typing can be increased by incorporating more SNPs, while ribotyping is constrained to detecting variation in the seven regions containing rrn operons only. Furthermore, the variation detected by ribotyping has resulted mostly from genome rearrangement due to rrn recombination rather than mutation (22), and this type of variation cannot be used to determine true relationships.
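The D value can be checked from the profile sizes listed in Table 2. The sketch below assumes the unbiased form of Simpson's index commonly used for typing methods, D = 1 − Σ n_j(n_j − 1) / [N(N − 1)], where n_j is the number of isolates in profile j and N is the total number of isolates; the article does not state which form MLEECOMP implements, so this is an assumption.

```python
def simpsons_index(group_sizes: list) -> float:
    """Simpson's index of diversity, D = 1 - sum(n_j(n_j-1)) / (N(N-1))."""
    total = sum(group_sizes)
    same_type_pairs = sum(n * (n - 1) for n in group_sizes)
    return 1 - same_type_pairs / (total * (total - 1))


# Numbers of isolates per SNP profile, in the row order of Table 2.
profile_sizes = [
    2, 12, 3, 1, 4,          # cluster I: profiles 1 to 5
    1, 2, 1, 1,              # cluster II: profiles 6 to 9
    23, 2, 1, 4, 3, 1, 2,    # cluster III: profiles 10 to 14, 16, 23
    1, 1, 4, 1, 1, 1, 1,     # cluster IV: profiles 15, 17 to 22
]
d_value = simpsons_index(profile_sizes)
print(round(d_value, 3))  # 0.868
```

Under this assumption the result agrees with the reported discriminatory power of 0.87 to two decimal places.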
Minimal SNP set required for differentiating SNP profiles.
To reduce the cost of genotyping and/or to facilitate large-scale typing, it would be useful to define a minimal SNP set that can identify all SNP profiles as assigned in this study. We identified 16 SNPs that could be used to classify the 73 serovar Typhi isolates into the same 23 SNP profiles (Table 2). These 16 SNPs require only four of the seven enzymes used in this study, AluI, HaeIII, HhaI, and HpaII, to type 3, 4, 5, and 3 SNPs, respectively. The total enzyme cost for typing an isolate is very small, far more economical than the cost of MLST. Further enhancement, such as automation and PCR multiplexing, can give additional advantage to this approach.
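One way to search for a reduced SNP set is a greedy selection that repeatedly adds the SNP separating the largest number of still-unresolved profile pairs. The sketch below is an assumption about how such a set could be found, not the procedure used in this study, and a greedy search is not guaranteed to return the smallest possible set. The profiles are made up and SNP positions are 0-based.

```python
from itertools import combinations


def greedy_snp_subset(profiles: dict) -> list:
    """Pick SNP positions (0-based) until every pair of profiles is separated."""
    n_snps = len(next(iter(profiles.values())))
    unresolved = set(combinations(profiles, 2))
    chosen = []
    while unresolved:
        def pairs_split(i):
            # Number of unresolved pairs that differ at SNP position i.
            return sum(profiles[a][i] != profiles[b][i] for a, b in unresolved)

        best = max(range(n_snps), key=pairs_split)
        if pairs_split(best) == 0:
            raise ValueError("some profiles are identical at every SNP")
        chosen.append(best)
        # Keep only the pairs that the chosen SNP does not separate.
        unresolved = {(a, b) for a, b in unresolved
                      if profiles[a][best] == profiles[b][best]}
    return sorted(chosen)


# Hypothetical profiles typed at five SNPs.
toy = {"P1": "AAAAA", "P2": "AATAA", "P3": "GATAA", "P4": "GAAAC"}
subset = greedy_snp_subset(toy)
print(subset)  # [0, 2]
```

A set chosen this way distinguishes only the profiles already known; isolates carrying a novel combination of alleles at the omitted SNPs would go undetected.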
Comparison of approaches to SNP discovery for typing.
In this study, we took advantage of the available genome sequences to obtain SNPs for typing. As these SNPs were derived from the comparison of only two genomes, they can reveal only the evolutionary path separating Ty2 and CT18, due to phylogenetic discovery bias (25). Nevertheless, the SNPs used in our study allowed the determination of the position of the last common ancestor of serovar Typhi and the node positions for the SNP profiles, despite the inability to determine the true branch lengths of the SNP profiles other than those of the two profiles representing Ty2 and CT18. A recent study by Roumagnac et al. (30) took a different approach to obtain SNPs, aiming to circumvent the phylogenetic discovery bias problem. The study surveyed 200 gene fragments from over 100 serovar Typhi isolates for variation and found 88 informative SNPs. These SNPs were used as markers to differentiate 481 global serovar Typhi isolates into 85 haplotypes and group them into five major clusters. Despite the large number of SNPs used, each of the five clusters was supported by a single SNP only and there was little resolution of the relationships within a cluster. No parallel changes were detected, contrary to what we observed in this study.
Twenty-nine of the isolates analyzed in our study were also studied by Roumagnac et al. (30). These isolates were distinguished into 17 SNP profiles in this study and into 14 haplotypes in the Roumagnac et al. study (30) (Table 1). Overall, our SNPs offered a slightly higher degree of differentiation for these 29 isolates. However, SNPs from the two studies gave different resolutions to different groups. Our SNPs divided haplotypes 39, 42, 50, and 59 further, while their SNPs distinguished six isolates of SNP profile 10, the largest profile, into individual haplotypes. At the cluster level, it appears that our clusters II and III were subdivisions of their cluster II, as the six SNP profiles falling into cluster II of Roumagnac et al. (30) were grouped into two separate clusters in our study (SNP profiles 6, 7, and 8 in cluster II and 10, 11, and 13 in cluster III). Further studies will be useful to identify a set of SNPs from these two studies that may offer optimal differentiation.
In conclusion, we have shown that SNPs obtained from genome comparisons are valuable markers for typing serovar Typhi isolates. The distinctive advantage of an SNP-based method is that true genetic relationships can be established. We have developed an SNP typing method based on PCR-restriction enzyme digestion, and our method has the advantage of minimal cost for consumables and the need for only basic laboratory equipment for the detection of SNPs.
Acknowledgments
This work was supported by a faculty research grant from the University of New South Wales.
We thank Ken Sanderson (University of Calgary) and Gordon Dougan (Imperial College London) for generously providing us with the strains. We also thank the anonymous referees for their constructive comments and suggestions.
Footnotes
Published ahead of print on 29 August 2007.
Supplemental material for this article may be found at http://jcm.asm.org/.
REFERENCES
- 1. Achtman, M., G. Morelli, P. Zhu, T. Wirth, I. Diehl, B. Kusecek, A. J. Vogler, D. M. Wagner, C. J. Allender, W. R. Easterday, V. Chenal-Francisque, P. Worsham, N. R. Thomson, J. Parkhill, L. E. Lindler, E. Carniel, and P. Keim. 2004. Microevolution and history of the plague bacillus, Yersinia pestis. Proc. Natl. Acad. Sci. USA 101:17837-17842.
- 2. Baker, S., J. Hardy, K. E. Sanderson, M. Quail, I. Goodhead, R. A. Kingsley, J. Parkhill, B. Stocker, and G. Dougan. 2007. A novel linear plasmid mediates flagellar variation in Salmonella typhi. PLoS Pathog. 3:605-610.
- 3. Craigie, J., and C. Yen. 1938. The demonstration of types of B. typhosus by means of preparations of type II Vi phage. 1. Principles and technique. 2. The stability and epidemiological significance of V form types of B. typhosus. Can. Public Health J. 29:448-496.
- 4. Deng, W., S. R. Liou, G. Plunkett III, G. F. Mayhew, D. J. Rose, V. Burland, V. Kodoyianni, D. C. Schwartz, and F. R. Blattner. 2003. Comparative genomics of Salmonella enterica serovar Typhi strains Ty2 and CT18. J. Bacteriol. 185:2330-2337.
- 5. Excoffier, L., G. Laval, and S. Schneider. 2005. Arlequin ver. 3.0: an integrated software package for population genetics data analysis. Evol. Bioinform. Online 1:47-50.
- 6. Feil, E. J., B. C. Li, D. M. Aanensen, W. P. Hanage, and B. G. Spratt. 2004. eBURST: inferring patterns of evolutionary descent among clusters of related bacterial genotypes from multilocus sequence typing data. J. Bacteriol. 186:1518-1530.
- 7. Fica, A. E., S. Prat-Miranda, A. Fernandez-Ricci, K. D'Ottone, and F. C. Cabello. 1996. Epidemic typhoid in Chile: analysis by molecular and conventional methods of Salmonella typhi strain diversity in epidemic (1977 and 1981) and nonepidemic (1990) years. J. Clin. Microbiol. 34:1701-1707.
- 8. Filliol, I., A. Motiwala, M. Cavatore, W. Qi, M. Hazbon, M. del Valle, J. Fyfe, L. Garcia-Garcia, N. Rastogi, C. Sola, T. Zozio, M. Guerrero, C. Leon, J. Crabtree, S. Angiuoli, K. Eisenach, R. Durmaz, M. Joloba, A. Rendon, J. Sifuentes-Osornio, A. de Leon, M. Cave, R. Fleischmann, T. Whittam, and D. Alland. 2006. Global phylogeny of Mycobacterium tuberculosis based on single nucleotide polymorphism (SNP) analysis: insights into tuberculosis evolution, phylogenetic accuracy of other DNA fingerprinting systems, and recommendations for a minimal standard SNP set. J. Bacteriol. 188:759-772.
- 9. Frankel, G., S. M. Newton, G. K. Schoolnik, and B. A. Stocker. 1989. Intragenic recombination in a flagellin gene: characterization of the H1-j gene of Salmonella typhi. EMBO J. 8:3149-3152.
- 10. Guinee, P. A., W. H. Jansen, H. M. Maas, L. Le Minor, and R. Beaud. 1981. An unusual H antigen (z66) in strains of Salmonella typhi. Ann. Microbiol. 132:331-334.
- 11. Gutacker, M. M., J. C. Smoot, C. A. Lux Migliaccio, S. M. Ricklefs, S. Hua, D. V. Cousins, E. A. Graviss, E. Shashkina, B. N. Kreiswirth, and J. M. Musser. 2002. Genome-wide analysis of synonymous single nucleotide polymorphisms in Mycobacterium tuberculosis complex organisms: resolution of genetic relationships among closely related microbial strains. Genetics 162:1533-1543.
- 12. Huang, X., L. V. Phung, S. Dejsirilert, P. Tishyadhigama, Y. Li, H. Liu, K. Hirose, Y. Kawamura, and T. Ezaki. 2004. Cloning and characterization of the gene encoding the z66 antigen of Salmonella enterica serovar Typhi. FEMS Microbiol. Lett. 234:239-246.
- 13. Kidgell, C., U. Reichard, J. Wain, B. Linz, M. Torpdahl, G. Dougan, and M. Achtman. 2002. Salmonella typhi, the causative agent of typhoid fever, is approximately 50,000 years old. Infect. Genet. Evol. 2:39-45.
- 14. Koay, A. S., M. Jegathesan, M. Y. Rohani, and Y. M. Cheong. 1997. Pulsed-field gel electrophoresis as an epidemiologic tool in the investigation of laboratory acquired Salmonella typhi infection. Southeast Asian J. Trop. Med. Public Health 28:82-84.
- 15. Kothapalli, S., S. Nair, S. Alokam, T. Pang, R. Khakhria, D. Woodward, W. Johnson, B. A. Stocker, K. E. Sanderson, and S. L. Liu. 2005. Diversity of genome structure in Salmonella enterica serovar Typhi populations. J. Bacteriol. 187:2638-2650.
- 16. Kubota, K., T. J. Barrett, M. L. Ackers, P. S. Brachman, and E. D. Mintz. 2005. Analysis of Salmonella enterica serotype Typhi pulsed-field gel electrophoresis patterns associated with international travel. J. Clin. Microbiol. 43:1205-1209.
- 17. Ling, J. M., N. W. Lo, Y. M. Ho, K. M. Kam, N. T. Hoa, L. T. Phi, and A. F. Cheng. 2000. Molecular methods for the epidemiological typing of Salmonella enterica serotype Typhi from Hong Kong and Vietnam. J. Clin. Microbiol. 38:292-300.
- 18. Liu, S.-L., and K. E. Sanderson. 1996. Highly plastic chromosomal organization in Salmonella typhi. Proc. Natl. Acad. Sci. USA 93:10303-10308.
- 19. Moshitch, S., L. Doll, B. Z. Rubinfeld, B. A. Stocker, G. K. Schoolnik, Y. Gafni, and G. Frankel. 1992. Mono- and bi-phasic Salmonella typhi: genetic homogeneity and distinguishing characteristics. Mol. Microbiol. 6:2589-2597.
- 20. Nair, S., C. L. Poh, Y. S. Lim, L. Tay, and K. T. Goh. 1994. Genome fingerprinting of Salmonella typhi by pulsed-field gel electrophoresis for subtyping common phage types. Epidemiol. Infect. 113:391-402.
- 21. Nair, S., E. Schreiber, K. L. Thong, T. Pang, and M. Altwegg. 2000. Genotypic characterization of Salmonella typhi by amplified fragment length polymorphism fingerprinting provides increased discrimination as compared to pulsed-field gel electrophoresis and ribotyping. J. Microbiol. Methods 41:35-43.
- 22. Ng, I., S. L. Liu, and K. E. Sanderson. 1999. Role of genomic rearrangements in producing new ribotypes of Salmonella typhi. J. Bacteriol. 181:3536-3541.
- 23. Octavia, S., and R. Lan. 2006. Frequent recombination and low level of clonality within Salmonella enterica subspecies I. Microbiology 152:1099-1108.
- 24. Parkhill, J., G. Dougan, K. D. James, N. R. Thomson, D. Pickard, J. Wain, C. Churcher, K. L. Mungall, S. D. Bentley, M. T. Holden, M. Sebaihia, S. Baker, D. Basham, K. Brooks, T. Chillingworth, P. Connerton, A. Cronin, P. Davis, R. M. Davies, L. Dowd, N. White, J. Farrar, T. Feltwell, N. Hamlin, A. Haque, T. T. Hien, S. Holroyd, K. Jagels, A. Krogh, T. S. Larsen, S. Leather, S. Moule, P. O'Gaora, C. Parry, M. Quail, K. Rutherford, M. Simmonds, J. Skelton, K. Stevens, S. Whitehead, and B. G. Barrell. 2001. Complete genome sequence of a multiple drug resistant Salmonella enterica serovar Typhi CT18. Nature 413:848-852.
- 25. Pearson, T., J. Busch, J. Ravel, T. Read, S. Rhoton, J. U'Ren, T. Simonson, S. Kachur, R. Leadem, M. Cardon, M. Van Ert, L. Huynh, C. Fraser, and P. Keim. 2004. Phylogenetic discovery bias in Bacillus anthracis using single-nucleotide polymorphisms from whole-genome sequencing. Proc. Natl. Acad. Sci. USA 101:13536-13541.
- 26. Quintaes, B. R., N. C. Leal, E. M. Reis, and E. Hofer. 2004. Optimization of randomly amplified polymorphic DNA-polymerase chain reaction for molecular typing of Salmonella enterica serovar Typhi. Rev. Soc. Bras. Med. Trop. 37:143-147.
- 27. Reeves, M. W., G. M. Evins, A. A. Heiba, B. D. Plikaytis, and J. J. Farmer III. 1989. Clonal nature of Salmonella typhi and its genetic relatedness to other salmonellae as shown by multilocus enzyme electrophoresis, and proposal of Salmonella bongori. J. Clin. Microbiol. 27:313-320.
- 28. Rocha, E. P. C., J. Maynard Smith, L. D. Hurst, M. T. G. Holden, J. E. Cooper, N. H. Smith, and E. J. Feil. 2006. Comparisons of dN/dS are time dependent for closely related bacterial genomes. J. Theor. Biol. 239:226-235.
- 29. Ross, I. L., and M. W. Heuzenroeder. 2005. Use of AFLP and PFGE to discriminate between Salmonella enterica serovar Typhimurium DT126 isolates from separate food-related outbreaks in Australia. Epidemiol. Infect. 133:635-644.
- 30. Roumagnac, P., F.-X. Weill, C. Dolecek, S. Baker, S. Brisse, N. T. Chinh, T. A. H. Le, C. J. Acosta, J. Farrar, G. Dougan, and M. Achtman. 2006. Evolutionary history of Salmonella typhi. Science 314:1301-1304.
- 31. Sambrook, J., and D. Russell. 2001. Molecular cloning: a laboratory manual, 3rd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
- 32. Selander, R. K., J. Li, and K. Nelson. 1996. Evolutionary genetics of Salmonella enterica, p. 2691-2707. In F. C. Neidhardt, R. Curtiss III, J. L. Ingraham, E. C. C. Lin, K. B. Low, B. Magasanik, W. S. Reznikoff, M. Riley, M. Schaechter, and H. E. Umbarger (ed.), Escherichia coli and Salmonella: cellular and molecular biology, 2nd ed., vol. 2. American Society for Microbiology, Washington, DC.
- 33. Shangkuan, Y. H., and H. C. Lin. 1998. Application of random amplified polymorphic DNA analysis to differentiate strains of Salmonella typhi and other Salmonella species. J. Appl. Microbiol. 85:693-702.
- 34. Swofford, D. L. 1998. PAUP: phylogenetic analysis using parsimony, 4.0 beta ed. Sinauer Associates, Sunderland, MA.
- 35. Thong, K. L., M. Passey, A. Clegg, B. G. Combs, R. M. Yassin, and T. Pang. 1996. Molecular analysis of isolates of Salmonella typhi obtained from patients with fatal and nonfatal typhoid fever. J. Clin. Microbiol. 34:1029-1033.
- 36. Thong, K. L., S. Puthucheary, R. M. Yassin, P. Sudarmono, M. Padmidewi, E. Soewandojo, I. Handojo, S. Sarasombath, and T. Pang. 1995. Analysis of Salmonella typhi isolates from Southeast Asia by pulsed-field gel electrophoresis. J. Clin. Microbiol. 33:1938-1941.
- 37. Threlfall, E. J., E. Torre, L. R. Ward, B. Rowe, and I. Gibert. 1993. Insertion sequence IS200 can differentiate drug-resistant and drug-sensitive Salmonella typhi of Vi-phage types E1 and M1. J. Med. Microbiol. 39:454-458.
- 38. Vieu, J. F., H. Binette, and M. Leherissey. 1986. Absence of the antigen H:z66 in 2355 strains of Salmonella typhi from Madagascar and several countries of tropical Africa. Bull. Soc. Pathol. Exp. 79:22-26.
- 39. Vieu, J. F., and M. Leherissey. 1988. The antigen H:z66 in 1,000 strains of Salmonella typhi from the Antilles, Central America and South America. Bull. Soc. Pathol. Exp. 81:198-201.
- 40. Wang, L., D. Rothemund, H. Curd, and P. Reeves. 2003. Species-wide variation in the Escherichia coli flagellin (H antigen) gene. J. Bacteriol. 185:2936-2943.
- 41. WHO. 2005. Typhoid fever, Democratic Republic of the Congo. Wkly. Epidemiol. Rec. 80:1-8.
- 42. Zhang, W., W. Qi, T. Albert, A. Motiwala, D. Alland, E. Hyytia-Trees, E. Ribot, P. Fields, T. Whittam, and B. Swaminathan. 2006. Probing genomic diversity and evolution of Escherichia coli O157 by single nucleotide polymorphisms. Genome Res. 16:757-767.