Abstract
Background
Whole-genome sequencing (WGS) of Treponema pallidum subspecies pallidum (TPA) has been constrained by the lack of in vitro cultivation methods for isolating spirochetes from patient samples.
Methods
We built upon recently developed enrichment methods to sequence TPA directly from primary syphilis chancre swabs collected in Guangzhou, China.
Results
By combining parallel, pooled whole-genome amplification with hybrid selection, we generated high-quality genomes from 4 of 8 chancre-swab samples and 2 of 2 rabbit-passaged isolates, all subjected to challenging storage conditions.
Conclusions
This approach enabled the first WGS of Chinese samples without rabbit passage and provided insights into TPA genetic diversity in China.
Keywords: chancre, China, syphilis, Treponema pallidum, whole-genome sequencing
Parallel, pooled whole-genome amplification and hybrid selection enabled whole-genome sequencing of Treponema pallidum DNA extracted directly from primary syphilis chancre swabs, despite challenging storage conditions. Phylogenomic analysis revealed diverse strains, including the first Nichols-like genome reported from China to date.
Efforts to address the resurgence of the sexually transmitted disease syphilis have been hampered by the limited understanding of the genetic diversity of its causative pathogen, the spirochete Treponema pallidum subspecies (subsp) pallidum (TPA) [1]. A vaccine to prevent TPA infection and transmission is sorely needed. Design of an effective syphilis vaccine requires an understanding of the diversity of TPA strains circulating worldwide, particularly in countries such as China, where the incidence of disease has steadily increased by more than 30-fold since the mid-1990s but from which only 9 TPA genomes have been reported to date [2–5].
Genomic analyses of TPA have lagged behind those of other pathogens, largely due to the lack of in vitro methods for isolating these spirochetes from patient samples. Although TPA in vitro cultivation is now possible, its use for direct culture from patient specimens has not yet been described [6]. Only a small number of TPA genomes have been published, and the majority of sequenced isolates required passage in rabbit testicles before sequencing [7]. Several strategies have been used to enrich TPA from clinical specimens. Selective enrichment through hybridization with ribonucleic acid (RNA) oligonucleotides for pull-down of complementary TPA deoxyribonucleic acid (DNA) fragments has now been used in several recent TPA genomic analyses [8–10]. Two newer techniques have also been described: one that utilizes antitreponemal antibody binding to enrich TPA cells isolated from clinical specimens and another that uses methyl-directed enrichment using the restriction nuclease DpnI [11, 12]. Although these methods represent advancements for the field, improved and complementary approaches for sequencing TPA in patient samples are needed, especially for samples with low TPA burdens or DNA loss during sample processing.
In the present study, we build upon recent advances in TPA enrichment to expand the range of samples from which whole-genome sequences can be obtained. We successfully sequenced TPA DNA extracted from both rabbit-passaged isolates and chancre swabs from Guangzhou, China. Using a combination of parallel, pooled whole-genome amplification (ppWGA) and hybrid selection, we achieved >80% genomic coverage in 6 of 10 samples, including 4 of 8 that had not undergone rabbit passage, despite challenging sample processing and storage conditions. Phylogenomic analyses revealed diverse TPA strains, including the first Nichols-like genome published from China to date.
METHODS
Study Population and Sample Processing
We collected samples from chancre exudates in Guangzhou, China, from 2017 to 2018. In brief, genital ulcers of patients diagnosed with primary syphilis were swabbed and processed as described in the Supplementary Methods. For the present study, we selected 10 samples from male patients ages 20 to 68 for sequencing (Supplementary Table 1). These included 8 DNA samples extracted directly from chancre swabs and 2 samples extracted after rabbit intratesticular passage. This study was approved by the ethics committee of the Dermatology Hospital of Southern Medical University (GDDHLS-20170614). The University of North Carolina at Chapel Hill (UNC) Institutional Review Board determined that the analysis of de-identified TPA DNA samples did not constitute human subjects research (No. 18-1949). All patients provided written informed consent and were offered treatment as part of routine care.
Treponema pallidum subsp pallidum genome copy numbers and total DNA concentrations were quantified using real-time quantitative polymerase chain reaction (qPCR) and a Qubit 4.0 fluorimeter with dsDNA HS reagents (Thermo Fisher Scientific, Waltham, MA), respectively, before freeze-drying and overnight shipment to UNC at ambient temperature (see Supplementary Methods). Freeze-drying was performed to comply with regulatory requirements and to facilitate long-distance shipment. Dehydrated samples were stored at −80°C until resuspension in 25 μL or 100 μL of Buffer EB (QIAGEN, Venlo, Netherlands).
Parallel, Pooled Whole-Genome Amplification, Hybrid Selection, and Library Preparation
Total DNA concentrations of resuspended samples were quantified using a Qubit fluorimeter as described above. For those samples with concentrations below the fluorimeter’s limit of detection (0.2 ng/µL), whole-genome amplification (WGA) using the Illustra GenomiPhi V2 Amplification kit (GE Healthcare, Chicago, IL) was used to increase the library input. Three microliters of DNA template was used for each reaction; reactions were otherwise performed according to the manufacturer’s instructions with the exception of reaction time, which was increased to 6 hours due to the low DNA concentration. Five separate WGA reactions were performed in parallel for each individual sample. The amplification products for each sample were pooled so that each WGA reaction contributed an equal amount of total DNA, and the pool was purified using KAPA Pure Beads (Kapa Biosystems, Wilmington, MA).
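The triage and pooling steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the laboratory protocol: the function names, the example product concentrations, and the target mass per reaction are hypothetical, and the pooling rule reflects one reading of "equal total DNA input per WGA reaction" (each reaction contributes the same DNA mass to the pool).

```python
QUBIT_LOD_NG_PER_UL = 0.2  # fluorimeter limit of detection quoted in the text


def needs_ppwga(concentration_ng_per_ul):
    """Return True when a sample falls below the limit of detection.

    None stands for a reading the instrument reports as too low to quantify.
    """
    return (
        concentration_ng_per_ul is None
        or concentration_ng_per_ul < QUBIT_LOD_NG_PER_UL
    )


def equal_mass_pool_volumes(product_conc_ng_per_ul, mass_per_reaction_ng):
    """Volume (in µL) to draw from each WGA product so that every
    reaction contributes the same DNA mass to the pool."""
    if any(c <= 0 for c in product_conc_ng_per_ul):
        raise ValueError("every WGA product needs a positive concentration")
    return [mass_per_reaction_ng / c for c in product_conc_ng_per_ul]


# Hypothetical concentrations (ng/µL) of five parallel WGA products.
products = [200.0, 250.0, 400.0, 500.0, 1000.0]
volumes = equal_mass_pool_volumes(products, mass_per_reaction_ng=1000.0)
print(volumes)  # [5.0, 4.0, 2.5, 2.0, 1.0]
```

Pooling by mass rather than by volume keeps a single high-yield reaction from dominating the pooled template.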
Treponema pallidum subsp pallidum DNA was enriched using the SureSelect XT HS target enrichment system (Agilent Technologies, Santa Clara, CA), which utilizes RNA oligonucleotide probes to hybridize DNA of interest. We custom-designed probes using all TPA genomes that were publicly available at the time of construction, with increased tiling density of probes across specific regions of interest, including phylogenetically informative loci and those that encode known or putative outer membrane proteins (Supplementary Tables 2 and 3) [13]. Probe design, library construction, and hybridization were performed as described in the Supplementary Methods. Sequencing was performed at UNC using the MiSeq platform (Illumina, San Diego, CA) with 150-base pair paired-end reads.
Genomic Alignment, Variant Calling, and Phylogenetic Analysis
For phylogenetic analysis, we included 68 publicly available (18 complete and 50 draft), geographically diverse TPA genomes from 3 continents, in addition to the genomes generated in the present study. Sequencing alignment, variant calling, and phylogenetic analysis were performed as depicted in Figure 1. Intrastrain heterogeneity was assessed as described in the Supplementary Methods. Putative sites of recombination and indels were removed before phylogenetic analysis. We assessed the validity of our variant calls and phylogenetic analysis by Sanger sequencing 5 targets, including 3 loci used in a recently proposed multilocus strain typing (MLST) system (tp0136, tp0548, and tp0705) and each of TPA’s 2 23S rRNA operons, in which mutations have been associated with azithromycin resistance (see Supplementary Methods) [14]. Sequences were uploaded to the Sequence Read Archive (SRR10041766-75) and GenBank (see Supplementary Table 4).
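The removal of putative recombination sites and indels before tree building can be illustrated with a short Python sketch. It assumes variants are available as (position, reference allele, alternate allele) tuples and masked regions as 1-based inclusive intervals; these data structures and the function names are illustrative assumptions and do not describe the pipeline depicted in Figure 1.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping or adjacent 1-based inclusive (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def filter_variants(variants, masked_regions):
    """Keep single-nucleotide variants that lie outside every masked region.

    variants: list of (position, ref, alt) tuples.
    masked_regions: list of (start, end) intervals, e.g. putative
    recombination regions.
    """
    merged = merge_intervals(masked_regions)
    starts = [start for start, _ in merged]
    kept = []
    for pos, ref, alt in variants:
        if len(ref) != 1 or len(alt) != 1:
            continue  # indel: excluded from the phylogenetic alignment
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos <= merged[i][1]:
            continue  # falls inside a masked region
        kept.append((pos, ref, alt))
    return kept


# Toy example with hypothetical coordinates.
variants = [(100, "A", "G"), (250, "C", "T"), (300, "AT", "A"), (900, "G", "A")]
masked = [(200, 400), (350, 500)]
print(filter_variants(variants, masked))  # [(100, 'A', 'G'), (900, 'G', 'A')]
```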
RESULTS
Whole-Genome Sequencing Outcomes
After freeze-drying and rehydration, only 3 samples had sufficient DNA concentrations for quantification by Qubit. Total DNA concentrations for these samples ranged from 1.3 to 10.6 ng/µL, representing a 47% to 58% decrease in total DNA compared with the original samples (Supplementary Table 4). The remaining 7 rehydrated DNA samples had concentrations beneath the fluorimeter’s limit of detection and were subjected to ppWGA before hybridization (Figure 1). Despite evidence of DNA loss during the process of freeze-drying, shipment, and rehydration, we successfully sequenced TPA genomes from 6 of 10 samples (60%), including 2 of 2 rabbit-passaged isolates and 4 of 8 chancre-swab samples that had not undergone rabbit passage. Among the 6 samples with sufficient coverage for phylogenomic analysis, 83.5% to 99.1% of the genomes were covered with ≥3 reads.
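The two summary metrics used in this paragraph, the percentage of the genome covered by at least 3 reads and the percentage decrease in total DNA, can be computed as in the following Python sketch. The per-base depth list and the DNA quantities are toy values chosen for illustration, not study data, and the function names are hypothetical.

```python
def breadth_of_coverage(depths, min_depth=3):
    """Percentage of reference positions covered by at least min_depth reads.

    depths: per-base read depth across the reference genome.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


def percent_decrease(before, after):
    """Percentage decrease from a starting quantity to a final quantity."""
    if before <= 0:
        raise ValueError("the starting quantity must be positive")
    return 100.0 * (before - after) / before


# Toy per-base depths for a 10-position reference.
toy_depths = [0, 1, 3, 5, 10, 2, 3, 0, 7, 4]
print(breadth_of_coverage(toy_depths))  # 60.0
print(percent_decrease(20.0, 10.0))     # 50.0
```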
Sequencing was more successful in samples with higher TPA concentration before freeze-drying (Supplementary Table 4). Samples that contained 3320 to 3 160 000 polA copies/µL before freeze-drying achieved ≥3× depth in >80% of their genomes. Sequences from samples subjected to ppWGA before hybridization showed evidence of marked “jackpotting” events, that is, very high coverage in specific regions (Figure 2A). Some of these events were unique to a given sample, whereas others were shared across samples. Although jackpotting events are a known limitation of WGA and decrease the cost efficiency of the sequencing runs, the use of ppWGA reactions nonetheless enabled salvage of low-concentration, rehydrated samples. In the 2 samples that were subjected to both approaches (SMUTp_08 and SMUTp_09), ppWGA achieved more even coverage than conventional WGA performed in singleton (Supplementary Figure 1). Together, ppWGA and hybrid selection allowed us to generate the first Chinese TPA genomes without rabbit passage.
Phylogenetic Diversity of Chinese Treponema pallidum Subspecies pallidum Isolates
We observed diverse TPA strains within our modest sample set (Figure 2B), which included 94 single-nucleotide variants (SNVs) expected to induce coding changes. Sixty-six putative sites of recombination were excluded from whole-genome phylogenetic analysis (Supplementary Materials and Supplementary Tables 5–8), including hypervariable loci like TprK that our short-read sequencing approach could not resolve unambiguously. We did not observe obvious clustering of rabbit-passaged TPA isolates apart from directly sequenced TPA isolates, a pattern that might be expected if passage in rabbits exerted selective pressure favoring specific variants.
The genomes produced in the present study fell into 3 distinct clusters and could be distinguished from the previously published Chinese Amoy genome (Supplementary Table 10). These clusters included 1 closely related to previously sequenced Chinese SS14-like isolates, 1 distinct SS14-like cluster, and 1 Nichols-like genome (SMUTp_07) [2–5, 9, 10]. The MLST results confirmed the strain relationships observed during phylogenomic analysis (Supplementary Table 11), with samples falling into 3 of the most common allelic profiles reported in PubMLST (www.pubmlst.org). Samples with the 1.1.1 allelic profile were the most diverse compared with others of the same profile. Although past surveys using traditional molecular strain typing methods indicated low-level prevalence of Nichols-like strains in China [9, 15], SMUTp_07 is the first Nichols-like whole-genome sequence reported from China to our knowledge.
Initial whole-genome alignments of sample SMUTp_02 suggested the presence of a strain with 1 wild-type (macrolide-sensitive) 23S ribosomal RNA (rRNA) operon. However, manual inspection of these reads revealed poor mapping and sequences with 100% homology to Pseudomonas species. Sanger sequencing revealed that all samples harbored mutations associated with macrolide resistance in both 23S rRNA operons (Supplementary Table 12). In addition, we observed evidence of intrastrain heterogeneity in several genes (Supplementary Materials and Supplementary Table 13).
DISCUSSION
The lack of an in vitro TPA cultivation system from patient specimens and the need for rabbit passage have limited syphilis research. Recently developed sample enrichment and sequencing techniques offer new opportunities to study TPA transmission and evolution. However, they often fail when applied to clinical samples isolated directly from patients due to low TPA burdens. Although the number of publicly available TPA genomes has increased rapidly in the past several years, most published genomes required TPA enrichment by rabbit passage before WGS.
We built upon recent advances in TPA genomics by piloting a novel WGA approach (ppWGA) for enrichment of challenging clinical samples with low DNA concentrations before hybrid selection. Enrichment success using hybrid selection alone has previously been achieved with clinical samples with more than 1 × 10⁴ TPA copies/µL [8]. However, this approach is expected to fail for many clinical samples collected directly from patients, which often harbor only 10¹–10⁴ TPA copies/µL. The success of our approach in samples with concentrations nearing 10³ TPA copies/µL measured before demanding sample processing and shipment confirms the utility of ppWGA as an adjunct to other TPA enrichment methods. These methods have potential for use in other fields and with diverse sample types, especially when freeze-drying is required for regulatory compliance or samples have low target concentrations.
Whole-genome amplification is typically used in singleton for nonspecific preamplification before selective enrichment and/or sequencing library preparation. During pilot testing in the present study, we observed distinct, isolated regions of deep sequencing coverage (“jackpotting”) in samples subjected to a single WGA. These findings may be due to early priming by random hexamers and amplification during the WGA process. Because the genomic locations that undergo jackpotting during a given WGA reaction appeared to be stochastic, we performed WGA reactions in parallel and pooled their products before proceeding with hybrid selection and library preparation.
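The rationale for pooling can be illustrated with a toy simulation in Python. It rests on strong assumptions that are not taken from the study: per-region amplification yield in each reaction is drawn independently from a log-normal distribution, and reactions are pooled in equal mass. Under those assumptions, region-to-region variability (summarized as the coefficient of variation) is lower in the pool than in a single reaction. Jackpots that recur at the same locus across reactions, such as those shared across samples, would not be evened out by pooling.

```python
import random
import statistics


def simulate_wga(n_bins, rng, sigma=1.0):
    """One simulated WGA reaction: relative yield per genomic bin.

    Yields are log-normal, so a few bins amplify far more than the rest
    (a crude stand-in for jackpotting). sigma is a hypothetical parameter.
    """
    return [rng.lognormvariate(0.0, sigma) for _ in range(n_bins)]


def pooled_wga(n_reactions, n_bins, rng, sigma=1.0):
    """Pool independent reactions after scaling each to equal total mass."""
    pooled = [0.0] * n_bins
    for _ in range(n_reactions):
        reaction = simulate_wga(n_bins, rng, sigma)
        total = sum(reaction)
        for i, bin_yield in enumerate(reaction):
            pooled[i] += bin_yield / total
    return pooled


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.fmean(values)


rng = random.Random(0)  # fixed seed so the run is reproducible
single_cv = coefficient_of_variation(simulate_wga(2000, rng))
pooled_cv = coefficient_of_variation(pooled_wga(5, 2000, rng))
print(f"single reaction CV: {single_cv:.2f}")
print(f"five pooled reactions CV: {pooled_cv:.2f}")
```

For independent reactions, the coefficient of variation of the pool is expected to shrink roughly with the square root of the number of reactions pooled.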
Our study is limited by its small sample size and the challenges of low-concentration and degraded samples. Although ppWGA enabled successful enrichment of a portion of these samples, the presence of Pseudomonas in one of the downstream 23S rRNA alignments confirms the need for careful attention to regions that are highly conserved across bacteria during analysis. We overcame the challenge of incorrect mapping of short reads by incorporating rigorous quality filters and removing reads that cross-mapped to other bacteria during bioinformatic analysis. These measures are especially important when nonspecific amplification such as ppWGA is used during the sample enrichment process.
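A read filter of the kind described above can be sketched as follows in Python. The study's actual filters are described only in general terms, so every detail here is an assumption made for illustration: reads are represented by an identifier with a mapping quality and alignment score against the TPA reference, a second lookup holds each read's best alignment score against other bacterial genomes, and the mapping-quality threshold of 30 is arbitrary.

```python
def keep_target_reads(target_hits, decoy_scores, min_mapq=30):
    """Return the IDs of reads that map confidently and best to the target.

    target_hits: dict of read_id -> (mapping_quality, alignment_score)
        for alignments to the TPA reference.
    decoy_scores: dict of read_id -> best alignment score against
        non-target bacterial references; reads with no such hit are absent.
    min_mapq: hypothetical mapping-quality threshold.
    """
    kept = []
    for read_id, (mapq, score) in target_hits.items():
        if mapq < min_mapq:
            continue  # ambiguous or poorly mapped read
        if decoy_scores.get(read_id, float("-inf")) >= score:
            continue  # aligns at least as well to another bacterium
        kept.append(read_id)
    return sorted(kept)


# Toy example: r2 aligns better to a decoy genome, r3 maps poorly.
target_hits = {"r1": (60, 150), "r2": (60, 120), "r3": (5, 148), "r4": (42, 140)}
decoy_scores = {"r2": 150, "r4": 100}
print(keep_target_reads(target_hits, decoy_scores))  # ['r1', 'r4']
```

Comparing scores against non-target references matters most in conserved loci such as the 23S rRNA operons, where a contaminant read can align plausibly to the TPA reference.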
CONCLUSIONS
We used ppWGA and hybrid selection to gain new insights into the genetic diversity of TPA strains currently circulating in China, where there is a syphilis epidemic and only a small number of genomes from rabbit-passaged isolates have been published [2–5]. Additional studies using novel enrichment methods and in locations where little is known about TPA genomic diversity, including sites outside of the United States and Europe, are needed to inform syphilis vaccine design.
Supplementary Data
Supplementary materials are available at The Journal of Infectious Diseases online. Consisting of data provided by the authors to benefit the reader, the posted materials are not copyedited and are the sole responsibility of the authors, so questions or comments should be addressed to the corresponding author.
Supplementary Table 1. Clinical information for Chinese samples included in this study. See attached file.
Supplementary Table 2. Genomes used during the design of RNA oligonucleotide “baits” for hybrid selection. The SS14 and Nichols genomes were covered with 1× tiling density (1 bait per locus) with the exception of loci provided in Supplementary Table 3. See attached file.
Supplementary Table 3. Genomic regions of increased bait tiling density (5×). Phylogenetically informative loci and those encoding known or putative outer membrane proteins were covered with at least 5× tiling density (5 baits per locus). See attached file.
Supplementary Table 4. Sample details and sequencing results. See attached file.
Supplementary Table 5. Putative regions of recombination identified by Gubbins and excluded from phylogenetic analysis. See attached file.
Supplementary Table 6. Called variants among the 5 Chinese SS14-like TPA isolates described in the present study, annotated using SnpEff, before removal of putative sites of recombination and paralogous genes. See attached file.
Supplementary Table 7. Called variants for the single Chinese Nichols-like TPA isolate described in the present study, annotated using SnpEff, before removal of putative sites of recombination and paralogous genes. See attached file.
Supplementary Table 8. Called variants expected to induce coding changes (amino acid changes) among the 6 Chinese TPA isolates described in the present study. See attached file.
Supplementary Table 9. Mutations in penicillin-associated genes. See attached file.
Supplementary Table 10. Comparison of SNVs between the new Chinese SS14-like strains (present study) and the Amoy strain. See attached file.
Supplementary Table 11. MLST allelic profiles of Chinese TPA strains. See attached file.
Supplementary Table 12. Genetic markers of macrolide (azithromycin) resistance. See attached file.
Supplementary Table 13. Single nucleotide intrastrain heterogeneous sites. See attached file.
Supplementary Table 14. Sequences used during phylogenetic tree construction. See attached file.
Notes
Acknowledgments. We thank Madeline Denton for assistance with Treponema pallidum subspecies pallidum whole-genome amplification experiments, Dr. Nicholas Brazeau for assistance with initial bioinformatic analysis, Dr. Jiajian Zhou for assistance with next-generation sequencing data analysis, and Tianci Yang for providing plasmid deoxyribonucleic acid for polA quantitative polymerase chain reaction standards.
Financial support. This study was funded by the Medical Scientific Research Foundation of Guangdong Province, China (A2018264 and B2019022; to W. C.) and Science and Technology Planning Project of Guangdong Province, China (2017A020212008; to H. Z.). Additional support was provided by the National Institute for Allergy and Infectious Diseases (U19AI144177 to D. S., K. L. H., M. J. C., J. D. R., A. S., J. D. T., B. Y., J. J. J., H. Z., and J. B. P.; K24AI143471 to J. D. T.; K24AI134990 to J. J. J.; and R01AI26756 to J. D. R.), a Yang Biomedical Scholars Award (to J. J. J.), the Clara Guthrie Patterson Trust, Bank of America, N.A., Trustee (K. L. H.), the Connecticut Children’s Medical Center (J. D. R., M. J. C., and K. L. H.), the Grant Agency of the Czech Republic (GA17-25455S, gacr.cz; to D. S.), and an American Society for Tropical Medicine and Hygiene-Burroughs Wellcome Foundation Award (to J. B. P.).
Potential conflicts of interest. J. B. P. reports grants from the World Health Organization and Gilead Sciences and nonfinancial support from Abbott Diagnostics, outside the submitted work. J. D. R. reports personal fees from Biokit SA (through UT Southwestern Medical Center), Span Diagnostics (through UConn Health), Diasorin (through UT Southwestern Medical Center), and Chembio Diagnostics (through UConn Health), and a patent through UT Southwestern Medical Center licensed, outside the submitted work. A. S. reports grants from Gilead FOCUS and personal fees from Hologic, Inc., and from UptoDate, outside the submitted work. All authors have submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Conflicts that the editors consider relevant to the content of the manuscript have been disclosed.
References
- 1. Radolf JD, Deka RK, Anand A, Šmajs D, Norgard MV, Yang XF. Treponema pallidum, the syphilis spirochete: making a living as a stealth pathogen. Nat Rev Microbiol 2016; 14:744–59.
- 2. Tao Y, Chen MY, Tucker JD, et al. A nationwide spatiotemporal analysis of syphilis over 21 years and implications for prevention and control in China. Clin Infect Dis 2020; 70:136–9.
- 3. Tong ML, Zhao Q, Liu LL, et al. Whole genome sequence of the Treponema pallidum subsp. pallidum strain Amoy: an Asian isolate highly similar to SS14. PLoS One 2017; 12:e0182768.
- 4. Strouhal M, Oppelt J, Mikalová L, et al. Reanalysis of Chinese Treponema pallidum samples: all Chinese samples cluster with SS14-like group of syphilis-causing treponemes. BMC Res Notes 2018; 11:16.
- 5. Sun J, Meng Z, Wu K, et al. Tracing the origin of Treponema pallidum in China using next-generation sequencing. Oncotarget 2016; 7:42904–18.
- 6. Edmondson DG, Hu B, Norris SJ. Long-term in vitro culture of the syphilis spirochete Treponema pallidum subsp. pallidum. MBio 2018; 9:e01153–18.
- 7. Fraser CM, Norris SJ, Weinstock GM, et al. Complete genome sequence of Treponema pallidum, the syphilis spirochete. Science 1998; 281:375–88.
- 8. Pinto M, Borges V, Antelo M, et al. Genome-scale analysis of the non-cultivable Treponema pallidum reveals extensive within-patient genetic variation. Nat Microbiol 2016; 2:16190.
- 9. Arora N, Schuenemann VJ, Jäger G, et al. Origin of modern syphilis and emergence of a pandemic Treponema pallidum cluster. Nat Microbiol 2016; 2:16245.
- 10. Beale MA, Marks M, Sahi SK, et al. Genomic epidemiology of syphilis reveals independent emergence of macrolide resistance across multiple circulating lineages. Nat Commun 2019; 10:3255.
- 11. Grillová L, Giacani L, Mikalová L, et al. Sequencing of Treponema pallidum subsp. pallidum from isolate UZ1974 using anti-treponemal antibodies enrichment: first complete whole genome sequence obtained directly from human clinical material. PLoS One 2018; 13:e0202619.
- 12. Grillová L, Oppelt J, Mikalová L, et al. Directly sequenced genomes of contemporary strains of syphilis reveal recombination-driven diversity in genes encoding predicted surface-exposed antigens. Front Microbiol 2019; 10:1691.
- 13. Radolf JD, Kumar S. The Treponema pallidum outer membrane. Curr Top Microbiol Immunol 2018; 415:1–38.
- 14. Grillová L, Bawa T, Mikalová L, et al. Molecular characterization of Treponema pallidum subsp. pallidum in Switzerland and France with a new multilocus sequence typing scheme. PLoS One 2018; 13:e0200773.
- 15. Dai T, Li K, Lu H, Gu X, Wang Q, Zhou P. Molecular typing of Treponema pallidum: a 5-year surveillance in Shanghai, China. J Clin Microbiol 2012; 50:3674–7.