KEYWORDS: Corynebacterium, DNA sequencing, diphtheria, genome analysis, genomics, molecular epidemiology, molecular subtyping, phylogenetic analysis, plasmid analysis, virulence factors
ABSTRACT
In some parts of the world, Corynebacterium diphtheriae has reemerged as a pathogen, especially as a cause of infections among impoverished and marginalized populations. We performed whole-genome sequencing (WGS) on all cutaneous C. diphtheriae isolates (n = 56) from Vancouver’s inner-city population over a 3-year time period (2015 to 2018). All isolates with complete genome assembly were toxin negative, contained a common set of 22 virulence factors, and shared a highly conserved accessory genome. One of our isolates harbored a novel plasmid conferring macrolide and lincosamide resistance. Fifty-two out of 56 isolates were multilocus sequence type 76, and single nucleotide variant (SNV) and core-genome multilocus sequence typing (cgMLST) analyses demonstrated tight clustering of our isolates relative to all publicly available C. diphtheriae genomes. All sequence type 76 (ST76) study isolates were within a median of 22 SNVs and 13 cgMLST alleles of each other, while NCBI genomes were within a median of 17,436 SNVs and 1,552 cgMLST alleles of each other (both P < 2.2 × 10⁻¹⁶). A single strain of C. diphtheriae appears to be causing cutaneous infections in the low-income population of Vancouver. Further research is needed to elucidate transmission networks in our study population and standardize C. diphtheriae epidemiological typing when whole genomes are sequenced.
INTRODUCTION
Following the introduction and widespread implementation of the diphtheria toxoid vaccine in the 1930s, cases of diphtheria caused by toxigenic strains of Corynebacterium diphtheriae plummeted worldwide. However, despite an estimated 86% global vaccination coverage, C. diphtheriae remains endemic in many countries, and outbreaks continue to be reported (https://www.who.int/immunization/monitoring_surveillance/burden/diphtheria/en/). Since the large 1990s outbreak in countries of the former Soviet Union (1) comprising over 150,000 cases, there has been ongoing transmission in Belarus and Latvia (2, 3); outbreaks have also been frequently reported in South America (Brazil, Venezuela), Asia (Laos), Africa (South Africa), and Europe (Switzerland, Germany) over the past 10 years (4–9).
While most of these outbreaks have been related to the classic respiratory presentation of C. diphtheriae infection, the organism can also cause cutaneous disease. Cutaneous diphtheria is characterized by a chronic, nonhealing ulcer, and is often a source for persistent colonization. Colonization with cutaneous or nasopharyngeal diphtheria has the potential to cause systemic disease and may be an important reservoir for ongoing transmission within a susceptible population (10). Notably, the diphtheria toxoid vaccine does not protect against nontoxigenic strains of the bacteria. The diphtheria toxin is encoded by the tox gene, which is carried by multiple corynebacteriophages that can lysogenize nontoxigenic C. diphtheriae as well as a select few other Corynebacterium species.
Earlier molecular epidemiological analyses of C. diphtheriae relied on multilocus sequence typing (MLST). The MLST scheme for C. diphtheriae comprises 7 housekeeping loci and has defined at least 624 sequence types (https://pubmlst.org/bigsdb?db=pubmlst_cdiphtheriae_seqdef). However, MLST yields limited phylogenetic information, as it interrogates less than 1% of the C. diphtheriae genome. Whole-genome sequencing (WGS) provides significantly greater phylogenetic resolution through single nucleotide variant (SNV) analysis or allelic analysis such as core-genome or whole-genome MLST (cgMLST/wgMLST). WGS also provides the ability to search for virulence factors such as diphtheria toxin and antimicrobial resistance markers. WGS has previously been used to elucidate transmission dynamics of C. diphtheriae (2, 7, 9, 11–15) and has shown that unrelated C. diphtheriae genomes differ by approximately 30,000 SNVs or more, whereas epidemiologically linked outbreak isolates differ by up to 150 SNVs (9, 12). Similarly, unrelated isolates have been found to differ by more than 1,000 alleles when using cgMLST analysis, whereas related isolates have been within 5 alleles (12).
We have previously shown that the major circulating strain of C. diphtheriae in the urban poor population of Vancouver, British Columbia, Canada, is ST76, with 69% (20/29) of cases from 1998 through 2007 categorized as this sequence type (16). All of these isolates were nontoxigenic. Most cases were in people who injected drugs, lived in the Vancouver “Downtown Eastside” neighborhood, and had significant medical comorbidities. Wounds in which C. diphtheriae was identified were commonly polymicrobial and often also harbored Staphylococcus aureus, Streptococcus pyogenes, and Arcanobacterium haemolyticum. Although these cases were epidemiologically linked to a geographic region, we conducted WGS on C. diphtheriae isolated at our hospital from 2015 through 2018 to further understand the local population structure and transmission of C. diphtheriae in our region.
MATERIALS AND METHODS
Sequencing of C. diphtheriae isolates.
Study isolates included confirmed C. diphtheriae from March 2015 to September 2018 at St. Paul’s Hospital (Vancouver, Canada). Only the first isolate from each patient over the study period was included. Identification of nontoxigenic C. diphtheriae (modified Elek test and PCR), as well as WGS, was performed as previously described (17, 18). Briefly, the Kapa HyperPlus kit with Kapa dual-indexed adaptors (Roche Sequencing, Pleasanton, CA) was utilized, and sequencing was performed with the MiSeq Reagent 2 × 300 v3 Kit (Illumina, San Diego, CA). Raw data are available at BioProject accession no. PRJNA563223.
Bioinformatics analysis: genome assembly.
Quality assessment, adapter trimming, quality trimming, paired-end overlapping, and correction were performed with fastp (19). MultiQC was used to combine quality reports (20). Removal of contaminating reads is described in the supplemental methods. Reads were assembled using SPAdes and assessed for completeness using QUAST with the C. diphtheriae reference strain NCTC11397 (21, 22). All visualization tools are also provided in the supplementary methods.
Virulence and plasmid analysis.
Virulence factors and diphtheria toxin were identified with Abricate using the full version of the VFDB database (https://github.com/tseemann/abricate, 23). Reads were assembled with Unicycler to detect plasmids, and resulting circular contigs were BLASTed for homology and annotated with Prokka (24, 25).
Pan-genome analysis.
Assemblies were annotated with Prokka using the reference C. diphtheriae NCTC11397 for proteins, and a pan-genome was created with Roary (25, 26). Association of accessory genes with study inclusion was performed with Scoary using Benjamini-Hochberg correction and a false discovery rate of 0.05 as a threshold (27).
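The Benjamini-Hochberg correction referred to above can be illustrated with a minimal sketch. This is not Scoary's implementation; the function name and the example p-values are hypothetical and serve only to show how adjusted p-values are compared against a 0.05 false discovery rate threshold.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values (illustrative only, not study data).
example_p = [0.001, 0.008, 0.039, 0.041, 0.20]
adjusted_p = benjamini_hochberg(example_p)
# A gene is reported as associated when its adjusted p-value is <= 0.05.
significant = [q <= 0.05 for q in adjusted_p]
```

In this toy input, the third and fourth genes have raw p-values below 0.05 but are not retained after correction, which is the behavior the threshold is meant to enforce.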
MLST and cgMLST analysis.
MLST analysis was performed with the Center for Genomic Epidemiology’s MLST tool 2.0.1 and database version 2.0.0 (28). A core-genome MLST scheme was generated with PGAdb-builder on all C. diphtheriae genomes marked as complete at NCBI Genomes (n = 23), only retaining loci found in 95% or more genomes (29). PGAdb-builder was then used to call alleles on all 226 NCBI C. diphtheriae genomes and our 45 study genomes with complete assembly using default settings. A second cgMLST profile was created with chewBBACA (30), using the reference C. diphtheriae NCTC11397 as training for Prodigal and removing all paralogous loci (31). Alleles were then called with chewBBACA.
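The rule of retaining only loci found in 95% or more of the genomes can be sketched as follows. This is an illustration of the filtering criterion only, not the PGAdb-builder or chewBBACA code; the data layout (a dictionary of allele profiles with `None` for an absent locus), the function name, and the toy genomes are assumptions made for the example.

```python
from typing import Dict, List, Optional

# Hypothetical layout: locus name -> allele identifier, or None when the
# locus was not found in that genome.
Profile = Dict[str, Optional[int]]


def core_loci(profiles: Dict[str, Profile], min_fraction: float = 0.95) -> List[str]:
    """Return the loci present in at least `min_fraction` of the genomes."""
    n_genomes = len(profiles)
    all_loci = sorted({locus for p in profiles.values() for locus in p})
    kept = []
    for locus in all_loci:
        present = sum(1 for p in profiles.values() if p.get(locus) is not None)
        if present / n_genomes >= min_fraction:
            kept.append(locus)
    return kept


# Toy data: 23 genomes (matching the number of complete genomes used to
# build the scheme) and three made-up loci.
genomes = {f"genome_{i}": {"locusA": 1, "locusB": 1, "locusC": 1} for i in range(23)}
genomes["genome_0"]["locusB"] = None  # locusB absent from 1 of 23 genomes
genomes["genome_0"]["locusC"] = None  # locusC absent from 2 of 23 genomes
genomes["genome_1"]["locusC"] = None
retained = core_loci(genomes)
```

With 23 genomes, a locus must be present in at least 22 to pass the 95% threshold, so `locusC` (21 of 23) is dropped while `locusB` (22 of 23) is kept.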
SNV identification and phylogenetic analysis.
Core SNVs were identified with snippy 4.4.0 using C. diphtheriae NCTC11397 as a reference (https://github.com/tseemann/snippy). Recombinant SNVs were removed with Gubbins, and a phylogenetic tree was then generated with RaxML using the GTRGAMMA model and the bootstrap convergence criteria (32, 33). Sample SNV distances were calculated with snp-dists and processed in R (https://github.com/tseemann/snp-dists).
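The pairwise SNV distance computed from a core alignment can be sketched as a count of mismatching positions. This is an illustration rather than the snp-dists implementation; the convention used here (counting only positions where both sequences carry an unambiguous A, C, G, or T) is an assumption of the sketch, and the toy alignment is made up.

```python
from itertools import combinations
from typing import Dict, Tuple

VALID_BASES = set("ACGT")


def snv_distance(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned sequences differ.

    Assumption of this sketch: positions with a gap or ambiguous base in
    either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


def pairwise_snv_distances(alignment: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Return the SNV distance for every unordered pair of sequences."""
    return {
        (x, y): snv_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Toy core alignment (illustrative only, not study data).
toy_alignment = {
    "isolate_1": "ACGTACGTAC",
    "isolate_2": "ACGTACGTAT",
    "isolate_3": "ACGNACCTAT",
}
distances = pairwise_snv_distances(toy_alignment)
```

The `N` in `isolate_3` does not contribute to any distance, so uncalled positions neither inflate nor hide differences at the remaining sites.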
Ethics statement.
Ethics approval was obtained for this study from the University of British Columbia-Providence Health Care Research Ethics Board.
Data Availability.
Raw sequencing data are available in the NCBI Sequence Read Archive and can be accessed with BioProject accession no. PRJNA563223. Our novel plasmid sequence and annotation have been deposited in NCBI GenBank with accession no. MN462701.
RESULTS
Description.
We identified 60 isolates of nontoxigenic C. diphtheriae over the study period from multiple anatomic sites: blood (n = 1), throat (n = 1), and skin (n = 58). Fifty-six cutaneous isolates underwent WGS analysis; the remaining 4 could not be recovered from storage. We were able to successfully assemble genomes for 45 study isolates with a median of 90.1% of the reference C. diphtheriae NCTC11397 genome covered by each of our assemblies (Q1, 90.1%; Q3, 90.1%). An additional 11 isolates had incomplete or fragmented assemblies and were used solely for MLST analysis. These isolates had a median coverage of the reference genome of 86.7% (Q1, 69.0%; Q3, 88.0%) and a greater number of contigs per assembly (median of 530 versus 59 for full assemblies). Low sequencing coverage and contamination (see Supplemental Methods) were likely contributors to these incomplete or fragmented assemblies. Full details of phenotypic resistance testing and analyses for genotypic markers of resistance have already been summarized (18).
Virulence factors, toxins, plasmids, and pan-genome analysis.
All of our isolates (n = 45) were toxin negative. Additionally, all isolates contained 22 known virulence factors including fagA to fagD, sapD, ciuA to ciuE, htaA to htaB, hmuT to hmuV, dtxR, spaA to spaB, srtA, and irp6A to irp6C. The fag, ciu, hmu, and irp gene groups are all involved in iron uptake processes. The sapD and spaA genes encode pilus proteins, the hta genes encode heme uptake systems, srtA encodes a sortase, and dtxR encodes the diphtheria toxin repressor (Table S1). Notably, one isolate, CD30, harbored 11 virulence factors not found in the rest of our bacterial population, including additional sortase genes (srtB to srtE), pilin genes (sapA, spaC, spaE, and spaG to spaI), and heme uptake proteins (htaC) (Table S2).
We next searched our isolate assemblies for plasmids and found a single isolate with a 13,218-base pair plasmid (Fig. 1). This plasmid was successfully circularized by Unicycler and was most similar to C. diphtheriae plasmid pNG2 (RefSeq accession no. NC_005001.1) by BLAST search against the nonredundant GenBank database, covering 54% (8,100 bp) of the reference at an average sequence identity of 93.2%. Annotation of our plasmid revealed 12 coding sequences, including traA, parA, parB, ermX (conferring macrolide and lincosamide resistance), gcrR, and 7 hypothetical proteins. The replicase repA was present; however, its alignment was truncated at base 12,971 of NC_005001.1 (217 bp short out of a total 1,452 bp). Prokka predicted a coding sequence within the RefSeq-annotated repA from bp 12,377 to 12,940 with unknown function. We have deposited the sequence of what we believe to be a novel plasmid to GenBank (accession no. MN462701).
FIG 1.
Novel plasmid reconstruction (right) compared with C. diphtheriae pNG2 plasmid reference sequence (left). Green sections indicate annotated genes on the reference sequence, red bars indicate matching sequences, and ribbons connect the matching sequences by location.
Pan-genome analysis.
We built a pan-genome of all public C. diphtheriae genomes (n = 226), along with our 45 full-length assemblies. The core genome of C. diphtheriae (>95% genomes) comprised 1,514 genes, with a total pan-genome size of 11,930 genes. Our isolates shared a highly similar accessory genome, as depicted in Fig. 2. We calculated the differential presence or absence of accessory genes in our isolates relative to all NCBI genomes. We found 691 genes to be positively associated with our study isolates (i.e., they were overrepresented in our study population) and 1,020 to be negatively associated with our study isolates at a false discovery rate of 5%.
FIG 2.
Pan-genome of all NCBI C. diphtheriae genomes (purple), along with our study isolates (yellow). Presence and absence of core and accessory genes marked with blue bars, and dendrogram clustered by binary presence or absence of genes.
MLST analysis.
In silico traditional MLST typing identified one predominant strain: ST76 (52/56). The remaining sequence types included one ST05, one ST32, one ST319, and one novel ST (most resembling ST441/ST442/ST444). We next created a cgMLST scheme from 23 complete genomes hosted by NCBI using PGAdb-builder. PGAdb-builder identified 1,564 loci common to 95% of these isolates. Using this scheme, we classified our isolates and compared them with 226 draft- or better-quality genomes at NCBI. Excluding the outlier ST441/ST442/ST444 genome within our study, our 44 complete genomes (ST76) were a median of 2 (Q1, 1; Q3, 3) PGAdb cgMLST alleles different from each other. Twenty-seven out of 44 isolates had identical cgMLST profiles with at least 1 other isolate. In contrast, NCBI genomes were a median of 1,323 (Q1, 813; Q3, 1,414) cgMLST alleles different from each other (P < 2.2 × 10⁻¹⁶). We created a minimum spanning tree and show that our isolates cluster tightly (Fig. S1).
Using this scheme, we noticed that a significant number of alleles were called as absent by PGAdb (999/1,564 loci absent in 43 or more study isolates), reflecting significantly divergent alleles (<90% sequence identity) and the tool’s inability to infer new alleles. Repeating the previous analysis using chewBBACA, which has the ability to infer new alleles, identified a core genome of 1,685 loci. Using this approach, the maximum number of loci not found in any study isolate was 6/1,685. Our 44 ST76 isolates were a median of 13 (Q1, 9; Q3, 21) chewBBACA cgMLST alleles different from each other, and none had identical profiles (Fig. 3; Fig. S2 and S3). NCBI genomes were a median of 1,552 (Q1, 1,506; Q3, 1,580) alleles different from each other (P < 2.2 × 10⁻¹⁶).
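One way to see why loci called as absent can shrink allele distances is the following sketch. It assumes that a locus missing from either profile is skipped in the pairwise comparison; this is an assumption made for illustration, since the handling of absent loci by each tool is not described here. The function name, the `count_missing` option, and the toy profiles are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical layout: locus name -> allele identifier, or None when the
# allele caller reported the locus as absent.
Profile = Dict[str, Optional[int]]


def allele_distance(a: Profile, b: Profile, count_missing: bool = False) -> int:
    """Count the loci at which two cgMLST profiles differ.

    By default a locus that is absent (None) in either profile is skipped.
    With `count_missing=True`, a locus called in exactly one of the two
    profiles is counted as a difference.
    """
    differences = 0
    for locus in set(a) | set(b):
        allele_a, allele_b = a.get(locus), b.get(locus)
        if allele_a is None or allele_b is None:
            if count_missing and (allele_a is None) != (allele_b is None):
                differences += 1
            continue
        if allele_a != allele_b:
            differences += 1
    return differences


# Toy profiles (illustrative only, not study data).
profile_x: Profile = {"l1": 1, "l2": 2, "l3": None, "l4": 5}
profile_y: Profile = {"l1": 1, "l2": 3, "l3": 7, "l4": None}
skipping_absent = allele_distance(profile_x, profile_y)
counting_absent = allele_distance(profile_x, profile_y, count_missing=True)
```

When many loci are reported as absent, fewer loci are effectively compared, so the absolute number of allele differences depends on the allele caller as well as on the genomes.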
FIG 3.
chewBBACA minimum spanning tree of all study isolates (blue) and NCBI genomes (purple). Distances are log scaled. Interactive visualization available at https://online.phyloviz.net/main/dataset/share/cff51111a3ca3f87cf9c49c0436ce710fc5ff89bcc5ab13888035190d4194143ff73f5264daef7d10c322fe0e0c35281e23a5b0aaa639b746bd7e9b86a4b706d8f7fe909dcf0a15b5f3964bc04491e0239ae80418e85b8edc85d150fb1460dada1264d96828fb4fd982ae135ee89b25de19cf82ea82e309eae0350b934f4880e6295884ba3bd1289555e0b774c64e9b57620.
Phylogenetic analysis.
We identified all SNVs in our isolates relative to the reference C. diphtheriae NCTC11397 genome. We next created a phylogenetic tree using the SNV information, including both our isolates and all NCBI genomes. Our isolates clustered tightly relative to all public genomes (Fig. 4). Again, excluding the novel ST isolate, our isolates were a median of 22 (Q1, 14; Q3, 33) SNVs apart from each other (Fig. S4). The maximum distance between any two of our ST76 isolates was 124 SNVs. In contrast, NCBI genomes were a median of 17,436 (Q1, 16,075; Q3, 17,886) SNVs apart from each other, which was significantly greater (P < 2.2 × 10⁻¹⁶). Furthermore, many of the SNVs in our ST76 isolates were in recombinant regions: excluding these SNVs resulted in a median of 12 SNVs (Q1, 8; Q3, 15) between isolates, with a maximum of 98 SNVs between any two ST76 isolates.
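The summary statistics reported above (median, Q1, Q3, and maximum of pairwise distances) can be derived from a symmetric distance matrix as in the sketch below. The quartile convention used here (Python's `inclusive` method) is an assumption; the original processing was done in R and the exact method is not stated. The toy matrix is made up.

```python
from statistics import median, quantiles
from typing import Dict, List, Sequence


def upper_triangle(matrix: Sequence[Sequence[int]]) -> List[int]:
    """Return each unordered pair's distance once, skipping the diagonal."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def summarize(matrix: Sequence[Sequence[int]]) -> Dict[str, float]:
    """Summarize pairwise distances as median, quartiles, and maximum."""
    values = upper_triangle(matrix)
    # Assumed quartile convention: linear interpolation, end points included.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {"median": median(values), "q1": q1, "q3": q3, "max": max(values)}


# Toy symmetric SNV distance matrix for four isolates (not study data).
toy_matrix = [
    [0, 10, 14, 22],
    [10, 0, 12, 30],
    [14, 12, 0, 18],
    [22, 30, 18, 0],
]
summary = summarize(toy_matrix)
```

Taking only the upper triangle avoids counting each pair twice and excludes the zero self-distances, both of which would otherwise bias the median downward.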
FIG 4.
Dendrogram of study isolates (red) along with all NCBI C. diphtheriae genomes (black), built off SNVs filtered for recombinant regions.
DISCUSSION
C. diphtheriae is a reemerging pathogen and is an unusually prevalent cause of cutaneous infections in the Downtown Eastside neighborhood of Vancouver, Canada. We conducted WGS on 56 isolates of C. diphtheriae recovered from cutaneous wounds in patients from this neighborhood over a 3-year study period. Interestingly, all of our C. diphtheriae isolates lacked the diphtheria toxin and shared a core set of 22 virulence factors. Pan-genome analysis confirmed and expanded upon these findings, showing a highly conserved accessory genome within our study isolates.
Using traditional MLST, 52/56 study isolates fell within sequence type 76. These findings are in keeping with our previous findings that most isolates in Vancouver are ST76 and nontoxigenic (16, 17). WGS enabled greater phylogenetic resolution and demonstrated small numbers of SNVs (maximum of 98 nonrecombinant SNVs) between our ST76 isolates. Core-genome MLST analysis supported these findings, with a median of 13 cgMLST alleles between ST76 isolates. Previous analyses of C. diphtheriae outbreaks have demonstrated outbreak isolates falling within 150 SNVs of each other (9, 12); all of our ST76 isolates were this closely related. Previous studies have also found epidemiologically linked isolates to be within 5 cgMLST alleles using the proprietary Ridom SeqSphere+ software (12). Using PGAdb-builder, which cannot infer new alleles, most of our isolates fell within this threshold. However, when we used chewBBACA, an open-source cgMLST/wgMLST scheme builder and allele caller, we found a greater number of allele differences. This difference suggests that interpreting the absolute number of cgMLST allele differences to determine relatedness is inexact without a public, standardized scheme, along with software to build and use this scheme. However, the distribution of cgMLST allele differences within our isolates compared with that of publicly available C. diphtheriae genomes, which are a proxy for global sequence diversity, demonstrates that our isolates are far more closely related than would be expected from a diverse population. Combining the SNV, traditional MLST, virulence factor, and pan-genome analyses, we conclude that the circulating C. diphtheriae in Vancouver largely reflects a single molecular clone.
Our study is novel in that it is the first study, to our knowledge, to perform whole-genome sequencing on a large collection of C. diphtheriae isolates collected in North America. Previous groups have sequenced collections of C. diphtheriae from Germany, Belarus, South Africa, and Australia (2, 7, 9, 11–15). Interestingly, our dominant sequence type, ST76, has only been reported in PubMLST.org, an MLST database, in 9 isolates from Saint Petersburg, Russia, spanning the years 2005 to 2010. These isolates were not characterized by WGS, were mainly isolated from carriers, and were all toxin negative by Elek test (https://pubmlst.org/bigsdb?db=pubmlst_cdiphtheriae_isolates&page=profiles). Using WGS, cgMLST, and SNV analysis, we found that our isolates clustered distantly relative to almost all publicly available C. diphtheriae genomes, with the exception of tight clustering with 5 isolates from Belarus isolated between 2004 and 2014 (2). These isolates were also ST76 and nontoxigenic, and at least 3 were isolated from the throat. We are unaware of any epidemiological links between our patients and those Belarusian/Russian isolates. Our study has also identified a novel C. diphtheriae plasmid conferring macrolide and lincosamide resistance; the plasmid sequence has been deposited in GenBank for public scrutiny (accession no. MN462701).
The major limitation of our study is a lack of epidemiological data to support transmission networks within our study population. For example, we were unable to identify how our novel ST isolate was acquired, as it differed significantly from all of the other study isolates. Further research should focus on identifying epidemiologically supported transmission networks within our inner-city population. More advanced modeling, such as that based on the mutation rate of C. diphtheriae, may help identify the timing of transmission between patients; however, a better understanding of the C. diphtheriae mutation rate will need to be attained before this model can be accurate (9).
In conclusion, we used whole-genome sequencing to identify a single molecular strain of C. diphtheriae circulating in the urban poor population of Vancouver, British Columbia, Canada, between 2015 and 2018. Further research will be needed to standardize C. diphtheriae typing using whole-genome sequencing on a global scale and use that information to better understand transmission dynamics within local populations.
ACKNOWLEDGMENTS
We thank the St. Paul’s Hospital microbiology and virology laboratory staff for their commitment to patient care and for identifying the isolates included in this study. We also thank the Canadian National Microbiology Laboratory for initial toxin testing of the bacterial isolates.
We declare no conflicts of interest.
Footnotes
Supplemental material is available online only.
REFERENCES
- 1. Dittmann S, Wharton M, Vitek C, Ciotti M, Galazka A, Guichard S, Hardy I, Kartoglu U, Koyama S, Kreysler J, Martin B, Mercer D, Rønne T, Roure C, Steinglass R, Strebel P, Sutter R, Trostle M. 2000. Successful control of epidemic diphtheria in the states of the former Union of Soviet Socialist Republics: lessons learned. J Infect Dis 181:S10–S22. doi: 10.1086/315534.
- 2. Grosse-Kock S, Kolodkina V, Schwalbe EC, Blom J, Burkovski A, Hoskisson PA, Brisse S, Smith D, Sutcliffe IC, Titov L, Sangal V. 2017. Genomic analysis of endemic clones of toxigenic and non-toxigenic Corynebacterium diphtheriae in Belarus during and after the major epidemic in 1990s. BMC Genomics 18:873. doi: 10.1186/s12864-017-4276-3.
- 3. Kantsone I, Lucenko I, Perevoscikovs J. 2016. More than 20 years after re-emerging in the 1990s, diphtheria remains a public health problem in Latvia. Euro Surveill 21:30414. doi: 10.2807/1560-7917.ES.2016.21.48.30414.
- 4. Santos LS, Sant'anna LO, Ramos JN, Ladeira EM, Stavracakis-Peixoto R, Borges LL, Santos CS, Napoleão F, Camello TC, Pereira GA, Hirata R, Vieira VV, Cosme LM, Sabbadini PS, Mattos-Guaraldi AL. 2014. Diphtheria outbreak in Maranhão, Brazil: microbiological, clinical and epidemiological aspects. Epidemiol Infect 143:1–8. doi: 10.1017/S0950268814001241.
- 5. Lodeiro-Colatosti A, Reischl U, Holzmann T, Hernández-Pereira CE, Rísquez A, Paniz-Mondolfi AE. 2018. Diphtheria outbreak in Amerindian communities, Wonken, Venezuela, 2016–2017. Emerg Infect Dis 24:1340–1344. doi: 10.3201/eid2407.171712.
- 6. Nanthavong N, Black AP, Nouanthong P, Souvannaso C, Vilivong K, Muller CP, Goossens S, Quet F, Buisson Y. 2015. Diphtheria in Lao PDR: insufficient coverage or ineffective vaccine? PLoS One 10:e0121749. doi: 10.1371/journal.pone.0121749.
- 7. Du Plessis M, Wolter N, Allam M, de Gouveia L, Moosa F, Ntshoe G, Blumberg L, Cohen C, Smith M, Mutevedzi P, Thomas J, Horne V, Moodley P, Archary M, Mahabeer Y, Mahomed S, Kuhn W, Mlisana K, McCarthy K, von Gottberg A. 2017. Molecular characterization of Corynebacterium diphtheriae outbreak isolates, South Africa, March–June 2015. Emerg Infect Dis 23:1308–1315. doi: 10.3201/eid2308.162039.
- 8. Mahomed S, Archary M, Mutevedzi P, Mahabeer Y, Govender P, Ntshoe G, Kuhn W, Thomas J, Olowolagba A, Blumberg L, Mccarthy K, Mlisana K, Plessis MD, Gottberg AV, Moodley P. 2017. An isolated outbreak of diphtheria in South Africa, 2015. Epidemiol Infect 145:2100–2108. doi: 10.1017/S0950268817000851.
- 9. Meinel DM, Kuehl R, Zbinden R, Boskova V, Garzoni C, Fadini D, Dolina M, Blümel B, Weibel T, Tschudin-Sutter S, Widmer AF, Bielicki JA, Dierig A, Heininger U, Konrad R, Berger A, Hinic V, Goldenberger D, Blaich A, Stadler T, Battegay M, Sing A, Egli A. 2016. Outbreak investigation for toxigenic Corynebacterium diphtheriae wound infections in refugees from Northeast Africa and Syria in Switzerland and Germany by whole genome sequencing. Clin Microbiol Infect 22:1003.e1–1003.e8. doi: 10.1016/j.cmi.2016.08.010.
- 10. Murphy JR. 1996. Corynebacterium diphtheriae. In Baron S (ed), Medical Microbiology, 4th ed. University of Texas Medical Branch at Galveston, Galveston, TX.
- 11. Berger A, Dangel A, Schober T, Schmidbauer B, Konrad R, Marosevic D, Schubert S, Hörmansdorfer S, Ackermann N, Hübner J, Sing A. 2019. Whole genome sequencing suggests transmission of Corynebacterium diphtheriae-caused cutaneous diphtheria in two siblings, Germany, 2018. Euro Surveill 24:1800683. doi: 10.2807/1560-7917.ES.2019.24.2.1800683.
- 12. Dangel A, Berger A, Konrad R, Bischoff H, Sing A. 2018. Geographically diverse clusters of nontoxigenic Corynebacterium diphtheriae infection, Germany, 2016–2017. Emerg Infect Dis 24:1239–1245. doi: 10.3201/eid2407.172026.
- 13. Pivot D, Fanton A, Badell-Ocando E, Benouachkou M, Astruc K, Huet F, Amoureux L, Neuwirth C, Criscuolo A, Aho S, Toubiana J, Brisse S. 2019. Carriage of a single strain of nontoxigenic Corynebacterium diphtheriae bv. Belfanti (Corynebacterium belfantii) in four patients with cystic fibrosis. J Clin Microbiol 57:e00042–e00119. doi: 10.1128/JCM.00042-19.
- 14. Doyle CJ, Mazins A, Graham RMA, Fang N-X, Smith HV, Jennison AV. 2017. Sequence analysis of toxin gene–bearing Corynebacterium diphtheriae strains, Australia. Emerg Infect Dis 23:105–107. doi: 10.3201/eid2301.160584.
- 15. Timms VJ, Nguyen T, Crighton T, Yuen M, Sintchenko V. 2018. Genome-wide comparison of Corynebacterium diphtheriae isolates from Australia identifies differences in the pan-genomes between respiratory and cutaneous strains. BMC Genomics 19:869. doi: 10.1186/s12864-018-5147-2.
- 16. Lowe CF, Bernard KA, Romney MG. 2011. Cutaneous diphtheria in the urban poor population of Vancouver, British Columbia, Canada: a 10-year review. J Clin Microbiol 49:2664–2666. doi: 10.1128/JCM.00362-11.
- 17. Romney MG, Roscoe DL, Bernard K, Lai S, Efstratiou A, Clarke AM. 2006. Emergence of an invasive clone of nontoxigenic Corynebacterium diphtheriae in the urban poor population of Vancouver, Canada. J Clin Microbiol 44:1625–1629. doi: 10.1128/JCM.44.5.1625-1629.2006.
- 18. Zou J, Chorlton S, Romney M, Payne M, Lawson T, Wong A, Champagne S, Ritchie G, Lowe C. Phenotypic and genotypic correlates of penicillin susceptibility in non-toxigenic Corynebacterium diphtheriae, British Columbia, 2015–2018. Emerg Infect Dis, in press.
- 19. Chen S, Zhou Y, Chen Y, Gu J. 2018. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34:i884–i890. doi: 10.1093/bioinformatics/bty560.
- 20. Ewels P, Magnusson M, Lundin S, Käller M. 2016. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32:3047–3048. doi: 10.1093/bioinformatics/btw354.
- 21. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455–477. doi: 10.1089/cmb.2012.0021.
- 22. Gurevich A, Saveliev V, Vyahhi N, Tesler G. 2013. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29:1072–1075. doi: 10.1093/bioinformatics/btt086.
- 23. Liu B, Zheng D, Jin Q, Chen L, Yang J. 2019. VFDB 2019: a comparative pathogenomic platform with an interactive web interface. Nucleic Acids Res 47:D687–D692. doi: 10.1093/nar/gky1080.
- 24. Wick RR, Judd LM, Gorrie CL, Holt KE. 2017. Unicycler: resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol 13:e1005595. doi: 10.1371/journal.pcbi.1005595.
- 25. Seemann T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30:2068–2069. doi: 10.1093/bioinformatics/btu153.
- 26. Page AJ, Cummins CA, Hunt M, Wong VK, Reuter S, Holden MTG, Fookes M, Falush D, Keane JA, Parkhill J. 2015. Roary: rapid large-scale prokaryote pan genome analysis. Bioinformatics 31:3691–3693. doi: 10.1093/bioinformatics/btv421.
- 27. Brynildsrud O, Bohlin J, Scheffer L, Eldholm V. 2016. Rapid scoring of genes in microbial pan-genome-wide association studies with Scoary. Genome Biol 17:238. doi: 10.1186/s13059-016-1108-8.
- 28. Larsen MV, Cosentino S, Rasmussen S, Friis C, Hasman H, Marvig RL, Jelsbak L, Sicheritz-Pontén T, Ussery DW, Aarestrup FM, Lund O. 2012. Multilocus sequence typing of total-genome-sequenced bacteria. J Clin Microbiol 50:1355–1361. doi: 10.1128/JCM.06094-11.
- 29. Liu Y-Y, Chiou C-S, Chen C-C. 2016. PGAdb-builder: a web service tool for creating pan-genome allele database for molecular fine typing. Sci Rep 6:36213. doi: 10.1038/srep36213.
- 30. Silva M, Machado MP, Silva DN, Rossi M, Moran-Gilad J, Santos S, Ramirez M, Carriço JA. 2018. chewBBACA: a complete suite for gene-by-gene schema creation and strain identification. Microb Genom 4. doi: 10.1099/mgen.0.000166.
- 31. Hyatt D, Chen G-L, LoCascio PF, Land ML, Larimer FW, Hauser LJ. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11:119. doi: 10.1186/1471-2105-11-119.
- 32. Croucher NJ, Page AJ, Connor TR, Delaney AJ, Keane JA, Bentley SD, Parkhill J, Harris SR. 2015. Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins. Nucleic Acids Res 43:e15. doi: 10.1093/nar/gku1196.
- 33. Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312–1313. doi: 10.1093/bioinformatics/btu033.