Abstract
The genus Yersinia includes three human pathogens, of which Yersinia pestis is responsible for >2,000 illnesses each year. To support the development of detection assays and further phylogenetic analysis, we sequenced and assembled the complete genomes of 32 strains representing nine Yersinia species.
GENOME ANNOUNCEMENT
The genus Yersinia contains 11 species, three of which are human pathogens: Y. pestis, Y. pseudotuberculosis, and Y. enterocolitica. Of these, Y. pestis is the most virulent; it causes >2,000 cases of plague worldwide each year and has been responsible for three global pandemics (1, 2). Y. pestis is a category A pathogen and potential biowarfare agent (3, 4), while Y. pseudotuberculosis and Y. enterocolitica cause food-borne, self-limiting enteric diseases with low mortality rates (5). The Association of Analytical Communities (AOAC) International recently released a list of strains to be considered in diagnostic assay development, including strains that the assays should recognize (inclusivity) and strains that they should ignore (exclusivity) (6). Here, we present the completed genome assemblies for 32 (see Table 1) of the 33 listed Yersinia strains (YPNN7, Y. pseudotuberculosis IB, was not included due to technical issues).
TABLE 1.
Strain name | AOAC | Panel^a | Accession no. | Size (Mb) | No. of pPCP | No. of pCD/pYV | No. of pMT | No. of other plasmids |
---|---|---|---|---|---|---|---|---|
Y. aldovae | ||||||||
670-83 | YPNN17 | E | CP009781 | 4.47 | ||||
Y. enterocolitica | ||||||||
2516-87 | YPNN13 | E | CP009837, CP009838 | 4.60 | 1 | |||
8081 | YPNN12 | E | CP009845, CP009846 | 4.68 | 1 | |||
WA | YPNN11 | E | CP009366, CP009367 | 4.61 | 1 | |||
Y. frederiksenii | ||||||||
Y225^b | YPNN15 | E | CP009363, CP009364 | 4.55 | 1 | |||
Y. intermedia | ||||||||
Y228 | YPNN16 | E | CP009801 | 4.85 | ||||
Y. kristensenii | ||||||||
Y231 | YPNN14 | E | CP009997 | 4.49 | ||||
Y. pestis | ||||||||
A1122 | YP12 | I | CP009839–CP009841 | 4.67 | 1 | 1 | ||
Angola | YP7 | I | CP009934–CP009937 | 4.67 | 1 | 1 | 1 | |
Antiqua | YP3 | I | CP009903–CP009906 | 4.88 | 1 | 1 | 1 | |
CO92 pgm- | YP1 | I | CP009971–CP009973 | 4.72 | 1 | 1 | ||
Dodson | YP15 | I | CP009842–CP009844 | 4.77 | 1 | 1 | ||
El Dorado | YP16 | I | CP009782–CP009785 | 4.83 | 1 | 1 | 1 | |
Harbin35 | YP9 | I | CP009701–CP009704 | 4.70 | 1 | 1 | 1 | |
Java9^c | YP11 | I | CP009992–CP009996 | 4.82 | 1 | 1 | 2 | |
KIM5 | YP2 | I | CP009833–CP009836 | 4.78 | 1 | 1 | 1 | |
Nairobi | YP8 | I | CP010293, CP010294 | 4.47 | 1 | |||
Nicholisk 41 | YP13 | I | CP009988–CP009991 | 4.70 | 1 | 1 | 1 | |
PBM19 | YP10 | I | CP009489–CP009492 | 4.86 | 1 | 1 | 1 | |
Pestoides B | YP4 | I | CP010020–CP010023 | 4.79 | 1 | 1 | 1 | |
Pestoides F | YP5 | I | CP009713–CP009715 | 4.72 | 1 | 1 | ||
Pestoides G | YP6 | I | CP010246–CP010248 | 4.73 | 1 | 1 | ||
Shasta | YP14 | I | CP009721–CP009724 | 4.83 | 1 | 1 | 1 | |
Y. pseudotuberculosis | ||||||||
1 | YPNN10 | E | CP009786 | 4.72 | ||||
EP2/+ | YPNN8 | E | CP009758, CP009759 | 4.77 | 1 | |||
IP32953 | YPNN4 | E | CP009710–CP009712 | 4.83 | 1 | 1 | ||
MD67 | YPNN9 | E | CP009757 | 4.72 | ||||
Pa3606 | YPNN6 | E | CP010067–CP010069 | 4.83 | 1 | 1 | ||
PB1/+ | YPNN3 | E | CP009779, CP009780 | 4.76 | 1 | |||
YPIII | YPNN5 | E | CP009792 | 4.68 | ||||
Y. rohdei | ||||||||
ATCC 43380 | YPNN2 | E | CP009787 | 4.37 | ||||
Y. ruckeri | ||||||||
YRB | YPNN1 | E | CP009539 | 3.60 |
^a Refers to the AOAC listing (6) of either inclusivity (I) or exclusivity (E) strains.
^b The plasmid in Y. frederiksenii is cryptic.
^c The two plasmids listed as “other” for Y. pestis Java9 are pJARS35 and pJARS36.
Each genome was assembled using at least two data sets (specific data types and coverages are listed in the NCBI records), from Illumina (short- and/or long-insert paired data), Roche 454 (long-insert paired data), and/or PacBio long reads. The short- and long-insert paired data were assembled together in both Newbler and Velvet, and the resulting assemblies were computationally shredded into 1.5-kbp overlapping shreds. If the PacBio coverage was ≥100×, the PacBio data were assembled using the PacBio Hierarchical Genome Assembly Process (HGAP) (7). All data were additionally assembled in AllPaths (8). The consensus sequences from both HGAP and AllPaths were computationally shredded into 10-kbp overlapping pieces. All shreds were integrated using Phrap. Possible misassemblies were corrected and repeat regions verified using in-house scripts and manual editing in Consed (9–11). All genomes were assembled to finished-quality completion (12), and each assembly was annotated using an Ergatis-based (13) workflow, with minor manual curation.
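The shredding step described above can be illustrated with a minimal Python sketch. It is not the authors' in-house tooling: the function name `shred` and the 50% overlap used in the usage note are assumptions, since the text gives the shred lengths (1.5 kbp and 10 kbp) but not the overlap.

```python
def shred(sequence, shred_len, overlap):
    """Cut a consensus sequence into overlapping pieces.

    Returns a list of (start, piece) tuples. The overlap is a free
    parameter here; the source text does not state the value used.
    """
    if shred_len <= 0 or not 0 <= overlap < shred_len:
        raise ValueError("need shred_len > 0 and 0 <= overlap < shred_len")
    if not sequence:
        return []

    step = shred_len - overlap  # distance between consecutive shred starts
    pieces = []
    start = 0
    while True:
        pieces.append((start, sequence[start:start + shred_len]))
        if start + shred_len >= len(sequence):
            break  # this shred already reaches the end of the sequence
        start += step
    return pieces
```

For example, `shred(consensus, 10_000, 5_000)` would cut a hypothetical `consensus` string into 10-kbp pieces that each share 5 kbp with their neighbor; the final piece may be shorter than `shred_len`.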
The genomes average 4.68 ± 0.04 Mb in size (Table 1; the smallest is Yersinia ruckeri YRB, at 3.6 Mb, and the largest is Y. pestis Antiqua, at 4.9 Mb) and carry up to 4 plasmids (average, 1.6 ± 0.2). Each genome contains 3,161 to 4,419 coding sequences (average, 4,155 ± 39.9) and has a G+C content of 47 to 48%. Because many of the virulence genes are located on plasmids, it is notable that of the 16 Y. pestis strains, only 9 carry all three “traditional” plasmids (pYV/pCD1 [virulence/calcium dependence], pPCP [plasminogen activator], and pMT [murine toxin]), and one strain (Y. pestis Nairobi) carries the pPCP plasmid only.
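The summary statistics for genome size can be approximately recomputed from the rounded sizes in Table 1, as in the sketch below. The text does not say whether the ± values are standard deviations or standard errors; this sketch assumes the standard error of the mean, which is an assumption and not something the source confirms. Because the table values are rounded to two decimals, the recomputed mean may differ slightly from the reported 4.68 Mb.

```python
from math import sqrt
from statistics import mean, stdev

# Genome sizes (Mb) transcribed from Table 1, in table order.
SIZES_MB = [
    4.47, 4.60, 4.68, 4.61, 4.55, 4.85, 4.49,              # non-pestis, rows 1-7
    4.67, 4.67, 4.88, 4.72, 4.77, 4.83, 4.70, 4.82, 4.78,  # Y. pestis
    4.47, 4.70, 4.86, 4.79, 4.72, 4.73, 4.83,              # Y. pestis (cont.)
    4.72, 4.77, 4.83, 4.72, 4.83, 4.76, 4.68,              # Y. pseudotuberculosis
    4.37, 3.60,                                            # Y. rohdei, Y. ruckeri
]


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for a list of numbers."""
    return mean(values), stdev(values) / sqrt(len(values))


if __name__ == "__main__":
    avg, sem = mean_and_sem(SIZES_MB)
    print(f"n={len(SIZES_MB)} mean={avg:.2f} Mb SEM={sem:.2f} Mb")
    print(f"min={min(SIZES_MB):.2f} Mb max={max(SIZES_MB):.2f} Mb")
```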
Nucleotide sequence accession numbers.
The GenBank accession numbers for all 32 genomes are listed in Table 1.
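Table 1 lists accessions either singly, as comma-separated pairs, or as en-dash ranges (for example, CP009934–CP009937). A reader who wants one identifier per replicon could expand those cells with a small helper like the one below. The function name `expand_accessions` is hypothetical, and the sketch assumes that each range covers consecutive accession numbers sharing the same letter prefix.

```python
import re

_ACCESSION = re.compile(r"([A-Z]+)(\d+)")


def expand_accessions(cell):
    """Expand a Table 1 accession cell into a list of single accessions.

    Accepts forms such as "CP009781", "CP009837, CP009838", and
    "CP009934–CP009937" (en dash or hyphen, optional surrounding spaces).
    """
    accessions = []
    for part in cell.split(","):
        part = part.strip()
        if not part:
            continue
        ends = [p.strip() for p in re.split(r"[–-]", part)]
        if len(ends) == 1:
            accessions.append(ends[0])
            continue
        if len(ends) != 2:
            raise ValueError(f"unrecognized accession range: {part!r}")
        first = _ACCESSION.fullmatch(ends[0])
        last = _ACCESSION.fullmatch(ends[1])
        if not first or not last or first.group(1) != last.group(1):
            raise ValueError(f"unrecognized accession range: {part!r}")
        prefix, width = first.group(1), len(first.group(2))
        # Assumes the range is contiguous, keeping the zero-padded width.
        for number in range(int(first.group(2)), int(last.group(2)) + 1):
            accessions.append(f"{prefix}{number:0{width}d}")
    return accessions
```

For instance, the Y. pestis Angola cell expands to four identifiers, which is consistent with one chromosome plus the three plasmids counted for that strain in Table 1.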
ACKNOWLEDGMENTS
Funding for this effort was provided by the Defense Threat Reduction Agency’s Joint Science and Technology Office (DTRA J9-CB/JSTO) and the Department of Homeland Security Science and Technology Directorate award HSHQDC-08-X-00790.
This paper is approved by LANL for unlimited release (LA-UR-14-29606).
The views expressed in this article are those of the authors and do not necessarily reflect the official policy or position of the Department of the Navy, Department of Defense, or the United States Government.
The bacterial strains were obtained from the Department of Defense’s Unified Culture Collection (http://www.usamriid.army.mil/ucc/).
Footnotes
Citation: Johnson SL, Daligault HE, Davenport KW, Jaissle J, Frey KG, Ladner JT, Broomall SM, Bishop-Lilly KA, Bruce DC, Coyne SR, Gibbons HS, Lo C-C, Munk AC, Rosenzweig CN, Koroleva GI, Palacios GF, Redden CL, Xu Y, Minogue TD, Chain PS. 2015. Thirty-two complete genome assemblies of nine Yersinia species, including Y. pestis, Y. pseudotuberculosis, and Y. enterocolitica. Genome Announc 3(2):e00148-15. doi:10.1128/genomeA.00148-15.
REFERENCES
- 1. Perry RD, Fetherston JD. 1997. Yersinia pestis—etiologic agent of plague. Clin Microbiol Rev 10:35–66.
- 2. World Health Organization (WHO). 2014. Plague. Fact sheet no. 267. World Health Organization, Geneva, Switzerland. http://www.who.int/mediacentre/factsheets/fs267/en/index.htm.
- 3. Inglesby TV, Dennis DT, Henderson DA, Bartlett JG, Ascher MS, Eitzen E, Fine AD, Friedlander AM, Hauer J, Koerner JF, Layton M, McDade J, Osterholm MT, O’Toole T, Parker G, Perl TM, Russell PK, Schoch-Spana M, Tonat K, Working Group on Civilian Biodefense. 2000. Plague as a biological weapon: medical and public health management. JAMA 283:2281–2290. doi: 10.1001/jama.283.17.2281.
- 4. Riedel S. 2005. Plague: from natural disease to bioterrorism. Proc (Bayl Univ Med Cent) 18:116–124.
- 5. Huang X-Z, Nikolich MP, Lindler LE. 2006. Current trends in plague research: from genomics to virulence. Clin Med Res 4:189–199. doi: 10.3121/cmr.4.3.189.
- 6. AOAC International. 2011. AOAC SMPR 2010.002: standard method performance requirements for polymerase chain reaction (PCR) methods for detection of Yersinia pestis in aerosol collection filters and/or liquids. J AOAC Int 94:1342–1346.
- 7. Chin C-S, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C, Clum A, Copeland A, Huddleston J, Eichler EE, Turner SW, Korlach J. 2013. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat Methods 10:563–569. doi: 10.1038/nmeth.2474.
- 8. Butler J, MacCallum I, Kleber M, Shlyakhter IA, Belmonte MK, Lander ES, Nusbaum C, Jaffe DB. 2008. AllPaths: de novo assembly of whole-genome shotgun microreads. Genome Res 18:810–820. doi: 10.1101/gr.7337908.
- 9. Ewing B, Green P. 1998. Base-calling of automated sequencer traces using Phred. II. Error probabilities. Genome Res 8:186–194.
- 10. Ewing B, Hillier L, Wendl MC, Green P. 1998. Base-calling of automated sequencer traces using Phred. I. Accuracy assessment. Genome Res 8:175–185. doi: 10.1101/gr.8.3.175.
- 11. Gordon D, Abajian C, Green P. 1998. Consed: a graphical tool for sequence finishing. Genome Res 8:195–202. doi: 10.1101/gr.8.3.195.
- 12. Chain PS, Grafham DV, Fulton RS, FitzGerald MG, Hostetler J, Muzny D, Ali J, Birren B, Bruce DC, Buhay C, Cole JR, Ding Y, Dugan S, Field D, Garrity GM, Gibbs R, Graves T, Han CS, Harrison SH, Highlander S, Hugenholtz P, Khouri HM, Kodira CD, Kolker E, Kyrpides NC, Lang D, Lapidus A, Malfatti SA, Markowitz V, Metha T, Nelson KE, Parkhill J, Pitluck S, Qin X, Read TD, Schmutz J, Sozhamannan S, Sterk P, Strausberg RL, Sutton G, Thomson NR, Tiedje JM, Weinstock G, Wollam A, Genomic Standards Consortium Human Microbiome Project Jumpstart Consortium, Detter JC. 2009. Genomics. Genome project standards in a new era of sequencing. Science 326:236–237. doi: 10.1126/science.1180614.
- 13. Hemmerich C, Buechlein A, Podicheti R, Revanna KV, Dong Q. 2010. An Ergatis-based prokaryotic genome annotation Web server. Bioinformatics 26:1122–1124. doi: 10.1093/bioinformatics/btq090.