Complete Genome Sequence of Erwinia amylovora Strain 99east-3-1, Isolated from Pyrus sinkiangensis in China

Nuoya Fei; Yuwen Yang; Bo Song; Xiaofeng Zhu; Wei Guan; Tingchang Zhao

Microbiol Resour Announc. 2023 Jun 14;12(7):e00161-23. doi: 10.1128/mra.00161-23

Editor: David A. Baltrus

PMCID: PMC10353370 PMID: 37314334

ABSTRACT

Here, we report the complete genome sequence of Erwinia amylovora strain 99east-3-1, which was isolated from Pyrus sinkiangensis in Xinjiang Uygur Autonomous Region, China.

ANNOUNCEMENT

Fire blight, caused by Erwinia amylovora, is a destructive bacterial disease that seriously endangers the production of rosaceous plants (1). However, genome resources for E. amylovora from China are rare. Here, we report a complete genome sequence of E. amylovora (strain 99east-3-1) from China. A fire blight sample from Pyrus sinkiangensis was collected on 16 July 2021 in Bayingolin Mongolian Autonomous Prefecture (Xinjiang Uygur Autonomous Region; 41°40′24″N, 86°5′30″E). The infected branch tissue was cut, sterilized, and ground. After the suspension was left to settle, the supernatant was streaked on a nutrient agar (NA) plate and incubated at 28°C for 36 h. Single colonies were selected and identified by sequencing 16S rRNA gene fragments amplified with primers 27F/1492R (see “Data availability”). Strain 99east-3-1 was identified as E. amylovora and chosen for DNA extraction.

A single colony was grown in nutrient broth for 24 h, and 15 mL of the bacterial suspension was centrifuged. Genomic DNA was extracted using the SDS method (2). DNA quality was assessed using a NanoDrop One spectrophotometer (Wilmington, DE), a Qubit 3.0 fluorimeter (Life Technologies, Carlsbad, CA, USA), and gel electrophoresis, and the DNA was then submitted to Tsingke Biotechnology (Beijing, China) for sequencing. For Nanopore sequencing, DNA libraries were prepared using a single library preparation method with the SQK-LSK109, EXP-NBD104, and EXP-NBD114 kits (Oxford Nanopore Technologies, Oxford, UK). A Qubit 2.0 fluorimeter was used for DNA quantification and library dilution. The library was sequenced on a PromethION sequencer (R9.4 chip; Oxford Nanopore Technologies). Library insert fragment sizes were checked using the Agilent 2100 Bioanalyzer system. The Nanopore long reads were subjected to quality control using NanoPlot v1.15.0 (3) with a quality (Q) threshold of >7, yielding 54,987 high-quality reads with an average read length of 18,186.4 bp and an N₅₀ value of 20,514 bp. For Illumina sequencing, DNA libraries were prepared using a single library preparation method based on the Illumina Nextera kit and sequenced on the NovaSeq 6000 platform (4–9). We obtained 8,114,076 paired-end reads (2 × 150 bp), corresponding to 1.22 Gb of clean data (Q20, 97.25%; Q30, 92.27%). Unicycler v0.4.9 (10) was used to assemble the clean reads. Pilon v1.23 (11) was then used to polish the assembly with the Illumina (second-generation) reads to improve the accuracy of the final genome. The average sequencing depth was 308.54× for Illumina sequencing and approximately 257.82× for Nanopore sequencing.
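The read statistics quoted above (average read length and N₅₀) can be illustrated with a short Python sketch. This is not NanoPlot's implementation; the function names and the toy read lengths are assumptions made only for illustration, and the N₅₀ definition used is the common one: the largest length L such that reads of length L or longer contain at least half of all sequenced bases.

```python
def mean_length(lengths):
    """Arithmetic mean of a list of read lengths (in bp)."""
    if not lengths:
        raise ValueError("no read lengths given")
    return sum(lengths) / len(lengths)


def n50(lengths):
    """Largest length L such that reads >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("no read lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest read downward, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy data, not the reads from this study.
toy_reads = [2, 3, 4, 5, 6]
toy_mean = mean_length(toy_reads)
toy_n50 = n50(toy_reads)
```

With the toy list, the total is 20 bases; the two longest reads (6 and 5) already cover 11 of them, so the N₅₀ is 5 while the mean is 4.0. An N₅₀ above the mean, as reported for the Nanopore reads here, is the usual pattern because long reads contribute disproportionately to the base total.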

The complete genome of 99east-3-1 consists of a circular chromosome of 3,799,623 bp with a GC content of 53.62% and a plasmid of 28,283 bp with a GC content of 50.21%. The whole-genome sequence (WGS) was annotated using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) (12) (Table 1). The genome of 99east-3-1 shares high sequence identity with the genomes of the representative E. amylovora strains ATCC 49946 (GenBank accession number NC_013971.1) and CFBP1430 (NC_013961.1), with an average nucleotide identity (ANI) of 99.98% (13). The ANI genome comparisons were performed using EzBioCloud (https://www.ezbiocloud.net/tools/ani). Default parameters were used for all software.
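As a rough cross-check of the reported sequencing depths, a nominal depth can be estimated as total sequenced bases divided by total genome size. The sketch below uses only figures stated in this announcement; the variable and function names are illustrative, and treating every Illumina read as a full 150 bp is an assumption that slightly overestimates the clean data.

```python
# Replicon sizes reported in this announcement (bp).
CHROMOSOME_BP = 3_799_623
PLASMID_BP = 28_283
GENOME_BP = CHROMOSOME_BP + PLASMID_BP


def nominal_depth(total_bases, genome_bp=GENOME_BP):
    """Nominal depth: sequenced bases divided by genome size."""
    return total_bases / genome_bp


# Nanopore: number of reads times the reported average read length.
nanopore_bases = 54_987 * 18_186.4
# Illumina: number of reads times 150 bp (assumes untrimmed full-length reads).
illumina_bases = 8_114_076 * 150

nanopore_depth = nominal_depth(nanopore_bases)
illumina_depth = nominal_depth(illumina_bases)

print(f"Nanopore nominal depth: {nanopore_depth:.1f}x")
print(f"Illumina nominal depth: {illumina_depth:.1f}x")
```

These nominal values (roughly 261× and 318×) fall within a few percent of the reported 257.82× and 308.54× but do not reproduce them exactly. The announcement does not state how depth was calculated; a mapping-based calculation, which counts only aligned bases, would be one plausible source of the difference.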

TABLE 1.

Genome annotation statistics of 99east-3-1

Characteristic	No.^a
Genes	3,478
CDSs^b	3,361
rRNAs (5S, 16S, 23S)	8, 7, 7
Complete rRNAs (5S, 16S, 23S)	8, 7, 7
tRNAs	76
Pseudogenes	88
CRISPR arrays	2

^a Determined using PGAP.

^b CDSs, coding DNA sequences.

Data availability.

The whole-genome sequencing project of 99east-3-1 has been deposited at NCBI GenBank under the accession number NZ_CP117554.1, BioProject accession number PRJNA893880, and BioSample accession number SAMN31435602. The 16S rRNA gene fragments have been deposited under the GenBank accession number OQ851893. The raw sequences have been deposited in the SRA under the accession number SRR22031458.

ACKNOWLEDGEMENT

We thank xjkcpy-2020006 and CAAS-ASTIP for supporting this work.

Contributor Information

Yuwen Yang, Email: yangyuwen@caas.cn.

Tingchang Zhao, Email: tczhao@ippcaas.cn.

David A. Baltrus, University of Arizona

REFERENCES

1. Eastgate JA. 2000. Erwinia amylovora: the molecular basis of fireblight disease. Mol Plant Pathol 1:325–329. doi: 10.1046/j.1364-3703.2000.00044.x.
2. Rang J, Li L, Tang Q, Yang Q, He L, Ding X, Xia L. 2015. Comparative study of bacterial DNA extraction methods for the third generation sequencing technology. J Nat Sci Hunan Normal Univ 38:14–20. (In Chinese).
3. De Coster W, D'Hert S, Schultz DT, Cruts M, Van Broeckhoven C. 2018. NanoPack: visualizing and processing long-read sequencing data. Bioinformatics 34:2666–2669. doi: 10.1093/bioinformatics/bty149.
4. Chen S, Zhou Y, Chen Y, Gu J. 2018. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34:i884–i890. doi: 10.1093/bioinformatics/bty560.
5. Cock PJ, Fields CJ, Goto N, Heuer ML, Rice PM. 2010. The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Res 38:1767–1771. doi: 10.1093/nar/gkp1137.
6. Hansen KD, Brenner SE, Dudoit S. 2010. Biases in Illumina transcriptome sequencing caused by random hexamer priming. Nucleic Acids Res 38:e131. doi: 10.1093/nar/gkq224.
7. Erlich Y, Mitra PP, delaBastide M, McCombie WR, Hannon GJ. 2008. Alta-Cyclic: a self-optimizing base caller for next-generation sequencing. Nat Methods 5:679–682. doi: 10.1038/nmeth.1230.
8. Jiang L, Schlesinger F, Davis CA, Zhang Y, Li R, Salit M, Gingeras TR, Oliver B. 2011. Synthetic spike-in standards for RNA-seq experiments. Genome Res 21:1543–1551. doi: 10.1101/gr.121095.111.
9. Yan L, Yang M, Guo H, Yang L, Wu J, Li R, Liu P, Lian Y, Zheng X, Yan J, Huang J, Li M, Wu X, Wen L, Lao K, Li R, Qiao J, Tang F. 2013. Single-cell RNA-Seq profiling of human preimplantation embryos and embryonic stem cells. Nat Struct Mol Biol 20:1131–1139. doi: 10.1038/nsmb.2660.
10. Wick RR, Judd LM, Gorrie CL, Holt KE. 2017. Unicycler: resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol 13:e1005595. doi: 10.1371/journal.pcbi.1005595.
11. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM. 2014. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9:e112963. doi: 10.1371/journal.pone.0112963.
12. Tatusova T, Dicuccio M, Badretdin A, Chetvernin V, Nawrocki EP, Zaslavsky L, Lomsadze A, Pruitt KD, Borodovsky M, Ostell J. 2016. NCBI Prokaryotic Genome Annotation Pipeline. Nucleic Acids Res 44:6614–6624. doi: 10.1093/nar/gkw569.
13. Yoon S-H, Ha S-M, Kwon S, Lim J, Kim Y, Seo H, Chun J. 2017. Introducing EzBioCloud: a taxonomically united database of 16S rRNA gene sequences and whole-genome assemblies. Int J Syst Evol Microbiol 67:1613–1617. doi: 10.1099/ijsem.0.001755.
