Abstract
Phomopsis longicolla is the primary cause of Phomopsis seed decay in soybean. Here, we report the de novo assembled draft genome sequence of the P. longicolla type strain TWH P74 (ATCC 60325), which was originally isolated by Hobbs et al. from soybean seed in Ohio in 1983.
GENOME ANNOUNCEMENT
Phomopsis longicolla T. W. Hobbs is an important seed-borne fungal pathogen in the Diaporthe-Phomopsis complex and the primary cause of Phomopsis seed decay (PSD) in soybean (1, 2). This disease severely decreases soybean seed quality and yield (3, 4). Most research has focused on breeding for resistant lines (5, 6). However, the mechanisms of PSD development and pathogen invasion of soybean are not fully understood. As a first step toward investigating the genetic basis of fungal virulence factors and understanding the mechanism of infection, we have assembled the draft genome sequence of P. longicolla type strain TWH P74 (ATCC 60325), which was originally isolated by Hobbs et al. from soybean seed in Ohio in 1983 (1).
Genomic DNA of type strain TWH P74 was extracted from a 4-day-old culture and used to generate mate-pair and paired-end libraries with insert sizes of approximately 4 kb and 500 bp, respectively. A no-gel mate-pair library was generated with the Nextera mate-pair sample preparation kit (Illumina, San Diego, CA), and a paired-end library was made using the TruSeq DNA PCR-free sample preparation kit (Illumina, San Diego, CA) according to the manufacturer’s protocols. Libraries were sequenced in separate lanes on an Illumina HiSeq 2500 sequencer using a TruSeq SBS sequencing kit (version 3, Illumina) at the Genomics Core Facility, Purdue University.
The mate-pair library produced 43,688,054 reads and the paired-end library produced 71,713,088 reads (read length = 101 bp for both). Adapter sequences were removed and poor-quality bases (Phred score < 20) were trimmed from each read using Trimmomatic (7) and/or fastx clipper (http://hannonlab.cshl.edu/fastx_toolkit/), and reads shorter than 30 bases were discarded. A total of 29,788,854 mate-pair reads (2,554 Mb) and 68,445,472 paired-end reads (6,748 Mb) were retained.
A draft of the P. longicolla genome was assembled from both libraries using the de novo assembler ABySS version 1.3.7 with a k-mer size of 90 (8). The assembly contained a total of 986 scaffolds with an average read depth of coverage of 145-fold. The N50 was 213.1 kb, and the maximum contig length was 1,124 kb. The total sequence length of the resulting draft genome was 64,714,586 bp, with an overall G+C content of 48.1%.
Gene annotation using the Augustus program (http://bioinf.uni-greifswald.de/webaugustus/prediction/create) (9), trained with the parameters of the species Fusarium graminearum, resulted in 16,606 predicted genes. The average gene length was 1,709 bp, with a range of 215 bp to 23,099 bp. The total length of the gene sequences was 28.4 Mb, which is 43.8% of the whole-genome sequence. We used CEGMA (10) to assess completeness against a set of 248 core eukaryotic genes (CEGs), conserved protein families that occur in a wide range of eukaryotes (http://korflab.ucdavis.edu/Datasets/genome_completeness/index.html#SCT2); 98.21% of the core genes were matched, indicating that the draft genome sequence is substantially complete.
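For readers unfamiliar with the assembly statistics quoted above, the following is a minimal Python sketch of how an N50 and an overall G+C percentage can be computed from a set of scaffold sequences. It is an illustration of the standard definitions only, not the procedure used to produce the reported values; the function names `n50` and `gc_percent` and the toy sequences are hypothetical and were chosen for this example.

```python
from typing import Iterable, List


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total assembly length."""
    ordered: List[int] = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths supplied")


def gc_percent(sequences: Iterable[str]) -> float:
    """Return the percentage of G and C among unambiguous bases (A, C, G, T).

    Ambiguous bases such as N are ignored (an assumption of this sketch)."""
    gc = 0
    at = 0
    for seq in sequences:
        upper = seq.upper()
        gc += upper.count("G") + upper.count("C")
        at += upper.count("A") + upper.count("T")
    if gc + at == 0:
        raise ValueError("no unambiguous bases supplied")
    return 100.0 * gc / (gc + at)


# Toy scaffolds (hypothetical data) with lengths 40, 30, 20, and 10 bases.
toy_scaffolds = ["GCGCATAT" * 5, "GGCCAT" * 5, "ATGC" * 5, "AT" * 5]

if __name__ == "__main__":
    print("N50:", n50(len(s) for s in toy_scaffolds))
    print("G+C %:", gc_percent(toy_scaffolds))
```

In the toy data the total length is 100 bases, so the N50 is reached once the two longest scaffolds (40 + 30 bases) have been counted. Applied to real scaffold sequences parsed from a FASTA file, the same two functions would yield statistics of the kind reported for the draft assembly, although assemblers may apply minimum-length cutoffs that this sketch does not.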
Nucleotide sequence accession numbers.
The draft genome sequence of P. longicolla type strain TWH P74 has been deposited at DDBJ/EMBL/GenBank under the accession no. JUJX00000000. The version described in this paper is version JUJX01000000.
ACKNOWLEDGMENTS
This work was partially supported by USDA-ARS project 6402-21220-012-00D.
We are grateful to Phillip SanMiguel and Rick Westerman at the Purdue Genomics Core Facility for sequencing. Support by the University of Georgia Coastal Plain Experiment Station to Ji is also appreciated.
Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the United States Department of Agriculture. The USDA is an equal opportunity provider and employer.
Footnotes
Citation Li S, Song Q, Ji P, Cregan P. 2015. Draft genome sequence of Phomopsis longicolla type strain TWH P74, a fungus causing Phomopsis seed decay in soybean. Genome Announc 3(1):e00010-15. doi:10.1128/genomeA.00010-15.
REFERENCES
- 1. Hobbs TW, Schmitthenner AF, Kuter GA. 1985. A new Phomopsis species from soybean. Mycologia 77:535–544. doi: 10.2307/3793352.
- 2. Li S, Hartman GL, Boykin DL. 2010. Aggressiveness of Phomopsis longicolla and other Phomopsis spp. on soybean. Plant Dis 94:1035–1040. doi: 10.1094/PDIS-94-8-1035.
- 3. Li S. 2011. Phomopsis seed decay of soybean, p 277–292. In: Sudaric A (ed), Soybean—molecular aspects of breeding. Intech Publisher, Vienna, Austria.
- 4. Hartman GL, Sinclair JB, Rupe JC. 1999. Compendium of soybean diseases, 4th ed. American Phytopathological Society, St. Paul, MN.
- 5. Jackson EW, Feng C, Fenn P, Chen P. 2009. Genetic mapping of resistance to Phomopsis seed decay in the soybean breeding line MO/PSD-0259 (PI562694) and plant introduction 80837. J Hered 100:777–783. doi: 10.1093/jhered/esp042.
- 6. Li S, Smith JR, Nelson RL. 2011. Resistance to Phomopsis seed decay identified in maturity group V soybean plant introductions. Crop Sci 51:2681–2688. doi: 10.2135/cropsci2011.03.0162.
- 7. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. doi: 10.1093/bioinformatics/btu170.
- 8. Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I. 2009. ABySS: a parallel assembler for short read sequence data. Genome Res 19:1117–1123. doi: 10.1101/gr.089532.108.
- 9. Hoff KJ, Stanke M. 2013. WebAUGUSTUS—a Web service for training AUGUSTUS and predicting genes in eukaryotes. Nucleic Acids Res 41:W123–W128. doi: 10.1093/nar/gkt418.
- 10. Parra G, Bradnam K, Korf I. 2007. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23:1061–1067. doi: 10.1093/bioinformatics/btm071.
