ABSTRACT
Xylaria sp. BCC 1067 is a wood-decaying fungus which is capable of producing lignocellulolytic enzymes. Based on the results of a single-molecule real-time sequencing technology analysis, we present the first draft genome of Xylaria sp. BCC 1067, comprising 54.1 Mb with 12,112 protein-coding genes.
ANNOUNCEMENT
Lignocellulolytic enzymes are widely exploited in various applications, particularly in the hydrolysis stages of lignocellulose-based industries (1, 2). Xylaria species, which belong to the class Sordariomycetes in the phylum Ascomycota, are considered among the most efficient wood-decaying fungi and are classified as soft-rot type II (3). Xylaria sp. BCC 1067, isolated from Nam Nao National Park in Thailand (4, 5), has been shown to produce lignocellulolytic enzymes, including endoglucanase, β-glucosidase, xylanase, and laccase. However, the genes underlying enzyme production in this fungus have not yet been explored or characterized. Therefore, de novo whole-genome sequencing of Xylaria sp. BCC 1067 was performed to better understand the lignocellulolytic enzyme systems of this fungus.
Genomic DNA of Xylaria sp. BCC 1067 was extracted by the phenol-chloroform method (6) from mycelia cultivated in potato dextrose broth at 25°C for 5 days. Genome sequencing was performed using PacBio single-molecule real-time (SMRT) sequencing technology. High-molecular-weight genomic DNA was sheared and size selected on a BluePippin system using a cutoff range of 15 to 50 kb. The libraries were sequenced on 16 SMRT cells using a PacBio RS II sequencer according to the manufacturer’s instructions (Pacific Biosciences, Menlo Park, CA, USA). For de novo assembly, a total of 8,912,350,715 bases of genome sequence in 45,112 circular consensus sequence (CCS) reads were assembled using the SMRT Analysis Hierarchical Genome Assembly Process 3 (HGAP3) pipeline. The assembly comprised 43 contigs; the longest contig was 6,684,005 bp, and the N50 value was 5,573,684 bp. The total size of the Xylaria sp. BCC 1067 genome was 54,100,337 bp, and the overall G+C content of the genome was 42.44%.
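The assembly statistics reported above (N50, G+C content, and the ratio of sequenced bases to genome size) follow from simple arithmetic on contig lengths and sequences. The following Python sketch illustrates the definitions only; it is not the HGAP3 implementation, the function names are hypothetical, and the toy contig lengths are invented for illustration. Only the two base counts passed to `nominal_depth` are taken from the text.

```python
def n50(lengths):
    """Return the N50: the contig length L such that contigs of
    length >= L together cover at least half of the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contig lengths given")


def gc_percent(sequence):
    """Return the G+C percentage of a sequence, ignoring non-ACGT symbols."""
    sequence = sequence.upper()
    acgt = sum(sequence.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * (sequence.count("G") + sequence.count("C")) / acgt


def nominal_depth(total_bases, genome_size):
    """Return sequenced bases divided by assembly size (a nominal depth)."""
    return total_bases / genome_size


# Toy contig lengths (illustrative only, not the real 43 contigs).
toy_lengths = [50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)

# Toy sequence (illustrative only).
toy_gc = gc_percent("GGCCAATT")

# Values reported in the text: input bases and total assembly size.
depth = nominal_depth(8_912_350_715, 54_100_337)
print(f"toy N50 = {toy_n50}, toy GC = {toy_gc:.1f}%, nominal depth = {depth:.1f}x")
```

With the reported figures, the nominal depth works out to roughly 165-fold. This is a ratio of input bases to assembly size, not a measured per-base coverage.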
Ab initio gene prediction was carried out using Fgenesh v4.0.0 (7) based on gene models from Fusarium graminearum, which predicted 12,112 coding sequences (CDSs). According to tRNAscan-SE v2.0 (8), the genome contained 232 nuclear tRNAs. CDSs were functionally characterized using BLASTP v2.2.31+ (9) against a BLAST database created from all protein sequences of 433 fungi in the phylum Ascomycota downloaded from the NCBI FTP site on 12 January 2017. The carbohydrate-active enzyme (CAZyme) gene content was determined using dbCAN2 (10, 11) based on three different tools, HMMER v3.2.1, DIAMOND v0.9.24, and Hotpep (12). The analysis revealed that the Xylaria sp. BCC 1067 genome contains 239 glycoside hydrolases (GH), 100 glycosyltransferases (GT), 16 polysaccharide lyases (PL), 27 carbohydrate esterases (CE), 87 carbohydrate-binding modules (CBM), and 86 enzymes with auxiliary activities (AA), suggesting that this fungus has high potential for use in biomass conversion. The genomic information of Xylaria sp. BCC 1067 will provide a better understanding of the lignocellulolytic enzyme system in this organism and could enable metabolic engineering of the strain for enhanced lignocellulolytic enzyme production.
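Because the CAZyme annotation draws on three tools, their per-gene calls have to be reconciled in some way. The text does not state which rule was applied, so the Python sketch below shows only one possible approach: keeping a family assignment when at least `min_tools` tools agree on it. The function name, the `min_tools` threshold, and the toy gene identifiers and family calls are all assumptions made for illustration. Only the six class totals at the end are taken from the text.

```python
from collections import Counter


def consensus_calls(calls_by_tool, min_tools=2):
    """Keep (gene, family) assignments supported by at least min_tools tools.

    calls_by_tool maps a tool name to a dict of gene ID -> list of families.
    Returns a dict of gene ID -> sorted list of retained families.
    """
    votes = Counter()
    for calls in calls_by_tool.values():
        for gene, families in calls.items():
            for family in set(families):  # one vote per tool per family
                votes[(gene, family)] += 1

    kept = {}
    for (gene, family), count in votes.items():
        if count >= min_tools:
            kept.setdefault(gene, set()).add(family)
    return {gene: sorted(families) for gene, families in kept.items()}


# Hypothetical calls for four made-up genes (not real annotation output).
toy_calls = {
    "HMMER": {"gene1": ["GH5"], "gene2": ["AA1"], "gene3": ["CE1"]},
    "DIAMOND": {"gene1": ["GH5"], "gene2": ["AA1"]},
    "Hotpep": {"gene1": ["GH5"], "gene4": ["GT2"]},
}
toy_consensus = consensus_calls(toy_calls)
print(toy_consensus)

# Class totals as reported in the text.
reported = {"GH": 239, "GT": 100, "PL": 16, "CE": 27, "CBM": 87, "AA": 86}
reported_total = sum(reported.values())
print(f"reported CAZyme annotations in total: {reported_total}")
```

In the toy data, gene1 and gene2 are retained because at least two tools agree, whereas gene3 and gene4 are each supported by a single tool and are dropped. The reported class counts sum to 555 annotations; because carbohydrate-binding modules can occur on the same protein as a catalytic domain, this sum is not necessarily the number of distinct genes.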
Data availability.
The draft genome sequence of Xylaria sp. BCC 1067 has been deposited in GenBank under the accession number SSCS00000000, BioProject number PRJNA531792, and BioSample number SAMN08619538. The raw reads have been deposited in the NCBI Sequence Read Archive (SRA) under the number SRP191752.
ACKNOWLEDGMENTS
This research was funded by the National Research Council of Thailand and King Mongkut’s University of Technology Thonburi (KMUTT), Thailand. We acknowledge the financial support provided by the NSFC-NRCT Collaboration Project and King Mongkut’s University of Technology Thonburi through the KMUTT 55th Anniversary Commemorative Fund.
We thank the Bioinformatics and Systems Biology Program, the School of Bioresources and Technology, and the School of Information Technology, KMUTT. We are also grateful to the Systems Biology and Bioinformatics Laboratory and the Fungal Biotechnology laboratory at KMUTT for the use of their facilities and their support.
REFERENCES
- 1. Kirk O, Borchert TV, Fuglsang CC. 2002. Industrial enzyme applications. Curr Opin Biotechnol 13:345–351. doi: 10.1016/S0958-1669(02)00328-2.
- 2. Hofrichter M (ed). 2010. The mycota, vol X, 2nd ed. Industrial applications. Springer, Berlin, Germany. doi: 10.1007/978-3-642-11458-8.
- 3. Blanchette RA. 1995. Degradation of the lignocellulose complex in wood. Can J Bot 73:999–1010. doi: 10.1139/b95-350.
- 4. Isaka M, Jaturapat A, Kladwang W, Punya J, Lertwerawat Y, Tanticharoen M, Thebtaranonth Y. 2000. Antiplasmodial compounds from the wood-decayed fungus Xylaria sp. BCC 1067. Planta Med 66:473–475. doi: 10.1055/s-2000-8588.
- 5. Punya J. 2002. Polyketide synthase genes from the wood-decaying fungus Xylaria sp. BCC 1067. Doctoral dissertation, University of Westminster, London, United Kingdom.
- 6. Green MR, Sambrook J. 2012. Molecular cloning: a laboratory manual, 4th ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
- 7. Solovyev V, Kosarev P, Seledsov I, Vorobyev D. 2006. Automatic annotation of eukaryotic genes, pseudogenes and promoters. Genome Biol 7(Suppl 1):S10. doi: 10.1186/gb-2006-7-s1-s10.
- 8. Lowe TM, Chan PP. 2016. tRNAscan-SE on-line: integrating search and context for analysis of transfer RNA genes. Nucleic Acids Res 44:W54–W57. doi: 10.1093/nar/gkw413.
- 9. Johnson M, Zaretskaya I, Raytselis Y, Merezhuk Y, McGinnis S, Madden TL. 2008. NCBI BLAST: a better Web interface. Nucleic Acids Res 36:W5–W9. doi: 10.1093/nar/gkn201.
- 10. Yin Y, Mao X, Yang JC, Chen X, Mao F, Xu Y. 2012. dbCAN: a Web resource for automated carbohydrate-active enzyme annotation. Nucleic Acids Res 40:W445–W451. doi: 10.1093/nar/gks479.
- 11. Zhang H, Yohe T, Huang L, Entwistle S, Wu P, Yang Z, Busk PK, Xu Y, Yin Y. 2018. dbCAN2: a meta server for automated carbohydrate-active enzyme annotation. Nucleic Acids Res 46:W95–W101. doi: 10.1093/nar/gky418.
- 12. Busk PK, Pilgaard B, Lezyk MJ, Meyer AS, Lange L. 2017. Homology to peptide pattern for annotation of carbohydrate-active enzymes and prediction of function. BMC Bioinformatics 18:214. doi: 10.1186/s12859-017-1625-9.
