Abstract
Trichoderma asperellum PK1J2 is a promising cellulase-producing fungus isolated from a palm empty fruit bunch in Riau, Indonesia. Presented here is the genome assembly of T. asperellum PK1J2. The whole genome of the fungus was sequenced using Illumina NovaSeq PE150. Genome assembly was performed with SOAPdenovo, SPAdes, and ABySS, and the results of the three assemblers were integrated with CISA. The final genome assembly was approximately 36 Mbp with a GC content of 48.45% and contained 6,835 protein-coding genes with a total length of 9,233,597 bp. This whole genome shotgun project has been deposited at DDBJ/ENA/GenBank under accession JAGJIK000000000.
Keywords: Whole-genome sequence, Trichoderma asperellum, Palm empty fruit bunch, Genomic
Specifications Table
Subject | Biological science |
Specific subject area | Genomics, Microbiology |
Type of data | Genome sequence in FASTA format; Table; Figure |
How the data were acquired | Genome sequencing was performed using Illumina NovaSeq PE150 |
Data format | Raw; Analyzed |
Description of data collection | Genomic DNA was isolated from Trichoderma asperellum PK1J2. The sequencing libraries were generated using the NEBNext® Ultra™ DNA Library Prep Kit for Illumina (NEB, USA). Illumina NovaSeq PE150 was used for whole genome sequencing. The genome was assembled with SOAPdenovo, SPAdes, and ABySS, and the assemblies were integrated with CISA. |
Data source location | Institution: Faculty of Agricultural Technology, Universitas Gadjah Mada City/Town/Region: Sleman, Yogyakarta Country: Indonesia Latitude and longitude for samples/data collection: 7° 46′ 14.5″ S, 110° 22′ 39.8″ E |
Data accessibility | Repository name: Data identification number: This whole genome shotgun project has been deposited at DDBJ/ENA/GenBank under accession JAGJIK000000000. The version described in this paper is version JAGJIK000000000. Direct URL to data: www.ncbi.nlm.nih.gov/assembly/GCA_022817925.1/ The raw sequence data of this paper are accessible under SRA accession number SRR19762116. Direct URL to data: www.ncbi.nlm.nih.gov/sra/?term=pk1j2 All data in this paper are available at NCBI with BioProject number PRJNA699105. Direct URL to data: www.ncbi.nlm.nih.gov/bioproject/PRJNA699105 |
Value of the Data
- The genome data of Trichoderma asperellum PK1J2 isolated from Indonesia provide insight into the genetic diversity of T. asperellum and essential genetic information on effector proteins and on metabolite and enzyme production.
- The data can be useful for researchers working on fungal microbiology, biotechnology, genomics, and genetic engineering.
- This genome information can be used for genome mining to discover the genes involved in metabolite and enzyme biosynthesis pathways.
- Stakeholders, including industry, can use this genetic information to develop T. asperellum PK1J2 as a biocontrol agent, biofertiliser, and producer of metabolites and enzymes, especially cellulase.
1. Data Description
T. asperellum is a mycoparasitic species widely used for its ability to inhibit the growth of plant pathogens [1]. T. asperellum has been shown to produce hydrolytic enzymes such as cellulase and xylanase [2,3]. T. asperellum has also been reported to hydrolyse wheat bran, wheat straw, paper, sawdust, corncob, duckweed, and agave by secreting cellulases [2], [3], [4], [5], [6], [7]. Strain PK1J2 has been shown to produce high levels of cellulase. The cellulase from this fungus can hydrolyse cassava stem and sago waste into fermentable sugar [8,9]. In a previous study, strain PK1J2 produced the highest cellulase activity among the examined fungi isolated from Indonesia and was therefore selected for genome characterization.
Fig. 1 shows a phylogenetic tree comparing the internal transcribed spacer (ITS) region of strain PK1J2 with those of other fungi. As can be seen in the figure, the ITS region of strain PK1J2 showed the highest similarity to Trichoderma asperellum.
Fig. 1.
Phylogenetic tree of strain PK1J2 ITS region compared with the other fungi.
T. asperellum PK1J2 had 6835 protein-coding genes with a total length of 9,233,597 bp, as seen in Table 1. The final assembly of the T. asperellum PK1J2 genome was approximately 36 Mbp with a GC content of 48.45%. The genome consisted of 249 scaffolds with a total length of 36,156,613 bp (N50, 724,251 bp; N90, 188,633 bp; and Nmax, 2,963,926 bp) and 636 contigs with a total length of 36,152,186 bp (N50, 143,266 bp; N90, 33,810 bp; and Nmax, 506,821 bp). This whole genome shotgun project has been deposited at DDBJ/ENA/GenBank under accession JAGJIK000000000. The version described in this paper is version JAGJIK000000000 [10].
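The N50, N90, and Nmax values above follow the usual definitions: Nx is the length of the shortest sequence in the smallest set of longest sequences that together cover at least x% of the assembly, and Nmax is the longest sequence. The following is a minimal sketch of how such statistics and a GC percentage could be computed. The function names and the toy input are illustrative assumptions, not the tools or data used for PK1J2.

```python
def nx(lengths, fraction):
    """Return the Nx statistic: the length L such that sequences of
    length >= L cover at least `fraction` of the total assembly length."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    threshold = fraction * sum(lengths)
    running = 0
    # Walk from the longest sequence down until the threshold is reached.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length


def gc_content(sequence):
    """Return the GC percentage, ignoring ambiguous bases such as N."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 100.0 * gc / (gc + at) if (gc + at) else 0.0


# Toy scaffold lengths (hypothetical, not the PK1J2 scaffolds).
toy_lengths = [100, 80, 60, 40, 20]
print("N50 :", nx(toy_lengths, 0.5))
print("N90 :", nx(toy_lengths, 0.9))
print("Nmax:", max(toy_lengths))
print("GC %:", gc_content("ATGCGCNNAT"))
```

Applied to the real scaffold or contig lengths, the same functions should give values comparable to those reported above, although small differences can arise from how a given tool handles gaps and ambiguous bases.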
Table 1.
Genome features of T. asperellum PK1J2.
Genome feature | Value |
---|---|
Genome size (bp) | 36,156,613 |
Gene number | 6835 |
Total gene length (bp) | 9,233,597 |
tRNA genes | 236 |
rRNA genes (5S) | 49 |
sRNA genes | 2 |
snRNA genes | 22 |
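A few derived quantities follow directly from Table 1. The short calculation below is a sketch that assumes "Gene length" is the summed length of all predicted protein-coding genes. The variable names are illustrative.

```python
# Values copied from Table 1.
genome_size_bp = 36_156_613
gene_count = 6835
total_gene_length_bp = 9_233_597  # assumed: summed length of all genes

# Average length of a predicted gene.
mean_gene_length_bp = total_gene_length_bp / gene_count

# Share of the assembly covered by predicted genes.
genic_fraction_pct = 100 * total_gene_length_bp / genome_size_bp

# Number of predicted genes per megabase of assembly.
gene_density_per_mbp = gene_count / (genome_size_bp / 1_000_000)

print(f"Mean gene length : {mean_gene_length_bp:.1f} bp")
print(f"Genic fraction   : {genic_fraction_pct:.2f} %")
print(f"Gene density     : {gene_density_per_mbp:.1f} genes/Mbp")
```

Under that assumption, the predicted genes average roughly 1.35 kbp and cover about a quarter of the assembly.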
Functional annotation assigned 4759 genes using GO, 6398 genes using KEGG, 1946 genes using KOG, 4759 genes using Pfam, 2783 genes using SWISS-PROT, and 6544 genes using the NR database. Genes coding for proteins possibly involved in secondary metabolite production were found in T1PKS, NRPS, NRPS-like, T1PKS-NRPS hybrid, and terpene clusters. A carbohydrate-active enzyme analysis showed that the most abundant families in T. asperellum PK1J2 were GH18, GH3, GH16, GH2, and GH5.
2. Experimental Design, Materials and Methods
2.1. Fungal Strain and DNA Extraction
Trichoderma asperellum strain PK1J2 was obtained from the Laboratory of Biotechnology, Faculty of Agricultural Technology, Universitas Gadjah Mada. The strain was isolated from a rotten palm empty fruit bunch in Pekanbaru, Riau, Indonesia. It was grown on potato dextrose agar (PDA) at 30 °C for seven days. Genomic DNA was extracted with the ZymoBIOMICS™ DNA Mini Kit (Zymo Research, California). The extracted DNA was checked by agarose gel electrophoresis and quantified with a Qubit® 2.0 Fluorometer.
2.2. Species Identification
The ITS region was amplified using the universal primer set ITS1 (forward primer) 5′-TCCGTAGGTGAACCTGCGG-3′ and ITS4 (reverse primer) 5′-TCCTCCGCTTATTGATATGC-3′. The PCR product was sequenced in both directions. The sequence was compared with the NCBI database using BLAST. The phylogenetic tree was constructed using the Neighbor-Joining method (unrooted tree) in NCBI BLAST.
2.3. Genome Sequencing and Assembly
Sequencing libraries were generated using the NEBNext® Ultra™ DNA Library Prep Kit for Illumina (NEB, USA) following the manufacturer's recommendations. Whole genome sequencing of the fungus was performed on an Illumina NovaSeq PE150 at Beijing Novogene Bioinformatics Technology Co., Ltd. The genome was assembled using SOAPdenovo, SPAdes, and ABySS. The assembly results from all three programs were integrated with CISA. The assembly result with the fewest scaffolds was selected.
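The final selection step, choosing the assembly with the fewest scaffolds, can be illustrated with a short sketch. The text does not describe how the candidate assemblies were compared in practice. The code below simply counts FASTA header lines in each candidate and picks the minimum, and all names and toy sequences in it are hypothetical.

```python
import io


def count_fasta_records(handle):
    """Count sequences in a FASTA stream by counting '>' header lines."""
    return sum(1 for line in handle if line.startswith(">"))


def pick_fewest_scaffolds(assemblies):
    """Return (name, counts) where `name` is the assembly with the fewest
    scaffolds and `counts` maps every assembly name to its scaffold count."""
    counts = {
        name: count_fasta_records(io.StringIO(text))
        for name, text in assemblies.items()
    }
    best = min(counts, key=counts.get)
    return best, counts


# Toy FASTA contents (hypothetical, standing in for real assembler output).
toy_assemblies = {
    "assembler_a": ">s1\nACGT\n>s2\nGGCC\n>s3\nTTAA\n",
    "assembler_b": ">s1\nACGTGGCC\n>s2\nTTAA\n",
    "integrated": ">s1\nACGTGGCCTTAA\n",
}

best, counts = pick_fewest_scaffolds(toy_assemblies)
print("Scaffold counts:", counts)
print("Selected assembly:", best)
```

With real data, the in-memory strings would be replaced by open file handles to each assembler's scaffold FASTA. Scaffold count alone says nothing about correctness, so it is usually read alongside total length and N50.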
2.4. Genome Component Prediction
Transfer RNA (tRNA) genes were predicted by tRNAscan-SE [11]. Ribosomal RNA (rRNA) genes were predicted by RNAmmer [12], and small nuclear RNAs (snRNA) were predicted by BLAST against the Rfam database [13].
2.5. Genome Annotation
Functional annotation of the genome was based on BLASTP searches against the GO (Gene Ontology) [14], KEGG (Kyoto Encyclopedia of Genes and Genomes) [15], COG (Clusters of Orthologous Groups) [16], NR (Non-Redundant Protein Database) [17], and SWISS-PROT [18] databases. Carbohydrate-active enzymes were predicted using the Carbohydrate-Active Enzymes Database (CAZy) [19]. Genes coding for proteins possibly involved in secondary metabolite production were predicted by antiSMASH v.5.0 [20].
Ethics Statements
Not applicable.
CRediT authorship contribution statement
Fela Laila Nur Hidayati: Investigation, Formal analysis, Writing – original draft. Dian Anggraini Suroto: Methodology, Software, Data curation. : Resources. Muhammad Nur Cahyanto: Supervision, Project administration, Funding acquisition, Writing – review & editing. Jaka Widada: Conceptualization, Methodology, Validation, Writing – review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by the Ministry of Research, Technology, and Higher Education of the Republic of Indonesia [grant number 3192/UN1.DITLIT/DIT-LIT/PT/2020].
Data Availability
Trichoderma asperellum strain:PK1J2 (Original data) (National Center for Biotechnology Information (NCBI)).
References
- 1. Go W.Z., H'ng P.S., Wong M.Y., Chin K.L., Ujang S. Evaluation of Trichoderma asperellum as a potential biocontrol agent against Rigidoporus microporus Hevea brasiliensis. Arch. Phytopathol. Plant Prot. 2019;52:639–666. doi: 10.1080/03235408.2019.1587821.
- 2. Wang Q., Lin H., Shen Q., Fan X., Bai N., Zhao Y. Characterization of cellulase secretion and Cre1-mediated carbon source repression in the potential lignocellulose-degrading strain Trichoderma asperellum T-1. PLoS One. 2015;10:1–15. doi: 10.1371/journal.pone.0119237.
- 3. Bech L., Busk P.K., Lange L. Cell wall degrading enzymes in Trichoderma asperellum grown on wheat bran. Fungal Genom. Biol. 2014;04:1–11. doi: 10.4172/2165-8056.1000116.
- 4. Pandey S., Srivastava M., Shahid M., Kumar V., Singh A., Trivedi S., Srivastava Y.K. Trichoderma species cellulases produced by solid state fermentation. J. Data Min. Genom. Proteom. 2015;6:1–4. doi: 10.4172/2153-0602.1000170.
- 5. Bech L., Herbst F.A., Grell M.N., Lange L. On-site enzyme production by Trichoderma asperellum for the degradation of duckweed. Fungal Genom. Biol. 2015;05:1–10. doi: 10.4172/2165-8056.1000126.
- 6. Nava-Cruz N.Y., Contreras-Esquivel J.C., Aguilar-González M.A., Nuncio A., Rodríguez-Herrera R., Aguilar C.N. Agave atrovirens fibers as substrate and support for solid-state fermentation for cellulase production by Trichoderma asperellum. 3 Biotech. 2016;6:1–12. doi: 10.1007/s13205-016-0426-6.
- 7. Zapata Y., Galviz-quezada A., Salcedo-reyes J.C. Cellulases production on paper and sawdust using native Trichoderma asperellum. Univ. Sci. 2018;23:419–436. doi: 10.11144/Javeriana.SC23-3.cpop.
- 8. Dewi P., Indrati R., Millati R., Sardjono S. Effect of lime pretreatment on microstructure of cassava stalk fibers and growth of Aspergillus niger. Biosaintifika J. Biol. Biol. Educ. 2018;10:205–212. doi: 10.15294/biosaintifika.v10i1.13802.
- 9. Iqbal M., Rianse K., Millati R., Indrati R. Enhanced fermentable sugars production from sago waste by Trichoderma reesei Pk1J2 and Aspergillus niger FNCC 6114 fermentation. Int. J. Sci. Res. 2020;9:1200–1204. doi: 10.21275/SR20617111902.
- 10. Hidayati F.L.N., Suroto D.A., Sardjono S., Nur Cahyanto M., Widada J. Trichoderma asperellum strain PK1J2, v1. National Center for Biotechnology Information; 2022. www.ncbi.nlm.nih.gov/bioproject/PRJNA699105.
- 11. Lowe T.M., Eddy S.R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25:955–964. doi: 10.1093/nar/25.5.0955.
- 12. Lagesen K., Hallin P., Rødland E.A., Stærfeldt H.H., Rognes T., Ussery D.W. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 2007;35:3100–3108. doi: 10.1093/nar/gkm160.
- 13. Gardner P.P., Daub J., Tate J.G., Nawrocki E.P., Kolbe D.L., Lindgreen S., Wilkinson A.C., Finn R.D., Griffiths-Jones S., Eddy S.R., Bateman A. Rfam: updates to the RNA families database. Nucleic Acids Res. 2009;37:136–140. doi: 10.1093/nar/gkn766.
- 14. Ashburner M., Ball C.A., Blake J.A., Botstein D., Butler H., Cherry J.M., Davis A.P., Dolinski K., Dwight S.S., Eppig J.T., Harris M.A., Hill D.P., Issel-Tarver L., Kasarskis A., Lewis S., Matese J.C., Richardson J.E., Ringwald M., Rubin G.M., Sherlock G. Gene ontology: tool for the unification of biology. Nat. Genet. 2000;25:25–29. doi: 10.1038/75556.
- 15. Kanehisa M., Goto S., Kawashima S., Okuno Y., Hattori M. The KEGG resource for deciphering the genome. Nucleic Acids Res. 2004;32:277–280. doi: 10.1093/nar/gkh063.
- 16. Galperin M.Y., Makarova K.S., Wolf Y.I., Koonin E.V. Expanded microbial genome coverage and improved protein family annotation in the COG database. Nucleic Acids Res. 2015;43:261–269. doi: 10.1093/nar/gku1223.
- 17. Li W., Jaroszewski L., Godzik A. Tolerating some redundancy significantly speeds up clustering of large protein databases. Bioinformatics. 2002;18:77–82. doi: 10.1093/bioinformatics/18.1.77.
- 18. Bairoch A., Apweiler R. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 2000;28:45–48. doi: 10.1093/nar/28.1.45.
- 19. Cantarel B.I., Coutinho P.M., Rancurel C., Bernard T., Lombard V., Henrissat B. The carbohydrate-active enzymes database (CAZy): an expert resource for glycogenomics. Nucleic Acids Res. 2009;37:233–238. doi: 10.1093/nar/gkn663.
- 20. Blin K., Shaw S., Steinke K., Villebro R., Ziemert N., Lee Y., Medema M.H., Weber T. antiSMASH 5.0: updates to the secondary metabolite genome mining pipeline. Nucleic Acids Res. 2019;47:81–87. doi: 10.1093/nar/gkz310.