Draft genome sequence data of Antarctic Penicillium sp. strain E22, from Deception Island

Teoh Chui Peng; Paris Lavin; Rómulo Oses Pedraza; Natalia Fierro-Vásquez; Cristina Purcarea; Sheau Ting Yong; Clemente MVL Wong

doi:10.1016/j.dib.2024.110143

2024 Feb 9;53:110143. doi: 10.1016/j.dib.2024.110143

PMCID: PMC10900114 PMID: 38419763

Abstract

Here, we report the draft genome sequence and assembly of Penicillium sp. strain E22, which was isolated from Antarctic soil on Deception Island, South Shetland Islands, close to the Antarctic Peninsula. The genome was sequenced using a 2 × 250 bp paired-end method on an Illumina MiSeq 6000. Genome assembly was performed using software implemented in the KBase web service. A phylogenetic tree comparing the internal transcribed spacer (ITS) region of strain E22 with those of other Penicillium strains showed high genetic similarity to Penicillium griseofulvum MN545450 and Penicillium camemberti MT530220. The draft genome of Penicillium sp. strain E22 comprises 33,653 coding sequences, with a high G + C content of 48.32% and a total size of 37,484,944 bp. This draft genome assembly version has been deposited at GenBank under accession JASJUN000000000.

Keywords: Penicillium; Antarctica; Deception Island; Draft genome sequencing

Specifications Table

Subject	Microbiology • Fungal Biology
Specific subject area	The genome was sequenced on an Illumina MiSeq 6000. De novo assembly: SPAdes Genome Assembler software (v3.15.3). Annotation: DRAM (Distilled and Refined Annotation of Metabolism) software (v0.1.2), as implemented in the KBase web service.
Data format	Raw, Analyzed, Filtered and deposited
Type of data	Table, Figure
Data collection	Genomic DNA was purified from a pure culture of Penicillium sp. strain E22 isolated from Antarctic soil. The sequencing library was generated using the Nextera® XT DNA sample preparation kit for Illumina. An Illumina MiSeq PE250 was used for whole genome sequencing.
Data source location	Strain E22 was isolated from the soil of Deception Island (62° 55′ 58.1″ S, 60° 35′ 26.8″ W), Antarctica.
Data accessibility	Data are deposited at NCBI GenBank. BioProject: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA970415 SRA: https://www.ncbi.nlm.nih.gov/sra/SRR24472943

1. Value of the Data

• The availability of the draft genome assembly for Penicillium sp. strain E22 provides significant benefits for microbial taxonomy and ecological studies, especially in terms of identifying and mapping species distribution.
• The information presented in this article has the potential to be beneficial for researchers who are engaged in environmental microbiology, environmental biotechnology, extremophiles and genomics.
• The genomic data of Penicillium sp. strain E22 contained in this report could be a valuable asset for scientists who wish to conduct comparative genomic analyses across different strains and environments.

2. Background

The genus Penicillium comprises the most extensively distributed fungi, which are present universally in both outdoor and indoor environments, including food, water, plants, and soils. Presently, there are 354 acknowledged species in this genus, and numerous among them have the ability to generate a wide range of natural products and enzymes, including amylases, glucoamylase, cellulase, proteases, and xylanase [1], [2], [3].

2.1. Data description

The data presented here represent the genome sequencing, assembly, and annotation of the Antarctic Penicillium strain E22, isolated from Deception Island soil. Illumina sequencing yielded 874.33 million paired-end reads. The N50 contig length was 53.5 kb (53,495 bp; Table 1), with an average coverage of 24×. The resulting draft genome was 37,484,944 bp in size with a G + C content of 48.32%. Gene prediction analysis using the kb_DRAM web-based app in KBase (v0.1.2) [5] resulted in 33,653 protein-coding genes (Table 1).
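The relationship between the reported coverage and the assembly size can be sketched with simple arithmetic. The snippet below is a minimal illustration, assuming that average coverage is defined as total sequenced bases divided by assembly length; the source does not state how the 24× figure was derived, and the function names are hypothetical.

```python
def average_coverage(total_sequenced_bases: int, assembly_length: int) -> float:
    """Average depth, assuming depth = sequenced bases / assembly length."""
    if assembly_length <= 0:
        raise ValueError("assembly_length must be positive")
    return total_sequenced_bases / assembly_length


def bases_for_coverage(coverage: float, assembly_length: int) -> float:
    """Number of sequenced bases implied by a given average depth."""
    return coverage * assembly_length


ASSEMBLY_LENGTH = 37_484_944  # bp, total size reported in the text
REPORTED_COVERAGE = 24        # x, average coverage reported in the text

# Bases implied by the reported depth under the assumed definition.
implied_bases = bases_for_coverage(REPORTED_COVERAGE, ASSEMBLY_LENGTH)
print(f"{implied_bases:,.0f} bases")
```

Under this assumed definition, a 24× depth over the reported assembly size corresponds to roughly 0.9 Gb of sequence.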

Table 1. QUAST report and genome features for Penicillium sp. strain E22 assembly.

Statistics without reference	Penicillium sp. strain E22
# contigs	2,704
# contigs (>= 0 bp)	2,705
# contigs (>= 1000 bp)	1,846
# contigs (>= 10000 bp)	769
# contigs (>= 100000 bp)	70
# contigs (>= 1000000 bp)	0
Largest contig	298,884
Total length	37,484,944
Total length (>= 0 bp)	37,485,436
Total length (>= 1000 bp)	36,903,868
Total length (>= 10000 bp)	33,127,976
Total length (>= 100000 bp)	9,345,970
Total length (>= 1000000 bp)	0
N50	53,495
N75	24,743
L50	203
L75	460
GC (%)	48.32
Mismatches
# N's	2,597
# N's per 100 kbp	6.93
Genome features
Total coding sequences	33,653
tRNA genes	198
rRNA genes	50
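
The N50, L50, N75, L75 and "N's per 100 kbp" rows of Table 1 can be read through their usual definitions. The following is a minimal sketch of those definitions, not a reproduction of QUAST's implementation; the function names and the toy contig lengths are hypothetical, and only the N count and total length are taken from Table 1.

```python
def nx_lx(contig_lengths, fraction=0.5):
    """Return (Nx, Lx) for an assembly.

    Contigs are sorted longest first. Nx is the length of the contig at
    which the running total first reaches `fraction` of the assembly
    length, and Lx is the number of contigs needed to get there.
    """
    if not contig_lengths:
        raise ValueError("no contigs given")
    ordered = sorted(contig_lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count


def ns_per_100_kbp(n_count, total_length):
    """Uncalled bases (N) per 100,000 bp of assembly."""
    return n_count / total_length * 100_000


toy = [100, 80, 60, 40, 20]   # hypothetical contig lengths, total 300
print(nx_lx(toy, 0.5))        # N50 and L50 of the toy assembly
print(nx_lx(toy, 0.75))       # N75 and L75 of the toy assembly

# Values from Table 1: 2,597 N's over a total length of 37,484,944 bp.
print(round(ns_per_100_kbp(2597, 37_484_944), 2))
```

With the Table 1 values, the last line gives 6.93, matching the reported "N's per 100 kbp" row.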

Based on comparison of the internal transcribed spacer (ITS) region of the 18S–5.8S–26S nuclear ribosomal DNA of the isolate with that of other strains, the isolate showed the closest genetic similarity to Penicillium griseofulvum MN545450 and Penicillium camemberti MT530220, with 99.15% identity to both (Fig. 1). Functional gene annotation of the draft genome using KEGG predicted about 3,253 genes. The carbohydrate-active enzyme analysis showed that Penicillium sp. strain E22 was dominated by AA1, AA3, GH13, GT2, GH16, GH43 and GH5. Different types of gene clusters that may be involved in the formation of secondary metabolites were found: T1PKS, NRPS, NRPS-like, fungal-RiPP-like, NI-siderophore, betalactone, indole and terpene, together with several hybrids (NRPS,T1PKS; NRPS,indole; NRPS-like,T1PKS; NRPS,fungal-RiPP-like; NRP-metallophore-NRPS and T1PKS,indole,NRPS-like,terpene). This whole genome project has been deposited at NCBI GenBank under BioProject, BioSample and SRA accession numbers PRJNA970415, SAMN35003752 and SRR24472943, respectively. The assembly version described in this paper is JASJUN000000000.

Fig. 1. Phylogenetic tree of ITS region sequences inferred by the maximum-likelihood method. Numbers above branches are bootstrap values from 1000 replicates. The 44 Penicillium sequences used are shown with their GenBank accession numbers followed by the strain names.

3. Experimental Design, Materials and Methods

3.1. Genome DNA extraction and sequencing

Penicillium sp. strain E22 was isolated from Deception Island (62° 55′ 58.1″ S, 60° 35′ 26.8″ W), Antarctica. Strain E22 was routinely cultivated on Yeast Malt Extract Agar medium at 28 °C for 3 days. TRIzol™ (Invitrogen™, USA) was used for genomic DNA extraction. The genomic library of strain E22 was generated using the Nextera® XT DNA sample preparation kit according to the manufacturer's instructions. Whole genome sequencing was then performed on an Illumina MiSeq PE250 at the Biotechnology Research Institute, Universiti Malaysia Sabah.

3.2. Species identification

The DNA fragment was amplified using the universal primer set ITS1 (forward primer, 5′-TCCGTAGGTGAACCTGCGG-3′) and ITS4 (reverse primer, 5′-TCCTCCGCTTATTGATATGC-3′). The PCR product was sequenced bi-directionally. The sequence was compared against the NCBI database using BLAST. A maximum-likelihood phylogenetic tree based on ITS rRNA gene sequences (879 alignment positions, including gaps; substitution model: HKY85; gamma shape parameter: 0.467; transition/transversion ratio: 3.761; number of categories: 4; proportion of invariant sites: 0.740) was constructed to show the relationship between strain E22 and the 43 most closely related reference species. The alignment, substitution model and tree construction were all handled with the online Phylogeny.fr tool [4].
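
How the ITS1/ITS4 primer pair delimits an amplicon can be illustrated in silico. The snippet below is a minimal sketch, assuming exact primer matches on a single strand; the helper names and the toy template are hypothetical, and only the two primer sequences come from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

ITS1 = "TCCGTAGGTGAACCTGCGG"   # forward primer, from the text
ITS4 = "TCCTCCGCTTATTGATATGC"  # reverse primer, from the text


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def in_silico_amplicon(template: str, forward: str, reverse: str):
    """Return the template region spanned by the primer pair, or None.

    The forward primer is searched as written; the reverse primer binds
    the opposite strand, so its reverse complement is searched downstream
    of the forward site. Only exact matches are considered.
    """
    template = template.upper()
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical toy template: flank + ITS1 + insert + ITS4 site + flank.
toy_insert = "ACGTACGTAC"
toy_template = "GG" + ITS1 + toy_insert + reverse_complement(ITS4) + "TT"

amplicon = in_silico_amplicon(toy_template, ITS1, ITS4)
print(len(amplicon))
```

Real primer binding tolerates mismatches, so a practical check would use an alignment-based search rather than exact string matching.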

3.3. Reads pre-processing, genome assembly, quality assessment, and annotation

The raw reads were pre-processed with the Trimmomatic (v1.2.14) tool to trim low-quality bases and discard short reads (minimum length = 36), then assembled using SPAdes Genome Assembler software (v3.15.3). Assembly quality was assessed with QUAST (QUality ASsessment Tool, v4.4), and annotation was performed with DRAM (Distilled and Refined Annotation of Metabolism) software (v0.1.2). All software was used as implemented in the KBase web service [5].
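
The minimum-length step of the pre-processing can be sketched as a simple filter over FASTQ records. This is an illustration of the idea only, not Trimmomatic's implementation: it assumes plain four-line FASTQ records, ignores quality trimming and read pairing, and uses hypothetical function names and toy reads.

```python
import io

MIN_LENGTH = 36  # minimum read length stated in the text


def read_fastq(handle):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, not needed here
        quality = handle.readline().rstrip("\n")
        yield header, sequence, quality


def filter_by_length(records, min_length=MIN_LENGTH):
    """Keep only reads whose sequence is at least min_length bases long."""
    for header, sequence, quality in records:
        if len(sequence) >= min_length:
            yield header, sequence, quality


# Hypothetical toy input: one 40-base read and one 20-base read.
toy_fastq = "".join([
    "@read1\n", "A" * 40, "\n+\n", "I" * 40, "\n",
    "@read2\n", "C" * 20, "\n+\n", "I" * 20, "\n",
])
kept = [header for header, _, _ in filter_by_length(read_fastq(io.StringIO(toy_fastq)))]
print(kept)
```

For paired-end data, dropping one mate of a pair needs extra bookkeeping so that the two read files stay synchronised; that is outside the scope of this sketch.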

Limitations

Not applicable.

Ethics Statement

This work neither involves human subjects nor animal subjects. The authors declare that this manuscript is original work and has not been published elsewhere.

CRediT authorship contribution statement

Teoh Chui Peng: Formal analysis, Writing – original draft. Paris Lavin: Conceptualization, Supervision, Methodology, Writing – review & editing. Rómulo Oses Pedraza: Writing – review & editing. Natalia Fierro-Vásquez: Formal analysis, Writing – original draft. Cristina Purcarea: Writing – review & editing. Sheau Ting Yong: Data curation. Clemente M.V.L. Wong: Writing – review & editing, Conceptualization, Supervision, Methodology.

Acknowledgments

This work was supported by the INACH RT_20-19 Project (Instituto Antártico Chileno), Fondecyt Iniciación grant No. 11190754 (National Agency of Research and Development, ANID), Convenio Mineduc-UA ANT20992, funding granted by the Vice-Rector's Office for Research, Innovation and Postgraduate Studies at the University of Antofagasta, and Romanian Academy project RO1567-IBB05/2022.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data Availability

Draft genome of Penicillium sp. strain E22 (Original data) (OSF)

References

1. Visagie C.M., Houbraken J., Frisvad J.C., Hong S.-B., Klaassen C.H.W., Perrone G., Seifert K.A., Varga J., Yaguchi T., Samson R.A. Identification and nomenclature of the genus Penicillium. Stud. Mycol. 2014;78:343–371. doi: 10.1016/j.simyco.2014.09.001.
2. Kita D.M., Giovanella P., Yoshinaga T.T., Pellizzer E.P., Sette L.D. Antarctic fungi applied to textile dye bioremediation. Anais da Academia Brasileira de Ciências. 2022;94. doi: 10.1590/0001-3765202220210234.
3. Vaishnav N., Singh A., Adsul M., Dixit P., Sandhu S.K., Mathur A., Puri S.K., Singhania R.R. Penicillium: the next emerging champion for cellulase production. Bioresour. Technol. Rep. 2018;2:131–140. doi: 10.1016/j.biteb.2018.04.003.
4. Dereeper A., Guignon V., Blanc G., Audic S., Buffet S., Chevenet F., Dufayard J.F., Guindon S., Lefort V., Lescot M., Claverie J.M., Gascuel O. Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res. 2008;36(Web Server issue):W465–W469. doi: 10.1093/nar/gkn180.
5. Arkin A.P., Cottingham R.W., Henry C.S., Harris N.L., Stevens R.L., Maslov S., et al. KBase: the United States department of energy systems biology knowledgebase. Nat. Biotechnol. 2018;36:566. doi: 10.1038/nbt.4163.
