ABSTRACT
Saprochaete fungicola is an arthroconidial yeast classified in the Magnusiomyces/Saprochaete clade of the subphylum Saccharomycotina. Here, we report the genome sequence of holotype strain CBS 625.85, assembled to five putative chromosomes. The genome sequence is 20.2 Mbp long and codes for 6,138 predicted proteins.
ANNOUNCEMENT
Saprochaete fungicola is an anamorphic yeast that reproduces by fragmentation of hyphae into asexual spores called arthroconidia (1). Arthroconidia facilitate dissemination, and in some pathogenic fungi, their formation contributes to virulence (2). To provide a resource for studying the genetic control of arthroconidiogenesis, we determined the genome sequence of strain CBS 625.85, originally isolated from ascocarps of Nectria cinnabarina, a plant-pathogenic fungus causing coral spots (1). Genomic DNA was sequenced using a combination of the HiSeq 2000 (Illumina) and MinION (Oxford Nanopore Technologies) platforms. DNA was isolated using a standard protocol (3), and total cellular RNA was prepared from a culture grown in yeast extract-peptone-galactose (YPGal) medium (1% [wt/vol] yeast extract, 2% [wt/vol] peptone, and 2% [wt/vol] galactose) at 28°C using hot phenol extraction (4) and an RNeasy minikit (Qiagen).
In total, 3.0 Gbp (∼148× genome coverage) were sequenced in 309,350 long reads (mean, 9.7 kbp; longest read, 251 kbp) using a MinION Mk-1B device with an R9.4.1 flow cell and SQK-LSK109 kit. The paired-end (2 × 101 nucleotides [nt]) TruSeq PCR-free DNA library was sequenced on a HiSeq 2000 instrument at Macrogen Korea, yielding 40,313,550 short reads (4.1 Gbp; coverage, ∼203×). Finally, 38,086,316 transcriptome sequencing (RNA-Seq) reads were generated from a TruSeq nonstranded mRNA paired-end (2 × 101 nt) library using a NovaSeq 6000 instrument at Macrogen Korea.
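The coverage and mean read length figures above follow from simple arithmetic on the reported yields. The following is a minimal sketch of that check, assuming the 20.2-Mbp genome size given in the abstract; the function and variable names are illustrative and not part of the original analysis. Because the published figures are rounded, the results agree with the text only approximately.

```python
# Figures quoted in the text; the genome size is taken from the abstract.
GENOME_SIZE_BP = 20.2e6  # 20.2 Mbp


def coverage(total_bases: float, genome_size: float = GENOME_SIZE_BP) -> float:
    """Average sequencing depth: sequenced bases divided by genome length."""
    return total_bases / genome_size


def mean_read_length(total_bases: float, n_reads: int) -> float:
    """Mean read length in bases."""
    return total_bases / n_reads


# MinION: 3.0 Gbp in 309,350 reads.
minion_cov = coverage(3.0e9)
minion_mean = mean_read_length(3.0e9, 309_350)

# Illumina: 40,313,550 reads of 101 nt each, reported as 4.1 Gbp.
illumina_bases = 40_313_550 * 101
illumina_cov = coverage(4.1e9)

print(f"MinION coverage:   ~{minion_cov:.1f}x")
print(f"MinION mean read:  ~{minion_mean / 1000:.1f} kbp")
print(f"Illumina yield:    ~{illumina_bases / 1e9:.2f} Gbp")
print(f"Illumina coverage: ~{illumina_cov:.1f}x")
```

The small differences from the reported ∼148× and ∼203× are expected, since the yields and genome size used here are the rounded values from the text.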
Assembly with SPAdes v. 3.12.0 (5) resulted in 367 contigs (N50 value, 0.4 Mbp), and long-read assembly with miniasm v. 0.3-r179 (6) and minimap2 v. 2.13-r852 (7) polished by racon v. 1.3.1 (8) yielded 10 contigs (N50 value, 5.1 Mbp). Two contigs contained mitochondrial DNA (mtDNA), of which one complete copy (circular 33-kbp contig) was retained for the final assembly. Two contigs not supported by Illumina reads were discarded as contamination. One ribosomal DNA (rDNA) contig was discarded, and eight copies of the rDNA cluster were retained elsewhere. To further correct the long-read assembly, the rDNA cluster was polished separately by two iterations of pilon v. 1.21 (9) with BWA-MEM v. 0.7.17-r1188 (10), and four contig ends were extended using SPAdes contigs. Five local misassemblies, identified by a significant decrease in long-read coverage, were also corrected using the SPAdes assembly. All changes were based on reliable long-range overlaps and supported by multiple long reads. The final assembly contains five nuclear contigs of lengths 5.4, 5.1, 4.4, 3.2, and 2.1 Mbp, with an overall G+C content of 40.6%. These likely correspond to full-length chromosomes, since they terminate at both ends in putative telomeric repeats ([AACAG]_{0–1}A_{2–6}G_{0–1}A_{0–1}G_{4–7}, where each subscript gives the number of copies of the preceding element), with the predominating motif A_2GAG_6.
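Two quantities in the paragraph above can be illustrated in code: the N50 statistic and the telomeric repeat pattern. The sketch below is not the authors' pipeline. It computes N50 for the five final nuclear contigs (not for the intermediate 10-contig assembly whose N50 is reported above), and it translates the repeat notation into a regular expression under the assumption that the numbers following each element are copy-number ranges. The names n50, TELOMERE_UNIT, and is_telomere_unit are hypothetical.

```python
import re


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


# Final nuclear contig lengths from the text, in Mbp.
final_contigs_mbp = [5.4, 5.1, 4.4, 3.2, 2.1]
final_n50 = n50(final_contigs_mbp)

# Assumed reading of the notation: [AACAG] 0-1 times, A 2-6 times,
# G 0-1 times, A 0-1 times, G 4-7 times.
TELOMERE_UNIT = re.compile(r"(?:AACAG)?A{2,6}G?A?G{4,7}")


def is_telomere_unit(seq: str) -> bool:
    """True if seq is exactly one repeat unit under the assumed pattern."""
    return TELOMERE_UNIT.fullmatch(seq) is not None


predominant = "AA" + "G" + "A" + "G" * 6  # the predominating motif, A2 G A G6

print(final_n50)
print(is_telomere_unit(predominant))
print(is_telomere_unit("AACAG" + predominant))
print(is_telomere_unit("AAGAGGG"))  # only three trailing G, below the 4-7 range
```

To scan contig ends rather than test single units, `TELOMERE_UNIT.finditer` could be applied to the terminal few hundred bases of each contig and to their reverse complements.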
RNA-Seq reads, processed by Trimmomatic v. 0.36 (11), were assembled into transcripts by Trinity v. 2.8.4 (12) and aligned to the genome by blat v. 36x2 (13). Augustus v. 3.2.3 (14) trained on the related Magnusiomyces capitatus genome (15) with RNA-Seq evidence was used for initial gene predictions. Of these genes, 4,785 best supported by RNA-Seq transcripts (99% identity on 99% of length as identified by blat) were used to retrain Augustus parameters for S. fungicola and, together with the RNA-Seq evidence, to predict the final set of 6,138 protein-coding genes. The high-contiguity genome sequence of S. fungicola will be instrumental in comparative and functional studies focused on biology and evolution of arthroconidial yeasts.
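The "99% identity on 99% of length" criterion can be sketched as a filter over blat alignments in PSL format. The text does not state how identity and covered length were computed, so the definitions below are assumptions: identity is taken as matching bases over aligned bases (ignoring gaps), and coverage as the aligned query span over the query length. PslHit, parse_psl_line, and is_well_supported are hypothetical names.

```python
from typing import NamedTuple


class PslHit(NamedTuple):
    matches: int
    mismatches: int
    rep_matches: int
    q_name: str
    q_size: int
    q_start: int
    q_end: int


def parse_psl_line(line: str) -> PslHit:
    """Parse the fields used here from one tab-separated PSL alignment line."""
    f = line.rstrip("\n").split("\t")
    # PSL columns: 0 matches, 1 misMatches, 2 repMatches, 9 qName,
    # 10 qSize, 11 qStart, 12 qEnd.
    return PslHit(int(f[0]), int(f[1]), int(f[2]), f[9],
                  int(f[10]), int(f[11]), int(f[12]))


def is_well_supported(hit: PslHit,
                      min_identity: float = 0.99,
                      min_coverage: float = 0.99) -> bool:
    """Assumed criterion: identity and query coverage both at least 99%."""
    aligned = hit.matches + hit.mismatches + hit.rep_matches
    if aligned == 0 or hit.q_size == 0:
        return False
    identity = (hit.matches + hit.rep_matches) / aligned
    coverage = (hit.q_end - hit.q_start) / hit.q_size
    return identity >= min_identity and coverage >= min_coverage


# Two made-up alignment lines for illustration (not real data).
good = "995\t5\t0\t0\t0\t0\t2\t300\t+\ttx1\t1000\t0\t1000\tctg1\t5400000\t100\t1400\t3\t400,300,300,\t0,400,700,\t100,600,1100,"
partial = "900\t5\t0\t0\t0\t0\t0\t0\t+\ttx2\t1000\t0\t905\tctg1\t5400000\t2000\t2905\t1\t905,\t0,\t2000,"

good_ok = is_well_supported(parse_psl_line(good))
partial_ok = is_well_supported(parse_psl_line(partial))
print(good_ok, partial_ok)
```

Real PSL output from blat starts with a header unless the `-noHead` option is used, so header lines would need to be skipped before parsing.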
Data availability.
The genome assembly has been deposited in ENA under the accession number CAACAH010000000, and the Illumina, MinION, and RNA-Seq reads were deposited in SRA under the accession numbers ERR3046939, ERR3046967, and ERR3046965, respectively. The genome annotations are available through a genome browser at http://genome.compbio.fmph.uniba.sk/ and are also archived through Zenodo (16).
ACKNOWLEDGMENTS
This project was supported by grants from the Slovak Research and Development Agency (APVV-14-0253 to J.N. and 15-0022 to L.T.) and VEGA (1/0052/16 to L.T., 1/0684/16 to B.B., and 1/0458/18 to T.V.). The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
REFERENCES
1. de Hoog GS, Smith MT. 2011. Saprochaete Coker & Shanor ex D.T.S. Wagner & Dawes (1970), p 1317–1327. In Kurtzman CP, Fell JW, Boekhout T (eds), The yeasts: a taxonomic study, 5th ed, vol 2. Elsevier, London, United Kingdom. doi: 10.1016/B978-0-444-52149-1.00097-5.
2. Barrera CR, Szaniszlo PJ. 1985. Formation and germination of fungal arthroconidia. Crit Rev Microbiol 12:271–292. doi: 10.3109/10408418509104431.
3. Hodorova V, Lichancova H, Bujna D, Nebohacova M, Tomaska L, Brejova B, Vinar T, Nosek J. 2018. De novo sequencing and high-quality assembly of yeast genomes using a MinION device. London Calling, 24th–25th May 2018, London, United Kingdom: https://nanoporetech.com/resource-centre/de-novo-sequencing-and-high-quality-assembly-yeast-genomes-using-minion-device.
4. Collart MA, Oliviero S. 1993. Preparation of RNA. Curr Protoc Mol Biol 23:13.12.1–13.12.5. doi: 10.1002/0471142727.mb1312s23.
5. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455–477. doi: 10.1089/cmb.2012.0021.
6. Li H. 2016. Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences. Bioinformatics 32:2103–2110. doi: 10.1093/bioinformatics/btw152.
7. Li H. 2018. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34:3094–3100. doi: 10.1093/bioinformatics/bty191.
8. Vaser R, Sović I, Nagarajan N, Šikić M. 2017. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res 27:737–746. doi: 10.1101/gr.214270.116.
9. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM. 2014. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9:e112963. doi: 10.1371/journal.pone.0112963.
10. Li H, Durbin R. 2010. Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics 26:589–595. doi: 10.1093/bioinformatics/btp698.
11. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. doi: 10.1093/bioinformatics/btu170.
12. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A, Rhind N, di Palma F, Birren BW, Nusbaum C, Lindblad-Toh K, Friedman N, Regev A. 2011. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol 29:644–652. doi: 10.1038/nbt.1883.
13. Kent WJ. 2002. BLAT—the BLAST-like alignment tool. Genome Res 12:656–664. doi: 10.1101/gr.229202.
14. Stanke M, Schöffmann O, Morgenstern B, Waack S. 2006. Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. BMC Bioinformatics 7:62. doi: 10.1186/1471-2105-7-62.
15. Brejová B, Lichancová H, Brázdovič F, Hegedűsová E, Forgáčová Jakúbková M, Hodorová V, Džugasová V, Baláž A, Zeiselová L, Cillingová A, Neboháčová M, Raclavský V, Tomáška Ľ, Lang BF, Vinař T, Nosek J. 2019. Genome sequence of the opportunistic human pathogen Magnusiomyces capitatus. Curr Genet 65:539–560. doi: 10.1007/s00294-018-0904-y.
16. Brejová B, Lichancová H, Hodorová V, Neboháčová M, Tomáška Ľ, Vinař T, Nosek J. 2019. Annotations of sapFunA1 genome assembly (version v2) [data set]. Zenodo. doi: 10.5281/zenodo.2566797.