Draft Genome Sequence of the Yeast Blastobotrys aristata Strain UCD613, Isolated from Soil in Ireland

Aida Don; Kevin P Byrne; Fouz Alqaderi; Alexandra Mazhova; Ben O’Leary Chaney; Sarah Redmond; Stephen Allen; Jiaran Gao; Eoin Ó Cinnéide; Sean A Bergin; Conor Hession; Kenneth H Wolfe; Geraldine Butler

Microbiol Resour Announc. 2022 Oct 12;11(11):e00957-22. doi: 10.1128/mra.00957-22

Editor: Antonis Rokas

PMCID: PMC9671022 PMID: 36222698

ABSTRACT

Blastobotrys aristata is a member of the Trichomonascaceae family in the order Saccharomycetales. Here, we present the genome sequence of B. aristata UCD613, which was isolated from soil in Dublin, Ireland. The genome is 13.3 Mb and was assembled into 4 chromosome-size scaffolds of >2.2 Mb each plus a mitochondrial genome scaffold.

ANNOUNCEMENT

Blastobotrys aristata Marvanova 1976 (also known as B. aristatus) was first isolated from moldy plaster in the former Czechoslovakia in 1976 (1). We identified isolate B. aristata UCD613 from soil collected from the campus of University College Dublin (GPS coordinates 53.3034961, −6.2131910). Soil material was passaged twice in 9 mL liquid yeast extract-peptone-dextrose (YPD) medium containing chloramphenicol (30 μg/mL) and ampicillin (100 μg/mL) and then cultured on YPD plates at 30°C.

The species was identified from single colonies by PCR amplification and Sanger sequencing of the internal transcribed spacer (ITS) (OP221981) and D1/D2 (OP221771) regions of its ribosomal DNA (rDNA) locus. The D1/D2 region was 100% identical to that of the type strain of B. aristata (2) (DQ442686.1). No other ITS sequence is available.

For short-read sequencing, total genomic DNA was extracted from a YPD culture using phenol-chloroform-isoamyl alcohol and dissolved in 150 μL water (3). Libraries were generated and sequenced by BGI Tech Solutions (Hong Kong). One microgram of DNA was fragmented using Covaris, size selected (200 to 400 bp) using magnetic beads, end repaired, and 3′ adenylated, and primers were ligated. Fragments were amplified by PCR, heat denatured, and circularized using the splint oligonucleotide sequence. The library was amplified with ϕ29 DNA polymerase to make DNA nanoballs (DNBs). The DNBs were loaded on a patterned nanoarray, and 150 bases were sequenced from each end using combinatorial probe-anchor synthesis (cPAS) on a DNBSeq-G400, yielding ~6.1 million read pairs. Default parameters were used unless noted. Adapters and low-quality reads were removed first using SOAPnuke (4) and subsequently using Skewer v.0.2.2 (5).

For long-read sequencing, genomic DNA was prepared using a Genomic Tip 100G kit (Qiagen). Two libraries were generated using the SQK-RBK004 kit from Oxford Nanopore Technologies (ONT) and cleaned with AMPure XP magnetic beads. Libraries were sequenced on primed R9.4.1 flow cells using MinKNOW v.4.1.22 on a MinION device. For run 1, raw data were base called using Guppy v.4.2.2 +effbaf8 (using the fast model [dna_r9.4.1_450bps_fast.cfg]) (ONT) and demultiplexed using qcat v.1.1.0 (ONT) with default settings. For run 2, Guppy v.4.2.2 +effbaf8 was used for both base calling and demultiplexing. The two sets of reads were concatenated for downstream processing. NanoFilt v.2.3.0 (6) was used to select reads (minimum quality, ≥7; minimum length, ≥1,000 bp), which retained 107,000 reads with an N₅₀ of 6,639 bp.
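The long-read selection criteria (minimum quality, ≥7; minimum length, ≥1,000 bp) and the read N₅₀ can be illustrated with a minimal Python sketch. This is not NanoFilt's code: the function names are hypothetical, Phred+33 quality encoding is assumed, and the mean quality is assumed to be averaged over per-base error probabilities rather than over raw Phred scores.

```python
import math


def mean_read_quality(quality_string, phred_offset=33):
    """Mean Phred quality of one read, averaged in error-probability space.

    Assumption: FASTQ qualities are Phred+33 encoded.
    """
    if not quality_string:
        raise ValueError("empty quality string")
    error_probs = [10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality_string]
    mean_error = sum(error_probs) / len(error_probs)
    return -10 * math.log10(mean_error)


def passes_filter(sequence, quality_string, min_quality=7.0, min_length=1000):
    """Return True if a read meets both thresholds stated in the text."""
    if len(sequence) < min_length:
        return False
    return mean_read_quality(quality_string) >= min_quality


def read_n50(lengths):
    """Length L such that reads of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # no reads supplied
```

Averaging error probabilities rather than Phred scores means a few very poor bases lower the mean more than a plain arithmetic mean of the scores would.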

The genome was assembled from the long reads using Canu v.2.2 (7), followed by five rounds of error correction with the DNBseq short reads using NextPolish (8). Five contigs of <45 kb (corresponding to rDNA and parts of the mitochondrial genome) were removed, leaving 4 chromosome-size contigs of >2.2 Mb in size and a circular mitochondrial genome (48,582 bp, manually edited; accession no. OX291664.1). The total size of the genome is 13.3 Mb, the N₅₀ value is 3.5 Mb, the L₅₀ value is 2 contigs, and the G+C content is 48%. The largest contig is 4.2 Mb. Using BUSCO v.5.1.2, genome completeness was estimated at 94.8% (compared to the Ascomycota lineage data set).
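The N₅₀, L₅₀, and G+C statistics reported above follow standard definitions, sketched below in Python. The function names and the input values in the usage note are illustrative assumptions, not the actual contig lengths or the software used for this assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig lengths.

    N50 is the length of the contig at which the cumulative size, taken from
    the largest contig downward, first reaches half of the total assembly
    size. L50 is the number of contigs needed to reach that point.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running * 2 >= total:
            return length, count
    raise ValueError("no contigs supplied")


def gc_fraction(sequences):
    """G+C fraction over all unambiguous bases (A, C, G, T) in the sequences."""
    gc = 0
    acgt = 0
    for seq in sequences:
        upper = seq.upper()
        gc += upper.count("G") + upper.count("C")
        acgt += sum(upper.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("no unambiguous bases supplied")
    return gc / acgt
```

For example, with toy contig lengths of 40, 30, 20, and 10, `n50_l50` returns `(30, 2)`: the two largest contigs together hold at least half of the total length. Ambiguous bases such as N are excluded from the G+C denominator in this sketch.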

Data availability.

This whole-genome shotgun project has been deposited at DDBJ/ENA/GenBank (BioProject no. PRJEB55420). The version described in this paper is version 1. The raw reads were deposited at SRA (accession no. ERX9629577, ERX9629578, and ERX9629579). The ITS sequence is at OP221981 and the D1/D2 region sequence at OP221771.

ACKNOWLEDGMENTS

This work was supported by undergraduate teaching resources from University College Dublin and by Science Foundation Ireland (20/FFP-A/8795). The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Geraldine Butler, Email: gbutler@ucd.ie.

Antonis Rokas, Vanderbilt University.

REFERENCES

1. Marvanova L. 1976. Two new Blastobotrys species. Trans Br Mycol Soc 66:217–222. doi: 10.1016/S0007-1536(76)80049-6.
2. Kurtzman CP, Robnett CJ. 1995. Molecular relationships among hyphal ascomycetous yeasts and yeastlike taxa. Can J Bot 75:S1.
3. Dymond JS. 2013. Preparation of genomic DNA from Saccharomyces cerevisiae. Methods Enzymol 529:153–160. doi: 10.1016/B978-0-12-418687-3.00012-4.
4. Chen Y, Chen Y, Shi C, Huang Z, Zhang Y, Li S, Li Y, Ye J, Yu C, Li Z, Zhang X, Wang J, Yang H, Fang L, Chen Q. 2018. SOAPnuke: a MapReduce acceleration-supported software for integrated quality control and preprocessing of high-throughput sequencing data. Gigascience 7:1–6. doi: 10.1093/gigascience/gix120.
5. Jiang H, Lei R, Ding S-W, Zhu S. 2014. Skewer: a fast and accurate adapter trimmer for next-generation sequencing paired-end reads. BMC Bioinformatics 15:182. doi: 10.1186/1471-2105-15-182.
6. De Coster W, D'Hert S, Schultz DT, Cruts M, Van Broeckhoven C. 2018. NanoPack: visualizing and processing long-read sequencing data. Bioinformatics 34:2666–2669. doi: 10.1093/bioinformatics/bty149.
7. Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. 2017. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res 27:722–736. doi: 10.1101/gr.215087.116.
8. Chen Z, Erickson DL, Meng J. 2021. Polishing the Oxford Nanopore long-read assemblies of bacterial pathogens with Illumina short reads to improve genomic analyses. Genomics 113:1366–1377. doi: 10.1016/j.ygeno.2021.03.018.
