This is a preprint.

It has not yet been peer reviewed by a journal.

bioRxiv [Preprint]. 2023 Jun 6:2023.06.03.543588. [Version 1] doi: 10.1101/2023.06.03.543588

miniBUSCO: a faster and more accurate reimplementation of BUSCO

Neng Huang, Heng Li
PMCID: PMC10274665  PMID: 37333158

Abstract

Motivation

Evaluating the completeness of a genome assembly is critical for assessing the accuracy and reliability of genomic data. An incomplete assembly can lead to errors in gene prediction, annotation, and other downstream analyses. BUSCO is one of the most widely used tools for assessing assembly completeness: it checks for the presence of a set of single-copy orthologs conserved across a wide range of taxa. However, BUSCO's runtime can be long, particularly for large genome assemblies, which makes it hard for researchers to iterate quickly on assemblies or to analyze a large number of them.

Results

Here, we present miniBUSCO, an efficient tool for assessing the completeness of genome assemblies. miniBUSCO uses the protein-to-genome aligner miniprot together with BUSCO's datasets of conserved orthologous genes. In our evaluation on a real human assembly, miniBUSCO achieves a 14-fold speedup over BUSCO. Furthermore, miniBUSCO reports a completeness of 99.6%, which closely agrees with the annotation completeness of 99.5% for T2T-CHM13, whereas BUSCO reports 95.7%.
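
As an illustration of how a completeness percentage of this kind is derived, the following minimal Python sketch tallies per-gene classifications into summary percentages. It assumes the usual BUSCO reporting convention, in which "complete" is the sum of single-copy and duplicated genes. The status labels, the function name `completeness_summary`, and the toy input are hypothetical and do not reflect miniBUSCO's actual output format or the results reported above.

```python
from collections import Counter

# Hypothetical status labels for this sketch (not miniBUSCO's real output).
CATEGORIES = ("single", "duplicated", "fragmented", "missing")


def completeness_summary(statuses):
    """Return the percentage of genes in each category, plus 'complete'.

    'complete' follows the BUSCO convention: single-copy + duplicated.
    """
    total = len(statuses)
    if total == 0:
        raise ValueError("no genes to summarize")
    counts = Counter(statuses)
    pct = {name: 100.0 * counts.get(name, 0) / total for name in CATEGORIES}
    pct["complete"] = pct["single"] + pct["duplicated"]
    return pct


# Toy input: 10 orthologs with made-up classifications.
statuses = ["single"] * 7 + ["duplicated"] + ["fragmented"] + ["missing"]
summary = completeness_summary(statuses)
for name in ("complete",) + CATEGORIES:
    print(f"{name}: {summary[name]:.1f}%")
```

Under this convention, a gene found in fragments or not found at all lowers the completeness score, so an aligner that recovers more of the conserved orthologs yields a higher reported completeness for the same assembly.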

Availability

https://github.com/huangnengCSU/minibusco

Contact

hli@ds.dfci.harvard.edu

Supplementary information

Supplementary data are available at Bioinformatics online.

Full Text Availability

The license terms selected by the author(s) for this preprint version do not permit archiving in PMC. The full text is available from the preprint server.


Articles from bioRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints