Table 4.
Gene Completeness Analysis
Dataset | Assembler | Complete (%) | Complete Single Copy (%) | Complete Duplicate (%) | Fragmented (%) | Missing (%) | Total BUSCO Groups |
---|---|---|---|---|---|---|---|
E. coli (ONT) | Canu | 4.1 | 4.1 | 0.0 | 16.8 | 79.1 | 440 |
E. coli (ONT) | Flye | NA | NA | NA | NA | NA | NA |
E. coli (ONT) | wtdbg2 | 1.8 | 1.8 | 0.0 | 9.1 | 89.1 | 440 |
E. coli (ONT) | miniasm | 3.0 | 3.0 | 0.0 | 18.0 | 79.0 | 440 |
E. coli (ONT) | minia | 99.8 | 99.3 | 0.5 | 0.2 | 0.0 | 440 |
E. coli (ONT) | SPAdes | 100.0 | 99.5 | 0.5 | 0.0 | 0.0 | 440 |
E. coli (ONT) | hybridSPAdes | 100.0 | 99.5 | 0.5 | 0.0 | 0.0 | 440 |
E. coli (ONT) | Unicycler | NA | NA | NA | NA | NA | NA |
E. coli (ONT) | DBG2OLC | 35.9 | 35.7 | 0.2 | 33.0 | 31.1 | 440 |
E. coli (ONT) | MaSuRCA | 99.7 | 98.6 | 1.1 | 0.0 | 0.3 | 440 |
E. coli (ONT) | Wengan | 100.0 | 99.5 | 0.5 | 0.0 | 0.0 | 440 |
E. coli (ONT) | HASLR | 97.8 | 97.3 | 0.5 | 1.6 | 0.6 | 440 |
Yeast (PacBio) | Canu | 96.6 | 94.8 | 1.8 | 0.2 | 3.2 | 2,137 |
Yeast (PacBio) | Flye | 94.6 | 93.0 | 1.6 | 0.1 | 5.3 | 2,137 |
Yeast (PacBio) | wtdbg2 | 88.4 | 86.8 | 1.6 | 0.8 | 10.8 | 2,137 |
Yeast (PacBio) | miniasm | 25.8 | 25.6 | 0.2 | 5.2 | 69.0 | 2,137 |
Yeast (PacBio) | minia | 96.3 | 94.9 | 1.4 | 0.1 | 3.6 | 2,137 |
Yeast (PacBio) | SPAdes | 96.3 | 94.5 | 1.8 | 0.2 | 3.5 | 2,137 |
Yeast (PacBio) | hybridSPAdes | 96.6 | 94.8 | 1.8 | 0.1 | 3.3 | 2,137 |
Yeast (PacBio) | Unicycler | 96.4 | 94.7 | 1.7 | 0.1 | 3.5 | 2,137 |
Yeast (PacBio) | DBG2OLC | 57.1 | 56.5 | 0.6 | 0.5 | 42.4 | 2,137 |
Yeast (PacBio) | MaSuRCA | 96.3 | 94.1 | 2.2 | 0.1 | 3.6 | 2,137 |
Yeast (PacBio) | Wengan | 96.5 | 94.9 | 1.6 | 0.0 | 3.5 | 2,137 |
Yeast (PacBio) | HASLR | 95.8 | 94.4 | 1.4 | 0.1 | 4.1 | 2,137 |
C. elegans (PacBio) | Canu | 97.4 | 96.8 | 0.6 | 1.1 | 1.5 | 3,131 |
C. elegans (PacBio) | Flye | 98.6 | 98.0 | 0.6 | 0.3 | 1.1 | 3,131 |
C. elegans (PacBio) | wtdbg2 | 97.1 | 96.5 | 0.6 | 1.3 | 1.6 | 3,131 |
C. elegans (PacBio) | miniasm | 83.2 | 82.8 | 0.4 | 6.5 | 10.3 | 3,131 |
C. elegans (PacBio) | minia | 80.4 | 79.9 | 0.5 | 9.0 | 10.6 | 3,131 |
C. elegans (PacBio) | SPAdes | 91.4 | 90.8 | 0.6 | 4.1 | 4.5 | 3,131 |
C. elegans (PacBio) | hybridSPAdes | 96.4 | 95.8 | 0.6 | 1.3 | 2.3 | 3,131 |
C. elegans (PacBio) | Unicycler | 97.7 | 97.1 | 0.6 | 0.7 | 1.6 | 3,131 |
C. elegans (PacBio) | DBG2OLC | 97.5 | 95.8 | 1.7 | 0.6 | 1.9 | 3,131 |
C. elegans (PacBio) | MaSuRCA | 95.5 | 94.1 | 1.4 | 0.4 | 4.1 | 3,131 |
C. elegans (PacBio) | Wengan | 91.6 | 91.1 | 0.5 | 0.9 | 7.5 | 3,131 |
C. elegans (PacBio) | HASLR | 97.1 | 96.7 | 0.4 | 0.8 | 2.1 | 3,131 |
Note: We used the enterobacterales_odb10, saccharomycetes_odb10, and nematoda_odb10 gene sets to assess the gene completeness of the E. coli, yeast, and C. elegans assemblies, respectively. We could not obtain gene completeness results for the human dataset because of time constraints.
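
The percentage columns in Table 4 are related by simple arithmetic: the Complete value is the sum of the single-copy and duplicate values, and Complete, Fragmented, and Missing together account for all BUSCO groups (100%). The following is a minimal sketch of how a reader might check rows for internal consistency. The function names, the handful of rows copied from the table, and the rounding tolerance are illustrative assumptions, not part of the original analysis.

```python
# Selected rows copied from Table 4:
# (dataset, assembler, complete, single_copy, duplicated, fragmented, missing)
ROWS = [
    ("E. coli (ONT)", "Canu", 4.1, 4.1, 0.0, 16.8, 79.1),
    ("E. coli (ONT)", "MaSuRCA", 99.7, 98.6, 1.1, 0.0, 0.3),
    ("Yeast (PacBio)", "Flye", 94.6, 93.0, 1.6, 0.1, 5.3),
    ("C. elegans (PacBio)", "DBG2OLC", 97.5, 95.8, 1.7, 0.6, 1.9),
]

# Assumed slack for values that were rounded to one decimal place.
TOLERANCE = 0.15


def check_row(row, tol=TOLERANCE):
    """Return a list of human-readable problems found in one table row."""
    _dataset, _assembler, complete, single, duplicated, fragmented, missing = row
    problems = []
    # Complete should equal single-copy plus duplicated.
    if abs(complete - (single + duplicated)) > tol:
        problems.append("complete != single-copy + duplicated")
    # The three top-level categories should cover all BUSCO groups.
    if abs(complete + fragmented + missing - 100.0) > tol:
        problems.append("complete + fragmented + missing != 100")
    return problems


def inconsistent_rows(rows, tol=TOLERANCE):
    """Map (dataset, assembler) to its problems, keeping only failing rows."""
    report = {}
    for row in rows:
        problems = check_row(row, tol)
        if problems:
            report[(row[0], row[1])] = problems
    return report


if __name__ == "__main__":
    # An empty dict means every sampled row is internally consistent.
    print(inconsistent_rows(ROWS))
```

Rows marked NA would need to be skipped before such a check, since they carry no numeric values.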