Genome Biol. 2020 Sep 14;21:245. doi: 10.1186/s13059-020-02134-9

Table 2.

Merqury quality and completeness statistics for Arabidopsis thaliana and human genome assemblies

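| Sample | Assembler | Assembly | Total (Mb) | # Contigs | Max (Mb) | NG50 (Mb) | QV Meryl | QV Mash | QV Map. | k-mer compl. All (%) | k-mer compl. Hap1 (%) | k-mer compl. Hap2 (%) | PPV | BUSCO Comp. (%) | BUSCO Dup. (%) | BUSCO Frag. (%) | BUSCO Mis. (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A. thaliana Col x Cvi F1 | TrioCanu | Col | 124 | 215 | 13.1 | 4.6 | 35.1 | 34.6 | 37.5 | 83.8 | 99.5 | 0.7 | 99.3 | 98.2 | 1.6 | 0.5 | 1.3 |
| A. thaliana Col x Cvi F1 | TrioCanu | Cvi | 122 | 163 | 12.2 | 5.6 | 36.4 | 35.9 | 38.6 | 83.6 | 1.1 | 99.4 | 98.9 | 98.0 | 1.4 | 0.4 | 1.6 |
| A. thaliana Col x Cvi F1 | TrioCanu | Col + Cvi | 246 | 378 | 13.1 | 5.5 | 35.7 | 33.0 | 38.0 | 98.3 | 99.7 | 99.4 | 99.1 | 98.2 | 78.8 | 0.2 | 1.6 |
| A. thaliana Col x Cvi F1 | FALCON-Unzip | pri | 140 | 172 | 13.3 | 8.0 | 34.8 | 33.9 | 36.9 | 87.1 | 65.2 | 59.8 | N/A | 98.1 | 6.2 | 0.3 | 1.6 |
| A. thaliana Col x Cvi F1 | FALCON-Unzip | alt | 105 | 248 | 11.6 | 4.3 | 38.3 | 37.9 | 39.9 | 74.5 | 38.2 | 40.6 | N/A | 93.1 | 2.0 | 0.3 | 6.6 |
| A. thaliana Col x Cvi F1 | FALCON-Unzip | pri + alt | 245 | 420 | 13.3 | 6.9 | 36.0 | 33.5 | 37.9 | 97.8 | 99.1 | 98.2 | N/A | 98.1 | 93.2 | 0.3 | 1.6 |
| A. thaliana Col x Cvi F1 | Canu | | 248 | 2368 | 6.9 | 2.3 | 29.3 | 26.8 | 27.2 | 95.7 | 90.5 | 90.8 | N/A | 97.6 | 61.5 | 0.4 | 2.0 |
| H. sapiens NA12878 | TrioCanu | mat | 2749 | 7388 | 9.0 | 1.1 | 31.3 | 30.4 | 34.1 | 94.0 | 90.7 | 0.9 | 99.1 | 86.5 | 0.8 | 6.9 | 6.6 |
| H. sapiens NA12878 | TrioCanu | pat | 2743 | 7252 | 11.5 | 1.1 | 31.0 | 30.1 | 34.1 | 93.7 | 1.0 | 91.0 | 98.8 | 85.1 | 0.7 | 7.8 | 7.1 |
| H. sapiens NA12878 | TrioCanu | mat + pat | 5492 | 14,640 | 11.5 | 1.1 | 31.1 | 27.5 | 34.1 | 98.2 | 90.9 | 91.2 | 99.0 | 90.0 | 47.3 | 5.0 | 5.0 |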

All bases in the continuity stats are in Mbp. Haploid genome sizes of 130 Mbp and 3.2 Gbp were used for NG50 in the A. thaliana and H. sapiens NA12878 haploid assemblies, respectively, with twice the haploid genome size for combined assemblies. Merqury-specific column headers are in bold, including k-mer-based quality (QV) and completeness estimates. Merqury includes both exact (Meryl) and approximate (Mash) methods for measuring k-mer QV, while completeness uses only the exact k-mer counting method. Consensus QV scores are Phred-scaled, where $\mathrm{QV} = -10 \log_{10} E$ for a probability of error $E$ at each base in the assembly. K-mer completeness is measured by the fraction of all distinct, reliable k-mers (All) and of haplotype-specific k-mers (Hap1, Hap2) that are found in the assembly. Hap1 and Hap2 correspond to Col and Cvi in A. thaliana, and to maternal and paternal in NA12878. BUSCO v3 was run with embryophyta_odb9 (n = 1440) for A. thaliana and mammalia_odb9 (n = 4104) for NA12878.

Abbreviations: Map., mapping; Comp., complete; Dup., duplicated; Frag., fragmented; Mis., missing
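
To make the QV and NG50 columns easier to interpret, the following is a minimal Python sketch of the two calculations the table notes refer to: the Phred-scaled relation between QV and per-base error probability, and NG50 computed against a fixed genome size. It is not Merqury's code. The function names `phred_qv`, `error_rate_from_qv`, and `ng50` are hypothetical, and the NG50 routine uses the standard definition (the contig length at which the cumulative length of contigs, sorted longest first, reaches half of the assumed genome size).

```python
import math


def phred_qv(error_rate: float) -> float:
    """Phred-scaled quality for a per-base error probability E: QV = -10 * log10(E)."""
    if not 0.0 < error_rate <= 1.0:
        raise ValueError("error_rate must be in (0, 1]")
    return -10.0 * math.log10(error_rate)


def error_rate_from_qv(qv: float) -> float:
    """Inverse of phred_qv: E = 10 ** (-QV / 10)."""
    return 10.0 ** (-qv / 10.0)


def ng50(contig_lengths, genome_size):
    """Standard NG50 (hypothetical helper, not taken from Merqury).

    Contigs are sorted longest first; the result is the length of the contig
    at which the running total first reaches half of genome_size.
    Returns 0 if the assembly covers less than half of genome_size.
    """
    half = genome_size / 2
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0


if __name__ == "__main__":
    # QV 35.1 (the Meryl QV reported for the TrioCanu Col assembly) as an error rate.
    e = error_rate_from_qv(35.1)
    print(f"QV 35.1 -> error rate {e:.2e}")

    # Toy contig lengths in Mb with an assumed genome size of 200 Mb.
    print(ng50([50, 40, 30, 20, 10], 200))
```

Because NG50 is normalized by an assumed genome size rather than by the assembly's own total length, the same contig set yields a different value when the genome size changes, which is why the notes specify the sizes used for the haploid and combined assemblies.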