Table 2.
| Assembly | Total (Mb) | # Contigs | Max (Mb) | NG50 (Mb) | QV (Meryl) | QV (Mash) | QV (Map.) | k-mer compl. All (%) | k-mer compl. Hap1 (%) | k-mer compl. Hap2 (%) | PPV | BUSCO Comp. (%) | BUSCO Dup. (%) | BUSCO Frag. (%) | BUSCO Mis. (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A. thaliana Col x Cvi F1 | | | | | | | | | | | | | | | |
| TrioCanu | | | | | | | | | | | | | | | |
| Col | 124 | 215 | 13.1 | 4.6 | 35.1 | 34.6 | 37.5 | 83.8 | 99.5 | 0.7 | 99.3 | 98.2 | 1.6 | 0.5 | 1.3 |
| Cvi | 122 | 163 | 12.2 | 5.6 | 36.4 | 35.9 | 38.6 | 83.6 | 1.1 | 99.4 | 98.9 | 98.0 | 1.4 | 0.4 | 1.6 |
| Col + Cvi | 246 | 378 | 13.1 | 5.5 | 35.7 | 33.0 | 38.0 | 98.3 | 99.7 | 99.4 | 99.1 | 98.2 | 78.8 | 0.2 | 1.6 |
| FALCON-Unzip | | | | | | | | | | | | | | | |
| pri | 140 | 172 | 13.3 | 8.0 | 34.8 | 33.9 | 36.9 | 87.1 | 65.2 | 59.8 | N/A | 98.1 | 6.2 | 0.3 | 1.6 |
| alt | 105 | 248 | 11.6 | 4.3 | 38.3 | 37.9 | 39.9 | 74.5 | 38.2 | 40.6 | N/A | 93.1 | 2.0 | 0.3 | 6.6 |
| pri + alt | 245 | 420 | 13.3 | 6.9 | 36.0 | 33.5 | 37.9 | 97.8 | 99.1 | 98.2 | N/A | 98.1 | 93.2 | 0.3 | 1.6 |
| Canu | 248 | 2368 | 6.9 | 2.3 | 29.3 | 26.8 | 27.2 | 95.7 | 90.5 | 90.8 | N/A | 97.6 | 61.5 | 0.4 | 2.0 |
| H. sapiens NA12878 | | | | | | | | | | | | | | | |
| TrioCanu | | | | | | | | | | | | | | | |
| mat | 2749 | 7388 | 9.0 | 1.1 | 31.3 | 30.4 | 34.1 | 94.0 | 90.7 | 0.9 | 99.1 | 86.5 | 0.8 | 6.9 | 6.6 |
| pat | 2743 | 7252 | 11.5 | 1.1 | 31.0 | 30.1 | 34.1 | 93.7 | 1.0 | 91.0 | 98.8 | 85.1 | 0.7 | 7.8 | 7.1 |
| mat + pat | 5492 | 14,640 | 11.5 | 1.1 | 31.1 | 27.5 | 34.1 | 98.2 | 90.9 | 91.2 | 99.0 | 90.0 | 47.3 | 5.0 | 5.0 |
All lengths in the continuity statistics are in Mbp. Haploid genome sizes of 130 Mbp and 3.2 Gbp were used for NG50 in the A. thaliana and H. sapiens NA12878 haploid assemblies, respectively, and twice the haploid genome size was used for the combined assemblies. Merqury-specific column headers are in bold, including k-mer-based quality (QV) and completeness estimates. Merqury includes both exact (Meryl) and approximate (Mash) methods for measuring k-mer QV, while completeness uses only the exact k-mer counting method. Consensus QV scores are Phred-scaled, where $\mathrm{QV} = -10 \log_{10} E$ for a probability of error $E$ at each base in the assembly. K-mer completeness is measured as the fraction of all distinct, reliable k-mers (All) and of haplotype-specific k-mers (Hap1, Hap2) recovered in the assembly. Hap1 and Hap2 correspond to Col and Cvi in A. thaliana, and to maternal and paternal in NA12878. BUSCO v3 was run with embryophyta_odb9 (n = 1440) for A. thaliana and mammalia_odb9 (n = 4104) for NA12878.
Map., mapping; Comp., complete; Dup., duplicated; Frag., fragmented; Mis., missing.
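
As a reading aid for the QV and NG50 columns, the following is a minimal sketch of the two definitions the caption relies on: the Phred scaling QV = −10 log10 E, and NG50 computed against a fixed genome size rather than the assembly size. The function names (`qv_from_error`, `error_from_qv`, `ng50`) and the toy contig lengths are illustrative assumptions; this is not Merqury's code, and it does not reproduce how Merqury estimates E from k-mers.

```python
import math


def qv_from_error(error_rate):
    """Phred-scaled quality for a per-base error probability E: QV = -10 * log10(E)."""
    if not 0.0 < error_rate <= 1.0:
        raise ValueError("error_rate must be in (0, 1]")
    return -10.0 * math.log10(error_rate)


def error_from_qv(qv):
    """Inverse of the Phred scaling: E = 10 ** (-QV / 10)."""
    return 10.0 ** (-qv / 10.0)


def ng50(contig_lengths, genome_size):
    """Length of the contig at which the cumulative length of contigs,
    sorted longest first, reaches half of the given genome size.
    Returns None if the assembly never covers half of the genome size."""
    half = genome_size / 2
    running_total = 0
    for length in sorted(contig_lengths, reverse=True):
        running_total += length
        if running_total >= half:
            return length
    return None


if __name__ == "__main__":
    # A QV of 35.1 (the Meryl value for the Col assembly in the table)
    # corresponds to an error probability of roughly 3.1e-4 per base.
    print(f"E at QV 35.1: {error_from_qv(35.1):.2e}")

    # Toy contig lengths (hypothetical, not taken from the table).
    toy_contigs = [50, 40, 30, 20, 10]
    print("NG50 with genome size 200:", ng50(toy_contigs, 200))
```

Under this scaling, a QV of 35.1 works out to about one error every 3,200 bases, and each 10-point increase in QV corresponds to a tenfold drop in the error rate. Because NG50 is tied to a fixed genome size, doubling that size for the combined assemblies (as the caption describes) keeps the haploid and combined rows comparable.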