2020 Oct 9;48(19):e114. doi: 10.1093/nar/gkaa829

Table 1.

Improvement in mappability of reads using DuploMap on multiple SMS whole-genome sequence datasets

Genome  Sequencing  Median    Reads     Read length  MM2 (%)           ΔMM2 + DuploMap (%)
        technology  coverage  analyzed  (N50)        MQ ≥ 10  MQ ≥ 20  MQ ≥ 10  MQ ≥ 20
HG002   PacBio CLR  45        878k      11,318       59.4     52.9     +8.4     +10.7
HG003   PacBio CLR  20        416k      10,999       59.9     53.5     +9.8     +11.3
HG004   PacBio CLR  19        362k      10,946       65.1     58.3     +8.7     +10.5
HG002   PacBio CCS  29        300k      13,480       65.7     58.9     +14.9    +19.5
HG005   PacBio CCS  32        454k      10,436       64.2     56.6     +15.8    +20.7
HG001   PacBio CCS  29        381k      10,004       71.6     63.7     +15.0    +21.2
HG001   ONT         36        535k      13,788       63.5     55.7     +3.9     +7.8
HG002   ONT         58        464k      54,352       64.5     58.0     −1.5     +1.7

The last four columns show the percentage of reads with high mapping quality (MQ ≥ 10 and MQ ≥ 20) that overlap Long-SegDups regions in the Minimap2 alignments, followed by the difference in that percentage between the Minimap2 + DuploMap alignments and the Minimap2 alignments. MM2 = Minimap2; MQ = mapping quality.
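
The Δ columns can be combined with the Minimap2 columns to recover the percentage of high-MQ reads after DuploMap. The sketch below does this arithmetic for the rows of Table 1. It assumes that Δ is expressed in percentage points (an absolute difference, not a relative change), which the caption suggests but does not state outright. The names `TABLE_1`, `with_duplomap`, and `combined_percentages` are hypothetical and used only for illustration.

```python
# Values copied from Table 1:
# (genome, technology, MM2 MQ>=10 %, MM2 MQ>=20 %, delta MQ>=10, delta MQ>=20)
TABLE_1 = [
    ("HG002", "PacBio CLR", 59.4, 52.9, 8.4, 10.7),
    ("HG003", "PacBio CLR", 59.9, 53.5, 9.8, 11.3),
    ("HG004", "PacBio CLR", 65.1, 58.3, 8.7, 10.5),
    ("HG002", "PacBio CCS", 65.7, 58.9, 14.9, 19.5),
    ("HG005", "PacBio CCS", 64.2, 56.6, 15.8, 20.7),
    ("HG001", "PacBio CCS", 71.6, 63.7, 15.0, 21.2),
    ("HG001", "ONT", 63.5, 55.7, 3.9, 7.8),
    ("HG002", "ONT", 64.5, 58.0, -1.5, 1.7),
]


def with_duplomap(mm2_pct, delta_pct):
    """Minimap2 + DuploMap percentage, ASSUMING delta is in percentage points."""
    return round(mm2_pct + delta_pct, 1)


def combined_percentages(rows):
    """Return (genome, technology, MQ>=10 %, MQ>=20 %) after adding the deltas."""
    return [
        (genome, tech, with_duplomap(mq10, d10), with_duplomap(mq20, d20))
        for genome, tech, mq10, mq20, d10, d20 in rows
    ]


if __name__ == "__main__":
    for genome, tech, mq10, mq20 in combined_percentages(TABLE_1):
        print(f"{genome:<6} {tech:<11} MQ>=10: {mq10:5.1f}%  MQ>=20: {mq20:5.1f}%")
```

Under this assumption, the only negative entry (HG002 ONT at MQ ≥ 10) corresponds to a slightly lower percentage after DuploMap than with Minimap2 alone, while every other entry corresponds to an increase.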