2018 May 5;7(5):giy048. doi: 10.1093/gigascience/giy048

Table 3: Hybrid and long-read assemblies of NA12878

| Category | Metric | Discovar | Dfs | 10X | Dovetail | Canu-p | MaSuRCA |
|---|---|---|---|---|---|---|---|
| Assembly statistics | Number | 37 393 | 7 323 | 9 926 | 9 463 | 2 337 | 4 885 |
| | Min. (bp) | 2 000 | 2 000 | 2 000 | 2 000 | 2 981 | 4 103 |
| | Max. (bp) | 1 380 479 | 30 548 185 | 69 726 354 | 95 295 052 | 50 410 306 | 9 066 374 |
| | N50 (bp) | 202 174 | 6 445 123 | 16 305 019 | 24 472 662 | 7 667 013 | 1 695 766 |
| | Size (bp) | 2 794 627 041 | 2 884 349 664 | 2 835 096 130 | 2 800 321 128 | 2 866 880 913 | 2 849 443 591 |
| | Long-read coverage | - | 7X | - | - | 35X | 7X |
| 1-to-1 alignments | Length | 2 793 980 166 | 2 797 898 328 | 2 778 947 064 | 2 799 630 879 | 2 811 439 829 | 2 845 550 340 |
| | Identity | 99.8 | 99.8 | 99.79 | 99.8 | 99.28 | 99.67 |
| | % Ref covered | 90.16 | 90.29 | 89.68 | 90.35 | 90.73 | 91.83 |
| Breakpoints: Relocations | Number | 120 | 1 151 | 688 | 997 | 501 | 374 |
| | Bases (Mb) | 0.361 | 5.604 | 4.810 | 0.582 | 2.281 | 2.071 |
| Breakpoints: Translocations | Number | 373 | 1 856 | 883 | 976 | 1 082 | 941 |
| | Bases (Mb) | 4.840 | 11.279 | 7.838 | 6.576 | 13.781 | 13.933 |
| Breakpoints: Inversions | Number | 53 | 768 | 871 | 2 813 | 299 | 240 |
| | Bases (Mb) | 0.151 | 3.886 | 7.273 | 0.736 | 2.903 | 3.008 |
| Breakpoints: Total | Number | 546 | 3 775 | 2 442 | 4 786 | 1 882 | 1 555 |
| | Bases (Mb) | 5.353 | 20.769 | 19.921 | 7.894 | 18.964 | 19.012 |
| | %1-to-1 | 0.192 | 0.742 | 0.717 | 0.282 | 0.675 | 0.668 |

Assembly statistics: Number - number of contigs/scaffolds assembled; Max/Min - the maximum/minimum contig/scaffold size in base pairs; N50 - the length of the contig/scaffold at which the cumulative length of contigs/scaffolds, sorted in descending order by length, reaches half of the assembly size; Size - total size of the assembly in base pairs.

1-to-1 alignments: Length - total length of nonrepetitive alignments between the assembly and GRCh38.p10 detected by NUCmer; Identity - average identity between the assembly and GRCh38.p10, computed from the 1-to-1 alignments; % Ref covered - percentage of GRCh38.p10 that is covered by 1-to-1 alignments, with the length of the reference set to 3.1 Gb.

Breakpoints: structural errors were obtained from the 1-to-1 alignments and are reported using the GAGE metrics (relocations, translocations, and inversions); Number - number of breakpoints of each type; Bases (Mb) - total number of bases involved in breakpoints, in megabases, extracted from the dnadiff report (qdiff file); %1-to-1 - bases involved in structural errors as a percentage of the total 1-to-1 alignment length.

Public NA12878 assemblies were downloaded and used for validation and comparisons against the DFS hybrid assembly pipeline.
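The N50 and %1-to-1 definitions in the legend can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `n50` and `pct_structural_errors` are hypothetical, and the sketch assumes that the Bases (Mb) row uses 1 Mb = 1 000 000 bp and that %1-to-1 divides the total breakpoint bases by the 1-to-1 alignment length.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig/scaffold lengths (bp).

    Lengths are sorted in descending order and accumulated; the N50 is the
    length of the contig/scaffold at which the running sum first reaches
    half of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one contig/scaffold length")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise AssertionError("unreachable for non-negative lengths")


def pct_structural_errors(breakpoint_mb: float, aligned_bp: int) -> float:
    """Breakpoint bases (given in Mb) as a percentage of the 1-to-1 length (bp).

    Assumption: 1 Mb = 1_000_000 bp.
    """
    return 100.0 * breakpoint_mb * 1_000_000 / aligned_bp


if __name__ == "__main__":
    # Toy assembly: total 20 bp, half is 10 bp; 6 + 5 = 11 reaches it.
    print(n50([2, 3, 4, 5, 6]))  # 5
    # Discovar column: 5.353 Mb of breakpoints over 2 793 980 166 aligned bp.
    print(round(pct_structural_errors(5.353, 2_793_980_166), 3))  # 0.192
```

Under these assumptions, the Discovar totals reproduce the 0.192 reported in the %1-to-1 row, and the same calculation on the Dfs column (20.769 Mb over 2 797 898 328 bp) gives 0.742.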