2019 Aug 21;15(8):e1007273. doi: 10.1371/journal.pcbi.1007273

Table 2. Scaffold and correctness statistics for NA12878 assemblies scaffolded with different Hi-C libraries.

“True links” is an idealized case in which the Hi-C links were filtered in advance. The NG50 of the human reference GRCh38 is 145 Mbp. The ratio between NG50 and NGA50 indicates how strongly erroneous joins affect the large scaffolds in an assembly: the larger the gap between the two values, the more aggressive the scaffolding was at the expense of accuracy. Longest chunk is the longest error-free portion of the scaffolds. We observed that the 3D-DNA mis-assembly detection was overly aggressive in some cases, so we ran some assemblies both with (w) and without (wo) this feature. With the Illumina assembly as input, 3D-DNA with correction did not finish within two weeks and is omitted (NA). An evaluation of a previously published [20] 3D-DNA assembly from short-read contigs is included in S3 Table; it did not exceed SALSA2’s NGA50.

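| Dataset | Method | NG50 (Mbp) | NGA50 (Mbp) | Longest Chunk (Mbp) | Orientation Errors | Ordering Errors | Chimeric Errors |
|---|---|---|---|---|---|---|---|
| Arima-HiC | SALSA2 true links | 83.31 | 79.48 | 172.19 | 78 | 101 | 0 |
| Arima-HiC | SALSA2 w graph | 112.08 | 71.54 | 164.46 | 102 | 106 | 90 |
| Arima-HiC | SALSA2 wo graph | 118.42 | 58.81 | 155.68 | 148 | 112 | 135 |
| Arima-HiC | 3D-DNA | 90.15 | 22.44 | 89.46 | 182 | 133 | 115 |
| Arima-HiC | SALSA1 | 19.09 | 14.81 | 73.14 | 99 | 176 | 96 |
| Mitotic Hi-C | SALSA2 w graph | 61.97 | 24.18 | 145.53 | 81 | 54 | 19 |
| Mitotic Hi-C | SALSA1 | 27.88 | 15.62 | 85.71 | 142 | 78 | 120 |
| Mitotic Hi-C | 3D-DNA w correction | 0.199 | 0.627 | 2.24 | 9775 | 10563 | 6159 |
| Mitotic Hi-C | 3D-DNA wo correction | 85.56 | 17.18 | 70.18 | 250 | 215 | 164 |
| Chicago | SALSA2 w graph | 5.80 | 4.54 | 34.60 | 46 | 60 | 98 |
| Chicago | SALSA1 | 5.21 | 3.94 | 34.60 | 83 | 21 | 187 |
| Chicago | 3D-DNA w correction | 3.63 | 2.74 | 18.62 | 63 | 69 | 324 |
| Chicago | 3D-DNA wo correction | 9.61 | 4.76 | 44.48 | 67 | 63 | 137 |
| Illumina Assembly | SALSA2 w graph | 96.78 | 7.99 | 43.56 | 1830 | 2299 | 635 |
| Illumina Assembly | SALSA2 wo graph | 119.57 | 4.16 | 26.22 | 2225 | 2353 | 738 |
| Illumina Assembly | 3D-DNA w correction | NA | NA | NA | NA | NA | NA |
| Illumina Assembly | 3D-DNA wo correction | 176.09 | 1.00 | 13.12 | 5935 | 3433 | 2119 |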
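
The comparison of NG50 and NGA50 described in the table legend can be made concrete with a short sketch. It is an illustration only, not the evaluation pipeline used for Table 2. The function names `ng50`, `nga50_ratio` and `rank_by_ratio` are hypothetical, and expressing the ratio as NGA50/NG50 (so that values near 1 mean few erroneous joins in large scaffolds) is an assumption about how to read the legend. `ng50` uses the standard definition: the length L such that scaffolds of length at least L together cover half of the genome size. The NG50 and NGA50 values are copied from the Arima-HiC rows above.

```python
def ng50(lengths, genome_size):
    """Return the length L such that pieces of length >= L sum to at least
    half of genome_size. Returns 0 if the pieces cover less than half."""
    half = genome_size / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0


def nga50_ratio(ng50_mbp, nga50_mbp):
    """NGA50 / NG50 (assumed convention): close to 1 means erroneous joins
    barely affect the large scaffolds; close to 0 means aggressive joining."""
    return nga50_mbp / ng50_mbp


def rank_by_ratio(rows):
    """Order method names from highest to lowest NGA50/NG50 ratio."""
    return sorted(rows, key=lambda m: nga50_ratio(*rows[m]), reverse=True)


# (NG50, NGA50) in Mbp, copied from the Arima-HiC rows of Table 2.
ARIMA_ROWS = {
    "SALSA2 true links": (83.31, 79.48),
    "SALSA2 w graph": (112.08, 71.54),
    "SALSA2 wo graph": (118.42, 58.81),
    "3D-DNA": (90.15, 22.44),
    "SALSA1": (19.09, 14.81),
}

for method in rank_by_ratio(ARIMA_ROWS):
    ratio = nga50_ratio(*ARIMA_ROWS[method])
    print(f"{method}: NGA50/NG50 = {ratio:.2f}")
```

A ratio alone does not rank the methods: SALSA1 keeps a comparatively high ratio on this dataset but at a much lower NG50, so the ratio is best read alongside the absolute NGA50 column.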