Gigascience. 2018 May 5;7(5):giy048. doi: 10.1093/gigascience/giy048

Table 2: Hybrid and long-read assemblies of Arabidopsis thaliana (Ler-0)

| Number of scaffolds | MAX (bp) | N50 (bp) | Size (Mb) | Fold | Long-read coverage | Scaffolder/Assembler | Breakpoints: number | Breakpoints: bases (Mb) | Breakpoints: % error | 1-to-1 identity (%) | % Ref covered |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 384 | 1 551 485 | 320 571 | 119.45 | 1.00 | - | Discovar | 91 | 0.48 | 0.49 | 99.07 | 82.044 |
| 1 577 | 5 305 497 | 1 076 408 | 120.05 | 3.36 | 5X | Opera-LG | 174 | 0.978 | 1.00 | 99.07 | 82.054 |
| 1 368 | 9 953 317 | 2 475 756 | 120.26 | 7.72 | 10X | Opera-LG | 202 | 1.197 | 1.22 | 99.07 | 82.047 |
| 1 249 | 16 906 870 | 4 165 132 | 120.32 | 12.99 | 15X | Opera-LG | 206 | 1.237 | 1.26 | 99.07 | 82.052 |
| 1 179 | 18 032 662 | 4 941 257 | 120.41 | 15.41 | 20X | Opera-LG | 218 | 1.588 | 1.62 | 99.07 | 82.060 |
| 1 103 | 14 710 653 | 4 756 724 | 120.43 | 14.84 | 30X | Opera-LG | 227 | 1.728 | 1.76 | 99.07 | 82.055 |
| 1 049 | 10 003 725 | 4 667 601 | 120.41 | 14.56 | 50X | Opera-LG | 230 | 1.732 | 1.76 | 99.07 | 82.060 |
| 1 345 | 8 867 374 | 1 632 787 | 120.40 | 5.09 | 5X | ScaffM | 195 | 1.620 | 1.65 | 99.07 | 82.058 |
| 1 143 | 8 867 059 | 5 142 417 | 120.65 | 16.04 | 10X | ScaffM | 203 | 1.319 | 1.34 | 99.07 | 82.045 |
| 1 072 | 11 814 750 | 6 165 459 | 120.73 | 19.23 | 15X | ScaffM | 205 | 1.330 | 1.36 | 99.07 | 82.045 |
| 1 020 | 11 873 221 | 6 221 109 | 120.80 | 19.41 | 20X | ScaffM | 207 | 1.477 | 1.50 | 99.07 | 82.039 |
| 958 | 13 946 812 | 7 073 179 | 120.90 | 22.06 | 30X | ScaffM | 209 | 1.651 | 1.68 | 99.07 | 82.042 |
| 923 | 13 957 620 | 6 292 557 | 120.85 | 19.63 | 50X | ScaffM | 210 | 1.712 | 1.74 | 99.07 | 82.041 |
| 1 593 | 5 296 335 | 1 037 785 | 119.96 | 3.24 | 5X | Boss | 179 | 1.171 | 1.19 | 99.07 | 82.061 |
| 1 371 | 13 608 688 | 2 554 739 | 120.17 | 7.97 | 10X | Boss | 200 | 1.335 | 1.36 | 99.07 | 82.054 |
| 1 239 | 13 643 115 | 2 829 628 | 120.22 | 8.83 | 15X | Boss | 207 | 1.189 | 1.21 | 99.07 | 82.061 |
| 1 173 | 7 977 908 | 3 005 451 | 120.23 | 9.38 | 20X | Boss | 212 | 1.564 | 1.59 | 99.07 | 82.060 |
| 1 093 | 9 004 636 | 2 974 378 | 120.28 | 9.28 | 30X | Boss | 219 | 1.575 | 1.60 | 99.07 | 82.057 |
| 1 031 | 11 011 921 | 3 179 270 | 120.29 | 9.92 | 50X | Boss | 229 | 2.162 | 2.20 | 99.07 | 82.050 |
| 1 439 | 447 211 | 80 063 | 89.84 | - | 10X | Canu | 107 | 0.675 | 1.10 | 98.19 | 51.188 |
| 259 | 4 542 617 | 1 170 676 | 118.25 | - | 20X | Canu-p | 201 | 0.969 | 0.99 | 99.06 | 81.907 |
| 258 | 4 543 625 | 1 170 942 | 118.31 | - | 20X | Canu-q | 183 | 0.831 | 0.85 | 99.02 | 81.808 |
| 259 | 4 535 400 | 1 168 180 | 118.05 | - | 20X | Canu | 185 | 1.030 | 1.09 | 98.82 | 78.874 |
| 119 | 15 152 700 | 6 219 401 | 120.67 | - | 50X | Canu | 219 | 1.766 | 1.79 | 99.02 | 82.565 |
| 88 | 15 945 651 | 8 307 845 | 121.45 | - | 150X | Canu | 215 | 1.935 | 1.95 | 99.06 | 82.938 |

Continuity was measured using the maximum and N50 contig/scaffold sizes, where N50 is the length of the contig/scaffold at which the cumulative length of contigs/scaffolds, sorted in descending order of length, first reaches half of the assembly size. Assembly quality was evaluated by direct comparison against the A. thaliana TAIR10 reference genome using Nucmer [38] and reported using the Gage [26] statistics, which use 1-to-1 alignments to evaluate both identity and structural breakpoints (inversions, relocations, and translocations). An optimal assembly has high continuity, few breakpoint errors, high identity, and high coverage of the reference genome. Canu-p and Canu-q are Canu assemblies polished with Pilon [48] and Quiver, respectively; both tools are applied after a long-read assembly to improve the quality of the consensus sequence. All datasets and commands used for the hybrid assembly of A. thaliana (Ler-0) are detailed in Table 1 and Supplementary Materials 2 and 3.
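
As an illustration of the N50 definition given in the note above, the following minimal Python sketch computes N50 from a list of contig/scaffold lengths. The function names `n50` and `fold_increase` are hypothetical and are not taken from the paper or from any assembler. The second helper rests on an assumption: the note does not define the Fold column, but the reported values are consistent with each N50 divided by the N50 of the Discovar baseline (for example, 1 076 408 / 320 571 ≈ 3.36 for Opera-LG at 5X).

```python
def n50(lengths):
    """Return the N50 of a collection of contig/scaffold lengths (in bp).

    Lengths are sorted in descending order and accumulated; N50 is the
    length at which the running total first reaches half the assembly size.
    """
    if not lengths:
        raise ValueError("at least one contig/scaffold length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


def fold_increase(n50_value, baseline_n50):
    """Assumed meaning of the Fold column: N50 relative to a baseline N50."""
    return n50_value / baseline_n50


# Toy example (not data from the paper): total = 30, half = 15.
# Running totals are 10, then 18, so the second-longest sequence is the N50.
print(n50([10, 8, 6, 4, 2]))  # prints 8

# N50 values copied from Table 2: Opera-LG at 5X versus the Discovar baseline.
print(round(fold_increase(1_076_408, 320_571), 2))  # prints 3.36
```

Comparing twice the running total with the assembly size keeps the check exact for integer lengths, which avoids floating-point rounding at the halfway point.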