2021 Jan 4;12:60. doi: 10.1038/s41467-020-20236-7

Table 3.

Performance of de novo assemblies before and after the bridging step of NECAT.

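| Species | Stats | Contig | Assembly size (Mb) | Max (Kb) | NG50 (Kb) | MA/local MA | NGA50 (Kb) | QV |
|---|---|---|---|---|---|---|---|---|
| E. coli | Before | 1 | 4.6 | 4587 | 4587 | 2/10 | 3977 | 17.7 |
| E. coli | After | 1 | 4.6 | 4595 | 4595 | 2/3 | 3984 | 18.5 |
| S. cerevisiae | Before | 20 | 12.3 | 1530 | 816 | 28/29 | 596 | 22.3 |
| S. cerevisiae | After | 19 | 12.3 | 1529 | 937 | 26/35 | 708 | 23.1 |
| A. thaliana | Before | 150 | 122.9 | 14,556 | 11,150 | 800/1284 | 535 | 15.7 |
| A. thaliana | After | 136 | 122.9 | 14,567 | 11,157 | 886/1304 | 582 | 16.0 |
| D. melanogaster | Before | 320 | 143.0 | 14,923 | 9612 | 1120/1424 | 4930 | 19.4 |
| D. melanogaster | After | 277 | 142.8 | 21,505 | 18,072 | 1117/1333 | 6323 | 20.2 |
| C. reinhardtii | Before | 64 | 113.3 | 8997 | 5515 | 838/2345 | 706 | 19.3 |
| C. reinhardtii | After | 54 | 113.4 | 9014 | 6169 | 831/2273 | 732 | 19.8 |
| O. sativa | Before | 154 | 372.2 | 22,009 | 9241 | 466/7765 | 3206 | 15.5 |
| O. sativa | After | 120 | 373.1 | 22,094 | 9650 | 479/4873 | 3311 | 16.0 |
| S. pennellii | Before | 1604 | 991.9 | 22,857 | 3704 | 5762/13,324 | 927 | 15.1 |
| S. pennellii | After | 1344 | 991.8 | 22,879 | 4802 | 5813/12,592 | 992 | 15.2 |
| NA12878 (rel3,4) | Before | 2151 | 2791.6 | 50,857 | 11,980 | 811/6716 | 8334 | 16.0 |
| NA12878 (rel3,4) | After | 1494 | 2798.4 | 73,248 | 14,066 | 964/4591 | 9538 | 16.6 |
| NA12878 (rel6) | Before | 1604 | 2848.6 | 95,968 | 18,488 | 809/1514 | 12,079 | 22.6 |
| NA12878 (rel6) | After | 1047 | 2846.9 | 95,975 | 20,913 | 948/1467 | 13,441 | 23.1 |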

“Contig” is the total number of contigs in the assembly. “Assembly size” is the total number of base pairs in the assembly. “Max” is the length of the largest contig. “NG50” is the length n such that contigs of length ≥ n together contain 50% of the reference genome size. “NGA50” is the NG50 of the aligned blocks obtained by breaking contigs at misassembly breakpoints. “MA/local MA” are the numbers of misassemblies and local misassemblies evaluated using QUAST. “QV” is defined as $\mathrm{QV} = 10 \times \log_{10}\left(\dfrac{100\ \text{kbp}}{\#\,\text{mismatches per 100 kbp} + \#\,\text{indels per 100 kbp}}\right)$, where “# mismatches per 100 kbp” and “# indels per 100 kbp” are evaluated by QUAST.
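
The NG50 and QV definitions in the footnote can be illustrated with a minimal Python sketch. It is not the code used by NECAT or QUAST. The function names `ng50` and `qv` are hypothetical, and the inputs in the usage lines are toy values chosen for easy arithmetic, not data from the table. The sketch assumes contig lengths and genome size are in the same unit, and that error counts are given per 100 kbp (100,000 bp), as QUAST reports them.

```python
import math
from typing import Iterable


def ng50(contig_lengths: Iterable[int], genome_size: int) -> int:
    """Length n at which contigs of length >= n first cover half of genome_size.

    Returns 0 if the contigs never reach 50% of the reference genome size.
    (Hypothetical helper; not part of NECAT or QUAST.)
    """
    covered = 0
    # Walk contigs from longest to shortest, accumulating covered bases.
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if 2 * covered >= genome_size:
            return length
    return 0


def qv(mismatches_per_100kbp: float, indels_per_100kbp: float) -> float:
    """QV = 10 * log10(100 kbp / (mismatches + indels per 100 kbp))."""
    errors = mismatches_per_100kbp + indels_per_100kbp
    if errors <= 0:
        # No observed errors: the ratio is unbounded (assumed convention).
        return math.inf
    return 10 * math.log10(100_000 / errors)


# Toy usage: cumulative lengths are 50, 90, 120; half of 200 is reached at 30.
toy_ng50 = ng50([50, 40, 30, 20, 10], genome_size=200)
# Toy usage: 100 errors per 100 kbp is one error per 1000 bases.
toy_qv = qv(50, 50)
```

With these toy inputs, `toy_ng50` is 30 and `toy_qv` is 30.0 (up to floating-point rounding). Replacing `genome_size` with the total assembly length gives the ordinary N50 instead of NG50.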