2016 Jul 25;10(Suppl 2):20. doi: 10.1186/s40246-016-0068-0

Table 4.

Performance analysis of six k-spectrum-based error correctors as evaluated using six synthetic Illumina datasets

| Dataset | Method | TP | FP | FN | Recall | Gain | Precision | F-score |
|---|---|---|---|---|---|---|---|---|
| EC-1 (36 bp, 70×, k = 19) | Reptile | 2335361 | 144751 | 451889 | 0.8378 | 0.7859 | 0.9416 | 0.8867 |
| | Lighter | 2695425 | 72843 | 91825 | 0.9671 | 0.9409 | 0.9737 | 0.9704 |
| | Bless | 2624659 | 48342 | 56279 | 0.9790 | 0.9610 | 0.9819 | 0.9805 |
| | Bloocoo | 2411701 | 22259 | 375549 | 0.8653 | 0.8573 | 0.9908 | 0.9238 |
| | Musket | 2701885 | 61096 | 85365 | 0.9694 | 0.9474 | 0.9779 | 0.9736 |
| | Trowel | 1246340 | 705438 | 1539825 | 0.4473 | 0.1941 | 0.6386 | 0.5261 |
| EC-2 (36 bp, 20×, k = 17) | Reptile | 681551 | 140039 | 114910 | 0.8557 | 0.6799 | 0.8296 | 0.8424 |
| | Lighter | 108241 | 58579 | 688220 | 0.1359 | 0.0624 | 0.6488 | 0.2247 |
| | Bless | 779824 | 18095 | 16637 | 0.9791 | 0.9564 | 0.9773 | 0.9782 |
| | Bloocoo | 689322 | 6454 | 107139 | 0.8655 | 0.8574 | 0.9907 | 0.9239 |
| | Musket | 767087 | 18182 | 29374 | 0.9631 | 0.9403 | 0.9768 | 0.9699 |
| | Trowel | 434885 | 19167 | 361576 | 0.5460 | 0.5220 | 0.9578 | 0.6955 |
| EC-3 (100 bp, 20×, k = 24) | Reptile | 105 | 461 | 876053 | 0.0001 | -0.0004 | 0.1855 | 0.0002 |
| | Lighter | 858125 | 2446 | 18033 | 0.9794 | 0.9766 | 0.9972 | 0.9882 |
| | Bless | 746 | 872860 | 875412 | 0.0008 | -0.9954 | 0.0009 | 0.0009 |
| | Bloocoo | 79790 | 3644539 | 796368 | 0.0911 | -4.0686 | 0.0214 | 0.0347 |
| | Musket | 873592 | 1645 | 2566 | 0.9971 | 0.9952 | 0.9981 | 0.9976 |
| | Trowel | 155 | 178354 | 876003 | 0.0002 | -0.2034 | 0.0009 | 0.0003 |
| BC-1 (56 bp, 50×, k = 27) | Reptile | 382043 | 22303 | 16602 | 0.9584 | 0.9024 | 0.9448 | 0.9515 |
| | Lighter | 331759 | 15470 | 141618 | 0.7008 | 0.6682 | 0.9554 | 0.8086 |
| | Bless | 429017 | 34018 | 11943 | 0.9729 | 0.8958 | 0.9265 | 0.9492 |
| | Bloocoo | 410156 | 24127 | 63221 | 0.8664 | 0.8155 | 0.9444 | 0.9038 |
| | Musket | 355015 | 47460 | 118362 | 0.7500 | 0.6497 | 0.8821 | 0.8107 |
| | Trowel | 55277 | 4976 | 26744 | 0.6739 | 0.6133 | 0.9174 | 0.7770 |
| BC-2 (100 bp, 120×, k = 31) | Reptile | 497425 | 116 | 208081 | 0.7051 | 0.7049 | 0.9998 | 0.8269 |
| | Lighter | 698089 | 159 | 7417 | 0.9895 | 0.9893 | 0.9998 | 0.9946 |
| | Bless | – | – | – | – | – | – | – |
| | Bloocoo | 27409 | 1278837 | 678097 | 0.0389 | -1.7738 | 0.0210 | 0.0272 |
| | Musket | 703882 | 68 | 1624 | 0.9977 | 0.9976 | 0.9999 | 0.9988 |
| | Trowel | 652845 | 108 | 52661 | 0.9254 | 0.9252 | 0.9998 | 0.9612 |
| DM (100 bp, 10×, k = 21) | Reptile | 11702183 | 187733 | 517322 | 0.9577 | 0.9423 | 0.9842 | 0.9708 |
| | Lighter | 42 | 23055867 | 12224293 | 0.0000 | -1.8861 | 0.0000 | 0.0000 |
| | Bless | 11122683 | 126388 | 1101652 | 0.9099 | 0.8995 | 0.9888 | 0.9477 |
| | Bloocoo | – | – | – | – | – | – | – |
| | Musket | 11550483 | 163838 | 673852 | 0.9449 | 0.9315 | 0.9860 | 0.9650 |
| | Trowel | 1197127 | 384403 | 11027208 | 0.0979 | 0.0665 | 0.7569 | 0.1734 |

The first column gives the dataset ID, read length, genome coverage, and the optimal k estimated using KmerGenie. The values in the TP, FP, and FN columns are numbers of bases. Italicized values denote the best performer for a dataset with regard to a specific evaluation measure. The symbol “–” indicates that a method failed to process a specific dataset.
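
The table does not state how the Recall, Gain, Precision, and F-score columns are derived from the base counts. The sketch below assumes the definitions commonly used for error-correction benchmarks: recall = TP / (TP + FN), precision = TP / (TP + FP), gain = (TP − FP) / (TP + FN), and F-score as the harmonic mean of precision and recall. These formulas are an assumption rather than something quoted from the table, and the function name `correction_metrics` is hypothetical. Under that assumption, the computed values agree with the tabulated ones to within rounding in the last printed digit.

```python
def correction_metrics(tp: int, fp: int, fn: int) -> dict:
    """Derive per-base error-correction metrics from TP, FP, and FN counts.

    Assumed definitions (not stated in the table itself):
      recall    = TP / (TP + FN)
      precision = TP / (TP + FP)
      gain      = (TP - FP) / (TP + FN)
      f_score   = 2 * precision * recall / (precision + recall)
    """
    errors = tp + fn          # all erroneous bases present in the reads
    changed = tp + fp         # all bases the corrector modified
    recall = tp / errors
    precision = tp / changed
    gain = (tp - fp) / errors  # negative when more errors are introduced than removed
    f_score = 2 * precision * recall / (precision + recall)
    return {
        "recall": recall,
        "gain": gain,
        "precision": precision,
        "f_score": f_score,
    }


# Counts taken from the EC-3 rows for Musket and Bloocoo.
musket_ec3 = correction_metrics(tp=873592, fp=1645, fn=2566)
bloocoo_ec3 = correction_metrics(tp=79790, fp=3644539, fn=796368)

for name, m in (("Musket", musket_ec3), ("Bloocoo", bloocoo_ec3)):
    print(name, {k: round(v, 4) for k, v in m.items()})
```

The Bloocoo row on EC-3 shows why gain can fall below zero: the number of newly introduced errors (FP) exceeds the number of corrected ones (TP), so the numerator TP − FP is negative even though recall and precision remain non-negative.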