Table 4.
| Dataset | Method | TP | FP | FN | Recall | Gain | Precision | F-score |
|---|---|---|---|---|---|---|---|---|
| EC-1 | Reptile | 2335361 | 144751 | 451889 | 0.8378 | 0.7859 | 0.9416 | 0.8867 |
| 36 bp | Lighter | 2695425 | 72843 | 91825 | 0.9671 | 0.9409 | 0.9737 | 0.9704 |
| 70× | Bless | 2624659 | 48342 | 56279 | 0.9790 | 0.9610 | 0.9819 | 0.9805 |
| k = 19 | Bloocoo | 2411701 | 22259 | 375549 | 0.8653 | 0.8573 | 0.9908 | 0.9238 |
| | Musket | 2701885 | 61096 | 85365 | 0.9694 | 0.9474 | 0.9779 | 0.9736 |
| | Trowel | 1246340 | 705438 | 1539825 | 0.4473 | 0.1941 | 0.6386 | 0.5261 |
| EC-2 | Reptile | 681551 | 140039 | 114910 | 0.8557 | 0.6799 | 0.8296 | 0.8424 |
| 36 bp | Lighter | 108241 | 58579 | 688220 | 0.1359 | 0.0624 | 0.6488 | 0.2247 |
| 20× | Bless | 779824 | 18095 | 16637 | 0.9791 | 0.9564 | 0.9773 | 0.9782 |
| k = 17 | Bloocoo | 689322 | 6454 | 107139 | 0.8655 | 0.8574 | 0.9907 | 0.9239 |
| | Musket | 767087 | 18182 | 29374 | 0.9631 | 0.9403 | 0.9768 | 0.9699 |
| | Trowel | 434885 | 19167 | 361576 | 0.5460 | 0.5220 | 0.9578 | 0.6955 |
| EC-3 | Reptile | 105 | 461 | 876053 | 0.0001 | -0.0004 | 0.1855 | 0.0002 |
| 100 bp | Lighter | 858125 | 2446 | 18033 | 0.9794 | 0.9766 | 0.9972 | 0.9882 |
| 20× | Bless | 746 | 872860 | 875412 | 0.0008 | -0.9954 | 0.0009 | 0.0009 |
| k = 24 | Bloocoo | 79790 | 3644539 | 796368 | 0.0911 | -4.0686 | 0.0214 | 0.0347 |
| | Musket | 873592 | 1645 | 2566 | 0.9971 | 0.9952 | 0.9981 | 0.9976 |
| | Trowel | 155 | 178354 | 876003 | 0.0002 | -0.2034 | 0.0009 | 0.0003 |
| BC-1 | Reptile | 382043 | 22303 | 16602 | 0.9584 | 0.9024 | 0.9448 | 0.9515 |
| 56 bp | Lighter | 331759 | 15470 | 141618 | 0.7008 | 0.6682 | 0.9554 | 0.8086 |
| 50× | Bless | 429017 | 34018 | 11943 | 0.9729 | 0.8958 | 0.9265 | 0.9492 |
| k = 27 | Bloocoo | 410156 | 24127 | 63221 | 0.8664 | 0.8155 | 0.9444 | 0.9038 |
| | Musket | 355015 | 47460 | 118362 | 0.7500 | 0.6497 | 0.8821 | 0.8107 |
| | Trowel | 55277 | 4976 | 26744 | 0.6739 | 0.6133 | 0.9174 | 0.7770 |
| BC-2 | Reptile | 497425 | 116 | 208081 | 0.7051 | 0.7049 | 0.9998 | 0.8269 |
| 100 bp | Lighter | 698089 | 159 | 7417 | 0.9895 | 0.9893 | 0.9998 | 0.9946 |
| 120× | Bless | – | – | – | – | – | – | – |
| k = 31 | Bloocoo | 27409 | 1278837 | 678097 | 0.0389 | -1.7738 | 0.0210 | 0.0272 |
| | Musket | 703882 | 68 | 1624 | 0.9977 | 0.9976 | 0.9999 | 0.9988 |
| | Trowel | 652845 | 108 | 52661 | 0.9254 | 0.9252 | 0.9998 | 0.9612 |
| DM | Reptile | 11702183 | 187733 | 517322 | 0.9577 | 0.9423 | 0.9842 | 0.9708 |
| 100 bp | Lighter | 42 | 23055867 | 12224293 | 0.0000 | -1.8861 | 0.0000 | 0.0000 |
| 10× | Bless | 11122683 | 126388 | 1101652 | 0.9099 | 0.8995 | 0.9888 | 0.9477 |
| k = 21 | Bloocoo | – | – | – | – | – | – | – |
| | Musket | 11550483 | 163838 | 673852 | 0.9449 | 0.9315 | 0.9860 | 0.9650 |
| | Trowel | 1197127 | 384403 | 11027208 | 0.0979 | 0.0665 | 0.7569 | 0.1734 |
The first column gives, for each dataset, its ID, read length, genome coverage, and the optimal k estimated using KmerGenie, stacked over successive rows. The values in the TP, FP, and FN columns are numbers of bases. Italicized values denote the best performer with regard to a specific evaluation measure for a dataset. The symbol “–” indicates that a method failed to process a specific dataset.
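
The table itself does not restate how Recall, Gain, Precision, and F-score are derived from the base counts. The sketch below assumes the usual definitions, Recall = TP / (TP + FN), Precision = TP / (TP + FP), Gain = (TP − FP) / (TP + FN), and F-score = 2·TP / (2·TP + FP + FN), which agree with the tabulated values to within the last printed digit. The `Counts` class and the function names are illustrative only and are not taken from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Base-level counts for one method on one dataset (hypothetical helper)."""
    tp: int  # erroneous bases that were corrected
    fp: int  # correct bases that were wrongly changed
    fn: int  # erroneous bases left uncorrected


# Assumed definitions; denominators are taken to be non-zero.
def recall(c: Counts) -> float:
    return c.tp / (c.tp + c.fn)


def precision(c: Counts) -> float:
    return c.tp / (c.tp + c.fp)


def gain(c: Counts) -> float:
    # Net fraction of errors removed; negative when a method
    # introduces more errors than it fixes.
    return (c.tp - c.fp) / (c.tp + c.fn)


def f_score(c: Counts) -> float:
    # Harmonic mean of precision and recall, written in terms of counts.
    return 2 * c.tp / (2 * c.tp + c.fp + c.fn)


# Counts copied from the Musket row for EC-3 and the Bless row for EC-3.
musket_ec3 = Counts(tp=873592, fp=1645, fn=2566)
bless_ec3 = Counts(tp=746, fp=872860, fn=875412)

print(f"{recall(musket_ec3):.4f} {gain(musket_ec3):.4f} "
      f"{precision(musket_ec3):.4f} {f_score(musket_ec3):.4f}")
# prints: 0.9971 0.9952 0.9981 0.9976
print(f"{gain(bless_ec3):.4f}")
# prints: -0.9954
```

Under these assumed definitions, Gain is the only one of the four measures that can go negative, which is why rows such as Bloocoo on EC-3 show values below zero while Recall and Precision stay within [0, 1].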