2017 Mar 14;18(Suppl 3):60. doi: 10.1186/s12859-017-1473-7

Table 5.

Comparison of AUC scores for random forest (RF) applied to word frequency vectors with AUC scores based on the Manhattan distance and the d2 statistic, for k = 6

Genus            Manhattan   d2 i.i.d.   d2 1st MC   d2 2nd MC   RF-feat-1
Bacillus         0.829       0.752       0.873       0.851       0.863
Escherichia      0.880       0.833       0.958       0.945       0.856
Lactococcus      0.767       0.775       0.828       0.750       1.000
Mycobacterium    0.976       0.977       0.966       0.984       0.985
Pseudomonas      0.951       0.934       0.974       0.970       0.981
Salmonella       0.837       0.818       0.900       0.900       0.896
Staphylococcus   0.964       0.941       0.947       0.974       0.987
Synechococcus    0.929       0.906       0.994       0.993       0.978
Vibrio           0.841       0.733       0.854       0.817       0.940

For the background model of the d2 statistic, we considered an independent and identically distributed (i.i.d.) model and first- and second-order Markov chains (MC).
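
To make the Manhattan column concrete, the following is a minimal sketch of a word (k-mer) frequency vector and the Manhattan distance between two such vectors. It is an illustration only, not the authors' implementation: the function names kmer_frequencies and manhattan are hypothetical, and both the restriction to the ACGT alphabet and the normalization to relative frequencies are assumptions not stated in the table.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=6):
    """Relative frequency of every DNA word of length k, in a fixed order.

    Assumption: words containing letters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[w] for w in words)
    if total == 0:
        return [0.0] * len(words)
    return [counts[w] / total for w in words]


def manhattan(u, v):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))
```

With k = 6 each vector has 4^6 = 4096 entries. A smaller distance between two sequences means more similar word usage under this measure.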