Table 3.
Comparison of the prediction performance of different methods based on the LC dataset.
Feature | 40-mer | 40-mer | Gene markers†† | Species abundance† | Presence of strain-specific markers† |
---|---|---|---|---|---|
Experiment | Training (66P+56H), validation (32P+27H), testing (25P+31H) | Training (66P+56H), validation (32P+27H), testing (25P+31H) | Training (66P+56H), validation (32P+27H), testing (25P+31H) | 20 runs of 10-fold cross-validation (114P+118H) | 20 runs of 10-fold cross-validation (114P+118H) |
Number of features | 1 | 10 | 15 | 542 | 120553 |
Classifier | Single logical feature predictor | Random forests | Support vector machine | Random forests | Support vector machine |
AUC (training / validation / testing) | **ASS∗ = 0.87 / ASS = 0.885 / ASS = 0.87** | **0.963 / 0.969 / 0.942** | 0.918 / 0.838 / 0.836 | 0.946 ± 0.035 | 0.963 ± 0.027 |
Using far fewer features, MetaGO achieved better results than the other methods. MetaGO results are shown in bold. †(Pasolli et al., 2016); ††(Qin et al., 2014); ∗ASS: average of sensitivity and specificity.
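
The ASS metric reported for the single-feature predictor is defined in the footnote as the average of sensitivity and specificity. A minimal sketch of that calculation follows. The function name, the label encoding (1 for patient, 0 for healthy) and the toy labels are assumptions made for illustration; they are not taken from the original study or its data.

```python
def average_sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity + specificity) / 2 for binary labels.

    Assumed encoding (illustrative only): 1 = patient, 0 = healthy.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # patients predicted as patients
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # patients predicted as healthy
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy predicted as healthy
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy predicted as patients

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("Both classes must be present in y_true.")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy labels for illustration only (not the LC cohort):
# 10 patients with 8 detected, 10 healthy with 9 correctly rejected.
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 8 + [0] * 2 + [0] * 9 + [1] * 1
ass = average_sensitivity_specificity(y_true, y_pred)
print(round(ass, 3))
```

Unlike AUC, this quantity is computed from hard class predictions at a single decision rule, which is presumably why it is reported in place of AUC for a single logical feature.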