2018 Apr 26;14(4):e1006106. doi: 10.1371/journal.pcbi.1006106

Table 3. Evaluation of RIDDLE and other methods under simulation of random missing data.

All values are averaged over ten k-fold cross-validation experiments involving different proportions of random missing data (10%–30%). In addition, the precision, recall and ROC scores are averaged across classes, weighted by the number of samples in each class. SVMs could not be evaluated on the full dataset as individual trials required more than 36 hours of computation.
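
The class-weighted averaging described in this caption can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names and the toy labels are hypothetical, and the convention of scoring precision as 0 for a class that is never predicted is an assumption. The sketch also shows why the Accuracy and Recall columns agree in every row: when per-class recall is weighted by class support, the weighted sum reduces to overall accuracy.

```python
from collections import Counter


def weighted_precision_recall(y_true, y_pred):
    """Return (precision, recall) averaged across classes, weighted by class support."""
    support = Counter(y_true)      # number of true samples per class
    predicted = Counter(y_pred)    # number of predictions per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    total = len(y_true)
    precision = 0.0
    recall = 0.0
    for cls, n in support.items():
        weight = n / total
        # Assumption: a class that is never predicted contributes precision 0.
        cls_precision = true_pos[cls] / predicted[cls] if predicted[cls] else 0.0
        cls_recall = true_pos[cls] / n
        precision += weight * cls_precision
        recall += weight * cls_recall
    return precision, recall


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels (hypothetical, not drawn from the paper's dataset).
y_true = ["A", "A", "A", "A", "B", "B", "C", "C", "C", "C"]
y_pred = ["A", "A", "A", "B", "B", "A", "C", "C", "A", "C"]

precision, recall = weighted_precision_recall(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} accuracy={acc:.3f}")
```

On these toy labels the support-weighted recall and the accuracy coincide, while the weighted precision differs, which matches the pattern in the table.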

| Method | Accuracy | Loss | Precision | Recall | F1 | Macro-average ROC |
|---|---|---|---|---|---|---|
| RIDDLE | 0.660 | 0.878 | 0.656 | 0.660 | 0.643 | 0.822 |
| logistic regression | 0.639 | 0.941 | 0.634 | 0.639 | 0.604 | 0.800 |
| random forest | 0.623 | 0.978 | 0.635 | 0.623 | 0.567 | 0.789 |
| GBDT | 0.627 | 0.967 | 0.628 | 0.627 | 0.580 | 0.782 |
| SVM, linear kernel | N/A | N/A | N/A | N/A | N/A | N/A |
| SVM, polynomial kernel | N/A | N/A | N/A | N/A | N/A | N/A |
| SVM, RBF kernel | N/A | N/A | N/A | N/A | N/A | N/A |

(a) 10% missing data

| Method | Accuracy | Loss | Precision | Recall | F1 | Macro-average ROC |
|---|---|---|---|---|---|---|
| RIDDLE | 0.654 | 0.897 | 0.649 | 0.654 | 0.631 | 0.814 |
| logistic regression | 0.634 | 0.954 | 0.629 | 0.634 | 0.596 | 0.792 |
| random forest | 0.616 | 0.994 | 0.631 | 0.616 | 0.556 | 0.779 |
| GBDT | 0.622 | 0.979 | 0.624 | 0.622 | 0.572 | 0.774 |
| SVM, linear kernel | N/A | N/A | N/A | N/A | N/A | N/A |
| SVM, polynomial kernel | N/A | N/A | N/A | N/A | N/A | N/A |
| SVM, RBF kernel | N/A | N/A | N/A | N/A | N/A | N/A |

(b) 20% missing data

| Method | Accuracy | Loss | Precision | Recall | F1 | Macro-average ROC |
|---|---|---|---|---|---|---|
| RIDDLE | 0.643 | 0.926 | 0.640 | 0.643 | 0.614 | 0.800 |
| logistic regression | 0.629 | 0.968 | 0.623 | 0.629 | 0.587 | 0.784 |
| random forest | 0.610 | 1.009 | 0.625 | 0.610 | 0.545 | 0.769 |
| GBDT | 0.616 | 0.995 | 0.617 | 0.616 | 0.561 | 0.764 |
| SVM, linear kernel | N/A | N/A | N/A | N/A | N/A | N/A |
| SVM, polynomial kernel | N/A | N/A | N/A | N/A | N/A | N/A |
| SVM, RBF kernel | N/A | N/A | N/A | N/A | N/A | N/A |

(c) 30% missing data