Table 3.
Average test-set area under the ROC curve (AUC) on UCI classification data sets that have been modified to be semi-supervised anomaly detection tasks, for four anomaly detection methods: one-class support vector machines (One-class SVM), local outlier factor (LOF), cross-feature analysis (CFA), and our approach, feature regression and classification (FRaC).
Data set | Estimated upper-bound | One-class SVM | LOF | CFA | FRaC | p-value |
---|---|---|---|---|---|---|
abalone | 0.64 | 0.58 | 0.51 | 0.39 | 0.48 | |
acute | 1.00 | 0.97 | 0.84 | 1.00 | 1.00 | |
adult | 0.88 | 0.61 | 0.47 | 0.53 | 0.61 | |
annealing | 0.95 | 0.63 | 0.85 | 0.80 | 0.82 | |
arrhythmia | 0.77 | 0.62 | 0.72 | 0.65 | 0.78 | 9.04e-13 |
audiology | 0.95 | 0.77 | 0.77 | 0.78 | 0.80 | 0.00152 |
balance-scale | 0.99 | 0.96 | 0.95 | 0.94 | 0.97 | 0.00022 |
blood-transfusion | 0.62 | 0.57 | 0.57 | 0.51 | 0.59 | 0.000306 |
breast-cancer-wisconsin | 0.93 | 0.51 | 0.93 | 0.30 | 0.96 | 5.42e-15 |
car | 0.99 | 0.98 | 0.91 | 0.97 | 0.97 | |
chess | 1.00 | 0.71 | 0.89 | 0.94 | 0.93 | |
cmc | 0.75 | 0.45 | 0.45 | 0.42 | 0.41 | |
connect-4 | 0.87 | 0.53 | 0.85 | 0.76 | 0.65 | |
credit-screening | 0.88 | 0.69 | 0.73 | 0.84 | 0.85 | 0.822 |
cylinder-bands | 0.81 | 0.81 | 0.68 | 0.82 | 0.69 | |
dermatology | 1.00 | 0.96 | 0.99 | 0.86 | 1.00 | 2.15e-08 |
echocardiogram | 0.77 | 0.66 | 0.65 | 0.58 | 0.67 | 0.090 |
ecoli | 0.99 | 0.98 | 0.98 | 0.50 | 0.97 | |
glass | 0.79 | 0.61 | 0.70 | 0.64 | 0.65 | |
haberman | 0.66 | 0.67 | 0.64 | 0.67 | 0.67 | |
hayes-roth | 0.92 | 0.87 | 0.68 | 0.89 | 0.92 | 1.43e-06 |
hepatitis | 0.84 | 0.58 | 0.46 | 0.81 | 0.83 | 0.108 |
horse-colic | 0.81 | 0.68 | 0.65 | 0.71 | 0.82 | 1.27e-13 |
image | 1.00 | 0.83 | 0.94 | 0.66 | 0.98 | 7.62e-08 |
internet_ads | 0.92 | 0.79 | 0.79 | 0.92 | 0.94 | 0.00459 |
ionosphere | 0.98 | 0.84 | 0.91 | 0.87 | 0.97 | 7.67e-18 |
iris | 1.00 | 1.00 | 1.00 | 0.97 | 1.00 | |
letter-recognition | 1.00 | 0.99 | 0.99 | 1.00 | 1.00 | |
libras | 0.95 | 0.63 | 0.69 | 0.81 | 0.89 | 4.9e-05 |
magic | 0.88 | 0.78 | 0.81 | 0.60 | 0.83 | 5.22e-14 |
mammographic-masses | 0.89 | 0.74 | 0.58 | 0.72 | 0.73 | |
mushroom | 1.00 | 0.95 | 1.00 | 1.00 | 1.00 | |
nursery | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | |
ozone | 0.88 | 0.39 | 0.48 | 0.43 | 0.34 | |
page-blocks | 0.94 | 0.57 | 0.95 | 0.72 | 0.89 | |
parkinsons | 0.89 | 0.75 | 0.67 | 0.45 | 0.64 | |
pima-indians-diabetes | 0.83 | 0.65 | 0.71 | 0.50 | 0.75 | 3.74e-14 |
poker | 0.54 | 0.53 | 0.51 | 0.55 | 0.56 | 4.53e-06 |
secom | 0.53 | 0.50 | 0.53 | 0.52 | 0.57 | 8.95e-13 |
spambase | 0.94 | 0.78 | 0.58 | 0.82 | 0.84 | 1.15e-11 |
statlog | 0.69 | 0.55 | 0.58 | 0.58 | 0.63 | 5.3e-14 |
tae | 0.66 | 0.51 | 0.35 | 0.54 | 0.55 | 0.557 |
tic-tac-toe | 0.98 | 0.85 | 0.98 | 1.00 | 0.99 | |
voting-records | 0.99 | 0.98 | 0.83 | 0.77 | 0.95 | |
wine | 0.99 | 0.79 | 0.88 | 0.33 | 0.96 | 4.93e-14 |
yeast | 0.73 | 0.69 | 0.72 | 0.56 | 0.72 | |
zoo | 1.00 | 0.98 | 1.00 | 1.00 | 1.00 | |
Number of data sets with the maximum AUC score | | 5 | 5 | 3 | 23 | |
The highest average AUC score among the anomaly detection methods is shown in bold. On data sets where FRaC has the best AUC, we report the p-value of a paired, one-tailed t-test comparing FRaC to the alternative method with the best AUC. p-values under 0.05 are shown in bold. The estimated upper-bound reports the average AUC score for the corresponding supervised classification task, i.e., when training on both “normal” and “anomalous” examples.
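The two quantities reported in the table, AUC and the paired t-test, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. The function names `auc` and `paired_t_statistic` and all input numbers are hypothetical, and the sketch assumes the t-test is paired over repeated runs of each method, which the table note does not specify.

```python
from math import sqrt
from statistics import mean, stdev


def auc(scores, labels):
    """AUC as the fraction of (anomaly, normal) pairs ranked correctly.

    labels: 1 for anomalous, 0 for normal. Higher score = more anomalous.
    Tied scores count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def paired_t_statistic(a, b):
    """t statistic and degrees of freedom for paired samples a and b.

    A large positive t supports the one-tailed hypothesis mean(a) > mean(b).
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


# Hypothetical anomaly scores for four test examples (not taken from the table).
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75

# Hypothetical per-run AUCs for two methods (not taken from the table).
method_a = [0.80, 0.82, 0.78, 0.81]
method_b = [0.75, 0.79, 0.76, 0.77]
t, df = paired_t_statistic(method_a, method_b)
print(t, df)
```

Turning the t statistic into a one-tailed p-value needs the Student t survival function, which the Python standard library does not provide. If SciPy is available, `scipy.stats.ttest_rel(method_a, method_b, alternative="greater")` performs the paired one-tailed test directly.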