Table 3.
Average test-set area under the ROC curve (AUC) on UCI classification data sets that have been converted into semi-supervised anomaly detection tasks, for four anomaly detection methods: one-class support vector machines (one-class SVM), local outlier factor (LOF), cross-feature analysis (CFA), and our approach, feature regression and classification (FRaC).
| Data set | Estimated upper-bound | One-class SVM | LOF | CFA | FRaC | p-value |
|---|---|---|---|---|---|---|
| abalone | 0.64 | 0.58 | 0.51 | 0.39 | 0.48 | |
| acute | 1.00 | 0.97 | 0.84 | 1.00 | 1.00 | |
| adult | 0.88 | 0.61 | 0.47 | 0.53 | 0.61 | |
| annealing | 0.95 | 0.63 | 0.85 | 0.80 | 0.82 | |
| arrhythmia | 0.77 | 0.62 | 0.72 | 0.65 | 0.78 | 9.04e-13 |
| audiology | 0.95 | 0.77 | 0.77 | 0.78 | 0.80 | 0.00152 |
| balance-scale | 0.99 | 0.96 | 0.95 | 0.94 | 0.97 | 0.00022 |
| blood-transfusion | 0.62 | 0.57 | 0.57 | 0.51 | 0.59 | 0.000306 |
| breast-cancer-wisconsin | 0.93 | 0.51 | 0.93 | 0.30 | 0.96 | 5.42e-15 |
| car | 0.99 | 0.98 | 0.91 | 0.97 | 0.97 | |
| chess | 1.00 | 0.71 | 0.89 | 0.94 | 0.93 | |
| cmc | 0.75 | 0.45 | 0.45 | 0.42 | 0.41 | |
| connect-4 | 0.87 | 0.53 | 0.85 | 0.76 | 0.65 | |
| credit-screening | 0.88 | 0.69 | 0.73 | 0.84 | 0.85 | 0.822 |
| cylinder-bands | 0.81 | 0.81 | 0.68 | 0.82 | 0.69 | |
| dermatology | 1.00 | 0.96 | 0.99 | 0.86 | 1.00 | 2.15e-08 |
| echocardiogram | 0.77 | 0.66 | 0.65 | 0.58 | 0.67 | 0.090 |
| ecoli | 0.99 | 0.98 | 0.98 | 0.50 | 0.97 | |
| glass | 0.79 | 0.61 | 0.70 | 0.64 | 0.65 | |
| haberman | 0.66 | 0.67 | 0.64 | 0.67 | 0.67 | |
| hayes-roth | 0.92 | 0.87 | 0.68 | 0.89 | 0.92 | 1.43e-06 |
| hepatitis | 0.84 | 0.58 | 0.46 | 0.81 | 0.83 | 0.108 |
| horse-colic | 0.81 | 0.68 | 0.65 | 0.71 | 0.82 | 1.27e-13 |
| image | 1.00 | 0.83 | 0.94 | 0.66 | 0.98 | 7.62e-08 |
| internet_ads | 0.92 | 0.79 | 0.79 | 0.92 | 0.94 | 0.00459 |
| ionosphere | 0.98 | 0.84 | 0.91 | 0.87 | 0.97 | 7.67e-18 |
| iris | 1.00 | 1.00 | 1.00 | 0.97 | 1.00 | |
| letter-recognition | 1.00 | 0.99 | 0.99 | 1.00 | 1.00 | |
| libras | 0.95 | 0.63 | 0.69 | 0.81 | 0.89 | 4.9e-05 |
| magic | 0.88 | 0.78 | 0.81 | 0.60 | 0.83 | 5.22e-14 |
| mammographic-masses | 0.89 | 0.74 | 0.58 | 0.72 | 0.73 | |
| mushroom | 1.00 | 0.95 | 1.00 | 1.00 | 1.00 | |
| nursery | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | |
| ozone | 0.88 | 0.39 | 0.48 | 0.43 | 0.34 | |
| page-blocks | 0.94 | 0.57 | 0.95 | 0.72 | 0.89 | |
| parkinsons | 0.89 | 0.75 | 0.67 | 0.45 | 0.64 | |
| pima-indians-diabetes | 0.83 | 0.65 | 0.71 | 0.50 | 0.75 | 3.74e-14 |
| poker | 0.54 | 0.53 | 0.51 | 0.55 | 0.56 | 4.53e-06 |
| secom | 0.53 | 0.50 | 0.53 | 0.52 | 0.57 | 8.95e-13 |
| spambase | 0.94 | 0.78 | 0.58 | 0.82 | 0.84 | 1.15e-11 |
| statlog | 0.69 | 0.55 | 0.58 | 0.58 | 0.63 | 5.3e-14 |
| tae | 0.66 | 0.51 | 0.35 | 0.54 | 0.55 | 0.557 |
| tic-tac-toe | 0.98 | 0.85 | 0.98 | 1.00 | 0.99 | |
| voting-records | 0.99 | 0.98 | 0.83 | 0.77 | 0.95 | |
| wine | 0.99 | 0.79 | 0.88 | 0.33 | 0.96 | 4.93e-14 |
| yeast | 0.73 | 0.69 | 0.72 | 0.56 | 0.72 | |
| zoo | 1.00 | 0.98 | 1.00 | 1.00 | 1.00 | |
| Number of data sets with the maximum AUC score | | 5 | 5 | 3 | 23 | |
The highest average AUC score among the anomaly detection methods is shown in bold. On data sets where FRaC has the best AUC, we report the p-value of a paired, one-tailed t-test comparing FRaC to the alternative method with the best AUC; p-values under 0.05 are shown in bold. The estimated upper-bound reports the average AUC score for the corresponding supervised classification task, i.e., when training on both “normal” and “anomalous” examples.
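
As a rough illustration of the significance test described in the table note, the sketch below computes a paired, one-tailed t-test from per-run AUC scores using only the Python standard library. The per-run scores, the function names (`t_upper_tail`, `paired_one_tailed_ttest`), and the numerical integration of the t density are all assumptions made for illustration. They are not the data or the procedure behind Table 3.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=20000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # P(0 < T < |t|)
    upper = 0.5 - area            # P(T > |t|), by symmetry of the density
    return upper if t >= 0 else 1.0 - upper


def paired_one_tailed_ttest(a, b):
    """Test H1: mean(a - b) > 0 for paired samples a and b.

    Returns the t statistic and the one-tailed p-value.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two paired samples of equal length >= 2")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    n = len(diffs)
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, t_upper_tail(t, n - 1)


# Hypothetical per-run AUC scores (NOT taken from the table).
frac_auc = [0.80, 0.82, 0.78, 0.85, 0.81]
best_alt_auc = [0.75, 0.80, 0.77, 0.79, 0.78]

t_stat, p_value = paired_one_tailed_ttest(frac_auc, best_alt_auc)
print(f"t = {t_stat:.3f}, one-tailed p = {p_value:.4f}")
```

The integration loses precision for very small p-values, because the tail is obtained by subtracting from 0.5. In practice a library routine such as SciPy's `scipy.stats.ttest_rel(a, b, alternative="greater")` would likely be the better choice.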