Author manuscript; available in PMC: 2012 May 23.
Published in final edited form as: Data Min Knowl Discov. 2011 Sep 8;25(1):109–133. doi: 10.1007/s10618-011-0234-x

Table 3.

Average test set area under the ROC curve (AUC) on UCI classification data sets that have been modified to be semi-supervised anomaly detection tasks, for four anomaly detection methods: one-class support vector machines (One-class SVM), local outlier factor (LOF), cross-feature analysis (CFA), and our approach, feature regression and classification (FRaC).

| Data set | Estimated upper-bound | One-class SVM | LOF | CFA | FRaC | p-value |
|---|---|---|---|---|---|---|
| abalone | 0.64 | 0.58 | 0.51 | 0.39 | 0.48 | |
| acute | 1.00 | 0.97 | 0.84 | 1.00 | 1.00 | |
| adult | 0.88 | 0.61 | 0.47 | 0.53 | 0.61 | |
| annealing | 0.95 | 0.63 | 0.85 | 0.80 | 0.82 | |
| arrhythmia | 0.77 | 0.62 | 0.72 | 0.65 | 0.78 | 9.04e-13 |
| audiology | 0.95 | 0.77 | 0.77 | 0.78 | 0.80 | 0.00152 |
| balance-scale | 0.99 | 0.96 | 0.95 | 0.94 | 0.97 | 0.00022 |
| blood-transfusion | 0.62 | 0.57 | 0.57 | 0.51 | 0.59 | 0.000306 |
| breast-cancer-wisconsin | 0.93 | 0.51 | 0.93 | 0.30 | 0.96 | 5.42e-15 |
| car | 0.99 | 0.98 | 0.91 | 0.97 | 0.97 | |
| chess | 1.00 | 0.71 | 0.89 | 0.94 | 0.93 | |
| cmc | 0.75 | 0.45 | 0.45 | 0.42 | 0.41 | |
| connect-4 | 0.87 | 0.53 | 0.85 | 0.76 | 0.65 | |
| credit-screening | 0.88 | 0.69 | 0.73 | 0.84 | 0.85 | 0.822 |
| cylinder-bands | 0.81 | 0.81 | 0.68 | 0.82 | 0.69 | |
| dermatology | 1.00 | 0.96 | 0.99 | 0.86 | 1.00 | 2.15e-08 |
| echocardiogram | 0.77 | 0.66 | 0.65 | 0.58 | 0.67 | 0.090 |
| ecoli | 0.99 | 0.98 | 0.98 | 0.50 | 0.97 | |
| glass | 0.79 | 0.61 | 0.70 | 0.64 | 0.65 | |
| haberman | 0.66 | 0.67 | 0.64 | 0.67 | 0.67 | |
| hayes-roth | 0.92 | 0.87 | 0.68 | 0.89 | 0.92 | 1.43e-06 |
| hepatitis | 0.84 | 0.58 | 0.46 | 0.81 | 0.83 | 0.108 |
| horse-colic | 0.81 | 0.68 | 0.65 | 0.71 | 0.82 | 1.27e-13 |
| image | 1.00 | 0.83 | 0.94 | 0.66 | 0.98 | 7.62e-08 |
| internet_ads | 0.92 | 0.79 | 0.79 | 0.92 | 0.94 | 0.00459 |
| ionosphere | 0.98 | 0.84 | 0.91 | 0.87 | 0.97 | 7.67e-18 |
| iris | 1.00 | 1.00 | 1.00 | 0.97 | 1.00 | |
| letter-recognition | 1.00 | 0.99 | 0.99 | 1.00 | 1.00 | |
| libras | 0.95 | 0.63 | 0.69 | 0.81 | 0.89 | 4.9e-05 |
| magic | 0.88 | 0.78 | 0.81 | 0.60 | 0.83 | 5.22e-14 |
| mammographic-masses | 0.89 | 0.74 | 0.58 | 0.72 | 0.73 | |
| mushroom | 1.00 | 0.95 | 1.00 | 1.00 | 1.00 | |
| nursery | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | |
| ozone | 0.88 | 0.39 | 0.48 | 0.43 | 0.34 | |
| page-blocks | 0.94 | 0.57 | 0.95 | 0.72 | 0.89 | |
| parkinsons | 0.89 | 0.75 | 0.67 | 0.45 | 0.64 | |
| pima-indians-diabetes | 0.83 | 0.65 | 0.71 | 0.50 | 0.75 | 3.74e-14 |
| poker | 0.54 | 0.53 | 0.51 | 0.55 | 0.56 | 4.53e-06 |
| secom | 0.53 | 0.50 | 0.53 | 0.52 | 0.57 | 8.95e-13 |
| spambase | 0.94 | 0.78 | 0.58 | 0.82 | 0.84 | 1.15e-11 |
| statlog | 0.69 | 0.55 | 0.58 | 0.58 | 0.63 | 5.3e-14 |
| tae | 0.66 | 0.51 | 0.35 | 0.54 | 0.55 | 0.557 |
| tic-tac-toe | 0.98 | 0.85 | 0.98 | 1.00 | 0.99 | |
| voting-records | 0.99 | 0.98 | 0.83 | 0.77 | 0.95 | |
| wine | 0.99 | 0.79 | 0.88 | 0.33 | 0.96 | 4.93e-14 |
| yeast | 0.73 | 0.69 | 0.72 | 0.56 | 0.72 | |
| zoo | 1.00 | 0.98 | 1.00 | 1.00 | 1.00 | |
| Number of data sets with the maximum AUC score | | 5 | 5 | 3 | 23 | |

The highest average AUC score among the anomaly detection methods is shown in bold. On data sets where FRaC has the best AUC, we report the p-value of a paired, one-tailed t-test comparing FRaC to the alternative method with the best AUC. p-values under 0.05 are shown in bold. The estimated upper-bound reports the average AUC score for the corresponding supervised classification task, i.e., when training on both “normal” and “anomalous” examples.
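
To make the two quantities in this table concrete, the following is a minimal sketch of how an AUC score and a paired, one-tailed t-test p-value could be computed. It is an illustration under stated assumptions, not the authors' code: the function names (`auc`, `t_sf`, `paired_one_tailed_t`) are hypothetical, the per-run AUC values in the usage example are invented, and the table does not say how many runs were paired. The sketch uses only the Python standard library; with SciPy available, `scipy.stats.ttest_rel(a, b, alternative="greater")` would likely be the more usual route.

```python
import math
from statistics import mean, stdev


def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (anomalous, normal) pairs ranked correctly; ties count as half.
    Assumed convention: label 1 = anomalous, 0 = normal, and a higher
    score means "more anomalous"."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def t_sf(t, df, steps=2000):
    """Upper-tail probability P(T > t) of Student's t with df degrees of freedom.
    Substituting x = sqrt(df) * tan(theta) turns the tail integral into
    c * integral of cos(theta)**(df - 1) over [atan(t / sqrt(df)), pi / 2],
    evaluated here with Simpson's rule (steps must be even)."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)
    lo, hi = math.atan(t / math.sqrt(df)), math.pi / 2
    h = (hi - lo) / steps

    def f(theta):
        return math.cos(theta) ** (df - 1)

    total = f(lo) + f(hi)
    total += sum((4 if i % 2 else 2) * f(lo + i * h) for i in range(1, steps))
    return c * total * h / 3


def paired_one_tailed_t(a, b):
    """Paired t-test of H1: mean(a) > mean(b). Returns (t statistic, p-value).
    Assumes the paired differences are not all identical (nonzero std. dev.)."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    t = mean(d) / (stdev(d) / math.sqrt(n))
    return t, t_sf(t, n - 1)


# Toy AUC example: one of the four (anomalous, normal) pairs is mis-ranked.
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75

# Hypothetical per-run AUCs for two methods on the same runs (not from the paper).
method_a = [0.80, 0.78, 0.82, 0.79, 0.81]
method_b = [0.70, 0.72, 0.69, 0.71, 0.73]
t_stat, p_value = paired_one_tailed_t(method_a, method_b)
print(t_stat, p_value)
```

The test is paired because both methods are evaluated on the same runs, so only the per-run differences enter the statistic. It is one-tailed because the question asked is whether the first method is better, not merely different.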