Author manuscript; available in PMC: 2012 May 23.

Published in final edited form as: Data Min Knowl Discov. 2011 Sep 8;25(1):109–133. doi: 10.1007/s10618-011-0234-x

Table 3.

Average test set area under the ROC curve (AUC) on UCI classification data sets that have been modified to be semi-supervised anomaly detection tasks, for four anomaly detection methods: one-class support vector machines (One-class SVM), local outlier factor (LOF), cross-feature analysis (CFA), and our approach, feature regression and classification (FRaC).

Data set	Estimated upper-bound	One-class SVM	LOF	CFA	FRaC	p-value
abalone	0.64	0.58	0.51	0.39	0.48
acute	1.00	0.97	0.84	1.00	1.00
adult	0.88	0.61	0.47	0.53	0.61
annealing	0.95	0.63	0.85	0.80	0.82
arrhythmia	0.77	0.62	0.72	0.65	0.78	9.04e-13
audiology	0.95	0.77	0.77	0.78	0.80	0.00152
balance-scale	0.99	0.96	0.95	0.94	0.97	0.00022
blood-transfusion	0.62	0.57	0.57	0.51	0.59	0.000306
breast-cancer-wisconsin	0.93	0.51	0.93	0.30	0.96	5.42e-15
car	0.99	0.98	0.91	0.97	0.97
chess	1.00	0.71	0.89	0.94	0.93
cmc	0.75	0.45	0.45	0.42	0.41
connect-4	0.87	0.53	0.85	0.76	0.65
credit-screening	0.88	0.69	0.73	0.84	0.85	0.822
cylinder-bands	0.81	0.81	0.68	0.82	0.69
dermatology	1.00	0.96	0.99	0.86	1.00	2.15e-08
echocardiogram	0.77	0.66	0.65	0.58	0.67	0.090
ecoli	0.99	0.98	0.98	0.50	0.97
glass	0.79	0.61	0.70	0.64	0.65
haberman	0.66	0.67	0.64	0.67	0.67
hayes-roth	0.92	0.87	0.68	0.89	0.92	1.43e-06
hepatitis	0.84	0.58	0.46	0.81	0.83	0.108
horse-colic	0.81	0.68	0.65	0.71	0.82	1.27e-13
image	1.00	0.83	0.94	0.66	0.98	7.62e-08
internet_ads	0.92	0.79	0.79	0.92	0.94	0.00459
ionosphere	0.98	0.84	0.91	0.87	0.97	7.67e-18
iris	1.00	1.00	1.00	0.97	1.00
letter-recognition	1.00	0.99	0.99	1.00	1.00
libras	0.95	0.63	0.69	0.81	0.89	4.9e-05
magic	0.88	0.78	0.81	0.60	0.83	5.22e-14
mammographic-masses	0.89	0.74	0.58	0.72	0.73
mushroom	1.00	0.95	1.00	1.00	1.00
nursery	1.00	1.00	1.00	1.00	1.00
ozone	0.88	0.39	0.48	0.43	0.34
page-blocks	0.94	0.57	0.95	0.72	0.89
parkinsons	0.89	0.75	0.67	0.45	0.64
pima-indians-diabetes	0.83	0.65	0.71	0.50	0.75	3.74e-14
poker	0.54	0.53	0.51	0.55	0.56	4.53e-06
secom	0.53	0.50	0.53	0.52	0.57	8.95e-13
spambase	0.94	0.78	0.58	0.82	0.84	1.15e-11
statlog	0.69	0.55	0.58	0.58	0.63	5.3e-14
tae	0.66	0.51	0.35	0.54	0.55	0.557
tic-tac-toe	0.98	0.85	0.98	1.00	0.99
voting-records	0.99	0.98	0.83	0.77	0.95
wine	0.99	0.79	0.88	0.33	0.96	4.93e-14
yeast	0.73	0.69	0.72	0.56	0.72
zoo	1.00	0.98	1.00	1.00	1.00
Number of data sets with the maximum AUC score		5	5	3	23

The highest average AUC score among the anomaly detection methods is shown in bold. On data sets where FRaC has the best AUC, we report the p-value of a paired, one-tailed t-test comparing FRaC to the alternative method with the best AUC. p-values under 0.05 are shown in bold. The estimated upper-bound column reports the average AUC score for the corresponding supervised classification task, i.e., when training on both “normal” and “anomalous” examples.
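
The two quantities behind this table, a per-run AUC and a paired comparison of two methods' AUCs, can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the example scores, and the number of replicates are all hypothetical, and the sketch assumes that higher scores mean "more anomalous" and that the t-test pairs AUC values obtained on the same train/test replicates.

```python
import math
from statistics import mean, stdev


def auc(normal_scores, anomaly_scores):
    """Rank-based AUC: the fraction of (anomaly, normal) pairs in which the
    anomaly receives the higher anomaly score; ties count as half a win."""
    wins = 0.0
    for a in anomaly_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(anomaly_scores) * len(normal_scores))


def paired_t_statistic(auc_a, auc_b):
    """Return (t, degrees of freedom) for a paired t-test on per-replicate AUCs.

    A large positive t supports the one-tailed alternative
    mean(auc_a) > mean(auc_b).
    """
    if len(auc_a) != len(auc_b):
        raise ValueError("paired test needs one AUC per replicate for each method")
    diffs = [a - b for a, b in zip(auc_a, auc_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two replicates")
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


# Hypothetical anomaly scores for one test set (not taken from the paper).
example_auc = auc(normal_scores=[0.1, 0.2, 0.3], anomaly_scores=[0.25, 0.9])

# Hypothetical per-replicate AUCs for two methods on the same replicates.
method_a = [0.80, 0.82, 0.78, 0.85]
method_b = [0.75, 0.80, 0.74, 0.79]
t_stat, dof = paired_t_statistic(method_a, method_b)
```

The one-tailed p-value is then the upper-tail probability of Student's t distribution with `dof` degrees of freedom evaluated at `t_stat`; a statistics library can supply it (for instance `scipy.stats.t.sf(t_stat, dof)`, if SciPy is available).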