Author manuscript; available in PMC: 2012 May 23.
Published in final edited form as: Data Min Knowl Discov. 2011 Sep 8;25(1):109–133. doi: 10.1007/s10618-011-0234-x

Table 1.

The 47 UCI data sets we use to evaluate our anomaly detection methods. All data sets are publicly available.

Data set                  Examples  Features
abalone                      4,177         8
acute                          120         6
adult                       32,561        14
annealing                      798        18
arrhythmia                     452       266
audiology                      200        62
balance-scale                  625         4
blood-transfusion              748         4
breast-cancer-wisconsin        569        31
car                          1,728         6
chess                        3,196        36
cmc                          1,473         9
connect-4                   67,557        42
credit-screening               690        15
cylinder-bands                 540        37
dermatology                    366        34
echocardiogram                 132         7
ecoli                          336         7
glass                          214         9
haberman                       306         3
hayes-roth                     132         4
hepatitis                      155        19
horse-colic                    300        27
image                          210        18
internet_ads                 3,279     1,558
ionosphere                     351        33
iris                           150         4
letter-recognition          20,000        16
libras                         360        90
magic                       19,020        10
mammographic-masses            961         5
mushroom                     8,124        21
nursery                     12,960         8
ozone                        2,536        72
page-blocks                  5,473        10
parkinsons                     195        22
pima-indians-diabetes          768         8
poker                       25,010        10
secom                        1,567       474
spambase                     4,601        57
statlog                      1,000        20
tae                            151         5
tic-tac-toe                    958         9
voting-records                 435        16
wine                           178        13
yeast                        1,484         8
zoo                            101        16