Table 1.
The 47 UCI data sets we use to evaluate our anomaly detection methods. All data sets are publicly available.
| Data set | Examples | Features |
|---|---|---|
| abalone | 4,177 | 8 |
| acute | 120 | 6 |
| adult | 32,561 | 14 |
| annealing | 798 | 18 |
| arrhythmia | 452 | 266 |
| audiology | 200 | 62 |
| balance-scale | 625 | 4 |
| blood-transfusion | 748 | 4 |
| breast-cancer-wisconsin | 569 | 31 |
| car | 1,728 | 6 |
| chess | 3,196 | 36 |
| cmc | 1,473 | 9 |
| connect-4 | 67,557 | 42 |
| credit-screening | 690 | 15 |
| cylinder-bands | 540 | 37 |
| dermatology | 366 | 34 |
| echocardiogram | 132 | 7 |
| ecoli | 336 | 7 |
| glass | 214 | 9 |
| haberman | 306 | 3 |
| hayes-roth | 132 | 4 |
| hepatitis | 155 | 19 |
| horse-colic | 300 | 27 |
| image | 210 | 18 |
| internet_ads | 3,279 | 1,558 |
| ionosphere | 351 | 33 |
| iris | 150 | 4 |
| letter-recognition | 20,000 | 16 |
| libras | 360 | 90 |
| magic | 19,020 | 10 |
| mammographic-masses | 961 | 5 |
| mushroom | 8,124 | 21 |
| nursery | 12,960 | 8 |
| ozone | 2,536 | 72 |
| page-blocks | 5,473 | 10 |
| parkinsons | 195 | 22 |
| pima-indians-diabetes | 768 | 8 |
| poker | 25,010 | 10 |
| secom | 1,567 | 474 |
| spambase | 4,601 | 57 |
| statlog | 1,000 | 20 |
| tae | 151 | 5 |
| tic-tac-toe | 958 | 9 |
| voting-records | 435 | 16 |
| wine | 178 | 13 |
| yeast | 1,484 | 8 |
| zoo | 101 | 16 |