TABLE 5. Statistics of data sets.
Data set | Type | #instance | #feature | #cluster | #outlier |
---|---|---|---|---|---|
Amazon(A) | Image | 958 | 800 | 10 | 0 |
breast | Tabular | 699 | 9 | 2 | 0 |
Caltech(C) | Image | 1123 | 800 | 10 | 0 |
caltech | Image | 1415 | 4096 | 4 | 67 |
cacmcisi | Text | 4463 | 14409 | 2 | 0 |
classic | Text | 7094 | 41681 | 4 | 0 |
cranmed | Text | 2431 | 41681 | 2 | 0 |
Dslr(D) | Image | 157 | 800 | 1 | 0 |
ecoli * | Tabular | 331 | 7 | 6 | 0 |
elephant | Image | #superpixel | 3 | 2 | 0 |
fbis | Text | 2463 | 2000 | 10 | 332 |
ferrari | Image | #superpixel | 3 | 2 | 0 |
gymnastics | Image | #superpixel | 3 | 2 | 0 |
hitech | Text | 2301 | 126321 | 6 | 0 |
iris | Tabular | 150 | 4 | 3 | 0 |
k1b | Text | 2340 | 21839 | 6 | 0 |
kite | Image | #superpixel | 3 | 2 | 0 |
la12 | Text | 6279 | 31472 | 6 | 0 |
mm | Text | 2521 | 126373 | 2 | 0 |
pendigits | Tabular | 10992 | 16 | 10 | 0 |
satimage | Tabular | 4435 | 36 | 6 | 0 |
skating | Image | #superpixel | 3 | 2 | 0 |
re1 | Text | 1657 | 3758 | 6 | 527 |
reviews | Text | 4069 | 126373 | 5 | 0 |
Webcam(W) | Image | 295 | 800 | 10 | 0 |
wap | Text | 1560 | 8460 | 10 | 251 |
wine + | Tabular | 178 | 13 | 3 | 0 |
yeast | Tabular | 1484 | 8 | 4 | 185 |
Note:
* Two clusters, each containing only two objects, are deleted.
+ The last attribute is normalized by a scaling factor of 1000.
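The two preprocessing notes above can be illustrated with a minimal sketch. It assumes that "normalized by a scaling factor" means dividing the last attribute by 1000, and that deleting a cluster removes all of its instances; the function names, the `min_size` parameter, and the toy values are hypothetical and do not come from the original experiments.

```python
from collections import Counter


def drop_small_clusters(rows, labels, min_size=3):
    """Remove every instance whose cluster has fewer than min_size members."""
    counts = Counter(labels)
    kept = [(r, l) for r, l in zip(rows, labels) if counts[l] >= min_size]
    return [r for r, _ in kept], [l for _, l in kept]


def scale_last_attribute(rows, factor=1000.0):
    """Divide the last attribute of each row by factor (assumed meaning of 'normalized')."""
    return [list(r[:-1]) + [r[-1] / factor] for r in rows]


# Toy data (hypothetical): cluster "b" has only two objects and is dropped.
rows = [[0.1, 1065.0], [0.2, 1050.0], [0.3, 1185.0], [0.4, 520.0], [0.5, 680.0]]
labels = ["a", "a", "a", "b", "b"]

kept_rows, kept_labels = drop_small_clusters(rows, labels, min_size=3)
scaled_rows = scale_last_attribute(kept_rows)
```

Dividing by a fixed constant keeps the relative ordering of the attribute while bringing it to a range comparable with the other features, which matters for distance-based clustering.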