Author manuscript; available in PMC: 2024 Jul 1.
Published in final edited form as: IEEE Trans Pattern Anal Mach Intell. 2023 Jun 5;45(7):9149–9168. doi: 10.1109/TPAMI.2023.3237667

TABLE 5.

Statistics of data sets.

Data set     Type      #instance     #feature   #cluster   #outlier
Amazon(A)    Image     958           800        10         0
breast       Tabular   699           9          2          0
Caltech(C)   Image     1123          800        10         0
caltech      Image     1415          4096       4          67
cacmcisi     Text      4463          14409      2          0
classic      Text      7094          41681      4          0
cranmed      Text      2431          41681      2          0
Dslr(D)      Image     157           800        1          0
ecoli *      Tabular   331           7          6          0
elephant     Image     #superpixel   3          2          0
fbis         Text      2463          2000       10         332
ferrari      Image     #superpixel   3          2          0
gymnastics   Image     #superpixel   3          2          0
hitech       Text      2301          126321     6          0
iris         Tabular   150           4          3          0
k1b          Text      2340          21839      6          0
kite         Image     #superpixel   3          2          0
la12         Text      6279          31472      6          0
mm           Text      2521          126373     2          0
pendigits    Tabular   10992         16         10         0
satimage     Tabular   4435          36         6          0
skating      Image     #superpixel   3          2          0
re1          Text      1657          3758       6          527
reviews      Text      4069          126373     5          0
Webcam(W)    Image     295           800        10         0
wap          Text      1560          8460       10         251
wine +       Tabular   178           13         3          0
yeast        Tabular   1484          8          4          185

Note:

(1) *: two clusters containing only two objects are deleted.

(2) +: the last attribute is normalized by a scaling factor of 1000.
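
The two preprocessing steps in the note can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the helper names `drop_small_clusters` and `scale_last_attribute` are hypothetical, "normalized by a scaling factor" is assumed to mean division by 1000, and the toy rows are made up rather than taken from ecoli or wine.

```python
from collections import Counter


def drop_small_clusters(rows, labels, min_size=3):
    """Remove objects whose cluster has fewer than `min_size` members.

    With min_size=3, clusters containing only one or two objects are deleted
    (an assumed reading of note (1)).
    """
    counts = Counter(labels)
    kept = [(r, y) for r, y in zip(rows, labels) if counts[y] >= min_size]
    return [r for r, _ in kept], [y for _, y in kept]


def scale_last_attribute(rows, factor=1000.0):
    """Divide the last attribute of every row by `factor`.

    Division is an assumption; the note only says "normalized by a
    scaling factor".
    """
    return [list(r[:-1]) + [r[-1] / factor] for r in rows]


# Toy data, invented for illustration only.
rows = [[0.1, 1065.0], [0.2, 1050.0], [0.3, 735.0], [0.4, 520.0], [0.5, 680.0]]
labels = ["a", "a", "a", "b", "b"]

kept_rows, kept_labels = drop_small_clusters(rows, labels)  # cluster "b" is dropped
scaled_rows = scale_last_attribute(kept_rows)               # last column now near 1.0
```

Scaling the last attribute brings it closer to the range of the other features, so that it does not dominate distance computations during clustering.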