2017 Aug 29;114(37):9814–9819. doi: 10.1073/pnas.1700770114

Table 1.

Datasets used in experiments

Name               Instances  Dimensions  Classes  Imbalance
MNIST (41)            70,000         784       10          1
Coil-100 (45)          7,200      49,152      100          1
YaleB (43)             2,414      32,256       38          1
YTF (44)              10,036       9,075       40         13
Reuters-21578          9,082       2,000       50        785
RCV1 (38)             10,000       2,000        4          6
Pendigits (42)        10,992          16       10          1
Shuttle               58,000           9        7      4,558
Mice Protein (39)      1,077          77        8          1

For each dataset, the table shows the number of instances, the number of dimensions, the number of ground-truth clusters (classes), and the imbalance, defined as the ratio of the cardinality of the largest ground-truth cluster to that of the smallest.
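The imbalance column can be reproduced from a list of ground-truth labels. The following is a minimal sketch, not the authors' code: the function name `imbalance` and the toy labels are hypothetical and serve only to illustrate the ratio defined in the caption.

```python
from collections import Counter


def imbalance(labels):
    """Return the ratio of the largest to the smallest ground-truth cluster size.

    `labels` is any iterable of hashable class labels, one per instance.
    """
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must be non-empty")
    sizes = counts.values()
    return max(sizes) / min(sizes)


# Toy labels (hypothetical, not drawn from any dataset in Table 1):
# three clusters of sizes 6, 3, and 2.
toy_labels = ["a"] * 6 + ["b"] * 3 + ["c"] * 2
print(imbalance(toy_labels))  # 3.0
```

A perfectly balanced dataset yields a value of 1. The table reports whole numbers, so the published values appear to be rounded.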