2017 Aug 29;114(37):9814–9819. doi: 10.1073/pnas.1700770114

Table 2.

Accuracy of all algorithms on all datasets, measured by AMI

| Dataset | k-means++ | GMM | Fuzzy | MS | AC-C | AC-W | N-Cuts | AP | Zell | SEC | LDMGI | GDL | PIC | RCC | RCC-DR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MNIST | 0.500 | 0.404 | 0.386 | 0.264 | NA | 0.679 | NA | 0.478 | NA | 0.469 | 0.761 | NA | NA | **0.893** | 0.828 |
| COIL-100 | 0.803 | 0.786 | 0.796 | 0.685 | 0.703 | 0.853 | 0.871 | 0.761 | 0.958 | 0.849 | 0.888 | 0.958 | **0.965** | 0.957 | 0.957 |
| YTF | 0.783 | 0.793 | 0.769 | 0.831 | 0.673 | 0.801 | 0.752 | 0.751 | 0.273 | 0.754 | 0.518 | 0.655 | 0.676 | 0.836 | **0.874** |
| YaleB | 0.615 | 0.591 | 0.066 | 0.091 | 0.445 | 0.767 | 0.928 | 0.700 | 0.905 | 0.849 | 0.945 | 0.924 | 0.941 | **0.975** | 0.974 |
| Reuters | 0.516 | 0.507 | 0.272 | 0.000 | 0.368 | 0.471 | 0.545 | 0.386 | 0.087 | 0.498 | 0.523 | 0.401 | 0.057 | **0.556** | 0.553 |
| RCV1 | 0.355 | 0.344 | 0.205 | 0.000 | 0.108 | 0.364 | 0.140 | 0.313 | 0.023 | 0.069 | 0.382 | 0.020 | 0.015 | 0.138 | **0.442** |
| Pendigits | 0.679 | 0.695 | 0.695 | 0.694 | 0.525 | 0.728 | 0.813 | 0.639 | 0.317 | 0.741 | 0.775 | 0.330 | 0.467 | 0.848 | **0.854** |
| Shuttle | 0.215 | 0.266 | 0.204 | 0.362 | NA | 0.291 | 0.000 | 0.322 | NA | 0.305 | **0.591** | NA | NA | 0.488 | 0.513 |
| Mice Protein | 0.425 | 0.385 | 0.417 | 0.534 | 0.315 | 0.525 | 0.536 | 0.554 | 0.428 | 0.537 | 0.527 | 0.400 | 0.394 | **0.649** | 0.638 |
| Rank | 7.8 | 8.6 | 9.9 | 9.9 | 12.4 | 6.3 | 6.3 | 8.1 | 10.4 | 7.2 | 4.9 | 9.9 | 10 | 2.4 | 1.6 |

For each dataset, the maximum AMI is highlighted in bold. Some prior algorithms did not scale to large datasets such as MNIST (70,000 data points in 784 dimensions). RCC or RCC-DR achieves the highest accuracy on seven of the nine datasets. RCC-DR achieves the highest or second-highest accuracy on eight of the nine datasets. The average rank of RCC-DR across datasets is lower by a multiplicative factor of 3 or more than the average rank of any prior algorithm. NA, not applicable.
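
The counts in the legend can be checked directly against the tabulated values. The following is a minimal sketch, not the authors' evaluation code: the AMI values and average ranks are transcribed from Table 2, while the helper names (`winners`, `n_better`) and the reading of "highest or second-highest" as "at most one algorithm scores strictly higher" are assumptions made for illustration. NA entries are represented as `None` and ignored in comparisons.

```python
# Column order follows Table 2.
ALGOS = ["k-means++", "GMM", "Fuzzy", "MS", "AC-C", "AC-W", "N-Cuts", "AP",
         "Zell", "SEC", "LDMGI", "GDL", "PIC", "RCC", "RCC-DR"]
NA = None  # entry reported as NA in the table

# AMI per dataset, transcribed from Table 2.
AMI = {
    "MNIST":        [0.500, 0.404, 0.386, 0.264, NA, 0.679, NA, 0.478, NA, 0.469, 0.761, NA, NA, 0.893, 0.828],
    "COIL-100":     [0.803, 0.786, 0.796, 0.685, 0.703, 0.853, 0.871, 0.761, 0.958, 0.849, 0.888, 0.958, 0.965, 0.957, 0.957],
    "YTF":          [0.783, 0.793, 0.769, 0.831, 0.673, 0.801, 0.752, 0.751, 0.273, 0.754, 0.518, 0.655, 0.676, 0.836, 0.874],
    "YaleB":        [0.615, 0.591, 0.066, 0.091, 0.445, 0.767, 0.928, 0.700, 0.905, 0.849, 0.945, 0.924, 0.941, 0.975, 0.974],
    "Reuters":      [0.516, 0.507, 0.272, 0.000, 0.368, 0.471, 0.545, 0.386, 0.087, 0.498, 0.523, 0.401, 0.057, 0.556, 0.553],
    "RCV1":         [0.355, 0.344, 0.205, 0.000, 0.108, 0.364, 0.140, 0.313, 0.023, 0.069, 0.382, 0.020, 0.015, 0.138, 0.442],
    "Pendigits":    [0.679, 0.695, 0.695, 0.694, 0.525, 0.728, 0.813, 0.639, 0.317, 0.741, 0.775, 0.330, 0.467, 0.848, 0.854],
    "Shuttle":      [0.215, 0.266, 0.204, 0.362, NA, 0.291, 0.000, 0.322, NA, 0.305, 0.591, NA, NA, 0.488, 0.513],
    "Mice Protein": [0.425, 0.385, 0.417, 0.534, 0.315, 0.525, 0.536, 0.554, 0.428, 0.537, 0.527, 0.400, 0.394, 0.649, 0.638],
}

# Average rank per algorithm, transcribed from the "Rank" row.
RANK = [7.8, 8.6, 9.9, 9.9, 12.4, 6.3, 6.3, 8.1, 10.4, 7.2, 4.9, 9.9, 10, 2.4, 1.6]

OURS = {"RCC", "RCC-DR"}


def winners(row):
    """Return the algorithms that attain the maximum AMI in one table row."""
    best = max(v for v in row if v is not None)
    return [a for a, v in zip(ALGOS, row) if v == best]


def n_better(row, algo):
    """Count algorithms whose AMI is strictly higher than `algo` in this row."""
    own = row[ALGOS.index(algo)]
    return sum(1 for v in row if v is not None and v > own)


# Datasets on which RCC or RCC-DR attains the maximum AMI.
rcc_wins = sum(1 for row in AMI.values() if OURS & set(winners(row)))

# Datasets on which at most one algorithm beats RCC-DR (assumed reading
# of "highest or second-highest").
rccdr_top2 = sum(1 for row in AMI.values() if n_better(row, "RCC-DR") <= 1)

# Ratio between the best average rank of a prior algorithm and RCC-DR's.
best_prior_rank = min(r for a, r in zip(ALGOS, RANK) if a not in OURS)
rank_ratio = best_prior_rank / RANK[ALGOS.index("RCC-DR")]

print("RCC or RCC-DR best on", rcc_wins, "of", len(AMI), "datasets")
print("RCC-DR in top two on", rccdr_top2, "of", len(AMI), "datasets")
print("best prior rank / RCC-DR rank =", round(rank_ratio, 2))
```

The sketch does not attempt to recompute the Rank row itself, since the legend does not state how NA entries and ties were ranked.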