Table 3.
Measures of clustering quality. The measures are assigned to three groups based on principal component analysis (see Results section).
Measure | Brief description | Range | Group |
---|---|---|---|
Adjusted mutual information (AMI) | A variation of mutual information between two clustering partitions, adjusted for the effect of chance agreements between partitions | 0 to 1 | 1 |
Adjusted Rand index (ARI) | A variation of the Rand index as a measure of the percentage of correct matches, adjusted for the effect of chance agreements between partitions | 0 to 1 | 1 |
F-measure | A measure of accuracy that balances precision and recall | 0 to 1 | 1 |
Variation of information (VI) | An information-based measure that behaves like a true distance, with zero representing equality of the two partitions | 0 to infinity | 1 |
Homogeneity | Entropy-based measure that quantifies whether only those data points that are members of the same class are assigned to the same cluster | 0 to 1 | 2 |
Majority | Proportion of the data in the largest cluster | 0 to 1 | 2 |
Silhouette | Clustering fitness that measures whether each data point belongs unambiguously to the cluster to which it has been assigned | −1 to 1 | 3 |
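
To make three of the definitions in the table concrete, the following is a minimal sketch in plain Python, not the implementation used in this study. It assumes the usual entropy-based definitions: homogeneity as 1 − H(class | cluster) / H(class), and variation of information as H(A | B) + H(B | A), both with natural logarithms. The function names (`entropy`, `conditional_entropy`, `majority`, `homogeneity`, `variation_of_information`) and the example labels are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """Conditional entropy H(target | given) in nats."""
    n = len(target)
    joint = Counter(zip(given, target))   # counts of (given, target) pairs
    given_counts = Counter(given)         # marginal counts of `given`
    return -sum(
        (c / n) * log(c / given_counts[g]) for (g, _), c in joint.items()
    )


def majority(clusters):
    """Proportion of the data that falls in the largest cluster."""
    return max(Counter(clusters).values()) / len(clusters)


def homogeneity(classes, clusters):
    """1 - H(class | cluster) / H(class); equals 1 when every cluster is pure."""
    h_class = entropy(classes)
    if h_class == 0:
        # Assumption: a single reference class is treated as perfectly homogeneous.
        return 1.0
    return 1.0 - conditional_entropy(classes, clusters) / h_class


def variation_of_information(a, b):
    """VI(a, b) = H(a | b) + H(b | a); zero when the partitions are identical."""
    return conditional_entropy(a, b) + conditional_entropy(b, a)


if __name__ == "__main__":
    # Hypothetical labels: six points, two reference classes, two clusters.
    classes = [0, 0, 0, 1, 1, 1]
    clusters = [0, 0, 1, 1, 1, 1]
    print("majority:", majority(clusters))
    print("homogeneity:", homogeneity(classes, clusters))
    print("VI:", variation_of_information(classes, clusters))
```

Silhouette is omitted from the sketch because it depends on distances between data points rather than on cluster labels alone.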