Example ranking of the 15 clustering algorithms based on eight different combinations of the three metrics used to generate the summary quality score. For each dataset-algorithm pair, the three representative measures (e.g. AMI, homogeneity and silhouette) were converted into quantile values based on their three respective data distributions. The median of the three quantile-normalized measures was then computed for each dataset-algorithm pair and is shown in the heatmap using the color-coded scale. The heatmap rows are ranked by their per-row median values, with the best-performing algorithms at the top. The heatmap also shows that the datasets differ substantially in clustering quality: for example, most algorithms perform better on the Glioblastoma dataset and worse on the Melanoma dataset.
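
The scoring procedure described in this caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: it assumes that "quantile value" means the empirical quantile of a measure within the distribution of that measure across all dataset-algorithm pairs, and that the per-row value is the median across datasets. The function names (`to_quantiles`, `rank_algorithms`), the dataset and algorithm labels, and all numbers are hypothetical.

```python
from statistics import median


def to_quantiles(values):
    """Map each value to its empirical quantile (fraction of values <= it)."""
    n = len(values)
    return [sum(v <= x for v in values) / n for x in values]


def rank_algorithms(scores):
    """Rank algorithms from {(dataset, algorithm): (m1, m2, m3)} raw measures.

    Returns the per-pair heatmap values and the algorithms ordered best first.
    """
    keys = list(scores)
    n_metrics = len(scores[keys[0]])

    # Quantile-normalize each measure over all dataset-algorithm pairs.
    per_metric = [
        to_quantiles([scores[k][j] for k in keys]) for j in range(n_metrics)
    ]

    # Heatmap cell: median of the quantile-normalized measures for that pair.
    cell = {k: median(q[i] for q in per_metric) for i, k in enumerate(keys)}

    # Row score: median of an algorithm's cells across datasets (assumption).
    algorithms = sorted({alg for _, alg in keys})
    row = {
        a: median(v for (_, alg), v in cell.items() if alg == a)
        for a in algorithms
    }

    # Best-performing algorithms first, as in the heatmap.
    ranking = sorted(algorithms, key=lambda a: row[a], reverse=True)
    return cell, ranking


# Hypothetical measures for two datasets (D1, D2) and three algorithms (A, B, C).
scores = {
    ("D1", "A"): (0.9, 0.8, 0.7),
    ("D1", "B"): (0.5, 0.6, 0.4),
    ("D1", "C"): (0.2, 0.3, 0.1),
    ("D2", "A"): (0.8, 0.7, 0.6),
    ("D2", "B"): (0.4, 0.5, 0.3),
    ("D2", "C"): (0.1, 0.2, 0.0),
}

cell, ranking = rank_algorithms(scores)
print(ranking)  # ['A', 'B', 'C']
```

Using the median rather than the mean at both steps keeps a single extreme measure or a single unusually easy or hard dataset from dominating an algorithm's position in the ranking.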