Author manuscript; available in PMC: 2020 Aug 10.
Published in final edited form as: Cytometry A. 2020 Jun 30;97(8):782–799. doi: 10.1002/cyto.a.24158

Figure 4 – Comparison of clustering results using SPADE, K-means clustering, PhenoGraph, and FlowSOM.

10,000 cells were subsampled from each of 3 B-cell progenitor acute lymphoblastic leukemia (BCP-ALL) patient samples analyzed by mass cytometry. Data were obtained from the GitHub repository of Good et al. (2018) (ref. 96) and analyzed using R implementations of SPADE, k-means clustering, PhenoGraph, and FlowSOM. PhenoGraph automatically detected 4 clusters, so this number of clusters was specified for the 3 remaining algorithms to allow the results to be compared; otherwise, default parameters were used. Contour plots are embedded within UMAP axes computed using all 30,000 subsampled cells, and within each panel every cluster identified by that algorithm is shown in a unique color. For all clustering methods, the markers used for clustering were CD19, CD20, CD24, CD34, CD38, CD127, CD179a, CD179b, IgM (intracellular and extracellular), and terminal deoxynucleotidyl transferase. Notably, different clustering approaches identify subtly different cellular subsets even within this relatively simple dataset. Iteratively testing different clustering approaches, visualizing the results, and adjusting hyperparameters can often help to determine which method best fits a particular dataset.
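The subsampling step and the comparison between clusterings described in this caption can be sketched in a few lines. The original analysis used R implementations; the Python sketch below is only illustrative and uses the standard library alone. The function names `subsample_per_sample` and `adjusted_rand_index`, the toy data, and the choice of the adjusted Rand index as an agreement score are all assumptions made for this example. The figure itself compares the clusterings visually and does not report such a score.

```python
import random
from collections import Counter
from math import comb


def subsample_per_sample(samples, n_cells, seed=0):
    """Draw up to n_cells cells without replacement from each sample and pool them.

    `samples` maps a sample name to a list of cells (each cell is a list of
    marker intensities). Returns a list of (sample_name, cell) tuples.
    """
    rng = random.Random(seed)
    pooled = []
    for name, cells in samples.items():
        k = min(n_cells, len(cells))
        pooled.extend((name, cell) for cell in rng.sample(cells, k))
    return pooled


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two cluster assignments.

    Returns 1.0 when the two partitions are identical (cluster names may
    differ) and values near 0 when agreement is no better than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label vectors must have the same length")
    n = len(labels_a)
    if n < 2:
        return 1.0
    # Pairs of cells placed together by both clusterings.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs of cells placed together by each clustering on its own.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)
    max_index = (together_a + together_b) / 2
    if max_index == expected:
        return 1.0
    return (together_both - expected) / (max_index - expected)


# Toy stand-in for 3 patient samples (hypothetical values, not real cytometry data).
toy_samples = {
    f"patient_{i}": [[random.Random(i * 1000 + j).random() for _ in range(3)]
                     for j in range(200)]
    for i in range(3)
}
pooled_cells = subsample_per_sample(toy_samples, n_cells=100, seed=42)

# Hypothetical cluster labels from two algorithms run with the same number of clusters.
labels_method_1 = [i % 4 for i in range(len(pooled_cells))]
labels_method_2 = [(i // 2) % 4 for i in range(len(pooled_cells))]
print(len(pooled_cells), adjusted_rand_index(labels_method_1, labels_method_2))
```

In practice the two label vectors would come from running two of the algorithms on the same pooled cells with the same number of clusters. A score well below 1 would flag the kind of disagreement that the panels show visually.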