TABLE 3.
Unsupervised clustering tools.
| ID (References) | Name | Short description | Availability | Visualization | Easy to install and run | Cluster # flexibility | Reproducible | Running time (min) | ARI | F-measure |
| Unsupervised (compatible with any # of Samples) | ||||||||||
| 1. Shekhar et al., 2014 | ACCENSE | 1. t-SNE dimensionality reduction; 2. k-means or density-based clustering | GUI application | n/a | Yes | No | No | 2.48* | 0.28* | 0.60* |
| 2. Anchang et al., 2014 | CCAST | 1. identify cell population; 2. refine cluster assignment; 3. estimate a gating scheme by decision tree; 4. optimize the decision tree | R package “CCAST” | Decision tree | Yes | Yes | Yes | 77.32 | 0.71 | 0.72 |
| 3. Chen et al., 2016 | ClusterX | 1. t-SNE dimensionality reduction; 2. local density estimation; 3. peak detection; 4. cluster assignment | R package “cytofkit” | n/a | Yes | No | Yes | 105.14 | 0.25 | 0.22 |
| 4. Commenges et al., 2018 | Cytometree | Implements a binary tree algorithm for clustering | R package “cytometree” | Binary tree | Yes | No | No | 12.30 | 0.08 | 0.20 |
| 5. Ding et al., 2016 | densityCUT | 1. density estimation; 2. density refinement; 3. local-maxima based clustering; 4. hierarchical stable clustering | R package “densitycut” | n/a | Yes | No | Yes | 3.94 | 0.78 | 0.34 |
| 6. Becher et al., 2014 | DensVM | 1. t-SNE dimension reduction; 2. density-based peak calling and clustering; 3. SVM classification for less-confident cells | R package “cytofkit” | n/a | Yes | No | No | 43.83* | 0.71* | 0.69* |
| 7. Theorell et al., 2019 | DEPECHE | k-means clustering | R package “depecheR” | n/a | Yes | Yes | No | 3.46 | 0.75 | 0.53 |
| 8. MacQueen, 1967; Qian et al., 2010 | FLOCK | 1. hypergrid creation; 2. identifying dense hyperregions; 3. merging neighboring dense hyperregions; 4. clustering | Available at ImmPort online | n/a | Yes (Need to register at Galaxy) | No (can adjust # of bins and density) | Yes | 0.30 | 0.73 | 0.65 |
| 9. Lo et al., 2009 | flowClust | t-mixture models with the Box-Cox transformation | R package “flowClust” | n/a | Yes | Yes | Yes | 4.99 | 0.41 | 0.43 |
| 10. Ye and Ho, 2018 | FlowGrid | density-based clustering algorithm DBSCAN with the scalability of grid-based clustering | Github (Python package “FlowGrid”) | n/a | Yes | No (can adjust # of bins and density) | Yes | 0.25^ | 0.54 | 0.48 |
| 11. Aghaeepour et al., 2011 | flowMeans | k-means clustering | R package “flowMeans” | n/a | Yes | Yes | Yes | 6.01 | 0.64 | 0.63 |
| 12. Ge and Sealfon, 2012 | flowPeaks | 1. k-means; 2. Gaussian finite mixture to model the density function; 3. peak search and merging; 4. cluster tightening | R package “flowPeaks” | n/a | Yes | Yes | Yes | 0.19 | 0.64 | 0.55 |
| 13. Van Gassen et al., 2015 | FlowSOM | 1. self-organizing map building; 2. MST building; 3. meta-clustering | R package “FlowSOM” and “cytofkit” | MST, Chart plot | Yes | Yes | Yes (if a seed is set) | 0.19 | 0.62 | 0.67 |
| 14. Li Y. H. et al., 2017 | PAC-MAN | 1. partitioning by density-based methods; 2. post-processing | R package “PAC” | n/a | Yes | Yes | Yes | 0.35 | 0.78 | 0.74 |
| 15. Levine et al., 2015 | PhenoGraph | 1. construct nearest-neighbor graph; 2. community partitioning | R package “cytofkit” | n/a | Yes | No (can adjust # of nearest neighbors) | Yes | 5.89 | 0.71 | 0.78 |
| 16. [github] | Rclusterpp | flexible native hierarchical clustering | R package “Rclusterpp” | Hierarchical-structure | Yes (Need to manually download source file) | No | Yes | 17.40 | 0.70 | 0.71 |
| 17. Zare et al., 2010 | SamSPECTRAL | Spectral-clustering with data reduction scheme | R package “SamSPECTRAL” | n/a | No (requires manual tuning for optimal results) | Yes | Yes | 24.70 | 0.57 | 0.33 |
| 18. Qiu et al., 2011 | SPADE | 1. Density-dependent down-sampling; 2. MST construction | R package “spade” | MST | Yes | Yes (given cluster number k, it can create between k/2 and 3k/2 clusters) | No | 2.83 | 0.58 | 0.66 |
| 19. Mosmann et al., 2014 | SWIFT | 1. Fit GMM; 2. Refine GMM; 3. agglomerative merging | Matlab GUI application | n/a | Yes | No (can adjust # of bins and density) | No | 20.02* | 0.06* | 0.29* |
| 20. Samusik et al., 2016 | X-shift | 1. estimate cell event density; 2. arrange populations by marker-based classification | GUI application | Divisive Marker Trees | Yes | Yes | Yes | 35.10 | 0.65 | 0.67 |
| 21. Sorensen et al., 2015 | immunoClust | 1. iterative model-based clustering; 2. meta-clustering | R package “immunoClust” | n/a | Yes | No | Yes | 82.72 | 0.29 | 0.47 |
| 22. Flock | k-means | k-means clustering | R base package “stats” | n/a | Yes | Yes | Yes | 11.68 | 0.63 | 0.63 |
| Unsupervised (requiring multiple samples) | ||||||||||
| 23. Bruggner et al., 2014 | Citrus | cluster identification, characterization and regression | R package “Citrus” | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| 24. Arvaniti and Claassen, 2017 | CellCnn | convolutional neural networks | Python 2.7 package on Github | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| 25. Lun et al., 2017 | Cydar | 1. cell alignment in hyperspheres in high dimensional space; 2. differential abundance analysis | R package “cydar” | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| 26. Weber et al., 2018 | diffcyt | 1. FlowSOM clustering; 2. empirical Bayes moderated tests for differential abundance analysis | R package “diffcyt” | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| Unsupervised (other) | ||||||||||
| 27. Pouyan et al., 2016 | AUTO-SPADE | 1. Fuzzy-C-Mean clustering; 2. Merging clusters using Markov clustering; 3. Integration with SPADE | No tool available | |||||||
| 28. Linderman et al., 2012 | CytoSPADE | SPADE clustering | No tool available | |||||||
| 29. Walther et al., 2009 | DBM | density based merging (DBM) algorithm | No tool available | |||||||
| 30. Vinh et al., 2009 | FLAME | multivariate skew t mixture models | No full tool pipeline available | |||||||
| 31. Finak et al., 2009 | flowMerge | 1. clustering based on flowClust models; 2. merge clusters | On the down-sampled data, cluster numbers ranging from 15 to 25 were tried, but the merged result was NA. | |||||||
| 32. Pouyan and Nourani, 2015 | Flow-SNE | 1. t-SNE data embedding; 2. cluster number estimation; 3. k-means clustering; 4. merging of clusters | No tool available | |||||||
*If a tool could not finish running within 3 h, it was evaluated on a down-sampled dataset (20K cells). ^Computing time varies with the settings but is generally fast. MST, minimum spanning tree.
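
The ARI and F-measure columns compare each tool's cluster labels against reference (for example, manually gated) labels. The table does not say how the two scores were computed, so the following is only a minimal sketch under stated assumptions: the ARI follows the standard Hubert-Arabie formula, and the F-measure uses one common variant (for each reference population, the best F1 over all clusters, weighted by population size). The function names `adjusted_rand_index` and `f_measure` are hypothetical and do not come from any of the listed packages.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(truth, pred):
    """Adjusted Rand index (Hubert-Arabie) between two labelings."""
    if len(truth) != len(pred):
        raise ValueError("label vectors must have the same length")
    total_pairs = comb(len(truth), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: trivially identical partitions
    # Pairs of cells that share both a reference label and a cluster label.
    same_both = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    # Pairs of cells that share a reference label / a cluster label.
    same_truth = sum(comb(c, 2) for c in Counter(truth).values())
    same_pred = sum(comb(c, 2) for c in Counter(pred).values())
    expected = same_truth * same_pred / total_pairs
    maximum = (same_truth + same_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (same_both - expected) / (maximum - expected)


def f_measure(truth, pred):
    """Size-weighted best-match F1 (an assumed variant, see lead-in)."""
    if len(truth) != len(pred):
        raise ValueError("label vectors must have the same length")
    overlap = Counter(zip(truth, pred))
    truth_sizes = Counter(truth)
    pred_sizes = Counter(pred)
    score = 0.0
    for t, t_size in truth_sizes.items():
        best = 0.0
        for p, p_size in pred_sizes.items():
            shared = overlap.get((t, p), 0)
            if shared == 0:
                continue
            precision = shared / p_size
            recall = shared / t_size
            best = max(best, 2 * precision * recall / (precision + recall))
        score += (t_size / len(truth)) * best
    return score


if __name__ == "__main__":
    reference = [0, 0, 1, 1]   # toy reference populations
    clusters = [0, 0, 1, 2]    # toy clustering output
    print(adjusted_rand_index(reference, clusters))
    print(f_measure(reference, clusters))
```

Both scores ignore the actual label values and depend only on how cells are grouped, so a tool that recovers the reference populations under different cluster IDs still scores 1.0.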