Table 2. Algorithms used to plot cell conversion trajectories.
Algorithm | Required input<sup>a</sup> (optional in parentheses) | Dimension reduction (D) | Trajectory plotting (T) | Cell clustering (C) | Order of steps | Example application | Properties | Assumptions / requirements<sup>b</sup> |
---|---|---|---|---|---|---|---|---|
Monocle, 2014 [45] | Branches | ICA | Weighted complete graph, MST | No | D,T | scRNA-seq datasets for human myoblast differentiation | Robust to changes in subpopulation structure and to subsampling. Resolves cellular transitions during differentiation through temporal profiling of the entire transcriptome without a priori knowledge of marker genes. | Continuous transcriptome path. Known number of branches. |
SCUBA, 2014 [72] | (Time course, marker genes) | k-means clustering | Gap statistic, penalized likelihood function, cusp bifurcation theory | k-means clustering | DC (simultaneous), T | RT-PCR and scRNA-seq datasets for early mouse embryo development | Robust to experimental platform differences. Uses temporal information. | Continuous transcriptome path. One or two branches. |
Wanderlust, 2014 [73] | Starting cells | KNN graph | Selects random reference points (waypoints) and determines the position of each cell by weighted shortest-path distance | No | D,T | sc-mass cytometry data for human naïve B cell differentiation | Robust to technical and biological noise. | Continuous transcriptome path. Non-branching trajectory. Known starting point of development. |
Waterfall, 2015 [74] | None | PCA | k-means clustering, MST | Unsupervised hierarchical clustering | C,D,T | scRNA-seq datasets for adult neurogenesis in mouse hippocampus | Does not need temporal information or a priori knowledge of marker genes. Applicable for diverse single-cell multi-dimensional datasets, including RNA-seq and mass cytometry. | Continuous transcriptome path. |
destiny, 2016 [52] | None | KNN graph | Diffusion maps | No | D,T | scRNA-seq datasets for mouse embryonic fibroblast reprogramming; qRT-PCR data for mouse embryonic cell development; sc-mass cytometry for mouse induced pluripotent stem cell reprogramming | Robust to biological noise and variation in sample density. | Continuous transcriptome path. |
DPT, 2016 [53] | (Marker genes) | Diffusion maps | Diffusion maps, branching identification by comparing two independent diffusion pseudo-time (DPT) orderings over cells, metastable state identification | No | DT (simultaneous) | scRNA-seq datasets for mouse blood cell development | Robust to parameter choice. Does not need temporal information, a priori knowledge of marker genes, or starting and end cell identities. | Continuous and smooth transcriptome path. |
SLICE, 2016 [48] | (Cell grouping, marker genes) | PCA | Linear Prize-Collecting Steiner Tree (LPCST) problem, MST, shortest-path approach, principal-curve-based approach | Partitioning around medoids (PAM) / complete weighted graph | D,C,T | scRNA-seq datasets for differentiation of mouse lung alveolar type | Robust to parameter choice. Does not need temporal information, a priori knowledge of marker genes, or starting and end cell identities. | Continuous transcriptome path. Cells with higher pluripotent potential are hypothesized to express genes with more diverse and heterogeneous functions. |
SLICER, 2016 [75] | None | KNN graph, locally linear embedding (LLE) | Geodesic entropy | No | D,T | scRNA-seq datasets for mouse lung and neural cells | Robust to biological noise and presence of irrelevant elements (genes). Detects non-tree-like loop structures in development path. Does not need temporal information or a priori knowledge of marker genes. | Continuous transcriptome path. |
TSCAN, 2016 [61] | None | PCA / ICA | MST | Hierarchical clustering | D,C,T | scRNA-seq datasets for human skeletal muscle myoblast differentiation | Graphical user interface. Direct comparison with other algorithms. | Continuous transcriptome path. Known number of cell clusters. |
Wishbone, 2016 [76] | Starting and ending cells, (marker genes) | Diffusion maps | KNN graph, waypoint sparse approximation | No | D,T | sc-mass cytometry data and scRNA-seq data for human myeloid differentiation | Robust to parameter choice. Good branching point detection in bifurcating systems. | Continuous transcriptome path. One or two branches. |
Monocle 2, 2017 [60] | (Number of cell fates) | PCA / t-SNE / diffusion maps | Reversed graph embedding (RGE) | k-means clustering | D,C,T | scRNA-seq data for human myoblast differentiation | Robust to biological noise. Does not need a priori knowledge of genes that characterize the biological process or the number of branch points in the trajectory. | Continuous transcriptome path. |
scTDA, 2017 [54] | (Time course) | MDS, top 5000 variant genes | Single-cell topological data analysis (scTDA) | Single-linkage clustering | D,C,T | scRNA-seq data for mouse motor neuron differentiation | Detects transient cellular populations and their transcriptional repertoires. Detects non-tree-like loop structures in development path. Identifies cell-cycle-related features from loop structures in the trajectory. | Continuous transcriptome path. |
CellRouter, 2018 [47] | (Marker genes) | t-SNE / diffusion maps | Flow network | KNN graph | D,C,T | scRNA-seq data for human neutrophil differentiation | Robust to subpopulation structure, subsampling, and choice of dimension reduction techniques. | A continuum of phenotypically distinct subpopulations. State transitions are continuous with molecular hallmarks activated or silenced in a progressive manner. |
Slingshot, 2018 [50] | (Starting and ending cells) | PCA / ICA / diffusion maps | MST | k-means clustering / Gaussian mixture modeling | D,C,T | scRNA-seq data for olfactory stem cell niche | Robust to subsampling and cluster assignments. Flexibility in upstream analysis, including choice of dimension reduction and clustering algorithms. Identification of multiple cell fates. | Continuous transcriptome path. |
Abbreviations: independent component analysis (ICA), minimum spanning tree (MST), k-nearest neighbors (KNN), principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), multidimensional scaling (MDS)
<sup>a</sup> These are in addition to the required gene and/or protein expression data.
<sup>b</sup> Although not always explicitly mentioned as a criterion, a continuous transcriptome path is an implicit assumption for all algorithms.
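
Several algorithms in Table 2 (for example Monocle, Waterfall, TSCAN, and Slingshot) list a minimum spanning tree (MST) as the trajectory-plotting step. The sketch below illustrates that general idea only: it builds an MST over points that stand in for cells or cluster centers after dimension reduction, then assigns each point a pseudotime equal to its path length along the tree from a chosen starting point. The function names, the toy coordinates, and the choice of root are assumptions made for illustration; this is not the implementation used by any of the listed packages.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph over `points`.

    Returns a list of (i, j, weight) edges. Hypothetical helper.
    """
    n = len(points)
    # best[j] = (distance to the growing tree, nearest tree node) for
    # every node that has not been attached yet; node 0 seeds the tree.
    best = {j: (math.dist(points[0], points[j]), 0) for j in range(1, n)}
    edges = []
    while best:
        # Attach the unattached node that is closest to the tree.
        j = min(best, key=lambda k: best[k][0])
        weight, i = best.pop(j)
        edges.append((i, j, weight))
        # The new node may offer a shorter connection to the rest.
        for k in best:
            d = math.dist(points[j], points[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return edges


def tree_pseudotime(n_nodes, edges, root):
    """Path length along the tree from `root` to every node."""
    adjacency = {i: [] for i in range(n_nodes)}
    for i, j, w in edges:
        adjacency[i].append((j, w))
        adjacency[j].append((i, w))
    pseudotime = {root: 0.0}
    stack = [root]
    while stack:
        node = stack.pop()
        for neighbor, w in adjacency[node]:
            if neighbor not in pseudotime:
                pseudotime[neighbor] = pseudotime[node] + w
                stack.append(neighbor)
    return [pseudotime[i] for i in range(n_nodes)]


# Toy 2-D coordinates (assumed, not real data): a linear path that
# bifurcates into two branches after the third point.
centers = [(0, 0), (1, 0), (2, 0), (3, 1), (3, -1)]
edges = minimum_spanning_tree(centers)
pseudotime = tree_pseudotime(len(centers), edges, root=0)
```

In this toy layout the two branch tips receive the same pseudotime, which shows why methods that support branching report a branch assignment alongside the ordering. The root must be supplied by the user here, mirroring the "starting cells" input that some algorithms in the table require.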