. 2022 Jul 19;5:719. doi: 10.1038/s42003-022-03628-x

Fig. 7. Sensitivity of DR results to pre-processing methods on the Stuart et al.30 dataset.


This dataset was pre-processed in one of three ways: with no PCA and no log transformation (raw), with log-normalization (Log-norm), or with GLM-PCA. The distance relationship between two of the DC clusters (mDCs and pDCs) varies dramatically in t-SNE and UMAP when the pre-processing method changes. With GLM-PCA pre-processing, the t-SNE and UMAP results contain undesirable outliers and tiny clusters. For raw and log-normalized data, the number of PCA dimensions is 70; for GLM-PCA pre-processed data, it is 50, because GLM-PCA is computationally infeasible to run with more than 50 dimensions.
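
To make the raw and log-normalized variants concrete, the following is a minimal sketch of how such inputs might be prepared before running t-SNE or UMAP. It is not the authors' pipeline: the count matrix is synthetic, the function names `log_normalize` and `pca_reduce` are hypothetical, and the scale factor of 10,000 is an assumed convention that the caption does not state. Only the choice of 70 components comes from the caption. GLM-PCA is not shown.

```python
import numpy as np

def log_normalize(counts, scale=1e4):
    """Scale each cell (row) to a common total, then apply log(1 + x).

    The scale factor is an assumption; the caption does not specify one.
    """
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid division by zero for empty cells
    return np.log1p(counts / totals * scale)

def pca_reduce(x, n_components):
    """Project the rows of x onto the top principal components via SVD."""
    centered = x - x.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Synthetic cells-by-genes count matrix, standing in for real data.
rng = np.random.default_rng(0)
counts = rng.poisson(1.0, size=(200, 500)).astype(float)

# Two of the three inputs compared in the figure, each reduced to 70 dimensions.
raw_pcs = pca_reduce(counts, 70)
lognorm_pcs = pca_reduce(log_normalize(counts), 70)
```

Either matrix could then be passed to a t-SNE or UMAP implementation. Comparing the resulting embeddings side by side is the kind of sensitivity check the figure reports.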