a, UMAP depicting all CD34+ HSPCs from one healthy young individual. See b for color scheme. b, Decision tree using surface marker expression from the Abseq data to classify cells into cell types. See Methods and main text for details. c, UMAP highlighting cell type classification obtained from the decision tree. Colors correspond to ‘gates’ applied to the expression levels of the 12 markers shown in b, not gene expression clusters. d, UMAP highlighting classification obtained from a decision tree recapitulating the classical gating scheme used in the field17. Since CD135 was not part of the Abseq panel, the expression of FLT3 was smoothed using MAGIC48. e, Boxplot depicting the intragate dissimilarity for cell classification with panels from Doulatov et al.17, the gating scheme from Karamitros et al.25, a ‘consensus gating’ scheme (see Extended Data Fig. 9) and the data-driven gating scheme (c). Intragate dissimilarity is defined as one minus the average Pearson correlation of normalized gene and surface antigen expression values of all cells within the gate. P values are from a two-sided Wilcoxon test. Sample size is shown in the figure. See Methods, section Data visualization for a definition of boxplot elements. f, Implementation of the FACS gating scheme from b. g, UMAP display of mRNA expression of n = 630 CD34+ HSPCs from an indexed single-cell Smart-seq2 experiment where the expression of relevant surface markers was recorded using FACS. Left panel: color indicates gene expression cluster, see Supplementary Note 8 for details. Right panel: color indicates classification by the FACS scheme from f. h, Precision of the classification scheme shown in b, computed on the training data (Abseq) and the test data (Smart-seq2). Precision was computed per gate as the fraction of correctly classified cells. For comparison with the Doulatov gating scheme, the dataset from Velten et al.9 was used. NS, not significant. P values are from a two-sided Wilcoxon test.
Sample size is shown in the figure.
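The two summary statistics defined in this legend (intragate dissimilarity in e, per-gate precision in h) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the input layout (cells in rows, normalized gene and surface antigen features in columns) and the choice to average over all pairs of distinct cells are assumptions made for illustration. The precision function further assumes that gate names and reference labels share one vocabulary.

```python
import numpy as np


def intragate_dissimilarity(expr):
    """One minus the mean pairwise Pearson correlation between cells of one gate.

    expr: array of shape (n_cells, n_features) holding normalized expression
    values for the cells of a single gate (layout is an assumption).
    """
    expr = np.asarray(expr, dtype=float)
    if expr.ndim != 2 or expr.shape[0] < 2:
        raise ValueError("need a 2D array with at least two cells")
    # np.corrcoef treats each row as one variable, so this is cell-by-cell.
    corr = np.corrcoef(expr)
    # Keep each pair of distinct cells once (upper triangle, no diagonal).
    upper = np.triu_indices_from(corr, k=1)
    return 1.0 - corr[upper].mean()


def per_gate_precision(predicted, reference):
    """Fraction of cells in each predicted gate whose reference label matches.

    predicted, reference: equal-length sequences of labels, one per cell.
    Assumes gate names and reference labels use the same vocabulary.
    """
    predicted = np.asarray(predicted)
    reference = np.asarray(reference)
    if predicted.shape != reference.shape:
        raise ValueError("predicted and reference must have the same length")
    precision = {}
    for gate in np.unique(predicted):
        in_gate = predicted == gate
        precision[str(gate)] = float(np.mean(reference[in_gate] == gate))
    return precision
```

A cell with identical values across all features has undefined Pearson correlation, so such cells would need to be filtered out before calling `intragate_dissimilarity`.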