Figure 2.
Performance of ProtoCloud on cell type classification
(A–C) Benchmarking of cell type annotation performance. We compared ProtoCloud against Seurat V4,27 scANVI,26 CellTypist,8 scPoli,61 TOSICA,60 SIMS,62 scGPT,39 and scBERT37 for cell type annotation across eight datasets, ranging from approximately 10,000 to 400,000 cells. The x axis shows the datasets, with the number in parentheses indicating the number of rare cell types in each dataset. Downward-pointing arrows indicate values below 0.5. Error bars show the standard error of the mean. Evaluation metrics are (A) accuracy, (B) macro F1 score, and (C) Cohen’s kappa coefficient. Metrics were averaged over five experiments with different random seeds, each using 80% of the data for training and 20% for validation.
(D) Macro F1 score rank distributions across methods and datasets over all experimental repetitions.
(E and F) Model performance analysis under varying conditions using the PBMC10K dataset.3 (E) Validation accuracy as the proportion of perturbed labels in the training set increases from 0% to 20%. (F) Validation accuracy as the training data ratio decreases from 0.8 to 0.1.
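
For readers who want to reproduce the quantities plotted in (A–C), the following is a minimal sketch of how accuracy, macro F1 score, Cohen’s kappa coefficient, and the standard error of the mean across seeds can be computed. It is not the authors’ evaluation code. The function names, the toy labels, and the choice to average F1 over every label that appears in either the true or the predicted annotations are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of cells whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Assumption: classes are the union of true and predicted labels,
    and a class with no true or predicted cells scores 0.
    """
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cohen_kappa(y_true, y_pred):
    """Agreement between true and predicted labels, corrected for chance."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n**2
    if p_expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is 1
    return (p_observed - p_expected) / (1.0 - p_expected)


def standard_error(values):
    """Standard error of the mean, using the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return sqrt(variance) / sqrt(n)


# Toy labels for one validation split (hypothetical, not from the paper).
y_true = ["T", "T", "B", "B", "NK", "NK"]
y_pred = ["T", "T", "B", "NK", "NK", "NK"]

# Hypothetical accuracies from five seeds, used only to illustrate error bars.
seed_accuracies = [0.90, 0.92, 0.94, 0.96, 0.98]

print(accuracy(y_true, y_pred))
print(macro_f1(y_true, y_pred))
print(cohen_kappa(y_true, y_pred))
print(standard_error(seed_accuracies))
```

Because macro F1 weights every cell type equally, it is more sensitive than accuracy to errors on rare cell types, which is relevant when datasets differ in how many rare types they contain.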
