The plots on the left show the median correlation distance between the membership scores of different runs of Scallop against (A) the number of trials, (B) the fraction of cells used in each bootstrap, and (C) the resolution given to the clustering method (Leiden) in five independent scRNAseq datasets (PBMC3K; Joost et al., 2016; Paul et al., 2015; Moignard et al., 2015; Heart10K). The median correlation distance was computed over 100 runs of Scallop. The swarmplots on the right show the distribution of the correlation distances between membership scores against each of the input parameters for the Heart10K dataset. The median is shown as a red point. For clarity, only a random sample of 100 correlation distances is shown for each value of the parameter under study; the median was computed using all the correlation distances.
Scallop membership scores converge as we increase the number of bootstrap iterations and the fraction of cells used in the clustering.
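
As a rough illustration of the summary statistic described in this caption, the sketch below computes the correlation distance (one minus the Pearson correlation) between the membership scores of every pair of runs and takes the median. It is a minimal sketch under stated assumptions, not Scallop's implementation: the function names `correlation_distance` and `median_pairwise_distance` are hypothetical, each run is assumed to be a flat list of per-cell membership scores of equal length, and the toy values are invented for the example.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, median


def correlation_distance(u, v):
    """Return 1 - Pearson correlation between two equal-length score vectors."""
    if len(u) != len(v):
        raise ValueError("score vectors must have the same length")
    mu, mv = mean(u), mean(v)
    du = [x - mu for x in u]
    dv = [y - mv for y in v]
    norm = sqrt(sum(x * x for x in du) * sum(y * y for y in dv))
    if norm == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return 1.0 - sum(x * y for x, y in zip(du, dv)) / norm


def median_pairwise_distance(runs):
    """Return the median correlation distance over all unordered pairs of runs,
    together with the full list of pairwise distances."""
    distances = [correlation_distance(a, b) for a, b in combinations(runs, 2)]
    return median(distances), distances


# Hypothetical membership scores for three runs over four cells (assumed data).
toy_runs = [
    [0.9, 0.8, 0.2, 0.1],
    [0.8, 0.9, 0.1, 0.2],
    [0.7, 0.9, 0.3, 0.1],
]
toy_median, toy_distances = median_pairwise_distance(toy_runs)
```

A distance near 0 means two runs assign nearly the same membership scores, so a median that falls as a parameter increases corresponds to the convergence described above. For plotting, a random subset of `toy_distances` could be drawn with `random.sample` while the median is still taken over the full list.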