2019 Dec 23;20:296. doi: 10.1186/s13059-019-1874-1

Fig. 4.


Regularized NB regression removes variation due to sequencing depth but retains biological heterogeneity. (a) The distribution of residual means across all genes is centered at 0. (b) The density of per-gene residual variance peaks at 1, as would be expected when the majority of genes do not vary across cell types. (c) The variance of Pearson residuals is independent of gene abundance, demonstrating that the GLM has successfully captured the mean-variance relationship inherent in the data. Genes with high residual variance are exclusively cell-type markers. (d) In contrast to a regularized NB model, a Poisson error model does not fully capture the variance in highly expressed genes. An unconstrained (non-regularized) NB model overfits scRNA-seq data, attributing almost all variation to technical effects. As a result, even cell-type markers exhibit low residual variance. The mean-variance trendline is shown in blue in each panel.
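
The contrast between the NB and Poisson error models for highly expressed genes can be illustrated with a small simulation. The sketch below is not the authors' pipeline: it assumes the expected count `mu` and the dispersion `theta` are already known, whereas the method described here estimates them through regularized regression. The function name `pearson_residuals` and all parameter values are hypothetical and chosen only for illustration. It draws overdispersed counts for one abundant gene and compares the variance of Pearson residuals when the variance is modeled as `mu + mu**2 / theta` (NB) versus `mu` (Poisson).

```python
import numpy as np


def pearson_residuals(counts, mu, theta=None):
    """Pearson residuals (counts - mu) / sqrt(variance).

    With theta given, the NB variance mu + mu^2 / theta is used;
    with theta=None, the Poisson variance mu is used.
    """
    counts = np.asarray(counts, dtype=float)
    mu = np.asarray(mu, dtype=float)
    var = mu if theta is None else mu + mu**2 / theta
    return (counts - mu) / np.sqrt(var)


# Illustrative values (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
n_cells = 5000
theta = 10.0    # NB dispersion parameter
mu_high = 50.0  # expected count of a highly expressed gene

# NumPy parameterizes the NB by (n, p); n = theta and
# p = theta / (theta + mu) give mean mu and variance mu + mu^2 / theta.
counts = rng.negative_binomial(theta, theta / (theta + mu_high), size=n_cells)

nb_var = pearson_residuals(counts, mu_high, theta).var()
pois_var = pearson_residuals(counts, mu_high).var()

print(f"residual variance, NB model:      {nb_var:.2f}")
print(f"residual variance, Poisson model: {pois_var:.2f}")
```

Under these assumptions the NB residual variance should sit close to 1, while the Poisson residual variance should be inflated toward roughly `1 + mu_high / theta`, which mirrors the excess variance that the Poisson model leaves in highly expressed genes.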