a, Algorithmic flow of the deconvolution module of BayesPrism. b, Variables and their dimensions shown in a. c–f, Boxplots showing the cell-type-level Pearson’s correlation coefficient (c,e) and MSE (d,f) for deconvolution of GBM28 pseudo-bulks using refGBM8 (c,d), and bulk RNA-seq human whole blood samples with ground truth measured by flow cytometry (e,f). Boxes mark the 25th percentile (bottom), median (central bar) and 75th percentile (top). Whiskers represent extreme values within 1.5-fold of the interquartile range. One-sided P values are shown for cell type fractions inferred by BayesPrism (updated θ using marker-free mode) and those by the second-best methods ranked by median value. A t-test was used for MSE, and a z-test was performed on Fisher’s z-transformed cell-type-level correlation coefficients (Methods). c,d, Statistics were computed for 1,350 pseudo-bulk RNA-seq samples simulated using scRNA-seq from 27 patients with GBM having more than ten malignant cells for all methods, except CIBERSORTx. For CIBERSORTx and its comparison with BayesPrism, statistics were computed using 270 downsampled pseudo-bulks across 27 patients. e,f, Statistics were computed using 12 bulk RNA-seq samples from independent healthy adults. g, Uniform manifold approximation and projection (UMAP) visualization showing expression of individual cells in GBM28. The expression profiles of nonmalignant cells before (gray) and after information pooling (black) were projected onto the UMAP manifold of scRNA-seq (left). Malignant cells in patients with more than ten malignant cells (n = 27) are visualized on the zoomed-in UMAP (right) and are colored by patient. The inferred expression profile, shown as △, and the averaged expression profile from scRNA-seq for each patient, shown as ○, are projected onto the UMAP manifold.
h, Scatter plot showing Pearson’s correlation between average expression of malignant cells in pseudo-bulk and that estimated by BayesPrism (red) and CIBERSORTx group mode (orange) or the undeconvolved simulated bulk (blue), as a function of the fraction of malignant cells in a subsampled set (n = 270).
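The one-sided z-test on Fisher’s z-transformed correlation coefficients mentioned for panels c and e can be illustrated with a minimal sketch. This is not the authors’ code: the function names are hypothetical, and the standard error used here assumes the two correlations come from independent samples, which may differ from the exact procedure described in Methods.

```python
import math


def fisher_z(r):
    """Fisher's z-transform of a Pearson correlation coefficient."""
    return math.atanh(r)


def one_sided_z_test(r1, n1, r2, n2):
    """One-sided test of H1: rho1 > rho2.

    ASSUMPTION: r1 and r2 are computed on independent samples of size
    n1 and n2, so the variance of z1 - z2 is 1/(n1-3) + 1/(n2-3).
    Returns the z statistic and the upper-tail P value.
    """
    z1, z2 = fisher_z(r1), fisher_z(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    # Upper-tail probability of the standard normal distribution.
    p = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p


if __name__ == "__main__":
    # Illustrative values only, not results from the figure.
    z, p = one_sided_z_test(0.95, 1350, 0.90, 1350)
    print(f"z = {z:.3f}, one-sided P = {p:.3g}")
```

A small P value here would indicate that the first correlation is significantly higher than the second under the stated independence assumption.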