2019 May 6;20:88. doi: 10.1186/s13059-019-1681-8

Fig. 3.

Performance evaluation using down-sampled bulk RNA-seq data. a Schematic overview of the simulation strategy. Starting from a bulk RNA-seq data matrix comprising three cell types (T1, T2, and T3), the data matrix X1 is obtained by resampling the raw data from each cell type separately. Each element (xij) of the data matrix is then perturbed with noise drawn from the normal distribution N(0, 5V), where V is the vector of per-gene standard deviations across replicates in the bulk RNA-seq data, yielding the true data set X2. Finally, dropout events are introduced into X2 using an exponential function, resulting in the dropout data set X3. b A representative imputation result using simulated data with a dropout rate of 72%. c t-SNE plots of the representative imputation results. d MA plots of the representative imputation results. Imputation errors for data with dropout rates of 60% (e), 65% (f), 72% (g), and 77% (h). Each boxplot represents the results from 100 simulated datasets. P values are based on Student’s t test.
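
The three-step simulation strategy in panel a (X1 by resampling, X2 by adding noise, X3 by introducing dropouts) can be illustrated with a minimal Python sketch. This is not the authors' code. The caption does not give the exact form of the exponential dropout function, so the sketch assumes a dropout probability of exp(-decay * x^2), and it reads 5V as the per-gene standard deviation of the noise. The function name `simulate`, its parameters (`n_cells_per_type`, `noise_scale`, `decay`), computing V over all bulk columns, and clipping negative values to zero are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate(bulk, labels, n_cells_per_type=50, noise_scale=5.0, decay=0.1, seed=0):
    """Sketch of the X1 -> X2 -> X3 simulation described in the caption.

    bulk   : genes x samples array of bulk expression values (assumed non-negative)
    labels : cell-type label for each column of `bulk` (e.g. "T1", "T2", "T3")
    """
    rng = np.random.default_rng(seed)
    bulk = np.asarray(bulk, dtype=float)
    labels = np.asarray(labels)

    # X1: resample bulk columns with replacement, separately within each cell type.
    cols = np.concatenate([
        rng.choice(np.flatnonzero(labels == t), size=n_cells_per_type, replace=True)
        for t in np.unique(labels)
    ])
    x1 = bulk[:, cols]

    # V: per-gene standard deviation. Assumption: computed over all bulk columns.
    v = bulk.std(axis=1, ddof=1)

    # X2 ("true" data): add zero-mean Gaussian noise with per-gene scale 5V.
    # Assumption: 5V is used as the standard deviation, and negatives are clipped to 0.
    x2 = x1 + rng.normal(0.0, 1.0, size=x1.shape) * (noise_scale * v)[:, None]
    x2 = np.clip(x2, 0.0, None)

    # X3 (dropout data): zero out each entry with probability exp(-decay * x^2).
    # Assumption: this specific exponential form; low values drop out more often.
    p_drop = np.exp(-decay * x2 ** 2)
    x3 = np.where(rng.random(x2.shape) < p_drop, 0.0, x2)

    return x1, x2, x3, cols
```

Under these assumptions, the realized dropout rate can be checked with `(x3 == 0).mean()`, and `decay` can be adjusted until the rate approaches a target such as the 60% to 77% range shown in panels e to h.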