Figure 2.
Benchmarking of GBMPurity against alternative tumor purity estimation methods across 5 datasets. The figure displays a 5 × 5 grid summarizing the performance of 5 RNA-based tumor purity estimation methods (rows: GBMPurity, Scaden, CIBERSORTx, PUREE, and MuSiC) across 5 datasets (columns). The first 3 columns represent pseudobulk datasets with ground truth purity labels: GBmap (n = 231), Wang et al¹¹ (n = 57), and Neftel et al⁵ (n = 9). The last 2 columns correspond to bulk RNA-seq datasets with DNA-derived purity labels: EORTC (n = 235) and TCGA (n = 144). Each panel plots predicted purity (y-axis) against ground truth or DNA-derived purity (x-axis). GBmap served as the training dataset for all tools except PUREE, which does not require reference data. Performance metrics, including correlation coefficients and error rates, are summarized in Table 2. Abbreviation: CNA, copy number alteration.
