Fig. 2 | DeepPT prediction of gene expression from H&E slides.
a, Violin plots depicting the distribution of Pearson correlations between predicted and measured expression values across the cohort samples for all (approximately 18,000) genes (empty violins) and for the top 1,000 genes with the highest correlations (light blue). In violin plots, the central mark is the median. The number of patients in each cohort is shown in parentheses. b, Median correlation between the predicted and measured expression values across the cohort samples obtained by HE2RNA (gray), SEQUOIA (pink) and DeepPT (light blue) for the top 1,000 best-predicted genes (selected independently for each model as the genes with the highest correlation). The performance of HE2RNA and SEQUOIA is taken as reported in the original publication (ref. 49). c, Mean correlation between the predicted expression values of all genes and their measured values across the samples, obtained by HE2RNA (gray), tRNAsformer (purple) and DeepPT (light blue) for kidney cancer. The performance of HE2RNA and tRNAsformer is taken as reported in ref. 46. P values in b and c were calculated using a one-sided permutation test and were zero in every case (*P < 0.001). d, Correlation distribution of the top 1,000 genes (left) and the number of genes with a correlation of >0.4 (right) achieved by DeepPT in two unseen external test cohorts. e, Venn diagrams illustrating the overlap between the well-predicted genes (R > 0.4) in TCGA-breast and TransNEO-breast (left) and in TCGA-brain and NCI-brain (right). Both overlaps have hypergeometric P values equal to zero. f, Pathway enrichment analysis of the well-predicted genes (R > 0.4). Each row represents a different cancer hallmark, and each column represents a different cohort (the two rightmost columns correspond to the two external cohorts). Values denote the FDR-corrected P values for pathway enrichment among the well-predicted genes (hypergeometric test).
BRCA, breast invasive carcinoma; KIRC, kidney renal clear cell carcinoma; LGG, brain lower-grade glioma; LUSC, lung squamous cell carcinoma; LUAD, lung adenocarcinoma; HNSC, head and neck squamous cell carcinoma; PRAD, prostate adenocarcinoma; COAD, colon adenocarcinoma; STAD, stomach adenocarcinoma; KIRP, kidney renal papillary cell carcinoma; CESC, cervical squamous cell carcinoma and endocervical adenocarcinoma; PAAD, pancreatic adenocarcinoma; READ, rectum adenocarcinoma; ESCA, esophageal carcinoma; GBM, glioblastoma multiforme; KICH, kidney chromophobe.
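The quantities in panels a, d and e can be illustrated with a minimal sketch. It assumes that each gene's correlation is computed across samples, that "well-predicted" means R > 0.4 as stated in the caption, and that the overlap test is a one-sided hypergeometric tail. The function names and data layout are hypothetical and are not taken from the DeepPT code.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant vector
    return sxy / math.sqrt(sxx * syy)


def per_gene_correlations(predicted, measured):
    """Map each gene to its correlation across samples.

    Assumed layout: dict of gene -> list of values, samples in the same order.
    """
    return {g: pearson(predicted[g], measured[g]) for g in predicted}


def well_predicted(correlations, threshold=0.4):
    """Genes whose correlation exceeds the threshold (NaN never passes)."""
    return {g for g, r in correlations.items() if r > threshold}


def hypergeom_overlap_pvalue(n_genes, set_a, set_b):
    """P(overlap >= observed) if set_b were drawn at random from n_genes genes."""
    k = len(set_a & set_b)
    a, b = len(set_a), len(set_b)
    total = math.comb(n_genes, b)
    tail = sum(
        math.comb(a, i) * math.comb(n_genes - a, b - i)
        for i in range(k, min(a, b) + 1)
    )
    return tail / total
```

With gene sets of realistic size, the tail probability can be smaller than the smallest positive floating-point number, in which case this sketch returns 0.0.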
