Table 2. Overview of research works on fusion of pathomics with genomics.
References | Aims | Approach | Data used | Results |
CNN, convolutional neural network; GCN, graph convolutional network; H&E, hematoxylin and eosin; IHC, immunohistochemistry; CNV, copy number variant; CCRCC, clear cell renal cell carcinoma; CV, cross-validation; TCGA, The Cancer Genome Atlas; WSI, whole-slide image; HR, hazard ratio; 95% CI, 95% confidence interval; ROI, region of interest; HPF, high power field; DNN, deep neural network; SVM, support vector machine; BCa, breast cancer; ER, estrogen receptor.
Chen et al. (27) | Constructing prognostic models for glioma and CCRCC | Histologic image features extracted by a CNN, graph-based image features extracted by a GCN, and genomic features learned by a feed-forward network were integrated through a multimodal learning paradigm for prognostication. The paradigm models pairwise feature interactions across modalities by taking the Kronecker product of the unimodal feature representations, combined with a gating-based attention mechanism. | Glioma: 1,505 H&E-stained images from 769 patients with 320 genomic features from CNV, mutation status, and bulk RNA-Seq expression. CCRCC: 1,251 H&E-stained images from 417 patients with 357 genomic features from CNV and RNA-Seq. | C-index=0.826 for glioma; C-index=0.720 for CCRCC. Both models outperformed the corresponding unimodal models. Note: Results reported under CV scheme. |
Shao et al. (26) | Proposing a framework that combines pathological images and multi-modal genomic data for the prognosis of early-stage cancer patients. | 1) A generalized sparse canonical correlation analysis, named ordinal multi-modality feature selection (OMMFS), which captures the intrinsic relationship among multiple views, was used to identify important features from WSI and multi-modal data; 2) a Cox proportional hazards model was applied to predict patient prognosis. | Kidney renal clear cell carcinoma, kidney renal papillary cell carcinoma, and lung squamous cell carcinoma cohorts with WSI and multi-modal genomic data from TCGA. | The identified image and multi-modal features were strongly correlated with patients' survival outcomes, thus enabling effective stratification of patients. |
Cheerla et al. (22) | Constructing a deep learning-based pan-cancer model for predicting patient survival. | An autoencoder encoded four data modalities (gene expression, miRNA data, clinical data, and WSI) into a single feature vector for each patient; missing data were handled through a resilient multimodal dropout method. | Gene expression (n=10,198), miRNA data (n=10,125), clinical data (n=7,512), and WSI (n=10,914) from TCGA (20 different cancer types). | The pan-cancer prognostic model yielded a C-index of 0.78 overall. |
Cheng et al. (23) | Constructing a prognostic model for clear cell renal cell carcinoma | 1) Nuclear features (nucleus size, shape, texture, and distance to neighbors) were aggregated statistically into patient-level features; 2) gene co-expression network analysis (GCNA) was used to cluster genes into co-expressed modules (clusters of highly interconnected/correlated genes); 3) a lasso-regularized Cox proportional hazards model was used to calculate risk scores based on the features from steps 1 and 2. | WSI, transcriptome, and somatic mutation data. N=410 from TCGA. | 1) A high percentage of stromal tissue was associated with poor prognosis; 2) the risk index was independent of known prognostic factors, with HR (95% CI)=3.06 (2.10–4.45), P<0.005. Note: Results reported under CV scheme. |
Mobadersany et al. (24) | Predicting the overall survival of patients diagnosed with glioma | Hybrid architecture in which histologic image features abstracted by convolutional layers and genomic variables (IDH mutation status and 1p/19q codeletion) are fed into fully connected layers. When predicting for a newly diagnosed patient, 9 HPFs were sampled from each ROI, and the median risk score was selected to represent that ROI. The second-highest risk score among all ROIs of a WSI was used as the final risk score. | N=1,061 WSIs from 769 patients from TCGA. Genomic variables (IDH mutation status and 1p/19q codeletion). | The image-based model achieved a C-index of 0.754, and its predictions correlated with molecular subtypes and histologic grade; the C-index rose to 0.801 when genomic variables were integrated. Note: Results reported under CV scheme. |
Ren et al. (25) | Constructing a survival model for predicting the recurrence of prostate cancer in patients with Gleason score 7 | 1) Pathway activities were quantified by pathway scores computed from RNA sequencing data; 2) image patches from WSI and pathway scores were integrated in a DNN to extract “deep features”; 3) “deep features” and clinical prognostic factors were fed into a Cox model. | N=339 WSIs and RNA (Illumina HiSeq) sequencing data from TCGA. | The integrated model yielded C-index=0.74, compared with C-index=0.71 for histology images only. |
Yuan et al. (7) | Correlating histology image features with genomic data; prognosticating early-stage ER-BCa patients | Cancer cells, stromal cells, and lymphocytes were detected in the histology images, and the proportions of these cells were used as image features to correlate and combine with genomic data. | N=564 early-stage BCa patients with H&E-stained WSIs and genomic data. | An SVM predictor integrating gene expression and image features achieved 86%±3.0% cross-validation accuracy and improved stratification of the patient cohort. |