Table 1. Overview of research works on correlating pathomics with genomics.
References | Approach | Data used | Results | |
CAE, convolutional autoencoder; BCa, breast cancer; H&E, hematoxylin and eosin; IHC, immunohistochemistry; WSI, whole-slide image; NSCLC, non-small cell lung cancer; TCGA, The Cancer Genome Atlas; CNN, convolutional neural network; TMA, tissue microarray; CCA, canonical correlation analysis. | ||||
Ash et al. (17) | 1) A CAE was first applied to histology images to extract features; 2) sparse canonical correlation analysis (CCA) was then applied to the image features and gene expression to find subsets of gene expression values that correlate with subsets of image features. | Three cohorts (BCa, lower grade glioma, and Genotype-Tissue Expression project) with histological images and bulk RNA-sequencing data from paired tissue samples. | 1) Identified gene sets associated with the structure of the extracellular matrix and cell wall infrastructure, implicating uncharacterized genes in extracellular processes; 2) found sets of genes associated with specific cell types; 3) found image features that capture population variation in thyroid and colon tissues associated with genetic variants. | |
AbdulJabbar et al. (5) | 1) Train a deep learning model to identify cancer cells, lymphocytes, stromal cells and an “other” cell class in H&E-stained images (validated by sequencing data, IHC, and pathologists); 2) define immune hot and cold regions based on lymphocyte percentage (validated by the RNA-seq classification). | WSI, RNA-seq from multiregion TRAcking Cancer Evolution through Therapy (Rx) (TRACERx, n=100); The Leicester Archival Thoracic Tumor Investigatory Cohort (LATTICe-A, n=970). | High geospatial immune variability between tumor regions; tumors with more than one immune cold region had a higher risk of relapse in lung adenocarcinomas. | |
Lu et al. (10) | Image features that captured cellular diversity in local regions were correlated with bulk RNA expression data. | N=405 NSCLC histology images with bulk RNA expression data from TCGA. | CellDiv features were found to be strongly associated with apoptotic signalling and cell differentiation pathways. | |
Subramanian et al. (18) | Use CCA and sparse CCA to correlate gene expression and histological features describing nucleus shape, texture and intensity. | N=615 BCa samples from TCGA with histology images and gene expression data. | CCA found significant correlation of image features with expression of PAM50 genes. | |
Martins et al. (16) | Stroma was segmented from H&E-stained images and quantified by a fraction score. The stroma score and gene expression were correlated using Pearson correlation. | Two independent cohorts of TMAs of ovarian cancer (n=521). | Stroma strongly biases estimates of PTEN expression. | |
Wang et al. (19) | Image features capturing tumor morphology were correlated with gene expression data. The strongly correlated image features and gene lists/clusters were tested for prognostic ability in independent test cohorts. | TCGA Triple-Negative BCa (n=44) with image and gene data. The image features were evaluated in a local TMA cohort (n=143). | Forty-eight pairs of significantly correlated image features and gene clusters were identified; four image features were prognostic in a validation cohort; gene clusters correlated with these four image features were prognostic in public gene datasets. | |
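Several of the studies in Table 1 relate a single image-derived score to a gene expression value by Pearson correlation (for example, the stroma fraction score in Martins et al.). The following is a minimal sketch of that calculation, not the implementation used in any of the cited works. The function name `pearson_r` and the sample values are hypothetical and chosen purely for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Deviations of each sample from its mean.
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    sxy = sum(a * b for a, b in zip(dx, dy))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for illustration only (not data from any cited study):
# one stroma fraction score and one expression value per tissue sample.
stroma_fraction = [0.10, 0.25, 0.40, 0.55, 0.70]
gene_expression = [9.1, 8.4, 7.9, 7.0, 6.2]

r = pearson_r(stroma_fraction, gene_expression)
print(f"Pearson r = {r:.3f}")
```

A strongly negative or positive coefficient in such an analysis would suggest that tissue composition, rather than tumor biology alone, may drive the measured expression. A real study would also report a significance test and correct for multiple comparisons across genes.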