2021 Oct 31;33(5):563–573. doi: 10.21147/j.issn.1000-9604.2021.05.03

Table 1. Overview of research works on correlating pathomics with genomics.

| Reference | Approach | Data used | Results |
| --- | --- | --- | --- |
| Ash et al. (17) | 1) A CAE was first applied to histology images to extract features; 2) sparse CCA was then applied to the image features and gene expression to find subsets of gene expression values that correlate with subsets of image features. | Three cohorts (BCa, lower grade glioma, and the Genotype-Tissue Expression project) with histological images and bulk RNA-sequencing data from paired tissue samples. | 1) Gene sets associated with the structure of the extracellular matrix and cell wall infrastructure, implicating uncharacterized genes in extracellular processes; 2) sets of genes associated with specific cell types; 3) image features that capture population variation in thyroid and colon tissues associated with genetic variants. |
| AbdulJabbar et al. (5) | 1) A deep learning model was trained to identify cancer cells, lymphocytes, stromal cells and an “other” cell class in H&E-stained images (validated by sequencing data, IHC, and pathologists); 2) immune hot and cold regions were defined based on lymphocyte percentage (validated by the RNA-seq classification). | WSI, RNA-seq from multiregion TRAcking Cancer Evolution through Therapy (Rx) (TRACERx, n=100); the Leicester Archival Thoracic Tumor Investigatory Cohort (LATTICe-A, n=970). | High geospatial immune variability between tumor regions; tumors with more than one immune cold region had a higher risk of relapse in lung adenocarcinomas. |
| Lu et al. (10) | Image features that captured cellular diversity in local regions were correlated with bulk RNA expression data. | N=405 NSCLC histology images with bulk RNA expression data from TCGA. | CellDiv features were found to be strongly associated with apoptotic signalling and cell differentiation pathways. |
| Subramanian et al. (18) | CCA and sparse CCA were used to correlate gene expression with histological features describing nucleus shape, texture and intensity. | N=615 BCa samples from TCGA with histology images and gene expression data. | CCA found significant correlation of image features with expression of PAM50 genes. |
| Martins et al. (16) | Stroma was segmented from H&E-stained images and quantified by a fraction score. The stroma score and gene expression were correlated using Pearson correlation. | Two independent cohorts of TMAs of ovarian cancer (n=521). | Stroma strongly biases estimates of PTEN expression. |
| Wang et al. (19) | Image features capturing tumor morphology were correlated with gene expression data. The strongly correlated image features and gene lists/clusters were tested for prognostic ability in independent test cohorts. | TCGA triple-negative BCa (n=44) with image and gene data; the image features were evaluated in a local TMA cohort (n=143). | Forty-eight pairs of significantly correlated image features and gene clusters were identified; four image features were prognostic in a validation cohort; gene clusters correlated with these four image features were prognostic in public gene datasets. |

CAE, convolutional autoencoder; BCa, breast cancer; H&E, hematoxylin and eosin; IHC, immunohistochemistry; WSI, whole-section image; NSCLC, non-small cell lung cancer; TCGA, The Cancer Genome Atlas; CNN, convolutional neural network; TMA, tissue microarray; CCA, canonical correlation analysis.
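
Several of the works in Table 1 rely on canonical correlation analysis to relate image features to gene expression. As an illustration only, the following is a minimal sketch of classical (non-sparse) CCA in NumPy on synthetic data. It is not the sparse CCA pipeline of any cited study; the function names `inv_sqrt` and `cca`, the regularization value `reg`, and the simulated `image_features` and `gene_expression` matrices are all assumptions made for this example.

```python
import numpy as np


def inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T


def cca(X, Y, reg=1e-6):
    """Classical CCA via whitening and SVD (illustrative, not sparse CCA).

    X: (n_samples, p) matrix, e.g. image features.
    Y: (n_samples, q) matrix, e.g. gene expression.
    Returns the canonical correlations and the weight matrices for X and Y.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Covariance blocks; a small ridge term (assumed value) keeps them invertible.
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)
    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy, full_matrices=False)
    return s, Wx @ U, Wy @ Vt.T


# Synthetic stand-ins: both views share one latent factor plus independent noise.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=(n, 1))
image_features = latent @ rng.normal(size=(1, 5)) + 0.3 * rng.normal(size=(n, 5))
gene_expression = latent @ rng.normal(size=(1, 8)) + 0.3 * rng.normal(size=(n, 8))

corrs, A, B = cca(image_features, gene_expression)
print("canonical correlations:", np.round(corrs, 3))
```

Because the two simulated views share a single latent factor, only the first canonical correlation is expected to be large. With real data, where the number of genes far exceeds the number of samples, a sparse or otherwise regularized variant would be needed, which is the motivation for the sparse CCA mentioned in the table.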