Author manuscript; available in PMC: 2021 Nov 1.
Published in final edited form as: Oral Oncol. 2020 Jun 30;110:104877. doi: 10.1016/j.oraloncology.2020.104877

Radiomic analysis identifies tumor subtypes associated with distinct molecular and microenvironmental factors in head and neck squamous cell carcinoma

Evangelia Katsoulakis 1,*, Yao Yu 2,*, Aditya P Apte 3, Jonathan E Leeman 4, Nora Katabi 5, Luc Morris 6,7, Joseph O Deasy 3, Timothy A Chan 2,6,8, Nancy Y Lee 2,^, Nadeem Riaz 2,6,^, Vaios Hatzoglou 9,^, Jung Hun Oh 3,^
PMCID: PMC7606635  NIHMSID: NIHMS1608480  PMID: 32619927

Abstract

Purpose

To identify whether radiomic features from pre-treatment computed tomography (CT) scans can predict molecular differences between head and neck squamous cell carcinoma (HNSCC) tumors using The Cancer Imaging Archive (TCIA) and The Cancer Genome Atlas (TCGA).

Methods

77 patients from the TCIA with HNSCC had imaging suitable for analysis. Radiomic features were extracted and unsupervised consensus clustering was performed to identify subtypes. Genomic data was extracted from the matched patients in the TCGA database. We explored relationships between radiomic features and molecular profiles of tumors, including the tumor immune microenvironment. A machine learning method was used to build a model predictive of CD8+ T-cells. An independent cohort of 83 HNSCC patients was used to validate the radiomic clusters.

Results

We initially extracted 104 two-dimensional radiomic features and, after feature stability tests and removal of volume-dependent features, reduced this to 67 features for subsequent analysis. Consensus clustering based on these features resulted in two distinct clusters. The radiomic clusters differed by primary tumor subsite (p=0.0096), HPV status (p=0.0127), methylation-based clustering results (p=0.0025), and tumor immune microenvironment. A random forest model using radiomic features predicted CD8+ T-cells independent of HPV status with R²=0.30 (p<0.0001) on cross validation. Consensus clustering on the validation cohort resulted in two distinct clusters that differed in tumor subsite (p=1.3×10⁻⁷) and HPV status (p=4.0×10⁻⁷).

Conclusion

Radiomic analysis can identify biologic features of tumors such as HPV status and T-cell infiltration and may be able to provide other information in the near future to help with patient stratification.

Keywords: Radiomics, Radiogenomics, Machine learning, TCIA, TCGA, HPV, Tumor immune microenvironment, Head and neck cancer, CD8

INTRODUCTION

Head and neck squamous cell carcinoma (HNSCC) comprises 5–10% of all cancers and is the sixth most common cancer worldwide1. Within HNSCC, there exists a spectrum of tumor subtypes with varying anatomical, clinical, molecular and genomic characteristics2,3. For example, HNSCC is now known to consist of two molecularly distinct subtypes, those associated with the human papillomavirus (HPV-positive) and those associated with traditional risk factors, such as smoking and alcohol consumption (HPV-negative)4. Beyond HPV status, an integrated molecular analysis of somatic mutations, gene expression, copy number alterations, and DNA methylation has revealed additional molecular subtypes that differ in their molecular drivers and clinical behavior5. Each newly diagnosed HNSCC patient typically undergoes a computed tomography (CT) exam of the neck with contrast as part of the staging workup. Although CT imaging provides valuable anatomic information for surgical and/or radiotherapeutic interventions, it remains unclear whether these scans can also be used to identify tumor heterogeneity between patients6–8.

Radiomics analysis allows for comprehensive quantification of tumor phenotype by examining imaging characteristics on regions of interest (ROIs) in medical images which may correspond to unappreciated biologic differences between tumors7,9,10. Because the entirety of the tumor can be non-invasively evaluated in a single study, imaging-based biomarkers may be able to assess tumor heterogeneity that would be difficult to achieve clinically by biopsy-based approaches. Prior work has demonstrated that CT imaging features are associated with tumor stage, metabolism, hypoxia, and angiogenesis7,11–14. Further associations between enhancement on imaging and VEGF expression have been found in multiple cancers, including head and neck15. In HNSCC in particular, EGFR expression was related to CT invasion, mass effect, and lower capillary permeability on perfusion CT16. In studies of oropharyngeal tumors, radiomic features were able to discriminate the risk of recurrence post-treatment with chemoradiation, generating favorable and unfavorable subtypes8. In another radiomic study on CT imaging, a set of radiomic features defined in lung cancer and compared with gene expression profiles using gene-set enrichment analysis (GSEA) had reasonable prognostic power in HNSCC7.

Despite recent advances in radiogenomics, few attempts have been made to comprehensively examine the relationship between radiomic features of CTs and the molecular subtypes of HNSCC that have been previously described. We hypothesize that the molecular features driving these distinct tumor subtypes would also be reflected in their radiographic appearance and radiomic profiles. In the current study, we use data from The Cancer Genome Atlas (TCGA) and matched patient data using The Cancer Imaging Archive (TCIA) to systematically identify what biological information can be derived from radiomic analysis of CT scans. First, two-dimensional radiomic features were extracted from pre-treatment CTs and stable features were used to perform consensus radiomic clustering. Second, we performed an integrated analysis to investigate the biological difference between the radiomic clusters based on biological data, including single nucleotide variations, copy number alterations, gene expression profiles, methylation profiles, immune cell populations, and immune microenvironmental features. Finally, we also explored how radiomic analysis can reveal the tumor microenvironment by building a predictive model of CD8+ T-cell infiltrate which is independent of HPV status. In summary, we perform an integrated radiomic-molecular microenvironmental analysis of HNSCC and examine whether radiomics can identify molecular subtypes and predict the fraction of CD8+ T-cells that play a pivotal role in the immune microenvironment.

METHODS

Imaging cohort

The study cohort consists of all HNSCC tumors for which imaging is publicly available from the TCIA (http://www.cancerimagingarchive.net/). Cases (N=188) with available pre-treatment CT scans with IV contrast were downloaded and imported into the Eclipse treatment planning system (Varian Medical Systems, Palo Alto, CA) for segmentation. Scans were excluded from further analysis based on the following criteria: 1) non-contrast scans, 2) scans of non-target anatomic locations, 3) post-operative scans, 4) poor quality scans for which tumors could not be visualized secondary to artifact or motion, and 5) cases with very small primary tumors (<5mm in greatest dimension). The primary tumor was manually delineated on all evaluable scans by a radiation oncologist (EK) and the delineation was independently confirmed by a neuroradiologist with more than 10 years of experience in head and neck cancer imaging (VH).

Images were assessed for the presence of CT artifacts (e.g., streak artifacts due to dental fillings) within the primary tumor volume, and slices with streak artifacts were excluded. As shown in prior studies, image features are robust with up to 50% of gross tumor volume (GTV) removed17. Thus, if fewer than 50% of the slices in a tumor were free of artifact, the case was excluded from the study8,17. Applying these criteria, the final cohort consisted of 77 patients. The corresponding matched clinical and genomic data were obtained from the TCGA for these same patients.

Radiomic feature extraction and quality assurance

In total, 104 two-dimensional radiomic features were extracted using the radiomic toolbox in the Computational Environment for Radiological Research (CERR)18. This toolbox has been validated against the Image Biomarker Standardisation Initiative (IBSI)19 and a digital phantom, and compared with other radiomics libraries such as PyRadiomics20 and Insight Toolkit (ITK)21. Extracted features fell into two categories: (i) first order statistics and (ii) higher order textures22. Higher order texture features included gray level co-occurrence matrix (GLCM), gray level run length matrix (GLRLM), gray level size zone matrix (GLSZM), neighborhood gray tone difference matrix (NGTDM), and neighboring gray level dependence matrix (NGLDM). For texture features, the intensity values in each image were discretized based on global minimum and maximum intensity values and the number of gray levels. These discretization parameters were selected based on the intensity distribution in the ROIs across the patient cohort. The minimum and maximum values were chosen to be the 0.005 quantile (−38 Hounsfield units [HU]) and the 0.995 quantile (195 HU) of the entire cohort. This range also ensured that air and bone foci were excluded from the delineated ROIs. Based on the histogram of intensity distribution across the cohort, the number of gray levels was chosen to be 32. A bin width of 5 HU was used to compute texture features. A neighborhood of 5 voxels was used for GLCM, NGTDM, and NGLDM features. Scans were resampled to a resolution of 0.6 × 0.6 × 3.5 mm. The texture features were computed using 2D neighborhoods, and shape-based features were not extracted, to exclude the effects of sub-sampling slices due to dental artifacts.
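
The fixed-range discretization described above can be sketched as follows. This is a minimal illustration and not the CERR implementation: the function name `discretize` is hypothetical, out-of-range voxels are clipped to the range (an assumption; a toolbox may instead exclude them), and only the 32-level scheme is shown, not the 5 HU bin width.

```python
def discretize(hu_values, lo=-38.0, hi=195.0, n_levels=32):
    """Map HU intensities to integer gray levels 1..n_levels.

    Values are clipped to [lo, hi] (assumption) and then placed into
    n_levels equal-width bins spanning that range.
    """
    width = (hi - lo) / n_levels
    levels = []
    for value in hu_values:
        clipped = min(max(value, lo), hi)
        # A voxel exactly at `hi` would land in bin n_levels + 1, so cap it.
        level = min(int((clipped - lo) / width) + 1, n_levels)
        levels.append(level)
    return levels


# Hypothetical voxel intensities (HU) inside one ROI.
roi = [-120.0, -38.0, 0.0, 80.0, 195.0, 400.0]
print(discretize(roi))  # [1, 1, 6, 17, 32, 32]
```

Texture matrices such as the GLCM would then be built from these integer levels rather than from the raw HU values.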

The majority of cases required omission of a subset of slices due to imaging artifacts, most often related to dental amalgam. Supplemental Table 1 shows the number of artifact-free slices in each sample. To identify stable radiomic features robust to this process, we employed the following quality assurance procedure: For each tumor, 75% of the artifact-free slices were randomly selected and the 104 radiomic features were computed. After 100 iterations of this task, the coefficient of variation of the 100 resulting values was computed for each feature. We repeated this process for each tumor. As a result, each feature had 77 coefficient of variation values. Features with a median coefficient of variation > 0.1 were considered unstable and removed, resulting in 82 stable radiomic features.
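
The stability check can be sketched as below, under simplifying assumptions: each tumor is represented by a list of per-slice numbers, and `feature_fn` is a hypothetical stand-in for a real radiomic feature computed on the selected slices. The 75% fraction, 100 iterations, and 0.1 threshold follow the text; everything else is illustrative.

```python
import random
import statistics


def coefficient_of_variation(values):
    """Standard deviation divided by the absolute mean."""
    mean = statistics.fmean(values)
    return statistics.pstdev(values) / abs(mean) if mean else float("inf")


def feature_cv_for_tumor(slices, feature_fn, frac=0.75, n_iter=100, seed=0):
    """CV of one feature over repeated random subsets of a tumor's slices."""
    rng = random.Random(seed)
    k = max(1, round(frac * len(slices)))
    values = [feature_fn(rng.sample(slices, k)) for _ in range(n_iter)]
    return coefficient_of_variation(values)


def is_stable(tumors, feature_fn, threshold=0.1):
    """A feature is kept when its median CV across tumors is <= threshold."""
    cvs = [feature_cv_for_tumor(t, feature_fn) for t in tumors]
    return statistics.median(cvs) <= threshold


# Toy data: 5 tumors with 12 artifact-free slices each (hypothetical values).
data_rng = random.Random(1)
tumors = [[50 + data_rng.gauss(0, 5) for _ in range(12)] for _ in range(5)]

print(is_stable(tumors, statistics.fmean))        # mean over slices: stable
print(is_stable(tumors, lambda s: s[0] - 50.0))   # depends on one slice: unstable
```

A feature that averages over the sampled slices barely moves between subsets, while one that hinges on a single slice fails the 0.1 threshold.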

Tumor subsites may differ by volume and there may be a volume effect when clustering. To ensure that radiomic features were independent of tumor volume, features that were highly correlated with tumor volume (Spearman’s correlation coefficient > 0.4) were also removed. As a result, the remaining 67 stable and volume-independent features were used for subsequent analysis (Supplemental Table 2). The 15 features removed were highly correlated with volume with p<0.001. Note that the feature stability and volume-independent tests were performed prior to clustering and modeling.
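
A minimal sketch of the volume filter follows. The helper names are hypothetical, the data are invented, and taking the absolute value of the correlation is an assumption (the text states only the 0.4 threshold).

```python
def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def volume_independent(features, volumes, threshold=0.4):
    """Names of features whose |rho| with volume does not exceed threshold."""
    return [name for name, values in features.items()
            if abs(spearman(values, volumes)) <= threshold]


# Hypothetical tumor volumes (cc) and two made-up features.
volumes = [2.0, 5.5, 7.1, 12.3, 19.7, 30.2]
features = {
    "grows_with_volume": [1.1, 2.0, 2.4, 3.9, 5.0, 8.8],
    "unrelated_texture": [3.2, 1.0, 2.9, 1.4, 3.0, 2.1],
}
print(volume_independent(features, volumes))  # ['unrelated_texture']
```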

Consensus clustering

Unsupervised consensus clustering was performed based on the 67 stable, volume-independent radiomic features23. Hierarchical clustering was carried out over 1000 iterations; at each iteration, a new input dataset was created by sampling 80% of samples and 80% of features from the original dataset without replacement. For each pair of samples, the consensus value was then computed as the number of iterations in which the two samples were assigned to the same cluster divided by the number of iterations in which they were sampled together. To visualize potential clusters in the cohort, a consensus heatmap was generated for each candidate number of clusters (K=2, 3, …, 6), along with the corresponding empirical cumulative distribution function (CDF) curve and a progression graph that indicates the relative change in the area under the CDF curves. These graphs also enable us to estimate the optimal cluster number. In addition, the proportion of ambiguous clustering (PAC) method was used to estimate the optimal cluster number24.
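
The resampling scheme and the PAC score can be sketched as follows. Several choices here are assumptions because the text does not specify them: average linkage with Euclidean distance, PAC bounds of 0.1 and 0.9, and 200 iterations (instead of 1000) to keep the toy example fast. All function names are hypothetical.

```python
import random


def average_linkage(points, k):
    """Naive agglomerative clustering (average linkage); returns labels."""
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(points[a], points[b])) ** 0.5

    def linkage(c1, c2):
        return sum(dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] += clusters.pop(j)  # j > i, so index i stays valid
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for m in members:
            labels[m] = label
    return labels


def consensus_matrix(data, k, n_iter=200, frac=0.8, seed=0):
    """Fraction of co-sampled iterations in which two samples co-cluster."""
    rng = random.Random(seed)
    n, p = len(data), len(data[0])
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(n_iter):
        rows = rng.sample(range(n), max(k, round(frac * n)))
        cols = rng.sample(range(p), max(1, round(frac * p)))
        labels = average_linkage([[data[r][c] for c in cols] for r in rows], k)
        for a, ra in enumerate(rows):
            for b, rb in enumerate(rows):
                sampled[ra][rb] += 1
                together[ra][rb] += labels[a] == labels[b]
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]


def pac(consensus, lower=0.1, upper=0.9):
    """Proportion of sample pairs whose consensus value is ambiguous."""
    n = len(consensus)
    values = [consensus[i][j] for i in range(n) for j in range(i + 1, n)]
    return sum(lower < v < upper for v in values) / len(values)


# Toy data: two well-separated groups of 6 samples with 5 features each.
data_rng = random.Random(42)
data = ([[data_rng.gauss(0, 1) for _ in range(5)] for _ in range(6)]
        + [[data_rng.gauss(10, 1) for _ in range(5)] for _ in range(6)])
pac2 = pac(consensus_matrix(data, 2))
pac3 = pac(consensus_matrix(data, 3))
print(pac2, pac3)
```

The candidate K with the lowest PAC is taken as the most stable partition; on this toy data K=2 has no ambiguous pairs.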

Recurrent gene mutations

The most frequently recurrent gene mutations in head and neck cancer and regions of recurrent copy number alterations were determined from the TCGA analysis and selected for further examination5. The top ten recurrently mutated genes were selected: AJUBA, FAT1, NFE2L2, CASP8, NOTCH1, NSD1, PIK3CA, CDKN2A, TP53, and KMT2D. For these genes, somatic single nucleotide variations, insertions, deletions, and copy number alterations were obtained from the cBioPortal database (https://www.cbioportal.org/)25. Differences between mutations and the imaging-based clusters were assessed using Fisher’s exact test.

HPV status

HPV status was determined as previously described26. All samples were analyzed for the presence of HPV using the TCGA RNA-Seq data. The HPV status was concordant with genomic, sequencing, and molecular data and consistent with prior publications by the TCGA.

Tumor subtypes identified using biological data

We examined tumor subtypes identified using consensus non-negative matrix factorization (NMF) clustering on RNA-Seq gene expression, DNA methylation, and copy number variation data27,28. This subtype information was downloaded from the Broad Institute FireBrowse (http://firebrowse.org/) portal29 and compared with the imaging-based clusters.

Immune infiltrates

Thirteen immune cell populations and immune microenvironmental features were selected from the Pan-Can TCGA immune analysis for evaluation30, including: Macrophages, M0 Macrophages, M2 Macrophages, NK activated, NK resting, Leukocyte fraction (LF), IFN-γ, CD4 T-cell memory resting, CD4 T-cell memory activated, Lymphocyte infiltration score, T regs, Lymphocytes, and CD8 T-cells. Details of the immune features have been previously described30. Briefly, immune cell fraction estimates such as CD8 T-cells were obtained from deconvolution analysis of RNA-seq data. These immune variables were compared between the radiomic clusters using logistic regression and odds ratios were calculated. Prediction modeling was performed on CD8+ T-cells, which have an emerging role in clinical decision-making. For the modeling, random forest regression was applied to the radiomic features with 10-fold cross validation. The performance was assessed using R² between observed and predicted CD8+ T-cells. In the random forest modeling, the number of trees was set to 500, the terminal node size was set to 5, and the number of features to be randomly chosen at each node split was set to the square root of the total number of features. The difference in these immune variables between HPV-positive and HPV-negative groups was assessed using the Wilcoxon rank sum test.

Validation cohort

Radiomic clustering results were validated using an independent dataset. The validation cohort consisted of 83 HNSCC cases with 1 laryngeal tumor, 31 oropharyngeal and 51 oral cavity tumors, all with pre-treatment CT scans with IV contrast from Memorial Sloan Kettering Cancer Center. Among the 32 laryngeal and oropharyngeal tumors, 27 were HPV-positive and 5 were HPV-negative. However, HPV status was not available for the 51 oral cavity tumors, as HPV status was not routinely obtained for oral cavity tumors because of its low prevalence at this subsite. The prevalence of high-risk HPV DNA in oral cavity squamous cell carcinoma in the United States has been reported at less than 6% in multiple studies31,32. We utilized the identical radiomics analysis pipeline for both the TCGA-TCIA matched patient cohort and the validation cohort.

Statistical analysis

Additional statistical analyses were performed. Differences in tumor volume between tumor subsites were assessed using ANOVA. Differences in clinical variables between radiomic clusters were assessed using Fisher’s exact test for categorical variables and the Wilcoxon rank sum test for continuous variables. Radiomic features that showed significant differences between radiomic clusters were identified using the Wilcoxon rank sum test. Tumor subtypes identified using biological data were compared with radiomic clusters using the extended Fisher’s exact test. All analyses were performed using R and STATA software.

RESULTS

Radiomic landscape of HNSCC

The study schema is shown in Fig. 1. In total, 188 patients with HNSCC were identified in the TCIA. After strict quality assurance, 77 patients were included for analysis. The detailed analysis pipeline is shown in Supplemental Fig. 1. There was a significant difference in tumor volume (for evaluable CT slices) with p=0.0495 (ANOVA test); mean volumes were 17.7, 19.7, and 6.5cc for oral cavity, laryngeal, and oropharyngeal tumors, respectively (Supplemental Fig. 2A).

Figure 1. Radiogenomics approach and study schema.

(1) Radiomics: A patient with biopsy proven head and neck cancer undergoes diagnostic imaging CT with IV contrast that is stored in the TCIA repository. Following this, the region of interest (ROI) is manually segmented and independently validated by an experienced neuroradiologist. Radiomic features are extracted from the ROI using the Computational Environment for Radiological Research (CERR) radiomics toolbox.

(2) Genomics: The patient undergoes tissue sampling and genomic information including recurrent gene mutation, gene expression clustering, DNA methylation clustering, copy number variation clustering, intra-tumoral heterogeneity, and immunome data is extracted. Genomic data is stored in the TCGA repository.

(3) Radiogenomics correlates imaging features with genomics data and aims to identify biological subtypes of HNSCC.

To explore the landscape of radiomic features in HNSCC, we performed agglomerative hierarchical clustering across samples and across radiomic features (Fig. 2A). In this unsupervised analysis, HPV-positive oropharyngeal tumors clustered together, and considerable redundancy was noted between radiomic features. Consensus clustering resulted in two clearly separable radiomic clusters, with 39 and 38 patients in clusters 1 and 2, respectively (Fig. 2B, Supplemental Figs. 2B and 2C). Two clusters were found to be the optimal number of clusters using the PAC method.

Figure 2. Radiomic analysis of HNSCC reveals two stable and distinct clusters.

(A) Hierarchical clustering on radiomic features. The matrix of Z-scores was scaled to range from −2 to 2. The annotation bar shows HPV status, tumor subsite, and consensus clustering results based on immune subtypes.

(B) The consensus matrix of hierarchical clustering demonstrates two stable and coherent clusters. Consensus values range between 0 and 1, colored from light purple to dark purple. A consensus value of 1 for two samples indicates that they are always assigned to the same cluster, whereas 0 indicates that they are never assigned to the same cluster.

(C) Representative CT scans with the contoured regions of interest (ROIs) from the two radiomic clusters. Both cases are oral cavity tumors which are HPV-negative.

Patient characteristics, stratified by radiomic clusters, are shown in Table 1. There was a significant difference in tumor subsite between the two radiomic clusters (extended Fisher’s exact test p=0.0096). Cluster 1 was significantly enriched for oropharyngeal primaries and tumors that were HPV-associated (extended Fisher’s exact test p=0.0127). By contrast, cluster 2 was enriched with oral cavity primaries, and laryngeal tumors were evenly distributed across the two radiomic clusters. Representative axial images from patients in each cluster are shown in Fig. 2C. The two images are from oral cavity primaries.

Table 1.

Patient characteristics of the two radiomic clusters. P-values were computed using Fisher’s exact test. For age, Wilcoxon rank sum test was used.

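| Characteristics | Radiomic Cluster 1 (N=39) | Radiomic Cluster 2 (N=38) | P-value |
| --- | --- | --- | --- |
| Age at treatment, years (standard deviation) | 60.8 (10.8) | 61.4 (8.7) | 0.8464 |
| T-stage | | | 0.2911 |
| T1 | 1 | 0 | |
| T2 | 8 | 3 | |
| T3 | 14 | 15 | |
| T4 | 16 | 20 | |
| N-stage | | | 0.8636 |
| N0 | 16 | 19 | |
| N1 | 5 | 5 | |
| N2 | 15 | 12 | |
| N3 | 3 | 2 | |
| Subsite | | | 0.0096 |
| Oral cavity | 15 | 23 | |
| Larynx | 14 | 14 | |
| Oropharynx | 10 | 1 | |
| HPV status | | | 0.0127 |
| Positive | 11 | 2 | |
| Negative | 28 | 36 | |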
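
As a check on the HPV row of Table 1, the two-sided Fisher's exact test for a 2×2 table can be computed directly from the hypergeometric distribution. This is a minimal standard-library sketch; the function name is hypothetical, and the two-sided rule used (summing all tables no more probable than the observed one) is the common convention, assumed here to match the software used in the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    p_obs = prob(a)
    # Sum every table at least as extreme (no more probable) as observed.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))


# Rows: HPV-positive, HPV-negative; columns: radiomic cluster 1, cluster 2.
p_value = fisher_exact_two_sided(11, 2, 28, 36)
print(f"{p_value:.4f}")  # 0.0127
```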

To better understand the radiomic differences between the two radiomic clusters, we identified features that were differentially distributed between clusters (Wilcoxon rank sum test p <0.05) and iteratively removed redundant features (Pearson correlation coefficient > 0.7). As a result, seven features were identified: one first order feature (quartile coefficient of dispersion), two GLCM features (Haralick correlation and joint energy), one GLRLM feature (run entropy), one GLSZM feature (zone entropy), and two NGLDM features (high dependence low gray level emphasis and dependence count variance).

Molecular differences correlate with radiomic phenotype

Next, we explored the molecular differences between radiomic clusters. Mutational data, as well as consensus clustering data from gene expression, copy number alterations, and methylation were retrieved from the TCGA and are displayed in oncoprint (Fig. 3A)5.

Figure 3. Radiomic clusters identify important clinical and genomic characteristics of HNSCC.

(A) Oncoprint enables visualization of multiple genomic alterations such as mutation events for each patient. Oncoprint shows the distributions of the most frequent alterations in HNSCC in radiomic clusters 1 and 2. Mutations are color coded in blue. The annotation bar reveals gene expression clustering, copy number variation (CNV) clustering, and DNA methylation clustering.

(B) Associations of the top 10 recurrently mutated genes in HNSCC with the two radiomic clusters. Odds ratios are given for recurrent gene mutations comparing cluster 1 vs cluster 2.

We explored whether the frequency of recurrently mutated genes was differentially distributed between radiomic clusters. Due to limited sample size, we restricted our analysis to the ten most frequently altered genes in HNSCC5. Odds ratios for differences in the number of mutations between the two radiomic clusters were computed for each gene (Fig. 3B). There was a trend toward increased mutation rates for TP53 (p=0.1355), FAT1 (p=0.1781), and CDKN2A (p=0.1123) in radiomic cluster 2, but none of these differences was significant at the 0.05 level.

The methylation profiles were significantly different between radiomic clusters with p=0.0025 (extended Fisher’s exact test; Supplemental Fig. 3C). Radiomic cluster 1 was enriched for MT5 tumors, which are associated with HPV33. By contrast, radiomic cluster 2 was enriched for MT4 and MT6 tumors. No significant differences in gene expression or copy number variation were identified between radiomic clusters (Supplemental Figs. 3A and 3B).

Correlation between radiomic features and immune infiltrates

We examined the immune microenvironment of head and neck tumors using data from Thorsson et al.30. Thirteen immune variables were selected for analysis (Fig. 4A). Odds ratios (ORs) comparing the respective radiomic clusters were computed for each immune variable. Compared with radiomic cluster 2, radiomic cluster 1 was associated with a greater degree of CD8 T-cell infiltrate (OR=2.0, 95% confidence interval [CI]: 1.12–3.52; p=0.0190), and a lesser degree of immunosuppressive macrophages (OR=0.60, 95% CI: 0.37–0.97; p=0.0390) and M2-polarized macrophages (OR=0.54, 95% CI: 0.32–0.90; p=0.0190).

Figure 4. Radiomic clusters and the immune microenvironment.

(A) Immune variables and their associations with the two radiomic clusters. Odds ratios are given for 13 immune variables comparing cluster 1 vs cluster 2.

(B) Random forest regression modeling of CD8+ T-cells on imaging features using a 10-fold cross validation. TIME: Tumor Immune Microenvironment.

To further explore the relationship between radiomic features and the immune microenvironment, we used random forest regression on 67 radiomic features to model CD8+ T-cell infiltrate. Fig. 4B shows a scatter plot of predicted versus observed CD8+ T-cell fraction, obtained from 10-fold cross validation repeated over 100 randomized iterations. Each predicted CD8+ T-cell fraction is the average of the 100 predicted values. Significant prediction power was obtained with R²=0.30 (p<0.0001). Our radiomic model was predictive for both HPV-positive (R²=0.36; p=0.0405) and HPV-negative (R²=0.16; p=0.0012) subsets. To compute the sensitivity, specificity, and accuracy, the predicted CD8+ T-cell fraction was dichotomized. Sensitivity is defined as the percentage of high abundance of CD8 infiltrate in cluster 1 and specificity is defined as the percentage of low abundance of CD8 infiltrate in cluster 2. After 100 iterations of random forest modeling, the average sensitivity, specificity, and accuracy were 67.1%, 64.4%, and 65.7%, respectively.
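
The evaluation protocol (repeated 10-fold cross validation with predictions averaged per sample) can be sketched without a random forest. In the illustration below a one-variable least-squares line is swapped in for the random forest purely to keep the example self-contained; the data are synthetic, the function names are hypothetical, and R² is taken as the squared Pearson correlation between observed and averaged predicted values, which is an assumption about the definition used.

```python
import random


def fit_line(x, y):
    """Least-squares slope and intercept (stand-in for the real model)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def repeated_cv_predictions(x, y, n_folds=10, n_repeats=100, seed=0):
    """Average out-of-fold prediction per sample over repeated k-fold CV."""
    rng = random.Random(seed)
    n = len(x)
    sums = [0.0] * n
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)  # a new random fold assignment each repeat
        for f in range(n_folds):
            test = idx[f::n_folds]
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            slope, intercept = fit_line([x[i] for i in train],
                                        [y[i] for i in train])
            for i in test:
                sums[i] += slope * x[i] + intercept
    return [s / n_repeats for s in sums]


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted."""
    mo, mp = sum(observed) / len(observed), sum(predicted) / len(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


# Synthetic cohort of 77 samples with one predictor (not study data).
data_rng = random.Random(1)
x = [data_rng.uniform(0, 10) for _ in range(77)]
y = [2 * a + data_rng.gauss(0, 1) for a in x]
predicted = repeated_cv_predictions(x, y)
r2 = r_squared(y, predicted)
print(round(r2, 2))
```

Because every prediction comes from a model that never saw that sample, the resulting R² estimates out-of-sample performance rather than goodness of fit.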

We confirmed that HPV-positive tumors were associated with increased CD8 T-cell fraction (Wilcoxon rank sum test p=0.0061; Supplemental Fig. 4). For HPV status, sensitivity is defined as the percentage of HPV-positive tumors that fall in cluster 1 and specificity is defined as the percentage of HPV-negative tumors that fall in cluster 2. The sensitivity, specificity, and accuracy were 84.6%, 56.3%, and 61.0%, respectively.
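
These percentages follow directly from the cluster-by-HPV counts. The sketch below (hypothetical function name) reproduces them from Table 1 and, for comparison, from the validation counts given in the Validation section under its stated assumption that oral cavity tumors are HPV-negative.

```python
def cluster_agreement(pos_c1, pos_c2, neg_c1, neg_c2):
    """Sensitivity, specificity, accuracy with cluster 1 read as 'positive'."""
    sensitivity = pos_c1 / (pos_c1 + pos_c2)
    specificity = neg_c2 / (neg_c1 + neg_c2)
    accuracy = (pos_c1 + neg_c2) / (pos_c1 + pos_c2 + neg_c1 + neg_c2)
    return sensitivity, specificity, accuracy


# Discovery cohort (Table 1): HPV-positive 11 vs 2, HPV-negative 28 vs 36.
discovery = cluster_agreement(11, 2, 28, 36)
# 11/13 = 0.846, 36/64 = 0.5625 (reported as 56.3%), 47/77 = 0.610

# Validation cohort: 24 of the 27 HPV-positive tumors fall in cluster 1
# (41 tumors) and the remaining 3 in cluster 2 (42 tumors).
validation = cluster_agreement(24, 3, 17, 39)
# 24/27 = 0.889, 39/56 = 0.696, 63/83 = 0.759

print(discovery)
print(validation)
```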

Validation

To validate the findings obtained using the discovery cohort (TCIA data), we performed the identical radiomics analysis pipeline on an independent cohort of 83 HNSCC. Consensus clustering using the same 67 radiomic features resulted in two clearly separable radiomic clusters, with 41 and 42 patients in clusters 1 and 2, respectively (Fig. 5A). Two clusters were found to be the optimal number of clusters using the PAC method. Cluster 1 consisted of 27 oropharyngeal tumors (24 out of 27 (89%) were HPV-positive) and 14 oral cavity tumors. Cluster 2 consisted of 37 oral cavity tumors, 4 oropharyngeal tumors, and 1 larynx tumor. There was a significant difference in tumor subsite between the two radiomic clusters (extended Fisher’s exact test p=1.3×10⁻⁷).

Figure 5. A validation cohort of HNSCC reveals two stable and distinct radiomic clusters.

(A) The consensus matrix of hierarchical clustering demonstrates two stable and coherent clusters. Consensus values range between 0 and 1, colored from light purple to dark purple. The annotation bar shows tumor subsite.

(B) A sensitivity analysis, using simulations assuming the HPV-positive prevalence of 5%, 10%, 15%, and 20% within the oral cavity cases in the validation cohort. P-values indicate the statistical significance of the difference in HPV status between the two radiomic clusters.

Due to the low prevalence of HPV infection (less than 10%) in oral cavity cancers31, HPV/p16 status is not routinely assessed and was not available in the validation cohort. Assuming that all oral cavity tumors are HPV-negative, we found that HPV status was significantly associated with radiomic clusters (Fisher’s exact test p=4.0×10⁻⁷) and the sensitivity, specificity, and accuracy were 88.9%, 69.6%, and 75.9%, respectively.

As the above analysis required us to assume that there were no HPV-positive tumors among the oral cavity tumors, we subsequently performed a sensitivity analysis to investigate whether our findings would still hold if a small subset of oral cavity tumors were HPV-positive. We performed simulations, assuming an HPV-positive prevalence of 5%, 10%, 15%, and 20% within the oral cavity cases (Fig. 5B). Simulations assuming a 5% HPV-positive prevalence resulted in an average p-value of 1.4×10⁻⁵ over 1000 iterations (Fisher’s exact test; 95% CI: 1.3×10⁻⁵–1.5×10⁻⁵) for an association between radiomic clusters and HPV status (Fig. 5B and Supplemental Fig. 5). Our finding remained robust even with up to 20% HPV-positive prevalence in the oral cavity cohort, with an average p-value of 0.0012 (95% CI: 0.0010–0.0014). These results validate our findings observed in the discovery cohort.
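
One way such a simulation could be set up is sketched below. The cluster counts come from the Validation section; how positives were assigned in the study is not stated, so drawing each oral cavity tumor as HPV-positive independently with the given prevalence is an assumption, and the resulting averages are not expected to reproduce the reported values exactly. Function names are hypothetical.

```python
import random
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    p_obs = prob(a)
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))


# Validation cohort: cluster sizes, known HPV-positive tumors per cluster,
# and oral cavity tumors (unknown HPV status) per cluster.
N_C1, N_C2 = 41, 42
POS_C1, POS_C2 = 24, 3
ORAL_C1, ORAL_C2 = 14, 37


def mean_p_value(prevalence, n_iter=1000, seed=0):
    """Average Fisher p-value when oral cavity tumors are randomly positive."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_iter):
        extra_c1 = sum(rng.random() < prevalence for _ in range(ORAL_C1))
        extra_c2 = sum(rng.random() < prevalence for _ in range(ORAL_C2))
        pos_c1, pos_c2 = POS_C1 + extra_c1, POS_C2 + extra_c2
        total += fisher_exact_two_sided(pos_c1, pos_c2,
                                        N_C1 - pos_c1, N_C2 - pos_c2)
    return total / n_iter


baseline = fisher_exact_two_sided(POS_C1, POS_C2, N_C1 - POS_C1, N_C2 - POS_C2)
p_05, p_20 = mean_p_value(0.05), mean_p_value(0.20)
print(baseline, p_05, p_20)
```

As the assumed prevalence rises, more simulated positives land in cluster 2 (which holds most oral cavity tumors), so the association weakens and the average p-value grows.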

In addition, we performed an independent feature stability and volume analysis on the validation dataset with the same thresholds used in the TCIA data analysis. A smaller number of radiomic features, 33 features in total, were used for subsequent analysis. The validation dataset is from a single institution and is homogeneous compared to the TCIA data, and presumably this contributes to the stronger correlation between texture features and volume. Even though there were fewer stable, volume-independent features in the validation dataset, the overall results were similar to the radiomic analysis using 67 features of the matched TCIA-TCGA cohort. Consensus clustering using the 33 radiomic features resulted in two clearly separable radiomic clusters: cluster 1 consisted of 25 oropharyngeal tumors (21 out of 25 (84%) were HPV-positive), 15 oral cavity tumors, and 1 larynx tumor. Cluster 2 consisted of 36 oral cavity tumors and 6 oropharyngeal tumors. There was a significant difference in tumor subsite between the two radiomic clusters (extended Fisher’s exact test p=6.0×10⁻⁶). HPV status was associated with radiomic clusters with the assumption that all oral cavity cancers are HPV-negative (Fisher’s exact test p=0.0004), and the sensitivity, specificity, and accuracy were 77.8%, 64.3%, and 68.7%, respectively. The finding remained robust even up to 20% HPV-positive prevalence with an average p-value of 0.0371 (95% CI: 0.0340–0.0401) in the simulation.

Intratumoral heterogeneity

Finally, we examined the correlation between intratumoral genetic heterogeneity and radiomic features. We report additional analysis results in the Supplemental document.

DISCUSSION

HNSCC encompasses a heterogeneous group of diseases with a diverse range of molecular alterations and clinical behavior that makes optimal treatment selection a challenging task. Emerging molecular biomarkers stratify tumors based upon gene expression, copy number alterations, and mutational profiles5,34. In contrast to histopathologic and molecular biomarkers, which require invasive sampling and represent a small percentage of the tumor, radiomic biomarkers can be non-invasively obtained with standard of care imaging, are easily repeated over time (delta-radiomic features) and can assess the entire tumor volume. We hypothesized that the molecular features driving these distinct tumor subtypes would also be reflected in their radiographic appearance and radiomic profiles. In this integrated radiomic-molecular-microenvironmental analysis of HNSCC, we explore correlations between the pre-treatment radiomic profile of HNSCC and underlying tumor-host biology. We 1) perform unsupervised analysis of radiomic features and identify differences in immune infiltrate and HPV status; 2) confirm findings from prior studies that show radiomic differences between HPV-positive and HPV-negative tumors; and 3) model tumor infiltrating CD8 T-cells based upon radiomic features in an HPV-independent manner. Our results suggest that additional molecular information can be gained from radiomic analysis of CT images which may be able to influence therapeutic decision-making.

In the current study, unsupervised clustering of radiomic features identified significant differences in tumor subtype, with oropharyngeal HPV-positive tumors clustering together, and the findings were validated using an independent cohort. Prior radiomic studies35–37 have suggested that radiomic signatures may distinguish HPV status (p16 immunohistochemistry) in oropharyngeal tumors, as HPV-positive tumors are thought to be more homogeneous in CT density and characterized by lower contrast uptake, lower minimum density, and higher changes in the intensity of adjacent voxels.

Gene expression profiles have identified molecular subtypes of HNSCC (basal, mesenchymal, atypical, and classical)38. This classification system was subsequently validated using 279 patients with complete data from the TCGA5. In our analysis, we utilized gene expression clusters based on consensus NMF clustering from 528 TCGA samples. There was no difference in the distribution of gene expression clusters between the two radiomic clusters. Moreover, we did not detect a difference in the distribution of copy number alteration clusters between the radiomic clusters.

We further assessed whether methylation subtypes derived from consensus NMF clustering were differentially distributed between the radiomic clusters. Differentially methylated regions have been identified in HNSCC using the TCGA data, and HPV-positive oropharyngeal carcinomas were found to have higher DNA methylation levels compared to normal samples and non-HPV-related HNSCC33. In our study, DNA methylation clustering was significantly different between the two radiomic clusters. Virtually all of the patients in methylation cluster 5, which was strongly associated with HPV-positivity, were segregated into radiomic cluster 1, as shown in Supplemental Fig. 3C. By comparison, methylation clusters 4 and 6 were enriched in radiomic cluster 2. Multiple prior studies have suggested that HPV-positive tumors differ in CpG island methylation, and this is likely driving the difference in the methylation clustering33,39. Epigenetic factors are ideal therapeutic targets, and our results suggest that radiomics may capture methylation status, which warrants future investigation. A recent study examined matched head and neck TCGA and TCIA patient data and performed linear regression analysis and gene set enrichment analysis to identify associations between imaging features and genomic features37. Radiomic features from a cohort of 126 cases were used to predict HPV and TP53 status with an AUC of 0.71 and 0.64, respectively. Neither clinical outcomes nor a secondary validation cohort was utilized. In contrast, the current study validated the radiomic clusters using an independent validation cohort.

Immune checkpoint blockade is efficacious across multiple cancer types, including HNSCC40. Despite proven benefit in the recurrent and metastatic setting, response rates to immune checkpoint blockade are in the 10 to 30% range, and most patients do not have an objective response. Several biomarkers predicting response to immune checkpoint blockade have been proposed, including PD-L1 staining, T-cell infiltrate, and more complex characterizations of the immune microenvironment30,41. Tumor behavior is influenced not only by the immune function within the tumor and host, but also by the immune stromal microenvironment42. The local tissue response to tumor progression resembles an inflammatory response, including increased vascular permeability, cytokine release, lymphocyte infiltration, and fibrosis. Qualitative imaging correlates of the immune-inflammatory response have been reported on CT43. Moreover, quantitative radiomic analysis linking radiomic signatures to tumor immune markers prognostic for overall survival has been described in lung cancer44. In this study, we found that radiomic cluster 1 was associated with increased CD8+ T-cell infiltrate and reduced immunosuppressive M2-polarized macrophage infiltrate. To further expand on this finding, we trained a random forest model of CD8+ T-cell infiltrate based on radiomic features. On cross validation, we found that this model was predictive of CD8+ T-cell infiltrate in both HPV-positive and HPV-negative subgroups. Our radiomic model predicted CD8+ T-cell infiltrate better than HPV status alone. Moreover, the radiomic clusters significantly correlated with immune variables, specifically CD8+ T-cells. As expected, immune variables significantly differed between HPV-positive and HPV-negative groups, with HPV-positive tumors having a higher mean CD8+ T-cell infiltrate.
Higher CD8+ T-cell infiltrate has been shown to be predictive of anti-PD-1 benefit among HPV-negative tumors in HNSCC, and CD8+ T-cell prediction may be of clinical significance for determining which tumors may benefit most from immunotherapy45. Using a validation cohort, we validated the radiomic clusters. To our knowledge, the current study represents the first radiomics analysis utilizing machine learning methods to establish an association with the tumor-host immune response and immune microenvironment exclusively in HNSCC.
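The cross-validation step behind the CD8+ T-cell model can be illustrated with a minimal sketch. The study used a random forest; to stay dependency-free, the sketch swaps in a one-feature least-squares line as a stand-in, so only the cross-validation logic carries over. The data are synthetic, the names `fit_line` and `cross_validated_r2` are hypothetical, and computing R2 as 1 - SS_res/SS_tot on pooled out-of-fold predictions is an assumption about how such a value might be obtained, not a description of the authors' pipeline.

```python
import random


def fit_line(xs, ys):
    """Least-squares slope and intercept; a simple stand-in for the random forest."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def cross_validated_r2(xs, ys, k=5, seed=0):
    """Pool out-of-fold predictions over k folds and return their R^2."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering every sample
    preds = [0.0] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            # Prediction for a sample the fitted model never saw.
            preds[i] = slope * xs[i] + intercept
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Synthetic data only: one hypothetical radiomic feature and a noisy immune score.
rng = random.Random(1)
xs = [rng.gauss(0.0, 1.0) for _ in range(77)]
ys = [0.5 * x + rng.gauss(0.0, 1.0) for x in xs]
r2 = cross_validated_r2(xs, ys)
print(f"cross-validated R^2 on synthetic data: {r2:.2f}")
```

Because every prediction comes from a model fitted without that sample, the pooled R2 is a less optimistic estimate than one computed on the training data, which is the reason cross validation is used to limit overfitting.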

Other groups have identified radiomic correlates of CD8+ T-cell infiltrate. Tang and colleagues44 developed a radiomic model based on four image features representative of immune-pathology subtypes which correlated with survival and immune pathology. The authors stained tumors for two immune pathologic metrics, namely CD3 count and PD-L1. More recently, Sun et al. developed and validated a radiomics-based biomarker of CD8+ T-cells using patients included in phase 1 trials of PD-1 and PD-L1 immunotherapy46. In contrast to the current study, Sun et al. modeled CD8 by the CD8B gene and used a heterogeneous group of tumor histologies with varying sizes, intrinsic degrees of CD8+ T-cell infiltrate, and responses to immune checkpoint blockade. Although their radiomic model was only moderately correlated with CD8+ T-cell infiltrate in the bladder, lung adenocarcinoma, lung squamous cell carcinoma, and hepatocellular carcinoma subsets (Spearman’s correlation coefficient=0.559; p=0.00022), it successfully predicted both response rate and survival after immunotherapy. Of note, the radiomics model proposed by Sun et al. did not successfully stratify tumors by CD8+ T-cell infiltrate in the HNSCC subgroup (p=0.18).

There are various strengths and weaknesses of our study which need to be highlighted. Our utilization of multi-institutional imaging CT protocols from the TCIA to identify radiomic features predicting biological subtypes in HNSCC and to generate a machine learning-based model predictive of CD8+ T-cells substantiates the potential strength of large-scale applications of radiomics in head and neck cancer. We employed several quality-assurance techniques to mitigate the effect of technical artifacts on our analysis. Prospective radiomic studies in head and neck cancer may use a bite block to angle and position the neck to limit artifact at the level of the GTV and enable inclusion of a larger cohort. In addition, metal artifact reduction (MAR) scans may be utilized to reduce streak artifact from amalgam fillings. While some reports suggested that variation in scanner parameters such as slice thickness and reconstruction kernels may affect texture features quantified on CT images47, radiomic models have been shown to be consistent, reproducible, and translatable across disease subsites and with the inclusion of artifact in multiple independent data sets48. We performed quality assurance tests to find stable imaging features and utilized volume-independent radiomic features for our analysis49. The size of our patient cohort is comparable to the published literature50–52. A validation cohort was used to validate the presence of two radiomic clusters. To limit the effect of overfitting, we utilized cross validation in our radiomic prediction model. Limitations include the fact that CT scans were collected retrospectively and from multiple institutions. In addition, beam hardening artifact caused by metallic fillings or implants was a concern. To mitigate this effect, we limited our analysis to 2-dimensional features that could be assessed on artifact-free slices.
In addition, progression free survival (PFS) was not available in most cases from the TCGA cohort, restricting our analysis. We found that six radiomic features were correlated with overall survival with a p-value < 0.05 using the Wilcoxon rank sum test. However, after multiple testing correction, none of them remained significant. This is likely due to the limited number of samples available to assess disease outcomes. On the other hand, 46 radiomic features were correlated with HPV status with false discovery rate < 0.05 after multiple testing correction. Moreover, immune variable information (including CD8+ T-cell fraction) was not available in our validation cohort. We are currently performing additional validation studies with cohorts that have complete immune microenvironment information as well as clinical outcomes data, including PFS.
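To illustrate why features with a raw p-value below 0.05 can lose significance after multiple testing correction, the following minimal sketch applies a Benjamini-Hochberg adjustment. The text does not name the correction procedure, so Benjamini-Hochberg is an assumption here, and the p-values are made up for illustration (six values below 0.05 among 67 tests); they are not the study's results. The function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for 67 features: six below 0.05, the rest far from it.
raw = [0.004, 0.011, 0.019, 0.027, 0.036, 0.048] + [0.5] * 61
adj = benjamini_hochberg(raw)
n_raw = sum(p < 0.05 for p in raw)   # features significant before correction
n_fdr = sum(q < 0.05 for q in adj)   # features significant at FDR < 0.05
print(n_raw, n_fdr)
```

With these illustrative inputs, the smallest raw p-value of 0.004 is scaled by the 67 tests, so it no longer falls under the 0.05 threshold. A signal shared by many features, as reported for HPV status, survives the same adjustment far more easily.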

CONCLUSION

Radiomic features extracted from CT scans were correlated with tumor molecular features and the tumor immune microenvironment. While validation studies are underway, our study suggests that radiomics carries significant potential for image biomarker development.

Supplementary Material

Supplemental Figure 1. Data analysis pipeline.

In total, 104 radiomic features were extracted. After quality assurance test and removal of features that are highly correlated with tumor volume, 67 features remained, which were used for further radiomic, genomic, immune microenvironment, and intratumoral genetic heterogeneity analyses.

Supplemental Figure 2. Additional radiomics analysis.

(A) Bar plots showing the distribution of tumor volumes in each subsite for the cohort. Oropharyngeal tumors are significantly smaller than tumors in other subsites.

(B) Consensus clustering cumulative distribution function (CDF) plots for the varying number of clusters (K) from 2 to 6.

(C) Delta area plot showing the change in the area under CDF curves with the increasing number of clusters, comparing K and K-1.

Supplemental Figure 3. Imaging subtypes correlate with biologic subtypes identified using the TCGA data.

(A) Comparison of radiomic clustering and gene expression clustering with four clusters resulting from consensus non-negative matrix factorization (NMF) clustering. Between the two radiomic clusters, no significant difference was found in gene expression clustering: Cluster 1= Classical, Cluster 2= Basal, Cluster 3= Atypical, Cluster 4= Mesenchymal. The bar graph shows the percentage of each gene expression cluster between the two radiomic clusters (the sum equals 100%).

(B) Comparison of radiomic clustering and copy number variation (CNV) clustering with four clusters resulting from consensus NMF clustering. There was no significant difference in CNV clustering between the two radiomic clusters.

(C) Comparison of radiomic clustering and DNA methylation clustering with seven clusters resulting from consensus NMF clustering. The single sample in methylation cluster 7 is not depicted on the graph. A significant difference in methylation clusters between the two radiomic clusters was found with p=0.0025 (extended Fisher’s exact test).

Supplemental Figure 4. Correlation between CD8 T-cells and HPV status.

There was a significant difference in CD8 T-cells between HPV-positive and HPV-negative subgroups.

Supplemental Figure 5. Simulation test assuming the HPV-positive prevalence of 10% within oral cavity cases.

In each simulation test, 10% (n=5) of oral cavity tumors (n=51) were randomly selected and assigned as HPV-positive. Using the simulated HPV status in oral cavity tumors and already known HPV status in larynx and oropharyngeal tumors, Fisher’s exact test was performed. This task was repeated 1000 times.
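The simulation described in this caption can be sketched as follows, assuming the Fisher's exact test compares radiomic cluster membership with HPV status in a 2x2 table. All cluster labels and HPV counts below are hypothetical placeholders chosen only to make the sketch run, and the names `fisher_exact_two_sided` and `simulate` are not from the original analysis.

```python
import random
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def simulate(oral_clusters, known, n_positive=5, n_runs=1000, seed=0):
    """Randomly assign HPV-positive status to oral cavity tumors and test each run.

    oral_clusters: radiomic cluster (1 or 2) of each oral cavity tumor.
    known: (cluster, hpv_positive) pairs for tumors with known HPV status.
    """
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_runs):
        positive = set(rng.sample(range(len(oral_clusters)), n_positive))
        cases = list(known) + [(c, i in positive) for i, c in enumerate(oral_clusters)]
        # Rows: radiomic cluster 1 or 2; columns: HPV-positive or HPV-negative.
        a = sum(1 for c, h in cases if c == 1 and h)
        b = sum(1 for c, h in cases if c == 1 and not h)
        c2 = sum(1 for c, h in cases if c == 2 and h)
        d = sum(1 for c, h in cases if c == 2 and not h)
        p_values.append(fisher_exact_two_sided(a, b, c2, d))
    return p_values


# Hypothetical inputs: 51 oral cavity tumors plus tumors with known HPV status.
oral_clusters = [1] * 15 + [2] * 36
known = [(1, True)] * 9 + [(1, False)] * 4 + [(2, True)] * 2 + [(2, False)] * 11
p_values = simulate(oral_clusters, known)
fraction = sum(p < 0.05 for p in p_values) / len(p_values)
print(f"fraction of runs with p < 0.05: {fraction:.3f}")
```

The fraction of runs that stay below 0.05 indicates how sensitive the cluster-HPV association is to the unknown HPV status of oral cavity tumors under the assumed 10% prevalence.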

Highlights.

  1. Radiomics of CT-scans distinguishes tumor microenvironment (TME)

  2. Radiomics can determine the fraction and activation of CD8+ T-cells in the TME

  3. Radiomics distinguishes head & neck cancer sub-types: HPV-positive vs HPV-negative

  4. Radiomics does not identify the tumor clonal structure

Acknowledgments

Funding Support: This research was funded in part through National Institutes of Health/National Cancer Institute Cancer Center Support grant P30 CA008748, R21 CA234752, and MSKCC Imaging and Radiation Sciences grant.

Footnotes

Conflict of Interest Disclosures: Dr. Nancy Lee is a consultant for Pfizer, Merck, Merck Serono, Sanofi and owns stocks in Astra Zeneca. The rest of the authors have no conflicts of interest to disclose.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  • 1. Ferlay J, Soerjomataram I, Dikshit R, et al. Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. Int J Cancer 2015;136:E359–86.
  • 2. Stransky N, Egloff AM, Tward AD, et al. The mutational landscape of head and neck squamous cell carcinoma. Science 2011;333:1157–60.
  • 3. Riaz N, Morris LG, Lee W, Chan TA. Unraveling the molecular genetics of head and neck cancer through genome-wide approaches. Genes Dis 2014;1:75–86.
  • 4. Le QT, Machtay M. Acceler-dated fractionation: the end of the era of the large, “one size fits all” trial for locally advanced head and neck cancer. Int J Radiat Oncol Biol Phys 2014;89:7–9.
  • 5. Cancer Genome Atlas N. Comprehensive genomic characterization of head and neck squamous cell carcinomas. Nature 2015;517:576–82.
  • 6. Lambin P, Rios-Velazquez E, Leijenaar R, et al. Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer 2012;48:441–6.
  • 7. Aerts HJ, Velazquez ER, Leijenaar RT, et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat Commun 2014;5:4006.
  • 8. Head MDACC, Neck Quantitative Imaging Working G. Investigation of radiomic signatures for local recurrence using primary tumor texture analysis in oropharyngeal head and neck cancer patients. Sci Rep 2018;8:1524.
  • 9. Gillies RJ, Kinahan PE, Hricak H. Radiomics: Images Are More than Pictures, They Are Data. Radiology 2016;278:563–77.
  • 10. Liang C, Huang Y, He L, et al. The development and validation of a CT-based radiomics signature for the preoperative discrimination of stage I-II and stage III-IV colorectal cancer. Oncotarget 2016;7:31401–12.
  • 11. Segal E, Sirlin CB, Ooi C, et al. Decoding global gene expression programs in liver cancer by noninvasive imaging. Nat Biotechnol 2007;25:675–80.
  • 12. Zinn PO, Mahajan B, Sathyan P, et al. Radiogenomic mapping of edema/cellular invasion MRI-phenotypes in glioblastoma multiforme. PLoS One 2011;6:e25451.
  • 13. Ganeshan B, Abaleke S, Young RC, Chatwin CR, Miles KA. Texture analysis of non-small cell lung cancer on unenhanced computed tomography: initial evidence for a relationship with tumour glucose metabolism and stage. Cancer Imaging 2010;10:137–43.
  • 14. Panth KM, Leijenaar RT, Carvalho S, et al. Is there a causal relationship between genetic changes and radiomics-based image features? An in vivo preclinical experiment with doxycycline inducible GADD34 tumor cells. Radiother Oncol 2015;116:462–6.
  • 15. Hoefling NL, McHugh JB, Light E, et al. Human papillomavirus, p16, and epidermal growth factor receptor biomarkers and CT perfusion values in head and neck squamous cell carcinoma. AJNR Am J Neuroradiol 2013;34:1062–6, S1-2.
  • 16. Pickering CR, Shah K, Ahmed S, et al. CT imaging correlates of genomic expression for oral cavity squamous cell carcinoma. AJNR Am J Neuroradiol 2013;34:1818–22.
  • 17. Ger RB, Craft DF, Mackin DS, et al. Practical guidelines for handling head and neck computed tomography artifacts for quantitative image analysis. Comput Med Imaging Graph 2018;69:134–9.
  • 18. Apte AP, Iyer A, Crispin-Ortuzar M, et al. Technical Note: Extension of CERR for computational radiomics: A comprehensive MATLAB platform for reproducible radiomics research. Med Phys 2018;45:3713–20.
  • 19. Zwanenburg A, Leger S, Vallieres M, and Lock S. Image biomarker standardisation initiative. arXiv:161207003 2018
  • 20. van Griethuysen JJM, Fedorov A, Parmar C, et al. Computational Radiomics System to Decode the Radiographic Phenotype. Cancer Res 2017;77:e104–e7.
  • 21. Beare R, Lowekamp B, Yaniv Z. Image Segmentation, Registration and Characterization in R with SimpleITK. J Stat Softw 2018;86.
  • 22. Haralick R SK, Dinstein I. Textural features for image classification. IEEE Transactions on systems, man and cybernetics 1973;SMC-3, No. 6:610–21.
  • 23. Monti Stefano TP, Mesirov Jill, and Golub Todd. Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data. Machine Learning 2003;52:91–118.
  • 24. Senbabaoglu Y, Michailidis G, Li JZ. Critical limitations of consensus clustering in class discovery. Sci Rep 2014;4:6207.
  • 25. Gao J, Aksoy BA, Dogrusoz U, et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal 2013;6:pl 1.
  • 26. Nulton TJ, Olex AL, Dozmorov M, Morgan IM, Windle B. Analysis of The Cancer Genome Atlas sequencing data reveals novel properties of the human papillomavirus 16 genome in head and neck squamous cell carcinoma. Oncotarget 2017;8:17684–99.
  • 27. Brunet JP, Tamayo P, Golub TR, Mesirov JP. Metagenes and molecular pattern discovery using matrix factorization. Proc Natl Acad Sci U S A 2004;101:4164–9.
  • 28. Lee DD, Seung HS. Learning the parts of objects by non-negative matrix factorization. Nature 1999;401:788–91.
  • 29. Deng M, Bragelmann J, Kryukov I, Saraiva-Agostinho N, Perner S. FirebrowseR: an R client to the Broad Institute’s Firehose Pipeline. Database (Oxford) 2017;2017.
  • 30. Thorsson V, Gibbs DL, Brown SD, et al. The Immune Landscape of Cancer. Immunity 2018;48:812–30 e14.
  • 31. Lingen MW, Xiao W, Schmitt A, et al. Low etiologic fraction for high-risk human papillomavirus in oral cavity squamous cell carcinomas. Oral Oncol 2013;49:1–8.
  • 32. Herrero R, Castellsague X, Pawlita M, et al. Human papillomavirus and oral cancer: the International Agency for Research on Cancer multicenter study. J Natl Cancer Inst 2003;95:1772–83.
  • 33. Ren S, Gaykalova D, Wang J, et al. Discovery and development of differentially methylated regions in human papillomavirus-related oropharyngeal squamous cell carcinoma. Int J Cancer 2018;143:2425–36.
  • 34. Seiwert TY, Burtness B, Mehra R, et al. Safety and clinical activity of pembrolizumab for treatment of recurrent or metastatic squamous cell carcinoma of the head and neck (KEYNOTE-012): an open-label, multicentre, phase 1b trial. Lancet Oncol 2016;17:956–65.
  • 35. Leijenaar RT, Bogowicz M, Jochems A, et al. Development and validation of a radiomic signature to predict HPV (p16) status from standard CT imaging: a multicenter study. Br J Radiol 2018;91:20170498.
  • 36. Yu K, Zhang Y, Yu Y, et al. Radiomic analysis in prediction of Human Papilloma Virus status. Clin Transl Radiat Oncol 2017;7:49–54.
  • 37. Zhu Y, Mohamed ASR, Lai SY, et al. Imaging-Genomic Study of Head and Neck Squamous Cell Carcinoma: Associations Between Radiomic Phenotypes and Genomic Mechanisms via Integration of The Cancer Genome Atlas and The Cancer Imaging Archive. JCO Clin Cancer Inform 2019;3:1–9.
  • 38. Walter V, Yin X, Wilkerson MD, et al. Molecular subtypes in head and neck cancer exhibit distinct patterns of chromosomal gain and loss of canonical cancer genes. PLoS One 2013;8:e56823.
  • 39. Beck TN, Golemis EA. Genomic insights into head and neck cancer. Cancers Head Neck 2016;1.
  • 40. Bauman JE, Cohen E, Ferris RL, et al. Immunotherapy of head and neck cancer: Emerging clinical trials from a National Cancer Institute Head and Neck Cancer Steering Committee Planning Meeting. Cancer 2017;123:1259–71.
  • 41. Charoentong P, Finotello F, Angelova M, et al. Pan-cancer Immunogenomic Analyses Reveal Genotype-Immunophenotype Relationships and Predictors of Response to Checkpoint Blockade. Cell Rep 2017;18:248–62.
  • 42. Quail DF, Joyce JA. Microenvironmental regulation of tumor progression and metastasis. Nat Med 2013;19:1423–37.
  • 43. Koay EJ, Truty MJ, Cristini V, et al. Transport properties of pancreatic cancer describe gemcitabine delivery and response. J Clin Invest 2014;124:1525–36.
  • 44. Tang C, Hobbs B, Amer A, et al. Development of an Immune-Pathology Informed Radiomics Model for Non-Small Cell Lung Cancer. Sci Rep 2018;8:1922.
  • 45. Hanna GJ, Lizotte P, Cavanaugh M, et al. Frameshift events predict anti-PD-1/L1 response in head and neck cancer. JCI Insight 2018;3.
  • 46. Sun R, Limkin EJ, Vakalopoulou M, et al. A radiomics approach to assess tumour-infiltrating CD8 cells and response to anti-PD-1 or anti-PD-L1 immunotherapy: an imaging biomarker, retrospective multicohort study. Lancet Oncol 2018;19:1180–91.
  • 47. Zhao B, Tan Y, Tsai WY, Schwartz LH, Lu L. Exploring Variability in CT Characterization of Tumors: A Preliminary Phantom Study. Transl Oncol 2014;7:88–93.
  • 48. Leijenaar RT, Carvalho S, Hoebers FJ, et al. External validation of a prognostic CT-based radiomic signature in oropharyngeal squamous cell carcinoma. Acta Oncol 2015;54:1423–9.
  • 49. Vallieres M DaHM V. Dependency of a validated radiomics signature on tumor volume and potential corrections. The Journal of Nuclear Medicine 2018;59:640.
  • 50. Gevaert O, Xu J, Hoang CD, et al. Non-small cell lung cancer: identifying prognostic imaging biomarkers by leveraging public gene expression microarray data--methods and preliminary results. Radiology 2012;264:387–96.
  • 51. Ganeshan B, Panayiotou E, Burnand K, Dizdarevic S, Miles K. Tumour heterogeneity in non-small cell lung carcinoma assessed by CT texture analysis: a potential marker of survival. Eur Radiol 2012;22:796–802.
  • 52. Ravanelli M, Farina D, Morassi M, et al. Texture analysis of advanced non-small cell lung cancer (NSCLC) on contrast-enhanced computed tomography: prediction of the response to the first-line chemotherapy. Eur Radiol 2013;23:3450–5.
