Cancers. 2021 Nov 17;13(22):5761. doi: 10.3390/cancers13225761

Prognostic Matrisomal Gene Panel and Its Association with Immune Cell Infiltration in Head and Neck Carcinomas

Yuri Belotti 1, Su Bin Lim 2, Narayanan Gopalakrishna Iyer 3,4, Wan-Teck Lim 4,5,6,*, Chwee Teck Lim 1,7,8,*
Editor: Steve Oghumu
PMCID: PMC8616409  PMID: 34830910

Abstract

Simple Summary

Squamous cell carcinoma of the head and neck (SCCHN) is a heterogeneous group of tumors arising from squamous cells lining different anatomic sites. This type of malignancy has been investigated primarily by focusing on tumor cells, but recent evidence has highlighted the importance of the tumor microenvironment (TME) in cancer growth, progression and metastasis. Hence, we hypothesized that dysregulated matrisomal components could have a common association with patient survival, irrespective of the subsite of origin of the SCCHN. Using bioinformatic methods and public datasets, we identified a gene panel with prognostic value in HPV-negative and non-metastatic node-negative tumors and demonstrated its association with immune cell infiltration.

Abstract

Squamous cell carcinoma of the head and neck (SCCHN) is common worldwide and related to several risk factors including smoking, alcohol consumption, poor dentition and human papillomavirus (HPV) infection. Different etiological factors may influence the tumor microenvironment and play a role in dictating response to therapeutics. Here, we sought to investigate whether an early-stage SCCHN-specific prognostic matrisome-derived gene signature could be identified for HPV-negative SCCHN patients (n = 168), by applying a bioinformatics pipeline to the publicly available SCCHN-TCGA dataset. We identified six matrisome-derived genes strongly associated with prognostic outcomes in SCCHN. A six-gene risk score, the SCCHN TMI (SCCHN-tumor matrisome index, composed of MASP1, EGFL6, SFRP5, SPP1, MMP8 and P4HA1), was constructed and used to stratify patients into risk groups. Using machine learning-based deconvolution methods, we found that the risk groups were characterized by a differing abundance of infiltrating immune cells. This work highlights the key role of infiltrating immune cells in the overall survival of patients affected by HPV-negative SCCHN. The identified SCCHN TMI represents a genomic tool that could potentially aid patient stratification and selection for therapy in these patients.

Keywords: extracellular matrix, head and neck cancer, bioinformatics, TCGA, HPV, prognostic biomarker

1. Introduction

SCCHN comprises a heterogeneous group of tumors arising from squamous cells lining different anatomic sites within the upper aerodigestive tract, such as the nasal cavity, paranasal sinuses, lips, oral cavity, oropharynx, hypopharynx or larynx [1,2]. Global incidence rates increased by 36.5% between 2005 and 2015 [3]. SCCHN predominantly affects people above 50 years old, with incidence rates higher among men than women [4]. Tobacco and alcohol exposure constitute the major risk factors for the development of such cancers [5,6]. Human papillomavirus (HPV) is also an important etiologic factor of SCCHN [7]. Aside from etiology, both tumor staging and pathological features have prognostic value [8]. The presence of metastases and aggressive pathological features such as extranodal extension (ENE), perineural invasion (PNI) or lymphovascular invasion (LVI) are prognostic factors for reduced survival [9,10,11]. Subsite-specific etiological factors and associated tumor and microenvironment differences may influence the clinical outcomes of existing modalities of treatment [12,13]. Recent advancements in high-throughput molecular profiling have also added new prognostic markers [14,15] that relate to the biology of the specific anatomical subsite of interest, as a consequence of their specific underlying molecular pathways [16]. This could account for the inherent heterogeneity of this malignancy and aid prognostication and possibly treatment selection and intervention.

Epithelial malignancies have been extensively investigated at the genomic and epigenomic levels focusing primarily on tumor cells. Recent evidence increasingly highlights the importance of the tumor microenvironment (TME) in cancer growth, progression and metastasis [17,18]. Therefore, a deeper understanding of the role of the cancer-associated extracellular matrix (ECM) components might help to identify new diagnostic and prognostic tools. In 2015, Naba, A. et al. [19] identified a list of 1068 human ECM genes encoding ECM and ECM-associated proteins and presented omics data indicating their roles in development, homeostasis and disease.

Bioinformatic approaches are powerful tools that enable whole-genome investigation of the abnormalities exhibited by cancer tissues from large groups of patients. Hence, in this study, we used a series of recently developed web-based tools and open-source software to conduct a bioinformatic study on a transcriptomic dataset publicly available in The Cancer Genome Atlas (TCGA) database [20]. First, based on recent studies [21,22], we hypothesized that dysregulated matrisomal components could have a common association with patient survival, irrespective of the subsite of origin of the SCCHN. Specifically, transcripts that were associated with survival in HPV-negative and non-metastatic node-negative tumors were examined to minimize confounding by treatment, stage, and etiology. Next, we defined a novel prognostic signature, the SCCHN-tumor matrisome index (SCCHN TMI), and assessed its prognostic ability across independent datasets and its association with immune cell infiltration.

2. Materials and Methods

2.1. SCCHN TCGA Data

A list of 1068 human matrisome genes, first published by Naba, A. et al. [19] and more recently revised [23], was retrieved from the M.I.T. “Matrisome Project” website (www.matrisomeproject.mit.edu/other-resources/human-matrisome, accessed on 10 March 2021) and loaded into the web-based tool XENA [24] (www.xenabrowser.net, accessed on 10 March 2021). The “TCGA Head and Neck Cancer (HNSC)” study was selected and configured as follows: (1) the “Genomic” data type was selected as the “first variable”; (2) the matrisomal genes were entered; (3) the normalized gene expression was selected; and (4) relevant “phenotypic” data were selected to investigate the various clinical covariates.

2.2. SCCHN scRNA-Seq Data

Transcript-level expression values (TPMs) for 23,686 genes, across 5902 cells derived from 18 SCCHN patients, including five matched pairs of primary tumors and lymph node metastases [25], were re-analyzed in this study to identify specific cell types expressing the SCCHN TMI genes. Scaling of the data and linear dimensional reduction were performed using the “Seurat” package (v4.0.1) in R (v4.0.3). Only cells that had been assigned a cell type by the authors of the original work (i.e., cells classified as “cancer cell”, “B cell”, “dendritic”, “endothelial”, “fibroblast”, “macrophage”, “mast”, “myocyte”, or “T cell”) were included in the analysis.

2.3. Construction of the SCCHN TMI Risk Score

Expression levels and clinical annotation for the HPV−, N0 subgroup were selected and imported into R (v4.0.2) and RStudio (v1.3.1073), where the package “RegParallel” (v1.8.0) was used to fit the Cox proportional hazard model independently for each gene. The genes with a log-rank p-value < 0.01 were selected to generate the SCCHN-specific prognostic signature (the SCCHN-tumor matrisome index, SCCHN TMI), which combines the expression level of each prognostic gene with its Cox regression coefficient (Beta_i): SCCHN TMI = $\sum_{i} \mathrm{Expression}(\mathrm{Gene}_i) \cdot \mathrm{Beta}_i$, where the sum runs over the prognostic genes. The effect of the SCCHN TMI scores on overall survival probabilities was assessed using a Cox proportional hazard model (using the “coxph” function of the “survival” package (v3.2-11) in R/Bioconductor [26]). The Cox proportional hazard assumption was checked by the scaled Schoenfeld residual test using the cox.zph function provided by the survival package in R/Bioconductor [26].
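The weighted sum that defines the risk score can be illustrated with a minimal Python sketch. The coefficient values below are invented placeholders (the fitted coefficients are reported in Table S1 and are not reproduced here); only their signs follow the text, with MASP1 and EGFL6 protective and the other four genes adverse. The function name `tmi_score` is likewise hypothetical.

```python
# Hypothetical Cox coefficients (Beta_i). These are NOT the fitted values
# from Table S1; they are placeholders chosen only to show the arithmetic.
HYPOTHETICAL_BETAS = {
    "MASP1": -0.30,
    "EGFL6": -0.20,
    "SFRP5": 0.25,
    "SPP1": 0.15,
    "MMP8": 0.20,
    "P4HA1": 0.40,
}


def tmi_score(expression, betas=HYPOTHETICAL_BETAS):
    """Return sum_i Expression(Gene_i) * Beta_i for one patient."""
    missing = [gene for gene in betas if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(expression[gene] * beta for gene, beta in betas.items())


# Invented normalized expression values for a single patient.
patient = {"MASP1": 2.0, "EGFL6": 1.0, "SFRP5": 0.0,
           "SPP1": 10.0, "MMP8": 1.0, "P4HA1": 5.0}
score = tmi_score(patient)
```

Raising an error on a missing gene is a design choice of this sketch; it makes explicit that a dataset lacking one of the six genes cannot be scored with the full signature.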

2.4. Patient Stratification and Survival Analysis

A median cut-off was used to stratify patients into low- and high-risk groups. Kaplan−Meier (KM) survival curves were generated to test the prognostic value of the SCCHN TMI. Overall survival (OS) time was computed from the date of surgery until death. The survival analysis was conducted using the “survival” package. The log-rank p-value was indicated in each KM curve and considered statistically significant if smaller than 0.05.
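As a rough illustration of these two steps, the sketch below splits patients at the cohort median and computes a Kaplan−Meier estimate in plain Python. It is not the “survival” package implementation; the function names are hypothetical, the toy data are invented, and assigning scores equal to the median to the low-risk group is an assumption, since the text does not state how ties were handled.

```python
from statistics import median


def stratify_by_median(scores):
    """Label each score 'high' if above the cohort median, else 'low'.

    Scores equal to the median are labeled 'low' (an assumption).
    """
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    """
    curve, surv = [], 1.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


# Toy data: four patients, the second one censored at time 2.
groups = stratify_by_median([0.1, 0.5, 0.9, 1.3])
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

In practice one curve would be computed per risk group and the two curves compared with a log-rank test, which this sketch omits.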

2.5. Validation

The prognostic value of the SCCHN TMI was internally and externally validated using the penalized Cox model and represented via time-dependent AUCs using the R packages “hdnom” (v6.0.0) and “rms” (v6.2-0). The internal validation was performed using an elastic-net model with bootstrap validation (penalty trade-off parameter α = 0.05, regularization parameter λ = 1.763886). The external validation was conducted on the SCCHN GEO dataset (accession number GSE65858).

This dataset was generated by Wichmann et al. [27] and contains gene expression data from 270 patients diagnosed with SCCHN collected using the platform GPL10558 Illumina HumanHT-12 V4.0 expression Beadchip. The dataset was accessed using Phantasus (v.1.9.2, https://artyomovlab.wustl.edu/phantasus/, accessed on 11 March 2021) where “Log2” and “quantile normalization” adjustments were applied to the data. Moreover, to remove lowly-expressed probes and ensure only one row per gene in the gene expression matrix, data were collapsed using “Maximum Median Probe” with “gene symbol” as the collapse field. The dataset was then downloaded as a spreadsheet file for further analysis.
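The probe-collapsing step can be sketched as follows, assuming that “Maximum Median Probe” keeps, for each gene symbol, the single probe with the highest median expression across samples. This is a simplified stand-in for the Phantasus operation, not its actual code; the probe identifiers and values are invented.

```python
from statistics import median


def collapse_max_median(probe_rows):
    """Keep one probe per gene: the one with the highest median expression.

    probe_rows : iterable of (probe_id, gene_symbol, [values across samples])
    Returns {gene_symbol: (probe_id, values)}.
    """
    best = {}
    for probe_id, gene, values in probe_rows:
        m = median(values)
        if gene not in best or m > best[gene][0]:
            best[gene] = (m, probe_id, values)
    return {gene: (pid, vals) for gene, (_, pid, vals) in best.items()}


# Invented probes: two map to SPP1, one to MMP8.
rows = [
    ("probe_a", "SPP1", [5.0, 6.0, 7.0]),
    ("probe_b", "SPP1", [8.0, 9.0, 1.0]),
    ("probe_c", "MMP8", [2.0, 2.5, 3.0]),
]
collapsed = collapse_max_median(rows)
```

Here probe_b is retained for SPP1 because its median (8.0) exceeds that of probe_a (6.0), leaving exactly one row per gene.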

2.6. Role of the Proteins Encoded by the SCCHN TMI Genes

The role of the proteins encoded by the SCCHN TMI genes was assessed using the Human Protein Atlas web tool (www.proteinatlas.org, accessed on 15 March 2021).

2.7. Network Analysis

The web-based tool NetworkAnalyst (https://www.networkanalyst.ca, accessed on 12 March 2021) was used to perform the network analysis of the six SCCHN TMI genes. Specifically, the “Gene Regulatory Network, Signaling Network” option was selected. Data from the SIGnaling Network Open Resource (downloaded on 12 March 2021) were analyzed.

2.8. Tumor-Infiltrating Immune Cell Analysis

The web-based tool CIBERSORTx (https://cibersortx.stanford.edu/, accessed on 6 May 2021) was used to estimate the relative fraction of 22 immune cell types based on the RNA-seq data, as a function of the SCCHN TMI risk group. We selected LM22 (22 immune cell types) for the signature gene file, 100 for permutations, and disabled quantile normalization for all runs. Subsequently, box plots were generated to present the differences in infiltrated immune cells between high and low risk groups using the “ggplot2” package (v3.3.3). Two-sided, unpaired two-samples Wilcoxon test was performed between the two risk groups.
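The group comparison can be illustrated with a small pure-Python sketch of the two-sided, unpaired Wilcoxon rank-sum (Mann−Whitney U) test. For brevity it uses a normal approximation without tie or continuity correction, which is an assumption of this sketch; it will therefore not reproduce the exact p-values that R reports for small samples. The cell-fraction values are invented.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via a plain normal approximation.

    Returns (U statistic for group x, approximate p-value).
    """
    # Pool both groups, remembering group membership (0 = x, 1 = y).
    pooled = sorted((v, g) for g, grp in enumerate((x, y)) for v in grp)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        midrank = (i + 1 + j) / 2.0
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p


# Invented CD8 T cell fractions for the two risk groups.
high_risk = [0.01, 0.02, 0.03]
low_risk = [0.04, 0.05, 0.06]
u_stat, p_value = rank_sum_test(high_risk, low_risk)
```

With complete separation of the two groups, the U statistic of the lower group is zero, which is the most extreme value the test can take for these sample sizes.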

2.9. Machine Learning for Personalized Prediction of the Risk Group

The computational classification was performed in Orange (Version 3.29.3) using the method recently developed by Belotti et al. [28]. The SCCHN TMI risk groups were classified based on the gene expression levels of the SCCHN TMI genes. Specifically, only 5 genes were included in the analysis, as one of the genes (SFRP5) is missing in the GSE65858 dataset. The algorithm workflow is shown in Figure S1. First, expression values of the SCCHN TMI genes for each patient were imported into Orange, together with the SCCHN TMI risk group previously calculated using the median cut-off. The data were then sent to the “Test and Score” widget, where multiple models were tested: (a) k-nearest neighbors (kNN, number of neighbors = 5, metric = Euclidean, weight = uniform). (b) Logistic regression (regularization type = lasso, strength C = 1). (c) Random forest (number of trees = 10, depth of the individual tree is limited to 5, subsets smaller than 5 are not split). (d) Support vector machine (SVM, cost = 1, regression loss = 0.1, kernel = linear, numerical tolerance = 0.001, iteration limit = 100). (e) Neural network (neurons in hidden layers = 200, activation = ReLU, solver = Adam with regularization = 0.0001, maximum number of iterations = 200). The validation of the models was carried out using 10-fold cross-validation. Moreover, the classification was also validated using an independent GEO dataset (GSE65858).
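To make one of these classifiers concrete, the sketch below implements the simplest of them, k-nearest neighbors, with the settings listed above (five neighbors, Euclidean metric, uniform weights). It is a plain-Python illustration rather than the Orange widget itself, and the two-dimensional toy data stand in for the five-gene expression vectors.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=5):
    """Predict a label by majority vote among the k nearest training points.

    Uses Euclidean distance and uniform (unweighted) votes.
    """
    by_distance = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Invented 2D training points forming two well-separated clusters.
train_x = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
           (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
train_y = ["low"] * 5 + ["high"] * 5

pred_a = knn_predict(train_x, train_y, (0.2, 0.2))
pred_b = knn_predict(train_x, train_y, (10.2, 10.1))
```

In a 10-fold cross-validation, this prediction step would be repeated with each tenth of the patients held out in turn and the remaining nine tenths used as training data.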

3. Results

3.1. Identification of SCCHN-Specific Tumor Matrisome Index

A series of recently developed web-based bioinformatics tools and open-source software were utilized to access and analyze the HNSC-TCGA dataset. The analysis workflow is summarized in Figure 1. First, the matrisomal genes (see Methods) were input into XENA [24] and the normalized RNA-seq data were retrieved, together with the clinical annotation of the samples. The subgroup of patients with no HPV infection and no regional lymph node involvement (HPV−, N0) that underwent surgical treatment was selected to reduce heterogeneity. A Cox proportional hazard model was independently applied for each matrisomal gene in the dataset against overall survival (OS). The genes with log-rank p < 0.01 were selected and an SCCHN-specific tumor matrisome index (SCCHN TMI) was defined. The outcome of applying the Cox proportional hazard model to the matrisomal genes is shown in Table S1. Six genes were identified: MASP1, EGFL6, SFRP5, SPP1, MMP8 and P4HA1. Based on the results derived from the Cox regression, a nomogram to predict survival probability at 2, 3 and 5 years after surgery for clinical use was developed (Figure S2A). Calibration curves for this nomogram are plotted in Figure S2B,C.

Figure 1.

Schematic illustration of the bioinformatics workflow. We first constructed a matrisome-derived SCCHN-specific signature from primary tissue samples using a bioinformatic approach. Specifically, large datasets from The Cancer Genome Atlas (TCGA) database were accessed and retrieved via the XENA browser. The SCCHN-tumor matrisome index (SCCHN TMI) was computed using R: (1) prognostic genes were identified through the application of the Cox proportional hazard model; (2) the median cut-off was used to stratify patients into low- and high-risk groups and generate Kaplan−Meier (KM) plots for each risk group; (3) internal and external validation of the SCCHN TMI were conducted using microarray and scRNA-seq datasets. The web-based tool CIBERSORTx was used to estimate the relative fraction of cancer-infiltrating immune cells based on the RNA-seq data, as a function of the SCCHN TMI risk group.

Two of the genes composing the SCCHN TMI, MASP1 and EGFL6, are associated with good prognosis (HR < 1), while the other four (SFRP5, SPP1, MMP8 and P4HA1) are associated with poor outcomes (HR > 1), as shown in Table S1. The details of these genes are shown in Table S2. Using a median cut-off, patients were stratified into low- and high-risk groups and Kaplan−Meier (KM) plots were generated, as shown in Figure 2A,B. Specifically, the group characterized by low SCCHN TMI had significantly better survival outcomes.

Figure 2.

Identification of the SCCHN TMI and evaluation of its prognostic value. (A) Kaplan−Meier plots for the HPV−, N0 subgroup (n = 168). (B) Kaplan−Meier plots for the HPV−, N0 subgroup of the validation set (n = 64). (C) Internal validation: the performance of the elastic-net model is internally assessed by time-dependent AUC (area under the ROC curve) with “bootstrap” resampling at yearly intervals from 6 months to 4.5 years. The solid line represents the mean of the AUC at each time point across all bootstrap predictions, and the dashed line represents the median of the AUC. The shaded interval shows the minimum and maximum of the AUC.

Univariable and multivariable Cox regression analyses were performed to adjust for confounding factors such as age, gender, history of cigarette smoking and alcohol consumption, and pathological tumor (T) stage (according to AJCC version 8), as shown in Figure S3. The comparison between low and high SCCHN TMI for the HPV−, N0 subgroup among conventional clinical parameters is illustrated in Figure S4. Univariable survival analyses revealed that the SCCHN TMI could predict overall survival (OS), disease-specific survival (DSS), and disease-free interval (DFI), as shown in Figure S5. The individual patients' SCCHN TMI scores exhibited statistically significant differences across different clinicopathological factors such as sample type, regional lymph node involvement and pathological stage, as shown in Figure 3. Specifically, the samples with regional lymph node invasion (N+) include N1 (n = 78), N2 (n = 13), N2a (n = 11), N2b (n = 121), N2c (n = 54), N3 (n = 9), and NX (n = 89).

Figure 3.

Distribution of the SCCHN TMI scores for each HPV− sample as a function of clinicopathological factors. (A) Sample type; (B) absence (N0) or presence (N+) of regional lymph node invasion; (C) pathological stage. The whiskers indicate the range from Q1 − 1.5 × IQR to Q3 + 1.5 × IQR and the line is the median. Wilcoxon test **** p ≤ 0.0001, *** p ≤ 0.001, ** p ≤ 0.01, * p ≤ 0.05, ns: not significant.

3.2. Validation of the SCCHN TMI

The prognostic value of the SCCHN TMI was internally and externally validated using machine-learning-based algorithms. Specifically, the internal validation was performed using the penalized Cox model; the time-dependent AUC (area under the receiver operating characteristic (ROC) curve) is shown in Figure 2C. The Gene Expression Omnibus (GEO) dataset GSE65858 was used as a validation set. Figure 2B shows the KM plot for the validation set (n = 64). To identify specific cell types expressing the six genes composing the SCCHN TMI, we next analyzed the scRNA-seq SCCHN dataset (GSE103322), derived from 18 SCCHN patients, including matched pairs of primary tumors (PT) and lymph node (LN) metastases (Figure 4). MASP1 and EGFL6 were expressed predominantly in subpopulations of fibroblasts and cancer cells, while SPP1 and P4HA1 were expressed by subgroups of several cell types composing the tumor and its matrisome, including immune cells, such as macrophages and dendritic cells, and fibroblasts. SFRP5 and MMP8 were expressed only in a small number of cells.

Figure 4.

Expression of the SCCHN TMI genes at the single-cell level. UMAP (Uniform Manifold Approximation and Projection) of 5560 cells colored by (A) tissue of origin (primary tumors (PT) and lymph node (LN) metastases), and (B,C) cell type. (D) Relative expression of the six SCCHN TMI genes across different cell types.

3.3. Regulatory Signaling Network Analyses

The outcome of the regulatory network analysis is shown in Figure S6A. The KEGG (Kyoto Encyclopedia of Genes and Genomes) enrichment analysis was performed for the genes identified in the signaling network (Figure S6B). Three genes (RUNX2, ERG and MMP3) are involved in “transcriptional misregulation in cancer”. ERG and MMP3 are also associated with prostate cancer. MMP3 is involved in the IL-17 signaling pathway, TNF signaling pathway, and rheumatoid arthritis. ETS2 and MMP7 are involved in HTLV-1 infection.

3.4. Role of the Proteins Encoded by the SCCHN TMI Genes

The gene MASP1 encodes a serine protease that is a component of the lectin pathway, which plays an essential role in the innate and adaptive immune response [29]. This gene has been found to be a favorable prognostic marker in liver cancer [29]. EGFL6 encodes a member of the epidermal growth factor (EGF) repeat superfamily, involved in cell cycle regulation, proliferation, and developmental processes [30]. It is a favorable marker in both ovarian and head and neck cancer [30]. SFRP5 has a role as a modulator of Wnt signaling, which is involved in regulating cell growth and differentiation in specific cell types [31]. SPP1 encodes a cytokine that upregulates expression of interferon-gamma and interleukin-12, and it is an unfavorable prognostic marker in liver, pancreatic, and cervical cancer [32]. SPP1 is involved in ECM-receptor interaction, the Toll-like receptor signaling pathway, and the Apelin signaling pathway [32]. MMP8 encodes a member of the matrix metalloproteinases (MMPs), a family of proteolytic enzymes involved in degrading components of the extracellular matrix and promoting invasion and metastasis in various cancers. In SCCHN, it has been shown that imbalances between matrix metalloproteinases and their inhibitors contribute to the progression and are linked to the prognosis of the malignancy [33,34]. P4HA1 encodes a key enzyme involved in collagen synthesis that catalyzes the formation of 4-hydroxyproline, essential for the three-dimensional folding of procollagen chains [35]. It is an unfavorable prognostic marker in renal, head and neck, cervical, pancreatic, lung and breast cancers [35].

3.5. Association between Tumor-Infiltrating Immune Cells and the Risk Group

Using CIBERSORTx, the relative abundances of 22 immune cell subsets were estimated for the HPV−, N0 SCCHN cohort. Figure 5 shows the differences in the abundance of the infiltrating immune cells between the high-risk and low-risk groups. Only the immune cells with measurable abundance are shown. A significantly higher abundance of CD8 T cells, follicular helper T cells and activated dendritic cells was found in the low-risk group, whereas a higher abundance of M0 and M2 macrophages was found in the high-risk group.

Figure 5.

Association between SCCHN TMI risk group and infiltrative immune cells. The box plots show the percentages of infiltrative immune cells calculated by CIBERSORTx in the high- and low-risk groups of the “TCGA-HNSC” HPV−, N0 subgroup. The size of the boxes indicates the interquartile range (IQR), which spans from the first quartile (Q1) to the third quartile (Q3). The whiskers indicate the range from Q1 − 1.5 × IQR to Q3 + 1.5 × IQR and the line is the median. A two-sided, unpaired two-sample Wilcoxon test was performed between the two groups. **** p ≤ 0.0001; ** p ≤ 0.01; * p ≤ 0.05, ns: not significant.

3.6. Machine Learning Approach for Risk Group Classification

We sought to demonstrate the predictive potential of our SCCHN TMI in stratifying patients across different platforms. Given that the TCGA and GEO data are derived from RNA-seq and microarray, respectively, we used a cross-platform normalization tool to enable comparison between the two datasets of different profiling platforms. Specifically, we used TDM [36] transformation to make RNA-seq data compatible with microarray data, as recently shown [37]. Figure 6A shows the comparison between TDM and logarithmic transformation. The TDM transformation best fitted the reference microarray data (validation dataset, GSE65858) distribution. Using a supervised machine learning approach [28], we next developed SCCHN TMI-based risk group classifiers. First, we trained multiple classifiers (as shown in Figure S1) on the TDM-transformed SCCHN TCGA dataset using 10-fold cross-validation (Figure 6B–D). The best predictive model, support vector machine (SVM), was then evaluated on the validation dataset (Figure 6E–G). Both cross-validation and external validation on the GSE65858 dataset resulted in an area under the (receiver operating characteristic) curve (AUC) of 0.984 and 0.985 and classification accuracy of 95.2% and 93.8%, respectively. A clear separation between the two risk groups indicates a superior classification performance, as shown in the t-distributed stochastic neighbor embedding (t-SNE) plot (Figure 6H).
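The two performance metrics reported here can be computed as in the following sketch. It treats the AUC as the probability that a randomly chosen high-risk patient receives a higher classifier score than a randomly chosen low-risk patient, which is a standard equivalent definition; the scores and labels are invented and do not correspond to the study data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (e.g., high risk), 0 otherwise.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(predicted, actual):
    """Fraction of patients whose predicted group matches the true group."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Invented classifier outputs for four patients.
auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
acc = accuracy(["low", "low", "high", "high"], ["low", "high", "high", "high"])
```

The AUC depends only on the ranking of the scores, whereas the accuracy depends on the decision threshold used to turn scores into risk groups, which is why both are usually reported together.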

Figure 6.

Computational classification of the SCCHN TMI risk group at the individual patient level. (A) Comparison between TDM and LOG-transformed training data (TCGA) in fitting the validation dataset (GSE65858) distribution. (B) Results of the classification model evaluation using 10-fold cross-validation. (C) Confusion matrix of the support vector machine (SVM) model, which best scored in the classification. (D) Receiver operating characteristic (ROC) curves for each of the SCCHN TMI risk groups using SVM. (E) Results of the classification model evaluation using the validation dataset. (F) Confusion matrix of the support vector machine (SVM) model. (G) Receiver operating characteristic (ROC) curves for each of the SCCHN TMI risk groups using SVM. AUC = area under the curve, CA = classification accuracy. (H) The t-SNE plot of two risk groups.

4. Discussion

The discovery of reliable prognostic biomarkers capable of identifying patients with a higher risk of unfavorable survival outcomes is needed in order to better define patients who might require further adjuvant treatment after surgical resection. The SCCHN TMI gene panel, which was constructed by focusing only on ECM molecules, holds potential clinical as well as biological significance. In an HPV negative subset, where the overall prognosis is poor, the SCCHN TMI was able to predict overall survival (OS), disease-specific survival (DSS), and disease-free interval (DFI). A high SCCHN TMI score was an unfavorable prognostic factor for all the analyzed endpoints. Gene ontology (GO), KEGG pathway enrichment and signaling network analyses revealed that the six SCCHN TMI genes are mostly associated with signaling networks involved in cancer-related transcriptional dysregulation and two important pathways: IL-17 and TNF. The former has already been shown to negatively correlate with the overall survival of head and neck cancer patients [38]. TNF-α, which is found in the TME, is secreted by macrophages, lymphocytes and natural killer (NK) cells and mediates the production of proinflammatory factors that elicit tumor growth and recently emerged as a promising cancer therapy target [39,40].

We found that patients in the high-risk group exhibited a proinflammatory phenotype enriched with macrophages (M0 and M2 phenotypes). Tumor-associated macrophages (TAMs) are a key component of the SCCHN tumor microenvironment as they have specific roles in regulating the immune response to cancer (refer to Evrard et al. [41] for a detailed review). Moreover, TAMs have been shown to affect cell proliferation, vascularization, stromal formation and dissolution [42]. Our results are consistent with the previous literature that highlighted the proinflammatory and tumor-promoting role of TAMs in SCCHN [43,44,45,46]. Two recent meta-analyses [43,44] found that increased densities of TAMs in the TME, particularly M2-like, correlate with poor clinicopathologic markers in SCCHN. A recent study by Tekin et al. [47] showed that M0 macrophages harbor anti-tumorigenic activities in pancreatic cancer, which seem to be mediated by TNF-α, a factor associated with M0 macrophage-induced cell death. In contrast, an increasing abundance of infiltrated M0 macrophages was associated with poorer outcomes in breast cancer [48]. To our knowledge, our results are the first to point to a potential role of M0 macrophages in SCCHN.

Patients in the SCCHN TMI low-risk group exhibited an increased abundance of CD8+ and follicular helper T cells as well as activated dendritic cells. These results are consistent with prior reports which showed significantly better survival outcomes [49,50,51] in SCCHN patients with a higher abundance of infiltrative lymphocytes. Moreover, a recent work by Cillo et al. [52] found that T follicular helper cells are associated with longer progression-free survival in SCCHN patients and that the activation of dendritic cells could improve antitumor T cell responses. Hence, the SCCHN TMI might have important implications for prognosis and further adjuvant treatment decisions, as low-risk scores are associated with high levels of infiltration of antitumor T cells and low levels of infiltration of tumor-promoting TAMs. These provide potential points of therapeutic intervention that need to be validated prospectively in clinical trials of specific inhibitors.

The single-cell RNA-seq analysis of the SCCHN TMI genes revealed that four of these genes (MASP1, EGFL6, SPP1, and P4HA1) are expressed in subpopulations of fibroblasts, macrophages, T cells and tumor cells. It is noteworthy that two genes, SPP1 and P4HA1, exhibited the highest association with immune cells. Specifically, they are highly expressed in both T cells and macrophages; hence, they could play an important role in linking the SCCHN risk score with the different patterns of infiltrative immune cells in the SCCHN TME. Two genes (SFRP5 and MMP8) were lowly expressed in this dataset. In the study by Puram et al. [25], the authors analyzed 5902 single cells from 18 patients with tumors of the oral cavity, which is one of the subsites of SCCHN. Therefore, the low expression of SFRP5 and MMP8 in this dataset might be, to some extent, attributed to the small cohort of patients included in the analysis and the presence of only one SCCHN subsite.

Using machine learning, we demonstrated high computational classification accuracy between the risk groups in the data collected using different platforms (RNA-seq and microarrays), despite the small sample size of the validation dataset. This has important clinical implications as it supports the robustness of our SCCHN TMI in stratifying HPV−, N0 patients. Finally, statistically different expression levels were found for each SCCHN TMI gene between the two risk groups (Figure S7). As the SCCHN TMI comprises a small number of genes, their expression levels could be quantified using RT-PCR directly on postoperative specimens to conduct prospective validation studies.

One of the limitations of this study is the presence of only one validation dataset. This is due to the incomplete clinicopathological information in all the other available public datasets that we evaluated. This is a current issue in the field of SCCHN. Therefore, the assessment of the SCCHN TMI signature in multiple larger validation dataset cohorts is warranted in the future. This will enable the identification of a global cut-off value for patient stratification, and further improve the clinical utility of the SCCHN TMI. Another limitation is the use of publicly available algorithms that could change over time. This might limit the reproducibility of this study, but novel algorithms with improved quality, accuracy, usability and speed are likely to emerge in the future.

5. Conclusions

In conclusion, the identified SCCHN TMI gene signature represents a genomic tool that could potentially enable a better understanding of the molecular mechanisms associated with the interaction between the tumor and its microenvironment. Lastly, the SCCHN TMI could enhance patient stratification and selection and aid personalized intervention.

Supplementary Materials

The following are available online at https://www.mdpi.com/article/10.3390/cancers13225761/s1, Figure S1: Workflow for the machine learning prediction of the risk group, Figure S2: Nomogram and computational calibration of the model, Figure S3: Univariate and multivariate analysis, Figure S4: Comparison between low and high SCCHN TMI for the HPV−, N0 sub-group among conventional clinical parameters (TCGA dataset), Figure S5: Association between SCCHN TMI and clinical outcomes, Figure S6: Regulatory network analyses of the SCCHN TMI genes, Figure S7: Expression levels as a function of the SCCHN TMI risk group for each gene, Table S1: Outcome of the Cox Proportional Hazard Model to the 1068 matrisomal genes, Table S2: The 6-gene SCCHN TMI signature.

Author Contributions

Conceptualization, Y.B. and S.B.L.; data curation, Y.B. and S.B.L.; formal analysis, Y.B. and S.B.L.; investigation, Y.B., S.B.L., N.G.I., W.-T.L.; project administration, Y.B.; supervision, N.G.I., W.-T.L., C.T.L.; writing (original draft), Y.B.; writing (review and editing), S.B.L., N.G.I., W.-T.L., C.T.L. All authors have read and agreed to the published version of the manuscript.

Funding

Y.B. is supported by the Institute for Health Innovation & Technology (iHealthTech), National University of Singapore (R-722-007-004-731). S.B.L. is supported by the National Research Foundation of Korea (NRF/MSIT 2021R1F1A1064122) and Ajou University School of Medicine (new faculty research fund). N.G.I. is supported by National Medical Research Council (Singapore) Clinician Scientist Awards (NMRC/CSA/001/2016, MOH-000325-00) and the NCC Cancer Fund. W.-T.L. is supported by the National Medical Research Council (NMRC/CSA-INV/0025/2017, MOH-CIRGMay-0006) and the NCC Cancer Fund. C.T.L. is supported by iHealthTech, National University of Singapore (R-722-007-004-731) and a Mechanobiology Institute (MBI) Seed Grant, National University of Singapore (R-714-106-002-135).

Institutional Review Board Statement

Ethical review and approval were waived for this study because only publicly available data and materials were used.

Informed Consent Statement

Patient consent was waived due to the retrospective nature of this study and use of de-identified data.

Data Availability Statement

The data analyzed in this study were obtained from TCGA using the XENA browser, accessing the “TCGA Head and Neck Cancer (HNSC) study”. The validation dataset was obtained from GEO (Gene Expression Omnibus) under the accession code GSE65858 and accessed using Phantasus. The scRNA-seq dataset analyzed in this study is available from GEO under the accession code GSE103322.

Conflicts of Interest

No potential conflicts of interest were disclosed by the authors.

Footnotes

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.



Articles from Cancers are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)
