Abstract
In this study, we explore the possibility of inferring characteristics of the tumor immune microenvironment from the blood. Specifically, we investigate two datasets of patients with head and neck squamous cell carcinoma with matched single-cell RNA sequencing (scRNA-seq) from peripheral blood mononuclear cells (PBMCs) and tumor tissues. Our analysis shows that the immune cell fractions and gene expression profiles of various immune cells within the tumor microenvironment can be inferred from the matched PBMC scRNA-seq data. We find that the established exhausted T-cell signature can be predicted from the blood and serve as a valuable prognostic blood biomarker of immunotherapy response. Additionally, our study reveals that the inferred ratio between tumor memory B- and regulatory T-cell fractions is predictive of immunotherapy response and is superior to the well-established cytolytic and exhausted T-cell signatures. These results highlight the promising potential of PBMC scRNA-seq in cancer immunotherapy and warrant, and will hopefully facilitate, further investigations on a larger scale. The code for predicting tumor immune microenvironment from PBMC scRNA-seq, TIMEP, is provided, offering other researchers the opportunity to investigate its prospective applications in various other indications.
Significance:
Our work offers a new and promising paradigm in liquid biopsies to unlock the power of blood single-cell transcriptomics in cancer immunotherapy.
Introduction
The immune status in the tumor microenvironment (TME) plays a critical role in determining the progression of cancer and the efficiency of immunotherapy. Two important determinants of the immune status in the TME are the composition and gene expression of different immune cell types. In terms of cell abundance, it was found that FOXP3+ CD4+ regulatory T cells (Tregs) and CD8+ T cells are positively associated with patient prognosis in a cohort of patients with mixed human papillomavirus–positive/negative (HPV+/HPV−) head and neck squamous cell carcinoma (HNSCC; ref. 1). More recently, a positive association was reported between PD1+ TCF1+ CD8 in human HPV+ oropharyngeal squamous cell carcinoma and response to immune checkpoint blockade (ICB; ref. 2). Furthermore, a strong infiltration of CD8+ T cells is associated with favorable patient prognosis (3, 4), whereas a high infiltration of Treg correlates with poor prognosis in many solid tumors (5, 6). Similarly, a high CD8+/Tregs ratio is associated with favorable prognosis in multiple tumor types (7, 8). Additionally, tumor-infiltrating neutrophils, lymphocytes, and neutrophils-to-lymphocytes ratio have been found to be important markers of patient survival (9). Beyond cell abundance, the gene expression in the TME has been shown to be predictive of cancer ICB response, for example, by identifying gene expression signatures of T-cell dysfunction and exclusion (10) or by pairwise transcriptomic relations between immune checkpoint genes (ICGs; ref. 11).
The rapidly evolving single-cell sequencing technology now offers an unprecedented opportunity to analyze cell composition and gene expression of tumor cells as well as immune cells. For example, this technology can be used to study genomic and transcriptomic heterogeneity of tumors (12), construct a tumor immune cell atlas (13), chart cell–cell communication in the TME (14, 15), understand genomic and transcriptomic dynamics during cancer progression (16–18), and track tumor evolution (19). In the clinic, ideally, one would like to perform multiple tumor biopsies and employ single-cell transcriptomics to study the dynamics of the TME. However, for many tumors, such invasive procedures are prohibitive. In this study, we aim to determine the extent to which we can learn about the state of the TME from single-cell transcriptomics of immune cells in the blood.
Liquid biopsies have greatly advanced the field of clinical oncology and have become a focus of ongoing research for precision diagnosis and personalized cancer treatment. Current efforts have been focused on studying cell-free DNA (cfDNA), circulating tumor DNA (ctDNA), circulating tumor cells (CTCs), and methylation signatures [see (20) for a review]. Here, our focus is different and complementary, namely, to study single-cell transcriptomics of immune cells in the blood to learn about the immune status of the TME. Previous related studies have aimed to predict normal tissue gene expression from the bulk expression of white blood cells (WBCs). For example, analyzing matched whole blood/lung gene expression data from the Genotype-Tissue Expression (GTEx) project, a generalized linear regression model was shown to predict the expression levels of ∼18% of the genes in the lung from the blood (21). Using a Bayesian ridge regression–based method, the expression levels of 20% to 60% genes in 16 tissues could be predicted from the blood (22). Recent research has also shown that whole blood transcriptomes can predict tissue-specific expression levels for ∼60% of the genes on average across 32 tissues, and the inferred tissue-specific expression from the blood transcriptome can predict the disease state in some disorders (23).
Encouraged by our previous success with predicting tissue-specific expression from an individual’s blood bulk transcriptome (23), in this study we first aim to learn to what extent can one use single-cell blood transcriptome to learn about the cellular composition and gene expression of various immune cell types within an individual’s TME. Second, based on that, we aim to learn if we can use the inferred TME immune state to predict ICB response, and furthermore, if that leads to more accurate predictions than predictors that are built directly on the blood information without inferring the TME immune state.
To this end, we first analyzed a dataset of tumor blood–matched single-cell RNA sequencing (scRNA-seq) data for HNSCC from 26 patients. We hypothesized that immune cells in the peripheral blood mononuclear cells (PBMCs) bear information on the immune state in the TME, due to ongoing circulation of both immune cells and immune signaling molecules like cytokines between the blood and the TME. Indeed, many previous studies have shown that the number and function of both innate and adaptive immune cells in the blood are altered in patients with cancer (24–26). Additionally, it has been found that the dynamics of circulating immune cell phenotypes reflect interactions between tumor and immune cells during immunotherapy (27). Furthermore, numerous studies have also reported that changes in blood immune cell composition, particularly in neutrophils-to-lymphocytes ratio, are associated with poor survival outcomes and response to immunotherapy in many types of solid tumors (28). Indeed, we find that the immune cell fractions (ICFs) of major immune cell types in the TME can be predicted from PBMC scRNA-seq as they are correlated with ICFs of certain cell types in the latter. ICF is a proxy for cell abundance and represents the fraction of cells of a given immune cell type of the total number of all viable CD45+ immune cells in the same sample which includes almost all immunologic and hematologic cells except for mature erythrocytes and platelets (29, 30). Additionally, we find that in different immune cell types in the TME the expression levels of 17% to 47% expressed genes can be predicted from the patient’s blood.
We then analyzed a second dataset of patients with HNSCC including both patient pretreatment blood scRNA-seq data and ICB response information. We set to study if we can predict ICB response based on patients’ PBMC scRNA-seq data. We find that the well-established exhausted T-cell signature in the TME can be inferred from the blood and the inferred exhausted T-cell score can further predict immunotherapy response in HNSCC. In addition, we identify a new immune signature named ICFR∗, which reflects the blood-inferred TME ICF ratio of (Bmemory − Treg)/(Bmemory + Treg). The inferred ICFR∗ scores in the TME are the most predictive of patients’ response to ICB treatment both in terms of tumor RECIST criteria and pathologic response in this dataset and importantly, are further validated in an independent large-scale bulk expression HNSCC dataset. Here, RECIST response is a measure of best overall response on serial radiologic analysis (31), whereas pathologic response is defined as “whether patients have 90% or less viable tumor after the ICB treatment” (32).
Materials and Methods
scRNA-seq data analysis
Seurat (v4.0.1, R package; RRID: SCR_016341; ref. 33) was used for data processing and cell clustering. Cells with insufficient number of genes (< 200) or greater than 20% of unique molecular identifiers (UMIs) mapped to mitochondrial genes were removed. Data integration across different samples was accomplished with Harmony (34). The “NormalizeData” function with parameters: normalization.method = “LogNormalize” and scale.factor = 10,000 was applied to normalize the expression levels of genes in each single cell. The “FindVariableFeatures” function with the “vst” method was utilized to identify 2,000 highly variable genes. The “ScaleData” function with default parameters was used to scale and center gene expression matrices. To perform clustering, principal component analysis (PCA) dimensionality reduction was first conducted with “RunPCA” function and the first 20 principal components were selected to construct the shared nearest neighbor (SNN) graph with “FindNeighbors” function and then clusters were determined using the Louvain algorithm with “FindClusters” function. CellTypist (35) with the default model (Immune_All_Low.pkl) and majority voting refinement was used to do automatic cell type annotation with the known cell type labels (in total 19 cell types) to predict the identities of cell clusters. Then, we manually checked whether the annotations are reliable by examining the top ranked differentially expressed genes of each cluster which were obtained with “FindAllMarkers” function with default parameters but with set min.pct = 0.25. The uniform manifold approximation and projection (UMAP) was finally applied to visualize the single-cell transcriptional profile in the 2D space.
Differential cell abundance and gene expression analysis
Differential cell abundance analysis was conducted with edgeR (RRID: SCR_012802; ref. 36) to identify which cell types were differentially enriched in tumor tissue or in the blood. “estimateDisp()” function was used to estimate the negative binomial dispersion for each cluster. “glmQLFit()” function was used to estimate the quasi-likelihood dispersion. “glmQLFTest()” function was used to test for differences in abundance between tumor tissue and blood with FDR < 0.05.
In this study, the gene expression of a specific immune cell type was defined as “pseudo-bulk” expression profiles generated by summing counts together for all individual cells of the same cell type in a sample. “Pseudo-bulk” samples with insufficient cells (< 10) were removed. As a result, 10 cell types [cycling T cells, cytotoxic T cells, dendritic cells (DCs), helper T cells, memory B cells, monocytes, naïve B cells, NK cells, plasma cells, and Tregs] were retained for differential gene expression analysis. “pseudoBulkDGE()” function from scran (v1.22.1, R package; RRID: SCR_016944) was used to perform the differential gene expression analysis to find significantly up and downregulated genes with FDR < 0.05 and log2 fold change (FC) = 0. This approach generated for each cell type a list of genes over- and underexpressed in the tumor compared with the peripheral blood. Gene Ontology (GO) enrichment analyses of the resulting gene lists were performed with R package clusterProfiler (v 4.2.2; RRID: SCR_016884).
Machine learning for the prediction of the ICFs in the TME
To predict the ICFs in the TME from the blood, for each immune cell type in the TME, we first identified the most predictive cell type in the blood using the highest Pearson correlation coefficient (rmax) of ICFs with the target cell types in the TME. If the correlation between a cell type in the TME with the identical cell type in the blood was higher than rmax × 0.9, then the identical cell type was used as the most predictive cell type. To reduce the impact of potential sampling bias, the correlation was calculated using 1,000-replicate bootstrapping.
Then, for each cell type in the TME, we constructed a one-variable linear regression baseline model by using the ICF of the corresponding most predictive cell type in the blood (as the only variable; model 1). Additionally, we constructed five two-variable linear regression models by using the ICF of the corresponding most predictive cell type in the blood (as the first variable) and one of the five clinical variables (as the second variable; models 2–6).
in which i is the ith immune cell type in the TME; ci is the corresponding most predictive cell type in the blood.
To reduce the risk of overfitting, elastic-net regularization was used during model training. Finally, for each cell type, among all six models we selected the model with the best performance as the final model. To perform cross-validation, the data were randomly split (80%: 20%) to do the training and testing for 1,000 replicates and the results were computed and analyzed over all 1,000 replicates.
The ICF of a specific immune cell type in the TME is defined predictable when the mean Pearson correlation coefficients between the predicted ICFs and the true ICFs on the 20% test sets are greater than 0.3 (r > 0.3), mean normalized mean absolute error less than 1 (NMAE < 1) with a Benjamini–Hochberg adjusted P value < 0.05 (FDR < 0.05) over 1,000 replicates. More details are in the following section of “Statistical test to identify predictable ICFs and genes.”
For model evaluation on the validation dataset and model application in predicting patient ICB response, the mean values of the parameters (i.e., coefficients and intercepts of the linear models) of the final models trained over 1,000 replicates were used.
Machine learning for the prediction of the gene expression levels in immune cells in the TME
Similar to the prediction of ICFs, we used one- and two-variable linear regression models to predict expression levels of expressed genes in different immune cell types in the TME. In this case, “expressed genes” are defined as genes that have non-zero expression values in more than half of the samples. Specifically, for each immune cell type in the TME, we first identified the most predictive cell type in the blood based on the Pearson correlation coefficient of the overall gene expression levels across samples. Then, we used six linear regression models to predict the expression level of a specific gene in a specific immune cell type in the TME.
in which i is the ith immune cell type in tumor tissue; gj is the j th expressed gene; and ci is the corresponding most predictive cell type in PBMCs identified by correlation of overall gene expression levels with i.
Elastic-net regularization was used during model training. Finally, for each gene, among all the six models we selected the model with the best performance as the final model. The results were computed, evaluated, and analyzed over 1,000 replicates.
Statistical test to identify predictable ICFs and genes
Pearson correlation coefficients between the predicted and true values of the ICF/gene expression in the TME on the test data over 1,000 replicates were calculated for each cell type/gene and were tested against the null hypothesis of:
H01: The mean value of these Pearson correlation coefficients is ≤ 0.3.
Similarly, the normalized mean absolute errors between the predicted and true values over all 1,000 replicates were tested against the null hypothesis of:
H02: The mean value of these normalized mean absolute errors is ≥ 1.
We applied FDR < 0.05 to both hypotheses (H01 and H02) to select the predictable ICFs and genes.
With regard to identifying the clinical variables that have the highest and significant contribution to increasing the predictive power of each model, we compared results from models 2 to 6 with the result from model 1 using the Mood median test.
Functional enrichment analysis
R package clusterProfiler (v 4.2.2; RRID: SCR_016884) was used to perform functional enrichment analysis of predictable gene lists including enrichment analysis of GO (RRID: SCR_002811) and enrichment analysis of Kyoto Encyclopedia of Genes and Genomes (RRID: SCR_012773), as well as enrichment analysis of hallmarks gene sets and Reactome subset of canonical pathways downloaded from MSigDB (http://www.gsea-msigdb.org/; RRID: SCR_016863).
Identification of blood-predictive signatures for the prediction of ICB response
To identify the potential immune signatures that can be learned from the blood and are predictive of the ICB response of patients with HNSCC, we comprehensively examined three categories of candidate immune information in the blood and in the TME, respectively, i.e., ICGs, ICFs, and the ratios between immune cell fractions (ICFRs; see Supplementary Material SM2 for comprehensive lists of them). All the three categories of candidate immune information in the TME were predicted from the blood scRNA-seq data using the machine learning models described above. In terms of ICB response, we conducted a thorough evaluation using multiple metrics wherever relevant data were available. These metrics included the RECIST response, which measures the best overall response on serial radiologic analysis (31); the pathologic response, defined as “whether patients have 90% or less viable tumor after the ICB treatment” (32); overall survival (OS); and progression-free survival (PFS). Note that among the total 30 patients in the Luoma and colleagues cohort, 17 and 29 of them have the RECIST response and pathologic response data available, respectively. In the GSE159067 cohort, the pathologic response was not available.
With regard to ICGs, in total, 79 ICGs were collected from literature review (11, 37–40). We first checked the expression levels of them in different immune cell types in the blood. There are 219 expressed pairs in the blood, among which 31 pairs in the TME can be predicted from the blood (with criteria of r > 0.3 and FDR < 0.05 on the training dataset; r > 0.3 on the validation dataset). With regard to ICFs, nine are present in the blood (ICF > 0) and six in the TME can be predicted from the blood (with criteria of r > 0.3 and FDR < 0.05 on the training dataset; r > 0.3 on the validation dataset; Fig. 2C). With regard to ICFRs, there are in total 3,476 different combinations in the form of (ICF_C1 ± ICF_C2)/(ICF_C3 + ICF_C4) in the blood, in which C1, C2, C3, and C4 are immune cell types, and 55 in the TME are significantly predictable (with criteria of r > 0.3; P < 0.05; ρ > 0.3; P < 0.05 in both the training dataset and the validation dataset). As C1, C2, C3, and C4 are not necessarily different from each other, the form of (ICF_C1 ± ICF_C2)/(ICF_C3 + ICF_C4) can actually include all the following forms: ICF_C1/ICF_C2, ICF_C1/(ICF_C2 + ICF_C3), (ICF_C1 ± ICF_C2)/ICF_C3, and (ICF_C1 ± ICF_C2)/(ICF_C3 + ICF_C4).
We then ranked these candidate predictors in each category based on their correlation with ICB response using data from (32). Here, the strength of the correlation was measured by the area under the receiver operating characteristic curve (AUC). The Mann–Whitney U test was used to calculate the P values against the null hypothesis of “the AUC is equal to 0.5” (41). AUCs and corresponding P values for predicting both the ICB RECIST response (n = 16) and the ICB pathologic response (n = 26) were calculated. Predictors with P < 0.05 for both RECIST response and pathologic response were considered as potential signatures.
Survival analyses
All Kaplan–Meier analyses were performed by comparing the survival of patients with high signature scores (> median) with those with low scores (≤ median) using a two-sided log-rank test.
Statistical analyses
For all the multiple comparisons performed, we employed the Benjamini–Hochberg correction method to adjust the P values.
Data availability
In this study, the primary sources of the data are obtained from NCBI Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/; RRID: SCR_005012). All data used were sourced from publicly accessible repositories that adhered to ethical standards, including the requirement for written informed consent from patients at the time of initial publication. Additional information about the datasets used in this study can be found in Supplementary Table S1.
All codes that are necessary to reproduce all the results in the article are implemented in Python and R and are publicly available in GitHub: https://github.com/yingstat/TIMEP.
Results
The data analyzed and analysis overview
We obtained the single-cell transcriptome data from ref. (42), in which all viable CD45+ immune cells were measured for matched PBMC samples and primary tumor tissue samples from 18 patients with HPV− and 8 patients with HPV+ immunotherapy treatment–naïve HNSCC. Immune cell identities were annotated using CellTypist, an automated cell classification tool designed in ref. (35). The resulting cell identities were then merged into 12 major immune cell types such that each cell type in tumor tissues had an average ICF >1% (“Materials and Methods”). ICF is defined as the fraction of cells of a given immune cell type among all CD45+ cells in the same sample. Specifically, the 12 immune cell types identified in this study are cycling T cells, cytotoxic T cells, DCs, germinal center B cells, helper T cells, macrophages, memory B cells, monocytes, naïve B cells, NK cells, plasma cells, and Tregs (Fig. 1A). Throughout this study, we quantify cell abundances in term of ICFs.
We first compared the similarity and difference of the immune microenvironment between the TME and the blood of patients with HNSCC. We observed that the ICFs of all immune cell types, except for naïve B cells, are significantly different between tissue samples and matched PBMC samples. Specifically, the ICFs of cycling T cells, cytotoxic T cells, DCs, germinal center B cells, macrophages, memory B cells, plasma cells, and Tregs are higher in the TME, whereas the ICFs of helper T cells, monocytes, and NK cells are lower in the TME (Fig. 1A and B).
When comparing the same immune cell types between the blood and the TME, we found that the expression levels of 100 to 1,000 of genes are significantly altered. Across different immune cell types, there are up to 1,772 genes that are significantly upregulated in PBMC samples and up to 2,097 genes that are significantly upregulated in the tumor (log2 FC > 0; FDR < 0.05; Fig. 1C). Specifically, genes that are upregulated in the TME are enriched in specific immune response functions (Supplementary Fig. S1). Our functional analysis revealed that genes upregulated in PBMCs were more enriched in basic housekeeping biological functions such as “rRNA processing” and “cytoplasmic translation,” whereas genes upregulated in TME were more enriched in antitumor immune functions such as “T-cell activation” and “lymphocyte-mediated immunity” (Supplementary Fig. S2).
Next, we evaluated the extent to which we could predict the immune status in the TME from the blood using a machine learning–based framework (Fig. 1D). Briefly, we first predicted the ICFs of 11 of 12 major immune cell types in the TME using the cell types’ ICFs in the matching blood sample and additional clinical information, such as HPV infection status, alcohol and tobacco use, age, and sex. Germinal center B cells were excluded from this analysis as they were found only present in HPV+ tumor tissue samples in this study. Second, we predicted the expression level of each gene in each immune cell type in the TME based on its expression level in the corresponding cell type in the matching blood sample along with the patient’s clinical information. Finally, using the immune status information learned from the blood, we identified immune signatures that are predictive of the ICB response of patients with HNSCC.
ICFs of all cell types in the TME can be predicted from the blood
We first explored the correlation between ICFs in the TME and those in the blood. To reduce the impact of potential sampling bias, correlations between ICFs were calculated with bootstrapping by randomly resampling the 25 patient samples for 1,000 replications (note that sample 5 was excluded from the 26 patient samples as an outlier; Supplementary Fig. S3). For different cell types in the TME, the highest correlations with cell types in the blood were very different, with the Pearson correlations ranging from 0.14 to 0.69 (Fig. 2A). More intriguingly, for 6 of 11 cell types in the TME, the most correlated cell types in the blood are different from themselves (Fig. 2A). Among these six cell types, two are not present in the blood and hence matching cells could not have been found (cycling T cells and macrophages); two are moderately positively correlated with themselves between the TME and the blood (plasma cells and Tregs); however, the ICFs of two cell types in the blood and in the tumor samples are not correlated at all (memory B cells and NK cells), suggesting that they may play different functional roles in the TME versus in the blood or that their recycling rates between the tumor and the blood are low.
We then aimed to improve the prediction of ICFs in the tumor from the blood ICFs beyond the basic correlations denoted above by building machine learning predictors. Due to the small size of the training data, we focused on building very compact prediction models to minimize the risk of overfitting as much as possible. Specifically, for each cell type in the TME, a one-variable linear regression model was built by using the ICF of its most correlated cell type in the blood (as the only variable), and five two-variable linear regression models were built using the ICF of its most predictive cell type in the blood (as the first variable) and one of the five clinical variables (as the second variable). The most predictive model among those six models was chosen as the final model for predicting that cell type’s ICF. To further overcome overfitting, we randomly split the data into training and test sets (80%: 20%) for 1,000 replicates and performed model training and testing on the 1,000 folds for each cell type (“Materials and Methods”). As shown in Fig. 2B, the ICFs of all 11 immune cell types in the TME were predictable, with mean Pearson correlation coefficients between the predicted and the true ICFs greater than 0.3 (r > 0.3), mean normalized mean absolute error less than 1 (NMAE < 1) and a Benjamini–Hochberg adjusted P value < 0.05 (FDR < 0.05; Fig. 2B). Notably, clinical information played a statistically significant role in increasing the predictability of the ICFs for all immune cell types, compared with the one-variable baseline models (Fig. 2B).
As these models were generated and tested via a standard cross-validation procedure on the training set, we further tested them (without any further changes) on a smaller, independent, recently published external dataset [n = 5; (32)]. Despite the very limited number of matched samples, quite strikingly, six of eight cell types in which ICFs were predictable in the original datasets were also predictable in this test set (with r > 0.3). These included the memory B cells, helper T cells, Tregs, naïve B cells, macrophages, and plasma cells. Notably, memory B cells, helper T cells, and Tregs were well predicted with both r > 0.6 and ρ (Spearman correlation coefficient) > 0.6 (Fig. 2C).
The expression levels of 17% to 47% of the genes that are expressed in specific immune cell types in the TME can be predicted from the blood
Similarly, we first examined the correlation between gene expression levels of immune cell types in the TME and those in the blood. As expected, we found that the gene expression levels of all cell types in the TME were most highly correlated with those of the same cell types in the blood, except for cycling T cells and macrophages, which were not found in the blood (Supplementary Fig. S4).
To predict gene expression levels of each immune cell type in the TME, we used a similar model construction strategy. For each gene in each cell type, a one-variable linear regression model was built by using the expression pattern of the gene itself in the corresponding cell type in the blood (as the only variable), and five two-variable linear regression models were constructed using the expression level of the gene itself in the corresponding cell type in the blood (as the first variable) and one of the five clinical variables (as the second variable). The model that had the strongest mean predictive power on the test sets over 1,000 replicates was chosen as the final model (“Materials and Methods”).
As observed in the prediction of for ICF, clinical information plays a significant role in the prediction of gene expression as well (Fig. 3A). Notably, various clinical variables seem to have varying levels of significance across different cell types. HPV infection status is the most crucial factor in predicting the expression of DCs and all T cells, tobacco use is the most crucial factor for innate immune cells (i.e., monocytes, macrophages, and NK cells), and alcohol use is the most crucial factor for plasma cells and all B cells (Fig. 3B).
Overall, the expression levels of 17% to 47% of the genes expressed in immune cells in the TME can be predicted from the blood (mean r > 0.3 and mean NMAE < 1 with FDR < 0.05 on the test sets over 1,000 replicates; Fig. 3B). Notably, DCs and helper T cells, cytotoxic T cells, and Tregs have the highest number of predictable genes (> 4,700), whereas plasma cells and naïve B cells have the least (< 2,000 and < 700 genes, respectively). The detailed lists of predictable genes in different immune cell types can be found in Supplementary Material SM1. Furthermore, our findings indicate that 22% to 57% of the significantly differentially expressed genes between the TME and the blood (Fig. 1C) are predictable across different immune cell types (Supplementary Fig. S5).
Functional GO enrichment analyses of the predictable genes in different immune cell types in the TME reveals a number of common immune response-related GO terms shared across different immune cell types, such as “adaptive immune response,” “leukocyte cell–cell adhesion,” “positive regulation of T-cell activation,” “response to virus,” “T cell activation,” “antigen processing and presentation,” and “Th17 cell differentiation” (Supplementary Table S2). Notably, genes that are most accurately predicted by the HPV infection status information are enriched in virus infection–related terms. For example, in helper T cells, these genes are enriched in “DNA repair,” “HIV infection,” and “HIV life cycle”; and in Tregs, these genes are enriched in “influenza infection” (Supplementary Fig. S6).
Exhausted T-cell signature in the TME can be inferred from the blood and the inferred exhausted T-cell score can further predict immunotherapy response in HNSCC
Given the predictability of ICFs and gene expression levels of different immune cell types in the TME, we were interested to test if we could predict the scores of some well-established immune signatures in the TME from the PBMC scRNA-seq data. Specifically, we checked the predictability of three pan-cancer T cell immune signatures for immunotherapy including the exhausted T-cell signature (average expression of PDCD1, CTLA4, LAG3, HAVCR2, and TIGIT in cytotoxic T cells; ref. 43), the cytolytic T-cell signature (average expression of GZMA and PRF1 in cytotoxic T cells; ref. 44), and CXCL13 expression in cytotoxic T cells (45). This investigation was performed by analyzing a dataset of patients with HNSCC (32), including both patient pretreatment blood scRNA-seq data and ICB response information.
First, we found that the exhausted T-cell signature in the TME could be predicted from blood cytotoxic T-cell gene expression levels in both the training dataset of (r = 0.56; ρ = 0.61; ref. 42) and the validation dataset of (r = 0.46; ρ = 0.40; Fig. 4A; ref. 32). Moreover, the inferred exhausted T-cell signature in the TME was significantly negatively associated with patients’ ICB RECIST response (AUC = 0.88, P = 0.01; Fig. 4B), as one would expect. However, the exhausted T-cell signature was not significantly associated with patients’ survival after ICB treatment (Fig. 4C). We also performed Cox proportional hazards regression using the continuously measured exhausted T-cell signature and found that although there were strong trends of HRs > 1, the results were still not statistically significant (OS HR = 11.39, P = 0.18; PFS HR = 4.49, P = 0.21). A similar trend was found using multivariable Cox proportional hazards regressions by adjusting for age, sex, and HPV infection (OS HR = 3.21, P = 0.57; PFS HR = 4.8, P = 0.25). This could be attributed to the relatively small sample size of this cohort, and/or the short follow-up time of 36 months, which might not have been sufficient to observe differences in survival among patients, as indicated by the original authors (32). It is worth noting that the exhausted T-cell signature score in the blood on its own is not correlated with that in the TME (Fig. 4D) and is not associated with ICB response or survival (Fig. 4E and F), highlighting the importance of predicting ICB response through the two-step approach. Finally, the cytolytic T-cell and CXCL13 signatures in the TME were less predictable from the blood than the exhaustion signature (Supplementary Fig. S7A and S7D). As a result, the predicted scores for these signatures had low correlations with ICB response and survival (Supplementary Fig. S7B, S7C, S7E, and S7F).
Prediction of HNSCC immunotherapy response via a newly identified TME immune signature, ICFR∗
We next examined whether we could identify new blood-based immune signatures that can help predict the ICB response in patients with HNSCC. To this end, we tested two different strategies. Strategy 1 was a direct approach, in which we predicted ICB response based directly on the immune information in the blood. Strategy 2 was an indirect, two-step approach, in which we first identified predictable immune information in the TME from the blood (Figs. 1 and 3) and then used this information to predict ICB response.
To achieve our goal, we comprehensively examined three categories of immune information in both the blood (for “strategy 1” candidates) and the corresponding predictable immune information in the TME (for “strategy 2” candidates). Specifically, we examined (i) the expression of ICGs, (ii) the levels of OSICFs, and (iii) ICFRs, as the predictive features for ICB response (Fig. 5A). More specifically, in the first category, we studied the expression of 79 ICGs in specific immune cell types, as reported in the literature (“Materials and Methods”). As a given ICG might be expressed in a few cell types, this resulted in a space of 219 gene–cell type pairs, of which the expression state of 31 pairs in the TME could be predicted from the blood. In the second category, we focused on nine ICFs in the blood in which their levels were >0 in the direct approach, and six ICFs in the TME in which their levels could be predicted from the blood for the indirect, two-step approach. In the third category, we comprehensively studied ICFRs in the form of (ICF_C1 ± ICF_C2)/(ICF_C3 + ICF_C4), in which C1, C2, C3, and C4 are immune cell types. This space was composed of 3,476 different ICFRs in the blood, of which 55 ICFRs in the TME could be predicted from the blood. See “Materials and Methods” for more details and see Supplementary Material SM2 for a comprehensive list of items in each category.
We next evaluated the correlation of these three categories of candidates and patients’ ICB response in the HNSCC dataset of ref. (32). Here, the strength of the correlation was measured by AUC. The statistical significance of the correlation was calculated against the null hypothesis of “the AUC is equal to 0.5.” In category (1), we identified a few ICGs, and their expression levels in the blood or in the TME were significantly associated with both ICB RECIST response (n = 16) and ICB pathological response (n = 26; “Materials and Methods”). The potential ICG signatures in the blood included LGALS9 expression in memory B cells, ICOSLG expression in naïve B cells, CD80 expression in monocytes, CD47 expression in Tregs, and CD96 expression in memory B cells (Supplementary Fig. S8). The potential ICG signatures in the TME, in which their expression levels were predicted from the blood, included LAG3 expression in NK cells, and ICOSLG expression in naïve B cells (Supplementary Fig. S9). In category (2), none of the ICFs, either from the blood or from the TME, was significantly associated with ICB response (Supplementary Material SM2).
We hence focused on category (3), in which there were three ICFR signatures in the TME that were highly predictive (Supplementary Material SM2). The top predictive ICFR signature in the TME was (Bmemory − Treg)/(Bmemory + Treg), denoted as ICFR∗. First, ICFR∗ values in the TME could be well predicted from the blood, with strong correlations between the predicted values and the measured values in both the training dataset of (r = 0.75; ρ = 0.77; ref. 42) and the independent validation dataset of (r = 0.96; ρ = 0.90; Fig. 5B; ref. 32). Second, inferred ICFR∗ values in the TME, which were predicted from the blood, were significantly associated with patients’ ICB response (positively correlated with ICB response; RECIST AUC = 0.79, P = 0.05; pathological AUC = 0.73, P = 0.03; Fig. 5C). Notably, as was the case with the exhausted T-cell signature, the ICFR∗ scores estimated directly from the blood were not significantly associated with ICB response (RECIST AUC = 0.54, P = 0.43; pathological AUC = 0.62, P = 0.17; Supplementary Fig. S10), in contrast to those inferred in the TME from the blood. Remarkably, none of the 3,476 ICFRs in the blood was significantly associated with ICB response (Supplementary Material SM2), which further underscored the importance of predicting ICB response via the two-step approach. Finally, although signatures from category (1) were not significantly associated with patients’ survival after ICB treatment (Supplementary Figs. S8 and S9), which might be due to the short follow-up time issue of this dataset as mentioned above, the ICFR∗ scores in the TME were predictive of patients’ survival after ICB treatment (OS, P = 0.1; PFS, P = 0.01; Fig. 5D).
To further test the robustness of the ICFR∗ signature in predicting ICB response, we analyzed an additional large cohort. This cohort was a recently released dataset (46) consisting of bulk tumor RNA-seq of 102 patients with advanced HNSCC treated with ICB. Although it did not include matched single-cell data, we were still able to use it to test the predictive power of the ICFR∗ signature, as described below. To calculate the ICFR∗ values in the TME in each of these tumor samples, we inferred the relative abundance of each immune cell type by deconvolving the bulk tumor RNA-seq data via CIBERSORT (47). Remarkably, the ICFR∗ scores were also predictive of ICB response in this independent dataset (AUC = 0.72, P = 0.004; Fig. 5E), which was much higher than the predictive power obtained using the FDA-approved TMB biomarker in HNSCC [AUC = 0.56, P = 0.74; n = 69; Supplementary Fig. S11; data from the MSK-IMPACT cohort (48)]. Additionally, ICFR∗ values significantly predicted patients’ survival after ICB treatment (OS, P = 0.05; PFS, P = 0.03; Fig. 5F). It was also worth noting that ICFR∗ did not predict survival of patients with HNSCC in The Cancer Genome Atlas (TCGA), in which patients were not treated with ICB (Supplementary Fig. S12), demonstrating its specificity for predicting ICB response.
For completeness, we also evaluated the predictive power of the two other ICFR signatures that ranked high on our list. One was (Bmemory − Treg)/(Treg + Plasma), which was positively correlated with ICB response (RECIST AUC = 0.79, P = 0.05; pathological AUC = 0.73, P = 0.03; Supplementary Fig. S13). The other was Bmemory/(Treg + Plasma), which was also positively correlated with ICB response (RECIST AUC = 0.79, P = 0.05; pathological AUC = 0.73, P = 0.03; Supplementary Fig. S13). Notably, these two ICFR signatures were also able to significantly predict the ICB response in the bulk dataset (AUC = 0.72, P = 0.004 and AUC = 0.63, P = 0.007, respectively) but were less predictive of patients’ survival after ICB treatment (Supplementary Fig. S13).
Discussion
Cancer is a systemic disease that affects the entire body, altering the composition and function of the immune system as a whole (49). Charting the immune status of the TME in individuals is crucial for understanding cancer progression, drug response, and resistance, which can help promote precision medicine (50). Due to the challenges and risks associated with tumor tissue biopsies, characterizing the TME from liquid biopsies has become one of the most important directions in cancer prognosis, stratification, and monitoring (51). Leveraging the advantage of single-cell sequencing techniques, here, we have established a machine learning framework utilizing the global expression profiles of major immune cells in the peripheral blood, derived from the CD45+ scRNA-seq data, to infer the immune status of the TME in patients with HNSCC (Fig. 1D). To the best of our knowledge, the immune status of the TME has not yet been comprehensively predicted from the blood previously.
Our results testify that ICFs of major immune cell types in the TME of HNSCC primary tumors can be inferred from the blood. The most predictable three immune cell types are memory B cells, Tregs, and helper T cells on the independent validation data (Fig. 2C). Coincidently, all these cell types have been shown to play important roles in regulating cancer prognosis and drug resistance in patients with HNSCC (52, 53). Additionally, we show that the expression levels of approximately 20% to 50% of the genes expressed in different immune cell types in the TME is strongly correlated between the blood and the TME in (mostly) the same cell types and can be predicted from the blood. Furthermore, our findings suggest that the clinical information significantly contributes to the predictability of both ICFs and gene expression levels of immune cells in the TME. Interestingly, different clinical variables are found to contribute to the prediction of the expression profiles of different immune cells. For example, among all five clinical variables, HPV infection status is found to predict the expression of the largest number of genes in DCs and all T cells, whereas tobacco use information is the largest contributor for innate immune cells (i.e., monocytes, macrophages, and NK cells), and alcohol use is the largest contributor for B cells and plasma cells (Fig. 3B). This suggests that different clinical variables may have different impacts on the gene expression levels of different cell types.
Turning to study the ability to predict patients’ response to ICB from the inferred ICFs and gene expression in the TME, we show that one of the most well-known pan-cancer immune signatures for immunotherapy, the exhausted T-cell signature, can be predicted from the blood single-cell transcriptomics in HNSCC and the inferred exhausted T-cell signature from the blood can predict patients’ ICB response (Fig. 4A and B). Moreover, we identify a new immune signature, ICFR∗, and its inferred TME scores from the blood are more strongly associated with patients response. Specifically, the ICFR∗ predicts both ICB response and survival on both single-cell and bulk datasets (Fig. 5B–F). Additionally, it is worth noting that the predictive power of ICFR∗ holds across different patient populations: the single-cell cohort contains only patients with newly diagnosed oral cavity squamous cell carcinoma, who have received the ICB as a neoadjuvant therapy before the surgery, whereas in the bulk cohort, the ICB treatment has been used as a main therapy for patients with advanced HNSCC and more than half of the patients have received at least one line therapy before the ICB treatment. In addition, ICFR∗ is not predictive of the survival of patients with HNSCC that are not treated with ICB therapy, in the TCGA cohort. Taken together, these results demonstrate that the prediction ability of ICFR∗ is fairly robust and also specific to immunotherapy treatment. It is also worth noting that in contrast to the predictive power of inferred ICFR∗ scores in the TME, none of the totaling ˜3,500 candidate ICFR combinations in the blood are directly predictive of ICB response, which highlights the importance of constructing a two-step predictor to predict ICB treatment outcome, i.e., first learning about the immune status of the TME from the blood data, and then using the latter to further predict the treatment outcome. Notably, we find that B cells play a crucial role in the three top ICFR signatures identified in this research, which is in line with recent studies that demonstrate the importance of B cells on ICB treatment outcome (54–56).
The present study has some limitations that should be acknowledged. First, the sample size used to train the models is relatively small, as the study relies on yet costly scRNA-seq data. This limitation makes it challenging to perform stratified cross-validation or to train and validate the models in separate cohorts for important factors like HPV infection status. Currently, there is no larger publicly available dataset that contains matched tumor-PBMC scRNA-seq data. Although by constructing very compact predictors, the risk of overfitting has been reduced and the resulting models have shown reasonable performance on an independent validation dataset, larger cohorts are still needed to further test and validate the results. Specifically, as the predictive power of inferred TME ICFR∗ signature has only been tested in one single-cell dataset, more PBMC cohorts are needed to further test and validate this potentially clinically relevant biomarker. Second, the focus of this study is on the relative fractions of immune cells, as the fractions of different immune cells of all cells (including malignant cells and other hematopoietic cells) are not available in current datasets. Such future studies may potentially further enhance ICB response prediction. Third, we observe an interesting correlation pattern between ICFs in the TME and blood (Fig. 2A), which suggests that there may be possible interactions between immune cells in the TME and blood that may have been previously overlooked. To predict ICFs for a specific immune cell type in the TME, we used the most highly correlated cell type in the blood to build our predictive models as a starting point. This can be conceived as a prior feature selection step that may introduce a potential bias when evaluating the model in cross-validation. However, given the limited sample size, the current strategy is the best option one could adopt. If we would have instead tested the ICFs of all cell types and all the clinical variables, the number of candidate models would have greatly exceeded the training sample size, making that analysis implausible. It is worth mentioning that it was reported that CD4+ and CD8+ T-cell levels in paired tumor and blood samples were inversely correlated in patients with melanoma (57). However, this was not the case in HNSCC (cytotoxic T cells, r = 0.37; helper T cells, r = 0.3; Tregs, r = 0.17; Fig. 2A). This discrepancy indicates that there may be different cell transit patterns between tumor and blood in different cancer indications. Fourth, the transcriptomic sequencing data used in this study remains costly and are not commonly included in routine clinical practices. However, its expense can be seen as justifiable when considering the average annual cancer treatment cost in the United States, which is approximately $150,000. The additional cost of sequencing patient tissue samples may be a worthwhile investment in enhancing treatment precision and outcomes (58, 59). Finally, in the future, when more data become available, it will be important to revisit and investigate the robustness and underlying mechanisms of the relationship between ICFs in the TME and blood in HNSCC and more other cancer indications. With future increased sample size, one could then also use more robust cross-validation models to test and validate the prediction results in more depth.
In summary, transcriptome-based biomarkers are showing increasing power in predicting cancer prognosis and immunotherapy response. Here, we have shown the feasibility of constructing predictive models of immune cell fractions and gene expression levels in the TME based on matching blood single-cell transcriptomics. We further show that the latter TME blood-derived information is in turn predictive of patients’ ICB response. These results offer a new and promising way to extend the realm of liquid biopsies beyond ctDNAs and CTCs to that of PBMC transcriptomics. Taken together, our work represents a promising initial step toward utilizing single-cell transcriptomics based liquid biopsies to accurately predict the immune status of patients’ tumors from their blood, facilitating personalized cancer precision therapy in the future.
Supplementary Material
Acknowledgments
This work is supported in part by the Intramural Research Program of the NIH, NCI, Center for Cancer Research. This work utilized the computational resources of the NIH HPC Biowulf cluster (http://hpc.nih.gov).
Footnotes
Note: Supplementary data for this article are available at Cancer Research Communications Online (https://aacrjournals.org/cancerrescommun/).
Authors’ Disclosures
E. Ruppin is a cofounder of MedAware, Metabomed and Pangea Biomed (divested) and an unpaid member of Pangea Biomed's scientific advisory board.
Authors’ Contributions
Y. Cao: Conceptualization, resources, data curation, software, formal analysis, validation, visualization, writing–original draft, writing–review and editing. T. Chang: Conceptualization, resources, data curation, software, formal analysis, validation, visualization, writing–original draft, writing–review and editing. F. Schischlik: Conceptualization, data curation, writing–original draft, writing–review and editing. K. Wang: Validation. S. Sinha: Validation. S. Hannenhalli: Conceptualization, validation. P. Jiang: Conceptualization, validation. E. Ruppin: Conceptualization, supervision, funding acquisition, writing–original draft, project administration, writing–review and editing.
References
- 1. Badoual C, Hans S, Merillon N, Van Ryswick C, Ravel P, Benhamouda N, et al. PD-1-expressing tumor-infiltrating T cells are a favorable prognostic biomarker in HPV-associated head and neck cancer. Cancer Res 2013;73:128–38. [DOI] [PubMed] [Google Scholar]
- 2. Eberhardt CS, Kissick HT, Patel MR, Cardenas MA, Prokhnevska N, Obeng RC, et al. Functional HPV-specific PD-1+ stem-like CD8 T cells in head and neck cancer. Nature 2021;597:279–84. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3. Li F., Li C, Cai X, Xie Z, Zhou L, Cheng B, et al. The association between CD8+ tumor-infiltrating lymphocytes and the clinical outcome of cancer immunotherapy: a systematic review and meta-analysis. EClinicalMedicine 2021;41:101134. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4. Mahmoud SMA, Paish EC, Powe DG, Macmillan RD, Grainge MJ, Lee AHS, et al. Tumor-infiltrating CD8+ lymphocytes predict clinical outcome in breast cancer. J Clin Oncol 2011;29:1949–55. [DOI] [PubMed] [Google Scholar]
- 5. Saleh R, Elkord E. FoxP3+ T regulatory cells in cancer: prognostic biomarkers and therapeutic targets. Cancer Lett 2020;490:174–85. [DOI] [PubMed] [Google Scholar]
- 6. Togashi Y, Shitara K, Nishikawa H. Regulatory T cells in cancer immunosuppression—implications for anticancer therapy. Nat Rev Clin Oncol 2019;16:356–71. [DOI] [PubMed] [Google Scholar]
- 7. Sato E, Olson SH, Ahn J, Bundy B, Nishikawa H, Qian F, et al. Intraepithelial CD8+ tumor-infiltrating lymphocytes and a high CD8+/regulatory T cell ratio are associated with favorable prognosis in ovarian cancer. Proc Natl Acad Sci U S A 2005;102:18538–43. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8. Suzuki H, Chikazawa N, Tasaka T, Wada J, Yamasaki A, Kitaura Y, et al. Intratumoral CD8+ T/FOXP3+ cell ratio is a predictive marker for survival in patients with colorectal cancer. Cancer Immunol Immunother 2010;59:653–61. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9. Liu K, Zhao K, Wang L, Sun E. The prognostic values of tumor-infiltrating neutrophils, lymphocytes and neutrophil/lymphocyte rates in bladder urothelial cancer. Pathol Res Pract 2018;214:1074–80. [DOI] [PubMed] [Google Scholar]
- 10. Jiang P, Gu S, Pan D, Fu J, Sahu A, Hu X, et al. Signatures of T cell dysfunction and exclusion predict cancer immunotherapy response. Nat Med 2018;24:1550–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11. Auslander N, Zhang G, Lee JS, Frederick DT, Miao B, Moll T, et al. Publisher correction: robust prediction of response to immune checkpoint blockade therapy in metastatic melanoma. Nat Med 2018;24:1942. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12. Patel AP, Tirosh I, Trombetta JJ, Shalek AK, Gillespie SM, Wakimoto H, et al. Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma. Science 2014;344:1396–401. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13. Nieto P, Elosua-Bayes M, Trincado JL, Marchese D, Massoni-Badosa R, Salvany M, et al. A single-cell tumor immune atlas for precision oncology. Genome Res 2021;31:1913–26. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14. Ho DW-H, Tsui Y-M, Chan L-K, Sze KM-F, Zhang X, Cheu JW-S, et al. Single-cell RNA sequencing shows the immunosuppressive landscape and tumor heterogeneity of HBV-associated hepatocellular carcinoma. Nat Commun 2021;12:3684. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15. Kumar MP, Du J, Lagoudas G, Jiao Y, Sawyer A, Drummond DC, et al. Analysis of single-cell RNA-seq identifies cell-cell communication associated with tumor characteristics. Cell Rep 2018;25:1458–68.e4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16. Casasent AK, Schalck A, Gao R, Sei E, Long A, Pangburn W, et al. Multiclonal invasion in breast tumors identified by topographic single cell sequencing. Cell 2018;172:205–17.e12. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17. Chen S, Zhu G, Yang Y, Wang F, Xiao Y-T, Zhang N, et al. Single-cell analysis reveals transcriptomic remodellings in distinct cell types that contribute to human prostate cancer progression. Nat Cell Biol 2021;23:87–98. [DOI] [PubMed] [Google Scholar]
- 18. Lawson DA, Bhakta NR, Kessenbrock K, Prummel KD, Yu Y, Takai K, et al. Single-cell analysis reveals a stem-cell program in human metastatic breast cancer cells. Nature 2015;526:131–5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19. Navin N, Kendall J, Troge J, Andrews P, Rodgers L, McIndoo J, et al. Tumour evolution inferred by single-cell sequencing. Nature 2011;472:90–4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20. Lone SN, Nisar S, Masoodi T, Singh M, Rizwan A, Hashem S, et al. Liquid biopsy: a step closer to transform diagnosis, prognosis and future of cancer treatments. Mol Cancer 2022;21:79. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21. Halloran JW, Zhu D, Qian DC, Byun J, Gorlova OY, Amos CI, et al. Prediction of the gene expression in normal lung tissue by the gene expression in blood. BMC Med Genomics 2015;8:77. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22. Xu W, Liu X, Leng F, Li W. Blood-based multi-tissue gene expression inference with Bayesian ridge regression. Bioinformatics 2020;36:3788–94. [DOI] [PubMed] [Google Scholar]
- 23. Basu M, Wang K, Ruppin E, Hannenhalli S. Predicting tissue-specific gene expression from whole blood transcriptome. Sci Adv 2021;7:eabd6991. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24. Accomando WP, Wiencke JK, Houseman EA, Butler RA, Zheng S, Nelson HH, et al. Decreased NK cells in patients with head and neck cancer determined in archival DNA. Clin Cancer Res 2012;18:6147–54. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25. Chiossone L, Dumas P-Y, Vienne M, Vivier E. Author correction: natural killer cells and other innate lymphoid cells in cancer. Nat Rev Immunol 2018;18:726. [DOI] [PubMed] [Google Scholar]
- 26. Wang X, Chen X, Zhang X, Su C, Yang M, He W, et al. A small-molecule inhibitor of PCSK9 transcription ameliorates atherosclerosis through the modulation of FoxO1/3 and HNF1α. EBioMedicine 2020;52:102650. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27. Griffiths JI, Wallet P, Pflieger LT, Stenehjem D, Liu X, Cosgrove PA, et al. Circulating immune cell phenotype dynamics reflect the strength of tumor-immune cell interactions in patients during immunotherapy. Proc Natl Acad Sci U S A 2020;117:16072–82. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28. Templeton AJ, McNamara MG, Šeruga B, Vera-Badillo FE, Aneja P, Ocaña A, et al. Prognostic role of neutrophil-to-lymphocyte ratio in solid tumors: a systematic review and meta-analysis. J Natl Cancer Inst 2014;106:dju124. [DOI] [PubMed] [Google Scholar]
- 29. Hermiston ML, Xu Z, Weiss A. CD45: a critical regulator of signaling thresholds in immune cells. Annu Rev Immunol 2003;21:107–37. [DOI] [PubMed] [Google Scholar]
- 30. Barford D, Flint AJ, Tonks NK. Crystal-structure of human protein-tyrosine-phosphatase 1B. Science 1994;263:1397–404. [PubMed] [Google Scholar]
- 31. Schwartz LH, Litière S, de Vries E, Ford R, Gwyther S, Mandrekar S, et al. RECIST 1.1-update and clarification: from the RECIST committee. Eur J Cancer 2016;62:132–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32. Luoma AM, Suo S, Wang Y, Gunasti L, Porter CBM, Nabilsi N, et al. Tissue-resident memory and circulating T cells are early responders to pre-surgical cancer immunotherapy. Cell 2022;185:2918–35.e29. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33. Hao Y, Hao S, Andersen-Nissen E, Mauck WM 3rd, Zheng S, Butler A, et al. Integrated analysis of multimodal single-cell data. Cell 2021;184:3573–87.e29. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34. Korsunsky I, Millard N, Fan J, Slowikowski K, Zhang F, Wei K, et al. Fast, sensitive and accurate integration of single-cell data with harmony. Nat Methods 2019;16:1289–96. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35. Domínguez Conde C, Xu C, Jarvis LB, Rainbow DB, Wells SB, Gomes T, et al. Cross-tissue immune cell analysis reveals tissue-specific features in humans. Science 2022;376:eabl5197. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 2010;26:139–40. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37. Hu F-F, Liu C-J, Liu L-L, Zhang Q, Guo A-Y. Expression profile of immune checkpoint genes and their roles in predicting immunotherapy response. Brief Bioinform 2021;22:bbaa176. [DOI] [PubMed] [Google Scholar]
- 38. Pardoll DM. The blockade of immune checkpoints in cancer immunotherapy. Nat Rev Cancer 2012;12:252–64. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39. Campbell KS, Purdy AK. Structure/function of human killer cell immunoglobulin-like receptors: lessons from polymorphisms, evolution, crystal structures and mutations. Immunology 2011;132:315–25. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40. Afrache H, Gouret P, Ainouche S, Pontarotti P, Olive D. The butyrophilin (BTN) gene family: from milk fat to the regulation of the immune response. Immunogenetics 2012;64:781–94. [DOI] [PubMed] [Google Scholar]
- 41. Mason SJ, Graham NE. Areas beneath the relative operating characteristics (ROC) and relative operating levels (ROL) curves: statistical significance and interpretation. Q J Roy Meteorol Soc 2002;128:2145–66. [Google Scholar]
- 42. Cillo AR, Kürten CHL, Tabib T, Qi Z, Onkar S, Wang T, et al. Immune landscape of viral- and carcinogen-driven head and neck cancer. Immunity 2020;52:183–99.e9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43. Wherry EJ, Ha S-J, Kaech SM, Haining WN, Sarkar S, Kalia V, et al. Molecular signature of CD8+ T cell exhaustion during chronic viral infection. Immunity 2007;27:670–84. [DOI] [PubMed] [Google Scholar]
- 44. Rooney MS, Shukla SA, Wu CJ, Getz G, Hacohen N. Molecular and genetic properties of tumors associated with local immune cytolytic activity. Cell 2015;160:48–61. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45. Thommen DS, Koelzer VH, Herzig P, Roller A, Trefny M, Dimeloe S, et al. A transcriptionally and functionally distinct PD-1+ CD8+ T cell pool with predictive potential in non-small-cell lung cancer treated with PD-1 blockade. Nat Med 2018;24:994–1004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46. Foy J-P, Karabajakian A, Ortiz-Cuaran S, Boussageon M, Michon L, Bouaoud J, et al. Datasets for gene expression profiles of head and neck squamous cell carcinoma and lung cancer treated or not by PD1/PD-L1 inhibitors. Data Brief 2022;44:108556. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47. Newman AM, Steen CB, Liu CL, Gentles AJ, Chaudhuri AA, Scherer F, et al. Determining cell type abundance and expression from bulk tissues with digital cytometry. Nat Biotechnol 2019;37:773–82. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48. Chowell D, Yoo S-K, Valero C, Pastore A, Krishna C, Lee M, et al. Improved prediction of immune checkpoint blockade efficacy across multiple cancer types. Nat Biotechnol 2022;40:499–506. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49. Spitzer MH, Carmi Y, Reticker-Flynn NE, Kwek SS, Madhireddy D, Martins MM, et al. Systemic immunity is required for effective cancer immunotherapy. Cell 2017;168:487–502.e15. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50. Binnewies M, Roberts EW, Kersten K, Chan V, Fearon DF, Merad M, et al. Understanding the tumor immune microenvironment (TIME) for effective therapy. Nat Med 2018;24:541–50. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51. Zhou H, Zhu L, Song J, Wang G, Li P, Li W, et al. Liquid biopsy at the frontier of detection, prognosis and progression monitoring in colorectal cancer. Mol Cancer 2022;21:86. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52. Maggioni D, Pignataro L, Garavello W. T-helper and T-regulatory cells modulation in head and neck squamous cell carcinoma. Oncoimmunology 2017;6:e1325066. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53. Wieland A, Patel MR, Cardenas MA, Eberhardt CS, Hudson WH, Obeng RC, et al. Defining HPV-specific B cell responses in patients with head and neck cancer. Nature 2021;597:274–78. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54. Cabrita R, Lauss M, Sanna A, Donia M, Skaarup Larsen M, Mitra S, et al. Tertiary lymphoid structures improve immunotherapy and survival in melanoma. Nature 2020;577:561–5. [DOI] [PubMed] [Google Scholar]
- 55. Helmink BA, Reddy SM, Gao J, Zhang S, Basar R, Thakur R, et al. B cells and tertiary lymphoid structures promote immunotherapy response. Nature 2020;577:549–55. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56. Petitprez F, de Reyniès A, Keung EZ, Chen TW-W, Sun C-M, Calderaro J, et al. B cells are associated with survival and immunotherapy response in sarcoma. Nature 2020;577:556–60. [DOI] [PubMed] [Google Scholar]
- 57. Krieg C, Nowicka M, Guglietta S, Schindler S, Hartmann FJ, Weber LM, et al. Author correction: high-dimensional single-cell analysis predicts response to anti-PD-1 immunotherapy. Nat Med 2018;24:1773–5. [DOI] [PubMed] [Google Scholar]
- 58. Sinha S, Vegesna R, Mukherjee S, Kammula AV, Dhruba SR, Wu W, et al. PERCEPTION predicts patient response and resistance to treatment using single-cell transcriptomics of their tumors. Nat Cancer 2024;5:938–52. [DOI] [PubMed] [Google Scholar]
- 59. Mariotto AB, Yabroff KR, Shao Y, Feuer EJ, Brown ML. Projections of the cost of cancer care in the United States: 2010–2020. J Natl Cancer Inst 2011;103:117–28. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
In this study, the primary sources of the data are obtained from NCBI Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/; RRID: SCR_005012). All data used were sourced from publicly accessible repositories that adhered to ethical standards, including the requirement for written informed consent from patients at the time of initial publication. Additional information about the datasets used in this study can be found in Supplementary Table S1.
All codes that are necessary to reproduce all the results in the article are implemented in Python and R and are publicly available in GitHub: https://github.com/yingstat/TIMEP.