Abstract
Neoantigens are the key targets of anti-tumor immune responses from cytotoxic T cells, and play a critical role in affecting tumor progression and immunotherapy treatment responses. However, little is known about how the interaction between neoantigens and T cells ultimately impacts the evolution of cancerous masses. Here, we develop a hierarchical Bayesian model, named Neoantigen-T cell Interaction Estimation (netie), to infer the history of neoantigen-CD8+ T cell interactions in tumors. Netie was systematically validated and applied to examine the molecular patterns of 3,219 tumors, compiled from a panel of 18 cancer types. We showed that tumors with an increase in immune selection pressure over time are associated with T cells that have an activation-related expression signature. We also identified a subset of exhausted cytotoxic T cells post-immunotherapy associated with tumor clones that newly arise after treatment. These analyses demonstrate how netie enables the interrogation of the relationship between individual neoantigen repertoires and the tumor molecular profiles. We found that a T cell inflammation gene expression profile (TIGEP) is more predictive of patient outcomes in the tumors with an increase in immune pressure over time, which reveals a curious synergy between T cells and neoantigen distributions. Overall, we provide a new tool that is capable of revealing the imprints left by neoantigens during each tumor’s developmental process, and of predicting how tumors will progress under further pressure of the host’s immune system.
INTRODUCTION
Neoantigens are a repertoire of mutated peptides translated from tumor somatic mutations and are presented on the surface of tumor cells by the Major Histocompatibility Complexes (MHC). Neoantigens are key antigens recognized by tumor-specific T cells, which initiate antitumor cytotoxic effects(1–4). Unfortunately, the impact of neoantigens on tumorigenesis, prognosis, and treatment response is poorly understood. The neoantigen load approach (counts of all the neoantigens present in a patient) is commonly employed by researchers to examine the potential correlation with prognosis and treatment response(5–8). This approach, however, has only been successful in some studies(9–13). Moreover, the contribution of neoantigens to tumor evolution has not yet been elucidated. Analyzing this contribution could help predict the future behavior of tumors and inform their response to immunotherapies.
The survival fitness of cancerous cells diminishes when mutations in the tumor DNA give rise to neoantigens that are presented on the cell surface, leading to immunogenicity. With constant external immune selection pressure, the numbers of neoantigens generated by newly occurring tumor somatic mutations are expected to stay constant over the course of tumorigenesis. When anti-tumor T cell immunity is strong, it is anticipated that the mutations that generate more neoantigens will be more strongly selected against(14,15). On the contrary, when there is not enough T cell infiltration or there is functional exhaustion/inhibition of the T cells, selection pressure will be substantially lessened for tumor cells with mutations leading to high neoantigen counts(16). Thus, ascertaining the dynamics of neoantigen distributions throughout molecular time can reveal the evolutionary history of the immune pressure.
To achieve this task, it is critical to time the genetic events during tumorigenesis. Tumors at the time of diagnosis often consist of heterogeneous clones(17–19), each with a unique set of somatic mutations sharing similar cellular prevalence (Fig. 1a). The tumor clones can be detected through the clustering of mutations via algorithms such as PyClone(17), PhyloWGS(18), and SciClone(19). However, detection of the developmental time-ordering of these clones is a much harder problem. As shown in Fig. 1b, each clone arises from a tumor cell within the parent clone population, due to a tumor-driving event, along with possible additional passenger mutations. One parent clone may yield two or more child clones (such as clone 2 and clone 3 from clone 1, as seen in Fig. 1b). Due to the large potential search space, it is difficult to reliably order the clones into a phylogenetic tree of parent-child relationships. Also, the clonal size is an unreliable indicator of the appearance times of the tumor clones, due to sampling bias and the fact that different clones may have different proliferating potentials.
Fig. 1.
The rationale for inferring the history of anti-neoantigen immune pressure during tumor development. (a) Clonal composition of a hypothetical tumor. The circles refer to the proportions of tumor cells carrying each variant. The circles are colored according to the clones to which they are assigned. The histogram shows the distribution of the cellular prevalence of the variants. (b) Phylogenetic relationship between clones. All clones, inferred in (a), are derived either from the normal tissue or from a parent clone. Clone 1 breaks into two clones as two tumor cells are born (Clones 2 and 3). (c) Evolution of mutations and their prevalences within clones. The different shades of pink refer to nesting clones. (d) Inferring immune selection pressure from neoantigens. The blue curly shapes refer to neoantigens associated with each mutation (circle). (e) The setup of the simulation data, where the assumed clones and their parental relationships are shown. (f) The posterior density curves of the random variables to be estimated, with the 95% highest posterior density intervals presented by blue bars on the x-axes. The vertical red lines are located at the true assumed values. (g) Trace plots showing the convergence of the netie estimates of the random variables around the true values, throughout the MCMC iterations. (h) The potential scale reduction factors (PSRFs) for all the inferred variables of the simulation dataset in Fig. 1e. “ac” is the inferred trend of change in anti-tumor selection pressure for each clone. “bc” and “pi” are the posterior estimates of the other variables in the Bayesian model.
In this work, we employed an innovative approach of treating the intra-clone cellular prevalence of somatic mutations as a surrogate for a molecular clock within each tumor clone. We hypothesized that the per-mutation neoantigen load over the time of tumorigenesis is reflective of the history of the “immune selection pressure” in the tumor’s past. We developed netie, to infer the evolution of neoantigen-CD8+ T cell interactions in tumors by sampling from different clones. Netie is systematically validated by a series of simulation studies and real human tumor data. We utilized netie to evaluate 3,219 tumors of 18 cancer types, and provided the first pan-cancer landscape of the impact of neoantigens on the molecular phenotypes of tumors, prognosis, and treatment response to immunotherapies. While most prior studies of neoantigens focus on immunogenic tumors(1,20), such as lung cancer, we also showed an effect of neoantigens on non-immunogenic tumors using netie.
RESULTS
Inferring evolution of neoantigen-T cell interactions
We developed a rationale to examine somatic mutations within each individual tumor clone to order the genetic events within each clone. In Fig. 1c, we show that a tumor cell (cell A) is born with a new set of mutations, within the parent clone. This tumor cell proliferates and dominates the original parent clone. One of cell A’s progeny (cell B) then acquires another set of mutations. Cell B carries preceding mutations of its parent cell, cell A, along with its own unique mutations, and becomes dominant in the tumor clone of cell A, which is in turn a subset of the original parent clone (Fig. 1c). These clones form a nesting pattern, with similar but distinct prevalences. Software such as PyClone clusters these mutations (representing different clones) together. In this setting, the prevalence of these nesting clones grouped together by PyClone can help distinguish which mutations occurred earlier from those that occurred later. This is different from the situation in Fig. 1b, where at least two competing clones occur, and the parent clone diverges into two or more mutually exclusive clones.
With this tool for time-ordering the variants in each clone, we aimed to investigate the history of anti-tumor immune pressure for each tumor clone (Fig. 1d), by examining how the average counts of neoantigens generated by somatic mutations vary over time. In this work, we mainly consider interactions between neoantigens presented by class I MHC molecules and CD8+ T cells. Based on the rationale laid out above, we developed the netie model to estimate the variation in the anti-tumor immune selection pressure by modeling the number of neoantigens from all mutations in each clone, ordered along the developmental time axis. The inference is performed for each tumor clone individually, so that the prevalence of each mutation can serve as a surrogate for developmental time. More specifically, the model estimates whether the immune selection pressure has been decreasing or increasing over a long duration of time. This trend of change is denoted by the variable ac for the c-th tumor clone, where ac>0 means an increasing trend of immune selection pressure and ac<0 means a decreasing trend. The immune selection pressure on tumor mutations of different clones within the same tumor should share some similarities, as they are exposed to the same tumor microenvironment. Therefore, netie employs a random effect framework to model the relatedness of different clones and estimates an overall “a” to represent the trend of immune selection pressure change for the whole tumor. Parameter estimation is performed with the Markov chain Monte Carlo (MCMC) algorithm. Details of netie are described in the method section and Sup. Note 1.
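To make the intuition concrete, the following is a toy sketch only, not the netie model itself (which is a hierarchical Bayesian model fitted by MCMC and described in the method section). It assumes that, within one clone, a mutation with higher cellular prevalence arose earlier, so that 1 minus prevalence can act as a pseudo-time, and that rising immune pressure shows up as fewer neoantigens per mutation at later pseudo-times. The function name `toy_pressure_trend` and the least-squares slope are illustrative assumptions.

```python
from statistics import mean


def toy_pressure_trend(prevalences, neoantigen_counts):
    """Toy stand-in for the sign of ac within a single clone.

    ASSUMPTION (not the netie likelihood): pseudo-time = 1 - prevalence,
    and the trend is the negative least-squares slope of the per-mutation
    neoantigen count against pseudo-time. A positive return value means
    neoantigen counts fall over pseudo-time, read here as rising pressure.
    """
    if len(prevalences) != len(neoantigen_counts) or len(prevalences) < 2:
        raise ValueError("need two equally long sequences with >= 2 mutations")
    times = [1.0 - p for p in prevalences]  # earlier mutations are more prevalent
    t_bar = mean(times)
    n_bar = mean(neoantigen_counts)
    sxx = sum((t - t_bar) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all mutations share one prevalence; no time ordering")
    sxy = sum((t - t_bar) * (n - n_bar) for t, n in zip(times, neoantigen_counts))
    return -sxy / sxx


# Hypothetical clone: later (less prevalent) mutations carry fewer neoantigens.
trend = toy_pressure_trend([0.9, 0.7, 0.5, 0.3], [4, 3, 2, 1])
print("toy trend is positive:", trend > 0)
```

Unlike this sketch, netie shares information across clones of the same tumor through its random effect framework and reports posterior distributions rather than a point estimate.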
We applied netie to three simulated tumor samples to evaluate the effectiveness of our method. The first sample had four clones and 100 mutations (Fig. 1e); the second sample had one clone and 100 mutations (Extended Data Fig. 1a); the third sample had eight clones and 400 mutations (Extended Data Fig. 1b, c). The number of clones and mutations, prevalences, and the clonal structures were simulated to be comparable to those observed in typical real human tumors. The performance of netie was evaluated with respect to the estimation of each variable. For the estimated immune selection pressure “ac” and overall “a”, we compared the posterior estimates with the ground truths of the simulation. The true values were all located within the 95% highest posterior density interval (Fig. 1f), meaning that netie correctly inferred the trend of variation in immune selection pressure. In Fig. 1g, the trace plot shows the sampled a at each MCMC iteration (X axis). The fluctuations of the sampled variables (Y axis) around stable values indicate good convergence behavior. The potential scale reduction factors (PSRFs), a metric for measuring the convergence of MCMC, are less than 1.1 for all the inferred parameters, which also demonstrates that the MCMC converged (Fig. 1h). Together, these results indicate dependable performance of netie.
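The two diagnostics used above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used in the study: the 95% highest posterior density interval is taken as the narrowest window covering 95% of the sorted samples, and the PSRF is the basic Gelman-Rubin statistic for equal-length chains without the degrees-of-freedom correction. The function names are hypothetical.

```python
import math
from statistics import mean, variance


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    if n == 0 or not 0 < mass <= 1:
        raise ValueError("need samples and 0 < mass <= 1")
    k = max(1, math.ceil(mass * n))  # number of points the interval must cover
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


def psrf(chains):
    """Basic Gelman-Rubin potential scale reduction factor."""
    m, n = len(chains), len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need >= 2 chains of equal length >= 2")
    within = mean(variance(c) for c in chains)       # W: mean within-chain variance
    between = n * variance([mean(c) for c in chains])  # B: between-chain variance
    if within == 0:
        raise ValueError("chains have zero variance")
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)
```

With this definition, a value close to 1 means the chains agree on both location and spread, which is the sense in which values below 1.1 are read as convergence.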
Immune pressure is associated with T cell activation
We further validated netie using real data and demonstrated its ability to reveal biologically meaningful signals. We applied it to The Cancer Genome Atlas Program (TCGA) patients and some kidney cancer patients from our previous publication(21). We included 17 cancer types in the study, with a total of 6,436 patients. Netie analyses were successfully performed on 2,545 patients’ genomics data, with the other patients lost in the analysis pipeline for a number of reasons, such as lack of whole exome-seq data, or no somatic mutations or neoantigens detected.
We divided successfully processed patients based on the trends of their tumor immune pressure’s variation over time, “a”. We first define three groups of patients: patients with high “a” (more than 70% of MCMC iterations have inferred “a”>0), patients with low “a” (fewer than 30% of iterations have inferred “a”>0), and the other patients in the middle (their proportion equals 1 minus the proportions of the other two groups, and is omitted from plotting). Fig. 2a shows the proportion of patients in each category, for each tumor type. In every cancer type that was investigated, we observed subgroups of patients displaying an increase or decrease in immune selection pressure over time, showing that heterogeneous tumor evolutionary processes, as a result of T cell-mediated pressure, exist in all cancer types. In addition, we found that, in certain tumor types (Fig. 2a), there was a greater proportion of patients who demonstrated an increased immune selection pressure over time, such as in Adrenocortical carcinoma (ACC), Lymphoid neoplasm diffuse large B-cell lymphoma (DLBC), and Uveal melanoma (UVM); meanwhile, other tumor types had more patients who showed decreasing immune pressure, such as Pancreatic adenocarcinoma (PAAD), and Uterine carcinosarcoma (UCS). These findings suggest that the nature of the tumor-immune interactions varies across different cancer types and could be used to investigate tumor evolutionary mechanisms as well as inform immunotherapy choices for treating these cancers.
Fig. 2.
Immune selection pressure variations correlate with the phenotypes of the tumor and tumor clones. (a) Applying netie on the TCGA plus the kidney cancer data. The percentages of the patients with high “a” (a >0 in more than 70% MCMC iterations) and low “a” (a<0 in more than 70% iterations) are shown for each tumor type. (b) Circos plots showing the enriched pathways in the genes that are differentially expressed between INisp and DEisp patients (a> or <0 in more than 50% of iterations). Left: KIRC; right: SKCM. Only the top pathways are shown in each panel for ease of presentation. (c) The number of enriched immune-related pathways found in the genes differentially expressed between INisp and DEisp patients, for each cancer type. (d) The top differentially enriched pathways between INisp and DEisp patients, detected by GSEA. For this analysis, all patients regardless of cancer types were combined. The GSEA test was applied for the calculation of P values. (e) Volcano plot showing the genes that are differentially expressed between INisp and DEisp patients of SARC. A positive value on the X axis means the gene is up-regulated in the INisp patients. Two-sided T-test was applied. (f) A heatmap showing the differential expression of HAVCR2, LAG3, IL-2, IFNG, and TNF, in all cancer types. Orange refers to higher expression in INisp patients, and blue refers to higher expression in DEisp patients.
It is unclear from present studies if and how neoantigen presentation can alter the molecular phenotypes of the tumors. To answer this question, for each cancer type, we divided patients into a group “INisp”, with increasing immune pressure over time (a>0 in more than 50% of iterations), and another group “DEisp”, with decreasing immune pressure over time (a<0 in more than 50% of iterations). We compared the expression profiles of INisp patients and DEisp patients, and performed gene ontology (GO) analyses to identify enriched pathways in differentially expressed genes. We identified immune-related pathways in the differentially expressed genes for every tumor type we investigated. Immune-related pathways are defined as GO terms with any keyword related to any type of immune cells, or keyword related to immune/interleukin/cytokine/chemokine/bacteria. The top enriched pathways with the most significant P-values, for Kidney renal clear cell carcinoma (KIRC) and Melanoma (SKCM), are shown as examples in Fig. 2b. In fact, Fig. 2c shows that every tumor type has at least three enriched immune-associated pathways identified. Among the most immunogenic cancer types (Lung squamous cell carcinoma (LUSC), Lung adenocarcinoma (LUAD), Melanoma, and Kidney renal clear cell carcinoma)(21), kidney cancer has the highest number of immune-related pathways. Curiously, immune-related pathways are also detected in the differentially expressed genes for other cancer types that are usually considered non-immunogenic, such as Adrenocortical carcinoma (ACC), Bladder urothelial carcinoma (BLCA), and Uveal Melanoma (UVM). This observation suggests that neoantigens broadly impact the tumor evolutionary processes of non-immunogenic cancer types, in addition to immunogenic cancer types.
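The two patient groupings used so far (the 70%/30% split for Fig. 2a and the INisp/DEisp split at 50%) both reduce to the fraction of MCMC iterations in which the sampled “a” is positive or negative. The sketch below illustrates that bookkeeping; the function names and the handling of exact ties or zero draws are assumptions, not taken from the netie software.

```python
def classify_trend(a_samples, high=0.7, low=0.3):
    """Return 'high', 'low', or 'middle' from posterior draws of a."""
    if not a_samples:
        raise ValueError("need at least one MCMC draw")
    frac_positive = sum(a > 0 for a in a_samples) / len(a_samples)
    if frac_positive > high:
        return "high"      # a > 0 in more than 70% of iterations
    if frac_positive < low:
        return "low"       # a > 0 in fewer than 30% of iterations
    return "middle"


def isp_group(a_samples):
    """Return 'INisp', 'DEisp', or None when neither sign exceeds 50%."""
    if not a_samples:
        raise ValueError("need at least one MCMC draw")
    n = len(a_samples)
    if sum(a > 0 for a in a_samples) / n > 0.5:
        return "INisp"     # increasing immune selection pressure
    if sum(a < 0 for a in a_samples) / n > 0.5:
        return "DEisp"     # decreasing immune selection pressure
    return None            # ASSUMPTION: exact ties are left unassigned
```

Working from the full set of posterior draws rather than a single point estimate lets the grouping reflect how confident the model is about the sign of the trend.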
To confirm our findings above, we also performed Gene Set Enrichment Analysis (GSEA)(22) on the expression profiles of INisp patients and DEisp patients. We observed that a large number of immune-related pathways, especially T cell related pathways, are differentially enriched between INisp and DEisp patients (the top pathways shown in Fig. 2d and Sup. Note 2).
We next focused on individual genes that were directly related to T cell functions. The volcano plot of differential gene expression in Sarcoma (SARC) is shown in Fig. 2e as an example. We found that IL-2 and IFNG, which are landmark genes up-regulated in activated T cells(23), have higher expression levels in sarcoma INisp patients. We also identified another gene, TNF, which promotes T cell activation(24) and is up-regulated in INisp patients in many cancer types such as Head and Neck squamous cell carcinoma (HNSCC) and Adrenocortical carcinoma (ACC). We systematically examined the differential expression of these genes between INisp and DEisp patients in all the cancer types analyzed. Across almost all cancer types, we observed higher expression of IL-2, IFNG, and to a lesser extent, TNF in INisp patients. In contrast, two genes, LAG-3 and HAVCR2, which are markers of T cell exhaustion(25,26), were consistently up-regulated in DEisp patients (Fig. 2f).
Overall, netie can build a link between patient neoantigens and tumor molecular profiles, which is capable of yielding novel critical insights into the complicated process of host immune cell-tumor interactions.
Immune pressure is correlated with the genotypes of tumors
Next, we investigated whether the immune pressure variations inferred by netie are correlated with the genotypes of the tumor clones. We investigated whether the tumor clones with and without somatic mutations in each gene display any difference in the immune pressure variations, ac. For this analysis, we pooled all tumor clones from all patients and employed a two-sided Wilcoxon test to compare ac (Extended Data Fig. 2a). We only tested genes that are mutated in at least 30 patients. Interestingly, among the top 10 genes from this analysis, NID1 has been shown to interact with several extracellular matrix proteins and to be associated with tumor infiltration of T cells, B cells, macrophages, neutrophils, and dendritic cells(27). STARD13 has been significantly correlated with immune infiltration in bladder cancer(28). BAGE2 is more frequently mutated in immunity-low gliomas(29). We also noticed several genes with significant P values, SETDB1, KEAP1, STK11, and FN1, that have been implicated in tumor-T cell interactions and immunotherapy responses(30–33) (all genes shown in Sup. Table 1). In Extended Data Fig. 2b, we showed that when SETDB1 is mutated, the tumor clones are more likely to experience an increase in immune pressure. FN1 shows the same phenotype. To determine the roles of the top genes with the most significant P values, we investigated the enriched Gene Ontology (GO) pathways of the genes with P value<0.05 with enrichr(34) (Extended Data Fig. 2c), which is a comprehensive gene set enrichment analysis webserver. We observed an enrichment of interleukin-1 (IL-1) binding-related genes in the significant genes. IL-1 is an immunostimulatory cytokine abundant in the tumor microenvironment(35), and has been linked to the immune response against tumor neoantigens(36). We also identified many additional enriched GO pathways related to extracellular matrix and cell-to-cell interactions (e.g. “integral component of plasma membrane” and “extracellular matrix organization”), associated with effective tumor-T cell interactions.
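For readers who want to reproduce the per-gene comparison, the sketch below shows one way to compare ac between clones with and without a mutation in a given gene. It assumes the two-sided Wilcoxon test here is the rank-sum (Mann-Whitney) form for two independent groups, and it uses the normal approximation with a tie correction and no continuity correction. The function name is hypothetical, and the study may have relied on a standard statistics package instead.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    x: ac values of clones carrying the mutation; y: clones without it.
    Returns (U statistic for x, two-sided p value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1                      # pooled[i:j] share one value
        avg_rank = (i + 1 + j) / 2      # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t          # tie correction for the variance
        i = j
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma2 = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if sigma2 == 0:
        return u1, 1.0                  # all values identical
    z = (u1 - mu) / math.sqrt(sigma2)
    return u1, math.erfc(abs(z) / math.sqrt(2))
```

Because one such test is run per gene, the resulting P values would normally be interpreted with multiple testing in mind.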
Overall, these analyses demonstrated how the genetic perturbations to the cell-to-cell interaction machinery of the tumor cells may impact the evolution of T cells’ cytotoxic effectiveness over time.
Intra-tumor heterogeneity of immune selection pressure
Netie is also applicable to joint-analyses of multiple samples from the same tumor. In our prior analyses with only one sample per patient, netie infers the clone-specific immune selection pressure and reports an overall tumor-wise average. The availability of multiple samples per patient and the unique composition of tumor clones in each sample allows us to closely examine the differences between different samples and individual clones, providing a more fine-grained insight into the intra-tumor heterogeneity of immune selection pressure.
We generated WES and RNA-seq data for four non-small cell lung cancer (NSCLC) patients, for whom three samples each from different regions of the tumors were collected. The phylogenetic tree for each patient was reconstructed by PyClone(17) and ClonEvol(37), which are algorithms for tumor clone deconvolution from next generation sequencing data. We show one patient’s phylogenetic tree as an example in Fig. 3a. There were a total of 13 clones found in the three samples from this patient (one common clone, and 12 private clones). Interestingly, netie inferred that the private clones demonstrated an enhancement of immune selection pressure over time, while the sole shared clone demonstrated the opposite trend (Fig. 3b). We also found a stronger decrease in the immune selection pressure associated with tumor clones shared within each patient’s samples for each of the other three tumors, compared with private tumor clones (Extended Data Fig. 3). One possible explanation for this curious observation could be the different levels of immune selection pressure inflicted upon the distinct tumor clones. The tumor clones with a stronger decrease in immune pressure are more likely to persist and evolve in more regions of the tumor (and thus become the observed “shared” clone).
Fig. 3.
Netie is capable of performing multi-sample joint analyses. (a) Netie analysis of the multi-site samples of one MDACC lung cancer patient (Patient ID 886403). The tumor clones were visualized in the phylogenetic tree plot and the fish plot. (b) The immune selection pressure scores of the shared (N=1) and the private tumor clones (N=12) of this patient in (a). (c) Netie analysis of the pre-treatment and post-treatment samples from the Riaz cohort. The immune selection pressure scores were also visualized in barplots for comparison between clones that occurred in pre-treatment samples and new clones that occurred only in the post-treatment samples. The numbers of clones shown in the barplots are as follows. Pt3: N(Pre)=2, N(Post)=1. Pt9: N(Pre)=2, N(Post)=2. Pt10: N(Pre)=3, N(Post)=1. Pt11: N(Pre)=1, N(Post)=1. Pt26: N(Pre)=1, N(Post)=1. Pt27: N(Pre)=1, N(Post)=1. Pt31: N(Pre)=1, N(Post)=1. Pt89: N(Pre)=3, N(Post)=1. (d) Boxplots of the expression levels of the T cell exhaustion signature, comparing the pre-treatment and post-treatment samples. N(pre-treatment)=8 and N(post-treatment)=8. (e) GO analysis of the genes differentially expressed between the pre-treatment and post-treatment samples. The lengths of the bars are proportional to the −log(P value) of the GO analysis. The GSEA test was applied for statistical testing. For barplots in (b)-(c), the center of the error bar represents the immune pressure variation (ac) and the error bars represent 95% confidence intervals. For the boxplot in (d), box boundaries represent interquartile ranges, whiskers extend to the most extreme data point which is no more than 1.5 times the interquartile range from the box, and the line in the middle of the box represents the median.
Additionally, we analyzed a cohort of melanoma patients treated with checkpoint inhibitors (Riaz et al(6)). There are 16 patients from the Riaz cohort for whom both pre- and post-treatment exome-seq data were generated, for whom neoantigen calling and PyClone analyses were all successfully performed, and for whom at least one tumor clone is shared between the pre- and post-treatment samples of the same patient.
This cohort consisted of patients with either stable or progressive disease, with no patient exhibiting a complete response. Netie showed that these tumors demonstrated an overall decreasing trend of immune activity (ac<0), which was consistent with the lack of responsiveness to treatment in these patients. Fig. 3c shows the results for 8 patients for whom we had access to both the exome-seq and RNA-seq datasets for neoantigen calling, and Sup. Note 2 shows the other 8 patients with only exome-seq data. Due to the availability of both pre- and post-treatment samples, we were able to distinguish which tumor clones emerged later during tumor progression. We compared the evolutionary patterns of the immune selection pressure on the pre-existing clones in the pre-treatment samples and those that emerged in the post-treatment samples. Interestingly, we found that the new clones that arose after immune-checkpoint blockade experienced a greater loss of anti-tumor immune activity than clones that already existed in the pre-treatment samples (Fig. 3c, Pval=0.015, and Sup. Note 2, Pval=0.033). We hypothesized that this observation could be caused by the exhaustion of T cells after checkpoint blockade. To confirm this, we examined the expression of a T cell exhaustion gene signature(23) in the 8 patients with available RNA-seq data. We observed that the T cell exhaustion level was significantly higher in post-treatment samples than in pre-treatment samples (Fig. 3d, Pval=0.037). We also examined the differential expression of genes in the pre-treatment and post-treatment samples in an unbiased manner. In Fig. 3e, we showed that the differentially expressed genes were enriched in pathways essential for immune system activation, leukocyte activation, and leukocyte aggregation in the 8 patients with matched RNA-seq data.
Overall, netie analyses revealed, from the perspective of the evolution of neoantigens, an exhaustion of T cell anti-tumor activity after checkpoint blockade in non-responsive tumors. This is consistent with the observation of Liu et al (38), who showed an increase of exhaustion gene signatures in T cells in non-responsive tumors after immune checkpoint inhibitor treatment, and a decrease of exhaustion in responsive tumors.
Overall, multi-sample genomics data, when viewed through the lens of netie, revealed that the out-growth of particular tumor clones is concomitant with a weakening of T cells’ immune surveillance on these clones.
Immune pressure is associated with clinical phenotype
Finally, we investigated whether the past history of tumor-T cell interactions could impact the future behavior of the tumors. Higher T cell inflammation is usually associated with better prognosis and immunotherapy responses(1,39–41). However, the strength of the correlation between T cell inflammation and patient phenotypes varies widely across tumor types and treatment types, and thus does not always yield significant results. This is likely because a number of other factors, such as the frequency of neoantigens, can determine whether the tumor-infiltrating T cells exhibit anti-tumor effects(1,20). We have shown that the INisp and DEisp classification based on patient neoantigen repertoire can be correlated with the effectiveness of the anti-tumor immune response. These data motivated us to ask whether the T cell inflammation gene expression profile (TIGEP) patterns(42), in conjunction with netie’s INisp/DEisp classification, could better predict patient clinical outcomes and response to immunotherapies.
Across the pan-cancer TCGA cohort, we first confirmed that a higher TIGEP score was associated with improved overall survival (Fig. 4a). Importantly, when we dichotomized the patients into INisp and DEisp, we observed that the association between TIGEP and overall survival was stronger in the INisp patients (Fig. 4a left, Pval=0.00693) than in the DEisp patients (Fig. 4a right, Pval=0.0524). This indicates that for patients whose neoantigen repertoire implies a history of waning anti-tumor T cell effectiveness, T cell infiltration is indeed less likely to be effective in controlling tumor growth.
Fig. 4.
History of immune selection pressure is predictive of tumor prognosis and response to immunotherapy. (a-b) The patients were dichotomized into the INisp and DEisp groups. Survival analyses were performed by examining the association between TIGEP levels (top 50% vs. bottom 50%) and overall survival in each group. (a) all patients; (b) patients from immunogenic cancer types. (c,e) The immunotherapy-treated patients were dichotomized into INisp and DEisp groups, and each group is further split into patients with high and low TIGEPs (50% vs. 50% split). The proportion of responders in each group (INisp/DEisp, high/low T cell) was examined. (c) all patients, (e) patients from immunogenic cancer types. (d,f) The odds ratios of the enrichment of responders and patients with high TIGEPs were calculated and compared between INisp and DEisp patients. The “high TIGEP” patients were selected based on a number of cutoffs to assess the robustness of this analysis. (d) all patients (upper 30% TIGEP level as cutoff: Pval=0.0048, 40%: Pval=0.0040, 50%: Pval=0.0056, 60%: Pval=0.0099) (f) patients from immunogenic cancer types (30%: Pval=0.032, 40%: Pval=0.027, 50%: Pval=0.016, 60%: Pval=0.047). For barplots in (d) and (f), the center of the error bars represents the odds ratio and the error bars represent 95% confidence intervals. One-tail P values of the CoxPH model are shown in (a-b). One-tail P values of the Chi-squared test are shown in (c, e).
When we limited the above analyses to the more immunogenic cancer types (LUSC, LUAD, SKCM, and KIRC)(43), the association between TIGEP and overall survival was even stronger in the INisp patients, but not in the DEisp patients (Fig. 4b left: Pval=0.000704, Fig. 4b right: Pval=0.0548). A multivariate analysis was performed, in which we adjusted for the clinical stage, gender, age, and cancer type in the cohort of INisp patients (Extended Data Fig. 4a) and DEisp patients (Extended Data Fig. 4b). The association between survival and TIGEP levels still held in INisp patients (Pval=0.004) and was also stronger than the association observed for the DEisp patients (Pval=0.02).
To test the robustness of our analyses, we employed PhyloWGS(18) and SciClone(19), as well as PyClone, for clonality inference (Extended Data Fig. 5a) and MHCflurry(44) (Extended Data Fig. 5b) for neoantigen prediction. We made the same observation that TIGEP is more predictive of patient responses in the INisp patients than in the DEisp patients.
We then studied the implications of the netie INisp/DEisp classifications for patients who were treated by immunotherapies. We collected a total of 654 cancer patients(5–7,45–50) with five tumor types (LUSC, LUAD, SKCM, KIRC, Gastric cancer, HNSCC), who were on four immunotherapeutic regimens (anti-PD1, anti-PDL1, anti-CTLA4, anti-PDL1/anti-CTLA4). As expected(1,39–41), we again observed that patients with higher TIGEP levels were more likely to respond to treatments (Fig. 4c). We then applied netie to analyze the neoantigen profiles of these patients. The patients were grouped by the INisp/DEisp classifications, as done in prior analyses. In each of the INisp and DEisp patient groups, we calculated the association between increased TIGEP levels and favorable treatment response (definitions in method section). We confirmed that this association was stronger in the INisp patients (Fig. 4c left, Pval=0.00031) than in the DEisp patients (Fig. 4c right, Pval=0.023), consistent with the patterns observed for the prognoses of the patients at baseline without immunotherapy.
We defined high and low TIGEP patients based on the median TIGEP level of the patient cohort (Fig. 4c). To demonstrate the robustness of this analysis, we chose a range of cutoffs for splitting TIGEP levels, and calculated the odds ratios for the associations between high TIGEP and favorable responses, for the INisp and DEisp patients. Fig. 4d shows that the association between increased TIGEP and responsiveness to treatment was always stronger in the INisp patients than in the DEisp patients, regardless of cutoffs (Pval<0.01 for each pair of odds ratios). We performed the same analysis for 517 out of 654 patients of immunogenic cancer types(5,6,12,42,47–49) and observed that the association between higher TIGEP levels and better responses was more significant in INisp patients of immunogenic cancer types than in DEisp patients (Fig. 4e and Fig. 4f, Pval<0.05). To demonstrate the additional value provided by the netie DEisp/INisp classification, we built random forest models with either TIGEP alone or TIGEP plus the netie DEisp/INisp classification as predictors, and compared their performance in predicting patient immunotherapy responses. Our analyses in Sup. Note 2 Fig. 11–13 showed that consideration of neoantigen evolutionary history provided extra benefit for patient risk classification.
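The cutoff sweep described above can be illustrated with a short Python sketch. This is not the code used in this study: the function names, the toy scores, and the 0.5 added to each cell of the 2x2 table are assumptions made purely for illustration.

```python
def odds_ratio(tigep, responder, cutoff):
    """Odds ratio of response for high-TIGEP versus low-TIGEP patients."""
    # Every cell starts at 0.5 (Haldane-Anscombe correction, an assumption of
    # this sketch) so that an empty cell does not cause a division by zero.
    hi_resp = hi_non = lo_resp = lo_non = 0.5
    for score, is_responder in zip(tigep, responder):
        if score > cutoff:
            if is_responder:
                hi_resp += 1
            else:
                hi_non += 1
        else:
            if is_responder:
                lo_resp += 1
            else:
                lo_non += 1
    return (hi_resp * lo_non) / (hi_non * lo_resp)


def odds_ratios_over_cutoffs(tigep, responder, cutoffs):
    """Repeat the high/low split at each cutoff and collect the odds ratios."""
    return {cutoff: odds_ratio(tigep, responder, cutoff) for cutoff in cutoffs}


# Hypothetical toy data (not from any cohort in this study).
toy_tigep = [0.1, 0.4, 0.5, 0.7, 0.8, 0.9]
toy_responder = [False, False, True, False, True, True]
toy_result = odds_ratios_over_cutoffs(toy_tigep, toy_responder, [0.3, 0.45, 0.6])
```

In an analysis of this kind, the sweep would be run separately for the INisp and DEisp groups, and the two odds ratios would be compared at each cutoff.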
DISCUSSION
Our work provides a tool, netie, to infer the footprint left by anti-tumor T cells on the evolution of each subclone of a heterogeneous tumor over the course of tumorigenesis. Netie was systematically validated with simulation data and through application to large-scale real human tumor data. This is the first study that has explicitly modeled how tumor neoantigens and T cells shape the clonal structures of tumors. Interestingly, we showed that tumor neoantigens shape the intra-tumor heterogeneity structure of not only the most immunogenic cancer types, but also many non-immunogenic cancer types. These findings provide support to efforts towards developing neoantigen-based therapies in various tumor types. With our model, we were also the first to characterize the extent to which tumor neoantigens impact the transcriptomic states of the tumors. While previous studies focused on the relationship between neoantigen loads and patient clinical phenotype, this analysis provided critical insights into the inter-relationship between neoantigen repertoire and tumor genotypes/phenotypes, and the roles of neoantigens during tumorigenesis and clonal evolution. In addition, netie revealed that the past history of tumor-immune interactions can inform the prediction of patients’ prognosis and responsiveness to immunotherapy treatment.
Our strategy leverages mutation clusters (representing tumor clones) detected by software such as PyClone, PhyloWGS and SciClone, and uses the prevalences of the mutations in the same tumor clone as a molecular clock to order the occurrence of each genetic event. Mitchell et al(51) discovered that the number of somatic mutations in kidney cancer is correlated with age. This motivated them to develop a statistical model that treats the occurrence rate of insertion/deletion mutations per tumor clone as a way to time landmark events in the evolution of kidney cancers. In another study, Salichos et al(52) developed a model to estimate tumor evolution and the occurrence times between mutations based on comparison of mutation variant allele frequencies. These two studies adopted an approach related to the surrogate molecular clock approach employed in our study. However, our study was able to consider the variant prevalences, which were not considered by Mitchell et al and which provide more granularity in the clonal evolution estimation. Our study was also able to consider multiple tumor clones with complicated evolutionary relationships, while the model of Salichos et al can only consider tumors with linear evolution patterns. We expect our surrogate molecular clock approach to be generally applicable to other domains of tumor genomic research, and to provide new discoveries beyond the scope of tumor neoantigens.
One limitation of our study is that the trend of the variation in the anti-tumor immune pressure is simply categorized as increasing or decreasing in netie. The interaction between tumors and the immune system is usually complicated, and the trend of change in the immune selection pressure may not necessarily be linear throughout the evolution process. We hope future work in the field of in silico dissection of intra-tumor heterogeneity will develop methodology to time the occurrences of different tumor clones and mutations, which will enable netie to more comprehensively model the relationships between tumors and anti-tumor immunity, and to capture their complicated, nonlinear interactions.
Netie is broadly applicable to sequencing data from various cancer types and reveals the details of the interactions between tumor cells and T cells, while metrics such as TIGEP are only able to infer the activity of T cells. By bridging the field of neoantigen research and the field of tumor clonal deconvolution research, we anticipate that netie will help open up an exciting and uncharted territory for future studies on the interaction between T cells and tumor cells and how it impacts tumorigenesis, metastasis and prognosis.
ONLINE METHODS
The Neoantigen-T cell Interaction Estimation (netie) model
PyClone (version=0.13.1)(17) was used in this work to detect the existence of clones from tumor mutation data, with mutations called from either our own pipeline or by the authors of the original publications. PyClone was run in three steps as instructed by its manual: PyClone setup_analysis, PyClone run_analysis, and PyClone build_table. Then we obtained a table of detected clones with mutations for each sample. Netie inference results based on PhyloWGS (version=1.0) and SciClone (version=1.1.0) were also shown in this work (Extended Data Fig. 5). We provide a zipped version of the netie software for documentation purpose (Sup. Software).
The netie model was constructed as a Bayesian hierarchical model based on the output of PyClone and similar software. The key rationale for netie is that we estimate, for each clone, the association between the number of neoantigens per mutation and the prevalence of the same mutation to infer how the immune selection pressure varies over molecular time. The prevalence of the somatic mutations represents the occurrence times of the mutations, with more truncal mutations happening earlier. If earlier (more truncal) mutations are associated with fewer neoantigens per mutation, it suggests that the immune selection pressure has been stronger during early tumorigenesis, and vice versa. We represent this varying trend by ac for that tumor clone. However, the estimation is not conducted independently for each tumor clone; rather, we assume a random effect model, in which there is a whole tumor-level varying trend, a, which essentially represents the average of the correlated ac values from all clones. Analysis of multi-region sampling data is conducted based on the same framework, but we additionally identify the same clones across different samples of the same patient, and assign the same ac to them.
Importantly, we do not assume that different tumors (in particular, different clones) in the same patient are under the same evolutionary pressure. This is why we use a random effect model to capture the differences among tumors/clones. On the other hand, we assume that the same clone that exists in different tumors has experienced the same evolutionary pressure in the past. This is realistic and consistent with the rationale of our model, which is to look into the past of the tumor, not its current status. In the past, the tumor cells of the same tumor clone that later migrated to different tumors originated in the same place, and thus were under the same evolutionary pressure. It is therefore reasonable to assume the same evolutionary process for them. After expansion to multiple sites, the same clones in different tumors will experience different environmental pressures and will accumulate new mutations, which will form new (different) clones by themselves.
We solve the netie model by Markov chain Monte Carlo (MCMC). By default, the MCMC runs for 100,000 iterations. We discard the first half of the MCMC iterations as burn-in and use the second half for statistical inference. The netie model is implemented in the R language. A detailed model description is provided in Sup. Note 1.
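The burn-in convention described above can be sketched in Python with a generic random-walk Metropolis sampler on a toy one-parameter posterior. This is not the netie sampler, which is implemented in R and described in Sup. Note 1; the function names, the proposal step size, and the toy target are assumptions for illustration.

```python
import math
import random


def metropolis(log_post, init, n_iter=100_000, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional log-posterior."""
    rng = random.Random(seed)
    current, current_lp = init, log_post(init)
    chain = [current]
    for _ in range(n_iter - 1):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


def discard_burn_in(chain):
    """Keep only the second half of the chain for inference."""
    return chain[len(chain) // 2:]


# Toy target (hypothetical): a normal posterior centered at 2 with unit variance.
toy_chain = metropolis(lambda x: -0.5 * (x - 2.0) ** 2, init=0.0, n_iter=20_000)
toy_kept = discard_burn_in(toy_chain)
toy_posterior_mean = sum(toy_kept) / len(toy_kept)
```

Posterior summaries such as means and credible intervals would then be computed from the retained half of the chain only.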
A description of the effect of the tuning parameters on the performance of the netie model is provided in Sup. Table 2 and Sup. Note 2 to help users of the netie model choose the optimal parameter setting. We adopted the same criteria as in Extended Data Fig. 5 to show how much the results vary for different choices of the parameters. We tested 4 tuning parameters: M (minimum number of mutations for a clone, 2–4), N (minimum number of neoantigens for a clone, 1–3), range (minimum range of CP or VAF for variants in a clone, 0.01–0.04), and coverage (minimum sequencing coverage for a variant, 40–80). We chose these parameter ranges mostly because of practical concerns regarding the sample sizes that would be left for analyses. With just M going from 2 to 4, the number of tumor clones left for analyses suffers a 16.19% loss, and three other parameters are being changed at the same time. From a practical standpoint, the combinations of all four parameters in the given space (M from 2 to 4, etc.) ensure that decent numbers of data points are left for analyses after filtering, in most cases. To help convey this point, we will add to Sup. Table 2 the number of data points left for analyses after filtering, for each parameter combination.
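One way to picture how the four tuning parameters act as filters is the Python sketch below. It is an illustrative reading of the parameter definitions above, not the netie implementation: the dictionary keys, the function name, and the order in which the filters are applied are all assumptions.

```python
def keep_clone(variants, M=2, N=1, min_range=0.01, min_coverage=40):
    """Decide whether a clone passes the four filters (hypothetical layout).

    Each variant is a dict with the assumed keys "prevalence" (CP or VAF),
    "coverage" (sequencing depth), and "n_neoantigens".
    """
    # coverage: drop variants sequenced too shallowly (assumed to come first).
    covered = [v for v in variants if v["coverage"] >= min_coverage]
    # M: the clone needs at least M mutations.
    if len(covered) < M:
        return False
    # N: the clone needs at least N neoantigens in total.
    if sum(v["n_neoantigens"] for v in covered) < N:
        return False
    # range: prevalences must span at least min_range to carry a time signal.
    prevalences = [v["prevalence"] for v in covered]
    return max(prevalences) - min(prevalences) >= min_range


# Hypothetical toy clone with three variants.
toy_clone = [
    {"prevalence": 0.10, "coverage": 120, "n_neoantigens": 1},
    {"prevalence": 0.30, "coverage": 85, "n_neoantigens": 0},
    {"prevalence": 0.32, "coverage": 30, "n_neoantigens": 2},
]
toy_kept = keep_clone(toy_clone, M=2, N=1, min_range=0.01, min_coverage=40)
```

Under this reading, stricter settings remove more clones, which is why the number of clones left for analysis shrinks as M increases.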
In-house multi-region sampling patient cohort
Twelve primary tumor samples were collected from four patients with non-small cell lung cancer (3 regions per tumor) including Adenocarcinoma (n=3) and Squamous Cell Carcinoma (n=1). Matched normal samples were collected from white blood cells (n=3) or adjacent lung tissue (n=1). Informed consent was obtained from all patients. No compensation was provided for the patients. Collection and use of patient samples were approved by the Institutional Review Board of The University of Texas MD Anderson Cancer Center (MDACC). Detailed cohort characteristics including age, gender, and smoking status are given in Sup. Table 3.
Using the Ribo-SPIA Technology (NuGen, San Carlos, CA, USA), we converted the extracted RNA to cDNA libraries, which were then sequenced as 76 bp paired-end reads on the Illumina HiSeq 2000 platform (Illumina, Inc., San Diego, CA, USA). Genomic DNA was extracted and utilized for library preparation for sequencing with the Agilent SureSelect Human All Exon V4 kit, according to the manufacturer’s instructions (Agilent Technologies, Inc., Santa Clara, CA, USA). 76-bp paired-end whole exome sequencing was performed on the Illumina HiSeq 2000 platform with mean target sequencing coverage of 200x (Illumina, Inc., San Diego, CA, USA).
Mutation and neoantigen calling pipelines
For calling of neoantigens from raw genomics data, we used the QBRC mutation calling and neoantigen calling pipelines(1,53). For the TCGA samples analyzed in our study, we downloaded their mutation annotation files from https://gdac.broadinstitute.org/, and thus directly started with the neoantigen calling step. The numbers of reference allele reads and altered allele reads at each variant locus are required for clonality inference by PyClone. We therefore analyzed only the 17 cancer types, among all available cancer types, for which the read count information has been made available.
Simulation data generation
In Fig. 1 and Extended Data Fig. 1, we generated several simulation datasets for validation of netie. The simulated clonal structures of the in silico tumors are shown in the figures. We manually set “a” for each of the three simulation studies, which were 11 (Fig. 1e), 10 (Extended Data Fig. 1a), and 10 (Extended Data Fig. 1b,c), respectively. The “ac”s were sampled from a normal distribution with “a” as the mean and a pre-determined variance of 1. “b/bc”s were simulated according to a similar procedure. The variant allele frequencies of the mutations were also manually chosen as real numbers between 0 and 1. With these, we calculated the expected number of neoantigens for each mutation (given no zero-inflation), and further used a zero-inflated Poisson distribution to randomly simulate the “observed” neoantigen load for each mutation, according to the formula laid out in Sup. Note 1. The mixing ratios, π, were set at 0.86, 0.6, and 0.6, respectively. The model took in the “observed” neoantigen counts and tried to infer the “a/ac” and the other random variables, which were compared to the simulated ground truth.
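The simulation procedure above can be sketched in Python as follows. The exact formula is given in Sup. Note 1 and is not reproduced here, so two parts of this sketch are assumptions: the log-linear link between the variant allele frequency and the expected neoantigen count, and the reading of π as the probability of a structural zero. The function names and the toy parameter values are also hypothetical and are not the settings used in this study.

```python
import math
import random


def rpois(rng, lam):
    """Knuth's Poisson sampler; adequate for the small means used in this toy."""
    threshold, k, p = math.exp(-lam), 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_tumor(a, b, vafs_by_clone, pi, sd=1.0, seed=0):
    """Simulate zero-inflated Poisson neoantigen counts for each clone."""
    rng = random.Random(seed)
    clones = []
    for vafs in vafs_by_clone:
        # Clone-level parameters are drawn around the tumor-level a and b
        # (a standard deviation of 1 corresponds to a variance of 1).
        a_c = rng.gauss(a, sd)
        b_c = rng.gauss(b, sd)
        counts = []
        for vaf in vafs:
            # ASSUMED link function: log(expected count) = a_c * VAF + b_c.
            lam = math.exp(a_c * vaf + b_c)
            # ASSUMED meaning of pi: probability of a structural zero.
            counts.append(0 if rng.random() < pi else rpois(rng, lam))
        clones.append({"a_c": a_c, "b_c": b_c, "vaf": list(vafs), "neoantigens": counts})
    return clones


# Toy settings (hypothetical): two clones with manually chosen VAFs.
toy_sim = simulate_tumor(a=1.0, b=0.0, vafs_by_clone=[[0.1, 0.5, 0.9], [0.2, 0.4]], pi=0.6)
```

The simulated counts would then be passed to the model, and the inferred “a/ac” values compared with the stored `a_c` values as ground truth.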
Gene expression data analysis
In Fig. 2, we performed differential expression analysis for the baseline patients. The log of the gene expression fold change between the INisp and DEisp groups was calculated for identifying the differentially expressed genes and their enriched pathways. In Fig. 2b, the genes were ranked and analyzed by the GOrilla webserver and the figure was made by using the “GOChord” function of the “GOplot” package. For Fig. 2d, the “fgsea” function of the “fgsea” package was used for the Gene Set Enrichment Analysis. We set the number of permutations (“nperm”) to 1000, the maximal size of a gene set to test (“maxSize”) to 500, and the minimal size of a gene set to test (“minSize”) to 15.
Statistical analyses
All computations were conducted in the R or Python programming languages. For all boxplots appearing in this study, box boundaries represent interquartile ranges, whiskers extend to the most extreme data point that is no more than 1.5 times the interquartile range from the box, and the line in the middle of the box represents the median. The GOrilla webserver (version=2013Mar8) was used to detect enriched gene ontology pathways from expression analyses(54). The enriched GO terms were visualized by circos plots, created by the GOChord function in the R GOplot (version=1.0.2) package. GSEA analyses were performed by the fgsea function in the R fgsea package (version=1.20.0). The Cristescu et al study(42) defined a T cell inflammation GEP (TIGEP) score for each patient based on an 18-gene signature profiled from the RNA-seq data. We followed their approach to calculate the TIGEP score for our patients. Survival analyses were conducted by functions from the R survival package (version=3.2-13). The forest plot was created by the forest_model function in the R forestmodel package. For the association between TIGEP and clinical response, we tested the dichotomized TIGEP against the response categories (responders or non-responders) using the chi-square test. Most immunotherapy-treated patients were grouped into responders and non-responders following their original publications’ standards. For the few cohorts where no simple responder/non-responder classification was provided in the original publication, the RECIST categories of complete response (CR), partial response (PR), stable disease (SD) and progressive disease (PD) were available, and responders were defined as patients with CR and PR, while non-responders were defined as patients with SD and PD. The only exception is the Hugo cohort(5), which divided the patients into three categories of complete response, partial response, and progressive disease.
For this cohort, patients with complete and partial responses are classified as responders and patients with progressive disease as non-responders. The bell plot, node-based tree plot, and branch-based tree plot were created by the plot.clonal.models function in the R ClonEvol package (version=0.99.11).
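The responder definitions and the chi-square test on dichotomized TIGEP described in this section can be sketched in Python as below. This is an illustration rather than the analysis code of this study: the function names and toy values are hypothetical, the median split follows the definition used for Fig. 4c, and the statistic is the plain Pearson chi-square without continuity correction, which is an assumption (the text does not state whether a correction was applied).

```python
import math
from statistics import median

# RECIST categories mapped to responder status as defined in the text.
RESPONDER_BY_RECIST = {"CR": True, "PR": True, "SD": False, "PD": False}


def contingency_table(tigep, recist):
    """2x2 table: rows are high/low TIGEP, columns are responder/non-responder."""
    cutoff = median(tigep)
    table = [[0, 0], [0, 0]]
    for score, category in zip(tigep, recist):
        row = 0 if score > cutoff else 1
        col = 0 if RESPONDER_BY_RECIST[category] else 1
        table[row][col] += 1
    return table


def chi_square_2x2(table):
    """Pearson chi-square statistic and P value (1 degree of freedom)."""
    (a, b), (c, d) = table
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("every row and column needs at least one patient")
    stat = (a + b + c + d) * (a * d - b * c) ** 2 / margins
    # For 1 degree of freedom, P(X > stat) equals erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical toy cohort.
toy_table = contingency_table([0.2, 0.4, 0.6, 0.9], ["PD", "SD", "PR", "CR"])
toy_stat, toy_pval = chi_square_2x2([[20, 5], [5, 20]])
```

For a cohort such as the Hugo cohort, which has no SD category, the same mapping applies with only the CR, PR, and PD keys in use.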
Extended Data
Extended Data Fig. 1.
Applying netie on the simulation data. Two more simulation datasets (a) and (b,c). (a) Simulation setting of the second dataset and netie’s inference results. (b) Simulation setting of the third dataset. (c) Netie’s inference results for the third dataset. The same simulation and analysis procedures, as in Fig. 1e–g, were carried out.
Extended Data Fig. 2.
Immune selection pressure variations correlate with the genotypes of the tumor and tumor clones. (a) The top genes with the smallest Wilcoxon test P values comparing the immune pressure variations in the tumor clones with and without mutations in each gene. (b) Boxplots of the immune pressure variation (ac) in the tumor clones with and without mutations in SETDB1 and FN1. (c) Enriched GO terms of the genes with Wilcoxon test P value<0.05. For the boxplots in (b), box boundaries represent interquartile ranges, whiskers extend to the most extreme data point that is no more than 1.5 times the interquartile range from the box, and the line in the middle of the box represents the median.
Extended Data Fig. 3.
The immune selection pressure scores of the shared and the private tumor clones of the other MDACC lung cancer patients with multi-region sampling. The center of the error bar represents the inferred immune pressure variation (ac), and the error bars represent 95% confidence intervals.
Extended Data Fig. 4.
Further validating the implication of netie classifications for prognosis of patients. (a,b) Multivariate analysis testing the association between TIGEP and overall survival in INisp and DEisp patients. (a) INisp patients, (b) DEisp patients. The association between TIGEP and overall survival is tested in a CoxPH model, with multivariate adjustment for pathological stage, gender, age, and tumor type.
Extended Data Fig. 5.
Analyses as conducted in Fig. 4a, but with the clonality inference conducted by PhyloWGS and SciClone (a), or with the neoantigens predicted by MHCflurry (b). The TCGA LUAD cohort was employed as an example.
Supplementary Material
ACKNOWLEDGMENTS
We acknowledge Anagha Gouru and Abigail Passey for helping with proof-reading of the manuscript.
This study was supported by the National Institutes of Health (NIH) [CCSG 5P30CA142543/TW, 1R01CA258584/TW, 5P30CA142543/TW, U01AI156189/TW, R01CA234629/JZ], Cancer Prevention Research Institute of Texas [CPRIT RP190208/TW, CPRIT RP160668/JZ, RP160668/IW], University of Texas MD Anderson Cancer Center [Physician Scientist Program/JZ, Lung Cancer Moon Shot/AR], Cancer Foundation at the University of Texas MD Anderson Cancer Center [Institutional Research Grant/AR], the Waun Ki Hong Lung Cancer Research Fund [AR], Exon 20 Group [AR], and Rexanna’s Foundation for Fighting Lung Cancer [AR]. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Footnotes
COMPETING INTERESTS
The authors declare no competing interests.
Code availability
The netie R package is available at https://github.com/tianshilu/Netie with Apache license version 2.0.
Ethics statement
Our research complies with all relevant ethical regulations stipulated by the Institutional Review Boards of UT Southwestern Medical Center (UTSW) and of The University of Texas MD Anderson Cancer Center (MDACC).
Data availability
The baseline patients are from the Cancer Genome Atlas Program (TCGA) and the kidney cancer patients from our prior study(21). The TCGA data (expression, mutation and survival) were downloaded from the TCGA firehose website (https://gdac.broadinstitute.org/, version of 2016012800.0.0). TCGA HLA typing data were provided by Thorsson et al(55). For the treatment cohorts, we gathered 32 gastric cancer patients from Cristescu et al(42) and Kim et al(45); 8 NSCLC patients from Anagnostou et al(46); 105 HNSCC patients from Cristescu et al(42); 352 melanoma patients from Cristescu et al(42), Hugo et al(5), Liu et al(47), Riaz et al(6), and Van Allen et al(48); 157 RCC patients from IMmotion150 cohort(49) and Miao et al(7). Information on access to data is provided in these original reports. The in-house MDACC patients’ raw genomic data have been uploaded into the European Genome-Phenome Archive (EGA) with the accession numbers being EGAD00001008382 and EGAD00001008482. The accession codes and links for each dataset are listed in Sup. Table 4.
Bibliography
- 1. Lu T, Wang S, Xu L, Zhou Q, Singla N, Gao J, et al. Tumor neoantigenicity assessment with CSiN score incorporates clonality and immunogenicity to predict immunotherapy outcomes. Sci Immunol. 2020. Feb 21;5(44).
- 2. Hsiehchen D, Hsieh A, Samstein RM, Lu T, Beg MS, Gerber DE, et al. DNA Repair Gene Mutations as Predictors of Immune Checkpoint Inhibitor Response beyond Tumor Mutation Burden. Cell Rep Med. 2020. Jun 23;1(3).
- 3. Chung AS, Mettlen M, Ganguly D, Lu T, Wang T, Brekken RA, et al. Immune checkpoint inhibition is safe and effective for liver cancer prevention in a mouse model of hepatocellular carcinoma. Cancer Prev Res (Phila Pa). 2020. Nov;13(11):911–22.
- 4. Lu T, Zhang Z, Zhu J, Wang Y, Jiang P, Xiao X, et al. Deep learning-based prediction of the T cell receptor-antigen binding specificity. Nat Mach Intell. 2021. Oct;3(10):864–75.
- 5. Hugo W, Zaretsky JM, Sun L, Song C, Moreno BH, Hu-Lieskovan S, et al. Genomic and Transcriptomic Features of Response to Anti-PD-1 Therapy in Metastatic Melanoma. Cell. 2016. Mar 24;165(1):35–44.
- 6. Riaz N, Havel JJ, Makarov V, Desrichard A, Urba WJ, Sims JS, et al. Tumor and Microenvironment Evolution during Immunotherapy with Nivolumab. Cell. 2017. Nov 2;171(4):934–949.e16.
- 7. Miao D, Margolis CA, Gao W, Voss MH, Li W, Martini DJ, et al. Genomic correlates of response to immune checkpoint therapies in clear cell renal cell carcinoma. Science. 2018. Feb 16;359(6377):801–6.
- 8. Hellmann MD, Nathanson T, Rizvi H, Creelan BC, Sanchez-Vega F, Ahuja A, et al. Genomic Features of Response to Combination Immunotherapy in Patients with Advanced Non-Small-Cell Lung Cancer. Cancer Cell. 2018. May 14;33(5):843–852.e4.
- 9. Matsushita H, Sato Y, Karasaki T, Nakagawa T, Kume H, Ogawa S, et al. Neoantigen load, antigen presentation machinery, and immune signatures determine prognosis in clear cell renal cell carcinoma. Cancer Immunol Res. 2016. Mar 15;4(5):463–71.
- 10. Miller A, Asmann Y, Cattaneo L, Braggio E, Keats J, Auclair D, et al. High somatic mutation and neoantigen burden are correlated with decreased progression-free survival in multiple myeloma. Blood Cancer J. 2017. Sep 22;7(9):e612.
- 11. Matsushita H, Hasegawa K, Oda K, Yamamoto S, Nishijima A, Imai Y, et al. The frequency of neoantigens per somatic mutation rather than overall mutational load or number of predicted neoantigens per se is a prognostic factor in ovarian clear cell carcinoma. Oncoimmunology. 2017. Jun 16;6(8):e1338996.
- 12. Gettinger S, Choi J, Hastings K, Truini A, Datar I, Sowell R, et al. Impaired HLA class I antigen processing and presentation as a mechanism of acquired resistance to immune checkpoint inhibitors in lung cancer. Cancer Discov. 2017. Dec;7(12):1420–35.
- 13. Sato Y, Yoshizato T, Shiraishi Y, Maekawa S, Okuno Y, Kamura T, et al. Integrated molecular analysis of clear-cell renal cell carcinoma. Nat Genet. 2013. Aug;45(8):860–7.
- 14. Jiang T, Shi T, Zhang H, Hu J, Song Y, Wei J, et al. Tumor neoantigens: from basic research to clinical applications. J Hematol Oncol. 2019. Sep 6;12(1):93.
- 15. Blass E, Ott PA. Advances in the development of personalized neoantigen-based therapeutic cancer vaccines. Nat Rev Clin Oncol. 2021. Apr;18(4):215–29.
- 16. Li S, Simoni Y, Zhuang S, Gabel A, Ma S, Chee J, et al. Characterization of neoantigen-specific T cells in cancer resistant to immune checkpoint therapies. Proc Natl Acad Sci USA. 2021. Jul 27;118(30).
- 17. Roth A, Khattra J, Yap D, Wan A, Laks E, Biele J, et al. PyClone: statistical inference of clonal population structure in cancer. Nat Methods. 2014. Apr;11(4):396–8.
- 18. Deshwar AG, Vembu S, Yung CK, Jang GH, Stein L, Morris Q. PhyloWGS: reconstructing subclonal composition and evolution from whole-genome sequencing of tumors. Genome Biol. 2015. Feb 13;16:35.
- 19. Miller CA, White BS, Dees ND, Griffith M, Welch JS, Griffith OL, et al. SciClone: inferring clonal architecture and tracking the spatial and temporal patterns of tumor evolution. PLoS Comput Biol. 2014. Aug 7;10(8):e1003665.
- 20. Łuksza M, Riaz N, Makarov V, Balachandran VP, Hellmann MD, Solovyov A, et al. A neoantigen fitness model predicts tumour response to checkpoint blockade immunotherapy. Nature. 2017. Nov 23;551(7681):517–20.
- 21. Wang T, Lu R, Kapur P, Jaiswal BS, Hannan R, Zhang Z, et al. An empirical approach leveraging tumorgrafts to dissect the tumor microenvironment in renal cell carcinoma identifies missing link to prognostic inflammatory factors. Cancer Discov. 2018. Sep;8(9):1142–55.
- 22. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA. 2005. Oct 25;102(43):15545–50.
- 23. Yi JS, Cox MA, Zajac AJ. T-cell exhaustion: characteristics, causes and conversion. Immunology. 2010. Apr 1;129(4):474–81.
- 24. Mehta AK, Gracias DT, Croft M. TNF activity and T cells. Cytokine. 2018. Jan;101:14–8.
- 25. Anderson AC. Tim-3, a negative regulator of anti-tumor immunity. Curr Opin Immunol. 2012. Apr;24(2):213–6.
- 26. Yang Z-Z, Kim HJ, Villasboas JC, Chen Y-P, Price-Troska T, Jalali S, et al. Expression of LAG-3 defines exhaustion of intratumoral PD-1+ T cells and correlates with poor outcome in follicular lymphoma. Oncotarget. 2017. Sep 22;8(37):61425–39.
- 27. Zhang B, Xu C, Liu J, Yang J, Gao Q, Ye F. Nidogen-1 expression is associated with overall survival and temozolomide sensitivity in low-grade glioma patients. Aging (Albany NY). 2021. Mar 18;13(6):9085–107.
- 28. Yang C, Wu S, Mou Z, Zhou Q, Zhang Z, Chen Y, et al. Transcriptomic Analysis Identified ARHGAP Family as a Novel Biomarker Associated With Tumor-Promoting Immune Infiltration and Nanomechanical Characteristics in Bladder Cancer. Front Cell Dev Biol. 2021. Jul 7;9:657219.
- 29. Feng Q, Li L, Li M, Wang X. Immunological classification of gliomas based on immunogenomic profiling. J Neuroinflammation. 2020. Nov 27;17(1):360.
- 30. Griffin GK, Wu J, Iracheta-Vellve A, Patti JC, Hsu J, Davis T, et al. Epigenetic silencing by SETDB1 suppresses tumour intrinsic immunogenicity. Nature. 2021. Jul;595(7866):309–14.
- 31. Lyu X, Li G, Qiao Q. Identification of an immune classification for cervical cancer and integrative analysis of multiomics data. J Transl Med. 2021. May 10;19(1):200.
- 32. Chen X, Su C, Ren S, Zhou C, Jiang T. Pan-cancer analysis of KEAP1 mutations as biomarkers for immunotherapy outcomes. Ann Transl Med. 2020. Feb;8(4):141.
- 33. Biton J, Mansuet-Lupo A, Pécuchet N, Alifano M, Ouakrim H, Arrondeau J, et al. TP53, STK11, and EGFR Mutations Predict Tumor Immune Profile and the Response to Anti-PD-1 in Lung Adenocarcinoma. Clin Cancer Res. 2018. Nov 15;24(22):5710–23.
- 34. Xie Z, Bailey A, Kuleshov MV, Clarke DJB, Evangelista JE, Jenkins SL, et al. Gene Set Knowledge Discovery with Enrichr. Curr Protoc. 2021. Mar;1(3):e90.
- 35. Voronov E, Apte RN. Targeting the Tumor Microenvironment by Intervention in Interleukin-1 Biology. Curr Pharm Des. 2017;23(32):4893–905.
- 36. Dinarello CA. Interleukin-1 Mediated Autoinflammation from Heart Disease to Cancer. In: Hashkes PJ, Laxer RM, Simon A, editors. Textbook of Autoinflammation. Cham: Springer International Publishing; 2019. p. 711–25.
- 37. Dang HX, White BS, Foltz SM, Miller CA, Luo J, Fields RC, et al. ClonEvol: clonal ordering and visualization in cancer sequencing. Ann Oncol. 2017. Dec 1;28(12):3076–82.
- 38. Liu B, Hu X, Feng K, Gao R, Xue Z, Zhang S, et al. Temporal single-cell tracing reveals clonal revival and expansion of precursor exhausted T cells during anti-PD-1 therapy in lung cancer. Nat Cancer. 2022. Jan;3(1):108–21.
- 39. Barnes TA, Amir E. HYPE or HOPE: the prognostic value of infiltrating immune cells in cancer. Br J Cancer. 2018. Jan 9;118(2):e5.
- 40. Melero I, Rouzaut A, Motz GT, Coukos G. T-cell and NK-cell infiltration into solid tumors: a key limiting factor for efficacious cancer immunotherapy. Cancer Discov. 2014. May;4(5):522–6.
- 41. Zuo S, Wei M, Wang S, Dong J, Wei J. Pan-Cancer Analysis of Immune Cell Infiltration Identifies a Prognostic Immune-Cell Characteristic Score (ICCS) in Lung Adenocarcinoma. Front Immunol. 2020. Jun 30;11:1218.
- 42. Cristescu R, Mogg R, Ayers M, Albright A, Murphy E, Yearley J, et al. Pan-tumor genomic biomarkers for PD-1 checkpoint blockade-based immunotherapy. Science. 2018. Oct 12;362(6411).
- 43. Şenbabaoğlu Y, Gejman RS, Winer AG, Liu M, Van Allen EM, de Velasco G, et al. Tumor immune microenvironment characterization in clear cell renal cell carcinoma identifies prognostic and immunotherapeutically relevant messenger RNA signatures. Genome Biol. 2016. Nov 17;17(1):231.
- 44. O’Donnell TJ, Rubinsteyn A, Laserson U. MHCflurry 2.0: Improved Pan-Allele Prediction of MHC Class I-Presented Peptides by Incorporating Antigen Processing. Cell Syst. 2020. Oct 21;11(4):418–9.
- 45. Kim ST, Cristescu R, Bass AJ, Kim K-M, Odegaard JI, Kim K, et al. Comprehensive molecular characterization of clinical responses to PD-1 inhibition in metastatic gastric cancer. Nat Med. 2018. Sep;24(9):1449–58.
- 46. Anagnostou V, Smith KN, Forde PM, Niknafs N, Bhattacharya R, White J, et al. Evolution of Neoantigen Landscape during Immune Checkpoint Blockade in Non-Small Cell Lung Cancer. Cancer Discov. 2017. Mar;7(3):264–76.
- 47. Liu D, Schilling B, Liu D, Sucker A, Livingstone E, Jerby-Arnon L, et al. Integrative molecular and clinical modeling of clinical outcomes to PD1 blockade in patients with metastatic melanoma. Nat Med. 2019. Dec 2;25(12):1916–27.
- 48. Van Allen EM, Miao D, Schilling B, Shukla SA, Blank C, Zimmer L, et al. Genomic correlates of response to CTLA-4 blockade in metastatic melanoma. Science. 2015. Oct 9;350(6257):207–11.
- 49. McDermott DF, Huseni MA, Atkins MB, Motzer RJ, Rini BI, Escudier B, et al. Clinical activity and molecular correlates of response to atezolizumab alone or in combination with bevacizumab versus sunitinib in renal cell carcinoma. Nat Med. 2018. Jun 4;24(6):749–57.
- 50. Cristescu R, Lee J, Nebozhyn M, Kim K-M, Ting JC, Wong SS, et al. Molecular analysis of gastric cancer identifies subtypes associated with distinct clinical outcomes. Nat Med. 2015. May;21(5):449–56.
- 51. Mitchell TJ, Turajlic S, Rowan A, Nicol D, Farmery JHR, O’Brien T, et al. Timing the landmark events in the evolution of clear cell renal cell cancer: TRACERx Renal. Cell. 2018. Apr 19;173(3):611–623.e17.
- 52. Salichos L, Meyerson W, Warrell J, Gerstein M. Estimating growth patterns and driver effects in tumor evolution from individual samples. Nat Commun. 2020. Feb 5;11(1):732.
- 53. Lu T, Park S, Zhu J, Wang Y, Zhan X, Wang X, et al. Overcoming Expressional Drop-outs in Lineage Reconstruction from Single-Cell RNA-Sequencing Data. Cell Rep. 2021. Jan 5;34(1):108589.
- 54. Eden E, Navon R, Steinfeld I, Lipson D, Yakhini Z. GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists. BMC Bioinformatics. 2009. Feb 3;10:48.
- 55. Thorsson V, Gibbs DL, Brown SD, Wolf D, Bortone DS, Ou Yang T-H, et al. The immune landscape of cancer. Immunity. 2018. Apr 17;48(4):812–830.e14.
Data Availability Statement
The baseline patients are from The Cancer Genome Atlas (TCGA) program, and the kidney cancer patients are from our prior study(21). The TCGA data (expression, mutation, and survival) were downloaded from the TCGA Firehose website (https://gdac.broadinstitute.org/, version 2016012800.0.0). TCGA HLA typing data were provided by Thorsson et al(55). For the treatment cohorts, we gathered 32 gastric cancer patients from Cristescu et al(42) and Kim et al(45); 8 NSCLC patients from Anagnostou et al(46); 105 HNSCC patients from Cristescu et al(42); 352 melanoma patients from Cristescu et al(42), Hugo et al(5), Liu et al(47), Riaz et al(6), and Van Allen et al(48); and 157 RCC patients from the IMmotion150 cohort(49) and Miao et al(7). Information on data access is provided in these original reports. The raw genomic data of the in-house MDACC patients have been deposited in the European Genome-Phenome Archive (EGA) under accession numbers EGAD00001008382 and EGAD00001008482. The accession codes and links for each dataset are listed in Sup. Table 4.