Author manuscript; available in PMC: 2023 May 12.
Published in final edited form as: Nat Genet. 2022 Jul 11;54(7):996–1012. doi: 10.1038/s41588-022-01108-w

Functional Landscapes of POLE and POLD1 Mutations in Checkpoint Blockade-Dependent Anti-Tumor Immunity

Xiaoxiao Ma 1,2, Nadeem Riaz 3, Robert M Samstein 4,5, Mark Lee 6, Vladimir Makarov 1,2, Cristina Valero 6, Diego Chowell 5,7,8, Fengshen Kuo 9, Douglas Hoen 1,2, Conall WR Fitzgerald 6, Hui Jiang 9,10, Jonathan Alektiar 10, Tyler J Alban 1,2, Ivan Juric 1, Prerana Bangalore Parthasarathy 1, Yu Zhao 1, Erich Y Sabio 10, Richa Verma 1,2, Raghvendra M Srivastava 1,2, Lynda Vuong 9,10, Wei Yang 10, Xiao Zhang 1,2, Jingming Wang 6, Lawrence K Chu 1,11, Stephen L Wang 1,12, Daniel W Kelly 13, Xin Pei 3, Jiapeng Chen 3, Rona Yaeger 14, Dmitriy Zamarin 15, Ahmet Zehir 16, Mithat Gönen 17, Luc GT Morris 6, Timothy A Chan 1,2,3,18,19
PMCID: PMC10181095  NIHMSID: NIHMS1873994  PMID: 35817971

Abstract

Defects in pathways governing genomic fidelity have been linked to improved response to immune checkpoint blockade therapy (ICB). Pathogenic POLE/POLD1 mutations can cause hypermutation, yet how diverse mutations in POLE/POLD1 influence anti-tumor immunity following ICB is unclear. Here, we comprehensively determined the effect of POLE/POLD1 mutations in ICB and elucidated the mechanistic impact of these mutations on tumor immunity. Murine syngeneic tumors harboring Pole/Pold1 functional mutations displayed enhanced anti-tumor immunity and were sensitive to ICB. Patients with POLE/POLD1 mutated tumors harboring telltale mutational signatures respond better to ICB than patients harboring wild-type or signature-negative tumors. A mutant POLE/D1 function-associated signature-based model out-performed several traditional approaches for identifying POLE/POLD1 mutated patients that benefit from ICB. Strikingly, the spectrum of mutational signatures correlates with the biochemical features of neoantigens. Alterations that cause POLE/POLD1 function-associated signatures generate TCR-contact residues with increased hydrophobicity, potentially facilitating T-cell recognition. Altogether, the functional landscapes of POLE/POLD1 mutations shape immunotherapy efficacy.


Although immune checkpoint blockade (ICB) therapy is effective in multiple cancer types, durable response to ICB is still limited to a minority of patients1,2. Recent studies from us and others have revealed that pathogenic alterations in DNA damage repair pathways (DDR) are associated with improved response to ICB3-7. For instance, mismatch repair deficiency (MMRd), which results in microsatellite instability (MSI), is an FDA-approved indication for anti-PD1 therapy regardless of cancer type8. Mechanistically, MMRd increases tumor mutational burden (TMB) and indel load, which contribute to an elevated level of neoantigens, promoting anti-tumor immunity9.

Polymerase epsilon and delta (POLE and POLD1, hereafter called POLE/D1) are DDR genes that are genetically altered in nearly 4% of tumors across all cancer types, with 10-13% of these cases due to germline variants10. These proteins are catalytic subunits of DNA polymerases that are responsible for DNA synthesis during cell division and DNA damage repair11. Certain pathogenic mutations occurring both within and outside of the exonuclease domains of POLE/D1 can result in a hypermutator phenotype12-14. Recent studies reported that tumors harboring POLE/D1 pathogenic mutations are highly infiltrated with immune cells, suggesting these mutations may lead to improved immune recognition, which potentially correlates with favorable response to ICB15-19; however, the functional significance of most of these mutations, including their effects on inducing genomic alterations, role in anti-tumor immunity, and modulation of immunotherapy response, remains unclear12,20-22.

Well-known functional mutations (i.e. mutations that perturb function) in the exonuclease domains of POLE and POLD1 can produce telltale mutational signatures12,23,24. The COSMIC single-base substitution (SBS) signatures SBS10a/b are associated with proof-reading defects of polymerases in samples that have intact mismatch repair machinery, while SBS14/20 are associated with concurrent POLE/D1 mutations and MMRd. There is an opportunity to identify and characterize other pathogenic POLE/D1 mutations and their impact on tumor immune surveillance and immunotherapy outcomes based on the mutational signatures associated with POLE/D1 functional mutations25.

Here, we present a comprehensive evaluation of the functional implications of the mutational landscape associated with POLE/D1 alterations, including their role in inducing specific mutational patterns, altering the immune microenvironment, and shaping ICB response.

Results

Pole/Pold1 functional alterations directly sensitize tumors to ICB

Human POLE/POLD1 and mouse Pole/Pold1 are highly homologous, with 91% and 90% amino acid similarity, respectively (Supplementary fig. 1&2). We first introduced a well-established PoleP286R hotspot functional mutation22,26,27 into the murine B16F10 melanoma cell line, using the Clustered Regularly Interspaced Short Palindromic Repeats-Homology-directed Repair (CRISPR-HDR) technique (Extended data fig. 1a; Supplementary fig. 1&3)28. We detected a 4.7-fold increase in de novo single nucleotide variants (SNVs; Extended data fig. 1a&b & Methods) in the mutant cell lines compared to parental cell lines after 8 weeks of in vitro passaging (P=1.7e-5; Fig. 1a, Extended data fig. 1c). New indels (P=0.82), somatic copy number variations (SCNVs, P=0.74) and the MSIsensor score29 (P=0.68) remained similar (Fig. 1b). These results are consistent with observations in patients30,31.

Figure 1. Mouse tumors harboring Pole/Pold1 functional mutations are sensitive to immunotherapy.


a, SNV accumulation in B16F10 parental and PoleP286R mutant cell lines after 8 weeks of in vitro cell passaging compared to before passaging (N=3 biological replicates). Two-sided P=1.7e-5 was derived from Student’s t-test (*** P<0.005). b, Changes in insertions/deletions, copy number alteration and MSIsensor score in the B16F10 parental and PoleP286R mutant cell lines after 8 weeks of in vitro cell passaging. FGA, fraction of genome with copy number alteration (N=3 biological replicates). Two-sided P values (InDel P=0.82; SCNVs P=0.74; MSIsensor score P=0.68) were derived from Student’s t-tests (n.s., no statistical significance). c, Schematic of immunotherapy experiments with murine models. Parental and Pole mutant cell lines after 8 weeks of in vitro passaging were implanted into animals and treated with ICB. Tumor volume was monitored until the end point or when the tumor was no longer identifiable. d, Tumor growth curves of the B16F10 parental cell line with ICB alone or in combination. Representative results from two independent experiments (N=15 mice per group). P values (anti-CTLA4, P=0.67; anti-PD1, P=0.61; Combo, P=0.007). e, Tumor growth of the B16F10 PoleP286R mutant cell line with ICB alone or in combination. Representative results from two independent experiments (N=15 mice per group). P values (anti-CTLA4, P=0.002; anti-PD1, P=0.003; Combo, P=0.0009). f, Immunofluorescence analysis of Cd3+ T cells in the B16F10 parental and PoleP286R tumors after two weeks of immunotherapy; scale bars represent 50 µm. g, Immunofluorescence staining from (f) was quantified (N= at least 15 independent fields). Dots represent individual fields. For comparison between two treatments, P values (Mutant IgG vs anti-CTLA4 P=0.0079; Mutant IgG vs anti-PD1 P=5.7e-6; Mutant IgG vs Combo P=4.1e-8; Parental IgG vs Combo P=9.7e-8) indicate two-sided Student’s t-tests. The comparison between parental and mutant tumors indicates two-way ANOVA tests (P<0.001).
h, Tumor growth curves of the CT-26 parental colorectal cell line with single and combination ICB. Representative results from two independent experiments (N=15 mice per group). P values (anti-CTLA4, P=8.5e-14; anti-PD1, P=0.019; Combo, P=1.3e-14). i, The CT-26 PoleP286R colorectal cell line is more sensitive to anti-PD1 therapy than the parental CT-26 cell line. Representative results from two independent experiments (N=15 mice per group). P values (anti-CTLA4, P=1.3e-6; anti-PD1, P=3.2e-7; Combo, P=0.0009). j, Immunofluorescence analysis of Cd3+ T cells in the CT-26 parental and PoleP286R tumors after two weeks of anti-PD1 therapy; scale bars represent 50 µm. k, Immunofluorescence staining from (j) was quantified (N= at least 15 independent fields). Dots represent individual fields. P values (Mutant IgG vs anti-PD1 P=0.0002; Parental anti-PD1 vs Mutant anti-PD1 P=0.0005; Parental IgG vs Mutant IgG P=0.0022; Parental IgG vs Parental anti-PD1 P=0.09). l, Tumor growth curves of the isogenic B16F10 cell lines harboring PoleV411L with anti-PD1 or IgG therapy (N=15 mice per group). P=0.0002. m, Tumor growth curves of isogenic B16F10 cell lines harboring Pold1L472P or Pold1E372K mutations after treatment with anti-PD1 therapy or IgG control (N=15 mice per group). P values (Pold1L472P P=1.4e-4; Pold1E372K P=0.0051). For all panels, data are presented as mean values ± s.e.m. with no multiple comparison adjustment performed (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005). For all growth curve panels (d-e, h-i, l-m), P values indicate two-sided Student’s t-tests at the end time points.

The growth of the B16F10 PoleP286R tumors was dramatically reduced by either mono- or combination ICB in the C57BL/6J host (anti-CTLA4, P=0.002; anti-PD1, P=0.003; Combo, P=0.0009; Fig. 1c-e, Methods), and this was validated in a separate B16F10 PoleP286R clone (anti-CTLA4, P=0.0003; anti-PD1, P=0.0002; Combo, P=0.0001; Fig. 1e, Extended data fig. 1d). Conversely, only combination treatment moderately delayed the growth of the parental tumors (P=0.007, Fig. 1d). All three types of ICB improved the overall survival of the mice bearing the PoleP286R tumors, while only combination therapy modestly extended the survival of mice bearing parental tumors (Extended data fig. 1e). Immunofluorescence staining of ICB-treated tumors showed at least a 3-fold increase in Cd3+ T cell infiltration in PoleP286R tumors (Fig. 1f-g), and this increase was seen even in the IgG treatment arm, suggesting that the mutant tumors are more inflamed even before therapy and poised to respond to ICB. A similar response to ICB and a similar immune infiltration pattern were observed in CT-26 mouse colorectal tumors harboring the PoleP286R mutation and B16F10 tumors harboring the PoleV411L hotspot functional mutation12 (Fig. 1h-l; Extended data fig. 1f-I; Supplementary fig. 3&4; Supplementary note 1), suggesting that functional mutations in Pole likely induce sensitivity to ICB.

We investigated whether Pold1 pathogenic mutations also directly drive response to ICB. We generated B16F10 mutant cell lines harboring the Pold1L472P functional mutation32 or the Pold1E372K variant of unknown significance (hereafter referred to as VUS), which is associated with ICB response as well as a POLD1-dependent mutational signature in patients6 (Supplementary fig. 2&5, Supplementary note 2). Tumors harboring these two Pold1 mutations are also more sensitive to anti-PD1 therapy (Pold1L472P, P=1.4e-4; Pold1E372K, P=0.0051; Fig. 1m). These results indicate that Pole/Pold1 functional mutations or VUSes that generate mutational signatures associated with POLE/POLD1 activity (SBS10a, SBS10b, SBS14, and/or SBS20; hereafter called function-associated signatures) influence sensitivity to ICB.

Pole functional mutations enhance tumor immunogenicity

To elucidate how POLE functional mutations influence the immune microenvironment of tumors prior to ICB, we implanted B16F10 PoleP286R mutant and parental cell lines subcutaneously into nude and C57BL/6J recipient mice. Mutant and parental tumors showed similar growth kinetics in immunodeficient nude mice, but mutant tumors showed delayed growth in the immunocompetent C57BL/6J recipients, consistent with a stronger effect of immunosurveillance against mutant tumors (Fig. 2a-b). RNA-seq analysis of treatment-naive parental and mutant tumors harvested at 14 days post-implantation revealed distinct gene expression profiles (Extended data fig. 2a). Gene set enrichment analysis (GSEA) indicated that gene sets related to T-cell infiltration, natural killer (NK) cell infiltration, adaptive immune response, and inflammation are highly enriched in mutant tumors versus the parental tumors (Fig. 2c, Extended data fig. 2a-b).

Figure 2. The immune microenvironment of tumors harboring PoleP286R functional mutations.


a, Tumor growth curves of the parental and B16F10 PoleP286R mutant tumors in immuno-deficient nude mice. Representative results from two independent experiments (N=10 mice per group). P values from two-sided Student’s t-tests at the end time points (n.s., no statistical significance). b, Tumor growth curves of the parental and B16F10 PoleP286R mutant tumors in immunocompetent B6 mice. Representative results from two independent experiments (N=10 mice per group). P=0.047 indicate two-sided Student’s t-tests at the end time points (* p<0.05). c, Summary of GSEA analysis on MsigDB hallmark gene sets comparing the mutant vs. parental gene expression profiles. NES, normalized enrichment score; FWER.p.val, the family-wise error rate p value. d, Flow cytometry quantification of the Cd45+ immune cell in B16F10 parental and PoleP286R mutant tumors 14 days post implantation (N=6 biological replicates). P=0.002 indicate two-sided Student’s t-tests at the end time points (n.s., no statistical significance, ** p<0.01). e, Flow cytometry analysis of the T cell populations in the B16F10 parental and PoleP286R mutant tumors 14 days post implantation (N=6 biological replicates). Gzmb, Granzyme B protein. Treg, regulatory T cells. P values (Cd8+ T cell P=0.002; Ki67+ Cd8 T cell P=0.018; Gzmb+Cd8 T cell P=0.015; Treg P=0.0087) indicate two-sided Student’s t-tests. f, Flow cytometry analysis of the NK cell population in the parental and mutant tumors (N=6 biological replicates). P values (NK cell P=0.0053, Gzmb+ NK cell P=0.0062) indicate two-sided Student’s t-tests at the end time points. g, Flow cytometry analysis of the innate immune components in parental or PoleP286R tumors 14 days after implanting into B6 mice (N=6 biological replicates). 
TAM, tumor associated macrophage; m-MDSC, monocytic myeloid derived suppressor cell, determined by Cd11b+Ly6G-Ly6Chi phenotype; g-MDSC, granulocytic myeloid derived suppressor cell, determined by Cd11b+Ly6G+Ly6Cint phenotype; P values (TAM P=0.0043, m-MDSC P=0.0022, g-MDSC P=0.093) indicate two-sided Student’s t-tests. h, PCA analysis based on immune cell type signature enrichment of post-ICB tumors. Colored areas indicate confidence ellipses. P values were derived from PERMANOVA test on the first two PCs. i, Normalized enrichment scores of the CD8+ effector memory T-cells (Tem) and CD8+ central memory T-cells (Tcm) in post-ICB tumors (N=3 biological replicates). P values (CD8 Tem P=7.7e-4; CD8 Tcm P=3.2e-5) were derived from two-way ANOVA tests. j, Normalized enrichment score of the NK cells and dendritic cells (DC) in post-ICB tumors (N=3 biological replicates). P values (NK P=3.4e-5; DC P=8.6e-6) were derived from two-way ANOVA tests. For all panels except c&h, data are presented as mean values ± s.e.m. with no multiple comparison adjustment performed (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005). For boxplots in i-j, the minima (0% percentile) and maxima (100% percentile) were plotted as the whiskers, the 25% and 75% percentiles were plotted as the bounds of the boxes, and medians were plotted as the center bar.

We further performed flow cytometry analysis on the single cell suspension extracted from mutant and parental tumors 14 days after the initial implantation into B6 recipient mice. PoleP286R mutant tumors showed a ~2.5-fold increase in Cd45+ immune cell influx over parental tumors (P=0.002; Fig. 2d), and this was validated by the expression of Ptprc from RNA-seq data (P=0.0008, Extended data fig. 2d). Compared to parental tumors, mutant tumors showed a ~5-fold higher percentage of Cd8+ T-cell infiltration (P=0.002) and reduced proportions of immuno-suppressive regulatory T-cells (Treg; P=0.0087, Fig. 2e). Cd8+ T cells in the mutant tumors had increased proliferation (Ki67+, P=0.018) and were potentially activated, as assessed by the cytotoxic marker Gzmb (P=0.015, Fig. 2e). There was also a higher fraction of Pd1+ Cd8+ but not Lag3+ or Tigit+ Cd8+ T cells in the mutant tumors (P=0.026, Extended data fig. 2e). Additionally, mutant tumors showed significant differences in the composition of the innate immune compartment when compared to the parental tumors, with increased infiltration of natural killer cells (NK, P=0.0053) and reduced fractions of tumor associated macrophages (TAM, P=0.0056) and myeloid-derived suppressor cells (MDSCs, P=0.0022, Fig. 2f-g). TAMs in mutant tumors demonstrated lower expression of Cd204 (P=0.043) and Cd206 (P=0.0087), markers for M2 polarization, indicating a more inflammatory microenvironment (Extended data fig. 2f). Along with the increased Pd1+ Cd8+ T cell population, TAMs in mutant tumors also expressed higher levels of Pd-l1 (P=0.045, Extended data fig. 2f).

We next explored how Pole functional mutations influence the immune microenvironment of tumors after ICB. Gene expression profile analysis showed that immune related pathways were upregulated in post-ICB PoleP286R tumors (Extended data fig. 3a-e, Supplementary note 3). We inferred the presence of adaptive and innate immune cell types in post-ICB samples via single sample GSEA based cell type specific gene signatures enrichment33 and performed principal component (PC) analysis (Extended data fig. 4a&b). The first two PCs explained 46.49% and 15.59% of the variation observed (Fig. 2h, Extended data fig. 4b). Based on these two PCs, significant separation between the PoleP286R and parental tumors was observed in the permutational multivariate analysis of variance (PERMANOVA test; P<0.001; Fig. 2h), contributed by both adaptive and innate immune cell types (Fig. 2i-j, Extended data fig. 4d&e; Supplementary note 3). Analysis of the TCR-beta CDR3 clonotypes also showed higher levels of clonal expansion, richness and reduced evenness in post-ICB PoleP286R tumors (Extended data fig. 4f-h, Supplementary note 3). Altogether, these results indicated that Pole functional mutations alter both the adaptive and innate immune compartments and contribute to more inflamed tumor immune microenvironments before and after ICB.

PoleP286R show similar mutational signatures to human POLE tumors

Pathogenic POLE/D1 mutations generate somatic mutations following specific patterns, e.g., C>A transversions in a TCT trinucleotide context, C>T transitions in a TCG trinucleotide context or T>G transversions in a TTT trinucleotide context23,24. As we found that isogenic PoleP286R cell lines can accumulate SNVs during in vitro culturing (Fig. 1a), we wanted to explore these mutation patterns. As anticipated, the six SNV category profiles of the parental and mutant cell lines before in vitro culturing were similar (‘baseline mutations’; Fig. 3a; Extended data fig. 5a). In contrast, de novo mutations in the mutant cell lines showed a higher percentage of ‘C>T’ (1.2-fold, P=0.005) and ‘T>G’ (2.4-fold, P=0.007) mutations, indicating that pathogenic Pole mutations can induce a distinct mutational pattern (Fig. 3a; Extended data fig. 5a).
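As an illustration of the conventions used above, the following minimal Python sketch shows how an SNV is commonly collapsed into one of the six pyrimidine-referenced substitution classes and into one of the 96 trinucleotide channels. The function names and the input format (a reference-strand trinucleotide plus the alternate base) are assumptions made for this example and are not taken from the authors' pipeline.

```python
# Illustrative sketch only: helper names and input format are hypothetical.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def classify_snv(context, alt):
    """Map an SNV to its six-class label and 96-channel label.

    context: trinucleotide with the mutated reference base in the middle.
    alt: the alternate base observed at the middle position.
    """
    ref = context[1]
    if ref in "AG":
        # Purine reference: describe the event on the opposite strand so
        # that the reference base is always a pyrimidine (C or T).
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
        ref = context[1]
    six_class = f"{ref}>{alt}"
    channel = f"{context[0]}[{six_class}]{context[2]}"
    return six_class, channel


# A C>A change in a TCT context, and the same event seen on the other strand.
print(classify_snv("TCT", "A"))
print(classify_snv("AGA", "T"))
```

Both calls describe the same event, so they map to the same channel. Counting channels across all SNVs of a sample yields the 96-element profile on which signature analyses operate.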

Figure 3. Dissecting the de novo mutational signatures in the B16F10 PoleP286R model.


a, Analysis of the six classes of base substitutions of the baseline and de novo mutations in the B16F10 parental and PoleP286R mutant cell lines indicating altered mutational profile after introduction of the PoleP286R mutation (N=3). Baseline, SNVs found in the parental or PoleP286R mutant cell lines before 8 weeks of cell culture. De novo, SNVs found only in parental or PoleP286R mutant cell lines after 8 weeks of cell culture. P values (C>T P=0.005, T>G P=0.007) indicate two-sided Student’s t-tests (** P<0.01). Data are presented as mean values ± s.e.m. b, Contribution of the three NMF extracted de novo mutation signatures to the baseline and de novo mutations in the B16F10 parental and PoleP286R mutant cell lines showed that mSig. B exclusively contributed to the de novo mutations discovered in the B16F10 PoleP286R mutant cell lines. c, 96 base substitutions in trinucleotide sequence contexts of the three NMF extracted de novo mutational signatures from the baseline and de novo mutations of the B16F10 parental and PoleP286R mutant cell lines. d, Analysis of cosine similarity of the three NMF extracted de novo mutation signatures with the established human cancer mutational signatures from COSMIC SBS signatures v3 indicates a high similarity between mSig. B and the known POLE/D1 function-associated mutational signature SBS10b. COSMIC SBS signatures were clustered based on their associated biological processes. POLE/D1, SBS mutational signatures associated with POLE/D1 functional mutations; MMRD, SBS mutational signatures related to mismatch repair deficiency; Clock-like, SBS mutational signatures related to clock-like mutational processes; Sequencing artifacts, SBS signatures possibly generated by sequencing artifacts; Other signatures, SBS signatures associated with all other biological processes.

We next generated de novo mutational signatures using non-negative matrix factorization (NMF) of the 96-trinucleotide-context profiles of baseline and de novo SNVs from 0-week and 8-week passaged cell lines (Extended data fig. 5a, Methods)34. Three de novo mutational signatures were identified, designated mSig. A, mSig. B and mSig. C (Fig. 3b). mSig. B, predominantly harboring C>T transitions in a TCG context, is strongly detected in de novo mutations from the mutant cell lines, and closely resembles COSMIC SBS10b (cosine similarity of 0.82), one of the four characteristic single-base substitution (SBS) signatures observed in human tumors with POLE/D1 pathogenic mutations (Fig. 3b-c)24. mSig. A&C are signatures present in baseline or de novo mutations identified from the parental tumors. They showed distinct profiles and are potentially associated with other mutational processes (Fig. 3b-c; Supplementary note 4). With additional validation (Extended data fig. 5b-e; Supplementary note 5), these results indicated that the PoleP286R mutation generated a de novo SNV profile in murine cell lines similar to that observed in patient samples harboring POLE functional mutations.
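The cosine similarity used above to compare an extracted signature with a COSMIC reference can be sketched as follows. This is a minimal illustration, assuming each signature is represented as a vector of non-negative weights over the same ordered set of 96 channels; the short vectors in the usage lines are toy values, not real signature profiles.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 4-channel profiles (hypothetical values for illustration only).
extracted = [0.70, 0.10, 0.15, 0.05]
reference = [0.60, 0.05, 0.30, 0.05]
print(round(cosine_similarity(extracted, reference), 3))
```

For non-negative profiles the value lies between 0 (no shared channels) and 1 (identical shape), and it is insensitive to the total number of mutations because each vector is normalized by its length.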

Signature-based model identifies tumors with functional POLE/D1 mutations

Next, we wanted to investigate more systematically the association of POLE/D1 functional mutations with response to immunotherapy. We reasoned that truly functional mutations will generate specific mutational signatures, as we observed in the murine cell lines, and thus sought to develop a logistic regression model to predict whether a tumor sample harbors POLE/D1 functional mutations based on the four POLE/POLD1-associated COSMIC SBS signatures (hereafter function-associated signatures).

We trained our model on a training set containing samples with known somatic POLE/D1 functional mutations with functional validation from previous literature and the OncoKB database35 (N=74 samples; Methods; Supplementary table 1), or POLE/D1 wild-type tumor samples with at least 20 SNVs (N=8757 samples) detected by whole exome sequencing (WES) from the Cancer Genome Atlas Program (TCGA) pan-cancer cohort (Fig. 4a; Extended data fig. 6a; Methods)12,35,36. The logistic regression model (WES-model) showed a high AUC (0.9870; 95%CI: 0.9697-1) on the training set (Fig. 4b). We selected an optimal cutoff for balance between sensitivity and specificity based on the Youden Index (Cutoff=0.6274, Extended data fig. 6b)37 and reached an overall accuracy of 0.9852 (95%CI: 0.9824-0.9876) with high sensitivity and specificity (Sensitivity=0.9459, 95%CI: 0.8673-0.9851; Specificity=0.9855, 95%CI: 0.9828-0.9879; Fig. 4b). We then tested our model on a test set composed of additional tumor samples from the International Cancer Genome Consortium (ICGC) and the Cancer Cell Line Encyclopedia (CCLE) datasets with defined functional POLE/D1 mutation status from somatic mutation calling and harboring at least 20 SNVs (functional N=16 samples, wild-type N=7822 samples; Extended data fig. 6c; Methods)38,39. The WES-model led to an AUC of 0.8929 (95%CI: 0.7806-1) and an overall accuracy of 0.9908 (95%CI: 0.9884-0.9928; Fig. 4c). We found that the false-negative predictions are associated with earlier MMRd events or other dominant mutation processes in tumors, while the false-positive predictions may be associated with low SNV load caused by genetic or environmental factors distinct from POLE/D1 functional mutations (Extended data fig. 6d-m; Supplementary note 6).
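The cutoff selection described above can be illustrated with a short sketch. The Youden index is J = sensitivity + specificity - 1, and the cutoff that maximizes J is chosen. The code below assumes that each sample has a predicted probability from some fitted classifier and a binary label (1 for a known functional mutation, 0 for wild-type); the scores and labels in the usage lines are made-up toy values, and the function names are hypothetical rather than the authors' implementation.

```python
def confusion_counts(scores, labels, cutoff):
    """Count TP, FP, TN, FN when samples with score >= cutoff are called positive."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= cutoff
        if label == 1 and predicted_positive:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def youden_cutoff(scores, labels):
    """Return (cutoff, J, sensitivity, specificity) maximizing the Youden index.

    Every observed score is tried as a candidate cutoff; on ties the lowest
    cutoff is kept. Both classes must be present in `labels`.
    """
    best = None
    for cutoff in sorted(set(scores)):
        tp, fp, tn, fn = confusion_counts(scores, labels, cutoff)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sensitivity, specificity)
    return best


# Toy data (hypothetical): seven samples with predicted probabilities.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1, 1]
print(youden_cutoff(scores, labels))
```

With strongly imbalanced classes, as in the cohorts above where wild-type samples far outnumber functional-mutation samples, overall accuracy is dominated by specificity, which is why sensitivity and specificity are reported separately alongside it.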

Figure 4. Statistical models based on mutational signatures can accurately identify tumors harboring POLE/D1 functional mutations from WES data and target panel sequencing data.


a, Scheme of training a logistic regression model to identify tumors that contain POLE/D1 functional mutations from whole exome sequencing data. Tumor samples harboring known POLE/D1 functional mutations and POLE/D1 wild-type tumors were used to generate the training set, while samples from the ICGC and CCLE dataset were used to generate test set to evaluate the performance of the model. The trained model was then applied on tumor samples with POLE/D1 VUSes from the TCGA, ICGC and CCLE data sets, to identify potential samples containing new functional mutations. b, ROC (receiver operating characteristic) curve with AUC (area under the curve) and confusion matrix of the logistic regression model trained on the TCGA WES training set. Accuracy, sensitivity and specificity were calculated and presented. Pred. wild-type, samples predicted to be wild type for POLE/D1 functional mutation; Pred. functional, samples predicted to be harboring POLE/D1 functional mutations. c, ROC curve with AUC and confusion matrix of the trained logistic regression model on the ICGC/CCLE test set. Sensitivity and specificity were calculated and presented. d, ROC curve with AUC and confusion matrix for the MSK-IMPACT training set. Sensitivity and specificity were calculated and presented. e, ROC curve with AUC and confusion matrix for TCGA-IMPACT panel test set. Sensitivity and specificity were calculated and presented. f, Fraction of POLE/D1 VUS samples identified as functional mutation-positive samples by the WES and MSK-IMPACT logistic regression models, in the WES datasets and MSK-IMPACT datasets, accordingly. Functional, POLE/D1 VUS samples were predicted to harbor functional mutations; Passenger, POLE/D1 VUS samples were predicted as only harbored POLE/D1 passenger mutations. 
g, Association of the POLE/D1 function-associated signature-positive VUS samples with known POLE/D1 functional mutations, mutations that are associated with familial or early-onset tumors, or POLE/D1 mutator alleles in other species. h, Association of functional mutational signatures and SNV burden with the five categories of tumor samples, determined by the known POLE/D1 functional mutation and functional mutational signature status of the samples (Known functional mutation samples with function-associated signatures, N=206; Known functional mutation samples without function-associated signature, N=21; VUS samples with function-associated signatures, N=85; VUS samples without function-associated signature, N=2522; Wild type, N=55630). Known functional mutation samples with function-associated signatures, samples harbored known POLE/D1 functional mutations and were also function-associated signature-positive; Known functional mutation samples without function-associated signature, samples harbored known POLE/D1 functional mutations but were function-associated signature-negative; VUS samples with function-associated signatures, samples harbored POLE/D1 VUSes and were positive for the function-associated signatures; VUS samples without function-associated signature, samples harbored POLE/D1 VUSes and did not show functional mutational signature based on our model; Wild type, POLE/D1 wild type samples. The minima (0% percentile) and maxima (100% percentile) were plotted as the whiskers, the 25% and 75% percentiles were plotted as the bounds of the boxes, medians were plotted as the center bar and means were plotted as a red dot.
P values (‘Known functional mutation samples with function-associated signatures’ vs ‘VUS samples with function-associated signatures’ P<2.2e-16; ‘VUS samples with function-associated signatures’ vs ‘VUS samples without function-associated signature’ P=0.045; ‘VUS samples with function-associated signatures’ vs ‘Wild type’ P<2.2e-16; ‘Known functional mutation samples with function-associated signatures’ vs ‘VUS samples without function-associated signature’ P<2.2e-16) were generated with two-sided Wilcoxon Rank Sum Tests (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005). i, Genomic features and mutational signature landscapes of tumor samples harbored known POLE/D1 functional mutations or were predicted to harbor functional mutations from WES and MSK-IMPACT data sets. Each bar represents a tumor sample, proportional contribution of the four POLE/D1 functional SBS signatures were shown. Sample with VUS, tumor samples only harbored POLE/D1 VUS but were predicted as functional samples by the logistic regression models; MSI status, MSI instability status were determined by MSIsensor score (MSI-H, MSIsensor score >=10; MSS/MSI-L, MSIsensor score<10; NA, MSI information not available); Primary site, the primary sites where the tumors were developed.

Panel sequencing is more widely used in clinical practice. The reduced number of mutations covered, together with the distinct trinucleotide context, leads to shifts in mutational signature readouts and sub-optimal prediction by our model on the MSK-IMPACT panel data (Extended data fig. 7a-b; Supplementary note 7)40. Therefore, we built a separate model on a new training set containing pan-cancer samples with MSK-IMPACT data and at least 10 SNVs (samples with somatic functional mutations N=137, wild-type samples N=7649; Extended data fig. 7c-d; Methods). The IMPACT-model showed an AUC of 0.9560, with a cutoff that delivered an accuracy of 0.9409 (cutoff=0.4935, accuracy 95%CI: 0.9355-0.9461; sensitivity=0.8978, 95%CI: 0.8344-0.9429; specificity=0.9416, 95%CI: 0.9362-0.9468; Fig. 4d, Extended data fig. 7e). All of the 14 false-negative predictions showed either MMRd or ultraviolet radiation (UV) associated SBS signatures (SBS7a/b, Extended data fig. 7f). The IMPACT-model also demonstrated a high AUC of 0.9470 (95%CI: 0.9319-0.9801) and an overall accuracy of 0.9340 (accuracy 95%CI: 0.9231-0.9438; sensitivity=0.8888, 95%CI: 0.7927-0.9507; specificity=0.9354, 95%CI: 0.9244-0.9453) on the test set generated from the TCGA (Fig. 4e; Extended data fig. 7g; Methods).

We applied these two models and identified 85 samples that were POLE/D1 function-associated signature-positive out of the 2607 POLE/D1 VUS samples from the TCGA/ICGC/CCLE and MSK-IMPACT cohorts (Fig. 4f). To assess the functional ambiguity of these VUS samples with function-associated signatures, we expanded our baseline functional mutation list from a comprehensive literature review (Supplementary note 8) and found that 25% of these VUS samples are associated with functional mutations found in clinical studies or validated in other model systems, and showed similar genomic features to the samples harboring known POLE/D1 functional mutations (Fig. 4g, Supplementary table 4, Supplementary note 8).

We compared the immune landscape of samples that harbored either known functional mutations or VUSes and were function-associated signature-positive (hereafter referred to as “functional mutation/signature-positive”) with two other groups: (1) samples that did not harbor any known POLE/D1 functional mutation and were also predicted as passenger mutation samples by our signature-based model (hereafter referred to as “functional mutation/signature-negative”), and (2) POLE/D1 wild-type samples that were false-positive (FP) predictions from the TCGA endometrial cohort (hereafter referred to as “false-positive prediction wild-type samples”). We also performed comparative analyses with the baseline mouse PoleP286R tumors. The functional mutation/signature-positive tumors are more immune active and share immune features with the mouse PoleP286R tumors (Fig. 5a-f; Extended data fig. 7h-j, Extended data fig. 8a-c; Supplementary note 9).

Figure 5. POLE/D1 functional mutation/signature-positive tumors are more immune active and share similar immune features with the baseline mouse PoleP286R tumors.

a, Immune infiltration score and CYT score (log10 transformed) of the POLE/D1 functional mutation/signature-positive tumors (N=60) and POLE/D1 functional mutation/signature-negative tumors (N=560) compared to the FP prediction wild-type samples (N=6) from the TCGA endometrial cohort. P value (Immune score: POLE/D1 functional mutation/signature-positive tumors vs POLE/D1 functional mutation/signature-negative tumors P=5.2e-4; FP prediction wild-type samples vs POLE/D1 functional mutation/signature-negative tumors P=0.26; CYT score: POLE/D1 functional mutation/signature-positive tumors vs POLE/D1 functional mutation/signature-negative tumors P=8.6e-6; FP prediction wild-type tumors vs POLE/D1 functional mutation/signature-negative tumors P=0.53) were generated with two-sided Wilcoxon Rank Sum Tests (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005). POLE/D1 functional mutation/signature-positive tumors, tumors either harbored known POLE/D1 functional mutations, or only harbored POLE/D1 VUSes but were predicted as function-associated signature-positive based on the functional-signature-based model; POLE/D1 functional mutation/signature-negative tumors, tumors were predicted as wild-type samples by the function-associated signature-based model, regardless of the POLE/D1 mutation status; FP prediction wild-type tumors, POLE/D1 wild type tumors that were predicted as function-associated signature-positive (i.e., false positive) by the function-associated signature-based model. CYT score, cytotoxicity score. The minima (0 percentile), maxima (100% percentile) were plotted as the whiskers, 25% percentile and 75% percentile were plotted as the bounds of the boxes, medians were plotted as the center bar and means were plotted as red dot. 
b, PCA analysis on the immune features of the tumors from the TCGA-endometrial cohort (POLE/D1 functional mutation/signature-positive tumors N=60, POLE/D1 functional mutation/signature-negative tumors N=520, FP prediction wild-type tumors N=6). P values were calculated with PERMANOVA tests (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005). c, Heatmap of the transformed enrichment scores of the immune cell types of the three indicated groups of samples from the TCGA-endometrial cohort. d, Enrichment scores of the immune cell types that are significantly upregulated or down-regulated in the POLE/D1 functional mutation/signature-positive tumors and FP prediction wild-type tumors compared to the POLE/D1 functional mutation/signature-negative tumors (POLE/D1 functional mutation/signature-positive tumors N=60, POLE/D1 functional mutation/signature-negative tumors N=520, FP prediction wild-type tumors N=6; also see Extended data fig. 8a). P values (CD8 Tcm P=4.1e-6; CD8 Tem P=0.024; naïve CD8 T cell P=1.7e-4; B-cell P=0.015; NK cell P=0.021; DC P=0.0077) were generated with two-sided Wilcoxon Rank Sum Tests (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005). The minima (0% percentile) and maxima (100% percentile) were plotted as the whiskers, the 25% and 75% percentiles were plotted as the bounds of the boxes, medians were plotted as the center bars and means were plotted as red dots. e, Immune cell types consistently altered in the human functional tumors and the mouse B16F10 PoleP286R baseline tumors when compared to their corresponding control samples. Statistical significance was determined by two-tailed Student's t-tests (P<0.05). 
f, Richness, and evenness index of the TCR-beta CDR3 repertoires from the POLE/D1 functional mutation/signature-positive tumors (N=59), POLE/D1 functional mutation/signature-negative (N=463), and FP prediction wild-type tumors (N=5) samples of the TCGA-endometrial cohort when TCR-beta CDR3 repertoire data is available. P value (richness P=8.7e-5, evenness P=0.045) were generated with two-sided Wilcoxon Rank Sum Tests (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005). The minima (0% percentile), maxima (100% percentile) were plotted as the whiskers, 25% percentile and 75% percentile were plotted as the bounds of the boxes, medians were plotted as the center bar and means were plotted as red dot.

POLE/D1 function-associated signature-based model predicts patient outcome

POLE/D1 functional mutations can be observed concurrently with MMRd, a well-known genetic feature that leads to the MSI-H phenotype and promotes response to ICB9. To determine the independent effect of POLE/D1 functional mutations on ICB outcome, we applied our model to identify patients with POLE/D1 function-associated signatures in a pan-cancer cohort containing 2,700 MSI-Stable/Low cancer patients that received anti-PD-1/PD-L1 therapy and MSK-IMPACT sequencing from 2015 (Methods). Within this cohort, a group of patients that harbored at least one somatic POLE/D1 mutation (N=172) showed modest improvement in overall survival (OS) with ICB when corrected for cancer types (HR=0.74, P=0.0046, 95% CI=0.60-0.91; Fig. 6a; Supplementary table 5). From these patients, we identified a subgroup of 24 patients that harbored known POLE/D1 somatic functional mutations (N=12), or harbored POLE/D1 somatic VUSes but were predicted to be POLE/D1 function-associated signature-positive samples (N=12) based on our logistic regression model (Methods, Extended data fig. 8d, Supplementary table 6). Ten out of the twelve known POLE/D1 functional mutation-positive samples were also predicted as POLE/D1 function-associated signature-positive, while the remaining two showed dominant UV (SBS7a/b) or MMRd signatures (Extended data fig. 8d). The tumors from these POLE/D1 functional mutation/signature-positive patients showed higher levels of TMB and lower levels of copy number variants than the tumors from the rest of the POLE/D1 mutated patients that are POLE/D1 function-associated signature-negative (P=2.2e-5; P=0.0021), or the tumors from the POLE/D1 wild-type patients (P<2.2e-16; P=0.0095; Extended data fig. 8e-f). We found that the POLE/D1 functional mutation/signature-positive patients showed significantly better overall survival than the POLE/D1 wild-type patients, even after correcting for cancer type in a multivariable coxph regression (HR=0.25, P=0.0072, 95% CI: 0.11-0.56, Fig. 6b), or when limiting the cohort to POLE/D1 functional mutation/signature-positive tumors and their histology-matched wild-type controls (HR=0.25, P=0.0008, 95% CI: 0.11-0.56, Extended data fig. 8g). Importantly, this survival benefit is not driven by a specific histology (Extended data fig. 8h; Supplementary note 10).
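For readers who want to sanity-check a hazard ratio against its reported interval, the relation between HR, 95% CI and P value can be sketched as follows. This assumes the interval is a Wald interval on the log-hazard scale (an assumption; the text does not state how the intervals were computed), and because the published values are rounded the result is only approximate. The function name `wald_p_from_hr` is illustrative.

```python
import math

def wald_p_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided P value from a hazard ratio and its 95% CI.

    Assumes a Wald interval: log(HR) +/- z_crit * SE.
    """
    # Standard error of log(HR), recovered from the width of the interval.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2))

# Rounded values quoted in the text for the OS comparison.
p_approx = wald_p_from_hr(0.25, 0.11, 0.56)
print(f"approximate P = {p_approx:.4f}")
```

With inputs quoted to two significant figures, agreement to within rounding is the most that should be expected from this kind of check.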

Figure 6. Patients with POLE/D1 functional mutations/signatures have better response and survival after anti-PD-1/PD-L1 immunotherapy.

a, Kaplan-Meier overall survival probability plot of the patients harboring any type of POLE/D1 mutations versus POLE/D1 wild-type patients. Log-Rank P value and hazard ratio were derived from a coxph model with cancer type correction. b, Kaplan-Meier overall survival probability plot of POLE/D1 functional mutation/signature-positive patients versus POLE/D1 wild-type patients in a PD-1/PD-L1 treated MSS/MSI-L patient cohort. Log-Rank P value and hazard ratio shown were calculated from a coxph model with cancer type correction. Functional mutations/signatures, patients either harbored known POLE/D1 functional mutations, or only harbored POLE/D1 variants of unknown significance (VUSes) but were predicted as function-associated signature-positive. c, Proportion of the POLE/D1 functional mutation/signature-positive patients deriving clinical benefit from immune checkpoint inhibition in the MSK-IMPACT cohort. PR, partial response; SD, stable disease (>6 months); PD, progressive disease; CR, complete response. d, Proportion of POLE/D1 functional mutation/signature-positive patients deriving clinical responses from immune checkpoint inhibition in different cancer type categories. e, MRI images and SBS signature profile of the tumor from one of the POLE/D1 functional mutation/signature-positive patients, who harbored the function-associated signature-positive POLEE277Q VUS and reached complete response with anti-PD1 therapy. Black and green markers and numbers on the images indicate the location and size of the tumor before and after initiation of therapy. f, Comparison of the proportion of cases with clinical benefit among the POLE/D1 functional mutation/signature-positive patients and wild-type patients in pan-cancer and each individual cancer type category. Numbers indicate actual numbers of patients in each category. 
BLCA, bladder cancer; CRC, colorectal cancer; NSCLC, non-small cell lung cancer; Others, other cancer types with at least one POLE/D1 functional mutation/signature-positive patient, combined. P values were derived from Fisher's exact tests. OR, odds ratio from Fisher’s exact tests. g, Kaplan-Meier progression free survival probability plot of the POLE/D1 functional mutation/signature-positive patients versus wild-type patients. Log-Rank P value and hazard ratio shown were calculated from the coxph model with cancer type correction. h, Forest plot of the POLE/D1 functional mutations/signatures as a predictive factor in coxph models of progression free survival after immunotherapy with cancer type correction for pan-cancer or each single cancer type category with at least three POLE/D1 functional mutation/signature-positive patients. Number of POLE/D1 functional mutation/signature-positive patients, number of wild-type patients, hazard ratio and P value are shown for each cancer type category in the figure. Horizontal bars represent the 95% confidence interval of the hazard ratio. Error bar centres indicate hazard ratios. Each line represents an individual coxph model on the indicated cancer type category. i, Kaplan-Meier progression free survival plot of the patients identified as POLE/D1 functional mutation/signature-positive by the MSK-IMPACT logistic regression model versus other POLE/D1 mutant patients that are functional mutation/signature-negative. VUSes without function-associated signature, samples harbored POLE/D1 VUSes, but were predicted as function-associated signature-negative; Log-Rank P value and hazard ratio shown were calculated from a coxph model with cancer type correction.

As response rate and progression-free survival (PFS) are also key measurements of efficacy of immunotherapy, we analyzed the immunotherapy-specific clinical response data via manual review of clinical records from the POLE/D1 functional mutation/signature-positive patients41 (Methods). Among the 24 POLE/D1 functional mutation/signature-positive patients, 18 patients experienced durable clinical benefit for at least 6 months, including four patients with complete response (CR), while only six patients experienced progression of disease (PD; Fig. 6c-d). Of the four complete responders, two harbored known POLE/D1 functional mutations while the other two harbored POLE/D1 VUSes and were function-associated signature-positive. Of these complete responders, one harbored the classical POLEP286R mutation (Extended data fig. 8i; Supplementary note 11); another harbored the POLEE277Q VUS (Fig. 6e; Supplementary note 11), indicating that ICB can induce long-lasting complete clinical and biochemical response in patients with either known POLE/D1 functional mutations or VUSes with function-associated signatures. Compared to the 30.0% clinical benefit rate of the 1038 POLE/D1 wild-type patients with response data available, the 75.0% overall clinical benefit rate of the 24 POLE/D1 functional mutation/signature-positive patients is significantly higher (OR=7.0, P<0.0001, Fig. 6f). This difference was also observed in the histology-matched setting (OR=6.6, P<0.0001, Extended data fig. 9a) or when limiting the comparison to individual histology categories (Fig. 6f).
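The odds ratio quoted for clinical benefit follows directly from the two benefit rates. The sketch below computes the simple (unconditional) odds ratio from the rounded rates in the text; it is an illustration only, since the odds ratio returned by a Fisher's exact test routine may be a conditional estimate computed from exact counts, and the exact wild-type counts are not given in this section.

```python
def odds(p):
    """Convert a proportion into odds."""
    return p / (1.0 - p)

# Clinical benefit rates quoted in the text.
benefit_rate_positive = 18 / 24   # functional mutation/signature-positive patients
benefit_rate_wild_type = 0.30     # POLE/D1 wild-type patients (rounded rate)

odds_ratio = odds(benefit_rate_positive) / odds(benefit_rate_wild_type)
print(f"odds ratio = {odds_ratio:.1f}")
```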

Consistent with the OS data, the overall POLE/D1 mutant patient group only experienced modest PFS benefit when compared to the wild-type patients (HR=0.50, P=1e-07, 95% CI: 0.38-0.65, Extended data fig. 9b), while the patients with the POLE/D1 functional mutation/signature-positive tumors showed a much more robust PFS benefit (HR=0.2, P=6.9e-07, 95% CI: 0.11-0.38, Fig. 6g), even when considering histology of the tumors (HR=0.2, P=7.4e-07, 95% CI:0.11-0.38, Extended data fig. 9c, Fig. 6h). Interestingly, the POLE/D1 functional mutation/signature-positive patients have improved response and survival even when compared to all other POLE/D1 mutated patients that are negative for POLE/D1 functional mutations and function-associated signatures, indicating that the function-associated signature-based model enriched for a subgroup of patients that might benefit more from immunotherapy (PFS: HR=0.16, P=0.0002, 95% CI:0.062-0.42; OS, HR=0.25, P=0.0023, 95% CI:0.10-0.61; Response OR= 3.4, P=0.03; Fig. 6i; Extended data fig. 9d-e). In contrast, false-positive prediction wild type patients (i.e., the wild-type false positive predictions by the logistic regression model) did not show improved ICB outcome when compared to the true negative prediction wild type patients (i.e., the wild-type true negative predictions by the MSK-IMPACT model, OS, N=49, 2479, HR=1.1, P=0.32, 95%CI:0.92-1.3; PFS, N=23, 1016, HR=0.93, P=0.55, 95%CI:0.75-1.2; Response OR=0.82, P=0.82; Extended data fig. 9f-h), indicating that POLE/D1 function-associated signatures do not predict better survival or response to ICB without the presence of POLE/D1 functional mutations. Altogether, the POLE/D1 functional mutation/signature-positive patients show better outcome with immunotherapy than the other POLE/D1 mutant patients or wild-type patients.

Signature-based model outperforms traditional approaches

As we observed that the POLE/D1 functional mutation/signature-positive patients have improved survival compared to other POLE/D1 mutated patients (Fig. 6i, Extended data fig. 9d-e), we compared our prediction approach with other methods to enrich for tumors with POLE/D1 functional mutations. We first considered two such methods: (i) POLE/D1 mutated tumors with a hypermutation phenotype (at least 50 mutations/Mb; hypermutated, N=36)12 and (ii) tumors with at least one POLE/D1 exonuclease domain mutation (N=37) (Fig. 7a). Of these, 23 of the 36 hypermutated tumors and 21 of the 37 exonuclease domain mutation-positive samples are positive neither for known POLE/D1 functional mutations nor for POLE/D1 function-associated signatures, indicating that these two approaches identify patient populations that are relatively distinct from the POLE/D1 functional mutation/signature-positive patients (Fig. 7a). To assess the independent value of our function-associated signature-based model against these other approaches, we generated multivariable Cox proportional hazard models (coxph) and found that only the POLE/D1 functional mutations/function-associated signatures and hypermutation are independent predictive beneficial factors for both OS and PFS (Fig. 7b; Extended data fig. 9i). Exonuclease domain mutations of POLE/D1 have long been used as a criterion to select POLE/D1 functional mutations32. Indeed, patients with POLE/D1 exonuclease domain mutations showed slightly better OS and PFS after immunotherapy (OS, HR=0.57, P=0.039, 95% CI: 0.36-0.97; PFS, HR=0.26, P=0.0030, 95% CI: 0.11-0.64; Extended data fig. 9j-k). 
However, when we excluded the small portion of POLE/D1 functional mutation/signature-positive patients from the POLE/D1 exonuclease domain mutation-positive population, we did not see a survival benefit for the rest of the POLE/D1 exonuclease domain mutation-positive patients compared to POLE/POLD1 wild-type patients (OS, HR=0.71, P=0.24, 95% CI: 0.40-1.26; PFS, HR=0.80, P=0.55, 95% CI: 0.39-1.64; Fig. 7c-d), revealing that the survival benefit of POLE/D1 exonuclease domain mutations is largely explained by the POLE/D1 exonuclease domain mutation-positive samples that are also predicted to be POLE/D1 functional mutation/signature-positive. In contrast, the POLE/D1 functional mutation/signature-positive patients whose mutations lie outside the exonuclease domains of POLE/POLD1, and the POLE/D1 functional mutation/signature-positive patients whose tumors did not show a hypermutator phenotype, still showed better OS and PFS than the POLE/D1 wild-type patients, suggesting that the predictive effect of POLE/D1 functional mutations/signatures is partially independent of POLE/D1 exonuclease domain mutations or hypermutator phenotype (Fig. 7e-f; Extended data fig. 9l-m).

Figure 7. POLE/D1 function-associated signatures positive status is an independent predictor that can enrich patients who benefit from immunotherapy in the patient population with POLE/D1 mutation.

a, Comparison of the patient populations selected by different strategies. Known functional mutations, tumors with at least one known POLE/D1 functional mutation determined by the functional list used to build the MSK-IMPACT model; Hypermutated, tumors with at least 50 non-synonymous mutations per MB exome; Exonuclease domain mutations, tumors have at least one POLE/D1 mutation located in the exonuclease domain of POLE or POLD1. Function-associated signatures, i.e., function-associated signature-positive, tumor samples that were predicted to harbor POLE/D1 functional mutations by the MSK-IMPACT logistic regression model. Numbers of patients in each category were shown. b, A multivariable coxph model includes all the above patient selection strategies to compare the predictive capability on patients’ overall survival after ICB (N=2700). Functional mutations/signatures, i.e., POLE/D1 functional mutation/signature-positive patients, patients either harbored known POLE/D1 functional mutations, or only harbored POLE/D1 variants of unknown significance (VUSes) but were predicted as function-associated signature-positive. Hazard ratio and Log-Rank P value are presented. Horizontal bars represent the 95% confidence interval of hazard ratio. Error bar centre indicates hazard ratio. Two-sided tests were performed for statistical significance without multiple comparison adjustment. (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005) c, Kaplan-Meier overall survival plot of patients harboring POLE/D1 exonuclease domain mutations but are also negative for known functional mutation and function-associated signature, versus the POLE/D1 wild-type patients. Log-Rank P value and hazard ratio shown were calculated from coxph model with cancer type correction. d, Kaplan-Meier progression free survival plot of the two patient groups in c when progression free survival data is available. 
Log-Rank P value and hazard ratio shown were calculated from a coxph model with cancer type correction. e, Kaplan-Meier overall survival plot of the POLE/D1 functional mutation/signature-positive patients whose POLE/D1 mutations are not located in the exonuclease domain, versus the POLE/D1 wild-type patients. Log-Rank P value and hazard ratio shown were calculated from a coxph model with cancer type correction. f, Kaplan-Meier progression free survival plot of the two groups of patients in e, when progression free survival data is available. Log-Rank P value and hazard ratio shown were calculated from a coxph model with cancer type correction. g, Fraction of patients with durable clinical response of the patient groups selected from all the patients with POLE/D1 mutations based on different strategies. h, C-index of the coxph models generated based on different patient selection strategies with cancer type correction on the ICB related overall survival of all the POLE/D1 mutated patients (N=172). P values were calculated from Student's t-tests of the coxph model based on POLE/D1 functional mutations/signatures, against other models based on other strategies. i, Multi-variable coxph test of ICB overall survival for POLE/D1 functional mutations/signatures and TMB with cancer type correction (N=2700). Only POLE/D1 functional mutations/signatures and TMB are shown in the forest plot. * log-rank P<0.05. *** log-rank P<0.005. Error bars indicate the 95% CI of the hazard ratio. j, Kaplan-Meier overall survival plot of the POLE/D1 functional mutation/signature-positive patients versus the POLE/D1 wild-type patients in the ICB treated patient cohort with high TMB (TMB>=10). Log-Rank P value and hazard ratio shown were calculated from the coxph model with cancer type correction.

We further compared our model approach with ‘known POLE/D1 functional mutation-positive’, ‘POLE/D1 exonuclease domain mutation-positive’, hypermutation and in silico function prediction-based algorithms on ICB outcomes. The results indicated that the function-associated signature-based model significantly out-performs most of these methods (Fig. 7g-h; Extended data fig. 10a-c; Supplementary note 12). Importantly, ‘POLE/D1 functional mutations/signatures’ showed predictive power for OS and PFS that was independent of TMB in multivariable coxph models (Fig. 7i, Extended data fig. 10d), as well as in ICB patient subsets that are TMBhi (i.e., with at least 10 mutations per Mb exome) or have balanced TMB distributions (Fig. 7j, Extended data fig. 10f-g, Supplementary note 13), indicating that biological mechanisms other than increased TMB also contribute to the improved ICB outcome of the POLE/D1 functional mutation/signature-positive patients.
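Fig. 7h compares the selection strategies by the C-index of the corresponding coxph models. As a reminder of what that statistic measures, a minimal pure-Python sketch of Harrell's concordance index follows. The toy survival times, event flags and risk scores are hypothetical, and this is not the authors' implementation, which would more likely rely on a survival-analysis package.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when subject i had an observed event
    and a shorter follow-up time than subject j. The pair is concordant
    when the subject who failed earlier has the higher risk score.
    Ties in risk score count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical toy data: months of follow-up, event indicator (1 = death).
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]

# A continuous risk score that orders the patients perfectly.
c_continuous = concordance_index(times, events, [0.9, 0.6, 0.3, 0.1])

# A binary marker (1 = predicted high risk), as with a yes/no selection rule.
c_binary = concordance_index(times, events, [1, 1, 0, 0])

print(c_continuous, c_binary)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering; a binary selection rule produces many tied pairs, which pulls the index toward 0.5 even when the rule is informative.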

Taken together, these results indicated that function-associated signature-generating POLE/D1 mutations, even those that are not classic functional mutations, predict outcome after anti-PD1/PD-L1 therapy.

Mutational signatures affect immunogenicity of neoantigens

Given that we observed a TMB-independent predictive effect of POLE/D1 function-associated signatures on ICB outcome, we wanted to further explore the potential underlying mechanism. Although mutational signatures are associated with different probabilities of generating different SNV classes, the POLE/D1 function-associated mutational signatures do not differ from other COSMIC SBS signatures in their ability to generate missense SNVs, which can alter amino acid codons and generate immunogenic neo-peptides (Extended data fig. 10h, Supplementary table 7, Supplementary note 8). Further, the SNVs in the POLE/D1 functional mutation/signature-positive samples are also not more likely to generate human leukocyte antigen class-I (HLA-I) binding peptides (Fig. 8a-b; Supplementary note 15).

Figure 8. Trinucleotide context spectrum of SBS mutational signatures and immunogenicity of neoantigens.

a. Total number of SNVs per sample that generate at least one neo-peptide binding to at least one HLA-I allele of the same patient from the TCGA cohort, when HLA and neoantigen data is available (POLE/D1 functional mutation/signature-positive N=82, POLE/D1 functional mutation/signature-negative N=7003, FP prediction wild-type samples N=85). POLE/D1 functional mutation/signature-positive, samples either harbored known POLE/D1 functional mutations, or only harbored POLE/D1 variants of unknown significance (VUSes) but were predicted as functional samples based on the logistic regression model; POLE/D1 functional mutation/signature-negative, samples didn’t harbor any known POLE/D1 functional mutation, and were predicted as function-associated signature-negative by the logistic regression model, regardless of the POLE/D1 mutation status; FP prediction wild-type samples, wild type samples that were predicted as POLE/D1 function-associated signature-positive (i.e., false positive). P values (POLE/D1 function-associated signature-positive vs POLE/D1 function-associated signature-negative P<2.2e-16; FP prediction wild-type vs POLE/D1 function-associated signature-negative P=0.037; POLE/D1 function-associated signature-positive vs FP prediction wild-type P<2.2e-16) were generated with two-sided Wilcoxon Rank Sum Tests (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005). The minima (0% percentile), maxima (100% percentile) were plotted as the whiskers, 25% percentile and 75% percentile were plotted as the bounds of the boxes, medians were plotted as the center bar and means were plotted as center black dot. b. Fraction of SNVs per sample that generate at least one neo-peptide bind to at least one HLA-I allele of the same patient from the TCGA cohort, when HLA and neo-antigen data is available (POLE/D1 functional mutation/signature-positive N=82, POLE/D1 functional mutation/signature-negative N=7003, FP prediction wild-type N=85). 
P values ( POLE/D1 functional mutation/signature-positive vs POLE/D1 functional mutation/signature-negative P=0.13; FP prediction wild-type vs POLE/D1 functional mutation/signature-negative P=0.85; POLE/D1 functional mutation/signature-positive vs FP prediction wild-type P<2.2e-16) were generated with two-sided Wilcoxon Rank Sum Tests (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005). The minima (0% percentile), maxima (100% percentile) were plotted as the whiskers, 25% percentile and 75% percentile were plotted as the bounds of the boxes, medians were plotted as the center bar and means were plotted as center black dot. c, Sample-level average Δ hydrophobicity of neo-peptide-associated amino acid (AA) alterations in three sample groups from the TCGA cohort (POLE/D1 functional mutation/signature-positive N=82, POLE/D1 functional mutation/signature-negative N=7003, FP prediction wild-type N=85). Bars indicate medians and dots indicate mean values. P values (POLE/D1 functional mutation/signature-positive vs POLE/D1 functional mutation/signature-negative P=0.0075; FP prediction wild-type vs POLE/D1 functional mutation/signature-negative P=0.93; POLE/D1 functional mutation/signature-negative vs FP prediction wild-type P=0.26) were generated with two-sided Wilcoxon Rank Sum Tests (n.s., no statistical significance, * P<0.05, ** P<0.01). The minima (0% percentile), maxima (100% percentile) were plotted as the whiskers, 25% percentile and 75% percentile were plotted as the bounds of the boxes, medians were plotted as the center bar and means were plotted as center black dot. d, Per peptide-residue-position Δhydrophobicity of the AA alterations in the POLE/D1 functional mutation/signature-positive samples (N=82) versus the POLE/D1 functional mutation/signature-negative (N=7003) samples described in (c). 
‘*’ symbols indicate the residues at which Δhydrophobicity of the AA alterations in the functional samples is significantly (P<0.05) higher than in the other groups, as determined by two independent two-sided Wilcoxon Rank Sum Tests. Data are presented as mean values ± s.e.m. e, Per peptide-residue-position Δhydrophobicity of the AA alterations in the FP prediction wild-type (N=85) samples versus the POLE/D1 functional mutation/signature-negative (N=7003) samples described in (c). Data are presented as mean values ± s.e.m. f, Observed mean Δhydrophobicity of the neo-peptide-associated AA alterations for each SBS mutational signature in the TCGA cohort. Positive values indicate mutational signatures that are more likely associated with AA alterations generating residues with higher hydrophobicity compared to the original residues. POLE/D1 signatures, POLE/D1 function-associated signatures; Other signatures, signatures not associated with POLE/D1 functional mutations. g, Observed mean Δpolarity of neo-peptide-associated AA alterations for each SBS mutational signature in the TCGA cohort. Positive values indicate mutational signatures that are more likely associated with AA alterations generating residues with higher polarity compared to the original residues.

Increased hydrophobicity and decreased polarity are highly associated with enhanced immunogenicity of HLA-I binding peptides42-44. We found that the change in hydrophobicity (Δhydrophobicity) of the neo-peptides versus the corresponding wild-type peptides from the POLE/D1 functional mutation/signature-positive samples in the TCGA cohort is higher than that of the neo-peptides derived from the POLE/D1 functional mutation/signature-negative samples (P=0.0075, Fig. 8c). As the Δhydrophobicity of the neo-peptides is the consequence of amino acid alterations, we further looked at the Δhydrophobicity at each residue position in the predicted HLA-I binding neo-peptides (all 8-11 mers). The Δhydrophobicity of the amino acid alterations at the TCR-contact residues was significantly higher in the POLE/D1 functional mutation/signature-positive samples than in the POLE/D1 functional mutation/signature-negative samples, whereas no such increase was seen in the false-positive prediction wild-type samples (Fig. 8d-e); this could be associated with higher immunogenicity42,45. To understand the basis of this difference, we examined the missense SNVs that generate specific amino acid alterations and found that mutational signatures are associated with the presence of different amino acid alterations (i.e., wild-type and mutant amino acid pairs) (Extended data fig. 10i; Supplementary table 8). For instance, the APOBEC-associated SBS2 signature is strongly associated with E-to-K alterations, while POLE/D1-related SBS10a is more associated with D/S-to-Y and L/R-to-I alterations. Strikingly, we found that two of the classical POLE/D1 function-associated signatures (SBS10a/b) are among the mutational signatures that are most likely to generate amino acid alterations which increase the hydrophobicity and decrease the polarity of the resultant neo-peptides (Fig. 8f-g, Supplementary table 9). Altogether, these results indicated that POLE/D1 function-associated signatures likely generate amino acid alterations that increase the hydrophobicity of the neo-peptides, enhancing immunogenicity (see Supplementary notes 15-16).
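To make the Δhydrophobicity quantity concrete, the sketch below computes the per-substitution change for the amino acid alterations named above. It uses the Kyte-Doolittle hydropathy index purely as an illustrative choice; the text in this section does not say which hydrophobicity scale was used, so the scale and the function name `delta_hydrophobicity` are assumptions.

```python
# Kyte-Doolittle hydropathy index (an assumed scale for illustration;
# the scale used in the study is not stated in this section).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def delta_hydrophobicity(wild_type, mutant, scale=KYTE_DOOLITTLE):
    """Hydrophobicity of the mutant residue minus that of the wild-type residue."""
    return scale[mutant] - scale[wild_type]

# Alterations mentioned in the text: E-to-K (SBS2); D/S-to-Y and L/R-to-I (SBS10a).
substitutions = [("E", "K"), ("D", "Y"), ("S", "Y"), ("L", "I"), ("R", "I")]
for wt, mut in substitutions:
    print(f"{wt}-to-{mut}: {delta_hydrophobicity(wt, mut):+.1f}")
```

A sample-level or position-level Δhydrophobicity would then be an average of such values over the relevant neo-peptide-associated alterations. Other published scales can rank individual substitutions differently, so the numbers from this sketch should not be read as the study's values.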

Discussion

Here, we showed that POLE/D1 functional mutations are sufficient to induce anti-tumor immunity and impart sensitivity to ICB in both humans and mice. The introduction of POLE/D1 functional mutations into murine syngeneic tumors led to enhanced baseline activity of both the adaptive and innate immune compartments, which imparted sensitivity to ICB. This effect appeared to be independent of cancer type in our model systems. These observations are consistent with previous studies in humans15 and transgenic mice46.

Previous studies showed the value of using mutational signatures to identify pathogenic mutations in DDR pathways47. Here, our models trained on POLE/D1 function-associated signatures can distinguish samples harboring functional POLE/D1 mutations from wild-type samples with high accuracy using both WES and targeted panel sequencing. Furthermore, our function-associated signature-based approach was superior in accurately classifying POLE/D1 mutated tumors that respond to ICB and out-performed most of the other traditional computational approaches for identifying likely pathogenic variants.

We also found that the spectra of the SBS signatures are associated with different amino acid alterations. The classical POLE/D1 function-associated signatures SBS10a/b are more likely to be associated with amino acid alterations that increase the hydrophobicity of neo-peptides, potentially leading to higher immunogenicity42,48, which may contribute to the exceptional ICB outcomes of patients with POLE/D1 functional tumors. Consistently, recent studies have indicated that specific mutational patterns associated with smoking and UV can generate more immunogenic neoantigens, highlighting the importance of utilizing mutational signatures to understand patients’ response to ICB49,50.

One limitation of this study pertains to the syngeneic models we used. These cell lines have relatively high baseline SNV burdens but provided easy isogenic-based comparisons. Additionally, our function-associated signature-based model generates false predictions and can be further improved. Future validation of the POLE/D1 function-associated signature-positive VUSes in yeast, mouse or human cell lines will determine whether they are bona fide functional mutations.

In summary, our study has significant utility for the interpretation of POLE and POLD1 mutations in the setting of ICB treatment. We provide an extensive category of immunologically relevant mutations and a definitive framework for using mutational signatures to understand the functional consequences of these mutations on ICB response and tumor immunology.

Methods

The use of the patient data was approved by the MSKCC Institutional Review Board (IRB). All patients provided informed consent to a Memorial Sloan Kettering IRB-approved protocol. The animal experiments were approved by the institutional animal care and use committee (IACUC) of MSKCC (Protocol# 07-09-015) and the Lerner Research Institute, the Cleveland Clinic Foundation (Protocol# 002467).

Generation of isogenic murine cell lines harboring Pole/Pold1 mutations

B16F10 (CRL-6475), a murine melanoma cell line derived from a spontaneous melanoma tumor that is widely used in immuno-oncology studies51-53, and CT-26 (CRL-2638), an N-nitroso-N-methylurethane (NNMU)-induced murine colon carcinoma cell line that is a well-accepted model for testing the efficacy of immunotherapy52-54, were obtained directly from ATCC prior to experimentation and cultured in RPMI-1640 complete medium with 10% FBS and antibiotics. Female C57BL/6J, Nude or BALB/cJ mice (Jackson Laboratories) aged 6-7 weeks were used for animal experiments. CRISPR-HDR editing was performed according to published methods28,55. CRISPR-Cas9 guide RNAs were designed using the design tool provided by Benchling (Biology Software, 2019). Single-stranded oligodeoxynucleotides (ssODNs) aligned to the non-targeting strands, with 40 nt overhangs on each side of the mutated regions, were designed to introduce the desired non-synonymous mutations, and to introduce synonymous mutations that disrupt potential binding between the guide RNA and the HDR product and that generate new restriction enzyme sites for downstream screening. As multiple guides were tested to generate the mutant cell lines, each ssODN may contain multiple synonymous mutations corresponding to the PAM sites of multiple guides. The ssODNs were then synthesized by IDT with 5’ and 3’ phospho-modification on the first and last nucleotides. For generation of the B16F10 mutant cell lines, briefly, optimized guide RNAs were cloned into pSpCas9(BB)-2A-GFP (PX458), a gift from Dr. Feng Zhang (Addgene). Transient transfection of 2 μg of plasmid and 1 μl of 40 μM ssODN containing the desired mutations and restriction enzyme sites was performed using Genejet-B16F10 (SignaGen) reagent according to the manufacturer’s instructions. GFP-positive live single cells were sorted into single wells containing culture medium 48-72 hours after transfection. 
For the CT-26 mutant cell lines, Alt-R® CRISPR-Cas9 crRNAs containing the optimized guide sequences were ordered from IDT and annealed together with Alt-R® CRISPR-Cas9 tracrRNA, ATTO 550 (IDT) and Alt-R® S.p. Cas9 Nuclease V3 (IDT) to form an RNP complex, which was then co-transfected with 1 μl of 10 μM ssODN template containing the desired mutation and restriction enzyme site using RNAiMAX transfection reagent (Invitrogen). ATTO 550-positive live single cells were sorted into single wells containing culture medium 48 hours after transfection. Single cell-derived colonies were expanded and screened for incorporation of the restriction enzyme cutting sites by direct digestion of PCR products (TseI for PoleP286R alleles, AcuI for PoleV411L alleles, HindIII for Pold1L472P alleles, and NdeI for Pold1E372K alleles). PCR products from clones that showed the desired digestion pattern were further sent for amplicon sequencing (EZ-Amplicon, Genewiz) to validate the incorporation of the functional mutations using the Crispresso v2 package56. Parental cell lines and validated mutant cell lines were cryopreserved at passage 0 weeks (P0w), continuously cultured for 8 weeks in vitro and cryopreserved again at passage 8 weeks (P8w). All DNA oligo sequences are available in Supplementary table 10.

Animal experiments

For mouse syngeneic tumor models, 2 × 10⁵ B16F10 or CT-26 cells in 100 μl of PBS were injected subcutaneously into the left flanks of 6-7-week-old female C57BL/6J, Nude or BALB/cJ mice (Jackson Laboratories). Mice with palpable tumors (20-100 mm³) were randomized into the indicated treatment arms. IgG (2A3, Bioxcel, 100 μg; polyclonal Syrian hamster IgG, 250 μg), anti-PD-1 (RMP1-14, Bioxcel, 100 μg), anti-CTLA4 (9H10, Bioxcel, 250 μg), or a combination of anti-PD-1 and anti-CTLA4 antibodies was administered intraperitoneally in 100 μl of PBS twice weekly (every 3-4 days)7. Mice with ulcerative tumors or intramuscular tumors were excluded from analysis. Tumor volumes were measured twice weekly using calipers and calculated by the formula: ((Length) x (Width)^2)/2. All mouse experiments were repeated at least twice to ensure reproducibility. Mice were euthanized by carbon dioxide prior to necropsy. For survival analysis, the end point of tumor-bearing animals was determined by death, severe health conditions or a tumor volume of at least 1.7 cm³, according to the IACUC-approved animal protocol. The maximal tumor size limit of 2 cm³ was followed, except in cases where the tumors reached the limit between 2 contiguous days on which measurements were made. Every effort was made to abide by the stated limit.
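The caliper-based volume formula and the survival end point above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, caliper readings are assumed to be in millimeters, and the 1.7 cm³ threshold is converted to mm³ for the comparison.

```python
def tumor_volume_mm3(length_mm, width_mm):
    """Tumor volume from caliper readings: (Length x Width^2) / 2."""
    if length_mm < 0 or width_mm < 0:
        raise ValueError("caliper measurements must be non-negative")
    return length_mm * width_mm ** 2 / 2.0


def reached_volume_endpoint(volume_mm3, limit_cm3=1.7):
    """True when the volume meets the survival end point (1 cm^3 = 1000 mm^3)."""
    return volume_mm3 >= limit_cm3 * 1000.0


# Example: a 10 mm x 8 mm tumor is far below the end point.
volume = tumor_volume_mm3(10.0, 8.0)
print(volume, reached_volume_endpoint(volume))
```

Keeping the unit conversion inside one helper avoids mixing mm³ measurements with cm³ protocol limits.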

DNA/RNA extraction of murine tumors

Mouse tumors were trimmed, dissected, and then chopped into small pieces with a blade in Petri dishes. Tumor pieces were further mixed and snap frozen on dry ice to reduce tumor regional heterogeneity. The frozen tissue was used for isolation of DNA and RNA and submitted to the Integrated Genomic Core for library preparation and sequencing. Three replicates per condition were submitted for sequencing.

Flow cytometry analysis

Murine tumors harvested at 14 days post-tumor implantation were dissociated into single cell suspensions using a gentleMACS tissue dissociator and the Miltenyi Mouse Tumor Dissociation Kit according to the manufacturer’s instructions (Miltenyi Biotech). Cell suspensions were stained using the Live-Dead aqua fixable dye (Thermo) followed by surface and intracellular antibodies. Stained single cell suspensions were analyzed using the Fortessa (BD) flow cytometric analyzer. Quantitative data analysis was performed using Flowjo software (Treestar). All experiments were repeated at least twice. Information on the antibodies used for flow cytometry is available in Supplementary table 11. The gating strategy for flow cytometry is available in Supplemental fig 6.

Immunofluorescence and quantification of Cd3+ T cells

Harvested murine tumors were fixed in formalin at 4 °C for 48 hours and transferred to 70% ethanol at 4 °C. Tumors were embedded in paraffin and sectioned onto glass slides. Staining was performed by the Molecular Cytology Core Facility (MCCF) with a Ventana Ultra stainer (Roche Diagnostics). Briefly, following 32 min of heat and CC1 (Cell Conditioning 1, Ventana 950-500) retrieval, the tissue sections were first blocked for 30 min in Background Blocking reagent (Innovex NB306). Anti-mouse Cd45 incubation (0.5 μg/ml, 5h, BD 550539) was followed by incubation with biotinylated rabbit anti-rat IgG (5.75 μg/ml, 1h, Vector BA-4000). Blocker D, Streptavidin-HRP and TSA A488 (Life Tech B40932), prepared according to the manufacturer’s instructions at 1:100 dilution, were applied for 16 min. Then, anti-mouse Cd3 antibody incubation (1.2 μg/ml, 6h, Dako A0452) was followed by incubation with biotinylated goat anti-rabbit IgG (5.75 μg/ml, 1h, Vector labs PK6101). Streptavidin-HRP and TSA CF594 (Biotium 92174), prepared according to the manufacturer’s instructions at 1:2000 dilution, were applied for 16 min. After that, anti-mouse Cd8a antibody incubation (4.8 μg/ml, 6h, Cell Signaling 98941) was followed by incubation with biotinylated goat anti-rabbit IgG (5.75 μg/ml, 1h, Vector labs PK6101). Blocker D, Streptavidin-HRP and TSA Alexa 647 (Life Tech B40958), prepared according to the manufacturer’s instructions at 1:100 dilution, were applied for 16 min. All slides were counterstained in 5 μg/ml DAPI (2-(4-amidinophenyl)-6-indolecarbamidine dihydrochloride; Sigma D9542) for 5 minutes at room temperature, mounted with anti-fade mounting medium Mowiol [Mowiol 4-88 (CALBIOCHEM 475904)] and coverslipped. After staining, slides were scanned on a Pannoramic Scanner (3DHistech) at the MCCF at 40x. For T-cell quantification, at least 5 tumors per group and at least 3 random fields per tumor were generated by CaseViewer v2.4 (3DHistech) and then analyzed in ImageJ v1.5.3.

Whole exome sequencing of murine cell lines

Whole exome sequencing was performed on a Novaseq (Illumina) with an average coverage of 250x. Three replicate cell lines were sequenced. Raw sequencing data were aligned to the GRCm38/mm10 genome build using the Burrows-Wheeler Aligner (BWA) v0.7.1557. Further indel realignment, base-quality score recalibration and duplicate-read removal were performed using the Genome Analysis Toolkit (GATK) v4.1.4.158. SNV and indel callers included MuTect v1.1.459, VarScan v1.160, Strelka v2.9.1061, Mutect2 (part of GATK 4.1.4.1) and somaticSniper v1.0.562; annotations and filters were described in a previous study9. Briefly, we used Ensembl Variant Effect Predictor63 v102 to determine the effect of called variants. Annotated VCF files were converted to Mutation Annotation Format (MAF) with Vcf2Maf v1.6.19 (doi:10.5281/zenodo.593251).

RNA-sequencing, GSEA, and TCR analysis

Three biological replicates per condition were submitted for sequencing depending on RNA quality. RNAseq raw read sequences were aligned against mouse genome assembly mm10 (Dec.2011/GRCm38, https://genome.ucsc.edu/cgi-bin/hgGateway?db=mm10) by STAR 2-pass alignment64. RNAseq gene level count values were computed using the R package GenomicAlignments65 over aligned reads with UCSC KnownGene66 in mm10 as the base gene model. The Union counting mode was used and only mapped paired reads were considered. Fragments per kilobase of transcript per million mapped reads (FPKM) values were then computed from gene level counts using the fpkm function from the R package DESeq2 v1.30.167. GSEA analysis was performed on DESeq2 outputs using the pre-ranked GSEA module with default settings in GSEA v4.1.0 for Windows or on the Genepattern server (Broad Institute)68. Pathway enrichment of the consensus DEGs was performed using the R package clusterProfiler v3.18.169. Immune cell type signature enrichment was performed using xCell v1.133. PCA analysis was performed with the R package PCAtools v2.2.0, with the 10% of variables with the lowest variance removed from the analysis. PERMANOVA analysis was performed using the R package vegan v2.5.7. TCR-beta CDR3 sequences were extracted with MixCR v3.0.18 with default settings70. Calculation of the Chao1 index, richness, evenness and clonality was performed as previously described71.
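The TCR repertoire metrics named above follow the cited reference, whose exact formulas are not reproduced in this text. As a hedged illustration, the sketch below uses common textbook definitions, which are assumptions here: the Chao1 estimator (with the bias-corrected form when no doubletons are present), Pielou's evenness, and clonality defined as 1 minus evenness. Function names are hypothetical.

```python
import math


def chao1(clone_counts):
    """Chao1 richness estimate from per-clonotype read counts."""
    counts = [c for c in clone_counts if c > 0]
    observed = len(counts)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    if doubletons > 0:
        return observed + singletons ** 2 / (2.0 * doubletons)
    # Bias-corrected form used when there are no doubletons.
    return observed + singletons * (singletons - 1) / 2.0


def pielou_evenness(clone_counts):
    """Shannon entropy divided by its maximum, ln(number of clonotypes)."""
    counts = [c for c in clone_counts if c > 0]
    if len(counts) < 2:
        return 0.0  # assumption: a single clonotype is treated as fully clonal
    total = float(sum(counts))
    entropy = -sum((c / total) * math.log(c / total) for c in counts)
    return entropy / math.log(len(counts))


def clonality(clone_counts):
    """Clonality as 1 - evenness (assumed definition)."""
    return 1.0 - pielou_evenness(clone_counts)


example = [1, 1, 2, 4]  # hypothetical CDR3 clonotype counts
print(chao1(example), pielou_evenness(example), clonality(example))
```

Richness in this framing is simply the number of observed clonotypes, which Chao1 extends by estimating unseen ones from the singleton and doubleton counts.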

Mutational signature analysis on whole exome sequencing and MSK-IMPACT panel sequencing data

Mutation data were converted to trinucleotide context matrices using SigProfilerMatrixGeneratorR v0.1.072 and were limited to exome regions. De novo NMF prediction of mutational signatures and transcriptional strand-based mutational signature analysis were performed using the R package MutationalPatterns v1.8.073. NNLS analysis was performed using the R package deconstructSigs v1.9.074. SNVs detected in parental and mutant cell lines before 8 weeks of in vitro culturing were defined as baseline SNV mutations. SNVs detected only after 8 weeks of in vitro culturing were defined as de novo mutations. For MSK-IMPACT panel and mouse mutational signature analysis, the trinucleotide context matrix was first normalized based on the abundance of each trinucleotide context category in the MSK-IMPACT regions and the mouse and human exome regions. The normalized matrices were then used for downstream NMF and NNLS analysis.
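The abundance-based normalization described above can be illustrated with a minimal sketch. The exact scaling used in the study is not given in this text; the version below assumes each count is multiplied by the ratio of the trinucleotide's frequency in a reference region (for example, the human exome) to its frequency in the sequenced region (for example, a targeted panel). All names and numbers are hypothetical.

```python
def rescale_context_counts(counts, region_abundance, reference_abundance):
    """Rescale mutation counts so contexts are comparable across regions.

    counts: {(trinucleotide, substitution): observed SNV count}
    region_abundance / reference_abundance: {trinucleotide: occurrences}
    """
    region_total = float(sum(region_abundance.values()))
    reference_total = float(sum(reference_abundance.values()))
    rescaled = {}
    for (trinucleotide, substitution), n in counts.items():
        region_freq = region_abundance[trinucleotide] / region_total
        reference_freq = reference_abundance[trinucleotide] / reference_total
        # Assumed scaling: up-weight contexts that are rarer in the
        # sequenced region than in the reference region.
        rescaled[(trinucleotide, substitution)] = n * reference_freq / region_freq
    return rescaled


# Hypothetical toy input with two contexts.
counts = {("TCT", "C>A"): 10, ("TCG", "C>T"): 10}
panel = {"TCT": 1, "TCG": 3}
exome = {"TCT": 2, "TCG": 2}
print(rescale_context_counts(counts, panel, exome))
```

Without a step of this kind, signatures fitted on panel data would be biased toward contexts that happen to be over-represented in the panel's target regions.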

Logistic regression model training on TCGA and MSK-IMPACT data

A priori, we classified samples as having a known functional mutation using a baseline list of reported functional mutations drawn from prior work classifying mutations based on the ultramutator phenotype12 and from an FDA-approved database of functional mutations, the OncoKB database35, which contains functional mutations curated from the literature. We then used this list to determine whether these samples harbored at least one POLE/D1 functional mutation. Logistic regression was applied using the R glm function with the equation ‘functional status ~ SBS10a + SBS10b + SBS14 + SBS20’. For the WES dataset, only samples with at least 20 SNVs were used for generating the WES-model. For the MSK-IMPACT dataset, only samples with at least 10 SNVs were used for generating the IMPACT-model. Weights of the functional and wild-type sample classes were calculated based on the percentage of functional samples and wild-type samples in the dataset. Predictions from the logistic regression model were further analyzed. Cutoff points were obtained using the ROCit v2.1.1 package. Confusion matrices, accuracy and ROC curves were obtained using the caret v6.0-86, PRROC v1.6.0 and pROC (v1.17.01) packages.
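The study obtained cutoff points with the ROCit R package and reports an optimal Youden Index point for the trained model. As a hedged illustration of that idea only, and not of the ROCit implementation, the sketch below scans candidate thresholds over predicted probabilities and returns the one that maximizes J = TPR - FPR. The function name and the toy data are hypothetical.

```python
def youden_cutoff(probabilities, labels):
    """Return (cutoff, J), where J = TPR - FPR is maximal.

    probabilities: predicted probability of the functional class
    labels: 1 for known functional samples, 0 for wild-type samples
    A sample is called positive when its probability >= cutoff.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes are required")
    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(probabilities)):
        tp = sum(1 for p, y in zip(probabilities, labels) if p >= cutoff and y == 1)
        fp = sum(1 for p, y in zip(probabilities, labels) if p >= cutoff and y == 0)
        j = tp / positives - fp / negatives
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical model outputs for five samples.
probs = [0.1, 0.2, 0.8, 0.9, 0.4]
truth = [0, 0, 1, 1, 0]
print(youden_cutoff(probs, truth))
```

Ties are resolved in favor of the lowest threshold reaching the maximum, which is one possible convention among several.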

MSK-IMPACT immunotherapy cohort, survival and response analysis

The MSK-IMPACT immunotherapy cohort was assembled using an approach similar to that described previously75,76. Briefly, after informed patient consent and approval by the MSK Institutional Review Board (IRB), we identified patients who received anti-PD-1/PD-L1 therapy and underwent MSK-IMPACT targeted panel sequencing at MSK. In addition, we minimized the impact of potential survival bias due to left truncation (defined here as a type of selection bias that results from studying only patients who have survived long enough on ICB therapy to receive MSK-IMPACT sequencing) by limiting the cohort to patients who received their first dose of immunotherapy between January 2015 and July 2018, when MSK-IMPACT was already routinely performed. Patients with more than one cancer type and cancer types with fewer than 10 patients were excluded. MSI-H patients, determined by having an MSIsensor score >= 10, were also excluded. Survival data were last updated in July 2020, and overall survival was calculated as the time between the start of first anti-PD-1/anti-PD-L1 therapy and death, or censored at last contact. Kaplan–Meier survival analysis was performed with the R survival package v3.2.10, and hazard ratios (HR) and log-rank p values were calculated using the coxph analysis with cancer type correction; results were reported and presented using the R survminer package v0.4.9. Multi-variable coxph analysis was also performed using the survminer package. Known POLE/D1 functional mutation-positive patients were defined as patients harboring at least one functional mutation from the baseline functional list used for generating the logistic prediction model. POLE/D1 function-associated signature-positive VUS patients were identified as patients 1) harboring at least one somatic POLE/D1 mutation; 2) harboring at least 10 SNVs for relatively reliable mutational signature analysis; and 3) classified as functional samples by the IMPACT logistic regression model. 
TMB was calculated as MSK-IMPACT filtered non-synonymous mutation counts divided by the length of the corresponding panel version77. FGA (fraction genome altered; aka, fraction copy-number altered) was calculated as the length of FACETS78,79 segments with |cnlr.median.clust| >= 0.2 (i.e., segments with an absolute log2 copy-number ratio of at least 0.2) divided by the total length of all segments. For the single cancer type specific coxph model, non-small cell lung cancer (NSCLC) was defined as the union of lung adenocarcinoma, lung squamous cell carcinoma, poorly differentiated non-small cell lung cancer and other non-small cell lung cancer cases. The response to immunotherapy was categorized based on RECIST v1.1 criteria80. When formal RECIST reads were not available, physician notes and imaging studies were reviewed to categorize the overall best response for each patient using the same criteria based on change in the sum of diameters of target lesions. Patients with CR, PR or SD for at least 6 months were classified as clinical benefit; SD of less than 6 months and PD were classified as no clinical benefit. PFS was calculated from the first ICB infusion to disease progression or death of any cause; patients without progression were censored at the last attended appointment at MSKCC with any clinician. For estimating the CR patient tumor sizes, tumor length and width were retrieved from clinical data and tumor volume was estimated using the formula: ((Length) x (Width)^2)/2. In silico mutation effect prediction results were obtained from dbNSFP4.2a81. C-index calculation and associated statistical analysis were performed as described previously82 using survcomp v1.40.0. Heatmaps were drawn with the ComplexHeatmap package v2.6.283.
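The TMB and FGA definitions above reduce to two short ratios, sketched below under stated assumptions: panel length is given in base pairs and TMB is expressed per megabase (the unit is an assumption here), and each FACETS segment is represented by a hypothetical (start, end, cnlr_median_clust) tuple rather than by actual FACETS output objects.

```python
def tmb_per_mb(nonsynonymous_count, panel_length_bp):
    """Non-synonymous mutation count per megabase of panel sequence."""
    if panel_length_bp <= 0:
        raise ValueError("panel length must be positive")
    return nonsynonymous_count * 1e6 / panel_length_bp


def fraction_genome_altered(segments, threshold=0.2):
    """Length of segments with |cnlr| >= threshold over total segment length.

    segments: iterable of (start, end, cnlr_median_clust) tuples
    """
    segments = list(segments)
    total = sum(end - start for start, end, _ in segments)
    if total <= 0:
        raise ValueError("segments must cover a positive length")
    altered = sum(end - start for start, end, cnlr in segments
                  if abs(cnlr) >= threshold)
    return altered / total


# Hypothetical inputs: 12 mutations on a 1.2 Mb panel, three segments.
print(tmb_per_mb(12, 1_200_000))
print(fraction_genome_altered([(0, 100, 0.05), (100, 150, -0.3), (150, 200, 0.2)]))
```

Using the absolute value means that both gains and losses count toward the altered fraction, matching the |cnlr.median.clust| condition in the text.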

SBS signature SNV classes/AA alteration classes association, hydrophobicity and polarity calculation

To calculate the association between a mutational signature and SNV classes or amino acid alteration classes, we first calculated the association of each SNV with this mutational signature, based on the mutational signature profile extracted from the sample the SNV belongs to and the probability that this mutational signature generates mutations with the same trinucleotide context as this SNV (taken from the SBS trinucleotide context matrix). We then assigned each SNV to an SNV class or amino acid alteration class, and summed the probabilities of all SNVs in each class. Finally, we normalized to the total probability of all classes for this mutational signature to obtain the proportion of probability of each SNV or amino acid alteration class. For neoantigen-related analysis, only samples with at least 1 known binder neoantigen were included in the analysis. Hydrophobicity changes were calculated using the Kyte–Doolittle numeric hydrophobicity scale84. Polarity changes were calculated using the Grantham polarity index85.
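Both calculations in this section can be sketched briefly. The hydrophobicity part uses the published Kyte-Doolittle values and defines the change as mutant minus wild-type. The association part is one possible reading of the description and is an assumption: each SNV's probability of arising from a signature is taken as the sample's exposure to that signature times the signature's probability for the SNV's context, normalized across signatures, then summed per class and normalized across classes. Function names and input layout are hypothetical.

```python
# Kyte-Doolittle hydropathy scale (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def delta_hydrophobicity(wild_type_aa, mutant_aa):
    """Hydrophobicity change of a substitution (mutant minus wild-type)."""
    return KYTE_DOOLITTLE[mutant_aa] - KYTE_DOOLITTLE[wild_type_aa]


def class_proportions(snvs, signature_profiles, signature):
    """Share of a signature's summed SNV probability falling in each class.

    snvs: list of dicts with keys "aa_class", "context" and "exposures"
          (exposures = signature contributions in the SNV's own sample)
    signature_profiles: {signature: {context: probability}}
    """
    class_totals = {}
    for snv in snvs:
        weights = {
            name: snv["exposures"].get(name, 0.0) * profile.get(snv["context"], 0.0)
            for name, profile in signature_profiles.items()
        }
        denominator = sum(weights.values())
        if denominator == 0.0:
            continue  # no signature explains this context; skip the SNV
        probability = weights.get(signature, 0.0) / denominator
        key = snv["aa_class"]
        class_totals[key] = class_totals.get(key, 0.0) + probability
    grand_total = sum(class_totals.values())
    if grand_total == 0.0:
        return {}
    return {key: value / grand_total for key, value in class_totals.items()}


# D-to-Y raises hydrophobicity, while E-to-K slightly lowers it.
print(delta_hydrophobicity("D", "Y"), delta_hydrophobicity("E", "K"))
```

On this scale the D-to-Y and R-to-I substitutions mentioned for SBS10a give positive changes, whereas the E-to-K substitution mentioned for SBS2 gives a small negative change.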

Statistics & Reproducibility

All statistical details of the experiments, including the statistical tests used, number of samples, definition of center, dispersion, precision measures and how statistical significance was determined, can be found in the figure legends. For mouse tumor experiments, sample size was based on our previous publications7,9. No statistical method was used to predetermine sample size. Mice with severely necrotic or intramuscular tumors were euthanized based on our approved IACUC protocol and thus excluded from analysis. No other data were excluded from the analyses. No statistical method was used to predetermine sample size for other experiments. Randomization was performed in mouse experiments between treatment groups. Where possible, investigators were blinded to allocation during objective outcome assessment, including tumor size measurement, immunofluorescence quantification, and clinical immunotherapy response assessment. For generating patient sub-cohorts with matched N, median and mean TMB, wild-type patients were randomly selected in R 4.0.3 with the random seed 12345. For mouse tumor and flow cytometry experiments, statistical analyses were performed using Graphpad Prism v7.00. All other experiments and analyses were performed in RStudio v1.3.1093 with R v4.0.3.

Extended Data

Extended data figure 1. Mouse tumors harboring Pole/d1 functional mutations are sensitive to immunotherapy.

a, Strategy to introduce PoleP286R into murine cell lines. The P286R nonsynonymous mutation was introduced into endogenous Pole genes via CRISPR-HDR technique. b, Scheme of in vitro culture and WES (whole exome sequencing) of parental and PoleP286R mutant cell lines. After validated by amplicon sequencing, single cell clone derived PoleP286R mutant cell lines (clone 1) were subject to WES sequencing. The cell lines were further cultured for another 8 weeks, cryopreserved and subjected to WES sequencing again. c, Total SNV (single nucleotide variant) counts of the PoleP286R mutant and parental cell lines before and after 8 weeks of in vitro culture (N=3 biological replicates). P values indicate two-sided Student’s t-tests. Data are presented as mean values ± s.e.m. (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005). d, Growth curve of the B16F10 PoleP286R clone2 cell line with ICB therapies (N=15 mice per group). P values (anti-CTLA4, P=0.0003; anti-PD1, P=0.0002; Combo, P= 0.0001). e, Survival analysis of mice bearing the B16F10 parental (anti-CTLA4, P=0.14; anti-PD1, P=0.13; Combo, P= 0.0003), the B16F10 PoleP286R (anti-CTLA4, P<0.0001; anti-PD1, P<0.0001; Combo, P<0.0001) or the B16F10 PoleP286R clone2 tumors (anti-CTLA4, P<0.0001; anti-PD1, P<0.0001; Combo, P<0.0001) after ICB (N=15 mice per group). P values indicate log rank test significance (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005, **** P<0.0001). f, Growth curves of the CT26- PoleP286R clone2 cell line with ICB therapies (N=15 mice per group). P values (CTLA4 P=1.1e-8, PD-1 P=0.0015, Combo P=3.2e-9). g, Growth curves of the CT-26 PoleWT single cell clone 1 and clone 2 tumors with IgG or anti-PD1 therapy (N=15 mice per group). P values (Clone 1 P=0.023, Clone2 P=0.053). h, Quantification of tumor inhibition rates of the CT-26 parental, CT-26 PoleWT single clones and CT-26 PoleP286R clone 1 and clone 2 tumors with anti-PD1 therapy at the last time point. 
Tumor inhibition rate was calculated as percentage of reduced tumor volume compared to the IgG treated tumors (N=15). Dots represent individual biological replicates. P values indicate two-sided Student’s t-tests (n.s., no statistical significance, *P<0.05, ** P<0.01, *** P<0.005). Data are presented as mean values +/− SEM. i, Growth curves of the B16F10 PoleV411L clone2 with anti-PD1 therapy (N=15 mice per group, P=0.0001). For all growth curves related panels (d, f, g, i), P values indicate two-sided Student’s t-tests at the end time points (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005). Data are presented as mean values ± s.e.m.. No multiple comparisons adjustment was performed.

Extended data figure 2. The baseline immune microenvironment of the Pole mutant tumors.

a, Heatmap of 1298 DEGs (differentially expressed genes) from RNA-seq analysis of the B16F10 parental and PoleP286R mutant tumors 14 days post implantation. Color scale indicates normalized z-score. b, GSEA (gene set enrichment analysis) indicating enrichment of gene sets related to interferon gamma response, T cell and NK cell activation, inflammation, antigen presenting pathway and PD1 signaling in the mutant tumors versus parental tumors. c, Heatmap showing DEGs (FDR P <= 0.05) between parental and mutant tumors from the Hallmark interferon gamma response pathway, PID CD8 TCR pathway and KEGG natural killer cell mediated cytotoxicity pathway. d, Ptprc TPM of parental and mutant tumors from the RNA-seq data shown in Fig. 2c (N=3 biological replicates). P=0.0008 by two-sided Student’s t-test (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005). Data are presented as mean values +/− SEM. e, Flow cytometry analysis of the percentage of Cd8 T cells expressing Pd-1 (P=0.026), Tigit (P=0.48) and Lag3 (P=0.78) in the parental and mutant tumors 14 days post implantation (N=6 biological replicates). P values indicate two-sided Student’s t-tests (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005). Data are presented as mean values +/− SEM. f, Flow cytometry analysis of expression intensity of Pd-l1 (P=0.045), Cd204 (P=0.043) and Cd206 (P=0.0087) on tumor associated macrophages in the parental and mutant tumors 14 days post implantation (N=6 biological replicates). P values indicate two-sided Student’s t-tests at the end time points (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005). Data are presented as mean values +/− s.e.m.

Extended data figure 3. Immune microenvironment of the post-treatment Pole mutant tumors.

a, Differentially expressed genes up-regulated or down-regulated in the post-ICB PoleP286R tumors versus the parental tumors. There are 82 DEGs that are consistently upregulated in all immune checkpoint blockade (ICB) treatment arms, while 59 genes are consistently downregulated in the mutant tumors of all ICB arms. b, Top 10 KEGG pathways that were significantly enriched in the consistently up-regulated DEGs. c, Top 10 KEGG pathways that were enriched in the consistently down-regulated DEGs across all ICB arms. No pathway is statistically significantly enriched, as determined by q value <0.05. d, GSEA analysis of PoleP286R tumors in the combo arm versus the IgG arm. Only pathways with nominal p value <0.05 are shown. e, GSEA plot of enriched gene sets related to inflammatory response in the combo ICB arm versus the IgG arm of the PoleP286R tumors.

Extended data figure 4. Both adaptive and innate immune cell types contribute to the distinct immune profiles of the post-treatment Pole mutant tumors, compared to that of the parental tumors.

a, Heatmap of immune-cell-type-signature enrichment in the post-ICB samples. Color scale indicates normalized enrichment scores. b, Scree plot of the principal component analysis. Bars indicate the explained variation for each PC. The red line and dots indicate the cumulative explained variation from PC1 to each subsequent PC. c, Contribution of each immune-cell-type-signature to PC1. Red indicates that the enrichment score of the immune-cell-type-signature is aligned in the same direction as the PC1 axis; blue indicates that the enrichment score of the immune-cell-type-signature is aligned in the opposite direction to the PC1 axis. d, Normalized enrichment score of monocytes and macrophages in post-ICB tumors (N=3 biological replicates). P values (monocytes P=3.98e-5; macrophage P=4.1e-6) were derived from two-way ANOVA test. The minima (0% percentile), maxima (100% percentile) were plotted as the whiskers, 25% percentile and 75% percentile were plotted as the bounds of the boxes, medians were plotted as the center bar. e, Normalized enrichment score of Treg (regulatory T cell) and NKT cells in post-immunotherapy tumors (N=3 biological replicates). P values (Treg P=0.030; NKT P=0.022) were derived from two-way ANOVA test. i, z-score transformed fraction of the TCR-beta CDR3 clone types in the post-treated parental and PoleP286R tumors. Note that no CDR3 clone type was successfully extracted from two of the parental tumors treated with IgG. f, Chao1 index and richness score of the TCR-beta CDR3 sequence of post-treated tumors (N=3 biological replicates). g-h, Chao1, richness, evenness and clonality scores of the TCR-beta CDR3 sequences of post-treated tumors (N=3 biological replicates). P values (Chao1 P=0.012; Richness P=0.051; Evenness P=0.26; Clonality P=0.26) were derived from two-way ANOVA test. 
For all boxplots (d-e, g-h), the minima (0% percentile), maxima (100% percentile) were plotted as the whiskers, 25% percentile and 75% percentile were plotted as the bounds of the boxes, medians were plotted as the center bar.

Extended data figure 5. Mutation signatures of the Pole mutant cell lines.

a, Schematic explanation of the baseline and de novo SNVs in parental and mutant cell lines. b, Transcriptional strand bias (TSB) of the six base substitution categories of the baseline and de novo mutations from the B16F10 parental and PoleP286R mutant cell lines. De novo mutations in the parental and the PoleP286R mutant cell lines showed distinct transcriptional strand bias, indicating that these mutations are generated by different biological processes. c, The 192 TSB base substitutions in trinucleotide sequence contexts of the three NMF extracted de novo TSB mutational signatures from the baseline and de novo SNVs of the B16F10 parental and PoleP286R mutant cell lines. d, Contribution of the three NMF extracted de novo TSB mutational signatures to the baseline and de novo mutations in the B16F10 parental and PoleP286R mutant cell lines, showing that TSB-Sig.B contributes exclusively to the de novo mutations discovered in the B16F10 PoleP286R mutant cell lines. e, Cosine similarity of the de novo mutational TSB signatures with COSMIC TSB-SBS signatures v3. TSB-Sig.B is highly similar to the POLE/D1 functional signature TSB-SBS10b (cosine similarity of 0.84). f, COSMIC SBS signatures extracted from B16F10 parental and PoleP286R mutant cell lines by the NNLS method. The POLE/D1 functional signature SBS10b can only be extracted from the de novo SNVs of the PoleP286R mutant samples.

Extended data figure 6. Statistical models based on functional mutational signatures can be used to identify tumors with POLE/D1 functional mutations.

a, Sample summary of the TCGA data set. Wild-type, tumors are wild-type for POLE/D1 (POLE or POLD1); Functional, tumors harboring at least one known POLE/D1 functional mutation; Mutated, tumors with only POLE/D1 variants of unknown significance (VUS); SNV, total counts of SNVs in the exome region of the tumors. b, The optimal Youden Index point and corresponding probability cutoff value on the logistic regression model trained on the TCGA training set. TPR, True Positive Rate; FPR, False Positive Rate. c, Sample summary of the ICGC/CCLE test set. d, Heatmap of the non-negative least squares (NNLS) extracted COSMIC SBS signatures of false negative predictions in the TCGA training set and ICGC/CCLE test set. e, Reconstitution accuracy of the non-negative matrix factorization (NMF) extracted signatures on SNVs from the TCGA samples with POLE/D1 functional mutations. An accuracy threshold of 0.7 was used to determine the reliability of the reconstitution. f, Cosine similarity of the three NMF extracted mutational signatures from the TCGA tumor samples with known POLE/D1 functional mutations to the COSMIC SBS signatures. COSMIC SBS signatures were clustered based on their associated biological processes. POLE/D1, SBS mutational signatures associated with POLE/D1 functional mutations; MMRd, SBS mutational signatures related to mismatch repair deficiency; Clock-like, SBS mutational signatures related to clock-like mutational processes; Sequencing artifacts, SBS signatures possibly generated by sequencing artifacts; Other signatures, SBS signatures associated with all other biological processes. g, Contribution of the three NMF extracted de novo mutational signatures to the TCGA samples with known POLE/D1 functional mutations in the training set, with the known functional mutation in each sample labeled. MSIscore, MSI sensor score. TMB, SNV count in the exome region by WES sequencing. 
Functional mutation, whether the functional mutation in the samples belongs to POLE or POLD1 mutations. False negative, samples harboring known POLE/D1 functional mutations in the TCGA training set are predicted as non-functional mutation samples by the logistic regression model. h, Contribution of the three NMF extracted de novo mutational signatures from TCGA samples with known POLE/D1 functional mutations in the corresponding samples in the ICGC/CCLE test set. Cluster 1,2&3 corresponding to Cluster 1,2&3 in (g). i, Fisher exact test on the MSI status of the TP and FN samples from the WES training and test sets when MSI status is available. j. Tumor allele frequencies of the POLE/D1 functional mutations from the false negative predictions (FN, N=5), True positive predictions (TP, N=77) from the TCGA and ICGC WES cohorts, as tumor allele frequency is not available for some of the functional mutations in these samples. P value was calculated with two-sided Wilcoxon Rank Sum Test. k, Distribution of the false positive prediction (FP) samples from the WES training set based on SNV count/Mb exome. The green dash line indicates cutoff for SNVlow (3 SNV/Mb exome), the blue dash line indicates cutoff for SNVint/hi (10 SNV/Mb exome) and the red dash line indicates cutoff for SNVhyper (50SNV/Mb exome). l, Distribution of the TP samples (top plot) or VUS samples (bottom plot) that were predicted as functional mutation samples from the WES training model based on SNV count/Mb exome. The green dash lines indicate cutoff for SNVlow (3.6 SNV/Mb exome), the blue dash lines indicate cutoff for SNVint/hi (10 SNV/Mb exome) and the red dash lines indicate cutoff for SNVhyper (50SNV/Mb exome). m. Unsupervised clustering of the SNVint/hi FP samples from the TCGA training set based on the extracted COSMIC SBS signatures.
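The cutoff selection in panel (b) can be illustrated with a minimal sketch. It assumes Youden's J is computed as TPR minus FPR at each candidate probability cutoff and that the cutoff with the largest J is retained. The function name `youden_cutoff` and the toy labels and probabilities are hypothetical and are not taken from the authors' deposited code.

```python
def youden_cutoff(labels, probs):
    """Return (cutoff, J, TPR, FPR) for the cutoff maximizing J = TPR - FPR."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")
    best = None
    # Every distinct predicted probability is a candidate cutoff; a sample
    # is called positive when its probability is >= the cutoff.
    for cutoff in sorted(set(probs)):
        tp = sum(1 for y, p in zip(labels, probs) if y == 1 and p >= cutoff)
        fp = sum(1 for y, p in zip(labels, probs) if y == 0 and p >= cutoff)
        tpr = tp / positives
        fpr = fp / negatives
        j = tpr - fpr
        # Strict comparison keeps the lowest cutoff when several tie.
        if best is None or j > best[1]:
            best = (cutoff, j, tpr, fpr)
    return best


# Hypothetical toy data: 1 = known functional mutation, 0 = wild-type.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
probs = [0.05, 0.10, 0.20, 0.60, 0.40, 0.70, 0.80, 0.95]
cutoff, j, tpr, fpr = youden_cutoff(labels, probs)
print(f"cutoff={cutoff:.2f} J={j:.2f} TPR={tpr:.2f} FPR={fpr:.2f}")
```

When several cutoffs tie on J, this sketch keeps the lowest one. That tie-breaking rule is a choice made here for determinism and is not stated in the legend.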

Extended data figure 7. Statistical models based on functional mutational signatures predict functional mutations and associated immune features.

a, Proportion of functional-related COSMIC SBS signatures extracted from TCGA WES data or TCGA-IMPACT panel simulation data in each sample in the training set. Pearson correlation coefficients are shown. b, Confusion table of the WES model applied to the TCGA-IMPACT cohort. Accuracy, sensitivity and specificity were calculated and are presented. Pred. wild-type, samples that were predicted as wild-type for POLE/D1 (POLE or POLD1) functional mutations; Pred. functional, samples that were predicted as harboring POLE/D1 functional mutations. c, Scheme of training a logistic regression model to identify tumors containing known POLE/D1 functional mutations from MSK-IMPACT targeted panel sequencing data. d, Sample summary of the MSK-IMPACT training set. e, Optimal Youden Index point and corresponding probability cutoff value on the logistic regression model trained on the MSK-IMPACT training set. f, Heatmap of the NNLS-extracted COSMIC SBS mutational signatures of the false negative predictions from the MSK-IMPACT training set. POLE/D1, SBS mutational signatures associated with POLE/D1 functional mutations; MMRd, SBS mutational signatures related to mismatch repair deficiency; Clock-like, SBS mutational signatures related to clock-like mutational processes; Sequencing artifacts, SBS signatures possibly generated by sequencing artifacts; Other signatures, SBS signatures associated with all other biological processes. g, Coefficients of the four POLE/D1 functional-associated mutational signatures in the WES-trained logistic regression model and the IMPACT panel-trained model. h, Immune infiltration score and CYT score (log10 transformed) of the samples with known POLE/D1 functional mutations (N=53) and POLE/D1 variant of unknown significance (VUS) samples with functional signatures (N=7) compared to the POLE/D1 functional mutation/signature-negative tumors (N=520) from the TCGA endometrial cohort.
Samples with known POLE/D1 functional mutations, tumors harboring known POLE/D1 functional mutations; VUS samples with functional signatures, samples harboring POLE/D1 VUSes that were positive for POLE/D1 functional signatures as predicted by the functional signature-based model; POLE/D1 functional mutation/signature-negative tumors, wild-type samples or samples harboring POLE/D1 VUSes that did not show the POLE/D1 functional signature. P values were calculated with a two-sided Wilcoxon Rank Sum Test. The minima (0% percentile) and maxima (100% percentile) were plotted as the whiskers, the 25% and 75% percentiles were plotted as the bounds of the boxes, medians were plotted as the center bar and means were plotted as the center black dot. i, Scree plot of the principal component analysis on the TCGA endometrial cohort showing how much variation could be explained by each principal component (PC). Bars indicate the variation explained by each PC. The red line and dots indicate the cumulative variation explained from PC1 up to each PC. j, Sample separation plot of the three groups of samples in (h) based on the first two PCs of the above PCA; P values were calculated with the permutational multivariate analysis of variance (PERMANOVA) test. (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005)
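As a rough illustration of the quantities reported with the confusion table in panel (b), the sketch below derives accuracy, sensitivity and specificity from paired truth and prediction labels. The function name `confusion_metrics` and the toy labels are hypothetical; the actual counts are those shown in the figure, not these.

```python
def confusion_metrics(truth, predicted):
    """Summarize a 2x2 confusion table for binary labels (1 = functional)."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in the truth labels")
    return {
        "TP": tp, "FN": fn, "TN": tn, "FP": fp,
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical toy labels: 1 = functional mutation, 0 = wild-type.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = confusion_metrics(truth, predicted)
print(metrics)
```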

Extended data figure 8. Immune features and the outcome of patients with POLE/D1 functional mutations/signatures.

a, Enrichment scores of the immune cell types that are significantly upregulated or downregulated in the POLE/D1 (POLE or POLD1) functional mutation/signature-positive tumors and FP (false positive) prediction wild-type tumors compared to the POLE/D1 functional mutation/signature-negative tumors (POLE/D1 functional mutation/signature-positive tumors N=60, POLE/D1 functional mutation/signature-negative tumors N=520, FP prediction wild-type tumors N=6; also see Fig. 5d). POLE/D1 functional mutation/signature-positive tumors, tumors that either harbored known POLE/D1 functional mutations, or only harbored POLE/D1 variants of unknown significance (VUS) and were positive for POLE/D1 functional signatures; POLE/D1 functional mutation/signature-negative tumors, samples that were predicted as wild-type samples by the functional signature-based model, regardless of the POLE/D1 mutation status; FP prediction wild-type tumors, wild-type samples predicted as POLE/D1 functional signature-positive (i.e., false positive) by the logistic regression model. P values (CD4 Tem P=0.0042; Th1 P=6.2e-5; Th2 P=1.6e-9; Eosinophils P=0.027; Macrophages P=4.8e-5; Memory B-cell P=0.036). b, Log-fold changes of the enrichment scores of different immune cell types from the human tumors (POLE/D1 functional mutation/signature-positive tumors versus POLE/D1 functional mutation/signature-negative tumors) and mouse tumors (PoleP286R baseline tumors versus parental baseline tumors). Red color indicates cell types that are consistently upregulated or downregulated with P<0.05 for both human and mouse tumor comparisons. c, Chao1 and clonality index of the TCR-beta CDR3 repertoires from the POLE/D1 functional mutation/signature-positive tumors (N=59), POLE/D1 functional mutation/signature-negative tumors (N=463), and FP prediction wild-type tumors (N=5) of the TCGA endometrial cohort for which TCR-beta CDR3 repertoire data are available. P values (Chao1 P=8.5e-5, clonality P=0.045).
d, COSMIC SBS signature profiles of the 24 POLE/D1 functional mutation/signature-positive patients in the ICB cohort. e-f, Comparison of the TMB (e) and copy number alterations (f) between the POLE/D1 functional mutation/signature-positive patients, other POLE/D1 mutated patients, and wild-type patients (N=24, 148 and 2528 patients, respectively). TMB, tumor mutational burden, non-synonymous mutation count/Mb IMPACT panel exome region (P=2.2e-5; P=0.0021). FGA, fraction of genome copy number alteration (P<2.2e-16; P=0.0095). g, Kaplan-Meier overall survival plot of the patients identified as POLE/D1 functional mutation/signature-positive by the MSK-IMPACT logistic regression model versus the histology-matched POLE/D1 wild-type patients. Log-Rank P value and hazard ratio shown were calculated from a coxph model with cancer type correction. h, Forest plot of the POLE/D1 functional mutations/signatures in coxph models of overall survival after immunotherapy with cancer type correction, for pan-cancer or single cancer type categories that have at least three POLE/D1 functional mutation/signature-positive patients. Number of POLE/D1 functional mutation/signature-positive patients, number of wild-type patients, hazard ratio and Log-Rank P value are shown for each cancer type category in the figure. Horizontal bars represent the 95% confidence interval for the hazard ratios. Each line indicates an individual coxph model generated for the indicated cancer type category. Error bar centres indicate the hazard ratio for each individual Cox model. Statistical significance was evaluated with a two-sided test. (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005). i, Estimated tumor size change and composition of the mutational signatures for one of the CR patients who harbored the POLEP286R functional mutation. Composition of the mutational signatures and MRI images (pre-ICB and at different time points post-ICB; green marks indicate tumor location and size) with the estimated tumor volume curve are shown.
For all boxplots (a, c, e), P values were calculated from Wilcoxon Rank Sum Test (n.s., no statistical significance, * P<0.05, ** P<0.01, *** P<0.005). The minima (0% percentile), maxima (100% percentile) were plotted as the whiskers, 25% percentile and 75% percentile were plotted as the bounds of the boxes, medians were plotted as the center bar.
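The two repertoire metrics in panel (c) can be sketched as follows. The legend does not give the exact formulas, so this assumes the bias-corrected Chao1 richness estimator and a clonality defined as one minus normalized Shannon entropy, which are common conventions. The function names and the toy clonotype counts are hypothetical.

```python
import math


def chao1(clone_counts):
    """Bias-corrected Chao1 richness: S_obs + F1*(F1-1) / (2*(F2+1))."""
    counts = [c for c in clone_counts if c > 0]
    s_obs = len(counts)                      # observed clonotypes
    f1 = sum(1 for c in counts if c == 1)    # singletons
    f2 = sum(1 for c in counts if c == 2)    # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def clonality(clone_counts):
    """1 - normalized Shannon entropy; 0 = even repertoire, 1 = monoclonal."""
    counts = [c for c in clone_counts if c > 0]
    if not counts:
        raise ValueError("repertoire is empty")
    if len(counts) == 1:
        return 1.0  # a single clonotype is treated as fully clonal
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts)
    return 1.0 - entropy / math.log(len(counts))


# Hypothetical read counts per TCR-beta CDR3 clonotype in one tumor.
counts = [10, 5, 2, 2, 1, 1, 1]
richness = chao1(counts)
clon = clonality(counts)
print(f"Chao1={richness:.1f} clonality={clon:.3f}")
```

Under these assumptions a higher Chao1 reflects a richer repertoire and a higher clonality reflects dominance by a few expanded clonotypes.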

Extended data figure 9. POLE/D1 functional signatures predict immunotherapy response.

a, Comparison of the proportion of cases with clinical benefit between the POLE/D1 (POLE or POLD1) functional mutation/signature-positive patients and the histology-matched wild-type patients. Functional mutations/signatures, patients either harbored known POLE/D1 functional mutations, or only harbored POLE/D1 variants of unknown significance (VUSes) but were predicted as functional signature-positive, as determined by the logistic regression model. P value was derived from Fisher’s exact test. b, Kaplan-Meier progression free survival probability plot of the patients harboring any type of POLE/D1 mutation versus POLE/D1 wild-type patients. c, Kaplan-Meier progression free survival plot of the POLE/D1 functional mutation/signature-positive patients versus the histology-matched POLE/D1 wild-type patients. d, Kaplan-Meier overall survival plot of the POLE/D1 functional mutation/signature-positive patients versus the POLE/D1 functional signature-negative VUS patients. e, Comparison of the proportion of cases with clinical benefit between the POLE/D1 functional mutation/signature-positive patients and all the other POLE/D1 mutated patients after immunotherapy. P value was derived from Fisher’s exact test. f, Comparison of the proportion of cases with clinical benefit between the FP (false positive) prediction wild-type patients and the TN (true negative) prediction wild-type patients upon ICB. FP prediction wild-type patients, POLE/D1 wild-type patients that were predicted as POLE/D1 functional mutation-positive by the logistic regression model. TN prediction wild-type patients, POLE/D1 wild-type patients that were predicted as wild-type samples by the logistic regression model. P value was derived from Fisher’s exact test. g-h, Kaplan-Meier overall survival (g) and progression free survival plot (h) of FP prediction wild-type patients versus TN prediction wild-type patients.
i, A multivariable coxph model comparing the predictive capability of different patient selection strategies for progression free survival of patients after ICB (N=1130). Hazard ratio and P value are presented in the figure. Horizontal bars represent the 95% confidence interval of the hazard ratio. Error bar centres indicate hazard ratios. Statistical significance levels were generated from the coxph model without adjustment for multiple comparisons (* P<0.05, ** P<0.01, *** P<0.005). j-k, Kaplan-Meier overall survival plot (j) and progression free survival plot (k) of the POLE/D1 exonuclease domain mutation-positive patients versus the POLE/D1 wild-type patients. l-m, Kaplan-Meier overall survival plot (l) and progression free survival plot (m) of the POLE/D1 functional mutation/signature-positive patients that were not hypermutated, versus the POLE/D1 wild-type patients. For all Kaplan-Meier plots (b-d, g-h, j-m), Log-Rank P value and hazard ratio shown were calculated from a coxph model with cancer type correction.
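The Fisher's exact tests used for the clinical benefit comparisons in panels (a), (e) and (f) can be sketched with the standard library alone. This assumes a two-sided test on a 2x2 table that sums the hypergeometric probabilities of all tables no more likely than the observed one. The function name and the counts are hypothetical and do not reproduce the cohort data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over every table at most as probable as the observed one;
    # the small tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical counts: rows = signature-positive / wild-type patients,
# columns = clinical benefit / no clinical benefit.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(f"P = {p_value:.4f}")
```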

Extended data figure 10. POLE/D1 functional signature-based model predicts ICB outcome and outperforms traditional approaches.

a-b, Kaplan-Meier overall survival plot (a) and progression free survival plot (b) of the patients with at least one POLE/D1 (POLE or POLD1) mutation classified as a damaging mutation by all five in silico prediction algorithms versus all other patients with POLE/D1 mutations. c, C-index (concordance index) of the coxph models generated based on different patient selection strategies with cancer type correction on the ICB-related progression free survival of all the POLE/D1 mutated patients (N=172). Two-sided P values were calculated from paired Student's t-tests of the coxph model based on POLE/D1 functional mutation/signature-positive status against the other models, without multiple comparison adjustment. Functional mutations/signatures, patients either harbored known POLE/D1 functional mutations, or only harbored POLE/D1 variants of unknown significance (VUSes) but were predicted as functional signature-positive, as determined by the logistic regression model. d, Multivariable coxph model of ICB progression free survival for ‘POLE/D1 functional mutation/signature-positive’ and TMB with cancer type correction (N=1130). Only POLE/D1 functional mutation/signature-positive and TMB are shown in the forest plot, * log-rank P<0.05. *** log-rank P<0.005. Error bars indicate the 95% CI of the hazard ratio. e, Kaplan-Meier progression free survival plot of the POLE/D1 functional mutation/signature-positive patients versus the POLE/D1 wild-type patients in the ICB-treated patient cohort with high TMB (TMB>=10). f, Kaplan-Meier overall survival plot of a random sub-cohort of the POLE/D1 functional mutation/signature-positive patients versus the POLE/D1 wild-type patients with matched median and minimum TMB. g, Kaplan-Meier progression free survival plot of a random sub-cohort of the POLE/D1 functional mutation/signature-positive patients versus the POLE/D1 wild-type patients with matched median and minimum TMB.
h, Proportion heatmap of the observed association between SBS mutational signatures and each SNV class in the TCGA pan-cancer cohort. i, Proportion heatmap of the observed association between SBS mutational signatures and each amino acid alteration class in the TCGA pan-cancer cohort. Amino acids on the top row are the new AA generated by the SNV mutation (Post); amino acids on the bottom row are the AA from the wild-type allele (Pre). For all Kaplan-Meier plots (a-b, e-g), Log-Rank P value and hazard ratio shown were calculated from a coxph model with cancer type correction.
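The C-index compared across patient selection strategies in panel (c) can be illustrated with a minimal sketch of Harrell's concordance index. This is an assumption about the definition: the legend reports C-indices from coxph models but does not spell out the formula. The function name, follow-up times, event flags and risk scores below are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i has an observed event
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # higher risk progressed earlier
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied risk counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical progression free survival data (months), event flags
# (1 = progression observed, 0 = censored) and model risk scores.
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.7, 0.2, 0.8, 0.1]
c_index = concordance_index(times, events, risk)
print(f"C-index = {c_index:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a selection strategy with a higher C-index orders patients' outcomes more accurately.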

Supplementary Material

Supplementary tables
Supplementary notes and figures

Acknowledgements

We would like to thank colleagues at the MSK core facilities, including the Integrated Genomics Operation (IGO), Flow Cytometry Core Facility (FCCF) and Molecular Cytology Core Facility (MCCF), for processing our samples and providing important suggestions. We would like to thank colleagues at the Molecular Diagnostics Service in the Department of Pathology, and the Marie-Josee and Henry R. Kravis Center for Molecular Oncology of MSK, for establishing and generating the MSK-IMPACT data. We would like to thank all the members and alumni of the Chan lab at MSK and CCF, as well as HOPP of MSK, for their generous help and support of this study. The results presented here are in part based upon data generated or collected by the TCGA Research Network, Broad CCLE and ICGC. We acknowledge funding sources including NIH R01 CA205426 (TAC), NIH R35 CA232097 (TAC), the STARR Cancer Consortium (TAC) and the NIH/NCI Cancer Center Support Grant P30 CA008748 (MSKCC).

Footnotes

Competing Interests Statement

T.A.C. is a co-founder of Gritstone Oncology and holds equity. T.A.C. holds equity in An2H. T.A.C. acknowledges grant funding from Bristol-Myers Squibb, AstraZeneca, Illumina, Pfizer, An2H, and Eisai. T.A.C. has served as an advisor for Bristol-Myers Squibb, MedImmune, Illumina, Eisai, AstraZeneca, and An2H. T.A.C., L.G.T.M., and D.C. are inventors on intellectual property held by MSKCC on using tumor mutation burden to predict immunotherapy response, with pending patent, which has been licensed to PGDx. C.V. acknowledges research grant funding from Fundación Alfonso Martín Escudero. D.Z. received consulting fees from Agenus, Hookipa Biotech, Targovax, AstraZeneca, Synthekine, Mana Therapeutics, Xencor, Crown Biosciences, and Memgen. D.Z. receives grant/research support from AstraZeneca, Roche, and Plexxikon. D.Z. holds stock options for Immunos Therapeutics, Calidi Biotherapeutics, Mana Therapeutics and Accurius. D.Z. has a patent related to use of Newcastle Disease Virus for cancer therapy with royalties paid from Merck. R.Y. has served as an advisor for Natera, Array BioPharma/Pfizer, and Mirati Therapeutics and has received research support to her institution from Array BioPharma/Pfizer, Boehringer Ingelheim, and Mirati Therapeutics. The remaining authors declare no competing interests.

Data Availability Statement

Mutation data from the TCGA pan-cancer study (mc3.v0.2.8.PUBLIC.maf.gz) were downloaded from the NCI Genomic Data Commons (https://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc). Mutation data of ICGC and CCLE were downloaded from the ICGC data portal (https://dcc.icgc.org/api/v1/download?fn=/current/Summary/simple_somatic_mutation.aggregated.vcf.gz) and the Broad Institute website (https://ndownloader.figshare.com/files/34008434), respectively. To generate the ICGC/CCLE test set, TCGA samples in the ICGC cohort were manually removed. BED files of MSK-IMPACT sequencing regions were acquired from the AACR-Genie project database (https://www.synapse.org/#!Synapse:syn7222066/wiki/405659). COSMIC SBS exome signature v3 and COSMIC SBS exome TSB signature v3 were downloaded from the Synapse database (https://www.synapse.org/#!Synapse:syn11967914). The mouse GRCm38 genome was downloaded from the UCSC genome browser (https://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/mm10.fa.gz). A baseline list of known POLE/D1 functional mutations was generated from previous publications and the OncoKB database (https://www.oncokb.org/) 12,35. Additional lists of clinically associated POLE/D1 mutations and POLE/D1 mutator alleles in other species were manually curated from the literature and are available as Supplementary Tables 2 and 3. The gene expression profile and xCell-based cell type deconvolution of the TCGA endometrial cohort were downloaded from the TIMER database (https://timer.cistrome.org/infiltration_estimation_for_tcga.csv.gz) 86,87. TCRb repertoire data were downloaded from the NCI Genomic Data Commons (TCGA_mitcr_cdr3_result_161008.tsv, from https://gdc.cancer.gov/about-data/publications/panimmune) 88. TCGA predicted neoantigen information was acquired from TSNAdb (http://biopharm.zju.edu.cn/tsnadb/download/) 89. Mouse WES and RNA-seq data were deposited to SRA (PRJNA701709).
All other data are deposited to Synapse (as Synapse project syn29404148, https://www.synapse.org/#!Synapse:syn29404148/wiki/617361) with public access: all the samples that underwent evaluation for model building (Synapse ID: syn29477489.1), the mutational signature matrices for training and testing the WES (Synapse ID: syn29478035, Synapse ID: syn29478033) and MSK-IMPACT models (Synapse ID: syn29478110, Synapse ID: syn29478151), and the genomic, response and survival data of the MSK-IMPACT immunotherapy cohorts (Synapse ID: syn29478036).

Code Availability Statement

All custom code, including code for generating the models (Synapse ID: syn29479495) 90, the analysis of the MSK-IMPACT cohort (Synapse ID: syn29479497) 91 and other associated code (Synapse ID: syn30137113) 92, is deposited to Synapse (Synapse project ID: syn29404148, https://www.synapse.org/#!Synapse:syn29404148/wiki/617361) with public access. Code for processing WES and RNA-seq data was from published tools and is available from the authors of the tools as described in the Methods sections above.

References

  • 1. Borcoman E et al. Novel patterns of response under immunotherapy. Ann Oncol 30, 385–396 (2019).
  • 2. Hegde PS & Chen DS. Top 10 Challenges in Cancer Immunotherapy. Immunity 52, 17–35 (2020).
  • 3. Mouw KW, Goldberg MS, Konstantinopoulos PA & D'Andrea AD. DNA Damage and Repair Biomarkers of Immunotherapy Response. Cancer Discov 7, 675–693 (2017).
  • 4. Bever KM & Le DT. DNA repair defects and implications for immunotherapy. J Clin Invest 128, 4236–4242 (2018).
  • 5. Hsiehchen D et al. DNA Repair Gene Mutations as Predictors of Immune Checkpoint Inhibitor Response beyond Tumor Mutation Burden. Cell Rep Med 1 (2020).
  • 6. Rizvi NA et al. Cancer immunology. Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer. Science 348, 124–8 (2015).
  • 7. Samstein RM et al. Mutations in BRCA1 and BRCA2 differentially affect the tumor microenvironment and response to checkpoint blockade immunotherapy. Nature Cancer (2020).
  • 8. Le DT et al. Mismatch repair deficiency predicts response of solid tumors to PD-1 blockade. Science 357, 409–413 (2017).
  • 9. Mandal R et al. Genetic diversity of tumors with mismatch repair deficiency influences anti-PD-1 immunotherapy response. Science 364, 485–491 (2019).
  • 10. Ying J et al. Additive effects of variants of unknown significance in replication repair-associated DNA polymerase genes on mutational burden and prognosis across diverse cancers. J Immunother Cancer 9 (2021).
  • 11. Lujan SA, Williams JS & Kunkel TA. DNA Polymerases Divide the Labor of Genome Replication. Trends Cell Biol 26, 640–654 (2016).
  • 12. Campbell BB et al. Comprehensive Analysis of Hypermutation in Human Cancer. Cell 171, 1042–1056 e10 (2017).
  • 13. Kunkel TA. Evolving views of DNA replication (in)fidelity. Cold Spring Harb Symp Quant Biol 74, 91–101 (2009).
  • 14. Esteban-Jurado C et al. POLE and POLD1 screening in 155 patients with multiple polyps and early-onset colorectal cancer. Oncotarget 8, 26732–26743 (2017).
  • 15. Mehnert JM et al. Immune activation and response to pembrolizumab in POLE-mutant endometrial cancer. J Clin Invest 126, 2334–40 (2016).
  • 16. van Gool IC et al. POLE Proofreading Mutations Elicit an Antitumor Immune Response in Endometrial Cancer. Clin Cancer Res 21, 3347–3355 (2015).
  • 17. Domingo E et al. Somatic POLE proofreading domain mutation, immune response, and prognosis in colorectal cancer: a retrospective, pooled biomarker study. Lancet Gastroenterol Hepatol 1, 207–216 (2016).
  • 18. Wang F et al. Evaluation of POLE and POLD1 Mutations as Biomarkers for Immunotherapy Outcomes Across Multiple Cancer Types. JAMA Oncol 5, 1504–1506 (2019).
  • 19. He J et al. Distinctive genomic characteristics in POLE/POLD1-mutant cancers can potentially predict beneficial clinical outcomes in patients who receive immune checkpoint inhibitor. Ann Transl Med 9, 129 (2021).
  • 20. Barbari SR & Shcherbakova PV. Replicative DNA polymerase defects in human cancers: Consequences, mechanisms, and implications for therapy. DNA Repair (Amst) 56, 16–25 (2017).
  • 21. Silberman R et al. Complete and Prolonged Response to Immune Checkpoint Blockade in POLE-Mutated Colorectal Cancer. JCO Precision Oncology, 1–5 (2019).
  • 22. Rayner E et al. A panoply of errors: polymerase proofreading domain mutations in cancer. Nat Rev Cancer 16, 71–81 (2016).
  • 23. Alexandrov LB et al. The repertoire of mutational signatures in human cancer. Nature 578, 94–101 (2020).
  • 24. Petljak M et al. Characterizing Mutational Signatures in Human Cancer Cell Lines Reveals Episodic APOBEC Mutagenesis. Cell 176, 1282–1294 e20 (2019).
  • 25. Koh G, Degasperi A, Zou X, Momen S & Nik-Zainal S. Mutational signatures: emerging concepts, caveats and clinical applications. Nat Rev Cancer 21, 619–637 (2021).
  • 26. Albertson TM et al. DNA polymerase epsilon and delta proofreading suppress discrete mutator and cancer phenotypes in mice. Proc Natl Acad Sci U S A 106, 17101–4 (2009).
  • 27. Li HD et al. Polymerase-mediated ultramutagenesis in mice produces diverse cancers with high mutational load. J Clin Invest 128, 4179–4191 (2018).
  • 28. Cong L et al. Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819–23 (2013).
  • 29. Niu B et al. MSIsensor: microsatellite instability detection using paired tumor-normal sequence data. Bioinformatics 30, 1015–6 (2014).
  • 30. Knijnenburg TA et al. Genomic and Molecular Landscape of DNA Damage Repair Deficiency across The Cancer Genome Atlas. Cell Rep 23, 239–254 e6 (2018).
  • 31. Leon-Castillo A et al. Interpretation of somatic POLE mutations in endometrial carcinoma. J Pathol 250, 323–335 (2020).
  • 32. Bellido F et al. POLE and POLD1 mutations in 529 kindred with familial colorectal cancer and/or polyposis: review of reported cases and recommendations for genetic testing and surveillance. Genet Med 18, 325–32 (2016).
  • 33. Aran D, Hu Z & Butte AJ. xCell: digitally portraying the tissue cellular heterogeneity landscape. Genome Biol 18, 220 (2017).
  • 34. Alexandrov LB et al. Signatures of mutational processes in human cancer. Nature 500, 415–21 (2013).
  • 35. Chakravarty D et al. OncoKB: A Precision Oncology Knowledge Base. JCO Precis Oncol 2017 (2017).
  • 36. Hoadley KA et al. Cell-of-Origin Patterns Dominate the Molecular Classification of 10,000 Tumors from 33 Types of Cancer. Cell 173, 291–304 e6 (2018).
  • 37. Youden WJ. Index for rating diagnostic tests. Cancer 3, 32–5 (1950).
  • 38. International Cancer Genome Consortium et al. International network of cancer genome projects. Nature 464, 993–8 (2010).
  • 39. Ghandi M et al. Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature 569, 503–508 (2019).
  • 40. Zehir A et al. Mutational landscape of metastatic cancer revealed from prospective clinical sequencing of 10,000 patients. Nat Med 23, 703–713 (2017).
  • 41. Chowell D et al. Improved prediction of immune checkpoint blockade efficacy across multiple cancer types. Nat Biotechnol 40, 499–506 (2022).
  • 42. Chowell D et al. TCR contact residue hydrophobicity is a hallmark of immunogenic CD8+ T cell epitopes. Proc Natl Acad Sci U S A 112, E1754–62 (2015).
  • 43. Richman LP, Vonderheide RH & Rech AJ. Neoantigen Dissimilarity to the Self-Proteome Predicts Immunogenicity and Response to Immune Checkpoint Blockade. Cell Syst 9, 375–382 e4 (2019).
  • 44. Riley TP et al. Structure Based Prediction of Neoantigen Immunogenicity. Front Immunol 10, 2047 (2019).
  • 45. Calis JJ et al. Properties of MHC class I presented peptides that enhance immunogenicity. PLoS Comput Biol 9, e1003266 (2013).
  • 46. Li HD et al. A PoleP286R mouse model of endometrial cancer recapitulates high mutational burden and immunotherapy response. JCI Insight 5 (2020).
  • 47. Davies H et al. HRDetect is a predictor of BRCA1 and BRCA2 deficiency based on mutational signatures. Nat Med 23, 517–525 (2017).
  • 48. Lee JS & Ruppin E. Multiomics Prediction of Response Rates to Therapies to Inhibit Programmed Cell Death 1 and Programmed Cell Death 1 Ligand 1. JAMA Oncol (2019).
  • 49. Cummings AL et al. Mutational landscape influences immunotherapy outcomes among patients with non-small-cell lung cancer with human leukocyte antigen supertype B44. Nature Cancer 1, 1167–1175 (2020).
  • 50. Litchfield K et al. Meta-analysis of tumor- and T cell-intrinsic mechanisms of sensitization to checkpoint inhibition. Cell 184, 596–614 e14 (2021).

Methods-only References

  • 51. van Elsas A, Hurwitz AA & Allison JP Combination immunotherapy of B16 melanoma using anti-cytotoxic T lymphocyte-associated antigen 4 (CTLA-4) and granulocyte/macrophage colony-stimulating factor (GM-CSF)-producing vaccines induces rejection of subcutaneous and metastatic tumors accompanied by autoimmune depigmentation. J Exp Med 190, 355–66 (1999).
  • 52. Zitvogel L, Pitt JM, Daillere R, Smyth MJ & Kroemer G Mouse models in oncoimmunology. Nat Rev Cancer 16, 759–773 (2016).
  • 53. Zeng Z et al. TISMO: syngeneic mouse tumor database to model tumor immunity and immunotherapy response. Nucleic Acids Res 50, D1391–D1397 (2022).
  • 54. Griswold DP & Corbett TH A colon tumor model for anticancer agent evaluation. Cancer 36, 2441–4 (1975).
  • 55. Ran FA et al. Genome engineering using the CRISPR-Cas9 system. Nat Protoc 8, 2281–2308 (2013).
  • 56. Clement K et al. CRISPResso2 provides accurate and rapid genome editing sequence analysis. Nat Biotechnol 37, 224–226 (2019).
  • 57. Li H & Durbin R Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–60 (2009).
  • 58. McKenna A et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 20, 1297–303 (2010).
  • 59. Cibulskis K et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnol 31, 213–9 (2013).
  • 60. Koboldt DC et al. VarScan: variant detection in massively parallel sequencing of individual and pooled samples. Bioinformatics 25, 2283–5 (2009).
  • 61. Kim S et al. Strelka2: fast and accurate calling of germline and somatic variants. Nat Methods 15, 591–594 (2018).
  • 62. Larson DE et al. SomaticSniper: identification of somatic point mutations in whole genome sequencing data. Bioinformatics 28, 311–7 (2012).
  • 63. McLaren W et al. The Ensembl Variant Effect Predictor. Genome Biol 17, 122 (2016).
  • 64. Dobin A et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
  • 65. Lawrence M et al. Software for computing and annotating genomic ranges. PLoS Comput Biol 9, e1003118 (2013).
  • 66. Rosenbloom KR et al. The UCSC Genome Browser database: 2015 update. Nucleic Acids Res 43, D670–81 (2015).
  • 67. Love MI, Huber W & Anders S Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15, 550 (2014).
  • 68. Reich M et al. GenePattern 2.0. Nat Genet 38, 500–1 (2006).
  • 69. Yu G, Wang LG, Han Y & He QY clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–7 (2012).
  • 70. Bolotin DA et al. MiXCR: software for comprehensive adaptive immunity profiling. Nat Methods 12, 380–1 (2015).
  • 71. Riaz N et al. Tumor and Microenvironment Evolution during Immunotherapy with Nivolumab. Cell 171, 934–949 e16 (2017).
  • 72. Bergstrom EN et al. SigProfilerMatrixGenerator: a tool for visualizing and exploring patterns of small mutational events. BMC Genomics 20, 685 (2019).
  • 73. Blokzijl F, Janssen R, van Boxtel R & Cuppen E MutationalPatterns: comprehensive genome-wide analysis of mutational processes. Genome Med 10, 33 (2018).
  • 74. Rosenthal R, McGranahan N, Herrero J, Taylor BS & Swanton C DeconstructSigs: delineating mutational processes in single tumors distinguishes DNA repair deficiencies and patterns of carcinoma evolution. Genome Biol 17, 31 (2016).
  • 75. Valero C et al. The association between tumor mutational burden and prognosis is dependent on treatment context. Nat Genet 53, 11–15 (2021).
  • 76. Valero C et al. Pretreatment neutrophil-to-lymphocyte ratio and mutational burden as biomarkers of tumor response to immune checkpoint inhibitors. Nat Commun 12, 729 (2021).
  • 77. Samstein RM et al. Tumor mutational load predicts survival after immunotherapy across multiple cancer types. Nat Genet 51, 202–206 (2019).
  • 78. Zhou J et al. Analysis of Tumor Genomic Pathway Alterations Using Broad-Panel Next-Generation Sequencing in Surgically Resected Lung Adenocarcinoma. Clin Cancer Res 25, 7475–7484 (2019).
  • 79. Shen R & Seshan VE FACETS: allele-specific copy number and clonal heterogeneity analysis tool for high-throughput DNA sequencing. Nucleic Acids Res 44, e131 (2016).
  • 80. Schwartz LH et al. RECIST 1.1-Update and clarification: From the RECIST committee. Eur J Cancer 62, 132–7 (2016).
  • 81. Liu X, Li C, Mou C, Dong Y & Tu Y dbNSFP v4: a comprehensive database of transcript-specific functional predictions and annotations for human nonsynonymous and splice-site SNVs. Genome Med 12, 103 (2020).
  • 82. Chowell D et al. Improved prediction of immune checkpoint blockade efficacy across multiple cancer types. Nat Biotechnol (2021).
  • 83. Gu Z, Eils R & Schlesner M Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics 32, 2847–9 (2016).
  • 84. Kyte J & Doolittle RF A simple method for displaying the hydropathic character of a protein. J Mol Biol 157, 105–32 (1982).
  • 85. Grantham R Amino acid difference formula to help explain protein evolution. Science 185, 862–4 (1974).
  • 86. Li B et al. Comprehensive analyses of tumor immunity: implications for cancer immunotherapy. Genome Biol 17, 174 (2016).
  • 87. Sturm G et al. Comprehensive evaluation of transcriptome-based cell-type quantification methods for immuno-oncology. Bioinformatics 35, i436–i445 (2019).
  • 88. Thorsson V et al. The Immune Landscape of Cancer. Immunity 48, 812–830 e14 (2018).
  • 89. Wu J et al. TSNAdb: A Database for Tumor-specific Neoantigens from Immunogenomics Data Analysis. Genomics Proteomics Bioinformatics 16, 276–282 (2018).
  • 90. Ma X et al. (2022) NG_WES_IMPACT_Model_training_testing_script.R [R code]. Synapse. doi:10.7303/syn29479497.1
  • 91. Ma X et al. (2022) NG_ICB_Cohort_script.R [R code]. Synapse. doi:10.7303/syn29479495
  • 92. Ma X et al. (2022) NG-A57433 other associated codes.R [R code]. Synapse. doi:10.7303/syn30137113

Associated Data


Supplementary Materials

Supplementary tables
Supplementary notes and figures

Data Availability Statement

Mutation data from the TCGA pan-cancer study (mc3.v0.2.8.PUBLIC.maf.gz) were downloaded from the NCI Genomic Data Commons (https://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc). Mutation data for ICGC and CCLE were downloaded from the ICGC data portal (https://dcc.icgc.org/api/v1/download?fn=/current/Summary/simple_somatic_mutation.aggregated.vcf.gz) and the Broad Institute website (https://ndownloader.figshare.com/files/34008434), respectively. To generate the ICGC/CCLE test set, TCGA samples in the ICGC cohort were manually removed. BED files of MSK-IMPACT sequencing regions were acquired from the AACR-Genie project database (https://www.synapse.org/#!Synapse:syn7222066/wiki/405659). COSMIC SBS exome signature v3 and COSMIC SBS exome TSB signature v3 were downloaded from the Synapse database (https://www.synapse.org/#!Synapse:syn11967914). The mouse GRCm38 genome was downloaded from the UCSC Genome Browser (https://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/mm10.fa.gz). A baseline list of known POLE/D1 functional mutations was generated from previous publications and the OncoKB database (https://www.oncokb.org/) 12,35. Additional lists of clinically associated POLE/D1 mutations and POLE/D1 mutator alleles in other species were manually curated from the literature and are available as Supplementary Tables 2 and 3. The gene expression profile and xCell-based cell type deconvolution of the TCGA endometrial cohort were downloaded from the TIMER database (https://timer.cistrome.org/infiltration_estimation_for_tcga.csv.gz) 86,87. TCRβ repertoire data were downloaded from the NCI Genomic Data Commons (TCGA_mitcr_cdr3_result_161008.tsv, from https://gdc.cancer.gov/about-data/publications/panimmune) 88. TCGA predicted neoantigen information was acquired from TSNAdb (http://biopharm.zju.edu.cn/tsnadb/download/) 89. Mouse WES and RNA-seq data were deposited to SRA (PRJNA701709).
All other data are deposited to Synapse (Synapse project syn29404148, https://www.synapse.org/#!Synapse:syn29404148/wiki/617361) with public access: all samples that underwent evaluation for model building (Synapse ID: syn29477489.1), the mutational signature matrices for training and testing the WES (Synapse ID: syn29478035, Synapse ID: syn29478033) and MSK-IMPACT models (Synapse ID: syn29478110, Synapse ID: syn29478151), and the genomic, response and survival data of the MSK-IMPACT immunotherapy cohorts (Synapse ID: syn29478036).

All custom code, including code for generating the models (Synapse ID: syn29479495) 90, the analysis of the MSK-IMPACT cohort (Synapse ID: syn29479497) 91 and other associated code (Synapse ID: syn30137113) 92, is deposited to Synapse (Synapse project ID: syn29404148, https://www.synapse.org/#!Synapse:syn29404148/wiki/617361) with public access. Code for processing WES and RNA-seq data was from published tools and is available from the authors of those tools, as described in the Methods sections above.
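
As an illustration of how a mutation file such as the TCGA MC3 MAF listed above might be screened for POLE/POLD1 variants, the following is a minimal Python sketch. It is not the authors' deposited R code. The function name `select_polymerase_mutations` and the in-memory example rows are hypothetical, and the sketch assumes the standard MAF column names `Hugo_Symbol`, `Tumor_Sample_Barcode`, `Variant_Classification` and `HGVSp_Short`.

```python
import csv
import io

# Genes of interest in this study (human gene symbols).
GENES_OF_INTEREST = {"POLE", "POLD1"}


def select_polymerase_mutations(maf_handle, genes=GENES_OF_INTEREST):
    """Yield (sample, gene, classification, protein change) for rows in `genes`.

    `maf_handle` is any iterable of text lines in tab-separated MAF layout.
    """
    # MAF files may begin with '#' comment lines before the header row.
    rows = (line for line in maf_handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    for record in reader:
        if record["Hugo_Symbol"] in genes:
            yield (
                record["Tumor_Sample_Barcode"],
                record["Hugo_Symbol"],
                record["Variant_Classification"],
                record.get("HGVSp_Short", ""),
            )


# Made-up rows standing in for the real file; sample names are placeholders.
EXAMPLE_MAF = (
    "#version 2.4\n"
    "Hugo_Symbol\tTumor_Sample_Barcode\tVariant_Classification\tHGVSp_Short\n"
    "POLE\tSAMPLE-A\tMissense_Mutation\tp.P286R\n"
    "TP53\tSAMPLE-A\tMissense_Mutation\tp.R175H\n"
    "POLD1\tSAMPLE-B\tSilent\tp.L100L\n"
)

hits = list(select_polymerase_mutations(io.StringIO(EXAMPLE_MAF)))
for hit in hits:
    print("\t".join(hit))
```

For the compressed file itself, the same function could be given a handle opened with `gzip.open(path, "rt")`. Deciding which of the selected variants are functional would still require the curated mutation lists and signature analysis described in the Methods.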
