Abstract
Background
Response to immune checkpoint inhibition (ICI) in sarcomas is overall low and heterogeneous. Understanding determinants of ICI outcomes may improve efficacy and patient selection. Thus, we investigated whether the expression of transposable elements (TEs), which are epigenetically silenced and can stimulate antitumor immunity, influences ICI outcomes and immune infiltrates in common sarcoma subtypes.
Methods
We used transcriptomic data to assign immune enhanced versus immune depleted status to 67 pretreatment and on-treatment biopsies of sarcomas from patients treated on ICI trials, along with additional cohorts from The Cancer Genome Atlas (TCGA) and an independent ICI trial (SARC028). A machine learning technique (lasso-penalized logistic regression) controlled for sarcoma subtype was used to determine if TE and epigenetic regulatory gene expression predict immune infiltrates. Correlations between top features in these models and sarcoma immune infiltrates, immune pathway expression, and clinical outcomes were explored.
Results
Expression of TEs and epigenetic regulators significantly predicted immune enhanced status. TE subfamilies and Ikaros family zinc finger 1 (IKZF1), a chromatin-modulating transcription factor, were significantly contributory. TE and IKZF1 expression positively correlated with tumor immune infiltrates, inflammatory pathways, and improved clinical outcomes, and increased in tumors that gained immune infiltrates during ICI treatment. TE and IKZF1 expression similarly correlated with overall survival and immune features in a TCGA cohort. In an additional cohort of patients with sarcoma treated with ICI, IKZF1 expression correlated with progression-free survival and inflammatory features.
Conclusions
TE and IKZF1 expression warrant further translational investigation as potential biomarkers of tumor immune infiltrates and outcomes following ICI treatment, and as therapeutic targets in sarcomas.
Keywords: Sarcoma, Immune Checkpoint Inhibitor
WHAT IS ALREADY KNOWN ON THIS TOPIC
The presence of tumor immune infiltrates and tertiary lymphoid structures is known to correlate with response to immune checkpoint inhibition (ICI) treatment in sarcomas. However, the underlying tumor-intrinsic biological mechanisms driving these immune features in a subset of sarcomas are unknown.
WHAT THIS STUDY ADDS
This study demonstrates that high expression of transposable elements (TEs) and Ikaros family zinc finger 1 (IKZF1) are predictive of increased sarcoma immune infiltrates using a machine learning approach, correlate with interferon pathway activation, and are associated with improved clinical outcomes, including after ICI treatment.
HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY
This work highlights TE and IKZF1 as potential biomarkers for ICI response and provides a rationale for preclinical investigation of epigenetic agents capable of derepressing TEs to enhance immune infiltrates in sarcomas.
Background
Sarcomas are a diverse group of more than 170 histologic entities1 that derive from tissues of mesenchymal origin. The underlying genetic causes of sarcomas2 3 are diverse and their biologic and pathologic behavior is highly varied.4 The clinical management of metastatic sarcomas is generally palliative and relies on systemic therapies including cytotoxic chemotherapy, targeted therapies, and in some instances, immune checkpoint blockade.5 The overall response rate (ORR) to first-line chemotherapy in soft tissue sarcoma is approximately 20%.6 Hence, there is a need for both new treatment modalities and improved methods to select patients most likely to respond to specific treatments.
Immune checkpoint inhibition (ICI) has been studied in sarcomas, and activity has been noted with nivolumab (anti-programmed cell death protein-1 (PD-1)) alone or in combination with ipilimumab (anti-cytotoxic T-lymphocyte-associated protein 4 (CTLA-4))7 or with pembrolizumab (anti-PD-1) as a single agent.8 However, the response rates to ICI are in general low, only reaching approximately 20% among more common soft tissue sarcoma subtypes such as undifferentiated pleomorphic sarcoma (UPS).7,9 Efforts to enhance the activity of ICI through combination with other immune-modulatory drugs or with cytotoxic therapies have revealed variable response rates, which likely depend on the drug combination and sarcoma subtype (recently reviewed10). In parallel, predictive biomarkers for ICI response in sarcoma are being explored. While microsatellite instability (MSI) and high tumor mutation burden (TMB) predict response in carcinomas,11,17 TMB is relatively low in sarcomas and MSI is exceedingly rare.18 Alternative biomarkers such as tertiary lymphoid structures (TLSs) and B cell and CD8+ T cell infiltrates correlate with ICI response in some soft tissue sarcomas such as UPS.2 19 20
Epigenetic states are another potential determinant and predictor of antitumor immunity and ICI response. These states are established by chemical modifications of DNA, RNA, and DNA-associated proteins together with their positioning relative to specific genomic sequences.21 One key function of the epigenome is to regulate transcriptional programs, including those that influence immune signaling. Accordingly, genetic or pharmacologic perturbation of the machinery that establishes or maintains epigenetic states can prime for ICI response in preclinical models and correlates with ICI clinical response.22,30 For example, epigenetic mechanisms can promote immune escape through repression of antigen-presenting machinery and transposable elements (TEs), which are epigenetically silenced sequences of viral origin that, when derepressed, stimulate antiviral immune signaling.23 25 31
We therefore hypothesized that sarcoma baseline immune infiltrates and clinical outcomes following immunotherapy treatment are influenced by expression of TEs and epigenetic regulators. To test this, we generated and analyzed transcriptomic profiles of pretreatment biopsies from 67 unique patients and 46 on-treatment samples from patients enrolled in three ICI trials at our institution. In addition, we analyzed a treatment-naïve cohort from The Cancer Genome Atlas (TCGA) and an independent cohort of patients treated with ICI (used to evaluate epigenetic regulators). Here, we demonstrate that derepression of TEs that are normally silenced by epigenetic mechanisms and upregulation of the transcription factor and chromatin regulator Ikaros family zinc finger 1 (IKZF1) significantly predict immune infiltrates when controlling for sarcoma subtype. TE and IKZF1 upregulation correlate with hallmarks of tumor-intrinsic innate immune activation such as type I interferon (IFN) and antigen presentation, suggesting a potential mechanism for enhanced immune response mediated by tumor epigenetic states. Finally, expression of TEs and IKZF1 also increases significantly in samples that switch from depleted to enhanced immune phenotypes in response to ICI, thereby highlighting the relationship to immune infiltrates and suggesting that ICI treatment might induce expression of these transcripts.
Methods
Clinical data were collected and DNA and RNA sequencing (RNA-seq) of pretreatment biopsy samples was performed under Institutional Review Board oversight of three clinical trials performed at the Memorial Sloan Kettering Cancer Center. These include pembrolizumab plus talimogene laherparepvec (NCT03069378),32 nivolumab plus bempegaldesleukin (NCT03282344),2 and pembrolizumab plus epacadostat (NCT03414229).33 Details regarding each study’s design, safety oversight, and interventions can be found in referenced publications for each study.
Samples
In the initial ICI-treated cohort, a total of 67 baseline samples from 12 sarcoma subtypes (angiosarcoma (ANGS)=4, alveolar soft part sarcoma (ASPS)=1, chondrosarcoma=6, epithelioid hemangioendothelioma (EHE)=8, leiomyosarcoma (LMS)=11, liposarcoma (LPS)=8, myxofibrosarcoma (MFS)=2, osteosarcoma (OS)=4, Other=7, sarcoma not otherwise specified=2, small blue round cell sarcoma=4, and UPS=8), representing 12 patients with response and 55 without (complete response (CR)/partial response (PR)=12, stable disease (SD)=21, progressive disease (PD)=34), were transcriptionally profiled (online supplemental table 1A).
RNA sequencing and quantification of TEs and genes
After quantification of RNA using RiboGreen and quality control using the Agilent BioAnalyzer, 469–500 ng of total RNA with RNA integrity values ranging from 6.8 to 10 underwent polyA selection and TruSeq library preparation following the instructions provided by Illumina (TruSeq Stranded messenger RNA (mRNA) LT Kit, catalog #RS-122–2102), with eight cycles of PCR. The resulting samples were barcoded and run on a HiSeq 4000 generating 100 bp paired-end reads, using the HiSeq 3000/4000 Sequencing by Synthesis (SBS) Kit (Illumina), generating an average of 41 million paired reads per sample. Ribosomal reads represented 0.9–5.9% of the total reads generated and the percent of mRNA bases averaged 64%.
The obtained FASTQ files were processed using the REdiscoverTE34 workflow, which allowed for quantification based on transcript levels. Gene transcripts were aggregated to obtain individual gene quantification. Read counts for each individual TE were then aggregated to the level of TE subfamily, family, and class, as defined by the human RepeatMasker annotation for hg38. TE expression was further divided into inter- and intragenic regions as defined by Gencode GTF/GFF and implemented in REdiscoverTE. Downstream analysis considered only intergenic expression of 1,002 out of a total of 1,052 TE subfamilies that were expressed. Gene-based normalization factors were calculated using the “RLE” algorithm in edgeR,35 as determined by REdiscoverTE. We then performed filtering using the filterByExpr() function in the edgeR package. This function retains genes with counts per million (CPM) greater than or equal to a defined threshold (CPM cut-off). The CPM cut-off was calculated as: CPM cut-off = min.count/median(lib.size) × 1e6, where min.count is a predefined minimum read count (10) and lib.size represents the library size of each sample. This filtering step removed TE families with insufficient read counts across samples. The data were further variance-stabilized using the voom function from the limma package.
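The CPM cut-off described above can be sketched in a few lines. This is a minimal illustration of the stated formula only, not a reimplementation of filterByExpr(), which applies further criteria. The function names, the toy library sizes, and the `min_samples` rule are assumptions made for the example.

```python
from statistics import median

def cpm_cutoff(lib_sizes, min_count=10):
    """CPM cut-off = min.count / median(lib.size) * 1e6 (formula from the text)."""
    return min_count / median(lib_sizes) * 1e6

def cpm(count, lib_size):
    """Counts per million for one feature in one sample."""
    return count / lib_size * 1e6

def keep_feature(counts, lib_sizes, min_samples, min_count=10):
    """Keep a feature if at least `min_samples` samples reach the CPM cut-off.

    `min_samples` is a hypothetical simplification for this sketch.
    """
    cutoff = cpm_cutoff(lib_sizes, min_count)
    n_pass = sum(cpm(c, n) >= cutoff for c, n in zip(counts, lib_sizes))
    return n_pass >= min_samples

# Toy library sizes (total reads per sample); not values from the study.
lib_sizes = [40_000_000, 50_000_000, 30_000_000]
print(cpm_cutoff(lib_sizes))                      # about 0.25 CPM
print(keep_feature([20, 15, 9], lib_sizes, 2))    # passes in all three samples
print(keep_feature([20, 5, 3], lib_sizes, 2))     # passes in only one sample
```

With a median library size of 40 million reads, a minimum count of 10 corresponds to roughly 0.25 CPM, which is why the threshold scales with sequencing depth.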
RNAseq deconvolution and generation of immune clusters
We quantified immune cell populations from variance-stabilized RNAseq data using the immunedeconv R package and its deconvolute function, along with the MCPcounter option V.3.6.3.36 Batch effects due to sequencing run were removed using the removeBatchEffect() function from the limma R package.37 To reduce the dimensionality of the immune cell proportion data, we first performed a principal component analysis, followed by hierarchical clustering on principal components (HCPC) using the FactoMineR package.38 Cluster types were visualized using the factoextra R package (https://CRAN.R-project.org/package=factoextra).
Heatmaps of expression in each cluster were generated based on the scaled (Z-scores) immune cell proportions. Z-scores were calculated using the formula z = (x-μ)/σ, where x is the raw cell fraction, μ is the mean of all samples, and σ is the SD for all samples.
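As a small worked example of the Z-score scaling above, the sketch below scales one cell type across samples. The function name and toy values are illustrative, and the use of the sample SD (n−1 denominator) is an assumption, since the text does not state which SD estimator was used.

```python
from statistics import mean, stdev

def z_scores(values):
    """Scale raw cell fractions to Z-scores: z = (x - mu) / sigma."""
    mu = mean(values)
    sigma = stdev(values)  # sample SD (n-1); assumed, not stated in the text
    return [(x - mu) / sigma for x in values]

# Toy T cell scores for three samples (not data from the study).
print(z_scores([1.0, 2.0, 3.0]))
```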
The TLS signature was calculated using a set of nine genes (CD79B, CD1D, CCR6, LAT, SKAP1, CETP, EIF1AY, RBP5, and PTGDS)39,42 for enrichment analysis (single-sample gene set enrichment analysis (ssGSEA) in the gene set variation analysis (GSVA) package).43
To obtain the cellularity enrichment scores for 64 cell types, from which lymphoid and myeloid cell type proportions can be derived, we used the xCellAnalysis function in the xCell R package (https://github.com/dviraran/xCell). Total lymphoid content was calculated as the sum of 21 lymphoid cell scores, including CD8+T cells, natural killer (NK) cells, CD4+naiveT cells, B cells, CD4+T cells, CD8+ Effector Memory T cells (Tem), Regulatory T cells (Tregs), plasma cells, CD4+Tcm, CD4+Tem, memory B cells, CD8+ Central Memory T cells (Tcm), naive B cells, CD4+memory T cells, pro B cells, class-switched memory B cells, Th2 cells, Th1 cells, CD8+naive T cells, NKT, and Tgd cells. Total myeloid content was expressed as the sum of 13 cell scores, including monocytes, macrophages, dendritic cells (including activated, conventional, interstitial, and plasmacytoid), neutrophils, eosinophils, M1 macrophages, M2 macrophages, basophils, and mast cells.
To test if immune type predicts survival in sarcoma, we performed a Cox regression analysis that included histology to control for subtype-specific differences in outcomes. We compared survival between groups using the Kaplan-Meier survival curve and the Cox proportional-hazards regression model. Differences were considered significant if the p value was less than 0.05 for the tested group.
Exome sequencing and purity estimation
Viably frozen cells were thawed and pelleted and incubated for at least 30 min in 360 µL Buffer ATL+40 µL proteinase K at 55°C. DNA was isolated using the DNeasy Blood & Tissue Kit (QIAGEN catalog #69504) according to the manufacturer’s protocol with 1 hour of incubation at 55°C for digestion. DNA was eluted in 0.5× Buffer AE.
After PicoGreen quantification and quality control by Agilent BioAnalyzer, 100–250 ng of DNA was used to prepare libraries using the KAPA Hyper Prep Kit (Kapa Biosystems KK8504) with eight cycles of PCR. After sample barcoding, 100–500 ng of library DNA was captured by hybridization using the xGen Exome Research Panel V.1.0 (IDT) according to the manufacturer’s protocol. Postcapture libraries were amplified using eight PCR cycles. Samples were run on a HiSeq 4000 at 100 bp paired-end reads using the HiSeq 3000/4000 SBS Kit (Illumina). Normal and tumor samples were covered to an average of 102× and 219×, respectively.
FASTQ files were aligned and processed using the in-house workflow Tempo (https://github.com/mskcc/tempo).44 Briefly, reads were aligned using Burrows-Wheeler Aligner (BWA)-MEM45 to the GRCh37 reference genome and base recalibration was performed using Genome Analysis Toolkit best practices. Somatic genome variants were called using the union of Mutect2 and Strelka2. Variants were then filtered based on the following criteria: tumor read depth of 20, variant allele frequency <0.5× the tumor alternate read count of 3, and normal read depth of 10. In addition, repeated regions from RepeatMasker46 and variants that appear at allele frequencies >0.01 in gnomAD47 were filtered out. Somatic copy number alterations were analyzed using FACETS (Fraction and Allele-Specific Copy Number Estimates from Tumor Sequencing) V.0.5.14.48 Each tumor and matched normal pair was processed in two steps: a first run for ploidy and purity estimation followed by a second run for detection of focal events. Each fit was reviewed manually to minimize false positives and to estimate the quality of the fit. Purity estimates from FACETS were used in the subsequent analysis.
Lasso association between immune types and genomic features
To identify genomic features that significantly differed between the two immune types, we used lasso logistic regression via penalized maximum likelihood using the R package glmnet.49 To account for potential variations due to sequencing batch or subtypes, we incorporated these parameters into the basic model. Other models were further built with either normalized expression of 1,002 intergenic TEs, shuffled TE expression, normalized expression of epigenetic modulators (532 genes), or shuffled epigenetic modulator expression.
Model 1 (basic model): Immune type ∼ batch + sarcoma subtype.
Model 2 (basic model+TE): Immune type ∼ batch + sarcoma subtype + 1,002 TEs.
Model 3 (basic model+TE shuffled): Immune type ∼ batch + sarcoma subtype + 1,002 TEs shuffled.
Model 4 (basic model+epigenetic genes): Immune type ∼ batch + sarcoma subtype + 532 epigenetic genes.
Model 5 (basic model+epigenetic genes shuffled): Immune type ∼ batch + sarcoma subtype + 532 epigenetic genes shuffled.
TEs in the models represent intergenic TEs, and the shuffled TE or epigenetic gene data represent TE or epigenetic gene expression values randomly reassigned across samples. All permutations (shuffling) were performed 10,000 times with tenfold cross-validation (models 3 and 5).
10-fold and threefold cross-validations were tested for each regression and bootstrapped 10,000 times, and lasso coefficients at one SE of the minimum mean cross-validation errors (lambda 1se) were used (model 1, 2 and 4). Each lasso fit returned a small number of predictors, that is, variables with non-zero coefficients, matching genomic features with significant contributions to difference between the two immune types. Final R² values for each model were calculated from the fraction of deviance explained and averaged across the 10 rounds of cross-validation and 10,000 bootstraps. R² values were then used to determine the model with the best performance.
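The "lambda 1se" choice mentioned above follows the one-standard-error rule: among all penalty values whose mean cross-validation error lies within one SE of the minimum, take the largest (most regularized) one. The sketch below illustrates that rule on toy numbers. It is not the glmnet implementation, and the function name and inputs are hypothetical.

```python
def lambda_1se(lambdas, cv_mean, cv_se):
    """Return the largest lambda whose mean CV error is within one SE of the minimum.

    lambdas : candidate penalty values
    cv_mean : mean cross-validation error for each lambda
    cv_se   : standard error of that mean for each lambda
    """
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    eligible = [lam for lam, err in zip(lambdas, cv_mean) if err <= threshold]
    return max(eligible)

# Toy cross-validation curve (not values from the study).
lambdas = [1.0, 0.5, 0.1, 0.01]
cv_mean = [1.30, 1.05, 1.00, 1.02]
cv_se = [0.05, 0.06, 0.06, 0.07]
print(lambda_1se(lambdas, cv_mean, cv_se))  # 0.5
```

Here the minimum error (1.00) occurs at lambda=0.1, but lambda=0.5 is still within one SE (1.05 ≤ 1.06), so the sparser model is preferred.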
P values for the difference between permuted and bootstrapped R² values were calculated by determining the number of permuted R² values greater than the bootstrapped mean R², which were divided by the number of permutations (10,000). This analysis was performed for comparisons between models 2 and 3, and models 4 and 5.
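The empirical p value described above reduces to a simple count. A minimal sketch follows, assuming the permuted and bootstrapped R² values are already available as lists; the function name and toy numbers are illustrative only.

```python
from statistics import mean

def permutation_p_value(permuted_r2, bootstrapped_r2):
    """Fraction of permuted R² values greater than the bootstrapped mean R²."""
    observed = mean(bootstrapped_r2)
    n_exceed = sum(r2 > observed for r2 in permuted_r2)
    return n_exceed / len(permuted_r2)

# Toy values: four permutations and two bootstrap replicates (not study data).
permuted = [0.01, 0.02, 0.30, 0.05]
bootstrapped = [0.28, 0.30]
print(permutation_p_value(permuted, bootstrapped))  # 0.25
```

In the study the denominator would be the 10,000 permutations, so the smallest non-zero p value obtainable this way is 1/10,000.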
To further test the relationship between significant TE and epigenetic features determined by the lasso-penalized logistic regression model, we used logistic glm regression in which immune type represented a dependent variable, while TE score and IKZF1 expression represented independent variables. The model was corrected for batch and histology covariates. The TE score was calculated by performing TE ssGSEA using the gsva function of the GSVA package.50 The expression of six TEs that were found to be significant in the lasso-penalized logistic regression model analysis and that exhibited positive correlation with each other (online supplemental figure S6B) was used as the TE set.
To assess whether IKZF1 expression and TE score predict survival in sarcoma, we performed Cox regression analysis, including sarcoma subtype to control for subtype-specific differences in outcomes. We used the surv_cutpoint function in the R package survminer to determine the optimal cutpoint for converting TE score and IKZF1 expression continuous variables into “high” and “low” categories. We compared survival between the “high” and “low” groups using Kaplan-Meier survival curves and Cox proportional-hazards regression models. Differences were considered significant at a p<0.05 for the tested group.
Gene signature calculations
Genes for immune/inflammatory and other signatures used to determine the correlation of significant features found to be predictive of immune type were defined as previously described in literature and summarized in Kong et al34 (except the cyclic GMP-AMP synthase (cGAS) pathway, which was downloaded from the Kyoto Encyclopedia of Genes and Genomes (KEGG) (https://www.gseamsigdb.org/gsea/msigdb/cards/)). The ssGSEA algorithm was used to comprehensively assess gene signature expression of each.43 The correlation between gene signatures and normalized expression of significant features was assessed by partial Pearson correlation analysis with batch and histology as covariates. P values were corrected using the Benjamini-Hochberg correction.
Comparison with previously reported immune classes
To compare our immune clusters with the five previously defined sarcoma immune classes (SICs),20 we obtained from the authors of the prior study the centroid infiltration scores for each of four cell types (ie, T cells, cytotoxic scores, B lineage, endothelial cells) in the five clusters derived from Microenvironment Cell Populations-counter (MCP-counter) analysis. We then calculated the Euclidean distance, $d(A,B)=\sqrt{\sum_i (A_i-B_i)^2}$, between the centroids of these four cell types for each SIC (ie, A, B, C, D, E) and the Z-score-scaled MCP-counter proportions of the same four cell types in our data. Each sample was assigned to the SIC whose centroid had the lowest Euclidean distance to the sample. Z-score-scaled immune cell proportions were then plotted using the ComplexHeatmap package in R, and the comparison with our immune enhanced and depleted clusters was performed using a χ² test.
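The nearest-centroid assignment described above can be sketched as follows. The centroid values here are invented placeholders, not the published SIC centroids, and the function names are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length score vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def assign_sic(sample, centroids):
    """Return the SIC label whose centroid is closest to the sample."""
    return min(centroids, key=lambda label: euclidean(sample, centroids[label]))

# Order of scores: T cells, cytotoxic score, B lineage, endothelial cells.
# Placeholder centroids for two classes only (not the published values).
centroids = {
    "A": (-1.0, -1.0, -1.0, -1.0),
    "E": (1.0, 1.0, 1.0, 0.0),
}
sample = (0.8, 0.9, 1.2, 0.1)  # Z-score-scaled MCP-counter scores for one tumor
print(assign_sic(sample, centroids))  # E
```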
SARC028 data validation cohort analysis
Raw FASTQ RNA-seq files generated from the SARC028/NCT02301039 trial were provided by Sarcoma Alliance for Research through Collaboration (SARC).51 As described above, we used the REdiscoverTE pipeline to quantify gene and TE expression, followed by immune deconvolution using MCP-counter, unsupervised clustering, and lasso-penalized logistic regression to determine models that predict immune type. Finally, all the detected features were evaluated in the Cox regression model for association with progression-free survival (PFS) using sarcoma subtype as a covariate. The correlation between gene signatures and normalized expression of significant features was assessed by partial Pearson correlation analysis with sarcoma subtype as a covariate as described above.
TCGA data analysis
RNA-seq data and phenotypic information were obtained from dbGaP for 190 TCGA samples from five sarcoma subtypes, including dedifferentiated LPS (DDLPS) (n=49), MFS (n=17), LMS (n=80; 53 soft tissue LMS (STLMS)+27 uterine LMS (ULMS)), and UPS (n=44).3 The REdiscoverTE pipeline was used to quantify gene and TE expression. Batch effect information was downloaded from the TCGA Batch Effects Viewer (https://bioinformatics.mdanderson.org/public-software/tcga-batch-effects/) and considered in the subsequent data analysis. RNA-seq data were deconvoluted, immune clusters were identified, and lasso-penalized logistic regression between immune types and genomic features and the overall survival analysis were performed as described above. To assess the predictive capacity of our original six-TE and IKZF1 model for immune type in the TCGA dataset, we used the predict function from the car package in R (https://cran.r-project.org/web/packages/car/car.pdf). The resulting probabilities were then used to calculate receiver operating characteristic (ROC) and precision-recall curves and area under the curve (AUC) using the precrec R package.52 Sensitivity, specificity, and CIs using 2,000 iterations were calculated using the pROC package.53
For the Kaplan-Meier analysis of this dataset the Cox regression analysis included sarcoma subtype and tumor size. The latter was included since the TCGA dataset comprises nearly all primary cases in which tumor size can be an important prognostic factor. Three sarcoma subtypes with sufficient sample size (DDLPS, UPS, and LMS) were analyzed individually to determine if immune type, IKZF1 expression and TE score were associated with overall survival using Cox regression analysis, using tumor size as a covariate. Methylation cluster assignment for DDLPS samples was retrieved from the original sarcoma TCGA publication,3 and differences in TE score between Meth1 and Meth2 clusters were tested using a t-test.
On treatment biopsy analysis
We analyzed 46 cases from our cohort that had both baseline and on-treatment samples.
Immune deconvolution using MCP-counter followed by unsupervised clustering as described above was used to determine immune type for each pair of samples. Samples were then categorized into four groups: (1) switched from immune depleted to enhanced, (2) switched from immune enhanced to depleted, (3) stable immune depleted, and (4) stable immune enhanced. A paired t-test was used to test the null hypothesis that within the group there was no difference between pretreatment and on-treatment biopsy for the features of interest including TE score and IKZF1 expression. We also calculated differences within each group for each individual cell proportion determined by MCP-counter using a paired t-test.
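The grouping and paired comparison above can be illustrated with a short sketch. The group labels, function names, and toy values are assumptions for the example. The code returns only the paired t statistic and degrees of freedom; obtaining a p value requires the t distribution (for instance via scipy.stats.ttest_rel), which is outside the standard library.

```python
import math
from statistics import mean, stdev

def switch_group(baseline, on_treatment):
    """Map a (baseline, on-treatment) immune type pair to one of four groups."""
    if baseline == on_treatment:
        return f"stable_{baseline}"
    return f"{baseline}_to_{on_treatment}"

def paired_t(pre, post):
    """Paired t statistic and degrees of freedom for matched pre/post values."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

print(switch_group("depleted", "enhanced"))  # depleted_to_enhanced
print(switch_group("enhanced", "enhanced"))  # stable_enhanced

# Toy TE scores for three paired biopsies in one group (not study data).
t, df = paired_t(pre=[0.0, 0.0, 0.0], post=[1.0, 2.0, 3.0])
print(round(t, 3), df)  # 3.464 2
```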
Results
Baseline immune cell populations predict response and progression-free survival in patients with sarcoma treated with immune checkpoint inhibitors
To study the influence of features linked to epigenetic states on antitumor immunity, we first characterized baseline immune infiltrates in tumor biopsies from 67 patients with a heterogeneous set of sarcomas (>10 subtypes) who were subsequently treated on ICI clinical trials (figure 1A and online supplemental table S1). Twelve patients responded to ICI and 55 did not (CR/PR=12, SD=21, PD=34). The UPS subtype accounted for 11% (n=8/67) of the cohort and had a high rate of response (50%; n=4/8), but two-thirds of the responding patients (n=8/12) did not have UPS (online supplemental table S2). To control for subtype-specific differences, we incorporated subtype as a covariate in subsequent models.
Figure 1. Clustering of immune cell fractions groups tumors into two distinct types. (A) Color bars at the top of the heatmap label samples by response and histological subtype. (B) Heatmap of immune and stromal cell fractions and cytotoxicity score determined by MCP-counter Z-scores. (C) Immune checkpoint gene expression Z-scores. (D) Kaplan-Meier plot representing PFS probability of immune enhanced and depleted types. Tick marks indicate censoring. The p values on the Kaplan-Meier plots represent output from a Cox proportional hazards model that includes sarcoma subtype as a covariate. ANGS, angiosarcoma; ASPS, alveolar soft part sarcoma; CHS, chondrosarcoma; CR, complete response; CTLA-4, cytotoxic T-lymphocyte-associated protein 4; EHE, epithelioid hemangioendothelioma; LMS, leiomyosarcoma; LPS, liposarcoma; MCP-counter, Microenvironment Cell Populations-counter; MFS, myxofibrosarcoma; NK, natural killer; OS, osteosarcoma; PD, progressive disease; PFS, progression-free survival; PR, partial response; SARCNOS, sarcoma not otherwise specified; SBRC, small blue round cell sarcoma; SD, stable disease; UPS, undifferentiated pleomorphic sarcoma.
Baseline samples were analyzed to identify tumor characteristics that could be informative prior to treatment and to eliminate confounding by varying ICI drugs and combinations used across trials. We employed an RNA-seq-based method to quantify the abundance of different immune populations (MCP-counter).36 To obtain robust clustering of samples based on their profile of immune infiltrates, we used an HCPC approach,38 which integrates principal components and hierarchical clustering. This HCPC revealed two highly distinct groups, which we termed “immune depleted” and “immune enhanced” (figure 1B; online supplemental figure S1). Except for cancer-associated fibroblasts, all cell types defined by MCP-counter were significantly associated with cluster partitioning, with T cells (p=3.41×10−10) contributing the most, followed by cytotoxicity score (representative of cytotoxic lymphocytes) (p=2.01×10−9), CD8+ T cells (p=6.89×10−9), NK cells (p=3.54×10−8), B cells (p=3.58×10−8), neutrophils (p=9.12×10−8), myeloid dendritic cells (p=1.17×10−7), macrophage/monocytes (p=1.34×10−7), and endothelial cells (p=7.88×10−3). The immune-enhanced cluster displayed, on average, greater abundance of all immune cell types in comparison to the overall mean, and conversely the immune-depleted cluster displayed lower abundance of the same immune cell types (online supplemental table S3).
Having assigned tumors to enhanced and depleted immune groups, we next determined how these immune states correlated with clinical outcomes after the 67 patients in the cohort received ICI-based intervention in one of three clinical trials: pembrolizumab plus talimogene laherparepvec (NCT03069378),32 nivolumab plus bempegaldesleukin (NCT03282344),2 and pembrolizumab plus epacadostat (NCT03414229).33 There were no significant differences between the three ICI trials with respect to the number of patients who did or did not respond or with respect to immune-enhanced versus immune-depleted tumor phenotypes (online supplemental table S4). We compared ORRs by RECIST (Response Evaluation Criteria in Solid Tumors) V.1.154 in immune-enhanced (ORR=30% (9/30)) versus immune-depleted (ORR=8.1% (3/37)) tumors. The ORR in the immune-enhanced group was significantly greater than in the immune-depleted group (Fisher’s exact test, 95% CI 1.03 to 30.31, p=0.02). Furthermore, the immune-enhanced samples were more prevalent than the immune-depleted samples in the CR compared with PD groups (Fisher’s exact test, 95% CI 0.02 to 0.73, p=0.01), while there was no significant difference in the CR versus SD and SD versus PD groups. The expression levels of immune checkpoint-related genes were consistent with the patterns observed in immune infiltrates, with elevated expression of CD274 (programmed death-ligand 1 (PD-L1)), CTLA4 (two-sided t-test, p=1.83×10−6 and p=1.18×10−4, respectively), and LAG3 (two-sided t-test, p=0.12) in immune enhanced tumors (figure 1C).
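For readers who want to see how a Fisher's exact test on the ORR comparison above is set up, the sketch below builds the 2×2 table from the counts in the text (9/30 responders in immune-enhanced, 3/37 in immune-depleted tumors) and sums hypergeometric probabilities. It is an illustrative standard-library implementation with a hypothetical function name, not the authors' code, and its two-sided convention (summing tables no more probable than the observed one) is an assumption, so the result may differ slightly from the published value.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more probable than the observed table.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k in the top-left cell.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    p_obs = prob(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-7))

# Rows: immune-enhanced, immune-depleted; columns: responders, non-responders.
a, b, c, d = 9, 21, 3, 34
print("sample odds ratio:", (a * d) / (b * c))
print("two-sided p:", fisher_exact_two_sided(a, b, c, d))
```

The sample odds ratio printed here is the simple cross-product ratio; the CI quoted in the text comes from the authors' analysis and is not reproduced by this sketch.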
To determine if the baseline immune type was prognostic for PFS, we performed survival analysis that included sarcoma subtype as a covariate (figure 1D, online supplemental figure S2). The median PFS among patients with immune-depleted tumors was 1.7 months versus 3.65 months for immune-enhanced. Based on a multivariate Cox proportional hazards model analysis, immune enhanced classification was a significant predictor of PFS (HR=0.43, 95% CI (0.22 to 0.84), p=0.01), with only the ANGS histological subtype having significant impact on PFS (HR 0.21, 95% CI (0.019 to 0.78), p=0.026) (online supplemental figure S2).
Enhanced and depleted immune types are analogous to previously identified sarcoma immune classes
To determine how the two immune subtypes identified in this study relate to previously described SICs, which correlate with immune infiltrates and ICI response, we classified our samples according to the five SIC clusters (labeled A–E, online supplemental figure S3).20 In total, 47% (14/30) of the immune enhanced samples from our study fell into immune enhanced SICs D and E. The remaining 53% (16/30) of immune enhanced samples were assigned to immune depleted SIC B. In contrast, the immune depleted samples from our study were almost exclusively classified into immune depleted SICs A and B, with only two samples matching SIC C (online supplemental figure S3).
A χ² test revealed a significant association between the immune-enhanced group and combined SIC D+E, and the immune-depleted group and combined SIC A, B, and C (p=1.243×10−5). In summary, the two distinct immune clusters identified in this study between which we observe differences in PFS and ORR following ICI treatment are associated with SICs that are consistent with immune-high and immune-low states.
In addition to the validation of our clustering through comparison with independently developed classifications, we also reasoned that if the immune type clusters identified in our approach via deconvolution of bulk RNA-seq accurately reflected immune cell populations, then the immune enhanced cluster should contain more immune infiltrates than the immune depleted cluster, resulting in lower tumor content. Concordantly, the immune-enhanced type displayed significantly lower purity compared with the immune-depleted type (one-sided t-test; t=−3.11, df=62.43, p<2.2×10−16) (online supplemental figure S4A). To further confirm this relationship, we performed a permutation test randomly assigning samples to immune groups and comparing the difference in purity estimates between the two groups, which was repeated 10,000 times to produce a null distribution. The observed data displayed significantly greater differences in tumor purity estimates compared with the null distribution (p=1×10−4) (online supplemental figure S4B). Lastly, tumor purity was inversely correlated with lymphoid (r2=−0.07, p=0.55) and myeloid cell content (r2=−0.43, p=2.2×10−4) (online supplemental figure S4C), which is consistent with immune cell populations contributing to the non-tumor cell fraction.
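The permutation test on tumor purity described above can be sketched as follows. This is a minimal illustration under stated assumptions: the purity values are synthetic, the function name `permutation_test` is hypothetical, and the add-one correction to the p value is a common convention that the text does not specify.

```python
import random
import statistics

def permutation_test(purity, labels, n_perm=10_000, seed=0):
    """One-sided permutation test: is mean purity lower in 'enhanced' than in 'depleted'?"""
    rng = random.Random(seed)

    def mean_diff(current_labels):
        enhanced = [p for p, lab in zip(purity, current_labels) if lab == "enhanced"]
        depleted = [p for p, lab in zip(purity, current_labels) if lab == "depleted"]
        return statistics.mean(depleted) - statistics.mean(enhanced)

    observed = mean_diff(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # random reassignment of samples to immune groups
        if mean_diff(shuffled) >= observed:
            hits += 1
    # Add-one correction (assumption) so the p value is never exactly zero
    return observed, (hits + 1) / (n_perm + 1)

# Synthetic purity estimates, for illustration only
purity = [0.55, 0.60, 0.48, 0.52, 0.58, 0.85, 0.90, 0.80, 0.88, 0.83]
labels = ["enhanced"] * 5 + ["depleted"] * 5
observed, p = permutation_test(purity, labels, n_perm=2000)
print(f"observed difference={observed:.3f}, p={p:.4f}")
```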
TE and Ikaros (IKZF1) expression predict immune types in sarcoma
Although the activation of immune response through increased expression of TEs and the involvement of epigenetic genes in the regulation of TEs has been established in many cancers,24 25 28 34 55 56 these processes have not been well studied in sarcoma. Our analysis of expression of 1,002 intergenic TEs across the two immune types shows heterogeneous expression (online supplemental figure S5). Thus, we next asked if expression of TEs and epigenetic regulators is predictive of tumor immune types in sarcoma using a machine learning approach. Lasso logistic regression models including expression of TEs (R2=0.29) and epigenetic regulators (R2=0.19) had higher R2 values compared with a basic model that included only sarcoma subtypes and sequencing batch or models of TE and epigenetic regulator expression with randomized immune type sample labels (R2 TE shuffled=0.02, p=0.01; R2 epigenetic regulators shuffled=0.01, p=0.01) indicating that the models including TEs and epigenetic regulators have predictive value for immune type (figure 2A).
Figure 2. TEs and IKZF1 expression predict tumor immune groups. (A) Comparison of lasso logistic regression model performances (R²) of the five tested models for prediction of immune type. Models represent 10-fold cross-validation and 10,000 bootstraps for models 1, 2 and 4, or 10,000 permutations for models 3 and 5. P values for the difference between permuted and bootstrapped R² values were calculated as the number of permuted R² values ≥ the bootstrapped mean R² (R²=0.29 (model 2), R²=0.19 (model 4)) divided by 10,000 permutations (*p value<0.05). (B) Contribution of significant features from the model that combines the best-predictive TE and epigenetic features represented as non-zero coefficients. The size and sign of contribution (coefficients) indicate the direction and strength of the feature’s effect on the outcome (immune cluster). (C) Violin plots of normalized expression of transcripts identified as significant features in the regression model in immune enhanced and depleted clusters. ***p<0.001 as determined by a one-sided t-test. IKZF1, Ikaros family zinc finger 1; TE, transposable element.
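The empirical p value described in the figure 2A legend (the fraction of label-permuted R² values that reach the bootstrapped mean R²) can be sketched as below. The function name and the toy R² values are hypothetical; fitting the lasso models themselves is outside the scope of this sketch.

```python
def permutation_p_value(permuted_r2, bootstrapped_r2):
    """Fraction of permuted R^2 values that are >= the mean bootstrapped R^2."""
    boot_mean = sum(bootstrapped_r2) / len(bootstrapped_r2)
    n_at_least = sum(1 for r2 in permuted_r2 if r2 >= boot_mean)
    return n_at_least / len(permuted_r2)

# Toy values for illustration only (not data from the study)
bootstrapped = [0.2, 0.4]          # mean is about 0.3
permuted = [0.0, 0.1, 0.4, 0.5]    # two of four values exceed the mean
p_value = permutation_p_value(permuted, bootstrapped)
print(p_value)
```

With 10,000 permutations, the smallest non-zero p value this estimator can return is 1/10,000.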
To identify notable features associated with immune type, we extracted top features that consistently emerged as the best contributors across all model iterations, regardless of whether threefold or 10-fold cross-validation was used. These features also demonstrated high stability across 10,000 bootstraps. The selected models identified one epigenetic regulator, IKZF1 (selected in 100% of the models), and six TEs (selected in more than 70% of the models) as significant predictors of immune type from a larger pool of genes and TEs included in the model. All selected features remained stable across both threefold and 10-fold cross-validation and 10,000 bootstrap iterations (online supplemental tables S5 and S6).
Signature TE features with the highest contribution to the model included the MER45A (DNA transposon), MER57F (ERV1), Tigger17a (DNA transposon), MER61F (ERV1), LTR104_Mam (Gypsy), and HERVL74.int (ERVL) TE subfamilies, the expression of which was significantly greater in the immune enhanced cluster (figure 2B,C). In addition, IKZF1, a chromatin-interacting transcription factor57 that regulates three-dimensional chromatin structure,58 was the only epigenetic regulator of 532 genes tested as single genes to significantly contribute to the immune type prediction model and was associated with B cell infiltrates (figure 2B,C and online supplemental figure S6A).
We next fitted a logistic regression model using the signature features (ie, IKZF1, TE score) to predict their effect on immune type. To calculate a TE score, we combined the expression values of the six signature TE features, the expression of which was positively correlated with one another (online supplemental figure S6B). After adjusting for sequencing batch and sarcoma subtype, we found that TE score (p=2.2×10−3) and IKZF1 expression (p=5.8×10−3) were significantly associated with immune type.
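One way to build a composite TE score from the six signature TE subfamilies is sketched below. The text states only that the expression values were combined, so averaging per-subfamily z-scores is an assumption made for illustration, as are the function names and the toy expression values.

```python
import statistics

def z_scores(values):
    """Standardize one TE subfamily across samples."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mu) / sd for v in values]

def te_score(expression):
    """expression: dict of TE subfamily -> per-sample normalized expression.

    Returns one composite score per sample (mean z-score; an ASSUMED definition).
    """
    standardized = [z_scores(vals) for vals in expression.values()]
    n_samples = len(standardized[0])
    return [statistics.mean(col[i] for col in standardized) for i in range(n_samples)]

# Toy expression for two of the six signature subfamilies (hypothetical values)
expression = {
    "MER45A": [1.0, 2.0, 3.0],
    "MER57F": [2.0, 4.0, 6.0],
}
scores = te_score(expression)
print(scores)
```

Standardizing before averaging keeps a highly expressed subfamily from dominating the score; a plain sum of normalized expression would be an equally plausible reading of the text.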
We investigated the association between IKZF1 and TE expression and several clinical variables, including age, body mass index (BMI), sex, and number of prior treatments. Linear regression analysis revealed no significant correlations between TE and age (p=0.86, adj r2=−0.01) or BMI (p=0.7, r2=−0.01). Similarly, IKZF1 expression showed no significant correlation with age (p=0.09, r2=0.02) or BMI (p=0.72, r2=−0.01). For sex, Spearman’s rank correlation test demonstrated no significant association with either TE (p=0.62) or IKZF1 (p=0.80) expression. Finally, when the number of prior treatments was categorized (less than three vs three or more therapies), no significant correlations were observed with TE (p=0.07) or IKZF1 (p=0.36) expression. Tumor stage was metastatic for all but one patient, precluding statistical calculation for this variable.
TE score and IKZF1 expression increase in on-treatment biopsies from tumors that gain ICI-induced immune infiltrates
To further evaluate the relationship between tumor immune phenotype and putative predictors of this immune status, TE and IKZF1 expression, we analyzed paired baseline and on-treatment samples from 46 patients. This paired approach allowed us to control for interpatient differences including sarcoma subtype. We then grouped tumors by their baseline immune status (depleted vs enhanced) and determined if that status changed during ICI treatment (figure 3). The different immune groups were not over-enriched for specific sarcoma subtypes or restricted to specific treatment protocols (online supplemental tables S7 and S8). Notably, there were no responses in tumors that were either stably immune depleted or converted from immune enhanced to immune depleted during ICI treatment. In contrast, all patients who responded to ICI had tumors that were stable immune-enhanced or converted from immune-depleted to immune-enhanced (figure 3).
Figure 3. TE score and IKZF1 expression increase in tumors that gain immune infiltrates following ICI treatment. Paired baseline and on-treatment biopsies were assessed for immune infiltrates and categorized based on the following groups: (A) Tumors that change from immune depleted to immune enhanced, (B) tumors that change from immune enhanced to immune depleted, (C) tumors that remain stably immune depleted, and (D) tumors that remain stably immune enhanced. Within each group, the violin plots represent the distribution of values for baseline (gray), and on-treatment (white) samples, dots represent each sample, and paired samples are connected with lines (yellow lines for patients who responded and brown lines for patients who did not). Asterisks indicate p values from paired t-test; *p≤0.05, **p<0.01, ***p<0.001. ICI, immune checkpoint inhibition; IKZF1, Ikaros family zinc finger 1; TE, transposable element.

In tumors that converted from a depleted to enhanced immune phenotype (n=10), there was a significant increase in both IKZF1 expression and TE score (p=1.73×10−5, p=7.38×10−4 and p=1.37×10−3, respectively) (figure 3A). Conversely, in tumors that converted from enhanced to depleted (n=7), there was a significant decrease in both IKZF1 expression and TE score (p=0.01 and p=0.05, respectively) using paired t-test in both groups (figure 3B). In tumors that remained immune depleted (n=15) or remained immune enhanced (n=14) after ICI treatment, there was no significant change in either IKZF1 or TE score when the pretreatment and on-treatment specimens were compared (figure 3C,D). This suggests that TE score and IKZF1 expression correlate with changes in immune infiltrates following ICI treatment.
As expected, most of the immune cell proportions were significantly increased when comparing baseline and on-treatment samples in the immune-depleted to immune-enhanced transition group, while they were decreased in the enhanced to depleted transition group (online supplemental table S9, online supplemental figure S7A,B). Stable immune-depleted samples showed an increase in multiple cell proportions during ICI treatment, whereas stable immune-enhanced tumors had a significant increase only in CD8+ T cells (online supplemental figure S7C,D, online supplemental table S6).
High expression of IKZF1 or TE score is associated with immune and inflammatory pathway signatures, a TLS signature, and progression-free survival
We next determined whether IKZF1 expression and TE score correlated with activation of immune and inflammatory pathways using a partial Pearson correlation. Both IKZF1 and TE scores were positively correlated with multiple immune pathways, while pathways related to non-immune function were either significantly inversely correlated or not significantly correlated, suggesting a distinct relationship between TE and IKZF1 expression and immune activity in sarcomas (figure 4A). Specifically, TE score and IKZF1 expression were significantly correlated with antiviral response pathways such as cGAS-stimulator of interferon genes (STING) (TE score, r2=0.64, p=7.90×10−9; IKZF1, r2=0.67, p=1.03×10−9), type I IFN (TE score, r2=0.38, p=1.55×10−3; IKZF1, r2=0.32, p=7.89×10−3), and type II IFN (TE score, r2=0.68, p=2.66×10−10; IKZF1, r2=0.55, p=1.28×10−6). We also observed positive correlations between TE score and IKZF1 expression and the increased expression of antigen-processing machinery (TE score, r2=0.49, p=2.99×10−5; IKZF1, r2=0.27, p=2.44×10−2) as well as the CD8+ T cell effector pathway (TE score, r2=0.54, p=2.94×10−6; IKZF1, r2=0.45, p=1.17×10−4).
Figure 4. Immune pathway expression and PFS following ICI treatment are associated with increased TE score and expression of IKZF1. (A) Heatmap of partial Pearson correlation including batch and histology as covariates. Scale from −1 (inverse correlation, blue) to 1 (positive correlation, red). Asterisks indicate Benjamini-Hochberg-corrected p values: *p<0.05, **p<0.01, ***p<0.001. (B) Correlation between TLS signature and TE score, and TLS signature and IKZF1 expression. (C) Kaplan-Meier curves representing PFS probability according to high versus low TE scores and IKZF1 expression. The p values on the Kaplan-Meier plots represent output from a Cox proportional hazards model that includes sarcoma subtype as a covariate. ICI, immune checkpoint inhibition; IFN, interferon; IKZF1, Ikaros family zinc finger 1; IL, interleukin; PFS, progression-free survival; TE, transposable element; TLS, tertiary lymphoid structure.
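A partial Pearson correlation with categorical covariates can be illustrated by removing the covariate effect from both variables and correlating the residuals. The sketch below is a simplification under stated assumptions: each sample carries a single stratum label (for example, a batch and histology combination), residuals are taken as deviations from stratum means, and all names and values are hypothetical rather than taken from the study.

```python
import math
from collections import defaultdict

def residualize(values, strata):
    """Subtract the stratum mean from each value."""
    groups = defaultdict(list)
    for v, s in zip(values, strata):
        groups[s].append(v)
    means = {s: sum(vs) / len(vs) for s, vs in groups.items()}
    return [v - means[s] for v, s in zip(values, strata)]

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_pearson(x, y, strata):
    """Correlation of x and y after removing stratum-level differences."""
    return pearson(residualize(x, strata), residualize(y, strata))

# Toy data: the raw correlation is driven entirely by the stratum offset
x = [1, 2, 3, 11, 12, 13]
y = [2, 4, 6, 30, 28, 26]
strata = ["A", "A", "A", "B", "B", "B"]
raw_r = pearson(x, y)
partial_r = partial_pearson(x, y, strata)
print(f"raw r={raw_r:.3f}, partial r={partial_r:.3f}")
```

In this toy case the raw correlation is high because stratum B has larger values of both variables, while the partial correlation is zero, which is why adjusting for batch and histology matters when subtypes differ in baseline expression.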
We next compared expression of these same pathways in the pretreatment and on-treatment biopsy pairs from patients whose tumors converted from immune-depleted to immune-enhanced on ICI treatment since that transition was correlated with increased TE score and IKZF1 expression. A change in immune phenotype from depleted to enhanced was associated with significantly increased expression of the CD8+T cell effector pathway, type II IFN, cGAS-STING, and NFκB pathway (online supplemental table S10). This finding further supports the relationship between TE and IKZF1 expression, immune infiltrates, and immune/inflammatory pathway expression.
Because CD274 (PD-L1) expression was significantly higher in the immune enhanced group (figure 1C), we also investigated the association between immune checkpoint-related genes and immune activity in baseline samples and found a significant positive correlation between CD274 and immune and inflammatory pathways (CD8+ T cell effector, r2=0.36, adjusted p=3.21×10−3; cGAS-STING, r2=0.71, p=1.53×10−11; type II IFN, r2=0.54, p=2.51×10−6). TE score and IKZF1 also positively correlated with a 9-gene TLS signature (p=1.79×10−7 and p=3.34×10−7, respectively), which is a positive predictor of ICI response in some settings20 39,42 (figure 4B).
Given these associations, we next tested whether the TE score and IKZF1 expression were predictive of outcomes following ICI treatment. To assess whether these features were associated with PFS independently of clinical variables, we first tested univariate and multivariate Cox proportional hazards models. These models included all collected clinical variables: histology, age, sex, BMI, tumor stage, and number of prior therapies. Histology was the only variable demonstrating a significant impact and was therefore subsequently included as a covariate in our downstream models. Both high TE score (p=1.65×10−3) and IKZF1 expression (p=9.28×10−3) correlated with prolonged PFS (high TE 4.4 months vs low TE 1.8 months; high IKZF1 5.3 months vs low IKZF1 1.8 months) (figure 4C). The ORR based on IKZF1 expression was 54.5% (6/11) in the high-expressing group and 10.7% (6/56) in the low-expressing group (p=2.72×10−3, Fisher’s exact test). ORR in the TE-high group was 40% (6/15) and 11.5% (6/52) in the TE-low group (p=0.13; Fisher’s exact test). Taken together, these findings suggest that IKZF1 expression and TE score, which were identified in a model that considered sarcoma subtypes as a variable, could be further explored as potential predictive biomarkers for ICI outcomes.
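The ORR comparison by IKZF1 expression can be reproduced from the counts given above (6 of 11 responders in the high group, 6 of 56 in the low group). The following is a minimal sketch of a two-sided Fisher's exact test using only the Python standard library; the function name is hypothetical, and the two-sided rule used here (summing all tables no more probable than the observed one) is an assumption about how the test was run.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(k):
        # Hypergeometric probability of k responders in the first row
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_observed = prob(a)
    # Sum probabilities of all tables at least as extreme as the observed one
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_observed * (1 + 1e-9))

# IKZF1-high: 6 responders, 5 non-responders; IKZF1-low: 6 responders, 50 non-responders
p = fisher_exact_two_sided(6, 5, 6, 50)
print(f"p={p:.2e}")
```

With these counts the result is close to the reported p=2.72×10−3.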
IKZF1 expression predicts immune type and associates with PFS and inflammatory pathway activation in a separate cohort of patients with sarcoma treated with ICI
To assess the replicability of our findings, we applied our analysis to gene expression data from 30 baseline sarcoma samples from the SARC028 clinical cohort in which patients with sarcoma received pembrolizumab.8 51 We used unsupervised clustering to identify immune types based on deconvoluted cell proportions and identified two major immune phenotypes: enhanced and depleted (figure 5A–D, online supplemental figure S8).
Figure 5. IKZF1 expression is associated with improved PFS in a SARC028 cohort. (A) Clustering of samples into immune enhanced and immune depleted types. Color bar at top labels samples by histological subtype. (B) Heatmap of immune and stromal cell fractions and cytotoxicity score determined by MCP-counter Z-scores. (C) Immune checkpoint gene expression Z-scores. (D) Kaplan-Meier plot of PFS probability for patients with immune enhanced and depleted type tumors. (E) Kaplan-Meier curves representing PFS probability for patients with tumors with high (red) and low (blue) IKZF1 expression. The p-values on the Kaplan-Meier plots represent output from a Cox proportional hazards model that includes sarcoma subtype as a covariate. CR, complete response; CS, dedifferentiated chondrosarcoma; CTLA-4, cytotoxic T-lymphocyte-associated protein 4; DDLPS, dedifferentiated liposarcoma; ES, Ewing sarcoma; IKZF1, Ikaros family zinc finger 1; LS, leiomyosarcoma; MCP-counter, Microenvironment Cell Populations-counter; NK, natural killer; OS, osteosarcoma; PD, progressive disease; PFS, progression-free survival; PR, partial response; SD, stable disease; SS, synovial sarcoma; UPS, undifferentiated pleomorphic sarcoma.
The immune-enhanced cluster was associated with improved PFS (p=0.03) after correcting for sarcoma subtype, confirming observations in the original ICI cohort (figure 5D). Additionally, we confirmed that IKZF1 is an important predictor of immune type (R2=0.36, model with epigenetic genes; IKZF1 is the only contributor to the model, contribution=0.48) and its association with PFS (p=0.05, after correction for subtype) (figure 5E, online supplemental figure S9A–C). The sample size was too small to determine if increased response rates were associated with the immune-enhanced group; the RNA-seq cohort from the SARC028 study included only three patients who responded (two in the immune-enhanced group and one in the immune-depleted group) (figure 5A).
The sequencing method for the SARC028 samples used an exome capture approach,51 which markedly limited the number of mapped TEs to 244 out of 1,052 TE families, with an average of 18 reads per TE family (online supplemental figure S10A). Compared with the initial ICI dataset which used whole transcriptome sequencing that covered 1,002 TE families with an average of 58 reads (online supplemental figure S10B), too few reads mapped to TEs in the SARC028 samples to evaluate this component of the findings from the initial ICI-treated cohort. Accordingly, TE expression was not a significant feature in the predictive models for immune type in the SARC028 cohort (online supplemental figure S9A).
We therefore focused our analysis on IKZF1 expression in the SARC028 cohort and found that it significantly correlated with antiviral response pathways such as cGAS-STING (r2=0.76, p=1.45×10−6), type II IFN (r2=0.61, p=3.56×10−4), antigen-processing machinery (r2=0.53, p=2.91×10−3), and the CD8+ T cell effector pathway (r2=0.61, p=4.00×10−4) (online supplemental figure S11A). These findings are similar to observations in the original ICI cohort. In addition, CD274 expression showed positive correlation with immune/inflammatory pathways and negative correlation with non-immune pathways (online supplemental figure S11A). We also further confirmed that IKZF1 is significantly positively correlated with a TLS signature and B-cell content (online supplemental figure S11B,C).
TE and IKZF1 expression associate with immune infiltrates and inflammatory pathways in a treatment-naïve cohort
To assess the replicability of our findings with respect to predictors of immune infiltrates and outcomes in an additional cohort representing treatment-naïve patients, we applied our analysis to gene expression data from 190 sarcoma samples from the TCGA.3 This group was chosen as it includes five sarcoma subtypes, DDLPS (n=49), MFS (n=17), LMS (n=80; 53 STLMS; 27 ULMS), and UPS (n=44), which were prevalent in our original ICI cohort. The immune signatures in the TCGA group segregated into two distinct clusters marked by high (immune enhanced) and low (immune depleted) immune infiltrates and expression of immune checkpoint genes (figure 6A–C, online supplemental figure S12). The immune-enhanced cluster was associated with improved overall survival (p=1.09×10−2), at a median of 37.5 months versus 25.5 months for the immune-depleted cluster (figure 6D). As in the initial ICI-treatment cohort, expression of TEs and epigenetic regulators predicted immune type when accounting for sarcoma subtype (model with TEs, R2=0.36; model with epigenetic genes, R2=0.42; online supplemental figure S13A) and specific TEs and IKZF1 expression were identified as signature features that positively correlated with immune enhanced classification (online supplemental figure S13B,C). All selected features demonstrated stability across both threefold and 10-fold cross-validation, and 10,000 bootstrap iterations (online supplemental tables S11 and S12).
Figure 6. TE score and IKZF1 expression associate with improved overall survival in a treatment-naïve TCGA SARC cohort. (A) Clustering of samples into immune-enhanced and immune-depleted types. Color bar at top labels samples by histological subtype. (B) Heatmap of immune and stromal cell fractions and cytotoxicity score determined by MCP-counter Z-scores. (C) Immune checkpoint gene expression Z-scores. (D) Kaplan-Meier plot of overall survival probability of patients with immune enhanced and depleted tumors. (E–F) Kaplan-Meier curves representing overall survival probability of high (red) and low (blue) (E) TE scores and (F) IKZF1 expression. The p values on the Kaplan-Meier plots represent output from a Cox proportional hazards model that includes sarcoma subtype and tumor size as covariates. CTLA-4, cytotoxic T-lymphocyte-associated protein 4; DDLPS, dedifferentiated liposarcoma; IKZF1, Ikaros family zinc finger 1; LS, leiomyosarcoma; LMS, leiomyosarcoma; MCP-counter, Microenvironment Cell Populations-counter; MFS, myxofibrosarcoma; NK, natural killer; SARC, Sarcoma Alliance for Research through Collaboration; TCGA, The Cancer Genome Atlas; TE, transposable element; UPS, undifferentiated pleomorphic sarcoma.
Overall survival was significantly greater for patients whose tumor had a high TE score (p=1.26×10−3) or high IKZF1 (p=4.94×10−3) expression (figure 6E–F).
Expression of IKZF1, TEs (again defined as a composite TE score), and CD274 positively correlated with that of immune pathways including type I and II IFN (p<0.001), antigen-processing machinery (p<0.001), the cGAS-STING pathway (p<0.001), and CD8+ T effector cells, but not non-immune pathways except angiogenesis in the case of TE score and IKZF1 (online supplemental figure S14A). IKZF1 and TE scores were also positively correlated with TLS score (p=1.32×10−38, and p=6.31×10−15, respectively) (online supplemental figure S14B).
Although the preceding analyses were controlled for sarcoma subtypes, we definitively removed subtype as a variable by testing whether TE score or IKZF1 expression was associated with overall survival within a given subtype. Three sarcoma subtypes, DDLPS, UPS, and LMS, were sufficiently represented in the TCGA cohort to allow this subgroup assessment. In a multivariate analysis including tumor size as a covariate, TE score was significantly associated with OS in all three subtypes (LMS, p=0.01; DDLPS, p=0.01; UPS, p=0.01) (online supplemental figures S15–S17). High IKZF1 expression was significantly associated with prolonged OS in LMS (p=0.05) with a non-significant trend towards increased OS in UPS and DDLPS.
Finally, because DNA methylation is one of several epigenetic mechanisms for TE silencing,24 55 we explored whether DNA methylation status correlated with TE expression and outcomes. Using DNA methylation data for the TCGA cohort, we focused on DDLPS, which was previously grouped into hypomethylated and hypermethylated clusters.3 We found that TE expression is higher in the hypomethylated cluster (p=0.01), which was previously shown to be associated with significantly longer disease-specific survival compared with the hypermethylated cluster.3 This suggests that DNA methylation is a potentially relevant and pharmacologically modifiable mechanism for maintaining TE silencing in DDLPS.
Generalizability of a TE/IKZF1 model for immune type prediction
While IKZF1 and TEs were associated with immune type in our independent datasets, additional TE families and epigenetic genes also contributed to immune enhanced/depleted classification in the TCGA data. Therefore, we assessed the generalizability of our original six-TE and IKZF1 model by evaluating its predictive capacity for immune type in the TCGA dataset. Using our lasso-penalized logistic regression model trained on 67 clinical samples (our original dataset), we predicted the probability of immune type for each of the 190 TCGA samples. The model achieved an AUC-Precision-Recall of 87.57% (95% CI: 80.06% to 90.93%), indicating very good discriminatory ability (online supplemental figure S18A). At a threshold of 0.5, the model demonstrated 17.3% sensitivity, 97.67% specificity, and 90% precision. Although the precision-recall curve is more informative for evaluating our model due to the imbalanced groups, we also present the ROC curve, which yielded an AUC of 85.84% (online supplemental figure S18B). Given its strength in correctly identifying immune-enhanced cases, this model may be useful for future patient selection in immunotherapy trials using IKZF1 and the six TE expression levels.
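The threshold-based metrics reported above follow directly from a confusion matrix. The sketch below computes sensitivity, specificity, and precision at a probability cutoff of 0.5. The predicted probabilities and labels are hypothetical: they were chosen only so that the resulting counts (9 true positives, 43 false negatives, 1 false positive, 42 true negatives) give percentages consistent with those reported, and they are not the study data.

```python
def threshold_metrics(probabilities, labels, threshold=0.5):
    """Sensitivity, specificity and precision for binary labels (1 = immune enhanced)."""
    tp = fp = tn = fn = 0
    for prob, label in zip(probabilities, labels):
        predicted_positive = prob >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "precision": tp / (tp + fp) if tp + fp else float("nan"),
    }

# Hypothetical predictions constructed for illustration only
probabilities = [0.9] * 9 + [0.2] * 43 + [0.8] * 1 + [0.1] * 42
labels = [1] * 52 + [0] * 43
metrics = threshold_metrics(probabilities, labels)
print({name: round(value, 4) for name, value in metrics.items()})
```

A profile of this kind (high precision with low sensitivity) means that samples called immune enhanced at this cutoff are usually correct, but many immune-enhanced samples fall below it, so a lower threshold could be explored if recall matters more.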
Discussion
This work addresses the pressing need to understand the determinants of antitumor immunity in sarcomas as a step towards the goal of identifying predictive biomarkers of ICI response and strategies to improve ICI outcomes. We determined the minimal number of immune clusters that represent immune-enhanced and depleted sarcomas and showed that the former is associated with higher ORR and longer PFS following ICI treatment when controlling for sarcoma subtype. This finding corroborates prior studies that have shown a correlation between high baseline immune infiltrates and response to immune therapy.20 Importantly, our work demonstrates that these findings apply in a cohort with a broad spectrum of sarcoma subtypes and in the setting of three combination ICI trials with diverse mechanisms. Moreover, our analysis shows that a binary classification of tumors is sufficient to correlate with clinical outcomes, indicating that immune clustering can be simplified compared with previous approaches that involved more groups.20 Such a simplified system could be helpful in smaller studies with limited numbers of cases.
Epigenetic mechanisms are known to suppress antitumor immune responses and targeting epigenetic pathways has emerged as a promising therapeutic strategy.22 31 59 We examined the expression of epigenetic regulators and TEs, the latter of which are normally epigenetically silenced (e.g., via heterochromatin or DNA methylation) and can stimulate innate immune responses when derepressed.28,30 60 In keeping with this mechanism, we showed that DNA hypomethylation in DDLPS was associated with increased TE expression, thereby supporting a potential epigenetic role in regulating TE expression in sarcomas. While our findings suggest an association between DNA hypomethylation and TE expression in DDLPS, previous research has demonstrated that a group of TE families can be upregulated by IFN-γ/STAT1 signaling. However, our top TE families do not overlap with a previously published list enriched for IFN-γ binding sites.61 Furthermore, we observed increased expression of TEs in immune-enhanced tumors, and a gain in TE expression in tumors that transitioned from an immune-depleted to an immune-enhanced phenotype after ICI treatment.
TEs can activate double-stranded RNA (dsRNA)-sensing pathways, which has been observed in the setting of genetic lesions in epigenetic regulators or pharmacologic treatment that led to their derepression.24 25 31 62 Our observation of upregulated antiviral immune responses (including cGAS and type I IFN signaling) with increased TE expression in two independent cohorts of patients with sarcoma and in tumors that gained immune infiltrates on ICI treatment is consistent with this mechanism. Further investigation is needed to determine whether the presentation of TE-derived neoantigens via major histocompatibility complex-I, as previously observed in the loss of epigenetic TE silencing in other contexts,28 could also contribute to the immune enhanced state.
In addition to TEs, our initial analysis of ICI-treated patients with sarcoma revealed that expression of IKZF1 was significantly greater in immune-enhanced tumors and associated with PD-L1 expression, B cell infiltrates, and a TLS signature, which was validated in a separate cohort of patients treated with pembrolizumab on the SARC028 trial. Although Ikaros, the IKZF1 gene product, is well known as a transcription factor in hematologic lineages,57 it was included in our list of epigenetic regulators given the inherent interaction of transcription factors and chromatin. In addition, recent reports also suggest an important role for Ikaros in regulating higher order chromatin structure.58
We validated the relationship between tumor immune infiltrates, TE and IKZF1 expression, and immune pathway signaling, in an unrelated set of samples from the TCGA in which high TE score and IKZF1 expression were associated with improved overall survival. Taking advantage of the relatively large sample size of the cohort, we repeated the same analysis within specific sarcoma subtypes, thereby removing subtype as a variable. We found that TE expression remained an important predictor of immune infiltrates and was associated with improved OS in three common sarcoma subtypes: DDLPS, LMS, and UPS. Higher IKZF1 was associated with improved OS in LMS with a trend towards improved OS in the DDLPS and UPS subtypes. It is possible that TE derepression emerged as the stronger signal in this analysis since epigenetic reprogramming, which often occurs during malignant transformation,55 may be more universal in the treatment-naïve setting represented in the TCGA cohort. In contrast, the association between OS and IKZF1 expression, which was more prominent in the larger pooled cohort, may be a weaker signal in treatment-naïve samples, and could be enhanced by ICI treatment given the observation of greater IKZF1 expression and expression of immune-related pathways in patients whose tumors gained immune infiltrates on ICI treatment.
We further analyzed a cohort of patients (n=30) who received pembrolizumab in the SARC028 trial and for whom RNA-seq data were available. This confirmed the relationship between IKZF1 expression, immune infiltrates, immune pathway signaling, and PFS following ICI treatment. This analysis was unfortunately limited with respect to TE expression because of the RNA-seq technique used, and the response analysis was precluded by the small number of samples associated with response in the subset of the SARC028 study with available RNA-seq data. Notably, the distribution of the subtypes represented in the initial ICI cohort and the SARC028 cohort was different. While the latter included common sarcoma subtypes also represented in the original group, the dominant subtypes in the SARC028 cohort were osteosarcoma (30%, n=9/30) and Ewing sarcoma (16%, n=5/30), while the original cohort was dominated by LMS (16%, n=11/67), EHE (11%, n=8/67), and UPS (11%, n=8/67). These differences make it less likely that the observed relationships are driven by a specific sarcoma subtype. Given the breadth of subtypes across all cohorts and that we controlled for sarcoma subtype in our analyses, the relationship between TE and IKZF1 expression, immune infiltrates, immune signaling, and outcomes may be generalizable across sarcoma subtypes. However, future validation studies are required to fully evaluate this possibility.
Based on the data presented here, TE and IKZF1 expression are best classified as emerging or exploratory biomarkers in the context of the 2025 Society for Immunotherapy of Cancer consensus statement on biomarkers for immunotherapy protocols.63 Future work is required to further test the validity of these features as prognostic and/or predictive biomarkers. Thus, building from this initial study to advance the consideration of TEs and/or IKZF1 as biomarkers, we propose that formal evaluation, following REMARK (Reporting Recommendations for Tumor Marker Prognostic Studies) principles,64 should be included in future sarcoma ICI trial designs. Given that this study is discovery-focused, the analytical validity of these potential biomarkers requires further assessment with refined assay development and standardization, including exploration of immunohistochemistry-based approaches to assess IKZF1 status and potential identification of a core TE signature that can be measured using quantitative PCR instead of RNA-seq. That said, the current reporting successfully validates the correlation of TE and IKZF1 expression with outcomes and immune infiltrates across multiple cohorts. Prospective hypothesis testing would further enhance the clinical validity of TE and IKZF1 expression in identifying patients with sarcoma who are most likely to have improved outcomes, including after ICI treatment. Finally, the clinical utility of determining TE and IKZF1 expression is at present unknown but should be addressed as a further step after assessing their clinical validity in work motivated by our findings.
There are several limitations of this study including the heterogeneity in sarcoma subtypes and that the initial ICI-treated cohort had a relatively small sample size (n=67) with patients included from three trials of ICI-based regimens with different mechanisms. However, while this heterogeneity may have decreased our ability to identify signals related to specific epigenetic genes or TE families, we were reassuringly able to classify tumors into immune classes that were predictive of clinical outcomes and confirmed prior classification systems. In addition, while we controlled for subtype in our analysis methods and through intra-subtype cohorts, it is important to note that no statistical method can completely eliminate the possibility that subtype heterogeneity may have impacted the analysis of a relatively small cohort. Another limitation of the study is that we selected polyadenylated transcripts for RNA-seq, which may have limited the detection of theoretically transcribed but non-polyadenylated TEs. Additionally, we potentially overlooked important epigenetic mechanisms by considering epigenetic genes as independent, when many encode proteins that form complexes or functional pathways. Finally, given the role for Ikaros in B-cell development and that greater infiltration of B lineage immune cells associates with overall survival in soft tissue sarcomas,20 we cannot exclude the possibility that the bulk of the IKZF1 expression is in the immune, not malignant, portion of the tumors. In-depth single-cell or spatial transcriptomic analysis designed as correlates in future sarcoma ICI clinical trials would address this important question raised by our work.
Increasing the effectiveness of immunotherapies and identifying predictors of ICI response would represent important advances in sarcoma. Our work presents several possibilities for pursuing these goals using data from pretreatment biopsies. We confirmed earlier studies showing that pretreatment immune status can predict ICI outcomes and propose IKZF1 expression and TE score as potential predictive biomarkers for ICI outcomes, both of which require further testing and validation. In addition, this work suggests potential strategies to enhance ICI response by stimulating the immune responsiveness of baseline immune-depleted tumors to convert them into an immune-enhanced phenotype. Testing these strategies would also address a limitation inherent to investigations on patient materials, which is a lack of mechanistic studies in which the expression of TEs and IKZF1 can be experimentally manipulated in functional assays. For example, epigenetic drugs targeting DNA methylation, heterochromatin formation, or histone acetylation led to derepression of TEs in several carcinomas.24,25,65,67 Notably, several drugs in these classes are clinically available for other indications and many others are in development.68 Our analysis demonstrated an inverse correlation between DNA methylation and TE expression in DDLPS, which suggests that at least one targetable epigenetic mechanism of TE silencing is relevant in sarcomas. Based on this work, promoting the derepression of TEs by pharmacologic manipulation of epigenetic regulators could be explored in sarcoma preclinical models.
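As an illustration of the kind of inverse relationship between DNA methylation and TE expression described above, the following is a minimal sketch of a rank-based (Spearman) correlation. It rests on assumptions: the text does not state which correlation statistic was used, the function names are hypothetical, and the sample values are invented toy numbers rather than data from this study.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rho: the Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length >= 3")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values (NOT study data): one entry per tumor sample.
methylation_beta = [0.82, 0.75, 0.64, 0.58, 0.41, 0.33]  # mean methylation
te_score = [1.1, 1.4, 1.3, 2.6, 3.0, 3.7]                # summary TE expression

rho = spearman(methylation_beta, te_score)
print(f"Spearman rho = {rho:.3f}")  # negative rho indicates an inverse relationship
```

A rank-based statistic is used here only because it is robust to the skewed distributions typical of expression data; with a cohort of this size, any such estimate would carry wide uncertainty and would need a significance test and correction for multiple comparisons.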
Supplementary material
Acknowledgements
We acknowledge and appreciate editorial assistance from Jessica Moore. Generative AI was used to proofread sections of the manuscript to check for typographical or grammatical errors. Suggestions were then reviewed prior to incorporating them into the text.
Footnotes
Funding: This work was supported by Merck, Amgen, NEKTAR, Incyte, Bristol Myers Squibb, Cycle for Survival, and Witherwax Fund. BAN received support from the NCI K08CA245212 and the Connective Tissue Oncology Society Basic Science Research Award (with WT). BAN is a Damon Runyon Clinical Investigator supported (in part) by the Damon Runyon Cancer Foundation (CI-124-23). Additional support was provided by the Memorial Sloan Kettering Cancer Center Support Grant (P30 CA008748) and the Hillman Cancer Center Support Grant (P30 CA047904). We acknowledge the use of the Integrated Genomics Operation Core, funded by the NCI Cancer Center Support Grant (CCSG, P30 CA008748), and the Marie-Josée and Henry R Kravis Center for Molecular Oncology. Funding was provided to cover the cost of the clinical trials for the initial ICI-treated cohort, which is reported elsewhere. Funders were not involved in the analysis, interpretation, writing, or decision to submit.
Provenance and peer review: Not commissioned; externally peer reviewed.
Patient consent for publication: Not applicable.
Ethics approval: Clinical data were collected and DNA and RNA sequencing of pretreatment biopsy samples was performed under Institutional Review Board oversight of three clinical trials performed at the Memorial Sloan Kettering Cancer Center. These include pembrolizumab plus talimogene laherparepvec (NCT03069378), nivolumab plus bempegaldesleukin (NCT03282344), and pembrolizumab plus epacadostat (NCT03414229). Details regarding each study’s design, safety oversight, and interventions can be found in referenced publications for each study (see manuscript text). Participants gave informed consent to participate in the study before taking part.
Data availability free text: All RNA sequencing data for which informed consent has been obtained from the patient are publicly available via dbGaP (accession number: phs003284; ref. 70). Three samples are not publicly available due to lack of consent for their release. All exome recapture sequencing data will be available via dbGaP under accession number phs001783 (ref. 71) by the time of publication. Custom code used for analysis is publicly available here: https://github.com/BradicM/Sarcoma_TE_paper_analysis.
Data availability statement
Data are available in a public, open access repository.
References
- 1. WHO. Soft Tissue and Bone Tumours. 5th edn. Lyon (France): International Agency for Research on Cancer; 2020.
- 2. D’Angelo SP, Richards AL, Conley AP, et al. Pilot study of bempegaldesleukin in combination with nivolumab in patients with metastatic sarcoma. Nat Commun. 2022;13:3477. doi: 10.1038/s41467-022-30874-8.
- 3. Cancer Genome Atlas Research Network. Comprehensive and Integrated Genomic Characterization of Adult Soft Tissue Sarcomas. Cell. 2017;171:50–65. doi: 10.1016/j.cell.2017.10.014.
- 4. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2018. CA Cancer J Clin. 2018;68:7–30. doi: 10.3322/caac.21442.
- 5. von Mehren M, Kane JM, Agulnik M, et al. Soft Tissue Sarcoma, Version 2.2022, NCCN Clinical Practice Guidelines in Oncology. J Natl Compr Canc Netw. 2022;20:815–33. doi: 10.6004/jnccn.2022.0035.
- 6. Seddon B, Strauss SJ, Whelan J, et al. Gemcitabine and docetaxel versus doxorubicin as first-line treatment in previously untreated advanced unresectable or metastatic soft-tissue sarcomas (GeDDiS): a randomised controlled phase 3 trial. Lancet Oncol. 2017;18:1397–410. doi: 10.1016/S1470-2045(17)30622-8.
- 7. D’Angelo SP, Mahoney MR, Van Tine BA, et al. Nivolumab with or without ipilimumab treatment for metastatic sarcoma (Alliance A091401): two open-label, non-comparative, randomised, phase 2 trials. Lancet Oncol. 2018;19:416–26. doi: 10.1016/S1470-2045(18)30006-8.
- 8. Tawbi HA, Burgess M, Bolejack V, et al. Pembrolizumab in advanced soft-tissue sarcoma and bone sarcoma (SARC028): a multicentre, two-cohort, single-arm, open-label, phase 2 trial. Lancet Oncol. 2017;18:1493–501. doi: 10.1016/S1470-2045(17)30624-1.
- 9. Burgess MA, Bolejack V, Schuetze S, et al. Clinical activity of pembrolizumab (P) in undifferentiated pleomorphic sarcoma (UPS) and dedifferentiated/pleomorphic liposarcoma (LPS): Final results of SARC028 expansion cohorts. J Clin Oncol. 2019;37:11015. doi: 10.1200/JCO.2019.37.15_suppl.11015.
- 10. Banks LB, D’Angelo SP. The Role of Immunotherapy in the Management of Soft Tissue Sarcomas: Current Landscape and Future Outlook. J Natl Compr Canc Netw. 2022;20:834–44. doi: 10.6004/jnccn.2022.7027.
- 11. Carbone DP, Reck M, Paz-Ares L, et al. First-Line Nivolumab in Stage IV or Recurrent Non-Small-Cell Lung Cancer. N Engl J Med. 2017;376:2415–26. doi: 10.1056/NEJMoa1613493.
- 12. Hellmann MD, Ciuleanu T-E, Pluzanski A, et al. Nivolumab plus Ipilimumab in Lung Cancer with a High Tumor Mutational Burden. N Engl J Med. 2018;378:2093–104. doi: 10.1056/NEJMoa1801946.
- 13. Rosenberg JE, Hoffman-Censits J, Powles T, et al. Atezolizumab in patients with locally advanced and metastatic urothelial carcinoma who have progressed following treatment with platinum-based chemotherapy: a single-arm, multicentre, phase 2 trial. Lancet. 2016;387:1909–20. doi: 10.1016/S0140-6736(16)00561-4.
- 14. Van Allen EM, Miao D, Schilling B, et al. Genomic correlates of response to CTLA-4 blockade in metastatic melanoma. Science. 2015;350:207–11. doi: 10.1126/science.aad0095.
- 15. Rizvi NA, Hellmann MD, Snyder A, et al. Cancer immunology. Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer. Science. 2015;348:124–8. doi: 10.1126/science.aaa1348.
- 16. Samstein RM, Lee C-H, Shoushtari AN, et al. Tumor mutational load predicts survival after immunotherapy across multiple cancer types. Nat Genet. 2019;51:202–6. doi: 10.1038/s41588-018-0312-8.
- 17. Le DT, Durham JN, Smith KN, et al. Mismatch repair deficiency predicts response of solid tumors to PD-1 blockade. Science. 2017;357:409–13. doi: 10.1126/science.aan6733.
- 18. Nacev BA, Sanchez-Vega F, Smith SA, et al. Clinical sequencing of soft tissue and bone sarcomas delineates diverse genomic landscapes and potential therapeutic targets. Nat Commun. 2022;13:3405. doi: 10.1038/s41467-022-30453-x.
- 19. Italiano A, Bessede A, Pulido M, et al. Pembrolizumab in soft-tissue sarcomas with tertiary lymphoid structures: a phase 2 PEMBROSARC trial cohort. Nat Med. 2022;28:1199–206. doi: 10.1038/s41591-022-01821-3.
- 20. Petitprez F, de Reyniès A, Keung EZ, et al. B cells are associated with survival and immunotherapy response in sarcoma. Nature. 2020;577:556–60. doi: 10.1038/s41586-019-1906-8.
- 21. Allis CD, Caparros M-L, Jenuwein T, et al. Epigenetics. 2nd edn. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press; 2015. xiv, 984 p.
- 22. Que Y, Zhang X-L, Liu Z-X, et al. Frequent amplification of HDAC genes and efficacy of HDAC inhibitor chidamide and PD-1 blockade combination in soft tissue sarcoma. J Immunother Cancer. 2021;9:e001696. doi: 10.1136/jitc-2020-001696.
- 23. Krug B, De Jay N, Harutyunyan AS, et al. Pervasive H3K27 Acetylation Leads to ERV Expression and a Therapeutic Vulnerability in H3K27M Gliomas. Cancer Cell. 2019;35:782–97. doi: 10.1016/j.ccell.2019.04.004.
- 24. Chiappinelli KB, Strissel PL, Desrichard A, et al. Inhibiting DNA Methylation Causes an Interferon Response in Cancer via dsRNA Including Endogenous Retroviruses. Cell. 2015;162:974–86. doi: 10.1016/j.cell.2015.07.011.
- 25. Topper MJ, Vaz M, Chiappinelli KB, et al. Epigenetic Therapy Ties MYC Depletion to Reversing Immune Evasion and Treating Lung Cancer. Cell. 2017;171:1284–300. doi: 10.1016/j.cell.2017.10.022.
- 26. Sheng W, LaFleur MW, Nguyen TH, et al. LSD1 Ablation Stimulates Anti-tumor Immunity and Enables Checkpoint Blockade. Cell. 2018;174:549–63. doi: 10.1016/j.cell.2018.05.052.
- 27. Macfarlan TS, Gifford WD, Agarwal S, et al. Endogenous retroviruses and neighboring genes are coordinately repressed by LSD1/KDM1A. Genes Dev. 2011;25:594–607. doi: 10.1101/gad.2008511.
- 28. Griffin GK, Wu J, Iracheta-Vellve A, et al. Epigenetic silencing by SETDB1 suppresses tumour intrinsic immunogenicity. Nature. 2021;595:309–14. doi: 10.1038/s41586-021-03520-4.
- 29. Zhang S-M, Cai WL, Liu X, et al. KDM5B promotes immune evasion by recruiting SETDB1 to silence retroelements. Nature. 2021;598:682–7. doi: 10.1038/s41586-021-03994-2.
- 30. Hu H, Khodadadi-Jamayran A, Dolgalev I, et al. Targeting the Atf7ip-Setdb1 Complex Augments Antitumor Immunity by Boosting Tumor Immunogenicity. Cancer Immunol Res. 2021;9:1298–315. doi: 10.1158/2326-6066.CIR-21-0543.
- 31. Burr ML, Sparbier CE, Chan KL, et al. An Evolutionarily Conserved Function of Polycomb Silences the MHC Class I Antigen Presentation Pathway and Enables Immune Evasion in Cancer. Cancer Cell. 2019;36:385–401. doi: 10.1016/j.ccell.2019.08.008.
- 32. Kelly CM, Antonescu CR, Bowler T, et al. Objective Response Rate Among Patients With Locally Advanced or Metastatic Sarcoma Treated With Talimogene Laherparepvec in Combination With Pembrolizumab: A Phase 2 Clinical Trial. JAMA Oncol. 2020;6:402–8. doi: 10.1001/jamaoncol.2019.6152.
- 33. Kelly CM, Qin L-X, Whiting KA, et al. A Phase II Study of Epacadostat and Pembrolizumab in Patients with Advanced Sarcoma. Clin Cancer Res. 2023;29:2043–51. doi: 10.1158/1078-0432.CCR-22-3911.
- 34. Kong Y, Rose CM, Cass AA, et al. Transposable element expression in tumors is associated with immune infiltration and increased antigenicity. Nat Commun. 2019;10:5228. doi: 10.1038/s41467-019-13035-2.
- 35. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–40. doi: 10.1093/bioinformatics/btp616.
- 36. Sturm G, Finotello F, Petitprez F, et al. Comprehensive evaluation of transcriptome-based cell-type quantification methods for immuno-oncology. Bioinformatics. 2019;35:i436–45. doi: 10.1093/bioinformatics/btz363.
- 37. Ritchie ME, Phipson B, Wu D, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43:e47. doi: 10.1093/nar/gkv007.
- 38. Lê S, Josse J, Husson F. FactoMineR: A Package for Multivariate Analysis. J Stat Softw. 2008;25:1–18. doi: 10.18637/jss.v025.i01.
- 39. Cabrita R, Lauss M, Sanna A, et al. Tertiary lymphoid structures improve immunotherapy and survival in melanoma. Nature. 2020;577:561–5. doi: 10.1038/s41586-019-1914-8.
- 40. Feng H, Yang F, Qiao L, et al. Prognostic Significance of Gene Signature of Tertiary Lymphoid Structures in Patients With Lung Adenocarcinoma. Front Oncol. 2021;11:693234. doi: 10.3389/fonc.2021.693234.
- 41. Wang L, Gong S, Pang L, et al. Genomic properties and clinical outcomes associated with tertiary lymphoid structures in patients with breast cancer. Sci Rep. 2023;13:13542. doi: 10.1038/s41598-023-40042-7.
- 42. Wu W, Li H, Wang Z, et al. The tertiary lymphoid structure-related signature identified PTGDS in regulating PD-L1 and promoting the proliferation and migration of glioblastoma. Heliyon. 2024;10:e23915. doi: 10.1016/j.heliyon.2023.e23915.
- 43. Hänzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics. 2013;14:7. doi: 10.1186/1471-2105-14-7.
- 44. MIT Tempo: ccs research pipeline for whole-genome and whole-exome sequencing. 2019
- 45. Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. 2013
- 46. Smit AH, Green P. RepeatMasker open-4.0. 2013-2015
- 47. Karczewski KJ, Francioli LC, Tiao G, et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020;581:434–43. doi: 10.1038/s41586-020-2308-7.
- 48. Shen R, Seshan VE. FACETS: allele-specific copy number and clonal heterogeneity analysis tool for high-throughput DNA sequencing. Nucleic Acids Res. 2016;44:e131. doi: 10.1093/nar/gkw520.
- 49. Friedman J, Hastie T, Tibshirani R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J Stat Softw. 2010;33:1–22.
- 50. Lee E, Chuang H-Y, Kim J-W, et al. Inferring pathway activity toward precise disease classification. PLoS Comput Biol. 2008;4:e1000217. doi: 10.1371/journal.pcbi.1000217.
- 51. Anzar I, Malone B, Samarakoon P, et al. The interplay between neoantigens and immune cells in sarcomas treated with checkpoint inhibition. Front Immunol. 2023;14:1226445. doi: 10.3389/fimmu.2023.1226445.
- 52. Saito T, Rehmsmeier M. Precrec: fast and accurate precision-recall and ROC curve calculations in R. Bioinformatics. 2017;33:145–7. doi: 10.1093/bioinformatics/btw570.
- 53. Robin X, Turck N, Hainard A, et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics. 2011;12:77. doi: 10.1186/1471-2105-12-77.
- 54. Eisenhauer EA, Therasse P, Bogaerts J, et al. New response evaluation criteria in solid tumours: Revised RECIST guideline (version 1.1). Eur J Cancer. 2009;45:228–47. doi: 10.1016/j.ejca.2008.10.026.
- 55. Grundy EE, Diab N, Chiappinelli KB. Transposable element regulation and expression in cancer. FEBS J. 2022;289:1160–79. doi: 10.1111/febs.15722.
- 56. Anwar SL, Wulaningsih W, Lehmann U. Transposable Elements in Human Cancer: Causes and Consequences of Deregulation. Int J Mol Sci. 2017;18:974. doi: 10.3390/ijms18050974.
- 57. Churchman ML, Mullighan CG. Ikaros: Exploiting and targeting the hematopoietic stem cell niche in B-progenitor acute lymphoblastic leukemia. Exp Hematol. 2017;46:1–8. doi: 10.1016/j.exphem.2016.11.002.
- 58. Hu Y, Salgado Figueroa D, Zhang Z, et al. Lineage-specific 3D genome organization is assembled at multiple scales by IKAROS. Cell. 2023;186:5269–89. doi: 10.1016/j.cell.2023.10.023.
- 59. Soldi R, Ghosh Halder T, Weston A, et al. The novel reversible LSD1 inhibitor SP-2577 promotes anti-tumor immunity in SWItch/Sucrose-NonFermentable (SWI/SNF) complex mutated ovarian cancer. PLoS ONE. 2020;15:e0235705. doi: 10.1371/journal.pone.0235705.
- 60. Voon HPJ, Hughes JR, Rode C, et al. ATRX Plays a Key Role in Maintaining Silencing at Interstitial Heterochromatic Loci and Imprinted Genes. Cell Rep. 2015;11:405–18. doi: 10.1016/j.celrep.2015.03.036.
- 61. Chuong EB, Elde NC, Feschotte C. Regulatory evolution of innate immunity through co-option of endogenous retroviruses. Science. 2016;351:1083–7. doi: 10.1126/science.aad5497.
- 62. Stone ML, Chiappinelli KB, Li H, et al. Epigenetic therapy activates type I interferon signaling in murine ovarian cancer to reduce immunosuppression and tumor burden. Proc Natl Acad Sci U S A. 2017;114:E10981–90. doi: 10.1073/pnas.1712514114.
- 63. Cottrell TR, Lotze MT, Ali A, et al. Society for Immunotherapy of Cancer (SITC) consensus statement on essential biomarkers for immunotherapy clinical protocols. J Immunother Cancer. 2025;13:e010928. doi: 10.1136/jitc-2024-010928.
- 64. Altman DG, McShane LM, Sauerbrei W, et al. Reporting Recommendations for Tumor Marker Prognostic Studies (REMARK): explanation and elaboration. PLoS Med. 2012;9:e1001216. doi: 10.1371/journal.pmed.1001216.
- 65. Mehdipour P, Marhon SA, Ettayebi I, et al. Epigenetic therapy induces transcription of inverted SINEs and ADAR1 dependency. Nature. 2020;588:169–73. doi: 10.1038/s41586-020-2844-1.
- 66. Liu M, Thomas SL, DeWitt AK, et al. Dual Inhibition of DNA and Histone Methyltransferases Increases Viral Mimicry in Ovarian Cancer Cells. Cancer Res. 2018;78:5754–66. doi: 10.1158/0008-5472.CAN-17-3953.
- 67. Roulois D, Loo Yau H, Singhania R, et al. DNA-Demethylating Agents Target Colorectal Cancer Cells by Inducing Viral Mimicry by Endogenous Transcripts. Cell. 2015;162:961–73. doi: 10.1016/j.cell.2015.07.056.
- 68. Mabe NW, Perry JA, Malone CF, et al. Pharmacological targeting of the cancer epigenome. Nat Cancer. 2024;5:844–65. doi: 10.1038/s43018-024-00777-2.
- 70. Rosenbaum E, D’Angelo S, Bradic M. Immune-related adverse events after immune checkpoint blockade-based therapy are associated with improved survival in advanced sarcoma. 2025. doi: 10.1158/2767-9764.CRC-22-0140.
- 71. Taylor BS, Drilon AE, Rosen EY, et al. Exome recapture and sequencing of prospectively characterized clinical specimens from cancer patients. 2025





