Machine-learning prediction of tumor antigen immunogenicity in the selection of therapeutic epitopes

Christof C Smith; Shengjie Chai; Amber R Washington; Samuel J Lee; Elisa Landoni; Kevin Field; Jason Garness; Lisa M Bixby; Sara R Selitsky; Joel S Parker; Barbara Savoldo; Jonathan S Serody; Benjamin G Vincent

doi:10.1158/2326-6066.CIR-19-0155

Author manuscript; available in PMC: 2020 Apr 1.

Published in final edited form as: Cancer Immunol Res. 2019 Sep 12;7(10):1591–1604. doi: 10.1158/2326-6066.CIR-19-0155

PMCID: PMC6774822 NIHMSID: NIHMS1537822 PMID: 31515258

Abstract

Current tumor neoantigen calling algorithms primarily rely on epitope/MHC binding affinity predictions to rank and select potential epitope targets. These algorithms do not predict epitope immunogenicity using approaches modeled from tumor-specific antigen data. Here, we describe peptide-intrinsic biochemical features associated with neoantigen and minor histocompatibility mismatch antigen (mHA) immunogenicity and present a gradient-boosting algorithm for predicting tumor antigen immunogenicity. This algorithm was validated in two murine tumor models and demonstrated the capacity to select for therapeutically active antigens. Immune correlates of neoantigen immunogenicity were studied in a pan-cancer dataset from TCGA and demonstrated an association between expression of immunogenic neoantigens and immunity in colon and lung adenocarcinoma. Lastly, we present evidence for expression of an out-of-frame neoantigen that was capable of driving antitumor cytotoxic T-cell responses. With the growing clinical importance of tumor vaccine therapies, our approach may allow for better selection of therapeutically relevant TSAs, including non-classical out-of-frame antigens capable of driving antitumor immunity.

Keywords: Neoantigen, immunogenicity, modeling, out-of-frame, immunotherapy

Introduction

T cells can affect antitumor immune responses through recognition of tumor-specific antigens (TSAs) presented by major histocompatibility complex (MHC) proteins. These peptides include tumor neoantigens, which are classically thought of as derived from mutation-containing proteins that generate novel immunogenic epitopes [1]. Despite the ability of neoantigen therapeutic vaccines to promote tumor-specific T-cell responses in a number of pre-clinical models [2–4], clinical efficacy has yet to be demonstrated [5,6]. A significant challenge for translation of TSA therapies is the ability to select the subset of clinically relevant epitopes from all computationally predicted neoantigens. Many neoantigen prediction algorithms rely heavily on peptide/MHC binding affinity predictions to rank epitopes [7–14]. Unlike murine pre-clinical models, where in vivo/ex vivo methods to further screen for immunogenicity can be applied [3,15], no such benchtop prediction method for immunogenicity is currently available for humans. We have previously demonstrated in multiple murine models that the number of predicted neoantigens is much higher than the number of confirmed immunogenic neoantigens [15]. Studies demonstrate that in some tumors, the number of predicted neoantigens is far greater than the number of immunogenic neoantigens which have been identified in mouse models [16,17]. As such, the development of an algorithm to predict the immunogenicity of neoantigen peptides (i.e. variant peptides predicted to bind MHC) would be valuable for screening predicted neoantigens for clinical application.

In addition to conventional single nucleotide variant (SNV) neoantigens, studies have suggested the presence of tumor-specific mRNA splice variants [18,19], expression of non-coding regions [20], and alternative ribosomal products [21–28], allowing an out-of-frame translation to occur outside the setting of an insertion/deletion (INDEL) mutation. An increasing need exists to define frequencies of predicted TSAs existing in an out-of-frame context, their clinical implications, and whether frame-filtering should be applied for computational neoantigen prediction. In the context of SNV tumor antigen prediction, allowing for out-of-frame calls may identify “pseudo-SNV” antigens (i.e. out-of-frame antigens that contain concurrent SNV mutations) with immunogenicity responses similar to what is observed in frameshift neoantigens. As a preliminary approach to identify both in- and out-of-frame neoantigens, we performed SNV tumor-antigen computational screening across all open reading frames, looking for: 1) the correlates of immunogenicity for these predicted neoantigens, and 2) the capacity for out-of-frame epitopes to drive antitumor immunity.

Features associated with neoantigen immunogenicity remain unclear. Here, we have elucidated peptide-intrinsic features significantly associated with vaccine/IFNγ ELISpot–derived immunogenicity scores of MHC class I and class II TSAs. Using gradient boosting with cross-validation, we developed an algorithm to predict MHC I TSA peptide immunogenicity based on peptide-intrinsic biochemical features. We modeled the immunogenicity of predicted neoantigens in the BBN963 basal-like bladder cancer model and demonstrated the capacity of epitopes with high predicted immunogenicity to control tumor growth significantly better than those with low predicted immunogenicity. This algorithm was additionally validated using graft-versus-leukemia (GvL) minor histocompatibility mismatch antigens (mHA) in the P815 mastocytoma allogeneic transplant model. Applying this algorithm to predicted MHC I neoantigens from a TCGA pan-cancer dataset, we observed significant positive association between highly immunogenic neoantigens (HINs; in the top 95th percentile of predicted immunogenicity score) and microsatellite instability (MSI) high–driven immune features in colon adenocarcinoma (COAD) and significant negative association between signatures of anti–PD-1 therapy responsiveness and HIN numbers in lung adenocarcinoma (LUAD) cancer types. Lastly, we provide evidence in favor of antitumor cytotoxic T-cell responses generated against a predicted out-of-frame neoantigen, suggesting a proportion of predicted out-of-frame SNV tumor antigens may be presented by the tumor to generate an immune response. Prediction of peptide immunogenicity on a framework of peptide/MHC binding should improve understanding of antitumor T-cell responses and neoantigen selection for therapeutic vaccine applications.

Materials and Methods

Cell lines

The B16F10 cell line was purchased from ATCC (CRL-6475) and cultured according to the ATCC protocol. The P815 cell line was purchased from ATCC (TIB-64), transduced with luciferase as previously described [29], and cultured according to the ATCC protocol. The BBN963, UPPL1541, and MB49 cell lines were obtained and passaged as previously described [15]. The T11 model was obtained and passaged as previously described [30]. All cells used in this study were derived from viably frozen stocks of the above cell lines, with aliquots derived within ≤5 passages of the original stock. No mycoplasma testing was performed. No further authentication was performed on cell lines directly purchased from ATCC (B16F10, P815) or those received directly from the deriving lab (T11: Charles M. Perou, UNC Lineberger; BBN963, UPPL1541: William Y. Kim, UNC Lineberger). The MB49 cell line was authenticated through transcriptomic analysis, as previously described [15].

Animal studies

All experiments described in this study were approved by the UNC Institutional Animal Care and Use Committee (IACUC). Animals used in this study, their vendor source, and respective tumor cell lines included: C57BL/6J (Jackson Laboratories; B16F10), C57BL/6 (Charles River Laboratories; BBN963, MB49, UPPL1541), DBA/2J (Jackson Laboratories; P815), and BALB/c (Jackson Laboratories; T11). Tumor injection routes and cell numbers for all models and experiments included: B16F10: flank subcutaneous (s.c.), 10⁵ cells; BBN963: flank s.c., 10⁷ cells; MB49: flank s.c., 10⁵ cells; UPPL1541: flank s.c., 10⁶ cells; T11: mammary fat pad intradermal, 10⁴ cells; P815: tail vein intravenous, 3×10⁵ cells. Tissue collection and DNA/RNA isolation are described in the “Neoantigen and mHA prediction” section below. Graft-versus-host disease (GvHD) scoring was performed as previously described [31], with the score defined as the sum of five components (posture, fur, activity, skin, and weight loss), each on a 0–2 scale.

Tissue Dissociation

All single-cell suspensions mentioned in the Methods sections below were derived using the following protocol. Tissues were homogenized in cold PBS using the gentleMACS Dissociator, and the samples were passed through a 70 μm cell strainer using a 5 mL syringe plunger. The samples were centrifuged for seven minutes at 290 RCF, 4°C, and the supernatant was decanted. The remaining pellet was resuspended in 1 mL of ACK lysis buffer (150 mM NH4Cl, 10 mM KHCO3, 0.1 nM Na2EDTA in DPBS, pH 7.3) for 2 minutes at room temperature before quenching with 10 mL of cold media. The samples were centrifuged for seven minutes at 290 RCF, 4°C, resuspended in 10 mL of cold media, and passed through a 40 μm cell strainer.

Neoantigen and mHA prediction

Neoantigen prediction was performed as previously described [15]. Briefly, mice were injected with tumors (Figure 1A) by the routes and at the cell counts listed above, and monitored until tumor size reached 100 mm³ by caliper measurement ($\frac{l \times w^{2}}{2}$, where $l$ and $w$ are the larger and smaller of two perpendicular tumor axes, respectively), at which point mice were humanely sacrificed with CO₂ asphyxiation followed by cervical dislocation. P815 tumor samples were collected directly from cell line culture (10⁵ cells per sample). RNA was extracted from single-cell suspensions of tumors using the Qiagen RNeasy Mini kit (cat. # 74104), and DNA was extracted from single-cell suspensions of tumors and matched-normal tail clippings or livers using the Qiagen DNeasy kit (cat. # 69504), all according to the manufacturer’s protocol. Whole exome and transcriptome library preparation was performed using Agilent SureSelect XT All Exon and Illumina TruSeq Stranded mRNA library preparation kits, respectively. Libraries were sequenced via 2×100 runs on an Illumina HiSeq 2500 at the UNC High Throughput Sequencing Facility (HTSF). Tumor mutations were called using UNCeqR (https://lbc.unc.edu/~mwilkers/unceqr_dist/) [32], filtering for SNV mutations with at least 5x coverage by RNA-seq. Translated 8–11mer (class I) or 15mer (class II) peptides were derived across all open reading frames, and MHC binding affinity was then predicted using NetMHCpan 3.0 (http://www.cbs.dtu.dk/services/NetMHCpan-3.0/) [8]. Class I minor mismatch antigens were predicted similarly in the P815 model against the BALB/c histocompatible donor. Predicted binders were filtered by binding affinity <500 nM, a generally accepted cutoff for immunogenicity as previously noted [6,9,33], with top epitopes screened for immunogenicity using a vaccine/ELISpot approach, as described below.
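The peptide derivation and affinity filter described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: it handles a single translated sequence rather than all open reading frames, the function names are hypothetical, and the affinities passed to the filter are assumed to come from an external predictor such as NetMHCpan.

```python
def mutant_windows(protein, mut_index, lengths=range(8, 12)):
    """Return (peptide, offset) for every k-mer that covers the mutated residue.

    protein   : translated amino acid sequence carrying the SNV (hypothetical input)
    mut_index : 0-based index of the mutated residue within `protein`
    lengths   : peptide lengths to enumerate (8-11mers for class I)
    offset    : 0-based position of the mutated residue inside the peptide
    """
    peptides = []
    for k in lengths:
        first = max(0, mut_index - k + 1)        # leftmost start that still covers the mutation
        last = min(mut_index, len(protein) - k)  # rightmost start that fits in the sequence
        for start in range(first, last + 1):
            peptides.append((protein[start:start + k], mut_index - start))
    return peptides


def filter_binders(predictions, cutoff_nm=500.0):
    """Keep (peptide, affinity_nM) pairs with predicted affinity below the cutoff."""
    return [(pep, aff) for pep, aff in predictions if aff < cutoff_nm]
```

For class II peptides the same enumeration would be called with `lengths=[15]`.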

Vaccine/ELISpot screening

Predicted neoantigen peptides (MHC I: n = 210; MHC II: n = 68) were synthesized by New England Peptide (Gardner, MA), using custom peptide array technology (Supplemental Data File 1). Non-tumor bearing wildtype animals were vaccinated with predicted neoantigen peptides of their respective predicted haplotype, given as a subcutaneous injection of a pool of 8 equimolar peptides (5 nmol total peptide) and 50 μg poly(I:C) (Sigma, cat. # P1530) in PBS. A second identical injection was given 6 to 7 days after the primary injection. Mice were humanely sacrificed with CO₂ asphyxiation followed by cervical dislocation 5 to 6 days after the second injection. Spleens were harvested and prepared into single-cell suspensions, as described above. Splenocytes were plated in triplicate at 5 × 10⁵ cells per 100 μL media (RPMI 1640 (Gibco cat. # 11875–093) with 10% FBS (Gemini cat. # 900–108)) onto an IFNγ capture antibody-coated ELISpot plate (BD Biosciences, cat. # 551083) according to manufacturer protocol for 48–72 hours, along with 1 nmol of a single peptide against which the respective mouse was vaccinated. Immunogenicity was defined as the average number of spot-forming cells (SFC) identified using an ELISpot plate reader (AID Classic ERL07), with no-peptide background subtracted from each epitope.
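As a worked illustration of the immunogenicity score defined above, the following sketch averages replicate wells and subtracts the no-peptide background. The function name and the assumption that background is averaged over its own replicate wells are illustrative, not taken from the study.

```python
from statistics import mean


def elispot_score(peptide_wells, background_wells):
    """Mean spot-forming cells (SFC) across replicate wells minus mean no-peptide background.

    The result can be negative when background exceeds the peptide-stimulated wells.
    """
    return mean(peptide_wells) - mean(background_wells)
```

For example, triplicate wells of 120, 130, and 140 SFC with background wells of 10, 20, and 30 SFC give a score of 110.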

Computational analysis

Variables used in neoantigen immunogenicity regression and modelling were derived using the “aaComp” command of the R package “Peptides” (v2.4; https://cran.r-project.org/web/packages/Peptides/index.html). Using features derived from this command (Tiny, Small, Aliphatic, Aromatic, Non-polar, Polar, Charged, Basic, Acidic), variables were derived by the presence (1) or absence (0) of each feature at each absolute and relative position along each antigen, at the site of SNV mutation along each antigen, at the first or last 3 amino acid residues (beginning/end) or middle residues (middle) of each antigen, or difference (loss: −1, gain: 1, or no change: 0) of each feature in the mutated versus reference antigen along SNV mutation site.
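A minimal Python sketch of this kind of feature encoding is given below. The analysis itself used the R “Peptides” package; the class memberships here are an assumption following the standard groupings documented for “aaComp”, restricted to the 20 standard residues, and the feature names are modeled on those reported in the Results. Relative-site features are omitted because their exact definition is not given in this section.

```python
# Amino acid classes (assumed to mirror the aaComp groupings, standard residues only).
AA_CLASSES = {
    "Tiny": set("ACGST"),
    "Small": set("ACDGNPSTV"),
    "Aliphatic": set("AILV"),
    "Aromatic": set("FHWY"),
    "NonPolar": set("ACFGILMPVWY"),
    "Polar": set("DEHKNQRST"),
    "Charged": set("DEHKR"),
    "Basic": set("HKR"),
    "Acidic": set("DE"),
}


def _labels(aa):
    """The residue itself plus every class it belongs to."""
    return [aa] + [name for name, members in AA_CLASSES.items() if aa in members]


def position_features(peptide):
    """Sparse presence (1) features; any feature missing from the dict is absent (0)."""
    feats = {}
    for i, aa in enumerate(peptide, start=1):
        for label in _labels(aa):
            feats[f"Absolute_position_{i}_{label}"] = 1
    for label in _labels(peptide[-1]):
        feats[f"Last_position_{label}"] = 1
    for aa in peptide[:3]:
        for label in _labels(aa):
            feats[f"First_three_AA_{label}"] = 1
    return feats


def mutation_features(ref_aa, mut_aa):
    """Class gain (1), loss (-1), or no change (0) at the SNV site, plus reference-residue classes."""
    feats = {}
    for name, members in AA_CLASSES.items():
        feats[f"Mutated_position_change_of_{name}_feature"] = int(mut_aa in members) - int(ref_aa in members)
        feats[f"Reference_AA_at_mutated_position_{name}"] = int(ref_aa in members)
    return feats
```

Calling `position_features("VALLPSVMNL")` marks, among others, `Absolute_position_1_V`, `Absolute_position_6_Polar`, and `First_three_AA_V` as present.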

For all GLM and predictive modeling analyses, low variance variables (defined by the “nearZeroVar” function of the “caret” package) were removed prior to further analysis. Generalized linear models (GLM) using the R “glm” function were used for all non-modeling univariable and multivariable linear regression analyses, with significance reported as false discovery rate (FDR)-adjusted p-values (q-value) using the R “p.adjust” command. Backward stepwise regression for multivariable modelling was performed using the R “stepAIC” command of the “MASS” package (https://cran.r-project.org/web/packages/MASS/index.html), optimized on Akaike Information Criterion (AIC). Backward stepwise regression was performed by starting with all variable candidates and testing the deterioration of the model with removal of each variable.
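For readers unfamiliar with FDR adjustment, the sketch below implements the Benjamini-Hochberg procedure in plain Python. That the q-values were computed with the Benjamini-Hochberg method of “p.adjust” is an assumption; the text states only that FDR-adjusted p-values were reported.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, idx in enumerate(order):
        rank = m - steps_from_top  # rank 1 = smallest p-value
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A feature would then be called significant when its adjusted value falls below 0.05, matching the q-value threshold used in the Results.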

For immunogenicity prediction modeling, analyses were performed using the R package “caret” as a wrapper for running each multivariable approach: GLM, elastic net, random forest, gradient boosting, and linear and radial support vector machine methods. For cross-validation, data were split into exploration (n = 141) and validation (n = 69) sets using the caret “createDataPartition” function, confirming statistically non-significant differences in measured immunogenicity between exploration and validation sets (Mann-Whitney p > 0.2). Model performance was derived from Pearson correlation coefficients between ELISpot immunogenicity and predicted immunogenicity scores, using a 10,000-fold cross-validation (two-thirds random resampling) approach within the exploration set, with the input predictor variables limited to those that demonstrated significant univariable correlation in >50% of 1000-fold bootstrapping iterations (two-thirds resampling) within the exploration set. The final gradient boosting machine-learning algorithm for immunogenicity prediction of MHC I epitopes can be accessed at https://github.com/vincentlaboratories/neoag.
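The resampling scheme can be sketched as repeated random subsampling: fit on a random two-thirds of the exploration set, predict the held-out third, and record the Pearson correlation. This reading of “10,000-fold cross-validation” is an assumption, and the single-feature least-squares fit below is only a stand-in for the caret models (including gradient boosting) actually compared in the study.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor (placeholder model)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def monte_carlo_cv(x, y, n_iter=200, train_frac=2 / 3, seed=0):
    """Mean held-out Pearson correlation over repeated random train/test splits."""
    rng = random.Random(seed)
    idx = list(range(len(x)))
    n_train = int(round(train_frac * len(idx)))
    scores = []
    for _ in range(n_iter):
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        preds = [slope * x[i] + intercept for i in test]
        scores.append(pearson(preds, [y[i] for i in test]))
    return mean(scores)
```

Setting `n_iter=10000` would match the number of iterations reported above; a model performing no better than chance would give a mean correlation near zero.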

To explore computational evidence of out-of-frame transcripts, StringTie (https://ccb.jhu.edu/software/stringtie/) [34] and Trinity (https://github.com/trinityrnaseq/trinityrnaseq/wiki) [35] were used for de novo assembly of transcripts from BBN963 RNA-seq data, according to the standard workflows provided at the above links.

Peptide treatment studies

BBN963 basal-like bladder cancer model treatment began with pre-tumor vaccination with 30 μg of a single peptide (or no-peptide control) and 50 μg poly(I:C) adjuvant injected in 100 μL PBS intradermally in the flank of 8–10 week old female C57BL/6 mice (Charles River). Twelve days after vaccination, 1×10⁷ BBN963 cells were injected in 100 μL PBS subcutaneously in the flank, ipsilateral to the vaccine site. On day 21 post primary vaccination, animals were given a vaccine booster with 30 μg of the initial respective peptide with no poly(I:C) adjuvant. This booster was delivered in 100 μL PBS intradermally in the skin directly adjacent to the tumor. Animals were monitored for tumor growth via caliper measurement and survival every 2–3 days for the remainder of the study, with UNC IACUC defined endpoints of area >200 mm² or ulceration >5 mm in the longest diameter.

For P815 treatment studies, 8–12 week old male BALB/c donors (Jackson Laboratory) were vaccinated on days 0 and 7 with 100 μg total peptide (3–4 pooled equimolar peptides, or no-peptide control) and 50 μg poly(I:C) adjuvant in 100 μL PBS intradermally in the flank. DBA/2 recipients were treated with 800 rad total body irradiation on day 13. On day 14, splenic-derived T cells and bone marrow cells were isolated from donor BALB/c animals. T cells were isolated from single-cell splenocyte suspensions using the Miltenyi Pan T Cell Isolation Kit II (cat. #130-095-130), according to the manufacturer’s protocol. Bone marrow cells were isolated as previously described [36]. Recipient DBA/2 animals were given tail-vein IV injections of 3×10⁶ T cells, 3×10⁶ bone marrow cells, and 3×10⁵ P815-luciferase tumor cells (or bone-marrow only control). DBA/2 recipients were given a booster vaccine on day 21 after primary vaccine (100 μg total peptide, 50 μg poly(I:C)), and animals were monitored every 2–3 days for survival, with a UNC IACUC-defined endpoint of bilateral hind-limb paralysis. Luciferase imaging studies were performed on days 8, 13, 22, 26, and 35 after transplant, using an IVIS imaging system on animals given 3 mg intraperitoneal D-luciferin (Perkin Elmer, cat. # 122799) 10 minutes prior to imaging.

Tetramer studies

Peptide/MHC tetramer and cell surface protein staining were performed as described previously [37]. Briefly, viable, single-cell suspensions derived from tumors (approximately 10⁷ total cells) were treated with 50 nM dasatinib (Sigma-Aldrich, cat. # CDS023389) for 30 minutes at 37°C, and then stained using approximately 10 μg/mL tetramer on ice for 30 minutes. Tetramers were generated using the MBL Quickswitch Quant H-2 Kb Tetramer Kit-PE (cat. # TB-7400-K1), using peptides VALLPSVMNL or SIINFEKL irrelevant control, according to manufacturer protocol. Cells were then washed and incubated on ice with biotin-conjugated anti-PE antibody (5 μg/mL; BioLegend; PE001) for 20 minutes, followed by 2 washes, then further incubation with streptavidin, R-PE conjugate (SAPE, 5 μg/mL) for 10 minutes on ice. Cells were then washed and stained for viability using BD fixable viability dye FVS620 according to the manufacturer’s directions. Last, cells were Fc blocked (anti-mouse CD16/CD32; 2.4G2, BD Biosciences) for 10 minutes on ice, followed by surface staining for 20 minutes on ice with the following markers: CD45 (BV510; 30-F11), CD4 (FITC; RM4–5), CD8 (APC/Fire-750; 53–6.7) (all antibodies purchased from BD Biosciences). Acquisition was performed using a BD LSRFortessa flow cytometer. FlowJo flow cytometry software version 10 was used for analyses of all flow cytometric data. Cells were selected using gates defined by single color controls and FMO or irrelevant tetramer controls, with tetramer-positive/negative CD8⁺ T cells defined within live, singlet (by FSC-A versus FSC-H), CD45⁺, CD4⁻/CD8⁺ gates.

Ex vivo T-cell expansion and cytotoxicity assays

For tetramer-sorted T-cell isolation, CD8⁺ T cells were isolated from BBN963 tumor single-cell suspensions (as described above) using the Miltenyi Dead Cell Removal Kit (130-090-101) followed by the Miltenyi CD8a⁺ T Cell Isolation Kit (130-104-075), both according to manufacturer protocol. CD8⁺ T cells were stained with tetramer as described above and sorted on the BD FACSJazz. Tetramer-sorted T cells or column-sorted (Miltenyi CD8a⁺ T Cell Isolation Kit) CD8⁺ T cells from OT1 (C57BL/6-Tg(TcraTcrb)1100Mjb/J, Jackson Laboratories cat. # 003831) splenocytes were cultured in complete RPMI media (RPMI 1640 (Gibco cat. # 11875–093) with 10% FBS (Gemini cat. # 900–108), 1% sodium pyruvate (100nM; Gibco cat. # 11360–070), 1% non-essential amino acids (10mM; Gibco cat. # 11140–050), 1% l-glutamine (Gemini cat. # 400–106), 1% HEPES buffer (1 M; Corning cat. # 25–060-Cl), and 1% 2-mercaptoethanol (55nM; Gibco cat. # 2198502)) in the presence of recombinant murine IL7 (10 ng/mL, Peprotech cat. # 217–17), IL15 (10 ng/mL, Peprotech cat. # 210–15), and IL2 (100 IU/mL, Peprotech cat. # 212–12) for 72 hours. Antigen-specific T-cell expansion was performed using a previously described protocol [38]. Briefly, all sorted T cells were recovered at 10⁶ cells/mL for 72 hr in RPMI complete media in the presence of IL7 (10 ng/mL), IL15 (10 ng/mL), and IL2 (10 ng/mL) at 37°C and 5% CO₂. T cells were then cocultured in RPMI complete media with IL7, IL15, and IL2, alongside peptide-pulsed DCs (2.5 μg/mL peptide) that had been pulsed approximately 18 hr prior to coculture. Media and cytokines were changed every 2–3 days, and cells were allowed to expand for 7–10 days after coculture with peptide-pulsed dendritic cells before downstream assays.

For flow cytometric-based cytotoxicity assays, target cells (BBN963 or an irrelevant splenocyte control) were pre-labelled in 5 μM CFSE for 15 minutes prior to coculture. Tetramer-sorted and antigen-expanded T cells (per above section) were cocultured alongside targets at a 1:1 ratio, with 1×10⁵ of each target and effector population. Cells were plated on a v-bottom 96-well polypropylene plate, centrifuged at 300 × g for 1 minute, and incubated at 37°C, 5% CO₂ for 4 hours. After incubation, cells were stained using FVS700 viability dye (BD Biosciences, cat. # 564997) according to manufacturer’s directions. Acquisition was performed on a BD LSRFortessa flow cytometer. FlowJo flow cytometry software version 10 was used for analyses of all flow cytometric data. Cells were identified as targets (CFSE⁺) or effectors (CFSE⁻), and percent viability was assessed among targets. Percent killing was reported as the frequency of dead targets, with background from no-effector control wells subtracted.

The cytotoxic activity of T cells was evaluated using a standard 4-hour ⁵¹Cr release assay [39]. In brief, 5×10³ ⁵¹Cr-labeled (Perkin Elmer cat. # NEZ030001MC) BBN963 target cells per well were plated in triplicate in a 96 well v-bottom plate with different ratios (10:1 and 5:1 effector:target) of effector cells and incubated for 4 hours at 37°C. The supernatant was collected and analyzed with a gamma-counter (Perkin Elmer). Before labeling, target cells were incubated for 2 hours at 37°C with the specific peptides (100 nM) and washed twice with complete medium. Target cells were incubated with medium alone or in 1% Triton X-100 (Sigma-Aldrich) to determine the spontaneous and maximum ⁵¹Cr release, respectively. The mean percentage of specific lysis of triplicate wells was calculated as follows: [(test counts - spontaneous counts) / (maximum counts - spontaneous counts)] × 100%.
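The specific lysis formula above translates directly into code. In this sketch the function name is hypothetical, and averaging the spontaneous and maximum release controls across their replicate wells before applying the formula is an assumption.

```python
from statistics import mean


def percent_specific_lysis(test_counts, spontaneous_counts, maximum_counts):
    """Mean percent specific lysis across replicate test wells.

    Applies [(test - spontaneous) / (maximum - spontaneous)] x 100 to each test well,
    using the mean of the spontaneous and maximum release control wells.
    """
    spontaneous = mean(spontaneous_counts)
    maximum = mean(maximum_counts)
    if maximum == spontaneous:
        raise ValueError("maximum and spontaneous release must differ")
    return mean(100.0 * (c - spontaneous) / (maximum - spontaneous) for c in test_counts)
```

For example, test wells of 600, 650, and 700 counts with spontaneous release of 200 and maximum release of 1,200 counts correspond to 45% specific lysis.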

TCGA data analyses

MapSplice-aligned, RSEM-quantified RNA-Seq expression matrices and survival data were downloaded from FireBrowse (http://firebrowse.org/). Expression matrices were merged between all cancer types, upper quartile normalized within each sample, and log₂ transformed. Immune gene signatures (IGS) were derived from previously described signatures [40–44], with the expression of each signature calculated as the mean expression of the genes within it. TCGA LAML samples were omitted from analysis in order to prevent skewing of IGS patterns. Inclusion criteria were those defined by the TCGA pan-immune working group, according to previous studies [17]. MHC I neoantigen expression data used for machine-learning algorithm immunogenicity prediction were obtained from publicly available data derived in previous studies [17]. TCGA pan-cancer dataset (n = 11,092; LUAD n = 515, COAD n = 283) analyses were performed according to the above “Computational analysis” methods section.
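The normalization and signature scoring steps can be sketched as follows. Several details are assumptions not stated in the text: the upper quartile is taken over expressed (non-zero) genes, values are rescaled to a fixed constant, and a pseudocount of 1 is added before the log₂ transform. Function names are hypothetical.

```python
import math
from statistics import quantiles


def upper_quartile_log2(sample, scale=1000.0, pseudocount=1.0):
    """Upper-quartile normalize one sample ({gene: expression}) and log2 transform it."""
    expressed = [v for v in sample.values() if v > 0]
    # Third cut point of the quartiles = 75th percentile of expressed genes.
    q75 = quantiles(expressed, n=4, method="inclusive")[2]
    return {gene: math.log2(v / q75 * scale + pseudocount) for gene, v in sample.items()}


def signature_score(normalized, signature_genes):
    """Mean normalized expression of the signature genes present in the sample."""
    present = [normalized[g] for g in signature_genes if g in normalized]
    if not present:
        raise ValueError("no signature genes found in sample")
    return sum(present) / len(present)
```

Each immune gene signature would be scored per sample by passing its gene list to `signature_score` after normalization.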

Differential gene expression analysis was performed using DESeq2 (https://bioconductor.org/packages/release/bioc/html/DESeq2.html) [45]. Gene set enrichment analysis (http://software.broadinstitute.org/gsea/index.jsp) [46], Ingenuity pathway analysis (https://www.qiagenbioinformatics.com/products/ingenuity-pathway-analysis/) [47], and DAVID gene ontology analysis (https://david.ncifcrf.gov/) [48] were performed from respective web portals. T-cell receptor diversity analysis was performed from previously published MiXCR-derived TCR reads [17] – MiXCR is an analytic tool for TCR inference from whole transcriptome RNA-seq data [49].

Statistical analyses

Statistical analyses for survival (displayed as Kaplan-Meier plots) were performed using the log-rank test, with no statistical correction. Differences in cytotoxicity in tetramer-sorted cytotoxicity assays were determined via Welch’s t-test, with no statistical correction. Differences in tetramer-positive T-cell populations were determined via Mann-Whitney U-test, with no statistical correction. All above analyses were performed in GraphPad Prism 8. All other analyses and corresponding statistical tests are described in the above “Computational analysis” methods section.

Results

Correlates of immunogenicity in class I MHC epitopes

Neoantigens and mHA were predicted in six tumor models (B16F10, BBN963, MB49, UPPL1541, P815, and T11), allowing us to characterize neoantigens in the H2ᵇ and H2ᵈ haplotypes (Figure 1B). We predicted a total of 210 MHC I epitopes and 68 MHC II epitopes and determined their immunogenicity using a vaccine/ELISpot screening approach. Distribution of epitope ELISpot scores (SFC) varied by model, with MB49, B16F10, and P815 tumors including nine of the 10 most immunogenic epitopes, and BBN963 including seven of the 10 least immunogenic epitopes (Supplementary Figure S1). MHC II epitopes were not predicted for P815 GvL mHA. With the goal of identifying peptide-intrinsic features that associated with and predicted for immunogenicity, we derived a set of features for each peptide, including the amino acid sequence and characteristic at each I) absolute position, II) relative site, III) site of mutation and changes in amino acid sequence and characteristic at mutational site, and IV) presence of amino acid or characteristic at the beginning, middle, or end of each peptide (Figure 1C).

Univariable regression with intrinsic peptide features as the predictor variables and immunogenicity of class I antigens (measured as IFNγ ELISpot values of T cells derived from vaccinated mice) as the outcome identified 38 features significantly associated with ELISpot immunogenicity (q-value <0.05; Figure 2A). Among these features, the most significant positive associations were changes in the mutation position to a small amino acid (Mutated_position_change_of_Small_feature), valine at relative site 2 (Relative_site_2_V), and basic amino acids of the reference sequence at the mutated position (Reference_AA_at_mutated_position_Basic). In contrast, the most significant negative associations were small amino acids of the reference sequence at the mutated position (Reference_AA_at_mutated_position_Small), changes in the mutation position to a basic amino acid (Mutated_position_change_of_Basic_feature), and polar amino acids at position 6 (Absolute_position_6_Polar). We additionally sought to determine the correlation among the 38 significant features, observing relatively low correlation (Figure 2B). Features that demonstrated significant correlation were related, such as 1) the amino acid at the mutated position of the reference sequence with charged or basic features, 2) valine or small amino acids at absolute position 11, and 3) the presence of a valine or small amino acid at the last position and valine at relative site 8.

Figure 2: (A and C) Volcano plots representing the generalized linear model (GLM) coefficient (x-axis) and −log10(q-value) (y-axis) for each peptide-intrinsic feature as a predictor for immunogenicity in (A) class I and (C) class II neoantigens/mHA. Dashed line represents q-value=0.05. Spot color represents −log10(q-value) magnitude and size represents magnitude of the coefficient. (B and D) Heatmaps representing Spearman correlations between each significantly correlated feature from (A and C) for (B) class I and (D) class II neoantigens/mHA, respectively. Colored cells represent significantly correlated features (q<0.05), with magnitude of the correlation coefficient represented by color. (E) ELISpot-derived immunogenicity scores for class I neoantigens/mHA classified as predicted high (>100) or low (<100) immunogenic by multivariable GLM regression. Data represent median (middle line), with boxes encompassing the 25th to 75th percentile, whiskers encompassing 1.5× the interquartile range from the box, and independent values shown by dots. Statistics performed with Mann-Whitney U-test, with significance defined as p<0.05.

Next, we used multivariable regression to evaluate the independence of the variables identified by univariable analysis. To increase confidence in our multivariable regression, we performed backward stepwise regression, optimizing on Akaike Information Criterion (AIC), as described in the Methods. Variables whose loss resulted in an insignificant change to the model performance (as measured by the AIC) were removed from the set of variables until no further variables could be removed without a significant decrease in model fit. Sixteen significant features from this step were entered into multivariable regression. The resulting model of 8 features indicated that 33.4% of the variation in immunogenicity was explained by the prediction (p<0.0001; Supplementary Figure S2A). Significant features of the multivariable model included valine at the last position (Last_position_V (p=0.0001)), tyrosine at position 3 (Absolute_position_3_Y (p=0.003)), changes in the mutated position to a small amino acid (Mutated_position_change_of_Small_feature (p=0.007)), cysteine at relative site 4 (Relative_site_4_C (p=0.012)), lysine at relative site 5 (Relative_site_5_K (p=0.015)), tiny amino acids at relative site 6 (Relative_site_6_Tiny (p=0.016)), basic amino acids of the reference sequence at the mutated position (Reference_AA_at_mutated_position_Basic (p=0.027)), and valine at relative site 2 (Relative_site_2_V (p=0.041)) (Supplementary Figure S2B, Supplementary Table S1). To ensure this model was accurately representing both H2ᵇ and H2ᵈ haplotypes, we tested the immunogenicity for each of these five significant features, split categorically, which demonstrated similar trends between both haplotypes (Supplementary Figure S3). We did not observe significant differences in ELISpot immunogenicity among predicted in-frame (n=131) and out-of-frame (n=79) epitopes, emphasizing that peptide-intrinsic features were the primary driver for immunogenicity (p>0.05; Supplementary Figure S4).
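For reference, the criterion being optimized can be written out. The following is the standard definition of AIC and its form for a Gaussian linear model, given here as background rather than as a description of the exact settings used; the Gaussian family and the symbols $S$, $k$, $n$, and $\mathrm{RSS}$ are assumptions introduced for illustration.

```latex
% AIC for a model with k estimated parameters and maximized likelihood \hat{L}:
\mathrm{AIC} = 2k - 2\ln\hat{L}

% For a Gaussian linear model fit to n observations with residual sum of squares RSS,
% where C = n(1 + \ln 2\pi) does not depend on the variables included:
\mathrm{AIC} = n \ln\!\left(\frac{\mathrm{RSS}}{n}\right) + 2k + C

% One backward step from the current variable set S: drop the variable j^{*}
% whose removal gives the lowest AIC, and stop when no removal lowers the AIC.
j^{*} = \arg\min_{j \in S} \mathrm{AIC}\bigl(S \setminus \{j\}\bigr),
\qquad \text{remove } j^{*} \text{ only if } \mathrm{AIC}\bigl(S \setminus \{j^{*}\}\bigr) < \mathrm{AIC}(S)
```

Because the penalty term $2k$ grows with each retained variable, a variable is kept only when it improves the fit by more than the cost of the added parameter.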

Correlates of immunogenicity in class II MHC epitopes

Among class II epitopes, 15 peptide-intrinsic features were significantly associated with ELISpot immunogenicity (GLM q-value <0.05; Figure 2C). The most significant positively associated features included changes in the mutation position to a non-polar amino acid (Mutated_position_change_of_NonPolar_feature), valine at position 1, tyrosine at position 6, and basic amino acid at position 2. The most significant negatively correlated feature was a change in the mutation position to a small amino acid (Mutated_position_change_of_Small_feature), which was positively correlated in class I epitopes. Negatively correlated features also included changes in the mutation position to a polar amino acid (Mutated_position_change_of_Polar_feature), and small/tiny amino acids at the mutated site. Among the significant features (Figure 2D), we observed one cluster of closely inter-correlated features (Sig 1; n=7, right-hand side of dendrogram), as well as a second cluster of loosely inter-correlated features (Sig 2; n=8, left-hand side of dendrogram). With each respective tumor model defined as a binary variable (1 = true, 0 = false), the mean expression of cluster 1 features was significantly correlated with the B16F10 model and inversely correlated with MB49, whereas the mean expression of cluster 2 was significantly correlated with MB49 tumors (Supplementary Figure S5, Spearman q-value <0.05). This was consistent with the greater burden of immunogenic class II neoantigens identified in these two models, suggesting a relatively greater contribution of these two models (particularly MB49) to the regression outcomes. Using the same backward AIC stepwise regression approach described above, multivariable GLM regression was performed on six features. The resulting model indicated that 50.7% of the variation in immunogenicity was explained by the prediction (p<0.0001; Supplementary Figure S6, Supplementary Table S2), with three significant features (tyrosine at position 6 (Absolute_position_6_Y), valine at position 1 (Absolute_position_1_V), and changes in the mutated position to a small amino acid (Mutated_position_change_of_Small_feature)) primarily driving the fit (p=0.024, 0.002, and 1.6×10⁻⁵, respectively). As with class I epitopes, we did not observe significant differences in immunogenicity among predicted in-frame (n=59) and out-of-frame (n=9) epitopes (p>0.05; Supplementary Figure S3).

Machine-learning algorithm for immunogenicity prediction in class I MHC epitopes

Our analysis of class I epitopes using multivariable GLM suggested that an optimized multivariable model may be able to discriminate between high- and low-immunogenicity peptides (Figure 2E). With the goal of designing a predictive model for neoantigen and mHA immunogenicity, we split our class I epitope database into an exploration (2/3 of epitopes, n=141) and validation (1/3 of epitopes, n=69) set (Figure 3A). Class II modeling was not attempted due to the low number of epitopes available within our database (n=68). To reduce noise within our model, we collapsed ELISpot scores with absolute values less than or equal to the absolute value of the most negative count (–53 spots) to zero. This was done because we were not focused on the ability of the model to characterize exact immunogenicity values within the low immunogenicity range. Within the exploration set, we used a 10,000-fold cross-validation (two-thirds random resampling) approach, which demonstrated that only gradient boosting consistently performed better than chance. Our final gradient boosting algorithm contained seven predictive features: valine at position 1 (Absolute_position_1_V), valine at the last position (Last_position_V), small amino acids at the last position (Last_position_Small), basic amino acids of the reference sequence at the mutated position (Reference_AA_at_mutated_position_Basic), changes in the mutated position to a small amino acid (Mutated_position_change_of_Small_feature), lysine at relative site 1 (Relative_site_1_K), and presence of valine within the first 3 positions (First_three_AA_V). The class I validation set was run through this final gradient boosting algorithm, which showed a significant linear fit between the actual immunogenicity by ELISpot and the predicted immunogenicity by modeling (p=0.01893, coefficient=0.30; Figure 3B).
This model provided a high negative predictive value (83.6% predicted values <53), which is well suited to filtering a large pool of predicted tumor antigens in order to select epitopes for therapeutic targeting.
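The preprocessing and evaluation steps in this section can be sketched as follows. This is a hedged illustration rather than the study's code: the function names are hypothetical, only the three valine features whose definitions follow directly from their names are encoded, the ELISpot values are invented, and the negative predictive value is computed under the assumption that a peptide counts as "negative" when its score falls below the 53-spot cutoff.

```python
def valine_features(peptide):
    """Encode three valine-related model features as 0/1 flags."""
    return {
        "Absolute_position_1_V": int(peptide[0] == "V"),
        "Last_position_V": int(peptide[-1] == "V"),
        "First_three_AA_V": int("V" in peptide[:3]),
    }

def collapse_low_scores(scores, floor=53):
    """Set ELISpot scores with absolute value <= floor to zero."""
    return [0 if abs(s) <= floor else s for s in scores]

def negative_predictive_value(actual, predicted, cutoff=53):
    """Fraction of peptides predicted below the cutoff that are truly below it.

    Assumption: 'negative' means a score below the cutoff.
    Returns None if no peptide is predicted negative.
    """
    called_negative = [a for a, p in zip(actual, predicted) if p < cutoff]
    if not called_negative:
        return None
    true_negative = sum(1 for a in called_negative if a < cutoff)
    return true_negative / len(called_negative)

# Feature flags for one peptide sequence reported in this study (B2).
b2_flags = valine_features("VALLPSVML")

# Invented ELISpot values, used only to show the mechanics.
collapsed = collapse_low_scores([-53, 10, 54, -20, 120, 53])
npv = negative_predictive_value(
    actual=[0, 0, 80, 0, 120],
    predicted=[10, 20, 30, 90, 100],
)
print(b2_flags, collapsed, npv)
```

A model with high negative predictive value under this definition rarely discards a truly immunogenic peptide, which is the property that matters when narrowing a large candidate pool.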

Figure 3: (A) Schema of the cross-validation approach used for GBM model building. (B) Performance of the final GBM model in the validation set, showing actual (x-axis) versus predicted (y-axis) immunogenicity scores. Size of each point represents the number of antigens at each coordinate. Red line represents the line of best fit, with p-value of fit shown above the graph. (C,E) Schema for *in vivo* validation experiments, with tumor vaccine studies performed in (C) BBN963 basal-like bladder cancer and (E) the P815 mastocytoma syngeneic transplant model. (D,F) Kaplan-Meier survival curves for animals bearing (D) BBN963 basal-like bladder cancer and (F) P815 mastocytoma syngeneic transplants. Animals were treated with predicted high (red) or low (blue) immunogenicity antigens, no-peptide control (black), or bone marrow only control (grey). Data in (D) represent two independent experiments. Data in (F) represent one independent experiment. Statistics performed with log-rank testing (**p<0.01; ***p<0.001).

In vivo validation of the class I immunogenicity prediction model

To test whether our final algorithm increased the likelihood of identifying clinically relevant, immunogenic epitopes for antitumor vaccine responses, we used two tumor models within our validation set: BBN963 basal-like bladder cancer neoantigens (epithelial tumor) and P815 mastocytoma GvL mHA (hematopoietic tumor). Epitopes were binned into predicted high immunogenicity (top quartile) and predicted low immunogenicity (bottom quartile) groups for comparison of relative efficacy (Supplementary Table S3). In BBN963 tumors, three predicted high (B2: VALLPSVML; C2: VSLTLFSSWL; A5: SNVMQLLL) and two predicted low (B5: ETLLNSATI; B12: MISRNRHTL) immunogenicity neoantigens were identified. Animals were vaccinated with 30 μg of one peptide (or no-peptide control) alongside 50 μg poly(I:C) as adjuvant, challenged with tumor 12 days after vaccination, then given a 30 μg peptide boost on day 21 after the initial vaccination (Figure 3C). Animals vaccinated with a predicted high immunogenicity peptide survived longer than those vaccinated with either a predicted low immunogenicity peptide (p=0.0006) or no-peptide control (p=0.0031; Figure 3D, Supplementary Figure S7A). In contrast, no significant difference in survival was observed between the predicted low immunogenicity peptide and no-peptide control groups (p=0.9674). We additionally observed better control of tumor size in animals vaccinated with a predicted high immunogenicity peptide (Supplementary Figure S7B–D).
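The quartile binning used to define the comparison groups can be sketched as below. This is a minimal illustration under stated assumptions: `quartile_bins` is a hypothetical helper, quartiles are taken by rank (the top and bottom fourth of the list, rounded down), and the epitope identifiers and scores are invented placeholders rather than values from Supplementary Table S3.

```python
def quartile_bins(predicted):
    """Split epitopes into top- and bottom-quartile groups by predicted score.

    predicted: dict mapping epitope id -> predicted immunogenicity score.
    """
    ranked = sorted(predicted, key=predicted.get, reverse=True)
    size = len(ranked) // 4
    return {"high": ranked[:size], "low": ranked[len(ranked) - size:]}

# Placeholder identifiers and scores, for illustration only.
scores = {
    "e1": 210, "e2": 15, "e3": 95, "e4": 0,
    "e5": 130, "e6": 40, "e7": 60, "e8": 5,
}
bins = quartile_bins(scores)
print(bins)
```

Epitopes in the middle two quartiles are left out of both groups, which keeps the high and low groups well separated for a head-to-head comparison.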

In P815 tumors, two predicted high (AFQRVTCTTL and QYSSANDWTV) and three predicted low (HYAANEWI, KFFPNCIFL, and LYISPNPEVL) immunogenicity GvL mHAs were identified. BALB/c donor animals were vaccinated with a pool of predicted high or low immunogenicity peptides (100 μg each peptide) or no-peptide control, with 50 μg poly(I:C) as adjuvant on days 0 and 7. DBA/2 recipient animals were lethally irradiated on day 13; transplanted with 3×10⁶ BALB/c T cells, 3×10⁶ BALB/c bone marrow cells, and 3×10⁵ P815 tumor cells on day 14; and finally given a third booster vaccine on day 21 (Figure 3E). Animals given predicted high immunogenicity T cells survived longer than those given predicted low immunogenicity T cells (median survival 44.5 and 28 days, respectively), both of which survived longer than those given no-peptide control T cells (median survival 19 days, Figure 3F). Additionally, we observed significantly lower tumor burden in high immunogenicity versus low immunogenicity peptide vaccinated animals by luciferase imaging on day 26 (p < 0.05, Mann-Whitney U test; Supplementary Figures S8–S9). All groups receiving donor T cells demonstrated clinically measurable graft-versus-host disease (GvHD) after transplant, without significant differences in weight loss or GvHD clinical scores between groups up to day 30 (Supplementary Figure S10). In summary, these experiments demonstrated the in vivo biological relevance of our immunogenicity prediction model, with significant differences observed between predicted high and low immunogenicity epitopes.

Correlates of predicted immunogenicity in human class I epitopes

Although the immunogenicity prediction algorithm was designed and validated in mice, we hypothesized that similar rules of immunogenicity may exist among human neoantigens. To study this, we ran previously predicted MHC I neoantigens from TCGA through our machine-learning algorithm, generating immunogenicity scores for each epitope [17]. As expected, we observed a correlation between the number of HINs identified by our model (>95^th percentile) and the number of total neoantigens (Pearson correlation p<0.0001; Supplementary Figure S11). Therefore, we performed regression studies between HIN count and immune features without controlling for total neoantigen burden. We observed significant associations between HIN count and IFNγ, cytotoxicity, CD8⁺ T-cell and total T-cell, and B-cell immune gene signatures (IGS) across the dataset (not controlling for cancer type; Figure 4A). Assessing these associations individually by tumor type, we observed that the most significant associations were in the colon (COAD) and lung (LUAD) adenocarcinoma cancer types (Figure 4B). Within COAD, we observed a positive association between HIN count and many T-cell and cytotoxicity signatures. To identify potential drivers of this pattern, we looked for an association between HIN count and MSI status, observing that MSI-high COAD tumors had significantly higher HIN counts (Figure 4C, Supplementary Figure S12).
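Counting highly immunogenic neoantigens (HINs) per tumor at the >95th percentile threshold could look like the sketch below. Several details are assumptions not stated in the text: the percentile is taken over all epitopes pooled across tumors, a nearest-rank percentile definition is used, and the function names, sample identifiers, and scores are hypothetical.

```python
from collections import Counter

def percentile_cutoff(values, pct=95):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct * n / 100
    return ordered[max(rank, 1) - 1]

def hin_counts(scored_epitopes, pct=95):
    """Count epitopes per sample whose predicted score exceeds the pooled cutoff.

    scored_epitopes: iterable of (sample_id, predicted_score) pairs.
    """
    scored = list(scored_epitopes)
    cutoff = percentile_cutoff([s for _, s in scored], pct)
    counts = Counter({sample: 0 for sample, _ in scored})
    counts.update(sample for sample, s in scored if s > cutoff)
    return dict(counts), cutoff

# Invented scores: tumor T1 carries scores 1-10 and tumor T2 carries 11-20.
scored = [("T1", s) for s in range(1, 11)] + [("T2", s) for s in range(11, 21)]
counts, cutoff = hin_counts(scored)
print(counts, cutoff)
```

Because the cutoff is defined on the pooled distribution, tumors with more predicted neoantigens will tend to have more HINs, in line with the correlation with total neoantigen burden noted above.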

Figure 4: (A) Volcano plot representing generalized linear model (GLM) coefficient (x-axis) and −log10(q-value) (y-axis) between numbers of highly immunogenic neoantigens (HINs) and immune gene signatures (IGSs) in TCGA pan-cancer datasets. (B) Heatmap representing GLM regression between numbers of HINs and IGS for each TCGA cancer subset. Color represents direction of coefficient (red: positive; blue: negative), and shade represents −log10(q-value) magnitude. (C) Number of HINs (x-axis) versus microsatellite instability (MSI) score (y-axis) for a TCGA colorectal carcinoma (COAD) dataset. (D) Volcano plot representing GLM coefficient (x-axis) and −log10(q-value) (y-axis) between average expression of IGS in (B) with significantly negative association with HIN burden and oncogene/tumor suppressor copy numbers in a TCGA lung adenocarcinoma (LUAD) dataset. (A,D) Dashed line represents q-value=0.05.

In contrast, LUAD demonstrated a negative association between HIN count and signatures of anti–PD-1 responsiveness and several immune cell signatures. Regression analysis between these negative IGS features and LUAD oncogene/tumor suppressor copy numbers demonstrated preferential association with MYC copy number (q-value <0.05; Figure 4D, Supplementary Figure S13). Consistent with MYC amplification providing a pro-tumorigenic signal in LUAD, we observed significantly greater expression of genes corresponding with cell cycle gene patterns (Supplementary Figure S14A,C,D), as well as enrichment of downstream genes involved in the MYC pathway (Supplementary Figure S14B), among MYC-amplified tumors. This increased proliferation pattern was additionally associated with decreased sharing of T-cell receptor sequences from tumor-infiltrating T cells in MYC-amplified tumors, suggesting a potentially altered antitumor immune response (Supplementary Figure S14E). We did not find HIN count to correlate with MYC copy number (Pearson p>0.5), suggesting tumor immunogenicity burden and MYC target expression may be independent predictors of immune exclusion and checkpoint inhibition resistance.

Out-of-frame neoantigen epitopes promote anti-tumor immunity

In designing our neoantigen prediction algorithm, we considered predicted tumor epitopes across all open reading frames. As such, a subset of our predicted neoantigens were frameshifted epitopes that contained a mutation. We hypothesized that, through mechanisms such as novel splice variants and ribosomal dysfunction, a subset of these out-of-frame predicted antigens could arise in the tumor, providing viable targets with greater heterogeneity from self-antigen. Indeed, two of the predicted high immunogenicity neoantigens used in our BBN963 vaccine studies were predicted to be out-of-frame (B2, C2), yet still provided therapeutic efficacy over predicted low immunogenicity and no-peptide controls. One of these antigens (B2) demonstrated computational evidence of translation in the out-of-frame context using two de novo transcriptome assemblers, Trinity [35] and StringTie [34], whereby the presence of a 5’ start codon was identified with no intervening stop codon up to the B2 antigen site.

With therapeutic and computational evidence in favor of B2 antigen-mediated antitumor immune responses against BBN963 tumors, we next confirmed the presence of B2/MHC tetramer–specific CD8⁺ T cells infiltrating the tumors of BBN963-bearing animals, suggesting an antigen-driven T-cell response (Figure 5A, Supplementary Figure S15). Tetramer sorting and peptide-pulsed dendritic cell cocultures of B2-specific T cells demonstrated approximately 4-fold expansion of T cells within 10 days (<5×10⁵ to >2×10⁶), with maintenance of a B2-enriched population (Supplementary Figure S16). Using a flow cytometric cytotoxicity assay, coculture of B2-specific T cells with the BBN963 cell line demonstrated a >1.7-fold increase in killing of target cells (15.25%) compared to the OTI T-cell irrelevant control (8.85%). Neither B2-specific nor OTI T cells demonstrated killing of irrelevant splenocyte control cells (1.1% and 0.85%, respectively; Figure 5B, Supplementary Figure S17). B2-specific T-cell killing of BBN963 over that of OTI T-cell controls was additionally confirmed using a ⁵¹Cr-release cytotoxicity assay (Figure 5C). Altogether, these results suggested the presence of a cytotoxic CD8⁺ T-cell response against the out-of-frame B2 neoantigen in BBN963.

Figure 5: (A) Percent B2 tetramer–positive (left) versus irrelevant SIINFEKL-tetramer control (right) BBN963 tumor-infiltrating CD8⁺ T cells. Statistics performed with Mann-Whitney U test (*p<0.05). (B) 4-hour flow cytometric cytotoxicity assay, comparing percent specific killing in 1:1 cocultures of B2 tetramer–specific T cells or OTI irrelevant T-cell controls versus BBN963 target or irrelevant splenocyte target control. Percent killing represents values after subtraction of spontaneous target death background. Statistics not performed for (B) due to n=2 sample size across all groups. (C) 4-hour chromium-51 release assay, comparing percent specific killing in cocultures of B2 tetramer–specific T cells or OTI irrelevant T-cell controls versus BBN963 targets at 10:1 and 5:1 effector-to-target ratios. (A-C) Error bars represent mean±standard deviation. Data in each graph represent one independent experiment. Statistics performed with Welch’s t-test (**p<0.01; *p<0.05).

Discussion

The study presented here addressed two unanswered questions regarding tumor antigens: 1) what features of a tumor antigen sequence are associated with immunogenicity, and 2) can inclusion of non-canonical, out-of-frame epitopes provide viable targets for anti-tumor therapeutic vaccination? We demonstrated that peptide-intrinsic features of predicted tumor antigens could discriminate epitopes with therapeutic efficacy, and that inclusion of out-of-frame epitopes among this pool could provide antitumor immunity against these alternative antigens. We showed that reading frame was not a significant determinant for immunogenicity (at least among peptides with predicted binding affinity <500 nM), and that exclusion of frame-filtering could identify out-of-frame epitopes with therapeutic antitumor, cytotoxic activity. Although the optimal rules for immunogenicity may differ between in-frame and out-of-frame tumor antigens, our relatively limited training set was underpowered to discriminate between these two classes. As such, future studies should be performed to address the biological differences between in- and out-of-frame tumor antigens, and what methods can most optimally identify clinically relevant epitopes in each class.

Our analysis of class I epitopes demonstrated similar trends in expression of features significantly associated with immunogenicity, as well as potential generalizability of our final gradient boosting algorithm for human MHC. We were unable to demonstrate here whether MHC haplotype may influence immunogenicity prediction, given our murine models were limited to two haplotypes. That said, it may be the case that certain features may significantly impact immunogenicity in a way that is conserved across haplotypes. A potential bias in our analysis is the variation in distribution of ELISpot scores among various models. This variation is likely a product of both our selection process (i.e. peptide selection based on a threshold predicted MHC binding affinity) as well as the number of predicted epitopes available for screening in each model. As methodology for antigen prediction and validation was conserved across all models, as well as biological validation performed across two independent tumor models, we do not believe there to be significant underlying biological differences among epitopes identified between different tumor models.

Despite the increased interest in neoantigen-based therapeutic tumor vaccine therapy, prediction algorithms capable of directly predicting for neoantigen immunogenicity are lacking [50], and no neoantigen immunogenicity predictor trained specifically on tumor antigen data exists. Current neoantigen immunogenicity predictors are instead trained on databases containing immunogenicity scores from all potential MHC-binding epitopes, of which the biology may not closely match that of mutation-derived tumor antigens. An example of this biological disparity is observed in the vastly different immune response rates between neoantigens and tumor-specific endogenous retroviral epitopes [51], whereby the concept of a “self” and “non-self” antigen is not considered as a feature for immunogenicity prediction among current algorithms. Training our model specifically on “self” tumor antigenic sources instead allows for greater specificity of selection for peptide-intrinsic features which correlated with ex vivo validated IFNγ release scores. Our final predictive algorithm demonstrated capacity to select for therapeutically relevant epitopes, as observed in our treatment studies where predicted high immunogenicity peptides controlled tumor burden significantly better than predicted low immunogenicity peptides and no-peptide control groups. This model demonstrated a high true-negative rate, which is ideal in the setting of filtering out many weakly immunogenic epitopes to select for a small pool of targets for therapy.

Validation experiments in BBN963 and P815 models were performed as a combination of prophylactic and therapeutic vaccines, rather than strictly treating animals after tumor injection. This method was selected due to the intrinsically low efficacy of free-peptide vaccines, whereby differences in therapeutic efficacy may not be observed between predicted high and low immunogenicity antigens [52]. As such, although these experiments provided evidence for the biological relevance of our computational model, development of more robust therapeutic vaccine platforms is still necessary for improving response rates to peptide-based, tumor-specific antigen vaccines. From these studies, we observed that predicted high immunogenicity peptides had greater benefit in the therapeutic vaccine setting than predicted low immunogenicity peptides. As such, we reasoned that although HIN count and total neoantigen burden were correlated, HINs were the key drivers of immunity. Thus, we performed regression studies between HIN count and immune features without controlling for total neoantigen burden. Analysis of human neoantigen data from TCGA demonstrated an association between the presence of HINs and features of immune response in colon and lung adenocarcinoma. Although the association between HIN count and immune gene signature expression, as well as MSI status, in COAD agreed with the classical view of a tumor antigen-driven immune response, the negative association with immune features (including signatures of anti–PD-1 responsiveness) in LUAD is less clear. A report from Jerby-Arnon and colleagues demonstrated an association between resistance to immune checkpoint inhibition and MYC target expression [53]. As such, we initially hypothesized that MYC target expression may be the common driver for immune exclusion, anti–PD-1 non-responsiveness, and high HIN burden.
However, MYC copy number did not correlate with HIN count, suggesting MYC expression and HIN burden are independent pathways of immune exclusion and checkpoint inhibitor resistance in LUAD. Further studies are necessary to more closely examine the relationship between tumor immunogenic neoantigen burden and immune features, elucidating why higher immunogenicity burden is unexpectedly negatively associated with IGS patterns.

Currently, it is not well understood what fraction of tumor antigens arise from conventional in-frame antigens versus non-conventional antigenic sources, such as retroviral/retrotransposon expression, intron expression, and out-of-frame translation. A study from Laumont et al. suggests that non-coding regions are the main source of tumor-specific antigens in acute lymphoblastic leukemia patient samples [20], providing evidence that current methods for neoantigen prediction may be limited by filtering for in-frame exon regions. Laumont and colleagues used an RNA-based screening approach whereby k-mers derived from tumor RNA-seq reads were directly screened against matched-normal RNA k-mers, keeping only tumor-specific regions. Compared to conventional exome-based TSA calling, this RNA-based screening approach allowed for identification of a broader repertoire of epitopes, consistent with the increased frequency of non-canonical TSAs identified by Laumont et al. compared to the current study. Although Laumont and colleagues used a mass spectrometry approach to confirm expression of out-of-frame epitopes, no computational methods have been used to identify such non-canonical epitopes. As such, our study relies upon a naïve, non-biased approach for screening out-of-frame antigens, whereby we combined conventional exome-based SNP antigen calling with identification of potential epitopes across all open reading frames. We demonstrated that the frame of an epitope did not associate with immunogenicity, but that inclusion of out-of-frame epitopes could provide therapeutic benefit. This analysis highlighted how some proportion of SNV neoantigens predicted to be out-of-frame may still maintain expression and the capacity to trigger a cytotoxic T-cell response against the tumor.
If such antigens are further confirmed in human cancers, there are important implications that will need to be addressed: 1) whether the biology and immunogenicity of these out-of-frame “SNV” antigens more closely mirrors that of classical SNV-neoantigens or whether they are instead more similar to INDEL-derived neoantigens or alternative neoantigens, such as tumor-specific endogenous retroviral antigens; and 2) if reading frame filters should be applied to current neoantigen calling algorithms in order to most effectively capture the targetable antigen landscape of a tumor.

Supplementary Material

NIHMS1537822-supplement-1.pptx (10.2 MB, pptx)

NIHMS1537822-supplement-2.xlsx (25.6 KB, xlsx)

Acknowledgements

This work was supported by the US National Institutes of Health (NIH) grants F30 CA225136 (C.C.S.), U54 CA198999 (J.S.S.), and P50 CA058223 (J.S.S.), as well as the UNC University Cancer Research Fund (B.G.V.), and the Susan G. Komen Career Catalyst Research Grant (B.G.V.).

Footnotes

Conflict of Interest Statement: The authors declare no potential conflict of interest.

References

1. Gubin MM, Artyomov MN, Mardis ER, Schreiber RD. Tumor neoantigens: Building a framework for personalized cancer immunotherapy. Vol. 125, Journal of Clinical Investigation. 2015. p. 3413–21.
2. Castle JC, Kreiter S, Diekmann J, Löwer M, Van De Roemer N, De Graaf J, et al. Exploiting the mutanome for tumor vaccination. Cancer Res. 2012;72(5):1081–91.
3. Kreiter S, Vormehr M, van de Roemer N, Diken M, Löwer M, Diekmann J, et al. Mutant MHC class II epitopes drive therapeutic immune responses to cancer. Nature. 2015;520(7549):692–6.
4. Kranz LM, Diken M, Haas H, Kreiter S, Loquai C, Reuter KC, et al. Systemic RNA delivery to dendritic cells exploits antiviral defence for cancer immunotherapy. Nature. 2016;534(7607):396–401.
5. Sahin U, Derhovanessian E, Miller M, Kloke B-P, Simon P, Löwer M, et al. Personalized RNA mutanome vaccines mobilize poly-specific therapeutic immunity against cancer. Nature. 2017;547(7662):222–6.
6. Ott PA, Hu Z, Keskin DB, Shukla SA, Sun J, Bozym DJ, et al. An immunogenic personal neoantigen vaccine for patients with melanoma. Nature [Internet]. 2017;547(7662):217–21. Available from: http://www.ncbi.nlm.nih.gov/pubmed/28678778 and http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=PMC5577644
7. Hoof I, Peters B, Sidney J, Pedersen LE, Sette A, Lund O, et al. NetMHCpan, a method for MHC class i binding prediction beyond humans. Immunogenetics. 2009;61(1):1–13.
8. Nielsen M, Andreatta M. NetMHCpan-3.0; improved prediction of binding to MHC class I molecules integrating information from multiple receptor and peptide length datasets. Genome Med. 2016;8(1).
9. Jurtz V, Paul S, Andreatta M, Marcatili P, Peters B, Nielsen M. NetMHCpan-4.0: Improved Peptide–MHC Class I Interaction Predictions Integrating Eluted Ligand and Peptide Binding Affinity Data. J Immunol. 2017;199(9):3360–8.
10. Andreatta M, Karosiene E, Rasmussen M, Stryhn A, Buus S, Nielsen M. Accurate pan-specific prediction of peptide-MHC class II binding affinity with improved binding core identification. Immunogenetics. 2015;67(11–12):641–50.
11. Zhang Q, Wang P, Kim Y, Haste-Andersen P, Beaver J, Bourne PE, et al. Immune epitope database analysis resource (IEDB-AR). Nucleic Acids Res. 2008;36(Web Server issue).
12. Kim Y, Ponomarenko J, Zhu Z, Tamang D, Wang P, Greenbaum J, et al. Immune epitope database analysis resource. Nucleic Acids Res. 2012;40(W1).
13. O’Donnell TJ, Rubinsteyn A, Bonsack M, Riemer AB, Laserson U, Hammerbacher J. MHCflurry: Open-Source Class I MHC Binding Affinity Prediction. Cell Syst. 2018;7(1):129–132.e4.
14. Karosiene E, Lundegaard C, Lund O, Nielsen M. NetMHCcons: A consensus method for the major histocompatibility complex class i predictions. Immunogenetics. 2012;64(3):177–86.
15. Saito R, Smith CC, Utsumi T, Bixby LM, Kardos J, Wobker SE, et al. Molecular subtype-specific immunocompetent models of high-grade urothelial carcinoma reveal differential neoantigen expression and response to immunotherapy. Cancer Res. 2018;78(14):3954–68.
16. Colli LM, Machiela MJ, Myers TA, Jessop L, Yu K, Chanock SJ. Burden of nonsynonymous mutations among TCGA cancers and candidate immune checkpoint inhibitor responses. Cancer Res. 2016;76(13):3767–72.
17. Thorsson V, Gibbs DL, Brown SD, Wolf D, Bortone DS, Ou Yang TH, et al. The Immune Landscape of Cancer. Immunity. 2018;48(4):812–830.e14.
18. Jayasinghe RG, Cao S, Gao Q, Wendl MC, Vo NS, Reynolds SM, et al. Systematic Analysis of Splice-Site-Creating Mutations in Cancer. Cell Rep. 2018;23(1):270–281.e3.
19. Kahles A, Lehmann K Van, Toussaint NC, Hüser M, Stark SG, Sachsenberg T, et al. Comprehensive Analysis of Alternative Splicing Across Tumors from 8,705 Patients. Cancer Cell. 2018;34(2):211–224.e6.
20. Laumont CM, Vincent K, Hesnard L, Audemard É, Bonneil É, Laverdure J-P, et al. Noncoding regions are the main source of targetable tumor-specific antigens. Sci Transl Med [Internet]. 2018 Dec 5;10(470):eaau5516. Available from: http://stm.sciencemag.org/content/10/470/eaau5516.abstract
21. Weiss RB, Dunn DM, Atkins JF, Gesteland RF. Slippery runs, shifty stops, backward steps, and forward hops: −2, −1, +1, +2, +5, and +6 ribosomal frameshifting. Cold Spring Harb Symp Quant Biol [Internet]. 1987;52:687–93. Available from: http://www.ncbi.nlm.nih.gov/pubmed/3135981
22. Saulquin X, Scotet E, Trautmann L, Peyrat MA, Halary F, Bonneville M, et al. +1 Frameshifting as a novel mechanism to generate a cryptic cytotoxic T lymphocyte epitope derived from human interleukin 10. J Exp Med. 2002;195(3):353–8.
23. Macejak DG, Sarnow P. Internal initiation of translation mediated by the 5′ leader of a cellular mRNA. Nature. 1991;353(6339):90–4.
24. Bullock TNJ, Patterson AE, Franlin LL, Notidis E, Eisenlohr LC. Initiation Codon Scanthrough versus Termination Codon Readthrough Demonstrates Strong Potential for Major Histocompatibility Complex Class I–restricted Cryptic Epitope Expression. J Exp Med. 1997;186(7):1051–8.
25. Bullock TN. Ribosomal scanning past the primary initiation codon as a mechanism for expression of CTL epitopes encoded in alternative reading frames. J Exp Med. 1996;184(4):1319–29.
26. Malarkannan S, Horng T, Shih PP, Schwab S, Shastri N. Presentation of out-of-frame peptide/MHC class I complexes by a novel translation initiation mechanism. Immunity. 1999;10(6):681–90.
27. Van Den Eynde BBJ, Gaugler B, Probst-Kepper M, Michaux L, Devuyst O, Lorge F, et al. A new antigen recognized by cytolytic T lymphocytes on a human kidney tumor results from reverse strand transcription. J Exp Med. 1999;190(12):1793–800.
28. Bruce AG, Atkins JF, Gesteland RF. tRNA anticodon replacement experiments show that ribosomal frameshifting can be caused by doublet decoding. Proc Natl Acad Sci. 1986;83(14):5062–6.
29. Bruce DW, Stefanski HE, Vincent BG, Dant TA, Reisdorf S, Bommiasamy H, et al. Type 2 innate lymphoid cells treat and prevent acute gastrointestinal graft-versus-host disease. J Clin Invest. 2017;127(5):1813–25.
30. Roberts PJ, Usary JE, Darr DB, Dillon PM, Pfefferle AD, Whittle MC, et al. Combined PI3K/mTOR and MEK inhibition provides broad antitumor activity in faithful murine cancer models. Clin Cancer Res. 2012;18(19):5290–303.
31. Cooke KR, Kobzik L, Martin TR, Brewer J, Delmonte J, Crawford JM, et al. An experimental model of idiopathic pneumonia syndrome after bone marrow transplantation: I. The roles of minor H antigens and endotoxin. Blood. 1996.
32. Wilkerson MD, Cabanski CR, Sun W, Hoadley KA, Walter V, Mose LE, et al. Integrated RNA and DNA sequencing improves mutation detection in low purity tumors. Nucleic Acids Res. 2014;42(13).
33. Rajasagi M, Shukla SA, Fritsch EF, Keskin DB, DeLuca D, Carmona E, et al. Systematic identification of personal tumor-specific neoantigens in chronic lymphocytic leukemia. Blood. 2014;124(3):453–62.
34. Pertea M, Pertea GM, Antonescu CM, Chang TC, Mendell JT, Salzberg SL. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol. 2015.
35. Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc. 2013.
36. Coghill JM, Carlson MJ, Panoskaltsis-Mortari A, West ML, Burgents JE, Blazar BR, et al. Separation of graft-versus-host disease from graft-versus-leukemia responses by targeting CC-chemokine receptor 7 on donor T cells. Blood. 2010.
37. Dolton G, Tungatt K, Lloyd A, Bianchi V, Theaker SM, Trimby A, et al. More tricks with tetramers: A practical guide to staining T cells with peptide-MHC multimers. Vol. 146, Immunology. 2015. p. 11–22.
38. Wölfl M, Greenberg PD. Antigen-specific activation and cytokine-facilitated expansion of naive, human CD8+T cells. Nat Protoc. 2014;9(4):950–66.
39. Quintarelli C, Dotti G, De Angelis B, Hoyos V, Mims M, Luciano L, et al. Cytotoxic T lymphocytes directed to the preferentially expressed antigen of melanoma (PRAME) target chronic myeloid leukemia. Blood. 2008;112(5):1876–85.
40. Chan KS, Espinosa I, Chao M, Wong D, Ailles L, Diehn M, et al. Identification, molecular characterization, clinical prognosis, and therapeutic targeting of human bladder tumor-initiating cells. Proc Natl Acad Sci U S A. 2009;106(33):14016–21.
41. Prat A, Parker JS, Karginova O, Fan C, Livasy C, Herschkowitz JI, et al. Phenotypic and molecular characterization of the claudin-low intrinsic subtype of breast cancer. Breast Cancer Res. 2010;12(5):R68.
42. Iglesia MD, Vincent BG, Parker JS, Hoadley KA, Carey LA, Perou CM, et al. Prognostic B-cell signatures using mRNA-seq in patients with subtype-specific breast and ovarian cancer. Clin Cancer Res. 2014;20(14):3818–29.
43. Kardos J, Chai S, Mose LE, Selitsky SR, Krishnan B, Saito R, et al. Claudin-low bladder tumors are immune infiltrated and actively immune suppressed. JCI Insight. 2016;1(3).
44. Bindea G, Mlecnik B, Tosolini M, Kirilovsky A, Waldner M, Obenauf AC, et al. Spatiotemporal dynamics of intratumoral immune cells reveal the immune landscape in human cancer. Immunity. 2013;39(4):782–95.
45.Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550. [DOI] [PMC free article] [PubMed] [Google Scholar]
46. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005;102(43):15545–50.
47. Ingenuity Systems. Ingenuity Pathway Analysis. www.ingenuity.com. 2013;(May):5020.
48. Huang DW, Sherman BT, Tan Q, Collins JR, Alvord WG, Roayaei J, et al. The DAVID Gene Functional Classification Tool: A novel biological module-centric algorithm to functionally analyze large gene lists. Genome Biol. 2007;8(9).
49. Bolotin DA, Poslavsky S, Mitrophanov I, Shugay M, Mamedov IZ, Putintseva EV, et al. MiXCR: Software for comprehensive adaptive immunity profiling. Nat Methods. 2015;12:380–1.
50. Kim S, Kim HS, Kim E, Lee MG, Shin EC, Paik S, et al. Neopepsee: Accurate genome-level prediction of neoantigens by harnessing sequence and amino acid immunogenicity information. Ann Oncol. 2018;29(4):1030–6.
51. Smith CC, Beckermann KE, Bortone DS, Cubas AA, Bixby LM, Lee SJ, et al. Endogenous retroviral signatures predict immunotherapy response in clear cell renal cell carcinoma. J Clin Invest. 2018;128(11):4804–20.
52. Irvine DJ, Hanson MC, Rakhra K, Tokatlian T. Synthetic Nanoparticles for Vaccines and Immunotherapy. Chem Rev. 2015;115:11109–46.
53. Jerby-Arnon L, Shah P, Cuoco MS, Rodman C, Su MJ, Melms JC, et al. A Cancer Cell Program Promotes T Cell Exclusion and Resistance to Checkpoint Blockade. Cell. 2018;175(4):984–997.e24.

Associated Data

Supplementary Materials

NIHMS1537822-supplement-1.pptx (10.2 MB, pptx)

NIHMS1537822-supplement-2.xlsx (25.6 KB, xlsx)
