Immunology. 2020 Nov 3;162(2):235–247. doi: 10.1111/imm.13279

Comparison of HLA ligand elution data and binding predictions reveals varying prediction performance for the multiple motifs recognized by HLA‐DQ2.5

Zeynep Koşaloğlu‐Yalçın 1, John Sidney 1, William Chronister 1, Bjoern Peters 1,2, Alessandro Sette 1,2
PMCID: PMC7808151  PMID: 33064841

Summary

Binding prediction tools are commonly used to identify peptides presented on MHC class II molecules. Recently, a wealth of data in the form of naturally eluted ligands has become available and discrepancies between ligand elution data and binding predictions have been reported. Quantitative metrics for such comparisons are currently lacking. In this study, we assessed how efficiently MHC class II binding predictions can identify naturally eluted peptides, and investigated instances with discrepancies between the two methods in detail. We found that, in general, MHC class II eluted ligands are predicted to bind to their reported restriction element with high affinity. However, for several studies reporting an increased number of ligands that were not predicted to bind, we found that the reported MHC restriction was ambiguous. Additional analyses determined that most of the ligands predicted not to bind are predicted to bind other co‐expressed MHC class II molecules. For selected alleles, we addressed discrepancies between elution data and binding predictions by experimental measurements and found that predicted and measured affinities correlate well. For DQA1*05:01/DQB1*02:01 (DQ2.5), however, binding predictions did miss several peptides that were determined experimentally to be binders. For these peptides and several known DQ2.5 binders, we determined key residues for conferring DQ2.5 binding capacity, which revealed that DQ2.5 utilizes two different binding motifs, of which only one is predicted effectively. These findings have important implications for the interpretation of ligand elution data and for the improvement of MHC class II binding predictions.

Keywords: binding motif, binding predictions, ligand elution, MHC


We addressed discrepancies between elution data and binding predictions by experimental measurements and found that predicted and measured affinities correlate well. For DQ2.5 however, binding predictions did miss several peptides that were determined experimentally to be binders. For these peptides and several known DQ2.5 binders, we determined key residues for conferring DQ2.5 binding capacity, which revealed that DQ2.5 utilizes two different binding motifs, of which only one is predicted effectively.



Abbreviations

DQ2.5: DQA1*05:01/DQB1*02:01
EBV: Epstein‐Barr virus
HLA: human leucocyte antigen
IEDB: Immune Epitope Database
Kd: equilibrium dissociation constant
mAb: monoclonal antibody
MHC: major histocompatibility complex
MS: mass spectrometry
nM: nanomolar
P1, P2, P3, P4, P6, P7, P8, P9: binding pockets 1‐9 in MHC binding groove
PDB: Protein Data Bank
TCR: T cell receptor

INTRODUCTION

Major histocompatibility complex (MHC) class II molecules are expressed on professional antigen‐presenting cells and present antigenic peptides to CD4+ T cells, which play an important role in regulating immune responses. 1 MHC class II, called human leucocyte antigen (HLA) class II in humans, and CD4+ T‐cell responses are of major significance for autoimmunity and antitumor immunity. It is known, for example, that coeliac disease is driven by CD4+ T‐cell responses to gluten proteins presented by certain HLA class II molecules. 2 More recently, it has been shown that neoantigens presented on HLA class II elicit potent antitumor responses in patients and that CD4+ T cells are required for successful anticancer immunotherapy. 3 , 4 , 5 Given these developments, there is an increased interest in developing efficient approaches for identifying epitopes presented by MHC class II. As experimental identification of epitopes remains time consuming and costly, and subject to several technical challenges, MHC class II binding prediction tools are now commonly used. 6

Most MHC binding prediction tools are based on machine learning algorithms trained on data sets of known peptide–MHC binding affinities. With an increasing amount of data available for training, prediction algorithms have been able to achieve high accuracy. It is, however, generally accepted that predicting binding to MHC class II is a more challenging problem than predicting binding to MHC class I, for a number of reasons. MHC class II molecules consist of heterodimers of alpha and beta chains. These chains are encoded by genes in the HLA‐DP, HLA‐DQ or HLA‐DR region of chromosome 6 in humans and are highly polymorphic in the general population. 7 Most of the polymorphism is associated with residues forming the MHC’s peptide‐binding region, which accounts in large part for the high degree of variation in binding specificity observed between different alleles. The presence of two potentially polymorphic chains makes the assignment of MHC restriction harder for class II, compared to class I molecules, which consist of a variable alpha chain and the monomorphic beta‐2‐microglobulin molecule.

The main energy of interaction for peptide binding to MHC class I and II is conferred by a peptide core of approximately 9 residues. However, in contrast to MHC class I, the binding groove of MHC class II is open on both ends, allowing peptides of variable length to bind. The presence of amino acids flanking the peptide core is also necessary for efficient binding to MHC class II. As a result, MHC class II has a preferred minimal length of about 13 residues for binding peptides. The presence of such over‐hanging residues, plus the fact that the open ends of the binding groove afford multiple different peptide frames, means that machine learning algorithms for MHC class II binding have to consider multiple possible peptide cores.

The specificity of an MHC molecule can be described as a ‘motif’. Crystallographic analyses of MHC–peptide complexes have shown that different MHC molecules have different pockets within their binding groove. When a peptide is bound, the amino acid side chains of so‐called anchor residues in the peptide specifically insert into these pockets within the MHC binding groove. Both the position and specificity of these anchor residues are similar for different peptides binding the same MHC molecule and together describe the peptide‐binding motif of a given MHC molecule. For MHC class I binding peptides, the anchor residues are located at or near the N‐ and C‐termini of the peptide, usually P2 and P9 in the case of 9‐mer peptides. For MHC class II binding peptides, anchor residues can be at various distances from the ends of the peptide and at various positions of the binding core. For DRB1*01:01 and DRB1*15:01, for example, the prototypical anchor positions are at P1, P4, P6, P7 and P9. But other molecules, such as DRB1*11:01 and DQA1*03:01/DQB1*03:01, appear to have slightly different anchor spacing, with the second anchor at P3 instead of P4, or an anchor at P5. 8 While for many MHC class I and class II molecules distinct binding motifs could be defined, there are also instances of MHC molecules that are suggested to carry multiple binding motifs, such as HLA‐DRB1*03:01. 9

Recent advances in mass spectrometry (MS)‐based techniques have enabled the development of high‐throughput ligand elution assays, which allow the identification of thousands of natural ligands in a single experiment. 10 , 11 As eluted ligands undergo the natural processing and MHC loading cascade, the resulting elution data will inherently contain valuable biological information that is not available when only peptide binding is considered. 6 Given these advantages, ligand elution experiments have become quite popular, leading to a wealth of data in the form of eluted ligands from both MHC class I and class II molecules.

As eluted ligands were isolated from MHC molecules that presented them on the cell surface, their capacity to bind MHC is inherently supported. With the increasing amount of ligand elution data, discrepancies between MHC class II ligand elution and binding predictions have been reported and the efficiency of binding predictions has been put into doubt. 12 , 13 , 14 , 15 However, quantitative metrics for such comparisons are currently lacking. In this study, we wanted to systematically investigate how well HLA class II binding predictions and ligand elution data agree with each other. To achieve that, we assembled HLA class II ligand elution data reported in the literature and performed multiple comparative analyses between elution data and binding predictions: (i) we analysed how many of the eluted ligands were predicted to bind their reported restricting MHC; (ii) for several MHC class II alleles, a significant fraction of reported ligands were not predicted to bind and we investigated these in detail; (iii) we then chose four representative HLA class II alleles and specifically selected peptides for which ligand elution data and binding predictions disagreed and experimentally measured binding affinity to determine which method was correct; and (iv) we found that the correlation between predicted and measured binding is lowest for DQA1*05:01/DQB1*02:01 (DQ2.5). We therefore investigated this allele in detail and performed experiments to determine the key residues for binding. We found that DQ2.5 can bind peptides using alternate binding modes and that, currently, only one of these is being predicted accurately.

MATERIAL AND METHODS

Data assembly

The Immune Epitope Database (IEDB) 16 catalogs experimental data on antibody and T‐cell epitopes as well as MHC ligands. The IEDB was queried to retrieve ligand elution data for MHC class II using the following criteria: ‘Positive Assays Only, Epitope Structure: Linear Sequence, No T cell assays, No B cell assays, MHC ligand assays: MHC ligand elution assay, MHC Restriction Type: Class II’. The collected data included ligand sequence, the MHC class II allele from which the ligand was eluted, details of the source protein from which the ligands originated and PubMed identifiers (PMIDs) of the studies that reported the ligand.

We further refined this set to only include peptides that were eluted from HLA molecules typed with 4‐digit resolution, resulting in a set of 24,601 peptides from 27 different HLA class II molecules. The length of the ligands in this set ranged from 3 to 43 residues; based on previous analyses, 17 we only retained peptides of 15–20 amino acids in length. After this step, the data set contained 12,506 peptides from 27 different HLA class II molecules. We then further filtered the set by considering only alleles for which at least 10 ligands were reported. The resulting final data set used for analysis included 12,449 sequences, eluted from 16 different HLA class II molecules. This data set is provided as Table S1.
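The length and minimum-ligand filters described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `filter_ligands`, the `(peptide, allele)` record layout, and the choice to count ligands per allele after the length filter are all assumptions made for the example. The 4‐digit typing filter is not modelled.

```python
from collections import Counter


def filter_ligands(records, min_len=15, max_len=20, min_per_allele=10):
    """Filter (peptide, allele) pairs by peptide length and ligands per allele.

    Hypothetical helper: the record layout and the order of the two
    filters are assumptions, not taken from the paper.
    """
    # Step 1: keep only peptides of 15-20 residues.
    sized = [(pep, allele) for pep, allele in records
             if min_len <= len(pep) <= max_len]
    # Step 2: keep only alleles with at least `min_per_allele` remaining ligands.
    counts = Counter(allele for _, allele in sized)
    return [(pep, allele) for pep, allele in sized
            if counts[allele] >= min_per_allele]


# Toy input: allele "X" keeps 10 ligands of valid length, allele "Y" only 3.
toy = [("A" * 15, "X")] * 10 + [("A" * 12, "X")] + [("A" * 16, "Y")] * 3
kept = filter_ligands(toy)
print(len(kept), {allele for _, allele in kept})
```

Counting per allele after the length filter matches the order in which the steps are described, but the source does not state this explicitly.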

HLA binding predictions

We performed binding predictions using the standalone version of NetMHCIIpan (version 3.1), as this tool provides predictions for most HLA class II alleles. 18 , 19 HLA class II binding predictions were performed for all 12,449 sequences and their reported HLA class II restriction element. Peptides with predicted percentile ranks of >20% were considered predicted non‐binders. This threshold was selected based on previous studies analysing the binding affinity of HLA class II‐restricted T‐cell epitopes. 20 , 21 , 22 , 23

To analyse contamination of ligand elution data, we also performed binding predictions for all peptides to common DR and DQ alleles: 20 HLA‐DRB1*01:01, HLA‐DRB1*03:01, HLA‐DRB1*04:01, HLA‐DRB1*04:05, HLA‐DRB1*07:01, HLA‐DRB1*08:02, HLA‐DRB1*09:01, HLA‐DRB1*11:01, HLA‐DRB1*12:01, HLA‐DRB1*13:02, HLA‐DRB1*15:01, HLA‐DRB3*01:01, HLA‐DRB3*02:02, HLA‐DRB4*01:01, HLA‐DRB5*01:01, HLA‐DQA1*05:01/DQB1*02:01, HLA‐DQA1*05:01/DQB1*03:01, HLA‐DQA1*03:01/DQB1*03:02, HLA‐DQA1*04:01/DQB1*04:02, HLA‐DQA1*01:01/DQB1*05:01, HLA‐DQA1*01:02/DQB1*06:02.

Selection of a peptide set with discrepancies

We selected 4 alleles to further investigate peptides with discrepancies between ligand elution data and binding predictions: HLA‐DQA1*05:01/DQB1*02:01, DRB1*04:01, DRB1*15:01 and DRB1*01:01. These representative alleles were selected considering the number of reported eluted ligands, frequency of the allele in the general human population and whether an in‐house quantitative binding assay 24 was available. In total, 3,443 eluted peptides were available for the 4 selected HLA class II alleles, of which 1,611 had predicted %ranks >20. Accordingly, these were considered ‘predicted non‐binders’. The composition of the data set after each filtering step is described in detail in Table 1.

Table 1.

Composition of eluted ligand dataset after each filtering step and the number of peptides selected to be tested experimentally for the four alleles selected for detailed investigation

| Allele | Peptides in ligand elution dataset downloaded from IEDB | Peptides with length 15–20 | Peptides with predicted %rank > 20 | Predicted non‐binders, redundant peptides removed | Predicted non‐binders selected for testing | Predicted binders selected for testing |
| --- | --- | --- | --- | --- | --- | --- |
| HLA‐DQA1*05:01/DQB1*02:01 | 2,245 | 2,056 | 1,097 | 811 | 30 | 20 |
| HLA‐DRB1*01:01 | 638 | 594 | 114 | 97 | 20 | 15 |
| HLA‐DRB1*04:01 | 555 | 463 | 222 | 176 | 30 | 20 |
| HLA‐DRB1*15:01 | 388 | 330 | 178 | 132 | 17 | 18 |
| 4 selected HLA combined | 3,826 | 3,443 | 1,611 | 1,216 | 97 | 73 |

Analysis of sequence similarity

Many of the eluted peptides are very similar to each other as they contain the same MHC class II binding core with varying flanking residues of different lengths. When selecting peptides for experimental testing, we wanted to avoid testing the same MHC class II binding core multiple times. Thus, to reduce redundancy, we removed sequences with >80% sequence similarity. We used the IEDB clustering tool, 25 which groups the set of input peptides into clusters based on sequence identity. Here, a cluster is defined as a group of sequences that have a similarity greater than the specified minimum sequence identity threshold (80% in our case). We used all 1611 eluted peptides that were not predicted to bind for the 4 selected alleles as input, and from each cluster, we randomly selected one peptide. As a result, 1216 peptides that had <80% sequence similarity remained in the data set, from which peptides were selected for experimental testing.
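The redundancy reduction step can be illustrated with a simple greedy clustering sketch. This is not the IEDB clustering tool: the `identity` measure (best ungapped overlap, normalised by the shorter peptide), the greedy assignment to the first matching cluster, and all function names are assumptions chosen only to show the idea of grouping nested peptides and picking one representative at random.

```python
import random


def identity(a, b):
    """Best ungapped overlap identity, normalised by the shorter peptide.

    Stand-in similarity measure (an assumption, not the IEDB definition).
    """
    short, long_ = sorted((a, b), key=len)
    best = 0
    # Slide the shorter peptide along the longer one, allowing overhangs.
    for offset in range(-len(short) + 1, len(long_)):
        matches = sum(
            1 for i, res in enumerate(short)
            if 0 <= i + offset < len(long_) and long_[i + offset] == res
        )
        best = max(best, matches)
    return best / len(short)


def greedy_cluster(peptides, threshold=0.8):
    """Assign each peptide to the first cluster whose seed it resembles."""
    clusters = []
    for pep in peptides:
        for cluster in clusters:
            if identity(pep, cluster[0]) > threshold:
                cluster.append(pep)
                break
        else:
            clusters.append([pep])
    return clusters


def pick_representatives(peptides, threshold=0.8, seed=0):
    """Randomly pick one peptide per cluster (seeded for reproducibility)."""
    rng = random.Random(seed)
    return [rng.choice(c) for c in greedy_cluster(peptides, threshold)]


# Two nested variants of one ligand plus an unrelated ligand give 2 clusters.
peps = ["GEPDYVNGEVAATEA", "EPDYVNGEVAATEAK", "LPGRENYSSVDANGIQ"]
print(greedy_cluster(peps))
```

The second peptide in the toy input is a hypothetical frame-shifted variant constructed for the example; only the first and third sequences appear in the source.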

HLA binding measurements

Purification of MHC molecules by affinity chromatography was performed as detailed elsewhere. 24 Briefly, Epstein‐Barr virus (EBV) transformed homozygous cell lines or single MHC allele transfected RM3 or fibroblast lines were utilized as sources of HLA class II MHC molecules. MHC class II molecules were purified from cell pellet lysates by repeated passage over Protein A Sepharose beads, conjugated with the monoclonal antibodies (mAbs) LB3.1 (anti‐HLA‐DR), SPV‐L3 (anti‐HLA‐DQ) and B7/21 (anti‐HLA‐DP). Protein purity, concentration and the effectiveness of depletion steps were monitored by SDS‐PAGE and BCA assay.

Classical competition assays to quantitatively measure peptide binding to MHC class II molecules were performed as detailed elsewhere. 24 These assays are based on inhibition of binding of high‐affinity radiolabeled peptides to purified MHC molecules. Briefly, 0·1–1 nM of radiolabeled peptide was co‐incubated at room temperature or 37°C with purified MHC in the presence of a cocktail of protease inhibitors. Following a two‐ to four‐day incubation, MHC bound radioactivity was determined by capturing MHC/peptide complexes on MHC locus specific mAb‐coated Lumitrac 600 plates (Greiner Bio‐one, Frickenhausen, Germany). We then measured bound counts per minute (cpm) using the TopCount (Packard Instrument Co., Meriden, CT, USA) microscintillation counter. The concentration of peptide yielding 50% inhibition of binding of the radiolabeled peptide was calculated. Under the conditions utilized, where the concentration of the radiolabeled peptide is less than the concentration of MHC, and the affinity of the peptide (IC50 nM) is greater than or equal to the concentration of MHC (or more formally, [label] < [MHC] and IC50 ≥ [MHC]) measured IC50 values are reasonable approximations of true equilibrium dissociation constants (Kd). 26 , 27 Each competitor peptide was tested at six different concentrations covering a 100,000‐fold range, and in three or more independent experiments. As a positive control, the unlabelled version of the radiolabeled probe was also tested in each experiment.

Transforming measured IC50 values to percentile ranks

To compare the measured affinities to the predicted percentile ranks, we converted the measured IC50 values to percentile ranks as well. The percentile rank in this context describes the rank of a given IC50 value in a larger reference set of measured IC50 values. We obtained various data sets from the IEDB that contained binding measurements of overlapping 15mers for the four HLA class II alleles of interest: 28 , 29 , 30 , 31 HLA‐DQA1*05:01/DQB1*02:01 (n = 537), DRB1*04:01 (n = 610), DRB1*15:01 (n = 537) and DRB1*01:01 (n = 1,277). Each measured IC50 value was then converted to a percentile rank by calculating its rank in the allele‐specific reference set.
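The conversion of a measured IC50 to a percentile rank can be sketched as below. The exact ranking convention (ties, inclusive or exclusive counting) is not stated in the text, so this example assumes the rank is the percentage of reference IC50 values less than or equal to the query, which gives lower ranks to stronger binders. The function name and the reference values are hypothetical.

```python
from bisect import bisect_right


def ic50_percentile_rank(ic50, reference_ic50s):
    """Percentage of reference IC50 values <= the query IC50.

    Assumed convention: lower IC50 (stronger binding) gives a lower rank.
    """
    ref = sorted(reference_ic50s)
    return 100.0 * bisect_right(ref, ic50) / len(ref)


# Hypothetical allele-specific reference set of measured IC50 values (nM).
reference = [10, 50, 100, 500, 1000, 5000, 10000, 20000, 30000, 40000]
print(ic50_percentile_rank(100, reference))
```

In practice one reference set per allele would be used, as described above, so that ranks are comparable to the allele-specific predicted percentile ranks.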

RESULTS

HLA class II reported eluted ligands are, in general, predicted to bind to their restriction element

We first addressed whether ligands eluted from a specific HLA would also be predicted to bind that allele. We utilized a percentile rank threshold of 20% to classify peptides into binders and non‐binders. This threshold is rather comprehensive; however, as we wanted to assess the binding capacity of reported eluted ligands, a comprehensive threshold was suitable. In a setting where the rationale is to use binding predictions to select the best binders for experimental testing, another strategy would be recommended, such as selecting the top 10% of predicted binders.

To address the extent to which HLA class II eluted peptides are predicted by the 20%‐rank threshold, we used the Immune Epitope Database (IEDB) and downloaded all peptides reported as eluted from HLA class II‐expressing cells. We retained peptides of 15–20 amino acids in length and eluted from HLA molecules typed to the 4‐digit resolution level. We only further considered alleles for which at least 10 ligands were reported. The resulting data set included 12,449 sequences eluted from 16 different HLA class II molecules.

Next, for each of these 16 HLA class II molecules and their associated ligands, we performed predictions using the NetMHCIIpan algorithm. 18 , 19 As was the case with T‐cell epitopes, 17 significant variation in prediction performance across alleles was noted. On average, 57% ±17% SD (Table 2) of eluted peptides were predicted to bind the corresponding allele at the 20% rank threshold.

Table 2.

Ligand elution data and associated prediction efficacy

| Allele | Ligands | % recovered at 20% rank |
| --- | --- | --- |
| HLA‐DRB5*01:01 | 131 | 75% |
| HLA‐DRB3*02:02 | 10 | 80% |
| HLA‐DRB1*15:01 | 330 | 46% |
| HLA‐DRB1*13:02 | 17 | 35% |
| HLA‐DRB1*11:01 | 50 | 62% |
| HLA‐DRB1*04:05 | 1206 | 59% |
| HLA‐DRB1*04:04 | 35 | 69% |
| HLA‐DRB1*04:02 | 29 | 31% |
| HLA‐DRB1*04:01 | 463 | 52% |
| HLA‐DRB1*03:01 | 79 | 49% |
| HLA‐DRB1*01:01 | 594 | 81% |
| HLA‐DQA1*05:05/DQB1*03:01 | 3908 | 58% |
| HLA‐DQA1*05:01/DQB1*02:01 | 2055 | 47% |
| HLA‐DQA1*02:01/DQB1*02:02 | 3492 | 36% |
| HLA‐DQA1*01:02/DQB1*06:02 | 27 | 85% |
| HLA‐DQA1*01:01/DQB1*05:01 | 23 | 52% |

Thus, while the majority of eluted peptides could be effectively predicted by existing algorithms, the overall prediction efficiency was lower than what has been reported for known T‐cell epitopes. Further, the data also revealed significant inter‐allele variability and identified several alleles for which less than 60% of reported eluted ligands were predicted to bind below 20% rank (Table 2).
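The summary statistic quoted above (57% ± 17% SD) can be reproduced from the per‐allele values in Table 2. The sketch below assumes an unweighted mean across alleles and the sample standard deviation; the variable names are illustrative.

```python
from statistics import mean, stdev

# Percent of eluted ligands recovered at the 20% rank threshold (Table 2).
recovered = {
    "HLA-DRB5*01:01": 75, "HLA-DRB3*02:02": 80, "HLA-DRB1*15:01": 46,
    "HLA-DRB1*13:02": 35, "HLA-DRB1*11:01": 62, "HLA-DRB1*04:05": 59,
    "HLA-DRB1*04:04": 69, "HLA-DRB1*04:02": 31, "HLA-DRB1*04:01": 52,
    "HLA-DRB1*03:01": 49, "HLA-DRB1*01:01": 81,
    "HLA-DQA1*05:05/DQB1*03:01": 58, "HLA-DQA1*05:01/DQB1*02:01": 47,
    "HLA-DQA1*02:01/DQB1*02:02": 36, "HLA-DQA1*01:02/DQB1*06:02": 85,
    "HLA-DQA1*01:01/DQB1*05:01": 52,
}

values = list(recovered.values())
avg = mean(values)   # unweighted mean across the 16 alleles
sd = stdev(values)   # sample standard deviation (n - 1 denominator)
print(f"{avg:.0f}% +/- {sd:.0f}% SD")  # prints: 57% +/- 17% SD

# Alleles for which fewer than 60% of ligands were recovered.
below_60 = sorted(a for a, v in recovered.items() if v < 60)
print(len(below_60))
```

Because the mean is unweighted, alleles with only a handful of ligands contribute as much as those with thousands.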

Most non‐predicted ligands are predicted to bind other HLA molecules

We addressed whether the 556 DRB1*04:01, DRB1*03:01, DRB1*04:02 and DRB1*13:02 ligands that were reported in these studies could have been eluted from associated and co‐expressed DRB3/4/5 molecules, or other contaminant HLA class II. Out of these 556 reported ligands, 277 (50%) were not predicted to bind the reported DRB1 at the 20% rank. Of the 277 peptides not predicted to bind the reported DRB1, 30 (11%) were predicted to bind other representative DRB3/4/5 molecules (i.e. DRB3*01:01, DRB3*02:02, DRB4*01:01 and DRB5*01:01). We additionally found that out of the 279 peptides predicted to bind the reported DRB1, 119 peptides (43%) were also predicted to bind other representative DRB3/4/5 molecules (Table S2).

Next, in light of the observation that several papers had isolated ligands from heterozygous DR samples, we examined whether the 247 ligands that were predicted to bind neither the reported DRB1 nor other representative DRB3/4/5 molecules might be predicted to bind other common DRB1 or DQ alleles. 20 We found that 60% (148/247) of the non‐predicted DRB1‐reported eluted ligands were, in fact, predicted to bind other common DR or DQ alleles. Conducting the same analysis for the 279 peptides that were predicted to bind the reported DRB1 molecule, we found that 99% (275/279) of them were also predicted to bind other common DRB1 or DQ alleles.

Taken together, of the 277 ligands that were not predicted to bind the reported HLA at the 20% rank threshold, 64% were predicted to bind other HLA molecules, suggesting that a wrong HLA might have been assigned to these ligands. Our results additionally suggest that ligands that were predicted to bind the reported DRB1 molecule could also be promiscuous binders with the capacity to bind multiple different HLA alleles.
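The 64% figure follows from combining the two rescue steps reported above. The short tally below only restates the counts given in the text; the variable names are illustrative.

```python
# Counts taken from the text.
total_ligands = 556
not_predicted = 277        # not predicted to bind the reported DRB1
rescued_by_drb345 = 30     # predicted to bind DRB3/4/5 instead
rescued_by_other = 148     # predicted to bind other common DRB1 or DQ alleles


def pct(part, whole):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


remaining_after_drb345 = not_predicted - rescued_by_drb345   # 247
rescued_total = rescued_by_drb345 + rescued_by_other         # 178

print(pct(not_predicted, total_ligands))             # share not predicted
print(pct(rescued_by_other, remaining_after_drb345)) # share of the 247 rescued
print(pct(rescued_total, not_predicted))             # overall share rescued
```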

Addressing discrepancies between elution data and binding predictions by experimental affinity determinations

Next, we chose representative HLA class II alleles from the set described above, considering the number of reported eluted ligands, frequency of the allele in the general human population and whether an in‐house quantitative binding assay 24 was available. Specifically, we selected HLA‐DQA1*05:01/DQB1*02:01, DRB1*04:01 and DRB1*15:01 for further study, all alleles for which less than 55% of ligands were predicted to bind at the 20% rank threshold. As a control, we selected DRB1*01:01, an allele for which over 80% of ligands were predicted to bind at the 20% rank threshold.

For these four selected alleles, 3,443 eluted peptides with lengths of 15–20 amino acids were available, of which 1,611 (47%) were not predicted to bind at the 20% rank threshold (Table 1). We excluded sequences with >80% sequence similarity to reduce redundancy, resulting in a set of 1,216 peptides. From this set, we selected between 17 and 30 peptides for each HLA allele (Table 3). This resulted in a total of 97 eluted peptide sequences that were not predicted to bind at 20%‐rank. As a control, we selected between 15 and 20 eluted peptides per allele that were predicted to bind with %‐rank of 10 or better. Binding affinities for the 97 peptides predicted not to bind and 73 predicted binders were measured using standard competitive binding assays, 24 as described in the Methods section.

To compare the measured affinities to the predicted percentile ranks, we converted the measured IC50 values to percentile ranks as well. The percentile rank in this context describes the rank of a given IC50 value in a larger reference set of measured IC50 values. We obtained various data sets from the IEDB that contained binding measurements of overlapping 15mers for the four HLA class II alleles of interest 28 , 29 , 30 , 31 and combined the data to be used as the reference data set of measured IC50 values. Each measured IC50 value was then converted to a percentile rank by calculating its rank in this allele‐specific reference set.

Table S3 lists all tested peptides together with predicted and measured IC50 and %‐rank values.

Overall (Table 3), 67 of the 73 predicted binders (92%) bound the corresponding HLA molecule when tested experimentally (shown as squares in Figure 1), consistent with previous data, 17 and as summarized above. Similarly, 79 of the 97 eluted peptides (81%) predicted not to bind (Table 3) indeed did not bind when tested experimentally (shown as circles in Figure 1). Thus, these results confirmed the general efficacy of the binding predictions, as applied to the experimentally determined binding capacity of reported eluted ligands.

Table 3.

Composition of peptide set and associated binding results

| HLA allele | Predicted non‐binders selected | Predicted non‐binders that did not bind | Predicted binders selected | Predicted binders that bound |
| --- | --- | --- | --- | --- |
| DQA1*05:01/DQB1*02:01 | 30 | 22 (73%) | 20 | 18 (90%) |
| DRB1*01:01 | 20 | 17 (85%) | 15 | 14 (93%) |
| DRB1*04:01 | 30 | 28 (93%) | 20 | 17 (85%) |
| DRB1*15:01 | 17 | 12 (71%) | 18 | 18 (100%) |
| Total | 97 | 79 (81%) | 73 | 67 (92%) |

Figure 1.


Measured percentile ranks are plotted against predicted ones. Colors indicate different alleles, squares predicted binders, and circles predicted non‐binders. A red outline depicts cases with large discrepancies between measured and predicted affinities, that is, measured rank < 10% and predicted rank > 30%.

Importantly, for some alleles a relatively large fraction of eluted peptides did not, in fact, bind the reported corresponding allele when measured experimentally. In light of the complexity and limitations of eluted ligand determinations, it appears likely that at least some of these reported ligands have an incorrectly assigned HLA molecule, or are otherwise contaminants.

Correlation between predicted and measured binding is lowest for DQA1*05:01/DQB1*02:01

Comparison of the NetMHCIIpan predicted %‐ranks to the measured %‐ranks revealed good correlation overall (r = 0·8, Pearson's correlation). However, this analysis also identified some inter‐allele variability and we found that the correlation between predicted and measured %‐ranks is lowest for DQA1*05:01/DQB1*02:01 (r = 0·743, Pearson's correlation). We additionally observed several cases with substantial disagreement between predicted and measured binding. In particular, the analysis identified 8 peptides (outlined in red in Figure 1) that were not predicted to bind and had %‐ranks of >30% but measured %‐ranks were all <10%. This suggests that these peptides are likely real ligands whose binding capacity was missed by the predictions. Further inspection of the affected peptide‐HLA pairs showed that the majority of these peptides (5 of 8) were associated with DQA1*05:01/DQB1*02:01 (Table 4).

Table 4.

Peptides showing greatest discrepancy between predicted and measured affinity

| Sequence | HLA allele | Measured %‐Rank | Predicted %‐Rank |
| --- | --- | --- | --- |
| APDTRFFVPEPGGRGAAP | HLA‐DRB1*01:01 | 2.19 | 34.31 |
| WISKQEYDESGPSIVHRK | HLA‐DRB1*15:01 | 7.26 | 79.28 |
| SPTEPKNYGSYSTQA | HLA‐DRB1*15:01 | 2.61 | 86.56 |
| GEPDYVNGEVAATEA | HLA‐DQA1*05:01/DQB1*02:01 | 2.98 | 30.02 |
| GDSDLQLDRISVYYNEA | HLA‐DQA1*05:01/DQB1*02:01 | 6.7 | 31.25 |
| RQEEPEYENVVPISRPP | HLA‐DQA1*05:01/DQB1*02:01 | 3.54 | 40.81 |
| KPPTADLFTGVLPNGYNPP | HLA‐DQA1*05:01/DQB1*02:01 | 3.91 | 43.8 |
| LPGRENYSSVDANGIQ | HLA‐DQA1*05:01/DQB1*02:01 | 2.23 | 55.87 |
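The discordance criterion (measured %‐rank < 10 and predicted %‐rank > 30) and the per‐allele tally can be checked directly against the rows of Table 4. The helper name `is_discordant` and the tuple layout are illustrative choices, not part of the original analysis.

```python
from collections import Counter

DQ25 = "HLA-DQA1*05:01/DQB1*02:01"

# (sequence, allele, measured %-rank, predicted %-rank), copied from Table 4.
rows = [
    ("APDTRFFVPEPGGRGAAP", "HLA-DRB1*01:01", 2.19, 34.31),
    ("WISKQEYDESGPSIVHRK", "HLA-DRB1*15:01", 7.26, 79.28),
    ("SPTEPKNYGSYSTQA", "HLA-DRB1*15:01", 2.61, 86.56),
    ("GEPDYVNGEVAATEA", DQ25, 2.98, 30.02),
    ("GDSDLQLDRISVYYNEA", DQ25, 6.7, 31.25),
    ("RQEEPEYENVVPISRPP", DQ25, 3.54, 40.81),
    ("KPPTADLFTGVLPNGYNPP", DQ25, 3.91, 43.8),
    ("LPGRENYSSVDANGIQ", DQ25, 2.23, 55.87),
]


def is_discordant(measured, predicted, measured_max=10.0, predicted_min=30.0):
    """True when a peptide binds well experimentally but is predicted not to."""
    return measured < measured_max and predicted > predicted_min


discordant = [r for r in rows if is_discordant(r[2], r[3])]
by_allele = Counter(allele for _, allele, _, _ in discordant)
print(len(discordant), by_allele[DQ25])
```

Applied to a full set of tested peptides, the same filter would separate these outlined cases from peptides where prediction and measurement agree.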

Accordingly, we next sought to characterize the respective modes of binding of the 5 ‘discordant’ peptides to DQA1*05:01/DQB1*02:01 (hereafter DQ2.5). As controls, we also included 2 peptides eluted from DQ2.5 that were predicted to bind. We also included one peptide (aTIP 430) previously reported as a DQ2.5‐restricted T‐cell epitope 32 and predicted to bind DQ2.5, and another (DQA 16) previously identified as an endogenously bound DQ2.5 ligand, 33 but not predicted to bind DQ2.5. The DQ2.5 binding capacity of both aTIP 430 and DQA 16 was previously characterized using purified HLA 21 , 34 and confirmed in the present study. Table 5 shows the 9 peptides included in the single amino acid substitution experiments. Table 4 lists the 8 peptides with greatest discrepancy between predicted and measured affinity, where an enrichment for DQ2.5 was noted (5/8). Some peptides are listed in both tables, as they are relevant in both contexts.

Determining key residues for DQ2.5 binding within the eluted ligands

For each of the ligands in Table 5, we experimentally determined the residues crucial for conferring DQ2.5 binding capacity. We used a previously described single amino acid substitution approach, 35 scanning each peptide sequence by introducing non‐conservative substitutions at each position. Lysine (K) was selected as the basis for the analysis as it is non‐conservative with all amino acids except those with a positive charge (i.e. R, H and K). In cases where the native residue carried a positive charge, a glutamic acid (E) substitution was utilized. Each single substitution analog, and the original unmodified peptide, was tested for its capacity to bind DQ2.5. All peptides together with the measured IC50 values and ranks are detailed in Table S4.
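The substitution rule described above (lysine at every position, glutamic acid where the native residue is R, H or K) can be sketched as a small generator of analog sequences. The function name and the returned tuple layout are illustrative assumptions.

```python
POSITIVE = set("RHK")  # residues treated as positively charged in the text


def single_substitution_analogs(peptide):
    """Return (position, native residue, analog sequence) for each position.

    K replaces every residue except R, H and K, which are replaced by E.
    Positions are 1-based, following the peptide numbering used in the text.
    """
    analogs = []
    for i, native in enumerate(peptide):
        replacement = "E" if native in POSITIVE else "K"
        analog = peptide[:i] + replacement + peptide[i + 1:]
        analogs.append((i + 1, native, analog))
    return analogs


for pos, native, analog in single_substitution_analogs("RQEEPEYENVVPISRPP")[:3]:
    print(pos, native, analog)
```

Each peptide of length n therefore yields n analogs, all of which differ from the unmodified peptide at exactly one position.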

All of the original unmodified peptides bound with IC50 values <1000 nM and %‐ranks <20%. However, several of the single substitution analogs did not bind at all, or bound with considerably lower affinity when compared to the corresponding unmodified peptide. The measured binding affinity of each analog was compared to the unmodified peptide, and a relative binding capacity was calculated as the fold change in IC50 between the peptide and its analog (Figure 2). Positions associated with a fourfold or greater reduction in binding capacity (highlighted by green shading in Figure 2) were defined as potential MHC contacts. These decreases in affinity allow mapping potential MHC contact residues for each ligand. The analysis revealed that on average 5·8 residues/ligand were identified as important for DQ2.5 binding, with a range from 2 to 9 residues. No clear, simple, common pattern was readily apparent across ligands.
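The relative binding capacity calculation can be sketched as follows. The IC50 values in the example are hypothetical, and the function names are illustrative. The threshold is left as a parameter: the default follows the fourfold criterion stated in the text, and it can be raised to 10 to match the dashed line described for Figure 2.

```python
def fold_changes(unmodified_ic50, analog_ic50s):
    """Fold change in IC50 of each analog relative to the unmodified peptide.

    Values above 1 mean the substitution reduced binding.
    """
    return [ic50 / unmodified_ic50 for ic50 in analog_ic50s]


def contact_positions(unmodified_ic50, analog_ic50s, min_fold=4.0):
    """1-based positions whose substitution reduces binding >= min_fold."""
    changes = fold_changes(unmodified_ic50, analog_ic50s)
    return [pos for pos, fc in enumerate(changes, start=1) if fc >= min_fold]


# Hypothetical data: unmodified peptide at 100 nM, four analogs (nM).
analog_ic50s = [120, 400, 90, 5000]
print(contact_positions(100, analog_ic50s))               # fourfold criterion
print(contact_positions(100, analog_ic50s, min_fold=10))  # stricter criterion
```

Analogs that do not bind at all would need a large placeholder IC50 (for example the highest concentration tested) before this calculation; how such cases were handled is not specified in the text.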

Figure 2.


The fold change in IC50 between the unmodified peptide and the corresponding analog with substitutions at each position scanning the length of the respective peptide is shown. The dashed line indicates a 10‐fold reduction in binding. Residues for which substitutions resulted in a greater than 10‐fold reduction in binding were defined as anchor residues and are highlighted in green here.

DQA1*05:01/DQB1*02:01 Motif 2 peptides are not effectively predicted

Previous studies indicated that DQ2.5 may be associated with at least two different binding motifs. One motif is characterized by a preference for aromatic or hydrophobic aliphatic residues in position 1 (P1), 21 , 33 , 36 acidic residues in positions 4, 6 and/or 7 (P4, P6 and/or P7), and hydrophobic or aromatic residues at position 9 (P9). An independent study 33 also identified a preference for acidic residues in P4 and P6, and large hydrophobic residues in P9, but with no specificity in P1. Collectively, we hereby refer to these preferences as ‘Motif 1’. Other reported patterns 37 , 38 associate proline or polar residues with P1, acidic or polar residues at various other positions, and a hydrophobic or polar residue in P9 (‘Motif 2’).

We divided all DQ2.5 ligands into three groups according to whether they fit Motif 1, Motif 2 or no motif following the patterns outlined above: Motif 1 with aromatic or hydrophobic aliphatic residues (i.e. Phenylalanine, Tryptophan, Tyrosine, Leucine, Isoleucine, Valine or Methionine) in positions P1 and/or P9, and acidic residues (i.e. Aspartate or Glutamate) in positions P4, P6 and/or P7; Motif 2 with Proline or polar residues (i.e. Glutamine, Asparagine, Serine, Threonine, Lysine or Arginine) in P1, hydrophobic or polar residues (i.e. Glutamine, Asparagine, Aspartate, Glutamate, Phenylalanine, Tryptophan, Tyrosine, Leucine, Isoleucine, Valine or Methionine) in P9, acidic residues (i.e. Aspartate or Glutamate) in positions P6 and/or P7 and/or acidic or polar residues (i.e. Aspartate, Glutamate, Glutamine or Asparagine) in position P4. Using the peptides fitting Motif 1, Motif 2 or no motif, we generated three different sequence logos using the Seq2Logo tool 39 (Figure 3A).

Figure 3.

(A) All DQ2.5 ligands were divided into three groups according to whether they fit Motif 1, Motif 2, or no motif, and sequence logos were generated. (B) The nine DQ2.5‐binding peptides analysed are grouped according to the motif they conform to. Residues that were identified as anchor residues in single amino acid substitution experiments are highlighted in green. The red boxes highlight residues that explicitly match the position and specificity of Motif 1 or Motif 2. A ‘+’ in the Prediction column indicates that the peptide was predicted to bind DQ2.5 with a percentile rank of < 20%.

Upon closer inspection of the data shown in Figure 2, we were able to classify five ligands as fitting either Motif 1 or Motif 2 (Figure 3B). Specifically, three peptides appear to fit Motif 1. These include PI‐a 70, with Isoleucine at peptide position 8 (I8) as its P1 anchor, and Aspartate at position 11 (D11) and Glutamate at position 13 (E13) as P4 and P7 anchors, respectively. Motif 1 also describes the binding of the TMA 266 ligand, with Glutamate at peptide position 8 (E8) and Leucine at position 13 (L13) as the P4 and P9 anchors, and aTIP 430, with Valine at position 3 (V3), Aspartate at position 9 (D9) and Leucine at position 11 (L11) as P1, P7 and P9 anchors, respectively.

Two other peptides appear to fit Motif 2 (Figure 3B). These are the DQA 16 peptide, for which the single substitution analysis indicated the Proline in position 6 of the peptide (P6) as the P1 anchor and Phenylalanine at position 14 (F14) as the P9 anchor, and the SR 812 ligand where the Proline in position 5 (P5) is the P1 anchor and Isoleucine at position 13 (I13) is the P9 anchor.

The four remaining peptides did not fit either motif. These include the LAM 246 and Cochlin 160 peptides, which have aromatic/hydrophobic residues that fit as P1 and P9 anchors, but do not have acidic residues in the P4, P6 or P7 positions as described by Motif 1; meanwhile, aromatic residues in P1 render the LAM 246 and Cochlin 160 peptides incompatible with Motif 2. For the remaining two peptides (LAT2 21 and Tubulin 38), the single amino acid substitution scan also revealed binding modes not compatible with either Motif 1 or Motif 2.

Interestingly, when we examined the motif groups with respect to predicted binding, it was found that all three peptides fitting Motif 1 were also predicted by the NetMHCIIpan algorithm to bind DQ2.5 with %‐ranks of 20 or better (Table 5). Conversely, none of the peptides associated with Motif 2 or with no motif were predicted by the NetMHCIIpan algorithm. In conclusion, while 3/3 Motif 1‐positive peptides were predicted, 0/6 peptides that did not carry this motif were predicted (P = 0·019, Fisher's exact test).

Table 5.

Peptides selected for DQ2.5 single amino acid substitution scan to determine the residues crucial for DQ2.5 binding

| Protein | Pos | Sequence | Set | Measured %‐Rank | Predicted %‐Rank |
|---|---|---|---|---|---|
| PI‐a | 70 | NPTVFFDIAVDGEPL | Eluted | 0.2 | 0.8 |
| TMA | 266 | LAKTAFDEAIAELDT | Eluted | 0.9 | 0.5 |
| aTIP | 430 | EEVDMTPADALDDFD | Epitope | 1.5 | 1.0 |
| DQA | 16 | YQSYGPSGQYTHEFD | Eluted | 10.6 | 55.0 |
| SR | 812 | RQEEPEYENVVPISRPP | Eluted | 2.2 | 40.8 |
| LAM | 246 | KPPTADLFTGVLPNGYNPP | Eluted | 5.6 | 43.8 |
| Cochlin | 160 | LPGRENYSSVDANGIQ | Eluted | 14.7 | 55.9 |
| LAT2 | 21 | GEPDYVNGEVAATEA | Eluted | 2.1 | 30.0 |
| Tubulin | 38 | GDSDLQLDRISVYYNEA | Eluted | 6.9 | 31.3 |
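As an illustration of how the 20% rank threshold separates the peptides in Table 5, the following sketch tabulates predicted versus not predicted peptides by motif group. The predicted %‐ranks are copied from Table 5 and the motif groups from the classification described in the text. The function name and the use of a strict `<` comparison (following the Figure 3 legend) are assumptions.

```python
# (peptide, motif group as described in the text, predicted %-rank from Table 5)
TABLE5 = [
    ("PI-a 70", "Motif 1", 0.8),
    ("TMA 266", "Motif 1", 0.5),
    ("aTIP 430", "Motif 1", 1.0),
    ("DQA 16", "Motif 2", 55.0),
    ("SR 812", "Motif 2", 40.8),
    ("LAM 246", "No motif", 43.8),
    ("Cochlin 160", "No motif", 55.9),
    ("LAT2 21", "No motif", 30.0),
    ("Tubulin 38", "No motif", 31.3),
]

RANK_THRESHOLD = 20.0  # percentile-rank cut-off used to call a predicted binder


def motif1_contingency(rows, threshold=RANK_THRESHOLD):
    """Count peptides by (carries Motif 1, predicted to bind)."""
    counts = {(True, True): 0, (True, False): 0, (False, True): 0, (False, False): 0}
    for _name, motif, predicted_rank in rows:
        is_motif1 = motif == "Motif 1"
        is_predicted = predicted_rank < threshold
        counts[(is_motif1, is_predicted)] += 1
    return counts


counts = motif1_contingency(TABLE5)
print(counts[(True, True)], "of", counts[(True, True)] + counts[(True, False)],
      "Motif 1 peptides predicted")
print(counts[(False, True)], "of", counts[(False, True)] + counts[(False, False)],
      "other peptides predicted")
```

The resulting 2 × 2 counts correspond to the 3/3 versus 0/6 comparison reported in the text.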

To validate that the differential predictability of DQ2.5 ligands was associated with the presence of Motif 1, we assembled all described DQ2.5 eluted ligands and evaluated whether those predicted to bind DQ2.5 were enriched for the presence of Motif 1, compared to ligands that were not predicted to bind. Indeed, 20% (196/958) of the ligands predicted to bind did carry Motif 1. By contrast, only 5% (47/958) of ligands predicted to bind carried Motif 2 (P < 0·0001, chi‐square test). The remaining 75% of the ligands predicted to bind DQ2.5 did not conform to any known motif. The neural network architecture utilized by NetMHCIIpan theoretically has the capacity to learn and predict multiple binding motifs for an allele. Our results, however, suggest that the algorithm in practice does not efficiently predict Motif 2.

Taken together, these data highlight that DQ2.5 can bind peptides promiscuously using alternate modes. Further analysis of more peptides is necessary to fully understand what defines the way peptides bind or do not bind this molecule and to improve binding predictions.

DISCUSSION

In this study, we assessed how well MHC class II binding predictions and ligand elution data agree with each other. We found that the majority of experimentally identified MHC class II eluted ligands were predicted to bind to their reported restriction element. For HLA‐DR ligands that showed discrepancies, most of the ligands that were not predicted to bind were actually predicted to bind just below the threshold utilized, or were predicted to bind to other co‐expressed MHC class II molecules.

In contrast to the DR molecules studied, we found that for HLA‐DQA1*05:01/DQB1*02:01 (DQ2.5) binding predictions missed several peptides that were determined experimentally to be binders. By determining key residues for DQ2.5 binding in several peptides, we could trace discrepancies back to the existence of two binding motifs, only one of which was effectively predicted.

Our finding that the majority of eluted ligands were predicted to bind their restricting allele supports the continued use of these commonly used prediction tools. 40 , 41 Their use had been put into doubt by reports that the predicted binding affinity of peptides to MHC class II correlates poorly with MHC antigen presentation 12 , 13 and peptide immunogenicity. 14 , 15 We have reported previously that, in the case of HLA class II‐restricted T‐cell epitopes, binding predictions do perform well, as we found that 83·3% of epitopes are predicted to bind their corresponding restricting allele with an affinity of 1000 nM or better. 17

In the present study, we investigated how efficiently binding predictions can predict eluted ligands, and found that, on average, 57% of eluted peptides are predicted to bind their corresponding allele below the 20%‐rank threshold. These data indicate that the majority of eluted ligands can also be effectively predicted, although the percentage is lower than for HLA class II‐restricted T‐cell epitopes. It was already shown in the case of HLA class I‐restricted T‐cell epitopes that alleles can vary in terms of the affinity threshold associated with immunogenicity. 42 This may also be the case for HLA class II ligands and T‐cell epitopes, and needs to be further investigated.

When we scrutinized experimental procedures of the ligand elution studies, we found that the assigned restrictions for several MHC class II ligands are likely incorrect. This reflects experimental challenges of ligand elution assays. The first steps in a typical MHC class II ligand elution assay are lysing the antigen‐presenting cells, purifying the MHC molecules via immunoprecipitation and chromatographically eluting the MHC ligands. 43 Here, the main drawback is that natural ligandomes are multi‐allelic. Different MHC molecules are expressed in the interrogated cell and the eluted ligands can correspond to any one of those MHC molecules. This makes it difficult to unambiguously annotate the ligands to the MHC molecule they were presumed to be eluted from.

Of the 11 DR molecules we studied, 5 (DRB1*04:01, DRB1*03:01, DRB1*04:02, DRB1*13:02 and DRB1*15:01) were associated with lower prediction efficiency. We suspected that the poor performance for these DR alleles may have resulted from an imprecise HLA assignment of the eluted ligands. To further explore this issue, we scrutinized papers that reported the eluted ligands associated with those DR molecules. HLA‐DR loci encompass a monomorphic alpha chain (DRA), which pairs with beta chains expressed by two different loci: DRB1 and DRB3/4/5. These two loci are in strong linkage disequilibrium and are co‐expressed in vivo. The only exceptions to this are the DR1, DR8 and DR10 haplotypes, for which there is no associated DRB3/4/5 locus. Thus, in heterozygous individuals, up to four different DR molecules (two with DRB1‐ and two with DRB3/4/5 beta chains) may be expressed, and even in homozygous individuals, two different DR molecules may be co‐expressed. Importantly, all anti‐DR antibodies (L243, LB3.1 and others) commonly used to purify DR molecules and isolate eluted ligands are directed against the DR alpha chain, and thereby do not distinguish DRB1 from DRB3/4/5 molecules. Thus, unless HLA mono‐allelic lines are used, both DRB1 and DRB3/4/5 gene products will be co‐purified, and the pool of eluted ligands may derive from a mixture of two different HLA class II molecules. All studies pertaining to those DR alleles associated with ‘lower efficiency of binding predictions’ suffered from the aforementioned issues. This included studies using the L243 or L227 monoclonal antibodies, both anti‐DRA, with EBV transformed homozygous 44 , 45 , 46 , 47 , 48 or heterozygous 49 , 50 , 51 cell lines. Other studies 52 , 53 similarly purified DR molecules from heterozygous patient material, yielding a mixture of DRB1 and DRB3 or DRB4 eluted peptides, as well as other undefined DR molecules expressed because of patient heterozygosity.

While most authors do not mention issues pertaining to co‐expression, others acknowledge the problem. Koning et al. 54 state that ‘affinity purification… resulted in the isolation of a mixture of the HLA‐DRB1 and the HLA‐DRB4‐locus product DR53’. Similarly, Hill et al. 55 recognize that DRB1 and DRB4 molecules are tightly linked, but conclude that DRB4 is likely to represent a minor fraction of DR molecules, and is therefore ‘unlikely to influence results’.

This issue can be addressed, for example, by using genetically engineered mono‐allelic cell lines. This approach, however, is not feasible when patient samples are used, which will almost always be multi‐allelic. To overcome this, there are currently significant efforts to develop computational methods that can accurately annotate multi‐allelic ligands to their respective MHC. 56 , 57 , 58 However, such methods will still rely on the identification of shared motifs of the eluted peptides.

Once ligands are eluted from MHC molecules, they are sequenced by tandem mass spectrometry (MS/MS). A key step is the identification of peptides by matching acquired MS/MS spectra against a customized protein sequence database. This step has an inherent trade‐off between sensitivity and specificity, which leads to a certain number of false spectrum identifications. 59 , 60 It is thus expected that some ligands identified by these methodologies are not correct. Given that we explicitly focused on peptides that were not predicted to bind, we might have enriched for such peptides that fall into the known false discovery rate of peptide elution experiments.

On the other hand, specifically focusing on these outliers might have enabled us to discover patterns that would have otherwise been missed. When we compared predicted and measured affinities, we identified several peptides that were not predicted to bind their associated allele but were experimentally determined to actually bind. This, therefore, indicates that these peptides are likely real ligands, but were missed by the predictions. As 5 out of 8 of these instances were associated with HLA‐DQA1*05:01/DQB1*02:01 (DQ2.5), we performed a detailed investigation of binding modes to DQ2.5. We performed single amino acid substitutions for these 5 discordant DQ2.5 associated peptides and an additional 4 peptides that are known DQ2.5 binders. Two distinct binding motifs have been described for this allele in the literature, and 5 out of the 9 analysed peptides were determined to fit either of these two motifs, while 4 peptides did not fit any motif.

It might be that the peptides that did not fit any known motif bind DQ2.5 in an alternate mode that engages pockets in the binding groove in a way that is not readily apparent. But, as suggested by several elegant thermodynamic studies, 61 , 62 , 63 , 64 it is also possible that the primary energy of binding in these cases is driven by MHC backbone‐peptide side chain interactions. However, without additional data, this remains speculation in the present case.

Crystallography studies of DQ2.5 with bound peptides conforming to Motif 1, Motif 2 and no motif would enable the detailed characterization of the interactions in the different binding modes. Nine crystal structures of DQ2.5 are available in the Protein Data Bank (PDB). However, as DQ2.5 is mainly studied in the context of coeliac disease, 7 of these structures contain various deamidated gliadin peptides, which all conform to Motif 2. 65 , 66 , 67 The remaining 2 crystal structures contain CLIP1 and CLIP2 peptides, which do not conform to any motif. 68

Crystal structures of DQ2.5 with bound gliadin peptides or mimics of these peptides in complex with T‐cell receptors (TCRs) showed that the TCR‐peptide contacts are at positions P2, P5, P6 and P8 of the MHC binding core. 66 , 69 Neither Motif 1 nor Motif 2 identifies a preference for specific amino acids at positions P2, P5 and P8, and both have the same preference for acidic residues at positions P6 and P7. It is hence theoretically possible that peptides conforming to Motif 1 or Motif 2 could be recognized as epitopes. Again, crystal structures of DQ2.5 with bound peptides conforming to Motif 1, ideally in complex with a TCR, would help to clarify the specific interactions of the different binding motifs.

Interestingly, peptides conforming to Motif 2 or no motif were not predicted correctly, indicating that prediction performance might depend on the motif of the peptide in question. We hypothesized that this might be due to a skewed Motif 1 to Motif 2 ratio in the DQ2.5 data set used to train NetMHCIIpan. We investigated the number of peptides in the training data set conforming to each motif and found that the majority of peptides did not fit any motif (53%) and that there were fewer Motif 1 peptides than Motif 2 peptides (20% and 28%, respectively). The distribution of the different motifs in the training data set hence does not explain the prediction efficacy for the different motifs.

While the neural network architecture utilized here hypothetically has the capacity to learn that multiple binding motifs are possible for a single allele, the algorithm in practice does not seem to pick this up. We used the most recent version of NetMHCIIpan (version 4.0) to investigate whether this issue persists in the updated implementation. The NetMHCIIpan 4.0 eluted ligand likelihood prediction (Rank_EL) was trained using eluted ligand data. As that includes all peptides we analysed in this study, the predictions will be overfitted and are not helpful in this context. The NetMHCIIpan 4.0 binding affinity prediction (Rank_BA), in contrast, was trained using an extended data set of peptide‐MHC binding affinity data. Considering these updated predictions, all three tested Motif 1 peptides were predicted to be good binders with %‐ranks <1%. The predicted %‐ranks for the two Motif 2 peptides, in contrast, are much weaker: while one peptide could be considered a weak binder with a %‐rank of 17%, the second peptide with a %‐rank of 48% was clearly not predicted to bind. While NetMHCIIpan 3.1 did not predict any of the 4 peptides that did not conform to any motif as binding, NetMHCIIpan 4.0 predicted two of them as binding with %‐ranks <20 (Table S5). While this analysis confirms our results regarding Motif 1, it also indicates that the updated neural network still has issues in predicting half of the peptides that do not conform to Motif 1.

There is already an example in the literature of an allele having two alternative motifs. 70 Interestingly, this is HLA‐DRB1*03:01, which was also found in our study to be associated with lower prediction performance, with only 49% of peptides binding below 20%‐rank. It is possible that alternative binding motifs are a general problem with HLA class II molecules and that currently only peptides conforming to one of the associated motifs are predicted well. Conversely, peptides with other motifs are missed by predictions. This needs to be investigated in more detail by extending the analysis to additional alleles and peptides. The resulting data set from our study will be valuable for training machine learning methods to specifically detect cases with multiple binding motifs and improve prediction performance.

Our study utilizes experimentally eluted MHC ligand data as a set of ‘positive data points’, and examines which of these ligands are not identified with prediction methods, thus focusing on ‘false‐negative predictions’. Examining the flip‐side, namely ‘false‐positive predictions’, requires ligand elution data sets that exhaustively search for all possible MHC ligands in a set of antigens so that a set of ‘negative data points’ that with certainty are not presented by MHC can be identified. The data sets used in this study do not meet these criteria, and generating them at sufficiently large scale is challenging. For MHC class I, a comprehensive data set of eluted ligands derived from vaccinia virus infected murine cells that were further tested for T‐cell recognition has been published, revealing 83 major epitopes. 71 Benchmarking revealed that the best prediction methods were able to capture more than 50% of these epitopes in the top N = 277 predictions within the set of N = 767,788 possible peptides. 72 This means that the false‐positive rate, defined as the fraction of negatives that is incorrectly predicted to be positive, is only 0·03%. While this seems close to perfect, the precision, defined as the fraction of positive predictions that is actually positive, is only 15%. This highlights that metrics of prediction performance have to be considered in the context of the problem considered, and identifying MHC ligands and/or T‐cell epitopes in a complex antigen is a ‘needle‐in‐the‐haystack’ problem. It also has to be pointed out that the prediction methods were asked to identify ligands based on the peptide sequence alone, and other factors that can lead to lack of presentation of a ligand, such as low expression level of the source protein, were not included in this comparison. Incorporating such additional factors should reduce the false‐positive rate. Similar data sets and benchmark comparisons should also be generated for MHC class II.
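The false‐positive rate and precision quoted above can be reproduced with a short calculation. The sketch assumes that ‘more than 50%’ of the 83 epitopes corresponds to 42 epitopes captured in the top 277 predictions; this count is an assumption made for illustration, as the exact number is not stated here.

```python
total_peptides = 767_788    # all possible peptides considered
total_epitopes = 83         # major epitopes (positives)
top_predictions = 277       # peptides called positive by the predictor
captured_epitopes = 42      # assumed: smallest count that exceeds 50% of 83

true_positives = captured_epitopes
false_positives = top_predictions - true_positives
total_negatives = total_peptides - total_epitopes

# Fraction of negatives that is incorrectly predicted to be positive.
false_positive_rate = false_positives / total_negatives

# Fraction of positive predictions that is actually positive.
precision = true_positives / top_predictions

print(f"false-positive rate: {false_positive_rate * 100:.2f}%")
print(f"precision: {precision * 100:.0f}%")
```

Under this assumption the calculation gives a false‐positive rate of about 0.03% and a precision of about 15%, matching the values in the text and showing how a very low false‐positive rate can coexist with modest precision when positives are rare.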

CONFLICT OF INTEREST

All authors declare that they have no conflict of interest.

Supporting information

Table S1‐S5. Experimentally measured DQ2.5 peptides with NetMHCIIpan v4 predictions added.

ACKNOWLEDGEMENTS

Research reported in this publication was supported in part by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under Award Number R21AI134127 and 75N93019C00001, as well as the National Institute of Dental and Craniofacial Research and the National Cancer Institute under Award Number U01DE028227. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The work was also supported in part by a research agreement with Bristol‐Myers Squibb Company.

Senior author: Alessandro Sette

DATA AVAILABILITY STATEMENT

The data that support the findings of this study are openly available in the IEDB under submission ID 1000852.

REFERENCES

  • 1. Zhu J, Paul WE. CD4 T cells: fates, functions, and faults. Blood. 2008;112:1557–69.
  • 2. Sollid LM, Jabri B. Triggers and drivers of autoimmunity: lessons from coeliac disease. Nat Rev Immunol. 2013;13:294–302.
  • 3. Ott PA, Hu Z, Keskin DB, Shukla SA, Sun J, Bozym DJ, et al. An immunogenic personal neoantigen vaccine for patients with melanoma. Nature. 2017;547:217–21.
  • 4. Sahin U, Derhovanessian E, Miller M, Kloke BP, Simon P, Lower M, et al. Personalized RNA mutanome vaccines mobilize poly‐specific therapeutic immunity against cancer. Nature. 2017;547:222–6.
  • 5. Alspach E, Lussier DM, Miceli AP, Kizhvatov I, DuPage M, Luoma AM, et al. MHC‐II neoantigens shape tumour immunity and response to immunotherapy. Nature. 2019;574:696–701.
  • 6. Peters B, Nielsen M, Sette A. T cell epitope predictions. Annu Rev Immunol. 2020;38(1):123–45.
  • 7. Robinson J, Halliwell JA, McWilliam H, Lopez R, Parham P, Marsh SG. The IMGT/HLA database. Nucleic Acids Res. 2013;41:D1222–D1227.
  • 8. Rammensee HG, Friede T, Stevanovic S. MHC ligands and peptide motifs: first listing. Immunogenetics. 1995;41:178–228.
  • 9. Geluk A, van Meijgaarden KE, Southwood S, Oseroff C, Drijfhout JW, de Vries RR, et al. HLA‐DR3 molecules can bind peptides carrying two alternative specific submotifs. J Immunol. 1994;152:5742–8.
  • 10. Shao W, Pedrioli PGA, Wolski W, Scurtescu C, Schmid E, Vizcaino JA, et al. The SysteMHC atlas project. Nucleic Acids Res. 2018;46:D1237–47.
  • 11. Vaughan K, Xu X, Caron E, Peters B, Sette A. Deciphering the MHC‐associated peptidome: a review of naturally processed ligand data. Expert Rev Proteomics. 2017;14:729–36.
  • 12. Bassani‐Sternberg M, Braunlein E, Klar R, Engleitner T, Sinitcyn P, Audehm S, et al. Direct identification of clinically relevant neoepitopes presented on native human melanoma tissue by mass spectrometry. Nat Commun. 2016;7:13404.
  • 13. Gowthaman U, Agrewala JN. In silico tools for predicting peptides binding to HLA‐class II molecules: more confusion than conclusion. J Proteome Res. 2008;7:154–63.
  • 14. Backert L, Kohlbacher O. Immunoinformatics and epitope prediction in the age of genomic medicine. Genome Med. 2015;7:119.
  • 15. Chaves FA, Lee AH, Nayak JL, Richards KA, Sant AJ. The utility and limitations of current Web‐available algorithms to predict peptides recognized by CD4 T cells in response to pathogen infection. J Immunol. 2012;188:4235–48.
  • 16. Vita R, Mahajan S, Overton JA, Dhanda SK, Martini S, Cantrell JR, et al. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res. 2019;47:D339–D343.
  • 17. Paul S, Karosiene E, Dhanda SK, Jurtz V, Edwards L, Nielsen M, et al. Determination of a predictive cleavage motif for eluted major histocompatibility complex class II ligands. Front Immunol. 2018;9:1795.
  • 18. Andreatta M, Karosiene E, Rasmussen M, Stryhn A, Buus S, Nielsen M. Accurate pan‐specific prediction of peptide‐MHC class II binding affinity with improved binding core identification. Immunogenetics. 2015;67:641–50.
  • 19. Karosiene E, Rasmussen M, Blicher T, Lund O, Buus S, Nielsen M. NetMHCIIpan‐3.0, a common pan‐specific MHC class II prediction method including all three human MHC class II isotypes, HLA‐DR, HLA‐DP and HLA‐DQ. Immunogenetics. 2013;65:711–24.
  • 20. Greenbaum J, Sidney J, Chung J, Brander C, Peters B, Sette A. Functional classification of class II human leukocyte antigen (HLA) molecules reveals seven different supertypes and a surprising degree of repertoire sharing across supertypes. Immunogenetics. 2011;63:325–35.
  • 21. Sidney J, Steen A, Moore C, Ngo S, Chung J, Peters B, et al. Divergent motifs but overlapping binding repertoires of six HLA‐DQ molecules frequently expressed in the worldwide human population. J Immunol. 2010;185:4189–98.
  • 22. Paul S, Grifoni A, Peters B, Sette A. Major histocompatibility complex binding, eluted ligands, and immunogenicity: benchmark testing and predictions. Front Immunol. 2019;10:3151.
  • 23. Southwood S, Sidney J, Kondo A, del Guercio MF, Appella E, Hoffman S, et al. Several common HLA‐DR types share largely overlapping peptide binding repertoires. J Immunol. 1998;160:3363–73.
  • 24. Sidney J, Southwood S, Moore C, Oseroff C, Pinilla C, Grey HM, et al. Measurement of MHC/peptide interactions by gel filtration or monoclonal antibody capture. Curr Protoc Immunol. 2013;Chapter 18:Unit 18 3.
  • 25. Dhanda SK, Vaughan K, Schulten V, Grifoni A, Weiskopf D, Sidney J, et al. Development of a novel clustering tool for linear peptide sequences. Immunology. 2018;155:331–45.
  • 26. Cheng Y, Prusoff WH. Relationship between the inhibition constant (K1) and the concentration of inhibitor which causes 50 per cent inhibition (I50) of an enzymatic reaction. Biochem Pharmacol. 1973;22:3099–108.
  • 27. Gulukota K, Sidney J, Sette A, DeLisi C. Two complementary methods for predicting peptides binding major histocompatibility complex molecules. J Mol Biol. 1997;267:1258–67.
  • 28. Oseroff C, Sidney J, Kotturi MF, Kolla R, Alam R, Broide DH, et al. Molecular determinants of T cell epitope recognition to the common Timothy grass allergen. J Immunol. 2010;185:943–55.
  • 29. Kotturi MF, Botten J, Maybeno M, Sidney J, Glenn J, Bui HH, et al. Polyfunctional CD4+ T cell responses to a set of pathogenic arenaviruses provide broad population coverage. Immunome Res. 2010;6:4.
  • 30. Arlehamn CS, Sidney J, Henderson R, Greenbaum JA, James EA, Moutaftsi M, et al. Dissecting mechanisms of immunodominance to the common tuberculosis antigens ESAT‐6, CFP10, Rv2031c (hspX), Rv2654c (TB7.7), and Rv1038c (EsxJ). J Immunol. 2012;188:5020–31.
  • 31. Sidney J, Becart S, Zhou M, Duffy K, Lindvall M, Moore EC, et al. Citrullination only infrequently impacts peptide binding to HLA class II MHC. PLoS One. 2017;12:e0177140.
  • 32. Reichstetter S, Kwok WW, Kochik S, Koelle DM, Beaty JS, Nepom GT. MHC‐peptide ligand interactions establish a functional threshold for antigen‐specific T cell recognition. Hum Immunol. 1999;60:608–18.
  • 33. Vartdal F, Johansen BH, Friede T, Thorpe CJ, Stevanovic S, Eriksen JE, et al. The peptide binding motif of the disease associated HLA‐DQ (alpha 1* 0501, beta 1* 0201) molecule. Eur J Immunol. 1996;26:2764–72.
  • 34. Sidney J, del Guercio MF, Southwood S, Sette A. The HLA molecules DQA1*0501/B1*0201 and DQA1*0301/B1*0302 share an extensive overlap in peptide binding specificity. J Immunol. 2002;169:5098–108.
  • 35. Valli A, Sette A, Kappos L, Oseroff C, Sidney J, Miescher G, et al. Binding of myelin basic protein peptides to human histocompatibility leukocyte antigen class II molecules and their recognition by T cells from multiple sclerosis patients. J Clin Invest. 1993;91:616–28.
  • 36. Eerligh P, van Lummel M, Zaldumbide A, Moustakas AK, Duinkerken G, Bondinas G, et al. Functional consequences of HLA‐DQ8 homozygosity versus heterozygosity for islet autoimmunity in type 1 diabetes. Genes Immun. 2011;12:415–27.
  • 37. Godkin A, Friede T, Davenport M, Stevanovic S, Willis A, Jewell D, et al. Use of eluted peptide sequence data to identify the binding characteristics of peptides to the insulin‐dependent diabetes susceptibility allele HLA‐DQ8 (DQ 3.2). Int Immunol. 1997;9:905–11.
  • 38. Stepniak D, Wiesner M, de Ru AH, Moustakas AK, Drijfhout JW, Papadopoulos GK, et al. Large‐scale characterization of natural ligands explains the unique gluten‐binding properties of HLA‐DQ2. J Immunol. 2008;180:3268–78.
  • 39. Thomsen MC, Nielsen M. Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two‐sided representation of amino acid enrichment and depletion. Nucleic Acids Res. 2012;40:W281–W287.
  • 40. Nielsen M, Lund O, Buus S, Lundegaard C. MHC class II epitope predictive algorithms. Immunology. 2010;130:319–28.
  • 41. Gfeller D, Bassani‐Sternberg M, Schmidt J, Luescher IF. Current tools for predicting cancer‐specific T cell immunity. Oncoimmunology. 2016;5:e1177691.
  • 42. Paul S, Weiskopf D, Angelo MA, Sidney J, Peters B, Sette A. HLA class I alleles are associated with peptide‐binding repertoires of different size, affinity, and immunogenicity. J Immunol. 2013;191:5831–9.
  • 43. Caron E, Kowalewski DJ, Chiek Koh C, Sturm T, Schuster H, Aebersold R. Analysis of major histocompatibility complex (MHC) immunopeptidomes using mass spectrometry. Mol Cell Proteomics. 2015;14:3105–17.
  • 44. Gomez‐Tourino I, Simon‐Vazquez R, Alonso‐Lorenzo J, Arif S, Calvino‐Sampedro C, Gonzalez‐Fernandez A, et al. Characterization of the autoimmune response against the nerve tissue S100beta in patients with type 1 diabetes. Clin Exp Immunol. 2015;180:207–17.
  • 45. Hayden JB, McCormack AL, Yates JR III, Davey MP. Analysis of naturally processed peptides eluted from HLA DRB1*0402 and *0404. J Neurosci Res. 1996;45:795–802.
  • 46. Ovsyannikova IG, Johnson KL, Naylor S, Muddiman DC, Poland GA. Naturally processed measles virus peptide eluted from class II HLA‐DRB1*03 recognized by T lymphocytes from human blood. Virology. 2003;312:495–506.
  • 47. Kirschmann DA, Duffin KL, Smith CE, Welply JK, Howard SC, Schwartz BD, et al. Naturally processed peptides from rheumatoid arthritis associated and non‐associated HLA‐DR alleles. J Immunol. 1995;155:5655–62.
  • 48. Peakman M, Stevens EJ, Lohmann T, Narendran P, Dromey J, Alexander A, et al. Naturally processed and presented epitopes of the islet cell autoantigen IA‐2 eluted from HLA‐DR4. J Clin Invest. 1999;104:1449–57.
  • 49. Collado JA, Alvarez I, Ciudad MT, Espinosa G, Canals F, Pujol‐Borrell R, et al. Composition of the HLA‐DR‐associated human thymus peptidome. Eur J Immunol. 2013;43:2273–82.
  • 50. Heyder T, Kohler M, Tarasova NK, Haag S, Rutishauser D, Rivera NV, et al. Approach for identifying human leukocyte antigen (HLA)‐DR bound peptides from scarce clinical samples. Mol Cell Proteomics. 2016;15:3017–29.
  • 51. Sorvillo N, van Haren SD, Kaijen PH, ten Brinke A, Fijnheer R, Meijer AB, et al. Preferential HLA‐DRB1*11‐dependent presentation of CUB2‐derived peptides by ADAMTS13‐pulsed dendritic cells. Blood. 2013;121:3502–10.
  • 52. Fissolo N, Haag S, de Graaf KL, Drews O, Stevanovic S, Rammensee HG, et al. Naturally presented peptides on major histocompatibility complex I and II molecules eluted from central nervous system of multiple sclerosis patients. Mol Cell Proteomics. 2009;8:2090–101.
  • 53. Wahlstrom J, Dengjel J, Persson B, Duyar H, Rammensee HG, Stevanovic S, et al. Identification of HLA‐DR‐bound peptides presented by human bronchoalveolar lavage cells in sarcoidosis. J Clin Invest. 2007;117:3576–82.
  • 54. Verreck FA, Elferink D, Vermeulen CJ, Amons R, Breedveld F, de Vries RR, et al. DR4Dw4/DR53 molecules contain a peptide from the autoantigen calreticulin. Tissue Antigens. 1995;45:270–5.
  • 55. Davenport MP, Quinn CL, Chicz RM, Green BN, Willis AC, Lane WS, et al. Naturally processed peptides from two disease‐resistance‐associated HLA‐DR13 alleles show related sequence motifs and the effects of the dimorphism at position 86 of the HLA‐DR beta chain. Proc Natl Acad Sci USA. 1995;92:6567–71.
  • 56. Garde C, Ramarathinam SH, Jappe EC, Nielsen M, Kringelum JV, Trolle T, et al. Improved peptide‐MHC class II interaction prediction through integration of eluted ligand and peptide affinity data. Immunogenetics. 2019;71:445–54.
  • 57. Racle J, Michaux J, Rockinger GA, Arnaud M, Bobisse S, Chong C, et al. Robust prediction of HLA class II epitopes by deep motif deconvolution of immunopeptidomes. Nat Biotechnol. 2019;37:1283–6.
  • 58. Alvarez B, Reynisson B, Barra C, Buus S, Ternette N, Connelley T, et al. NNAlign_MA; MHC peptidome deconvolution for accurate MHC binding motif characterization and improved T‐cell epitope predictions. Mol Cell Proteomics. 2019;18:2459–77.
  • 59. Nesvizhskii AI. Proteogenomics: concepts, applications and computational strategies. Nat Methods. 2014;11:1114–25.
  • 60. Barra C, Alvarez B, Paul S, Sette A, Peters B, Andreatta M, et al. Footprints of antigen processing boost MHC class II natural ligand predictions. Genome Med. 2018;10:84. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61. Ferrante A, Gorski J. Cooperativity of hydrophobic anchor interactions: evidence for epitope selection by MHC class II as a folding process. J Immunol. 2007;178:7181–9. [DOI] [PubMed] [Google Scholar]
  • 62. Ferrante A, Templeton M, Hoffman M, Castellini MJ. The thermodynamic mechanism of peptide‐MHC class II complex formation is a determinant of susceptibility to HLA‐DM. J Immunol. 2015;195:1251–61. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63. McFarland BJ, Katz JF, Sant AJ, Beeson C. Energetics and cooperativity of the hydrogen bonding and anchor interactions that bind peptides to MHC class II protein. J Mol Biol. 2005;350:170–83. [DOI] [PubMed] [Google Scholar]
  • 64. Nelson CA, Fremont DH. Structural principles of MHC class II antigen presentation. Rev Immunogenet. 1999;1:47–59. [PubMed] [Google Scholar]
  • 65. Dahal‐Koirala S, Ciacchi L, Petersen J, Risnes LF, Neumann RS, Christophersen A, et al. Discriminative T‐cell receptor recognition of highly homologous HLA‐DQ2‐bound gluten epitopes. J Biol Chem. 2019;294:941–52. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66. Petersen J, Montserrat V, Mujico JR, Loh KL, Beringer DX, van Lummel M, et al. T‐cell receptor recognition of HLA‐DQ2‐gliadin complexes associated with celiac disease. Nat Struct Mol Biol. 2014;21:480–8. [DOI] [PubMed] [Google Scholar]
  • 67. Kim CY, Quarsten H, Bergseng E, Khosla C, Sollid LM. Structural basis for HLA‐DQ2‐mediated presentation of gluten epitopes in celiac disease. Proc Natl Acad Sci USA. 2004;101:4175–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68. Nguyen TB, Jayaraman P, Bergseng E, Madhusudhan MS, Kim CY, Sollid LM. Unraveling the structural basis for the unusually rich association of human leukocyte antigen DQ2.5 with class‐II‐associated invariant chain peptides. J Biol Chem. 2017;292:9218–28. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69. Petersen J, Ciacchi L, Tran MT, Loh KL, Kooy‐Winkelaar Y, Croft NP, et al. T cell receptor cross‐reactivity between gliadin and bacterial peptides in celiac disease. Nat Struct Mol Biol. 2020;27:49–61. [DOI] [PubMed] [Google Scholar]
  • 70. Sidney J, Oseroff C, Southwood S, Wall M, Ishioka G, Koning F, et al. DRB1*0301 molecules recognize a structural motif distinct from the one recognized by most DR beta 1 alleles. J Immunol. 1992;149:2634–40. [PubMed] [Google Scholar]
  • 71. Croft NP, Smith SA, Pickering J, Sidney J, Peters B, Faridi P, et al. Most viral peptides displayed by class I MHC on infected cells are immunogenic. Proc Natl Acad Sci USA. 2019;116:3112–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72. Paul S, Croft NP, Purcell AW, Tscharke DC, Sette A, Nielsen M, et al. Benchmarking predictions of MHC class I restricted T cell epitopes in a comprehensively studied model system. PLoS Comput Biol. 2020;16(5):e1007757. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Tables S1–S5. Experimentally measured DQ2.5 peptides with NetMHCIIpan v4 predictions added.

Data Availability Statement

The data that support the findings of this study are openly available in the IEDB under submission ID 1000852.


Articles from Immunology are provided here courtesy of British Society for Immunology
