Protein Science: A Publication of the Protein Society. 2011 Sep 6;20(11):1918–1928. doi: 10.1002/pro.732

HLA-DP2 binding prediction by molecular dynamics simulations

Irini Doytchinova 1,*, Peicho Petkov 2, Ivan Dimitrov 1, Mariyana Atanasova 1, Darren R Flower 3
PMCID: PMC3267955  PMID: 21898654

Abstract

Major histocompatibility complex (MHC) class II proteins bind peptide fragments derived from pathogen antigens and present them at the cell surface for recognition by T cells. MHC proteins are divided into Class I and Class II. Human MHC Class II alleles are grouped into three loci: HLA-DP, HLA-DQ, and HLA-DR. They are involved in many autoimmune diseases. In contrast to HLA-DR and HLA-DQ proteins, the X-ray structure of the HLA-DP2 protein has been solved quite recently. In this study, we have used structure-based molecular dynamics simulation to derive a tool for rapid and accurate virtual screening for the prediction of HLA-DP2-peptide binding. A combinatorial library of 247 peptides was built using the “single amino acid substitution” approach and docked into the HLA-DP2 binding site. The complexes were simulated for 1 ns and the short-range interaction energies (Lennard–Jones and Coulomb) were used as binding scores after normalization. The normalized values were collected into quantitative matrices (QMs) and their predictive abilities were validated on a large external test set. The validation shows that the best performing QM consisted of Lennard–Jones energies normalized over all positions for anchor residues only plus cross terms between anchor residues.

Keywords: MHC class II proteins, peptide-MHC complex, molecular dynamics, MHC binding prediction, HLA-DP2

Introduction

Bacterial antigens are endocytosed by specialized host cells, such as B cells, macrophages, and dendritic cells; proteolytically cleaved into oligopeptides in endosomes; and then bind to major histocompatibility complex (MHC) class II proteins. The peptide-MHC protein complexes are subsequently translocated to the cell surface where they are recognized by T-cell receptors (TCRs) borne by CD4+ T cells. Although many peptides are bound and presented by MHCs, only a small subset of such peptides is recognized by T cells. MHC-bound peptides that are recognized by the TCR typically initiate a response by the whole T cell: these peptides are referred to as epitopes.

MHC is generally regarded as the most polymorphic protein in higher vertebrates, with more than 6400 sequences of Class I and Class II MHC molecules listed in the April 2011 release of the IMGT/HLA database.1 There are three human Class II MHC loci: HLA-DP, HLA-DQ, and HLA-DR. At each locus there are numerous different sequence variants or alleles. These proteins have been associated with many chronic inflammatory diseases,2 including rheumatoid arthritis and Type 1 diabetes.

Many X-ray crystallographic structures are available for both HLA-DQ and HLA-DR alleles.3–7 They show that the peptide binding site is composed of two separate chains: α and β. The walls of the binding site are formed by two anti-parallel helices and its floor by an eight-stranded β-sheet. Much of the MHC polymorphism is concentrated in the residues forming the binding site. The site is open at both ends, allowing peptides of many different lengths to bind, even though only nine amino acids actually occupy the site itself.

In contrast to HLA-DR and HLA-DQ, proteins of the HLA-DP locus have not been studied extensively, as they have been viewed as less important than HLA-DR and HLA-DQ in mediating immune responses. However, it is now known that HLA-DP proteins contribute significantly to the risk of graft-vs.-host disease,8 sarcoidosis,9 juvenile chronic arthritis,10 Graves' disease,11 hard metal lung disease,12 and especially, chronic beryllium disease.13

Quite recently, the X-ray structure of HLA-DP2 (DPA*0103, DPB1*0201) in complex with a self-peptide derived from the HLA-DR α-chain has been published.14 Thus HLA-DP is an MHC class II locus, while HLA-DP2 is an MHC class II allele. Although the gross structure of DP2 is very similar to that of other MHC Class II proteins, a unique solvent-exposed acidic pocket containing three glutamic acids (Glu26β, Glu68β, and Glu69β) was revealed. This pocket may be able to bind the beryllium ion and present it to T cells, thus providing a mechanistic explanation for chronic beryllium disease.14 The X-ray data also reveal that the DP2 binding site consists of four binding pockets: deep and hydrophobic p1 and p6 pockets; a large, shallow, negatively charged pocket p4; and a deep, narrow, polar pocket p9.

The epitope is the immunological quantum: the smallest structural moiety recognized by the immune system. Identifying epitopes is crucial in the proper development of reagents, diagnostics, and vaccines, as well as being vital to the proper dissection of the immune response. However, despite continuing technical advances, the experimental determination of epitopes remains problematic, being highly resource intensive. It would thus be advantageous to pursue a purely computational approach, yet there are significant deficiencies in current binding prediction methodologies. For properly evaluated Class I MHC alleles, particularly HLA-A*0201, approaches to the in silico prediction of MHC-peptide binding are known to be effective.15,16 The prediction of all other forms of immunological epitope data, both humoral and cellular, is however seldom satisfactory. Over the last few years, several comparative studies have shown that the prediction of Class II T-cell epitopes is usually poor.17–19 There continues to be a pressing need to identify effective new approaches to MHC-peptide binding prediction, particularly for Class II; molecular simulation has long offered the prospect of being such an approach.

Atomistic simulation, as typified by molecular dynamics (MD), can provide a scientifically rigorous dynamic view of molecular behavior. Such simulation provides a solid link between microscopic inter-molecular interactions and experimentally accessible macroscopic quantities, including peptide-MHC binding affinities.20,21 MD simulation of MHC proteins has been much reported. Simulations of empty MHC proteins showed that the α-helical segments flanking the peptide binding site make large amplitude movements, including bending and kinking motions, and undergo partial unfolding.22 Wan et al.23 showed that when peptide-MHC protein complexes are being simulated, the α3 and β2m domains, often excluded for efficiency reasons, must not be neglected. Later, they extended their work using large-scale MD simulations to include a surrogate of the immune synapse, consisting of a TCR-pMHC-CD4 complex in a membrane environment.21 Rognan et al.24 studied the binding of six different peptides to MHC I HLA-B*2705 and found simulation can effectively differentiate good from poor binders. On the basis of MD calculations, Meng et al.25 proposed that the water molecules within the binding groove of HLA-A2 protein play an important role in forming the peptide complex. Omasits et al.26 simulated pMHC complexes evaluating the initial conditions, system simplification, solvation shell thickness, simulation duration, and the appropriate combination of water model and force field. They compared two peptide analogs with different lengths, a 12mer and an 18mer, and found they could explain the higher observed affinity of the 18mer by a loop-like structure formed by the additional N-terminal peptide flanking region.27 Toh et al.28 suggested subtle conformational changes in the TCR-binding sites of the DR4 molecule caused by peptide mutations in solvent-exposed positions induced TCR antagonistic activities.

MD simulation is very time-consuming even using massive parallel supercomputing, yet as a structure-based approach it is essentially independent of the availability of experimental peptide-affinity data. It is thus appropriate for the exploration of the specificity of peptide binding to new or poorly studied MHC proteins. In the present study, we have sought to apply MD simulation to derive easy-to-use quantitative matrices (QM) which can be used for the predictive identification of HLA-DP2 binding peptides in the virtual screening of new protein antigens.

A library of 247 modeled peptide-DP2 complexes was simulated using molecular dynamics in order to assess the contribution of each of the 20 naturally occurring amino acids at each of the nine core binding positions and the four flanking residues (two on each side). The normalized binding scores become coefficients within a rectangular (20 amino acids by 13 positions) quantitative matrix (QM). The predictive ability of the QM was assessed using an external test set of 457 known binders to DP2. The comparison to existing servers for DP2 binding prediction indicates that our MD-based QM (MD-QM) makes improved predictions. A more general aim of the study was to benchmark progress in the use of MD-based methodology in MHC-binding prediction, and to evaluate the role of MD-based energies as quantitative descriptors in peptide binding prediction.

Results

Energy for scoring the binding

Twenty HLA-DP2-peptide complexes were simulated for 1 ns in a water box extending 1 nm beyond the complex on each side. Two types of short-range energy were considered, Lennard–Jones (LJ-SR) and Coulomb (Coul-SR), as well as their sum. The energies were recorded, normalized, and used for prediction of the test set of known binders. Sensitivities using the top 5% cut-off are given in Figure 1. It is evident that LJ-SR predicts better (35%) than either Coul-SR (27%) or the sum of LJ-SR and Coul-SR (27%).

Figure 1.

Sensitivities of the predictions calculated at 5% threshold by Lennard–Jones (LJ), Coulomb (Coul) and sum of both energies. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Peptide position 4 (p4) is known to be involved in strong electrostatic interactions with MHC binding pocket 4.14 We assumed that for this position Coul-SR energies should be more predictive than LJ-SR energies. However, the comparative study indicated that even for this position, LJ-SR is more predictive than Coul-SR (34% vs. 26% sensitivity at the top 5% cut-off). Subsequently, we considered only LJ-SR energies as binding predictors.

Duration of MD simulations

To determine the appropriate length for subsequent MD simulations, the X-ray structure of the HLA-DP2-peptide complex was simulated for 50 ns: energies were recorded every nanosecond and coordinates every 10 ns. The resulting overlaid coordinates are shown in Figure 2. It is clear that the binding core of the peptide is stable, with time-dependent conformational variation observed only in the flanking residues at the C-terminus. The LJ-SR energy during the simulation is also quite stable, fluctuating between −508 kcal/mol and −402 kcal/mol [Fig. 3(a)]. Corresponding root-mean-square deviation (RMSD) values are in the range of 0.2 nm [Fig. 3(b)]. Thus, a simulation time of 1 ns is long enough to achieve equilibrium. All subsequent simulations were run for 1 ns.

Figure 2.

Overlapped coordinates of the complex peptide-DP2 protein written at 0th ns, 10th ns, 20th ns, 30th ns, and 50th ns of the simulation. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Figure 3.

LJ-SR energy (top) and RMSD of the peptide and protein backbones (bottom) during the MD simulation. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Size of the solvation box

To analyze the influence of the solvation box on the predictive ability of subsequent MD simulations, the 20 HLA-DP2-peptide complexes described above were simulated initially for 1 ns in two water boxes: one with sides 1 nm and one with sides 2 nm longer than the outermost solute atoms in each direction. In these extended water boxes, no protein atom is initially less than 10 Å from the exterior edge of the box. The sensitivities using the top 5% cut-off were 35% and 29% for 1 nm and 2 nm, respectively. This indicates that simulations in the 1 nm extended water box give better predictions than those in the 2 nm extended box. Subsequently, the HLA-DP2-peptide complexes were simulated in a 1 nm extended solvation box.

External validation

The predictive ability of the various MD-QMs was validated using an external test set of 457 known HLA-DP2 binders originating from 24 proteins. These peptides were collected from the Immune Epitope Database.29 Each protein was represented as a set of overlapping nonamers and the predicted binding score of each nonamer was calculated as a sum of the amino acid contributions at the nine peptide positions. In the initial validation, we used MD-QMnpp and MD-QMnap for 9mers derived as described in Datasets and Methods. The peptides originating from one protein were ranked according to their binding score and the sensitivity at different cut-offs was assessed. Sensitivity is the ratio of true predicted binders to all binders in the test set at the given cut-off. A known binder from the test set is considered a true predicted binder if the predicted nonamer sequence is part of the binder sequence. Figure 4 gives the sensitivities of predictions by the two MD-QMs at different cut-offs. MD-QMnap predicts slightly better than MD-QMnpp. It identifies 33% of the known binders at the top 5% threshold, 54% at the top 10%, 66% at the top 15%, 77% at the top 20%, and 84% at the top 25%. MD-QMnap was used in subsequent predictions.

Figure 4.

Sensitivities of the predictions calculated at five different thresholds (5, 10, 15, 20, and 25%) by the two MD-QMs. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

We sought to improve the predictive ability of MD-QMnap through the addition of cross terms. Such cross terms should help account for any nonlinearity apparent in binding; our previous experience strongly supports the assertion that adding cross terms typically improves the accuracy of prediction.30,31 Cross terms between adjacent amino acids in the peptide were added to MD-QMnap, slightly increasing the sensitivities by 2–5% (Fig. 5).

Figure 5.

Sensitivities of the predictions calculated at five different thresholds (5, 10, 15, 20, and 25%) by MD-QMnap and MD-QMnap+cross terms. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Further, we inspected the influence of the peptide anchor residues. Anchor positions are a relatively crude yet widely used simplification of the specificity of MHC-peptide interaction which identifies certain positions within a bound peptide as being essential to binding, usually by stipulating a very limited set of amino acids at such positions. The X-ray structure of the peptide-DP2 complex revealed four pockets within the binding groove positioned to accept the side chains of peptide positions 1, 4, 6, and 9.14 An MD-QMnap was constructed to include only peptide anchor positions 1, 4, 6, and 9 (Fig. 6, middle bars). No significant improvement was observed. However, when peptide position 7 was added as an anchor, sensitivities increased by 3–5% (Fig. 6, right bars). This result identifies position 7 as an important additional anchor position for DP2.

Figure 6.

Sensitivities of the predictions calculated at five different thresholds (5, 10, 15, 20, and 25%) by MD-QMnap, MD-QMnap anchors 1, 4, 6, and 9 and MD-QMnap anchors 1, 4, 6, 7, and 9. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Combining anchors and adjacent anchor cross terms further improved the predictive ability of QMnap: sensitivities reached 38% at the top 5% cut-off, 58% at the top 10% level, 71% at the top 15% level, 82% with a top 20% cut-off, and 90% at the top 25% threshold (Fig. 7).

Figure 7.

Sensitivities of the predictions calculated at five different thresholds (5, 10, 15, 20, and 25%) by MD-QMnap and MD-QMnap anchors + cross terms. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Finally, we examined the influence of the flanking residues. Four flanking residues were considered: two at both ends. The proteins from the test set were represented as sets of overlapping 13mers and the binding score of each 13mer was calculated using the corresponding MD-QMnap for 13mers. Splitting the proteins into overlapping 13mers significantly reduces the sensitivity as the number of overlapping 13mer registers originating from a single binder will decrease (Fig. 8). Two types of predictions were compared in Figure 8. The left bars give the sensitivities calculated when only the binding core of nine amino acids was considered (the central part of each 13mer) and the right bars show the sensitivities for the whole 13mers (binding core + flanking residues). It is evident that the addition of flanking residues does not improve the prediction.

Figure 8.

Sensitivities of the predictions calculated at five different thresholds (5, 10, 15, 20, and 25%) by MD-QMnap binding core and MD-QMnap binding core + flanking residues for 13mers. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Comparison to existing servers for HLA-DP2 binding prediction

To our knowledge, there are only two servers for peptide HLA-DP2 binding prediction: NetMHCII32 and IEDB.29 Using our test set of 457 known binders, we compared the performance of these two servers to that of MD-QMnap (anchors + cross terms). The 24 proteins from the test set were split into overlapping nonamers and the sensitivities at different cut-offs were recorded (see Fig. 9). For peptide-HLA-DP2 binding prediction, MD-QMnap appears to give a non-negligible improvement of around 10% over the two existing servers.

Figure 9.

Sensitivities of the predictions calculated at five different thresholds (5, 10, 15, 20, and 25%) by different servers for DP2 binding prediction. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Discussion

In this study, we have combined structure-based MD simulation with long-established QM methodology to derive an effective tool for rapid and accurate virtual screening for HLA-DP2-peptide binding prediction. Using the recently published X-ray structure of HLA-DP2, a combinatorial library of 247 peptides was built using the “single amino acid substitution (SAAS)” approach, and subsequently docked into the HLA-DP2 binding site. The complexes were simulated for 1 ns and the short-range interaction energies (Lennard–Jones and Coulomb) were used to derive normalized binding scores. The normalized values were collected into MD-based QMs and their predictive abilities were validated on a large external test set. The validation of different QMs shows that the MD-QM consisting of LJ-SR energies normalized over all positions and including only anchor residues and cross terms between anchor residues performed best.

Analysis of the best performing MD-QM indicates the preferred and non-preferred amino acids at each anchor position. The normalized values of the 20 naturally occurring amino acids at positions 1, 4, 6, 7, and 9 are given in Figure 10. Only four amino acids have positive values at p1. These are Trp, Phe, His, and Arg. The X-ray structure shows that the p1 pocket is deep and hydrophobic.14 It can accommodate all hydrophobic residues, including large aromatic amino acids, such as Phe, Trp, and Tyr. Among the non-preferred amino acids are Ala, Gly, Ser, and Thr, which are not able to fill the pocket because of their short side chains.

Figure 10.

Normalized LJ-SR values of the 20 naturally occurring amino acids at positions 1, 4, 6, 7, and 9. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

The most preferred amino acids at position 4 are Phe and Trp, followed by Leu and His. The binding pocket p4 is large, shallow, and negatively charged due to Glu26β, Glu68β, and Glu69β.14 It interacts strongly with positively charged amino acids, but hydrophobic residues are also well-accepted here.

Only two amino acids are preferred at position 6: Phe and Met. The binding pocket p6 is deep and hydrophobic like pocket p1.14 Short amino acids do not fill the pocket and are not preferred at this position.

Position 7 (p7) lies tangentially to the binding site and is considered a secondary anchor position for some MHC class II proteins.33–35 Originally, this position was not considered an anchor for DP2 binding,14 but its inclusion as an anchor in the QM improves the predictions. Hydrophobic amino acids, such as Trp, Tyr, and Phe, are well tolerated at position p7.

Binding pocket 9 (p9) accepts a wide variety of preferred amino acids, including large aliphatic, polar, or even charged residues.14 According to our MD simulations, the most preferred residues are Phe, Tyr, and Arg.

HLA-DP2 possesses a unique solvent-exposed, acidic pocket formed between the bound peptide backbone and the protein α-helix. It contains three glutamic acids: Glu26β, Glu68β, and Glu69β. Additionally, close to this acidic triad there is another glutamic acid: Glu67β. The strong negative electrostatic potential created by the four nominally negatively charged residues determines the amino acid preferences within the main part of the binding core, especially positions 4 and 6. In the present study, electrostatic energies were shown to be unreliable binding predictors, possessing a lower predictive ability than LJ-SR energies. Combination of both types of interaction using the linear interaction energy method36 did not improve the binding prediction (data not shown). Berretta et al.37 arrived at a similar conclusion after using the intermolecular Coulombic and van der Waals energies to predict binding in pockets 4 and 6 of HLA-DP2. They found the most significant contribution to the total energy came from the van der Waals interactions.

The simultaneous effect of preferred residues at anchor positions was identified by using cross terms between anchors. This improved the predictive accuracy of the model. However, the inclusion of terms accounting for flanking residues decreased model sensitivities slightly in comparison to those generated from the binding core alone.

In summary, when used as quantitative descriptors of peptide-MHC interaction, MD-based LJ-SR energies function as good predictors of binding. The MD-QMs based on these descriptors should prove to be a promising structural method for the virtual screening of peptides binding to MHC proteins. It is important to appreciate that even prediction methods that seem highly predictive seldom perform as well on novel data not included in training. All methods are, or should be, in a state of constant improvement in terms of the accuracy of future predictions. Our current paper should be seen as a staging post in the on-going development of immunoinformatic methods able to deal with the extraordinary surfeit of information within the immune arena.

The approach described here is not dependent on the extensive collation of peptide binding data, only on a structural model of the protein. As such models are indeed available for almost all MHCs via homology modelling,35,38 virtual screening should in turn prove to be a powerful and highly efficient in silico approach to the identification of potentially immunogenic epitopes suitable for inclusion in diagnostics, reagents, and poly-epitope vaccines. Our approach allows us to combine the speed of QM approaches with that most desirable quality of MD: that hundreds of experimentally determined binding measurements are not required, with all the logistic benefits that affords. Instead, MD generates such data de novo. As it is unlikely that all or even a majority of human alleles will ever be rigorously analyzed experimentally for peptide-binding specificity, MD-based methods should become of central importance for the future characterization of epitopes.

Datasets and Methods

Input data

The X-ray crystallographic structure of HLA-DP2 (DPA*0103, DPB1*0201), in complex with a self-peptide derived from the HLA-DR α-chain (pdb code: 3lqz), was used as the starting structure.14 The covalently bound peptide was separated and defined as chain C. Its backbone conformation was used as the initial template for subsequent homology modelling. The peptide consists of nine binding core positions (FHYLPFLPS) and six flanking residues (RK at the N terminus and TGGS at the C terminus). Of these, 13 positions were examined: nine binding core positions and four flanking residues (two at each terminus). A library of 247 peptides (19 amino acids × 13 positions) was built following the “single amino acid substitution (SAAS)” principle using PyMOL.39 The protonation state of ionisable side chains within the binding site was assigned to the standard state (neutral for His, positively charged for Arg and Lys, and negatively charged for Asp and Glu). No water molecules were registered in the X-ray structure. The entire peptide-MHC protein complex was considered in the MD simulations, as it has been previously demonstrated that the α3 and β2m domains cannot be omitted.23
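The size of the SAAS library follows directly from the description above, and a minimal Python sketch can enumerate it at the sequence level. The sketch is illustrative only: it assumes that the 13 examined positions are the first 13 residues of the 15-residue peptide (RK, the nine core residues, and TG), and the names `saas_library`, `PARENT`, and `EXAMINED` are hypothetical. The structural modelling of each variant, done in PyMOL in the study, is not reproduced here.

```python
# Standard one-letter codes for the 20 naturally occurring amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Parent peptide from the text: RK + FHYLPFLPS (binding core) + TGGS.
PARENT = "RKFHYLPFLPSTGGS"

# Assumption: the 13 examined positions are the first 13 residues
# (two N-terminal flanks, nine core positions, two C-terminal flanks).
EXAMINED = range(13)


def saas_library(parent=PARENT, positions=EXAMINED):
    """Return (position, new_residue, sequence) for every single substitution."""
    library = []
    for pos in positions:
        for aa in AMINO_ACIDS:
            if aa == parent[pos]:
                continue  # the parent residue is not a substitution
            variant = parent[:pos] + aa + parent[pos + 1:]
            library.append((pos, aa, variant))
    return library


library = saas_library()
print(len(library))  # 19 substitutions x 13 positions
```

Each variant differs from the parent at exactly one examined position, so the library contains no duplicate sequences.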

Molecular dynamics protocol

Simulations were undertaken using GROMACS (v.4.0.7)40 and the GROMOS96 53a6 force field.41 Complexes were centred in a solvent box with sides which were 1 nm or 2 nm longer than the outermost solute atoms in each direction. Periodic boundary conditions were applied. The box was filled with water molecules and the negative charge of the complexes was neutralized by counterions (Na+). The system's potential energy was initially minimized using a steepest descent procedure with a step size of 0.01 kcal/mol. A position-restraints simulation was run for 20 ps keeping the solute atoms in place and allowing the solvent molecules and ions to relax into their minimum potential energy positions. Finally, the system was simulated for 1 ns with annealing from 100 to 310 K and an integration step of 2 fs. The protein and the ligand were defined as separate groups and the short-range nonbonded interactions (Coulombic and Lennard–Jones) between them were accounted for up to a cut-off value of 1.4 nm. Electrostatic interactions were treated employing the particle mesh Ewald option.42 Simulations were run on the IBM Blue Gene/P of the Bulgarian Supercomputing Centre. A 1 ns simulation run required an average of 11 and 20 CPU hours for the 1 nm and 2 nm extended solvation boxes, respectively.

MD calibration

MD simulations of the peptide-MHC complexes were calibrated in terms of solvation box size, energies used for scoring the binding, and the duration of MD simulations. For this purpose, a small library of 20 HLA-DP2-peptide complexes was designed by “single amino acid substitution (SAAS)” at peptide position 1 (p1). The peptides were modeled using the X-ray structure of HLA-DP2 in complex with the peptide: RKXHYLPFLPSTGGS, where X corresponds to peptide position p1. The complexes were simulated for 1 nanosecond (ns) and the energies were recorded, normalized, and used for prediction. The predictive ability was assessed making use of a test set and the parameter sensitivity with a top 5% cut-off.

Molecular dynamics-based quantitative matrices (MD-QMs)

The LJ-SR energies derived from the MD simulations were normalized using an average calculated with a position-dependent basis (epithet: position-per-position; acronym: npp) or corrected using an average calculated over all positions (acronym: nap). Normalized energies were thus calculated using the following formula:

$$E_{\mathrm{norm}} = \frac{E_i - \bar{E}}{E_{\max} - E_{\min}}$$

where $E_i$ is the average LJ-SR energy of the i-th peptide over the simulation run, $\bar{E}$ is the average for a given position (npp) or over all positions (nap), and $E_{\max}$ and $E_{\min}$ are the maximum and minimum LJ-SR energies, respectively, for a given position (npp) or for all positions (nap).

Normalized energies were multiplied by (−1) before being entered into the quantitative matrices (MD-QMs). Thus, the positive values correspond to preferred amino acids, and the negative ones to non-preferred residues. Three MD-QMs were derived: one based on npp and two based on nap (one for 9mers and one for 13mers). These matrices are given in Supporting Information 1.
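The two normalization schemes can be illustrated with a short Python sketch. It assumes the normalized energy is the deviation from the mean divided by the range (maximum minus minimum), as the definitions above suggest, followed by the sign change. The function name `normalize`, the dictionary layout, and the toy energies are hypothetical and are not taken from the study.

```python
def normalize(energies, mode="nap"):
    """Build a sign-flipped, normalized matrix from mean LJ-SR energies.

    energies: {position: {amino_acid: mean LJ-SR energy}}
    mode="npp": mean and range are taken per position.
    mode="nap": mean and range are taken over all positions.
    Positive output values mark preferred residues.
    """
    all_values = [e for row in energies.values() for e in row.values()]
    qm = {}
    for pos, row in energies.items():
        pool = list(row.values()) if mode == "npp" else all_values
        mean = sum(pool) / len(pool)
        span = max(pool) - min(pool)  # assumed non-zero
        qm[pos] = {aa: -(e - mean) / span for aa, e in row.items()}
    return qm


# Toy energies in kcal/mol (hypothetical numbers, for illustration only).
toy = {
    1: {"F": -500.0, "A": -400.0},
    4: {"F": -480.0, "A": -460.0},
}
qm_npp = normalize(toy, "npp")
qm_nap = normalize(toy, "nap")
print(qm_npp[4]["F"], qm_nap[4]["F"])
```

In the toy data, position 4 spans only 20 kcal/mol, so npp stretches its values to the same scale as position 1, whereas nap keeps position 4 on the common scale and therefore gives it less weight.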

A cross term was defined as a product between normalized energies of two peptide positions:

$$E_{i,j} = E_i^{aa} \times E_j^{aa}$$

Cross terms account for any non-linearity of binding. They were added into MD-QMs to improve their predictive ability. Thus for a nonameric peptide, with cross terms, the total score would be given by:

$$\mathrm{score} = \sum_{i=1}^{9} E_i^{aa} + \sum_{i=1}^{8} E_i^{aa} \times E_{i+1}^{aa}$$

where the aa superscript indicates that the E value is extracted from the MD-QM on an amino acid basis, and the i subscript indicates that the E value is also extracted on a position-wise basis.
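A minimal Python sketch of this scoring scheme is given below. It assumes that cross terms are taken between consecutive entries of the chosen position list, which corresponds to adjacent residues for the full nonamer and to adjacent anchors when only anchor positions are supplied. The function `score` and the toy matrix are hypothetical illustrations, not the published MD-QM.

```python
def score(peptide, qm, positions=range(1, 10), cross=True):
    """Sum matrix entries over `positions` (1-based) plus optional cross terms.

    qm: {position: {amino_acid: normalized value}}
    Cross terms are products of the values at consecutive entries of
    `positions` (an assumption about which pairs are combined).
    """
    positions = list(positions)
    values = [qm[p][peptide[p - 1]] for p in positions]
    total = sum(values)
    if cross:
        total += sum(a * b for a, b in zip(values, values[1:]))
    return total


# Toy matrix: all zeros except three entries (hypothetical values).
toy_qm = {p: {aa: 0.0 for aa in "ACDEFGHIKLMNPQRSTVWY"} for p in range(1, 10)}
toy_qm[1]["F"] = 0.5
toy_qm[4]["L"] = 0.2
toy_qm[6]["F"] = 0.4

core = "FHYLPFLPS"
full_score = score(core, toy_qm)                               # all nine positions
anchor_score = score(core, toy_qm, positions=(1, 4, 6, 7, 9))  # anchors only
print(full_score, anchor_score)
```

With the toy values, restricting the sum to anchors makes positions 1, 4, and 6 neighbours in the list, so their products contribute to the score; over the full nonamer the intervening zero entries cancel every cross term.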

Test set

A test set of 457 peptides known to bind HLA-DP2 was collected from the Immune Epitope Database29 in November 2010. The set comprised peptides of different lengths originating from 24 foreign proteins. Each protein was represented as a set of overlapping nonamers, each shifted by one residue. For each nonamer, an overall binding score was calculated by summing the contributions made by all nine positions using the MD-QMs. Peptides originating from one protein were ranked according to their binding score and the top scoring 5, 10, 15, 20, and 25% were selected and compared to the known binders. If a nonamer sequence was present in the sequence of a known binder, the predicted peptide was considered to be a true predicted binder. The ratio of true predicted binders to all binders in the test set defined the sensitivity of prediction at the given cut-off. In the case of flanking residues, the procedure was the same but the parent proteins were represented as a set of overlapping 13mers. The test set used in this study is given in Supporting Information 2.
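The evaluation procedure can be sketched in Python as follows. This is an illustrative reading of the protocol, not the authors' code: it counts a known binder as recovered when at least one selected nonamer occurs within it, rounds the size of the top fraction down while keeping at least one nonamer (an assumption, since the rounding rule is not stated), and uses a hypothetical toy scoring function in place of an MD-QM.

```python
def nonamers(protein, k=9):
    """All overlapping k-mers of a sequence, shifted by one residue."""
    return [protein[i:i + k] for i in range(len(protein) - k + 1)]


def top_fraction(protein, score_fn, fraction):
    """Top-scoring fraction of a protein's nonamers (at least one)."""
    ranked = sorted(nonamers(protein), key=score_fn, reverse=True)
    n_top = max(1, int(len(ranked) * fraction))
    return ranked[:n_top]


def sensitivity(proteins, binders, score_fn, fraction):
    """Share of known binders that contain at least one selected nonamer.

    proteins: {protein_id: sequence}
    binders:  list of (protein_id, binder_sequence)
    """
    selected = {pid: top_fraction(seq, score_fn, fraction)
                for pid, seq in proteins.items()}
    hits = sum(any(n in binder for n in selected[pid])
               for pid, binder in binders)
    return hits / len(binders)


# Toy data (hypothetical): one protein and two "known binders".
proteins = {"P1": "GGGGGGGGGG" + "FHYLPFLPS" + "GGGGGGGGGG"}
binders = [("P1", "GGFHYLPFLPSGG"), ("P1", "GGGGGGGGG")]


def toy_score(nonamer):
    # Stand-in for an MD-QM score: matches to the parent core.
    return sum(a == b for a, b in zip(nonamer, "FHYLPFLPS"))


print(sensitivity(proteins, binders, toy_score, 0.05))
```

Ranking is done per protein, as in the text, so a long protein contributes more selected nonamers at a given cut-off than a short one.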

Supplementary material

pro0020-1918-SD1.pdf (25.6KB, pdf)
pro0020-1918-SD2.doc (169.5KB, doc)

References

  • 1. Robinson J, Mistry K, McWilliam H, Lopez R, Parham P, Marsh SGE. The IMGT/HLA database. Nucleic Acids Res. 2011;39(Suppl 1):D1171–D1176. doi: 10.1093/nar/gkq998.
  • 2. Jones EY, Fugger L, Strominger JL, Siebold C. MHC class II proteins and disease: A structural perspective. Nat Rev Immunol. 2006;6:271–282. doi: 10.1038/nri1805.
  • 3. Swain AL, Crowther R, Kammlott U. Peptide and peptide mimetic inhibitors of antigen presentation by HLA-DR4 class II MHC molecules. Design, structure-activity relationships, and X-ray crystal structures. J Med Chem. 2000;43:2135–2148. doi: 10.1021/jm000034h.
  • 4. Zavala-Ruiz Z, Sundberg EJ, Stone JD, DeOliveira DB, Chan IC, Svendsen J, Mariuzza RA, Stern LJ. Exploration of the P6/P7 region of the peptide-binding site of the human class II major histocompatibility complex protein HLA-DR1. J Biol Chem. 2003;278:44904–44912. doi: 10.1074/jbc.M307652200.
  • 5. Gunther S, Schlundt A, Sticht J, Roske Y, Heinemann U, Wiesmuller K-H, Jung G, Falk K, Rotzschke O, Freund C. Bidirectional binding of invariant chain peptides to an MHC class II molecule. Proc Natl Acad Sci USA. 2010;107:22219–22224. doi: 10.1073/pnas.1014708107.
  • 6. Lee KH, Wucherpfennig KW, Wiley DC. Structure of a human insulin peptide-HLA-DQ8 complex and susceptibility to type 1 diabetes. Nat Immunol. 2001;2:501–507. doi: 10.1038/88694.
  • 7. Sethi DK, Wucherpfennig KW. A highly tilted binding mode by a self-reactive T cell receptor results in altered engagement of peptide and MHC. J Exp Med. 2011;208:91–102. doi: 10.1084/jem.20100725.
  • 8. Petersdorf EW, Smith AG, Mickelson EM, Longton GM, Anasetti C, Choo SY, Martin PJ, Hansen JA. The role of HLA-DPB1 disparity in the development of acute graft-versus-host disease following unrelated donor marrow transplantation. Blood. 1993;81:1923–1932.
  • 9. Lympany PA, Petrek M, Southcott AM, Newman Taylor AJ, Welsh KI, du Bois RM. HLA-DPB polymorphism: Glu 69 association with sarcoidosis. Eur J Immunogenet. 1996;23:353–359. doi: 10.1111/j.1744-313x.1996.tb00008.x.
  • 10. Begovich AB, Bugawan TL, Nepom BS, Klitz W, Nepom GT, Erlich HA. A specific HLA-DPβ allele is associated with pauciarticular juvenile rheumatoid arthritis but not adult rheumatoid arthritis. Proc Natl Acad Sci USA. 1989;86:9489–9493. doi: 10.1073/pnas.86.23.9489.
  • 11. Dong RP, Kimura A, Okubo R, Shinagawa H, Tamai H, Nishimura Y, Sasazuki T. HLA-A and DPB1 loci confer susceptibility to Graves' disease. Hum Immunol. 1992;35:165–172. doi: 10.1016/0198-8859(92)90101-r.
  • 12. Potolicchio I, Mosconi G, Forni A, Nemery B, Seghizzi P, Sorrentino R. Susceptibility to hard metal lung disease is strongly associated with the presence of glutamate 69 in HLA-DP beta chain. Eur J Immunol. 1997;27:2741–2743. doi: 10.1002/eji.1830271039.
  • 13. Richeldi L, Sorrentino R, Saltini C. HLA-DPB1 glutamate 69: a genetic marker of beryllium disease. Science. 1993;262:242–244. doi: 10.1126/science.8105536.
  • 14. Dai S, Murphy GA, Crawford F, Mack DG, Falta MT, Marrack P, Kappler JW, Fontenot AP. Crystal structure of HLA-DP2 and implications for chronic beryllium disease. Proc Natl Acad Sci USA. 2010;107:7425–7430. doi: 10.1073/pnas.1001772107.
  • 15. Lin HH, Ray S, Tongchusak S, Reinherz EL, Brusic V. Evaluation of MHC class I peptide binding prediction servers: applications for vaccine research. BMC Immunol. 2008;9:8. doi: 10.1186/1471-2172-9-8.
  • 16. Peters B, Bui HH, Frankild S, Nielson M, Lundegaard C, Kostem E, Basch D, Lamberth K, Harndahl M, Fleri W, Wilson SS, Sidney J, Lund O, Buus S, Sette A. A community resource benchmarking predictions of peptide binding to MHC-I molecules. PLoS Comput Biol. 2006;2:e65. doi: 10.1371/journal.pcbi.0020065.
  • 17. Gowthaman U, Agrewala JN. In silico tools for predicting peptides binding to HLA-class II molecules: more confusion than conclusion. J Proteome Res. 2008;7:154–163. doi: 10.1021/pr070527b.
  • 18. Lin HH, Zhang GL, Tongchusak S, Reinherz EL, Brusic V. Evaluation of MHC-II peptide binding prediction servers: applications for vaccine research. BMC Bioinformatics. 2008;9:S22. doi: 10.1186/1471-2105-9-S12-S22.
  • 19. Wang P, Sidney J, Dow C, Mothé B, Sette A, Peters B. A systematic assessment of MHC class II peptide binding predictions and evaluation of a consensus approach. PLoS Comput Biol. 2008;4:e1000048. doi: 10.1371/journal.pcbi.1000048.
  • 20. Michielin O, Karplus M. Binding free energy differences in a TCR-peptide-MHC complex induced by a peptide mutation: a simulation analysis. J Mol Biol. 2002;324:547–569. doi: 10.1016/s0022-2836(02)00880-x.
  • 21. Wan S, Flower DR, Coveney P. Toward an atomistic understanding of the immune synapse: large-scale molecular dynamics simulation of a membrane-embedded TCR-pMHC-CD4 complex. Mol Immunol. 2008;45:1221–1230. doi: 10.1016/j.molimm.2007.09.022.
  • 22. Zacharias M, Springer S. Conformational flexibility of the MHC class I α1-α2 domain in peptide bound and free states: A molecular dynamics simulation study. Biophys J. 2004;87:2203–2214. doi: 10.1529/biophysj.104.044743.
  • 23. Wan S, Coveney P, Flower DR. Large-scale molecular dynamics simulations of HLA-A*0201 complexed with a tumor-specific antigenic peptide: Can the α3 and β2m domains be neglected? J Comput Chem. 2004;25:1803–1813. doi: 10.1002/jcc.20100.
  • 24. Rognan D. Molecular dynamics simulations: A tool for drug design. Persp Drug Discov Des. 1998;9/10/11:181–209.
  • 25. Meng WS, von Grafenstein H, Haworth IS. A model of water structure inside the HLA-A2 peptide binding groove. Int Immunol. 1997;9:1339–1346. doi: 10.1093/intimm/9.9.1339.
  • 26. Omasits U, Knapp B, Neumann M, Steinhauser O, Stockinger H, Kobler R, Schreiner W. Analysis of key parameters for molecular dynamics of pMHC molecules. Mol Simul. 2008;34:781–793.
  • 27. Knapp B, Omasits U, Bohle B, Maillere B, Ebner C, Schreiner W, Jahn-Schmid B. 3-Layer-based analysis of peptide-MHC interaction: In silico prediction, peptide binding affinity and T cell activation in a relevant allergen-specific model. Mol Immunol. 2009;46:1839–1844. doi: 10.1016/j.molimm.2009.01.009.
  • 28. Toh H, Kamikawaji N, Tana T, Sasazuki T, Kuhara S. Molecular dynamics simulations of HLA-DR4 (DRB1*0405) complexed with analogue peptide: conformational changes in the putative T-cell receptor binding regions. Prot Eng. 1998;11:1027–1032. doi: 10.1093/protein/11.11.1027.
  • 29. Vita R, Zarebski L, Greenbaum JA, Emami H, Hoof I, Salimi N, Damle R, Sette A, Peters B. The immune epitope database 2.0. Nucleic Acids Res. 2010;38:D854–D62. doi: 10.1093/nar/gkp1004.
  • 30. Doytchinova IA, Blythe MJ, Flower DR. Additive method for the prediction of protein–peptide binding affinity. Application to the MHC class I molecule HLA-A*0201. J Proteome Res. 2002;1:263–272. doi: 10.1021/pr015513z.
  • 31. Dimitrov I, Garnev P, Flower DR, Doytchinova I. Peptide binding to the HLA-DRB1 supertype: a proteochemometrics analysis. Eur J Med Chem. 2010;45:236–243. doi: 10.1016/j.ejmech.2009.09.049.
  • 32. Nielsen M, Lund O. NN-align. An artificial neural network-based alignment algorithm for MHC class II peptide binding prediction. BMC Bioinformatics. 2009;10:296. doi: 10.1186/1471-2105-10-296.
  • 33. Stern LJ, Brown JH, Jardetzky TS, Gorga JC, Urban RG, Strominger JL, Wiley DC. Crystal structure of the human class II MHC protein HLA-DR1 complexed with an influenza virus peptide. Nature. 1994;368:215–221. doi: 10.1038/368215a0.
  • 34. Dessen A, Lawrence CM, Cupo S, Zaller DM, Wiley DC. X-ray crystal structure of HLA-DR4 (DRA*0101, DRB*0401) complexed with a peptide from human collagen II. Immunity. 1997;7:473–481. doi: 10.1016/s1074-7613(00)80369-6.
  • 35. Doytchinova IA, Flower DR. In silico identification of supertypes for class II MHCs. J Immunol. 2005;174:7085–7095. doi: 10.4049/jimmunol.174.11.7085.
  • 36. Aqvist J, Luzhkov VB, Brandsdal BO. Ligand binding affinities from MD simulations. Acc Chem Res. 2002;35:358–365. doi: 10.1021/ar010014p.
  • 37. Berretta F, Butler RH, Diaz G, Sanarico N, Arroyo J, Fraziano M, Aichinger G, Wucherpfennig KW, Colizzi V, Saltini C, Amicosante M. Detailed analysis of the effects of Glu/Lys β69 human leukocyte antigen-DP polymorphism on peptide-binding specificity. Tissue Antigens. 2003;62:459–471. doi: 10.1046/j.1399-0039.2003.00131.x.
  • 38. Doytchinova IA, Guan P, Flower DR. Identifying human MHC supertypes using bioinformatic methods. J Immunol. 2004;172:4314–4323. doi: 10.4049/jimmunol.172.7.4314.
  • 39. PyMOL software. 2011. Available at: http://www.pymol.org.
  • 40. Hess B, Kutzner C, van der Spoel D, Lindahl E. GROMACS 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation. J Chem Theory Comput. 2008;4:435–447. doi: 10.1021/ct700301q.
  • 41. Oostenbrink C, Villa A, Mark AE, van Gunsteren WF. A biomolecular force field based on the free enthalpy of hydration and solvation: the GROMOS force-field parameter sets 53A5 and 53A6. J Comput Chem. 2004;25:1656–1676. doi: 10.1002/jcc.20090.
  • 42. Darden T, York D, Pedersen L. Particle mesh Ewald sums in large systems. J Chem Phys. 1993;98:10089–10092.

Articles from Protein Science : A Publication of the Protein Society are provided here courtesy of The Protein Society
