Journal of Immunology Research. 2014 Jan 12;2014:768515. doi: 10.1155/2014/768515

Model for Vaccine Design by Prediction of B-Epitopes of IEDB Given Perturbations in Peptide Sequence, In Vivo Process, Experimental Techniques, and Source or Host Organisms

Humberto González-Díaz 1,2,*, Lázaro G Pérez-Montoto 3, Florencio M Ubeira 3
PMCID: PMC3987976  PMID: 24741624

Abstract

Perturbation methods add variation terms to a known experimental solution of one problem to approach a solution for a related problem that has no known exact solution. One problem of this type in immunology is predicting whether a peptide will act as an epitope after a perturbation or variation in the structure of a known peptide and/or in other boundary conditions (host organism, biological process, and experimental assay). However, to the best of our knowledge, there are no reports of general-purpose perturbation models to solve this problem. In a recent work, we introduced a new quantitative structure-property relationship theory for the study of perturbations in complex biomolecular systems. In this work, we developed the first model able to classify more than 200,000 cases of perturbations with accuracy, sensitivity, and specificity >90% in both training and validation series. The perturbations include structural changes in >50,000 peptides determined in experimental assays with boundary conditions involving >500 source organisms, >50 host organisms, >10 biological processes, and >30 experimental techniques. The model may be useful for the prediction of new epitopes or the optimization of known peptides towards computational vaccine design.

1. Introduction

The National Institute of Allergy and Infectious Diseases (NIAID) supported the launch, in 2004, of the Immune Epitope Database (IEDB), http://www.iedb.org/ [1–4]. The IEDB system has extracted information from approximately 99% of all papers published to date that describe immune epitopes. In doing so, the IEDB system analysed over 22 million PubMed abstracts and subsequently curated 13 K references, including 7 K manuscripts about infectious diseases, 1 K about allergy topics, 4 K about autoimmunity, and 1 K about transplant/alloantigen topics [5]. IEDB lists a huge amount of information about the molecular structure as well as the experimental conditions ($c_{ij}$) in which different ith molecules were determined to be immune epitopes or not. This explosion of information makes necessary both query/display functions for retrieval of known data from IEDB and predictive tools for new epitopes. Salimi et al. [5] reviewed advances in epitope analysis and predictive tools available in the IEDB. In fact, the IEDB analysis resource (IEDB-AR: http://tools.iedb.org/) is a collection of tools for prediction of molecular targets of T- and B-cell immune responses (i.e., epitopes) [6, 7].

On the other hand, Quantitative Structure-Activity/Property Relationships (QSAR/QSPR) techniques are useful tools to predict new drugs, RNA, drug-protein complexes, and protein-protein complexes. In general, QSAR/QSPR-like methods transform molecular structures into numeric molecular descriptors ($\lambda_i$) in a first stage and later fit a model to predict the biological process. For example, DRAGON [8–10], CODESSA [11, 12], MOE [13], TOPS-MODE [14–17], TOMOCOMD [18, 19], and MARCH-INSIDE [20] are among the most used software packages to calculate molecular descriptors based on quantum mechanics (QM) and/or graph theory [21–27]. The software packages STATISTICA [28] and WEKA [29] are often used to perform multivariate statistics and/or machine learning (ML) analysis in order to preprocess data and later fit the final QSAR/QSPR model using techniques like principal component analysis (PCA), linear discriminant analysis (LDA), support vector machine (SVM), or artificial neural networks (ANN) [28].

QSAR/QSPR models are also important in immunoinformatics to predict the propensity of different molecular structures to play different roles in immunological processes. They include skin vaccine adjuvants and sensitizers [30–38], drugs and their activity/toxicity protein targets in the immune system [39], and epitopes [40–49]. Moreover, Reche and Reinherz [50] implemented PEPVAC (promiscuous epitope-based vaccine), a web server for the formulation of multiepitope vaccines that predicts peptides binding to five distinct HLA class I supertypes (A2, A3, B7, A24, and B15). PEPVAC can also identify conserved MHC ligands, as well as those with a C-terminus resulting from proteasomal cleavage. The Dana-Farber Cancer Institute hosted the PEPVAC server at the site http://immunax.dfci.harvard.edu/PEPVAC/. To close with a last example, Lafuente and Reche [51] reviewed the available methods for predicting MHC-peptide binding and discussed their most relevant advantages and drawbacks.

In many complex QSPR-like problems in immunoinformatics, as in other areas, we know the exact experimental result (known solution) of the problem, but we are interested in the possible result obtained after a change (perturbation) in one or multiple values of the initial conditions of the experiment (new solution). For instance, we often know, for large collections of ith molecules ($m_i$) such as organic compounds, drugs, xenobiotics, and/or peptide sequences, the efficiency of the compound $\varepsilon(c_{ij})$ as adjuvant, its action as epitope, its immunotoxicity, and/or its interaction (affinity, inhibition, etc.) with immunological targets. In addition, we often know for each molecule the exact conditions ($c_{ij}$) of assay for the initial experiment, including the structure of the molecule $m_i$ (drug, adjuvant, or peptide sequence), source organism (so), host organism (ho), immunological process (ip), experimental technique (tq), concentration, temperature, time, solvents, and coadjuvants. This is the case of big data retrieved from very large databases like IEDB [1–4] and CHEMBL [52]. However, we do not know the possible result of the experiment if we change at least one of these conditions (perturbation). We refer to small changes or perturbations in both structure and conditions, for input or output variables. This includes changes in ho, so, ip, and tq; replacement of the compound by an analogue with similar structure; changes in the sequence of the epitope (artificial, by organic synthesis, or natural mutations); and changes in the polarity of the solvent or coadjuvants. In these cases, we could use a perturbation theory model to solve the QSAR/QSPR problem. Perturbation theory includes methods that add “small” terms to a known solution of a problem in order to approach a solution to a related problem without known solution.
Perturbation models have been widely used in all branches of science, from QM to astronomy and life sciences, including chaos or the “butterfly effect,” Bohr's atomic theory, Heisenberg's mechanics, Zeeman's and Stark's effects, and other models with applications such as protein spectroscopy [53–57]. In a very recent work, Gonzalez-Diaz et al. [58] formulated a general-purpose perturbation theory or model for multiple-boundary QSPR/QSAR problems. However, there is no report in the immunoinformatics literature of a general QSPR perturbation model for IEDB B-epitopes. Here we report the first example of a QSPR-perturbation model for B-epitopes reported in IEDB, able to predict the probability of occurrence of an epitope after a perturbation in the sequence, the experimental technique, the exposure process, and/or the source or host organisms.

2. Materials and Methods

2.1. Molecular Descriptors for Peptides

We calculated the molecular descriptors of the structure of peptides using the software MARCH-INSIDE (MI), based on the algorithm of the same name [59]. The MI approach uses a Markov chain method to calculate the kth mean values of different physicochemical molecular properties ${}^{k}\lambda(m_i)$ for the ith molecule ($m_i$). The overall $\lambda(m_i)$ values are calculated as an average of the ${}^{k}\lambda(m_i)$ values for all atoms placed at topological distance $d_k$, which are in turn the means of atomic properties ($\lambda_j$) for all atoms in the molecule and their neighbors placed at $d = k$. For instance, it is possible to derive average estimations of molecular refractivities ${}^{k}MR(m_i)$, partition coefficients ${}^{k}P(m_i)$, and hardness ${}^{k}\eta(m_i)$ for atoms placed at different topological distances $d_k$. In this first work, we calculated only one type of $\lambda(m_i)$ values. We calculated for all peptides the average value $\chi(m_i)$ of all the atomic electronegativities $\chi_j$ for all $\delta_i$ atoms connected to the ith atom ($i \to j$) and their neighbors placed at a distance $d \le 5$ [59]:

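$$\chi(m_i) = \frac{1}{6}\sum_{k=0}^{5} {}^{k}\chi_j = \frac{1}{6}\sum_{k=0}^{5}\sum_{i \to j}^{\delta_i} {}^{k}p(\chi_j)\cdot\chi_j. \tag{1}$$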

We calculate the probabilities ${}^{k}p(\lambda_j)$ for any atomic property, including ${}^{k}p(\chi_j)$, using a Markov chain model for the gradual effects of the neighboring atoms at different distances in the molecular backbone. This method has been explained in detail in many previous works, so we omit the details here [59].
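
The averaging in equation (1) can be illustrated with a small sketch. The source does not give the transition probabilities used by MARCH-INSIDE, so the code below rests on assumptions: the initial probabilities are taken as proportional to atomic electronegativity, and each one-step transition from atom i to a bonded atom j is weighted by $\chi_j$. The function name, the toy three-atom fragment, and the use of Pauling electronegativities are illustrative choices and not the published implementation.

```python
def mean_electronegativity(chi, bonds, max_k=5):
    """Average of the k-th order mean electronegativities for k = 0..max_k.

    chi   : list of atomic electronegativities
    bonds : list of (i, j) index pairs for bonded atoms
    ASSUMPTION: transition weights and initial probabilities are
    proportional to electronegativity; the source does not specify them.
    """
    n = len(chi)
    neighbors = [[] for _ in range(n)]
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    if any(not nb for nb in neighbors):
        raise ValueError("every atom needs at least one bonded neighbor")

    # Row-stochastic one-step matrix: p(i -> j) = chi_j / sum of chi over neighbors of i
    trans = [[0.0] * n for _ in range(n)]
    for i in range(n):
        total = sum(chi[j] for j in neighbors[i])
        for j in neighbors[i]:
            trans[i][j] += chi[j] / total

    # Assumed initial (k = 0) probabilities
    total_chi = sum(chi)
    p = [c / total_chi for c in chi]

    k_means = []
    for _ in range(max_k + 1):
        # k-th order mean: sum over atoms of kp(chi_j) * chi_j
        k_means.append(sum(pj * cj for pj, cj in zip(p, chi)))
        # Advance the probability vector one step along the chain
        p = [sum(p[i] * trans[i][j] for i in range(n)) for j in range(n)]

    # Equation (1): average over the max_k + 1 orders (1/6 when max_k = 5)
    return sum(k_means) / len(k_means)


# Toy C-N-O chain with Pauling electronegativities (illustrative only)
toy_value = mean_electronegativity([2.55, 3.04, 3.44], [(0, 1), (1, 2)])
```

Because every order is a probability-weighted mean, the result always lies between the smallest and largest atomic electronegativity in the fragment.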

2.2. Electronegativity Perturbation Model for Prediction of B-Epitopes

Very recently, Gonzalez-Diaz et al. [58] formulated a general-purpose perturbation theory or model for multiple-boundary QSPR/QSAR problems. Here we adapted this modeling method to approach the peptide prediction problem from the point of view of perturbation theory. Let there be a set of ith peptide molecules, denoted $m_i$, with a value of efficiency $\varepsilon_{ij}$ as epitopes experimentally determined under a set of boundary conditions $c_j \equiv (c_0, c_1, c_2, c_3, \ldots, c_n)$. We put the main emphasis here on peptides reported in the IEDB database. Accordingly, the boundary conditions $c_j$ used here are those reported in this database: $c_0$ is the specific peptide, $c_1 = so_j$, $c_2 = ho_j$, $c_3 = ip_j$, and $c_4 = tq_j$. In general, so is the organism that expresses the peptide (but it can also include artificial peptides, cellular lines, etc.), and ho is the host organism exposed to the peptide by means of the process ip, detected with the technique tq. Because our analysis is based on the data reported by IEDB, we are unable to work with continuous values of epitope activity $\varepsilon_{ij}$. Consequently, we have to predict the discrete function of B-epitope efficiency $\lambda(\varepsilon_{ij}) = 1$ for epitopes reported in the conditions $c_j$ and $\lambda(\varepsilon_{ij}) = 0$ otherwise. Our main aim is to predict the shift or change in a function of the output efficiency $\Delta\lambda(\varepsilon_{ij}) = \lambda(\varepsilon_{ij})_{\text{ref}} - \lambda(\varepsilon_{ij})_{\text{new}}$ that takes place after a change, variation, or perturbation ($\Delta V$) in the structure and/or boundary conditions of a peptide of reference. However, we already know the efficiency of the process of reference $\lambda(\varepsilon_{ij})_{\text{ref}}$, in addition to the molecular structure and the set of conditions $c_j$ for the initial (reference) and final (new) processes. Consequently, to predict $\Delta\lambda(\varepsilon_{ij})$ we only have to predict $\lambda(\varepsilon_{ij})_{\text{new}}$, the efficiency function of the new state obtained by a change in the structure of the peptide and/or the boundary conditions. Let $\Delta V$ be a perturbation in a function $\lambda$; we can define $V_{ij}$ as the state information function for the reference and new states.
According to our recent model [58], we can write $V_{ij}$ as a function of the conditions and structure of the peptide $m_i$. The variational state functions $V_{ij}$ have to be written in pairs in order to describe the initial (reference) and final (new) states of a perturbation, as follows:

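$$V_{ij} = \lambda(\varepsilon_{ij})_{\text{new}} - \sum_{j=1}^{4}\left(\lambda(m_i) - \lambda(c_{ij})_{\text{avg}}\right), \qquad V_{qr} = \lambda(\varepsilon_{qr})_{\text{ref}} - \sum_{r=1}^{4}\left(\lambda(m_q) - \lambda(c_{qr})_{\text{avg}}\right). \tag{2}$$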

The state function ${}^{n}V_{ij} \equiv V_{ij}$ is for the ith peptide measured under a set of $c_{ij}$ boundary conditions in the output, final, or new state. The conjugated state function ${}^{r}V_{qr} \equiv V_{qr}$ is for the qth peptide measured under a set of $c_{qr}$ boundary conditions in the input, initial, or reference state. The difference $\Delta V$ between the new (output) state and the reference (input) state is the additive perturbation [58]:

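$$\Delta V = V_{ij} - V_{qr} = \left[\lambda(\varepsilon_{ij})_{\text{new}} - \sum_{j=1}^{4}\left(\lambda(m_i) - \lambda(c_{ij})_{\text{avg}}\right)\right] - \left[\lambda(\varepsilon_{qr})_{\text{ref}} - \sum_{r=1}^{4}\left(\lambda(m_q) - \lambda(c_{qr})_{\text{avg}}\right)\right]. \tag{3}$$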

Equation (3) opens the door to testing different hypotheses. A simple hypothesis is H0: the existence of one small and constant value of the perturbation function $\Delta V = e_0$ for all the pairs of peptides, and a linear relationship between perturbations of input/output boundary conditions with coefficients $a_{ij}$, $b_{ij}$, $c_{qr}$, and $d_{qr}$:

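$$e_0 = \Delta V = \left[a_{ij}\cdot\lambda(\varepsilon_{ij})_{\text{new}} - \sum_{j=1}^{4} b_{ij}\cdot\left(\lambda(m_i) - \lambda(c_j)_{\text{avg}}\right)\right] - \left[c_{qr}\cdot\lambda(\varepsilon_{qr})_{\text{ref}} - \sum_{r=1}^{4} d_{qr}\cdot\left(\lambda(m_q) - \lambda(c_r)_{\text{avg}}\right)\right]. \tag{4}$$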

We can use elementary algebraic operations to obtain from these equations an expression for the efficiency as epitope of the new peptide, $\lambda(\varepsilon_{ij})_{\text{new}}$. In this case, considering $b_{ij} \approx d_{qr}$, we can obtain the following expressions; the last may be very useful to solve the QSPR problem for the large datasets formed by IEDB B-epitopes:

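$$
\begin{aligned}
\lambda(\varepsilon_{ij})_{\text{new}} &= \left(\frac{c_{qr}}{a_{ij}}\right)\cdot\lambda(\varepsilon_{qr})_{\text{ref}} + \left[\sum_{j=1}^{4}\left(\frac{b_{ij}}{a_{ij}}\right)\cdot\left(\lambda(m_i) - \lambda(c_j)_{\text{avg}}\right)_{\text{new}}\right] - \left[\sum_{r=1}^{4}\left(\frac{d_{qr}}{a_{ij}}\right)\cdot\left(\lambda(m_q) - \lambda(c_r)_{\text{avg}}\right)_{\text{ref}}\right] + \left(\frac{e_0}{a_{ij}}\right),\\
\lambda(\varepsilon_{ij})_{\text{new}} &= c_0\cdot\lambda(\varepsilon_{qr})_{\text{ref}} + \sum_{j=1}^{4} d_{ij}\cdot\Delta\left(\lambda(m_i) - \lambda(c_j)_{\text{avg}}\right) + e_0,\\
\lambda(\varepsilon_{ij})_{\text{new}} &= c_0\cdot\lambda(\varepsilon_{qr})_{\text{ref}} + \sum_{j=1}^{4} d_{ij}\cdot\Delta\Delta\lambda_{ijqr} + e_0,\\
\lambda(\varepsilon_{ij})_{\text{new}} &= c_0\cdot\lambda(\varepsilon_{qr})_{\text{ref}} + \sum_{j=1}^{4} d_{ij}\cdot\Delta\Delta\chi_{ijqr} + e_0.
\end{aligned}
\tag{5}
$$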
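
The rearrangement from the linear hypothesis H0 to the first expression for the new-state efficiency can be checked numerically. The sketch below is only a sanity check under simplifying assumptions: each coefficient (a, b, c, d) is a single scalar shared by all four boundary conditions, and all numeric inputs are arbitrary hypothetical values, not fitted parameters.

```python
def e0_from_states(a, b, c, d, lam_new, lam_ref, terms_new, terms_ref):
    """Constant perturbation e0 = V_new - V_ref under hypothesis H0.

    terms_new / terms_ref hold the four differences
    (lambda(m) - lambda(c)_avg) for the new and reference states.
    """
    v_new = a * lam_new - sum(b * t for t in terms_new)
    v_ref = c * lam_ref - sum(d * t for t in terms_ref)
    return v_new - v_ref


def lam_new_from_e0(a, b, c, d, e0, lam_ref, terms_new, terms_ref):
    """Solve the H0 relation for the efficiency of the new state."""
    return ((c / a) * lam_ref
            + sum((b / a) * t for t in terms_new)
            - sum((d / a) * t for t in terms_ref)
            + e0 / a)


# Hypothetical coefficients and inputs, chosen only to exercise the algebra
a, b, c, d = 2.0, 0.5, 1.5, 0.5
terms_new = [0.010, -0.004, 0.002, 0.007]
terms_ref = [0.003, 0.001, -0.002, 0.005]

e0 = e0_from_states(a, b, c, d, 1.0, 1.0, terms_new, terms_ref)
recovered = lam_new_from_e0(a, b, c, d, e0, 1.0, terms_new, terms_ref)
```

Recovering the starting value of the new-state efficiency confirms that the rearranged expression is algebraically equivalent to the hypothesis it was derived from.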

3. Results and Discussion

We propose herein, for the first time, a QSPR-perturbation model able to predict variations in the propensity of a peptide to act as B-epitope, taking into consideration the propensity of a peptide of reference and the changes in peptide sequence, immunological process, host organism, source organism, and the experimental technique used. The best QSPR-perturbation model found here with LDA was

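$$
\begin{aligned}
\lambda(\varepsilon_{ij})_{\text{new}} &= 4.979\cdot\lambda(\varepsilon_{ij})_{\text{ref}} - 221.642\cdot\Delta\chi_{\text{seq}} + 8.770\cdot\Delta\Delta\chi_{\text{ho}} + 63.572\cdot\Delta\Delta\chi_{\text{so}} - 55.387\cdot\Delta\Delta\chi_{\text{ip}} + 201.919\cdot\Delta\Delta\chi_{\text{tq}} - 2.149,\\
N &= 155169, \quad R_c = 0.92, \quad U = 0.15, \quad p < 0.01.
\end{aligned}
\tag{6}
$$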

The first input term, $\lambda(\varepsilon_{ij})_{\text{ref}}$, is the scoring function $\lambda$ of the efficiency of the initial process $\varepsilon_{ij}$ (known solution). The function $\lambda(\varepsilon_{ij})_{\text{ref}} = 1$ if the ith peptide was experimentally demonstrated to be a B-epitope in the assay of reference carried out under the conditions $c_j$, and $\lambda(\varepsilon_{ij})_{\text{ref}} = 0$ otherwise. The variational-perturbation terms $\Delta\Delta\chi_{cj}$ are at the same time terms typical of perturbation theory and moving average (MA) functions used in Box-Jenkins models of time series [60]. These new types of terms account for the deviation of the electronegativity of all amino acids in the sequence of the new peptide both with respect to the peptide of reference and with respect to all boundary conditions. In Table 1, we give the overall classification results obtained with this model. Speck-Planche et al. [61–63] introduced different multitarget/multiplexing QSAR models that incorporate this type of information based on MAs. The present model classifies more than 95% of cases correctly in both training and cross-validation series (Table 1), which compares well with similar models in the literature used for other problems, including moving average models [64, 65] and perturbation models [58]. Notably, this is also the first model combining both perturbation theory and MAs in a QSPR context.
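
As a worked illustration, equation (6) can be evaluated directly. The sketch below is not the authors' software: the function names are hypothetical, the coefficient signs are taken as read from equation (6) (an assumption where the printed operators are unclear), and the decision threshold of 0 on the LDA score is also an assumption. The five perturbation inputs are those listed for the first row of Table 3; setting the reference efficiency to 1 for that row is assumed for illustration.

```python
# Coefficients of equation (6). Signs are read from the printed equation
# and should be treated as an assumption where the source is ambiguous.
COEF_LAM_REF = 4.979
COEF_D_SEQ = -221.642
COEF_DD_HO = 8.770
COEF_DD_SO = 63.572
COEF_DD_IP = -55.387
COEF_DD_TQ = 201.919
INTERCEPT = -2.149


def lda_score(lam_ref, d_seq, dd_ho, dd_so, dd_ip, dd_tq):
    """Linear score of equation (6) for one perturbation case."""
    return (COEF_LAM_REF * lam_ref
            + COEF_D_SEQ * d_seq
            + COEF_DD_HO * dd_ho
            + COEF_DD_SO * dd_so
            + COEF_DD_IP * dd_ip
            + COEF_DD_TQ * dd_tq
            + INTERCEPT)


def predict_epitope(score, threshold=0.0):
    """Map the score to the discrete class; the threshold is an assumption."""
    return 1 if score > threshold else 0


# Inputs from the first row of Table 3, with lam_ref = 1 assumed
example_score = lda_score(1, 0.01, 0.012, 0.004, 0.017, 0.012)
example_class = predict_epitope(example_score)
```

With these inputs the score is about 2.45, so under the assumed threshold the case would be assigned to the epitope class.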

Table 1.

Results of QSPR-perturbation model for IEDB B-epitopes.

| Data subset | Observed class | Stat. param. | Pred. % | Predicted λ(ε ij) = 1 | Predicted λ(ε ij) = 0 |
|---|---|---|---|---|---|
| Train | λ(ε ij) = 1 | Sp | 97.0 | 84607 | 2660 |
| Train | λ(ε ij) = 0 | Sn | 93.6 | 4354 | 63548 |
| Train | Total | Ac | 95.5 | | |
| cv | λ(ε ij) = 1 | Sp | 97.1 | 28060 | 840 |
| cv | λ(ε ij) = 0 | Sn | 93.3 | 1485 | 20641 |
| cv | Total | Ac | 95.4 | | |

The diagonal entries (84607 and 63548 in training; 28060 and 20641 in cross-validation) are the cases correctly classified by the model.
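
The percentages in Table 1 follow from the case counts. A minimal sketch of that arithmetic is given below; the function and variable names are illustrative, and the per-class rates are labelled by observed class rather than by Sn/Sp so as not to presume which convention the table uses.

```python
def classification_rates(n11, n10, n01, n00):
    """Per-class and overall percentages from a 2x2 table of counts.

    n11: observed 1, predicted 1    n10: observed 1, predicted 0
    n01: observed 0, predicted 1    n00: observed 0, predicted 0
    """
    rate_class_1 = 100.0 * n11 / (n11 + n10)
    rate_class_0 = 100.0 * n00 / (n01 + n00)
    accuracy = 100.0 * (n11 + n00) / (n11 + n10 + n01 + n00)
    return rate_class_1, rate_class_0, accuracy


# Counts from Table 1
train = classification_rates(84607, 2660, 4354, 63548)
cv = classification_rates(28060, 840, 1485, 20641)

train_rounded = tuple(round(x, 1) for x in train)  # (97.0, 93.6, 95.5)
cv_rounded = tuple(round(x, 1) for x in cv)        # (97.1, 93.3, 95.4)
```

The training counts also sum to 155169, the number of cases N reported with equation (6).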

The other input terms are the following. The first, $\Delta\chi_{\text{seq}} = \chi(m_q)_{\text{ref}} - \chi(m_i)_{\text{new}}$, is the perturbation term for the variation in the mean value of electronegativity over all amino acids in the sequence with respect to the peptide of reference. The remaining input variables of the model, $\Delta\Delta\chi_{cj} = \Delta\chi_{cj\text{-ref}} - \Delta\chi_{cj\text{-new}} = [\chi(m_q)_{\text{ref}} - {}^{*}\chi(c_{qr})_{\text{ref}}] - [\chi(m_i)_{\text{new}} - {}^{*}\chi(c_{ij})_{\text{new}}]$, quantify values of the conditions of the new assay $c_{j\text{-new}}$ that represent perturbations with respect to the initial conditions $c_{ij\text{-ref}}$ of the assay of reference. The quantities ${}^{*}\chi(c_{ij})$ and ${}^{*}\chi(c_{qr})$ are the average values of the mean electronegativity values $\chi(m_i)$ and $\chi(m_q)$ for all new and reference peptides in IEDB that are epitopes under the jth or rth boundary condition. The values of these terms have been tabulated for >500 source organisms, >50 host organisms, >10 biological processes, and >30 experimental techniques. We must substitute the values of $\chi(m_i)$ and $\chi(m_q)$ of the new and reference peptides and the tabulated values of ${}^{*}\chi(c_{ij})$ and ${}^{*}\chi(c_{qr})$ for all combinations of boundary conditions to predict the perturbations of the action as epitope of peptides. In doing so, we can find the optimal sequence and boundary conditions for the use of the peptide in the development of a vaccine. In Table 2 we give some of these values of ${}^{*}\chi(c_{ij})$ and ${}^{*}\chi(c_{qr})$.
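
The perturbation terms defined above reduce to simple differences. The sketch below follows the sign convention stated in the text (reference minus new); the function names are hypothetical, the two peptide electronegativities are invented for illustration, and the host-organism averages are the Table 2 values for Mus musculus (2.6873) and Homo sapiens (2.6856).

```python
def delta_chi_seq(chi_ref, chi_new):
    """Sequence perturbation term: chi(m_q)_ref - chi(m_i)_new."""
    return chi_ref - chi_new


def delta_delta_chi(chi_ref, avg_ref, chi_new, avg_new):
    """Condition perturbation term for one boundary condition.

    [chi(m_q)_ref - *chi(c_qr)_ref] - [chi(m_i)_new - *chi(c_ij)_new]
    """
    return (chi_ref - avg_ref) - (chi_new - avg_new)


# Hypothetical peptide electronegativities (not taken from the source)
chi_ref_peptide = 2.70
chi_new_peptide = 2.69

# Host-organism averages from Table 2: reference host Mus musculus,
# new host Homo sapiens
dd_ho = delta_delta_chi(chi_ref_peptide, 2.6873, chi_new_peptide, 2.6856)
d_seq = delta_chi_seq(chi_ref_peptide, chi_new_peptide)
```

When the reference and new assays share the same boundary condition, the two tabulated averages cancel and the condition term equals the sequence term.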

Table 2.

Average values and counts of input-output cases for different organisms, processes, and techniques.

Source organism (so) N in N out *χ
Homo sapiens 38920 39274 2.685
Plasmodium falciparum 10669 9446 2.704
Hepatitis C virus 9935 10239 2.683
Bos taurus 5671 5780 2.690
Canine parvovirus 5655 5637 2.693
Foot-mouth disease virus 4062 4176 2.676
Triticum aestivum 3769 3887 2.703
Bacillus anthracis 3602 3600 2.699
Human papillomavirus 3316 3414 2.693
Human herpesvirus 3026 3132 2.684
Gallus gallus 2850 2829 2.689
Arachis hypogaea 2648 2670 2.687
Mycobacterium tuberculosis 2637 2593 2.688
Clostridium botulinum 2588 2722 2.685
SARS coronavirus 2550 2704 2.686
Mus musculus 2334 2287 2.682
Hepatitis B virus 2007 2066 2.680
Helicobacter pylori 1958 1796 2.695
Hevea brasiliensis 1938 1958 2.697
Hepatitis E virus 1928 1941 2.685
Shigella flexneri 1878 1701 2.699
Dengue virus 2 1767 1828 2.679
Staphylococcus aureus 1757 1661 2.694
Treponema pallidum 1739 1755 2.691
Escherichia coli 1721 1678 2.689
Murine hepatitis virus 1575 1603 2.692
Haemophilus influenzae 1545 1587 2.695
Streptococcus mutans 1523 1537 2.697
Puumala virus (strain) 1505 1574 2.689
Chlamydia trachomatis 1402 1546 2.704
Human respiratory virus 1347 1398 2.682
Borrelia burgdorferi 1228 1237 2.698
Hepatitis delta virus 1182 1199 2.690
Streptococcus pyogenes 1181 1251 2.697
Porphyromonas gingivalis 1143 1085 2.688
Human enterovirus 1106 1132 2.689
Influenza A virus 1085 1086 2.687
Mycoplasma hyopneumoniae 1044 1024 2.695
Rattus norvegicus 1025 1039 2.689
Bordetella pertussis 1011 960 2.685
Human T-lymphotropic virus 996 1031 2.680
Anaplasma marginale 977 857 2.707
Measles virus strain 804 810 2.688
Fasciola hepatica 803 857 2.685
Neisseria meningitidis 789 853 2.696
Human poliovirus 766 780 2.690
Tityus serrulatus 764 775 2.680
Torpedo californica 752 788 2.687
Cryptomeria japonica 719 794 2.680
Mycobacterium bovis 717 733 2.688
Trypanosoma cruzi 691 777 2.704
Andes virus CHI-7913 679 687 2.690
Bovine papillomavirus 672 665 2.692
Human hepatitis 670 696 2.688
Leishmania infantum 659 735 2.688
Human parvovirus 649 691 2.683
Poa pratensis 648 664 2.692
Aspergillus fumigatus 642 709 2.677
Duck hepatitis 587 603 2.688
Olea europaea 571 577 2.692
Porcine reproductive 515 514 2.681
Fagopyrum esculentum 509 497 2.685
Juniperus ashei 505 568 2.672
Mycobacterium leprae 489 542 2.690
Glycine max 477 509 2.685
D. pteronyssinus 455 464 2.680
Plasmodium vivax 453 446 2.690
Chlamydophila pneumoniae 446 462 2.690
Pseudomonas aeruginosa 443 454 2.691
Vibrio cholera 427 426 2.694
Streptococcus sp. 426 425 2.691
Mycobacterium avium 425 415 2.689
Dermatophagoides farinae 410 390 2.693
Human coxsackievirus 406 392 2.694
Equine infectious virus 404 419 2.688
Babesia equi 383 371 2.696
Prunus dulcis 383 379 2.708
Human adenovirus 375 405 2.686
Theileria parva 366 371 2.713
Candida albicans 365 370 2.690
Porcine endogenous 355 351 2.692
Ovis aries 352 350 2.683
Chironomus thummi 347 338 2.691
Sus scrofa 343 362 2.686
Bovine leukemia virus 333 329 2.676
Ricinus communis 329 314 2.692
Androctonus australis 322 357 2.685
Renibacterium salmoninarum 319 350 2.690
Orientia tsutsugamushi 309 372 2.705
Anacardium occidentale 293 306 2.693
Conus geographus 289 295 2.660

Host organism (ho) N in N out *χ

Homo sapiens 257293 91093 2.6856
Mus musculus 107867 51466 2.6873
Oryctolagus cuniculus 65053 31433 2.6900
Bos taurus 15333 2072 2.6909
Rattus norvegicus 9450 3562 2.6876
Aotus sp. 9044 3933 2.6879
Sus scrofa 7725 3464 2.6873
Gallus gallus 7507 997 2.6790
Canis lupus 6604 3334 2.6906
Macaca mulatta 5261 2569 2.6993
Ovis aries 3953 1653 2.6836
Equus caballus 3943 2099 2.6842
Cavia porcellus 3458 1688 2.6833
Capra hircus 2182 1127 2.6830
Aotus nancymaae 1659 852 2.6837
Pan troglodytes 1614 732 2.6757
Marmota monax 1100 509 2.7011
Felis catus 901 279 2.6838
Myodes glareolus 814 388 2.6863
Anas platyrhynchos 688 342 2.6880
Homo sapiens (human) 508 270 2.6851
Trichosurus vulpecula 456 126 2.6921
Mesocricetus auratus 438 104 2.6909
Macaca cyclopis 382 193 2.6871
O. tshawytscha 333 159 2.6929
Macaca fuscata 188 100 2.6667
Cricetulus migratorius 171 142 2.7008
Camelus dromedarius 171 89 2.6886
Dicentrarchus labrax 121 55 2.6759
Macaca fascicularis 96 52 2.6793
Saimiri sciureus 92 44 2.6900
Canis familiaris 77 42 2.6850
Rattus rattus 72 31 2.6760
Callithrix pygmaea 67 30 2.6920
Chinchilla lanigera 41 24 2.6729
Aotus lemurinus 30 19 2.6860
Papio cynocephalus 27 13 2.7267
Aotus griseimembra 26 12 2.7000
Mustela vison 18 10 2.7000
Chlorocebus aethiops 15 10 2.6875
Bos indicus 13 4 2.6925
Oncorhynchus mykiss 10 4 2.6700
M. macquariensis 9 6 2.6600
Cricetulus griseus 8 4 2.6900
Aotus trivirgatus 7 4 2.7000

Process type (pt) N in N out *χ

AID 111197 108536 2.6876
OOID 32419 32617 2.6868
OAID 19210 18954 2.6801
OOA 15863 16303 2.6902
NI 13430 15206 2.6845
EWEIR 4818 4864 2.6843
EEE 3113 3546 2.6906
OOD 2806 2799 2.6887
AICD 1077 1095 2.6812
EWED 696 686 2.6879
DEWED 280 337 2.6804
TT 260 215 2.6806
OOC 153 137 2.6800

Technique (tq) N in N out *χ

ELISA 133458 135109 2.6871
WI 33627 33292 2.6887
ACAbB 7780 9068 2.6862
PhDIP 7450 4496 2.6771
RIA 5241 5218 2.6858
IFAIH 4454 4581 2.6879
NIAA 4222 4316 2.6892
FIA 2255 2276 2.6897
PAC 1312 1219 2.6837
IP 1127 1089 2.6886
SPR 758 639 2.6860
FACS 608 647 2.6907
Other 502 495 2.6813
SAC 484 393 2.6878
ELISPOT 396 412 2.6979
RDAT 366 323 2.6859
EDAT 284 330 2.6800
XRC 231 227 2.6880
MS 209 179 2.6849
PFF 171 153 2.6820
AbDPO 162 295 2.6968
CdC 146 205 2.6895
IAbBA 144 183 2.6940
IOT 124 106 2.6835
HAGGI 115 122 2.6834
IgMHR 89 90 2.6929
EAAA 84 139 2.6922
HS 82 67 2.6791
AbdCC 73 118 2.6897
AGG 50 60 2.6980
CM 50 57 2.6863

The * indicates that quantities like *χ are the average value of the mean electronegativity χ(m i) over all the peptides in IEDB that are epitopes under the same boundary condition.
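
The tabulated averages can be thought of as a group-wise mean over epitope records. The sketch below shows one way to compute such averages; the record layout, the function name, and all numeric values are hypothetical and are not drawn from IEDB.

```python
from collections import defaultdict


def condition_averages(records):
    """Mean chi(m) per boundary condition, using epitope records only.

    records: iterable of (condition, chi, is_epitope) tuples, where
    condition is e.g. a host organism or technique label.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for condition, chi, is_epitope in records:
        if is_epitope:  # non-epitopes do not contribute to the average
            sums[condition] += chi
            counts[condition] += 1
    return {cond: sums[cond] / counts[cond] for cond in sums}


# Hypothetical records for illustration only
toy_records = [
    ("ELISA", 2.68, True),
    ("ELISA", 2.70, True),
    ("ELISA", 2.90, False),  # ignored: not an epitope
    ("WI", 2.69, True),
]
toy_averages = condition_averages(toy_records)
```

The same grouping would apply to each of the four condition types (source organism, host organism, process, and technique) in turn.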

In Table 3 we depict the sequences and input-output boundary conditions for top perturbations present in IEDB. All these perturbations have an observed value of λ(ε ij)new = 1 and a predicted value also equal to 1 with high probability. The Supplementary Material, available online at http://dx.doi.org/10.1155/2014/768515, contains a full list of >200,000 cases of perturbations.

Table 3.

Top 100 values of p1 for positive perturbations in training series.

| New IDE | New sequence | New ho | New so | New ip | New tq | Ref. IDE | Ref. sequence | Ref. ho | Ref. so | Ref. ip | Ref. tq | Δχ | ΔΔχ ho | ΔΔχ so | ΔΔχ ip | ΔΔχ tq |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 115153 | MKGVVCTRIYEKV | Homo sapiens | Homo sapiens | OAID | ELISA | 115155 | NNQRKKAKNTPFNMLKRERN | Mus musculus | Dengue virus 2 | AID | WI | 0.01 | 0.012 | 0.004 | 0.017 | 0.012 |
| 52124 | QQQPP | Homo sapiens | Triticum aestivum | OOA | ELISA | 52128 | QQQQGGSQSQKGKQQ | Homo sapiens | Glycine max | OOA | WI | 0 | 0 | −0.018 | 0 | 0.002 |
| 3639 | APLGVT | Homo sapiens | Hepatitis E virus | EWEIR | ELISA | 3652 | APLTRGSCRKRNRSPER | Homo sapiens | Human herpesvirus | OOID | ELISA | 0.04 | 0.04 | 0.039 | 0.043 | 0.04 |
| 135959 | LTRAYAKDVKFG | Homo sapiens | Homo sapiens | OAID | ELISA | 135963 | NGQEEKAGVVSTGLIGGG | Mus musculus | MD | AID | ELISA | −0.04 | −0.038 | −0.047 | −0.033 | −0.04 |
| 108075 | PREPQVY | Homo sapiens | Homo sapiens | OAID | ELISA | 108078 | PTSPSGVEEWIVTQVVPGVA | Oryctolagus cuniculus | Homo sapiens | AID | ACAbB | −0.01 | −0.006 | −0.01 | −0.003 | −0.011 |
| 25113 | HVVDLP | Homo sapiens | Hepatitis E virus | EWEIR | ELISA | 25126 | HWGNHSKSHPQR | Mus musculus | MD | AID | ELISA | 0.02 | 0.022 | 0.013 | 0.023 | 0.02 |
| 48780 | PPFSPQ | Homo sapiens | Hepatitis delta virus | OOID | ELISA | 48782 | PPFTSAVGGVDHRS | Mus musculus | MD | AID | SAC | 0.02 | 0.022 | 0.008 | 0.021 | 0.021 |
| 40988 | LYVVAYQA | Mus musculus | Viscum album | AID | ELISA | 41004 | MAARLCCQLDPARDV | Homo sapiens | Hepatitis B virus | OOID | ELISA | 0.02 | 0.018 | 0.006 | 0.019 | 0.02 |
| 50439 | QDAYNAAG | Mus musculus | Mycobacterium scrofulaceum | AID | ELISA | 50445 | QDCNCSIYPGHASGHRMAWD | Homo sapiens | Hepatitis C virus | OOID | ELISA | 0.04 | 0.038 | 0.028 | 0.039 | 0.04 |
| 98849 | KIPAVFKIDA | Homo sapiens | Bos taurus | DEWED | WI | 98850 | KKGSEEEGDITNPIN | Homo sapiens | Arachis hypogaea | OOA | IFAIH | −0.05 | −0.05 | −0.053 | −0.04 | −0.051 |
| 116171 | TQDQDPBBHFFKNIVTPR | Homo sapiens | Homo sapiens | OAID | ACAbB | 116286 | CGKGLSATVTGGQKGRGSR | Oryctolagus cuniculus | Mus musculus | AID | MS | 0.01 | 0.014 | 0.008 | 0.017 | 0.009 |
| 123442 | LLKDLRKN | Homo sapiens | Borna disease virus | EWEIR | WI | 123443 | LLTEHRMTWDPAQPPRDLTE | Homo sapiens | Homo sapiens | OOID | ELISA | −0.02 | −0.02 | −0.025 | −0.017 | −0.022 |
| 47858 | PHVVDL | Homo sapiens | Hepatitis E virus | EWEIR | ELISA | 47860 | PHWIKKPNRQGLGYYS | Capra hircus | Human T-lymphotropic virus | AID | ACAbB | 0.01 | 0.007 | 0.005 | 0.013 | 0.009 |
| 61783 | STNKAVVSLS | Bos taurus | Bovine respiratory | AID | ELISA | 61792 | STNPKPQRKTKRNTNRRPQD | Homo sapiens | Hepatitis C virus | OOID | ELISA | 0.03 | 0.025 | 0.019 | 0.029 | 0.03 |
| 118210 | VMLYQISEE | Homo sapiens | Homo sapiens | OAID | WI | 118217 | VTKYITKGWKEVH | Oryctolagus cuniculus | Homo sapiens | AID | ELISA | 0.05 | 0.054 | 0.05 | 0.057 | 0.048 |
| 130944 | LFKHS | Oryctolagus cuniculus | Rattus norvegicus | AID | ELISA | 130956 | LPPRVTPKWSLDAWSTWR | Homo sapiens | MD | OOD | WI | 0.01 | 0.006 | −0.002 | 0.011 | 0.012 |
| 23028 | GVKYA | Homo sapiens | MD | OAID | WI | 23032 | GVLAKDVRFSQV | Homo sapiens | MD | OOID | ELISA | 0 | 0 | 0 | 0.007 | −0.002 |
| 51199 | QKKAIE | Oryctolagus cuniculus | Vibrio cholerae | AID | ELISA | 51204 | QKKNKRNTNRRPQDV | Homo sapiens | Hepatitis C virus | OOID | ELISA | 0.03 | 0.026 | 0.018 | 0.029 | 0.03 |
| 144783 | SHVVTMLDNF | Homo sapiens | Homo sapiens | NI | ELISA | 144786 | SMNRGRGTHPSLIWM | Mus musculus | MD | AID | ACAbB | 0.03 | 0.032 | 0.023 | 0.033 | 0.029 |
| 134343 | DLYIK | Mus musculus | Human papillomavirus | AID | NIAA | 134344 | DMAQVTVGPGLLGVSTL | Mus musculus | Homo sapiens | AID | WI | 0 | 0 | −0.009 | 0 | 0 |
| 38321 | LNQLAGRM | Anas platyrhynchos | Duck hepatitis | AID | ELISA | 38323 | LNQTARAFPDCAICWEPSPP | Oryctolagus cuniculus | Bovine leukemia virus | AID | ACAbB | −0.01 | −0.008 | −0.022 | −0.01 | −0.011 |
| 144657 | GQITVDMMYG | Homo sapiens | Homo sapiens | OAID | ELISA | 144661 | GREGYPADGGCAWPACYC | Oryctolagus cuniculus | MD | AID | WI | 0.02 | 0.024 | 0.013 | 0.027 | 0.022 |
| 21084 | GLQN | Mus musculus | Chlamydia trachomatis | AID | ELISA | 21093 | GLRAQDDFSGWDINTPAFEW | Mus musculus | Mycobacterium tuberculosis | AID | WI | 0.03 | 0.03 | 0.013 | 0.03 | 0.032 |
| 98453 | SGFSGSVQFV | Oryctolagus cuniculus | Neisseria meningitidis | AID | ELISA | 98456 | SICSNNPTCWAICKRIPNKK | Mus musculus | Human respiratory virus | AID | IFAIH | 0.04 | 0.037 | 0.027 | 0.04 | 0.041 |
| 98453 | SGFSGSVQFV | Mus musculus | Neisseria meningitidis | AID | ELISA | 98456 | SICSNNPTCWAICKRIPNKK | Mus musculus | Human respiratory virus | AID | IFAIH | 0.04 | 0.04 | 0.027 | 0.04 | 0.041 |
| 107107 | EAIQP | Rattus norvegicus | Homo sapiens | AID | ELISA | 107110 | EKERRPSPIGTATLL | Homo sapiens | MD | OOA | ELISA | 0.05 | 0.048 | 0.043 | 0.053 | 0.05 |
| 110857 | FTGEAYSYWSAK | Homo sapiens | Mycoplasma penetrans | EWEIR | ELISA | 110859 | GEESRISLPLPNFSSLNLRE | Mus musculus | Homo sapiens | AID | FIA | 0 | 0.002 | −0.017 | 0.003 | 0.003 |
| 36315 | LGSAYP | Mus musculus | Mycobacterium leprae | MD | ELISA | 36317 | LGSGAFGTIYKG | Mus musculus | Avian erythroblastosis virus | AID | ACAbB | 0.01 | 0.01 | 0 | 0.011 | 0.009 |
| 122034 | WNPAD | Rattus norvegicus | Torpedo californica | AID | ELISA | 122035 | WNPADYGGIKWNPADYGGIK | Rattus norvegicus | MD | AID | RIA | 0.01 | 0.01 | 0.001 | 0.01 | 0.009 |
| 25013 | HVADIDKLID | Mus musculus | Puumala virus Kazan | AID | ELISA | 25021 | HVAPTHYVTESDASQRVTQL | Homo sapiens | Hepatitis C virus | OOID | ELISA | 0 | −0.002 | −0.014 | −0.001 | 0 |
| 36162 | LGIHE | Oryctolagus cuniculus | Candida albicans | AID | ELISA | 36166 | LGIMGEYRGTPRNQDLYDAA | Mus musculus | Human respiratory virus | AID | RIA | 0 | −0.003 | −0.007 | 0 | −0.001 |
| 67253 | TWEVLH | Mus musculus | Plasmodium vivax | AID | ELISA | 67257 | TWGENETDVLLLNNTRPPQ | Homo sapiens | Hepatitis C virus | OOID | ACAbB | −0.02 | −0.022 | −0.027 | −0.021 | −0.021 |
| 50990 | QGYRVSSYLP | Homo sapiens | Hevea brasiliensis | OOA | WI | 50998 | QHEQDRPTPSPAPSRPFSVL | Homo sapiens | Hepatitis E virus | OOID | ELISA | 0.01 | 0.01 | −0.002 | 0.007 | 0.008 |
| 100458 | RDVLQLYAPE | Mus musculus | Bacillus anthracis | AID | ELISA | 100462 | RFSTRYGNQNGRIRVLQRFD | Homo sapiens | Arachis hypogaea | EWED | ELISA | 0.03 | 0.028 | 0.018 | 0.03 | 0.03 |
| 111036 | TESTFTGEAYSV | Homo sapiens | Mycoplasma penetrans | EWEIR | ELISA | 111039 | TGVPIDPAVPDSSIVPLLES | Bos taurus | Bovine papillomavirus | AID | ELISA | 0.03 | 0.035 | 0.02 | 0.033 | 0.03 |
| 117919 | IFIEME | Homo sapiens | Homo sapiens | OAID | WI | 117921 | IGIIDLIEKRKFNQ | Mus musculus | Homo sapiens | AID | WI | 0.03 | 0.032 | 0.03 | 0.037 | 0.03 |
| 7127 | CTDTDKLF | Oryctolagus cuniculus | Shigella flexneri | AID | ELISA | 7128 | CTDVSTAIHADQLTPAW | Homo sapiens | SARS coronavirus | OOID | ELISA | 0.01 | 0.006 | −0.003 | 0.009 | 0.01 |
| 112253 | PGQSPKL | Homo sapiens | Homo sapiens | OAID | ELISA | 112255 | PIRALVGDEVELPCRISPGK | Mus musculus | Homo sapiens | AID | ELISA | 0.01 | 0.012 | 0.01 | 0.017 | 0.01 |
| 122034 | WNPAD | Rattus norvegicus | Torpedo californica | AID | ELISA | 122038 | WNPDDYGGVKWNPDDYGGVK | Rattus norvegicus | MD | AID | RIA | 0 | 0 | −0.009 | 0 | −0.001 |
| 131878 | FLMLVGGSTL | Homo sapiens | Homo sapiens | OAID | ACAbB | 131879 | FLVAHTRARAPSAGERARRS | Mus musculus | Mus musculus | AID | NIAA | 0.03 | 0.032 | 0.028 | 0.037 | 0.033 |
| 70664 | VQVVYDYQ | Homo sapiens | Treponema pallidum | OOID | ELISA | 70667 | VQWMNRLIAFAFAGNHVSP | Homo sapiens | Hepatitis C virus | OOID | ELISA | 0.05 | 0.05 | 0.041 | 0.05 | 0.05 |
| 71545 | VTV | Homo sapiens | Helicobacter pylori | OOID | ELISA | 71559 | VTVRGGLRILSPDRK | Homo sapiens | Arachis hypogaea | OOA | WI | 0.04 | 0.04 | 0.032 | 0.043 | 0.042 |
| 127856 | TDVRYKD | Mus musculus | Mus musculus | AID | ACAbB | 127857 | TDVRYKDDMYHFFCPAIQAQ | Mus musculus | Mus musculus | AID | PFF | 0.01 | 0.01 | 0.01 | 0.01 | 0.006 |
| 112149 | GVGWIRQ | Homo sapiens | Homo sapiens | OAID | ELISA | 112152 | HHPARTAHYGSLPQKSHGRT | Homo sapiens | Homo sapiens | AID | ELISA | 0 | 0 | 0 | 0.007 | 0 |
| 119581 | FSCSVMHE | Homo sapiens | Homo sapiens | OAID | ELISA | 119582 | GLQLIQLINVDEVNQI | Mus musculus | Homo sapiens | AID | RIA | −0.01 | −0.008 | −0.01 | −0.003 | −0.011 |
| 144657 | GQITVDMMYG | Homo sapiens | Homo sapiens | OAID | ELISA | 144659 | GREGYPADGGAAGYCNTE | Oryctolagus cuniculus | MD | AID | WI | −0.01 | −0.006 | −0.017 | −0.003 | −0.008 |
| 25013 | HVADIDKLID | Mus musculus | Puumala virus Kazan | AID | ELISA | 25022 | HVAPTHYVVESDASQRVTQV | Homo sapiens | Hepatitis C virus | OOID | ELISA | 0 | −0.002 | −0.014 | −0.001 | 0 |
| 144652 | GMRGMKGLVY | Homo sapiens | Homo sapiens | OAID | ELISA | 144654 | GPHPTLEVVPMGRGS | Mus musculus | MD | AID | ELISA | −0.02 | −0.018 | −0.027 | −0.013 | −0.02 |
| 104515 | HDCRPKKI | Mus musculus | La Crosse virus | AID | IFAIH | 104521 | IGTLKKILDETVKDKIAKEQ | Rattus norvegicus | Streptococcus pyogenes | AID | ELISA | −0.04 | −0.04 | −0.053 | −0.04 | −0.041 |
| 7367 | CYGDWA | Homo sapiens | Triticum aestivum | OOA | ELISA | 7374 | CYGLPDSEPTKTNGK | Mus musculus | Tityus serrulatus | AID | WI | −0.02 | −0.018 | −0.043 | −0.023 | −0.018 |
| 7367 | CYGDWA | Homo sapiens | Triticum aestivum | OOA | ELISA | 7374 | CYGLPDSEPTKTNGK | Mus musculus | Tityus serrulatus | AID | WI | −0.02 | −0.018 | −0.043 | −0.023 | −0.018 |
| 144610 | DFFTYK | Mus musculus | Porcine transmissible | AID | WI | 144611 | DFNGSFDMNGTITA | Oryctolagus cuniculus | Escherichia coli | AID | ELISA | −0.01 | −0.007 | −0.018 | −0.01 | −0.012 |
| 112047 | ASTRESG | Homo sapiens | Homo sapiens | OAID | ELISA | 112048 | ATASTMDHARHGFLPRHRDT | Homo sapiens | Homo sapiens | AID | ELISA | 0.05 | 0.05 | 0.05 | 0.057 | 0.05 |
| 36136 | LGGVFT | Homo sapiens | Dengue virus 2 | OOID | ELISA | 36137 | LGGWKLQSDPRAYAL | Homo sapiens | Ambrosia artemisiifolia | OOA | RIA | 0.01 | 0.01 | 0.007 | 0.013 | 0.009 |
| 115256 | FRELKDLKGY | Homo sapiens | Bos taurus | DEWED | WI | 115261 | GDLEILLQKWENGECAQKKI | Homo sapiens | Bos taurus | OOA | FIA | −0.01 | −0.01 | −0.01 | 0 | −0.009 |
| 129024 | KADQLYK | Homo sapiens | Homo sapiens | OAID | ELISA | 129026 | KAKKPAAAAGAKKAKS | Oryctolagus cuniculus | Homo sapiens | AID | ELISA | 0.03 | 0.034 | 0.03 | 0.037 | 0.03 |
| 148481 | YTRDLVYK | Rattus norvegicus | Homo sapiens | AID | WI | 148483 | YVPIVTFYSEISMHSSRAIP | Oryctolagus cuniculus | MD | AID | ELISA | 0 | 0.002 | −0.007 | 0 | −0.002 |
| 150850 | GY | Mus musculus | Human papillomavirus | AID | ELISA | 150853 | HIGGLSILDPIFGVL | Homo sapiens | Dermatophagoides farinae | OOA | ACAbB | 0.04 | 0.038 | 0.04 | 0.043 | 0.039 |
| 107366 | FPPKPKD | Homo sapiens | Homo sapiens | OAID | ELISA | 107376 | GDRSGYSSPGSPG | Mus musculus | Homo sapiens | AID | ACAbB | −0.03 | −0.028 | −0.03 | −0.023 | −0.031 |
| 114859 | ICGTDGVTYT | Homo sapiens | Gallus gallus | OOA | WI | 114865 | IVERETRGQSENPLWHALRR | Rattus norvegicus | Human herpesvirus | AID | ELISA | 0.04 | 0.042 | 0.035 | 0.037 | 0.038 |
| 62149 | SVHLF | Homo sapiens | MD | OAID | WI | 62150 | SVIALGSQEGALHQALAGAI | Equus caballus | West Nile virus | AID | IFAIH | −0.02 | −0.021 | −0.012 | −0.013 | −0.021 |
| 98455 | SGSVQFVPIQ | Mus musculus | Neisseria meningitidis | AID | ELISA | 98456 | SICSNNPTCWAICKRIPNKK | Mus musculus | Human respiratory virus | AID | IFAIH | 0.04 | 0.04 | 0.027 | 0.04 | 0.041 |
| 61783 | STNKAVVSLS | Bos taurus | Bovine respiratory | AID | ELISA | 61791 | STNPKPQRKTKRNTNRRPQ | Homo sapiens | Hepatitis C virus | EWEIR | ACAbB | 0.04 | 0.035 | 0.029 | 0.037 | 0.039 |
| 100318 | NAPKTFQFIN | Mus musculus | Bacillus anthracis | AID | ELISA | 100319 | NASSELHLLGFGINAENNHR | Homo sapiens | Arachis hypogaea | EWED | ELISA | 0 | −0.002 | −0.012 | 0 | 0 |
| 118947 | PFSAPPPA | Homo sapiens | Homo sapiens | OAID | ELISA | 118948 | PGAIEQGPADDPGEGPSTGP | Homo sapiens | Human herpesvirus | NI | ACAbB | −0.04 | −0.04 | −0.041 | −0.036 | −0.041 |
| 78323 | YSFRD | Mus musculus | Bluetongue virus 1 | AID | ELISA | 78341 | AALTAENTAIKKRNADAKA | Homo sapiens | Streptococcus mutans | EEE | ELISA | 0.01 | 0.008 | 0.007 | 0.013 | 0.01 |
| 145831 | IPLGTRP | Mus musculus | Human papillomavirus | AID | ELISA | 145841 | KEDFRYAISSTNEIGLLGA | Sus scrofa | Classical swine | AID | PAC | −0.04 | −0.04 | −0.045 | −0.04 | −0.043 |
| 119592 | HTFPAVLQ | Homo sapiens | Homo sapiens | OAID | ELISA | 119596 | IHIPSEKIWRPDLVLY | Mus musculus | Homo sapiens | AID | RIA | 0.01 | 0.012 | 0.01 | 0.017 | 0.009 |
| 58780 | SKAANLSIIK | MD | Beet necrotic | MD | WI | 58783 | SKAFSNCYPYDVPDYASL | Oryctolagus cuniculus | Influenza A virus | AID | RIA | −0.01 | −0.004 | −0.014 | −0.009 | −0.013 |
| 115153 | MKGVVCTRIYEKV | Homo sapiens | Homo sapiens | NI | ELISA | 115155 | NNQRKKAKNTPFNMLKRERN | Mus musculus | Dengue virus 2 | AID | ELISA | 0.01 | 0.012 | 0.004 | 0.013 | 0.01 |
| 133629 | LPLRF | Oryctolagus cuniculus | Gallus gallus | AID | ACAbB | 133630 | LPPGLHVFPLASNRS | Mus musculus | MD | AID | SPR | −0.01 | −0.013 | −0.021 | −0.01 | −0.01 |
| 96215 | EEEEAEDKED | Homo sapiens | Homo sapiens | OAID | ELISA | 96216 | EEEGLLKKSADTLWNMQK | Mus musculus | Mus musculus | AID | ELISA | 0.08 | 0.082 | 0.078 | 0.087 | 0.08 |
| 39782 | LTAASV | Homo sapiens | Triticum aestivum | OOA | ELISA | 39788 | LTAELKIYSVIQAEINKHL | Oryctolagus cuniculus | Yersinia pestis | AID | ACAbB | 0.01 | 0.014 | −0.001 | 0.007 | 0.009 |
| 107479 | KFNWYVD | Homo sapiens | Homo sapiens | OAID | ELISA | 107482 | KGEPGLPGRGFPGFP | Mus musculus | Homo sapiens | AID | ACAbB | 0 | 0.002 | 0 | 0.007 | −0.001 |
63569 TETVNSDI Macaca mulatta Shigella flexneri AID ELISA 63573 TEVELKER
KHRIEDAV
RNAK
Homo sapiens Mycobacterium leprae OOID ELISA 0.05 0.036 0.041 0.049 0.05
134028 DDTIS Mus musculus Homo sapiens AID NIAA 134029 DEDENQS
PRSFQKKTR
Oryctolagus cuniculus Homo sapiens AID ELISA 0.05 0.053 0.05 0.05 0.048
23028 GVKYA Homo sapiens MD OAID WI 23032 GVLAKDV
RFSQV
Homo sapiens MD EWEIR ACAbB 0 0 0 0.004 −0.003
115293 IMCVKK
ILDK
Homo sapiens Bos taurus DEWED WI 115295 INPSKEN
LCSTFCK
EVVRNA
Homo sapiens Bos taurus OOA FIA −0.03 −0.03 −0.03 −0.02 −0.029
65105 TLTPENTL Mus musculus Shigella flexneri AID ELISA 65110 TLTSGSD
LDRCTT
FDDV
Oryctolagus cuniculus SARS coronavirus AID ELISA 0.01 0.013 −0.003 0.01 0.01
134471 PKPEQ Mus musculus Streptococcus pneumoniae AID FACS 134472 PLLPGT
STTSTG
PCKT
Homo sapiens Hepatitis B virus AID ELISA 0.02 0.018 0.019 0.02 0.016
142228 LYCYEQ
LNDSSE
Homo sapiens Human papillomavirus NI ELISA 142250 NWGDEP
SKRRDRS
NSRGRKN
Felis catus Feline infectious AID ELISA 0.07 0.068 0.071 0.073 0.07
21549 GNYDFW
YQS
Homo sapiens Staphylococcus aureus OOID ELISA 21553 GNYNYKY
RYLRHGK
LRPFER
Mus musculus SARS coronavirus AID ELISA 0.03 0.032 0.021 0.031 0.03
21549 GNYDFWYQS Homo sapiens Staphylococcus aureus OOID ELISA 21553 GNYNYKY
RYLRHGK
LRPFER
Oryctolagus cuniculus SARS coronavirus AID ELISA 0.03 0.034 0.021 0.031 0.03
34908 LAPLGE Homo sapiens Hepatitis E virus EWEIR ELISA 34914 LAPSTL
RSLRKR
RLSSP
Homo sapiens Human herpesvirus OOID ELISA 0.06 0.06 0.059 0.063 0.06
130454 CLFPNNSYC Mus musculus MD AID NIAA 130456 CRPQVN
NPKEWS
CAAC
Homo sapiens MD OOD ACAbB 0.01 0.008 0.01 0.011 0.007
19644 GFVPSM Homo sapiens Hepatitis delta virus OOID ELISA 19647 GFVSASI
FGFQAEV
GPNNTR
Oryctolagus cuniculus Vaccinia virus WR AID ELISA −0.01 −0.006 −0.013 −0.009 −0.01
20678 GKRPE Mus musculus Streptococcus pyogenes AID WI 20680 GKSKRD
AKNNAAK
LAVDKLL
Mus musculus Vaccinia virus WR AID WI 0.01 0.01 0.001 0.01 0.01
123278 GYLKDLPTT Ovis aries Fasciola hepatica AID ELISA 123282 HACQKK
LLKFEAL
QQEEGEE
Rattus norvegicus Gallus gallus AID PFF −0.02 −0.016 −0.016 −0.02 −0.025
141067 PLSLEPDP Mus musculus Homo sapiens AID FACS 141073 REGVRW
RVMAIQ
Mus musculus Homo sapiens AID ELISA 0.06 0.06 0.06 0.06 0.056
139248 SFAGTVIE Mus musculus Classical swine AID WI 139305 TAAQITQ
RKWEAA
REAEQRR
Oryctolagus cuniculus Homo sapiens AID IP 0.03 0.033 0.026 0.03 0.03
104515 HDCRPKKI Mus musculus La Crosse virus AID IFAIH 104520 IAKEQE
NKETIGT
LKKILDE
Rattus norvegicus Streptococcus pyogenes AID ELISA −0.06 −0.06 −0.073 −0.06 −0.061
113517 HLYADGLTD Mus musculus Human papillomavirus AID IFAIH 113518 HNKIQA
IELEDLLR
YSKLYR
Mus musculus Homo sapiens AID ELISA 0.02 0.02 0.011 0.02 0.019
156970 VERHQ Homo sapiens Homo sapiens OAID ELISA 156975 WSSTV
LRVSPT
RTVP
Mus musculus MD AID ELISA 0.01 0.012 0.003 0.017 0.01
78252 PVQNLT Mus musculus Porphyromonas gingivalis AID WI 78253 QGGCGR
GWAFSA
TGAIEA
Mus musculus Glycine max AID ELISA 0.02 0.02 0.017 0.02 0.018
6068 CCPDKNKS Mus musculus Human herpesvirus AID WI 6074 CCRHKQ
KDVGDVK
QTLPPS
Ovis aries MD AID ELISA 0.01 0.006 0.004 0.01 0.008
53109 RAGVCY Homo sapiens Triticum aestivum OOA ELISA 53116 RAILTA
FSPAQDI
WGTS
Oryctolagus cuniculus SARS coronavirus AID ELISA −0.02 −0.016 −0.037 −0.023 −0.02
147041 IPEQ Homo sapiens Triticum aestivum OOA WI 147064 KHQGA
QYVWN
RTA
Homo sapiens Bos taurus OOID WI 0.06 0.06 0.047 0.057 0.06
70664 VQVVYDYQ Oryctolagus cuniculus Treponema pallidum AID ELISA 70667 VQWMN
RLIAFAF
AGNHVSP
Homo sapiens Hepatitis C virus OOID ELISA 0.05 0.046 0.041 0.049 0.05
134343 DLYIK Mus musculus Human papillomavirus AID PhDIP 134344 DMAQV
TVGPGLL
GVSTL
Mus musculus Homo sapiens AID PhDIP 0 0 −0.009 0 0

4. Conclusions

It is possible to develop general models for vaccine design that predict the results of multiple input-output perturbations in peptide sequence and experimental assay boundary conditions using ideas from QSPR analysis, perturbation theory, and Box and Jenkins moving-average (MA) operators. The electronegativity values calculated with MARCH-INSIDE seem to be good molecular descriptors for this type of QSPR-perturbation model.
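
As an illustration of the moving-average idea named above, the following minimal Python sketch may help. It is not the authors' implementation: it assumes that a moving-average term can be pictured as the deviation of a peptide descriptor (for example, a mean electronegativity value) from the average of that descriptor over all cases sharing one boundary condition, and that a perturbation term is the difference between such deviations for a new and a reference peptide. The function names and the toy numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def condition_means(values, labels):
    """Average descriptor value over all cases that share a condition label."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    return {label: mean(group) for label, group in groups.items()}


def ma_deviations(values, labels):
    """Deviation of each case from the mean of its own condition group."""
    means = condition_means(values, labels)
    return [value - means[label] for value, label in zip(values, labels)]


def perturbation_term(dev_new, dev_ref):
    """Difference between the deviations of the new and the reference case."""
    return dev_new - dev_ref


# Toy descriptor values (hypothetical, not taken from the dataset)
# for three cases measured with two experimental techniques.
values = [2.0, 4.0, 3.0]
labels = ["ELISA", "ELISA", "WI"]
deviations = ma_deviations(values, labels)  # [-1.0, 1.0, 0.0]
print(deviations)
print(perturbation_term(deviations[1], deviations[0]))  # 2.0
```

The same grouping step could be repeated for each type of boundary condition (host organism, source organism, biological process, technique), giving one deviation term per condition for every peptide pair.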

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Supplementary Material

Supplementary Material includes the sequences of peptides obtained from IEDB, the boundary conditions (source organisms, host organisms, techniques, and biological processes), and the values of the input/output variables of the models calculated for all cases in the dataset used. These values were obtained for all the input-output boundary conditions of the perturbations.

Acknowledgments

The present study was partially supported by Grants AGL2010-22290-C02 and AGL2011-30563-C03 from Ministerio de Ciencia e Innovación, Spain, and Grant CN 2012/155 from Xunta de Galicia, Spain.

References

  • 1. Peters B, Sidney J, Bourne P, et al. The design and implementation of the immune epitope database and analysis resource. Immunogenetics. 2005;57(5):326–336. doi: 10.1007/s00251-005-0803-5.
  • 2. Peters B, Sidney J, Bourne P, et al. The immune epitope database and analysis resource: from vision to blueprint. PLoS Biology. 2005;3(3):p. e91. doi: 10.1371/journal.pbio.0030091.
  • 3. Wang P, Morgan AA, Zhang Q, Sette A, Peters B. Automating document classification for the Immune Epitope Database. BMC Bioinformatics. 2007;8, article 269. doi: 10.1186/1471-2105-8-269.
  • 4. Sette A. The immune epitope database and analysis resource: from vision to blueprint. Genome Informatics. 2004;15(2):p. 299.
  • 5. Salimi N, Fleri W, Peters B, Sette A. The immune epitope database: a historical retrospective of the first decade. Immunology. 2012;137(2):117–123. doi: 10.1111/j.1365-2567.2012.03611.x.
  • 6. Kim Y, Sette A, Peters B. Applications for T-cell epitope queries and tools in the Immune Epitope Database and Analysis Resource. Journal of Immunological Methods. 2011;374(1-2):62–69. doi: 10.1016/j.jim.2010.10.010.
  • 7. Kim Y, Ponomarenko J, Zhu Z, Tamang D, et al. Immune epitope database analysis resource. Nucleic Acids Research. 2012:W525–W530. doi: 10.1093/nar/gks438.
  • 8. Helguera AM, Combes RD, Gonzalez MP, Cordeiro MNDS. Applications of 2D descriptors in drug design: a DRAGON tale. Current Topics in Medicinal Chemistry. 2008;8(18):1628–1655. doi: 10.2174/156802608786786598.
  • 9. Casañola-Martín GM, Marrero-Ponce Y, Khan MTH, et al. Dragon method for finding novel tyrosinase inhibitors: Biosilico identification and experimental in vitro assays. The European Journal of Medicinal Chemistry. 2007;42(11-12):1370–1381. doi: 10.1016/j.ejmech.2007.01.026.
  • 10. Tetko IV, Gasteiger J, Todeschini R, et al. Virtual computational chemistry laboratory – design and description. Journal of Computer-Aided Molecular Design. 2005;19(6):453–463. doi: 10.1007/s10822-005-8694-y.
  • 11. Katritzky AR, Oliferenko A, Lomaka A, Karelson M. Six-membered cyclic ureas as HIV-1 protease inhibitors: a QSAR study based on CODESSA PRO approach. Bioorganic and Medicinal Chemistry Letters. 2002;12(23):3453–3457. doi: 10.1016/s0960-894x(02)00741-2.
  • 12. Katritzky AR, Perumal S, Petrukhin R, Kleinpeter E. CODESSA-based theoretical QSPR model for hydantoin HPLC-RT lipophilicities. Journal of Chemical Information and Computer Sciences. 2001;41(3):569–574. doi: 10.1021/ci000099t.
  • 13. Vilar S, Cozza G, Moro S. Medicinal chemistry and the molecular operating environment (MOE): application of QSAR and molecular docking to drug discovery. Current Topics in Medicinal Chemistry. 2008;8(18):1555–1572. doi: 10.2174/156802608786786624.
  • 14. Marzaro G, Chilin A, Guiotto A, et al. Using the TOPS-MODE approach to fit multi-target QSAR models for tyrosine kinases inhibitors. The European Journal of Medicinal Chemistry. 2011;46(6):2185–2192. doi: 10.1016/j.ejmech.2011.02.072.
  • 15. Vilar S, Estrada E, Uriarte E, Santana L, Gutierrez Y. In silico studies toward the discovery of new anti-HIV nucleoside compounds through the use of tops-mode and 2D/3D connectivity indices. 2. Purine derivatives. Journal of Chemical Information and Modeling. 2005;45(2):502–514. doi: 10.1021/ci049662o.
  • 16. Estrada E, Quincoces JA, Patlewicz G. Creating molecular diversity from antioxidants in Brazilian propolis. combination of TOPS-MODE QSAR and virtual structure generation. Molecular Diversity. 2004;8(1):21–33. doi: 10.1023/b:modi.0000006804.97390.40.
  • 17. Estrada E, González H. What are the limits of applicability for graph theoretic descriptors in QSPR/QSAR? modeling dipole moments of aromatic compounds with TOPS-MODE descriptors. Journal of Chemical Information and Computer Sciences. 2003;43(1):75–84. doi: 10.1021/ci025604w.
  • 18. Marrero-Ponce Y, Castillo-Garit JA, Olazabal E, et al. Tomocomd-Cardd, a novel approach for computer-aided “rational” drug design: I. Theoretical and experimental assessment of a promising method for computational screening and in silico design of new anthelmintic compounds. Journal of Computer-Aided Molecular Design. 2004;18(10):615–634. doi: 10.1007/s10822-004-5171-y.
  • 19. Marrero Ponce Y, Medina Marrero R, Castro EA, et al. Protein quadratic indices of the “macromolecular pseudograph’s α-carbon atom adjacency matrix”. 1. prediction of arc repressor alanine-mutant’s stability. Molecules. 2004;9(12):1124–1147. doi: 10.3390/91201124.
  • 20. González-Díaz H, Prado-Prado F, Ubeira FM. Predicting antimicrobial drugs and targets with the MARCH-INSIDE approach. Current Topics in Medicinal Chemistry. 2008;8(18):1676–1690. doi: 10.2174/156802608786786543.
  • 21. García-Domenech R, Gálvez J, de Julián-Ortiz JV, Pogliani L. Some new trends in chemical graph theory. Chemical Reviews. 2008;108(3):1127–1169. doi: 10.1021/cr0780006.
  • 22. Estrada E, Uriarte E. Recent advances on the role of topological indices in drug discovery research. Current Medicinal Chemistry. 2001;8(13):1573–1588. doi: 10.2174/0929867013371923.
  • 23. Todeschini R, Consonni V. Handbook of Molecular Descriptors. Weinheim, Germany: Wiley-VCH; 2008.
  • 24. Estrada E, Delgado EJ, Alderete JB, Jaña GA. Quantum-connectivity descriptors in modeling solubility of environmentally important organic compounds. Journal of Computational Chemistry. 2004;25(14):1787–1796. doi: 10.1002/jcc.20099.
  • 25. Besalu E, Girones X, Amat L, Carbo-Dorca R. Molecular quantum similarity and the fundamentals of QSAR. Accounts of Chemical Research. 2002;35(5):289–295. doi: 10.1021/ar010048x.
  • 26. Rincón DA, Cordeiro MNDS, Mosquera RA. On the electronic structure of cocaine and its metabolites. Journal of Physical Chemistry A. 2009;113(50):13937–13942. doi: 10.1021/jp9056048.
  • 27. Mandado M, González-Moa MJ, Mosquera RA. Chemical graph theory and n-center electron delocalization indices: a study on polycyclic aromatic hydrocarbons. Journal of Computational Chemistry. 2007;28(10):1625–1633. doi: 10.1002/jcc.20647.
  • 28. Hill T, Lewicki P. STATISTICS: Methods and Applications: A Comprehensive Reference for Science, Industry and Data Mining. Tulsa, Okla, USA: StatSoft; 2006.
  • 29. Frank E, Hall M, Trigg L, Holmes G, Witten IH. Data mining in bioinformatics using Weka. Bioinformatics. 2004;20(15):2479–2481. doi: 10.1093/bioinformatics/bth261.
  • 30. Patlewicz G, Ball N, Booth ED, Hulzebos E, Zvinavashe E, Hennes C. Use of category approaches, read-across and (Q)SAR: general considerations. Regulatory Toxicology and Pharmacology. 2013;67(1):1–12. doi: 10.1016/j.yrtph.2013.06.002.
  • 31. Roberts DW, Patlewicz GY. Updating the skin sensitization in vitro data assessment paradigm in 2009 – a chemistry and QSAR perspective. Journal of Applied Toxicology. 2010;30(3):286–288. doi: 10.1002/jat.1508.
  • 32. Roberts DW, Patlewicz GY. Nonanimal alternatives for skin sensitization: letter to the editor. Toxicological Sciences. 2008;106(2):572–574. doi: 10.1093/toxsci/kfn181.
  • 33. Estrada E, Patlewicz G, Gutierrez Y. From knowledge generation to knowledge archive. a general strategy using TOPS-MODE with DEREK to formulate new alerts for skin sensitization. Journal of Chemical Information and Computer Sciences. 2004;44(2):688–698. doi: 10.1021/ci0342425.
  • 34. Gerberick GF, Ryan CA, Kern PS, et al. A chemical dataset for evaluation of alternative approaches to skin-sensitization testing. Contact Dermatitis. 2004;50(5):274–288. doi: 10.1111/j.0105-1873.2004.00290.x.
  • 35. Patlewicz GY, Basketter DA, Smith Pease CK, et al. Further evaluation of quantitative structure–activity relationship models for the prediction of the skin sensitization potency of selected fragrance allergens. Contact Dermatitis. 2004;50(2):91–97. doi: 10.1111/j.0105-1873.2004.00322.x.
  • 36. Estrada E, Patlewicz G, Chamberlain M, Basketter D, Larbey S. Computer-aided knowledge generation for understanding skin sensitization mechanisms: the TOPS-MODE approach. Chemical Research in Toxicology. 2003;16(10):1226–1235. doi: 10.1021/tx034093k.
  • 37. Patlewicz G, Rodford R, Walker JD. Quantitative structure-activity relationships for predicting skin and eye irritation. Environmental Toxicology and Chemistry. 2003;22(8):1862–1869. doi: 10.1897/01-439.
  • 38. Patlewicz GY, Rodford RA, Ellis G, Barratt MD. A QSAR model for the eye irritation of cationic surfactants. Toxicology In Vitro. 2000;14(1):79–84. doi: 10.1016/s0887-2333(99)00086-7.
  • 39. Tenorio-Borroto E, Penuelas Rivas CG, Vasquez Chagoyan JC, Castanedo N, Prado-Prado FJ, Garcia-Mera X. ANN multiplexing model of drugs effect on macrophages; theoretical and flow cytometry study on the cytotoxicity of the anti-microbial drug G1 in spleen. Bioorganic & Medicinal Chemistry. 2012;20(20):6181–6194. doi: 10.1016/j.bmc.2012.07.020.
  • 40. Flower DR, McSparron H, Blythe MJ, et al. Computational vaccinology: quantitative approaches. Novartis Foundation Symposium. 2003;254:102–120.
  • 41. Doytchinova IA, Guan P, Flower DR. Quantitative structure-activity relationships and the prediction of MHC supermotifs. Methods. 2004;34(4):444–453. doi: 10.1016/j.ymeth.2004.06.007.
  • 42. Xiao Y, Segal MR. Prediction of genomewide conserved epitope profiles of HIV-1: classifier choice and peptide representation. Statistical Applications in Genetics and Molecular Biology. 2005;4(1, article 25). doi: 10.2202/1544-6115.1158.
  • 43. Fagerberg T, Zoete V, Viatte S, Baumgaertner P, Alves PM, Romero P, et al. Prediction of cross-recognition of peptide-HLA A2 by melan-a-specific cytotoxic T lymphocytes using three-dimensional quantitative structure-activity relationships. PLoS ONE. 2013;8(7), article e65590. doi: 10.1371/journal.pone.0065590.
  • 44. Barh D, Misra AN, Kumar A, Vasco A. A novel strategy of epitope design in Neisseria gonorrhoeae. Bioinformation. 2010;5(2):77–82. doi: 10.6026/97320630005077.
  • 45. Bi J, Song R, Yang H, et al. Stepwise identification of HLA-A*0201-restricted CD8+ T-cell epitope peptides from herpes simplex virus type 1 genome boosted by a steprank scheme. Biopolymers. 2011;96(3):328–339. doi: 10.1002/bip.21564.
  • 46. Bremel RD, Homan EJ. An integrated approach to epitope analysis II: a system for proteomic-scale prediction of immunological characteristics. Immunome Research. 2010;6(1, article 8). doi: 10.1186/1745-7580-6-8.
  • 47. Diez-Rivero CM, Chenlo B, Zuluaga P, Reche PA. Quantitative modeling of peptide binding to TAP using support vector machine. Proteins. 2010;78(1):63–72. doi: 10.1002/prot.22535.
  • 48. Martínez-Naves E, Lafuente EM, Reche PA. Recognition of the ligand-type specificity of classical and non-classical MHC I proteins. FEBS Letters. 2011;585(21):3478–3484. doi: 10.1016/j.febslet.2011.10.007.
  • 49. Bhasin M, Reinherz EL, Reche PA. Recognition and classification of histones using support vector machine. Journal of Computational Biology. 2006;13(1):102–112. doi: 10.1089/cmb.2006.13.102.
  • 50. Reche PA, Reinherz EL. PEPVAC: a web server for multi-epitope vaccine development based on the prediction of supertypic MHC ligands. Nucleic Acids Research. 2005;33(2):W138–W142. doi: 10.1093/nar/gki357.
  • 51. Lafuente EM, Reche PA. Prediction of MHC-peptide binding: a systematic and comprehensive overview. Current Pharmaceutical Design. 2009;15(28):3209–3220. doi: 10.2174/138161209789105162.
  • 52. Gaulton A, Bellis LJ, Bento AP, et al. A large-scale bioactivity database for drug discovery. Nucleic Acids Research. 2012:D1100–D1107. doi: 10.1093/nar/gkr777.
  • 53. Cropper WH. Great Physicists: The Life and Times of Leading Physicists from Galileo to Hawking. New York, NY, USA: Oxford University Press; 2004.
  • 54. Bouzarth EL, Brooks A, Camassa R, et al. Epicyclic orbits in a viscous fluid about a precessing rod: theory and experiments at the micro-and macro-scales. Physical Review E. 2007;76(1, part 2), article 016313. doi: 10.1103/PhysRevE.76.016313.
  • 55. Laberge M. Intrinsic protein electric fields: basic non-covalent interactions and relationship to protein-induced Stark effects. Biochimica et Biophysica Acta. 1998;1386(2):305–330. doi: 10.1016/s0167-4838(98)00100-9.
  • 56. Fernandez FM, Morales JA. Perturbation theory without wave functions for the Zeeman effect in hydrogen. Physical Review A. 1992;46(1):318–326. doi: 10.1103/physreva.46.318.
  • 57. Marshall BD, Chapman WG. Thermodynamic perturbation theory for associating fluids with small bond angles: effects of steric hindrance, ring formation, and double bonding. Physical Review E. 2013;87(5), article 052307, 12 pages. doi: 10.1103/PhysRevE.87.052307.
  • 58. Gonzalez-Diaz H, Arrasate S, Gomez-San Juan A, et al. New theory for multiple input-output perturbations in complex molecular systems. 1. Linear QSPR electronegativity models in physical, organic, and medicinal chemistry. Current Topics in Medicinal Chemistry. 2013. doi: 10.2174/1568026611313140011.
  • 59. Gonzalez-Diaz H, Arrasate S, Sotomayor N, et al. MIANN models in medicinal, physical and organic chemistry. Current Topics in Medicinal Chemistry. 2013;13(5):619–641. doi: 10.2174/1568026611313050006.
  • 60. Box GEP, Jenkins GM. Time Series Analysis: Forecasting and Control. San Francisco, Calif, USA: Holden-Day; 1970.
  • 61. Speck-Planche A, Kleandrova VV, Luan F, Cordeiro MN. Unified multi-target approach for the rational in silico design of anti-bladder cancer agents. Anti-Cancer Agents in Medicinal Chemistry. 2013;13(5):791–800. doi: 10.2174/1871520611313050013.
  • 62. Speck-Planche A, Kleandrova VV, Cordeiro MN. New insights toward the discovery of antibacterial agents: multi-tasking QSBER model for the simultaneous prediction of anti-tuberculosis activity and toxicological profiles of drugs. The European Journal of Pharmaceutical Sciences. 2013;48(4-5):812–818. doi: 10.1016/j.ejps.2013.01.011.
  • 63. Speck-Planche A, Kleandrova VV, Luan F, Cordeiro MN. Multi-target inhibitors for proteins associated with Alzheimer: in silico discovery using fragment-based descriptors. Current Alzheimer Research. 2013;10(2):117–124. doi: 10.2174/1567205011310020001.
  • 64. Speck-Planche A, Kleandrova VV, Luan F, Cordeiro MN. In silico discovery and virtual screening of multi-target inhibitors for proteins in Mycobacterium tuberculosis. Combinatorial Chemistry & High Throughput Screening. 2012;15(8):666–673. doi: 10.2174/138620712802650487.
  • 65. Speck-Planche A, Kleandrova VV, Luan F, Cordeiro MNDS. Chemoinformatics in anti-cancer chemotherapy: multi-target QSAR model for the in silico discovery of anti-breast cancer agents. The European Journal of Pharmaceutical Sciences. 2012;47(1):273–279. doi: 10.1016/j.ejps.2012.04.012.


Articles from Journal of Immunology Research are provided here courtesy of Wiley
