Table 1.
Data Set | Original Method | #Epitopes | #Non Epitopes | Experimentally Defined | Data Source | Peptide Length | Validation |
---|---|---|---|---|---|---|---|
General benchmark data sets | | | | | | | |
BCPred | BCPred | 701 | 701 | Only Epitopes | Bcipep and SwissProt | 20-mer | Cross-validation |
ABCPred-1 | ABCPred | 700 | 700 | Only Epitopes | Bcipep and SwissProt | 20-mer | Blind |
ABCPred-2 | ABCPred | 187 | 200 | Only Epitopes | Bcipep/SDAP and SwissProt | Assorted | Blind |
AAP | AAP | 872 | 872 | Only Epitopes | Bcipep and SwissProt | 20-mer | Blind |
LBtope | LBtope | 7,824 | 7,853 | Yes | IEDB | 20-mer | Blind |
iBCE-EL-1 | iBCE-EL | 4,440 | 5,485 | Yes | IEDB | Assorted | Blind |
iBCE-EL-2 | iBCE-EL | 1,110 | 1,408 | Yes | IEDB | Assorted | Blind |
New curated large-scale benchmark data set | | | | | | | |
epitope1D Training | epitope1D | 20,638 | 103,281 | Yes | IEDB | Assorted | Cross-validation |
epitope1D Testing | epitope1D | 5,264 | 25,716 | Yes | IEDB | Assorted | Blind |
The "Data Set" column gives the name used to refer to each set throughout the text; "Original Method" is the method from which the set is derived; "#Epitopes" and "#Non Epitopes" give the number of labeled peptides in each class; "Experimentally Defined" indicates whether both classes were experimentally assessed ("Yes") or only the epitopes were ("Only Epitopes"); "Data Source" specifies the database or databases from which the set was extracted; "Peptide Length" gives the peptide length when it is fixed, or "Assorted" when lengths vary; "Validation" indicates whether we use the set for cross-validation or for blind testing.