2023 Apr 10;24(3):bbad114. doi: 10.1093/bib/bbad114

Table 1.

Description of the data sets used to train and evaluate epitope1D

| Data Set | Original Method | #Epitopes | #Non Epitopes | Experimentally Defined | Data Source | Peptide Length | Validation |
|---|---|---|---|---|---|---|---|
| General benchmark data sets | | | | | | | |
| BCPred | BCPred | 701 | 701 | Only Epitopes | Bcipep and SwissProt | 20-mer | Cross-validation |
| ABCPred-1 | ABCPred | 700 | 700 | Only Epitopes | Bcipep and SwissProt | 20-mer | Blind |
| ABCPred-2 | ABCPred | 187 | 200 | Only Epitopes | Bcipep/SDAP and SwissProt | Assorted | Blind |
| AAP | AAP | 872 | 872 | Only Epitopes | Bcipep and SwissProt | 20-mer | Blind |
| LBtope | LBtope | 7,824 | 7,853 | Yes | IEDB | 20-mer | Blind |
| iBCE-EL-1 | iBCE-EL | 4,440 | 5,485 | Yes | IEDB | Assorted | Blind |
| iBCE-EL-2 | iBCE-EL | 1,110 | 1,408 | Yes | IEDB | Assorted | Blind |
| New curated large-scale benchmark data set | | | | | | | |
| epitope1D Training | epitope1D | 20,638 | 103,281 | Yes | IEDB | Assorted | Cross-validation |
| epitope1D Testing | epitope1D | 5,264 | 25,716 | Yes | IEDB | Assorted | Blind |

"Data Set" is the name used to refer to each set throughout the text; "Original Method" is the method from which the set is derived; "#Epitopes" and "#Non Epitopes" give the number of labeled peptides in each class; "Experimentally Defined" indicates whether the data from both classes were experimentally assessed; "Data Source" specifies the database from which the set was extracted; "Peptide Length" indicates the size of the peptides within the set (a fixed length, or "Assorted" when lengths vary); "Validation" indicates whether the set is used for cross-validation or for blind testing.
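
The class counts in Table 1 can be used to check how balanced each data set is. The following is a minimal sketch, not part of the epitope1D work: the names `TABLE_1_COUNTS` and `epitope_fraction` are illustrative assumptions, and the numbers are copied directly from the table. It shows that the general benchmark sets are roughly balanced, whereas the epitope1D sets contain about one epitope for every five non-epitopes.

```python
# Illustrative sketch: class balance of the data sets listed in Table 1.
# TABLE_1_COUNTS and epitope_fraction are hypothetical names chosen for
# this example; the counts are (epitopes, non-epitopes) from the table.

TABLE_1_COUNTS = {
    "BCPred": (701, 701),
    "ABCPred-1": (700, 700),
    "ABCPred-2": (187, 200),
    "AAP": (872, 872),
    "LBtope": (7824, 7853),
    "iBCE-EL-1": (4440, 5485),
    "iBCE-EL-2": (1110, 1408),
    "epitope1D Training": (20638, 103281),
    "epitope1D Testing": (5264, 25716),
}


def epitope_fraction(epitopes: int, non_epitopes: int) -> float:
    """Return the share of peptides in a set that are labeled as epitopes."""
    return epitopes / (epitopes + non_epitopes)


for name, (n_pos, n_neg) in TABLE_1_COUNTS.items():
    frac = epitope_fraction(n_pos, n_neg)
    print(f"{name:20s} epitopes={n_pos:6d} non-epitopes={n_neg:7d} "
          f"epitope fraction={frac:.3f}")
```

An imbalance of this size is one reason to look beyond plain accuracy when comparing results on the epitope1D sets with results on the balanced benchmarks.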