2023 Apr 10;24(3):bbad114. doi: 10.1093/bib/bbad114

Table 1.

Description of the data sets applied to train and evaluate epitope1D

Data Set	Original Method	#Epitopes	#Non Epitopes	Experimentally Defined	Data Source	Peptide Length	Validation
General benchmark data sets
BCPred	BCPred	701	701	Only Epitopes	Bcipep and SwissProt	20-mer	Cross-validation
ABCPred-1	ABCPred	700	700	Only Epitopes	Bcipep and SwissProt	20-mer	Blind
ABCPred-2	ABCPred	187	200	Only Epitopes	Bcipep/SDAP and SwissProt	Assorted	Blind
AAP	AAP	872	872	Only Epitopes	Bcipep and SwissProt	20-mer	Blind
LBtope	LBtope	7,824	7,853	Yes	IEDB	20-mer	Blind
iBCE-EL-1	iBCE-EL	4,440	5,485	Yes	IEDB	Assorted	Blind
iBCE-EL-2	iBCE-EL	1,110	1,408	Yes	IEDB	Assorted	Blind
New curated large-scale benchmark data set
epitope1D Training	epitope1D	20,638	103,281	Yes	IEDB	Assorted	Cross-validation
epitope1D Testing	epitope1D	5,264	25,716	Yes	IEDB	Assorted	Blind

The first column, "Data Set", gives the name by which we refer to each set throughout the text; "Original Method" is the method from which the set was originally derived; "#Epitopes" and "#Non Epitopes" give the number of labeled peptides in each class; "Experimentally Defined" indicates whether the data in both classes were experimentally assessed ("Only Epitopes" means that only the epitope class was); "Data Source" specifies the database(s) from which the set was extracted; "Peptide Length" gives the size of the peptides within the data set (a fixed length, or "Assorted" when lengths vary); "Validation" designates whether we use the set for cross-validation or blind testing.
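
The class balance implied by the counts in Table 1 can be checked with a short calculation. The following is a minimal sketch, not part of the epitope1D pipeline: the names `DATA_SETS`, `epitope_fraction` and `summarize` are hypothetical and introduced here only for illustration, and the counts are copied directly from the table.

```python
# Illustrative sketch only: the names below are hypothetical and are not
# taken from the epitope1D code base.
# Counts copied from Table 1 as (epitopes, non-epitopes).
DATA_SETS = {
    "BCPred": (701, 701),
    "ABCPred-1": (700, 700),
    "ABCPred-2": (187, 200),
    "AAP": (872, 872),
    "LBtope": (7824, 7853),
    "iBCE-EL-1": (4440, 5485),
    "iBCE-EL-2": (1110, 1408),
    "epitope1D Training": (20638, 103281),
    "epitope1D Testing": (5264, 25716),
}


def epitope_fraction(epitopes: int, non_epitopes: int) -> float:
    """Return the share of peptides in a set that are labeled as epitopes."""
    return epitopes / (epitopes + non_epitopes)


def summarize(data_sets: dict) -> dict:
    """Map each data set name to its epitope fraction, rounded to 3 decimals."""
    return {
        name: round(epitope_fraction(pos, neg), 3)
        for name, (pos, neg) in data_sets.items()
    }


if __name__ == "__main__":
    for name, fraction in summarize(DATA_SETS).items():
        print(f"{name:<20s} {fraction:.3f}")
```

With these counts, the older benchmark sets are balanced or close to balanced, whereas in the epitope1D training and testing sets epitopes make up roughly one sixth of the peptides.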