Table 2. Descriptions of the four balancing procedures.
Dataset | Training set (samples) | Test set (samples) | Class balancing |
---|---|---|---|
Dataset 1 | 975,036 | 193,528 | Tomek links removed from the full dataset (before the split into training and test sets); random undersampling applied to class 3 after the split; SMOTE then applied to classes 1 and 2 to make their cardinalities equal to that of class 3 (325,012). |
Dataset 2 | 2,293,119 | 201,926 | SMOTE applied to classes 1 and 2 to make their cardinalities equal to that of class 3 (764,373). |
Dataset 3 | 487,464 | 106,028 | Tomek links removed from the full dataset (before the split into training and test sets); random undersampling applied to class 3 after the split; SMOTE then applied to classes 1 and 2 to make their cardinalities equal to that of class 3 (162,488). |
Dataset 4 | 1,462,503 | 281,028 | Tomek links removed from the full dataset (before the split into training and test sets); random undersampling applied to class 3 after the split; SMOTE then applied to classes 1 and 2 to make their cardinalities equal to that of class 3 (487,501). |
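
The training-set sizes in Table 2 can be checked against the stated balancing target. The sketch below assumes that classes 1, 2 and 3 are the only classes, so that a balanced training set holds three times the class 3 cardinality; the names `TABLE_2` and `balanced_training_size` are illustrative and do not come from the original work.

```python
# Figures copied from Table 2:
# dataset name -> (training-set size, class 3 cardinality after balancing).
TABLE_2 = {
    "dataset 1": (975_036, 325_012),
    "dataset 2": (2_293_119, 764_373),
    "dataset 3": (487_464, 162_488),
    "dataset 4": (1_462_503, 487_501),
}

# Assumption: the table only mentions classes 1, 2 and 3.
N_CLASSES = 3


def balanced_training_size(class_3_cardinality, n_classes=N_CLASSES):
    """Size of a training set in which every class matches class 3."""
    return n_classes * class_3_cardinality


for name, (train_size, class_3) in TABLE_2.items():
    # Each reported training-set size should equal 3 x the class 3 cardinality.
    assert balanced_training_size(class_3) == train_size, name
```

Under that assumption, all four reported training-set sizes are consistent with every class being brought to the class 3 cardinality.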