Table 5.
Data augmentation and sampling results across all datasets: rare event prediction.
PM | Dataset | All (Aug. + Tomek Samp.) | All (Aug. + ENN Samp.) | All (Aug. + ADASYN Samp.) | Aug. Only | Samp. Only-Tomek | Samp. Only-ENN | Samp. Only-ADASYN | No Method |
---|---|---|---|---|---|---|---|---|---|
P | P and P | 0.87 | 0.72 | 0.88 | 0.9 | 0.86 | 0.6 | 0.7 | 0.7 |
P | FF-O | 1 | | | | | | | |
R | P and P | 0.72 | 0.74 | 0.7 | 0.78 | 0.73 | 0.64 | 0.63 | 0.7 |
R | FF-O | 1 | 1 | 1 | 0.98 | 0.85 | 0.91 | 0.87 | 0.67 |
F1 | P and P | 0.78 | 0.73 | 0.76 | 0.83 | 0.77 | 0.62 | 0.65 | 0.7 |
F1 | FF-O | 1 | 0.97 | 0.99 | 0.99 | 0.76 | 0.73 | 0.76 | 0.73 |
S | P and P | 3655 (3608 + 47) | | | | | | | |
S | FF-O | 42,460 (42,407 + 53) | | | | | | | |
A | P and P | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.98 | 0.99 | 0.98 |
A | FF-O | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
P and P: pulp-and-paper; BS: Bosch; APS: Air Pressure Systems; BR: Ball bearing; FF-S: Future factories sampled; FF-O: Future factories original; PM: Performance metric; P: Precision; R: Recall; F1: F1 Score; A: Accuracy; S: Support; Aug.: Augmentation; Samp.: Sampling. Bold numbers indicate the best performance.
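
As a reading aid for the metrics in Table 5, the following minimal Python sketch computes precision, recall, F1, and accuracy from confusion-matrix counts. It assumes that in the support entry "3655 (3608 + 47)" the smaller count (47) is the rare-event class; the function name `binary_metrics` and the always-negative classifier are hypothetical illustrations, not the method used to produce the table, and the averaging scheme behind the reported values is not stated here.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (precision, recall, f1, accuracy) for the positive class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy


# Assumption: 47 of the 3655 pulp-and-paper samples are rare events.
N_NORMAL, N_RARE = 3608, 47

# Hypothetical classifier that never predicts the rare class:
# every rare event is a false negative, every normal sample a true negative.
precision, recall, f1, accuracy = binary_metrics(
    tp=0, fp=0, fn=N_RARE, tn=N_NORMAL
)

print(f"precision={precision:.2f} recall={recall:.2f} "
      f"f1={f1:.2f} accuracy={accuracy:.3f}")
```

Under these assumptions the accuracy stays close to 1 even though no rare event is detected, which is one reason precision, recall, and F1 are more informative than accuracy for rare event prediction.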