Table 2.
Mitigation Strategy | Accuracy | AUC | Precision | Recall | Specificity | F1 Score | AUPRC | BCE Loss |
---|---|---|---|---|---|---|---|---|
Undersample | 0.774 | 0.897 | 0.496 | 0.842 | 0.754 | 0.624 | 0.776 | 0.465 |
None | 0.875 | 0.916 | 0.789 | 0.604 | 0.954 | 0.684 | 0.798 | 0.310 |
Oversample | 0.865 | 0.918 | 0.682 | 0.744 | 0.900 | 0.711 | 0.807 | 0.318 |
To undersample, we sampled from the majority class to reduce it to the size of the minority class. To oversample, we sampled from the minority class with replacement to inflate it to the size of the majority class. Using the data without either measure gave the lowest BCE loss (0.310, versus 0.318 with oversampling and 0.465 with undersampling), so we used neither oversampling nor undersampling for our final model.
AUC = area under the receiver operating characteristic curve; AUPRC = area under the precision-recall curve; BCE = binary cross-entropy; F1 = harmonic mean of precision and recall.
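The undersampling and oversampling procedures described in the note to Table 2 can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `resample_to_balance`, its arguments, and the fixed random seed are all assumptions, and the sketch assumes binary labels held in plain Python lists.

```python
import random
from collections import Counter


def resample_to_balance(rows, labels, strategy, seed=0):
    """Return a class-balanced copy of (rows, labels) for binary labels.

    Hypothetical helper for illustration only.
    strategy: "undersample", "oversample", or "none".
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")

    # most_common() orders classes from most to least frequent.
    (majority, _), (minority, _) = counts.most_common()
    maj_idx = [i for i, lab in enumerate(labels) if lab == majority]
    min_idx = [i for i, lab in enumerate(labels) if lab == minority]

    if strategy == "undersample":
        # Sample the majority class without replacement,
        # down to the size of the minority class.
        maj_idx = rng.sample(maj_idx, k=len(min_idx))
    elif strategy == "oversample":
        # Sample the minority class with replacement,
        # up to the size of the majority class.
        min_idx = rng.choices(min_idx, k=len(maj_idx))
    elif strategy != "none":
        raise ValueError(f"unknown strategy: {strategy!r}")

    idx = maj_idx + min_idx
    rng.shuffle(idx)  # avoid leaving the classes in two contiguous blocks
    return [rows[i] for i in idx], [labels[i] for i in idx]
```

For example, with 8 majority and 2 minority examples, `"undersample"` returns 2 examples of each class, `"oversample"` returns 8 of each, and `"none"` returns the original 10 examples in shuffled order.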