Table 5.
Performance of each mitigation technique by gender: average balanced accuracy, average false negative rate (FNR), and average F1-score.
| Technique | Female sample balanced accuracy, mean (SD) | Male sample balanced accuracy, mean (SD) | Female sample FNR | Male sample FNR | Female sample F1-score | Male sample F1-score |
| --- | --- | --- | --- | --- | --- | --- |
| Original classifier (LogReg<sup>a</sup>) | 0.793 (0.06) | 0.948 (0.02) | 0.136 | 0.012 | 0.820 | 0.968 |
| Disparate impact remover | 0.793 (0.03) | 0.941 (0.01) | 0.154 | 0.017 | 0.816 | 0.963 |
| Reweighting | 0.797 (0.03) | 0.932 (0.03) | 0.111 | 0.040 | 0.827 | 0.956 |
| Calibrated equalized odds | 0.793 (0.05) | 0.948 (0.02) | 0.136 | 0.012 | 0.820 | 0.968 |
<sup>a</sup>LogReg: logistic regression.
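The table reports three metrics separately for female and male samples. The source does not show how they were computed, so the following is only a minimal sketch that assumes the standard binary-classification definitions: balanced accuracy as the mean of the true positive and true negative rates, FNR as FN / (FN + TP), and F1 as 2TP / (2TP + FP + FN). The function name `group_metrics` and the toy labels are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def group_metrics(y_true, y_pred, groups):
    """Per-group balanced accuracy, FNR, and F1 for binary labels (1 = positive).

    Hypothetical helper for illustration; not the authors' implementation.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        # "t"/"f": prediction correct or not; "p"/"n": predicted class.
        key = ("t" if t == p else "f") + ("p" if p == 1 else "n")
        counts[g][key] += 1

    results = {}
    for g, c in counts.items():
        tp, fp, tn, fn = c["tp"], c["fp"], c["tn"], c["fn"]
        tpr = tp / (tp + fn) if tp + fn else 0.0  # sensitivity
        tnr = tn / (tn + fp) if tn + fp else 0.0  # specificity
        f1_den = 2 * tp + fp + fn
        results[g] = {
            "balanced_accuracy": (tpr + tnr) / 2,
            "fnr": fn / (tp + fn) if tp + fn else 0.0,
            "f1": 2 * tp / f1_den if f1_den else 0.0,
        }
    return results


# Toy data (made up), five samples per group.
y_true = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 1, 0, 0, 1]
groups = ["female"] * 5 + ["male"] * 5

metrics = group_metrics(y_true, y_pred, groups)
for group, values in metrics.items():
    print(group, {name: round(v, 3) for name, v in values.items()})
```

Computing the metrics per group, rather than once over the pooled sample, is what makes the female and male columns directly comparable, for example when reading the FNR gap between genders for each technique.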