Table 1.
Algorithm | Fairness → 1.0 (race) | Accuracy ↑ (race) | Fairness → 1.0 (gender) | Accuracy ↑ (gender) |
---|---|---|---|---|
GP | 0.80 ± 0.07 | 0.888 ± 0.007 | 0.54 ± 0.05 | 0.900 ± 0.006 |
LR | 0.83 ± 0.06 | 0.884 ± 0.007 | 0.52 ± 0.03 | 0.898 ± 0.003 |
SVM | 0.89 ± 0.06 | 0.899 ± 0.004 | 0.49 ± 0.05 | 0.913 ± 0.004 |
FairGP (ours) | 0.86 ± 0.07 | 0.888 ± 0.006 | 0.87 ± 0.09 | 0.902 ± 0.007 |
FairLR (ours) | 0.90 ± 0.06 | 0.874 ± 0.009 | 0.93 ± 0.04 | 0.886 ± 0.012 |
ZafarAccuracy (Zafar et al., 2017b) | 0.67 ± 0.17 | 0.808 ± 0.016 | 0.77 ± 0.08 | 0.853 ± 0.017 |
ZafarFairness (Zafar et al., 2017b) | 0.81 ± 0.06 | 0.879 ± 0.009 | 0.74 ± 0.11 | 0.897 ± 0.004 |
Kamiran and Calders (2012) | 0.87 ± 0.07 | 0.882 ± 0.007 | 0.96 ± 0.03 | 0.900 ± 0.004 |
Agarwal et al. (2018) | 0.86 ± 0.08 | 0.883 ± 0.008 | 0.65 ± 0.04 | 0.900 ± 0.004 |
Fairness is defined as PR_{s=0}/PR_{s=1}, the ratio of positive rates between the two sensitive-attribute groups (a completely fair model would achieve a value of 1.0). Left: race as the sensitive attribute. Right: gender as the sensitive attribute. Values are the mean ± std of 10 repeated experiments.
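The fairness metric in the caption can be computed directly from binary predictions and a binary sensitive attribute. A minimal sketch (the function name `fairness_ratio` is illustrative, not from the paper; it assumes `s` takes values 0 and 1 and both groups are non-empty):

```python
import numpy as np

def fairness_ratio(y_pred, s):
    """Ratio of positive rates PR_{s=0} / PR_{s=1}; 1.0 means perfect parity."""
    y_pred = np.asarray(y_pred)
    s = np.asarray(s)
    pr0 = y_pred[s == 0].mean()  # positive rate in group s = 0
    pr1 = y_pred[s == 1].mean()  # positive rate in group s = 1
    return pr0 / pr1

# toy example: 8 predictions, first 4 from group s=0, last 4 from s=1
y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])
s      = np.array([0, 0, 0, 0, 1, 1, 1, 1])
# PR_{s=0} = 3/4, PR_{s=1} = 1/4, so the ratio is 3.0 (far from fair)
```

Note the ratio can exceed 1.0 when group s=0 receives positive predictions more often; some implementations instead report min(PR_{s=0}, PR_{s=1}) / max(PR_{s=0}, PR_{s=1}) so the score is bounded by 1.0.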