Table 1.
Cause | Effect | Diagnosis | Mitigation |
---|---|---|---|
Label noise, label biases, selection biases | estimator bias, uninformative performance estimates | domain expertise, analyze label correlation with proxy variables [35], gather higher-fidelity labels [22, 31] | use other target variables [31, 35], bias-robust learning techniques [44, 45, 46] |
Concept shift: differences in $p(y \mid x)$ between groups | estimator bias | investigate effects of group balancing and model stratification [22, 47] | use stratified model [22, 47], gather additional features |
Low model expressivity, differences in between groups | estimator bias | investigate effects of group balancing [22, 47] and increasing model expressivity | increase model expressivity |
Underrepresentation and highly expressive model | high estimator variance | epistemic uncertainty quantification [15, 16], analysis of sample size-performance relationship per group [13, 48] | gather more samples [48, 49], decrease model expressivity, regularize |
High task difficulty | high irreducible error | aleatoric uncertainty quantification [15, 16], analysis of sample size-performance relationship per group [13, 48] | gather additional or alternative features [31, 50, 51], reformulate prediction task or target population |
While the diagnosis and mitigation approaches listed here can help address bias in practice, they do not come with guarantees, and improving them remains an open research problem. The list of potential causes of performance differences is not exhaustive.
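As an illustration of one diagnostic named in the table, the analysis of the sample size-performance relationship per group, the following Python sketch may be helpful. It is not taken from the cited works. The synthetic data, the toy one-dimensional threshold classifier, and all function names (`make_data`, `fit_threshold`, `accuracy`, `per_group_learning_curve`) are assumptions made purely for illustration. For each group, the sketch keeps the other groups' training data fixed, varies how many of the group's own samples are included, and records accuracy on that group's test samples.

```python
import random
from statistics import mean


def make_data(rng, n_per_group):
    """Synthetic (x, y, group) triples; group B's feature is shifted by +1 (assumption)."""
    data = []
    for group, shift in (("A", 0.0), ("B", 1.0)):
        for i in range(n_per_group):
            y = i % 2  # alternate labels so both classes are always present
            data.append((rng.gauss(2.0 * y + shift, 1.0), y, group))
    return data


def fit_threshold(samples):
    """Toy classifier: decision threshold halfway between the two class means."""
    pos = [x for x, y, _ in samples if y == 1]
    neg = [x for x, y, _ in samples if y == 0]
    return (mean(pos) + mean(neg)) / 2


def accuracy(threshold, samples):
    """Fraction of samples for which 'x > threshold' agrees with the label."""
    return mean(int((x > threshold) == (y == 1)) for x, y, _ in samples)


def per_group_learning_curve(train, test, sizes, seed=0):
    """For each group, return [(n, accuracy on that group's test data), ...],
    where n is the number of the group's own training samples that were used.
    Samples from all other groups are always included in training."""
    rng = random.Random(seed)
    curves = {}
    for g in sorted({grp for _, _, grp in train}):
        own = [s for s in train if s[2] == g]
        rest = [s for s in train if s[2] != g]
        rng.shuffle(own)
        test_g = [s for s in test if s[2] == g]
        curves[g] = [
            (n, accuracy(fit_threshold(rest + own[:n]), test_g)) for n in sizes
        ]
    return curves


if __name__ == "__main__":
    rng = random.Random(0)
    train, test = make_data(rng, 400), make_data(rng, 400)
    curves = per_group_learning_curve(train, test, sizes=[0, 50, 200, 400])
    for g, curve in curves.items():
        print(g, [(n, round(acc, 3)) for n, acc in curve])
```

A curve that is still rising at the largest available sample size would be consistent with underrepresentation, whereas a curve that has flattened at a low level points toward one of the other causes in the table, such as high task difficulty or limited model expressivity. In a real study, the toy classifier would be replaced by the actual model and the curves averaged over several random subsamples.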