Table 1. AI biases and how to fix them
Phase | Ethical issues | Recommendation |
---|---|---|
Data creation | Data that are unrepresentative of the target population will lead to algorithms that poorly serve much of that population. | The context in which data are created, including the original purpose of data generation, areas of representation, and efforts to tackle barriers to inclusion, should be published alongside any AI dataset. |
Data acquisition | Incomplete data collection, mandatory redactions, and a lack of information on financial and privacy protections limit the ability to assess whether a data source is fit for purpose and meets ethical standards. | Expand regulatory and transparency requirements to enable sharing of the essential demographic and health information, and the logistical details of data acquisition, needed to develop reliable AI tools. |
Model development | Model optimization choices, such as data labeling, inclusion and exclusion criteria, and assessment benchmarks, can perpetuate discrimination against some groups and prevent marginalized populations from benefiting from the model. | Technical specifications of the AI model and any fairness deliberations should be required reporting criteria and made publicly available, allowing the model development process to be fully replicated and inspected. |
Model evaluation | Limited evaluation of algorithm performance and fairness as a fully integrated tool within a clinical workflow can mask potential societal harms the algorithm might cause to certain populations once deployed. | A comprehensive set of evaluation metrics, including fairness evaluations (a minimal illustration follows this table), should be required by regulatory leaders, including journal editors, hospital administrators, and federal agencies. |
Model deployment | Algorithms are deployed with minimal prior understanding of their impact on and risks to populations, of implementation requirements, and of the resources needed for their success. Shortcomings in transparency, generalizability, and fairness can lead to inequitable real-world consequences. | Focused studies on the challenges and successes of AI deployment, including identifying standards and/or metrics to measure success, are needed, especially in low-resource healthcare settings. |
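
To make the fairness evaluations recommended for the model-evaluation phase concrete, the sketch below compares a binary classifier's selection rate and true positive rate across demographic groups and reports the largest between-group gaps (demographic parity and equal opportunity). This is a minimal, self-contained illustration rather than a regulatory standard or a method described in this article; the labels, predictions, and group names are hypothetical placeholders.

```python
from collections import defaultdict

def subgroup_rates(y_true, y_pred, groups):
    """Per-group selection rate and true positive rate for binary predictions."""
    stats = defaultdict(lambda: {"n": 0, "pos_pred": 0, "tp": 0, "actual_pos": 0})
    for yt, yp, g in zip(y_true, y_pred, groups):
        s = stats[g]
        s["n"] += 1                # group size
        s["pos_pred"] += yp        # positive predictions
        s["actual_pos"] += yt      # actual positives
        s["tp"] += yt and yp       # true positives
    return {
        g: {
            "selection_rate": s["pos_pred"] / s["n"],
            "true_positive_rate": (s["tp"] / s["actual_pos"]) if s["actual_pos"] else float("nan"),
        }
        for g, s in stats.items()
    }

def fairness_gaps(report):
    """Largest between-group differences: demographic parity (selection rate)
    and equal opportunity (true positive rate)."""
    sel = [r["selection_rate"] for r in report.values()]
    tpr = [r["true_positive_rate"] for r in report.values()]
    return {
        "demographic_parity_gap": max(sel) - min(sel),
        "equal_opportunity_gap": max(tpr) - min(tpr),
    }

if __name__ == "__main__":
    # Hypothetical labels, model predictions, and demographic group tags.
    y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 0, 0]
    groups = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]

    report = subgroup_rates(y_true, y_pred, groups)
    print(report)
    print(fairness_gaps(report))
```

In practice, the same per-group comparison extends to any clinically relevant metric (sensitivity, calibration, positive predictive value) and should be reported alongside aggregate performance.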