Artificial intelligence (AI) is transforming healthcare. The promise of AI is enormous, and it is already being realized: AI is increasing the accuracy of diagnoses, promoting patient engagement, identifying patients at risk of disease, improving the efficiency of healthcare clinicians, and lowering healthcare costs [5]. These all are forces for good.
AI also has the potential to exacerbate health disparities [1]. Landmark research by Obermeyer et al. [21] showed racial bias against Black patients in algorithms used to predict which patients have complex health needs. Such risk models can be used to prioritize patients with more severe medical comorbidities for earlier visits with their primary care physician, giving them greater access to clinicians and additional supportive resources [7]. These models used healthcare spending as a surrogate for severity of illness, on the assumption that the patients with the most severe illness and comorbidities used the healthcare system the most. This flaw biased the model: Because less money was spent caring for Black patients than for White patients, Black patients were less likely than White patients to be identified as candidates for proactive outreach by their primary care team to improve management of their medical conditions. Obermeyer et al. [21] calculated that correcting this inaccurate modeling would have increased the percentage of Black patients receiving additional services from 17.7% to 46.5%. Some clinical algorithms also have been identified as racially biased. For example, pulmonary function tests traditionally included a “race correction,” which resulted in the underdiagnosis of pulmonary disease, in both prevalence and severity, in Black patients [19]. These are clinically important findings.
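The mechanism of this bias can be made concrete with a toy simulation. The sketch below is purely illustrative (hypothetical numbers, not the actual model or data from Obermeyer et al.): two groups have identical true illness severity, but one incurs lower spending at every severity level because of an assumed access gap. An algorithm that flags the highest spenders for outreach then systematically overlooks that group.

```python
import random

random.seed(0)

# Hypothetical illustration: two groups with identical true illness
# severity, but Group B incurs lower spending at every severity level
# (an assumed access gap), so spending understates Group B's need.
def simulate(group, n=10_000):
    rows = []
    for _ in range(n):
        severity = random.uniform(0, 10)          # true medical need
        access = 1.0 if group == "A" else 0.6     # assumed access gap
        spending = severity * access + random.gauss(0, 0.5)
        rows.append({"group": group, "severity": severity, "spending": spending})
    return rows

patients = simulate("A") + simulate("B")

# A spending-based algorithm flags the top 10% of spenders for outreach --
# the flaw described above, with spending standing in for illness severity.
patients.sort(key=lambda r: r["spending"], reverse=True)
flagged = patients[: len(patients) // 10]

share_b = sum(r["group"] == "B" for r in flagged) / len(flagged)
print(f"Group B share of flagged patients: {share_b:.0%}")  # far below 50%

# Ranking on true severity instead would restore parity, mirroring the
# 17.7% -> 46.5% correction that Obermeyer et al. [21] calculated.
```

In this sketch, both groups are equally ill, yet almost no Group B patients clear the spending cutoff; the bias lives entirely in the choice of label, not in any explicit use of group membership.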
But well-designed AI tools may also help us address disparities. Studies have found that Black patients report higher levels of pain than White patients with the same degree of knee osteoarthritis [4, 20, 26]. Proposed explanations for this difference include higher levels of depression and obesity in Black patients. AI researchers using a deep-learning approach produced a new algorithmic measure of osteoarthritis severity from radiographs and showed that this tool dramatically reduced the discordance between pain and radiographic findings in Black patients. Although radiologist-read radiographs accounted for only 9% of racial disparities in pain, the AI model accounted for 43% of pain disparities, or 4.7 times more [22]. New AI tools may also advance health equity by supporting the identification of social determinants of health and by measuring improvement in the delivery of fairer, better care [6, 28].
Although studies have shown that AI can detect a patient’s sex and age from radiographic imaging studies, and age, sex, hypertension, and smoking status from retinal images [13, 24], one AI study merits particular attention because of an unexpected finding. In 2022, Emory researchers created a deep-learning model to evaluate whether race (White, Black, or Asian) could be determined from standard images such as chest, cervical spine, and hand radiographs; chest CT images; and mammograms [13]. To create the model, the researchers took large imaging datasets and labeled each imaging study with the patient’s self-reported race, meaning that the models knew the self-reported race of the patient behind each image. By doing this, the models learned to associate race with imaging findings.
The researchers then tested the models against new, large imaging datasets, which contained no race or patient demographic information. The AI-driven deep-learning model accurately determined self-reported race across these multiple imaging modalities (for example, 94% to 99% for chest radiographs, 80% to 96% for chest CT, and 78% to 81% for mammography). This predictive ability persisted at greater than 90% accuracy even when the researchers controlled for factors that they thought might influence the race prediction, such as BMI, disease distribution, breast density, and bone density. Even when the image quality was degraded, Black or White race generally was still correctly determined. “We don’t know how the machines are detecting race so we can’t develop an easy solution,” Dr. Gichoya said in a press release [9]. I personally contacted Dr. Gichoya, and she confirmed this quote. Let that sink in: The creator of the AI model cannot explain how it can so accurately predict race from an image.
The concern is that AI will incorporate race, as determined by AI, into predictions of treatment success. If Dr. Gichoya does not know how the machines are detecting race from deidentified images in her own study, how can we be confident that AI models are not self-learning factors that embed racism (or sexism or some other bias) into their outputs? Will AI model developers (and users) even be aware of such embedded bias? Of interest, my ChatGPT query about whether race can be determined from a chest x-ray returned, “No, AI cannot reliably detect a person’s race from a chest x-ray” (https://chat.openai.com). I suggest that ChatGPT update its algorithms.
This is more relevant to orthopaedic surgery than you may think. Although clinical decision support systems are built on evidence-based guidelines and expert consensus, AI models use statistical associations built on variables and interactions of those variables, some of which may not be apparent to clinicians responsible for the care of their patients [8]. Deep-learning models to predict the likelihood of a patient undergoing TKA based on knee radiographs have already been developed [16]. I do not know whether this model (and potentially others) may be self-learning to identify patient race and incorporate that into a likely biased prediction of patients who would be appropriate candidates for TKA.
Recognizing that surgical decision-making is not just about radiographic findings, a machine-learning model was developed to predict TKA based on pain, function, quality of life, and radiographic data. This model had a positive predictive value of 84% and a negative predictive value of 73% in assessing patients who may undergo TKA in the next 2 years [15]. I find this model exciting: If we can provide patients with a glimpse into their likely future, perhaps we can better engage them to address modifiable risk factors for disease progression such as levels of physical activity.
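The predictive values quoted above follow directly from a model’s confusion matrix. The sketch below uses hypothetical counts chosen only to reproduce the reported figures from the TKA model [15]; it is not the study’s actual data.

```python
# Positive and negative predictive value from a confusion matrix.
# The counts are hypothetical, chosen only to reproduce the reported
# PPV (84%) and NPV (73%) of the TKA prediction model [15].
tp, fp = 84, 16    # predicted likely to undergo TKA: 84 did, 16 did not
tn, fn = 73, 27    # predicted unlikely to undergo TKA: 73 did not, 27 did

ppv = tp / (tp + fp)   # of patients predicted to undergo TKA, share who did
npv = tn / (tn + fn)   # of patients predicted not to, share who did not

print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")  # PPV = 84%, NPV = 73%
```

One caveat worth remembering when reading such figures: unlike sensitivity and specificity, PPV and NPV shift with the prevalence of the outcome, so the same model would report different predictive values in a population where TKA is more or less common.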
However, models are also being created that make me uneasy. Scottish researchers are now developing an AI machine-learning model to “correctly and rapidly select suitable patients for (hip and knee) arthroplasty surgery.” Using approximately 2000 patients who have undergone hip or knee replacement, a dataset will be built incorporating demographic, clinical, and surgeon decision-making information. This dataset will then be used to train the AI model to develop, from radiographs, pattern classification and probabilistic prediction models of a successful surgical outcome (functional improvement and lack of surgical complications). In their paper on the development of this model, Farrow et al. [10] wrote, “This project provides a first step toward delivering an automated solution for arthroplasty selection using routinely collected health care data.” The results of the study applying this AI machine-learning model to clinical practice are expected to be published in the first quarter of 2024. Of note, information on the diversity of the 2000 patients used to build the model is not (yet) available [10].
I appreciate the potential for such a tool to identify priority patients for surgical referral and thus to improve efficiency of the healthcare system. But when an AI model is built upon the current method for determining the appropriate candidates for arthroplasty, there is a clear risk of embedding bias. For example, the model created by the Scottish team incorporates the Scottish Index of Multiple Deprivation [25], which identifies whether a patient lives in a low-income area. If Scottish surgeons currently are less likely to offer TKA to patients from such areas, then the model risks embedding bias against low-income patients. The model also includes data on the likelihood that a patient will avoid a postoperative complication, which may discriminate against patients with higher comorbidities. Shared decision-making should remain a cornerstone of the surgeon-patient interaction; it cannot be replaced by an AI tool.
Algorithms are already in wide use to influence the utilization of medical services, and serious concerns have been raised. In a class-action lawsuit, Cigna disputes allegations that its PXDX (“procedure-to-diagnosis”) algorithm automatically rejected more than 300,000 patient claims, stating that the program does not result in care denials [2]. Likewise, UnitedHealthcare, using AI developed by NaviHealth, is being sued for allegedly using an algorithm with a 90% error rate to override physicians’ recommendations, leading to the inappropriate discharge of Medicare Advantage patients from rehabilitation centers. UnitedHealthcare states this lawsuit has no merit [3, 23]. As these lawsuits work their way through the courts, perhaps payers will evaluate how AI models are used to approve or deny medical services. Additional pressure on payers will likely come from President Biden’s October 2023 executive order on safe, secure, and trustworthy AI to address algorithmic discrimination and protect consumers [27].
In 2022, the FDA determined that certain AI tools should be regulated as medical devices [12, 18]. In December 2023, the federal Office of the National Coordinator for Health IT released a 900-plus page final rule [14]. It will require that by the end of 2024, AI developers provide clinical users of decision support interventions with a “baseline set of information about the algorithms they use to support their decision making and to assess such algorithms for fairness, appropriateness, validity, effectiveness, and safety” [11]. This information includes whether patient demographic, social determinants of health, and health assessment data are incorporated into decision support intervention criteria. Requiring AI models to be explicit about the patient characteristics used to create their datasets is important to ensure adequate representation of diverse populations [17]. This government action is to be applauded.
AI holds the potential for great benefit in medicine as well as unrecognized harm. We cannot stop, and should not desire to stop, the amazing progress of AI. But we need to hold AI to standards that promote better healthcare for all. If we don’t understand and control the power of AI, my fear is that we will move even further away from our goal of the best healthcare for all our patients.
Footnotes
A note from the Editor-in-Chief: I am pleased to present the next installment of “Equity360: Gender, Race, and Ethnicity,” written by Mary I. O’Connor MD, FAOA, FAAHKS, FAAOS. Dr. O’Connor is Chair of Movement is Life, a multistakeholder coalition committed to health equity, co-founder and Chief Medical Officer at Vori Health, Professor Emerita of Orthopedics at Mayo Clinic, and Past Professor of Orthopaedics and Rehabilitation at Yale School of Medicine. She has written extensively on increasing the number of women and underrepresented minorities in orthopaedics and other social issues. Her column will unravel the complex and controversial motives behind disparities in musculoskeletal medicine across sex, gender, race, and ethnicity.
The author certifies that there are no funding or commercial associations (consultancies, stock ownership, equity interest, patent/licensing arrangements, etc.) that might pose a conflict of interest in connection with the submitted article related to the author or any immediate family members.
All ICMJE Conflict of Interest Forms for authors and Clinical Orthopaedics and Related Research® editors and board members are on file with the publication and can be viewed on request.
The opinions expressed are those of the writer, and do not reflect the opinion or policy of CORR® or The Association of Bone and Joint Surgeons®.
References
- 1.Abràmoff MD, Tarver ME, Loyo-Berrios N, et al. Considerations for addressing bias in artificial intelligence for health equity. NPJ Digit Med. 2023;6:170. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.AI Incident Database. Incident 591: Cigna algorithm PXDX allegedly rejected thousands of patient claims en masse in breach of California law. Available at: https://incidentdatabase.ai/cite/591. Accessed January 3, 2024.
- 3.AI Incident Database. Incident 608: UnitedHealth accused of deploying allegedly flawed AI to deny medical coverage. Available at: https://incidentdatabase.ai/cite/608. Accessed January 3, 2024.
- 4.Allen KD, Helmick CG, Schwartz TA, DeVellis RF, Renner JB, Jordan JM. Racial differences in self-reported pain and function among individuals with radiographic hip and knee osteoarthritis: the Johnston County Osteoarthritis Project. Osteoarthritis Cartilage. 2009;17:1132-1136. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Alowais SA, Alghamdi SS, Alsuhebany N, et al. Revolutionizing healthcare: the role of artificial intelligence in clinical practice. BMC Med Educ. 2023;23:689. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Alvee. We give you the tools to advance health equity. Available at: https://www.alvee.io. Accessed January 3, 2024.
- 7.Commonwealth Fund. Can AI improve health without perpetuating bias? Available at: https://www.commonwealthfund.org/publications/podcast/2023/apr/can-ai-improve-health-without-perpetuating-bias. Accessed January 3, 2024.
- 8.DeCamp M, Lindvall C. Mitigating bias in AI at the point of care. Science. 2023;381:150-152. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Emory University. AI systems can detect patient race, creating new opportunities to perpetuate health disparities. Available at: https://news.emory.edu/stories/2022/05/hs_ai_systems_detect_patient_race_27-05-2022/story.html. Accessed January 3, 2024.
- 10.Farrow L, Ashcroft GP, Zhong M, Anderson L. Using Artificial Intelligence to Revolutionise the Patient Care Pathway in Hip and Knee Arthroplasty (ARCHERY): protocol for the development of a clinical prediction model. JMIR Res Protoc. 2022;11:e37092. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Fierce Healthcare. HHS pushes forward with new requirements for AI transparency, interoperability. Available at: https://www.fiercehealthcare.com/regulatory/hhs-finalizes-rule-move-needle-interoperability-algorithm-transparency. Accessed January 3, 2024.
- 12.Food and Drug Administration. Clinical decision support software. Guidance for industry and Food and Drug Administration staff. Available at: https://www.fda.gov/media/109618/download. Accessed January 3, 2024.
- 13.Gichoya JW, Banerjee I, Bhimireddy AR, et al. AI recognition of patient race in medical imaging: a modelling study. Lancet Digit Health. 2022;4:e406-e414. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.HealthIT.gov. Health data, technology, and interoperability: certification program updates, algorithm transparency, and information sharing (HTI-1) final rule. Available at: https://www.healthit.gov/topic/laws-regulation-and-policy/health-data-technology-and-interoperability-certification-program. Accessed January 3, 2024.
- 15.Heisinger S, Hitzl W, Hobusch GM, Windhager R, Cotofana S. Predicting total knee replacement from symptomology and radiographic structural change using artificial neural networks-data from the Osteoarthritis Initiative (OAI). J Clin Med. 2020;9:1298. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Leung K, Zhang B, Tan J, et al. Prediction of total knee replacement and diagnosis of osteoarthritis by using deep learning on knee radiographs: data from the Osteoarthritis Initiative. Radiology. 2020;296:584-593. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Lown Institute. Leveraging AI to reduce health disparities: a closer look at the possibilities. Available at: https://lowninstitute.org/leveraging-ai-to-reduce-health-disparities-a-closer-look-at-the-possibilities/. Accessed January 3, 2024.
- 18.Med Device Online. The FDA regulatory landscape for AI in medical devices. Available at: https://www.meddeviceonline.com/doc/the-fda-regulatory-landscape-for-ai-in-medical-devices-0001. Accessed January 3, 2024.
- 19.Moffett AT, Bowerman C, Stanojevic S, Eneanya ND, Halpern SD, Weissman GE. Global, race-neutral reference equations and pulmonary function test interpretation. JAMA Netw Open. 2023;6:e2316174. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Nemati D, Keith N, Kaushal N. Investigating the relationship between physical activity disparities and health-related quality of life among black people with knee osteoarthritis. Prev Chronic Dis. 2023;20:E56. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Obermeyer Z, Powers B, Vogeli C, Mullainathan S. Dissecting racial bias in an algorithm used to manage the health of populations. Science. 2019;366:447-453. [DOI] [PubMed] [Google Scholar]
- 22.Pierson E, Cutler DM, Leskovec J, Mullainathan S, Obermeyer Z. An algorithmic approach to reducing unexplained pain disparities in underserved populations. Nat Med. 2021;27:136-140. [DOI] [PubMed] [Google Scholar]
- 23.Pifer R. UnitedHealth sued over use of algorithm to deny care for MA members. Available at: https://www.healthcaredive.com/news/unitedhealth-algorithm-lawsuit-care-denials/699834/. Accessed January 3, 2024.
- 24.Rim TH, Lee G, Kim Y, et al. Prediction of systemic biomarkers from retinal photographs: development and validation of deep-learning algorithms. Lancet Digit Health. 2020;2:e526-e536. [DOI] [PubMed] [Google Scholar]
- 25.Scottish Government. Scottish index of multiple deprivation 2020. Available at: https://www.gov.scot/collections/scottish-index-of-multiple-deprivation-2020/. Accessed January 3, 2024.
- 26.Simkin J, Valentino J, Cao W, et al. Quantifying mediators of racial disparities in knee osteoarthritis outcome scores: a cross-sectional analysis. JB JS Open Access. 2021;6:e21.00004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.The White House. Fact sheet: President Biden issues executive order on safe, secure, and trustworthy artificial intelligence. Available at: https://www.whitehouse.gov/briefing-room/statements-releases/2023/10/30/fact-sheet-president-biden-issues-executive-order-on-safe-secure-and-trustworthy-artificial-intelligence/. Accessed January 3, 2024.
- 28.Unite US. Provider solutions. Available at: https://uniteus.com/solutions/providers/. Accessed January 3, 2024.
