Author manuscript; available in PMC: 2025 Jun 24.
Published in final edited form as: Eur J Cancer. 2025 Apr 1;220:115394. doi: 10.1016/j.ejca.2025.115394

Global disparities in artificial intelligence-based mammogram interpretation for breast cancer: A scientometric analysis of representation, trends, and equity

Isabele A Miyawaki a,*, Imon Banerjee b,c,d, Felipe Batalini e, Carlos A Campello Jorge f, Leo A Celi g,h,i, Marisa Cobanaj j, Edward C Dee k, Judy W Gichoya l, Zaphanlene Kaffey m, Maxwell R Lloyd n, Lucas McCullum m,o, Sruthi Ranganathan p, Chiara Corti q,r,s,t
PMCID: PMC12185668  NIHMSID: NIHMS2090753  PMID: 40209572

Abstract

Background:

Breast cancer (BC) is the most frequently diagnosed cancer and the leading cause of cancer death among women worldwide. Artificial intelligence (AI) shows promise for improving mammogram interpretation, especially in resource-limited settings. However, concerns remain regarding the diversity of datasets and the representation of researchers in AI model development, which may affect the models’ generalizability, fairness, and equity.

Methods:

We performed a scientometric analysis of studies published in 2017, 2018, 2022, and 2023 that used screening or diagnostic mammograms for BC detection to train or validate AI algorithms. PubMed (MEDLINE) and EMBASE were searched in July 2024. Data extraction focused on patient cohort sociodemographics (including age and race/ethnicity), geographic distribution (categorized by World Bank country income levels and regions), and author profiles (sex, affiliation, and funding sources).

Results:

Of 5774 studies identified, 264 met the inclusion criteria. The number of studies increased from 28 in 2017 to 115 in 2023—a 311 % increase. Despite this growth, only 0–25 % of studies reported race/ethnicity, with most patients identified as Caucasian. Moreover, nearly all patient cohorts originated from high-income countries, with no studies from low-income settings. Author affiliations were predominantly from high-income regions, and gender imbalance was observed among first and last authors.

Conclusion:

The lack of racial, ethnic, and geographic diversity in both datasets and researcher representation could undermine the generalizability and fairness of AI-based mammogram interpretation. Addressing these disparities through diverse dataset collection and inclusive international collaborations is critical to ensuring equitable improvements in breast cancer care.

Keywords: Breast cancer, Mammogram, Radiology, Mammography, Artificial intelligence, Scientometric

1. Introduction

Breast cancer (BC) remains the most frequently diagnosed cancer and the leading cause of cancer death among women worldwide, and its burden is expected to grow as many countries, especially those with low to medium Human Development Index (HDI), undergo demographic shifts [1,2]. Between 2022 and 2050, annual rates of female BC are projected to increase by 1–5 % in many countries [3]. Although mortality rates are expected to decline in 29 very high-HDI countries, new cases and deaths will have risen by 38 % and 68 %, respectively, disproportionately affecting low-HDI countries [3]. In this context, countries with low and medium HDI need high-quality cancer data, continual improvements in early detection, and, most importantly, increased access to treatment to reduce inequities and track progress toward cancer control goals [3,5].

Artificial intelligence (AI) has proven to be a valuable tool for predicting, screening, classifying, and diagnosing BC [6–10]. For example, growing evidence indicates that AI algorithms applied to screening and diagnostic mammograms can help alleviate shortages of radiologists and technicians—particularly in low-resource settings—without sacrificing performance [4,9,11]. In fact, a recent systematic review and meta-analysis found that a deep learning-based AI algorithm reduced radiologists’ workload by 68 % while maintaining high sensitivity in screening mammograms [12]. Additionally, a prospective, population-based, paired-reader non-inferiority trial enrolled 55,581 women aged 40–74 years undergoing routine screening in Stockholm and compared standard-of-care double reading by two radiologists with alternative strategies, including one radiologist plus AI, AI alone, and two radiologists plus AI [13]. Replacing one radiologist with AI not only maintained non-inferior cancer detection rates but achieved a 4 % higher detection rate, supporting the potential for controlled AI integration to reduce workload without compromising screening performance [13].

While enthusiasm for AI systems and their potential to support radiologists in evaluating mammograms continues to grow, the shortcomings of some AI models, including their capacity to reinforce bias, are increasingly recognized [5,7,14–16]. In fact, pitfalls throughout the entire AI lifecycle—from problem definition and dataset selection and curation to model training, deployment in healthcare systems, and ongoing monitoring—can introduce bias, resulting from a combination of human and machine factors [4,14].

A growing issue is that AI development and validation, primarily conducted in high-income countries, often rely on training datasets that lack diversity and undergo limited external validation [14,17,18]. Additionally, existing legal frameworks fail to encourage developers to transparently disclose patient dataset compositions or to include diverse populations, thereby perpetuating data-centric biases in AI development [14].

In this scientometric review, we examine the current state of algorithm development and validation in AI-supported radiology for mammography, focusing on disparities, socioeconomic representation, and research trends, while considering the BC burden in each country during the two-year periods 2017–2018 and 2022–2023.

2. Methods

Because no specific checklist exists for reporting scientometric reviews, we conducted this analysis using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) items applicable to scientometric reviews (Supplementary File S1). No institutional ethical approval was required.

2.1. Study eligibility criteria

We limited this scientometric analysis to studies that met the following criteria: (1) use of screening or diagnostic mammograms specifically for BC detection to train or validate an AI model, (2) presentation of an AI-based algorithm in the results, and (3) publication in 2017, 2018, 2022, or 2023. We excluded conference abstracts and studies with overlapping AI algorithm descriptions. These specific years were selected to capture trends in mammogram-based AI in recent years while avoiding potential distortions in scientific development during the COVID-19 pandemic by omitting 2019, 2020, and 2021.

2.2. Search strategy and study selection

We searched PubMed (MEDLINE) and EMBASE for studies meeting our eligibility criteria from database inception to July 2024. Because PubMed’s custom year filters occasionally fail, the search was run without date restrictions through July 2024; two authors (IAM, SR) then manually excluded articles published after December 31, 2023, to maintain methodological consistency.

Our search strategy used the following query: (“artificial intelligence” OR AI OR “deep learning” OR DL OR “machine learning” OR “neural network” OR “fuzzy expert system” OR “evolutionary computation” OR “expert system” OR “convolutional neural network” OR CNN) AND (mammography OR mammogram* OR mammographic OR DM) AND (breast) AND (cancer OR tumor OR carcinoma OR neoplasm). The final search was conducted on July 4, 2024.
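
As an illustrative sketch (not the authors’ actual tooling), the four concept groups can be assembled into the boolean query programmatically; the term lists below mirror the query string quoted above:

```python
# Sketch: assembling the boolean search query from the four concept
# groups described in Section 2.2 (AI, modality, site, disease).
ai_terms = ['"artificial intelligence"', 'AI', '"deep learning"', 'DL',
            '"machine learning"', '"neural network"', '"fuzzy expert system"',
            '"evolutionary computation"', '"expert system"',
            '"convolutional neural network"', 'CNN']
modality_terms = ['mammography', 'mammogram*', 'mammographic', 'DM']
site_terms = ['breast']
disease_terms = ['cancer', 'tumor', 'carcinoma', 'neoplasm']

def or_group(terms):
    """Join synonyms with OR and wrap the group in parentheses."""
    return '(' + ' OR '.join(terms) + ')'

# Concept groups are combined with AND, synonyms within a group with OR.
query = ' AND '.join(or_group(g) for g in
                     [ai_terms, modality_terms, site_terms, disease_terms])
print(query)
```

Keeping the synonym lists in code makes it straightforward to rerun the identical query against both databases.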

2.3. Data extraction

We extracted sociodemographic data from each study’s cohort by manually recording the sample size, whether race/ethnicity was reported, the percentage distribution of patients by race/ethnicity, the country and income level of the patients, the dataset name, and whether the study received funding (IAM, IB, FB, CACJ, MC, ECD, ZK, MRL, LM, SR, CC). Countries were categorized by income (low, lower-middle, upper-middle, and high) according to the World Bank classification [19].
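
The income-level tagging step might be sketched as follows; the country-to-group mapping here is a small hypothetical excerpt for illustration, not the full World Bank table:

```python
# Illustrative sketch: tagging each study cohort with a World Bank
# income level. This mapping is a hypothetical excerpt only.
INCOME_GROUP = {
    "United States": "High",
    "Sweden": "High",
    "China": "Upper middle",
    "India": "Lower middle",
    "Ethiopia": "Low",
}

def classify_cohort(countries):
    """Return the income group for each country in a study's cohort(s);
    one study may contribute datasets from several countries."""
    return [INCOME_GROUP.get(c, "Unknown") for c in countries]

print(classify_cohort(["Sweden", "India"]))  # a study with two datasets
```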

Additional scoping data were obtained using Dimensions AI software (2025 Digital Science & Research Solutions Inc.). For example, we analyzed the characteristics of first and last authors, including their sex, race/ethnicity, country income level, and region. Regions were categorized according to the World Bank: East Asia and Pacific, Europe and Central Asia, Latin America and Caribbean, Middle East and North Africa, North America, South Asia, and Sub-Saharan Africa.

Additionally, we collected and analyzed funding declarations, including whether the support came from academic or industry sources.

2.4. Data analysis

Data were systematically extracted from each study into a predesigned spreadsheet and analyzed post hoc using pivot tables. Continuous variables were summarized as means with standard deviations for normally distributed data or medians with interquartile ranges for skewed data. Categorical variables were reported as frequencies and percentages.
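
A minimal sketch of these post hoc summaries, using hypothetical extracted values in place of the study spreadsheet:

```python
# Sketch of the descriptive summaries described above, on made-up data.
from statistics import mean, stdev, quantiles
from collections import Counter

ages = [52.1, 54.3, 56.0, 58.2, 61.5]            # hypothetical cohort mean ages
income = ["High", "High", "Upper middle", "Lower middle", "High"]

# Continuous variable: mean (SD) for roughly normal data,
# median [IQR] for skewed data.
print(f"mean {mean(ages):.1f} (SD {stdev(ages):.1f})")
q1, q2, q3 = quantiles(ages, n=4)                # quartile cut points
print(f"median {q2:.1f} [IQR {q1:.1f}-{q3:.1f}]")

# Categorical variable: frequency and percentage per level.
counts = Counter(income)
for level, n in counts.items():
    print(f"{level}: {n} ({100 * n / len(income):.0f} %)")
```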

3. Results

3.1. Data search and inclusion

We identified 5774 records; after removing 2569 duplicates and 2427 articles published outside the 2017–2018 or 2022–2023 periods, 778 studies remained for screening (Figure 1). After screening abstracts and full texts, 264 studies were included in the scientometric review. A complete list of these studies and their baseline characteristics is available in Supplementary File S2.

Fig. 1. PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow chart of study records.


The full list of screened papers is available in Supplementary File S2. Abbreviations: n, number; AI, artificial intelligence.

3.2. Sociodemographic characteristics of studies

Table 1 presents the sociodemographic distribution (age and race/ethnicity) of patient cohorts from the included studies for 2017, 2018, 2022, and 2023. From 2017 to 2023, the number of studies developing or validating AI algorithms using mammograms increased from 28 to 115 (a 311 % increase); across two-year periods, it rose from 52 in 2017–2018 to 212 in 2022–2023 (a 308 % increase). Despite this growth, the percentage of studies reporting race/ethnicity data remained low, ranging from 0 % in 2017 to 25 % in 2022. In 2022 and 2023, White patients made up the largest proportion (67 % and 70 %, respectively), while Black and Hispanic patients were underrepresented. In contrast, studies from 2018 reported similar proportions of Asian (32 %) and White (39 %) patients.
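
The growth figures quoted above can be confirmed with a quick arithmetic check (a reader’s sketch, not part of the original analysis):

```python
# Percentage increase, rounded to the nearest whole percent.
def pct_increase(old, new):
    return round(100 * (new - old) / old)

print(pct_increase(28, 115))   # single years, 2017 -> 2023: 311
print(pct_increase(52, 212))   # two-year periods, 2017-2018 -> 2022-2023: 308
```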

Table 1.

Sociodemographic representation of the cohort. Percentages across ethnic groups are reported with reference to the total sample size calculated by summing the participants from studies that specifically reported race.

Year Studies (n) Reported sample size (n) Studies that reported race n (%) Reported sample size (studies reporting race only) n (%) White n (%) Black n (%) Hispanic n (%) Asian n (%) Other n (%)
2017 28 423,619 0 (0) 0 (0) NA NA NA NA NA
2018 24 20,841 4 (17) 9,627 (46) 3,749 (39) 275 (3) 437 (4) 3,129 (32) 2,037 (21)
2022 97 1,973,106 24 (25) 1,152,176 (58) 783,427 (67) 43,083 (4) 62,926 (5) 159,361 (14) 103,379 (9)
2023 115 678,841 12 (10) 512,518 (75) 359,864 (70) 65,693 (13) 25,246 (5) 51,515 (10) 10,200 (2)

Abbreviations: NA, not applicable.

3.3. Geographic distribution of studies

Table 2 and Figure 2 display the geographical distribution of patient cohorts, with studies categorized by country, income level, and region. Across all four years, no studies were conducted in low-income countries. In 2017, 2018, and 2022, most studies took place in high-income countries (63 %, 68 %, and 77 %, respectively). In 2023, however, a similar proportion of studies were conducted in high-income (48 %) and upper-middle-income countries (43 %; Table 2).

Table 2.

Geographical representation of the cohorts according to the dataset’s country income. Counts in each row are summed across all represented countries, as one study may include multiple cohorts with datasets from different countries; percentages are calculated against that sum. As a result, the counts in a row may exceed the number of studies in the first column.

Year Studies (n) High n (%) Upper middle n (%) Lower middle n (%) Low n (%)
2017 28 17 (63) 6 (22) 4 (15) 0 (0)
2018 24 19 (68) 7 (25) 2 (7) 0 (0)
2022 97 108 (77) 21 (15) 11 (8) 0 (0)
2023 115 54 (48) 48 (43) 11 (10) 0 (0)

Fig. 2. Geographical distribution of patient cohorts by dataset’s region.


Percentages are calculated based on the sum of represented geographical regions across all studies in each year of reference, as one study may include multiple cohorts with datasets from different countries. Legend: light blue, East Asia Pacific; dark blue, Europe and Central Asia; yellow, Middle East and North Africa; orange, North America; green, South Asia; pink, Sub-Saharan Africa. The study year and number of studies for each evaluated period are displayed in the lower left corner of each map. Regional classifications follow the World Bank criteria.

Overall, the majority of studies were conducted in the East Asia Pacific (21–48 %), Europe and Central Asia (16–32 %), and North America (23–37 %) regions (Figure 2).

3.4. Sociodemographic profiles of study authors

Table 3 presents the sociodemographic characteristics of the first and last authors, including their sex and institutional location. The percentage of male first authors ranged from 47 % to 56 %, while male last authors ranged from 57 % to 75 %. Notably, the percentage of male last authors decreased from 75 % in 2017 to 57 % in 2023. In both groups, the majority of authors were affiliated with institutions in high-income countries, with no first authors from low-income countries. Furthermore, the largest proportion of both first and last authors were affiliated with institutions in North America, Europe and Central Asia, and East Asia Pacific.

Table 3.

Sociodemographic characteristics of first and last authors.

Year First Author Sex n (%) First Author Country Income n (%) First author Country Region n (%) Last Author Sex n (%) Last Author Country Income n (%) Last Author Country Region n (%)
2017 Males: 18 (56.2); Females: 11 (34.4); Unknown: 3 (9.4) H: 18 (56.2); UM: 8 (25.0); LM: 2 (6.2); L: 0 (0.0) EAP: 6 (18.8); ECA: 6 (18.8); LAC: 3 (9.4); MENA: 2 (6.2); NA: 9 (28.1); SA: 2 (6.2); SSA: 0 (0.0) Males: 24 (75.0); Females: 7 (21.9); Unknown: 1 (3.1) H: 19 (59.4); UM: 7 (21.9); LM: 2 (6.2); L: 0 (0.0) EAP: 6 (18.8); ECA: 6 (18.8); LAC: 3 (9.4); MENA: 2 (6.2); NA: 9 (28.1); SA: 2 (6.2); SSA: 0 (0.0)
2018 Males: 12 (50.0); Females: 9 (37.5); Unknown: 3 (12.5) H: 14 (58.3); UM: 6 (25.0); LM: 4 (16.7); L: 0 (0.0) EAP: 5 (20.8); ECA: 6 (25.0); LAC: 1 (4.2); MENA: 2 (8.3); NA: 7 (29.2); SA: 3 (12.5); SSA: 0 (0.0) Males: 15 (65.2); Females: 5 (21.7); Unknown: 3 (13.1) H: 14 (60.9); UM: 3 (13.0); LM: 6 (26.1); L: 0 (0.0) EAP: 3 (13.0); ECA: 5 (21.7); LAC: 1 (4.3); MENA: 3 (13.0); NA: 7 (30.4); SA: 4 (17.4); SSA: 0 (0.0)
2022 Males: 49 (47.1); Females: 36 (34.6); Unknown: 19 (18.3) H: 44 (42.3); UM: 22 (21.2); LM: 19 (18.3); L: 0 (0.0) EAP: 28 (26.9); ECA: 18 (17.3); LAC: 0 (0.0); MENA: 6 (5.8); NA: 16 (15.4); SA: 16 (15.4); SSA: 1 (1.0) Males: 60 (57.7); Females: 31 (29.8); Unknown: 13 (12.5) H: 47 (45.6); UM: 21 (20.4); LM: 16 (15.5); L: 1 (1.0) EAP: 27 (26.0); ECA: 19 (18.3); LAC: 0 (0.0); MENA: 8 (7.7); NA: 17 (16.3); SA: 12 (11.5); SSA: 2 (1.9)
2023 Males: 65 (52.8); Females: 33 (26.8); Unknown: 25 (20.4) H: 57 (46.3); UM: 50 (40.7); LM: 16 (13.0); L: 0 (0.0) EAP: 56 (45.5); ECA: 25 (20.3); LAC: 5 (4.1); MENA: 6 (4.9); NA: 20 (16.3); SA: 10 (8.1); SSA: 1 (0.8) Males: 67 (57.3); Females: 27 (23.1); Unknown: 23 (19.6) H: 58 (49.6); UM: 42 (35.9); LM: 17 (14.5); L: 0 (0.0) EAP: 48 (41.0); ECA: 27 (23.1); LAC: 4 (3.4); MENA: 8 (6.8); NA: 20 (17.1); SA: 9 (7.7); SSA: 1 (0.9)

Abbreviations: H, high; UM, upper middle; LM, lower middle; L, low; EAP, East Asia Pacific; ECA, Europe and Central Asia; LAC, Latin America and the Caribbean; MENA, Middle East and North Africa; NA, North America; SA, South Asia; SSA, Sub-Saharan Africa.

3.5. Supporting funding for authors

Figure 3 shows the proportion of authors with academic and/or industry funding across the four years of AI algorithm studies in mammography for BC. In all years, academically funded authors comprised the largest group, ranging from 51 % in 2023 to 70 % in 2017. Authors with industry funding were noted only in 2022 (4 %).

Fig. 3. Funding sources for authors.


Abbreviations: N, number.

4. Discussion

This scientometric study examined the sociodemographic and geographic landscape of AI algorithm development and validation for mammogram interpretation in 2017, 2018, 2022, and 2023. Publication numbers increased markedly in 2022 and 2023 (212 articles) compared with 2017 and 2018 (52 articles). However, socioeconomic and ethnic representation—including cohort diversity and first- and last-author affiliations—remained consistently low across these years.

A major concern in this analysis is the underrepresentation of racial and ethnic diversity in datasets used to develop AI algorithms for mammogram interpretation. Only a minority of studies (0–25 %) reported the race of their cohort, with most patients identified as Caucasian. This lack of transparency and limited diversity compromise the generalizability, fairness, and equity of these models in clinical settings, as algorithms trained predominantly on Caucasian populations may perform poorly for other groups, potentially leading to inaccurate results and disparities in care [4]. Furthermore, current legal frameworks fall short in addressing this issue; for example, the Food and Drug Administration (FDA) does not require AI developers to disclose dataset compositions or ensure inclusion of diverse populations [20,21].

Moreover, across all years, most patient cohorts came from high-income countries and high-HDI regions—such as East Asia Pacific, Europe and Central Asia, and North America—with no datasets from low-income countries. This geographic bias in dataset development significantly limits the equitable application of AI in BC mammography-based evaluations, given the vast differences in healthcare infrastructure, imaging access, and population characteristics between lower- and high-income countries [22].

Indeed, AI models developed using data from high-income environments may not perform effectively in low-resource settings, limiting their global impact on breast cancer care [14]. Additionally, the challenges in resource-poor settings often differ from those in resource-rich ones [4]. Therefore, AI is not a one-size-fits-all solution for equitable radiographic interpretation of screening and diagnostic mammograms, and its training and implementation require careful, context-specific design.

High-income countries dominate author affiliations, with no first authors from low-income countries, highlighting a systemic imbalance in AI research participation. This lack of inclusivity risks omitting critical perspectives needed to develop algorithms that serve diverse patient populations [14]. Consequently, the underrepresentation of researchers from lower-income regions may lead to models that overlook the unique challenges of these populations, perpetuating inequities in breast imaging analyses and, ultimately, in patient outcomes.

The observed predominance of male first and last authors across all analyzed years highlights a gender imbalance in AI research for mammogram interpretation in BC, which could influence the priorities and perspectives incorporated into model development. Ensuring diverse representation among researchers is essential for creating AI tools that adopt comprehensive, patient-centered approaches tailored to the varied needs of different populations [4,14,23].

5. Conclusion

In conclusion, the underrepresentation of racial and ethnic diversity in datasets for AI-driven mammogram interpretation could compromise the generalizability, fairness, and equity of these models in clinical settings. As algorithms trained predominantly on Caucasian populations may yield inaccurate results and misdiagnoses in underrepresented groups, patient outcomes may be ultimately jeopardized, and existing disparities may be reinforced. Moreover, the fairness of these AI tools is called into question, as they risk systematically disadvantaging certain racial, ethnic, or sociodemographic groups. To mitigate these issues and ensure that the benefits of AI in BC imaging are equitably distributed, it is essential to prioritize diversity in dataset collection, foster international collaborations that include researchers from lower- and middle-income countries, and actively incorporate diverse populations in clinical research.

Supplementary Material

Appendix A. Supplementary material: Supplementary File S1 - PRISMA Checklist
Appendix A. Supplementary material: Supplementary File S2 Table. Full list of included studies with the baseline characteristics for each study.

Appendix A. Supporting information

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.ejca.2025.115394.

Funding

The authors did not receive support from any organization for the submitted work. ECD is funded in part through the Prostate Cancer Foundation Young Investigator Award and through the Cancer Center Support Grant from the National Cancer Institute (P30 CA008748). LM is supported by a National Institutes of Health (NIH) Diversity Supplement (R01CA257814-02S2). CC is supported by the Fondazione Gianni Bonadonna (FGB) and Associazione Italiana per la Ricerca contro il Cancro (AIRC) (2024-2026).

Declaration of Competing Interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:

Isabele A. Miyawaki has no potential conflicts of interest to disclose.

Imon Banerjee has no potential conflicts of interest to disclose.

Felipe Batalini has no potential conflicts of interest to disclose.

Carlos A. Campello Jorge has no potential conflicts of interest to disclose.

Leo A. Celi has no potential conflicts of interest to disclose.

Marisa Cobanaj has no potential conflicts of interest to disclose.

Edward C. Dee is funded in part through the Prostate Cancer Foundation Young Investigator Award and through the Cancer Center Support Grant from the National Cancer Institute (P30 CA008748).

Judy W. Gichoya has no potential conflicts of interest to disclose.

Zaphanlene Kaffey has no potential conflicts of interest to disclose.

Maxwell R. Lloyd has no potential conflicts of interest to disclose.

Lucas McCullum is supported by a National Institutes of Health (NIH) Diversity Supplement (R01CA257814-02S2).

Sruthi Ranganathan has no potential conflicts of interest to disclose.

Chiara Corti reports travel/accommodations (to scientific meeting) from Veracyte (2023). CC is supported by the Fondazione Gianni Bonadonna (FGB) and Associazione Italiana per la Ricerca contro il Cancro (AIRC) (2024–2026).

Footnotes

CRediT authorship contribution statement

Isabele A. Miyawaki: Conceptualization, Methodology, Visualization, Writing – original draft, Writing – review & editing. Imon Banerjee: Conceptualization, Methodology, Writing – review & editing. Felipe Batalini: Conceptualization, Methodology, Writing – review & editing. Carlos A. Campello Jorge: Conceptualization, Methodology, Writing – review & editing. Leo Anthony Celi: Conceptualization, Supervision, Writing – review & editing. Marisa Cobanaj: Conceptualization, Methodology, Writing – review & editing. Edward C. Dee: Conceptualization, Methodology, Writing – review & editing. Judy W. Gichoya: Writing – review & editing. Zaphanlene Kaffey: Conceptualization, Methodology, Writing – review & editing. Maxwell R. Lloyd: Conceptualization, Methodology, Writing – review & editing. Lucas McCullum: Conceptualization, Methodology, Writing – review & editing. Sruthi Ranganathan: Conceptualization, Methodology, Writing – review & editing. Chiara Corti: Conceptualization, Supervision, Writing – review & editing.

References

  • [1]. Siegel RL, Kratzer TB, Giaquinto AN, et al. Cancer statistics, 2025. CA Cancer J Clin 2025;75(1):10–45. 10.3322/caac.21871.
  • [2]. Arnold M, Morgan E, Rumgay H, et al. Current and future burden of breast cancer: global statistics for 2020 and 2040. Breast 2022;66:15–23. 10.1016/j.breast.2022.08.010.
  • [3]. Kim J, Harper A, McCormack V, et al. Global patterns and trends in breast cancer incidence and mortality across 185 countries. Nat Med 2025. 10.1038/s41591-025-03502-3.
  • [4]. Cobanaj M, Corti C, Dee EC, et al. Advancing equitable and personalized cancer care: novel applications and priorities of artificial intelligence for fairness and inclusivity in the patient care workflow. Eur J Cancer 2024;198:113504. 10.1016/j.ejca.2023.113504.
  • [5]. Ranard BL, Park S, Jia Y, et al. Minimizing bias when using artificial intelligence in critical care medicine. J Crit Care 2024;82:154796. 10.1016/j.jcrc.2024.154796.
  • [6]. Ghosh A. Artificial intelligence using open source BI-RADS data exemplifying potential future use. J Am Coll Radiol 2019;16(1):64–72. 10.1016/j.jacr.2018.09.040.
  • [7]. Corti C, Cobanaj M, Marian F, et al. Artificial intelligence for prediction of treatment outcomes in breast cancer: systematic review of design, reporting standards, and bias. Cancer Treat Rev 2022;108:102410. 10.1016/j.ctrv.2022.102410.
  • [8]. Liu L, Feng W, Chen C, et al. Classification of breast cancer histology images using MSMV-PFENet. Sci Rep 2022;12(1):17447. 10.1038/s41598-022-22358-y.
  • [9]. Schaffter T, Buist DSM, Lee CI, et al. Evaluation of combined artificial intelligence and radiologist assessment to interpret screening mammograms. JAMA Netw Open 2020;3(3):e200265. 10.1001/jamanetworkopen.2020.0265.
  • [10]. Yala A, Lehman C, Schuster T, et al. A deep learning mammography-based model for improved breast cancer risk prediction. Radiology 2019;292(1):60–6. 10.1148/radiol.2019182716.
  • [11]. Torres-Mejía G, Smith RA, Carranza-Flores MeL, et al. Radiographers supporting radiologists in the interpretation of screening mammography: a viable strategy to meet the shortage in the number of radiologists. BMC Cancer 2015;15:410. 10.1186/s12885-015-1399-2.
  • [12]. Xavier D, Miyawaki I, Campello Jorge CA, et al. Artificial intelligence for triaging of breast cancer screening mammograms and workload reduction: a meta-analysis of a deep learning software. J Med Screen 2024;31(3):157–65. 10.1177/09691413231219952.
  • [13]. Dembrower K, Crippa A, Colón E, et al. Artificial intelligence for breast cancer detection in screening mammography in Sweden: a prospective, population-based, paired-reader, non-inferiority study. Lancet Digit Health 2023;5(10):e703–11. 10.1016/S2589-7500(23)00153-X.
  • [14]. Gichoya JW, Thomas K, Celi LA, et al. AI pitfalls and what not to do: mitigating bias in AI. Br J Radiol 2023;96(1150):20230023. 10.1259/bjr.20230023.
  • [15]. Futoma J, Simons M, Panch T, et al. The myth of generalisability in clinical research and machine learning in health care. Lancet Digit Health 2020;2(9):e489–92. 10.1016/S2589-7500(20)30186-2.
  • [16]. Swami N, Corti C, Curigliano G, et al. Exploring biases in predictive modelling across diverse populations. Lancet Healthy Longev 2022;3(2).
  • [17]. Gichoya JW, Banerjee I, Bhimireddy AR, et al. AI recognition of patient race in medical imaging: a modelling study. Lancet Digit Health 2022;4(6):e406–14. 10.1016/S2589-7500(22)00063-2.
  • [18]. Banerjee I, Bhattacharjee K, Burns JL, et al. “Shortcuts” causing bias in radiology artificial intelligence: causes, evaluation, and mitigation. J Am Coll Radiol 2023;20(9):842–51. 10.1016/j.jacr.2023.06.025.
  • [19]. The World Bank. World Bank country and lending groups. Available from: https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-worldbank-country-and-lending-groups (last accessed 2024).
  • [20]. Gottlieb S. New FDA policies could limit the full value of AI in medicine. JAMA Health Forum 2025;6(2):e250289. 10.1001/jamahealthforum.2025.0289.
  • [21]. Corti C, Celi LA. Can we ensure a safe and effective integration of language models in oncology? Lancet Reg Health Eur 2024;46:101081. 10.1016/j.lanepe.2024.101081.
  • [22]. Celi LA, Cellini J, Charpignon ML, et al. Sources of bias in artificial intelligence that perpetuate healthcare disparities-A global review. PLOS Digit Health 2022;1(3):e0000022. 10.1371/journal.pdig.0000022.
  • [23]. Corti C, Cobanaj M, Dee EC, et al. Artificial intelligence in cancer research and precision medicine: applications, limitations and priorities to drive transformation in the delivery of equitable and unbiased care. Cancer Treat Rev 2023;112:102498. 10.1016/j.ctrv.2022.102498.
