PLOS Glob Public Health. 2024 Jan 19;4(1):e0002513. doi: 10.1371/journal.pgph.0002513

A scientometric analysis of fairness in health AI literature

Isabelle Rose I Alberto 1,#, Nicole Rose I Alberto 1,#, Yuksel Altinel 2,#, Sarah Blacker 3,#, William Warr Binotti 4,#, Leo Anthony Celi 5,6,7,#, Tiffany Chua 8,#, Amelia Fiske 9,#, Molly Griffin 6,#, Gulce Karaca 10,#, Nkiruka Mokolo 11,#, David Kojo N Naawu 11,#, Jonathan Patscheider 12,*,#, Anton Petushkov 13,#, Justin Michael Quion 14,*,#, Charles Senteio 15,#, Simon Taisbak 16,#, İsmail Tırnova 17,#, Harumi Tokashiki 18,#, Adrian Velasquez 18,19,#, Antonio Yaghy 20,#, Keagan Yap 21,#
Editor: Zahra Zeinali
PMCID: PMC10798451  PMID: 38241250

Abstract

Artificial intelligence (AI) and machine learning are central components of today’s medical environment. The fairness of AI, i.e., its ability to be free from bias, has repeatedly come into question. This study investigates the diversity of the members of academia whose scholarship poses questions about the fairness of AI. Articles combining the topics of fairness, artificial intelligence, and medicine were selected from PubMed, Google Scholar, and Embase using keywords. Eligibility screening and data extraction were done manually and cross-checked by another author for accuracy. Articles selected for further analysis were cleaned and organized in Microsoft Excel; spatial diagrams were generated using Tableau Public, and additional graphs were generated using Matplotlib and Seaborn. Linear and logistic regressions were conducted using Python to measure the relationship between funding status, number of citations, and the gender demographics of the authorship team. We identified 375 eligible publications, including research and review articles concerning AI and fairness in healthcare. Analysis of the bibliographic data revealed an overrepresentation of authors who are white, male, and from high-income countries, especially in the roles of first and last author. Additionally, papers whose authors are based in higher-income countries were more likely to be cited often and published in higher-impact journals. These findings highlight the lack of diversity among the authors in the AI fairness community whose work gains the largest readership, potentially compromising the very impartiality that the AI fairness community is working towards.

Introduction

The fields of medicine and technology are undeniably intertwined; progress in one field often drives innovation in the other. It is no surprise that artificial intelligence (AI) is making headlines with its promise to inform or even automate clinical decision-making. However, the most consequential aspect of this innovation is not the language models trained on billions of parameters, nor the generative models that create images from text prompts. Rather, it is the complex bias that exists within the data, which takes form in various ways: outcomes that are inconsistent across demographics, subconsciously tainted tests and treatment decisions, and institutional bias that shapes local clinical practice patterns [1].

The concept of algorithmic fairness has gained traction not only in the fields of artificial intelligence and software engineering, but also at global institutions such as the European Union. According to the European Commission, AI must first be considered trustworthy before people and societies develop, deploy, and use it. One of the critical elements of trustworthy AI is fairness [2]. The European Commission defines it as having two dimensions: the “equal and just distribution of benefit and cost" to ensure freedom from unfair bias, discrimination, and stigmatization; and “the ability to contest and seek effective redress against decisions made by AI systems and by the humans operating them.”

Over the last five years, unfairness in machine learning has gone from virtually unknown to often making headlines. Significant instances of undesirable bias in automated processes are frequently identified. One in particular occurred in 2016, following an article by ProPublica, an independent nonprofit news organization focusing on accountability, justice, and safety. It revealed that software used by judicial courts across the United States was discriminating against Black and Hispanic prisoners during parole hearings [3]. In the same vein, machine learning models applied in healthcare are prone to similar issues. In 2019, it was revealed that millions of Black patients had been misdiagnosed as a result of racial imbalances in a health algorithm used to triage patients. The presence of biases and discrimination in machine learning models used in healthcare thus fosters the risk of misdiagnosis for certain demographic groups, ultimately leading to loss of life [4].

Definitions and metrics of fairness in medical algorithms subsequently appeared in the medical literature. However, the meaning of fairness also extends to the researchers involved in publishing datasets and studying biases. It is prudent to consider the bias woven into the very fabric of an algorithm, reflecting the human assumptions of those who created it. This is a problem of representation and exclusion, and of epistemic narrowing, that can lead to the perpetuation of structural inequities. Through her concept of “strong objectivity,” the Science and Technology Studies scholar Sandra Harding has shown that the exclusion of marginalized authors, including People of Color, scholars based in low-income countries, and women, among others, from research and publishing is not only unjust but also diminishes the scientific knowledge produced. To attain a stronger version of scientific objectivity, and to create a science that can work towards equity and justice, Harding argues that we need to fortify science by increasing the diversity of academic authorship as much as possible [5]. This leads to the crux of the matter: how diverse is the AI fairness community proposing these definitions and metrics of fairness? It is important to bring attention to the diversity, or lack thereof, within this community to help prevent the future propagation of bias and promote equity in health care.

Methods

Search strategy

To analyze the AI fairness community in healthcare, an in-depth descriptive study of the field’s publications was conducted in the form of a bibliometric review. The AI field was defined with terms that included machine learning, deep learning, convolutional neural network, and natural language processing. Fairness was captured with overlapping terms such as health equity and health disparities.

The literature search was conducted in December 2022 using PubMed, Google Scholar, and Embase. These databases were chosen for their popularity, author familiarity, and ease of use with Python. The collected data were manually curated to ensure they fell within the field of interest.

The search was conducted with the help of librarian Paul Bain, Ph.D., MLIS, from Harvard Medical School’s Countway Library [S1 Appendix].
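The full search strings are provided in S1 Appendix rather than reproduced here. Purely as an illustration of how such a search can be scripted in Python, the sketch below queries PubMed through the NCBI E-utilities API; the query shown is a hypothetical reconstruction from the keywords named above, not the study’s actual search strategy.

    # Hedged sketch: a keyword search against the NCBI E-utilities API.
    # The query string is a hypothetical reconstruction, not the study's own.
    import requests

    EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"
    query = (
        '("machine learning" OR "deep learning" OR "artificial intelligence") '
        'AND (fairness OR "health equity" OR "health disparities")'
    )

    # esearch returns the PMIDs of matching records (retmax caps the count).
    ids = requests.get(
        f"{EUTILS}/esearch.fcgi",
        params={"db": "pubmed", "term": query, "retmax": 1000, "retmode": "json"},
        timeout=30,
    ).json()["esearchresult"]["idlist"]

    # esummary returns bibliographic metadata (title, authors, journal, year).
    meta = requests.get(
        f"{EUTILS}/esummary.fcgi",
        params={"db": "pubmed", "id": ",".join(ids[:200]), "retmode": "json"},
        timeout=30,
    ).json()["result"]
    print(f"{len(ids)} candidate PMIDs retrieved")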

Eligibility of articles

Studies were considered eligible for inclusion if they met the following criteria: (i) Does the paper discuss machine learning fairness? (ii) Is the paper related to healthcare? (iii) Does the paper discuss clinical applications?

If all three questions were answered affirmatively, the article was eligible. If a paper’s eligibility remained uncertain, it was cross-checked by another author.

As the scope of the analysis was to map gender and ethnic representation in the AI fairness community within healthcare, both scientific and non-scientific articles were screened for eligibility. The authors used manual vetting to narrow down the list of articles by reading the abstracts, or the full texts when the abstracts did not provide enough information. Out of the 1614 articles initially found through the search, 375 (23%) were determined to be eligible for further analysis, comprising a total of 1984 authors.

Data items

The bibliometric data obtained from Embase, PubMed, and Google Scholar provided the authors’ first and last names, gender, race, article title, abstract, keywords, and URL. The articles were vetted manually and with the Python package PyPaperBot 1.2.2 [6] to obtain enough bibliometric data and to ensure a thorough mapping and measurement of academic trends. For each eligible study, the following data were extracted: type of article (opinion or research), the country the paper originated from, the journal it was published in, publication year, number of times the article was cited, whether funding was provided, and the name of the funding organization if provided. Additionally, the originating countries were classified by income according to the World Bank classification as follows: low-income (<1.0 thousand USD per capita per year), lower-middle-income (1.0–4.1 thousand USD), upper-middle-income (4.1–12.7 thousand USD), and high-income (>12.7 thousand USD) (cf. GDP per capita (current US$) | Data (worldbank.org)) [7].
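As a minimal sketch of that classification step, the helper below maps a country’s per-capita income (interpreted here, as an assumption, in thousands of USD per year) to the four World Bank bands quoted above; the boundary handling is ours, and the real analysis relied on the World Bank’s published country lists rather than hand-computed bands.

    # Hedged sketch of the World Bank income banding described in the text.
    # Thresholds restate the quoted bands; treating them as thousands of USD
    # per capita per year is our assumption, as is the boundary handling.
    def income_group(per_capita_kusd: float) -> str:
        if per_capita_kusd < 1.0:
            return "low-income"
        if per_capita_kusd <= 4.1:
            return "lower-middle-income"
        if per_capita_kusd <= 12.7:
            return "upper-middle-income"
        return "high-income"

    # Hypothetical per-country figures, for illustration only.
    for country, kusd in {"Mozambique": 0.5, "India": 2.3, "Turkey": 9.6, "USA": 70.2}.items():
        print(country, "->", income_group(kusd))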

Approach to identifying each author’s nationality, race, and sex

To ensure consistency in our dataset and permit statistical analysis, we used pre-defined groups provided by search platforms to classify the gender and race/ethnicity of authors. Race and ethnicity were classified as White, Asian, Black, Hispanic, or ‘none’, while gender was recorded as male, female, or none.
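To make these closed category sets concrete, the sketch below shows one way a per-author record could be represented; the field names and structure are our assumptions for illustration, not the study’s actual schema.

    # Hedged sketch of a per-author record using the pre-defined categories.
    # Field names are illustrative assumptions, not the study's schema.
    from dataclasses import dataclass
    from typing import Optional

    RACE_CATEGORIES = {"White", "Asian", "Black", "Hispanic", None}  # None = 'none'
    GENDER_CATEGORIES = {"male", "female", None}                     # None = 'none'

    @dataclass
    class AuthorRecord:
        first_name: str
        last_name: str
        race: Optional[str]
        gender: Optional[str]
        country_of_origin: Optional[str]
        country_based_in: Optional[str]

        def __post_init__(self):
            # Enforce the closed category sets so downstream statistics stay consistent.
            assert self.race in RACE_CATEGORIES, f"unexpected race label: {self.race}"
            assert self.gender in GENDER_CATEGORIES, f"unexpected gender label: {self.gender}"

    record = AuthorRecord("Jane", "Doe", "White", "female", "Canada", "USA")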

When collecting data on an author’s gender, race, and ethnicity, the study relied on a variety of sources, including self-identification of ethnicity and race and the author’s chosen pronouns. If this information was not available, information found on web pages and in articles, along with details of the author’s affiliations or memberships in social or support groups, was used to determine gender, ethnicity, and race. If this information was still unclear, the author’s gender, race, and ethnicity were determined from photographs found on multiple websites, including university websites, private web pages, YouTube, and social media platforms such as LinkedIn. To maintain accuracy and validity, the study cross-checked every author’s gender, race, and ethnicity against multiple sources of information; in addition, each article and its associated information were verified by another author of this paper to ensure validity, consistency, and accuracy in the data collection process. When collecting data on the authors’ countries of origin, the study traced the authors’ histories as far back as possible, reviewing information available on LinkedIn or on faculty and research web pages. If an author did not disclose their home country, the study used the country of their earliest recorded educational background. The country where each author is currently based was also investigated via the location of the affiliated institution listed in the article.

The approach used to identify race, ethnicity, and gender has its limitations. When analyzing the bibliometric data, the collected information on authors’ race, ethnicity, and gender was found to be unreliable and inconsistent, which reflects the inherent complexity of the topic at hand. Not all of the websites used as sources in this study allowed authors to state their own identity, so not all information was equally accessible via web searches. Authors who identified as multiracial or nonbinary, or whose data was otherwise unclear, were not included, as the pre-defined groups provided by search platforms did not account for them. Moreover, gender, ethnicity, and race are social constructs that can vary significantly depending on socio-political context, and they cannot and should not be directly determined from a picture. In some cases, then, the information found may not completely reflect an author’s preferred identity. This highlights some of the methodological challenges and complexities of gathering this kind of intersectional demographic data, and the difficulty of analyzing data regarding race, ethnicity, and gender on a large scale using quantitative methods.

Lastly, the income level of each author’s country of origin (cf. GDP per capita (current US$) | Data (worldbank.org)) [7], their affiliated institution and whether it is a minority-serving institution (cf. MSI List 2021.pdf (rutgers.edu)), and their highest academic degree obtained (MD, PhD, etc.) [8] were investigated. To ensure the statistical soundness of the paper, the accuracy of each article’s data points on race, ethnicity, gender, country of origin, income level, etc. was verified multiple times by other authors involved in the study. Emphasis is placed on the first and last author due to their significant roles in the research process.

Nonetheless, this approach was chosen because such determinations are difficult to make without a more in-depth survey of all authors to accurately record their preferred race, ethnicity, and gender.

Research on diversity requires a high level of reflexivity, including reflecting on one’s own positionality in relation to matters of fairness in research. As such, we would like to situate ourselves in relation to this scholarship. Among the authors of this paper, 12 identify as male, 10 identify as female, and 0 identify as non-binary. In terms of ethnicity, 10 authors identify as White, 8 identify as Asian, 3 identify as Black, and 1 identifies as Hispanic. The authors come from the following countries of origin: USA (9), Turkey (2), Philippines (2), Canada (2), Denmark (2), Germany (1), Brazil (1), Lebanon (1), and Peru (1). The idea for this research was inspired by conversations we have had with others in the field on matters of race, gender, and representation within AI and academia, and by our own experiences of relative privilege working within this system.

Statistical analysis

Regression analyses were performed to evaluate multiple factors that influence the number of citations and the presence of funding. Only the 253 papers identified as research papers were included in this analysis, as opinion pieces generally receive little direct funding and are rarely cited widely beyond the community.

Results

Bibliographic data was directly obtained from Embase, which yielded 242 articles. Data from the PubMed API yielded a total of 875 articles. Finally, another 497 articles from Google Scholar were found using the package PyPaperBot 1.2.2 [6] in Python 3.9.12. In total, 1614 articles potentially related to AI fairness were collected.

Data was cleaned in Microsoft Excel. Spatial diagrams were generated using Tableau Public, with base map layers obtained from OpenStreetMap and go-cart.io. Additional graphs were generated using Matplotlib 3.5.1 [9] and Seaborn 0.11.2 [10]. The linear and logistic regressions were run using the Python package statsmodels 0.13.2, and the t-tests were run using the Python package scipy 1.7.3.
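As a minimal sketch of these analyses, and assuming a hypothetical cleaned export with columns like pct_female_authors and n_citations (the actual schema lives in the study’s repository), the models named above can be run as follows.

    # Hedged sketch: OLS for citation counts, logistic regression for funding,
    # and a two-sample t-test. Column names are illustrative assumptions.
    import pandas as pd
    import statsmodels.api as sm
    from scipy import stats

    df = pd.read_csv("fairness_articles.csv")        # hypothetical cleaned export
    research = df[df["article_type"] == "research"]  # the 253 research papers

    X = sm.add_constant(research[["pct_female_authors", "pct_white_authors",
                                  "n_authors", "pub_year"]])

    ols = sm.OLS(research["n_citations"], X).fit()   # linear model of citations
    print(ols.summary())

    logit = sm.Logit(research["funded"], X).fit()    # funded coded 1/0
    print(logit.summary())

    # Example t-test: citations by last-author gender (Welch's variant).
    male = research.loc[research["last_author_gender"] == "male", "n_citations"]
    female = research.loc[research["last_author_gender"] == "female", "n_citations"]
    print(stats.ttest_ind(male, female, equal_var=False))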

Distribution of each author’s ethnicity and gender

The results showed that, overall, 794 (40.0%) of the authors were female, and 1190 (60.0%) were male (S1 Appendix). When looking specifically at the first and last authors, 155 (41.3%) of the first authors were female, and 110 (32.2%) of the last authors were female (Fig 1).

Fig 1. Distribution of first and last author sex.

When the author’s race distribution was analyzed, the study categorized the race of 1966 authors out of the total of 1984 authors in our curated database. The study found that the majority of the authors were White (1270; 64.0%), followed by Asian (533; 26.9%), Black (89; 4.5%), and Hispanic (74; 3.7%) (Fig 2).

Fig 2. Distribution of first, last, and all authors’ race.

When dividing the authors into two groups, whites and non-whites, the study found that among the first authors, 234 (62.4%) were white and 141 (37.6%) were non-white. Among the last authors, 251 (66.9%) were white, and 124 (33.1%) were non-white (S1 Appendix).

Distribution of each nationality

When looking at country of origin, it was clear that most articles were from the USA. The overall author nationality distribution showed that 986 (49.7%) were from the USA, 142 (7.2%) from Canada, 117 (5.9%) from the UK, 115 (5.8%) from China, and 83 (4.2%) from India (Fig 3). By income level, 1631 authors (82.2%) were from high-income countries, 209 (10.5%) from upper-middle-income countries, 135 (6.8%) from lower-middle-income countries, and 9 (0.5%) from low-income countries.

Fig 3. Distribution of each nationality.

Base map layer data found at https://go-cart.io/cartogram, courtesy of go-cart.io, and available under Creative Commons License CC-BY, found here: https://go-cart.io/faq.

Among the 375 first authors, 175 (46.7%) were from the USA, 27 (7.2%) from Canada, 22 (5.9%) from the UK, and 21 (5.6%) from China (Fig 4A). Among the last authors, 179 (50.7%) were from the USA, 29 (8.2%) from Canada, 20 (5.7%) from the UK, and 14 (4.0%) from China (Fig 4A). Of the first authors, 318 (84.8%) were from high-income countries, 32 (8.5%) from upper-middle-income countries, 24 (6.4%) from lower-middle-income countries, and 1 (0.3%) from a low-income country. Of the last authors, 312 (88.4%) were from high-income countries, 27 (6.8%) from upper-middle-income countries, 15 (4.2%) from lower-middle-income countries, and 2 (0.6%) from low-income countries (Fig 4B).

Fig 4. Global dispersion of first and last author countries.

Base map found at https://www.openstreetmap.org/#map=2/43.8/3.2 and data from OpenStreetMap and OpenStreetMap Foundation. Contains information from OpenStreetMap and OpenStreetMap Foundation, which is made available under the Open Database License, found here: https://www.openstreetmap.org/copyright.

Further analysis investigated where first and last authors are currently based, as determined by the location of the institution listed in the article. Of the listed first authors, 347 of 375 (92.5%) are currently based at institutions located in high-income countries. Of these, 220 (59%) are based in the US, 28 (7%) in Canada, 24 (6%) in the UK, 11 (3%) in Australia, and 10 (3%) in Germany. Only 1 first author (0.3%) is based in a low-income country, Mozambique, and 10 (2.7%) are based in lower-middle-income countries. This trend continues for the last authors: 332 (94%) are based in high-income countries, at similar proportions by country, with 214 (61%) based in the USA, 31 (9%) in Canada, 20 (6%) in the UK, 11 (3%) in Australia, and 8 (2%) in Germany. Of the remaining last authors, 14 (4%) are based in upper-middle-income countries, 7 (2%) in lower-middle-income countries, and 0 in low-income countries.

Citations and funding

Investigating citations across gender and ethnicity revealed an overrepresentation of white male authors. As illustrated in Fig 5, papers with more male and white authors tended to receive more citations than those with fewer male and white authors.

Fig 5. Distribution of Male Authorship (A) and White Authorship (B) per article with number of citations and percentage.

On further investigation, it became evident that, on average, non-white and female first and last authors receive fewer citations than white male authors, as illustrated for last authors in Fig 6. The data further revealed that articles with male last authors accounted for a substantial 76.4% of all citations, with white male last authors alone responsible for 58.3% of the total citations across all articles (Fig 6).

Fig 6. Distribution of citations among last authors based on gender and ethnicity.

The findings from Figs 5 and 6 prompt the argument that male authorship, particularly that of white males, may be associated with higher-impact articles published in high-impact journals.

The analyses also suggest that authors in higher-income countries may have a higher likelihood of being funded and of producing higher-impact articles in terms of citations (Fig 7). This could be because higher-income countries have greater access to resources and funding, which could contribute to the production of higher-impact articles. The research culture, infrastructure, and collaboration networks in higher-income countries may also play a role in producing impactful research. It is also important to consider potential biases, such as language bias, publication bias, citation practices, and funding. Recent scholarship on citational practices and politics draws attention to the ways that structural inequities among authors are reflected in citation practices, noting that scholars can take an active role in upending these hierarchies through an intentional transformation of their own citational practices [11, 12]. These biases could affect how the work of authors in lower-income countries is read and cited, as well as their access to funding, leading to less impactful articles.

Fig 7. Distribution of citations among last authors by income level.

Articles were also evaluated according to how often they were cited, with year of publication as an additional classification variable; this showed that more recent articles (closer to 2022) were the most likely to be cited.

OLS regression results (Citations)

Regression analyses revealed that the percentage of female authors, the percentage of white authors, the race of authors, and the gender of authors are not significantly correlated with the likelihood of being cited (Figs A and B in S1 Appendix). Publication year and the number of authors were the only factors affecting the number of citations.

OLS regression results (Funding)

Regression analyses revealed that the percentage of female authors, the percentage of white authors, the gender of the first and last authors, and whether the first and last authors were white or non-white do not affect funding. Additionally, the number of authors and the year of publication do not affect funding status.

Predictor factors of citation and funding

Regression analyses revealed similar trends for citation and funding. The percentage of female authors, the percentage of white authors, and the gender and race of the first and last authors did not have a statistically significant effect on whether a paper was cited or whether it was funded. Instead, publication year was the only factor with a statistically significant effect on the number of citations a paper received.

Discussion

This bibliometric study highlights the relative homogeneity of the authors in the AI fairness community, most notably in the distribution of gender, ethnicity, and country of authorship. Male authorship, particularly that of white males, may be associated with higher-impact articles published in high-impact journals.

The gender and racial disparities among authors in academic publications are evident, with a notable overrepresentation of white male authors, especially in the positions of first and last authors. Among all authors, 60% were male, and this proportion increased to 69% when considering the distribution of first authors and last authors, as illustrated in Fig 1.

Furthermore, there is a clear relationship between the gender and ethnicity of authors and the impact of their publications, as demonstrated in Figs 5 and 6. Publications with a higher proportion of male authors, particularly those who are white males, tend to have a greater number of citations, indicating a higher impact. Notably, articles with male authors in the last author position accounted for a substantial 76.4% of all citations. White male authors, in particular, were responsible for 58.3% of the total citations across all articles. This is important as the role of the last author typically goes to the lead principal investigator of the team who supervises the project.

The increasing health disparities in underrepresented ethnic groups, and the underrepresentation of minority ethnic groups and women in academia, have been well documented. Multiple studies and reports have addressed this trend, including the significant gender and ethnic disparities in critical care medicine, and this study shows that it persists within the AI fairness community as well [13–20]. Women are underrepresented in leadership roles, which can limit their opportunities for publication and recognition [13–15], and are less likely to be first or last authors and less likely to be cited than male authors [14–16]. Certain racial/ethnic groups are only minimally represented, and non-White groups, namely Latinos, African-Americans, and American Indians, are underrepresented in health professions that require an undergraduate or graduate degree [17–20]. As the US demographic landscape undergoes rapid transformation and recognition of the growing health disparities within historically marginalized ethnic communities intensifies, the future of healthcare in the country will increasingly rely on a diverse healthcare workforce, which can help improve cultural competence among health care professionals and thus reduce health disparities [19, 20].

These studies align with the findings of this paper’s analysis, as similar trends were observed. However, the data gave a slight indication that papers with female last authors were more likely to have a funding source than their male counterparts, counterbalancing the notion of male predominance [19]. This finding highlights the positive impact that could be achieved when equality measures are implemented by regulatory institutions in AI research within healthcare.

The analysis also shows that higher-income countries are more likely to produce higher-impact articles, most likely a reflection of the amount of funding received, as demonstrated by Fig 7. Massuda et al. show that underfunding can lead to a significant reduction in quality, perpetuating the status quo [21]. This difference in funding perpetuates the current power dynamic, in which researchers at underfunded institutions in low- and middle-income countries are less likely to produce high-quality research that is widely cited and well regarded. Promoting research from underrepresented groups and communities is essential to promoting fairness and equity in research.

Authors are also more likely to be based in wealthier countries. This is understandable, as skilled researchers from lower-income countries often relocate to higher-income countries for a variety of reasons, whether to conduct research or because they are offered scholarships. However, considering that 82% of authors are already from high-income countries, compared with 0.5% from low-income countries, this further underlines the lack of representation. The absence of last authors based in low-income countries is a glaring signal about who shapes the research being performed, as last authors play a large role in framing the questions and designing the studies.

Possible actions

As progress is made in both AI and healthcare, equity and inclusivity must be prioritized, as they can lead to more innovative and impactful research, and to a science that works for all [22, 23].

Possible actions to ensure proper representation include supporting research capacity development in low- and lower-middle-income countries and promoting research conducted by researchers of underrepresented gender identities and ethnic minorities. In a narrative review, Bowsher and colleagues identified several critical factors for enhancing research capacity in low- and middle-income countries (LMICs): addressing the individual, organizational, and institutional levels simultaneously; ensuring sustainable funding and resource allocation; promoting capable and shared leadership within equitable partnerships; facilitating mentorship programs; building professional networks; and linking research outcomes to policy and practice. Strengthening research capacity requires focused investment, mentorship initiatives, robust collaborations, and effective monitoring and evaluation mechanisms. The success of capacity-building endeavors relies on long-term strategic planning and collaboration involving a diverse array of stakeholders, including researchers, program implementers, and policymakers at all levels [24].

By supporting this development, the global research landscape becomes more inclusive. This in turn helps to advance and strengthen medical knowledge and promote social justice within the scientific community. In addition, promoting collaboration and cooperation between researchers from diverse backgrounds and locations can also lead to more innovative and impactful research [22, 23].

Another way to promote diversity and inclusion in research is to establish guidelines for diversifying the composition of authors based on their ethnicity and sex. Providing formal training on equity issues and the importance of diversity in the research process can help educate researchers and promote greater awareness and understanding of these issues. This can be incorporated into the syllabi of academic institutions to ensure that the next generation of researchers is equipped with the knowledge and skills necessary to promote diversity and inclusion in their work. Diversity should be highlighted in published work and working groups. Disclosing authors’ nationalities, races, ethnicities, and sexes can promote diversity and inclusivity. Inclusivity also begins at the door. Institutions should develop initiatives that can help to attract more diverse scholars through transforming institutional cultures and priorities, as well as recruitment, hiring, and promotion policies.

In addition to promoting diversity in the composition of working groups and authorship of published work, it is also important to consider diversity in the content of the work, for example, including diverse perspectives and experiences in the research or addressing issues that affect diverse communities. AlShebli et al. found that ethnic diversity had the strongest correlation with scientific impact [22]. Recruiters should always strive to encourage and promote ethnic diversity, be it by recruiting candidates who complement the ethnic composition of existing members, or by recruiting candidates with proven track records in collaborating with people of diverse ethnic backgrounds.

Researchers should seek to understand their own group composition and how it should coincide with the communities which the research may impact. Representativeness and collaboration with communities can result in better science [24] and as such, greater understanding and awareness of these groups’ challenges and issues can ultimately lead to more effective solutions. It is also worth noting that groups with higher cognitive diversity are often more effective at complex problem-solving and can help to reduce biased judgment in strategic decision-making [24, 25].

Journals, editors, reviewers, and grantors can mandate that the author teams disclose their goals for achieving such diversity. Doing so would promote transparency and accountability and encourage authors to prioritize diversity and inclusion in their research. The National Institutes of Health (NIH) actively promotes diversity within the scientific community by encouraging conference grant applicants to include plans to enhance diversity in the selection of organizing committees, speakers, other invited participants and attendees [26]. These plans will be assessed during the scientific and technical merit review of the application and will be considered in the overall impact score. The underrepresented groups include individuals from nationally underrepresented racial and ethnic groups, individuals with disabilities, individuals from disadvantaged backgrounds, and women. Encouraging authors to highlight their efforts to promote diversity in their groups can raise awareness of the importance of diversity and inclusion in the scientific field and promoting diversity and inclusion in all aspects of research can ensure that the work is more representative and relevant to a broader range of people, ultimately leading to more equitable and effective outcomes.

Limitations

Although several significant publications resulted from PubMed and Google Scholar searches, some were excluded. We used the third-party package PyPaperBot when selecting papers from Google Scholar searches, which enabled us to extract 497 papers out of potentially hundreds of thousands. A large portion of the articles was removed from further analysis through manual vetting. However, third-party APIs are the only way to parse Google Scholar results. PyPaperBot was used, but a limitation shared by all the APIs we examined is that they can fetch only the first 1,000 results, even if there are more. The literature is far more extensive than what we were able to extract, and it is crucial to scale up this analysis to capture more of the literature base in the future. It should also be noted that this field is relatively new and rapidly changing; as such, many regional databases may not have been available or accessible at the time of the literature search.
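For reference, a Scholar harvest of this kind is driven from the command line; the sketch below wraps such an invocation in Python. The flag names reflect the PyPaperBot 1.2.x interface as we recall it and should be treated as assumptions (verify with python -m PyPaperBot -h), and the query is hypothetical.

    # Hedged sketch of a PyPaperBot run; flag names are assumptions to verify
    # against `python -m PyPaperBot -h`, and the query is hypothetical.
    import subprocess

    subprocess.run(
        [
            "python", "-m", "PyPaperBot",
            "--query=fairness machine learning healthcare",
            "--scholar-pages=50",         # Scholar serves ~10 results per page,
            "--dwn-dir=./scholar_papers"  # and APIs cap out near 1,000 results
        ],
        check=True,
    )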

While manually vetting and extracting the authors’ demographic information, the recorded information may differ from the authors’ preferred identities, particularly for gender, race, and country of origin, since race, ethnicity, and gender are social constructs [27] that vary significantly with socio-political context. Our analysis may have mischaracterized this vital information where identities were not clearly stated on the internet. The use of predetermined categories for race and ethnicity made it particularly difficult to capture authors who may identify as multiracial or as belonging to several of these categories. It is also important to recognize that some people may not have the freedom or opportunity to publicly express how they identify. Similarly, assessing whether an individual identified as non-binary was particularly challenging when not explicitly stated, and such cases were therefore not included in this study.

Conclusion

Moving forward, it will be important to develop better methodologies for representing a diverse range of possible identifications in order to better study questions of diversity. A preferred methodology would involve interviewing each author to accurately record their nationality, self-identified race, and sex, as well as expanding the categories; however, due to the scale of the study, it was not possible to obtain self-identified information in all cases. Systemic changes that allow for proper expression of identification are also necessary. Despite the presence of some inaccuracy within the data, accepted as a necessity for performing statistical analysis, the overall trends revealed within this data are clear.

As progress is made in both AI and healthcare, equity and inclusivity must be prioritized, as they can lead to more innovative and impactful research, and to a science that works for all. The composition of the AI fairness research community is thus of the utmost importance: whether AI becomes a tool from which only those who meet certain criteria benefit, or a platform that serves all communities regardless of demographics, depends heavily on who has a say in its design.

Supporting information

S1 Appendix. Additional information and data.

(PDF)

Data Availability

All relevant data are within the manuscript and its Supporting Information files. All data is third-party data and does not require any special privileges to access. More data that supports the findings of this study are publicly available here: https://public.tableau.com/app/profile/jonathan6077/viz/IstheFairnessCommuntyFair/IstheFairnessComminutyFair?publish=yes&fbclid=IwAR0_l5_b-lWLr_baGhCdwBKivVjyzVJg7CHG971EOEMiea2MSp33NXExVM The dataset can also be accessed at https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/J1RTFG# Data and code repository can be found at https://github.com/anpetushkov/fairness-community.

Funding Statement

The authors received no specific funding for this work.

References

PLOS Glob Public Health. doi: 10.1371/journal.pgph.0002513.r001

Decision Letter 0

Zahra Zeinali

13 Sep 2023

PGPH-D-23-00817

A scientometric analysis of fairness in health AI literature

PLOS Global Public Health

Dear Dr. Quion,

Thank you for submitting your manuscript to PLOS Global Public Health. After careful consideration, we feel that it has merit but does not fully meet PLOS Global Public Health’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by 18 September. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at globalpubhealth@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pgph/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Zahra Zeinali, MD MPH DrGH (c)

Academic Editor

PLOS Global Public Health

Journal Requirements:

1. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

2. Please provide separate figure files in .tif or .eps format only and remove any figures embedded in your manuscript file. Please also ensure all files are under our size limit of 10MB.

For more information about figure files please see our guidelines:

https://journals.plos.org/globalpublichealth/s/figures 

https://journals.plos.org/globalpublichealth/s/figures#loc-file-requirement

3. We have noticed that you have a list of Supporting Information legends in your manuscript. However, there are no corresponding files uploaded to the submission. Please upload them as separate files with the item type 'Supporting Information'. 

4. Some material included in your submission may be copyrighted. According to PLOS’s copyright policy, authors who use figures or other material (e.g., graphics, clipart, maps) from another author or copyright holder must demonstrate or obtain permission to publish this material under the Creative Commons Attribution 4.0 International (CC BY 4.0) License used by PLOS journals. Please closely review the details of PLOS’s copyright requirements here: PLOS Licenses and Copyright. If you need to request permissions from a copyright holder, you may use PLOS's Copyright Content Permission form.

Please respond directly to this email or email the journal office and provide any known details concerning your material's license terms and permissions required for reuse, even if you have not yet obtained copyright permissions or are unsure of your material's copyright compatibility. 

Potential Copyright Issues:

Figs 1-3: please (a) provide a direct link to the base layer of the map (i.e., the country or region border shape) and ensure this is also included in the figure legend; and (b) provide a link to the terms of use / license information for the base layer image or shapefile. We cannot publish proprietary or copyrighted maps (e.g. Google Maps, Mapquest) and the terms of use for your map base layer must be compatible with our CC-BY 4.0 license. 

Note: if you created the map in a software program like R or ArcGIS, please locate and indicate the source of the basemap shapefile onto which data has been plotted.

If your map was obtained from a copyrighted source please amend the figure so that the base map used is from an openly available source. Alternatively, please provide explicit written permission from the copyright holder granting you the right to publish the material under our CC-BY 4.0 license.

Please note that the following CC BY licenses are compatible with PLOS license: CC BY 4.0, CC BY 2.0 and CC BY 3.0, meanwhile such licenses as CC BY-ND 3.0 and others are not compatible due to additional restrictions. 

If you are unsure whether you can use a map or not, please do reach out and we will be able to help you. The following websites are good examples of where you can source open access or public domain maps: 

* U.S. Geological Survey (USGS) - All maps are in the public domain. (http://www.usgs.gov

* PlaniGlobe - All maps are published under a Creative Commons license so please cite “PlaniGlobe, http://www.planiglobe.com, CC BY 2.0” in the image credit after the caption. (http://www.planiglobe.com/?lang=enl) 

* Natural Earth - All maps are public domain. (http://www.naturalearthdata.com/about/terms-of-use/)

"

Additional Editor Comments (if provided):


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Does this manuscript meet PLOS Global Public Health’s publication criteria? Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe methodologically and ethically rigorous research with conclusions that are appropriately drawn based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available (please refer to the Data Availability Statement at the start of the manuscript PDF file)?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS Global Public Health does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This manuscript provides an in-depth bibliometric analysis of scholarly articles concerning the topic of artificial intelligence (AI) and medicine to further investigate the fairness of AI in literature. The authors reviewed 375 publications and reviewed corresponding multiple factors (including demographic data, funding, nationality, etc.) for each author.

With the rapid growth of AI in various fields, this article has a timely and relevant nature to it. I believe this article has the potential to contribute to the field and to PLOS Global Public Health with edits. The strength of this review lies largely in the analysis of these articles that yield significant results.

Attached please find additional comments and feedback, divided by section, which we believe can help strengthen your manuscript. We also advise that this manuscript be re-checked and read for grammatical errors.

Reviewer #2: This is an important area that fits in global Health issue. The authors used appropriate methodological approach to answer the review questions. Appropriate statistical tools and analysis were performed. However, it will be good if the authors consider the following points to revise and improve the manuscript.

ABSTRACT

1. Included papers. 375, were repeated in the methods and results. I suggest this is deleted from the methods in the abstract.

2. Authors should be consistent with the use of decimal place, I suggest 1 decimal place is used as it is largely used in the manuscript.

3. "The linear and logistic regressions were analyzed using Python"........ what was this analysis for?

4. The sentence before the last paragraph in their conclusion seem more like a finding than a conclusion. Authors can consider and revise.

5. "Most authors were from US, Canada and United Kingdom"....Can authors quantify what is referred to as "Most".

INTRODUCTION

6. Line 51 -54 may need revision. Long sentence and not clear.

7. It will of interest if authors expand on the fairness of AI in the introduction

METHODOLOGY

8. When was the literature search conducted? (period of search?)

9. How was papers excluded apart from the inclusion criteria, At what levels of were the papers screened?

10. How was the number of times a paper was cited identified? Authors should kindly elaborate

11. line 140.."authors who identified...." This aspect is not properly placed. I suggest it comes just after the inclusion criteria.

12. I suggest authors elaborate on the anaylsis procedure, as it stands, it does not reflect the content of the work

13. How many research papers were included in the regression analyses?

14. Line 190-196 ("As scope of the analysis.....") seem wrongly placed. I suggest it is moved to the methods section.

RESULTS

15. "Regression analyses revealed that the percentage of female authors, percentage of white authors, race of authors, gender of authors, and the number of authors is not correlated with the likelihood of being cited. Publication year was the only factor affecting to be cited"

Authors should support these findings with statistical inference. Which table or figure is being referred to in the work? Similar comment in the OLS regression.

16. I suggest authors go through the manuscript and correct sentences starting with numbers, eg, line 333

LIMITATION

17. Is the last paragraph part of limitations of the study? if not, authors should consider giving it appropriate heading. Other headings can also be considered in the manuscript especially the methods section

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

Do you want your identity to be public for this peer review? If you choose “no”, your identity will remain anonymous but your review may still be made public.

For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: REVIEW_ A scientometric analysis of fairness in health AI literature_LHoemeke.docx

PLOS Glob Public Health. doi: 10.1371/journal.pgph.0002513.r003

Decision Letter 1

Zahra Zeinali

11 Dec 2023

A scientometric analysis of fairness in health AI literature

PGPH-D-23-00817R1

Dear Dr. Quion,

We are pleased to inform you that your manuscript 'A scientometric analysis of fairness in health AI literature' has been provisionally accepted for publication in PLOS Global Public Health.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they'll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact globalpubhealth@plos.org.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Global Public Health.

Best regards,

Zahra Zeinali, MD MPH DrGH (c)

Academic Editor

PLOS Global Public Health

***********************************************************

Reviewer Comments (if any, and for reference):

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Does this manuscript meet PLOS Global Public Health’s publication criteria? Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe methodologically and ethically rigorous research with conclusions that are appropriately drawn based on the data presented.

Reviewer #1: (No Response)

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available (please refer to the Data Availability Statement at the start of the manuscript PDF file)?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS Global Public Health does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Thank you for thoroughly addressing the comments made on the original manuscript.

Reviewer #2: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

Do you want your identity to be public for this peer review? If you choose “no”, your identity will remain anonymous but your review may still be made public.

For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********
