AEM Educ Train. 2021 Sep 29;5(Suppl 1):S76–S81. doi: 10.1002/aet2.10683

Images of dark skin in top emergency medicine journals: A cross‐sectional analysis of images of emergent cutaneous disorders

Geovonni Bell 1, Sherita Holmes 1,2,3, Scott Gillespie 3, Anna Wood 3, Brittany L Murray 1,2,3
PMCID: PMC8480477  PMID: 34616977

Abstract

Introduction

Emergency medicine (EM) physicians must recognize emergent cutaneous disorders (CDs) in patients of all skin tones. In other medical specialties, images of CDs in light‐skinned individuals (LSI) are published more frequently than images of CDs in dark‐skinned individuals (DSI). This study aims to determine the representation of LSI versus DSI in images of emergent CDs published in top EM journals.

Methods

This is a cross‐sectional analysis of CD images published from 2015 to 2020 in the six most influential EM journals as determined by Eigenfactor. The 2016 Model of the Clinical Practice of Emergency Medicine (EM Model) by the American Board of Emergency Medicine was used to classify CDs as “emergent,” “nonemergent,” or “not listed.” The Fitzpatrick skin tone scale was used to classify skin tone as light, dark, or indeterminate. Two blinded reviewers classified each image; for disagreements, a third blinded reviewer determined the final classification. Descriptive statistics and chi‐square were used to analyze the data. A kappa coefficient was used to determine reviewer agreement (LSI vs. DSI), and a weighted kappa coefficient was used for agreement between individual Fitzpatrick categories.

Results

There were 314 images of CDs. Forty images were indeterminate, and one image was excluded, leaving 273. Of the 273 images analyzed, 44.0% were emergent, 8.0% were nonemergent, and 48.0% were not listed in the EM Model. DSI comprised 13.6% of images. For emergent CDs, 85.0% were LSI versus 15.0% DSI. For nonemergent CDs, 27.3% were DSI, and for CDs not listed in the EM Model, 9.9% were DSI. The kappa coefficient for reviewer agreement between LSI and DSI was 0.76 (95% confidence interval [CI] = 0.64 to 0.87) and the weighted kappa coefficient for agreement between Fitzpatrick categories was 0.70 (95% CI = 0.64 to 0.76), showing substantial agreement.

Conclusion

For emergent and nonemergent CDs, images of LSI were published more than those of DSI in top EM journals.

INTRODUCTION AND BACKGROUND

It is important that emergency physicians be equipped with the skills and knowledge to evaluate and treat dermatological conditions in patients from different ethnic, racial, and cultural backgrounds.1 Emergency department (ED) physicians are often the first point of care for patients experiencing acute conditions and the “safety net” for patient populations who do not visit their primary care doctor, let alone a dermatologist.2, 3 According to the Centers for Disease Control and Prevention, there are approximately 5.2 million ED visits per year for chief complaints relating to “diseases of the skin and subcutaneous tissue” in the United States, representing 3.8% of all ED visits, which is comparable to “diseases of the circulatory system.”4 Therefore, the ED physician is expected to identify which cutaneous disorders (CDs) are benign versus those that may be true emergencies and require further work‐up on patients of all skin tones. This can be challenging, as resources for CDs on dark skin tones are often scarce. In the dermatology literature, it has been shown that there are fewer representations of CDs on dark skin tones in dermatology texts and other resources.5, 6 A great portion of dermatology residents also feel that they did not receive adequate exposure to dermatological conditions on darker skin.7 To our knowledge, the representation of different skin tones in images of CDs in the emergency medicine (EM) literature has not been investigated.

The American Board of Emergency Medicine (ABEM) places the significance of “cutaneous disorders” at 3% of the overall board examination. For comparison, the broad category of “signs, symptoms, and presentations” is at 10%, while the smaller, more specific category of “psychobehavioral disorders” is at 2% of the board examination.8 In essence, the ABEM feels that CDs are significant enough to warrant their own category. Some of the conditions classified as “emergent” include erythema multiforme, erythema nodosum, pemphigus, and herpetic infections, while some of the conditions classified as “critical” include staphylococcal scalded skin syndrome, Stevens‐Johnson syndrome, and toxic epidermal necrolysis. Additionally, there are other “lower‐acuity” and unlisted dermatological manifestations of other diseases that an ED physician should be able to recognize, such as skin cancer, Kawasaki's syndrome, syphilis, and Rocky Mountain spotted fever.

Academic journals are an important source of information for physicians, especially with regard to up‐to‐date recommendations and new processes.9 Journals also highlight the importance of images and articles when they select what images to publish. Here, we examine the top EM journals to compare the representation of emergent CDs in light‐skinned individuals (LSI) versus dark‐skinned individuals (DSI). We hypothesized that fewer images of dark‐skinned CDs would be published overall.

METHODS

Study design

Three investigators conducted a cross‐sectional analysis of the six most influential EM journals as ranked by Eigenfactor, a network‐analysis measure of journal influence.10 The first investigator reviewed every image in these journals to determine which images displayed a CD. Once a CD was identified, it was classified as emergent, nonemergent, or not listed based on the most recent Model of the Clinical Practice of Emergency Medicine (EM Model), written by ABEM and published in 2016. The EM Model classifies each CD as low acuity, emergent, or critical. Low acuity corresponded to our nonemergent category, while emergent and critical corresponded to our emergent category. CDs not specifically included in the EM Model were designated “not listed.” Ulcerative conditions were excluded from the study given that their diagnosis is based on exposure of underlying tissues that do not involve skin pigmentation. Each image was also categorized based on the Fitzpatrick skin tone scale as being light skin (I–III on the scale) or dark skin (IV–VI on the scale). If the skin tone could not be clearly identified from the image, it was classified as indeterminate (designated as 0). This scale and these categories were used since they have been validated and serve as a standard in dermatology research.11 A blinded second investigator then rated the image on the Fitzpatrick scale. If there was a difference in skin tone ratings (light vs. dark) between the investigators, a third blinded investigator also reviewed the image to determine final classification. If there was a discrepancy between the individual Fitzpatrick classifications that did not affect categorization of images into LSI, DSI, or indeterminate, no third reviewer was used. The study was deemed exempt by the institutional review board at Emory University, given that the study did not involve human subjects.
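The grouping and adjudication rules described above can be sketched in a few lines of Python. This is only an illustration, assuming each rating is stored as an integer from 0 to 6 (0 for indeterminate, as in the text). The function names `skin_group` and `needs_third_reviewer` are hypothetical and do not come from the study, and treating any difference in group (including light or dark versus indeterminate) as a trigger for the third reviewer is one reading of the protocol.

```python
def skin_group(fitzpatrick: int) -> str:
    """Map a Fitzpatrick rating (0-6) to the study's grouped category."""
    if fitzpatrick == 0:
        return "indeterminate"      # skin tone not clearly identifiable
    if 1 <= fitzpatrick <= 3:
        return "LSI"                # light skin: Fitzpatrick I-III
    if 4 <= fitzpatrick <= 6:
        return "DSI"                # dark skin: Fitzpatrick IV-VI
    raise ValueError(f"not a valid rating: {fitzpatrick}")


def needs_third_reviewer(rating_1: int, rating_2: int) -> bool:
    """A third reviewer is needed only when the grouped categories differ."""
    return skin_group(rating_1) != skin_group(rating_2)
```

Under this sketch, ratings of I and III from the two investigators would not trigger adjudication (both are light skin), while ratings of III and IV would.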

Study setting and population

The top six EM journals determined by the Eigenfactor in order of popularity included Annals of Emergency Medicine (Annals), Academic Emergency Medicine (AEM), American Journal of Emergency Medicine (AJEM), Journal of Emergency Medicine (JEM), Emergency Medicine Journal (EMJ), and Pediatric Emergency Care (PEC). Each journal was manually surveyed from January 2015 to September 2020 in search of images of CDs.

Study protocol

Information gathered by the first investigator included the year and month the journal was published, the issue number, and whether or not the issue contained images of CDs. If a CD was present, the page number on which the CD could be located and whether the image depicted an emergent, nonemergent, or not listed condition were recorded. The first investigator also classified these images per the Fitzpatrick scale. This information was entered into Research Electronic Data Capture (REDCap) software. The second investigator was blinded to the first investigator's Fitzpatrick classification and to whether the image had been categorized as DSI, LSI, or indeterminate. To ensure that each classification was reliable, the second investigator first used the unblinded information to confirm that the condition was correctly classified according to the EM Model and then rated the image on the Fitzpatrick scale while still blinded to the first investigator's rating.

Measurements or key outcomes

The primary outcome measured was the proportion of images of emergent CDs depicting DSI versus LSI. Secondary outcomes measured included the proportion of images depicting DSI with nonemergent and not listed CDs when compared to LSI.

Data analysis

Statistical analyses were performed using SAS 9.4, and statistical significance was evaluated at the 0.05 level. A chi‐square test of independence evaluated for differences in the distributions of LSI versus DSI between disease categories. When testing for balanced skin tone representation in the literature, chi‐square goodness‐of‐fit tests were employed and evaluated if LSI and DSI were equally distributed within journals and if Fitzpatrick categories (I–VI) were equally distributed for CDs. For all chi‐square tests, exact methods were used when expected counts were < 5. Interinvestigator agreement between LSI and DSI was calculated using a kappa coefficient with 95% confidence interval (CI), while a weighted kappa coefficient was used to assess agreement between individual Fitzpatrick categories (I–VI), with 95% CI.

RESULTS

There were 314 images of CDs that met criteria for inclusion across the six journals. One journal, AEM, contained only one image of a CD, so it was excluded since it did not affect the results in any meaningful way. Of the 313 remaining images included in the final data analysis, 40 did not clearly show a skin tone (indeterminate on the Fitzpatrick scale). Of the other 273 images, 120 (44.0%) were emergent, 22 (8.0%) were nonemergent, and 131 (48.0%) were not listed in the EM Model. Of these 273 images, 37 (13.6%) were of DSI. As shown in Table 1, three of 55 (5.5%) were of DSI in AJEM, 18 of 80 (22.5%) in Annals, three of 26 (11.5%) in EMJ, 13 of 104 (12.5%) in JEM, and 0 of 8 (0.0%) in PEC. In every journal, images of DSI were significantly fewer than images of LSI (all p < 0.01).

TABLE 1.

Images of different skin types in emergency medicine journals

| Journal [a] | Light skin (n = 236) | Dark skin (n = 37) | Determinate total (N = 273) | GOF p‐value [b] |
|---|---|---|---|---|
| American Journal of Emergency Medicine (A) | 52 (94.5) | 3 (5.5) | 55 | <0.001 |
| Annals of Emergency Medicine (B) | 62 (77.5) | 18 (22.5) | 80 | <0.001 |
| Emergency Medicine Journal (C) | 23 (88.5) | 3 (11.5) | 26 | <0.001 |
| Journal of Emergency Medicine (D) | 91 (87.5) | 13 (12.5) | 104 | <0.001 |
| Pediatric Emergency Care (E) | 8 (100) | 0 (0) | 8 | 0.008 |

Data are reported as n (row %).

[a] Forty skin types were indeterminable: American Journal of Emergency Medicine (2/40), Annals of Emergency Medicine (2/40), Emergency Medicine Journal (1/40), Journal of Emergency Medicine (6/40), Pediatric Emergency Care (29/40).

[b] p‐values calculated using chi‐square goodness‐of‐fit (GOF) tests; exact methods employed when expected counts < 5.
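For readers who want to see how the goodness‐of‐fit p‐values in Table 1 can arise, the following is a minimal Python sketch rather than the authors' SAS code. It assumes the null hypothesis is an equal (50/50) split between light and dark skin within each journal, uses the asymptotic chi‐square p‐value with 1 degree of freedom, and switches to an exact binomial enumeration when the expected count is below 5. The function name `gof_equal_split` and the dictionary `table1` are illustrative names only.

```python
import math


def gof_equal_split(light: int, dark: int) -> float:
    """Chi-square goodness-of-fit p-value against a 50/50 split."""
    n = light + dark
    expected = n / 2

    def stat(k: int) -> float:
        # Chi-square statistic if k of the n images were light skin
        return ((k - expected) ** 2 + ((n - k) - expected) ** 2) / expected

    observed = stat(light)
    if expected < 5:
        # Exact p-value: total Binomial(n, 0.5) probability of every
        # outcome whose statistic is at least as large as the observed one
        return sum(
            math.comb(n, k) * 0.5 ** n
            for k in range(n + 1)
            if stat(k) >= observed - 1e-12
        )
    # Asymptotic upper-tail probability for 1 degree of freedom
    return math.erfc(math.sqrt(observed / 2))


# (light, dark) counts taken from Table 1
table1 = {
    "AJEM": (52, 3),
    "Annals": (62, 18),
    "EMJ": (23, 3),
    "JEM": (91, 13),
    "PEC": (8, 0),
}
p_values = {journal: gof_equal_split(*counts) for journal, counts in table1.items()}
```

With only eight determinate images, Pediatric Emergency Care is the one journal where the exact branch is used in this sketch; the other four journals fall under the asymptotic branch.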

Overall, 18 of 120 (15.0%) images of emergent CDs were found to show DSI, while six of 22 (27.3%) images of nonemergent CDs showed DSI, and 13 of 131 (9.9%) images of conditions not listed in the EM Model showed DSI (Table 2). The relative proportions of images of light and dark skin color did not differ significantly between acuity categories (p = 0.074). Of the 273 images that were able to be categorized by the Fitzpatrick scale (Table 3), the first investigator had 125 (45.8%) for I, 72 (26.4%) for II, 31 (11.4%) for III, 22 (8.1%) for IV, 14 (5.1%) for V, and nine (3.3%) for VI, while the second investigator had 107 (39.5%) for I, 85 (31.4%) for II, 47 (17.3%) for III, 15 (5.5%) for IV, 10 (3.7%) for V, and seven (2.6%) for VI. The distribution across Fitzpatrick categories differed significantly from an equal distribution for both the first investigator (p < 0.001) and the second investigator (p < 0.001). For both investigators, the number of images decreased with each step up the Fitzpatrick scale. Interinvestigator reliability/agreement for LSI and DSI based on grouped Fitzpatrick categories of I–III and IV–VI, respectively, is shown in Table 4 and was found to be substantial, with a kappa coefficient of 0.76 (95% CI = 0.64 to 0.87). Likewise in Table 5, reliability/agreement was similar when considering ungrouped Fitzpatrick categories (I–VI), with a weighted kappa coefficient of 0.70 (95% CI = 0.64 to 0.76) for skin tones. Of note, a third reviewer was needed to resolve a disagreement between the first and second reviewers for 22 of 313 (7%) images.

TABLE 2.

Representation of dermatological diseases in relation to skin type

| Acuity category [a] | Light skin (n = 236) | Dark skin (n = 37) | Determinate total (N = 273) | Chi‐square p‐value [b] |
|---|---|---|---|---|
| Emergent (A) | 102 (85.0) | 18 (15.0) | 120 | 0.074 |
| Nonemergent (B) | 16 (72.7) | 6 (27.3) | 22 | |
| Not listed (C) | 118 (90.1) | 13 (9.9) | 131 | |

Data are reported as n (row %).

[a] Forty skin types were indeterminable: emergent (13/40), nonemergent (6/40), not listed (21/40).

[b] p‐value calculated using chi‐square test of independence.
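The test of independence behind Table 2 can be illustrated with a short, standard‐library Python sketch. It is not the authors' SAS code and it uses the asymptotic Pearson chi‐square; because the table is 3 × 2, the statistic has 2 degrees of freedom, for which the upper‐tail probability has the closed form exp(−χ²/2). The variable names are illustrative. One expected count (nonemergent, dark skin) comes out below 5, so the exact method mentioned in the Data analysis section may have been what produced the published value.

```python
import math

# (light, dark) counts per acuity category, from Table 2
table2 = {
    "Emergent": (102, 18),
    "Nonemergent": (16, 6),
    "Not listed": (118, 13),
}

rows = list(table2.values())
row_totals = [sum(r) for r in rows]
col_totals = [sum(c) for c in zip(*rows)]
n = sum(row_totals)

chi2 = 0.0
min_expected = float("inf")
for row, row_total in zip(rows, row_totals):
    for observed, col_total in zip(row, col_totals):
        # Expected count under independence of acuity and skin tone
        expected = row_total * col_total / n
        min_expected = min(min_expected, expected)
        chi2 += (observed - expected) ** 2 / expected

dof = (len(rows) - 1) * (len(col_totals) - 1)   # (3 - 1) * (2 - 1) = 2
p_value = math.exp(-chi2 / 2)                   # chi-square upper tail, valid for dof == 2
```

Under these assumptions the asymptotic p‐value lands between 0.07 and 0.08, in line with the reported p = 0.074.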

TABLE 3.

Representation of dermatological diseases in relation to Fitzpatrick category

| Fitzpatrick category, n (column %) | First reviewer (n = 273) [a] | GOF p‐value [a] | Second reviewer (n = 271) [a] | GOF p‐value [a] | Third reviewer (n = 19) [a] | GOF p‐value [b] |
|---|---|---|---|---|---|---|
| I or 1 | 125 (45.8) | <0.001 | 107 (39.5) | <0.001 | 0 (0.0) | <0.001 |
| II or 2 | 72 (26.4) | | 85 (31.4) | | 1 (5.3) | |
| III or 3 | 31 (11.4) | | 47 (17.3) | | 11 (57.9) | |
| IV or 4 | 22 (8.1) | | 15 (5.5) | | 6 (31.6) | |
| V or 5 | 14 (5.1) | | 10 (3.7) | | 0 (0) | |
| VI or 6 | 9 (3.3) | | 7 (2.6) | | 1 (5.3) | |
| 0 (Indeterminate) | 39 | | 41 | | 3 | |

Data are reported as n (column %).

[a] Summary n and p‐values do not include indeterminate observations.

[b] p‐values calculated using chi‐square goodness‐of‐fit (GOF) tests; exact methods employed when expected counts < 5.
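The six‐category goodness‐of‐fit tests in Table 3 can be sketched in the same spirit. The Python below is an assumption‐laden illustration, not the authors' analysis: it tests the first and second reviewers' counts against an equal distribution across Fitzpatrick categories I–VI and computes the chi‐square upper‐tail probability with a standard recurrence for integer degrees of freedom. The helper names `chi2_sf` and `gof_uniform` are hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df."""
    if df % 2 == 0:
        q, n = math.exp(-x / 2), 2               # closed form for df = 2
    else:
        q, n = math.erfc(math.sqrt(x / 2)), 1    # closed form for df = 1
    while n < df:
        # Q(x; n + 2) = Q(x; n) + (x/2)^(n/2) * exp(-x/2) / Gamma(n/2 + 1)
        q += (x / 2) ** (n / 2) * math.exp(-x / 2) / math.gamma(n / 2 + 1)
        n += 2
    return q


def gof_uniform(counts):
    """Chi-square statistic and p-value against equal expected counts."""
    expected = sum(counts) / len(counts)
    chi2 = sum((c - expected) ** 2 / expected for c in counts)
    return chi2, chi2_sf(chi2, len(counts) - 1)


# Determinate counts for Fitzpatrick I-VI, from Table 3
first_reviewer = [125, 72, 31, 22, 14, 9]
second_reviewer = [107, 85, 47, 15, 10, 7]

chi2_first, p_first = gof_uniform(first_reviewer)
chi2_second, p_second = gof_uniform(second_reviewer)
```

Both p‐values come out far below 0.001 in this sketch, which agrees with the table. The third reviewer's column is omitted here because its expected counts (19/6) are below 5 and would call for an exact method.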

TABLE 4.

Interobserver reliability among the grouped Fitzpatrick categories

| Fitzpatrick category [a] (rows: Rater 1; columns: Rater 2) | Light skin (I–III) | Dark skin (IV–VI) | Kappa coefficient [b] (95% CI) |
|---|---|---|---|
| Light skin (I–III) | 224 | 1 | 0.76 (0.64 to 0.87) |
| Dark skin (IV–VI) | 15 | 30 | |

[a] N = 270 with determinate ratings from both raters.

[b] 0.00–0.20 = slight; 0.21–0.40 = fair; 0.41–0.60 = moderate; 0.61–0.80 = substantial; 0.81–1.00 = near perfect.
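The kappa coefficient in Table 4 can be recomputed from the four cell counts. The sketch below is illustrative Python, not the authors' SAS code. Kappa compares the observed agreement p_o with the agreement p_e expected by chance from the raters' marginal totals: kappa = (p_o − p_e) / (1 − p_e). The confidence interval uses a simple large‐sample standard error, which is an assumption; statistical packages may use a more elaborate variance formula.

```python
import math

# Rows: Rater 1 (light, dark); columns: Rater 2 (light, dark), from Table 4
table4 = [
    [224, 1],
    [15, 30],
]

n = sum(sum(row) for row in table4)
row_totals = [sum(row) for row in table4]
col_totals = [sum(col) for col in zip(*table4)]

# Observed agreement: proportion of images on the diagonal
p_o = sum(table4[i][i] for i in range(2)) / n
# Chance agreement: product of the raters' marginal proportions, summed
p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2

kappa = (p_o - p_e) / (1 - p_e)

# Simple large-sample standard error (an approximation, see lead-in)
se = math.sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
ci = (kappa - 1.96 * se, kappa + 1.96 * se)
```

Under these assumptions the result rounds to the values shown in Table 4.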

TABLE 5.

Interobserver reliability among the ungrouped Fitzpatrick categories

| Fitzpatrick category [a] (rows: Rater 1; columns: Rater 2) | I or 1 | II or 2 | III or 3 | IV or 4 | V or 5 | VI or 6 | Weighted kappa coefficient [b] (95% CI) |
|---|---|---|---|---|---|---|---|
| I or 1 | 89 | 31 | 4 | 0 | 0 | 0 | 0.70 (0.64 to 0.76) |
| II or 2 | 14 | 48 | 10 | 0 | 0 | 0 | |
| III or 3 | 3 | 4 | 21 | 1 | 0 | 0 | |
| IV or 4 | 0 | 2 | 11 | 8 | 1 | 0 | |
| V or 5 | 0 | 0 | 1 | 5 | 6 | 2 | |
| VI or 6 | 1 | 0 | 0 | 0 | 3 | 5 | |

[a] N = 270 with determinate ratings from both raters.

[b] 0.00–0.20 = slight; 0.21–0.40 = fair; 0.41–0.60 = moderate; 0.61–0.80 = substantial; 0.81–1.00 = near perfect.
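A weighted kappa gives partial credit when two raters choose neighboring categories. The Python sketch below recomputes it from the Table 5 counts, assuming linear weights, where the penalty for a disagreement is proportional to the number of Fitzpatrick steps separating the two ratings. The source does not state which weighting scheme was used, so the linear choice is an assumption, and the variable names are illustrative.

```python
# Rows: Rater 1 (I-VI); columns: Rater 2 (I-VI), from Table 5
table5 = [
    [89, 31, 4, 0, 0, 0],
    [14, 48, 10, 0, 0, 0],
    [3, 4, 21, 1, 0, 0],
    [0, 2, 11, 8, 1, 0],
    [0, 0, 1, 5, 6, 2],
    [1, 0, 0, 0, 3, 5],
]

k = len(table5)
n = sum(sum(row) for row in table5)
row_totals = [sum(row) for row in table5]
col_totals = [sum(col) for col in zip(*table5)]

# Linear disagreement weight: number of category steps between the ratings
observed_disagreement = sum(
    abs(i - j) * table5[i][j] for i in range(k) for j in range(k)
)
# Same weighted sum for the counts expected by chance from the marginals
expected_disagreement = sum(
    abs(i - j) * row_totals[i] * col_totals[j] for i in range(k) for j in range(k)
) / n

weighted_kappa = 1 - observed_disagreement / expected_disagreement
```

With linear weights the coefficient rounds to the 0.70 reported in Table 5, which suggests (but does not prove) that a linear scheme was used.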

DISCUSSION

In our analysis of top EM journals, there is a lack of representation of CDs in DSI. This was notable for both emergent conditions and nonemergent conditions. With each increase in the Fitzpatrick scale from I to VI (skin getting darker), there were fewer images of CDs that represented that skin tone, as illustrated by Table 3. Underrepresentation of DSI was present in images of emergent CDs, nonemergent CDs, and other CDs (not listed in the ABEM Clinical Model of Emergency Medicine), as evidenced by Table 2. All of the journals we evaluated describe themselves as “international,” though most are based in the United States with one (EMJ) based in the United Kingdom. The representation of images in LSI and DSI in these journals is not reflective of the international or local populations served by the journals, with DSI being underrepresented in all of the journals.12, 13, 14 Although reliable data on the percentages of populations that fall into each Fitzpatrick skin‐type category are not easily accessible, census data include self‐identified race, and some literature has compared self‐identified race with Fitzpatrick scale categorization. One study performed in the United States that specifically compared self‐identified race with Fitzpatrick skin categorization showed that category III was the predominant category for Caucasian individuals and that no race had more than 10% Fitzpatrick category I.15 Population‐based studies in the United States have also shown that the most common Fitzpatrick skin tone is category III, with 35% or less of the population falling into Fitzpatrick categories I or II.11, 16 Therefore, the fact that more than 70% of the images across the journals that we evaluated were Fitzpatrick categories I and II shows a clear overrepresentation of LSI versus DSI even in predominantly Caucasian populations.14, 15 This becomes problematic when EM physicians are attempting to learn about rare emergent conditions, uncommon presentations of common conditions, or new conditions through the literature. This was seen early in the coronavirus disease 2019 (COVID‐19) pandemic with a dermatologic manifestation of COVID‐19 commonly known as “COVID toes.” Images of DSI with COVID toes were very rarely published in the literature or added to databases of COVID dermatologic findings.17

To combat health disparities for patients of minority racial and ethnic backgrounds, these patients must be represented in the literature from which health care providers learn and included in research and evidence‐based guidelines.18 Along with images in journals, other published materials for medical education, such as textbooks and online resources, have been found to have fewer images of DSI, which can lead to unintentional bias and to clinicians being less prepared to care for these patients. This may contribute to health care disparities.6, 19 To our knowledge, representation of images of DSI in textbooks and online educational databases has not been specifically evaluated in the specialty of EM. Recruiting racial minorities into studies and obtaining permission for educational images often prove challenging because of mistrust in the health care system, which stems from a history of bias and mistreatment permeating the medical field. Nevertheless, the inclusion of these patients will be essential to combat health disparities.20, 21, 22, 23

There has recently been more awareness of the paucity of images of CDs on darker skin tones in medical reference material, and resources have begun to specifically address this. Printed texts such as Taylor and Kelly's Dermatology for Skin of Color and Pediatric Skin of Color are two such resources.24, 25 There are also online resources such as “Brown Skin Matters” (a webpage that catalogues shared images of CDs on people of color), “Skin of Color Society” (an educational site about CDs on DSI), and “VisualDx” (a type of machine learning algorithm for CDs that includes images of DSI).26, 27, 28 Many physicians now turn to Web‐based resources such as these given their easy access and availability, and these resources are increasing representation of DSI at a greater rate than printed texts.29, 30 Another Web‐based technology that is becoming more prevalent as a way to reduce human bias and error is machine learning (ML) algorithms, which can help analyze CDs to determine a diagnosis. However, ML depends on its input data, which means that if an ML program is trained on current depictions of CDs, the program itself may show bias due to a paucity of images of DSI.31 We encourage authors and editors of medical literature to critically evaluate their own practices in the publication of images of CDs to ensure inclusion of all skin tones.

LIMITATIONS

Since this study focused only on the top journals, it did not take into account other modalities discussed previously such as textbooks or Web‐based resources. Given that most evidence‐based information disseminated in the medical community originates with peer‐reviewed journals, we felt that it was prudent to analyze the root source of information in hopes that this study will serve as a foundation for further analysis in EM education. Although we used standardized, objective tools and multiple reviewers of different skin colors, subjectivity could not be completely eliminated from the categorization of skin tones. However, even though our weighted kappa coefficient did not show inter‐reviewer agreement to be near perfect, it was still shown to be substantial. In addition, our categorization of emergent versus nonemergent CDs was based on ABEM's EM Model to have an objective standard. However, it is possible that some of the “not listed” conditions found in the top EM journals are emergent conditions, since including these images in the journals serves to educate on new conditions and disease processes and lesser‐known presentations of illnesses. Finally, some CDs that are depicted could be more prevalent in certain demographics. This study did not survey individual diseases in EM to determine how often images of LSI were depicted in comparison to DSI, which could perhaps be a topic in future research.

CONCLUSION

Our study found that when it comes to images of emergent cutaneous disorders, dark‐skinned individuals are scarcely depicted in the top EM journals. This underrepresentation was also seen in nonemergent conditions and conditions not specifically listed in ABEM’s Model of the Clinical Practice of Emergency Medicine. Furthermore, this study supplements the scarce existing literature in the greater medical community on the underrepresentation of cutaneous disorders in dark‐skinned individuals. Future research should assess the inclusion of dark‐skinned individuals throughout multiple facets of emergency medicine education, including, but not limited to, textbooks, Web‐based resources, and national conference lectures, as well as barriers and facilitators to equity in these resources. In turn, this may help decrease health care disparities in emergency medicine.

CONFLICT OF INTEREST

The authors have no potential conflicts to disclose.

Bell G, Holmes S, Gillespie S, Wood A, Murray BL. Images of dark skin in top emergency medicine journals: A cross‐sectional analysis of images of emergent cutaneous disorders. AEM Educ Train. 2021;5(Suppl. 1):S76–S81. 10.1002/aet2.10683

Supervising Editor: Alden Landry, MD, MPH.

Articles from AEM Education and Training are provided here courtesy of Wiley