AEM Education and Training. 2022 Sep 27;6(5):e10808. doi: 10.1002/aet2.10808

Gender differences in emergency medicine resident assessment: A scoping review

Isabella Menchetti 1, Debra Eagles 1,2, Dana Ghanem 3, Jennifer Leppard 1, Karine Fournier 4, Warren J Cheung 1,5
PMCID: PMC9513437  PMID: 36189450

Abstract

Background

Growing literature within postgraduate medical education demonstrates that female resident physicians experience gender bias throughout their training and future careers. This scoping review aims to describe the current body of literature on gender differences in emergency medicine (EM) resident assessment.

Methods

We conducted a scoping review which adhered to the Preferred Reporting Items for Systematic Reviews and Meta‐Analyses extension for Scoping Reviews guidelines. We included research involving resident physicians or fellows in EM (population and context), which focused on the impact of gender on assessments (concept). We searched seven databases from the databases' inception to April 4, 2022. Two reviewers independently screened citations, completed full‐text review, and abstracted data. A third reviewer resolved any discrepancies.

Results

A total of 667 unique citations were identified; 10 studies were included, and all were conducted within the United States. Four studies reported differences in EM resident assessments attributable to gender within workplace‐based assessments (qualitative comments and quantitative scores) by both attending physicians and nonphysicians. Six studies investigating clinical competency committee scores, procedural scores, and simulation‐based assessments did not report any significant differences attributable to gender.

Conclusions

This scoping review found that gender bias exists within EM resident assessment most notably at the level of narrative comments typically received via workplace‐based assessments. As female EM residents receive higher rates of negative or critical comments and discordant feedback documented on assessment, these findings raise concern about added barriers female EM residents may face while progressing through residency and the impact on their clinical and professional development.

BACKGROUND

Since 2017, females have made up the majority of students enrolled in medical schools in North America, and this proportion continues to increase each year. 1 , 2 Despite this, many medical specialties such as emergency medicine (EM) continue to be male dominated; in 2019, females comprised just 31% of EM physicians in Canada and 28.3% in the United States. 3 , 4 Recent literature suggests that female physicians face substantial challenges regarding gender bias and discrimination throughout all levels of training and their careers, primarily driven by implicit biases and gender stereotypes. 5 , 6

Implicit gender bias occurs when one unintentionally evaluates a group as inferior to another, based solely or partially on preexisting prejudice, such as gender role stereotypes. 6 , 7 Stereotypically male traits are typically associated with “agentic” behaviors, such as being dominant, assertive, confident, and ambitious, whereas “communal” behaviors such as being caring, compassionate, and polite are seen as stereotypical female traits. 8 This presents a unique challenge for female EM residents, as embodying these agentic traits and deviating from expected gender norms is often perceived negatively. 9

Gender bias has recently become a focus within postgraduate medical education. Fnais et al. 10 demonstrated that resident physicians reported gender discrimination as the most common form of abuse experienced across all years of training, resulting in poorer academic performance, disruptions in emotional well‐being, and a decline in the quality of patient care provided. A recent review conducted by Klein et al. 11 described the presence of gender bias in resident assessments across multiple specialties. Five of the nine included studies demonstrated that female residents consistently received lower evaluation scores than male residents; these findings came primarily from studies assessing qualitative workplace‐based assessments from physician and nonphysician colleagues. 12 , 13 , 14 , 15 , 16 These findings suggest that gender bias poses a potential threat to the integrity of resident assessment.

As competency‐based medical education becomes more widely adopted in EM residency programs across North America and the globe, medical training programs are increasingly applying a programmatic approach to assessment by incorporating a suite of purposefully chosen and complementary assessment methods to inform high‐stakes judgments of trainee competence. 17 , 18 Therefore, it is important to characterize and identify the presence of gender bias at any one of these assessment levels, to further elucidate how this may impact downstream decisions.

A comprehensive synthesis of the presence and impact of gender bias in EM resident assessments does not exist. This continues to be an emerging area of study with heterogeneous original research; therefore, a scoping review is most appropriate to synthesize and characterize a broad and varied topic such as this. This scoping review aims to fill this gap by describing the current body of literature to help inform the identification of factors contributing to, and the impact of, gender bias in EM resident assessment.

METHODS

The scoping review protocol was developed following the Preferred Reporting Items for Systematic Reviews and Meta‐Analyses extension for Scoping Reviews (PRISMA‐ScR) checklist. 19 The protocol has not been registered.

Study eligibility criteria

This study included original research involving resident physicians or fellows (population) in EM (context). We included research where the EM resident physicians were a subgroup if their outcomes were reported separately. Included studies focused on the impact of gender on assessments (concept). Language, country, and year of publication were not restricted. Editorials, letters, abstracts, case studies, and systematic reviews were excluded.

Search methods

A peer‐reviewed 20 search of the literature was conducted by an information specialist (K.F.) in MEDLINE ALL (OvidSP), Embase (OvidSP), CINAHL (EBSCOHost), APA PsycInfo (OvidSP), ERIC (OvidSP), Education Source (EBSCOHost), and Scopus from the databases' inception to April 4, 2022, using a combination of subject headings and keywords for the concepts of “gender differences or bias,” “residents,” and “emergency medicine” (Appendix S1). The search strategy was informed, in part, by examining a recent review on gender bias in graduate medical education 11 and reviews related to residents 21 and EM. 22 No search filters or language limits were used, and conference abstracts were removed when feasible. In addition, a manual search of the reference lists of the retrieved articles was conducted to capture all relevant studies for potential inclusion.

Selection of studies

The search results were imported into Covidence (Veritas Health Information), and duplicates were removed. Two reviewers (I.M., D.G.) independently screened all records by title and abstract, with any disagreements resolved by consensus, or by one of the lead investigators where necessary. Full‐text screening was performed similarly, via double independent screening (I.M., D.G.).

Data collection

Data abstraction was independently completed by two reviewers (I.M., D.G.) using a prespecified and standardized form. Extracted data were verified between the two reviewers. Data were collected on study, participant, and assessment characteristics. The prespecified primary outcome of interest was the impact of gender on resident assessments via quantitative or qualitative assessment data.

Synthesis of results

Outcomes were grouped according to assessment type and essential themes for the purpose of analysis.

RESULTS

Study selection

Our search strategy identified 1411 citations (Figure 1). Following duplicate removal, 667 unique citations remained. Title and abstract screening excluded 646 citations, resulting in 21 records eligible for full‐text screening. Five were excluded based on study objectives, 23 , 24 , 25 , 26 , 27 four for study design, 28 , 29 , 30 , 31 one because it was a duplicate, 32 and one for not including EM residents. 33 Ten records underwent data extraction. 12 , 15 , 32 , 34 , 35 , 36 , 37 , 38 , 39 , 40

FIGURE 1. PRISMA flow diagram.

Study characteristics

Characteristics of included studies are summarized in Table 1. Studies included were published between 2017 and 2022. All included studies were conducted in the United States, with four multicenter studies. 12 , 32 , 35 , 36 Nine of 10 studies focused on EM residents, 12 , 15 , 32 , 34 , 35 , 37 , 38 , 39 , 40 and one study was solely focused on pediatric EM (PEM) fellows. 36

TABLE 1.

Characteristics of included studies

Study | Study setting | Population characteristics | Assessment characteristics
Workplace‐based assessments
Brewer 2020 40 A single 3‐year ACGME‐accredited EM residency program in the United States

71 EM residents (PGY‐1 + PGY‐3 only)

31% female residents

43% female assessors

2765 assessments completed on any one of 23 possible individual ACGME EM milestone subcompetencies. Assessments completed by ED faculty.
Cherney 2018 38 Single‐center suburban health care system with a 4‐year EM residency in the United States

37 EM residents (PGY‐3 and PGY‐4 only)

30% female residents

52% female assessors

517 on‐shift teaching assessments of residents' teaching skills using the Emergency Rotation Scale tool, completed by fourth‐year medical students and/or physician assistant students.
Dayal 2017 12 Multicenter: eight community and academic ACGME‐accredited EM residency programs in the United States

359 EM residents

34% female residents

32% female assessors

33,456 assessments completed on any one of 23 possible individual ACGME EM milestone subcompetencies. Assessments completed by ED faculty.
Mueller 2017 15 A single 3‐year ACGME‐accredited EM residency program in the United States

35 EM residents (PGY‐3 residents only with >15 comments received)

37% female residents

43% female assessors

1317 assessments completed on any one of 23 possible individual ACGME EM milestone subcompetencies. Assessments completed by ED faculty.
Procedural assessment data
See 2022 34 A single, 4‐year EM residency program in the United States

10 EM residents

50% female residents

57% female assessors

18,360 video assessments total (9180 gender‐evident and 9180 gender‐blind videos) assessing three procedures: lumbar puncture, tube thoracostomy, and central venous access. Each video was assessed by an EM faculty assessor from across the United States.
Acuña 2019 32 Multicenter: 16 ACGME‐accredited EM residency programs in the United States

1186 EM residents

37% female residents

% of female assessors not reported

2554 biannual assessments of the PC‐12 ultrasound subcompetency of the ACGME EM educational milestones were completed.
Standardized simulation‐based assessment
Siegelman 2018 37 A single urban 3‐year ACGME‐accredited EM residency program in the United States

102 EM residents

54% female residents

48% female assessors

Number of assessments not reported. Biannual simulation‐based assessments with two cases at a time. Each case assessed four or five ACGME EM subcompetencies. Faculty raters included board‐certified or board‐eligible EM physicians.
Nonphysician qualitative assessment
Brucker 2019 39 A single 3‐year ACGME‐accredited EM residency program in the United States

69 EM residents

% of female residents or % of female assessors not reported

1112 biannual assessments of residents by nursing staff completed. Range of 10–18 completed evaluations for each resident.
CCC milestone data review (multisource ratings)
Santen 2020 35 Multicenter: 117 ACGME‐accredited EM residency programs in the United States

1282 EM residents in 3‐year EM residency programs

35% female residents

% of female assessors not reported

Total number of assessments not explicitly reported. The CCC generates semiannual milestone ratings for each resident based on multiple data sources. The medical knowledge subcompetency was excluded as it was modified during the study.
Zuckerbraun 2021 36 Multicenter: 48 PEM fellowship programs in the United States

604 PEM fellows

67% female PEM fellows

% of female assessors not reported

Total number of assessments not explicitly reported. The CCC generates semiannual milestone ratings for each fellow based on multiple data sources. The 23 ACGME PEM specialty‐based subcompetencies are a combination of pediatrics and EM residency milestones.

Abbreviations: ACGME, Accreditation Council for Graduate Medical Education; CCC, Clinical Competency Committee; PC, patient care subcompetency; PEM, pediatric emergency medicine; PGY, postgraduate year.

A summary of each study's main findings is outlined in Table 2. Seven of the 10 studies examined assessments based on the Accreditation Council for Graduate Medical Education (ACGME) EM or PEM milestones. 12 , 15 , 32 , 35 , 36 , 37 , 39 , 40 Four studies investigated and reported the impact of assessor gender on resident performance scores, and none of the studies reported a statistically significant difference in scores between female and male assessors. 15 , 32 , 34 , 37 Only three studies included a priori definitions of meaningful differences in outcomes, 32 , 34 , 37 two of which included sample size calculations. 32 , 34

TABLE 2.

Summary of study results

Study | Reported difference(s) between male and female residents? | Reported difference based on assessor gender or gender–rater pairing? | Summary of main findings
Workplace‐based assessments
Brewer 2020 40 Yes Not assessed or reported Qualitative: Male and female residents were viewed as equally capable in PGY‐1 (with females receiving more positive comments related to their role as learners/students). In PGY‐3, males were seen as more competent leaders and communicators and were rated as more ready for independent practice compared to female residents.
Cherney 2018 38 No Not assessed or reported Quantitative: Students' evaluations of the residents' teaching skills did not reveal gender differences across any of the four domains of the Emergency Rotation Scale assessment tool.
Dayal 2017 12 Yes No significant differences based on assessor gender Quantitative: Male and female residents entered residency with similar ratings (some higher for females) and by the end of residency, males outscored females on all 23 subcompetencies, equating to a 3‐ to 4‐month additional training gap for females compared to males.
Mueller 2017 15 Yes Not assessed or reported Qualitative: Female residents receive less consistent feedback from attending physicians, with higher rates of negative comments including strong criticism and criticism for lacking valued EM traits and higher rates of discordant comments containing both praise and criticism compared to males.
Procedural assessment data
See 2022 34 No The gender of the evaluator was not associated with a difference in mean scores Quantitative: There were no statistically significant differences in the study evaluators' scores of the proceduralists based on their gender, for both the gender‐evident and the gender‐blinded video assessments.
Acuña 2019 32 No No significant differences based on percentage of female faculty Quantitative: Overall, there is no evidence suggesting significant gender disparity between male and female EM resident ultrasound milestone assessments.
Standardized simulation‐based assessment
Siegelman 2018 37 No No interaction between resident and rater gender Quantitative: In a scripted and controlled assessment environment such as simulation, scores did not differ as a function of resident or rater gender.
Nonphysician qualitative assessment
Brucker 2019 39 Yes Not assessed or reported

Quantitative: No difference between female and male residents in ability or competence as measured by written examination and milestone evaluation scores.

Qualitative: Nurses evaluate female residents lower in their abilities, work ethic, leadership, and professionalism compared to male residents.

CCC milestone data review (multisource ratings)
Santen 2020 35 No Not assessed or reported Quantitative: Small but statistically significant differences in some patient care subcompetencies in PGY‐1 between males and females (some equivalent and some with females initially scoring higher). Females and males were rated similarly in PGY‐3, for all but four subcompetencies, which roughly equate to <1 month of additional training. However, these differences in milestone scores are not educationally meaningful.
Zuckerbraun 2021 36 No Not assessed or reported Quantitative: No major differences in ACGME milestone attainment by sex in any year of training, with the exception that after correction for multiple comparisons, female fellows achieved significantly higher mean scores for PC3 and ICS2 subcompetencies in their final year of training.

Abbreviations: ACGME, Accreditation Council for Graduate Medical Education; CCC, Clinical Competency Committee; ICS, interprofessional and communication skills subcompetency; PC, patient care subcompetency.

Quantitative studies

Workplace‐based assessments

Two studies investigated the influence of gender on workplace‐based assessments via quantitative data. One of the studies, Dayal et al., 12 assessed 33,456 resident ACGME milestone evaluations completed by EM faculty, based on end‐of‐shift assessments across eight community and academic EM residency programs. The second study, Cherney et al., 38 analyzed 517 on‐shift teaching assessments of senior residents (PGY‐3 and PGY‐4) completed by fourth‐year medical students in a single suburban EM residency program.

Dayal et al. 12 reported that female and male residents were evaluated similarly at the beginning of residency, with female residents receiving higher mean scores on a few subcompetencies: diagnostic skills (PC04), multitasking (PC08), and patient accountability (PROF02). By their final year of training, male residents were rated higher than females in all 23 subcompetencies, with the biggest mean differences seen in general approach to procedures (PC09, 0.24) and airway management (PC10, 0.26). These differences roughly equated to 3 to 4 additional months of training for female residents to achieve parity with their male counterparts. 12

Conversely, Cherney et al. 38 examined whether differences existed between female and male residents' teaching skills across the four domains outlined in their assessment tool. Resident teaching skills were assessed via the use of the Emergency Rotation Scale, which includes four domains (didactic, clinical, approachable, helpful) and ranks abilities on a scale of 1–5, ranging from “unacceptable” to “outstanding.” No significant differences were found within any of the four domains between male and female residents' teaching abilities.

Standardized simulation‐based assessment

Siegelman et al. 37 sought to determine whether semiannual simulation assessments resulted in equivalent scores for male and female residents at a single EM program with 102 residents. This study used a region of practical equivalence, determined by expert consensus of EM physicians, to classify a meaningful outcome difference; a sample size calculation was not provided. In an environment that is scripted, standardized, and controlled, such as simulation, there was no significant difference in mean simulation scores between male (64.60) and female (64.95) residents. This study did not investigate the effects of training year as a covariate.

Procedural assessment data

Two studies investigated whether gender differences existed within the assessment of procedural competencies, and neither study found a statistically significant difference in procedural assessments by resident gender. See et al. 34 examined procedural competency in a simulated environment, through a combination of gender‐blinded (hands only) and gender‐evident video recordings of procedures being performed by 10 EM residents. The procedures included lumbar puncture, thoracostomy, and central venous access under ultrasound guidance. This study defined a 10% change in scores using a modified global rating scale, adapted from the Objective Structured Assessment of Technical Skills, as an educationally meaningful difference based on prior studies of trainee progress. Based on their sample size calculation, the study was sufficiently powered to detect a statistically significant difference. A total of 51 EM faculty across the United States scored and reviewed the blinded and unblinded assessment videos, for a total of 18,360 assessments. Using a mixed‐effects regression model, this study found that the gender‐blinded assessment scores of female and male proceduralists were similar (B = 0.05, 95% confidence interval [CI] −0.20 to 0.29). Additionally, the difference in scores was not significant between gender‐evident and gender‐blinded videos.
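The mixed‐effects approach described above can be illustrated with a minimal sketch. The data below are synthetic and the variable names are assumptions for illustration, not details drawn from the study; the key idea is that a random intercept per resident accounts for correlation among the many repeated video assessments of the same proceduralist, while the fixed effect of gender is the quantity of interest (analogous to the reported B).

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)

# Synthetic stand-in data: 10 residents, 60 repeated video assessments each.
n_residents, n_per_resident = 10, 60
resident = np.repeat(np.arange(n_residents), n_per_resident)
gender = np.repeat(np.arange(n_residents) % 2, n_per_resident)  # 5 per group
score = 3.0 + 0.05 * gender + rng.normal(0, 0.5, resident.size)
df = pd.DataFrame({"resident": resident, "gender": gender, "score": score})

# Random intercept per resident handles repeated measures on the same
# proceduralist; the "gender" coefficient is the adjusted mean difference.
fit = smf.mixedlm("score ~ gender", df, groups=df["resident"]).fit()
print(fit.params["gender"], fit.conf_int().loc["gender"].tolist())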

Acuña et al. 32 examined gender differences in 2554 ultrasound milestone evaluations in 1186 EM residents across 16 programs. The investigators defined a priori a 10% difference in mean scores as educationally meaningful. Based on their sample size calculation, the study was sufficiently powered to detect a significant difference. They reported no significant overall difference in mean milestone scores between female and male residents. Moreover, there was no statistically significant difference between female and male mean milestone scores at baseline (PGY‐1) or by the end of their training (PGY‐3).

Qualitative studies

Workplace‐based assessment (physician)

Two single‐center studies investigated gender bias in EM resident assessment via qualitative comments collected from workplace‐based, end‐of‐shift assessments. Brewer et al. 40 included 71 PGY‐1 and PGY‐3 EM residents (no PGY‐2s), whereas Mueller et al. 15 included 35 PGY‐3 EM residents, restricting their analysis to those who received more than 15 comments in total.

Both Brewer et al. 40 and Mueller et al. 15 collected comments from InstantEval V2.0 (Monte Carlo Software LLC) and employed a robust multistage, multianalyst procedure for coding and analyzing data involving expert sociologist input. Resident and assessor gender information was blinded during all stages of coding in both studies. Brewer et al. 40 ensured each comment was coded independently by two sociologists and reviewed by a third medical sociologist. Feedback was broadly coded as positive or negative based on whether the comments were complimentary or critical of resident performance. Mueller et al. 15 first conducted simultaneous open coding of data with five team members with extensive sociological training to capture the main themes from the comments and compile “valued EM traits” and then undertook focused coding within those themes. The authors reported proportions for female and male residents but did not report whether any differences were statistically significant.

In PGY‐1, Brewer et al. 40 reported female residents received a higher number of positive comments in relation to their skills as “learners and students” compared to male residents. Female residents received more “purely positive” feedback, meaning comments with only positive language, and 28.9% of comments stated females performed “exceptionally well” compared to 18.9% of comments for male residents. Female and male residents received similar rates of positive comments in reference to leadership and communication skills.

By PGY‐3, both studies reported that male residents received more positive feedback regarding leadership and communication. Brewer et al. 40 found that 15.9% of male residents received positive feedback about their ability to lead compared to 9.7% of female residents. Additionally, 25.9% of male residents were praised for their ability to communicate effectively with others compared to 18.9% of female residents. Mueller et al. 15 found that 61.5% of female residents received comments containing strong criticism and 76.9% of female comments specifically referenced a lack of valued EM traits, compared to 45.4% and 50% of male comments, respectively.

Additionally, and perhaps most notably, Brewer et al. 40 reported that 41.9% of comments for female residents stated they were not ready for independent practice compared to only 6.2% of male residents. Within the comments received by male residents, 22.7% specifically mentioned the male resident's suitability to a career in an academic center, whereas no female residents received such comments.

Workplace‐based assessment (nonphysician)

Only one study, Brucker et al., 39 assessed whether gender differences exist in the semiannual evaluations of 69 residents by nurses at a single center, primarily via qualitative data. The study authors created their own assessment forms, and each resident received 10–18 completed assessments in total. The comments were blinded by gender and coded into four general categories (standout, ability, grindstone, and interpersonal) by two independent reviewers, with a third reviewer resolving disputes. The categories were based on a study conducted by Schmader et al., 41 which had previously outlined traits/words for inclusion. The comments were then coded for strength and valence of language.

Based on the ordinal scale data from the assessments, nurses rated female residents lower in professionalism (p = 0.04) and leadership skills (p = 0.019) compared to male residents. When analyzing valence of language, 51% of comments for female residents were negative with respect to ability and 57% were negative regarding work ethic (grindstone), compared to 20% and 24%, respectively, for male residents (p < 0.01). In contrast to these differences, the study did not find any statistically significant differences in written examination scores or milestone scores between female and male residents, which were included in the analysis as a frame of reference for resident knowledge and clinical performance.

Clinical competency committee milestone data review (multisource ratings)

Two studies explored whether gender differences exist within semiannual EM milestone assessments following clinical competency committee (CCC) review. Santen et al. 35 collated data for 1282 EM residents from 117 EM residency programs, whereas Zuckerbraun et al. 36 included 604 PEM fellows from 48 PEM fellowship programs. Overall, neither study found educationally significant differences in milestone scores between male and female residents/fellows, as the differences fell within the same “functional level” of a milestone rating and were therefore unlikely to represent an educationally meaningful difference or difference in ability. 35 , 36

DISCUSSION

This study identified differences attributed to gender at the level of workplace‐based assessments in both qualitative and quantitative assessment data of EM residents. As residency programs increasingly apply programmatic approaches to trainee assessment, it is important to consider the impact of these gender disparities on downstream high‐stakes judgments of trainee competence and on female residents' professional and career development.

Qualitative assessment data

One of the notable findings from our review is the increased prevalence of gender differences in the narrative comments associated with workplace‐based assessments of EM residents. Current medical education literature has increasingly acknowledged the importance of written comments as a means of providing details about a resident's abilities or elements of their performance that cannot be captured via a numeric score. 42

The narrative feedback that residents receive can have an important impact on their clinical and professional development. 43 Studies assessing resident perspectives on the utility of narrative feedback have demonstrated that it is highly valued and is used to improve performance and master complex clinical skills. 44 While effective feedback can be defined as “timely, specific, actionable and task‐oriented, rather than person‐oriented,” 45 narrative comments documented on assessments do not always achieve this goal and are often found to be “nonspecific,” “vague,” and sometimes focused on unrelated personality traits. 44 , 45 , 46 Given this increased reliance on narrative comments, it is essential to understand how they affect resident assessment and professional development, and particularly how this may differ by trainee gender.

Assessment within competency‐based medical education has evolved to serve two purposes: assessment “of” and assessment “for” learning. 47 The recent trends in programmatic assessment have emphasized frequent, low‐stakes formative assessments, such as workplace‐based assessments, as a means for providing feedback and guidance for future clinical and professional development. 47 However, as female residents receive similar rates of positive comments compared to males, 15 but face higher proportions of negative, vague, and discordant feedback, this raises questions about the impact of this feedback on female residents' clinical practice, progress through residency, and career development. 15 , 39 , 40 Brewer et al. 40 reported that 41.9% of female residents were told they were not ready for independent practice, compared to only 6.2% of male residents, and no female residents were told they would be suited to an academic career, compared to 22.7% of males.

These findings are in keeping with previous studies that have shown that the greatest levels of attrition in commitment to academia for females occurs during residency 48 with increasing levels of attrition continuing up each faculty rank position to the title of full professor. 49 A recent multi‐institutional survey sent to female EM faculty and residents found that 68.4% of respondents experienced gender discrimination as they progressed throughout their academic careers and that this continues to be a significant barrier to female advancement within academia in EM. 50 While these differences are striking, future research investigating the direct impacts of discordant and critical feedback on female resident professional development and methods to address these gender differences within assessment feedback is needed.

Quantitative assessment data

Of the seven included studies that reported quantitative results across a wide range of assessment types, 12 , 32 , 34 , 35 , 36 , 37 , 38 six did not identify any differences in scores between female and male residents. 32 , 34 , 35 , 36 , 37 , 38 The study by Dayal et al. 12 was the only one to report statistically significant differences attributable to gender. While this study was robust in terms of its large sample across multiple centers, educationally meaningful outcome differences were not defined a priori to facilitate a clear interpretation of the identified differences in ACGME milestone scores. While the remaining six studies investigating quantitative score differences across multiple assessment types (teaching assessments, simulation cases, procedures, and CCC outcomes) 32 , 34 , 35 , 36 , 37 , 38 did not report any statistically significant differences in either single‐center or multicenter settings, only three of these studies included an a priori definition of educationally meaningful outcome differences, 32 , 34 , 37 and only two reported sample size calculations. 32 , 34 Therefore, it is difficult to know whether the remaining studies were sufficiently powered to detect differences attributable to gender.

While there may be some evidence that gender differences in quantitative assessment data among EM trainees exist, the findings of the included studies are challenging to interpret because direct comparison is not possible given the heterogeneity in methodology and outcome measures. The lack of contextual information about what difference is considered educationally meaningful further adds to this challenge. Thus, to facilitate interpretation of findings, future studies seeking to examine gender differences in resident assessments should define what magnitude of difference is considered meaningful within the specific assessment context and ensure that the study is sufficiently powered to detect this difference.
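The a priori planning recommended above can be made concrete with a simple power calculation that translates a prespecified meaningful difference into a required sample size. This is a minimal sketch assuming a two‐group comparison of mean scores; the effect size of 0.5 is an illustrative assumption, not a value drawn from the included studies (which defined meaningful differences as, e.g., a 10% score change).

```python
from statsmodels.stats.power import TTestIndPower

# Suppose a study's educationally meaningful difference in mean
# assessment scores corresponds to a standardized effect size
# (Cohen's d) of 0.5. (Illustrative assumption only.)
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(f"Residents required per gender group: {n_per_group:.0f}")
```

Under these assumptions the calculation yields roughly 64 residents per group; a smaller meaningful difference drives the required sample size up quickly, which is why single‐program studies may be underpowered to detect gender differences.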

Assessment data synthesis

This study identified evidence of gender differences in both quantitative and qualitative assessment data generated from workplace‐based assessments; however, the included studies examining semiannual CCC reviews did not identify any meaningful differences between female and male residents. While these findings do not discredit the existence of gender bias in EM, they do suggest that perhaps these workplace‐based differences, when taken together with other forms of assessment data, may not impact downstream summative outcomes at the CCC level. 35 , 36

Previous studies have postulated that implicit bias and shifting role expectations may be factors influencing rater cognition 40 leading to gender bias in assessors' ratings of performance and written comments. Additionally, these factors may influence the design of workplace‐based assessment tools further contributing to gender differences if, for example, the language of the rating scale anchors commend more agentic traits.

The value of programmatic assessment, however, draws on the fact that a variety of assessment types, from multiple assessors, across different contexts are collected over time. No single assessment encounter or assessment method in itself is sufficient to support a high‐stakes summative decision. Instead, the whole of the assessment data must be synthesized and interpreted. 51 Therefore, the limitations of any one assessment may be countered by the strengths of another. 52 It is possible that gender differences observed in workplace‐based assessments do not surface to the level of summative decisions made by the CCC when synthesized with data generated from a variety of other assessment activities. 51 Additionally, committee member training to increase knowledge of and recognition of unconscious biases may have served as a mitigating factor. 53 , 54 Nevertheless, as mentioned, the gender differences identified at the individual workplace‐based assessment level do raise concerns about the differential impact on EM residents' professional development.

STRENGTHS, LIMITATIONS, AND FUTURE DIRECTIONS

This scoping review has several strengths, primarily the robustness of its methodology, which followed the PRISMA‐ScR guidelines 19 and included peer‐reviewed development of the search strategy. It is the first synthesis and comparison of several assessment types within a single specialty. More specifically, this study captured a wide range of assessments, beginning at the shift level via workplace‐based assessments and extending to the summative decisions regarding resident progression made at the level of the CCC. However, this heterogeneity precludes direct comparison and statistical analysis between studies, as the included studies varied in sample size, assessment goals, assessment tools, and proportions of female participants.

The presence and impact of gender bias within residency assessment is a rapidly evolving area of research, yet several gaps remain in the literature. First, as mentioned, given that every study identified by this review was conducted in the United States, future high‐quality research is warranted in other jurisdictions and medical education systems. From a methodological standpoint, future studies should define what constitutes an educationally meaningful difference in their outcome measures to facilitate interpretation of the findings. Second, if we are to identify effective strategies to address gender bias in resident assessment, a deeper understanding of the specific factors driving these differences is needed. Finally, long‐term studies are needed to gather evidence of the impact of these gender disparities in assessment on female physicians' professional and career development.

CONCLUSIONS

Overall, the findings of our review are notable: evidence of gender bias was found within emergency medicine resident workplace‐based assessments, particularly in relation to narrative comments. These findings do not rule out gender‐based differences in CCC, simulation, or procedural assessments, but instead highlight a significant source of bias within emergency medicine resident assessment. Studies demonstrating higher rates of negative or critical comments and discordant feedback among female emergency medicine residents raise concern about added barriers they may face while progressing through residency and the impact on their clinical and professional development.

AUTHOR CONTRIBUTIONS

Isabella Menchetti, Debra Eagles, and Warren J. Cheung conceived and designed the study. Isabella Menchetti led the production of the review across its different phases, supervised by Warren J. Cheung, who is the methodology expert and senior author. The search strategy was designed and performed by information specialist Karine Fournier, with the assistance of Isabella Menchetti and Debra Eagles. Isabella Menchetti led the screening process with the assistance of Dana Ghanem; both conducted independent screening, data abstraction, and data analyses as per the PRISMA‐ScR protocol, with input from the other authors (Debra Eagles, Jennifer Leppard, and Warren J. Cheung) to resolve disputes where necessary. Isabella Menchetti, with the assistance of Karine Fournier, drafted the manuscript. Debra Eagles, Jennifer Leppard, and Warren J. Cheung were the content experts and contributed substantive changes to the manuscript. All other authors contributed to the revision of the manuscript. All authors have seen and approved the final manuscript version. Isabella Menchetti takes responsibility for the manuscript as a whole.

CONFLICT OF INTEREST

The authors declare no potential conflict of interest.

Supporting information

Appendix S1

Menchetti I, Eagles D, Ghanem D, Leppard J, Fournier K, Cheung WJ. Gender differences in emergency medicine resident assessment: A scoping review. AEM Educ Train. 2022;6:e10808. doi: 10.1002/aet2.10808

Supervising Editor: Wendy Coates

REFERENCES

1. Canadian Medical Education Statistics. Association of Faculties of Medicine of Canada; 2015. Accessed August 2021. https://afmc.ca/sites/default/files/pdf/CMES/CMES2019‐Complete_EN.pdf
2. 2020 Fall Applicant, Matriculant, and Enrollment Data Tables. Association of American Medical Colleges (AAMC); 2020.
3. Emergency Medicine Profile. Canadian Medical Association; 2018. Accessed August 2021. https://www.cma.ca/Assets/assets‐library/document/en/advocacy/profiles/emergency‐e.pdf
4. 2020 Physician Specialty Data Report: Executive Summary. Association of American Medical Colleges; 2021. Accessed August 2021. https://www.aamc.org/data‐reports/data/2020‐physician‐specialty‐data‐report‐executive‐summary
5. Sheppard G, Pham C, Nowacki A, Bischoff T, Snider C. Towards gender equity in emergency medicine: a position statement from the CAEP Women in Emergency Medicine Committee. Can J Emerg Med. 2021;23(4):455‐459. doi: 10.1007/s43678-021-00114-x
6. Kramer M, Heyligers IC, Könings KD. Implicit gender‐career bias in postgraduate medical training still exists, mainly in residents and in females. BMC Med Educ. 2021;21(1):1‐9.
7. FitzGerald C, Hurst S. Implicit bias in healthcare professionals: a systematic review. BMC Med Ethics. 2017;18(1):19.
8. Prentice DA, Carranza E. What women and men should be, shouldn't be, are allowed to be, and don't have to be: the contents of prescriptive gender stereotypes. Psychol Women Q. 2002;26(4):269‐281. doi: 10.1111/1471-6402.t01-1-00066
9. Kolehmainen C, Brennan M, Filut A, Isaac C, Carnes M. Afraid of being "Witchy with a 'B'": a qualitative study of how gender influences residents' experiences leading cardiopulmonary resuscitation. Acad Med. 2014;89(9):1276‐1281.
10. Fnais N, Soobiah C, Chen MH, et al. Harassment and discrimination in medical training: a systematic review and meta‐analysis. Acad Med. 2014;89(5):817‐827.
11. Klein R, Julian KA, Snyder ED, et al. Gender bias in resident assessment in graduate medical education: review of the literature. J Gen Intern Med. 2019;34(5):712.
12. Dayal A, O'Connor DM, Qadri U, Arora VM. Comparison of male vs female resident milestone evaluations by faculty during emergency medicine residency training. JAMA Intern Med. 2017;177(5):651‐657.
13. Rand VE, Hudes ES, Browner WS, Wachter RM, Avins AL. Effect of evaluator and resident gender on the American Board of Internal Medicine evaluation scores. J Gen Intern Med. 1998;13(10):670.
14. Galvin SL, Parlier AB, Martino E, Scott KR, Buys E. Gender bias in nurse evaluations of residents in obstetrics and gynecology. Obstet Gynecol. 2015;126(Suppl 4):7S‐12S.
15. Mueller AS, Jenkins TM, Osborne M, Dayal A, O'Connor DM, Arora VM. Gender differences in attending physicians' feedback to residents: a qualitative analysis. J Grad Med Educ. 2017;9(5):577‐585.
16. Loeppky C, Babenko O, Ross S. Examining gender bias in the feedback shared with family medicine residents. Educ Prim Care. 2017;28(6):319‐324. doi: 10.1080/14739879.2017.1362665
17. Van Melle E, Frank JR, Holmboe ES, Dagnone D, Stockley D, Sherbino J. A core components framework for evaluating implementation of competency‐based medical education programs. Acad Med. 2019;94(7):1002‐1009.
18. Schut S, Maggio LA, Heeneman S, van Tartwijk J, van der Vleuten C, Driessen E. Where the rubber meets the road – an integrative review of programmatic assessment in health care professions education. Perspect Med Educ. 2020;10(1):6‐13. doi: 10.1007/s40037-020-00625-w
19. Tricco AC, Lillie E, Zarin W, et al. PRISMA extension for scoping reviews (PRISMA‐ScR): checklist and explanation. Ann Intern Med. 2018;169(7):467‐473. doi: 10.7326/M18-0850
20. McGowan J, Sampson M, Salzwedel DM, Cogo E, Foerster V, Lefebvre C. PRESS peer review of electronic search strategies: 2015 guideline statement. J Clin Epidemiol. 2016;75:40‐46.
21. Sekhar P, Tee QX, Ashraf G, et al. Mindfulness‐based psychological interventions for improving mental well‐being in medical students and junior doctors. Cochrane Database Syst Rev. 2021;12(12):CD013740.
22. Gonçalves‐Bradley D, Khangura JK, Flodgren G, Perera R, Rowe BH, Shepperd S. Primary care professionals providing non‐urgent care in hospital emergency departments. Cochrane Database Syst Rev. 2018;2(2):CD002097.
23. Cleveland Manchanda EC, Chary AN, Zanial N, et al. The role of gender in nurse‐resident interactions: a mixed‐methods study. West J Emerg Med. 2021;22(4):919‐930.
24. Ashurst JV, McGregor AJ, Safdar B, et al. Emergency medicine gender‐specific education. Acad Emerg Med. 2014;21(12):1453‐1458. doi: 10.1111/acem.12545
25. Chang YC, Lee CH, Chen CK, et al. Exploring the influence of gender, seniority and specialty on paper and computer‐based feedback provision during mini‐CEX assessments in a busy emergency department. Adv Health Sci Educ. 2017;22(1):57‐67. doi: 10.1007/s10459-016-9682-9
26. Jung W, Kim J. Factors associated with first‐pass success of emergency endotracheal intubation. Am J Emerg Med. 2020;38(1):109‐113.
27. Rosenman ED, Misisco AM, Olenick J, et al. Does team leader gender matter? A Bayesian reconciliation of leadership and patient care during trauma resuscitations. J Am Coll Emerg Physicians Open. 2021;2(1):e12348. doi: 10.1002/emp2.12348
28. Jordan J, Maculatis M, Linden J, et al. Do personality characteristics vary by gender in emergency medicine residents? West J Emerg Med. 2019;20(4):S2‐S3.
29. Lewiss RE. Gender influence on ultrasound milestone evaluations among emergency medicine residents. AEM Educ Train. 2020;4(2):166‐167. doi: 10.1002/aet2.10396
30. O'Connor DM, Dayal A, Arora VM. Differences in milestone evaluations of men and women: the devil is in the details. Acad Med. 2020;95(10):1465.
31. Weichenthal LA, Campagne D, Sawtelle S, Ives C, Comes J. 83 Gender at the head of the bed: does gender bias exist in the evaluation of emergency medicine residents' leadership skills during medical resuscitations? Ann Emerg Med. 2017;70(4):S34.
32. Acuña J, Stolz U, Stolz LA, et al. Evaluation of gender differences in ultrasound milestone evaluations during emergency medicine residency training: a multicenter study. AEM Educ Train. 2019;4(2):94‐102.
33. Saleta JM. The Effects of Utilizing High‐Fidelity Simulation in Medical Residency Programs. St. John's University (New York), School of Education and Human Services, ProQuest Dissertations Publishing; 2012.
34. See A, Pallaci M, Aluisio AR, et al. Assessment of implicit gender bias during evaluation of procedural competency among emergency medicine residents. JAMA Netw Open. 2022;5(2):e2147351. doi: 10.1001/jamanetworkopen.2021.47351
35. Santen SA, Yamazaki K, Holmboe ES, Yarris LM, Hamstra SJ. Comparison of male and female resident milestone assessments during emergency medicine residency training: a national study. Acad Med. 2020;95(2):263‐268.
36. Zuckerbraun NS, Levasseur K, Kou M, et al. Gender differences among milestone assessments in a national sample of pediatric emergency medicine fellowship programs. AEM Educ Train. 2021;5(3):e10543.
37. Siegelman JN, Lall M, Lee L, Moran TP, Wallenstein J, Shah B. Gender bias in simulation‐based assessments of emergency medicine residents. J Grad Med Educ. 2018;10(4):411‐415.
38. Cherney AR, Smith AB, Worrilow CC, et al. Emergency medicine resident self‐assessment of clinical teaching compared to student evaluation using a previously validated rubric. Clin Ther. 2018;40(8):1375‐1383.
39. Brucker K, Whitaker N, Morgan ZS, et al. Exploring gender bias in nursing evaluations of emergency medicine residents. Acad Emerg Med. 2019;26(11):1266‐1272.
40. Brewer A, Osborne M, Mueller AS, O'Connor DM, Dayal A, Arora VM. Who gets the benefit of the doubt? Performance evaluations, medical errors, and the production of gender inequality in emergency medical education. Am Sociol Rev. 2020;85(2):247‐270. doi: 10.1177/0003122420907066
41. Schmader T, Whitehead J, Wysocki VH. A linguistic comparison of letters of recommendation for male and female chemistry and biochemistry job applicants. Sex Roles. 2007;57(7–8):509‐514.
42. Cheung WJ, Wagner N, Frank JR, et al. Implementation of competence committees during the transition to CBME in Canada: a national fidelity‐focused evaluation. Med Teach. 2022;44:781‐789. doi: 10.1080/0142159X.2022.2041191
43. Ramani S, Könings KD, Ginsburg S, van der Vleuten CPM. Twelve tips to promote a feedback culture with a growth mind‐set: swinging the feedback pendulum from recipes to relationships. Med Teach. 2018;41(6):625‐631. doi: 10.1080/0142159X.2018.1432850
44. Raaum SE, Lappe K, Colbert‐Getz JM, Milne CK. Milestone implementation's impact on narrative comments and perception of feedback for internal medicine residents: a mixed methods study. J Gen Intern Med. 2019;34(6):929‐935.
45. Lefroy J, Watling C, Teunissen PW, Brand P. Guidelines: the do's, don'ts and don't knows of feedback for clinical education. Perspect Med Educ. 2015;4(6):284‐299. doi: 10.1007/s40037-015-0231-7
46. Ginsburg S, Regehr G, Lingard L, Eva KW. Reading between the lines: faculty interpretations of narrative evaluation comments. Med Educ. 2015;49(3):296‐306. doi: 10.1111/medu.12637
47. Watling CJ, Ginsburg S. Assessment, feedback and the alchemy of learning. Med Educ. 2019;53(1):76‐85. doi: 10.1111/medu.13645
48. Edmunds LD, Ovseiko PV, Shepperd S, et al. Why do women choose or reject careers in academic medicine? A narrative review of empirical evidence. Lancet. 2016;388(10062):2948‐2958.
49. Agrawal P, Madsen TE, Lall M, Zeidan A. Gender disparities in academic emergency medicine: strategies for the recruitment, retention, and promotion of women. AEM Educ Train. 2020;4(Suppl 1):S67.
50. Graham EM, Ferrel MN, Wells KM, et al. Gender‐based barriers to the advancement of women in academic emergency medicine: a multi‐institutional qualitative study. West J Emerg Med. 2021;22(6):1355.
51. Heeneman S, de Jong LH, Dawson LJ, et al. Ottawa 2020 consensus statement for programmatic assessment – 1. Agreement on the principles. Med Teach. 2021;43(10):1139‐1148.
52. Wilkinson TJ, Tweed MJ. Deconstructing programmatic assessment. Adv Med Educ Pract. 2018;9:191.
53. Goldhamer MEJ, Martinez‐Lage M, Black‐Schaffer WS, et al. Reimagining the clinical competency committee to enhance education and prepare for competency‐based time‐variable advancement. J Gen Intern Med. 2022;37(9):2280‐2290. doi: 10.1007/s11606-022-07515-3
54. Andolsek K, Padmore J, Hauer KE, Edgar L, Holmboe E. Clinical Competency Committees: A Guidebook for Programs. 3rd ed. Accreditation Council for Graduate Medical Education; 2020.
