Abstract
Objectives
Prior literature has demonstrated incongruities among faculty evaluation of male and female residents’ procedural competency during residency training. There are no known studies investigating gender differences in the assessment of procedural skills among emergency medicine (EM) residents, such as those required by ultrasound. The objective of this study was to determine if there are significant gender differences in ultrasound milestone evaluations during EM residency training.
Methods
We used a stratified, random cluster sample of Accreditation Council for Graduate Medical Education (ACGME) EM residency programs to conduct a longitudinal, retrospective cohort analysis of resident ultrasound milestone evaluation data. Milestone evaluation data were collected from a total of 16 ACGME‐accredited EM residency programs representing a 4‐year period. We stratified milestone data by resident gender, date of evaluation, resident postgraduate year, and cohort (residents with the same starting date).
Results
A total of 2,554 ultrasound milestone evaluations were collected from 1,187 EM residents (750 men [62.8%] and 444 women [37.1%]) by 104 faculty members during the study period. There was no significant overall difference in mean milestone score between female and male residents [mean difference = 0.01 (95% confidence interval {CI} = −0.04 to 0.05)]. There were no significant differences between female and male residents’ mean milestone scores at the first (baseline) PGY1 evaluation (mean difference = −0.04 [95% CI = −0.09 to 0.003]) or at the final evaluation during PGY3 (mean difference = 0.02 [95% CI = −0.03 to 0.06]).
Conclusions
Despite prior studies suggesting gender bias in the evaluation of procedural competency during residency training, our study indicates that there were no significant gender‐related differences in the ultrasound milestone evaluations among EM residents within training programs throughout the United States.
There are gender gaps throughout medicine, and the field of emergency medicine (EM) is no exception.1, 2 Despite significant advancements over past decades, there are fewer women than men in EM residency programs and in positions of leadership in academic emergency departments.3 It is postulated that this difference in representation in EM is likely related to a broader imbalance in medicine, with the disparity starting early in medical education and residency training. Several studies have explored gender differences in resident training evaluations among various specialties, with findings that support bias against female residents.4, 5
Accreditation Council for Graduate Medical Education (ACGME)‐accredited EM programs use educational milestones to evaluate residents and track their progression toward unsupervised practice. The concept of milestones for residency training was introduced in 2008 to help faculty standardize evaluations of resident physicians by identifying and tracking resident attainment of core knowledge and skills throughout training.6 Faculty assess residents throughout the year, and milestones are recorded for each resident biannually for review by program leadership. These reviews direct each resident’s academic plan and methods for further education. Prior literature on gender bias has demonstrated that, despite having similar skills and fund of knowledge at the beginning of residency, female EM residents attained lower milestone levels than their male counterparts, a gap most evident by the time of graduation.5
Bedside ultrasound is one of the required milestone achievements for EM residents and is recognized as one of the essential procedural skills for an emergency physician. Prior studies among other specialties have demonstrated significant incongruities among male and female trainees regarding the assessment of procedural skill competency during residency training. Studies have demonstrated disparities not only in the level of autonomy female residents were given while performing procedures when compared to male trainees but also in the overall number and type of procedures performed.7, 8, 9 Unfortunately, these previous studies are poorly generalizable to EM training as many were single‐center studies among non‐EM residents, often utilizing institution‐specific evaluation scales. Therefore, it remains unclear after a review of the prior literature whether such a difference is present in EM training as well. We sought to study potential gender differences in evaluations of residents’ procedural skills by analyzing the ultrasound milestones data for EM residents throughout the United States. A pilot study performed at our institution suggested a gender bias in the evaluation of ultrasound milestone levels among EM residents, which is consistent with other gender bias studies.10 However, a more robust study is needed to assess whether this disparity exists among EM residency programs throughout the country. To our knowledge, there are no previous multicenter studies specifically investigating gender differences in the evaluation of psychomotor or procedural skills among EM residents, such as those required by ultrasound. The objective of this study was to investigate the possible existence of gender differences in ultrasound milestone evaluations during EM residency training across the country.
Methods
Study Design
This was a longitudinal, retrospective analysis of ACGME resident milestone evaluation data from a representative sample of EM residency programs in the United States. The research data were collected from a total of 16 ACGME‐accredited EM residency programs from Fall 2014 to Spring 2018. Both 3‐year and 4‐year residency programs were included in the analysis. Institutional review board (IRB) approval was obtained at the institution initiating the study. Additional IRB approval was obtained from the collaborating programs as deemed necessary by each institution.
Study Population/Selection
A list of all eligible programs (the sampling frame) was compiled from the ACGME list of accredited EM residency programs.11 Programs not accredited through the ACGME and osteopathic programs were excluded. Survey sampling was similar to that in a previous study by Stolz et al.12 Programs were stratified by geographic region, and the designation of geographic region was based on U.S. Census Bureau geographic areas, with the four geographic regions being West, Midwest, Northeast, and South.13 Four programs were randomly selected from each stratum, with alternate programs identified to account for nonresponse of originally chosen programs. The study population includes EM residents trained at the randomly chosen residency programs over a 4‐year period from Fall 2014 to Spring 2018. The milestone evaluation data on these residents were collected biannually as a part of their residency training and competency assessment. Evaluations were performed by faculty members at each program. The ultrasound subcompetency outlines five advancing proficiency levels ranging from level 1 through 5. EM residents are expected to meet at least a level 4 subcompetency requirement prior to graduation.12 Ultrasound competency is typically assessed through the number of ultrasound examinations performed that have passed quality assurance and direct observation of skills.12 All former or current residents who had milestone evaluations recorded over a 4‐year period were included in the study. After randomization, programs were excluded if they elected to not be part of the study after an invitation to participate was extended. Programs were also excluded after randomization if they had not been in existence prior to Fall 2014, as these programs would not have the full 4 years of data required for participation in the study.
Study Protocol
All programs were assigned a random number using a random number generator14 and then sorted in descending order. The first four programs per stratum were designated as primary programs, and the next four as backup programs in the event that any primary programs were not able to participate. Thus, for each stratum, four primary programs along with four backup programs were randomly chosen for participation. Each of these programs had an on‐site faculty member who was chosen to be the site coordinator. This faculty member was asked to coordinate data collection and facilitate any IRB process deemed necessary by their institution. The site coordinator compiled ultrasound milestone evaluations recorded on the residents from their program from Fall 2014 to Spring 2018. The data came from the Accreditation Data System, into which milestone data are entered semiannually. These data had been compiled by clinical competency committees at each institution per ACGME program requirements. All data were deidentified. The resident milestones were recorded along with the gender of each resident. Once the deidentified data were collected on the standardized data collection sheet, they were sent securely to the primary investigator. The site coordinator also completed and returned a short survey that reported the gender of the faculty evaluators, a description of the primary practice setting (academic, community, or county), the average number of ultrasound examinations completed by residents prior to graduation, and the presence/absence of a required ultrasound rotation for residents. The presence or absence of a fellowship training program was determined from the Emergency Ultrasound Fellowships website. Data from all programs were deidentified and pooled to create a composite data set.
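The selection step described above (random number per program, descending sort, first four primary, next four backup) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `select_programs`, the seed, and the placeholder program names are all hypothetical, and the frame of 12 programs per region is invented purely so the example runs.

```python
import random


def select_programs(programs_by_stratum, n_primary=4, n_backup=4, seed=None):
    """Assign each program a random number, sort in descending order, and take
    the first n_primary as primary sites and the next n_backup as backups."""
    rng = random.Random(seed)  # Python's random module uses the Mersenne Twister
    selection = {}
    for stratum, programs in programs_by_stratum.items():
        keyed = [(rng.random(), name) for name in programs]
        keyed.sort(reverse=True)  # descending by the assigned random number
        ordered = [name for _, name in keyed]
        selection[stratum] = {
            "primary": ordered[:n_primary],
            "backup": ordered[n_primary:n_primary + n_backup],
        }
    return selection


# Hypothetical sampling frame: placeholder names, 12 programs per region.
frame = {
    region: [f"{region}-program-{i:02d}" for i in range(1, 13)]
    for region in ("West", "Midwest", "Northeast", "South")
}

chosen = select_programs(frame, seed=2014)  # the seed is arbitrary
for region, picks in chosen.items():
    print(region, picks["primary"], picks["backup"])
```

Fixing the seed makes the draw reproducible, which is useful when a backup program has to be substituted later and the original ordering must be recovered.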
Data Analysis
Based on our previous survey of U.S. ACGME EM residency programs,12 we determined that stratifying by geographic region and then sampling EM residencies (clusters) would result in a variance inflation factor (the increase in the variance relative to simple random sampling) of 2.9. We a priori considered a 10% difference in means to be meaningful (e.g., for a mean score of 1.5 we wanted to be able to detect a significant increase or decrease of 0.15; for a mean of 2.00, a 0.20 change; etc.). We used a mean ultrasound milestone evaluation score from a convenience sample (single program, single evaluation period: mean ± SD = 2.48 ± 1.06) to estimate that in order to detect a 10% change/difference between two groups (mean group 1 = 2.25, mean group 2 = 2.5) with a 1:3 allocation ratio (women represent approximately 35% of EM residents nationwide), power = 80%, and α = 0.05, we would need 590 evaluation scores. Given the variance inflation factor of 2.9, we estimated that we would need 1,711 evaluation scores to have sufficient power to detect our a priori proposed difference. We estimated that each sampled program would provide 120 scores (representing several classes of residents across eight evaluation periods) and that to get at least 1,711 scores and sample an equal number of programs for each stratum, we would need to sample four programs per stratum for a total of 16 programs and 1,920 total evaluations to have sufficient power. Ultrasound milestone evaluation scores ranged from 1 to 5, and evaluators commonly used 0.5 increments.
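The arithmetic linking the 590 scores, the variance inflation factor, and the 16 sampled programs can be retraced in a few lines. The sketch below only reproduces the inflation and rounding steps using the figures stated in the text; it does not rederive the 590 itself, and the variable names are illustrative.

```python
from decimal import Decimal, ROUND_CEILING

base_scores = 590                # scores needed under simple random sampling (from the text)
design_effect = Decimal("2.9")   # variance inflation factor (from the text)
scores_per_program = 120         # expected scores per sampled program (from the text)
n_strata = 4                     # geographic regions

# Inflate the simple-random-sampling requirement by the design effect.
# Decimal avoids binary floating-point error before taking the ceiling.
required_scores = int(
    (base_scores * design_effect).to_integral_value(rounding=ROUND_CEILING)
)

# Smallest number of programs that reaches the requirement (ceiling division),
# then rounded up so every stratum contributes the same number of programs.
min_programs = -(-required_scores // scores_per_program)
programs_per_stratum = -(-min_programs // n_strata)
total_programs = programs_per_stratum * n_strata
expected_scores = total_programs * scores_per_program

print(required_scores, programs_per_stratum, total_programs, expected_scores)
# prints: 1711 4 16 1920
```

Fifteen programs would already exceed 1,711 scores; the equal-allocation constraint across four strata is what raises the total to 16.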
We accounted for the survey design and sampling weights for all analyses using the survey functions (“svy”) in Stata (version 14.2; StataCorp). Sampling weights were the inverse probability of sampling each cluster within each stratum, and we applied a finite population correction. We used generalized linear multilevel mixed‐effects regression to estimate mean milestone evaluation data, with individual residents considered a random effect. Residents were nested within programs and programs were nested within strata. We constructed an overall regression model that included gender, evaluation date, the postgraduate year (PGY) for each milestone evaluation, and an interaction term between PGY at each evaluation and evaluation date. We also included several other potential confounding factors, judged relevant by the authors, to examine confounding of the relationship between milestone scores and gender (presence of an ultrasound fellowship, size of the program, type of program [community vs. university‐based], percentage of female EM faculty, self‐reported [by program] number of ultrasound examinations by each resident prior to graduation, and 3‐year versus 4‐year program). The distribution of scores was approximately normal, and we found no transformation of the data that improved normality. Preliminary analyses indicated that the distribution of milestone scores and regression residuals met the assumption of normality; consequently, we used the continuous, untransformed milestone evaluation scores for analyses. We used the final regression models to estimate differences (i.e., marginal mean differences) between mean evaluation scores for male and female residents after stratifying by PGY at each evaluation and evaluation date. This generated mean differences for 32 male–female resident comparisons representing seven distinct resident cohorts.
Regression models were also used to analyze differences in the increase of mean evaluation scores from PGY‐1 to PGY‐3 for all residents that had both PGY‐1 and PGY‐3 evaluations. We also analyzed differences in the increase from PGY‐1 to PGY‐2 and from PGY‐2 to PGY‐3 between male and female evaluation scores. Finally, we also examined mean differences between men and women's scores for their first PGY‐1 evaluation and their last PGY‐3 evaluation.
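To make the survey-design terms used in this section concrete, the following Python sketch computes a sampling weight (inverse probability of selecting a program within its stratum), the corresponding finite population correction, and a weighted proportion. It is an illustration only, not the Stata analysis: the frame sizes and resident counts are hypothetical, and the helper names `design_quantities` and `weighted_proportion` are invented for this example.

```python
from fractions import Fraction

# Hypothetical number of eligible programs per region (NOT from the study).
frame_sizes = {"West": 30, "Midwest": 45, "Northeast": 50, "South": 55}
sampled_per_stratum = 4  # from the text: four programs sampled per stratum


def design_quantities(frame_size, n_sampled):
    """Return (sampling weight, finite population correction) for one stratum."""
    p_select = Fraction(n_sampled, frame_size)  # probability a program is sampled
    weight = 1 / p_select                       # inverse probability of selection
    fpc = 1 - p_select                          # variance multiplier 1 - n/N
    return weight, fpc


def weighted_proportion(records):
    """records: iterable of (weight, n_female, n_total), one per sampled program."""
    numerator = sum(w * f for w, f, _ in records)
    denominator = sum(w * t for w, _, t in records)
    return numerator / denominator


weights = {
    stratum: design_quantities(size, sampled_per_stratum)
    for stratum, size in frame_sizes.items()
}

# Hypothetical resident counts for two sampled programs.
records = [
    (weights["West"][0], 15, 40),
    (weights["South"][0], 20, 60),
]
estimate = weighted_proportion(records)

for stratum, (weight, fpc) in weights.items():
    print(stratum, float(weight), round(float(fpc), 3))
print(round(float(estimate), 3))
```

Programs from strata with larger frames carry larger weights, which is why a weighted national estimate can differ from the raw sample proportion.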
Results
Demographic Characteristics
A total of 2,554 milestone evaluations were collected from 1,186 EM residents (750 men [63.2%] and 436 women [36.8%]) by 104 faculty members during the study period. Our study included evaluations from 16 EM training programs (10 academic programs, four community programs, and two county programs). The training programs represent all four U.S. Census–designated regions of the United States (Northeast, Midwest, South, West) in a mix of rural, suburban, and urban settings. Training programs ranged in size from 29 to 62 residents and 18 to 133 physicians on faculty. Fifteen of the 16 programs required that residents rotate through an ultrasound rotation as part of their residency training. Nine of the 16 programs had an ultrasound fellowship program.
Table 1 shows the demographic and study characteristics for our sample as well as the national estimates for the number of residents and proportions and means. While the overall proportion of female EM residents in our sample was 36.8%, we estimated the proportion of female EM residents nationally to be 37.2% (95% confidence interval [CI] = 32.3 to 42.5) after applying survey sampling weights and taking the survey design into account.
Table 1.
Survey sample characteristics and national population estimates for emergency medicine residents in ACGME‐accredited residency programs
| EM resident/residency characteristic | Survey sample, n (%) | National estimate, n (95% CI) | National estimate, percentage (95% CI) |
|---|---|---|---|
| Total | 1,186 (100) | 61,964 (52,448 to 71,479) | 100 |
| Gender | | | |
| Female | 436 (36.8) | 23,052 (18,357 to 27,746) | 37.2 (32.3 to 42.5) |
| Male | 750 (63.2) | 38,861 (32,051 to 45,670) | 62.8 (57.5 to 67.7) |
| Program affiliation | | | |
| University | 773 (65.1) | 42,605 (22,983 to 62,226) | 68.8 (38.5 to 88.6) |
| Community | 418 (34.9) | 19,359 (3,129 to 35,588) | 31.2 (11.4 to 61.5) |
| Program size | | | |
| ≤35 residents | 210 (17.7) | 10,860 (956 to 22,676) | 17.5 (4.8 to 47.5) |
| 36+ residents | 917 (82.3) | 51,104 (31,646 to 70,561) | 82.5 (52.5 to 95.2) |
| Self‐reported (by program) number of completed ultrasound exams by residents upon graduation – mean (95% CI) | 308 (301 to 315) | – | 304 (225 to 383) |
| Presence of ultrasound fellowship | 785 (66.1) | 40,145 (19,969 to 60,320) | 64.8 (34.3 to 86.7) |
| Percent female faculty – mean (95% CI) | 32.2 (31.6 to 32.8) | – | 35.6 (28.7 to 38.5) |
| Residency type | | | |
| 4‐year | 326 (27.5) | 19,083 (56 to 38,734) | 30.8 (9.6 to 65.1) |
| 3‐year | 861 (72.5) | 42,881 (23,911 to 61,850) | 69.2 (34.9 to 90.4) |
Abbreviations: CI, confidence interval; EM, emergency medicine.
Figure 1 shows the mean ultrasound milestone evaluation scores for the seven cohorts sampled in our survey, stratified by gender, date of evaluation, and PGY in residency. Table 2 shows the mean differences and associated 95% CIs for all male–female comparisons graphically shown in Figure 1. Only paired differences within a cohort between male and female scores for the same evaluation date were analyzed. Only one paired difference (Fall 2014/PGY‐1 scores for the 2014/2015 cohort) was statistically significant with a mean difference (female–male residents) of −0.28 (95% CI = −0.51 to −0.06).
Figure 1.

Mean ultrasound milestone evaluation score estimates for emergency medicine (EM) residents in the United States. All estimates account for the complex survey sampling design of the study and represent national estimates for all EM residents in the United States from Fall (Fa) 2014 to Spring (Sp) 2017. Estimates for female residents are represented by circles and estimates for males by squares. I‐bars represent 95% CIs. Integers 1 to 4 represent the postgraduate year for the fall and spring scores in the same academic year. Estimates represent seven distinct cohorts of EM residents (shaded alternating gray and black), each labeled parenthetically with the cohorts’ starting academic year in italics.
Table 2.
Mean differences for ultrasound milestone evaluation scores between male and female emergency medicine residents
| Cohort (starting academic year) | Postgraduate year | Evaluation date | Mean ultrasound milestone evaluation score difference: females − males (95% confidence interval)† |
|---|---|---|---|
| 2011/2012 | 4 | Fall 2014 | 0.30 (−0.12, 0.73) |
| | | Spring 2015 | 0.14 (−0.07, 0.35) |
| 2012/2013 | 3 | Fall 2014 | 0.16 (−0.10, 0.42) |
| | | Spring 2015 | 0.00 (−0.20, 0.19) |
| | 4 | Fall 2015 | 0.11 (−0.11, 0.34) |
| | | Spring 2016 | 0.11 (−0.12, 0.33) |
| 2013/2014 | 2 | Fall 2014 | −0.19 (−0.49, 0.11) |
| | | Spring 2015 | −0.15 (−0.41, 0.11) |
| | 3 | Fall 2015 | −0.12 (−0.29, 0.06) |
| | | Spring 2016 | −0.02 (−0.11, 0.07) |
| | 4 | Fall 2016 | −0.18 (−0.46, 0.10) |
| | | Spring 2017 | −0.18 (−0.47, 0.11) |
| 2014/2015 | 1 | Fall 2014 | −0.28 (−0.51, −0.06)‡ |
| | | Spring 2015 | −0.18 (−0.48, 0.13) |
| | 2 | Fall 2015 | −0.10 (−0.35, 0.15) |
| | | Spring 2016 | −0.03 (−0.23, 0.18) |
| | 3 | Fall 2016 | −0.01 (−0.14, 0.12) |
| | | Spring 2017 | −0.06 (−0.14, 0.01) |
| | 4 | Fall 2017 | −0.10 (−0.30, 0.11) |
| | | Spring 2018 | −0.22 (−0.49, 0.05) |
| 2015/2016 | 1 | Fall 2015 | −0.05 (−0.27, 0.16) |
| | | Spring 2016 | −0.19 (−0.49, 0.11) |
| | 2 | Fall 2016 | −0.02 (−0.34, 0.29) |
| | | Spring 2017 | −0.07 (−0.30, 0.16) |
| | 3 | Fall 2017 | −0.01 (−0.19, 0.17) |
| | | Spring 2018 | 0.01 (−0.08, 0.10) |
| 2016/2017 | 1 | Fall 2016 | 0.11 (−0.20, 0.41) |
| | | Spring 2017 | 0.07 (−0.27, 0.40) |
| | 2 | Fall 2017 | −0.04 (−0.29, 0.21) |
| | | Spring 2018 | −0.03 (−0.19, 0.13) |
| 2017/2018 | 1 | Fall 2016 | −0.03 (−0.33, 0.27) |
| | | Spring 2017 | 0.07 (−0.17, 0.32) |
Bold differences: mean female scores > male scores.
†Differences calculated from final regression model (generalized linear mixed model) as marginal mean difference between female and male mean scores for each postgraduate year and milestone evaluation date, controlling for all other variables in the model (see text for all covariates).
‡Statistically significant.
Our initial combined multilevel mixed‐effects model included resident gender as the primary independent variable along with evaluation date, PGY at evaluation date, and a PGY‐by‐evaluation‐date interaction term; it showed no statistically significant difference between male and female residents’ scores. We then added the other candidate factors to the model and assessed whether any of them confounded the relationship between gender and milestone scores; there remained no difference in mean ultrasound evaluation scores between men and women. After controlling for evaluation date, PGY at each evaluation, presence of an ultrasound fellowship, size of program, university versus community program, the self‐reported (by programs) number of ultrasound examinations by each resident upon graduation, 3‐year versus 4‐year program, and the percentage of female EM faculty for each program, the mean difference in evaluation scores (female–male) was 0.01 (95% CI = −0.04 to 0.05; Table 3).
Table 3.
Generalized linear mixed‐model regression results for ultrasound milestone evaluation scores for emergency medicine residents
| Resident/residency characteristic | Regression coefficient† | 95% Confidence interval† |
|---|---|---|
| Female (vs. male) | 0.01 | −0.04 to 0.05 |
| Post graduate year (PGY) | ||
| 1 | Referent | |
| 2 | 1.08 | 1.04 to 1.12 |
| 3 | 1.89 | 1.82 to 1.96 |
| 4 | 2.49 | 2.38 to 2.61 |
| Ultrasound milestone evaluation date | ||
| 2014 – Fall | Referent | |
| 2015 – Spring | 0.61 | 0.56 to 0.66 |
| 2015 – Fall | 0.26 | 0.20 to 0.32 |
| 2016 – Spring | 0.69 | 0.63 to 0.76 |
| 2016 – Fall | 0.41 | 0.33 to 0.49 |
| 2017 – Spring | 0.84 | 0.75 to 0.94 |
| 2017 – Fall | 0.59 | 0.48 to 0.71 |
| 2018 – Spring | 1.01 | 0.88 to 1.14 |
| PGY * evaluation date (interaction term) | −0.04 | −0.05 to −0.03 |
| Ultrasound fellowship (vs. No) | 0.18 | −0.26 to 0.62 |
| Large program size, 36+ residents (vs. small program, <36) | −0.25 | −0.52 to 0.02 |
| University‐based residency (vs. community‐based) | −0.10 | −0.47 to 0.28 |
| Completed US exams per resident before graduation (self‐reported by programs) | ||
| 100–200 | Referent | |
| 201–300 | 0.40 | −0.05 to 0.84 |
| 301–400 | 0.18 | −0.21 to 0.56 |
| 401–500 | −0.14 | −0.66 to 0.39 |
| >500 | −0.20 | −0.61 to 0.23 |
| Percent female faculty (per 10%) | 0.02 | −0.02 to 0.06 |
| 4‐Year residency (vs. 3‐year) | 0.05 | −0.25 to 0.34 |
Increase in Ultrasound Milestone Evaluation Score.
Generalized Linear Mixed Model Linear Regression – Gaussian family and link function, residents nested within programs, programs nested within geographic strata. Regression model error estimates account for complex survey design, with sampling weights equal to the inverse probability of cluster selection, and a finite population correction.
A comparison of the initial evaluation scores for PGY‐1 residents showed that women's ultrasound evaluation scores were on average 2% lower than men's scores (mean difference female–male = −0.04 [95% CI = −0.09 to 0.003]); however, this difference was not statistically significant and was well below our a priori threshold of a 10% difference. For all PGY‐3 residents, women had a slightly higher final mean milestone evaluation score (mean difference female–male = 0.02 [95% CI = −0.03 to 0.06]), but again this difference (<1%) was not statistically significant.
We also compared the change in scores between male and female residents from initial PGY‐1 score to the final PGY‐3 score for the two cohorts (2014/2015 and 2015/2016) that had data for PGY‐1 and ‐3 residents. The overall difference for increase of ultrasound milestone scores (mean increase females–males) from first PGY‐1 score to final PGY‐3 score was not statistically different. There was also no statistically significant difference in the mean increase in ultrasound milestone evaluation scores (from first PGY‐1 score to last PGY‐2 score) between men and women (mean increase female–male) for the 2016/2017 cohort (−0.14 [95% CI = −0.36 to 0.077]) or the change from first PGY‐2 score to final PGY‐3 score for cohort 2013/2014 (+0.19 [95% CI = −0.07 to 0.46]).
Discussion
This is one of the largest nationally representative studies to date examining whether gender differences exist in ultrasound milestone evaluations. We found that, overall, there is no evidence suggesting significant gender disparity between male and female EM resident ultrasound milestone evaluations. While there are consistent small differences between male and female mean evaluation scores, the magnitude of these differences did not reach statistical significance, nor did they meet our a priori threshold of a 10% change. These findings may be unexpected given the recent evidence of gender bias in medical training across multiple specialties. A recent comprehensive study by Dayal et al.5 revealed a wide gender gap in the milestone evaluations of male and female EM residents. The study found that despite having similar skills and knowledge at the beginning of residency training, female EM residents were evaluated lower on the vast majority of the 23 EM competencies/subcompetencies by the time of graduation when compared to their male counterparts. A limitation of that study was its use of a convenience sample of residency programs. We hoped to overcome this limitation with a robust and representative study design that produces generalizable findings.
A number of factors could have contributed to the lack of gender bias in this study, contrary to that seen in recent literature. Perhaps the recent awareness and advancements toward gender equality are encouraging faculty members to deliberately address and reveal their own overt and covert biases. Traditionally, ultrasound evaluations are performed by faculty with ultrasound expertise who rely on objective factors for evaluating residents, such as direct supervision of psychomotor skills, image quality/acquisition, number of ultrasound examinations performed, quality assurance reviews, and performance on knowledge‐based examinations. This is supported by Amini et al.,15 who demonstrated that nearly all ACGME EM residency programs currently use image quality assurance and direct observation during clinical shifts for ultrasound skills assessments. These assessments are more objective and therefore might be less prone to bias compared to more subjective assessments. Because we chose to evaluate a more objective milestone, greater levels of bias would likely be required to produce a detectable difference between male and female evaluations.
The goal of this study was to identify a hypothesized gender bias for resident milestone achievement from a representative sample of programs; therefore, the decision was made to systematically stratify EM residency programs by geographic region. Since gender bias could be influenced by regional cultural practices and beliefs, it was presumed that stratification by geography might result in the greatest variance reduction (compared to simple random sampling). We used a population‐based stratified cluster sample of EM residency programs that allowed us to generate national estimates for United States EM residency ultrasound milestone characteristics for men and women, and our findings are generalizable to all EM residents in the United States. Previous studies on this topic utilized convenience samples and therefore had a significant chance for selection bias. Our systematic method of sampling was chosen to minimize the risk of introducing any selection bias. Our findings may therefore better reflect the national status of gender differences among EM resident evaluations than prior studies, although our findings are limited in scope as they are focused on a single subcompetency.
Limitations
Our study has several limitations. Our focus on a single subcompetency limits generalizability in this regard, but it informs the point‐of‐care ultrasound community on how we are evaluating EM residents. There are also concerns regarding the current practice of using the ultrasound subcompetency to assess resident progression. The PC12 subcompetency has been largely criticized by experts in the point‐of‐care ultrasound community. In 2016, Nelson et al.16 discussed several measurement issues with the current ultrasound subcompetency. As written, the ultrasound milestones do not address procedural guidance, the ability to utilize emergency ultrasound protocols and clinical algorithms, documentation of ultrasound examinations, and understanding of the limitations of emergency ultrasound. These omissions were addressed by Nelson et al. and incorporated into their proposed revision of the PC12 subcompetency. Some programs use the PC12 subcompetency as currently written, and some programs use the revised version proposed by Nelson et al. A limitation of our study is that we do not know whether any of the participating programs are using the modified version of the subcompetency. These data could have been obtained using the survey we sent out to the programs and incorporated into our analysis.
An additional limitation is that faculty evaluators are not consistent among all the programs. At programs with a robust ultrasound program, an ultrasound faculty member would be the evaluator, but other programs may have their residency program director make these determinations. Resident ultrasound rotations are also not standardized, so the length, structure, curricula, and materials can vary greatly from one program to the next. Timing of the ultrasound rotations can also affect the scores, since milestone evaluations are performed twice per academic year. If the resident had not completed their ultrasound rotation by the time the evaluations are completed, then there is one fewer data point.
We attempted to determine if evaluators were male or female, but most programs had both male and female faculty assess their residents, which made it difficult to form any conclusions on gender biases based on the gender of the evaluators. We also did not collect data on the race of the evaluators, so we cannot comment on how that would potentially impact our results. In addition, we did not collect data related to nonultrasound milestones, which would have been ideal for comparison. We suggested that, given the objective nature of ultrasound milestone evaluations, the scores may have been less vulnerable to the effects of implicit bias. This argument could have been supported had we obtained and compared data from some of the less objective milestones.
Conclusions
Our study results indicate there was no significant difference in the faculty evaluation of ultrasound milestones among emergency medicine residents within training programs throughout the United States.
AEM Education and Training 2020;4:94–102
Supported by the Department of Emergency Medicine, University of Arizona.
The authors have no potential conflicts of interest to disclose.
Author contributions: JA—study concept and design, acquisition of the data, analysis and interpretation of the data, drafting of the manuscript, and critical revision of the manuscript for important intellectual content; US—analysis and interpretation of the data, drafting of the manuscript, critical revision of the manuscript for important intellectual content, and statistical expertise; LAS—study concept and design, acquisition of the data, analysis and interpretation of the data, drafting of the manuscript, critical revision of the manuscript for important intellectual content, and statistical expertise; EHS—drafting of the manuscript and critical revision of the manuscript for important intellectual content; GB, RB, JB, DC, KC, TF, EG, RJ, SH, CK, KK, SL, PP, ES, JS, EJ, and DT—acquisition of the data; SA—study concept and design, acquisition of the data, analysis and interpretation of the data, drafting of the manuscript, critical revision of the manuscript for important intellectual content, and statistical expertise.
A related article appears on page https://doi.org/10.1002/aet2.10396.
Supervising Editor: Sorabh Khandelwal, MD.
References
- 1. Madsen TE, Linden JA, Rounds K, et al. Current status of gender and racial/ethnic disparities among academic emergency medicine physicians. Acad Emerg Med 2017;24:1182–92.
- 2. Wiler JL, Rounds K, McGowan B, Baird J. Continuation of gender disparities in pay among academic emergency medicine physicians. Acad Emerg Med 2019;26:286–92.
- 3. Association of American Medical Colleges. The State of Women in Academic Medicine: The Pipeline and Pathways to Leadership, 2015–2016. Washington, DC: AAMC, 2016.
- 4. Mueller AS, Jenkins TM, Osborne M, Dayal A, O'Connor DM, Arora VM. Gender differences in attending physicians’ feedback to residents: a qualitative analysis. J Grad Med Educ 2017;9:577–85.
- 5. Dayal A, O'Connor DM, Quadri U, Arora VM. Comparison of male versus female resident milestone evaluations by faculty during emergency medicine residency training. JAMA Intern Med 2017;177:651–7.
- 6. Beeson MS, Carter WA, Christopher TA, et al. The development of the emergency medicine milestones. Acad Emerg Med 2013;20:724–9.
- 7. Meyerson SL, Sternbach JM, Zwischenberger JB, Bender EM. The effect of gender on resident autonomy in the operating room. J Surg Educ 2017;74:e111–8.
- 8. Hoops H, Heston A, Dewey E, Spight D, Brasel K, Kiraly L. Resident autonomy in the operating room: does gender matter? Am J Surg 2019;217:301–5.
- 9. Wang H, Warwick E, de Grubb MC, Deng N, Corboy J. Evaluation of obstetrics procedure competency of family medicine residents. Fam Med Community Health 2016;3:69–78.
- 10. Acuna J, Patanwala A, Situ‐LaCasse EH, et al. Identification of gender differences in ultrasound milestone assessments during emergency medicine residency training: a pilot study. Adv Med Educ Pract 2019;1:141–5.
- 11. Accreditation Council for Graduate Medical Education. List of Programs by Specialty. Available at: https://www.acgme.org/acgmeweb/tabid/172/GraduateMedicalEducation/AccreditedProgramsandSponsoringInstitutions.aspx. Accessed July 31, 2018.
- 12. Stolz LA, Stolz U, Fields JM, et al. Emergency medicine resident assessment of the emergency ultrasound milestones and current training recommendations. Acad Emerg Med 2017;24:353–61.
- 13. U.S. Bureau of the Census. Geographic Areas Reference Manual. Washington, DC: U.S. Department of Commerce, Economics and Statistics Administration, Bureau of the Census, 1994.
- 14. Matsumoto M, Nishimura T. Mersenne Twister: a 623‐dimensionally equidistributed uniform pseudo‐random number generator. ACM Trans Model Comput Simul 1998;8:3–30.
- 15. Amini R, Adhikari S, Fiorello A. Ultrasound competency assessment in emergency medicine residency programs. Acad Emerg Med 2014;21:799–801.
- 16. Nelson M, Abdi A, Adhikari S, et al. Goal‐directed ultrasound milestones revised: a multiorganizational consensus. Acad Emerg Med 2016;23:1274–9.
