Acad Med. 2023 Jun 2;98(10):1154–1158. doi: 10.1097/ACM.0000000000005287

Increasing Diversity in the Physician Workforce: Pathway Programs and Predictive Analytics

Michael Mayrath 1, Darah Fontanez 2, Ferrahs Abdelbaset 3, Bryan Lenihan 4, David V Lenihan 5
PMCID: PMC10516161  PMID: 37267045

Abstract

Problem

Lack of diversity in the physician workforce has well-documented negative impacts on health outcomes. Evidence supports the use of pathway or pipeline programs to recruit underrepresented in medicine students. However, data on how a pathway program should deliver instruction are lacking. This report describes a multiyear project to build such a system with the goal of increasing diversity within medical school cohorts and ultimately the physician workforce.

Approach

In the 2015–2016 academic year, the Ponce Health Sciences University started a 3-phase project to create a data-driven medical school feeder system by coupling a pathway program with predictive analytics. Phase 1 launched the pathway program. Phase 2 developed and validated a predictive model that estimates United States Medical Licensing Examination (USMLE) Step 1 performance. Phase 3 is underway and focuses on adoption, implementation, and support.

Outcomes

Data analysis compared 2 groups of students (pathway vs direct) across specific factors, including Medical College Admission Test (MCAT) score, undergraduate grade point average (GPA), first-generation status, and Step 1 exam performance. Statistically significant differences were found between the 2 groups on the MCAT exam and undergraduate GPA; however, no significant differences were found between groups for first-generation status and performance on the Step 1 exam. This finding supports the authors’ hypothesis that although pathway students have significantly lower mean MCAT exam scores compared with direct students, pathway students perform just as well on the USMLE Step 1 exam.

Next Steps

Next steps include expanding the project to another campus, adding more socioeconomic status and first-generation data, and identifying best curricular predictors. The authors recommend that medical school programs use pathway programs and predictive analytics to create a more data-centered approach to accepting students with the goal of increasing physician workforce diversity.

Problem

Lack of diversity in physician workforce

This report addresses a generalizable problem that the Association of American Medical Colleges (AAMC) has identified as a core strategic priority: lack of diversity in the physician workforce.1,2 Evidence shows that the lack of diversity has a negative impact on medical outcomes, including treatment decisions, treatment adherence, and patient health.2–4 Diversity is not limited to race. A recent survey of 45,000 respondents to the AAMC’s Matriculating Student Questionnaire found that a “low socioeconomic status significantly decreases the likelihood that a student who is interested in medicine will apply or gain acceptance into medical school.”5(p2)

Solving the lack of physician diversity starts with medical school admissions. Research shows that there is pressure on admissions officers to accept the top scorers on the Medical College Admission Test (MCAT),6 and although this approach may help with rankings, it undercuts progress toward a more diverse physician workforce. “Assigning too much weight to the highest MCAT scores in admissions decision-making makes it difficult to build medical school classes that are representative of patient communities.”6(p351) Lucey and Saguil6 recommend 3 levers to address the issue: (1) holistic admissions processes, (2) pathway programs, and (3) curriculum.

Problem definition

Pathway programs have been suggested as a solution to increase diversity in medical education and ultimately the physician workforce.6 Evidence supports the use of pathway or pipeline programs to recruit underrepresented in medicine students.7 However, data on how a pathway program should deliver instruction are lacking.

Lucey and Saguil’s 3 levers help to address the diversity issue; however, we assert there is a fourth lever to include: data and analytics. We were unable to find any research on how a pathway program can leverage predictive analytics to support students on their journey to becoming a physician. This report describes a multiyear project to build such a system with the goal of increasing diversity within medical school cohorts and ultimately the physician workforce. Analytics is a primary innovation described in this report.

Approach

Hypothesis

We contend that a pathway program coupled with predictive analytics provides a new metric, predicted Step 1 pass/fail strata, for medical school admissions committees to use when evaluating applicants from their pathway program(s). This new metric does not replace existing medical school admissions processes; rather, it is supplemental and applies only to applicants from the pathway program(s). It could help medical school admissions committees make decisions about pathway applicants with less reliance on the MCAT exam. For example, if a pathway student demonstrates mastery of the program’s courses but has a weak MCAT score, the committee could give more weight to the longitudinal performance assessment (pathway program) than to the 1-time snapshot of knowledge (MCAT exam).

Setting

Ponce Health Sciences University (PHSU) is a Middle States Commission on Higher Education–accredited institution that offers 15 health science programs, including a Liaison Committee on Medical Education–accredited medical program. PHSU has campuses in Puerto Rico and St. Louis, Missouri. The main campus was established in Ponce, Puerto Rico, more than 40 years ago.

Three-phase implementation

The following section describes a multiyear project to implement PHSU’s medical school pathway program and develop predictive analytics. The goal is to provide the PHSU medical school admissions committee with more data when evaluating students who do not fit the traditional medical student profile but who are a good fit in terms of being committed to patient care while also increasing diversity in the physician workforce. The project is being conducted in 3 phases: (1) pathway program establishment, (2) predictive analytics development (predicted Step 1 pass/fail strata), and (3) pilot testing and adoption by the medical school admissions committee.

Phase 1: Pathway program establishment.

In the 2015–2016 academic year, PHSU started a pathway program titled Master of Science in Medical Sciences (MSMS). The MSMS is tuition based and uses the same curriculum as the PHSU year 1 curriculum except for 3 courses. We consider the MSMS to be a longitudinal performance assessment program because students must demonstrate the ability to keep up with the appropriate level curricula. Research shows that a longitudinal performance assessment can be a better measure of a student’s knowledge, skills, and abilities compared with a 1-time snapshot of knowledge, such as the MCAT or SAT exam.8

Between the academic years 2015–2016 and 2021–2022, PHSU admitted a total of 668 medical students. A total of 135 (20.2%) were MSMS or pathway program students and 533 (79.8%) were non-MSMS or direct students. The mean age was 23 years for both groups, with 322 women (53.0%) and 285 men (47.0%) across both groups for a total of 607. Sixty-one students did not report their gender. The findings reported in this study are based on PHSU medical school data before the St. Louis campus was opened in fall 2022. Thus, students in the data set are Puerto Rican or Hispanic. First-generation students are defined as students whose parents did not graduate from college. First-generation students as a percentage of total medical school students increased in the last 3 cohorts: 7 (7.6%) in the 2023 graduating cohort, 12 (9.2%) in the 2024 graduating cohort, and 16 (13.8%) in the 2025 graduating cohort.
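The cohort percentages above follow directly from the reported counts. The short Python check below reproduces them using only figures stated in this section; the helper name `share` is ours, introduced for illustration.

```python
def share(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100 * part / total, 1)

TOTAL_ADMITTED = 668        # medical students admitted, 2015-2016 to 2021-2022
PATHWAY, DIRECT = 135, 533  # MSMS (pathway) vs non-MSMS (direct) students
WOMEN, MEN = 322, 285       # students who reported their gender

reported_gender = WOMEN + MEN
not_reported = TOTAL_ADMITTED - reported_gender

print("pathway %:", share(PATHWAY, TOTAL_ADMITTED))
print("direct %:", share(DIRECT, TOTAL_ADMITTED))
print("women %:", share(WOMEN, reported_gender))
print("men %:", share(MEN, reported_gender))
print("gender not reported:", not_reported)
```

Note that the gender percentages use the 607 students who reported a gender as the denominator, not all 668 admitted students.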

Phase 2: Predictive analytics development (predicted Step 1 pass/fail strata).

The original predictive model was created during 2010 to 2014 to predict United States Medical Licensing Examination (USMLE) Step 1 scores. D.V.L. was teaching neuroscience at Touro College of Osteopathic Medicine and was seeking a way to reduce competition within the medical student cohort after a student died by suicide. Rather than students competing with their peers, D.V.L. wanted medical students to compete against historical data collected from previous medical students who passed the Step 1 exam.

The original predictive model used linear regression and relied on an exam question categorization system. Categories are based on PHSU’s medical school program objectives, USMLE systems and disciplines, Liaison Committee on Medical Education and Accreditation Council for Graduate Medical Education standards, and other elements. Students’ exam performance is the primary data source. This original model was used during academic advising sessions with students; however, the system was built as a prototype and needed more modern technologies for scaling.
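To make the idea of a regression-based score prediction concrete, the sketch below fits a one-variable ordinary least squares line in plain Python. It is not the PHSU model: the single predictor (a student's mean percent correct on categorized exam questions), the function names, and the data are all assumptions made up for illustration, and the report does not publish the model's actual inputs or coefficients.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x_new):
    """Apply the fitted line to a new predictor value."""
    return intercept + slope * x_new

# Made-up historical data (not PHSU data): mean percent correct on
# categorized exam questions, and the Step 1 score later achieved.
percent_correct = [62.0, 68.0, 71.0, 75.0, 80.0, 86.0]
step1_score = [198.0, 207.0, 212.0, 219.0, 228.0, 240.0]

slope, intercept = fit_line(percent_correct, step1_score)
print(f"predicted score at 73% correct: {predict(slope, intercept, 73.0):.1f}")
```

A production model would use many predictors (for example, one per question category) and would be validated on held-out cohorts.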

The second phase of development was from 2015 to 2022 and focused on modernizing the infrastructure and algorithm underlying the predictive model. The algorithm includes multiple steps and recursive calculations. These calculations result in the Tiber Performance Value (TPV). The TPV is applied to machine learning models to estimate a midpoint (predicted Step 1 score) within a range. This estimated score was used for internal conversations by faculty, administrators, and staff at PHSU. However, we resisted providing the predicted score to MSMS and medical students because of the potential for demotivation, overinterpretation, and other issues. For example, receiving a predicted score of 235 would be good news to most PHSU medical and MSMS students. However, that score might give them a false sense of security, resulting in them not working as hard. In contrast, a student receiving a predicted score of 165 might lose all confidence and even quit. Thus, we purposely looked for an opportunity to share the prediction without giving a specific score. The Step 1 exam’s change to pass/fail scoring in early 2022 afforded that opportunity.

In 2022, the predictive model was adapted to pass/fail. The adaptation was centered on a pass/fail strata system that estimates the probability of an MSMS or medical student passing the Step 1 exam. There are 5 strata: will pass, likely pass, borderline, at risk, and high risk. In contrast, the previous model used several types of machine learning regression models to estimate a numeric score. The predictive model has undergone rigorous validity testing, described in the Outcomes section.
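A strata system of this kind can be sketched as a mapping from a predicted probability of passing to one of the 5 labels. The report names the strata but does not publish the cut points, so the thresholds in `CUTS` below are hypothetical, as is the function name `stratum`.

```python
from bisect import bisect_right

# Hypothetical cut points on the predicted probability of passing Step 1.
# The report lists the five strata but not the thresholds that separate them.
CUTS = [0.20, 0.40, 0.60, 0.80]
STRATA = ["high risk", "at risk", "borderline", "likely pass", "will pass"]

def stratum(p_pass: float) -> str:
    """Map a predicted pass probability in [0, 1] to a stratum label."""
    if not 0.0 <= p_pass <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    # bisect_right counts how many cut points are <= p_pass, which is
    # exactly the index of the stratum the probability falls into.
    return STRATA[bisect_right(CUTS, p_pass)]

for p in (0.05, 0.35, 0.50, 0.75, 0.95):
    print(f"{p:.2f} -> {stratum(p)}")
```

Reporting a stratum rather than a numeric score matches the stated intent of sharing the prediction without giving students a specific number.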

Phase 3: Pilot testing and adoption by medical school admissions committees.

The third phase is currently underway and consists of working with faculty, administrators, and stakeholders across PHSU. These efforts include conducting workshops and webinars to increase understanding of how the predictive analytics platform can support early alerts, academic advising, and medical school admissions. The goal of this phase is to earn the medical school admissions committee’s adoption of the predictive analytics through a process of communication and transparency.

Outcomes

Phase 1 evaluation

A significant difference was found when comparing the mean MCAT scores between direct (mean score, 498.7) and MSMS (mean score, 490.2) groups for medical school cohorts that started between 2017 and 2021 (P = .002). A significant difference was found when comparing the mean undergraduate grade point averages (GPAs) between direct (mean GPA, 3.68) and MSMS (mean GPA, 3.46) groups for years 2021 to 2025 (P = .001). No difference was found between the percentage of first-generation students when comparing the 2 groups: 34 direct students (10.1%) and 11 MSMS students (10.8%). No differences were found when comparing the mean USMLE Step 1 scores for both groups: direct (mean score, 217.8) and MSMS (mean score, 217.4) students (P = .46). A 2-tailed, 2-sample t-test assuming unequal variances was used to test for significance.
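For readers unfamiliar with the test named above, the sketch below computes the 2-sample t statistic under unequal variances (Welch's test) and its Welch-Satterthwaite degrees of freedom using only the Python standard library. The two score lists are made up for illustration and are not PHSU data; the function name `welch_t` is ours.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for a two-sample t-test assuming unequal variances."""
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df

# Made-up MCAT-like scores for illustration only.
direct = [502, 497, 505, 499, 495, 501]
msms = [489, 493, 486, 492, 488, 494]

t, df = welch_t(direct, msms)
print(f"t = {t:.2f}, df = {df:.1f}")
```

The 2-tailed P value is then read from the t distribution with `df` degrees of freedom. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and returns the P value directly.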

Figure 1 visualizes the relationship between PHSU medical students’ USMLE Step 1 scores and MCAT scores. The figure shows that MSMS students had lower MCAT scores compared with their direct peers; nonetheless, MSMS students performed comparably on the Step 1 exam. The PHSU medical program accepted pathway students with MCAT scores as low as 480 to 490; however, these pathway students benefited from their well-established prior knowledge going into their second year after taking year 1 of medical school twice (once during the MSMS program and then again during year 1 of medical school). These results together indicate that direct and MSMS groups are significantly different in terms of MCAT exam scores and undergraduate GPAs but not significantly different with respect to USMLE Step 1 performance. Results also indicate a 3-year positive trend in the percentage of first-generation medical students at PHSU.

Figure 1.

Relationship between Ponce Health Sciences University medical students’ actual (not predicted) Medical College Admission Test (MCAT) scores and United States Medical Licensing Examination (USMLE) Step 1 exam scores. The horizontal line at the Step 1 score of 194 indicates the passing score during the data collection period. The vertical line at the MCAT score of 500 indicates that most medical school applicants with a score below 500 will likely not be accepted. Abbreviation: MSMS, Master of Science in Medical Sciences.

Phase 2 evaluation

Phase 2 evaluation examined the validity and reliability of the predictive model. Figure 2 visualizes the relationship between USMLE Step 1 actual scores and the TPV. Figure 2 shows a significant positive linear correlation between the TPV ratio and USMLE Step 1 scores (r = .78, P = .001), supporting our hypothesis that use of the MSMS as a pathway program coupled with the TPV predictive model is a stronger predictor of Step 1 performance compared with the MCAT. The Pearson correlation coefficient was used to measure the strength of linear association. Multiple methods are being used to validate the predicted pass/fail strata system, such as dependent variable results validation.
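The Pearson correlation coefficient mentioned above can be computed as in the following sketch. The TPV is a proprietary calculation that the report does not define, so the `tpv` values here are placeholders invented for illustration, as are the paired Step 1 scores and the function name `pearson_r`.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Placeholder values (not PHSU data): a performance value per student
# and that student's actual Step 1 score.
tpv = [0.58, 0.64, 0.69, 0.73, 0.78, 0.85]
step1 = [196, 211, 205, 222, 219, 241]

print(f"r = {pearson_r(tpv, step1):.2f}")
```

A value of r near 1 indicates a strong positive linear association, a value near 0 a weak one. Comparing r for TPV against Step 1 with r for MCAT against Step 1 is the comparison the figures illustrate.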

Figure 2.

Correlation between students’ actual United States Medical Licensing Examination (USMLE) Step 1 exam scores and a proprietary calculation created by the Ponce Health Sciences University technology team called the Tiber Performance Value (TPV). The TPV measures student performance and predicts the likelihood of passing the USMLE Step 1 exam. The horizontal line at the Step 1 score of 194 indicates the passing score during the data collection period. This figure shows a strong positive linear correlation, whereas Figure 1 shows a weak correlation. Abbreviation: MSMS, Master of Science in Medical Sciences.

Figure 3 visualizes the original predictive model’s reliability on a 4-quadrant graph. As Figure 3 shows, the predictive model is generally correct when predicting that a student will pass the Step 1 exam.

Figure 3.

Reliability of the original predictive model. The markers on the graph represent individual students and show the intersection between their actual and predicted scores. The quadrants divide the graph into 4 regions based on the accuracy of the predictions. The top-left quadrant shows students that the model predicted would pass but failed. The top-right quadrant shows students that the model predicted would pass and passed. This quadrant has significantly more students in it compared with the other 3 quadrants. The bottom-left quadrant shows students that the model predicted would fail and failed. The bottom-right quadrant shows students that the model predicted would fail but passed. The 3 lines represent the midpoint and a range of acceptable error.
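The 4-quadrant view amounts to cross-tabulating predicted and actual outcomes against the passing score. The sketch below does this for a handful of made-up (predicted, actual) score pairs; it assumes that a score equal to the passing score of 194 counts as a pass, and the function name `quadrant` is ours.

```python
from collections import Counter

PASSING_SCORE = 194  # Step 1 passing score during the data collection period

def quadrant(predicted: float, actual: float, cut: float = PASSING_SCORE) -> str:
    """Label a student by predicted vs actual outcome (assumes >= cut is a pass)."""
    predicted_pass = predicted >= cut
    actual_pass = actual >= cut
    if predicted_pass and actual_pass:
        return "predicted pass, passed"
    if predicted_pass and not actual_pass:
        return "predicted pass, failed"
    if not predicted_pass and actual_pass:
        return "predicted fail, passed"
    return "predicted fail, failed"

# Made-up (predicted, actual) Step 1 scores for illustration only.
pairs = [(220, 231), (205, 190), (188, 199), (180, 176), (240, 236)]

counts = Counter(quadrant(p, a) for p, a in pairs)
correct = counts["predicted pass, passed"] + counts["predicted fail, failed"]
accuracy = correct / len(pairs)

for label, n in counts.items():
    print(f"{label}: {n}")
print(f"share of correct pass/fail predictions: {accuracy:.2f}")
```

The two agreeing quadrants count correct predictions and the two disagreeing quadrants count errors, so the same tabulation yields an overall accuracy figure.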

Phase 2 evaluation summary

A comparison of Figure 1 (MCAT and Step 1 exams) and Figure 2 (TPV and Step 1 exam) shows a visible difference in correlations. Figure 1 shows a weak correlation between MCAT and USMLE Step 1 scores, whereas Figure 2 shows a strong positive linear correlation between the predictive model and USMLE Step 1 scores. This finding indicates that the TPV predictive model is a better predictor of performance on the Step 1 exam than the MCAT exam is. A likely explanation is that the MSMS serves as a longitudinal performance assessment that collects data over a year, whereas the MCAT exam captures a 1-time snapshot of knowledge.

Phase 3 evaluation

The PHSU medical school admissions committee is using this new metric (predicted Step 1 pass/fail strata) to help evaluate medical school applicants from the MSMS pathway program. This phase is currently underway and will require more time to evaluate. The Phase 3 evaluation plan includes examining how students who were admitted to medical school using the TPV predictive model performed during years 1 and 2 as well as on the Step 1 exam.

Next Steps

Findings and implications

This report is consistent with other research calling for less reliance on the MCAT exam and undergraduate GPA to eliminate bias from medical school admissions.9 Our evidence supports the use of pathway programs to recruit medical school applicants who do not have the traditional profile. Furthermore, our research indicates that coupling a pathway program with predictive analytics creates a data-centered approach to selecting medical students, with the goal being to increase diversity in the physician workforce. Our findings indicate that MSMS students benefit from taking the year 1 curriculum twice before moving on to the more complex and clinical topics of year 2.

Limitations

The primary limitations of this study are sample size and generalizability. The findings in this report are limited to PHSU medical students in Puerto Rico; thus, the findings may not be generalizable to other medical school programs.

Conclusions

We are excited to continue this line of research in multiple areas. First, phase 3 is underway and will generate new data with each cohort. Second, we are expanding our data set to include the PHSU St. Louis campus, which opened in August 2022. Third, socioeconomic status and first-generation data will be presented in a future article. Fourth, we are identifying which areas in the curriculum are the best predictors. Fifth, we will be adding travel distance as a factor.

On a larger scale, to address the lack of diversity in the physician workforce, our recommendation is that U.S. medical schools create a network of pathway programs. This pathway program network could help address the diversity issue by creating a standardized evaluation framework for comparing students based on a shared set of assessments.

PHSU has started this initiative by partnering with universities around the United States that are seeking a pathway program, such as the MSMS, for their undergraduate students in premedicine, biology, and other fields. This project is called the MSMS University Network and includes 7 universities to date. We are seeking more university partners to join our effort of increasing diversity in the physician workforce through pathway programs and predictive analytics.

Acknowledgments:

The authors would like to thank the following contributors and collaborators: Dr. Olga Rodriguez de Arzola, Dr. Georgina Aguirre, Dr. Elizabeth Rivera, Dr. Jose Torres, Sam Willis, and Alex Ruiz.

Footnotes

Funding/Support: None reported.

Other disclosures: None reported.

Ethical approval: This study was approved as exempt by the PHSU IRB, July 20, 2022, #2207109740.

Contributor Information

Darah Fontanez, Email: dfontanez@psm.edu.

Ferrahs Abdelbaset, Email: fabdelbaset@psm.edu.

Bryan Lenihan, Email: dlenihan@psm.edu.

David V. Lenihan, Email: dlenihan@psm.edu.

References

1. Association of American Medical Colleges. Diversity in medicine: Facts and figures 2019. https://www.aamc.org/data-reports/workforce/report/diversity-medicine-facts-and-figures-2019. Accessed May 26, 2023.
2. IHS Markit Ltd. The Complexities of Physician Supply and Demand: Projections From 2019 to 2034. Washington, DC: Association of American Medical Colleges; 2021.
3. Hall WJ, Chapman MV, Lee KM, et al. Implicit racial/ethnic bias among health care professionals and its influence on health care outcomes: A systematic review. Am J Public Health. 2015;105:e60–e76.
4. Cooper LA, Roter DL. Patient-provider communication: The effect of race and ethnicity on process and outcomes of healthcare. In: Unequal Treatment: Confronting Racial and Ethnic Disparities in Health Care. Washington, DC: National Academies Press; 2003:552–593.
5. Shahriar AA, Puram VV, Miller JM, et al. Socioeconomic diversity of the matriculating US medical student body by race, ethnicity, and sex, 2017-2019. JAMA Netw Open. 2022;5:e222621.
6. Lucey CR, Saguil A. The consequences of structural racism on MCAT Scores and medical school admissions: The past is prologue. Acad Med. 2020;95:351–356.
7. Formicola AJ, D’Abreu KC, Tedesco LA. Underrepresented minority dental student recruitment and enrollment programs: An overview from the dental Pipeline program. J Dent Educ. 2010;74:S67–S73.
8. Clarke-Midura J, Dede C. Assessment, technology, and change. J Res Technol Educ. 2010;42:309–328.
9. Liaison Committee on Medical Education. Liaison Committee on Medical Education (LCME) standards on diversity. https://health.usf.edu/~/media/Files/Medicine/MD%20Program/Diversity/LCMEStandardsonDiversity1.ashx?la=en. Accessed May 26, 2023.

Articles from Academic Medicine are provided here courtesy of Wolters Kluwer Health
