Maternal-Fetal Medicine. 2024 Oct 11;6(4):211–214. doi: 10.1097/FM9.0000000000000242

Quantifying the Accuracy of Clinician Risk Assessment for Postpartum Hemorrhage

Ashley N Lewis 1, Diego Villela-Franyutti 2, Henry J Domenico 3, Daniel W Byrne 3, Michaela K Farber 2, Holly B Ende 1
Editors: Yang Pan, Jue Li
PMCID: PMC12094342  PMID: 40406176

Abstract

Objective

To measure the accuracy of postpartum hemorrhage (PPH) risk assessment performed by unaided individual clinicians, to inform future comparison to alternative risk assessment methods.

Methods

Prospective PPH risk assessments were collected from obstetric care team clinicians at two quaternary medical centers in the United States (Vanderbilt University Medical Center, Brigham and Women’s Hospital) from January 2022 to January 2023, following written informed consent from the providers. The data included a cohort of both vaginal and cesarean deliveries (CD). For each assessment, the clinician quantified the patient’s predicted PPH risk on a scale from 0 to 100% and rated their confidence in these assessments using a 5-point Likert scale, ranging from ‘not at all confident’ to ‘completely confident’. Medical records were reviewed 24 hours postpartum to assess the dichotomous outcome of PPH, defined as blood loss ≥1000 mL. The accuracy of these predictions was evaluated using the area under the receiver operating characteristic curve (AUC).

Results

Of 271 patients, 32 (11.8%) experienced PPH, accounting for 11.4% (104/915) of assessments. The overall AUC was 0.64 (95% confidence interval (CI): 0.58–0.71). Prediction accuracy was higher for CD than for vaginal deliveries, with AUCs of 0.82 (95% CI: 0.72–0.91) and 0.56 (95% CI: 0.48–0.63), respectively. No significant differences in the accuracy of assessments were observed according to physician specialty, physician experience level, or confidence level of the assessment.

Conclusion

Overall unaided clinician performance in predicting PPH was moderate, with an AUC of 0.64. Predictions were more accurate for patients undergoing CD. Further study is needed to understand how clinician performance compares to other modalities of risk prediction.

Keywords: Postpartum hemorrhage, Maternal morbidity, Obstetrics, Pregnancy, Risk factors, Risk prediction

Introduction

Preemptively identifying patients at increased risk of postpartum hemorrhage (PPH) may reduce maternal morbidity and mortality, as treatment delays correlate with PPH severity.1 While PPH risk has historically been determined by the clinical judgment of individual providers, over 40 known risk factors exist,2 and cognitive aids or risk assessment tools may provide enhanced prediction.3 Multiple PPH risk assessment tools have been published in recent years, including category-based (low, medium, high risk) tools published by the California Maternal Quality Care Collaborative, the Association of Women’s Health, Obstetric and Neonatal Nurses, and the American College of Obstetricians and Gynecologists.4–6 In addition, nearly 30 logistic regression and machine learning prediction models have been published within the past decade.7,8 To compare these new risk prediction modalities with the historical method of unaided clinical judgment by individual providers, the performance of this baseline method must first be quantified. The aim of this study was to measure the accuracy of PPH risk assessment performed by unaided individual clinicians, to inform future comparison to alternative risk assessment methods.

Materials and methods

Prospective assessments of PPH risk were collected from obstetric care team members at two large quaternary medical centers in the United States from January 2022 to January 2023. Participants included labor and delivery nurses, midwives, certified registered nurse anesthetists, and medical personnel at various levels of training (residents, fellows, and attendings) from anesthesiology and obstetrics departments. These participants constituted a convenience sample of clinicians working in the labor and delivery suite during the recruitment period. Once enrolled, clinicians were eligible to provide assessments during any clinical shift for patients under their care. No clinicians declined participation in the study.

Prior to conducting risk assessments, the research team provided comprehensive training on the study procedures and on the standardized definition of PPH adopted for the study: an estimated blood loss of ≥1000 mL for both vaginal and cesarean delivery. During each assessment, clinicians were required to estimate the patient’s risk of PPH on a scale from 0% to 100% and to rate their confidence in this prediction using a 5-point Likert scale (not at all confident, slightly confident, somewhat confident, fairly confident, completely confident). Additionally, data on the anticipated mode of delivery at the time of prediction were collected. The following script was used during study data collection:

“We are hoping to change the way that postpartum hemorrhage risk prediction occurs at our institution. In preparation for that, we want to collect baseline data on the performance of clinician assessment of postpartum hemorrhage risk without the use of tools or predictive models. For each of these patients, could you provide an estimate of the risk of postpartum hemorrhage defined as blood loss greater than or equal to one liter, based on your knowledge of the patient and her risk factors? We would then like you to report your confidence in the prediction, from 1 ‘not at all confident’ to 5 ‘completely confident’.”

Study investigators collected predictions for both vaginal deliveries (VD) and cesarean deliveries (CD); however, risk assessment procedures differed between groups. For VD, clinicians could be approached for predictions once the patient reached active labor, characterized by cervical dilation of >6 centimeters. Consequently, these risk predictions could be made several hours prior to the actual delivery of the infant(s). In contrast, for CD, predictions were conducted during the preprocedural huddle, typically closer to the time of delivery. It was possible for multiple predictions to be recorded for a single patient from different care team members. Medical records were reviewed 24 hours postpartum by the study investigators to assess for the dichotomous outcome of PPH, defined as blood loss ≥1000 mL.
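As a minimal sketch of how a single assessment could be stored for analysis, the record below mirrors the fields described above. The class name, field names, and validation rules are illustrative assumptions and are not the study's actual data schema.

```python
from dataclasses import dataclass

# Confidence labels as listed in the Methods (1 = lowest, 5 = highest).
CONFIDENCE_LABELS = {
    1: "not at all confident",
    2: "slightly confident",
    3: "somewhat confident",
    4: "fairly confident",
    5: "completely confident",
}


@dataclass(frozen=True)
class Assessment:
    """One clinician's prediction for one patient (hypothetical schema)."""

    patient_id: str
    clinician_role: str        # e.g. "nurse", "attending" (free text here)
    anticipated_mode: str      # "VD" or "CD", as anticipated at prediction time
    predicted_risk_pct: float  # clinician's estimate on the 0-100% scale
    confidence: int            # 1-5 Likert rating

    def __post_init__(self):
        # Reject values outside the ranges described in the Methods.
        if not 0 <= self.predicted_risk_pct <= 100:
            raise ValueError("predicted risk must be between 0 and 100")
        if self.confidence not in CONFIDENCE_LABELS:
            raise ValueError("confidence must be an integer from 1 to 5")
        if self.anticipated_mode not in ("VD", "CD"):
            raise ValueError("anticipated mode must be 'VD' or 'CD'")


def is_pph(blood_loss_ml: float) -> bool:
    """Study outcome definition: blood loss of at least 1000 mL."""
    return blood_loss_ml >= 1000
```

Because several clinicians may assess the same patient, a patient identifier on each record is what allows predictions to be grouped per patient later.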

Statistical analysis

The distribution of predicted risks by PPH status was compared using a Wilcoxon test. The accuracy of predictions was assessed using the area under the receiver operating characteristic curve (AUC) with 95% confidence intervals (95% CI). Differences in predictive accuracy between groups were tested using DeLong’s test for subgroup comparisons or logistic regression with prediction/variable interactions for full cohort comparisons.9 Within-patient variability of predictions was assessed visually by plotting the range of predictions for each patient, stratified by PPH status. All statistical analyses were conducted using R statistical software version 4.2.3 (R Core Team. Vienna, Austria).
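To make the AUC concrete, the sketch below computes it as the proportion of (PPH, non-PPH) prediction pairs in which the PPH patient received the higher predicted risk, which is the same quantity the Wilcoxon rank-sum statistic measures. It is written in Python rather than the R used in the study, the function name is hypothetical, the numbers are invented, and it treats every prediction as independent, which ignores the fact that several predictions could come from the same patient.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the share of (case, control) pairs in which the case received
    the higher predicted risk; tied pairs count as one half.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks (%), invented for illustration only.
pph_risks = [60, 20, 7, 35]
no_pph_risks = [5, 10, 22, 10, 15, 3]

example_auc = auc(pph_risks, no_pph_risks)
print(f"AUC = {example_auc:.2f}")  # 19 of 24 pairs favour the case: 0.79
```

An AUC of 0.5 corresponds to ranking cases and non-cases no better than chance, and 1.0 to ranking every case above every non-case.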

Ethical approval

This study received approval from the institutional review board of both participating institutions (Vanderbilt University Medical Center IRB #220244, Brigham and Women’s Hospital IRB # 2022P001516) and abides by the ethical principles outlined in the Declaration of Helsinki. Clinician participant written informed consent was obtained prior to data collection. The need for individual patient consent was waived by both review boards.

Results

A total of 915 predictions were collected among 271 patients. Of those, 32 patients (11.8%) experienced PPH, accounting for 104 (11.4%) of predictions. The median predicted risk for patients who did not experience a PPH was 10.0% (interquartile range 5.0%–22.0%) compared to 20.0% (interquartile range 7.0%–60.0%) for patients who did experience a PPH (difference in means = 16.1%, 95% CI: 9.8%–22.4%, P < 0.001) (Fig. 1).
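The reported percentages can be reproduced directly from the counts given above. The short check below uses only those counts; the variable names are illustrative.

```python
# Counts taken from the Results paragraph above.
patients_total, patients_with_pph = 271, 32
predictions_total, predictions_for_pph = 915, 104

patient_pph_pct = 100 * patients_with_pph / patients_total
prediction_pph_pct = 100 * predictions_for_pph / predictions_total
predictions_per_patient = predictions_total / patients_total

print(f"{patient_pph_pct:.1f}% of patients had PPH")                   # 11.8%
print(f"{prediction_pph_pct:.1f}% of predictions were for PPH cases")  # 11.4%
print(f"{predictions_per_patient:.1f} predictions per patient on average")
```

The two percentages differ slightly because patients did not all receive the same number of predictions.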

Figure 1.

The distribution of clinicians’ predicted risk of postpartum hemorrhage stratified by risks given for patients with and without postpartum hemorrhage. Predicted risks (x-axis) are shown on a probability scale (0–1). Density (y-axis) reflects relative frequency of clinician-predicted risk of PPH. PPH: Postpartum hemorrhage.

The overall AUC for clinician assessment of PPH risk was 0.64 (95% CI: 0.58–0.71). PPH risk predictions for CD were more accurate than for VD (AUC 0.82 (95% CI: 0.72–0.91) vs. AUC 0.56 (95% CI: 0.48–0.63)). However, there were no significant differences in prediction accuracy according to physician specialty, physician experience level, or confidence level of the prediction (Table 1). Visual examination of the predicted risks for each patient showed a high degree of within-subject variability (Fig. 2).

Table 1.

Postpartum hemorrhage risk prediction performance overall and by subgroup.

| Group | n | AUC | 95% CI | Test statistic | P |
|---|---|---|---|---|---|
| All deliveries | 915 | 0.64 | 0.58–0.70 | −1.38 | 0.17* |
|   Hospital 1 | 715 | 0.62 | 0.55–0.68 | | |
|   Hospital 2 | 200 | 0.72 | 0.61–0.83 | | |
| Anticipated vaginal delivery | 645 | 0.56 | 0.48–0.63 | 4.39 | <0.001* |
|   Hospital 1 | 589 | 0.57 | 0.49–0.65 | | |
|   Hospital 2 | 56 | 0.50 | 0.31–0.69 | | |
| Anticipated cesarean delivery | 270 | 0.82 | 0.72–0.91 | | |
|   Hospital 1 | 126 | 0.77 | 0.63–0.91 | | |
|   Hospital 2 | 144 | 0.91 | 0.86–0.97 | | |
| Confidence levels | | | | 3.56 | 0.47† |
|   Not at all confident | 26 | 0.70 | 0.42–0.98 | | |
|   Slightly confident | 233 | 0.49 | 0.36–0.62 | | |
|   Somewhat confident | 245 | 0.68 | 0.58–0.78 | | |
|   Fairly confident | 316 | 0.68 | 0.57–0.76 | | |
|   Completely confident | 95 | 0.63 | 0.43–0.83 | | |
| Provider specialty‡ | | | | −0.21 | 0.84* |
|   Anesthesia | 454 | 0.69 | 0.61–0.78 | | |
|   Obstetrics | 149 | 0.67 | 0.50–0.84 | | |
| Provider type§ | | | | 2.08 | 0.04* |
|   Physician | 524 | 0.70 | 0.62–0.78 | | |
|   Nurse | 353 | 0.56 | 0.45–0.66 | | |
| Physician experience level | | | | −1.20 | 0.23* |
|   Attending | 301 | 0.67 | 0.56–0.76 | | |
|   Trainee | 223 | 0.76 | 0.64–0.87 | | |

Hospitals 1 and 2 refer to Vanderbilt University Medical Center and Brigham and Women’s Hospital, respectively.

*P values were calculated using DeLong’s test for comparison of receiver operating characteristic curves. The test statistic shown is DeLong’s D statistic. The P value comparing vaginal and cesarean delivery used overall numbers.

†P value was calculated using a likelihood ratio test for the significance of the interaction between confidence level and prediction accuracy. The test statistic shown is the likelihood ratio test statistic, with four degrees of freedom.

‡Anesthesia includes obstetric anesthesia attendings, fellows, residents, and certified registered nurse anesthetists. Obstetrics includes obstetric attendings, fellows, residents, and certified nurse midwives.

§Physician includes obstetric attendings, fellows, residents, and obstetric anesthesia attendings, fellows, and residents. Nurse includes labor and delivery nurses.

AUC: Area under the receiver operating characteristic curve; CI: Confidence interval.
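As a rough plausibility check on the subgroup comparisons in Table 1, the sketch below backs out a standard error from each reported 95% CI and forms a simple z statistic for the difference between two AUCs. This is not DeLong’s test. It assumes symmetric normal-approximation intervals and independent groups, it works from rounded table values, and the function names are hypothetical. Reading the statistic on the "All deliveries" row as a comparison of the two hospitals is also an assumption.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Approximate standard error implied by a symmetric 95% CI."""
    return (upper - lower) / (2 * z_crit)


def approx_z(auc_a, ci_a, auc_b, ci_b):
    """Approximate z statistic for two AUCs from independent groups."""
    se_a = se_from_ci(*ci_a)
    se_b = se_from_ci(*ci_b)
    return (auc_a - auc_b) / math.sqrt(se_a**2 + se_b**2)


# Values copied from Table 1 (rounded as published).
z_mode = approx_z(0.82, (0.72, 0.91), 0.56, (0.48, 0.63))      # CD vs. VD
z_hospital = approx_z(0.62, (0.55, 0.68), 0.72, (0.61, 0.83))  # Hospital 1 vs. 2

print(f"CD vs. VD: z = {z_mode:.2f}")
print(f"Hospital 1 vs. Hospital 2: z = {z_hospital:.2f}")
```

The approximations come out near 4.2 and −1.5, in the same direction and of similar size to the reported 4.39 and −1.38. Exact agreement is not expected given the rounding and the different method.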

Figure 2.

Range of predicted risks for each patient, stratified by postpartum hemorrhage status. Each line represents one patient. Lines show range from minimum to maximum predicted risk, with points designating median predicted risk. Patients are shown stratified by postpartum hemorrhage status, then sorted in descending order of median predicted risk. PPH: Postpartum hemorrhage.
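The per-patient summary behind a plot of this kind can be sketched as follows: for each patient, take the minimum, median, and maximum predicted risk, then sort patients by descending median. The function name and the example data are hypothetical, and the study’s own plotting code is not available in the source.

```python
from statistics import median


def summarize_ranges(predictions):
    """Summarize each patient's predictions as (id, min, median, max).

    `predictions` maps a patient identifier to that patient's list of
    predicted risks on a 0-1 scale. Rows are returned sorted by
    descending median, matching the ordering described in the caption.
    """
    rows = [
        (patient_id, min(risks), median(risks), max(risks))
        for patient_id, risks in predictions.items()
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical predictions for three patients, invented for illustration.
example = {
    "A": [0.05, 0.10, 0.40],
    "B": [0.20, 0.60],
    "C": [0.02, 0.05, 0.05, 0.30],
}

for patient_id, low, mid, high in summarize_ranges(example):
    print(f"{patient_id}: min={low:.2f} median={mid:.2f} max={high:.2f}")
```

To stratify by PPH status, the same function could be applied separately to the patients with and without PPH.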

Discussion

We provide preliminary data on the accuracy of individual clinicians’ unaided assessments of PPH risk, to inform future comparisons with alternative assessment methods. We also explored the performance of risk prediction in subgroups, including delivery type, hospital, confidence level, provider type, specialty, and experience. These data are helpful, as they provide a baseline performance of how accurately clinicians predict PPH without the use of cognitive aids or artificial intelligence tools, which are becoming more widely used and accepted on labor and delivery floors internationally.

Because of the rising morbidity and mortality associated with PPH, multiple international organizations currently recommend standardized PPH risk assessments. Many PPH-related deaths have been shown to be preventable; thus, prediction and preparation are important strategies to combat worsening maternal outcomes. To achieve standardized PPH risk assessment, many tools and prediction models have been published, with varying performance in validation studies. While these tools share many common elements, they differ in their positive and negative predictive values for PPH.3 Next-generation artificial intelligence and machine learning tools are in development and offer the advantage of quantitative risk assessment, similar to that provided by clinicians in this study, rather than category-based assignment to low-, medium-, and high-risk groups.7 However, studies comparing these many potential methods of assessment are lacking.

In our study, predictions prior to CD were more accurate than those prior to VD. The timing of predictions differed in important ways between these two delivery types. Predictions for CD were elicited during a preprocedural timeout, ensuring that they occurred close to the time of delivery, almost always within one hour. This nearly eliminated the possibility of significant changes to patient risk factors between prediction and delivery. Vaginal delivery predictions, on the other hand, only had to occur during the active phase of labor. Thus, in some instances, they likely occurred multiple hours prior to delivery, allowing the possibility that clinical factors evolved or newly presented in ways that would have changed a provider’s assessment of the patient’s risk. An additional factor that may have contributed to the observed difference in accuracy between CD and VD predictions was that clinicians overall tended to overestimate the possibility of PPH in the population as a whole; thus, higher rates of PPH after CD, including for intrapartum arrest of labor, may have inadvertently led to more accurate predictions in that group.

We did not find a difference in the accuracy of predictions between specialties (anesthesia vs. obstetrics) or physician experience levels (attendings vs. trainees). The lack of observed difference between anesthesia and obstetrics may result from the setting in which this study was conducted. Both quaternary medical centers included in this study employ full-time subspecialty-trained obstetric anesthesiologists, who receive more intensive training and education on obstetric comorbidities and pathologies, including PPH. These results may differ in settings where generalist anesthesiologists, with less specific training or focus in obstetrics, cover labor and delivery suites and perform clinical assessments of PPH risk. We also did not find a statistically significant difference in the accuracy of prediction between attendings and trainees. It is likely that the added knowledge, wisdom, and clinical acumen of more experienced attendings was offset by the greater working knowledge of patients and their active risk factors, including most recent labs, vital signs, medication administrations, and changes to clinical status, which trainees are more likely to have during a routine shift. We did note a small difference between physician and nurse PPH prediction accuracy, which may partially be explained by physicians’ greater working knowledge of published literature pertaining to PPH risk factors. Importantly, the confidence in a prediction did not correlate with accuracy. This is especially worrisome because it suggests that clinicians lacked insight into the accuracy of their own predictions, raising the possibility of overconfidence in inaccurate predictions.

Strengths of our study include a large sample size, collection of prospective risk predictions, and multicenter data collection; however, there are several limitations to consider. First, the clinician participants in this study were not blinded to the outcome of PPH after providing risk assessments. Thus, it is possible that clinicians who provided risk predictions for a given patient could have been biased to report higher or lower estimated blood loss, based on their preconceived assessment of the patient’s level of risk. Because any such bias would have served to improve overall discriminative performance, our results may overestimate overall clinician performance. Second, we collected only the quantitative prediction of risk from each clinician. We did not capture any detail about why or how the predictions were made, which might have provided more insight into how individual clinicians perform this complex task. Additionally, because the eventual mode of delivery was unknown at the time of prediction, assessments are categorized by anticipated mode of delivery. This complicates the comparison of anticipated VD assessments (where the mode of delivery frequently changes to CD later) with anticipated CD assessments (where the mode of delivery rarely changes to VD later).

Finally, while we collected data in a multicenter fashion to improve the generalizability of these results, all assessments came from quaternary medical centers in the US, which may not represent the performance of clinicians in other practice settings or other countries. While some may argue that the act of performing a risk assessment would have changed provider behaviors and potentially altered the occurrence of PPH, this concern is lessened by the fact that PPH risk assessment is now required at the time of labor and delivery admission in all birthing hospitals in the United States. Thus, any change to provider behavior should reflect routine practice.

Conclusion

In conclusion, we demonstrate moderate accuracy and possible overconfidence in individual provider PPH risk prediction among obstetric, anesthesia, and nursing clinicians. Studies to further explore and compare individual performance to other risk assessment methods are warranted.

Acknowledgments

The authors would like to acknowledge Noor Raheel and Laura Ibanez-Pintor for their assistance with data collection for this project. Each is employed by Brigham and Women’s Hospital and reports no financial compensation.

Funding

Dr. Farber serves on advisory boards for HemoSonics and Octapharma and receives research funding support from Flat Medical.

Author Contributions

Ashley N. Lewis: Data acquisition, manuscript writing, manuscript editing;

Diego Villela-Franyutti: Data acquisition, manuscript editing;

Henry J. Domenico: Research design, data analysis, manuscript writing, manuscript editing;

Daniel W. Byrne: Research design, data analysis, manuscript editing;

Michaela K. Farber: Research design, manuscript editing;

Holly B. Ende: Research design, data analysis, manuscript writing, manuscript editing.

Conflicts of Interest

None.

Data Availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Footnotes

First online publication: 11 October 2024

How to cite this article: Lewis AN, Villela-Franyutti D, Domenico HJ, Byrne DW, Farber MK, Ende HB. Quantifying the Accuracy of Clinician Risk Assessment for Postpartum Hemorrhage. Maternal Fetal Med 2024;6(4):211–214. doi: 10.1097/FM9.0000000000000242.

Contributor Information

Ashley N. Lewis, Email: ashleynlewis12@gmail.com.

Diego Villela-Franyutti, Email: dvillelafranyutti@mgh.harvard.edu.

Henry J. Domenico, Email: henry.domenico@vumc.org.

Daniel W. Byrne, Email: daniel.byrne@vumc.org.

Michaela K. Farber, Email: mfarber@bwh.harvard.edu.



Articles from Maternal-Fetal Medicine are provided here courtesy of Wolters Kluwer Health
