Abstract
Objective
To examine the variability in the cesarean delivery (CD) rates of individual labor and delivery nurses compared with physicians at three attribution time points.
Data Sources
Medical record data from nine hospitals in Washington State from January 2016 through September 2018.
Study Design
Retrospective, observational cohort design using an aggregated database of birth records.
Data Collection/Extraction Methods
Chart‐abstracted clinical data from a subset of nulliparous, term, singleton, vertex births attributed at admission, labor management, and delivery to nurses and physicians. Two classification methods were used to categorize nurse‐ and physician‐level CD rates at three attribution time points, and the reliability of these methods was compared.
Principal Findings
The sample included 12 556 births, 319 nurses, and 126 physicians. Overall, variation in nurse‐level CD rates did not differ significantly across the three attribution time points, and the extent of variation was similar to that observed in physicians. However, agreement between attribution time points varied between 35 percent and 65 percent when classifying individual nurses into the top and bottom deciles. The average reliability of nurse‐level CD rates was 32 percent at admission (IQR 22.0 percent to 38.7 percent), 32.6 percent at labor (IQR 23.1 percent to 40.9 percent), and 29.3 percent (IQR 20.9 percent to 35.8 percent) at delivery. The average reliability of physician‐level CD rates was higher: 54.2 percent (IQR 38.7 percent to 71.4 percent) at admission, 62.5 percent (IQR 49.0 percent to 79.6 percent) at labor management, and 66.1 percent (IQR 53.7 percent to 81.2 percent) at delivery.
Conclusion
Feedback on nurse‐level CD rates as part of routine clinical quality audits can provide insight into nurse performance in the context of other individual‐level and unit‐level information. To reliably distinguish individual nurse performance, larger sample sizes are needed.
Keywords: attribution, cesarean delivery, nursing, performance, physician, quality measurement
1. What is already known on this topic
Individual clinical practices are known to vary widely among physicians but little is known about patterns of variability in patient outcomes among nurses.
Attribution of individual providers to the outcomes of the patients under their care is an increasingly common but understudied aspect of audit and feedback interventions used to improve quality in hospitals throughout the United States.
2. What this study adds
We found substantial variation in cesarean delivery rates for low‐risk patients among nurses that did not differ significantly across the three attribution time points and was similar to that observed in physicians.
We found that physicians were classified more reliably than nurses at all attribution time points; increasing the reliability of the nurse‐level cesarean delivery measure will require larger sample sizes.
Linking births to the responsible nurse and physician at delivery may be a practical attribution method to understand the scope of variation in practice for a given hospital; however, additional methods are needed to fairly classify clinicians into performance categories.
1. INTRODUCTION
Audit and feedback interventions for health care professionals consist of a summary of clinical performance data over a specified time period, shared with the intention of provoking reflection on and adjustment to clinicians’ approach to care. 1 This approach is increasingly commonplace—either on its own or as part of a multifaceted improvement program—as institutions seek to foster improvements in the quality of health care delivery. For clinicians, access to performance data can help shape awareness and beliefs about clinical practice and motivate the modification of group norms—particularly when performance lags behind care standards or quality benchmarks. 2 , 3
Obstetrics is the most commonly utilized inpatient health care service in the United States, 4 and cesarean delivery (CD) is one of the most commonly utilized major surgical procedures. 5 Significant between‐hospital variation in CD rates has been documented. 6 , 7 Audit and feedback of delivery mode rates has become a widely used method of initiating practice changes to improve care by safely reducing CD rates. 8 , 9 , 10 To date, this strategy has been used exclusively with physicians despite the crucial role of nurses in intrapartum labor management, 11 known variation in nurse‐level CD rates, 12 , 13 and the American Congress of Obstetricians and Gynecologists recommendation that hospitals review individual CD rates for nurses. 14 Moreover, nurses themselves consider CD rates to be sensitive to the care they provide 15 and claim partial accountability for delivery outcomes of women under their care. 16 , 17 Therefore, extending audit and feedback to nurses may provide an important new lever to improve the quality of care during childbirth. 18 However, we must first establish the credibility of the data used to evaluate rates of delivery mode for nurses.
Critics of audit and feedback commonly voice concerns about attribution, the method of assigning responsibility for patient outcomes to specific health care providers. They question the ability of an algorithm to fairly and consistently hold the correct clinician responsible for a given patient or episode of care. 19 Moreover, there is increasing recognition of the importance of team‐based care for patient safety, and it can be unclear in the context of the complex care episode how a single provider influences patient outcomes. Nonetheless, individual clinical practices are known to vary widely. How to best attribute individual clinical practices to the care and the outcomes patients receive remains a necessary but understudied aspect of measurement science. Evidence to support one approach over another is limited, 20 and there are calls in the literature for further investigation. 21
For nurses, the issue of attribution is further complicated by a lack of accessible data. Historically, nurse identity has not been available in population and administrative datasets. 22 Nurses lack a National Provider Identifier (NPI) number, and nurse identity is not linked directly to insurance claims. 23 Therefore, in most cases, nurse identity must be manually extracted from the clinical record. Thus, in contrast to physicians, nurses have rarely had their performance audited, and the comparability of attribution methods between physicians and nurses has not been well studied.
Using a large dataset with data abstracted from the clinical record, we investigated three different attribution time points during childbirth for determining nurse‐level CD rates: admission, labor management, and delivery. We determined the aggregate CD rates for nurses by attribution time point and determined differences in individual nurse‐level CD rate classification based on time point. We also determined the reliability of the CD metric at each attribution time point. For comparison, we conducted the same analyses for physicians.
2. DATA AND METHODS
The Foundation for Health Care Quality, Obstetrical Care Outcomes Assessment Program (OB COAP), a clinician‐led, quality improvement collaborative based in Seattle, WA, collected the data. OB COAP uses provider‐specific, chart‐abstracted data from consecutive births at participating sites for quality improvement, and participating hospitals in the Northwest include suburban and rural centers supported by Levels I, II, III, and IV neonatal units. Trained abstractors (obstetric providers, midwives, nurses, and health care data and quality improvement specialists) at each site collect data from maternal and newborn records and enter these into a cloud‐based, standardized data tool. Where possible, they upload subsets of data directly from the electronic medical record. The data elements include maternal sociodemographic characteristics, prepregnancy health, pregnancy complications, labor course, delivery, and postnatal outcomes for mothers and newborns. They also collect practitioner and nurse identity at each phase of care (hospital admission, intrapartum care, and delivery), and the OB COAP research dataset assigns the clinicians a unique research number. These three phases approximate the stages of labor. When two or more nurses provide intrapartum care, data abstractors use their judgement to determine the primary nurse, often the one who spent the greatest amount of time with the patient during active labor (defined as the time from 6 cm cervical dilation to delivery), following the “majority of care” rule.
The dataset obtained from OB COAP contained all nulliparous, term, singleton, vertex (NTSV) births from January 2016 to September 2018 from nine hospitals. We aimed to understand variation in CD rates by nurse; therefore, we excluded births that a nurse was unlikely to influence due to preexisting health conditions, nonlaboring status, or absolute indications for CD. We excluded three groups of births: elective or scheduled prelabor CDs; births in which the maternal outcomes included uterine rupture, eclamptic seizure, or abruption; and CDs that were indicated due to malpresentation, placenta previa, fetal anomalies, suspected macrosomia, cord prolapse, or maternal disease. We excluded cases indicated as “maternal disease” because we were unable to determine whether the CD was medically indicated without reviewing the electronic health record, which was unavailable. The removal of these three groups left the final analytic sample of eligible NTSV births used in the analysis.
Eligible clinicians (nurses and physicians) had at least ten births at all three attribution time points (admission, labor management, and delivery). The clinicians at each time point were drawn from the fixed group of nurses and physicians working at each hospital. From the physician group, we excluded midwives and physicians without CD privileges, since only vaginal births could be attributed to them at the delivery time point.
We calculated CD rates as the number of cesarean deliveries divided by the total number of births, among all eligible births attributed to the clinician at each attribution time point during the study period. Each clinician had three separate CD rates based on three separate sets of at least ten births attributed to them at three attribution time points (admission, labor management, and delivery).
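As an illustration of the rate definition above, the following is a minimal Python sketch, not the SAS code used in the study. The record layout (a `cesarean` flag plus one clinician identifier per time point under the keys `admission`, `labor`, and `delivery`) is a hypothetical assumption; only the ten‐birth eligibility threshold and the rate definition come from the text.

```python
from collections import defaultdict

MIN_BIRTHS = 10  # eligibility threshold stated in the text
TIME_POINTS = ("admission", "labor", "delivery")  # hypothetical field names


def cd_rates(births):
    """Return {clinician: {time_point: (n_births, cd_rate)}}.

    `births` is an iterable of dicts with a boolean 'cesarean' key and one
    clinician ID (or None) per time point. Only clinicians with at least
    MIN_BIRTHS attributed births at all three time points are kept.
    """
    # counts[clinician][time_point] = [n_births, n_cesareans]
    counts = defaultdict(lambda: {tp: [0, 0] for tp in TIME_POINTS})
    for birth in births:
        for tp in TIME_POINTS:
            clinician = birth.get(tp)
            if clinician is None:
                continue
            tally = counts[clinician][tp]
            tally[0] += 1
            tally[1] += int(birth["cesarean"])

    eligible = {}
    for clinician, by_tp in counts.items():
        if all(by_tp[tp][0] >= MIN_BIRTHS for tp in TIME_POINTS):
            eligible[clinician] = {tp: (n, c / n) for tp, (n, c) in by_tp.items()}
    return eligible
```

In this sketch a clinician attributed to ten births at admission but none at labor management or delivery is dropped, mirroring the requirement that each clinician have three separate rates. Nurses and physicians would be processed in separate passes.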
We calculated standard descriptive statistics on the nurse‐level and physician‐level CD rates, including mean, median, interquartile range, and total range at each attribution time point and presented them in box and whisker plots. We also calculated the median number of births per clinician at each time point. Next, we assigned each clinician to performance categories using two classification methods: above or below the Healthy People 2020 target of 23.9 percent (MICH‐7.1 Reduce cesarean births among low‐risk women with no prior cesarean births) and at the top and bottom deciles, a typical method of detecting negative and positive outliers in audit and feedback programs. We then assessed the level of agreement between the three attribution methods based on the number of clinicians classified in the same or different categories in both classification methods. We summarized the sequence of transitions using Sankey diagrams in which the width of each flow is proportional to the number of clinicians or flow quantity. 24
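The two classification methods and the agreement measure described above can be sketched as follows. This is a simplified illustration under stated assumptions: ties at the target are treated as "at or below", ties at the decile boundary are broken by input order, and the function names are hypothetical. Only the 23.9 percent target and the top and bottom decile rule come from the text.

```python
HP2020_TARGET = 0.239  # Healthy People 2020 MICH-7.1 target from the text


def classify_by_target(rate, target=HP2020_TARGET):
    """Label a CD rate relative to the target (tie handling is an assumption)."""
    return "above" if rate > target else "at_or_below"


def classify_by_decile(rates):
    """Label the lowest and highest 10 percent of clinicians as outliers.

    `rates` maps clinician ID to CD rate; expects at least two clinicians.
    """
    ordered = sorted(rates, key=rates.get)
    k = max(1, len(ordered) // 10)
    labels = {c: "middle" for c in ordered}
    for c in ordered[:k]:
        labels[c] = "bottom_decile"
    for c in ordered[-k:]:
        labels[c] = "top_decile"
    return labels


def overall_agreement(labels_a, labels_b):
    """Share of clinicians given the same label under two attribution methods."""
    shared = labels_a.keys() & labels_b.keys()
    return sum(labels_a[c] == labels_b[c] for c in shared) / len(shared)


def outlier_agreement(labels_a, labels_b):
    """Of clinicians flagged as outliers under A, share labeled identically under B."""
    flagged = [c for c, label in labels_a.items() if label != "middle"]
    return sum(labels_a[c] == labels_b.get(c) for c in flagged) / len(flagged)
```

`overall_agreement` corresponds to the agreement reported for the target‐based method, while `outlier_agreement` corresponds to the share of top and bottom decile clinicians who keep their classification when the attribution time point changes.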
We then calculated the reliability of each clinician's CD rate at each attribution time point with a beta binomial model 25 in two steps. First, we estimated a prior beta distribution of CD rates among all nurses using a nonlinear mixed model, which yielded the two distributional parameters α and β. We estimated the between‐clinician variance as:
$$\sigma^{2}_{\text{between}} = \frac{\alpha\beta}{(\alpha+\beta+1)(\alpha+\beta)^{2}}$$
Second, we calculated individual reliability estimates for each CD rate as:
$$\text{reliability} = \frac{\sigma^{2}_{\text{between}}}{\sigma^{2}_{\text{between}} + \dfrac{p(1-p)}{n}}$$
where p is the clinician's CD rate under a given attribution method, and n is the total number of deliveries for that clinician and attribution method. To further stabilize the estimate of p, we used a shrunken empirical Bayes estimate for all rates:
where n is the number of births and n_Cesareans is the number of cesarean births for that clinician and attribution method, and the shrinkage target is the CD rate for the entire dataset at that attribution time point.
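The two‐step reliability calculation can be sketched in Python as below. This is a minimal sketch that assumes the standard beta binomial formulation: the variance of a Beta(α, β) distribution serves as the between‐clinician variance, and the binomial sampling variance p(1 − p)/n serves as the within‐clinician term. The parameter values in the usage example are hypothetical, not estimates from the study, and the empirical Bayes shrinkage of p is not reproduced here.

```python
def between_variance(alpha, beta):
    """Variance of a Beta(alpha, beta) distribution (between-clinician variance)."""
    total = alpha + beta
    return (alpha * beta) / ((total + 1) * total ** 2)


def reliability(p, n, alpha, beta):
    """Signal-to-total variance ratio for one clinician's CD rate.

    p: the clinician's CD rate at a given attribution time point
    n: number of births attributed to that clinician at that time point
    """
    signal = between_variance(alpha, beta)
    noise = p * (1 - p) / n  # binomial sampling variance of the rate
    return signal / (signal + noise)


if __name__ == "__main__":
    # Hypothetical prior with mean 12 / (12 + 38) = 0.24, for illustration only.
    for n in (15, 30, 60, 120):
        print(n, round(reliability(0.24, n, alpha=12, beta=38), 3))
```

Because the noise term shrinks as n grows, reliability rises with birth volume for any fixed prior, which is the pattern the volume categories below are designed to show.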
We then calculated the mean reliability of CD rates for nurses and physicians at discrete birth volume categories (ie, 10‐19, 20‐29, 30‐39, 40‐49, 50‐59, and 60+) for each attribution time point. Using the Spearman‐Brown Prophecy formula, we estimated the mean birth volume that would be needed to achieve average reliability of 0.5, 0.6, and 0.7, given our observed reliability and the mean births per provider in our samples. We performed all data manipulation and analyses using SAS 9.4. The Harvard TH Chan School of Public Health institutional review board approved this study.
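For readers unfamiliar with the Spearman‐Brown Prophecy formula, the projection step can be sketched as follows. This is an illustrative Python sketch rather than the SAS code used in the study; the function names are hypothetical, and the example inputs are placeholders rather than study values.

```python
def spearman_brown_factor(observed, target):
    """Multiple of the current sample size needed to move reliability
    from `observed` to `target` (both strictly between 0 and 1)."""
    return (target * (1 - observed)) / (observed * (1 - target))


def projected_reliability(observed, factor):
    """Reliability expected after multiplying the sample size by `factor`."""
    return (factor * observed) / (1 + (factor - 1) * observed)


def required_volume(observed, target, mean_births):
    """Mean births per clinician needed to reach the target reliability."""
    return spearman_brown_factor(observed, target) * mean_births


if __name__ == "__main__":
    # Placeholder inputs: observed reliability 0.25 at a mean of 20 births.
    for target in (0.5, 0.6, 0.7):
        print(target, round(required_volume(0.25, target, 20), 1))
```

The two functions are inverses of each other: applying `projected_reliability` to the factor returned by `spearman_brown_factor` recovers the target.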
3. RESULTS
The original dataset from OB COAP included 13 299 NTSV births from nine hospitals over 2.75 years. We removed 31 cases of duplicate records, multiparous births, and records missing delivery mode. We further removed 551 cases of nonlaboring CDs; 87 cases of uterine rupture, eclamptic seizure, or abruption; and 74 cases with the excluded indications for CD. Among the remaining 12 556 eligible births, 319 nurses had ten or more births at all three attribution time points, with a median of 21 births at admission (IQR 15‐29), 21 at labor management (IQR 15‐32), and 24 at delivery (IQR 17‐35). In total, these nurses attended 7550 births at admission, 7674 at labor management, and 8655 at delivery. The sample also included 126 eligible physicians with at least 10 births by each attribution method. The median number of births per physician was 69 at admission (IQR 28‐122), 68 at labor management (IQR 26‐120), and 68 at delivery (IQR 31‐126). In total, these physicians attended 11 522 births at admission, 11 494 at labor management, and 11 536 at delivery. Nurses and physicians were attributed to births at one, two, or three time points. On average, nurses were attributed to just one time point in 54 percent of the births attributed to them, while physicians were attributed to just one time point in 34 percent of the births attributed to them. Overall, physicians were more likely than nurses to be attributed to more than one time point per birth (Appendix S1).
The overall CD rate among the 12 556 eligible births was 24.2 percent. In the aggregate, nurse‐level CD rates were similar across the three attribution time points (Figure 1): mean 24.1 percent (IQR 15.6 percent to 30.9 percent) at admission, mean 23.6 percent (IQR 16.2 percent to 30.0 percent) at labor, and mean 24.0 percent (IQR 16.5 percent to 30.0 percent) at delivery. Physician‐level CD rates were also similar by attribution time point: mean 28.0 percent (IQR 21.2 percent to 34.0 percent) at admission, mean 28.2 percent (IQR 20.3 percent to 36.6 percent) at labor, and mean 28.8 percent (IQR 22.2 percent to 36.4 percent) at delivery. The physicians’ mean CD rates were roughly four percentage points higher than the nurses’ rates regardless of attribution method.
FIGURE 1. Variation in nurse and physician level cesarean delivery rates by attribution method
The four Sankey diagrams (Figure 2) show how individual nurses and physicians move from one performance category to another depending on the attribution time point and method of classification. Using the Healthy People 2020 classification method, there was 58 percent agreement in performance categorization for nurses between the admission and labor management attribution time points and 84 percent agreement between the labor management and delivery attribution time points. While aggregate proportions were generally similar, individual assignments varied. For example, 22 percent of all nurses were classified as performing above the target when the birth was attributed to the nurse at admission but below the target when the birth was attributed to the nurse during labor management. Using the top and bottom decile classification method, there was 35 percent agreement in performance categorization for nurses between the admission and labor management attribution time points and 65 percent agreement between the labor management and delivery attribution time points. In other words, of the nurses placed in the highest and lowest categories as determined by the admission attribution time point, only 35 percent would be classified in the same way if determined by the labor management attribution time point (7 percent of nurses out of the 20 percent so classified). In comparison, of the nurses placed in the highest and lowest categories as determined by the labor management attribution time point, 65 percent (13 percent out of 20 percent) would be classified in the same way if determined by the delivery time point.
FIGURE 2. Agreement in classification of nurses and physicians by attribution method
There was greater agreement in performance categorization for physicians than for nurses under both methods of classification. Using the Healthy People 2020 target, there was 84 percent agreement between the admission and labor management attribution time points and 86 percent agreement between the labor management and delivery attribution time points. Using the top and bottom decile classification method, 60 percent of physicians would be classified the same across the admission and labor management attribution time points, and 65 percent of physicians would be classified the same across the labor management and delivery attribution time points.
The results of beta binomial reliability analysis and its application to different groups of clinicians with different birth volumes are presented in detail in Appendix S2 and summarized here. The average reliability of nurse‐level CD rates was 32 percent at admission (IQR 22.0 percent to 38.7 percent), 32.6 percent at labor (IQR 23.1 percent to 40.9 percent), and 29.3 percent (IQR 20.9 percent to 35.8 percent) at delivery. The average reliability of physician‐level CD rates was higher: 54.2 percent (IQR 38.7 percent to 71.4 percent) at admission, 62.5 percent (IQR 49.0 percent to 79.6 percent) at labor management, and 66.1 percent (IQR 53.7 percent to 81.2 percent) at delivery (Figure 3).
FIGURE 3. Reliability of nurse and physician level cesarean delivery rates by attribution method
The mean reliability of nurse‐ and physician‐level CD rates increased as the sample size or birth volume increased at each attribution time point (Table 1). For example, the mean reliability of nurse‐level CD rates at delivery ranged from 21.1 percent at the 10‐19 birth volume category to 55.3 percent at the 60+ birth volume category. In comparison, the mean reliability of the physician‐level CD rate at delivery ranged from 36.1 percent at the 10‐19 birth volume category to 79.9 percent at the 60+ birth volume category. Notably, the number of nurses decreased substantially with each incremental increase in birth volume. For example, at delivery, 116 nurses (36.4 percent) were in the 10‐19 birth volume category, 68 (21.3 percent) in the 30‐39 birth volume category, and 9 (2.8 percent) in the 60+ birth volume category. In comparison, the majority of physicians (53.2 percent) had 60 or more births attributed to them at the delivery time point. This trend was the same for the other two time points as well.
TABLE 1.
Distribution and reliability of the cesarean delivery rates of nurses and physicians by attribution time point and discrete birth volume
| Volume category | Nurses: clinicians (N) | Nurses: clinicians (%) | Nurses: N births | Nurses: CD rate, mean (%) | Nurses: CD rate, min (%) | Nurses: CD rate, max (%) | Nurses: mean reliability of CD rate per clinician (%) | Physicians: clinicians (N) | Physicians: clinicians (%) | Physicians: N births | Physicians: CD rate, mean (%) | Physicians: CD rate, min (%) | Physicians: CD rate, max (%) | Physicians: mean reliability of CD rate per clinician (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| At admission | ||||||||||||||
| 10‐19 | 138 | 43.3 | 1977 | 24.1 | 0.0 | 75.0 | 25.0 | 18 | 14.3 | 265 | 36.6 | 0.0 | 60.0 | 25.6 |
| 20‐29 | 104 | 32.6 | 2498 | 24.9 | 4.8 | 57.1 | 32.3 | 14 | 11.1 | 330 | 30.6 | 9.1 | 45.0 | 31.6 |
| 30‐39 | 40 | 12.5 | 1342 | 24.6 | 5.7 | 36.7 | 38.9 | 12 | 9.5 | 413 | 23.1 | 10.5 | 36.7 | 43.6 |
| 40‐49 | 27 | 8.5 | 1183 | 22.6 | 9.8 | 35.0 | 46.9 | 7 | 5.6 | 312 | 31.6 | 20.8 | 43.2 | 44.9 |
| 50‐59 | 9 | 2.8 | 485 | 16.4 | 3.8 | 21.8 | 58.2 | 7 | 5.6 | 384 | 26.9 | 18.6 | 35.1 | 51.9 |
| 60+ | 1 | 0.3 | 65 | 35.4 | 35.4 | 35.4 | 48.6 | 68 | 54.0 | 8148 | 25.9 | 11.1 | 42.9 | 69.4 |
| During labor | ||||||||||||||
| 10‐19 | 141 | 44.2 | 1998 | 22.4 | 0.0 | 75.0 | 25.6 | 18 | 14.3 | 275 | 33.3 | 0.0 | 62.5 | 33.4 |
| 20‐29 | 87 | 27.3 | 2060 | 24.0 | 0.0 | 59.1 | 33.2 | 15 | 11.9 | 348 | 30.8 | 0.0 | 48.0 | 43.4 |
| 30‐39 | 58 | 18.2 | 1983 | 24.8 | 6.3 | 54.8 | 40.5 | 9 | 7.1 | 311 | 26.5 | 12.9 | 48.4 | 51.9 |
| 40‐49 | 20 | 6.3 | 856 | 23.9 | 11.1 | 40.5 | 45.1 | 10 | 7.9 | 465 | 32.8 | 16.7 | 47.6 | 55.8 |
| 50‐59 | 10 | 3.1 | 544 | 28.7 | 14.8 | 38.5 | 47.9 | 7 | 5.6 | 387 | 28.8 | 21.2 | 37.9 | 60.8 |
| 60+ | 3 | 0.9 | 233 | 24.1 | 13.7 | 32.1 | 60.0 | 67 | 53.2 | 8055 | 25.6 | 10.8 | 42.4 | 77.3 |
| At delivery | ||||||||||||||
| 10‐19 | 116 | 36.4 | 1685 | 22.6 | 0.0 | 75.0 | 21.1 | 15 | 11.9 | 228 | 33.8 | 0.0 | 61.1 | 36.1 |
| 20‐29 | 84 | 26.3 | 1998 | 24.0 | 0.0 | 59.1 | 28.4 | 13 | 10.3 | 302 | 28.0 | 4.8 | 47.6 | 46.9 |
| 30‐39 | 68 | 21.3 | 2309 | 26.5 | 8.1 | 53.3 | 33.4 | 12 | 9.5 | 413 | 29.0 | 13.2 | 40.0 | 53.9 |
| 40‐49 | 27 | 8.5 | 1164 | 22.7 | 10.0 | 38.1 | 41.0 | 7 | 5.6 | 307 | 39.2 | 26.8 | 46.7 | 55.3 |
| 50‐59 | 15 | 4.7 | 817 | 26.8 | 16.1 | 38.5 | 43.3 | 12 | 9.5 | 665 | 27.7 | 18.6 | 38.0 | 65.3 |
| 60+ | 9 | 2.8 | 682 | 21.4 | 11.8 | 33.8 | 55.3 | 67 | 53.2 | 8268 | 27.0 | 0.0 | 42.8 | 79.9 |
Larger sample sizes would be required to achieve 70 percent reliability at each attribution time point given our observed reliability and birth volumes; however, the magnitude of increase is greater for nurses. For example, to achieve a reliability of 70 percent, a mean of 88.5 births per nurse would be needed, which is 5.7 times the observed mean of 15.5 births per nurse. In comparison, to achieve the same reliability among physicians, a mean of 56.5 births per physician would be needed, or 1.2 times the observed mean of 47 births per physician (Appendix S3).
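As a rough consistency check on these multipliers, the Spearman‐Brown relation can be applied to the average reliabilities reported above. Assume, purely for illustration, that the starting values R are the delivery time point reliabilities of 0.293 for nurses and 0.661 for physicians; the inputs actually used are described in Appendix S3. Writing R* = 0.7 for the target reliability, the required multiple k of the observed birth volume is:

$$k = \frac{R^{*}(1-R)}{R(1-R^{*})}, \qquad k_{\text{nurses}} = \frac{0.7 \times 0.707}{0.293 \times 0.3} \approx 5.6, \qquad k_{\text{physicians}} = \frac{0.7 \times 0.339}{0.661 \times 0.3} \approx 1.2$$

Under that assumption, the results are close to the reported multipliers of 5.7 and 1.2, with the small difference for nurses plausibly reflecting rounding or slightly different inputs.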
4. CONCLUSIONS
Our study is the first to use data from a large, multihospital dataset to examine the variability of nurse‐level delivery mode rates, extending the literature on attribution in the obstetrical context. We observed a fourfold variation in CD rates for low‐risk patients among nurses with similar aggregate patterns at each of the three attribution time points (admission, labor management, and delivery). We further observed that the extent of variation in the aggregate nurse data was similar to that in the physician data across attribution time points, although nurse averages were slightly lower because midwives and physicians without CD privileges were excluded from the physician estimates. However, we found that the choice of attribution time point can have a substantial impact on the classification or relative performance ranking to which nurses are assigned. Further, low reliability suggests that much of the variation in nurse CD rates might be due to “noise” in the dataset rather than individual performance effects. Thus, some nurses’ CD rates may fluctuate noticeably over time, no matter which attribution time point we report from.
Physician cesarean rates are customarily attributed to them at delivery for both practical and clinical reasons. Practically, it is easiest to derive physician identity from the electronic health record at the point of delivery. Clinically, the physician is held responsible at delivery because they make the decision most proximal to the event and often perform the procedure. We observed relatively high agreement between time points and moderate reliability at each time point, particularly at delivery. Our findings support the approach, currently used by a plurality of American hospitals, of attributing births to the physician at delivery as a relatively fair method for calculating individual physician cesarean delivery rates.
Yakusheva and her colleagues have pioneered an approach to estimating nurse‐level variability in patient outcomes by applying an individual performance measurement method from the field of education economics, called Value‐Added Modeling (VAM). 26 They have found evidence of significant individual nurse variability in patients’ clinical condition change during hospitalization on medical‐surgical units, 26 catheter‐associated asymptomatic bacteriuria in intensive care units, 27 and discharge preparation on medical‐surgical units. 28 Taken together, this seminal body of work has demonstrated that bedside nurses differ in their performance or “value‐added” as individual care providers to patients and that the differences among nurses have a measurable, significant effect on patient outcomes and cost of care, concluding that the data could be potentially used in performance evaluation and merit‐based payment schemes. 26
While we did not apply VAM methods, our study contributes to this literature by examining the variability of individual nurse‐patient delivery mode outcomes on maternity units and extends the work by exploring the effect that different attribution methods have on the performance ranking to which nurses are classified in the context of an audit and feedback intervention. Prior studies have examined the impact of different dimensions of attribution methods on physician quality 29 and cost profiles, 30 but our study is the first to evaluate the impact of attribution methods on nurse quality profiles. Given our results, while we see great future promise in individual audit and feedback of CD rates for nurses, we advise caution in how such individual‐level nurse profiling is employed until our understanding of the best methods for measuring quality indicators for nurses has improved.
Nurse cesarean rates are not customarily calculated, and while aggregate data suggest nurses have a similar pattern of variation to physicians, individual nurse‐level data suggest a nurse's CD rate varies more widely by attribution time point and is therefore subject to misclassification in performance rankings. This variation may be because admission, labor management, and delivery are associated with specific and distinct clinical nursing care practices that determine the influence of nursing care on outcomes. For example, it is possible that in comparison with the admission and delivery time points, nurses might have significantly more impact on CD rates during labor management when they are continuously assessing maternal‐fetal status, managing labor pain, and providing labor support. In comparison, nurses at admission typically assess labor progression (cervix and membrane status), make an initial assessment of fetal well‐being, review prenatal history, elicit birth plans and preferences, and develop the intrapartum care plan. Nursing practices thought to reduce the likelihood of cesarean delivery include delaying admissions in early labor, making standard assessments of fetal heart rate and labor progression, supporting nonpharmacologic approaches to relieve labor pain, facilitating position changes, encouraging pushing in second‐stage labor, providing continuous labor support, and effectively communicating with birthing persons and the care team.
However, attributing a birth to the nurse on shift during labor is practically difficult due to limitations in the capacity of many hospitals to extract and aggregate nurse identity in the electronic health record. We find that the overall agreement between the labor management and delivery attribution time points is 84 percent when nurses are classified above or below the Healthy People 2020 target and 65 percent when they are classified by the top and bottom deciles. As a result, attributing cesareans to the nurse at delivery may be both a practical and reasonably accurate attribution method to understand the scope of variation in nursing practice for a given hospital. Yet, nurses’ CD rates demonstrate relatively low reliability at all attribution time points, which limits the usefulness of individual nurse rates based on the delivery attribution time point alone. Importantly, this reliability construct was originally developed to support classification of physicians into tiers for public reporting and pay‐for‐performance schemes rather than quality performance measurement, is highly sensitive to sample size, and will vary depending on the population used to calculate the estimate. The observed low reliability may be due to transient shifts in the data (high within‐nurse variation or low across‐nurse variation) rather than true differences in nurses’ clinical performance. This suggests that investing in the capacity to extract data on the nurse present for labor management may be worthwhile and that reviewing these data alongside the admission and delivery time points may be most helpful in auditing individual nurse performance. Those consistently classified among the top and bottom performers across all three time points may be particularly worthy of additional inquiry to understand differences in their practice.
There are no accepted reliability standards for measurements used in quality improvement at this time. A recent report by the National Quality Forum on how to attribute responsibility for patient outcomes to specific health care providers does not offer a single solution but instead outlines key considerations for selecting the right methodology given the circumstances and purpose of the evaluation. 20 However, reliability of 30 percent or below would not be acceptable for use as a performance measure in most settings. Statistically, the low reliability of nurse CD rates can be explained in part by the smaller birth volume of nurses relative to physicians. Nurses have fewer births attributed to them than do physicians because of differences in patient assignments, intensity of care activities, and shorter work shifts. As an illustrative example, the largest share of nurses in our sample cared for 10‐19 eligible births, whereas the majority of physicians cared for 60 or more eligible births. Further, nurses are more likely to work as part of a collaborative team, which may warrant multiple attribution or attributing a birth to all clinicians who meet some minimum care requirement or threshold. Another approach is to report aggregate nurse‐level data by shift (day/night) or unit, which would increase the number of births available to characterize nurse performance, thereby improving reliability. In summary, to increase the reliability of the nurse‐level CD measures, larger sample sizes are needed.
Despite the limitations we have discussed, it is possible that even imperfect data audits may help nurse managers “shrink and shift” the distribution of CD rates across their units over time by inviting data‐driven inquiry into differences in practice that they are often well‐positioned to observe directly. Prior data suggest that “strong” signals (eg, the top and bottom 10 percent of CD rates) likely represent true differences in performance. 25 Nurse managers who operate with a multitude of data points about their nursing staff can use individual performance data to support identification of positive deviants and subsequently explore their care practices with chart review and direct observation. As suggested by the Alliance of Innovation in Maternity Care, to safely reduce primary cesarean rates, learning effective care strategies from top‐performers and disseminating them throughout the birth team can contribute to a maternity unit culture that values and supports vaginal birth. 31 Further, outliers with consistently high CD rates could be identified and observed for causal explanations and coaching opportunities.
Notably, while individual nurse performance is unique and integral to patient care, other clinicians and factors contribute to patients’ birth outcomes, and we must consider contextual information alongside reports of individual nurse data. Nurse‐level profiles, for example, can be presented alongside those of obstetricians and nurse midwives to support existing clinical collaboration, as well as alongside relevant unit‐level data stratified by variables meaningful to nurses.
However, transparency about how nurses are assigned to performance categories is necessary. The potential for unintended consequences from producing individualized nurse rankings and applying them to performance evaluation needs careful scrutiny in order to avoid creating an alienating and demotivating influence. A positive environment and management structure that considers how feedback is provided, nurses’ well‐being, and quality improvement methods is required for nurses to perceive data feedback as a job resource and motivating tool. 32
Ultimately, our study must be interpreted in the context of our study design. Our observational approach relies on retrospective, chart‐abstracted data from nine hospitals in Washington State, limiting generalizability. Future work should compare our findings with datasets from other care settings and populations. Furthermore, the sensitivity of these methods to sample size strongly suggests that rural or smaller birth volume hospitals may lack sufficient sample size to aggregate birth outcomes at the individual provider level at regular increments of time. Moreover, we cannot determine from our analysis what individual nurse characteristics or attributes generated the observed nurse‐level variation. It could be that nurse experience, state‐trait anxiety, 33 or attitudes toward evidence‐based practice account for differences in performance. The variation could also be due to non‐nurse‐related factors, such as frequently working with a particular physician or working on an understaffed unit. Finally, we did not adjust our measure for case mix. Detailed case mix adjustment is not a routine part of hospital quality improvement programs, and so we believe our approach adheres to the current practice environment. Comparison of unadjusted and adjusted measures is a future research direction. Despite these limitations, this paper raises an important set of opportunities to better understand how we might parse and ultimately value the care individual nurses provide.
We need more research into other potential attribution methods and quality measures for labor and delivery nurses. Testing multiple attribution methods or using a majority of care rule should be explored. Patient attestation information may help validate the best approach. Additionally, it may not be appropriate to use the same measure for nurses that we use for physicians. We may need to explore other promising quantitative measures of nurse performance in labor and delivery care such as time spent at the bedside, infusion rates of oxytocin, or patient satisfaction with nursing care. The results of our study are intended to be a starting point for the development of a more robust evidence base that isolates the unique contributions of individual nurses to birth outcomes, a methodologically complex but compelling line of inquiry, given the state of the science on attribution and the crucial role of nurses in ensuring high‐quality health care.
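To illustrate how the alternative attribution rules mentioned above differ, the following is a minimal sketch, not the method used in this study. It assumes a hypothetical record layout of (birth, nurse, hours of care), defines a majority of care rule as a strict share above 50 percent, and uses an arbitrary 30 percent minimum share for multiple attribution; all names, thresholds, and data are illustrative assumptions.

```python
from collections import defaultdict


def care_shares(records):
    """Convert (birth_id, nurse_id, hours) records into each nurse's share of care per birth."""
    hours = defaultdict(lambda: defaultdict(float))
    for birth_id, nurse_id, h in records:
        hours[birth_id][nurse_id] += h
    shares = {}
    for birth_id, by_nurse in hours.items():
        total = sum(by_nurse.values())
        if total > 0:
            shares[birth_id] = {n: h / total for n, h in by_nurse.items()}
    return shares


def majority_of_care(shares):
    """Attribute each birth to the nurse with more than half of the care hours, else None."""
    result = {}
    for birth_id, by_nurse in shares.items():
        nurse, share = max(by_nurse.items(), key=lambda kv: kv[1])
        result[birth_id] = nurse if share > 0.5 else None
    return result


def multiple_attribution(shares, min_share=0.30):
    """Attribute each birth to every nurse whose share meets the (hypothetical) minimum."""
    return {
        birth_id: sorted(n for n, s in by_nurse.items() if s >= min_share)
        for birth_id, by_nurse in shares.items()
    }


# Hypothetical care records: (birth, nurse, hours of care)
records = [
    ("b1", "A", 6.0), ("b1", "B", 2.0),
    ("b2", "A", 3.0), ("b2", "B", 3.0), ("b2", "C", 2.0),
]
shares = care_shares(records)
print(majority_of_care(shares))
print(multiple_attribution(shares))
```

In this toy example the second birth has no majority nurse and would go unattributed under the majority rule, whereas multiple attribution credits it to two nurses. The trade-off is that multiple attribution increases the number of births per nurse at the cost of a less exclusive link between a nurse and an outcome.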
Supporting information
Author matrix
Appendix S1
Appendix S2
Appendix S3
Appendix S4
ACKNOWLEDGMENTS
Joint Acknowledgment/Disclosure Statement: This study was funded by the Rx Foundation. The authors would like to thank Ian Painter at OBCOAP and Ami Karlage at Ariadne Labs for their assistance with manuscript preparation.
Edmonds JK, Weiseth A, Neal BJ, et al. Variability in cesarean delivery rates among individual labor and delivery nurses compared to physicians at three attribution time points. Health Serv Res. 2021;56:204–213. 10.1111/1475-6773.13546
REFERENCES
- 1. Ivers N, Jamtvedt G, Flottorp S, et al. Audit and feedback: effects on professional practice and healthcare outcomes. Cochrane Database Syst Rev. 2012;6:CD000259.
- 2. Gardner B, Whittington C, McAteer J, Eccles MP, Michie S. Using theory to synthesise evidence from behaviour change interventions: the example of audit and feedback. Soc Sci Med. 2010;70(10):1618‐1625.
- 3. Colquhoun HL, Brehaut JC, Sales A, et al. A systematic review of the use of theory in randomized controlled trials of audit and feedback. Implement Sci. 2013;8:66.
- 4. McDermott KW, Elixhauser A, Sun R. Trends in hospital inpatient stays in the United States, 2005–2014. HCUP statistical brief. 2017;225.
- 5. Podulka J, Stranges E, Steiner C. Hospitalizations related to childbirth, 2008: statistical brief #110. In: Healthcare Cost and Utilization Project (HCUP) Statistical Briefs. Rockville, MD: Agency for Healthcare Research and Quality (US); 2006.
- 6. Sebastião YV, Womack L, Vamos CA, et al. Hospital variation in cesarean delivery rates: contribution of individual and hospital factors in Florida. Am J Obstet Gynecol. 2016;214(1):123.e1‐123.e18.
- 7. Kozhimannil KB, Law MR, Virnig BA. Cesarean delivery rates vary tenfold among US hospitals; reducing variation may address quality and cost issues. Health Aff. 2013;32(3):527‐535.
- 8. Chaillet N, Dumont A, Abrahamowicz M, et al. A cluster‐randomized trial to reduce cesarean delivery rates in Quebec. N Engl J Med. 2015;372(18):1710‐1721.
- 9. Peng F‐S, Lin H‐M, Lin H‐H, Tu F‐C, Hsiao C‐F, Hsiao S‐M. Impact of clinical audits on cesarean section rate. Taiwan J Obstet Gynecol. 2016;55(4):530‐533.
- 10. Montoya‐Williams D, Lemas DJ, Spiryda L, Patel K, Neu J, Carson TL. What are optimal cesarean section rates in the U.S. and how do we get there? A review of evidence‐based recommendations and interventions. J Womens Health. 2017;26(12):1285‐1291.
- 11. Gennaro S, Mayberry LJ, Kafulafula U. The evidence supporting nursing management of labor. J Obstet Gynecol Neonatal Nurs. 2007;36(6):598‐604.
- 12. Edmonds JK, O'Hara M, Clarke SP, Shah NT. Variation in cesarean birth rates by labor and delivery nurses. J Obstet Gynecol Neonatal Nurs. 2017;46(4):486‐493.
- 13. Radin TG, Harmon JS, Hanson DA. Nurses' care during labor: its effect on the cesarean birth rate of healthy, nulliparous women. Birth. 1993;20(1):14‐21.
- 14. American College of Obstetricians and Gynecologists Task Force on Cesarean Delivery Rates. Evaluation of Cesarean Delivery. Washington, DC: ACOG; 2000.
- 15. Lyndon A, Simpson KR, Spetz J. Thematic analysis of US stakeholder views on the influence of labour nurses' care on birth outcomes. BMJ Qual Saf. 2017;26(10):824‐831.
- 16. Simpson KR, Lyndon A. Labor nurses' views of their influence on cesarean birth. MCN Am J Matern Child Nurs. 2017;42(2):81‐87.
- 17. Edmonds JK, Jones EJ. Intrapartum nurses' perceived influence on delivery mode decisions and outcomes. J Obstet Gynecol Neonatal Nurs. 2013;42(1):3‐11.
- 18. Edmonds JK, Hacker MR, Golen TH, Shah NT. Nurses count: tracking performance to improve cesarean delivery rates. Birth. 2016;43(1):3‐5.
- 19. Mehrotra A, Burstin H, Raphael C. Raising the bar in attribution. Ann Intern Med. 2017;167(6):434‐435.
- 20. National Quality Forum. Improving Attribution Models Final Report. Washington, DC: NQF; 2019. https://www.qualityforum.org/Publications/2018/08/Improving_Attribution_Models_Final_Report.aspx
- 21. Scholle SH, Roski J, Adams JL, et al. Benchmarking physician performance: reliability of individual and composite measures. Am J Manag Care. 2008;14(12):833‐838.
- 22. Welton JM, Harper EM. Nursing care value‐based financial models. Nurs Econ. 2015;33(1):25.
- 23. Welton JM, Harper EM. Measuring nursing care value. Nurs Econ. 2016;34(1):7‐14; quiz 15.
- 24. Huang C‐W, Lu R, Iqbal U, et al. A richly interactive exploratory data analysis and visualization tool using electronic medical records. BMC Med Inform Decis Mak. 2015;15:92.
- 25. Adams JL, Mehrotra A, Thomas JW, et al. Physician cost profiling‐reliability and risk of misclassification: detailed methodology and sensitivity analyses. Rand Health Q. 2012;2(1):3.
- 26. Yakusheva O, Lindrooth R, Weiss M. Nurse value‐added and patient outcomes in acute care. Health Serv Res. 2014;49(6):1767‐1786.
- 27. Yakusheva O, Costa DK, Bobay KL, Parada JP, Weiss ME. Variability in catheter‐associated asymptomatic bacteriuria rates among individual nurses in intensive care units: an observational cross‐sectional study. PLoS One. 2019;14(7):e0218755.
- 28. Yakusheva O, Weiss ME, Bobay KL, et al. Individual nurse productivity in preparing patients for discharge is associated with patient likelihood of 30‐day return to hospital. Med Care. 2019;57(9):688‐694.
- 29. Scholle SH, Roski J, Dunn DL, et al. Availability of data for measuring physician quality performance. Am J Manag Care. 2009;15(1):67‐72.
- 30. Mehrotra A, Adams JL, Thomas JW, McGlynn EA. The effect of different attribution rules on individual physician cost profiles. Ann Intern Med. 2010;152(10):649‐654.
- 31. Lagrew DC, Low LK, Brennan R, et al. National partnership for maternal safety: Consensus bundle on safe reduction of primary cesarean births‐supporting intended vaginal births. Obstet Gynecol. 2018;131(3):503‐513.
- 32. Christina V, Baldwin K, Biron A, Emed J, Lepage K. Factors influencing the effectiveness of audit and feedback: nurses’ perceptions. J Nurs Manag. 2016;24(8):1080‐1087.
- 33. Yee LM, Liu LY, Grobman WA. The relationship between obstetricians’ cognitive and affective traits and their patients' delivery outcomes. Am J Obstet Gynecol. 2014;211(6):692.e1‐692.e6.
