Author manuscript; available in PMC: 2020 Sep 1.
Published in final edited form as: J Midwifery Womens Health. 2019 Jul 2;64(5):657–663. doi: 10.1111/jmwh.12996

The ARRIVE Trial – Interpretation from an Epidemiologic Perspective

Suzan L CARMICHAEL 1, Jonathan M SNOWDEN 2
PMCID: PMC6821557  NIHMSID: NIHMS1027733  PMID: 31264773

Abstract

The findings of the ARRIVE trial (A Randomized Trial of Induction Versus Expectant Management) were recently published. This multi-site randomized trial was designed to provide evidence regarding whether labor induction or expectant management is associated with increased adverse perinatal outcomes and risk of cesarean birth among healthy nulliparous women at term. The trial reported that the primary outcome, a composite of adverse neonatal outcomes, was not significantly different between the two groups; the principal secondary outcome, cesarean birth, was significantly more common among women whose pregnancy was expectantly managed than among women whose labor was induced at 39 weeks. These results have the potential to change existing practice. Several aspects of the study design may influence its potential internal and external validity and should be considered in order to make sound causal inferences from this trial, which will in turn affect how its findings are translated to practice. While chance and confounding are of minimal concern, given the sample size and randomization used in the study, selection bias may be a concern. Studies are vulnerable to selection bias when the sample population differs from eligible non-participants, including in randomized controlled trials. External validity is defined as the extent to which the study population and setting are representative of the larger source population the study intends to represent. External validity may be limited given the characteristics of the women enrolled in the ARRIVE trial and the practice settings where the study was conducted. This brief report provides concrete suggestions for further analyses that could help solidify conclusions from the trial, and for further research questions that will continue advancement toward answering this complex question of how best to manage labor and delivery decisions at full term among low-risk women.

Keywords: Causality, Cesarean Section, Induction of labor, Labor onset, Pragmatic randomized controlled trials, Randomized controlled trials

Précis:

This brief report offers an epidemiologic perspective on the ARRIVE trial, highlighting internal validity, external validity, time at risk, and causal mechanisms of labor induction effects.

INTRODUCTION

With the emergence of evidence that neonatal risk is heterogeneous across the term gestational age range,1–3 there has been increased focus on defining the timing of birth that best minimizes risks of adverse perinatal outcomes at term. To date, evidence suggests that birth is optimal for the health of the newborn and the woman, barring any medical indications for giving birth earlier, during a relatively narrow time period – from 39 weeks 0 days to 40 weeks 6 days gestation.3–5 Approximately 58% of births occur within this 2-week timeframe.6

The optimal timing of birth and best clinical management within the confines of this timeframe are open questions. Uncertainties exist regarding how to optimally balance the potential risks and benefits of expectantly managing pregnancy and waiting for spontaneous onset of labor, which may entail the emergence of a medical indication for birth, versus electively inducing labor. Elective induction of labor has conveniences such as planned scheduling, and the earlier it occurs, the less at-risk time there is for adverse events, such as preeclampsia or stillbirth, to occur. Conversely, elective induction is costly with respect to staffing levels and patient time, and it introduces medical intervention into labor and birth, which does not align with some women’s preferences.7,8

Prior evidence evaluating the benefits and harms of expectant management versus elective induction of labor is mixed and primarily observational.9–12 Observational studies that compare the outcomes of women and newborns when labor is electively induced with those preceded by spontaneous onset of labor at the same gestational age have generally found that induction is associated with an increased incidence of cesarean birth.9,10 In contrast, studies that compare elective induction of labor to expectant management, which includes all later births, whether they occur spontaneously or not, have found that induction is associated with lower risks of cesarean and adverse perinatal outcomes.11,12

Given the conflicting findings of observational research, the recent ARRIVE trial (A Randomized Trial of Induction Versus Expectant Management) was designed to provide evidence regarding whether induction or expectant management is associated with more adverse perinatal outcomes, among healthy nulliparous women at full term.13 In order to appropriately interpret the trial’s findings from a causal inference perspective, it is important to consider epidemiologic concepts related to study design and how the design may influence a study’s internal and external validity, which should in turn influence how the findings are translated to practice. This brief report provides concrete suggestions for further analyses that could help solidify conclusions from the trial, and for further research questions that will continue advancement toward answering this complex question of how best to manage labor and delivery decisions at full term among low-risk women.

SUMMARY OF THE DESIGN AND RESULTS OF ARRIVE

In brief, the ARRIVE trial compared perinatal outcomes among 3,062 women who were assigned to undergo elective induction of labor at 39 weeks and 0 to 4 days gestation (referred to hereafter as the ‘IOL’ group) versus 3,044 women assigned to expectant management. In this study, expectant management was defined as no elective induction before 40 weeks 5 days, and birth no later than 42 weeks 2 days. Randomization occurred at 38 weeks 0 days to 38 weeks 6 days of gestation. The trial was restricted to low-risk nulliparous women whose singleton fetus was in a vertex position. The low-risk status was defined at the time of randomization. Women had to have a relatively certain date of last menstrual period, with an estimated gestational age that matched the gestational age determined by ultrasound; women who were uncertain of the date had to have had a first trimester ultrasound that established gestational age. Women were ineligible if they had any high-risk conditions, such as oligohydramnios, fetal growth restriction, hypertensive disorders, or diabetes. Participating hospitals were affiliated with the National Institutes of Health, National Institute of Child Health and Human Development (NICHD) Maternal-Fetal Medicine Units Network (MFMU), a network of academic health centers.13

The primary newborn outcome was a composite of perinatal death or severe neonatal complications, which was 20% lower in the induced group; specifically, 4.3% of infants in the induced group and 5.4% of infants in the expectant management group were affected, which resulted in a relative risk of 0.80 (95% confidence interval [CI] 0.64 – 1.00). This result was largely driven by lower frequency of respiratory support in the newborns of women in the induced group. The difference was not considered statistically significant at P=0.049 since significance was set at P<0.046, conservatively adjusted from 0.05 to reflect the one interim analysis that was performed. The principal secondary outcome was cesarean birth, which was 16% lower in the induction group, ie, 18.6% versus 22.2% respectively (P<0.01), which resulted in a relative risk of 0.84 (95% CI 0.76 – 0.93). Secondary newborn outcomes such as birth weight were not significantly different between the two study arms. Several secondary maternal outcomes were significantly different (P≤0.01) between the two groups; for example, hypertensive disorders were 36% lower among the induction group (9.1% versus 14.1%), and duration of stay in the labor and delivery unit was slightly longer, and postpartum hospital stay slightly shorter, among women in the induction group.
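The point estimates quoted above follow directly from the reported percentages. The following is a minimal sketch that reproduces only those point estimates and the comparison against the adjusted significance threshold; the confidence intervals would require arm-level counts, which are not restated here. The names `relative_risk`, `reported`, and `results` are illustrative and do not come from the trial report.

```python
def relative_risk(risk_iol: float, risk_em: float) -> float:
    """Risk in the induction arm divided by risk in the expectant management arm."""
    return risk_iol / risk_em

# Percentages as quoted in the paragraph above: (induction, expectant management).
reported = {
    "composite neonatal outcome": (4.3, 5.4),
    "cesarean birth": (18.6, 22.2),
}

results = {}
for outcome, (iol, em) in reported.items():
    rr = relative_risk(iol, em)
    results[outcome] = rr
    print(f"{outcome}: RR = {rr:.2f}, relative reduction = {(1 - rr) * 100:.0f}%")

# Significance threshold after adjustment for the single interim analysis.
ALPHA_ADJUSTED = 0.046
P_PRIMARY = 0.049
primary_significant = P_PRIMARY < ALPHA_ADJUSTED
print("primary outcome significant:", primary_significant)
```

Because the inputs are rounded percentages, the ratios match the published relative risks only to about two decimal places.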

An editorial accompanying the article ended by stating: “These results…should reassure women that elective induction of labor at 39 weeks is a reasonable choice that is very unlikely to result in poorer obstetrical outcomes.”14 A statement published by the Society for Maternal-Fetal Medicine (SMFM) recommended that “It is reasonable to offer elective induction of labor to low-risk, nulliparous women at or beyond 39 weeks and 0 days of gestation” and that “women can be reassured that both elective IOL and expectant management are reasonable options at 39 weeks of gestation”.15 The American College of Nurse-Midwives (ACNM) stated that “Implementation of practice changes to offer 39 week induction of labor should proceed cautiously.”16 An epidemiologic perspective on interpretation of the trial’s findings and their applicability beyond the study’s setting may help illuminate the clinical value of these findings for different populations.

INTERNAL VALIDITY

When evaluating a study, a usual sequence is to consider its potential internal validity, then its external validity, and then what inferences can be reasonably substantiated by the findings. Table 1 presents definitions of epidemiologic terms related to this process. Causal inference refers to the process of inferring that a cause led to an effect and if so, how. A prerequisite for causal inference is confidence that a study’s findings are internally valid (Figure 1). To assess a study’s internal validity, the findings are first evaluated to determine if they are affected by or the result of chance, confounding, other types of bias, or a combination of factors.17 The large sample size of the ARRIVE trial helps to minimize chance as an alternative explanation, at least for the primary outcome and the more common secondary outcomes.

Table 1. Definition of terms

Term | Definition
Causal inference | The process of inferring that a cause led to an effect and if so, how. For the ARRIVE study, defined as ensuring that observed effects are attributable to elective labor induction rather than differences between the sample and the eligible population or between study arms, and characterizing the mechanisms underlying the causal association.
Bias | Systematic distortion of the relationship between a treatment and the outcome. Bias can be introduced at any step in a research study and there are multiple forms of bias such as selection bias and confounding.
Selection bias | Bias due to selecting or retaining a sample in which exposure effects do not reflect exposure effects in the overall eligible population. (Thus, it could be a threat to internal as well as external validity.)
Confounding | Distortion of the association between an exposure and an outcome due to the influence of another variable that is associated with both (Snowden et al., 2018).40
Internal validity | The degree to which a study’s finding is accurate and correctly reflects the true association in the entire eligible population.
External validity (also known as generalizability) | The degree to which a study’s finding generalizes to the overall population to which researchers wish to make inference.
Pragmatic trial | A randomized trial designed to inform a policy or clinical decision by characterizing effects of an exposure in real-world settings. This contrasts with an explanatory trial, which aims to confirm a more narrow biological or clinical hypothesis (Ford and Norrie, 2016).33
Incidence proportion (also known as cumulative incidence) | The number of new cases (ie, incident outcomes) among the study population initially at risk, within a specified period of time. At-risk people are the denominator (Rothman et al., 2008; Szklo and Nieto, 2012).17,41
Incidence density (also known as incidence rate) | The number of new cases that emerge during the study population’s time at risk. At-risk person-time is the denominator rather than people, accounting for differences in time at risk between exposure groups.
Mediation analysis | Analysis focusing on what specific mechanisms (ie, causal mediators) explain the association between exposure and outcome. Mediators are temporally ‘on the causal pathway’ between an exposure and an outcome.

Figure 1. Illustration of internal and external validity as applied to the ARRIVE trial

A factor or variable that independently affects both the intervention and the outcomes is referred to as a confounder (Figure 2A). In the ARRIVE trial, the treatment was assigned randomly and adherence was high at 94% in the IOL arm and 95% in the expectant management arm. These features help to ensure that the comparison groups were similar to each other and results would not be attributable to a third unmeasured confounder. However, bias due to unmeasured confounders introduced during implementation of the study, and due to the unblinded nature of the trial, remains a possibility.18–20

Figure 2. Potential violations of internal validity: confounding bias (A), and selection bias (B).

Dots represent a given characteristic that differs between individuals. Different color dots represent individuals with different values of that characteristic. Confounding bias is characterized by an imbalance in participant characteristics between treatment groups. Here, blue dots are more frequent in the treatment arm and red dots are more frequent in the control arm (2A). Selection bias is characterized by the study participants being different from eligible non-participants (see Figure 1). In this example, the study sample contains only blue and red dots, while eligible non-participants include yellow dots in addition to blue and red dots (2B).

Selection bias is related to how individuals are selected into a study (Figure 2B). Selection bias refers to the extent to which results in a study sample differ from results in the study population (ie, all eligible individuals). Unfortunately, this generally cannot be assessed directly, but when reviewing a study, high participation rates are reassuring.21 Any study wherein the sample population differs from eligible non-participants is vulnerable, including randomized controlled trials.22 Specifically, selection bias can be a problem if factors affecting participation in the study also affect the effectiveness of the treatment under study, thereby affecting the study outcomes.22 Selection bias is a concern with regard to the ARRIVE trial. It is not clear whether the sample of women who participated is representative of the overall study population, or whether associations between the treatment and outcomes are likely to be similar for those who are eligible but not in the study.

First, women had to have a relatively certain date of their last menstrual period to be eligible; 13% (n = 6606) of the screened low-risk women were ineligible based on this criterion. Second, only 27% of eligible women (n=6106) agreed to participate. Third, the cesarean birth rate in both groups within the study was lower than in many US settings which suggests there might be something about this cohort that is different than the general population.23 For example, approximately 20% of participants in this study had cesarean births, which is lower than the 2017 US rate of 26% among women with nulliparous, term, singleton, vertex presentations, a group with an admittedly higher risk profile than ARRIVE participants.6 Fourth, the participants were considerably more likely to be African-American than the general US population of women who give birth, at 23% in the trial vs. 15% in the United States, and they were also younger, with 4% 35 years of age or older in ARRIVE, compared to the United States nationally in which 18% of childbearing women are 35 years or older.24 Fifth, the incidence of hypertension may be higher than expected, for low-risk women during a brief at-risk duration. Specifically, the median gestational age at randomization was 38 weeks 3 days (the range was 38 weeks, 0–6 days). At this point, women did not have hypertensive disorders, yet 8% of the IOL group and 14% of the expectant management group had a hypertensive disorder by the time they gave birth. During the entire pregnancy, only about 5% to 6% of all pregnant women – high- and low-risk women combined – would be expected to develop a hypertensive disorder (approximately 3% for preeclampsia and 2–3% for gestational hypertension).25 It is difficult to assess the precise risk of hypertensive disorders that would be expected in a low-risk population beyond 38 weeks’ gestation in the absence of detailed week-by-week incidence data. 
However, considering the overall incidence of hypertensive disorders during pregnancy and the US distribution of gestational length,24 one could expect the cumulative incidence of hypertension at term to be less than 8% during one week (IOL group) or 14% during two weeks of expectant management, especially among low-risk women. This example assumes the approximate median time between randomization and delivery to have been one to two weeks for the two respective arms of the trial.
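The mechanism behind the selection bias concern described above can be sketched numerically: if a characteristic that influences participation also modifies the treatment effect, the effect among participants differs from the effect among all eligible women. Every number below is hypothetical and chosen only to illustrate the arithmetic; none comes from ARRIVE, and the stratum labels and the helper `pooled_relative_risk` are assumptions made for this example.

```python
# Hypothetical eligible population split into two strata (illustration only).
# share: fraction of eligible women; participation: probability of enrolling;
# risk_em / risk_iol: outcome risk under expectant management / induction.
strata = {
    "A": {"share": 0.5, "participation": 0.40, "risk_em": 0.20, "risk_iol": 0.15},
    "B": {"share": 0.5, "participation": 0.10, "risk_em": 0.30, "risk_iol": 0.30},
}

def pooled_relative_risk(weights: dict) -> float:
    """Relative risk (induction vs expectant management) after weighting the strata."""
    total = sum(weights.values())
    risk_iol = sum(w * strata[s]["risk_iol"] for s, w in weights.items()) / total
    risk_em = sum(w * strata[s]["risk_em"] for s, w in weights.items()) / total
    return risk_iol / risk_em

# Weights for all eligible women versus weights for those who actually enroll.
eligible_weights = {s: v["share"] for s, v in strata.items()}
participant_weights = {s: v["share"] * v["participation"] for s, v in strata.items()}

rr_eligible = pooled_relative_risk(eligible_weights)
rr_participants = pooled_relative_risk(participant_weights)

print(f"RR among all eligible women: {rr_eligible:.2f}")
print(f"RR among participants:       {rr_participants:.2f}")
```

In this constructed case, the stratum that benefits from induction enrolls more readily, so the participant-only relative risk overstates the benefit relative to the eligible population. Randomization within the sample does not remove this gap.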

A thorough comparison of the participants and non-participants is thus critical to addressing the question of internal validity. For example, it would be very helpful to know whether sociodemographic characteristics or the prevalence of any of the ARRIVE outcomes were similar among all low-risk women who gave birth at the study hospitals or any MFMU-affiliated hospitals, in comparison to the ARRIVE participants. In addition, given the potentially high incidence of hypertension among study participants, it would be useful to determine whether the higher risk of cesarean birth and neonatal complications with expectant management is also observed among women who did not develop hypertension, ie, women who were definitely low risk. This could be assessed via stratum-specific analysis.

EXTERNAL VALIDITY

Once the internal validity of a study is deemed to be strong, one may consider a study’s external validity, which is pertinent to the generalizability of the findings (Figure 1). External validity addresses whether the intended study population and study findings are applicable to the broader population they are intended to represent. In considering the applicability of the ARRIVE study sample to the general population, similar concerns arise as described above regarding internal validity; that is, it is uncertain to what extent the study population and setting are representative of the larger source population the study intends to represent.

Another concern about generalizability relates to practice setting. To assess whether associations between labor induction and outcomes such as cesarean birth will generalize to other settings, one must consider the degree to which MFMU-affiliated hospitals are representative of most clinical settings where women give birth. Intrapartum management varies considerably across hospitals. This is illustrated, for example, by the wide range of cesarean birth rates for low-risk women, which vary 10-fold across U.S. hospitals, from approximately 7% to 70%.26 In California, the median hospital cesarean rate after labor induction in low-risk nulliparous women is 32%, and it is as high as 50% to 60% in different hospitals across the state.27 This is in contrast to the cesarean rate of 18.6% among the ARRIVE cohort assigned to induction of labor. One must consider the degree to which ARRIVE study results are applicable to hospitals that have twice this rate of cesarean birth. Evidence also suggests that childbirth care practices differ between academic health centers and their affiliates, which include the ARRIVE study sites, and hospitals that are not academically affiliated.26,28 In some instances, teaching hospitals have been demonstrated to have higher levels of evidence-based practice and lower levels of unwarranted practice variation.26,28

For example, all hospitals in the study adhered to a common definition of unsuccessful labor induction in the latent phase of labor,27,29 and once in the active phase followed American College of Obstetricians and Gynecologists (ACOG) / SMFM guidelines for diagnosis of labor arrest and descent disorders.27,30 It is possible that in the absence of such practice guidelines, an increased frequency of labor induction, with concomitant longer mean labor duration, would result in a higher cesarean birth rate due to increased diagnoses of labor dystocia and labor arrest.27,31,32 More information about how these labor management considerations compare to most US hospitals, and how the differences could translate to variability in perinatal outcomes, will help in assessing the generalizability of the ARRIVE results. It could be that the ARRIVE participants are representative of the study population from which they were drawn and their management resulted in particularly low rates of cesarean birth, but the question still remains regarding whether the implementation of IOL would yield similar outcomes in other settings.

FURTHER QUESTIONS THAT ARRIVE CAN ANSWER

The ARRIVE trial focused on a dichotomized question, which was a comparison of perinatal outcomes if labor is induced within a 4-day time window early during the 39th week of gestation, to outcomes of women who do not undergo elective induction of labor at 39 weeks gestation. The rich ARRIVE data could be used to determine the source of the improvements in outcomes in the IOL group.

First, the occurrence of the primary outcomes of the women in the IOL group could be compared to sub-sets of the expectant management group based on timeframe of birth (eg, before 39 weeks 5 days, 39 weeks 5 days to 40 weeks 4 days, 40 weeks 5 days to 41 weeks 4 days, and 41 weeks 5 days to 42 weeks 2 days of gestation, as sample size allows) and labor onset (ie, spontaneous onset of labor, elective IOL, or IOL or pre-labor cesarean birth owing to onset of complications). Although these comparison groups cannot be randomly assigned, understanding associations with outcomes in these more specific sub-groups could be informative to decision-making and understanding risks and benefits. For example, this type of analysis would address the risks and benefits associated with expectant management for a relatively shorter amount of time, rather than waiting until 42 weeks 2 days. This is particularly relevant in the context of the design of the ARRIVE study. Elective IOL was not an option for the women in the expectant management group until 40 weeks 5 days. Thus, spontaneous delivery, barring medical indications for delivery, was implicitly assigned, from the time of randomization through 40 weeks 4 days, for these women, and useful information could be gained from stratifying analyses by gestational age.

Second, it would be helpful to know whether the increased occurrence of cesarean birth among women in the expectant management group was attributable to their increased incidence of hypertensive disorders. Similarly, it would be informative to determine whether the higher risk of neonatal complications in that group, especially respiratory support, was attributable to the higher occurrence of cesarean birth. Mediation analyses could be conducted to answer these questions. Mediation analysis assesses the portion of an effect that is attributable to a mediator, ie, a factor that is on the causal path between an exposure and an outcome. In these two examples, hypertensive disorders could be a mediator between treatment, which in this case is induction or expectant management, and cesarean birth, and cesarean birth could be a mediator between treatment and neonatal complications.
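The logic of such a mediation question can be sketched with a crude stratification-based decomposition on the risk-difference scale. This is only an illustration and not a full counterfactual mediation analysis: it compares the total effect of treatment on cesarean birth with the effect among women without the mediator (a controlled direct effect), and the share of the total effect that disappears is sometimes called the proportion eliminated. It assumes no confounding of the mediator-outcome relationship, and conditioning on a post-randomization variable can itself introduce bias. All counts are hypothetical and are not ARRIVE data.

```python
# Hypothetical counts (illustration only): (cesarean births, women)
# by trial arm and by whether a hypertensive disorder developed.
counts = {
    "expectant": {"hypertension": (60, 200), "no_hypertension": (160, 800)},
    "induction": {"hypertension": (30, 100), "no_hypertension": (162, 900)},
}

def risk(events: int, n: int) -> float:
    """Incidence proportion: events divided by women at risk."""
    return events / n

def arm_risk(arm: str) -> float:
    """Overall cesarean risk in one arm, pooling both mediator strata."""
    events = sum(e for e, _ in counts[arm].values())
    n = sum(n for _, n in counts[arm].values())
    return risk(events, n)

# Total effect: risk difference, induction minus expectant management.
total_effect = arm_risk("induction") - arm_risk("expectant")

# Controlled direct effect with the mediator fixed at "no hypertensive disorder".
direct_effect = (risk(*counts["induction"]["no_hypertension"])
                 - risk(*counts["expectant"]["no_hypertension"]))

# Share of the total effect that disappears when the mediator is held absent.
proportion_eliminated = (total_effect - direct_effect) / total_effect

print(f"total effect (risk difference):  {total_effect:.3f}")
print(f"direct effect (no hypertension): {direct_effect:.3f}")
print(f"proportion eliminated:           {proportion_eliminated:.2f}")
```

With these made-up counts, roughly 29% of the difference in cesarean risk would be tied to the difference in hypertensive disorders. A real analysis would use the trial's individual-level data and methods that account for mediator-outcome confounding.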

Finally, the incidence rate (ie, incidence density) of some maternal outcomes could be calculated in addition to the incidence proportion (ie, cumulative incidence), which was presented for the ARRIVE trial (see Table 1 for definitions). The two treatment arms had different amounts of time at risk based on the study design, and the risk of some outcomes, such as hypertension, increases with longer duration of pregnancy. Therefore, explicitly including person-time in the denominator of incidence calculations for relevant outcomes would analytically address this difference between study arms, providing information about the mechanism behind observed differences in risk.
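The distinction between the two measures can be illustrated with a short sketch. The counts and mean follow-up times below are hypothetical, loosely echoing the one-week versus two-week contrast discussed earlier rather than taken from the trial, and the calculation simplifies by assigning every woman her arm's mean time at risk (in practice, person-time stops accruing when the outcome occurs or the woman gives birth).

```python
def incidence_proportion(cases: int, people_at_risk: int) -> float:
    """Cumulative incidence: new cases divided by people initially at risk."""
    return cases / people_at_risk

def incidence_rate(cases: int, person_weeks: float) -> float:
    """Incidence density: new cases divided by person-time at risk."""
    return cases / person_weeks

# Hypothetical arms of 1,000 women each (illustration only, not ARRIVE data).
arms = {
    "induction": {"cases": 90, "people": 1000, "mean_weeks_at_risk": 1.0},
    "expectant": {"cases": 140, "people": 1000, "mean_weeks_at_risk": 2.0},
}

summary = {}
for name, arm in arms.items():
    person_weeks = arm["people"] * arm["mean_weeks_at_risk"]
    summary[name] = {
        "proportion": incidence_proportion(arm["cases"], arm["people"]),
        "rate_per_person_week": incidence_rate(arm["cases"], person_weeks),
    }
    print(name, summary[name])
```

In this constructed example the expectant management arm has the higher incidence proportion but the lower incidence rate, showing how a difference in cumulative incidence can reflect longer time at risk rather than a higher underlying rate.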

NEXT STEPS: RECOMMENDATIONS FOR FUTURE RESEARCH

The ARRIVE trial has addressed an important question in a specific population and practice setting. Although this question is a clinically relevant one, there are many other questions that need to be asked and answered about optimal timing of birth and use of elective induction of labor, especially if inherent goals include applicability to real-world practice and women’s choices. Pragmatic trials may be important tools to answer questions about the real-world effectiveness – weighing benefits and risks – of elective induction of labor as compared to expectant management.33–35 The ARRIVE study incorporated elements of a pragmatic trial (ie, not pre-specifying the methods of labor induction), but this approach could be more fully applied to the design of the comparison groups. For example, if women not receiving elective induction at 39 weeks were encouraged to actively participate in their care, including selection of an elective induction at a later date if preferred, rather than cede this choice to trial protocol, this might produce a different result. It also would possibly result in higher enrollment. Such questions are pertinent to shared decision-making and are likely to be of high interest to women and clinicians. This type of inquiry is becoming more common in other areas of health care such as mental health care and chronic disease management,36,37 and will be important to guide the practical translation of the results to practice.

Further analysis and discussion of the ARRIVE data and further research would be useful before substantial practice change, especially given the differences between controlled study settings and general practice.38,39 The development and implementation of practice guidelines is complex, and integrating across many sources of evidence and questions enables practice change to occur in ways that are evidence-based, vigilant regarding potential unintended consequences, and conducive to shared decision-making.35,38 Other approaches to support the goal of safe reduction of primary cesarean birth also merit further research, eg, doula support and manual rotation of the fetal occiput for fetal malposition.16,30

There is also a need for research on the supporting factors that might help explain the associations found in the ARRIVE study and help translate these benefits to other settings, eg, if broad adherence to modern evidence-based definitions of unsuccessful induction and labor dystocia might safeguard against potential unintended consequences of practice change, such as an increase in cesarean birth.27,29,30 In addition, work remains to be done to ensure that women are fully informed about their options for labor and birth, and the concomitant risks and benefits. The final recommendation of the SMFM statement about ARRIVE is: “We recommend that further research be conducted to measure the impact of this practice in settings other than a clinical trial.”15 Additional analysis of the ARRIVE data and additional rigorous research will support this goal.

Quick points:

  • The recent ARRIVE trial helped fill evidence gaps on the effects of elective induction of labor at 39 weeks’ gestation, finding a significantly lower risk of cesarean birth and no significant difference in composite neonatal complications after elective induction, compared to expectant management.

  • Complex study design and causal considerations arise when considering the associations between labor induction, cesarean birth, and complications.

  • Selection bias may affect studies, including randomized controlled trials, when study participants and eligible non-participants differ.

  • Future research on elective induction of labor should explore the causal mechanisms in the ARRIVE trial, characterize this association in other populations and practice settings, and implement pragmatic clinical trial designs that may more closely reflect the concept of shared decision-making.

Acknowledgments

Sources of funding:

NIH (NR017020)

Footnotes

Conflicts of interest:

The authors have no conflicts of interest to disclose.

Contributor Information

Suzan L. CARMICHAEL, Departments of Pediatrics and Obstetrics & Gynecology, in the School of Medicine at Stanford University.

Jonathan M. SNOWDEN, School of Public Health at Oregon Health & Science University and Portland State University, and in the Department of Obstetrics & Gynecology at Oregon Health & Science University.

References

  • 1.Cheng YW, Nicholson JM, Nakagawa S, Bruckner TA, Washington AE, Caughey AB. Perinatal outcomes in low-risk term pregnancies: do they differ by week of gestation? Am J Obstet Gynecol. 2008;199(4):370 e371–377. [DOI] [PubMed] [Google Scholar]
  • 2.Reddy UM, Bettegowda VR, Dias T, Yamada-Kushnir T, Ko CW, Willinger M. Term pregnancy: a period of heterogeneous risk for infant mortality. Obstet Gynecol. 2011;117(6):1279–1287. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Spong CY. Defining “term” pregnancy: recommendations from the Defining “Term” Pregnancy Workgroup. JAMA. 2013;309(23):2445–2446. [DOI] [PubMed] [Google Scholar]
  • 4.ACOG (American College of Obstetricians and Gynecologists). Practice bulletin no. 146: Management of late-term and postterm pregnancies. Obstet Gynecol. 2014;124(2 Pt 1):390–396. [DOI] [PubMed] [Google Scholar]
  • 5.ACOG (American College of Obstetricians and Gynecologists). ACOG committee opinion no. 561: Nonmedically indicated early-term deliveries. Obstet Gynecol. 2013;121(4):911–915. [DOI] [PubMed] [Google Scholar]
  • 6.Martin JA, Hamilton BE, Osterman MJK, Driscoll AK, Drake P. Births: final data for 2016. Nat Vital Stat Rep. 2018;67(1). [PubMed] [Google Scholar]
  • 7.ACOG (American College of Obstetricians and Gynecologists). Committee Opinion No. 687: Approaches to Limit Intervention During Labor and Birth. Obstet Gynecol. 2017;129(2):e20–e28. [DOI] [PubMed] [Google Scholar]
  • 8.American-College-of-Nurse-Midwives;, Midwives-Alliance-of-North-America;, National-Association-of-Certified-Professional-Midwives. Supporting healthy and normal physiologic childbirth: a consensus statement by the American College of Nurse-Midwives, Midwives Alliance of North America, and the National Association of Certified Professional Midwives. J Midwifery Womens Health. 2012;57(5):529–532. [DOI] [PubMed] [Google Scholar]
  • 9.Vahratian A, Zhang J, Troendle JF, Sciscione AC, Hoffman MK. Labor progression and risk of cesarean delivery in electively induced nulliparas. Obstet Gynecol. 2005;105(4):698–704. [DOI] [PubMed] [Google Scholar]
  • 10.Vrouenraets FP, Roumen FJ, Dehing CJ, van den Akker ES, Aarts MJ, Scheve EJ. Bishop score and risk of cesarean delivery after induction of labor in nulliparous women. Obstet Gynecol. 2005;105(4):690–697. [DOI] [PubMed] [Google Scholar]
  • 11.Cheng YW, Kaimal AJ, Snowden JM, Nicholson JM, Caughey AB. Induction of labor compared to expectant management in low-risk women and associated perinatal outcomes. Am J Obstet Gynecol. 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Darney BG, Snowden JM, Cheng YW, et al. Elective induction of labor at term compared with expectant management: maternal and neonatal outcomes. Obstet Gynecol. 2013;122(4):761–769. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Grobman WA, Rice MM, Reddy UM, et al. Labor Induction versus Expectant Management in Low-Risk Nulliparous Women. N Engl J Med. 2018;379(6):513–523. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Greene MF. Choices in managing full-term pregnancy. N Engl J Med. 2018;379(6):580–581. [DOI] [PubMed] [Google Scholar]
  • 15.SMFM (Society for Maternal-Fetal Medicine). SMFM Statement on Elective Induction of Labor in Low-Risk Nulliparous Women at Term: the ARRIVE Trial. Am J Obstet Gynecol. 2018. [DOI] [PubMed] [Google Scholar]
  • 16.ACNM (American College of Nurse-Midwives). ARRIVE trial: talking points for members. 2018. [Google Scholar]
  • 17.Rothman KJ, Greenland S, Lash TL. Modern Epidemiology. 3rd ed. Philadelphia, PA: Lippincott Williams & Wilkins; 2008. [Google Scholar]
  • 18.Schulz KF. Unbiased research and the human spirit: the challenges of randomized controlled trials. CMAJ. 1995;153(6):783–786. [PMC free article] [PubMed] [Google Scholar]
  • 19.Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA. 1995;273(5):408–412. [DOI] [PubMed] [Google Scholar]
  • 20.Klein MC, Kaczorowski J, Robbins JM, Gauthier RJ, Jorgensen SH, Joshi AK. Physicians’ beliefs and behaviour during a randomized controlled trial of episiotomy: consequences for women in their care. CMAJ. 1995;153(6):769–779. [PMC free article] [PubMed] [Google Scholar]
  • 21.Haneuse S. Distinguishing Selection Bias and Confounding Bias in Comparative Effectiveness Research. Med Care. 2016;54(4):e23–29. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Jadad AR, Enkin MW. Chapter 3: Bias in randomized controlled trials. In: Randomized Controlled Trials: Questions, Answers, and Musings. 2nd ed. Malden, MA: Blackwell Publishing; 2007. [Google Scholar]
  • 23.Kozhimannil KB, Arcaya MC, Subramanian SV. Maternal clinical diagnoses and hospital variation in the risk of cesarean delivery: analyses of a National US Hospital Discharge Database. PLoS Med. 2014;11(10):e1001745. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Martin JA, Hamilton BE, Osterman MJK, Driscoll AK, Drake P. Births: Final data for 2017. Natl Vital Stat Rep. 2018;67(8):1–50. [PubMed] [Google Scholar]
  • 25.Hutcheon JA, Lisonkova S, Joseph KS. Epidemiology of pre-eclampsia and the other hypertensive disorders of pregnancy. Best Pract Res Clin Obstet Gynaecol. 2011;25(4):391–403. [DOI] [PubMed] [Google Scholar]
  • 26.Kozhimannil KB, Law MR, Virnig BA. Cesarean delivery rates vary tenfold among US hospitals; reducing variation may address quality and cost issues. Health Aff (Millwood). 2013;32(3):527–535. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Main E. How to apply the ARRIVE trial to my practice. CMQCC website: CMQCC (California Maternity Quality Care Collaborative); 2018. [Google Scholar]
  • 28.Kozhimannil KB, Karaca-Mandic P, Blauer-Peterson CJ, Shah NT, Snowden JM. Uptake and utilization of practice guidelines in hospitals in the United States: the case of routine episiotomy. Jt Comm J Qual Patient Saf. 2017;43(1):41–48. [DOI] [PubMed] [Google Scholar]
  • 29.Grobman WA, Bailit J, Lai Y, et al. Defining failed induction of labor. Am J Obstet Gynecol. 2018;218(1):122.e1–122.e8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Caughey AB, Cahill AG, Guise JM, Rouse DJ. Obstetric care consensus no. 1: safe prevention of the primary cesarean delivery. Obstet Gynecol. 2014;123(3):693–711. [DOI] [PubMed] [Google Scholar]
  • 31.Harper LM, Caughey AB, Odibo AO, Roehl KA, Zhao Q, Cahill AG. Normal progress of induced labor. Obstet Gynecol. 2012;119(6):1113–1118. [DOI] [PubMed] [Google Scholar]
  • 32.Neal JL, Lowe NK, Schorn MN, et al. Labor dystocia: a common approach to diagnosis. J Midwifery Womens Health. 2015;60(5):499–509. [DOI] [PubMed] [Google Scholar]
  • 33.Ford I, Norrie J. Pragmatic trials. N Engl J Med. 2016;375(5):454–463. [DOI] [PubMed] [Google Scholar]
  • 34.Treweek S, Zwarenstein M. Making trials matter: pragmatic and explanatory trials and the problem of applicability. Trials. 2009;10:37. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Hawe P. Lessons from complex interventions to improve health. Annu Rev Public Health. 2015;36:307–323. [DOI] [PubMed] [Google Scholar]
  • 36.Lovell K, Bee P, Brooks H, et al. Embedding shared decision-making in the care of patients with severe and enduring mental health problems: The EQUIP pragmatic cluster randomised trial. PLoS One. 2018;13(8):e0201533. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Bennett GG, Warner ET, Glasgow RE, et al. Obesity treatment for socioeconomically disadvantaged patients in primary care practice. Arch Intern Med. 2012;172(7):565–574. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Phillippi JC, King TL. Assessing the value of the ARRIVE trial for clinical practice: sea change or just a splash? J Midwifery Womens Health. 2018;63(6):645–647. [DOI] [PubMed] [Google Scholar]
  • 39.Davies-Tuck M, Wallace EM, Homer CSE. Why ARRIVE should not thrive in Australia. Women Birth. 2018;31(5):339–340. [DOI] [PubMed] [Google Scholar]
  • 40.Snowden JM, Tilden EL, Odden MC. Formulating and answering high-impact causal questions in physiologic childbirth science: concepts and assumptions. J Midwifery Womens Health. 2018;63(6):721–730. [DOI] [PubMed] [Google Scholar]
  • 41.Szklo M, Nieto FJ. Epidemiology: Beyond the Basics. 3rd ed. Burlington, MA: Jones & Bartlett; 2012. [Google Scholar]
