Abstract
Background:
Causal graphs are an important tool for covariate selection but there is limited applied research on how best to create them. Here, we used data from the Coronary Drug Project trial to assess a range of approaches to directed acyclic graph (DAG) creation. We focused on the effect of adherence on mortality in the placebo arm, since the true causal effect is believed with a high degree of certainty.
Methods:
We created DAGs for the effect of placebo adherence on mortality using different approaches for identifying variables and links to include or exclude. For each DAG, we identified minimal adjustment sets of covariates for estimating our causal effect of interest and applied these to analyses of the Coronary Drug Project data.
Results:
When we used only baseline covariate values to estimate the cumulative effect of placebo adherence on mortality, all adjustment sets performed similarly. The specific choice of covariates had minimal effect on these (biased) point estimates, but including nonconfounding prognostic factors resulted in smaller variance estimates. When we additionally adjusted for time-varying covariates of adherence using inverse probability weighting, covariates identified from the DAG created by focusing on prognostic factors performed best.
Conclusion:
Theoretical advice on covariate selection suggests that including prognostic factors that are not exposure predictors can reduce variance without increasing bias. In contrast, for exposure predictors that are not prognostic factors, inclusion may result in less bias control. Our results empirically confirm this advice. We recommend that hand-creating DAGs begin with the identification of all potential outcome prognostic factors.
Keywords: Causal graphs, Causal inference, DAGs, Directed acyclic graphs, Evidence synthesis
Estimating causal effects requires the assumption of exchangeability between exposed and unexposed groups. Historically, researchers have been taught that this requires identifying and adjusting for all common causes of exposure and outcome.1–4 The development of causal graphs, such as directed acyclic graphs (DAGs) and single-world intervention graphs, has helped clarify that bias can be removed by successfully blocking all confounding pathways, regardless of whether every variable traditionally considered a confounder is included.5–7 In epidemiologic and public health research, most causal graphs are hand-constructed by researchers.8 Despite this, there is limited evidence-based guidance on the actual process of creating a causal DAG,9–11 with most guidance on DAGs focusing on the underlying theory or the selection process for identifying valid adjustment sets.12 Indeed, many epidemiologists report that unfamiliarity with the DAG-building process has been a major barrier to their use of this tool in selecting adjustment variables.8,13
In general, a causal DAG should include all variables that are common causes of any two other variables already included on the graph.12 In the language of causal graphs, two variables with a cause-and-effect relationship are called the “parent” (cause) and “child” (effect) nodes. When creating a graph, we want to include all variables that are parents of two or more variables already on the graph. However, in practice, choices must be made about how many generations removed from the effect of interest are sufficient to stop adding new parents. Furthermore, determining whether or not a variable is a cause of the exposure, outcome, and/or any other variable already on the graph is not a simple matter. Many potential confounding variables are never rigorously investigated, while even rigorous evidence of null relationships struggles to find a home in the peer-reviewed literature.11 Many researchers resort to relying on their own expert knowledge for populating graphical causal models8 or on the general consensus of experts in their field regarding the “important” confounding variables. However, relying on one’s own knowledge or the consensus of the field can lead to the propagation of errors across a body of research. In addition, there is limited guidance on how to practically determine the consensus of experts to aid in DAG building.
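The inclusion rule described above can be illustrated with a minimal Python sketch. The function name, the `known_effects` mapping, and the example variables are hypothetical and serve only to show the rule; they are not taken from the DAGs built in this study.

```python
def common_causes_to_add(known_effects, on_graph):
    """Return candidates that are parents of two or more nodes already on the graph.

    known_effects maps each candidate variable to the set of variables it is
    believed to cause (hypothetical subject-matter knowledge).
    """
    return sorted(
        candidate
        for candidate, children in known_effects.items()
        if candidate not in on_graph and len(children & on_graph) >= 2
    )


# Hypothetical example: start from the exposure and the outcome only.
on_graph = {"adherence", "death"}
known_effects = {
    "smoking": {"adherence", "death"},  # common cause of two nodes on the graph
    "uric_acid": {"death"},             # causes only one node on the graph
    "eye_color": set(),                 # causes nothing on the graph
}
to_add = common_causes_to_add(known_effects, on_graph)
print(to_add)
```

In practice the rule would be reapplied after each round of additions, since newly added nodes can themselves share parents; deciding when to stop is the judgment call discussed above.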
To support open and replicable science practices, DAG building ideally should use a clearly described and prespecified approach to identifying variables for inclusion (or exclusion). A variety of theoretical papers have proposed approaches to guide the hand creation of DAGs and other causal graphs. One approach, called the “disjunctive cause criterion,” recommends that a DAG be built by selecting all known direct causes of either the exposure or outcome.14 Other approaches include focusing on pretreatment covariates related to the exposure (sometimes called the “kitchen sink” approach)15 or on identifying all known prognostic factors for the outcome. Finally, the recently proposed “evidence synthesis for constructing DAGs” approach focuses on recreating and assessing the DAG implied by an existing study.10 Although some of these approaches have been assessed for methodologic reliability using simulation studies, there are few studies that have assessed different approaches to DAG creation on confounding control in real datasets.16
One major challenge for evaluating DAG-building strategies in real datasets is that the “true” DAG is always unknown, yet assessing these strategies requires the ability to score the resulting graphs for validity against some “gold standard.” If we have a strong prior belief about the true value of the causal effect of an exposure on an outcome, however, we may be able to use this knowledge to compare and evaluate different DAG-building approaches. This is the scenario of the Coronary Drug Project (CDP) trial placebo arm, in which it is strongly believed that adherence versus nonadherence to placebo has no causal effect on mortality over 5 years.17,18
Here, we use the CDP placebo arm data to assess methods for hand creation of causal DAGs. The causal effect of placebo adherence on mortality is widely believed to be null, but the observed relationship has also been shown to be clearly confounded by a range of baseline and time-varying covariates. We exploit these properties to provide a method of assessing DAGs created using a range of theory-based DAG-building approaches. We anticipate that DAGs that more closely represent the “true” underlying DAG will return a null estimate of the causal effect of placebo adherence on mortality when used to determine adjustment sets for analysis.
CASE STUDY: ADHERENCE TO PLACEBO IN THE CORONARY DRUG PROJECT TRIAL
Medication adherence varies for a wide variety of reasons, with an estimated 50% of patients not adhering to their prescribed medication schedule.19 Nonadherence, defined as less than 80% adherence in a given time period, can be as high as 80% for asymptomatic conditions such as hypertension.19 Although many studies have shown a link between drug adherence and mortality, there are also well-established links between drug adherence and healthy behaviors, leading many to believe the relationship with mortality is due to bias, often called the “healthy adherer” bias.20 Skepticism of adherence-adjusted effect estimates can be traced to a highly cited21 paper from the CDP trial team that in 1980 demonstrated substantial confounding of the relationship between placebo adherence and mortality when comparing placebo adherers and placebo nonadherers.17 This paper was an effective demonstration of the challenge of intractable adherence confounding; the authors reported a 13.1 percentage point difference in 5-year mortality comparing adherers to placebo with nonadherers to placebo, and a 9.4 percentage point difference persisted despite adjustment for a large set of baseline covariates. However, while the bias due to confounding in this example may have been intractable with the statistical tools available at the time, modern developments in causal inference methodology have made this bias tractable.22 In 2016, a reanalysis of the CDP data using g-methods to account for postrandomization adherence confounding and selection bias demonstrated that these methods could successfully eliminate healthy adherer bias, at least in the placebo arm.18,23 In that analysis, using the original list of covariates but including measures from both baseline and follow-up resulted in a final estimated mortality difference of only 2.5 percentage points (95% confidence interval [CI]: −2.1, 7.0).18,23
This well-studied relationship between adherence to placebo and mortality in the CDP trial provides an excellent case study within which to empirically evaluate causal graph-building approaches. The prior studies of this effect suggest that we should see two trends when the minimal adjustment set identified from a causal DAG is sufficient to control for confounding in this relationship. First, we anticipate that controlling for the “correct” set of covariates using only their values at baseline will reduce but not remove confounding, resulting in an estimated risk difference intermediate between the unadjusted result and the null value. Second, we anticipate that controlling for a “correct” set of covariates both at baseline and over follow-up using appropriate methods for time-varying confounder-adherence feedback control will provide a null estimated effect of adherence on mortality.
METHODS
Ethical Review
The data do not include any identifiable patient information and this study was deemed exempt by the Boston University Institutional Review Board (BU:BMC/BUMC IRB #H-40362).
Coronary Drug Project Data and Variable Selection
Full details on the CDP study are reported elsewhere17,24–26 and are summarized in eAppendix 1a; http://links.lww.com/EDE/C152. Briefly, this trial assessed potential lipid-lowering treatments for the prevention of mortality among men who had at least one prior heart attack. To provide sufficient power for the assessment of five separate treatment arms, the trial included a large placebo arm (N = 2789). Mortality data for the CDP were obtained using death certificates, autopsy reports, physician summaries, and medical records. Individuals for whom vital status was unknown after 5 years (n = 159) were censored from the analysis. The CDP trial team analyzed the relationship between adherence to placebo and mortality, adjusting for a list of 40 baseline covariates and restricting to individuals with known vital status at the end of the study and no missing covariate information. Those covariates were selected by experts who were asked to identify prognostic factors for mortality in the target population of the trial (eAppendix 1b and eTable 1; http://links.lww.com/EDE/C152). These covariates were also assessed for relevance using a forward selection process against the interim outcome of 3-year mortality. However, ultimately, the trial team used all covariates identified by the experts. The CDP trial dataset is available for researchers from the National Heart, Lung, and Blood Institute’s Biologic Specimen and Data Repository Information Coordinating Center repository;26 however, one of the 40 covariates (time since last myocardial infarction at baseline) was corrupted in the stored data and is no longer available for researchers.
Causal Estimand
The outcome of interest has variously been assessed as mortality incidence or survival time over the 5-year follow-up;17,18,23 for simplicity, we focus on the cumulative incidence difference here. The primary exposure of interest, adherence, is somewhat more complicated (eAppendix 1c; https://links.lww.com/EDE/C152). Prior analyses have used a binary variable for high or low cumulative average adherence over follow-up dichotomized at 80%,17,18,23 and we use the same measure here. The original data were collected categorically every 3 months at the study visits, with the study doctor determining from the pill bottle whether the patient had used more than 80%, 20%–79%, or less than 20% of their assigned medication. To obtain the cumulative average, we combined the two lower adherence categories at each visit and then calculated the proportion of visits with adherence ≥80%. The final binary measure indicates at least 80% adherence reported for at least 80% of the study visits. As with prior reanalyses using causal methods, we carry forward the cumulative average for up to three missed visits before censoring for loss to follow-up and use inverse probability weights to adjust for this censoring.
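The construction of the exposure can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study code: the category labels ("high", "mid", "low"), the function names, and the choice to censor after more than three consecutive missed visits (or when no visit has yet been observed) are assumptions made for the example.

```python
def cumulative_average_adherence(visits, max_carry_forward=3):
    """Cumulative proportion of attended visits with >=80% adherence.

    visits: per-visit categories, "high" (>=80% of pills used), "mid"
    (20%-79%), "low" (<20%), or None for a missed visit.
    Returns one cumulative average per visit, stopping at censoring.
    """
    history = []
    attended = high = missed_in_a_row = 0
    for category in visits:
        if category is None:
            missed_in_a_row += 1
            # Assumption: censor after more than max_carry_forward consecutive
            # missed visits, or if there is nothing yet to carry forward.
            if missed_in_a_row > max_carry_forward or not history:
                break
            history.append(history[-1])  # carry the last value forward
            continue
        missed_in_a_row = 0
        attended += 1
        high += category == "high"  # "mid" and "low" are combined as <80%
        history.append(high / attended)
    return history


def is_high_adherer(cumulative_average, threshold=0.8):
    """Binary exposure: >=80% adherence at >=80% of visits so far."""
    return cumulative_average >= threshold


followed = cumulative_average_adherence(["high", "high", "mid", None, "high", "high"])
censored = cumulative_average_adherence(["high", None, None, None, None, "high"])
print(followed, is_high_adherer(followed[-1]))
print(censored)
```

The second participant contributes four visits (one observed, three carried forward) and is then treated as lost to follow-up.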
Directed Acyclic Graph Building
For each of the variable selection approaches described below, we created two DAGs. First, we created a baseline DAG that assumed that only baseline covariates were of relevance and that cumulative adherence could be treated as a single time-invariant exposure. This DAG reflects the analysis done by the original 1980 CDP paper. We also created a longitudinal DAG that incorporated information available over time, including time-varying covariates, time-varying adherence, and time-varying vital status. Due to a large number of covariates, we visualized the longitudinal DAGs with only three time points (baseline, during follow-up, and end of follow-up; see eFigure 1; http://links.lww.com/EDE/C152 for the basic skeleton structure of these DAGs). We created DAGs in LaTeX using the TikZ package.
Selecting Variables for Hand Creation of Directed Acyclic Graphs
We sequentially created a causal DAG starting from the implied DAG underlying the analysis done by the CDP trial team in 1980. Outcome prognostic factors for 5-year mortality among men with prior heart attack were identified by the experts in the original CDP trial.25 The selection process for these covariates was detailed in a 1973 publication.25 Although this publication clearly indicates that these covariates were selected without consideration of adherence to either the trial medications or to placebo, we assume that all variables are potential exposure predictors since all were used as baseline confounders in the 1980 CDP placebo adherence analysis. We drew the implied baseline DAG for these covariates and then assumed that the same covariates were relevant over time to create a longitudinal DAG that reflects the implied DAG of previous CDP reanalyses.18,23 We refer to these as the outcome-driven DAGs, since the covariates were selected based on their relationship to mortality.
Next, we conducted a scoping review using the National Library of Medicine’s National Center for Biotechnology Information PubMed database to assess whether, for each variable, there was reasonable evidence to support inclusion as a potential cause of adherence among individuals with existing cardiovascular disease.11 Under the paradigm of causal DAGs, the exclusion of a variable or connection between two variables is a stronger assumption than inclusion. Thus, the decision to retain a variable was made whenever the PubMed search indicated a reasonable body of literature supporting the possibility of a relationship between that covariate and adherence. We removed any variables for which there was no evidence to support such a relationship, or, more rarely, strong evidence against such a relationship (eTable 2; http://links.lww.com/EDE/C152). In particular, we removed arrows linking covariates to adherence when there was no evidence to support that relationship. We then removed any covariates on the DAG that were caused only by one other variable. We refer to the resulting baseline and longitudinal DAGs as the trimmed DAGs.
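A rough Python sketch of this trimming step follows. It assumes a DAG stored as a set of (parent, child) arrows and reads the final pruning rule as "drop covariates left with a single arrow"; both the data structure and that reading are assumptions for illustration, and the toy graph is hypothetical rather than one of the study DAGs.

```python
def trim_dag(edges, exposure, outcome, no_evidence_for_exposure):
    """Trim a DAG given covariates with no evidence of causing the exposure.

    edges: set of (parent, child) arrows.
    """
    # Step 1: remove covariate -> exposure arrows that lack supporting evidence.
    kept = {
        (parent, child)
        for parent, child in edges
        if not (child == exposure and parent in no_evidence_for_exposure)
    }
    # Step 2 (assumed reading): drop covariates left with only one arrow.
    arrows = {}
    for parent, child in kept:
        arrows[parent] = arrows.get(parent, 0) + 1
        arrows[child] = arrows.get(child, 0) + 1
    dropped = {
        node
        for node, count in arrows.items()
        if count <= 1 and node not in (exposure, outcome)
    }
    return {(p, c) for p, c in kept if p not in dropped and c not in dropped}


# Hypothetical toy graph with two covariates.
outcome_driven = {
    ("smoking", "adherence"), ("smoking", "death"),
    ("uric_acid", "adherence"), ("uric_acid", "death"),
    ("adherence", "death"),
}
trimmed = trim_dag(outcome_driven, "adherence", "death", {"uric_acid"})
print(sorted(trimmed))
```

Here the covariate without support as a cause of adherence loses its arrow into the exposure and, being left with a single arrow into the outcome, is removed from the graph.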
Finally, following the advice laid out under the disjunctive cause criterion for variable selection, we searched for additional variables that may be outcome or exposure causes. To do this, we conducted additional literature searches and consulted with experts to identify potentially missing adherence or mortality causes that were not included in the outcome-driven or trimmed DAGs. To guide this variable-finding process, we constructed general categories of adherence determinants informed by the literature reviewed while creating the trimmed DAGs. These categories were demographic variables, lifestyle variables, medical history, prescription medications, laboratory test values, electrocardiogram (ECG) findings, and mental health variables. We then used these categories to help guide additional literature review and obtained expert input on adherence predictors and additional outcome prognostic factors (eTables 3 and 4; http://links.lww.com/EDE/C152). Finally, we assessed the possible existence of covariate–covariate relationships. We refer to the resulting baseline and longitudinal DAGs as the maximal DAGs.
Statistical Analysis
Using each of the DAGs created above, we identified adjustment sets based on the confounding structure and available covariates, as well as the assumptions about exposure implied by the DAG. We used each adjustment set to estimate the average 5-year mortality difference between at least 80% adherence to placebo versus less than 80% adherence to placebo in the CDP trial. For comparison with prior CDP placebo adherence analyses, we estimated this effect as a time-invariant exposure with baseline confounding only, and separately, as a time-varying exposure with time-varying confounding. Our analytic plan matched that used in a prior reanalysis of the CDP data to estimate this causal effect.18 This plan has been explained in detail elsewhere, but we provide a brief summary here.
For the time-invariant exposure analysis, we used the plug-in g-formula to estimate the average causal effect. To do so, we first fit a logistic regression model for death during follow-up, conditional on the adjustment set and the binary cumulative adherence variable (≥80% versus <80%). We then standardized the predicted conditional mortality probabilities from this dataset across the baseline covariate distribution to obtain the average effect estimate on the risk difference scale.18 For each DAG, we used the analytic sample of participants with complete information on the adjustment variables. To assess the potential impact of varying the analytic sample, we also ran the time-invariant analysis with no covariate adjustment; this unadjusted analysis differs between analyses only in the sample used to fit the logistic regression model. For comparisons between DAGs, we also ran all models on the subset of individuals with complete data for all covariates in any adjustment set (N = 2410).
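The standardization step can be illustrated with a simplified Python sketch. To stay self-contained, it swaps the logistic outcome model for stratum-specific mortality proportions within levels of a single categorical covariate (a nonparametric plug-in g-formula); this substitution, the function names, and the toy counts are assumptions for illustration and do not reproduce the study analysis.

```python
from collections import defaultdict


def standardized_risk_difference(rows):
    """Plug-in g-formula with one categorical covariate.

    rows: iterable of (stratum, exposed, died) with exposed and died coded 0/1.
    Returns the risk under exposure minus the risk under no exposure,
    standardized to the covariate distribution of the whole sample.
    Requires every stratum to contain both exposed and unexposed people.
    """
    deaths = defaultdict(int)
    counts = defaultdict(int)
    stratum_size = defaultdict(int)
    for stratum, exposed, died in rows:
        counts[(stratum, exposed)] += 1
        deaths[(stratum, exposed)] += died
        stratum_size[stratum] += 1
    total = sum(stratum_size.values())
    difference = 0.0
    for stratum, size in stratum_size.items():
        risk_exposed = deaths[(stratum, 1)] / counts[(stratum, 1)]
        risk_unexposed = deaths[(stratum, 0)] / counts[(stratum, 0)]
        difference += (risk_exposed - risk_unexposed) * size / total
    return difference


def make_rows(stratum, exposed, n, n_deaths):
    """Expand counts into individual-level toy records."""
    return [(stratum, exposed, 1)] * n_deaths + [(stratum, exposed, 0)] * (n - n_deaths)


# Toy data: the covariate predicts both exposure and death.
toy = (
    make_rows(0, 1, 50, 5) + make_rows(0, 0, 50, 10)
    + make_rows(1, 1, 20, 6) + make_rows(1, 0, 80, 32)
)
adjusted = standardized_risk_difference(toy)
crude = standardized_risk_difference([(0, a, y) for _, a, y in toy])
print(round(adjusted, 3), round(crude, 3))
```

Collapsing the covariate into a single stratum gives the unadjusted contrast, so comparing the two numbers shows how much of the crude difference in the toy data is due to the covariate.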
For the time-varying exposure analysis, we used inverse probability weighting to adjust for time-varying covariates.18 First, we fit pooled logistic regression models for exposure at each time point, given prior exposure, baseline covariates, and covariates at the prior time point. Since not all participants had adherence information available at each study visit, we created a set of weights models for measurement of exposure at each visit plus a set for the binary adherence level at each visit. We stabilized each set of weights with the predicted probability of an individual’s exposure from pooled logistic regression models with baseline covariates only. Weights were then estimated for each individual at each time point, and the measurement and adherence weights were combined. The final weights were truncated at the 99th percentile for all models (see eTable 5; http://links.lww.com/EDE/C152 for weight distributions before and after truncation). The final estimate of the average causal effect was obtained using a weighted pooled logistic regression model for mortality over time, given exposure and baseline covariates, with the resulting model parameters standardized across baseline covariates to obtain an estimate on the risk difference scale.
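The weight construction can be sketched as follows. The sketch takes the fitted probabilities as given inputs rather than fitting the pooled logistic models, shows only one of the two sets of weights, and uses a nearest-rank percentile for truncation; the function names, argument names, and the percentile rule are assumptions for illustration.

```python
def stabilized_weights(exposure, p_numerator, p_denominator):
    """Cumulative stabilized weights across visits for one participant.

    exposure: observed binary adherence at each visit (0/1).
    p_numerator: predicted P(adherent) from the baseline-covariate model.
    p_denominator: predicted P(adherent) from the model that also includes
    time-varying covariates.
    """
    weights = []
    running = 1.0
    for observed, p_num, p_den in zip(exposure, p_numerator, p_denominator):
        numerator = p_num if observed == 1 else 1 - p_num
        denominator = p_den if observed == 1 else 1 - p_den
        running *= numerator / denominator
        weights.append(running)
    return weights


def truncate_weights(weights, percentile=99):
    """Cap weights at the given percentile (nearest-rank definition)."""
    ordered = sorted(weights)
    rank = (percentile * len(ordered) + 99) // 100  # ceiling in integer arithmetic
    cap = ordered[rank - 1]
    return [min(weight, cap) for weight in weights]


one_person = stabilized_weights([1, 0], [0.8, 0.8], [0.9, 0.6])
capped = truncate_weights([1.0] * 99 + [50.0])
print(one_person, max(capped))
```

Measurement weights built the same way would be multiplied elementwise with the adherence weights before truncation, following the description above.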
Code is available in Statistical Analysis System (SAS) and Stata formats in eAppendix 1; http://links.lww.com/EDE/C152 and on GitHub.27 For all analyses in SAS, we estimated 95% CIs using nonparametric bootstraps with 500 iterations. The SAS code used here mirrors that from prior reanalyses of the CDP data, but is updated so that the adjustment sets can be changed easily.18 Stata code was newly created for the present study; the Stata code uses the margins command to obtain CIs. Results presented in the current manuscript were obtained via SAS 9.4.
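For readers unfamiliar with the procedure, a nonparametric bootstrap with 500 resamples can be sketched in Python as below. The text does not state which type of bootstrap interval was used, so the approximate percentile interval here is an assumption, as are the function names, the seed, and the toy data; this is not a translation of the SAS code.

```python
import random


def risk_difference(rows):
    """rows: list of (exposed, died) pairs coded 0/1."""
    risk = {}
    for level in (0, 1):
        outcomes = [died for exposed, died in rows if exposed == level]
        risk[level] = sum(outcomes) / len(outcomes)
    return risk[1] - risk[0]


def bootstrap_ci(rows, estimator, n_boot=500, alpha=0.05, seed=2024):
    """Approximate percentile interval from resampling rows with replacement."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    estimates = sorted(
        estimator(rng.choices(rows, k=len(rows))) for _ in range(n_boot)
    )
    lower = estimates[int(n_boot * alpha / 2)]
    upper = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


# Toy data: 10% mortality among 200 exposed, 20% among 200 unexposed.
toy = [(1, 1)] * 20 + [(1, 0)] * 180 + [(0, 1)] * 40 + [(0, 0)] * 160
point = risk_difference(toy)
lower, upper = bootstrap_ci(toy, risk_difference)
print(round(point, 3), round(lower, 3), round(upper, 3))
```

Any estimator that maps a resampled dataset to a single number can be passed in, so the same wrapper would apply to a standardized or weighted estimate.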
To assess the results of our DAG-building process, we estimated the association strength between newly added covariates and exposure and outcome, separately, using Chi-squared tests. We also computed P value functions from the bootstrapped results using the R package pvaluefunction.28
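As an illustration of the association test, the sketch below computes a Pearson chi-squared statistic for a 2 × 2 table. It assumes a dichotomized covariate, one degree of freedom, and no continuity correction; the text does not specify these details, and the table counts are made up.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared test for a 2x2 table (1 degree of freedom).

    table: ((a, b), (c, d)), e.g. rows = covariate level,
    columns = adherent / nonadherent (or died / survived).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


statistic, p_value = chi_squared_2x2(((30, 70), (50, 50)))
print(round(statistic, 2), round(p_value, 4))
```

Running the test once against adherence and once against death for each covariate gives the pair of statistics needed to judge whether the covariate behaves like an exposure predictor, an outcome predictor, or both.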
RESULTS
The final list of variables included in each of the six DAGs is given in Table 1. Of the 40 original covariates identified by the trial team in the outcome-driven approach, only 39 were available in public datasets; the missing covariate (time since prior myocardial infarction) is included in the DAGs but not in the analytic adjustment sets. The implied baseline outcome-driven DAG is shown in Figure 1 and replicates the assumptions we believe were used for the 1980 CDP placebo adherence analysis. Figure 2 shows the extension of this DAG to include multiple time points and allow exposure and covariates to change over time.
TABLE 1.
Final Variable List for Each Directed Acyclic Graph (DAG)-building Approach
| Variable | Outcome-driven DAG: Baseline | Outcome-driven DAG: Longitudinal | Trimmed DAG: Baseline | Trimmed DAG: Longitudinal | Maximal DAG: Baseline | Maximal DAG: Longitudinal |
|---|---|---|---|---|---|---|
| Demographics | ||||||
| Age at entry | ☑ | | ☑ | | ☑ | |
| Race | ☑ | | ☑ | | ☑ | |
| Occupation | ☑ | |||||
| Employment status | ☑ | ☑ | ||||
| Education level | ☑a | |||||
| Lifestyle characteristics | ||||||
| Run-in adherenceb | ☑ | | ☑ | | ☑ | |
| Cigarette smoking | ☑ | ☑ | ☑ | ☑ | ☑ | ☑ |
| Current habitual level of physical activity | ☑ | ☑ | ☑ | ☑ | ☑ | ☑ |
| Alcohol use | | | | | ☑a | ☑a |
| Medical history | ||||||
| Risk group | ☑ | | ☑ | | ☑ | |
| Number of myocardial infarctions at baseline | ☑ | | ☑ | | ☑ | |
| Time since most recent myocardial infarction | ☑a | | ☑a | | ☑a | |
| Relative body weight | ☑ | ☑a | ☑ | ☑a | ☑ | ☑a |
| History of congestive heart failure | ☑ | ☑ | ☑ | ☑ | ☑ | ☑ |
| History of angina pectoris | ☑ | ☑ | ☑ | ☑ | ☑ | ☑ |
| History of acute coronary insufficiency | ☑ | ☑ | ☑ | ☑ | ☑ | ☑ |
| History of intermittent cerebral ischemic attack | ☑ | ☑ | ☑ | ☑ | ☑ | ☑ |
| History of intermittent claudication | ☑ | ☑ | ☑ | ☑ | ☑ | ☑ |
| Systolic blood pressure | ☑ | ☑ | ||||
| Diastolic blood pressure | ☑ | ☑ | ||||
| New York Heart Association functional class | ☑ | ☑ | ☑ | ☑ | ☑ | ☑ |
| Cardiomegaly on chest x-ray | ☑ | ☑ | ☑ | ☑ | ☑ | ☑ |
| Hypertension | | | | | ☑ | ☑ |
| Dyslipidemia | | | | | ☑ | ☑ |
| Diabetes | | | | | ☑ | ☑ |
| Atrial fibrillation | | | | | ☑ | ☑ |
| Mental health | ||||||
| Depression | | | | | ☑a | ☑a |
| Cognitive status | | | | | ☑a | ☑a |
| Prescription of nonstudy medications | ||||||
| Oral hypoglycemic agents | ☑ | ☑ | ☑ | ☑ | ☑ | ☑ |
| Digitalis | ☑ | ☑ | ☑ | ☑ | ☑ | ☑ |
| Antiarrhythmic agents | ☑ | ☑ | ☑ | ☑ | ☑ | ☑ |
| Diuretics | ☑ | ☑ | ☑ | ☑ | ☑ | ☑ |
| Antihypertensives other than diuretics | ☑ | ☑ | ☑ | ☑ | ☑ | ☑ |
| Lab findings | ||||||
| Serum total bilirubin | ☑ | ☑ | ||||
| Serum total cholesterol | ☑ | ☑ | ||||
| Serum triglyceride | ☑ | ☑ | ||||
| Serum uric acid | ☑ | ☑ | ||||
| Serum alkaline phosphatase | ☑ | ☑ | ||||
| Plasma urea nitrogen | ☑ | ☑ | ||||
| Plasma fasting glucose | ☑ | ☑ | ☑ | ☑ | ☑ | ☑ |
| Plasma 1-hour glucose, after 75 g oral load | ☑ | ☑ | ||||
| White blood cell count | ☑ | ☑ | ||||
| Absolute neutrophil count | ☑ | ☑ | ||||
| Hematocrit | ☑ | ☑ | ||||
| ECG findings | ||||||
| Q/QS pattern on anterolateral, posteroinferior, or anteroseptal recording | ☑ | ☑ | ||||
| ST segment depression on anterolateral, posteroinferior, or anteroseptal recording | ☑ | ☑ | ||||
| T-wave findings on anterolateral, posteroinferior, or anteroseptal recording | ☑ | ☑ | ||||
| ST segment elevation on anterolateral, posteroinferior, or anteroseptal recording | ☑ | ☑ | ||||
| Ventricular conduction defect | ☑ | ☑ | ||||
| Premature ventricular beats | ☑ | ☑ | ||||
| Heart rate | ☑ | ☑ | ||||
Covariates are listed by the time point available. Baseline DAGs included only baseline covariates. Longitudinal DAGs included the variables both at baseline and over time. Variables included in the baseline column but not the longitudinal column are those for which only the baseline value is used; this value is included in both the baseline and longitudinal DAGs for a given approach. Variable definitions and supporting evidence are provided in eAppendix 1; http://links.lww.com/EDE/C152.
a. Variables that were desired for inclusion but not available in the dataset, or not available at the specified time point(s).
b. Run-in adherence is adherence to placebo as measured during a pretrial run-in period when all trial participants were given placebo. This occurred before participants were assigned their trial randomization.
FIGURE 1.
Directed acyclic graph (DAG) for the baseline outcome-driven DAG-building approach. Covariates are colored by conceptual category for ease of reading. Covariate enclosed in dashed box is unmeasured; that enclosed in solid box is restricted by study design. This graph represents the implied DAG used in the 1980 Coronary Drug Project placebo adherence assessment.
FIGURE 2.
Directed acyclic graph (DAG) for the longitudinal outcome-driven DAG-building approach. Covariates are colored by conceptual category for ease of reading. Covariates enclosed in dashed boxes are unmeasured; those enclosed in solid boxes are restricted by study design. This graph represents the implied DAG used in the 2016 reanalysis of the 1980 Coronary Drug Project placebo adherence assessment incorporating time-varying confounding control.
The trimmed DAGs have the following modifications relative to the outcome-driven DAG. First, we removed all laboratory test result variables except fasting plasma glucose from the DAG because we could not find support for these variables affecting adherence. In addition, we also removed all variables for specific ECG findings due to a lack of evidence supporting their role in adherence determination. We also hypothesized that patients would typically be unaware of the value of these sets of variables and thus that they would be unlikely to directly affect adherence choices over time. Figure 3 shows the baseline trimmed DAG after the removal of variables with only one arrow. The longitudinal trimmed DAG and versions of these DAGs showing the removed variables and their reduced arrows are provided in eFigures 2 and 3; http://links.lww.com/EDE/C152.
FIGURE 3.
Directed acyclic graph (DAG) for the baseline trimmed DAG, excluding covariates with one or fewer descendants. Covariates are colored by conceptual category for ease of reading. Dashed box indicates unmeasured covariate; solid box indicates covariate restricted by study design.
Finally, the maximal DAG was built on the trimmed DAG by adding a number of baseline and time-varying variables for which there was evidence suggesting relationships with adherence and/or mortality. The variables added at this stage included further information on health status (hypertension, dyslipidemia, diabetes, and atrial fibrillation), mental health information (depression and cognitive status), and lifestyle features (alcohol consumption). Note that, of these, only the health status variables were actually available in the CDP trial dataset. The maximal DAG therefore suggests potential residual confounding due to unknown mental health and lifestyle variables. We additionally added links between covariates in the maximal DAG to allow consideration of covariate–covariate paths. These paths suggest that some of the confounding due to depression, cognition, and alcohol consumption may be partially controlled via the set of available adjustment variables. Figure 4 shows the longitudinal maximal DAG without covariate–covariate links for easier assessment. The baseline maximal DAG and versions with the covariate–covariate links are provided in eFigures 4–7; http://links.lww.com/EDE/C152.
FIGURE 4.
Directed acyclic graph (DAG) for the longitudinal maximal DAG, excluding covariates with one or fewer descendants. Covariates are colored by conceptual category for ease of reading. Dashed boxes indicate unmeasured covariates; solid boxes indicate covariates restricted by study design.
When we removed observations with any missing information on variables, the largest impact on the sample size was seen in the outcome-driven DAG. This DAG (which represents prior CDP analyses) provided a sample size of 2413 individuals with complete covariate data; the trimmed DAG increased this to 2485 individuals, and the maximal DAG again reduced the sample size slightly to 2481. Unadjusted analyses within each of these data subsets returned nearly identical results for all analytic samples (risk difference: 11 percentage points; see eTable 6; http://links.lww.com/EDE/C152), suggesting minimal bias related to restricting to individuals with complete covariate data.
Combining the covariate lists from all DAGs, the final complete case analytic dataset had a sample size of 2410 individuals (risk difference = 11; 95% CI = 6.6, 16). The estimated risk differences adjusting for baseline covariates alone and for baseline and time-varying covariates using this analytic dataset are reported in Table 2 for all analyses. Our estimates from the outcome-driven approach closely matched previously reported CDP estimates for both baseline (7.0 percentage points, 95% CI = 2.9, 11) and longitudinal DAGs (2.5 percentage points, 95% CI = −2.0, 7.0). Note that although we use the same variable set as the prior analyses, our analytic dataset here is slightly smaller.18
TABLE 2.
Estimated Risk Difference (Percentage Points; 95% CIa) for the Effect of Adherence on 5-year Mortality Among the Placebo Arm, Coronary Drug Project (N = 2410b)
| | Baseline Adjusted Analysis | Baseline and Time-varying Adjusted Analysis |
|---|---|---|
| Outcome-driven DAG | 7.0 (2.9, 11) | 2.5 (−2.0, 7.0) |
| Trimmed DAG | 7.6 (3.3, 12) | 3.7 (−1.1, 8.5) |
| Maximal DAG | 7.6 (3.4, 12) | 5.2 (0.4, 10) |
Unadjusted estimate 11 (95% CI = 6.6, 16).
a. Confidence intervals were obtained via 500 bootstrap samples in SAS 9.4.
b. Sample size is the maximum available complete case data for all covariates in any adjustment set; for analyses using maximum complete case data for each covariate list separately, see eAppendix 1, eTable 6; http://links.lww.com/EDE/C152.
Removal of laboratory test results and ECG results in the trimmed DAG had very little effect on the estimate for either the baseline DAG time-invariant analysis or the longitudinal DAG time-varying analysis. In both cases, the point estimate and CIs were shifted slightly away from the null, relative to the outcome-driven DAG. For the time-invariant analysis, the risk difference was 7.6 percentage points for the trimmed DAG (95% CI = 3.3, 12), but adjustment for time-varying covariates still resulted in improved confounding control relative to baseline covariates only, and did not affect CI precision (trimmed DAG time-varying adjusted risk difference: 3.7 percentage points, 95% CI = −1.1, 8.5).
Interestingly, while the analysis using the baseline maximal DAG was reasonably similar to the other two approaches (risk difference: 7.6 percentage points; 95% CI = 3.4, 12), the longitudinal maximal DAG performed less well than either the trimmed DAG or the outcome-driven DAG at confounding control. The estimated risk difference from the longitudinal maximal DAG after controlling for time-varying covariates was 5.2 percentage points (95% CI = 0.4, 10), nearly 3 percentage points further from the null than the estimate from the outcome-driven DAG. Running all analyses in the analytic sets of individuals with complete information on each adjustment set separately had minimal impact on these estimates, suggesting that the primary issue is covariate choice rather than the analytic sample (eTable 6; http://links.lww.com/EDE/C152).
To evaluate the DAGs created under each strategy, we also examined how strongly the covariates that were removed from or added to the outcome-driven DAG, when creating the trimmed and maximal DAGs, were associated with adherence and with death (Table 3). Most of the covariates that we selected for removal when creating the trimmed DAG were selected because our review and subject-matter experts suggested these were unlikely to be causes of adherence. The statistical associations largely match this expectation, with the removed covariates more often having statistically significant associations with death than with adherence. On the other hand, the covariates that we added in creating the maximal DAG were ones that our review and experts suggested could be causes of both adherence and mortality, but each of these covariates was generally statistically significantly associated with only one of adherence or mortality, not both.
TABLE 3.
Strength of Associations With Death and Adherence for Removed and Newly Added Covariates Identified During the Maximal DAG-building Process
| Variable | Adherence: Chi-squared | Adherence: P Value | Death: Chi-squared | Death: P Value |
|---|---|---|---|---|
| Removed variables | | | | |
| Systolic blood pressure | 8.1 | 0.01 | 0.3 | 0.59 |
| Diastolic blood pressure | 4.6 | 0.03 | 0.48 | 0.49 |
| Bilirubin | 0.06 | 0.81 | 0.04 | 0.85 |
| Cholesterol | 0.86 | 0.36 | 2.7 | 0.10 |
| Triglycerides | 1.4 | 0.24 | 0.05 | 0.83 |
| Uric acid | 0.0004 | 0.98 | 9.7 | 0.002 |
| Alkaline phosphatase | 2.6 | 0.11 | 2.2 | 0.13 |
| Urea nitrogen | 0.01 | 0.93 | 1.6 | 0.21 |
| 1-hour glucose | 0.25 | 0.61 | 8.5 | 0.004 |
| White cell count | 7.9 | 0.01 | 19 | <0.0001 |
| Neutrophil count | 7.6 | 0.01 | 18 | <0.0001 |
| Hematocrit | 1.0 | 0.31 | 0.3 | 0.58 |
| Q/QS pattern | 0.02 | 0.89 | 22 | <0.0001 |
| ST segment depression | 20 | <0.0001 | 98 | <0.0001 |
| T-wave findings | 7.1 | 0.01 | 54 | <0.0001 |
| Premature ventricular beats | 1.9 | 0.17 | 12 | 0.001 |
| Ventricular conduction defect | 2.6 | 0.11 | 16 | <0.0001 |
| Heart rate | 0.008 | 0.93 | 23 | <0.0001 |
| Added variables | | | | |
| Occupation (DF = 9) | 7.7 | 0.56 | 40 | <0.0001 |
| Employment status (yes/no) | 2.8 | 0.09 | 32 | <0.0001 |
| Employment status (full-time/not) | 2.1 | 0.15 | 34 | <0.0001 |
| Hypertension | 9.6 | 0.02 | 1.3 | 0.25 |
| Dyslipidemia (DF = 2) | 1.1 | 0.59 | 6.0 | 0.05 |
| Diabetes | 0.42 | 0.51 | 0.79 | 0.37 |
| Atrial fibrillation | 8.1 | 0.01 | 0.3 | 0.59 |
Chi-squared statistics and two-sided P values for the null hypotheses that each removed or newly added variable is identically distributed between those who do and do not adhere to placebo and between those who do and do not die during the 5 years of follow-up. Degrees of freedom = 1 except where specified.
DF indicates degrees of freedom.
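To make the quantities in Table 3 concrete, the sketch below computes a chi-squared statistic and its P value for a single binary covariate. It assumes Pearson's test on a 2×2 table without continuity correction; the table notes do not say which chi-squared test was used, so this choice, the counts, and the function names are all illustrative assumptions. The tail formula applies only to statistics with 1 degree of freedom.

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) for
    the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

def p_value_df1(chi2):
    """Upper-tail P value for a chi-squared statistic with 1 degree of freedom.

    For Z standard normal, P(Z**2 > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical counts: rows = covariate present/absent,
# columns = adherent/non-adherent.
chi2 = chi_squared_2x2(30, 70, 20, 80)
p = p_value_df1(chi2)
print(f"Chi-squared = {chi2:.2f}, P = {p:.2f}")
```

The same `p_value_df1` conversion can be applied to any 1-degree-of-freedom chi-squared statistic, regardless of which test produced it.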
Finally, to better assess the results of each DAG-based analysis, we plotted P value functions for the analysis accounting for baseline and time-varying covariates (Figure 5; for baseline analysis, see eFigure 8; http://links.lww.com/EDE/C152). The P value functions estimate the probability that the result is as extreme as observed or more extreme under a range of null hypotheses from a risk difference of −3 percentage points to +11 percentage points. From this figure, we can see that while results as or more extreme than those obtained from the outcome-driven DAG might be expected to occur by chance nearly 30% of the time if the risk difference is truly null, these results might also be expected to occur roughly 30% of the time if the true risk difference were closer to +5 percentage points. In contrast, the maximal DAG analysis only reaches this threshold of 30% expectation for null hypothesis values of roughly +2.5 and +8 percentage points. Thus, from the P value functions we can see that the results of the outcome-driven DAG are the most compatible with the true null of no difference, followed by those of the trimmed DAG, whereas the results of the maximal DAG are largely incompatible with this hypothesis.
FIGURE 5.
P value functions for effect estimates adjusting for baseline and time-varying covariates by adjustment set. The P value functions display the P values for each set of model results compared to null hypotheses from risk difference of −3 to risk difference of 11. Dotted gray lines indicate the location of the null hypothesis risk difference of 0, and the alpha value P = 0.05.
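As a rough check on how such P value functions behave, the sketch below approximates them from the reported time-varying estimates and 95% CIs. It assumes a normal approximation with a standard error back-calculated from each interval; the published curves may have been computed differently, and the function names are hypothetical.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution

def se_from_ci(lower, upper):
    """Standard error implied by a symmetric 95% confidence interval."""
    return (upper - lower) / (2.0 * Z_975)

def p_value_function(estimate, se, null_value):
    """Two-sided P value for a given null value under a normal approximation."""
    z = abs(estimate - null_value) / se
    return math.erfc(z / math.sqrt(2.0))  # equals 2 * (1 - Phi(z))

# Time-varying adjusted risk differences (percentage points) and 95% CIs
# as reported for each adjustment set.
reported = {
    "Outcome-driven DAG": (2.5, -2.0, 7.0),
    "Trimmed DAG": (3.7, -1.1, 8.5),
    "Maximal DAG": (5.2, 0.4, 10.0),
}

null_values = list(range(-3, 12))  # null risk differences from -3 to +11
curves = {}
for name, (est, lo, hi) in reported.items():
    se = se_from_ci(lo, hi)
    curves[name] = [p_value_function(est, se, h0) for h0 in null_values]

for name, curve in curves.items():
    p_at_zero = curve[null_values.index(0)]
    print(f"{name}: P value at a null of 0 = {p_at_zero:.2f}")
```

Under these assumptions, the P value at a null of 0 is largest for the outcome-driven DAG and smallest for the maximal DAG, in line with the ordering described in the text.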
DISCUSSION
Confounder selection is an important part of estimating causal effects but is challenging when the exposure varies for reasons that are not well known or are unmeasured, such as with adherence to assigned treatment in randomized controlled trials. Causal graphs, such as DAGs, can be a useful tool for synthesizing evidence and assessing potential control variables. However, researchers must still rely on the assumption that their DAG reflects reality. While it is well established that causal effects can be estimated without bias when a valid adjustment set is selected from a causal DAG, there remains limited guidance on the actual creation of the causal DAG. Some experts argue for a maximalist approach of including all pre-exposure covariates available, but this approach may result in the inclusion of unnecessary variables and can be difficult to generalize to the setting of time-varying exposures.
In this paper, we take advantage of the existence of a large, rich, real dataset, for which the true causal effect is known with a high degree of certainty—a randomized trial with a large placebo arm, where the effect of adherence to placebo on mortality is widely accepted to be null. We use this trial as a case study to test the performance of selecting confounders via a variety of DAG-building and covariate selection approaches previously suggested in the literature. We compare the DAG implied by the original trial analysis with DAGs created by trimming unnecessary edges and nodes and adding additional potentially relevant covariates. This DAG-building approach is similar to one we have used in other settings9,11 but here we are able to evaluate this approach by comparing analyses resulting from these DAGs.
At first glance, our results may seem somewhat surprising: the outcome-driven DAG performed the best in these data, while the inclusion of additional covariates suggested by the literature in the maximal DAG resulted in a failure to completely remove the healthy adherer bias. However, these findings are in line with the theoretical principles of DAG-based adjustment sets. For example, it is well known in the literature that the inclusion of variables that are prognostic for the outcome but not necessary for bias correction based on the DAG will result in valid effect estimates (potentially with reduced variance).29 The original CDP trial team focused on identifying predictors of mortality without consideration of adherence.25 While the resulting outcome-driven DAG may be missing some causes of adherence, it is likely to include nearly all mortality predictors. Thus, we would anticipate that the adjustment set from this DAG would be sufficient for addressing confounding (assuming the use of an appropriate time-varying confounding control approach), and any unnecessary variables that are included may act only to reduce the variance. This is supported by the very minor change in estimate observed when variables thought not to be related to adherence were excluded in the trimmed DAG. In addition, most of the variables removed for the trimmed DAG showed highly statistically significant univariate associations with mortality. Note, however, that significance is neither necessary nor sufficient for determining which variables belong on the DAG; we did not check these associations prior to building the DAGs.
Conversely, there is substantial theoretical literature on the problems that can arise when covariates that are associated only with exposure are incorrectly included in the adjustment set. For these variables, any residual bias not controlled by the other variables in the adjustment set can be amplified by the inclusion of variables that are, essentially, instruments.30 This can occur even when that variable is actually associated with the outcome if that outcome–covariate relationship is very weak (often referred to as a “near-instrument”).31 Pearl32 has suggested that this bias amplification may even increase faster than any bias reduction when multiple weakly outcome-associated confounders are included in an analysis. In the maximal DAG, we attempted to identify any missed covariates from the previous DAGs. Although we searched for both outcome and exposure causes, in practice we mainly identified variables that were suspected to be causes of adherence. Several of these variables likely affected mortality via pathways mediated by covariates removed in the trimmed DAG. For example, we removed the blood pressure variables and added hypertension. The rationale for this was that a hypertension diagnosis may be more visible to the individual and thus more likely to affect adherence decisions. However, this means that the new covariate is likely less strongly associated with the outcome than the removed covariates. Indeed, this is supported by the analysis of statistical associations: the medical variables added to the maximal DAG were more strongly associated with adherence than with mortality. Interestingly, the reverse was true for the occupational variables, but only one of these variables was available as a time-varying covariate. This could explain why we found more bias, even in the analysis adjusting for time-varying confounding, when using the maximal DAG compared to the outcome-driven or trimmed DAGs.
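The bias amplification mechanism described above can be illustrated with a small simulation. The sketch below is a deliberately simplified linear setting with a continuous exposure and outcome, one unmeasured confounder, and one pure instrument; it is not a reproduction of the CDP analysis, and the parameter values and function names are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def slope(x, y):
    """Ordinary least-squares slope of y on x (model includes an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def residuals(x, y):
    """Residuals of y after a simple linear regression on x."""
    b = slope(x, y)
    mx, my = mean(x), mean(y)
    return [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]

def simulate(n=20000, instrument_strength=1.5, seed=1):
    """Return (unadjusted, instrument-adjusted) exposure coefficients.

    The true effect of the exposure on the outcome is 0, so each
    returned coefficient is an estimate of the bias.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]  # instrument: affects exposure only
    u = [rng.gauss(0, 1) for _ in range(n)]  # unmeasured confounder
    x = [instrument_strength * zi + ui + rng.gauss(0, 1)
         for zi, ui in zip(z, u)]            # exposure
    y = [ui + rng.gauss(0, 1) for ui in u]   # outcome: no effect of x
    unadjusted = slope(x, y)
    # Frisch-Waugh-Lovell: the coefficient on x in a regression of y on
    # x and z equals the slope of residualized y on residualized x.
    adjusted = slope(residuals(z, x), residuals(z, y))
    return unadjusted, adjusted

unadjusted_bias, adjusted_bias = simulate()
print(f"Bias without adjusting for the instrument: {unadjusted_bias:.2f}")
print(f"Bias after adjusting for the instrument:   {adjusted_bias:.2f}")
```

In this setup, adjusting for the instrument removes exposure variation that is unrelated to the confounder, so the confounded share of the remaining variation grows. With the default parameters, the large-sample biases are 1/4.25 (about 0.24) without adjustment and 1/2 with adjustment.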
Our case study has several limitations. First, the measure of adherence available in the CDP study is relatively coarse. Adherence data were collected at each study visit, so we could potentially have assessed patterns of adherence over time.33 However, few participants were ever reported to have less than 20% adherence at a study visit. In addition, the categorization was based on the study clinician’s best guess rather than a pill count, so the likelihood of misclassification at any one study visit is high. We therefore opted to retain the coarse measure of adherence used in prior analyses of these data.
Another limitation of this study is that the original trial was conducted in the 1970s among an entirely male study sample. It is possible that the causes of adherence and mortality that were relevant at that time and for that population may differ from those that our experts and literature review identified as relevant for individuals with cardiovascular disease in the 2020s.
Here, we assessed a range of hand-created approaches to DAG building, using a well-studied causal effect in real data. Our findings confirm that theoretical concerns regarding bias amplification resulting from adjustment for covariates weakly associated with the outcome can have real implications for real analyses. In addition, they support the theoretical literature on the potential value, and minimal harm, of including outcome prognostic factors that may not be associated with the exposure. This case study represents only one example, and there is no guarantee that the best DAG-creation approach in this case will be universally appropriate; indeed, there is no guarantee that the DAGs we created here for the placebo arm comparison are valid even for an analysis of the full trial. However, we believe that the fact that our conclusions align well with theoretical principles of DAG creation suggests researchers may find value in first focusing on identifying all possible outcome prognostic factors when creating causal DAGs.
Footnotes
Editors’ note: A related article appears on page 654.
L.C. was supported by Eunice Kennedy Shriver National Institute of Child Health & Human Development (NICHD) grant number [K12HD092535] and by a Tufts University Career Development Award.
Disclosure: The authors report no conflicts of interest.
Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article (www.epidem.com).
Data are available by application to the NIH BIOLINCC repository. Code for the analyses in the current manuscript is available on GitHub at https://github.com/eleanormurray/CDP-analysis2023.
REFERENCES
- 1. Celentano D, Szklo M, Gordis L. Chapter 15. More on causal inference: bias, confounding, and interaction. In: Gordis Epidemiology. 6th ed. Elsevier; 2019:289–306.
- 2. Fletcher GS. Chapter 5. Risk: exposure to disease. In: Clinical Epidemiology: The Essentials. 5th ed. Wolters Kluwer; 2021:71–76.
- 3. van den Broeck J, Brestoff JR. Chapter 22: statistical estimation. In: Epidemiology: Principles and Practical Guidelines. Springer; 2013:432.
- 4. Doi SA, Williams GM. Chapter 10: modelling binary outcomes. In: Methods of Clinical Epidemiology. 1st ed. Springer; 2013:143–157.
- 5. Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research. Epidemiology. 1999;10:37–48.
- 6. Richardson TS, Robins J. Single World Intervention Graphs (SWIGS): A Unification of the Counterfactual and Graphical Approaches to Causality. Technical Report 128. Center for Statistics and the Social Sciences, University of Washington; 2013.
- 7. VanderWeele TJ, Shpitser I. On the definition of a confounder. Ann Stat. 2013;41:196–220.
- 8. Tennant PWG, Murray EJ, Arnold KF, et al. Use of directed acyclic graphs (DAGs) to identify confounders in applied health research: review and recommendations. Int J Epidemiol. 2021;50:620–632.
- 9. Barnard-Mayers R, Kouser H, Cohen JA, et al. A case study and proposal for publishing directed acyclic graphs: the effectiveness of the quadrivalent human papillomavirus vaccine in perinatally HIV infected girls. J Clin Epidemiol. 2022;144:127–135.
- 10. Ferguson KD, McCann M, Katikireddi SV, et al. Evidence synthesis for constructing directed acyclic graphs (ESC-DAGs): a novel and systematic method for building directed acyclic graphs. Int J Epidemiol. 2020;49:322–329.
- 11. Cheema H, Brophy R, Collins J, et al. Causal relationships between pain, medical treatments, and knee osteoarthritis: a graphical causal model to guide analyses. Osteoarthritis Cartilage. 2024;32:319–328.
- 12. Sauer B, VanderWeele TJ. Use of Directed Acyclic Graphs. Agency for Healthcare Research and Quality (US); 2013. Available at: https://www.ncbi.nlm.nih.gov/books/NBK126189/. Accessed 10 September 2021.
- 13. Barnard-Mayers R, Childs E, Corlin L, et al. Assessing knowledge, attitudes, and practices towards causal directed acyclic graphs: a qualitative research project. Eur J Epidemiol. 2021;36:659–667.
- 14. VanderWeele TJ, Shpitser I. A new criterion for confounder selection. Biometrics. 2011;67:1406–1413.
- 15. Shortreed SM, Ertefaie A. Outcome-adaptive lasso: variable selection for causal inference. Biometrics. 2017;73:1111–1122.
- 16. Riseberg E, Melamed RD, James KA, Alderete TL, Corlin L. Development and application of an evidence-based directed acyclic graph to evaluate the associations between metal mixtures and cardiometabolic outcomes. Epidemiol Methods. 2023;12(Suppl 1):20220133.
- 17. The Coronary Drug Project Research Group. Influence of adherence to treatment and response of cholesterol on mortality in the Coronary Drug Project. N Engl J Med. 1980;303:1038–1041.
- 18. Murray EJ, Hernán MA. Adherence adjustment in the Coronary Drug Project: a call for better per-protocol effect estimates in randomized trials. Clin Trials. 2016;13:372–378.
- 19. Brown MT, Bussell J, Dutta S, Davis K, Strong S, Mathew S. Medication adherence: truth and consequences. Am J Med Sci. 2016;351:387–399.
- 20. Simpson SH, Eurich DT, Majumdar SR, et al. A meta-analysis of the association between adherence to drug therapy and mortality. BMJ. 2006;333:15.
- 21. Cheema H, Murray EJ. Science Changes Over Time: Do Scientists? A Bibliographic Exploration of Views on Adherence Adjustment among Clinical Trialists. OSF Preprints; 2020.
- 22. Hernán MA, Robins JM. Per-protocol analyses of pragmatic trials. N Engl J Med. 2017;377:1391–1398.
- 23. Murray EJ, Hernán MA. Improved adherence adjustment in the Coronary Drug Project. Trials. 2018;19:158.
- 24. Coronary Drug Project Group. Factors influencing long-term prognosis after recovery from myocardial infarction—three-year findings of the Coronary Drug Project. J Chronic Dis. 1974;27:267–285.
- 25. The Coronary Drug Project Research Group. The Coronary Drug Project: design, methods, and baseline results. Circulation. 1973;47(3 Suppl):I1-50.
- 26. NHLBI Biolincc. Coronary Drug Project (CDP). Accession Number HLB02412121a. Published March 13, 2023. Available at: https://biolincc.nhlbi.nih.gov/studies/cdp/. Accessed 16 March 2023.
- 27. Murray E. eleanormurray/CDP-analysis2023: CDP-DAGvariants_Epide-miology.v1.0.0 (publication_version). Zenodo. 2024. Available at: 10.5281/zenodo.10903682.
- 28. Infanger D, Schmidt-Trucksäss A. P value functions: an underused method to present research results and to promote quantitative reasoning. Stat Med. 2019;38:4189–4197.
- 29. Lee P. Covariate adjustments in randomized controlled trials increased study power and reduced biasedness of effect size estimation. J Clin Epidemiol. 2016;76:137–146.
- 30. Myers JA, Rassen JA, Gagne JJ, et al. Effects of adjusting for instrumental variables on bias and precision of effect estimates. Am J Epidemiol. 2011;174:1213–1222.
- 31. Schisterman EF, Cole SR, Platt RW. Overadjustment bias and unnecessary adjustment in epidemiologic studies. Epidemiology. 2009;20:488–495.
- 32. Pearl J. Invited commentary: understanding bias amplification. Am J Epidemiol. 2011;174:1223–7; discussion p. 1228.
- 33. Wanis KN, Sarvet AL, Wen L, et al. The role of grace periods in comparative effectiveness studies of different medications. 2022.





