Abstract
A recently published analysis in Hypertension suggests that thiazide use, versus non-use, is associated with excess risk of adverse cardiovascular outcomes in patients with diabetes enrolled in the Action to Control Cardiovascular Risk in Diabetes (ACCORD) trial. Herein, we replicate these findings using the same publicly-available datasets and following their reported methods. We further show that possible misclassification of thiazide exposure exists in the original analysis. We perform alternative analyses that correct for this misclassification to highlight the impact that misclassification can have on observed associations between an exposure (e.g., thiazides) and outcomes (e.g., stroke and major adverse cardiovascular events).
In an era of stagnant or reduced funding for biomedical research, there has been increasing interest in repurposing existing clinical trial datasets to answer important clinical questions and generate hypotheses for future research. The National Heart, Lung and Blood Institute (NHLBI) has been a leader in this regard by creating the Biologic Specimen and Data Repository Information Coordinating Center (BioLINCC), which serves as a warehouse for limited datasets from NHLBI-funded clinical trials. These data are widely available to researchers, contingent on approval of a brief research plan and signed data use agreement. However, these research plans do not undergo scientific review, nor does NHLBI review appropriateness of the study design, analyses, or reporting of results stemming from these datasets. This process, while maximizing research opportunities and data availability, also creates the potential for introducing inadvertent biases in results, particularly when ambiguity exists in the documentation related to data collection and curation processes employed when creating these publicly-available datasets.
Herein, we highlight one example, recently published in Hypertension, where study design choices, combined with somewhat ambiguous documentation in a publicly-available dataset, may have led to use of an inaccurate drug exposure variable, causing biased results and incorrect conclusions. We focus on this article to accomplish two goals: 1) to correct the record regarding the conclusions from this particular study; and 2) to highlight the importance of understanding the provenance of clinical research datasets and considering the potential for exposure misclassification and its effects on outcome estimates in observational research.
In the December 2019 issue of Hypertension, Tsujimoto and Kajio reported an analysis comparing thiazide use versus non-use on the risk of adverse cardiovascular outcomes in patients with diabetes enrolled in the Action to Control Cardiovascular Risk in Diabetes (ACCORD) trial and its long-term follow-up extension, ACCORDIAN.1 The authors found, across several types of analyses, that thiazide use was associated with excess risk of the ACCORD primary outcome (first occurrence of nonfatal MI, nonfatal stroke, or cardiovascular death) and total (fatal/non-fatal) stroke in the overall ACCORD cohort and, in stratified analysis, in the intensive systolic BP (SBP) target arm of the ACCORD BP trial. Herein, we attempt to replicate these surprising findings using the same publicly-available datasets from the NHLBI and following the methods described by Tsujimoto and Kajio.1 We also examine the extent to which exposure misclassification may account for these findings and perform alternative analyses that minimize misclassification of thiazide exposure at baseline and during follow-up for comparison. We focus on their multivariable Cox regression analysis comparing baseline use of thiazide vs. no thiazide for illustrative purposes, although the issues highlighted here apply to the other analyses in the Tsujimoto and Kajio paper1 as well.
In the primary analyses by Tsujimoto and Kajio, the researchers performed the observational equivalent of an “intent-to-treat” (ITT) analysis, where thiazide exposure was determined at cohort entry and patients were assigned to fixed exposure groups (thiazide or no thiazide) for the duration of follow-up, regardless of subsequent treatment modifications. In such an approach, exposure misclassification can bias results if either (1) the initial group assignments are inaccurate or (2) true exposure changes over time because patients discontinue or start therapy during follow-up and thus their initial group assignment becomes inaccurate. The first source of exposure misclassification can be minimized by using the most accurate measure possible of drug exposure at baseline going forward into follow-up. The second source, if present, cannot be addressed at all in an ITT analysis, but instead requires an “as treated” analysis, where drug exposure is allowed to vary over time or where patients are censored when exposure status changes. In our attempts to replicate the findings by Tsujimoto and Kajio, we discovered that both sources of exposure misclassification were likely present.
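The difference between the two designs described above can be made concrete with a minimal sketch. It assumes each patient's drug use is available as sorted, non-overlapping (start day, stop day) periods. The names `Interval`, `fixed_exposure`, and `time_dependent_exposure` and the example patient are hypothetical illustrations, not variables or code from the ACCORD datasets or from either analysis.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Interval:
    start: int     # first day of the interval (inclusive)
    stop: int      # end of the interval (exclusive)
    exposed: bool  # exposure status assigned to this person-time
    event: bool    # True only on the interval that ends with the outcome


def fixed_exposure(baseline_exposed: bool, follow_up_days: int,
                   had_event: bool) -> List[Interval]:
    """ITT-style: a single interval with exposure frozen at cohort entry."""
    return [Interval(0, follow_up_days, baseline_exposed, had_event)]


def time_dependent_exposure(on_drug: List[Tuple[int, int]],
                            follow_up_days: int,
                            had_event: bool) -> List[Interval]:
    """As-treated style: split follow-up wherever exposure changes.

    `on_drug` is assumed to hold sorted, non-overlapping (start, stop) days.
    """
    rows: List[Interval] = []
    cursor = 0
    for start, stop in on_drug:
        start, stop = max(start, cursor), min(stop, follow_up_days)
        if start >= stop:
            continue  # period lies entirely outside follow-up
        if start > cursor:
            rows.append(Interval(cursor, start, False, False))
        rows.append(Interval(start, stop, True, False))
        cursor = stop
    if cursor < follow_up_days:
        rows.append(Interval(cursor, follow_up_days, False, False))
    if rows:
        rows[-1].event = had_event  # the outcome, if any, ends the last interval
    return rows


# Hypothetical patient: no thiazide at enrollment, on drug from day 30 to
# day 400, outcome on day 1000.
itt = fixed_exposure(False, 1000, True)
as_treated = time_dependent_exposure([(30, 400)], 1000, True)
exposed_days = sum(r.stop - r.start for r in as_treated if r.exposed)
print(itt)
print(as_treated, exposed_days)
```

In the fixed-exposure rows, all 1000 days are attributed to non-exposure. In the split rows, the 370 days on drug are attributed to exposure. Rows of the second kind are the usual input format for a Cox model with a time-dependent covariate.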
To understand how the first source of exposure misclassification was introduced, it is important to note that the ACCORD datasets from NHLBI contain two sets of information on thiazide exposure: 1) a curated “analysis” dataset that includes an indicator for thiazide use at “baseline” and at each annual visit for all ACCORD patients (i.e., annually-updated antihypertensive use); and, 2) a medication log that contains information for each antihypertensive drug started, continued, or stopped at each visit for the ACCORD BP trial participants only. The analysis by Tsujimoto and Kajio used the former data in classifying thiazide exposure at baseline.1 Crucially, although it is not obvious from the NHLBI-supplied study documentation, information on “baseline” medication use in this curated “analysis” dataset comes from the history and physical performed at enrollment in ACCORD and documents thiazide exposure prior to randomization in ACCORD, and thus prior to any treatment modification by investigators at the baseline visit. Conversely, the medication log data contain information on thiazide exposure at the end of the baseline visit, including changes implemented by study investigators. This distinction is not trivial for any antihypertensive (or antidiabetic) agent because the use of these agents would be expected to change in a substantial proportion of patients immediately on trial entry as site investigators sought to achieve disease targets (e.g., intensive or standard SBP targets). This is particularly true for thiazides, which investigators were explicitly instructed to consider as initial therapy or as part of any combination regimen.2 Thus, using pre-trial thiazide exposure to group patients as thiazide-exposed or unexposed during follow-up could lead to a significant number of patients having incorrect group assignments, with corresponding misattribution of events or event-free survival time.
Indeed, comparison of the two aforementioned measures of thiazide exposure reveals significant differential misclassification (i.e., the degree of misclassification differs between patients considered thiazide-exposed and unexposed) among ACCORD-BP trial participants. Specifically, among those considered unexposed to thiazides at baseline in the Tsujimoto and Kajio analysis, approximately 29% are exposed to a thiazide starting at the baseline visit. These patients were not on a thiazide pre-randomization, but began thiazide treatment at the baseline visit to achieve their SBP target. Conversely, among those considered exposed at baseline in the Tsujimoto and Kajio analysis, approximately 15% are actually unexposed starting at the baseline visit. For these patients, the pre-randomization thiazide was presumably discontinued because they were too far below their SBP target or because their medical history warranted a switch to alternate therapy.
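The comparison described above amounts to a cross-tabulation of the two baseline exposure indicators. The sketch below assumes one record per patient with two booleans (pre-randomization indicator, medication-log indicator). The function name and the toy counts are hypothetical: the counts were chosen only to mirror the approximate percentages reported above and are not actual ACCORD counts.

```python
def misclassification_by_group(records):
    """Proportion of patients whose medication-log exposure at the baseline
    visit disagrees with the pre-randomization indicator, computed separately
    for those considered exposed and those considered unexposed.

    `records` is an iterable of (pre_randomization_exposed, log_exposed).
    """
    totals = {True: 0, False: 0}
    mismatches = {True: 0, False: 0}
    for pre, log in records:
        totals[pre] += 1
        mismatches[pre] += int(pre != log)

    def rate(group):
        return mismatches[group] / totals[group] if totals[group] else float("nan")

    return {"considered_exposed": rate(True), "considered_unexposed": rate(False)}


# Toy cohort (hypothetical counts): 100 patients per group.
records = (
    [(False, True)] * 29 + [(False, False)] * 71   # considered unexposed
    + [(True, False)] * 15 + [(True, True)] * 85   # considered exposed
)
rates = misclassification_by_group(records)
print(rates)
```

Unequal rates in the two groups are what the text above refers to as differential misclassification.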
The problem of differential misclassification also extends throughout ACCORD BP trial follow-up, where exposure, defined annually in the curated “analysis” dataset, is misclassified relative to medication logs, on average, approximately 30% to 50% of the days in a given year, depending on the year of the trial (Figure, top panel). As the proportion of truly exposed thiazide days that are considered not exposed increases (dashed gray line), the proportion of non-exposed days that are considered exposed decreases (solid black line). This finding is consistent with the recommendation for prescribing ACE inhibitors (or ARBs) to patients with cardiovascular risk factors in ACCORD BP,2 which positioned thiazides later in the antihypertensive regimen titration scheme for some patients. Accordingly, thiazide exposure increased during later trial follow-up, heightening the likelihood of misclassifying exposure over time in the “unexposed” group. This misclassified time without study outcome is erroneously attributed to no thiazide exposure and introduces bias. Furthermore, the misclassification patterns differ substantially during follow-up comparing the standard and intensive SBP target arms of the trial (Figure, middle and bottom panels, respectively). Specifically, the proportion of patients who are using thiazides but are considered not exposed (gray, dashed lines) is larger and rises faster in the intensive arm, likely because thiazides are added earlier to achieve lower SBP targets.
Figure. Exposure Misclassification During Trial Follow-up.

Each panel displays the average proportion of days in a given year where thiazide exposure ascertained by medication logs differs from the pre-randomization thiazide exposure (i.e., the thiazide exposure variable used in the original analysis). Data are presented for the overall BP trial cohort (top panel), and stratified by BP trial randomization group (middle and bottom panels). SBP, systolic blood pressure.
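The quantity plotted in the Figure can be sketched as follows, assuming each patient's medication-log exposure is available as sorted, non-overlapping (start day, stop day) periods and that a year is approximated as 365 days. The function names and the two example patients are hypothetical. This is an illustration of the calculation described in the caption, not the code used to produce the Figure.

```python
def yearly_misclassification(assumed_exposed, log_periods, n_years,
                             days_per_year=365):
    """Per follow-up year, the proportion of days on which log-based exposure
    differs from the fixed exposure status assumed for the patient."""
    proportions = []
    for year in range(n_years):
        y_start = year * days_per_year
        y_stop = y_start + days_per_year
        # Days on drug within this year, summed over the logged periods.
        exposed_days = sum(
            max(0, min(stop, y_stop) - max(start, y_start))
            for start, stop in log_periods
        )
        wrong_days = days_per_year - exposed_days if assumed_exposed else exposed_days
        proportions.append(wrong_days / days_per_year)
    return proportions


def cohort_average(patients, n_years):
    """Average the per-patient yearly proportions across a group."""
    per_patient = [yearly_misclassification(a, p, n_years) for a, p in patients]
    return [sum(col) / len(col) for col in zip(*per_patient)]


# Hypothetical patients: one considered unexposed who is on drug from day 292
# to day 730, and one considered exposed who stays on drug throughout.
patients = [(False, [(292, 730)]), (True, [(0, 1095)])]
averages = cohort_average(patients, n_years=3)
print(averages)
```

Running the same calculation separately for patients considered exposed and considered unexposed would give the two lines shown in each panel.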
Non-differential exposure misclassification – that is, exposure misclassification that occurs at a similar rate between comparison groups – is not uncommon in cohort studies, but typically biases results towards the null (i.e., towards a hazard ratio of 1). However, differential misclassification, as is present here, creates unpredictable biases, which can often be away from the null.3 The varying misclassification patterns in the standard and intensive SBP target arms of the trial could explain the surprising association between thiazide exposure and risk for the ACCORD primary outcome (“MACE”) and stroke in the intensive BP arm of the trial.1 For example, exposure misclassification is non-differential between patients considered thiazide-exposed and unexposed in the first few years of the intensive arm (Figure, bottom panel), which accords with complete overlap of the MACE and stroke survival curves during this timeframe in the original analysis. Conversely, exposure misclassification begins to differ substantially after about year 4 – almost exactly the time point at which the survival curves rapidly diverged in the original analysis.1
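The attenuation produced by non-differential misclassification can be illustrated with expected counts in a simple two-group cohort. The sketch below makes several simplifying assumptions: it uses a risk ratio in place of a hazard ratio, it applies the same misclassification rate to truly exposed and truly unexposed patients regardless of outcome, and all input values are hypothetical. It does not model the differential pattern observed in ACCORD.

```python
def observed_risk_ratio(n_exposed, n_unexposed, risk_exposed, risk_unexposed,
                        sensitivity, specificity):
    """Expected risk ratio after exposure is misclassified independently of
    the outcome. Sensitivity is the probability that a truly exposed patient
    is classified as exposed; specificity is the probability that a truly
    unexposed patient is classified as unexposed."""
    # Expected size of, and events in, the group classified as exposed.
    n_cls_exp = sensitivity * n_exposed + (1 - specificity) * n_unexposed
    ev_cls_exp = (sensitivity * n_exposed * risk_exposed
                  + (1 - specificity) * n_unexposed * risk_unexposed)
    # Expected size of, and events in, the group classified as unexposed.
    n_cls_unexp = (1 - sensitivity) * n_exposed + specificity * n_unexposed
    ev_cls_unexp = ((1 - sensitivity) * n_exposed * risk_exposed
                    + specificity * n_unexposed * risk_unexposed)
    return (ev_cls_exp / n_cls_exp) / (ev_cls_unexp / n_cls_unexp)


# Hypothetical cohort: true risks of 20% versus 10% (true risk ratio of 2),
# with 20% of patients in each true group misclassified.
true_rr = observed_risk_ratio(1000, 1000, 0.20, 0.10, 1.0, 1.0)
biased_rr = observed_risk_ratio(1000, 1000, 0.20, 0.10, 0.8, 0.8)
print(true_rr, biased_rr)  # the second value is approximately 1.5
```

Each classified group becomes a mixture of truly exposed and truly unexposed patients, so the two observed risks move toward each other and the ratio moves toward 1.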
We hypothesized that the two sources of thiazide exposure misclassification, as detailed above, could be largely responsible for excess risk observed with thiazide exposure. We tested this hypothesis by performing four analyses: 1) a replication of the original analysis by Tsujimoto and Kajio; 2) a modified replication of the original analysis, where the only difference was use of the correctly classified baseline thiazide exposure indicator from the medication logs (Fixed-Exposure, Baseline); 3) a modified replication of the original analysis, using the same curated “analysis” dataset as in the original analysis, but setting “baseline” as the year 1 visit, i.e., determining thiazide and covariate exposure at the year 1 visit, and beginning follow-up for outcomes thereafter (Fixed-Exposure, Year 1); and finally, 4) a modified replication of the original analysis, using the correctly classified thiazide indicator from the medication logs as a time-dependent exposure, in which exact start and stop dates determined periods of exposure or non-exposure (Time-Dependent Exposure). Of the modified analyses (analyses 2–4), the second and third address only the exposure misclassification introduced by use of the pre-trial thiazide indicator, but not the exposure misclassification that may occur over time in an ITT-type analysis, whereas the fourth addresses both types of exposure misclassification. The results of all four analyses are presented in the Table together with the original results by Tsujimoto and Kajio. Briefly, we were able to almost exactly replicate the original findings using the misclassified pre-trial thiazide variable (small differences are present because we could not ascertain exactly what variable definitions were used for some covariates, e.g., smoking, from the description in the original paper).
In these replicated analyses, thiazide exposure appears to be associated with excess risk of MACE in the intensive arm of the BP trial, and excess risk of stroke in the overall cohort, the BP trial cohort, and the intensive arm of the BP trial, similar to the Tsujimoto and Kajio paper. Conversely, when thiazide exposure is correctly classified at the baseline visit (“Fixed-Exposure, Baseline”), thiazide exposure is not associated with any excess risk of either cardiovascular outcome in the BP trial cohort or in either BP trial intervention arm. When thiazide exposure is correctly classified at the year 1 visit (“Fixed-Exposure, Year 1”), virtually identical null results are observed. Finally, when thiazide exposure is correctly classified in a time-dependent fashion, null results are observed for stroke, whereas, for MACE, thiazide exposure is associated with a protective effect in the overall BP trial population and in the intensive SBP target arm. Thus, it appears that differential misclassification at the baseline visit completely accounts for the biased association observed between thiazide exposure and adverse cardiovascular risk in the original analysis. In addition, differential misclassification during follow-up obscures a possible protective effect of thiazides on MACE risk.
Table. Replicate and Alternative Analysis Results from the Multivariable Cox Regression Models Testing Thiazide Exposure on Cardiovascular Risk. Values are adjusted hazard ratios (95% CI).

| Outcome | Population | Reported in Tsujimoto and Kajio Paper1 | Replicated Analysis* | Fixed-Exposure, Baseline† | Fixed-Exposure, Year 1‡ | Time-Dependent Exposure§ |
|---|---|---|---|---|---|---|
| MACE | All ACCORD | 1.12 (1.01–1.25) | 1.07 (0.96–1.19) | - | 0.90 (0.79–1.02) | - |
| MACE | BP Trial only | NR | 1.22 (1.03–1.43) | 1.03 (0.88–1.21) | 1.02 (0.83–1.25) | 0.81 (0.66–1.00) |
| MACE | Standard SBP | 1.09 (0.86–1.37) | 1.08 (0.86–1.36) | 1.01 (0.81–1.27) | 1.20 (0.91–1.57) | 0.95 (0.69–1.31) |
| MACE | Intensive SBP | 1.49 (1.18–1.88) | 1.43 (1.13–1.79) | 1.08 (0.87–1.35) | 0.85 (0.62–1.17) | 0.74 (0.55–0.99) |
| Stroke | All ACCORD | 1.34 (1.10–1.63) | 1.33 (1.09–1.62) | - | 1.02 (0.80–1.30) | - |
| Stroke | BP Trial only | NR | 1.74 (1.32–2.29) | 1.12 (0.85–1.48) | 1.10 (0.77–1.55) | 0.94 (0.61–1.45) |
| Stroke | Standard SBP | 1.36 (0.91–2.02) | 1.36 (0.92–2.00) | 1.06 (0.72–1.55) | 1.31 (0.80–2.12) | 0.95 (0.53–1.73) |
| Stroke | Intensive SBP | 2.21 (1.47–3.32) | 2.43 (1.62–3.62) | 1.20 (0.79–1.81) | 0.99 (0.56–1.74) | 1.18 (0.58–2.38) |

All models use the same outcomes and adjust for the same baseline variables as described in the original paper by Tsujimoto and Kajio.
Hazard ratios represent thiazide exposure versus non-exposure (referent) on risk of the specified outcome. A dash indicates an analysis that could not be performed because antihypertensive medication log data were unavailable in ACCORD participants who were not in the BP trial. BP, blood pressure; MACE, major adverse cardiovascular event; NR, not reported; SBP, systolic blood pressure.
*Analysis attempting to exactly replicate methods used in the original analysis.1
†Analysis replicates methods used in the original analysis, with one modification: thiazide exposure at baseline is ascertained from the medication logs, rather than the pre-randomization thiazide exposure variable used originally.
‡Analysis replicates approach used in the original analysis, but excludes the first year of follow-up in the trial, uses thiazide exposure (and other covariates in the model) ascertained from year 1 visit and begins follow-up for outcome ascertainment at the year 1 visit; patients missing exposure or covariate data at the year 1 visit were excluded, as were individuals who had the outcome before the year 1 visit.
§Analysis replicates approach used in the original analysis, but considers thiazide use (derived from medication logs with exact start/stop dates) as a time-dependent exposure.
Tsujimoto and Kajio concluded that “thiazide use may be harmful in type 2 diabetic patients with relatively low BP.”1 We suggest that, on the basis of ACCORD data, this conclusion is likely inaccurate and should not be taken to mean that thiazides should be avoided in patients with type 2 diabetes. In fact, it may be that thiazides protect against MACE, consistent with prior literature, although more advanced analytic approaches (e.g., marginal structural modeling) would be helpful in ruling out time-dependent confounding.
Perhaps more importantly, we believe this is a cautionary example of biases that can be inadvertently introduced in observational studies of clinical trial data. Confounding is perhaps the best-recognized source of bias in observational studies, and residual confounding can occur even when most major confounders are measured well and controlled for, as is often the case in large clinical trials. However, lack of granular data on the timing of changes in treatment regimen is generally problematic and can introduce bias in observational analyses, often greater in degree than bias due to residual confounding. Moreover, such biases can be amplified if treatment changes are not random but rather follow a pattern that then introduces differential exposure misclassification. In this case, thiazide exposure misclassification was created by using pre-trial thiazide exposure as an indicator of in-trial thiazide exposure. This misclassification was then amplified because in-trial thiazide exposure did not occur randomly. ACCORD BP trial participants were more likely to receive thiazides during the trial than non-BP trial participants by virtue of having hypertension. Among BP trial participants, in turn, those randomized to the intensive SBP target were more likely to receive thiazides during follow-up by virtue of requiring greater BP lowering. We suspect this is not the only study from the ACCORD data that has employed, or may employ, information on pre-trial medication use as an indicator for in-trial medication use. Particular caution is needed, however, when the drug is prescribed differentially across trial participants and over time. Finally, the present report should serve as a reminder that detailed, clear, and unambiguous study documentation is crucially important in publicly-available research datasets to maximize their scientific value.
Funding
Dr. Smith’s effort for this manuscript was funded by the National Heart, Lung and Blood Institute (K01 HL138172).
Footnotes
Disclosures
All authors report no other conflicts of interest related to this work. This manuscript was prepared using ACCORD Research Materials obtained from the NHLBI Biologic Specimen and Data Repository Information Coordinating Center and does not necessarily reflect the opinions or views of the ACCORD investigators or the NHLBI.
References:
1. Tsujimoto T, Kajio H. Thiazide Use and Cardiovascular Events in Type 2 Diabetic Patients With Well-Controlled Blood Pressure. Hypertension. 2019;74(6):1541–1550.
2. Cushman WC, Grimm RH Jr., Cutler JA, et al. Rationale and design for the blood pressure intervention of the Action to Control Cardiovascular Risk in Diabetes (ACCORD) trial. Am J Cardiol. 2007;99(12a):44i–55i.
3. Hartzema AG, Schneeweiss S. Addressing Misclassification in Pharmacoepidemiologic Studies. In: Hartzema AG, Tilson HH, Chan KA, eds. Pharmacoepidemiology and Therapeutic Risk Management. 3rd ed. Cincinnati, OH: Harvey Whitney Books Co.; 2008:325–376.
