Abstract
Purpose:
The case-crossover design is increasingly used to evaluate the effects of chronic medications; however, as traditionally implemented in pharmacoepidemiology, with the referent period preceding the outcome, it may lead to bias in the presence of persistent exposures. We aimed to evaluate the extent and magnitude of bias in case-crossover analyses of chronic and persistent exposures, using simulations.
Methods:
We simulated cohorts with either 30-day, 180-day, or 2-year exposure duration; and with varying degrees of persistence (10%, 30%, 50%, 70%, or 90% of patients not stopping exposure). We evaluated all scenarios under the null and the scenario with 30% persistence under varying exposure effects (odds ratios of 0.25 to 4.0). Cohorts were analyzed using conditional logistic regression that compared the odds of exposure on the outcome day to the odds of exposure on a referent day 30 days prior to the outcome. We further implemented the case-time-control design to evaluate its ability to adjust for bias from persistence.
Results:
Case-crossover analyses produced unbiased estimates across all scenarios without persistent users, regardless of exposure duration. In scenarios where some patients persisted on treatment, case-crossover analyses resulted in upward bias, which increased with increasing proportion of persistent users, but did not vary substantially in relation to the magnitude of the true effect. Case-time-control analyses removed bias in all scenarios.
Conclusions:
Investigators should be aware of bias due to treatment persistence in unidirectional case-crossover analyses of chronic medications, which can be remedied with a control group of similarly persistent noncases.
Keywords: bias, case-crossover, drug safety, epidemiologic methods, pharmacoepidemiology, research design
1 | INTRODUCTION
The use of self-controlled designs for pharmacoepidemiologic investigations of drug safety and effectiveness is on the rise.1,2 The case-crossover design, in particular, has been shown to be a valid and efficient approach to evaluate acute effects of transient exposures3 and may be well suited for active safety monitoring of drugs in electronic healthcare data.2,4 However, many medications are intended to be used over prolonged periods of time and the use of the case-crossover design to evaluate exposures that are not necessarily transient is increasing.1,5
In a case-crossover analysis, an individual’s exposure at the time of or preceding the outcome (hazard window) is compared with the same individual’s exposure at other times (referent window).3 Typically, in pharmacoepidemiologic studies of treatments that are often affected by an outcome occurrence, a unidirectional sampling of control times is implemented, with the referent window preceding the hazard window. This approach assumes stationary exposure prevalence across the referent and hazard windows under the null and will lead to biased estimates in the presence of population-level changes in exposure prevalence over time.3,6 Since the case-crossover design requires a change in exposure status within each individual for that individual to contribute to analysis, it was originally believed that as long as there was no time trend in drug utilization on a population level, a case-crossover analysis of chronic exposures produced valid estimates, but was not necessarily efficient.3,7 It has been suggested, however, that the case-crossover design may produce biased estimates if exposures across the hazard and referent periods within individuals are not independent.8 A recent empirical investigation further confirmed that in the presence of exposure patterns in which at least some patients persist on treatment, the case-crossover design yields upwardly biased estimates.9 Similar to bias due to population-level exposure trends,10 bias due to persistence arises from the fixed ordering in time of outcome and referent periods in unidirectional case-crossover analyses. For patients with persistent use, only one pattern of discordant exposure is possible: exposed at the time of the outcome and unexposed in the past.
Since other sources of bias, including time trends in drug utilization during the study period, time-varying confounding due to changes in patients’ health status, and exposure misclassification, could have impacted the results in the empirical investigation, the expected magnitude of bias due to persistence in case-crossover analyses remains unknown. Moreover, the case-time-control analysis, which has been shown to mitigate bias due to population-level trends in exposure in case-crossover analyses,6,11 failed to completely adjust for bias in the empirical investigation, leaving open the question about the optimal strategy to deal with bias due to persistence.9
In this study, we sought to evaluate the extent and magnitude of bias associated with exposure persistence in case-crossover analyses, as commonly implemented in studies of drug effects, using simulated data. In particular, we sought to evaluate: (a) whether long-term exposure of finite duration would lead to biased estimates; (b) the magnitude of bias, if present, as a function of the proportion of patients who persist on treatment; (c) the magnitude of bias as a function of the true treatment effect; and (d) the ability of the case-time-control approach to adjust for bias due to persistence.
2 | METHODS
2.1 | Data generation
We generated cohorts of 15 000 patients, each with 545 simulated days and one episode of drug exposure during that time. The exposure start day was simulated from a uniform distribution; that is, the probability of starting was constant across the 545 simulated days and patients could start exposure at any time (Figure S1 in Data S1). We generated four scenarios of exposure duration: (a) all patients exposed for 30 days; (b) all patients exposed for 180 days; (c) all patients exposed for 2 years; and (d) some patients exposed for 30 days with the remainder staying exposed through the end of simulated time (persistent users). We created five versions of scenario (d) by varying the proportion of patients who persisted on treatment (10%, 30%, 50%, 70%, and 90%). To avoid bias due to an initial increase in exposure prevalence in the simulated population, which arises because patients can only start discontinuing exposure after a pre-defined period (the exposure duration),11 we excluded the first 180 days (run-in window) from each patient’s simulated time in scenarios (a), (b), and (d) (Figure S1 in Data S1). In scenario (c), which had a 2-year exposure duration, we excluded the first 2 years of the 3-year simulated time. Following these exclusions, all patients had 1 year of data (evaluation window), during which the daily probability of the outcome was calculated as a function of exposure and was used to generate a binary outcome (Yij) for each day j for person i:
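The outcome model is described in the text as having an intercept and a single exposure coefficient β that corresponds to a log odds ratio. A logistic form consistent with that description can be written as follows; the symbols β0 and X_ij are notation assumed here for illustration and may differ from the authors' own.

```latex
\operatorname{logit} P(Y_{ij} = 1) = \beta_0 + \beta X_{ij},
\qquad
X_{ij} =
\begin{cases}
1 & \text{if person } i \text{ is exposed on day } j,\\
0 & \text{otherwise.}
\end{cases}
```

Under this form, exp(β) is the odds ratio for exposure, so β = 0 corresponds to the null.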
The intercept in the outcome model was selected so that the baseline probability of the outcome over 1 year was around 0.3. The effect of exposure on outcome was assumed to be constant and transient; that is, no cumulative, time-varying, or residual (persisting after treatment discontinuation) effects were modeled. All scenarios were simulated under the null: the true odds ratio [OR] for the effect of exposure on outcome was assumed to be 1, meaning the coefficient for exposure, β, equaled 0. Furthermore, in the scenario with 30% persistence, we varied the magnitude of the exposure effect (odds ratios of 0.25, 0.5, 0.8, 1.0, 1.25, 2.0, and 4.0). For each scenario, 1000 cohorts were generated.
R code for generating the simulated datasets is provided in Data S1.
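For readers without access to the supplement, the data-generating steps described above can be sketched roughly as follows. This is a minimal Python illustration, not the authors' R code: the function names, the seed, the reduced cohort size, and the way the intercept is derived from the 1-year baseline risk are all assumptions made for the example.

```python
import math
import random


def daily_intercept(yearly_risk=0.3, days=365):
    """Logit of the daily outcome probability that gives roughly the requested
    cumulative risk over the evaluation window in unexposed time.
    Assumption: the paper only states a 1-year baseline risk of about 0.3."""
    daily = 1.0 - (1.0 - yearly_risk) ** (1.0 / days)
    return math.log(daily / (1.0 - daily))


def simulate_patient(rng, total_days=545, run_in=180, duration=30,
                     p_persistent=0.3, beta=0.0, beta0=None):
    """Return (exposure, outcome) as 0/1 lists covering the evaluation window."""
    if beta0 is None:
        beta0 = daily_intercept()
    start = rng.randint(1, total_days)        # uniform exposure start day
    persistent = rng.random() < p_persistent
    # First unexposed day after the episode; persistent users never stop.
    stop = total_days + 1 if persistent else start + duration
    exposure, outcome = [], []
    for day in range(run_in + 1, total_days + 1):   # run-in days are dropped
        x = 1 if start <= day < stop else 0
        p = 1.0 / (1.0 + math.exp(-(beta0 + beta * x)))
        exposure.append(x)
        outcome.append(1 if rng.random() < p else 0)
    return exposure, outcome


rng = random.Random(2020)                     # arbitrary seed
cohort = [simulate_patient(rng) for _ in range(2000)]   # smaller than 15 000
share_with_outcome = sum(1 for _, y in cohort if any(y)) / len(cohort)
print(round(share_with_outcome, 2))
```

Setting `p_persistent=0` with `duration` of 30 or 180 corresponds to the fixed-duration scenarios with a 180-day run-in; the 2-year scenario would need `total_days=1095`, `run_in=730`, and `duration=730`.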
2.2 | Analysis
Only patients who experienced outcomes were included in the case-crossover analysis. If patients experienced more than one outcome, only the first one was included. Logistic regression, stratified on individual, was used to compare the odds of exposure on the date of the outcome to the odds of exposure on a referent day 30 days preceding the outcome. To ensure that patients’ referent day was within the evaluation time and exposure could be ascertained, patients were included in the analysis only if their outcome occurred at least 31 days after the start of the evaluation window.
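With a single binary exposure and one referent day per case, the conditional (individually stratified) logistic regression estimate reduces to the ratio of the two kinds of discordant pairs. The sketch below illustrates the case selection and that ratio; the function names and the zero-based day indexing are assumptions made for the example, and it is not the authors' implementation.

```python
def referent_pair(exposure, outcome, lag=30):
    """Return (exposed_on_outcome_day, exposed_on_referent_day) for the first
    outcome, or None if the patient has no outcome or the referent day would
    fall before the evaluation window.
    Index 0 is the first day of the evaluation window (assumed convention),
    so an outcome needs index >= lag, i.e. evaluation day 31 or later."""
    if 1 not in outcome:
        return None
    t = outcome.index(1)          # first outcome only
    if t < lag:
        return None
    return exposure[t], exposure[t - lag]


def discordant_pair_or(pairs):
    """Conditional OR for 1:1 hazard/referent pairs: n10 / n01."""
    n10 = sum(1 for h, r in pairs if (h, r) == (1, 0))  # exposed at outcome only
    n01 = sum(1 for h, r in pairs if (h, r) == (0, 1))  # exposed at referent only
    if n01 == 0:
        raise ValueError("no pairs exposed at referent and unexposed at outcome")
    return n10 / n01


# Concordant pairs, (1, 1) and (0, 0), do not contribute to the estimate.
example = [(1, 0)] * 3 + [(0, 1)] * 2 + [(1, 1), (0, 0)]
print(discordant_pair_or(example))
```

The ratio form also makes the role of persistent users visible: they can add to the count of pairs exposed only at the outcome but never to the count exposed only at the referent.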
Across the 1000 simulation iterations, we estimated the mean of the log exposure effect estimates. Bias was calculated as the mean difference between the estimated effect and the true effect on the log scale across the 1000 simulation iterations. The estimated OR was calculated by exponentiating the mean log estimate. We also evaluated the distribution of the 1000 iteration-specific estimated ORs for each scenario, as well as the daily exposure prevalence across the evaluation window in simulated cohorts.
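The summary measures described above amount to a few lines of arithmetic. The following sketch shows one way to compute them; the function name and the three illustrative log estimates are made up for the example.

```python
import math


def summarize(log_estimates, true_or):
    """Return (mean log estimate, bias on the log scale, estimated OR)."""
    mean_log = sum(log_estimates) / len(log_estimates)
    bias = mean_log - math.log(true_or)   # mean estimate minus true log OR
    estimated_or = math.exp(mean_log)     # exponentiated mean log estimate
    return mean_log, bias, estimated_or


# Three made-up iteration-level log ORs under a true OR of 1.
mean_log, bias, estimated_or = summarize([0.30, 0.36, 0.42], true_or=1.0)
print(round(mean_log, 2), round(bias, 2), round(estimated_or, 2))
```

Because the mean is taken on the log scale before exponentiating, the estimated OR is the geometric mean of the iteration-specific ORs rather than their arithmetic mean.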
For the case-time-control analysis, we used risk-set sampling to identify controls from the entire simulated cohort. Cases were eligible to be controls prior to their outcome date. Controls were required to be free of the outcome prior to and including the outcome date of the matched case and were assigned the outcome date of the matched case. We selected controls in a ratio of 1 per case. As with the case-crossover analysis, logistic regression, stratified on individual, was used to compare the odds of exposure on the outcome date to the odds of exposure on the referent day 30 days prior; however, the adjusted, case-time-control OR was derived from a product term between case status and exposure discordance, yielding the OR above and beyond that observed in controls.6 Estimates and bias were calculated as described above.
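When there is one referent day per person and exposure is binary, the product-term model described above should give the same point estimate as dividing the discordant-pair ratio in cases by the discordant-pair ratio in controls. The sketch below illustrates that arithmetic only; the function names and counts are hypothetical, and the risk-set sampling of controls is not shown.

```python
def discordant_ratio(pairs):
    """n10 / n01 for (exposed_on_index_day, exposed_on_referent_day) pairs."""
    n10 = sum(1 for h, r in pairs if (h, r) == (1, 0))
    n01 = sum(1 for h, r in pairs if (h, r) == (0, 1))
    if n01 == 0:
        raise ValueError("no pairs exposed at referent and unexposed at index")
    return n10 / n01


def case_time_control_or(case_pairs, control_pairs):
    """Case-crossover OR in cases divided by the exposure-trend OR in controls."""
    return discordant_ratio(case_pairs) / discordant_ratio(control_pairs)


# Hypothetical counts: cases and controls show the same upward exposure trend,
# so the trend-adjusted estimate returns to the null.
cases = [(1, 0)] * 143 + [(0, 1)] * 100
controls = [(1, 0)] * 143 + [(0, 1)] * 100
print(case_time_control_or(cases, controls))
```

The adjustment works only to the extent that the controls' discordant-pair ratio reflects the same exposure trend, including persistence, as the cases would have had in the absence of an effect.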
2.3 | Secondary analyses with confounding
As secondary analyses, we also generated and analyzed the scenario with 30% persistence in the presence of (a) a time-invariant confounder, and (b) a time-varying confounder.
In the scenario with a time-invariant confounder, 50% of patients were exposed to a confounder that increased the odds of the outcome 1.5-fold (OR 1.5) and lasted throughout the enrolled time (did not vary over time). Drug exposure was simulated as a function of the confounder such that the expected prevalence of the confounder was 53% among patients exposed to the drug and 17% among unexposed patients.
For the scenario with a time-varying confounder, the same parameters were implemented: confounder prevalence 50%, OR for the association with the outcome of 1.5, the expected prevalence of 53% among exposed patients and of 17% among unexposed patients. In this scenario, the confounder lasted for 15 days only and could start at any time (the confounder start day was simulated from a uniform distribution). For patients who were exposed to both the confounder and the drug, the start of drug exposure corresponded to the start of confounder exposure.
All simulations and analyses were performed using R statistical software (RStudio version 3.5.2). Risk-set sampling and matching were performed using the Epi package for R (A Package for Statistical Analysis in Epidemiology, version 2.37).12
3 | RESULTS
The estimated exposure effects (on the log and OR scales) and bias on the log scale for both case-crossover and case-time-control analyses are reported in Table 1. The distributions of estimated ORs across the 1000 iterations for scenarios with varying exposure duration under the null are presented in Figure 1. Both the case-crossover and case-time-control analyses produced unbiased estimates (mean bias ≤0.01) in all scenarios where patients were exposed for a finite amount of time, regardless of the duration of exposure (30 days, 180 days, or 2 years). However, in scenarios where some patients were persistent (stayed exposed through the end of the evaluation window), the case-crossover analysis produced biased estimates. The magnitude of bias increased with increasing proportion of persistent users (Figure 1, Table 1). With a true OR of 1, the estimated ORs were 1.11, 1.43, 2.00, 3.33, and 10.16 when 10%, 30%, 50%, 70%, and 90% of patients persisted on therapy, respectively. In all of these scenarios, the case-time-control analysis corrected for bias and yielded unbiased estimates (Table 1).
TABLE 1.
Estimated effects and bias for all scenarios
| Scenario | True effect (log scale) | Case-crossover: mean estimated effect (log scale) | Case-crossover: bias (log scale) | Case-crossover: estimated OR | Case-time-control: mean estimated effect (log scale) | Case-time-control: bias (log scale) | Case-time-control: estimated OR |
|---|---|---|---|---|---|---|---|
| True OR = 1; varying exposure duration | | | | | | | |
| 30 days | 0 | 0.004 | 0.004 | 1.00 | 0.001 | 0.001 | 1.00 |
| 180 days | 0 | 0.006 | 0.006 | 1.01 | 0.005 | 0.005 | 1.01 |
| 2 years | 0 | −0.001 | −0.001 | 1.00 | 0.01 | 0.01 | 1.01 |
| 10% persistent | 0 | 0.10 | 0.10 | 1.11 | 0.00 | −0.003 | 1.00 |
| 30% persistent | 0 | 0.36 | 0.36 | 1.43 | 0.00 | <0.001 | 1.00 |
| 50% persistent | 0 | 0.69 | 0.69 | 2.00 | 0.00 | <0.001 | 1.00 |
| 70% persistent | 0 | 1.20 | 1.20 | 3.33 | −0.01 | −0.01 | 0.99 |
| 90% persistent | 0 | 2.32 | 2.32 | 10.16 | 0.00 | 0.003 | 1.00 |
| Varying true OR; 30% persistent use | | | | | | | |
| True OR 0.25 | −1.39 | −1.04 | 0.35 | 0.35 | −1.38 | 0.003 | 0.25 |
| True OR 0.5 | −0.69 | −0.35 | 0.35 | 0.71 | −0.70 | −0.008 | 0.50 |
| True OR 0.8 | −0.22 | 0.13 | 0.35 | 1.14 | −0.23 | −0.006 | 0.80 |
| True OR 1.0 | 0.00 | 0.36 | 0.36 | 1.43 | 0.00 | <0.001 | 1.00 |
| True OR 1.25 | 0.22 | 0.59 | 0.36 | 1.80 | 0.23 | 0.005 | 1.26 |
| True OR 2.0 | 0.69 | 1.07 | 0.37 | 2.91 | 0.70 | 0.003 | 2.01 |
| True OR 4.0 | 1.39 | 1.78 | 0.39 | 5.93 | 1.39 | −0.001 | 4.00 |
Abbreviation: OR, odds ratio.
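As a rough plausibility check on the null-scenario estimates in Table 1 (a back-of-the-envelope argument, not a result reported by the authors), consider which patients can form each kind of discordant pair when the start day is uniform, non-persistent users are exposed for 30 days, and the referent day is 30 days before the outcome. Any patient who started in the 30 days before the outcome is exposed at the outcome only, whereas only non-persistent patients who started 31 to 60 days before can be exposed at the referent only. Under the null this suggests an expected OR of about 1/(1 − p), where p is the proportion of persistent users. The function name below is hypothetical.

```python
import math


def expected_null_or(p_persistent):
    """Heuristic expected case-crossover OR under the null (an assumption of
    this sketch): all starters can be exposed at the outcome only, but just
    the non-persistent share can be exposed at the referent only."""
    if not 0.0 <= p_persistent < 1.0:
        raise ValueError("p_persistent must be in [0, 1)")
    return 1.0 / (1.0 - p_persistent)


# Case-crossover estimated ORs under the null, as reported in Table 1.
reported = {0.1: 1.11, 0.3: 1.43, 0.5: 2.00, 0.7: 3.33, 0.9: 10.16}

for p, table_or in reported.items():
    heuristic = expected_null_or(p)
    print(f"{p:.0%} persistent: heuristic OR {heuristic:.2f} "
          f"(log {math.log(heuristic):.2f}), Table 1 OR {table_or:.2f}")
```

The heuristic ignores sampling variability and the restriction to first outcomes, so it is only expected to approximate the simulated means.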
FIGURE 1.

Boxplots of estimated odds ratios for scenarios with varying exposure duration (1000 iterations each) and true OR of 1.0. Whiskers extend to cover values no more than 1.5 times the interquartile range. The horizontal line indicates the true OR of 1.0. In scenarios where the exposure duration was not fixed for everybody, the indicated percent of patients persisted on therapy (remained exposed) and the rest were exposed for 30 days.
Figure 2 presents the distributions of estimated ORs for scenarios with varying true exposure effects and 30% persistence. Bias increased slightly with increasing true OR (Table 1), although the difference was minor (0.35 on the log scale at a true OR of 0.25 vs 0.39 at a true OR of 4.0). Similar to scenarios with varying degrees of persistence, the case-time-control analysis produced unbiased estimates across all scenarios with varying true exposure effect (Table 1).
FIGURE 2.

Boxplots of estimated odds ratios for scenarios with varying true odds ratio (1000 iterations each). In all scenarios, 30% of patients persisted on therapy and 70% were exposed for 30 days. Whiskers extend to cover values no more than 1.5 times the interquartile range. The diagonal line indicates where the estimated OR would be if unbiased.
Figure S2 in Data S1 displays daily exposure prevalence over the evaluation window in simulated cohorts. Exposure prevalence was stable in scenarios with finite exposure but increased in all scenarios where some patients persisted on treatment.
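The pattern described above follows directly from the uniform start day. The sketch below computes the expected daily exposure prevalence under the simulation settings stated in the Methods; the function name is hypothetical and the calculation is an illustration, not a reproduction of Figure S2.

```python
def expected_prevalence(day, total_days=545, duration=30, p_persistent=0.0):
    """Expected share of patients exposed on a simulated day (1-based), with
    the start day uniform on 1..total_days and one exposure episode each."""
    # Non-persistent users are exposed if they started within the last
    # `duration` days (including today).
    finite = min(day, duration) / total_days
    # Persistent users are exposed if they started on or before today.
    persistent = day / total_days
    return (1.0 - p_persistent) * finite + p_persistent * persistent


first_day, last_day = 181, 545   # evaluation window after the 180-day run-in
for p in (0.0, 0.3, 0.9):
    print(p,
          round(expected_prevalence(first_day, p_persistent=p), 3),
          round(expected_prevalence(last_day, p_persistent=p), 3))
```

With finite exposure the expected prevalence is flat once the run-in exceeds the exposure duration, which is the purpose of the run-in window; with any persistent users it keeps rising for as long as patients continue to start treatment.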
Secondary analyses yielded expected results (Table S1 in Data S1). The time-invariant confounder had no impact on the performance of either the case-crossover design or the case-time-control design in the presence of persistence. Time-varying confounding produced the same amount of bias in both the case-crossover and the case-time-control analyses and had no impact on the magnitude of bias due to persistence (Table S1 in Data S1).
4 | DISCUSSION
In the present study, we examined the magnitude of bias when the unidirectional case-crossover design is used to evaluate the effects of chronic exposure. As expected, the case-crossover analysis produced unbiased estimates when exposure was transient (30 days). In addition, it also produced unbiased estimates when exposures of longer, but still finite, duration (180 days or 2 years) were evaluated. However, when some patients remained persistently exposed (persistent users), the case-crossover design yielded upwardly biased estimates, with the magnitude of bias increasing as the proportion of persistent users increased. Case-time-control analyses yielded unbiased estimates in all scenarios.
In a case-crossover analysis, only patients with contrasting exposure status contribute information. As traditionally implemented in pharmacoepidemiology, with the referent period preceding the outcome (ie, unidirectional, right-censored at outcome), a case-crossover analysis can only include persistent users who are exposed at or prior to the outcome event and unexposed during the referent time in the past. Unlike patients on therapies with finite duration, whether short or prolonged, persistent users do not discontinue their exposure and cannot contribute the opposite pattern: unexposed at outcome and exposed in the past. Thus, within persistent users, there will be an increase in exposure probability across the referent and the hazard (outcome) windows, violating one of the main assumptions of the case-crossover design: stable exposure prevalence.6
Indeed, it has been suggested that the case-crossover design requires exchangeability of exposure probability between periods for every person to produce unbiased estimates.8 Our study, along with the empirical investigation by Hallas et al, confirms that persistent exposure, but not fixed long-term exposure, will yield biased estimates in a unidirectional case-crossover analysis. This bias will exist even in the absence of an observable time trend in drug utilization on a population level as the rate of death of individuals using a drug reaches the rate of drug initiation and drug utilization reaches the steady state. Thus, examining population-level trends in utilization may not detect potential persistent user bias, while true prevalence of persistence for many medications intended for life-long use is unknown. Matched controls from the exposed population could be a useful diagnostic for the presence of potential bias, including bias due to persistence, even if they cannot necessarily distinguish the source of exposure trend. The use of the control group in the case-time-control analysis eliminated bias in all of our simulated scenarios.
In contrast to simulations, Hallas et al observed a significant reduction in bias with the case-time-control approach only in some of the evaluated empirical examples.9 The presence of residual bias following adjustment could be attributable to differential persistence among cases and controls or the presence of time-varying, within-person confounding among cases.9 The ability of the case-time-control design to adjust for persistent user bias will depend on how well persistence in controls approximates persistence in cases. If persistence is related to time-invariant factors that differ between cases and controls, the case-time-control analysis may result in residual bias. The case-case-time-control design, a variant of the case-time-control approach that utilizes future cases as controls, may minimize the risk of residual bias caused by sampling person-time from an inappropriate control group.13 However, if cases modify their persistence right before an outcome occurs, the case-case-time-control approach may still produce biased estimates.13 Furthermore, although it can adjust for bias due to time trends in exposure, including persistent user bias, the case-time-control approach does not address bias due to time-varying confounding in cases, such as transient confounding by indication.
Our findings should be interpreted within the context and limitations of the data-generating process. For each patient, we simulated only one exposure period; however, in the real world, patients may stop and re-start treatment, providing some period of unexposed time, although the decision to both stop and re-start may be associated with a change in health status, introducing time-varying, within-patient confounding. Further, in our scenarios with persistent use, patients were either persistent or stopped exposure after 30 days. In the real world, however, persistence would be a more complicated phenomenon that may also change over time and, as mentioned above, differ between cases and controls. Other sources of bias, such as time-varying confounding, are also likely to be present and lead to biased estimates in case-time-control analyses. Finally, censoring events, which were not simulated in our study, may increase the proportion of patients who appear persistent. Plasmode simulations, which create simulated cohorts based on real healthcare claims data and preserve the existing, real-world pattern of drug use, may provide more realistic distributions of persistence and its relation with outcomes and confounders.14 Nevertheless, our simplified simulations allowed us to isolate and investigate the magnitude of bias associated with exposure persistence in unidirectional case-crossover analyses, as well as the ability of the case-time-control approach to mitigate this particular bias.
In conclusion, while the case-crossover design is useful for evaluating chronic medications of finite duration, it will yield biased estimates when some patients persist on treatment. Researchers considering using the case-crossover design within the context of medications intended for life-long treatment need to consider the extent of persistent use in their study population and whether using controls may ameliorate the issue or whether another study design would be more appropriate.
KEY POINTS.
- The case-crossover design is a valid and efficient approach for evaluating acute effects of transient exposures.
- The case-crossover design also produces valid estimates when evaluating the effects of chronic medications of finite duration.
- Analyzing the effects of chronic medications via a unidirectional case-crossover design will yield biased estimates when at least some individuals persist on treatment.
- The magnitude of bias will increase with an increasing percentage of individuals persisting on treatment and can be remedied with a similarly persistent control group.
- Case-time-control analysis may still produce biased estimates if persistence differs between cases and controls or in the presence of time-varying, within-person confounding.
ACKNOWLEDGEMENTS
The authors would like to thank Dr. John G. Connolly for assistance with the simulation code. K. Bykov was supported by a training grant from the National Institute of Child Health and Human Development (T32 HD40128).
Funding information
This study was supported internally by the Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital. K. Bykov was supported by a training grant from the National Institute of Child Health and Human Development (T32 HD40128). The funders had no role in study design, analysis and interpretation of data, and in the decision to submit for publication.
Footnotes
CONFLICT OF INTEREST
K. Bykov is a consultant to Alosa Health for unrelated work. S.V. Wang received salary support from investigator-initiated grants to Brigham and Women’s Hospital from Boehringer Ingelheim, Novartis Pharmaceuticals, and Johnson & Johnson for unrelated work. J.J. Gagne received salary support from investigator-initiated grants to Brigham and Women’s Hospital from Eli Lilly and Company and Novartis Pharmaceutical Corporation and was a consultant to Aetion, Inc. and Optum, Inc., all for unrelated work. Other authors have no conflicts of interest to declare.
ETHICS STATEMENT
The authors state that no ethical approval was needed.
SUPPORTING INFORMATION
Additional supporting information may be found online in the Supporting Information section at the end of this article.
REFERENCES
- 1. Nordmann S, Biard L, Ravaud P, Esposito-Farese M, Tubach F. Case-only designs in pharmacoepidemiology: a systematic review. PLoS ONE. 2012;7:e49444.
- 2. Maclure M, Fireman B, Nelson JC, et al. When should case-only designs be used for safety monitoring of medical products? Pharmacoepidemiol Drug Saf. 2012;21(Suppl 1):50–61.
- 3. Maclure M. The case-crossover design: a method for studying transient effects on the risk of acute events. Am J Epidemiol. 1991;133:144–153.
- 4. Bykov K, Schneeweiss S, Glynn RJ, Mittleman MA, Gagne JJ. A case-crossover-based screening approach to identifying clinically relevant drug-drug interactions in electronic healthcare data. Clin Pharmacol Ther. 2019;106:238–244.
- 5. Gault N, Castaneda-Sanabria J, De Rycke Y, Guillo S, Foulon S, Tubach F. Self-controlled designs in pharmacoepidemiology involving electronic healthcare databases: a systematic review. BMC Med Res Methodol. 2017;17:25.
- 6. Suissa S. The case-time-control design. Epidemiology. 1995;6:248–253.
- 7. Delaney JA, Suissa S. The case-crossover study design in pharmacoepidemiology. Stat Methods Med Res. 2009;18:53–65.
- 8. Vines SK, Farrington CP. Within-subject exposure dependency in case-crossover studies. Stat Med. 2001;20:3039–3049.
- 9. Hallas J, Pottegard A, Wang S, Schneeweiss S, Gagne JJ. Persistent user bias in case-crossover studies in pharmacoepidemiology. Am J Epidemiol. 2016;184:761–769.
- 10. Greenland S. Confounding and exposure trends in case-crossover and case-time-control designs. Epidemiology. 1996;7:231–239.
- 11. Wang SV, Schneeweiss S, Maclure M, Gagne JJ. “First-wave” bias when conducting active safety monitoring of newly marketed medications with outcome-indexed self-controlled designs. Am J Epidemiol. 2014;180:636–644.
- 12. Carstensen B, Plummer M, Laara E, Hills M. Epi: A Package for Statistical Analysis in Epidemiology. R Package Version 2.37; 2019. https://CRAN.R-project.org/package=Epi.
- 13. Wang S, Linkletter C, Maclure M, et al. Future cases as present controls to adjust for exposure trend bias in case-only studies. Epidemiology. 2011;22:568–574.
- 14. Franklin JM, Schneeweiss S, Polinski JM, Rassen JA. Plasmode simulation for the evaluation of pharmacoepidemiologic methods in complex healthcare databases. Comput Stat Data Anal. 2014;72:219–226.