Bipolar disorder (BD) is associated with poor psychosocial functioning (e.g., high divorce rates, erratic work performance), significant medical and psychiatric comorbidities (e.g., comorbid cardiovascular conditions, comorbid anxiety and substance use disorders), and high rates of suicide attempts and death by suicide.1 Given such severe morbidity, clinicians and patients need to know the efficacy and effectiveness of both drug and non-drug treatment and management options for BD. Clinicians and patients also need a better understanding of how the effects of treatment vary based on age of BD illness onset, current age, BD type, psychiatric comorbidity, and specific demographic characteristics. However, when we sought to evaluate a broad range of drug and non-drug BD interventions via a systematic literature review, we found a sparse and scattered evidence base with no high- or moderate-strength evidence that any intervention effectively treats any type of BD compared with placebo or an active comparator.2 Two issues posed major challenges in our evidence synthesis: lack of diagnostic accuracy and high attrition of study participants.
CHALLENGES WITH DIAGNOSTIC ACCURACY IN BD RESEARCH
Clinicians and researchers need to be able to clearly determine the BD subtype (e.g., Bipolar I Disorder [BD-I] versus Bipolar II Disorder [BD-II]) and the severity of BD (e.g., number of episodes, persistence of symptoms/episodes) in the population being studied to assess the extent of treatment effectiveness across various BD presentations. A reliable and valid diagnostic assessment is a prerequisite for such clarity in research. Identifying these diagnostic sample characteristics is critical when synthesizing research to inform evidence-based medicine, but this was rarely possible in our recent systematic review of existing studies.2 Most studies used the Diagnostic and Statistical Manual of Mental Disorders (DSM) criteria current for the study period, but methods and reliability of patient ascertainment varied. Often, studies provided only limited details on diagnostic assessment and failed to describe procedures for ensuring diagnostic validity (e.g., use of gold-standard assessment tools, consensus reviews) or to report interrater reliability statistics. Compared with drug studies, which tended to focus on BD-I, psychosocial treatment studies tended to include a wider range of BD diagnoses. Even so, a significant subset of psychosocial studies did not report outcomes separately by BD subtype or by indicators of BD severity, which prevented us from determining how treatment effectiveness varies with these patient characteristics.
CHALLENGES WITH HIGH ATTRITION IN BD RESEARCH
High attrition rates in clinical trials also presented a significant barrier to reaching a high strength of evidence for any particular treatment.2 In studies of drugs for acute mania, we consistently found that 30 to 70% of randomized patients failed to complete treatment in 3-week trials.2 We tried to address this by excluding studies with greater than 50% attrition. However, the typical attrition range for the remaining studies was still quite high at 35 to 45%.2 Such high attrition increases risk of bias and ultimately undermines strength of evidence, because it is difficult to assess efficacy or effectiveness when nearly half of study participants fail to complete outcome assessments at designated intervals.
Methods to create values for missing data rely on the untestable assumption that the missing data are unrelated to the disease condition. This is a highly questionable assumption for BD, given the complexities of the patient population. Studies that met our attrition criteria commonly relied on last-observation carried forward (LOCF) to address missing data. This is perhaps unsurprising given that the Food and Drug Administration (FDA) previously seemed to prefer this approach.3 However, studies often used LOCF without providing details about when in the follow-up period participants’ last observations were recorded, other than “after baseline.” Given the frequency of measurements in these trials, dropout as early as the first week cannot be ruled out. Single imputation methods like LOCF are unreliable because they assume that the health status of patients who dropped out of the trial would not have changed had researchers recorded further observations. This is a problematic assumption in the context of withdrawal due to lack of efficacy or adverse events, both common occurrences in BD treatment.4 LOCF methods can also bias effect estimates. In addition, because single imputation treats imputed values as if they had been observed, estimated standard errors understate the true uncertainty surrounding effect estimates. This can imply that the result is more precise than it actually is, potentially inflating the type I error rate.5 This potential bias in the estimates of effect is even more problematic in studies with greater than 50% attrition, which require imputing half or more of the data.
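To make the LOCF concern concrete, the following is a minimal sketch in Python. The weekly symptom scores, the `trial` list, and the `locf_endpoint` helper are all hypothetical and invented for illustration; they are not data or code from any study in the review. The sketch carries each participant's last recorded score forward to the end of a 3-week trial and also records the visit at which that score was actually taken, which is the detail the reviewed studies often left unreported.

```python
def locf_endpoint(scores):
    """Return (carried_value, visit_index) for the last non-missing score.

    `scores` holds one value per scheduled visit (index 0 = baseline);
    None marks a visit that was missed after dropout.
    """
    for visit in range(len(scores) - 1, -1, -1):
        if scores[visit] is not None:
            return scores[visit], visit
    raise ValueError("participant has no observed scores")


# Hypothetical symptom scores at baseline and weeks 1-3 (lower = better).
trial = [
    [30, 24, 18, 12],      # completed the trial
    [32, 26, 20, 15],      # completed the trial
    [31, 29, None, None],  # dropped out after week 1
    [29, 27, None, None],  # dropped out after week 1
]

# LOCF endpoint for every randomized participant.
endpoints = [locf_endpoint(p) for p in trial]
locf_mean = sum(value for value, _ in endpoints) / len(endpoints)

# Endpoint mean using only participants observed at week 3.
completers = [p[-1] for p in trial if p[-1] is not None]
completer_mean = sum(completers) / len(completers)

for value, visit in endpoints:
    print(f"endpoint score {value} was actually recorded at visit {visit}")
print(f"LOCF mean: {locf_mean}, completer-only mean: {completer_mean}")
```

In this toy example, half of the "week 3" values are really week 1 scores, so the LOCF mean freezes the dropouts at an early, near-baseline state. Neither summary is correct in general; the point is only that the result depends heavily on an assumption the data cannot verify.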
Without sufficient strength of evidence, systematic reviewers cannot draw conclusions about the efficacy or effectiveness of interventions. Given that systematic reviews are used to inform and improve clinical practice guidelines, the inconclusive findings have broader implications as they hamper efforts to develop evidence-based clinical practice guidelines.
ADDRESSING DIAGNOSTIC ACCURACY IN RESEARCH DESIGN AND ANALYSIS
The accuracy of BD diagnosis depends on the psychometric quality of interviews and screening tools used, training of individuals performing diagnostic assessments, and procedures assuring good interrater reliability across interviewers. Reliability assessments should also assure that diagnostic accuracy is not affected by patient factors such as gender, race, ethnicity, or culture. Additional information and rigor in diagnostic assessment would generate greater confidence about whom the study participants represent and, therefore, to which populations the study results apply. Improved and more standardized information on the BD assessment processes would also ensure that evidence synthesis is combining results for similar patient populations across studies.
In addition to improved reporting on diagnostic accuracy, future studies should investigate treatment effects across all BD diagnoses, and attend to BD subpopulation analyses. The lack of evidence (or of separate reporting of studies’ results) for specific BD subpopulations, such as individuals with BD-II or adults over age 60, stems directly from prevailing inclusion and exclusion criteria. For example, most BD treatment studies focus on individuals with BD-I diagnoses. This practice is understandable for mania, but less so for treatment targeting maintenance or depression.
ADDRESSING HIGH ATTRITION IN RESEARCH DESIGN AND ANALYSIS
Studying and treating patients with BD is difficult; however, a few steps could address the unavoidable issue of attrition in BD treatment research. Future studies of BD treatments should consider innovative ways to increase study completion rates. Possibilities include using technology for follow-up assessments and study reminders (e.g., “smart” bottles, mobile apps), multiple secondary contacts for participants, all-inclusive contact information (e.g., including social media contact information), and flexible scheduling (e.g., staff/treatment availability at last-minute notice).
Such innovations notwithstanding, we understand that even the most well-designed research study may experience high attrition. No analysis can ever fully make up for study data deficits. Still, given the limitations of LOCF, investigators should turn to better methods for addressing high attrition.3,5 First, investigators could consistently examine and report the clinical and demographic characteristics that differentiate participants who withdraw from a trial from those who complete it, and incorporate these findings into caveats about potential conclusions on treatment effects. Increased awareness of the clinical and demographic predictors of withdrawal should lead to new studies that better address treatment for specific subsets of BD patients. Second, investigators could systematically assess and report reasons for withdrawal of consent beyond side effects and lack of efficacy. Often, reasons for withdrawal of consent are vague or not provided at all. Third, studies with high attrition could conduct sensitivity analyses to determine how different assumptions about missing data affect the effect size and corresponding confidence intervals. This would be an important analytical step prior to drawing conclusions based on the existing data. Finally, in cases where attrition appears to have been random, certain statistical techniques are more adept at modeling missing data without unduly influencing the results, such as the average score/observation method or multilevel linear mixed modeling.5
Longitudinal data analysis techniques for intermittent follow-up will also help by allowing the pooling of individual patient-level data across studies. Such techniques would require greater funding to create data repositories and to support BD research with longer follow-ups.
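The sensitivity analyses suggested in this section can take many forms. As one possible illustration, the Python sketch below uses a simple delta-adjustment (sometimes called a tipping-point) approach: it asks how much worse than completers the drug-arm dropouts would need to have done before the apparent treatment benefit disappears. All numbers, the function names `adjusted_arm_mean` and `sensitivity_table`, and the choice to shift only the drug arm are assumptions made for illustration. The sketch reports point estimates only; a full analysis would also need valid confidence intervals, for example through multiple imputation.

```python
from statistics import mean


def adjusted_arm_mean(completer_changes, n_dropouts, delta):
    """Arm mean when each dropout is assumed to change by (completer mean + delta).

    `completer_changes` are observed changes from baseline (negative = improvement).
    delta = 0 assumes dropouts would have done as well as completers;
    delta > 0 assumes they would have improved less.
    """
    observed_mean = mean(completer_changes)
    total = sum(completer_changes) + n_dropouts * (observed_mean + delta)
    return total / (len(completer_changes) + n_dropouts)


def sensitivity_table(drug, n_drop_drug, placebo, n_drop_placebo, deltas):
    """Return (delta, drug-minus-placebo difference) for each assumed delta.

    Only drug-arm dropouts are shifted; placebo dropouts are assumed to
    resemble placebo completers. This is one scenario among many.
    """
    placebo_mean = adjusted_arm_mean(placebo, n_drop_placebo, 0.0)
    return [
        (delta, adjusted_arm_mean(drug, n_drop_drug, delta) - placebo_mean)
        for delta in deltas
    ]


# Hypothetical changes in symptom score for completers; 4 dropouts per arm (40%).
drug_completers = [-14, -12, -16, -10, -13, -13]
placebo_completers = [-8, -9, -7, -10, -6, -8]

table = sensitivity_table(
    drug_completers, 4, placebo_completers, 4, deltas=[0.0, 5.0, 12.5]
)
for delta, difference in table:
    print(f"delta = {delta:5.1f} -> drug minus placebo = {difference:6.2f}")
```

Reporting the whole table, rather than a single imputed estimate, lets readers judge whether the assumption needed to overturn the finding is clinically plausible.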
GOING FORWARD
More work is needed to address problems in research on patients with BD. Evidence-based medicine relies on three realms—evidence, clinical experience, and patient experience. Current literature on BD treatment provides insufficient evidence, meaning that decisions can only be informed by the latter two realms. This is an unsatisfying position for both clinicians and patients. We believe that addressing high attrition and diagnostic accuracy in studies on BD treatment will greatly improve the quality of the literature available on drug and non-drug treatments for all types and states of BD. With higher quality evidence, clinicians and patients will be better able to make informed clinical decisions about BD treatment.
Funding/Support:
This work is based on research conducted by the Minnesota Evidence-based Practice Center (EPC) under contract to the Agency for Healthcare Research and Quality (AHRQ), Rockville, MD (Contract No. 290-2012-00016-I). Dr. Urosevic’s work on this manuscript was partially supported by National Institute of Mental Health’s Career Development Award (K01 093621) and resources at the Minneapolis VA Health Care System.
Role of Funder:
The findings and conclusions in this document are those of the authors, who are responsible for its contents; the findings and conclusions do not necessarily represent the views of AHRQ or US Department of Veterans Affairs. No statement in this report should be construed as an official position of AHRQ or the U.S. Department of Health and Human Services.
Footnotes
Conflict of Interest Disclosures: None.
Contributor Information
Priyanka J Desai, Minnesota Evidence-based Practice Center, University of Minnesota School of Public Health, Minneapolis, MN.
Snežana Urošević, Minneapolis VA Health Care System, Minneapolis, Minnesota, USA; Department of Psychiatry and Behavioral Sciences, University of Minnesota, Minneapolis, MN.
Mary Butler, Minnesota Evidence-based Practice Center, University of Minnesota School of Public Health, Minneapolis, MN.
REFERENCES
1. Carvalho AF, Firth J, Vieta E. Bipolar Disorder. N Engl J Med. 2020;383(1):58–66.
2. Butler M, Urosevic S, Desai P, et al. Treatment for Bipolar Disorder in Adults: A Systematic Review. Comparative Effectiveness Review No. 208. AHRQ Publication No. 18-EHC012-EF. Rockville, MD: Agency for Healthcare Research and Quality; 2018.
3. Hamer RM, Simpson PM. Last observation carried forward versus mixed models in the analysis of psychiatric clinical trials. Am J Psychiatry. 2009;166(6):639–641.
4. Leon AC, Mallinckrodt CH, Chuang-Stein C, Archibald DG, Archer GE, Chartier K. Attrition in Randomized Controlled Clinical Trials: Methodological Issues in Psychopharmacology. Biol Psychiatry. 2006;59(11):1001–1005.
5. Little RJ, D’Agostino R, Cohen ML, et al. The Prevention and Treatment of Missing Data in Clinical Trials. N Engl J Med. 2012;367(14):1355–1360.
