Author manuscript; available in PMC: 2021 Dec 1.
Published in final edited form as: J Affect Disord. 2020 Sep 2;277:908–913. doi: 10.1016/j.jad.2020.08.067

Validity of Retrospectively-Reported Depressive Episodes

Samantha L Birk 1, Thomas M Olino 1, Daniel N Klein 2, John R Seeley 3
PMCID: PMC7575822  NIHMSID: NIHMS1626961  PMID: 33065833

Abstract

Background

Depression and other psychopathology are often assessed retrospectively. Few studies have evaluated the validity of these reports by comparing prospectively-assessed symptoms to retrospective reports during the same time period.

Methods

This study utilized a subset of participants (n = 68) from the Oregon Adolescent Depression Project who completed at least one mailer assessment of depressive symptoms during a retrospectively-reported depressive episode. Participants completed up to seven mailer assessments of depression and suicidal ideation and diagnostic assessments that included retrospectively-reported depressive episodes that coincided with the mailer assessments.

Results

Multilevel linear models examined differences in depressive symptoms and suicidal ideation during and between retrospectively-reported depressive episodes. Results showed that individuals reported significantly higher levels of depression and suicidal ideation for retrospectively-reported depressive episodes compared to when they were not in depressive episodes. In addition, the average level of depressive symptoms endorsed during retrospectively-reported depressive episodes reached established clinical cut-offs.

Limitations

Although we were able to determine whether symptoms during retrospectively-reported depressive episodes approached clinical cut-offs, we were unable to examine whether symptoms met criteria for depressive episodes. Additionally, we could not examine whether episode severity related to recall ability, and other forms of psychopathology were not assessed.

Conclusion

These findings provide critical evidence for the validity of retrospectively-reported depressive episodes. Future research should examine whether these findings generalize across varying recall periods and retrospective assessments for other psychopathology.

Keywords: Self-report, Validity, Depression

Introduction

Depression is one of the most common mental illnesses and impacted 17.3 million adults in the United States in 2017 (National Survey on Drug Use and Health, 2017). It is associated with several negative outcomes, including increased risk for suicidal ideation and behavior (Hawton, Comabella, Haw, & Saunders, 2013). Extensive research has been conducted identifying risk factors for the development (e.g., Hartman, Nelson, Ratheesh, Treen, & McGorry, 2019; Hölzel, Härter, Reese, & Kriston, 2011; Lewinsohn et al., 1994; Liu et al., 2019), maintenance (e.g., van der Velden et al., 2015; Visted, Vøllestad, Nielsen, & Schanche, 2018), epidemiology (e.g., Bonde, 2008; Flemming & Offord, 1990), and genetics of depression (e.g., Middeldorp, Cath, Van Dyck, & Boomsma, 2005). Many of these studies assess current and past depressive episodes, with the latter being reported retrospectively. Given that retrospective reports of depressive episodes are commonly used in research on depression, it is crucial to assess the validity of those reports by examining whether they coincide with higher levels of symptoms, and associated factors, such as suicidal ideation, in concurrent reports obtained during the recalled episodes as part of prospective studies. If retrospective reports of depressive episodes do not coincide with higher levels of symptoms and suicidal ideation during the purported episodes, this would have critical implications for the literatures relying on retrospective reports.

Previous research has compared prospective reports of diagnoses from longitudinal studies to retrospective reports from cross-sectional surveys, and the results raise questions about the accuracy of retrospective reports. Moffitt and colleagues (2010) compared lifetime prevalence rates from a prospective study with assessments at ages 18, 21, 26, and 32 with those from three other epidemiological studies using retrospective interviews of individuals up to age 32. Results showed that lifetime prevalence rates of depressive, anxiety, and substance use disorders from Moffitt et al.’s (2010) prospective assessments were twice as high as prevalence rates from the three retrospective studies for all disorders (e.g., 41% for prospective depression vs. 17–19% for retrospective reports). Additionally, Olino and colleagues (2012) compared lifetime prevalence rates of psychopathology in a community sample with four prospective assessments across 15 years to their siblings who completed a single retrospective evaluation. Results showed that prospective assessments resulted in higher prevalence rates of major depressive disorder than the single retrospective assessment (55% vs. 28%). However, this was not found for rates of dysthymic, anxiety, bipolar, and substance use disorders. Finally, Takayanagi and colleagues (2014) found that prospective rates, collected through four waves of interviews over 24 years, were substantially higher than prevalence rates from one retrospective evaluation (e.g., 13% vs. 5% for major depressive disorder). Importantly, this was the only study that made comparisons within the same participants.

Previous research also has examined the consistency of diagnoses between interviews over varying time intervals, ranging from hours to years, by examining diagnostic concordance in overlapping, but not identical, time periods. These findings have been mixed, with associations ranging from poor to good (κs = .34-.80; e.g., Bromet, Dunn, Connell, Dew, & Schulberg, 1986; Kendler, Neale, Kessler, Heath, & Eaves, 1993; Prusoff, Merikangas, & Weissman, 1988). For example, Bromet and colleagues (1986) examined the test-retest reliability of lifetime reports of major depression in a community sample using the Schedule for Affective Disorders and Schizophrenia-Lifetime Version (SADS-L; Endicott & Spitzer, 1978), which was administered twice across an 18-month interval, with the second interview assessing the period before and after the first interview. Results showed that the reliability of a lifetime major depression diagnosis in the overlapping time periods was poor (κ = .41). In contrast, in a family-genetic study of affective disorders and healthy controls, Prusoff and colleagues (1988) found that the reliability of lifetime major depression as assessed using the SADS-L over a three-to-five-year period was excellent (κ = .80), whereas reliability of other disorders ranged from good to poor (κs = .32-.66). These studies benefit from reliance on diagnostic interviews for their assessments. However, they have several shortcomings. First, individuals were required to report both current and past episodes during the interviews, and concurrently-reported symptoms were not compared to retrospectively-reported symptoms. It is plausible that reporting across multiple recall periods may result in conflation of recall of previous episodes with current episodes. In other words, symptoms recalled from previous episodes may be conflated with the current symptoms that are experienced, reducing the reliability of these reports. Second, these studies did not consider that the length of recall period may impact retrospective reports. Longer intervals may lead to poorer convergence.

These studies provide some insight into the reliability and validity of retrospective assessments of psychopathology, although findings have not been entirely consistent. Given the widespread reliance on retrospective reports of depression, alternative methods of testing their validity are needed. One novel means of evaluating this question is to compare concurrent self-reports of symptoms and associated constructs to retrospective reports of episodes of disorders during that same time period. Examining both depressive symptoms and other well-studied constructs, such as suicidal ideation, which has been robustly linked to depressive symptoms (e.g., Carlson & Cantwell, 1982; Harwood, Hawton, Hope, & Jacoby, 2001; Hawton et al., 2013), allows for the direct evaluation of the validity of retrospective reports and provides the unique opportunity to explore whether similar patterns emerge across constructs. Thus, depression symptom levels and frequency of suicidal ideation when individuals retrospectively report being in a depressive episode can be compared to levels when they retrospectively report not being in an episode to determine whether there are differences, and whether the symptoms that occur during retrospectively-reported depressive episodes approach established clinical cut-offs. This provides critical information regarding the validity of retrospectively-reported depressive episodes. This study examined how contemporaneously-reported depressive symptoms and levels of suicidal ideation, assessed annually, differed when participants were and were not in retrospectively-recalled depressive episodes. We hypothesized that contemporaneously-reported depressive symptoms and levels of suicidal ideation would be higher during retrospectively-reported episodes of depression compared to retrospectively-reported periods of not being in a depressive episode. We also examined whether the link between retrospectively-reported depressive episodes and contemporaneously-reported depressive symptoms and levels of suicidal ideation varied based on the length of the recall period. We hypothesized that shorter recall periods would be associated with stronger associations between retrospectively-reported depressive episodes and levels of contemporaneously-reported depressive symptoms and suicidal ideation.

Method

Participants

Participants are a subset of individuals from the Oregon Adolescent Depression Project (OADP; Lewinsohn, Rohde, & Seeley, 1996). A total of 1709 adolescents (ages 14–18; M age = 16.6, SD = 1.2) completed an initial assessment (T1) between 1987 and 1989, and 1507 (88%) returned for a second evaluation (T2) approximately one year later. At age 24, all adolescents with a history of MDD by T2 (n = 360) or a history of non-mood disorders (n = 284), and a random sample of individuals with no history of psychopathology by T2 (n = 457), were invited to participate in a third evaluation (T3). Of those participants, 941 (85%) completed the third assessment. At age 30, 816 (87%) individuals completed a fourth diagnostic interview (T4). Participants also completed questionnaires assessing depressive symptoms and suicidal ideation by mail on seven annual occasions. See Figure 1 for an illustration of the study timeline.

Figure 1.

Oregon Adolescent Depression Project study timeline.

Notes: (1) KSADS = version of the Schedule for Affective Disorders and Schizophrenia for School-Age Children that combined features of the Epidemiologic version (KSADS-E; Orvaschel, Puig-Antich, Chambers, Tabrizi, & Johnson, 1982) and included additional items to derive diagnoses of past and current psychiatric disorders in the revised DSM-III (American Psychiatric Association, 1987); (2) HRSD = Hamilton Rating Scale for Depression (Hamilton, 1960); (3) Battery of self-report measures included measures of stress, current depressive symptoms, other psychopathology (i.e., internalizing and externalizing), pessimism, attributions, self-consciousness, self-esteem, self-rated social competence, emotional reliance, future goals, coping skills, social support, interpersonal factors, physical health and illness, maturation, and other variables; (4) LIFE = Longitudinal Interval Follow-Up Evaluation (Shapiro & Keller, 1979); (5) CESD = Center for Epidemiologic Studies-Depression Scale (Radloff, 1977); (6) Perceived Social Support Questionnaire (PSS; Procidano & Heller, 1983).

*In the Oregon Adolescent Depression Project, annual mailer assessments spanned ages 18 to 32, thus starting before T3 and ending after T4. For the current study, however, only mailer assessments that occurred after T3 and before T4 were included.

Our analyses were restricted to the period between T3 and T4 so that they covered a single recall period. Of the 816 participants at T4 (i.e., the assessment following the completion of the mailer assessments), 68 individuals completed at least one mailer assessment during a retrospectively-reported depressive episode. For these participants, there were 110 mailers total completed when participants retrospectively-reported being in depressive episodes and 135 completed outside of depressive episodes (total mailers completed by these individuals = 245; average mailers completed = 1.96; SD = 1.38; range = 1–7). More specifically, 39 (of these 68 individuals) completed one mailer; 20 completed two mailers; six completed three mailers; two completed four mailers; and one completed five mailers during retrospectively-reported episodes. In addition, 11 (of these 68 individuals) completed one mailer; 20 completed two; 20 completed three; and six completed four mailers when not in retrospectively-reported episodes. As the focus of this study is on the within-person comparison of symptoms when in and not in retrospectively-reported depressive episodes, all available data from these 68 individuals were included in the analyses.

Of these 68 individuals, 60 identified as White, four as Hispanic, two as Asian, one as Indian, and one as Other. In addition, 53 participants identified as female and 15 as male. Thirty-four participants reported that none of their parents had a bachelor’s degree, 24 reported having at least one parent with a bachelor’s degree, and 10 did not respond. For demographic information on the full sample at T1, see Lewinsohn et al. (1994).

Measures

Mailer assessments – Contemporaneously-Reported Depressive Symptoms

Participants’ levels of depressive symptoms were assessed using the Center for Epidemiologic Studies-Depression Scale (CES-D; Radloff, 1977). The CES-D is a 20-item self-report questionnaire assessing the frequency of depressive symptoms in the past week. Responses are rated on a 4-point scale from 0 (rarely or none of the time [less than 1 day]) to 3 (most or all of the time [5–7 days]), with total scores of 21 and above considered as being in the clinical range (Henry et al., 2018). Out of the 110 mailers completed during retrospectively-reported depressive episodes, 54 mailers had total scores of 21 or greater on the CES-D (49.1%). Of the 153 mailers completed when participants did not report being in a depressive episode, 97 had total scores below 21 (63.4%). Internal consistency for the CES-D across assessments was excellent (mean α = .91; αs across mailer administrations ranged from .91-.93). Across all observations and participants, the mean level of depressive symptoms was 12.65 (SD = 10.12).
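The scoring rule described above can be sketched in a few lines. This is a minimal illustration, not the study's scoring code: the function names and the example respondent are hypothetical, and the sketch assumes each response has already been keyed 0–3 in the depressed direction (the standard CES-D reverse-keys its four positively worded items before summing, a step not described in this section).

```python
CESD_ITEMS = 20    # number of items on the CES-D
CESD_CUTOFF = 21   # clinical-range threshold cited in this study (Henry et al., 2018)


def cesd_total(responses):
    """Sum 20 item responses, each already keyed 0-3 in the depressed direction."""
    if len(responses) != CESD_ITEMS:
        raise ValueError("expected 20 item responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be 0, 1, 2, or 3")
    return sum(responses)


def in_clinical_range(total, cutoff=CESD_CUTOFF):
    """True when the total score is at or above the screening cut-off."""
    return total >= cutoff


# Hypothetical respondent (illustrative values, not study data).
example = [1, 2, 0, 1, 2, 1, 1, 0, 2, 1, 1, 2, 1, 0, 1, 2, 1, 1, 1, 1]
total = cesd_total(example)
flag = in_clinical_range(total)

# Share of in-episode mailers at or above the cut-off, using the counts reported above.
share_above = 100 * 54 / 110
```

With the reported counts, `share_above` rounds to the 49.1% given in the text.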

Mailer assessments – Contemporaneously-Reported Suicidal Ideation

Participants’ levels of suicidal ideation were assessed using a 4-item suicidal ideation screener that assessed the frequency of thoughts of death, killing oneself, and that one’s family would be better off without them, and the intent to kill oneself in the past week (Lewinsohn et al., 1996). Responses are rated on a 4-point scale, from 1 (rarely/none of the time during the past week) to 4 (most/all of the time during the past week). Internal consistency for the suicidal ideation screener across assessments was good (mean α = .75; αs across mailer administrations ranged from .68-.80). Across all observations and participants, the mean level of suicidal ideation was 4.47 (SD = 1.30).

Diagnostic Assessment – Retrospectively-Reported Depressive Episodes

The Structured Clinical Interview for DSM-IV (First, Spitzer, Gibbon, & Williams, 1996) was completed at T4. Participants reported on their depressive episodes throughout the study and provided onset and offset dates based on months for each episode. This information was utilized to determine whether mailer assessments were filled out during retrospectively-reported depressive episodes. Most interviewers had advanced degrees in a mental health field and several years of clinical experience, and interviewers were blind to previous diagnostic information. Interrater reliabilities indicated excellent agreement for MDD (κ = .81 at T4; see Olino, Klein, Lewinsohn, Rohde, & Seeley, 2008 for more information).

Proportion of Time Depressed

To control for possible influence of depression chronicity on reported depressive symptoms (i.e., the potential for greater chronicity of symptoms to account for the associations between retrospectively-reported episodes and contemporaneously-reported symptoms), we computed the proportion of mailers that were completed when participants retrospectively reported they were in a depressive episode. This was included as a covariate in the models.

Depression Status at Mailers

The dates of the mailer assessments and participants’ dates of birth were used to calculate each participant’s age in months at the time of completing each mailer. During diagnostic interviews, the onset and offset dates of depressive episodes were coded in months. This information was utilized to determine whether a contemporaneously-completed mailer was filled out during a retrospectively-reported depressive episode.
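The status coding just described, together with the proportion-of-mailers covariate defined earlier, might be sketched as follows. The function names, the example participant, and the episode ages are hypothetical, and two details are assumptions rather than reported procedures: ages are computed as whole completed months, and a mailer completed in the onset or offset month is counted as falling inside the episode.

```python
from datetime import date


def age_in_months(birth, when):
    """Whole months elapsed between date of birth and another date."""
    months = (when.year - birth.year) * 12 + (when.month - birth.month)
    if when.day < birth.day:
        months -= 1  # the current month has not yet been completed
    return months


def in_episode(mailer_age, episodes):
    """True if the mailer age falls in any (onset, offset) window, bounds inclusive."""
    return any(onset <= mailer_age <= offset for onset, offset in episodes)


def proportion_depressed(mailer_ages, episodes):
    """Proportion of a participant's mailers completed during a reported episode."""
    flags = [in_episode(age, episodes) for age in mailer_ages]
    return sum(flags) / len(flags)


# Hypothetical participant (illustrative values, not study data).
birth = date(1972, 3, 15)
mailer_dates = [date(1997, 4, 1), date(1998, 4, 20), date(1999, 5, 2)]
episodes = [(300, 306), (325, 330)]  # onset and offset ages in months

ages = [age_in_months(birth, d) for d in mailer_dates]
statuses = [in_episode(age, episodes) for age in ages]
prop = proportion_depressed(ages, episodes)
```

Because episode dates are coded in months while the CES-D covers the past week, any rule of this kind can only place a mailer within an episode to the nearest month.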

Attrition Analyses

Attrition analyses examined whether T1 and T4 diagnoses (major depression, dysthymia, substance abuse/dependence, or anxiety disorders) or participant sex were associated with the total number of mailer assessments completed in the entire T4 sample. There were significant differences in the number of mailer assessments completed between males (M = 5.79, SD = 1.43) and females (M = 6.31, SD = .99; F(1, 815) = 36.91). However, there were no significant differences in number of mailers completed by individuals with and without T1 or T4 diagnoses (all ps > .08; full data are presented in Supplementary Materials).

We also conducted attrition analyses to examine whether the subsample of individuals included in the current study differed from the entire T4 sample based on demographic characteristics (i.e., participant sex, race [coded as white vs. non-white], and education level) and psychiatric diagnoses (anxiety disorders or substance use disorders [SUDs]). Relative to those not included (NI), the subsample (SS) in this report was more likely to be female (NI = 57.7%; SS = 79.7%, χ²(1) = 13.34, p < .001), more likely to have completed a 4-year college degree (NI = 52.3%; SS = 67.6%, χ²(1) = 5.76, p < .05), and more likely to have a lifetime history of anxiety disorders (NI = 23.4%; SS = 47.1%, χ²(1) = 19.90, p < .0001) and SUDs (NI = 42.8%; SS = 67.6%, χ²(1) = 17.06, p < .001). There were no significant differences on race (NI = 90.8%; SS = 90.6%, χ²(1) = 0.16, p = .69).

Data Analysis

We examined differences in levels of depressive symptoms as a function of retrospectively-reported depressive episodes using multilevel linear models with the lme4 (Bates, Maechler, Bolker, & Walker, 2015) package in R (R Core Team, 2017). Models included all available data. Models predicted contemporaneously-reported depressive symptoms and levels of suicidal ideation with depression status, length of recall period (i.e., difference between age at T4 and age at mailer completion with the variable centered at T4; M = 3.08 years, SD = 1.53), and the interaction between depression status and length of recall period as level 1 predictors. We ran the models both with and without random slope effects for length of recall period. However, given that the time slope variance was not significant, the models presented in Table 1 do not include the random slope for length of recall. As the suicidal ideation measure does not have established clinically-informative cutoff scores, it was standardized so model parameters could be interpreted as effect sizes. As the CES-D has recommendations for clinical screening (Henry, Grant, & Cropsey, 2018; Santor, Zuroff, Ramsay, Cervantes, & Palacios, 1995), it was left in raw units. Models included the proportion of mailers completed while depressed and sex as level 2 predictors. We also explored the proportion of time depressed as a potential moderator of the relationship between being in a retrospectively-reported depressive episode versus not in an episode and contemporaneously-reported symptoms to see whether this effect varied based on the amount of time an individual was depressed.
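One way to write out the random-intercept model described above is given here. The notation is ours and is offered only as a sketch of the verbal description: Y is the outcome (CES-D total or standardized suicidal ideation) for mailer t of participant i, Episode is retrospectively-reported depression status, Recall is length of recall period, and PropDep is the proportion of mailers completed while depressed.

```latex
% Level 1 (mailer t within participant i)
Y_{ti} = \beta_{0i}
       + \beta_{1}\,\mathrm{Episode}_{ti}
       + \beta_{2}\,\mathrm{Recall}_{ti}
       + \beta_{3}\,(\mathrm{Episode}_{ti} \times \mathrm{Recall}_{ti})
       + e_{ti}

% Level 2 (participant i); random intercept only, no random slope for recall
\beta_{0i} = \gamma_{00}
           + \gamma_{01}\,\mathrm{PropDep}_{i}
           + \gamma_{02}\,\mathrm{Sex}_{i}
           + u_{0i},
\qquad u_{0i} \sim N(0, \tau_{00}), \quad e_{ti} \sim N(0, \sigma^{2})
```

Under this notation, the exploratory moderation analysis would correspond to adding a cross-level product term, Episode × PropDep, to the fixed effects.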

Table 1.

Demographic and clinical characteristics of the sample.

Characteristic                                          N       %
Sex
  Male                                                  15      22.06
  Female                                                53      77.94
Parental Education Level
  No parent with bachelor’s degree                      34      50.00
  At least one parent with bachelor’s degree            24      35.29
  No information                                        10      14.71
Race/Ethnicity
  White                                                 60      88.24
  Asian                                                 2       2.94
  Indian                                                1       1.47
  Other                                                 1       1.47
  Hispanic                                              4       5.88
Completed Mailers
  During Retrospectively-Reported Episodes              110     44.90
  Between Retrospectively-Reported Episodes             135     55.10
Total Number of Retrospectively-Reported Episodes       134     --
Lifetime History of Psychopathology (T1-T4)
  Major Depressive Disorder                             68      100.0
  Dysthymia                                             9       13.2
  Anxiety Disorder                                      32      47.1
  Substance Abuse/Dependence                            46      67.6

In order to quantify the magnitude of differences in symptoms when participants did and did not report being in depressive episodes, we also examined the cross-sectional differences in depressive symptoms and suicidal ideation at the final (T4) assessment. These fully cross-sectional analyses provide a benchmark for the strongest difference in depressive symptoms between individuals in and not in depressive episodes. All models were estimated using restricted maximum likelihood. Missing observations were handled using listwise deletion, which reduced observations from 245 to 244 for suicidal ideation.

Results

Depressive symptoms

Results for the models examining the relationship of contemporaneously-reported depressive symptoms with length of recall, retrospective-depressive episode status, and the interaction between length of recall and retrospective-depressive episode status, controlling for sex and proportion of mailers completed while depressed, showed that there was a significant positive relationship between contemporaneously-reported depressive symptoms and being in a retrospectively-reported depressive episode (unstandardized coefficients presented in Table 1, Model 1). This was a medium-to-large effect (d = .66). The average CES-D score of the 110 mailers filled out during retrospectively-reported depressive episodes was 22.14 (SD = 12.98), and the average of the 135 mailers filled out in between depressive episodes was 15.26 (SD = 10.55). Length of recall period was not significantly associated with contemporaneously-reported depressive symptoms. In addition, there were no significant interactions between length of recall period and retrospective-depressive episode status or between proportion of time depressed and retrospective-depressive episode status.

In order to provide more context for evaluating the magnitude of the effect of retrospective depressive episode status, we compared the T4 CES-D scores for individuals with a lifetime history of depression who were currently in a depressive episode (n = 36; M = 21.61, SD = 11.87) to those with a lifetime history of depression, but not in a depressive episode at T4 (n = 425; M = 12.06, SD = 10.11). The mean difference was 9.55 points (t(459) = 5.38, p<.001, d = .87). The difference in CES-D scores based on contemporaneous report and depressive status was almost twice as large as the difference between depressive symptoms when individuals retrospectively-reported being in a depressive episode compared to symptoms when individuals reported not being in an episode (9.55 vs. 5.14, respectively).

Suicidal Ideation

Results for the models examining the relationship of contemporaneously-reported suicidal ideation with length of recall period, retrospective-depressive episode status, and the interaction between these variables, controlling for sex and proportion of mailers completed while depressed, showed that there was a significant positive relationship between contemporaneously-reported suicidal ideation and being in a retrospectively-reported depressive episode (Table 1, Model 2). This was a small-to-medium effect (d = .33). The average standardized suicidal ideation score on the 109 mailers filled out during retrospectively-reported depressive episodes was 0.27 (SD = 1.25), and the average on the 135 mailers filled out in between depressive episodes was −0.22 (SD = .67). Length of recall period was not significantly associated with contemporaneously-reported suicidal ideation, and the interaction between length of recall period and retrospective-depressive episode status was not significant. However, the interaction between the proportion of time depressed and retrospective-depressive episode status was significant (b = 1.54, SE = .68, t = 2.28, p < .05). Post-hoc analyses showed that there were stronger differences between reports of suicidal ideation in and not in depressive episodes when participants were depressed during a higher proportion of assessments (b = 0.66, SE = .21, t = 3.18, p < .01) than when they were depressed during a lower proportion of assessments (b = 0.01, SE = .16, t = 0.06, p = .95).

In order to provide more context for interpreting the magnitude of the effect of retrospective depressive episode status, we compared the T4 suicidal ideation measure for individuals with a lifetime history of depression who were currently in a depressive episode (n = 36; M = 0.54, SD = 1.57) to those with a lifetime history of depression, but not in a depressive episode at T4 (n = 425; M = −0.05, SD = 0.92). The mean difference was 0.58 points (t(459) = 3.39, p<.01, d = 0.46). The difference based on contemporaneous report of suicidal ideation and depressive status was twice as large as the difference in suicidal ideation when individuals retrospectively-reported being in a depressive episode compared to when individuals reported not being in an episode (0.58 vs. 0.26, respectively).
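The two cross-sectional benchmark comparisons (CES-D and suicidal ideation at T4) can be approximately reproduced from the reported group sizes, means, and standard deviations. The sketch below is illustrative only; the function name is hypothetical. It assumes a pooled-variance Student's t and a Cohen's d standardized by the root mean square of the two group SDs. The text does not state which formula for d was used; this choice is an assumption that happens to match the reported values of .87 and .46.

```python
import math


def benchmark(n1, m1, sd1, n2, m2, sd2):
    """Return (mean difference, pooled-variance t, d) from summary statistics."""
    diff = m1 - m2
    # Pooled variance, weighting each group by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    t = diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Assumption: d standardized by the root mean square of the two SDs.
    d = diff / math.sqrt((sd1**2 + sd2**2) / 2)
    return diff, t, d


# Summary statistics as reported for the T4 comparisons.
cesd_diff, cesd_t, cesd_d = benchmark(36, 21.61, 11.87, 425, 12.06, 10.11)
si_diff, si_t, si_d = benchmark(36, 0.54, 1.57, 425, -0.05, 0.92)

# Ratio of the cross-sectional difference to the within-person model estimate.
cesd_ratio = 9.55 / 5.14
si_ratio = 0.58 / 0.26
```

Because the inputs are rounded, the recomputed t statistics differ slightly from the reported ones, and the suicidal ideation difference comes out at 0.59 rather than the reported 0.58.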

Discussion

Retrospective reports of depressive disorders assessed via diagnostic interviews are extensively used in studies of risk, course and outcome, epidemiology, and genetics (e.g., Bonde, 2008; Flemming & Offord, 1990; Hartman et al., 2019; Hölzel et al., 2011; Lewinsohn et al., 1994; Liu et al., 2019; Middeldorp et al., 2005; van der Velden et al., 2015; Visted et al., 2018). Some studies have evaluated the reliability and validity of retrospective reports, including examinations of prospective versus retrospective reports (Moffitt et al., 2010; Olino et al., 2012; Takayanagi et al., 2014) and test-retest reliability of retrospective reports (Bromet et al., 1986; Kendler et al., 1993; Prusoff et al., 1988). However, none of these studies have examined whether contemporaneously-reported depressive symptoms and suicidal ideation differed based on whether or not participants were in retrospectively-recalled depressive episodes and whether symptoms during retrospectively-reported episodes approached established clinical cut-offs. The current study capitalized on a unique opportunity to directly address this gap in the literature and provides incremental evidence for the validity of retrospectively-reported depressive episodes.

Results showed that higher levels of depressive symptoms were reported during retrospectively-reported depressive episodes compared to when not in episodes of depression. Specifically, individuals’ scores on the CES-D were five points higher when in a retrospectively reported depressive episode compared to when not in a depressive episode. This indicates that depressive symptoms track with depressive episodes, even when reported retrospectively. These results remained significant when controlling for both sex and proportion of mailers completed while depressed. However, this difference is only about one-half of the difference that existed between individuals who were depressed at T4 and individuals who were not depressed at that time. Thus, the difference between being in, and not in, episodes was more modest for retrospective than concurrent reports. Additionally, many individuals reported CES-D scores during retrospectively-reported depressive episodes that were lower than established cut-offs used to screen for depression (15 to 23 on the CES-D, depending on the sample, with the most recent recommendation being 21; Henry et al., 2018), suggesting that retrospective dating of episodes may not be very precise. However, when looking across individuals, the average score on the CES-D for mailers completed during retrospectively-reported depressive episodes was 22.14, and 54 of the 110 mailers completed during retrospectively-reported depressive episodes received total scores of 21 or greater. This indicates that on average, individuals were reporting concurrent levels of depressive symptoms that were above established clinical cut-offs for the CES-D during retrospectively-reported depressive episodes.

Conclusions are similar when examining contemporaneously-reported levels of suicidal ideation, such that individuals endorsed greater suicidal ideation during their retrospectively-reported depressive episodes, compared to when they retrospectively-reported not being in a depressive episode. These results remained significant when controlling for both sex and proportion of mailers completed while depressed. These findings suggest that concurrent symptoms track with retrospectively-reported depressive episodes and provide support for the validity of retrospective reports of symptoms.

The previous literature on the validity of retrospective reports of depression has produced mixed findings. Some studies have found that prospective assessments result in higher prevalence rates of depressive disorders than retrospective assessments (Moffitt et al., 2010; Olino et al., 2012; Takayanagi et al., 2014). However, other studies have reported that the consistency of lifetime depressive disorder diagnoses over varying time intervals ranges from poor to good (κs = .34-.80; e.g., Bromet, Dunn, Connell, Dew, & Schulberg, 1986; Kendler, Neale, Kessler, Heath, & Eaves, 1993; Prusoff, Merikangas, & Weissman, 1988). Notably, this is the first study to compare contemporaneously-reported symptoms to retrospectively-reported symptoms during the same time period, within the same participants. This design eliminated potential confounding memory processes across multiple recall periods and differences across participants which could explain the mixed results in the literature.

Our models also examined whether longer recall periods influenced the magnitude of symptom differences between being in, and not in, depressive episodes. However, interactions between time and retrospective-depressive episode status were not significant. This indicates that the relationship between retrospectively-reported depressive episodes and contemporaneously-reported symptoms does not vary as a function of time. Thus, the magnitude of differences was unaffected by length of the recall period, at least within the bounds of the period examined in this study.

Finally, we explored whether the proportion of time depressed impacted the relationship between being in, and not in, retrospectively-reported depressive episodes and contemporaneously-reported symptoms. Although the proportion of time depressed did not impact the relationship between depressive episode status and contemporaneously-reported depressive symptoms, the proportion of time depressed did influence the relationship between depressive episode status and contemporaneously-reported suicidal ideation. Specifically, there were stronger differences between reports of suicidal ideation in and not in depressive episodes when individuals were depressed during a higher proportion of assessments. These results are consistent with the literature indicating that the persistence of depression is associated with greater suicidal ideation during depressive episodes (e.g., Carlson & Cantwell, 1982; Harwood, Hawton, Hope, & Jacoby, 2001; Hawton, Comabella, Haw, & Saunders, 2013). Importantly, the current study utilized longitudinal data to compare concurrently-assessed symptoms to retrospective reports of symptoms during that same time period. This allowed for a direct evaluation of the validity of retrospectively-reported episodes. Findings, however, should be interpreted in light of several limitations.

First, although we were able to determine whether symptoms during retrospectively-reported depressive episodes approached the established clinical cut-offs for the CES-D, we were unable to examine whether contemporaneous symptom reports met criteria for depressive episodes. Second, we could not examine whether depressive episode severity relates to recall ability, and we could not precisely map the concurrent symptoms to exact dates of retrospectively-recalled depressive episodes, because the CES-D assesses symptoms in the past week whereas the onset and offset dates for episodes were coded in months.

Third, the mailers did not assess other forms of psychopathology. Thus, we could not conduct similar analyses for other common mental disorders, such as anxiety and substance use disorders. Fourth, relative to other studies of retrospective recall of depressive episodes, participants in the present study were potentially primed to recall episodes by virtue of completing self-report measures on multiple occasions. Thus, this priming may have upwardly biased the magnitude of the association.

Fifth, it is possible that a much longer recall period would reveal a greater decrement in recall accuracy. Sixth, based on the study design, we relied on a fairly fixed recall period of 6–7 years in early adulthood. Thus, we cannot generalize our findings beyond this specific context. Finally, the study design prevented us from conducting a similar comparison for experiences of suicidal ideation during periods of time when participants retrospectively endorsed versus did not endorse suicidal ideation as this was only systematically assessed in the depression module of the interviews. This limits our ability to generalize the results of these analyses beyond the experience of suicidal ideation during depressive episodes.

Using a novel approach to evaluate the validity of retrospective reports of depressive episodes, our findings show that individuals’ retrospectively-reported depressive episodes were marked by higher levels of contemporaneously-reported depressive symptoms and suicidal ideation compared to periods when they retrospectively reported not being in depressive episodes, regardless of length of recall period, sex, and proportion of mailers completed while depressed. As there is only limited research on this topic, further work is needed to extend these results.

Supplementary Material

1

Table 2.

Linear effects models of contemporaneously-reported depressive symptoms and suicidal ideation by length of recall and retrospectively-reported depressive episode status

                                                    Contemporaneously-Reported   Contemporaneously-Reported
Dependent variable:                                 Depressive Symptoms          Suicidal Ideation (z)
                                                    b (SE)                       b (SE)
Observation-Level Variables:
 Length of Recall Period                            0.19 (0.39)                  −0.03 (0.04)
 In Depressive Episode – Retrospectively-Reported   5.14*** (1.18)               0.26* (0.11)
Person-Level Variables:
 Proportion of Time Depressed                       10.58 (5.74)                 1.29** (0.41)
 Female                                             −2.83 (2.96)                 −0.27 (0.21)
Constant                                            16.01*** (3.64)              −0.41 (0.27)

Observation Level Residual Variance                 60.60*                       0.59*
Person Level Residual Variance                      79.20*                       0.31*

Note: * p < .05; ** p < .01; *** p < .001. Standard errors shown in parentheses. Interactions not displayed as no interactions were significant.
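To make the Table 2 estimates easier to read, the short Python sketch below combines the fixed-effect coefficients into a model-implied value and computes the share of residual variance that lies between persons. The coefficient and variance values are transcribed from Table 2. The function names and the predictor coding (episode status and sex as 0/1, proportion of time depressed on a 0 to 1 scale, recall length in the table's own unspecified units, no centering) are assumptions made for illustration and are not confirmed by the table itself.

```python
# Estimates transcribed from Table 2. Predictor coding is an ASSUMPTION:
# in_episode and female are 0/1, prop_depressed is 0..1, recall is in the
# (unspecified) units used by the authors, and no predictor is centered.
FIXED = {
    "depressive_symptoms": {
        "constant": 16.01, "recall": 0.19, "episode": 5.14,
        "prop_depressed": 10.58, "female": -2.83,
    },
    "suicidal_ideation_z": {
        "constant": -0.41, "recall": -0.03, "episode": 0.26,
        "prop_depressed": 1.29, "female": -0.27,
    },
}

VARIANCES = {
    "depressive_symptoms": {"observation": 60.60, "person": 79.20},
    "suicidal_ideation_z": {"observation": 0.59, "person": 0.31},
}


def fitted_value(outcome, recall, in_episode, prop_depressed, female):
    """Model-implied outcome from the main-effects coefficients only."""
    c = FIXED[outcome]
    return (
        c["constant"]
        + c["recall"] * recall
        + c["episode"] * in_episode
        + c["prop_depressed"] * prop_depressed
        + c["female"] * female
    )


def conditional_icc(outcome):
    """Person-level share of the residual variance, given the predictors."""
    v = VARIANCES[outcome]
    return v["person"] / (v["person"] + v["observation"])


# Episode contrast for otherwise identical (hypothetical) observations.
in_ep = fitted_value("depressive_symptoms", 0, 1, 0.0, 0)
out_ep = fitted_value("depressive_symptoms", 0, 0, 0.0, 0)
episode_difference = in_ep - out_ep  # equals the episode coefficient, 5.14

icc_symptoms = conditional_icc("depressive_symptoms")  # about 0.57
icc_ideation = conditional_icc("suicidal_ideation_z")  # about 0.34
```

Because the table reports main effects only, the episode contrast is the same at every value of the other predictors. The conditional intraclass correlations suggest that, after accounting for the predictors, a little over half of the remaining variance in depressive symptoms, and roughly a third for suicidal ideation, is attributable to stable differences between persons.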

Highlights.

  • Examined validity of retrospectively-reported depressive episodes

  • Greater depressive symptoms endorsed during retrospectively-reported episodes

  • Greater suicidal ideation endorsed during retrospectively-reported episodes

  • Average symptom levels during depressive episodes reached clinical cut-offs

Acknowledgements

Funding: This work was supported by the National Institute of Mental Health [Grant R01MH66023 awarded to Daniel N. Klein, Ph.D.; Grants R01MH40501, R01MH50522, R01MH52858, and R01DA012951 awarded to Peter M. Lewinsohn, Ph.D.; and Grant R01MH107495 awarded to Thomas M. Olino, Ph.D.].

Role of the funding source

The funding sources did not have a role in the study design, collection, analysis, and interpretation of data; writing of the report; or in the decision to submit the article for publication.

Footnotes

Declaration of interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.


1

In order to obtain interpretable effect sizes, we converted the t-statistic from the linear mixed models to Cohen’s d using: d = 2t/√df. This procedure was used for all models.

2

Additional supplementary analyses were conducted using multiple different cut-offs for the CES-D. Similar results were found across the range of cutoff points. Results of these analyses are available in the supplemental material.
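The effect-size conversion described in footnote 1, d = 2t/√df, can be sketched in a few lines of Python. The function name and the example inputs are hypothetical and chosen only for illustration; they are not values from the study, and the degrees of freedom used for each model are not reported in this section.

```python
import math


def t_to_cohens_d(t_stat: float, df: float) -> float:
    """Convert a t-statistic to Cohen's d using d = 2t / sqrt(df)."""
    if df <= 0:
        raise ValueError("df must be positive")
    return 2.0 * t_stat / math.sqrt(df)


# Hypothetical inputs for illustration only; not values from the study.
example_d = t_to_cohens_d(t_stat=3.0, df=100.0)  # 2 * 3.0 / 10.0 = 0.6
```

Note that the sign of d follows the sign of t, so the direction of the effect is preserved by the conversion.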

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  1. American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders, 4th ed. APA: Washington, DC.
  2. Bates D, Maechler M, Bolker B, & Walker S (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67, 1–48. doi: 10.18637/jss.v067.i01
  3. Bonde JPE (2008). Psychosocial factors at work and risk of depression: a systematic review of the epidemiological evidence. Occupational and Environmental Medicine, 65, 438–445. doi: 10.1136/oem.2007.038430
  4. Bromet EJ, Dunn LO, Connell MM, Dew MA, & Schulberg HC (1986). Long-term reliability of diagnosing lifetime major depression in a community sample. Archives of General Psychiatry, 43, 435–440. doi: 10.1001/archpsyc.1986.01800050033004
  5. Carlson GA & Cantwell DP (1982). Suicidal behavior and depression in children and adolescents. Journal of the American Academy of Child Psychiatry, 21, 361–368. doi: 10.1016/s0002-7138(09)60939-0
  6. Center for Behavioral Health Statistics and Quality. (2018). 2017 National Survey on Drug Use and Health: Detailed Tables. Substance Abuse and Mental Health Services Administration, Rockville, MD.
  7. Endicott J, & Spitzer RL (1978). A diagnostic interview: the schedule for affective disorders and schizophrenia. Archives of General Psychiatry, 35, 837–844.
  8. First MB, Spitzer RL, Gibbon M, & Williams JBW (1996). Structured clinical interview for DSM-IV Axis I disorders, research version, patient/non-patient edition. New York: Biometrics Research, New York State Psychiatric Institute.
  9. Fleming JE & Offord DR (1990). Epidemiology of childhood depressive disorders: A critical review. Journal of the American Academy of Child & Adolescent Psychiatry, 29, 571–580. doi: 10.1097/00004583-199007000-00010
  10. Hartmann JA, Nelson B, Ratheesh A, Treen D, & McGorry PD (2019). At-risk studies and clinical antecedents of psychosis, bipolar disorder and depression: a scoping review in the context of clinical staging. Psychological Medicine, 49, 177–189. doi: 10.1017/s0033291718001435
  11. Harwood D, Hawton K, Hope T, & Jacoby R (2001). Psychiatric disorder and personality factors associated with suicide in older people: A descriptive and case-control study. International Journal of Geriatric Psychiatry, 16, 155–165.
  12. Hawton K, I Comabella CC, Haw C, & Saunders K (2013). Risk factors for suicide in individuals with depression: a systematic review. Journal of Affective Disorders, 147, 17–28. doi: 10.1016/j.jad.2013.01.004
  13. Henry SK, Grant MM, & Cropsey KL (2018). Determining the optimal clinical cutoff on the CES-D for depression in a community corrections sample. Journal of Affective Disorders, 234, 270–275. doi: 10.1016/j.jad.2018.02.071
  14. Hölzel L, Härter M, Reese C, & Kriston L (2011). Risk factors for chronic depression—a systematic review. Journal of Affective Disorders, 129, 1–13. doi: 10.1016/j.jad.2010.03.025
  15. Kendler KS, Neale MC, Kessler RC, Heath AC, & Eaves LJ (1993). A longitudinal twin study of personality and major depression in women. Archives of General Psychiatry, 50, 853–862. doi: 10.1001/archpsyc.1993.01820230023002
  16. Lewinsohn PM, Roberts RE, Seeley JR, Rohde P, Gotlib IH, & Hops H (1994). Adolescent psychopathology: II. Psychosocial risk factors for depression. Journal of Abnormal Psychology, 103, 302. doi: 10.1037//0021-843x.103.2.302
  17. Lewinsohn PM, Rohde P, & Seeley JR (1996). Adolescent suicidal ideation and attempts: Prevalence, risk factors, and clinical implications. Clinical Psychology: Science and Practice, 3, 25–46. doi: 10.1111/j.1468-2850.1996.tb00056.x
  18. Liu Y, Zhang N, Bao G, Huang Y, Ji B, Wu Y, … & Li G (2019). Predictors of depressive symptoms in college students: A systematic review and meta-analysis of cohort studies. Journal of Affective Disorders, 244, 196–208. doi: 10.1016/j.jad.2018.10.084
  19. Moffitt TE, Caspi A, Taylor A, Kokaua J, Milne BJ, Polanczyk G, & Poulton R (2010). How common are common mental disorders? Evidence that lifetime prevalence rates are doubled by prospective versus retrospective ascertainment. Psychological Medicine, 6, 899–909. doi: 10.1017/s0033291709991036
  20. Middeldorp CM, Cath DC, Van Dyck R, & Boomsma DI (2005). The co-morbidity of anxiety and depression in the perspective of genetic epidemiology. A review of twin and family studies. Psychological Medicine, 35, 611–624. doi: 10.1017/s003329170400412x
  21. Olino TM, Klein DN, Lewinsohn PM, Rohde P, & Seeley JR (2008). Longitudinal associations between depressive and anxiety disorders: a comparison of two trait models. Psychological Medicine, 38, 353–363. doi: 10.1017/s0033291707001341
  22. Olino TM, Shankman SA, Klein DN, Seeley JR, Pettit JW, Farmer RF, & Lewinsohn PM (2012). Lifetime rates of psychopathology in single versus multiple diagnostic assessments: Comparison in a community sample of probands and siblings. Journal of Psychiatric Research, 46, 1217–1222. doi: 10.1016/j.jpsychires.2012.05.017
  23. Prusoff BA, Merikangas KR, & Weissman MM (1988). Lifetime prevalence and age of onset of psychiatric disorders: Recall 4 years later. Journal of Psychiatric Research, 22, 107–117. doi: 10.1016/0022-3956(88)90075-1
  24. Radloff LS (1977). The CES-D scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1, 385–401. doi: 10.1177/014662167700100306
  25. R Core Team (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
  26. Santor DA, Zuroff DC, Ramsay JO, Cervantes P, & Palacios J (1995). Examining scale discriminability in the BDI and CES-D as a function of depressive severity. Psychological Assessment, 7, 131. doi: 10.1037//1040-3590.7.2.131
  27. Substance Abuse and Mental Health Services Administration. (2018). Key substance use and mental health indicators in the United States: Results from the 2017 National Survey on Drug Use and Health (HHS Publication No. SMA 18–5068, NSDUH Series H-53). Rockville, MD: Center for Behavioral Health Statistics and Quality, Substance Abuse and Mental Health Services Administration. Retrieved from https://www.samhsa.gov/data.
  28. Takayanagi Y, Spira AP, Roth KB, Gallo JJ, Eaton WW, & Mojtabai R (2014). Accuracy of reports of lifetime mental and physical disorders: results from the Baltimore Epidemiological Catchment Area study. JAMA Psychiatry, 71, 273–280. doi: 10.1001/jamapsychiatry.2013.3579
  29. van der Velden AM, Kuyken W, Wattar U, Crane C, Pallesen KJ, Dahlgaard J, … & Piet J (2015). A systematic review of mechanisms of change in mindfulness-based cognitive therapy in the treatment of recurrent major depressive disorder. Clinical Psychology Review, 37, 26–39. doi: 10.1016/j.cpr.2015.02.001
  30. Verweij KH, Derks EM, Hendriks EJ, & Cahn W (2011). The influence of informant characteristics on the reliability of family history interviews. Twin Research and Human Genetics, 14, 217–220. doi: 10.1375/twin.14.3.217
  31. Visted EV, Vøllestad JJ, Nielsen MM, & Schanche EE (2018). Emotion regulation in current and remitted depression: A systematic review and meta-analysis. Frontiers in Psychology, 9, 756. doi: 10.3389/fpsyg.2018.00756
