Author manuscript; available in PMC: 2019 Nov 28.
Published in final edited form as: J Psychiatr Res. 2018 Feb 10;100:63–70. doi: 10.1016/j.jpsychires.2018.01.016

Genome-wide scan of depressive symptomatology in two representative cohorts in the United States and the United Kingdom

Krisztina Mekli a,*, Drystan F Phillips b,c, Thalida E Arpawong d, Bram Vanhoutte a, Gindo Tampubolon e, James Y Nazroo a, Jinkook Lee b,c, Carol A Prescott d,f, Adam Stevens g, Neil Pendleton h
PMCID: PMC6882010  NIHMSID: NIHMS1054251  PMID: 29486404

Abstract

Compared with diagnosed Major Depressive Disorder, depressive symptomatology in the general population has received less attention in genome-wide association scan (GWAS) studies.

Here we report a GWAS study on depressive symptomatology using a discovery-replication design and the following approaches: To improve the robustness of the phenotypic measure, we used longitudinal data and calculated mean scores for at least 3 observations for each individual. To maximize replicability, we used nearly identical genotyping platforms and identically constructed phenotypic measures in both the Discovery and Replication samples.

We report one genome-wide significant hit: rs58682566 in the EPG5 gene was associated (p = 3.25E-08) with the mean depression symptom score in the Discovery sample, without confirmation in the Replication sample. We also report 4 hits exceeding the genome-wide suggestive significance level with nominal replications. Rs11774887, rs4147527 and rs1379328, close to the SAMD12 gene, were associated with the mean depression symptom score (p-values in Discovery sample: 4.58E-06, 7.65E-06 and 7.66E-06; Replication sample: 0.049, 0.029 and 0.030, respectively). Rs13250896, located in an intergenic region, was associated with the mean score of the three somatic items of the depression symptom score (p = 3.31E-07 and 0.042 for the Discovery and Replication samples, respectively). These results were not supported by evidence in the literature.

We conclude that despite the strengths of our approach, using robust phenotypic measures and samples that maximize replicability potential, this study does not provide compelling evidence of a single genetic variant’s significant role in depressive symptomatology.

Keywords: Genetics, Depressive symptomatology, Genome-wide association, Ageing

1. Introduction

Major depression is the most common psychiatric disorder, with a lifetime prevalence of 16% in the community (Kessler et al., 2003), and is also moderately heritable. Studies have estimated the heritability of depression at between 16 and 37%, depending on the type of depression and the study design (Gatz et al., 1992; Sullivan et al., 2000). These heritability estimates are now being supplemented with association studies that set out to identify genes with a possible role in susceptibility to depression (Kohli et al., 2011; Wray et al., 2012; Ripke et al., 2013; Hek et al., 2013; CONVERGE Consortium, 2015; Hyde et al., 2016). Recent research suggests that the underlying biological mechanisms of depression may differ across the life course. In later life, plausible mechanisms include those that are characteristic of the ageing process, such as endocrine changes (for example decreasing testosterone levels) (Blazer and Hybels, 2005), vascular risk factors (hypertension, diabetes and atherosclerosis) and neurodegenerative diseases (Alzheimer disease and dementia) (Weisenbach & Kumar, 2014).

On the other hand, depressive symptoms and trait depression have received less attention as phenotypic measures in GWAS studies (Terracciano et al., 2010), although the literature suggests that diagnosable depression exists on a continuum with subthreshold depressive symptoms (Lewinsohn et al., 2000). Moreover, the presence of depressive symptoms is important in its own right, as they affect the individual’s well-being and are associated with cognitive (Wilson et al., 2002) and physical decline (Penninx et al., 1998), among other conditions.

Here we report the results of a genome-wide association scan (GWAS) study on depressive symptoms using a discovery-replication design with two independent community representative samples of older adults. The aim was to attempt replication with nearly identical genotyping platforms and carefully constructed identical phenotypic measures in two large independent samples. As a phenotypic measure, we used a self-report scale designed to measure the frequency of depressive symptomatology in the general population, the Center for Epidemiologic Studies Depression Scale (CESD; Radloff, 1977). The CESD scale assesses the level of psychological distress by marking the presence or absence of symptoms of depression. Although the CESD was not designed to measure the prevalence of depression, it performs well against other depression diagnostic inventories. Using 3 as the cut-off point for caseness on the short version of the CESD, its specificity and sensitivity were over 70% against the World Health Organisation’s Composite International Diagnostic Interview measure, which is commonly used as a substitute for a clinician’s diagnosis (HRS Health Working Group, 2000). However, the CESD questions do not match the Diagnostic and Statistical Manual (DSM) criteria for depressive disorder in several respects. For example, they do not address duration and intensity, which are important components of a diagnosis of disorder. They also ignore the possibility of other psychological disorders with symptoms similar to those of depression, such as anxiety disorders (HRS Health Working Group, 2000). We used the shorter, eight-item version of the CESD scale as a continuum from low to high psychological distress, rather than dichotomising the scores into depressed and not depressed.

We used longitudinal data to construct phenotypic measures that did not represent a single time point but instead represented a more stable pattern of symptomatology over the observed period. This approach allowed us to reduce variability due to both temporary effects and random error and, critically, to obtain a more robust phenotypic measure that reflects a trait rather than a state of elevated or reduced symptomatology. It minimizes a problem of prior GWAS studies that used single assessments, in which the phenotype reflects transitory depression and fluctuations likely reflect environmental influences. A more stable phenotype drawn from repeated measures is therefore preferable and more robust for genetic association studies.

We also conducted confirmatory factor analysis to ensure measurement equivalence in the two samples. We used relatively large sample sizes, with over 10,000 individuals in the Discovery sample and over 5000 individuals in the Replication sample. The genotyping platform was nearly identical between the two samples, covering 2.5 million single nucleotide polymorphisms (SNPs). Because we used population representative cohorts of older adults, we anticipated that the genes associated with depressive symptomatology would be relevant to emotional health in later life.

2. Materials and methods

2.1. Sample

We used the discovery and replication sample approach. The Discovery sample was drawn from the Health and Retirement Study (HRS), a nationally representative sample of households of older Americans in the United States (http://hrsonline.isr.umich.edu/). We included all participants who were interviewed in wave 2 (1994) through 10 (2010) and were at least 50 years old at any wave of data collection. All participants were provided with written consent forms, and ethical approval was granted by the University of Michigan Institutional Review Board.

The Replication sample was the English Longitudinal Study of Ageing in the United Kingdom (ELSA, http://www.elsa-project.ac.uk/), a nationally representative cohort of individuals living in England aged 50 and older (Steptoe et al., 2013). In this study, we included individuals who were interviewed in wave 1 (2002) through 5 (2010) and were 50 years old or older at first observation. All participants provided signed consent and ethical approval was granted by the London Multi Centre Research Ethics Committee.

The investigation was carried out in accordance with the latest version of the Declaration of Helsinki. Both samples contain only individuals with at least 3 observations for any phenotypic measure, even if the observed waves were not consecutive.

2.2. Genetic data

The genetic data were obtained using Illumina’s Human Omni2.5-Quad BeadChip methodology (Illumina, San Diego, CA, USA). Genotyping was performed by the NIH Center for Inherited Disease Research (Johns Hopkins University, Baltimore, MD, USA) using the HumanOmni2.5-4v1 platform for the Discovery sample. University College London Genomics (London, UK) performed the genotyping on the HumanOmni2.5-8v1 platform for the Replication sample. Quality control on the genetic data was performed by the data holder.

In brief, of the 2,443,179 SNPs in the Discovery sample, 241,808 were excluded based on a Hardy-Weinberg equilibrium p-value < 0.0001 in European samples, missingness ≥ 2%, minor allele frequency = 0%, or other QC discrepancies (Health and Retirement Study, Quality Control Report for Genotyping Data, 2012. http://hrsonline.isr.umich.edu/sitedocs/genetics/HRS_QC_REPORT_MAR2012.pdf). During the analysis, SNPs with minor allele frequency below 5% were also excluded.
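The filters described above can be illustrated with a minimal sketch. The `SnpQc` container, the function names and the example SNP records are hypothetical and not part of the HRS pipeline; only the three thresholds (Hardy-Weinberg p < 0.0001, missingness ≥ 2%, minor allele frequency < 5%) are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class SnpQc:
    """Per-SNP quality-control summary (hypothetical container)."""
    snp_id: str
    hwe_p: float        # Hardy-Weinberg equilibrium p-value
    missingness: float  # fraction of missing genotype calls (0-1)
    maf: float          # minor allele frequency (0-0.5)


def passes_qc(snp, hwe_min=1e-4, max_missing=0.02, min_maf=0.05):
    """Return True if the SNP survives all three filters."""
    if snp.hwe_p < hwe_min:
        return False
    if snp.missingness >= max_missing:
        return False
    if snp.maf < min_maf:  # also removes monomorphic SNPs (MAF = 0)
        return False
    return True


def filter_snps(snps, **thresholds):
    """Keep only the SNPs that pass quality control."""
    return [s for s in snps if passes_qc(s, **thresholds)]
```

For the Replication sample, the same sketch could be called with `max_missing=0.05`, under the assumption that only the missingness threshold differed.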

For the Replication sample, similar quality filters were applied by the University College London Genetics Institute, apart from SNP missingness, for which the exclusion threshold was > 5%. As a result of quality control, the number of SNPs analysed was 1,223,220 and 1,490,612 for the Discovery and Replication samples, respectively.

In the Discovery sample, participants were excluded based on relatedness (n = 88) and chromosome anomalies (n = 79), yielding 12,341 individuals with genotype data. In the Replication sample 41 individuals were excluded for non-white ethnicity and sex discrepancies.

To minimize population stratification and maximize genetic similarity between the samples, we only analysed self-reported white individuals. As a result, the number of individuals in the analysis was 10,123 (including 773 Hispanic whites) in the Discovery sample and 5251 in the Replication sample.

2.3. Phenotypic measures: 8-item, 6-item measures and separate mood and somatic measures

Eight-item measure.

We used a shortened, eight-item version scale of the CESD. The 8 items were selected for their psychometric properties in assessing the continuum of depressive symptoms compared to the full scale (Kohout et al., 1993), and demonstrate very similar construct and external validity to the original inventory (HRS Health Working Group, 2000; Turvey et al., 1999). These 8 questions ask whether the following was true for the respondent much of the time during the past week, that (s)he:

  • felt depressed,

  • felt like everything (s)he did was an effort,

  • his/her sleep was restless,

  • was happy,

  • felt lonely,

  • enjoyed life,

  • felt sad,

  • could not get going.

We counted the number of symptoms for each wave (0–8). The maximum number of available waves was 9 for the Discovery and 5 for the Replication sample. Our main phenotypic variable is the mean of the individual scores of each wave, a continuous variable on a scale of 0–8.
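The construction of the main phenotype can be sketched as follows. This is an illustration only: the item keys and function names are hypothetical, and the sketch assumes that the two positively worded items ("was happy", "enjoyed life") count as a symptom when the respondent answers "no".

```python
# Hypothetical item keys for the eight CESD questions listed above.
ITEMS = [
    "depressed", "effort", "restless_sleep", "happy",
    "lonely", "enjoyed_life", "sad", "not_get_going",
]
# Assumption: positively worded items are reverse-scored.
POSITIVE = {"happy", "enjoyed_life"}


def wave_score(responses):
    """Count symptoms (0-8) in one wave.

    responses maps each item key to True ("yes") or False ("no").
    """
    score = 0
    for item in ITEMS:
        answer = responses[item]
        symptom = (not answer) if item in POSITIVE else answer
        score += int(symptom)
    return score


def mean_cesd(wave_scores, min_obs=3):
    """Mean score across observed waves; None marks a missing wave.

    Returns None when fewer than min_obs waves were observed.
    Waves need not be consecutive.
    """
    observed = [s for s in wave_scores if s is not None]
    if len(observed) < min_obs:
        return None
    return sum(observed) / len(observed)
```

An individual observed in three non-consecutive waves with scores 2, 0 and 1 would receive a mean score of 1.0, whereas an individual with only two observations would be excluded.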

Six-item measure.

We investigated the measurement invariance across the two samples using multigroup confirmatory factor analysis (Cheung and Rensvold, 1999, 2002). This enabled us to ensure that the factors are measuring the same underlying latent construct within each sample. This analysis suggested that the ‘respondent’s sleep was restless’ and ‘respondent did not enjoy life’ items differed significantly between the Discovery and Replication samples; therefore we also developed a mean CESD score that excluded these two items.

Separate mean mood and mean somatic measures.

Based on earlier confirmatory factor analysis results (Vanhoutte, 2014), we divided the CESD scores into mood items (respondent felt depressed, felt lonely, felt sad, was unhappy and did not enjoy life) and somatic items (respondent felt everything was an effort, his/her sleep was restless and could not get going) and calculated the mean scores separately, thus developing two additional phenotypic measures.

2.4. Statistical analysis

Phenotypic measures were developed using Stata12 software (Stata Corporation, http://www.stata.com/).

We performed association analysis against the four phenotypic measures using linear regression, with sex included as a covariate to correct for observed gender differences in depression (Karim et al., 2015). For the meta-analysis we chose the 64,399 SNPs with p ≤ 0.05 for the mean total CESD score in the Discovery sample. These analyses were performed using Plink software (Chang et al., 2015; Purcell et al., 2007).

To avoid spurious association results arising from unadjusted population substructure, we used the first six Eigenvectors as covariates in the analyses. A scree plot generated during the quality control of the genetic data (Health and Retirement Study, Quality Control Report for Genotyping Data, 2012. http://hrsonline.isr.umich.edu/sitedocs/genetics/HRS_QC_REPORT_MAR2012.pdf) shows that the fraction of variance accounted for is less than 4% and levels off approximately after the first two vectors, indicating that the use of the first six components is sufficient. The Replication sample contained only white individuals, and the literature indicates only modest population stratification in the British population (The Wellcome Trust Case Control Consortium, 2007); therefore only sex was used as a covariate in the initial analyses. However, we repeated all analyses in the Replication sample with the first 4 Eigenvectors also included as covariates.
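The per-SNP model described above (phenotype regressed on allele dosage, with sex and Eigenvectors as covariates) can be sketched in plain Python. This is not how Plink is implemented; it is a minimal ordinary least squares illustration with hypothetical function names, and it returns only the beta for the SNP, not its p-value.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def snp_beta(phenotype, dosage, covariates):
    """OLS of phenotype on intercept + allele dosage + covariates.

    dosage: minor-allele counts (0, 1 or 2) per individual.
    covariates: one list per individual, e.g. [sex, ev1, ..., ev6].
    Returns the regression coefficient (beta) of the allele dosage.
    """
    rows = [[1.0, float(g)] + [float(c) for c in cov]
            for g, cov in zip(dosage, covariates)]
    k = len(rows[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, phenotype))
           for i in range(k)]
    return solve(xtx, xty)[1]
```

In this additive coding, beta is the expected change in the mean CESD score per copy of the minor allele, holding the covariates fixed.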

Gene-based association analysis was performed using the VEGAS web platform (https://vegas2.qimrberghofer.edu.au/). We used the default settings, chose CEU for the Discovery sample and GBR for the Replication sample as subpopulations and defined gene boundaries as ± 10000 kb outside of the gene.

The criteria for nominal replication in the Replication sample were p < 0.05 and the same direction of the beta value as in the Discovery cohort.
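The two replication criteria can be written as a small predicate. The function name is hypothetical; the logic follows the definition above.

```python
def nominally_replicated(beta_discovery, beta_replication,
                         p_replication, alpha=0.05):
    """True if the replication p-value is below alpha and the effect
    has the same sign as in the Discovery sample."""
    same_direction = beta_discovery * beta_replication > 0
    return same_direction and p_replication < alpha
```

For example, a SNP with a positive beta in the Discovery sample and a negative beta in the Replication sample fails the criteria regardless of its replication p-value.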

3. Results

3.1. Demographic and phenotypic results

Table 1 shows that in both samples, there were more female participants than males. Individuals at first CESD observation were significantly younger in the Discovery sample, compared to the Replication sample. The mean CESD scores and the mean scores of the 3 somatic items were significantly different between the Discovery and the Replication samples (p < 0.001).

Table 1.

Demographic and phenotypic results. The significant differences between means are marked with asterisks.

| Measure | Discovery sample | Replication sample |
|---|---|---|
| Males, n (%) | 4210 (41.59) | 2397 (45.65) |
| Females, n (%) | 5913 (58.41) | 2854 (54.35) |
| Age at first CESD observation (years) | 58.59* (SE: 0.088) | 63.18* (SE: 0.134) |
| Mean total CESD score (8 item scale) | 1.287* (SE: 0.014) | 1.385* (SE: 0.021) |
| Mean mood CESD score (5 mood items only) | 0.622 (SE: 0.009) | 0.619 (SE: 0.013) |
| Mean somatic CESD score (3 somatic items only) | 0.665* (SE: 0.007) | 0.767* (SE: 0.010) |
| Mean reduced CESD score (6 item scale) | 0.92 (SE: 0.011) | 0.913 (SE: 0.017) |

3.2. Confirmatory factor analysis results

To investigate measurement invariance by using confirmatory factor analysis, we divided one cross sectional wave (wave 6 of the Discovery sample and wave 1 of the Replication sample, combined n = 23,975) of our sample into four groups by gender and country. Results indicated that two items, ‘enjoying life’ and ‘restless sleep’, introduce country and gender bias. When allowing the factor loading of enjoying life, and the intercept for restless sleep to vary across gender and country, partial measurement equivalence is established (RMSEA = 0.047, CFI = 0.97) (Cheung and Rensvold, 1999, 2002). (CFI = comparative fit index, RMSEA = root mean squared error of approximation).

3.3. Genetic association results

The top 5 hits for each of the four phenotypic measures are shown in Tables 2a, 2b, 2c and 2d.

Table 2a.

Phenotype: Mean CESD score (eight item scale).

| Chr | BP | SNP | Allele | Discovery Beta | Discovery P-value | Discovery MAF | Replication Beta | Replication P-value | Replication MAF | Nearest gene |
|---|---|---|---|---|---|---|---|---|---|---|
| 18 | 43531868 | rs58682566 | G | 0.2009 | 3.25E-08 | 0.082 | −0.07645 | 0.218 | 0.062 | EPG5, intronic |
| 18 | 43533220 | rs11082501 | A | 0.1946 | 6.02E-08 | 0.086 | −0.07795 | 0.206 | 0.063 | EPG5, intronic |
| 18 | 43522732 | rs12232683 | G | 0.1931 | 6.60E-08 | 0.087 | −0.07217 | 0.239 | 0.063 | EPG5, intronic |
| 18 | 43520214 | rs12604492 | A | 0.1929 | 7.32E-08 | 0.086 | −0.06953 | 0.257 | 0.063 | EPG5, intronic |
| 8 | 58844022 | rs13250896 | A | −0.1583 | 1.31E-07 | 0.120 | −0.08396 | 0.072 | 0.114 | intergenic |

Table 2b.

Phenotype: Mean CESD score (six item scale).

| Chr | BP | SNP | Allele | Discovery Beta | Discovery P-value | Discovery MAF | Replication Beta | Replication P-value | Replication MAF | Nearest gene |
|---|---|---|---|---|---|---|---|---|---|---|
| 18 | 43531868 | rs58682566 | G | 0.1526 | 7.79E-08 | 0.082 | −0.04222 | 0.384 | 0.062 | EPG5, intronic |
| 8 | 58844022 | rs13250896 | A | −0.1248 | 1.02E-07 | 0.120 | −0.05297 | 0.147 | 0.114 | intergenic |
| 18 | 43522732 | rs12232683 | G | 0.1471 | 1.40E-07 | 0.087 | −0.03745 | 0.435 | 0.063 | EPG5, intronic |
| 18 | 43533220 | rs11082501 | A | 0.1477 | 1.43E-07 | 0.086 | −0.04148 | 0.389 | 0.063 | EPG5, intronic |
| 8 | 58843535 | rs11774225 | C | −0.1186 | 1.73E-07 | 0.129 | −0.05074 | 0.152 | 0.122 | intergenic |

Table 2c.

Phenotype: Mean CESD score (five mood items).

| Chr | BP | SNP | Allele | Discovery Beta | Discovery P-value | Discovery MAF | Replication Beta | Replication P-value | Replication MAF | Nearest gene |
|---|---|---|---|---|---|---|---|---|---|---|
| 18 | 43531868 | rs58682566 | G | 0.1076 | 1.51E-06 | 0.082 | −0.02044 | 0.591 | 0.062 | EPG5, intronic |
| 4 | 38871385 | rs73236646 | G | 0.0723 | 1.74E-06 | 0.192 | 0.0007874 | 0.974 | 0.177 | FAM114A1, intronic |
| 18 | 43522732 | rs12232683 | G | 0.1046 | 1.99E-06 | 0.087 | −0.0172 | 0.647 | 0.063 | EPG5, intronic |
| 9 | 32744916 | rs13297009 | C* | −0.05643 | 2.62E-06 | 0.444 | −0.02274 | 0.213 | 0.455 | intergenic |
| 20 | 11138229 | rs6108847 | G | 0.05746 | 2.74E-06 | 0.383 | 0.008241 | 0.662 | 0.370 | intergenic |

* In ELSA the minor allele is G.

Table 2d.

Phenotype: Mean CESD score (three somatic items).

| Chr | BP | SNP | Allele | Discovery Beta | Discovery P-value | Discovery MAF | Replication Beta | Replication P-value | Replication MAF | Nearest gene |
|---|---|---|---|---|---|---|---|---|---|---|
| 18 | 43531868 | rs58682566 | G | 0.09336 | 1.01E-07 | 0.082 | −0.05601 | 0.069 | 0.062 | EPG5, intronic |
| 18 | 43533220 | rs11082501 | A | 0.09154 | 1.28E-07 | 0.086 | −0.0593 | 0.053 | 0.063 | EPG5, intronic |
| 18 | 43520214 | rs12604492 | A | 0.08971 | 2.10E-07 | 0.086 | −0.05302 | 0.083 | 0.063 | EPG5, intronic |
| 18 | 43522732 | rs12232683 | G | 0.08856 | 2.81E-07 | 0.087 | −0.05497 | 0.072 | 0.063 | EPG5, intronic |
| 8 | 58844022 | rs13250896 | A | −0.07387 | 3.31E-07 | 0.120 | −0.04725 | 0.042* | 0.114 | intergenic |

* Nominally replicated result.

We observed only one genome-wide significant association, between rs58682566 and the total mean CESD score, without confirmation in the Replication sample.

At the suggestive level (p < 1E-05), 48 SNPs were associated with the mean total CESD score, 34 with the reduced CESD score, 25 with the mood items and 29 with the somatic items in the Discovery sample, corresponding to 74 unique SNPs (results can be seen in the Appendix). Among these results, we found 4 nominal replications; they are presented in Table 3. However, three of them (rs11774887, rs4147527 and rs1379328) span a ~5500 bp region of chromosome 8, and thus likely represent the same signal (r2 > 0.98 for each SNP pair). The fourth SNP (rs13250896) is also located on chromosome 8 but resides 60 million bp away, and is therefore possibly an independent signal.
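The counts and distances in this paragraph can be checked with a few lines of arithmetic. The variable names are illustrative; the hit counts come from the paragraph above and the base-pair positions from Table 3.

```python
# Suggestive-level hits per phenotype in the Discovery sample.
suggestive_hits = {"total": 48, "reduced": 34, "mood": 25, "somatic": 29}
total_associations = sum(suggestive_hits.values())

# Chromosome 8 positions (bp) of the nominally replicated SNPs.
positions = {
    "rs11774887": 119171312,
    "rs4147527": 119165749,
    "rs1379328": 119165984,
    "rs13250896": 58844022,
}
cluster = [positions[s] for s in ("rs11774887", "rs4147527", "rs1379328")]

# Width of the region spanned by the three SAMD12-proximal SNPs.
cluster_span = max(cluster) - min(cluster)
# Distance from the fourth SNP to the nearest SNP of the cluster.
gap_to_fourth = min(cluster) - positions["rs13250896"]

print(total_associations)  # 136 associations across 74 unique SNPs
print(cluster_span)        # 5563 bp, i.e. roughly 5500 bp
print(gap_to_fourth)       # 60321727 bp, i.e. roughly 60 million bp
```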

Table 3.

Nominally replicated SNP results.

| Chr | BP | SNP | Allele | Discovery Beta | Discovery P-value | Discovery MAF | Replication Beta | Replication P-value | Replication MAF | Nearest gene | Phenotype |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 | 119171312 | rs11774887 | A | 0.1186 | 4.58E-06 | 0.174 | 0.7702 | 0.049 | 0.172 | SAMD12 | Mean CESD |
| 8 | 119165749 | rs4147527 | A | 0.1157 | 7.65E-06 | 0.174 | 0.08539 | 0.029 | 0.172 | SAMD12 | Mean CESD |
| 8 | 119165984 | rs1379328 | A | 0.1157 | 7.66E-06 | 0.174 | 0.08448 | 0.030 | 0.172 | SAMD12 | Mean CESD |
| 8 | 58844022 | rs13250896 | A | −0.07387 | 3.31E-07 | 0.120 | −0.04725 | 0.042 | 0.114 | intergenic | Mean CESD somatic |

At the suggestive level, we found good overlap of associated SNPs between the total and the reduced CESD scores (29 SNPs) and between the total score and either the mood or somatic items (26 SNPs). However, we only found 7 SNPs in common between the mood and somatic items analyses. We also observed 7 SNPs (rs12232683, rs12604492, rs58682566, rs73236646, rs1047115, rs11082501, rs13250896) that reached the suggestive level in every analysis in the Discovery sample (p < 1E-05), but only one of them (rs13250896) replicated at a nominal level in the Replication sample. All suggestive level results can be seen in the Appendix.

3.3.1. Meta-analysis results

In the meta-analysis, assuming a fixed effect, none of the SNPs reached genome-wide significance. The most significant association was between rs13250896 and the total mean CESD score (p = 6.14E-08, beta = −0.137). The same SNP yielded the most significant result in two other analyses, namely the reduced CESD score and somatic items analyses. This was the only SNP that reached suggestive level significance in all four analyses (results are presented in Table 4). Altogether we found 54 SNPs that reached suggestive genome-wide significance in any of the four analyses. These results are presented in the Appendix.
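A fixed-effect meta-analysis is commonly computed by inverse-variance weighting. The sketch below illustrates that general technique and is not a description of Plink's implementation. Because standard errors are not reported in the tables, the helper `se_from_beta_p` back-calculates them from beta and the two-sided p-value under a normal approximation; this is an assumption made for illustration, and both function names are hypothetical.

```python
import math
from statistics import NormalDist


def se_from_beta_p(beta, p):
    """Approximate the standard error from beta and a two-sided
    p-value, assuming a normally distributed test statistic."""
    z = abs(NormalDist().inv_cdf(p / 2.0))
    return abs(beta) / z


def fixed_effect_meta(estimates):
    """Inverse-variance weighted fixed-effect meta-analysis.

    estimates: list of (beta, standard_error) pairs, one per sample.
    Returns (pooled_beta, pooled_se, two_sided_p).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(beta / se) / math.sqrt(2.0))
    return beta, se, p


# rs13250896, mean total CESD score (values from Table 2a).
discovery = (-0.1583, se_from_beta_p(-0.1583, 1.31e-07))
replication = (-0.08396, se_from_beta_p(-0.08396, 0.072))
pooled_beta, pooled_se, pooled_p = fixed_effect_meta([discovery, replication])
```

The pooled estimate from this sketch can be compared with the values reported in Table 4; small differences are to be expected because the standard errors here are approximations.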

Table 4.

Fixed effects meta-analysis results for rs13250896.

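| Chr | BP | SNP | Allele | Mean CESD P-value | Mean CESD beta | Mean CESD 6 items P-value | Mean CESD 6 items beta | Mood items P-value | Mood items beta | Somatic items P-value | Somatic items beta |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 | 58844022 | rs13250896 | A | 6.14E-08 | −0.137 | 1.39E-07 | −0.104 | 5.49E-06 | −0.070 | 6.24E-08 | −0.067 |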

3.3.2. Gene based analysis

In the gene-based analysis, the genes most significantly associated with the total mean CESD score were EPG5 (p = 6.0E-06, represented by 42 SNPs) and TLR1 (p = 9.0E-06, represented by 8 SNPs). These p-values do not reach the gene-based genome-wide significance threshold of 2.5E-06 (= 0.05/20000), and were not significant in the Replication sample.

The genomic inflation factor values vary between 1.017 and 1.022 for the Discovery sample and between 0.996 and 1.008 for the Replication sample. These results did not indicate serious population stratification in the samples, and the literature suggests that polygenicity (many small genetic effects) accounts for the majority of inflation in test statistics in GWAS studies (Bulik-Sullivan et al., 2015). LD score regression intercepts range between 1.006 and 1.016 (SE 0.0064–0.0069) in the Discovery sample and between 1.002 and 1.010 (SE 0.0059–0.0068) in the Replication sample, also indicating adequate control for population stratification.
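The genomic inflation factor is conventionally defined as the median of the observed 1-df chi-square test statistics divided by the median expected under the null hypothesis (about 0.455). The sketch below follows that convention; the function name is hypothetical, and it assumes two-sided p-values strictly greater than 0 and at most 1.

```python
from statistics import NormalDist, median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Genomic inflation factor (lambda) from two-sided p-values.

    Each p-value is converted to a 1-df chi-square statistic by
    squaring the corresponding standard normal quantile.
    """
    nd = NormalDist()
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

Values close to 1, such as those reported above, indicate that the bulk of the test statistics follows the null distribution.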

Manhattan and QQ plots for each analysis are provided in the Appendix.

4. Discussion

In this study we conducted a genome-wide association study on depressive symptomatology, in two representative cohorts. The aim was to attempt replication with highly comparable genotype platforms and identical phenotypes in the two independent samples. To take advantage of the repeated measures and longitudinal design of the study cohorts, we used the mean CESD score across waves for individuals who had at least three observations on the full eight-item scale. With this approach, our phenotype did not represent only a single point in time, which would be a disadvantage because CESD scores fluctuate over time; rather the phenotype represents more of a stable trait, which is preferable for discovering underlying genetic associations. We also developed three further phenotypic measures: (a) a six-item phenotype to ensure measurement invariance between the samples and (b) separate mood and (c) somatic scores in order to differentiate between emotional and somatic domains of depressive symptomatology. We corrected for population substructure by using only white participants and the first six Eigenvectors.

In these analyses, there was one genome-wide significant association, but it did not replicate. Across the analyses of all four phenotypes, 74 unique SNPs yielded 136 associations below the genome-wide suggestive level (p < 1E-05). Of these associations, four were nominally replicated in the Replication sample, but they correspond to only 2 independent signals. The meta-analyses enhanced the significance of one of the nominally replicated SNPs, rs13250896, but it still did not reach the genome-wide significance level. We found no replication after performing gene-based association either. We also concluded from our results that the lack of replication is unlikely to be attributable to phenotypic measurement differences between the two samples. Contrary to our expectation, we could not provide evidence of a single genetic variant’s association with depressive symptomatology.

The SNP yielding a genome-wide significant association in the Discovery sample, rs58682566, is an intronic variant within the EPG5 (Ectopic P-Granules Autophagy Protein 5 Homolog (C. elegans)) gene on chromosome 18. This gene is the human homolog of the metazoan-specific autophagy gene Epg5, which encodes the key autophagy protein Epg5. Despite indications of the importance of this gene in neurological processes (Cullup et al., 2013), neither the gene nor rs58682566 has previously been associated with emotional health phenotypes in the literature.

All four SNPs that we considered nominally replicated are located on chromosome 8. Three of these (rs11774887, rs4147527, rs1379328) are close to the SAMD12 (Sterile Alpha Motif Domain Containing 12) gene. The fourth SNP, rs13250896, lies in an intergenic region of the genome.

The role of SAMD12 is not well characterized in the literature. One study found that transcription of the SAMD12 gene was down-regulated in response to Tat protein in human T cells (Johnson et al., 2013). Tat is a viral protein crucial for HIV replication and is present at reasonably high levels in the brain and cerebrospinal fluid. Tat was also found to upregulate proinflammatory cytokines (such as IL1B and IL8) and dysregulate IL-17 signaling (Johnson et al., 2013). This suggests that a SAMD12 gene product may take part in immunological processes, which is notable given the growing interest in immune-mediated theories of depression (Furtado and Katzman, 2015; van Dooren et al., 2016). It is worth mentioning that another gene, EXT1 (Exostosin Glycosyl-transferase 1), can be found in the region of the three SNPs. This gene’s product is involved in the chain elongation step of heparan sulphate biosynthesis. Interestingly, heparan sulphate proteoglycans are associated with the pathology of common neurodegenerative disorders, such as Alzheimer’s disease and Parkinson’s disease (Nadanaka and Kitagawa, 2008).

In addition, the literature contains some indications of past GWAS associations for 3 of our 4 nominally replicated SNPs, with the associated phenotypes covering a wide range. Rs4147527 and rs1379328 have shown associations with autism (Anney et al., 2010), late onset Alzheimer disease (Hu et al., 2011), and age-related macular degeneration (Fritsche et al., 2013). Rs13250896 has shown associations with cardiovascular risk-related phenotypes, such as carotid artery thickness (Melton et al., 2013), LDL and total cholesterol levels (Teslovich et al., 2010), and diastolic blood pressure (Ehret et al., 2011) in GWAS studies. However, none of these associations was below the p = 1.0E-4 level (GRASP database of GWAS studies, Leslie et al., 2014).

We compared our results against the publicly available results of a GWAS study on MDD by the Psychiatric Genomics Consortium (PGC) (Ripke et al., 2013). Our top hit, rs58682566, was not present among the genotyped PGC SNPs, but two SNPs were imputed from this area of the genome (rs9964220 and rs8090930, spanning 9500 bp, with rs58682566 located between them), and their p-values were 0.368 and 0.88. Of our four nominally replicated SNPs, rs4147527, rs1379328 and rs13250896 were among the genotyped PGC results, with p-values of 0.412, 0.4311 and 0.2714, respectively. Therefore, comparison of our results against PGC’s MDD results did not provide supporting evidence for the SNPs identified in this study.

In summary, the GWAS literature provides some evidence that the SNPs identified in the present study are associated with other phenotypes, but it does not confirm a role for them in emotional health. Our weak replications therefore do not justify conclusions about the relationship between these SNPs and depressive symptomatology.

The lack of success in replication can result from many factors. It is generally accepted that the effects of common genetic variants on a complex phenotype such as depressive symptomatology are very small (Ripke et al., 2013), and many studies have insufficient power to detect such small effect sizes. In a recent study using depressive symptoms as the phenotypic measure in a meta-analysis of 180,866 individuals, only 2 SNPs exceeded the genome-wide significance level and replicated successfully in a companion cohort (Okbay et al., 2016). This gives an indication of the magnitude of sample size required for successful detection of true findings. That study, which used heterogeneous phenotypic measures, found an effect size of 0.002%. This value is 250 times smaller than the 0.5% that our study supports. Even if we consider that the phenotypic measure we used is homogeneous and measured at several time points (70% of the participants have at least 7 observations in the Discovery sample), our study is likely to be underpowered.

On the other hand, there is an example in the literature of successful identification and replication of MDD-associated SNPs with a much smaller sample size, using carefully selected subjects for whom known sources of phenotypic and genetic heterogeneity were minimized (CONVERGE Consortium, 2015).

Recently, the proportion of phenotypic variance explained by common SNPs (Yang et al., 2011) of the CESD score was found to be 0.35 in a univariate model (Boardman et al., 2015). However, after controlling for the first five Principal Components, the heritability estimate dropped to 0.19. This relatively moderate heritability explained by common SNPs indicates that other genetic (Glessner et al., 2010) or non-genetic factors may play a significant role in depression.

The non-genetic environmental factors may act in an additive way or interact with the genetic factors. For example, Caspi et al. (2003) showed that a functional polymorphism (5-HTTLPR) in the promoter region of the serotonin transporter gene (SLC6A4) was associated with depression only in interaction with life events, not on its own. This suggests that the risk allele confers susceptibility to depression only if the carrier has been exposed to stressful life events, and provides evidence for a gene-environment interaction. The Caspi finding has since been replicated in some studies, although not consistently (Cohen-Woods et al., 2013). We did not take current or past stressful life events into consideration in our analyses, despite the likely influence of gene-environment interactions within individuals and across the sample.

Moreover, there could be larger-scale environmental effects present, as the sample mean CESD score was higher in some waves than in others. In the Discovery sample, the sample mean CESD score was 1.021 in wave 3 (lowest) and 1.35 in wave 8 (highest). In the Replication sample, the range was 1.29 (lowest, wave 4) to 1.42 (highest, wave 2 and wave 5). We have no explanation for the origin of this difference between waves, but it could have contributed to the failure of replication.

In addition, the CESD score is a subjective measure, with the possibility of individual variation in interpretation of and response to particular items, making it more difficult to achieve successful replication. However, here we have used a version of the measure that has measurement equivalence in the Discovery and Replication samples (Vanhoutte, 2014) and the measure has been shown to have good validity and reliability (HRS Health Working Group, 2000).

A further concern could be the robustness of our phenotypic measure. We calculated the association between the individual wave-specific CESD scores and the overall mean CESD scores in the Discovery sample. The results showed high correlations (range: 0.70–0.79), indicating the robustness of the mean CESD score.
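As an illustration only, the robustness check described above can be sketched in Python. This is a minimal sketch, not the authors' actual analysis code: the function names, the data layout (one list of wave scores per person, with None for a missing wave) and the toy values are all assumptions made for the example. It applies the at-least-three-observations rule and computes a Pearson correlation between each wave-specific score and the per-person mean.

```python
from math import sqrt

MIN_WAVES = 3  # the study required at least three observations per individual


def mean_score(scores):
    """Mean of the non-missing wave scores, or None if fewer than MIN_WAVES."""
    observed = [s for s in scores if s is not None]
    if len(observed) < MIN_WAVES:
        return None
    return sum(observed) / len(observed)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def wave_vs_mean_correlations(panel):
    """Correlate each wave's score with the per-person mean score.

    `panel` is a hypothetical layout: one list per person, one entry per wave.
    People without a valid mean, or missing in a given wave, are skipped.
    """
    means = [mean_score(person) for person in panel]
    correlations = []
    for wave in range(len(panel[0])):
        pairs = [
            (person[wave], m)
            for person, m in zip(panel, means)
            if m is not None and person[wave] is not None
        ]
        xs, ys = zip(*pairs)
        correlations.append(pearson(xs, ys))
    return correlations


# Invented toy data: five people, four waves, None marks a missing wave.
toy_panel = [
    [0, 1, 0, 1],
    [2, 3, None, 2],
    [5, 4, 6, None],
    [1, None, None, 2],  # only two observations, so excluded from the mean
    [3, 2, 3, 4],
]

if __name__ == "__main__":
    for wave, r in enumerate(wave_vs_mean_correlations(toy_panel), start=1):
        print(f"wave {wave}: r = {r:.2f}")
```

Each wave-specific score contributes to the mean it is correlated with, so some positive correlation is expected by construction; the sketch only shows the mechanics of the check.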

Despite the lack of convincing replications, our study has a number of strengths. It uses two nationally representative samples with large sample sizes in both the Discovery (n = 10,123) and the Replication sample (n = 5251). We used a continuous scale phenotypic measure, which increases the phenotypic variance compared to binary phenotypic measures. Also, phenotypes were constructed by including repeated assessments of depressive symptomatology over at least three waves in order to provide some degree of stability to the measure. This is relevant because symptomatology can fluctuate widely over time, partially due to environmental exposures. Additionally, by using the same questions in both samples, we attempted to ensure phenotypic homogeneity between the two samples. A further notable strength of our study design is the use of highly comparable genotyping platforms for the Discovery and Replication samples.

Among the weaknesses of the study is the use of community samples, as the number of individuals with higher depressive symptom scores may be lower than in a specifically recruited sample of depressed individuals. In relation to this, although 40% of individuals in the Discovery sample and 70% in the Replication sample have the maximum number of observations, the missing data may be due to depressive episodes, in which case the CESD measure would be somewhat biased. Moreover, the CESD questions ask about the past week, which may not be typical of the individual’s life in general; however, by using at least three separate observations for each individual, we are more likely to capture a trait that is characteristic of that individual, rather than a labile state.

We also acknowledge that one of the four observed nominal replications was in relation to the mean of the three somatic items measure. This already weak replication is further called into question by the fact that one of the items in this measure was among the two with different measurement properties in the two samples, as indicated by the multigroup confirmatory factor analysis.

A further weakness of the study is that depressive symptomatology is heavily influenced by the environment, such as by stressful life events (Lotrich, 2011), and can be affected by cultural and health-care access differences between the two countries; these are variables that we did not measure or correct for.

Although studies have demonstrated the importance of epigenetic regulation in depression-related behavior or its treatment (Sun et al., 2013), epigenetic mechanisms cannot be interrogated within the framework of a conventional GWAS study.

The different number of available waves in the two samples may cause differences in the accuracy of the phenotypic measures. Moreover, these waves span different calendar years in the two samples, and this may introduce cohort effects. Lastly, the multigroup confirmatory factor analysis results indicated differences in the mean CESD measures between the eight- and six-item scales. However, that difference was not supported by the genetic association results; in fact, one of the nominal replications was observed in association with the somatic measure, which contains the ‘restless sleep’ item.

In summary, here we report the results of a genome-wide association scan study of depressive symptomatology in two large population-representative samples, paying particular attention to independent replication. Similar to previous studies investigating MDD and depression as a trait, we cannot report robust and replicable findings. We had one association that exceeded the genome-wide significance threshold, although it did not replicate. This suggests that future studies of genetic determinants of depressive symptomatology may require alternative study designs, such as incorporating larger sample sizes, gene × environment interactions, or epigenetic approaches.

Supplementary Material

AppendixQQ
AppendixTable

Acknowledgements

We thank Mr. John Mcloughlin for his technical assistance, programming and Linux administration.

We also thank all participants in HRS and ELSA.

Funding source

This work was supported by the Medical Research Council [grant number: G1001375] in the United Kingdom and the National Institute on Aging [grant numbers: R01 AG030153 and F32 AG048681] in the United States. The funding sources had no further role in conducting the research or in publication.

Footnotes

All the pheno- and genotype data are publicly available.

All authors have approved the final article.

Conflicts of interest

The authors report no conflict of interest.

Appendix A. Supplementary data

Supplementary data related to this article can be found at http://dx.doi.org/10.1016/j.jpsychires.2018.01.016.

References

  1. Anney R, Klei L, Pinto D, Regan R, Conroy J, Magalhaes TR, et al., 2010. A genome-wide scan for common alleles affecting risk for autism. Hum. Mol. Genet 19, 4072–4082.
  2. Blazer DG 2nd, Hybels CF, 2005. Origins of depression in later life. Psychol. Med 35, 1241–1252.
  3. Boardman JD, Domingue BW, Daw J, 2015. What can genes tell us about the relationship between education and health? Soc. Sci. Med 12, 171–180.
  4. Bulik-Sullivan BK, Loh PR, Finucane HK, Ripke S, Yang J, et al., Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2015. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet 47, 291–295.
  5. Caspi A, Sugden K, Moffitt TE, Taylor A, Craig IW, Harrington H, et al., 2003. Influence of life stress on depression: moderation by a polymorphism in the 5-HTT gene. Science 301, 386–389.
  6. Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ, 2015. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4, 7.
  7. Cheung G, Rensvold R, 1999. Testing factorial invariance across groups: a reconceptualization and proposed new method. J. Manag 25, 1–27.
  8. Cheung G, Rensvold R, 2002. Evaluating goodness-of-fit indexes for testing measurement invariance. Struct. Equ. Model 9, 233–255.
  9. Cohen-Woods S, Craig IW, McGuffin P, 2013. The current state of play on the molecular genetics of depression. Psychol. Med 43, 673–687.
  10. CONVERGE Consortium, 2015. Sparse whole-genome sequencing identifies two loci for major depressive disorder. Nature 523, 588–591.
  11. Cullup T, Kho AL, Dionisi-Vici C, Brandmeier B, Smith F, Urry Z, et al., 2013. Recessive mutations in EPG5 cause Vici syndrome, a multisystem disorder with defective autophagy. Nat. Genet 45, 83–87.
  12. Ehret GB, Munroe PB, Rice KM, Bochud M, Johnson AD, Chasman DI, et al., 2011. Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature 478, 103–109.
  13. Fritsche LG, Chen W, Schu M, Yaspan BL, Yu Y, Thorleifsson G, et al., 2013. Seven new loci associated with age-related macular degeneration. Nat. Genet 45, 433–439, 439e1–2.
  14. Furtado M, Katzman MA, 2015. Examining the role of neuroinflammation in major depression. Psychiatr. Res 229, 27–36.
  15. Gatz M, Pedersen NL, Plomin R, Nesselroade JR, McClearn GE, 1992. Importance of shared genes and shared environments for symptoms of depression in older adults. J. Abnorm. Psychol 101, 701–708.
  16. Glessner JT, Wang K, Sleiman PMA, Zhang H, Kim CE, Flory JH, et al., 2010. Duplication of the SLIT3 locus on 5q35.1 predisposes to major depressive disorder. PLoS One 5, e15463.
  17. Health and Retirement Study Health Working Group, 2000. Documentation of Affective Functioning Measures in the Health and Retirement Study. University of Michigan.
  18. Hek K, Demirkan A, Lahti J, Terracciano A, Teumer A, Cornelis MC, et al., 2013. A genome-wide association study of depressive symptoms. Biol. Psychiatr 73, 667–678.
  19. Hu X, Pickering E, Liu YC, Hall S, Fournier H, Katz E, et al., 2011. Meta-analysis for genome-wide association study identifies multiple variants at the BIN1 locus associated with late-onset Alzheimer’s disease. PLoS One 6, e16616.
  20. Hyde CL, Nagle MW, Tian C, Chen X, Paciga SA, Wendland JR, et al., 2016. Identification of 15 genetic loci associated with risk of major depression in individuals of European descent. Nat. Genet 48, 1031–1036.
  21. Johnson TP, Patel K, Johnson KR, Maric D, Calabresi PA, Hasbun R, et al., 2013. Induction of IL-17 and nonclassical T-cell activation by HIV-Tat protein. Proc. Natl. Acad. Sci. U. S. A 110, 13588–13593.
  22. Karim J, Weisz R, Bibi Z, ur Rehman S, 2015. Validation of the eight-item center for epidemiologic studies depression scale (CES-D) among older adults. Curr. Psychol 34, 681–962.
  23. Kessler RC, Berglund P, Demler O, Jin R, Koretz D, Merikangas KR, et al., 2003. The epidemiology of major depressive disorder: results from the National Comorbidity Survey Replication (NCS-R). J. Am. Med. Assoc 289, 3095–3105.
  24. Kohli MA, Lucae S, Saemann PG, Schmidt MV, Demirkan A, Hek K, et al., 2011. The neuronal transporter gene SLC6A15 confers risk to major depression. Neuron 70, 252–265.
  25. Kohout FJ, Berkman LF, Evans DA, Cornoni-Huntley J, 1993. Two Shorter Forms of the CES-D Depression Symptoms Index. J. Aging Health 5, 179–193.
  26. Leslie R, O’Donnell CJ, Johnson AD, 2014. GRASP: analysis of genotype-phenotype results from 1,390 genome-wide association studies and corresponding open access database. Bioinformatics 30, i185–194.
  27. Lewinsohn PM, Solomon A, Seeley JR, Zeiss A, 2000. Clinical implications of “subthreshold” depressive symptoms. J. Abnorm. Psychol 109, 345–351.
  28. Lotrich FE, 2011. Gene-environment interactions in geriatric depression. Psychiatr. Clin 34, 357–376.
  29. McLaren W, Pritchard B, Rios D, Chen Y, Flicek P, Cunningham F, 2010. Deriving the consequences of genomic variants with the Ensembl API and SNP effect predictor. Bioinformatics 26, 2069–2070.
  30. Melton PE, Carless MA, Curran JE, Dyer TD, Goring HH, Kent JW Jr., et al., 2013. Genetic architecture of carotid artery intima-media thickness in Mexican Americans. Circ. Cardiovasc. Genet 6, 211–221.
  31. Nadanaka S, Kitagawa H, 2008. Heparan sulphate biosynthesis and disease. J. Biochem 144, 7–14 Review.
  32. Okbay A, Baselmans BM, De Neve JE, Turley P, Nivard MG, Fontana MA, et al., 2016. Genetic variants associated with subjective well-being, depressive symptoms, and neuroticism identified through genome-wide analyses. Nat. Genet 48, 624–633.
  33. Penninx BW, Guralnik JM, Ferrucci L, Simonsick EM, Deeg DJ, Wallace RB, 1998. Depressive symptoms and physical decline in community-dwelling older persons. J. Am. Med. Assoc 279, 1720–1726.
  34. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al., 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet 81, 559–575.
  35. Radloff LS, 1977. The CES-D Scale: a self-report depression scale for research in the general population. Appl. Psychol. Meas 1, 385–401.
  36. Ripke S, Wray NR, Lewis CM, Hamilton SP, Weissman MM, Breen G, et al., 2013. A mega-analysis of genome-wide association studies for major depressive disorder. Mol. Psychiatr 18, 497–511. Website for MDD association results. https://www.med.unc.edu/pgc/results-and-downloads.
  37. Steptoe A, Breeze E, Banks J, Nazroo J, 2013. Cohort profile: the English longitudinal study of ageing. Int. J. Epidemiol 42, 1640–1648.
  38. Sullivan PF, Neale MC, Kendler KS, 2000. Genetic epidemiology of major depression: review and meta-analysis. Am. J. Psychiatr 157, 1552–1562.
  39. Sun H, Kennedy PJ, Nestler EJ, 2013. Epigenetics of the depressed brain: role of histone acetylation and methylation. Neuropsychopharmacology Reviews 38, 124–137.
  40. The Wellcome Trust Case Control Consortium, 2007. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661–678.
  41. Terracciano A, Tanaka T, Sutin AR, Sanna S, Deiana B, Lai S, et al., 2010. Genome-wide association scan of trait depression. Biol. Psychiatr 68, 811–817.
  42. Teslovich TM, Musunuru K, Smith AV, Edmondson AC, Stylianou IM, Koseki M, et al., 2010. Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707–713.
  43. Turvey CL, Wallace RB, Herzog R, 1999. A revised CES-D measure of depressive symptoms and a DSM-based measure of major depressive episodes in the elderly. Int. Psychogeriatr 11, 139–148.
  44. van Dooren FE, Schram MT, Schalkwijk CG, Stehouwer CD, Henry RM, Dagnelie PC, et al., 2016. Associations of low grade inflammation and endothelial dysfunction with depression - the Maastricht Study. Brain, Behav. Immun 56, 390–396.
  45. Vanhoutte B, 2014. The multidimensional structure of subjective well-being in later life. J. Popul. Ageing 7, 1–20.
  46. Weisenbach SL, Kumar A, 2014. Current understanding of the neurobiology and longitudinal course of geriatric depression. Curr. Psychiatr. Rep 16, 463.
  47. Wilson RS, Bennett DA, Bienias JL, Aggarwal NT, Mendes De Leon CF, Morris MC, et al., 2002. Cognitive activity and incident AD in a population-based sample of older persons. Neurology 59, 1910–1914.
  48. Wray NR, Pergadia ML, Blackwood DH, Penninx BW, Gordon SD, Nyholt DR, et al., 2012. Genome-wide association study of major depressive disorder: new results, meta-analysis, and lessons learned. Mol. Psychiatr 17, 36–48.
  49. Yang JA, Lee SH, Goddard ME, Visscher PM, 2011. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet 88, 76–82.
