JAMA Psychiatry. 2021 Aug 11;78(10):1152–1160. doi: 10.1001/jamapsychiatry.2021.1988

Polygenic Risk Scores Derived From Varying Definitions of Depression and Risk of Depression

Brittany L Mitchell 1,2, Jackson G Thorp 1,3, Yeda Wu 4, Adrian I Campos 1,3,5, Dale R Nyholt 2,6, Scott D Gordon 1, David C Whiteman 1, Catherine M Olsen 1, Ian B Hickie 7, Nicholas G Martin 1, Sarah E Medland 1, Naomi R Wray 4,8, Enda M Byrne 4,9
PMCID: PMC8358814  PMID: 34379077

Key Points

Question

To what extent does the depth of phenotyping matter in genetic studies of depression?

Findings

In this case-control polygenic risk score analysis including 12 106 individuals with major depressive disorder, the major factor in estimating risk was the sample size of the discovery genome-wide association study. Polygenic risk scores derived from studies assessing diagnostic criteria for major depressive disorder had higher odds ratios for somatic symptoms and comorbidities of major depressive disorder.

Meaning

Results of this study suggest that to generate potentially better genetic estimations of risk for severe depression, larger genome-wide association study sample sizes, regardless of the depth of phenotyping, should be prioritized.

Abstract

Importance

Genetic studies with broad definitions of depression may not capture genetic risk specific to major depressive disorder (MDD), raising questions about how depression should be operationalized in future genetic studies.

Objective

To use a large, well-phenotyped single study of MDD to investigate how different definitions of depression used in genetic studies are associated with estimation of MDD and phenotypes of MDD, using polygenic risk scores (PRSs).

Design, Setting, and Participants

In this case-control polygenic risk score analysis, patients meeting diagnostic criteria for a diagnosis of MDD were drawn from the Australian Genetics of Depression Study, a cross-sectional, population-based study of depression, and controls and patients with self-reported depression were drawn from QSkin, a population-based cohort study. Data analyzed herein were collected before September 2018, and data analysis was conducted from September 10, 2020, to January 27, 2021.

Main Outcome and Measures

Polygenic risk scores generated from genome-wide association studies using different definitions of depression were evaluated for estimation of MDD and, within individuals with MDD, for an association with age at onset, adverse childhood experiences, comorbid psychiatric and somatic disorders, and current physical and mental health.

Results

Participants included 12 106 (71% female; mean age, 42.3 years; range, 18-88 years) patients meeting criteria for MDD and 12 621 (55% female; mean age, 60.9 years; range, 43-87 years) control participants with no history of psychiatric disorders. The effect size of the PRS increased with the discovery sample size: the PRS from the largest study had the highest odds ratio for MDD per SD of PRS (1.75; 95% CI, 1.73-1.77), and the PRS derived from ICD-10 codes documented in hospitalization records in a population health cohort had the lowest (1.14; 95% CI, 1.12-1.16). When accounting for differences in sample size, the PRS from a genome-wide association study of patients meeting diagnostic criteria for MDD and control participants was the best estimator of MDD, but not of self-reported depression, and had higher odds ratios for childhood adverse experiences and measures of somatic distress.

Conclusions and Relevance

These findings suggest that increasing sample sizes, regardless of the depth of phenotyping, may be most informative for estimating risk of depression. The next generation of genome-wide association studies should, like the Australian Genetics of Depression Study, have both large sample sizes and extensive phenotyping to capture genetic risk factors for MDD not identified by other definitions of depression.


This case-control study examines the association between the depth of phenotyping and estimation of the risk for major depressive disorder in genetic studies.

Introduction

Depression is a common, often recurrent or severe, psychiatric disorder and one of the leading causes of global disability.1 Depression is characterized by significant heterogeneity in timing of onset, symptom profile, course, response to treatment, and both psychiatric and physical comorbidities.2 Approximately 30% to 40% of the total variance in liability to major depressive disorder (MDD) is attributable to additive genetic factors.3 Since 2015, there have been a number of breakthroughs in identifying genetic risk factors for depression.4,5,6,7,8 In 2019, the Psychiatric Genomics Consortium (PGC) identified 102 independent variants associated with depression. In 2021, a meta-analysis including the Million Veterans Project identified 233 associated variants.9 To achieve the extensive sample sizes needed to identify these loci, a large proportion of cases were defined based on (1) responses to a single screening question regarding seeking professional help for depression, worries, or tension; (2) a self-reported diagnosis of depression during a nurse-led interview in the UK Biobank; (3) online assessment in 23andMe; or (4) a diagnosis from electronic health records (collectively referred to as minimal phenotyping). Thus, many of the individuals were either not assessed for or did not meet the criteria for MDD as defined by the DSM-5.10

Cai and colleagues11 found evidence for differences in genetic architecture between depression defined using minimal phenotyping and MDD assessed using a diagnostic questionnaire, including a higher heritability and lack of enrichment of association in genes expressed in the brain for clinically defined depression and nonspecificity of loci identified using minimal phenotyping. Including minimally phenotyped patients and controls thus substantially boosts power to detect genetic loci, but may increase heterogeneity within and across cohorts and so miss clinically important genetic effects specific to MDD.

The proliferation of large, population-based health studies with genomic information and the increasing availability of administrative health data with diagnostic codes for depression might facilitate valuable insights into the cause of depression. However, the extent to which genetic findings from depression defined by minimal phenotyping extend to clinical diagnoses of depression using diagnostic questionnaires or interviews is a key issue that will inform the interpretation and design of future studies.

Herein, we used the Australian Genetics of Depression Study (AGDS), a large online study of the genetic cause of depression,12 to investigate how polygenic risk scores (PRSs) constructed from different definitions of depression and meta-analyses encompassing multiple definitions map to specific features of clinical depression, such as age at onset, severity, reported trauma, and psychiatric and physical comorbidities. The large sample size and breadth of phenotyping make this a unique cohort for dissecting the genetic architecture of depression.

Methods

A schematic overview of the design of the study is shown in eFigure 1 in the Supplement. All protocols and questionnaires were approved by the QIMR Berghofer Medical Research Institute Human Research Ethics Committee. Data analysis for this study was conducted from September 10, 2020, to January 27, 2021. This study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline.

The Australian Genetics of Depression Study

The AGDS is a large ongoing study of the causes of depression and treatment response. The recruitment and sample characteristics of the AGDS have been described in detail elsewhere.12 The present study uses data from the first data freeze in September 2018. Between 2016 and 2018, 20 689 participants (age, 28-58 years; 75% women) provided online consent and enrolled in the study. Participants completed a compulsory module that included the Composite International Diagnostic Interview Short Form13 to assess diagnostic criteria for depression. The compulsory module also assessed psychiatric comorbidities. Before September 2018, a total of 15 792 participants had provided a saliva sample (GeneFix; Isohelix saliva kit).

We evaluated the association between depression PRSs and a number of clinical features of depression in individuals meeting DSM-5 criteria for MDD. These features included early age at onset (defined as reported age at first episode of depression <21 years), reporting more than 2 episodes of depression, childhood trauma (defined as having experienced sexual, physical, or emotional abuse before age 18 years), and a self-reported diagnosis of an anxiety disorder, bipolar disorder, migraine, chronic fatigue, or chronic pain. Furthermore, we investigated the self-reported current measures of psychological distress and somatic symptoms determined using the PSYCH and SOMA subscales, respectively, of the SPHERE-12.14 The sample sizes for each of the phenotypes are shown in eTable 1 in the Supplement.

QSkin Study

The QSkin sun and health study is a prospective cohort study initiated in 2011 primarily to examine skin cancer outcomes. Participants aged 40 to 70 years responded to a mailing to residents of Queensland, Australia, selected at random from the electoral roll (n = 43 794). A total of 17 218 QSkin participants provided a saliva sample in 2014; answered the lifestyle questionnaire, which included a disease checklist comprising questions about ever having been diagnosed with psychiatric disorders; and provided consent for their data to be used for future research. Participants of European ancestry who reported not having been given a diagnosis of any psychiatric disorder were selected as controls for the case-control analysis. Those who reported a diagnosis of depression were included in the case cohort.

Depression Phenotypes Used to Generate PRSs

We evaluated the association of PRSs from summary statistics derived from 9 different genome-wide association studies (GWASs) of depression (Table).4,6,11,15 First, we used the results of the most recent published analysis of the Psychiatric Genomics Consortium Major Depression Working Group (PGC 2019),6 which is, to our knowledge, the largest published study with summary statistics available, with the Australian samples (listed as the QIMR cohort) removed to ensure there was no chance of sample overlap. The PGC 2019 study is a meta-analysis including clinical cohorts, population registers, data from 23andMe, and broadly defined depression in the UK Biobank. 23andMe participants provided informed consent and participated in the research online, under a protocol approved by the external Association for the Accreditation of Human Research Protection Programs–accredited institutional review board. Second, we used the published results from a GWAS of broad depression in the UK Biobank that includes individuals with depression defined by answering yes to having sought help for nerves, anxiety, tension, or depression or a diagnosis of depression using linked hospital records. Third, we used summary statistics from the cohorts with clinically defined MDD in the PGC 2019 study. These groups are described as the PGC29 cohorts by Wray et al.4 The summary statistics do not include the QIMR cohorts, but for consistency with previous studies, we refer to this discovery sample as PGC29. The remaining 6 phenotypes and their corresponding downsampled results are from the study of Cai et al,11 who conducted GWASs using 6 different definitions of depression in the UK Biobank. These definitions were based on measures including responses to single questions regarding help seeking, depression diagnoses obtained from linked health records, and MDD defined using DSM-5 criteria.
Because different definitions produce widely varying numbers of cases and controls, which will affect power, we further evaluated the performance of PRSs derived from the 6 definitions of depression in the UK Biobank when each definition is downsampled to give equal numbers of cases and controls between definitions (7500 cases and 42 500 controls) using the summary statistics provided by Cai et al.11 Because the PGC 2019, broadly defined depression in the UK Biobank, and PGC29 studies include depression diagnoses defined in multiple ways rather than a single strict definition, downsampling was not performed.
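To make the point about power concrete, one common summary for comparing case-control GWASs with different case to control ratios is the effective sample size, 4 / (1/cases + 1/controls). The sketch below applies that formula to a few of the counts given in this section. The formula is brought in here for illustration only; the study does not report effective sample sizes, and the function name is hypothetical.

```python
def effective_sample_size(n_cases: int, n_controls: int) -> float:
    """Effective sample size of a case-control GWAS: 4 / (1/cases + 1/controls)."""
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)

# Case and control counts copied from the Table and from the
# downsampling description (7500 cases and 42 500 controls).
discovery = {
    "PGC 2019": (246_819, 561_485),
    "Lifetime MDD": (16_301, 50_870),
    "ICD-10 Dep": (9_176, 203_235),
    "Downsampled": (7_500, 42_500),
}

for name, (cases, controls) in discovery.items():
    n_eff = effective_sample_size(cases, controls)
    print(f"{name:>12}: N_eff ~ {n_eff:,.0f}")
```

Under this measure, every downsampled definition carries the same nominal power, which is what allows the definitions to be compared on phenotype alone.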

Table. Discovery GWASs Used to Generate PRSs

Source | Discovery GWAS | Description | Cases | Controls
Howard et al,6 2019 | PGC 2019 | Largest published genome-wide association study of depression by the Psychiatric Genomics Consortium, with Australian samples removed | 246 819 | 561 485
Howard et al,15 2018 | UKB Broad | Defined using self-reported help-seeking behavior for mental health difficulties: “Have you ever seen a GP for nerves, anxiety, tension or depression?” or “Have you ever seen a psychiatrist for nerves, anxiety, tension or depression?” | 113 769 | 208 811
Wray et al,4 2018 | PGC29 | Meta-analysis of cohorts from Psychiatric Genomics Consortium MDD study with cases assessed using clinical criteria or clinician diagnosis, with Australian samples removed | 14 833 | 23 921
Cai et al,11 2020 | Lifetime MDD | MDD defined using DSM criteria assessed using the CIDI-SF in the follow-up mental health questionnaire (MHQ in UKB) | 16 301 | 50 870
Cai et al,11 2020 | GPPsy | Seen a GP for nerves, anxiety, tension, or worries (UKB) | 113 262 | 219 360
Cai et al,11 2020 | PsyPsy | Seen psychiatrist for nerves, anxiety, tension, or depression | 36 286 | 297 126
Cai et al,11 2020 | DepAll | Seen GP or psychiatrist for tension or depression and 2 wk of depression or anhedonia | 21 777 | 58 396
Cai et al,11 2020 | SelfRepDep | Self-report of history of depression in interview with trained nurse in the UKB | 19 805 | 234 114
Cai et al,11 2020 | ICD-10 Dep | ICD-10 code for depression from linked electronic health records in the UKB | 9176 | 203 235

Abbreviations: CIDI-SF, Composite International Diagnostic Interview Short Form; GP, general practitioner; MDD, major depressive disorder; MHQ, Mental Health Questionnaire; PGC, Psychiatric Genomics Consortium; PRS, polygenic risk score; UKB, UK Biobank.

Polygenic Risk Scores

Details of the genotyping and quality control are provided in the eMethods in the Supplement. SBayesR,16 a bayesian method that assumes that single-nucleotide variant (SNV) effects are drawn from a mixture of four 0-mean normal distributions with different variances, was used to generate the weights for the PRSs. This method rescales the GWAS SNV effects with many SNVs assumed to have an effect size of 0. Full details are provided in the eMethods in the Supplement. The posterior SNV effects estimated by SBayesR were used to generate PRSs for each individual using the score function in PLINK.

Polygenic risk scores were standardized to calculate the effect size per SD unit of PRS. We also used linkage disequilibrium score regression17 to calculate the SNV-based heritability for clinical and self-reported depression and the genetic correlation with depression phenotypes from the UK Biobank.
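As an illustration of the scoring and standardization steps, a PRS for one person is a weighted sum of effect-allele counts, and the scores are then rescaled to mean 0 and SD 1 across the sample. The sketch below uses made-up weights and genotypes; the function names are hypothetical, and it is not the PLINK implementation (PLINK may report an average rather than a sum depending on options, a scale difference that disappears after standardization when no genotypes are missing).

```python
from statistics import mean, pstdev

def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele counts (0, 1, or 2) for one individual."""
    return sum(d * w for d, w in zip(dosages, weights))

def standardize(scores):
    """Rescale scores to mean 0 and SD 1 so effects are expressed per SD of PRS."""
    mu, sd = mean(scores), pstdev(scores)
    return [(s - mu) / sd for s in scores]

# Hypothetical posterior SNV effects (stand-ins for SBayesR output) and
# hypothetical genotypes for four individuals at three SNVs.
weights = [0.02, -0.01, 0.03]
genotypes = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [0, 2, 0],
]

raw = [polygenic_score(g, weights) for g in genotypes]
prs_z = standardize(raw)
print([round(r, 3) for r in raw])
print([round(z, 2) for z in prs_z])
```

Because many SBayesR posterior effects are shrunk toward 0, most SNVs contribute little to the sum in practice.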

In addition to evaluating the association with clinical depression in AGDS and self-reported depression in QSkin, we examined the association between depression PRSs and a number of clinical features of depression in individuals meeting MDD criteria in the AGDS. These features included early age at onset (defined as reported age at first episode of depression <21 years), reporting more than 2 episodes of depression, childhood trauma (defined as having experienced sexual, physical, or emotional abuse before age 18 years, assessed using part A of the PTSD Checklist for DSM-518), and a self-reported diagnosis of an anxiety disorder, bipolar disorder, migraine, chronic fatigue, or chronic pain. Furthermore, we investigated the self-reported current measures of psychological distress and somatic symptoms measured using the PSYCH and SOMA subscales of the SPHERE-12.14 The sample sizes for each of the phenotypes are reported in eTable 1 in the Supplement.

Each of the full and downsampled PRSs was regressed against the clinical phenotypes of interest using logistic regression for binary variables and linear regression for continuous variables. Continuous variables were standardized before the regression. All analyses included age at enrollment, sex, and 10 genetic principal components as covariates.
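The regression step for a binary phenotype can be sketched as a logistic regression of case status on the standardized PRS plus covariates, with the odds ratio per SD obtained by exponentiating the PRS coefficient. The example below is a minimal, dependency-free sketch on simulated data. The simulated effect sizes, the sample size, and the helper names are all hypothetical; the 10 genetic principal components are omitted for brevity; and the software the authors used for the regressions is not stated in this section.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_logistic(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson; each row of X starts with an intercept of 1."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            mu = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += w * xi[j] * xi[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta

# Simulated data: standardized PRS, standardized age, and sex (all hypothetical).
random.seed(1)
true_log_or = math.log(1.75)
X, y = [], []
for _ in range(4000):
    prs, age, sex = random.gauss(0, 1), random.gauss(0, 1), random.choice([0, 1])
    eta = -0.2 + true_log_or * prs + 0.1 * age + 0.3 * sex
    y.append(1 if random.random() < 1.0 / (1.0 + math.exp(-eta)) else 0)
    X.append([1.0, prs, age, float(sex)])

beta = fit_logistic(X, y)
or_per_sd = math.exp(beta[1])  # coefficient 1 is the PRS term
print(f"Estimated OR per SD of PRS: {or_per_sd:.2f}")
```

For continuous phenotypes the same design matrix would be used with ordinary least squares in place of the logistic link.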

Results

A total of 12 106 (71% female; mean age, 42.3 years; range, 18-88 years) participants of European ancestry who met DSM-5 criteria for MDD were included. A further set of individuals (3083; 68% female) who self-reported a diagnosis of depression but for whom diagnostic criteria were not assessed was drawn from the QSkin study.19 Participants of European ancestry from QSkin who reported not having a diagnosis of any psychiatric disorder were included as controls (12 621; 55% female; mean age, 60.9 years; range, 43-87 years). The SNV-based heritability on the liability scale when comparing individuals with MDD in the AGDS with controls was 0.16 (0.02) and comparing QSkin participants with self-reported depression with controls was 0.12 (0.06) (eTable 2 in the Supplement).
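Liability-scale heritability estimates from case-control data depend on an assumed population prevalence and on the proportion of cases in the sample. The sketch below shows the standard observed-to-liability conversion, h2_liability = h2_observed × K²(1 − K)² / (z² P(1 − P)), where K is the prevalence, P the case fraction, and z the standard normal density at the liability threshold. The case fraction is computed from the counts above; the prevalence and the observed-scale value are hypothetical placeholders, since neither is given in this section.

```python
from statistics import NormalDist

def liability_scale_h2(h2_observed: float, prevalence: float, case_fraction: float) -> float:
    """Convert observed-scale SNV heritability from a case-control sample to the liability scale."""
    threshold = NormalDist().inv_cdf(1.0 - prevalence)  # liability threshold for case status
    density = NormalDist().pdf(threshold)               # standard normal density at the threshold
    k, p = prevalence, case_fraction
    return h2_observed * (k * (1 - k)) ** 2 / (density ** 2 * p * (1 - p))

# Case fraction from the counts in the text (12 106 cases, 12 621 controls).
# The prevalence (0.15) and observed-scale estimate (0.20) are hypothetical.
case_fraction = 12_106 / (12_106 + 12_621)
h2_liab = liability_scale_h2(h2_observed=0.20, prevalence=0.15, case_fraction=case_fraction)
print(f"case fraction = {case_fraction:.3f}, liability-scale h2 = {h2_liab:.3f}")
```

Different prevalence assumptions change the liability-scale value, so estimates are comparable across studies only when the assumed prevalence is stated.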

We evaluated the association of each PRS with case status in the AGDS and QSkin. Regardless of whether the target sample included participants assessed for lifetime MDD (Figure 1A; eTable 3 in the Supplement) or participants with a self-reported diagnosis of depression (Figure 1C), the larger the sample size of the discovery GWAS, the larger the effect size of the PRS in the target sample. The PRS from the largest study (PGC 2019) had the highest odds ratio (OR) for MDD per SD of PRS (OR, 1.75; 95% CI, 1.73-1.77), and the PRS derived from International Statistical Classification of Diseases, 10th Edition (ICD-10) codes documented in hospitalization records in a population health cohort had the lowest (OR, 1.14; 95% CI, 1.12-1.16). For all PRSs, the effect size was larger in the individuals with lifetime MDD, indicating that patients meeting clinical criteria in the AGDS have a higher mean depression PRS than those who report having a depression diagnosis in the QSkin community sample. Given equal sample sizes, the lifetime MDD PRS was more strongly associated with lifetime MDD (OR, 1.20; 95% CI, 1.16-1.24) than the PRSs from the other definitions, such as PsyPsy (OR, 1.12; 95% CI, 1.08-1.15) (Figure 1B; eTable 3 in the Supplement). This association was not found when evaluating self-reported depression in QSkin, in which diagnostic criteria were not assessed (Figure 1D). Consistent with these results, the estimated genetic correlation with lifetime MDD in the UK Biobank was higher when including patients with clinically defined MDD (genetic correlation, 0.92; SE, 0.11) than when including self-reported cases (genetic correlation, 0.78; SE, 0.25), although this difference was not statistically significant (eTable 2 in the Supplement). Similarly, despite a larger sample size than the downsampled GWASs, the PRS derived from individuals with clinically defined MDD in PGC29 was not significantly more associated with self-reported depression in QSkin (Figure 1D).
To investigate whether screening controls for all psychiatric disorders affected the results, we repeated the analysis using controls screened only for a reported diagnosis of depression (n = 13 696). The increased association of the lifetime MDD PRS in the individuals meeting MDD criteria remained (eFigure 2 in the Supplement).
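The ORs and 95% CIs reported here can be related back to the underlying logistic regression coefficients: the OR is the exponentiated coefficient, and a Wald interval is exp(β ± 1.96 × SE). The sketch below recovers an approximate SE from a reported interval and rebuilds the interval from it. It assumes Wald-type intervals, which this section does not confirm, and the function names are hypothetical.

```python
import math

def or_with_ci(beta: float, se: float, z: float = 1.96):
    """Odds ratio and Wald 95% CI from a log-odds coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Approximate standard error of the log odds ratio implied by a reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

# Reported downsampled lifetime MDD PRS: OR, 1.20; 95% CI, 1.16-1.24.
se = se_from_ci(1.16, 1.24)
odds_ratio, lo, hi = or_with_ci(math.log(1.20), se)
print(f"SE(log OR) ~ {se:.4f}; OR {odds_ratio:.2f} (95% CI, {lo:.2f}-{hi:.2f})")
```

Because the published values are rounded to 2 decimal places, the recovered SE is approximate.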

Figure 1. Association of Depression Polygenic Risk Scores (PRSs) With Clinically Defined Major Depressive Disorder (MDD) and Self-reported Depression.


Results from estimating depression in target samples using PRSs from different depression genome-wide association study (GWAS) discovery samples. Full indicates the total sample for each discovery GWAS, and downsampled indicates each discovery GWAS downsampled to 7500 patients and 42 500 controls. For illustrative purposes, the PGC29 PRS effect size is plotted in both the full and downsampled panels, but this GWAS was not downsampled. Estimation of depression in the AGDS full (A) and downsampled (B) cohorts and the QSkin full (C) and downsampled (D) cohorts. The PGC 2019 PRS, which has the largest sample size, was the best estimator of depression in both cases assessed using DSM-5 criteria (A) and those assessed using a single self-report item (C). When sample sizes were equal, the lifetime MDD PRS was a better estimator of case status in those meeting DSM-5 criteria (B) but not in those assessed using minimal phenotyping (D). Circles indicate the odds ratio per SD in profile score, with lines showing the 95% CIs.
DepAll indicates self-report of seeing a general practitioner for nerves, anxiety, tension, or worry and at least 2 weeks of depression or anhedonia in the UK Biobank (21 777 cases and 58 396 controls); GPPsy, self-report of seeing a general practitioner for nerves, anxiety, tension, or worry in the UK Biobank (113 262 cases and 219 360 controls); ICD-10, International Statistical Classification of Diseases, 10th Edition code for depression from linked electronic health records in the UK Biobank (9176 cases and 203 235 controls); Lifetime MDD, patients meeting DSM-5 criteria for MDD in the UK Biobank and controls that screened negative for MDD (16 301 patients and 50 870 controls); PGC29, meta-analysis of cohorts from the PGC-MDD study with clinical diagnoses from interviews or from clinicians (14 833 cases and 23 921 controls); PGC 2019, largest GWAS of depression published to date (includes 246 819 clinically defined and minimally phenotyped patients and 561 485 controls); PsyPsy, self-report of seeing a psychiatrist for nerves, anxiety, tension, or worry in the UK Biobank (36 286 patients and 297 126 controls); SelfRepDep, self-report of history of depression in interview with trained nurses in the UK Biobank (19 805 cases and 234 114 controls); and UKB Broad, self-report of seeing a general practitioner or psychiatrist in the UK Biobank (113 769 cases and 208 811 controls).

We further sought to evaluate whether there are other clinical features of depression that are better captured by the clinically defined PRS. The results are shown in eFigure 3 and eTable 4 in the Supplement. Across all clinical measures examined, the PRSs from the largest PGC meta-analysis had the largest effect size. Likewise, among the different definitions of depression in the UK Biobank, the broad definition, which encompasses multiple definitions and self-reports of seeing a physician for nerves, anxiety, tension, or worry and has the largest sample size, generally gave the best estimation. By contrast, the lifetime MDD PRS had a number of notable features. First, consistent with it better capturing the genetic risk for depression that is not shared with other major psychiatric disorders, the lifetime MDD PRS was not significantly higher in those reporting a comorbid anxiety disorder (OR, 1.02; 95% CI, 0.98-1.06; P = .31) or comorbid bipolar disorder (OR, 1.01; 95% CI, 0.95-1.09; P = .80). In comparison, PRSs from multiple other definitions, including both the ICD-10 codes from electronic records and self-reports of seeing a physician for nerves, anxiety, tension, or worry in the UK Biobank, were significantly increased in those with comorbidities. Second, when accounting for differences in sample size, the lifetime MDD PRS was more strongly associated with reporting childhood trauma (OR, 1.14; 95% CI, 1.09-1.19) than the PRS with the next highest OR among the other definitions (OR, 1.07; 95% CI, 1.02-1.12). Third, when accounting for sample size, the lifetime MDD and ICD-10 PRSs were better than those from other definitions at estimating current levels of somatic distress (eFigure 3 and eTable 4 in the Supplement).

Given the high prevalence of somatic symptoms reported by patients with more severe depressive disorders,20 we hypothesized that genetic analyses based on clinical definitions of depression better capture risk of somatic symptoms of depression than do definitions based on a single question or multiple screening questions, particularly when that question focuses on mood or psychological distress alone. We next investigated the association between the PRSs and current levels of mental and physical health measured on a scale from 1 (very poor) to 5 (excellent). Both the lifetime MDD PRS (β = −0.01 [0.009]; P = .29) and PGC29 PRS (β = −0.004 [0.01]; P = .67) showed no evidence of association with current mental health but showed evidence of association with physical health (lifetime MDD, β = −0.041 [0.009]; P = 6.05 × 10^−6; PGC29, β = −0.023 [0.009]; P = .01). When considering equal sample sizes, the lifetime MDD PRS had a larger effect size for physical health (β = −0.035 [0.009]; P = 7.6 × 10^−5) than the PRSs from other definitions, with the next largest being the ICD-10–based PRS (β = −0.026 [0.009]; P = .003) (eFigure 4 and eTable 5 in the Supplement).
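For readers checking the linear regression results, a two-sided P value can be approximated from a coefficient and its standard error with a Wald z test. The sketch below assumes that the bracketed values are standard errors and that a normal approximation is adequate, neither of which is stated explicitly in this section; the function name is hypothetical. Because the published coefficients and standard errors are rounded, the computed values only approximate the reported ones.

```python
from statistics import NormalDist

def two_sided_p(beta: float, se: float) -> float:
    """Two-sided P value for a Wald z statistic (beta / se) under a normal approximation."""
    z = abs(beta / se)
    return 2.0 * (1.0 - NormalDist().cdf(z))

# Rounded values from the text for the lifetime MDD PRS (full sample).
p_physical = two_sided_p(-0.041, 0.009)  # physical health; reported P = 6.05 x 10^-6
p_mental = two_sided_p(-0.01, 0.009)     # mental health; reported P = .29
print(f"physical health: P ~ {p_physical:.2e}")
print(f"mental health:   P ~ {p_mental:.2f}")
```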

In addition, we investigated which PRSs are associated with reporting common physical comorbidities of depression. When discovery GWAS sample sizes were equal, the lifetime MDD PRS was more strongly associated with migraine (OR, 1.08; 95% CI, 1.04-1.12; P = 4.76 × 10^−5) than the PRS with the next highest OR (GPPsy; OR, 1.02; 95% CI, 0.98-1.07; P = .36). Similarly, the lifetime MDD PRS was more strongly associated with chronic fatigue syndrome (OR, 1.13; 95% CI, 1.05-1.22; P = 7.11 × 10^−4) than the next most associated PRS, which was derived from ICD-10 codes (OR, 1.04; 95% CI, 0.97-1.11; P = .34). The lifetime MDD PRS was also associated with chronic pain (OR, 1.07; 95% CI, 1.01-1.13; P = .02); however, the results from other PRSs were comparable (ICD-10 PRS, OR, 1.06; 95% CI, 1.00-1.12; P = .04) (Figure 2; eTable 6 in the Supplement). This pattern of results suggests that selecting individuals with depression and controls by screening for diagnostic criteria for MDD gives a genetic risk score that is more strongly associated with physiologic perturbations and phenotypes characterized by somatic symptoms than scores from other definitions of depression. However, the PGC29 PRS, which has only clinically defined cases, was associated only with comorbid migraine (OR, 1.06; 95% CI, 1.02-1.11; P = .005).

Figure 2. Association Between Depression Polygenic Risk Scores (PRSs) and Self-reported Diagnosis of Physical Comorbidities in Individuals With Major Depressive Disorder (MDD).


Results from estimating comorbid physical disorders in patients with MDD in the Australian Genetics of Depression Study. Full indicates the total sample for each discovery genome-wide association study (GWAS). When sample sizes are equal, the lifetime MDD PRS was a better estimator of comorbid migraine (A) and chronic fatigue syndrome (B) and had the largest effect size for chronic pain (C). DepAll indicates self-report of seeing a general practitioner for nerves, anxiety, tension, or worry and at least 2 weeks of depression or anhedonia in the UK Biobank (21 777 cases and 58 396 controls); GPPsy, self-report of seeing a general practitioner for nerves, anxiety, tension, or worry in the UK Biobank (113 262 cases and 219 360 controls); ICD-10, International Statistical Classification of Diseases, 10th Edition code for depression from linked electronic health records in the UK Biobank (9176 cases and 203 235 controls); Lifetime MDD, patients meeting DSM-5 criteria for MDD in the UK Biobank and controls that screened negative for MDD (16 301 patients and 50 870 controls); PGC29, meta-analysis of cohorts from the PGC-MDD study with clinical diagnoses from interviews or from clinicians (14 833 cases and 23 921 controls); PGC 2019, largest GWAS of depression published to date (includes 246 819 clinically defined and minimally phenotyped patients and 561 485 controls); PsyPsy, self-report of seeing a psychiatrist for nerves, anxiety, tension, or worry in the UK Biobank (36 286 patients and 297 126 controls); SelfRepDep, self-report of history of depression in interview with trained nurses in the UK Biobank (19 805 cases and 234 114 controls); and UKB Broad, self-report of seeing a general practitioner or psychiatrist in the UK Biobank (113 769 cases and 208 811 controls).

Discussion

We evaluated the association of PRSs generated from different discovery samples of depression with depression in individuals meeting clinical criteria and with self-reported depression. We found that estimation in the target samples was proportional to the sample size of the discovery GWAS, despite the larger GWASs depending on minimally phenotyped cases. Consistent with the findings of Cai et al,11 we found that when sample sizes of the discovery GWAS were equal, the clinical MDD PRS appeared to be more strongly associated with case status in patients with MDD, but not in patients who self-reported a diagnosis of depression without being assessed using the MDD criteria. This finding supports the conjecture of Cai et al that GWASs including only patients and controls screened for diagnostic criteria capture a genetic component of risk specific to MDD.

Analyses of clinical phenotypes of MDD showed that when sample sizes are equal, the lifetime MDD PRS is also associated with poor physical health, higher rates of somatic symptoms, and having comorbid migraine or chronic fatigue syndrome. A similar pattern was seen for the PRSs generated using ICD-10 codes for depression from electronic health records, although not as stark as for the MDD PRS (eFigure 3 in the Supplement). In contrast, PRSs derived from analyses using minimal phenotyping are not significantly less useful at estimating measures of severity, such as age at onset and number of episodes. The PGC29 PRS also showed evidence of association with somatic symptoms, but with lower effect sizes than the lifetime MDD PRS. Although patients in the PGC29 discovery GWAS met clinical criteria, there were differences in the ascertainment of patients across the cohorts in the PGC studies and, perhaps more importantly, differences in the screening of controls, with some cohorts using unscreened controls, which may have affected the results.

Somatic symptoms are common in patients with MDD and include fatigue, headaches, and back pain,21 and previous studies have found that a large proportion of patients meeting the criteria for depression present initially to primary care clinicians with somatic symptoms.22 Painful somatic symptoms are associated with increased functional impairment20 and poorer outcomes in patients with depression.21,23

Our results have important implications. If genetic information will have utility in estimating who in the population is at risk of having MDD, then increasing sample size of the discovery sample for GWASs of depression, regardless of the depth of phenotyping, should remain a high priority. If we seek to understand more completely the neurobiological underpinnings of more clinical forms of MDD, then as postulated by Cai and colleagues,11 minimal phenotyping will not capture all of the genetic risk for depression. However, even if studies that do not assess diagnostic criteria for MDD capture genetic risk that is nonspecific, the identified genetic risk factors contribute to the severity of depression as measured by earlier age at onset and chronicity and are therefore of major importance to elucidating the cause of depression.

Another key implication concerns investigating gene by environment interactions with childhood trauma using PRSs. Although the PRSs from all of the definitions of depression are enriched in individuals reporting trauma (eFigure 3 in the Supplement), the lifetime MDD PRS has the largest effect size. Thus, the phenotype definition in both the discovery and target samples may affect the results of PRS-by-trauma analyses, and screening for diagnostic criteria in patients and controls will be informative for untangling the association between genetic and environmental risks for depression. The increased PRSs in individuals reporting trauma are consistent with the findings of Coleman et al,24 who showed that the SNV-based heritability was higher in patients with MDD who reported childhood trauma in the UK Biobank. The higher ORs for the MDD PRS may reflect the phenotypic association of trauma with MDD compared with other definitions as described by Cai et al.11 The phenotypic association of trauma with MDD could induce gene-environment associations influenced by differences in socioeconomic status,25 which would manifest in the discovery GWASs as genetic effects.25,26 Within-family analyses will be valuable for further investigating differences in polygenic risk between patients with and without exposure to trauma.27

Limitations

The study had limitations. Both the UK Biobank discovery sample and the AGDS rely on structured diagnostic questionnaires to assess criteria for MDD. Although these instruments have been found to have good validity, an interview with a trained clinician remains the standard in diagnosing MDD, and the results of this study should be viewed with caution. Likewise, the UK Biobank, AGDS, and QSkin are cohort studies that have not recruited participants in the clinical setting. They therefore may not be representative of the full clinical spectrum of MDD in the population. In addition, participants in the discovery and target studies are mostly of British and Irish ancestries, and these results may not generalize to other ancestral groups both within and outside Europe.

Conclusions

Results of this case-control study suggest that increasing sample sizes by including patients defined in numerous ways is essential to enhancing our understanding of genetic risk for depression and generating more accurate PRSs for use in research and clinical settings. However, to see a complete picture of the biological characteristics of depression, large, well-phenotyped cohorts that are enriched for clinical depression are needed. The AGDS demonstrates that it is feasible to establish large genetically informative cohorts with in-depth online phenotyping that can provide meaningful insights into the cause of depression.

Supplement.

eMethods. Detailed Methods

eTable 1. Clinical Phenotypes in Cases Meeting MDD Criteria in the Australian Genetics of Depression Study Analyzed in the Polygenic Risk Score Analyses

eTable 2. SNV-Based Heritability and Genetic Correlations for Definitions of Depression

eTable 3. Results From Polygenic Risk Estimation of MDD and Self-report Depression Cases Using Different Definitions of Depression

eTable 4. Results From Polygenic Risk Estimation of Clinical Phenotypes of MDD in the AGDS

eTable 5. Results From Polygenic Risk Estimation of Self-report Current Health in the AGDS

eTable 6. Results From Polygenic Risk Estimation of Comorbid Physical Conditions in Cases With MDD

eFigure 1. Schematic of the Design of the Study

eFigure 2. Results From Estimation of Depression in Target Samples Using Controls Screened for Depression Only

eFigure 3. Associations of PRS From Different Definitions of Depression With Clinical Features of Depression in MDD Cases

eFigure 4. Association Between PRS Derived From Different Definitions of Depression and Self-rated Physical and Mental Health Measured on a 5-Point Scale (1 = Very Poor, 5 = Very Good) in Participants in Australian Genetics of Depression Study

eReferences

References

  • 1. GBD 2019 Diseases and Injuries Collaborators. Global burden of 369 diseases and injuries in 204 countries and territories, 1990-2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet. 2020;396(10258):1204-1222. doi: 10.1016/S0140-6736(20)30925-9
  • 2. Lynch CJ, Gunning FM, Liston C. Causes and consequences of diagnostic heterogeneity in depression: paths to discovering novel biological depression subtypes. Biol Psychiatry. 2020;88(1):83-94. doi: 10.1016/j.biopsych.2020.01.012
  • 3. Sullivan PF, Neale MC, Kendler KS. Genetic epidemiology of major depression: review and meta-analysis. Am J Psychiatry. 2000;157(10):1552-1562. doi: 10.1176/appi.ajp.157.10.1552
  • 4. Wray NR, Ripke S, Mattheisen M, et al; eQTLGen; 23andMe; Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium. Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression. Nat Genet. 2018;50(5):668-681. doi: 10.1038/s41588-018-0090-3
  • 5. CONVERGE consortium. Sparse whole-genome sequencing identifies two loci for major depressive disorder. Nature. 2015;523(7562):588-591. doi: 10.1038/nature14659
  • 6. Howard DM, Adams MJ, Clarke TK, et al; 23andMe Research Team; Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium. Genome-wide meta-analysis of depression identifies 102 independent variants and highlights the importance of the prefrontal brain regions. Nat Neurosci. 2019;22(3):343-352. doi: 10.1038/s41593-018-0326-7
  • 7. Hyde CL, Nagle MW, Tian C, et al. Identification of 15 genetic loci associated with risk of major depression in individuals of European descent. Nat Genet. 2016;48(9):1031-1036. doi: 10.1038/ng.3623
  • 8. Levey DF, Stein MB, Wendt FR, et al. GWAS of depression phenotypes in the Million Veteran Program and meta-analysis in more than 1.2 million participants yields 178 independent risk loci. medRxiv. 2020:2020.05.18.20100685. doi: 10.1101/2020.05.18.20100685
  • 9. Levey DF, Stein MB, Wendt FR, et al; 23andMe Research Team; Million Veteran Program. Bi-ancestral depression GWAS in the Million Veteran Program and meta-analysis in >1.2 million individuals highlight new therapeutic directions. Nat Neurosci. 2021;24(7):954-963. doi: 10.1038/s41593-021-00860-2
  • 10. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. 5th ed. American Psychiatric Association; 2013.
  • 11. Cai N, Revez JA, Adams MJ, et al; MDD Working Group of the Psychiatric Genomics Consortium. Minimal phenotyping yields genome-wide association signals of low specificity for major depression. Nat Genet. 2020;52(4):437-447. doi: 10.1038/s41588-020-0594-5
  • 12. Byrne EM, Kirk KM, Medland SE, et al. Cohort profile: the Australian Genetics of Depression Study. BMJ Open. 2020;10(5):e032580. doi: 10.1136/bmjopen-2019-032580
  • 13. Kessler RC, Andrews G, Mroczek D, Ustun B, Wittchen H-U. The World Health Organization Composite International Diagnostic Interview short-form (CIDI-SF). Int J Methods Psychiatr Res. 1998;7(4):171-185. doi: 10.1002/mpr.47
  • 14. Hickie IB, Davenport TA, Hadzi-Pavlovic D, et al. Development of a simple screening tool for common mental disorders in general practice. Med J Aust. 2001;175(S1)(suppl):S10-S17. doi: 10.5694/j.1326-5377.2001.tb143784.x
  • 15. Howard DM, Adams MJ, Shirali M, et al; 23andMe Research Team. Genome-wide association study of depression phenotypes in UK Biobank identifies variants in excitatory synaptic pathways. Nat Commun. 2018;9(1):1470. doi: 10.1038/s41467-018-03819-3
  • 16. Lloyd-Jones LR, Zeng J, Sidorenko J, et al. Improved polygenic prediction by bayesian multiple regression on summary statistics. Nat Commun. 2019;10(1):5086. doi: 10.1038/s41467-019-12653-0
  • 17. Bulik-Sullivan BK, Loh PR, Finucane HK, et al; Schizophrenia Working Group of the Psychiatric Genomics Consortium. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet. 2015;47(3):291-295. doi: 10.1038/ng.3211
  • 18. US Dept of Veterans Affairs. PTSD: National Centers for PTSD. Accessed July 12, 2021. www.ptsd.va.gov
  • 19. Olsen CM, Green AC, Neale RE, et al; QSkin Study. Cohort profile: the QSkin Sun and Health Study. Int J Epidemiol. 2012;41(4):929-929i. doi: 10.1093/ije/dys107
  • 20. Fritzsche K, Sandholzer H, Brucks U, et al. Psychosocial care by general practitioners—where are the problems? results of a demonstration project on quality management in psychosocial primary care. Int J Psychiatry Med. 1999;29(4):395-409. doi: 10.2190/MCGF-CLD4-0FRE-N2UK
  • 21. Vaccarino AL, Sills TL, Evans KR, Kalali AH. Prevalence and association of somatic symptoms in patients with major depressive disorder. J Affect Disord. 2008;110(3):270-276. doi: 10.1016/j.jad.2008.01.009
  • 22. Simon GE, VonKorff M, Piccinelli M, Fullerton C, Ormel J. An international study of the relation between somatic symptoms and depression. N Engl J Med. 1999;341(18):1329-1335. doi: 10.1056/NEJM199910283411801
  • 23. McIntyre RS, Konarski JZ, Mancini DA, et al. Improving outcomes in depression: a focus on somatic symptoms. J Psychosom Res. 2006;60(3):279-282. doi: 10.1016/j.jpsychores.2005.09.010
  • 24. Coleman JRI, Peyrot WJ, Purves KL, et al; on the behalf of Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium. Genome-wide gene-environment analyses of major depressive disorder and reported lifetime traumatic experiences in UK Biobank. Mol Psychiatry. 2020;25(7):1430-1446. doi: 10.1038/s41380-019-0546-6
  • 25. Marees AT, Smit DJA, Abdellaoui A, et al. Genetic correlates of socio-economic status influence the pattern of shared heritability across mental health traits. Nat Hum Behav. 2021. doi: 10.1038/s41562-021-01053-4
  • 26. Abdellaoui A, Hugh-Jones D, Yengo L, et al. Genetic correlates of social stratification in Great Britain. Nat Hum Behav. 2019;3(12):1332-1342. doi: 10.1038/s41562-019-0757-5
  • 27. Howe LJ, Nivard MG, Morris TT, et al. Within-sibship GWAS improve estimates of direct genetic effects. bioRxiv. 2021:2021.03.05.433935.

Articles from JAMA Psychiatry are provided here courtesy of American Medical Association