JAMA Psychiatry. 2020 Feb 12;77(5):1–11. doi: 10.1001/jamapsychiatry.2019.4910

An Investigation of Psychosis Subgroups With Prognostic Validation and Exploration of Genetic Underpinnings

The PsyCourse Study

Dominic B Dwyer 1, Janos L Kalman 1,2,3, Monika Budde 2, Joseph Kambeitz 4, Anne Ruef 1, Linda A Antonucci 1,5, Lana Kambeitz-Ilankovic 4, Alkomiet Hasan 1, Ivan Kondofersky 6,7, Heike Anderson-Schmidt 2,8, Katrin Gade 2,8, Daniela Reich-Erkelenz 2, Kristina Adorjan 1,2, Fanny Senner 1,2, Sabrina Schaupp 2,9, Till F M Andlauer 10, Ashley L Comes 2,3, Eva C Schulte 1,2, Farah Klöhn-Saghatolislam 1,2, Anna Gryaznova 2, Maria Hake 2, Kim Bartholdi 2, Laura Flatau-Nagel 2, Markus Reitt 8, Silke Quast 8, Sophia Stegmaier 11, Milena Meyers 12, Barbara Emons 12, Ida Sybille Haußleiter 12, Georg Juckel 12, Vanessa Nieratschker 11, Udo Dannlowski 13, Tomoya Yoshida 14, Max Schmauß 9, Jörg Zimmermann 15, Jens Reimer 16,17, Jens Wiltfang 8,18,19, Eva Reininghaus 20, Ion-George Anghelescu 21, Volker Arolt 13, Bernhard T Baune 13,22,23, Carsten Konrad 24, Andreas Thiel 24, Andreas J Fallgatter 11, Christian Figge 25, Martin von Hagen 26, Manfred Koller 27, Fabian U Lang 28, Moritz E Wigand 28, Thomas Becker 28, Markus Jäger 28, Detlef E Dietrich 29,30,31, Harald Scherk 32, Carsten Spitzer 33, Here Folkerts 34, Stephanie H Witt 35, Franziska Degenhardt 36,37, Andreas J Forstner 36,37,38,39, Marcella Rietschel 35, Markus M Nöthen 36,37, Nikola Mueller 6, Sergi Papiol 1,2, Urs Heilbronner 2, Peter Falkai 1, Thomas G Schulze 2, Nikolaos Koutsouleris 1,40
PMCID: PMC7042925  PMID: 32049274

Key Points

Question

Will data-driven clustering using high-dimensional clinical data reveal psychosis subgroups with relevance to prognoses and polygenic risk?

Findings

In this cohort study including 1223 individuals, in the discovery sample of 765 individuals with predominantly bipolar and schizophrenia diagnoses, 5 subgroups were detected with different clinical signatures, illness trajectories, and genetic scores for educational attainment. Results were validated in a sample of 458 individuals.

Meaning

New data-driven clustering paired with rigorous validation may offer a means to extend symptom-based psychosis taxonomies toward functional outcomes, genetic markers, and trajectory-based stratifications.


This cohort study aims to detect psychosis subgroups and examine their illness courses over 1.5 years and their polygenic scores for schizophrenia, bipolar disorder, major depressive disorder, and educational achievement.

Abstract

Importance

Identifying psychosis subgroups could improve clinical and research precision. Research has focused on symptom subgroups, but there is a need to consider a broader clinical spectrum, disentangle illness trajectories, and investigate genetic associations.

Objective

To detect psychosis subgroups using data-driven methods and examine their illness courses over 1.5 years and polygenic scores for schizophrenia, bipolar disorder, major depressive disorder, and educational achievement.

Design, Setting, and Participants

This ongoing multisite, naturalistic, longitudinal (6-month intervals) cohort study began in January 2012 across 18 sites. Data from a referred sample of 1223 individuals (765 in the discovery sample and 458 in the validation sample) with DSM-IV diagnoses of schizophrenia, bipolar affective disorder (I/II), schizoaffective disorder, schizophreniform disorder, and brief psychotic disorder were collected from secondary and tertiary care sites. Discovery data were extracted in September 2016 and analyzed from November 2016 to January 2018, and prospective validation data were extracted in October 2018 and analyzed from January to May 2019.

Main Outcomes and Measures

A clinical battery of 188 variables measuring demographic characteristics, clinical history, symptoms, functioning, and cognition was decomposed using nonnegative matrix factorization clustering. Subtype-specific illness courses were compared with mixed models and polygenic scores with analysis of covariance. Supervised learning was used to replicate results in validation data with the most reliably discriminative 45 variables.

Results

Of the 765 individuals in the discovery sample, 341 (44.6%) were women, and the mean (SD) age was 42.7 (12.9) years. Five subgroups were found and labeled as affective psychosis (n = 252), suicidal psychosis (n = 44), depressive psychosis (n = 131), high-functioning psychosis (n = 252), and severe psychosis (n = 86). Illness courses with significant quadratic interaction terms were found for psychosis symptoms (R² = 0.41; 95% CI, 0.38-0.44), depression symptoms (R² = 0.28; 95% CI, 0.25-0.32), global functioning (R² = 0.16; 95% CI, 0.14-0.20), and quality of life (R² = 0.20; 95% CI, 0.17-0.23). The depressive and severe psychosis subgroups exhibited the lowest functioning and quadratic illness courses with partial recovery followed by reoccurrence of severe illness. Differences were found for educational attainment polygenic scores (mean [SD] partial η² = 0.014 [0.003]) but not for diagnostic polygenic risk. Results were largely replicated in the validation cohort.

Conclusions and Relevance

Psychosis subgroups were detected with distinctive clinical signatures and illness courses and specificity for a nondiagnostic genetic marker. New data-driven clinical approaches are important for future psychosis taxonomies. The findings suggest a need to consider short-term to medium-term service provision to restore functioning in patients stratified into the depressive and severe psychosis subgroups.

Introduction

Schizophrenia and bipolar diagnoses group patients on the basis of shared patterns of psychiatric history, symptoms, and illness courses.1,2 The categories drive research and clinical practice,3 despite evidence of symptomatic ambiguity,4,5 heterogeneous illness courses,6 and overlapping genetic risk profiles.7,8,9 The existing taxonomy has been questioned as a result,10,11 and new psychosis configurations are under consideration.3,12,13,14,15

Novel symptom subgroups15,16,17,18,19 or dimensions13,14,15,20,21,22 have been proposed using unbiased statistical approaches, but assessing symptoms alone does not account for the complexity of disease phenotypes involving the patients’ history, illness course, cognition, or daily functioning.1,12,20 These factors were important to historical taxonomic formulations,1,23 critical to subgroup hypotheses (eg, deficit schizophrenia12,21,22), and have biological associations suggestive of a genetic23,24 or brain25 diathesis. While some unbiased studies have included such additional information,26 the breadth and depth of measures do not match the detail of naturalistic clinical assessments that originally defined psychosis categories.1,26 If diagnoses are to be reconsidered, then it is important to assess a similarly broad range of variables.

New clustering methods27,28,29,30 provide an opportunity to define subgroups based on high-dimensional data31 rather than biasing analyses by focusing on single domains. Fundamentally, the methods use computational approaches to define stable, separable, and interpretable subgroup solutions. Thus, the first aim of this study was to adapt clustering methods used in oncology27,28,29,30 to identify psychosis subgroups with a highly representative phenomenological battery of variables in a sample of individuals with established diagnoses of psychotic disorders (mainly schizophrenia and bipolar disorders). Rather than prespecifying variables, a data-driven approach was used to refine a large battery that would ideally be collected during comprehensive clinical assessments for clinical management and treatment planning.

Longitudinal studies of illness course were also critical to traditional psychosis nosologies1,32 and are an essential component of a clinically valid and useful diagnosis.11 However, with some notable exceptions,33 studies have been mostly confined to trajectories based on diagnostic groups,34,35,36 symptoms,33,34 or general functioning36,37 with long follow-up periods that obscure illness dynamics relevant in a clinical setting where decisions need to be made over the course of weeks or months rather than years. As such, the second aim of the study was to validate the subgroups using longitudinal data collected in 6-month intervals over 1.5 years based on symptom and functioning measures.

Evidence of specific genetic signatures has critically informed psychosis categorizations3,9,35,37,38 and dimensions.9,39,40,41 For example, polygenic risk for schizophrenia is positively related to psychotic symptoms,9,39,40,41 treatment resistance,24 and hospitalizations.42 Recently, genetic risk scores for reduced educational attainment have also suggested a cognitive disorder subtype of schizophrenia.40 However, to our knowledge, such genetic risk scores have not been compared in diagnostically mixed subgroups derived from unbiased statistical clustering. Thus, the third aim of the study was to investigate how the subgroups differed based on polygenic scores for schizophrenia, bipolar disorder, major depressive disorder, and educational attainment.

Based on the established psychiatric nosologies as well as prior research,12,22 we expected to find a subgroup of individuals with nonaffective diagnoses and an early illness onset, sustained poor functioning, persistent symptoms, high schizophrenia polygenic scores, and low educational attainment scores. On the other end of the spectrum, we expected to see a subgroup of individuals with affective psychosis characterized by less severe symptoms, high bipolar polygenic risk, and retained functioning.43

Methods

Participants

Participants were included from an ongoing multisite, naturalistic, longitudinal cohort study being conducted in 18 clinical sites in Germany and Austria beginning in January 2012 (Pathomechanisms and Signature in the Longitudinal Course of Psychosis [PsyCourse]44; http://www.psycourse.de; SCHU 1603/4-1, 5-1, 7-1). Adult participants 18 years and older were identified based on referrals from the clinical staff or by querying patient registries (eMethods 1 in the Supplement). The following diagnoses were included using the Structured Clinical Interview for DSM-IV, Axis I Disorders (SCID-I45): recurrent depressive disorder, bipolar II disorder, bipolar I disorder, schizoaffective disorder, brief psychotic disorder, schizophreniform disorder, and schizophrenia. Individuals were excluded if they did not meet DSM-IV diagnostic criteria, had insufficient language ability, or were intellectually impaired. The study protocol was ethically approved by independent committees at each site, all participants gave written informed consent, and raters were trained.

The present study divided the sample into a discovery set consisting of 765 individuals, after excluding 126 individuals with 25% or more missing baseline data (eMethods 4 in the Supplement), from the PsyCourse version 1.0 data release (September 2016) and an unmatched validation data set consisting of an additional 458 individuals from the PsyCourse version 3.0 data release (October 2018). The split into data releases was arbitrary, determined by the convenience sample acquisition rate and data processing times. Individuals excluded because of missing data from the discovery cohort had higher psychotic and depressive symptoms, and the validation cohort included a higher percentage of individuals with recurrent depression (eTable 3 in the Supplement). For details, see eMethods 1 and 4 in the Supplement and Budde et al.44

Baseline and Longitudinal Measures

The analysis overview is explained in eMethods 3 and eFigure 1 in the Supplement. A data-driven approach was used to select baseline variables. From an initial inclusive list of 230 variables that included single test items (eMethods 2 and eTable 1 in the Supplement), variables were excluded if more than 95% of values were equal (n = 10) or more than 25% of values were missing (n = 32) (eTable 2 in the Supplement), before being scaled (0, 1) and imputed using a nearest-neighbor method46 (3.5% imputed). This resulted in a total of 188 remaining baseline variables (eTable 1 in the Supplement) assessing the domains of medical history (eg, family history, hospitalizations), symptoms (eg, psychosis, suicidality), cognition (eg, attention, speed, working memory, verbal IQ), and functioning (eg, self-reported and clinician-reported). Specific variables were selected a priori to examine longitudinal courses (eMethods 7 in the Supplement), including the Positive and Negative Symptom Scale (PANSS),47 the Inventory of Depressive Symptomatology (IDS-C30),48 the Young Mania Rating Scale (YMRS),49 the World Health Organization Quality of Life Questionnaire, brief version (WHOQoL-BREF),50 and the Global Assessment of Functioning (GAF).51
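The variable-selection steps described above (dropping near-constant and sparsely observed variables, scaling to the 0-1 range, then nearest-neighbor imputation) can be sketched as follows. This is a minimal illustration in Python, not the study's MATLAB code: the function name `preprocess`, the number of neighbors `k`, the distance measure, and the choice to compute the "equal values" share among observed values only are all assumptions made for the example.

```python
import math

def preprocess(rows, max_equal=0.95, max_missing=0.25, k=3):
    """Filter, scale, and impute a data table (None marks a missing value).

    Returns the indices of retained columns and the imputed, scaled rows.
    """
    n = len(rows)
    keep = []
    for j in range(len(rows[0])):
        present = [r[j] for r in rows if r[j] is not None]
        # Drop variables with too many missing values.
        if not present or 1 - len(present) / n > max_missing:
            continue
        # Drop variables where one value dominates (near-constant).
        most_common = max(present.count(v) for v in set(present))
        if most_common / len(present) > max_equal:
            continue
        keep.append(j)

    # Min-max scale every retained column to [0, 1], leaving gaps as None.
    scaled = [[None] * len(keep) for _ in rows]
    for c, j in enumerate(keep):
        present = [r[j] for r in rows if r[j] is not None]
        lo, hi = min(present), max(present)
        for i, r in enumerate(rows):
            if r[j] is not None:
                scaled[i][c] = (r[j] - lo) / (hi - lo)

    def dist(a, b):
        # Root-mean-square distance over columns observed in both rows.
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        return math.sqrt(sum(shared) / len(shared)) if shared else math.inf

    # Fill each gap with the mean of the k nearest rows that observed it.
    imputed = [row[:] for row in scaled]
    for i, row in enumerate(scaled):
        for c, value in enumerate(row):
            if value is None:
                donors = sorted(
                    (dist(row, other), other[c])
                    for m, other in enumerate(scaled)
                    if m != i and other[c] is not None
                )[:k]
                imputed[i][c] = sum(v for _, v in donors) / len(donors)
    return keep, imputed
```

Scaling before imputation mirrors the order given in the text, so that distances between participants are not dominated by variables with large raw ranges.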

Statistical Analysis

Clustering Analysis

We aimed to find stable, interpretable, and clinically separable subgroups by adapting and extending a novel clustering approach (nonnegative matrix factorization [NMF] consensus clustering) that has been successfully used in oncology27,28,29,30 to the 188 variables included at the baseline time point (eMethods 5 in the Supplement). The technique reduces data into parsimonious factors, selects clusters based on their stability, and can identify nonlinear and non-Gaussian boundaries.31 MATLAB release R2015b (MathWorks) was used for analyses, and code is available on request.
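To make the idea of stability-based NMF clustering concrete, the sketch below factorizes a nonnegative participants-by-variables matrix several times from different random starts, assigns each participant to the factor with the largest loading, and records how often each pair of participants lands in the same cluster. It is a simplified stand-in, assuming plain multiplicative-update NMF and a co-clustering consensus matrix; the sparse NMF variant and the selection procedure actually used in the study are described in eMethods 5 and are not reproduced here. All function names are hypothetical.

```python
import random

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Factorize V (n x m, nonnegative) as W (n x k) times H (k x m)
    using multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H

def consensus(V, k, runs=10):
    """Fraction of runs in which each pair of rows shares a cluster."""
    n = len(V)
    C = [[0.0] * n for _ in range(n)]
    for seed in range(runs):
        W, _ = nmf(V, k, seed=seed)
        # Each row joins the factor on which it loads most strongly.
        labels = [max(range(k), key=lambda a: row[a]) for row in W]
        for i in range(n):
            for j in range(n):
                C[i][j] += (labels[i] == labels[j]) / runs
    return C
```

A consensus matrix with entries close to 0 or 1 indicates that the cluster assignments are reproducible across random restarts, which is one common way to compare candidate numbers of clusters.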

Genotyping and Calculation of Polygenic Risk Scores

Samples were genotyped using the Infinium CoreExome-24+Human PsyChip Consortium, versions 1.0 and 1.1 (Illumina). Standard procedures were used to calculate polygenic risk scores (PRSs) based on the latest summary statistics of genome-wide association studies of schizophrenia,52 bipolar disorder,53 major depressive disorder,54 and educational attainment.55 Genetic risk burden was calculated for each individual at 10 commonly used56 PRS thresholds between P < .0005 and P > .99 with established methods57 (eMethods 6 and eFigure 8 in the Supplement).
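As a rough illustration of thresholded polygenic scoring, the sketch below sums risk-allele dosages weighted by GWAS effect sizes over the variants whose association P value falls below a cutoff, and repeats this for several cutoffs. It assumes the common weighted-sum formulation; the function names, the toy variants, and the cutoff values are hypothetical, and steps such as quality control and linkage-disequilibrium clumping (covered by the study's eMethods 6) are omitted.

```python
def polygenic_score(dosages, weights, pvalues, threshold):
    """Weighted sum of allele dosages over variants with GWAS P < threshold.

    dosages: risk-allele counts (0-2) for one individual, one per variant.
    weights: per-variant effect sizes from GWAS summary statistics.
    pvalues: per-variant GWAS association P values.
    """
    return sum(d * w for d, w, p in zip(dosages, weights, pvalues)
               if p < threshold)

def scores_across_thresholds(dosages, weights, pvalues, thresholds):
    """One score per P-value cutoff, keyed by the cutoff."""
    return {t: polygenic_score(dosages, weights, pvalues, t)
            for t in thresholds}

# Toy example with three variants (illustrative values only).
example = scores_across_thresholds(
    dosages=[0, 1, 2],
    weights=[0.1, 0.2, -0.05],
    pvalues=[1e-6, 0.003, 0.4],
    thresholds=[0.001, 0.01, 0.5],
)
```

Lenient cutoffs include more variants and so capture more of the polygenic signal along with more noise, which is why results are usually reported across a range of thresholds rather than at a single one.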

Subgroup Characterization

Subgroups were clinically characterized by investigating NMF scores and comparing subgroups across baseline variables (using analysis of variance and χ² tests; MATLAB release R2015b; significance set at a false-discovery rate–corrected 2-tailed P value less than .05) based on previous research.12,22,58 Illness course was investigated over the 3 longitudinal time points at 6-month intervals using mixed models (R version 3.6.0 [The R Foundation]; significance set at a false-discovery rate–corrected 2-tailed P value less than .05) (eMethods 7 in the Supplement). Polygenic risk scores were analyzed with analysis of covariance, using the first 4 ancestry principal components (from principal component analysis) as covariates and testing PRS subgroup differences across the 10 selected thresholds (SPSS version 25 [IBM]; significance set at a 2-tailed P value less than .05).9,39
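The text does not name the false-discovery rate procedure. Assuming the widely used Benjamini-Hochberg step-up method, the correction can be sketched as below; the function name is hypothetical, and the returned adjusted P values would be compared against .05.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

For example, the raw P values .01, .04, .03, and .005 become .02, .04, .04, and .02 after adjustment, so all four would remain significant at the .05 level.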

Subgroup Validation

Using 188 baseline features to determine subgroups enhances the similarity to clinical reality but limits replicability and clinical utility. To simultaneously address these limitations and validate the subgroups, we used a separate supervised machine learning analysis using NeuroMiner (https://github.com/neurominer-git; MATLAB release 2017a) (eMethods 8 in the Supplement) to (1) reduce dimensionality by building a subgroup classifier using the top 10 most highly weighted features from each NMF factor in the discovery sample and (2) apply the models to the 458 individuals from the validation cohort to assign subgroup labels. Using the assigned subgroup labels, we then determined the NMF factor loadings for each subgroup, clinically compared the subgroups, analyzed PRS differences, and applied the mixed-model trajectory models (eMethods 8 in the Supplement).
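The feature-reduction step in (1) can be illustrated with a short sketch that takes the most highly weighted variables from each factor and merges them into a single list. It assumes the factor weights are available as one row per factor; the function and variable names are hypothetical, and the classifier training done in NeuroMiner is not shown.

```python
def top_features_per_factor(factor_weights, names, n_top=10):
    """Union of the n_top highest-weighted variables of every factor.

    factor_weights: one list of variable weights per factor.
    names: variable names, in the same order as the weights.
    """
    selected = []
    for weights in factor_weights:
        ranked = sorted(range(len(names)), key=lambda j: weights[j],
                        reverse=True)
        for j in ranked[:n_top]:
            # A variable that ranks highly on two factors is kept once.
            if names[j] not in selected:
                selected.append(names[j])
    return selected
```

Because a variable can rank among the top features of more than one factor, the merged list can be shorter than the number of factors multiplied by `n_top`.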

Results

Sample Characteristics

Among the 765 individuals in the discovery sample, 341 (44.6%) were women, and the mean (SD) age was 42.7 (12.9) years (Table 1). These individuals received DSM-IV diagnoses of recurrent major depression (n = 2 [0.3%]), bipolar II disorder (n = 60 [7.8%]), bipolar I disorder (n = 256 [33.5%]), schizoaffective disorder (n = 73 [9.5%]), brief psychotic disorder (n = 6 [0.8%]), schizophreniform disorder (n = 10 [1.3%]), and schizophrenia (n = 358 [46.8%]). All patients were included in longitudinal mixed-model analyses and had an average (SD) of 2.5 (0.1) follow-up assessments. There were a total of 458 patients included in the validation sample at baseline (eTable 3 in the Supplement) and 453 included in longitudinal analyses (5 were excluded with completely missing data), with an average (SD) of 1.8 (0.1) follow-up assessments.

Table 1. Characterization of Main Sample and Subgroups [a]

Values are No. (%) unless otherwise indicated.

| Variable | Total (N = 765) | Affective Psychosis (n = 252) | Suicidal Psychosis (n = 44) | Depressive Psychosis (n = 131) | High-Functioning Psychosis (n = 252) | Severe Psychosis (n = 86) | F/χ² (df) | P Value | η² |
|---|---|---|---|---|---|---|---|---|---|
| Age, mean (SD), y | 42.7 (12.9) | 50.3 (11.6) | 45.5 (11.0) | 41.8 (11.8) | 36.6 (11.2) | 38.5 (11.9) | 49.09 (4, 760) | 9.41 × 10⁻³⁷ | 0.21 |
| Male | 424 (55.4) | 105 (41.7) | 13 (30) | 62 (47.3) | 192 (76.2) | 52 (60) | 79.58 (10) | 2.14 × 10⁻¹⁶ | 0.32 |
| <12 y of school [b] | 170 (22.2) | 55 (21.8) | 6 (14) | 30 (22.9) | 38 (15.1) | 41 (48) | 14.23 (4, 748) | 3.37 × 10⁻¹¹ | 0.07 |
| Paid employment | 287 (38.0) | 122 (49.0) | 14 (32) | 34 (26.2) | 96 (38.7) | 21 (25) | 27.68 (10) | 1.45 × 10⁻⁵ | 0.19 |
| Onset (first inpatient year) | 30.2 (11.5) | 35.6 (13.0) | 28.3 (9) | 31.4 (10.9) | 25.7 (8.8) | 27.2 (9) | 28.50 (4, 728) | 5.07 × 10⁻²² | 0.14 |
| Duration of illness, mean (SD), y | 12.5 (10.5) | 14.9 (12.1) | 16.6 (11.1) | 10.2 (9.7) | 11.2 (9.0) | 11.1 (9.0) | 7.72 (4, 727) | 4.28 × 10⁻⁶ | 0.04 |
| Diagnosis | | | | | | | | | |
| Bipolar II disorder | 60 (7.8) | 29 (11.5) | 5 (11) | 16 (12.2) | 10 (4.0) | 0 | 21.45 (10) | 2.57 × 10⁻⁴ | 0.17 |
| Bipolar I disorder | 256 (33.5) | 125 (49.6) | 16 (36) | 45 (34.4) | 69 (27.4) | 1 (1) | 74.18 (10) | 2.97 × 10⁻¹⁵ | 0.31 |
| Schizoaffective disorder | 73 (9.5) | 19 (7.5) | 9 (20) | 13 (9.9) | 24 (9.5) | 8 (9) | NA | NA | NA |
| Schizophrenia | 358 (46.8) | 72 (28.6) | 14 (32) | 53 (40.5) | 143 (56.7) | 76 (88) | NA | NA | NA |
| Symptoms | | | | | | | | | |
| PANSS total score, mean (SD) | 51.7 (18.4) | 42.4 (11.0) | 51.8 (17.3) | 58.7 (14.6) | 45.3 (12.5) | 84.3 (12.8) | 195.84 (4, 717) | 2.04 × 10⁻¹¹³ | 0.52 |
| IDS-C30 sum score, mean (SD) | 12.6 (10.3) | 8.6 (6.4) | 15.6 (7.2) | 28.0 (8.9) | 9.1 (6.6) | 9.9 (10.1) | 148.70 (4, 660) | 1.31 × 10⁻⁹⁰ | 0.47 |
| YMRS sum score, mean (SD) | 3.0 (5.2) | 3.3 (5.9) | 3.6 (5.3) | 3.2 (4.4) | 2.4 (4.2) | 3.4 (6.2) | 1.49 (4, 737) | NA | NA |
| Functioning | | | | | | | | | |
| GAF score, mean (SD) | 57.5 (13.9) | 62.5 (12.6) | 56.1 (11.0) | 51.1 (12.1) | 59.7 (13.6) | 46.7 (12.5) | 35.17 (4, 753) | 5.89 × 10⁻²⁷ | 0.16 |
| WHOQoL-BREF global score, mean (SD) | 12.7 (3.7) | 14.5 (3.0) | 11.2 (3.4) | 8.9 (3.1) | 13.5 (3.2) | 11.7 (3.6) | 70.49 (4, 716) | 2.43 × 10⁻⁵⁰ | 0.28 |
| Verbal IQ | | | | | | | | | |
| MWTB score, mean (SD) | 27.9 (5.1) | 29.1 (4.7) | 27.9 (6.5) | 27.5 (4.5) | 27.8 (4.9) | 24.1 (5.8) | 10.28 (4, 631) | 4.58 × 10⁻⁸ | 0.06 |

Abbreviations: GAF, Global Assessment of Functioning; IDS-C30, Inventory of Depressive Symptomatology; MWTB, Mehrfachwahl-Wortschatz Test; NA, not applicable; PANSS, Positive and Negative Symptom Scale; WHOQoL-BREF, World Health Organization Quality of Life Questionnaire, brief version; YMRS, Young Mania Rating Scale.

[a] Statistical comparisons (analysis of variance/χ²) only test differences between subgroups derived from clustering. Results reported exceed false-discovery rate correction for multiple comparisons (P < .05). Additional comparisons are presented in eTable 5 in the Supplement.

[b] Less than 12 years of school was based on previous enrollment in hauptschule (secondary school) within the German schooling system.
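For the analysis of variance rows in Table 1, the reported η² can be recovered from the F statistic and its degrees of freedom, which offers a quick consistency check. The sketch below assumes a one-way design and reads the degrees of freedom for age as 4 (5 subgroups minus 1) and 760; the function name is hypothetical.

```python
def eta_squared_from_f(f_value, df_effect, df_error):
    """Eta squared for a one-way ANOVA, computed from F and its df.

    Follows from F = (SS_effect / df_effect) / (SS_error / df_error)
    and eta squared = SS_effect / (SS_effect + SS_error).
    """
    return f_value * df_effect / (f_value * df_effect + df_error)

# Age row of Table 1: F = 49.09 on (4, 760) degrees of freedom.
age_eta_squared = round(eta_squared_from_f(49.09, 4, 760), 2)
```

The same calculation applied to the PANSS total score row (F = 195.84 on 4 and 717 degrees of freedom) gives a value that rounds to the tabulated 0.52.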

Subgroup and Factor Solution

Five subgroups were identified (Figure 1; eResults 1 and eFigure 2 in the Supplement), as mediated by a factor solution comprising coherent mixtures of demographic, diagnostic, treatment, symptom, and functioning questionnaire items (Table 2; eResults 2 and eTable 4 in the Supplement). Broadly, the 5 factors could be summarized as representing quality of life, suicide history, depression symptoms and deficits, environmental risk and male sex, and psychosis symptoms and deficits. The clustering approach27,28,29,30 links the factorization with the subgroup determination (eMethods 5 in the Supplement); thus, the 5 subgroups preferentially loaded on each factor.

Figure 1. Multigroup Radar Plots Demonstrating Distinctive Factor Loading Patterns for Each of the 5 Subgroups.

A, The mean factor loadings from each subgroup within the discovery sample corresponding to functioning and quality of life (factor 1), suicide history (factor 2), depression symptoms and deficits (factor 3), environmental risk and male sex (factor 4), and psychosis symptoms and deficits (factor 5). B, The factor loadings separated for each subgroup containing the discovery sample factor loading (dark shade) and when the discovery sample factor loading solution was applied without modification to the replication sample factor loading (light shade). For the replication sample, the subgroup assignments were independent of the calculation of the sparse nonnegative matrix factorization components in the discovery sample.

Table 2. Top 10 Features for Each of the 5 Factors [a]

| Feature | Functioning and Quality of Life | Suicide History | Depression Symptoms and Deficits | Environmental Risk and Male Sex | Psychosis Symptoms and Deficits |
|---|---|---|---|---|---|
| 1 | In a relationship | Past suicide attempt | Medication change | Single marital status | Impaired work |
| 2 | WHOQoL-BREF item 13 | Suicide ideation severity | Impaired work | Illicit drug history | Patient status |
| 3 | WHOQoL-BREF item 24 | Methods of suicide | IDS-C30 item 5 | WHOQoL-BREF item 3 | Ever had hallucinations |
| 4 | WHOQoL-BREF item 14 | Suicide preparation | Family history | Smokes tobacco | PANSS item P1 |
| 5 | Treated as outpatient | Suicidal ideation | IDS-C30 item 16 | Male sex | PANSS total score |
| 6 | WHOQoL-BREF item 25 | Number of past suicide attempts | IDS-C30 total score | Native German speaker | PANSS positive score |
| 7 | WHOQoL-BREF item 6 | Suicide note history | Infectious disease history | Ever had delusions | PANSS item P2 |
| 8 | WHOQoL-BREF item 12 | Suicide note thoughts | Ever had delusions | Education level | PANSS negative score |
| 9 | WHOQoL-BREF environmental | Family psychiatric history | IDS-C30 item 15 | WHOQoL-BREF item 15 | PANSS item N1 |
| 10 | WHOQoL-BREF item 20 | Treated as outpatient | Adverse medication event | WHOQoL-BREF item 13 | PANSS general score |

Abbreviations: IDS-C30, Inventory of Depressive Symptomatology; PANSS, Positive and Negative Symptom Scale; WHOQoL-BREF, World Health Organization Quality of Life Questionnaire, brief version.

[a] See eTable 4 in the Supplement for full descriptions of items and mean factor weights.

Subgroups were ordered based on the percentage of individuals with a diagnosis of schizophrenia (eFigure 3 in the Supplement) and interpreted in reference to their association with the factor scores (Figure 1) and a battery of commonly used variables (Table 1; eTable 5 in the Supplement). The first subgroup (n = 252) was labeled as the affective psychosis subgroup and was associated with a mean (SD) age at first inpatient treatment of 35.6 (13.0) years, female sex, mild symptom severity, and high levels of functioning and education. In contrast, subgroup 5 (n = 86) was labeled as the severe psychosis subgroup, as it contained patients with schizophrenia diagnoses and substantially lower educational achievement (with 41 [48%] having less than 12 years of schooling), low verbal intelligence, male sex, high symptoms of psychosis (but not of depression or mania), and low GAF scores. Supplementary analyses also showed that this was the only subgroup that could be identified when using cognitive variables (eResults 6, eFigure 12, and eTable 18 in the Supplement).

The remaining subgroups could be distinguished between the extremes of high-functioning affective psychosis and severe psychosis (Table 1; eTable 5 in the Supplement). Subgroup 2 (n = 44) was labeled as suicidal psychosis since it was most distinguishable by a high loading on the suicide factor (eFigure 2 in the Supplement), but a high percentage of women (70% [31 of 44]) and moderate symptoms and functioning were also notable. Subgroup 3 (n = 131) was labeled as the depressive psychosis subgroup because it loaded highly on the depressive factor, and its mean IDS-C30 score (28.0) was nearly double that of the next highest subgroup (15.6; Table 1). Subgroup 4 (n = 252) was labeled as the high-functioning psychosis subgroup, which consisted of predominantly men (76.2% [192 of 252]) with relatively low symptom levels and relatively high global functioning.

Supplementary analyses were also conducted. To control for diagnosis, we investigated subgroup differences only in 358 individuals with schizophrenia and found similar results (eTable 6 in the Supplement). Site differences between subgroups were found (eTable 7 in the Supplement), but further analyses reduced the possibility of systematic rater and site biases (eResults 3, eFigures 9 and 10, and eTables 8, 14, 15, 16, and 17 in the Supplement). Factor solutions were stable when preprocessing parameters were changed (eResults 5 and eFigure 11 in the Supplement).

Illness Course Analyses

Mixed models containing subgroup, linear, quadratic, and interaction terms were found for PANSS (R² = 0.41; 95% CI, 0.38-0.44), IDS-C30 (R² = 0.28; 95% CI, 0.25-0.32), GAF (R² = 0.16; 95% CI, 0.14-0.20), and WHOQoL-BREF (R² = 0.20; 95% CI, 0.17-0.23) (Figure 2; eTable 9 in the Supplement). The interaction of subgroup with quadratic trends significantly improved the models and was analyzed in post hoc analyses (eTable 10 in the Supplement). Pairwise tests of quadratic trends revealed increases in the severe psychosis subgroup (PANSS and GAF) and the depressive psychosis subgroup (PANSS, IDS-C30, GAF, and WHOQoL-BREF) (Figure 2). Supplementary analyses demonstrated that the quadratic illness course of the severe psychosis subgroup occurred against a background of long-term ongoing illness (eResults 7 and eFigure 13 in the Supplement). Attrition was noted, but controlling for this variable did not affect longitudinal estimates (eResults 8 and eTables 20 and 21 in the Supplement).
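One way to write down a model with the terms listed above is given here as an illustrative formulation only. It assumes a random intercept for each participant and treats subgroup 1 as the reference category; the exact random-effects structure and coding used in the study are specified in eMethods 7 and may differ. Here y_{it} is the outcome (eg, PANSS total) for participant i at assessment t, and G_i is that participant's subgroup.

```latex
y_{it} = \beta_0 + \beta_1 t + \beta_2 t^2
       + \sum_{g=2}^{5} \mathbf{1}[G_i = g]\,
         \bigl(\gamma_g + \delta_g t + \theta_g t^2\bigr)
       + u_i + \varepsilon_{it},
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2), \quad
\varepsilon_{it} \sim \mathcal{N}(0, \sigma^2)
```

In this notation the coefficients \theta_g carry the subgroup-by-quadratic interaction: a nonzero \theta_g means that the curvature of the illness course in subgroup g differs from that of the reference subgroup, which is the kind of difference the post hoc pairwise tests examine.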

Figure 2. Illness Course Comparisons in 6-Month Intervals.

Linear mixed-model quadratic trend analyses corrected for multiple comparisons (false-discovery rate correction). A, Psychosis symptoms as measured by the Positive and Negative Symptom Scale (PANSS) demonstrated significant differences in quadratic trends of the severe psychosis and depressive psychosis subgroups. B, Depressive symptoms as measured by the Inventory of Depressive Symptomatology (IDS-C30) demonstrated differences in the quadratic trend of the depressive psychosis subgroup compared with all other subgroups. C, General functioning as measured with the clinician-reported Global Assessment of Functioning (GAF) demonstrated differences between the severe psychosis subgroup and all other subgroups except the depressive psychosis subgroup. D, Quality of life as measured with the self-reported World Health Organization Quality of Life Questionnaire, brief version (WHOQoL-BREF), indicated differences between the depressive psychosis subgroup and both the affective psychosis and the higher-functioning psychosis subgroups. Data points indicate mean scores; error bars indicate SEs; lines connecting data points indicate fitted quadratic trends; and brackets indicate significant differences in quadratic trends. See eTables 9 and 10 in the Supplement for detailed statistics.

Polygenic Score Analyses

The highest effect sizes separating the subgroups were found for educational attainment polygenic score (mean [SD] partial η² = 0.014 [0.003]; Figure 3). Significance was found across 8 of 10 thresholds (partial η² > 0.015; uncorrected P < .05), and post hoc analyses indicated a reduction of the educational attainment polygenic score in the severe psychosis subgroup (eFigure 7 in the Supplement). For comparison purposes, PRS differences between diagnostic subgroups (DSM-IV) were analyzed (Figure 3; eFigure 6 in the Supplement). Results demonstrated expected large effects for the schizophrenia and bipolar polygenic scores but not for the major depression or educational attainment polygenic scores.

Figure 3. Polygenic Risk Score Effect Sizes for Subgroup and Diagnostic Comparisons.

Differences based on genome-wide association studies for schizophrenia, bipolar disorder, major depressive disorder, and educational attainment calculated using analysis of covariance. For each polygenic score, 10 values are presented that reflect commonly used P value cutoff values56 for the single-nucleotide polymorphisms included in making the score. P values are displayed for the highest significant effect sizes across cutoff values for each polygenic score.56 A, When comparing the subgroups, the highest effect sizes were found for the education polygenic score, which was significant (uncorrected P < .05) across all P value cutoff values except for P < .0001 and P < .01. No other polygenic scores were significant across any threshold. B, When comparing diagnostic subgroups, significant differences were found for the schizophrenia (except for the P < .001 cutoff) and bipolar disorder (except for the P < .05 cutoff) polygenic scores but not for major depressive disorder or educational attainment. For further details, see eFigures 5 and 6 in the Supplement.

Validation of Subgroups

Subgroups from the discovery sample could be robustly separated using the features in Table 2 as expected (eTable 11 in the Supplement). Application of the models to the validation cohort replicated the detected phenotypes for the factor solutions (Figure 1; eResults 4 and eTable 13 in the Supplement), distinguishing baseline clinical features (eTable 12 in the Supplement), longitudinal courses (eFigure 4 in the Supplement), and educational attainment polygenic scores (eFigures 5, 6, and 7 in the Supplement). However, the suicidal psychosis subgroup was not well replicated in longitudinal analyses because of missing data (eFigure 4 and eTable 19 in the Supplement). Also, in contrast to the discovery sample, a comparative increase in the effect size of subgroup differences using the schizophrenia PRS was found (eFigure 5 in the Supplement).

Discussion

Five psychosis subgroups were detected demonstrating distinctive clinical signatures, 18-month illness courses, and polygenic scores for educational achievement. The identified subgroup solution has not been reported before within a single study, to our knowledge. In partial agreement with current diagnostic systems, the results broadly supported the hypotheses of affective and nonaffective subgroup extrema. However, our results refine these groups and critically introduce intermediate subgroups with mixed diagnoses and divergent functional outcomes.

The identification of a severe psychosis subgroup with a poor educational history, high psychosis symptoms, and low functioning broadly agrees with longstanding hypotheses of a deficit form of schizophrenia12,21,22 with a developmental and/or genetic origin. However, the symptom profile of individuals in this group was not limited to negative symptoms, and their illness course was suggestive of remitting symptomatic (eg, PANSS) and remitting-relapsing functional (eg, GAF) patterns rather than stable impairment. These results imply that such individuals benefit from treatment and also that functional interventions should at least cover a period of up to 18 months to prevent functional relapse. Lifetime illness course in this subgroup was predominantly estimated to be chronic, which highlights the importance of studying such shorter illness dynamics that fluctuate and could be targeted with optimized treatment (eResults 7 in the Supplement).

The clear distinction between the severe psychosis subgroup and the high-functioning psychosis subgroup with mixed bipolar and schizophrenia diagnoses (ie, men with more education, less symptom severity, and stable course) potentially suggests a different illness phenotype.3,59,60 These phenotypes may be divided based on the relative contribution of developmental (severe psychosis) and environmental (high-functioning psychosis) risks. This hypothesis was supported by the finding of a reduced educational achievement polygenic score in the severe psychosis subgroup, potentially suggesting neurodevelopmental contributions.4

A second major finding was a depression subgroup with mixed diagnoses, low functioning, moderate depression, and a remitting-relapsing functional course (Figure 2). These results extend the notion of schizodepression61 by suggesting that the subgroup is diagnostically mixed (eg, 53 of 131 [40.5%] were diagnosed with schizophrenia and 61 of 131 [46.6%] with bipolar I/II disorder) and experiences a functional course similar to that of the severe psychosis subgroup. In this context, it is important to note that while this course is most likely representative of a treatment response, neither subgroup fulfilled criteria for symptom remission (ie, 50%61,62) prior to their relapse between 12 and 18 months. Clinically, this further implies that addressing treatment resistance early and following up, especially between 12 and 18 months, is important for both subgroups.

A related finding was a diagnostically mixed subgroup characterized by suicide history. This finding may either support research indicating that suicidality is a separable trait that is not connected to a specific diagnosis62,63,64 or indicate that a personality disorder subgroup was detected (eg, borderline personality disorder65). A limitation of the study was that personality disorders were not assessed; despite this, the results may suggest a need for further transdiagnostic research62,63,64 and a role for suicide-specific treatments.

Polygenic scores reflecting educational attainment exhibited relatively high effect sizes across discovery and validation analyses. Similar to genetic subtyping research,23 the finding of decreased educational attainment scores in the severe psychosis subgroup highlights the specificity of the association with subgroups and not diagnostic categories.66 The lack of a consistent association of diagnostic polygenic scores with schizophrenia, bipolar disorder, or depression indicated their lack of specificity for functional or symptom severity in this study.9,43,44 The results suggest a need for new risk scores reflecting other transdiagnostic factors, such as developmental risk (eg, birth history), functioning (eg, social impairment), and illness course (eg, remitting-relapsing). Such transdiagnostic scores may reveal previously unknown gene candidates and illness mechanisms.

Implications

The study highlights the power of using recently developed bioinformatics methods that can accommodate high-dimensional clinical data. The results partially agree with a separation of affective and nonaffective psychosis subgroups, but they also reinforce the need to look beyond conventional diagnostic categories, symptom domains, and polygenic scores to better understand psychosis heterogeneity. Doing so has the potential to facilitate a research transition from traditional case-control comparisons3,13 to modular taxonomies based on intersections of comorbid symptoms (eg, depression), premorbid and current functioning, and illness courses.11 Clinically, by separating individuals into subgroups linked to distinct baseline and longitudinal patterns, more effective resource allocation could be achieved because of better functional matching. Monitoring short-term to medium-term illness trajectories in chronic subgroups (eg, the severe and depressed subgroups) could also enhance treatment engagement and adherence. The precision of clinical and biological research could also be improved by using the subgroups. To determine subgroup labels using new data, a prototype web interface has been developed and can be accessed at http://www.proniapredictors.eu.

Limitations

The study has a number of limitations. Our aim was to use an unbiased data-driven approach, but all studies create subgrouping biases because of their selection of measures when the study is designed.2 Data from participants who refused to take part were not recorded, and thus participation biases could not be assessed. We also attempted to study the illness severity spectrum but were restricted to naturalistic hospital settings where very mild cases would not be available and thus higher-functioning subgroups could be expected. Interventions were uncontrolled, and we found a treatment status bias (eg, more inpatients in the severe psychosis subgroup). Conclusions also cannot be drawn regarding schizophreniform disorder or brief psychotic disorder because of their low representation in this study. Attrition over the 18-month follow-up period emphasized a need for external replication, and future research could consider longer follow-up periods. However, investigating this timeframe is critical for translational purposes, as it defines an important operational window for clinical management. Additionally, the subgroups were validated in a separate group from the same study, and further validation is required.

Conclusions

The results of this study inform the intensifying efforts to redefine psychosis taxonomies but do so using criteria that include the assessment of clinical history, functioning, illness course, and genetic risk factors. Further research is needed to investigate commonalities and discrepancies between different subtyping results to develop a consensus among competing taxonomic solutions. Etiological and treatment implications of the present subgroups need to be studied to move toward targeted mechanistic research and clinical care.

Supplement.

eMethods 1. Recruitment and sample characteristics.

eMethods 2. Baseline measures.

eMethods 3. Analysis overview.

eMethods 4. Participant and feature filtering.

eMethods 5. Clustering analyses.

eMethods 6. Genotyping and calculation of schizophrenia polygenic risk scores.

eMethods 7. Longitudinal analyses using mixed models.

eMethods 8. Identification of critical variables and replication analyses.

eResults 1. Subgroup determination.

eResults 2. Factor solutions.

eResults 3. Site and experimental rater effects.

eResults 4. Validation of subgroups: further explanation.

eResults 5. Supplementary analysis: clustering stability for different imputation settings.

eResults 6. Supplementary analysis: analysis of separability based on cognitive variables.

eResults 7. Supplementary analysis: analysis of differences in lifetime illness course.

eResults 8. Supplementary analysis: analysis of missing participants in longitudinal data and mixed models.

eFigure 1. Analysis flowchart and overview.

eFigure 2. Sparse nonnegative matrix factorization (sNMF) consensus clustering results.

eFigure 3. Proportion of diagnoses across the subgroups.

eFigure 4. Illness course comparisons of discovery and validation cohorts.

eFigure 5. Effect sizes of polygenic scores differentiating subgroups.

eFigure 6. Effect sizes of polygenic scores differentiating diagnostic groups.

eFigure 7. Violin plots of education polygenic scores.

eFigure 8. Genetic ancestry overlap with the European reference population.

eFigure 9. Günzburg site exclusion factor and consistency matrix results.

eFigure 10. Repetition of validation analyses after excluding infectious diseases variable.

eFigure 11. Factor matrices of clustering solution across different K-nearest neighbors.

eFigure 12. Feature importance for the supplementary analysis of cognition.

eFigure 13. Assessment of lifetime illness course using the Operational Criteria Checklist for Psychotic Illness and Affective Illness (OPCRIT).

eBox. Abbreviations used throughout supplementary materials.

eTable 1. Unfiltered features originally selected for analyses.

eTable 2. Variables excluded from clustering analyses.

eTable 3. Comparisons of discovery sample with excluded participants and the validation sample.

eTable 4. Top 10 features and mean factor weights.

eTable 5. Differences between the sparse nonnegative matrix factorization–derived subgroups across six clinical domains.

eTable 6. Differences between sparse nonnegative matrix factorization–derived subgroups (schizophrenia only).

eTable 7. Proportions (No. [%]) of individuals in each group across PsyCourse sites.

eTable 8. Site, rater, and site × rater analysis of variance analyses.

eTable 9. Mixed-model analysis of illness course.

eTable 10. Post hoc analysis of mixed-model quadratic trends.

eTable 11. Multigroup classification performance in the discovery set.

eTable 12. Differences between subgroups in the validation sample across 6 clinical domains.

eTable 13. Classification of the discovery and replication samples for each subgroup.

eTable 14. Somatic variables requiring exclusion for the Günzburg replacement analyses.

eTable 15. Günzburg exclusion and replacement clinical comparison table.

eTable 16. Günzburg exclusion site comparison table.

eTable 17. Günzburg site replacement site comparison table.

eTable 18. Supervised learning classification of subgroups using cognitive variables.

eTable 19. Total number of participants across time points for each subgroup.

eTable 20. Mixed-model analyses controlling for missing data.

eTable 21. Post hoc analysis of mixed models controlling for missing data.

eReferences.

References

  • 1. Kendler KS. Kraepelin and the differential diagnosis of dementia praecox and manic-depressive insanity. Compr Psychiatry. 1986;27(6):549-558. doi: 10.1016/0010-440X(86)90059-3
  • 2. Jablensky A. Subtyping schizophrenia: implications for genetic research. Mol Psychiatry. 2006;11(9):815-836. doi: 10.1038/sj.mp.4001857
  • 3. Owen MJ. New approaches to psychiatric diagnostic classification. Neuron. 2014;84(3):564-571. doi: 10.1016/j.neuron.2014.10.028
  • 4. Craddock N, Owen MJ. The Kraepelinian dichotomy – going, going... but still not gone. Br J Psychiatry. 2010;196(2):92-95. doi: 10.1192/bjp.bp.109.073429
  • 5. Derks EM, Allardyce J, Boks MP, Vermunt JK, Hijman R, Ophoff RA; GROUP. Kraepelin was right: a latent class analysis of symptom dimensions in patients and controls. Schizophr Bull. 2012;38(3):495-505. doi: 10.1093/schbul/sbq103
  • 6. Carpenter WT Jr, Kirkpatrick B. The heterogeneity of the long-term course of schizophrenia. Schizophr Bull. 1988;14(4):645-652. doi: 10.1093/schbul/14.4.645
  • 7. Kapur S, Phillips AG, Insel TR. Why has it taken so long for biological psychiatry to develop clinical tests and what to do about it? Mol Psychiatry. 2012;17(12):1174-1179. doi: 10.1038/mp.2012.105
  • 8. Linden DE. The challenges and promise of neuroimaging in psychiatry. Neuron. 2012;73(1):8-22. doi: 10.1016/j.neuron.2011.12.014
  • 9. Allardyce J, Leonenko G, Hamshere M, et al. Association between schizophrenia-related polygenic liability and the occurrence and level of mood-incongruent psychotic symptoms in bipolar disorder. JAMA Psychiatry. 2018;75(1):28-35.
  • 10. Hyman SE. The diagnosis of mental disorders: the problem of reification. Annu Rev Clin Psychol. 2010;6:155-179. doi: 10.1146/annurev.clinpsy.3.022806.091532
  • 11. Robins E, Guze SB. Establishment of diagnostic validity in psychiatric illness: its application to schizophrenia. Am J Psychiatry. 1970;126(7):983-987. doi: 10.1176/ajp.126.7.983
  • 12. Carpenter WT Jr, Heinrichs DW, Wagman AM. Deficit and nondeficit forms of schizophrenia: the concept. Am J Psychiatry. 1988;145(5):578-583. doi: 10.1176/ajp.145.5.578
  • 13. Insel T, Cuthbert B, Garvey M, et al. Research Domain Criteria (RDoC): toward a new classification framework for research on mental disorders. Am J Psychiatry. 2010;167(7):748-751. doi: 10.1176/appi.ajp.2010.09091379
  • 14. van Os J, Reininghaus U. Psychosis as a transdiagnostic and extended phenotype in the general population. World Psychiatry. 2016;15(2):118-124. doi: 10.1002/wps.20310
  • 15. Kotov R, Krueger RF, Watson D. A paradigm shift in psychiatric classification: the Hierarchical Taxonomy Of Psychopathology (HiTOP). World Psychiatry. 2018;17(1):24-25. doi: 10.1002/wps.20478
  • 16. Marzolf SS. Symptom and syndrome statistically interpreted. Psychol Bull. 1945;42(3):162-176. doi: 10.1037/h0055033
  • 17. Strauss JS, Bartko JJ, Carpenter WT Jr. The use of clustering techniques for the classification of psychiatric patients. Br J Psychiatry. 1973;122(570):531-540. doi: 10.1192/bjp.122.5.531
  • 18. Farmer AE, McGuffin P, Spitznagel EL. Heterogeneity in schizophrenia: a cluster-analytic approach. Psychiatry Res. 1983;8(1):1-12. doi: 10.1016/0165-1781(83)90132-4
  • 19. Dickinson D, Pratt DN, Giangrande EJ, et al. Attacking heterogeneity in schizophrenia by deriving clinical subgroups from widely available symptom data. Schizophr Bull. 2018;44(1):101-113. doi: 10.1093/schbul/sbx039
  • 20. Kahn RS, Keefe RS. Schizophrenia is a cognitive illness: time for a change in focus. JAMA Psychiatry. 2013;70(10):1107-1112. doi: 10.1001/jamapsychiatry.2013.155
  • 21. Crow TJ. The two-syndrome concept: origins and current status. Schizophr Bull. 1985;11(3):471-486. doi: 10.1093/schbul/11.3.471
  • 22. Kirkpatrick B, Galderisi S. Deficit schizophrenia: an update. World Psychiatry. 2008;7(3):143-147. doi: 10.1002/j.2051-5545.2008.tb00181.x
  • 23. Bansal V, Mitjans M, Burik CAP, et al. Genome-wide association study results for educational attainment aid in identifying genetic heterogeneity of schizophrenia. Nat Commun. 2018;9(1):3078. doi: 10.1038/s41467-018-05510-z
  • 24. Frank J, Lang M, Witt SH, et al. Identification of increased genetic risk scores for schizophrenia in treatment-resistant patients. Mol Psychiatry. 2015;20(2):150-151. doi: 10.1038/mp.2014.56
  • 25. Koutsouleris N, Kambeitz-Ilankovic L, Ruhrmann S, et al.; PRONIA Consortium. Prediction models of functional outcomes for individuals in the clinical high-risk state for psychosis or with recent-onset depression: a multimodal, multisite machine learning analysis. JAMA Psychiatry. 2018;75(11):1156-1172. doi: 10.1001/jamapsychiatry.2018.2165
  • 26. Kendell RE, Gourlay J. The clinical distinction between the affective psychoses and schizophrenia. Br J Psychiatry. 1970;117(538):261-266. doi: 10.1192/S0007125000193225
  • 27. Brat DJ, Verhaak RG, Aldape KD, et al.; Cancer Genome Atlas Research Network. Comprehensive, integrative genomic analysis of diffuse lower-grade gliomas. N Engl J Med. 2015;372(26):2481-2498. doi: 10.1056/NEJMoa1402121
  • 28. Chen F, Zhang Y, Şenbabaoğlu Y, et al. Multilevel genomics-based taxonomy of renal cell carcinoma. Cell Rep. 2016;14(10):2476-2489. doi: 10.1016/j.celrep.2016.02.024
  • 29. Hoadley KA, Yau C, Wolf DM, et al.; Cancer Genome Atlas Research Network. Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin. Cell. 2014;158(4):929-944. doi: 10.1016/j.cell.2014.06.049
  • 30. Cancer Genome Atlas Research Network. Comprehensive molecular characterization of gastric adenocarcinoma. Nature. 2014;513(7517):202-209. doi: 10.1038/nature13480
  • 31. Jain AK. Data clustering: 50 years beyond K-means. Pattern Recognit Lett. 2010;31:651-666. doi: 10.1016/j.patrec.2009.09.011
  • 32. Jablensky A. The diagnostic concept of schizophrenia: its history, evolution, and future prospects. Dialogues Clin Neurosci. 2010;12(3):271-287.
  • 33. Hall MH, Holton KM, Öngür D, Montrose D, Keshavan MS. Longitudinal trajectory of early functional recovery in patients with first episode psychosis. Schizophr Res. 2019;209:234-244. doi: 10.1016/j.schres.2019.02.003
  • 34. Kotov R, Leong SH, Mojtabai R, et al. Boundaries of schizoaffective disorder: revisiting Kraepelin. JAMA Psychiatry. 2013;70(12):1276-1286. doi: 10.1001/jamapsychiatry.2013.2350
  • 35. Kotov R, Foti D, Li K, Bromet EJ, Hajcak G, Ruggero CJ. Validating dimensions of psychosis symptomatology: neural correlates and 20-year outcomes. J Abnorm Psychol. 2016;125(8):1103-1119. doi: 10.1037/abn0000188
  • 36. Lambert M, Schimmelmann BG, Schacht A, et al. Long-term patterns of subjective wellbeing in schizophrenia: cluster, predictors of cluster affiliation, and their relation to recovery criteria in 2842 patients followed over 3 years. Schizophr Res. 2009;107(2-3):165-172. doi: 10.1016/j.schres.2008.08.035
  • 37. Heering HD, van Haren NE, Derks EM; GROUP Investigators. A two-factor structure of first rank symptoms in patients with a psychotic disorder. Schizophr Res. 2013;147(2-3):269-274. doi: 10.1016/j.schres.2013.04.032
  • 38. Velthorst E, Fett AJ, Reichenberg A, et al. The 20-year longitudinal trajectories of social functioning in individuals with psychotic disorders. Am J Psychiatry. 2017;174(11):1075-1085. doi: 10.1176/appi.ajp.2016.15111419
  • 39. Purcell SM, Wray NR, Stone JL, et al.; International Schizophrenia Consortium. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature. 2009;460(7256):748-752. doi: 10.1038/nature08185
  • 40. Lee SH, Ripke S, Neale BM, et al.; Cross-Disorder Group of the Psychiatric Genomics Consortium; International Inflammatory Bowel Disease Genetics Consortium (IIBDGC). Genetic relationship between five psychiatric disorders estimated from genome-wide SNPs. Nat Genet. 2013;45(9):984-994. doi: 10.1038/ng.2711
  • 41. Sullivan PF, Daly MJ, O’Donovan M. Genetic architectures of psychiatric disorders: the emerging picture and its implications. Nat Rev Genet. 2012;13(8):537-551. doi: 10.1038/nrg3240
  • 42. Meier SM, Agerbo E, Maier R, et al.; MooDS SCZ Consortium. High loading of polygenic risk in cases with chronic schizophrenia. Mol Psychiatry. 2016;21(7):969-974. doi: 10.1038/mp.2015.130
  • 43. Clementz BA, Sweeney JA, Hamm JP, et al. Identification of distinct psychosis biotypes using brain-based biomarkers. Am J Psychiatry. 2016;173(4):373-384. doi: 10.1176/appi.ajp.2015.14091200
  • 44. Budde M, Anderson-Schmidt H, Gade K, et al. A longitudinal approach to biological psychiatric research: the PsyCourse study. Am J Med Genet B Neuropsychiatr Genet. 2019;180(2):89-102. doi: 10.1002/ajmg.b.32639
  • 45. Wittchen HU, Fydrich T. Strukturiertes Klinisches Interview für DSM-IV (SKID-I und SKID-II). Göttingen, Germany: Hogrefe; 1997.
  • 46. Troyanskaya O, Cantor M, Sherlock G, et al. Missing value estimation methods for DNA microarrays. Bioinformatics. 2001;17(6):520-525. doi: 10.1093/bioinformatics/17.6.520
  • 47. Kay SR, Fiszbein A, Opler LA. The Positive and Negative Syndrome Scale (PANSS) for schizophrenia. Schizophr Bull. 1987;13(2):261-276. doi: 10.1093/schbul/13.2.261
  • 48. Rush AJ, Bernstein IH, Trivedi MH, et al. An evaluation of the quick Inventory of Depressive Symptomatology and the Hamilton rating scale for depression: a sequenced treatment alternatives to relieve depression trial report. Biol Psychiatry. 2006;59(6):493-501. doi: 10.1016/j.biopsych.2005.08.022
  • 49. Young RC, Biggs JT, Ziegler VE, Meyer DA. A rating scale for mania: reliability, validity and sensitivity. Br J Psychiatry. 1978;133:429-435. doi: 10.1192/bjp.133.5.429
  • 50. Angermeyer MC, Kilian R, Matschinger H. WHOQOL100 und WHOQOL-BREF. Handbuch für die deutschsprachigen Versionen der WHO Instrumente zur Erfassung von Lebensqualität. Göttingen, Germany: Hogrefe; 2000.
  • 51. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. 4th ed. Washington, DC: American Psychiatric Association; 1994.
  • 52. Pardiñas AF, Holmans P, Pocklington AJ, et al.; GERAD1 Consortium; CRESTAR Consortium. Common schizophrenia alleles are enriched in mutation-intolerant genes and in regions under strong background selection. Nat Genet. 2018;50(3):381-389. doi: 10.1038/s41588-018-0059-2
  • 53. Stahl EA, Breen G, Forstner AJ, et al.; eQTLGen Consortium; BIOS Consortium; Bipolar Disorder Working Group of the Psychiatric Genomics Consortium. Genome-wide association study identifies 30 loci associated with bipolar disorder. Nat Genet. 2019;51(5):793-803. doi: 10.1038/s41588-019-0397-8
  • 54. Wray NR, Ripke S, Mattheisen M, et al.; eQTLGen; 23andMe; Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium. Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression. Nat Genet. 2018;50(5):668-681. doi: 10.1038/s41588-018-0090-3
  • 55. Lee JJ, Wedow R, Okbay A, et al.; 23andMe Research Team; COGENT (Cognitive Genomics Consortium); Social Science Genetic Association Consortium. Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nat Genet. 2018;50(8):1112-1121. doi: 10.1038/s41588-018-0147-3
  • 56. Ruderfer DM, Fanous AH, Ripke S, et al.; Schizophrenia Working Group of the Psychiatric Genomics Consortium; Bipolar Disorder Working Group of the Psychiatric Genomics Consortium; Cross-Disorder Working Group of the Psychiatric Genomics Consortium. Polygenic dissection of diagnosis and clinical dimensions of bipolar disorder and schizophrenia. Mol Psychiatry. 2014;19(9):1017-1024. doi: 10.1038/mp.2013.138
  • 57. Schizophrenia Working Group of the Psychiatric Genomics Consortium. Biological insights from 108 schizophrenia-associated genetic loci. Nature. 2014;511(7510):421-427. doi: 10.1038/nature13595
  • 58. van Os J, Linscott RJ, Myin-Germeys I, Delespaul P, Krabbendam L. A systematic review and meta-analysis of the psychosis continuum: evidence for a psychosis proneness-persistence-impairment model of psychotic disorder. Psychol Med. 2009;39(2):179-195. doi: 10.1017/S0033291708003814
  • 59. Neale BM, Sklar P. Genetic analysis of schizophrenia and bipolar disorder reveals polygenicity but also suggests new directions for molecular interrogation. Curr Opin Neurobiol. 2015;30:131-138. doi: 10.1016/j.conb.2014.12.001
  • 60. Owen MJ, O’Donovan MC, Thapar A, Craddock N. Neurodevelopmental hypothesis of schizophrenia. Br J Psychiatry. 2011;198(3):173-175. doi: 10.1192/bjp.bp.110.084384
  • 61. Kendler KS, Karkowski LM, Walsh D. The structure of psychosis: latent class analysis of probands from the Roscommon Family Study. Arch Gen Psychiatry. 1998;55(6):492-499. doi: 10.1001/archpsyc.55.6.492
  • 62. Baldessarini RJ, Hennen J. Genetics of suicide: an overview. Harv Rev Psychiatry. 2004;12(1):1-13.
  • 63. Brent DA, Melhem N. Familial transmission of suicidal behavior. Psychiatr Clin North Am. 2008;31(2):157-177. doi: 10.1016/j.psc.2008.02.001
  • 64. Bronisch T. The relationship between suicidality and depression. Arch Suicide Res. 1996;2(4):235-254. doi: 10.1080/13811119608259005
  • 65. Oldham JM. Borderline personality disorder and suicidality. Am J Psychiatry. 2006;163(1):20-26. doi: 10.1176/appi.ajp.163.1.20
  • 66. Sørensen HJ, Debost JC, Agerbo E, et al. Polygenic risk scores, school achievement, and risk for schizophrenia: a Danish population-based study. Biol Psychiatry. 2018;84(9):684-691. doi: 10.1016/j.biopsych.2018.04.012


Articles from JAMA Psychiatry are provided here courtesy of American Medical Association
