Abstract
Heterogeneity is a key feature of all psychiatric disorders that manifests on many levels, including symptoms, disease course, and biological underpinnings. This heterogeneity forms a substantial barrier to understanding disease mechanisms and developing effective, personalized treatments. In response, many studies have sought to stratify psychiatric disorders, aiming to find more consistent subgroups on the basis of many types of data. Such approaches have received renewed interest after recent research initiatives, such as the National Institute of Mental Health Research Domain Criteria and the European Roadmap for Mental Health Research, both of which emphasize finding stratifications that are based on biological systems and that cut across current classifications. We first introduce the basic concepts for stratifying psychiatric disorders and then provide a methodologically oriented and critical review of the existing literature. This shows that the predominant clustering approach, which aims to subdivide clinical populations into more coherent subgroups, has made a useful contribution but is heavily dependent on the type of data used; it has produced many different ways to subgroup the disorders we review, but for most disorders it has not converged on a consistent set of subgroups. We highlight problems with current approaches that are not widely recognized and discuss the importance of validation to ensure that the derived subgroups index clinically relevant variation. Finally, we review emerging techniques (such as those that estimate normative models for mappings between biology and behavior) that provide new ways to parse the heterogeneity underlying psychiatric disorders, and we evaluate all methods with respect to meeting the objectives of initiatives such as the National Institute of Mental Health Research Domain Criteria and the European Roadmap for Mental Health Research.
Keywords: European Roadmap for Mental Health Research, Heterogeneity, Latent cluster analysis, Psychiatry, RDoC, Research Domain Criteria, Subgroup, ROAMER
Psychiatric disorders are, without exception, highly heterogeneous in terms of symptoms, disease course, and biological underpinnings. Diagnoses are made on the basis of symptoms, while at the level of the underlying biology their causes are complex and multifaceted. This becomes acutely problematic in psychiatry because biological tests to assist diagnosis or predict outcome have not been developed (1). Diagnostic categories therefore often do not map cleanly onto either biology or outcome, which forms a major barrier to understanding disease mechanisms and developing more effective treatments.
A recognition of the imperfections of psychiatric nosology is not new; the debate between “lumpers” and “splitters” (2) over the number and validity of diagnostic classifications has continued unabated for more than a century following the classifications of dementia praecox and schizophrenia proposed by Kraepelin and Bleuler (3, 4). Reflecting this ongoing debate, classifications are revised with every new edition of diagnostic manuals (5, 6). Data-driven approaches to address heterogeneity in psychiatric disorders have also been applied for decades, in which the dominant approach has been to partition clinical groups into more homogeneous subgroups using data clustering methods; early examples can be seen in Paykel (7) and Farmer et al. (8). These approaches have recently received renewed interest for three reasons: 1) the advent of technologies for measuring many aspects of biology noninvasively and in vivo, particularly neuroimaging and genetics; 2) advances in statistical and machine learning data analytic approaches that make it possible to extract information from complex and high-dimensional data; and 3) increasing emphasis on using biological data to tailor treatments to the needs of individual patients (“precision medicine”) (9, 10). Most notably, recent funding initiatives, such as the National Institute of Mental Health Research Domain Criteria [RDoC (11)] and the European Roadmap for Mental Health Research [ROAMER (12)], have encouraged researchers to think beyond the classical case-control approach (where participants are either “patients” or “controls” based on fixed diagnostic criteria) and instead link cognitive dimensions with underlying biology while cutting across diagnostic classifications. The hope is that this will lead to a biologically grounded understanding of disease entities and ultimately to more effective, personalized treatments.
These initiatives have stimulated an increasing number of studies that have used data-driven methods to stratify many disorders, including schizophrenia, major depression, attention-deficit/hyperactivity disorder (ADHD), and autism based on many types of data, including symptoms, neuropsychologic scores, and neuroimaging measures (13, 14, 15, 16, 17, 18, 19, 20, 21). We selectively review this burgeoning literature. We first present a didactic introduction to the most prevalent methodologic approaches for stratifying psychiatric disorders, highlighting the (often implicit) assumptions they entail. We then present an illustrative overview of studies that have used these methods to parse the heterogeneity underlying psychiatric disorders. We identify problems with current approaches and discuss the importance of validation to ensure reproducibility and to ensure that clusters map onto clinically meaningful variation. We discuss emerging techniques, such as normative modeling (22), that provide means to parse heterogeneity in clinical cohorts without needing to make strong assumptions about clinical groups, and we evaluate the suitability of each method for meeting the objectives of recent research initiatives. Finally, we propose future developments that may help to parse heterogeneity more effectively.
Methodologic Approaches For Stratifying Clinical Populations
The predominant approach has been to subdivide clinical cohorts using statistical or machine learning methods, largely of two main types: clustering (23) and finite mixture models (FMMs) (24, 25, 26). Both are unsupervised in that they do not have access to class labels (e.g., diagnostic labels) and must find subgroups automatically based on structure within the data and heuristics used by each algorithm. In contrast, supervised methods are provided with labels that indicate the class to which each subject belongs (e.g., “patient” or “control”). Supervised learning has been successful for predicting diagnosis or outcome from neuroimaging data in research settings (27, 28, 29) but is fundamentally limited by the quality of the clinical labels and the heterogeneity within disease cohorts (29) and cannot, by definition, inform on the validity of the labels. Therefore, unsupervised methods have been more widely used for discovering latent structure within clinical groups. We present a brief introduction to clustering and FMM methods below; additional details and a didactic introduction are provided in the Supplement.
Clustering
The classical case-control approach can itself be phrased in terms of defining clusters and associated decision boundaries. For example, Fisher’s linear discriminant (23) uses the class-dependent mean response (e.g., in patients vs. controls) and thereby clusters the entire cohort along a decision boundary defined by the mean and class-specific covariances. More generally, given a set of data points (e.g., clinical or neuroimaging measures), clustering algorithms aim to partition the data into a specified number (K) of clusters such that the samples in each cluster are more similar to one another than to those in the other clusters. This entails defining a measure of similarity or distance between data points. One of the simplest and most widely used approaches is K-means clustering, which partitions the input space into subregions based on the squared Euclidean distance (see Supplement). A wide variety of other algorithms have also been proposed in the machine learning literature (23, 30, 31). Two that are relevant for stratifying psychiatric disorders are 1) hierarchical clustering, which forms a hierarchy of cluster assignments by recursively splitting larger groups (“divisive clustering”) or combining individual samples (“agglomerative clustering” [e.g., Ward’s method (32)]), and 2) community detection, which is a graph-based method that aims to cluster nodes into “communities” (33).
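To make the K-means description concrete, the following is a minimal sketch of the standard alternating procedure (often called Lloyd's algorithm) in plain Python. The function names, the toy data, and the choice to initialize from randomly sampled data points are illustrative assumptions and are not taken from any of the reviewed studies.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points (tuples of floats)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, n_iter=100, seed=0):
    """Alternate assignment and centroid-update steps until labels stop changing."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: initialise from k data points
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged (possibly to a local optimum only)
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = k_means(points, k=2)
```

On data this cleanly separated the two groups are recovered, but the cluster numbering itself is arbitrary, so only co-membership of points should be interpreted.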
Finite Mixture Modeling
FMMs are a broad class of probabilistic approaches that aim to represent data using a finite number of parametric distributions (“components”). The simplest examples are Gaussian mixture models (GMMs), where all components have Gaussian distributions (24), but many other models are also members of this class (26), including latent class cluster analysis (LCCA) (25, 34), growth mixture modeling (35), latent class growth analysis (LCGA) (36), and factor mixture modeling (20) (see Supplement).
LCCA is a particularly widely used approach that accommodates many different data types (e.g., continuous, categorical, and ordinal). It is highly generic and can model, for example, dependence between variables (e.g., to model correlated clinical variables) or can use covariates to help predict class membership (25, 26, 34). Growth mixture modeling is a useful generalization and is derived by combining FMM with growth models (26, 35). This is appropriate for modeling longitudinal data derived from different growth trajectories. Given the neurodevelopmental basis for psychiatric disorders (37) and the importance of disease course in diagnosis (38), these approaches are increasingly being applied to stratify psychiatric disorders (39, 40).
One advantage of FMMs is that they provide a full statistical model for the data, and therefore classical statistical techniques can be used to assess fit (e.g., likelihood ratio tests). They are also flexible; for example, GMMs can approximate any continuous distribution to acceptable error (41). However, modeling complex distributions may require many mixture components having many parameters.
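For reference, the standard textbook form of a Gaussian mixture model can be written as follows. The notation is an assumption introduced here rather than taken from the article: K is the number of components, \pi_k are the mixing proportions, \boldsymbol{\mu}_k and \boldsymbol{\Sigma}_k are the component means and covariances, and z is the latent component label. The second expression gives the posterior probability that a subject with data \mathbf{x} belongs to component k, which is what makes the subgroup assignments probabilistic rather than hard.

```latex
p(\mathbf{x}) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1

p(z = k \mid \mathbf{x}) =
\frac{\pi_k \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}
     {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j)}
```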
Model Order Selection
Choosing the number of clusters or components is an important consideration and directly influences model flexibility. Many techniques have been proposed for comparing model orders, including classical information criteria (42, 43) and specialized methods (44, 45, 46, 47, 48). Different methods embody different heuristics (e.g., how parameters are penalized), which may not yield the same or even a unique optimal model order, indicating that the data can be equally well-explained using different model orders. Some methods automatically estimate model order (33, 49) but do not indicate whether other model orders are equally appropriate and often have additional parameters that influence the estimated model order. For example, graph-based methods (33) entail specifying a threshold above which nodes are considered connected (see Advantages and Disadvantages of Clustering for further discussion).
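As a small illustration of how different information criteria can favor different model orders, the sketch below computes the Akaike and Bayesian information criteria from their standard definitions. The log-likelihood values are made-up numbers chosen purely for illustration (they are not results from any study), and the parameter counts assume a one-dimensional Gaussian mixture with 3K − 1 free parameters.

```python
from math import log

def aic(log_likelihood, n_params):
    """Akaike information criterion (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion (lower is better)."""
    return n_params * log(n_samples) - 2 * log_likelihood

n_samples = 200
# HYPOTHETICAL fits: model order K -> (maximised log-likelihood, free parameters).
# Parameter counts assume a 1-D Gaussian mixture: K means, K variances, K - 1 weights.
fits = {1: (-520.0, 2), 2: (-510.0, 5), 3: (-505.0, 8)}

best_aic = min(fits, key=lambda k: aic(*fits[k]))
best_bic = min(fits, key=lambda k: bic(*fits[k], n_samples))
```

With these illustrative numbers the AIC prefers three components whereas the BIC, which penalizes parameters more heavily at this sample size, prefers two.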
Applications to Stratify Psychiatric Disorders
Clustering methods have been used extensively to stratify all psychiatric disorders, both individually and across diagnoses; Tables 1–5 provide a representative (but not exhaustive) overview. Several articles offer more extensive quantitative reviews (19, 50, 51). Three salient observations can be made. First, during the many years that computational approaches have been used, relatively few algorithms have been used. There is, however, more variability among methods to select model order. Second, stratifications have been based on a range of measures, but predominantly symptoms or psychometric variables. This is notable considering that RDoC and ROAMER emphasize stratification on the basis of mappings between biological systems and cognitive domains, not just symptoms (10). Third, to date, few studies have stratified psychiatric disorders on the basis of quantitative biological measures, and these studies have predominantly used neuroimaging-based measures (13, 16, 17, 52). This may be because of well-known problems with clustering complex, high-dimensional data (see Advantages and Disadvantages of Clustering).
Table 1.
Study | Subjects (N) | Measures | Algorithm | No. of Clusters (Method) | Cluster Descriptions | External Validation |
---|---|---|---|---|---|---|
Farmer et al., 1983 (8) | SCZ (65) | Symptoms and case history variables | K-means and hierarchical clustering | 2 (maximal agreement between methods) | 1) Good premorbid adjustment, late onset, and well organized delusions; 2) Poor premorbid functioning, early onset, incoherent speech, and bizarre behavior | None |
Castle et al., 1994 (93) | SCZ (447) | Symptoms and case history variables | LCCA | 3 (χ2 test) | 1) Neurodevelopmental; 2) Paranoid; 3) Schizoaffective | Premorbid, phenomenologic, and treatment response variables [see (94)] |
Dollfus et al., 1996 (95) | SCZ (138) | Symptoms | Ward’s hierarchical clustering method (32) | 4 (informal examination of cluster dendrogram) | 1) Positive symptoms; 2) Negative symptoms; 3) Disorganized symptoms; 4) Mixed symptoms | Social variables |
Kendler et al., 1998 (96) | SCZ (348) | Symptoms | LCCA | 6 (not specified) | Classic schizophrenia; Major depression; Schizophreniform disorder; Bipolar-schizomania; Hebephrenia | Historical data |
Murray et al., 2005 (97) | SCZ (387) | “Operational criteria” diagnostic measures (medical records and interview) | LCCA | BIC (42) | Depression; Reality distortion; Mania; Disorganization | None |
Dawes et al., 2011 (98) | SCZ and SAD (144) | Neuropsychological measures | K-means | 5 (Ward’s method) | 1) Visual learning and memory (–); 2) Verbal comprehension (+), processing speed (+), abstraction (–), auditory and visual learning and memory (–); 3) Abstraction (–); 4) Verbal comprehension (+), visual learning and memory (+), abstraction (–), auditory learning and memory (–); 5) Verbal comprehension (+), abstraction (–), visual learning and memory (–) | None |
Cole et al., 2012 (99) | SCZ (208) | Social and academic adjustment scales | LCGA | 3 [BIC and Lo-Mendell-Rubin test (44)] | 1) Good-stable; 2) Insidious onset; 3) Poor deteriorating | None |
Bell et al., 2013 (18) | SCZ and SAD (77 + 63 validation) | Symptoms and social cognitive measures | K-means | 3 (Ward’s method) | 1) High negative symptoms; 2) High social cognition; 3) Low social cognition | None |
Brodersen et al., 2014 (13) | SCZ (41) and HC (42) | Dynamic causal model (100) derived from fMRI data | Gaussian mixture | 3 [Bayesian model evidence (101)] | Subgroups characterized in terms of DCM model parameters | Symptoms and medication |
Geisler et al., 2015 (102) | SCZ (129) | Neuropsychological measures | K-means | 4 (fixed a priori) | 1) Verbal fluency (–), processing speed (–); 2) Verbal episodic memory (–), fine motor control (–), signal detection; 3) Face episodic memory (–), processing speed (–); 4) General intellectual function (–) | fMRI |
Sun et al., 2015 (52) | SCZ (113) | White matter integrity measured by diffusion tensor imaging | Hierarchical clustering | 2 [Silhouette, Dunn, and connectivity indices (46, 47, 48)] | Subgroups characterized in terms of white matter abnormalities | Symptoms |
External validation is defined as a data measure used to validate the derived classes that is of a different type from the data used to derive the classes. Wherever possible, we follow the authors’ own nomenclature for describing clusters, and a (+) or (–) indicates relative improvement or deficit in the specified variable.
BIC, Bayesian information criterion; DCM, dynamic causal modeling; fMRI, functional magnetic resonance imaging; LCCA, latent class cluster analysis; LCGA, latent class growth analysis; SAD, schizoaffective disorder; SCZ, schizophrenia.
Table 2.
Study | Subjects (N) | Measures | Algorithm | No. of Clusters (Method) | Cluster Descriptions | External Validation |
---|---|---|---|---|---|---|
Paykel, 1971 (7) | Patients with depression (165) | Clinical interviews, case history, and personality variables | Friedman–Rubin algorithm (103) | 4 (maximize the ratio of between to within class scatter) | 1) Psychotic; 2) Anxious; 3) Hostile; 4) Young depressive with personality disorder | None |
Maes et al., 1992 (57) | MDD (80) | Symptoms | K-means | 2 (not specified) | 1) Vital (i.e., psychomotor disorders, loss of energy, early morning awakening, and nonreactivity); 2) Nonvital | Biological (e.g., endocrine) measures |
Kendler et al., 1996 (53) | Female twin pairs (2163) | Symptoms | LCCA | 7 (not specified) | Only 3 clusters described [see (53)] | Body mass index, personality, and concordance of cluster membership among twin pairs |
Sullivan et al., 1998 (54) | National comorbidity survey respondents (2836) | Symptoms | LCCA | 6 (χ2 statistic) | 1) Severe typical; 2) Mild typical; 3) Severe atypical; 4) Mild atypical; 5) Intermediate; 6) Minimal symptoms | Demographic and personality variables |
Hybels et al., 2009 (58) | MDD (368) | Symptoms | LCCA | 4 [L2 statistic (34), BIC] | 1) DSM-IV depression: moderate sadness, lassitude, and inability to feel; 2) Higher severity for all items, especially apparent sadness; 3) Milder profile; 4) Highest severity and most functional limitations | Demographic, social, and clinical variables |
Lamers et al., 2010 (55) | MDD (818) | Symptoms plus demographic, psychosocial, and physical health variables | LCCA | 3 [BIC and AIC (43)] | 1) Severe melancholic (decreased appetite, weight loss); 2) Severe atypical (overeating and weight gain); 3) Moderate severity | Stability over time, sociodemographic, clinical, and biological (e.g., metabolic) variables (104, 105) |
Lamers et al., 2012 (56) | National Comorbidity Survey Replication respondents: adolescents (912) and adults (805) | Symptoms | LCCA | Adolescents: 3, adults: 4 (BIC) | Described separately for adolescents and adults [see (56)] | None |
Rhebergen et al., 2012 (39) | MDD (804) | Longitudinal symptom scores | LCGA | 5 (BIC and Lo-Mendell-Rubin test) | 1) Remission; 2) Decline (moderate severity); 3) Decline (severe); 4) Chronic (moderate severity); 5) Chronic (severe) | Demographic and diagnostic variables, fMRI [see (73)] |
Van Loo et al., 2014 (59) | MDD (8,261) | Retrospective symptom reports and demographic data that predict disease course | K-means | 3 (inspection of dichotomization scores and area under the receiver operating characteristic curve [see (59)]) | 1) High risk; 2) Intermediate risk; 3) Low risk | None |
Milaneschi et al., 2015 (60) | MDD (1477) | Symptoms | LCCA | 3 (BIC, AIC, and likelihood ratio test) | 1) Severe melancholic [see Lamers et al. (55)]; 2) Severe atypical; 3) Moderate | Polygenic risk scores |
External validation is defined as a data measure used to validate the derived classes that is of a different type from the data used to derive the classes. Wherever possible, we follow the authors’ own nomenclature for describing clusters.
AIC, Akaike information criterion; BIC, Bayesian information criterion; fMRI, functional magnetic resonance imaging; LCCA, latent class cluster analysis; LCGA, latent class growth analysis; MDD, major depressive disorder.
Table 3.
Study | Subjects (N) | Measures | Algorithm | No. of Clusters (Method) | Cluster Descriptions | External Validation |
---|---|---|---|---|---|---|
Fair et al., 2012 (15) | ADHD (285) and TDC (213) | Neuropsychologic scores | CD (33) | 6 for ADHD (determined implicitly by the algorithm) | 1) Response time variability (+); 2) Working memory (–), memory span (–), inhibition (–), and output speed (–); 3) Working memory (–), memory span (–), inhibition (–), and output speed (–), minor differences in remaining measures; 4) Temporal processing (–); 5) Arousal (–); 6) Arousal (–), minor differences in remaining measures | None |
Karalunas et al., 2014 (14) | ADHD (247) and TDC (190) | Personality measures (e.g., temperament) | CD | 3 (determined implicitly by the algorithm) | 1) Mild; 2) Surgent (positive approach motivation); 3) Irritable (negative emotionality, anger, and poor soothability) | Physiological (e.g., cardiac) measures, resting state fMRI, and 1-year clinical outcomes |
Gates et al., 2014 (16) | ADHD (32) and TDC (58) | fMRI (functional connectivity) | CD | 5 (determined implicitly by the algorithm) | Subgroups characterized in terms of functional connectivity profiles | None |
Costa Dias et al., 2015 (17) | ADHD (42) and TDC (63) | fMRI (reward related functional connectivity) | CD | 3 (determined implicitly by the algorithm) | Subgroups characterized in terms of functional connectivity profiles | Clinical variables and reward sensitivity |
Van Hulst et al., 2015 (67) | ADHD (96) and TDC (121) | Neuropsychological scores | LCCA | 5 (BIC) | 1) Quick and accurate; 2) Poor cognitive control; 3) Slow and variable timing; remaining 2 groups were too small to characterize | Parent ratings of behavioral problems |
Mostert et al., 2015 (106) | ADHD (133) and TDC (132) | Neuropsychological scores | CD | 3 (determined implicitly by the algorithm) | 1) Attention (–), inhibition (–); 2) Reward sensitivity (+); 3) Working memory (–) and verbal fluency (–) | Clinical symptoms and case history |
External validation is defined as a data measure used to validate the derived classes that is of a different type from the data used to derive the classes. Wherever possible, we follow the authors’ own nomenclature for describing clusters, and a (+) or (–) indicates relative improvement or deficit in the specified variable.
ADHD, attention-deficit/hyperactivity disorder; BIC, Bayesian information criterion; CD, community detection; fMRI, functional magnetic resonance imaging; LCCA, latent class cluster analysis; TDC, typically developing control.
Table 4.
Study | Subjects (N) | Measures | Algorithm | No. of Clusters (Method) | Cluster Descriptions | External Validation |
---|---|---|---|---|---|---|
Munson et al., 2008 (107) | ASD (245) | IQ scores | LCCA and taxometric analysis | 4 (BIC, entropy, and Lo-Mendell-Rubin test) | 1) Low IQ; 2) Low verbal IQ/medium nonverbal; 3) Medium IQ; 4) High IQ | Symptom scores |
Sacco et al., 2012 (21) | ASD (245) | Demographic, clinical, case history, and physiologic (e.g., head circumference) variables | K-means | 4 (Ward’s method) | 1) Immune + circadian and sensory; 2) Circadian and sensory; 3) Stereotypic behaviors; 4) Mixed | None |
Fountain et al., 2012 (40) | ASD (6795) | Symptoms | LCGA | 6 (BIC) | 1) High functioning; 2) Bloomers (substantial improvement); 3) Medium-high functioning; 4) Medium functioning; 5) Low-medium functioning; 6) Low functioning | Demographic variables and autism risk factors |
Georgiades et al., 2013 (108) | ASD (391) | Symptom scores | FMM | 3 (AIC and BIC) | 1) Social communication (–), repetitive behaviors (+); 2) Social communication (+), repetitive behaviors (–); 3) Social communication (–), repetitive behaviors (–) | Demographic and cognitive measures |
Doshi-Velez et al., 2014 (109) | ASD (4927) | Electronic medical records | Ward’s method | 4 (Ward’s method) | Seizures; Multisystem disorders; Auditory disorders and infections; Psychiatric disorders; Not otherwise specified | None |
Veatch et al., 2014 (68) | ASD (1261 + 2563 for replication) | Symptoms, demographic, and somatic variables | Ward’s method | 2 [Adjusted Arabie Rand index (110) and validation with additional clustering algorithms] | 1) Severe; 2) Less severe | Genomic data |
External validation is defined as a data measure used to validate the derived classes that is of a different type from the data used to derive the classes. Wherever possible, we follow the authors’ own nomenclature for describing clusters, and a (+) or (–) indicates relative improvement or deficit in the specified variable.
ASD, autism spectrum disorder; BIC, Bayesian information criterion; FMM, factor mixture modeling; LCCA, latent class cluster analysis; LCGA, latent class growth analysis.
Table 5.
Study | Subjects (N) | Measures | Algorithm | No. of Clusters (Method) | Cluster Descriptions | External Validation |
---|---|---|---|---|---|---|
Olino et al., 2010 (113) | Adolescents (1653), including MDD (603), ANX (253), SUD (453) | Diagnosis (longitudinal) | LCGA | 6 (BIC) | 1) Persistent depression; 2) Persistent anxiety; 3) Late onset anxiety, increasing depression; 4) Increasing depression; 5) Initially high, decreasing anxiety; 6) Absence of psychopathology | Demographic and case history variables |
Lewandowski et al., 2014 (111) | SCZ (41), SAD (53), BPDp (73) | Clinical and cognitive measures | K-means | 4 (Ward’s method) | 1) Neuropsychologically normal; 2) Globally and significantly impaired; 3, 4) Mixed cognitive profiles (×2) | Diagnosis, demographic variables, and community functioning |
Kleinman et al., 2015 (112) | ADHD (23), BPD (10), BPDa (33), and HCs (18) | Continuous performance test measures | K-means | 2 [Silhouette index (46)] | 1) Sustained attention (–), inhibitory control (–), impulsiveness (+), and vigilance (–); 2) The converse of the above | Diagnosis |
External validation is defined as a data measure used to validate the derived classes that is of a different type from the data used to derive the classes. Wherever possible, we follow the authors’ own nomenclature for describing clusters, and a (+) or (–) indicates relative improvement or deficit in the specified variable.
ADHD, attention-deficit/hyperactivity disorder; ANX, anxiety disorders; BPD(p/a), bipolar disorder (with psychosis/ADHD); BIC, Bayesian information criterion; DEP, depressive disorders (major depression and dysthymia); HC, healthy control; LCGA, latent class growth analysis; MDD, major depressive disorder; SAD, schizoaffective disorder; SCZ, schizophrenia; SUD, substance use disorder.
Clinical Implications
One of the most striking features evident from Tables 1–5 is that the outcomes of clustering are heavily dependent on the input data; the overall picture derived from the literature is a profusion of different ways to subtype psychiatric disorders with relatively little convergence onto a coherent and consistent set of subtypes (19, 50). The disorder with the most consistent stratifications across studies is major depression, where many (53, 54, 55, 56), but not all (57, 58, 59), studies report evidence for “typical” (melancholic) and “atypical” subtypes, although these often do not align with the classical DSM subtypes (60). In contrast, stratifications of schizophrenia, ADHD, and autism have been much more variable across studies. In these cases, it is difficult to know how these different clustering solutions relate to each other or which are most relevant for clinical decision-making. From a clinical perspective, the discrepancies in these findings may arise because different measures capture different subgroupings or because multiple causal mechanisms converge on the same phenotype. There are hundreds of genetic polymorphisms associated with most psychiatric disorders (61, 62), all having small effect sizes and converging on similar symptoms. This aggregation of small effects has been likened to a “watershed,” where genetic polymorphisms aggregate as they flow downstream, culminating in the syndromic expression of the disorder (63). An additional complication in comparing studies is that symptom profiles of many disorders vary over the course of the disorder, even within individual subjects (64). Therefore, quantitative comparisons between different studies and cohorts are needed, as is a greater focus on external validation (see below).
Advantages and Disadvantages of Clustering
Tables 1–5 show that clustering algorithms have been the method of choice for stratifying clinical groups and have made an important contribution to studying the heterogeneity underlying psychiatric disorders. Clustering methods are ideal if the disorder can be cleanly separated into subgroups (e.g., for separating typical from atypical depression). However, our review shows that psychiatric disorders cannot be reproducibly stratified using symptoms alone, probably because of extensive overlap between disorders. In addition, finding an optimal clustering solution is in general a computationally difficult problem (65). Therefore, all algorithms used in practice use heuristics to find approximate solutions that do not guarantee convergence to a global optimum. This is not overly problematic in itself, and standard approaches are to run multiple random restarts to find the best solution possible or to integrate different solutions to provide measures of cluster uncertainty. A more serious problem is that clustering algorithms always yield a result and partition the data into the specified number of clusters regardless of the underlying data distribution (Supplementary Figure S1). The number of clusters must be specified a priori or assessed post hoc, as must their validity. In this regard, it is important to recognize that different approaches to clustering embody different heuristics, possibly leading to different solutions. These heuristics are determined by many factors, including the choice of algorithm and distance function, the model order, the subspace in which clustering takes place, and the method used to search the space. Moreover, in general it is not possible to adjudicate unambiguously between methods because there is no clear measure of success for unsupervised learning methods (23). For example, different metrics for assessing model order often yield different answers and also may not identify a unique optimal model order.
Therefore, heuristics and previous expectations play a strong role in the choice of algorithm and model order. Indeed, many studies use multiple approaches, aiming for consensus (Tables 1–5), but the final choice of method is often a matter of taste.
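The point that clustering always returns the requested number of groups, whether or not the data contain any, can be illustrated with a deliberately simple sketch. It runs a one-dimensional special case of K-means with K = 2 on evenly spaced values that have no cluster structure at all. The function name, the initialization at the minimum and maximum, and the toy data are assumptions made for illustration only.

```python
def two_means_1d(values, n_iter=50):
    """1-D K-means with K = 2; centroids start at the minimum and maximum."""
    lo, hi = min(values), max(values)
    for _ in range(n_iter):
        # Assignment step: each value joins the nearer centroid.
        left = [v for v in values if abs(v - lo) <= abs(v - hi)]
        right = [v for v in values if abs(v - lo) > abs(v - hi)]
        # Update step: centroids move to the mean of their members.
        lo, hi = sum(left) / len(left), sum(right) / len(right)
    return left, right

# Evenly spaced values on [0, 1]: a single homogeneous group by construction.
values = [i / 99 for i in range(100)]
left, right = two_means_1d(values)
```

The algorithm splits these structureless data into two equally sized "subgroups", which is why a clustering solution on its own is not evidence that subgroups exist.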
High-dimensional data bring additional problems for clustering that are well-recognized in the machine learning literature (see Supplementary Methods) (31, 66). Specialized algorithms are therefore recommended for high-dimensional data (31, 66), but to date these have not been applied to psychiatric disorders. Another problem for biological data (e.g., neuroimaging and genetics) is that the magnitude of nuisance variation is usually larger than clinically relevant variation, so the clustering solution can be driven by the nuisance variation rather than clinical heterogeneity. Therefore, it can be difficult to constrain clustering algorithms to find clinically relevant clusters, which necessitates careful data handling and preprocessing.
More specific problems with applying clustering algorithms to stratify psychiatric disorders include the following: 1) some participants may not clearly belong to any class; 2) some classes may not be well defined or may be unmanageably small (67); 3) subgroups may principally index severity (39, 55, 68); and 4) it is not clear whether healthy participants should be clustered separately or in combination with patients.
Validation
The complexity of deriving clustering solutions makes validation crucial to ensure reproducibility and to ensure that the derived clusters index clinically meaningful variation. A common approach is to train supervised classifiers to separate classes using the same data that were used to derive the clusters or data that are highly correlated (e.g., different symptom measures). However, this approach is circular and simply measures how well classes can be separated within the training sample. A better approach is to assess cluster reproducibility, which requires additional cohorts or resampling of the data (e.g., cross-validation). However, to avoid bias, the entire procedure, including clustering, must be embedded within the resampling framework. To assess clinical validity, external data are necessary and should be defined a priori. For this, prediction of future outcome is considered the best test (69) if outcome can be clearly defined (e.g., the absence of relapse in schizophrenia). Biological measures can also provide useful validation because they can determine whether clusters map onto pathophysiology (11, 12), which is important because subgroups that reduce phenotypic heterogeneity may not reduce biological heterogeneity (70). Historically, the importance of validation has been somewhat overlooked (Tables 1–5), but it is reassuring to note that studies are increasingly validating stratifications against external measures, especially in the case of major depression (60, 71, 72, 73). For example, Rhebergen et al. (39) derived a set of symptom trajectories to stratify depressed subjects that were subsequently validated against measures of affective processing derived from functional magnetic resonance imaging scans (73). Another notable example of external validation was provided by Karalunas et al. (14), who stratified children with ADHD on the basis of temperament ratings and validated these stratifications against cardiac measures, resting state functional magnetic resonance imaging scans, and clinical outcome.
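The requirement that the entire procedure, including clustering, sits inside the resampling loop can be illustrated with a minimal sketch. This is not a method taken from any of the reviewed studies: it assumes a toy k-means on synthetic two-dimensional data, refits the clustering from scratch on pairs of random subsamples, and scores agreement on the shared points with the Rand index. The function names (`kmeans`, `rand_index`, `stability`), the farthest-point seeding, and all parameter values are hypothetical choices made only to keep the example small.

```python
import random
from itertools import combinations


def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, rng, n_iter=50):
    """Plain Lloyd's k-means with farthest-point seeding; returns one label per point."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Next seed: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two partitions agree (1.0 = identical)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j]) for i, j in pairs
    )
    return agree / len(pairs)


def stability(points, k, n_rounds=20, frac=0.8, seed=0):
    """Mean Rand index between clusterings refit on pairs of random subsamples."""
    rng = random.Random(seed)
    n = len(points)
    scores = []
    for _ in range(n_rounds):
        # Clustering is re-run from scratch inside every resampling round.
        idx_a = rng.sample(range(n), int(frac * n))
        idx_b = rng.sample(range(n), int(frac * n))
        lab_a = dict(zip(idx_a, kmeans([points[i] for i in idx_a], k, rng)))
        lab_b = dict(zip(idx_b, kmeans([points[i] for i in idx_b], k, rng)))
        shared = sorted(set(idx_a) & set(idx_b))
        scores.append(rand_index([lab_a[i] for i in shared], [lab_b[i] for i in shared]))
    return sum(scores) / len(scores)


data_rng = random.Random(1)
# Synthetic data: two well-separated groups versus one structureless cloud.
separated = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(40)] + [
    (data_rng.gauss(10, 1), data_rng.gauss(10, 1)) for _ in range(40)
]
cloud = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(80)]

stable_score = stability(separated, k=2)
cloud_score = stability(cloud, k=2)
print(f"separated groups: {stable_score:.2f}, structureless cloud: {cloud_score:.2f}")
```

In this toy setting, a two-cluster solution fitted to the structureless cloud would be expected to reproduce less well across subsamples than one fitted to genuinely separated groups. Note that a reproducible solution is not thereby clinically valid; that still requires external data as described above.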
Alternatives to Clustering
Surprisingly few alternatives to clustering have been proposed. Those that exist are of three main types. First, some methods extend supervised learning to classify predefined disease states while accommodating uncertainty in the class labels. This has been achieved in the following ways: embedding the algorithm in a “wrapper” that identifies mislabeled samples [(74) Figure 1A, B]; semisupervised methods that only use labels for subjects with a definite diagnosis [(75) Figure 1C]; and hybrid methods that combine supervised learning with clustering [(76, 77, 78) Figure 1D] or that fuse the image registration process with FMMs such that brain images are clustered at the same time as they are registered together (79). Second, manifold learning techniques (Figure 2A) have been used to find low-dimensional representations of the data that highlight salient axes of variation. For high-dimensional data, approaches that preserve local distances are well suited for this (80) and have been used to find latent structure underlying neurologic disorders (81) and for dimensionality reduction before clustering (82). Third, novelty detection algorithms, such as the one-class support vector machine (83), aim to identify samples that are different from a set of training examples [(84) Figure 3B].
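The idea behind novelty detection can be shown with a deliberately simple stand-in. The sketch below is not the one-class support vector machine cited above; it substitutes a k-nearest-neighbour distance score, which shares the same logic of learning only from a reference set and flagging samples that lie far from it. The data are synthetic, and the function names, the choice of k = 3, and the 95% quantile threshold are all illustrative assumptions.

```python
import math
import random


def knn_distance(x, reference, k=3):
    """Distance from x to its k-th nearest neighbour in the reference set."""
    dists = sorted(math.dist(x, r) for r in reference)
    return dists[k - 1]


def fit_threshold(reference, k=3, quantile=0.95):
    """Novelty threshold: a high quantile of leave-one-out scores in the reference set."""
    scores = sorted(
        knn_distance(x, reference[:i] + reference[i + 1:], k)
        for i, x in enumerate(reference)
    )
    return scores[int(quantile * (len(scores) - 1))]


rng = random.Random(0)
# Hypothetical training set: 200 samples drawn from a single reference population.
reference = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
threshold = fit_threshold(reference)

typical = (0.1, -0.2)  # lies in the bulk of the reference distribution
atypical = (6.0, 6.0)  # far from every training example

is_novel = {
    name: knn_distance(x, reference) > threshold
    for name, x in [("typical", typical), ("atypical", atypical)]
}
print(is_novel)
```

Only the reference samples are used for training, so no assumption is made about how the atypical samples are distributed or whether they form subgroups.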
Normative modeling (Figure 3) is an alternative approach for parsing heterogeneity in clinical conditions (22, 85, 86) and aims to model biological variation within clinical cohorts, such that symptoms in individual patients can be recognized as extreme values within this distribution. This can be compared to the use of growth charts to map child development in terms of height and weight as a function of age, where deviations from a normal growth trajectory manifest as outliers within the normative range at each age. This is operationalized by learning some decision function that quantifies the variation across the population range, including healthy functioning and also potentially symptoms (see Supplementary Methods). Such approaches have been proposed for identifying subjects that have an abnormal maturational trajectory in brain structure (86) or in cognitive development (85), or for mapping any clinically relevant variable (22). This approach breaks the symmetry inherent in case-control and clustering approaches and provides multiple benefits. First, it does not entail making strong assumptions about the clinical group (e.g., existence or number of subgroups). This was shown by Marquand et al. (22), where the clinical variables did not form clearly defined clusters but normative modeling identified distinct brain mechanisms that give rise to symptoms. Second, it allows both normal functioning and deviations from normal functioning that may underlie symptoms to be mapped in individual subjects. Third, it permits diagnostic labels to be used as predictor variables, enabling inferences over the labels. Finally, it intuitively matches the clinical conception where diseases in individual patients are recognized as deviations from normal functioning. 
This approach can be used to estimate mappings between biology and behavior across multiple cognitive domains; therefore, it is well aligned with RDoC and ROAMER and also complements clustering because clustering algorithms can still be applied to these mappings. On the other hand, normative modeling requires careful data processing to ensure that the outliers detected reflect genuine deviations from the normative distribution rather than artifacts. It is also best suited to large normative cohorts that capture the full range of functioning in the reference population.
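The growth chart analogy can be made concrete with a small sketch. It is an illustration only and not the model used in the cited studies: it assumes a single synthetic measure that varies linearly with age, fits an ordinary least-squares line with constant residual variance to a reference cohort, and expresses each individual as a z-score relative to the age-specific prediction. The function names and all numerical values are hypothetical.

```python
import random
import statistics


def fit_normative_model(ages, values):
    """Least-squares line plus residual SD, estimated in the reference cohort."""
    mean_age, mean_val = statistics.fmean(ages), statistics.fmean(values)
    slope = sum((a - mean_age) * (v - mean_val) for a, v in zip(ages, values)) / sum(
        (a - mean_age) ** 2 for a in ages
    )
    intercept = mean_val - slope * mean_age
    residuals = [v - (intercept + slope * a) for a, v in zip(ages, values)]
    return slope, intercept, statistics.stdev(residuals)


def deviation_z(model, age, value):
    """Z-score of one subject relative to the age-specific normative prediction."""
    slope, intercept, resid_sd = model
    return (value - (intercept + slope * age)) / resid_sd


rng = random.Random(0)
# Hypothetical reference cohort: a measure that rises linearly with age plus noise.
ages = [rng.uniform(8, 18) for _ in range(300)]
values = [50 + 2.0 * a + rng.gauss(0, 3) for a in ages]
model = fit_normative_model(ages, values)

# Two hypothetical 12-year-olds: one near the expected value, one far below it.
z_typical = deviation_z(model, 12, 74.5)
z_extreme = deviation_z(model, 12, 60.0)
print(f"typical: z = {z_typical:.2f}, extreme: z = {z_extreme:.2f}")
```

Because each subject receives an individual deviation score, no assumption about the existence or number of subgroups is needed, and the resulting deviations could still be passed to a clustering algorithm afterwards. A more flexible regression model would be needed wherever the mean or the variance changes nonlinearly across the covariate range.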
Discussion
In this article, we introduced the basic concepts of data-driven stratification of psychiatric disorders and reviewed the existing literature. The overwhelming majority of studies have employed clustering or FMM, aiming to subgroup clinical populations. This has been somewhat successful (Table 1, Table 2, Table 3, Table 4, Table 5), although the results are heavily dependent on the type of data used; for most disorders, both the number and characteristics of the derived clusters vary between studies, and a consensus as to a consistent set of subgroups is yet to be reached. We highlighted the importance of validation to ensure that derived clusters map onto clinically relevant variation and outlined various alternatives to clustering.
The ongoing discussion surrounding psychiatric nosology reflects well-acknowledged difficulties in finding biological markers that predict current disease state or future outcomes with sufficient sensitivity and specificity to be clinically useful (1, 10). While this is an important motivation behind RDoC and ROAMER (11, 12, 87), this review highlights that neither the reclassification of psychiatric disorders nor the emphasis on cutting across current diagnostic classifications is a central innovative feature. A more important contribution is a shift away from symptoms and towards conceptualizing pathology as spanning multiple domains of functioning and across multiple levels of analysis. In RDoC, this is represented as a matrix with rows containing basic cognitive dimensions (“constructs”) grouped into domains of functioning (e.g., positive or negative valence systems) and columns containing units of analysis (e.g., genes, cells, or circuits) (87). Viewed in this light, clustering algorithms provide only a partial answer to the challenges posed by RDoC and ROAMER because they do not provide an obvious means to link constructs with units of analysis. Put simply, it is necessary to link the rows of the RDoC matrix with its columns and chart the variation in these mappings. This is necessary before the clinical validity of RDoC domains can be assessed, that is, whether they predict disease states more accurately than classical diagnostic categories (38).
Surprisingly few methods have been proposed that meet these objectives. Most that do exist aim to break the symmetry that both the case-control paradigm and clustering approaches entail, namely the assumption that all clinical groups are well-defined entities. Normative modeling (22, 85, 86) is one particularly promising approach that aims to map variation in clinically relevant variables, so that each individual subject can be placed within the population range and disease can be considered as an extreme deviation from a normal pattern of functioning. This provides a workable alternative to lumping and splitting the psychiatric phenotype and a method to chart variability across different domains of functioning and different units of analysis.
Our review also highlighted that few studies have used biological measures to derive stratifications. This may be because unsupervised methods have difficulty separating nuisance variation from clinically relevant variation, particularly in high dimensions (31). This may be particularly problematic in genomic studies; some reports have used genomic data as validation of the derived clusters (60, 68), but the only study we are aware of that used genomic data to derive clusters (88) has received severe criticism for inadequately dealing with artefactual variation (see footnote 8). One way that this problem may be addressed in the future is by developing richer clustering models that integrate clinical or domain knowledge in a way that guides the clustering algorithm toward clinically relevant variation. A simple example is the use of growth mixture models to cluster samples on the basis of within-participant change over time (39, 40). More generally, probabilistic graphical models (24) provide an elegant framework that allows existing knowledge to be incorporated to help find clinically meaningful clusters. To our knowledge, this approach has not been used in psychiatry, but it has been useful to stratify disease cohorts in other clinical domains (89). Other emerging machine learning techniques that may be fruitfully applied to stratifying psychiatric disorders include probabilistic methods that allow for multiple labels within individual patients (90), clustering methods that do not uniquely assign points to a single cluster (31), and deep learning methods (91, 92).
In summary, we reviewed the literature for stratifying psychiatric disorders and showed that the field has, to date, relied heavily on clustering and FMM. These undoubtedly provide an important contribution but only partially satisfy the objectives of RDoC and ROAMER. It is also necessary to chart variation in brain-behavior mappings to fully parse heterogeneity across domains of functioning and diagnostic categories. The hope is that using such mappings to derive future disease stratifications will enable clinical phenotypes to be dissected along the most relevant axes of variation, ultimately enabling treatments to be better targeted to individual patients.
Acknowledgments And Disclosures
This work was supported by the Netherlands Organization for Scientific Research (NWO) under the Gravitation Programme (Grant No. 024.001.006 supporting AFM), by a Marie Curie International Incoming Fellowship under the European Union’s Seventh Framework Programme (FP7/2007-2013) under Grant No. 327340 (BRAIN FINGERPRINT to MM), and by a VIDI grant from the NWO (Grant No. 864-12-003 to CFB). JB received funding from the FP7 under Grant Nos. 602805 (AGGRESSOTYPE), 603016 (MATRICS), and 278948 (TACTICS) and from the European Community’s Horizon 2020 Programme (H2020/2014-2020) under Grant Nos. 643051 (MiND) and 642996 (BRAINVIEW). We also acknowledge funding from the Wellcome Trust UK Strategic Award (098369/Z/12/Z).
JB has been a consultant to, advisory board member of, and a speaker for Janssen Cilag BV, Eli Lilly, Shire, Lundbeck, Roche, and Servier. He is not an employee of any of these companies, and not a stock shareholder of any of these companies. He has no other financial or material support, including expert testimony, patents or royalties. CFB is director and shareholder of SBGneuro Ltd. The other authors report no biomedical financial interests or potential conflicts of interest.
Footnotes
1. We identified studies by performing a PubMed search for each disorder separately using the following search string: [(clustering OR subtypes OR subgroups OR stratification) and (disorder name OR disorder acronyms)]. We then selected a representative overview of studies for each disorder (this was exhaustive for ADHD, autism, and cross-diagnostic studies). For example, in the case of multiple studies using the same cohort, we only included the first or most important in this review. We also gave priority to studies that have not been reviewed previously (19, 51, 52).
2. Many of the FMM approaches discussed here originate in the psychometric literature, which uses different nomenclature to mainstream statistics. Unfortunately, this nomenclature also varies between authors. We use consistent terminology throughout and synthesize with mainstream statistical literature wherever possible.
3. Referred to as “latent profile analysis” in the psychometric literature.
4. Also referred to as “group-based trajectory modeling.”
5. The overall objectives of clustering approaches and FMMs are similar; for the remainder of this article, we refer to both as “clustering” for brevity.
6. Technically, clustering belongs to the “NP-hard” class of problems.
7. In contrast, there is a clear measure by which success of supervised methods can be assessed: the expected loss, measured by some loss function, over the joint distribution of labels and covariates. This can be estimated in various ways (e.g., cross-validation).
8. For example, see the discussion at: http://www.ncbi.nlm.nih.gov/pubmed/25219520.
Supplementary material cited in this article is available online at 10.1016/j.bpsc.2016.04.002.
References
- 1. Kapur S., Phillips A.G., Insel T.R. Why has it taken so long for biological psychiatry to develop clinical tests and what to do about it? Mol Psychiatry. 2012;17:1174–1179. doi: 10.1038/mp.2012.105.
- 2. McKusick V.A. On lumpers and splitters or nosology of genetic disease. Perspect Biol Med. 1969;12:298–312. doi: 10.1353/pbm.1969.0039.
- 3. Kraepelin E. 8th ed. Krieger Publishing; Huntington, NY: 1909. Psychiatrie; p. 1971.
- 4. Bleuler E. Springer-Verlag; Berlin: 1920. Lehrbuch der Psychiatrie.
- 5. American Psychiatric Association. 5th ed. American Psychiatric Association; Washington, DC: 2013. Diagnostic and Statistical Manual of Mental Disorders.
- 6. World Health Organization. World Health Organization; Geneva, Switzerland: 1992. International Statistical Classification of Diseases and Health Related Problems.
- 7. Paykel E.S. Classification of depressed patients: cluster analysis derived grouping. Br J Psychiatry. 1971;118:275–288. doi: 10.1192/bjp.118.544.275.
- 8. Farmer A.E., McGuffin P., Spitznagel E.L. Heterogeneity in schizophrenia: a cluster-analytic approach. Psychiatry Res. 1983;8:1–12. doi: 10.1016/0165-1781(83)90132-4.
- 9. Mirnezami R., Nicholson J., Darzi A. Preparing for precision medicine. N Engl J Med. 2012;366:489–491. doi: 10.1056/NEJMp1114866.
- 10. Insel T.R., Cuthbert B.N. Brain disorders? Precisely. Science. 2015;348:499–500. doi: 10.1126/science.aab2358.
- 11. Insel T., Cuthbert B., Garvey M., Heinssen R., Pine D.S., Quinn K. Research domain criteria (RDoC): Toward a new classification framework for research on mental disorders. Am J Psychiatry. 2010;167:748–751. doi: 10.1176/appi.ajp.2010.09091379.
- 12. Schumann G., Binder E.B., Holte A., de Kloet E.R., Oedegaard K.J., Robbins T.W. Stratified medicine for mental disorders. Eur Neuropsychopharmacol. 2014;24:5–50. doi: 10.1016/j.euroneuro.2013.09.010.
- 13. Brodersen K.H., Deserno L., Schlagenhauf F., Lin Z., Penny W.D., Buhmann J.M. Dissecting psychiatric spectrum disorders by generative embedding. Neuroimage Clin. 2014;4:98–111. doi: 10.1016/j.nicl.2013.11.002.
- 14. Karalunas S.L., Fair D., Musser E.D., Aykes K., Iyer S.P., Nigg J.T. Subtyping attention-deficit/hyperactivity disorder using temperament dimensions toward biologically based nosologic criteria. JAMA Psychiatry. 2014;71:1015–1024. doi: 10.1001/jamapsychiatry.2014.763. [Retracted]
- 15. Fair D.A., Bathula D., Nikolas M.A., Nigg J.T. Distinct neuropsychological subgroups in typically developing youth inform heterogeneity in children with ADHD. Proc Natl Acad Sci U S A. 2012;109:6769–6774. doi: 10.1073/pnas.1115365109.
- 16. Gates K.M., Molenaar P.C.M., Iyer S.P., Nigg J.T., Fair D.A. Organizing heterogeneous samples using community detection of GIMME-derived resting state functional networks. PLoS One. 2014;9:e91322. doi: 10.1371/journal.pone.0091322.
- 17. Costa Dias T.G., Iyer S.P., Carpenter S.D., Cary R.P., Wilson V.B., Mitchell S.H. Characterizing heterogeneity in children with and without ADHD based on reward system connectivity. Dev Cogn Neurosci. 2015;11:155–174. doi: 10.1016/j.dcn.2014.12.005.
- 18. Bell M.D., Corbera S., Johannesen J.K., Fiszdon J.M., Wexler B.E. Social cognitive impairments and negative symptoms in schizophrenia: Are there subtypes with distinct functional correlates? Schizophr Bull. 2013;39:186–196. doi: 10.1093/schbul/sbr125.
- 19. van Loo H.M., de Jonge P., Romeijn J.-W., Kessler R.C., Schoevers R.A. Data-driven subtypes of major depressive disorder: A systematic review. BMC Med. 2012;10:156. doi: 10.1186/1741-7015-10-156.
- 20. Pattyn T., Van Den Eede F., Lamers F., Veltman D., Sabbe B.G., Penninx B.W. Identifying panic disorder subtypes using factor mixture modeling. Depression Anxiety. 2015;32:509–517. doi: 10.1002/da.22379.
- 21. Sacco R., Lenti C., Saccani M., Curatolo P., Manzi B., Bravaccio C. Cluster analysis of autistic patients based on principal pathogenetic components. Autism Res. 2012;5:137–147. doi: 10.1002/aur.1226.
- 22. Marquand A.F., Rezek I., Buitelaar J., Beckmann C.F. Understanding heterogeneity in clinical cohorts using normative models: Beyond case control studies. Biol Psychiatry. 2016;80:547–556. doi: 10.1016/j.biopsych.2015.12.023.
- 23. Hastie T., Tibshirani R., Friedman J. 2nd ed. Springer; New York: 2009. The Elements of Statistical Learning.
- 24. Bishop C. Springer; New York: 2006. Pattern Recognition and Machine Learning.
- 25. Lazarsfeld P.F., Henry N.W. Houghton Mifflin; Boston: 1968. Latent Structure Analysis.
- 26. Muthen B. Beyond SEM: General latent variable modeling. Behaviormetrika. 2002;29:81–117.
- 27. Klöppel S., Abdulkadir A., Jack C.R., Jr, Koutsouleris N., Mourão-Miranda J., Vemuri P. Diagnostic neuroimaging across diseases. Neuroimage. 2012;61:457–463. doi: 10.1016/j.neuroimage.2011.11.002.
- 28. Orru G., Pettersson-Yeo W., Marquand A.F., Sartori G., Mechelli A. Using Support Vector Machine to identify imaging biomarkers of neurological and psychiatric disease: A critical review. Neurosci Biobehav Rev. 2012;36:1140–1152. doi: 10.1016/j.neubiorev.2012.01.004.
- 29. Wolfers T., Buitelaar J.K., Beckmann C., Franke B., Marquand A.F. From estimating activation locality to predicting disorder: A review of pattern recognition for neuroimaging-based psychiatric diagnostics. Neurosci Biobehav Rev. 2015;57:328–349. doi: 10.1016/j.neubiorev.2015.08.001.
- 30. Xu R., Wunsch D. II. Survey of clustering algorithms. IEEE Trans Neural Netw. 2005;16:645–678. doi: 10.1109/TNN.2005.845141.
- 31. Kriegel H.-P., Kroeger P., Zimek A. Clustering high-dimensional data: A survey on subspace clustering, pattern-based clustering, and correlation clustering. ACM Transactions on Knowledge Discovery from Data. 2009;3:1–58.
- 32. Ward J.H. Hierarchical grouping to optimize an objective function. J Am Statistical Assoc. 1963;58:236–244.
- 33. Newman M.E.J. Modularity and community structure in networks. Proc Natl Acad Sci U S A. 2006;103:8577–8582. doi: 10.1073/pnas.0601602103.
- 34. Hagenaars J.A., McCutcheon A.L. Cambridge University Press; Cambridge: 2002. Applied latent class cluster analysis.
- 35. Muthen B., Shedden K. Finite mixture modeling with mixture outcomes using the EM algorithm. Biometrics. 1999;55:463–469. doi: 10.1111/j.0006-341x.1999.00463.x.
- 36. Nagin D.S. Analyzing developmental trajectories: A semiparametric, group-based approach. Psychological Methods. 1999;4:139–157. doi: 10.1037/1082-989x.6.1.18.
- 37. Insel T.R. Mental disorders in childhood: Shifting the focus from behavioral symptoms to neurodevelopmental trajectories. JAMA. 2014;311:1727–1728. doi: 10.1001/jama.2014.1193.
- 38. Weinberger D.R., Goldberg T.E. RDoCs redux. World Psychiatry. 2014;13:36–38. doi: 10.1002/wps.20096.
- 39. Rhebergen D., Lamers F., Spijker J., de Graaf R., Beekman A.T.F., Penninx B.W.J.H. Course trajectories of unipolar depressive disorders identified by latent class growth analysis. Psychol Med. 2012;42:1383–1396. doi: 10.1017/S0033291711002509.
- 40. Fountain C., Winter A.S., Bearman P.S. Six developmental trajectories characterize children with autism. Pediatrics. 2012;129:E1112–E1120. doi: 10.1542/peds.2011-1601.
- 41. Titterington D.M., Smith A.F.M., Makov U.E. John Wiley and Sons; New York: 1985. Statistical analysis of finite mixture distributions.
- 42. Schwarz G. Estimating dimension of a model. Ann Stat. 1978;6:461–464.
- 43. Akaike H. A new look at the statistical model identification. IEEE Trans Automatic Control. 1974;19:716–723.
- 44. Lo Y.T., Mendell N.R., Rubin D.B. Testing the number of components in a normal mixture. Biometrika. 2001;88:767–778.
- 45. Nylund K.L., Asparouhov T., Muthen B.O. Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Struct Equ Modeling. 2007;14:535–569.
- 46. Rousseeuw P.J. Silhouettes: a graphical aid to the interpretation and validation of cluster-analysis. J Comput Appl Mathematics. 1987;20:53–65.
- 47. Dunn J.C. A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. J Cybernet. 1973;3:32–57.
- 48. Saha S., Bandyopadhyay S. Some connectivity based cluster validity indices. Applied Soft Computing. 2012;12:1555–1565.
- 49. Ferguson T.S. A Bayesian analysis of some nonparametric problems. Annals of Statistics. 1973;1:209–230.
- 50. Jablensky A. Subtyping schizophrenia: Implications for genetic research. Mol Psychiatry. 2006;11:815–836. doi: 10.1038/sj.mp.4001857.
- 51. Heinrichs R.W. Meta-analysis, and the science of schizophrenia: Variant evidence or evidence of variants? Neurosci Biobehav Rev. 2004;28:379–394. doi: 10.1016/j.neubiorev.2004.06.003.
- 52. Sun H., Lui S., Yao L., Deng W., Xiao Y., Zhang W. Two patterns of white matter abnormalities in medication-naive patients with first-episode schizophrenia revealed by diffusion tensor imaging and cluster analysis. JAMA Psychiatry. 2015;72:678–686. doi: 10.1001/jamapsychiatry.2015.0505.
- 53. Kendler K.S., Eaves L.J., Walters E.E., Neale M.C., Heath A.C., Kessler R.C. The identification and validation of distinct depressive syndromes in a population-based sample of female twins. Arch Gen Psychiatry. 1996;53:391–399. doi: 10.1001/archpsyc.1996.01830050025004.
- 54. Sullivan P.F., Kessler R.C., Kendler K.S. Latent class analysis of lifetime depressive symptoms in the National Comorbidity Survey. Am J Psychiatry. 1998;155:1398–1406. doi: 10.1176/ajp.155.10.1398.
- 55. Lamers F., de Jonge P., Nolen W.A., Smit J.H., Zitman F.G., Beekman A.T.F. Identifying depressive subtypes in a large cohort study: Results from the Netherlands Study of Depression and Anxiety (NESDA). J Clin Psychiatry. 2010;71:1582–1589. doi: 10.4088/JCP.09m05398blu.
- 56. Lamers F., Burstein M., He J.P., Avenevoli S., Angst J., Merikangas K.R. Structure of major depressive disorder in adolescents and adults in the US general population. Br J Psychiatry. 2012;201:143–150. doi: 10.1192/bjp.bp.111.098079.
- 57. Maes M., Maes L., Schotte C., Cosyns P. A clinical and biological validation of the DSM-III melancholia diagnosis in men: results of pattern-recognition methods. J Psychiatr Res. 1992;26:183–196. doi: 10.1016/0022-3956(92)90022-g.
- 58. Hybels C.F., Blazer D.G., Pieper C.F., Landerman L.R., Steffens D.C. Profiles of depressive symptoms in older adults diagnosed with major depression: Latent cluster analysis. Am J Geriatr Psychiatry. 2009;17:387–396. doi: 10.1097/JGP.0b013e31819431ff.
- 59. van Loo H.M., Cai T., Gruber M.J., Li J., de Jonge P., Petukhova M. Major depressive disorder subtypes to predict long-term course. Depression Anxiety. 2014;31:765–777. doi: 10.1002/da.22233.
- 60. Milaneschi Y., Lamers F., Peyrot W.J., Abdellaoui A., Willemsen G., Hottenga J.J. Polygenic dissection of major depression clinical heterogeneity. Mol Psychiatry. 2015;21:516–522. doi: 10.1038/mp.2015.86.
- 61. Betancur C. Etiological heterogeneity in autism spectrum disorders: More than 100 genetic and genomic disorders and still counting. Brain Res. 2011;1380:42–77. doi: 10.1016/j.brainres.2010.11.078.
- 62. Schizophrenia Working Group of the Psychiatric Genomics Consortium. Biological insights from 108 schizophrenia-associated genetic loci. Nature. 2014;511:421–427. doi: 10.1038/nature13595.
- 63. Cannon T.D. Deciphering the genetic complexity of schizophrenia. JAMA Psychiatry. 2016;73:5–6. doi: 10.1001/jamapsychiatry.2015.2111.
- 64. Lahey B.B., Pelham W.E., Loney J., Lee S.S., Willcutt E. Instability of the DSM-IV subtypes of ADHD from preschool through elementary school. Arch Gen Psychiatry. 2005;62:896–902. doi: 10.1001/archpsyc.62.8.896.
- 65. Slagle J.L., Chang C.L., Heller S.L. A clustering and data-reorganization algorithm. IEEE Transactions on Systems Man and Cybernetics. 1975;5:121–128.
- 66. Bouveyron C., Brunet-Saumard C. Model-based clustering of high-dimensional data: A review. Computational Statistics & Data Analysis. 2014;71:52–78.
- 67. van Hulst B.M., de Zeeuw P., Durston S. Distinct neuropsychological profiles within ADHD: A latent class analysis of cognitive control, reward sensitivity and timing. Psychological Med. 2015;45:735–745. doi: 10.1017/S0033291714001792.
- 68. Veatch O.J., Veenstra-VanderWeele J., Potter M., Pericak-Vance M.A., Haines J.L. Genetically meaningful phenotypic subgroups in autism spectrum disorders. Genes Brain Behav. 2014;13:276–285. doi: 10.1111/gbb.12117.
- 69. Robins E., Guze S.B. Establishment of diagnostic validity in psychiatric illness: its application to schizophrenia. Am J Psychiatry. 1970;126:983–987. doi: 10.1176/ajp.126.7.983.
- 70. Chaste P., Klei L., Sanders S.J., Hus V., Murtha M.T., Lowe J.K. A genome-wide association study of autism using the Simons simplex collection: Does reducing phenotypic heterogeneity in autism increase genetic homogeneity? Biol Psychiatry. 2015;77:775–784. doi: 10.1016/j.biopsych.2014.09.017.
- 71. Milaneschi Y., Lamers F., Bot M., Drent M.L., Penninx B.W.J.H. Leptin dysregulation is specifically associated with major depression with atypical features: Evidence for a mechanism connecting obesity and depression. Biol Psychiatry. 2015 Nov 17. doi: 10.1016/j.biopsych.2015.10.023. [Epub ahead of print]
- 72. Lamers F., Beekman A.T.F., van Hemert A.M., Schoevers R.A., Penninx B.W.J.H. Six-year longitudinal course and outcomes of subtypes of depression. Br J Psychiatry. 2016;208:62–68. doi: 10.1192/bjp.bp.114.153098.
- 73. Schmaal L., Marquand A.F., Rhebergen D., van Tol M.J., Ruhe H.G., van der Wee N.J.A. Predicting the naturalistic course of major depressive disorder using clinical and multimodal neuroimaging information: A multivariate pattern recognition study. Biol Psychiatry. 2015;78:278–286. doi: 10.1016/j.biopsych.2014.11.018.
- 74. Young J., Ashburner J., Ourselin S. IEEE; Philadelphia: 2013. Wrapper methods to correct mislabelled training data. 3rd International Workshop on Pattern Recognition in Neuroimaging.
- 75. Filipovych R., Davatzikos C., Alzheimer’s Disease Neuroimaging Initiative. Semi-supervised pattern classification of medical images: Application to mild cognitive impairment (MCI). Neuroimage. 2011;55:1109–1119. doi: 10.1016/j.neuroimage.2010.12.066.
- 76. Filipovych R., Resnick S.M., Davatzikos C. JointMMCC: Joint maximum-margin classification and clustering of imaging data. IEEE Transactions on Medical Imaging. 2012;31:1124–1140. doi: 10.1109/TMI.2012.2186977.
- 77. Varol E., Sotiras A., Davatzikos C. Springer; Heidelberg: 2015. Disentangling disease heterogeneity with max-margin multiple hyperplane classifier. Medical Image Computing and Computer-Assisted Intervention (MICCAI); pp. 702–709.
- 78. Eavani H., Hsieh M.K., An Y., Erus G., Beason-Held L., Resnick S. Capturing heterogeneous group differences using mixture-of-experts: Application to a study of aging. Neuroimage. 2016;125:498–514. doi: 10.1016/j.neuroimage.2015.10.045.
- 79. Sabuncu M.R., Balci S.K., Shenton M.E., Golland P. Image-driven population analysis through mixture modeling. IEEE Transactions on Medical Imaging. 2009;28:1473–1487. doi: 10.1109/TMI.2009.2017942.
- 80. van der Maaten L., Hinton G. Visualizing data using t-SNE. J Machine Learn Res. 2008;9:2579–2605.
- 81.Ridgway G.R., Lehmann M., Barnes J., Rohrer J.D., Warren J.D., Crutch S.J. Early-onset Alzheimer disease clinical variants Multivariate analyses of cortical thickness. Neurology. 2012;79:80–84. doi: 10.1212/WNL.0b013e31825dce28. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Mwangi B., Soares J.C., Hasan K.M. Visualization and unsupervised predictive clustering of high-dimensional multimodal neuroimaging data. J Neurosci Methods. 2014;236:19–25. doi: 10.1016/j.jneumeth.2014.08.001. [DOI] [PubMed] [Google Scholar]
- 83.Scholkopf B., Platt J.C., Taylor J.S., Smola A.J., Williamson R.C. Estimating the support of a high-dimensional distribution. Neural Computation. 2001;13:1443–1471. doi: 10.1162/089976601750264965. [DOI] [PubMed] [Google Scholar]
- 84.Mourao-Miranda J., Hardoon D.R., Hahn T., Marquand A.F., Williams S.C.R., Shawe-Taylor J. Patient classification as an outlier detection problem: An application of the One-Class Support Vector Machine. Neuroimage. 2011;58:793–804. doi: 10.1016/j.neuroimage.2011.06.042. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Gur R.C., Calkins M.E., Satterthwaite T.D., Ruparel K., Bilker W.B., Moore T.M. Neurocognitive growth charting in psychosis spectrum youths. JAMA Psychiatry. 2014;71:366–374. doi: 10.1001/jamapsychiatry.2013.4190. [DOI] [PubMed] [Google Scholar]
- 86.Erus G., Battapady H., Satterthwaite T.D., Hakonarson H., Gur R.E., Davatzikos C. Imaging patterns of brain development and their relationship to cognition. Cerebral Cortex. 2015;25:1676–1684. doi: 10.1093/cercor/bht425. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Cuthbert B.N. The RDoC framework: Facilitating transition from ICD/DSM to dimensional approaches that integrate neuroscience and psychopathology. World Psychiatry. 2014;13:28–35. doi: 10.1002/wps.20087. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Arnedo J., Svrakic D.M., del Val C., Romero-Zaliz R., Hernandez-Cuervo H., Fanous A.H. Uncovering the hidden risk architecture of the schizophrenias: Confirmation in three independent genome-wide association studies. Am J Psychiatry. 2015;172:139–153. doi: 10.1176/appi.ajp.2014.14040435. [DOI] [PubMed] [Google Scholar]
- 89.Simpson A., Tan V.Y.F., Winn J., Svensen M., Bishop C.M., Heckerman D.E. Beyond atopy: Multiple patterns of sensitization in relation to asthma in a birth cohort study. Am J Respir Crit Care Med. 2010;181:1200–1206. doi: 10.1164/rccm.200907-1101OC.
- 90.Ruiz F.J.R., Valera I., Blanco C., Perez-Cruz F. Bayesian nonparametric comorbidity analysis of psychiatric disorders. J Machine Learn Res. 2014;15:1215–1247.
- 91.LeCun Y., Bengio Y., Hinton G. Deep learning. Nature. 2015;521:436–444. doi: 10.1038/nature14539.
- 92.Kim J., Calhoun V.D., Shim E., Lee J.H. Deep neural network with weight sparsity control and pre-training extracts hierarchical features and enhances classification performance: Evidence from whole-brain resting-state functional connectivity patterns of schizophrenia. Neuroimage. 2016;124:127–146. doi: 10.1016/j.neuroimage.2015.05.018.
- 93.Castle D.J., Sham P.C., Wessely S., Murray R.M. The subtyping of schizophrenia in men and women—a latent class analysis. Psychological Med. 1994;24:41–51. doi: 10.1017/s0033291700026817.
- 94.Sham P.C., Castle D.J., Wessely S., Farmer A.E., Murray R.M. Further exploration of a latent class typology of schizophrenia. Schizophrenia Res. 1996;20:105–115. doi: 10.1016/0920-9964(95)00091-7.
- 95.Dollfus S., Everitt B., Ribeyre J.M., Assouly-Besse F., Sharp C., Petit M. Identifying subtypes of schizophrenia by cluster analyses. Schizophrenia Bull. 1996;22:545–555. doi: 10.1093/schbul/22.3.545.
- 96.Kendler K.S., Karkowski L.M., Walsh D. The structure of psychosis—Latent class analysis of probands from the Roscommon family study. Arch Gen Psychiatry. 1998;55:492–499. doi: 10.1001/archpsyc.55.6.492.
- 97.Murray V., McKee I., Miller P.M., Young D., Muir W.J., Pelosi A.J. Dimensions and classes of psychosis in a population cohort: a four-class, four-dimension model of schizophrenia and affective psychoses. Psychological Med. 2005;35:499–510. doi: 10.1017/s0033291704003745.
- 98.Dawes S.E., Jeste D.V., Palmer B.W. Cognitive profiles in persons with chronic schizophrenia. J Clin Exp Neuropsychol. 2011;33:929–936. doi: 10.1080/13803395.2011.578569.
- 99.Cole V.T., Apud J.A., Weinberger D.R., Dickinson D. Using latent class growth analysis to form trajectories of premorbid adjustment in schizophrenia. J Abnormal Psychol. 2012;121:388–395. doi: 10.1037/a0026922.
- 100.Friston K.J., Harrison L., Penny W. Dynamic causal modelling. Neuroimage. 2003;19:1273–1302. doi: 10.1016/s1053-8119(03)00202-7.
- 101.Kass R.E., Raftery A.E. Bayes factors. J Am Statistical Assoc. 1995;90:773–795.
- 102.Geisler D., Walton E., Naylor M., Roessner V., Lim K.O., Schulz S.C. Brain structure and function correlates of cognitive subtypes in schizophrenia. Psychiatry Res Neuroimaging. 2015;234:74–83. doi: 10.1016/j.pscychresns.2015.08.008.
- 103.Friedman H.P., Rubin J. On some invariant criteria for grouping data. J Am Statistical Assoc. 1967;62:1159–1178.
- 104.Lamers F., Rhebergen D., Merikangas K.R., de Jonge P., Beekman A.T.F., Penninx B.W.J.H. Stability and transitions of depressive subtypes over a 2-year follow-up. Psychological Med. 2012;42:2083–2093. doi: 10.1017/S0033291712000141.
- 105.Lamers F., Vogelzangs N., Merikangas K.R., de Jonge P., Beekman A.T.F., Penninx B.W.J.H. Evidence for a differential role of HPA-axis function, inflammation and metabolic syndrome in melancholic versus atypical depression. Mol Psychiatry. 2013;18:692–699. doi: 10.1038/mp.2012.144.
- 106.Mostert J.C., Hoogman M., Onnink A.M.H., van Rooij D., von Rhein D., van Hulzen K.J.E. Similar subgroups based on cognitive performance parse heterogeneity in adults with ADHD and healthy controls. J Atten Disord. 2015 Sep 14. doi: 10.1177/1087054715602332. [Epub ahead of print]
- 107.Munson J., Dawson G., Sterling L., Beauchaine T., Zhou A., Koehler E. Evidence for latent classes of IQ in young children with autism spectrum disorder. Am J Ment Retard. 2008;113:439–452. doi: 10.1352/2008.113:439-452.
- 108.Georgiades S., Szatmari P., Boyle M., Hanna S., Duku E., Zwaigenbaum L. Investigating phenotypic heterogeneity in children with autism spectrum disorder: A factor mixture modeling approach. J Child Psychol Psychiatry. 2013;54:206–215. doi: 10.1111/j.1469-7610.2012.02588.x.
- 109.Doshi-Velez F., Ge Y., Kohane I. Comorbidity clusters in autism spectrum disorders: An electronic health record time-series analysis. Pediatrics. 2014;133:E54–E63. doi: 10.1542/peds.2013-0819.
- 110.Hubert L., Arabie P. Comparing partitions. J Classification. 1985;2:193–218.
- 111.Lewandowski K.E., Sperry S.H., Cohen B.M., Öngür D. Cognitive variability in psychotic disorders: A cross-diagnostic cluster analysis. Psychological Med. 2014;44:3239–3248. doi: 10.1017/S0033291714000774.
- 112.Kleinman A., Caetano S.C., Brentani H., de Almeida Rocca C.C., dos Santos B., Andrade E.R. Attention-based classification pattern, a research domain criteria framework, in youths with bipolar disorder and attention-deficit/hyperactivity disorder. Aust N Z J Psychiatry. 2015;49:255–265. doi: 10.1177/0004867414557957.
- 113.Olino T.M., Klein D.N., Lewinsohn P.M., Rohde P., Seeley J.R. Latent trajectory classes of depressive and anxiety disorders from adolescence to adulthood: Description of classes and associations with risk factors. Compr Psychiatry. 2010;51:224–235. doi: 10.1016/j.comppsych.2009.07.002.