Author manuscript; available in PMC: 2023 Mar 1.
Published in final edited form as: Clin Psychol Sci. 2021 May 18;10(2):259–278. doi: 10.1177/21677026211002500

Folk Classification and Factor Rotations: Whales, Sharks, and the Problems With the Hierarchical Taxonomy of Psychopathology (HiTOP)

Gerald J Haeffel 1, Bertus F Jeronimus 2, Bonnie N Kaiser 3, Lesley Jo Weaver 4, Peter D Soyster 5, Aaron J Fisher 6, Ivan Vargas 7, Jason T Goodson 8, Wei Lu 9
PMCID: PMC9004619  NIHMSID: NIHMS1694419  PMID: 35425668

Abstract

The Hierarchical Taxonomy of Psychopathology (HiTOP) uses factor analysis to group people with similar self-reported symptoms (i.e., like-goes-with-like). It is hailed as a significant improvement over other diagnostic taxonomies. However, the purported advantages and fundamental assumptions of HiTOP have received little, if any, scientific scrutiny. We critically evaluated five fundamental claims about HiTOP. We conclude that HiTOP does not demonstrate a high degree of verisimilitude and has the potential to hinder progress on understanding the etiology of psychopathology. It does not lend itself to theory-building or taxonomic evolution, and it cannot account for multifinality, equifinality, or developmental and etiological processes. In its current form, HiTOP is not ready to use in clinical settings and may result in algorithmic bias against underrepresented groups. We recommend a bifurcation strategy moving forward in which the DSM is used in clinical settings while researchers focus on developing a falsifiable theory-based classification system.

Keywords: taxonomy, classification, DSM, HiTOP, homology, theory, mental health


Structural approaches to the classification of psychopathology use factor analysis to cluster symptoms of mental illness into dimensional groupings. This quantitative approach is currently exemplified by the Hierarchical Taxonomy of Psychopathology (HiTOP; Kotov et al., 2017). There has been a steady stream of articles from the HiTOP consortium (e.g., Conway et al., 2019; DeYoung et al., 2020; Kotov et al., 2018; 2020; Krueger et al., 2018; Latzman et al., 2020; Ruggero et al., 2019; Widiger et al., 2019) touting the benefits of its system. They claim it can carve nature at its joints (p. 429, Conway et al., 2019), resolve problems of comorbidity and heterogeneity (p. 1071, Ruggero et al., 2019), revolutionize clinical practice (p. 15, Hopwood et al., 2019), and advance psychiatric genetics and neuroscience research (Latzman et al., 2020; Waszczuk et al., 2019).

These extraordinary claims have received little, if any, scientific scrutiny. A critical evaluation of HiTOP and its purported advantages is needed. The purpose of this article is to fill this gap in the literature. First, we critically evaluated five fundamental claims about HiTOP. Second, we compared HiTOP to alternative taxonomies to evaluate the degree to which they lend themselves to taxonomic evolution (from description to theory) and scientific progress (e.g., falsification). Finally, we made recommendations for future research.

Claim 1. Symptom Correlations "Carve Nature at its Joints"

“Humans are prone to a “folk understanding bias”—the sensation that simplistic explanations lead us to believe we truly understand more complex phenomena” (p. 436).

– Jolly & Chang (2019)

Ostensibly, the HiTOP approach follows the same logic as the Linnaean system in biology, in which every organism is classified over seven hierarchical taxa based on shared features (kingdom, phylum, class, order, family, genus, species). However, there is a critical difference between the HiTOP and Linnaean systems. HiTOP dimensions are derived in a theoretical vacuum in which all characteristics (predominantly self-reported symptoms) are considered equally important. For example, the symptom of “avoidance” is weighted the same as “sleep difficulties” and “hearing voices.” No symptom is considered more essential than any other symptom in this system. In contrast, the Linnaean system uses a theoretical perspective in which some characteristics are more important, and when present, take precedence over all other shared similarities, because of their ontogenetic precedence.

In the Linnaean system, classification decisions are not based on total levels of “likeness” (i.e., their covariation) as in HiTOP, but rather on a subgroup of highly meaningful features as determined by evolutionary theory (i.e., phylogeny, e.g., Nickels & Nelson, 2005). To this end, the Linnaean system distinguishes between homology and analogy (Petto & Mead, 2009). Homologous structures are those that descended from a common evolutionary ancestor. For example, the forelegs of horses and dogs are homologous structures because they evolved from a common ancestral tetrapod. Thus, horses and dogs are considered more “alike” than animals that do not share this common ancestor. In contrast, analogous features are those that have a similar structure and function (due to convergent evolution) but did not evolve from a common ancestor. For example, birds, bats, moths, and sea snails (pteropoda) have wings to fly but do not share a common ancestor that evolved wings. And, because this shared feature (wings) is not homologous, they are not grouped together (e.g., birds as Aves, bats as Mammalia, moths as Insecta, and sea snails as Gastropoda). Similarly, echolocation evolved independently in birds (e.g., swiftlets), noctuid moths, bats, cetaceans (e.g., dolphins), shrews, tenrecs, and humans, which are each grouped in different phyla and clades and use this skill in radically different environments (e.g., seas, skies, caves and cities). Differentiating homologous versus analogous features is critical to the Linnaean system as it is the basis for understanding the evolution and the origin of species (Dawkins & Wong, 2016).

In contrast to the Linnaean system and the newer genetically informed cladistics systems, HiTOP resembles a folk classification system (Nickels & Nelson, 2005; Petto & Mead, 2009). HiTOP puts “like-with-like” without considering etiological or underlying developmental processes. This is a problem because “like things” may be grouped together inaccurately based on superficial characteristics (analogous features), and “unlike things” might be classified separately despite sharing a common etiology (homologous features). To illustrate this point, consider what biological classification might look like if it were created using the same strategy as HiTOP (see Figure 1 below) -- that is, classifying animals based on shared features regardless of evolutionary ancestry. This process would likely lead to an overarching factor of “animal” (the A-factor), which might then break down into a bifactor model of “land” and “water” animals. An examination of the subgroups of animals organized within these two levels starts to reveal the problems with HiTOP. For example, whales and sharks would be incorrectly classified together given the high correlations among their shared features (e.g., ocean dwellers, fins for locomotion, fish and crustacean eaters, similar life spans, can adapt to multiple aquatic habitats, both largest of their family). This is because in HiTOP, features such as being warm blooded and having hair do not carry special importance. Moreover, bats would likely be incorrectly classified with other flying animals such as birds, moths and butterflies. Red pandas would likely be classified with raccoons despite phylogenetic analysis confirming that they belong in their own evolutionary family. Elephants would be grouped with other large thick-skinned herbivores such as hippos and rhinos even though their closest evolutionary relatives are hyraxes (which look like prairie dogs) and manatees. And, the Tasmanian tiger would be grouped with canids (dogs, wolves, foxes) despite being a marsupial. These are just a few of a myriad of examples that illustrate a fundamental flaw in the structural approach to classification, namely, theoretical and etiological factors are ignored. Using an empirically based strategy to sort (i.e., correlate) a large set of features does not necessarily lead to “more accurate” (Kotov et al., 2017, p. 469) or valid diagnoses even when the model has an excellent statistical fit.
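The whale and shark argument can be made concrete with a small sketch. The feature list, the 0/1 codings, and the weights below are toy assumptions chosen for illustration; they are not data from any HiTOP study or from biological systematics. The sketch compares an equal-weight "like-goes-with-like" similarity with one in which two features treated here as homologous (hair, nursing young) take precedence.

```python
# Toy illustration only: features, codings, and weights are assumptions.
FEATURES = [
    "lives_in_ocean", "fins_for_locomotion", "eats_fish",
    "streamlined_body", "can_fly", "has_hair", "nurses_young",
]

ANIMALS = {
    "whale": [1, 1, 1, 1, 0, 1, 1],
    "shark": [1, 1, 1, 1, 0, 0, 0],
    "bat":   [0, 0, 0, 0, 1, 1, 1],
    "bird":  [0, 0, 0, 0, 1, 0, 0],
}

# Hypothetical theory-driven weights: the last two features count five times as much.
THEORY_WEIGHTS = [1, 1, 1, 1, 1, 5, 5]


def similarity(a, b, weights=None):
    """Weighted simple-matching similarity: share of total weight on which a and b agree."""
    if weights is None:
        weights = [1] * len(a)  # every feature equally important
    agree = sum(w for x, y, w in zip(a, b, weights) if x == y)
    return agree / sum(weights)


def nearest(name, weights=None):
    """Return the other animal most similar to `name` under the given weights."""
    others = [other for other in ANIMALS if other != name]
    return max(others, key=lambda o: similarity(ANIMALS[name], ANIMALS[o], weights))


if __name__ == "__main__":
    for label, weights in [("equal weights", None), ("theory weights", THEORY_WEIGHTS)]:
        for animal in ("whale", "bat"):
            print(f"{label}: nearest neighbor of {animal} is {nearest(animal, weights)}")
```

With equal weights the whale's nearest neighbor in this toy set is the shark and the bat's is the bird; once the two privileged features dominate, whale and bat pair with each other. The point is only that the grouping depends on which features are allowed to matter, a decision that covariation alone does not make.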

Figure 1.

This calls into question HiTOP’s most fundamental assumption, namely that individuals who report similar patterns of symptoms have the same form of psychopathology (which can be targeted by the same treatment due to shared etiology; Ruggero et al., 2019). As our animal classification example illustrates, HiTOP cannot account for equifinality (Cicchetti & Rogosch, 1996). In the case of equifinality, two individuals can reach the same phenotypic end state through different etiological processes (similar to how birds and bats both developed wings). In HiTOP, these individuals would be considered “the same” despite the fact that they may have different disorders and need different treatments. There are numerous examples of equifinality in nature. For example, fatigue, body aches, pain, and headache are all symptoms common to influenza, rhinovirus, mononucleosis, and Lyme disease. Yet, despite sharing the same phenotype, all of these medical problems have different etiologies (i.e., they are caused by distinct pathogens), and are all treated differently. Similarly, chest pain and shortness of breath are common to acute coronary syndrome, pulmonary embolism, pneumonia, rib fracture, anxiety, and heart failure (McConaghy, 2020; Schwartzstein, 2020). Again, despite sharing the same symptom phenotype, these physical ailments are distinct and also treated differently. Similarly, it is untenable to assume that people with depression and people with PTSD should be grouped together (because of shared “distress” symptoms) without understanding their etiology. Meehl (1989) noted that “a one-to-one correlation over individuals between two things does not mean that the two things are actually identical […] all animals with a heart have a kidney, but that does not show that the words heart and kidney designate the same concept!” (p. 938).

In summary, a taxonomy built on symptom covariation is unlikely to capture the complexity of nature. There is little evidence that HiTOP: 1) is “modeled in nature” (p. 286; Krueger et al., 2018), 2) will “improve our ability to carve nature at its joints” (p. 429; Conway et al., 2019), and 3) can “explain the etiology of psychological problems” (p. 432; Conway et al., 2019).

Claim 2. HiTOP Will Solve the Problems of Comorbidity and Heterogeneity

“The hypotheses the statistician tests exist in a world of black and white, where the alternatives are clear, simple, and few in number, whereas the scientist works in a vast gray area in which the alternative hypotheses are often confusing, complex, and limited in number only by the scientist’s ingenuity”(p. 639).

– Bolles (1962)

Comorbidity

The HiTOP approach “promises to resolve problems of comorbidity, heterogeneity, and arbitrary diagnostic thresholds” (Waszczuk and colleagues, 2019; p. 12). In the case of comorbidity, it is possible that HiTOP is waging a battle on a false front. Comorbidity is a problem when the co-occurring disorders represent the same condition and can be treated the same way (i.e., they are redundant). Without understanding the etiology of the disorders we diagnose, it is impossible to know if current comorbidity rates are artificially high.

Nature is complex, and etiologically distinct conditions can frequently co-occur. For example, 60% of Americans over the age of 65 have 2 or more types of chronic medical conditions (43% have 3 or more; 24% have 4 or more; Centers for Disease Control and Prevention [CDC], 2019). Research shows that cardiovascular disease is highly comorbid with diabetes, chronic kidney disease, and depression (CDC, 2019). However, we suspect that most medical doctors and scientists would not dismiss the distinctiveness of these conditions and call for the eradication of this kind of comorbidity. In fact, level of comorbidity can be an important predictor of clinical outcomes such as adverse drug events, poor functioning, unnecessary hospitalizations, and even death (De Vries et al., 2019; Wolff et al., 2002). This kind of (valid) comorbidity is not inherently bad, nor does it invalidate a classification system.

That said, let us assume that comorbidity in the currently used diagnostic system (DSM) does reflect redundancies and inaccuracies. Does HiTOP solve the problem as promised by Conway and colleagues (2019)? The HiTOP solution is to lump diagnoses together and then give them a new label. This approach eliminates the need to provide more than one diagnosis for a cluster of symptoms, but this shell game does not create new knowledge, new theoretical explanations, or identify new etiological pathways. Rather, it gives new labels to the same collection of symptoms. This creates larger more heterogeneous groupings, which may not be clinically useful and can hinder our understanding of the etiology of mental illness. As noted by Smith and colleagues (2009), “when it occurs that a previously recognized psychological construct is subdivided into more elemental components that have different etiologies, or different external correlates, or that require different interventions, it no longer makes sense to treat the original entity as a coherent, homogeneous construct” (p. 273).

Moreover, an implicit assumption of HiTOP is that people will fit neatly into one spectrum and a line of subfactors. However, research indicates that this is unlikely. Instead, people will “score high” on multiple subfactors and spectra (e.g., the co-occurrence of internalizing and externalizing problems is substantial in both clinical and epidemiological studies; Pesenti-Gritti et al., 2008). Thus, people categorized using HiTOP are still going to carry an abundance of labels, as a person might report internalizing, externalizing, substance use, distress, and antisocial behavior symptoms.

One might respond to this criticism by asking -- if HiTOP’s hierarchical approach is not valid, then why do some treatments appear to cut across current diagnostic categories? This would seem to suggest that there are common etiologies cutting across the DSM categories, which are being captured by HiTOP’s “transdiagnostic” hierarchy. Unfortunately, the cause of a disorder does not always match up with the treatment of a disorder and vice versa (e.g., cigarette smoking is a causal risk factor for lung cancer, but stopping smoking is not an effective treatment for lung cancer). Exercise, good sleep, healthy diet, and cognitive expectations (placebo) are effective in mitigating and preventing nearly every human physical and mental ailment. The beneficial effects cut across hundreds of human problems (heart disease, depression, obesity, cancers, anxiety, etc.), but it does not mean that the problems they alleviate should be considered “the same.” Acetaminophen, Naproxen Sodium, and Ibuprofen all are effective in treating headaches, pain, and fever associated with a variety of illnesses. Yet, there is not a push in medicine to label these transdiagnostic treatments. Their efficacy also would not support the creation of a “headache” diagnostic category in a medical taxonomy. The point is that just because a treatment works for multiple problems, it does not mean those problems belong together in a taxonomy. Similarly, evidence of transdiagnostic treatments does not validate HiTOP or invalidate existing taxonomies.

Related to the idea of transdiagnostic treatments are transdiagnostic risk factors. Research shows that many risk factors are non-specific. It is unclear what conclusions can be made about this kind of non-specificity. It is not necessarily appropriate to conclude that the existence of common risk factors means that the disorders they influence should be considered “the same.” Again, research shows that smoking, poor nutrition, and low levels of exercise are the three most important predictors of common health problems in Americans including heart disease and a variety of cancers (Khera et al., 2016). The lack of specificity for these risk factors does not invalidate the diagnoses that arise from them (or justify their lumping together). This is another example of the complexity of nature and a reminder that common contributors may ultimately lead to a variety of different outcomes. Trying to eliminate comorbidity because it is “messy” likely leads to an even more invalid and artificial taxonomy.

Heterogeneity

Another purpose of HiTOP is to resolve the problem of within-disorder heterogeneity (Kotov, Krueger & Watson, 2018). The problem of heterogeneity is typically illustrated by showing that two people with the same DSM diagnosis may not share any of the same symptoms. For example, Conway and colleagues (2019) note that there are 600,000 possible PTSD symptom combinations, which indicates that the DSM and its polythetic “menu” approach is not a valid taxonomy. First, it is important to recognize that just because it is mathematically possible to have a large number of symptom combinations, it does not mean that all those combinations are expressed in reality. For example, it may be possible to have a large number of genetic configurations (haplotypes) and yet all of those combinations are not expressed in nature. That said, even if all 600,000 combinations did exist in nature, it does not invalidate the diagnosis. It is possible for individuals with the same underlying problem to express completely different symptom profiles as demonstrated by the principle of multifinality.
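The size of the combination count can be checked with simple arithmetic. The sketch below assumes the DSM-5 PTSD symptom criteria are organized as four clusters with 5, 2, 7, and 6 symptoms, requiring at least 1, 1, 2, and 2 symptoms respectively; that cluster structure is an assumption brought in for illustration and is not stated in this article. The cluster labels are likewise hypothetical names.

```python
from math import comb

# Assumed cluster structure: name -> (number of symptoms, minimum required).
CLUSTERS = {
    "B_intrusion": (5, 1),
    "C_avoidance": (2, 1),
    "D_cognition_mood": (7, 2),
    "E_arousal": (6, 2),
}


def ways(n_symptoms, min_required):
    """Number of symptom subsets of a cluster that meet its minimum."""
    return sum(comb(n_symptoms, k) for k in range(min_required, n_symptoms + 1))


def total_combinations(clusters):
    """Multiply the per-cluster counts, since clusters are satisfied independently."""
    total = 1
    for n_symptoms, min_required in clusters.values():
        total *= ways(n_symptoms, min_required)
    return total


if __name__ == "__main__":
    print(total_combinations(CLUSTERS))
```

Under these assumed criteria the product is 31 × 3 × 120 × 57 = 636,120, which is on the order of the 600,000 figure cited above. The count describes what is mathematically possible; it says nothing about how many of these profiles are actually observed.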

In the case of multifinality, the same causal agent (e.g., obesity) can lead to distinct outcomes or symptom profiles in people (e.g., diabetes or obstructive sleep apnea). Thus, it is possible for two people to express completely different symptom profiles yet share a common etiological pathway that can be targeted by the same treatment. There are numerous examples of this phenomenon in medicine. People with Lupus often have completely different symptom presentations that include some combination of fatigue, fever, joint pain, rash, pericarditis, Raynaud phenomenon, vasculitis, blood clots, nephritis, shortness of breath, and anemia (Cojocaro et al., 2011; Wallace & Gladman, 2020). Systemic Sclerosis is another disorder in which there may be no overlap in self-reported symptoms among people (symptoms can include things such as skin sclerosis, renal failure, interstitial lung disease, pulmonary hypertension, joint pain, pericardial effusion, erectile dysfunction, myopathy, and myocarditis; Adigun et al., 2002; Varga, 2020). These are just a few examples (others include COVID-19, hyperthyroidism, irritable bowel syndrome, etc.) that illustrate how people can express completely different symptom profiles without overlapping symptoms, and yet suffer from the same underlying problem. HiTOP would miss these cases because the symptom profiles do not covary; it cannot deal with this kind of natural complexity (Kendler et al., 2011).

Symptom heterogeneity is a problem when the different symptoms do not share a common etiology. Strauss and Smith (2009) provided the following example to illustrate this point. According to these authors, neuroticism consists of six correlated but distinct constructs. Thus, it is possible for two people to have the exact same score on a general measure of neuroticism but for different reasons (e.g., one person may score high on hostility and low on self-consciousness, whereas another person may score low on hostility and high on self-consciousness). They argue that this kind of heterogeneity makes a total score on neuroticism imprecise, ambiguous, and an obstacle to theory testing. If we apply this example to HiTOP, we can see how its hierarchy may also hinder scientific progress. Depression appears to be a heterogeneous construct, likely reflecting multiple disorders with distinct etiologies (McGrath, 2005; Smith et al., 2009). Thus, an overall depression score is imprecise and may lead to uninterpretable findings. HiTOP compounds the problem by creating even larger groupings such as “distress”, which includes not only depression, but also syndromes like Post Traumatic Stress Disorder and Generalized Anxiety Disorder. Distress is then combined with other heterogeneous groupings (e.g., fear, eating pathology, mania, sexual problems) under the umbrella of “internalizing.” As one moves up the hierarchy, the scores become less and less useful. As noted by Littlefield and colleagues (2020) “currently, there is no clear consensus [...] regarding the utility of these common factors as a way to understand the potential structure of important constructs or to inform theoretical and clinical efforts” (p.10).
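The neuroticism example can be sketched numerically. The facet labels and the 1-5 scores below are hypothetical and serve only to show that identical totals can hide opposite profiles.

```python
from math import sqrt

# Assumption: six illustrative facet labels; scores are made up (1-5 scale).
FACETS = ["anxiety", "hostility", "depression",
          "self_consciousness", "impulsiveness", "vulnerability"]
person_a = [3, 5, 3, 1, 3, 3]  # high hostility, low self-consciousness
person_b = [3, 1, 3, 5, 3, 3]  # low hostility, high self-consciousness


def total_score(profile):
    """General score: the simple sum of facet scores."""
    return sum(profile)


def profile_correlation(x, y):
    """Pearson correlation between two facet profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    print(total_score(person_a), total_score(person_b))
    print(round(profile_correlation(person_a, person_b), 3))
```

Both hypothetical people receive the same total, yet their facet profiles are perfectly negatively correlated. Any analysis that uses only the total treats them as interchangeable, and the same loss of information recurs at each higher level of a hierarchy that sums or averages over its components.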

In sum, it is premature to assume that a classification system is invalid because two people can have the same disorder without sharing the same symptoms (e.g., COVID-19 is a valid diagnosis despite highly heterogeneous symptom presentations). In fact, it may show that a classification system is scientifically progressive as it can account for multifinality. For example, after experiencing a life-threatening event, a small number of people will develop a clinically significant form of psychopathology (PTSD) that is expressed in a variety of ways. Despite the different symptom expressions, the DSM can identify these people as having the same problem, in part, by requiring the presence of a common contributory cause (life threatening event).

Claim 3. HiTOP is Empirical and Objective

“A statistical procedure is not an automatic, mechanical truth-generating machine for producing or verifying substantive causal theories. Of course we all know that, as an abstract proposition; but psychologists are tempted to forget it in practice. (I conjecture the temptation has become stronger due to modern computers, whereby an investigator may understand a statistical procedure only enough to instruct an R.A. or computer lab personnel to ‘factor analyze these data’)” (p. 143).

– Meehl (1992)

The structural approach to classification is described as “quantitative”, “empirical”, “more accurate”, and “derived strictly from data, free of political considerations” (p. 165; Kotov et al., 2020). Alternative approaches (e.g., DSM), in contrast, are described as the result of “authority and fiat” in which “experts gather under the auspices of official bodies and delineate classificatory rubrics through group discussions and associated political processes” (p. 282; Krueger et al., 2018). This characterization of HiTOP suggests that it is more objective and empirically valid than other classification systems; it is based on scientific facts, whereas taxonomies like the DSM are based on scientific opinions.

The insinuation that DSM committee members embrace politics over science is likely unjustified. As stated by Kendler (2018), “The procedures developed for change in DSM-5 by the American Psychiatric Association’s Steering Committee are empirically rigorous and data driven” (p. 242). Similarly, the notion that HiTOP’s 100-member consortium is immune to group dynamics is probably untrue. It is difficult to believe that decisions about HiTOP rely solely on the unthinking application of data.

Representation and Structure

Politics aside, factor analysis does seem more objective than expert consensus. Data are entered into a statistical software package, analyses specified, and a statistical solution appears without human interference. However, describing this approach as “empirical” and “data-driven” is somewhat misleading. Although HiTOP is derived from empirical data, its “structure” of symptom descriptors is not empirically supported. HiTOP uses a dimensional interpretation/simple structure procedure (Thurstone, 1947) in which stimuli are rotated to have high loadings on one dimension but low loadings on others in an effort to reduce cross-loadings and create unique factors; this is the same approach used by its predecessor, the five-factor model of personality. However, this mode of representation likely does not capture the complexity of the actual empirical structure of the data, which has yet to be actually tested (e.g., facet theory; Guttman, 1982). For example, the structure may be better represented by a radex, cylinder, circumplex, or simplex. As cautioned by Maraun (1997) “without a careful distinction being made between model, structure, representation, and mode of representation, and without the employment of appropriate methods for structural analysis, researchers are destined to confuse mere appearance with reality” (p. 646).

The dimensional interpretation/simple structure procedure leads to an infinite number of well-fitting models.¹ Choosing among these models is often based on ease of interpretation and personal preference, not empirical veracity. And, as statistical software packages have made it easier and easier to rotate solutions to simple structures, it has been “forgotten that the resulting dimensions were a post-hoc MBA [meaningful but arbitrary] expediency, not a data-driven realization of a deeper scientific reality” (p. 35; Turkheimer, 2017). HiTOP is a mathematical solution constrained by an inadequate representation of the dimensional space of the symptoms of psychopathology. According to Maraun (1997), this ensures a “systematic misrepresentation of the structure” (p. 632). Supporting this claim, multiple studies show that the complexity of human personality descriptors may be better represented by a spherical three-dimensional model than the more widely endorsed five factor model (e.g., Markey & Markey, 2006; Turkheimer et al., 2014).
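The claim that infinitely many solutions fit equally well follows from the algebra of the orthogonal common factor model: if the loadings L are multiplied by any rotation matrix R, the model-implied correlation matrix L Lᵀ + Ψ is unchanged, because R Rᵀ = I. A minimal sketch using NumPy, with a hypothetical loading matrix for six symptoms on two factors (the numbers are invented for illustration):

```python
import numpy as np

# Hypothetical simple-structure loadings: six symptoms, two orthogonal factors.
loadings = np.array([
    [0.8, 0.0],
    [0.7, 0.1],
    [0.6, 0.0],
    [0.0, 0.8],
    [0.1, 0.7],
    [0.0, 0.6],
])
# Unique variances chosen so that each symptom has unit variance.
uniquenesses = 1.0 - np.sum(loadings ** 2, axis=1)


def implied_correlations(L, psi):
    """Model-implied correlation matrix of an orthogonal factor model."""
    return L @ L.T + np.diag(psi)


def rotate(L, degrees):
    """Apply an orthogonal rotation of the two-factor space."""
    t = np.deg2rad(degrees)
    R = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])
    return L @ R


if __name__ == "__main__":
    original = implied_correlations(loadings, uniquenesses)
    for angle in (15, 45, 90):
        rotated = rotate(loadings, angle)
        same_fit = np.allclose(original, implied_correlations(rotated, uniquenesses))
        print(angle, same_fit)
        print(np.round(rotated, 2))
```

Every angle reproduces the same implied correlations, so no fit index can distinguish among the rotated solutions, even though the loadings, and therefore the substantive interpretation of the factors, differ. Selecting the simple-structure rotation is a choice imposed on the data rather than one dictated by them.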

In sum, the HiTOP model is not the result of some “truth generating machine” (p. 152; Meehl, 1992). Rather, it is a human construction based on “meaningful but arbitrary” choices (p. 1588, Turkheimer et al., 2008). Fit indices are not an indicator of validity or even replicability (Littlefield et al., 2020; Watts et al., 2020). HiTOP may ultimately be a useful heuristic, but it is false to claim that it is an empirically validated or a data-driven realization of the structure of the symptoms of psychopathology. As noted by Turkheimer (2017), “internalizing and externalizing are not substrates, with the implication of biological reality. They are dimensions, convenient statistical abstractions. We only think of rotated factors as being more natural than category boundaries because they emerge so effortlessly from the computer programs that rotate them into existence” (p. 41).

Data Decisions

Another potential source of bias in factor analysis is the data; the validity of the model depends on the validity of the information used to create it. According to Barocas and Selbst (2016), “advocates of algorithmic techniques like data mining argue that these techniques eliminate human biases from the decision-making process. But an algorithm is only as good as the data it works with” (p. 671). Data decisions are easy when there is a well-defined and circumscribed body of data. For example, input decisions for the five-factor model of personality, from which HiTOP was derived, are based on the lexical hypothesis. According to the lexical hypothesis, the most frequently used descriptors in a given language represent socially important personality traits. The usage correlations among these words result in a factor structure of socially important traits for a particular society. Here, the input decision is easy, as it is possible to analyze an entire lexicon and compare between word types and languages.

Unfortunately, this type of breadth and inclusion is currently unavailable in the area of mental illness. This raises questions about the usefulness of the input used in HiTOP. Are the self-reported symptoms used to create the HiTOP factors all meaningful indicators of psychopathology (e.g., McGrane & Maul, 2020; Michell, 2000)? Further, how many important indicators are missing from the model (Haroz et al., 2017; Huber, 2011; Keyes, 2007; van der Krieke et al., 2015)? And, how many symptoms are included in the model that are superfluous or do not generalize across cultures, gender, and age (e.g., age-crime curve, Moffitt, 1993; Shulman et al., 2013)? For example, we already know that the data used by HiTOP are biased in terms of culture, race, age, and gender, as they come from studies using samples of Western, Educated, Industrialized, Rich, and Democratic (WEIRD) participants (Arnett, 2008; Henrich et al., 2010; Kaiser & Weaver, 2019; Kohrt et al., 2014, 2016; Muroff et al., 2008; Neighbors et al., 1989; Weaver and Kaiser 2015). There is at least one study indicating that HiTOP will not be robust to changes in symptom input. Wittchen and colleagues (2009) found that even the basic internalizing and externalizing structure was not robust when different ages and different diagnoses were considered. They concluded that “it seems unlikely that fairly simple and robust structural models will ever be derived, given the complexity of psychopathological features across the lifespan” (p. 201).

The lack of representation in psychological research is a problem for all taxonomies. However, it may be significantly more difficult for data-driven models like HiTOP to capture cultural nuance than it is for other approaches (where it is possible to include cultural concepts of distress; Kaiser et al., 2015; Lewis-Fernandez & Kirmayer, 2019; Weaver and Kaiser 2015). This is the case because cultural variability is effectively erased as it is dwarfed by the overwhelming amount of data arising from WEIRD samples (which Gone et al. [2010] called “conceptual imperialism”; see also Henrich, Heine, & Norenzayan, 2010). And, when data fail to reflect heterogeneity of human experience (Fisher et al., 2018) in terms of race, gender, age, class, and culture, then systemic bias can arise (Cooper & Davids, 1986; Gelfand et al., 2002; Gone et al., 2010). For example, despite disparate symptoms and biological signatures of heart disease by gender (Chuang et al. 2012; Goldberg et al. 1998; Wenger 1990), many clinical guidelines and practices (e.g., diet, physical activity, and aspirin) are derived from foundational research that was done on men (e.g., Caerphilly Heart Disease and Whitehall Studies of the 1970s and 80s).

As the use of algorithms based on unrepresentative data has increased, so have the instances of systemic bias including: advertisements that are less likely to be presented to women, black-sounding names being falsely linked to arrest records, face recognition algorithms failing to recognize the faces of black people, photo software automatically lightening the skin tones of black people, failure to identify poor people and black people with complex health care needs, and predictive policing (Buolamwini et al., 2018; Ferguson, 2019; Lee, 2013; Morse, 2017; Obermeyer et al., 2019). In fact, the first case of an incorrect facial recognition match leading to the arrest of an innocent man has been reported (Hill, 2020, June 24).

In summary, there is little evidence to support the claim that HiTOP is more “empirical”, “accurate”, or verisimilar than existing taxonomies. This is not necessarily a problem in and of itself. What is concerning is that the HiTOP consortium continues to promote its system as objective and empirically valid. As warned by Kleinberg and colleagues (2019), “it would be naive - even dangerous - to conflate algorithmic with objective” (p. 9). Failing to acknowledge this fact (or worse, promoting the opposite) may lead to overconfidence in the validity of HiTOP and in turn, promote a mindless application of the system leading to systemic algorithmic bias for underrepresented groups.

Claim 4. HiTOP Will Lead to Genetic Discovery

“It will become apparent that seeking biology via factor analysis may be just tilting at a windmill” (p. 177).

– Guttman (1992)

According to Waszczuk and colleagues (2019), the lack of progress in identifying specific genetic variants that confer risk for psychopathology is due, in part, to poor DSM phenotypes. The authors claim that HiTOP can “accelerate genetic discovery” (p. 8) and solve the problems “that impede progress in psychiatric genetics” (p. 12). In support of this claim, Waszczuk and colleagues (2019) review a growing number of studies which have found high heritability estimates and genetic correlations with HiTOP dimensions.

There are at least two reasons HiTOP will not solve the problem of genetic discovery. First, HiTOP probably is not valid; it is a descriptive taxonomy based on symptom correlations. There is little reason to believe that these groupings reflect any natural kinds for which causal genetic variants can be discovered. Second, there is the “gloomy prospect” (Plomin & Daniels, 1987; Turkheimer & Waldron, 2000). Even if HiTOP somehow got everything right, it still would not lead to the identification of any genetic mechanisms. That is because there are no specific genetic mechanisms to be found (i.e., no “mental illness genes”). Mental illness is too complex. Researchers are converging on the conclusion that complex behavioral phenotypes are likely the result of thousands of genes, each with a negligible effect (Turkheimer, 2016; Visscher et al., 2010). Further, the myriad genes will likely combine and interact in ways that are different for each individual (e.g., intragenomic conflict; Kramer & Bressan, 2015). Genes do not directly cause psychopathology; rather, these genetic correlations are indicators of a general probabilistic influence - an uninterpretable confluence of genes and environment that influence behavior throughout the lifespan with a substantial random factor (e.g., Bierbach et al., 2017; Flint & Ideker, 2019; Turkheimer, 2016). In other words, even when genetic correlations are found, they may or may not reflect any direct etiological/causal influence on the phenotype.

If the slow progress in this area was caused by poor DSM phenotypes, as claimed by the HiTOP consortium, then we should see success in other areas of social science that have better theories and measurement tools. This is not the case; researchers have yet to discover the genetic mechanism for any complex human phenotype (intelligence, personality, etc.; Matthews & Turkheimer, 2019). Consider the example of human height. It is more heritable (.8 – .9) than mental illness and can be precisely measured. Scientists (e.g., Boyle et al., 2017; Yengo et al., 2018) have identified over 100,000 SNPs, accounting for less than 25% of variance in height (recent non-replications suggest this percentage is inflated; Berg et al., 2018; Sohail et al., 2019). It remains unclear which, if any, of the identified genetic variants exert a causal/mechanistic influence on height (Boyle, Yang, & Pritchard, 2017). As explained by Turkheimer (2012), “the unspoken claim is that assiduous attention to statistical significance and population stratification will lead to discovery of an allele with an identifiable biological pathway extending through the many levels of analysis separating the allele from the complex phenomenon it is purported to explain. If I am correct that this is what the GWAS researchers intend, it is no wonder that they don’t unpack the content of the claim, because on minimal examination it is so obviously false, false even for something not-really-so-complex as height, never mind delinquency” (p. 62).

Research on the five-factor model of personality has already shown us how genetic discovery will progress under HiTOP. Turkheimer (2014) reviewed the literature on personality and heritability and concluded “that in the genetics of personality, a paradoxical outcome that has been looming for a long time has finally come to pass: personality is heritable, but it has no genetic mechanism.” We suspect this conclusion also applies to psychopathology (as well as every other complex behavioral phenotype) regardless of how it is operationalized. Yes, psychopathology is “genetic”, but there are no specific genetic mechanisms to discover.

It is also important to address the claim that heritability estimates and genetic correlations can be used to validate the HiTOP hierarchy (Waszczuk et al., 2019). Unfortunately, showing that HiTOP taxa are heritable is relatively meaningless. This is because everything is heritable (Turkheimer’s [2000] first law of behavioral genetics). All measurable human differences have genetic correlations. Researchers have found that income, marital status, health insurance coverage, homophobia, military service, frequency of bread eating, and dog ownership are all heritable (Beaver et al., 2015; Fall et al., 2019; Hasselbalch et al., 2010; Hyytinen et al., 2019; Trumbetta et al., 2007; Wehby et al., 2019; Zapko-Willmes & Kandler, 2018). Obviously, human genes do not code for whether or not someone enrolls in healthcare coverage or joins the military. And yet, the heritability estimates for phenotypes such as marital status and owning a dog are just as large as those found for mental illness (as operationalized by HiTOP facet or DSM diagnosis). Wicherts and Johnson (2009) have shown that it is even possible to find genetic correlations using a random scale. They created a scale with random items from a multidimensional personality measure and then demonstrated that scores on it were heritable. If group differences on an artificial scale are heritable, then how noteworthy is it to show that HiTOP spectra are also heritable? It is not appropriate to use heritability estimates as a method for corroborating a taxonomy: “Neither the magnitude nor new reports of the existence of heritability in previously unmeasured psychological or behavioural measures alone tells us much of anything. Most importantly, it is not useful as a criterion to judge the biological importance or even construct validity of a psychological measure” (Johnson et al., 2011).
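The logic of the random-scale demonstration can be sketched with a small twin simulation. This is a toy illustration under stated assumptions, not a reanalysis of Wicherts and Johnson (2009): it assumes 40 independent items that are each modestly heritable (0.4), forms a "scale" by summing 10 items picked at random, and estimates heritability with Falconer's formula, h² = 2(r_MZ − r_DZ), a textbook simplification that is swapped in here for convenience. All function and variable names are hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def twin_pair_items(n_items, h2, genetic_r, rng):
    """Item scores for one twin pair; every item has its own genetic factor."""
    t1, t2 = [], []
    for _ in range(n_items):
        g1 = rng.gauss(0, 1)
        # genetic_r = 1.0 for MZ twins (identical), 0.5 for DZ twins
        g2 = genetic_r * g1 + math.sqrt(1 - genetic_r ** 2) * rng.gauss(0, 1)
        t1.append(math.sqrt(h2) * g1 + math.sqrt(1 - h2) * rng.gauss(0, 1))
        t2.append(math.sqrt(h2) * g2 + math.sqrt(1 - h2) * rng.gauss(0, 1))
    return t1, t2

def scale_twin_correlation(n_pairs, items, n_items, h2, genetic_r, rng):
    """Twin correlation of the summed score over the chosen items."""
    a, b = [], []
    for _ in range(n_pairs):
        t1, t2 = twin_pair_items(n_items, h2, genetic_r, rng)
        a.append(sum(t1[i] for i in items))
        b.append(sum(t2[i] for i in items))
    return pearson(a, b)

rng = random.Random(1)
N_ITEMS, ITEM_H2, N_PAIRS = 40, 0.4, 3000  # assumed values

# An arbitrary, content-free "scale": 10 items chosen at random.
random_items = rng.sample(range(N_ITEMS), 10)

r_mz = scale_twin_correlation(N_PAIRS, random_items, N_ITEMS, ITEM_H2, 1.0, rng)
r_dz = scale_twin_correlation(N_PAIRS, random_items, N_ITEMS, ITEM_H2, 0.5, rng)
h2_scale = 2 * (r_mz - r_dz)  # Falconer's estimate

print(f"r_MZ = {r_mz:.2f}, r_DZ = {r_dz:.2f}, Falconer h2 = {h2_scale:.2f}")
```

Because each item carries some genetic variance, any sum of items does too, regardless of whether the items form a coherent construct. A nonzero heritability estimate therefore says little about whether the scale measures anything meaningful.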

But what about genetic correlations? Conway and colleagues (2018) argue that it will be possible to identify specific genetic variants at different levels of the HiTOP hierarchy, with some influencing nonspecific psychopathology risk and others conferring risk for individual spectra, subfactors, or even symptoms. Waszczuk and colleagues (2019) provide support for this statement by citing studies that have found an alignment between genetic architecture and the HiTOP structure. They conclude that, “although these specific genetic factors often are comparatively small, they provide etiological support for a hierarchy” (p. 425, Conway et al., 2018). It is a mistake to interpret this “alignment” as validation for HiTOP. Research shows that both genetic and environmental structures often align with the phenotypic structure (e.g., Loehlin & Martin, 2013). It is called the “puzzle of parallel structure” (McCrae et al., 2001; Turkheimer, 2016). One cannot conclude that it is the genetic structure that gives rise to (and validates) HiTOP’s structure. In fact, it is likely the reverse, in that “phenotypic variation explains the genetic structure of behavior” (p. 536; Turkheimer, 2014).

In summary, it will be difficult for HiTOP to fulfill its promise to accelerate genetic discovery (Waszczuk et al., 2019). It is another descriptive taxonomy that lumps people based on similar symptom presentations. It proposes a unique hierarchy, but the symptom heterogeneity in the upper-level spectra will likely hinder genetic discovery (Smith et al., 2009). That leaves HiTOP’s dimensional rating system as its primary route for facilitating genetic discovery (although the use of continuous measures is not exclusive to HiTOP). Dimensional ratings will make it easier to detect more significant genetic correlations because of increased statistical power (similarly to using larger samples). However, identifying a few hundred more statistically significant genetic correlations does not necessarily translate to a deeper understanding of the genetic causes of psychopathology.

Claim 5. HiTOP Is Ready to Use Today

“Because the field of psychology has been reluctant to police itself, the consequences for mental health consumers and the profession at large have been problematic” (p. 53).

– Lilienfeld (2007)

According to Ruggero and colleagues (2019), HiTOP “is a viable alternative to classifying mental illness that can be integrated into practice today” (p. 1070). It is “poised to revolutionize the field’s understanding of the structure of mental disorder and reshape how diagnostic assessments are performed and utilized” (p. 5; Hopwood et al., 2019). We were unable to find any published studies or empirical data to support these claims.

There is no evidence that practicing clinicians can reliably interpret a HiTOP profile. Over 50 years of research on the fallibility of human judgment (Garb, 2005; Grove et al., 2000; Meehl, 1954) indicates that it will be extremely difficult for clinicians to reliably and validly interpret a symptom report containing potentially dozens of subscale scores (Millon, 1991). Patients are going to score high on multiple spectra, subfactors, and disorders. How will a clinician interpret all of these scores? Currently, there are no established norms or clinical cut-offs, no information for identifying primary versus secondary problems, no interpretation or treatment guidelines, etc. To date, there is not even a standardized measure that can assess the entire HiTOP taxonomy, which means clinicians are on their own to piece together an assessment and then somehow interpret the patchwork of results.

Even if the HiTOP consortium eventually creates a standardized measure with interpretation guidelines, practitioners will still need to predict which treatment will be most effective for which profile. To date, there are no studies identifying which specific HiTOP profiles respond to which empirically supported treatments.

Finally, there is no evidence that using HiTOP enhances diagnostic or treatment outcomes compared to using other taxonomies. There is not a single study in which clinicians were randomly assigned to use HiTOP or an alternative system in order to determine if a particular classification system creates better treatment outcomes. There is at least one study that provides indirect evidence that using HiTOP may not enhance treatment outcomes. Using a manipulated assessment design, Lima and colleagues (2005) randomized clinicians to either receive or not receive the MMPI symptom information for their patients. Results showed that the addition of symptom information did not improve treatment outcomes.

It is difficult to reconcile the HiTOP consortium’s call for an “empirical” classification system with their recommendation for practitioners to start using a system for which there is no empirical data to support its usefulness. There is not a standardized measure of the entire HiTOP system; there are no empirically derived interpretation and treatment guidelines; and there is yet to be a single published study directly comparing the usefulness of HiTOP to other taxonomies. In fact, there is little if any research directly testing any aspect of HiTOP. As noted by Conway and colleagues (2019) “many of the analyses that we have reviewed were carried out using datasets that were not assembled with HiTOP in mind” (p. 428). In other words, support for HiTOP has not actually come from using HiTOP. The recommendation to use HiTOP for clinical purposes is premature at best and reckless at worst.

A Comparison of Taxonomies

“To be scientifically useful a concept must lend itself to the formulation of general laws or theoretical principles which reflect uniformities in the subject matter under study, and which thus provides a basis for explanation, prediction, and generally scientific understanding” (p. 146).

– Hempel (1965)

In this section, we compare HiTOP to three alternative taxonomic approaches: DSM, RDoC, and Meehlian taxometrics (see Table 1). We focus on the HiTOP vs. DSM comparison because these are the two taxonomies in direct competition. Both HiTOP and DSM are descriptive taxonomies, and HiTOP is promoted as a replacement for DSM.

Table 1.

Approach | Currently Useful | Potential for Progress | Strengths | Weaknesses
DSM (descriptive) | | | Information retrieval, prediction, nomenclature | Atheoretical
HiTOP (descriptive) | | | Dimensional ratings | Atheoretical, unfalsifiable
RDoC (research framework) | | | Focused on etiology | Reductionist
Taxometrics | | | Falsifiable, search for natural kinds | No evidence of latent taxa

DSM

HiTOP and DSM are more similar than different. They are descriptive taxonomies that share the same fundamental assumption: symptom covariation is meaningful in nature (i.e., like-goes-with-like). Both HiTOP and DSM are atheoretical and lump people based on sharing the same self-reported symptoms. There is some empirical support for the factor structure illustrated by HiTOP (Conway et al., 2019), but there is also support for the distinctiveness of some DSM diagnoses (i.e., evidence against lumping; Gray et al., 2020; Jha et al., 2019; Korgaonkar et al., 2014a, 2014b; Tung & Brown, 2020; Webb et al., 2018). That said, neither system is a long-term solution to the problem of classification in psychopathology, as both taxonomies are likely “wrong” (i.e., “splendid fictions”, Millon, 1991).

There are two primary differences between HiTOP and the DSM. The first difference is how the symptom groupings are created. HiTOP uses factor analysis, whereas the DSM uses expert consensus. Both approaches are fallible and rely on subjective decision making. Expert consensus requires human decisions about how to interpret empirical findings and aggregate them into a coherent and usable taxonomy. Similarly, in factor analysis, there are decisions about mode of representation and how to deal with rotational indeterminacy; the consequence is that HiTOP is not any more “empirical” or “truthful” than the DSM approach. The choice between factor analysis and expert consensus is one of personal preference, as both strategies may ultimately lead to something that is clinically useful (e.g., communication, prognosis, treatment planning) even if not valid.
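Rotational indeterminacy can be made concrete with a few lines of arithmetic. The sketch below assumes a hypothetical two-factor loading matrix for six symptom items (the values are invented for illustration, and the factors are assumed orthogonal). It rotates the loadings by 45 degrees and checks that the reproduced item covariances, L Lᵀ, are unchanged, so the data alone cannot favor one set of loadings over the other.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

# Hypothetical loadings with "simple structure": items 1-3 load mainly on
# factor 1 and items 4-6 mainly on factor 2 (assumed values).
loadings = [
    [0.8, 0.0],
    [0.7, 0.1],
    [0.6, 0.0],
    [0.0, 0.7],
    [0.1, 0.8],
    [0.0, 0.6],
]

# An orthogonal rotation of the factor axes by 45 degrees.
theta = math.radians(45)
c, s = math.cos(theta), math.sin(theta)
rotation = [[c, -s], [s, c]]
rotated = matmul(loadings, rotation)

# Reproduced common covariance L L^T before and after rotation.
fit_original = matmul(loadings, transpose(loadings))
fit_rotated = matmul(rotated, transpose(rotated))

max_diff = max(abs(fit_original[i][j] - fit_rotated[i][j])
               for i in range(len(loadings))
               for j in range(len(loadings)))

print("item 1 loadings before rotation:", loadings[0])
print("item 1 loadings after rotation: ", [round(v, 3) for v in rotated[0]])
print("largest change in reproduced covariance:", max_diff)
```

After rotation, the first item loads about equally (in absolute value) on both factors, yet the model's fit to the item covariances is identical up to floating-point error. Choosing between such solutions requires a criterion imposed by the analyst (e.g., a preference for simple structure), which is the subjective step referred to above.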

The second difference between HiTOP and DSM pertains to the rating system, which is dimensional in HiTOP and categorical in the DSM. It is important to underscore that the decision to parse the landscape of psychopathology into categories or facets is based more on expedience than empirical evidence (Turkheimer, 2017). HiTOP facets and DSM categories are both artificial delineations. That said, there is research showing that most forms of mental illness (self-reported symptoms) appear to differ in quantity rather than quality (Haslam et al., 2012; Markon et al., 2011; cf. Meehl, 1999). Further, using dimensional ratings increases reliability and statistical power to detect correlations among symptoms and other constructs. Research shows that reliability estimates for specific HiTOP dimensions tend to be stronger than reliability estimates for DSM diagnoses. According to Waszczuk and colleagues (2019), 40% of DSM diagnoses did not meet acceptable levels of interrater reliability in the DSM-5 field trials, whereas reliability estimates for the same diagnoses were strong when rated dimensionally. This comparison is a bit misleading, however, because the field trials’ estimates for the DSM used clinicians who received no training in the diagnostic categories and did not use structured interviews. Thus, it is not surprising that the reliability estimates would be low. Consider the reliability estimates for diagnosing a broken bone if medical doctors were not allowed to use x-rays. Proper training and proper assessment tools (i.e., a structured interview) are needed to make reliable diagnoses. Reliability estimates for DSM diagnoses tend to be uniformly strong when structured interviews are used (e.g., Osório et al., 2019). That said, reliability estimates for HiTOP are probably going to be superior to those for diagnoses made using the DSM because of statistical necessity, not because it is more valid or scientific. As cautioned by Meehl (2002), “the intrinsic validity (empirical meaningfulness) of a diagnostic construct cannot be dismissed ipso facto on grounds of poor average clinician agreement” (p. 156).
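The "statistical necessity" point can be illustrated with a brief simulation. The sketch assumes two hypothetical raters who each observe the same latent severity plus independent rating error (error SD of 0.5) and a diagnostic cutoff of 1.0 on the rating scale; these values and the helper name `pearson` are assumptions. Agreement is indexed by the Pearson correlation, which for the dichotomized ratings is the phi coefficient rather than the kappa statistic typically reported in field trials.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

rng = random.Random(2)  # fixed seed so the sketch is reproducible
N_PATIENTS, ERROR_SD, CUTOFF = 5000, 0.5, 1.0  # assumed values

rater_a, rater_b = [], []
for _ in range(N_PATIENTS):
    severity = rng.gauss(0, 1)  # latent symptom severity
    rater_a.append(severity + rng.gauss(0, ERROR_SD))
    rater_b.append(severity + rng.gauss(0, ERROR_SD))

# Agreement when both raters report the dimensional score.
r_dimensional = pearson(rater_a, rater_b)

# Agreement when the same ratings are cut into "diagnosis" vs. "no diagnosis".
dx_a = [1.0 if score >= CUTOFF else 0.0 for score in rater_a]
dx_b = [1.0 if score >= CUTOFF else 0.0 for score in rater_b]
r_categorical = pearson(dx_a, dx_b)

print(f"dimensional agreement (r):   {r_dimensional:.2f}")
print(f"categorical agreement (phi): {r_categorical:.2f}")
```

The raters and the underlying construct are identical in both analyses; only the scoring differs. The lower agreement for the dichotomized ratings therefore reflects information discarded at the cutoff, not a less valid construct.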

Although symptom ratings tend to be more reliable when operationalized as dimensions rather than categories, it should be noted that their usefulness in clinical practice has yet to be validated. In the real world, dichotomous decisions often need to be made, such as to admit or not to admit, to intervene or not intervene, or the picking of a diagnostic code for billing (Kendler, 2018). Moreover, there is at least some evidence that clinicians prefer categories to dimensions (Mullins-Sweatt & Widiger, 2009; Sprock, 2003). Further, some have argued that mental illness can build over time until there is a tipping point (or a qualitative difference) in which impairment, symptom severity, or distress becomes too much to bear for an individual (e.g., Nelson et al., 2017). As noted by Kendler (2018), “while not all psychiatric disorders have such dramatic “avalanche-like” transitions, they are fairly common in clinical psychiatry and challenge the authors’ conclusions that there is little viable evidence that psychiatric disorders need to be understood from a categorical perspective” (p. 241).

Usefulness and Scientific Progress.

It is important to evaluate the two taxonomies from a philosophy of science perspective. According to Hempel (1965), a scientifically progressive classification system is characterized by features such as operational definitions, open concepts, descriptions, explanations, predictions, and testable assumptions. It engenders assertions about origins and outcomes by weaving a nomological net of relationships between the taxa and their correlates (Meehl & Golden, 1982). A useful taxonomy should “tell us a lot about the patient – the course, the likely etiologic process, the best treatment, etc.” (Kendler, 2018), and it should have generative power and provide us with new attributes, relations, or taxa, that is, ones other than those used to construct it (Millon, 1991).

As imperfect as it is, the DSM exhibits many of the features found in a useful taxonomy: a) it provides descriptive information and explanations about the disorders (e.g., discussion of course, severity, differential diagnosis, why specific disorders have been added or removed); b) it distinguishes among symptoms with some being necessary (e.g., criterion A) and some supplementary to the syndrome; c) it considers issues related to duration and persistence; d) it integrates impairment ratings to reduce over-pathologizing; e) it specifies inclusion and exclusion criteria; f) it allows for information retrieval (e.g., prevalence, comorbid conditions); g) it allows for prediction (e.g., one can go to the literature to determine which treatment will work for which specific disorders); h) it includes cultural considerations (Cultural Formulation and Cultural Concepts of Distress); and, i) it contains at least some information related to risk and developmental factors (e.g., major stressor required for PTSD; identifies disorders developing in adulthood versus childhood). In sum, the DSM provides hundreds of pages of information related to its categories.

HiTOP, on the other hand, exhibits few, if any, of the features found in a useful taxonomy. Its classification system is an interpretation of factor analytic results. It is a single picture. Absent our knowledge and previous experience with DSM descriptions and disorders, HiTOP contains no additional information. It contains no explanations, no descriptive information (other than symptom labels and lists), no necessary symptoms, no inclusion or exclusion criteria, no information about how to integrate impairment severity, no information about prevalence, no information on underlying developmental processes, and it ignores differences in culture, age and/or gender. Further, despite claims about eliminating comorbidity, it provides no information about how to interpret subscale comorbidity (i.e., when patients score high on multiple spectra, subfactors, and disorders).

It may be more accurate to think of HiTOP as a sorting algorithm (or multifaceted measurement tool) rather than a classification system. It does not feature information that lends itself to scientific discourse, disagreement, or progress. HiTOP is a statistical outcome from testing correlations among a large set of symptom items.

We acknowledge that HiTOP is much newer than DSM, and at some point, it may have a standardized measure with clinical cut-offs and interpretation guidelines and include descriptive information for the different symptom profiles (e.g., base rates, course of illness, etc.). If this happens, then the question is which of these two systems (HiTOP or DSM) is better positioned to evolve from a system based on observable characteristics to one based in theory (Hempel, 1965; Millon, 1991). We contend that the DSM has more potential for scientific progress than HiTOP. Ironically, the DSM’s most cited “weakness” may actually be its greatest strength with regard to potential for scientific change. The DSM is not bound by an analytic procedure but rather is fueled by scientific debate (Zachar & Kendler, 2007). If scientific progress and self-correction come from disagreement (Popper, 1959; Meehl, 1978; Lakatos, 1970), then look no further than a group of human scientists. The DSM can be altered to incorporate more specific explanations, descriptions, additional open concepts, and even theory. There is a path for DSM in which “the various classes or categories distinguished now are no longer defined just in terms of symptoms, but rather in terms of the key concept of theories, which are intended to explain the observable behavior including the symptoms in question” (Hempel, 1965). The DSM could be changed back to a theoretical system as quickly as it was changed from being one (DSM-I and DSM-II were theoretical; DSM-III changed to a descriptive system).

HiTOP does not have a clear path for scientific and taxonomic progress. The main mode of change for HiTOP is to add or subtract symptom information in its analysis. This may lead to small changes in its structure or factor labels, but it will not lead to the type of scientific evolution that characterizes progressive taxonomies (description to theory). HiTOP was created using a statistic within a theoretical vacuum; there are few, if any, specific predictions and hypotheses that can be falsified, which would result in corrective change over time. Further, HiTOP may even hinder progress, as it may be creating larger, more heterogeneous factors that do not reflect meaningful etiological differences. This can obscure discovery and lead to more non-replicable findings in the literature.

Research Domain Criteria Initiative (RDoC)

Launched in 2009, RDoC is the National Institute of Mental Health’s (NIMH) solution to the problems associated with descriptive taxonomies like DSM and HiTOP. Instead of focusing on symptom presentations, RDoC is concerned with etiology. Using an endophenotype approach (Gottesman & Gould, 2003), RDoC specifies a set of intermediate constructs (negative and positive valence, cognitive systems, social process systems, and arousal systems) thought to form the link between mental illness and some biological or genetic process (Cuthbert & Insel, 2013).

Usefulness and Scientific Progress.

RDoC is unusable in clinical settings. It cannot be used for diagnosis, case conceptualization, treatment choice, or billing options. However, this is to be expected because RDoC is not yet a taxonomy; it is a “framework for research on pathophysiology, especially for genomics and neuroscience” (p. 748; Insel & Cuthbert, 2010).

Ostensibly, RDoC has more potential for scientific progress than HiTOP and DSM. Its goal is to characterize psychopathology in terms of etiology instead of description. Further, it is not tied to a particular clinical outcome or a statistical procedure. Thus, researchers are free to explore new syndromes. That said, RDoC does not explicitly promote theory building or the generation of falsifiable mechanistic explanations; instead, the focus is on identifying specific genes and/or markers of neurological dysfunction associated with its list of endophenotypes.

Unfortunately, the scientific potential of RDoC is limited by biological reductionism (e.g., Lilienfeld, 2014). In the RDoC framework, mental illness is a “brain disorder.” The overriding purpose is to understand the biological and genetic basis of mental illness, not its psychological and environmental bases. This is a high-risk strategy, as it is possible that low level brain and genetic factors do not have a direct causal effect on higher level psychological phenotypes (Turkheimer, 2017). It also means that RDoC is wedded to neuroimaging tools such as MRI and fMRI, which are “not currently suitable for brain biomarker discovery or for individual-differences research” (p. 1; Elliott et al., 2020; Weinberger & Radulescu, 2020). This has culminated in a research literature characterized by underpowered studies and nonreplicable findings (Button et al., 2013; Lilienfeld, 2014; Parnas, 2014; Szucs & Ioannidis, 2020). Even Thomas Insel, who launched RDoC, now questions its potential for success: “I spent 13 years at NIMH really pushing on the neuroscience and genetics of mental disorders, and when I look back on that I realize that while I think I succeeded at getting lots of really cool papers published by cool scientists at fairly large costs - I think $20 billion - I don’t think we moved the needle in reducing suicide, reducing hospitalizations, improving recovery for the tens of millions of people who have mental illness. I hold myself accountable for that” (Rogers, 2017).

Taxometrics

Bootstrap taxometrics (Meehl & Golden, 1982) was Meehl’s response to the unfalsifiable and atheoretical nature of symptom based statistical clustering (Meehl, 1978; 1989). According to Meehl (1995), “we admire Linnaeus, the creator of modern taxonomy, for discerning the remarkable truth - a “deep structure” fact, as Chomsky might say - that the bat doesn’t sort with the chickadee and the whale doesn’t sort with the pickerel, but both are properly sorted with the grizzly bear…I see classification as an enterprise that aims to carve nature at its joints (Plato)” (p. 267). To this end, he created a mathematical method for testing the existence of latent taxa or “natural kinds.” It should come as no surprise that Meehl’s critique of cluster analysis (and psychological science more generally) has motivated much of this critical evaluation. The HiTOP approach to classification is history repeating itself.

Usefulness and Scientific Progress.

Meehlian taxometrics is not usable in clinical settings, but it is more scientifically progressive than HiTOP and DSM. It provides a method to corroborate or refute theories of mental illness. From a Popperian perspective, taxometrics has been hugely successful; nearly every proposed taxon has been refuted (falsified). This does not completely shut the door on the existence of mental illness taxa, but it raises serious doubts.

Recommendation

“Without theory-driven models to guide the interpretation of data, it is not likely that any empirical truth will emerge” (p. 1131).

– Follette & Houts (1996)

Scientifically progressive taxonomies tend to evolve over time from description to theory. Descriptive taxonomies, like DSM and HiTOP, can be useful, but they should be considered a stopgap. It is time for clinical psychology to put its resources and efforts into developing a theoretically derived system that can explain mental illness. A theory-based classification system would not be tied to a specific level of analysis, current diagnostic syndromes, or rely on finding an association between some genetic/biological measure and a clinical outcome. Rather, the focus would be on creating and testing mechanistic explanations of mental illness.

The research process used in clinical psychology is often atheoretical and backwards. Science usually starts with a theory to explain a particular outcome; then, experiments are conducted to test the predictions derived from that theory. But, in clinical psychology, researchers focus on the outcome (diagnosis or symptom dimension) rather than the explanation. There appears to be more interest in obtaining the “hard to get” clinical sample than there is in proposing theories (i.e., falsifiable mechanistic explanations) to explain the development of the clinical problems. The dominant research design in clinical psychology is to compare people with varying levels of psychopathology to determine if they differ on some measure (e.g., amygdala activation). And when between-group differences are inevitably found (Meehl, 1978), they are assumed to reflect an etiological process. This kind of post-hoc conjecturing is a problem because any difference found in the clinical group (relative to control) could be a concomitant or scar of experiencing psychopathology rather than a part of its etiology.

Pursuing a theory-based classification system may help to curb clinical psychology’s obsession with testing samples rather than theories. Further, it would push researchers to use more rigorous research methodologies such as behavioral high-risk designs and targeted prevention interventions (in which participants are selected on individual differences in a hypothesized risk factor rather than the clinical outcome). Examples of this kind of theory-based research include the hopelessness theory of depression (Abramson et al., 1989) and Newman’s (1998) attention-based theory of psychopathy. The hopelessness theory specifies a falsifiable etiological sequence that explains a clinical outcome: it specifies distal, proximal, contributory, and sufficient causes as well as both mechanisms (e.g., hopelessness) and moderators (e.g., stress, cognitive vulnerability) of the outcome of interest. It also proposes a theory-based clinical outcome that is not tied to our current descriptive system (hopelessness subtype of depression). Along these same lines, Newman’s (1998) attention theory of psychopathy is an exemplar of a progressive theory (Lakatos, 1970) that can both explain existing findings and generate novel predictions that cannot be explained by competing theories (such as the low fear hypothesis).

Obviously, a theory-based taxonomy remains a pipe dream. The field still needs to build stronger explanatory theories, rigorously test them (alone and in competition), and somehow integrate the findings into a taxonomy. The question is what to do in the meantime. We recommend a bifurcation strategy. Clinicians should continue to use the DSM while researchers focus on theory development and testing. We choose the DSM, not because we believe it to be particularly valid, but because it is currently the most useful taxonomy in clinical practice. As theory development progresses, the information can be integrated into DSM (similarly to how intervention research has influenced treatment guidelines), or it can be used to create an entirely new system. Research using RDoC and taxometrics can complement the theory-driven approach and be used in parallel. Although the RDoC is limited by biological reductionism, it can still serve as a basis for theory development. Similarly, taxometrics can be used to try to corroborate new theoretical subtypes. In contrast, there appears to be limited incremental value in pursuing HiTOP, which is another descriptive system. DSM already meets the need for a useful descriptive taxonomy that can be used in clinical practice. It is possible that HiTOP could also meet this need at some point, but it is ultimately handcuffed by its inability to evolve over time.

Conclusion

“Is there a named cognitive bias describing the preference for a concrete quantitative answer to a complex question, even if it is invalid?”

– Turkheimer (2020)

Factor analysis provides a straightforward, intuitive, and parsimonious solution to the problem of classification. Researchers are able to impose a hierarchical structure on mental illness with the push of a button. According to Waszczuk and colleagues (2019), the result of this button push “promises to resolve problems of comorbidity, heterogeneity, and arbitrary diagnostic thresholds” (p. 12). It is a “paradigm shift” (Kotov et al., 2018) that will transform mental health research (Conway et al., 2019), improve clinical practice (Ruggero et al., 2019), and advance genetic discovery (Latzman et al., 2020; Waszczuk et al., 2019).

The purpose of this article was to critically evaluate the HiTOP approach and its purported advantages. We conclude that the extraordinary claims about HiTOP are not matched by extraordinary evidence (Gillispie et al., 1999; Sagan, 1979); it appears the HiTOP consortium is writing checks their taxonomy can’t cash. Unless psychopathology plays by a different set of rules than nearly every other realm of nature, the result of pushing the factor analysis button is an incorrect answer. For HiTOP to be valid, all of the following would have to be true: 1) self-reported symptom expressions are meaningful indicators of developmental processes and the etiology of psychopathology; 2) all of the symptom indicators are equally important (deserve equal weighting) for classifying psychopathology; 3) equifinality and multifinality do not apply to psychopathology; 4) the expression and reporting of symptoms are not influenced by sex, culture, or age (and failing to account for them does not lead to algorithmic bias); and 5) a dimensional interpretation/simple structure approach represents the structure of psychopathology symptom data. To date, there is little evidence to support any of these statements. Moreover, HiTOP does not lend itself to theory building. It does not feature the characteristics of a falsifiable, scientifically progressive, and evolving taxonomy. It is bound to a statistical procedure in which change comes from adding or subtracting symptom information rather than through the falsification of specific hypotheses.

Over 40 years ago, Meehl (1978) argued that psychology was not progressing like the hard sciences because of shoddy theorizing and an overreliance on null hypothesis testing. The problems he noted are currently exemplified by the push for atheoretical, statistically driven structural taxonomies of psychopathology. He tried to remind us that creating specific and falsifiable theories (e.g., Popper, 1959) is necessary for scientific progress. Psychology’s statistically driven approach to classification seems to fail this critical requirement, as it is difficult to “be wrong” in the absence of any specific theoretical hypotheses while reporting the output of factor analyses. Because of this and the limitations discussed in this article, replacing the DSM with HiTOP has the potential to hinder progress on understanding the etiology of psychopathology. We recommend a bifurcation strategy in which the DSM continues to be used in clinical settings because of its usefulness while researchers focus on creating and testing falsifiable theories of mental illness that can eventually inform the DSM or lead to a new theory-based classification system.

Footnotes

1

There is an extensive literature questioning the logic and appropriateness of factor modeling for understanding complex phenotypes (e.g., interpretation of models; failure to test assumptions of quantitative structure). For additional discussion of these issues, see Aristodemou & Fried, 2020; Bornovalova et al., 2020; Heene, 2013; van Bork et al., 2017; Rhemtulla, van Bork, & Borsboom, 2019; Wittchen & Beesdo-Baum, 2018.

2

This comparison would also apply to the International Statistical Classification of Diseases and Related Health Problems (ICD); one minor difference is that the ICD focuses more on public health and applicability to a diverse global population.

Contributor Information

Gerald J. Haeffel, University of Notre Dame

Bertus F. Jeronimus, University of Groningen

Bonnie N. Kaiser, University of California-San Diego

Lesley Jo Weaver, University of Oregon

Peter D. Soyster, University of California-Berkeley

Aaron J. Fisher, University of California-Berkeley

Ivan Vargas, University of Arkansas

Jason T. Goodson, VA Salt Lake City Healthcare Systems

Wei Lu, University of Iowa Hospitals and Clinics

References

  1. Abramson LY, Alloy LB, Hogan ME, Whitehouse WG, Donovan P, Rose D, Panzarella C, & Raniere D (1999). Cognitive vulnerability to depression: Theory and evidence. Journal of Cognitive Psychotherapy: An International Quarterly, 13, 5–20. [Google Scholar]
  2. Abramson LY, Metalsky GI, & Alloy LB (1989). Hopelessness depression: A theory-based subtype of depression. Psychological Review, 96, 358–372. [Google Scholar]
  3. Achenbach TM (2020). Bottom-up and top-down paradigms for psychopathology: A half-century odyssey. Annual Review of Clinical Psychology, 16, 1–24. 10.1146/annurev-clinpsy-071119-115831 [DOI] [PubMed] [Google Scholar]
  4. Adigun R, Goyal A, Bansal P, et al. (2020). Systemic Sclerosis (CREST syndrome) [Updated 2020 Apr 28]. In: StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2020 Jan-. Available from: https://www.ncbi.nlm.nih.gov/books/NBK430875/ [Google Scholar]
  5. Alloy LB, Abramson LY, Walshaw P, & Neeren A (2006). Cognitive vulnerability to unipolar and bipolar mood disorders. Journal of Social and Clinical Psychology, 25(7), 726–754. [Google Scholar]
  6. Alloy LB, Kelly KA, Mineka S, & Clements CM (1990). Comorbidity of anxiety and depressive disorders: A helplessness-hopelessness perspective. In: Maser JD & Cloninger CR (Eds.), Comorbidity of mood and anxiety disorders (pp. 499–543). American Psychiatric Association. [Google Scholar]
  7. Aristodemou ME, & Fried EI (2020). Common Factors and Interpretation of the p Factor of Psychopathology. Journal of the American Academy of Child and Adolescent Psychiatry, 59(4), 465–466. 10.1016/j.jaac.2019.07.953 [DOI] [PubMed] [Google Scholar]
  8. Arnett J (2008). The neglected 95%: Why American psychology needs to become less American. American Psychologist, 63(7), 602–614. [DOI] [PubMed] [Google Scholar]
  9. Barocas S & Selbst A (2016). Big data’s disparate impact. California Law Review, 104. [Google Scholar]
  10. Beaver KM, Barnes JC, Schwartz JA, & Boutwell BB (2015). Enlisting in the Military: The Influential Role of Genetic Factors. SAGE Open, 5(2). 10.1177/2158244015573352 [DOI] [Google Scholar]
  11. Berg JJ, Harpak A, Sinnott-Armstrong N, Jørgensen AM, Mostafavi H, Field Y, Boyle EA, Zhang X, Racimo F, Pritchard JK, & Coop G (2018). Reduced signal for polygenic adaptation of height in UK Biobank. eLife 2019;8:e39725. doi: 10.1101/354951 [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Bierbach D, Laskowski KL, & Wolf M (2017). Behavioural individuality in clonal fish arises despite near-identical rearing conditions. Nature communications, 8, 15361. 10.1038/ncomms15361 [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Bolles RC (1962). The Difference between Statistical Hypotheses and Scientific Hypotheses. Psychological Reports, 11(3), 639–645. 10.2466/pr0.1962.11.3.639 [DOI] [Google Scholar]
  14. Bornovalova MA, Choate AM, Fatimah H, Petersen KJ, Wiernik BM (2020). Appropriate Use of Bifactor Analysis in Psychopathology Research: Appreciating Benefits and Limitations. Biological Psychiatry, 88, 18–27. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Boyle EA, Li YI, Pritchard JK. (2017). An Expanded View of Complex Traits: From Polygenic to Omnigenic. Cell, 169(7):1177–86. 10.1016/j.cell.2017.05.038 [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Brose A, Voelkle MC, Lövdén M, Lindenberger U, & Schmiedek F (2015). Differences in the between-person and within-person structures of affect are a matter of degree. European Journal of Personality, 29(1), 55–71. 10.1002/per.1961 [DOI] [Google Scholar]
  17. Buolamwini J & Gebru T (2018). Gender shades: Intersectional accuracy disparities in commercial gender classification. In Proceedings of the Conference on Fairness, Accountability and Transparency, pp. 77–91. [Google Scholar]
  18. Button K, Ioannidis J, Mokrysz C et al. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nat Rev Neurosci 14, 365–376. 10.1038/nrn3475 [DOI] [PubMed] [Google Scholar]
  19. Centers for Disease Control and Prevention (2019). Chronic Diseases in America. https://www.cdc.gov/chronicdisease/pdf/infographics/chronic-disease-H.pdf
  20. Cervone D (1999). Bottom-up explanation in personality psychology. The coherence of personality: Social-cognitive bases of consistency, variability, and organization, 303–341. [Google Scholar]
  21. Chuang ML, Massaro JM, Levitzky YS, Fox CS, Manders ES, Hoffmann U & O’Donnell CJ (2012). Prevalence and distribution of abdominal aortic calcium by gender and age group in a community-based cohort (from the Framingham Heart Study). The American journal of cardiology, 110, 891–896. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Cicchetti D, & Rogosch F (1996). Equifinality and multifinality in developmental psychopathology. Development and Psychopathology, 8(4), 597–600. doi: 10.1017/S0954579400007318 [DOI] [Google Scholar]
  23. Cojocaru M, Cojocaru IM, Silosi I, & Vrabie CD (2011). Manifestations of systemic lupus erythematosus. Maedica, 6(4), 330–336. [PMC free article] [PubMed] [Google Scholar]
  24. Conway CC, Forbes MK, Forbush KT, Fried EI, Hallquist MN, Kotov R, … Eaton NR (2019). A Hierarchical Taxonomy of Psychopathology Can Transform Mental Health Research. Perspectives on Psychological Science, 14(3), 419–436. 10.1177/1745691618810696 [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Cuthbert BN, Insel TR (2013). Toward the future of psychiatric diagnosis: the seven pillars of RDoC. BMC Med 11, 126. 10.1186/1741-7015-11-126 [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. DeYoung CG, Chmielewski M, Clark LA, Condon DM, Kotov R, Krueger RF, Lynam DR, Markon KE, Miller JD, Mullins-Sweatt SN, Samuel DB, Sellbom M, South SC, Thomas KM, Watson D, Watts AL, Widiger TA, Wright A, & HiTOP Normal Personality Workgroup (2020). The distinction between symptoms and traits in the Hierarchical Taxonomy of Psychopathology (HiTOP). Journal of personality, 10.1111/jopy.12593. [DOI] [PubMed] [Google Scholar]
  27. De Vries YA, Al-Hamzawi A, Alonso J, Borges G, Bruffaerts R, Bunting B, … & Esan O (2019). Childhood generalized specific phobia as an early marker of internalizing psychopathology across the lifespan: results from the World Mental Health Surveys. BMC medicine, 17(1), 101. 10.1186/s12916-019-1328-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Dawkins R & Wong Y (2016). The Ancestor’s Tale: A Pilgrimage to the Dawn of Evolution. Mariner. Hachette UK. [Google Scholar]
  29. Elliott ML, Knodt AR, Ireland D, Morris ML, Poulton R, Ramrakha S, Sison ML, Moffitt TE, Caspi A, & Hariri AR (2020). What Is the Test-Retest Reliability of Common Task-Functional MRI Measures? New Empirical Evidence and a Meta-Analysis. Psychological science, 31, 792–806. 10.1177/0956797620916786 [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Ernst A, Albers C, Jeronimus BF, & Timmerman M (2020). Inter-individual differences in multivariate time series: Latent class vector-autoregressive modelling. European Journal of Psychological Assessment, 36(3), 482–491. 10.1027/1015-5759/a000578 [DOI] [Google Scholar]
  31. Fall T, Kuja-Halkola R, Dobney K et al. (2019). Evidence of large genetic influences on dog ownership in the Swedish Twin Registry has implications for understanding domestication and health associations. Sci Rep 9, 7554. 10.1038/s41598-019-44083-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Feldman LA (1995). Valence-focus and arousal-focus: Individual differences in the structure of affective experience. Journal of Personality and Social Psychology, 69, 153–166. 10.1037/0022-3514.69.1.153 [DOI] [Google Scholar]
  33. Ferguson AG (2019). The rise of big data policing: Surveillance, race, and the future of law enforcement. NYU Press. [Google Scholar]
  34. Fisher AJ, Medaglia JD, & Jeronimus BF (2018). Lack of group-to-individual generalizability is a threat to human subjects research. Proceedings of the National Academy of Sciences, 115(27), E6106–E6115. 10.1073/pnas.1711978115 [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Fleeson W (2001). Towards a structure- and process-integrated view of personality: Traits as density distributions of states. Journal of Personality and Social Psychology, 80, 1011–1027. 10.1037/0022-3514.80.6.1011 [DOI] [PubMed] [Google Scholar]
  36. Flint J & Ideker T (2019) The great hairball gambit. PLoS Genet 15(11): e1008519. 10.1371/journal.pgen.1008519 [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Follette WC, & Houts AC (1996). Models of scientific progress and the role of theory in taxonomy development: A case study of the DSM. Journal of Consulting and Clinical Psychology, 64(6), 1120–1132. 10.1037/0022-006X.64.6.1120 [DOI] [PubMed] [Google Scholar]
  38. Fulford K, & Handa A (2018). Categorical and/or continuous? Learning from vascular surgery. World psychiatry : official journal of the World Psychiatric Association (WPA), 17(3), 304–305. 10.1002/wps.20565 [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Garb HN (2005). Clinical judgment and decision making. Annual Review of Clinical Psychology, 1:1, 67–89 [DOI] [PubMed] [Google Scholar]
  40. Gelfand MJ, Raver JL, Holcombe Ehrhart K (2002). Methodological issues in cross-cultural organizational research. In: Rogelberg (Ed.), Blackwell handbooks of research methods in psychology. Handbook of research methods in industrial and organizational psychology (p. 216–246). Blackwell Publishing. [Google Scholar]
  41. Gillispie CC, Grattan-Guinness I, Fox R (1999). Pierre Simon Laplace, A Life in Exact Science. Princeton, NJ: Princeton University Press. [Google Scholar]
  42. Goldberg RJ, O’Donnell C, Yarzebski J, Bigelow C, Savageau J, & Gore JM (1998). Sex differences in symptom presentation associated with acute myocardial infarction: a population-based perspective. American heart journal, 136(2), 189–195. [DOI] [PubMed] [Google Scholar]
  43. Gone JP, & Kirmayer LJ (2010). On the wisdom of considering culture and context in psychopathology. In: Millon T, Krueger RF, & Simonsen E (Eds.), Contemporary directions in psychopathology: Scientific foundations of the DSM-V and ICD-11 (p. 72–96). The Guilford Press. [Google Scholar]
  44. Gottesman II, & Gould TD (2003). The endophenotype concept in psychiatry: Etymology and strategic intentions. American Journal of Psychiatry, 160, 636–645. [DOI] [PubMed] [Google Scholar]
  45. Grove WM, Zald DH, Lebow BS, Snitz BE, & Nelson C (2000). Clinical versus mechanical prediction: A meta-analysis. Psychological Assessment, 12(1), 19–30. 10.1037/1040-3590.12.1.19 [DOI] [PubMed] [Google Scholar]
  46. Guttman L (1982). Facet theory, smallest space analysis, and factor analysis. Perceptual and Motor Skills, 54, 487–493. [DOI] [PubMed] [Google Scholar]
  47. Guttman L (1992) The Irrelevance of Factor Analysis for the Study of Group Differences. Multivariate Behavioral Research, 27:2, 175–204 [DOI] [PubMed] [Google Scholar]
  48. Haeffel GJ, Gibb BE, Abramson LY, Alloy LB, Metalsky GI, Joiner T, Hankin BL, and Swendsen J (2008). Measuring cognitive vulnerability to depression: Development and Validation of the Cognitive Style Questionnaire. Clinical Psychology Review, 28, 824–836. 10.1016/j.cpr.2007.12.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Haroz EE, Ritchey M, Bass JK, Kohrt BA, Augustinavicius J, Michalopoulos L, … & Bolton P (2017). How is depression experienced around the world? A systematic review of qualitative literature. Social Science & Medicine, 183, 151–162. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Haslam N, Holland E, & Kuppens P (2012). Categories versus dimensions in personality and psychopathology: a quantitative review of taxometric research. Psychological medicine, 42(5), 903. 10.1017/S0033291711001966 [DOI] [PubMed] [Google Scholar]
  51. Hasselbalch AL, Silventoinen K, Keskitalo K, et al. (2010). Twin study of heritability of eating bread in Danish and Finnish men and women. Twin Res Hum Genet, 13(2), 163–167. doi: 10.1375/twin.13.2.163 [DOI] [PubMed] [Google Scholar]
  52. Heene M (2013). Additive conjoint measurement and the resistance toward falsifiability in psychology. Frontiers in psychology, 4, 246. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Hempel CG (1965). Aspects of scientific explanation and other essays in the philosophy of science. New York: Free Press. [Google Scholar]
  54. Henrich J, Heine SJ, & Norenzayan A (2010). The weirdest people in the world? Behavioral and brain sciences, 33(2–3), 61–83. [DOI] [PubMed] [Google Scholar]
  55. Hill K (2020, June 24). Wrongfully accused by an algorithm. New York Times. [Google Scholar]
  56. Hopwood CJ, Bagby RM, Gralnick T, Ro E, Ruggero C, Mullins-Sweatt S, Kotov R, Bach B, Cicero DC, Krueger RF, Patrick CJ, Chmielewski M, DeYoung CG, Docherty AR, Eaton NR, Forbush KT, Ivanova MY, Latzman RD, Pincus AL, … Zimmermann J (2019). Integrating psychotherapy with the hierarchical taxonomy of psychopathology (HiTOP). Journal of Psychotherapy Integration, 30(4), 477–494. 10.1037/int0000156 [DOI] [Google Scholar]
  57. Huber M, Knottnerus JA, Green L, Horst H. v. d., Jadad AR, Kromhout D, … Smid H (2011). How should we define health? BMJ, 343, d4163. doi: 10.1136/bmj.d4163 [DOI] [PubMed] [Google Scholar]
  58. Hyytinen A, Ilmakunnas P, Johansson E et al. (2019). Heritability of lifetime earnings. J Econ Inequal 17, 319–335. 10.1007/s10888-019-09413-x [DOI] [Google Scholar]
  59. Insel T, Cuthbert B, Garvey M, Heinssen R, Pine DS, Quinn K, Sanislow C, & Wang P (2010). Research domain criteria (RDoC): toward a new classification framework for research on mental disorders. The American journal of psychiatry, 167(7), 748–751. 10.1176/appi.ajp.2010.09091379 [DOI] [PubMed] [Google Scholar]
  60. Jha MK, Minhajuddin A, Chin-Fatt C, Greer TL, Carmody TJ, Trivedi MH (2019). Sex differences in the association of baseline C-reactive protein (CRP) and acute-phase treatment outcomes in major depressive disorder: Findings from the EMBARC study. Journal of Psychiatric Research, 113, 165–171. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Johnson W, Penke L, & Spinath FM (2011). Heritability in the era of molecular genetics: Some thoughts for understanding genetic influences on behavioural traits. European Personality Reviews, 25, 254–266. [Google Scholar]
  62. Jolly E and Chang LJ (2019). The Flatland Fallacy: Moving Beyond Low–Dimensional Thinking. Top Cogn Sci, 11(2), 433–454. doi: 10.1111/tops.12404 [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Kaiser BN, Haroz EE, Kohrt BA, Bolton PA, Bass JK, & Hinton DE (2015). “Thinking too much”: A systematic review of a common idiom of distress. Social Science & Medicine, 147, 170–183. 10.1016/j.socscimed.2015.10.044 [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Kaiser BN, & Weaver L (2019). Culture-bound syndromes, idioms of distress, and cultural concepts of distress: New directions for an old concept in psychological anthropology. Transcultural Psychiatry, 56(4), 589–598. 10.1177/1363461519862708 [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Kalluri P (2020). Don’t ask if AI is good or fair, ask how it shifts power. Nature, 583, 169. [DOI] [PubMed] [Google Scholar]
  66. Kendler KS, Zachar P, & Craver C (2011). What kinds of things are psychiatric disorders? Psychological Medicine, 41(6), 1143–1150. [DOI] [PubMed] [Google Scholar]
  67. Kendler KS (2014). The structure of psychiatric science. The American Journal of Psychiatry, 171, 931–938. [DOI] [PubMed] [Google Scholar]
  68. Kendler KS (2018). Classification of psychopathology: conceptual and historical background. World psychiatry: official journal of the World Psychiatric Association (WPA), 17(3), 241–242. 10.1002/wps.20549 [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Keogh E (2015). Men, masculinity, and pain. Pain, 156, 2408–2412. [DOI] [PubMed] [Google Scholar]
  70. Keyes CLM (2007). Promoting and protecting mental health as flourishing: A complementary strategy for improving national mental health. American Psychologist, 62(2), 95–108. 10.1037/0003-066X.62.2.95 [DOI] [PubMed] [Google Scholar]
  71. Khera AV et al. (2016). Genetic Risk, Adherence to a Healthy Lifestyle, and Coronary Disease. N. Engl. J. Med, 375, 2349–2358. [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Kleinberg J, Ludwig J, Mullainathan S, & Sunstein CR (2019). Discrimination in the Age of Algorithms. Journal of Legal Analysis, 10, 113–174. 10.1093/jla/laz001 [DOI] [Google Scholar]
  73. Kohrt BA, & Mendenhall E (2016). Global Mental Health: Anthropological Perspectives. Routledge. [Google Scholar]
  74. Kohrt BA, Rasmussen A, Kaiser BN, Haroz EE, Maharjan SM, Mutamba BB, … & Hinton DE (2014). Cultural concepts of distress and psychiatric disorders: literature review and research recommendations for global mental health epidemiology. International Journal of Epidemiology, 43(2), 365–406. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Korgaonkar MS, Fornito A, Williams LM, & Grieve SM (2014a). Abnormal structural networks characterize major depressive disorder: a connectome analysis. Biological psychiatry, 76(7), 567–574. 10.1016/j.biopsych.2014.02.018 [DOI] [PubMed] [Google Scholar]
  76. Korgaonkar M, Williams L, Song Y, Usherwood T, & Grieve S (2014b). Diffusion tensor imaging predictors of treatment outcomes in major depressive disorder. British Journal of Psychiatry, 205(4), 321–328. doi: 10.1192/bjp.bp.113.140376 [DOI] [PubMed] [Google Scholar]
  77. Kotov R, Jonas KG, Carpenter WT, Dretsch MN, Eaton NR, Forbes MK, Forbush MA, Widiger TA, Wright A, Zald DH, Krueger RF, & Watson D (2020). Validity and utility of Hierarchical Taxonomy of Psychopathology (HiTOP): I. Psychosis superspectrum. World Psychiatry, 19, 151–172. doi: 10.1002/wps.20730 [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Kotov R, Krueger RF, & Watson D (2018). A paradigm shift in psychiatric classification: the Hierarchical Taxonomy of Psychopathology (HiTOP). World psychiatry: official journal of the World Psychiatric Association (WPA), 17(1), 24–25. 10.1002/wps.20478 [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Kotov R, Krueger RF, Watson D, Achenbach TM, Althoff RR, Bagby RM, Brown TA, Carpenter WT, Caspi A, Clark LA, Eaton NR, Forbes MK, Forbush KT, Goldberg D, Hasin D, Hyman SE, Ivanova MY, Lynam DR, Markon K, … Zimmerman M (2017). The Hierarchical Taxonomy of Psychopathology (HiTOP): A dimensional alternative to traditional nosologies. Journal of Abnormal Psychology, 126(4), 454–477. 10.1037/abn0000258 [DOI] [PubMed] [Google Scholar]
  80. Kramer P, & Bressan P (2015). Humans as superorganisms: How microbes, viruses, imprinted genes, and other selfish entities shape our behavior. Perspectives on Psychological Science, 10(4), 464–481. 10.1177/1745691615583131 [DOI] [PubMed] [Google Scholar]
  81. Krueger RF, Kotov R, Watson D, Forbes MK, Eaton NR, Ruggero CJ, Simms LJ, Widiger TA, Achenbach TM, Bach B, Bagby RM, Bornovalova MA, Carpenter WT, Chmielewski M, Cicero DC, Clark LA, Conway C, DeClercq B, DeYoung CG, Docherty AR, Drislane LE, First MB, Forbush KT, Hallquist M, Haltigan JD, Hopwood CJ, Ivanova MY, Jonas KG, Latzman RD, Markon KE, Miller JD, Morey LC, Mullins-Sweatt SN, Ormel J, Patalay P, Patrick CJ, Pincus AL, Regier DA, Reininghaus U, Rescorla LA, Samuel DB, Sellbom M, Shackman AJ, Skodol A, Slade T, South SC, Sunderland M, Tackett JL, Venables NC, Waldman ID, Waszczuk MA, Waugh MH, Wright AG, Zald DH and Zimmermann J (2018). Progress in achieving quantitative classification of psychopathology. World Psychiatry, 17, 282–293. doi: 10.1002/wps.20566 [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Lakatos I (1970). Falsification and the methodology of scientific research programs. In: Lakatos I & Musgrave A (Eds.). Criticism and the growth of knowledge (pp. 91–196). Cambridge, England: Cambridge University Press. [Google Scholar]
  83. Latzman RD, DeYoung CG & HiTOP Neurobiological Foundations Workgroup (2020). Using empirically-derived dimensional phenotypes to accelerate clinical neuroscience: The Hierarchical Taxonomy of Psychopathology (HiTOP) framework. Neuropsychopharmacology, 45, 1083–1085 [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Lee D (2013), “Google searches expose racial bias, says study of names”, BBC, available at: https://www.bbc.com/news/technology-21322183 (accessed 5 March 2018).
  85. Lewis-Fernandez R & Kirmayer LJ (2019). Cultural concepts of distress and psychiatric disorders: Understanding symptom experience and expression in context. Transcultural Psychiatry, 56, 786–803. [DOI] [PubMed] [Google Scholar]
  86. Lilienfeld SO (2007). Psychological Treatments That Cause Harm. Perspectives on Psychological Science, 2(1), 53–70. 10.1111/j.1745-6916.2007.00029.x [DOI] [PubMed] [Google Scholar]
  87. Lilienfeld SO (2014). The Research Domain Criteria (RDoC): an analysis of methodological and conceptual challenges. Behaviour research and therapy, 62, 129–139. 10.1016/j.brat.2014.07.019 [DOI] [PubMed] [Google Scholar]
  88. Lima EN, Stanley S, Kaboski B, Reitzel LR, Richey A, Castro Y, Williams FM, Tannenbaum KR, Stellrecht NE, Jakobsons LJ, Wingate LR, & Joiner TE Jr. (2005). The incremental validity of the MMPI-2: When does therapist access not enhance treatment outcome? Psychological Assessment, 17(4), 462–468. 10.1037/1040-3590.17.4.462 [DOI] [PubMed] [Google Scholar]
  89. Littlefield AK, Lane SP, Gette JA, Watts AL, & Sher KJ (2020). The “Big Everything”: Integrating and investigating dimensional models of psychopathology, personality, personality pathology, and cognitive functioning. Personality Disorders: Theory, Research, and Treatment. Advance online publication. 10.1037/per0000457 [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Maraun MD (1997). Appearance and reality: Is the big five the structure of trait descriptors? Personality and Individual differences, 22, 629–647. [Google Scholar]
  91. Markey PM & Markey CN (2006). A spherical conceptualization of personality traits. European Journal of Personality, 20, 169–193. 10.1002/per.582 [Google Scholar]
  92. Markon KE, Chmielewski M, & Miller CJ (2011). The reliability and validity of discrete and continuous measures of psychopathology: A quantitative review. Psychological Bulletin, 137(5), 856–879. 10.1037/a0023678 [DOI] [PubMed] [Google Scholar]
  93. McConaghy JR (2020). Outpatient evaluation of the adult with chest pain. In: UpToDate, Post TW (Ed), UpToDate, Waltham, MA. [Google Scholar]
  94. McCrae RR, Jang KL, Livesley WJ, Riemann R & Angleitner A (2001), Sources of Structure: Genetic, Environmental, and Artifactual Influences on the Covariation of Personality Traits. Journal of Personality, 69: 511–535. 10.1111/1467-6494.694154 [DOI] [PubMed] [Google Scholar]
  95. Matthews LJ & Turkheimer E (2019). Across the great divide: pluralism and the hunt for missing heritability. Synthese. 10.1007/s11229-019-02205-w [DOI] [Google Scholar]
  96. Meehl PE (1954). Clinical versus statistical prediction: A theoretical analysis and a review of the evidence. University of Minnesota Press. 10.1037/11281-000 [DOI] [Google Scholar]
  97. Meehl PE (1978). Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology. Journal of Consulting and Clinical Psychology, 46, 806–834. [Google Scholar]
  98. Meehl PE (1989). Paul E. Meehl. In Lindzey G (Ed.), A history of psychology in autobiography (Vol. 8, pp. 337–389). Stanford, CA: Stanford University Press. [Google Scholar]
  99. Meehl PE (1989). Schizotaxia Revisited. Archives of General Psychiatry; 46, 935–944. doi: 10.1001/archpsyc.1989.01810100077015 [DOI] [PubMed] [Google Scholar]
  100. Meehl PE (1992). Factors and taxa, traits and types, differences of degree and differences in kind. Journal of Personality, 60(1), 117–174. 10.1111/j.1467-6494.1992.tb00269.x [DOI] [Google Scholar]
  101. Meehl PE (1995). Bootstraps taxometrics. Solving the classification problem in psychopathology. The American psychologist, 50(4), 266–275. 10.1037//0003-066x.50.4.266 [DOI] [PubMed] [Google Scholar]
  102. Meehl PE (1999). Clarifications about taxometric method. Applied & Preventive Psychology, 8, 165–174. [Google Scholar]
  103. Meehl PE & Golden RR (1982). Taxometric methods. In: Kendall & Butcher (Eds.). Handbook of research methods in clinical psychology (pp. 127–181). New York: Wiley. [Google Scholar]
  104. Millon T (1991). Classification in psychopathology: Rationale, alternatives, and standards. Journal of Abnormal Psychology, 100(3), 245. 10.1037/0021-843X.100.3.245 [DOI] [PubMed] [Google Scholar]
  105. Moffitt TE (1993). Adolescence-limited and life-course-persistent antisocial behavior: A developmental taxonomy. Psychological Review, 100(4), 674–701. 10.1037/0033-295X.100.4.674 [DOI] [PubMed] [Google Scholar]
  106. Morse J (2017), “App creator apologizes for ‘racist’ filter that lightens skin tones”, Mashable, available at: https://mashable.com/2017/04/24/faceapp-racism-selfie/#zeUItoQB5iqI (accessed 5 March 2018).
  107. Mullins-Sweatt SN, & Widiger TA (2009). Clinical utility and DSM-V. Psychological Assessment, 21, 302–312. 10.1037/a0016607 [DOI] [PubMed] [Google Scholar]
  108. Muroff J, Edelsohn GA, Joe S, & Ford BC (2008). The role of race in diagnostic and disposition decision making in a pediatric psychiatric emergency service. General Hospital Psychiatry, 30(3), 269–276. 10.1016/j.genhosppsych.2008.01.003 [DOI] [PMC free article] [PubMed] [Google Scholar]
  109. Neighbors HW, Jackson JS, Campbell L, et al. (1989). The influence of racial factors on psychiatric diagnosis: A review and suggestions for research. Community Ment Health, 25, 301–311. 10.1007/BF00755677 [DOI] [PubMed] [Google Scholar]
  110. Nelson B, McGorry PD, Wichers M, Wigman JT, & Hartmann JA (2017). Moving from static to dynamic models of the onset of mental disorder: a review. JAMA Psychiatry, 74(5), 528–534. 10.1001/jamapsychiatry.2017.0001 [DOI] [PubMed] [Google Scholar]
  111. Newman JP (1998). Psychopathic behavior: An information processing perspective. In Cooke DJ, Hare RD, & Forth A (Eds.), Psychopathy: Theory, Research and Implications for Society (pp. 81–104). Dordrecht, The Netherlands: Kluwer Academic Publishers. 10.1007/978-94-011-3965-6_5 [DOI] [Google Scholar]
  112. Nickels MK, & Nelson CE (2005). Beware of nuts & bolts: Putting evolution into the teaching of biological classification. The American Biology Teacher, 67(5), 283–289. 10.1662/0002-7685(2005)067[0283:BONBPE]2.0.CO;2 [DOI] [Google Scholar]
  113. Obermeyer Z, Powers B, Vogeli C, Mullainathan S (2019). Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366, 447–453. [DOI] [PubMed] [Google Scholar]
  114. Osório FL, Loureiro SR, Hallak JEC, Machado-de-Sousa JP, Ushirohira JM, Baes CVW, Apolinario TD, Donadon MF, Bolsoni LM, Guimarães T, Fracon VS, Silva-Rodrigues APC, Pizeta FA, Souza RM, Sanches RF, dos Santos RG, Martin-Santos R and Crippa JAS (2019). Clinical validity and intrarater and test–retest reliability of the Structured Clinical Interview for DSM-5 – Clinician Version (SCID-5-CV). Psychiatry Clin. Neurosci, 73: 754–760. doi: 10.1111/pcn.12931 [DOI] [PubMed] [Google Scholar]
  115. Parnas J (2014). The RDoC program: psychiatry without psyche? World Psychiatry: Official journal of the World Psychiatric Association (WPA), 13(1), 46–47. 10.1002/wps.20101
  116. Pesenti-Gritti P, Spatola CAM, Fagnani C, et al. (2008). The co-occurrence between internalizing and externalizing behaviors. Eur Child Adolesc Psychiatry, 17, 82–92. 10.1007/s00787-007-0639-7
  117. Plomin R & Daniels D (1987). Why are children in the same family so different from each other? Behavioral and Brain Sciences, 10(1), 1–16.
  118. Popper KR (1959). The Logic of Scientific Discovery. London: Hutchinson of London.
  119. Petto AJ & Mead LS (2009). Homology: Why we know a whale is not a fish. Evolution: Education and Outreach, 2, 617–621.
  120. Rhemtulla M, van Bork R, & Borsboom D (2020). Worse than measurement error: Consequences of inappropriate latent variable measurement models. Psychological Methods, 25(1), 30–45. 10.1037/met0000220
  121. Rogers A (2017). Star Neuroscientist Tom Insel Leaves the Google-Spawned Verily… For a Startup? Wired. May 11, 2017. https://www.wired.com/2017/05/star-neuroscientist-tom-insel-leaves-google-spawned-verily-startup/
  122. Ruggero CJ, Kotov R, Hopwood CJ, First M, Clark LA, Skodol AE, Mullins-Sweatt SN, Patrick CJ, Bach B, Cicero DC, Docherty A, Simms LJ, Bagby RM, Krueger RF, Callahan JL, Chmielewski M, Conway CC, De Clercq B, Dornbach-Bender A, … Zimmermann J (2019). Integrating the Hierarchical Taxonomy of Psychopathology (HiTOP) into clinical practice. Journal of Consulting and Clinical Psychology, 87(12), 1069–1084. 10.1037/ccp0000452
  123. Sagan C (1979). Broca’s Brain, Reflections on the Romance of Science. New York: Random House.
  124. Shulman EP, Steinberg LD, & Piquero AR (2013). The age–crime curve in adolescence and early adulthood is not due to age differences in economic status. Journal of Youth and Adolescence, 42(6), 848–860. 10.1007/s10964-013-9950-4
  125. Schwartzstein RM (2020). Approach to the patient with dyspnea. In: UpToDate, Post TW (Ed), UpToDate, Waltham, MA.
  126. Smith GT, McCarthy DM, & Zapolski TC (2009). On the value of homogeneous constructs for construct validation, theory testing, and the description of psychopathology. Psychological Assessment, 21, 272.
  127. Sohail M, Maier RM, Ganna A, Bloemendal A, Martin AR, Turchin MC, Chiang CWK, Hirschhorn JN, Daly MJ, Patterson N, Neale BM, Mathieson I, Reich D, & Sunyaev SR (2019). Signals of polygenic adaptation on height have been overestimated due to uncorrected population structure in genome-wide association studies. eLife, 8, e39702. 10.7554/eLife.39702.001
  128. Sprock J (2003). Dimensional versus categorical classification of prototypic and nonprototypic cases of personality disorder. Journal of Clinical Psychology, 59, 991–1014.
  129. Strauss ME, & Smith GT (2009). Construct validity: Advances in theory and methodology. Annual Review of Clinical Psychology, 5, 1–25.
  130. Szucs D & Ioannidis JPA (2020). Sample size evolution in neuroimaging research: An evaluation of highly-cited studies (1990–2012) and of latest practices (2017–2018) in high-impact journals. NeuroImage, 221, 117164. 10.1016/j.neuroimage.2020.117164
  131. Thurstone LL (1947). Multiple factor analysis. Chicago: University of Chicago Press.
  132. Trumbetta SL, Markowitz EM, & Gottesman II (2007). Marriage and genetic variation across the lifespan: Not a steady relationship. Behavior Genetics, 37(2), 362–375.
  133. Tung ES, & Brown TA (2020). Distinct Risk Profiles in Social Anxiety Disorder. Clinical Psychological Science, 8(3), 477–490. 10.1177/2167702620901536
  134. Turkheimer E (2012). Genome Wide Association Studies of Behavior are Social Science. In: Plaisance K, Reydon T (eds) Philosophy of Behavioral Biology. Boston Studies in the Philosophy of Science, vol 282. Springer, Dordrecht. 10.1007/978-94-007-1951-4_3
  135. Turkheimer E (2016). Weak Genetic Explanation 20 Years Later: Reply to Plomin et al. Perspectives on Psychological Science, 11(1), 24–28.
  136. Turkheimer E (2017). The hard question in psychiatric nosology. In: Kendler KS & Parnas J (eds) Philosophical Issues in Psychiatry IV: Classification of Psychiatric Illness. Oxford University Press. 10.1093/med/9780198796022.001.0001
  137. Turkheimer E [@ent3c]. (2020, Jan. 9). Different topic: Is there a named cognitive bias describing the preference for a concrete quantitative answer to a complex question, even if it is invalid? Like counting papers at tenure evaluation [Tweet]. Retrieved from: https://twitter.com/ent3c/status/1215291870168977410
  138. Turkheimer E, Ford DC, & Oltmanns TF (2008). Regional analysis of self-reported personality disorder criteria. Journal of Personality, 76, 1587–1622.
  139. Turkheimer E, Pettersson E, & Horn EE (2014). A Phenotypic Null Hypothesis for the Genetics of Personality. Annual Review of Psychology, 65, 515–540.
  140. Turkheimer E, & Waldron MC (2000). Nonshared environment: A theoretical, methodological, and quantitative review. Psychological Bulletin, 126, 78–108.
  141. van Bork R, Epskamp S, Rhemtulla M, Borsboom D, & van der Maas HLJ (2017). What is the p-factor of psychopathology? Some risks of general factor modeling. Theory & Psychology, 27(6), 759–773. 10.1177/0959354317737185
  142. van der Krieke L, Jeronimus BF, Blaauw FJ, Wanders RB, Emerencia AC, Schenk HM, … & Bos EH (2016). HowNutsAreTheDutch (HoeGekIsNL): A crowdsourcing study of mental symptoms and strengths. International Journal of Methods in Psychiatric Research, 25(2), 123–144. 10.1002/mpr.1495
  143. Varga J (2020). Clinical manifestations and diagnosis of systemic sclerosis (scleroderma) in adults. In: UpToDate, Post TW (Ed), UpToDate, Waltham, MA.
  144. Visscher PM, Yang JA, & Goddard ME (2010). A commentary on ‘Common SNPs explain a large proportion of the heritability for human height’ by Yang et al. (2010). Twin Research and Human Genetics, 13, 517–524.
  145. Wallace DJ & Gladman DD (2020). Clinical manifestations and diagnosis of systemic lupus erythematosus in adults. In: UpToDate, Post TW (Ed), UpToDate, Waltham, MA.
  146. Waszczuk MA, Eaton NR, Krueger RF, Shackman AJ, Waldman ID, Zald DH, Lahey BB, Patrick CJ, Conway CC, Ormel J, Hyman SE, Fried EI, Forbes MK, Docherty AR, Althoff RR, Bach B, Chmielewski M, DeYoung CG, Forbush KT, … Kotov R (2020). Redefining phenotypes to advance psychiatric genetics: Implications from hierarchical taxonomy of psychopathology. Journal of Abnormal Psychology, 129(2), 143–161. 10.1037/abn0000486
  147. Watson D (2005). Rethinking the mood and anxiety disorders: a quantitative hierarchical model for DSM-V. Journal of Abnormal Psychology, 114(4), 522.
  148. Watts AL, Lane SP, Bonifay W, Steinley D, & Meyer FAC (2020). Building theories on top of, and not independent of, statistical models: The case of the p-factor. Psychological Inquiry, 31(4), 310–320. 10.1080/1047840X.2020.1853476
  149. Weinberger DR & Radulescu E (2020). Structural Magnetic Resonance Imaging All Over Again. JAMA Psychiatry, 78(1), 11–12. 10.1001/jamapsychiatry.2020.1941
  150. Weaver LJ, & Kaiser BN (2015). Developing and testing locally derived mental health scales: examples from North India and Haiti. Field Methods, 27(2), 115–130.
  151. Webb CA, Trivedi MH, Cohen ZD, Dillon DG, Fournier JC, Goer F, Fava M, McGrath PJ, Weissman M, Parsey R, Adams P, Trombello JM, Cooper C, Deldin P, Oquendo MA, McInnis MG, Huys Q, Bruder G, Kurian BT, Jha M, … Pizzagalli DA (2019). Personalized prediction of antidepressant v. placebo response: evidence from the EMBARC study. Psychological Medicine, 49(7), 1118–1127. 10.1017/S0033291718001708
  152. Wehby GL & Shane D (2019). Genetic variation in health insurance coverage. Int J Health Econ Manag, 19, 301–316. 10.1007/s10754-018-9255-y
  153. Wenger NK (1990). Gender, coronary artery disease, and coronary bypass surgery. Annals of Internal Medicine, 112(8), 557–558.
  154. Wicherts JM, & Johnson W (2009). Group differences in the heritability of items and test scores. Proceedings of the Royal Society B: Biological Sciences, 276(1667), 2675–2683. 10.1098/rspb.2009.0238
  155. Widiger TA, Bach B, Chmielewski M, Clark LA, DeYoung C, Hopwood CJ, Kotov R, Krueger RF, Miller JD, Morey LC, Mullins-Sweatt SN, Patrick CJ, Pincus AL, Samuel DB, Sellbom M, South SC, Tackett JL, Watson D, Waugh MH, Wright A, … Thomas KM (2019). Criterion A of the AMPD in HiTOP. Journal of Personality Assessment, 101(4), 345–355. 10.1080/00223891.2018.1465431
  156. Wittchen HU, Beesdo-Baum K, Gloster AT, Höfler M, Klotsche J, Lieb R, Beauducel A, Bühner M, & Kessler RC (2009). The structure of mental disorders re-examined: is it developmentally stable and robust against additions? International Journal of Methods in Psychiatric Research, 18, 189–203. 10.1002/mpr.298
  157. Wittchen HU, & Beesdo-Baum K (2018). “Throwing out the baby with the bathwater”? Conceptual and methodological limitations of the HiTOP approach. World Psychiatry: Official journal of the World Psychiatric Association (WPA), 17(3), 298–299. 10.1002/wps.20561
  158. Wolff JL, Starfield B, & Anderson G (2002). Prevalence, expenditures, and complications of multiple chronic conditions in the elderly. Archives of Internal Medicine, 162(20), 2269–2276. 10.1001/archinte.162.20.2269
  159. Yengo L, Sidorenko J, Kemper KE, et al. (2018). Meta-analysis of genome-wide association studies for height and body mass index in 700,000 individuals of European ancestry. Hum Mol Genet, 27(20), 3641–3649. 10.1093/hmg/ddy271
  160. Zachar P & Kendler KS (2007). Psychiatric disorders: A conceptual analysis. American Journal of Psychiatry, 164, 557–565.
  161. Zapko-Willmes A, & Kandler C (2018). Genetic Variance in Homophobia: Evidence from Self- and Peer Reports. Behavior Genetics, 48(1), 34–43. 10.1007/s10519-017-9884-9
