A 55-year-old woman presents for treatment with clear symptoms of a major depressive episode. This is her third such episode. She reports that sertraline worked well for her the first time. The second time, however, it made her dizzy and she had to discontinue it; she eventually responded well to bupropion. She expresses a preference for medication over psychotherapy but is not sure which one to try. What do you do? Do you try bupropion again? It worked last time, but in exploring her current symptoms it becomes clear that she has prominent somatic and psychological anxiety that she did not experience last time. Overall, her condition is less acute, and you recall a meta-analysis suggesting that antidepressants are less effective in milder cases of depression. She experienced trauma during adolescence, has been unable to fall asleep in the evenings, and has extreme feelings of worthlessness. How do these factors relate to her diagnosis, prognosis, or likely treatment response?
Modern medicine is increasingly focused on evidence-based practice: the systematic study of which treatments work best for a given problem. The gold standard tool in this regard is the large, double-blind, placebo-controlled trial. Results from such trials may be further combined in systematic reviews or in meta-analyses that pool data across studies. This process is ultimately designed to leverage a large body of data to make broad generalizations about a population (e.g., for individuals with major depressive disorder, selective serotonin reuptake inhibitors are an effective first-line treatment).
But each patient is unique—and for psychiatry in particular, their uniqueness may be relevant to care. For example, for the patient above, how do you factor in the history of early trauma? Or how would you factor in a family history of bipolar disorder? Or even her religious and cultural background?
While clinical trials are designed to minimize the impact of individual patient nuance, we know that these nuances can be crucial. For example, among youth at clinical high risk for psychosis, Cannon et al. (1) found that those with higher levels of unusual thought content converted to psychosis sooner than those with lower levels. In depression, patients who were abused as children have been shown to respond better to psychotherapy than to the antidepressant nefazodone (2). The scientific literature is brimming with examples like these: variables that have small but meaningful associations with clinical outcomes.
Sometimes these minor differences can add up to a much larger story. In 2006, Perlis et al. (3) studied whether it was possible to distinguish at time of presentation individuals with unipolar versus bipolar depression. Most of the variations they identified were small—e.g., slightly more apparent sadness in the unipolar patients, slightly more pessimistic thoughts in the bipolar patients—and had little predictive value on their own. When taken together, though, a bigger picture emerged. With mathematical modeling, it was possible to correctly differentiate unipolar from bipolar depression 87% of the time (3).
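To see how small effects can accumulate, consider the minimal sketch below. It uses purely synthetic data and a plain logistic regression, not the Perlis et al. dataset or their model: each simulated feature barely separates the two groups on its own, yet a model combining all of them discriminates far better.

```python
# Minimal sketch with synthetic data: many weak predictors, combined, classify well.
# Illustrative only -- not the Perlis et al. (3) sample or their model.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_patients, n_features = 1000, 30

# Each feature differs only slightly between the two diagnostic groups (0 vs. 1).
y = rng.integers(0, 2, n_patients)
X = rng.normal(size=(n_patients, n_features)) + 0.2 * y[:, None]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A single feature is a nearly useless classifier on its own...
auc_single = roc_auc_score(y_test, X_test[:, 0])

# ...but combining all 30 weak features yields much better discrimination.
model = LogisticRegression().fit(X_train, y_train)
auc_combined = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

print(f"AUC with one feature: {auc_single:.2f}; AUC with all features: {auc_combined:.2f}")
```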
To imagine that an individual clinician can accurately factor these data into his or her decision-making would be a fallacy. Given the seemingly infinite ways in which any two patients may be dissimilar, the number of “personalized” rules to remember vastly exceeds human capacity (4). As clinicians, we intuitively know this to be true and may feel alternately helpless, scared, or frustrated as we stare into this abyss of unknowns.
Computational psychiatry embraces this uncertainty. Put simply, the field is based on the idea that advanced computer models can help us navigate the complexity of modern psychiatry. It falls within a broader movement toward “personalized medicine”—a focus on individuals, not averages, with the goal of leveraging each person’s unique biological and behavioral profile to improve patient care.
Trying to leverage individual data, though, is no small mathematical feat. The complexity of the required analyses can escalate quickly, rendering traditional statistical models unviable. As a simple example, if two potentially interesting input variables (such as body mass index and waist circumference) are strongly related to one another (a phenomenon known as collinearity), traditional analytic approaches like linear regression produce unstable, difficult-to-interpret estimates. Standard approaches also struggle when the number of potential factors becomes large, such as when there are more variables of interest than available patients. Historically, this has forced researchers to limit the number of factors in their analysis (or to use stepwise approaches).
Advances in computing power and novel statistical techniques offer a different approach, known as machine learning. The term, coined in the late 1950s, refers to techniques whereby computers (machines) identify relationships in data (learn) without being explicitly programmed to do so. Rather than having to manually predetermine a subset or specific combination of variables, computer algorithms can iteratively comb through large amounts of data and determine on their own which variables are relevant.
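As a hedged illustration of how a penalized model copes with both problems described above, the sketch below (synthetic data, illustrative only) fits an elastic net in a setting where predictors outnumber patients and two predictors are nearly identical. The penalty keeps the fit stable and shrinks most coefficients to zero, so the algorithm itself decides which variables carry signal rather than the analyst prespecifying them.

```python
# Illustrative sketch only: penalized regression where classical least squares breaks down.
import numpy as np
from sklearn.linear_model import ElasticNetCV

rng = np.random.default_rng(1)
n_patients, n_variables = 80, 200                        # more variables than patients

X = rng.normal(size=(n_patients, n_variables))
X[:, 5] = X[:, 0] + 0.05 * rng.normal(size=n_patients)   # two nearly collinear predictors
                                                          # (think BMI and waist circumference)

# Only the first five variables truly influence the outcome.
true_coef = np.zeros(n_variables)
true_coef[:5] = [2.0, -1.5, 1.0, 0.8, -0.5]
y = X @ true_coef + rng.normal(size=n_patients)

# An elastic net (mixed L1/L2 penalty, tuned by cross-validation) fits this setting anyway,
# shrinking most coefficients exactly to zero.
model = ElasticNetCV(l1_ratio=0.5, cv=5).fit(X, y)
kept = np.flatnonzero(model.coef_)
print(f"Variables retained by the model: {len(kept)} of {n_variables}")
```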
We know, of course, that these approaches are already widespread—and extraordinarily successful—in a range of nonmedical settings. Virtually every aspect of our online experience is personally tailored: what results show up in your Google search, what books Amazon recommends that you read, even the advertisement you may be seeing alongside this article in your web browser.
In health care, opportunities for this approach are broad but still nascent. In oncology, for example, multiple treatments are often mixed and matched according to the biological profile of a tumor, which can be determined with the help of machine learning algorithms. While promising, this approach may not be enough to overcome extraordinarily complex biology: dramatic heterogeneity has been found even within a single tumor of a single patient (5). More recently, spurred by great progress in training machines to perform image recognition tasks, dermatology has seen algorithms classify whether a given skin lesion is cancerous with accuracy rivaling that of dermatologists (6).
A number of computational approaches are now being tried in psychiatry, as well. One recent example entailed using machine learning approaches to predict treatment response to different selective serotonin reuptake inhibitors. For decades, patients have endured a trial-and-error process with multiple medications before finding the right one. With this in mind, Chekroud et al. (7) analyzed the symptom profiles of more than 4000 patients with depression, using artificial intelligence to determine which patients would respond best to a specific antidepressant (citalopram, in this case). When looking for predictive relationships, the algorithm simultaneously considered more than 160 potential variables—far too many for traditional approaches. The algorithm homed in on a broad array of clinical features, including the presence of somatic complaints, insomnia, and previous exposure to traumatic events. Overall, the algorithm was able to accurately predict which patients would experience remission with citalopram—in fact, predicting as accurately as practicing psychiatrists—and is now regularly used through an online questionnaire (8).
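In rough outline, a workflow of this kind can be sketched as follows (with simulated data and generic components, not the actual STAR*D items or the exact pipeline Chekroud et al. used): screen a large pool of baseline features for predictive value, train a classifier on the retained features, and then ask how well it predicts remission in patients it has never seen.

```python
# Hedged sketch of a remission-prediction workflow; not the published model.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(2)
n_patients, n_items = 4000, 160                   # scale roughly mirrors the study

X = rng.normal(size=(n_patients, n_items))        # stand-ins for baseline questionnaire items
signal = X[:, :10].sum(axis=1)                    # assume a handful of items carry signal
remitted = (signal + rng.normal(size=n_patients) > 0).astype(int)

pipeline = make_pipeline(
    SelectKBest(mutual_info_classif, k=25),       # keep only the most informative items
    GradientBoostingClassifier(),                 # flexible, nonlinear classifier
)

# Cross-validation: every patient's outcome is predicted by a model that never saw that patient.
auc = cross_val_score(pipeline, X, remitted, cv=5, scoring="roc_auc")
print(f"Held-out AUC: {auc.mean():.2f}")
```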
The field of computational psychiatry is not just trying to improve what we do with patients—it is also about expanding what we know about mental disorders (9). Just as some researchers have used computer algorithms to make predictions, others have used them to develop and refine formal models of psychiatric illness. These models can then be compared to the basic frameworks by which we define psychiatric illnesses. While there are compelling reasons for how and why the DSM has evolved to its current state, a major problem is that it is largely syndrome based, without connection to underlying pathophysiology.
Computational methods offer the possibility of identifying more parsimonious diagnostic groupings and, in the process, may further elucidate underlying behavioral and neurobiological processes (9). The model put forward by Petzschner et al. (10) in this issue of Biological Psychiatry is one such attempt to offer a formal computational taxonomy for understanding psychiatric disease. The authors propose an overarching statistical framework for considering behavior that is based on hierarchical Bayesian models. These are statistical models that have multiple levels (e.g., an education model can have components at the level of the classroom and others at the level of the school) and in which parameters are estimated using Bayesian methods (techniques whereby current observations are combined with prior beliefs).
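To make those two ingredients concrete, here is a toy version of the classroom/school example (made-up numbers, with variances treated as known purely to keep the conjugate-normal arithmetic transparent): each classroom's mean score is estimated by combining its own students' scores with the school-level average, weighted by their respective precisions, so small, noisy classrooms are pulled more strongly toward the higher-level mean.

```python
# Toy hierarchical Bayesian estimate: classrooms nested within a school.
# Made-up numbers; variances assumed known for simplicity.
import numpy as np

rng = np.random.default_rng(3)
school_mean, school_sd = 70.0, 5.0            # school-level prior over classroom means
within_sd = 10.0                              # score noise among students within a classroom

# Simulate score data for classrooms of very different sizes.
class_sizes = [3, 10, 40]
true_means = rng.normal(school_mean, school_sd, size=len(class_sizes))
scores = [rng.normal(m, within_sd, size=n) for m, n in zip(true_means, class_sizes)]

for n, obs in zip(class_sizes, scores):
    prior_precision = 1 / school_sd**2           # weight on the school-level belief
    data_precision = n / within_sd**2            # weight on this classroom's own data
    # Posterior mean: precision-weighted blend of prior belief and observed average.
    posterior_mean = (
        prior_precision * school_mean + data_precision * obs.mean()
    ) / (prior_precision + data_precision)
    print(f"n={n:2d}  raw mean={obs.mean():5.1f}  pooled estimate={posterior_mean:5.1f}")
```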
Petzschner et al.’s general thesis is that we can conceptualize behavior in terms of loops between beliefs and observations, and that this conceptualization can be implemented as a hierarchical Bayesian model. Once we have this framework, we can be more precise and systematic about how and why disruptions might emerge. Ultimately, the authors hope that we may one day be able to describe psychiatric phenotypes more effectively by isolating specific components of these computational models.
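The belief-observation loop itself can be illustrated with a deliberately stripped-down sketch (a generic Kalman-style update on synthetic data, not Petzschner et al.'s actual model): an agent holds a Gaussian belief about a hidden state, receives noisy observations, and moves its belief by a precision-weighted prediction error on each pass through the loop. Distorting a single parameter, here the precision the agent assigns to its observations, is enough to produce systematically noisier, overconfident beliefs, a toy version of how a disruption might be localized to a specific model component.

```python
# Generic belief-observation loop (predictive-coding flavor); illustrative only,
# not the specific model proposed by Petzschner et al. (10).
import numpy as np

rng = np.random.default_rng(4)
TRUE_STATE, OBS_SD, N_TRIALS = 1.0, 0.5, 200      # hidden state, true sensory noise, loop length

def belief_trajectory(assumed_obs_sd):
    """Update a Gaussian belief after each noisy observation (Kalman-style)."""
    mean, var = 0.0, 1.0                          # initial belief about the hidden state
    trajectory = []
    for _ in range(N_TRIALS):
        obs = TRUE_STATE + OBS_SD * rng.normal()  # noisy observation
        gain = var / (var + assumed_obs_sd**2)    # precision-weighted learning rate
        mean += gain * (obs - mean)               # belief moves by the prediction error
        var = (1 - gain) * var + 0.01             # small drift keeps the belief adaptable
        trajectory.append(mean)
    return np.array(trajectory)

calibrated = belief_trajectory(assumed_obs_sd=OBS_SD)   # trusts sensations appropriately
overweighted = belief_trajectory(assumed_obs_sd=0.05)   # treats noisy sensations as near-certain

# The mis-calibrated agent's belief keeps chasing every noisy sensation.
print(f"Belief volatility, well calibrated: {calibrated[50:].std():.3f}")
print(f"Belief volatility, overweighting:   {overweighted[50:].std():.3f}")
```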
What does the future look like for computational psychiatry? Broadly speaking, it can be helpful to think about computational approaches as aiming either to improve what we know or to improve what we do. If and when they are successful, treatment-oriented studies have the potential to be incorporated into clinical practice relatively quickly—though such changes may address relatively narrow questions or populations. The Petzschner et al. study is a fine example of a framework that may iteratively improve our overall understanding of mental illness—though it may be many years before such changes directly translate into new diagnostic schema or improved clinical outcomes.
In the long run, there is hope that computational tools—currently constrained to specific clinical decisions and patient strata—might become more widespread, useful, and accessible to clinicians and their patients. The path forward will not be easy [see (4)], requiring considerable effort to collect data routinely in clinical practice and a shift in physician education as tools become more prevalent. However, the vision is compelling—if we approach patient care with the same degree of innovation and computational rigor as the Googles and Amazons of the world, perhaps our treatments will be as successful as our advertisements.
Acknowledgments
Dr. Ross, as co-chair of the National Neuroscience Curriculum Initiative, receives support from the National Institutes of Health Grant Nos. R25 MH10107602S1 and R25 MH08646607S1. This commentary was produced in collaboration with the National Neuroscience Curriculum Initiative.
Mr. Chekroud holds equity in Spring Care Inc., a behavioral health startup. He is lead inventor on two patent submissions relating to treatment for major depressive disorder (United States Patent and Trademark Office docket number Y0087.70116US00 and United States Patent and Trademark Office Provisional Application No. 62/491,660).
Disclosures
Dr. Lane reports no biomedical financial interests or potential conflicts of interest. Dr. Ross reports no other financial interests or potential conflicts of interest.
References
1. Cannon TD, Cadenhead K, Cornblatt B, Woods SW, Addington J, Walker E, et al. Prediction of psychosis in youth at high clinical risk: A multisite longitudinal study in North America. Arch Gen Psychiatry. 2008;65:28–37. doi: 10.1001/archgenpsychiatry.2007.3.
2. Nemeroff CB, Heim CM, Thase ME, Klein DN, Rush AJ, Schatzberg AF, et al. Differential responses to psychotherapy versus pharmacotherapy in patients with chronic forms of major depression and childhood trauma. Proc Natl Acad Sci U S A. 2003;100:14293–14296. doi: 10.1073/pnas.2336126100.
3. Perlis RH, Brown E, Baker RW, Nierenberg AA. Clinical features of bipolar depression versus major depressive disorder in large multicenter trials. Am J Psychiatry. 2006;163:225–231. doi: 10.1176/appi.ajp.163.2.225.
4. Perlis RH. Abandoning personalization to get to precision in the pharmacotherapy of depression. World Psychiatry. 2016;15:228–235. doi: 10.1002/wps.20345.
5. Gerlinger M, Rowan AJ, Horswell S, Larkin J, Endesfelder D, Gronroos E, et al. Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. N Engl J Med. 2012;366:883–892. doi: 10.1056/NEJMoa1113205.
6. Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, Thrun S. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542:115–118. doi: 10.1038/nature21056.
7. Chekroud AM, Zotti RJ, Shehzad Z, Gueorguieva R, Johnson MK, Trivedi MH, et al. Cross-trial prediction of treatment outcome in depression: A machine learning approach. Lancet Psychiatry. 2016;3:243–250. doi: 10.1016/S2215-0366(15)00471-X.
8. Chekroud AM, Gueorguieva R, Krumholz HM, Trivedi MH, Krystal JH, McCarthy G. Reevaluating the efficacy and predictability of antidepressant treatments. JAMA Psychiatry. 2017;74:370–378. doi: 10.1001/jamapsychiatry.2017.0025.
9. Krystal JH, Murray JD, Chekroud AM, Corlett PR, Yang G, Wang XJ, Anticevic A. Computational psychiatry and the challenge of schizophrenia. Schizophr Bull. 2017;43:473–475. doi: 10.1093/schbul/sbx025.
10. Petzschner FH, Weber LAE, Gard T, Stephan KE. Computational psychosomatics and computational psychiatry: Toward a joint framework for differential diagnosis. Biol Psychiatry. 2017;82:421–430. doi: 10.1016/j.biopsych.2017.05.012.