Psychiatry has been waiting for a neuroimaging test that can provide practical information to guide treatment decisions (1). Despite decades of using magnetic resonance imaging to characterize psychiatric neurobiology, no imaging tests have been translated into clinical practice. Challenges to this goal include costs, relevant applications, reproducibility, and the absence of a simple method to evaluate a test’s clinical utility. The traditional statistical metrics reported in neuroimaging studies, such as sensitivity, specificity, and area under the curve, do not provide direct information about whether a test would change a clinical decision. To this end, we introduce the approach of decision curve analysis (DCA) to evaluate predictive neuroimaging models and demonstrate how DCA could be applied to the prediction of treatment response to transcranial magnetic stimulation (TMS) for major depression.
DCA provides a framework to evaluate predictive models that incorporates the balancing of risks and benefits of treatment across a range of clinician and patient preferences (2). DCA has been used to evaluate the clinical utility of predictive tests in oncology, cardiology, and other areas of medicine (3-5), but has yet to be adopted in psychiatry. The core component of DCA is the concept of “threshold probability” (Pt), or the probability at which an individual values the benefits of treatment equally to avoiding unnecessary treatment. If the probability of a condition being present were above the threshold probability, individuals would opt for treatment. Conversely, if this probability were below their threshold, individuals would avoid treatment.
DCA calculates the net benefit of predictive models over a range of threshold probabilities, and therefore a precise estimate of threshold probability is not required. Net benefit in DCA is equal to the percentage of individuals appropriately treated (“true positives”) minus a weighted percentage of those inappropriately treated (“false positives”), where the weight is the ratio of the threshold probability to its complement (as shown in Equation 1). Therefore, at low threshold probabilities, the potential harm of false positives is considered low compared to the benefit of treatment. However, if the cost or risk of false positives were high, the threshold probability would increase and treatment would be reserved for individuals with a higher probability of the condition. The net benefit of each model is compared to that of the “treat all” and “treat none” strategies. The strategy with the highest net benefit over a range of reasonable threshold probabilities is considered superior.
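$$\text{Net benefit} = \frac{\text{True positives}}{N} - \frac{\text{False positives}}{N}\left(\frac{P_t}{1 - P_t}\right) \qquad (1)$$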
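As a rough illustration of the net benefit calculation described above, the following Python sketch computes net benefit from counts of true and false positives. The function name `net_benefit`, its argument names, and the example cohort (100 patients, 30 true positives, 20 false positives) are hypothetical and chosen only for illustration; they are not data from any study.

```python
def net_benefit(true_pos, false_pos, n, pt):
    """Net benefit at threshold probability pt.

    true_pos, false_pos: counts of appropriately and inappropriately
    treated individuals; n: total number of individuals evaluated.
    """
    if not 0 <= pt < 1:
        raise ValueError("pt must satisfy 0 <= pt < 1")
    # Each false positive is weighted by the odds of the threshold probability.
    weight = pt / (1 - pt)
    return true_pos / n - (false_pos / n) * weight


# Hypothetical cohort: the penalty for false positives grows with pt.
for pt in (0.10, 0.20, 0.40):
    print(f"Pt = {pt:.2f}: net benefit = {net_benefit(30, 20, 100, pt):.3f}")
```

At a threshold probability of 0, false positives carry no penalty and net benefit reduces to the proportion of true positives; as the threshold rises, each false positive counts more heavily against the strategy.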
To illustrate an example of how DCA could be used to guide clinical psychiatric treatment, we applied this methodology to a predictive neuroimaging model of clinical response to TMS. TMS is a noninvasive neuromodulation technique that utilizes pulsed magnetic fields to modulate cortical activity and is efficacious for pharmacoresistant depression (6). Standard TMS is delivered five days a week for five to eight weeks and can be expensive. While TMS has few systemic side effects, financial and time commitments are often significant barriers; the delivery of standard TMS treatment also limits the number of patients who can be treated on a single device to 60-80 per year, making TMS a limited resource. A predictive model for treatment response to TMS could, in theory, identify which patients are more likely to respond and increase the percentage of successful treatments.
Recent research demonstrated that functional connectivity could be used to identify four “biotypes” of depression that are associated with different patterns of response to TMS, which provides an example of how DCA could be implemented (7). The model for biotype identification was developed using a large multisite validation dataset (n=711) and replicated with an independent sample (n=477). A subset of 124 subjects with depression who participated in this study were treated with dorsomedial prefrontal TMS (utilizing repetitive or intermittent theta-burst TMS). Approximately 36% (n=45) of these participants achieved a significant clinical response, defined here as a 50% reduction in symptoms. Using connectivity-defined biotypes, Biotype 1 was associated with a 65% response rate, followed by Biotype 3 at 32%, Biotype 4 at 15% and Biotype 2 at 12.5%. From these results, meaningful post-test probabilities are obtained that could inform the decision to pursue TMS. Consider the following two strategies for identifying TMS-responsive patients: treat only Biotype 1 patients, resulting in a sensitivity of 58%, a specificity of 82% and 73% accuracy; or treat Biotypes 1 & 3, resulting in a sensitivity of 87%, a specificity of 47% and 61% accuracy. Applying DCA to these models results in the decision curves shown in Figure 1. For threshold probabilities between 14% and 32%, the net benefit of treating Biotypes 1 & 3 is higher than that of treating only Biotype 1 or of the “treat all” approach. For threshold probabilities above 32%, treating just Biotype 1 achieves greater net benefit.
Figure 1.
Decision Curve Analysis (DCA) for neuroimaging models predictive of responsiveness to transcranial magnetic stimulation (TMS). The threshold probability (Pt) represents the point at which positive treatment response to TMS is valued equally to avoiding unnecessary treatment. Net benefit is defined as the percentage of individuals who receive TMS and achieve response minus a weighted percentage of treated individuals who do not respond. The dotted lines represent potential treatment strategies based on functional MRI-defined biotypes of depression from Drysdale et al. (7). When compared with alternative strategies of “treat all” or “treat none” (solid lines), the neuroimaging-based strategies provide greater net benefit over a wide range of Pt values above 0.14.
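The comparison of strategies can be approximated from the summary statistics reported in the text. The sketch below is not the authors' analysis code: it assumes that the true-positive proportion equals sensitivity times the overall response rate (45/124) and that the false-positive proportion equals (1 − specificity) times the non-response rate, and it uses the rounded sensitivity and specificity values quoted above. The names `STRATEGIES`, `net_benefit_from_rates`, and `best_strategy` are hypothetical.

```python
# Overall response rate in the treated subset (45 of 124 responders).
PREVALENCE = 45 / 124

# (sensitivity, specificity) for each strategy, using rounded values from the text.
# "treat all" and "treat none" are the two reference strategies.
STRATEGIES = {
    "treat all": (1.00, 0.00),
    "treat none": (0.00, 1.00),
    "Biotype 1 only": (0.58, 0.82),
    "Biotypes 1 & 3": (0.87, 0.47),
}


def net_benefit_from_rates(sens, spec, prevalence, pt):
    """Net benefit expressed through sensitivity, specificity and prevalence."""
    tp_rate = sens * prevalence              # true positives / N
    fp_rate = (1 - spec) * (1 - prevalence)  # false positives / N
    return tp_rate - fp_rate * pt / (1 - pt)


def best_strategy(pt):
    """Name of the strategy with the highest net benefit at threshold pt."""
    return max(
        STRATEGIES,
        key=lambda name: net_benefit_from_rates(*STRATEGIES[name], PREVALENCE, pt),
    )


for pt in (0.10, 0.20, 0.40):
    print(f"Pt = {pt:.2f}: best strategy = {best_strategy(pt)}")
```

Under these assumptions, the curves cross near the threshold probabilities of 14% and 32% described in the text; small differences are expected because the inputs are rounded.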
Greater net benefit translates into a higher proportion of successful TMS treatments and reduced cost per response. Assuming a TMS treatment series costs $15,000 and takes 36 sessions, the “treat all” strategy currently used costs $41,333 and 99 visits per clinical response. Adding the predictive neuroimaging models and an approximate cost of $1,500 and a one-hour visit per MRI scan, the predictive approaches yield reduced costs per response (Biotype 1 only: cost $30,231, 60 visits; Biotypes 1 & 3: cost $35,923, 78 visits). Applying this approach to clinical practice would provide psychiatrists and patients with a more informed perspective for treatment decisions. One can imagine the conversations, “Mrs. Brown, based on the results of your MRI scan, you have a 65% chance of responding to TMS,” or conversely, “…based on your MRI scan, we should discuss other treatments.”
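The cost and visit figures above can be reproduced with simple arithmetic. The sketch below assumes that every one of the 124 patients receives one MRI scan under the biotype strategies, and it uses treated and responder counts inferred from the reported sensitivities and specificities (Biotype 1 only: 40 treated, 26 responders; Biotypes 1 & 3: 81 treated, 39 responders). These counts are a reconstruction for illustration and are not stated explicitly in the text; the function name `per_response` is hypothetical.

```python
TMS_COST, TMS_VISITS = 15_000, 36  # per treatment series
MRI_COST, MRI_VISITS = 1_500, 1    # per scan (one one-hour visit)
N_PATIENTS = 124


def per_response(n_treated, n_responders, n_scanned=0):
    """Return (cost, visits) per clinical response for a treatment strategy."""
    total_cost = n_treated * TMS_COST + n_scanned * MRI_COST
    total_visits = n_treated * TMS_VISITS + n_scanned * MRI_VISITS
    return total_cost / n_responders, total_visits / n_responders


# Treated/responder counts for the biotype strategies are inferred (see lead-in).
strategies = {
    "treat all": per_response(124, 45),
    "Biotype 1 only": per_response(40, 26, n_scanned=N_PATIENTS),
    "Biotypes 1 & 3": per_response(81, 39, n_scanned=N_PATIENTS),
}

for name, (cost, visits) in strategies.items():
    print(f"{name}: ${cost:,.0f} and {visits:.0f} visits per response")
```

With these inputs, the rounded results agree with the per-response costs and visit counts quoted in the paragraph above.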
This exercise demonstrates the potential of evaluating predictive neuroimaging models in a clinical decision framework using DCA and illustrates how functional neuroimaging could be a cost-effective tool in guiding TMS treatment decisions. The strength of DCA is that it bridges the gap between neuroscience research and practical clinical decision-making. Furthermore, it demonstrates that predictive models can be clinically useful without perfect accuracy, as long as they address the right clinical questions. With recent advances in neuroimaging and computation in psychiatry, there is a clear need for a simple method to evaluate the clinical utility of predictive models and thus direct future research towards applications that change clinical practice.
Acknowledgments
Conflict of Interest Disclosures:
Drs. Berlow, Zandvakili, and Philip have received or been supported by grants from the National Institute of Mental Health (NIMH R25 MH101076 (YAB)), the National Institute of General Medical Sciences (NIGMS U54GM115677 (AZ)), the U.S. Department of Veterans Affairs (I01 RX002450 (NSP)), and the VA RR&D Center for Neurorestoration and Neurotechnology (YAB, AZ, NSP) during the conduct of this study. In the past, Dr. Philip has received grant support from Janssen, Neosync, and Neuronetics through clinical trial contracts and has served as an unpaid scientific advisory board member for Neuronetics. The opinions herein represent those of the authors and not the U.S. Department of Veterans Affairs. The other authors report no financial relationships with commercial interests.
REFERENCES
- 1. Paulus MP. Pragmatism Instead of Mechanism: A Call for Impactful Biological Psychiatry. JAMA Psychiatry. 2015;72(7):631–2.
- 2. Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making. 2006;26(6):565–74.
- 3. Siddiqui MM, Rais-Bahrami S, Turkbey B, George AK, Rothwax J, Shakir N, et al. Comparison of MR/ultrasound fusion-guided biopsy with ultrasound-guided biopsy for the diagnosis of prostate cancer. JAMA. 2015;313(4):390–7.
- 4. Apostolakis S, Lane DA, Guo Y, Buller H, Lip GY. Performance of the HEMORR2HAGES, ATRIA, and HAS-BLED bleeding risk-prediction scores in nonwarfarin anticoagulated atrial fibrillation patients. J Am Coll Cardiol. 2013;61(3):386–7.
- 5. Collins GS, Altman DG. Predicting the 10 year risk of cardiovascular disease in the United Kingdom: independent and external validation of an updated version of QRISK2. BMJ. 2012;344:e4181.
- 6. O'Reardon JP, Solvason HB, Janicak PG, Sampson S, Isenberg KE, Nahas Z, et al. Efficacy and safety of transcranial magnetic stimulation in the acute treatment of major depression: a multisite randomized controlled trial. Biol Psychiatry. 2007;62(11):1208–16.
- 7. Drysdale AT, Grosenick L, Downar J, Dunlop K, Mansouri F, Meng Y, et al. Resting-state connectivity biomarkers define neurophysiological subtypes of depression. Nat Med. 2017;23(1):28–38.