Table 3.
Studies on AI-assisted monitoring in mental health
| Ref. | Subject description | Mental health condition | Aim | AI-based method | Models | Variables for monitoring/prediction | Results and accuracy | Conclusions |
|---|---|---|---|---|---|---|---|---|
| Carreiro et al. (2024) | Patients with substance use disorder (n = 30) | | To use continuous physiologic data to detect high-risk behavioral states (stress and craving) during substance use disorder recovery | Machine learning models | | | | All models performed close to previously validated models from a research-grade sensor |
| Choo et al. (2024) | People with borderline personality disorder (n = 80) | Suicidal ideation | To explore predicting suicidal ideation in individuals with borderline personality disorder using EMA data | Machine learning models | | | | RNN showed enhanced predictive accuracy for higher SI values and participants with depression diagnoses or higher baseline depression score |
| Dong et al. (2024) | Patients with diagnosis of schizophrenia (n = 92) | Schizophrenia | To predict the responsiveness of patients with schizophrenia to rTMS treatment | Machine learning models | | | Balanced accuracy for predicting ≥20% reduction in negative symptoms of PANSS: | Key predictors of non-response: |
| Hammelrath et al. (2024) | Patients with mild-to-moderate depression (training sample: n = 1270; test sample: n = 318) | Mild-to-moderate depression | To compare algorithms using features collected at baseline or early in treatment to predict non-response to a 6-week online depression program | Machine learning algorithm | RF | Baseline variables: | Best performance from early treatment variables: AUC: 0.71–0.77; Recall: 0.75–0.76 | Therapeutic alliance and early symptom change constituted the most important predictors |
| Hilbert et al. (2024) | Patients with a diagnosis of panic disorder, agoraphobia, social anxiety disorder, or multiple specific phobias (n = 309) | Panic disorder, agoraphobia, social anxiety disorder, or multiple specific phobias | To test if functional neuroimaging data maintains strong prediction accuracy in larger samples using rs-fMRI data | Machine learning models | | | Accuracy: 0.465–0.600; Balanced accuracy: 0.465–0.613; Sensitivity: 0.460–0.687; Specificity: 0.375–0.539 | Caution is advised when interpreting promising prediction results from neuroimaging data in small samples |
| Wang, Wu, et al. (2024) | College students with symptoms of anxiety or depression (n = 107) | Symptoms of anxiety, depression, stress | To predict efficacy and response using machine learning in college students undergoing biofeedback therapy | Machine learning model | ANN | | Model accuracy for anxiety treatment response: 62% | Speech features, such as the energy parameters, can serve as more accurate and objective indicators for tracking biofeedback therapy response and predicting efficacy |
| Wang, Wu, et al. (2024) | Patients with MDD (training samples: n = 85; test samples: n = 147) | MDD | To predict treatment response by using neuroimaging data | Machine learning models | | | | The machine learning pipeline exhibited high accuracy and AUC (>0.80) on the training set but encountered challenges when applied to an external validation dataset, prompting an investigation into site heterogeneity issues |
| Zainal and Newman (2024) | Patients with GAD (N = 110) | GAD | To identify which clients with generalized anxiety disorder benefit from mindfulness ecological momentary intervention versus self-monitoring app | Machine learning models | | | GAD severity prediction: SVM nested leave-one-out cross-validation: AUC = 0.817, accuracy = 0.800, balanced accuracy = 0.795, sensitivity = 0.767, specificity = 0.822; RF nested leave-one-out cross-validation: AUC = 0.817, accuracy = 0.819, balanced accuracy = 0.814, sensitivity = 0.791, specificity = 0.837 | Predictors of optimization to the intervention were higher anxiety severity, higher trait perseverative cognition, lower set-shifting deficits, older age, and stronger trait mindfulness |
| Brandt et al. (2023) | Participants with schizophrenia or schizoaffective disorder (aged ≥18 years) (n = 1392) | Schizophrenia or schizoaffective disorder | To identify general prognostic factors of relapse for all participants (irrespective of treatment continuation or discontinuation) and specific predictors of relapse for treatment discontinuation | Machine learning | | 36 variables: | The concordance index for predictive performance was 0.707, meaning that the algorithm’s prediction about which of the two participants will relapse sooner is correct in 71% of the cases | Out of the 36 baseline variables, general prognostic factors of increased risk of relapse for all participants were drug-positive urine; paranoid, disorganized, and undifferentiated types of schizophrenia; psychiatric and neurological adverse events; higher severity of akathisia; antipsychotic discontinuation; lower social performance; younger age; lower glomerular filtration rate; benzodiazepine comedication. Predictors of increased risk specifically after antipsychotic discontinuation were increased prolactin concentration, higher number of hospitalizations, and smoking |
| Barrigon et al. (2023) | Patients with a history of suicidal thoughts and behavior (n = 225) | Suicidal ideation | To predict short-term (one week) suicide risk by using smartphone data in suicidal patients | Machine learning algorithm | Bayesian algorithm | | AUC: 0.78 | Unsupervised machine learning on smartphone data from patients with suicidal ideation effectively predicts suicide risk |
| Dougherty et al. (2023) | Patients with TRD (n = 233) | TRD | To predict which participants with treatment-resistant depression would be week 3 responders and sustained responders through week 12 to psilocybin treatment | Machine learning algorithms and models | | Two-dimensional sentiment from the first session (computed by NLP), emotional breakthrough index, treatment dose | At week 3: Accuracy: 85%, AUC: 88%; At week 12: Accuracy: 88%, AUC: 85% | Treatment response to psilocybin is accurately predicted using a logistic regression model incorporating NLP metrics, EBI scale responses, and treatment arm data |
| Harrer et al. (2023) | Patients with chronic back pain and depressive symptoms (n = 504) | Depressive symptoms | To predict treatment effects of an Internet-based depression intervention for patients with chronic back pain | Machine learning models | DT (developed by multilevel model-based recursive partitioning) | | | Predictions of the multivariate tree learning model suggest a pattern in which patients with moderate depression and relatively low pain self-efficacy benefit most, while no benefits arise when patients’ self-efficacy is already high |
| Jankowsky et al. (2024) | Naturalistic inpatients (n = 723) | Anxious and depressive symptoms | To compare machine learning algorithms for predicting treatment response in naturalistic inpatient samples | Machine learning algorithms | | | Training: R²: 0.329–0.70; Test: R²: 0.315–0.441 | Treatment-related variables were the most predictive, followed by psychological indicators |
| Ricka et al. (2023) | Patients with MDD (n = 26) | MDD | To identify markers of mood disorders using six months of physiological and clinical data by machine learning | Machine learning algorithm | Label extension and detrending processes, a feature selection, and a deep learning multilayer perceptron model | | 2-class prediction (depressed/not depressed): Accuracy: 86%; Sensitivity: 79%; Specificity: 94% | A supervised ML system can efficiently predict a patient’s clinical score by identifying their biosignature of symptoms during an MDD episode |
| Scodari et al. (2023) | Patients with subclinical depression (n = 236) | Minor depressive symptoms | To forecast symptom changes among subclinical depression patients receiving stepped care or usual care | Machine learning models | Tree-based and nested framework | | For the intervention group, the R² for models at various treatment time intervals are as follows: | Patients who received stepped care were more likely to reduce PHQ–9 scores if they had high PHQ–9 but low HADS-Anxiety scores at baseline, a low number of chronic illnesses, and an internal locus of control |
| Zou et al. (2023) | Patients with MDD (N = 245) | MDD | Using passive sensing data to predict treatment response in patients with MDD | Machine learning models | | | GRU-Decay: Precision: 0.61, Recall: 0.64, F1 score: 0.58, AUC: 0.65; Other models: Precision: 0.57–0.71, Recall: 0.22–0.59, F1 score: 0.33–0.54, AUC: 0.54–0.59 | In terms of recall, F1 score, and AUC, the sequence model based on GRU-Decay achieved the best performance |
| Weintraub et al. (2023) | Youth aged 13 to 19 who had active mood symptoms, mood instability, and at least one parent with bipolar or MDD (n = 44) | Depressive symptoms | Use of machine learning to identify the speech features that most strongly correlated with concurrent depressive symptoms over 18 weeks | Machine learning algorithm | SVM | PSRs from the Adolescent Longitudinal Interval Follow-up Evaluation; 20 speech features reflecting affective processes, social processes, drives, informal, time orientation words etc. | Strongest correlated combination of features: affective processes, drives, informal, leisure, and risk (r = 0.47, 95% CI: 0.37–0.56, R² = 0.12); Strongest association of features from subject’s first speech features: affective processes, nonfluencies, drives, and risks (r = 0.68, 95% CI: 0.48–0.81, R² = 0.11) | Speech features identified by machine learning analysis achieved moderate correlation |
| Jacobson et al. (2022) Note: Also included in the diagnosis domain | Participants aged 38.5 years old on average (n = 126,060) | MDD; Generalized anxiety disorder; Social anxiety disorder; Panic disorder; Borderline personality; Paranoid personality disorder; Schizophrenia | To examine the effectiveness of prediction of mental health outcomes based on exposure to online screening tools | Machine learning | RF; Cox Proportional Hazards Models | Screening tool topic; Screening tool attributes; Hour of the day and day of the week at which the screening tool was clicked; Whether the screening tool was a Mental Health America screening tool or from another online web domain; Number of previous searches which resulted in a click to a screening tool; Past interests, e.g., distribution of query topics prior to the clicking on the first screening tool by each user (to ascertain whether the online screen information provided incremental information to their general search pattern types) | Prediction accuracy was high for mental health self-references, self-diagnosis, and seeking care: screen content predicted later searches with mental health self-references (AUC = 0.73), mental health self-diagnosis (AUC = 0.69), mental health care-seeking (AUC = 0.61). Other outcomes were more difficult to predict: psychoactive medications (AUC = 0.55), suicidal ideation (AUC = 0.58), and suicidal intent (AUC = 0.60). Cox proportional hazards models suggested individuals utilizing tools with in-person care referral were significantly more likely to subsequently search for methods to actively end their life (HR = 1.727) | Online screens may influence help-seeking behavior, suicidal ideation, and suicidal intent. Websites with referrals to in-person treatments could put persons at greater risk of active suicidal intent |
| Nguyen et al. (2022) | Participants with MDD, included early onset (before the age of 30) and chronic (episode duration of two years) or recurrent (2+ episodes) disease episodes (n = 222) | MDD | To determine whether pretreatment reward task-based fMRI can predict treatment-specific outcome | Deep learning models | Feedforward neural network (a separate model was trained for each treatment: sertraline, bupropion, and placebo) | Reward task-based fMRI, which was acquired during a block-design number-guessing task that probes reward processing neural circuitry known to be altered in MDD; Clinical measurements; Demographic features | For predicting change in HAMD | All the models explained a substantial proportion of the variance in change in HAMD. The combination of these predictive models presented a possible precision medicine approach for antidepressant selection, and each model would be applied to provide a prediction of response to each treatment |
| Webb et al. (2022) | School district employees aged 18 or above who owned a smartphone, had limited exposure to meditation app, and had depressive symptoms below the severe range (n = 662) | Depression and anxiety | To use a data-driven algorithm to predict which individuals are most likely to benefit from app-based meditation training | Machine learning | ENRR | Pre-intervention distress, anxiety, depression, stress, repetitive negative thinking, the mindfulness aspect of acting with awareness, loneliness, diffusion, presence, search for meaning, self-compassion, well-being, age, gender, race, marital status, and income; Anxiety measure, PROMIS Depression measures, and 10-item Perceived Stress Scale | Multivariable ENRR model: Higher baseline levels of the following variables predicted a greater reduction in distress: Linear regression model: Higher levels of repetitive negative thinking predicted: A significant group with PAI interaction was observed | Either the linear regression model with a single predictor of baseline levels of repetitive negative thinking, or the multivariable ENRR model with multiple predictors can predict changes in the level of distress |
| Athreya et al. (2021) | People with nonpsychotic MDD who received at least 8 weeks of treatment with a study drug, including SSRIs, SNRIs, TCAs, or placebo (n = 3,518) | Depression | To identify specific depressive symptoms and thresholds of improvement that were predictive of antidepressant response | Machine learning | Gaussian mixture models; Probabilistic graphical models | Four HDRS items (depressed mood, psychic anxiety, guilt feelings/delusions, and work/activities); Thresholds of change in prognostic symptom severity, derived based on the absolute difference in median scores on symptom dynamic paths between baseline and four-week strata | Four depressive symptoms and specific thresholds of four-week change in each symptom predicted the eventual eight-week outcome of SSRI therapy with an average accuracy of 77%. The symptoms and thresholds derived from patients treated with SSRIs correctly predicted outcomes in 72% of patients treated with other antidepressants | The combination of the two AI models yielded consistently high predictive accuracy across numerous commonly prescribed antidepressants, and hence interpretable and accurate prognoses of antidepressant treatment outcomes |
| Bao et al. (2021) | Depressive patients receiving six intravenous infusions of ketamine over 2 weeks (n = 83) | MDD | To identify a set of biomarkers that could be used to predict clinical outcomes for treatment in MDD | Machine learning | | Age, sex, BMI, smoking status, and the HAMD score | Accuracy: | Machine learning approach could predict treatment outcomes of multiple ketamine infusions on the basis of the genotyping information |
| Lee et al. (2021) | Adults aged 18 to 65 with bipolar disorder (n = 60) | Bipolar depression | To identify biologically relevant moderators of response to TNF-α inhibitor, infliximab | Machine learning | CART | Plasma cytokine and neuronal origin-enriched extracellular vesicle protein concentrations, intervention assignment and week; SHAPS; MADRS | Accuracy of predicting reduction in anhedonic symptoms with baseline cytokine biotype, intervention allocation, week, and baseline and change in neuronal origin-enriched extracellular vesicle factor scores: | Pretreatment biotypes, which derived from peripheral cytokine measurements, can predict antianhedonic efficacy with infliximab |
| Solomonov et al. (2021) | Older adults over 60 who suffered from unipolar, nonpsychotic MDD (n = 221) | Suicidal ideation | To identify baseline predictors of the course of suicidal ideation | Machine learning algorithms | | Demographics, treatment assignment, age of onset, length of current episode, number of previous episodes, severity of depression, disability, cognitive impairment, executive functioning, neuroticism, apathy, hopelessness, activation, avoidance/rumination, work/school impairment avoidance/rumination; social impairment; anhedonia; rumination response style scale; and digit span | Predictive performance: | Four machine learning algorithms identified hopelessness, neuroticism, and low general self-efficacy as the strongest predictors of an unfavorable trajectory of suicidal ideation |
| Van Bronswijk et al. (2021) | Adult outpatients recruited from the mood disorders unit with a primary diagnosis of MDD (n = 151) | MDD | To extend the PAI to long-term depression outcomes after acute-phase psychotherapy | Two-step machine learning | | 38 pretreatment variables from six domains: | For parental alcohol abuse, the regression coefficients across the bootstrapped samples were stable with a positive value in 99.8% of the samples | A history of parental alcohol abuse was associated with higher BDI-II scores during the 17-month follow-up phase. Therefore, parental alcohol abuse could be used as a predictor for long-term depression outcomes following cognitive therapy and interpersonal psychotherapy |
| Busk et al. (2020) | Patients with bipolar disorder who had previously been treated (n = 15,975) | Bipolar disorder | To examine the feasibility of forecasting daily subjective mood scores based on daily self-assessments | Multi-task learning | Hierarchical Bayesian models | Daily self-assessments via Android smartphone app, including activity, alcohol, anxiety, irritability, cognitive difficulty, medicine intake, presence of mixed mood, mood, sleep, stress; Clinical evaluations with HDRS and YMRS to assess depression and mania | Historical mood was the most important predictor of future mood; self-reported mood scores and HDRS scores were negatively correlated (r = −0.40), whereas self-reported mood scores and YMRS scores were positively correlated (r = 0.22) | Application of hierarchical Bayesian models could forecast subjective mood for up to 7 days, thus improving continuous disease monitoring |
| Furukawa et al. (2020) | Patients aged 25 to 75 years, with nonpsychotic unipolar MDD episode, and having received no antidepressant, antipsychotic, or mood stabilizer in the previous month (n = 2,011) | MDD | To predict depression severity from a large set of baseline predictors through a web app | Machine learning | Penalized linear regression models using LASSO; Penalized linear regression models using the ridge penalty; SVM with a polynomial or radial kernel; Artificial neural networks with one hidden layer, three or four nodes | Sociodemographic variables including age, sex, education, employment status, and marital status; Baseline clinical characteristics include age at onset of depression, number of previous depressive episodes, length of index episode, and concurrent physical illness; Depression characteristics by week three include individual item scores of PHQ–9 for the index episode; individual item scores of the BDI-II; individual item scores of the FIBSER; and adherence to pharmacotherapy | SVMs showed a lower prediction error in both internal and internal-external cross-validation (MAE = 1.5) | Three different SVMs with a radial kernel, one SVM per treatment arm, could be chosen to predict treatment outcome |
| Rajpurkar et al. (2020) | Outpatients aged 18 to 65 from primary or specialty care practices with a diagnosis of MDD (n = 518) | MDD | To identify the extent to which a machine learning approach can predict acute improvement for individual depressive symptoms with antidepressants based on pretreatment symptom scores and EEG measures | Machine learning | ELECTREE Score algorithm using GBDTs | Resting-state EEG continuously recorded; Symptoms of HRSD–21 | C index score, which is indicative of discriminative performance, was found for 12 symptoms. The highest C index score was found on: Any single EEG feature was higher than 5% predictors for seven symptoms. Combination of EEG and baseline symptom features significantly increased the C index for improvement in four symptoms: | The machine learning model could predict the improvement in depressive symptoms most accurately with baseline symptom severity in combination with EEG features |
| Rozek et al. (2020) | Army soldiers reporting active suicide ideation with intent to die during the previous week and/or a suicide attempt during the previous month (n = 152) | Suicide | To examine predictors of suicidal behaviors among high-risk suicidal soldiers who received outpatient mental health services in a RCT of Brief CBT for Suicide Prevention compared to treatment as usual | Machine learning | MondoBrain Augmented Intelligence® System | | This combination of variables correctly classified eight of 26 participants who attempted suicide during the two-year follow-up period (30.8%) and misclassified only one of 126 participants who did not attempt suicide (0.8%), yielding 88.9% positive predictive value, and 87.4% negative predictive value | This combination of variables correctly classified almost one-third of participants who attempted suicide in the subsequent two years with good positive predictive value and negative predictive value |
| Browning et al. (2019) | Depressive patients whose treating clinician had made the decision to prescribe citalopram (n = 239) | Depression | To assess whether changes in emotional processing and subjective symptoms over the first week of antidepressant treatment predict clinical response after four to eight weeks of treatment | Machine learning | SVM | QIDS-SR16, ECAT, EREC, FERT | Accuracy: | Cognitive and symptomatic measures could be used to guide antidepressant treatment in depressed patients |
| Foster et al. (2019) | Adolescents aged 12–17 with MDD (n = 439) | MDD | To estimate patient-specific inter-treatment differences among three treatment conditions: CBT, FLX, and the combination of CBT and FLX, as a function of patients’ baseline characteristics | Machine learning | Model-based Random Forest | Gender, race, family income, referral source, dysthymia, anxiety disorder, ADHD, childhood trauma, study site, age, verbal intelligence, current episode duration, baseline depression severity, functional impairment, suicidal ideation, melancholic features, number of comorbid diagnoses, caregiver depression, conflict with caregiver, hopelessness, cognitive distortions, treatment expectations from parent, treatment expectations from adolescents | FLX-CBT difference: FLX was more effective (b = −0.13, 95% CI: −0.22 to −0.05), especially with more severe baseline depression; CBT-combination difference: Combination was more effective (b = −0.25, 95% CI: −0.33 to −0.17); FLX-combination difference: Combination was more effective (b = −0.11, 95% CI: −0.21 to −0.02), especially with less severe baseline depression and higher treatment expectations from patients | Combined treatment with CBT and FLX was consistently superior to either therapy administered alone across a broad range of patients |
| Vitinius et al. (2019) | Depressed patients with CAD (n = 570) | Depression | To identify somatic and sociodemographic predictors of depression outcome among depressed patients with CAD | Machine learning | LR and linear or binomial linear model with LASSO regularization | 141 potential sociodemographic and somatic predictors including blood tests, medical history, current drug use, comorbidities, and sociodemographic data. HADS | Predictors of favorable depression outcome: higher heart rate variability during numeracy tests (p = 0.020), unknown previous myocardial infarction (p = 0.013), higher age (p = 0.002); Predictors of unfavorable depression outcome: anticholinergic drugs (p = 0.045), state after resuscitation (p ≤ 0.042), uric acid drugs (p ≤ 0.039), beta blockers (p = 0.035), New York Heart Association (NYHA) class III (p ≤ 0.028), analgesic drugs (p = 0.027), antidiabetic drugs (p = 0.015), higher triglycerides (p = 0.014), intake of thyroid hormones (p = 0.007), and hyperuricemia (p ≤ 0.003) | Machine learning could identify somatic and sociodemographic predictors of depression outcome in patients with CAD |
| Bailey et al. (2018) | Patients with TRD and healthy controls aged 20 to 72 with normal or corrected to normal vision (n = 50) | Depression | To determine whether working memory related power, connectivity, and theta-gamma coupling measures could be used to predict responders to rTMS treatment for treatment-resistant depression | Multivariate machine learning | SVM | | Prediction of individual responders: | Baseline and week 1 frontal-midline theta power and theta connectivity showed good potential for predicting response to rTMS treatment for depression |
| Kautzky et al. (2018) | Patients diagnosed with MDD (n = 55) | MDD | To generate a prediction model for TRD using machine learning featuring a large set of clinical and sociodemographic predictors of treatment outcome | Machine learning | RF | 47 predictors documented in the GSRD database, which can be classified into: | The full model with 47 predictors yielded an accuracy of 75.0% for predicting TRD and treatment response, with positive predictive value of 79.6%, and negative predictive value of 67.9%. When the number of predictors was reduced to 15, accuracies between 67.6% and 71.0% were attained for different test sets | Machine learning techniques have shown promising results on prediction of TRD by considering interaction and main effects equally and producing reliable classification with high accuracy |
| Lenhard et al. (2018) | Adolescents aged 12–17 with OCD who had received either immediate or delayed (12 weeks) internet-delivered CBT (n = 61) | Pediatric OCD | To test four different machine learning methods in the prediction of treatment response in a sample of pediatric OCD patients who had received internet-delivered CBT | Machine learning | Linear model with best subset predictor selection L1 Elastic Net (LASSO) RF SVM | 46 demographic and clinical baseline variables, related to: | Accuracy: | Machine learning models were able to predict treatment outcome in internet-delivered CBT for pediatric OCD with good to excellent accuracy |
| Maciukiewicz et al. (2018) | Individuals diagnosed with MDD from three clinical trials who received duloxetine or placebo for up to eight weeks (n = 186) | MDD | To use supervised machine learning to build predictive models of duloxetine outcome for MDD with genome-wide data | Machine learning models | LASSO regression; CRT; SVM | SNPs | Accuracy on remission prediction: Of the 19 most robust SNPs, 17 were characterized by large LASSO coefficients | None of the machine learning models performed satisfactorily in remission prediction. For treatment response, SVM achieved moderate performance whereas CRT’s performance was just equal to chance accuracy |
| Nie et al. (2018) | STAR*D cohort: Patients with MDD. RIS-INT–93 cohort: Patients with MDD who had a history of resistance to therapy with antidepressant medication and were treated prospectively with citalopram for up to six weeks (n = 5686) | MDD | To identify risk factors of treatment resistance by extending the work in predictive modeling of treatment-resistant depression via partition of the data from the STAR*D cohort and completely independent cohort RIS-INT–93 into training and testing datasets | Machine learning | | CRS, demographics, PHX, MHX, PRISE, PDSQ, baseline and week two of level 1 treatment which include records from Clinic Visit Form, QIDS-C16, QIDS-SR16, Bech melancholia scale, the Maier-Phillipp severity subscale, the Santen Subscale, the Gibbons’ global depression severity scale, HAM-D7 | STAR*D testing dataset and RIS-INT–93 independent dataset with an AUC of 0.70–0.78 and 0.72–0.77, respectively | The series of machine learning models were able to predict treatment-resistant depression using clinical and sociodemographic data |
| Chekroud et al. (2016) | STAR*D trial: Patients from primary and psychiatric care settings, with nonpsychotic MDD, with a score of at least 14 on the 17-item HAMD, and aged 18–75. COMED trial: Patients with nonpsychotic MDD, had recurrent or chronic depression, with a score of at least 16 on the 17-item HAMD, and aged 18–75 (n = 4041) | MDD | To develop an algorithm to assess whether patients will achieve symptomatic remission from a 12-week course of citalopram | Machine learning | EN | Overlapping variables in the two clinical trials including sociodemographic features, DSM-IV-based diagnostic items, depressive severity checklists, eating disorder diagnoses, whether the patient had previously taken specific antidepressant drugs, the number and age of onset of previous major depressive episodes, and the first 100 items of the psychiatric diagnostic symptom questionnaire | Accuracy in internal validation: | Machine learning achieved moderate performance for internal prediction. Performance across cohorts varied by treatment group, showing fair to moderate accuracy |
| Iniesta et al. (2016) | Treatment-seeking adults with MDD and a current depressive episode (n = 793) | MDD | To optimize prediction of symptom improvement and remission during treatment with escitalopram or nortriptyline | Machine and statistical learning | ENRR | Demographics data including current age, age at onset of depression, sex, smoking status, BMI, occupation, marital status, years of education and number of children; Baseline severity measures including the clinician-rated MADRS, the 17-item HRSD and the self-report BDI; Individual depressive symptoms from the SCAN interview and depression subtypes; Observed mood, cognitive and neurovegetative symptom factors, and six dimensions (mood, anxiety, pessimism, interest-activity, sleep, and appetite) from a published factor analysis; Stressful life events experienced during the six months prior to the baseline assessment, measured with the LTE-Q; Medication history included the use of antidepressant at the time of recruitment, any prior antidepressant treatment, number and types of antidepressants tried established with Medication History Form | Accuracy of prediction on different outcomes: | Easily obtained demographic and clinical variables could predict therapeutic response to escitalopram with clinically meaningful accuracy |
| Amminger et al. (2015) | Individuals at ultra-high risk for psychosis and meeting at least one of the operationally defined groups of risk factors for psychosis: | Psychosis | To determine biological and clinical factors associated with treatment response indexed by functional improvement in a pre–post examination of a 12-week intervention in individuals at ultra-high risk for psychosis | Machine learning | Linear regression models; Gaussian Process Classification | Erythrocyte fatty acid composition of the phosphatidylethanolamine phospholipid fraction | Univariate analysis: Variance in prediction of functional improvement: Multivariate analysis: Overall accuracy of fatty acid prediction in treatment response: | Univariate analysis: Higher levels of erythrocyte membrane ALA (parent fatty acid of the ω–3 family) and more severe negative symptoms at baseline predicted subsequent functional improvement in the treatment group. Less severe positive symptoms and lower functioning at baseline were predictive of functional improvement in the placebo group. Multivariate analysis: Fatty acids predicted response to treatment in both ω–3 PUFA and placebo groups with a high level of accuracy |
| Guilloux et al. (2015) | Anxious-depressed adults with nonpsychotic MDD episode of sufficient severity (score ≥ 15 on the 25-item HRSD) and elevated symptoms of panic or anxiety (score ≥ 7 on the past-month panic and agoraphobic spectrum self-report) Nonpatient controls not meeting criteria for any mood or anxiety disorder (n = 67) |
MDD | To identify the biomarkers predicting nonremission prior to treatment initiation | Machine learning prediction model | Random intercept model; SVM |
Peripheral blood-based gene expression | The results from these studies indicate an average cross-validated accuracy (i.e., corrected for model selection bias) of 79.4% in predicting remission status, with the 13-gene model displaying the highest individual noncorrected prediction value (88%). A prediction model was newly built in the validation cohort using the same 13 genes identified in the initial cohort, and another round of leave-one-out cross-validation found that a 6-gene model achieved the highest accuracy (76.2%) |
At pretreatment assessment, the gene expression profiles obtained from blood samples of MDD subjects who will not attain remission after treatment differ from those of nondepressed controls and also from those of MDD patients who will remit with treatment. Six of the 13 genes identified in the initial cohort could predict remission in an independent cohort, demonstrating the potential of pretreatment peripheral gene expression profiles to predict nonremission following an eight- to 12-week course of citalopram treatment |
Abbreviations: ADHD: Attention-Deficit/Hyperactivity Disorder; AIMS: Abnormal Involuntary Movement Scale; ALA: α-linolenic acid; ANN: Artificial neural network; AUC: Area under the receiver operating characteristic curve; BARS: Barnes Akathisia Rating Scale; BDI: Beck Depression Inventory; BMI: Body mass index; BSSI-W: Beck Scale for Suicide Ideation, Worst Point; CAD: Coronary artery disease; CART: Classification and regression trees; CBT: Cognitive behavioral therapy; CDSS: Sum of Calgary Depression Scale for Schizophrenia; CGI: Clinical Global Impression; COSTA: Cognitive Style Assessment measuring cognitive distortions; CRS: Cumulative Illness Rating Scale; CRT: Classification and regression tree; DT: Decision tree; EBI: Emotional Breakthrough Index; ECAT: Emotional categorization task; EEG: Electroencephalography; EMA: Ecological Momentary Assessment; EN: Elastic net; ENRR: Elastic net regularized regression; EREC: Emotional recall task; FERT: Face-based emotional recognition task; FFMQ: Five Factor Mindfulness Questionnaire; FLX: Fluoxetine; fMRI: Functional magnetic resonance imaging; GAD: Generalized anxiety disorder; GAF: Global Assessment of Functioning; GBDT: Gradient-boosted decision trees; GRU: Gated Recurrent Unit; GSRD: Group for the Study of Resistant Depression; HADS: Hospital Anxiety and Depression Scale; HAMD: Hamilton Depression Rating Scale; HDRS: Hamilton Depression Rating Scale; HRSD: Hamilton Rating Scale for Depression; kNN: K-nearest neighbor; LASSO: Least absolute shrinkage and selection operator; LR: Logistic regression; LSTM: Long Short-Term Memory; LTE-Q: List of Threatening Experiences Questionnaire; MADRS: Montgomery-Åsberg Depression Rating Scale; MAPE: Mean absolute percent error; MDD: Major depressive disorder; MEM: Mixed-effects linear regression models; MHX: Medication history; NLP: Natural language processing; NPRS: Numerical pain rating scale; ODI: Oswestry Disability Index; OCD: Obsessive-compulsive disorder; PAI: Personalized Advantage Index; PANSS: Positive and Negative Syndrome Scale; PDSQ: Psychiatric Diagnostic Screening Questionnaire; PHQ-9: Patient Health Questionnaire-9; PHX: Psychiatric history; PRISE: Patient Rated Inventory of Side Effects; PROMIS: Patient-Reported Outcomes Measurement Information System; PRS: Polygenic risk score; PSEQ: Pain Self-Efficacy Questionnaire; PSP: Personal and Social Performance; PSRs: Psychiatric Status Ratings; QIDS-C16: Quick Inventory of Depressive Symptomatology (Clinician-Rated); QIDS-SR16: Quick Inventory of Depressive Symptomatology (Self-Report); QoL: Quality of life; RCT: Randomized controlled trial; RF: Random Forest; rTMS: Repetitive transcranial magnetic stimulation; RMSE: Root mean squared error; RNN: Recurrent neural networks; SCAN: Schedules for Clinical Assessment in Neuropsychiatry; SCS: Suicide Cognitions Scale; SEWIP: Scale for the Multiperspective Assessment of General Change Mechanisms in Psychotherapy; SHAPS: Snaith-Hamilton Pleasure Scale; SICD: Structured clinical interview for DSM-IV; sMRI: Structural Magnetic Resonance Imaging; SNPs: Single nucleotide polymorphisms; SNRIs: Serotonin-norepinephrine reuptake inhibitors; SPE: Subjective Prognostic Employment Scale; SSRIs: Selective serotonin reuptake inhibitors; SVM: Support vector machine; TCAs: Tricyclic antidepressants; TNF: Tumor necrosis factor; TRD: Treatment-resistant depression; XGBoost: Extreme gradient boosting; YMRS: Young Mania Rating Scale; ω-3 PUFA: Omega-3 polyunsaturated fatty acids.