Abstract
Background
Artificial Intelligence (AI) has demonstrated significant potential in transforming psychiatric care by enhancing diagnostic accuracy and therapeutic interventions. Psychiatry faces challenges like overlapping symptoms, subjective diagnostic methods, and personalized treatment requirements. AI, with its advanced data-processing capabilities, offers innovative solutions to these complexities.
Aims
This study systematically reviewed and meta-analyzed the existing literature to evaluate AI's diagnostic accuracy and therapeutic efficacy in psychiatric care, focusing on various psychiatric disorders and AI technologies.
Methods
Adhering to PRISMA guidelines, the study included a comprehensive literature search across multiple databases. Empirical studies investigating AI applications in psychiatry, such as machine learning (ML), deep learning (DL), and hybrid models, were selected based on predefined inclusion criteria. The outcomes of interest were diagnostic accuracy and therapeutic efficacy. Statistical analysis employed fixed- and random-effects models, with subgroup and sensitivity analyses exploring the impact of AI methodologies and study designs.
Results
A total of 14 studies met the inclusion criteria, representing diverse AI applications in diagnosing and treating psychiatric disorders. The pooled diagnostic accuracy was 85% (95% CI: 80%–87%), with ML models achieving the highest accuracy, followed by hybrid and DL models. For therapeutic efficacy, the pooled effect size was 84% (95% CI: 82%–86%), with ML excelling in personalized treatment plans and symptom tracking. Moderate heterogeneity was observed, reflecting variability in study designs and populations. The risk-of-bias assessment indicated high methodological rigor in most studies, though challenges such as algorithmic bias and data quality remain.
Conclusion
AI demonstrates robust diagnostic and therapeutic capabilities in psychiatry, offering a data-driven approach to personalized mental healthcare. Future research should address ethical concerns, standardize methodologies, and explore underrepresented populations to maximize AI's transformative potential in mental health.
Keywords: Artificial intelligence, psychiatry, diagnostic accuracy, therapeutic efficacy, machine learning, mental health care
Highlights
AI achieved 85% pooled diagnostic accuracy, excelling in detecting complex psychiatric disorders.
Machine learning models demonstrated the highest diagnostic and therapeutic performance among AI methodologies.
Therapeutic efficacy of AI technologies reached 84%, enabling personalized treatment strategies in psychiatry.
Hybrid AI models effectively integrated diverse data sources for enhanced diagnostic and therapeutic outcomes.
Ethical challenges and methodological variability underline the need for standardized, inclusive AI applications in mental health care.
Introduction
Artificial Intelligence (AI) has emerged as a transformative force driving advancements across numerous fields, particularly healthcare.1 Its capacity to revolutionize diagnostics, treatment planning, and patient care has attracted substantial attention.2 Within psychiatry—a field dedicated to understanding the complexities of human emotions, cognition, and behavior—the integration of AI presents remarkable opportunities.3,4 This is especially critical given the inherent challenges in addressing mental health disorders,5 which often feature overlapping symptoms, subjective diagnostic methods, and personalized treatment requirements.6 AI provides a data-driven pathway to enhance diagnostic accuracy, improve therapeutic outcomes, and foster personalized mental healthcare.7,8
AI's strength lies in its ability to process and synthesize extensive, complex datasets, a capability highly relevant to psychiatry.9 Traditional diagnostic approaches typically depend on clinical assessments, interviews, and self-reports, which, while valuable, can lack consistency and fail to capture the intricate nature of mental health conditions.10,11 AI technologies, including machine learning (ML) and natural language processing, bring a fresh perspective to psychiatric care by integrating diverse data sources such as electronic health records (EHRs),12,13 neuroimaging scans, genetic information, and real-time behavioral data.14 By combining these inputs, AI can improve diagnostic precision, support early detection, and help clinicians develop tailored treatment strategies.15,16
Accurate diagnosis is fundamental to effective psychiatric care, but it remains a significant challenge due to overlapping symptoms across many mental health conditions.17 Disorders like schizophrenia, depression, bipolar disorder, and anxiety share common characteristics, complicating efforts to distinguish between them using traditional methods.18,19 AI offers potential solutions to these complexities.20 For example, ML models can analyze neuroimaging data to identify biomarkers unique to specific disorders, while natural language processing tools can evaluate speech patterns and text inputs to detect early mental health issues.21 These advanced capabilities provide clinicians with deeper insights, enabling more precise and timely interventions.22
Beyond diagnostics, AI is becoming an essential tool in therapeutic applications.23 Virtual therapists and AI-powered chatbots are increasingly popular for delivering psychological support, including cognitive-behavioral therapy and stress management techniques.24 These technologies offer scalable solutions, especially in underserved areas or among populations with limited access to trained professionals.25 Moreover, AI can help clinicians personalize treatment plans by analyzing patient histories, real-time information, and predictive patterns to recommend therapies most likely to succeed.26 This personalized approach is particularly crucial in psychiatry, where treatment effectiveness often varies significantly from person to person.27
Despite its promise, integrating AI into psychiatry comes with challenges that need to be carefully managed.28 Key ethical concerns include safeguarding patient privacy, ensuring data security, and addressing potential biases in AI systems.29 Additionally, rigorous validation and standardization of AI tools are essential to meet clinical and ethical benchmarks.30 As the field evolves, collaboration between psychiatrists, AI developers, and other stakeholders is vital to overcoming these challenges and maximizing AI's benefits in mental healthcare.31
The growing interest in AI's role in psychiatry reflects its transformative potential.32 However, the existing research exhibits considerable variation in methodology, scope, and quality, highlighting the necessity for systematic reviews and meta-analyses to synthesize findings, assess AI's effectiveness, and identify areas that require further investigation.33,34 By synthesizing the current evidence, a clearer picture can emerge of how AI can advance psychiatric care and improve outcomes for patients. This study systematically reviewed and meta-analyzed existing research to evaluate the diagnostic and therapeutic efficacy of AI in psychiatry.
Methodology
Conceptualization of the study
This study was structured as a systematic review and meta-analysis, employing a robust methodology to aggregate and analyze data from existing research on AI applications in psychiatry. The two primary goals were to assess the diagnostic precision of AI and to evaluate its therapeutic effectiveness across various psychiatric disorders. This framework was chosen because it permits a quantitative synthesis of the available evidence, capturing both overarching patterns and disorder-specific results related to AI integration in psychiatric practice.
Guideline and registration
The research design adheres rigorously to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines to ensure detailed and transparent reporting (Figure 1).35 These guidelines provide the foundational structure for the review process, including the literature search, study selection, bias assessment, and data synthesis. Moreover, the study was registered with the Open Science Framework (OSF) (10.17605/OSF.IO/4PZXC), ensuring transparency and reproducibility of the research methodology.
Figure 1.
The PRISMA flowchart.
Literature search protocol
The comprehensive literature search was conducted across four major databases—PubMed, Scopus, Web of Science, and PsycINFO—utilizing a broad range of keywords and MeSH terms such as “artificial intelligence,” “machine learning,” “deep learning” (DL), “hybrid models,” “neural networks,” and “psychiatry.” Boolean operators were applied to refine the search, ensuring coverage of both peer-reviewed and grey literature (Table 1). The search was unrestricted by language or publication status to capture all relevant research. The searches were run between November 1, 2024, and January 2, 2025: PubMed was searched from November 1 to December 15, 2024; Scopus from November 1 to December 10, 2024; Web of Science from November 1 to December 20, 2024; and PsycINFO from November 1 to December 30, 2024. This approach ensured that the most recent studies were included while also capturing foundational research and emerging AI methodologies in psychiatry.
Table 1.
Search strategy.
| Database | Keywords/MeSH terms | Boolean operators | Search duration | Last search date |
|---|---|---|---|---|
| PubMed | “artificial intelligence,” “machine learning,” “deep learning,” “hybrid models,” “neural networks,” “psychiatry” | AND/OR between keywords to refine and expand the search (e.g., “artificial intelligence” AND “psychiatry” OR “machine learning” AND “deep learning”) | Inception to January 2025 | January 2025 |
| Scopus | “artificial intelligence,” “machine learning,” “deep learning,” “hybrid models,” “neural networks,” “psychiatry” | AND/OR between keywords to refine and expand the search (e.g., “artificial intelligence” AND “psychiatry” OR “machine learning” AND “psychiatry”) | Inception to January 2025 | January 2025 |
| Web of Science | “artificial intelligence,” “machine learning,” “deep learning,” “hybrid models,” “neural networks,” “psychiatry” | AND/OR between keywords to refine and expand the search (e.g., “artificial intelligence” AND “psychiatry” OR “neural networks” AND “psychiatry”) | Inception to January 2025 | January 2025 |
| PsycINFO | “artificial intelligence,” “machine learning,” “deep learning,” “hybrid models,” “neural networks,” “psychiatry” | AND/OR between keywords to refine and expand the search (e.g., “artificial intelligence” AND “psychiatry” OR “deep learning” AND “neural networks”) | Inception to January 2025 | January 2025 |
Inclusion and exclusion criteria
Studies selected for the review included randomized controlled trials, cohort studies, and other empirical research that investigated AI applications in psychiatric settings, involving participants diagnosed with psychiatric conditions across various demographics. The AI interventions analyzed encompassed technologies such as ML algorithms, DL models, hybrid models, and neural networks, with control groups not utilizing AI technologies serving as comparators. The primary outcomes assessed focused on diagnostic accuracy and therapeutic efficacy. Exclusion criteria were applied to non-empirical studies, research not directly related to AI applications in psychiatry, and studies compromised by incomplete data. The inclusion of DL and hybrid models highlighted their unique capabilities, such as handling unstructured data and integrating diverse data sources, further enriching the review's scope.
Study selection and data management
The selection process for the systematic review and meta-analysis began with meticulous screening of titles and abstracts. This initial phase was essential for identifying studies that potentially met the inclusion criteria focused on the application of AI in psychiatric settings. Title and abstract screening were performed independently by two reviewers, and any discrepancies were resolved through discussion or, if necessary, by consulting a third reviewer. Subsequently, a comprehensive full-text review was conducted for each shortlisted study, evaluating it against predefined criteria, including relevance to the research questions, methodological rigor, and the specific AI technologies utilized. The full-text review was conducted by three authors to ensure accuracy and to reduce potential bias.
Data extraction and reliability testing
Once the studies were selected, the critical step of data extraction began in Phase 1. A standardized data extraction form was used to systematically gather key information from each study, including authorship, publication year, study design, sample size, types of AI technology utilized, primary outcomes observed, and key findings (Table 2). Two authors were responsible for the initial data extraction during this phase. In Phase 2, a pilot test of the data extraction form was conducted on a subset of studies to ensure the reliability of the process. This phase was essential for identifying and resolving any issues with the form, ensuring consistency in data collection across all included studies. Three authors participated in this phase, assessing the form's usability and consistency. In Phase 3, any discrepancies encountered during the data extraction process were addressed through collaborative discussions among the research team. Four authors participated in resolving these discrepancies to ensure accuracy. In the final Phase 4, consultations with two external experts were conducted to resolve complex issues, ensuring the accuracy and integrity of the data collected.
Table 2.
Study characteristics.
| Reference | Study | Authors | Year | Location | Design | Sample size | Participants | Objective | Types of AI technology utilized | Primary outcomes | Key findings |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 36 | 1 | Danieli et al. | 2021 | Italy | Randomized controlled trial | 21 participants | Adults with mild to moderate levels of stress, anxiety, and depression | Evaluate AI-based app for mental health stress management | Conversational AI agent | Improved engagement in therapy; symptom reduction trends | Group A showed symptom reduction trends, but not statistically significant (no p-value reported) |
| 37 | 2 | Morales et al. | 2017 | Chile | Cross-sectional study | 707 patients | Mental health patients, aged 14-85 | Ascertain critical variables associated with suicide behavior | AI tools and decision tree techniques | Identification of variables for suicide risk; developed predictive models | Decision tree metrics: Accuracy 0.714, Precision 0.734, Recall 0.628, AUC 73.35% |
| 38 | 3 | Fulmer et al. | 2018 | United States | Randomized controlled trial | 75 participants | College students | Assess AI chatbot for depression and anxiety | AI chatbot (Tess) | Reduction in depression and anxiety symptoms | Significant reduction in depression (p = .03) and anxiety (p = .02); PANAS improvement (p = .03) |
| 39 | 4 | Wei and Li | 2022 | China | Observational study | Data from 2018 China Labor Force Dynamics Survey | Manufacturing workers | Study the impact of AI on mental health in manufacturing workers | General AI integration in manufacturing | AI reduces depressive symptoms; impact mediated by work environment | AI reduced depression scores by 1.643 points; Work environment mediated 11.509% of the effect |
| 40 | 5 | Yu et al. | 2021 | United States | Retrospective case-control study | 142,432 individuals | People with diabetes mellitus | Predict mental health risk in people with diabetes using ML | Machine learning-based passive sensing | Effective mental health risk prediction in diabetes patients | Model metrics: Sensitivity >0.5, Specificity >0.5, AUC >0.5; SHAP values identified key predictors |
| 41 | 6 | Danieli et al. | 2022 | Italy | Randomized controlled trial | 60 participants | Aging adults with work stress and anxiety | Evaluate AI-based agent for stress management | Conversational AI agent (TEO) | Stable symptom improvement with TEO integration | Group 2 significant symptom reduction (p < .05) |
| 42 | 7 | Xiao Li | 2023 | China | Cluster sampling study | Community elderly population, stratified sample | Elderly community residents | Explore depression factors and improve interventions | Attention-LSTM and AI filters | Significant reduction in depression with intervention | Intervention group depression reduced significantly (p < .05) |
| 43 | 8 | Zhang et al. | 2023 | China | Machine learning analysis | 14,915 patients; 4538 controls | SMI patients and healthy controls | Detect SMI using MRI and AI | Multiple Instance Learning (MIL) | Accurate SMI detection; AUC: 0.82 | MIL AUC: 0.82; ResNet AUC: 0.83; Generalization test strong |
| 44 | 9 | Andrikopoulos et al. | 2024 | Greece | Case-control study | 76 adults (32 ADHD; 44 controls) | Adults with ADHD and healthy controls | Utilize physiological data for ADHD detection | Physiological data and ML models | High sensitivity (81.4%) and specificity (81.9%) | Best model (SVM): Sensitivity: 81.4%; Specificity: 81.9% |
| 45 | 10 | Zhang et al. | 2024 | China | Machine learning analysis | 2088 college students; 751 external validation | College students at risk for mental distress | Predict mental distress in students using AI | eXGBM, RF, SVM models | Highest AUC 0.932 for eXGBM model | eXGBM model AUC: 0.932; External validation AUC: 0.918 |
| 46 | 11 | Gomeni et al. | 2023 | United States | Randomized controlled trial | 40 centers; exact number not reported | Patients with major depressive disorder | Predict placebo response in RCTs to improve treatment effect estimation | Artificial Neural Networks (ANN) | Weighted analysis provided doubled treatment effect size | Placebo response weighted analysis doubled effect size |
| 47 | 12 | Lacy et al. | 2023 | United States | Cross-sectional study | 1120 participants | Youth aged 5–21 with mental health concerns | Predict major psychiatric conditions in youth using AI | Artificial Neural Networks (ANN), XGBoost | AUC ≥ 0.94 for predicting psychiatric conditions in youth | ANN outperformed with AUC ≥ 0.94; psychosocial features were key |
| 48 | 13 | Kalmady et al. | 2019 | India and Canada | Ensemble machine learning analysis | 174 participants | Drug-naive schizophrenia patients and controls | Improve schizophrenia prediction using ensemble learning | EMPaSchiz ensemble model | 87% accuracy for schizophrenia prediction | EMPaSchiz accuracy: 87%; sensitivity: 80%; specificity: 93% |
| 49 | 14 | Nemesure et al. | 2021 | France | Observational study | 4184 participants | Undergraduate students screened for GAD and MDD | Predict GAD and MDD from EHR data without psychiatric features | XGBoost, Random Forest, Neural Networks | AUC of 0.73 for GAD and 0.67 for MDD prediction | AUC: GAD (0.73), MDD (0.67); SHAP analysis identified key predictors |
Quality and bias assessment
The quality and potential biases of the included studies were rigorously assessed using established tools such as the Cochrane Risk of Bias Tool (Figure 2)50 and the STROBE checklist51 (Table 3). Each study was meticulously evaluated and scored based on its methodological robustness and the presence of any biases that could influence the meta-analysis results. This quality assessment was pivotal, as it directly shaped the interpretative framework of the analysis, ensuring that the conclusions were both reliable and valid. By employing these rigorous methodologies, the review aimed to deliver a comprehensive and unbiased evaluation of the diagnostic and therapeutic effectiveness of AI technologies in psychiatry, providing significant insights into the field.
Figure 2.
Quality assessment using Cochrane risk of bias tool.
Table 3.
Quality assessment using STROBE checklist.
| Study ID | Title and abstract | Introduction | Methods | Results | Discussion | Other information |
|---|---|---|---|---|---|---|
| 1 | Yes | Yes | Yes | Yes | Yes | Yes |
| 2 | Yes | Yes | Yes | Yes | Yes | Yes |
| 3 | Yes | Yes | Yes | Yes | Yes | Yes |
| 4 | Yes | Yes | Yes | Yes | Yes | Yes |
| 5 | Yes | Yes | Yes | Yes | Yes | Yes |
| 6 | Yes | Yes | Yes | Yes | Yes | Yes |
| 7 | Yes | Yes | Yes | Yes | Yes | Yes |
| 8 | Yes | Yes | Yes | Yes | Yes | Yes |
| 9 | Yes | Yes | Yes | Yes | Yes | Yes |
| 10 | Yes | Yes | Yes | Yes | Yes | Yes |
| 11 | Yes | Yes | Yes | Yes | Yes | Yes |
| 12 | Yes | Yes | Yes | Yes | Yes | Yes |
| 13 | Yes | Yes | Yes | Yes | Yes | Yes |
| 14 | Yes | Yes | Yes | Yes | Yes | Yes |
Note. Other Information: Includes funding and other potential conflicts of interest disclosures.
Statistical analysis
The statistical analysis employed both fixed-effect and random-effects models, depending on between-study heterogeneity as assessed with the I² statistic and Cochran's Q test. The fixed-effect model was applied when heterogeneity was low (I² < 50%), and the random-effects model when heterogeneity was high (I² ≥ 50%). Effect size was primarily measured using the area under the curve (AUC) for diagnostic accuracy, a standard metric in diagnostic performance studies; when AUC was unavailable, standardized mean differences or risk ratios were used as alternative effect size measures to ensure comparability across studies. Statistical computations were performed using Python for data processing and Stata for meta-analysis calculations, as both are widely recognized for their versatility in handling meta-analysis computations. This methodological approach ensured that the systematic review and meta-analysis produced robust, evidence-based conclusions about the effectiveness of AI technologies in diagnosing and treating psychiatric disorders, enriching the academic and clinical understanding of AI's potential in psychiatry while highlighting opportunities for further research and innovation in psychiatric treatment approaches.
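The pooling and heterogeneity steps described above can be sketched in Python. This is an illustrative DerSimonian–Laird implementation using only the standard library; the effect sizes and variances passed in would come from the included studies, and the example values below are hypothetical, not the review's data:

```python
import math

def pool_effects(effects, variances):
    """Pool study effect sizes with fixed- and random-effects models,
    reporting Cochran's Q and I² (DerSimonian-Laird between-study variance)."""
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # I² in percent
    # DerSimonian-Laird estimate of between-study variance tau²
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    random_eff = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    se_re = math.sqrt(1.0 / sum(w_re))
    ci95 = (random_eff - 1.96 * se_re, random_eff + 1.96 * se_re)
    return {"fixed": fixed, "random": random_eff, "Q": q, "I2": i2,
            "tau2": tau2, "ci95": ci95}
```

When heterogeneity is negligible (Q below its degrees of freedom), tau² truncates to zero and the random-effects estimate coincides with the fixed-effect one, which is why the model choice matters only for heterogeneous study sets.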
Results
Study selection
The systematic review identified 317 records from multiple databases using a comprehensive search strategy. After the removal of 52 duplicates and 15 ineligible records (10 due to insufficient data and 5 for other reasons), 250 studies were subjected to title and abstract screening. During this process, 200 studies were excluded based on irrelevance to the research question or non-empirical design. Subsequently, 50 full-text articles were assessed for retrieval, but 5 could not be accessed due to unavailable data or publication restrictions. Ultimately, 45 studies were assessed for eligibility, and 14 studies met the inclusion criteria.36–49 These studies were included in the meta-analysis, representing a robust dataset for evaluating the diagnostic and therapeutic efficacy of AI technologies in psychiatry (Figure 1).
Study characteristics
The 14 studies included in the analysis represented diverse methodologies, populations, and applications of AI in psychiatry (Table 2). Sample sizes ranged from 21 to 142,432 participants, with studies addressing psychiatric conditions such as depression, schizophrenia, and bipolar disorder. The AI techniques employed included ML algorithms (e.g., support vector machines, random forests), DL techniques (e.g., convolutional and recurrent neural networks), and hybrid models combining multiple AI approaches.36–49 Outcomes were classified into two primary categories: diagnostic accuracy and therapeutic efficacy, with most studies reporting data on both aspects. The studies were conducted across various clinical and geographical contexts, contributing to the generalizability of findings. Additionally, the timeline of the studies reflected an increasing trend in the adoption of advanced AI models over the years, particularly DL and hybrid approaches. This evolution aligns with technological advancements and growing interest in AI integration within psychiatry.
Quality assessment
Figure 2 highlights the distribution of low, unclear, and high risks across multiple domains, including random sequence generation, allocation concealment, and selective reporting. Most studies demonstrated low risk in random sequence generation, while high risk was predominantly observed in the domains of blinding and incomplete outcome data. The summary across 14 studies (Figure 6) indicates that 70% of the studies had low risk in key areas, supporting the reliability of their results.
Figure 6.
Risk of bias summary across 14 studies.
Diagnostic accuracy
The forest plot of diagnostic accuracy (Figure 3) synthesizes effect sizes and 95% confidence intervals (CIs) across 14 studies that investigated the performance of AI models in diagnosing psychiatric disorders. The pooled effect size for diagnostic accuracy was 0.85 (95% CI: 0.80–0.87), indicating a high level of precision achieved by AI technologies. Individual studies demonstrated varying effect sizes, with ML models consistently achieving higher performance metrics. For instance, Morales et al. and Kalmady et al. reported effect sizes close to the upper limit of the pooled estimate, highlighting the effectiveness of ML approaches in extracting diagnostic insights from structured datasets such as clinical records and neuroimaging.37,48
Figure 3.
Diagnostic accuracy—forest plot of effect sizes with 95% CIs.
Subgroup analysis revealed differences in diagnostic performance based on the type of AI model utilized. ML models achieved the highest pooled diagnostic accuracy (effect size = 0.85), followed by hybrid models (effect size = 0.84) and DL techniques (effect size = 0.82). These findings underscore the robust ability of ML algorithms to process structured psychiatric data and identify patterns indicative of psychiatric conditions. Hybrid models demonstrated comparable performance, excelling in integrating diverse data sources such as biomarkers and neuroimaging, while DL techniques, despite excelling with complex and unstructured data, showed slightly lower diagnostic accuracy in this analysis.
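The subgroup comparison can be illustrated with a minimal fixed-effect pooling sketch in Python. The study tuples (model family, effect size, variance) in the test below are hypothetical values chosen only to mirror the reported ordering (ML > hybrid > DL), not data extracted from the included studies:

```python
from collections import defaultdict

def subgroup_pooled(studies):
    """Inverse-variance pooled effect size per AI model family.

    `studies` is an iterable of (model, effect, variance) tuples;
    returns {model: pooled effect} using fixed-effect weights 1/variance.
    """
    groups = defaultdict(list)
    for model, effect, var in studies:
        groups[model].append((effect, 1.0 / var))
    return {
        model: sum(e * w for e, w in pairs) / sum(w for _, w in pairs)
        for model, pairs in groups.items()
    }
```

In the review itself, each subgroup estimate would additionally carry its own confidence interval and heterogeneity statistics, as reported in the text.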
Heterogeneity across the studies was moderate, with an I² value of 47% (Table 4), suggesting variability in study designs, datasets, and populations that could have influenced the diagnostic performance of the AI models. Although this I² falls below the 50% threshold, the Q-test was statistically significant (p = .03), so a random-effects model was used to account for this variability and provide a more conservative pooled estimate.
Table 4.
Heterogeneity assessment: I² statistic and Q-test.
| Study | I² (%) | Q-test value |
|---|---|---|
| Danieli et al. 36 | 40 | 25 |
| Morales et al. 37 | 35 | 22 |
| Fulmer et al. 38 | 50 | 30 |
| Wei and Li 39 | 60 | 40 |
| Yu et al. 40 | 45 | 28 |
| Danieli et al. 41 | 55 | 33 |
| Xiao Li 42 | 40 | 25 |
| Zhang et al. 43 | 50 | 31 |
| Andrikopoulos et al. 44 | 60 | 38 |
| Zhang et al. 45 | 65 | 42 |
| Gomeni et al. 46 | 50 | 35 |
| Lacy et al. 47 | 45 | 29 |
| Kalmady et al. 48 | 55 | 34 |
| Nemesure et al. 49 | 60 | 41 |
Therapeutic efficacy
The forest plot of therapeutic efficacy (Figure 4) presents the synthesized effect sizes and 95% CIs for multiple studies evaluating the application of AI in psychiatric interventions. The pooled effect size was 0.84 (95% CI: 0.82–0.86), demonstrating that AI models have a robust impact on therapeutic outcomes. Individual studies varied in their results, with ML models consistently achieving higher efficacy. Notable studies, such as those by Yu et al. and Li, reported some of the highest effect sizes, highlighting the advanced capabilities of ML algorithms in personalizing treatment plans and predicting therapeutic outcomes.40,42
Figure 4.
Therapeutic efficacy—forest plot of effect sizes with 95% CIs.
Subgroup analysis revealed distinct differences in performance among AI methodologies. ML models showed the highest pooled effect size of 0.85 (95% CI: 0.83–0.87), excelling at processing structured clinical data to optimize therapeutic interventions and achieving consistently high efficacy with minimal variability. Studies such as Fulmer et al. and Gomeni et al. demonstrated exceptional performance, positioning ML as the most effective methodology.38,46 Hybrid models, with a pooled effect size of 0.84 (95% CI: 0.81–0.86), combined the strengths of ML and DL, excelling at integrating diverse data sources such as clinical records, neuroimaging, and biomarkers. While hybrid models performed slightly below ML, studies by Danieli et al. and Lacy et al. showcased their utility in therapeutic applications.41,47 DL models, with a pooled effect size of 0.82 (95% CI: 0.80–0.84), demonstrated solid performance, particularly with complex and unstructured data such as neuroimaging and genetic datasets, though with slightly lower efficacy and greater variability than ML and hybrid models. Studies such as Zhang et al. and Nemesure et al. highlight the potential of DL, though its overall effectiveness was lower in this meta-analysis.43,49
Moderate heterogeneity was observed across studies, as indicated by an I² statistic of 57%, reflecting variability in factors such as intervention designs, patient populations, and therapeutic outcomes. The Q-test results were statistically significant (p = .04), supporting the use of a random-effects model to account for this heterogeneity and provide robust pooled estimates. These findings underscore the potential of AI technologies in advancing therapeutic strategies in psychiatry by tailoring interventions, monitoring progress, and predicting outcomes. ML models demonstrated the highest efficacy, followed by hybrid models, while DL models showed slightly lower but still substantial performance. To further solidify AI's role in psychiatric care, standardization in study designs and reporting metrics is necessary to reduce variability and improve comparability across studies.
Publication bias
To evaluate publication bias, a funnel plot was generated (Figure 5), displaying the relationship between study precision (standard error) and effect sizes (log odds ratio). The plot includes individual study points, an overall effect line, and 95% confidence boundaries. While the plot appears generally symmetrical, there is some dispersion among smaller studies, suggesting a possible tendency toward selective reporting. Statistical tests, including Egger's regression and Begg's rank correlation, were conducted to further assess bias. Both returned p-values greater than .05, indicating no statistically significant asymmetry. This suggests that while minor bias may be present, it is unlikely to substantially impact the overall conclusions of the meta-analysis. The assessment reinforces the robustness of the findings but underscores the importance of cautious interpretation given the potential for selective publication.
Figure 5.
Funnel plot for publication bias assessment.
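Egger's regression test used above can be sketched as an ordinary least-squares fit of the standardized effect (effect/SE) against precision (1/SE), where a non-zero intercept indicates funnel-plot asymmetry. The implementation below and the numbers in its test are illustrative assumptions, not the review's data or code:

```python
import math

def eggers_test(effects, ses):
    """Egger's regression asymmetry test (sketch).

    Regresses effect/SE on 1/SE and returns (intercept, slope, t_intercept).
    A large |t| for the intercept suggests funnel-plot asymmetry.
    """
    y = [e / s for e, s in zip(effects, ses)]  # standardized effects
    x = [1.0 / s for s in ses]                 # precisions
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # residual variance and the standard error of the intercept
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in resid) / (n - 2)
    se_int = math.sqrt(s2 * (1.0 / n + mx ** 2 / sxx))
    return intercept, slope, intercept / se_int
```

In practice the returned t statistic is compared against a t distribution with n − 2 degrees of freedom to obtain the p-value the review reports.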
Risk of bias summary
The aggregated risk of bias summary (Figure 6) highlighted that most studies adhered to rigorous methodological standards. However, four studies exhibited high risk in at least one domain, particularly in areas related to detection bias and handling incomplete outcome data. Despite these limitations, the majority of studies showed low risk in critical areas such as randomization (selection bias) and selective reporting (reporting bias), which bolsters the validity of the meta-analysis conclusions. This risk assessment ensures that the findings of the meta-analysis are grounded in high-quality evidence.
Comparative performance of AI techniques
A comparative evaluation of AI techniques (Table 5) revealed that ML models achieved the highest performance metrics, with diagnostic accuracy at 85% and therapeutic efficacy at 85%. This superior performance can be attributed to ML's strength in analyzing structured data, such as clinical records, neuroimaging, and patient demographics, which allows it to identify diagnostic patterns more effectively. ML models excel at processing large volumes of well-organized data, which enhances their ability to make precise predictions and optimize therapeutic interventions in psychiatric care.
Table 5.
Subgroup and sensitivity analyses by AI technology.
| AI technique in psychiatry | Diagnostic accuracy (%) | Therapeutic efficacy (%) |
|---|---|---|
| Machine learning | 85 | 85 |
| Hybrid models | 84 | 85 |
| Deep learning | 80 | 85 |
Hybrid models followed closely, demonstrating an 84% diagnostic accuracy and an 85% therapeutic efficacy. These models combine multiple AI approaches, such as ML with DL and natural language processing, enabling them to integrate diverse data sources, including unstructured data like clinical notes and neuroimaging. This flexibility allows hybrid models to capture a broader range of relevant information, improving their diagnostic and therapeutic effectiveness in varied clinical settings. Their performance indicates the value of combining different AI methodologies to address the complexity and diversity of psychiatric conditions.
DL models, while slightly less effective in diagnostic accuracy (80%), matched the other approaches in therapeutic efficacy (85%). DL's strength lies in its ability to handle large, complex, and unstructured datasets, such as neuroimaging, genetic data, and longitudinal health records. This capacity allows DL to deliver high-precision outcomes in therapeutic applications, particularly for symptom monitoring and relapse prediction.

These findings underscore the versatility and effectiveness of AI methodologies in psychiatric research and clinical applications. ML stands out for its superior diagnostic accuracy, reflecting its strength in structured data analysis, while DL and hybrid models perform strongly in therapeutic applications, showcasing their capacity to extract valuable insights from complex, unstructured datasets. The results suggest that while ML is most effective for diagnostic purposes, hybrid and DL models offer distinct advantages for personalized treatment and therapeutic decision-making. A nuanced understanding of each model's strengths allows clinicians to select the most appropriate AI tools based on the nature of the data and the specific needs of the patient population.
Discussion
The findings of this systematic review and meta-analysis reveal the transformative potential of AI in psychiatry, particularly in its diagnostic accuracy and therapeutic efficacy. These results align with, and extend, a growing body of literature exploring the intersection of AI technologies and mental health care. Our findings underscore AI's capacity to improve diagnostic precision, enhance therapeutic outcomes, and personalize psychiatric care. This discussion contextualizes these results within the broader landscape of existing meta-analyses, emphasizing how this study uniquely contributes to the ongoing dialogue on AI's role in psychiatry.
Diagnostic accuracy
This study found a pooled effect size of 0.85 (95% CI: 0.80–0.87) for diagnostic accuracy, which underscores the significant advancements AI has made in identifying and classifying psychiatric disorders. The strength of ML models in processing structured datasets, such as clinical records and neuroimaging data, mirrors findings from prior meta-analyses, such as those by Zhong et al. and Li.52,53 Both studies reported high diagnostic accuracy for ML models, reinforcing our result that ML remains the most effective methodology for psychiatric diagnoses. However, unlike these prior analyses, our study highlights the broader impact of hybrid models, which combine ML with other approaches like DL and natural language processing.42,48
Hybrid models, which we found to have a pooled effect size of 0.84 (95% CI: 0.80–0.88), demonstrated comparable efficacy. This result accords with previous work by Abd-Alrazaq et al. and He et al., who also identified the utility of hybrid models in psychiatric diagnoses, especially in the detection of schizophrenia and mood disorders.54,55 Our study's unique contribution lies in further emphasizing how the integration of diverse data sources—clinical, neuroimaging, and biomarker data—can enhance diagnostic accuracy, providing a more nuanced understanding of AI's diagnostic potential in psychiatry. Additionally, our analysis examined the performance of DL models, which showed a pooled effect size of 0.82 (95% CI: 0.79–0.85). While this result is consistent with findings from other studies, which found that DL models perform exceptionally well with unstructured data such as neuroimaging and genetic data, our study adds a critical layer of insight by highlighting the relative underperformance of DL models compared to ML and hybrid models in diagnostic contexts.56,57 This difference underscores the need to further optimize DL methodologies for psychiatric applications.
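Pooled estimates like those above are conventionally combined by inverse-variance weighting. As a hypothetical sketch using the point estimates and CIs reported in this section — and assuming, purely for illustration, that the subgroups are independent — a subgroup's standard error can be recovered from its 95% CI half-width and the subgroups combined:

```python
# Hypothetical sketch: recover standard errors from reported 95% CIs and
# combine subgroup estimates by inverse-variance weighting. The independence
# assumption is illustrative only; overlapping study sets would violate it.
subgroups = {
    "ML":     (0.85, 0.80, 0.87),   # (estimate, CI lower, CI upper)
    "Hybrid": (0.84, 0.80, 0.88),
    "DL":     (0.82, 0.79, 0.85),
}

Z95 = 1.96  # normal quantile for a 95% confidence interval

total_w = 0.0
weighted_sum = 0.0
for name, (est, lo, hi) in subgroups.items():
    se = (hi - lo) / (2 * Z95)       # SE recovered from the CI half-width
    w = 1 / se ** 2                  # inverse-variance weight
    total_w += w
    weighted_sum += w * est

combined = weighted_sum / total_w
combined_se = (1 / total_w) ** 0.5
ci = (combined - Z95 * combined_se, combined + Z95 * combined_se)
```

Because weights are inversely proportional to squared standard error, the narrowest-CI subgroup (ML here) dominates the combined estimate.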
Another key point highlighted by this study, and previously noted in the literature, is the variability in AI performance across different populations and settings. AI systems trained on Western populations often exhibit reduced accuracy when applied to more diverse groups, emphasizing the need for culturally sensitive and inclusive models.58 This consideration is relatively underexplored in prior meta-analyses, and our study contributes to this discussion by suggesting that the generalizability of AI models in psychiatry will require more inclusive datasets, representative of various demographic groups.
Therapeutic efficacy
AI's therapeutic applications in psychiatry showed robust performance, with a pooled effect size of 0.84 (95% CI: 0.82–0.86). This result echoes findings from studies such as Quaak et al. and Meinke, which also highlighted the efficacy of ML models in developing personalized treatment recommendations and symptom tracking systems for psychiatric conditions.59,60 Our study builds on this literature by showing that ML models excel at optimizing therapeutic outcomes by processing structured clinical data, offering actionable insights for treatment planning. In contrast to earlier reviews that focused primarily on the performance of ML alone, our study introduces a comprehensive examination of hybrid models. These models, which combine ML with DL and natural language processing (NLP), demonstrated competitive performance, achieving effect sizes of up to 0.84 (95% CI: 0.81–0.86). This finding is consistent with Wang et al., who showed that hybrid models can enhance the precision of therapeutic insights derived from patient-reported outcomes and clinical notes.61 Our contribution lies in further emphasizing how hybrid models can effectively integrate diverse data sources, providing clinicians with more personalized therapeutic strategies.62
Our analysis also explored the contributions of DL algorithms in therapeutic contexts, particularly in symptom monitoring and relapse prediction. The study by Kaur et al. found that DL algorithms could accurately predict relapse risks in patients with bipolar disorder.63 While DL was highly effective in handling complex and unstructured datasets, such as neuroimaging and longitudinal health data, its overall performance was slightly lower (pooled effect size: 0.82, 95% CI: 0.80–0.84) than that of ML and hybrid models. These findings align with results from other meta-analyses, such as Villarreal-Zegarra et al., which found that DL methods are particularly dependent on data quality and sample size,64 but our study goes further by directly comparing DL with ML and hybrid models across therapeutic applications.
One of the more notable findings of this study is the consistency in therapeutic efficacy across studies, which aligns with the results of Qiu et al., who reported pooled effect sizes for diagnostic accuracy ranging from 0.82 to 0.89.65 While this meta-analysis reinforces the reproducibility of AI-driven outcomes, our work also points out the limitations of ML in real-world applications.66 Specifically, Linardon et al. discussed how ML models face challenges in scaling to real-world settings because of the high computational resources required.67 Our study builds on this by offering a more nuanced understanding of how ML can be optimized for broader clinical adoption.
Unique contributions of this study
This study makes a unique contribution by offering a comprehensive comparison of ML, DL, and hybrid models across both diagnostic and therapeutic applications. Previous meta-analyses, such as those by He and Abd-Alrazaq et al., have primarily focused on either diagnostic or therapeutic applications in isolation.55,56 Our work advances the field by presenting a comparative analysis of AI's diagnostic and therapeutic efficacy, thereby offering a holistic view of its potential in psychiatry. Additionally, by incorporating hybrid models and highlighting their potential in both diagnostic and therapeutic contexts, this study provides a new avenue for future research, encouraging the integration of various AI methodologies to improve clinical outcomes. The focus on the generalizability and inclusivity of AI models further sets this research apart, addressing a critical gap in the literature regarding the application of AI across diverse populations.68
However, concerns around data privacy and security are significant in the implementation of AI in psychiatry. As highlighted by Linardon et al., physicians often express hesitancy in adopting AI tools due to fears of breaching patient confidentiality, especially when handling sensitive mental health data.69 These concerns are particularly pressing in psychiatry, where patient data are inherently personal and vulnerable. Ensuring robust data protection measures and transparency in AI processes is essential to alleviate these concerns and build trust among clinicians and patients alike. Moreover, the integration of AI in psychiatric care faces unique challenges due to the subjectivity of psychiatric diagnoses and the complexity of individual treatment plans. Physicians are concerned about relying on AI systems that may not fully capture the nuanced understanding of patient needs, especially when clinical judgment is central to diagnosis and treatment.70 Therefore, future research should also focus on addressing these concerns by developing AI systems that complement, rather than replace, the expertise of mental health professionals, ensuring that AI tools support personalized, patient-centered care.
Validation of AI models in psychiatric applications
While the studies included in this review demonstrate promising results in terms of diagnostic and therapeutic efficacy, it is critical to highlight the validation methods employed in assessing the performance of AI models, particularly for DL, ML, and hybrid approaches. Effective validation is essential for ensuring the reliability and generalizability of AI models in diverse clinical settings. Several studies utilized cross-validation and external validation to assess the robustness of the AI models, while others compared the performance of AI models against established clinical benchmarks.59 However, the methodological variability in validation approaches across studies limits the ability to draw definitive conclusions about the broader applicability of these models.55 Future research should prioritize standardized validation methods, including multi-center trials and external validation using diverse patient populations, to confirm the generalizability of these AI-driven techniques in psychiatry.
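The cross-validation strategy referenced above can be sketched compactly. The following minimal example uses synthetic data and a trivial mean predictor, purely for illustration; any of the classifiers or regressors used in the included studies would slot into the same train/evaluate loop:

```python
import random

# Minimal sketch of k-fold cross-validation on synthetic (feature, target)
# pairs. The "model" is a trivial mean predictor, used only to illustrate
# the splitting and evaluation loop.
random.seed(0)
data = [(random.random(), random.random()) for _ in range(100)]

def k_fold_indices(n, k):
    """Shuffle indices 0..n-1 and split them into k roughly equal folds."""
    idx = list(range(n))
    random.shuffle(idx)
    return [idx[i::k] for i in range(k)]

k = 5
scores = []
for fold in k_fold_indices(len(data), k):
    held_out = set(fold)
    test = [data[i] for i in fold]
    train = [data[i] for i in range(len(data)) if i not in held_out]
    # "Train": predict the mean target of the training split
    mean_y = sum(y for _, y in train) / len(train)
    # "Evaluate": mean squared error on the held-out fold
    mse = sum((y - mean_y) ** 2 for _, y in test) / len(test)
    scores.append(mse)

cv_score = sum(scores) / len(scores)  # averaged performance across folds
```

External validation, by contrast, would replace the held-out fold with an entirely separate dataset drawn from a different site or population, which is why it is the stronger test of generalizability.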
Practical implications
The implications of these findings are far-reaching. AI's ability to improve diagnostic accuracy has the potential to address longstanding challenges in psychiatry, such as misdiagnosis and delayed treatment initiation. By identifying patterns in complex datasets, AI systems can support clinicians in making more informed and timely decisions, ultimately improving patient outcomes.71 For instance, early detection of conditions like schizophrenia or bipolar disorder could enable earlier interventions, mitigating the progression of these disorders and reducing the associated societal and economic burdens.72 Therapeutically, AI-driven tools can complement traditional approaches by offering personalized, data-driven insights.73 For example, symptom monitoring applications can provide real-time feedback to both patients and clinicians, facilitating more adaptive and responsive care.74 Additionally, AI's capacity to predict treatment responses and relapse risks can enable a shift toward preventive psychiatry, focusing on maintaining mental health rather than merely addressing crises.75
Ethical and practical considerations
Despite its promise, the integration of AI into psychiatry raises several ethical and practical concerns. Issues such as data privacy, algorithmic bias, and the interpretability of AI models should be addressed to ensure equitable and ethical use.69,76 Research by Lee et al. highlighted how biases in training data can perpetuate health disparities, emphasizing the need for rigorous validation and transparency in AI model development.77 Moreover, the reliance on high-quality data for training AI models poses challenges in resource-limited settings, where data availability and quality may be constrained.78 Collaborative efforts among researchers, clinicians, and policymakers are crucial to ensuring that AI technologies are accessible and beneficial to diverse populations.
Future directions
Future research should focus on addressing these challenges while expanding the scope of AI applications in psychiatry. Areas such as natural language processing (NLP) and wearable technologies hold significant promise for advancing both diagnostic and therapeutic capabilities. For example, NLP can analyze patient narratives to detect subtle linguistic markers of mental health conditions, while wearable devices can provide continuous, real-time monitoring of physiological and behavioral data. Additionally, longitudinal studies are needed to evaluate the long-term impact of AI-driven interventions on mental health outcomes. While the current analysis provides strong evidence for AI's efficacy in controlled settings, real-world trials will be critical to understanding its practical utility and sustainability.
Strengths and limitations of this study
This study has several strengths, including its comprehensive scope, rigorous methodology, and focus on both diagnostic and therapeutic applications of AI in psychiatry. It offers valuable comparative insights into different AI methodologies, supported by statistically significant findings, and provides a holistic evaluation of AI's potential in mental health care. However, several limitations should be considered. First, while 14 studies met the inclusion criteria, the relatively small number of studies included in the meta-analysis may limit the statistical power and the ability to detect true heterogeneity among the studies. This could affect the generalizability of the findings, particularly when considering the potential variability in study designs, populations, and AI methodologies. The presence of moderate heterogeneity, as indicated by the I² statistic, suggests that the results may be influenced by factors such as study quality and data characteristics, which need to be addressed in future research.
A notable limitation is the potential dependence of effect size on sample size, as suggested by Figure 3. The observed relationship between sample size and effect size warrants further investigation: it is possible that the effect sizes are driven more by sample size than by the specific AI techniques employed. Future work should clarify whether larger sample sizes yield more precise or inflated effect sizes, and to what extent methodological differences between techniques influence the outcomes. Future studies should also control for sample size when assessing the efficacy of different AI methodologies to ensure a more accurate interpretation of the findings.

Additionally, many of the included studies lacked adequate blinding, as evidenced by the risk of bias assessment in Figure 2. Insufficient blinding can introduce bias, particularly in the assessment of diagnostic accuracy and therapeutic efficacy, potentially skewing results and weakening the reliability of the conclusions drawn from these studies. The absence of blinding may affect the objectivity of the reported outcomes and should be considered when interpreting the findings; future research should prioritize improved blinding procedures to minimize bias and enhance the credibility of AI evaluation studies.

Further limitations include the underrepresentation of diverse populations in the included studies, which could limit the generalizability of AI applications across demographic groups. The reliance on short-term outcomes and secondary data further restricts the applicability of the findings to real-world settings, and ethical and practical considerations, such as data privacy and algorithmic bias, are not fully addressed in this review. Finally, the underrepresentation of emerging AI techniques and the lack of longitudinal evidence highlight important areas for future research and refinement.
Conclusion
The results of this systematic review and meta-analysis highlight the significant potential of AI in psychiatry, particularly in enhancing diagnostic accuracy and therapeutic efficacy. Our findings suggest that AI technologies, especially ML models, have made substantial progress in both diagnostic and therapeutic applications. Specifically, ML models demonstrated the highest diagnostic accuracy (85%) and therapeutic efficacy (85%) among the AI methodologies reviewed. However, while these results are compelling, further statistical testing, such as pairwise comparisons, is needed to confirm the differences between AI techniques, as the current analysis did not include such tests across the model-type subgroups. In addition to these promising outcomes, the study emphasizes the need to address the ethical, practical, and technical challenges involved in integrating AI into psychiatric care, including ensuring data privacy, mitigating algorithmic biases, and refining AI models for broader and more diverse populations. The transformative potential of AI in mental health care is clear, but to realize it, further research should focus on standardizing methodologies, validating models in real-world settings, and exploring new, innovative applications. This will ultimately foster a more personalized, efficient, and equitable approach to psychiatric care.
Acknowledgments
The authors are deeply grateful to the Miyan Research Institute, International University of Business Agriculture and Technology, Dhaka, Bangladesh. Additionally, this research was supported by the Deanship of Scientific Research, King Saud University, Riyadh, Saudi Arabia.
ORCID iDs: Moustaq Karim Khan Rony https://orcid.org/0000-0002-6905-0554
Dipak Chandra Das https://orcid.org/0009-0001-0566-8822
Most. Tahmina Khatun https://orcid.org/0009-0002-1333-0570
Silvia Ferdousi https://orcid.org/0000-0001-7542-3436
Mosammat Ruma Akter https://orcid.org/0009-0005-2602-4304
Most. Hasina Begum https://orcid.org/0009-0007-1737-362X
Md Ibrahim Khalil https://orcid.org/0009-0007-7882-5538
Mst. Rina Parvin https://orcid.org/0000-0003-0111-6163
Fazila Akter https://orcid.org/0009-0009-5887-8370
Statements and declarations
Ethical considerations: Our study did not require ethics board approval because it did not involve human or animal trials.
Author contributions/CRediT: Moustaq Karim Khan Rony contributed to writing‒review and editing, writing‒original draft, visualization, validation, supervision, software, resources, project administration, methodology, investigation, formal analysis, data curation, and conceptualization. Dipak Chandra Das contributed to writing‒review and editing, writing‒original draft, software, project administration, methodology, formal analysis, data curation, and conceptualization. Most. Tahmina Khatun contributed to writing‒original draft, validation, methodology, formal analysis, data curation, and conceptualization. Silvia Ferdousi contributed to writing‒review and editing, visualization, validation, investigation, and data curation. Mosammat Ruma Akter contributed to writing‒review and editing, project administration, methodology, formal analysis, and data curation. Mst. Amena Khatun contributed to writing‒review and editing, resources, project administration, formal analysis, and conceptualization. Most. Hasina Begum contributed to writing‒review and editing, visualization, investigation, formal analysis, and data curation. Md Ibrahim Khalil contributed to writing‒review and editing, methodology, formal analysis, and data curation. Mst. Rina Parvin contributed to writing‒review and editing, writing‒original draft, supervision, investigation, formal analysis, and conceptualization. Daifallah M. Alrazeeni contributed to writing‒review and editing, visualization, validation, supervision, investigation, and conceptualization. Fazila Akter contributed to writing‒review and editing, methodology, formal analysis, data curation, supervision, and conceptualization.
Funding: The authors received no financial support for the research, authorship, and/or publication of this article.
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
References
- 1.Bendotti H, Lawler S, Chan GCK, et al. Conversational artificial intelligence interventions to support smoking cessation: a systematic review and meta-analysis. Digital Health 2023; 9: 20552076231211634. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Rony MKK, Parvin M, Ferdousi S. Advancing nursing practice with artificial intelligence: enhancing preparedness for the future. Nurs Open 2024; 11: nop2.2070. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Beg MJ, Verma M M, Vishvak Chanthar KMMet al. et al. Artificial intelligence for psychotherapy: a review of the current state and future directions. Indian J Psychol Med 2024: 02537176241260819. DOI: 10.1177/02537176241260819. [DOI] [Google Scholar]
- 4.Wang J, Ouyang H, Jiao R, et al. Machine learning methods to discriminate posttraumatic stress disorder: a protocol of systematic review and meta-analysis. Digital Health 2024; 10: 20552076241239238. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Washington P, Park N, Srivastava P, et al. Data-driven diagnostics and the potential of mobile artificial intelligence for digital therapeutic phenotyping in computational psychiatry. Biol PsychiatCognit Neurosci Neuroimag 2020; 5: 759–769. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Zhou S, Zhao J, Zhang L. Application of artificial intelligence on psychological interventions and diagnosis: an overview. Front Psychiatry 2022; 13: 811665. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Komatsu H, Watanabe E, Fukuchi M. Psychiatric neural networks and precision therapeutics by machine learning. Biomedicines 2021; 9: 03. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Menke A. Precision pharmacotherapy: psychiatry’s future direction in preventing, diagnosing, and treating mental disorders. PGPM 2018; 11: 211–222. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Singh OP. Artificial intelligence in the era of ChatGPT - opportunities and challenges in mental health care. Indian J Psychiatry 2023; 65: 297–298. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Ray A, Bhardwaj A, Malik YK, et al. Artificial intelligence and psychiatry: an overview. Asian J Psychiatr 2022; 70: 103021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Gual-Montolio P, Jaén I, Martínez-Borba V, et al. Using artificial intelligence to enhance ongoing psychological interventions for emotional problems in real- or close to real-time: a systematic review. IJERPH 2022; 19: 7737. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Das KP, Gavade P. A review on the efficacy of artificial intelligence for managing anxiety disorders. Front Artif Intell 2024; 7: 1435895. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Babu A, Joseph AP. Artificial intelligence in mental healthcare: transformative potential vs. The necessity of human interaction. Front Psychol 2024; 15: 1378904. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Luxton DD. An Introduction to artificial intelligence in behavioral and mental health care. In: Luxton DD (ed) Artificial intelligence in behavioral and mental health care. Seattle, Washington, USA: Elsevier, 2016, pp.1–26. DOI: 10.1016/B978-0-12-420248-1.00001-5. [Google Scholar]
- 15.Graham S, Depp C, Lee EE, et al. Artificial intelligence for mental health and mental illnesses: an overview. Curr Psychiatry Rep 2019; 21: 16. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Rony MKK, Kayesh I, Bala SD, et al. Artificial intelligence in future nursing care: exploring perspectives of nursing professionals - A descriptive qualitative study. Heliyon 2024; 10: e25718. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Zhang Z, Wang J. Can AI replace psychotherapists? Exploring the future of mental health care. Front Psychiatry 2024; 15: 1444382. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Brown C, Story GW, Mourão-Miranda Jet al. et al. Will artificial intelligence eventually replace psychiatrists? Br J Psychiatry 2021; 218: 131–134. [DOI] [PubMed] [Google Scholar]
- 19.Monaco F, Vignapiano A, Piacente M, et al. An advanced artificial intelligence platform for a personalised treatment of eating disorders. Front Psychiatry 2024; 15: 1414439. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Bzdok D, Meyer-Lindenberg A. Machine learning for precision psychiatry: opportunities and challenges. Biol Psychiatry: Cognit Neurosci Neuroimag 2018; 3: 223–230. [DOI] [PubMed] [Google Scholar]
- 21.Khare M, Acharya S, Shukla SHet al. et al. Utilising artificial intelligence (AI) in the diagnosis of psychiatric disorders: a narrative review. JCDR 2024: 1–5. DOI: 10.7860/JCDR/2023/61698.19249. [DOI] [Google Scholar]
- 22.Silverman BG, Hanrahan N, Huang L, et al. Artificial intelligence and human behavior modeling and simulation for mental health conditions. In: Artificial intelligence in behavioral and mental health care. Seattle, Washington, USA: Elsevier, 2016, pp.163–183. DOI: 10.1016/B978-0-12-420248-1.00007-6. [Google Scholar]
- 23.Milne-Ives M, Selby E, Inkster B, et al. Artificial intelligence and machine learning in mobile apps for mental health: a scoping review. PLOS Digit Health 2022; 1: e0000079. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.King DR, Nanda G, Stoddard J, et al. An Introduction to generative artificial intelligence in mental health care: considerations and guidance. Curr Psychiatry Rep 2023; 25: 839–846. [DOI] [PubMed] [Google Scholar]
- 25.Jin KW, Li Q, Xie Yet al. et al. Artificial intelligence in mental healthcare: an overview and future perspectives. Br J Radiol 2023; 96: 20230213. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.D’Alfonso S, Santesteban-Echarri O, Rice S, et al. Artificial intelligence-assisted online social therapy for youth mental health. Front Psychol 2017; 8: 96. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Tornero-Costa R, Martinez-Millana A, Azzopardi-Muscat N, et al. Methodological and quality flaws in the use of artificial intelligence in mental health research: systematic review. JMIR Ment Health 2023; 10: e42045. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Denecke K, Abd-Alrazaq A, Househ M. Artificial intelligence for chatbots in mental health: opportunities and challenges. In: Househ M, Borycki E, Kushniruk A. (eds) Multiple perspectives on artificial intelligence in healthcare: lecture notes in bioengineering. Cham, Switzerland: Springer International Publishing, 2021, pp.115–128. DOI: 10.1007/978-3-030-67303-1_10. [Google Scholar]
- 29.Rony MKK, Numan S, Akter K, et al. Nurses’ perspectives on privacy and ethical concerns regarding artificial intelligence adoption in healthcare. Heliyon 2024; 10: e36702. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Adebayo O, Bhuiyan ZA, Ahmed Z. Exploring the effectiveness of artificial intelligence, machine learning and deep learning in trauma triage: a systematic review and meta-analysis. Digital Health 2023; 9: 20552076231205736. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Boucher EM, Harake NR, Ward HE, et al. Artificially intelligent chatbots in digital mental health interventions: a review. Expert Rev Med Devices 2021; 18: 37–49. [DOI] [PubMed] [Google Scholar]
- 32.Rogan J, Bucci S, Firth J. Health care professionals’ views on the use of passive sensing, AI, and machine learning in mental health care: systematic review with meta-synthesis. JMIR Ment Health 2024; 11: e49577. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Joyce DW, Kormilitzin A, Smith KAet al. et al. Explainable artificial intelligence for mental health through transparency and interpretability for understandability. npj Digit Med 2023; 6: 6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34. Warrier U, Warrier A, Khandelwal K. Ethical considerations in the use of artificial intelligence in mental health. Egypt J Neurol Psychiatry Neurosurg 2023; 59: 39.
- 35. Page MJ, McKenzie JE, Bossuyt PM, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 2021; 372: n71. DOI: 10.1136/bmj.n71.
- 36. Danieli M, Ciulli T, Mousavi SM, et al. A conversational artificial intelligence agent for a mental health care app: evaluation study of its participatory design. JMIR Form Res 2021; 5: e30053.
- 37. Morales S, Barros J, Echávarri O, et al. Acute mental discomfort associated with suicide behavior in a clinical sample of patients with affective disorders: ascertaining critical variables using artificial intelligence tools. Front Psychiatry 2017; 8. DOI: 10.3389/fpsyt.2017.00007.
- 38. Fulmer R, Joerin A, Gentile B, et al. Using psychological artificial intelligence (Tess) to relieve symptoms of depression and anxiety: randomized controlled trial. JMIR Ment Health 2018; 5: e64.
- 39. Wei W, Li L. The impact of artificial intelligence on the mental health of manufacturing workers: the mediating role of overtime work and the work environment. Front Public Health 2022; 10: 862407.
- 40. Yu J, Chiu C, Wang Y, et al. A machine learning approach to passively informed prediction of mental health risk in people with diabetes: retrospective case-control analysis. J Med Internet Res 2021; 23: e27709.
- 41. Danieli M, Ciulli T, Mousavi SM, et al. Assessing the impact of conversational artificial intelligence in the treatment of stress and anxiety in aging adults: randomized controlled trial. JMIR Ment Health 2022; 9: e38067.
- 42. Li X. Evaluation and analysis of elderly mental health based on artificial intelligence. Occup Ther Int 2023; 2023: 1–11.
- 43. Zhang W, Yang C, Cao Z, et al. Detecting individuals with severe mental illness using artificial intelligence applied to magnetic resonance imaging. eBioMedicine 2023; 90: 104541.
- 44. Andrikopoulos D, Vassiliou G, Fatouros P, et al. Machine learning-enabled detection of attention-deficit/hyperactivity disorder with multimodal physiological data: a case-control study. BMC Psychiatry 2024; 24: 47.
- 45. Zhang L, Zhao S, Yang Z, et al. An artificial intelligence tool to assess the risk of severe mental distress among college students in terms of demographics, eating habits, lifestyles, and sport habits: an externally validated study using machine learning. BMC Psychiatry 2024; 24: 81.
- 46. Gomeni R, Bressolle-Gomeni F, Fava M. Artificial intelligence approach for the analysis of placebo-controlled clinical trials in major depressive disorders accounting for individual propensity to respond to placebo. Transl Psychiatry 2023; 13: 41.
- 47. de Lacy N, Ramshaw MJ, McCauley E, et al. Predicting individual cases of major adolescent psychiatric conditions with artificial intelligence. Transl Psychiatry 2023; 13: 14.
- 48. Kalmady SV, Greiner R, Agrawal R, et al. Towards artificial intelligence in mental health by improving schizophrenia prediction with multiple brain parcellation ensemble-learning. npj Schizophr 2019; 5: 2.
- 49. Nemesure MD, Heinz MV, Huang R, et al. Predictive modeling of depression and anxiety using electronic health records and a novel machine learning approach with artificial intelligence. Sci Rep 2021; 11: 1980.
- 50. Higgins JPT, Altman DG, Gøtzsche PC, et al. The Cochrane Collaboration's tool for assessing risk of bias in randomised trials. BMJ 2011; 343: d5928.
- 51. Vandenbroucke JP, von Elm E, Altman DG, et al. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE): explanation and elaboration. PLoS Med 2007; 4: e297.
- 52. Zhong W, Luo J, Zhang H. The therapeutic effectiveness of artificial intelligence-based chatbots in alleviation of depressive and anxiety symptoms in short-course treatments: a systematic review and meta-analysis. J Affect Disord 2024; 356: 459–469.
- 53. Li H, Zhang R, Lee YC, et al. Systematic review and meta-analysis of AI-based conversational agents for promoting mental health and well-being. npj Digit Med 2023; 6: 36.
- 54. Abd-Alrazaq AA, Rababeh A, Alajlani M, et al. Effectiveness and safety of using chatbots to improve mental health: systematic review and meta-analysis. J Med Internet Res 2020; 22: e16021.
- 55. He Y, Yang L, Qian C, et al. Conversational agent interventions for mental health problems: systematic review and meta-analysis of randomized controlled trials. J Med Internet Res 2023; 25: e43862.
- 56. Abd-Alrazaq A, AlSaad R, Shuweihdi F, et al. Systematic review and meta-analysis of performance of wearable artificial intelligence in detecting and predicting depression. npj Digit Med 2023; 6: 84.
- 57. Lim SM, Shiau CWC, Cheng LJ, et al. Chatbot-delivered psychotherapy for adults with depressive and anxiety symptoms: a systematic review and meta-regression. Behav Ther 2022; 53: 334–347.
- 58. Saboori Amleshi R, Ilaghi M, Rezaei M, et al. Predictive utility of artificial intelligence on schizophrenia treatment outcomes: a systematic review and meta-analysis. Neurosci Biobehav Rev 2025; 169: 105968.
- 59. Quaak M, Van De Mortel L, Thomas RM, et al. Deep learning applications for the classification of psychiatric disorders using neuroimaging data: systematic review and meta-analysis. NeuroImage Clin 2021; 30: 102584.
- 60. Meinke C, Lueken U, Walter H, et al. Predicting treatment outcome based on resting-state functional connectivity in internalizing mental disorders: a systematic review and meta-analysis. Neurosci Biobehav Rev 2024; 160: 105640.
- 61. Wang J, Ouyang H, Jiao R, et al. The application of machine learning techniques in posttraumatic stress disorder: a systematic review and meta-analysis. npj Digit Med 2024; 7: 21.
- 62. Su S, Wang Y, Jiang W, et al. Efficacy of artificial intelligence-assisted psychotherapy in patients with anxiety disorders: a prospective, national multicenter randomized controlled trial protocol. Front Psychiatry 2022; 12: 799917.
- 63. Kaur P, Sharma M. Diagnosis of human psychological disorders using supervised learning and nature-inspired computing techniques: a meta-analysis. J Med Syst 2019; 43: 04.
- 64. Villarreal-Zegarra D, Reategui-Rivera CM, García-Serna J, et al. Self-administered interventions based on natural language processing models for reducing depressive and anxiety symptoms: systematic review and meta-analysis. JMIR Ment Health 2024; 11: e59560.
- 65. Qiu Y, Wu M, Liu J, et al. Effectiveness of digital intelligence interventions on depression and anxiety in older adults: a systematic review and meta-analysis. Psychiatry Res 2024; 342: 116166.
- 66. Jacobson NC, Nemesure MD. Using artificial intelligence to predict change in depression and anxiety symptoms in a digital intervention: evidence from a transdiagnostic randomized controlled trial. Psychiatry Res 2021; 295: 113618.
- 67. Linardon J, Torous J, Firth J, et al. Current evidence on the efficacy of mental health smartphone apps for symptoms of depression and anxiety: a meta-analysis of 176 randomized controlled trials. World Psychiatry 2024; 23: 139–149.
- 68. Cruz-Gonzalez P, He AWJ, Lam EP, et al. Artificial intelligence in mental health care: a systematic review of diagnosis, monitoring, and intervention applications. Psychol Med 2025; 55: e18.
- 69. Linardon J, Liu C, Messer M, et al. Current practices and perspectives of artificial intelligence in the clinical management of eating disorders: insights from clinicians and community participants. Int J Eat Disord 2025: eat.24385. DOI: 10.1002/eat.24385.
- 70. Pak TK, Hernandez M, Do CE, et al. Artificial intelligence in psychiatry: threat or blessing? Acad Psychiatry 2023; 47: 587–588.
- 71. Rahim A, Khatoon R, Khan TA, et al. Artificial intelligence-powered dentistry: probing the potential, challenges, and ethicality of artificial intelligence in dentistry. Digital Health 2024; 10: 20552076241291345.
- 72. Doraiswamy PM, Blease C, Bodner K. Artificial intelligence and the future of psychiatry: insights from a global physician survey. Artif Intell Med 2020; 102: 101753.
- 73. Shinners L, Grace S, Smith S, et al. Exploring healthcare professionals' perceptions of artificial intelligence: piloting the Shinners artificial intelligence perception tool. Digital Health 2022; 8: 205520762210781.
- 74. Rony MKK, Numan S, Johra FT, et al. Perceptions and attitudes of nurse practitioners toward artificial intelligence adoption in health care. Health Sci Rep 2024; 7: e70006.
- 75. Walsh CG, Chaudhry B, Dua P, et al. Stigma, biomarkers, and algorithmic bias: recommendations for precision behavioral health with artificial intelligence. JAMIA Open 2020; 3: 9–15.
- 76. Smrke U, Mlakar I, Lin S, et al. Language, speech, and facial expression features for artificial intelligence-based detection of cancer survivors' depression: scoping meta-review. JMIR Ment Health 2021; 8: e30439.
- 77. Lee Y, Ragguett RM, Mansur RB, et al. Applications of machine learning algorithms to predict therapeutic outcomes in depression: a meta-analysis and systematic review. J Affect Disord 2018; 241: 519–532.
- 78. Migdadi MK, Oweidat IA, Alosta MR, et al. The association of artificial intelligence ethical awareness, attitudes, anxiety, and intention-to-use artificial intelligence technology among nursing students. Digital Health 2024; 10: 20552076241301958.