Abstract
Artificial intelligence (AI) is increasingly used in mental health, yet its rehabilitation-oriented applications in schizophrenia have not been systematically mapped. We conducted a systematic scoping review of PubMed, Web of Science, IEEE Xplore and the ACM Digital Library (January 1, 2012–October 31, 2025; two search rounds), applying operationalized rehabilitation boundaries and excluding diagnostics-only case–control studies. We extracted data on data sources, feature engineering, model families, validation, calibration, interpretability, application domains, outcomes and implementation readiness. Eighty-three studies met inclusion criteria (median sample size 160; 55% longitudinal). Applications focused on symptom monitoring (48/83), medication management (19/83) and risk management (16/83), whereas functional training (1/83) and psychosocial support (3/83) were rarely targeted. Supervised learning predominated (53/83, 63.9%) over representation learning (20/83, 24%), most commonly using speech/text, electronic health records and smartphone sensing. Across classification tasks, the median AUC was 0.79 (IQR 0.71–0.86); relapse early-warning models showed a median sensitivity of 31.5% at 88.0% specificity. Only four studies reported external validation and three described closed-loop deployment, including one randomized trial that improved adherence. Proxy endpoints were more common than clinical endpoints, and reporting of calibration/uncertainty and fairness auditing was sparse. Overall, AI shows promise for monitoring, adherence support and relapse risk stratification, but routine-care deployment will require externally validated and calibrated human-in-the-loop decision support, privacy-preserving multimodal pipelines and pragmatic trials targeting functional outcomes and participation.
Subject terms: Scientific community, Schizophrenia
Introduction
Schizophrenia is a severe mental disorder characterized by disturbances across multiple domains, such as thinking, perception, self-experience, cognition, volition, affect, and behavior, and is frequently associated with significant social and occupational impairments [1, 2]. Globally, schizophrenia affects approximately 23 million people (around 1 in 345), with higher age-specific prevalence among adults (around 1 in 233) [3]. The illness follows a chronic, relapsing course with substantial functional impairment and premature mortality. Meta-analytic estimates indicate 13–15 years of potential life lost, with the pooled expected age at death being approximately 60 years in men and 68 years in women [4]. Relapse remains common even with treatment; in prospective first-episode cohorts, the five-year cumulative relapse rate reaches approximately 82% [5]. The early phases of the illness also carry elevated risks of self-harm and suicide: lifetime suicide mortality is approximately 5% [6, 7], and approximately 35% of patients report suicidal ideation [8]. These patterns require care that extends beyond acute symptom control.
Psychiatric (psychosocial) rehabilitation, defined by the World Health Organization as a process facilitating opportunities for individuals with mental disorders to achieve optimal independent functioning by strengthening personal competencies and addressing environmental barriers, has emerged as essential for improving functioning and quality of life [9, 10]. Reflecting the needs of patients with mental health disorders, international frameworks emphasize comprehensive, integrated, community-based mental health and social care with longitudinal, measurement-based assessment and adaptation [11–14]. Best-practice guidelines commonly organize these principles into three core components: (i) comprehensive assessment and individualized care planning, which should be delivered in recovery-oriented, community-based services to support autonomy and participation [12, 13], with ongoing measurement-based monitoring to inform treatment adjustments [14, 15]; (ii) medication management and adherence support, including consideration of long-acting injectable antipsychotics when appropriate [15]; and (iii) evidence-based psychosocial interventions, such as cognitive remediation [16], psychoeducation (e.g., family-based models) [17], and social skills training [18]. In routine clinical workflows, these principles are operationalized through regular follow-up visits [13], family psychoeducation [17], social skills training [18], medication management and adherence support, and measurement-based relapse-prevention planning [15]. Randomized evidence from low-resource settings shows that community-based rehabilitation improves schizophrenia outcomes [19].
Despite the standardization of psychiatric rehabilitation, its implementation remains uneven worldwide [20]. Substantial challenges include resource constraints such as low mental health budgets, workforce shortages, hospital-centric spending [21], and medication-related adverse effects that undermine treatment adherence and tolerability [22, 23]. Moreover, deterioration detection and timely interventions are hampered by the absence of routine, measurement-based assessments [15] and relapse-prevention planning (e.g., early warning sign monitoring) [24]. Coverage data clearly show such implementation gaps: only approximately 29% of people with psychosis receive specialist mental health care globally [3], and approximately one-third of adults with serious mental illnesses in the United States of America have received no mental health treatment in the past year [25]. Such deficiencies in accessibility, quality, and continuity of care, alongside the aforementioned high relapse burden, motivate the search for scalable, remotely deliverable complements to routine rehabilitation [20].
The rapid maturation of digital mental health technologies has opened new pathways for addressing these gaps [26]. Mobile health applications [27], telepsychiatry [28], therapist-guided Internet-delivered cognitive behavioral therapy [29], videoconference-delivered cognitive behavioral therapy [30], automated virtual reality-delivered psychological therapy (some marketed as digital therapeutics, DTx) [31], and wearable-enabled monitoring [32] now offer practical complements to community care. These technological applications support real-time self-monitoring [27], scalable skills-based interventions [29, 30], remote access [28], and continuous physiological/behavioral monitoring [32]. They capture data through active patient inputs (e.g., ePRO/EMA on smartphones) [33] and passive sensing within a digital phenotyping framework (device logs and onboard sensors) [34].
Artificial intelligence (AI), referring to probabilistic computational methods that learn from data to support prediction/decision-making under uncertainty, has been increasingly applied to analyze these datasets [35]. Related AI approaches encompass several major paradigms, such as supervised learning for outcome prediction from labeled data, unsupervised learning for structure discovery in data without labels, reinforcement learning for sequential decision-making from interactions, and self-supervised learning that derives supervisory signals directly from raw data [35, 36]. Model families range from interpretable approaches (e.g., logistic regression and decision trees) to deep neural networks, with the latter encompassing convolutional architectures for images, recurrent and transformer architectures for sequential data, and graph neural networks for relational structures [35, 37]. Large language models (LLMs; i.e., transformer-based foundation models pretrained on massive text corpora) exhibit strong capabilities in language understanding, generation, and emerging reasoning, enabling applications to process clinical narratives and patient–provider communication [38]. Advanced training strategies for LLMs include multimodal learning to integrate heterogeneous sources, transfer learning to adapt models across domains, and federated learning to enable collaborative training while preserving data locality and privacy [39, 40]. Rigorous LLM deployment requires attention to predictive performance, robustness under distribution shift, principled uncertainty quantification, and governance that advances transparency, fairness, privacy, and security [41, 42].
Notwithstanding the rehabilitation-oriented capabilities of AI and digital therapeutics and the increasing related research, the literature on AI in schizophrenia remains preponderantly concentrated on pathophysiology and diagnosis. For example, supervised models trained on routine electronic health records (EHRs) forecast diagnostic progression for schizophrenia or bipolar disorder [43]. A recurrent neural-network model trained on multi-system EHR data identified individuals at risk of first-episode psychosis up to 12 months before the index event [44]. In neuroimaging, large multisite analyses show that machine learning pipelines can extract reproducible image-derived markers [45]; deep learning graph-neural networks that fuse structural and functional MRI (fMRI) further automate feature discovery, achieving 83% cross-validated accuracy while highlighting circuit-level biomarkers [46]; hypothesis-driven fMRI biomarkers also quantify disease-relevant physiology, such as a cross-validated striatal-dysfunction index that could discriminate schizophrenia from controls and show its relation to antipsychotic response [47]. Multimodal fusion with genomic/transcriptomic data both improves discrimination and helps localize disease-relevant circuits [48], while imaging–transcriptomic maps link MRI phenotypes and fMRI signal amplitude to the cortical expression of interneuron markers [49] and to spatial patterns of schizophrenia risk-gene expression [50].
Meanwhile, rehabilitation-targeted AI applications (e.g., focusing on functional assessment, longitudinal symptom/risk monitoring, medication management, psychosocial skills training, and community reintegration) have received comparatively less attention than diagnostic/prognostic AI applications and remain under-synthesized [51]. A comprehensive synthesis of the AI-based rehabilitation field is particularly critical because psychiatric rehabilitation poses challenges beyond algorithmic performance, requiring context-aware deployment, integration with care pathways, and attention to implementation barriers [52]. Patients commonly raise concerns about data privacy [53] and the possibility that intensive passive monitoring could exacerbate anxiety or paranoia [54]. Clinicians likewise warn that overly intrusive sensing can strain the therapeutic alliance and that recommendations must be sensitive to the clinical context to be actionable [55]. These concerns intersect with technical demands for explainable systems [56] and for high-quality, reliable data, especially in consideration of issues such as label scarcity in psychiatry [57], the limited ecological validity of many functional outcomes [58], device and platform heterogeneity in smartphone/wearable data collection [59], and performance degradation from distribution shifts [35].
To be clear, some prior reviews have synthesized AI applications in this area, including schizophrenia-focused scoping reviews. The research gap we highlight is that their concentration on diagnosis and acute-phase management [60–62] has left rehabilitation processes underexplored. This gap is compounded by broader, cross-diagnostic overviews of digital/AI approaches that rarely provide analyses aligned with schizophrenia rehabilitation targets [63, 64], particularly negative symptoms [65] and community/social participation [66], which require tailored intervention strategies. This systematic review aimed to address these research gaps by examining AI applications in schizophrenia rehabilitation management. We analyzed the technical and practical applications of AI models across core rehabilitation domains, including symptom monitoring, medication management, risk management, functional training, and psychosocial support.
Methods
This was a systematic scoping review. We chose this design owing to the significant heterogeneity in objectives, technologies, and evaluation metrics across the included studies: a scoping approach enables synthesis of research results, evaluation of AI implementation in schizophrenia rehabilitation, and identification of key values and challenges. This study was reported following the PRISMA-ScR guidelines [67].
Search strategy
Search sources
We conducted two database searches (Round 1, January 15–31, 2025; Round 2, October 1–15, 2025, following reviewer feedback) across four databases: PubMed (clinical and rehabilitation literature), Web of Science, IEEE Xplore, and the ACM Digital Library (AI-focused computing and engineering venues).
Eligible records spanned January 1, 2012, through October 31, 2025. The 2012 start date reflects the emergence of modern deep learning (e.g., AlexNet) [68], the subsequent acceleration of AI’s development toward natural language processing and computer vision [69], and the sparsity of AI-related mental health literature before this period [64, 70]. We conducted backward and forward citation chasing to improve completeness.
Search terms
We developed search terms under the guidance of two mental health rehabilitation experts, covering target population, AI technologies, and rehabilitation contexts: (“artificial intelligence” OR “AI” OR “machine learning” OR “deep learning” OR “neural networks” OR “natural language processing” OR “computer vision” OR “computational intelligence” OR “data mining” OR “predictive modeling” OR “reinforcement learning”) AND (“schizophrenia” OR “schizophrenic” OR “schizoaffective disorder” OR “psychosis” OR “psychotic disorder” OR “severe mental illness”) AND (“rehabilitation” OR “recovery” OR “management” OR “care” OR “medication adherence” OR “drug compliance” OR “pharmacological management” OR “medication tracking” OR “medication optimization” OR “relapse prevention” OR “risk assessment” OR “risk prediction” OR “violence prediction” OR “crisis management” OR “cognitive training” OR “social skills training” OR “life skills development” OR “functional recovery” OR “skill-building interventions” OR “symptom tracking” OR “symptom monitoring” OR “behavioral monitoring” OR “therapeutic intervention” OR “emotional support” OR “therapy engagement” OR “psychological well-being”).
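The query has a three-block structure (AI terms AND population terms AND rehabilitation terms), which can be assembled programmatically so that identical term sets are applied across databases that accept Boolean syntax. The sketch below is illustrative only; the helper names and the abbreviated term lists are ours, not part of the review protocol:

```python
# Illustrative sketch: assemble the three Boolean blocks of the search
# strategy so the same term sets can be reused verbatim across PubMed,
# Web of Science, IEEE Xplore, and the ACM Digital Library.
# Term lists are abbreviated here; the full lists appear in the text above.
AI_TERMS = ["artificial intelligence", "machine learning", "deep learning"]
POPULATION_TERMS = ["schizophrenia", "psychosis", "severe mental illness"]
REHAB_TERMS = ["rehabilitation", "medication adherence", "relapse prevention"]

def or_block(terms):
    """Quote each phrase and join with OR inside parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

def build_query(*blocks):
    """Join the OR-blocks with AND, mirroring the review's search string."""
    return " AND ".join(or_block(block) for block in blocks)

query = build_query(AI_TERMS, POPULATION_TERMS, REHAB_TERMS)
print(query)
```

In practice, platform-specific field tags (e.g., title/abstract restrictions) would still need to be appended per database.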
Study eligibility criteria
Operational boundary of “Rehabilitation” for schizophrenia
In schizophrenia, psychiatric rehabilitation denotes a recovery-oriented, person-centered, longitudinal framework that enables the development of the skills and securing of the environmental supports required to live, learn, and work in the community with the least professional assistance [71]. Currently, this framework integrates evidence-based pharmacological and psychosocial care (e.g., structured symptom monitoring, medication management, proactive risk management, skills-based functional training, and psychosocial support) to maintain stability, prevent relapse, and promote community participation [14]. By contrast, diagnosis is a categorical, operational process that establishes case identification using syndromic criteria and duration thresholds, designed primarily for reliability and clinical utility rather than prescribing specific treatment pathways [72]. Field studies of the ICD-11 diagnostic guidelines similarly emphasize the clinical utility of diagnosis for communication and decision-making rather than uniform intervention protocols [73]. Accordingly, this review focuses on rehabilitation and management approaches that prioritize community functioning and quality-of-life outcomes and go beyond symptom remission.
Rehabilitation application domains
To systematically categorize AI applications within this schizophrenia rehabilitation framework, we operationalized seven core domains reflecting contemporary rehabilitation clinical practice [14]. Each included study was mapped to one or more domains shown in the following list.
Symptom monitoring: continuous, structured assessment of positive, negative, affective, and related functional symptoms in real-world settings via clinician ratings, patient-reported outcomes, ecological momentary assessment [74], and passive sensing [75]. This structured assessment aims to detect fluctuations and early warning signs of relapse [76] and guide timely interventions.
Medication management: a systematic, long-term process aimed at optimizing antipsychotic therapy [14], preventing relapse, and minimizing harm, including drug selection/titration, adherence assessment and support, adverse effect monitoring/management [77], long-acting injectable scheduling [78], and shared decision-making.
Risk management: ongoing assessment, treatment management formulation, and collaborative management targeting high-impact adverse outcomes (suicide/self-harm [79], violence/victimization [80], and relapse/crisis/hospitalization), integrating early warning monitoring [81], safety planning, and stepped, cross-setting responses.
Functional training: training-based, skills-focused interventions that build the enduring capacities needed for community functioning (e.g., neurocognition, social cognition, activities of daily living, instrumental activities of daily living, and vocational skills) through repeated practice and coached learning (e.g., cognitive remediation [82], social-cognition training [83], and individual placement and supported employment [84]).
Psychosocial support: structured educational, therapeutic, and social network interventions that enhance coping, family and peer involvement, service engagement, and community integration (e.g., family psychoeducation [85], cognitive behavioral therapy for psychosis [86], and peer support [87]).
Physical health/lifestyle management (pre-specified, zero-hit in this review): structured, multicomponent interventions addressing cardiometabolic risks to help close the mortality gap [88] and improve functioning and quality of life, including those combining physical activity and diet/weight management [89], smoking cessation [90], and routine metabolic screening [88].
Service organization/care coordination (pre-specified, zero-hit in this review): team- and pathway-level models that orchestrate medication, psychosocial intervention, and vocational/educational support to deliver integrated, continuous rehabilitation in routine services. Examples include coordinated specialty care for first-episode psychosis [91], assertive community treatment [92], and intensive/structured case management [93].
None of the included studies mapped to the last two domains (physical health/lifestyle management and service organization/care coordination). Therefore, although these domains were retained for completeness, they were omitted from our domain-level quantitative synthesis.
Inclusion criteria
To ensure relevance to rehabilitation and methodological rigor, studies were included if they met all the criteria below.
Population: adults or adolescents with clinician-confirmed schizophrenia-spectrum disorders (DSM-5/DSM-5-TR or ICD-10/ICD-11). Studies could include broader serious mental illness diagnoses (e.g., schizoaffective disorder or bipolar disorder with psychotic features), provided that schizophrenia-spectrum disorders constituted a primary analytic group or clearly defined subgroup. Studies conducted in hospitals were eligible only if the AI function targeted post-discharge management or community reintegration outcomes.
Intervention/AI function: an AI system (as per Organisation for Economic Co-operation and Development/International Organization for Standardization definitions) that infers from inputs to produce predictions/recommendations/decisions/content in service of a rehabilitation task in any of the seven core domains (see Section 2.2.2); eligible paradigms included supervised/unsupervised/semi-supervised learning, deep learning/foundation models/LLMs, reinforcement learning, probabilistic models, and knowledge-based/expert systems [6–8, 59].
Outcomes: rehabilitation-relevant endpoints (e.g., relapse/hospitalization, treatment adherence, functioning/participation, and social/role outcomes) or model performance explicitly tied to a rehabilitation management task (e.g., treatment adherence prediction that triggers case management).
Designs: randomized controlled trials/quasi-experimental, prospective/retrospective observational, and model development/validation studies. Qualitative or mixed-methods implementation studies were eligible when AI functionality operated within a rehabilitation workflow; diagnostics-only designs were not eligible.
Setting: community, home-based, supported accommodation, inpatient-to-community transition, or inpatient and digital health settings aligned with sustained rehabilitation care (e.g., inpatient data used to support post-discharge management or longitudinal relapse prevention).
Exclusion criteria
To focus specifically on rehabilitation, we excluded studies that met any of the following criteria:
focused on diagnostics (e.g., screening, case finding, and differential diagnosis) or cross-sectional case–control classifiers (e.g., schizophrenia vs. healthy controls) without linkage to rehabilitation;
addressed pathophysiology/biomarkers (e.g., discovery neuroimaging) or theoretical simulations without rehabilitative implications;
evaluated acute-phase treatment only (e.g., pharmacologic or symptom-focused psychotherapy) without functional/community outcomes or explicit rehabilitation goals;
were limited to custodial/forensic settings with no stated pathway to community living;
relied exclusively on modalities infeasible for continuous community monitoring or at-home/routine deployment (e.g., fMRI-only protocols and lab-grade electroencephalogram-only); and
were editorials, reviews, proposals, posters, conference abstracts, non-original research, or non-English.
Operationalization for cross-sectional and classification studies
Given the prevalence of cross-sectional case–control designs (e.g., schizophrenia vs. healthy controls) in the AI literature, we established explicit operationalization criteria to assess whether such studies qualified as rehabilitation-oriented. These criteria helped us distinguish diagnostic research from rehabilitation-applicable studies by addressing the inherent ambiguity of binary classification paradigms. All of the baseline eligibility requirements below had to be met.
Confirmed diagnosis: used real-world data from individuals with clinician-confirmed schizophrenia spectrum disorders (per ICD/DSM or equivalent diagnostic criteria), excluding samples based solely on self-reported diagnoses or clinical high-risk populations.
Community applicability: data collection methods were feasible for sustained use in community, home, or outpatient settings (e.g., smartphone sensors, wearables, speech/text, and EHR data), and thus did not rely exclusively on research-grade neuroimaging (e.g., fMRI) or laboratory-only modalities (e.g., research-grade electroencephalogram) without a plausible pathway to routine deployment.
Beyond pure diagnostics: explicitly discussed or proposed rehabilitation management applications beyond solely reporting classification accuracy for “schizophrenia vs. healthy controls” discrimination.
At least one of the following rehabilitation-orientation signals needed to be present:
Rehabilitation-anchored constructs: the model or features were explicitly linked to rehabilitation-relevant dimensions, enabling translation to management priorities. Examples include symptom scales (Brief Negative Symptom Scale/Clinical Assessment Interview for Negative Symptoms), social cognition measures, sleep/circadian patterns, functional/participation assessments (Personal and Social Performance/UCSD Performance-based Skills Assessment/Quality of Life Scale/WHO Disability Assessment Schedule), medication adherence or side effects, and/or safety/risk indicators.
Change sensitivity or re-test evidence: presented evidence (even if preliminary) of response to intervention, pharmacological challenges, or repeated measurement, indicating potential utility for longitudinal monitoring or treatment-response tracking.
Actionability and interpretability: the features or outputs had interpretable clinical meaning and could plausibly inform rehabilitation care actions (e.g., “elevated negative symptom indices → prompt follow-up, social work engagement, or behavioral activation”), even if decision thresholds were not yet quantified.
If at least one of the following was found in the re-review, the study was excluded:
Diagnostics-only orientation: focused exclusively on diagnostic discrimination without establishing any rehabilitation-related linkage or management application.
Insufficient real-world utility: external validity or applicability was prohibitively low (e.g., excessive false-positive rates and clearly non-deployable workflows), precluding feasible use in rehabilitation management.
Non-compliant population or modality: primarily enrolled unconfirmed/self-disclosed cases or clinical high-risk-only samples or relied on data collection methods lacking community-setting feasibility.
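The two-stage rule above (all baseline requirements, at least one orientation signal, no exclusion trigger) can be sketched as a simple predicate. This is an illustration only: the field names are invented for this sketch, and in the review these judgments were made by human reviewers, not code.

```python
# Sketch of the operationalization rule for cross-sectional case-control
# studies: all baseline requirements must hold, at least one
# rehabilitation-orientation signal must be present, and no exclusion
# trigger may apply. Field names are illustrative, not from the protocol.
def rehab_eligible(study: dict) -> bool:
    baseline = (
        study["confirmed_dx"]            # clinician-confirmed ICD/DSM diagnosis
        and study["community_feasible"]  # modality deployable outside the lab
        and study["beyond_diagnostics"]  # discusses rehabilitation applications
    )
    signals = any([
        study["rehab_constructs"],       # rehabilitation-anchored constructs
        study["change_sensitivity"],     # response to intervention / re-test
        study["actionable_outputs"],     # interpretable, care-relevant outputs
    ])
    exclusions = any([
        study["diagnostics_only"],
        study["low_real_world_utility"],
        study["noncompliant_sample"],
    ])
    return baseline and signals and not exclusions

# Hypothetical example: a smartphone-sensing study explicitly linked to
# negative-symptom monitoring in the community.
example = {
    "confirmed_dx": True, "community_feasible": True, "beyond_diagnostics": True,
    "rehab_constructs": True, "change_sensitivity": False, "actionable_outputs": True,
    "diagnostics_only": False, "low_real_world_utility": False, "noncompliant_sample": False,
}
print(rehab_eligible(example))  # prints True
```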
Study selection
Zotero automatically filtered and removed duplicates from search results. Two independent reviewers (first and second authors) conducted title/abstract screening, followed by a full-text review of potentially eligible records. Disagreements were resolved through discussion, and unresolved cases were adjudicated by a third expert. Following the initial screening phases, all preliminarily eligible studies underwent a secondary operationalization review to ensure consistent application of the rehabilitation-oriented inclusion criteria, with cross-sectional or case–control designs subjected to stricter criteria (see Section 2.2.5). This secondary review was conducted in November 2025 in response to reviewer feedback emphasizing clearer rehabilitation boundaries. Interrater agreement for study selection was substantial (Cohen’s κ = 0.78 for title/abstract screening; κ = 0.82 for full-text review; κ = 0.70 for the operationalization review; Fig. 1).
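Cohen's κ corrects raw percent agreement for the agreement expected by chance from each rater's marginal frequencies. As a minimal, self-contained illustration of the statistic (the decision vectors below are invented, not the review's actual screening data):

```python
# Sketch: Cohen's kappa for two reviewers' include/exclude decisions.
# kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and
# p_e is the chance agreement implied by each rater's marginal frequencies.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    p_e = sum((freq_a[l] / n) * (freq_b[l] / n) for l in labels)
    return (p_o - p_e) / (1 - p_e)

# Invented decisions for ten records (not the review's screening data)
a = ["inc", "inc", "exc", "exc", "inc", "exc", "exc", "exc", "inc", "exc"]
b = ["inc", "inc", "exc", "inc", "inc", "exc", "exc", "exc", "exc", "exc"]
print(round(cohens_kappa(a, b), 2))  # → 0.58 (80% raw agreement, chance-corrected)
```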
Fig. 1. PRISMA 2020 flow diagram for study selection.
Two database search rounds (Round 1: January 2012–January 2025; Round 2: January–October 2025) yielded 627 unique records after de-duplication. At title/abstract screening, 520 records were excluded because they were non-original or non-empirical publications, out-of-scope in terms of population or AI use, or did not address rehabilitation-oriented management. The remaining 107 reports underwent full-text eligibility assessment and a secondary operationalization review focusing on rehabilitation-oriented criteria (Methods 2.2.5), leading to the exclusion of 24 reports and a final cohort of 83 studies.
Data extraction
Two reviewers (first and second authors) independently extracted data using a standardized Microsoft Excel template. A pilot extraction of 20 articles refined the procedure and resolved discrepancies. The extracted information comprised the following:
(1) bibliographic details (first author, year, country/region, and World Bank income level);
(2) population and study design (target condition and phase, center structure [single-center, multicenter, or nationwide/healthcare system], setting, sample size and composition, and observation window or follow-up);
(3) task specification (concise task phrase and task family of classification/regression/sequence/time-to-event) and rehabilitation domains using the task–domains framework (domain labels were drawn from the seven rehabilitation domains in Section 2.2.2);
(4) technology paradigm (feature engineering-driven supervised learning; sequence and event-time modeling; representation learning and multimodal deep learning; prescriptive policy learning), recording the model used for primary inferences when multiple were compared;
(5) data sources (modalities and whether passively or actively collected) and engagement pattern (passive sensing, nudge, conversational, or none);
(6) outcome definition (proxy vs. clinical/functional endpoints) and time horizon (a concrete duration or an explicit window such as same-visit, short, mid, mid-to-long, and long term);
(7) performance and outcomes captured in a task-aware manner, including classification metrics (area under the receiver operating characteristic curve [AUC], accuracy, sensitivity/specificity, and, where reported, precision/recall), regression metrics (mean absolute error and root mean squared error), time-to-event metrics (concordance indices, also known as C-index, or time-dependent AUC), early warning metrics (e.g., sensitivity and specificity at pre-specified prediction horizons), and task-appropriate metrics for prescriptive/just-in-time adaptive interventions, reinforcement-learning systems, or LLM-guided interventions, which were summarized narratively owing to heterogeneous definitions;
(8) validation, interpretability, and implementation signals, including validation level (cross-validation, hold-out, and external), calibration and/or uncertainty reporting (yes/no), interpretability class (feature-level, local-explanation, rule-based, or none), closed-loop action (yes/no) with an action-delivery label distinguishing recognition-only systems from those that directly triggered patient- or clinician-facing support or training, safety guardrails for deployment or LLM/reinforcement-learning use (yes/no), and supplementary quality indicators where available (e.g., randomized controlled evaluations, clinician benchmarking, patient user testing, or fairness and algorithmic-bias assessments);
(9) a data pre-processing and feature engineering summary sufficient for reproducibility and interpretation (e.g., aggregation windows, selection procedures such as mRMR or embedded regularization, top-k important features, human-readable rules, or learned policy tables);
(10) for cross-sectional or baseline proof-of-concept studies, an explicit justification of rehabilitation relevance aligned with the operationalization criteria (see Section 2.2.5).
For each study and task family, when multiple models, thresholds, time points, or subscales were reported, we abstracted all available performance metrics but designated a single prespecified “primary” estimate for cross-study descriptive summaries, prioritizing held-out or external test performance on the primary endpoint. Metrics were summarized in a task-aware fashion (i.e., classification, regression, sequence/time-to-event, early warning, and prescriptive tasks were not pooled across task families), and medians and interquartile ranges were computed for homogeneous metric families (e.g., AUC, accuracy, sensitivity/specificity, mean absolute error, root mean squared error, and R²). Metrics expressed on different scales (e.g., percentage root mean squared error on bounded ecological momentary assessment scales) were reported narratively, but were not included in pooled medians; for early warning models, sensitivity/specificity summaries were restricted to studies that reported both.
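This task-aware pooling rule (medians and interquartile ranges computed only within homogeneous metric families, never pooled across task families) can be sketched as follows. The records below are invented single-estimate examples, not extracted results:

```python
# Sketch: task-aware descriptive summaries. One prespecified "primary"
# estimate per study/task is grouped by (task_family, metric) and
# summarized with median and IQR; values are never pooled across task
# families. Records here are illustrative only.
from collections import defaultdict
from statistics import median, quantiles

records = [
    {"task": "classification", "metric": "AUC", "value": 0.71},
    {"task": "classification", "metric": "AUC", "value": 0.79},
    {"task": "classification", "metric": "AUC", "value": 0.86},
    {"task": "regression", "metric": "MAE", "value": 2.29},
    {"task": "time-to-event", "metric": "C-index", "value": 0.74},
]

groups = defaultdict(list)
for r in records:
    groups[(r["task"], r["metric"])].append(r["value"])

for (task, metric), vals in sorted(groups.items()):
    med = median(vals)
    if len(vals) >= 2:
        q1, _, q3 = quantiles(vals, n=4)  # quartile cut points
        print(f"{task}/{metric}: median {med:.2f} (IQR {q1:.2f}-{q3:.2f})")
    else:
        print(f"{task}/{metric}: {med:.2f} (single study; reported narratively)")
```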
To ensure comparability, we grouped methods into four technology paradigms. First, feature engineering-driven supervised learning (typically static classification/regression), such as handcrafted or statistical features with logistic regression, support vector machines, random forests, and tree-based models. Second, sequence and event-time modeling, that is, models that make explicit use of temporal order or survival time, such as hidden Markov models, recurrent neural networks, temporal convolutional networks, time-series transformers (also known as TS-Transformers), Cox proportional hazards models, random survival forests, and deep survival models. Third, representation learning and multimodal deep learning, including self-supervised/contrastive pretraining and multimodal fusion across speech/text/sensing/electronic medical records. Fourth, prescriptive policy learning, ranging from prediction to action and including contextual bandits, reinforcement learning, and dynamic treatment regimes with offline counterfactual evaluation (e.g., inverse propensity scoring, doubly robust estimation, or fitted Q-evaluation). All extracted data were systematically organized according to the AI model type and rehabilitation domain (Table 1). Discrepancies were resolved through discussion, and unresolved cases were adjudicated by a third expert.
Table 1.
AI for Schizophrenia Rehabilitation (2012–2025): Tasks, Data, Models, Performance, and Implementation Readiness.
| Study and setting | Population, sample size, and design | Task Domains | Technology paradigm | Data and engagement | Outcome and time horizon | Performance and outcomes (primary and role-aware) | Validation, interpretability, and implementation signals |
|---|---|---|---|---|---|---|---|
| Howes et al., 2012 [136] (United Kingdom; single-center outpatient psychiatry) | Schizophrenia outpatients; N = 131 (adherence, n = 128); 29 psychiatrists; 61–881 dialogue turns (mean 320.5) | 6-month treatment adherence binary classification — Domain: Medication management | Feature engineering-driven supervised learning (SVM with RBF kernel; J48 decision tree) | Psychiatrist-patient consultation transcripts; active collection; engagement: none | Proxy endpoint: clinician-rated treatment adherence; 6 months | Accuracy 91.1% (imbalanced, n = 128), 93.0% (balanced, n = 71); PANSS symptom prediction accuracy 87.0–91.1% | 5-fold cross-validation; calibration/uncertainty: no; interpretability: feature-level (vocabulary analysis); closed-loop: no |
| Osipov et al., 2015 [94] (United States of America & United Kingdom; multicenter outpatient) | Schizophrenia outpatients in remission (on medication); N = 31 (schizophrenia, n = 12; healthy controls, n = 19); monitoring: 10 days | Schizophrenia vs. healthy control classification (disease monitoring proof-of-concept) — Domain: Symptom monitoring | Feature engineering-driven supervised learning (SVM with Gaussian RBF kernel; mRMR feature selection) | Wearable ECG-derived HR + accelerometry; passive collection; engagement: passive sensing | Proxy endpoint: classification accuracy; 10 days (cross-sectional) | AUC, 0.99; accuracy, 95.3%; sensitivity, 98.0% | 2-fold cross-validation (1000 iterations); calibration/uncertainty: no; interpretability: feature-level (mRMR ranking); closed-loop: no |
| Wang et al., 2016 [103] (United States of America, New York; single-center outpatient psychiatry [psychiatric hospital]) | Recently discharged schizophrenia outpatients; N = 21; monitoring: 64–254 days (mean 133.76 days) | Mental health indicator prediction (EMA scores) — Domain: Symptom monitoring | Feature engineering-driven supervised learning (GBRT; random forest) | Smartphone multimodal passive sensing + EMA (active; 3×/week); passive collection; engagement: passive sensing | Proxy endpoint: EMA mental health scores (positive/negative/sum); 2–8.5 months | MAE, 2.29 (7.6% of score range); Pearson r = 0.89 (p < 0.001); individual model superior to LOSO (MAE, 3.57) | 10-fold blocked cross-validation; calibration/uncertainty: no; interpretability: feature-level (feature importance); closed-loop: no |
| Arslan et al., 2024 [95] (Turkey; single-center outpatient psychiatry) | Schizophrenia-spectrum disorders on antipsychotics and healthy controls; N = 82 (patients, n = 44; healthy controls, n = 38); cross-sectional | Diagnostic classification and symptom correlation analysis — Domain: Symptom monitoring | Feature engineering-driven supervised learning (random forest with SBERT sentence embeddings) | Speech transcripts from 15-min semi-structured interviews; active collection; engagement: none | Proxy endpoint: SSD vs. HC classification + BNSS symptom correlations; same visit | AUC, 0.89; accuracy, 86.8% (combined model); BNSS-EXP correlation r = 0.43 (p < 0.05) | 10-fold stratified cross-validation; calibration/uncertainty: no; interpretability: feature-level (feature importance); closed-loop: no |
| Bradley et al., 2024 [120] (United States of America, San Francisco; multicenter outpatient psychiatry [SFVAMC & UCSF]) | Male patients with schizophrenia/schizoaffective disorder and healthy controls; N = 88 (patients, n = 37; healthy controls, n = 51); crossover design with two test sessions | Emotion processing assessment and treatment response monitoring — Domain: Symptom monitoring | Representation learning-driven modeling (BERT fine-tuned on GoEmotions; cosine-similarity emotional-alignment) | Natural language (30-second verbal responses to video stimuli); active collection; engagement: none | Proxy endpoint: emotional alignment score; same visit (acute oxytocin challenge at 30 min) | Group difference: Cohen’s d = 0.64; oxytocin effect: d = 0.55 (p = 0.001); combined model AUROC, 0.72 | Leave-one-out cross-validation; calibration/uncertainty: no; interpretability: none; closed-loop: no |
| Chan et al., 2023 [96] (United States of America; multicenter community/outpatient [community mental health clinics & VA medical centers]) | Clinically stable schizophrenia/schizoaffective disorder and healthy controls; N = 257 (patients, n = 167; healthy controls, n = 90); cross-sectional assessment | Self-experience language identification and symptom correlation — Domain: Symptom monitoring | Feature engineering-driven supervised learning (Linear SVM; logistic regression; NMF factorization) | Natural language (autobiographical narratives from IPII interview); active collection; engagement: none | Proxy endpoint: diagnostic classification (schizophrenia vs. healthy controls); Clinical endpoint: PANSS symptom correlations; same visit | AUC 0.80 (Logistic Regression), 0.78 (Linear SVM); CCA, r = 0.59 (p < 10^-6) | Hold-out testing set; calibration/uncertainty: no; interpretability: feature-level (NMF factors); closed-loop: no |
| Hudon et al., 2023 [172] (Canada; single-center psychiatric institute [CR-IUSMM]) | Treatment-resistant schizophrenia ( ≥ 2 antipsychotics failed); N = 18; nine weekly sessions (eight immersive; 125 total sessions; 1419 min) | Therapeutic dialogue pattern analysis in virtual reality avatar therapy — Domain: Psychosocial support | Representation learning-driven modeling (TF-IDF features; k-means clustering; PCA dimensionality reduction) | Natural language (therapy session transcripts in Canadian French); passive collection; engagement: conversational | Proxy endpoint: interaction pattern classification (dialogue themes); 9 weeks (treatment period) | Avatar interactions: 3 clusters identified (confrontational/positive techniques); Patient interactions: 4 clusters identified; Interrater agreement: Scott’s Pi = 0.66 (key themes), 0.51 (detailed themes) | Internal clustering validation (elbow method); calibration/uncertainty: no; interpretability: feature-level (cluster examples and qualitative comparison); closed-loop: no |
| Jeong et al., 2023 [104] (Canada; single-center inpatient psychiatry) | Inpatients hospitalized for psychosis; N = 7 (22 speech samples); monitoring: 7–48 days | Psychotic symptom severity assessment — Domain: Symptom monitoring | Feature engineering-driven supervised learning (Linear SVM; BERT for coherence features) | Natural language (interview transcripts); active collection; engagement: none | Proxy endpoint: SAPS/SANS symptom scores; 1–7 weeks (during hospitalization) | Global alogia: AUC 1.00, F-score 0.82; Illogicality: AUC 1.00, F-score 0.68; Poverty of speech: AUC 1.00, F-score 0.65 | Leave-one-subject-out cross-validation; calibration/uncertainty: no; interpretability: feature-level (correlation analysis); closed-loop: no |
| Just et al., 2023 [176] (Germany; single-center outpatient psychiatry [Charité Berlin]) | Stable schizophrenia/schizoaffective outpatients; N = 71 (T2, n = 54); follow-up: 6 months | Semantic coherence analysis for symptom correlation — Domain: Symptom monitoring | Representation learning-driven modeling (distributional embeddings: word2vec; GloVe; coherence metrics) | Semi-structured interview transcripts; active collection; engagement: none | Proxy endpoint: coherence-symptom correlations (PANSS); 6 months | word2vec: correlation with PANSS negative symptoms, r = -0.31; disorganization, r = -0.36; longitudinal prediction: non-significant | Cross-sectional correlation only; calibration/uncertainty: no; interpretability: none; closed-loop: no |
| Parola et al., 2021 [97] (Italy; single-center outpatient psychiatry) | Schizophrenia (chronic, clinically stable) and healthy controls; N = 67 (schizophrenia, n = 32; healthy controls, n = 35); single 90-min assessment | Schizophrenia classification via multimodal communicative-pragmatic features — Domain: Symptom monitoring | Feature engineering-driven supervised learning (Decision Tree J48/C4.5) | Multimodal communicative data (linguistic, gestural, and prosodic) via Assessment Battery for Communication; active collection; engagement: none | Proxy endpoint: schizophrenia vs. control classification; same visit | AUC, 0.89; Accuracy, 82.1%; Sensitivity, 75.8% | 10-fold cross-validation; calibration/uncertainty: no; interpretability: rule-based (decision tree structure); closed-loop: no |
| Arevian et al., 2020 [105] (United States of America; single-center community mental health clinic) | Serious mental illness (schizophrenia, n = 13; schizoaffective, n = 14); N = 47; monitoring: 4–14 months | Clinical state tracking through speech analysis — Domain: Symptom monitoring | Feature engineering-driven supervised learning (Support Vector Machine with L-2 regularization) | Voice recordings via Interactive Voice Response system; active collection; engagement: none | Proxy endpoint: provider global assessment ratings (1–10 scale); 4–14 months | Individual model: correlation 0.78 (concurrent), 0.62 (forecasting); Population model: correlation 0.44 (concurrent), 0.33 (forecasting) | Leave-one-subject-out cross-validation (population), leave-one-sample-out cross-validation (individual); calibration/uncertainty: no; interpretability: feature-level; closed-loop: no |
| Ciampelli et al., 2023 [98] (The Netherlands; single-center university medical outpatient center) | Schizophrenia-spectrum disorder in stable/remission phase and healthy controls; N = 163 (patients, n = 93; healthy controls, n = 70) | Schizophrenia diagnostic classification using semantic similarity — Domain: Symptom monitoring | Feature engineering-driven supervised learning (random forest with Word2vec embeddings) | Natural language (15-min semi-structured interview transcripts); active collection; engagement: none | Proxy endpoint: schizophrenia vs. control classification; same visit (cross-sectional) | Accuracy, 76.7% (ASR) vs. 79.8% (manual); AUC, 0.79 (ASR) vs. 0.80 (manual); sensitivity, 70.0% (ASR) vs. 75.0% (manual) | 10-fold cross-validation; calibration/uncertainty: no; interpretability: feature-level (Gini importance); closed-loop: no |
| Cohen et al., 2020 [116] (United States of America; single-center outpatient home group) | Serious mental illness (schizophrenia, n = 76; major depressive disorder, n = 18; bipolar disorder, n = 20; other, n = 7); N = 121; stable outpatients | Negative symptom severity classification (blunted affect/alogia) — Domain: Symptom monitoring | Feature engineering-driven supervised learning (Lasso regularized regression with 138 acoustic features) | Voice recordings from structured speaking tasks (20 s picture description, 60 s free recall); active collection; engagement: none | Proxy endpoint: SANS-rated blunted vocal affect and alogia; same visit | Blunted affect: accuracy 85.0% (test), 94.0% (training); Alogia: accuracy 92.0% (test), 99.0% (training) | 10-fold cross-validation; calibration/uncertainty: no; interpretability: feature-level (stability selection analysis); closed-loop: no |
| de Boer et al., 2023 [99] (The Netherlands; multicenter community/outpatient) | Schizophrenia-spectrum disorder and healthy controls; N = 284 (patients, n = 142; healthy controls, n = 142); symptom subtyping subset n = 89 (positive, n = 44; negative, n = 45); cross-sectional | Schizophrenia diagnosis and positive/negative symptom subtype classification — Domain: Symptom monitoring | Feature engineering-driven supervised learning (random forest with eGeMAPS acoustic features) | Speech acoustics from semi-structured interviews (5–30 min) via head-worn microphones; active collection; engagement: none | Proxy endpoint: diagnostic classification and symptom subtype; same visit | Diagnosis: AUC, 0.92; accuracy, 86.2%; sensitivity, 85.1%. Symptom subtyping: AUC, 0.76; accuracy, 74.2% | Leave-ten-out cross-validation; calibration/uncertainty: no; interpretability: feature-level (Gini importance scores); closed-loop: no |
| Holmlund et al., 2020 [121] (United States of America, Louisiana; single-center outpatient group home) | Serious mental illness (schizophrenia, n = 16; major depressive disorder, n = 8; bipolar disorder, n = 1) and healthy controls; N = 104 (patients, n = 25; healthy controls, n = 79); 1035 speech responses | Verbal memory assessment via story recall — Domain: Symptom monitoring | Feature engineering-driven supervised learning (OLS regression; common word types + Word Mover’s Distance) | Voice recordings of story recall; active collection (iOS device); engagement: none | Proxy endpoint: verbal memory scores (0–6); same visit (immediate and delayed recall) | Correlation, R = 0.82 (fully automated); R = 0.83 (hybrid procedure); human interrater, R = 0.73–0.89 | 5-fold cross-validation; calibration/uncertainty: no; interpretability: feature-level (regression coefficients); closed-loop: no |
| Richter et al., 2022 [100] (United States of America; single-center state psychiatric facility) | Schizophrenia inpatients and healthy controls; N = 44 (patients, n = 24; healthy controls, n = 20) | Schizophrenia vs. healthy control classification — Domain: Symptom monitoring | Feature engineering-driven supervised learning (random forest) | Speech signals and facial video via multimodal dialog system; active collection; engagement: conversational | Proxy endpoint: schizophrenia classification; same visit | AUC, 0.86 ± 0.02 (multimodal); AUC, 0.84 ± 0.02 (speech only); AUC, 0.75 ± 0.04 (facial only) | 5-fold cross-validation; calibration/uncertainty: no; interpretability: feature-level; closed-loop: no |
| Shibata et al., 2023 [129] (Japan; single-center community/outpatient) | Schizophrenia; N = 18 (longitudinal subset, n = 6); follow-up: 54 days | Subjective quality of life score estimation — Domain: Symptom monitoring | Feature engineering-driven supervised learning (k-NN for cross-sectional; random forest for longitudinal) | Voice recordings during structured questionnaire; active collection; engagement: conversational | Proxy endpoint: JSQLS quality-of-life subscale scores; same visit (cross-sectional) and 54 days (longitudinal) | Cross-sectional: RMSE, 13.43; Test RMSE, 13.30; Longitudinal: RMSE, 13.30 | Nested cross-validation; calibration/uncertainty: no; interpretability: feature-level (SHAP); closed-loop: no |
| Wang et al., 2017 [106] (United States of America, New York City; single-center outpatient psychiatry [Zucker Hillside Hospital]) | Recently discharged schizophrenia outpatients; N = 36 (female, n = 17; male, n = 19); monitoring: 2–12 months | Weekly 7-item BPRS symptom trajectory prediction for relapse risk monitoring — Domain: Symptom monitoring | Feature engineering-driven supervised learning (GBRT with iterative feature selection) | Smartphone multimodal passive sensing (activity, conversation, location, sleep, phone usage, and ambient sound/light) + EMA (active; 10-item; 3×/week); passive collection; engagement: passive sensing | Proxy endpoint: 7-item BPRS score (range 7–49); 1 week ahead (30-day rolling window) | MAE 1.45 (passive+EMA), MAE 1.59 (passive only); Pearson’s r, 0.70 (p < 0.0001); prediction within ±3.5% scale error | Leave-one-record-out cross-validation and leave-one-subject-out cross-validation; calibration/uncertainty: no; interpretability: feature-level (GBRT importance + GEE regression); closed-loop: yes (weekly reports trigger clinical outreach for at-risk patients) |
| Zlatintsi et al., 2022 [155] (Greece; multicenter outpatient psychiatry) | Psychotic-spectrum disorders (schizophrenia, n = 12; bipolar I, n = 8; schizoaffective, n = 2; others) and healthy controls; N = 62 (patients, n = 39; healthy controls, n = 23); monitoring: up to 2 years | Relapse detection and prediction in psychotic disorders — Domains: Risk management; Symptom monitoring | Representation learning-driven modeling (autoencoder variants: CNN-AE; CVAE; Transformer-AE; GRU-AE; supervised comparators: XGBoost; random forest; SVM for PANSS) | Smartwatch physiological signals (HR, HRV, accelerometer, and gyroscope; passive, 24/7) + weekly video interviews (speech and facial; active); passive collection; engagement: passive sensing | Clinical endpoint: relapse detection/prediction; up to 2 years (longitudinal) | Physiological relapse detection: PR-AUC, 0.76 (CNN-AE personalized); Speech+physiological fusion: ROC-AUC, 0.78; Pre-relapse prediction window: 21–30 days | 5-fold cross-validation; calibration/uncertainty: no; interpretability: feature-level (statistical biomarkers); closed-loop: no; external validation: no |
| Hong et al., 2024 [107] (South Korea; multicenter inpatient psychiatry—3 hospitals, 4 wards) | Acute psychiatric inpatients with schizophrenia-spectrum and mood disorders; N = 191 (schizophrenia, n = 72; depressive spectrum, n = 93; manic spectrum, n = 26); monitoring: mean 20.7 days | Comprehensive psychiatric symptom prediction (BPRS, HAM-A, MADRS, and YMRS) — Domain: Symptom monitoring | Representation learning-driven modeling (1D-CNN and GRU; multitask learning) | Wrist-worn wearable (heart rate, accelerometer, and location); passive collection; engagement: passive sensing | Proxy endpoint: clinical rating scale scores (symptom deterioration and severity); 4 weeks | Deterioration detection: accuracy 73.0%, AUC 0.74; Severity prediction: R² 0.74, NRMSE 0.056 | External validation (temporal split); calibration/uncertainty: no; interpretability: feature-level (permutation importance); closed-loop: no |
| Wang et al., 2021 [156] (Singapore; single-center community/outpatient) | Recently discharged schizophrenia patients; N = 22; follow-up: 6 months | Relapse and readmission prediction within 6 months — Domains: Risk management; Symptom monitoring | Sequence and event-time modeling (ARIMA models; Gaussian processes for anomaly detection) | Smartphone sensors + Fitbit wearable; passive collection; engagement: passive sensing | Clinical endpoint: relapse/readmission; 6 months | Data completion rate, 92.2%; COVID-19 behavioral change detection validated (35% step reduction, 1.1 bpm heart rate decrease); relapse prediction metrics pending | Internal validation ongoing; calibration/uncertainty: no; interpretability: feature-level anomaly dashboard; closed-loop: no (planned for phase 2) |
| Adler et al., 2020 [157] (United States of America; single-center community/outpatient) | Schizophrenia-spectrum disorders with recent acute care; N = 60 (non-relapse, n = 42; relapse, n = 18); longitudinal monitoring: 12 months | Psychotic relapse early warning detection (30-day window) — Domains: Risk management; Symptom monitoring | Representation learning-driven modeling (fully connected autoencoder; GRU Seq2Seq) | Smartphone multimodal passive sensing (acceleration, app use, calls, conversations, location, screen, sleep, and texts); passive collection; engagement: passive sensing | Clinical endpoint: psychotic relapse events (hospitalization and symptom exacerbation ≥25% BPRS); 30-day prediction window | Sensitivity, 25.0% (IQR, 15.0–100.0%); Specificity, 88.0% (IQR, 14.0–96.0%); 108.0% median increase in anomaly rate before relapse | Monte Carlo cross-validation (100 iterations); calibration/uncertainty: no; interpretability: feature-level (post-hoc Cohen’s d analysis); closed-loop: no |
| Difrancesco et al., 2016 [125] (United Kingdom; single-center community/outpatient) | Schizophrenia outpatients (stable phase implied); N = 5; 5-day pilot study | Out-of-home activity recognition for social functioning assessment — Domains: Symptom monitoring | Representation learning-driven modeling (density- and time-based geolocation detection; modified k-means clustering) | Smartphone GPS (every 10 meters); passive collection + paper-based social-functioning diary (active); engagement: passive sensing | Proxy endpoint: out-of-home activity detection accuracy; 5 days | Recall: 77.0% (density-based), 66.0% (time-based); Precision: 95.0% (time-based), 66.0% (density-based) | Direct evaluation (no CV); calibration/uncertainty: no; interpretability: none; closed-loop: no |
| Narkhede et al., 2022 [117] (United States of America; single-center community/outpatient) | Psychotic disorder outpatients (schizophrenia, n = 22; schizoaffective, n = 27; bipolar disorder with psychotic features, n = 3) and healthy controls; N = 107 (patients, n = 52; healthy controls, n = 55); monitoring: 6 days | Negative symptom domain presence classification (anhedonia, avolition, asociality, blunted affect, and alogia) — Domain: Symptom monitoring | Feature engineering-driven supervised learning (random forest; k-NN; H2O XGBoost; feature selection: Boruta; RFECV; L1 regularization) | Smartphone GPS + accelerometry (passive), smartband accelerometry (passive), and EMA surveys 8×/day (active); passive collection; engagement: passive sensing | Proxy endpoint: negative-symptom domain classification (presence/absence based on BNSS ≥ 2); 6 days | PD vs. CN: Accuracy 79.0%, AUC 0.86; Individual symptom domains: Accuracy 73.0–91.0% | Internal cross-validation (stratified K-fold); calibration/uncertainty: no; interpretability: feature-level (defeatist beliefs and home distance as top features); closed-loop: no |
| Adler et al., 2022 [108] (United States of America; multicenter community/outpatient) | Schizophrenia/schizoaffective (CrossCheck cohort, n = 51) and university students (StudentLife cohorts: sleep, n = 15; stress, n = 9); monitoring: 12 months (CrossCheck); 10 weeks (StudentLife) | Cross-population mental health symptom prediction (sleep quality and stress) — Domain: Symptom monitoring | Feature engineering-driven supervised learning (GBRT; 44 behavioral features) | Smartphone multimodal passive sensing (activity, conversation, location, and phone usage) + EMA (active); passive collection; engagement: passive sensing | Proxy endpoint: EMA symptom scores (sleep quality and stress); 12 months (CrossCheck); 10 weeks (StudentLife) | Sleep quality: MAE improvement with combined data (CrossCheck, RBC = 0.14, p = 0.007; StudentLife, RBC = 0.35, p < 0.001); Stress: MAE improvement (CrossCheck, RBC = 0.18, p < 0.001) | Leave-one-subject-out cross-validation; calibration/uncertainty: no; interpretability: feature-level (Proxy-A distance analysis); closed-loop: no |
| Wang et al., 2020 [126] (United States of America, New York; single-center outpatient psychiatry [Zucker Hillside Hospital]) | Stable schizophrenia outpatients; N = 55; monitoring: 12 months (assessments every 3 months) | Social functioning assessment (7 Social Functioning Scale sub-scales) — Domain: Symptom monitoring | Feature engineering-driven supervised learning (random forest; extra trees; XGBoost; PCA for dimensionality reduction) | Smartphone multimodal passive sensing (physical activity, GPS, ambient sound, conversations, calls/texts, and app usage); passive collection; engagement: passive sensing | Proxy endpoint: Social Functioning Scale subscales; 90 days | Employment/occupation: MAE, 2.17, r = 0.62; Interpersonal behavior: MAE, 3.39, r = 0.57; Prosocial activities: MAE, 7.79, r = 0.53 | 5-fold cross-validation; leave-one-subject-out cross-validation; cross-leave-one-subject-out cross-validation; calibration/uncertainty: no; interpretability: feature-level (PCA-based behavioral patterns); closed-loop: no |
| Kalinich et al., 2022 [101] (United States of America; single-center community mental health) | Schizophrenia and healthy controls; N = 55 (patients, n = 25; healthy controls, n = 30); monitoring: 90 days | Schizophrenia classification and sleep dysfunction prediction — Domain: Symptom monitoring | Feature engineering-driven supervised learning (random forest with static and temporal features) | Smartphone cognitive game (modified Trails-B); active collection; engagement: none | Proxy endpoint: schizophrenia classification; Clinical endpoint: PSQI Daytime Dysfunction; 90 days | AUC 0.95 (schizophrenia classification with temporal features); AUC 0.80 (PSQI Daytime Dysfunction prediction) | Leave-one-out cross-validation (100 iterations); calibration/uncertainty: no; interpretability: feature-level (SHAP); closed-loop: no |
| Miley et al., 2023 [131] (United States of America, 21 states; multicenter community/outpatient—RAISE-ETP secondary analysis) | First-episode psychosis (schizophrenia spectrum; <6 months antipsychotic use); RAISE-ETP cohort, N = 276; CATIE validation cohort, N = 187; follow-up: 6 months | Causal pathway identification for functional recovery — Domain: Symptom monitoring | Feature engineering-driven supervised learning (causal discovery: GFCI; effect estimation: SEM) | Clinical assessments (QLS interviewer ratings, PANSS, Brief Assessment of Cognition); active collection; engagement: none (secondary analysis) | Clinical endpoint: QLS social and occupational functioning; 6 months | Model fit: CFI 0.88, RMSEA 0.07; Key causal effects: socio-affective capacity→motivation, ES = 0.77; motivation→social functioning, ES = 1.50; motivation→occupational functioning, ES = 0.96 | External validation (CATIE dataset); calibration/uncertainty: no; interpretability: feature-level (causal pathway analysis); closed-loop: no |
| Tseng et al., 2020 [109] (United States of America; multicenter outpatient/inpatient) | Schizophrenia-spectrum disorder; N = 61; total 6132 patient-days (mean 104 days/patient) | Fine-grained symptom trajectory prediction (10 EMA items) — Domain: Symptom monitoring | Feature engineering-driven supervised learning (multi-output SVR with RBF kernel; multi-task learning with LASSO) | Smartphone multimodal passive sensing (accelerometer, calls/SMS, screen activity, location, and ambient light/sound) + EMA (active); passive collection; engagement: passive sensing | Proxy endpoint: 10 EMA symptom scores (0–3 scale); 1 day ahead | RMSE, 12% (m-SVR-RBF); MTL significantly outperforms STL (p < 0.001); rhythm features improve prediction for depression, hearing voices, and stress (p < 0.05) | 5-fold cross-validation; leave-one-subject-out; calibration/uncertainty: no; interpretability: feature-level (rhythm importance analysis); closed-loop: no |
| Hulme et al., 2019 [132] (United Kingdom; multicenter community/outpatient) | Schizophrenia from two randomized controlled trials; N = 49 (CareLoop, n = 40; Actissist, n = 12); median follow-up: 83 days | Symptom pattern clustering and subtype identification — Domain: Symptom monitoring | Sequence and event-time modeling (cluster hidden Markov Models with multinomial distributions) | Mobile-app self-reported symptom ratings (ClinTouch EMA); active collection; engagement: nudge | Proxy endpoint: symptom clusters/hidden states; 12 weeks | BIC 44929 (best model); response rate, 78%; 3 clusters × 3 states configuration | Internal model selection; calibration/uncertainty: no; interpretability: feature-level; closed-loop: no |
| Badal et al., 2021 [127] (United States of America; multicenter outpatient [University of Texas at Dallas, University of Miami, and University of North Carolina]) | Stable schizophrenia/schizoaffective outpatients and healthy controls; N = 372 (schizophrenia, n = 218; healthy controls, n = 154); cross-sectional assessment | Social cognition assessment via facial affect recognition — Domain: Symptom monitoring | Feature engineering-driven supervised learning (stacked models; neural networks with ReLU; random forest; SVM) | Computerized emotion-recognition tasks (ER-40, BLERT); active collection; engagement: none | Proxy endpoints: diagnostic classification (schizophrenia vs. healthy controls), social cognitive ability (OSCARS), symptom severity (PANSS and BDI); same visit | SZ vs. HC classification: AUC 0.81, F1 78.0% (ER-40); OSCARS Self-Rated prediction: AUC 0.74; OSCARS Informant prediction: AUC 0.73 | 80–20 hold-out validation; calibration/uncertainty: no; interpretability: feature-level (Gini importance); closed-loop: no |
| Bain et al., 2017 [137] (United States of America; multicenter outpatient clinical trial) | Clinically stable adults with schizophrenia; N = 75 in AI substudy (AI platform, n = 53; modified directly observed therapy, n = 22); monitoring: 24 weeks | Medication adherence monitoring and confirmation — Domain: Medication management | Representation learning-driven modeling (computer-vision ingestion verification: face recognition; pill appearance recognition) | Smartphone video capture (facial and medication ingestion); active collection; engagement: nudge | Proxy endpoint: medication adherence (pharmacokinetic-verified); 24 weeks | Mean cumulative pharmacokinetic adherence: 89.7% (SD 24.92) AI platform vs. 71.9% (SD 39.81) mDOT; difference, 17.9% (95% CI -2 to 37.7; p = 0.08); suspicious behavior detection: 35.8% | Internal validation (pharmacokinetic reference standard); calibration/uncertainty: no; interpretability: feature-level (suspicious behavior flagging); closed-loop: partial (real-time alerts and counseling triggers) |
| Chen et al., 2023 [138] (Taiwan; multicenter daycare [outpatient]) | Stable schizophrenia in psychiatric daycare and controls; N = 94 (experimental, n = 35; control, n = 59); intervention: 12 weeks | Medication adherence enhancement through AI verification — Domain: Medication management | Representation learning-driven modeling (AI-based face recognition; drug appearance recognition) | Smartphone camera images (face and drug verification); active collection; engagement: nudge | Proxy endpoint: medication adherence rate; Clinical endpoint: PANSS symptom scores; 12 weeks | Medication adherence: 94.72% (experimental) vs. 64.43% (control); PANSS improvement: positive symptoms p = 0.02, negative symptoms p = 0.007, general psychopathology p < 0.001 | RCT single-blind; calibration/uncertainty: no; interpretability: none; closed-loop: yes |
| Shen et al., 2021 [102] (China; single-center inpatient psychiatry) | Chronic schizophrenia inpatients (illness duration 34.54 ± 10.76 years) and healthy controls; N = 316 (patients, n = 281; healthy controls, n = 35) | Schizophrenia identification and PANSS severity prediction — Domain: Symptom monitoring | Representation learning-driven modeling (ResNet-18 transfer learning; custom CNN for regression) | Color painting images (standardized template with 12 colors); active collection; engagement: none | Proxy endpoint: schizophrenia classification and PANSS scores; same visit | Classification accuracy 90.33%; PANSS prediction RMSE: total 15.39, positive 4.36, negative 7.24 | Hold-out validation (10 runs); calibration/uncertainty: no; interpretability: feature-level (color histogram and stroke analysis); closed-loop: no |
| Zhu et al., 2024 [139] (United States of America; multicenter outpatient/community) | Schizophrenia (n = 264) and attenuated psychotic disorders (n = 50) from two Phase II randomized controlled trials; adherence subset, N = 235; monitoring: 28 weeks | Medication adherence prediction for clinical trials — Domain: Medication management | Feature engineering-driven supervised learning (XGBoost with linear/logistic regression) | Smartphone camera-based medication ingestion monitoring (AiCure app); active collection; engagement: nudge | Proxy endpoint: medication adherence classification; Clinical endpoint: time to first relapse; 28 weeks | AUC, 0.81 (14-day model); AUC, 0.87 (0.6 cut-off); AUC, 0.92 (Trial End timepoint) | 10-fold cross-validation; calibration/uncertainty: no; interpretability: none; closed-loop: no |
| Abplanalp et al., 2024 [128] (United States of America; multicenter outpatient [VA Greater Los Angeles, UCLA, community facilities]) | Schizophrenia outpatients (n = 72); bipolar disorder (n = 48); community sample enriched for social isolation (n = 151); total N = 271; cross-sectional assessment | Social isolation and loneliness predictors identification — Domain: Symptom monitoring | Feature engineering-driven supervised learning (LASSO regression with 5-fold cross-validation) | Self-report questionnaires and semi-structured clinical interviews; active collection; engagement: none | Proxy endpoints: social isolation composite score and UCLA loneliness scale; same visit (cross-sectional) | Schizophrenia group - social isolation: R² 0.14; loneliness: R² 0.21; social anhedonia as transdiagnostic predictor (β = 0.14-0.31 across groups) | 5-fold cross-validation; calibration/uncertainty: no; interpretability: feature-level (variable importance via LASSO coefficients); closed-loop: no |
| Banerjee et al., 2021 [170] (United Kingdom; single-center secondary mental health care) | Schizophrenia (ICD-10 F20); N = 1,706; observation period from 2013 onwards | Mortality prediction in schizophrenia — Domain: Risk management | Representation learning-driven modeling (autoencoder feature learning and random forest; logistic regression; Cox proportional hazards) | Electronic health records (structured ICD-10 diagnoses, and NLP-extracted medications); passive collection; engagement: none | Clinical endpoint: all-cause mortality; long (variable observation period) | AUC, 0.80 (95% CI: 0.78 to 0.82); SMR, 7.4 (95% CI: 5.5 to 9.2); Concordance Index, 0.78 | 10-fold cross-validation; calibration/uncertainty: no; interpretability: local-explanation (class-contrastive counterfactuals); closed-loop: no |
| Barbalat et al., 2024 [173] (France; multicenter rehabilitation network, 15 centers) | Serious mental illness (47.2% schizophrenia-spectrum disorders); N = 1146; cross-sectional assessment | Initial referral prediction to 4 psychosocial rehabilitation interventions (cognitive behavioral therapy, cognitive remediation, psychoeducation, and vocational training) — Domain: Psychosocial support | Feature engineering-driven supervised learning (random forest; comparators: ridge regression; multinomial regression; XGBoost; CART) | Electronic health records (37 socio-demographic and clinical variables); active collection via structured assessment; engagement: none (retrospective) | Proxy endpoint: rehabilitation-intervention referral patterns; same visit (baseline assessment) | AUC, 0.67; AUPRC, 0.41; micro-averaged across 4 intervention types | Nested 20-fold cross-validation; calibration/uncertainty: no; interpretability: feature-level (SHAP analysis); closed-loop: no; external validation: no |
| Jeon et al., 2022 [140] (South Korea; multicenter nationwide claims; mixed inpatient/outpatient) | Children/adolescents (2–18) with first antipsychotic prescription; N = 8502 (risperidone, n = 5109; aripiprazole, n = 3393); follow-up: 1 year | One-year antipsychotic treatment-continuation classification — Domain: Medication management | Feature engineering-driven supervised learning (gradient boosting machines; random forest; logistic regression; SuperLearner ensemble) | National health-insurance claims (demographics, ICD-10 diagnoses, prescriptions, and service use); passive collection; engagement: none | Proxy endpoint: treatment continuation without ≥60-day gap/hospitalization/switch; 1 year | Risperidone: AUC, 0.69; AUPRC, 0.27; Brier, 0.12. Aripiprazole: AUC, 0.69; AUPRC, 0.32; Brier, 0.11. | Hold-out split (75/25) with 5-fold cross-validation for tuning; calibration/uncertainty: no; interpretability: feature-level (variable importance); closed-loop: no |
| Kim et al., 2025 [143] (South Korea; multicenter outpatient psychiatry trials) | Adults switching antipsychotics; N = 299; longitudinal assessments at 4, 8, and 24 weeks | Treatment-response prediction at 4, 8 and 24 weeks — Domain: Medication management | Feature engineering-driven supervised learning (extreme gradient boosting; elastic net; random forest; gradient boosting machines) | Clinician-rated scales (CGI-SCH), patient-reported outcomes (SCL-90-R; drug attitude inventory), psychosocial functioning (PSP), metabolic/endocrine measures (BMI and metabolic syndrome, prolactin); active collection; engagement: none | Proxy endpoint: treatment response by CGI-SCH improvement; 4, 8 and 24 weeks | AUC, 0.71 (4 weeks); 0.66 (8 weeks); 0.68 (24 weeks). Key drivers: body mass index and drug attitude across time. | Nested 5-fold cross-validation (3× repeats); calibration/uncertainty: no; interpretability: feature-level (SHAP); closed-loop: no |
| Wysokiński et al., 2024 [151] (Poland; single-center inpatient psychiatry) | Inpatients on clozapine ≥2 weeks; N = 69; single time-point therapeutic-drug-monitoring sample | Clozapine toxicity prediction and therapeutic dose-range recommendation — Domain: Medication management | Representation learning-driven modeling (fully connected neural networks on clinical features) | Clinical variables (sex, age, body-mass index, dose, C-reactive protein, and cytochrome P450 co-medication counts) and morning plasma levels; active collection; engagement: none | Proxy endpoint: plasma clozapine/norclozapine concentrations and toxicity classification (>550 ng/mL); same visit | Concentration prediction: R² 0.92 (clozapine), 0.80 (norclozapine); RMSE, 85.23 and 62.76. Toxicity classification accuracy: 99.0%/93.0%; external demo (n = 3) R² 0.77. | Hold-out split (70/30); external validation: demonstration (n = 3); calibration/uncertainty: no; interpretability: none; closed-loop: no |
| Zhu et al., 2022 [152] (China, Guangzhou; single-center inpatient psychiatry) | Olanzapine-treated inpatients; N = 393 patients with 672 prolactin tests; cross-sectional (2018 H2) | Prolactin-level prediction and pharmacovigilance signal detection in olanzapine-treated patients — Domain: Medication management | Feature engineering-driven supervised learning (extreme gradient boosting with SHAP-guided feature selection) | Electronic health records (demographics, diagnoses, co-medications, labs, and drug concentrations); passive collection; engagement: none | Proxy endpoint: serum prolactin level; same visit | RMSE, 0.06; MAE, 0.05; mean relative error, 11.0%. SHAP aligned with known pharmacologic effects (risperidone/sulpiride ↑, aripiprazole ↓). | Hold-out split (80/20) with 10-fold cross-validation for feature selection; calibration/uncertainty: no; interpretability: feature-level (SHAP); closed-loop: no |
| Amoretti et al., 2021 [133] (Spain; multicenter psychiatric services—16 sites; early-psychosis programs) | First-episode non-affective psychosis; N = 144; baseline and 2-year follow-up | Clinical-trajectory phenotyping and prognosis staging over 2 years — Domain: Symptom monitoring | Sequence and event-time modeling (unsupervised fuzzy trajectory clustering on PCA-derived symptom dimensions) | Clinician-rated symptom and function scales (PANSS, MADRS, CGI, FAST, and PAS); active collection; engagement: none | Clinical endpoint: four trajectory categories (excellent, remitting, worsening, and chronic); 2 years | Discriminant analysis classification accuracy 93.1% (baseline clusters) and 94.0% (2-year clusters); silhouette, 0.71–0.77. | Internal discriminant-function validation; calibration/uncertainty: no; interpretability: feature-level (symptom-dimension profiles and membership degrees); closed-loop: no |
| Brandt et al., 2023 [158] (International; multicenter randomized trials; outpatient maintenance) | Schizophrenia/schizoaffective adults undergoing randomized paliperidone discontinuation; N = 1,392 (continue, n = 700; discontinue, n = 692); event-time follow-up | Post-discontinuation psychotic-relapse risk prediction — Domain: Risk management | Sequence and event-time modeling (regularized proportional-hazards regression; random survival forests) | Baseline clinician-rated scales, labs, clinical history, and substance screens; active collection; engagement: none | Clinical endpoint: time to psychotic relapse; mid | Concordance index, 0.71 (elastic-net Cox); 0.70 (random survival forests); baseline-only comparator, 0.60. | Internal cross-validation (leave-one-out); calibration/uncertainty: no; interpretability: feature-level (coefficients/variable importance); closed-loop: no |
| Góngora Alonso et al., 2024 [164] (Spain, Castilla y León; multicenter inpatient admissions across 11 public hospitals) | Schizophrenia inpatients; N = 3,065 patients with 6,089 admissions; retrospective period: 2005–2015 | Hospital readmission risk prediction — Domain: Risk management | Feature engineering-driven supervised learning (random forest with particle-swarm optimization; support vector machine; multilayer perceptron) | Electronic health records—administrative minimum dataset (diagnoses, procedures, demographics, and admission features); passive collection; engagement: none | Clinical endpoint: all-cause readmission; mid-to-long (retrospective 10 years) | AUC, 0.86; recall, 95.9%; accuracy, 84.4%; F1-score, 90.7%. | 10-fold cross-validation; calibration/uncertainty: no; interpretability: none; closed-loop: no |
| Lin et al., 2021 [110] (Taiwan; single-center outpatient psychiatry) | Adults with schizophrenia; N = 302; cross-sectional assessment | Functional outcome prediction (Quality of Life Scale and Global Assessment of Functioning) — Domain: Symptom monitoring | Feature engineering-driven supervised learning (bagging ensemble with multilayer neural networks and support vector machines) | Symptom scales (PANSS-Positive, SANS-20, and HAMD-17) and 11 cognitive tests; active collection; engagement: none | Proxy endpoint: QLS and GAF scores; same visit | RMSE, 6.43 (QLS) and 7.78 (GAF); superior to individual baselines. | Repeated 10-fold cross-validation; calibration/uncertainty: no; interpretability: feature-level (M5 Prime feature selection); closed-loop: no |
| Lin et al., 2024 [111] (Taiwan; single-center treatment-community mental health service) | Stable schizophrenia/schizoaffective outpatients; N = 573 (training, n = 458; validation, n = 115); cross-sectional | Machine-learning short forms for PANSS (15-item and 9-item) — Domain: Symptom monitoring | Representation learning-driven modeling (artificial neural network for PANSS reconstruction; gradient-boosting classifier for reduced forms) | Structured clinical interview with PANSS, CGI-S, MMSE, Lawton IADL; active collection; engagement: none | Proxy endpoint: PANSS severity indices; same visit | 15-item: r, 0.92–0.99; MSE, 2.60–10.90; α, 0.87. 9-item: r, 0.86–0.97; MSE, 2.60–24.40; α, 0.79. | Hold-out validation (80/20); calibration/uncertainty: no; interpretability: feature-level (iterative item selection); closed-loop: no |
| Martínez-Cao et al., 2024 [134] (Spain; multicenter outpatient/community mental health centers) | Stable schizophrenia outpatients; N = 212; cross-sectional | Five-level clinical staging (Clinical Global Impression–Severity) — Domain: Symptom monitoring | Feature engineering-driven supervised learning (support vector machine with genetic-algorithm feature selection) | Symptom scales (PANSS, CAINS, CDSS, OSQ), cognition (MATRICS battery), blood biomarkers (PLR, NLR, MLR, and C-reactive protein), functioning (PSP); active collection; engagement: none | Proxy endpoint: CGI-S stage (5 levels); same visit | Overall accuracy 62.0%; sensitivity by stage: stage 2, 62.1%; stage 3, 63.6%; stage 4, 67.3%. | 3-fold cross-validation repeated 10,000×; calibration/uncertainty: no; interpretability: feature-level (12-variable profile); closed-loop: no |
| Pei et al., 2024 [141] (Italy; multicenter community mental health centers; outpatient) | Adults with mental disorders on long-acting injectable antipsychotics; N = 432 (schizophrenia-spectrum, n = 302); follow-up: 6 and 12 months | 6–12-month treatment-adherence classification — Domain: Medication management | Feature engineering-driven supervised learning (LASSO selection; logistic regression; nomogram) | Clinical interview scales (DAI-10, BPRS, and Kemp), history (hospitalizations, LAI use); active collection; engagement: none | Proxy endpoint: medication adherence (Kemp ≥5 vs. <5); 6–12 months | AUC, 0.72; C-index, 0.71; Brier, 0.22 | Bootstrap internal validation (1000×); external validation: no; calibration/uncertainty: calibration, yes / uncertainty, no; interpretability: feature-level (nomogram); closed-loop: no |
| Soldatos et al., 2022 [112] (Greece & Denmark; multicenter inpatient/outpatient [Athens single-center; Copenhagen multicenter]) | First-episode psychosis; N = 280 (Athens, n = 179; Copenhagen, n = 101); follow-up: 4 weeks (Athens); 6 weeks (Copenhagen) | 4–6-week early symptomatic remission prediction — Domain: Symptom monitoring | Feature engineering-driven supervised learning (Elastic Net feature selection; linear support vector machine) | Clinical scales (PANSS, GAF, PSP, HAM-D, YMRS, and CGI) + demographics; active collection; engagement: none | Clinical endpoint: Andreasen symptomatic remission; 4–6 weeks | AUC, 0.71 (Athens); AUC, 0.68 (Copenhagen); balanced accuracy, 60.4%/63.5% | Nested 10-fold CV (repeated) and external validation (independent Copenhagen cohort); calibration/uncertainty: no; interpretability: feature-level (linear model weights); closed-loop: no |
| Wong et al., 2024 [144] (Hong Kong Special Administrative Region, China; multicenter public psychiatric services) | First-episode psychosis in early intervention services/specialist care services; N = 1400 (clozapine, n = 191); follow-up: 12–17 years | Long-term treatment-resistance risk (clozapine initiation) prediction — Domain: Medication management | Feature engineering-driven supervised learning (automated model selection via TPOT; bagging; Platt scaling; risk calculator) | Electronic health records (symptoms, meds, admissions, and SOFAS); passive collection; engagement: none | Proxy endpoint: treatment resistance (clozapine prescription); long (12–17 years; models at baseline/12/24/36 months) | AUROC, 0.77 (36-month model); 0.68 (baseline); Brier, 0.09–0.11 | 5-fold CV (100× repeats); calibration/uncertainty: calibration, yes / uncertainty, yes (SD across repeats); interpretability: feature-level (importance ranking); closed-loop: no |
| Yu et al., 2022 [166] (China, Hefei; single-center inpatient psychiatry) | Hospitalized male schizophrenia; N = 397 (violence, n = 146; no violence, n = 251); cross-sectional (violence in prior month) | One-month pre-admission violence risk classification — Domain: Risk management | Feature engineering-driven supervised learning (neural network; baselines: logistic regression; random forest; SVM; LASSO preselection) | Demographics, clinical scales (BPRS, PANSS, and SDSS), laboratory panels; active collection; engagement: none | Proxy endpoint: violence (yes/no in prior 1 month); short | AUC, 0.67; sensitivity, 44.4%; specificity, 83.9% | Train/test split (70/30) and 10×10-fold CV for tuning; calibration/uncertainty: no; interpretability: feature-level (logistic regression ORs); closed-loop: no |
| Umbricht et al., 2020 [118] (United States of America; multicenter outpatient clinics) | Stable schizophrenia with moderate negative symptoms; N = 33 (device wearers, n = 31); continuous monitoring: 15 weeks | Continuous activity- and gesture-based negative-symptom quantification — Domain: Symptom monitoring | Representation learning-driven modeling (9-layer convolutional recurrent neural network for human-activity recognition) | Wrist-worn accelerometry (20 Hz; GeneActiv); passive collection; engagement: passive sensing | Proxy endpoint: activity/gesture markers vs. clinical scales (BNSS; effort task); 15 weeks (cross-sectional analysis) | HAR accuracy, 95.5% (mobile) / 94.9% (stationary); r(activity-time ratio, effort task) = 0.58; r(gesture count, BNSS) = −0.44 | External test sets (public datasets) and disease cohort checks; calibration/uncertainty: no; interpretability: none; closed-loop: no |
| Barruel et al., 2024 [145] (France, Paris; single-center university psychiatric hospital; inpatient) | Hospitalized schizophrenia; N = 500 (treatment-resistant, n = 169); retrospective | Treatment-resistant schizophrenia status prediction — Domain: Medication management | Feature engineering-driven supervised learning (NLP-derived EHR features; XGBoost; SHAP explanations) | EHR discharge summaries and notes (text); passive collection; engagement: none | Proxy endpoint: treatment resistance (clozapine-based); same visit (retrospective) | AUC, 0.60; F1-score, 49.4%; recall, 60.7% | Repeated 5-fold CV (200 iterations); calibration/uncertainty: no; interpretability: local-explanation (SHAP); closed-loop: no |
| Birnbaum et al., 2020 [159] (United States of America; multicenter early-psychosis clinics; outpatient) | Young people with early psychosis and healthy controls; N = 116 (schizophrenia-spectrum disorder, n = 42; healthy controls, n = 74); relapse sample: 38 SSD (93 time windows); window length: 4 weeks | Internet-search-based SSD diagnosis support and 4-week relapse prediction — Domains: Risk management; Symptom monitoring | Feature engineering-driven supervised learning (random forest for diagnosis; RBF SVM for relapse; LIWC and temporal features) | Google search queries and browsing logs (Google Takeout); passive collection; engagement: passive sensing | Proxy endpoints: SSD vs. control; relapse hospitalization within 4 weeks; 4-week windows | AUC, 0.74 (diagnosis); AUC, 0.71 (4-week relapse); accuracy, 73.0% (diagnosis) | 5-fold CV (10 repeats); calibration/uncertainty: no; interpretability: feature-level (permutation importance); closed-loop: no |
| Mason et al., 2024 [167] (United Kingdom, London; multicenter community/outpatient/inpatient [NHS mental health trust]) | Mental-health service users; N = 60,021 (schizophrenia spectrum, n = 7,212); year-2019 records | EHR-text detection of violence victimization — Domain: Risk management | Representation learning-driven modeling (BioBERT transformer for document-level classification) | EHR free-text notes and summaries; passive collection; engagement: none | Proxy endpoint: violence-victimization record detection; 1 year (2019) | F1-score, 90.0% (victimization); F1-score, 98.0% (physical violence); interrater kappa, 0.60–0.85 | Blinded hold-out test set (n = 1,411 documents); calibration/uncertainty: no; interpretability: none; closed-loop: no |
| Miley et al., 2024 [171] (United States of America; multicenter community/outpatient remote training) | Clinically stable adults with schizophrenia; N = 76; assessments at baseline, mid (~20 sessions), and end (~40 sessions); training duration: 8–12 weeks | Social-cognition training response trajectory identification and individualized prediction — Domain: Functional training | Feature engineering-driven supervised learning (latent class growth analysis; random forest with extremely randomized splits; logistic regression comparator) | Clinical assessments (social cognition tests, PANSS, functioning, and motivation); active collection, plus training-participation logs (passive collection); engagement: none | Proxy endpoint: composite social-cognition improvement; 8–12 weeks | AUPRC, 0.73; F1, 67.0%; precision, 61.0%; recall, 75.0%; Five trajectory groups; 29.0% showed significant improvement | 10-fold cross-validation (nested hyper-parameter tuning); calibration/uncertainty: no; interpretability: feature-level (Shapley values); closed-loop: no |
| Dickson et al., 2023 [142] (United States of America, Oklahoma; multicenter community/outpatient [statewide Medicaid]) | Adults with schizophrenia/schizoaffective initiating paliperidone palmitate vs. switching oral antipsychotics; N = 295 (paliperidone palmitate, n = 183; oral-switch, n = 112); retrospective cohort 2016–2019; follow-up: 12 months (mean 1.5 ± 0.8 years) | Medication adherence/persistence and 30-day readmission risk estimation — Domain: Medication management | Feature engineering-driven supervised learning (lasso-regularized models with cross-fit partialing-out) | Administrative medical and pharmacy claims; passive collection; engagement: none | Proxy endpoints: proportion of days covered and 45-day treatment gaps; Clinical endpoint: 30-day readmission; 12 months | Readmission OR, 1.91 (oral-switch vs. paliperidone palmitate); adherence, −18.5%; 45-day gap IRR 1.92; Adherence ≥80% linked to lower readmission OR (0.34) | Repeated 10× random splits with 10-fold cross-validation; calibration/uncertainty: no; interpretability: feature-level (lasso coefficients); closed-loop: no |
| Fond et al., 2019 [160] (France; multicenter outpatient—10 expert centers) | Community-dwelling adults with schizophrenia; N = 549 at baseline (n = 315 completed 2-year follow-up) | Two-year psychotic relapse prediction — Domain: Risk management | Feature engineering-driven supervised learning (classification and regression tree) | Clinical scales, neurocognitive tests, adherence scales, and metabolic and inflammatory markers; active collection; engagement: none | Clinical endpoint: relapse (≥7-day acute psychosis); 2 years | Accuracy, 63.8%; sensitivity, 71.0%; specificity, 44.8% | Train/test split with 5-fold cross-validation; calibration/uncertainty: no; interpretability: rule-based; closed-loop: no |
| Podichetty et al., 2021 [146] (United States of America; multicenter clinical-trial secondary analysis—CATIE) | Adults with schizophrenia; N = 639 (cleaned dataset); assessments at baseline, 1 month, 3 months, and 6 months | Six-month treatment-response (≥20% PANSS reduction) prediction and trial enrichment — Domain: Medication management | Feature engineering-driven supervised learning (random forest; logistic regression/naïve Bayes/support vector machine comparators) | Clinical assessments and vitals; active collection (retrospective trial analysis); engagement: none | Proxy endpoint: ≥20% PANSS reduction; 6 months (screening model targets 3 months) | AUC, 0.70; specificity, 93.6%; accuracy, 80.0%; Screening model AUC, 0.65; enrichment yield, 46–48% vs. 22% actual | 10-fold cross-validation; calibration/uncertainty: no; interpretability: feature-level; closed-loop: no |
| Wu et al., 2020 [154] (Taiwan; multicenter outpatient) | First-episode schizophrenia; N = 32 277; longitudinal follow-up: 12 months | Individualized antipsychotic selection — Domain: Medication management | Prescriptive policy learning (targeted minimum-loss ensemble with individualized treatment rule; random-forest variable importance) | Electronic health records; passive collection; engagement: none | Proxy endpoint: treatment success (no switch and no hospitalization within 12 months); 12 months | Success rate under individualized treatment rule, 51.7% vs. 44.5% observed (relative improvement, 16.0%); number needed to treat, 13.9 | Hold-out (30%); calibration/uncertainty: no; interpretability: feature-level; closed-loop: no; safety guardrails: no |
| Jeong et al., 2024 [130] (South Korea; multicenter outpatient) | Chronic schizophrenia; N = 637 at baseline (n = 420 at 6 months) | High subjective well-being classification (SWN ≥ 80) — Domain: Symptom monitoring | Feature engineering-driven supervised learning (random forest; gradient boosting and regularized logistic regression comparators) | Self-report and clinician-rated scales plus metabolic markers; active collection; engagement: none | Proxy endpoint: SWN-K ≥ 80; baseline and 6 months | AUC at baseline, 0.79 (random forest); AUC at 6 months, 0.78 | 5-fold cross-validation; calibration/uncertainty: no; interpretability: feature-level (Shapley values); closed-loop: no |
| Wang et al., 2020 [168] (Canada; single-center outpatient psychiatry [Centre for Addiction and Mental Health]) | Schizophrenia-spectrum adults; N = 275 (violence history, n = 103; no history, n = 172); retrospective | Physical-violence risk classification — Domain: Risk management | Feature engineering-driven supervised learning (random forest; logistic/elastic net/gradient boosting/support vector machine comparators) | Structured EHR data plus self-report questionnaires and clinical interviews; passive collection + active collection; engagement: none | Proxy endpoint: violence vs. non-violence classification; same visit (retrospective) | AUC, 0.63 (random forest); sensitivity, 32.0%; specificity, 80.0%; Best sensitivity, 59.0% (elastic net) | Stratified 5-fold cross-validation; calibration/uncertainty: no; interpretability: feature-level; closed-loop: no |
| Zhou et al., 2022 [161] (United States of America; single-center outpatient psychiatry [Zucker Hillside Hospital]) | Outpatients with schizophrenia; N = 63; monitoring: up to 12 months; 27 relapse events among 20 patients | One-week-ahead relapse prediction from mobile-sensor behavior clusters — Domain: Risk management | Feature engineering-driven supervised learning (Gaussian mixture model and partitioning-around-medoids clustering with dynamic-time-warping features; balanced random forest) | Smartphone sensing (light, audio, motion, location, and phone use) via CrossCheck; passive collection; engagement: passive sensing | Clinical endpoint: relapse per predefined criteria; 1-week horizon | F2, 0.23; precision, 6.3%; recall, 66.2% | Leave-one-patient-out cross-validation; calibration/uncertainty: no; interpretability: feature-level; closed-loop: no |
| Lin et al., 2023 [174] (United States of America; multicenter community/outpatient psychotherapy; retrospective offline evaluation) | Psychotherapy sessions across multiple diagnoses including schizophrenia; N > 950 sessions; 219,999 candidate actions | Real-time therapy-strategy topic recommendation — Domain: Psychosocial support | Prescriptive policy learning (batch-constrained Q-learning; deep deterministic policy gradient; twin delayed deep deterministic policy gradient; interpretable policy dynamics) | Psychotherapy conversation transcripts (historical text); passive collection; engagement: conversational | Proxy endpoint: topic-recommendation accuracy; same visit (per session step) | Accuracy, 64.24% (overall best model: batch-constrained Q-learning on GOAL objective); Schizophrenia subset accuracy, 31.44% (twin delayed deep deterministic policy gradient) | Hold-out split (95/5); calibration/uncertainty: no; interpretability: local-explanation (policy-trajectory visualization and transition matrices); closed-loop: no; safety guardrails: no |
| Cohen et al., 2023 [162] (United States of America & India; multicenter outpatient psychiatry—3 sites: urban United States of America; urban India; rural India) | Schizophrenia patients and healthy controls; N = 132 (patients, n = 76; healthy controls, n = 56); mean follow-up: 156 days; 20 relapse events | Thirty-day relapse warning from digital phenotyping — Domain: Risk management | Feature engineering-driven supervised learning (multivariate anomaly detection on smartphone streams; logistic regression comparator) | Smartphone passive sensing (GPS, accelerometer, and screen state) + prompted surveys (active); passive collection; engagement: passive sensing | Clinical endpoint: relapse within 30 days; 30-day horizon | Anomaly rate 2.12× higher in the 30 days before relapse; sensitivity, 0.6%; specificity, 99.7% | Internal multi-site analysis with permutation tests; calibration/uncertainty: no; interpretability: none; closed-loop: no |
| Yoo et al., 2024 [163] (United States of America; single-setting community [online]) | Schizophrenia patients active on social media; N = 28 (interviews); prototype test, n = 7; dataset: 51 patients / 52,815 Facebook posts; relapse window: 1 month | One-month relapse-risk classification with patient-facing feedback — Domain: Risk management | Feature engineering-driven supervised learning (one-class support vector machine ensemble; LIWC and behavioral rhythm features) | Facebook text and post metadata; passive collection; engagement: passive sensing | Proxy endpoint: relapse/hospitalization risk; 1 month | Specificity, 71.0%; sensitivity, 38.0%; PPV, 66.0% | Internal cross-validation; calibration/uncertainty: no; interpretability: feature-level; closed-loop: no; user testing: n = 7 |
| McCutcheon et al., 2025 [122] (United States of America; multicenter outpatient research—six centers) | Psychosis spectrum and controls; N = 3370 (schizophrenia, n = 709; schizoaffective, n = 541; bipolar I with psychosis, n = 457; healthy controls, n = 840; first-degree relatives, n = 823); cross-sectional baseline | Cognitive composite score prediction and contribution decomposition — Domain: Symptom monitoring | Feature engineering-driven supervised learning (gradient-boosted trees; SHAP for feature attribution) | Clinical/demographic variables and BACS battery; active collection; engagement: none | Proxy endpoint: BACS composite z-score; same visit | R, 0.72; R², 0.52 | Nested cross-validation (20×5); calibration/uncertainty: no; interpretability: feature-level; closed-loop: no; permutation test: yes (1,000 permutations) |
| Dee et al., 2025 [113] (Europe; multicenter clinical trial) | First-episode psychosis; N = 66 (stage-II completers); baseline predictors of 10-week outcomes | Ten-week symptomatic and functional remission prediction — Domain: Symptom monitoring | Representation learning-driven modeling (recurrent neural network; Monte Carlo dropout) | Clinical scales and demographics; active collection; engagement: none | Clinical endpoints: symptomatic remission (RSWG); functional remission (PSP ≥ 71); 10 weeks | AUC, 0.59 (symptomatic); AUC, 0.67 (functional); accuracy, 79.0% (functional) | Internal hold-out test; calibration/uncertainty: yes; interpretability: none; closed-loop: no; clinician benchmark: yes (n = 24) |
| van Opstal et al., 2025 [114] (The Netherlands & International; multicenter outpatient trial—27 sites) | First-episode psychosis on amisulpride; N = 446 (phase I); n = 93 (phase II); follow-up: 10 weeks | Four- and 10-week treatment-outcome prediction with uncertainty stratification — Domain: Symptom monitoring | Sequence and event-time modeling (recurrent neural network with long short-term memory; fuzzy-logic uncertainty; multi-task learning) | Clinical scales (PANSS, CGI, and PSP) and demographics at baseline and weeks 1–10; active collection; engagement: none | Clinical endpoints: symptomatic remission (Andreasen); clinical global remission (CGI); functional remission (PSP ≥ 71); 4 and 10 weeks | AUC, 0.74 (symptomatic, 10 weeks); 0.73 (global, 10 weeks); 0.72 (functional, 10 weeks) | Leave-one-site-out cross-validation; 10-fold cross-validation; calibration/uncertainty: yes; interpretability: rule-based (fuzzy confidence levels); closed-loop: no |
| Bernstorff et al., 2024 [169] (Denmark; multicenter psychiatric services—five hospitals) | Patients with mental illness in routine care; N = 74,880; prediction horizon: 1–5 years (best 5-year model) | Type 2 diabetes onset risk prediction (1–5-year horizon) — Domain: Risk management | Feature engineering-driven supervised learning (gradient-boosted trees; elastic-net logistic regression comparator) | Electronic health records (labs, diagnoses, and medications); passive collection; engagement: none | Clinical endpoint: incident type 2 diabetes; 5 years | AUC, 0.84 (5-year); sensitivity, 30.6% at 3% alert rate; PPV, 18.1% | Internal hold-out split (85/15); calibration/uncertainty: no; interpretability: none; closed-loop: no; external validation: planned |
| Zakowicz et al., 2025 [123] (Poland; single-center inpatient child/adolescent tertiary unit) | Early-onset psychosis adolescents; N = 27 (formal thought disorder positive, n = 16; negative, n = 11); cross-sectional baseline | Formal thought disorder detection (present vs. absent) — Domain: Symptom monitoring | Feature engineering-driven supervised learning (logistic regression; SVM; random forest; gradient boosting as comparators) | Neuropsychometric tasks (Iowa Gambling Task; Simple Reaction Time); active collection; engagement: none | Proxy endpoint: formal thought disorder classification; same visit | AUC, 0.85; accuracy, 67.3%; F1, 72.0% | Nested 5-fold cross-validation; calibration/uncertainty: no; interpretability: feature-level; closed-loop: no |
| Granato et al., 2025 [124] (Italy & Australia; multicenter outpatient and community clinics) | Schizophrenia-spectrum outpatients and healthy controls; N = 270 (patients, n = 162; healthy controls, n = 108); baseline + simulated longitudinal training | Computational identification of cognitive subtypes and simulated inner-speech training policy — Domain: Symptom monitoring | Prescriptive policy learning (computational cognitive model with deep generative model; recurrent neural network; reinforcement learning) | Neuropsychological tests (WCST); active collection; engagement: none | Proxy endpoint: WCST performance; 1–9 sessions (simulation) | Simulated sessions to reach control-level: RI, 1; MI, 7; SI, 9 | Internal model fit (BIC); calibration/uncertainty: no; interpretability: feature-level (parameter-level); closed-loop: no |
| Yee et al., 2025 [147] (Singapore; single-center outpatient psychiatry) | Adults with schizophrenia and healthy controls; N = 195 (patients, n = 146 with response strata; healthy controls, n = 49); cross-sectional baseline | Treatment response stratification (antipsychotic response vs. resistance; clozapine response vs. clozapine resistance) — Domain: Medication management | Feature engineering-driven supervised learning (support vector machine with radial basis function; recursive feature elimination; SHAP) | Plasma inflammatory proteomics (Olink Target 96 Inflammation Panel); passive collection; engagement: none | Proxy endpoints: treatment-response class labels; same visit | AUC, 0.88 (responders vs. non-responders); balanced accuracy, 78.0%; AUC, 0.78 (clozapine responders vs. resistant) | Train/validation/test split; calibration/uncertainty: no; interpretability: feature-level (SHAP); closed-loop: no |
| Bao et al., 2025 [165] (China, Shanghai; single-center inpatient psychiatry) | Schizophrenia inpatients; N = 2 358 (long-stay ≥120 days, n = 812; short-stay <30 days, n = 1 546); retrospective cohort | Admission-time long-term hospitalization risk prediction (>120 vs. <30 days) — Domain: Risk management | Representation learning-driven modeling (deep neural network with self-attention; large-language-model-assisted feature extraction) | Structured EHR (demographics and labs) and unstructured notes (LLM-extracted behaviors); passive collection; engagement: none | Proxy endpoint: long-term hospitalization during index admission; same visit (classify long-stay vs. short-stay) | AUC, 0.90; accuracy, 81.0% | Temporal hold-out (latest 100 cases); calibration/uncertainty: no; interpretability: feature-level (SHAP); closed-loop: no; bias probes: passed |
| Vellucci et al., 2025 [148] (Italy, Naples; single-center tertiary specialty outpatient clinic) | Stable schizophrenia outpatients; N = 117 (treatment-resistant, n = 59; non-treatment-resistant, n = 58); cross-sectional baseline | Treatment-resistant schizophrenia classification using autism-related symptom severity — Domain: Medication management | Feature engineering-driven supervised learning (multivariate logistic regression) | Clinical scales and cognitive tests (PANSS/PAUSS/PSP/UPSA/NES); active collection; engagement: none | Proxy endpoints: TRS classification and associated cognition/function scores; same visit | AUC, 0.67; accuracy, 65.0%; sensitivity, 57.0%; specificity, 75.0% | Nested 10-fold cross-validation; calibration/uncertainty: partial (Brier score reported); interpretability: feature-level (coefficients); closed-loop: no |
| Martinelli et al., 2025 [135] (Italy; multicenter community and inpatient rehabilitation) | Schizophrenia-spectrum disorders; N = 619 (female, n = 198; male, n = 421); 312 residential-rehabilitation residents + 307 community outpatients; single-visit cross-sectional | Sex-stratified phenotypic subtyping — Domain: Symptom monitoring | Representation learning-driven modeling (unsupervised PCA and Gaussian-mixture clustering) | Standardized clinical scales (BPRS, BNSS, SLOF, WHODAS, WHOQOL-BREF, ZTPI, and Positivity Scale); active collection + sociodemographic data (record-extracted); passive collection; engagement: none | Proxy endpoints: phenotypic subtyping, symptom severity, and functioning; same visit (cross-sectional baseline) | Clustering stability: assignment consistency 99.7% (Cluster 0) / 99.2% (Cluster 1); sex distribution difference: 35.7 vs. 26.9% female (p = 0.027); clinical separations: higher BPRS/BNSS and lower SLOF in Cluster 1 (p < 0.001) | Leave-one-group-out cross-validation across 37 sites; calibration/uncertainty: no; interpretability: feature-level (ANCOVA with Bonferroni correction); closed-loop: no |
| Mishra et al., 2025 [149] (India; single-center outpatient psychiatry [AIIMS Bhubaneswar]) | Adults with schizophrenia, stable on antipsychotics; N = 120 (complete cases, n = 98); baseline and 4-week follow-up | Treatment-response prediction for sleep disturbances (ramelteon) — Domain: Medication management | Feature engineering-driven supervised learning (logistic regression; random forest; extreme gradient boosting; k-nearest neighbors; rPART) | Clinical scales (PSQI, PANSS) + biomarkers (serum melatonin at 14:00/02:00, urinary 6-sulfatoxymelatonin, and serum AANAT); active collection; engagement: none | Proxy endpoint: PSQI improvement ≥20% (responder); 4 weeks | Logistic regression: AUC, 0.79; accuracy, 90.0%; sensitivity, 45.0%; specificity, 93.0% | 10-fold cross-validation (3 repeats) + 30% hold-out test; calibration/uncertainty: no; interpretability: feature-level (varSelRF selection + LR coefficients); closed-loop: no |
| Jean et al., 2025 [115] (United States of America; single-center outpatient psychiatry [Zucker Hillside Hospital]) | Schizophrenia/schizoaffective outpatients with recent acute events; N = 62; smartphone follow-up up to 1 year with thrice-weekly ecological momentary assessment | Same-day/next-day/next-week mental-state forecasting from digital phenotypes — Domain: Symptom monitoring | Sequence and event-time modeling (gradient-boosted trees for ordinal regression; LSTM comparator) | Smartphone passive sensing (accelerometer, GPS, call/SMS logs, ambient audio, and device-use metadata) + EMA self-ratings (active); passive collection; engagement: passive sensing | Proxy endpoints: 10 4-level self-rated mental states; same day, 1 day ahead, 7 days ahead | Ordinal regression: MAMAE, 0.77–1.19 (median 0.96); binary classification: balanced accuracy, 58.0%–73.0% (median 66.0%); XGBoost > LSTM for ordinal (p < 0.001) | Temporal hold-out (last 7 surveys per participant as test); calibration/uncertainty: no; interpretability: none; closed-loop: no |
| Hieronymus et al., 2025 [150] (International; multicenter clinical trials; acute-phase psychiatry [inpatient/outpatient, mixed]) | Schizophrenia/schizoaffective in acute-phase randomized controlled trials; N = 4 634 (train/test across 18 trials) + external validation n = 1 508 (three active-control trials); baseline and 4-week assessments | Four-week symptomatic remission prediction after acute-phase treatment — Domain: Medication management | Feature engineering-driven supervised learning (stacked/ensemble: bagged CART; elastic net; logistic regression; random forest; extreme gradient boosting) | Clinical scales (30-item PANSS baseline), demographics (age and sex), treatment assignment; active collection (secondary reuse of trial databases); engagement: none | Clinical endpoint: symptomatic remission (RSWG consensus); 4 weeks | Balanced accuracy, 60.0%–63.0% (internal, n = 384–4 384 training); external validation balanced accuracy, 68.0%; XGBoost best single model (balanced accuracy, 61.0%–63.0%) | 10-fold cross-validation + Monte-Carlo resampling; external validation (3 independent trials); leave-one-study-out across 18 trials; calibration/uncertainty: no; interpretability: feature-level; closed-loop: no |
| Just et al., 2025 [175] (Germany; single-center community/outpatient psychiatry) | Schizophrenia-spectrum outpatients; N = 50; semi-structured narrative task at a single visit | Automatic speech recognition evaluation for clinical transcripts — Domain: Symptom monitoring | Representation learning-driven modeling (ASR backbone: Whisper; NLP embeddings for downstream feature checks) | Audio recordings of clinician-guided interviews; transcripts (ASR vs. human gold standard); active collection; engagement: none | Proxy endpoint: word/character error rates and NLP feature fidelity; same visit | WER, 0.31; CER, 0.18; substitution rate, 0.16 (insertion, 0.05; deletion, 0.09) | Internal reference benchmarking against human transcripts; calibration/uncertainty: n/a; interpretability: none; closed-loop: no |
| Vidal et al., 2025 [153] (France; nationwide multicenter community/outpatient via administrative claims) | Bipolar and schizophrenia-spectrum adults aged 18–65 and matched controls; N = 87 182 (cases, n = 43 591; matched controls, n = 43 591); 18-month longitudinal prescription sequences | Medication sequence–linked risk prediction for non-psychiatric adverse-event hospitalizations — Domain: Medication management | Sequence and event-time modeling (bidirectional GRU over 18-month sequences; random forest and extreme gradient boosting comparators) | EHR/claims: monthly psychotropic dispensations (87 drugs) and hospitalizations; passive collection; engagement: none | Proxy endpoints: nine non-psychiatric adverse-event hospitalizations; 18 months preceding event | Macro-AUC, 0.60 (biGRU); comparators: XGBoost AUC, 0.56 (random forest ≈ chance); SHAP highlighted dose-dependent contributions (e.g., benzodiazepines, quetiapine, clozapine, lithium, and valproate) | 80/20 hold-out split with 5-fold CV for hyperparameters; calibration/uncertainty: no; interpretability: local-explanation (SHAP); closed-loop: no |
| Liu et al., 2025 [119] (Taiwan; single-center outpatient/day-rehabilitation) | Schizophrenia in stable phase; N = 160; single-session semi-structured clinical interview (si-CAINS) | Automatic assessment of negative-symptom severity (CAINS EXP & MAP) — Domain: Symptom monitoring | Representation learning-driven modeling (video/audio embeddings with random forest ensemble; generative/LLM-augmented scoring: zero-shot rubric-guided text analysis) | Video (upper-body/face), audio (dual-channel lapel), and Mandarin transcripts from clinician-led interviews; active collection; engagement: none | Proxy endpoints: CAINS negative-symptom severity (0–4 items; domain totals EXP/MAP); same visit | EXP: ICC(3,1) = 0.65; MAP: (best LLM) ICC(3,1) = 0.82; MAP weighted κ = 0.77 | 5-fold cross-validation (EXP); zero-shot LLMs for MAP; calibration/uncertainty: no; interpretability: local-explanation (LLM rationales) + feature-level; closed-loop: no; safety guardrails: no |
α cronbach’s alpha, 1D-CNN one-dimensional convolutional neural network, ACC accuracy, AE autoencoder, AI artificial intelligence, ARIMA autoregressive integrated moving average, ASR automatic speech recognition, AUC/AUROC area under the curve/area under the receiver operating characteristic curve, AUPRC/PR-AUC area under the precision–recall curve, BCQ/DDPG/TD3 batch-constrained q-learning/deep deterministic policy gradient/twin delayed DDPG, BDI beck depression inventory, BERT/SBERT/BioBERT bidirectional encoder representations from transformers/sentence-BERT/biomedical BERT, BLERT bell-lysaker emotion recognition task, biGRU bidirectional GRU, Brier Brier score, BNSS (EXP/MAP) brief negative symptom scale (Expressivity / Motivation and Pleasure), BPRS brief psychiatric rating scale, CAINS (EXP/MAP) clinical assessment interview for negative symptoms (Expressivity / Motivation and Pleasure), CATIE clinical antipsychotic trials of intervention effectiveness, CDSS calgary depression scale for schizophrenia, CGI/CGI-S/CGI-SCH clinical global impression/severity/schizophrenia, C-index concordance index, CNN-AE/CVAE/Transformer-AE/GRU-AE convolutional/conditional-VAE/transformer/GRU autoencoders, Cohen’s d standardized mean difference, CV/K-fold/LOOCV/LOSO cross-validation/k-fold/leave-one-out/leave-one-subject-out, DAI-10 drug attitude inventory-10, DNN deep neural network, DTW dynamic time warping, EA emotional alignment (derived metric in emotion-processing tasks), ECG electrocardiogram, EHR electronic health records, ELASTIC NET elastic-net regularization, EMA ecological momentary assessment, ER-40 penn emotion recognition test-40, F1/F2 F-measure with β = 1/2, FAST functioning assessment short test, GAF global assessment of functioning, GBRT gradient boosted regression trees, GMM gaussian mixture model, GP gaussian process, GPS global positioning system, HAM-A hamilton anxiety rating scale, HAMD-17 17-item hamilton depression rating scale, HAR human activity
recognition, HMM/CHMM (cluster) hidden markov model, HR/HRV heart rate/heart rate variability, IADL instrumental activities of daily living (lawton), ICC intraclass correlation coefficient, IPII indiana psychiatric illness interview, IQR interquartile range, ITR individualized treatment rule, J48/C4.5 quinlan’s decision tree (WEKA J48 implementation of C4.5), JSQLS japanese schizophrenia quality of life scale, κ (weighted) kappa, k-means k-means clustering, LAI long-acting injectable (antipsychotics), LASSO least absolute shrinkage and selection operator, LIWC linguistic inquiry and word count, LLM large language model, LR logistic regression, MAE/RMSE/NRMSE mean absolute/root mean squared/normalized RMSE, MADRS montgomery-asberg depression rating scale, MATRICS/MCCB measurement and treatment research to improve cognition in schizophrenia/MATRICS consensus cognitive battery, mDOT modified directly observed therapy, ML machine learning, MMSE mini-mental state examination, mRMR minimum redundancy maximum relevance, MTL/STL multi-task/single-task learning, NES neurological evaluation scale, OLS ordinary least squares, OR/IRR odds ratio/incidence rate ratio, OSCARS observable social cognition: a rating scale, OSQ oviedo sleep questionnaire, PANSS positive and negative syndrome scale, PAS premorbid adjustment scale, PCA principal component analysis, PPV positive predictive value, PSO particle swarm optimization, PSP personal and social performance (scale), PSQI pittsburgh sleep quality index, QLS quality of life scale, RCT randomized controlled trial, RBF radial basis function, r pearson correlation, R² coefficient of determination, RF random forest, RFE/RFECV recursive feature elimination (with cross-validation), RNN/LSTM/GRU/Seq2Seq recurrent/long short-term memory/gated recurrent unit/sequence-to-sequence models, RSWG remission in schizophrenia working group (criteria), SAPS/SANS scale for the assessment of positive/negative symptoms, SCL-90-R symptom
checklist-90-revised, SDSS social disability screening schedule, SEN/SPE sensitivity/specificity, SHAP SHapley Additive exPlanations, SOFAS social and occupational functioning assessment scale, SVM support vector machine, SVR support vector regression, Super Learner/TMLE cross-validated ensemble (“Super Learner”)/targeted minimum loss-based estimation, SWN-K subjective well-being under neuroleptic treatment—short form (Korean), TF-IDF term frequency–inverse document frequency, TPOT tree-based pipeline optimization tool (AutoML), UPSA UCSD performance-based skills assessment, VR virtual reality, WCST wisconsin card sorting test, WHODAS who disability assessment schedule, WHOQOL-BREF WHO quality of life—BREF, WMD word mover’s distance, word2vec/GloVe word embeddings, XGBoost eXtreme gradient boosting, YMRS young mania rating scale, ZTPI zimbardo time perspective inventory.
Results
Study selection
The systematic search and selection process is illustrated in Fig. 1. Two search rounds yielded a combined total of 627 records after deduplication (Round 1, 561 records from January 2012 to January 2025; Round 2, 66 records from January to October 2025). In Round 1, following title/abstract screening and full-text review of 561 records, 89 studies met the eligibility criteria. The 472 excluded records were ineligible for the following reasons: (i) diagnostics-only focus without rehabilitation linkage (e.g., case–control classifiers discriminating schizophrenia from healthy controls; 198 studies, 41.9%); (ii) acute-phase treatment without functional/community outcomes (96 studies, 20.3%); (iii) pathophysiology/biomarker discovery without rehabilitation management implications (74 studies, 15.7%); (iv) non-original research (i.e., reviews, editorials, protocols, and conference abstracts; 58 studies, 12.3%); (v) exclusive reliance on non-deployable modalities (e.g., fMRI-only or research-grade electroencephalogram data; 31 studies, 6.6%); and (vi) custodial/forensic settings without community pathways (15 studies, 3.2%).
In Round 2, 66 additional records were identified. Following the same screening process as in Round 1, 18 studies met the eligibility criteria. The 48 excluded records showed a distribution of exclusion reasons similar to that in Round 1: diagnostics-only focus (19 studies, 39.6%), acute-phase treatment only (10 studies, 20.8%), pathophysiology/biomarkers (8 studies, 16.7%), non-original research (6 studies, 12.5%), non-deployable modalities (3 studies, 6.3%), and custodial/forensic settings (2 studies, 4.2%).
All 107 studies (89 from Round 1 and 18 from Round 2) underwent an additional operationalization review. Among these, 42 studies employed cross-sectional or case–control designs requiring stricter operationalization criteria, and 24 were excluded because they (i) lacked confirmed clinical diagnoses or community-deployable methods (9 studies, 37.5%) or (ii) demonstrated no rehabilitation-orientation signals despite initial inclusion (15 studies, 62.5%). This secondary review, conducted in November 2025, resulted in a final cohort of 83 studies spanning January 2012 to October 2025.
Study characteristics
The final cohort comprised 83 studies published between 2012 and October 2025 (Fig. 2). Publication trends demonstrate a marked acceleration in recent years, as early studies from 2012 to 2019 represented only 9.6% (8/83) of the corpus, whereas 2020 to 2023 accounted for 49.4% (41/83), and 2024 to October 2025 contributed 41.0% (34/83).
Fig. 2. Publication trends by year.
Annual counts of included studies (N = 83) from 2012 to October 2025. Bars show annual counts with numeric labels; the superimposed line depicts the temporal trend. Data for 2025 include publications through October 31, 2025.
Studies originated predominantly from high-income countries (Fig. 3). The United States of America contributed the largest share (31/83 studies, 37.3%), followed by China (including Taiwan and Hong Kong; 10/83, 12.0%) and the United Kingdom (5/83, 6.0%). Italy also accounted for 5/83 studies (6.0%). Additional contributions came from South Korea (4/83, 4.8%), France (4/83, 4.8%), Canada (3/83, 3.6%), Spain (3/83, 3.6%), the Netherlands (3/83, 3.6%), Germany (2/83, 2.4%), Poland (2/83, 2.4%), Greece (2/83, 2.4%), and Singapore (2/83, 2.4%). Japan, Turkey, India, and Denmark contributed one study each, and three multicountry or regional trials were coded as International or European consortia.
Fig. 3. Global distribution.
Bubble map of included studies by primary country (N = 83). Circle size and the numeric label denote the number of studies. The China group includes mainland China (n = 4), Taiwan (n = 5), and Hong Kong SAR (n = 1). Multinational or regional consortia (International/Europe, n = 3) are not assigned to a single country on the map.
Most studies were conducted in community or outpatient settings (57 studies, 68.7%), with smaller proportions in inpatient settings (10 studies, 12.0%), mixed settings leveraging nationwide claims or health system data (4 studies, 4.8%), or settings not clearly reported (12 studies, 14.5%). Approximately half of the studies employed multicenter designs (42 studies, 50.6%), with single-center studies accounting for 47.0% (39 studies) and nationwide or health-system analyses for 2.4% (2 studies).
Population and sample size
All 83 studies provided sample size information (range, 5–87 182 participants; median, 160 participants). Regarding sample size distribution, 4.8% (4 studies) enrolled fewer than 20 participants, 33.7% (28 studies) enrolled 20–100 participants, 38.6% (32 studies) enrolled 101–500 participants, 7.2% (6 studies) enrolled 501–1 000 participants, and 15.7% (13 studies) enrolled more than 1 000 participants (Fig. 4). Studies with smaller sample sizes (<100 participants) accounted for 38.6% (32 studies) of the corpus, whereas those with 100 or more participants represented 61.4% (51 studies).
Fig. 4. Sample size distribution.

Sample sizes ranged from 5 to 87 182 (median 160). Overall, 32/83 (38.6%) studies enrolled <100 participants and 51/83 (61.4%) enrolled ≥100, with darker shades indicating larger sample-size categories.
Most studies focused on patients with schizophrenia in a clinically stable or chronic phase (45 studies, 54.2%), followed by mixed populations or other diagnostic categories (13 studies, 15.7%), acute inpatients or hospitalized patients (10 studies, 12.0%), patients with first-episode or early psychosis (9 studies, 10.8%), recently discharged patients (5 studies, 6.0%), and treatment-resistant schizophrenia (1 study, 1.2%). Twenty studies (24.1%) included healthy or matched control groups, whereas the remaining 63 (75.9%) exclusively examined patient populations. Several studies incorporated individuals with schizoaffective disorder, bipolar disorder with psychotic features, or broader serious mental illness categories alongside patients with schizophrenia (Fig. 5).
Fig. 5. Clinical population categories.

Categories (mutually exclusive) are: stable/chronic schizophrenia (45, 54.2%), mixed or other diagnoses (13, 15.7%), acute inpatients/hospitalized (10, 12.0%), first‑episode/early psychosis (9, 10.8%), recently discharged (5, 6.0%), and treatment‑resistant schizophrenia (1, 1.2%). Abbreviation: TRS, treatment‑resistant schizophrenia.
Among the 83 studies, 46 (55.4%) were longitudinal studies with specified follow-up or repeated monitoring, while 37 (44.6%) employed cross-sectional or single time-point assessments. For longitudinal studies that explicitly reported follow-up durations (42 studies), follow-up lengths varied considerably as follows: 1 study (2.4%) had a follow-up period of less than 1 week, 4 (9.5%) ranged from 1 week to 1 month, 7 (16.7%) ranged from 1 to 3 months, 7 (16.7%) ranged from 3 to 6 months, 16 (38.1%) ranged from 6 to 12 months, and 7 (16.7%) extended beyond one year (up to 12–17 years).
Data sources and user engagement patterns
The 83 included studies used diverse data-collection methodologies. Active data collection was predominant (56.6%; 47 studies), acquiring data through structured clinical interviews, standardized symptom scales, cognitive task assessments, or patient self-reports. Passive collection comprised 38.6% (32 studies), leveraging sensor-based devices, EHR systems, or social media platforms to capture patient behavioral data. Moreover, 4.8% (4 studies) employed combined approaches integrating both active and passive collection methods to achieve data complementarity.
Regarding user engagement, most studies adopted no-engagement designs (68.7%; 57 studies), wherein data were collected without providing real-time feedback or interventions to patients. Passive sensing constituted 21.7% (18 studies), continuously monitoring patients (e.g., physiological indicators, activity patterns, and behavioral characteristics) via smartphones/wearables. Conversational engagement (e.g., natural language-processing-driven virtual assistants or therapeutic dialogue systems) and nudge-based engagement (e.g., medication reminders or symptom self-assessment prompts through mobile applications) each accounted for 4.8% (4 studies).
Regarding data modality, speech and text data were the most prevalent (22.9%; 19 studies), comprising clinical interview transcripts and voice recordings analyzed with natural-language-processing techniques. EHRs served as data sources in 14 studies (16.9%), encompassing structured diagnostic codes, prescription information, and unstructured clinical narrative notes. Smartphone-based multimodal sensing ranked next, with 12 studies (14.5%) capturing patients’ mobility trajectories, social interactions, sleep patterns, and screen use behaviors. Wearable device data were relatively scarce, adopted by four studies (4.8%), including wrist-worn accelerometers, smartwatches, or heart-rate monitoring devices.
Outcome measures and temporal horizons
The included studies demonstrated marked heterogeneity in outcome selection and temporal horizons. Proxy endpoints predominated across the literature (e.g., diagnostic classification accuracy, symptom scale scores, medication adherence rates, treatment-response indicators, and social functioning assessments), appearing in 67 studies (80.7%), whereas clinical endpoints (e.g., relapse events, hospital readmissions, symptomatic remission, functional remission, and long-term mortality) were evaluated in 21 studies (25.3%). More specifically, 62 studies (74.7%) exclusively employed proxy endpoints, 16 studies (19.3%) focused solely on clinical endpoints, and 5 studies (6.0%) incorporated both types.
Regarding temporal horizons, concurrent models utilizing data from a single assessment time point represented the most common approach, accounting for 34 studies (41.0%). Short-term investigations of up to three months were employed in 24 studies (28.9%), typically targeting symptom fluctuations, early relapse detection, or medication adherence monitoring. Medium-term investigations spanning 3–12 months represented 16 studies (19.3%), focusing on treatment-response trajectories, functional outcomes, and sustained adherence patterns. Long-term investigations extending beyond 12 months comprised 9 studies (10.8%), addressing outcomes such as multi-year relapse risk, treatment-resistance development, mortality prediction, and chronic disease incidence.
Application domains and task landscape
Across the five rehabilitation management domains that appeared in the included studies, symptom monitoring emerged as the predominant application area (Fig. 6), encompassing 48 studies (57.8%). Symptom monitoring tasks clustered into seven distinct task categories, as follows: diagnostic classification (9 studies) leveraged speech, language, or multimodal features to distinguish patients with schizophrenia from healthy controls [94–102]; symptom scale prediction (14 studies) employed machine learning to estimate Positive and Negative Syndrome Scale, Brief Psychiatric Rating Scale, or ecological momentary assessment scores [102–115]; negative symptom quantification (4 studies) automated the assessment of blunted affect, alogia, anhedonia, avolition, and asociality using wearable sensors or speech analysis [116–119]; cognitive function evaluation (5 studies) detected formal thought disorder or predicted memory performance [120–124]; social functioning assessment (4 studies) utilized smartphone GPS, passive sensing, or facial affect recognition to estimate social isolation, loneliness, and interpersonal competence [125–128]; quality of life prediction (2 studies) estimated subjective well-being or functional outcomes [129, 130]; clinical phenotyping (7 studies) was used to delineate prognostic subgroups or disease stages, including subtype classification [99, 124, 131–135]. Task categories are not mutually exclusive, and thus the counts may sum to more than the number of studies per domain.
Fig. 6. Domains and task categories of AI applications in schizophrenia rehabilitation.
(a) Distribution of the 83 included studies across rehabilitation management domains: symptom monitoring (48 studies), medication management (19), risk management (16), functional training (1), and psychosocial support (3). (b) Symptom‑monitoring task categories among the 48 studies in this domain: diagnostic classification (9 studies), symptom scale prediction (14), negative symptom quantification (4), cognitive function evaluation (5), social functioning assessment (4), quality‑of‑life prediction (2), and clinical phenotyping (7). (c) Medication‑management task categories among 19 studies: adherence monitoring and prediction (7 studies), treatment response and resistance stratification (8), dosage optimization and toxicity prediction (2), pharmacovigilance for non‑psychiatric adverse events (2), and individualized drug selection (1). (d) Risk‑management task categories among 16 studies: relapse prediction (9 studies), hospitalization risk assessment (3), violence‑related classification (3), comorbidity risk prediction (1), and mortality prediction (1). Bars represent the number of studies per domain or task category; domains and task categories are not mutually exclusive, and individual studies can contribute to more than one category.
Medication management constituted the second-largest domain with 19 studies (22.9%), comprising five core tasks: adherence monitoring and prediction (7 studies) used smartphone-based visual verification, pharmacokinetic modeling, or claims data to forecast treatment continuation [136–142]; treatment response and resistance stratification (8 studies) predicted symptomatic remission, treatment-resistant schizophrenia status, or clozapine responsiveness [143–150]; dosage optimization and toxicity prediction (2 studies) recommended therapeutic dose ranges or forecasted adverse metabolic effects [151, 152]; pharmacovigilance for non-psychiatric adverse events (2 studies) included monitoring prolactin elevation and medication-sequence–linked hospitalization risks [152, 153]; and individualized drug selection (1 study) generated personalized treatment rules based on baseline characteristics [154].
Risk management applications appeared in 16 studies (19.3%), comprising five task categories: relapse prediction (9 studies) developed early warning systems for psychotic exacerbation with prediction windows ranging from one week to two years using digital phenotyping, Internet search behavior, or smartphone passive sensing [155–163]; hospitalization risk assessment (3 studies) forecasted readmissions or prolonged inpatient stays [156, 164, 165]; violence-related classification (3 studies) covered aggression-risk prediction or victimization event detection [166–168]; comorbidity risk prediction (1 study) estimated type 2 diabetes onset [169]; and mortality prediction (1 study) modeled all-cause death using EHR data [170].
For functional training, only one study (1.2%) identified response trajectories to social cognition training and predicted individualized treatment benefits [171]. Psychosocial support interventions comprised three studies (3.6%): one analyzed therapeutic dialogue patterns in virtual-reality avatar therapy [172], one predicted optimal referral pathways to cognitive behavioral therapy or vocational training [173], and one provided policy recommendation prototypes using offline reinforcement learning [174].
Technological approaches and model architectures
The included studies employed four primary technological paradigms: feature engineering-driven supervised learning (53 studies, 63.9%), representation learning-driven modeling (20 studies, 24.1%), sequence and event-time modeling (7 studies, 8.4%), and prescriptive policy learning (3 studies, 3.6%).
For feature engineering-driven supervised learning, random forest was the most frequently adopted algorithm (24 studies), often used for intrinsic feature-importance profiling and ensemble-based generalization [95, 98–101, 103, 117, 123, 126, 127, 129, 130, 140, 143, 146, 149, 150, 159, 161, 164, 166, 168, 171, 173]. Gradient boosting variants (e.g., XGBoost and gradient boosting machines; 18 studies) were commonly applied to structured tabular data and high-dimensional feature spaces [103, 106, 108, 117, 122, 123, 126, 130, 139, 140, 143, 145, 149, 150, 152, 168, 169, 173]. Support vector machines (18 studies) were usually applied in small-sample settings and frequently used for speech-acoustic classification, but also appeared in higher-dimensional risk-prediction pipelines [94, 96, 104, 105, 109, 110, 112, 123, 127, 134, 136, 146, 147, 159, 163, 164, 166, 168]. Logistic regression (15 studies) was often used for clinical nomogram construction or as a baseline comparator [96, 123, 130, 139–141, 146, 148–150, 162, 166, 168, 169, 171]. Regularization techniques, such as least absolute shrinkage and selection operator/elastic net (12 studies), were implemented to select high-dimensional predictors and mitigate overfitting [109, 112, 116, 117, 128, 141–143, 150, 166, 168, 169].
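To make this paradigm concrete, the following is a minimal illustrative sketch, not drawn from any included study: an elastic-net logistic regression and a random forest compared by stratified cross-validated AUC on a synthetic stand-in for a tabular clinical feature matrix. All data, feature counts, and hyperparameters are illustrative assumptions.

```python
# Hedged sketch of feature engineering-driven supervised learning (synthetic data;
# not any reviewed study's pipeline): elastic-net logistic regression vs. random
# forest, scored by stratified 5-fold cross-validated AUC.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a clinical feature table (e.g., scale items, labs).
X, y = make_classification(n_samples=300, n_features=40, n_informative=8,
                           random_state=0)

models = {
    # Elastic-net regularization mitigates overfitting among correlated predictors.
    "elastic_net_lr": make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="elasticnet", solver="saga",
                           l1_ratio=0.5, C=1.0, max_iter=5000)),
    # Random forest offers intrinsic feature-importance profiling.
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auc = {name: cross_val_score(m, X, y, cv=cv, scoring="roc_auc").mean()
       for name, m in models.items()}
print(auc)
```

In practice the included studies layered feature selection (e.g., RFE, LASSO) and hold-out or nested validation on top of this basic pattern.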
Among representation learning methods, transformer architectures (4 studies; e.g., BERT/BioBERT and Whisper) were used to process clinical narrative text, therapy-dialogue content, and automatic speech recognition outputs [120, 165, 167, 175]. Convolutional neural networks (4 studies) were applied to model visual inputs for medication adherence verification, painting-based symptom assessment, and accelerometry-based human-activity recognition [102, 107, 118, 137]. Recurrent architectures (3 studies; e.g., long short-term memory/gated recurrent unit/vanilla recurrent neural networks) were used to capture temporal dependencies in smartphone sensor streams, ecological momentary assessment trajectories, and multimodal relapse predictions [107, 113, 157]. Autoencoder frameworks (3 studies) were used for unsupervised anomaly detection in relapse early warning systems and for dimensionality reduction in mortality risk modeling [155, 157, 170]. Two studies reported LLM-augmented pipelines for zero-shot symptom severity scoring or feature extraction from unstructured EHRs [119, 165].
In sequence and event-time modeling, hidden Markov models (1 study) were used to identify latent symptom state transitions from ecological momentary assessment sequences [132]. Cox proportional-hazards regression and random survival forests (1 study) were applied to model time-to-relapse following medication discontinuation [158]. AutoRegressive Integrated Moving Average (ARIMA) models and Gaussian-process anomaly detection (1 study) were implemented to model irregular temporal patterns in relapse prediction systems [156]. Trajectory clustering with fuzzy methods (1 study) was used to stratify first-episode psychosis patients into prognostic phenotypes [133]. Recurrent networks with long short-term memory or gated recurrent unit cells (3 studies) were used to forecast multi-day mental state fluctuations from digital phenotyping data [114, 115, 153].
For prescriptive policy learning, one study applied targeted minimum loss-based individualized treatment rules to recommend optimal antipsychotic selection using baseline clinical features [154]. Two studies deployed offline reinforcement learning (i.e., batch-constrained Q-learning and deep deterministic policy gradient algorithms) for psychotherapy strategy recommendations and simulated inner speech training policies in cognitive remediation contexts [124, 174].
Model performance and predictive efficacy
Model performance metrics varied substantially across task categories. To avoid inappropriate cross-domain comparisons, metrics are reported separately for classification, regression, event-time, and early warning task applications. For classification tasks, 38 studies reported AUC metrics [94–101, 104, 107, 112–114, 117, 120, 123, 127, 130, 139–141, 143–149, 153, 155, 159, 164–166, 168–170, 173], the median of which was 0.79 (interquartile range [IQR]: 0.71–0.86) with a range of 0.59–1.00. The median accuracy was 79.0% (IQR: 66.2–86.9%), ranging from 31.4% to 99.0%. Four symptom monitoring studies achieved AUC ≥ 0.90, including schizophrenia vs. healthy control discrimination (AUC = 0.99) [94], negative symptom severity classification (AUC = 1.00) [104], diagnostic classification using symptom subtyping (AUC = 0.92) [99], and schizophrenia classification using temporal features (AUC = 0.95) [101]. These models typically drew on feature engineering from speech acoustics or multimodal behavioral markers; in risk-management applications, deep neural architectures with self-attention also achieved AUC ≈ 0.90 (e.g., long-stay hospitalization prediction [165]).
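As a simple illustration of how the pooled summary statistics above are derived, the sketch below computes a median and interquartile range from a list of study-level AUCs. The values here are hypothetical placeholders, not the 38 reported study AUCs.

```python
from statistics import median, quantiles

# Hypothetical study-level AUCs standing in for the reported values;
# the actual per-study AUCs come from the cited reports.
aucs = [0.59, 0.65, 0.71, 0.72, 0.75, 0.79, 0.80,
        0.83, 0.86, 0.92, 0.95, 1.00]

med = median(aucs)
# "inclusive" treats the listed studies as the whole population of interest
q1, _, q3 = quantiles(aucs, n=4, method="inclusive")
print(f"median AUC = {med:.2f}, IQR = {q1:.2f}-{q3:.2f}")
```

Substituting the actual extracted AUCs would reproduce the pooled median of 0.79 (IQR 0.71–0.86) reported above.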
Regarding regression tasks, studies predicted continuous clinical scale scores, symptom trajectories, or functional outcomes using diverse error metrics. Among studies reporting mean absolute error [103, 106, 108, 126, 152, 157], the median was 2.17 (range, 0.05–7.79) across different measurement scales, including Brief Psychiatric Rating Scale subscales, social functioning dimensions, and prolactin concentrations. Across five studies that reported absolute root mean squared error values for clinical scales [102, 110, 129, 151, 152], the median root mean squared error was 13.30 (range, 0.06–85.23) across quality of life indices, Positive and Negative Syndrome Scale total scores, and pharmacokinetic predictions; an additional study reported a relative root mean squared error of 12% on 0–3 ecological momentary assessment symptom scales [109]. The median R² was 0.63 (range, 0.14–0.92) [107, 122, 128, 151], reaching 0.92 in clozapine pharmacokinetic dose-concentration modeling [151] and 0.74 in symptom severity prediction from multimodal wearable data streams [107]. Pearson correlation coefficients for symptom scale predictions were generally moderate to high, typically approximately 0.4–0.9 [95, 96, 103, 105, 106, 121, 122, 126, 176], with some Positive and Negative Syndrome Scale reconstruction models achieving very high correlations (up to r ≈ 0.99) [111]. Because these metrics span heterogeneous scales and the counts reflect only the studies reporting each metric, they should be interpreted with caution.
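The error metrics summarized above can be made concrete with plain-Python definitions. The score vectors below are hypothetical (loosely PANSS-style totals) and only illustrate how mean absolute error, root mean squared error, and R² relate to the same set of predictions.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: variance explained relative to the mean."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical observed vs. predicted clinical scale scores
y_true = [60, 72, 55, 80, 65]
y_pred = [58, 70, 60, 76, 66]
print(mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```

Because MAE and RMSE are expressed in the units of the underlying scale, they are only comparable across studies that use the same instrument, which is why the review reports them per metric and per scale family.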
For event-time modeling, two studies reported a C-index ranging from 0.71 to 0.78, covering post-discontinuation relapse [158] and all-cause mortality prediction [170]. In both studies, event-time models outperformed baseline-only comparators; for instance, in Brandt et al. [158], the C-index improved from 0.60 for baseline-only covariates to 0.70–0.71 for regularized Cox and random survival forest models.
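Harrell's C-index, the concordance metric reported by these event-time studies, can be sketched in a few lines. This is a generic textbook implementation evaluated on toy data, not code from the cited studies.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs in which the higher-risk
    subject experiences the event earlier. Ties in risk count as 0.5.
    events[i] is 1 for an observed event, 0 for censoring."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # a pair is comparable if subject i has an observed event
            # strictly before subject j's event or censoring time
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Toy cohort with perfectly ranked risk scores
c = concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.3, 0.1])
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which situates the reported 0.71–0.78 as moderate discriminative ability.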
Among early warning systems, six studies implemented relapse early warning models (mostly evaluated offline/retrospectively) with prediction horizons ranging from 1 week to 30 days [155, 157, 159, 161–163]. The median sensitivity was 31.5% (range, 0.6–66.2%), and the median specificity was 88.0% (range, 71.0–99.7%). One system achieved 66.2% recall at 6.3% precision using balanced random forests on smartphone-sensor clusters [161]. Another attained 99.7% specificity with 0.6% sensitivity via one-class support vector machines [162]. Anomaly-rate increases of approximately 108% [157] and 112% (×2.12) [162] were observed in pre-relapse windows. Three- to four-week prediction windows were most common (overall range, 1–30 days).
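The low-sensitivity/high-specificity profile reported for these early warning systems follows directly from the confusion matrix at a chosen alert threshold. A minimal sketch, using hypothetical relapse labels and risk scores rather than data from any included study:

```python
def alert_metrics(y_true, risk, threshold):
    """Sensitivity and specificity of a relapse alert fired when the
    risk score meets or exceeds the threshold."""
    tp = sum(1 for y, r in zip(y_true, risk) if y == 1 and r >= threshold)
    fn = sum(1 for y, r in zip(y_true, risk) if y == 1 and r < threshold)
    tn = sum(1 for y, r in zip(y_true, risk) if y == 0 and r < threshold)
    fp = sum(1 for y, r in zip(y_true, risk) if y == 0 and r >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical patient-weeks: 3 pre-relapse (1) and 7 stable (0)
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
risk   = [0.9, 0.4, 0.3, 0.6, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1]
sens, spec = alert_metrics(y_true, risk, threshold=0.5)
```

With a conservative threshold, most pre-relapse windows are missed (low sensitivity) while false alarms stay rare (high specificity), mirroring the 31.5%/88.0% pattern and motivating the "upstream triage signal" framing used in the Discussion.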
Validation rigor, interpretability, and implementation readiness
Regarding validation protocols, most studies relied on cross-validation (e.g., k-fold, leave-one-subject-out, and Monte Carlo), and a subset used hold-out splits. Four studies reported external or cross-dataset evaluations, including independent cohort or cross-trial datasets and leave-one-site-out or temporal holdout designs [112, 114, 131, 150]. One study achieved 68.0% balanced accuracy on external-validation data spanning three independent trials [150]. For calibration and uncertainty quantification, five studies reported some form of probability calibration or predictive uncertainty handling, using Monte Carlo dropout [113], fuzzy-logic confidence stratification for uncertainty-aware decisions [114], and Brier scores and/or calibration plots, sometimes combined with bootstrap internal validation [140, 141, 144]. Most other studies provided no such information.
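For readers unfamiliar with the calibration metrics mentioned above, the Brier score and a binned reliability summary (the basis of a calibration plot) can be sketched as follows. These are generic definitions, not any included study's pipeline.

```python
def brier_score(y_true, p):
    """Mean squared difference between predicted probability and outcome;
    0 is perfect, and lower is better."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y_true, p)) / len(y_true)

def reliability_bins(y_true, p, n_bins=5):
    """Mean predicted probability vs. observed event rate per bin;
    a well-calibrated model has the two roughly equal in every bin."""
    bins = []
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        members = [(yi, pi) for yi, pi in zip(y_true, p)
                   if lo <= pi < hi or (b == n_bins - 1 and pi == 1.0)]
        if members:
            mean_p = sum(pi for _, pi in members) / len(members)
            obs = sum(yi for yi, _ in members) / len(members)
            bins.append((mean_p, obs))
    return bins
```

Plotting the `(mean_p, obs)` pairs against the diagonal yields the calibration curve reported by the minority of studies cited above; a model can have a high AUC yet systematically over- or under-state risk, which is why the review treats calibration reporting as a distinct quality signal.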
Interpretability mechanisms were relatively common; the most frequently reported were feature-level approaches (e.g., random forest importance, Shapley additive explanations, permutation importance, and least absolute shrinkage and selection operator coefficients). Local or case-level explanation methods (5 studies) provided instance-specific rationales using Shapley additive explanations, counterfactuals, policy-trajectory visualizations, or LLM-generated justifications [119, 145, 153, 170, 174]. Rule-based interpretability (3 studies) employed decision trees or fuzzy-logic rule sets [97, 114, 160]. A subset of studies (14/83, 16.9%), often applying deep or computer-vision models, reported no explicit interpretability mechanisms [113, 115, 118, 120, 125, 138, 139, 151, 162, 164, 167, 169, 175, 176].
For closed-loop implementation, three studies documented implementations in which AI predictions triggered direct clinical actions: weekly symptom forecasts that automatically triggered clinical outreach [106]; a randomized controlled trial in which AI-based adherence verification with real-time alerts improved adherence rates (94.7 vs. 64.4%; p < 0.001) and symptom outcomes [138]; and a partial closed loop in which computer-vision-flagged medication behaviors prompted counselor-mediated interventions [137]. Most studies operated in recognition-only mode, generating predictions without automated action pathways.
Regarding safety guardrails and quality signals, none of the studies employing reinforcement learning or LLMs reported safety constraints [119, 174]. Supplementary quality signals appeared in one randomized controlled trial [138], one clinician benchmark (n = 24 raters) [113], one user-testing study (n = 7) [163], and one algorithmic-bias probe across demographic subgroups [165].
Discussion
This systematic scoping review adopted a rehabilitation- rather than diagnosis-centered approach, focusing on the actionable value chain (i.e., from monitoring to decision support, intervention, follow-up, and audit) of AI in community and long-term schizophrenia rehabilitation management settings. This value chain framework reflects established measurement-based care principles [14, 15] and implementation science models for digital mental health [177, 178], wherein continuous monitoring informs clinical decisions, triggers timely interventions, enables systematic follow-up, and supports quality auditing cycles. Notably, the publication volume in this area has increased steeply in recent years, underscoring both the timeliness of this evidence base and the immaturity of its implementation layer. We also explicitly delineated the boundary between “rehabilitation” and “pure monitoring/prediction” in our methods, such that the only studies included were those in which AI functions demonstrated a clear pathway to rehabilitation goals (e.g., functional improvement, relapse prevention, medication management, or social participation).
Based on the 83 included studies published between 2012 and October 2025, the AI literature on schizophrenia rehabilitation management appears to be developing rapidly (more than 90% of the studies were published from 2020 onwards), while its implementation layer remains immature. Most studies engaged in symptom monitoring (57.8%), medication management (22.9%), and risk management (19.3%), whereas studies focused on functional training and psychosocial support (i.e., the areas most proximal to rehabilitation outcomes) were notably scarce (1.2 and 3.6%, respectively). The evidence structure likewise skewed toward “identification and characterization”: surrogate endpoints dominated (67/83, 80.7%), external validation was rare (4/83, 4.8%), calibration and uncertainty reporting were insufficient (5/83, 6.0%), and closed-loop implementation was uncommon (3/83, 3.6%). Methodologically, active data collection predominated, yet 68.7% of systems adopted a “no-engagement” design without real-time feedback or intervention. Conversational and nudge-based systems together accounted for <10% of the corpus, and speech/text, EHR, and smartphone sensing were the dominant data modalities, with wearable-only systems remaining uncommon. This indicates that most systems can still only discriminate and have yet to make the critical transition toward executable, auditable, and sustainable schizophrenia rehabilitation closed loops.
These application gaps reflect a bottleneck in the rehabilitation value chain. Functional training and psychosocial support studies require long-term, repeated, and contextualized measurement of behavioral change with actionable labels [84, 179], because these domains depend on high-quality process data and granular task decomposition. Given such implementation complexities, it may be unsurprising that both categories are markedly underrepresented in the current ecosystem. In the mental health literature, cross-diagnostic digital interventions and just-in-time adaptive interventions provide methodological inspiration for “moving from identification to action” [180–182]. Based on our findings, we suggest that translating the current evidence into stable benefits within schizophrenia contexts will require reconstructing the data and intervention units around rehabilitation goals, ensuring that algorithmic outputs correspond one-to-one with executable action scripts [183, 184].
At the aggregate performance level, and without conflating tasks, classification models yielded an overall median AUC of 0.79 and accuracy of 79%, with a minority of symptom monitoring studies (predominantly relying on acoustic voice features, multimodal behavioral markers, or self-attention architectures) achieving AUC ≥ 0.90. Relapse prediction models exhibited the typical profile of low sensitivity and high specificity (median sensitivity 31.5%; specificity 88%), suggesting that they are better suited as upstream triage signals than as standalone decision gates. Two studies showed approximately doubled anomaly rates within the prediction windows [157, 162], although overall capture rates remained limited. For schizophrenia rehabilitation practice, the significance of performance metrics hinges on whether they deliver quantifiable data that promote early engagement, reduce relapse, and enhance participation [185]. Subsequent research should therefore link surrogate endpoints with clinical endpoints (e.g., relapse, rehospitalization, functioning, and quality of life) and employ decision curve analysis to bind prediction thresholds to specific actions and resource allocation [186, 187]. Such efforts may help translate model optimization into real-world outcome improvements.
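Decision curve analysis, recommended above for binding thresholds to actions, rests on the net-benefit formula NB(pt) = TP/n - (FP/n) * pt/(1 - pt), where pt is the threshold probability at which a clinician would act. A minimal sketch on hypothetical data:

```python
def net_benefit(y_true, p, pt):
    """Net benefit of acting on all patients with predicted risk >= pt;
    pt/(1 - pt) weights false positives by the harm of unneeded action."""
    n = len(y_true)
    tp = sum(1 for y, pi in zip(y_true, p) if pi >= pt and y == 1)
    fp = sum(1 for y, pi in zip(y_true, p) if pi >= pt and y == 0)
    return tp / n - fp / n * pt / (1 - pt)

def net_benefit_treat_all(y_true, pt):
    """Reference strategy: intervene on everyone regardless of the model."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * pt / (1 - pt)

# Hypothetical relapse labels and calibrated risks
y = [1, 1, 0, 0, 0]
p = [0.8, 0.6, 0.7, 0.2, 0.1]
model_nb = net_benefit(y, p, pt=0.5)
all_nb = net_benefit_treat_all(y, pt=0.5)
```

Sweeping pt and plotting net benefit against the treat-all and treat-none (always 0) strategies shows the threshold range, if any, over which acting on model output is expected to do more good than harm.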
Regarding methodological maturity, most studies employed cross-validation or internal holdout, whereas few provided external or cross-dataset validation, reported calibration and uncertainty, conducted user studies, or implemented closed loops. For risk communication and actionable thresholds, discrimination is merely the starting point: calibration determines the credibility of communicated risks, and uncertainty presentation pinpoints when to trigger human review [188, 189]. Importantly, distributional drift and subgroup disparities may rapidly erode effectiveness across disease stages and service contexts [190, 191]. External validation, calibration curves/Brier scores, confidence intervals, and subgroup robustness should therefore be routinely reported in future studies. At the design level, systems should embed “abstain/requires review” mechanisms and online drift monitoring strategies [191–193], enabling an automatic downgrade to a human–AI collaboration mode when uncertainty escalates.
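An “abstain/requires review” mechanism of the kind recommended here can be as simple as a banded decision rule over calibrated probabilities, with the middle band routed to a clinician instead of an automated pathway. The thresholds below are illustrative, not drawn from any included study.

```python
def triage(p_relapse, act=0.7, review=0.3):
    """Map a calibrated relapse probability to an action; the middle band
    abstains from automation and routes the case to human review
    (threshold values are hypothetical)."""
    if p_relapse >= act:
        return "trigger outreach"
    if p_relapse >= review:
        return "requires human review"
    return "routine monitoring"
```

In practice the band edges would be set from calibration and decision-curve evidence and widened automatically when drift monitoring flags that the model is operating outside its validated distribution.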
Furthermore, a pronounced mismatch exists between interpretability and safety requirements. Feature engineering-driven models generally provide global or local explanations, whereas deep learning, LLM, and reinforcement learning applications largely lack transparency and instance-level explanations; only a few studies using such opaque methods offered counterfactual or Shapley additive explanation-based case-level evidence. In rehabilitation settings, accountability for specific actions demands an auditable chain of “prediction–explanation–action” [56, 194, 195]: the model must be able to explain why follow-up was triggered at a given moment, which factors drove medication adjustments, and how thresholds self-adapted for the same patient across different stages. Particularly in reinforcement learning and LLM applications, safety constraints and alignment mechanisms remain unestablished [196–198], with governance lagging behind algorithmic complexity.
Clinical integration and reimbursement/literacy constitute the true thresholds for AI’s scaled deployment in schizophrenia rehabilitation management, yet only three studies achieved closed loops in which predictions directly triggered actions, while most systems remained in identification mode. In community contexts, it is essential to clarify “who sees what signal when, follows which script to take what action, and who is responsible for tracking and auditing” [177, 178]. The absence of corresponding reimbursement mechanisms and workload accounting can render proactive outreach unsustainable [199, 200], and patient and team digital literacy directly impacts adherence and interpretation quality [201]. These implementation-layer complexities (role ambiguity, reimbursement gaps, and literacy barriers) reveal that algorithmic performance metrics (e.g., AUC and accuracy) measure what a system can achieve under controlled conditions but remain silent on whether it will be adopted, integrated, and sustained in routine care workflows. Previous studies have predominantly evaluated AI effectiveness through technical benchmarks, leaving questions of reach, feasibility, and service-level impact largely unaddressed. Rather than relying solely on technical metrics to assess AI deployment, we recommend employing implementation science frameworks such as RE-AIM to assess reach, adoption, and maintenance, and conducting “AI-in-the-loop” pragmatic trials to evaluate service key performance indicators (e.g., follow-up completion rates, relapse intervals, and functional improvement) [202–204].
Equity and generalizability issues are also concerning. The evidence base is concentrated in high-income countries, with minimal representation from low- and middle-income countries (only one study, from India), and only one included study explicitly probed algorithmic bias across demographic subgroups [165]. This entails not only out-of-domain mismatches in device/data ecosystems and care models but also potential cultural biases in goals and measurements. For instance, functional recovery is operationalized differently across cultures (e.g., independent employment and living in Western cultures versus family role restoration and caregiver burden reduction in Eastern cultures), and mainstream functional metrics exhibit limited sensitivity to the latter [205, 206]. Medication management models are likewise highly context-dependent, as divergences in drug availability, follow-up frequency, and metabolic monitoring resources directly impact the validity of adherence prediction and risk assessment [24]. Therefore, local recalibration, preregistered subgroup reporting, and quantification of performance degradation in cross-domain deployment should become standard components of transfer protocols (e.g., referencing TRIPOD+AI and PROBAST+AI guidance on external validation and reporting standards) [207, 208], combined with evidence of effectiveness erosion from distributional drift and model under-specification [193].
Regarding ethics and governance, passive sensing and high-frequency monitoring may exacerbate feelings of constant surveillance and paranoid content [48, 209]. Risk stratification outputs, if not contextualized through communication, can readily produce labeling effects and therapeutic pessimism [210, 211]. Involuntary treatment and forensic contexts further require explicit delineation of algorithmic signal boundaries and procedural safeguards [212, 213]. Current research predominantly remains at the minimal threshold of obtaining informed consent, whereas we recommend operationalizing governance requirements into four actionable standards: dynamic consent with minimum necessary data collection, purpose limitation with withdrawal/portability rights, subgroup fairness reporting with bias monitoring, and intervention safety switches in closed-loop scenarios. In high-autonomy systems such as reinforcement learning/LLMs, we suggest the integration of red-teaming, adversarial examples, and privilege escalation interception across the training-to-deployment pipeline, with human–AI decision logs recorded for post-hoc auditing, also referencing the previously cited LLM clinical evaluation/mitigation recommendations [197].
Turning to actionable recommendations for clinical practice and development: clinically, algorithmic outputs should be embedded into a “measurement–feedback–intensification” closed loop (measurement-based care), with preset thresholds and action scripts (e.g., “alert → phone follow-up within 48 h → escalate to in-person visit or medication adjustment if necessary”) [214], human review triggers for scenarios of elevated uncertainty or complex comorbidities, and thresholds and scripts dynamically calibrated through case audits and outcome feedback, forming a “learning rehabilitation system” [215]. For development and operation, external validation and calibration, uncertainty quantification with abstention, cross-domain transfer with recalibration toolkits, and edge or low-bandwidth operation under energy constraints should be designated as minimum viable configurations [215–217]. Service key performance indicators should serve as primary evaluation dimensions, ensuring that technology aligns with the rehabilitation goals of “fewer relapses, better engagement, improved quality of life,” and real-world service efficiency and workload accounting should become regular evaluation metrics [218, 219].
This scoping review had several limitations. First, the search and inclusion scope (the selected databases and the restriction to English-language literature) may have resulted in omissions, and disciplinary intersections and non-standard terminology may have widened search blind spots. Second, the included studies exhibited substantial heterogeneity in methodology, data sources, participant populations, and outcome specifications, precluding direct comparisons and quantitative synthesis. Third, the existing evidence predominantly features surrogate endpoints and short-to-medium-term follow-ups, with nearly 40% of studies enrolling fewer than 100 participants and only seven studies reporting follow-ups beyond one year, accompanied by limited external validation, calibration/uncertainty reporting, and real-world implementation documentation, all of which affect inferential strength and generalizability. Fourth, the research was geographically concentrated, as were the device/platform ecosystems studied, leaving cross-context transferability and local adaptability yet to be validated. Fifth, our operationalized criteria for determining “readiness for application,” while enhancing relevance for rehabilitation, may have introduced selection bias.
Overall, AI has demonstrated feasibility across several key components of schizophrenia rehabilitation management, although current evidence is insufficient to support conclusions regarding unified effect sizes. The primary contribution of this review lies in providing an application landscape and evaluative criteria centered on rehabilitation goals, distinguishing technologies with mere identification capabilities from tools that can be integrated into service pathways. Future research should adopt patient-centered outcomes and service performance as primary endpoints; conduct prospective, multi-center, and cross-context validation and recalibration; standardize the reporting of calibration, confidence intervals, and subgroup performance; and advance executable and auditable clinical integration within interoperability and governance frameworks. Only through rigorous translation from signal generation to service-level execution can AI substantively reduce relapse risk, enhance engagement, and improve quality of life in schizophrenia contexts.
Acknowledgements
We thank the clinical experts from the Rehabilitation Department of Shanghai Mental Health Center and the nursing staff from several mental rehabilitation management communities in Shanghai for their invaluable guidance and support throughout this project. This work was supported by the 2024 Shanghai Jiao Tong University Key Program for Interdisciplinary Research in Medicine and Engineering (Project No. YG2024ZD24); the 2024 Shanghai “Science and Technology Innovation Action Plan” Medical Innovation Research Special Project (Key Project Sub-Project) (Project No. 24Y22800502); and the Shanghai Jiao Tong University Interdisciplinary Program for Medicine and Engineering Youth Program (Project No. YG2025QNA11). The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Author contributions
HY: Conceptualization, Data curation, Investigation, Formal analysis, Visualization, Writing–original draft, Writing–review and editing. ZL: Investigation, Data curation, Project administration, and Funding acquisition. FM and FC: Validation, Writing–review and editing. WZ and JC: Funding acquisition, Methodology, Supervision, and Validation.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Hongyi Yang, Zhao Liu.
References
- 1.World Health Organization Clinical Descriptions and Diagnostic Requirements for ICD-11 Mental, Behavioural and Neurodevelopmental Disorders (CDDR). Geneva: World Health Organization; 2024. [Google Scholar]
- 2.McCutcheon RA, Reis Marques TR, Howes OD. Schizophrenia—an overview. JAMA Psychiatry. 2020;77:201–10. [DOI] [PubMed] [Google Scholar]
- 3.World Health Organization Schizophrenia—Fact Sheet. Geneva: World Health Organization; 2025. [Google Scholar]
- 4.Hjorthøj C, Stürup AE, McGrath JJ, Nordentoft M. Years of potential life lost and life expectancy in schizophrenia: a systematic review and meta-analysis. Lancet Psychiatry. 2017;4:295–301. [DOI] [PubMed] [Google Scholar]
- 5.Robinson D, Woerner MG, Alvir JMJ, Bilder R, Goldman R, Geisler S, et al. Predictors of relapse following response from a first episode of schizophrenia or schizoaffective disorder. Arch Gen Psychiatry. 1999;56:241–7. [DOI] [PubMed] [Google Scholar]
- 6.Lu L, Dong M, Zhang L, Zhu XM, Ungvari GS, Ng CH, et al. Prevalence of suicide attempts in individuals with schizophrenia: a meta-analysis of observational studies. Epidemiol Psychiatr Sci. 2019;29:e39. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Palmer BA, Pankratz VS, Bostwick JM. The lifetime risk of suicide in schizophrenia: a reexamination. Arch Gen Psychiatry. 2005;62:247–53. [DOI] [PubMed] [Google Scholar]
- 8.Bai W, Liu ZH, Jiang YY, Zhang QE, Rao WW, Cheung T, et al. Worldwide prevalence of suicidal ideation and suicide plan among people with schizophrenia: a meta-analysis and systematic review of epidemiological surveys. Transl Psychiatry. 2021;11:552. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.World Health Organization Psychosocial Rehabilitation—a Consensus Statement (WHO/MNH/MND/96.2). Geneva: World Health Organization; 1996. [Google Scholar]
- 10.Thornicroft G, Deb T, Henderson C. Community mental health care worldwide: current status and further developments. World Psychiatry. 2016;15:276–86. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.World Health Organization Comprehensive Mental Health Action Plan 2013-30 (updated). Geneva: World Health Organization; 2021. [Google Scholar]
- 12.World Health Organization Guidance on Community Mental Health Services: Promoting Person-Centred and Rights-Based Approaches. Geneva: World Health Organization; 2021. [Google Scholar]
- 13.Patel V, Saxena S, Lund C, Thornicroft G, Baingana F, Bolton P, et al. The Lancet Commission on global mental health and sustainable development. Lancet. 2018;392:1553–98. [DOI] [PubMed] [Google Scholar]
- 14.American Psychiatric Association The American Psychiatric Association Practice Guideline for the Treatment of Patients with Schizophrenia. 3rd edn. Washington, DC: APA Publishing; 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Lewis CC, Boyd M, Puspitasari A, Navarro E, Howard J, Kassab H, et al. Implementing measurement-based care in behavioral health: a review. JAMA Psychiatry. 2019;76:324–35. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Wykes T, Huddy V, Cellard C, McGurk SR, Czobor P. A meta-analysis of cognitive remediation for schizophrenia: methodology and effect sizes. Am J Psychiatry. 2011;168:472–85. [DOI] [PubMed] [Google Scholar]
- 17.Rodolico A, Bighelli I, Avanzato C, Concerto C, Cutrufelli P, Mineo L, et al. Family interventions for relapse prevention in schizophrenia: a systematic review and network meta-analysis. Lancet Psychiatry. 2022;9:211–21. [DOI] [PubMed] [Google Scholar]
- 18.Turner DT, McGlanaghy E, Cuijpers P, Van Der Gaag M, Karyotaki E, MacBeth A. A meta-analysis of social skills training and related interventions for psychosis. Schizophr Bull. 2018;44:475–91. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Asher L, Hanlon C, Birhane R, Habtamu A, Eaton J, Weiss HA, et al. Community-based rehabilitation intervention for people with schizophrenia in Ethiopia (RISE): a 12 month mixed methods pilot study. BMC Psychiatry. 2018;18:250. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.World Health Organization World Mental Health Report: Transforming Mental Health for All. Geneva: World Health Organization; 2022. [Google Scholar]
- 21.World Health Organization Mental Health Atlas. Geneva: World Health Organization; 2020. [Google Scholar]
- 22.Lieberman JA, Stroup TS, McEvoy JP, Swartz MS, Rosenheck RA, Perkins DO, et al. Effectiveness of antipsychotic drugs in patients with chronic schizophrenia. N Engl J Med. 2005;353:1209–23. [DOI] [PubMed] [Google Scholar]
- 23.Haddad PM, Brain C, Scott J. Nonadherence with antipsychotic medication in schizophrenia: challenges and management strategies. Patient Relat Outcome Meas. 2014;5:43–62. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.National Institute for Health and Care Excellence Psychosis and Schizophrenia in Adults: Prevention and Management. London: NICE; 2014. vol. CG178. [PubMed] [Google Scholar]
- 25.Substance Abuse and Mental Health Services Administration Results from the 2023 National Survey on Drug Use and Health: Annual National Report. Rockville, MD: Substance Abuse and Mental Health Services Administration; 2024. [Google Scholar]
- 26.Torous J, Linardon J, Goldberg SB, Sun S, Bell I, Nicholas J, et al. The evolving field of digital mental health: current evidence and implementation issues for smartphone apps, generative artificial intelligence, and virtual reality. World Psychiatry. 2025;24:156–74. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Linardon J, Torous J, Firth J, Cuijpers P, Messer M, Fuller-Tyszkiewicz M. Current evidence on the efficacy of mental health smartphone apps for symptoms of depression and anxiety. A meta-analysis of 176 randomized controlled trials. World Psychiatry. 2024;23:139–49. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Hagi K, Kurokawa S, Takamiya A, Fujikawa M, Kinoshita S, Iizuka M, et al. Telepsychiatry versus face-to-face treatment: systematic review and meta-analysis of randomised controlled trials. Br J Psychiatry. 2023;223:407–14. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Karyotaki E, Efthimiou O, Miguel C, Bermpohl FMG, Furukawa TA, Cuijpers P, et al. Internet-based cognitive behavioral therapy for depression: a systematic review and individual patient data network meta-analysis. JAMA Psychiatry. 2021;78:361–71. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Matsumoto K, Hamatani S, Shimizu E. Effectiveness of videoconference-delivered cognitive behavioral therapy for adults with psychiatric disorders: systematic and meta-analytic review. J Med Internet Res. 2021;23:e31293. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Freeman D, Lambe S, Kabir T, Petit A, Rosebrock L, Yu LM, et al. Automated virtual reality therapy to treat agoraphobic avoidance and distress in patients with psychosis (gameChange): a multicentre, parallel-group, single-blind, randomised, controlled trial in England with mediation and moderation analyses. Lancet Psychiatry. 2022;9:375–88. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Fonseka LN, Woo BKP. Wearables in schizophrenia: update on current and future clinical applications. JMIR Mhealth Uhealth. 2022;10:e35600. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Stone AA, Schneider S, Smyth JM. Evaluation of pressing issues in ecological momentary assessment. Annu Rev Clin Psychol. 2023;19:107–31. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Onnela JP, Rauch SL. Harnessing smartphone-based digital phenotyping to enhance behavioral and mental health. Neuropsychopharmacology. 2016;41:1691–6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Murphy KP. Probabilistic Machine Learning: An Introduction. Cambridge, MA: MIT Press; 2022. [Google Scholar]
- 36.Huang SC, Pareek A, Jensen M, Lungren MP, Yeung S, Chaudhari AS. Self-supervised learning for medical image classification: a systematic review and implementation guidelines. NPJ Digit Med. 2023;6:74. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Wu Z, Pan S, Chen F, Long G, Zhang C, Yu PS. A comprehensive survey on graph neural networks. IEEE Trans Neural Netw Learn Syst. 2021;32:4–24. [DOI] [PubMed] [Google Scholar]
- 38.Thirunavukarasu AJ, Ting DSJ, Elangovan K, Gutierrez L, Tan TF, Ting DSW. Large language models in medicine. Nat Med. 2023;29:1930–40. [DOI] [PubMed] [Google Scholar]
- 39.Baltrušaitis T, Ahuja C, Morency LP. Multimodal machine learning: a survey and taxonomy. IEEE Trans Pattern Anal Mach Intell. 2019;41:423–43. [DOI] [PubMed] [Google Scholar]
- 40.Rieke N, Hancox J, Li W, Milletarì F, Roth HR, Albarqouni S, et al. The future of digital health with federated learning. NPJ Digit Med. 2020;3:119. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Feng J, Phillips RV, Malenica I, Bishara A, Hubbard AE, Celi LA, et al. Clinical artificial intelligence quality improvement: towards continual monitoring and updating of AI algorithms in healthcare. NPJ Digit Med. 2022;5:66. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.World Health Organization. Ethics and Governance of Artificial Intelligence for Health: WHO Guidance. World Health Organization; 2021. https://www.who.int/publications/i/item/9789240029200.
- 43.Hansen L, Bernstorff M, Enevoldsen K, Kolding S, Damgaard JG, Perfalk E, et al. Predicting diagnostic progression to schizophrenia or bipolar disorder via machine learning. JAMA Psychiatry. 2025;82:459–69. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Raket LL, Jaskolowski J, Kinon BJ, Brasen JC, Jönsson L, Wehnert A, et al. Dynamic ElecTronic hEalth reCord deTection (DETECT) of individuals at risk of a first episode of psychosis: a case-control development and validation study. Lancet Digit Health. 2020;2:e229–e239. [DOI] [PubMed] [Google Scholar]
- 45.Zhu Y, Maikusa N, Radua J, Sämann PG, Fusar-Poli P, Agartz I, et al. Using brain structural neuroimaging measures to predict psychosis onset for individuals at clinical high-risk. Mol Psychiatry. 2024;29:1465–77. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Gao J, Qian M, Wang Z, Li Y, Luo N, Xie S, et al. Exploring schizophrenia classification through multimodal MRI and deep graph neural networks: unveiling brain region-specific weight discrepancies and their association with cell-type specific transcriptomic features. Schizophr Bull. 2024;51:217–35. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Li A, Zalesky A, Yue W, Howes O, Yan H, Liu Y, et al. A neuroimaging biomarker for striatal dysfunction in schizophrenia. Nat Med. 2020;26:558–65. [DOI] [PubMed] [Google Scholar]
- 48.Qi S, Sui J, Pearlson G, Bustillo J, Perrone-Bizzozero NI, Kochunov P, et al. Derivation and utility of schizophrenia polygenic risk associated multimodal MRI frontotemporal network. Nat Commun. 2022;13:4929. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Anderson KM, Collins MA, Chin R, Ge T, Rosenberg MD, Holmes AJ. Transcriptional and imaging-genetic association of cortical interneurons, brain function, and schizophrenia risk. Nat Commun. 2020;11:2889. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Morgan SE, Seidlitz J, Whitaker KJ, Romero-Garcia R, Clifton NE, Scarpazza C, et al. Cortical patterning of abnormal morphometric similarity in psychosis is associated with brain expression of schizophrenia-related genes. Proc Natl Acad Sci USA. 2019;116:9604–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Torous J, Bucci S, Bell IH, Kessing LV, Faurholt-Jepsen M, Whelan P, et al. The growing field of digital psychiatry: current evidence and the future of apps, social media, chatbots, and virtual reality. World Psychiatry. 2021;20:318–35. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Koutsouleris N, Hauser TU, Skvortsova V, De Choudhury M. From promise to practice: towards the realisation of AI-informed mental health care. Lancet Digit Health. 2022;4:e829–e840. [DOI] [PubMed] [Google Scholar]
- 53.Yoo DW, Woo H, Nguyen VC, Birnbaum ML, Kruzan KP, Kim JG, et al. Patient perspectives on AI-driven predictions of schizophrenia relapses: understanding concerns and opportunities for self-care and treatment. In: Proceedings of the CHI Conference on Human Factors in Computing Systems. New York, NY: Association for Computing Machinery; 2024. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Eisner E, Ball H, Ainsworth J, Cella M, Chalmers N, Clifford S, et al. Using passive sensing to predict psychosis relapse: an in-depth qualitative study exploring perspectives of people with psychosis. Schizophr Bull. 2025;sbaf126. 10.1093/schbul/sbaf126. [DOI] [PubMed]
- 55.Rogan J, Firth J, Bucci S. Healthcare professionals’ views on the use of passive sensing and machine learning approaches in secondary mental healthcare: a qualitative study. Health Expect. 2024;27:e70116. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Ghassemi M, Oakden-Rayner L, Beam AL. The false hope of current approaches to explainable artificial intelligence in health care. Lancet Digit Health. 2021;3:e745–e750. [DOI] [PubMed] [Google Scholar]
- 57.Koppe G, Meyer-Lindenberg A, Durstewitz D. Deep learning for small and big data in psychiatry. Neuropsychopharmacology. 2021;46:176–90. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Lewandowski KE. Ecological validity in cognitive assessment and treatment. Schizophr Res Cogn. 2025;40:100341. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Challis S, Nielssen O, Harris A, Large M. Systematic meta-analysis of the risk factors for deliberate self-harm before and after treatment for first-episode psychosis. Acta Psychiatr Scand. 2013;127:442–54. [DOI] [PubMed] [Google Scholar]
- 60.Shatte ABR, Hutchinson DM, Teague SJ. Machine learning in mental health: a scoping review of methods and applications. Psychol Med. 2019;49:1426–48. [DOI] [PubMed] [Google Scholar]
- 61.Foteinopoulou NM, Patras I. Machine learning approaches for fine-grained symptom estimation in schizophrenia: a comprehensive review. Artif Intell Med. 2025;165:103129. [DOI] [PubMed] [Google Scholar]
- 62.Kambeitz J, Kambeitz-Ilankovic L, Leucht S, Wood S, Davatzikos C, Malchow B, et al. Detecting neuroimaging biomarkers for schizophrenia: a meta-analysis of multivariate pattern recognition studies. Neuropsychopharmacology. 2015;40:1742–51. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Ni Y, Jia F. A scoping review of AI-Driven digital interventions in mental health care: mapping applications across screening, support, monitoring, prevention, and clinical education. Healthcare (Basel). 2025;13:1205. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Cruz-Gonzalez P, He AWJ, Lam EP, Ng IMC, Li MW, Hou R, et al. Artificial intelligence in mental health care: a systematic review of diagnosis, monitoring, and intervention applications. Psychol Med. 2025;55:e18. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Galderisi S, Mucci A, Buchanan RW, Arango C. Negative symptoms of schizophrenia: new developments and unanswered research questions. Lancet Psychiatry. 2018;5:664–77. [DOI] [PubMed] [Google Scholar]
- 66.Handest R, Molstrom IM, Gram Henriksen M, Hjorthøj C, Nordgaard J. A systematic review and meta-analysis of the association between psychopathology and social functioning in schizophrenia. Schizophr Bull. 2023;49:1470–85. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Tricco AC, Lillie E, Zarin W, O’Brien KK, Colquhoun H, Levac D, et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med. 2018;169:467–73. [DOI] [PubMed] [Google Scholar]
- 68.Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Commun ACM. 2017;60:84–90. [Google Scholar]
- 69.LeCun Y, Bengio Y, Hinton GE. Deep learning. Nature. 2015;521:436–44. [DOI] [PubMed] [Google Scholar]
- 70.Chivilgina O, Elger BS, Jotterand F. Digital technologies for schizophrenia management: a descriptive review. Sci Eng Ethics. 2021;27:25. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.Rössler W. Psychiatric rehabilitation today: an overview. World Psychiatry. 2006;5:151–7. [PMC free article] [PubMed] [Google Scholar]
- 72.Owen MJ, Sawa A, Mortensen PB. Schizophrenia. Lancet. 2016;388:86–97. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Reed GM, Keeley JW, Rebello TJ, First MB, Gureje O, Ayuso-Mateos JL, et al. Clinical utility of ICD-11 diagnostic guidelines for high-burden mental disorders: results from mental health settings in 13 countries. World Psychiatry. 2018;17:306–15. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Bell IH, Eisner E, Allan S, Cartner S, Torous J, Bucci S, et al. Methodological characteristics and feasibility of ecological momentary assessment studies in psychosis: a systematic review and meta-analysis. Schizophr Bull. 2024;50:238–65. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Benoit J, Onyeaka H, Keshavan M, Torous J. Systematic review of digital phenotyping and machine learning in psychosis spectrum illnesses. Harv Rev Psychiatry. 2020;28:296–304. [DOI] [PubMed] [Google Scholar]
- 76.Gumley AI, Bradstreet S, Ainsworth J, Allan S, Alvarez-Jimenez M, Aucott L, et al. The EMPOWER blended digital intervention for relapse prevention in schizophrenia: a feasibility cluster randomised controlled trial in Scotland and Australia. Lancet Psychiatry. 2022;9:477–86. [DOI] [PubMed] [Google Scholar]
- 77.Stroup TS, Gray N. Management of common adverse effects of antipsychotic medications. World Psychiatry. 2018;17:341–56. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Kishimoto T, Hagi K, Kurokawa S, Kane JM, Correll CU. Long-acting injectable versus oral antipsychotics for the maintenance treatment of schizophrenia: a systematic review and comparative meta-analysis of randomised, cohort, and pre–post studies. Lancet Psychiatry. 2021;8:387–404. [DOI] [PubMed] [Google Scholar]
- 79.Hawton K, Lascelles K, Pitman A, Gilbert S, Silverman M. Assessment of suicide risk in mental health practice: shifting from prediction to therapeutic assessment, formulation, and risk management. Lancet Psychiatry. 2022;9:922–8. [DOI] [PubMed] [Google Scholar]
- 80.Whiting D, Gulati G, Geddes JR, Fazel S. Association of schizophrenia spectrum disorders and violence perpetration in adults and adolescents from 15 countries: a systematic review and meta-analysis. JAMA Psychiatry. 2022;79:120–32. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Gleeson JF, McGuckian TB, Fernandez DK, Fraser MI, Pepe A, Taskis R, et al. Systematic review of early warning signs of relapse and behavioural antecedents of symptom worsening in people living with schizophrenia spectrum disorders. Clin Psychol Rev. 2024;107:102357. [DOI] [PubMed] [Google Scholar]
- 82.Vita A, Barlati S, Ceraso A, Nibbio G, Ariu C, Deste G, et al. Effectiveness, core elements, and moderators of response of cognitive remediation for schizophrenia: a systematic review and meta-analysis of randomized clinical trials. JAMA Psychiatry. 2021;78:848–58. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Nijman SA, Veling W, van der Stouwe ECD, Pijnenborg GHM. Social cognition training for people with a psychotic disorder: a network meta-analysis. Schizophr Bull. 2020;46:1086–103. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Bond GR, Al-Abdulmunem M, Marbacher J, Christensen TN, Sveinsdottir V, Drake RE. A systematic review and meta-analysis of IPS supported employment for young adults with mental health conditions. Adm Policy Ment Health. 2023;50:160–72. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Bighelli I, Rodolico A, García-Mieres H, Pitschel-Walz G, Hansen WP, Schneider-Thoma J, et al. Psychosocial and psychological interventions for relapse prevention in schizophrenia: a systematic review and network meta-analysis. Lancet Psychiatry. 2021;8:969–80. [DOI] [PubMed] [Google Scholar]
- 86.Jauhar S, McKenna PJ, Radua J, Fung E, Salvador R, Laws KR. Cognitive–behavioural therapy for the symptoms of schizophrenia: systematic review and meta-analysis with examination of potential bias. Br J Psychiatry. 2014;204:20–29. [DOI] [PubMed] [Google Scholar]
- 87.Smit D, Miguel C, Vrijsen JN, Groeneweg B, Spijker J, Cuijpers P. The effectiveness of peer support for individuals with mental illness: systematic review and meta-analysis. Psychol Med. 2023;53:5332–41. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Firth J, Siddiqi N, Koyanagi A, Siskind D, Rosenbaum S, Galletly C, et al. The Lancet Psychiatry Commission: a blueprint for protecting physical health in people with mental illness. Lancet Psychiatry. 2019;6:675–712. [DOI] [PubMed] [Google Scholar]
- 89.Daumit GL, Dickerson FB, Wang NY, Dalcin A, Jerome GJ, Anderson CAM, et al. A behavioral weight-loss intervention in persons with serious mental illness. N Engl J Med. 2013;368:1594–602. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Tsoi DTY, Porwal M, Webster AC. Efficacy and safety of bupropion for smoking cessation and reduction in schizophrenia: systematic review and meta-analysis. Br J Psychiatry. 2010;196:346–53. [DOI] [PubMed] [Google Scholar]
- 91.Kane JM, Robinson DG, Schooler NR, Mueser KT, Penn DL, Rosenheck RA, et al. Comprehensive versus usual community care for first-episode psychosis: 2-year outcomes from the NIMH RAISE early treatment program. Am J Psychiatry. 2016;173:362–72. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92.Bond GR, Drake RE. The critical ingredients of assertive community treatment. World Psychiatry. 2015;14:240–2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 93.Dieterich M, Irving CB, Bergman H, Khokhar MA, Park B, Marshall M. Intensive case management for severe mental illness. Cochrane Database Syst Rev. 2017;1:CD007906. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.Osipov M, Behzadi Y, Kane JM, Petrides G, Clifford GD. Objective identification and analysis of physiological and behavioral signs of schizophrenia. J Ment Health. 2015;24:276–82. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95.Arslan B, Kizilay E, Verim B, Demirlek C, Dokuyan Y, Turan YE, et al. Automated linguistic analysis in speech samples of Turkish-speaking patients with schizophrenia-spectrum disorders. Schizophr Res. 2024;267:65–71. [DOI] [PubMed] [Google Scholar]
- 96.Chan CC, Norel R, Agurto C, Lysaker PH, Myers EJ, Hazlett EA, et al. Emergence of language related to self-experience and agency in autobiographical narratives of individuals with schizophrenia. Schizophr Bull. 2023;49:444–53. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 97.Parola A, Gabbatore I, Berardinelli L, Salvini R, Bosco FM. Multimodal assessment of communicative-pragmatic features in schizophrenia: a machine learning approach. NPJ Schizophr. 2021;7:28. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Ciampelli S, Voppel AE, De Boer JN, Koops S, Sommer IEC. Combining automatic speech recognition with semantic natural language processing in schizophrenia. Psychiatry Res. 2023;325:115252. [DOI] [PubMed] [Google Scholar]
- 99.De Boer JN, Voppel AE, Brederoo SG, Schnack HG, Truong KP, Wijnen FNK, et al. Acoustic speech markers for schizophrenia-spectrum disorders: a diagnostic and symptom-recognition tool. Psychol Med. 2023;53:1302–12. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 100.Richter V, Neumann M, Kothare H, Roesler O, Liscombe J, Suendermann-Oeft D, et al. Towards multimodal dialog-based speech & facial biomarkers of schizophrenia. In: Proceedings of the International Conference on Multimodal Interaction. New York, NY: ACM; 2022. [Google Scholar]
- 101.Kalinich M, Ebrahim S, Hays R, Melcher J, Vaidyam A, Torous J. Applying machine learning to smartphone based cognitive and sleep assessments in schizophrenia. Schizophr Res Cogn. 2022;27:100216. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 102.Shen H, Wang SH, Zhang Y, Wang H, Li F, Lucas MV, et al. Color painting predicts clinical symptoms in chronic schizophrenia patients via deep learning. BMC Psychiatry. 2021;21:522. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 103.Wang R, Aung MSH, Abdullah S, Brian R, Campbell AT, Choudhury T, et al. CrossCheck: toward passive sensing and detection of mental health changes in people with schizophrenia. In: Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing. New York, NY: ACM; 2016. pp. 886–97.
- 104.Jeong L, Lee M, Eyre B, Balagopalan A, Rudzicz F, Gabilondo C. Exploring the use of natural language processing for objective assessment of disorganized speech in schizophrenia. Psychiatr Res Clin Pract. 2023;5:84–92. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105.Arevian AC, Bone D, Malandrakis N, Martinez VR, Wells KB, Miklowitz DJ, et al. Clinical state tracking in serious mental illness through computational analysis of speech. PLoS ONE. 2020;15:e0225695. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 106.Wang R, Wang W, Aung MSH, Ben-Zeev D, Brian R, Campbell AT, et al. Predicting symptom trajectories of schizophrenia using mobile sensing. Proc ACM Interact Mob Wearable Ubiquitous Technol. 2017;1:1–24. [Google Scholar]
- 107.Hong M, Kang RR, Yang JH, Rhee SJ, Lee H, Kim YG, et al. Comprehensive symptom prediction in inpatients with acute psychiatric disorders using wearable-based deep learning models: development and validation study. J Med Internet Res. 2024;26:e65994. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 108.Adler DA, Wang F, Mohr DC, Choudhury T. Machine learning for passive mental health symptom prediction: generalization across different longitudinal mobile sensing studies. PLoS ONE. 2022;17:e0266516. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 109.Tseng VWS, Sano A, Ben-Zeev D, Brian R, Campbell AT, Hauser M, et al. Using behavioral rhythms and multi-task learning to predict fine-grained symptoms of schizophrenia. Sci Rep. 2020;10:15100. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.Lin E, Lin CH, Lane HY. Applying a bagging ensemble machine learning approach to predict functional outcome of schizophrenia with clinical symptoms and cognitive functions. Sci Rep. 2021;11:6922. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 111.Lin GH, Liu JH, Lee SC, Wu BJ, Li SQ, Chiu HJ, et al. Developing a machine learning-based short form of the Positive and Negative Syndrome Scale. Asian J Psychiatry. 2024;94:103965. [DOI] [PubMed] [Google Scholar]
- 112.Soldatos RF, Cearns M, Nielsen MØ, Kollias C, Xenaki LA, Stefanatou P, et al. Prediction of early symptom remission in two independent samples of first-episode psychosis patients using machine learning. Schizophr Bull. 2022;48:122–33. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.van Dee V, Kia SM, Fregosi C, Swildens WE, Alkema A, Batalla A, et al. Prognostic predictions in psychosis: exploring the complementary role of machine learning models. BMJ Ment Health. 2025;28:e301594. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 114.van Opstal DPJ, Kia SM, Jakob L, Somers M, Sommer IEC, Winter-van Rossum I, et al. Psychosis prognosis predictor: a continuous and uncertainty-aware prediction of treatment outcome in first-episode psychosis. Acta Psychiatr Scand. 2025;151:280–92. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 115.Jean T, Guay Hottin R, Orban P. Forecasting mental states in schizophrenia using digital phenotyping data. PLOS Digit Health. 2025;4:e0000734. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 116.Cohen AS, Cox CR, Le TP, Cowan T, Masucci MD, Strauss GP, et al. Using machine learning of computerized vocal expression to measure blunted vocal affect and alogia. NPJ Schizophr. 2020;6:26. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 117.Narkhede SM, Luther L, Raugh IM, Knippenberg AR, Esfahlani FZ, Sayama H, et al. Machine learning identifies digital phenotyping measures most relevant to negative symptoms in psychotic disorders: implications for clinical trials. Schizophr Bull. 2022;48:425–36. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 118.Umbricht D, Cheng WY, Lipsmeier F, Bamdadian A, Lindemann M. Deep learning-based human activity recognition for continuous activity and gesture monitoring for schizophrenia patients with negative symptoms. Front Psychiatry. 2020;11:574375. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 119.Liu CM, Chan YH, Ho MY, Liu CC, Lu MH, Liao YA, et al. Analyzing generative AI and machine learning in auto-assessing schizophrenia’s negative symptoms. Schizophr Bull. 2025;sbaf102. 10.1093/schbul/sbaf102. [DOI] [PubMed]
- 120.Bradley ER, Portanova J, Woolley JD, Buck B, Painter IS, Hankin M, et al. Quantifying abnormal emotion processing: a novel computational assessment method and application in schizophrenia. Psychiatry Res. 2024;336:115893. [DOI] [PubMed] [Google Scholar]
- 121.Holmlund TB, Chandler C, Foltz PW, Cohen AS, Cheng J, Bernstein JC, et al. Applying speech technologies to assess verbal memory in patients with serious mental illness. NPJ Digit Med. 2020;3:33. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 122.McCutcheon RA, Keefe RSE, McGuire PM, Marquand A. Deconstructing cognitive impairment in psychosis with a machine learning approach. JAMA Psychiatry. 2025;82:57–65. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 123.Zakowicz PT, Brzezicki MA, Levidiotis C, Kim S, Wejkuć O, Wisniewska Z, et al. Detection of formal thought disorders in child and adolescent psychosis using machine learning and neuropsychometric data. Front Psychiatry. 2025;16:1550571. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 124.Granato G, Costanzo R, Borghi A, Mattera A, Carruthers S, Rossell S, et al. An experimental and computational investigation of executive functions and inner speech in schizophrenia spectrum disorders. Sci Rep. 2025;15:5185. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 125.Difrancesco S, Fraccaro P, Van Der Veer SN, Alshoumr B, Ainsworth J, Bellazzi R, et al. Out-of-home activity recognition from GPS data in schizophrenic patients. In: 29th International Symposium on Computer-Based Medical Systems (CBMS). New York, NY; 2016.
- 126.Wang W, Mirjafari S, Harari G, Ben-Zeev D, Brian R, Choudhury T, et al. Social sensing: assessing social functioning of patients living with schizophrenia using mobile phone sensing. In: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI ’20). New York, NY: ACM; 2020. [Google Scholar]
- 127.Badal VD, Depp CA, Hitchcock PF, Penn DL, Harvey PD, Pinkham AE. Computational methods for integrative evaluation of confidence, accuracy, and reaction time in facial affect recognition in schizophrenia. Schizophr Res Cogn. 2021;25:100196. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 128.Abplanalp SJ, Green MF, Wynn JK, Eisenberger NI, Horan WP, Lee J, et al. Using machine learning to understand social isolation and loneliness in schizophrenia, bipolar disorder, and the community. Schizophrenia (Heidelb). 2024;10:88. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 129.Shibata Y, Victorino JN, Natsuyama T, Okamoto N, Yoshimura R, Shibata T. Estimation of subjective quality of life in schizophrenic patients using speech features. Front Rehabil Sci. 2023;4:1121034. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 130.Jeong JH, Kim J, Kang N, Ahn YM, Kim YS, Lee D, et al. Modeling the determinants of subjective well-being in schizophrenia. Schizophr Bull. 2025;51:1118–33. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 131.Miley K, Meyer-Kalos P, Ma S, Bond DJ, Kummerfeld E, Vinogradov S. Causal pathways to social and occupational functioning in the first episode of schizophrenia: uncovering unmet treatment needs. Psychol Med. 2023;53:2041–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 132.Hulme WJ, Stockton-Powdrell C, Lewis S, Martin GP, Bucci S, Parsia B, et al. Cluster hidden Markov models: an application to ecological momentary assessment of schizophrenia. In: 32nd International Symposium on Computer-Based Medical Systems (CBMS). New York, NY; 2019.
- 133.Amoretti S, Verdolini N, Mezquida G, Rabelo-da-Ponte FD, Cuesta MJ, Pina-Camacho L, et al. Identifying clinical clusters with distinct trajectories in first-episode psychosis through an unsupervised machine learning technique. Eur Neuropsychopharmacol. 2021;47:112–29. [DOI] [PubMed] [Google Scholar]
- 134.Martínez-Cao C, Sánchez-Lasheras F, García-Fernández A, González-Blanco L, Zurrón-Madera P, Sáiz PA, et al. PsiOvi staging model for schizophrenia (PsiOvi SMS): a new internet tool for staging patients with schizophrenia. Eur Psychiatry. 2024;67:e36. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 135.Martinelli A, Leone S, Baronio CM, Archetti D, Redolfi A, Adorni A, et al. Sex differences in schizophrenia spectrum disorders: insights from the DiAPAson study using a data-driven approach. Soc Psychiatry Psychiatr Epidemiol. 2025;60:1983–97. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 136.Howes C, Purver M, McCabe R, Healey P, Lavelle M. Predicting adherence to treatment for schizophrenia from dialogue transcripts. In: Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue. New York: ACM; 2012. [Google Scholar]
- 137.Bain EE, Shafner L, Walling DP, Othman AA, Chuang-Stein C, Hinkle J, et al. Use of a novel artificial intelligence platform on mobile devices to assess dosing compliance in a phase 2 clinical trial in subjects with schizophrenia. JMIR Mhealth Uhealth. 2017;5:e18. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 138.Chen HH, Hsu HT, Lin PC, Chen CY, Hsieh HF, Ko CH. Efficacy of a smartphone app in enhancing medication adherence and accuracy in individuals with schizophrenia during the COVID-19 pandemic: randomized controlled trial. JMIR Ment Health. 2023;10:e50806. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 139.Zhu Z, Roy D, Feng S, Vogler B. AI-based medication adherence prediction in patients with schizophrenia and attenuated psychotic disorders. Schizophr Res. 2025;275:42–51. [DOI] [PubMed] [Google Scholar]
- 140.Jeon SM, Cho J, Lee DY, Kwon JW. Comparison of prediction methods for treatment continuation of antipsychotics in children and adolescents with schizophrenia. Evid Based Ment Health. 2022;25:e26–e33. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 141.Pei X, Du X, Liu D, Li X, Wu Y. Nomogram model for predicting medication adherence in patients with various mental disorders based on the Dryad database. BMJ Open. 2024;14:e087312. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 142.Dickson MC, Nguyen MM, Patel C, Grabich SC, Benson C, Cothran T, et al. Adherence, persistence, readmissions, and costs in Medicaid members with schizophrenia or schizoaffective disorder initiating paliperidone palmitate versus switching oral antipsychotics: a real-world retrospective investigation. Adv Ther. 2023;40:349–66. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 143.Kim EY, Kim J, Jeong JH, Jang J, Kang N, Seo J, et al. Machine learning prediction model of the treatment response in schizophrenia reveals the importance of metabolic and subjective characteristics. Schizophr Res. 2025;275:146–55. [DOI] [PubMed] [Google Scholar]
- 144.Wong TY, Luo H, Tang J, Moore TM, Gur RC, Suen YN, et al. Development of an individualized risk calculator of treatment resistance in patients with first-episode psychosis (TRipCal) using automated machine learning: a 12-year follow-up study with clozapine prescription as a proxy indicator. Transl Psychiatry. 2024;14:50. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 145.Barruel D, Hilbey J, Charlet J, Chaumette B, Krebs MO, Dauriac-Le Masson V. Predicting treatment resistance in schizophrenia patients: machine learning highlights the role of early pathophysiologic features. Schizophr Res. 2024;270:1–10. [DOI] [PubMed] [Google Scholar]
- 146.Podichetty JT, Silvola RM, Rodriguez-Romero V, Bergstrom RF, Vakilynejad M, Bies RR, et al. Application of machine learning to predict reduction in total PANSS score and enrich enrollment in schizophrenia clinical trials. Clin Transl Sci. 2021;14:1864–74. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 147.Yee JY, Phua SX, See YM, Andiappan AK, Goh WWB, Lee J. Predicting antipsychotic responsiveness using a machine learning classifier trained on plasma levels of inflammatory markers in schizophrenia. Transl Psychiatry. 2025;15:51. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 148.Vellucci L, Barone A, Buonaguro EF, Ciccarelli M, De Simone G, Iannotta F, et al. Severity of autism-related symptoms in treatment-resistant schizophrenia: associations with cognitive performance, psychosocial functioning, and neurological soft signs—clinical evidence and ROC analysis. J Psychiatr Res. 2025;185:119–29.
- 149.Mishra A, Maiti R, Jena M, Srinivasan A. Evaluating machine learning algorithms for prediction of treatment response for sleep disturbances in patients with schizophrenia: a post-hoc analysis from a randomized controlled trial. Psychiatr Danub. 2025;37:46–54.
- 150.Hieronymus F, Hieronymus M, Sjöstedt A, Nilsson S, Näslund J, Lisinski A, et al. Predicting remission in schizophrenia using machine learning—assessing the impact of sample size and predictor overinclusion. Acta Psychiatr Scand. 2025;152:441–50.
- 151.Wysokiński A, Dreczka J. Clozapine toxicity predictor: deep neural network model predicting clozapine toxicity and its therapeutic dose range. Psychiatry Res. 2024;342:116256.
- 152.Zhu X, Hu J, Xiao T, Huang S, Shang D, Wen Y. Integrating machine learning with electronic health record data to facilitate detection of prolactin level and pharmacovigilance signals in olanzapine-treated patients. Front Endocrinol. 2022;13:1011492.
- 153.Vidal N, Sedki M, Younès N, Bottemanne H, Roux P, Brunet-Gouet E. Neural network analysis of the contribution of psychotropic prescription sequences to the risk of non-psychiatric adverse events in bipolar and schizophrenia spectrum disorders. Front Digit Health. 2025;7:1633220.
- 154.Wu CS, Luedtke AR, Sadikova E, Tsai HJ, Liao SC, Liu CC, et al. Development and validation of a machine learning individualized treatment rule in first-episode schizophrenia. JAMA Netw Open. 2020;3:e1921660.
- 155.Zlatintsi A, Filntisis PP, Garoufis C, Efthymiou N, Maragos P, Menychtas A, et al. E-prevention: advanced support system for monitoring and relapse prevention in patients with psychotic disorders analyzing long-term multimodal data from wearables and video captures. Sensors (Basel). 2022;22:7544.
- 156.Wang X, Vouk N, Heaukulani C, Buddhika T, Martanto W, Lee J, et al. HOPES: an integrative digital phenotyping platform for data collection, monitoring, and machine learning. J Med Internet Res. 2021;23:e23984.
- 157.Adler DA, Ben-Zeev D, Tseng VWS, Kane JM, Brian R, Campbell AT, et al. Predicting early warning signs of psychotic relapse from passive sensing data: an approach using encoder-decoder neural networks. JMIR Mhealth Uhealth. 2020;8:e19962.
- 158.Brandt L, Ritter K, Schneider-Thoma J, Siafis S, Montag C, Ayrilmaz H, et al. Predicting psychotic relapse following randomised discontinuation of paliperidone in individuals with schizophrenia or schizoaffective disorder: an individual participant data analysis. Lancet Psychiatry. 2023;10:184–96.
- 159.Birnbaum ML, Kulkarni PP, Van Meter A, Chen V, Rizvi AF, Arenare E, et al. Utilizing machine learning on internet search activity to support the diagnostic process and relapse detection in young individuals with early psychosis: feasibility study. JMIR Ment Health. 2020;7:e19348.
- 160.Fond G, Bulzacka E, Boucekine M, Schürhoff F, Berna F, Godin O, et al. Machine learning for predicting psychotic relapse at 2 years in schizophrenia in the national FACE-SZ cohort. Prog Neuropsychopharmacol Biol Psychiatry. 2019;92:8–18.
- 161.Zhou J, Lamichhane B, Ben-Zeev D, Campbell A, Sano A. Predicting psychotic relapse in schizophrenia with mobile sensor data: routine cluster analysis. JMIR Mhealth Uhealth. 2022;10:e31006.
- 162.Cohen A, Naslund JA, Chang S, Nagendra S, Bhan A, Rozatkar A, et al. Relapse prediction in schizophrenia with smartphone digital phenotyping during COVID-19: a prospective, three-site, two-country, longitudinal study. Schizophrenia (Heidelb). 2023;9:6.
- 163.Yoo DW, Woo H, Nguyen VC, Birnbaum ML, Kruzan KP, Kim JG, et al. Patient perspectives on AI-driven predictions of schizophrenia relapses: understanding concerns and opportunities for self-care and treatment. In: Proc SIGCHI Conf Hum Factors Comput Syst. 2024:702.
- 164.Góngora Alonso S, Herrera Montano I, de La Torre Díez I, Franco-Martín M, Amoon M, Román-Gallego JA, et al. Predictive modeling of hospital readmission of schizophrenic patients in a Spanish region combining particle swarm optimization and machine learning algorithms. Biomimetics (Basel). 2024;9:752.
- 165.Bao Y, Wang W, Liu Z, Wang W, Zhao X, Yu S, et al. Leveraging deep neural network and language models for predicting long-term hospitalization risk in schizophrenia. Schizophrenia (Heidelb). 2025;11:35.
- 166.Yu T, Zhang X, Liu X, Xu C, Deng C. The prediction and influential factors of violence in male schizophrenia patients with machine learning algorithms. Front Psychiatry. 2022;13:799899.
- 167.Mason AJC, Bhavsar V, Botelle R, Chandran D, Li L, Mascio A, et al. Applying neural network algorithms to ascertain reported experiences of violence in routine mental healthcare records and distributions of reports by diagnosis. Front Psychiatry. 2024;15:1181739.
- 168.Wang KZ, Bani-Fatemi A, Adanty C, Harripaul R, Griffiths J, Kolla N, et al. Prediction of physical violence in schizophrenia with machine learning algorithms. Psychiatry Res. 2020;289:112960.
- 169.Bernstorff M, Hansen L, Enevoldsen K, Damgaard J, Hæstrup F, Perfalk E, et al. Development and validation of a machine learning model for prediction of type 2 diabetes in patients with mental illness. Acta Psychiatr Scand. 2025;151:245–58.
- 170.Banerjee S, Lio P, Jones PB, Cardinal RN. A class-contrastive human-interpretable machine learning approach to predict mortality in severe mental illness. NPJ Schizophr. 2021;7:60.
- 171.Miley K, Bronstein MV, Ma S, Lee H, Green MF, Ventura J, et al. Trajectories and predictors of response to social cognition training in people with schizophrenia: a proof-of-concept machine learning study. Schizophr Res. 2024;266:92–99.
- 172.Hudon A, Beaudoin M, Phraxayavong K, Potvin S, Dumais A. Unsupervised machine learning driven analysis of verbatims of treatment-resistant schizophrenia patients having followed avatar therapy. J Pers Med. 2023;13:801.
- 173.Barbalat G, Plasse J, Chéreau-Boudet I, Gouache B, Legros-Lafarge E, Massoubre C, et al. Contribution of socio-demographic and clinical characteristics to predict initial referrals to psychosocial interventions in patients with serious mental illness. Epidemiol Psychiatr Sci. 2024;33:e2.
- 174.Lin B, Cecchi G, Bouneffouf D. Psychotherapy AI companion with reinforcement learning recommendations and interpretable policy dynamics. In: WWW ’23 Companion: Companion Proceedings of the ACM Web Conference. New York, NY: ACM; 2023.
- 175.Just SA, Elvevåg B, Pandey S, Nenchev I, Bröcker AL, Montag C, et al. Moving beyond word error rate to evaluate automatic speech recognition in clinical samples: lessons from research into schizophrenia-spectrum disorders. Psychiatry Res. 2025;352:116690.
- 176.Just SA, Bröcker AL, Ryazanskaya G, Nenchev I, Schneider M, Bermpohl F, et al. Validation of natural language processing methods capturing semantic incoherence in the speech of patients with non-affective psychosis. Front Psychiatry. 2023;14:1208856.
- 177.May CR, Mair F, Finch T, MacFarlane A, Dowrick C, Treweek S, et al. Development of a theory of implementation and integration: Normalization Process Theory. Implement Sci. 2009;4:29.
- 178.Greenhalgh T, Wherton J, Papoutsi C, Lynch J, Hughes G, Hinder S, et al. Beyond adoption: a new framework for theorizing and evaluating nonadoption, abandonment, and challenges to the scale-up, spread, and sustainability of health and care technologies. J Med Internet Res. 2017;19:e8775.
- 179.Wykes T, Bowie CR, Cella M. Thinking about the future of cognitive remediation therapy revisited: what is left to solve before patients have access? Schizophr Bull. 2024;50:993–1005.
- 180.Benjet C, Zainal NH, Albor Y, Alvis-Barranco L, Carrasco-Tapias N, Contreras-Ibáñez CC, et al. A precision treatment model for internet-delivered cognitive behavioral therapy for anxiety and depression among university students: a secondary analysis of a randomized clinical trial. JAMA Psychiatry. 2023;80:768–77.
- 181.Furukawa TA, Noma H, Tajika A, Toyomoto R, Sakata M, Luo Y, et al. Personalised & optimised therapy (POT) algorithm using five cognitive and behavioural skills for subthreshold depression. NPJ Digit Med. 2025;8:531.
- 182.Kollins SH, DeLoss DJ, Cañadas E, Lutz J, Findling RL, Keefe RSE, et al. A novel digital intervention for actively reducing severity of paediatric ADHD (STARS-ADHD): a randomised controlled trial. Lancet Digit Health. 2020;2:e168–e178.
- 183.Michie S, Van Stralen MM, West R. The behaviour change wheel: a new method for characterising and designing behaviour change interventions. Implement Sci. 2011;6:42.
- 184.Michie S, Richardson M, Johnston M, Abraham C, Francis J, Hardeman W, et al. The behavior change technique taxonomy (v1) of 93 hierarchically clustered techniques: building an international consensus for the reporting of behavior change interventions. Ann Behav Med. 2013;46:81–95.
- 185.Kappen TH, van Klei WA, van Wolfswinkel L, Kalkman CJ, Vergouwe Y, Moons KGM. Evaluating the impact of prediction models: lessons learned, challenges, and recommendations. Diagn Progn Res. 2018;2:11.
- 186.Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making. 2006;26:565–74.
- 187.Vickers AJ, Van Calster B, Steyerberg EW. Net benefit approaches to the evaluation of prediction models, molecular markers, and diagnostic tests. BMJ. 2016;352:i6.
- 188.van Calster B, McLernon DJ, van Smeden M, Wynants L, Steyerberg EW, on behalf of Topic Group ‘Evaluating diagnostic tests and prediction models’ of the STRATOS initiative. Calibration: the Achilles heel of predictive analytics. BMC Med. 2019;17:230.
- 189.Kompa B, Snoek J, Beam AL. Second opinion needed: communicating uncertainty in medical machine learning. NPJ Digit Med. 2021;4:4.
- 190.D’Amour A, Heller K, Moldovan D, Adlam B, Alipanahi B, Beutel A, et al. Underspecification presents challenges for credibility in modern machine learning. J Mach Learn Res. 2022;23:1–61.
- 191.Koch LM, Baumgartner CF, Berens P. Distribution shift detection for the postmarket surveillance of medical AI algorithms: a retrospective simulation study. NPJ Digit Med. 2024;7:120.
- 192.Angelopoulos AN, Bates S. Conformal prediction: a gentle introduction. Found Trends Mach Learn. 2023;16:494–591.
- 193.Swaminathan A, Lopez I, Wang W, Srivastava U, Tran E, Bhargava-Shah A, et al. Selective prediction for extracting unstructured clinical data. J Am Med Inform Assoc. 2023;31:188–97.
- 194.Tonekaboni S, Joshi S, McCradden MD, Goldenberg A. What clinicians want: contextualizing explainable machine learning for clinical end use. In: Proceedings of Machine Learning Research. Rochester, MN: Machine Learning for Healthcare; 2019.
- 195.Reddy S. Explainability and artificial intelligence in medicine. Lancet Digit Health. 2022;4:e214–e215.
- 196.Gottesman O, Johansson F, Komorowski M, Faisal A, Sontag D, Doshi-Velez F, et al. Guidelines for reinforcement learning in healthcare. Nat Med. 2019;25:16–18.
- 197.Hager P, Jungmann F, Holland R, Bhagat K, Hubrecht I, Knauer M, et al. Evaluation and mitigation of the limitations of large language models in clinical decision-making. Nat Med. 2024;30:2613–22.
- 198.Moor M, Banerjee O, Abad ZSH, Krumholz HM, Leskovec J, Topol EJ, et al. Foundation models for generalist medical artificial intelligence. Nature. 2023;616:259–65.
- 199.Adler-Milstein J, Mehrotra A. Paying for digital health care—problems with the fee-for-service system. N Engl J Med. 2021;385:871–3.
- 200.Lozano E, Meza SF, Alexander A, Bonilla P, Jaramillo W. Remote patient monitoring (RPM). In: Davis M, Kirwan M, Maclay W, Pappas H, editors. Closing the Care Gap with Wearable Devices: Innovating Healthcare with Wearable Patient Monitoring. New York, NY: Productivity Press; 2022.
- 201.Rodriguez JA, Shachar C, Bates DW. Digital inclusion as health care—supporting health care equity with digital-infrastructure initiatives. N Engl J Med. 2022;386:1101–3.
- 202.Glasgow RE, Harden SM, Gaglio B, Rabin B, Smith ML, Porter GC, et al. RE-AIM planning and evaluation framework: adapting to new science and practice with a 20-year review. Front Public Health. 2019;7:64.
- 203.Loudon K, Treweek S, Sullivan F, Donnan P, Thorpe KE, Zwarenstein M. The PRECIS-2 tool: designing trials that are fit for purpose. BMJ. 2015;350:h2147.
- 204.Ford I, Norrie J. Pragmatic trials. N Engl J Med. 2016;375:454–63.
- 205.Slade M, Leamy M, Bacon F, Janosik M, Le Boutillier C, Williams J, et al. International differences in understanding recovery: systematic review. Epidemiol Psychiatr Sci. 2012;21:353–64.
- 206.Murwasuminar B, Munro I, Recoche K. Mental health recovery for people with schizophrenia in Southeast Asia: a systematic review. J Psychiatr Ment Health Nurs. 2023;30:620–36. 10.1111/jpm.12902.
- 207.Collins GS, Dhiman P, Andaur Navarro CLA, Ma J, Hooft L, Reitsma JB, et al. Protocol for development of a reporting guideline (TRIPOD-AI) and risk of bias tool (PROBAST-AI) for diagnostic and prognostic prediction model studies based on artificial intelligence. BMJ Open. 2021;11:e048008.
- 208.Moons KGM, Damen JAA, Kaul T, Hooft L, Andaur Navarro C, Dhiman P, et al. PROBAST+AI: an updated quality, risk of bias, and applicability assessment tool for prediction models using regression or artificial intelligence methods. BMJ. 2025;388:e082505.
- 209.Eisner E, Berry N, Bucci S. Digital tools to support mental health: a survey study in psychosis. BMC Psychiatry. 2023;23:726.
- 210.Yang LH, Anglin DM, Wonpat-Borja AJ, Opler MG, Greenspoon M, Corcoran CM. Public stigma associated with psychosis risk syndrome in a college population: implications for peer intervention. Psychiatr Serv. 2013;64:284–8.
- 211.Corrigan PW, Druss BG, Perlick DA. The impact of mental illness stigma on seeking and participating in mental health care. Psychol Sci Public Interest. 2014;15:37–70.
- 212.Burns T, Rugkåsa J, Molodynski A, Dawson J, Yeeles K, Vazquez-Montes M, et al. Community treatment orders for patients with psychosis (OCTET): a randomised controlled trial. Lancet. 2013;381:1627–33.
- 213.National Institute for Health and Care Excellence. Transition between inpatient mental health settings and community or care home settings. NICE guideline [NG53]. 2016.
- 214.Smith M, Saunders R, Stuckhardt L, McGinnis JM, editors; Committee on the Learning Health Care System in America, Institute of Medicine. Best Care at Lower Cost: The Path to Continuously Learning Health Care in America. Washington, DC: National Academies Press; 2013.
- 215.Pereira CVF, de Oliveira EM, de Souza AD. Machine learning applied to edge computing and wearable devices for healthcare: systematic mapping of the literature. Sensors. 2024;24:6322.
- 216.Fiske A, Radhuber IM, Willem T, Buyx A, Celi LA, McLennan S. Climate change and health: the next challenge of ethical AI. Lancet Glob Health. 2025;13:e1314–e1320.
- 217.Lokmic-Tomkins Z, Davies S, Block LJ, Cochrane L, Dorin A, Von Gerich H, et al. Assessing the carbon footprint of digital health interventions: a scoping review. J Am Med Inform Assoc. 2022;29:2128–39.
- 218.National Institute for Health and Care Excellence. Evidence Standards Framework for Digital Health Technologies. London, UK: National Institute for Health and Care Excellence; 2019.
- 219.Wenderott K, Krups J, Zaruchas F, Weigl M. Effects of artificial intelligence implementation on efficiency in medical imaging—a systematic literature review and meta-analysis. NPJ Digit Med. 2024;7:265.




