Abstract
Introduction
Advanced AI technologies make it possible to analyze faces and the minute details of facial expressions associated with neurological and psychopathological conditions, and to create algorithms that automate diagnosis. The purpose of this review is to analyze which AI technologies have been developed and how they are used in the diagnosis of mental and neurological disorders through the automated recognition of facial expressions and the minute details of their changes.
Methods
A systematic search of the main scientific databases (PubMed, Scopus, Web of Science, ScienceDirect, IEEE Xplore, and Google Scholar) was carried out on January 22, 2025. Only English-language publications from 2021 to 2025 were included. The inclusion criteria were the use of AI methods on still or moving facial images in the context of diagnosing mental and/or neurological disorders. The quality of the studies and the risk of bias were evaluated with appraisal tools appropriate to each study design.
Results
In this study, after reviewing 1710 initial articles, 36 relevant and eligible articles were selected for analysis. These articles mainly focused on diagnosing mental and neurological disorders such as autism, depression, and anxiety using artificial intelligence and facial feature analysis. The diversity of populations and sample sizes was considerable, and the input data included images, video, EEG, and fMRI. The reported diagnostic accuracy of AI models ranged from 80.5% to 99.9% (mean ≈ 93%), with F1-scores between 0.87 and 0.99 and AUC values mostly above 0.90. The most frequently used algorithms were convolutional neural networks (CNNs), transfer learning, and hybrid deep learning approaches.
Conclusion
The results of the study indicate significant growth in the use of artificial intelligence for diagnosing mental and neurological disorders, especially autism, depression, and anxiety. The use of facial and multimodal data combined with advanced algorithms has increased diagnostic accuracy.
Graphical Abstract
Supplementary information
The online version contains supplementary material available at 10.1186/s12888-025-07739-7.
Keywords: Artificial intelligence, Deep learning, Facial microexpressions, Mental disorders, Facial analysis, Early diagnosis
Introduction
Mental and neurological disorders are among the most complex and challenging public health issues in the contemporary world [1]. These disorders not only affect the physical and mental health of individuals, but also place a heavy burden on the health, economic and social systems of societies [2]. Timely and accurate diagnosis of these diseases, especially disorders such as depression, anxiety, schizophrenia, bipolar disorder, autism and neurological diseases such as Parkinson’s and Alzheimer’s, plays a fundamental role in improving the treatment process, reducing complications and improving the quality of life of patients [3–6]. However, traditional diagnostic methods, which are mainly based on clinical interviews, psychometric tests and medical imaging, have many limitations [7, 8]. In addition to being costly and time-consuming, these methods often rely on human and empirical judgments, which can lead to errors, delays in diagnosis and inconsistencies between assessments [9–11]. Moreover, access to experienced professionals is limited in many geographical areas, which is an obstacle to providing fast and accurate services [12].
In such circumstances, new technologies based on artificial intelligence and machine learning have emerged as promising solutions to overcome these challenges [13, 14]. With significant advances in deep learning algorithms, it has become possible to extract and analyze complex patterns from image data both accurately and quickly [15]. One of the most promising areas in this field is the analysis of facial expressions, especially micro-expressions. Micro-expressions are short-term, involuntary, and very subtle movements of facial muscles that reveal a person’s true emotions and states even when they intend to hide them [16]. This unique feature has made micro-expressions a powerful and promising tool for diagnosing mental and neurological disorders [17].
Despite technological advances, the reality is that there is still a significant gap in the knowledge and practical applications of these technologies [18]. Numerous studies have been conducted in the field of facial expression recognition using artificial intelligence, but many key questions remain: which algorithms are most accurate in detecting micro-expressions? [19, 20] How representative are the data used of real clinical populations? Are there standard methods for collecting, labeling, and validating the data? How are ethical, legal, and privacy barriers to the use of these data managed? And, most importantly, can the results of laboratory research be generalized to real clinical settings [21]?
This huge gap in the transition of AI technologies from the laboratory to clinical and practical applications highlights the need for systematic reviews [22]. By collecting and analyzing the available evidence in a coherent manner, systematic reviews can provide a comprehensive and clear picture of the current state of research and accurately identify knowledge gaps. This process is crucial for researchers, technology developers, and clinicians to pave the way for evidence-based advancements and the development of intelligent applications.
In addition, along with the many opportunities that artificial intelligence provides in the early and accurate diagnosis of mental and neurological disorders, there are also several challenges that, if ignored, can lead to inefficiency or misuse of these technologies [23]. The lack of unified standards in image data collection, low diversity of training samples, problems related to image quality and recording conditions, cultural and linguistic differences in the expression of emotions, and issues related to privacy and ethics are all among the concerns that must be considered in the development of these technologies [24]. Also, current algorithms may be affected by environmental conditions such as lighting, camera angle, facial covering, and even the individual’s voluntary expression, and these factors can reduce the accuracy of diagnosis [25].
Despite the growing number of studies applying artificial intelligence to emotion recognition and facial analysis, several important knowledge gaps remain. Previous research has often focused on limited disorders (mainly autism or depression), used small or homogeneous datasets, and relied on single-modality inputs such as facial images only. Moreover, there is a lack of standardized protocols for data collection, model evaluation, and risk of bias assessment, which limits the comparability and clinical translation of findings. Comprehensive synthesis of AI-based approaches across different mental and neurological conditions considering diverse algorithms, multimodal inputs, and performance indicators is still scarce.
Therefore, this systematic review aims to fill these gaps by providing an integrated analysis of recent studies on AI-based recognition of facial and micro-expressions for diagnosing mental and neurological disorders. It identifies research trends, algorithms, datasets, and evaluation methods, while highlighting limitations and future research directions to enhance reliability, generalizability, and clinical applicability.
Material and methods
This study aimed to investigate the application of artificial intelligence methods in recognizing facial expressions and microexpressions for diagnosing mental and neurological disorders through a systematic review. Due to the diversity of methods and indicators used in different studies, it was not possible to conduct a meta-analysis. The steps involved in conducting this systematic review are described in detail below.
Search methods
A comprehensive search was conducted in PubMed, Scopus, Web of Science, ScienceDirect, IEEE Xplore, and Google Scholar. The search strategy was designed based on the main concepts of artificial intelligence, mental and neurological disorders, and facial or micro-expression analysis. Boolean operators (AND, OR) were applied to combine related keywords. The search was performed on January 22, 2025, and covered studies published between 2021 and 2025. The full search formulas used in each database are presented in Table 1. Additionally, reference lists of relevant articles were manually screened to identify any additional studies. Table 2 shows the inclusion and exclusion criteria.
Table 1.
Search strategy used in each database
| Database | Search Formula |
|---|---|
| PubMed | (“artificial intelligence” OR AI OR “machine learning” OR “deep learning” OR “computer vision”) AND (depression OR anxiety OR “bipolar disorder” OR schizophrenia OR autism OR Parkinson OR PTSD OR anger) AND (facial OR face OR “micro-expression” OR “micro facial expression” OR “subtle facial expression”) |
| Scopus | TITLE-ABS-KEY((“artificial intelligence” OR AI OR “machine learning” OR “deep learning” OR “computer vision”) AND (depression OR anxiety OR “bipolar disorder” OR schizophrenia OR autism OR Parkinson OR PTSD OR anger) AND (facial OR face OR “micro-expression” OR “micro facial expression” OR “subtle facial expression”)) |
| Web of Science | TS=(“artificial intelligence” OR AI OR “machine learning” OR “deep learning” OR “computer vision”) AND TS=(depression OR anxiety OR “bipolar disorder” OR schizophrenia OR autism OR Parkinson OR PTSD OR anger) AND TS=(facial OR face OR “micro-expression” OR “micro facial expression” OR “subtle facial expression”) |
| ScienceDirect | (“artificial intelligence” OR “machine learning” OR “deep learning”) AND (“facial expression” OR “micro-expression”) AND (depression OR anxiety OR autism OR Parkinson OR PTSD) |
| IEEE Xplore | (“artificial intelligence” OR “machine learning” OR “deep learning”) AND (“facial expression” OR “micro-expression”) AND (“mental disorder” OR “neurological disorder” OR depression OR anxiety OR autism OR Parkinson) |
| Google Scholar | allintitle: (“artificial intelligence” OR “machine learning” OR “deep learning”) “facial expression” OR “micro-expression” “mental disorder” OR “neurological disorder” |
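As an illustration of the Boolean structure shared by the formulas in Table 1, the PubMed query can be assembled programmatically. This is a minimal sketch; the `or_group` helper and the term lists are our own reproduction of the published strategy, and no database API is involved (the database receives only the final string).

```python
# Illustrative sketch: assembling the PubMed Boolean query from Table 1.
# The or_group helper is ours; multi-word phrases are quoted, as in the table.

def or_group(terms):
    """Join terms with OR and wrap in parentheses, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

ai_terms = ["artificial intelligence", "AI", "machine learning",
            "deep learning", "computer vision"]
disorder_terms = ["depression", "anxiety", "bipolar disorder", "schizophrenia",
                  "autism", "Parkinson", "PTSD", "anger"]
face_terms = ["facial", "face", "micro-expression",
              "micro facial expression", "subtle facial expression"]

# The three concept groups are combined with AND, each group internally with OR.
query = " AND ".join(or_group(g) for g in (ai_terms, disorder_terms, face_terms))
print(query)
```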
Table 2.
Inclusion and exclusion criteria
| Criteria | Inclusion Criteria | Exclusion Criteria |
|---|---|---|
| AI Methods | Studies using artificial intelligence (AI), machine learning, deep learning, or hybrid approaches for facial expression or micro-expression recognition. | Studies using traditional statistical methods or rule-based approaches without AI algorithms. |
| Participants | Studies involving human participants of any age group. | Studies involving non-human or synthetic data without human facial sample validation. |
| Data Type | Studies using visual or multimodal data (e.g., facial images, videos, EEG, or fMRI with facial cues). | Studies that used emotion recognition without a clear clinical or diagnostic objective. |
| Publication | Published in peer-reviewed journals or conferences, written in English, and published between 2021 and 2025. | Non-peer-reviewed sources, such as preprints, reviews, or commentaries. |
| Target Disorder | Studies aiming to diagnose, screen, or classify mental and neurological disorders like depression, anxiety, bipolar disorder, autism, Parkinson’s disease, or PTSD. | Articles that are irrelevant to the target disorder or non-facial modality. |
| Study Reporting | Studies with clear and complete methodological reporting for AI applications. | Incomplete methodological reporting, lack of AI application, or missing data on modality. |
Selection of articles
After the search was completed, the results were imported into EndNote. Two researchers independently performed the screening in two stages. In the first stage, duplicate articles were removed and titles and abstracts were reviewed against the study objective; studies that did not meet the inclusion criteria were excluded, and the remaining articles entered the second stage. In this stage, the full text of each article was read independently by the two reviewers, and whenever an article was excluded, the reason was recorded. Finally, eligible articles were selected for data extraction. The selection process is shown in the PRISMA 2020 diagram.
Data extraction
To extract data, a structured checklist was first designed according to the study objectives and implemented in Excel software. Then, two researchers independently extracted the required data from the full text of the articles. The extracted information included the name of the first author, year of publication, type of study, location of study, type of disorder studied, type of input (image or video), type of artificial intelligence algorithm used, model performance indicators such as accuracy, sensitivity, specificity, and AUC, and the type of data (real or synthetic with human validation).
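The extraction checklist described above can be represented as a structured record. The field names below are our own illustration of the items listed in the text, not the authors' actual Excel template, and the example values are invented.

```python
# Hypothetical sketch of one row of the extraction checklist described above;
# field names mirror the items listed in the text, values are invented.
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class ExtractionRecord:
    first_author: str
    year: int
    study_type: str
    location: str
    disorder: str
    input_type: str          # e.g. "image" or "video"
    algorithm: str
    accuracy: Optional[float] = None
    sensitivity: Optional[float] = None
    specificity: Optional[float] = None
    auc: Optional[float] = None
    data_type: str = "real"  # "real" or "synthetic (human-validated)"

# One invented example row, exported to a plain dict (e.g. for an Excel sheet).
row = ExtractionRecord("Doe", 2023, "cross-sectional", "USA",
                       "ASD", "image", "CNN (transfer learning)", accuracy=0.95)
print(asdict(row)["accuracy"])
```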
Quality and risk of bias assessment
The methodological quality and risk of bias of all included studies were independently evaluated by two reviewers using the Joanna Briggs Institute (JBI) Critical Appraisal Checklist for cross-sectional studies and the Newcastle–Ottawa Scale (NOS) for cohort or case-control studies. Each study was classified as high, moderate, or low quality according to the total score. Disagreements were resolved by consensus. The detailed quality ratings are presented in Supplementary Table S1.
Research questions
This systematic review was designed to answer the following key research questions:
Which mental and neurological disorders have been investigated using artificial intelligence methods for facial and micro-expression analysis?
What types of data modalities (e.g., facial images, videos, EEG, fMRI, multimodal inputs) have been employed in these studies?
Which artificial intelligence and machine learning algorithms have been used, and what are their main performance indicators (e.g., accuracy, F1-score, AUC)?
What are the current research gaps, limitations, and challenges reported in the literature regarding AI-based diagnosis through facial and micro-expression recognition?
Results and discussion
Study selection
The systematic database search identified 1710 potentially relevant articles; no additional studies were identified through other sources (Fig. 1). After removing duplicates, 1132 articles remained for initial review. Of these, 915 articles entered title and abstract screening, at which stage 857 were excluded for lack of relevance to the study topic. The full text of the remaining 58 articles was then carefully reviewed against the inclusion criteria, and 22 articles were excluded for reasons such as the lack of a clear clinical framework, the absence of artificial intelligence methods, or methodological weaknesses. Finally, 36 eligible articles were selected for inclusion in the qualitative synthesis.
Fig. 1.
Diagram of study selection process
Table 3 summarizes the 36 selected studies that used AI algorithms to analyze faces, facial expressions, or emotional states for the purpose of diagnosing mental and neurological disorders. These studies used a variety of machine learning and deep learning techniques on inputs such as images, videos, EEG data, and multimodal states.
Table 3.
Selected studies on the use of artificial intelligence for facial and micro-expression analysis in the diagnosis of mental and neurological disorders
| N | Study (Author, Year) | Target Disorder(s) | Population/Sample Size | Input Modality | AI/ML Technique(s) | Performance Metrics | Key Findings |
|---|---|---|---|---|---|---|---|
| 1 | [26] | Depression, Anxiety, Apathy | 319 older adults with MCI | Speech, Facial Expressions, Text | Random Forest + SHAP | F1: 96.6%, Acc: 87.4%, Prec: 86.6%, Rec: 87.6% | Multimodal model using speech, facial and text features showed high accuracy in classifying mental states. |
| 2 | [27] | Autism | Children (unspecified number) | Facial Expressions (Real-time via IoT) | Enhanced Deep Learning (CNN + GA) | Acc: 99.99% | Real-time system detects 6 emotions with very high accuracy, integrated with IoT for latency reduction. |
| 3 | [28] | Depression | Not reported | Facial Expressions (Images & Videos) | Fusion Fuzzy Logic + CNN | Acc: 94.3% | Fusion of fuzzy logic with CNN yielded near-human-level recognition of depression from facial cues. |
| 4 | [29] | Autism Spectrum Disorder (ASD) | Children (size not specified, unique clinical dataset) | Facial Images | Transfer Learning (VGG16) | Acc: 95%, F1: 0.95 | Deep learning model distinguished ASD from typically developing (TD) children; ethnic/racial factors affect model generalizability. |
| 5 | [30] | Depression | Not specified | Facial Expressions + EEG | AI-assisted Emotion Recognition + EEG Biomarkers | Facial: Acc: 93.58%, F1: 93.68%; EEG: Acc: 99.75%, F1: 99.75% | Developed a GUI tool integrating facial and EEG input; high accuracy and sensitivity in early-stage depression detection. |
| 6 | [31] | Autism Spectrum Disorder (ASD) | 2936 facial images of children | Facial Images | AutoML, SVM, Random Forest, Deep Learning | Acc: ~96% (AutoML best) | AutoML with Hyperopt optimization outperformed manual ML approaches; facilitated hyperparameter tuning and feature selection. |
| 7 | [32] | Depression | Not specified | EEG + Facial Features | BiLSTM (Hybrid DL with Feature Fusion) | Not specified (comparative improvement reported) | Proposed BiLSTM model with fused EEG and facial data outperformed traditional models in depression detection. |
| 8 | [33] | Autism Spectrum Disorder (ASD) | Not specified; used AFFECTNET & KAFD-2020 | Facial Expressions (FACS-based) | FACS-CNN + LSTM (Hybrid DL) | MLP: 48.67%, CNN: 67.75%, LSTM: 71.56%, Proposed: 92% Acc | Proposed hybrid FACS-CNN-LSTM architecture showed significant performance gain in ASD detection from facial expressions. |
| 9 | [34] | Autism Spectrum Disorder (ASD) | 37 ASD adults + 43 controls | Facial Expression Imitation (Computer Vision) | Automated Face Analysis | Not numeric; qualitative assessment | ASD group showed slower, less precise facial imitation; precision correlated with emotion recognition. Highlights potential for therapy targeting imitation to improve recognition. |
| 10 | [35] | Autism Spectrum Disorder (ASD) | 42 participants (video clips, 3 min each) | Facial Behavior (Naturalistic Videos) | AI algorithm vs. Human Experts | AI Acc: 80.5%; Experts: 83.1%; Non-experts: 78.3% | AI matched expert-level accuracy; detected cases missed by humans, highlighting AI's assistive potential in clinical diagnosis. |
| 11 | [36] | Anxiety | 45 preschool children | fMRI (Emotional Faces) | Logistic Lasso Regression | Best model (exact metrics not specified) | Brain connectivity features (e.g. MPFC–amygdala) used to detect anxiety; fMRI-based ML diagnosis enables deeper insight into childhood anxiety. |
| 12 | [37] | Stress, Anxiety | Not specified | Facial Expressions (Eyebrow Movement) | Deep Learning (CNN-based approach) | Not numerically reported | Developed stress/anxiety scale (1–100) based on facial geometry; proposed real-time monitoring using eyebrow position as key marker. |
| 13 | [38] | State Anxiety (Nervousness) | Participants in interview setting (exact N not specified) | Dynamic Facial Behavior (Video) | Computer Vision (Behavioral Nervousness Models) | Not quantified; qualitative comparison to humans | AI outperformed human observers in detecting nervousness from facial dynamics; automated model captured subtle cues missed by experts. |
| 14 | [39] | Autism Spectrum Disorder (ASD) | 2836 facial images (1418 ASD, 1418 non-ASD children) | Facial Images | YOLO + CNNs (MobileNet, Xception, InceptionV3) | Acc: 95% (MobileNet), 94% (Xception), 89% (InceptionV3) | Proposed model accurately distinguishes ASD vs. non-ASD children; supports early ASD screening via enhanced facial feature analysis. |
| 15 | [40] | Autism Spectrum Disorder (ASD) | Dataset size not specified (multiple facial image datasets) | Facial Images | 12 DL Models (Best: DenseNet121) + Explainable AI (LIME, Grad-CAM) | Acc: 90.33%, Prec: 92%, Rec: 92%, F1: 90% | DenseNet121 with XAI showed high accuracy and interpretability; facial region visualization enhanced trust and clinical transparency. |
| 16 | [41] | Autism Spectrum Disorder (ASD) | Children (video data; exact N not specified) | Facial Features (Videos) | Logistic Regression, Decision Tree, Neural Network | NN Acc: 99.8%; DT: 99.65% | Neural networks yielded highest accuracy for ASD detection via video; method optimized for low computational demand and emphasized lack of public video datasets. |
| 17 | [42] | ASD (Face Perception) | ASD vs. TD individuals (number not specified) | Facial Image Ratings | FaceNet + Regression Analysis | R2 higher for ASD group (exact metrics not given) | FaceNet’s attention to facial regions mirrors ASD patterns; ASD individuals’ evaluations aligned more closely with AI than TD peers. |
| 18 | [43] | Autism Spectrum Disorder (ASD) | Children with ASD | Facial Expressions | Facial Emotion Recognition (FER) + Audio Feedback System | Not specified | Developed a child interaction system using FER and sound feedback to improve behavioral engagement in children with ASD. |
| 19 | [44] | Depression | Not specified | Facial Expressions | Graph Coloring Algorithms + Facial Recognition AI | Not reported numerically | Combined facial recognition with graph-theory-based optimization for better emotional signal detection; proposed a unified AI system to assist clinicians in depression diagnosis. |
| 20 | [45] | Autism Spectrum Disorder (ASD) | Not specified (children) | Thermal Face Images | SVM, KNN, Naive Bayes, Random Forest | Metrics not reported | Used GLCM texture features from thermal images to distinguish between ASD and controls; proposed severity-level estimation using facial data. |
| 21 | [46] | Depression | 25 participants | Facial AUs, Head Euler Angles, Landmarks | SVC, Random Forest, XGBClassifier, ANN | AUROC: SVC 0.94, RF 0.92, XGB 0.91, ANN 0.91 | Used detailed statistical facial movement features and PHQ-9 scores; achieved high accuracy in classifying depressive episodes based on facial behavior. |
| 22 | [47] | Autism Spectrum Disorder (ASD) | Not specified | SDFT of Facial Images | CNN + Transfer Learning | Accuracy: Raw 44%, SDFT 52%, Transfer Learning 85% | Used frequency domain features from facial images (SDFT/PSD) to classify ASD; transfer learning improved accuracy significantly. |
| 23 | [48] | Parkinson’s Disease (PD) | 140 videos (70 PD, 70 controls) | 2D Video Facial Features (Geometric & Texture) | RF, SVM, KNN | Accuracy: Geometric 83%, Texture 86%, Combined F1-score 88% | Developed markerless facial feature recognition AI for masked face detection; demonstrated high diagnostic power with RF. |
| 24 | [49] | Autism Spectrum Disorder (ASD) | Not specified | Facial Images | Pre-trained CNNs (MobileNetV3Large, DenseNet121, etc.) + DNN, AdaBoost+SVM/LogReg | Accuracy: DNN 90.63%, AdaBoost LR 89.21%, AdaBoost SVM 88.92% | Achieved high performance in early autism diagnosis using CNN features + ML classifiers with ANOVA for feature selection. |
| 25 | [50] | Depression | 160 university students (80 with depression, 80 control) | Facial expressions (OpenFace), Actions (Kinect) | CNN-LSTM, TCN, Weighted Fusion | CNN-LSTM: 0.781; TCN: 0.769; Fusion: 0.875 (Accuracy) | Fusion of facial and motion features significantly improved accuracy; proposed model demonstrated promising applicability for depression detection in university students. |
| 26 | (Sankari and Kannammal) | Autism Spectrum Disorder (ASD) | Not specified (children, ages 3–11) | Behavioral + Facial Image Features | Logistic Regression, KNN, SVM, Extra Tree, RF, ANN | Image-SVM: 70%; AQ-10 LR/ET: 100% (Accuracy) | Best results with Extra Tree and Logistic Regression on behavioral data; image-based SVM reached 70%; combining behavioral and facial features improved prediction. |
| 27 | [51] | Autism Spectrum Disorder (ASD) | Not specified | Facial Images | VGG16, ResNet-50, SE-ResNet-50, MobileNetv2 (pretrained on VGGFace2) | VGG16: Accuracy 0.86, AUC 0.86 | Domain-specific pretraining (VGGFace2) improved performance in ASD detection; VGG16 outperformed standard architectures; models captured fine-grained facial cues. |
| 28 | [52] | Autism Spectrum Disorder (ASD) | Not specified (image dataset labeled autistic/non-autistic) | Facial features (landmarks, symmetry), Eye gaze estimation | Logistic Regression, k-NN, Gradient Boosting, RF, CNN | Not specified numerically | Combining facial landmark metrics, symmetry, and gaze direction enhanced early ASD detection; models showed promising non-invasive diagnostic potential. |
| 29 | [53] | Anxiety in Drug Addiction | 279 male drug addicts | Facial movement analysis (via image sensors) | ML prediction models (type not specified) | Reliability: r = 0.976; Validity: r = 0.390 | ML-based facial analysis provided reliable tools for dynamically monitoring anxiety; study confirmed potential of nontraditional methods in addiction therapy. |
| 30 | [54] | Depression | 10 patients + public datasets (AffectNet, CREMA-D, MPII) | Facial expressions, body language, voice | Multimodal Attention Fusion Network (MAFN) | MAFN > unimodal models (higher accuracy, precision, recall, F1) | Fusion of multimodal data outperformed single-input models in early depression diagnosis; model is clinically applicable for early intervention. |
| 31 | [55] | Autism Spectrum Disorder (ASD) | Not specified (facial image dataset) | Facial features (static images) | Hybrid DL using VGG16, MobileNetV3Small, ResNet50 | Accuracy: 95.8% | Hybrid deep learning model combining multiple CNN architectures effectively distinguished ASD, supporting early non-invasive detection. |
| 32 | [56] | Autism Spectrum Disorder (ASD) | Bangladeshi children (2 datasets × 1500 images) | Facial expressions | EfficientNet (B0, B7, V2L), MobileNetV2, ResNet, DenseNet, VGG | Accuracy: 99% (toddlers), 94% (teens); avg. 96% & 82.5% | Ethnicity-specific training enhanced ASD detection; EfficientNetB0 achieved highest performance in toddlers; highlights role of cultural tailoring in DL models. |
| 33 | [57] | Autism Spectrum Disorder (ASD) | 25 ASD + 26 matched controls (4,896 recordings) | Facial movements (posed expressions) | Bayesian analysis + K-NN | Accuracy: 91.87%; Precision: 88.51% (ASD), 95.89% (non-ASD); Recall: 96.25% (ASD), 87.50% (non-ASD) | Facial movement differences are robust biomarkers for ASD across sex and race; machine learning classifiers show strong diagnostic performance. |
| 34 | [58] | Autism Spectrum Disorder (ASD), Emotion Recognition | 600+ facial images (children) | Static facial images | MobileViT (ASD), VGG-16 (emotions) | Accuracy: MobileViT: 99% (train), 98% (val); VGG-16: 96% (train), 99% (val) | Developed a dual-model system to detect ASD and facial expressions in children. High accuracy achieved; deployed as a user-accessible web tool. |
| 35 | [59] | Depression | Public video datasets | Video-based facial expressions | Deep learning with label distribution + metric learning on spatiotemporal features | Not numerically stated; noted as superior to prior methods | Combined distribution learning with metric learning enhances feature discrimination and improves facial depression prediction performance. |
| 36 | [60] | Autism Spectrum Disorder (ASD) | Two datasets: Kaggle & YTUIA | Facial images (heterogeneous datasets) | Federated learning with Xception backbone | Accuracy ≈ 90%; 30% gain on cross-domain test sets | Pioneered federated learning for ASD detection across datasets; significant accuracy gains and preservation of data privacy. Highlights domain adaptation’s role in ASD detection. |
Model performance and methodological characteristics
The reported diagnostic performance of AI-based models was generally high across the reviewed studies. Reported accuracy values ranged from 80.5% to 99.9%, with an overall mean accuracy of approximately 93%, indicating that most algorithms achieved strong classification capability in distinguishing between patients and controls or between different mental states. The F1-scores, reflecting the balance between precision and recall, were typically reported between 0.87 and 0.99, suggesting robust detection performance even in cases of class imbalance. Moreover, the area under the receiver operating characteristic curve (AUC) was reported in several studies and was mostly above 0.90, confirming excellent discriminative power of the AI models.
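For reference, the headline metrics summarized above (accuracy, F1-score, and AUC) can be computed directly from raw predictions. The following self-contained sketch uses invented toy labels and scores, not data from any included study; the AUC is computed via the Mann-Whitney formulation (probability that a random positive is scored above a random negative).

```python
# Toy illustration of the performance metrics summarized above.
# Labels and scores are invented; they do not come from any reviewed study.

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def auc(y_true, scores):
    # Probability that a random positive outscores a random negative;
    # ties count as 0.5 (Mann-Whitney formulation of ROC AUC).
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [1, 1, 1, 0, 0, 0, 1, 0]          # invented ground-truth labels
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]  # invented model scores
y_pred = [int(s >= 0.5) for s in scores]   # threshold at 0.5

print(accuracy(y_true, y_pred))  # 0.75
print(f1_score(y_true, y_pred))  # 0.75
print(auc(y_true, scores))       # 0.9375
```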
The most frequently used algorithms were convolutional neural networks (CNNs), transfer learning frameworks, and hybrid deep learning approaches such as CNN–LSTM or multimodal fusion models. Common traditional classifiers included Support Vector Machines (SVMs), Random Forests (RF), and K-Nearest Neighbors (KNN).
The most common predictors used across studies were facial landmarks, action units (AUs), geometric features, and texture-based features, with some studies also integrating EEG, fMRI, or voice data as multimodal predictors. Continuous predictors included facial muscle intensity, movement amplitude, and symmetry indices, while categorical predictors were primarily emotion labels or diagnostic categories.
Risk of bias and study quality
A total of 36 studies were assessed for methodological quality. According to the JBI and NOS evaluations, 17 studies (47%) were rated as high quality, 8 studies (22%) as moderate quality, and 11 studies (31%) as low quality. Common limitations included incomplete reporting of sample size, lack of cross-validation, and absence of standardized data collection procedures. Detailed quality assessment results are provided in Supplementary Table S1.
General characteristics of the reviewed studies
In this systematic review, 36 studies published between 2021 and 2025 were identified and reviewed. These studies were conducted by researchers from around the world and used artificial intelligence methods to analyze facial features or micro-expressions to diagnose mental and neurological disorders. Most of the articles were published recently (2023 to 2025).
This increasing trend in the publication of articles reflects the growing interest of the scientific community in the vital role of AI in the field of mental health and neurological disorders. Technological advances, the increasing availability of imaging data and biological signals, and the need for non-invasive and cost-effective methods for early diagnosis have been the main drivers of this growth. In addition, the geographical diversity of researchers is a testament to the global acceptance of this field and the need to develop adaptable and generalizable models that can be applied across cultures and populations. This will pave the way for future advances in the design of AI-based clinical systems.
Targeted mental and neurological disorders in studies
The studies reviewed in this systematic review focused on a diverse range of mental and neurological disorders. The most frequently studied disorder is autism spectrum disorder (ASD), identified as the target disorder in more than half of the articles, indicating researchers' special attention to the capabilities of AI in early diagnosis of this disorder. Depression ranks second and has been studied either alone or in combination with anxiety and apathy. Anxiety has likewise been considered in several studies, alone or in combination with stress or depression. Stress was the main target in one study and overlapped with anxiety in some others. Among other neurological disorders, Parkinson's disease was investigated in only one study, focusing on the recognition of emotionless (mask-like) faces. Apathy was also examined in one study, alongside depression and anxiety in elderly people with mild cognitive impairment (MCI).
Figure 2 shows the number of articles reviewed for the mental and neurological disorders studied. The largest number of studies was devoted to autism spectrum disorder (ASD). This was followed by depressive and anxiety disorders, and other disorders such as apathy, stress, and Parkinson’s disease received less attention.
Fig. 2.
Distribution of target disorders in systematic review studies
The major focus on autism spectrum disorder is justified, as early diagnosis of ASD plays an important role in improving individuals' quality of life and reducing the burden on families and society. Artificial intelligence, with its ability to analyze complex behavioral and facial data, can help identify this disorder early and accurately, a topic that has attracted increasing attention from researchers [61]. Depression and anxiety, common disorders with a high social and economic burden, present particular diagnostic and treatment challenges due to their multifaceted and variable nature. The use of artificial intelligence in this area can help uncover behavioral and biological patterns that may remain hidden from human observation. The limited study of Parkinson's disease, given its importance among movement and cognitive disorders, indicates a gap and a research opportunity that could be filled with more data and more specialized algorithms. Investigating apathy in older adults with MCI likewise represents a more comprehensive, interdisciplinary approach that could improve the diagnosis and management of cognitive and psychiatric disorders in vulnerable populations. Overall, this diverse range of disorders demonstrates the breadth of potential AI applications, but designing accurate and generalizable models requires accounting for the distinct clinical and cultural characteristics of each disorder [62, 63].
Study populations and sample sizes in the reviewed articles
The study populations in the reviewed articles varied widely in age groups, cognitive status, and clinical conditions. Many studies focused primarily on children and adolescents, particularly those related to autism spectrum disorder, which often used facial images, video recordings, or databases such as AFFECTNET and KAFD-2020. A few studies sampled children aged 3 to 11 years, while others did not specify an age range. Studies of depression or anxiety sometimes addressed more specific populations, such as older adults with mild cognitive impairment (MCI), university students, or preschool children. One study targeted male drug users. Sample sizes spanned a wide range: some studies used databases with over 2,800 facial images, while others included only a few dozen participants or a few dozen videos. In some cases, the exact sample size was not specified, with reports referring simply to "children" or "individuals with ASD," which limits the clarity of subsequent analyses.
The diversity of the study populations reflects an attempt to cover a wide range of clinical conditions and age groups, which is of considerable importance in the generalizability of the results. However, the lack of transparency in some reports and differences in sample sizes can limit the comparison and integration of data. The focus on children and adolescents seems logical given the early onset of disorders such as ASD, but to expand clinical applications, research on adults and the elderly should also be increased. The importance of using large and standardized databases to increase the generalizability of models and reduce potential errors should be emphasized [64, 65].
Types of inputs used for diagnosing mental and neurological disorders
A considerable variety of input data types has been observed in the reviewed studies. The most common type of input is facial features (such as images, videos, facial expressions, and subtle facial muscle movements), which have been used in most studies. Many of these studies have used image databases or recorded video data from real interactions and facial movements of children and adults. In addition to image data, some studies have used multimodal data, such as combining facial features with EEG signals, speech information, or even written text from patient interactions. This multimodal combination has in some cases led to increased accuracy and sensitivity of the diagnostic models. Several studies have also used fMRI data to analyze brain connectivity in the diagnosis of anxiety in children. Some other specific inputs include thermal images of the face, eyebrow movements, head angles and eye area, as well as behavioral and expressive features through the body (such as movements recorded with Kinect).
Figure 3 shows the frequency of input data types used in the reviewed studies for diagnosing mental and neurological disorders. The most commonly used input data in the studies were facial images and videos, which were used in 29 studies. This was followed by EEG data and multimodal models (a combination of multiple data types) each used in 3 studies. Other data such as fMRI, speech, thermal images, and behavioral data were used less frequently.
Fig. 3.
Distribution of input data (input modality) in the reviewed studies
The diversity of input data reflects technological advances and a deeper understanding of the biological and behavioral complexities of mental disorders. The use of multimodal data provides an opportunity to improve the accuracy of diagnosis and the reliability of models, because complementary information from multiple sources can compensate for the shortcomings of each individual data type. This multimodality approach requires the development of more complex algorithms and the optimization of data fusion, which is a major scientific challenge. Also, the use of more complex data such as fMRI opens up new potentials in brain analysis of these disorders that can contribute to more accurate diagnosis and a better understanding of the underlying mechanisms [66, 67].
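The feature-level (early) fusion strategy underlying many of these multimodal models can be sketched as follows. The feature dimensions, the data, and the classifier choice here are synthetic placeholders for illustration, not drawn from any reviewed study: the essential idea is simply that pre-extracted vectors from each modality are concatenated before a single classifier is trained.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical pre-extracted features for 200 subjects:
# 64-dim facial-expression embeddings and 32-dim EEG band-power features.
face_feats = rng.normal(size=(200, 64))
eeg_feats = rng.normal(size=(200, 32))
labels = rng.integers(0, 2, size=200)  # 0 = control, 1 = case

# Early (feature-level) fusion: concatenate the modality vectors so a
# single classifier sees complementary information from both sources.
fused = np.concatenate([face_feats, eeg_feats], axis=1)  # shape (200, 96)

X_tr, X_te, y_tr, y_te = train_test_split(fused, labels, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.2f}")
```

Late (decision-level) fusion, where a separate model per modality is trained and their predictions are combined, is the main alternative; early fusion is the simpler pattern and the one most often implied when studies describe "combining" facial and EEG features.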
AI algorithms and methods used in studies
The reviewed studies used a wide range of machine learning (ML) and deep learning (DL) methods to analyze facial data and other inputs. Many studies used convolutional neural networks (CNN) as the basis of the model, and some combined it with other algorithms such as genetic algorithms (GA), LSTM, or fuzzy logic to improve model performance. The use of transfer learning methods was also evident in several studies, especially in studies that used pre-trained models such as VGG16, MobileNet, ResNet, and EfficientNet to extract facial features. These models were sometimes used independently and sometimes in the framework of hybrid models. In some studies, classical machine learning methods such as SVM, Random Forest, KNN, Logistic Regression, AdaBoost, and XGBClassifier have been used to classify the extracted features. In some cases, hybrid models such as CNN-LSTM or CNN with AutoML or recurrent networks such as BiLSTM have been used. Several studies have also used more modern methods such as Federated Learning, Attention Networks, or combining learning models with statistical and graph-based analyses (such as SHAP, LIME, graph coloring, or Lasso regression analysis).
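The pipeline described above, in which a frozen pre-trained backbone extracts facial features and a classical classifier (e.g., an SVM) is trained on top, can be sketched as follows. Because a real backbone such as VGG16 or ResNet would require downloaded weights, a fixed random projection stands in for the frozen extractor here; the images, labels, and dimensions are invented for illustration.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)

# Stand-in for a frozen pre-trained CNN backbone (e.g., VGG16, ResNet):
# in practice this would map a face image to a fixed-length embedding.
def extract_features(images: np.ndarray) -> np.ndarray:
    # A fixed, seeded random projection mimics the key property of a
    # frozen extractor: deterministic, never updated during training.
    proj = np.random.default_rng(0).normal(size=(images.shape[1], 128))
    return np.tanh(images @ proj)

# Hypothetical flattened 32x32 grayscale face crops for 120 subjects.
images = rng.normal(size=(120, 1024))
labels = rng.integers(0, 2, size=120)

# Classical classifier trained on the frozen features, mirroring the
# transfer-learning-plus-SVM pattern reported in several studies.
feats = extract_features(images)
scores = cross_val_score(SVC(kernel="rbf"), feats, labels, cv=5)
print(f"5-fold CV accuracy: {scores.mean():.2f}")
```

The design choice this illustrates is that only the small classifier is fit to the target data, which is why the approach works with the limited sample sizes common in this literature.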
As shown in Fig. 4, most studies (20) used convolutional neural networks and deep learning. More traditional machine learning methods such as SVM and Random Forest were also used in 8 studies. Also, some studies used hybrid models and transfer learning to improve performance, while federated learning and other methods were less common.
Fig. 4.
Artificial intelligence and machine learning methods used in studies
The diversity of algorithms reflects the effort to find optimal model architectures for analyzing complex, multifaceted data. The use of convolutional neural networks is logical and effective given their strong ability to extract image features. Transfer learning and pre-trained models allow performance to be improved when data are limited. Hybrid and modern methods increase the flexibility and discriminative power of models and improve their ability to handle heterogeneous data. At the same time, the introduction of federated learning reflects growing attention to data privacy and the ability to train models on decentralized data, which is of increasing importance [68].
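The federated learning idea noted above — training a shared model without pooling raw patient data across sites — can be sketched as a minimal federated-averaging (FedAvg) loop over a simple logistic model. The clients, data, and hyperparameters here are synthetic and illustrative; real deployments add secure aggregation and far more capable local models.

```python
import numpy as np

# Each clinical site trains locally on its own data and shares only
# model weights; raw patient data never leaves the site.
def local_update(weights, X, y, lr=0.1, epochs=20):
    w = weights.copy()
    for _ in range(epochs):
        preds = 1 / (1 + np.exp(-X @ w))        # logistic model
        w -= lr * X.T @ (preds - y) / len(y)    # gradient step
    return w

def fed_avg(client_weights, client_sizes):
    # Server aggregates: weighted average of client models,
    # proportional to each client's data size.
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

rng = np.random.default_rng(1)
global_w = np.zeros(8)
clients = [(rng.normal(size=(50, 8)), rng.integers(0, 2, 50).astype(float))
           for _ in range(3)]  # three synthetic sites

for _ in range(5):  # communication rounds
    updates = [local_update(global_w, X, y) for X, y in clients]
    global_w = fed_avg(updates, [len(y) for _, y in clients])

print("global model weights:", np.round(global_w, 3))
```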
Performance metrics of AI models in the reviewed studies
The reviewed studies used a diverse set of performance metrics to evaluate the accuracy and efficiency of their diagnostic models. Accuracy was the most commonly reported metric, in some cases reaching very high values (over 95% or even close to 100%). In addition to accuracy, some studies reported F1-score, precision (positive predictive value), recall (sensitivity), and AUC/AUROC (area under the ROC curve). The F1-score, the harmonic mean of precision and recall, was particularly important for models trained on imbalanced data. In settings such as combining EEG with facial features or multimodal models, F1-scores above 99% were reported. Several studies, rather than providing exact numbers, offered qualitative assessments, comparing model performance to human observers or previous models. Some studies also reported correlation coefficients (such as r or R²), especially those involving fMRI or regression analyses. However, some papers did not report specific numbers or simply reported relative performance improvements without specifying the metrics.
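The complementary roles of these metrics on imbalanced data can be illustrated with a toy confusion matrix. The counts below are invented for illustration only; they mimic a screening task where cases are rare.

```python
# Toy screening task: only 10 of 100 subjects actually have the disorder.
tp, fn = 6, 4     # true positives, missed cases
fp, tn = 2, 88    # false alarms, correct rejections

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)            # positive predictive value
recall = tp / (tp + fn)               # sensitivity
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy:  {accuracy:.2f}")   # 0.94 -- looks excellent
print(f"precision: {precision:.2f}")  # 0.75
print(f"recall:    {recall:.2f}")     # 0.60 -- 40% of cases are missed
print(f"f1-score:  {f1:.2f}")         # 0.67 -- exposes the weakness
```

The 94% accuracy is driven almost entirely by the large healthy majority, while the F1-score reflects that nearly half of the true cases go undetected — which is why F1 and AUC matter alongside accuracy in this literature.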
The very high accuracies reported in many studies indicate significant progress in AI-based diagnosis of mental and neurological disorders; however, these figures should be interpreted with caution. One likely explanation is the use of controlled databases and narrowly defined samples, which may generalize poorly to real-world settings. Metrics such as the F1-score, because they balance precision and sensitivity, are more informative for evaluating models on imbalanced data, since many mental disorders have low prevalence or heterogeneous data distributions. Fusing multimodal data, such as EEG and facial images, substantially improved model performance, underscoring the value of diverse data sources for more accurate modeling [69]. Qualitative comparisons of model performance against human judgments reflect an effort to demonstrate clinical applicability rather than purely statistical merit. However, incomplete reporting of metrics in some studies is a significant weakness that prevents a full and transparent comparison of model performance. The correlations reported in fMRI and regression studies likewise reflect efforts to better understand the biological mechanisms underlying these disorders and to apply AI in multilevel analyses. Achieving reliable and applicable models will require reporting standards and the use of diverse, real-world datasets [70, 71].
Key findings in the application of artificial intelligence to diagnose mental and neurological disorders
Studies show that artificial intelligence, especially in combination with facial analysis and multimodal data, can play an effective role in the early and accurate diagnosis of mental and neurological disorders. Many studies achieved diagnostic performance at or beyond the level of human experts by utilizing facial features and combining them with other data such as EEG, voice, text, or motor behaviors. In studies related to autism, models identified specific behavioral and facial differences that could serve as potential biomarkers for early screening. In some studies, the models detected cases of the disorder that human observers had overlooked. Promising results have also been obtained for depression and anxiety: multimodal models combining facial expressions with EEG or body movements achieved high accuracy and sensitivity in identifying these disorders. Some studies also sought more interpretable and reliable models by employing emerging techniques such as federated learning, attention models, and explainable AI (XAI) tools, which could be useful for clinical applications. At the same time, some studies pointed out challenges such as cultural differences, poor data quality, and the lack of public video databases, all of which could affect the generalizability of the models. These findings emphasize the need for standardized databases, improved evaluation methods, and validation of the models in real-world settings.
These findings highlight the transformative potential of AI in the field of mental health and neurological disorders. AI can be a powerful tool to improve diagnostic accuracy and reduce delays in disease detection, which in turn leads to improved treatment outcomes and reduced costs of care. On the other hand, the challenges ahead are mainly related to data and how they are collected and utilized. The lack of global standards and cultural differences can lead to the non-generalizability of models across different societies. Therefore, the development of diverse and standardized databases and international collaborations are key priorities [72]. In addition, for successful clinical adoption, the transparency and interpretability of models must be increased so that health professionals can trust the results and make treatment decisions based on them. Ethically, protecting patient privacy and ensuring responsible use of data is essential, and comprehensive rules and frameworks are needed. Ultimately, the future of AI in this area can greatly help personalize treatments, predict disease progression, and design targeted interventions; but this requires continued effort in research, technology development, and alignment with ethical and legal regulations [73].
Analytical insights and methodological challenges
The reviewed studies demonstrate clear differences between unimodal and multimodal artificial intelligence (AI) approaches in diagnosing mental and neurological disorders. Unimodal systems—those relying solely on facial expressions or micro-expressions—showed strong diagnostic performance, typically achieving accuracies between 85 and 95%. However, multimodal frameworks that integrate additional modalities such as EEG, fMRI, or voice features [26, 30, 54] consistently outperformed unimodal models, often exceeding 95% accuracy. This improvement can be attributed to the complementary nature of multimodal data, which allows for a more comprehensive representation of both neural and behavioral patterns associated with psychiatric and neurological conditions.
A critical challenge identified across studies is the lack of cross-cultural validation and generalizability of AI models. Several studies, such as Lu and Perkowski [29] and Ahmed et al. [56], emphasized that models trained on ethnically homogeneous or geographically limited datasets may exhibit bias when applied to populations with different facial morphologies or cultural expressions. These limitations underscore the importance of establishing globally representative datasets and employing domain adaptation techniques to improve cross-population generalization.
Ethical implications were rarely addressed explicitly in the reviewed studies, despite their growing importance. Major concerns include demographic imbalance in datasets, lack of informed consent in publicly available image collections, and potential misuse of facial data for non-clinical surveillance purposes. Gender and age underrepresentation, particularly the limited inclusion of female and older adult participants, can result in algorithmic bias and reduced fairness.
While many studies report exceptionally high accuracies (often exceeding 95–99%), such results must be interpreted with caution. High accuracy values, especially from small or homogeneous samples, are likely influenced by overfitting, data leakage, or imbalance in diagnostic categories. Few studies performed external validation or used large-scale, independent test datasets. The absence of standardized validation frameworks limits the comparability and clinical reliability of reported outcomes.
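The inflation risk described above can be demonstrated with synthetic data: when labels carry no real signal, a flexible model can still score very highly on its own training set, while honest cross-validation reveals chance-level performance. This is an illustrative sketch with invented data, not a re-analysis of any reviewed study.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 500))        # 30 subjects, 500 features (p >> n)
y = rng.integers(0, 2, size=30)       # labels carry no real information

clf = SVC(kernel="rbf").fit(X, y)
train_acc = clf.score(X, y)           # optimistically high, but meaningless
cv_acc = cross_val_score(SVC(kernel="rbf"), X, y, cv=5).mean()

print(f"training accuracy:        {train_acc:.2f}")
print(f"cross-validated accuracy: {cv_acc:.2f}")  # near chance on average
```

Even cross-validation only guards against this particular failure; the external validation on independent datasets that few studies performed remains the stronger test.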
To provide a broader perspective on the methodological and practical aspects of AI-based facial analysis, Table 4 summarizes the main challenges identified in the reviewed studies and the corresponding future research directions.
Table 4.
Main challenges identified in the reviewed studies and suggested future research directions
| Challenges | Future Directions |
|---|---|
| Data Scarcity | Employ transfer learning (TL) techniques to enhance model performance with small datasets. |
| Domain Adaptation [36] | Use domain adaptation methods to align source and target domains, ensuring models generalize across different datasets. |
| Real-time Processing | Develop real-time processing models that can be applied in clinical environments with low latency. |
| Hardware Constraints | Optimize AI models for low-resource environments to ensure accessibility in real-world clinical settings. |
| Limited Multimodal Data | Incorporate multimodal data (e.g., facial expressions, EEG, fMRI, voice) to improve diagnostic accuracy. |
| Lack of Standardization | Standardize data collection, evaluation protocols, and bias assessment methods to ensure comparability across studies. |
| Scalability | Develop scalable AI models that can be used across diverse populations and clinical settings. |
Conclusion
The results of this study indicate a significant growth in the scientific community's interest in the applications of artificial intelligence in the diagnosis of mental and neurological disorders. Most studies focused on autism spectrum disorder (ASD), depression, and anxiety, emphasizing the importance of early and accurate diagnosis in these conditions. The widespread use of facial features and multimodal data such as EEG and fMRI, along with advanced deep learning algorithms, especially convolutional neural networks and hybrid models, has improved the accuracy and sensitivity of the models. However, significant differences in data volume and quality, limitations in reporting performance indicators, and cultural and technical challenges have limited the generalizability of the results. The diversity of study populations, from children to the elderly, and the wide range of data inputs indicate the high potential of this field to cover a broad spectrum of disorders and age groups, but realizing that potential requires greater transparency and standardization.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Acknowledgements
The author gratefully acknowledges the assistance of Mirbahador Yazdani and Bahram Shirini from Bonab University, who contributed to the article screening and data extraction processes under the author’s supervision. Their support was invaluable in ensuring the accuracy and completeness of the systematic review.
Author contributions
The author solely conducted all stages of the study and manuscript preparation.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Data availability
No datasets were generated or analysed during the current study.
Declarations
Ethics approval and consent to participate
Not applicable. This study is a systematic review and did not involve human participants or animals. The research was conducted in accordance with the principles of the Declaration of Helsinki. No ethics approval or consent was required.
Consent for publication
Not applicable.
Competing interests
The author declares no competing interests.
Footnotes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Yoonesi S, Abedi Azar R, Arab Bafrani M, Yaghmayee S, Shahavand H, Mirmazloumi M, Moazeni Limoudehi N, Rahmani M, Hasany S, Idjadi FZ, et al. Facial expression deep learning algorithms in the detection of neurological disorders: a systematic review and meta-analysis. Biomed Eng Online. 2025;24(1):64. 10.1186/s12938-025-01396-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Arias D, Saxena S, Verguet S. Quantifying the global burden of mental disorders and their economic value. EClinicalMedicine. 2022;54:101675. 10.1016/j.eclinm.2022.101675. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Abedi L, Naghizad MB, Habibpour Z, Shahsavarinia K, Yazdani MB, Saadati M. A closer look at depression and sleep quality relation: a cross-sectional study of taxi drivers in Tabriz metropolis. Health Sci Rep. 2024;7(9):e70037. 10.1002/hsr2.70037. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Davtalab Esmaeili E, Golestani M, Yazdani M, Pirnejad H, Shahsavarinia K, Harzand-Jadidi S, Rezaei M, Sadeghi-Bazargani H. Quality of life and socioeconomic status in northwest of Iran: first wave of the persian traffic cohort study. J Prev. 2024;45(5):751–64. 10.1007/s10935-024-00786-y. [DOI] [PubMed] [Google Scholar]
- 5.Marras C, Meyer Z, Liu H, Luo S, Mantri S, Allen A, Baybayan S, Beck JC, Brown AE, Cheung F, et al. Improving Parkinson’s disease care through systematic screening for depression. Mov Disord Clin Pract. 2024;11(10):1212–22. 10.1002/mdc3.14163. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Tahami Monfared AA, Phan NTN, Pearson I, Mauskopf J, Cho M, Zhang Q, Hampel H. A systematic review of clinical Practice guidelines for Alzheimer’s disease and strategies for future Advancements. Neurol Ther. 2023;12(4):1257–84. 10.1007/s40120-023-00504-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Aghazadeh H, Ebnetorab SMA, Shahriari N, Ghaffari H, Gheshlaghi EF, Taheri P. Design and production of DNA-based electrochemical and biological biosensors for the detection and measurement of gabapentin medication in clinical specimens. J Electrochem Soc. 2022;169(7):077517. 10.1149/1945-7111/ac8247. [Google Scholar]
- 8.Pourebrahim K, Bafandeh-Zendeh A, Yazdani M. Driver’s age and rear-end crashes associated with distraction. Archiv Trauma Res. 2021;10(3):148–52. 10.4103/atr.atr_45_21. [Google Scholar]
- 9.Anderson AN, King JB, Anderson JS. Neuroimaging in psychiatry and neurodevelopment: Why the emperor has no clothes? The Br J Radiol. 2019;1101;92:20180910. 10.1259/bjr.20180910. [DOI] [PMC free article] [PubMed]
- 10.Jadidi V. The Impact of artificial intelligence on judicial decision-making processes. Adv J Manag Humanity Soc Sci. 2025;e222612. 10.5281/zenodo.15660093.
- 11.Vally ZI, Khammissa RAG, Feller G, Ballyram R, Beetge M, Feller L. Errors in clinical diagnosis: a narrative review. J Int Med Res. 2023;51(8):3000605231162798. 10.1177/03000605231162798. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Sun C-F, Correll CU, Trestman RL, Lin Y, Xie H, Hankey MS, Uymatiao RP, Patel RT, Metsutnan VL, McDaid EC, et al. Low availability, long wait times, and high geographic disparity of psychiatric outpatient care in the US. Gen Hosp Psychiatry. 2023;84:12–17. 10.1016/j.genhosppsych.2023.05.012. [DOI] [PubMed] [Google Scholar]
- 13.Babu A, Joseph AP. Artificial intelligence in mental healthcare: transformative potential vs. The necessity of human interaction. Front Psychol. 2024;15:1378904. 10.3389/fpsyg.2024.1378904. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Jadidi V, Ardakani HT, Hanif HR, Naseri SZ. Examining how new technologies affect management and decision-making processes in organizations. Int J Adv Stu Hum Soc Sci. 2024;14(1):47–60. 10.48309/ijashss.2025.481440.1221. [Google Scholar]
- 15.Sindhura DN, Pai RM, Bhat SN, Pai MMM. A review of deep learning and generative adversarial networks applications in medical image analysis. Multimedia Syst. 2024;30(3):161. 10.1007/s00530-024-01349-1. [Google Scholar]
- 16.Pereira R, Mendes C, Ribeiro J, Ribeiro R, Miragaia R, Rodrigues N, Costa N, Pereira A. Systematic review of emotion detection with Computer vision and deep learning. Sensors. 2024;24(11):3484. 10.3390/s24113484. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Sharma D, Singh J, Sehra SS, Sehra SK. Demystifying mental health by decoding facial action unit sequences. Big Data Cognit Comput. 2024;8(7):78. 10.3390/bdcc8070078. [Google Scholar]
- 18.Lu T, Liu X, Sun J, Bao Y, Schuller BW, Han Y, Lu L. Bridging the gap between artificial intelligence and mental health. Sci Bull. 2023;68(15):1606–10. 10.1016/j.scib.2023.07.015. [DOI] [PubMed] [Google Scholar]
- 19.Li Y, Wei J, Liu Y, Kauttonen J, Zhao G. Deep learning for micro-expression recognition: a survey. IEEE Trans Affective Comput. 2022;13(4):2028–46. 10.1109/TAFFC.2022.3205170. [Google Scholar]
- 20.Zhang F, Chai L. A review of research on micro-expression recognition algorithms based on deep learning. Neural Comput And Appl. 2024;36(29):17787–828. 10.1007/s00521-024-10262-7. [Google Scholar]
- 21.Oh Y-H, See J, Le Ngo AC, Phan R-W, Baskaran VM. A survey of automatic facial micro-expression analysis: databases, methods, and challenges. Front Psychol. 2018;9. 10.3389/fpsyg.2018.01128. [DOI] [PMC free article] [PubMed]
- 22.Cruz-Gonzalez P, He AW, Lam EP, Ng IMC, Li MW, Hou R, Chan JN, Sahni Y, Vinas Guasch N, Miller T, et al. Artificial intelligence in mental health care: a systematic review of diagnosis, monitoring, and intervention applications. Psychological Med. 2025;55:e18. 10.1017/S0033291724003295. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Voigtlaender S, Pawelczyk J, Geiger M, Vaios EJ, Karschnia P, Cudkowicz M, Dietrich J, Haraldsen I, Feigin V, Owolabi M, et al. Artificial intelligence in neurology: opportunities, challenges, and policy implications. J Neurol. 2024;271(5):2258–73. 10.1007/s00415-024-12220-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Barker D, Tippireddy MKR, Farhan A, Ahmed B. Ethical considerations in emotion recognition research. Psychol Int. 2025;7(2):43. 10.3390/psycholint7020043. [Google Scholar]
- 25.Kollias D, Tzirakis P, Nicolaou MA, Papaioannou A, Zhao G, Schuller B, Kotsia I, Zafeiriou S. Deep affect prediction in-the-Wild: aff-Wild Database and challenge, deep architectures, and beyond. Int J Comput Vision. 2019;127(6):907–29. 10.1007/s11263-019-01158-4. [Google Scholar]
- 26.Zhou Y, Han W, Yao X, Xue J, Li Z, Li Y. Developing a machine learning model for detecting depression, anxiety, and apathy in older adults with mild cognitive impairment using speech and facial expressions: a cross-sectional observational study. Int J Nurs Stud. 2023;146:104562. 10.1016/j.ijnurstu.2023.104562. [DOI] [PubMed] [Google Scholar]
- 27.Talaat FM. Real-time facial emotion recognition system among children with autism based on deep learning and IoT. Neural Comput Appl. 2023;35(17):12717–28. 10.1007/s00521-023-08372-9. [Google Scholar]
- 28.Rajawat AS, Bedi P, Goyal S, Bhaladhare P, Aggarwal A, Singhal RS. Fusion fuzzy logic and deep learning for depression detection using facial expressions. Procedia Comput Sci. 2023;218:2795–805. 10.1016/j.procs.2023.01.251. [Google Scholar]
- 29.Lu A, Perkowski M. Deep learning approach for screening autism spectrum disorder in children with facial images and analysis of ethnoracial factors in model development and application. Brain Sci. 2021;11(11):1446. 10.3390/brainsci11111446. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Kumar G, Das T, Singh K. Early detection of depression through facial expression recognition and electroencephalogram-based artificial intelligence-assisted graphical user interface. Neural Comput Appl. 2024;36(12):6937–54. 10.1007/s00521-024-09437-z. [Google Scholar]
- 31.Elshoky BRG, Younis EM, Ali AA, Ibrahim OAS. Comparing automated and non-automated machine learning for autism spectrum disorders classification using facial images. Etri J. 2022;44(4):613–23. 10.4218/etrij.2021-0097. [Google Scholar]
- 32.Hamid DSBA, Goyal S, Bedi P. Integration of deep learning for improved diagnosis of depression using EEG and facial features. Materials Today: Proceedings. 2023;80:1965–69. 10.1016/j.matpr.2021.05.659.
- 33.Saranya A, Anandan R. Facial action coding and hybrid deep learning architectures for autism detection. Intell Automation Soft Comput. 2022;33(2). 10.32604/iasc.2022.023445.
- 34.Drimalla H, Baskow I, Behnia B, Roepke S, Dziobek I. Imitation and recognition of facial emotions in autism: a computer vision approach. Mol Autism. 2021;12(1):27. 10.1186/s13229-021-00430-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Sariyanidi E, Zampella CJ, DeJardin E, Herrington JD, Schultz RT, Tunc B. Comparison of human experts and AI in predicting autism from facial behavior. In: CEUR workshop proceedings. 2023. p 48. [PMC free article] [PubMed]
- 36.Jafari M, Tao X, Barua P, Tan RS, Acharya UR. Application of transfer learning for biomedical signals: a comprehensive review of the last decade (2014-2024). Inf Fusion. 2025;118:102982. [Google Scholar]
- 37.Saraswat M, Kumar R, Harbola J, Kalkhundiya D, Kaur M, Goyal MK. Stress and anxiety detection via facial expression through deep learning. In: 2023 3rd International Conference on Technological Advancements in Computational Sciences. ICTACS: IEEE; 2023. p 1565–68. 10.1109/ICTACS59847.2023.10389882.
- 38.Kuipers M, Kappen M, Naber M. How nervous am I? How computer vision succeeds and humans fail in interpreting state anxiety from dynamic facial behaviour. Cognition Emotion. 2023;37(6):1105–15. 10.1080/02699931.2023.2229545. [DOI] [PubMed] [Google Scholar]
- 39.ElMahalawy J, ElSwaify YA, Elliboudy D, Abbas OM, Moustafa N, Wael N. AI-Powered human-Computer interaction assisting early identification of emotional and facial symptoms of Autism Spectrum disorder in children: “A deep learning-based enhanced facial feature recognition System”. In: 2024 International Conference on Machine Intelligence and Smart Innovation (ICMISI). 2024. p 87–93. 10.1016/j.bspc.2024.107433.
- 40.Hossain SS, Al-Islam F, Islam MR, Rahman S, Parvej MS. Autism Spectrum disorder identification from facial images using fine tuned pre-trained deep learning models and Explainable AI techniques. Semarak Int J Appl Psychol. 2025;5(1):29–53. 10.37934/sijap.5.1.2953b. [Google Scholar]
- 41.Bhat C, Goutham G, Bannur C, Kakarla S, Mamatha H. Deep dive into predictive modeling for Autism Spectrum disorder using facial features and machine learning. In: 2024 International Conference on Communication, Computing and Internet of Things. IC3IoT: IEEE; 2024. p 1–5. 10.1109/IC3IoT60841.2024.10550416.
- 42.Imaizumi T, Li L, Nishikawa N, Kumazaki H, Ueda K. Similarities in face recognition between deep learning and Autism Spectrum disorders. In: Proceedings of the 12th International Conference on Human-Agent Interaction. 2024. p 344–46. 10.1145/3687272.3690876.
- 43.Gurusubramani S, Diya B, Sowmiya M. Artificial intelligence and face Emotion prediction based training for Autism kids. In: 2022 1st International Conference on Computational Science and Technology. ICCST: IEEE; 2022. p 344–49.
- 44.Zainab. Redefining mental health diagnostics and algorithmic Innovation: unifying AI in depression detection, facial recognition efficiency, and graph coloring solutions. Algo Vista: J AI Comput Sci. 2024;2(1):18–25. 10.70445/avjcs.2.1.2025.18-25. [Google Scholar]
- 45.Shubhangi GB, Farheen S, Waheed MA. A machine learning approach for early detection and diagnosis of autism and normal controls and estimating severity levels based on face recognition. In: 2022 International Conference on Emerging Trends in Engineering and Medical Sciences (ICETEMS). 2022. p 35–40.
- 46.Hardiansyah B, Hermawati FA, Saputra DA, Caesa DD. Depression detection via facial expressions and movement analysis using machine learning with optimized feature selection. Int J Activity Behav Comput. 2025;2025(2):1–25. 10.60401/ijabc.108.
- 47.Abdulshahed, Abdulsadda. Facial expression recognition and classification for autism spectrum disorder based on SDFT images using deep learning. AUIQ Technical Eng Sci. 2024;1(1). 10.70645/3078-3437.1000.
- 48.Hou X, Zhang Y, Wang Y, Wang X, Zhao J, Zhu X, Su J. A markerless 2D video, facial feature recognition-based, artificial intelligence model to assist with screening for Parkinson disease: development and usability study. J Med Internet Res. 2021;23(11):e29554. 10.2196/29554.
- 49.Anjum J, Waziha A, Hia NA, Kalpoma KA. A dual paradigm of machine learning and deep learning for early identification of autism spectrum disorder in children using facial image dataset. In: 2024 27th International Conference on Computer and Information Technology (ICCIT). 2024. p 2817–22. 10.1109/ICCIT64611.2024.11022343.
- 50.Cheng X. Research on depression recognition based on university students’ facial expressions and actions with the assistance of artificial intelligence. J Adv Comput Intell Intell Inform. 2024;28(5):1126–31. 10.20965/jaciii.2024.p1126.
- 51.Sai Koppula K, Agrawal A. Autism spectrum disorder detection through facial analysis and deep learning: leveraging domain-specific variations. In: Kole D, Roy Chowdhury S, Basu S, Plewczynski D, Bhattacharjee D, editors. Proceedings of 4th International Conference on Frontiers in Computing and Systems. Singapore: Springer Nature Singapore; 2024. p 619–34.
- 52.Sucharita, Malathi. Unravelling traditional machine learning to advanced deep learning: autism detection using vision-based gaze estimation and facial analysis. In: Integrated Technol Electr, Electron Biotechnol Eng. 2025.
- 53.Zhong X, Li A, Zhang X, Zhu T. Facial feature analysis for anxiety detection in people with drug addiction: establishment and validation of machine learning models. In: 2024 9th International Conference on Intelligent Computing and Signal Processing (ICSP). 2024. p 1053–59. 10.1109/ICSP62122.2024.10743536.
- 54.Zhao Q. Process analysis of facial expressions, movements, and psychological changes in depression based on deep learning algorithms. J Biotech Res. 2025;20:226–35.
- 55.Chaudhary G, Kadam PS. Analysis of facial features for earlier prediction of autism disorder in children and adults using deep learning. 2025.
- 56.Ahmed S, Islam F, Aa MR, Noor, Rahman. Enhancing autism spectrum disorder diagnosis in Bangladeshi children through deep learning models and facial expression analysis. 2025.
- 57.Keating, Cook. Facial movements as biomarkers for autism: a Bayesian prevalence and machine-learning proof-of-concept study. 2025. 10.31234/osf.io/h4yd7_v1.
- 58.Attar, Paygude. Lightweight deep learning model for autism spectrum disorder detection and expression recognition in children using facial images. Front Health Inf. 2024;13(2).
- 59.Arunachalam S, Ghate SN, Kopperundevi N, Balamurugan MS, Kanagajothi D, S S. Deep Distributed learning and spatial-temporal features for facial depression detection. In: 2023 7th International Conference on Electronics, Communication and Aerospace Technology (ICECA). 2023. p 1288–93. 10.1109/ICECA58529.2023.10394761.
- 60.Alam S, Rashid MM. Enhanced early autism screening: assessing domain adaptation with distributed facial image datasets and deep federated learning. IIUM Eng J. 2025;26(1):113–28. 10.31436/iiumej.v26i1.3186.
- 61.Karthik MD, Priya SJ, Mathu T. Autism detection for toddlers using facial features with deep learning. In: 2024 3rd International Conference on Applied Artificial Intelligence and Computing (ICAAIC). 2024. p 726–31. 10.1109/ICAAIC60222.2024.10575487.
- 62.Chekroud AM, Zotti RJ, Shehzad Z, Gueorguieva R, Johnson MK, Trivedi MH, Cannon TD, Krystal JH, Corlett PR. Cross-trial prediction of treatment outcome in depression: a machine learning approach. Lancet Psychiatry. 2016;3(3):243–50. 10.1016/S2215-0366(15)00471-X.
- 63.Tabashum T, Snyder RC, O’Brien MK, Albert MV. Machine learning models for Parkinson disease: systematic review. JMIR Med Inform. 2024;12:e50117. 10.2196/50117.
- 64.Fusar-Poli P, Hijazi Z, Stahl D, Steyerberg EW. The science of prognosis in psychiatry: a review. JAMA Psychiatry. 2018;75(12):1289–97. 10.1001/jamapsychiatry.2018.2530.
- 65.Kapur S, Phillips AG, Insel TR. Why has it taken so long for biological psychiatry to develop clinical tests and what to do about it? Mol Psychiatry. 2012;17(12):1174–79. 10.1038/mp.2012.105.
- 66.Calhoun VD, Miller R, Pearlson G, Adalı T. The chronnectome: time-varying connectivity networks as the next frontier in fMRI data discovery. Neuron. 2014;84(2):262–74. 10.1016/j.neuron.2014.10.015.
- 67.Sui J, Adali T, Yu Q, Chen J, Calhoun VD. A review of multivariate methods for multimodal fusion of brain imaging data. J Neurosci Methods. 2012;204(1):68–81. 10.1016/j.jneumeth.2011.10.031.
- 68.Chen L, Li S, Bai Q, Yang J, Jiang S, Miao Y. Review of image classification algorithms based on convolutional neural networks. Remote Sens. 2021;13(22):4712. 10.3390/rs13224712.
- 69.Wang X, Wang Y, Yang J, Jia X, Li L, Ding W, Wang F-Y. The survey on multi-source data fusion in cyber-physical-social systems: foundational infrastructure for industrial metaverses and industries 5.0. Inf Fusion. 2024;107:102321. 10.48550/arXiv.2404.07476.
- 70.Dwyer DB, Falkai P, Koutsouleris N. Machine learning approaches for clinical psychology and psychiatry. Annu Rev Clin Psychol. 2018;14:91–118. 10.1146/annurev-clinpsy-032816-045037.
- 71.Rashid B, Damaraju E, Pearlson GD, Calhoun VD. Dynamic connectivity states estimated from resting fMRI identify differences among schizophrenia, bipolar disorder, and healthy control subjects. Front Hum Neurosci. 2014;8:897. 10.3389/fnhum.2014.00897.
- 72.Shatte ABR, Hutchinson DM, Teague SJ. Machine learning in mental health: a scoping review of methods and applications. Psychol Med. 2019;49(9):1426–48. 10.1017/S0033291719000151.
- 73.Ibrahim H, Liu X, Rivera SC, Moher D, Chan A-W, Sydes MR, Calvert MJ, Denniston AK. Reporting guidelines for clinical trials of artificial intelligence interventions: the SPIRIT-AI and CONSORT-AI guidelines. Trials. 2021;22(1):11. 10.1186/s13063-020-04951-6.
Data Availability Statement
No datasets were generated or analysed during the current study.





