Abstract
Parkinson’s Disease (PD) is a neurodegenerative disorder that is often accompanied by slowness of movement (bradykinesia) or gradual reduction in the frequency and amplitude of repetitive movement (hypokinesia). There is currently no cure for PD, but early detection and treatment can slow down its progression and lead to better treatment outcomes. Vision-based approaches have been proposed for the early detection of PD using gait. Gait can be captured using appearance-based or model-based approaches. Although appearance-based gait contains comprehensive features, it is easily affected by factors such as dressing. On the other hand, model-based gait is robust against changes in dressing and external contours, but it is often too sparse to contain sufficient information. Therefore, we propose a fusion of appearance-based and model-based gait features for PD prediction. First, we extracted keypoint coordinates from gait captured in videos and modeled these keypoints as a point cloud. The silhouette images are also segmented from the videos to obtain an overall appearance representation of the subject. We then perform a binary classification of gait as normal or Parkinsonian using a novel fusion of the gait point cloud and silhouette features, obtaining AUC up to 0.87 and F1-Scores up to 0.82 (precision: 0.85, recall: 0.80).
Introduction
Parkinson’s Disease is a neurodegenerative disorder with several causes and clinical symptoms. Most commonly, it is accompanied by slowness of movement (bradykinesia) [1] or gradual reduction in the frequency and amplitude of repetitive movements (hypokinesia). PD can also progress to stages where any movement is difficult (akinesia). PD is also characterized by tremors, rigidity, and general posture and gait impairment. PD patients have a reduced ability to perform activities of daily living, and hence, have a lower quality of life [2]. Although there is currently no cure for this progressive disease, early detection and interventions can slow down the progression of the disease and lead to better treatment outcomes. However, one of the main challenges with early detection of PD is the long waiting time for specialist consultation. To solve the problem of waiting times, Artificial Intelligence (AI) systems have been proposed [3] such that outpatients can undergo pre-screening before visiting a specialist. Such systems could potentially reduce the burden on healthcare systems and help slow down the progression of degenerative diseases such as PD.
In recent years, AI has been increasingly used in vision-based approaches to diagnose disease from medical images, such as x-rays [4], CT scans [5], and MRIs [6]. The main advantage of using AI for disease diagnosis is its ability to analyze large amounts of medical data quickly and accurately, providing doctors with more precise and efficient diagnostic tools. It can be used to detect and classify abnormal patterns in medical images that may indicate the presence of a disease. AI-based approaches to disease diagnosis are still in their early stages and there are many challenges to be addressed, such as the need for large amounts of high-quality annotated medical data, as well as ensuring the accuracy and interpretability of AI models [7]. However, the potential benefits of using AI in disease diagnosis are significant, including earlier detection, more accurate diagnoses, and more personalized treatment plans. For example, AI techniques were proposed for the pre-screening of COVID-19 from cough recordings [8, 9], and a technique was proposed for early PD detection using pre-motor features [10].
In general, using AI for pre-screening involves data capture, feature extraction based on early-stage symptoms, and classification. Important attributes of a pre-screening system include accessibility and ease of use, where patients do not require expert assistance for data capture, for example, using cough recordings for COVID-19 pre-screening. Another important factor is whether the disease presents observable symptoms at the early stages. For example, early signs of PD can be observed in gait, including reduced arm swing amplitude and symmetry, lower gait speed, shorter step length, and increased time spent in the double-support phase [11]. The visible motor signs of Parkinson’s disease are referred to as Parkinsonism [12].
Recently, vision-based approaches have been proposed to detect potential Parkinsonism in gait [13]. This has been enabled by pose estimation models such as AlphaPose [14], which automatically track subjects of interest in videos and output the coordinates of anatomical points on their bodies. Although Parkinsonism also occurs in other gait pathologies [2], a vision-based system could serve as a gait pre-screening tool to reduce clinical consultation costs and waiting times. There are two common approaches to vision-based gait capture, namely, model-free (also referred to as appearance-based) and model-based [15–17]. The model-free approach relies on binary silhouette images segmented from video frames. Although model-free gait contains comprehensive features, it is easily affected by factors such as view angle, clothing changes, and carrying conditions. On the other hand, model-based approaches measure gait parameters using skeleton models of the body. Model-based gait is robust against changes in dressing and eliminates the influence of external contours [18]. However, model-based gait features are often too sparse to contain sufficient information for recognition: the extracted keypoints are too limited to reflect the complete gait information.
Hence, we propose FuGaPS (Fusion of Gait Point Cloud and Silhouette), a vision-based approach to diagnosing PD using both model-free and model-based gait. FuGaPS retains the human body structure and is robust to clothing and carrying variations. It also contains richer information than existing model-based methods that use human keypoints alone. First, we extract keypoint coordinates from gait captured in videos using an AlphaPose pretrained model. Then we model these keypoints as a point cloud, which provides a richer representation that is robust against small perturbations and noise in the pose estimation coordinates [19]. We also use the pretrained DeepLabv3+ model [20] to obtain silhouette images from the videos. We then perform a binary classification of gait as normal or Parkinsonian using FuGaPS, a fusion of the gait point cloud and silhouette features. To the best of our knowledge, this is the first attempt at using a fusion of gait point clouds and silhouettes for PD detection.
Related works
PD screening usually involves data capture, feature extraction based on PD-related patterns, and classification. Brain electroencephalogram (EEG) signals [21] are commonly used for PD detection. Although EEG features are accurate, EEG can only be captured with specialized equipment and requires expert knowledge for interpretation. Since PD also affects speech, voice signals of patients have also been used for PD detection [22]. However, changes to the voice may be subtle. In addition, voice is easily affected by noise and may require advanced signal processing techniques [23]. Handwriting has also been used for PD detection [24, 25]. However, handwriting cannot be obtained without the cooperation of subjects.
Gait is perhaps the most used external feature for PD detection because it can be captured unobtrusively using force plates, vision-based techniques, or wearable sensors. For example, ground reaction forces have been used for PD detection using machine learning techniques [26]. However, force plates are relatively expensive and require expert knowledge to operate. Other techniques have also been proposed [27], which rely on body markers for gait signal acquisition. However, patients may be affected by the presence of markers on their bodies. Therefore, markerless techniques have been proposed for PD screening [13], which rely on vision-based gait capture enabled by state-of-the-art pose estimation techniques such as AlphaPose and OpenPose [28].
There are two common approaches to vision-based gait capture, namely, model-free (also referred to as appearance-based) and model-based [15–17]. For example, a model-free technique [29] was proposed for the detection of bradykinesia in PD from videos using optical flow. A model-based technique was also proposed [30] which relies on 2D and 3D poses and uses an ensemble of deep learning models to predict Unified Parkinson’s Disease Rating Scale (UPDRS) scores.
Materials and methods
This section describes the techniques for gait capture, feature extraction, and Parkinsonism prediction (Fig 1).
Fig 1. Overall methodology showing the process of FuGaPS generation and PD detection.
Dataset
The data used in this study are videos acquired from various online sources. The videos depict individuals walking with either Parkinsonian or normal gait. To ensure accuracy, the videos were annotated to indicate which ones showcase Parkinsonism, and a PD expert was invited to validate the annotated samples. This annotation step is crucial, as it directly influences the trained model’s precision. We collected data from 294 subjects with mean age 68±5.83 years, including 97 females. The dataset comprises 150 healthy individuals and 144 individuals diagnosed with PD. Online video recordings were selected in which the subject is walking and their full body is visible. The dataset collection and method of analysis complied with ethical procedures and were approved by the Ethics Committee of Multimedia University with ethics approval number EA0422022.
Feature extraction
Two different kinds of features were explored, namely, point cloud features (model-based) and silhouette features (model-free). The point cloud features were derived from keypoint features, which were obtained by performing pose estimation to extract the positions of the subjects’ body joints from the videos. The point cloud features were then generated to obtain robust gait features that are insensitive to texture variations. On the other hand, the silhouette features were acquired by segmenting the walking subjects from the image background. The details of each approach are provided in the subsequent sections.
Point cloud features
First, the AlphaPose [14] pretrained pose estimation model was used to extract body keypoints including the joint positions from each video frame. These keypoints can represent the position and movement of various body parts, such as the head, torso, and limbs (Fig 2).
Fig 2. Pose estimation keypoints.
After that, the extracted keypoints were preprocessed to make them suitable for model training. This involves splitting the source video into sequences of 100 frames each. For example, if a video contains 432 frames, four sequences are formed from the first 400 frames, each containing 100 frames. Each frame includes the keypoints of the subject, which can be used for gait analysis. This processing step is required to feed fixed-length video sequences to the training model. For a walking sequence of T seconds in which we track K body keypoints in a video with frame rate f, AlphaPose outputs a sequence P = {p_i(t) : i = 1, …, K; t = 1, …, Tf}, where p_i(t) = (x_i(t), y_i(t), c_i(t)); here (x_i(t), y_i(t)) are the 2D coordinates and c_i(t) is the confidence of the ith keypoint in frame t. 3D keypoints are then generated by mapping the 2D keypoints to a depth map to obtain the corresponding depth coordinate for each keypoint [31].
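The sequencing step above can be sketched as follows. This is a minimal illustration, assuming AlphaPose’s JSON results format (a list of per-frame detections with a flat `keypoints` array of x, y, confidence triples) and a single tracked subject per frame; the function names are ours, not the paper’s:

```python
import json
import numpy as np

SEQ_LEN = 100  # fixed sequence length used for training

def load_keypoints(json_path):
    """Parse an AlphaPose results file into a (num_frames, K, 3) array
    of (x, y, confidence) triples, assuming one detection per frame."""
    with open(json_path) as f:
        results = json.load(f)
    frames = []
    for entry in results:
        kp = np.asarray(entry["keypoints"], dtype=np.float32)
        frames.append(kp.reshape(-1, 3))  # flat [x1, y1, c1, ...] -> (K, 3)
    return np.stack(frames)

def make_sequences(frames, seq_len=SEQ_LEN):
    """Split a (num_frames, K, 3) array into non-overlapping clips of
    seq_len frames, discarding the leftover tail (432 frames -> 4 clips)."""
    n = (len(frames) // seq_len) * seq_len
    return frames[:n].reshape(-1, seq_len, *frames.shape[1:])
```

For a 432-frame video with 17 keypoints, `make_sequences` returns an array of shape (4, 100, 17, 3), matching the example in the text.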
As shown in Fig 3, in the NP group, the keypoints are widely spread across a large area, indicating a wide range of motion and active movement. This extensive distribution of keypoints suggests that individuals in the NP group have normal, dynamic gait patterns without significant posture issues. Their movements are fluid, reflecting a high degree of mobility and an absence of restrictions in their bodily motions. In contrast, the keypoints in the PD group are confined to a narrower area, consistent with the reduced movement amplitude (hypokinesia) that characterizes Parkinsonian gait.
Fig 3. Keypoints generated using AlphaPose for (a) NP (Non-Parkinson’s Disease) (b) PD (Parkinson’s Disease) groups.
Fusion of point cloud and silhouette features
To obtain the silhouette data, we used the pretrained DeepLabv3+ model [20], a state-of-the-art semantic segmentation algorithm that can accurately segment objects and classify pixels in images. It is built upon a ResNet-50 backbone pretrained on the ImageNet dataset. The silhouette data is a sequence of images, one per video frame, which are averaged to obtain the gait energy image (GEI) for each subject (Fig 4). Each GEI was resized to 100 × 100 × 3 and flattened to a vector. The fused feature was then obtained by concatenating the flattened silhouette vector with the flattened point cloud features.
Fig 4. Sample gait energy images for the non-PD (NP) and PD groups.
(a) NP. (b) PD.
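A GEI is simply the per-pixel average of the silhouette sequence. A minimal grayscale sketch, assuming binary masks that are already aligned and cropped (the paper stores 100 × 100 × 3 images; the nearest-neighbour resize here just avoids extra dependencies):

```python
import numpy as np

def gait_energy_image(silhouettes, size=(100, 100)):
    """Average a sequence of binary silhouette frames into a single
    gait energy image (GEI). `silhouettes` is (num_frames, H, W) with
    values in {0, 1}; frames are assumed already aligned and cropped."""
    gei = np.mean(silhouettes.astype(np.float32), axis=0)  # (H, W), in [0, 1]
    # Nearest-neighbour resize to the target size without extra deps.
    rows = np.linspace(0, gei.shape[0] - 1, size[0]).astype(int)
    cols = np.linspace(0, gei.shape[1] - 1, size[1]).astype(int)
    return gei[np.ix_(rows, cols)]
```

Pixels that are foreground in every frame have value 1 in the GEI, while pixels swept only occasionally by the limbs take intermediate values, which is what encodes the gait dynamics.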
Experiments and results
Experiment setup
The experiments were performed on a machine running a 64-bit operating system with an Intel Core i5 processor, 16GB RAM, NVIDIA GeForce GTX 1650 Ti 4GB GPU. Experiments were run using Python 3.8.15. Three distinct training scenarios were implemented, applying the same parameters across all three models to evaluate their performance. In the first training scenario, AlphaPose keypoints were resized to a uniform length and organized into point clouds representing XYZ coordinates. The point clouds were flattened into 1D arrays, creating the feature set for this scenario. The second training scenario involved using only Gait Energy Images (GEI), which were resized into 1D arrays. The third and final training scenario combined the GEI images and AlphaPose keypoints into a single feature set. This was achieved by horizontally stacking the flattened GEI image arrays with the flattened AlphaPose keypoints.
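The three training scenarios above can be sketched as follows; the function name and array shapes are illustrative assumptions rather than the paper’s exact code:

```python
import numpy as np

def build_features(point_clouds, geis, scenario):
    """Build one of the three training feature sets.
    point_clouds: (N, P, 3) XYZ keypoint clouds, one row per subject.
    geis: (N, 100, 100, 3) gait energy images, one per subject.
    """
    pc_flat = point_clouds.reshape(len(point_clouds), -1)
    gei_flat = geis.reshape(len(geis), -1)
    if scenario == "pointcloud":
        return pc_flat
    if scenario == "gei":
        return gei_flat
    if scenario == "fusion":
        # Horizontal stack: each row is [GEI features | keypoint features].
        return np.hstack([gei_flat, pc_flat])
    raise ValueError(f"unknown scenario: {scenario}")
```

The fusion set is simply the column-wise concatenation of the two flattened feature sets, so each subject remains a single row.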
Results
For each training scenario, we used a 30% test size, yielding 205 training samples (111 NP, 94 PD) and 89 test samples (39 NP, 50 PD). We evaluated our approach using Support Vector Machines (SVM), Random Forest, Extra Trees, Logistic Regression, K-Nearest Neighbors (KNN), and Decision Trees. To tune hyperparameters, we employed grid search with five-fold cross-validation. For evaluation, we used AUC, precision, recall, and F1-score. To balance precision and recall, we used the F1-score as our main metric. As shown in Fig 5, the overall best results are obtained using the fusion of GEI and PointCloud features with Logistic Regression. The full results are shown in Table 1, and the confusion matrices for Logistic Regression are shown in Fig 6.
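The tuning-and-evaluation protocol can be sketched with scikit-learn as follows. The synthetic data and the parameter grid are stand-ins for illustration, not the paper’s actual features or grid:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, precision_recall_fscore_support
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the fused GEI + point cloud feature matrix.
X, y = make_classification(n_samples=294, n_features=64, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Five-fold grid search, selecting by F1-score as in the paper.
grid = GridSearchCV(
    LogisticRegression(max_iter=5000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},  # illustrative grid
    scoring="f1",
    cv=5,
)
grid.fit(X_tr, y_tr)

y_pred = grid.predict(X_te)
auc = roc_auc_score(y_te, grid.predict_proba(X_te)[:, 1])
prec, rec, f1, _ = precision_recall_fscore_support(
    y_te, y_pred, average="binary"
)
print(f"AUC={auc:.2f} precision={prec:.2f} recall={rec:.2f} F1={f1:.2f}")
```

The same loop would be repeated for each classifier and each of the three feature sets to populate Table 1.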
Fig 5. Summary of model F1-scores by feature type.
Table 1. Results on individual and fused features.
| Model | Feature | AUC | Precision | Recall | F1-score |
|---|---|---|---|---|---|
| Decision Tree | Fusion | 0.75 | 0.82 | 0.64 | 0.72 |
| Decision Tree | PointCloud | 0.66 | 0.70 | 0.60 | 0.65 |
| Decision Tree | GEI | 0.59 | 0.66 | 0.42 | 0.51 |
| Extra Trees | Fusion | 0.71 | 0.83 | 0.30 | 0.44 |
| Extra Trees | PointCloud | 0.66 | 0.71 | 0.34 | 0.46 |
| Extra Trees | GEI | 0.57 | 0.52 | 0.22 | 0.31 |
| KNN | Fusion | 0.82 | 0.96 | 0.48 | 0.64 |
| KNN | PointCloud | 0.69 | 0.65 | 0.66 | 0.65 |
| KNN | GEI | 0.81 | 0.88 | 0.60 | 0.71 |
| Logistic Regression | GEI | 0.79 | 0.78 | 0.70 | 0.74 |
| Logistic Regression | PointCloud | 0.60 | 0.68 | 0.52 | 0.59 |
| Logistic Regression | Fusion | **0.87** | 0.85 | **0.80** | **0.82** |
| Random Forest | PointCloud | 0.69 | 0.77 | 0.40 | 0.53 |
| Random Forest | Fusion | 0.73 | 0.82 | 0.28 | 0.42 |
| Random Forest | GEI | 0.78 | **1.00** | 0.32 | 0.48 |
| SVM | PointCloud | 0.57 | 0.64 | 0.46 | 0.53 |
| SVM | Fusion | **0.87** | 0.87 | 0.66 | 0.75 |
| SVM | GEI | 0.79 | 0.84 | 0.62 | 0.71 |
Best results are shown in bold font face.
Fig 6. Confusion matrices for PD screening with logistic regression using (a) GEI, (b) PointCloud, and (c) GEI + PointCloud.
As shown in Table 1, the overall best results are obtained with the combined features using Logistic Regression, with the highest AUC of 0.87, 80% recall, and an F1-score of 0.82. The confusion matrices for Logistic Regression are shown in Fig 6.
Conclusions and future work
In this study, we have proposed vision-based techniques for screening patients for the likelihood of Parkinson’s disease from both model-based and model-free gait. In the first experiment, we used gait energy images and obtained F1-scores up to 0.74 using Logistic Regression. In the second experiment, we extracted point cloud features from subjects’ joint coordinates obtained via pose estimation. This technique achieved F1-scores up to 0.65 with Decision Trees and KNN. In the third experiment, we fused the GEI silhouette features with the point cloud features and obtained F1-scores up to 0.82 using Logistic Regression. These results suggest that a fusion of model-free and model-based gait features offers a substantial improvement over model-based gait alone for Parkinson’s disease screening. Using silhouettes in gait tests is promising since most clinical gait tests are performed in indoor settings, where background subtraction is not a challenge. Moreover, silhouettes capture most posture-based features, while model-based descriptors capture more kinematic features.
Although this study used a modest-sized self-collected dataset, the results show the potential of fusing model-free and model-based features for the early detection of Parkinson’s disease. This can potentially allow for early intervention and treatment, which will in turn lead to a higher quality of life for people living with Parkinson’s disease. Future research could focus on collecting more representative data across different collaborative study sites, and on classifying more stages of PD progression, from mild to moderate and severe stages of the disease.
Data Availability
The DOI to access the data is: https://doi.org/10.34740/KAGGLE/DSV/7327131.
Funding Statement
This project is funded by Multimedia University and Universitas Telkom Joint research grant (MMUE/210063) and Fundamental Research Grant Scheme under Ministry of Higher Education Malaysia (FRGS/1/2020/ICT02/MMU/02/5) received by Tee Connie. Funder URL: https://mastic.mosti.gov.my/sti/incentives/fundamental-research-grant-scheme-frgs. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Bloem BR, Okun MS, Klein C. Parkinson’s disease. The Lancet. 2021;397(10291):2284–2303. doi: 10.1016/S0140-6736(21)00218-X
- 2. Tolosa E, Garrido A, Scholz SW, Poewe W. Challenges in the diagnosis of Parkinson’s disease. The Lancet Neurology. 2021;20(5):385–397. doi: 10.1016/S1474-4422(21)00030-2
- 3. Li X, Tian D, Li W, Dong B, Wang H, Yuan J, et al. Artificial intelligence-assisted reduction in patients’ waiting time for outpatient process: a retrospective cohort study. BMC Health Services Research. 2021;21:1–11. doi: 10.1186/s12913-021-06248-z
- 4. Moses DA. Deep learning applied to automatic disease detection using chest x-rays. Journal of Medical Imaging and Radiation Oncology. 2021;65(5):498–517. doi: 10.1111/1754-9485.13273
- 5. Ahuja S, Panigrahi BK, Dey N, Rajinikanth V, Gandhi TK. Deep transfer learning-based automated detection of COVID-19 from lung CT scan slices. Applied Intelligence. 2021;51:571–585. doi: 10.1007/s10489-020-01826-w
- 6. Ebrahimi-Ghahnavieh A, Luo S, Chiong R. Transfer learning for Alzheimer’s disease detection on MRI images. In: 2019 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT). IEEE; 2019. p. 133–138.
- 7. Vellido A. The importance of interpretability and visualization in machine learning for applications in medicine and health care. Neural Computing and Applications. 2020;32(24):18069–18083. doi: 10.1007/s00521-019-04051-w
- 8. Pizzo DT, Esteban S. IATos: AI-powered pre-screening tool for COVID-19 from cough audio samples. arXiv preprint arXiv:2104.13247. 2021.
- 9. Laguarta J, Hueto F, Subirana B. COVID-19 artificial intelligence diagnosis using only cough recordings. IEEE Open Journal of Engineering in Medicine and Biology. 2020;1:275–281. doi: 10.1109/OJEMB.2020.3026928
- 10. Wang W, Lee J, Harrou F, Sun Y. Early Detection of Parkinson’s Disease Using Deep Learning and Machine Learning. IEEE Access. 2020;8:147635–147646. doi: 10.1109/ACCESS.2020.3016062
- 11. di Biase L, Di Santo A, Caminiti ML, De Liso A, Shah SA, Ricci L, et al. Gait Analysis in Parkinson’s Disease: An Overview of the Most Accurate Markers for Diagnosis and Symptoms Monitoring. Sensors. 2020;20(12):3529. doi: 10.3390/s20123529
- 12. Wichmann T. Changing views of the pathophysiology of Parkinsonism. Movement Disorders. 2019;34(8):1130–1143. doi: 10.1002/mds.27741
- 13. Connie T, Aderinola TB, Ong TS, Goh MKO, Erfianto B, Purnama B. Pose-Based Gait Analysis for Diagnosis of Parkinson’s Disease. Algorithms. 2022;15(12):474. doi: 10.3390/a15120474
- 14. Fang HS, Xie S, Tai YW, Lu C. RMPE: Regional Multi-person Pose Estimation. In: ICCV; 2017.
- 15. Li X, Makihara Y, Xu C, Yagi Y. End-to-end model-based gait recognition using synchronized multi-view pose constraint. In: Proceedings of the IEEE/CVF International Conference on Computer Vision; 2021. p. 4106–4115.
- 16. Yousef RN, Khalil AT, Samra AS, Ata MM. Model-based and model-free deep features fusion for high performed human gait recognition. The Journal of Supercomputing. 2023;79(12):12815–12852. doi: 10.1007/s11227-023-05156-9
- 17. Zheng W, Zhu H, Zheng Z, Nevatia R. GaitSTR: Gait Recognition With Sequential Two-Stream Refinement. IEEE Transactions on Biometrics, Behavior, and Identity Science. 2024. doi: 10.1109/TBIOM.2024.3390626
- 18. Aderinola TB, Connie T, Ong TS, Yau WC, Teoh ABJ. Learning age from gait: A survey. IEEE Access. 2021;9:100352–100368. doi: 10.1109/ACCESS.2021.3095477
- 19. Han XF, Feng ZA, Sun SJ, Xiao GQ. 3D point cloud descriptors: state-of-the-art. Artificial Intelligence Review. 2023:1–51.
- 20. Chen LC, Zhu Y, Papandreou G, Schroff F, Adam H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European Conference on Computer Vision (ECCV); 2018. p. 801–818.
- 21. Oh SL, Hagiwara Y, Raghavendra U, Yuvaraj R, Arunkumar N, Murugappan M, et al. A deep learning approach for Parkinson’s disease diagnosis from EEG signals. Neural Computing and Applications. 2020;32(15):10927–10933. doi: 10.1007/s00521-018-3689-5
- 22. Åström F, Koker R. A parallel neural network approach to prediction of Parkinson’s Disease. Expert Systems with Applications. 2011;38(10):12470–12474. doi: 10.1016/j.eswa.2011.04.028
- 23. Benba A, Jilbab A, Sandabad S, Hammouch A. Voice signal processing for detecting possible early signs of Parkinson’s disease in patients with rapid eye movement sleep behavior disorder. International Journal of Speech Technology. 2019;22:121–129. doi: 10.1007/s10772-018-09588-0
- 24. Aghzal M, Mourhir A. Early Diagnosis of Parkinson’s Disease based on Handwritten Patterns using Deep Learning. In: 2020 Fourth International Conference On Intelligent Computing in Data Sciences (ICDS); 2020. p. 1–6.
- 25. Pereira CR, Weber SAT, Hook C, Rosa GH, Papa JP. Deep Learning-Aided Parkinson’s Disease Diagnosis from Handwritten Dynamics. In: 2016 29th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI); 2016. p. 340–346.
- 26. Balaji E, Brindha D, Balakrishnan R. Supervised machine learning based gait classification system for early detection and stage classification of Parkinson’s disease. Applied Soft Computing. 2020;94:106494. doi: 10.1016/j.asoc.2020.106494
- 27. Ricciardi C, Amboni M, De Santis C, Improta G, Volpe G, Iuppariello L, et al. Using gait analysis’ parameters to classify Parkinsonism: A data mining approach. Computer Methods and Programs in Biomedicine. 2019;180:105033. doi: 10.1016/j.cmpb.2019.105033
- 28. Cao Z, Martinez GH, Simon T, Wei S, Sheikh YA. OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2019.
- 29. Williams S, Relton SD, Fang H, Alty J, Qahwaji R, Graham CD, et al. Supervised classification of bradykinesia in Parkinson’s disease from smartphone videos. Artificial Intelligence in Medicine. 2020;110:101966. doi: 10.1016/j.artmed.2020.101966
- 30. Mehta D, Asif U, Hao T, Bilal E, von Cavallar S, Harrer S, et al. Towards Automated and Marker-Less Parkinson Disease Assessment: Predicting UPDRS Scores Using Sit-Stand Videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops; 2021. p. 3841–3849.
- 31. Fang Z, Wang A, Bu C, Liu C. 3D Human Pose Estimation Using RGBD Camera. In: 2021 IEEE International Conference on Computer Science, Electronic Information Engineering and Intelligent Control Technology (CEI); 2021. p. 582–587.