Guayacán et al. (2020) [111] |
Diagnosis |
11 PD patients and 11 healthy controls |
Video recordings while walking |
3D spatio-temporal CNN |
ACC = 88–90% |
Reyes et al. (2019) [109] |
Diagnosis |
88 PD patients and 94 healthy controls |
Gait samples from MS Kinect |
Cropping noisy parts + LSTM/1D-CNN/CNN-LSTM |
Best performing: CNN-LSTM with ACC = 83.1%, PREC = 83.5%, REC = 83.4%, F1-SCORE = 81%, Kappa = 64% |
Buongiorno et al. (2019) [110] |
Diagnosis and severity estimation (mild vs. moderate) |
16 PD patients and 14 healthy controls |
Postural and kinematics features from MS Kinect v2 sensor while performing 3 motor exercises (gait, finger and foot tapping) |
Feature selection + SVM/ANN |
Best performing for diagnosis: gait-based ANN with ACC = 89.4%, SENS = 87.0%, SPEC = 91.8%; for severity estimation: ACC = 95.0%, SENS = 90.0%, SPEC = 99.0% |
Grammatikopoulou et al. (2019) [117] |
Severity estimation (UPDRS scores classification) |
12 advanced PD patients and 6 PD patients in initial stage |
Skeletal features from MS Kinect v2 RGB videos while playing an exergame |
Transformation to local coordinate system + two parallel LSTMs (the 1st trained with raw joint coordinates and the 2nd with joint line distances) |
ACC = 77.7% |
Tucker et al. (2015) [122] |
Medication adherence estimation (on/off medication classification) |
7 PD patients |
Skeletal joints 3D position, velocity and acceleration from MS Kinect |
C4.5 DT for generalized model; C4.5 DT, RF, SVM, IBk for personalized models |
For generalized model: ACC = 36.2–77.9%; for personalized models: ACC = 67.7–100%
Li et al. (2018) [120] |
TUG subtasks segmentation and time estimation for each subtask |
24 PD patients |
Video recordings while performing TUG tests |
Pose estimation with OpenPose/Iterative Error Feedback + SVM/LSTM
Best performing: OpenPose + LSTM with ACC = 93.1%, PREC = 80.8–97.5%, REC = 86.3–97%, F1-SCORE = 83.5–97.3% for subtasks segmentation and MAE = 0.32–1.07 for time estimation |
Wei et al. (2019) [123] |
Development of a virtual physical therapist: movement recognition (repetition and sub-action detection), patient error identification (satisfactory/non-satisfactory performance), task recommendation (regress/repeat/progress)
35 PD patients |
Motion data recorded by MS Kinect v2 sensor while performing 3 balance/agility tasks |
HMM for repetitions and sub-actions detection + linear-SVM for movement errors identification + {majority undersampling/minority oversampling/decision threshold adjustment/hybrid oversampling with feature standardization and interpolation + RF} for task recommendation |
For repetitions detection: ACC = 97.1–99.4%; for sub-actions segmentation: SENS = 88.4–96.9%, SPEC = 97.2–98.8%; for errors identification: ACC = 86.3–94.2%; best performing for tasks recommendation: hybrid oversampling + RF with ACC = 81.8–95.7%, FPR = 2.8–5.4% |
Hu et al. (2019) [121] |
FoG detection |
45 PD patients |
Videos collected while performing TUG tests |
Graph representation of videos + pretrained features (ResNet-50 vertex, C3D vertex and context features) + graph sequence-RNN (Bi-directional GS-GRU/Bi-directional GS-LSTM/forward GS-GRU/forward GS-LSTM) + fusion
Best performing: linear fusion of Bi-directional GS-GRU with context model with AUC = 0.90, SENS = 83.8%, SPEC = 82.3%, ACC = 82.5%, Youden’s J = 0.66, FPR = 17.7%, FNR = 16.2% |
Li et al. (2018) [118] |
Binary classification between pathological (PD/LID) and normal motion; multiclass classification (PD with LID, PD without LID and normal); levodopa-induced dyskinesia severity (UDysRS-III scores) estimation; parkinsonism severity (UPDRS-III scores) estimation |
9 PD patients |
2D video recordings while performing communication and drinking tasks (for dyskinesia detection) and while performing leg agility and toe tapping tasks (for parkinsonism detection)
Convolutional pose estimators + RF |
For binary classification: AUC = 0.634–0.930, F1-SCORE = 50–90.6%; for multiclass classification: ACC = 71.4%, SENS = 83.5–96.2%, SPEC = 68.4–95.7%; for UDysRS-III estimation: RMSE = 2.906, r = 0.741; for UPDRS-III estimation: RMSE = 7.765, r = 0.53 |
Vivar-Estudillo et al. (2018) [112] |
Diagnosis |
18 PD patients and 22 healthy controls |
Position, velocity and rotation data of hand movements from Leap Motion sensor
Texture feature extraction with SDH + kNN/SVM/DT/LDA/LR/ensembles
Best performing: bagged tree with ACC = 98.62%, SENS = 98.43%, SPEC = 98.80%, PREC = 98.80% |
Moshkova et al. (2020) [113] |
Diagnosis |
16 PD patients and 16 healthy controls |
Signals from Leap Motion sensor while performing hand motor tasks according to the MDS-UPDRS-III
Feature extraction + kNN/SVM/DT/RF
Best performing: SVM with ACC = 98.4% when features are extracted from all the tasks |
Ali et al. (2020) [114] |
Diagnosis; classification between PD patients with medication, without medication and healthy controls |
87 PD patients with medication, 119 PD patients without medication and 139 healthy controls |
Videos while performing hand motor tasks |
Segmentation into frames + temporal segmentation with CNN + spatial segmentation with CNN-AE + FFT for feature extraction + SVM
Best performance when combining 2 tasks for diagnosis: ACC = 91.8%; for 3-class classification: ACC = 73.5% |
Liu et al. (2019) [119] |
Severity estimation (Bradykinesia-related MDS-UPDRS scores classification) |
60 PD patients |
Video recordings while performing hand motor tests |
Pose estimator NN + feature extraction + kNN/RF/linear-SVM/RBF-SVM |
Best performing: RBF-SVM with ACC = 89.7%, PREC = 20–100%, REC = 60–100%, F1-SCORE = 33.3–100% |
Rajnoha et al. (2018) [116] |
Diagnosis |
50 PD patients and 50 age-matched healthy controls |
Face images extracted from video recordings |
HOG for face detection + CNN for embeddings generation + kNN/DT/RF/XGBoost/SVM |
Best performing: DT with ACC = 67.33% under leave-one-out cross-validation; RF with ACC = 60.7–85.92% under train-test split
Jin et al. (2020) [115] |
Diagnosis |
33 PD patients and 31 elderly healthy subjects |
Short videos while imitating images of smiling people
Splitting videos into frames + coordinate point extraction with Face++ + transformation from absolute to relative coordinates + feature extraction + LASSO + LR/SVM/DT/RF/LSTM/RNN
Best performing: SVM with PREC = 99%, REC = 99%, F1-SCORE = 99% |
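
Several of the metrics reported in the rows above are linked by simple identities, which can help when comparing studies that report different subsets of them. The following is a minimal sketch, not taken from any of the cited works: the function names `derived_metrics` and `f1_from_precision_recall` are hypothetical, and the inputs are the sensitivity/specificity reported for Hu et al. [121] and the precision/recall reported for Jin et al. [115], expressed as fractions.

```python
def derived_metrics(sens: float, spec: float) -> dict:
    """Derive Youden's J, FPR and FNR from sensitivity and specificity.

    Both inputs are fractions in [0, 1], not percentages.
    (Hypothetical helper, for illustration only.)
    """
    if not (0.0 <= sens <= 1.0 and 0.0 <= spec <= 1.0):
        raise ValueError("sens and spec must be fractions in [0, 1]")
    return {
        "youden_j": sens + spec - 1.0,  # J = SENS + SPEC - 1
        "fpr": 1.0 - spec,              # false positive rate
        "fnr": 1.0 - sens,              # false negative rate
    }


def f1_from_precision_recall(prec: float, rec: float) -> float:
    """F1 as the harmonic mean of precision and recall (single-class case)."""
    if prec + rec == 0.0:
        return 0.0
    return 2.0 * prec * rec / (prec + rec)


# Values reported for Hu et al. [121]: SENS = 83.8%, SPEC = 82.3%
hu = derived_metrics(sens=0.838, spec=0.823)

# Values reported for Jin et al. [115]: PREC = 99%, REC = 99%
jin_f1 = f1_from_precision_recall(0.99, 0.99)

print({name: round(value, 3) for name, value in hu.items()})
print(round(jin_f1, 3))
```

With these inputs the derived values agree, after rounding, with the Youden's J, FPR and FNR listed for [121] and the F1-score listed for [115]. The F1 identity holds per class only; where a study reports precision, recall and F1 averaged over classes, the averaged values need not satisfy it.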