Table 4.
Rating scale | Baseline: Control vs. PROD, AUC (P-value) | Baseline: Control vs. PD, AUC (P-value) | Year-1 follow-up: Control vs. PROD, AUC (P-value) | Year-1 follow-up: Control vs. PD, AUC (P-value) |
---|---|---|---|---|
UPDRS-III | 0.70 (<0.05)∗ | 1.00 (<0.05)∗ | 0.70 (0.06) | 0.99 (<0.01)∗∗ |
H&Y† | 0.53 (0.19) | 1.00 (<0.001)∗∗∗ | 0.58 (0.07) | 1.00 (<0.001)∗∗∗ |
MoCA | 0.62 (0.1) | 0.72 (0.01)∗∗ | 0.51 (0.9) | 0.58 (0.1) |
SDM | 0.79 (<0.01)∗∗ | 0.65 (0.2) | 0.80 (<0.01)∗∗ | 0.76 (<0.01)∗∗ |
AUC, area under the ROC curve (perfect discrimination = 1.0, random discrimination = 0.5). P-values represent the statistical significance of the metric, computed with logistic regression models corrected for age and gender. †H&Y could not be corrected for age and gender given the sample distributions; a Mann–Whitney U-test was used instead. As expected, UPDRS-III and H&Y achieved near-perfect discrimination between Control and Parkinson’s disease (PD) groups. SDM had the best performance in distinguishing PROD from Controls, but failed to reliably measure a difference between Controls and Parkinson’s disease. Bolded values indicate the best metrics among the different ML models (∗P-value < 0.05, ∗∗P-value < 0.01, and ∗∗∗P-value < 0.001).
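For readers less familiar with the metric, the following is a minimal sketch of how an AUC can be read: it equals the probability that a randomly chosen patient scores higher than a randomly chosen control, which is the Mann–Whitney U statistic divided by the product of the two group sizes. The function name `auc_from_pairs` and the toy scores are hypothetical and are not data from the study; the sketch also does not reproduce the age- and gender-corrected logistic regression used for the P-values.

```python
def auc_from_pairs(control_scores, patient_scores):
    """AUC as the fraction of (patient, control) pairs in which the
    patient scores higher; tied pairs count as one half.
    This equals the Mann-Whitney U statistic divided by n1 * n2."""
    wins = 0.0
    for p in patient_scores:
        for c in control_scores:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_scores) * len(control_scores))


# Made-up illustrative scores, NOT data from the study.
toy_controls = [1, 2, 2, 3, 4]
toy_patients = [2, 5, 6, 7, 8]

auc = auc_from_pairs(toy_controls, toy_patients)
print(f"AUC = {auc:.2f}")  # prints "AUC = 0.88" (22 of 25 pairs)
```

An AUC of 0.5 in this sketch corresponds to fully overlapping groups, and 1.0 to every patient scoring above every control, which matches the reference values given in the table note.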