2020 Oct 31;26(4):274–283. doi: 10.4258/hir.2020.26.4.274

Table 7.

Comparison of our methodology with other studies

| Study | Dataset | Methodology | Results |
|---|---|---|---|
| Little et al. [10] | They used an original dataset of 195 recordings collected from 31 patients, 23 of whom were diagnosed with PD | They detected dysphonia by discriminating HCs from PD participants using time-domain and frequency-domain features | They achieved an accuracy of 91.4% using an SVM classifier with 10 highly uncorrelated measures |
| Benba et al. [11] | They used a dataset of 17 PD patients and 17 HCs | They distinguished PD participants from HCs using recordings made with a computer's microphone and by extracting 20 MFCCs | They achieved an accuracy of 91.17% using a linear SVM with 12 MFCCs |
| Hemmerling et al. [12] | They used an original dataset of 198 recordings collected from 66 patients, 33 of whom were diagnosed with PD | They extracted several acoustic features and applied principal component analysis (PCA) for feature selection | They achieved an accuracy of 93.43% using a linear SVM |
| Singh and Xu [19] | They randomly selected 1,000 recordings from the mPower database | They extracted MFCCs using the python_speech_features library and compared different feature selection techniques | They achieved an accuracy of 99% using an SVM with an RBF kernel and L1-based feature selection |
| This study | We used a set of 18,210 smartphone recordings from the mPower database, of which 9,105 are from PD participants and 9,105 are from healthy controls | We extracted several features from the time, frequency, and cepstral domains, applied different preprocessing techniques, and used two feature selection methods (ANOVA and LASSO) to compare four classifiers using 5-fold cross-validation | On unseen data, we achieved an accuracy, sensitivity, and specificity of 95.78%, 95.32%, and 96.23%, respectively, and an F1-score of 95.74% using XGBoost with 33 of 138 features, selected using LASSO with C = 0.03 |

HC: healthy control, PD: Parkinson’s disease, SVM: support vector machine, MFCC: mel-frequency cepstral coefficients, RBF: radial basis function, LASSO: least absolute shrinkage and selection operator.
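
The results reported for this study combine four metrics: accuracy, sensitivity, specificity, and F1-score. As a minimal sketch of how these relate to one another, the function below computes all four from the counts of a binary confusion matrix, treating PD as the positive class. The function name `binary_metrics` and the counts in the usage example are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, specificity, and F1 from confusion counts.

    tp: PD recordings classified as PD      fn: PD recordings classified as HC
    tn: HC recordings classified as HC      fp: HC recordings classified as PD
    Assumes at least one true PD and one true HC sample, and tp + fp > 0.
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # recall on the PD (positive) class
    specificity = tn / (tn + fp)   # recall on the HC (negative) class
    precision = tp / (tp + fp)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for a balanced test set of 100 PD and 100 HC recordings.
example = binary_metrics(tp=90, fn=10, tn=95, fp=5)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

On a class-balanced test set, accuracy is the mean of sensitivity and specificity, which is a quick consistency check on reported figures: here (95.32% + 96.23%) / 2 ≈ 95.78%.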