Table 12.
Study | Data splitting | Participants | Features/representation | Classifier | Accuracy | Precision | Recall | AUC | Threshold | Kappa |
---|---|---|---|---|---|---|---|---|---|---|
3 | Random samples, 2 s segments | 3621 | Spectrogram and log-melspectrogram from coughing sounds | ResNet18 | NA | NA | 0.9 | 0.72 | Manipulated to yield 90% sensitivity | NA |
9 | Used the whole audio and chunked audio | 2000 | Hand-crafted and VGGish-extracted features, including tempo and MFCC, from coughing and breath sounds | Logistic regression, gradient boosting trees, and SVM | NA | 0.72 | 0.69 | 0.80 | NA | NA |
31 | Split the sound files into 6 s audio splits | 5320 | Muscular degradation, vocal cords, sentiment, MFCC | Three pre-trained ResNet50 | 1 | 0.94 | 0.985 | 0.97 | Manipulated | NA |
Our method | Segmented the coughing sounds into single, non-overlapping coughs | 1502 | Spectrogram, MelSpectrum, tonal, raw, MFCC, power spectrum, chroma | Ensemble of CNN classifiers | 0.77 | 0.80 | 0.71 | 0.77 | 0.5 | 0.53 |
This table is not intended as a head-to-head comparison, because several implementation details are not available.
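To make the threshold-dependent columns of the table concrete, the following minimal sketch computes accuracy, precision, recall, and Cohen's kappa from binary labels and classifier scores at a chosen decision threshold. The function name `binary_metrics` and the toy labels and scores are hypothetical and serve only as illustration; they are not data or code from any of the listed studies.

```python
def binary_metrics(y_true, scores, threshold=0.5):
    """Accuracy, precision, recall and Cohen's kappa at a decision threshold.

    Illustrative helper (hypothetical name), not taken from any listed study.
    """
    # A sample is predicted positive when its score reaches the threshold.
    y_pred = [1 if s >= threshold else 0 for s in scores]

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # also called sensitivity

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "kappa": kappa}


# Toy data, made up for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

at_default = binary_metrics(y_true, scores, threshold=0.5)
at_lowered = binary_metrics(y_true, scores, threshold=0.25)
```

On this toy data, lowering the threshold from 0.5 to 0.25 raises recall while lowering precision. This is one reason why rows that report a manipulated threshold are difficult to compare directly with rows that use a fixed threshold of 0.5.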