Table 2.
Summary of the datasets used for COVID-19 classification. Cough, breath and speech signals were extracted from the Coswara, ComParE and Sarcos datasets. COVID-19 positive subjects are under-represented in all three.
Type | Dataset | Sampling rate | Label | Subjects | Total audio | Average per subject | Standard deviation |
---|---|---|---|---|---|---|---|
Cough | Coswara | 44.1 kHz | COVID-19 Positive | 92 | 4.24 min | 2.77 s | 1.62 s |
Cough | Coswara | 44.1 kHz | Healthy | 1079 | 0.98 h | 3.26 s | 1.66 s |
Cough | Coswara | 44.1 kHz | Total | 1171 | 1.05 h | 3.22 s | 1.67 s |
Cough | ComParE | 16 kHz | COVID-19 Positive | 119 | 13.43 min | 6.77 s | 2.11 s |
Cough | ComParE | 16 kHz | Healthy | 398 | 40.89 min | 6.16 s | 2.26 s |
Cough | ComParE | 16 kHz | Total | 517 | 54.32 min | 6.31 s | 2.24 s |
Cough | Sarcos | 44.1 kHz | COVID-19 Positive | 18 | 0.87 min | 2.91 s | 2.23 s |
Cough | Sarcos | 44.1 kHz | COVID-19 Negative | 26 | 1.57 min | 3.63 s | 2.75 s |
Cough | Sarcos | 44.1 kHz | Total | 44 | 2.45 min | 3.34 s | 2.53 s |
Breath | Coswara | 44.1 kHz | COVID-19 Positive | 88 | 8.58 min | 5.85 s | 5.05 s |
Breath | Coswara | 44.1 kHz | Healthy | 1062 | 2.77 h | 9.39 s | 5.23 s |
Breath | Coswara | 44.1 kHz | Total | 1150 | 2.92 h | 9.126 s | 5.29 s |
Speech | Coswara (normal) | 44.1 kHz | COVID-19 Positive | 88 | 12.42 min | 8.47 s | 4.27 s |
Speech | Coswara (normal) | 44.1 kHz | Healthy | 1077 | 2.99 h | 9.99 s | 3.09 s |
Speech | Coswara (normal) | 44.1 kHz | Total | 1165 | 3.19 h | 9.88 s | 3.22 s |
Speech | Coswara (fast) | 44.1 kHz | COVID-19 Positive | 85 | 7.62 min | 5.38 s | 2.76 s |
Speech | Coswara (fast) | 44.1 kHz | Healthy | 1074 | 1.91 h | 6.39 s | 1.77 s |
Speech | Coswara (fast) | 44.1 kHz | Total | 1159 | 2.03 h | 6.31 s | 1.88 s |
Speech | ComParE | 16 kHz | COVID-19 Positive | 214 | 44.02 min | 12.34 s | 5.35 s |
Speech | ComParE | 16 kHz | Healthy | 396 | 1.46 h | 13.25 s | 4.67 s |
Speech | ComParE | 16 kHz | Total | 610 | 2.19 h | 12.93 s | 4.93 s |
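
The class imbalance noted in the caption can be quantified directly from the subject counts in Table 2. The following is a minimal sketch rather than part of the original study: the dictionary `counts` and the helper `positive_share` are hypothetical names introduced here for illustration, and the numbers are copied from the "COVID-19 Positive" and "Total" rows of the table.

```python
# Subject counts copied from Table 2: (COVID-19 positive, total) per subset.
# The names `counts` and `positive_share` are illustrative assumptions,
# not identifiers from the original work.
counts = {
    ("Cough", "Coswara"): (92, 1171),
    ("Cough", "ComParE"): (119, 517),
    ("Cough", "Sarcos"): (18, 44),
    ("Breath", "Coswara"): (88, 1150),
    ("Speech", "Coswara (normal)"): (88, 1165),
    ("Speech", "Coswara (fast)"): (85, 1159),
    ("Speech", "ComParE"): (214, 610),
}


def positive_share(positive: int, total: int) -> float:
    """Return the fraction of subjects labelled COVID-19 positive."""
    return positive / total


shares = {key: positive_share(*value) for key, value in counts.items()}

# Print one line per subset with the positive share as a percentage.
for (signal, dataset), share in shares.items():
    print(f"{signal:<8}{dataset:<18}{share:6.1%}")
```

In every subset the positive share is below one half, which is the sense in which positive subjects are under-represented; the imbalance is most pronounced in the Coswara subsets.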