Table 1.
category | dataset | samples | participants | data types | labels |
---|---|---|---|---|---|
audio | Coswara [48] | 2030 | — | breathing, cough, speech | COVID-19, smoking status |
audio | COVID-19 sounds [49] | 53 449 | 36 116 | breathing, cough, speech | COVID-19, smoking status, language |
audio | RECOLA [50] | 34 | 34 | speech | arousal, valence (emotion) |
audio | E-DAIC [39] | — | 275 | speech | PHQ-8 (depression) |
audio | ADReSS challenge [40] | 4076 | 156 | voice/speech | Alzheimer’s versus non-AD, MMSE score |
audio | chewing events [51]ᵃ | — | 5 | chewing sound | eating activities |
mobility | COVID-19 Community Mobility Reports [52] | — | aggregated | aggregated data on the change in mobility trends during COVID-19 | — |
mobility | mobile phone mobility data [53] | — | aggregated | aggregated movement between regions during COVID-19 | — |
motion | GLOBEM dataset: multi-year datasets for longitudinal human behaviour modelling generalization [54] | — | 497 (700 user-years) | activity (steps, sleep), location, call logs, bluetooth, screen | depression, personality, well-being and more |
motion | Multi-Ethnic Study of Atherosclerosis (MESA) [55] | 2 200 000 | 1817 | polysomnography (PSG), actigraphy (activity) | sleep–wake classification |
motion | FitRec [56] | 253 020 | 1104 | heart rate, speed, GPS | workout route prediction, heart rate prediction |
ᵃ Data for chewing events are not publicly available.