Table 1.
(a) | | | | |
Dataset | Total Size | Annotated Subset | Annotation Source | Classes |
Aladağ et al., 2018 [7] | 508,398 posts | 785 posts | Experts | Suicidal, non-suicidal |
Ji et al., 2018 [13] | 3549 suicidal posts; 3652 non-suicidal posts | NA | Community affiliation | Suicidal, non-suicidal |
Yao et al., 2020 [25] | NA | 500 r/Opiates posts; 500 r/SuicideWatch posts | Crowdsourcing | Opioid addiction, no opioid addiction, suicide risk, no suicide risk |
Reddit SuicideWatch and Mental Health Collection by Ji et al., 2021 [44] | 54,412 posts | NA | Community affiliation | r/Depression, r/SuicideWatch, r/Anxiety, r/Offmychest, r/Bipolar |
Nikhileswar et al., 2021 [39] | 116,037 suicidal posts; 116,037 non-suicidal posts | NA | Community affiliation | Suicidal, non-suicidal |
(b) | | | | |
Dataset | Total Size | Annotated Subset | Annotation Source | Classes |
UMD Reddit Suicidality Dataset v2 by Shing et al., 2018 [38] | 11,129 r/SuicideWatch users; 11,129 non-r/SuicideWatch users | 866 r/SuicideWatch users; 866 non-r/SuicideWatch users | Crowdsourcing, experts | No risk, low risk, moderate risk, severe risk |
Reddit C-SSRS Suicide Dataset by Gaur et al., 2019 [26] | NA | 500 users (15,755 posts) | Experts | Indicator, ideation, behavior, attempt, supportive |
Reddit C-SSRS Suicide Dataset v2 by Gaur et al., 2021 [11] | NA | 448 users (7327 posts) | Experts | Ideation, behavior, attempt, supportive |