2022 Aug 19;19(16):10347. doi: 10.3390/ijerph191610347

Table 1.

Data source: (a) post level and (b) user level.

(a)
| Dataset | Total Size | Annotated Subset | Annotation Source | Classes |
|---|---|---|---|---|
| Aladağ et al., 2018 [7] | 508,398 posts | 785 posts | Experts | Suicidal, non-suicidal |
| Ji et al., 2018 [13] | 3549 suicidal posts; 3652 non-suicidal posts | NA | Community affiliation | Suicidal, non-suicidal |
| Yao et al., 2020 [25] | NA | 500 r/Opiates posts; 500 r/SuicideWatch posts | Crowdsourcing | Opioid addiction, no opioid addiction, suicide risk, no suicide risk |
| Reddit SuicideWatch and Mental Health Collection by Ji et al., 2021 [44] | 54,412 posts | NA | Community affiliation | r/Depression, r/SuicideWatch, r/Anxiety, r/Offmychest, r/Bipolar |
| Nikhileswar et al., 2021 [39] | 116,037 suicidal posts; 116,037 non-suicidal posts | NA | Community affiliation | Suicidal, non-suicidal |
(b)
| Dataset | Total Size | Annotated Subset | Annotation Source | Classes |
|---|---|---|---|---|
| UMD Reddit Suicidality Dataset v2 by Shing et al., 2018 [38] | 11,129 r/SuicideWatch users; 11,129 non-r/SuicideWatch users | 866 r/SuicideWatch users; 866 non-r/SuicideWatch users | Crowdsourcing, experts | No risk, low risk, moderate risk, severe risk |
| Reddit C-SSRS Suicide Dataset by Gaur et al., 2019 [26] | NA | 500 users (15,755 posts) | Experts | Indicator, ideation, behavior, attempt, supportive |
| Reddit C-SSRS Suicide Dataset v2 by Gaur et al., 2021 [11] | NA | 448 users (7327 posts) | Experts | Ideation, behavior, attempt, supportive |