2021 Mar 16;7(3):e25807. doi: 10.2196/25807

Table 5.

Age group comparisons for the top-performing variables (by permutation feature importance).

| Variable | Type | Age 13-20 years (n=1014), mean^a (95% CI) | Age 21-54 years (n=643), mean^a (95% CI) | t statistic (df) | P value |
| --- | --- | --- | --- | --- | --- |
| Sentences per comment | Literary characteristics | −0.43 (−0.52, −0.35) | 0.37 (0.27, 0.46) | −12.31 (1489.89) | <.001 |
| Year account created | Summary statistics | 2.96 (2.79, 3.14) | 1.35 (1.10, 1.59) | 10.65 (1272.76) | <.001 |
| Proportion of user’s posts or comments in r/teenagers | Subreddit frequencies | −3.49 (−3.67, −3.31) | −5.11 (−5.17, −5.05) | 16.69 (1208.10) | <.001 |
| 75th percentile subscriber count for subreddits user posted in | Summary statistics | −0.14 (−0.23, −0.04) | −0.33 (−0.46, −0.20) | 2.41 (1276.15) | .02 |
| Average comment Coleman-Liau Index | Literary characteristics | −0.25 (−0.34, −0.16) | 0.08 (−0.01, 0.17) | −5.06 (1571.31) | <.001 |
| Comment karma | Summary statistics | −0.19 (−0.25, −0.13) | 0.30 (0.22, 0.38) | −9.85 (1309.14) | <.001 |
| TF-IDF^b weight for “school” | Term usage | −2.46 (−2.65, −2.27) | −2.63 (−2.86, −2.40) | 1.14 (1406.18) | .25 |
| Frequency of WWBP^c 23-29 word set used | Term usage | −1.23 (−1.39, −1.08) | −0.05 (−0.20, 0.10) | −10.97 (1592.33) | <.001 |
| TF-IDF weight for “need” | Term usage | −1.58 (−1.75, −1.41) | −0.31 (−0.48, −0.15) | −10.48 (1577.27) | <.001 |
| Normalized count of WWBP 23-29 word set used | Term usage | −1.29 (−1.44, −1.14) | 0.04 (−0.11, 0.19) | −12.35 (1569.17) | <.001 |
| Proportion of comments posted in a thread user started | Summary statistics | −0.74 (−0.90, −0.57) | −1.41 (−1.62, −1.20) | 4.95 (1366.37) | <.001 |
| TF-IDF weight for “look like” | Term usage | −3.04 (−3.22, −2.86) | −1.84 (−2.08, −1.61) | −7.85 (1327.70) | <.001 |
| TF-IDF weight for “home” | Term usage | −3.45 (−3.62, −3.28) | −1.89 (−2.14, −1.65) | −10.24 (1252.67) | <.001 |
| TF-IDF weight for “totally” | Term usage | −4.23 (−4.37, −4.09) | −2.95 (−3.19, −2.71) | −8.92 (1092.36) | <.001 |
| Proportion of user’s posts or comments in r/news | Subreddit frequencies | −5.03 (−5.10, −4.97) | −4.29 (−4.48, −4.11) | −7.38 (810.43) | <.001 |

^a Quantile-transformed means.

^b TF-IDF: term frequency–inverse document frequency.

^c WWBP: World Well-Being Project.
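
The fractional degrees of freedom reported in the table are what one would expect from an unequal-variance (Welch) t test, although the table itself does not name the test used. As a minimal sketch under that assumption, the statistic and its Welch-Satterthwaite degrees of freedom could be computed as follows. The function name `welch_t` and the sample values are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance two-sample t test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    # t statistic: difference in means over the combined standard error.
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation; usually not an integer.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Made-up transformed feature values for two age groups (illustration only).
younger = [-0.9, -0.4, -0.6, 0.1, -0.5, -0.3]
older = [0.2, 0.6, 0.1, 0.9, 0.4]

t_stat, df = welch_t(younger, older)
print(f"t = {t_stat:.2f}, df = {df:.2f}")
```

With the younger group passed first, a negative t indicates a lower mean in the younger group, which matches the sign convention of the rows above.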