2023 Mar 24;2:e41205. doi: 10.2196/41205

Table 9.

Depressed users most strongly misclassified in each variation of the preemptive depression identification experiment^a.


| | One depression user per control user (1:1) | One depression user per 3 control users (1:3) | One depression user per 5 control users (1:5) | One depression user per 10 control users (1:10) |
|---|---|---|---|---|
| Classifier | MentalBERT LM^b | MentalBERT LM | MentalRoBERTa LM | SVM^c using word embeddings |
| User | d13 | d38 | d13 | d57 |
| Control probability | 0.93 | 0.94 | 0.99 | 0.98 |
| Sum of post lengths in words | 1696 | 1888 | 1696 | 55,897 |
| Topic | news, hawaii, time, dead, blue | sir-geo, welcomed, invite, leave, warlock | news, hawaii, time, dead, blue | good, time, people, years, problem |
| Chief TF-IDF^d features | love, minnesota, diablo, time, man, bud, zoidberg, like, month, hawaii | sir, geo, welcome, invite, warlock, leave, titan, psn, run, need | love, minnesota, diablo, time, man, bud, zoidberg, like, month, hawaii | good, know, use, make, time, thank, link, want, try, like |
| Depressed vocabulary counts | | | | |
| people | 1 | 1 | 1 | 64 |
| know | 6 | 0 | 6 | 93 |
| thing | 3 | 0 | 3 | 35 |
| feel | 2 | 2 | 2 | 10 |
| time | 5 | 8 | 5 | 99 |
| woman | 1 | 0 | 1 | 7 |
| go | 3 | 0 | 3 | 54 |
| want | 3 | 1 | 3 | 71 |
| life | 2 | 0 | 2 | 28 |
| relationship | 0 | 0 | 0 | 2 |
| Control vocabulary counts | | | | |
| game | 0 | 1 | 0 | 9 |
| trade | 0 | 0 | 0 | 2 |
| key | 0 | 0 | 0 | 4 |
| team | 2 | 3 | 2 | 4 |
| play | 0 | 1 | 0 | 35 |
| player | 0 | 0 | 0 | 8 |
| shiny | 0 | 0 | 0 | 0 |
| hatch | 0 | 0 | 0 | 0 |
| thank | 1 | 1 | 0 | 15 |
| add | 0 | 2 | 0 | 14 |

^a Lexical properties of those users’ posts are provided.

^b LM: language model.

^c SVM: support vector machine.

^d TF-IDF: term frequency–inverse document frequency.
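
The lexical rows of Table 9 (sum of post lengths, chief TF-IDF features, and vocabulary counts) can be approximated with a few lines of Python. The following is a minimal sketch, not the authors' pipeline: the function names, the whitespace tokenizer, the plain log(N/df) inverse document frequency, the choice to treat each user's concatenated posts as one document, and the toy corpus are all assumptions made for illustration. The preprocessing actually used in the study is not described in this table.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase, split on whitespace, keep alphabetic tokens.
    return [tok for tok in text.lower().split() if tok.isalpha()]


def post_length_sum(posts):
    # Total number of whitespace-separated words across a user's posts.
    return sum(len(post.split()) for post in posts)


def vocabulary_counts(posts, vocabulary):
    # Count how often each vocabulary term occurs in a user's posts.
    counts = Counter(tok for post in posts for tok in tokenize(post))
    return {term: counts[term] for term in vocabulary}


def top_tfidf_terms(user, corpus, k=10):
    # Assumption: each user's concatenated posts form one document.
    docs = {
        u: [tok for post in posts for tok in tokenize(post)]
        for u, posts in corpus.items()
    }
    n_docs = len(docs)
    # Document frequency: number of users whose posts contain the term.
    doc_freq = Counter(term for toks in docs.values() for term in set(toks))
    tf = Counter(docs[user])
    total = sum(tf.values())
    scores = {
        term: (count / total) * math.log(n_docs / doc_freq[term])
        for term, count in tf.items()
    }
    # Sort by descending score, then alphabetically so ties are deterministic.
    return sorted(scores, key=lambda term: (-scores[term], term))[:k]


# Toy corpus invented for illustration only (not data from the study).
corpus = {
    "u1": ["I love hawaii", "hawaii time"],
    "u2": ["good game", "play the game"],
    "u3": ["time to play"],
}

print(post_length_sum(corpus["u1"]))
print(vocabulary_counts(corpus["u1"], ["time", "feel", "hawaii"]))
print(top_tfidf_terms("u1", corpus, k=1))
```

With real data, a library implementation such as scikit-learn's `TfidfVectorizer` would likely be a better choice than this hand-rolled version, since it handles smoothing, stop words, and normalization; the sketch only shows what each row of the table measures.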