PLoS One. 2022 May 4;17(5):e0267901. doi: 10.1371/journal.pone.0267901

Table 4. Performance score.

a. Sentence task

Model / training set    Precision  Recall  F1 score
LSTM
    Original                 0.28    0.20      0.23
    Balanced                 0.10    0.96      0.19
    Under-sampling           0.41    0.33      0.37
Bi-LSTM
    Original                 0.35    0.33      0.34
    Balanced                 0.15    0.86      0.26
    Under-sampling           0.33    0.46      0.38
BERT
    Original                 0.43    0.23      0.30
    Balanced                 0.03    0.56      0.07
    Under-sampling           0.45    0.66      0.54

b. User task

Model / training set    Precision  Recall  F1 score
LSTM
    Original                 0.66    0.52      0.58
    Balanced                 0.23    1.00      0.37
    Under-sampling           0.57    0.57      0.57
Bi-LSTM
    Original                 0.65    0.68      0.66
    Balanced                 0.30    1.00      0.46
    Under-sampling           0.50    0.68      0.57
BERT
    Original                 0.53    0.36      0.43
    Balanced                 0.13    0.93      0.23
    Under-sampling           0.63    0.82      0.71

Precision, recall, and F1 scores are shown for the sentence task (a) and the user task (b). The NLP deep-learning models used in this study are LSTM, Bi-LSTM, and BERT. The proportion of positive examples in the training dataset is approximately 2.7% for “Original” (the same ratio as in the original population), 50% for “Balanced”, and 5% for “Under-sampling”.
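
The metrics and sampling conditions above can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' code: `f1` is the standard harmonic mean of precision and recall (it reproduces the tabulated F1 values from the corresponding precision/recall pairs up to rounding), and `undersample` is one plausible way to drop negative examples until positives reach a target share of the training set, such as the 5% used for the “Under-sampling” condition; the function names and signatures are assumptions for this sketch.

```python
import random

def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def undersample(examples, labels, target_pos_ratio=0.05, seed=0):
    """Randomly discard negative (label 0) examples until positives
    (label 1) make up roughly target_pos_ratio of the training set.
    Illustrative sketch; not the authors' implementation."""
    rng = random.Random(seed)
    pos = [(x, y) for x, y in zip(examples, labels) if y == 1]
    neg = [(x, y) for x, y in zip(examples, labels) if y == 0]
    # Solve n_pos / (n_pos + n_neg_kept) = target_pos_ratio for n_neg_kept.
    n_neg_kept = int(len(pos) * (1 - target_pos_ratio) / target_pos_ratio)
    kept = pos + rng.sample(neg, min(n_neg_kept, len(neg)))
    rng.shuffle(kept)
    return kept
```

For example, BERT's under-sampled sentence-task scores (precision 0.45, recall 0.66) give `f1(0.45, 0.66) ≈ 0.54`, matching the table; and starting from a set that is about 2.7% positive, `undersample(..., target_pos_ratio=0.05)` yields a training set that is 5% positive.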