Table 3. Annotator agreement and overall model performance.
Two measures are used: ordinal Krippendorff's Alpha and accuracy (Acc). The first row gives the self-agreement of individual annotators, and the second row the inter-annotator agreement between different annotators. The last two rows give the model evaluation results, on the training and the out-of-sample evaluation sets, respectively. Note that the overall model performance is comparable to the inter-annotator agreement.
| | No. of tweets | Overall Alpha | Overall Acc |
|---|---|---|---|
| Self-agreement | 5,981 | 0.79 | 0.88 |
| Inter-annotator agreement | 53,831 | 0.60 | 0.79 |
| Classification model (train. set) | 50,000 | 0.61 | 0.80 |
| Classification model (eval. set) | 10,000 | 0.57 | 0.80 |
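
A minimal sketch of how the table's two agreement measures could be computed for a pair of annotators, assuming the third-party Python package `krippendorff` and hypothetical label arrays; the paper does not specify its own tooling.

```python
import numpy as np
import krippendorff

# Hypothetical labels from two annotators on the same tweets, on an
# ordinal scale (e.g. 0 = negative, 1 = neutral, 2 = positive);
# np.nan marks tweets an annotator did not label.
annotator_a = np.array([0, 1, 2, 2, 1, 0, np.nan, 2])
annotator_b = np.array([0, 1, 1, 2, 1, 0, 1, np.nan])

# Ordinal Krippendorff's Alpha over the coder-by-unit reliability matrix.
alpha = krippendorff.alpha(
    reliability_data=np.vstack([annotator_a, annotator_b]),
    level_of_measurement="ordinal",
)

# Accuracy (Acc): fraction of tweets labeled by both annotators
# where the two labels match exactly.
both = ~np.isnan(annotator_a) & ~np.isnan(annotator_b)
acc = float(np.mean(annotator_a[both] == annotator_b[both]))

print(f"ordinal alpha = {alpha:.2f}, accuracy = {acc:.2f}")
```

Unlike plain accuracy, ordinal Alpha corrects for chance agreement and penalizes distant disagreements (negative vs. positive) more than adjacent ones (negative vs. neutral), which is why the two columns in the table can diverge.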