Table 2. Summary of the evaluation metrics and their respective interpretation.
| Evaluation metric | Range of values | Definition and interpretation |
| --- | --- | --- |
| C_v^a | 0 to 1 | Measures how semantically similar the words within a topic are to each other |
| Accuracy | 0 to 1 | The proportion of correct predictions among the total number of predictions |
| F1-score | 0 to 1 | The harmonic mean of precision (the proportion of predicted positive instances that are truly positive) and recall (the proportion of true positive instances that are correctly identified) |
| MCC^b | −1 to 1 | Measures the correlation between predicted and true classes, accounting for the positive and negative instances that are correctly classified as well as those that are misclassified |
| Cohen κ | −1 to 1 | Measures the interrater agreement between 2 raters (in machine learning, the classifier and the ground truth) |
| Brier loss | 0 to 1 | A cost function measuring the mean squared difference between the predicted probability and the ground truth label |
^a C_v: coherence score.

^b MCC: Matthews correlation coefficient.
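For reference, the classification metrics in Table 2 are all available in scikit-learn. The following is a minimal sketch on hypothetical labels and predicted probabilities; the data, the 0.5 decision threshold, and the variable names are illustrative and not taken from the study. The C_v coherence score is computed differently (over topic words and a reference corpus), for example with gensim's CoherenceModel using coherence="c_v", and is therefore omitted here.

```python
# Minimal sketch: computing the classification metrics from Table 2 with
# scikit-learn on a small, hypothetical binary classification task.
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    brier_score_loss,
    cohen_kappa_score,
    f1_score,
    matthews_corrcoef,
)

# Hypothetical ground truth labels and predicted probabilities P(class = 1).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_prob = np.array([0.9, 0.2, 0.6, 0.8, 0.4, 0.1, 0.3, 0.7])

# Hard predictions obtained by thresholding the probabilities at 0.5.
y_pred = (y_prob >= 0.5).astype(int)

print("Accuracy:   ", accuracy_score(y_true, y_pred))   # correct / total
print("F1-score:   ", f1_score(y_true, y_pred))         # harmonic mean of precision and recall
print("MCC:        ", matthews_corrcoef(y_true, y_pred))
print("Cohen kappa:", cohen_kappa_score(y_true, y_pred))
print("Brier loss: ", brier_score_loss(y_true, y_prob)) # uses probabilities, not hard labels
```

Note that the Brier loss is scored on the predicted probabilities rather than the thresholded labels, which is why both `y_prob` and `y_pred` appear above.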