Table 3.
| Experiment | Metric | Precision | Recall | F1-score | Support |
|---|---|---|---|---|---|
| Main | Retained | 0.931 | 0.966 | 0.948 | 2964 |
| | LTFU | 0.861 | 0.748 | 0.801 | 838 |
| | micro average | 0.918 | 0.918 | 0.918 | 3802 |
| | macro average | 0.896 | 0.857 | 0.875 | 3802 |
| | weighted average | 0.916 | 0.918 | 0.916 | 3802 |
| Keep all notes | Retained | 0.903 | 0.925 | 0.914 | 6375 |
| | LTFU | 0.302 | 0.248 | 0.273 | 838 |
| | micro average | 0.846 | 0.846 | 0.846 | 7213 |
| | macro average | 0.603 | 0.586 | 0.593 | 7213 |
| | weighted average | 0.834 | 0.846 | 0.839 | 7213 |
Note: LTFU = lost to follow-up; tp = the number of true positives; fp = the number of false positives; fn = the number of false negatives; precision = tp/(tp + fp); recall = tp/(tp + fn); F1-score = 2 * (precision * recall)/(precision + recall); support = the number of occurrences of a class; micro = metrics computed globally from the total true positives, false negatives, and false positives; macro = the unweighted average of the per-label metrics; weighted = per-label metrics averaged with weights given by support. These metric definitions are taken from scikit-learn, and the results table was produced with scikit-learn's classification_report function. The linear model trained with stochastic gradient descent and elastic net regularization was evaluated with the best hyperparameter found: alpha = 5e-05 for the main experiment and 1e-05 for the "keep all notes" setup.
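As a minimal sketch of how such a report can be generated, the snippet below fits a scikit-learn SGDClassifier with an elastic net penalty and prints a classification report. The synthetic data, train/test split, loss function, and class-name mapping are placeholders and not the study's pipeline; only the penalty type and the reported alpha values come from the note above.

```python
# Hypothetical sketch: reproduce the structure of Table 3 with scikit-learn.
# The data here is synthetic; the real study used clinical note features.
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in with a class imbalance roughly matching the main
# experiment's support (Retained vs. LTFU); purely illustrative.
X, y = make_classification(n_samples=5000, n_features=50,
                           weights=[0.78, 0.22], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y,
                                                    random_state=0)

# Linear model trained with stochastic gradient descent and elastic net
# regularization; alpha=5e-05 is the reported best value for the main
# experiment (1e-05 for the "keep all notes" setup).
clf = SGDClassifier(penalty="elasticnet", alpha=5e-5, random_state=0)
clf.fit(X_train, y_train)

# classification_report prints per-class precision, recall, F1-score, and
# support, plus macro and weighted averages; recent scikit-learn versions
# print an accuracy row where Table 3 shows the micro average.
print(classification_report(y_test, clf.predict(X_test),
                            target_names=["Retained", "LTFU"], digits=3))
```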