2020 Nov 5;28(1):104–112. doi: 10.1093/jamia/ocaa220

Table 3.

10-fold cross-validation results for the classification task, based on the predicted label for each note when it was in the cross-validation test set

Experiment      Metric            Precision  Recall  F1-score  Support
Main            Retained          0.931      0.966   0.948     2964
                LTFU              0.861      0.748   0.801     838
                micro average     0.918      0.918   0.918     3802
                macro average     0.896      0.857   0.875     3802
                weighted average  0.916      0.918   0.916     3802
Keep all notes  Retained          0.903      0.925   0.914     6375
                LTFU              0.302      0.248   0.273     838
                micro average     0.846      0.846   0.846     7213
                macro average     0.603      0.586   0.593     7213
                weighted average  0.834      0.846   0.839     7213

Note: LTFU = lost to follow-up; tp = number of true positives; fp = number of false positives; fn = number of false negatives; precision = tp/(tp + fp); recall = tp/(tp + fn); F1-score = 2 * (precision * recall)/(precision + recall); support = number of occurrences of a class. Micro averaging computes the metrics globally from the total true positives, false negatives, and false positives; macro averaging takes the unweighted mean of the per-label metrics; weighted averaging computes the metrics for each label and averages them weighted by support. These metric definitions follow scikit-learn, and the table was produced with scikit-learn's classification_report method. The linear model, trained with stochastic gradient descent and elastic net regularization, was evaluated with the best hyperparameter found: alpha = 5e-05 for the main experiment and alpha = 1e-05 for the "keep all notes" setup.
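The micro, macro, and weighted averages defined in the note can be reproduced directly from per-class tp/fp/fn counts. The sketch below uses small hypothetical counts for two classes (not the paper's actual confusion matrix, which is not given here) to illustrate how the three averaging schemes differ:

```python
# Hedged sketch of the averaging schemes from the note, using
# HYPOTHETICAL per-class counts, not the study's real confusion matrix.

def prf(tp, fp, fn):
    """precision = tp/(tp+fp); recall = tp/(tp+fn); F1 = harmonic mean."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts for the two labels used in the table.
counts = {
    "Retained": {"tp": 90, "fp": 10, "fn": 5},
    "LTFU":     {"tp": 20, "fp": 5,  "fn": 10},
}
# support = occurrences of each class = tp + fn
support = {c: v["tp"] + v["fn"] for c, v in counts.items()}

per_class = {c: prf(**v) for c, v in counts.items()}

# micro: pool tp/fp/fn over all classes, then compute the metrics once
totals = {k: sum(v[k] for v in counts.values()) for k in ("tp", "fp", "fn")}
micro = prf(**totals)

# macro: unweighted mean of the per-class metrics
macro = tuple(
    sum(m[i] for m in per_class.values()) / len(per_class) for i in range(3)
)

# weighted: mean of the per-class metrics weighted by support
n = sum(support.values())
weighted = tuple(
    sum(per_class[c][i] * support[c] for c in per_class) / n for i in range(3)
)
```

With these toy counts, micro precision and recall coincide (110/125 = 0.88) because the pooled fp and fn totals happen to be equal, while macro precision is the plain mean (0.9 + 0.8)/2 = 0.85; the gap between micro/weighted and macro mirrors the table, where the small LTFU class drags the macro average down.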