2020 Oct 14;3(3):431–438. doi: 10.1093/jamiaopen/ooaa029

Table 3.

Mean weighted F1 score ± standard deviation for classification models and mean accuracy ± standard deviation for token extractor models, trained on increasing numbers of reports (n), over 5 trials

| Model | n = 16 | n = 32 | n = 64 | n = 128 | n = 256 |
|---|---|---|---|---|---|
| Classification models (mean weighted F1 score across all classification fields ± SD) | | | | | |
| Logistic | 0.781 ± 0.175 | 0.846 ± 0.117 | 0.875 ± 0.090 | 0.911 ± 0.059 | 0.934 ± 0.041 |
| AdaBoost | 0.829 ± 0.140 | 0.878 ± 0.100 | 0.907 ± 0.066 | 0.928 ± 0.049 | 0.945 ± 0.034 |
| Random forest | 0.795 ± 0.169 | 0.835 ± 0.128 | 0.867 ± 0.101 | 0.882 ± 0.088 | 0.901 ± 0.070 |
| SVM | 0.738 ± 0.214 | 0.763 ± 0.209 | 0.786 ± 0.194 | 0.842 ± 0.112 | 0.860 ± 0.140 |
| CNN | 0.720 ± 0.225 | 0.790 ± 0.163 | 0.851 ± 0.122 | 0.893 ± 0.086 | 0.935 ± 0.055 |
| LSTM | 0.688 ± 0.205 | 0.729 ± 0.187 | 0.743 ± 0.203 | 0.739 ± 0.214 | 0.739 ± 0.212 |
| Token extractor models (mean accuracy across all token extractor fields ± SD) | | | | | |
| Logistic | 0.844 ± 0.085 | 0.897 ± 0.079 | 0.892 ± 0.096 | 0.902 ± 0.087 | 0.896 ± 0.092 |
| Adaptive boost | 0.877 ± 0.097 | 0.892 ± 0.080 | 0.890 ± 0.084 | 0.896 ± 0.082 | 0.890 ± 0.092 |
| Random forest | 0.897 ± 0.180 | 0.898 ± 0.064 | 0.915 ± 0.054 | 0.920 ± 0.041 | 0.924 ± 0.038 |

CNN, convolutional neural network; LSTM, long short-term memory neural network; SVM, support vector machine.
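
For readers unfamiliar with the metric, the following is a minimal sketch of how a support-weighted F1 score and a "mean ± SD" summary across trials could be computed. It is not the authors' code. The function names `weighted_f1` and `summarize`, the example labels, and the use of the sample standard deviation are all assumptions made for illustration; the table does not state which SD variant was used or how scores were pooled across fields and trials.

```python
from collections import Counter
from statistics import mean, stdev


def weighted_f1(y_true, y_pred):
    """Average of per-class F1 scores, weighted by each class's true count."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * n_true / total
    return score


def summarize(scores):
    """Return (mean, sample standard deviation) of a list of scores."""
    # Assumption: sample SD (n - 1 denominator); the source does not specify.
    return mean(scores), stdev(scores)


# Hypothetical labels for two trials, for illustration only (not study data).
trials = [
    (["a", "a", "b", "b"], ["a", "a", "b", "b"]),
    (["a", "a", "b", "b"], ["a", "b", "b", "b"]),
]
scores = [weighted_f1(y_true, y_pred) for y_true, y_pred in trials]
m, sd = summarize(scores)
print(f"{m:.3f} ± {sd:.3f}")
```

Weighting by class support means frequent classes dominate the score, so a model can post a high weighted F1 while still performing poorly on rare classes.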