Table 3.
Model | n = 16 | n = 32 | n = 64 | n = 128 | n = 256 |
---|---|---|---|---|---|
Classification models (mean weighted F1 score across all classification fields ± SD) | | | | | |
Logistic | 0.781 ± 0.175 | 0.846 ± 0.117 | 0.875 ± 0.090 | 0.911 ± 0.059 | 0.934 ± 0.041 |
AdaBoost | 0.829 ± 0.140 | 0.878 ± 0.100 | 0.907 ± 0.066 | 0.928 ± 0.049 | 0.945 ± 0.034 |
Random forest | 0.795 ± 0.169 | 0.835 ± 0.128 | 0.867 ± 0.101 | 0.882 ± 0.088 | 0.901 ± 0.070 |
SVM | 0.738 ± 0.214 | 0.763 ± 0.209 | 0.786 ± 0.194 | 0.842 ± 0.112 | 0.860 ± 0.140 |
CNN | 0.720 ± 0.225 | 0.790 ± 0.163 | 0.851 ± 0.122 | 0.893 ± 0.086 | 0.935 ± 0.055 |
LSTM | 0.688 ± 0.205 | 0.729 ± 0.187 | 0.743 ± 0.203 | 0.739 ± 0.214 | 0.739 ± 0.212 |
Token extractor models (mean accuracy across all token extractor fields ± SD) | | | | | |
Logistic | 0.844 ± 0.085 | 0.897 ± 0.079 | 0.892 ± 0.096 | 0.902 ± 0.087 | 0.896 ± 0.092 |
AdaBoost | 0.877 ± 0.097 | 0.892 ± 0.080 | 0.890 ± 0.084 | 0.896 ± 0.082 | 0.890 ± 0.092 |
Random forest | 0.897 ± 0.180 | 0.898 ± 0.064 | 0.915 ± 0.054 | 0.920 ± 0.041 | 0.924 ± 0.038 |
CNN, convolutional neural network; LSTM, long short-term memory neural network; SVM, support vector machine.
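
As a reading aid for the classification rows, the following minimal sketch shows one way a "mean weighted F1 score across fields ± SD" entry could be computed. It is not the authors' code. The function names `weighted_f1` and `summarize`, the toy labels, and the two example fields are hypothetical. The sketch assumes that "weighted" means each class's F1 is weighted by its share of the true labels, and that SD is the sample standard deviation across fields; the table states neither explicitly.

```python
from collections import Counter
from statistics import mean, stdev


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's share of y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n_label in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n_label - tp
        # F1 = 2TP / (2TP + FP + FN); the denominator is > 0 because n_label > 0
        f1 = 2 * tp / (2 * tp + fp + fn)
        score += (n_label / total) * f1
    return score


def summarize(per_field_scores):
    """Mean and sample SD across fields (assumption: sample, not population, SD)."""
    return mean(per_field_scores), stdev(per_field_scores)


# Hypothetical toy data: two fields scored for one model at one training size.
field_a = weighted_f1(["a", "a", "b", "b"], ["a", "a", "b", "a"])
field_b = weighted_f1(["x", "y", "y"], ["x", "y", "y"])
avg, sd = summarize([field_a, field_b])
print(f"{avg:.3f} ± {sd:.3f}")
```

With many fields per model, `summarize` would be called once per table cell, on the per-field scores obtained at that training-set size.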