Table 2.
Weighted F1 scores for classification fields and mean accuracy for token extractor fields on the full training data sample (n = 2066).
Data elements | Logistic regression | AdaBoost classifier | Random forest | SVM | CNN | LSTM | Majority class accuracy |
---|---|---|---|---|---|---|---|
Gleason grade—primary | 0.978 | 0.971 | 0.941 | 0.932 | 0.981 | 0.628 | 0.709 |
Gleason grade—secondary | 0.958 | 0.943 | 0.913 | 0.912 | 0.968 | 0.576 | 0.467 |
Gleason grade—tertiary | 0.923 | 0.930 | 0.844 | 0.886 | 0.930 | 0.741 | 0.901 |
Tumor histology | 0.989 | 0.995 | 0.995 | 0.993 | 0.995 | 0.994 | 0.991 |
Cribriform pattern | 0.963 | 0.981 | 0.963 | 0.968 | 0.987 | 0.966 | 0.997 |
Treatment effect | 0.981 | 0.979 | 0.981 | 0.981 | 0.981 | 0.973 | 0.985 |
Tumor margin status | 0.941 | 0.953 | 0.888 | 0.918 | 0.950 | 0.630 | 0.799 |
Benign margin status | 0.977 | 0.975 | 0.972 | 0.981 | 0.978 | 0.967 | 0.997 |
Perineural invasion | 0.944 | 0.978 | 0.938 | 0.929 | 0.972 | 0.613 | 0.771 |
Seminal vesicle invasion | 0.943 | 0.974 | 0.940 | 0.965 | 0.976 | 0.784 | 0.904 |
Extraprostatic extension | 0.954 | 0.953 | 0.882 | 0.939 | 0.961 | 0.778 | 0.712 |
Lymph node status | 0.983 | 0.952 | 0.983 | 0.973 | 0.986 | 0.824 | 0.570 |
Mean weighted F1 across classification fields | 0.961 | 0.965 | 0.937 | 0.948 | 0.972 | 0.790 | 0.817 |
T stage | 0.951 | 0.954 | 0.948 | – | – | – | – |
N stage | 0.954 | 0.954 | 0.948 | – | – | – | – |
M stage | 0.972 | 0.969 | 0.969 | – | – | – | – |
Estimated tumor volume | 0.605 | 0.765 | 0.873 | – | – | – | – |
Prostate weight | 0.846 | 0.855 | 0.914 | – | – | – | – |
Mean accuracy for token extractor models | 0.866 | 0.899 | 0.930 | – | – | – | – |
CNN, convolutional neural network; LSTM, long short-term memory neural network; SVM, support vector machine.
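
For readers unfamiliar with the two metrics in the classification rows, the following is a minimal sketch of how a weighted F1 score and a majority class accuracy baseline are commonly computed. It assumes the standard definition of weighted F1 (per-class F1 averaged with weights proportional to each class's support, as in scikit-learn's `average="weighted"` option); the source does not state which implementation was used, and the function names and example labels below are hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (assumed definition)."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n_label in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n_label - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Weight each class by its share of the true labels.
        score += (n_label / total) * f1
    return score


def majority_class_accuracy(y_true):
    """Accuracy obtained by always predicting the most frequent label."""
    return Counter(y_true).most_common(1)[0][1] / len(y_true)


# Hypothetical labels for a single field, for illustration only.
y_true = ["3", "3", "3", "4", "4", "5"]
y_pred = ["3", "3", "4", "4", "4", "5"]

print(round(weighted_f1(y_true, y_pred), 3))
print(majority_class_accuracy(y_true))
```

On fields where a single class dominates, the majority class baseline is already high, so a model's score should be read against that baseline rather than in isolation.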