Table 3.
Pharyngeal phase detection performance by bolus consistency. The best results for each metric are highlighted in bold. P3BPM and P3UESC values higher than or equal to the inter-rater agreement are marked with an asterisk.
| Bolus consistency | CNN backbone | F1-score | P3BPM (%) | P3UESC (%) |
|---|---|---|---|---|
| Thin boluses | VGG16 | 0.921 | 89.66 | 89.66* |
| | InceptionV3 | 0.927 | 89.66 | 93.10* |
| | ResNet50 | 0.934 | 89.66 | 89.66* |
| | CNN3 | 0.893 | 89.66 | 75.86 |
| | CNN4 | 0.901 | 89.66 | 82.76 |
| | Inter-rater agreement | | 93.10 | 86.21 |
| Slightly thick boluses | VGG16 | 0.914 | 92.54 | 94.09 |
| | InceptionV3 | 0.904 | 89.55 | 92.54 |
| | ResNet50 | 0.927 | 89.55 | 100.00* |
| | CNN3 | 0.901 | 86.57 | 86.57 |
| | CNN4 | 0.894 | 88.06 | 91.04 |
| | Inter-rater agreement | | 94.06 | 95.52 |
| Mildly thick boluses | VGG16 | 0.881 | 75.36 | 91.30* |
| | InceptionV3 | 0.850 | 75.36 | 88.41* |
| | ResNet50 | 0.869 | 76.81 | 89.86* |
| | CNN3 | 0.848 | 71.01 | 85.51 |
| | CNN4 | 0.868 | 75.36 | 89.86* |
| | Inter-rater agreement | | 86.96 | 88.41 |
| Moderately thick boluses | VGG16 | 0.925 | 77.94 | 100.00* |
| | InceptionV3 | 0.913 | 72.06 | 97.06* |
| | ResNet50 | 0.925 | 82.35* | 97.06* |
| | CNN3 | 0.875 | 70.59 | 88.24* |
| | CNN4 | 0.890 | 69.12 | 94.12* |
| | Inter-rater agreement | | 82.35 | 88.24 |
| Extremely thick boluses | VGG16 | 0.900 | 64.41 | 96.61* |
| | InceptionV3 | 0.860 | 61.02 | 91.53* |
| | ResNet50 | 0.908 | 69.49 | 93.22* |
| | CNN3 | 0.830 | 64.41 | 84.75 |
| | CNN4 | 0.876 | 61.02 | 94.92* |
| | Inter-rater agreement | | 81.36 | 88.14 |
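
The asterisk convention stated in the caption of Table 3 can be illustrated with a minimal Python sketch. It is not taken from the original study: the helper name `mark` and the dictionary layout are hypothetical, and only the thin-bolus P3UESC values are copied from the table. A value receives an asterisk when it is higher than or equal to the corresponding inter-rater agreement.

```python
# Hypothetical helper illustrating the asterisk rule from the Table 3 caption.
def mark(value: float, inter_rater: float) -> str:
    """Format a percentage; append '*' if it is at least the inter-rater agreement."""
    flag = "*" if value >= inter_rater else ""
    return f"{value:.2f}{flag}"


# P3UESC (%) for thin boluses, copied from Table 3.
thin_p3uesc = {
    "VGG16": 89.66,
    "InceptionV3": 93.10,
    "ResNet50": 89.66,
    "CNN3": 75.86,
    "CNN4": 82.76,
}
inter_rater_p3uesc = 86.21  # inter-rater agreement for the same column

# Apply the rule to every backbone in the group.
marked = {name: mark(v, inter_rater_p3uesc) for name, v in thin_p3uesc.items()}

for name, cell in marked.items():
    print(f"{name}: {cell}")
```

Because the comparison is inclusive, a model that exactly matches the raters (for example, 82.35 against an agreement of 82.35) is also marked.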