Table 2. Evaluation metrics of the T staging model in the training set.
| Model | Threshold | Sensitivity | Specificity | Accuracy | Hamming loss | F1 score | Kappa score | AUC | AP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1* | 0.5 | 0.809 | 0.875 | 0.838 | 0.162 | 0.850 | 0.674 | 0.893 | 0.901 |
| 2 | 0.5 | 0.809 | 0.833 | 0.819 | 0.180 | 0.836 | 0.636 | 0.841 | 0.880 |
| 3 | 0.5 | 0.778 | 0.833 | 0.802 | 0.198 | 0.817 | 0.602 | 0.846 | 0.885 |
| 4 | 0.5 | 0.825 | 0.833 | 0.829 | 0.171 | 0.845 | 0.654 | 0.842 | 0.869 |
| 5 | 0.5 | 0.762 | 0.896 | 0.819 | 0.180 | 0.827 | 0.642 | 0.864 | 0.892 |

*Best performance of all models in the training set. AP: average precision; AUC: area under the curve; Hamming loss: the fraction of labels that are incorrectly predicted; F1 score = 2 × (precision × recall) / (precision + recall); Kappa score: a score that expresses the level of agreement between two annotators on a classification problem.
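
The metric definitions given in the table footnote can be illustrated with a minimal sketch that derives the threshold-dependent columns from a binary confusion matrix. The `Confusion` class, the `binary_metrics` function, and the counts in the usage line are hypothetical and are not taken from the study. The sketch assumes a single-label binary task, in which Hamming loss reduces to 1 − accuracy, and it takes the Kappa score to be Cohen's kappa. AUC and AP are omitted because they require the continuous model scores rather than thresholded predictions.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion-matrix counts (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def binary_metrics(c: Confusion) -> dict:
    """Compute the threshold-dependent metrics named in the table footnote."""
    n = c.tp + c.fp + c.tn + c.fn
    sensitivity = c.tp / (c.tp + c.fn)   # recall on the positive class
    specificity = c.tn / (c.tn + c.fp)   # recall on the negative class
    precision = c.tp / (c.tp + c.fp)
    accuracy = (c.tp + c.tn) / n
    # Fraction of labels predicted incorrectly (binary, single-label case).
    hamming_loss = (c.fp + c.fn) / n
    # F1 = 2 * (precision * recall) / (precision + recall)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_pos = ((c.tp + c.fn) / n) * ((c.tp + c.fp) / n)
    p_neg = ((c.tn + c.fp) / n) * ((c.tn + c.fn) / n)
    p_chance = p_pos + p_neg
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "hamming_loss": hamming_loss,
        "f1": f1,
        "kappa": kappa,
    }


# Usage with illustrative counts (assumed, not reported in the source).
metrics = binary_metrics(Confusion(tp=51, fp=6, tn=42, fn=12))
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Working from the four counts keeps every column traceable to the same predictions, which makes it easy to check that accuracy and Hamming loss sum to 1 in the binary case.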