2021 Jul 22;121:108197. doi: 10.1016/j.patcog.2021.108197

Table 10.

Comparison of HPC-XGB with the best-performing state-of-the-art model (DT [10], [11]) trained for each subtask. Performance under testing procedure c) is reported as average accuracy and average recall. The significance marker indicates whether the recall distribution of the proposed HPC-XGB over the 11 GPs (Core Data Team) is significantly higher than that of DT according to the one-sided Wilcoxon signed-rank test (significance level = 0.05).

| Model | # of patients | Labels | Accuracy (mean [std]) | Recall (mean [std]) |
|---|---|---|---|---|
| HPC-XGB B1 | 3392 | {20,26} | 0.661 (0.327) | 0.493 (0.092) |
| DT [10], [11] B1 | 3392 | {20,26} | 0.912 (0.113) | 0.499 (0.024) |
| HPC-XGB B2 | 4972 | {19,25} | 0.764 (0.230) | 0.701 (0.155) |
| DT [10], [11] B2 | 4972 | {19,25} | 0.801 (0.082) | 0.674 (0.116) |
| HPC-XGB B3 | 2872 | {16,18,22,24} | 0.652 (0.148) | 0.553 (0.099) |
| DT [10], [11] B3 | 2872 | {16,18,22,24} | 0.535 (0.163) | 0.458 (0.105) |
| HPC-XGB B4 | 1796 | {15,17,21} | 0.573 (0.213) | 0.468 (0.077) |
| DT [10], [11] B4 | 1796 | {15,17,21} | 0.477 (0.152) | 0.401 (0.064) |
| HPC-XGB B5 | 1104 | {9,10,11,12,13,14} | 0.301 (0.202) | 0.266 (0.155) |
| DT [10], [11] B5 | 1104 | {9,10,11,12,13,14} | 0.277 (0.148) | 0.268 (0.143) |
| HPC-XGB B6 | 1110 | {2,4,5,6,7} | 0.453 (0.263) | 0.369 (0.139) |
| DT [10], [11] B6 | 1110 | {2,4,5,6,7} | 0.377 (0.159) | 0.289 (0.097) |
| HPC-XGB B7 | 279 | {1,3} | 0.734 (0.141) | 0.750 (0.117) |
| DT [10], [11] B7 | 279 | {1,3} | 0.661 (0.121) | 0.643 (0.110) |
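
The caption of Table 10 refers to a one-sided Wilcoxon signed-rank test on the per-GP recall values of the two models. The table reports only means and standard deviations, so the test cannot be reproduced from it. The following is a minimal sketch of how such a paired test could be computed, under stated assumptions: the eleven recall pairs are hypothetical placeholders (not data from the study), the function name `wilcoxon_greater` and its `ndigits` parameter are illustrative, and the exact enumeration shown here is not necessarily the procedure the authors used.

```python
from itertools import product


def wilcoxon_greater(x, y, ndigits=6):
    """Exact one-sided Wilcoxon signed-rank test (H1: x tends to exceed y).

    Returns (w_plus, p_value). Zero differences are dropped and tied
    absolute differences share their average rank.
    """
    # Paired differences, rounded so float noise does not break ties.
    diffs = [round(a - b, ndigits) for a, b in zip(x, y)]
    diffs = [d for d in diffs if d != 0]
    n = len(diffs)

    # Rank |d| from smallest to largest, averaging ranks within ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of 1-based ranks i+1..j+1
        i = j + 1

    # Test statistic: sum of the ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Exact null distribution: each rank is positive with probability 1/2.
    # Enumerating 2**n sign patterns is cheap for small n (2048 for n = 11).
    hits = sum(
        1
        for signs in product((False, True), repeat=n)
        if sum(r for r, s in zip(ranks, signs) if s) >= w_plus
    )
    return w_plus, hits / 2 ** n


# HYPOTHETICAL per-GP recall values for one subtask (11 GPs); not from the paper.
recall_hpc_xgb = [0.75, 0.62, 0.81, 0.58, 0.70, 0.66, 0.49, 0.77, 0.69, 0.55, 0.73]
recall_dt = [0.64, 0.60, 0.70, 0.61, 0.63, 0.57, 0.50, 0.65, 0.64, 0.51, 0.62]

w_plus, p_value = wilcoxon_greater(recall_hpc_xgb, recall_dt)
alpha = 0.05  # significance level stated in the caption
print(f"W+ = {w_plus}, p = {p_value:.5f}, significant = {p_value < alpha}")
```

With only 11 paired observations the exact null distribution is small enough to enumerate, which avoids relying on a normal approximation. For larger samples a library routine would be the more practical choice.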