2021 Sep 17;38(5):483–494. doi: 10.1007/s10585-021-10119-6

Table 3.

Performance of the radiomics models when trained on segmentations from multiple observers (STUD2, PhD, and RAD) for the patients in the training sets and tested on segmentations from another observer (CNN) for the remaining patients in the test sets.

Metric       Regular            ICC > 0.75         ICC > 0.90         ComBat Man         ComBat Prot
AUC          0.69 [0.57, 0.81]  0.70 [0.59, 0.81]  0.65 [0.53, 0.77]  0.64 [0.40, 0.88]  0.63 [0.38, 0.87]
Accuracy     0.65 [0.54, 0.76]  0.65 [0.55, 0.75]  0.61 [0.50, 0.72]  0.60 [0.41, 0.79]  0.58 [0.39, 0.76]
Sensitivity  0.71 [0.57, 0.86]  0.63 [0.48, 0.78]  0.61 [0.44, 0.77]  0.56 [0.30, 0.82]  0.55 [0.29, 0.81]
Specificity  0.58 [0.41, 0.74]  0.67 [0.51, 0.83]  0.61 [0.45, 0.78]  0.63 [0.33, 0.93]  0.60 [0.29, 0.90]

Performance is reported for: the regular model; models using only features with good (ICC > 0.75) or excellent (ICC > 0.90) reliability; and models using ComBat harmonization per manufacturer (Man) or per acquisition protocol (Prot) without a moderation variable. For each metric, the mean and 95% confidence interval over the 100× random-split cross-validation iterations are given.

Abbreviations: AUC, area under the receiver operator characteristic curve; ICC, intra-class correlation coefficient; Man, manufacturer; Prot, protocol
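
Each cell in the table summarizes one metric as a mean and 95% confidence interval over 100 random-split cross-validation iterations. The table notes do not say how the interval was computed, so the following is only a minimal sketch of one plausible approach. It uses a normal approximation and, optionally, the Nadeau-Bengio variance correction for overlapping random splits. The function name `mean_and_ci`, the `test_fraction` parameter, and the simulated AUC values are all hypothetical and are not taken from the study.

```python
import math
import random
from statistics import NormalDist, mean, variance


def mean_and_ci(scores, test_fraction=None, level=0.95):
    """Return (mean, lower, upper) for a list of per-iteration metric values.

    Assumption: a normal-approximation interval. If test_fraction is given,
    the variance is inflated with the Nadeau-Bengio correction, which
    accounts for the overlap between repeated random train/test splits.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two iterations")
    m = mean(scores)
    var = variance(scores)  # sample variance (n - 1 denominator)
    if test_fraction is None:
        scale = 1.0 / n  # naive standard error of the mean
    else:
        # 1/n + n_test/n_train, with n_test/n_train = f / (1 - f)
        scale = 1.0 / n + test_fraction / (1.0 - test_fraction)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(scale * var)
    return m, m - half_width, m + half_width


# Hypothetical data: 100 simulated AUC values, NOT the study's results.
rng = random.Random(0)
aucs = [min(1.0, max(0.0, rng.gauss(0.70, 0.06))) for _ in range(100)]

m, lo, hi = mean_and_ci(aucs, test_fraction=0.2)
print(f"AUC {m:.2f} [{lo:.2f}, {hi:.2f}]")
```

With the correction enabled, the interval is wider than the naive one because the 100 iterations reuse the same patients and are therefore not independent. The assumed 20% test fraction is illustrative only.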