2021 Dec 16;1:61. doi: 10.1038/s43856-021-00062-8

Table 2.

Performance summary.

Metrics Dataset DeepMedic DAGMNet_CH3 DAGMNet_CH2 UNet_CH3 UNet_CH2 FCN_CH3 FCN_CH2
Dice score Testing (n = 459) 0.74 (0.17); 0.79 0.76 (0.16); 0.81 0.75 (0.17); 0.80 0.75 (0.18); 0.81 0.74 (0.20); 0.80 0.68 (0.20); 0.72 0.66 (0.20); 0.71
STIR 2 (n = 140) 0.76 (0.18); 0.82 0.75 (0.21); 0.82 0.75 (0.21); 0.81 0.73 (0.24); 0.82 0.73 (0.24); 0.82 0.70 (0.22); 0.75 0.68 (0.24); 0.75
STIR 1 (n = 140) 0.55 (0.27); 0.60 0.51 (0.30); 0.59 0.48 (0.32); 0.58 0.49 (0.31); 0.59 0.48 (0.32); 0.58 0.49 (0.28); 0.55 0.44 (0.30); 0.46
Testing L (n = 163) 0.85 (0.09); 0.87 0.83 (0.10); 0.86 0.84 (0.09); 0.86 0.85 (0.09); 0.88 0.84 (0.10); 0.87 0.81 (0.10); 0.84 0.80 (0.11); 0.83
STIR 2 L (n = 76) 0.84 (0.13); 0.88 0.81 (0.18); 0.88 0.82 (0.16); 0.89 0.81 (0.20); 0.89 0.81 (0.21); 0.88 0.79 (0.18); 0.84 0.77 (0.21); 0.86
STIR 1 L (n = 50) 0.67 (0.25); 0.78 0.64 (0.28); 0.77 0.64 (0.29); 0.79 0.59 (0.30); 0.72 0.62 (0.30); 0.76 0.61 (0.28); 0.74 0.59 (0.30); 0.73
Testing M (n = 144) 0.74 (0.13); 0.76 0.75 (0.14); 0.80 0.74 (0.14); 0.77 0.76 (0.14); 0.79 0.74 (0.16); 0.79 0.67 (0.15); 0.71 0.66 (0.15); 0.68
STIR 2 M (n = 43) 0.73 (0.13); 0.77 0.75 (0.13); 0.77 0.75 (0.15); 0.78 0.72 (0.20); 0.78 0.71 (0.22); 0.77 0.66 (0.16); 0.70 0.66 (0.16); 0.70
STIR 1 M (n = 51) 0.53 (0.24); 0.59 0.49 (0.28); 0.59 0.43 (0.30); 0.50 0.45 (0.31); 0.57 0.42 (0.30); 0.37 0.47 (0.24); 0.53 0.39 (0.26); 0.42
Testing S (n = 152) 0.63 (0.18); 0.67 0.68 (0.19); 0.73* 0.66 (0.22); 0.72 0.65 (0.22); 0.72 0.62 (0.25); 0.69 0.54 (0.22); 0.58 0.51 (0.22); 0.56
STIR 2 S (n = 21) 0.52 (0.21); 0.56 0.51 (0.25); 0.55 0.48 (0.27); 0.51 0.49 (0.28); 0.55 0.53 (0.26); 0.62 0.45 (0.24); 0.48 0.40 (0.22); 0.42
STIR 1 S (n = 39) 0.43 (0.25); 0.52 0.37 (0.29); 0.48 0.34 (0.31); 0.29 0.41 (0.31); 0.48 0.38 (0.32); 0.47 0.37 (0.25); 0.42 0.32 (0.27); 0.38
Precision Testing (n = 459) 0.76 (0.21); 0.82 0.83 (0.17); 0.88* 0.81 (0.18); 0.87 0.80 (0.18); 0.86 0.81 (0.19); 0.87 0.70 (0.22); 0.75 0.68 (0.22); 0.73
STIR 2 (n = 140) 0.75 (0.19); 0.79 0.80 (0.20); 0.87* 0.78 (0.20); 0.85 0.78 (0.21); 0.84 0.80 (0.19); 0.85 0.72 (0.20); 0.78 0.73 (0.20); 0.78
STIR 1 (n = 140) 0.62 (0.26); 0.67 0.62 (0.31); 0.72 0.55 (0.33); 0.64 0.65 (0.31); 0.77 0.66 (0.32); 0.78 0.57 (0.28); 0.65 0.57 (0.33); 0.69
Sensitivity Testing (n = 459) 0.78 (0.17); 0.83* 0.73 (0.19); 0.77 0.74 (0.21); 0.79 0.76 (0.21); 0.83 0.71 (0.23); 0.78 0.71 (0.21); 0.77 0.69 (0.23); 0.76
STIR 2 (n = 140) 0.82 (0.21); 0.91* 0.76 (0.24); 0.85 0.78 (0.25); 0.87 0.76 (0.28); 0.90 0.75 (0.28); 0.88 0.74 (0.26); 0.85 0.72 (0.27); 0.82
STIR 1 (n = 140) 0.59 (0.32); 0.65 0.52 (0.33); 0.62 0.53 (0.37); 0.65 0.48 (0.35); 0.53 0.46 (0.35); 0.53 0.52 (0.32); 0.61 0.43 (0.33); 0.41
Subject detection rate Testing (n = 459) 1.00 (0.05); 0.99 (0.08); 0.98 (0.12); 0.99 (0.11); 0.98 (0.15); 0.98 (0.13); 0.97 (0.17);
[0.99, 1.00] [0.99, 1.00] [0.97, 1.00] [0.98, 1.00] [0.96, 0.99] [0.97, 0.99] [0.96, 0.99]
STIR 2 (n = 140) 0.99 (0.08); 0.98 (0.14); 0.99 (0.12); 0.98 (0.14); 0.97 (0.17); 0.98 (0.14); 0.99 (0.12);
[0.98, 1.01] [0.95, 1.00] [0.97, 1.01] [0.95, 1.00] [0.94, 1.00] [0.95, 1.00] [0.97, 1.01]
STIR 1 (n = 140) 0.96 (0.20); 0.90 (0.30); 0.84 (0.36); 0.87 (0.33); 0.85 (0.36); 0.91 (0.29); 0.85 (0.36);
[0.92, 0.99] [0.85, 0.95] [0.78, 0.90] [0.82, 0.93] [0.79, 0.91] [0.86, 0.96] [0.79, 0.91]
Spearman correlation of dice and lesion volume size Testing (n = 459) 0.62 [0.57, 0.68] 0.44 [0.37, 0.51] 0.48 [0.41, 0.55] 0.53 [0.46, 0.59] 0.53 [0.46, 0.59] 0.63 [0.57, 0.68] 0.65 [0.59, 0.70]
STIR 2 (n = 140) 0.68 [0.58, 0.76] 0.49 [0.36, 0.61] 0.54 [0.41, 0.65] 0.55 [0.42, 0.65] 0.51 [0.37, 0.62] 0.60 [0.48, 0.69] 0.59 [0.48, 0.69]
STIR 1 (n = 140) 0.42 [0.28, 0.55] 0.42 [0.27, 0.55] 0.42 [0.27, 0.55] 0.30 [0.14, 0.44] 0.36 [0.21, 0.50] 0.44 [0.29, 0.56] 0.42 [0.28, 0.55]
Spearman correlation of dice and lesion DWI contrast Testing (n = 459) 0.60 [0.54, 0.66] 0.65 [0.59, 0.70] 0.61 [0.55, 0.66] 0.62 [0.56, 0.68] 0.64 [0.59, 0.69] 0.64 [0.59, 0.69] 0.65 [0.59, 0.70]
STIR 2 (n = 140) 0.45 [0.31, 0.57] 0.57 [0.44, 0.67] 0.54 [0.41, 0.65] 0.54 [0.41, 0.65] 0.55 [0.42, 0.65] 0.52 [0.38, 0.63] 0.56 [0.43, 0.66]
STIR 1 (n = 140) 0.52 [0.38, 0.63] 0.56 [0.43, 0.66] 0.41 [0.26, 0.54] 0.51 [0.37, 0.62] 0.40 [0.25, 0.53] 0.45 [0.30, 0.57] 0.42 [0.28, 0.55]
Spearman correlation of dice and lesion ADC contrast Testing (n = 459) −0.33 [−0.41, −0.24] −0.48 [−0.55, −0.41] −0.47 [−0.53, −0.39] −0.44 [−0.51, −0.36] −0.46 [−0.53, −0.38] −0.41 [−0.48, −0.33] −0.40 [−0.48, −0.32]
STIR 2 (n = 140) −0.31 [−0.45, −0.15] −0.37 [−0.51, −0.22] −0.36 [−0.50, −0.21] −0.40 [−0.53, −0.25] −0.42 [−0.55, −0.28] −0.38 [−0.51, −0.23] −0.41 [−0.54, −0.26]
STIR 1 (n = 140) −0.24 [−0.39, −0.08]+ −0.30 [−0.44, −0.14] −0.27 [−0.42, −0.11]+ −0.29 [−0.44, −0.13] −0.30 [−0.44, −0.14] −0.13 [−0.29, 0.03]+ −0.20 [−0.35, −0.03]+
Spearman correlation of lesion and predict volume size Testing (n = 459) 0.97 [0.96, 0.97] 0.97 [0.97, 0.98] 0.97 [0.96, 0.97] 0.97 [0.96, 0.98] 0.97 [0.96, 0.97] 0.97 [0.96, 0.97] 0.97 [0.96, 0.97]
STIR 2 (n = 140) 0.97 [0.96, 0.98] 0.96 [0.94, 0.97] 0.96 [0.94, 0.97] 0.93 [0.90, 0.95] 0.89 [0.86, 0.92] 0.95 [0.93, 0.96] 0.94 [0.91, 0.96]
STIR 1 (n = 140) 0.87 [0.83, 0.91] 0.84 [0.79, 0.89] 0.80 [0.73, 0.85] 0.81 [0.74, 0.86] 0.79 [0.72, 0.85] 0.84 [0.78, 0.88] 0.79 [0.72, 0.85]
Spearman correlation of lesion and predict DWI contrast Testing (n = 459) 0.87 [0.85, 0.89] 0.89 [0.86, 0.90] 0.88 [0.86, 0.90] 0.87 [0.84, 0.89] 0.88 [0.86, 0.90] 0.86 [0.83, 0.88] 0.85 [0.82, 0.87]
STIR 2 (n = 140) 0.83 [0.77, 0.88] 0.81 [0.74, 0.86] 0.85 [0.80, 0.89] 0.87 [0.82, 0.90] 0.88 [0.84, 0.91] 0.83 [0.77, 0.87] 0.82 [0.76, 0.87]
STIR 1 (n = 140) 0.61 [0.50, 0.71] 0.70 [0.61, 0.78] 0.59 [0.47, 0.69] 0.69 [0.58, 0.77] 0.74 [0.65, 0.81] 0.59 [0.48, 0.69] 0.50 [0.36, 0.62]
Spearman correlation of lesion and predict ADC contrast Testing (n = 459) 0.77 [0.74, 0.81] 0.84 [0.81, 0.86] 0.83 [0.80, 0.86] 0.82 [0.79, 0.85] 0.83 [0.80, 0.86] 0.80 [0.77, 0.83] 0.80 [0.76, 0.83]
STIR 2 (n = 140) 0.85 [0.80, 0.89] 0.86 [0.81, 0.90] 0.91 [0.87, 0.93] 0.87 [0.82, 0.90] 0.93 [0.90, 0.95] 0.90 [0.86, 0.93] 0.84 [0.79, 0.88]
STIR 1 (n = 140) 0.51 [0.38, 0.63] 0.58 [0.46, 0.68] 0.52 [0.38, 0.63] 0.55 [0.42, 0.66] 0.57 [0.44, 0.67] 0.53 [0.39, 0.64] 0.47 [0.32, 0.59]
Median of false positives Not visible (n = 499) 14 0 0 0 0 0 0
Number of subjects whose FP > 10 voxels Not visible (n = 499) 275 132 78 55 36 182 88
False positive subject detection rate Not visible (n = 499) 0.55 (0.50); 0.26 (0.44); 0.16 (0.36); 0.11 (0.31); 0.07 (0.26); 0.36 (0.48); 0.18 (0.38);
[0.51, 0.59]* [0.23, 0.30] [0.12, 0.19] [0.08, 0.14] [0.05, 0.09] [0.32, 0.41] [0.14, 0.21]
False positive subject detection rate (retrospect evaluation) Not visible (n = 499) 0.53 (0.50); 0.24 (0.43); 0.14 (0.35); 0.10 (0.30); 0.06 (0.24); 0.34 (0.47); 0.15 (0.36);
[0.48, 0.57]* [0.21, 0.28] [0.11, 0.17] [0.07, 0.12] [0.04, 0.08] [0.30, 0.38] [0.12, 0.18]
Number of trainable parameters All 24.5 M 10.7 M 10.7 M 10.0 M 10.0 M 10.1 M 10.1 M
CPU inference time in seconds Testing (n = 459) 85.68 30.10 (0.52) 29.09 (0.46) 19.30 (0.34) 18.71 (0.37) 7.15 (0.52) 6.80 (0.55)
GPU inference time in seconds Testing (n = 459) 14.97 5.91 (0.44) 4.82 (0.30) 3.78 (0.18) 3.59 (0.18) 2.40 (0.18) 2.26 (0.18)

Metrics (Dice, precision, sensitivity) are reported as “mean (standard deviation); median”; subject detection rate is reported as “mean (standard deviation); [95% CI]”. Correlations are shown as “correlation coefficient [95% CI]”. “+” indicates no significant correlation (P value > 1E−3); all other correlations were significant with P value ≤ 1E−3. In the Dataset column, L = large, M = moderate, and S = small lesion groups. Statistically significant differences between DAGMNet_CH3 and DeepMedic are labeled with “*”.
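For readers who want to reproduce entries in the "mean (standard deviation); median" format, the following is a minimal sketch of how Dice, precision, and sensitivity could be computed from a pair of binary lesion masks and then summarized across subjects. It is not the authors' evaluation code: the function names `overlap_metrics` and `summarize` are hypothetical, the metrics follow their standard voxel-wise definitions, and the use of the population standard deviation (NumPy's default) is an assumption, since the table does not state which estimator was used.

```python
import numpy as np

def overlap_metrics(truth, pred):
    """Return (dice, precision, sensitivity) for two binary masks of equal shape."""
    truth = np.asarray(truth, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    tp = np.count_nonzero(truth & pred)    # lesion voxels correctly predicted
    fp = np.count_nonzero(~truth & pred)   # predicted voxels outside the lesion
    fn = np.count_nonzero(truth & ~pred)   # lesion voxels that were missed
    nan = float("nan")
    # Undefined ratios (empty denominators) are returned as NaN.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else nan
    precision = tp / (tp + fp) if (tp + fp) else nan
    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    return dice, precision, sensitivity

def summarize(values):
    """Format per-subject scores as 'mean (standard deviation); median'."""
    v = np.asarray(values, dtype=float)
    # Assumption: population SD (ddof=0); NaN scores are ignored.
    return f"{np.nanmean(v):.2f} ({np.nanstd(v):.2f}); {np.nanmedian(v):.2f}"

if __name__ == "__main__":
    truth = [1, 1, 0, 0]
    pred = [1, 0, 1, 0]
    print(overlap_metrics(truth, pred))
    print(summarize([0.5, 0.7, 0.9]))
```

In this sketch a subject with no lesion and no prediction yields NaN rather than a score of 1; how such cases were handled in the reported results is not specified in the table.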