Table 3.
Test set: No finding | White | Asian | Black | Female | Male |
---|---|---|---|---|---|
AUC (95% CI) | | | | | |
Original | 0.85 (0.84–0.85) | 0.86 (0.84–0.88) | 0.85 (0.84–0.86) | 0.86 (0.85–0.86) | 0.84 (0.83–0.84) |
Resampled | 0.84 (0.84–0.85) | 0.85 (0.84–0.85) | 0.86 (0.86–0.86) | 0.86 (0.85–0.86) | 0.84 (0.84–0.84) |
Multitask | 0.84 (0.84–0.85) | 0.85 (0.84–0.87) | 0.85 (0.84–0.85) | 0.86 (0.85–0.86) | 0.84 (0.83–0.84) |
TPR (95% CI) | | | | | |
Original | 0.75 (0.74–0.75) | 0.74 (0.71–0.77) | 0.80 (0.79–0.82) | 0.78 (0.77–0.79) | 0.74 (0.72–0.74) |
Resampled | 0.74 (0.74–0.75) | 0.73 (0.73–0.74) | 0.77 (0.76–0.78) | 0.77 (0.77–0.78) | 0.73 (0.72–0.73) |
Multitask | 0.73 (0.72–0.74) | 0.76 (0.72–0.79) | 0.82 (0.81–0.83) | 0.77 (0.76–0.78) | 0.73 (0.72–0.74) |
FPR (95% CI) | | | | | |
Original | 0.19 (0.19–0.19) | 0.17 (0.15–0.19) | 0.25 (0.24–0.26) | 0.21 (0.21–0.21) | 0.19 (0.19–0.20) |
Resampled | 0.20 (0.20–0.20) | 0.19 (0.18–0.19) | 0.21 (0.21–0.22) | 0.21 (0.21–0.21) | 0.19 (0.19–0.19) |
Multitask | 0.18 (0.18–0.19) | 0.19 (0.17–0.21) | 0.27 (0.26–0.28) | 0.20 (0.20–0.21) | 0.20 (0.19–0.20) |
Youden's J statistic (95% CI) | | | | | |
Original | 0.55 (0.54–0.56) | 0.58 (0.54–0.61) | 0.55 (0.54–0.57) | 0.57 (0.56–0.58) | 0.54 (0.53–0.55) |
Resampled | 0.54 (0.54–0.55) | 0.55 (0.54–0.55) | 0.56 (0.55–0.57) | 0.56 (0.55–0.57) | 0.54 (0.53–0.54) |
Multitask | 0.55 (0.54–0.55) | 0.57 (0.53–0.60) | 0.55 (0.53–0.56) | 0.57 (0.56–0.58) | 0.54 (0.52–0.55) |

Test set: Pleural effusion | White | Asian | Black | Female | Male |
---|---|---|---|---|---|
AUC (95% CI) | | | | | |
Original | 0.89 (0.89–0.89) | 0.90 (0.88–0.91) | 0.91 (0.90–0.91) | 0.91 (0.90–0.91) | 0.89 (0.88–0.89) |
Resampled | 0.89 (0.89–0.89) | 0.88 (0.88–0.89) | 0.90 (0.90–0.90) | 0.90 (0.89–0.90) | 0.88 (0.88–0.89) |
Multitask | 0.89 (0.89–0.90) | 0.90 (0.88–0.91) | 0.91 (0.90–0.91) | 0.91 (0.90–0.91) | 0.89 (0.89–0.89) |
TPR (95% CI) | | | | | |
Original | 0.84 (0.84–0.85) | 0.84 (0.81–0.87) | 0.79 (0.77–0.81) | 0.83 (0.82–0.85) | 0.84 (0.83–0.85) |
Resampled | 0.85 (0.85–0.86) | 0.82 (0.82–0.83) | 0.80 (0.79–0.80) | 0.83 (0.82–0.83) | 0.82 (0.82–0.83) |
Multitask | 0.86 (0.85–0.87) | 0.81 (0.77–0.84) | 0.74 (0.71–0.76) | 0.86 (0.85–0.87) | 0.82 (0.81–0.83) |
FPR (95% CI) | | | | | |
Original | 0.22 (0.21–0.22) | 0.20 (0.18–0.22) | 0.15 (0.14–0.15) | 0.18 (0.18–0.18) | 0.22 (0.21–0.22) |
Resampled | 0.22 (0.22–0.23) | 0.22 (0.21–0.22) | 0.16 (0.16–0.16) | 0.19 (0.19–0.19) | 0.21 (0.21–0.21) |
Multitask | 0.23 (0.22–0.23) | 0.18 (0.16–0.20) | 0.11 (0.11–0.12) | 0.20 (0.20–0.20) | 0.20 (0.20–0.20) |
Youden's J statistic (95% CI) | | | | | |
Original | 0.63 (0.62–0.64) | 0.64 (0.60–0.67) | 0.64 (0.62–0.66) | 0.65 (0.64–0.67) | 0.62 (0.61–0.63) |
Resampled | 0.63 (0.62–0.64) | 0.61 (0.60–0.62) | 0.64 (0.63–0.64) | 0.63 (0.63–0.64) | 0.62 (0.61–0.62) |
Multitask | 0.63 (0.63–0.64) | 0.63 (0.59–0.66) | 0.62 (0.60–0.64) | 0.66 (0.65–0.67) | 0.62 (0.61–0.63) |

Disease detection results reported separately for each race group and biological sex for ‘no finding’ (top) and ‘pleural effusion’ (bottom). TPR and FPR in subgroups are determined using a fixed decision threshold optimized over the whole patient population for a target FPR of 0.20.
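
The thresholding procedure described in the caption can be illustrated with a minimal sketch. It is not the authors' code: the function names `threshold_at_target_fpr` and `subgroup_rates` are hypothetical, and the rule used here (choose the candidate threshold whose population-wide FPR is closest to the target) is an assumption about what "optimized for a target FPR" means. Youden's J is computed with its standard definition, TPR minus FPR.

```python
import numpy as np


def threshold_at_target_fpr(scores, labels, target_fpr=0.20):
    """Pick one global threshold whose whole-population FPR is closest to target_fpr.

    Assumption: a sample is predicted positive when score >= threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    negatives = scores[~labels]
    # Every observed score is a candidate threshold.
    candidates = np.unique(scores)
    # FPR at a threshold = fraction of true negatives scored at or above it.
    fprs = np.array([(negatives >= t).mean() for t in candidates])
    return float(candidates[np.argmin(np.abs(fprs - target_fpr))])


def subgroup_rates(scores, labels, groups, threshold):
    """Apply the same fixed threshold to each subgroup and report TPR, FPR, Youden's J."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    groups = np.asarray(groups)
    preds = scores >= threshold
    results = {}
    for g in np.unique(groups):
        mask = groups == g
        # Note: a subgroup with no positives (or no negatives) yields NaN here.
        tpr = preds[mask & labels].mean()
        fpr = preds[mask & ~labels].mean()
        results[str(g)] = {
            "tpr": float(tpr),
            "fpr": float(fpr),
            "youden_j": float(tpr - fpr),  # J = TPR - FPR
        }
    return results
```

Because the threshold is fixed on the whole population, the FPR within a subgroup is free to drift away from 0.20, which is the kind of variation visible in the FPR rows of the table.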