Table 2.
No finding (test set) | White | Asian | Black | Female | Male |
---|---|---|---|---|---|
AUC (95% CI) | |||||
Original | 0.87 (0.86–0.88) | 0.88 (0.86–0.89) | 0.88 (0.87–0.90) | 0.87 (0.86–0.88) | 0.87 (0.86–0.88) |
Resampled | 0.87 (0.86–0.88) | 0.87 (0.87–0.88) | 0.89 (0.88–0.89) | 0.87 (0.86–0.87) | 0.89 (0.88–0.89) |
Multitask | 0.86 (0.86–0.87) | 0.86 (0.85–0.88) | 0.88 (0.86–0.90) | 0.86 (0.85–0.87) | 0.87 (0.86–0.88) |
TPR (95% CI) | |||||
Original | 0.79 (0.77–0.80) | 0.80 (0.76–0.83) | 0.84 (0.80–0.88) | 0.79 (0.77–0.82) | 0.79 (0.77–0.81) |
Resampled | 0.80 (0.78–0.81) | 0.79 (0.78–0.81) | 0.81 (0.80–0.82) | 0.78 (0.76–0.79) | 0.82 (0.81–0.83) |
Multitask | 0.78 (0.76–0.80) | 0.78 (0.74–0.82) | 0.82 (0.78–0.87) | 0.82 (0.80–0.84) | 0.76 (0.74–0.78) |
FPR (95% CI) | |||||
Original | 0.20 (0.20–0.20) | 0.20 (0.19–0.21) | 0.23 (0.21–0.24) | 0.20 (0.20–0.21) | 0.20 (0.20–0.20) |
Resampled | 0.20 (0.20–0.21) | 0.20 (0.20–0.20) | 0.20 (0.19–0.20) | 0.20 (0.20–0.20) | 0.20 (0.20–0.20) |
Multitask | 0.20 (0.20–0.20) | 0.19 (0.18–0.20) | 0.22 (0.21–0.24) | 0.23 (0.23–0.24) | 0.18 (0.17–0.18) |
Youden's J statistic (95% CI) | |||||
Original | 0.59 (0.57–0.60) | 0.60 (0.56–0.64) | 0.61 (0.57–0.65) | 0.59 (0.57–0.61) | 0.59 (0.57–0.61) |
Resampled | 0.59 (0.58–0.61) | 0.59 (0.58–0.61) | 0.61 (0.60–0.63) | 0.58 (0.56–0.59) | 0.62 (0.61–0.63) |
Multitask | 0.58 (0.56–0.59) | 0.59 (0.55–0.63) | 0.60 (0.56–0.65) | 0.58 (0.56–0.60) | 0.58 (0.56–0.60) |

Pleural effusion (test set) | White | Asian | Black | Female | Male |
---|---|---|---|---|---|
AUC (95% CI) | |||||
Original | 0.86 (0.86–0.87) | 0.88 (0.87–0.89) | 0.86 (0.85–0.88) | 0.87 (0.86–0.87) | 0.86 (0.86–0.87) |
Resampled | 0.87 (0.86–0.87) | 0.88 (0.88–0.89) | 0.85 (0.84–0.85) | 0.87 (0.87–0.87) | 0.86 (0.86–0.86) |
Multitask | 0.86 (0.86–0.87) | 0.88 (0.87–0.88) | 0.86 (0.85–0.88) | 0.87 (0.86–0.87) | 0.86 (0.86–0.87) |
TPR (95% CI) | |||||
Original | 0.77 (0.76–0.78) | 0.78 (0.76–0.80) | 0.71 (0.68–0.74) | 0.76 (0.75–0.78) | 0.77 (0.76–0.78) |
Resampled | 0.78 (0.78–0.79) | 0.80 (0.80–0.81) | 0.72 (0.71–0.73) | 0.78 (0.77–0.79) | 0.76 (0.75–0.76) |
Multitask | 0.77 (0.75–0.78) | 0.78 (0.77–0.80) | 0.69 (0.66–0.73) | 0.75 (0.73–0.76) | 0.78 (0.77–0.79) |
FPR (95% CI) | |||||
Original | 0.21 (0.20–0.21) | 0.19 (0.18–0.20) | 0.16 (0.14–0.17) | 0.20 (0.19–0.20) | 0.20 (0.20–0.21) |
Resampled | 0.21 (0.21–0.21) | 0.21 (0.20–0.21) | 0.18 (0.18–0.19) | 0.20 (0.20–0.21) | 0.20 (0.19–0.20) |
Multitask | 0.21 (0.20–0.21) | 0.20 (0.19–0.21) | 0.15 (0.14–0.17) | 0.18 (0.18–0.19) | 0.21 (0.21–0.22) |
Youden's J statistic (95% CI) | |||||
Original | 0.56 (0.55–0.57) | 0.59 (0.57–0.61) | 0.55 (0.52–0.59) | 0.57 (0.55–0.58) | 0.57 (0.55–0.58) |
Resampled | 0.57 (0.56–0.58) | 0.59 (0.59–0.60) | 0.54 (0.52–0.55) | 0.57 (0.57–0.58) | 0.56 (0.55–0.57) |
Multitask | 0.56 (0.55–0.57) | 0.59 (0.57–0.61) | 0.54 (0.51–0.58) | 0.56 (0.55–0.58) | 0.57 (0.55–0.58) |
Disease detection results are reported separately for each race group and biological sex for ‘no finding’ (top) and ‘pleural effusion’ (bottom). TPR and FPR in subgroups are determined using a single fixed decision threshold, optimized over the whole patient population for a target FPR of 0.20.
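The thresholding procedure described in the caption can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic scores, the group labels `A` and `B`, and the helper names `threshold_at_target_fpr` and `subgroup_metrics` are assumptions made for illustration, and the confidence intervals reported in the table are not reproduced. The sketch picks one threshold on the whole population so that the overall FPR is close to 0.20, then reuses that threshold in each subgroup and computes Youden's J statistic as TPR − FPR, which matches the tabulated values (for example, 0.79 − 0.20 = 0.59).

```python
import numpy as np


def threshold_at_target_fpr(scores, labels, target_fpr=0.20):
    """Return a threshold whose FPR on the given population is close to target_fpr.

    A sample is predicted positive when its score exceeds the threshold, so the
    (1 - target_fpr) quantile of the negative-class scores leaves roughly
    target_fpr of the negatives above it.
    """
    negative_scores = scores[labels == 0]
    return float(np.quantile(negative_scores, 1.0 - target_fpr))


def subgroup_metrics(scores, labels, groups, threshold):
    """Compute TPR, FPR and Youden's J per subgroup using one shared threshold."""
    results = {}
    for g in np.unique(groups):
        mask = groups == g
        predicted_positive = scores[mask] > threshold
        is_positive = labels[mask] == 1
        tpr = float(np.mean(predicted_positive[is_positive]))
        fpr = float(np.mean(predicted_positive[~is_positive]))
        results[str(g)] = {"TPR": tpr, "FPR": fpr, "J": tpr - fpr}
    return results


# Synthetic stand-in data (assumption): two hypothetical subgroups, binary labels,
# and classifier scores that are higher on average for positive cases.
rng = np.random.default_rng(0)
n = 10_000
labels = rng.integers(0, 2, size=n)
groups = rng.choice(np.array(["A", "B"]), size=n)
scores = rng.normal(loc=1.5 * labels, scale=1.0)

# The threshold is chosen once, on the whole population ...
threshold = threshold_at_target_fpr(scores, labels, target_fpr=0.20)
overall_fpr = float(np.mean(scores[labels == 0] > threshold))

# ... and then applied unchanged to every subgroup.
metrics = subgroup_metrics(scores, labels, groups, threshold)
for name, m in metrics.items():
    print(f"{name}: TPR={m['TPR']:.2f} FPR={m['FPR']:.2f} J={m['J']:.2f}")
```

Because the threshold is fixed globally, a subgroup's FPR can drift away from 0.20 whenever its score distribution differs from that of the overall population, which is the kind of shift visible in the FPR rows of the table.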