Table 4.
Performance of AI and radiologists for the identification of changes in lesion burden between two CT scans
| | AI | Radiologist 1 | Radiologist 2 | Radiologist 3 |
---|---|---|---|---|
True positive* | 50 | 47 | 52 | 51 |
True negative† | 42 | 47 | 42 | 45 |
False positive‡ | 6 | 1 | 6 | 3 |
False negative§ | 2 | 5 | 0 | 1 |
Accuracy (95% CI) | 0·920 (0·900–0·950) | 0·940 (0·925–0·962) | 0·940 (0·925–0·962) | 0·960 (0·950–0·988) |
Sensitivity (95% CI) | 0·962 (0·947–1·000) | 0·904 (0·872–0·951) | 1·000 (1·000–1·000) | 0·981 (0·974–1·000) |
Specificity (95% CI) | 0·875 (0·833–0·923) | 0·979 (0·971–1·000) | 0·875 (0·833–0·923) | 0·938 (0·917–0·974) |
Data are n, unless stated otherwise. 52 patients had an increase in lesion burden volume and were defined as positive. 48 patients did not have any increase in lesion burden volume and were defined as negative. We present the complete counts for each reader to show inter-rater variability. AI=artificial intelligence.
*Correct prediction of lesion burden volume increase.
†Correct prediction of no increase in lesion burden volume.
‡Incorrect prediction of lesion burden volume increase.
§Incorrect prediction of no increase in lesion burden volume.
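
The point estimates in the table follow directly from the four counts in each column. The sketch below recomputes accuracy, sensitivity, and specificity for each reader from the counts reported above. The names `Counts`, `metrics`, and `READERS` are illustrative and not taken from the study. The confidence intervals are not reproduced, because this table does not state how they were calculated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion-matrix counts for one reader (hypothetical helper type)."""
    tp: int  # correct prediction of lesion burden volume increase
    tn: int  # correct prediction of no increase
    fp: int  # incorrect prediction of increase
    fn: int  # incorrect prediction of no increase


def metrics(c: Counts) -> dict:
    """Return accuracy, sensitivity, and specificity as fractions."""
    total = c.tp + c.tn + c.fp + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),  # true-positive rate
        "specificity": c.tn / (c.tn + c.fp),  # true-negative rate
    }


# Counts copied from Table 4 (52 positive and 48 negative patients per reader).
READERS = {
    "AI": Counts(tp=50, tn=42, fp=6, fn=2),
    "Radiologist 1": Counts(tp=47, tn=47, fp=1, fn=5),
    "Radiologist 2": Counts(tp=52, tn=42, fp=6, fn=0),
    "Radiologist 3": Counts(tp=51, tn=45, fp=3, fn=1),
}

if __name__ == "__main__":
    for name, counts in READERS.items():
        m = metrics(counts)
        print(name, {k: f"{v:.3f}" for k, v in m.items()})
```

Each column sums to 100 patients, with 52 positives (true positive plus false negative) and 48 negatives (true negative plus false positive), which gives a quick consistency check on the reported counts.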