Table 1.
Results comparing the deep learning model with expert surgeons.
| | Accuracy (SN %, SP %) | RMSE (R²) | M-S agreement: success/failure [a] | M-S agreement: blood loss [b] |
|---|---|---|---|---|
| Ground truth | 11 successes, 9 failures | – | – | Avg blood loss: 568 (range: 20–1640) |
| Model | 17/20 (85%) (100, 66) | 295 (0.74) | – | – |
| Expert cohort | 55/80 (68.75%) (79, 56) | 351 (0.70) | 0.43 [c] | 0.73 [c] |
| Surgeon 1 | 13/20 (65%) (73, 55) | 306 (0.73) | 0.34 | 0.74 |
| Surgeon 2 | 14/20 (70%) (81, 55) | 335 (0.66) | 0.43 | 0.66 |
| Surgeon 3 | 14/20 (70%) (81, 55) | 423 (0.65) | 0.43 | 0.65 |
| Surgeon 4 | 14/20 (70%) (81, 55) | 329 (0.74) | 0.43 | 0.72 |
SN: sensitivity; SP: specificity; M-S: model-surgeon.
[a] Kappa coefficient.
[b] Inter-class coefficient.
[c] Inter-surgeon agreement: success/failure = 0.95; blood loss = 0.72.
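
As a worked check of how the accuracy, sensitivity, and specificity entries relate to one another, the sketch below recomputes the Model row from a confusion matrix. The table does not report the individual counts; the values used here (11 true positives, 0 false negatives, 6 true negatives, 3 false positives) are inferred, assuming "success" is the positive class and using the ground truth of 11 successes and 9 failures. The function name `confusion_metrics` is hypothetical.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity as percentages."""
    total = tp + fn + tn + fp
    return {
        "accuracy": 100 * (tp + tn) / total,
        "sensitivity": 100 * tp / (tp + fn),  # true positive rate
        "specificity": 100 * tn / (tn + fp),  # true negative rate
    }


# Assumed counts (not stated in the table): "success" is the positive class,
# with 11 successes and 9 failures in the ground truth.
model = confusion_metrics(tp=11, fn=0, tn=6, fp=3)

for name, value in model.items():
    print(f"{name}: {value:.2f}%")
```

Under these assumed counts the accuracy is 17/20 = 85% and the sensitivity is 11/11 = 100%, matching the Model row. The specificity is 6/9, about 66.7%, which matches the reported 66 only if the table truncates rather than rounds.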