2014 Oct 30;16(10):e233. doi: 10.2196/jmir.3807

Table 3.

Turker consensus in Phase III.

| Phase and trial | Number correct (mean) [a] | % correct (mean) | Number correct (mode) [a] | % correct (mode) | Sensitivity (%) [b] | Specificity (%) [b] |
|---|---|---|---|---|---|---|
| Phase I: Four-category rating | 5 | 26.3 | 11 | 57.9 | 100.0 | 100.0 |
| Phase III: Trial 1 (improved training) | 4 | 21.1 [c] | 8 [d] | 42.1 | 100.0 | 57.1 |
| Phase III: Trial 2 (raised approval rating) | 10 | 52.6 | 11 [e] | 57.9 | 100.0 | 100.0 |
| Phase III: Trial 3 (Master Graders) | 7 | 36.8 | 11 | 57.9 | 100.0 | 100.0 |

[a] Calculated by level (eg, Turker consensus matches expert designation as normal, mild, moderate, and severe).

[b] Calculated for normal versus any disease level using the mode consensus score.

[c] After excluding a single Turker with systematically higher scores, 42.1% correct.

[d] Three images had no mode and were considered incorrect for “Number Correct” and “% correct” but recoded as abnormal for sensitivity and specificity.

[e] One image had no mode and was considered incorrect for “Number Correct” and “% correct” but recoded as abnormal for sensitivity and specificity.
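
The scoring rules in the footnotes can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names, the 0 to 3 grade coding, and the toy grades are all hypothetical, and "no mode" is assumed to mean a tie for the most frequent grade. The sketch also assumes every image has at least one grade and that the expert set contains both normal and abnormal images.

```python
from collections import Counter

NORMAL = 0  # hypothetical coding: 0 = normal, 1 = mild, 2 = moderate, 3 = severe


def mode_or_none(grades):
    """Return the single most common grade, or None when the top count is tied."""
    counts = Counter(grades).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def score(turker_grades, expert_grades):
    """Score the mode consensus against expert grades, image by image."""
    correct = tp = fn = tn = fp = 0
    for grades, expert in zip(turker_grades, expert_grades):
        consensus = mode_or_none(grades)
        # Level agreement: an image with no mode counts as incorrect.
        correct += consensus is not None and consensus == expert
        # Binary call: an image with no mode is recoded as abnormal.
        called_abnormal = consensus is None or consensus != NORMAL
        if expert != NORMAL:
            tp += called_abnormal
            fn += not called_abnormal
        else:
            fp += called_abnormal
            tn += not called_abnormal
    n = len(expert_grades)
    return {
        "number_correct": correct,
        "pct_correct": 100 * correct / n,
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
    }


# Hypothetical toy grades for five images (not study data).
toy_turker_grades = [[0, 0, 1], [1, 1, 2], [0, 1, 2], [3, 3, 3], [0, 1, 2]]
toy_expert_grades = [0, 1, 0, 3, 2]
toy_result = score(toy_turker_grades, toy_expert_grades)
```

In the toy data, the third and fifth images have no mode, so they lower the number correct while still being called abnormal for the sensitivity and specificity calculation.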