
Table 2.

Method performance on verifying the GO subset under various configurations

                Configuration        Performance*
Platform        Cost    Quality      Mean AUC    Mean Worker Sensitivity    Mean Worker Specificity

CrowdFlower     Low     Low          0.52        0.67                       0.71
CrowdFlower     High    Low          0.73        0.66                       0.70
CrowdFlower     Low     High         0.58        0.66                       0.73
CrowdFlower     High    High         0.62        0.67                       0.73
Mechanical Turk Low     Low          0.48        0.65                       0.72
Mechanical Turk High    Low          0.60        0.65                       0.73
Mechanical Turk Low     High         0.44        0.66                       0.74
Mechanical Turk High    High         0.44        0.62                       0.72

We measured crowd performance via a bootstrapped AUC and estimated worker sensitivity/specificity for each configuration of platform, cost, and quality. We then compared every pair of configurations to test whether performance differed significantly (a minimal sketch of such a bootstrap appears after the table notes).

* Performance on all configuration pairs differed significantly except:

Sensitivity: CrowdFlower Low-Cost, Low-Quality vs. CrowdFlower High-Cost, High-Quality

AUC: Mechanical Turk Low-Cost, High-Quality vs. Mechanical Turk High-Cost, High-Quality
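As a rough illustration of the bootstrap procedure described above, and not the authors' published pipeline, the following Python sketch computes a bootstrapped AUC distribution for two hypothetical configurations, compares them with a crude two-sided bootstrap test, and estimates worker sensitivity/specificity against gold-standard labels. The gold labels, aggregated crowd scores, the 0.5 decision threshold, and the p-value convention are all illustrative assumptions.

import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc(y_true, y_score, n_boot=1000, seed=0):
    # Resample (label, score) pairs with replacement and return the
    # distribution of AUC values across bootstrap replicates.
    rng = np.random.default_rng(seed)
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), size=len(y_true))
        if len(np.unique(y_true[idx])) < 2:
            continue  # AUC is undefined when a resample has only one class
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    return np.array(aucs)

# Hypothetical gold-standard labels and aggregated crowd scores for two
# configurations (e.g., CrowdFlower Low-Cost/Low-Quality vs. High/Low).
y_true   = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
scores_a = np.array([0.9, 0.2, 0.7, 0.6, 0.4, 0.3, 0.8, 0.5, 0.6, 0.1])
scores_b = np.array([0.6, 0.5, 0.5, 0.7, 0.6, 0.2, 0.4, 0.5, 0.8, 0.3])

aucs_a = bootstrap_auc(y_true, scores_a)
aucs_b = bootstrap_auc(y_true, scores_b, seed=1)
print(f"Config A mean AUC: {aucs_a.mean():.2f}")
print(f"Config B mean AUC: {aucs_b.mean():.2f}")

# Crude two-sided bootstrap comparison: the share of paired replicates in
# which one configuration's AUC exceeds the other's.
n = min(len(aucs_a), len(aucs_b))
diff = aucs_a[:n] - aucs_b[:n]
p = min(1.0, 2 * min((diff <= 0).mean(), (diff >= 0).mean()))
print(f"Approximate bootstrap p-value: {p:.3f}")

# Worker sensitivity/specificity against gold labels at an assumed
# 0.5 vote threshold.
pred = (scores_a >= 0.5).astype(int)
sens = ((pred == 1) & (y_true == 1)).sum() / (y_true == 1).sum()
spec = ((pred == 0) & (y_true == 0)).sum() / (y_true == 0).sum()
print(f"Sensitivity: {sens:.2f}, Specificity: {spec:.2f}")

Comparing full bootstrap distributions, rather than point estimates, is one simple way to judge whether two configurations differ beyond resampling noise; the exact significance test the authors used may differ.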