Author manuscript; available in PMC: 2016 Oct 14.
Published in final edited form as: J Mach Learn Res. 2016;17:132.

Table 4.

Retrospective results with respect to per-article AUC, NDCG@20, precision@3, precision@10, and precision@20. For each metric we report the mean and standard deviation over the 133 articles for which candidate sets were annotated for the respective domains. All sentences not in candidate sets are assumed to be irrelevant; these results are therefore noisy and likely pessimistic. We bold the cells corresponding to the best-performing method for each (metric, PICO element) pair.

Method       Mean AUC (SD)  Mean NDCG@20 (SD)  Precision@3 (SD)  Precision@10 (SD)  Precision@20 (SD)

Population
Direct only  0.904 (0.106)  0.530 (0.270)      0.347 (0.298)     0.183 (0.126)      0.116 (0.070)
DS           0.941 (0.063)  0.484 (0.243)      0.256 (0.242)     0.202 (0.126)      0.129 (0.075)
Nguyen       0.917 (0.091)  0.537 (0.275)      0.328 (0.281)     0.189 (0.128)      0.117 (0.072)
SDS          0.947 (0.059)  0.548 (0.263)      0.336 (0.276)     0.212 (0.133)      0.132 (0.076)

Interventions
Direct only  0.893 (0.099)  0.493 (0.265)      0.397 (0.293)     0.216 (0.148)      0.139 (0.086)
DS           0.933 (0.068)  0.507 (0.239)      0.344 (0.295)     0.250 (0.164)      0.172 (0.099)
Nguyen       0.921 (0.073)  0.536 (0.254)      0.419 (0.300)     0.248 (0.162)      0.158 (0.097)
SDS          0.936 (0.063)  0.530 (0.249)      0.389 (0.323)     0.252 (0.164)      0.172 (0.099)

Outcomes
Direct only  0.837 (0.096)  0.261 (0.241)      0.180 (0.244)     0.114 (0.117)      0.080 (0.072)
DS           0.896 (0.078)  0.308 (0.223)      0.117 (0.203)     0.148 (0.133)      0.120 (0.091)
Nguyen       0.870 (0.085)  0.339 (0.256)      0.228 (0.268)     0.151 (0.137)      0.106 (0.084)
SDS          0.900 (0.079)  0.333 (0.233)      0.138 (0.212)     0.160 (0.134)      0.124 (0.092)
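
To make the per-article metrics in this table concrete, the following is a minimal sketch of how they could be computed from one article's ranked sentence list and then aggregated over articles. It is illustrative only and not the authors' evaluation code. Several details are assumptions that the table does not state: relevance is binary (1 for sentences in the annotated candidate set, 0 otherwise), NDCG uses binary gains with a log2 rank discount, and the reported SD is taken to be the sample standard deviation. The function names are hypothetical.

```python
import math
from statistics import mean, stdev


def precision_at_k(labels, k):
    """Fraction of the top-k ranked sentences that are relevant.

    `labels` is a list of 0/1 relevance labels in ranked order (best first).
    """
    return sum(labels[:k]) / k


def ndcg_at_k(labels, k):
    """NDCG@k with binary gains and a log2 rank discount (assumed form)."""
    dcg = sum(rel / math.log2(rank + 2) for rank, rel in enumerate(labels[:k]))
    ideal = sorted(labels, reverse=True)[:k]
    idcg = sum(rel / math.log2(rank + 2) for rank, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


def auc(labels):
    """Probability that a relevant sentence is ranked above an irrelevant one."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        return float("nan")  # undefined when only one class is present
    correct_pairs = 0
    negatives_below = 0
    # Walk from the bottom of the ranking upward, counting for each relevant
    # sentence how many irrelevant sentences are ranked below it.
    for rel in reversed(labels):
        if rel:
            correct_pairs += negatives_below
        else:
            negatives_below += 1
    return correct_pairs / (n_pos * n_neg)


def summarize(per_article_scores):
    """Mean and sample standard deviation over articles (needs >= 2 scores)."""
    return mean(per_article_scores), stdev(per_article_scores)


# Toy ranking for one article: relevant sentences at ranks 1 and 3.
example_labels = [1, 0, 1, 0, 0]
example_p3 = precision_at_k(example_labels, 3)
example_ndcg3 = ndcg_at_k(example_labels, 3)
example_auc = auc(example_labels)
```

Because every sentence outside the candidate set is labeled 0 under this scheme, a truly relevant but unannotated sentence ranked highly would lower all of these scores, which is consistent with the caption's remark that the results are likely pessimistic.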