J Am Med Inform Assoc. 2020 Nov 17;28(1):132–137. doi: 10.1093/jamia/ocaa271

Table 2.

Evaluation results after thresholding the number of documents per topic at the minimum number of documents present for any individual topic. The relevance judgments used are a combination of Rounds 1 and 2 of TREC-COVID and our additional relevance assessments. The highest score in each column, among the evaluated systems and among the TREC-COVID systems, is marked with an asterisk (*).

System                             P@5      P@10     NDCG@10  MAP      NDCG     bpref
Amazon     question                0.6733   0.6333   0.5390   0.0722   0.1838   0.1049
           question + narrative   *0.7200  *0.6400  *0.5583  *0.0766  *0.1862   0.1063
Google     question                0.5733   0.5700   0.4972   0.0693   0.1831  *0.1069
           question + narrative    0.6067   0.5600   0.5112   0.0687   0.1821   0.1054
TREC-COVID
  1. sab20.1.meta.docs            *0.7800  *0.7133  *0.6109  *0.0999  *0.2266  *0.1352
  2. sab20.1.merged                0.6733   0.6433   0.5555   0.0787   0.1971   0.1154
  3. UIowaS_Run3                   0.6467   0.6367   0.5466   0.0952   0.2091   0.1279
  4. smith.rm3                     0.6467   0.6133   0.5225   0.0914   0.2095   0.1303
  5. udel_fang_run3                0.6333   0.6133   0.5398   0.0857   0.1977   0.1187
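For readers unfamiliar with the reported measures, the following is a minimal sketch of how precision-at-k (P@k) and NDCG@k are computed from a single ranked result list with relevance judgments. The ranking below is made up for illustration; this is not the official trec_eval implementation, which additionally averages over all topics.

```python
import math

def precision_at_k(rels, k):
    # rels: relevance grades of the ranked documents, in rank order
    # (here treated as binary: 1 = relevant, 0 = not relevant).
    return sum(1 for r in rels[:k] if r > 0) / k

def ndcg_at_k(rels, k):
    # DCG with a log2 rank discount, normalized by the ideal DCG
    # (the same grades sorted into the best possible order).
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(rels[:k]))
    ideal = sorted(rels, reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical judged ranking for one topic: relevant docs at ranks 1, 2, 4, 7.
ranked = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
print(precision_at_k(ranked, 5))   # 3 of the top 5 are relevant -> 0.6
print(ndcg_at_k(ranked, 10))
```

The system-level scores in the table are these per-topic values averaged over the TREC-COVID topics; MAP and bpref are likewise rank-based averages over the judged documents.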