
Table 2. TREC-COVID results.

Metric       All submissions, all pairs     Automatic submissions, judged pairs
             Score      Team rank           Score      Team rank

Round 1: All submissions (144), all pairs (1.53M); Automatic submissions (102), judged pairs (8691)
Bpref        0.5176     2                   0.5176     1
MAP          0.2401     13                  0.4870     1
P@5          0.6333     19                  0.8267     1
P@10         0.5567     21                  0.7933     1
nDCG@10      0.5445     13                  0.7233     1

Round 2: All submissions (136), all pairs (2.20M); Automatic submissions (73), judged pairs (12,037)
Bpref        0.5402     2                   0.5232     1
MAP          0.3487     1                   0.5138     1
P@5          0.8000     3                   0.8171     1
P@10         0.7200     3                   0.7629     1
nDCG@10      0.6996     1                   0.7247     1

Round 3: All submissions (79), all pairs (5.14M); Automatic submissions (32), judged pairs (12,713)
Bpref        0.5665     7                   0.5665     1
MAP          0.3182     7                   0.5385     1
P@5          0.7800     14                  0.8200     2
P@10         0.7600     12                  0.7850     2
nDCG@10      0.6867     12                  0.7065     2

Round 4: All submissions (72), all pairs (7.10M); Automatic submissions (28), judged pairs (13,262)
Bpref        0.5887     7                   0.5887     3
MAP          0.3436     10                  0.5653     3
P@5          0.8222     14                  0.8222     5
P@10         0.7978     12                  0.8133     4
nDCG@10      0.7391     12                  0.7449     6

Round 5: All submissions (126), all pairs (9.56M); Automatic submissions (49), judged pairs (23,151)
Bpref        0.5253     13                  0.5253     3
MAP          0.3089     14                  0.4884     3
P@5          0.8760     13                  0.8760     3
P@10         0.8260     15                  0.8420     3
nDCG@10      0.7488     16                  0.7567     4

Performance evaluation of the COVID-19 search engine on the five rounds of the TREC-COVID challenge dataset. Two evaluation contexts are considered. Context 1 (columns "All submissions, all pairs") compares our search engine against all submitted systems, whether manual, feedback, or automatic, using both annotated and non-annotated topic–document pairs. Context 2 (columns "Automatic submissions, judged pairs") compares it strictly against systems in its own class, automatic search engines, using only topic–document pairs annotated by experts.
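
To make the reported metrics easier to interpret, the minimal Python sketch below illustrates how the per-topic ranking measures in the table are typically computed: P@k, average precision (whose mean over all topics gives MAP), and nDCG@k. It is an illustrative approximation, not the official trec_eval implementation used in TREC-COVID; the document IDs and relevance judgments in the usage example are hypothetical. Bpref is omitted because it additionally depends on the set of judged non-relevant documents, penalising relevant documents that are ranked below them while ignoring unjudged documents.

from math import log2

def precision_at_k(ranked_docs, relevant, k):
    # P@k: fraction of the top-k retrieved documents that are relevant.
    return sum(1 for d in ranked_docs[:k] if d in relevant) / k

def average_precision(ranked_docs, relevant):
    # AP: mean of P@i over the ranks i at which relevant documents appear;
    # averaging AP over all topics gives MAP.
    hits, precisions = 0, []
    for i, d in enumerate(ranked_docs, start=1):
        if d in relevant:
            hits += 1
            precisions.append(hits / i)
    return sum(precisions) / len(relevant) if relevant else 0.0

def ndcg_at_k(ranked_docs, gains, k):
    # nDCG@k: discounted cumulative gain of the top-k results, normalised by
    # the gain of an ideal ranking; 'gains' maps judged documents to graded
    # relevance levels (e.g. 0, 1, 2).
    dcg = sum(gains.get(d, 0) / log2(i + 1)
              for i, d in enumerate(ranked_docs[:k], start=1))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / log2(i + 1) for i, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical single-topic example (document IDs and judgments are made up).
run = ["d3", "d1", "d7", "d2", "d9"]                # ranked retrieval results
qrels = {"d1": 2, "d2": 1, "d5": 2}                 # graded relevance judgments
relevant = {d for d, g in qrels.items() if g > 0}   # binary view for P@k and AP

print(precision_at_k(run, relevant, 5))  # P@5 = 0.4
print(average_precision(run, relevant))  # AP = (1/2 + 2/4) / 3 = 0.333...
print(ndcg_at_k(run, qrels, 5))          # nDCG@5 for this toy ranking

In TREC-COVID, each of these values is computed per topic and then averaged over the topics of a round, which is what the score columns in the table report.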