Table 2. TREC-COVID results.
| Metric | Score | Team rank | Score | Team rank |
|---|---|---|---|---|
| **Round 1** | All submissions (144), all pairs (1.53M) | | Automatic submissions (102), judged pairs (8,691) | |
| Bpref | 0.5176 | 2 | 0.5176 | 1 |
| MAP | 0.2401 | 13 | 0.4870 | 1 |
| P@5 | 0.6333 | 19 | 0.8267 | 1 |
| P@10 | 0.5567 | 21 | 0.7933 | 1 |
| nDCG@10 | 0.5445 | 13 | 0.7233 | 1 |
| **Round 2** | All submissions (136), all pairs (2.20M) | | Automatic submissions (73), judged pairs (12,037) | |
| Bpref | 0.5402 | 2 | 0.5232 | 1 |
| MAP | 0.3487 | 1 | 0.5138 | 1 |
| P@5 | 0.8000 | 3 | 0.8171 | 1 |
| P@10 | 0.7200 | 3 | 0.7629 | 1 |
| nDCG@10 | 0.6996 | 1 | 0.7247 | 1 |
| **Round 3** | All submissions (79), all pairs (5.14M) | | Automatic submissions (32), judged pairs (12,713) | |
| Bpref | 0.5665 | 7 | 0.5665 | 1 |
| MAP | 0.3182 | 7 | 0.5385 | 1 |
| P@5 | 0.7800 | 14 | 0.8200 | 2 |
| P@10 | 0.7600 | 12 | 0.7850 | 2 |
| nDCG@10 | 0.6867 | 12 | 0.7065 | 2 |
| **Round 4** | All submissions (72), all pairs (7.10M) | | Automatic submissions (28), judged pairs (13,262) | |
| Bpref | 0.5887 | 7 | 0.5887 | 3 |
| MAP | 0.3436 | 10 | 0.5653 | 3 |
| P@5 | 0.8222 | 14 | 0.8222 | 5 |
| P@10 | 0.7978 | 12 | 0.8133 | 4 |
| nDCG@10 | 0.7391 | 12 | 0.7449 | 6 |
| **Round 5** | All submissions (126), all pairs (9.56M) | | Automatic submissions (49), judged pairs (23,151) | |
| Bpref | 0.5253 | 13 | 0.5253 | 3 |
| MAP | 0.3089 | 14 | 0.4884 | 3 |
| P@5 | 0.8760 | 13 | 0.8760 | 3 |
| P@10 | 0.8260 | 15 | 0.8420 | 3 |
| nDCG@10 | 0.7488 | 16 | 0.7567 | 4 |
Performance of the COVID-19 search engine on the five rounds of the TREC-COVID challenge dataset, under two evaluation contexts. Context 1 (left “Score, Team rank” column pair) ranks our search engine against all submitted engines (manual, feedback, and automatic), using both annotated and non-annotated topic-document pairs. Context 2 (right column pair) ranks it strictly against engines in its own class, automatic search engines, using only topic-document pairs annotated by experts.
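The two contexts differ only in whether unjudged documents are removed from a run before scoring. Below is a minimal sketch of that distinction, assuming the `pytrec_eval` package (a Python wrapper around `trec_eval`); the topic, document IDs, judgments, and scores are hypothetical toy data, not values from the table.

```python
import pytrec_eval

# Toy expert judgments: topic -> {doc_id: graded relevance} (0 = not relevant).
qrels = {
    "topic_1": {"doc_a": 2, "doc_b": 0, "doc_c": 1},
}

# Toy ranked run: topic -> {doc_id: retrieval score}; doc_x and doc_y were
# retrieved but never judged by the assessors.
run = {
    "topic_1": {"doc_a": 9.1, "doc_x": 8.7, "doc_c": 7.2, "doc_y": 6.5, "doc_b": 5.0},
}

evaluator = pytrec_eval.RelevanceEvaluator(qrels, {"map", "bpref", "P", "ndcg_cut"})

# Context 1 ("all pairs"): score the run as submitted; trec_eval treats
# unjudged documents as non-relevant, which depresses MAP, P@k, and nDCG@k.
all_pairs = evaluator.evaluate(run)

# Context 2 ("judged pairs"): drop unjudged documents before scoring, so the
# metrics are computed only over expert-annotated topic-document pairs.
judged_run = {
    topic: {doc: score for doc, score in ranked.items() if doc in qrels[topic]}
    for topic, ranked in run.items()
}
judged_pairs = evaluator.evaluate(judged_run)

for label, result in (("all pairs", all_pairs), ("judged pairs", judged_pairs)):
    m = result["topic_1"]
    print(f"{label}: MAP={m['map']:.4f} Bpref={m['bpref']:.4f} "
          f"P@5={m['P_5']:.4f} nDCG@10={m['ndcg_cut_10']:.4f}")
```

Note that Bpref is, by construction, computed only over judged documents, which is consistent with the identical Bpref values across the two contexts in most rounds of the table.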