BMC Bioinformatics. 2022 Jun 2;23:210. doi: 10.1186/s12859-022-04751-6

Table 3.

Evaluation of QA pipeline

| Evaluation metric | Top@1 | Top@5 | Top@10 | Top@20 |
|---|---|---|---|---|
| **Retriever** | | | | |
| Recall (single document) | 0.495 | 0.711 | 0.720 | 0.836 |
| Recall (multiple documents) | 0.494 | 0.716 | 0.720 | 0.836 |
| Mean reciprocal rank (MRR) | 0.495 | 0.572 | 0.582 | 0.775 |
| Precision | 0.495 | 0.344 | 0.342 | 0.304 |
| Mean average precision (MAP) | 0.494 | 0.672 | 0.690 | 0.697 |
| **Reader** | | | | |
| F1-score | 0.504 | 0.636 | 0.636 | 0.771 |
| Exact match (EM) | 0.539 | 0.549 | 0.698 | 0.775 |
| Semantic answer similarity (SAS) | 0.503 | 0.623 | 0.687 | 0.785 |
| Accuracy | 0.895 (same for all Top@k) | | | |

Bold indicates the best result
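For readers unfamiliar with the retriever metrics above, Recall@k asks whether the gold document appears among the top-k retrieved results, and MRR@k averages the reciprocal of the rank at which it first appears. The sketch below is a minimal illustration of these two metrics; the function names, toy queries, and document IDs are hypothetical and not taken from the paper's pipeline.

```python
def recall_at_k(ranked, gold, k):
    """Fraction of queries whose gold document appears in the top-k results.

    ranked: dict mapping query id -> list of doc ids, best-ranked first.
    gold:   dict mapping query id -> the single gold doc id.
    """
    hits = sum(gold[q] in docs[:k] for q, docs in ranked.items())
    return hits / len(ranked)


def mrr_at_k(ranked, gold, k):
    """Mean reciprocal rank, counting only matches within the top k."""
    total = 0.0
    for q, docs in ranked.items():
        for rank, doc in enumerate(docs[:k], start=1):
            if doc == gold[q]:
                total += 1.0 / rank  # credit decays with rank position
                break  # only the first (highest-ranked) match counts
    return total / len(ranked)


# Toy example (hypothetical data): q1's gold doc sits at rank 2,
# q2's gold doc is not retrieved at all.
ranked = {"q1": ["d3", "d1", "d2"], "q2": ["d5", "d4"]}
gold = {"q1": "d1", "q2": "d9"}
print(recall_at_k(ranked, gold, 2))  # → 0.5
print(mrr_at_k(ranked, gold, 2))     # → 0.25
```

This corresponds to the "Recall (single document)" rows of the table, where each question has one gold document; the multi-document variant would replace the membership test with a check against a set of gold documents.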