2025 Jan 2;25(1):211. doi: 10.3390/s25010211

Table 5.

QA accuracy metrics for BERT QA RL + RS.

| Metric | Description | Formula |
|---|---|---|
| Precision | Measures the proportion of correct words in the prediction relative to all predicted words. In the QA context, it evaluates the accuracy of the model’s generated answer. | $\text{Precision} = \dfrac{\text{Number of correct words in } \hat{A}}{\text{Total words in } \hat{A}} \times 100$ |
| Recall | Measures the proportion of correctly predicted words relative to all words in the correct answer. Evaluates whether the model captures the keywords of the expected response. | $\text{Recall} = \dfrac{\text{Number of correct words in } \hat{A}}{\text{Total words in } A} \times 100$ |
| Exact Match (EM) | Measures the percentage of answers that exactly match the correct answer. It is a very strict metric, counting answers as correct only if they are identical to the expected response. | $\text{EM} = \dfrac{\text{Number of correct } \hat{A} \text{ answers}}{\text{Total } A \text{ questions}} \times 100$ |
| F1-Score | Combines precision and recall. It measures the overlap between predicted words and words in the correct answer. Unlike EM, it does not require exact identity but assesses how many words in the prediction match those in the correct answer. | $\text{F1-Score} = \dfrac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$ |

Here $\hat{A}$ denotes the predicted answer and $A$ the correct (reference) answer.
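The four metrics in Table 5 can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names (`tokenize`, `precision_recall_f1`, `exact_match`) are hypothetical, and the tokenization (lowercasing and splitting on whitespace) and the multiset counting of "correct words" are assumptions, since the table does not specify either.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; the table does not define tokenization.
    return text.lower().split()


def precision_recall_f1(predicted, reference):
    """Return (precision, recall, F1) in percent for one predicted/reference answer pair."""
    pred_tokens = tokenize(predicted)
    ref_tokens = tokenize(reference)
    # Assumption: a word counts as correct at most as often as it occurs in both answers.
    correct = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if correct == 0:
        return 0.0, 0.0, 0.0
    precision = 100.0 * correct / len(pred_tokens)
    recall = 100.0 * correct / len(ref_tokens)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def exact_match(predictions, references):
    """Return the percentage of predictions identical to their reference after tokenization."""
    if not references:
        return 0.0
    matches = sum(tokenize(p) == tokenize(r) for p, r in zip(predictions, references))
    return 100.0 * matches / len(references)


p, r, f1 = precision_recall_f1("the Eiffel Tower in Paris", "Eiffel Tower")
em = exact_match(["Paris", "in Berlin"], ["paris", "Berlin"])
```

In this example the prediction contains both reference words among five predicted words, so precision is 40, recall is 100, and F1 is about 57.1. The prediction is not identical to the reference, which is why EM would score it as wrong even though recall is perfect.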