. 2025 Jan 2;25(1):211. doi: 10.3390/s25010211

Table 5.

QA accuracy metrics for BERT QA RL + RS.

Metric	Description	Formula
Precision	Measures the proportion of correct words in the prediction relative to all predicted words. In the QA context, it evaluates the accuracy of the model’s generated answer.	$\text{Precision} = \frac{\text{Number of correct words in } \hat{A}}{\text{Total words in } \hat{A}} \times 100$
Recall	Measures the proportion of correctly predicted words relative to all words in the correct answer. Evaluates whether the model captures the keywords of the expected response.	$\text{Recall} = \frac{\text{Number of correct words in } \hat{A}}{\text{Total words in } A} \times 100$
Exact Match (EM)	Measures the percentage of answers that exactly match the correct answer. It is a very strict metric: an answer counts as correct only if it is identical to the expected response.	$\text{EM} = \frac{\text{Number of correct } \hat{A} \text{ answers}}{\text{Total } A \text{ questions}} \times 100$
F1-Score	Combines precision and recall as their harmonic mean. It measures the overlap between predicted words and words in the correct answer. Unlike EM, it does not require exact identity but assesses how many words in the prediction match those in the correct answer.	$\text{F1-Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$
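
The four formulas in Table 5 can be illustrated with a minimal Python sketch. It is not the authors' implementation. It reads $\hat{A}$ as the predicted answer and $A$ as the reference answer, and it assumes lowercased whitespace tokenization and multiset (bag-of-words) overlap for counting "correct words", neither of which the table specifies. The function names `tokenize`, `precision_recall_f1`, and `exact_match` are hypothetical.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; the table does not fix a tokenizer.
    return text.lower().split()


def precision_recall_f1(predicted, reference):
    """Return (precision, recall, f1) as percentages for one answer pair."""
    pred_tokens = tokenize(predicted)
    ref_tokens = tokenize(reference)
    if not pred_tokens or not ref_tokens:
        return 0.0, 0.0, 0.0
    # Assumption: "correct words" = multiset intersection of the two token lists.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = 100.0 * overlap / len(pred_tokens)
    recall = 100.0 * overlap / len(ref_tokens)
    # Harmonic mean of precision and recall, as in the F1-Score row.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def exact_match(predictions, references):
    """Return the percentage of predictions identical to their reference."""
    if not references:
        return 0.0
    correct = sum(
        tokenize(p) == tokenize(r) for p, r in zip(predictions, references)
    )
    return 100.0 * correct / len(references)


# Prediction has 4 words, reference has 2, and both reference words are found.
p, r, f1 = precision_recall_f1("the red sensor node", "red sensor")
em = exact_match(["red sensor", "blue node"], ["Red sensor", "green node"])
```

In this example the prediction contains both reference words plus two extra ones, so recall is 100 while precision is 50, and F1 falls between the two at about 66.7. EM is 50 because only the first of the two predictions matches its reference after lowercasing.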