BMC Bioinformatics. 2021 May 26;22:272. doi: 10.1186/s12859-021-04176-7

Table 4.

Results of the robustness experiment, reported as mean ± standard deviation of each evaluation metric across three random seeds (12345, 24, 488). SAcc: strict accuracy; LAcc: lenient accuracy; MRR: mean reciprocal rank.

| Dataset | Model | SAcc | LAcc | MRR |
| --- | --- | --- | --- | --- |
| 6b Factoid QA | BioBERT (main baseline) | 0.4048 ± 0.0107 | **0.6278 ± 0.0061** | 0.4927 ± 0.0102 |
| 6b Factoid QA | Our Model (BioBERT+POS+NER+FF) | **0.4325 ± 0.0167** | 0.6200 ± 0.0138 | **0.5063 ± 0.0137** |
| 7b Factoid QA | BioBERT (main baseline) | **0.4362 ± 0.0087** | 0.6146 ± 0.0121 | 0.5059 ± 0.0045 |
| 7b Factoid QA | Our Model (BioBERT+POS+NER+FF) | 0.4359 ± 0.0078 | **0.6379 ± 0.0035** | **0.5122 ± 0.0037** |
| 8b Factoid QA | BioBERT (main baseline) | 0.3859 ± 0.0087 | 0.5566 ± 0.0061 | 0.4509 ± 0.0065 |
| 8b Factoid QA | Our Model (BioBERT+POS+NER+FF) | **0.3916 ± 0.0033** | **0.5898 ± 0.0156** | **0.4652 ± 0.0040** |

Bold values represent the highest results
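
Each cell in the table is a mean ± standard deviation over the three seeds. The sketch below shows one way such a cell could be computed from per-seed scores. It is illustrative only: the helper names `summarize` and `format_cell` are hypothetical, the per-seed values are placeholders rather than numbers from the paper, and the choice between sample and population standard deviation is an assumption, since the table does not state which was used.

```python
from statistics import mean, pstdev, stdev


def summarize(scores, sample=True):
    """Return (mean, standard deviation) of per-seed scores.

    sample=True uses the sample standard deviation (n - 1 denominator);
    sample=False uses the population form. Which one the table uses is
    not stated, so this flag is an assumption.
    """
    spread = stdev(scores) if sample else pstdev(scores)
    return mean(scores), spread


def format_cell(scores, sample=True):
    """Format per-seed scores as a 'mean ± std' table cell with 4 decimals."""
    m, s = summarize(scores, sample)
    return f"{m:.4f} ± {s:.4f}"


# Placeholder per-seed MRR values keyed by seed (NOT taken from the paper).
per_seed_mrr = {12345: 0.50, 24: 0.52, 488: 0.51}

print(format_cell(list(per_seed_mrr.values())))
```

With only three seeds, the sample and population forms differ noticeably (by a factor of about 1.22), so it is worth recording which one is reported when comparing standard deviations across papers.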