Author manuscript; available in PMC: 2023 Sep 8.
Published in final edited form as: Proc (IEEE Int Conf Healthc Inform). 2022 Sep 8;2022:84–89. doi: 10.1109/ichi54592.2022.00024

Table II.

F1-scores of six NER models on five medical datasets. The best performance in the 5-shot and 1-shot settings is highlighted in bold and underlined.

| Models | Training Size | N2C2 2018 | I2B2 2014 | MIMIC III | BioNLP 2016 | SMM4H 2021 |
|---|---|---|---|---|---|---|
| SANER (traditional NER model) | Whole training data | 80.63 | 90.62 | 66.57 | 81.78 | 44.56 |
| | 10% training data | 79.27 | 80.67 | 46.68 | 70.50 | 23.4 |
| | 5-shot | 10.27 | 36.38 | 21.25 | 23.14 | 0.00 |
| | 1-shot | 7.92 | 31.14 | 7.07 | 4.32 | 0.00 |
| BERT + Classifier (traditional NER model) | Whole training data | 59.47 | 76.47 | 65.71 | 81.23 | 47.30 |
| | 10% training data | 42.32 | 34.69 | 30.29 | 58.44 | 25.83 |
| | 5-shot | 3.27 | 0.00 | 0.00 | 1.71 | 0.00 |
| | 1-shot | 0.00 | 0.21 | 0.57 | 0.15 | 0.00 |
| BERT + CRF (traditional NER model) | Whole training data | 82.79 | 80.63 | 59.58 | 77.62 | 45.45 |
| | 10% training data | 64.09 | 27.84 | 20.5 | 61.35 | 2.74 |
| | 5-shot | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | 1-shot | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| StructShot (few-shot model) | 5-shot | 25.44 | 20.30 | 3.18 | 0.03 | 0.00 |
| | 1-shot | 17.59 | 20.26 | 0.63 | 0.00 | 0.00 |
| NNShot (few-shot model) | 5-shot | 25.29 | 19.73 | 19.71 | 28.88 | 0.00 |
| | 1-shot | 16.70 | 16.35 | 15.37 | 6.42 | 0.00 |
| FewShot-Tagging (few-shot model) | 5-shot | 0.94 | 0.27 | 0.60 | 3.32 | 0.32 |
| | 1-shot | 4.59 | 0.14 | 5.17 | 6.81 | 0.35 |