2022 Nov 8;65(2):463–516. doi: 10.1007/s10115-022-01779-1

Table 5. Machine learning-based NER methods

| Publication | Dataset | Dataset size | Method | Features | Precision (P) | Recall (R) | F-score (F) |
|---|---|---|---|---|---|---|---|
| Tang et al. [132] | i2b2 2010 | 349 train, 477 test | SSVM | Word + context + sentence + section + cTAKES + MetaMap + ConText + Brown clustering | 87.38% | 84.31% | 85.82% |
| Wu et al. [148] | i2b2 2010 | 349 train, 477 test | LSTM | Word embedding | 85.33% | 86.56% | 85.94% |
| Lee et al. [74] | i2b2 2010 | 170 train, 256 test | BERT model | Pre-trained and fine-tuned BioBERT | - | - | 86.46% |
| Zhou et al. [163] | i2b2 2010 | 349 train, 477 test | LSTM-CRF | Pre-trained contextualized embeddings + Glove embedding | - | - | 87.45% |
| Zhou et al. [163] | NCBI-disease 2014 | 593 train, 100 valid, 100 test | LSTM-CRF | Pre-trained contextualized embeddings + Glove embedding | - | - | 87.88% |
| Lee et al. [74] | NCBI-disease 2014 | 593 train, 100 valid, 100 test | BERT model | Pre-trained and fine-tuned BioBERT | - | - | 89.36% |
| Yang et al. [153] | n2c2 2019 for family history extraction | 99 train, 117 test | Majority voting of LSTM-CRF models with BERT fine-tuning | Fasttext embedding + pre-trained BERT | 79.69% | 79.20% | 79.44% |
| Uzuner et al. [139] | SemEval 2014 Task 7A | 199 train, 99 valid, 133 test | CRF models | Textual features enhanced with a rule-based system | 91.1% | 85.6% | 88.3% |
| Vunikili et al. [140] | CANTEMIST-NER sub-task with tumor morphology mentions | 501 train, 500 valid, 5232 test | Transfer learning to fine-tune the BERT model | BERT contextual embeddings pre-trained on general domain Spanish text | 72.7% | 74.1% | 73.4% |
| Deng et al. [33] | Crawled TCM patents’ abstract texts annotated with herb names, disease names, symptoms, and therapeutic effects | 1600 copies: 60% train, 20% valid, 20% test | characters BiLSTM-CRF | Pre-trained and fine-tuned character embedding | 94.63% | 94.47% | 94.48% |
| Zhou et al. [163] | MACCROBAT 2018 case reports | 160 train, 20 valid, 20 test | LSTM-CRF | Pre-trained contextualized embeddings + Glove embedding | - | - | 65.75% |
| Lee et al. [74] | MACCROBAT 2018 case reports | 160 train, 20 valid, 20 test | BERT model | Pre-trained and fine-tuned BioBERT | - | - | 64.38% |
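
The table does not define its P, R, and F columns. As a reading aid, the sketch below assumes they are precision, recall, and the balanced F1-score (the harmonic mean of precision and recall). The function name `f_score` and the `beta` parameter are illustrative choices, not taken from the cited papers. Under that assumption, the first two i2b2 2010 rows can be recomputed from their reported P and R values:

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta = 1 gives the harmonic mean (F1).

    Inputs may be fractions or percentages, as long as both use the
    same scale; the result is on that same scale.
    """
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when nothing was found
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# (P, R, reported F) copied from the first two rows of the table.
rows = {
    "Tang et al. [132]": (87.38, 84.31, 85.82),
    "Wu et al. [148]": (85.33, 86.56, 85.94),
}

for name, (p, r, reported) in rows.items():
    print(f"{name}: recomputed {f_score(p, r):.2f}, reported {reported:.2f}")
```

A recomputed value that differs from the reported one does not necessarily indicate an error: a paper may average per-entity-type scores (macro-averaging) or use a different matching criterion, and the table does not say which was used.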