Lei et al. [76]
Context level: + Word segmentation and section information as features.
Phrase level: + Medical dictionary to segment words. - Most errors occur in long entities.
Word level: NI
Entity extraction: NI
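A minimal sketch of the dictionary-driven word segmentation such features build on: forward maximum matching against a medical lexicon. The toy lexicon, window size, and example sentence are illustrative stand-ins, not the resources used in [76].

```python
MEDICAL_DICT = {"chest pain", "shortness of breath"}  # toy stand-in lexicon

def forward_max_match(tokens, max_words=4):
    """Greedy forward maximum matching: always take the longest dictionary hit."""
    segments, i = [], 0
    while i < len(tokens):
        for j in range(min(i + max_words, len(tokens)), i, -1):
            candidate = " ".join(tokens[i:j])
            if candidate in MEDICAL_DICT or j == i + 1:
                segments.append(candidate)  # fall back to a single token
                i = j
                break
    return segments

print(forward_max_match("patient reports chest pain daily".split()))
# ['patient', 'reports', 'chest pain', 'daily']
```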
|
|
Quimbaya et al. [110]
Context level: - Ignores the context and surrounding words.
Phrase level: NI
Word level: + Edit distance, exact, and lemmatized matching against a knowledge base.
Entity extraction: NI
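A minimal sketch of this lookup cascade, assuming a toy lexicon and a deliberately crude suffix-stripping lemmatizer as stand-ins for the knowledge base and lemmatizer used in [110]:

```python
LEXICON = {"diabetes mellitus", "myocardial infarction", "hypertension"}  # toy KB

def edit_distance(a, b):
    """Levenshtein distance via a one-row dynamic program."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # deletion
                                     dp[j - 1] + 1,      # insertion
                                     prev + (ca != cb))  # substitution/match
    return dp[-1]

def lemmatize(token):
    # Crude plural stripping; a real system would use a proper lemmatizer.
    return token[:-1] if token.endswith("s") and len(token) > 3 else token

def lookup(phrase, max_dist=2):
    """Exact match, then lemmatized match, then fuzzy edit-distance match."""
    p = phrase.lower()
    if p in LEXICON:
        return p
    lemma = " ".join(lemmatize(t) for t in p.split())
    if lemma in LEXICON:
        return lemma
    best = min(LEXICON, key=lambda entry: edit_distance(p, entry))
    return best if edit_distance(p, best) <= max_dist else None

print(lookup("myocardial infarctions"))  # lemmatized hit
print(lookup("hypertenson"))             # fuzzy hit (distance 1)
```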
Xu et al. [150]
Context level: + Category word2vec embeddings, PoS and dependency relations, and semantic correlation knowledge. - Filtering may miss some medical entities.
Phrase level: + Native medical noun phrases. + Based on a knowledge base. - May yield some inexact entities.
Word level: NI
Entity extraction: + All native medical noun phrases.
Ghiasvand and Kate [41]
Context level: + Exact matching of unambiguous words from UMLS.
Phrase level: + Boundary expansion model trained on UMLS terms. + Classifies all candidate noun phrases. - Noun phrase extraction is not always perfect. - Some entities are not noun phrases.
Word level: + Lemma and stem forms as features.
Entity extraction: + Complete parsing to extract all noun phrases. - Automatic noun phrase extraction is not always perfect. - Some entities do not belong to noun phrases.
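A sketch of the boundary expansion idea: grow a seed dictionary match while an acceptor approves each adjoining token. The adjective-only acceptor below is a toy stand-in for the boundary model that [41] trains on UMLS terms.

```python
def expand_boundaries(tokens, start, end, accept_token):
    """Grow a seed match [start, end) while the acceptor approves each new token."""
    while start > 0 and accept_token(tokens[start - 1]):
        start -= 1                      # expand left
    while end < len(tokens) and accept_token(tokens[end]):
        end += 1                        # expand right
    return start, end

MODIFIERS = {"acute", "severe", "chronic"}  # toy acceptor vocabulary
tokens = "patient has acute renal failure today".split()
# Seed: unambiguous exact match "renal failure" at positions 3..5.
print(expand_boundaries(tokens, 3, 5, lambda t: t in MODIFIERS))  # (2, 5)
```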
Zhou et al. [163]
Context level: + Word and character embeddings. - Captures contextual relations only at the word level.
Phrase level: - Cannot handle complex entities at the phrase level.
Word level: + Character representations capture out-of-vocabulary words.
Entity extraction: NI
Deng et al. [33]
Context level: + Learns contextual semantic information without feature engineering. + BiLSTM learns contextual dependencies. + CRF improves annotation at the phrase level.
Phrase level: + Bidirectional encoding of textual information ensures entity integrity and accuracy. + IOB annotation format. + Character embeddings avoid word segmentation errors. - Nested entities result in unclear boundaries.
Word level: + Character embeddings.
Entity extraction: - Limited by the entity annotation granularity.
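A schematic PyTorch sketch of a character-embedding BiLSTM tagger in this spirit. The CRF layer is stubbed out with a linear emission head (a real CRF, e.g. from the pytorch-crf package, would replace it to enforce phrase-level label consistency), and all sizes are illustrative.

```python
import torch
import torch.nn as nn

class CharBiLSTMTagger(nn.Module):
    """BiLSTM over character embeddings emitting per-character IOB scores."""

    def __init__(self, n_chars, n_tags, emb_dim=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(n_chars, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_tags)  # emissions; a CRF would decode these

    def forward(self, char_ids):
        x = self.embed(char_ids)   # (batch, seq, emb_dim)
        h, _ = self.lstm(x)        # (batch, seq, 2 * hidden)
        return self.head(h)        # (batch, seq, n_tags)

# Toy forward pass: 2 sequences of 10 characters, 7 IOB tags.
model = CharBiLSTMTagger(n_chars=5000, n_tags=7)
print(model(torch.randint(1, 5000, (2, 10))).shape)  # torch.Size([2, 10, 7])
```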
Zhao et al. [161]
Context level: + Extracts lexical, contextual, and syntactic clues. + Fine-tunes BERT with a BiLSTM-CRF. + Clue rules use contextual embeddings from the ELMo model.
Phrase level: + Extracts noun phrases from each sentence using PoS patterns.
Word level: + Clue-based rules. - Rules are not appropriate for other domains.
Entity extraction: NI
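A minimal sketch of PoS-pattern noun phrase extraction using NLTK's rule-based chunker; the grammar and the pre-tagged example are illustrative, not the patterns from [161].

```python
import nltk

# One illustrative pattern: optional determiner, adjectives, then nouns.
chunker = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN.*>+}")

# Pre-tagged tokens; a real pipeline would run a PoS tagger first.
tagged = [("the", "DT"), ("acute", "JJ"), ("myocardial", "JJ"),
          ("infarction", "NN"), ("was", "VBD"), ("treated", "VBN")]

tree = chunker.parse(tagged)
noun_phrases = [" ".join(word for word, _ in subtree.leaves())
                for subtree in tree.subtrees() if subtree.label() == "NP"]
print(noun_phrases)  # ['the acute myocardial infarction']
```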
Li et al. [78]
Context level: + A BiLSTM improves word2vec by capturing contextual information. + BERT performs better and captures context without a BiLSTM.
Phrase level: + Relation classification between pairs of spans can recognize discontinuous entities.
Word level: + ELMo character-level embeddings. - Word-level embeddings are needed to capture the full meaning of words.
Entity extraction: + Enumerates and represents all text spans, then applies relation classification.
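A minimal sketch of the span enumeration step, together with the disjoint span pairs a relation classifier would score; a predicted "same entity" relation between disjoint spans yields a discontinuous entity. The maximum span length and the example are illustrative.

```python
from itertools import combinations

def enumerate_spans(tokens, max_len=4):
    """All contiguous spans up to max_len tokens, as (start, end) half-open pairs."""
    return [(i, j) for i in range(len(tokens))
            for j in range(i + 1, min(i + max_len, len(tokens)) + 1)]

def disjoint_pairs(spans):
    """Non-overlapping span pairs to feed a relation classifier."""
    return [(a, b) for a, b in combinations(spans, 2) if a[1] <= b[0]]

tokens = "severe pain in left and right knees".split()
spans = enumerate_spans(tokens)
# Pairing the span 'left' with the span 'knees' would recover the
# discontinuous entity "left ... knees" alongside the contiguous "right knees".
print(len(spans), len(disjoint_pairs(spans)))
```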
Sui et al. [126]
Context level: + Models interactions among words, entity triggers, and whole-sentence semantics.
Phrase level: NI
Word level: + Entity triggers recognize entities through cue words. - Manual effort is required to prepare entity triggers.
Entity extraction: + Casts the problem as a graph node classification task.
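One way to picture the graph formulation: word, trigger, and sentence nodes connected so that a graph neural network can classify each word node with an entity tag. The networkx construction below only illustrates the structure, and the trigger set is hypothetical.

```python
import networkx as nx

tokens = ["the", "patient", "denies", "chest", "pain"]
triggers = {"denies"}  # hypothetical manually curated cue word

G = nx.Graph()
G.add_node("SENT", kind="sentence")        # node carrying whole-sentence semantics
for i, tok in enumerate(tokens):
    G.add_node(i, text=tok, kind="word")
    G.add_edge(i, "SENT")                  # word <-> sentence interaction
    if i > 0:
        G.add_edge(i - 1, i)               # sequential adjacency
    if tok in triggers:
        G.add_node("TRIG:" + tok, kind="trigger")
        G.add_edge("TRIG:" + tok, i)       # trigger <-> word interaction

# A GNN over G would label each word node (e.g. with IOB tags).
print(G.number_of_nodes(), G.number_of_edges())
```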