Table 4.
Strict matching | | train CDR → test NCBIᵃ | | | train NCBI → test CDRᵇ | | |
---|---|---|---|---|---|---|---|
Model | Level | p | r | f | p | r | f |
BiLSTM | | 57.32 | 37.92 | 45.64 | 55.19 | 30.79 | 39.52 |
BiLSTM-CRF | | 68.34 | 36.88 | 47.90 | 58.30 | 38.74 | 46.55 |
GRAM-CNN | | 59.74 | 42.81 | 49.88 | 58.48 | 33.21 | 42.36 |
BERT | | 68.92 | 53.13 | 60.00 | 54.17 | 61.44 | 57.57 |
CLSTM | word level | 62.42 | 48.96 | 54.87 | 60.92 | 38.09 | 46.87 |
CLSTM | character level (3)ᶜ | 68.12 | 44.06 | 53.51 | 62.74 | 32.66 | 42.96 |
CLSTM | character level (7)ᶜ | 65.08 | 45.63 | 53.64 | 60.69 | 21.75 | 32.02 |
CLSTM | word+char levels (3, 3)ᵈ | 66.77 | 43.75 | 52.86 | 54.00 | 44.08 | 48.54 |
CLSTM | word+char levels (5, 5)ᵈ | 69.36 | 42.92 | 53.02 | 57.63 | 39.51 | 46.88 |
ᵃ Tested on the disease entities in the NCBI corpus using the model trained on the CDR corpus.
ᵇ Tested on the disease entities in the CDR corpus using the model trained on the NCBI corpus.
ᶜ The number in parentheses is the window size at the character level.
ᵈ The numbers in parentheses are the window sizes at the word and character levels, respectively.
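
The f column can be cross-checked against the p and r columns. The sketch below assumes that f denotes the balanced F1 score, i.e. the harmonic mean of precision and recall; the table does not state this explicitly, but the reported values are consistent with it. The names `ROWS`, `f1`, and `TOLERANCE` are illustrative and not taken from the original work.

```python
# Recompute F from precision and recall for every row of Table 4.
# Assumption: "f" is the balanced F1 score, 2pr / (p + r).

# (label, precision, recall, reported F), values copied from Table 4.
ROWS = [
    ("BiLSTM, CDR->NCBI", 57.32, 37.92, 45.64),
    ("BiLSTM-CRF, CDR->NCBI", 68.34, 36.88, 47.90),
    ("GRAM-CNN, CDR->NCBI", 59.74, 42.81, 49.88),
    ("BERT, CDR->NCBI", 68.92, 53.13, 60.00),
    ("CLSTM word, CDR->NCBI", 62.42, 48.96, 54.87),
    ("CLSTM char (3), CDR->NCBI", 68.12, 44.06, 53.51),
    ("CLSTM char (7), CDR->NCBI", 65.08, 45.63, 53.64),
    ("CLSTM word+char (3, 3), CDR->NCBI", 66.77, 43.75, 52.86),
    ("CLSTM word+char (5, 5), CDR->NCBI", 69.36, 42.92, 53.02),
    ("BiLSTM, NCBI->CDR", 55.19, 30.79, 39.52),
    ("BiLSTM-CRF, NCBI->CDR", 58.30, 38.74, 46.55),
    ("GRAM-CNN, NCBI->CDR", 58.48, 33.21, 42.36),
    ("BERT, NCBI->CDR", 54.17, 61.44, 57.57),
    ("CLSTM word, NCBI->CDR", 60.92, 38.09, 46.87),
    ("CLSTM char (3), NCBI->CDR", 62.74, 32.66, 42.96),
    ("CLSTM char (7), NCBI->CDR", 60.69, 21.75, 32.02),
    ("CLSTM word+char (3, 3), NCBI->CDR", 54.00, 44.08, 48.54),
    ("CLSTM word+char (5, 5), NCBI->CDR", 57.63, 39.51, 46.88),
]

# The published p and r are rounded to two decimals, so allow a small gap.
TOLERANCE = 0.01


def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Collect every row whose reported F differs from the recomputed value.
mismatches = [
    (label, reported, round(f1(p, r), 2))
    for label, p, r, reported in ROWS
    if abs(f1(p, r) - reported) > TOLERANCE
]

if __name__ == "__main__":
    for label, reported, recomputed in mismatches:
        print(f"{label}: reported {reported}, recomputed {recomputed}")
    print(f"{len(ROWS) - len(mismatches)} of {len(ROWS)} rows agree")
```

A check of this kind only confirms that the three columns are internally consistent under the F1 assumption; it says nothing about how the underlying entity matches were counted.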