Table 3.
ITPT^b and BERT^c model | Precision, mean (SD) | Recall, mean (SD) | F1 score, mean (SD)
NCBI^d-disease | | |
BERT | 75.0 (1.8) | 85.3 (1.1) | 79.8 (0.6)
+DAPT^e (BioBERT) | 77.7 (2.6) | 85.1 (2.8) | 81.1 (1.1)
+DAPT (ClinicalBERT) | 78.6 (3.2) | 84.4 (1.5) | 81.3 (1.2)
i2b2^f 2010 | | |
BERT | 71.6 (3.4) | 88.9 (2.4) | 79.2 (1.5)
+DAPT (BioBERT) | 75.6 (1.9) | 86.2 (1.4) | 80.5 (1.4)
+DAPT (ClinicalBERT) | 73.2 (2.0) | 89.0 (1.8) | 80.3 (0.7)
ShARe-CLEF^g 2013 | | |
BERT | 70.7 (2.7) | 88.7 (1.5) | 78.6 (1.3)
+DAPT (BioBERT) | 72.9 (2.5) | 88.3 (2.3) | 79.8 (0.8)
+DAPT (ClinicalBERT) | 74.2 (2.6) | 86.5 (3.8) | 79.8 (0.9)
^a Document-level precision, recall, and F1 score are reported using official evaluation scripts.
^b ITPT: intermediate-task pretraining.
^c BERT: Bidirectional Encoder Representations from Transformers.
^d NCBI: National Center for Biotechnology Information.
^e DAPT: domain-adaptive pretraining.
^f i2b2: Integrating Biology and the Bedside.
^g ShARe-CLEF: Shared Annotated Resources-Conference and Labs of the Evaluation Forum.