. 2021 Jan 25;2020:860–869.

Table 3.

Entity-level performance comparison for all BERT models. Each cell gives the tuple (precision, recall, F1 score). BioBERT and RoBERTa give the best overall performance (micro F1 of 0.76 and macro F1 of 0.73).

Entity Type	BERT	BioBERT	Bio+Clinical BERT	RoBERTa
Bleeding event	0.72, 0.76, 0.74	0.74, 0.76, 0.75	0.72, 0.78, 0.75	0.75, 0.76, 0.75
Bleeding anatomic site	0.72, 0.71, 0.71	0.72, 0.73, 0.72	0.73, 0.72, 0.72	0.76, 0.70, 0.73
Suspected alternative cause	0.52, 0.40, 0.45	0.50, 0.43, 0.46	0.46, 0.41, 0.43	0.51, 0.40, 0.45
Severity	0.66, 0.74, 0.70	0.66, 0.76, 0.71	0.63, 0.74, 0.68	0.68, 0.73, 0.70
Medication	0.92, 0.89, 0.91	0.93, 0.88, 0.91	0.92, 0.89, 0.90	0.94, 0.89, 0.91
Bleeding lab evaluation	0.81, 0.87, 0.84	0.82, 0.88, 0.85	0.78, 0.91, 0.84	0.83, 0.85, 0.84
Micro	0.75, 0.75, 0.75	0.76, 0.76, 0.76	0.74, 0.77, 0.75	0.77, 0.75, 0.76
Macro	0.73, 0.73, 0.72	0.73, 0.74, 0.73	0.71, 0.74, 0.72	0.75, 0.72, 0.73
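
The third value in each cell can be related to the first two with a short calculation. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall, and assumes the Macro row is the unweighted mean of the per-entity scores; neither definition is stated in the table itself, and the function name `f1_score` is only illustrative. Because the inputs are the rounded values from the table, results can differ from the reported ones in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from Table 3; these are already rounded.
bert_bleeding_event = (0.72, 0.76)
roberta_medication = (0.94, 0.89)

print(round(f1_score(*bert_bleeding_event), 2))  # 0.74
print(round(f1_score(*roberta_medication), 2))   # 0.91

# Macro F1 taken here as the unweighted mean of the six per-entity F1 scores
# (an assumption), using the BioBERT column of Table 3.
biobert_f1 = [0.75, 0.72, 0.46, 0.71, 0.91, 0.85]
macro_f1 = sum(biobert_f1) / len(biobert_f1)
print(round(macro_f1, 2))  # 0.73
```

Under these assumptions the recomputed values agree with the corresponding cells of the table. The Micro row cannot be checked this way, since it would be computed from pooled entity counts that the table does not report.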