
Table 5. Biomedical IE results for Task 2. Rows 1–3 correspond to training the sequence labeler directly on all crowd labels, while Rows 4–7 first aggregate the crowd labels and then train the sequence labeling model on the consensus annotations.

| Method | Precision | Recall | F1 | F1 std. |
|---|---|---|---|---|
| LSTM (Lample et al., 2016) | 77.43 | 61.13 | 68.27 | 1.9 |
| LSTM-Crowd | 73.83 | 63.93 | 68.47 | 1.6 |
| LSTM-Crowd-cat | 68.08 | 68.41 | 68.20 | 1.8 |
| Majority Vote then CRF | 93.71 | 33.16 | 48.92 | 2.8 |
| Dawid-Skene then LSTM | 70.21 | 65.26 | 67.59 | 1.7 |
| HMM-Crowd then CRF | 79.54 | 54.76 | 64.81 | 2.0 |
| HMM-Crowd then LSTM | 73.65 | 64.64 | 68.81 | 1.9 |
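For concreteness, the first stage of the "aggregate then train" pipelines (e.g., Majority Vote then CRF) reduces to choosing one consensus tag per token before any model is trained. The sketch below illustrates token-level majority voting over aligned crowd annotations; it is a minimal illustration assuming aligned BIO tag sequences, not the authors' implementation, and the function name and tie-breaking rule are our own.

```python
from collections import Counter

def majority_vote(crowd_labels):
    """Token-level majority vote over crowd annotations.

    crowd_labels: list of label sequences, one per annotator,
    all aligned to the same tokens (e.g., BIO tags).
    Returns a single consensus sequence. Ties are broken by
    whichever label was encountered first (Counter insertion order).
    """
    consensus = []
    for position_labels in zip(*crowd_labels):
        label, _count = Counter(position_labels).most_common(1)[0]
        consensus.append(label)
    return consensus

# Example: three annotators labeling the same four tokens.
annotations = [
    ["B-Dis", "I-Dis", "O", "O"],
    ["B-Dis", "O",     "O", "O"],
    ["B-Dis", "I-Dis", "O", "B-Dis"],
]
print(majority_vote(annotations))  # ['B-Dis', 'I-Dis', 'O', 'O']
```

The Dawid-Skene and HMM-Crowd rows replace this simple vote with probabilistic models of annotator reliability (the latter adding sequential structure), which accounts for their different precision/recall trade-offs relative to the majority-vote pipeline.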