2019 Jun 20;36(1):280–286. doi: 10.1093/bioinformatics/btz504

Table 5.

In-corpus (IC) performance, measured by F1-score, of the baseline (BL) BiLSTM-CRF compared to the multi-task model (MTM)

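| Entity | Train | Partner | BL Average | BL σ | MTM Average | MTM σ |
|---|---|---|---|---|---|---|
| Chemicals | BC4CH. | BC5CDR | 88.46 | 0.61 | 88.81 | 0.60 |
| | | CRAFT | | | 88.67 | 0.50 |
| | BC5CDR | BC4CH. | 92.82 | 0.80 | 93.00 | 0.55 |
| | | CRAFT | | | 91.52* | 0.68 |
| | CRAFT | BC4CH. | 84.98 | 1.98 | 85.06 | 1.49 |
| | | BC5CDR | | | 84.74 | 1.33 |
| Diseases | BC5CDR | NCBI-disease | 84.49 | 0.33 | 83.85 | 0.64 |
| | | Variome | | | 83.29* | 0.80 |
| | NCBI-disease | BC5CDR | 87.01 | 1.17 | 86.89 | 1.74 |
| | | Variome | | | 86.27 | 1.44 |
| | Variome | BC5CDR | 85.75 | 2.83 | 86.13 | 2.49 |
| | | NCBI-disease | | | 85.73 | 2.46 |
| Species | CRAFT | Linnaeus | 96.28 | 2.21 | 96.82 | 1.51 |
| | | S800 | | | 96.90 | 1.31 |
| | Linnaeus | CRAFT | 89.44 | 3.91 | 89.72 | 4.51 |
| | | S800 | | | 92.18 | 3.42 |
| | S800 | CRAFT | 72.75 | 2.42 | 74.80 | 2.98 |
| | | Linnaeus | | | 74.43 | 1.90 |
| Genes/proteins | BC2GM | CRAFT | 81.48 | 0.48 | 79.41** | 0.14 |
| | | JNLPBA | | | 79.60** | 0.53 |
| | CRAFT | BC2GM | 84.46 | 6.08 | 87.76 | 2.65 |
| | | JNLPBA | | | 85.36 | 4.74 |
| | JNLPBA | BC2GM | 80.92 | 2.50 | 81.61 | 2.53 |
| | | CRAFT | | | 81.15 | 2.04 |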

Note: The MTM is trained on pairs of corpora (train, partner), where each corpus is used during training to update the parameters of all hidden layers. IC performance is derived from 5-fold cross-validation, using exact matching criteria. Statistical significance is measured through a two-tailed t-test. Bold: best scores; σ: standard deviation.
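
As a rough illustration of how a two-tailed t-test can be run from the reported averages and standard deviations, the sketch below applies Welch's unequal-variance test to summary statistics. Several points here are assumptions and not stated in the note: the test variant (Welch, not paired or pooled), that σ is a sample standard deviation, and that each average comes from n = 5 folds. The function names `welch_t`, `t_pdf`, and `two_tailed_p` are hypothetical, and the resulting P-values are illustrative only; they are not a reproduction of the reported significance tests.

```python
import math

def welch_t(mean_a, sd_a, mean_b, sd_b, n_a=5, n_b=5):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Assumes sd_a and sd_b are sample standard deviations and that each
    mean is taken over n_a / n_b runs (5 folds assumed by default).
    """
    va = sd_a ** 2 / n_a
    vb = sd_b ** 2 / n_b
    t = (mean_a - mean_b) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (n_a - 1) + vb ** 2 / (n_b - 1))
    return t, df

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=20000):
    """Two-tailed P-value: 1 minus the density mass inside [-|t|, |t|].

    The integral over [0, |t|] is approximated with Simpson's rule
    (steps must be even) and doubled by symmetry.
    """
    a = abs(t)
    h = a / steps
    s = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = s * h / 3
    return max(0.0, 1 - 2 * area)

# Genes/proteins, train BC2GM: BL (81.48, 0.48) vs. MTM with partner CRAFT (79.41, 0.14)
t, df = welch_t(81.48, 0.48, 79.41, 0.14)
p_bc2gm = two_tailed_p(t, df)

# Chemicals, train BC5CDR: BL (92.82, 0.80) vs. MTM with partner CRAFT (91.52, 0.68)
t2, df2 = welch_t(92.82, 0.80, 91.52, 0.68)
p_bc5cdr = two_tailed_p(t2, df2)

print(f"BC2GM/CRAFT:  t={t:.2f}, df={df:.2f}, P={p_bc2gm:.4f}")
print(f"BC5CDR/CRAFT: t={t2:.2f}, df={df2:.2f}, P={p_bc5cdr:.4f}")
```

Under these assumptions, the two comparisons fall on the same side of the 0.01 and 0.05 thresholds as their table markers (** and *, respectively). A paired test on per-fold scores could give different values, and the per-fold scores are not available in the table.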

* Significantly different than the BL (P ≤ 0.05).

** Significantly different than the BL (P ≤ 0.01).