2020 Jul 27;27(10):1510–1519. doi: 10.1093/jamia/ocaa080

Table 3.

Accuracy of each component of the candidate generator in our best complete system, Lucene(a + b + c + d + e) + BERT(f-e + ST), on the dev set.

                           Overall                            |Cm|=1              |Cm|>1
Components                 |m|   Accuracy (%)  Recall@30 (%)  |m|   Accuracy (%)  |m|   Accuracy (%)  Recall@30 (%)
Lucene(a + b)              705   97.16         97.59          681   97.65         24    83.33         95.83
Lucene(c)                  165   84.42         84.42          161   84.47         4     75            75
Lucene(d)                  164   73.78         81.10          127   82.68         37    43.24         75.68
Lucene(e)                  315   38.41         69.84          6     16.67         309   38.83         70.87
Lucene(a + b + c + d + e)  1350  78.96         87.41          976   92.93         374   42.51         72.99

Accuracy indicates how often the first matched candidate concept is correct. Recall@30 indicates how often the correct concept is among the first 30 matched candidate concepts. |Cm| indicates the size of the candidate concept set. |m| indicates the number of mentions predicted by each component.

BERT: Bidirectional Encoder Representations from Transformers; ST: semantic type regularizer.
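
The two metrics defined in the table note can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the toy ranked lists, and the concept identifiers (C1, C2, ...) are hypothetical and serve only to show how Accuracy (top-1 correctness) and Recall@30 would be computed from ranked candidate lists, assuming each mention has exactly one gold concept.

```python
from typing import Sequence


def accuracy_at_1(ranked: Sequence[Sequence[str]], gold: Sequence[str]) -> float:
    """Percentage of mentions whose first-ranked candidate is the gold concept."""
    hits = sum(1 for cands, g in zip(ranked, gold) if cands and cands[0] == g)
    return 100.0 * hits / len(gold)


def recall_at_k(ranked: Sequence[Sequence[str]], gold: Sequence[str], k: int = 30) -> float:
    """Percentage of mentions whose gold concept is among the first k candidates."""
    hits = sum(1 for cands, g in zip(ranked, gold) if g in cands[:k])
    return 100.0 * hits / len(gold)


# Hypothetical toy data: one ranked candidate list per mention.
ranked = [
    ["C1", "C2"],  # gold ranked first
    ["C5", "C3"],  # gold ranked second
    ["C9"],        # gold missing from the candidates
    [],            # no candidates generated
]
gold = ["C1", "C3", "C4", "C7"]

print(accuracy_at_1(ranked, gold))      # 25.0
print(recall_at_k(ranked, gold, k=30))  # 50.0
```

With k = 1 the two functions coincide, which is why Recall@30 can never be lower than Accuracy for the same set of mentions.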