2020 Jul 27;27(10):1510–1519. doi: 10.1093/jamia/ocaa080

Table 3.

Accuracy of each component of the candidate generator in our best complete system, Lucene(a + b + c + d + e) + BERT(f-e + ST), on the dev set

	Overall			$|C_{m}| = 1$		$|C_{m}| > 1$
Components	$|m|$	Accuracy (%)	Recall@30 (%)	$|m|$	Accuracy (%)	$|m|$	Accuracy (%)	Recall@30 (%)
Lucene(a + b)	705	97.16	97.59	681	97.65	24	83.33	95.83
Lucene(c)	165	84.42	84.42	161	84.47	4	75	75
Lucene(d)	164	73.78	81.10	127	82.68	37	43.24	75.68
Lucene(e)	315	38.41	69.84	6	16.67	309	38.83	70.87
Lucene(a + b + c + d + e)	1350	78.96	87.41	976	92.93	374	42.51	72.99

Accuracy indicates how often the first matched candidate concept is correct. Recall@30 indicates how often the correct concept is among the first 30 matched candidate concepts. $|C_{m}|$ indicates the number of candidate concepts generated for mention $m$. $|m|$ indicates the number of mentions predicted by each component.

BERT: Bidirectional Encoder Representations from Transformers; ST: semantic type regularizer.
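
The two metrics defined in the table notes can be illustrated with a minimal sketch. It assumes each mention has a ranked list of candidate concept IDs and a single gold concept ID; the function name, the dictionary layout, and the toy IDs are hypothetical and are not taken from the paper's code.

```python
from typing import Dict, List, Tuple


def accuracy_and_recall_at_k(
    ranked: Dict[str, List[str]],
    gold: Dict[str, str],
    k: int = 30,
) -> Tuple[float, float]:
    """Return (accuracy %, recall@k %) over the mentions in `ranked`.

    Accuracy counts mentions whose first candidate is the gold concept.
    Recall@k counts mentions whose gold concept is among the first k candidates.
    """
    n = len(ranked)
    if n == 0:
        raise ValueError("no mentions to score")
    top1 = sum(1 for m, cands in ranked.items() if cands[:1] == [gold[m]])
    topk = sum(1 for m, cands in ranked.items() if gold[m] in cands[:k])
    return 100.0 * top1 / n, 100.0 * topk / n


# Toy data: hypothetical mention and concept IDs, not results from the paper.
ranked = {
    "m1": ["C001", "C002"],          # gold at rank 1
    "m2": ["C010", "C011", "C012"],  # gold at rank 3
    "m3": ["C020"],                  # gold not retrieved
    "m4": ["C030", "C031"],          # gold at rank 1
}
gold = {"m1": "C001", "m2": "C012", "m3": "C099", "m4": "C030"}

acc, rec = accuracy_and_recall_at_k(ranked, gold, k=30)
print(f"Accuracy: {acc:.2f}%  Recall@30: {rec:.2f}%")
```

Under these definitions Recall@30 can never be lower than accuracy, and the two coincide when every mention has exactly one candidate, which is presumably why the $|C_{m}| = 1$ columns report accuracy only.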