2020 Jan 16;27(3):407–418. doi: 10.1093/jamia/ocz207

Table 3.

Performance measures for medExtractR and 3 existing natural language processing systems across standardized and combined drug name, strength, dose amount, and frequency

Training set performance^a
	Tacrolimus			Lamotrigine
	Precision	Recall	F-measure	Precision	Recall	F-measure
medExtractR	0.99 (0.98-1.00)	1.00 (1.00-1.00)	0.99 (0.99-1.00)	1.00 (0.99-1.00)	0.97 (0.94-1.00)	0.98 (0.97-1.00)
MedEx	0.79 (0.74-0.84)	0.74 (0.69-0.79)	0.76 (0.72-0.81)	0.91 (0.87-0.95)	0.73 (0.63-0.84)	0.81 (0.74-0.87)
MedXN	0.96 (0.93-0.99)	0.90 (0.84-0.95)	0.93 (0.89-0.96)	0.96 (0.93-0.99)	0.76 (0.64-0.87)	0.85 (0.76-0.92)
CLAMP	0.83 (0.77-0.88)	0.60 (0.54-0.65)	0.70 (0.64-0.74)	0.94 (0.90-0.97)	0.57 (0.46-0.68)	0.71 (0.62-0.79)

Test set performance
	Tacrolimus			Lamotrigine			Allopurinol
	Precision	Recall	F-measure	Precision	Recall	F-measure	Precision	Recall	F-measure
medExtractR	0.97 (0.95-0.99)	1.00 (0.99-1.00)	0.98 (0.97-1.00)	0.97 (0.94-1.00)	0.96 (0.91-0.99)	0.96 (0.94-0.98)	0.96 (0.92-0.99)	0.98 (0.96-1.00)	0.97 (0.95-0.99)
MedEx	0.77 (0.71-0.84)	0.76 (0.71-0.82)	0.77 (0.71-0.82)	0.92 (0.87-0.96)	0.82 (0.74-0.89)	0.87 (0.81-0.91)	0.94 (0.90-0.97)	0.94 (0.86-0.98)	0.94 (0.89-0.97)
MedXN	0.96 (0.92-0.98)	0.96 (0.92-0.99)	0.96 (0.93-0.98)	0.93 (0.89-0.96)	0.83 (0.74-0.90)	0.88 (0.82-0.92)	0.97 (0.94-0.99)	0.97 (0.93-1.00)	0.97 (0.94-0.99)
CLAMP	0.84 (0.78-0.91)	0.65 (0.58-0.71)	0.73 (0.68-0.79)	0.94 (0.90-0.97)	0.66 (0.57-0.74)	0.78 (0.70-0.84)	0.96 (0.93-0.99)	0.81 (0.75-0.87)	0.88 (0.84-0.92)

Values are presented as estimate (95% bootstrap confidence interval). Results are based on 60 training notes and 50 test notes each for tacrolimus and lamotrigine, and 110 test notes for allopurinol. These are overall results, combining performance across the entities drug name, strength, dose amount, and frequency, which were standardized across systems to ensure comparability.

^aThe training set applies to medExtractR only; for the 3 existing natural language processing systems, it served as an additional test set.
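
The "estimate (95% bootstrap confidence interval)" format used in the table can be illustrated with a minimal Python sketch. It is not the authors' code and rests on several assumptions the table does not state: notes are resampled with replacement as whole units, counts are pooled across entities before computing each measure, and the interval is a simple percentile interval. The function names and the per-note counts are hypothetical.

```python
import random

def prf(tp, fp, fn):
    """Return (precision, recall, F-measure) from pooled counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure

def pooled(notes):
    """Pool (tp, fp, fn) counts over notes, then compute the measures."""
    tp = sum(n[0] for n in notes)
    fp = sum(n[1] for n in notes)
    fn = sum(n[2] for n in notes)
    return prf(tp, fp, fn)

def bootstrap_ci(notes, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CIs for precision, recall, and F-measure.

    Assumption: the note is the resampling unit.
    """
    rng = random.Random(seed)
    stats = [pooled(rng.choices(notes, k=len(notes))) for _ in range(n_boot)]
    intervals = []
    for i in range(3):
        values = sorted(s[i] for s in stats)
        lo = values[int((alpha / 2) * n_boot)]
        hi = values[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
        intervals.append((lo, hi))
    return intervals

if __name__ == "__main__":
    # Hypothetical per-note (true positive, false positive, false negative)
    # counts, already combined across drug name, strength, dose amount,
    # and frequency.
    notes = [(5, 1, 0), (3, 0, 1), (4, 1, 1), (6, 0, 0), (2, 0, 1)]
    estimates = pooled(notes)
    intervals = bootstrap_ci(notes)
    for name, est, (lo, hi) in zip(
        ("Precision", "Recall", "F-measure"), estimates, intervals
    ):
        print(f"{name}: {est:.2f} ({lo:.2f}-{hi:.2f})")
```

Resampling whole notes rather than individual entity mentions keeps mentions from the same note together, which is one common choice when errors are likely to cluster within a note.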