2017 Apr 1;24(6):1062–1071. doi: 10.1093/jamia/ocx019

Table 2.

Performance of all CRF systems for entity and attribute recognition

Feature set^a	Matching	Step 1: Boundary detection			Steps 1 + 2: Boundary detection + Classification
		Precision	Recall	F1-score	Precision	Recall	F1-score
BOW	Exact	0.8284	0.6661	0.7384	0.7917	0.6363	0.7054
BOW	Inexact	0.9411	0.8137	0.8728	0.8715	0.7536	0.8083
BOW + POS + Lemma	Exact	0.8687	0.7393	0.7988	0.8342	0.7100	0.7671
BOW + POS + Lemma	Inexact	0.9480	0.8325	0.8865	0.8894	0.7811	0.8317
BOW + POS + Lemma + UMLS	Exact	0.8644	0.7574	0.8073	0.8341	0.7309	0.7791
BOW + POS + Lemma + UMLS	Inexact	0.9445	0.8541	0.8970	0.8836	0.7991	0.8392
BOW + POS + Lemma + UMLS + BC	Exact	0.8682	0.7661	0.8137	0.8382	0.7400	0.7861
BOW + POS + Lemma + UMLS + BC	Inexact	0.9491	0.8558	0.8978	0.8866	0.8037	0.8432

Entity classes	Precision		Recall		F1-score
	Inexact	Exact	Inexact	Exact	Inexact	Exact
*Baseline – CliNER (Problem class)	0.3692	0.3421	0.4809	0.4140	0.4177	0.3746
*Baseline – EliXR (Disorder group)	0.6402	0.4289	0.8138	0.7089	0.7176	0.5345
Condition	0.9071	0.8566	0.8788	0.8209	0.8927	0.8384
Observation	0.8397	0.8169	0.7378	0.6760	0.7855	0.7398
Procedure/Device	0.8817	0.7951	0.6581	0.6110	0.7537	0.6910
Drug/Substance	0.9027	0.8573	0.7287	0.7179	0.8064	0.7814
Qualifier/Modifier	0.8807	0.8505	0.7412	0.7253	0.8049	0.7829
Temporal Constraints	0.8808	0.8045	0.8239	0.7254	0.8514	0.7629
Measurement	0.8984	0.8101	0.8401	0.7168	0.8683	0.7606
Overall	0.8866	0.8382	0.8037	0.7400	0.8432	0.7861

^aFeature notation: BOW: bag of words; POS: part of speech; UMLS: Unified Medical Language System; BC: Brown clustering. The upper table reports overall performance with different feature sets. The lower table shows detailed results for each entity class using the best feature set (BOW + POS + Lemma + UMLS + BC).
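
The F1-scores in both tables are consistent with the balanced F1, the harmonic mean of precision and recall. The sketch below recomputes a few rows as a sanity check. The function name `f1_score` and the row labels are illustrative, not taken from the paper, and small differences in the last digit are expected because the published precision and recall values are already rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from Table 2.
# The dictionary keys are descriptive labels chosen for this example.
rows = {
    "BOW, exact, step 1": (0.8284, 0.6661, 0.7384),
    "All features, exact, steps 1 + 2": (0.8382, 0.7400, 0.7861),
    "All features, inexact, steps 1 + 2": (0.8866, 0.8037, 0.8432),
    "Observation, inexact": (0.8397, 0.7378, 0.7855),
}

for name, (p, r, reported) in rows.items():
    computed = f1_score(p, r)
    print(f"{name}: computed {computed:.4f}, reported {reported:.4f}")
```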

*We use two baselines: the performance of the “Problem” entity class in CliNER, and the concepts identified by EliXR that belong to UMLS disorder semantic types. Both baselines are compared with the performance of EliIE on the “Condition” entity class. The full list of semantic types included is: T020, T190, T049, T019, T047, T050, T033, T037, T048, T191, T046, T184. The bold values for the feature set (BOW + POS + Lemma + UMLS + BC) indicate that the best overall performance was achieved using the combination of all features. The bold values for the Procedure/Device entity class indicate that, owing to its lower frequency in the trials, Procedure/Device has the worst performance among all entity classes, with an F1-score of 0.69. The bold values in the Overall row indicate that, with the best setting (BOW + POS + Lemma + UMLS + BC), the system achieves an overall precision, recall, and F1-score of 0.84, 0.74, and 0.79, respectively, under exact matching.
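
The table does not define exact and inexact matching. A common convention, assumed here, is that an exact match requires identical span boundaries, whereas an inexact match only requires the predicted and gold spans to overlap. The sketch below illustrates that assumption. The `(start, end, label)` span format, the function names, and the toy data are all hypothetical and are not the evaluation code used for Table 2. Setting `use_label=False` roughly corresponds to scoring boundary detection alone (Step 1), and `use_label=True` to boundary detection plus classification (Steps 1 + 2).

```python
from typing import List, Tuple

# Hypothetical span format: (start offset, end offset (exclusive), class label)
Span = Tuple[int, int, str]


def spans_match(pred: Span, gold: Span, exact: bool, use_label: bool) -> bool:
    """Return True if a predicted span counts as matching a gold span."""
    if use_label and pred[2] != gold[2]:
        return False
    if exact:
        # Assumed definition: both boundaries must be identical.
        return pred[0] == gold[0] and pred[1] == gold[1]
    # Assumed definition: any character overlap counts.
    return pred[0] < gold[1] and gold[0] < pred[1]


def precision_recall(
    pred: List[Span], gold: List[Span], exact: bool, use_label: bool
) -> Tuple[float, float]:
    """Precision over predicted spans and recall over gold spans."""
    matched_pred = sum(
        any(spans_match(p, g, exact, use_label) for g in gold) for p in pred
    )
    matched_gold = sum(
        any(spans_match(p, g, exact, use_label) for p in pred) for g in gold
    )
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    return precision, recall


# Toy data: the second prediction overlaps its gold span but starts 2 characters late.
gold = [(0, 8, "Condition"), (20, 30, "Drug")]
pred = [(0, 8, "Condition"), (22, 30, "Drug"), (40, 45, "Measurement")]

print("exact:  ", precision_recall(pred, gold, exact=True, use_label=True))
print("inexact:", precision_recall(pred, gold, exact=False, use_label=True))
```

Under these assumptions, inexact scores can never be lower than exact scores for the same predictions, which agrees with the pattern in every row of the table.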