Table 3.
| Algorithm | FMA | FMA | MetaMap | MetaMap |
|---|---|---|---|---|
| Vocabulary | Read/OXMIS | Read/OXMIS | Read/OXMIS | Full Read |
| Test set | Death | General | General | General |
| Number of texts | 1000 | 1000 | 1000 | 1000 |
| Number of words | 7534 | 25981 | 25981 | 25981 |
| Positive diagnoses detected in free text | | | | |
| True positives | 683 | 346 | 286 | 273 |
| False positives | 11 | 32 | 126 | 18 |
| False negatives | 52 | 101 | 161 | 174 |
| Precision, % | 98.4 (97.2, 99.2) | 91.5 (88.3, 94.1) | 69.4 (64.7, 73.8) | 93.8 (90.4, 96.3) |
| Recall, % | 92.9 (90.8, 94.7) | 77.4 (73.2, 81.2) | 64.0 (59.3, 68.4) | 61.1 (56.4, 65.6) |
| F-score | 0.96 | 0.84 | 0.67 | 0.74 |
| Strictly defined precision for positive diagnoses (best term and correct attribute) | | | | |
| Number strictly correct | 625 | 315 | 260 | 247 |
| Precision strict, % | 90.1 (87.6, 92.2) | 83.3 (79.2, 86.9) | 63.1 (58.2, 67.8) | 84.9 (80.2, 88.8) |
| Precision of non-diagnosis positive concepts | | | | |
| True positives | 84 | 304 | 295 | 453 |
| False positives | 2 | 22 | 55 | 41 |
| Precision, % | 97.7 (91.9, 99.7) | 93.3 (90.0, 95.7) | 84.3 (80.0, 87.9) | 91.7 (88.9, 94.0) |
| Overall precision of positive concepts detected (diagnostic and non-diagnostic) | | | | |
| True positives | 767 | 650 | 581 | 726 |
| False positives | 13 | 54 | 181 | 59 |
| Precision, % | 98.3 (97.2, 99.1) | 92.3 (90.1, 94.2) | 76.2 (73.1, 79.2) | 92.5 (90.4, 94.2) |
| Precision of negative concepts detected | | | | |
| True positives | 5 | 57 | 0 | 92 |
| False positives | 5 | 18 | 0 | 33 |
| Precision, % | 50.0 (18.7, 81.3) | 76.0 (64.7, 85.1) | | 73.6 (65.0, 81.1) |
| Texts for which algorithm suggested a better Read term than the original term | | | | |
| Percentage of texts | 0 | 1.2 | 0.5 | 0.6 |
| Dates and durations | | | | |
| True positives | 116 | 96 | | |
| False positives | 15 | 10 | | |
| False negatives | 25 | 22 | | |
| Precision, % | 88.5 (81.8, 93.4) | 90.6 (83.3, 95.4) | | |
| Recall, % | 82.3 (74.9, 88.2) | 81.4 (73.1, 87.9) | | |
| F-score | 0.85 | 0.86 | | |
| Test results and quantitative measurements | | | | |
| True positives | | 105 | | |
| False positives | | 11 | | |
| False negatives | | 18 | | |
| Precision, % | | 90.5 (83.7, 95.2) | | |
| Recall, % | | 85.4 (77.9, 91.1) | | |
| F-score | | 0.89 | | |
Comparison of precision (positive predictive value) and recall (sensitivity) of the Freetext Matching Algorithm (FMA) and MetaMap against the gold standard of manual review, for two test sets: ‘General’, a random sample of 500 texts from cases and 500 from controls in a study on coronary artery disease; and ‘Death’, a random sample of 1000 texts associated with Read terms for death or suicide in 2001.
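
The precision, recall and F-score rows for positive diagnoses can be recomputed from the true positive, false positive and false negative counts. The following is a minimal sketch rather than the authors' code: the function names `precision_recall_f` and `summarise` are hypothetical, and it assumes the F-score is the balanced F1 measure (the harmonic mean of precision and recall), which the table does not state explicitly. It does not attempt to reproduce the bracketed intervals.

```python
def precision_recall_f(tp: int, fp: int, fn: int):
    """Return (precision, recall, F-score) from raw counts.

    Assumption: F-score is the balanced F1 measure,
    i.e. the harmonic mean of precision and recall.
    """
    precision = tp / (tp + fp)   # positive predictive value
    recall = tp / (tp + fn)      # sensitivity
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# (TP, FP, FN) copied from the 'Positive diagnoses detected in free text' rows.
counts = {
    "FMA, Read/OXMIS, Death": (683, 11, 52),
    "FMA, Read/OXMIS, General": (346, 32, 101),
    "MetaMap, Read/OXMIS, General": (286, 126, 161),
    "MetaMap, Full Read, General": (273, 18, 174),
}


def summarise(table):
    """Round results the way the table reports them: percentages to 1 d.p., F-score to 2 d.p."""
    rows = {}
    for label, (tp, fp, fn) in table.items():
        p, r, f = precision_recall_f(tp, fp, fn)
        rows[label] = (round(100 * p, 1), round(100 * r, 1), round(f, 2))
    return rows


for label, (p, r, f) in summarise(counts).items():
    print(f"{label}: precision {p}%, recall {r}%, F-score {f}")
```

Under these assumptions the four columns come out as 98.4/92.9/0.96, 91.5/77.4/0.84, 69.4/64.0/0.67 and 93.8/61.1/0.74, matching the point estimates reported in the table.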