2023 Apr 22;30(6):1091–1102. doi: 10.1093/jamia/ocad050

Table 2.

Coverage and prediction results of different components of the QA pipeline on FHIRDATA

Results are given as % (#) [total = 966]; the precision row is given as % [# correct responses/all responses].

| Component | Using gold concept | Top MetaMap score | Longest concept | Filter using EHR concepts | Longest after filter using EHR concepts |
|---|---|---|---|---|---|
| MetaMap generated concepts include gold Boundary (recall) | 83.85% (810) | | | | |
| MetaMap generated concepts include gold CUI (recall) | 41.51% (401) | | | | |
| Predicted concepts match gold Boundary (accuracy) | – | 54.04% (522) | 49.59% (479) | 49.38% (477) | 63.66% (615) |
| Predicted concepts match gold CUI (accuracy) | – | 23.50% (227) | 20.50% (198) | 39.13% (378) | 40.06% (387) |
| Generated logical trees include gold (recall) | 100.00% (966) | 83.64% (808) | 78.88% (762) | 99.17% (958) | 87.06% (841) |
| Predicted logical tree matches gold (accuracy) | 97.41% (941) | 81.57% (788) | 39.03% (377) | 96.27% (930) | 84.89% (820) |
| Predicted time frame matches gold (accuracy) | 88.30% (853) | | | | |
| Generated logical forms include gold (recall) | 100.00% (966) | 45.45% (439) | 47.10% (455) | 42.34% (409) | 54.45% (526) |
| Predicted logical form matches gold (accuracy) | 86.23% (833) | 19.25% (186) | 9.21% (89) | 33.54% (324) | 34.16% (330) |
| Predicted FHIR response matches gold (accuracy) | 97.41% (941) | 22.77% (220) | 10.14% (98) | 38.51% (372) | 39.13% (378) |
| Predicted answer matches gold (accuracy) | 97.41% (941) | 22.98% (222) | 10.14% (98) | 38.61% (373) | 39.34% (380) |
| Predicted answer matches gold (precision) | 98.33% [941/957] | 94.42% [220/233] | 50.52% [98/194] | 94.67% [373/394] | 94.03% [378/402] |

Rows with a single value report one result for the component, not a result specific to the first column; the source does not break it down by concept-selection method. A dash (–) marks a cell with no reported value.
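The accuracy and precision rows for the predicted answer differ only in their denominators: accuracy divides by the full total of 966, while precision divides by the number of responses the pipeline actually returned. The following is a minimal sketch of that arithmetic, using counts copied from the table; the helper name `pct` and the dictionary layout are illustrative assumptions and are not taken from the paper.

```python
# Counts copied from the table; names and layout here are illustrative only.
TOTAL = 966  # total reported in the table header


def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to 2 decimals."""
    return round(100 * numerator / denominator, 2)


# Predicted answer matches gold: number of correct answers per column.
answer_correct = {
    "Using gold concept": 941,
    "Top MetaMap score": 222,
    "Longest concept": 98,
    "Filter using EHR concepts": 373,
    "Longest after filter using EHR concepts": 380,
}

# Precision row: (correct responses, all responses returned), as reported.
precision_counts = {
    "Using gold concept": (941, 957),
    "Top MetaMap score": (220, 233),
    "Longest concept": (98, 194),
    "Filter using EHR concepts": (373, 394),
    "Longest after filter using EHR concepts": (378, 402),
}

for column, correct in answer_correct.items():
    accuracy = pct(correct, TOTAL)               # denominator: full total
    precision = pct(*precision_counts[column])   # denominator: responses returned
    print(f"{column}: accuracy {accuracy}%, precision {precision}%")
```

Read this way, a column such as "Top MetaMap score" answers few items overall (low accuracy) but is usually right when it does return a response (high precision).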