2024 Oct 30;26:e53636. doi: 10.2196/53636

Table 2.

Popularity and accessibility of electronic health record (EHR) question answering (QA) datasets. For each dataset, we list the number of citations and the number of EHR QA studies that used it. The information presented here is based on data available as of September 30, 2023.

| Dataset | Number of citations | Number of studies on EHR QA using the dataset | Publicly available |
|---|---|---|---|
| emrQA [26] | 151 | 11 | Yes |
| MIMICSQL [5] | 51 | 3 | Yes |
| Yue et al [46] | 40 | 0 | No |
| MIMICSPARQL*^a [41] | 27 | 2 | Yes |
| Yue et al [42] | 18 | 0 | Yes |
| Roberts and Demner-Fushman [23] | 18 | 3 | No |
| emrKBQA [8] | 15 | 0 | No |
| Raghavan et al [34] | 13 | 0 | No |
| Roberts and Demner-Fushman [24] | 10 | 1 | No |
| Soni et al [44] | 7 | 3 | No |
| Fan [35] | 7 | 0 | Yes |
| DrugEHRQA [25] | 5 | 0 | Yes |
| DiSCQ^b [43] | 6 | 0 | Yes |
| Oliveira et al [38] | 3 | 0 | No |
| RadQA^c [37] | 3 | 1 | Yes |
| EHRSQL [36] | 3 | 0 | Yes |
| Kim et al [39] | 2 | 0 | Yes |
| ClinicalKBQA^d [40] | 2 | 0 | No |
| Hamidi and Roberts [48] | 1 | 0 | No |
| MedAlign [49] | 1 | 0 | No |
| RxWhyQA [27] | 0 | 0 | Yes |
| Mishra et al [45] | 0 | 0 | No |
| CLIFT^e [47] | 0 | 0 | No |
| Mahbub et al [50] | 0 | 0 | No |
| Dada et al [51] | 0 | 0 | No |

^a This dataset follows the original schema of the Medical Information Mart for Intensive Care III (MIMIC-III).

^b DiSCQ: Discharge Summary Clinical Questions.

^c RadQA: Radiology Question Answering Dataset.

^d ClinicalKBQA: Clinical Knowledge Base Question Answering.

^e CLIFT: Clinical Shift.
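
As a reading aid, the counts in Table 2 can be tallied with a short script. The sketch below is not part of the original study: the rows are transcribed by hand from the table, the names `TABLE_2` and `summarize` are hypothetical, and the citation count for Roberts and Demner-Fushman [23] is assumed to be 18.

```python
# Rows transcribed from Table 2 as
# (dataset, citations, EHR QA studies using the dataset, publicly available).
# Assumption: the citation count for Roberts and Demner-Fushman [23] is 18.
TABLE_2 = [
    ("emrQA [26]", 151, 11, True),
    ("MIMICSQL [5]", 51, 3, True),
    ("Yue et al [46]", 40, 0, False),
    ("MIMICSPARQL* [41]", 27, 2, True),
    ("Yue et al [42]", 18, 0, True),
    ("Roberts and Demner-Fushman [23]", 18, 3, False),
    ("emrKBQA [8]", 15, 0, False),
    ("Raghavan et al [34]", 13, 0, False),
    ("Roberts and Demner-Fushman [24]", 10, 1, False),
    ("Soni et al [44]", 7, 3, False),
    ("Fan [35]", 7, 0, True),
    ("DrugEHRQA [25]", 5, 0, True),
    ("DiSCQ [43]", 6, 0, True),
    ("Oliveira et al [38]", 3, 0, False),
    ("RadQA [37]", 3, 1, True),
    ("EHRSQL [36]", 3, 0, True),
    ("Kim et al [39]", 2, 0, True),
    ("ClinicalKBQA [40]", 2, 0, False),
    ("Hamidi and Roberts [48]", 1, 0, False),
    ("MedAlign [49]", 1, 0, False),
    ("RxWhyQA [27]", 0, 0, True),
    ("Mishra et al [45]", 0, 0, False),
    ("CLIFT [47]", 0, 0, False),
    ("Mahbub et al [50]", 0, 0, False),
    ("Dada et al [51]", 0, 0, False),
]


def summarize(rows):
    """Count datasets overall, public ones, and ones reused by at least one study."""
    reused = [row for row in rows if row[2] > 0]
    return {
        "datasets": len(rows),
        "public": sum(1 for row in rows if row[3]),
        "reused": len(reused),
        "reused_public": sum(1 for row in reused if row[3]),
    }


summary = summarize(TABLE_2)
print(summary)
```

Keeping the rows as plain tuples makes it easy to check each one against the table by eye before relying on the totals.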