2024 Oct 30;26:e53636. doi: 10.2196/53636

Table 2.

Popularity and accessibility of electronic health record (EHR) question answering (QA) datasets. For each dataset, we list the number of citations and the number of EHR QA studies that use it. The information is based on data available as of September 30, 2023.

Datasets	Number of citations	Number of studies on EHR QA using the datasets	Publicly available
emrQA [26]	151	11	Yes
MIMICSQL [5]	51	3	Yes
Yue et al [46]	40	0	No
MIMICSPARQL*^a [41]	27	2	Yes
Yue et al [42]	18	0	Yes
Roberts and Demner-Fushman [23]	18	3	No
emrKBQA [8]	15	0	No
Raghavan et al [34]	13	0	No
Roberts and Demner-Fushman [24]	10	1	No
Soni et al [44]	7	3	No
Fan [35]	7	0	Yes
DrugEHRQA [25]	5	0	Yes
DiSCQ^b [43]	6	0	Yes
Oliveira et al [38]	3	0	No
RadQA^c [37]	3	1	Yes
EHRSQL [36]	3	0	Yes
Kim et al [39]	2	0	Yes
ClinicalKBQA^d [40]	2	0	No
Hamidi and Roberts [48]	1	0	No
MedAlign [49]	1	0	No
RxWhyQA [27]	0	0	Yes
Mishra et al [45]	0	0	No
CLIFT^e [47]	0	0	No
Mahbub et al [50]	0	0	No
Dada et al [51]	0	0	No

^aThis dataset follows the original schema of the Medical Information Mart for Intensive Care (MIMIC-III).

^bDiSCQ: Discharge Summary Clinical Questions.

^cRadQA: Radiology Question Answering Dataset.

^dClinicalKBQA: Clinical Knowledge Base Question Answering.

^eCLIFT: Clinical Shift.