Author manuscript; available in PMC: 2024 Jul 31.
Published in final edited form as: J Biomed Inform. 2020 Dec 31;115:103671. doi: 10.1016/j.jbi.2020.103671

Table 3.

Benchmark dataset and clinical outcome applications.

| Public Resource | Description | Patient Data Type | Patient Count | Application Tasks | Paper |
|---|---|---|---|---|---|
| MIMIC-III | A critical care database, including de-identified patient data with ~60,000 Intensive Care Unit (ICU) admissions [93] | Structured; Unstructured | ~46,000 | Mortality Prediction | [14,43,57,58,61,65,73,77] |
| | | | | Disease Prediction | [27,31,43,54,56,57,59,91,98,99] |
| | | | | Admission Prediction | [77,88] |
| | | | | Length-of-stay Prediction | [14,43,73] |
| | | | | Patient Similarity | [58] |
| | | | | 24-hour decompensation | [14,43,61] |
| | | | | Intervention Prediction (ventilation, prescription, lab test order, etc.) | [32,54,55,65,98] |
| PPMI | A longitudinal patient dataset comprising clinical and behavioral variables, imaging, and specimen data of Parkinson’s disease patients. [94] | Structured | ~1000 | Patient subtyping | [67,74] |
| ADNI | A longitudinal patient dataset consisting of assessments collected from selected patients in varied stages of Alzheimer’s Disease. [95] | Structured | ~600 | Patient subtyping | [74] |
| i2b2 obesity challenge | 1237 discharge summaries from the Partners HealthCare Research Patient Data Repository. Each discharge summary was annotated with patient disease status corresponding to obesity and fifteen comorbidities of obesity. [96] | Unstructured | ~1000 | Phenotype Prediction | [62,63,90] |
| eICU Collaborative Database | A combination of multiple critical care units across the United States. The data covers patients who were admitted to critical care units in 2014 and 2015. [97] | Structured | ~139,000 | Mortality, medication, and diagnosis predictions | [42] |