2023 Aug 11;6:141. doi: 10.1038/s41746-023-00888-7

Table 2.

Fidelity results with utility metrics.

Utility with all features

Target     Model    Metric  MIMIC-III                               eICU
                            Train on Real       Train on Synth      Train on Real       Train on Synth
Mortality  GBDT     AUC     0.762               0.736               0.943               0.938
                    AP      0.304               0.261               0.600               0.534
           RF       AUC     0.723               0.710               0.954               0.945
                    AP      0.276               0.251               0.600               0.580
           GRU      AUC     0.728               0.667               0.937               0.938
                    AP      0.278               0.193               0.567               0.528
           LR       AUC     0.712               0.680               0.872               0.818
                    AP      0.233               0.207               0.323               0.260
           Average  AUC     0.731               0.689               0.926               0.909
                    AP      0.272               0.228               0.522               0.475

Utility with random subsets of features

Target     Model    Metric  MIMIC-III                               eICU
                            Mean-diff           p-value (X = 0.04)  Mean-diff           p-value (X = 0.04)
Mortality  RF       AUC     0.009               0.000               0.009               0.000
                    AP      0.035               0.000               0.035               0.098
Gender              AUC     0.065               1.000               0.019               0.000
                    AP      0.046               0.860               0.013               0.000

(Upper) Downstream task performance of four predictive models under two settings (train on real vs. train on synthetic) on the MIMIC-III and eICU datasets. Performance is evaluated on the original test sets. The best performance in each column is shown in bold. (Lower) The average absolute performance difference (in terms of AUC/AP) between training on real vs. synthetic data, and the corresponding p-values (computed by a one-sample t-test), for predicting mortality and gender with random subsets of features.
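
The quantities in the lower half of the table can be illustrated with a minimal sketch; it is not the authors' code. It assumes paired scores per random feature subset (the lists `real_scores` and `synth_scores` are hypothetical), takes X = 0.04 as the reference value of the one-sample t-test, and assumes a lower-tailed alternative (mean difference below X). That direction is inferred from the table, where a mean difference of 0.065 is paired with a p-value of 1.000. All function names are illustrative, and the Student t CDF is integrated numerically so that the sketch needs only the standard library.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df):
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    if a == 0:
        return 0.5
    steps = max(2000, int(a * 200))
    if steps % 2:  # Simpson's rule needs an even number of intervals
        steps += 1
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def mean_abs_diff_test(real_scores, synth_scores, threshold=0.04):
    """Mean absolute difference and a lower-tailed one-sample t-test.

    Assumption: H0 is mean |real - synth| = threshold and H1 is
    mean |real - synth| < threshold, so a small p-value suggests the
    utility gap is below the threshold.
    """
    diffs = [abs(r - s) for r, s in zip(real_scores, synth_scores)]
    n = len(diffs)
    mean_diff = mean(diffs)
    std_err = stdev(diffs) / math.sqrt(n)
    if std_err == 0:
        raise ValueError("all differences are identical; t is undefined")
    t_stat = (mean_diff - threshold) / std_err
    p_value = t_cdf(t_stat, n - 1)  # lower tail, n - 1 degrees of freedom
    return mean_diff, t_stat, p_value


# Hypothetical AUCs from five random feature subsets (not from the paper).
real_scores = [0.72, 0.70, 0.74, 0.71, 0.73]
synth_scores = [0.71, 0.70, 0.72, 0.705, 0.72]
mean_diff, t_stat, p_value = mean_abs_diff_test(real_scores, synth_scores)
print(f"mean-diff={mean_diff:.3f}  t={t_stat:.2f}  p={p_value:.4f}")
```

Under this reading, a small p-value corresponds to a mean difference reliably below X, while a mean difference above X pushes the p-value toward 1.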