Table 2.
Average AUC scores of KUTS and baselines on the SYSMH-S, SYSMH-N, GTCMH and MIMIC-IV-ED datasets
| Category | Method | SYSMH-S | SYSMH-N | GTCMH | MIMIC-IV-ED |
|---|---|---|---|---|---|
| Shallow single-modal methods | Decision tree | 0.659 [0.649, 0.667] | 0.709 [0.665, 0.759] | 0.680 [0.644, 0.720] | 0.561 [0.559, 0.564] |
| | Support vector machine | 0.836 [0.831, 0.843] | 0.797 [0.750, 0.835] | 0.838 [0.816, 0.858] | 0.797 [0.750, 0.835] |
| | Naive Bayes | 0.852 [0.845, 0.857] | 0.829 [0.780, 0.845] | 0.856 [0.831, 0.893] | 0.684 [0.676, 0.696] |
| | Random forest | 0.872 [0.865, 0.878] | 0.855 [0.812, 0.901] | 0.877 [0.856, 0.901] | 0.706 [0.701, 0.712] |
| | XGBoost | 0.890 [0.885, 0.894] | 0.873 [0.827, 0.907] | 0.887 [0.850, 0.907] | 0.756 [0.750, 0.760] |
| Deep single-modal methods | FT-Transformer | 0.699 [0.607, 0.758] | 0.659 [0.620, 0.703] | 0.695 [0.616, 0.780] | 0.627 [0.609, 0.640] |
| | TabTransformer | 0.728 [0.690, 0.754] | 0.692 [0.639, 0.727] | 0.751 [0.683, 0.803] | 0.722 [0.701, 0.735] |
| | MLP | 0.785 [0.748, 0.827] | 0.731 [0.664, 0.806] | 0.716 [0.497, 0.846] | 0.678 [0.544, 0.727] |
| | BERT-single | 0.867 [0.846, 0.882] | 0.834 [0.796, 0.854] | 0.889 [0.861, 0.917] | 0.843 [0.839, 0.846] |
| Deep multi-modal methods | HAIM | 0.903 [0.888, 0.917] | 0.891 [0.860, 0.911] | 0.864 [0.710, 0.949] | 0.866 [0.857, 0.878] |
| | IRENE | 0.917 [0.895, 0.929] | 0.928 [0.887, 0.946] | 0.893 [0.850, 0.927] | 0.867 [0.859, 0.871] |
| Our model | KUTS | 0.958 [0.954, 0.961] | 0.961 [0.941, 0.981] | 0.950 [0.893, 0.978] | 0.885 [0.879, 0.888] |
| Improvement | Average | 18.58% | 21.57% | 18.05% | 22.19% |
| | Minimum | 4.69% | 3.56% | 6.38% | 2.08% |
The baselines comprise shallow single-modal methods (decision tree, support vector machine, naive Bayes, random forest and XGBoost), deep single-modal methods (FT-Transformer, TabTransformer, MLP and BERT) and deep multi-modal methods (HAIM and IRENE). The average improvement (second-to-last row) is the improvement of KUTS averaged over all baselines, while the minimum improvement (last row) is the improvement of KUTS over the best-performing baseline. The evaluation metric is AUC; 95% confidence intervals are given in brackets.
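
The table does not give the formula behind the two improvement rows. The sketch below assumes a relative improvement, (AUC of KUTS − AUC of baseline) / AUC of baseline, with the average taken over all eleven baselines and the minimum taken against the best baseline. The function names `relative_improvement` and `summarize` are illustrative, not from the source, and the inputs are the rounded SYSMH-N point estimates from the table.

```python
from statistics import mean


def relative_improvement(ours: float, baseline: float) -> float:
    """Relative gain of `ours` over `baseline`, in percent (assumed definition)."""
    return (ours - baseline) / baseline * 100.0


def summarize(ours: float, baselines: dict[str, float]) -> tuple[float, float]:
    """Return (average gain over all baselines, gain over the best baseline)."""
    gains = [relative_improvement(ours, auc) for auc in baselines.values()]
    best_baseline = max(baselines.values())
    return mean(gains), relative_improvement(ours, best_baseline)


# Rounded SYSMH-N point estimates copied from Table 2.
SYSMH_N_BASELINES = {
    "Decision tree": 0.709,
    "Support vector machine": 0.797,
    "Naive Bayes": 0.829,
    "Random forest": 0.855,
    "XGBoost": 0.873,
    "FT-Transformer": 0.659,
    "TabTransformer": 0.692,
    "MLP": 0.731,
    "BERT-single": 0.834,
    "HAIM": 0.891,
    "IRENE": 0.928,
}
KUTS_SYSMH_N = 0.961

average_gain, minimum_gain = summarize(KUTS_SYSMH_N, SYSMH_N_BASELINES)
print(f"average: {average_gain:.2f}%  minimum: {minimum_gain:.2f}%")
```

Under this assumption the SYSMH-N column yields values close to the reported 21.57% and 3.56%. Because the inputs are rounded to three decimals, small deviations from the published percentages are to be expected.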