Table 2.
Algorithm | Visits Used | Imputation Method | F-score | Precision | Recall | ROC-AUC | PR-AUC |
---|---|---|---|---|---|---|---|
Prediction 90 d in advance | |||||||
Data-driven machine learning models (full models) | |||||||
Multilayer perceptron | Last 2 visits<sup>a</sup> | Zero imputation | 0.782 | 0.703 | 0.879 | 0.979 | 0.829 |
Multilayer perceptron | Last 2 visits<sup>a</sup> | Median forward | 0.847 | 0.858 | 0.836 | 0.990 | 0.890 |
Gradient boosting | Last 2 visits<sup>a</sup> | Zero imputation | 0.874 | 0.852 | 0.897 | 0.994 | 0.933 |
Gradient boosting | Last 2 visits<sup>a</sup> | Median forward | 0.890 | 0.875 | 0.905 | 0.996 | 0.956 |
Random forest | Last 2 visits<sup>a</sup> | Zero imputation | 0.583 | 0.942 | 0.422 | 0.995 | 0.943 |
Random forest | Last 2 visits<sup>a</sup> | Median forward | 0.836 | 0.918 | 0.767 | 0.994 | 0.931 |
Elastic net | Last 2 visits<sup>a</sup> | Zero imputation | 0.774 | 0.649 | 0.957 | 0.984 | 0.861 |
Elastic net | Last 2 visits<sup>a</sup> | Median forward | 0.846 | 0.800 | 0.897 | 0.992 | 0.904 |
Bidirectional recurrent neural network | Full sequence; all previous visits | Zero imputation | 0.818 | 0.786 | 0.853 | 0.984 | 0.874 |
Bidirectional recurrent neural network | Full sequence; all previous visits | Median forward | 0.856 | 0.819 | 0.897 | 0.989 | 0.916 |
Bidirectional attention recurrent neural network | Full sequence; all previous visits | Zero imputation | 0.803 | 0.797 | 0.810 | 0.981 | 0.867 |
Bidirectional attention recurrent neural network | Full sequence; all previous visits | Median forward | 0.852 | 0.812 | 0.897 | 0.986 | 0.901 |
Manually built logistic regression model (short model) | Last 2 visits<sup>a</sup> | None | 0.807 | 0.689 | 0.974 | 0.990 | 0.881 |
Prediction 180 d in advance | |||||||
Data-driven machine learning models (full models) | |||||||
Multilayer perceptron | Last 2 visits<sup>a</sup> | Zero imputation | 0.719 | 0.716 | 0.722 | 0.960 | 0.777 |
Multilayer perceptron | Last 2 visits<sup>a</sup> | Median forward | 0.718 | 0.798 | 0.652 | 0.963 | 0.803 |
Gradient boosting | Last 2 visits<sup>a</sup> | Zero imputation | 0.656 | 0.859 | 0.530 | 0.969 | 0.833 |
Gradient boosting | Last 2 visits<sup>a</sup> | Median forward | 0.789 | 0.815 | 0.765 | 0.970 | 0.860 |
Random forest | Last 2 visits<sup>a</sup> | Zero imputation | 0.115 | > 0.999 | 0.061 | 0.955 | 0.803 |
Random forest | Last 2 visits<sup>a</sup> | Median forward | 0.677 | 0.844 | 0.565 | 0.968 | 0.814 |
Elastic net | Last 2 visits<sup>a</sup> | Zero imputation | 0.698 | 0.629 | 0.783 | 0.952 | 0.768 |
Elastic net | Last 2 visits<sup>a</sup> | Median forward | 0.767 | 0.777 | 0.757 | 0.959 | 0.787 |
Bidirectional recurrent neural network | Full sequence; all previous visits | Zero imputation | 0.722 | 0.732 | 0.713 | 0.965 | 0.759 |
Bidirectional recurrent neural network | Full sequence; all previous visits | Median forward | 0.718 | 0.706 | 0.730 | 0.956 | 0.730 |
Bidirectional attention recurrent neural network | Full sequence; all previous visits | Zero imputation | 0.694 | 0.720 | 0.670 | 0.963 | 0.755 |
Bidirectional attention recurrent neural network | Full sequence; all previous visits | Median forward | 0.721 | 0.712 | 0.730 | 0.945 | 0.792 |
Manually built logistic regression model (short model) | Last 2 visits<sup>a</sup> | None | 0.559 | 0.405 | 0.904 | 0.934 | 0.646 |
Prediction 270 d in advance | |||||||
Data-driven machine learning models (full models) | |||||||
Multilayer perceptron | Last 2 visits<sup>a</sup> | Zero imputation | 0.678 | 0.634 | 0.728 | 0.948 | 0.666 |
Multilayer perceptron | Last 2 visits<sup>a</sup> | Median forward | 0.660 | 0.753 | 0.588 | 0.952 | 0.735 |
Gradient boosting | Last 2 visits<sup>a</sup> | Zero imputation | 0.290 | 0.833 | 0.175 | 0.944 | 0.702 |
Gradient boosting | Last 2 visits<sup>a</sup> | Median forward | 0.689 | 0.745 | 0.640 | 0.957 | 0.728 |
Random forest | Last 2 visits<sup>a</sup> | Zero imputation | 0.068 | > 0.999 | 0.035 | 0.928 | 0.661 |
Random forest | Last 2 visits<sup>a</sup> | Median forward | 0.578 | 0.788 | 0.456 | 0.955 | 0.739 |
Elastic net | Last 2 visits<sup>a</sup> | Zero imputation | 0.647 | 0.566 | 0.754 | 0.942 | 0.702 |
Elastic net | Last 2 visits<sup>a</sup> | Median forward | 0.650 | 0.756 | 0.570 | 0.943 | 0.716 |
Bidirectional recurrent neural network | Full sequence; all previous visits | Zero imputation | 0.605 | 0.581 | 0.632 | 0.938 | 0.649 |
Bidirectional recurrent neural network | Full sequence; all previous visits | Median forward | 0.661 | 0.632 | 0.693 | 0.940 | 0.737 |
Bidirectional attention recurrent neural network | Full sequence; all previous visits | Zero imputation | 0.664 | 0.630 | 0.702 | 0.931 | 0.678 |
Bidirectional attention recurrent neural network | Full sequence; all previous visits | Median forward | 0.664 | 0.699 | 0.632 | 0.934 | 0.693 |
Manually built logistic regression model (short model) | Last 2 visits<sup>a</sup> | None | 0.453 | 0.310 | 0.842 | 0.893 | 0.504 |
Prediction 365 d in advance | |||||||
Data-driven machine learning models (full models) | |||||||
Multilayer perceptron | Last 2 visits<sup>a</sup> | Zero imputation | 0.641 | 0.691 | 0.598 | 0.950 | 0.699 |
Multilayer perceptron | Last 2 visits<sup>a</sup> | Median forward | 0.628 | 0.776 | 0.527 | 0.950 | 0.722 |
Gradient boosting | Last 2 visits<sup>a</sup> | Zero imputation | 0.220 | 0.933 | 0.125 | 0.945 | 0.700 |
Gradient boosting | Last 2 visits<sup>a</sup> | Median forward | 0.619 | 0.663 | 0.580 | 0.941 | 0.710 |
Random forest | Last 2 visits<sup>a</sup> | Zero imputation | 0.018 | > 0.999 | 0.009 | 0.941 | 0.705 |
Random forest | Last 2 visits<sup>a</sup> | Median forward | 0.527 | 0.800 | 0.393 | 0.952 | 0.725 |
Elastic net | Last 2 visits<sup>a</sup> | Zero imputation | 0.588 | 0.626 | 0.554 | 0.938 | 0.673 |
Elastic net | Last 2 visits<sup>a</sup> | Median forward | 0.512 | 0.808 | 0.375 | 0.935 | 0.681 |
Bidirectional recurrent neural network | Full sequence; all previous visits | Zero imputation | 0.606 | 0.656 | 0.562 | 0.945 | 0.631 |
Bidirectional recurrent neural network | Full sequence; all previous visits | Median forward | 0.678 | 0.661 | 0.696 | 0.935 | 0.694 |
Bidirectional attention recurrent neural network | Full sequence; all previous visits | Zero imputation | 0.600 | 0.643 | 0.562 | 0.928 | 0.632 |
Bidirectional attention recurrent neural network | Full sequence; all previous visits | Median forward | 0.633 | 0.554 | 0.738 | 0.926 | 0.692 |
Manually built logistic regression model (short model) | Last 2 visits<sup>a</sup> | None | 0.423 | 0.286 | 0.812 | 0.883 | 0.468 |
Abbreviations: PR-AUC, area under the precision-recall curve; ROC-AUC, area under the receiver operating characteristic curve.
<sup>a</sup> And summary statistics from earlier visits during the target observation period, as detailed in the Methods.
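
The F-score column can be related to the Precision and Recall columns with a short calculation. The sketch below assumes that the reported F-score is the balanced F1 score (the harmonic mean of precision and recall); the table does not state this explicitly, but the rounded values in several rows are consistent with it. The function name `f_score` and the selection of rows are illustrative only.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F1 score: harmonic mean of precision and recall.

    Assumption: the table's "F-score" column is F1 (beta = 1).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# A few (precision, recall, reported F-score) triples copied from the
# 90-day block of the table; the labels are shorthand for the row names.
rows = {
    "Gradient boosting, median forward": (0.875, 0.905, 0.890),
    "Random forest, zero imputation": (0.942, 0.422, 0.583),
    "Logistic regression (short model)": (0.689, 0.974, 0.807),
}

for label, (precision, recall, reported) in rows.items():
    recomputed = f_score(precision, recall)
    # Small differences are expected because the inputs are rounded
    # to three decimal places in the table.
    print(f"{label}: recomputed={recomputed:.3f}, reported={reported:.3f}")
```

Under this assumption, the calculation also shows why the random forest rows with zero imputation have very low F-scores at longer horizons despite precision above 0.999: the harmonic mean is dominated by the smaller of the two values, which here is recall.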