Author manuscript; available in PMC: 2022 Oct 1.
Published in final edited form as: Biom J. 2021 May 24;63(7):1375–1388. doi: 10.1002/bimj.202000199

Table 4.

Discrimination in development and prospective validation sets measured by AUC (95% CIs).

| Sampling framework: training/test split | Sampling framework: cross-validation split | Sampling framework: model estimation | Logistic regression with LASSO: development test set | Logistic regression with LASSO: prospective validation set | Random forest: development test set | Random forest: prospective validation set |
| --- | --- | --- | --- | --- | --- | --- |
| Visit | Visit | Observed cluster analysis | 0.867 (0.860, 0.873) | 0.849 (0.846, 0.851) | 0.950 (0.946, 0.954) | 0.836 (0.833, 0.838) |
| Visit | Person | Observed cluster analysis | 0.862 (0.856, 0.868) | 0.853 (0.850, 0.855) | 0.907 (0.901, 0.912) | 0.853 (0.850, 0.855) |
| Person | Person | Observed cluster analysis | 0.854 (0.847, 0.861) | 0.847 (0.845, 0.850) | 0.856 (0.849, 0.862) | 0.847 (0.844, 0.849) |
| Person | Person | Within cluster resampling | 0.863 (0.857, 0.869) | 0.854 (0.852, 0.856) | 0.857 (0.851, 0.864) | 0.847 (0.845, 0.849) |

Development test set includes 531,639 visits (141,968 people, 1,517 unique events) for the visit-level training/test split and 531,930 visits (72,771 people, 841 unique events) for the person-level training/test split.

Prospective validation set includes 4,286,495 visits (660,659 people, 6,678 unique events).