Table 1.
Statistic | Logistic, Markov Blanket set (Nsim=100,000) | Logistic, all 24 variables (Nsim=100,000) | Logistic, any variables with a path to the outcome (Nsim=100,000) | Logistic, node’s parent variables (Nsim=100,000) | Lasso, all 24 variables (Nsim=100,000) | Ridge, all 24 variables (Nsim=100,000) | Elastic net, all 24 variables (Nsim=100,000) | Random forest, all 24 variables (Nsim=100,000) |
---|---|---|---|---|---|---|---|---|
FULL RESULTS: Including all simulated datasets | ||||||||
ICI | ||||||||
N Missing | 8032 | 0 | 8032 | 37,272 | 8597 | 0 | 8612 | 1 |
Mean (SD) | 0.01882 (0.00445) | 0.01964 (0.00495) | 0.01900 (0.00461) | 0.02215 (0.00421) | 0.01912 (0.00451) | 0.03807 (0.02058) | 0.01907 (0.00456) | 0.04133 (0.01779) |
Median | 0.01857 | 0.01925 | 0.01867 | 0.02242 | 0.01888 | 0.02895 | 0.01881 | 0.03636 |
Range | 0.00290–0.03834 | 0.00289–0.04330 | 0.00287–0.04330 | 0.00290–0.03826 | 0.00287–0.03919 | 0.00710–0.18537 | 0.00340–0.04283 | 0.00704–0.16493 |
Number of input variables | ||||||||
N Missing | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Mean (SD) | 4.0 (2.8) | 24.0 (0.0) | 18.9 (7.0) | 1.2 (1.3) | 24.0 (0.0) | 24.0 (0.0) | 24.0 (0.0) | 24.0 (0.0) |
Median | 3.0 | 24.0 | 22.0 | 1.0 | 24.0 | 24.0 | 24.0 | 24.0 |
Range | 0.0–19.0 | 24.0–24.0 | 0.0–24.0 | 0.0–9.0 | 24.0–24.0 | 24.0–24.0 | 24.0–24.0 | 24.0–24.0 |
Direct comparison: ICI of various methods compared to Markov Blanket-based logistic tool | ||||||||
N Missing | 8032 | 8032 | 8032 | 37,272 | 9140 | 8032 | 9147 | 8033 |
< ICI logistic MB, N (%) | | 39,354 (42.79%) | 39,540 (42.99%) | 4864 (7.75%) | 26,514 (29.18%) | 8871 (9.65%) | 31,089 (34.22%) | 1650 (1.79%) |
≥ ICI logistic MB, N (%) | | 52,614 (57.21%) | 52,428 (57.01%) | 57,864 (92.25%) | 64,346 (70.82%) | 83,097 (90.35%) | 59,764 (65.78%) | 90,317 (98.21%) |
COMPLETE CASE RESULTS: only including datasets for which ICI could be estimated for all tools | ||||||||
ICI | ||||||||
N Missing | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 |
Mean (SD) | 0.01956 (0.00463) | 0.01975 (0.00477) | 0.01970 (0.00473) | 0.02211 (0.00421) | 0.01995 (0.00471) | 0.03886 (0.02177) | 0.01990 (0.00476) | 0.04049 (0.02011) |
Median | 0.01953 | 0.01962 | 0.01960 | 0.02238 | 0.01993 | 0.02883 | 0.01987 | 0.03283 |
Range | 0.00290–0.03834 | 0.00289–0.04330 | 0.00287–0.04330 | 0.00290–0.03826 | 0.00287–0.03919 | 0.00710–0.18537 | 0.00340–0.04283 | 0.00704–0.16493 |
Number of input variables | ||||||||
N Missing | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 |
Mean (SD) | 4.1 (2.7) | 24.0 (0.0) | 20.8 (3.9) | 1.9 (1.1) | 24.0 (0.0) | 24.0 (0.0) | 24.0 (0.0) | 24.0 (0.0) |
Median | 4.0 | 24.0 | 22.0 | 2.0 | 24.0 | 24.0 | 24.0 | 24.0 |
Range | 1.0–19.0 | 24.0–24.0 | 1.0–24.0 | 1.0–9.0 | 24.0–24.0 | 24.0–24.0 | 24.0–24.0 | 24.0–24.0 |
Direct comparison: ICI of various methods compared to Markov Blanket-based logistic tool | ||||||||
N Missing | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 |
< ICI logistic MB, N (%) | | 26,872 (43.23%) | 27,124 (43.64%) | 4850 (7.80%) | 16,887 (27.17%) | 6508 (10.47%) | 19,959 (32.11%) | 1636 (2.63%) |
≥ ICI logistic MB, N (%) | | 35,287 (56.77%) | 35,035 (56.36%) | 57,309 (92.20%) | 45,272 (72.83%) | 55,651 (89.53%) | 42,200 (67.89%) | 60,523 (97.37%) |
Results for ICI and number of input variables for the eight investigated prediction tools across 100,000 simulated datasets. Both the full results and the complete case results (restricted to datasets for which ICI could be estimated for all tools) are presented.
Abbreviations: ICI integrated calibration index, MB Markov Blanket, Nsim number of simulations, SD standard deviation
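
The percentages in the direct-comparison rows appear to be taken over the datasets in which the comparison could be made, that is, Nsim minus N Missing for that tool. The following Python sketch, which assumes that denominator, checks the full-results percentages against the reported counts. The function name and the dictionary layout are illustrative and do not come from the original analysis.

```python
# Counts copied from the full-results "Direct comparison" rows of Table 1.
N_SIM = 100_000

# tool -> (N missing, N with ICI < logistic MB, N with ICI >= logistic MB)
full_results = {
    "Logistic, all 24 variables": (8032, 39354, 52614),
    "Logistic, any path to outcome": (8032, 39540, 52428),
    "Logistic, parent variables": (37272, 4864, 57864),
    "Lasso": (9140, 26514, 64346),
    "Ridge": (8032, 8871, 83097),
    "Elastic net": (9147, 31089, 59764),
    "Random forest": (8033, 1650, 90317),
}


def comparison_percentages(n_missing, n_lower, n_not_lower, n_sim=N_SIM):
    """Return (% lower, % not lower) among datasets where a comparison exists.

    Assumption: the denominator is n_sim - n_missing.
    """
    n_compared = n_sim - n_missing
    # The two counts should partition the non-missing datasets.
    assert n_lower + n_not_lower == n_compared
    return (
        round(100 * n_lower / n_compared, 2),
        round(100 * n_not_lower / n_compared, 2),
    )


percentages = {
    tool: comparison_percentages(*counts) for tool, counts in full_results.items()
}

for tool, (lower, not_lower) in percentages.items():
    print(f"{tool}: {lower:.2f}% lower, {not_lower:.2f}% not lower")
```

Under this assumption, the two counts in each column sum to exactly Nsim minus N Missing, which is consistent with the first comparison cell (the Markov Blanket tool against itself) being empty.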