2026 Feb 6;13:e84318. doi: 10.2196/84318

Table 1.

Cohort characteristics and dataset splits used for model development and evaluation. After final model selection, the full 2020 to 2023 dataset (model development) was used to train the Predictive Risk Identification for Mental Health Events model, which was then evaluated on 2024 data (model evaluation) to assess real-world performance.


| Characteristic | Model development (2020-2023), held-out patients: development set | Model development (2020-2023), held-out patients: test set | Model development (2020-2023), out-of-time patients: development set | Model development (2020-2023), out-of-time patients: test set | Model evaluation (2024): evaluation set |
| --- | --- | --- | --- | --- | --- |
| Data split (number of patient encounters) | 281,022 | 73,763 | 259,257 | 95,528 | 48,313 |
| Patients, n (%) | 3000 (80) | 751 (20) | 2851 (70.3) | 1202 (29.7) | 900 (100) |
| Period | January 1, 2020, to December 31, 2023 | January 1, 2020, to December 31, 2023 | January 1, 2020, to December 31, 2022 | January 1, 2023, to December 31, 2023 | January 1, 2024, to August 19, 2024 |
| LOS^a, mean (SD) | 629.22 (632.67) | 530.41 (654.51) | 708.04 (676.56) | 336.90 (390.40) | 134.99 (117.83) |
| Sex, n (%) | | | | | |
| Female | 839 (27.98) | 231 (30.81) | 845 (29.63) | 310 (25.75) | 235 (26.05) |
| Male | 2045 (68.17) | 502 (66.82) | 1912 (67.08) | 842 (70.05) | 633 (70.37) |
| Other | 116 (3.85) | 18 (2.37) | 94 (3.29) | 50 (4.20) | 32 (3.58) |
| Sexual orientation, n (%) | | | | | |
| Heterosexual | 1878 (62.59) | 472 (62.90) | 1833 (64.31) | 699 (58.15) | 541 (60.14) |
| Other | 1122 (37.41) | 279 (37.10) | 1018 (35.69) | 503 (41.85) | 359 (39.86) |
| Race, n (%) | | | | | |
| Black | 273 (9.10) | 36 (4.73) | 224 (7.85) | 109 (9.03) | 73 (8.10) |
| First Nations | 61 (2.05) | 18 (2.45) | 63 (2.21) | 23 (1.92) | 19 (2.13) |
| White | 1987 (66.23) | 572 (76.20) | 2000 (70.15) | 763 (63.49) | 570 (63.28) |
| Other races | 679 (22.62) | 125 (16.62) | 564 (19.78) | 307 (25.57) | 238 (26.49) |
| Incident prevalence | | | | | |
| Total number of incidents | 11,744 | 2569 | 10,688 | 3625 | 2106 |
| Patients, n (%) | 762 | 209 | 766 | 342 | 266 |

^a LOS: length of stay in days.
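
The table reports two ways of dividing the 2020 to 2023 encounters: by patient (held-out patients) and by calendar date (out-of-time patients). The sketch below illustrates the general idea of each split on toy data. It is not the authors' code: the field names `patient_id` and `date`, the helper names, the 20% test fraction, and the random seed are all assumptions made for illustration. Only the January 1, 2023, cutoff is taken from the Period row.

```python
import random
from datetime import date


def held_out_split(encounters, test_fraction=0.2, seed=0):
    """Split by patient: all encounters of a given patient land on one side only.

    The 20% fraction and the seed are illustrative assumptions.
    """
    patient_ids = sorted({e["patient_id"] for e in encounters})
    rng = random.Random(seed)
    rng.shuffle(patient_ids)
    n_test = round(len(patient_ids) * test_fraction)
    test_ids = set(patient_ids[:n_test])
    dev = [e for e in encounters if e["patient_id"] not in test_ids]
    test = [e for e in encounters if e["patient_id"] in test_ids]
    return dev, test


def out_of_time_split(encounters, cutoff):
    """Split by date: encounters before the cutoff are development, the rest are test."""
    dev = [e for e in encounters if e["date"] < cutoff]
    test = [e for e in encounters if e["date"] >= cutoff]
    return dev, test


# Toy data (hypothetical): 10 patients, one encounter per year from 2020 to 2023.
encounters = [
    {"patient_id": pid, "date": date(year, 6, 1)}
    for pid in range(10)
    for year in (2020, 2021, 2022, 2023)
]

dev_h, test_h = held_out_split(encounters)
dev_t, test_t = out_of_time_split(encounters, cutoff=date(2023, 1, 1))
```

In a date-based split the same patient can contribute encounters to both sides, whereas a patient-based split keeps each patient on one side. This is consistent with the table, where the out-of-time patient counts (2851 and 1202) sum to more than the held-out counts (3000 and 751), while both splits cover the same 354,785 encounters.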