Author manuscript; available in PMC: 2023 Sep 12.

Published in final edited form as: IEEE J Biomed Health Inform. 2023 Feb 3;27(2):1106–1117. doi: 10.1109/JBHI.2022.3224727

Table IV. Performance Comparison On Transferring Pre-Trained Representations To Risk Prediction Task.

Task	Subset sample size (% of full training set)	No. (%) of positive	Hi-BEHRT without pretraining, AUROC	Hi-BEHRT without pretraining, AUPRC	Hi-BEHRT with pretraining, AUROC	Hi-BEHRT with pretraining, AUPRC
HF	13,827 (1%)	633 (4.5)	0.84	0.19	0.86	0.23
HF	69,136 (5%)	3,317 (4.8)	0.86	0.23	0.88	0.26
Diabetes	13,201 (1%)	643 (4.9)	0.70	0.13	0.73	0.16
Diabetes	66,003 (5%)	3,223 (4.9)	0.79	0.22	0.79	0.22
CKD	13,228 (1%)	1,224 (9.3)	0.73	0.22	0.76	0.26
CKD	66,140 (5%)	6,110 (9.2)	0.77	0.26	0.80	0.33
Stroke	12,114 (1%)	1,579 (13.0)	0.65	0.21	0.67	0.23
Stroke	60,568 (5%)	7,827 (13.0)	0.68	0.24	0.76	0.41

No. (%) of positive gives the number of positive cases in each subset and their percentage of that subset. For example, the 1% HF subset contains 633 positive cases, which is around 4.5 percent of its 13,827 patients.
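The percentage in the footnote is the number of positive cases divided by the subset size. A minimal Python sketch of that arithmetic follows; the helper name `positive_rate` is illustrative and does not come from the paper, and the inputs are the values from the 1% HF row of Table IV.

```python
def positive_rate(n_positive: int, n_total: int) -> float:
    """Return positive cases as a percentage of the subset size."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_positive / n_total


# Values taken from the 1% HF row of Table IV.
hf_rate = positive_rate(633, 13_827)
print(f"HF, 1% subset: {hf_rate:.2f}% positive")
```

The computed value agrees with the "around 4.5 percent" quoted in the footnote to within about 0.1 percentage points.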