2023 Sep 2;30(12):2072–2082. doi: 10.1093/jamia/ocad168

Table 1.

Description of the parameters related to the dataset information.

Parameter	Description
Country of data source
United States	50 (61.7)
China	13 (16.0)
Others	15 (18.5)
Not reported	3 (3.7)
Follow-up time, in years	[0.8, 24], 8.9 (5.6), 10 [4–12], (70.4)
Sample size	[398, 29 163 297], 585 981 [3 611 801], 32 221 [7845–105 805], (18.5)
Demographic descriptors
Median age	[29, 74.6], 55.6 (13.4), 56.1 [46.2–65.4], (23.5)
Sex (female)	[0, 100], 51 (20.6), 49.8 [43–61.4], (29.6)
Socioeconomic information reported	6 (7.4)
Age limit definition	19 (23.5)
Main condition definition	27 (8.6)
Information included
Demographics	39 (48.1)
Diagnoses	60 (74.1)
Laboratory results	38 (46.9)
Prescribed/billed drugs	43 (53.1)
Medical procedures	31 (38.3)
Clinical measures	31 (38.3)
Information included simultaneously (n)	[1, 6], 3 (1.3), 3 [2–4], (98.8)
Others (eg, free text, genes…)	17 (21.0)
Types of variables
All quantitative	7 (8.6)
All categorical	39 (48.1)
Both	35 (43.2)

Categorical parameters are reported as N (%); quantitative parameters are reported as [min, max], mean (SD), median [Q1–Q3], (% of studies reporting the parameter). Total N = 81 studies. If more than one dataset was used to train the models, only the parameters of the largest dataset were collected.
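The reporting format described in the table note can be illustrated with a minimal Python sketch. The function names `describe_categorical` and `describe_quantitative` are hypothetical, and the choice of sample SD and inclusive quartiles is an assumption, since the source does not state which estimators were used. A value of `None` stands for a study that did not report the parameter.

```python
import statistics


def describe_categorical(count, total):
    """Format a categorical parameter as 'N (%)' with one decimal place."""
    return f"{count} ({100 * count / total:.1f})"


def describe_quantitative(values, total):
    """Format a quantitative parameter as
    '[min, max], mean (SD), median [Q1–Q3], (% studies reported)'.

    `values` holds one entry per study; None marks a study that did not
    report the parameter. `total` is the number of studies reviewed.
    """
    reported = [v for v in values if v is not None]
    # Assumption: inclusive quartiles and sample SD; the source does not say.
    q1, median, q3 = statistics.quantiles(reported, n=4, method="inclusive")
    mean = statistics.mean(reported)
    sd = statistics.stdev(reported)
    pct_reported = 100 * len(reported) / total
    return (
        f"[{min(reported):g}, {max(reported):g}], "
        f"{mean:.1f} ({sd:.1f}), "
        f"{median:g} [{q1:g}–{q3:g}], "
        f"({pct_reported:.1f})"
    )


if __name__ == "__main__":
    # 50 of the 81 reviewed studies used data from the United States.
    print(describe_categorical(50, 81))
    # Made-up follow-up times for five studies, one of them not reported.
    print(describe_quantitative([2.0, 4.0, 6.0, 8.0, None], 5))
```

The first call reproduces the "United States" row of the table from its count and the total of 81 studies. The second call uses invented values only to show the layout of a quantitative row.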