2025 May 12;16:3797. doi: 10.1038/s41467-025-59092-8

Table 3.

Demographics, smoking history, and overall survival (OS) statistics across subphenotypes in the development sub-cohort

Variables	Subphenotype 1	Subphenotype 2	Subphenotype 3	P-value	Post-hoc pair-wise analysis with p-value < 0.05
Patients, n (%)	1355 (42.0%)	450 (14.0%)	1420 (44.0%)
Age at index date, years, mean ± SD	68.71 ± 9.39	68.77 ± 9.54	68.64 ± 9.28	0.966	-
Gender, n (%)				1.61E-29	1 vs. 2, 1 vs. 3
Female	752 (55.50%)	152 (33.78%)	500 (35.21%)
Male	603 (44.50%)	298 (66.22%)	920 (64.79%)
Race, n (%)				0.321	-
White	1051 (77.56%)	342 (76.00%)	1125 (79.23%)
Non-White	304 (22.44%)	108 (24.00%)	295 (20.77%)
Practice type, n (%)				3.47E-06	1 vs. 2, 2 vs. 3, 1 vs. 3
Academic	209 (15.42%)	100 (22.22%)	175 (12.32%)
Community	1146 (84.58%)	350 (77.78%)	1245 (87.68%)
Smoking status, n (%)				0.181	-
History of smoking	1201 (88.63%)	407 (90.44%)	1282 (90.28%)
No history of smoking	139 (10.26%)	34 (7.56%)	123 (8.66%)
Overall Survival				1.11E-40	1 vs. 2, 2 vs. 3, 1 vs. 3
Survival days, mean ± SD	676 ± 543	454 ± 446	321 ± 387
Survival days, median (Q1, Q3)	516 (248, 982)	305 (137, 611)	180 (60, 442)
Observed events, n (%)	795 (58.67%)	341 (75.78%)	1137 (80.07%)

P-value: The p-value was computed by testing differences in each variable across the three subphenotypes. Continuous normally and non-normally distributed variables were tested using one-way analysis of variance or the Kruskal–Wallis test, respectively, and categorical variables were tested using the one-sided Fisher’s exact test.

Post-hoc pairwise analysis: If the overall p-value across groups was statistically significant (p < 0.05) for a variable, post-hoc pairwise analysis was performed to identify which pairs of subphenotypes differed. Continuous normally and non-normally distributed variables were tested using Tukey’s honestly significant difference test or the pairwise Wilcoxon rank-sum test, respectively, and categorical variables were tested using the pairwise Fisher’s test. For instance, "1 vs. 2" indicates a statistically significant difference (p < 0.05) between subphenotypes 1 and 2 for that variable.

SD, standard deviation.
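As an illustration of the post-hoc pairwise step for a categorical variable, the following is a minimal sketch that applies a pairwise Fisher's exact test to the gender counts in Table 3. It is not the authors' code. The function names `fisher_exact_two_sided` and `pairwise_gender_p` are hypothetical, and the sketch assumes the two-sided form of the test (summing the probabilities of all tables no more likely than the observed one), because the footnote does not say which form was used for the pairwise comparisons. No multiple-comparison adjustment is applied, since none is described.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumption: the two-sided variant is used here for illustration only.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    # Unnormalised hypergeometric weight of the table whose top-left cell
    # is x, with all row and column totals held fixed.
    def weight(x):
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)

    # Sum the weights of every table that is at most as probable as the
    # observed one. Integer arithmetic keeps the "<=" comparison exact.
    tail = sum(w for w in map(weight, range(lo, hi + 1)) if w <= observed)
    return tail / comb(n, col1)


# (female, male) counts per subphenotype, copied from Table 3.
gender = {1: (752, 603), 2: (152, 298), 3: (500, 920)}


def pairwise_gender_p(i, j):
    """P-value for the gender distribution of subphenotype i vs. j."""
    return fisher_exact_two_sided(*gender[i], *gender[j])


for i, j in [(1, 2), (1, 3), (2, 3)]:
    p = pairwise_gender_p(i, j)
    flag = "significant" if p < 0.05 else "not significant"
    print(f"{i} vs. {j}: p = {p:.3g} ({flag})")
```

With the Table 3 counts, this sketch flags 1 vs. 2 and 1 vs. 3 at p < 0.05 but not 2 vs. 3, which agrees with the pairs listed in the post-hoc column for gender. The exact p-values it prints need not match those obtained by the authors' software.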