Table 1. Baseline characteristics of the enrolled population.
| Variables | All patients (n=21,942), n (%) | Training cohort (n=15,362), n (%) | Validation cohort (n=6,580), n (%) | P |
|---|---|---|---|---|
| Age | | | | 0.406ᵃ |
| <45 years | 4,302 (19.6) | 3,047 (19.8) | 1,255 (19.1) | |
| 45–59 years | 9,009 (41.1) | 6,373 (41.5) | 2,636 (40.1) | |
| ≥60 years | 8,631 (39.3) | 5,942 (38.7) | 2,689 (40.9) | |
| Race | | | | 0.563 |
| White | 17,496 (79.7) | 12,248 (79.7) | 5,248 (79.8) | |
| Black | 2,574 (11.7) | 1,819 (11.8) | 755 (11.5) | |
| Otherᵇ | 1,872 (8.5) | 1,295 (8.4) | 577 (8.8) | |
| Laterality | | | | 0.477 |
| Left | 11,154 (50.8) | 7,785 (50.7) | 3,369 (51.2) | |
| Right | 10,788 (49.2) | 7,577 (49.3) | 3,211 (48.8) | |
| Marriage | | | | 0.362 |
| Married | 13,104 (59.7) | 9,144 (59.5) | 3,960 (60.2) | |
| Not marriedᶜ | 8,838 (40.3) | 6,218 (40.5) | 2,620 (39.8) | |
| Grade | | | | 0.100 |
| I | 2,566 (11.7) | 1,794 (11.7) | 772 (11.7) | |
| II | 9,640 (43.9) | 6,682 (43.5) | 2,958 (45.0) | |
| III & IV | 9,736 (44.4) | 6,886 (44.8) | 2,850 (43.3) | |
| Histologyᵈ | | | | 0.902 |
| Ductal carcinoma | 15,476 (70.5) | 10,835 (70.5) | 4,641 (70.5) | |
| Lobular carcinoma | 2,266 (10.3) | 1,599 (10.4) | 667 (10.1) | |
| Mixed carcinoma | 3,182 (14.5) | 2,215 (14.4) | 967 (14.7) | |
| Other | 1,018 (4.6) | 713 (4.6) | 305 (4.6) | |
| Stage T | | | | 0.660 |
| T1–2 | 16,547 (75.4) | 11,572 (75.3) | 4,975 (75.6) | |
| T3–4 | 5,395 (24.6) | 3,790 (24.7) | 1,605 (24.4) | |
| LNM | | | | 0.574 |
| 1–3 | 13,462 (61.4) | 9,393 (61.1) | 4,069 (61.8) | |
| 4–6 | 3,344 (15.2) | 2,336 (15.2) | 1,008 (15.3) | |
| 7–9 | 1,762 (8.0) | 1,255 (8.2) | 507 (7.7) | |
| ≥10 | 3,374 (15.4) | 2,378 (15.5) | 996 (15.1) | |
| Breast surgery type | | | | 0.705 |
| No surgeryᵉ/BCS | 4,396 (20.0) | 3,088 (20.1) | 1,308 (19.9) | |
| Mastectomy | 17,546 (80.0) | 12,274 (79.9) | 5,272 (80.1) | |
| Radiotherapy | | | | 0.674 |
| Yes | 11,192 (51.0) | 7,850 (51.1) | 3,342 (50.8) | |
| No | 10,750 (49.0) | 7,512 (48.9) | 3,238 (49.2) | |
| Chemotherapy | | | | 0.482 |
| Yes | 15,771 (71.9) | 11,063 (72.0) | 4,708 (71.6) | |
| No/unknown | 6,171 (28.1) | 4,299 (28.0) | 1,872 (28.4) | |
| Subtypeᶠ | | | | 0.233 |
| HR+, HER2+ | 1,104 (5.0) | 762 (5.0) | 342 (5.2) | |
| HR+, HER2− | 6,510 (29.7) | 4,529 (29.5) | 1,981 (30.1) | |
| HR−, HER2+ | 445 (2.0) | 294 (1.9) | 151 (2.3) | |
| HR−, HER2− | 720 (3.3) | 505 (3.3) | 215 (3.3) | |
| Unknown | 13,163 (60.0) | 9,272 (60.4) | 3,891 (59.1) | |

a, obtained by the Cochran-Mantel-Haenszel test. b, “Other” includes American Indian/Alaska Native and Asian/Pacific Islander, as recorded in the SEER database. c, “Not married” includes divorced, separated, single, unmarried or domestic partner, and widowed. d, “Mixed carcinoma” includes infiltrating duct mixed with other types of carcinoma, infiltrating lobular mixed with other types of carcinoma, and infiltrating duct and lobular carcinoma; “Other” means histological types other than the three types above. e, “No surgery” means that the primary breast lesion was not operated on; the axilla alone may have been operated on. f, “HR” refers to ER and PR status: “HR+” means that ER or PR expression is positive; “HR−” means that both ER and PR expression are negative; “Unknown” means unknown HER2 expression in the SEER database. LNM, lymph node metastasis; BCS, breast-conserving surgery; HR, hormone receptor; HER2, human epidermal growth factor receptor 2; ER, estrogen receptor; PR, progesterone receptor.
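
The between-cohort P values in Table 1 can be checked from the counts alone. The table does not state which test produced the unmarked P values, so the sketch below assumes a Pearson chi-square test without continuity correction and applies it to the laterality row. The function name `pearson_chi2_2x2` is hypothetical and introduced only for illustration.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table [[a, b], [c, d]].

    Assumption: this is the test behind the unmarked P values; the table
    footnotes only name the test used for age.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Laterality counts from Table 1: rows = Left, Right; columns = training, validation.
chi2, p = pearson_chi2_2x2(7785, 3369, 7577, 3211)
print(f"chi2 = {chi2:.3f}, P = {p:.3f}")
```

Under these assumptions the result should be close to the P of 0.477 reported for laterality. Variables with more than two levels would need the general chi-square test with the matching degrees of freedom, which this two-by-two sketch does not cover.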