Medicine. 2024 Mar 15;103(11):e37330. doi: 10.1097/MD.0000000000037330

Construction and validation of nomogram for the cancer-specific mortality for HER2-positive breast cancer patients

Nan Wu
PMCID: PMC10939670  PMID: 38489717

Abstract

The cancer-specific mortality (CSM) of patients with human epidermal growth factor receptor 2 positive (HER2+) breast cancer remains high and varies widely from person to person. We therefore aimed to construct a nomogram to predict CSM in HER2+ breast cancer using data from the Surveillance, Epidemiology, and End Results (SEER) database. Clinicopathological data of patients diagnosed with HER2+ breast cancer from 2000 to 2019 were extracted from the SEER database. Independent prognostic factors for CSM were identified with a competing risk model, and a predictive nomogram was then constructed. Calibration curves, receiver operating characteristic (ROC) curves, and decision curve analysis were used to evaluate the performance of the nomogram. A total of 45,362 breast cancer patients in the SEER database were selected and randomly separated into training (n = 31,753) and validation (n = 13,609) cohorts. Univariate and multivariate analyses showed that age, race, tumor grade, clinical stage, T stage, surgery status, radiotherapy, chemotherapy, and regional nodes examined were independent risk factors for CSM in HER2+ breast cancer patients. The ROC curves of the nomogram indicated that the 1-, 3-, and 5-year AUCs were 0.874, 0.843, and 0.820 in the training cohort and 0.861, 0.845, and 0.825 in the validation cohort, respectively. The C-index was 0.817 in the training cohort and 0.821 in the validation cohort. Moreover, the calibration curves showed good agreement between the observed outcomes and the predicted probabilities in both cohorts. Decision curve analysis further demonstrated good clinical utility of the nomogram in both cohorts.
The nomogram shows good accuracy and reliability in predicting the CSM of breast cancer patients, and it could provide some theoretical support for clinicians to make decisions.

Keywords: Breast cancer, HER2, nomogram, prognosis factor

1. Introduction

Breast cancer is the most prevalent cancer among women worldwide and the second leading cause of cancer-related death, with a lifetime risk of about 13% and 13.6 deaths per 100,000 people.[1] As a heterogeneous disease, breast cancer varies widely in both biological and clinical behavior.[2] Based on the expression or amplification of estrogen receptor, progesterone receptor, and human epidermal growth factor receptor 2 (HER2), breast cancer is classified into luminal A type, luminal B type, HER2 type, and triple-negative breast cancer.[3] Clinicians often perform risk stratification and prognosis prediction according to molecular subtype in order to choose the appropriate treatment and follow-up plan.

HER2 is a receptor tyrosine kinase encoded by the ERBB2 gene. HER2+ breast cancer refers to breast cancer with HER2 amplification and accounts for about 15% to 20% of breast cancer cases.[4] Because of its high aggressiveness and recurrence rate, HER2+ breast cancer has a poor clinical outcome, with a median overall survival of only 5 years for patients in its early stages.[5,6] Currently, several drugs targeting the HER2 family, including trastuzumab, pertuzumab, and nilotinib, are used in the treatment of HER2+ breast cancer.[7] However, about 1 in 5 patients with HER2+ breast cancer develop metastatic disease after initial treatment.[8] Once metastatic disease occurs, quality of life and survival time are severely reduced, making metastasis the leading cause of death.[9]

In this study, we screened the independent prognostic factors of cancer-specific mortality (CSM) in HER2+ breast cancer patients and constructed a nomogram to predict the CSM risk of patients based on data from the Surveillance, Epidemiology, and End Results (SEER) database, which may provide a reference for the clinical identification of high-risk patients and for individualized treatment decisions.

2. Method

2.1. Data source

From the SEER database, we obtained data on HER2+ infiltrating breast cancer from 17 registries, covering about 26.5% of the US population in the 2020 census. Cancer registry staff regularly retrieve patients' demographic information from health services. Cases diagnosed with breast cancer between 2000 and 2019 were included in our study and processed with SEER*Stat software (version 8.4.2). The variables included age at diagnosis, grade, tumor size, node stage, radiotherapy and chemotherapy status, survival data, etc. Histological types were determined according to the International Classification of Diseases for Oncology, 3rd edition (ICD-O-3), and tumor stage was classified according to the American Joint Committee on Cancer staging system, 7th edition. Because the data in this study come from the SEER database, ethical review was exempted and informed consent was not required.

2.2. Patient selection

Inclusion criteria were as follows: female sex; pathological type of invasive ductal carcinoma (ICD-O-3 8500/3); HER2+ molecular subtype; age 18 to 85 years at diagnosis; breast cancer as the first cancer diagnosis; and complete follow-up data with a known cause of death and survival time. Died from breast cancer. Exclusion criteria were: incomplete data; survival time <6 months; and cases identified by death certificate or autopsy.

2.3. Statistical analysis

The patients were randomly divided into training and internal validation cohorts in a 7:3 ratio by random sampling. The chi-squared test and Student t test were used to evaluate differences between groups in categorical and continuous variables, respectively. Independent prognostic factors of CSM were screened by univariate and multivariate analysis of clinical variables using a competing risk model. The nomogram was constructed from the training set with the “rms” and “survival” packages in R and verified in the internal validation set. The Harrell concordance index (C-index), receiver operating characteristic (ROC) curves, and calibration curves were used to evaluate the performance of the nomogram. Decision curve analysis was used to evaluate its clinical benefit. Statistical analyses were performed with SPSS 25.0 (IBM Corp., USA) and R 4.1.1 (R Core Team, Austria). All tests were 2-sided, and P < .05 was considered statistically significant.
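The 7:3 random split described above can be sketched in a few lines. This is only an illustration: the original analysis was run in SPSS and R, and the function name `split_cohort`, the seed value, and the use of integer patient IDs are assumptions made here, not details from the study.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=2024):
    """Randomly split patient IDs into training and validation lists.

    The seed is a hypothetical choice so the split is reproducible.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    # Floor of n * fraction gives the training cohort size.
    n_train = int(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# 45,362 eligible patients, represented here by placeholder integer IDs.
train_ids, validation_ids = split_cohort(range(45362))
print(len(train_ids), len(validation_ids))
```

With 45,362 records, a 70% floor gives 31,753 training and 13,609 validation patients, which matches the cohort sizes reported in the Results.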

3. Result

3.1. Clinical characteristics associated with CSM in breast cancer

As shown in Figure 1, after a strict patient selection process, a total of 45,362 invasive breast cancer patients in the SEER database were selected for study and randomly separated into 2 cohorts at a ratio of 7:3: the training cohort (n = 31,753) and the validation cohort (n = 13,609). Table 1 shows the characteristics of HER2+ breast cancer patients in the training and validation cohorts. There was no significant difference in overall survival between the 2 cohorts (Fig. 2).

Figure 1.


Flowchart of the study. HER2 = human epidermal growth factor receptor 2, SEER = surveillance, epidemiology, and end results.

Table 1.

Baseline characteristics of the training and validation cohorts.

| Characteristic | Training cohort (N = 31,753) | Validation cohort (N = 13,609) | P-value |
| --- | --- | --- | --- |
| Age, mean (SD) | 56.3 (12.3) | 56.3 (12.3) | .578 |
| Race | | | .820 |
|   White | 24,131 (76.0%) | 10,308 (75.7%) | |
|   Black | 3261 (10.3%) | 1421 (10.4%) | |
|   Other | 4361 (13.7%) | 1880 (13.8%) | |
| Tumor site | | | .685 |
|   Left | 16,237 (51.1%) | 6930 (50.9%) | |
|   Right | 15,516 (48.9%) | 6679 (49.1%) | |
| Grade | | | .933 |
|   I | 1547 (4.87%) | 679 (4.99%) | |
|   II | 11,594 (36.5%) | 4958 (36.4%) | |
|   III | 18,490 (58.2%) | 7923 (58.2%) | |
|   IV | 122 (0.38%) | 49 (0.36%) | |
| Clinical stage | | | .400 |
|   I | 14,252 (44.9%) | 6146 (45.2%) | |
|   II | 11,866 (37.4%) | 5028 (36.9%) | |
|   III | 4242 (13.4%) | 1795 (13.2%) | |
|   IV | 1393 (4.39%) | 640 (4.70%) | |
| Stage T | | | .152 |
|   T1 | 16,458 (51.8%) | 7183 (52.8%) | |
|   T2 | 11,453 (36.1%) | 4835 (35.5%) | |
|   T3 | 2343 (7.38%) | 919 (6.76%) | |
|   T4 | 1499 (4.72%) | 672 (4.94%) | |
| Stage N | | | .631 |
|   N0 | 20,099 (63.3%) | 8676 (63.8%) | |
|   N1 | 8451 (26.6%) | 3542 (26.0%) | |
|   N2 | 1898 (5.98%) | 823 (6.05%) | |
|   N3 | 1305 (4.11%) | 568 (4.17%) | |
| Stage M | | | .143 |
|   M0 | 30,360 (95.6%) | 12,969 (95.3%) | |
|   M1 | 1393 (4.39%) | 640 (4.70%) | |
| Tumor size | | | .129 |
|   ≤2 cm | 16,182 (51.0%) | 7042 (51.7%) | |
|   >2 cm | 15,571 (49.0%) | 6567 (48.3%) | |
| Surgery | | | .520 |
|   No | 1457 (4.59%) | 644 (4.73%) | |
|   Yes | 30,296 (95.4%) | 12,965 (95.3%) | |
| Radiotherapy | | | .231 |
|   No | 14,830 (46.7%) | 6440 (47.3%) | |
|   Yes | 16,923 (53.3%) | 7169 (52.7%) | |
| Chemotherapy | | | .461 |
|   No | 7700 (24.2%) | 3345 (24.6%) | |
|   Yes | 24,053 (75.8%) | 10,264 (75.4%) | |
| Regional nodes examined | | | .878 |
|   <6 | 21,563 (67.9%) | 9231 (67.8%) | |
|   ≥6 | 10,190 (32.1%) | 4378 (32.2%) | |
| ER status | | | .694 |
|   Negative | 10,391 (32.7%) | 4427 (32.5%) | |
|   Positive | 21,362 (67.3%) | 9182 (67.5%) | |
| PR status | | | .746 |
|   Negative | 15,610 (49.2%) | 6667 (49.0%) | |
|   Positive | 16,143 (50.8%) | 6942 (51.0%) | |

ER = estrogen receptor, PR = progesterone receptor.
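As a worked check of how the categorical P-values in Table 1 arise, the sketch below applies a Pearson chi-squared test to the Race counts. It is an illustration under stated assumptions rather than the original SPSS/R analysis: the function name `chi_squared_2xk` is hypothetical, and the closed-form P-value `exp(-x/2)` holds only because a 2 × 3 table has 2 degrees of freedom.

```python
import math


def chi_squared_2xk(train_counts, validation_counts):
    """Pearson chi-squared statistic for a 2 x k contingency table."""
    n_train = sum(train_counts)
    n_validation = sum(validation_counts)
    n_total = n_train + n_validation
    statistic = 0.0
    for obs_train, obs_validation in zip(train_counts, validation_counts):
        category_total = obs_train + obs_validation
        for observed, cohort_size in ((obs_train, n_train),
                                      (obs_validation, n_validation)):
            # Expected count if category and cohort were independent.
            expected = category_total * cohort_size / n_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Race counts from Table 1: White, Black, Other.
race_train = [24131, 3261, 4361]
race_validation = [10308, 1421, 1880]

chi2 = chi_squared_2xk(race_train, race_validation)
# Degrees of freedom = (2 - 1) * (3 - 1) = 2; for df = 2 the chi-squared
# upper tail probability has the closed form exp(-x / 2).
p_value = math.exp(-chi2 / 2)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

The resulting P-value is about 0.82, consistent with the .820 reported for Race and with the cohorts being well balanced.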

Figure 2.


The clinical outcome of HER2+ breast cancer in the training and validation cohorts. HER2 = human epidermal growth factor receptor 2, HR = hormone receptor, SEER = surveillance, epidemiology, and end results.

The results of univariate and multivariate analysis showed that age, race, tumor grade, clinical stage, T stage, surgery status, radiotherapy, chemotherapy, and regional nodes examined were independent risk factors for the CSM of HER2+ breast cancer (Table 2, P < .05).

Table 2.

Univariate and multivariate competing risk regression for the risk factors of cancer-specific death.

| Variable | Univariate HR (95% CI) | Univariate P-value | Multivariate HR (95% CI) | Multivariate P-value |
| --- | --- | --- | --- | --- |
| Age | | | | |
|   ≤55 | Ref. | | Ref. | |
|   >55 | 1.41 (1.32–1.50) | <.001 | 1.33 (1.25–1.42) | <.001 |
| Race | | | | |
|   White | Ref. | | Ref. | |
|   Black | 1.53 (1.41–1.67) | <.001 | 1.37 (1.25–1.49) | <.001 |
|   Other | 0.76 (0.68–0.84) | <.001 | 0.79 (0.72–0.88) | <.001 |
| Tumor site | | | | |
|   Left | Ref. | | | |
|   Right | 0.99 (0.93–1.05) | .801 | | |
| Grade | | | | |
|   I | Ref. | | Ref. | |
|   II | 1.77 (1.43–2.18) | <.001 | 1.34 (1.08–1.65) | .008 |
|   III | 2.65 (2.15–3.26) | <.001 | 1.79 (1.45–2.21) | <.001 |
|   IV | 3.29 (2.17–4.99) | <.001 | 1.99 (1.31–3.03) | .001 |
| Clinical stage | | | | |
|   I | Ref. | | Ref. | |
|   II | 2.85 (2.57–3.15) | <.001 | 2.39 (2.07–2.77) | <.001 |
|   III | 8.13 (7.34–9.00) | <.001 | 6.31 (5.38–7.39) | <.001 |
|   IV | 24.27 (21.82–26.98) | <.001 | 11.20 (7.12–17.62) | <.001 |
| Stage T | | | | |
|   T1 | Ref. | | Ref. | |
|   T2 | 2.94 (2.72–3.19) | <.001 | 1.50 (1.22–1.85) | <.001 |
|   T3 | 5.19 (4.69–5.75) | <.001 | 1.54 (1.24–1.92) | <.001 |
|   T4 | 12.45 (11.33–13.68) | <.001 | 2.12 (1.73–2.61) | <.001 |
| Stage N | | | | |
|   N0 | Ref. | | Ref. | |
|   N1 | 2.03 (1.90–2.18) | <.001 | 1.07 (0.99–1.16) | .077 |
|   N2 | 2.49 (2.24–2.77) | <.001 | 1.08 (0.96–1.22) | .177 |
|   N3 | 3.17 (2.84–3.55) | <.001 | 1.07 (0.94–1.21) | .303 |
| Stage M | | | | |
|   M0 | Ref. | | Ref. | |
|   M1 | 8.69 (8.09–9.33) | <.001 | 1.05 (0.69–1.61) | .820 |
| Tumor size | | | | |
|   ≤2 cm | Ref. | | Ref. | |
|   >2 cm | 3.62 (3.36–3.89) | <.001 | 0.93 (0.78–1.12) | .450 |
| Surgery | | | | |
|   No | Ref. | | Ref. | |
|   Yes | 0.13 (0.12–0.14) | <.001 | 0.41 (0.37–0.45) | <.001 |
| Radiotherapy | | | | |
|   No | Ref. | | Ref. | |
|   Yes | 0.67 (0.63–0.71) | <.001 | 0.81 (0.76–0.87) | <.001 |
| Chemotherapy | | | | |
|   No | Ref. | | Ref. | |
|   Yes | 0.63 (0.59–0.67) | <.001 | 0.38 (0.35–0.40) | <.001 |
| Regional nodes examined | | | | |
|   <6 | Ref. | | Ref. | |
|   ≥6 | 1.51 (1.42–1.61) | <.001 | 1.13 (1.04–1.23) | .005 |
| ER status | | | | |
|   Negative | Ref. | | | |
|   Positive | 0.99 (0.93–1.06) | .832 | | |
| PR status | | | | |
|   Negative | Ref. | | | |
|   Positive | 1.01 (0.95–1.08) | .680 | | |

CI = confidence interval, HR = hazard ratio.

3.2. Establishment of the prediction nomogram

Based on these clinical characteristics, we constructed the nomogram prediction model in the training cohort and verified it in the validation cohort (Fig. 3A and B). The total score of each patient is calculated by summing the points assigned to each variable, and this total predicts the probability of CSM in HER2+ breast cancer.
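The point-summing step can be illustrated with a minimal sketch. The actual point values of the published nomogram are shown only in Figure 3 and are not reproduced here; the code instead assumes the common convention that points are proportional to the log hazard ratio, rescaled so that the variable with the widest effect range spans 0 to 100. The multivariate HRs are taken from Table 2, while the names `HAZARD_RATIOS`, `build_point_table`, and `total_points` are hypothetical.

```python
import math

# Multivariate hazard ratios from Table 2 (reference level: HR = 1.0).
HAZARD_RATIOS = {
    "age": {"<=55": 1.0, ">55": 1.33},
    "race": {"White": 1.0, "Black": 1.37, "Other": 0.79},
    "grade": {"I": 1.0, "II": 1.34, "III": 1.79, "IV": 1.99},
    "clinical_stage": {"I": 1.0, "II": 2.39, "III": 6.31, "IV": 11.20},
    "t_stage": {"T1": 1.0, "T2": 1.50, "T3": 1.54, "T4": 2.12},
    "surgery": {"No": 1.0, "Yes": 0.41},
    "radiotherapy": {"No": 1.0, "Yes": 0.81},
    "chemotherapy": {"No": 1.0, "Yes": 0.38},
    "nodes_examined": {"<6": 1.0, ">=6": 1.13},
}


def build_point_table(hazard_ratios):
    """Map each level to 0-100 points proportional to its log hazard ratio."""
    betas = {
        var: {level: math.log(hr) for level, hr in levels.items()}
        for var, levels in hazard_ratios.items()
    }
    # The variable with the widest log-HR range defines the 100-point scale.
    widest = max(max(b.values()) - min(b.values()) for b in betas.values())
    return {
        var: {
            level: 100.0 * (beta - min(b.values())) / widest
            for level, beta in b.items()
        }
        for var, b in betas.items()
    }


def total_points(patient, point_table):
    """Sum the points of each variable for one patient."""
    return sum(point_table[var][level] for var, level in patient.items())


POINT_TABLE = build_point_table(HAZARD_RATIOS)

# Hypothetical patient, used only to show the calculation.
example_patient = {
    "age": ">55", "race": "White", "grade": "III", "clinical_stage": "II",
    "t_stage": "T2", "surgery": "Yes", "radiotherapy": "No",
    "chemotherapy": "Yes", "nodes_examined": "<6",
}
print(round(total_points(example_patient, POINT_TABLE), 1))
```

Under this scaling, clinical stage IV receives 100 points because it has the largest multivariate HR, and protective factors such as surgery contribute points when they are absent. Mapping the total score to a 1-, 3-, or 5-year CSM probability additionally requires the baseline cumulative incidence from the fitted model, which is not available from the tables.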

Figure 3.


The nomogram to predict 1-, 3-, and 5-year OS of HER2+ breast cancer. The nomogram for training (A) and validation (B) cohort. ER = estrogen receptor, HER2 = human epidermal growth factor receptor 2, HR = hormone receptor, OS = overall survival, PR = progesterone receptor.

3.3. Evaluation of the performance of the prediction nomogram

ROC curves for the nomogram predicting CSM in HER2+ breast cancer patients indicated that the 1-, 3-, and 5-year AUCs were 0.874, 0.843, and 0.820 in the training cohort and 0.861, 0.845, and 0.825 in the validation cohort, respectively (Fig. 4A and B). The C-index was 0.817 (95% confidence interval: 0.806 to 0.831) in the training cohort and 0.821 (95% confidence interval: 0.809 to 0.833) in the validation cohort. Moreover, the calibration curves of the nomogram showed good agreement between the observed outcomes and the predicted probabilities in both cohorts (Fig. 4C and D). Decision curve analysis further demonstrated good clinical utility of the nomogram in both cohorts (Fig. 4E and F). After calculating the risk score of each patient, we separated HER2+ breast cancer patients into high-risk and low-risk score groups. As expected, patients with a high-risk score had higher CSM in both the training and validation cohorts (Fig. 5A and B).
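To make the C-index concrete, the sketch below computes Harrell's concordance index on a small invented dataset. It uses the basic definition for a single event type; the paper's values came from R, and any adaptation to competing risks used there is not specified, so this should be read as an illustration only. The function name `harrell_c_index` and all toy values are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when patient i had the event and did so
    strictly before patient j's observed time. Tied risk scores count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Invented toy data: follow-up in months, event flag (1 = cancer death,
# 0 = censored), and a model-derived risk score.
toy_times = [5, 12, 20, 30, 42]
toy_events = [1, 1, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.7, 0.2, 0.1]

c_index = harrell_c_index(toy_times, toy_events, toy_scores)
print(c_index)  # 7 of 8 comparable pairs are concordant -> 0.875
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported C-indices of 0.817 and 0.821 indicate that the nomogram orders patients by risk correctly in roughly four of five comparable pairs.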

Figure 4.


ROC curve, calibration curve, and DCA curve evaluating the performance of the nomogram. ROC curves (A and B), calibration curves (C and D), and DCA curves (E and F) for the training and validation cohorts. AUC = area under the curve, DCA = decision curve analysis, ROC = receiver operating characteristic.

Figure 5.


The clinical outcome of HER2+ breast cancer in different risk groups. The clinical outcome of HER2+ breast cancer by risk score in the training (A) and validation (B) cohorts. HER2 = human epidermal growth factor receptor 2.

4. Discussion

Prediction models of cancer death built with traditional survival analysis take cancer-specific death as the endpoint, while deaths from other causes are treated as censored observations rather than as events. Clinically, breast cancer patients die from multiple causes, and these causes compete with one another: a patient who dies from a noncancer cause can no longer die from cancer, so noncancer death alters the probability of CSM. Traditional survival analysis methods may therefore exaggerate the risk of CSM because they fail to account for competing risks.[10] Austin et al[11] described the competing risk model for analyzing survival data in the presence of competing risks. Because the competing risk model accounts for the influence of other causes of death on CSM, it is better aligned with clinical practice when studying patient prognosis. CSM calculated by the traditional survival analysis method is higher than that calculated by the competing risk model.[12] In this study, we first used a competing risk model to identify the factors available in the SEER database that may affect the prognosis of HER2+ breast cancer. The results showed that age, race, tumor grade, clinical stage, T stage, surgery status, radiotherapy, chemotherapy, and regional nodes examined were independent risk factors for the clinical outcome of HER2+ breast cancer. Based on these factors, we constructed the prediction nomogram, and validation showed that the model has good prediction accuracy and reliability.
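The overestimation described above can be shown on a small invented dataset. The sketch compares the nonparametric cumulative incidence estimate for cancer death, which accounts for competing deaths, with the naive 1 − Kaplan-Meier estimate that censors them. It is not the regression model used in this study, only the underlying estimator; the function name `cumulative_incidence` and the toy data are hypothetical.

```python
def cumulative_incidence(times, causes, cause_of_interest=1):
    """Return (CIF, naive 1 - KM) for one cause at the end of follow-up.

    causes: 0 = censored, 1 = cancer death, 2 = death from other causes.
    """
    records = sorted(zip(times, causes))
    at_risk = len(records)
    all_cause_survival = 1.0  # probability of being alive just before t
    km_naive = 1.0            # Kaplan-Meier with competing deaths censored
    cif = 0.0
    i = 0
    while i < len(records):
        t = records[i][0]
        tied = [cause for (time, cause) in records if time == t]
        d_interest = tied.count(cause_of_interest)
        d_any = sum(1 for cause in tied if cause != 0)
        # Cause-specific increment weighted by all-cause survival so far.
        cif += all_cause_survival * d_interest / at_risk
        all_cause_survival *= 1 - d_any / at_risk
        km_naive *= 1 - d_interest / at_risk
        at_risk -= len(tied)
        i += len(tied)
    return cif, 1 - km_naive


# Invented toy cohort of 8 patients (follow-up time, cause code).
toy_times = [2, 3, 4, 5, 6, 7, 8, 9]
toy_causes = [1, 2, 1, 2, 0, 1, 2, 1]

cif, naive = cumulative_incidence(toy_times, toy_causes)
print(f"competing-risk CIF = {cif:.3f}, naive 1 - KM = {naive:.3f}")
```

In this toy cohort the competing-risk estimate is 7/12 (about 0.583), whereas censoring the noncancer deaths pushes the naive estimate to 1.0, which illustrates why traditional methods can overstate CSM.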

Because age is closely related to the occurrence, tumor biology, and prognosis of breast cancer, it is often used as an important factor in clinical decision-making.[13] Studies have found that, compared with older patients, younger breast cancer patients have lower tumor grade, accompanied by a higher incidence of metastasis and recurrence.[14] However, elderly patients often have more underlying diseases than younger patients, including heart disease, cerebrovascular disease, and diabetes, which affects their tolerance of and compliance with chemotherapy.

A previous study has shown that breast-conserving surgery achieves the same survival as mastectomy in patients with early breast cancer, highlighting the vital role of breast-conserving surgery in the treatment of early breast cancer.[15] Wrubel et al[16] found that survival after breast-conserving surgery combined with radiotherapy was better than after mastectomy, although the specific reasons were unknown. In this study, competing risk analysis showed that breast-conserving surgery had lower tumor-specific mortality than mastectomy alone in early HER2-positive breast cancer patients.

This study has some limitations. First, because it is retrospective, selection bias is difficult to avoid. Second, the SEER database includes only part of the clinicopathological information at the time of initial diagnosis, and some important patient data are missing, including family history, menstrual history, fertility, genomic status, and concomitant diseases. Moreover, the data analyzed came from cases in the United States, and the diagnosis and treatment of breast cancer in other countries differ somewhat from those in the United States, which limits the generalizability of the model.

5. Conclusion

In summary, our study suggests that age, race, tumor grade, clinical stage, T stage, surgery status, radiotherapy, chemotherapy, and regional nodes examined are independent risk factors for CSM in HER2+ breast cancer patients. The nomogram shows good accuracy and reliability in predicting the CSM of breast cancer patients, and it could provide some theoretical support for clinical decision-making.

Abbreviations: CSM = cancer-specific mortality, ER = estrogen receptor, HER2 = human epidermal growth factor receptor 2, PR = progesterone receptor, ROC = receiver operating characteristic, SEER = surveillance, epidemiology, and end results.

How to cite this article: Wu N. Construction and validation of nomogram for the cancer-specific mortality for HER2-positive breast cancer patients. Medicine 2024;103:11(e37330).

The authors have no funding and conflicts of interest to disclose.

All data generated or analyzed during this study are included in this published article.

References

[1] Siegel RL, Miller KD, Wagle NS, et al. Cancer statistics, 2023. CA Cancer J Clin. 2023;73:17–48.
[2] Weigelt B, Geyer FC, Reis-Filho JS. Histological types of breast cancer: how special are they? Mol Oncol. 2010;4:192–208.
[3] Wang Y, Liang Y, Ye F, et al. Histologic heterogeneity predicts patient prognosis of HER2-positive metastatic breast cancer: a retrospective study based on SEER database. Cancer Med. 2023;12:18597–610.
[4] Wolff AC, Hammond MEH, Allison KH, et al. Human epidermal growth factor receptor 2 testing in breast cancer: American Society of Clinical Oncology/College of American Pathologists Clinical Practice Guideline Focused Update. Arch Pathol Lab Med. 2018;142:1364–82.
[5] Loibl S, Gianni L. HER2-positive breast cancer. Lancet. 2017;389:2415–29.
[6] Loibl S, Poortmans P, Morrow M, et al. Breast cancer. Lancet. 2021;397:1750–69.
[7] Cesca MG, Vian L, Cristóvão-Ferreira S, et al. HER2-positive advanced breast cancer treatment in 2020. Cancer Treat Rev. 2020;88:102033.
[8] Bredin P, Walshe JM, Denduluri N. Systemic therapy for metastatic HER2-positive breast cancer. Semin Oncol. 2020;47:259–69.
[9] Kim MY. Breast cancer metastasis. Adv Exp Med Biol. 2021;1187:183–204.
[10] Meng X, Hao F, Ju Z, et al. Conditional survival nomogram predicting real-time prognosis of locally advanced breast cancer: analysis of population-based cohort with external validation. Front Public Health. 2022;10:953992.
[11] Austin PC, Lee DS, Fine JP. Introduction to the analysis of survival data in the presence of competing risks. Circulation. 2016;133:601–9.
[12] Li Z, Shi Y, Wu L, et al. Establishment and verification of a nomogram to predict tumor-specific mortality risk in triple-negative breast cancer: a competing risk model based on the SEER cohort study. Gland Surg. 2022;11:1961–75.
[13] Anders CK, Hsu DS, Broadwater G, et al. Young age at diagnosis correlates with worse prognosis and defines a subset of breast cancers with shared patterns of gene expression. J Clin Oncol. 2008;26:3324–30.
[14] Avci O, Tacar SY, Seber ES, et al. Breast cancer in young and very young women; is age related to outcome? J Cancer Res Ther. 2021;17:1322–7.
[15] Darby S, McGale P, Correa C, et al.; Early Breast Cancer Trialists' Collaborative Group (EBCTCG). Effect of radiotherapy after breast-conserving surgery on 10-year recurrence and 15-year breast cancer death: meta-analysis of individual patient data for 10,801 women in 17 randomised trials. Lancet. 2011;378:1707–16.
[16] Wrubel E, Natwick R, Wright GP. Breast-conserving therapy is associated with improved survival compared with mastectomy for early-stage breast cancer: a propensity score matched comparison using the National Cancer Database. Ann Surg Oncol. 2021;28:914–9.

Articles from Medicine are provided here courtesy of Wolters Kluwer Health
