Abstract
The prognostic value of routine laboratory variables in breast cancer has been largely overlooked. Using laboratory tests commonly performed in clinical practice, we aimed to develop a new model to predict disease-free survival (DFS) after surgical removal of primary breast cancer. In a cohort of 1,596 breast cancer patients, we analyzed the associations of 33 laboratory variables with patient DFS. Based on 3 significant laboratory variables (hemoglobin, alkaline phosphatase, and international normalized ratio), together with important demographic and clinical variables, we developed a prognostic model that achieved an area under the curve of 0.79 in the training set. We categorized patients into 3 risk groups according to the prognostic index derived from the final model. Compared with patients in the low-risk group, those in the medium- and high-risk groups had a significantly increased risk of recurrence, with hazard ratios (HRs) of 1.75 (95% confidence interval [CI] 1.30–2.38) and 4.66 (95% CI 3.54–6.14), respectively. The results from the training set were validated in the testing set. Overall, our prognostic model, which incorporates readily available routine laboratory tests, identifies breast cancer patients who are at high risk of recurrence. Further study is warranted to validate its clinical application.
Introduction
Breast cancer is currently the most frequently diagnosed cancer and the leading cause of cancer-related mortality in women. Excluding skin cancers, breast cancer accounts for nearly 1 in 3 cancers1. In 2015, an estimated 231,840 new cases of invasive breast cancer will be diagnosed among women in the U.S., and approximately 40,290 women are expected to die from breast cancer2. Overall breast cancer death rates decreased 36% from 1989 to 2012 due to improvements in early detection and systemic therapies2–4. However, recurrence is still a major concern after surgical removal of primary breast tumor. Most locoregional failures occur within 5 years5. Both ipsilateral breast tumor recurrence and other locoregional recurrences are associated with significantly increased risk of distant disease and death5, 6.
A number of clinical and biological prognostic factors, such as age, performance status, sites of disease, hormone receptor status, and therapies, are associated with long-term clinical outcomes among women with breast cancer7. At present, the prognosis, classification, and treatment of breast cancer depend on tumor histological grade, lymph node stage, and tumor stage, as well as 3 major protein markers: estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2)4, 8. Several recent studies incorporated various genetic and molecular biomarkers to develop new prognostic models for breast cancer9. Nevertheless, most of these markers are not yet available in routine clinical practice, and their applicability may be limited by high cost and the need for specialized equipment and expertise. Therefore, novel prognostic models based on easily available markers from routine clinical practice would help oncologists identify patients at risk of locoregional recurrence and distant metastasis and select more efficient patient-tailored treatment strategies.
In this study we hypothesized that a model incorporating biomarkers from conventional laboratory tests may provide valuable information on breast cancer prognosis. To test this hypothesis, we analyzed the associations between 33 routine blood-based laboratory tests and disease free survival (DFS) of patients with breast cancer. Incorporating variables which were significantly associated with DFS in univariate analysis into our prognostic model could better stratify patients into different risk groups. Thus, we offer a new prognostic model which is a noninvasive and inexpensive tool to aid physicians in estimating patient survival.
Results
Characteristics of study population
A total of 1,596 histologically confirmed breast cancer patients were included in this study. The detailed selection criteria are depicted in Fig. 1. Among the 1,596 patients, 1,053 (66.0%) patients were recurrence-free during follow-up, and 543 (34.0%) patients had recurrent disease or died. The patients were divided into a training set (N = 1,064) and a testing set (N = 532). The median follow-up time was 3.6 years (interquartile range [IQR] 1.8–8.2) and 4.2 years (IQR 2.0–8.3) in the training and testing set, respectively (P = 0.30). Demographic and basic clinical variables are summarized in Table 1. The differences between the training and testing sets were not statistically significant for almost all the demographic and clinical variables, except for tumor grade with a borderline significance (P = 0.04). The mean values of 33 laboratory variables are listed in Supplementary Table S1. A total of 12 variables with more than 50% missing observations were excluded from further analysis. The missing values of the remaining 21 variables ranged from 3.0% to 45.8% in the training set.
Table 1.
Patient characteristic | Training set (N = 1,064), n (%) | Testing set (N = 532), n (%) | P value |
---|---|---|---|
Age, mean ± SD, (years) | 59.0 ± 14.0 | 59.6 ± 12.9 | 0.35 |
Race/ethnicity | | | 0.91 |
Caucasian | 777 (73.0) | 390 (73.3) | |
African American | 214 (20.1) | 112 (21.0) | |
Asian | 42 (4.0) | 17 (3.2) | |
Others | 12 (1.1) | 5 (0.9) | |
Unknown | 19 (1.8) | 8 (1.5) | |
Smoking status | | | 0.55 |
Never smoking | 571 (53.7) | 265 (49.8) | |
Current smoking | 124 (11.6) | 67 (12.6) | |
Former smoking | 215 (20.2) | 116 (21.8) | |
Unknown | 154 (14.5) | 84 (15.8) | |
Drinking status | | | 0.99 |
Never drinking | 482 (45.3) | 237 (44.5) | |
Current drinking | 380 (35.7) | 192 (36.1) | |
Former drinking | 6 (0.6) | 3 (0.6) | |
Unknown | 196 (18.4) | 100 (18.8) | |
Tumor stage | | | 0.73 |
Stage I | 566 (53.2) | 275 (51.7) | |
Stage II | 355 (33.4) | 190 (35.7) | |
Stage III | 112 (10.5) | 53 (10.0) | |
Stage IV | 31 (2.9) | 14 (2.6) | |
Tumor grade | | | 0.04 |
Well differentiated | 121 (11.4) | 87 (16.4) | |
Moderately differentiated | 410 (38.5) | 201 (37.8) | |
Poorly differentiated | 395 (37.1) | 178 (33.5) | |
Not determined | 138 (13.0) | 66 (12.4) | |
Tumor histology | | | 0.80 |
Invasive ductal carcinoma | 868 (81.6) | 438 (82.3) | |
Invasive lobular carcinoma | 74 (6.9) | 38 (7.1) | |
Mixed carcinoma | 106 (10.0) | 46 (8.7) | |
Others^a | 16 (1.5) | 10 (1.9) | |
Tumor size | | | 0.47 |
2–9 mm | 181 (17.0) | 98 (18.4) | |
10–29 mm | 447 (42.0) | 232 (43.6) | |
30–49 mm | 106 (10.0) | 54 (10.1) | |
50–99 mm | 71 (6.7) | 28 (5.3) | |
≥10 cm | 13 (1.2) | 2 (0.4) | |
Unknown/not found | 246 (23.1) | 118 (22.2) | |
Lymph nodes metastatic rate | | | 0.50 |
0% | 591 (55.5) | 289 (54.3) | |
1–20% | 128 (12.0) | 77 (14.5) | |
20–49% | 55 (5.2) | 31 (5.8) | |
50–79% | 32 (3.0) | 23 (4.3) | |
80–100% | 34 (3.2) | 15 (2.8) | |
Not determined | 176 (16.5) | 76 (14.3) | |
Unknown | 48 (4.5) | 21 (4.0) | |
Estrogen receptor status | | | 0.77 |
Negative | 223 (21.0) | 104 (19.6) | |
Positive | 734 (69.0) | 371 (69.7) | |
Unknown | 107 (10.0) | 57 (10.7) | |
Progesterone receptor status | | | 0.76 |
Negative | 313 (29.5) | 154 (28.9) | |
Positive | 642 (60.3) | 317 (59.6) | |
Unknown | 109 (10.2) | 61 (11.5) | |
Chemotherapy | | | 0.97 |
No | 618 (58.1) | 307 (57.7) | |
Yes | 414 (38.9) | 208 (39.1) | |
Unknown | 32 (3.0) | 17 (3.2) | |
Radiation therapy | | | 0.12 |
No | 603 (56.7) | 290 (54.5) | |
Yes | 427 (40.1) | 214 (40.2) | |
Unknown | 34 (3.2) | 28 (5.3) | |
Hormone therapy | | | 0.64 |
No | 652 (61.3) | 323 (60.7) | |
Yes | 339 (31.9) | 178 (33.5) | |
Unknown | 73 (6.9) | 31 (5.8) | |
Abbreviations: SD, standard deviation.
^a Others include colloid, medullary, tubular, papillary carcinoma, and Paget’s disease.
Univariate analysis
Kaplan-Meier and univariate Cox proportional hazards regression analyses were used to select candidate variables for stepwise selection. Ten demographic and basic clinical variables (age, race, tumor stage, tumor size, lymph nodes metastatic rate, ER status, PR status, chemotherapy, radiation therapy, and hormone therapy) were significantly associated with DFS (Supplementary Table S2). Among the remaining 21 laboratory variables, 8 were significantly associated with DFS on a univariate basis (Table 2): hematocrit (HCT), hemoglobin (HGB), red blood cell count (RBC), and red cell distribution width (RDW) from the complete blood count (CBC) panel; albumin and alkaline phosphatase (ALP) from the comprehensive metabolic panel (CMP); and international normalized ratio (INR) and prothrombin time (PT) from the coagulation panel. All 8 variables were significant when analyzed as categorical variables, as continuous variables, and in log-rank analysis. They were carried forward as candidate prognostic factors to stepwise selection and model construction.
Table 2.
Variables^a | Missing values (%) | No. of patients (disease free/recurrence) | HR (95% CI) | Cox P | Log-rank P | Bootstrap % (P < 0.05) |
---|---|---|---|---|---|---|
HCT | 3.57 | | | | <0.0001 | 98.1 |
≤37.48% | | 295/218 | 1.00 | | | |
>37.48% | | 358/155 | 0.65 (0.53–0.80) | <0.0001 | | |
HGB | 3.48 | | | | <0.0001 | 99.7 |
≤12.50 T/L | | 306/211 | 1.00 | | | |
>12.50 T/L | | 348/162 | 0.62 (0.51–0.76) | <0.0001 | | |
RBC | 3.01 | | | | <0.0001 | 97.9 |
≤4.19 T/L | | 300/216 | 1.00 | | | |
>4.19 T/L | | 358/158 | 0.67 (0.54–0.82) | 0.0001 | | |
RDW | 3.57 | | | | <0.0001 | 100.0 |
≤13.40% | | 371/160 | 1.00 | | | |
>13.40% | | 282/213 | 1.85 (1.51–2.27) | <0.0001 | | |
Albumin | 36.09 | | | | 0.03 | 70.1 |
≤4.30 g/dL | | 266/115 | 1.00 | | | |
>4.30 g/dL | | 191/108 | 0.74 (0.57–0.97) | 0.03 | | |
ALP | 30.64 | | | | 0.002 | 87.4 |
≤69 IU/L | | 268/111 | 1.00 | | | |
>69 IU/L | | 224/135 | 1.48 (1.15–1.91) | 0.002 | | |
INR | 39.85 | | | | <0.0001 | 99.9 |
≤1.02 | | 234/99 | 1.00 | | | |
>1.02 | | 175/132 | 2.00 (1.54–2.60) | <0.0001 | | |
PT | 45.77 | | | | <0.0001 | 100.0 |
≤13.55 | | 214/75 | 1.00 | | | |
>13.55 | | 147/141 | 2.14 (1.61–2.85) | <0.0001 | | |
Abbreviations: RBC, red blood cell; HGB, hemoglobin; HCT, hematocrit; RDW, red cell distribution width; INR, international normalized ratio; ALP, alkaline phosphatase; HR, hazard ratio; CI, confidence interval.
^a Variables were categorized by the median value in the study population.
Stepwise selection and final model construction
Multiple imputation was used to generate 10 imputed datasets from the training set, and forward stepwise selection was conducted in each imputed dataset to identify the best group of variables to include in the multivariate Cox proportional hazards model. The number of times that each of the 8 variables was selected by stepwise selection is summarized in Supplementary Table S3. Three variables (HGB, ALP, and INR), each selected in ≥6 imputed datasets, were included in the final model. The parameter estimates (regression coefficients, or weights) and standard errors of the 10 significant demographic and basic clinical variables (age, race, stage, tumor size, lymph nodes metastatic rate, ER status, PR status, chemotherapy, radiation therapy, and hormone therapy) and the 3 laboratory variables (HGB, ALP, and INR) in the final model are shown in Table 3. The prognostic index was calculated for each patient based on the final model, as shown in Supplementary Table S4.
Table 3.
Variables | Coefficients | SE |
---|---|---|
Age | 0.03 | 0.004 |
Race (Caucasian as reference) | | |
African American | 0.21 | 0.13 |
Others | 0.15 | 0.23 |
Stage (stage I as reference) | | |
II | 0.29 | 0.17 |
III | 1.07 | 0.23 |
IV | 1.87 | 0.28 |
Tumor size (2–9 mm as reference) | | |
10–29 mm | −0.07 | 0.20 |
30–49 mm | −0.11 | 0.15 |
50–99 mm | 0.06 | 0.21 |
≥10 cm | 0.08 | 0.21 |
Unknown/not found | 0.07 | 0.40 |
Lymph nodes metastatic rate (0% as reference) | | |
1–20% | −0.11 | 0.20 |
20–49% | −0.15 | 0.26 |
50–79% | −0.15 | 0.31 |
80–100% | 0.38 | 0.26 |
Not determined | 0.08 | 0.17 |
Unknown | 0.06 | 0.23 |
ER status (negative as reference) | | |
Positive | −0.18 | 0.20 |
Unknown | 0.14 | 1.05 |
PR status (negative as reference) | | |
Positive | −0.15 | 0.18 |
Unknown | −0.13 | 1.04 |
Chemotherapy (without chemotherapy as reference) | | |
With chemotherapy | 0.08 | 0.42 |
Unknown | 0.02 | 0.14 |
Radiation therapy (without radiation therapy as reference) | | |
With radiation | 0.26 | 0.35 |
Unknown | −0.43 | 0.12 |
Hormone therapy (without hormone therapy as reference) | | |
With hormone | −0.30 | 0.27 |
Unknown | 0.002 | 0.13 |
Square of HGB* | −0.004 | 0.001 |
Natural logarithm of ALP* | 0.35 | 0.17 |
Inverse cube of INR* | −0.47 | 0.20 |
Abbreviations: ER, estrogen receptor; PR, progesterone receptor; HGB, hemoglobin; ALP, alkaline phosphatase; INR, international normalized ratio; SE, standard error.
*In order to calculate the prognostic index for the testing set, the missing data of HGB, ALP, and INR in the testing set were imputed as the mean values from the training set (after normality transformation, 154.57 for HGB, 4.24 for ALP, and 0.93 for INR, respectively).
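As an illustration of how the index in Table 3 can be evaluated, the following Python sketch sums covariate values multiplied by their coefficients for age and the three laboratory terms only. It assumes that the transformations are exactly the square, natural logarithm, and inverse cube named in Table 3, with no additional scaling constants, and that laboratory values are on the scales used in Table 2. The function and dictionary names are hypothetical and are not taken from the study's SAS or Stata code.

```python
import math

# Coefficients copied from Table 3 (age plus the three laboratory terms).
COEF = {
    "age": 0.03,
    "hgb_squared": -0.004,
    "log_alp": 0.35,
    "inr_inverse_cube": -0.47,
}

# Training-set means of the transformed laboratory variables (Table 3 footnote),
# used here to fill in missing laboratory values.
TRAINING_MEANS = {"hgb_squared": 154.57, "log_alp": 4.24, "inr_inverse_cube": 0.93}


def transform_labs(hgb=None, alp=None, inr=None):
    """Apply the transformations named in Table 3; None marks a missing value."""
    raw = {
        "hgb_squared": None if hgb is None else hgb ** 2,
        "log_alp": None if alp is None else math.log(alp),
        "inr_inverse_cube": None if inr is None else inr ** -3,
    }
    # Assumption: a missing value is replaced by the training-set mean.
    return {k: TRAINING_MEANS[k] if v is None else v for k, v in raw.items()}


def partial_prognostic_index(age, hgb=None, alp=None, inr=None):
    """Sum of value * coefficient over age and the three laboratory terms.

    The categorical terms of Table 3 (stage, race, and so on) are omitted;
    each would add its coefficient when the patient falls in that category.
    """
    x = {"age": age, **transform_labs(hgb, alp, inr)}
    return sum(COEF[name] * value for name, value in x.items())
```

For example, `partial_prognostic_index(60, hgb=12.0, alp=69.0, inr=1.0)` evaluates the four terms for a hypothetical 60-year-old patient; a higher value corresponds to a higher predicted risk of recurrence.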
Model validation
The prognostic utility of the final model was measured by the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. The AUCs were 0.79 (95% CI: 0.75–0.83) and 0.74 (95% CI: 0.69–0.79) in the training and testing sets, respectively (Fig. 2). We repeated the analyses after excluding patients who were followed for less than 3, 6, or 12 months. Lengthening the exclusion window reduces potential confounding effects present at the time of baseline sample collection. In the subset of patients who were followed for ≥3 months, the AUCs in the training and testing sets were the same as those in the overall cohort (Supplementary Figure S1A). Very similar results were observed in the subsets of patients who were followed for ≥6 or ≥12 months (Supplementary Figure S1B and C).
The patients were then classified into three risk groups according to the tertile distribution of the prognostic index. In the training set, compared with patients in the low-risk group, patients in the medium- and high-risk groups had a significantly increased risk of recurrence, with hazard ratios (HRs) of 1.75 (95% confidence interval [CI] 1.30–2.38) and 4.66 (95% CI 3.54–6.14), respectively (Table 4). Survival differed significantly among the three risk groups (P < 0.0001, Fig. 3A). Similar results were found in the testing set (Table 4 and Fig. 3B) and in the subset analyses (Supplementary Figure S2).
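The tertile grouping described above can be sketched in Python as follows. This is a minimal illustration only: the text does not describe how ties or cut points were handled, so the cut rule and the function names below are assumptions.

```python
def tertile_cutoffs(values):
    """Return two cut points that split the sorted values into three groups
    of (nearly) equal size."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def risk_group(prognostic_index, low_cut, high_cut):
    """Assign a risk label; values equal to a cut point go to the higher group."""
    if prognostic_index < low_cut:
        return "low"
    if prognostic_index < high_cut:
        return "medium"
    return "high"


# Hypothetical prognostic index values for nine patients.
indices = [2.1, 0.4, 1.3, 3.0, 0.9, 1.8, 2.6, 0.2, 1.5]
low_cut, high_cut = tertile_cutoffs(indices)
groups = [risk_group(v, low_cut, high_cut) for v in indices]
```

With distinct values, each group receives about one third of the patients, which matches the roughly equal group sizes reported in Table 4.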
Table 4.
Groups | No. of patients (disease free/recurrence) | Median DFS time, years (95% CI) | HR (95% CI) | Cox P | Log-rank P |
---|---|---|---|---|---|
Training set | | | | | |
Low-risk | 287/68 | NR | 1.00 | | <0.0001 |
Medium-risk | 247/108 | 10.00 (8.14–11.93) | 1.75 (1.30–2.38) | 0.0003 | |
High-risk | 149/205 | 4.26 (3.32–4.92) | 4.66 (3.54–6.14) | <0.0001 | |
Testing set | | | | | |
Low-risk | 151/26 | NR | 1.00 | | <0.0001 |
Medium-risk | 133/45 | 10.77 (8.77–12.51) | 1.98 (1.22–3.21) | 0.006 | |
High-risk | 86/91 | 5.66 (3.72–7.43) | 5.33 (3.44–8.27) | <0.0001 | |
Abbreviations and definitions: DFS, disease free survival; HR, hazard ratio; CI, confidence interval; NR, not reached.
Discussion
In this study, we assessed the associations of a large panel of 33 laboratory variables available in routine clinical practice, with the DFS of a cohort of patients with breast cancer. Three laboratory variables were demonstrated to be associated with DFS and were used to construct a prognostic model that could be used to identify patients at risk of recurrence.
There is no widely accepted prognostic model based on objective criteria other than those that predict survival from clinical features. In addition to demographic and basic clinical information, an increasing number of novel prognostic markers have been explored and identified10–12. However, the main problem with most of these studies is that the biomarkers rely on sophisticated molecular and/or genetic tests11, 13–16. The practical application of these novel tests is inevitably restricted by their cost and complexity. In contrast, the prognostic model developed in this study uses laboratory test results that are already available as a consequence of routine clinical monitoring, at no incremental cost. By combining demographic and basic clinical information with these laboratory parameters, we developed a new prognostic model that may help physicians and patients estimate DFS and thereby inform medical decision-making and patient counseling.
Accumulating evidence has shown that black women have a high risk of breast cancer recurrence regardless of age and tumor size17, 18. We previously reported a racial disparity in breast cancer survival using the Jefferson Cancer Registry data19. In the current study, the risk of recurrence was 54% higher in African Americans than in Caucasians, again demonstrating the prognostic value of race. Therefore, the inclusion of race, as well as other well-known predictors such as age, tumor characteristics, and treatments20, supports the reliability of the final model for recurrence prediction and its applicability in the clinic. Our previous study also found that differences in tumor presentation and certain hematologic traits, for example HGB level, were associated with racial disparity in breast cancer survival19. Abnormal metabolic indices at baseline were also reported to affect survival for all stages of breast cancer21. In the present study, three laboratory variables (HGB, ALP, and INR) that were significantly associated with patient DFS in univariate analyses were selected stepwise into the final model to predict patient survival. There are plausible physiological reasons why each of these variables might be an important predictor.
Anemia is common in cancer patients. Besides radiotherapy and chemotherapy, cancer itself can cause anemia of chronic disease. Possible mechanisms include a shortened RBC lifespan, decreased sensitivity of the bone marrow to erythropoietin, and decreased production of erythropoietin22. In addition, the neoplasm itself has a high nutritional demand, and some cytokines secreted by neoplastic cells can suppress appetite23, which may contribute to the development of cachexia and a poor prognosis thereafter. Between 62% and 71% of breast cancer patients develop anemia during the course of their disease24, 25. The severity of anemia may depend on the stage of breast cancer and on the chemotherapy regimen26, 27. Anemia, or HGB level, has been found to be strongly related to recurrence and prognosis of breast cancer in our studies and those of others28–34.
Bone is a common site of breast cancer metastasis. The skeletal isoenzyme of ALP increases during bone remodeling. ALP mRNA expression is elevated in cancer cells and may participate in mammary mineralization in a manner similar to ossification by osteoblasts35. ALP is also a sensitive indicator of biliary obstruction, and it is more reliable than other liver enzymes when liver metastasis is involved36. Therefore, it is reasonable that ALP, as a valuable prognostic marker, was selected in our final prognostic model. However, a recent study by Liu et al. failed to identify an association of pretreatment ALP level with overall survival in female Caucasian patients with non-metastatic invasive breast cancer37. It was reported previously that ALP may not increase much in patients with early stage breast cancer, whereas it increases significantly in patients with metastatic disease38. Thus, the different findings between our study and that of Liu et al. may be due to differences in patient characteristics (age, gender, and ethnicity) and cancer biology (cancer stage, histological type, and so on).
Tissue factor is a major contributor to abnormal coagulation in cancer patients. The expression of tissue factor is increased in many different neoplasm models and is strongly related to disease severity and prognosis39, 40. Several studies have established a connection between tissue factor and neoplasm growth and invasion41–43. Although breast cancer cells were reported to produce lower levels of tissue factor than other cancer cell types44, high levels of tissue factor were observed in studies of chemotherapy in breast cancer patients in which thrombosis was involved45–47. Tissue factor is not measured routinely, but factor VII function is often assessed through PT or INR48, 49. It is therefore not surprising that our final model, which includes INR, could be used to predict patient recurrence.
Several clinical tools have been developed to predict prognosis and survival benefit from treatments, using clinicopathological features, genetic profiles, and novel biomarkers50. In 466 patients with invasive ductal carcinoma, Volinia and Croce reported an AUC of 0.74 by integrating mRNA, microRNA, and DNA methylation next-generation sequencing data into the model51. Based on a large database of microarray datasets, Griffith et al. developed a robust multi-gene mRNA transcription-based predictor of relapse-free survival at 10 years, which achieved an AUC of 0.70 for hormone-positive node-negative breast cancer patients52. Using clinicopathological features and all 14 biomolecular signatures, Campbell et al. reported an AUC of 0.75 for predicting relapse-free survival in early breast cancer patients53. Inevitably, the molecular markers included in these studies add costs and limit clinical generalization. Moreover, derived biomarkers that are not clinically certified may exhibit large variations when measured in different laboratories. In comparison, the laboratory variables we used are inexpensive, readily available, and technically simple. Another prognostic index, the Nottingham Prognostic Index (NPI), is also widely used for predicting survival in operable primary breast cancer54, 55. The NPI, which is based on tumor size, histologic grade, and lymph node status56, is simple and easily available in routine clinics but provided suboptimal performance in predicting patient recurrence57–60. The AUCs for the NPI in our study were 0.66 and 0.63 in the training and testing sets, respectively (data not shown). Our current model, which includes demographic and basic clinical variables as well as 3 routine laboratory variables, exhibited prognostic power that was superior to the NPI in our data and at least comparable to previously reported models using more expensive and complicated molecular biomarkers.
Our study has several strengths. We had a relatively large population of 1,596 breast cancer patients, and the final results were consistent between the training and testing sets. We analyzed DFS after surgery to enhance the applicability of our model, given that patients are at high risk of recurrence during the first 5 years of treatment. Laboratory measurements taken around the time of diagnosis are generally more relevant to a prediction model, but they can be affected by factors such as treatment. Therefore, we restricted the analyses to laboratory variables measured within 3 months after surgery to minimize such influences, for example less reliable test results due to a longer interval after surgery, or inaccurate test results due to adverse effects of treatment. Furthermore, compared with published survival models based on more specialized and expensive biomarkers identified by gene or protein expression assays, our model, which relies on easily obtained hematological indices from routine clinical practice, exhibits comparable prognostic performance without increased cost.
This study also has several limitations. First, although our findings were internally validated and the selected variables in the final model are physiologically plausible, our cohort came from a single institution. The results should be further validated in large independent populations. Second, we collected hematological indices measured within 3 months after surgery, together with related records available in our medical charts. Some indices that were measured at long or irregular intervals during follow-up had high percentages of missing values, possibly because they were requested only when a clinical sign or side effect was detected or before a treatment decision was made. Because a request for a test may itself carry prognostic information, the missing data may have biased our findings. Although multiple imputation was used to estimate the missing values, it can neither guarantee an unbiased estimate nor eliminate potential confounding. Therefore, future studies are required to examine the model performance based on laboratory variables with more complete data. Third, because we do not know whether patients who were censored due to loss to follow-up were as likely to have a subsequent event as those who remained in the study, informative censoring may have occurred and biased the results61, 62. Fourth, some important factors such as HER2 status and targeted therapy were not included in the final model, due in part to missing data. Considering that HER2 status is also essential for treatment decisions, and that targeted therapy in HER2-positive patients could affect survival, further studies could explore the performance of a model incorporating these two variables. Fifth, the patients included in the study were diagnosed between 1988 and 2011. Changes in diagnostic criteria and treatment regimens over this relatively long period might have increased the heterogeneity of our population. Sixth, we excluded patients who lacked laboratory measurements, which might have introduced selection bias. However, when we compared the basic demographic and clinical characteristics of the included and excluded patients, we did not find significant differences in most of these variables (data not shown), indicating that any such bias may be minor. Moreover, we excluded some patients on the basis of specific clinical characteristics, for example those who did not undergo surgery. This design made the study population more homogeneous but might restrict the generalizability of our final model. Finally, this model performs well as a prognostic model to predict DFS of patients once they are diagnosed with breast cancer, but it is not designed to predict responses to subsequent treatments. The prediction model could be improved if follow-up data and evaluations of treatment at different time points were included in the analyses.
In summary, we developed an inexpensive model based mainly on readily available objective data for a cohort of breast cancer patients identified and treated at a single institution. If further validated, this model could be used to identify breast cancer patients who are at high risk of recurrence and could help motivate these individuals to pursue the benefits of treatment.
Methods
Study population
Based on the electronic medical records from the Cancer Registry at Thomas Jefferson University Hospital, we identified histologically confirmed female breast cancer patients who were diagnosed and/or treated from October 1988 to December 2011, as described previously19. For the analyses in this study, we excluded patients (i) without mastectomy or breast conservation surgery and/or without routine blood tests within 3 months after surgery; (ii) with stage 0 or unknown stage and/or cancer histology of carcinoma in situ (including ductal carcinoma in situ and lobular carcinoma in situ); or (iii) without recurrence information or who were never disease free. A final cohort of 1,596 breast cancer patients was selected based on these criteria (Fig. 1). This study was approved by the Institutional Review Board (IRB) at Thomas Jefferson University. Because this study was based on data obtained from the review of archived medical charts, patient consent was waived by the IRB of the Office of Human Research at Thomas Jefferson University under an approved protocol that included approval of the request for waiver of authorization to collect protected health information.
Data collection
Demographic variables including age, race/ethnicity, smoking status, and drinking status were collected in this study. Basic clinical variables included tumor size, stage, grade, histology, lymph nodes metastatic rate, ER status, PR status, and treatments (hormone therapy, chemotherapy, and radiation therapy). Routine blood-based laboratory test data were also obtained from medical charts and comprised a total of 33 variables in four categories: complete blood count (CBC), comprehensive metabolic panel (CMP), coagulation panel, and leukocyte differentiation tests (Supplementary Table S1). The CBC panel included the following 10 variables: white blood cell (WBC), red blood cell (RBC), hemoglobin (HGB), hematocrit (HCT), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), red cell distribution width (RDW), mean platelet volume (MPV), and platelet count. The routine CMP panel recorded 10 variables: blood urea nitrogen (BUN), creatinine, glucose, protein, albumin, alkaline phosphatase (ALP), alanine aminotransferase (ALT), aspartate aminotransferase (AST), total bilirubin, and anion gap. The coagulation status of each patient was evaluated by prothrombin time (PT), partial thromboplastin time (PTT), and international normalized ratio (INR). The percentages of neutrophils, lymphocytes, monocytes, basophils, and eosinophils, as well as their absolute numbers, were obtained from the leukocyte differentiation test.
Statistical analysis
General analytic strategy
SAS (Version 9.2, SAS Institute, Cary, NC) and Stata (Version 12.0, StataCorp, College Station, TX) software packages were used for data analyses. The clinical endpoint analyzed in this study was DFS. Recurrence was defined as regrowth of tumor at the original site, in regional lymph nodes, or in distant organs after surgical removal of the primary tumor. DFS was defined as the time from surgery to the first event of either recurrence or death63. Patients who were alive and recurrence-free on December 31, 2011 were censored. Patients who were lost to follow-up were censored as well. Among the routine blood-based laboratory tests, variables with more than 50% missing observations were excluded from analyses. For patients with multiple measurements of the same variable, the mean value of these measurements was calculated and used in the analyses. To develop a risk prediction model, patients were sorted by surgery date, and two of every three sorted patients were included in the training set. The remaining patients were included in the testing set to internally assess the predictive performance and control overfitting of the model64, 65. All statistical tests were two-sided, and a P-value of less than 0.05 was considered statistically significant.
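The date-ordered split into training and testing sets can be sketched in Python as below. The text does not say which two of every three sorted patients were assigned to the training set, so sending the first two of each consecutive triple to training is an assumption, and the record fields are hypothetical.

```python
def split_two_of_three(patients, key):
    """Sort patients by `key`, then send the first two of every three to the
    training set and the third to the testing set (assumed assignment rule)."""
    ordered = sorted(patients, key=key)
    training = [p for i, p in enumerate(ordered) if i % 3 != 2]
    testing = [p for i, p in enumerate(ordered) if i % 3 == 2]
    return training, testing


# Hypothetical cohort of 1,596 records; `surgery_day` stands in for surgery date.
cohort = [{"id": i, "surgery_day": i} for i in range(1596)]
training, testing = split_two_of_three(cohort, key=lambda p: p["surgery_day"])
```

Applied to 1,596 records, this rule yields 1,064 training and 532 testing records, the set sizes reported in the Results. Interleaving by date keeps the two sets balanced across the long enrollment period.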
Identification of candidate variables
Comparisons of demographic, clinical, and laboratory variables between the training and testing sets were performed using the chi-square test for categorical variables and Student’s t test for continuous variables. The association between each variable and patient DFS was assessed using Kaplan-Meier and Cox proportional hazards regression analyses in the training set. Variables that demonstrated a significant association with DFS were carried forward to stepwise selection and model construction. Laboratory variables had to be significant in all of the categorical, continuous, and log-rank analyses. Bootstrap resampling was used to internally validate these results. A total of 1,000 bootstrap samples were generated for each analysis. Each bootstrap sample was drawn from the original dataset, and the P-value for the analysis was calculated. The number of samples with a P-value less than 0.05 was counted.
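The bootstrap procedure described above can be sketched in Python as follows. The resampling loop is generic; the P-value function shown is a simple two-sample z-test on group means, used only as a stand-in for the Cox and log-rank tests of the study, and all names are hypothetical.

```python
import math
import random


def mean_difference_p_value(rows):
    """Two-sided z-test on the difference in group means (normal approximation).
    Each row is (group, value) with group 0 or 1. Stand-in for a Cox test."""
    low = [value for group, value in rows if group == 0]
    high = [value for group, value in rows if group == 1]
    if len(low) < 2 or len(high) < 2:
        return 1.0  # too few observations in a group to test
    mean_low = sum(low) / len(low)
    mean_high = sum(high) / len(high)
    var_low = sum((v - mean_low) ** 2 for v in low) / (len(low) - 1)
    var_high = sum((v - mean_high) ** 2 for v in high) / (len(high) - 1)
    se = math.sqrt(var_low / len(low) + var_high / len(high))
    if se == 0.0:
        return 1.0 if mean_low == mean_high else 0.0
    z = (mean_high - mean_low) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def bootstrap_significance_rate(rows, p_value_fn, n_boot=1000, alpha=0.05, seed=0):
    """Percentage of bootstrap resamples in which p_value_fn(resample) < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_boot):
        resample = rng.choices(rows, k=len(rows))  # draw with replacement
        if p_value_fn(resample) < alpha:
            hits += 1
    return 100.0 * hits / n_boot
```

The returned percentage plays the role of the "Bootstrap % (<0.05)" column of Table 2: a value near 100 indicates that the association remains significant in nearly every resample.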
Stepwise selection and model construction
To minimize confounding arising from potentially high correlations between laboratory variables and demographic and clinical variables, we forced the demographic and clinical variables that were significant in the univariate analysis into the model. For the laboratory variables, we conducted stepwise selection in a multivariate Cox proportional hazards model among the laboratory variables that were significant in the univariate analysis. All continuous variables were kept continuous in the multivariate Cox regression and model construction process to avoid loss of power and residual confounding66. Multiple imputation was used to handle missing data in the training set67. Ten imputed datasets were generated from the training set with Stata’s MI package, based on multivariate normal imputation68. Missing data in the testing set were imputed with the corresponding mean values from the training set. Before imputation, the Box-Cox method was used to transform variables with skewed distributions toward normality. In each imputed dataset, forward stepwise selection was conducted using Akaike’s information criterion (AIC), which balances goodness of fit against model complexity and reduces the risk of overfitting69. The model with the smallest AIC was selected as the best model for each imputed dataset. The final model was a composite derived from the 10 imputed datasets, with the significant demographic and clinical variables forced in. A laboratory variable that was selected in at least 6 of the 10 imputed datasets was included in the final model. The parameter estimate (weight, or coefficient) of each variable was calculated from the pooled imputed datasets. A prognostic index was derived as the sum of each variable multiplied by its corresponding weight in the final model.
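Two steps of this procedure, the rule of keeping a laboratory variable selected in at least 6 of the 10 imputed datasets and the calculation of the prognostic index as a weighted sum, can be illustrated with a short Python sketch. All variable names (`lab_a`, `clinical_x`, and so on), selection results, and coefficient values below are hypothetical placeholders and are not the estimates from this study.

```python
from collections import Counter


def select_laboratory_variables(selected_per_imputation, min_count=6):
    """Keep variables chosen by stepwise selection in at least `min_count`
    of the imputed datasets."""
    counts = Counter(
        name for selected in selected_per_imputation for name in set(selected)
    )
    return sorted(name for name, n in counts.items() if n >= min_count)


def prognostic_index(patient, weights):
    """Sum of each variable's value multiplied by its model coefficient."""
    return sum(weights[name] * patient[name] for name in weights)


# Hypothetical stepwise-selection results for 10 imputed datasets.
selected_per_imputation = (
    [["lab_a", "lab_b", "lab_c"]] * 6
    + [["lab_a", "lab_b"]] * 2
    + [["lab_a", "lab_d"]] * 2
)
final_lab_variables = select_laboratory_variables(selected_per_imputation)

# Hypothetical coefficients and patient values, chosen only for easy arithmetic.
weights = {"clinical_x": 0.5, "lab_a": -0.25}
patient = {"clinical_x": 3.0, "lab_a": 4.0}
index_value = prognostic_index(patient, weights)
```

In this made-up example `lab_d` is dropped because it was selected in only 2 of the 10 datasets, and the index for the example patient is 0.5 × 3.0 − 0.25 × 4.0 = 0.5.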
Model validation
Two methods were used for model validation, and both were applied in the training and testing sets. In the first method, the model’s ability to predict recurrence was assessed by constructing receiver operating characteristic (ROC) curves and calculating the areas under the curve (AUCs)70.
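As an illustration of the AUC, the sketch below computes it as the proportion of (event, non-event) patient pairs in which the patient with the event has the higher score, counting ties as one half. It assumes a binary recurrence indicator and ignores censoring and follow-up time, so it is a simplification and not necessarily the estimator used in this study; the scores are made up.

```python
def roc_auc(scores, events):
    """AUC as the probability that a patient with an event has a higher score
    than a patient without one; tied scores count as one half."""
    positives = [s for s, e in zip(scores, events) if e == 1]
    negatives = [s for s, e in zip(scores, events) if e == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one event and one non-event")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))


# Made-up prognostic index values and recurrence indicators (1 = recurrence).
scores = [0.1, 0.4, 0.35, 0.8]
events = [0, 0, 1, 1]
auc = roc_auc(scores, events)  # 3 of 4 pairs are correctly ordered: 0.75
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of patients with and without recurrence.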
In the second validation method, patients were classified into three risk groups based on the prognostic index calculated by the model. The cutoff values were the tertiles of the prognostic index distribution. HRs with 95% CIs for the different risk groups were estimated using Cox proportional hazards models. Survival curves were plotted using the Kaplan-Meier method and compared using the log-rank test.
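The tertile-based grouping can be sketched in Python as follows. The index values are made up, and two details are assumptions not specified in the text: the quantile interpolation rule (the default of `statistics.quantiles`) and the assignment of values exactly equal to a cutoff to the lower group.

```python
from statistics import quantiles


def tertile_cutoffs(index_values):
    """Return the two cut points that divide the values into three equal groups."""
    lower, upper = quantiles(index_values, n=3)
    return lower, upper


def risk_group(value, cutoffs):
    """Assign a risk group; values equal to a cutoff go to the lower group
    (an assumption, the boundary handling is not stated in the text)."""
    lower, upper = cutoffs
    if value <= lower:
        return "low"
    if value <= upper:
        return "medium"
    return "high"


# Made-up prognostic index values for nine patients.
index_values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
cutoffs = tertile_cutoffs(index_values)
groups = [risk_group(v, cutoffs) for v in index_values]
```

Keeping the cutoffs separate from the classification step allows the same pair of cutoffs to be reused on another dataset if desired.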
Acknowledgements
The work was supported by grants from the National Natural Science Foundation of China (No. 21347005 and 81041098).
Author Contributions
C.W., Z.F. and Z.Z. conceived and designed the study. Z.Z., L.L., Z.Y., T.F., Y.D., A.S. and Y.Z. collected clinical data. Z.Z., L.L. analyzed the data. Z.Z., L.L., Z.Y., T.F. and Y.D. interpreted the data. Z.Y., A.S., D.W. and K.L. contributed materials/analysis tools. Z.Z., L.L., C.W. and Z.F. wrote the manuscript. All authors reviewed the manuscript.
Competing Interests
The authors declare that they have no competing interests.
Footnotes
Electronic supplementary material
Supplementary information accompanies this paper at doi:10.1038/s41598-017-08240-2
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Chun Wang, Email: chun.wang@jefferson.edu.
Zhimin Fan, Email: fanzhimn@163.com.
References
- 1. DeSantis C, Ma J, Bryan L, Jemal A. Breast cancer statistics, 2013. CA Cancer J Clin. 2014;64:52–62. doi: 10.3322/caac.21203.
- 2. American Cancer Society. Breast Cancer Facts & Figures 2015–2016. Atlanta: American Cancer Society, Inc. (2015).
- 3. Amoroso V, et al. International Expert Consensus on Primary Systemic Therapy in the Management of Early Breast Cancer: Highlights of the Fifth Symposium on Primary Systemic Therapy in the Management of Operable Breast Cancer, Cremona, Italy (2013) J Natl Cancer Inst Monogr. 2015;2015:90–96. doi: 10.1093/jncimonographs/lgv023.
- 4. Goldhirsch A, et al. Personalizing the treatment of women with early breast cancer: highlights of the St Gallen International Expert Consensus on the Primary Therapy of Early Breast Cancer 2013. Ann Oncol. 2013;24:2206–2223. doi: 10.1093/annonc/mdt303.
- 5. Wapnir IL, et al. Prognosis after ipsilateral breast tumor recurrence and locoregional recurrences in five National Surgical Adjuvant Breast and Bowel Project node-positive adjuvant breast cancer trials. J Clin Oncol. 2006;24:2028–2037. doi: 10.1200/JCO.2005.04.3273.
- 6. Fisher B, et al. Twenty-year follow-up of a randomized trial comparing total mastectomy, lumpectomy, and lumpectomy plus irradiation for the treatment of invasive breast cancer. N Engl J Med. 2002;347:1233–1241. doi: 10.1056/NEJMoa022152.
- 7. Marshall EM, et al. Prognostic Factors of Survival among Women with Metastatic Breast Cancer and Impact of Primary or Secondary Nature of Disease on Survival: A French Population-Based Study. Breast J. 2017;23:138–145. doi: 10.1111/tbj.12717.
- 8. Coates AS, et al. Tailoring therapies–improving the management of early breast cancer: St Gallen International Expert Consensus on the Primary Therapy of Early Breast Cancer. Ann Oncol. 2015;26:1533–1546. doi: 10.1093/annonc/mdv221.
- 9. Rivenbark AG, O’Connor SM, Coleman WB. Molecular and cellular heterogeneity in breast cancer: challenges for personalized medicine. Am J Pathol. 2013;183:1113–1124. doi: 10.1016/j.ajpath.2013.08.002.
- 10. Ito M, et al. Evaluation of aldehyde dehydrogenase 1 and transcription factors in both primary breast cancer and axillary lymph node metastases as a prognostic factor. Breast Cancer. 2016;23:437–444. doi: 10.1007/s12282-015-0583-1.
- 11. Chen B, et al. miR-22 as a prognostic factor targets glucose transporter protein type 1 in breast cancer. Cancer Lett. 2015;356:410–417. doi: 10.1016/j.canlet.2014.09.028.
- 12. Watanabe G, et al. Combined Immunohistochemistry of PLK1, p21, and p53 for Predicting TP53 Status: An Independent Prognostic Factor of Breast Cancer. Am J Surg Pathol. 2015;39:1026–1034. doi: 10.1097/PAS.0000000000000386.
- 13. Zwart W, et al. SRC3 Phosphorylation at Serine 543 Is a Positive Independent Prognostic Factor in ER-Positive Breast Cancer. Clin Cancer Res. 2016;22:479–491. doi: 10.1158/1078-0432.CCR-14-3277.
- 14. Rui X, Li Y, Jin F, Li F. TMPRSS3 is a novel poor prognostic factor for breast cancer. Int J Clin Exp Pathol. 2015;8:5435–5442.
- 15. Dong LL, Chen LM, Wang WM, Zhang LM. Decreased expression of microRNA-124 is an independent unfavorable prognostic factor for patients with breast cancer. Diagn Pathol. 2015;10:45. doi: 10.1186/s13000-015-0257-5.
- 16. Seong MK, et al. Bcl-2 is a highly significant prognostic marker of hormone-receptor-positive, human epidermal growth factor receptor-2-negative breast cancer. Breast Cancer Res Treat. 2015;150:141–148. doi: 10.1007/s10549-015-3305-7.
- 17. Iqbal J, Ginsburg O, Rochon PA, Sun P, Narod SA. Differences in breast cancer stage at diagnosis and cancer-specific survival by race and ethnicity in the United States. JAMA. 2015;313:165–173. doi: 10.1001/jama.2014.17322.
- 18. Tammemagi CM, Nerenz D, Neslund-Dudas C, Feldkamp C, Nathanson D. Comorbidity and survival disparities among black and white patients with breast cancer. JAMA. 2005;294:1765–1772. doi: 10.1001/jama.294.14.1765.
- 19. Wang C, et al. Racial disparity in breast cancer survival: the impact of pre-treatment hematologic variables. Cancer Causes Control. 2015;26:45–56. doi: 10.1007/s10552-014-0481-4.
- 20. Ehinger A, et al. Histological grade provides significant prognostic information in addition to breast cancer subtypes defined according to St Gallen 2013. Acta Oncol. 2017;56:68–74. doi: 10.1080/0284186X.2016.1237778.
- 21. Lis CG, Grutsch JF, Vashi PG, Lammersfeld CA. Is serum albumin an independent predictor of survival in patients with breast cancer? JPEN J Parenter Enteral Nutr. 2003;27:10–15. doi: 10.1177/014860710302700110.
- 22. Means RT., Jr. Pathogenesis of the anemia of chronic disease: a cytokine-mediated anemia. Stem Cells. 1995;13:32–37. doi: 10.1002/stem.5530130105.
- 23. Argiles JM, Busquets S, Toledo M, Lopez-Soriano FJ. The role of cytokines in cancer cachexia. Curr Opin Support Palliat Care. 2009;3:263–268. doi: 10.1097/SPC.0b013e3283311d09.
- 24. Denison U, et al. Incidence of anaemia in breast cancer patients receiving adjuvant chemotherapy. Breast Cancer Res Treat. 2003;79:347–353. doi: 10.1023/A:1024016508925.
- 25. Ludwig H, et al. The European Cancer Anaemia Survey (ECAS): a large, multinational, prospective survey defining the prevalence, incidence, and treatment of anaemia in cancer patients. Eur J Cancer. 2004;40:2293–2306. doi: 10.1016/j.ejca.2004.06.019.
- 26. Lawless GDF, Jon M. 2000: Cumulative prevalence of anemia in early-stage breast cancer patients. Blood. 1996;11:390b.
- 27. Kirshner J, Hatch M, Hennessy DD, Fridman M, Tannous RE. Anemia in stage II and III breast cancer patients treated with adjuvant doxorubicin and cyclophosphamide chemotherapy. Oncologist. 2004;9:25–32. doi: 10.1634/theoncologist.9-1-25.
- 28. Wan S, et al. Post-diagnosis hemoglobin change associates with overall survival of multiple malignancies - results from a 14-year hospital-based cohort of lung, breast, colorectal, and liver cancers. BMC Cancer. 2013;13:340. doi: 10.1186/1471-2407-13-340.
- 29. Bottini A. BA, Brizzi MP, et al. Impact of pretreatment hemoglobin levels on tumor response, microvessel count and Ki67 expression in breast cancer patients undergoing primary chemotherapy. Breast Cancer Res Treat. 2002;76:S65. doi: 10.1023/A:1020228620173.
- 30. Dubsky P, et al. Anemia is a significant prognostic factor in local relapse-free survival of premenopausal primary breast cancer patients receiving adjuvant cyclophosphamide/methotrexate/5-fluorouracil chemotherapy. Clin Cancer Res. 2008;14:2082–2087. doi: 10.1158/1078-0432.CCR-07-2068.
- 31. Kamby C, Sengelov L. Survival and pattern of failure following locoregional recurrence of breast cancer. Clin Oncol (R Coll Radiol). 1999;11:156–163. doi: 10.1053/clon.1999.9033.
- 32. Jakesz R, et al. Randomized adjuvant trial of tamoxifen and goserelin versus cyclophosphamide, methotrexate, and fluorouracil: evidence for the superiority of treatment with endocrine blockade in premenopausal patients with hormone-responsive breast cancer–Austrian Breast and Colorectal Cancer Study Group Trial 5. J Clin Oncol. 2002;20:4621–4627. doi: 10.1200/JCO.2002.09.112.
- 33. Henke M, Sindlinger F, Ikenberg H, Gerds T, Schumacher M. Blood hemoglobin level and treatment outcome of early breast cancer. Strahlenther Onkol. 2004;180:45–51. doi: 10.1007/s00066-004-1123-7.
- 34. Nisman B, et al. Prognostic factors for survival in metastatic breast cancer during first-line paclitaxel chemotherapy. Anticancer Res. 2003;23:1939–1942.
- 35. Cox RF, et al. Microcalcifications in breast cancer: novel insights into the molecular mechanism and functional consequence of mammary mineralisation. Br J Cancer. 2012;106:525–537. doi: 10.1038/bjc.2011.583.
- 36. Crivellari D, et al. Routine tests during follow-up of patients after primary treatment for operable breast cancer. International (Ludwig) Breast Cancer Study Group (IBCSG) Ann Oncol. 1995;6:769–776. doi: 10.1093/oxfordjournals.annonc.a059314.
- 37. Liu X, et al. Prognostic significance of pretreatment serum levels of albumin, LDH and total bilirubin in patients with non-metastatic breast cancer. Carcinogenesis. 2015;36:243–248. doi: 10.1093/carcin/bgu247.
- 38. Choudhari A, Desai P, Indumati V, Kadi S. Activities of serum Ada, GGT and alp in carcinoma breast-a case control study for diagnostic and prognostic significance. Indian J Med Sci. 2013;67:123–129. doi: 10.4103/0019-5359.122740.
- 39. Mann KG, Butenas S, Brummel K. The dynamics of thrombin formation. Arterioscler Thromb Vasc Biol. 2003;23:17–25. doi: 10.1161/01.ATV.0000046238.23903.FC.
- 40. Falati S, et al. Accumulation of tissue factor into developing thrombi in vivo is dependent upon microparticle P-selectin glycoprotein ligand 1 and platelet P-selectin. J Exp Med. 2003;197:1585–1598. doi: 10.1084/jem.20021868.
- 41. Kasthuri RS, Taubman MB, Mackman N. Role of tissue factor in cancer. J Clin Oncol. 2009;27:4834–4838. doi: 10.1200/JCO.2009.22.6324.
- 42. Li R, et al. Presence of intratumoral platelets is associated with tumor vessel structure and metastasis. BMC Cancer. 2014;14:167. doi: 10.1186/1471-2407-14-167.
- 43. Demers M, et al. Cancers predispose neutrophils to release extracellular DNA traps that contribute to cancer-associated thrombosis. Proc Natl Acad Sci USA. 2012;109:13076–13081. doi: 10.1073/pnas.1200419109.
- 44. Yu JL, et al. Oncogenic events regulate tissue factor expression in colorectal cancer cells: implications for tumor progression and angiogenesis. Blood. 2005;105:1734–1741. doi: 10.1182/blood-2004-05-2042.
- 45. Haddad TC, Greeno EW. Chemotherapy-induced thrombosis. Thromb Res. 2006;118:555–568. doi: 10.1016/j.thromres.2005.10.015.
- 46. Connolly GC, Khorana AA. Risk stratification for cancer-associated venous thromboembolism. Best Pract Res Clin Haematol. 2009;22:35–47. doi: 10.1016/j.beha.2008.12.006.
- 47. Deitcher SR, Gomes MP. The risk of venous thromboembolic disease associated with adjuvant hormone therapy for breast carcinoma: a systematic review. Cancer. 2004;101:439–449. doi: 10.1002/cncr.20347.
- 48. Boccaccio C, et al. The MET oncogene drives a genetic programme linking cancer to haemostasis. Nature. 2005;434:396–400. doi: 10.1038/nature03357.
- 49. Williams JC, Mackman N. Tissue factor in health and disease. Front Biosci (Elite Ed). 2012;4:358–372. doi: 10.2741/e383.
- 50. Engelhardt EG, et al. Predicting and communicating the risk of recurrence and death in women with early-stage breast cancer: a systematic review of risk prediction models. J Clin Oncol. 2014;32:238–250. doi: 10.1200/JCO.2013.50.3417.
- 51. Volinia S, Croce CM. Prognostic microRNA/mRNA signature from the integrated analysis of patients with invasive breast cancer. Proc Natl Acad Sci USA. 2013;110:7413–7417. doi: 10.1073/pnas.1304977110.
- 52. Griffith OL, et al. A robust prognostic signature for hormone-positive node-negative breast cancer. Genome Med. 2013;5:92. doi: 10.1186/gm496.
- 53. Campbell HE, Gray AM, Harris AL, Briggs AH, Taylor MA. Estimation and external validation of a new prognostic model for predicting recurrence-free survival for early breast cancer patients in the UK. Br J Cancer. 2010;103:776–786. doi: 10.1038/sj.bjc.6605863.
- 54. Malmstrom P, et al. S-phase fraction and urokinase plasminogen activator are better markers for distant recurrences than Nottingham Prognostic Index and histologic grade in a prospective study of premenopausal lymph node-negative breast cancer. J Clin Oncol. 2001;19:2010–2019. doi: 10.1200/JCO.2001.19.7.2010.
- 55. Blamey RW, et al. Survival of invasive breast cancer according to the Nottingham Prognostic Index in cases diagnosed in 1990–1999. Eur J Cancer. 2007;43:1548–1555. doi: 10.1016/j.ejca.2007.01.016.
- 56. Galea MH, Blamey RW, Elston CE, Ellis IO. The Nottingham Prognostic Index in primary breast cancer. Breast Cancer Res Treat. 1992;22:207–219. doi: 10.1007/BF01840834.
- 57. Tan W, et al. A combination of Nottingham prognostic index and IHC4 score predicts pathological complete response of neoadjuvant chemotherapy in estrogen receptor positive breast cancer. Oncotarget. 2016;7:87312–87322. doi: 10.18632/oncotarget.13549.
- 58. Wu, X. et al. Personalized Prognostic Prediction Models for Breast Cancer Recurrence and Survival Incorporating Multidimensional Data. J Natl Cancer Inst. 109 (2017).
- 59. Drukker CA, et al. Optimized outcome prediction in breast cancer by combining the 70-gene signature with clinical risk prediction algorithms. Breast Cancer Res Treat. 2014;145:697–705. doi: 10.1007/s10549-014-2954-2.
- 60. Kim W, et al. Development of novel breast cancer recurrence prediction model using support vector machine. J Breast Cancer. 2012;15:230–238. doi: 10.4048/jbc.2012.15.2.230.
- 61. Ranganathan P, Pramesh CS. Censoring in survival analysis: Potential for bias. Perspect Clin Res. 2012;3:40. doi: 10.4103/2229-3485.92307.
- 62. Clark TG, Bradburn MJ, Love SB, Altman DG. Survival analysis part I: basic concepts and first analyses. Br J Cancer. 2003;89:232–238. doi: 10.1038/sj.bjc.6601118.
- 63. Sargent DJ, et al. Disease-free survival versus overall survival as a primary end point for adjuvant colon cancer studies: individual patient data from 20,898 patients on 18 randomized trials. J Clin Oncol. 2005;23:8664–8670. doi: 10.1200/JCO.2005.01.6071.
- 64. Baker SG, Kramer BS, Srivastava S. Markers for early detection of cancer: statistical guidelines for nested case-control studies. BMC Med Res Methodol. 2002;2:4. doi: 10.1186/1471-2288-2-4.
- 65. Shilpi A, Bi Y, Jung S, Patra SK, Davuluri RV. Identification of Genetic and Epigenetic Variants Associated with Breast Cancer Prognosis by Integrative Bioinformatics Analysis. Cancer Inform. 2017;16:1–13. doi: 10.4137/CIN.S39783.
- 66. Royston P, Altman DG, Sauerbrei W. Dichotomizing continuous predictors in multiple regression: a bad idea. Stat Med. 2006;25:127–141. doi: 10.1002/sim.2331.
- 67. Lee KJ, Carlin JB. Multiple imputation for missing data: fully conditional specification versus multivariate normal imputation. Am J Epidemiol. 2010;171:624–632. doi: 10.1093/aje/kwp425.
- 68. Morris TP, White IR, Carpenter JR, Stanworth SJ, Royston P. Combining fractional polynomial model building with multiple imputation. Stat Med. 2015;34:3298–3317. doi: 10.1002/sim.6553.
- 69. Olofsen E, Dahan A. Using Akaike’s information theoretic criterion in mixed-effects modeling of pharmacokinetic data: a simulation study. F1000Res. 2013;2:71. doi: 10.12688/f1000research.2-71.v1.
- 70. Chambless LE, Cummiskey CP, Cui G. Several methods to assess improvement in risk prediction models: extension to survival analysis. Stat Med. 2011;30:22–38. doi: 10.1002/sim.4026.