Author manuscript; available in PMC: 2025 Jun 24.
Published in final edited form as: Cancer Epidemiol Biomarkers Prev. 2020 Apr 3;29(6):1271–1277. doi: 10.1158/1055-9965.EPI-19-1478

Validating breast-cancer risk prediction models in the Korean Cancer Prevention Study-II Biobank

Yon Ho Jee 1, Chi Gao 1, Jihye Kim 1, Seho Park 2, Sun Ha Jee 3,*, Peter Kraft 1,4
PMCID: PMC12186873  NIHMSID: NIHMS1582578  PMID: 32245787

Abstract

Background:

Risk prediction models may be useful for precision breast-cancer screening. We aimed to evaluate the performance of breast cancer risk models developed in European-ancestry studies in a Korean population.

Methods:

We compared discrimination and calibration of three multivariable risk models in a cohort of 77,457 women from the Korean Cancer Prevention Study (KCPS)-II. The first incorporated US breast-cancer incidence and mortality rates, US risk-factor distributions, and relative risk (RR) estimates from European-ancestry studies. The second recalibrated the first by using Korean incidence and mortality rates and Korean risk factor distributions, while retaining the European-ancestry RR estimates. Finally, we derived a Korea-specific model incorporating the RR estimates from KCPS.

Results:

The US European-ancestry breast cancer risk model was well calibrated among Korean women <50 years (Expected/Observed=1.124 (0.989, 1.278)) but markedly overestimated the risk for those ≥50 years (E/O=2.472 (2.005, 3.049)). Recalibrating absolute risk estimates using Korean breast cancer rates and risk distributions markedly improved the calibration in women ≥50 (E/O=1.018 (0.825, 1.255)). The model incorporating Korean-based RRs had similar but not clearly improved performance relative to the recalibrated model.

Conclusions:

The poor performance of the US European-ancestry breast cancer risk model among older Korean women highlights the importance of tailoring absolute risk models to specific populations. Recalibrating the model using Korean incidence and mortality rates and risk factor distributions greatly improved performance.

Impact:

The data will provide valuable information to plan and evaluate actions against breast cancer focused on primary prevention and early detection in Korean women.

Keywords: breast cancer, risk prediction, risk stratification, validation

Introduction

Breast cancer is the most common non-skin cancer in women worldwide. Although Korea has had a lower incidence than Western countries, the incidence of breast cancer is rapidly increasing. It is now the second leading cancer in Korean women (1) after thyroid cancer, with 21,402 new cases diagnosed in 2014. This increasing trend likely reflects changes in reproductive factors in Korean women such as early menarche, late menopause, and having fewer children at an older age (2), all of which are secondary to the rapid development and westernization of Korea. However, the age distribution of breast cancer incidence in Korea is still markedly different from that in Western countries, with a peak at 45–49 years and a higher proportion of premenopausal women (3–5).

Mammography and other screening modalities can reduce morbidity and mortality of breast cancer (6,7). In Korea, mammography is recommended every 2 years for women aged 40 or older. Breast cancer susceptibility, however, largely depends on multiple risk factors, and it is crucial to identify high-risk women who may benefit from aggressive screening strategies. Thus, developing risk prediction models and improving their predictive value is an important step towards targeted screening and prevention.

A number of breast-cancer risk models have been developed in European-ancestry populations (8). These models use information on reproductive factors, family history of disease, mammographic density, and measured genetic factors to estimate a woman’s absolute risk of disease. Recent work has focused on developing and validating a “synthetic” multivariable risk model that can include a comprehensive set of risk factors (9,10). The absolute risks from this model are well calibrated across US- and European cohorts, but they are unlikely to provide accurate risk estimates for Korean women without further modification. It is unknown whether recalibrated risk estimates using Korean population incidence rates but retaining relative risk estimates from European-based studies will perform well among Korean women.

This study aims to evaluate the discrimination and calibration of three breast cancer risk models among Korean women: a model based on risk-factor RRs estimated from European-ancestry studies, US incidence and mortality rates, and the distribution of risk factors among US non-Hispanic white women; a model using European-ancestry RR estimates but Korean incidence and mortality rates and Korean risk-factor distributions; and a model that combines Korean incidence and mortality rates and Korean risk-factor distributions with Korean RR estimates. Although the last model in principle should perform best, in practice, if the risk factor RRs are similar across European-ancestry and Korean populations, the recalibrated model using RRs estimated among Europeans may perform well, especially if the sample sizes used to estimate the Korean RRs are relatively small.

We calculate absolute risk estimates and evaluate their performance using data from the Korean Cancer Prevention Study-II (KCPS-II) Biobank and the Individualized Coherent Absolute Risk Estimation (iCARE) software. iCARE was designed to build and validate risk prediction models for a population by combining information on relative risk (RR) estimates, age-specific incidence and mortality rates, and risk factor distributions from multiple data sources (11).

Materials and Methods

Study population for discrimination and calibration analyses

We used the KCPS-II Biobank to evaluate the discrimination and calibration of breast-cancer absolute risk models. The KCPS-II includes 78,282 women who undertook routine health assessments at health promotion centers between 2004 and 2013. The study design and recruitment have been described in detail previously (12). All participants gave written informed consent before participation. The Institutional Review Board of Yonsei University approved this study protocol (IRB approval number 4–2014-1008). Exclusion criteria included no information on height and weight, history of breast cancer, and age at entry below 20 or above 80 years. The final analytic sample included 77,457 women (S.Figure 1, S.Table 1).

Data Collection

All participants were asked to complete a structured questionnaire to collect the following details: age at menarche, age at menopause, parity, age at first birth, oral contraceptive (OC) use (never, ever), hormone replacement therapy (HRT) use, alcohol intake, history of benign breast disease (BBD), and family history of breast cancer. Height and weight were measured while participants wore light clothing. Body mass index (BMI) was calculated as the weight (kg) divided by the height squared (m²).

Follow-up for breast cancer

The principal outcome was incidence of breast cancer (ICD-10 codes C50). Since all participants have a unique identification number assigned at birth, allowing linkage with the national cancer registry and hospital admission records, the follow-up was almost 100% complete. Cancer diagnoses are based on histological type, resulting in high accuracy.

Study population for RR estimation

We used the Korean Cancer Prevention Study (KCPS) to independently estimate the relative risks for breast cancer risk factors (S.Table 2). The KCPS is a 1.3-million-member prospective cohort study, designed to assess risk factors for mortality, incidence, and hospital admission from cancer, with a follow-up of 25 years (13). The KCPS cohort includes the 443,627 women aged 20–80 years who received health insurance from the Korean Medical Insurance Corporation and who had biennial medical evaluations between 1992 and 1995. Risk factors were collected in a similar manner to the KCPS-II. Since history of BBD was not asked for women in the KCPS, we defined history of BBD based on ICD-10 code D24. In the KCPS cohort, an incident breast cancer was coded based on a hospital admission for a cancer diagnosis.

Statistical analysis

We evaluated the performance of a recently published breast-cancer absolute-risk model (9) in the KCPS-II Biobank. We compared the performance of three models: (A) the US-based European-ancestry model, using incidence, mortality, and risk-factor distributions among US non-Hispanic white women and European-ancestry RRs (USEA); (B) a recalibrated model, using Korean incidence, mortality, and risk-factor distributions but European-ancestry RRs (KREA); and (C) a fully Korean-based model using Korean incidence, mortality, and risk-factor distributions and RR estimates from the KCPS (KRKR).

The models include data on reproductive, anthropometric, behavioral and clinical risk factors: age at menarche (≤10, 11, 12, 13, 14, 15, ≥16 years), age at menopause (<40, 40–44, 45–49, 50–54, ≥55 years), parity (0, 1, 2, ≥3 births), age at first birth (<20, 20–24, 25–29, ≥30 years), OC use (never, ever), HRT use (never, ever), BMI (<18.5, 18.5–24.9, 25.0–29.9, ≥30.0 kg/m²), height (cm/10), alcohol intake (0, 1–4, 5–14, 15–24, 25–34, 35–44, ≥45 grams/day), history of BBD (no, yes), and family history of breast cancer (no, yes).
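As a small illustration of how one of these inputs could be derived, the sketch below computes BMI from measured weight and height and assigns it to the four BMI categories listed above. The function names are hypothetical, and rounding BMI to one decimal place before categorizing is an assumption made here so that the published category bounds (for example 24.9 and 25.0) are contiguous.

```python
def body_mass_index(weight_kg: float, height_cm: float) -> float:
    """BMI = weight (kg) divided by height squared (m^2)."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def bmi_category(bmi: float) -> str:
    """Map a BMI value to the model's four categories (kg/m^2).

    Assumption: BMI is rounded to one decimal before comparison,
    so that 24.9 and 25.0 are adjacent category bounds.
    """
    b = round(bmi, 1)
    if b < 18.5:
        return "<18.5"
    if b < 25.0:
        return "18.5-24.9"
    if b < 30.0:
        return "25.0-29.9"
    return ">=30.0"


# Hypothetical participant: 60 kg, 160 cm
example_bmi = body_mass_index(60.0, 160.0)
example_category = bmi_category(example_bmi)
```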

Due to the effect of estrogen, postmenopausal women have a greater risk of developing breast cancer than premenopausal women. Moreover, several factors such as obesity and HRT use have been linked to a higher risk of breast cancer only for postmenopausal women (14,15). Therefore, we assessed the breast cancer risk models separately for women younger than 50 years and women aged 50 or older.

For the USEA and KREA models, we used literature-based RRs of the risk factors (9). For the KRKR model, the RR estimates were obtained from multivariable Cox regression models based on a Korean cohort (KCPS). S.Table 2 provides detailed descriptions of RR estimates included in the models and population distribution.

iCARE uses average age-specific incidence rates to calibrate the predicted risks (16,17). We used information on age-specific breast cancer incidence rates (S.Figure 2) and mortality rates from population-based registries in the U.S. and Korea: the 2008–2012 U.S. Surveillance Epidemiology and End Results data and the 2010 Korea National Statistical Office, respectively.

To obtain information on risk factor distributions, iCARE uses an additional individual-level reference dataset of risk factors representing each population (11). The reference datasets were 2010 National Health and Nutrition Examination Survey (NHANES) for the US-based model and 2010–2012 Korean NHANES (KNHANES) for the recalibrated model and the Korean-based model. To account for missing data in KNHANES for continuous factors, we performed conditional mean imputation, using MICE to draw multiple samples of missing factors conditional on observed data (m=10), then averaging factor values over the samples. For two unmeasured factors, history of BBD and family history of breast cancer, in the KNHANES, we used single random draw imputation based on the prevalence of the corresponding factors from the validation cohort. Using the imputation and simulation described above, we created a complete dataset of KNHANES with no missing information on risk factors.
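The two imputation steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the multiple-imputation draws are taken as given (the MICE step itself is not reproduced), and the prevalence value in the example is a placeholder rather than a figure from the validation cohort.

```python
import random


def average_over_imputations(imputed_draws):
    """Conditional-mean style imputation: average each record's value
    over m imputed datasets (each inner list is one imputed dataset)."""
    m = len(imputed_draws)
    n = len(imputed_draws[0])
    return [sum(draw[i] for draw in imputed_draws) / m for i in range(n)]


def impute_binary_by_prevalence(n, prevalence, seed=0):
    """Single random-draw imputation of an unmeasured binary factor:
    each record is set to 1 with probability equal to the prevalence."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return [1 if rng.random() < prevalence else 0 for _ in range(n)]


# Hypothetical usage: two imputed datasets of a continuous factor,
# and a binary factor with an assumed (placeholder) prevalence of 5%.
averaged = average_over_imputations([[1.0, 2.0], [3.0, 4.0]])
bbd_history = impute_binary_by_prevalence(n=1000, prevalence=0.05)
```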

We evaluated model performance in terms of discrimination and calibration. For risk discrimination, we assessed the area under the receiver operating characteristic curve (AUC). For calibration, the KCPS-II Biobank participants were categorized into deciles of five-year absolute risk predicted by the iCARE Lit model. The predicted and observed incidence in each decile were compared using the expected-to-observed ratio (E/O) and the Hosmer-Lemeshow χ² test. Furthermore, we estimated cumulative and 10-year absolute risk using the current probability method (16) in the Korean-based model.
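The three performance measures named above can be sketched in a few lines. This is not the iCARE implementation: the function names are hypothetical, the E/O confidence interval assumes a Poisson-based interval on the log scale (the text does not state which interval was used, although this form appears to agree with the E/O interval reported for women <50 years to rounding), and the AUC uses a simple pairwise comparison that is only practical for small inputs.

```python
import math


def expected_observed_ratio(expected, observed, z=1.96):
    """E/O ratio with an assumed Poisson-based log-scale interval:
    (E/O) * exp(+/- z / sqrt(O))."""
    ratio = expected / observed
    half_width = z / math.sqrt(observed)
    return ratio, ratio * math.exp(-half_width), ratio * math.exp(half_width)


def hosmer_lemeshow_statistic(groups):
    """Hosmer-Lemeshow chi-square statistic over risk groups.

    groups: iterable of (n, expected_cases, observed_cases) per group,
    for example one tuple per decile of predicted risk.
    """
    stat = 0.0
    for n, expected, observed in groups:
        p = expected / n  # mean predicted risk in the group
        stat += (observed - expected) ** 2 / (n * p * (1.0 - p))
    return stat


def auc(case_risks, noncase_risks):
    """Probability that a random case has a higher predicted risk than
    a random non-case (ties count one half). O(n*m) pairwise version."""
    wins = 0.0
    for c in case_risks:
        for nc in noncase_risks:
            if c > nc:
                wins += 1.0
            elif c == nc:
                wins += 0.5
    return wins / (len(case_risks) * len(noncase_risks))


# Usage with the counts reported for women <50 years:
# 233 observed cases and an expected count of 1.124 * 233.
ratio, lower, upper = expected_observed_ratio(1.124 * 233, 233)
```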

The absolute risk of developing breast cancer for a woman of age a over the age interval from a to a + s can be calculated as

$$R_{a,a+s} = \int_{a}^{a+s} \lambda_0(t)\exp(Z\beta)\,\exp\!\left(-\int_{a}^{t}\left[\lambda_0(u)\exp(Z\beta) + m(u)\right]du\right)dt \qquad \text{(a)}$$

Formula (a) holds under the assumption that the risk factors Z act in a multiplicative fashion on the baseline hazard function λ0(t). Formula (a) accounts for competing risks due to mortality from other causes through the age-specific mortality rate function m(t).
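A minimal numerical sketch of formula (a) is given below, assuming that the baseline hazard and the competing mortality rate are constant within each one-year age band, in which case the integral can be evaluated exactly band by band. The function name, the dictionary inputs, and the rates in the usage example are hypothetical and are not taken from the study.

```python
import math


def absolute_risk(age_start, age_end, baseline_hazard, competing_mortality,
                  relative_risk):
    """Absolute risk between age_start and age_end under formula (a).

    baseline_hazard, competing_mortality: dicts mapping integer age to an
    annual rate, assumed constant within each one-year age band.
    relative_risk: exp(Z * beta) for the woman of interest.
    """
    event_free = 1.0  # probability of no breast cancer and no death so far
    risk = 0.0
    for age in range(age_start, age_end):
        hazard = baseline_hazard[age] * relative_risk
        total = hazard + competing_mortality[age]
        if total > 0.0:
            # Exact integral of hazard * exp(-total * t) over one year,
            # weighted by the probability of reaching this age event-free.
            risk += event_free * (hazard / total) * (1.0 - math.exp(-total))
            event_free *= math.exp(-total)
    return risk


# Hypothetical flat rates, for illustration only.
ages = range(40, 45)
lam0 = {a: 0.002 for a in ages}
mort = {a: 0.010 for a in ages}
five_year_risk = absolute_risk(40, 45, lam0, mort, relative_risk=1.0)
```

With constant rates the result reduces to the closed form λ/(λ+m) × (1 − exp(−(λ+m)s)), which gives a quick check of the band-by-band sum.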

Cumulative risk is evaluated as absolute risk between age 20 years and a specific age. The 10-year risk is evaluated as absolute risk over the next 10 years for a woman who has attained a specific age without developing breast cancer. To use this method, we estimated the multivariable RRs of each woman in the KCPS-II based on her risk factors Z, the log-relative risks β estimated in the KCPS, the age-specific mortality rates of breast cancer in Korea, and the risk factor distribution in KNHANES. iCARE uses the log relative risks, the risk factor distribution, and the population average age-specific incidence rates to calculate the baseline hazard; it then calculates absolute risk for each subject using formula (a).
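The baseline-hazard step can be illustrated with a simplified sketch: divide the population-average incidence rate at each age by the mean relative risk in the reference dataset, so that the average hazard across the reference population matches the population rate. This is an approximation written for illustration and may differ from what iCARE actually computes; the function names, the single risk factor, and all numbers below are hypothetical.

```python
import math


def linear_predictor(risk_factors, log_relative_risks):
    """Z * beta for one woman; both arguments are dicts keyed by factor name."""
    return sum(risk_factors[name] * beta
               for name, beta in log_relative_risks.items())


def baseline_hazard(population_incidence, reference_rows, log_relative_risks):
    """Approximate baseline hazard by age.

    Simplifying assumption: lambda_0(t) = population rate(t) divided by the
    mean of exp(Z * beta) over the reference dataset, ignoring any change in
    the risk factor distribution with age.
    """
    mean_rr = sum(math.exp(linear_predictor(row, log_relative_risks))
                  for row in reference_rows) / len(reference_rows)
    return {age: rate / mean_rr for age, rate in population_incidence.items()}


# Hypothetical inputs: one binary factor with RR = 2 and 25% prevalence.
beta = {"bbd": math.log(2.0)}
reference = [{"bbd": 0}, {"bbd": 0}, {"bbd": 0}, {"bbd": 1}]
incidence = {45: 0.0010}
lam0 = baseline_hazard(incidence, reference, beta)
```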

All statistical tests were two-sided at a significance level of 0.05 and calculated using SAS version 9.4 software (SAS Institute, Cary, NC, USA) for descriptive statistics and relative risks. Absolute risks were evaluated with R 3.5.0 software using the iCARE package 1.0.0.

Results

Baseline Risk Factors

A total of 680 breast cancer cases were diagnosed during follow-up in the KCPS-II Biobank (322 cases were diagnosed within five years). Baseline risk factor distributions stratified by age of 50 are displayed in S.Table 1. Compared to women aged 50 years or older, women younger than 50 years tended to have earlier age at menarche, fewer births, and later age at first birth, and were more likely to drink alcohol.

Population distribution and RRs

The population distributions and RRs of breast cancer risk factors differ between the US non-Hispanic white and Korean populations (S.Table 2). US non-Hispanic white women tend to have earlier age at menarche, later age at menopause, and earlier age at first birth than Korean women. The proportions of women who use OC or HRT were markedly higher among US non-Hispanic white women than among Korean women. The US non-Hispanic white women, on average, had a higher body mass index than Korean women.

The relative risks did not differ greatly between European-ancestry and Korean women for most risk factors: the odds ratio comparing the highest versus the lowest risk category among Korean women was on average 0.40–1.90 times that of European-ancestry women. One notable exception was BBD, which had a larger effect on breast cancer among Korean women than European-ancestry women (RR=5.05 vs. 1.68). This may reflect the narrower definition of BBD in the KCPS, focusing on women with a history of benign neoplasm of unspecified breast. In addition, the inverse association between body mass index and breast cancer risk seen among premenopausal European-ancestry women was not seen among Korean women.

Risk projections

Figure 1 shows cumulative and 10-year risks of breast cancer among Korean women between ages 20 and 80 years by percentiles of absolute risk estimated in the KRKR model. The cumulative risk at 80 years for women in the 95th percentile of risk was 7.56%, while the average cumulative risk was 2.06%. The 10-year risk of breast cancer for women in the 95th percentile of risk peaked at 2.61% at age 45. The 10-year risks increased from age 20 to 45 and decreased thereafter.

Figure 1.

Cumulative and 10-Year Breast Cancer Risk for Korean women Stratified by Risk Percentiles in the Korean Cancer Prevention Study-II Biobank estimated in the Korean-based model. Cumulative risk is evaluated as absolute risk between age 20 years and a specific age shown on the x-axis. The 10-year risk is evaluated as absolute risk over the next 10 years for a woman who has attained a specific age (shown on the x-axis) without developing breast cancer.

Predictive Capacities

The AUCs for the USEA model in the KCPS-II were 71.8% (95% confidence interval [CI], 68.8–74.8) for women aged <50 years and 57.1% (95% CI, 51.2–62.9) for women aged ≥50 years, showing better ability to distinguish cases from non-cases among younger women (Table 1). The USEA model was well calibrated among Korean women aged <50 years (E/O (95% CI)=1.12 (0.99–1.28)) (Figure 2) but overestimated the risk for those aged ≥50 years (E/O (95% CI)=2.47 (2.01–3.05)) (Figure 3). Recalibrating absolute risk estimates using Korean age-specific incidence rates and risk distributions markedly improved the calibration in women aged ≥50 years (E/O (95% CI)=1.02 (0.83–1.26)). Recalibrating using the Korean age-specific incidence rates while keeping a US risk factor reference distribution underestimated risk in the KCPS-II (E/O (95% CI)=0.74 (0.65–0.84)); recalibrating using a Korean risk factor reference distribution while keeping US incidence rates overestimated risk (E/O (95% CI)=1.36 (1.19–1.54)) (S.Table 3). Incorporating Korean-based RR estimates also improved model calibration (<50 years E/O (95% CI)=0.96 (0.85–1.09); ≥50 years E/O (95% CI)=0.94 (0.76–1.16)). In discrimination, however, the AUC was slightly lower than that of the recalibrated model both among women aged <50 years (AUC (95% CI)=69.7% (66.7–72.6)) and among those aged ≥50 years (AUC (95% CI)=58.4% (52.9–63.8)). For all models, miscalibration was most evident in the extreme risk deciles (S.Table 4, S.Figure 3).

Table 1.

Discrimination and calibration for the breast cancer risk prediction models validated using the Korean Cancer Prevention Study-II Biobank

| Age group | Model | AUC, % (95% CI) | E/O ratio (95% CI) |
|---|---|---|---|
| <50 years of age (233 cases, 57,206 non-cases) | US-based European-ancestry | 71.8 (68.8, 74.8) | 1.124 (0.989, 1.278) |
| | Recalibrated | 70.7 (67.7, 73.7) | 0.894 (0.787, 1.017) |
| | Korean-based | 69.7 (66.7, 72.6) | 0.960 (0.845, 1.091) |
| ≥50 years of age (87 cases, 18,680 non-cases) | US-based European-ancestry | 57.1 (51.2, 62.9) | 2.472 (2.005, 3.049) |
| | Recalibrated | 61.5 (56.2, 66.9) | 1.018 (0.825, 1.255) |
| | Korean-based | 58.4 (52.9, 63.8) | 0.941 (0.763, 1.161) |

AUC = area under the curve; CI = confidence interval; E = expected five-year absolute risk; O = observed five-year incidence. The AUCs reported in Table 1 are defined based on predicted absolute risk and incorporate the variation due to age.

(A) the US-based European-ancestry model, using incidence, mortality, and risk-factor distributions among US non-Hispanic white women and European-ancestry relative risk (RR) estimates; (B) a recalibrated model, using Korean incidence, mortality, and risk-factor distributions but European-ancestry RR estimates; and (C) a fully Korean-based model using Korean incidence, mortality, and risk-factor distributions and RR estimates from the Korean Cancer Prevention Study.

Figure 2.

Absolute risk calibration of breast cancer risk prediction models in the KCPS-II among women less than 50 years of age. The risk categories are based on absolute risk. KCPS-II = Korean Cancer Prevention Study-II Biobank; HL = Hosmer-Lemeshow test statistic.

Figure 3.

Absolute risk calibration of breast cancer risk prediction models in the KCPS-II among women 50 years of age or greater. The risk categories are based on absolute risk. KCPS-II = Korean Cancer Prevention Study-II Biobank; HL = Hosmer-Lemeshow test statistic.

Discussion

In this study, we evaluated the performance of the breast-cancer risk models originally developed for US women to predict the 5-year breast cancer risk in a Korean population, directly and after recalibration to account for Korean age-specific incidence rates, risk factor distributions, and relative risks. To the best of our knowledge, our study is the first to assess these breast-cancer risk models in an East Asian population.

The discrimination of the unrecalibrated USEA model and the two recalibrated models was similar for women <50 years (AUCs between 69.7 and 71.8%), but the recalibrated models performed better for women ≥50 years (AUC of 57.1 versus 58.4 and 61.5%). The differences in AUCs indicate that the recalibration is changing the rank ordering of women according to their predicted risk. This reordering occurs because iCARE uses the distribution of risk factors in the population not only to define the baseline incidence rate, but also to estimate risks for women who are missing data on one or more risk factors. Our results suggest that using a reference distribution that better matches the target population can improve discrimination. In the case of the KCPS-II Biobank, missing data on factors that have very different distributions in the United States and Korea (e.g. age at menarche, parity, age at first birth, and alcohol intake) likely account for the improvement in discrimination between the USEA and recalibrated models.

The USEA breast-cancer risk model was well calibrated among Korean women <50 years but overestimated the risk for those ≥50 years. Further recalibrations of the model showed appreciably improved calibration, especially among older women. This underscores the general importance of recalibrating absolute risk models to reflect the age-specific incidence rates, distribution of risk factors, and relative risks in the target population.

Consistent with previous reports, we found lower breast cancer incidence in the KCPS-II Biobank relative to incidence among US non-Hispanic whites (2,18). Relative to US non-Hispanic whites, women in the KCPS-II Biobank had a lower prevalence of risk factors such as early age at menarche, OC or HRT use, and high body mass index.

The RRs for most of the risk factor categories differed modestly between Korean and European-ancestry women; the odds ratio comparing extreme risk factor categories among Korean women was generally between 0.40 and 1.90 times that among European-ancestry women. The largest exception was BBD, which had an RR about 3 times larger in Korean than in European-ancestry women (5.05 vs. 1.68). Moreover, Korean women ≥50 years had a larger effect of BBD than those <50, whereas the US women had a similar effect between the age categories. This may reflect our definition of BBD in the KCPS, where the RRs were estimated: women with any ICD-10 code of D24 (“benign neoplasm of unspecified breast”) at baseline were considered to have a history of BBD. This is a more restrictive definition than was used in the European-ancestry studies (19) (“atypical hyperplasia of the breast”) and may define a smaller, more homogeneous group of women at higher risk of breast cancer. We chose to use ICD-10 D24 to define BBD because we believe that this insurance claims code is more accurate than other codes capturing more heterogeneous forms of BBD.

Risk factor distributions in our study were consistent with distributions of primary risk factors for breast cancer observed in previous Korean studies: early menarche, late menopause (20,21), later and fewer births (2), taller height (22), obesity (23), history of benign breast disease (24), alcohol intake (25), family history (20), and oral contraceptive use (26). A systematic review reported that hormone replacement therapy had no significant effect on breast cancer in Korean women (27).

The literature-based absolute risk model for European-ancestry women that we assessed here has recently been validated in the European-ancestry US and UK populations, showing good calibration (10,28). Risk prediction models used in one country need to be carefully considered before they are adopted and incorporated into guidelines of other countries. These considerations need to account for different disease epidemiology across populations. Indeed, when we applied the original iCARE model, which uses the incidence rates of breast cancer and competing all-cause mortality rates among US non-Hispanic white women, the 5-year absolute risk was over-predicted among Korean women older than 50 (E/O=2.472) (Table 1). This might be due to a variation in age-specific breast cancer incidence between US non-Hispanic whites and Koreans. Consistent with previous findings, we found that the incidence rate of breast cancer in Korean women increased up to age 50 and decreased thereafter, whereas the incidence rate in US non-Hispanic whites rose with age (3–5).

The recalibrated KREA model, which applied Korean incidence and mortality data and the RR estimates from European-ancestry studies, showed markedly improved calibration among women older than 50 years (E/O=1.02). When the RR estimates from the Korean population were further incorporated, the E/O ratio remained close to 1 (E/O=0.94), although the AUC decreased somewhat. The unexpected decrease in model discrimination may be chance fluctuation, or it could reflect relatively imprecise estimates of the Korean RRs from the KCPS. This highlights the importance of considering the bias-variance tradeoff when developing risk models for specific target populations. Estimates of RRs from the target population will be unbiased, but if the available sample sizes are small, those estimates may be highly variable and the resulting risk model may have poor out-of-sample performance. If RR estimates from large samples from a non-target population are available, they may have relatively good performance in the target population, given their improved precision, provided the true RRs in the target and non-target population are not too different. In this specific case, considering the large size of the KCPS and the small, likely chance differences in AUCs between the recalibrated models with European-ancestry and Korean RRs, we believe the fully recalibrated KRKR model using Korean RRs is most appropriate for Korean women.

The striking age incidence curve of breast cancer in Korea (rising into the mid-40s, then declining) has been consistently observed over the last few decades (18). Korea is experiencing an aging society, and there is a strong generational cohort effect in breast cancer occurrence in Korean women. It has been reported that reproductive factors such as early age at menarche, late age at menopause, delayed first pregnancy, and changes in breast feeding patterns are associated with the cohort effect of breast cancer incidence among Korean women (18). Another reason for the peak in middle-aged women may be the rapid increase in breast cancer screening in that age group, i.e., higher screening rates among women in their 40s and 50s, which is compatible with the age-incidence curve findings (29). This may also be responsible for the larger effect of BBD observed among Korean women ≥50 years than those <50 in our study. According to previous projections, the breast cancer incidence in Korea will increase up to 100 per 100,000 women in the future and the incidence curve by age will be similar to the current curve observed in Western women (2).

Several breast cancer risk assessment tools have been proposed in Korea. Previous case-control studies attempted to identify high-risk groups using a breast cancer probability model with relevant risk factors (30–32). A model developed from a prospective cohort study in Korea with an 8-year follow-up was internally validated in the same source population (33). However, the model did not differentiate between pre- and postmenopausal women and included only three risk factors (age, age at menarche, and lactation). A more recent study developed a Korean risk prediction model for breast cancer by modifying the Breast Cancer Risk Assessment Tool (BCRAT) and validated it in two Korean cohorts, showing a better validity than that in the original BCRAT (34). Similar to our study, the study calculated the risks separately for two age groups (<50 and ≥50 years old) and included several reproductive factors and modifiable lifestyle habits such as OC use and body mass index. However, the study could not assess model calibration by different levels of risk due to a small number of breast cancer cases.

Matsuno et al. developed the Asian American Breast Cancer Study (AABCS) model using ethnicity-specific data to estimate absolute risks for Asian and Pacific Islander American women; and found that for Chinese and Filipino women, projections of absolute risk were lower in the AABCS model compared with the BCRAT that uses data from white women (35). However, since the AABCS model is designed for American women, it may not be generalizable to women in Asian countries who have historically had lower breast cancer risk than Asian women in the United States or European countries (36).

The limitations of our study include the small number of breast cancer cases in the validation cohort, especially among women older than 50 years. We expect to accrue more events with further follow-up as the women in the cohort age. We also acknowledge that the RR model might differ between populations because the distributions of breast-cancer subtypes differ. For example, a recent study found higher proportions of estrogen receptor-positive breast cancer at a younger age among Asian women compared to non-Hispanic white women, which was not considered in this study (37). Another limitation was the use of simulated data for unmeasured risk factors in the KNHANES, which may give different results from actual data. Estimates of the Korea-specific RRs from the KCPS may be inaccurate for some risk factors due to a high proportion of missing data. Finally, the risk models we have adapted here do not include several important risk factors, which may have led to diminished discriminatory accuracy. Of particular importance here, the model does not include history of breastfeeding, which has been found to be the strongest protective factor in Korean women (38), whereas in European-ancestry women, the protective effect is relatively small (39,40). Model fit might also be improved by tailoring categories for available data to the Korean population, for example, using Asian-specific WHO cutoffs for overweight and obesity (41). The relative risk models also do not include interactions among the risk factors, for example between body mass index and hormone therapy (9,42–44). Further research should consider more comprehensive models including history of breastfeeding and other risk factors, such as breast density, bone mineral density, and genetic and other biomarkers. One advantage of the iCARE package is that it allows incorporation of a polygenic risk score derived from single nucleotide polymorphisms. In our future research, we plan to evaluate whether and how genetic information improves the performance of the breast cancer risk prediction models in the Korean population.

We included modifiable risk factors such as parity, age at first birth, OC use, HRT use, body mass index, and alcohol intake in the breast cancer risk models, allowing policy makers to quantify risk reduction after risk factor modification and to encourage the general population to modify behaviors. Moreover, we evaluated model calibration stratified by levels of risk, which can be useful for risk-based prevention and screening by classifying subjects at the extremes of risk. The success of recalibration of existing breast-cancer risk models in this Korean cohort suggests that recalibration could be of great value for assessment of breast cancer risk in other countries.

In conclusion, although the original USEA breast cancer risk model using incidence rates and risk factor distributions from US non-Hispanic white women and European-ancestry relative risks showed relatively good discrimination and calibration among Korean women younger than 50, it had lower discrimination and poor calibration among Korean women older than 50. Recalibrated models using Korean breast-cancer incidence rates and RRs had good discrimination and improved calibration. The data from this study will provide valuable information to plan and evaluate actions against breast cancer focused on primary prevention and early detection in Korean women. Future work to improve model discrimination should incorporate additional risk factors, including history of breast feeding, genetic risk markers, and breast density.

Supplementary Material

Supplementary files 1–7.

Acknowledgments

P. Kraft was supported by a grant from the National Cancer Institute, USA (P30CA00651654).

Footnotes

The authors declare no potential conflicts of interest.

References

  • 1. Kweon SS. Updates on Cancer Epidemiology in Korea, 2018. Chonnam Med J 2018;54(2):90–100 doi 10.4068/cmj.2018.54.2.90.
  • 2. Park SK, Kim Y, Kang D, Jung EJ, Yoo KY. Risk factors and control strategies for the rapidly rising rate of breast cancer in Korea. J Breast Cancer 2011;14(2):79–87 doi 10.4048/jbc.2011.14.2.79.
  • 3. Leong SP, Shen ZZ, Liu TJ, Agarwal G, Tajima T, Paik NS, et al. Is breast cancer the same disease in Asian and Western countries? World J Surg 2010;34(10):2308–24 doi 10.1007/s00268-010-0683-1.
  • 4. Ko BS, Noh WC, Kang SS, Park BW, Kang EY, Paik NS, et al. Changing patterns in the clinical characteristics of Korean breast cancer from 1996–2010 using an online nationwide breast cancer database. J Breast Cancer 2012;15(4):393–400 doi 10.4048/jbc.2012.15.4.393.
  • 5. Korean Breast Cancer Society. Early Screening of Breast Cancer in Korea. J Korean Breast Cancer Soc 2002;5(3):225–34.
  • 6. Ray KM, Joe BN, Freimanis RI, Sickles EA, Hendrick RE. Screening Mammography in Women 40–49 Years Old: Current Evidence. AJR Am J Roentgenol 2018;210(2):264–70 doi 10.2214/AJR.17.18707.
  • 7. Puschel K, Coronado G, Soto G, Gonzalez K, Martinez J, Holte S, et al. Strategies for increasing mammography screening in primary care in Chile: results of a randomized clinical trial. Cancer Epidemiol Biomarkers Prev 2010;19(9):2254–61 doi 10.1158/1055-9965.EPI-10-0313.
  • 8. Cintolo-Gonzalez JA, Braun D, Blackford AL, Mazzola E, Acar A, Plichta JK, et al. Breast cancer risk models: a comprehensive overview of existing models, validation, and clinical applications. Breast Cancer Res Treat 2017;164(2):263–84 doi 10.1007/s10549-017-4247-z.
  • 9. Garcia-Closas M, Gunsoy NB, Chatterjee N. Combined associations of genetic and environmental risk factors: implications for prevention of breast cancer. J Natl Cancer Inst 2014;106(11) doi 10.1093/jnci/dju305.
  • 10. Choudhury PP, Wilcox AN, Brook MN, Zhang Y, Ahearn T, Orr N, et al. Comparative validation of breast cancer risk prediction models and projections for future risk stratification. JNCI: Journal of the National Cancer Institute 2019. doi 10.1093/jnci/djz113.
  • 11. Choudhury PP, Maas P, Wilcox A, Wheeler W, Brook M, Check D, et al. iCARE: R package to build, validate and apply absolute risk models. bioRxiv 2018:079954 doi 10.1101/079954.
  • 12. Jee YH, Emberson J, Jung KJ, Lee SJ, Lee S, Back JH, et al. Cohort Profile: The Korean Cancer Prevention Study-II (KCPS-II) Biobank. Int J Epidemiol 2018;47(2):385–6f doi 10.1093/ije/dyx226.
  • 13. Jee SH, Sull JW, Park J, Lee S-Y, Ohrr H, Guallar E, et al. Body-Mass Index and Mortality in Korean Men and Women. New England Journal of Medicine 2006;355(8):779–87 doi 10.1056/NEJMoa054017.
  • 14. Neuhouser ML, Aragaki AK, Prentice RL, Manson JE, Chlebowski R, Carty CL, et al. Overweight, Obesity, and Postmenopausal Invasive Breast Cancer Risk: A Secondary Analysis of the Women’s Health Initiative Randomized Clinical Trials. JAMA Oncology 2015;1(5):611–21 doi 10.1001/jamaoncol.2015.1546.
  • 15. Beral V. Breast cancer and hormone-replacement therapy in the Million Women Study. Lancet 2003;362(9382):419–27 doi 10.1016/s0140-6736(03)14065-2.
  • 16. Gail MH, Brinton LA, Byar DP, Corle DK, Green SB, Schairer C, et al. Projecting individualized probabilities of developing breast cancer for white females who are being examined annually. J Natl Cancer Inst 1989;81(24):1879–86 doi 10.1093/jnci/81.24.1879.
  • 17. Tyrer J, Duffy SW, Cuzick J. A breast cancer prediction model incorporating familial and personal risk factors. Stat Med 2004;23(7):1111–30 doi 10.1002/sim.1668.
  • 18. Choi Y, Kim Y, Park SK, Shin HR, Yoo KY. Age-Period-Cohort Analysis of Female Breast Cancer Mortality in Korea. Cancer Res Treat 2016;48(1):11–9 doi 10.4143/crt.2015.021.
  • 19. Hartmann LC, Sellers TA, Frost MH, Lingle WL, Degnim AC, Ghosh K, et al. Benign breast disease and the risk of breast cancer. N Engl J Med 2005;353(3):229–37 doi 10.1056/NEJMoa044383.
  • 20. Yoo KY, Kang D, Park SK, Kim SU, Kim SU, Shin A, et al. Epidemiology of breast cancer in Korea: occurrence, high-risk groups, and prevention. Journal of Korean medical science 2002;17(1):1–6 doi 10.3346/jkms.2002.17.1.1.
  • 21. Shin A, Song YM, Yoo KY, Sung J. Menstrual factors and cancer risk among Korean women. Int J Epidemiol 2011;40(5):1261–8 doi 10.1093/ije/dyr121.
  • 22. Choi YJ, Lee DH, Han KD, Yoon H, Shin CM, Park YS, et al. Adult height in relation to risk of cancer in a cohort of 22,809,722 Korean adults. Br J Cancer 2019;120(6):668–74 doi 10.1038/s41416-018-0371-8.
  • 23. Jung D, Lee S-M. BMI and Breast Cancer in Korean Women: A Meta-Analysis. Asian Nursing Research 2009;3(1):31–40 doi 10.1016/S1976-1317(09)60014-1.
  • 24. Park SK, Yoo KY, Kang D, Kim SU, Lee SY, Im HJ, et al. A Case-Control Study on Risk Factors of Benign Breast Disorders in Korea. Epidemiology and Health 2000;22(1):11–9.
  • 25. Choi YJ, Myung SK, Lee JH. Light Alcohol Drinking and Risk of Cancer: A Meta-Analysis of Cohort Studies. Cancer Res Treat 2018;50(2):474–87 doi 10.4143/crt.2017.094.
  • 26. Choi B-R, Kwon M-H, Bang M-R. Oral Contraceptive Use and Breast Cancer in Korean Women. The Korean Journal of Health Service Management 2014;8:221–9 doi 10.12811/kshsm.2014.8.4.221.
  • 27. Bae JM, Kim EH. Hormone Replacement Therapy and Risk of Breast Cancer in Korean Women: A Quantitative Systematic Review. J Prev Med Public Health 2015;48(5):225–30 doi 10.3961/jpmph.15.046.
  • 28. Gao C, Choudhury PP, Maas P, Tamimi R, Eliassen H, Chatterjee N, et al. Abstract PR02: Validation of breast cancer risk prediction model using Nurses Health and Nurse Health II Studies. Improving Cancer Risk Prediction for Prevention and Early Detection 2017. p PR02–PR.
  • 29. Lee JH, Yim SH, Won YJ, Jung KW, Son BH, Lee HD, et al. Population-based breast cancer statistics in Korea during 1993–2002: incidence, mortality, and survival. Journal of Korean medical science 2007;22 Suppl(Suppl):S11–S6 doi 10.3346/jkms.2007.22.S.S11.
  • 30. Park SK, Yoo KY, Kang DH, Ahn SH, Noh DY, Choe KJ. The Estimation of Breast Cancer Disease-Probability by Difference of Individual Susceptibility. Cancer Res Treat 2003;35(1):35–51 doi 10.4143/crt.2003.35.1.35.
  • 31. Lee CY, Ko IS, Kim HS, Lee WH, Chang SB, Min JS, et al. Development and validation study of the breast cancer risk appraisal for Korean women. Nursing & health sciences 2004;6(3):201–7 doi 10.1111/j.1442-2018.2004.00193.x.
  • 32. Lee EO, Ahn SH, You C, Lee DS, Han W, Choe KJ, et al. Determining the main risk factors and high-risk groups of breast cancer using a predictive model for breast cancer risk assessment in South Korea. Cancer nursing 2004;27(5):400–6.
  • 33. Jee SH, Song JW, Nam CM. Development of the Individualized Health Risk Appraisal Model of Breast Cancer Risk in Korean Women. Epidemiol Health 2004;26(1):50–8.
  • 34. Park B, Ma SH, Shin A, Chang MC, Choi JY, Kim S, et al. Korean risk assessment model for breast cancer risk prediction. PLoS One 2013;8(10):e76736 doi 10.1371/journal.pone.0076736.
  • 35. Matsuno RK, Costantino JP, Ziegler RG, Anderson GL, Li H, Pee D, et al. Projecting individualized absolute invasive breast cancer risk in Asian and Pacific Islander American women. J Natl Cancer Inst 2011;103(12):951–61 doi 10.1093/jnci/djr154.
  • 36. Stanford JL, Herrinton LJ, Schwartz SM, Weiss NS. Breast cancer incidence in Asian migrants to the United States and their descendants. Epidemiology (Cambridge, Mass) 1995;6(2):181–3 doi 10.1097/00001648-199503000-00017.
  • 37. Lin CH, Yap YS, Lee KH, Im SA, Naito Y, Yeo W, et al. Contrasting Epidemiology and Clinicopathology of Female Breast Cancer in Asians vs the US Population. J Natl Cancer Inst 2019;111(12):1298–306 doi 10.1093/jnci/djz090.
  • 38. Kim Y, Choi JY, Lee KM, Park SK, Ahn SH, Noh DY, et al. Dose-dependent protective effect of breast-feeding against breast cancer among ever-lactated women in Korea. European journal of cancer prevention: the official journal of the European Cancer Prevention Organisation (ECP) 2007;16(2):124–9 doi 10.1097/01.cej.0000228400.07364.52.
  • 39. Key TJ, Verkasalo PK, Banks E. Epidemiology of breast cancer. The Lancet Oncology 2001;2(3):133–40 doi 10.1016/s1470-2045(00)00254-0.
  • 40. Kvåle G, Heuch I. Menstrual factors and breast cancer risk. Cancer 1988;62(8):1625–31.
  • 41. WHO Expert Consultation. Appropriate body-mass index for Asian populations and its implications for policy and intervention strategies. Lancet 2004;363(9403):157–63 doi 10.1016/s0140-6736(03)15268-3.
  • 42. Sandvei MS, Vatten LJ, Bjelland EK, Eskild A, Hofvind S, Ursin G, et al. Menopausal hormone therapy and breast cancer risk: effect modification by body mass through life. Eur J Epidemiol 2019;34(3):267–78 doi 10.1007/s10654-018-0431-7.
  • 43. Chlebowski RT, Anderson GL, Aragaki AK, Prentice R. Breast Cancer and Menopausal Hormone Therapy by Race/Ethnicity and Body Mass Index. J Natl Cancer Inst 2016;108(2) doi 10.1093/jnci/djv327.
  • 44. Hou N, Hong S, Wang W, Olopade OI, Dignam JJ, Huo D. Hormone replacement therapy and breast cancer: heterogeneous risks by race, weight, and breast density. J Natl Cancer Inst 2013;105(18):1365–72 doi 10.1093/jnci/djt207.
