eLife. 2023 Mar 27;12:e82608. doi: 10.7554/eLife.82608

Polygenic risk scores for the prediction of common cancers in East Asians: A population-based prospective cohort study

Peh Joo Ho 1,2,3, Iain BeeHuat Tan 1,2,4,5, Dawn Qingqing Chong 5,6, Chiea Chuen Khor 1, Jian-Min Yuan 7,8, Woon-Puay Koh 9,10, Rajkumar Dorajoo 1, Jingmei Li 1,3
Editors: Qifeng Yang 11, Caigang Liu 12
PMCID: PMC10159619  PMID: 36971353

Abstract

Background:

To evaluate the utility of polygenic risk scores (PRSs) in identifying high-risk individuals, different publicly available PRSs for breast (n=85), prostate (n=37), colorectal (n=22), and lung cancers (n=11) were examined in a prospective study of 21,694 Chinese adults.

Methods:

We constructed PRS using weights curated in the online PGS Catalog. PRS performance was evaluated by distribution, discrimination, predictive ability, and calibration. Hazard ratios (HR) and corresponding confidence intervals (CI) of the common cancers after 20 years of follow-up were estimated using Cox proportional hazard models for different levels of PRS.

Results:

A total of 495 breast, 308 prostate, 332 female-colorectal, 409 male-colorectal, 181 female-lung, and 381 male-lung incident cancers were identified. The areas under the receiver operating characteristic curve for the best-performing site-specific PRSs were 0.61 (PGS000873, breast), 0.70 (PGS000662, prostate), 0.65 (PGS000055, female-colorectal), 0.60 (PGS000734, male-colorectal), 0.56 (PGS000721, female-lung), and 0.58 (PGS000070, male-lung). Compared to the middle quintile, individuals in the highest cancer-specific PRS quintile were 64–152% more likely to develop cancers of the breast, prostate, and colorectum. For lung cancer, the lowest cancer-specific PRS quintile was associated with a 28–44% decreased risk compared to the middle quintile. In contrast, the HRs observed for quintiles 4 (female-lung: 0.89 [0.57–1.39]; male-lung: 1.14 [0.82–1.57]) and 5 (female-lung: 0.95 [0.61–1.47]) were not significantly different from that for the middle quintile.

Conclusions:

Site-specific PRSs can stratify the risk of developing breast, prostate, and colorectal cancers in this East Asian population. Appropriate correction factors may be required to improve calibration.

Funding:

This work is supported by the National Research Foundation Singapore (NRF-NRFF2017-02), PRECISION Health Research, Singapore (PRECISE) and the Agency for Science, Technology and Research (A*STAR). WP Koh was supported by National Medical Research Council, Singapore (NMRC/CSA/0055/2013). CC Khor was supported by National Research Foundation Singapore (NRF-NRFI2018-01). Rajkumar Dorajoo received a grant from the Agency for Science, Technology and Research Career Development Award (A*STAR CDA - 202D8090), and from Ministry of Health Healthy Longevity Catalyst Award (HLCA20Jan-0022).

The Singapore Chinese Health Study was supported by grants from the National Medical Research Council, Singapore (NMRC/CIRG/1456/2016) and the U.S. National Institutes of Health (NIH) (R01 CA144034 and UM1 CA182876).

Research organism: None

eLife digest

Although humans contain the same genes, the sequence within these DNA sites can vary from person to person. These small variations, also known as genetic variants, can increase the risk of developing certain diseases. While each variant will only have a weak effect, if multiple variations are present the odds of developing the disease become significantly higher.

To determine which variants are linked to a disease, researchers carry out genome-wide association studies which involve analyzing the genomes of individuals with and without the condition and comparing their genetic codes. This data is then used to calculate how different combinations of variants impact a person’s chance of getting the disease, also known as a polygenic risk score.

Currently, most genome-wide association studies only incorporate genetic data from people with European ancestry. Consequently, polygenic risk scores performed using this information may not accurately predict the risk of developing the disease for individuals with other ethnicities, such as people with Asian ancestry.

Here, Ho et al. evaluated how well previously calculated polygenic risk scores for the four most common cancers (breast, colorectal, prostate and lung) worked on individuals of East Asian descent. The scores were tested on a dataset containing the genetic sequence, medical history, diet and activity levels of over 21,000 people living in Singapore in the 1990s. Ho et al. found that the polygenic risk scores for breast, prostate and colorectal cancer were able to predict disease risk. However, the score for lung cancer did not perform as well. The polygenic risk score for breast cancer was the most accurate, and was able to stratify individuals into distinct risk bands at an earlier age than other scores.

These findings shed light on which existing polygenic risk scores will be effective at assessing cancer risk in individuals with East Asian ancestry. Indeed, Ho et al. have already incorporated the polygenic risk score for breast cancer into a pilot study screening individuals in a comparable population in Singapore. However, the polygenic risk scores tested still performed better on individuals with European ancestry, highlighting the need to address the lack of Asian representation in genome-wide association studies.

Introduction

Polygenic risk scores (PRSs) for a range of health traits and conditions have been developed in recent years. These scores, which are based on summary statistics from genome-wide association studies (GWAS), can be used to stratify people depending on their genetic risk of acquiring various diseases, to improve screening and preventative interventions, as well as patient care (Polygenic Risk Score Task Force of the International Common Disease Alliance, 2021; Lambert et al., 2019). Precision risk assessment may help develop tailored screening strategies targeting individuals at higher risk of disease of interest (Clift et al., 2022).

The contributions of heritable genetic factors are different for different cancers. Twin studies have highlighted statistically significant effects of heritable genetic risk factors for cancers of the prostate, colorectal, and breast (Lichtenstein et al., 2000). The amount of phenotypic variance explained by the common genetic variants found by GWAS is also known to vary (Cano-Gamez and Trynka, 2020), suggesting that PRS derived from GWAS findings may perform to varying degrees for different cancers.

The area under the receiver operating characteristic curve (AUC) is an important discrimination index for evaluating the performance of PRS. The greater the AUC, the better the ability to separate cases from non-cases. A value of 0.5 suggests that the tool performs no better than chance, while a value of 1 is obtained when cases and non-cases are perfectly separated. Reported AUCs for published PRS range from 0.584 to 0.678 for breast cancer (Mavaddat et al., 2019; Ho et al., 2020; Kachuri et al., 2020; Du et al., 2021; Jia et al., 2020; Lacaze et al., 2021; Zhang et al., 2018), 0.591–0.769 for prostate cancer (Kachuri et al., 2020; Jia et al., 2020; Fritsche et al., 2020), 0.609–0.708 for colorectal cancer (Kachuri et al., 2020; Jia et al., 2020; Gafni et al., 2021; Archambault et al., 2022), and 0.52–0.846 for lung cancer (Kachuri et al., 2020; Jia et al., 2020; Fritsche et al., 2020; Hung et al., 2021). In a study by Jia et al. of eight common cancers in the UK Biobank population-based cohort (n=400,812 participants of European descent), the observed AUC ranged from 0.567 to 0.662 (Jia et al., 2020).

While prediction of individual cancer risks through PRS remains moderate, emerging data supports the use of PRS for population-based cancer risk stratification. In previous work, Ho et al. examined the overlap of women identified to be at high risk of developing breast cancer based on family history for the disease, a non-genetic breast cancer risk prediction model, a breast cancer PRS, and carriership of rare pathogenic variants in established breast cancer predisposition genes (Ho et al., 2022). The overlap of individuals found to be at elevated risk of developing breast cancer based on the genetic and non-genetic models was low. PRS was also found to be able to identify high-risk individuals among young women who were not yet eligible to attend mammography screening. The findings suggest that a genetic tool that is feasible to be deployed for population-based screening may complement current screening programs.

Disparities in the genetic risk of cancer among various ancestry populations are poorly understood. Ideally, the genetic variants that make up a PRS should be relevant to the population being screened. The training datasets used to develop PRS are dominated by samples of European ancestry, resulting in ancestry bias and issues with transferability to other populations (Lambert et al., 2019; Fritsche et al., 2021). The mismatch between the ancestries of the GWAS samples and the target populations for PRS application is a limiting factor (Fritsche et al., 2021). In this study, we evaluated the utility of common PRS, curated in the Polygenic Score (PGS) Catalog, in predicting the risk of the commonly diagnosed cancers with high genetic predisposition (breast, prostate, colorectal, and lung) in a prospective cohort comprising 21,694 participants of East Asian descent in Singapore. We followed the reporting framework recommended by Wand et al. for the interpretation and evaluation of PRS (Wand et al., 2021).

Materials and methods

Singapore Chinese Health Study

The Singapore Chinese Health Study (SCHS) is a population-based prospective cohort study of ethnic Chinese men and women recruited between April 1993 and December 1998 (Hankin et al., 2001). Participants were 45–74 years of age at recruitment and were restricted to the two major dialect groups of Chinese adults in Singapore, who were the Hokkiens and the Cantonese that had originated from Fujian and Guangdong provinces in Southern China, respectively. All our study participants were residents of government housing flats, which were built to accommodate approximately 86% of the resident population in Singapore during the enrolment period. A total of 63,257 individuals (35,298 women and 27,959 men) provided written informed consent (Hankin et al., 2001). The study was approved by the Institutional Review Boards of the National University of Singapore, University of Pittsburgh, and the Agency for Science, Technology and Research (A*STAR, reference number 2022-042). Written, informed consent was obtained from all study participants.

Baseline

An in-person baseline interview was performed at recruitment to collect data on diet using a validated 165-item food frequency questionnaire, smoking, alcohol, physical activity, medical history, and menstrual and reproductive history from women.

Selection of common cancers

In Singapore, between 2015 and 2019, colorectal cancer, the most prevalent cancer in men, accounted for nearly 17% of cancer diagnoses, while breast cancer, the most common cancer in women, accounted for about three out of ten cancer diagnoses (National Registry of Diseases Office, 2021). During this time, cancers of the breast, prostate, colorectal, and lung accounted for approximately half of the total cancer diagnoses. These four most common cancers were selected for inclusion in this study. We further stratified the analysis by sex as differences in colorectal and lung cancer incidence by sex have been reported in Singapore (de Kok et al., 2008).

A unique National Registration Identity Card (NRIC) number for every Singaporean enables data from national registers to be compiled and linked to the same individual (Emmanuel, 1993). Incident cases of cancer were identified by record linkage of all surviving cohort participants with the database of the nationwide Singapore Cancer Registry (Emmanuel, 1993). The Singapore Cancer Registry was founded in 1968. Prior to 2009, reporting of neoplasms by medical practitioners and pathology laboratories to the registry was voluntary (Fung et al., 2016). The registry’s staff compare cancer patient hospital discharges and death certificates against registered cases for verification. Completeness of reporting was 96% in the 1970s and close to 100% in the 1990s (Sim et al., 2006). Cancers that developed among SCHS participants were identified using International Classification of Diseases for Oncology, third edition (ICD-O-3) codes (breast: C50, prostate: C61, colorectal: C18, C19, and C20, lung: C34).

Follow-up

Dates of death were obtained by record linkage with the database of the Birth and Death Registry of Singapore (Emmanuel, 1993). Data on migration were collected during subsequent follow-up interviews, when family members informed us of cohort members who had migrated. To date, only 47 (<1%) participants of the entire cohort are known to have been lost to follow-up because of migration out of Singapore, suggesting that the ascertainment of cancer incidence and deaths among cohort participants was virtually complete.

Genotyping and imputation

Between 1999 and 2004, a total of 28,346 subjects contributed blood samples. A total of 25,273 SCHS participants were genotyped in 2017–2018 with the Illumina Infinium Global Screening Array (GSA) v1.0 and v2.0 (Chang et al., 2021).

Details of the sample quality control (QC) processes have been described previously (Chang et al., 2021). Briefly, samples with a call rate of 95% or below (n=176) or heterozygosity extremes (>3 standard deviations [SD], n=236) were removed. Identity-by-state measurements were performed by pairwise comparisons of samples to detect related samples (first and second degree). From each identified pair, the sample with the lower call rate was eliminated from further analysis (n=2746). To identify any ethnic outliers, principal component analysis was used in conjunction with 1000 Genomes Project reference populations and within the SCHS samples, which resulted in the further removal of 287 samples. Of the 21,828 samples that passed genotyping QC, 134 participants who were diagnosed with cancer before recruitment or had missing cancer outcomes were excluded from the study, resulting in a final analytical dataset of 21,694 (Supplementary file 1a).
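The sample counts in this paragraph can be tallied as a quick consistency check. The sketch below only restates the numbers reported above; the variable names and dictionary labels are illustrative and are not part of the original analysis pipeline.

```python
# Sample QC flow, using only the counts reported in the text.
genotyped = 25_273

exclusions = {
    "call rate of 95% or below": 176,
    "heterozygosity extremes (>3 SD)": 236,
    "related samples (lower call rate of each pair)": 2_746,
    "ethnic outliers (principal component analysis)": 287,
}

# Samples remaining after genotyping QC
passed_qc = genotyped - sum(exclusions.values())

# Participants with cancer before recruitment or missing cancer outcomes
final_n = passed_qc - 134

print(passed_qc)  # 21828
print(final_n)    # 21694
```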

Alleles for all SNPs were coded to the forward strand and mapped to hg19. SNP QC steps included the exclusion of sex-linked and mitochondrial variants, gross Hardy–Weinberg equilibrium (HWE) outliers (p < 1 × 10⁻⁶), monomorphic SNPs or those with a minor allele frequency (MAF) < 1.0%, and SNPs with low call rates (<95.0%). We imputed additional autosomal SNPs using IMPUTE v2 (Marchini et al., 2007) with a two-reference-panel approach that included (1) the cosmopolitan 1000 Genomes reference panels (Phase 3, representing 2504 samples) and (2) an Asian panel comprising 4810 Singaporeans (2780 Chinese, 903 Malays, 1127 Indians) (Chang et al., 2021). SNPs with imputation quality score INFO < 0.8, MAF < 1.0%, or HWE p < 1 × 10⁻⁶, as well as non-biallelic SNPs, were excluded.

Polygenic risk scores

Published PRSs were retrieved from the PGS Catalog, an open database of polygenic scores (retrieved on 26 February 2022) (Supplementary file 1b; Lambert et al., 2021). Of the 2166 PRSs available in the resource, 1706 PRSs comprising fewer than 100,000 predictors were downloaded. Only PRSs with odds ratios or log odds ratios as weights were included; we excluded PRSs that used odds ratio over expected risk or inverse-variance weighting, as well as unweighted scores. A total of 85, 37, 22, and 11 PRSs were available for breast, prostate, colorectal, and lung cancers, respectively. Supplementary file 1c shows the number of individual variants comprising each PRS and the proportion of variants missing in the SCHS cohort. Individual PRSs were calculated using the allelic scoring (--score sum) function with default parameters in PLINK (v1.90b5.2) (Chang et al., 2015). The formula used was

$$\mathrm{PRS} = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \cdots + \beta_{313} x_{313}$$

where $x_k$ is the dosage of the risk allele (0–2) for SNP $k$ and $\beta_k$ is the corresponding weight.
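The weighted sum above can be illustrated with a minimal Python sketch. The variant IDs, weights, and dosages are made up, and the function name is hypothetical. Skipping variants that are absent from the genotype data is an assumption of this sketch and is not meant to reproduce how PLINK handles missing genotypes.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages for one individual.

    dosages: dict mapping variant ID -> risk-allele dosage (0 to 2)
    weights: dict mapping variant ID -> per-allele weight (e.g. log odds ratio)
    Variants absent from `dosages` are skipped (assumption of this sketch).
    """
    return sum(beta * dosages[snp] for snp, beta in weights.items() if snp in dosages)


# Hypothetical three-variant score; IDs and weights are illustrative only.
weights = {"rsA": 0.10, "rsB": -0.05, "rsC": 0.20}
person = {"rsA": 2, "rsB": 1, "rsC": 0.5}  # imputed dosages may be fractional

score = polygenic_risk_score(person, weights)
print(round(score, 3))  # 0.25
```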

PRS distribution

Two-sided, two-sample t-tests with a type I error of 0.05 were used to examine whether there was a difference in the distribution of standardised PRS (subtraction of mean value followed by the division by the SD) between site-specific cancer cases and non-cancer controls.
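As an illustration of the standardisation and the case-control comparison, the following sketch uses only the Python standard library. The data are made up. The pooled-variance form of the t statistic is an assumption, since the text does not state which variant was used, and the p-value relies on a normal approximation that is only reasonable for large samples such as this cohort.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def standardise(values):
    """Subtract the mean, then divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def two_sample_t(a, b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Made-up raw scores and case labels, for illustration only.
raw_prs = [0.1, 0.4, 0.3, 0.8, 0.9, 0.2, 0.7, 0.5]
is_case = [False, False, False, True, True, False, True, False]

z = standardise(raw_prs)
cases = [v for v, c in zip(z, is_case) if c]
controls = [v for v, c in zip(z, is_case) if not c]

t_stat, df = two_sample_t(cases, controls)
# Two-sided p-value from a normal approximation (large-sample assumption).
p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))
```

A positive `t_stat` corresponds to the right shift in cases described in the Results.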

PRS discrimination

Discrimination was quantified by the area under the receiver operating characteristic curve (AUC) and its corresponding 95% CI, using logistic regression models. An AUC of 0.9–1.0 is considered excellent, 0.8–0.9 very good, 0.7–0.8 good, 0.6–0.7 sufficient, and 0.5–0.6 insufficient (Šimundić, 2009). The site-specific PRS with the highest AUC (logistic regression models) was selected. To test the sensitivity of the PRS selection, we obtained a time-to-event metric for AUC at 5 years using AUC.cd() from the ‘survAUC’ package in R (Kamarudin et al., 2017).
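To make the AUC concrete, the sketch below computes it directly as the probability that a randomly chosen case has a higher score than a randomly chosen non-case, with ties counted as one half. For a model with the PRS as its only predictor and a positive coefficient, the fitted probabilities preserve the ranking of the scores, so this rank-based value matches the AUC of that model. The scores are made up, and this is not the code used in the study.

```python
def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Made-up standardised scores, for illustration only.
example_auc = auc([0.9, 0.4, 0.7], [0.1, 0.4, 0.3, 0.8])
print(round(example_auc, 3))  # 0.792
```

The double loop is quadratic in sample size; for a cohort of this size a rank-based implementation would be preferable, but the definition is the same.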

Associations between PRS and risk of developing cancers

Subjects were classified into PRS percentile groups. Person-years of follow-up were calculated for each subject from the date of enrolment to the date of cancer diagnosis, death, or 31 December 2015 (the date of linkage with the Singapore Cancer Registry), whichever came first. Follow-up time was censored at 20 years after recruitment. The associations between cancer-specific PRS quintiles (where individuals ranked by PRS were categorised into quintiles, using the middle quintile [40–60%] as a reference to reflect the average risk of the population) and the incidence of site-specific cancers were investigated using Cox proportional hazards modelling to estimate hazard ratios (HR) and corresponding 95% confidence intervals (CI), using time since recruitment as the time scale, and adjusted for age at recruitment. Tests for trends were conducted using two-sided Wald tests with a type I error of 0.05. Assumptions for proportional hazards were checked using the cox.zph() function in the ‘survival’ package in R, where a formal score test is done to test if a time-dependent variable is required.
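The follow-up and grouping rules in this paragraph can be sketched as follows. This is an illustrative reading of the text, not the study code: the function names are hypothetical, a year is taken as 365.25 days, and quintile cut-points come from `statistics.quantiles` on made-up scores. Fitting the Cox model itself is outside the scope of the sketch.

```python
from datetime import date
from statistics import quantiles

LINKAGE_DATE = date(2015, 12, 31)  # registry linkage date given in the text
MAX_YEARS = 20.0                   # follow-up censored at 20 years


def follow_up(enrolled, diagnosed=None, died=None):
    """Return (years of follow-up, event flag) for one participant."""
    end = min(d for d in (diagnosed, died, LINKAGE_DATE) if d is not None)
    years = (end - enrolled).days / 365.25  # assumed year length
    event = diagnosed is not None and diagnosed == end
    if years > MAX_YEARS:
        # Anything beyond 20 years is censored, including later diagnoses.
        years, event = MAX_YEARS, False
    return years, event


def quintile(score, cutpoints):
    """Map a score to quintile 1..5 given the four quintile cut-points."""
    return 1 + sum(score > c for c in cutpoints)


# Made-up standardised scores, for illustration only.
scores = [-1.6, -1.1, -0.7, -0.3, -0.1, 0.2, 0.4, 0.8, 1.2, 1.9]
cuts = quantiles(scores, n=5)
groups = [quintile(s, cuts) for s in scores]  # quintile 3 is the reference group

years_a, event_a = follow_up(date(1995, 1, 1), diagnosed=date(2005, 1, 1))
years_b, event_b = follow_up(date(1994, 6, 1))  # no event; capped at 20 years
```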

HR and corresponding 95% CI were also estimated for every SD increase in PRS. PRS is known to have ‘portability’ issues related to genetic ancestry and demographics (Martin et al., 2019; Mostafavi et al., 2020). Hence, we adjusted for variables in the models, including age at recruitment, dialect group (Hokkien or Cantonese), highest level of education (no formal education, primary school, or secondary or higher), body mass index (continuous, kg/m2), cigarette smoking (non-smoker, ex-smoker, current smoker), alcohol consumption (never, weekly, daily), moderate physical activity (none, 1–3 hr/week, ≥3 hr/week), vigorous work/strenuous physical activity at least once a week (no or yes), and familial history of cancer (no or yes).

PRS absolute risk association

The 5-year absolute risks of developing breast, prostate, colorectal, and lung cancers were computed for PRS groups of increasing five percentiles over the follow-up period. Incidence (between 2013 and 2017) and mortality (the year 2016) statistics in Singapore (reported in National Registry of Disease Office, 1968 and Singapore Statistics, 2023, respectively) were used for the absolute risk estimations. We estimated the cancer-specific 5-year absolute risk based on PRS using an iterative method detailed by Mavaddat et al., 2015.

PRS calibration

Calibration was studied by comparing the expected proportion of cases in the 5 years after recruitment to the observed proportion of cases that occurred in those 5 years, within each decile of PRS. Linear regression of the 10 points (pairs of expected and observed proportions) was used to study the overall calibration. A line close to the diagonal indicates that predicted cancer risks correspond well to observed proportions. A slope above 1 implies that the model underestimates the absolute risk. Conversely, a slope below 1 implies that the model overestimates the absolute risk. In addition, we used the Hosmer-Lemeshow test to check the goodness-of-fit.
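The slope interpretation can be illustrated with a small least-squares sketch. It assumes that the observed proportions are regressed on the expected proportions (observed on the vertical axis), which is the orientation consistent with the reading that a slope above 1 indicates underestimation. The decile values are made up.

```python
def calibration_line(expected, observed):
    """Least-squares slope and intercept of observed on expected proportions."""
    n = len(expected)
    mx, my = sum(expected) / n, sum(observed) / n
    sxx = sum((x - mx) ** 2 for x in expected)
    sxy = sum((x - mx) * (y - my) for x, y in zip(expected, observed))
    slope = sxy / sxx
    return slope, my - slope * mx


# Made-up 5-year proportions for ten PRS deciles, for illustration only.
expected = [0.002 * k for k in range(1, 11)]
observed = [1.5 * e for e in expected]  # observed risk is 1.5 times the prediction

slope, intercept = calibration_line(expected, observed)
# A slope above 1 means the observed risk exceeds the prediction in every decile here.
```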

Results

Characteristics of the study population

Table 1 shows the characteristics of the 21,694 participants who were cancer-free at recruitment. The median follow-up time for the cohort was 20 years (interquartile range [IQR]: 18–22). As of December 2015, 495 women had developed breast cancer, 308 men prostate cancer, 741 participants (332 women and 409 men) colorectal cancer, and 562 participants (181 women and 381 men) lung cancer. The median age at recruitment was 54 years (IQR: 49–61). The median age at diagnosis was 65 years (IQR: 59–70) for female breast cancers, 72 years (IQR: 67–77) for prostate cancers, 71 years (IQR: 65–76) for male colorectal cancers, 71 years (IQR: 64–78) for female colorectal cancers, 74 years (IQR: 68–78) for male lung cancers, and 74 years (IQR: 66–79) for female lung cancers.

Table 1. Demographics of our study population by gender and cancer site.

Demographic variables were collected using a structured questionnaire at recruitment. Family history for lung cancer was not available. Information on cancer occurrence (number of cancers and age at cancer occurrence) was obtained through linkage with the Singapore Cancer Registry in December 2015. Follow-up time was calculated from age at recruitment. IQR: Interquartile range.

| Characteristic | Entire cohort, all | Entire cohort, female | Entire cohort, male | Breast cancer, female | Prostate cancer, male | Colorectal cancer, female | Colorectal cancer, male | Lung cancer, female | Lung cancer, male |
|---|---|---|---|---|---|---|---|---|---|
| n | 21,694 | 12,084 | 9610 | 495 | 308 | 332 | 409 | 181 | 381 |
| Age at recruitment in years, median (IQR) | 54 (49–61) | 54 (48–60) | 55 (49–62) | 53 (48–59) | 59 (54–64) | 58 (52–64) | 59 (52–65) | 59 (55–64) | 60 (55–64) |
| Number of cancers developed | | | | | | | | | |
| 0 (did not develop cancer) | 19633 (90) | 11096 (92) | 8537 (89) | | | | | | |
| 1 | 2013 (9) | 968 (8) | 1045 (11) | 476 (96) | 293 (95) | 317 (95) | 387 (95) | 175 (97) | 362 (95) |
| 2 | 48 (0) | 20 (0) | 28 (0) | 19 (4) | 15 (5) | 15 (5) | 22 (5) | 6 (3) | 19 (5) |
| Age at diagnosis among individuals who develop cancer(s) (earliest age for those with multiple cancers) in years, median (IQR) | 70 (64–77) | 68 (62–76) | 72 (67–77) | 65 (59–70) | 72 (67–77) | 71 (64–78) | 71 (65–76) | 74 (66–79) | 74 (68–78) |
| Length of follow-up (longest follow-up for those with multiple cancers) in years, median (IQR) | 20 (18–22) | 20 (18–22) | 19 (17–21) | 11 (6–16) | 13 (9–17) | 13 (8–17) | 11 (7–16) | 14 (9–17) | 14 (10–17) |
| Dialect group (%) | | | | | | | | | |
| Hokkien | 10663 (49) | 6132 (51) | 4531 (47) | 260 (53) | 153 (50) | 185 (56) | 164 (40) | 95 (52) | 162 (43) |
| Cantonese | 11031 (51) | 5952 (49) | 5079 (53) | 235 (47) | 155 (50) | 147 (44) | 245 (60) | 86 (48) | 219 (57) |
| Highest education (%) | | | | | | | | | |
| No | 4629 (21) | 3878 (32) | 751 (8) | 128 (26) | 20 (6) | 123 (37) | 46 (11) | 85 (47) | 57 (15) |
| Primary level | 9760 (45) | 5082 (42) | 4678 (49) | 206 (42) | 146 (47) | 138 (42) | 232 (57) | 62 (34) | 228 (60) |
| Secondary or above | 7305 (34) | 3124 (26) | 4181 (44) | 161 (33) | 142 (46) | 71 (21) | 131 (32) | 34 (19) | 96 (25) |
| Body mass index in kg/m2, median (IQR) | 23 (21–25) | 23 (21–25) | 23 (21–25) | 23 (21–25) | 23 (21–25) | 23 (21–24) | 23 (21–25) | 23 (20–24) | 23 (20–24) |
| Smoking status (%) | | | | | | | | | |
| Never | 15553 (72) | 11235 (93) | 4318 (45) | 472 (95) | 166 (54) | 296 (89) | 153 (37) | 129 (71) | 63 (17) |
| Ex-smoker | 2374 (11) | 261 (2) | 2113 (22) | 8 (2) | 66 (21) | 14 (4) | 108 (26) | 9 (5) | 74 (19) |
| Current smoker | 3767 (17) | 588 (5) | 3179 (33) | 15 (3) | 76 (25) | 22 (7) | 148 (36) | 43 (24) | 244 (64) |
| Number of cigarettes smoked (%) | | | | | | | | | |
| Does not smoke | 15553 (72) | 11235 (93) | 4318 (45) | 472 (95) | 166 (54) | 296 (89) | 153 (37) | 129 (71) | 63 (17) |
| <12 | 2408 (11) | 581 (5) | 1827 (19) | 14 (3) | 54 (18) | 26 (8) | 85 (21) | 36 (20) | 81 (21) |
| 13–22 | 2344 (11) | 206 (2) | 2138 (22) | 6 (1) | 53 (17) | 9 (3) | 108 (26) | 15 (8) | 135 (35) |
| ≥23 | 1389 (6) | 62 (1) | 1327 (14) | 3 (1) | 35 (11) | 1 (0) | 63 (15) | 1 (1) | 102 (27) |
| Alcohol consumption (%) | | | | | | | | | |
| Never/occasionally | 19079 (88) | 11506 (95) | 7573 (79) | 470 (95) | 253 (82) | 315 (95) | 303 (74) | 174 (96) | 296 (78) |
| Weekly | 1885 (9) | 437 (4) | 1448 (15) | 20 (4) | 44 (14) | 10 (3) | 66 (16) | 5 (3) | 49 (13) |
| Daily | 730 (3) | 141 (1) | 589 (6) | 5 (1) | 11 (4) | 7 (2) | 40 (10) | 2 (1) | 36 (9) |
| Moderate physical activity (%) | | | | | | | | | |
| No | 16584 (76) | 9446 (78) | 7138 (74) | 380 (77) | 208 (68) | 269 (81) | 295 (72) | 143 (79) | 294 (77) |
| 1–3 hr/week | 3274 (15) | 1679 (14) | 1595 (17) | 69 (14) | 62 (20) | 43 (13) | 68 (17) | 23 (13) | 53 (14) |
| ≥3 hr/week | 1836 (8) | 959 (8) | 877 (9) | 46 (9) | 38 (12) | 20 (6) | 46 (11) | 15 (8) | 34 (9) |
| Vigorous physical activity/strenuous sports at least once a week (%) | | | | | | | | | |
| No | 18467 (85) | 11221 (93) | 7246 (75) | 452 (91) | 239 (78) | 311 (94) | 342 (84) | 175 (97) | 314 (82) |
| Yes | 3227 (15) | 863 (7) | 2364 (25) | 43 (9) | 69 (22) | 21 (6) | 67 (16) | 6 (3) | 67 (18) |
| Family history of any cancer in first-degree relatives (%) | | | | | | | | | |
| No | 18193 (84) | 10141 (84) | 8052 (84) | 404 (82) | 236 (77) | 281 (85) | 336 (82) | 165 (91) | 333 (87) |
| Yes | 3501 (16) | 1943 (16) | 1558 (16) | 91 (18) | 72 (23) | 51 (15) | 73 (18) | 16 (9) | 48 (13) |

Lack of Asian representation in PRS development

Among the PRSs examined for breast (n=85), prostate (n=37), colorectal (n=22), and lung cancers (n=11), the reported sources of variant associations or GWAS used to build the PRSs were predominantly European ancestry populations (Supplementary file 1c). Only six PRSs for breast cancer (PGS000028, PGS000029, PGS000050, PGS000345, PGS001336, and PGS001804), three for prostate cancer (PGS000878, PGS001291, and PGS001805), three for colorectal cancer (PGS000055, PGS000802, and PGS001802), and one for lung cancer (PGS000070) were based on GWAS that included some non-European participants. For PRS development training, all but two PRSs (PGS000733 for prostate cancer and PGS000802 for colorectal cancer) were based on samples of European ancestry. No significant association (p>0.05) was found between the number of variants included in the various PRSs evaluated for each cancer and discriminatory ability (Supplementary file 1d).

PRS distribution

Figure 1 depicts the (A) distribution, (B) discrimination, (C) absolute risk association, and (D) calibration of the best-performing PRS (based on AUC) (Supplementary file 1d) for the four cancers studied: breast (PGS000873; Brentnall et al., 2020), prostate (PGS000662; Conti et al., 2021), colorectal (female: PGS000055; Schmit et al., 2019; male: PGS000734; Archambault et al., 2020), and lung (female: PGS000721; Jia et al., 2020; male: PGS000070; Dai et al., 2019). All PRSs were normally distributed, with a right shift observed in the distribution curves for cancer cases (Figure 1A). The mean value of each site-specific cancer PRS was significantly higher in cancer patients compared to controls (t-test p < 0.00273).

Figure 1. Site-specific polygenic risk scores (PRSs) performance assessment.

(A) Distribution, (B) discrimination, (C) absolute risk association, and (D) calibration for each of the four common cancers studied (from left to right: breast, prostate, lung [female], lung [male], colorectal [female], and colorectal [male]). Two-sided, two-sample t-tests with a type I error of 0.05 were used to examine whether there was a difference in the distribution of standardised PRS (subtraction of mean value followed by the division by the standard deviation) between site-specific cancer cases and non-cancer controls (A). The PRSs showcased are the best-performing scores based on area under the receiver operator characteristic curve (AUC) values in the female and male populations, (i) unadjusted [solid line], and (ii) adjusted for age at recruitment [dashed line] (B). Each colored line in the plots for absolute risk association denotes a five percentile increase in the standardised PRS score in (C). Calibration calculated based on 5-year absolute risk by PRS deciles in (D). A prediction tool is considered more accurate when the AUC is larger. An AUC of 0.9–1.0 is considered excellent, 0.8–0.9 very good, 0.7–0.8 good, 0.6–0.7 sufficient, 0.5–0.6 bad, and less than 0.5 considered not useful (PMID: 27683318).

Figure 1—source data 1. Tables on absolute risk for breast cancer.
Figure 1—source data 2. Tables on absolute risk for colorectal cancer.
Figure 1—source data 3. Tables on absolute risk for lung cancer.
Figure 1—source data 4. Tables on absolute risk for prostate cancer.
Figure 1—source data 5. Tables on polygenic risk scores (PRS) performance assessment.

PRS discriminatory ability

The highest AUC obtained from logistic models was observed for prostate cancer (0.70 [95% CI: 0.66–0.73]), followed by female breast cancer (0.61 [0.58–0.63]), male colorectal cancer (0.60 [0.58–0.63]), female colorectal cancer (0.56 [0.52–0.60]), male lung cancer (0.58 [0.55–0.61]), and female lung cancer (0.55 [0.50–0.59]) (Figure 1B).

Associations between PRS and the relative hazard of developing cancers

During the follow-up period of 20 years, the risk of developing breast, prostate, colorectal, or lung cancer increased significantly with higher PRS after adjusting for age at recruitment. Compared to the first PRS quintile, individuals in the highest quintile were more likely to develop the four cancers studied. The highest HR observed was for prostate cancer (8.99 [95% CI: 5.27–15.35]) and the lowest for female lung cancer (1.69 [1.03–2.79]), adjusted for age at recruitment (Supplementary file 1e). Significant trends were found for the associations between PRS quintiles and site-specific cancers (p-trend ranged from 1.35 × 10⁻²⁵ for prostate cancer to 0.008 for female lung cancer, Supplementary file 1e).

Compared to the middle cancer-specific PRS quintile, individuals in the highest PRS quintile were 64–152% more likely to develop cancers of the breast, prostate, and colorectum (Table 2). Individuals in the lowest PRS quintile had a 16–72% lower risk of developing these cancers. For female lung cancer, PRS quintiles 1 and 2 were associated with 44% and 45% decreased risk compared to the middle quintile, respectively. However, the HRs observed for PRS quintiles 4 (0.89 [0.57–1.39]) and 5 (0.95 [0.61–1.47]) were not significantly different from the middle quintile.
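The percentages quoted in this paragraph follow from the hazard ratios in Table 2 as (HR - 1) × 100. The short sketch below reproduces that arithmetic for the lowest and highest quintiles of breast, prostate, and colorectal cancer; the function name and dictionary keys are illustrative.

```python
def percent_change(hr):
    """Percentage change in hazard relative to the reference group."""
    return round((hr - 1.0) * 100)


# Hazard ratios for Q1 and Q5 versus the middle quintile, copied from Table 2.
q1 = {"breast": 0.61, "prostate": 0.28, "colorectal_female": 0.84, "colorectal_male": 0.51}
q5 = {"breast": 1.64, "prostate": 2.52, "colorectal_female": 1.91, "colorectal_male": 1.67}

q1_change = {site: percent_change(hr) for site, hr in q1.items()}
q5_change = {site: percent_change(hr) for site, hr in q5.items()}

print(min(q5_change.values()), max(q5_change.values()))  # 64 152
print(max(q1_change.values()), min(q1_change.values()))  # -16 -72
```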

Table 2. Hazard ratios (HR) and corresponding 95% confidence intervals (CI) associated with polygenic risk score quintiles (Q) compared to the population median, using the Cox proportional hazards model and censored at 20 years after recruitment.

Individuals were categorised into cancer-specific quintiles based on their cancer-specific polygenic risk score (PRS). All models were adjusted for age at recruitment.

| Cancer site – gender | Measure | Q1 | Q2 | Q3 | Q4 | Q5 |
|---|---|---|---|---|---|---|
| Breast – female | Number of cases | 55 | 73 | 86 | 107 | 145 |
| | HR (95% CI) | 0.61 (0.44–0.86) | 0.80 (0.59–1.09) | 1.00 (Referent) | 1.25 (0.94–1.66) | 1.64 (1.26–2.14) |
| Prostate – male | Number of cases | 15 | 31 | 55 | 59 | 129 |
| | HR (95% CI) | 0.28 (0.16–0.50) | 0.57 (0.37–0.88) | 1.00 (Referent) | 1.11 (0.77–1.60) | 2.52 (1.84–3.46) |
| Colorectal – female | Number of cases | 47 | 43 | 53 | 66 | 101 |
| | HR (95% CI) | 0.84 (0.57–1.25) | 0.80 (0.53–1.20) | 1.00 (Referent) | 1.27 (0.88–1.82) | 1.91 (1.37–2.67) |
| Colorectal – male | Number of cases | 36 | 70 | 71 | 87 | 114 |
| | HR (95% CI) | 0.51 (0.34–0.77) | 1.00 (0.72–1.39) | 1.00 (Referent) | 1.29 (0.94–1.76) | 1.67 (1.24–2.25) |
| Lung – female | Number of cases | 25 | 26 | 41 | 36 | 40 |
| | HR (95% CI) | 0.56 (0.34–0.92) | 0.55 (0.34–0.91) | 1.00 (Referent) | 0.89 (0.57–1.39) | 0.95 (0.61–1.47) |
| Lung – male | Number of cases | 51 | 58 | 68 | 80 | 103 |
| | HR (95% CI) | 0.72 (0.50–1.04) | 0.79 (0.56–1.13) | 1.00 (Referent) | 1.14 (0.82–1.57) | 1.46 (1.07–1.98) |

Every SD increase in PRS was associated with a 39–108% elevated risk of breast, prostate, and colorectal cancers (p ≤ 1.06 × 10⁻⁸), adjusted for age, dialect group, education, BMI, smoking status, physical activity, alcohol consumption, and family history (Table 3). The increased risk for female and male lung cancer was lower than for the other three cancers (HRfemale: 1.21 [1.04–1.40], p=0.011; HRmale: 1.35 [1.22–1.49], p=1.01 × 10⁻⁸). Age at recruitment was significantly associated with elevated risks of developing all cancers, with the exception of female breast cancer (HR: 1.00 [0.99–1.02], p=0.582). Highest education level and BMI were positively associated with breast cancer risk. Smoking was significantly associated with an ~30% reduction in risk of prostate cancer but increased the risk of lung cancer by approximately two- and fivefold for past and current smokers, respectively, compared to non-smokers. Alcohol consumption increased the risk of both female and male colorectal cancer by approximately 60% but was only significant for male colorectal cancer. Family history of cancer was only significantly associated with an increased risk for prostate cancer (HR: 1.53 [1.16–2.02], p=7.59 × 10⁻⁴).

Table 3. Associations between per standard deviation (SD) increase in site-specific polygenic risk scores and cancer occurrence.

Hazard ratios (HR) and corresponding 95% confidence intervals (CI) were estimated using Cox proportional hazard models, adjusted for age at recruitment, dialect group, highest education attained, body mass index, smoking status, alcohol consumption, and physical activity. Follow-up time was censored at 20 years after recruitment. Significant results are shown in bold.

Cancer site
Breast Prostate Colorectal – female Colorectal – male Lung – female Lung – male
HR (95% CI) p-Value HR (95% CI) p-Value HR (95% CI) p-Value HR (95% CI) p-Value HR (95% CI) p-Value HR (95% CI) p-Value
Site-specific polygenic risk score, per SD increase 1.47 (1.34–1.60) 5.80E-17 2.08 (1.85–2.34) 1.56E-33 1.39 (1.24–1.55) 1.06E-08 1.44 (1.30–1.59) 5.41E-12 1.21 (1.04–1.40) 1.10E-02 1.35 (1.22–1.49) 1.01E-08
Age at recruitment, years 1.00 (0.99–1.02) 5.82E-01 1.09 (1.07–1.10) 6.34E-23 1.07 (1.05–1.09) 7.24E-17 1.06 (1.05–1.08) 9.53E-18 1.07 (1.05–1.10) 1.65E-10 1.09 (1.07–1.10) 1.46E-27
Dialect group (Cantonese vs Hokkien) 0.88 (0.73–1.05) 1.61E-01 0.98 (0.78–1.24) 8.86E-01 0.78 (0.62–0.99) 3.96E-02 1.22 (0.99–1.50) 6.78E-02 0.92 (0.67–1.25) 5.78E-01 1.07 (0.87–1.33) 5.21E-01
Highest education (primary vs no) 1.21 (0.95–1.53) 1.20E-01 1.32 (0.81–2.14) 2.65E-01 1.08 (0.83–1.41) 5.60E-01 0.98 (0.70–1.37) 8.91E-01 0.83 (0.58–1.19) 3.11E-01 0.87 (0.64–1.18) 3.67E-01
Highest education (secondary or above vs no) 1.54 (1.18–2.01) 1.57E-03 1.60 (0.98–2.63) 6.17E-02 1.06 (0.76–1.48) 7.46E-01 0.80 (0.55–1.16) 2.33E-01 1.10 (0.69–1.74) 6.87E-01 0.63 (0.44–0.90) 1.16E-02
Body mass index, kg/m2 1.04 (1.02–1.07) 1.28E-03 1.01 (0.98–1.05) 5.15E-01 0.99 (0.96–1.02) 5.33E-01 1.02 (0.98–1.05) 3.19E-01 0.97 (0.92–1.01) 1.58E-01 0.97 (0.93–1.00) 5.60E-02
Smoking status (ex-smoker vs non-smoker) 0.90 (0.45–1.83) 7.81E-01 0.68 (0.50–0.92) 1.32E-02 1.51 (0.86–2.66) 1.55E-01 1.17 (0.90–1.52) 2.36E-01 2.16 (1.04–4.48) 3.86E-02 1.99 (1.41–2.83) 1.09E-04
Smoking status (current smoker vs non-smoker) 0.83 (0.49–1.39) 4.72E-01 0.70 (0.52–0.93) 1.52E-02 1.10 (0.69–1.75) 6.85E-01 1.22 (0.96–1.56) 1.08E-01 5.78 (3.98–8.38) 2.69E-20 5.15 (3.83–6.91) 1.17E-27
Alcohol consumption (weekly vs never/occasionally) 1.04 (0.65–1.67) 8.74E-01 0.98 (0.70–1.39) 9.29E-01 0.76 (0.38–1.54) 4.46E-01 1.31 (1.00–1.73) 5.39E-02 0.72 (0.27–1.96) 5.23E-01 0.89 (0.65–1.22) 4.81E-01
Alcohol consumption (daily vs never/occasionally) 0.71 (0.27–1.91) 5.00E-01 0.74 (0.40–1.36) 3.32E-01 1.55 (0.69–3.49) 2.89E-01 1.64 (1.15–2.34) 6.54E-03 0.66 (0.16–2.68) 5.63E-01 1.21 (0.85–1.73) 2.81E-01
Moderate physical activity (1–3 hr/week vs no) 0.98 (0.75–1.28) 8.80E-01 1.17 (0.87–1.57) 2.97E-01 0.88 (0.63–1.24) 4.74E-01 1.02 (0.78–1.35) 8.79E-01 0.99 (0.62–1.58) 9.73E-01 0.90 (0.67–1.23) 5.17E-01
Moderate physical activity (≥3 hr/week vs no) 1.17 (0.86–1.60) 3.20E-01 0.99 (0.68–1.45) 9.78E-01 0.59 (0.37–0.96) 3.33E-02 1.10 (0.80–1.52) 5.45E-01 0.96 (0.54–1.70) 8.78E-01 0.86 (0.59–1.26) 4.36E-01
Vigorous physical activity/strenuous sports at least once a week (yes vs no) 1.24 (0.90–1.70) 1.89E-01 1.05 (0.79–1.41) 7.16E-01 1.09 (0.68–1.74) 7.25E-01 0.75 (0.57–1.00) 5.06E-02 0.58 (0.24–1.42) 2.30E-01 0.95 (0.72–1.26) 7.37E-01
Family history (yes vs no) 1.14 (0.90–1.45) 2.67E-01 1.53 (1.16–2.02) 2.47E-03 1.08 (0.79–1.48) 6.20E-01 1.24 (0.95–1.62) 1.09E-01 0.67 (0.40–1.13) 1.33E-01 0.97 (0.71–1.33) 8.63E-01

None of the Cox models presented in Tables 2 and 3 violated the proportional hazards assumption for the PRS studied (p-values from cox.zph() for PRS were >0.05).

Association of PRS with absolute risk

In terms of the 5-year absolute risk of developing site-specific cancers, the largest difference between the highest and lowest PRS categories was observed for prostate cancer, followed by breast cancer (Figure 1C). For female breast cancer, separation of the absolute risk curves was already evident at age 30 years. For prostate cancer, the separation of curves was observed only after age 50 years. Slight separation of the curves began after 50 years of age for colorectal and lung cancer.
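
To make the link between relative and absolute risk concrete, the following generic sketch converts a hazard ratio and a set of yearly baseline incidence rates into a 5-year absolute risk. It is not the authors' procedure, which is given in their Methods and Source code 1. The function name and the baseline rate are hypothetical, the rate is treated as applying to the middle PRS quintile, and competing mortality is ignored, which overstates risk at older ages. The hazard ratios are the prostate cancer quintile estimates from Table 2.

```python
import math

def absolute_risk(annual_rates, hazard_ratio):
    """Cumulative risk over consecutive one-year intervals.

    annual_rates: reference-group incidence rates per person-year, one per year.
    hazard_ratio: hazard of the PRS category relative to the reference group.
    Competing causes of death are ignored in this simplified version.
    """
    cumulative_hazard = sum(rate * hazard_ratio for rate in annual_rates)
    return 1.0 - math.exp(-cumulative_hazard)

# Hypothetical incidence for the middle quintile: 1 case per 1000 person-years
# in each of five years (an assumption, NOT a figure from the study).
reference_rates = [0.001] * 5

# Prostate cancer HRs by PRS quintile, copied from Table 2.
prostate_hr = {"Q1": 0.28, "Q2": 0.57, "Q3": 1.00, "Q4": 1.11, "Q5": 2.52}

for quintile, hr in prostate_hr.items():
    risk = absolute_risk(reference_rates, hr)
    print(f"{quintile}: 5-year absolute risk = {100 * risk:.2f}%")
```

Because the baseline rate rises steeply with age for most cancers, the same hazard ratio produces a much wider absolute gap between PRS categories at older ages, which is consistent with curves that separate only after age 50.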

PRS calibration

In general, predicted risks for the higher PRS categories did not correspond well to the observed proportions for female breast, male prostate, and female lung cancers (Figure 1D); in particular, predicted risks were overestimated for the higher risk categories. Overestimation of risk was observed for all PRS categories for male lung cancer. In contrast, predicted risks were underestimated for both female and male colorectal cancers. Nonetheless, the CI associated with the calibration slopes for all cancers included 1, with the exception of female (0.32 [−0.31 to 0.95]) and male lung cancers (0.26 [−0.25 to 0.78]).
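
For readers unfamiliar with the calibration slope, the sketch below shows one standard way to obtain it: regress the observed outcome on the logit of the predicted risk with logistic regression and read off the coefficient (1 indicates ideal calibration; values below 1 indicate predictions that are too extreme). This is an illustration under stated assumptions, not the authors' implementation. The function names are hypothetical, the data are grouped and invented, and the fit uses plain Newton-Raphson without safeguards.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def calibration_slope(predicted, events, totals, iterations=25):
    """Fit logit(P(event)) = a + b * logit(predicted) on grouped data.

    predicted: predicted risk for each group
    events, totals: observed event count and size of each group
    Returns (a, b), where b is the calibration slope.
    """
    x = [logit(p) for p in predicted]
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, ni in zip(x, events, totals):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = ni * mu * (1.0 - mu)   # binomial variance weight
            r = yi - ni * mu           # observed minus fitted events
            g0 += r
            g1 += r * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * delta = score.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Invented groups (NOT study data): risk is overestimated in the higher groups.
predicted = [0.02, 0.05, 0.10, 0.20]
totals = [1000, 1000, 1000, 1000]
events = [20, 40, 60, 80]

intercept, slope = calibration_slope(predicted, events, totals)
print(f"calibration intercept = {intercept:.2f}, slope = {slope:.2f}")
```

A confidence interval for the slope would come from the inverse of the information matrix at convergence; it is omitted here to keep the sketch short.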

Discussion

Precision prevention in oncology is based on the idea that an individual’s risk, which is influenced by genetics, environment, and lifestyle factors, is linked to the amount of benefit achieved through cancer screening (Roberts, 2018). Risk stratification for cancer screening can be used in this framework to identify and recommend screening for persons with a high enough cancer risk that the benefits outweigh the risks. Several PRS prediction models have been established for site-specific cancers, each with its own set of strengths and limitations, and different risk models may produce different results for the same individual.

In an increasingly inclusive world, genetic studies fall short on diversity. According to a 2009 study, an overwhelming 96% of people who took part in genome-wide association studies (GWAS) were of European ancestry (Need and Goldstein, 2009). GWAS results are the backbone on which PRS is developed. A concern raised was that, without representation from a broader spectrum of populations, genomic medicine may be limited to benefitting ‘a privileged few’ (Popejoy and Fullerton, 2016).

A 2016 analysis showed that the proportion of GWAS participants not of European ancestry had increased to approximately 20% (Popejoy and Fullerton, 2016). Most of this rise can be attributed to more research on populations of Asian ancestry in Asia (Popejoy and Fullerton, 2016). With increasing interest worldwide in using a risk-based approach to screening programs over the current age-based paradigm, this progress raises questions on whether selected established PRS shown to perform well in European-based populations have equal utility in Asians. Nonetheless, as our results show, most of the populations from which PRS were developed are still predominantly of European ancestry.

In accordance with published Polygenic Risk Score Reporting Standards, we reported PRS distribution, discrimination, absolute risk association, and calibration for each of the four common cancers studied (Wand et al., 2021). Our results show that cancer cases were associated with higher PRS compared to non-cancer controls. In the age-adjusted models, a constant trend between PRS percentile rank and observed cancer risk in our study population supports the validity of PRS for breast, prostate, and colorectal cancers, but not for lung cancer. The best-performing PRS for female breast cancer was able to stratify women into distinct bands of breast cancer risk at an earlier age, and across all ages, suggesting that it could be a useful prediction tool in risk-based breast cancer screening in combination with other risk factors specific to breast cancer (Ho et al., 2022). This PRS has been incorporated into a pilot risk-based breast cancer screening study in a comparable study population (Liu et al., 2022). The best-performing PRS for prostate and male colorectal cancers in this study appeared to exhibit sufficient discriminatory ability and predictive value, especially for older participants.

PRS may be of limited use in predicting female colorectal and female/male lung cancer. Predictive value was lowest for lung cancer, which could be related to the higher prevalence of EGFR-mutant lung cancer, which has an Asian predilection and may therefore be less amenable to PRS developed in Caucasian populations (Shigematsu et al., 2005). It is reassuring that tobacco smoking is a strong risk factor for lung cancer in our dataset. However, smoking appeared to be associated with a protective effect for prostate cancer. While smoking is a well-known risk factor for many cancers (Jacob et al., 2018), in particular lung cancer, observational studies frequently show that smoking is associated with a lower incidence of prostate cancer (Rohrmann et al., 2013; Watters et al., 2009; Adami et al., 1996; Lund Nilsen et al., 2000; Engeland et al., 1996; Islami et al., 2014; Ordóñez-Mena et al., 2016; Giovannucci et al., 2007). A Mendelian randomisation study, however, did not support the association (Larsson et al., 2020).

There is room for improvement in the discriminatory ability of PRS (Lewis and Green, 2021). As noted by Lambert et al. in a review, a wider divergence between the average scores of cases and non-cases (quantified by AUC) and associated effect sizes (odds ratio and SD) is expected when PRS explains more of the heritability for each trait (Lambert et al., 2019). Larger GWAS sample sizes of appropriate ancestries and the inclusion of rarer genetic variants, obtained through other methods such as whole-genome sequencing, would likely be required to boost explained heritability (Lambert et al., 2019). In addition, group-wise estimates, which arbitrarily classify the top 10%, 5%, or 1% of samples as the at-risk group, are not optimal for decisions at the individual level (Lewis and Green, 2021). Emerging new methodologies that estimate probability values for hypothetically assigning an individual as at risk or not at risk, thus providing individuals with more clarity, may help to overcome this limitation (Sun et al., 2021). At this point, PRS may not yet meet the standards required of a stand-alone clinical tool. However, it is still helpful in guiding screening decisions and supplementing established protocols (Polygenic Risk Score Task Force of the International Common Disease Alliance, 2021).
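
The relationship between effect size per SD and AUC can be illustrated numerically. The sketch below is a textbook approximation, not an analysis from this study. It assumes PRS values are normally distributed with equal variance in cases and non-cases, in which case AUC = Φ(d/√2), where d is the standardized mean difference and equals the log odds ratio per SD. Using a hazard ratio in place of an odds ratio is a further assumption that is reasonable only when the outcome is uncommon. The function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def auc_from_per_sd_effect(ratio_per_sd):
    """AUC implied by an odds ratio per SD under an equal-variance binormal model."""
    d = math.log(ratio_per_sd)  # standardized mean difference between groups
    return normal_cdf(d / math.sqrt(2.0))

def empirical_auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Effect sizes spanning the per-SD range reported in Table 3.
for ratio in (1.21, 1.47, 2.08):
    print(f"ratio per SD = {ratio:.2f} -> implied AUC = {auc_from_per_sd_effect(ratio):.3f}")
```

Under this approximation, even a doubling of risk per SD corresponds to an AUC of roughly 0.7, which helps explain why a PRS with strong per-SD associations can still discriminate only modestly at the individual level.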

As highlighted by Wei et al., the reliability of score values is necessary for application at the individual level (Wei et al., 2022). Even when a PRS has adequate discrimination, estimated risks can be unreliable (Van Calster et al., 2019). Our results show that cancer risk estimates based on PRS developed using populations of European ancestry are not optimally calibrated for our Asian study population. However, the CI associated with the calibration slopes for all cancers except for female and male lung included the value of 1, suggesting that the overall calibration is not poor. For female and male lung cancers, for which AUC values were low, poor calibration is not unexpected. All PRSs except PGS000662 (prostate cancer) passed the formal Hosmer-Lemeshow goodness-of-fit test. Males in the first five deciles of PGS000662 did not develop prostate cancer, suggesting that a linear fit may not be appropriate. A hard threshold beginning from the sixth decile may perform better at identifying males at elevated risk of developing prostate cancer.
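
The Hosmer-Lemeshow test mentioned above compares observed and expected event counts across risk groups. The sketch below shows the statistic and its p-value in a minimal form; it is not the authors' code. The function names are hypothetical, the decile counts are invented to mimic a pattern with no cases in the lower half, and the degrees of freedom are set to the number of groups minus 2, a common convention that this section does not confirm for the study. The chi-square tail probability uses a closed form that is valid only for even degrees of freedom.

```python
import math

def hosmer_lemeshow(observed, expected, totals):
    """Hosmer-Lemeshow chi-square statistic over risk groups.

    observed: observed event count per group
    expected: sum of predicted risks per group
    totals: number of individuals per group
    """
    stat = 0.0
    for o, e, n in zip(observed, expected, totals):
        stat += (o - e) ** 2 / (e * (1.0 - e / n))
    return stat

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability (closed form, even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Invented decile data (NOT study data): no events in the five lowest deciles.
totals = [1000] * 10
expected = [2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0, 13.0, 17.0, 25.0]
observed = [0, 0, 0, 0, 0, 9, 12, 15, 20, 30]

stat = hosmer_lemeshow(observed, expected, totals)
p_value = chi2_sf_even_df(stat, df=len(totals) - 2)
print(f"HL statistic = {stat:.2f}, p = {p_value:.4f}")
```

A small p-value indicates that observed and predicted counts disagree by more than chance would explain, which is the sense in which a PRS fails the test.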

Poorly calibrated PRS can be misleading and have clinical repercussions (Van Calster et al., 2019; Van Calster and Vickers, 2015). Underestimation of risk may result in a false sense of security. Overestimation of risk may cause unnecessary anxiety, misguided interventions, and overtreatment. In a population-wide screening setting, however, where the return of PRS results can be designed such that only high-risk individuals are highlighted, underestimation of risk may be less of an issue. Arguably, with parallel input from other risk factors and evaluation by healthcare specialists, the overestimation of risk that results in a higher number of at-risk individuals identified may increase the number of cancers potentially detected early. Nonetheless, suitable correction factors will be required to ensure the reliability of PRS prior to clinical implementation.

While the study population used in this analysis comprises fewer than a thousand cases of the most common cancers examined, the SCHS, established between April 1993 and December 1998, is one of the largest population-based Asian cohorts in the world with high-quality prospective data on exposure and comprehensive capture of morbidity and mortality. All cancer cases are incident cases diagnosed over three decades of follow-up. Blood samples were collected from a subset of SCHS participants who were alive and contactable between 1999 and 2004 (after the recruitment period 1993–1998). While we attempted to adjust associations between PRSs and incident cancers in the study by including multiple related risk factors as covariates in Cox proportional hazards models, we acknowledge the potential for survival bias in the study. Even so, the SCHS is one of the best resources for evaluating the utility of PRS prospectively. The findings open a window in our current understanding of which PRS is relevant and ready to be deployed in risk-based cancer screening studies. Nonetheless, it should be noted that among the PRS interrogated from the PGS Catalog, the ‘best’ PRS selected in this study may be only superficially superior in terms of AUC (i.e. in the trailing decimal places) to the ‘next best’ PRS. We further tested the sensitivity of the PRS selection using a time-to-event metric for AUC (at 5 years), and the differences from the logistic regression results were not informative (Supplementary file 1f). To increase the number of events, we combined males and females for lung and colorectal cancers. The resulting AUCs (from the logistic regression) were not appreciably different from those of the sex-specific analyses (Supplementary file 1g).

Ethnic representation in PRS model development, PRS validation, limited discriminative ability in the general population, ill calibration, insufficient healthcare professional and patient education, and healthcare system integration are all hurdles that must be crossed before PRS can be implemented responsibly as a public health instrument (Lewis and Vassos, 2020; Slunecka et al., 2021). While nationwide screening programs have helped to raise cancer awareness, there is still a need to improve the effectiveness and efficiency of cancer screening in Asian countries such as Singapore, given the steadily rising incidence rates. Despite the challenges, a risk-based screening strategy that includes the use of PRS should be actively examined for research and implementation.

Acknowledgements

We thank the Singapore Cancer Registry for the identification of incident cancer cases among participants of the SCHS and Siew-Hong Low of the National University of Singapore for supervising the fieldwork of the SCHS.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Rajkumar Dorajoo, Email: dorajoor@gis.a-star.edu.sg.

Jingmei Li, Email: lijm1@gis.a-star.edu.sg.

Qifeng Yang, Qilu Hospital of Shandong University, China.

Caigang Liu, Shengjing Hospital of China Medical University, China.

Funding Information

This paper was supported by the following grants:

  • National Research Foundation Singapore PRECISE to Jingmei Li.

  • National Medical Research Council NMRC/CSA/0055/2013 to Woon-Puay Koh.

  • National Research Foundation Singapore NRF-NRFI2018-01 to Chiea Chuen Khor.

  • Ministry of Health - Singapore HLCA20Jan-0022 to Rajkumar Dorajoo.

  • National Institutes of Health R01 CA144034 to Jian-Min Yuan.

  • National Institutes of Health UM1 CA182876 to Jian-Min Yuan.

Additional information

Competing interests

No competing interests declared.


received a grant from the Agency for Science, Technology and Research Career Development Award (A*STAR CDA - 202D8090), and from Ministry of Health Healthy Longevity Catalyst Award (HLCA20Jan-0022). The author has no other competing interests to declare.

Author contributions

Conceptualization, Formal analysis, Visualization, Methodology, Writing – original draft, Writing – review and editing.

Writing – review and editing, Interpretation of data.

Writing – review and editing.

Data curation, Funding acquisition, Writing – review and editing.

Resources, Data curation, Funding acquisition, Writing – review and editing.

Resources, Data curation, Funding acquisition, Writing – review and editing.

Conceptualization, Resources, Data curation, Funding acquisition, Methodology, Writing – original draft, Writing – review and editing.

Conceptualization, Resources, Data curation, Supervision, Funding acquisition, Methodology, Writing – original draft, Writing – review and editing.

Ethics

Human subjects: The study was approved by the Institutional Review Boards of the National University of Singapore, University of Pittsburgh, and the Agency for Science, Technology and Research (A*STAR, reference number 2022-042). Written, informed consent was obtained from all study participants.

Additional files

Supplementary file 1. Supplementary files a-g, presenting supplementary figure and tables.
elife-82608-supp1.xlsx (476KB, xlsx)
MDAR checklist
Source code 1. R codes on the statistical analysis.
elife-82608-code1.zip (65.8KB, zip)

Data availability

All polygenic risk scores used in this study are publicly available in the PGS Catalog (https://www.pgscatalog.org; Lambert et al., 2021). The data that support the findings of our study are available from the corresponding authors of the study upon reasonable request (Dr Rajkumar s/o Dorajoo, dorajoor@gis.a-star.edu.sg and Dr Jingmei Li, lijm1@gis.a-star.edu.sg). More information regarding the data access to SCHS can be found at: https://sph.nus.edu.sg/research/cohort-schs/. The data are not publicly available due to Singapore laws. Figure 1—source data 1 contains the numerical data used to generate Figure 1. The code for the study is uploaded as Source code 1.

References

  1. Adami HO, Bergström R, Engholm G, Nyrén O, Wolk A, Ekbom A, Englund A, Baron J. A prospective study of smoking and risk of prostate cancer. International Journal of Cancer. 1996;67:764–768. doi: 10.1002/(SICI)1097-0215(19960917)67:6<764::AID-IJC3>3.0.CO;2-P.
  2. Archambault AN, Su YR, Jeon J, Thomas M, Lin Y, Conti DV, Win AK, Sakoda LC, Lansdorp-Vogelaar I, Peterse EFP, Zauber AG, Duggan D, Holowatyj AN, Huyghe JR, Brenner H, Cotterchio M, Bézieau S, Schmit SL, Edlund CK, Southey MC, MacInnis RJ, Campbell PT, Chang-Claude J, Slattery ML, Chan AT, Joshi AD, Song M, Cao Y, Woods MO, White E, Weinstein SJ, Ulrich CM, Hoffmeister M, Bien SA, Harrison TA, Hampe J, Li CI, Schafmayer C, Offit K, Pharoah PD, Moreno V, Lindblom A, Wolk A, Wu AH, Li L, Gunter MJ, Gsur A, Keku TO, Pearlman R, Bishop DT, Castellví-Bel S, Moreira L, Vodicka P, Kampman E, Giles GG, Albanes D, Baron JA, Berndt SI, Brezina S, Buch S, Buchanan DD, Trichopoulou A, Severi G, Chirlaque MD, Sánchez MJ, Palli D, Kühn T, Murphy N, Cross AJ, Burnett-Hartman AN, Chanock SJ, de la Chapelle A, Easton DF, Elliott F, English DR, Feskens EJM, FitzGerald LM, Goodman PJ, Hopper JL, Hudson TJ, Hunter DJ, Jacobs EJ, Joshu CE, Küry S, Markowitz SD, Milne RL, Platz EA, Rennert G, Rennert HS, Schumacher FR, Sandler RS, Seminara D, Tangen CM, Thibodeau SN, Toland AE, van Duijnhoven FJB, Visvanathan K, Vodickova L, Potter JD, Männistö S, Weigl K, Figueiredo J, Martín V, Larsson SC, Parfrey PS, Huang WY, Lenz HJ, Castelao JE, Gago-Dominguez M, Muñoz-Garzón V, Mancao C, Haiman CA, Wilkens LR, Siegel E, Barry E, Younghusband B, Van Guelpen B, Harlid S, Zeleniuch-Jacquotte A, Liang PS, Du M, Casey G, Lindor NM, Le Marchand L, Gallinger SJ, Jenkins MA, Newcomb PA, Gruber SB, Schoen RE, Hampel H, Corley DA, Hsu L, Peters U, Hayes RB. Cumulative burden of colorectal cancer-associated genetic variants is more strongly associated with early-onset vs late-onset cancer. Gastroenterology. 2020;158:1274–1286. doi: 10.1053/j.gastro.2019.12.012.
  3. Archambault AN, Jeon J, Lin Y, Thomas M, Harrison TA, Bishop DT, Brenner H, Casey G, Chan AT, Chang-Claude J, Figueiredo JC, Gallinger S, Gruber SB, Gunter MJ, Guo F, Hoffmeister M, Jenkins MA, Keku TO, Le Marchand L, Li L, Moreno V, Newcomb PA, Pai R, Parfrey PS, Rennert G, Sakoda LC, Lee JK, Slattery ML, Song M, Win AK, Woods MO, Murphy N, Campbell PT, Su Y-R, Lansdorp-Vogelaar I, Peterse EFP, Cao Y, Zeleniuch-Jacquotte A, Liang PS, Du M, Corley DA, Hsu L, Peters U, Hayes RB. Risk stratification for early-onset colorectal cancer using a combination of genetic and environmental risk scores: an international multi-center study. Journal of the National Cancer Institute. 2022;114:528–539. doi: 10.1093/jnci/djac003.
  4. Brentnall AR, van Veen EM, Harkness EF, Rafiq S, Byers H, Astley SM, Sampson S, Howell A, Newman WG, Cuzick J, Evans DGR. A case-control evaluation of 143 single nucleotide polymorphisms for breast cancer risk stratification with classical factors and mammographic density. International Journal of Cancer. 2020;146:2122–2129. doi: 10.1002/ijc.32541.
  5. Cano-Gamez E, Trynka G. From GWAS to function: using functional genomics to identify the mechanisms underlying complex diseases. Frontiers in Genetics. 2020;11:424. doi: 10.3389/fgene.2020.00424.
  6. Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015;4:7. doi: 10.1186/s13742-015-0047-8.
  7. Chang X, Gurung RL, Wang L, Jin A, Li Z, Wang R, Beckman KB, Adams-Haduch J, Meah WY, Sim KS, Lim WK, Davila S, Tan P, Teo JX, Yeo KK, Yiamunaa M, Liu S, Lim SC, Liu J, van Dam RM, Friedlander Y, Koh W-P, Yuan J-M, Khor CC, Heng C-K, Dorajoo R. Low frequency variants associated with leukocyte telomere length in the Singapore Chinese population. Communications Biology. 2021;4:519. doi: 10.1038/s42003-021-02056-7.
  8. Clift AK, Dodwell D, Lord S, Petrou S, Brady SM, Collins GS, Hippisley-Cox J. The current status of risk-stratified breast screening. British Journal of Cancer. 2022;126:533–550. doi: 10.1038/s41416-021-01550-3.
  9. Conti DV, Darst BF, Moss LC, Saunders EJ, Sheng X, Chou A, Schumacher FR, Olama AAA, Benlloch S, Dadaev T, Brook MN, Sahimi A, Hoffmann TJ, Takahashi A, Matsuda K, Momozawa Y, Fujita M, Muir K, Lophatananon A, Wan P, Le Marchand L, Wilkens LR, Stevens VL, Gapstur SM, Carter BD, Schleutker J, Tammela TLJ, Sipeky C, Auvinen A, Giles GG, Southey MC, MacInnis RJ, Cybulski C, Wokołorczyk D, Lubiński J, Neal DE, Donovan JL, Hamdy FC, Martin RM, Nordestgaard BG, Nielsen SF, Weischer M, Bojesen SE, Røder MA, Iversen P, Batra J, Chambers S, Moya L, Horvath L, Clements JA, Tilley W, Risbridger GP, Gronberg H, Aly M, Szulkin R, Eklund M, Nordström T, Pashayan N, Dunning AM, Ghoussaini M, Travis RC, Key TJ, Riboli E, Park JY, Sellers TA, Lin H-Y, Albanes D, Weinstein SJ, Mucci LA, Giovannucci E, Lindstrom S, Kraft P, Hunter DJ, Penney KL, Turman C, Tangen CM, Goodman PJ, Thompson IM Jr, Hamilton RJ, Fleshner NE, Finelli A, Parent M-É, Stanford JL, Ostrander EA, Geybels MS, Koutros S, Freeman LEB, Stampfer M, Wolk A, Håkansson N, Andriole GL, Hoover RN, Machiela MJ, Sørensen KD, Borre M, Blot WJ, Zheng W, Yeboah ED, Mensah JE, Lu Y-J, Zhang H-W, Feng N, Mao X, Wu Y, Zhao S-C, Sun Z, Thibodeau SN, McDonnell SK, Schaid DJ, West CML, Burnet N, Barnett G, Maier C, Schnoeller T, Luedeke M, Kibel AS, Drake BF, Cussenot O, Cancel-Tassin G, Menegaux F, Truong T, Koudou YA, John EM, Grindedal EM, Maehle L, Khaw K-T, Ingles SA, Stern MC, Vega A, Gómez-Caamaño A, Fachal L, Rosenstein BS, Kerns SL, Ostrer H, Teixeira MR, Paulo P, Brandão A, Watya S, Lubwama A, Bensen JT, Fontham ETH, Mohler J, Taylor JA, Kogevinas M, Llorca J, Castaño-Vinyals G, Cannon-Albright L, Teerlink CC, Huff CD, Strom SS, Multigner L, Blanchet P, Brureau L, Kaneva R, Slavov C, Mitev V, Leach RJ, Weaver B, Brenner H, Cuk K, Holleczek B, Saum K-U, Klein EA, Hsing AW, Kittles RA, Murphy AB, Logothetis CJ, Kim J, Neuhausen SL, Steele L, Ding YC, Isaacs WB, Nemesure B, Hennis AJM, Carpten J, Pandha H, Michael A, De Ruyck K, De Meerleer G, Ost P, Xu J, Razack A, Lim J, Teo S-H, Newcomb LF, Lin DW, Fowke JH, Neslund-Dudas C, Rybicki BA, Gamulin M, Lessel D, Kulis T, Usmani N, Singhal S, Parliament M, Claessens F, Joniau S, Van den Broeck T, Gago-Dominguez M, Castelao JE, Martinez ME, Larkin S, Townsend PA, Aukim-Hastie C, Bush WS, Aldrich MC, Crawford DC, Srivastava S, Cullen JC, Petrovics G, Casey G, Roobol MJ, Jenster G, van Schaik RHN, Hu JJ, Sanderson M, Varma R, McKean-Cowdin R, Torres M, Mancuso N, Berndt SI, Van Den Eeden SK, Easton DF, Chanock SJ, Cook MB, Wiklund F, Nakagawa H, Witte JS, Eeles RA, Kote-Jarai Z, Haiman CA. Trans-ancestry genome-wide association meta-analysis of prostate cancer identifies new susceptibility loci and informs genetic risk prediction. Nature Genetics. 2021;53:65–75. doi: 10.1038/s41588-020-00748-0.
  10. Dai J, Lv J, Zhu M, Wang Y, Qin N, Ma H, He Y-Q, Zhang R, Tan W, Fan J, Wang T, Zheng H, Sun Q, Wang L, Huang M, Ge Z, Yu C, Guo Y, Wang T-M, Wang J, Xu L, Wu W, Chen L, Bian Z, Walters R, Millwood IY, Li X-Z, Wang X, Hung RJ, Christiani DC, Chen H, Wang M, Wang C, Jiang Y, Chen K, Chen Z, Jin G, Wu T, Lin D, Hu Z, Amos CI, Wu C, Wei Q, Jia W-H, Li L, Shen H. Identification of risk loci and a polygenic risk score for lung cancer: a large-scale prospective cohort study in Chinese populations. The Lancet Respiratory Medicine. 2019;7:881–891. doi: 10.1016/S2213-2600(19)30144-4.
  11. de Kok IMCM, Wong CS, Chia KS, Sim X, Tan CS, Kiemeney LA, Verkooijen HM. Gender differences in the trend of colorectal cancer incidence in Singapore, 1968-2002. International Journal of Colorectal Disease. 2008;23:461–467. doi: 10.1007/s00384-007-0421-9.
  12. Du Z, Gao G, Adedokun B, Ahearn T, Lunetta KL, Zirpoli G, Troester MA, Ruiz-Narváez EA, Haddad SA, PalChoudhury P, Figueroa J, John EM, Bernstein L, Zheng W, Hu JJ, Ziegler RG, Nyante S, Bandera EV, Ingles SA, Mancuso N, Press MF, Deming SL, Rodriguez-Gil JL, Yao S, Ogundiran TO, Ojengbe O, Bolla MK, Dennis J, Dunning AM, Easton DF, Michailidou K, Pharoah PDP, Sandler DP, Taylor JA, Wang Q, Weinberg CR, Kitahara CM, Blot W, Nathanson KL, Hennis A, Nemesure B, Ambs S, Sucheston-Campbell LE, Bensen JT, Chanock SJ, Olshan AF, Ambrosone CB, Olopade OI, Yarney J, Awuah B, Wiafe-Addai B, Conti DV, GBHS Study Team, Palmer JR, Garcia-Closas M, Huo D, Haiman CA. Evaluating polygenic risk scores for breast cancer in women of African ancestry. Journal of the National Cancer Institute. 2021;113:1168–1176. doi: 10.1093/jnci/djab050.
  13. Emmanuel S. Quality assurance in medicine: research and evaluation activities towards quality control in Singapore. Annals of the Academy of Medicine, Singapore. 1993;22:129–133.
  14. Engeland A, Andersen A, Haldorsen T, Tretli S. Smoking habits and risk of cancers other than lung cancer: 28 years’ follow-up of 26,000 Norwegian men and women. Cancer Causes & Control. 1996;7:497–506. doi: 10.1007/BF00051881.
  15. Fritsche LG, Patil S, Beesley LJ, VandeHaar P, Salvatore M, Ma Y, Peng RB, Taliun D, Zhou X, Mukherjee B. Cancer PRSweb: an online repository with polygenic risk scores for major cancer traits and their evaluation in two independent biobanks. American Journal of Human Genetics. 2020;107:815–836. doi: 10.1016/j.ajhg.2020.08.025.
  16. Fritsche LG, Ma Y, Zhang D, Salvatore M, Lee S, Zhou X, Mukherjee B. On cross-ancestry cancer polygenic risk scores. PLOS Genetics. 2021;17:e1009670. doi: 10.1371/journal.pgen.1009670.
  17. Fung JWM, Lim SBL, Zheng H, Ho WYT, Lee BG, Chow KY, Lee HP. Data quality at the Singapore cancer registry: an overview of comparability, completeness, validity and timeliness. Cancer Epidemiology. 2016;43:76–86. doi: 10.1016/j.canep.2016.06.006.
  18. Gafni A, Dite GS, Spaeth Tuff E, Allman R, Hopper JL. Ability of known colorectal cancer susceptibility SNPs to predict colorectal cancer risk: a cohort study within the UK Biobank. PLOS ONE. 2021;16:e0251469. doi: 10.1371/journal.pone.0251469.
  19. Giovannucci E, Liu Y, Platz EA, Stampfer MJ, Willett WC. Risk factors for prostate cancer incidence and progression in the health professionals follow-up study. International Journal of Cancer. 2007;121:1571–1578. doi: 10.1002/ijc.22788.
  20. Hankin JH, Stram DO, Arakawa K, Park S, Low SH, Lee HP, Yu MC. Singapore Chinese Health study: development, validation, and calibration of the quantitative food frequency questionnaire. Nutrition and Cancer. 2001;39:187–195. doi: 10.1207/S15327914nc392_5.
  21. Ho W-K, Tan M-M, Mavaddat N, Tai M-C, Mariapun S, Li J, Ho P-J, Dennis J, Tyrer JP, Bolla MK, Michailidou K, Wang Q, Kang D, Choi J-Y, Jamaris S, Shu X-O, Yoon S-Y, Park SK, Kim S-W, Shen C-Y, Yu J-C, Tan EY, Chan PMY, Muir K, Lophatananon A, Wu AH, Stram DO, Matsuo K, Ito H, Chan CW, Ngeow J, Yong WS, Lim SH, Lim GH, Kwong A, Chan TL, Tan SM, Seah J, John EM, Kurian AW, Koh W-P, Khor CC, Iwasaki M, Yamaji T, Tan KMV, Tan KTB, Spinelli JJ, Aronson KJ, Hasan SN, Rahmat K, Vijayananthan A, Sim X, Pharoah PDP, Zheng W, Dunning AM, Simard J, van Dam RM, Yip C-H, Taib NAM, Hartman M, Easton DF, Teo S-H, Antoniou AC. European polygenic risk score for prediction of breast cancer shows similar performance in Asian women. Nature Communications. 2020;11:3833. doi: 10.1038/s41467-020-17680-w.
  22. Ho PJ, Ho WK, Khng AJ, Yeoh YS, Tan BK-T, Tan EY, Lim GH, Tan S-M, Tan VKM, Yip C-H, Mohd-Taib N-A, Wong FY, Lim EH, Ngeow J, Chay WY, Leong LCH, Yong WS, Seah CM, Tang SW, Ng CWQ, Yan Z, Lee JA, Rahmat K, Islam T, Hassan T, Tai M-C, Khor CC, Yuan J-M, Koh W-P, Sim X, Dunning AM, Bolla MK, Antoniou AC, Teo S-H, Li J, Hartman M. Overlap of high-risk individuals predicted by family history, and genetic and non-genetic breast cancer risk prediction models: implications for risk stratification. BMC Medicine. 2022;20:150. doi: 10.1186/s12916-022-02334-z.
  23. Hung RJ, Warkentin MT, Brhane Y, Chatterjee N, Christiani DC, Landi MT, Caporaso NE, Liu G, Johansson M, Albanes D, Marchand LL, Tardon A, Rennert G, Bojesen SE, Chen C, Field JK, Kiemeney LA, Lazarus P, Zienolddiny S, Lam S, Andrew AS, Arnold SM, Aldrich MC, Bickeböller H, Risch A, Schabath MB, McKay JD, Brennan P, Amos CI. Assessing lung cancer absolute risk trajectory based on a polygenic risk model. Cancer Research. 2021;81:1607–1615. doi: 10.1158/0008-5472.CAN-20-1237.
  24. Islami F, Moreira DM, Boffetta P, Freedland SJ. A systematic review and meta-analysis of tobacco use and prostate cancer mortality and incidence in prospective cohort studies. European Urology. 2014;66:1054–1064. doi: 10.1016/j.eururo.2014.08.059.
  25. Jacob L, Freyn M, Kalder M, Dinas K, Kostev K. Impact of tobacco smoking on the risk of developing 25 different cancers in the UK: a retrospective study of 422,010 patients followed for up to 30 years. Oncotarget. 2018;9:17420–17429. doi: 10.18632/oncotarget.24724. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Jia G, Lu Y, Wen W, Long J, Liu Y, Tao R, Li B, Denny JC, Shu XO, Zheng W. Evaluating the utility of polygenic risk scores in identifying high-risk individuals for eight common cancers. JNCI Cancer Spectrum. 2020;4:pkaa021. doi: 10.1093/jncics/pkaa021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Kachuri L, Graff RE, Smith-Byrne K, Meyers TJ, Rashkin SR, Ziv E, Witte JS, Johansson M. Pan-Cancer analysis demonstrates that integrating polygenic risk scores with modifiable risk factors improves risk prediction. Nature Communications. 2020;11:6084. doi: 10.1038/s41467-020-19600-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Kamarudin AN, Cox T, Kolamunnage-Dona R. Time-Dependent ROC curve analysis in medical research: current methods and applications. BMC Medical Research Methodology. 2017;17:53. doi: 10.1186/s12874-017-0332-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Lacaze P, Bakshi A, Riaz M, Orchard SG, Tiller J, Neumann JT, Carr PR, Joshi AD, Cao Y, Warner ET, Manning A, Nguyen-Dumont T, Southey MC, Milne RL, Ford L, Sebra R, Schadt E, Gately L, Gibbs P, Thompson BA, Macrae FA, James P, Winship I, McLean C, Zalcberg JR, Woods RL, Chan AT, Murray AM, McNeil JJ. Genomic risk prediction for breast cancer in older women. Cancers. 2021;13:3533. doi: 10.3390/cancers13143533. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Lambert SA, Abraham G, Inouye M. Towards clinical utility of polygenic risk scores. Human Molecular Genetics. 2019;28:R133–R142. doi: 10.1093/hmg/ddz187. [DOI] [PubMed] [Google Scholar]
  31. Lambert SA, Gil L, Jupp S, Ritchie SC, Xu Y, Buniello A, McMahon A, Abraham G, Chapman M, Parkinson H, Danesh J, MacArthur JAL, Inouye M. The Polygenic Score Catalog as an open database for reproducibility and systematic evaluation. Nature Genetics. 2021;53:420–425. doi: 10.1038/s41588-021-00783-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Larsson SC, Carter P, Kar S, Vithayathil M, Mason AM, Michaëlsson K, Burgess S. Smoking, alcohol consumption, and cancer: a mendelian randomisation study in UK biobank and international genetic consortia participants. PLOS Medicine. 2020;17:e1003178. doi: 10.1371/journal.pmed.1003178. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Lewis CM, Vassos E. Polygenic risk scores: from research tools to clinical instruments. Genome Medicine. 2020;12:44. doi: 10.1186/s13073-020-00742-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Lewis ACF, Green RC. Polygenic risk scores in the clinic: new perspectives needed on familiar ethical issues. Genome Medicine. 2021;13:14. doi: 10.1186/s13073-021-00829-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Lichtenstein P, Holm NV, Verkasalo PK, Iliadou A, Kaprio J, Koskenvuo M, Pukkala E, Skytthe A, Hemminki K. Environmental and heritable factors in the causation of cancer — analyses of cohorts of twins from Sweden, Denmark, and Finland. New England Journal of Medicine. 2000;343:78–85. doi: 10.1056/NEJM200007133430201. [DOI] [PubMed] [Google Scholar]
  36. Liu J, Ho PJ, Tan THL, Yeoh YS, Chew YJ, Mohamed Riza NK, Khng AJ, Goh S-A, Wang Y, Oh HB, Chin CH, Kwek SC, Zhang ZP, Ong DLS, Quek ST, Tan CC, Wee HL, Li J, Iau PTC, Hartman M. Breast screening tailored for her (breathe) -A study protocol on personalised risk-based breast cancer screening programme. PLOS ONE. 2022;17:e0265965. doi: 10.1371/journal.pone.0265965. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Lund Nilsen TI, Johnsen R, Vatten LJ. Socio-Economic and lifestyle factors associated with the risk of prostate cancer. British Journal of Cancer. 2000;82:1358–1363. doi: 10.1054/bjoc.1999.1105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Marchini J, Howie B, Myers S, McVean G, Donnelly P. A new multipoint method for genome-wide association studies by imputation of genotypes. Nature Genetics. 2007;39:906–913. doi: 10.1038/ng2088. [DOI] [PubMed] [Google Scholar]
  39. Martin AR, Kanai M, Kamatani Y, Okada Y, Neale BM, Daly MJ. Clinical use of current polygenic risk scores may exacerbate health disparities. Nature Genetics. 2019;51:584–591. doi: 10.1038/s41588-019-0379-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Mavaddat N, Pharoah PDP, Michailidou K, Tyrer J, Brook MN, Bolla MK, Wang Q, Dennis J, Dunning AM, Shah M, Luben R, Brown J, Bojesen SE, Nordestgaard BG, Nielsen SF, Flyger H, Czene K, Darabi H, Eriksson M, Peto J, Dos-Santos-Silva I, Dudbridge F, Johnson N, Schmidt MK, Broeks A, Verhoef S, Rutgers EJ, Swerdlow A, Ashworth A, Orr N, Schoemaker MJ, Figueroa J, Chanock SJ, Brinton L, Lissowska J, Couch FJ, Olson JE, Vachon C, Pankratz VS, Lambrechts D, Wildiers H, Van Ongeval C, van Limbergen E, Kristensen V, Grenaker Alnæs G, Nord S, Borresen-Dale A-L, Nevanlinna H, Muranen TA, Aittomäki K, Blomqvist C, Chang-Claude J, Rudolph A, Seibold P, Flesch-Janys D, Fasching PA, Haeberle L, Ekici AB, Beckmann MW, Burwinkel B, Marme F, Schneeweiss A, Sohn C, Trentham-Dietz A, Newcomb P, Titus L, Egan KM, Hunter DJ, Lindstrom S, Tamimi RM, Kraft P, Rahman N, Turnbull C, Renwick A, Seal S, Li J, Liu J, Humphreys K, Benitez J, Pilar Zamora M, Arias Perez JI, Menéndez P, Jakubowska A, Lubinski J, Jaworska-Bieniek K, Durda K, Bogdanova NV, Antonenkova NN, Dörk T, Anton-Culver H, Neuhausen SL, Ziogas A, Bernstein L, Devilee P, Tollenaar RAEM, Seynaeve C, van Asperen CJ, Cox A, Cross SS, Reed MWR, Khusnutdinova E, Bermisheva M, Prokofyeva D, Takhirova Z, Meindl A, Schmutzler RK, Sutter C, Yang R, Schürmann P, Bremer M, Christiansen H, Park-Simon T-W, Hillemanns P, Guénel P, Truong T, Menegaux F, Sanchez M, Radice P, Peterlongo P, Manoukian S, Pensotti V, Hopper JL, Tsimiklis H, Apicella C, Southey MC, Brauch H, Brüning T, Ko Y-D, Sigurdson AJ, Doody MM, Hamann U, Torres D, Ulmer H-U, Försti A, Sawyer EJ, Tomlinson I, Kerin MJ, Miller N, Andrulis IL, Knight JA, Glendon G, Marie Mulligan A, Chenevix-Trench G, Balleine R, Giles GG, Milne RL, McLean C, Lindblom A, Margolin S, Haiman CA, Henderson BE, Schumacher F, Le Marchand L, Eilber U, Wang-Gohrke S, Hooning MJ, Hollestelle A, van den Ouweland AMW, Koppert LB, Carpenter J, Clarke C, Scott R, Mannermaa A, Kataja V, Kosma V-M, 
Hartikainen JM, Brenner H, Arndt V, Stegmaier C, Karina Dieffenbach A, Winqvist R, Pylkäs K, Jukkola-Vuorinen A, Grip M, Offit K, Vijai J, Robson M, Rau-Murthy R, Dwek M, Swann R, Annie Perkins K, Goldberg MS, Labrèche F, Dumont M, Eccles DM, Tapper WJ, Rafiq S, John EM, Whittemore AS, Slager S, Yannoukakos D, Toland AE, Yao S, Zheng W, Halverson SL, González-Neira A, Pita G, Rosario Alonso M, Álvarez N, Herrero D, Tessier DC, Vincent D, Bacot F, Luccarini C, Baynes C, Ahmed S, Maranian M, Healey CS, Simard J, Hall P, Easton DF, Garcia-Closas M. Prediction of breast cancer risk based on profiling with common genetic variants. Journal of the National Cancer Institute. 2015;107:djv036. doi: 10.1093/jnci/djv036. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Mavaddat N, Michailidou K, Dennis J, Lush M, Fachal L, Lee A, Tyrer JP, Chen T-H, Wang Q, Bolla MK, Yang X, Adank MA, Ahearn T, Aittomäki K, Allen J, Andrulis IL, Anton-Culver H, Antonenkova NN, Arndt V, Aronson KJ, Auer PL, Auvinen P, Barrdahl M, Beane Freeman LE, Beckmann MW, Behrens S, Benitez J, Bermisheva M, Bernstein L, Blomqvist C, Bogdanova NV, Bojesen SE, Bonanni B, Børresen-Dale A-L, Brauch H, Bremer M, Brenner H, Brentnall A, Brock IW, Brooks-Wilson A, Brucker SY, Brüning T, Burwinkel B, Campa D, Carter BD, Castelao JE, Chanock SJ, Chlebowski R, Christiansen H, Clarke CL, Collée JM, Cordina-Duverger E, Cornelissen S, Couch FJ, Cox A, Cross SS, Czene K, Daly MB, Devilee P, Dörk T, Dos-Santos-Silva I, Dumont M, Durcan L, Dwek M, Eccles DM, Ekici AB, Eliassen AH, Ellberg C, Engel C, Eriksson M, Evans DG, Fasching PA, Figueroa J, Fletcher O, Flyger H, Försti A, Fritschi L, Gabrielson M, Gago-Dominguez M, Gapstur SM, García-Sáenz JA, Gaudet MM, Georgoulias V, Giles GG, Gilyazova IR, Glendon G, Goldberg MS, Goldgar DE, González-Neira A, Grenaker Alnæs GI, Grip M, Gronwald J, Grundy A, Guénel P, Haeberle L, Hahnen E, Haiman CA, Håkansson N, Hamann U, Hankinson SE, Harkness EF, Hart SN, He W, Hein A, Heyworth J, Hillemanns P, Hollestelle A, Hooning MJ, Hoover RN, Hopper JL, Howell A, Huang G, Humphreys K, Hunter DJ, Jakimovska M, Jakubowska A, Janni W, John EM, Johnson N, Jones ME, Jukkola-Vuorinen A, Jung A, Kaaks R, Kaczmarek K, Kataja V, Keeman R, Kerin MJ, Khusnutdinova E, Kiiski JI, Knight JA, Ko Y-D, Kosma V-M, Koutros S, Kristensen VN, Krüger U, Kühl T, Lambrechts D, Le Marchand L, Lee E, Lejbkowicz F, Lilyquist J, Lindblom A, Lindström S, Lissowska J, Lo W-Y, Loibl S, Long J, Lubiński J, Lux MP, MacInnis RJ, Maishman T, Makalic E, Maleva Kostovska I, Mannermaa A, Manoukian S, Margolin S, Martens JWM, Martinez ME, Mavroudis D, McLean C, Meindl A, Menon U, Middha P, Miller N, Moreno F, Mulligan AM, Mulot C, Muñoz-Garzon VM, Neuhausen SL, Nevanlinna 
H, Neven P, Newman WG, Nielsen SF, Nordestgaard BG, Norman A, Offit K, Olson JE, Olsson H, Orr N, Pankratz VS, Park-Simon T-W, Perez JIA, Pérez-Barrios C, Peterlongo P, Peto J, Pinchev M, Plaseska-Karanfilska D, Polley EC, Prentice R, Presneau N, Prokofyeva D, Purrington K, Pylkäs K, Rack B, Radice P, Rau-Murthy R, Rennert G, Rennert HS, Rhenius V, Robson M, Romero A, Ruddy KJ, Ruebner M, Saloustros E, Sandler DP, Sawyer EJ, Schmidt DF, Schmutzler RK, Schneeweiss A, Schoemaker MJ, Schumacher F, Schürmann P, Schwentner L, Scott C, Scott RJ, Seynaeve C, Shah M, Sherman ME, Shrubsole MJ, Shu X-O, Slager S, Smeets A, Sohn C, Soucy P, Southey MC, Spinelli JJ, Stegmaier C, Stone J, Swerdlow AJ, Tamimi RM, Tapper WJ, Taylor JA, Terry MB, Thöne K, Tollenaar RAEM, Tomlinson I, Truong T, Tzardi M, Ulmer H-U, Untch M, Vachon CM, van Veen EM, Vijai J, Weinberg CR, Wendt C, Whittemore AS, Wildiers H, Willett W, Winqvist R, Wolk A, Yang XR, Yannoukakos D, Zhang Y, Zheng W, Ziogas A, ABCTB Investigators. kConFab/AOCS Investigators. NBCS Collaborators. Dunning AM, Thompson DJ, Chenevix-Trench G, Chang-Claude J, Schmidt MK, Hall P, Milne RL, Pharoah PDP, Antoniou AC, Chatterjee N, Kraft P, García-Closas M, Simard J, Easton DF. Polygenic risk scores for prediction of breast cancer and breast cancer subtypes. American Journal of Human Genetics. 2019;104:21–34. doi: 10.1016/j.ajhg.2018.11.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Mostafavi H, Harpak A, Agarwal I, Conley D, Pritchard JK, Przeworski M. Variable prediction accuracy of polygenic scores within an ancestry group. eLife. 2020;9:e48376. doi: 10.7554/eLife.48376. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. National Registry of Diseases Office. Singapore Cancer Registry 50th Anniversary Monograph (1968–2017). 1968. [January 2, 2023]. https://www.nrdo.gov.sg/publications/cancer
  44. National Registry of Diseases Office. Singapore Cancer Registry Annual Report 2018. 2021. [January 2, 2023]. https://www.nrdo.gov.sg/docs/librariesprovider3/default-document-library/scr-annual-report-2018.pdf
  45. Need AC, Goldstein DB. Next generation disparities in human genomics: concerns and remedies. Trends in Genetics. 2009;25:489–494. doi: 10.1016/j.tig.2009.09.012. [DOI] [PubMed] [Google Scholar]
  46. Ordóñez-Mena JM, Schöttker B, Mons U, Jenab M, Freisling H, Bueno-de-Mesquita B, O’Doherty MG, Scott A, Kee F, Stricker BH, Hofman A, de Keyser CE, Ruiter R, Söderberg S, Jousilahti P, Kuulasmaa K, Freedman ND, Wilsgaard T, de Groot LC, Kampman E, Håkansson N, Orsini N, Wolk A, Nilsson LM, Tjønneland A, Pająk A, Malyutina S, Kubínová R, Tamosiunas A, Bobak M, Katsoulis M, Orfanos P, Boffetta P, Trichopoulou A, Brenner H, Consortium on Health and Ageing: Network of Cohorts in Europe and the United States Quantification of the smoking-associated cancer risk with rate advancement periods: meta-analysis of individual participant data from cohorts of the chances consortium. BMC Medicine. 2016;14:62. doi: 10.1186/s12916-016-0607-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Polygenic Risk Score Task Force of the International Common Disease Alliance Responsible use of polygenic risk scores in the clinic: potential benefits, risks and gaps. Nature Medicine. 2021;27:1876–1884. doi: 10.1038/s41591-021-01549-6. [DOI] [PubMed] [Google Scholar]
  48. Popejoy AB, Fullerton SM. Genomics is failing on diversity. Nature. 2016;538:161–164. doi: 10.1038/538161a. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Roberts MC. Implementation challenges for risk-stratified screening in the era of precision medicine. JAMA Oncology. 2018;4:1484–1485. doi: 10.1001/jamaoncol.2018.1940. [DOI] [PubMed] [Google Scholar]
  50. Rohrmann S, Linseisen J, Allen N, Bueno-de-Mesquita HB, Johnsen NF, Tjønneland A, Overvad K, Kaaks R, Teucher B, Boeing H, Pischon T, Lagiou P, Trichopoulou A, Trichopoulos D, Palli D, Krogh V, Tumino R, Ricceri F, Argüelles Suárez MV, Agudo A, Sánchez M-J, Chirlaque M-D, Barricarte A, Larrañaga N, Boshuizen H, van Kranen HJ, Stattin P, Johansson M, Bjartell A, Ulmert D, Khaw K-T, Wareham NJ, Ferrari P, Romieux I, Gunter MJR, Riboli E, Key TJ. Smoking and the risk of prostate cancer in the European prospective investigation into cancer and nutrition. British Journal of Cancer. 2013;108:708–714. doi: 10.1038/bjc.2012.520. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Schmit SL, Edlund CK, Schumacher FR, Gong J, Harrison TA, Huyghe JR, Qu C, Melas M, Van Den Berg DJ, Wang H, Tring S, Plummer SJ, Albanes D, Alonso MH, Amos CI, Anton K, Aragaki AK, Arndt V, Barry EL, Berndt SI, Bezieau S, Bien S, Bloomer A, Boehm J, Boutron-Ruault M-C, Brenner H, Brezina S, Buchanan DD, Butterbach K, Caan BJ, Campbell PT, Carlson CS, Castelao JE, Chan AT, Chang-Claude J, Chanock SJ, Cheng I, Cheng Y-W, Chin LS, Church JM, Church T, Coetzee GA, Cotterchio M, Cruz Correa M, Curtis KR, Duggan D, Easton DF, English D, Feskens EJM, Fischer R, FitzGerald LM, Fortini BK, Fritsche LG, Fuchs CS, Gago-Dominguez M, Gala M, Gallinger SJ, Gauderman WJ, Giles GG, Giovannucci EL, Gogarten SM, Gonzalez-Villalpando C, Gonzalez-Villalpando EM, Grady WM, Greenson JK, Gsur A, Gunter M, Haiman CA, Hampe J, Harlid S, Harju JF, Hayes RB, Hofer P, Hoffmeister M, Hopper JL, Huang S-C, Huerta JM, Hudson TJ, Hunter DJ, Idos GE, Iwasaki M, Jackson RD, Jacobs EJ, Jee SH, Jenkins MA, Jia W-H, Jiao S, Joshi AD, Kolonel LN, Kono S, Kooperberg C, Krogh V, Kuehn T, Küry S, LaCroix A, Laurie CA, Lejbkowicz F, Lemire M, Lenz H-J, Levine D, Li CI, Li L, Lieb W, Lin Y, Lindor NM, Liu Y-R, Loupakis F, Lu Y, Luh F, Ma J, Mancao C, Manion FJ, Markowitz SD, Martin V, Matsuda K, Matsuo K, McDonnell KJ, McNeil CE, Milne R, Molina AJ, Mukherjee B, Murphy N, Newcomb PA, Offit K, Omichessan H, Palli D, Cotoré JPP, Pérez-Mayoral J, Pharoah PD, Potter JD, Qu C, Raskin L, Rennert G, Rennert HS, Riggs BM, Schafmayer C, Schoen RE, Sellers TA, Seminara D, Severi G, Shi W, Shibata D, Shu X-O, Siegel EM, Slattery ML, Southey M, Stadler ZK, Stern MC, Stintzing S, Taverna D, Thibodeau SN, Thomas DC, Trichopoulou A, Tsugane S, Ulrich CM, van Duijnhoven FJB, van Guelpan B, Vijai J, Virtamo J, Weinstein SJ, White E, Win AK, Wolk A, Woods M, Wu AH, Wu K, Xiang Y-B, Yen Y, Zanke BW, Zeng Y-X, Zhang B, Zubair N, Kweon S-S, Figueiredo JC, Zheng W, Marchand LL, Lindblom A, Moreno V, Peters U, Casey G, Hsu 
L, Conti DV, Gruber SB. Novel common genetic susceptibility loci for colorectal cancer. Journal of the National Cancer Institute. 2019;111:146–157. doi: 10.1093/jnci/djy099. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Shigematsu H, Lin L, Takahashi T, Nomura M, Suzuki M, Wistuba II, Fong KM, Lee H, Toyooka S, Shimizu N, Fujisawa T, Feng Z, Roth JA, Herz J, Minna JD, Gazdar AF. Clinical and biological features associated with epidermal growth factor receptor gene mutations in lung cancers. Journal of the National Cancer Institute. 2005;97:339–346. doi: 10.1093/jnci/dji055. [DOI] [PubMed] [Google Scholar]
  53. Sim X, Ali RA, Wedren S, Goh DL-M, Tan C-S, Reilly M, Hall P, Chia K-S. Ethnic differences in the time trend of female breast cancer incidence: Singapore, 1968-2002. BMC Cancer. 2006;6:261. doi: 10.1186/1471-2407-6-261. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Šimundić AM. Measures of diagnostic accuracy: basic definitions. EJIFCC. 2009;19:203–211. [PMC free article] [PubMed] [Google Scholar]
  55. Singapore Statistics Age-Specific Death Rates, Annual. 2023. [January 2, 2023]. https://www.tablebuilder.singstat.gov.sg/publicfacing/viewMultiTable.action
  56. Slunecka JL, van der Zee MD, Beck JJ, Johnson BN, Finnicum CT, Pool R, Hottenga J-J, de Geus EJC, Ehli EA. Implementation and implications for polygenic risk scores in healthcare. Human Genomics. 2021;15:46. doi: 10.1186/s40246-021-00339-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Sun J, Wang Y, Folkersen L, Borné Y, Amlien I, Buil A, Orho-Melander M, Børglum AD, Hougaard DM, Regeneron Genetics Center. Melander O, Engström G, Werge T, Lage K. Translating polygenic risk scores for clinical use by estimating the confidence bounds of risk prediction. Nature Communications. 2021;12:5276. doi: 10.1038/s41467-021-25014-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Van Calster B, Vickers AJ. Calibration of risk prediction models: impact on decision-analytic performance. Medical Decision Making. 2015;35:162–169. doi: 10.1177/0272989X14547233. [DOI] [PubMed] [Google Scholar]
  59. Van Calster B, McLernon DJ, van Smeden M, Wynants L, Steyerberg EW, Topic group “evaluating diagnostic tests and prediction models” of the STRATOS initiative Calibration: the achilles heel of predictive analytics. BMC Medicine. 2019;17:230. doi: 10.1186/s12916-019-1466-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Wand H, Lambert SA, Tamburro C, Iacocca MA, O’Sullivan JW, Sillari C, Kullo IJ, Rowley R, Dron JS, Brockman D, Venner E, McCarthy MI, Antoniou AC, Easton DF, Hegele RA, Khera AV, Chatterjee N, Kooperberg C, Edwards K, Vlessis K, Kinnear K, Danesh JN, Parkinson H, Ramos EM, Roberts MC, Ormond KE, Khoury MJ, Janssens ACJW, Goddard KAB, Kraft P, MacArthur JAL, Inouye M, Wojcik GL. Improving reporting standards for polygenic scores in risk prediction studies. Nature. 2021;591:211–219. doi: 10.1038/s41586-021-03243-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Watters JL, Park Y, Hollenbeck A, Schatzkin A, Albanes D. Cigarette smoking and prostate cancer in a prospective US cohort study. Cancer Epidemiology, Biomarkers & Prevention. 2009;18:2427–2435. doi: 10.1158/1055-9965.EPI-09-0252. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Wei J, Shi Z, Na R, Resurreccion WK, Wang CH, Duggan D, Zheng SL, Hulick PJ, Helfand BT, Xu J. Calibration of polygenic risk scores is required prior to clinical implementation: results of three common cancers in UKB. Journal of Medical Genetics. 2022;59:243–247. doi: 10.1136/jmedgenet-2020-107286. [DOI] [PubMed] [Google Scholar]
  63. Zhang X, Rice M, Tworoger SS, Rosner BA, Eliassen AH, Tamimi RM, Joshi AD, Lindstrom S, Qian J, Colditz GA, Willett WC, Kraft P, Hankinson SE. Addition of a polygenic risk score, mammographic density, and endogenous hormones to existing breast cancer risk prediction models: a nested case-control study. PLOS Medicine. 2018;15:e1002644. doi: 10.1371/journal.pmed.1002644. [DOI] [PMC free article] [PubMed] [Google Scholar]

Editor's evaluation

Qifeng Yang 1

Utilizing one of the largest prospective Asian cohorts with long-term follow-up data, the important study reveals the utility of polygenic risk scores in identifying high-risk individuals from specific cancers. The convincing results reached by the high-quality data highlight the translational significance of this study, which may be important for future exploration in the field of cancer epidemiology.

Decision letter

Editor: Qifeng Yang1

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Polygenic risk scores for the prediction of common cancers in East Asians: A population-based prospective cohort study" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Caigang Liu as the Senior Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed their reviews with one another, and revisions are necessary for the manuscript. For your guidance, reviewers' comments are appended below. If you decide to revise your manuscript, please revise your work guided by the reviewers' suggestions, and provide a point-by-point response to the following suggestions and concerns.

Reviewer #1 (Recommendations for the authors):

This is an important paper that has the potential to contribute to our understanding of how to take forward polygenic risk scores developed primarily based on individuals of European descent on the large Asian population.

1. However, by focusing on a combined analysis of 4 cancers, the main findings appear to have been lost.

a. The authors note in the discussion that the PRSes developed in individuals of European descent are not well calibrated for the Asian population. One would not expect PRS with poor AUC to be well calibrated, hence poor calibration for lung cancer-PRS may be expected in this study. For other cancer-PRSs, the confidence intervals of the calibration slopes indicate that they are not significantly different from one, hence there is no strong suggestion of poor calibration overall, except maybe for prostate PRS. Would it be possible that observed prostate cancer incidence is underestimated due to survival bias?

b. For each of the 4 cancers, it would be helpful to report features associated with translatability of PRS from Europeans to Asians. The authors report that "no significant association (P>0.05) was found between number of variants included in the various PRS evaluated for each cancer and discriminatory ability" – an exploratory analysis of other features associated with performance would be helpful.

c. In Table 2, the authors appear to have grouped individuals into quintiles based on a combined score. It is not clear what is the intent for this analysis – is the aspiration to roll out a risk stratified screening strategy based on "high risk for any of 4 cancers" and then to screen for all 4 cancers? The corresponding text in Line 315-330 needs clarification, it is not clear what is the intent for repeating the analyses with different PRS quintiles as reference group.

d. In Table 3, the authors reported the association of PRS together with other demographic/risk factors. The reason for such exhaustive adjustment in the model has not been made clear. If the intent is to quantify how much the PRS association is attenuated after accounting for potential confounders, then the unadjusted PRS association would need to be presented and the point made clear in the text.

e. In Table 4, the authors intend to quantify the proportion of cases captured by the at-risk groups as defined by PRSs. However, what is the rationale for the HR cutoffs? Why not use the absolute risks in Figure 1(c) instead? What is the intent of including other risk factors? A higher proportion of cases captured would be expected with additional risk factors; however, might it be an overestimate here, given that the RRs of other risk factors were estimated from the same cohort?

2. Have the authors sufficiently addressed survival bias? The cohort of 68k individuals was established in 1993-1998, but blood samples from 28k individuals were only collected in 1999-2004, and of these, 21,694 were analysable. Was the Singapore Cancer Registry data systematically collected from 1993?

Reviewer #2 (Recommendations for the authors):

1. My primary concern is lack of clarity in why not all cancer PRS were carried forward from Supplemental Table 1 to evaluation in Supplemental Table 2. The PRS from the largest prostate cancer GWAS to date (Conti 2021), including 12% Asian individuals, is represented in Supplemental Table 1 (PGS ID: PGS000662) but is not carried forward for evaluation in Supplemental Table 2. Similarly, a lung cancer PRS developed specifically for a Chinese population (Dai 2020) is also represented in Supplemental Table 1 (PGS ID: PGS000070) but not carried forward for evaluation. I have not performed similar inspection for the breast and colorectal cancer PRS, but this raises the concern about incomplete evaluation of available (and even best) PRS.

2. It is not clear that AUC should be used to identify the "best" PRS for each cancer in this population, particularly since from what I can tell the authors did not use a time-dependent AUC method and excluded all baseline cases from analyses. A time-to-event metric would seem to be a more appropriate way to identify the "best" PRS in these analyses, either using an AUC method for incident event data or using risk association.

3. Although the term "predictive ability" is used in the landmark Wand 2021 paper (ref 31), I recommend using more specific terms in this manuscript, starting with the abstract (line 66). I think the correct terms would be "relative risk association" and "absolute risk association," as appropriate. Still, Ref 31 (Wand 2021) is helpful and should be introduced in the Introduction to orient the reader to the reporting framework the authors use in presenting this work.

4. Can the authors justify why they performed only sex-stratified analyses for lung and colorectal cancer? The GWAS for these conditions included both sexes. Given the relatively small number of cases, it would be worth performing all analyses in the cohort overall. Sex-stratified analyses could be presented in the supplemental materials, if desired.

5. Lines 182-184: How is it known that only <1% of participants migrated out of Singapore? Are other data available to confirm their ongoing residence in Singapore and absence of cancer diagnosis? E.g., other medical diagnoses or other data?

6. It is reassuring to see cigarette smoking as a strong lung cancer risk factor, but the apparent protective effect of smoking on prostate cancer contradicts current evidence. The authors should discuss this in the Discussion.

7. Lines 274-292 could be shortened, simply referring the reader to most of these results in Table 1.

8. Lines 368-387: This section of the results seems out of order compared with the Methods. I recommend reorganizing the Results as follows: "Characteristics of the study population," "Lack of Asian representation in PRS development," "PRS discriminatory ability," "PRS distribution," "Associations between PRS and relative hazard of developing cancers," "Number of cancers that developed within PRS at-risk groups," "Association of PRS with absolute risk," "PRS calibration."

9. Calibration should be formally tested (e.g., with a Hosmer-Lemeshow or more sophisticated test) in addition to the visual inspection presented.

10. Lines 470-476 of the Discussion stray a little from the focused thesis of this work.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Polygenic risk scores for the prediction of common cancers in East Asians: A population-based prospective cohort study" for further consideration by eLife. Your revised article has been evaluated by Caigang Liu (Senior Editor) and a Reviewing Editor.

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below:

1. Your response to the initial critique #1 is valid. The response itself contains errors (e.g. incorrect PGS numbers given for specific disease PRS), but the supplemental tables appear correct. The caption for Figure 1 is missing, which should explain which column corresponds to which disease and in which sex.

2. Please add the methods of your new time-to-event AUC sensitivity analysis to the main Methods of the manuscript, including a reference to the R package used. Also please state explicitly in the Methods your approach to choosing the best PRS for each disease (or disease-sex combination): logistic regression AUC.

3. Please explicitly state this rationale for sex-stratified analyses for these cancers in the Methods.

eLife. 2023 Mar 27;12:e82608. doi: 10.7554/eLife.82608.sa2

Author response


Reviewer #1 (Recommendations for the authors):

This is an important paper that has the potential to contribute to our understanding of how to take forward polygenic risk scores developed primarily based on individuals of European descent on the large Asian population.

1. However, by focusing on a combined analysis of 4 cancers, the main findings appear to have been lost.

a. The authors note in the discussion that the PRSes developed in individuals of European descent are not well calibrated for the Asian population. One would not expect PRS with poor AUC to be well calibrated, hence poor calibration for lung cancer-PRS may be expected in this study. For other cancer-PRSs, the confidence intervals of the calibration slopes indicate that they are not significantly different from one, hence there is no strong suggestion of poor calibration overall, except maybe for prostate PRS. Would it be possible that observed prostate cancer incidence is underestimated due to survival bias?

We have added the point that the reviewer highlighted about confidence intervals associated with the calibration slopes in the Results:

In Results (section “PRS calibration”): “In general, predicted risks for the higher PRS categories did not correspond well to the observed proportions for female breast, male prostate, and female lung cancers (Figure 1D); in particular, predicted risks were overestimated for the higher risk categories. Overestimation of risk was observed for all PRS categories for male lung cancer. In contrast, predicted risks were underestimated for both female and male colorectal cancers. Nonetheless, the confidence intervals associated with the calibration slopes for all cancers included 1, with the exception of female (0.32 [-0.31 to 0.95]) and male lung cancers (0.26 [-0.25, 0.78]).”

The prostate cancer PRS showed little evidence of a good fit.

In the Discussion paragraph 5, we have added (in bold):

“Our results show that cancer risk estimates based on PRS developed using populations of European ancestry are not optimally calibrated for our Asian study population. However, the confidence intervals associated with the calibration slopes for all cancers except for female and male lung included the value of 1, suggesting that the overall calibration is not poor. For female and male lung cancers associated with low AUC values, poor calibration is not unexpected. All PRSs except PGS000662 (prostate cancer) passed the formal Hosmer-Lemeshow goodness-of-fit test. Males in the first 5 deciles of PGS000662 did not develop prostate cancer, suggesting that a linear fit may not be appropriate. A hard threshold beginning from the 6th decile may perform better at identifying males at elevated risk of developing prostate cancer.”
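For readers unfamiliar with the Hosmer-Lemeshow goodness-of-fit test mentioned above, the following is a minimal sketch of how such a statistic can be computed from predicted risks and observed binary outcomes. It is not the authors' code: the function names `hosmer_lemeshow` and `chi2_sf_even`, the equal-sized grouping, and the toy data are assumptions made for illustration, and the p-value helper only handles even degrees of freedom (for example, 10 groups give 8 degrees of freedom).

```python
import math


def chi2_sf_even(x, dof):
    """Upper-tail chi-square probability; closed form valid for even dof only."""
    if dof <= 0 or dof % 2:
        raise ValueError("this sketch only handles positive, even degrees of freedom")
    half, term, total = x / 2.0, 1.0, 1.0
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(predicted, observed, groups=10):
    """Return (statistic, dof, p_value) for binary outcomes and predicted risks."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    pairs = sorted(zip(predicted, observed))  # order individuals by predicted risk
    n = len(pairs)
    if n < groups:
        raise ValueError("need at least one individual per group")
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected number of cases
        events = sum(y for _, y in chunk)     # observed number of cases
        mean_risk = expected / size
        # Fails with ZeroDivisionError if a group's mean risk is exactly 0 or 1.
        stat += (events - expected) ** 2 / (size * mean_risk * (1.0 - mean_risk))
    dof = groups - 2
    return stat, dof, chi2_sf_even(stat, dof)


if __name__ == "__main__":
    # Made-up data (not from the study): four risk groups of ten people each.
    risks = [0.1] * 10 + [0.2] * 10 + [0.3] * 10 + [0.4] * 10
    cases = [1] + [0] * 9 + [1] * 2 + [0] * 8 + [1] * 3 + [0] * 7 + [1] * 4 + [0] * 6
    print(hosmer_lemeshow(risks, cases, groups=4))
```

A small p-value indicates that observed and expected case counts disagree across risk groups. A risk group with no observed cases at all, as described for the lower deciles of PGS000662, would contribute a large term to the statistic whenever the predicted risk in that group is not close to zero.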

b. For each of the 4 cancers, it would be helpful to report features associated with translatability of PRS from Europeans to Asians. The authors report that "no significant association (P>0.05) was found between number of variants included in the various PRS evaluated for each cancer and discriminatory ability" – an exploratory analysis of other features associated with performance would be helpful.

We did not find evidence of association between the number of variants and the calibration (expected/observed or Hosmer-Lemeshow p-value) of the PRS in our study population. We summarise the results of the additional analysis in Author response table 1 and Author response image 1.

Author response table 1. Linear associations between features of performance (AUC, calibration [expected/observed], and Hosmer-Lemeshow p-values) and the number of variants in the polygenic risk score, by cancer type.

| Cancer site – gender | Feature | Linear association p-value | Max value of feature |
| --- | --- | --- | --- |
| Breast – Female | AUC | P=0.864 | 0.61075098 |
| Breast – Female | Calibration (E/O) | P=0.748 | 1.50854654 |
| Breast – Female | Hosmer-Lemeshow p-value | P=0.847 | 0.960221 |
| Prostate – Male | AUC | P=0.403 | 0.72849342 |
| Prostate – Male | Calibration (E/O) | P=0.567 | 4.74364593 |
| Prostate – Male | Hosmer-Lemeshow p-value | P=0.844 | 0.4708587 |
| Colorectal – Female | AUC | P=0.734 | 0.64886163 |
| Colorectal – Female | Calibration (E/O) | P=0.779 | 1.00833383 |
| Colorectal – Female | Hosmer-Lemeshow p-value | P=0.296 | 0.9056375 |
| Colorectal – Male | AUC | P=0.666 | 0.66361296 |
| Colorectal – Male | Calibration (E/O) | P=0.752 | 1.03242402 |
| Colorectal – Male | Hosmer-Lemeshow p-value | P=0.047 | 0.8180789 |
| Lung – Female | AUC | P=0.728 | 0.68602239 |
| Lung – Female | Calibration (E/O) | P=0.869 | 2.97417751 |
| Lung – Female | Hosmer-Lemeshow p-value | P=0.111 | 0.8170288 |
| Lung – Male | AUC | P=0.451 | 0.68032583 |
| Lung – Male | Calibration (E/O) | P=0.385 | 2.78560993 |
| Lung – Male | Hosmer-Lemeshow p-value | P=0.404 | 0.9451139 |

Author response image 1. Distributions of AUC, calibration (expected/ observed), and Hosmer-Lemeshow p-values for features of performance, by cancer type.
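For readers unfamiliar with the expected/observed (E/O) calibration measure reported in Author response table 1, the following is a minimal Python sketch. It assumes that E is the sum of the predicted risks and O is the number of observed cases; the authors' own calculation is in their R code (Source code 1) and may differ in detail. The function name and the toy data are hypothetical.

```python
def expected_observed_ratio(predicted_risks, outcomes):
    """Return E/O: the sum of predicted risks divided by the observed case count.

    predicted_risks: predicted probability of disease for each individual.
    outcomes: 1 if the individual developed the disease, otherwise 0.
    """
    expected = sum(predicted_risks)
    observed = sum(outcomes)
    if observed == 0:
        raise ValueError("no observed events; E/O is undefined")
    return expected / observed


# Hypothetical toy data, for illustration only (not study data).
toy_predicted = [0.02, 0.05, 0.10, 0.20, 0.30, 0.33]
toy_outcomes = [0, 0, 0, 1, 0, 1]
toy_eo = expected_observed_ratio(toy_predicted, toy_outcomes)
```

Under this definition, a ratio above 1 means that more cases were predicted than observed (overestimation of risk), and a ratio below 1 means underestimation.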


c. In Table 2, the authors appear to have grouped individuals into quintiles based on a combined score. It is not clear what is the intent for this analysis – is the aspiration to roll out a risk stratified screening strategy based on "high risk for any of 4 cancers" and then to screen for all 4 cancers? The corresponding text in Line 315-330 needs clarification, it is not clear what is the intent for repeating the analyses with different PRS quintiles as reference group.

Cancer-specific quintiles were used for each cancer-specific analysis. The premise of this analysis is to identify the best-performing PRS for each cancer for the purpose of informing screening strategies for each respective cancer. To clarify, we have made changes to the text as follows:

In Methods (section “Associations between PRS and risk of developing cancers”): “The associations between cancer-specific PRS quintiles (where individuals ranked by PRS were categorized into quintiles, using the middle quintile [40 to 60%] as a reference to reflect the average risk of the population) and the incidence of site-specific cancers were investigated using Cox proportional hazards modeling to estimate hazard ratios (HR) and corresponding 95% confidence intervals (CI), using time since recruitment as the time scale, and adjusted for age at recruitment.”
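The quintile categorisation described in this Methods text can be illustrated with a short Python sketch. It only shows how individuals ranked by PRS might be assigned to quintiles and how the middle quintile (40 to 60%) can serve as the reference level through indicator coding; it does not fit the Cox model, and it is not the authors' R code. The function names, the handling of ties (by input order), and the toy scores are assumptions made for illustration.

```python
REFERENCE_QUINTILE = 3  # middle quintile (40 to 60%), used as the reference


def assign_quintiles(scores):
    """Rank individuals by score and return a quintile label (1..5) for each."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # ties keep input order
    quintiles = [0] * n
    for rank, idx in enumerate(order):
        quintiles[idx] = rank * 5 // n + 1
    return quintiles


def indicator_code(quintiles, reference=REFERENCE_QUINTILE):
    """Return one row of 0/1 indicators per individual, omitting the reference."""
    levels = [q for q in range(1, 6) if q != reference]
    return [[1 if q == level else 0 for level in levels] for q in quintiles]


# Hypothetical standardised PRS values, for illustration only.
toy_scores = [0.3, -1.2, 0.8, 2.1, -0.5, 0.0, 1.5, -2.0, 0.9, -0.1]
toy_quintiles = assign_quintiles(toy_scores)
toy_design = indicator_code(toy_quintiles)
```

With this coding, each hazard ratio from a regression model compares one quintile with the middle quintile, which is how the estimates in Table 2 are framed.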

In Results (section “Associations between PRS and the relative hazard of developing cancers”): “Compared to the middle cancer-specific PRS quintile, individuals in the highest PRS quintile were 64% more likely to develop cancers of the breast, prostate, and colorectal (Table 2).”

Table 2 caption: “Table 2. Hazard ratios (HR) and corresponding 95% confidence intervals (CI) associated with polygenic risk score quintiles (Q) compared to the population median, using the Cox proportional hazards model and censored at 20 years after recruitment. Individuals were categorized into cancer-specific quintiles based on their cancer-specific PRS. All models were adjusted for age at recruitment.”

d. In Table 3, the authors reported the association of PRS together with other demographic/risk factors. It hasn't been made clear the reason for such exhaustive adjustment in the model. If the intent is to quantify the attenuated amount in PRS association after accounting for potential confounders, then unadjusted PRS would need to be presented and the point needs to be made clear in the text.

We have clarified in Methods (section “Associations between PRS and risk of developing cancers”): “PRS is known to have “portability” issues related to genetic ancestry and demographics (10.1038/s41588-019-0379-x, https://elifesciences.org/articles/48376). Hence, we adjusted for variables in the models, including age at recruitment, dialect group (Hokkien or Cantonese), highest level of education (no formal education, primary school, or secondary or higher), body mass index (continuous, kg/m2), cigarette smoking (non-smoker, ex-smoker, current smoker), alcohol consumption (never, weekly, daily), moderate physical activity (none, 1-3h/week, ≥3h/week), vigorous work/strenuous physical activity at least once a week (no or yes), and familial history of cancer (no or yes).”

Results for the unadjusted PRS analyses have been added to Supplementary file 1d, columns N to Q.

e. In Table 4, the authors intend to quantify the proportion of cases captured by the at-risk groups as defined by PRSs. However, what's the rationale of the HR cutoffs? Why not use the absolute risks in Figure 1(c) instead? What is the intent to include other risk factors? Higher proportion of cases captured would be expected with additional risk factors, however, it may be an overestimation here given that the RRs of other risk factors were estimated from the same cohort?

To minimize confusion, we have removed the results from Table 4. Using the absolute risk as the cut-off will not increase information known about the group identified as high-risk. The proportion of individuals identified as at high risk will be greater than the percentile identified by the absolute risk cut-off (x%). In addition, if the PRS is well calibrated, we expect >x% of these high-risk individuals to develop the disease.

2. Have the authors sufficiently addressed survival bias? The cohort of 68k individuals was established in 1993-1998, but blood samples from 28k individuals were only collected in 1999-2004, and of these, 21,694 were analysable. Was the Singapore Cancer Registry data systematically collected from 1993?

We thank the reviewer for highlighting this point. First, we agree that survival bias could exist since we could only include participants who survived to participate in the follow-up interview and who also agreed to give blood samples for research. For example, in the breast cancer evaluation, those who gave blood samples were younger (mean age of 55.1 years versus 57.1 years at recruitment) and were also more likely to have received education (32.9 percent with no formal education versus 45.4 percent) compared to women who did not give blood. As such, Cox proportional hazards models evaluating associations between PRS and incident cancers in the study were additionally adjusted for these related risk factors collected at recruitment, including age at recruitment, highest level of education, body mass index, cigarette smoking status, alcohol consumption, physical activity and familial history of cancer. In the revised version of the manuscript, we have further included survival bias as a potential study limitation:

In Discussion (second last paragraph): “Blood samples were collected from a subset of SCHS participants who were alive and contactable between 1999 and 2004 (after the recruitment period 1993 – 1998). While we attempted to adjust associations between PRSs and incident cancers in the study by including multiple related risk factors as covariates in Cox proportional hazards models, we acknowledge the potential of survival biases in the study.”

We have added more information on the Singapore Cancer Registry:

In Methods (section “Selection of common cancers”): “Identification of incident cases of cancer was accomplished by record linkage of all surviving cohort participants with the database of the nationwide Singapore Cancer Registry [20]. The Singapore Cancer Registry was founded in 1968. Prior to 2009, reporting of neoplasms by all medical practitioners and pathology laboratories to the registry was voluntary (10.1016/j.canep.2016.06.006). The registry's staff compares cancer patient hospital discharges and death certificates to registered cases for verification. Completeness of reporting in the 1970s was 96% and in the 1990s, it was close to 100% (10.1186/1471-2407-6-261).”

Reviewer #2 (Recommendations for the authors):

1. As summarized in my public comments, my primary concern is lack of clarity in why not all cancer PRS were carried forward from Supplemental Table 1 to evaluation in Supplemental Table 2. The PRS from the largest prostate cancer GWAS to date (Conti 2021), including 12% Asian individuals, is represented in Supplemental Table 1 (PGS ID: PGS000662) but is not carried forward for evaluation in Supplemental Table 2. Similarly, a lung cancer PRS developed specifically for a Chinese population (Dai 2020) is also represented in Supplemental Table 1 (PGS ID: PGS000070) but not carried forward for evaluation. I have not performed similar inspection for the breast and colorectal cancer PRS, but this raises the concern about incomplete evaluation of available (and even best) PRS.

We have included the analysis of the PRSs that did not have training/ testing datasets mentioned and added the results to Supplementary file 1c-f. Thus, we studied a total of 165 PRSs (87 for breast cancer, 26 for colorectal cancer, 13 for lung cancers and 39 for prostate cancer). Figure 1 is updated to the PRSs from this list – PGS000873 (Breast), PGS000662 (Prostate), PGS000055 (Lung-Female), PGS000734 (Lung-Male), PGS000721 (Colorectal-Female), and PGS000070 (Colorectal-Male).

2. It is not clear that AUC should be used to identify the "best" PRS for each cancer in this population, particularly since from what I can tell the author did not use a time-dependent AUC method and excluded all baseline cases from analyses. A time-to-event metric would seem to be a more appropriate way to identify the "best" PRS in these analyses, either using an AUC method for incident event data or using risk association.

To obtain a time-to-event metric for AUC at 5 years, we used AUC.cd() from the survAUC package in R (10.1186/s12874-017-0332-6). The best PRSs chosen were the same except for female colorectal cancer (PGS000149). However, the AUC at 5 years was 0.66677 for PGS000055 (the PRS chosen by AUC from the logistic model), 0.00005 lower than for PGS000149 (surv.AUC = 0.66682). We find the difference non-informative and have presented the AUCs from the Cox proportional hazards model in Supplementary file 1f.

In Discussion (second last paragraph): “We further tested the sensitivity of the PRS selection using a time-to-event metric for AUC (at 5-year), and the differences found were non-informative from the logistic regression (Supplementary file 1g).”

3. Although the term "predictive ability" is used in the landmark Wand 2021 paper (ref 31), I recommend using more specific terms in this manuscript, starting with the abstract (line 66). I think the correct terms would be "relative risk association" and "absolute risk association," as appropriate. Still, Ref 31 (Wand 2021) is helpful and should be introduced in the Introduction to orient the reader to the reporting framework the authors use in presenting this work.

We have indicated the reporting framework used in the last sentence of the Introduction:

“In this study, we evaluated the utility of common PRS, curated in the Polygenic Score (PGS) Catalog, in predicting the risk of the commonly diagnosed cancers with high genetic predisposition (breast, prostate, colorectal, and lung) in a prospective cohort comprising 21,694 participants of East Asian descent in Singapore. The reporting framework recommended for the interpretation and evaluation of PRS detailed in Wand et al. is used (10.1038/s41586-021-03243-6).”

In Results: “PRS predictive ability” replaced with “PRS absolute risk association”

In Results (section “PRS distribution”): “Figure 1 depicts the (A) distribution, (B) discrimination, (C) absolute risk association, and (D) calibration of the best-performing PRS (based on AUC) (Additional file 1 – Supplementary Table 3) for the four cancers studied: breast (PGS000873), prostate (PGS000662), colorectal (female: PGS000055; male: PGS000734), and lung (female: PGS000721; male: PGS000070).”

In Discussion (fourth paragraph): “In accordance with published Polygenic Risk Score Reporting Standards, we reported PRS distribution, discrimination, absolute risk association, and calibration for each of the four common cancers studied [31].”

Caption: “Figure 1. Site-specific polygenic risk scores (PRS) performance assessment.

(A) Distribution, (B) discrimination, (C) absolute risk association and (D) calibration for each of the four common cancers studied. Two-sided, two-sample t-tests with a type I error of 0.05 were used to examine whether there was a difference in the distribution of standardised PRS (subtraction of mean value followed by the division by the standard deviation) between site-specific cancer cases and non-cancer controls (A). The PRS showcased are the best-performing scores based on Area Under the Receiver Operator Characteristic Curve (AUC) values in the female and male populations, (i) unadjusted [solid line], and (ii) adjusted for age at recruitment [dashed line] (B). Each colored line in the plots for absolute risk association denotes a five percentile increase in the standardised PRS score in (C). Calibration calculated based on five-year absolute risk by PRS deciles in (D). A prediction tool is considered more accurate when the AUC is larger. An AUC of 0.9–1.0 is considered excellent, 0.8–0.9 very good, 0.7–0.8 good, 0.6–0.7 sufficient, 0.5–0.6 bad, and less than 0.5 considered not useful (PMID: 27683318).”

4. Can the authors justify why they performed only sex-stratified analyses for lung and colorectal cancer? The GWAS for these conditions included both sexes. Given the relatively small number of cases, it would be worth performing all analyses in the cohort overall. Sex-stratified analyses could be presented in the supplemental materials, if desired.

Gender differences in colorectal and lung cancer incidence have been reported in Singapore.

From Kok et al. (DOI: 10.1007/s00384-007-0421-9): “Male colorectal cancer rates [increased] between 1968 and 2002 from 20 to 40 per 100,000 person years. The increase was sharpest among older men, for whom there was a significant AC effect. Female colorectal cancer rates increased until 1992 (from 16 to 29 per 100,000 person years) and stabilized afterward.”

From Lim et al. (DOI: 10.1016/j.lungcan.2014.01.007): “Lung cancer incidence rates were more than two times higher in males compared to females.”

The AUCs associated with the selected PRS for combined colorectal (PGS000055, 0.65 [0.63 to 0.67]) and combined lung (PGS000070, 0.69 [0.67 to 0.71]) were not appreciably different from those of the PRS selected for the sex-stratified analyses (female colorectal: PGS000055, 0.65 [0.62 to 0.69]; male colorectal: PGS000734, 0.66 [0.63 to 0.69]; female lung: PGS000721, 0.69 [0.65 to 0.73]; male lung: PGS000070, 0.68 [0.65 to 0.71]). We have presented the combined-sex analysis for colorectal and lung cancer in Supplementary file 1g.

In Discussion (second last paragraph): “To increase the number of events, we combined the males and females for lung and colorectal cancers. The resulting AUCs (from the logistic regression) were not appreciably different from the sex-specific analysis (Supplementary file 1g).”

5. Lines 182-184: How is it known that only <1% of participants migrated out of Singapore? Are other data available to confirm their ongoing residence in Singapore and absence of cancer diagnosis? E.g., other medical diagnoses or other data?

The data on migration in our cohort were collected during our subsequent follow-up interviews, and we were informed about the migration by the family members of cohort members who had migrated. To the best of our knowledge, only 47 participants in our cohort migrated to other countries over the duration of this study.

In Methods (section “Follow-up”): “The data on migration in our cohort was collected during our subsequent follow-up interviews, and we were informed about the migration by the family members of cohort members who had migrated.”

We have systematically done linkage analysis between our cohort data and the nationwide Singapore Cancer Registry to identify incident cancer cases in our cohort since recruitment in the 1990s. The nationwide cancer registry has been in place since 1968 and has been shown to be comprehensive in its recording of incident cancer cases [reference: Bray F, Colombet M, Mery L, Piñeros M, Znaor A, Zanetti R and Ferlay J, editors (2017). Cancer Incidence in Five Continents, Vol. XI (electronic version) Lyon, IARC. http://ci5.iarc.fr last accessed on 2 December 2022]. Furthermore, compulsory reporting of cases to Singapore Cancer Registry has been mandated by law since 2010. In addition, we were able to ascertain survival status of all cohort participants via record linkage with the population-based Singapore Registry of Births and Deaths. Hence, we are confident that the capture of cancer cases via linkage with national registry can be considered to be virtually complete.

In Methods (section “Selection of common cancers”): “The Singapore Cancer Registry was founded in 1968. Prior to 2009, reporting of neoplasms by all medical practitioners and pathology laboratories to the registry was voluntary (10.1016/j.canep.2016.06.006). The registry's staff compares cancer patient hospital discharges and death certificates to registered cases for verification. Completeness of reporting in the 1970s was 96% and in the 1990s, it was close to 100% (10.1186/1471-2407-6-261).”

6. It is reassuring to see cigarette smoking as a strong lung cancer risk factor, but the apparent protective effect of smoking on prostate cancer contradicts current evidence. The authors should discuss this in the Discussion.

We have added in the Discussion:

In Discussion (fifth paragraph): “It is reassuring to see tobacco smoking is a strong risk factor for lung cancer in our dataset. However, smoking appeared to be associated with a protective effect for prostate cancer. While smoking is a well-known risk factor for many cancers (10.18632/oncotarget.24724), in particular lung cancer, observational studies frequently show that smokers are associated with a lower incidence of prostate cancer (10.1038/bjc.2012.520, 10.1158/1055-9965.EPI-09-0252, 10.1002/(SICI)1097-0215(19960917)67:6<764::AID-IJC3>3.0.CO;2-P, 10.1054/bjoc.1999.1105, 10.1007/BF00051881, 10.2105/AJPH.2008.15050, 10.1016/j.eururo.2014.08.059, 10.1186/s12916-016-0607-5, 10.1002/ijc.22788). However, a Mendelian randomisation study did not support the association (10.1371/journal.pmed.1003178).”

7. Lines 274-292 could be shortened, simply referring the reader to most of these results in Table 1.

We have removed the second paragraph under Characteristics of the study population, keeping only details relevant to numbers of incident cancers and age of diagnosis:

In Results (section “Characteristics of the study population”): “Table 1 shows the characteristics of the 21,694 participants who were cancer-free at recruitment. The median follow-up time for the cohort was 20 years (IQR: 18 to 22). As of December 2015, 495 women developed breast cancer, 308 men developed prostate cancer, 741 (332 women and 409 men) colorectal cancer, and 562 (181 women and 381 men) lung cancer. The median age at recruitment was 54 years (interquartile range [IQR]: 49 to 61). The median age at diagnosis was 65 years (IQR: 59 to 70) for female breast cancers, 72 years (IQR: 67 to 77) for prostate cancers, 71 years (IQR: 65 to 76) for male colorectal cancers, 71 years (IQR: 64 to 78) for female colorectal cancers, 74 years (IQR: 68 to 78) for male lung cancers and 74 years (IQR: 66 to 79) for female lung cancers.”

8. Lines 368-387: This section of the results seems out of order compared with the Methods. I recommend reorganizing the Results as follows: "Characteristics of the study population," "Lack of Asian representation in PRS development," "PRS discriminatory ability," "PRS distribution," "Associations between PRS and relative hazard of developing cancers," "Number of cancers that developed within PRS at-risk groups," "Association of PRS with absolute risk," "PRS calibration."

We have reordered the sections as suggested without the section on “Number of cancers that developed within PRS at-risk groups” (based on the comments by Reviewer 1).

9. Calibration should be formally tested (e.g., with a Hosmer-Lemeshow or more sophisticated test) in addition to the visual inspection presented.

We did not observe any lack of calibration based on the Hosmer-Lemeshow test (using 10 groups) for all PRSs except PGS000662 (prostate cancer). However, males in the first 5 deciles of PGS000662 did not develop prostate cancer. This may indicate that a linear fit is not appropriate, but a hard threshold (here the start of the 6th decile) is more appropriate to indicate males at elevated risk of developing prostate cancer.

In Methods (section “PRS calibration”): “In addition, we used the Hosmer-Lemeshow test to check the goodness-of-fit.”

In Discussion (paragraph 7): “All PRSs except PGS000662 (prostate cancer) passed the formal Hosmer-Lemeshow goodness-of-fit test. Males in the first 5 deciles of PGS000662 did not develop prostate cancer, suggesting that a linear fit may not be appropriate. A hard threshold beginning from the 6th decile may perform better at identifying males at elevated risk of developing prostate cancer.”
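As a rough illustration of the Hosmer-Lemeshow check referred to in this response (10 groups), the Python sketch below groups individuals by predicted risk, compares observed and expected case counts in each group, and refers the statistic to a chi-square distribution with (groups − 2) degrees of freedom. It is a simplified, standard-library version written for this note, not the authors' R implementation; the closed-form tail probability used here is valid only for an even number of degrees of freedom (which holds for 10 groups), and the function names and toy data are hypothetical.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(predicted, outcomes, groups=10):
    """Return (statistic, p_value) for groups formed by ranking predicted risk."""
    n = len(predicted)
    order = sorted(range(n), key=lambda i: predicted[i])
    statistic = 0.0
    for g in range(groups):
        members = order[g * n // groups:(g + 1) * n // groups]
        if not members:
            raise ValueError("too few individuals for the requested groups")
        expected = sum(predicted[i] for i in members)
        observed = sum(outcomes[i] for i in members)
        # Binomial variance of the case count: n_g * p_g * (1 - p_g).
        variance = expected * (1.0 - expected / len(members))
        if variance <= 0:
            raise ValueError("a group has mean predicted risk of 0 or 1")
        statistic += (observed - expected) ** 2 / variance
    return statistic, chi2_upper_tail_even_df(statistic, groups - 2)


# Hypothetical toy data: 4 groups of 25, with the two low-risk groups
# observing twice as many cases as predicted (illustration only).
toy_predicted = [0.2] * 50 + [0.6] * 50
toy_outcomes = ([1] * 10 + [0] * 15) * 2 + ([1] * 15 + [0] * 10) * 2
toy_stat, toy_p = hosmer_lemeshow(toy_predicted, toy_outcomes, groups=4)
```

A small p-value indicates that observed and expected counts disagree across the risk groups. As the response notes for PGS000662, groups with no observed cases at all can contribute to such a result.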

10. Lines 470-476 of the Discussion stray a little from the focused thesis of this work.

We have removed these lines from the Discussion.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

The manuscript has been improved but there are some remaining issues that need to be addressed, as outlined below:

1. Your response to the initial critique #1 is valid. The response itself contains errors (e.g. incorrect PGS numbers given for specific disease PRS), but the supplemental tables appear correct. The caption for Figure 1 is missing, which should explain which column corresponds to which disease and in which sex.

We corrected the errors in the response for the PGS numbers, table numbers, and supplementary files numbers.

“Thus, we studied a total of 165 PRSs (87 for breast cancer, 26 for colorectal cancer, 13 for lung cancers and 39 for prostate cancer). Figure 1 is updated to the PRSs from this list – PGS000873 (Breast), PGS000662 (Prostate), PGS000055 (Colorectal-Female), PGS000734 (Colorectal-Male), PGS000721 (Lung-Female), and PGS000070 (Lung-Male).”

Caption for figure 1:

“Figure 1. Site-specific polygenic risk scores (PRS) performance assessment.

(A) Distribution, (B) discrimination, (C) absolute risk association and (D) calibration for each of the four common cancers studied (columns from left to right: breast, prostate, lung [female], lung [male], colorectal [female], and colorectal [male]). Two-sided, two-sample t-tests with a type I error of 0.05 were used to examine whether there was a difference in the distribution of standardised PRS (subtraction of mean value followed by the division by the standard deviation) between site-specific cancer cases and non-cancer controls (A). The PRS showcased are the best-performing scores based on Area Under the Receiver Operator Characteristic Curve (AUC) values in the female and male populations, (i) unadjusted [solid line], and (ii) adjusted for age at recruitment [dashed line] (B). Each colored line in the plots for absolute risk association denotes a five percentile increase in the standardised PRS score in (C). Calibration calculated based on five-year absolute risk by PRS deciles in (D). A prediction tool is considered more accurate when the AUC is larger. An AUC of 0.9–1.0 is considered excellent, 0.8–0.9 very good, 0.7–0.8 good, 0.6–0.7 sufficient, 0.5–0.6 bad, and less than 0.5 considered not useful (PMID: 27683318).”

2. Please add the methods of your new time-to-event AUC sensitivity analysis to the main Methods of the manuscript, including a reference to the R package used. Also please state explicitly in the Methods your approach to choosing the best PRS for each disease (or disease-sex combination): logistic regression AUC.

We added the information in Methods as recommended.

In Methods (PRS discrimination):

“The site-specific PRS with the highest AUC (logistic regression models) was selected. To test the sensitivity of the PRS selection, we obtained a time-to-event metric for AUC at 5 years using AUC.cd() from the survAUC package in R (10.1186/s12874-017-0332-6).”
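The selection rule stated in this Methods text (choose the PRS with the highest AUC) can be sketched in Python as follows. The sketch uses the empirical, rank-based AUC: the proportion of case-control pairs in which the case has the higher score, with ties counted as half. For a model with the PRS as its only predictor and a positive coefficient this should match the model-based AUC, but that equivalence is an assumption here, and the sketch covers neither the age-adjusted models nor the time-to-event AUC from survAUC. The score identifiers and data are hypothetical.

```python
def empirical_auc(scores, labels):
    """Proportion of case-control pairs where the case scores higher (ties = 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def select_best_prs(prs_scores, labels):
    """Return the identifier of the score with the highest empirical AUC."""
    return max(prs_scores, key=lambda name: empirical_auc(prs_scores[name], labels))


# Hypothetical scores for five individuals (illustration only).
toy_labels = [0, 0, 0, 1, 1]
toy_prs = {
    "PRS_A": [0.1, 0.4, 0.35, 0.8, 0.3],
    "PRS_B": [0.2, 0.1, 0.3, 0.9, 0.7],
}
toy_best = select_best_prs(toy_prs, toy_labels)
```

The pairwise loop is quadratic in cohort size and is written for clarity; a rank-sum formulation would be preferable for a cohort of this study's size.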

3. Please explicitly state this rationale for sex-stratified analyses for these cancers in the Methods.

We have added the information as recommended.

In Methods (Selection of common cancers)

“We further stratified the analysis by sex as differences in colorectal and lung cancer incidence by sex have been reported in Singapore [10.1007/s00384-007-0421-9].”

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    Figure 1—source data 1. Tables on absolute risk for breast cancer.
    Figure 1—source data 2. Tables on absolute risk for colorectal cancer.
    Figure 1—source data 3. Tables on absolute risk for lung cancer.
    Figure 1—source data 4. Tables on absolute risk for prostate cancer.
    Figure 1—source data 5. Tables on polygenic risk scores (PRS) performance assessment.
    Supplementary file 1. Supplementary files a-g, presenting supplementary figure and tables.
    elife-82608-supp1.xlsx (476KB, xlsx)
    MDAR checklist
    Source code 1. R codes on the statistical analysis.
    elife-82608-code1.zip (65.8KB, zip)

    Data Availability Statement

    All polygenic risk scores used in this study are publicly available in the PGS Catalog (https://www.pgscatalog.org; Lambert et al., 2021). The data that support the findings of our study are available from the corresponding authors of the study upon reasonable request (Dr Rajkumar s/o Dorajoo, dorajoor@gis.a-star.edu.sg and Dr Jingmei Li, lijm1@gis.a-star.edu.sg). More information regarding the data access to SCHS can be found at: https://sph.nus.edu.sg/research/cohort-schs/. The data are not publicly available due to Singapore laws. Figure 1—source data 1 contains the numerical data used to generate Figure 1. The code for the study is uploaded as Source code 1.


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
