Abstract
Background:
We previously developed an African-ancestry-specific polygenic hazard score (PHS46+African) that substantially improved prostate cancer risk stratification in men with African ancestry. The model consists of 46 SNPs identified in Europeans and 3 SNPs from 8q24 shown to improve model performance in Africans. Herein, we used principal component (PC) analysis to uncover subpopulations of men with African ancestry for whom the utility of PHS46+African may differ.
Materials and Methods:
Genotypic data were obtained from PRACTICAL consortium for 6,253 men with African genetic ancestry. Genetic variation in a window spanning 3 African-specific 8q24 SNPs was estimated using 93 PCs. A Cox proportional hazards framework was used to identify the pair of PCs most strongly associated with performance of PHS46+African. A calibration factor (CF) was formulated using Cox coefficients to quantify the extent to which the performance of PHS46+African varies with PC.
Results:
CF of PHS46+African was strongly associated with the first and twentieth PCs. Predicted CF ranged from 0.41 to 2.94, suggesting that PHS46+African may be up to 7 times more beneficial to some African men than others. The explained relative risk for PHS46+African varied from 3.6% to 9.9% for individuals with low and high CF values, respectively. By cross-referencing our dataset with 1000 Genomes, we identified significant associations between continental and calibration groupings.
Conclusion:
We identified PCs within 8q24 that were strongly associated with performance of PHS46+African. Further research to improve clinical utility of polygenic risk scores (or models) is needed to improve health outcomes for men of African ancestry
Keywords: Prostate cancer, health disparities, genome wide association study, polygenic risk, genomics, local genotypic ancestry, African
Introduction
Polygenic hazard score (PHS) models can test for associations between genetic variants and the age at diagnosis of prostate cancer1,2. These models generate personalized estimates of risk that can be used to guide decisions on whether and when to offer screening to men3–5. However, development of polygenic models have often included only individuals of European genetic ancestry6,7, which could potentially lead to greater prostate cancer disparities. This is a particular concern for prostate cancer. For example, African American men are more likely than other men in the United States to develop prostate cancer, have a younger age of diagnosis, and are more than twice as likely to die from their prostate cancer as white men8.
PHS46, a model for prostate cancer trained exclusively in men of European ancestry, was found to be roughly half as effective in African men as in Europeans and Asians9. Similar trends were observed for other polygenic scores6, highlighting the need for increased diversification of large-scale genome studies in order to address this disparity7 in clinical utility. Furthermore, studies have suggested that inequities in the performance of polygenic risk scores may exacerbate disparities for individuals and communities that are already under-represented in research10. In an effort to develop more equitable PHS models, our group recently developed an African-ancestry-specific PHS model (PHS46+African)11, that substantially improved upon the performance of PHS46.
PHS46+African consists of 46 single nucleotide polymorphisms (SNPs) that were identified in Europeans, together with 3 additional SNPs located on the 8q24 chromosome. These three SNPs, herein referred to as African-specific, were found to uniquely improve performance of the PHS model in men of African genetic ancestry (African men). The term “African-specific” is not meant to confer any information on the relative allele frequency of these variants. However, given the inherent genetic diversity on the African continent and gene flow associated with the African diaspora12, we believe that PHS46+African may benefit certain subpopulations of African men over others. This would have implications for the general utility of PHS46+African and provide an opportunity to further improve the model.
Therefore, we used principal component (PC) analysis to estimate the genetic relatedness13,14 of a dataset of men of African genetic ancestry (African dataset) and to determine whether the performance of PHS46+African varied along PCs representing local genetic ancestry near African-specific SNPs. The PC analysis was conducted on a SNP window encompassing the 3 African-specific SNPs and limited to the 8q24 chromosome. In this way, the African men are projected onto axes, where proximity is based on patterns of genetic variation within this specific section of the genome and agnostic to geographical or social groupings in the dataset. We believe that this “local PC” approach13, focusing on 8q24, may uncover subpopulations of African men for whom the utility of PHS46+African differs.
Material and Methods
African-Ancestry Dataset
Genotypic and phenotypic data for this study were obtained from the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL)15 consortium. Genotyping was performed using the OncoArray15,16 chip and covered 444,323 SNPs across the genome. Genotypic data was coded as effect allele counts (0, 1, or 2) for each SNP. Phenotypic data consisted of the prostate cancer case/control status, age at diagnosis, age at last follow-up, genome-wide principal components, and genotypic ancestry.
The genotypic ancestry of each individual was previously estimated using 2,318 ancestry informative markers spanning the entire genome17.Ancestral groupings using SNP markers showed good agreement with self-reported race/ethnicity9. In total, 6,253 men were classified as having African genotypic ancestry and were used for this analysis. All contributing studies were approved by the relevant ethics committees; written informed consent was obtained from the study participants.
PHS46+African was estimated for each African individual using the 49 SNPs and their respective PHS46+African SNP coefficients from the literature9.
Principal components of 8q24 SNP window
Genetic variation within a narrow window surrounding the three 8q24 SNPs was quantified using principal component decomposition. An initial selection window of SNPs was identified as all SNPs on chromosome 8 lying between the three African-specific 8q24 SNPs in PHS46+African or within 15 kbp of those SNPs in either direction (Figure 1). SNPs were subsequently removed from this window if the call rate was less than 0.95 or if the SNPs could not be cross-referenced against those available on the 1000 Genomes Project Phase 3 dataset18. In total, 93 SNPs (Supplementary Table 1) met the selection criteria and constituted the local 8q24 window. Missing SNP calls were replaced with the mean of the genotyped data for that SNP in the African dataset11,19. Genetic count data were first standardized across the dataset (mean of 0, standard deviation of 1) before the first 93 principal components (PC) of the 8q24 SNP window were estimated using the “pca” function in MATLAB.
Interaction between 8q24 SNP window and PHS46+African
For each principal component (PC) of the 8q24 SNP window, a Cox proportional hazards model was estimated using the age of onset of any prostate cancer as the time-to-event (Eq. 1):
where HR is the hazard rate, β1 is the coefficient of PHS46+African, β2 is the coefficient of the interaction terms between the 8q24 PC and PHS46+African, and β3 through β6 are the coefficients for the interaction terms between the first 4 global ancestry principal components (globPC) and PHS46+African. The model was estimated using the entire African dataset, where each observation was weighted by a sample-weighting correction factor11,19 to account for differences between the case-control rate of our dataset and that observed in the general population. Controls were censored at the age of last follow-up. The p-value associated with β2 was recorded for each of the 90 PCs from 8q24, after which the two principal components with the smallest p-values (, ) were selected for further analysis. Only two components were selected in order to simplify the methodology and visualization of the dataset in this first analysis of the complex interactions of local ancestry on polygenic score performance. This two-component selection also reflects what has often been used in the literature to estimate global ancestry classifications17.
A Cox proportional hazards model was then estimated using both of the selected principal components (Eq. 2):
The calibration factor (CF) for PHS46+African, as a function of and , was defined as (Eq. 3):
CF was formulated as the performance metric of interest in this study and can be used to quantify the variation in the coefficient of PHS46+African as a result of differences in the expression of variants within the 8q24 SNP window. Larger values of CF suggest stronger associations between the PHS46+African score and the age at diagnosis of prostate cancer. Individuals in the top 33rd, middle 33rd, and bottom 33rd percentiles of CF values were grouped into high-, middle- and low-calibration groups respectively. These groups were used to facilitate comparisons in CF between PHS models and dataset variables.
Comparison between PHS46 and PHS46+African in calibration groups
Within each calibration group, a Cox proportional hazards model was fit using the age of diagnosis of any prostate cancer as the time-to-event and either PHS46+African or PHS46 as the sole predictor variable. PHS46 was estimated using SNP coefficients published previously9. Controls were censored at age at last follow-up. Models were fit using data only from that calibration group. For each of these group-based models, the explained relative risk (ERR)20 was estimated as a measure of model goodness-of-fit. ERR was compared between the two PHS models to determine whether the improvement in performance between the models was evenly distributed across the calibration groups. Empirical confidence intervals for ERR were estimated using 1000 bootstrapped samples.
Variation in dataset variables across calibration groups
In order to identify differences in characteristics between calibration groups, a set of generalized linear models were fit to study the association between dataset variables (genetic count of three African-specific 8q24 SNPs, case-control status, and age at diagnosis of cases) and calibration groups. In each case, the dataset variable was set as the independent variable, while the calibration group classification (low/middle/high) was set as the predictor. Identity link functions were used for the continuous dataset variables (genetic count of three 8q24 SNPs, age at diagnosis of cases), whereas the logit link function was used for the binary variable (case-control status). Fitted models were then used to predict mean values of the dataset variables for each 2-PC region. In addition, a chi-squared test was used to determine whether any association existed between the contributing study (18 in total) and calibration group.
Continental differences between calibration groups
The 1000 Genomes (1000G) dataset was used to identify potential differences in continental origins across calibration groups. The 1000G dataset is subdivided into 5 continental groups: European, East Asian, admixed American, South Asian, and African individuals. Genetic counts for the 93-SNP 8q24 window for 2,504 individuals from the 1000G dataset were obtained from publicly available sources18. and scores were estimated using the same scaling factors and PC coefficients derived from the African dataset. CF and calibration groups for each individual in the 1000G dataset were estimated using Equation 3, and the percentile cutoffs derived from the African dataset. Chi-squared tests were used to test for associations between continental and calibration groups.
Results
Model interaction between 8q24 PCs and PHS46+African
The first (PC-1) and twentieth (PC-20) principal components had the two smallest p-values (Supplementary Table 2) when estimating the Cox proportional hazards models, as formulated in Eq.1, for all 93 principal components of the 8q24 SNP window. These two principal components were thus selected as and and used to estimate the Cox proportional hazards model formulated in Eq. 2. (Table 1). No significant interactions (α = 0.05) were detected between global ancestry principal components and PHS46+African.
Table 1. Coefficients of Cox model estimating interactions between PC and PHS46+African.
Coefficient | Associated predictor | Estimate | p-value |
---|---|---|---|
γ 1 | PHS46+African | 1.55 | < 1E-16 |
γ 2 | PC-1 × PHS46+African | −0.049 | 7.36 E-6 |
γ 3 | PC-20 × PHS46+African | 0.21 | 5.84 E-5 |
γ 4 | Global Ancestry PC(1–4) × PHS46+African | 7.34 | 0.29 |
γ 5 | −14.71 | 0.33 | |
γ 6 | −3.97 | 0.42 | |
γ 7 | −2.86 | 0.54 |
Principal component coefficients and scaling factors needed to estimate PC-1 and PC-20 from the genetic counts of the 93 SNP-8q24 window are reported in the Supplementary Material (Supplementary Table 3).
Calibration factor for PHS46+African
CF was plotted as a function PC-1 and PC-20 for every African individual in our dataset (Figure 2A). Qualitatively, the calibration factor tended to increase from high-PC-1/low-PC-20 to low-PC-1/high-PC-20 values from a minimum value of 0.41 to a maximum of 2.94. Grouping the African individuals based on percentiles of CF (Figure 2B) demonstrates this pattern more conspicuously, as the middle calibration group clusters on the PC-1/PC-20 space as a diagonal band that neatly divides the high and low calibration groups.
To investigate the influence of limiting the analysis to PC-1 and PC-20, we also estimated a calibration factor using the three principal components with the smallest p-values for β2 in Equation 1. The resulting 3-PC calibration factor was highly correlated with CF (R2 = 0.83), suggesting that similar trends would be observed if additional local ancestry PC’s were incorporated into the formulation of CF.
Comparing explained relative risk (ERR) between calibration groups and across PHS models
Mean ERR values for both PHS46 and PHS46+African were greater for the high calibration group than for the low calibration group (Table 2). Improvement of ERR with addition of the three African-specific SNPs was estimated as the as the difference between the ERR values of PHS46+African and PHS46. The absolute improvement in ERR was comparable for all three calibration groups.
Table 2. ERR for PHS46+African and PHS46.
Calibration group | PHS46+African | PHS46 | Improvement in ERR |
---|---|---|---|
low | 0.036 [0.019, 0.059] | 0.013 [0.004, 0.028] | 0.023 [0.011, 0.039] |
middle | 0.056 [0.036, 0.083] | 0.018 [0.007, 0.036] | 0.037 [0.021, 0.055] |
high | 0.099 [0.068, 0.135] | 0.041 [0.022, 0.069] | 0.058 [0.035, 0.083] |
Variation in dataset variables across 2-PC space
Fitted generalized linear models were used to predict mean values of case-fraction, age of cases, and genetic counts of three 8q24 SNPs for the calibration groups (Supplementary Table 4). No significant (p < 0.05) differences were found in the predicted age of cases across the groups. The predicted fraction of cases in the high-calibration group was lower (0.49) than that of the low-calibration group (0.53). The predicted mean genetic count of rs5013678 in the high-calibration group (0.27) was greater than that in the low-calibration (0.092) and middle-calibration (0.17) groups. No statistically significant association was detected between contributing study and calibration group (Supplementary Table 5).
As a post-hoc analysis, R2 values were estimated between case-control status and CF (0.0025) as well as between mean genetic count of rs5013678 and CF (0.039), suggesting that less than 4% of the variation in CF could be explained by each of these variables.
Continental characterization of calibration groups
The 1000G dataset was mapped into the 2-dimensional space defined by PC-1 and PC-20 and stratified by continental group (Figure 3). In general, individuals from the 1000G dataset mapped within the boundaries defined by the OncoArray African dataset. A statistically significant association (χ2 = 288, p < 0.001) was detected between continental and calibration groups in the 1000G dataset. Analysis of the standardized residuals of the test (Supplementary Table 6) revealed that the largest deviation from the expected cross-tabulation counts of the two variables was the greater-than-expected number of African individuals in the low-calibration group. Further analysis of the cross-tabulation between the 2 variables (Supplementary Table 7) revealed that the individuals of European origin made up the largest continental group within the high-calibration individuals (20%).
Discussion
Genetic models developed predominantly with European data may widen health disparities. PHS46+African incorporates African-specific SNPs from 8q24 and improves performance in men of African ancestry, but we have demonstrated here that heterogeneity in local ancestry in 8q24 can affect performance gains.
We identified two principal components within the 8q24 region (PC-1 and PC-20) that were most strongly associated with the calibration factor of PHS46+African among men of African ancestry. These associations were estimated while co-varying for global ancestry principal components (Table 1), suggesting that the variation in performance of PHS46+African across men of African ancestry using local ancestry in 8q24 could not be explained by ancestral differences estimated using the entire genome. The model-predicted CF values ranged from 0.41 to 2.94 for the combinations of PC-1 and PC-20 found in our dataset, suggesting that PHS46+African was roughly 7 times more useful in stratifying risk in some individuals compared to others. Assigning individuals into three equally sized subsets based on thresholds of CF, we identified low-, middle-, and high-calibration groups within our dataset. The goodness-of-fit for PHS46+African in each of these groups, assessed using the explained relative risk, was also found to increase from low- to high-calibration group. Improvements in ERR were observed across all groups when PHS46+African was used instead of PHS46.
The discovery of PC-1 and PC-20 allows us to identify individuals for whom risk stratification using PHS46+African is expected to be less precise. Several strategies could be implemented to improve equity in performance of PHS46+African in men of African genetic ancestry. These strategies can be pursued simultaneously with essential work to increase the overall diversity of genetic studies. For example, efforts to discover additional SNPs in future datasets could include weighting of individuals according to their PC-1 and PC-20 values, in order to enrich discovery of SNPs that preferentially benefit those in the low-calibration group. In addition, SNP weights in PHS46+African might be re-estimated to account for variations in effect sizes with calibration group. The principal components can also be used prospectively to guide enrollment in genome wide association studies so as to ensure that individuals from low-calibration groups are adequately represented in the datasets.
By cross-referencing our dataset with individuals from 1000 Genomes, we identified continent-level differences in calibration groups that suggest ancestral origins for the genetic relatedness defined by PC-1 and PC-20. The mapping of 1000G data into the 2-dimensional space defined by PC-1 and PC-20 may provide clues as to the underlying variation in CF. 46 of the 49 SNPs used in PHS46+African were discovered in a dataset consisting entirely of men with European genetic ancestry. Therefore, it is unsurprising that the high-calibration group overlapped substantially with men of European ancestry, as defined by 1000G. However, certain continental groups that were not explicitly used in training of model weights, such as South Asian and admixed American, also exhibited overlap with the high-calibration group (Supplementary Table 7). On the opposite end of the spectrum, the low-calibration group overlapped with primarily African and East Asian men. Further investigation into pockets of reference groups, from 1000G and other datasets, that share common values of PC-1 and PC-20 may reveal ancestral linkages that may help to predict which ancestral groups or datasets a model is most likely to perform well in.
There are several limitations to the current analysis to consider. We studied the same dataset that was used to identify the 3 African-specific SNPs. Therefore, the dependence of PHS46+African on PC-1 and PC-20 will need to be validated in independent test sets to ensure generalizability of findings. Our study is further limited by a relatively small number of observations compared to those found in larger, often predominantly European, genome-wide association studies. While a larger dataset would allow us to more robustly estimate model coefficients, the mapping of 1000G data into the same 2-dimensional space suggests that our current sample accurately portrays the extent of variation in the 8q24 SNP window (Figure 3). In addition, the calibration groups were selected based on arbitrary thresholds of CF and were used solely to simplify the analysis. Further investigations will be required to determine whether distinct subpopulations, based on ancestral relatedness, may exist in the 2-dimensional PC space. Furthermore, 8q24 was chosen to investigate effects in dependence of PHS performance on local ancestry because of substantial evidence supporting the importance of this region of the genome to the prediction of prostate cancer in men of African ancestry. The techniques described in this study can be used, in the future, to investigate local ancestry effects across the genome. Lastly, the dataset used in this analysis may only reflect a fraction of the diversity that is present in men of African genetic ancestry. As such, we are most likely under-estimating the variation in real-world performance of PHS46+African in this population.
In conclusion, we used local PC analysis to identify axes of variation within the 8q24 SNP window that were strongly associated with the performance of PHS46+African. Mapping our dataset onto these axes revealed that PHS46+African may meaningfully underperform in certain individuals of African genetic ancestry. Investigation into the origins of both high- and low-performing groups can be used to generate a model that is more equitable in performance across subpopulations of men with African ancestry.
Supplementary Material
Funding
This study was funded in part by grants from the University of California (#C21CR2060), the United States National Institute of Health/National Institute of Biomedical Imaging and Bioengineering (#K08EB026503), the Research Council of Norway (#223273), KG Jebsen Stiftelsen, and South East Norway Health Authority.
Funding for the PRACTICAL consortium member studies is detailed in the Appendix A2.
The content is solely the responsibility of the authors and does not necessarily represent the official views of any of the funding agencies, who had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Conflict of Interest:
All authors declare no support from any organization for the submitted work except as follows: AMD and TMS report a research grant from the US Department of Defense. OAA reports research grants from KG Jebsen Stiftelsen, Research Council of Norway, and South East Norway Health Authority.
Authors declare no financial relationships with any organizations that might have an interest in the submitted work in the previous three years except as follows, with all of these relationships outside the present study: TMS reports honoraria from Multimodal Imaging Services Corporation for imaging segmentation, honoraria from WebMD, Inc. for educational content, as well as a past research grant from Varian Medical Systems. OAA reports speaker honoraria from Lundbeck.
Authors declare no other relationships or activities that could appear to have influenced the submitted work except as follows: OAA has a patent application # U.S. 20150356243 pending; AMD also applied for this patent application and assigned it to UC San Diego. AMD has additional disclosures outside the present work: founder, equity holder, and advisory board member for CorTechs Labs, Inc.; founder and equity holder in HealthLytix, Inc., advisory board member of Human Longevity, Inc.; recipient of nonfinancial research support from General Electric Healthcare. OAA is a consultant for HealthLytix, Inc.
Additional acknowledgments for the PRACTICAL consortium and contributing studies are described in the Appendix A3
Footnotes
Ethics Statement
The present analyses used de-identified data from the PRACTICAL consortium and have been approved by the Institutional Review Board at the University of California San Diego. All contributing studies were approved by the relevant ethics committees and performed in accordance with the Declaration of Helsinki.
Additional consortia memebers, associated funding, and contact information are provided in the Supplementary Data Appendix 1,2, and 3.
Data Availability Statement
The data used in this work were obtained from the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL) consortium.
Readers who are interested in accessing the data must first submit a proposal to the Data Access Committee. If the reader is not a member of the consortium, their concept form must be sponsored by a principal investigator (PI) of one of the PRACTICAL consortium member studies. If approved by the Data Access Committee, PIs within the consortium, each of whom retains ownership of their data submitted to the consortium, can then choose to participate in the specific proposal. In addition, portions of the data are available for request from dbGaP (database of Genotypes and Phenotypes) which is maintained by the National Center for Biotechnology Information (NCBI): https://www.ncbi.nlm.nih.gov/gap/?term=Icogs+prostatehttps://www.ncbi.nlm.nih.gov/gap/?term=Icogs+prostate.
Anyone can apply to join the consortium. The eligibility requirements are listed here: http://practical.icr.ac.uk/blog/?page_id=9. Joining the consortium would not guarantee access, as a proposal for access would still be submitted to the Data Access Committee, but there would be no need for a separate member sponsor. Readers may find information about application by using the contact information below:
Rosalind Eeles
Principal Investigator for PRACTICAL
Professor of Oncogenetics
Institute of Cancer Research (ICR)
Sutton, UK
PRACTICAL@icr.ac.uk
URL: http://practical.icr.ac.uk
Tel: ++44 (0)20 8722 4094
References
- 1.Seibert TM, Fan CC, Wang Y, Zuber V, Karunamuni R, Parsons JK et al. Polygenic hazard score to guide screening for aggressive prostate cancer: Development and validation in large scale cohorts. BMJ 2018; 360: 1–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Huynh-Le M-P, Fan CC, Karunamuni R, Walsh EI, Turner EL, Lane JA et al. A genetic risk score to personalize prostate cancer screening, applied to population data. Cancer Epidemiol Biomarkers Prev 2020; : cebp.1527.2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Pashayan N, Duffy SW, Chowdhury S, Dent T, Burton H, Neal DE et al. Polygenic susceptibility to prostate and breast cancer: Implications for personalised screening. Br J Cancer 2011; 104: 1656–1663. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Witte JS. Personalized prostate cancer screening: Improving PSA tests with genomic information. Sci Transl Med 2010; 2: 1–5. [DOI] [PubMed] [Google Scholar]
- 5.Chen H, Liu X, Brendler CB, Ankerst DP, Leach RJ, Goodman PJ et al. Adding genetic risk score to family history identifies twice as many high-risk men for prostate cancer: Results from the prostate cancer prevention trial. Prostate 2016; 76: 1120–1129. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Duncan L, Shen H, Gelaye B, Meijsen J, Ressler K, Feldman M et al. Analysis of polygenic risk score usage and performance in diverse human populations. Nat Commun 2019; 10. doi: 10.1038/s41467-019-11112-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Sirugo G, Williams SM, Tishkoff SA. The Missing Diversity in Human Genetic Studies. Cell 2019; 177: 26–31. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.DeSantis CE, Miller KD, Goding Sauer A, Jemal A, Siegel RL. Cancer statistics for African Americans, 2019. CA Cancer J Clin 2019; 69: 211–233. [DOI] [PubMed] [Google Scholar]
- 9.Huynh-Le MP, Fan CC, Karunamuni R, Thompson WK, Martinez ME, Eeles RA et al. Polygenic hazard score is associated with prostate cancer in multi-ethnic populations. Nat Commun 2021; 12: 1–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Martin AR, Kanai M, Kamatani Y, Okada Y, Neale BM, Daly MJ. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat Genet 2019; 51: 584–591. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Karunamuni RA, Huynh-Le M-P, Fan CC, Thompson W, Eeles RA, Kote-Jarai Z et al. African-specific improvement of a polygenic hazard score for age at diagnosis of prostate cancer. Int J cancer 2020. doi: 10.1002/ijc.33282. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Mathias RA, Taub MA, Gignoux CR, Fu W, Musharoff S, O’Connor TD et al. A continuum of admixture in the Western Hemisphere revealed by the African Diaspora genome. Nat Commun 2016; 7. doi: 10.1038/ncomms12522. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Li H, Ralph P. Local PCA Shows How the Effect of Population. Genetics 2019; 211: 289–304. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Todesco M, Owens GL, Bercovich N, Légaré JS, Soudi S, Burge DO et al. Massive haplotypes underlie ecotypic differentiation in sunflowers. Nature 2020; 584: 602–607. [DOI] [PubMed] [Google Scholar]
- 15.Amos CI, Dennis J, Wang Z, Byun J, Schumacher FR, Gayther SA et al. The oncoarray consortium: A network for understanding the genetic architecture of common cancers. Cancer Epidemiol Biomarkers Prev 2017; 26: 126–135. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Infinium OncoArray-500K BeadChip | Research cancer predisposition and risk. https://www.illumina.com/products/by-type/microarray-kits/infinium-oncoarray-500k.html (accessed 21 Jul2020).
- 17.Li Y, Byun J, Cai G, Xiao X, Han Y, Cornelis O et al. FastPop: A rapid principal component derived method to infer intercontinental ancestry using genetic data. BMC Bioinformatics 2016; 17: 1–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Auton A, Abecasis GR, Altshuler DM, Durbin RM, Bentley DR, Chakravarti A et al. A global reference for human genetic variation. Nature 2015; 526: 68–74. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Karunamuni RA, Huynh-Le MP, Fan CC, Eeles RA, Easton DF, Kote-Jarai ZsS et al. The effect of sample size on polygenic hazard models for prostate cancer. Eur J Hum Genet 2020. doi: 10.1038/s41431-020-0664-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Heller G A measure of explained risk in the proportional hazards model. Biostatistics 2012; 13: 315–325. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.