Summary
It has been 15 years since the advent of the genome-wide association study (GWAS) era. Here, we review how this experimental design has realized its promise by facilitating an impressive range of discoveries with remarkable impact on multiple fields, including population genetics, complex trait genetics, epidemiology, social science, and medicine. We predict that the emergence of large-scale biobanks will continue to expand to more diverse populations and capture more of the allele frequency spectrum through whole-genome sequencing, which will further improve our ability to investigate the causes and consequences of human genetic variation for complex traits and diseases.
We review how GWAS facilitated an impressive range of discoveries impacting multiple fields, including epidemiology, social science, and medicine. We predict that GWAS will continue to expand to more diverse populations and rarer variants, further improving our investigation into causes and consequences of human genetic variation for complex traits.
Introduction
The human genome project can have more than one reward. In addition to sequencing the entire human genome, it can lead to identification of polymorphisms for all the genes in the human genome and the diseases to which they contribute.—Risch and Merikangas (1996)1
The fundamental promise of the genome-wide association study (GWAS) experimental design was that polymorphisms could be detected that are associated with disease risk at the population level, thereby explaining a proportion of familial risk. Fifteen years after the first well-designed GWAS by the Wellcome Trust Case Control Consortium (WTCCC),2 we reflect on how much this promise has been fulfilled, not just for disease but for a plethora of complex traits. In previous reviews,3,4 we discussed the criticism and perceived failure of GWAS and made several predictions for future discoveries and use of GWAS data. The initial skepticism of GWAS has largely diminished because of the overwhelming empirical evidence of its success. Given the explosion of GWAS discovery and applications across multiple disciplines, we cannot perform an exhaustive review of all relevant literature in the permitted space. Instead, we have focused on major developments in the last 5 years, revisiting past predictions and looking to the future of GWAS.
Bigger is better
Large sample sizes have been the primary foundation for continued and increased discoveries from GWAS. Over the past 5 years, the average sample size per publication has more than tripled, substantially increasing the number of significant associations (Figure 1). Several GWASs, including those for height,5 smoking initiation,6 educational attainment,7,8 and blood pressure,9 have surpassed the symbolic threshold of one million participants, partly through contributions of many smaller cohorts that have increased in number and size, but mostly because of large numbers of samples from large-scale biobanks and the company 23andMe, Inc. Among the biobanks, the UK Biobank (with genotype data of ∼500,000 deeply phenotyped participants) has played a leading role,10 alongside other significant initiatives worldwide such as the pioneering deCODE Genetics,11 the Estonian Biobank,12 Biobank Japan,13 China Kadoorie Biobank,14 FinnGen in Finland,15 Lifelines in the Netherlands,16 the Million Veteran Program in the USA,17 and, more recently, the All of Us Research Program, also in the USA.18
Figure 1.
Average sample size and average number of genome-wide significant (GWS) loci per publication for each year during the 15-year history of GWAS discoveries
The data were extracted from 5,771 GWAS publications that used a genome-wide genotyping array and shared their summary statistics on GWAS Catalog before November 8, 2022.
Most national biobanks attempt to sample a reasonable representation of people living in their respective countries and are therefore not specifically enriched for any one trait or disease. Consequently, even though the sample sizes in biobanks are large (e.g., 500,000), diseases with a lifetime risk of, say, 1% will result in approximately “only” 5,000 cases. Hence, there is a continued need for contributions of the often smaller but more specialized studies to disorder-specific research consortia such as the Psychiatric Genomics Consortium (PGC)19 for psychiatric diseases and the CARDIoGRAMplusC4D consortium20 for heart disease. However, given the growth of well-phenotyped large biobanks in many countries and the large contribution to sample size from 23andMe, Inc., the rationale for consortia that combine many small samples for commonly measured quantitative traits is much diminished. Compared to meta-analyses of multiple cohorts, and apart from their sheer size, large-scale biobanks improve the statistical power of GWAS in two ways: first, by harmonizing phenotype definitions and minimizing batch effects, and second, by revealing and exploiting information contained in phenotypes of (close) relatives also present in the biobanks, even if those close relatives are not genotyped.21 The latter aspect of biobanks has led to the development of novel statistical methods that exploit this untapped information to improve power.21,22,23 To accommodate the increasing sample sizes, statistical approaches have been devised that are computationally orders of magnitude faster and scale to cohorts of millions of individuals.24,25
Polygenic predictors have come of age
Marker loci associated with highly significant additive effects on the character can be included in a net molecular score, m, which for any individual is the sum of the additive effects on the character associated with these markers.—Lande and Thompson (1990)26
From its foundation in agricultural genetics,27 one of the earlier promises from the human genome project and the GWAS design has been the ability to predict the genetic predisposition of heritable traits in humans.28,29,30 It is now well established that results from GWAS can be used to make predictions about diseases and other traits in individuals where those traits have not (yet) been observed. For the majority of common diseases, such predictions will never become diagnostic because their accuracy is limited by the heritability of the trait, by how much of that heritability is captured by the genome technology (e.g., common SNP GWAS, GWAS-by-WES [whole-exome sequencing], GWAS-by-WGS [whole genome sequencing]), and by how well the effects of individual SNPs are estimated. The precision of the estimation of SNP effects is limited by the sampling scheme and sample size of the discovery data and the statistical methods that are used (e.g., using only genome-wide significant loci versus methods that use all data and univariate versus multivariate methods). Nevertheless, the large increases in sample sizes combined with advanced multivariate analysis methods have led to such increases in accuracy that polygenic predictors have become an important research tool across disciplines.31,32 Furthermore, they are becoming ripe for clinical trials as a result of their increased ability to improve screening algorithms that aid in, e.g., the identification of individuals at risk for disease or patients that benefit more from certain medical therapies.33,34,35
One way to quantify the accuracy of a polygenic score is as an “effect size” ($\beta$), which expresses the change in phenotypic standard deviations (SDs) per SD of the predictor ($\beta = \sqrt{R^2}\,\sigma_P$, with $R^2$ the proportion of phenotypic variance explained by the polygenic score and $\sigma_P$ the SD of the phenotype). For example, a polygenic score with an $R^2$ = 0.09 has an effect size of 0.3 phenotypic SD, about 2 cm for height, 5 mmHg for systolic blood pressure, or 1 year of schooling. In Figure 2, we show how the prediction accuracy of height has increased since 2010. It demonstrates how ever-larger sample sizes lead to increasing effect sizes from 2.2 cm in 2010 to more than 4.1 cm in 2022, assuming that $\sigma_P$ = 6.5 cm for height. By expressing polygenic score prediction accuracy in terms of trait SD units, it can be compared to the effect sizes of exposures, treatments, and interventions. This has been applied to show that effect sizes (expressed as risk) of common disease polygenic scores are of the same order as those of known monogenic mutations.36 The larger the effect sizes of polygenic scores, the better they are at identifying people at very high (and very low) risk of disease. For example, using the latest height GWAS, the mean height difference between individuals at the extremes of polygenic score distribution is ∼23 cm (2.5 SD below the mean polygenic score versus 2.5 SD above the mean, Figure 2D). In general, more or earlier screening of people at high risk would pay off if there are preventive treatments.37 For example, Kiflen et al.38 determined optimal health-economic strategies for prescribing statins on the basis of individuals’ polygenic risk of cardiovascular disease.
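The conversion from variance explained to an effect size in trait units can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the effect size per SD of the polygenic score equals the square root of the variance explained times the phenotypic SD (which reproduces the 0.3 SD and roughly 2 cm figures quoted above), and it assumes a normally distributed polygenic score when comparing the extreme groups. All function names are hypothetical.

```python
import math

def pgs_effect_size(r2, sd_pheno=1.0):
    """Change in the phenotype (trait units) per SD of the polygenic score."""
    if not 0.0 <= r2 <= 1.0:
        raise ValueError("r2 must lie in [0, 1]")
    return math.sqrt(r2) * sd_pheno

def mean_beyond(z):
    """Mean of a standard normal variable, given that it exceeds z."""
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    upper_tail = 0.5 * math.erfc(z / math.sqrt(2.0))
    return pdf / upper_tail

def extreme_group_difference(effect_per_sd, z=2.5):
    """Expected trait difference between the groups above +z and below -z."""
    return 2.0 * mean_beyond(z) * effect_per_sd

# R^2 = 0.09 with an assumed phenotypic SD of 6.5 cm for height.
height_effect_cm = pgs_effect_size(0.09, sd_pheno=6.5)

# Extreme groups for a score with an effect of 4.1 cm per SD.
tail_difference_cm = extreme_group_difference(4.1)
```

Under these assumptions, `height_effect_cm` is about 1.95 cm, and `tail_difference_cm` is about 23 cm because the average score in each extreme group lies further from the mean than the 2.5 SD cut-off itself.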
Figure 2.
Effect sizes of polygenic scores increase with sample size
(A–D) Each panel corresponds to one of four height polygenic scores derived from independent genome-wide significant SNPs identified in Lango-Allen et al. (2010)39 (A), Wood et al. (2014)40 (B), Yengo et al. (2018)41 (C), and Yengo et al. (2022)5 (D). Note the difference between the panels in the scale of the y axes on the right, indicating the increasing precision of the height polygenic scores as the discovery sample sizes increase. Each polygenic score is scaled to have a mean of 0 and a variance of 1. Error bars indicate standard errors of the mean. (A), (B), and (D) use data from 14,587 unrelated participants of the UK Biobank (not included in the discovery GWAS), while (C) uses data from 8,235 unrelated participants from the Health and Retirement Study not included in Yengo et al. (2018). The number of SNPs used in each polygenic score is reported in the legend of each panel (top-left); for Lango-Allen et al. (2010) and Wood et al. (2014), the SNP sets were based on a reanalysis by Yengo et al. (2022) using the HapMap 3 SNP panel. Each polygenic score was binned into 12 groups defined as: below −2.5, (−2.5,−2.0), (−2.0,−1.5), (−1.5,−1.0), (−1.0,−0.5), (−0.5,0.0), (0.0,0.5), (0.5,1.0), (1.0,1.5), (1.5,2.0), (2.0,2.5), and above 2.5. Height differences are expressed on the z axis against the lowest group (defined). Each panel represents a histogram of the height polygenic score (x axis) with the percentage of the individuals in each group represented on the y axis.
Polygenic predictors can only capture phenotypic variation that is associated with genetic factors. Therefore, they can never predict phenotypes with full accuracy, nor are they sufficiently diagnostic. For a quantitative trait, the SD of the outcome around its prediction from a polygenic score can be expressed as $\sigma_P\sqrt{1-R^2}$, with an upper bound of $R^2$ defined as a function of the total heritability ($h^2$). For example, for the hypothetical case that all genetic factors for height (assuming $h^2$ = 0.8) are identified and their effect sizes estimated without error, the variation in actual height around its predicted value from the polygenic score would have an SD of $\sigma_P\sqrt{1-h^2} \approx 2.9$ cm, equivalent to a 95% confidence interval of about 12 cm.
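The residual uncertainty around a polygenic prediction can be checked numerically. The sketch below assumes the SD of the outcome around its prediction is the phenotypic SD times the square root of one minus the variance explained, uses a phenotypic SD of 6.5 cm for height, and assumes normally distributed residuals for the 95% interval. The function names are hypothetical.

```python
import math

def residual_sd(r2, sd_pheno):
    """SD of the phenotype around its polygenic prediction."""
    if not 0.0 <= r2 <= 1.0:
        raise ValueError("r2 must lie in [0, 1]")
    return sd_pheno * math.sqrt(1.0 - r2)

def interval_width_95(r2, sd_pheno):
    """Full width of a 95% interval, assuming normal residuals."""
    return 2.0 * 1.96 * residual_sd(r2, sd_pheno)

# Best case for height: every genetic factor known, so R^2 equals h^2 = 0.8.
best_case_sd_cm = residual_sd(0.8, 6.5)
best_case_width_cm = interval_width_95(0.8, 6.5)
```

With these inputs the residual SD is about 2.9 cm and the interval is roughly 11 to 12 cm wide, so even a perfect polygenic score would leave substantial uncertainty about an individual's height.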
Transferability of GWAS results across populations
Humans across the globe share a common ancestry, which implies no clearly demarcated ancestry groups.42,43 Yet our demographic and cultural history have led to human groups that differ, on average, genetically and in their environment. Genetic differences affect the distribution of allelic variants between such groups and environmental differences can alter their effects within groups on traits and disease liability. Consequently, findings from GWAS conducted in one group are not always transferable to another group. We use transferability here to mean two things: (1) that genetic associations may not replicate in other groups because of reasons other than statistical power and (2), its corollary, that polygenic predictors derived from GWAS performed in one group may underperform when applied in other groups because of reasons other than statistical power.
Issues related to GWAS transferability have been at the core of recent developments in GWAS research. First, many studies have quantified the lack of transferability of GWAS findings from European ancestry to other populations,44,45,46 which is explained by the over-representation of European-ancestry participants in GWAS.47 Second, a growing number of GWASs are conducted in populations with non-European ancestries, mostly driven by efforts from large-scale biobanks (N > 100,000) in East Asia such as Biobank Japan,13 China Kadoorie Biobank,14 and the Taiwan Biobank.48,49 Third, novel statistical methods are being developed (and extended) with the main purpose of improving transferability.50
The lack of transferability of GWAS findings between human groups is explained by a combination of factors including differences in haplotype frequencies (e.g., caused by differences in linkage disequilibrium between marker and causal variants) and effect sizes (e.g., caused by gene-by-gene or gene-by-environment interactions or gene-environment correlations). The relative importance of these factors varies across traits. For example, Wang et al.51 have shown theoretically and through simulations that between 25% and 80% of the loss of accuracy of polygenic predictors could be explained by differences in haplotype frequencies between European and Asian or African populations. Therefore, it remains to be investigated to what extent, and for which traits, environmental or cultural differences between populations play a role. There are indications that these contribute to population differences in GWAS signals for mental health outcomes. For example, the genetic correlation for major depression susceptibility between East-Asian and European ancestry populations was estimated at only ∼0.4, with BMI, coronary artery disease, and type 2 diabetes showing a positive genetic correlation with major depression in European ancestry individuals and a negative genetic correlation with major depression in East Asian ancestry individuals.52 These observations are consistent with the hypothesis that cultural differences affect which traits lead to depression in different populations. To interpret the magnitude of those estimated genetic correlations between ancestries, they should be benchmarked against genetic correlations observed, within-ancestry, across multiple studies and cohorts. 
Indeed, for major depression, the genetic correlations between cohorts with participants from the same ancestry group were on average ∼0.76,53 implying that the accuracy of a polygenic score derived in one cohort will be attenuated by 42% ($1 - 0.76^2$) when applied in another cohort to individuals with the same ancestry.
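The attenuation figure above follows from squaring the genetic correlation. A minimal sketch, assuming that the variance explained by a polygenic score in a target cohort scales with the square of the genetic correlation between the discovery and target cohorts and that nothing else differs between them (the function name is hypothetical):

```python
def accuracy_attenuation(genetic_correlation):
    """Fraction of polygenic score accuracy (R^2) lost between cohorts."""
    if not -1.0 <= genetic_correlation <= 1.0:
        raise ValueError("genetic correlation must lie in [-1, 1]")
    return 1.0 - genetic_correlation ** 2

# Within-ancestry, between-cohort genetic correlation of about 0.76.
loss_within_ancestry = accuracy_attenuation(0.76)
```

For a genetic correlation of 0.76 the loss is about 0.42, matching the 42% quoted in the text. Between ancestries, differences in LD and allele frequencies reduce accuracy further, so this relation alone would understate the loss there.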
Old questions addressed by new data and new analytical methods
In general, the hypothesis of cumulative Mendelian factors seems to fit the facts very accurately.—Fisher (1918)54
For over a century, researchers have asked and theoretically tried to address questions about nature versus nurture, the genetic architecture of complex traits, the effect of natural selection on genetic variation between and within populations, mate choice, and indirect (associative) genetic effects.55,56 GWAS datasets have now provided the means to empirically test previously proposed hypotheses and estimate parameters that are fundamental in evolutionary, population, and quantitative genetics.
Fine-tuning the genetic architecture of complex traits
The joint distribution of frequencies and effect sizes of variants causing a trait or disease is commonly referred to as its genetic architecture. GWASs are typically well powered to capture the effects of relatively common genetic variation and, when large enough, can fully map where this genetic variation is located on the genome. This has recently been achieved for height through a GWAS involving over 5 million individuals from multiple ancestries.5 This unprecedentedly large study showed a saturation of the common-variant architecture among European-ancestry genomes, whereby approximately 12,000 SNPs jointly explain 40% of variation in out-of-sample prediction, which approaches the common SNP-based heritability. Despite the high polygenicity, only ∼21% of the genome appears to harbor common (MAF > 1%) genetic variation for height.5 Height continues to be the workhorse of human genetics because it is easily measured (a self-report is accurate), often recorded in medical or health questionnaires, and has a high heritability. A century ago, height served as the model complex trait when Mendel’s laws of inheritance were reconciled with the inheritance of quantitative traits.57
Rare variants can explain a substantial fraction of the heritability and have different properties than common variants, at least in part as a result of natural selection. They generally have larger effects and behave differently in their relationship with ancestry,58 geography,59 and therefore potentially also with respect to their association with environmental effects. The vast majority of human variants are rare; among 400 million detected variants in 53,831 sequenced individuals from the TOPMed project, ∼97% had a minor allele frequency (MAF) of <1%, and 46% were singletons,60 and among 643 million variants detected in 149,960 sequenced individuals from UK Biobank, 92%–97% had MAF < 0.1% and 40%–46% were singletons.61 Most of these variants will have little or no impact on disease risk, and it is harder to identify those that do in a GWAS design, as that would require exceptionally large sample sizes (Figure 3). Rare variants have therefore often been tested as a group (e.g., with a burden test) rather than individually.62
Figure 3.
Proportional increase in sample size (RN(p)) relative to common variant GWAS (pREF = ½) required for detecting rare variant associations with GWAS-by-WGS
(A and B) (A) and (B) show RN(p) as a function of the frequency p (varied between 0.01% and 1%) and the parameter S (varied between −1 and 0), respectively.
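The caption does not state the formula behind RN(p). One way to reason about such a curve, offered here only as an assumed model and not as the authors' derivation, is to let the expected squared effect of a variant scale with heterozygosity as [2p(1 − p)]^S, so that the variance it explains scales as [2p(1 − p)]^(1 + S), and to take the required sample size as inversely proportional to that variance. The function names are hypothetical.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) under Hardy-Weinberg equilibrium."""
    if not 0.0 < p < 1.0:
        raise ValueError("allele frequency must lie strictly between 0 and 1")
    return 2.0 * p * (1.0 - p)

def relative_sample_size(p, s, p_ref=0.5):
    """Sample size needed at frequency p relative to a variant at p_ref.

    Assumed model: variance explained is proportional to
    heterozygosity ** (1 + s), and required N is proportional to its inverse.
    """
    return (heterozygosity(p_ref) / heterozygosity(p)) ** (1.0 + s)

# S = 0: effect sizes do not depend on frequency, so rare variants are costly.
ratio_neutral = relative_sample_size(0.0001, 0.0)

# S = -1: rarer variants have larger effects and explain the same variance.
ratio_strong_selection = relative_sample_size(0.0001, -1.0)
```

Under this assumed model, a variant with a frequency of 0.01% needs about 2,500 times the sample size of a variant at 50% when S = 0, and the same sample size when S = −1, which illustrates why the strength of negative selection matters for the design of rare-variant studies.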
In a subset of 25,465 TOPMed participants, rare variants accounted for much of the missing heritability for height (but less so for BMI), especially protein-altering variants in low linkage disequilibrium (LD) with other variants.63 A GWAS-by-WES with a sample size of 640,000 reported evidence of 16 genes that were significantly associated with BMI via a burden test,64 whereas a GWAS for the same trait on a similar sample size identified more than 500 distinct loci.41 The number of detected genes or loci, however, is not necessarily a good indication of the utility of GWAS-by-WES discoveries because in a GWAS-by-WES the target gene is identified, which could, for example, lead to faster translation to new therapeutics. In another large GWAS-by-WES—24,248 individuals with schizophrenia versus 97,322 controls—ten genes were identified that increase schizophrenia risk when affected by rare variants,65 of which four were also identified in a common variant GWAS.66 In another larger study of exome sequence data, where associations between ∼12 million coding variants and 3,994 health-related traits were investigated, there was a significant enrichment of rare variant associations in loci from GWASs, although most of these associations (∼91%) were independent of common variant signals, underscoring the value of rare variants for providing additional evidence for implicated genes.67
Inference on natural selection
Across nearly all complex traits that have been studied, there is evidence for negative selection, in that alleles with larger effect sizes are maintained at lower frequency.68,69,70 Another notable manifestation of negative selection is that it creates LD-dependent genetic architectures. For example, SNPs within genomic regions with lower (than average) levels of LD tend to explain more heritability.71 However, the relationship between LD and per-SNP contribution to heritability is nonlinear and can vary according to alleles’ ages and selection coefficients (s). Depending on how the relationship between the focal trait and fitness is modeled, estimates of selection coefficients (s) at trait-associated loci range from $10^{-3}$ to $3 \times 10^{-5}$.72,73 These negative selection pressures have shaped the genome-wide heritability distribution for most complex traits by flattening the distribution of genetic effects, meaning that the effective number of trait-associated loci is larger than it would be under neutrality.74
Besides the major discoveries about Homo species made through the new discipline of paleogenomics, ancient DNA genomics has also proven useful for inferring past selection pressures, for instance those on increased stature in ancestors of Bronze Age European ancestry populations.61 The ability to infer height from skeletal remains has also made height the first trait for which a polygenic predictor computed from ancient DNA significantly predicts phenotypic variation in prehistoric samples.75,76 Polygenic adaptation on height over the past 2,000 to 3,000 years has also been inferred with GWAS data,77 although recent studies have shown weaker evidence of positive selection on height than previously reported.77,78,79,80
Assortative mating in the genome era
Assortative mating, i.e., mate choice driven by trait similarity, is a prevalent behavior across species.81 In humans, evidence of assortative mating mostly comes from observed phenotypic resemblance between spouses across many traits and diseases.82 Disentangling the causes of this resemblance (mate choice versus shared environment) has long remained a challenge, often requiring complex experimental designs to be resolved. Recently, GWAS data have fostered new solutions to this problem by leveraging the fact that assortative mating induces widespread correlations between trait-increasing alleles among genomes. This property of assortative mating, already predicted in Fisher’s seminal 1918 paper,54 has led to the development of various methods to quantify assortative mating in different contexts. For example, when spouse pairs are available, assortative mating can be quantified by estimating the correlation of polygenic predictors between spouses.83 Other methods proceed by quantifying (in unrelated individuals) the correlation between polygenic predictors calculated from either odd- or even-numbered chromosomes,84 or by modeling the excess phenotypic (or genetic) similarities between relatives that is expected under assortative mating as opposed to random mating.85 Overall, GWAS-derived studies of assortative mating have provided genetic evidence that height and intelligence similarity between spouses is caused by mate choice, while the similarity in the number of years of education is likely to be (partly) driven by indirect assortment involving other traits genetically correlated with educational attainment.83 Despite numerous previous attempts and suggestive evidence,84,86,87 how much of the phenotypic similarity between spouses in their susceptibility to neurological and psychiatric disorders88 is due to mate choice remains an open question.89
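The odd- versus even-chromosome idea can be sketched as follows. Under random mating, alleles on different chromosomes segregate independently, so partial polygenic scores built from odd- and even-numbered chromosomes should be uncorrelated; a positive correlation in unrelated individuals is a signature of assortative mating. The data layout and function names below are hypothetical, and the published method additionally handles population stratification and sampling error, which this sketch omits.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def odd_even_correlation(per_chrom_scores):
    """Correlate odd- and even-chromosome partial polygenic scores.

    per_chrom_scores: one dict per individual, mapping an autosome number
    (1-22) to that chromosome's contribution to the polygenic score.
    """
    odd = [sum(v for c, v in ind.items() if c % 2 == 1)
           for ind in per_chrom_scores]
    even = [sum(v for c, v in ind.items() if c % 2 == 0)
            for ind in per_chrom_scores]
    return pearson(odd, even)

# Toy input: three individuals, two chromosomes each (illustrative values).
toy = [{1: 0.2, 2: 0.1}, {1: -0.4, 2: -0.3}, {1: 0.5, 2: 0.6}]
r_odd_even = odd_even_correlation(toy)
```

In real data the per-chromosome contributions would come from summing allele counts weighted by GWAS effect estimates, and the correlation would be tested against zero across many unrelated individuals.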
Detection and quantification of gene-environment correlations
Gene-environment correlations have long been hypothesized to affect estimates of genetic variance components in twin and family studies.90 The GWAS era provides new data and approaches to detect and quantify their effects on complex traits. Effects of the parental rearing environment on offspring health were detected through associations between educational attainment polygenic scores constructed from non-transmitted parental alleles and a variety of offspring health outcomes.91 The inflation of GWAS effect estimates resulting from these gene-environment correlations and assortative mating can be mitigated with family-based GWAS designs, where members of the same family (e.g., siblings) are compared. Out of 25 complex traits investigated, within-family GWASs show the strongest reductions in SNP-based heritability estimates for educational attainment (∼76% reduction), cognitive ability (44% reduction), ever smoking (25% reduction), and height (17% reduction).77 The predictive value of polygenic scores for educational attainment also decreases by about half when predicting education or other complex traits within families or in adopted children.8,92,93
The polygenic score for educational attainment seems to stand out from other traits in the way that it is affected by assortative mating and gene-environment correlations. Out of 33 polygenic scores analyzed, that of educational attainment showed the strongest differences between geographic regions within Great Britain, most likely driven by socio-economic status (SES)-related migration.94 These regional differences could further induce assortative mating as well as gene-environment correlations that extend beyond families, increasing the genetic correlation with educational attainment and income for a wide range of physical and mental health outcomes.95 Population-based GWAS on indicators of SES, such as educational attainment (now at N = ∼3 million),8 thus seem to capture a bundle of underlying traits that are associated with socio-economic success in the modern world, and thus with getting exposed to more advantageous or detrimental (social) environments.94,95,96,97 Genetic data will be useful in further teasing out the causal relationships between all of these factors.
Gene-environment correlations and assortative mating both lead to a difference in genetic variation and GWAS effect sizes relative to a hypothetical population in which there is random mating and no association between genes and environments. Whether this means that estimates of genetic effects and genetic variances are “biased” if such between-family or between-region effects are present and ignored depends on one’s perspective and interests. For example, the within-family genetic segregation variance (which is 50% of the genetic variance under random mating) best explains genetic and phenotypic differences between siblings, yet differences between families are better explained by considering the extra variation induced by assortative mating, gene-environment correlations, and population stratification. If the goal is to maximize the predictive power of polygenic scores under the same environmental conditions as the discovery dataset, we would not have to account for gene-environment correlations or assortative mating. However, if we want to understand the mechanisms behind the genetic effects or use them to make causal inferences, these factors would have to be taken into account.
Novel mendelian randomization methods and applications
Mendelian randomization (MR) is a statistical method that attempts to infer exposure-outcome causality by mimicking a randomized controlled trial through the use of genetic variants as instrumental variables (a popular technique in the econometrics literature).98 The power and interpretability of MR are dependent on the robustness and relevance of the instruments in the causal models, which have substantially improved with larger GWASs. Over recent years, the number of genome-wide significant variants that can serve as instrumental variables has increased and the arrival of the large biobank samples has made it more feasible to conduct (one-sample) MR. At the same time, because MR rests on partly unverifiable assumptions, a broad variety of new analytical methods have been developed, each with their own strengths and weaknesses, including weighted median regression,99 weighted mode regression,100 MR-Egger,101 generalized summary-based MR,102 MR-PRESSO,103 GSMR,104 Steiger filtering,105 latent causal variable modeling,106 MR-CAUSE,107 and multivariable MR.108 Well-executed MR studies use several of the above-mentioned methods, with stronger evidence for causal relationships when findings are consistent across these different methods. An interesting example of MR answering long-standing questions is the relationship between myopia and schooling (e.g., reading time). The connection between myopia and schooling was impossible to test with a randomized controlled trial because it would be unethical to keep children out of school. Once powerful GWASs for both educational attainment and myopia were available, MR revealed evidence for a causal influence of higher educational attainment on myopia and not the other way around.109 The unique potential of GWAS data for causal inference has clinical value and can contribute to disease prevention by identifying modifiable non-genetic risk factors.
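To make the instrumental-variable logic concrete, the sketch below implements the basic fixed-effect inverse-variance weighted (IVW) estimator from GWAS summary statistics. IVW is not one of the methods listed above; it is shown here because it is the simplest baseline that the more robust methods extend. The sketch assumes independent instruments, no horizontal pleiotropy, and negligible error in the SNP-exposure estimates; the function names and the input numbers are hypothetical.

```python
import math

def wald_ratio(beta_exposure, beta_outcome):
    """Causal effect estimate from one instrument: SNP-outcome / SNP-exposure."""
    return beta_outcome / beta_exposure

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate and its standard error.

    Each instrument's Wald ratio is weighted by beta_exposure**2 / se_outcome**2.
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [wald_ratio(bx, by) for bx, by in zip(beta_exposure, beta_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return estimate, standard_error

# Illustrative summary statistics for three independent instruments.
bx = [0.10, 0.20, 0.30]    # SNP effects on the exposure
by = [0.05, 0.10, 0.15]    # SNP effects on the outcome
se = [0.01, 0.01, 0.02]    # standard errors of the SNP-outcome effects
causal_effect, causal_se = ivw_estimate(bx, by, se)
```

Because every instrument in this toy example has the same ratio of outcome effect to exposure effect, the IVW estimate equals that ratio (0.5). In practice the ratios differ, and heterogeneity between them is one of the signals that pleiotropy-robust methods examine.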
GWAS in and around the clinic
Many clinically actionable biological insights have followed from the identification of genetic variants that influence disease risk. Rare genetic variants with large effects have so far been most clinically viable through widespread translational applications for monogenic disorders.110 For complex traits, there is a longer trajectory between detecting genetic associations and mechanistic insights. There are several reasons for this, including (1) the lower penetrance of individual variants; (2) the limits that LD structure imposes on the resolution of the genome; (3) noncoding regions harboring most GWAS associations, which makes it more challenging to link significantly associated variants to specific genic actions; and (4) gene-environment correlations and widespread pleiotropy, which complicate the identification of the specific trait that makes the genetic variant increase disease risk. Nevertheless, notable progress has been made for a variety of complex traits in a relatively short time frame. Since our last review, more insights into functional mechanisms have followed from GWAS associations, which include the effects of increased production of the vasoconstrictor peptide ET-1 in endothelial cells by EDN1 in a range of vascular diseases,111 the (sex-specific) effects of reduced KLF14 expression in adipose tissue in type 2 diabetes,112 and the cellular alterations resulting from APOE4 variant expression in neurons, astrocytes, and microglia in Alzheimer’s disease.113 COVID-19, however, has been a particularly telling example of the potential of GWAS to accelerate the unraveling of biological mechanisms with substantial clinical impact.
COVID-19
The urgency to combat a global pandemic has pushed COVID-19 research to make more progress in ∼2 years than has been made for any other disease since the advent of the GWAS design. Infectious diseases are typically studied by focusing on the pathogen rather than the host, but GWAS initiatives have helped identify human genetic variants that led to compelling insights into the susceptibility to SARS-CoV-2 infection and COVID-19 disease severity. The virus enters the body through ACE2 receptors. A recent GWAS has identified a relatively rare variant 60 bp upstream of ACE2 (rs190509934, MAF < 2%) that reduced the risk of infection by ∼40%.114 Follow-up analysis of RNA-sequencing data from liver tissue showed this variant to downregulate ACE2 expression by ∼37%.114 The ACE2 receptor interacts with the proline transporter SIT1, which is encoded by SLC6A20 on chromosome 3, one of the earliest and best replicated associations with SARS-CoV-2 infection.114,115,116 These associations confirm the key role of ACE2 in SARS-CoV-2 infection and increase its potential as a therapeutic target for COVID-19 prevention. Once infected, there is a risk of developing severe disease with respiratory failure, which has resulted in more than 6 million deaths worldwide.
Several of the loci associated with disease severity are associated with lung function,115 some of which are implicated in pulmonary surfactant biology.117 The associated SFTPD gene,117 for example, encodes surfactant protein D, which is part of the immune response that protects lungs against pathogens such as SARS-CoV-2, where it binds to its S1 spike protein.118 The immune response plays a key role in critical illness as indicated by several other GWAS associations with implicated genes, including the innate antiviral defense (IFNAR2 and OAS genes), which is important early in disease, and inflammatory lung injury (DPP9, CCR2, and TYK2 genes), which is more implicated in later life-threatening symptoms.119 Evidence for a causal link between TYK2 expression and critical illness has nominated baricitinib as a candidate drug, which inhibits TYK2 expression and was originally licensed for the treatment of rheumatoid arthritis and atopic dermatitis.119 A randomized controlled clinical trial showed treatment with baricitinib to reduce the COVID-19 mortality rate by ∼20%.120
Drug repurposing
GWASs have proven to be an effective guide on the long road to drug approval. Two-thirds of FDA-approved drugs were supported with evidence from genetics.121 The long duration and high cost of novel drug development and approval have increased the focus on the quicker and cheaper alternative of repurposing existing drugs. As illustrated in the COVID-19 example above, biological insights from GWASs can be used to identify suitable compounds for drug repurposing. Two drugs originally used for treatment of psoriasis, namely ustekinumab and risankizumab,122,123 target interleukin-23 (IL-23), which activates the IL-23 receptor encoded by IL23R, one of the first and best-replicated genes associated with Crohn disease in GWASs.2,124,125 Several clinical trials have confirmed a significant benefit from ustekinumab126,127,128 and risankizumab129,130 for Crohn disease treatment. FDA approval for treatment of Crohn disease was given in 2016 for ustekinumab and in 2022 for risankizumab. A variety of computational approaches are being applied to reveal more leads for drug repurposing, including GWAS-imputed transcriptomic profile matching,131 Mendelian randomization,132 and drug-gene set analysis.133,134
Polygenic scores
The value of these predictive SNPs could be reaped long before the causal mechanism of each contributing variant can be determined.—Wray et al. (2007)135
Interpreting the mechanisms behind regulatory causal variants will require the Herculean task of systematically annotating their functional impact across cell types and tissues136,137 throughout different external influences and developmental stages.138 In revealing the highly polygenic nature of the mechanisms underlying common diseases, GWASs showed that the aggregate of estimated allelic effects can result in predictive polygenic scores with clinical potential, regardless of how well we understand their biological underpinnings. Current polygenic scores for COVID-19 outcomes are powerful enough to improve the identification of individuals who should be prioritized for COVID-19 vaccinations because of increased risk for severe disease (top 10% polygenic score was associated with up to 1.75-fold increased risk of severe disease).114 A growing number of clinical trials are assessing whether integrating polygenic scores in screening procedures can improve early detection and facilitate personalized risk-based screening,139 for instance for breast cancer,140,141 colorectal cancer,142 and heart disease.33 Communication to individuals about their own cardiovascular disease risk based on polygenic scores has been shown to motivate positive changes in health behavior and the propensity to seek care.143 Our Future Health, which aims to genotype up to 5 million people in the UK, has recently announced a collaboration with the company Genomics PLC to generate polygenic scores that can be used for research purposes as well as for personal feedback to participants that could help them take actions that reduce their risk of common diseases such as diabetes, heart disease, stroke, dementia, and cancer.144 The relative ease and low cost of constructing polygenic scores have already accelerated their implementation in pre-natal polygenic risk screening in IVF treatments for common diseases such as diabetes, heart disease, cancers, Alzheimer disease, and schizophrenia,145,146 despite ongoing discussions about the ethical and practical value of screening embryos.147,148
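To make the idea of an "aggregate of estimated allelic effects" concrete, the following minimal sketch computes a polygenic score as a weighted sum of effect-allele counts. The variant names, effect estimates, and genotypes are hypothetical placeholders, and skipping variants without a genotype is an illustrative choice rather than a recommended practice; real pipelines also handle allele alignment, LD, and shrinkage of effect estimates, none of which is modeled here.

```python
from typing import Mapping


def polygenic_score(dosages: Mapping[str, float],
                    weights: Mapping[str, float]) -> float:
    """Weighted sum of effect-allele dosages (0 to 2 copies per variant).

    Variants that have a weight but no genotype are skipped
    (an assumption made for this illustration only).
    """
    return sum(beta * dosages[snp]
               for snp, beta in weights.items()
               if snp in dosages)


# Hypothetical per-allele effect estimates from a GWAS.
weights = {"snp_a": 0.10, "snp_b": -0.05, "snp_c": 0.20}

# Hypothetical individual: number of effect alleles carried at each variant.
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

score = polygenic_score(person, weights)
print(round(score, 3))
```

Note that the score needs only the estimated effects and the genotypes, not any knowledge of the causal mechanism behind each variant, which is the point made in the text.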
Gene editing
Recently, genetic associations have resulted in another revolutionary development in medicine with a treatment that can directly and permanently affect the genetic sequence of a specific tissue in an individual. After confirming its effect in primates in 2021,149 it was reported in July 2022 that the first human had received a dose of the gene-editing medicine, named VERVE-101, which permanently turns off PCSK9 in the liver, reducing the disease-driving low-density lipoprotein (LDL) cholesterol through a single base change.150 This clinical trial (called heart-1) is set to evaluate VERVE-101 in ∼40 patients with heterozygous familial hypercholesterolemia, a subtype of atherosclerotic cardiovascular disease. More applications of this base editing technology are underway for the treatment of Mendelian forms of disease such as sickle cell disease and β-thalassemia,150 opening up another avenue for genetic associations to prevent and cure diseases. As we continue to refine this groundbreaking technology for safe use in humans on genetic variants with known and large (deleterious) effects, our mechanistic understanding of GWAS associations will continue to advance, potentially paving the way for the direct re-coding of complex traits.
Discussion
In the past 15 years, GWAS discoveries have changed and impacted research across multiple disciplines. Polygenic scores have been suggested for clinical,151 social,152 and even reproductive purposes (e.g., embryo selection).147 The full spectrum of consequences of such applications is hard to predict given that the biological functions of causal variants and the environmental effects captured by the GWAS signals are not fully understood, especially across populations with different genetic backgrounds and environments. As GWAS sample size, ancestry diversity, and coverage of genetic variation increase, we expect to continue to improve our understanding of the pathways between identified variants and complex traits.
Expanding population coverage to enhance discovery and promote equity
One of the most important advancements for the coming years will be the expansion of GWAS data collections to populations across the world. In 2021, ∼86% of GWAS participants were of European ancestry.153 This Euro-centric bias in human genetics and genomics research reflects the fact that past investments and infrastructure development have largely been concentrated in countries with high proportions of European-ancestry individuals. This Euro-centric focus is limiting the ability to study the genetic architecture of traits under different environments and across ancestries. It also limits the accuracy of polygenic prediction across populations and may further hinder the development and utility of new therapeutics across the world. The largest contributions to a more diverse GWAS catalog are coming from East Asia, and further expansions to the rest of the world are of vital importance, both scientifically and ethically. More emphasis on population diversity in data collection will lead to more discovery and better prediction accuracy across all ancestries. While approaches are being devised to improve the predictive power of polygenic scores across ancestries by accounting for LD differences,50 these will not fully solve the lack of transferability due to environmental or cultural differences between populations. The awareness of this problem is growing rapidly in the genetics community,45,153,154,155 and it is now time for the data to catch up. If we had to advise where to construct the next large biobanks, we would recommend starting a pan-African biobank to maximize the genetic variation captured and broaden the range of environmental exposures. Such initiatives could be deployed on the African continent (e.g., the Nigerian 100K Genomes Project) but also in European countries with large diasporas from all corners of Africa (e.g., France or Belgium).
It is important that partnerships behind such projects are equitable in such a way that the local researchers and local communities are engaged as equals while prioritizing their benefit, following the example of, e.g., NeuroGAP.156
By collecting more genotype data across populations, cross-population comparisons of genetic architectures of complex traits can be investigated. Natural selection has the potential to drive genetic mean differences among populations by making them adapt to different environments, and GWAS data can, in principle, be used to quantify such differences. For human height, the correlation of SNP effect sizes at genome-wide significant loci across global ancestry groups is high (ranging between 0.64 and 0.99),5 and polygenic predictors using estimated SNP effects in one population are positively correlated with height in another, although the magnitude of this correlation is (much) reduced compared to the predictor-trait correlation within the same population.5 However, even if the correlation of effect sizes were perfect, predicted mean genetic differences may not translate into phenotypic mean group differences. Effects of causal variants can depend on the environment, both within and between populations. Effect sizes could be smaller in environments where the mean trait value is smaller (a simple scale effect), effect sizes could interact with environmental factors (gene-by-environment interaction), and effect sizes can be correlated with environmental factors (gene-environment correlations). Questions about between-group genetic differences are particularly controversial for traits associated with social outcomes. Differences in traits like educational attainment and intelligence have a history of being (mis)interpreted in the context of deprecated classifications of human populations into a handful of categories (“races”)157 to bolster racial supremacy ideologies. 
They are also more sensitive to gene-environment correlations than other complex traits as a result of systematic socio-economic differences,77,95 which are expected to exist especially between groups who have historically experienced differences in socio-economic opportunities, oppression, or exploitation. Research into the genetic background of population differences in social traits is therefore particularly sensitive to being misunderstood or misused, e.g., to incite hate or influence social policy. Trying to understand whether between-population mean phenotypic differences in complex traits, specifically diseases and their risk factors, are partly driven by genetic differences is likely to become an active area of research in the future. Researchers, including geneticists, need to remain vigilant about the pitfalls of such analyses, on both an analytical and a societal level, and take responsibility to mitigate the misuse and misinterpretation of genomic data in a discriminatory or racist framework.
Polygenic differentiation between populations should be studied with care, firstly because of societal sensitivities mentioned above, but also because of its subtle nature and high sensitivity for misinterpretation and confounding due to systematic allele frequency, LD, and environmental differences.78,79,158 We should also note here that between-population genetic differences in disease prevalence and trait means are not necessarily caused by natural selection. Under a pure genetic drift model, one could expect mean genetic differences among groups that differ in allele frequencies for polygenic traits.56 Furthermore, there are scenarios in which selection pressures may have even made it more challenging to detect polygenic differentiation between populations. Negative selection pressures can give rise to population-specific genetic architectures and causal variants for the same traits, making it difficult to detect trait mean differences by comparing polygenic scores based on GWASs from a single population.159 Theoretical work indicates that a polygenic trait constrained by stabilizing selection to a certain optimum phenotypic value in two populations can, counter-intuitively, increase the genetic differentiation of trait-influencing loci: genetic variants that accidentally increase in frequency as a result of drift in one population would lead to a compensatory decrease in frequency of other loci in this population.160 This could create the illusion of an implied phenotypic mean differentiation when basing the comparison between two populations on polygenic scores computed from a GWAS done in only one of the two populations.
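The point that drift alone can produce mean genetic differences between groups can be illustrated with a toy simulation. The sketch below is not taken from the cited work: it assumes a simple Wright-Fisher model without selection, and the number of loci, population size, number of generations, and effect-size distribution are all arbitrary choices made for illustration.

```python
import random


def drift(p: float, two_n: int, generations: int, rng: random.Random) -> float:
    """Wright-Fisher drift: resample 2N allele copies each generation."""
    for _ in range(generations):
        copies = sum(rng.random() < p for _ in range(two_n))
        p = copies / two_n
    return p


rng = random.Random(1)
n_loci, two_n, generations = 100, 200, 20  # arbitrary illustrative values

# Shared ancestral allele frequencies and per-allele effects (hypothetical).
ancestral = [rng.uniform(0.1, 0.9) for _ in range(n_loci)]
effects = [rng.gauss(0.0, 1.0) for _ in range(n_loci)]

# Two populations drift independently from the same ancestral frequencies.
# No locus is under selection and effects are identical in both populations.
pop_a = [drift(p, two_n, generations, rng) for p in ancestral]
pop_b = [drift(p, two_n, generations, rng) for p in ancestral]

# Mean genetic value of a population: sum over loci of 2 * p * beta.
mean_a = sum(2 * p * b for p, b in zip(pop_a, effects))
mean_b = sum(2 * p * b for p, b in zip(pop_b, effects))

print(f"difference in mean genetic value: {mean_a - mean_b:.3f}")
```

Because the model contains no selection, any non-zero difference printed here arises from random allele frequency change only, which is why an observed difference in mean polygenic value is not by itself evidence of adaptation.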
Expanding genomic coverage through GWAS-by-WGS
Rare and common genetic variation have thus far been interrogated with largely different measurement and analytic approaches, even though complex traits are influenced by alleles distributed across a continuous spectrum of frequencies that vary between populations, often implicating the same genes. Sequencing larger proportions of human populations could help future GWASs bring the realms of rare and common variants closer together. In our 10-year review,4 we showed a comparison of the power to detect association for low frequency variants, using either a GWAS-by-WGS or GWAS-by-chip approach. We revisit this comparison because there already have been large GWAS-by-WES studies64,65,67 and because we have a better quantification of the relationship between effect size and frequency of trait-associated alleles. GWAS-by-WGS studies are emerging, and are expected to measure ∼40 times as many variants as WES datasets of the same individuals.61 Sample sizes of whole-genome-sequenced datasets are increasing; the sample size of the TOPMed initiative is >53,831 and the entire UK Biobank is being sequenced, with 200,000 genomes already made available, including 150,119 recently described and analyzed.61 Moreover, research in the last 5 years has quantified the association between allele frequency and effect size for many traits, showing larger effects for lower frequencies, and this association affects the power of detection. In Figure 3, we calculate the ratio of the sample sizes needed to map low frequency variants in GWAS-by-WGS compared to that of mapping a common variant association, as a function of MAF and the association between effect size and frequency (parameterized as $S$).68 It shows that for a realistic range of $S$, the sample size required to detect rare variants in GWAS-by-WGS is much larger than that required to detect common variants, and the rarer the variants the larger the sample size ratio.
The reason is that although rare variants tend to have larger effect sizes,68,69,70 this does not necessarily compensate for their lower heterozygosity; power depends on $2p(1-p)\beta^2$, with $p$ being the minor allele frequency, $2p(1-p)$ the heterozygosity under Hardy-Weinberg equilibrium and $\beta$ the per-allele effect size. If we assume as in Zeng et al.68 that $\beta^2$ is on average proportional to $[2p(1-p)]^S$ then the proportional increase in sample size ($N_{\text{rare}}/N_{\text{common}}$) to detect a rare-variant (e.g., $p$ < 1%) association with the same statistical power as that needed to identify a common SNP with an MAF $p_0$, can be expressed as $\left[\frac{p_0(1-p_0)}{p(1-p)}\right]^{1+S}$. This implies, for a trait such as height with an estimated $S$ around −0.65 (Zeng et al.68), that detecting SNPs with an MAF of 0.1% would require 7-fold larger samples than currently needed to detect common-variant associations ($p_0$ = 1/2). Altogether, we expect that sample sizes will continue to increase and the focus on rare variant detection will continue to grow nonetheless, as rare variants explain a substantial portion of complex trait variation and have a better likelihood of being the causal allele when significantly associated.
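The sample-size argument in this section can be checked numerically. The sketch below rests on two assumptions stated in the text: power depends on the product of heterozygosity, 2p(1 − p), and the squared per-allele effect, and the squared effect is on average proportional to heterozygosity raised to a frequency-dependence exponent. The function and variable names (`heterozygosity`, `sample_size_ratio`, `s`) are ours, introduced for illustration only.

```python
def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1 - p) under Hardy-Weinberg equilibrium."""
    return 2.0 * p * (1.0 - p)


def sample_size_ratio(p_rare: float, s: float, p_common: float = 0.5) -> float:
    """Sample size needed for a rare variant relative to a common one.

    Assumes equal statistical power, with the variance explained by a
    variant proportional to heterozygosity ** (1 + s).
    """
    return (heterozygosity(p_common) / heterozygosity(p_rare)) ** (1.0 + s)


# Values quoted in the text: MAF of 0.1%, exponent around -0.65 for height,
# compared against a common variant with MAF of 1/2.
ratio = sample_size_ratio(p_rare=0.001, s=-0.65)
print(f"{ratio:.1f}")
```

Under these assumptions the ratio comes out at just under 7, in line with the 7-fold figure quoted for height. Setting the exponent to −1 makes the ratio exactly 1 (larger effects fully offset lower heterozygosity), whereas an exponent of 0 (effects independent of frequency) gives a ratio of about 250 for the same allele frequencies.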
Concluding remarks
The initial promise of GWAS was discovery of variant-trait associations through linkage disequilibrium, as an entry into studying disease biology and ultimately leading to better prevention and treatment of diseases and disorders. 15 years from the first well-designed GWAS, this promise has not only been realized, but much more has been achieved that was not foreseen at the time: the effects of natural selection and polygenic adaptation at many trait-associated loci have been detected and quantified, novel causal relationships between exposures and disease have been detected, new drugs are being trialed on the basis of GWAS results, polygenic scores are undergoing clinical trials, and polygenic approaches are becoming embedded in the social sciences. Overall, GWAS has contributed to a much better understanding of the causes and consequences of human genetic variation for complex traits and disease and will likely continue to do so in the future.
Acknowledgments
This research was supported by the Australian National Health and Medical Research Council (1113400) and the Australian Research Council (FL180100072 and DE200100425). A.A. and K.J.H.V. are supported by the Foundation Volksbond Rotterdam. For Figure 2, we used data from the UK Biobank (under Project 12505) and from the Health and Retirement Study (HRS). HRS is supported by the National Institute on Aging (NIA, U01AG009740). HRS genotyping received additional support from the National Institute on Aging (RC2 AG036495 and RC4 AG039029). Genotype data on HRS participants were obtained with the Illumina HumanOmni2.5 BeadChips (HumanOmni2.5-4v1, HumanOmni2.5-8v1, HumanOmni2.5-8v1.1). Genotyping was conducted by the NIH Center for Inherited Disease Research (CIDR) at Johns Hopkins University. Genotyping quality control and final preparation of the data were performed by the Genetics Coordinating Center at the University of Washington and the University of Michigan. The validation dataset includes respondents who provided DNA samples and signed consent forms in 2006, 2008, and 2010 (dbGaP: phs000428.v2.p2). Because of space limitations, we were unable to do full justice to all relevant and important GWAS articles that have been published in the last 15 years. We apologize to the many colleagues whose work is not cited.
Declaration of interests
The authors declare no competing interests.
References
- 1. Risch N., Merikangas K. The future of genetic studies of complex human diseases. Science (New York, N.Y.) 1996;273:1516–1517. doi: 10.1126/science.273.5281.1516.
- 2. Wellcome Trust Case Control Consortium Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447:661–678. doi: 10.1038/nature05911.
- 3. Visscher P.M., Brown M.A., McCarthy M.I., Yang J. Five years of GWAS discovery. Am. J. Hum. Genet. 2012;90:7–24. doi: 10.1016/j.ajhg.2011.11.029.
- 4. Visscher P.M., Wray N.R., Zhang Q., Sklar P., McCarthy M.I., Brown M.A., Yang J. 10 years of GWAS discovery: biology, function, and translation. Am. J. Hum. Genet. 2017;101:5–22. doi: 10.1016/j.ajhg.2017.06.005.
- 5. Yengo L., Vedantam S., Marouli E., Sidorenko J., Bartell E., Sakaue S., Graff M., Eliasen A.U., Jiang Y., Raghavan S., et al. A Saturated Map of Common Genetic Variants Associated with Human Height from 5.4 Million Individuals of Diverse Ancestries. Nature. 2022;610:704–712. doi: 10.1038/s41586-022-05275-y.
- 6. Liu M., Jiang Y., Wedow R., Li Y., Brazel D.M., Chen F., Datta G., Davila-Velderrain J., McGuire D., Tian C., et al. Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nat. Genet. 2019;51:237–244. doi: 10.1038/s41588-018-0307-5.
- 7. Lee J.J., Wedow R., Okbay A., Kong E., Maghzian O., Zacher M., Nguyen-Viet T.A., Bowers P., Sidorenko J., Karlsson Linnér R., et al. Gene discovery and polygenic prediction from a 1.1-million-person GWAS of educational attainment. Nat. Genet. 2018;50:1112–1121. doi: 10.1038/s41588-018-0147-3.
- 8. Okbay A., Wu Y., Wang N., Jayashankar H., Bennett M., Nehzati S.M., Sidorenko J., Kweon H., Goldman G., Gjorgjieva T., et al. Polygenic prediction of educational attainment within and between families from genome-wide association analyses in 3 million individuals. Nat. Genet. 2022;54:437–449. doi: 10.1038/s41588-022-01016-z.
- 9. Evangelou E., Warren H.R., Mosen-Ansorena D., Mifsud B., Pazoki R., Gao H., Ntritsos G., Dimou N., Cabrera C.P., Karaman I., et al. Genetic analysis of over 1 million people identifies 535 new loci associated with blood pressure traits. Nat. Genet. 2018;50:1412–1425. doi: 10.1038/s41588-018-0205-x.
- 10. Bycroft C., Freeman C., Petkova D., Band G., Elliott L.T., Sharp K., Motyer A., Vukcevic D., Delaneau O., O'Connell J., et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562:203–209. doi: 10.1038/s41586-018-0579-z.
- 11. Stefansson K. William Allan Award1. Am. J. Hum. Genet. 2018;102:351–353. doi: 10.1016/j.ajhg.2018.01.012.
- 12. Leitsalu L., Metspalu A. Elsevier; 2017. Genomic and Precision Medicine; pp. 119–129.
- 13. Nagai A., Hirata M., Kamatani Y., Muto K., Matsuda K., Kiyohara Y., Ninomiya T., Tamakoshi A., Yamagata Z., Mushiroda T., et al. Overview of the BioBank Japan Project: study design and profile. J. Epidemiol. 2017;27:S2–S8. doi: 10.1016/j.je.2016.12.005.
- 14. Walters R.G., Millwood I.Y., Lin K., Valle D.S., McDonnell P., Hacker A., Avery D., Cai N., Kretzschmar W.W., Ansari M.A., et al. Genotyping and population structure of the China Kadoorie Biobank. Preprint at medRxiv. 2022 doi: 10.1101/2022.05.02.22274487.
- 15. Kurki M.I., Karjalainen J., Palta P., Sipilä T.P., Kristiansson K., Donner K., Reeve M.P., Laivuori H., Aavikko M., Kaunisto M.A., et al. FinnGen: Unique genetic insights from combining isolated population and national health register data. Preprint at medRxiv. 2022 doi: 10.1101/2022.03.03.22271360.
- 16. Sijtsma A., Rienks J., van der Harst P., Navis G., Rosmalen J.G.M., Dotinga A. Cohort profile update: lifelines, a three-generation cohort study and biobank. Int. J. Epidemiol. 2021;51:e295–e302. doi: 10.1093/ije/dyab257.
- 17. Gaziano J.M., Concato J., Brophy M., Fiore L., Pyarajan S., Breeling J., Whitbourne S., Deen J., Shannon C., Humphries D., et al. Million Veteran Program: A mega-biobank to study genetic influences on health and disease. J. Clin. Epidemiol. 2016;70:214–223. doi: 10.1016/j.jclinepi.2015.09.016.
- 18. The All of Us Research Program Investigators The “All of Us” Research Program. N. Engl. J. Med. 2019;381:668–676. doi: 10.1056/NEJMsr1809937.
- 19. PGC: Psychiatric Genomics Consortium, https://www.med.unc.edu/pgc/ (2022).
- 20. CARDIoGRAMplusC4D (Coronary ARtery DIsease Genome wide Replication and Meta-analysis (CARDIoGRAM) plus The Coronary Artery Disease (C4D) Genetics), http://www.cardiogramplusc4d.org/ (2022).
- 21. Liu J.Z., Erlich Y., Pickrell J.K. Case-control association mapping by proxy using family history of disease. Nat. Genet. 2017;49:325–331. doi: 10.1038/ng.3766.
- 22. Loh P.R., Kichaev G., Gazal S., Schoech A.P., Price A.L. Mixed-model association for biobank-scale datasets. Nat. Genet. 2018;50:906–908. doi: 10.1038/s41588-018-0144-6.
- 23. Hujoel M.L.A., Gazal S., Loh P.R., Patterson N., Price A.L. Liability threshold modeling of case-control status and family history of disease increases association power. Nat. Genet. 2020;52:541–547. doi: 10.1038/s41588-020-0613-6.
- 24. Jiang L., Zheng Z., Qi T., Kemper K.E., Wray N.R., Visscher P.M., Yang J. A resource-efficient tool for mixed model association analysis of large-scale data. Nat. Genet. 2019;51:1749–1755. doi: 10.1038/s41588-019-0530-8.
- 25. Jiang L., Zheng Z., Fang H., Yang J. A generalized linear mixed model association tool for biobank-scale data. Nat. Genet. 2021;53:1616–1621. doi: 10.1038/s41588-021-00954-4.
- 26. Lande R., Thompson R. Efficiency of marker-assisted selection in the improvement of quantitative traits. Genetics. 1990;124:743–756. doi: 10.1093/genetics/124.3.743.
- 27. Wray N.R., Kemper K.E., Hayes B.J., Goddard M.E., Visscher P.M. Complex Trait Prediction from Genome Data: Contrasting EBV in Livestock to PRS in Humans: Genomic Prediction. Genetics. 2019;211:1131–1141. doi: 10.1534/genetics.119.301859.
- 28. de los Campos G., Gianola D., Allison D.B. Predicting genetic predisposition in humans: the promise of whole-genome markers. Nat. Rev. Genet. 2010;11:880–886. doi: 10.1038/nrg2898.
- 29. Gottesman M.M., Collins F.S. The role of the human genome project in disease prevention. Prev. Med. 1994;23:591–594. doi: 10.1006/pmed.1994.1094.
- 30. Jostins L., Barrett J.C. Genetic risk prediction in complex disease. Hum. Mol. Genet. 2011;20:R182–R188. doi: 10.1093/hmg/ddr378.
- 31. Harden K.P., Koellinger P.D. Using genetics for social science. Nat. Hum. Behav. 2020;4:567–576. doi: 10.1038/s41562-020-0862-5.
- 32. Kullo I.J., Lewis C.M., Inouye M., Martin A.R., Ripatti S., Chatterjee N. Polygenic scores in biomedical research. Nat. Rev. Genet. 2022;23:524–532. doi: 10.1038/s41576-022-00470-z.
- 33. Klarin D., Natarajan P. Clinical utility of polygenic risk scores for coronary artery disease. Nat. Rev. Cardiol. 2022;19:291–301. doi: 10.1038/s41569-021-00638-w.
- 34. Murray G.K., Lin T., Austin J., McGrath J.J., Hickie I.B., Wray N.R. Could polygenic risk scores be useful in psychiatry? A review. JAMA Psychiatr. 2021;78:210–219. doi: 10.1001/jamapsychiatry.2020.3042.
- 35. Fahed A.C., Philippakis A.A., Khera A.V. The potential of polygenic scores to improve cost and efficiency of clinical trials. Nat. Commun. 2022;13:2922. doi: 10.1038/s41467-022-30675-z.
- 36. Khera A.V., Chaffin M., Aragam K.G., Haas M.E., Roselli C., Choi S.H., Natarajan P., Lander E.S., Lubitz S.A., Ellinor P.T., Kathiresan S. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 2018;50:1219–1224. doi: 10.1038/s41588-018-0183-z.
- 37. Torkamani A., Wineinger N.E., Topol E.J. The personal and clinical utility of polygenic risk scores. Nat. Rev. Genet. 2018;19:581–590. doi: 10.1038/s41576-018-0018-x.
- 38. Kiflen M., Le A., Mao S., Lali R., Narula S., Xie F., Paré G. Cost-effectiveness of polygenic risk scores to guide statin therapy for cardiovascular disease prevention. Circ: Genom. Precis. Med. 2022;15 doi: 10.1161/CIRCGEN.121.003423.
- 39. Lango Allen H., Estrada K., Lettre G., Berndt S.I., Weedon M.N., Rivadeneira F., Willer C.J., Jackson A.U., Vedantam S., Raychaudhuri S., et al. Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature. 2010;467:832–838. doi: 10.1038/nature09410.
- 40. Wood A.R., Esko T., Yang J., Vedantam S., Pers T.H., Gustafsson S., Chu A.Y., Estrada K., Luan J., Kutalik Z., et al. Defining the role of common variation in the genomic and biological architecture of adult human height. Nat. Genet. 2014;46:1173–1186. doi: 10.1038/ng.3097.
- 41. Yengo L., Sidorenko J., Kemper K.E., Zheng Z., Wood A.R., Weedon M.N., Frayling T.M., Hirschhorn J., Yang J., Visscher P.M., GIANT Consortium Meta-analysis of genome-wide association studies for height and body mass index in ∼700000 individuals of European ancestry. Hum. Mol. Genet. 2018;27:3641–3649. doi: 10.1093/hmg/ddy271.
- 42. Mathieson I., Scally A. What is ancestry? PLoS Genet. 2020;16:e1008624. doi: 10.1371/journal.pgen.1008624.
- 43. Lewis A.C.F., Molina S.J., Appelbaum P.S., Dauda B., Di Rienzo A., Fuentes A., Fullerton S.M., Garrison N.A., Ghosh N., Hammonds E.M., et al. Getting genetic ancestry right for science and society. Science (New York, N.Y.) 2022;376:250–252. doi: 10.1126/science.abm7530.
- 44. Popejoy A.B., Fullerton S.M. Genomics is failing on diversity. Nature. 2016;538:161–164. doi: 10.1038/538161a.
- 45. Martin A.R., Kanai M., Kamatani Y., Okada Y., Neale B.M., Daly M.J. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 2019;51:584–591. doi: 10.1038/s41588-019-0379-x.
- 46. Martin A.R., Gignoux C.R., Walters R.K., Wojcik G.L., Neale B.M., Gravel S., Daly M.J., Bustamante C.D., Kenny E.E. Human demographic history impacts genetic risk prediction across diverse populations. Am. J. Hum. Genet. 2017;100:635–649. doi: 10.1016/j.ajhg.2017.03.004.
- 47. Mills M.C., Rahal C. The GWAS diversity monitor tracks diversity by disease in real time. Nat. Genet. 2020;52:242–243. doi: 10.1038/s41588-020-0580-y.
- 48. Wei C.-Y., Yang J.H., Yeh E.C., Tsai M.F., Kao H.J., Lo C.Z., Chang L.P., Lin W.J., Hsieh F.J., Belsare S., et al. Genetic profiles of 103,106 individuals in the Taiwan Biobank provide insights into the health and history of Han Chinese. NPJ Genom. Med. 2021;6:10. doi: 10.1038/s41525-021-00178-9.
- 49.Feng Y.-C.A., Chen C.Y., Chen T.T., Kuo P.H., Hsu Y.H., Yang H.I., Chen W.J., Shen C.Y., Ge T., Huang H., Lin Y.F. Taiwan Biobank: a rich biomedical research database of the Taiwanese population. medRxiv. 2021 doi: 10.1101/2021.12.21.21268159. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Ruan Y., Lin Y.F., Feng Y.C.A., Chen C.Y., Lam M., Guo Z., Stanley Global Asia Initiatives. He L., Sawa A., Martin A.R., et al. Improving polygenic prediction in ancestrally diverse populations. Nat. Genet. 2022;54:1259–1268. doi: 10.1038/s41588-022-01144-6. [DOI] [PubMed] [Google Scholar]
- 51.Wang Y., Guo J., Ni G., Yang J., Visscher P.M., Yengo L. Theoretical and empirical quantification of the accuracy of polygenic scores in ancestry divergent populations. Nat. Commun. 2020;11:3865–3869. doi: 10.1038/s41467-020-17719-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Giannakopoulou O., Lin K., Meng X., Su M.H., Kuo P.H., Peterson R.E., Awasthi S., Moscati A., Coleman J.R.I., Bass N., et al. The genetic architecture of depression in individuals of East Asian ancestry: a genome-wide association study. JAMA Psychiatr. 2021;78:1258–1269. doi: 10.1001/jamapsychiatry.2021.2099. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Wray N.R., Ripke S., Mattheisen M., Trzaskowski M., Byrne E.M., Abdellaoui A., Adams M.J., Agerbo E., Air T.M., Andlauer T.M.F., et al. Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression. Nat. Genet. 2018;50:668–681. doi: 10.1038/s41588-018-0090-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Fisher R.A. The correlation between relatives on the supposition of Mendelian inheritance. Trans. R. Soc. Edinb. 1919;52:399–433. [Google Scholar]
- 55.Lynch M., Walsh B. Vol. 1. Sinauer Sunderland; 1998. (Genetics and Analysis of Quantitative Traits). [Google Scholar]
- 56.Walsh B., Lynch M. Oxford University Press; 2018. Evolution and Selection of Quantitative Traits. [Google Scholar]
- 57.Brownlee J. The inheritance of complex growth forms, such as stature, on Mendel’s theory. Proc. R. Soc. Edinb. 1911;XI:251–256. doi: 10.1093/ije/dyt068. [DOI] [PubMed] [Google Scholar]
- 58.Mathieson I., McVean G. Differential confounding of rare and common variants in spatially structured populations. Nat. Genet. 2012;44:243–246. doi: 10.1038/ng.1074. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Biddanda A., Rice D.P., Novembre J. A variant-centric perspective on geographic patterns of human allele frequency variation. Elife. 2020;9:e60107. doi: 10.7554/eLife.60107. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Taliun D., Harris D.N., Kessler M.D., Carlson J., Szpiech Z.A., Torres R., Taliun S.A.G., Corvelo A., Gogarten S.M., Kang H.M., et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature. 2021;590:290–299. doi: 10.1038/s41586-021-03205-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Halldorsson B.V., Eggertsson H.P., Moore K.H.S., Hauswedell H., Eiriksson O., Ulfarsson M.O., Palsson G., Hardarson M.T., Oddsson A., Jensson B.O., et al. The sequences of 150,119 genomes in the UK Biobank. Nature. 2022;607:732–740. doi: 10.1038/s41586-022-04965-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Nicolae D.L. Association tests for rare variants. Annu. Rev. Genomics Hum. Genet. 2016;17:117–130. doi: 10.1146/annurev-genom-083115-022609. [DOI] [PubMed] [Google Scholar]
- 63.Wainschtein P., Jain D., Zheng Z., TOPMed Anthropometry Working Group. NHLBI Trans-Omics for Precision Medicine TOPMed Consortium. Cupples L.A., Shadyab A.H., McKnight B., Shoemaker B.M., Mitchell B.D., et al. Assessing the contribution of rare variants to complex trait heritability from whole-genome sequence data. Nat. Genet. 2022;54:263–273. doi: 10.1038/s41588-021-00997-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Akbari P., Gilani A., Sosina O., Kosmicki J.A., Khrimian L., Fang Y.Y., Persaud T., Garcia V., Sun D., Li A., et al. Sequencing of 640,000 exomes identifies GPR75 variants associated with protection from obesity. Science (New York, N.Y.) 2021;373:eabf8683. doi: 10.1126/science.abf8683. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Singh T., Poterba T., Curtis D., Akil H., Al Eissa M., Barchas J.D., Bass N., Bigdeli T.B., Breen G., Bromet E.J., et al. Rare coding variants in ten genes confer substantial risk for schizophrenia. Nature. 2022;604:509–516. doi: 10.1038/s41586-022-04556-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Trubetskoy V., Pardiñas A.F., Qi T., Panagiotaropoulou G., Awasthi S., Bigdeli T.B., Bryois J., Chen C.Y., Dennison C.A., Hall L.S., et al. Mapping genomic loci implicates genes and synaptic biology in schizophrenia. Nature. 2022;604:502–508. doi: 10.1038/s41586-022-04434-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Backman J.D., Li A.H., Marcketta A., Sun D., Mbatchou J., Kessler M.D., Benner C., Liu D., Locke A.E., Balasubramanian S., et al. Exome sequencing and analysis of 454,787 UK Biobank participants. Nature. 2021;599:628–634. doi: 10.1038/s41586-021-04103-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Zeng J., de Vlaming R., Wu Y., Robinson M.R., Lloyd-Jones L.R., Yengo L., Yap C.X., Xue A., Sidorenko J., McRae A.F., et al. Signatures of negative selection in the genetic architecture of human complex traits. Nat. Genet. 2018;50:746–753. doi: 10.1038/s41588-018-0101-4. [DOI] [PubMed] [Google Scholar]
- 69.Zeng J., Xue A., Jiang L., Lloyd-Jones L.R., Wu Y., Wang H., Zheng Z., Yengo L., Kemper K.E., Goddard M.E., et al. Widespread signatures of natural selection across human complex traits and functional genomic categories. Nat. Commun. 2021;12:1164. doi: 10.1038/s41467-021-21446-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Gazal S., Loh P.R., Finucane H.K., Ganna A., Schoech A., Sunyaev S., Price A.L. Functional architecture of low-frequency variants highlights strength of negative selection across coding and non-coding annotations. Nat. Genet. 2018;50:1600–1607. doi: 10.1038/s41588-018-0231-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.Gazal S., Finucane H.K., Furlotte N.A., Loh P.R., Palamara P.F., Liu X., Schoech A., Bulik-Sullivan B., Neale B.M., Gusev A., Price A.L. Linkage disequilibrium–dependent architecture of human complex traits shows action of negative selection. Nat. Genet. 2017;49:1421–1427. doi: 10.1038/ng.3954. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Simons Y.B., Bullaughey K., Hudson R.R., Sella G. A population genetic interpretation of GWAS findings for human quantitative traits. PLoS Biol. 2018;16:e2002985. doi: 10.1371/journal.pbio.2002985. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Schoech A.P., Jordan D.M., Loh P.R., Gazal S., O'Connor L.J., Balick D.J., Palamara P.F., Finucane H.K., Sunyaev S.R., Price A.L. Quantification of frequency-dependent genetic architectures in 25 UK Biobank traits reveals action of negative selection. Nat. Commun. 2019;10:790. doi: 10.1038/s41467-019-08424-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.O'Connor L.J., Schoech A.P., Hormozdiari F., Gazal S., Patterson N., Price A.L. Extreme polygenicity of complex traits is explained by negative selection. Am. J. Hum. Genet. 2019;105:456–476. doi: 10.1016/j.ajhg.2019.07.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Cox S.L., Moots H.M., Stock J.T., Shbat A., Bitarello B.D., Nicklisch N., Alt K.W., Haak W., Rosenstock E., Ruff C.B., Mathieson I. Predicting skeletal stature using ancient DNA. Am. J. Phys. Anthropol. 2022;177:162–174. [Google Scholar]
- 76.Cox S.L., Ruff C.B., Maier R.M., Mathieson I. Genetic contributions to variation in human stature in prehistoric Europe. Proc. Natl. Acad. Sci. USA. 2019;116:21484–21492. doi: 10.1073/pnas.1910606116. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.Howe L.J., Nivard M.G., Morris T.T., Hansen A.F., Rasheed H., Cho Y., Chittoor G., Ahlskog R., Lind P.A., Palviainen T., et al. Within-sibship genome-wide association analyses decrease bias in estimates of direct genetic effects. Nat. Genet. 2022;54:581–592. doi: 10.1038/s41588-022-01062-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Berg J.J., Harpak A., Sinnott-Armstrong N., Joergensen A.M., Mostafavi H., Field Y., Boyle E.A., Zhang X., Racimo F., Pritchard J.K., Coop G. Reduced signal for polygenic adaptation of height in UK Biobank. Elife. 2019;8:e39725. doi: 10.7554/eLife.39725. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Sohail M., Maier R.M., Ganna A., Bloemendal A., Martin A.R., Turchin M.C., Chiang C.W., Hirschhorn J., Daly M.J., Patterson N., et al. Polygenic adaptation on height is overestimated due to uncorrected stratification in genome-wide association studies. Elife. 2019;8:e39702. doi: 10.7554/eLife.39702. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Chen M., Sidore C., Akiyama M., Ishigaki K., Kamatani Y., Schlessinger D., Cucca F., Okada Y., Chiang C.W.K. Evidence of Polygenic Adaptation in Sardinia at Height-Associated Loci Ascertained from the Biobank Japan. Am. J. Hum. Genet. 2020;107:60–71. doi: 10.1016/j.ajhg.2020.05.014. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Jiang Y., Bolnick D.I., Kirkpatrick M. Assortative mating in animals. Am. Nat. 2013;181:E125–E138. doi: 10.1086/670160. [DOI] [PubMed] [Google Scholar]
- 82.Horwitz T.B., Keller M.C. A comprehensive meta-analysis of human assortative mating in 22 complex traits. Preprint at bioRxiv. 2022 doi: 10.1101/2022.03.19.484997. [DOI] [Google Scholar]
- 83.Robinson M.R., The LifeLines Cohort Study. Kleinman A., Graff M., Vinkhuyzen A.A.E., Couper D., Miller M.B., Peyrot W.J., Abdellaoui A., Zietsch B.P., et al. Genetic evidence of assortative mating in humans. Nat. Hum. Behav. 2017;1:0016. [Google Scholar]
- 84.Yengo L., Robinson M.R., Keller M.C., Kemper K.E., Yang Y., Trzaskowski M., Gratten J., Turley P., Cesarini D., Benjamin D.J., et al. Imprint of assortative mating on the human genome. Nat. Hum. Behav. 2018;2:948–954. doi: 10.1038/s41562-018-0476-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Kemper K.E., Yengo L., Zheng Z., Abdellaoui A., Keller M.C., Goddard M.E., Wray N.R., Yang J., Visscher P.M. Phenotypic covariance across the entire spectrum of relatedness for 86 billion pairs of individuals. Nat. Commun. 2021;12:1050. doi: 10.1038/s41467-021-21283-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Nordsletten A.E., Brander G., Larsson H., Lichtenstein P., Crowley J.J., Sullivan P.F., Wray N.R., Mataix-Cols D. Evaluating the impact of nonrandom mating: psychiatric outcomes among the offspring of pairs diagnosed with schizophrenia and bipolar disorder. Biol. Psychiatry. 2020;87:253–262. doi: 10.1016/j.biopsych.2019.06.025. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Torvik F.A., Eilertsen E.M., Hannigan L.J., Cheesman R., Howe L.J., Magnus P., Reichborn-Kjennerud T., Andreassen O.A., Njølstad P.R., Havdahl A., Ystrom E. Modeling assortative mating and genetic similarities between partners, siblings, and in-laws. Nat. Commun. 2022;13:1108. doi: 10.1038/s41467-022-28774-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Nordsletten A.E., Larsson H., Crowley J.J., Almqvist C., Lichtenstein P., Mataix-Cols D. Patterns of nonrandom mating within and across 11 major psychiatric disorders. JAMA Psychiatr. 2016;73:354–361. doi: 10.1001/jamapsychiatry.2015.3192. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89.Wray N.R., Yengo L. Assortative mating in autism spectrum disorder: toward an evidence base from DNA data, but not there yet. Biol. Psychiatry. 2019;86:250–252. doi: 10.1016/j.biopsych.2019.06.007. [DOI] [PubMed] [Google Scholar]
- 90.Plomin R., DeFries J.C., Loehlin J.C. Genotype-environment interaction and correlation in the analysis of human behavior. Psychol. Bull. 1977;84:309–322. [PubMed] [Google Scholar]
- 91.Kong A., Thorleifsson G., Frigge M.L., Vilhjalmsson B.J., Young A.I., Thorgeirsson T.E., Benonisdottir S., Oddsson A., Halldorsson B.V., Masson G., et al. The nature of nurture: Effects of parental genotypes. Science (New York, N.Y.) 2018;359:424–428. doi: 10.1126/science.aan6877. [DOI] [PubMed] [Google Scholar]
- 92.Cheesman R., Hunjan A., Coleman J.R.I., Ahmadzadeh Y., Plomin R., McAdams T.A., Eley T.C., Breen G. Comparison of adopted and nonadopted individuals reveals gene–environment interplay for education in the UK Biobank. Psychol. Sci. 2020;31:582–591. doi: 10.1177/0956797620904450. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 93.Selzam S., Ritchie S.J., Pingault J.B., Reynolds C.A., O'Reilly P.F., Plomin R. Comparing within- and between-family polygenic score prediction. Am. J. Hum. Genet. 2019;105:351–363. doi: 10.1016/j.ajhg.2019.06.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.Abdellaoui A., Hugh-Jones D., Yengo L., Kemper K.E., Nivard M.G., Veul L., Holtz Y., Zietsch B.P., Frayling T.M., Wray N.R., et al. Genetic correlates of social stratification in Great Britain. Nat. Hum. Behav. 2019;3:1332–1342. doi: 10.1038/s41562-019-0757-5. [DOI] [PubMed] [Google Scholar]
- 95.Abdellaoui A., Dolan C.V., Verweij K.J.H., Nivard M.G. Gene-environment correlations across geographic regions affect genome-wide association studies. Nat. Genet. 2022;54:1345–1354. doi: 10.1038/s41588-022-01158-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96.Abdellaoui A., Verweij K.J.H. Dissecting polygenic signals from genome-wide association studies on human behaviour. Nat. Hum. Behav. 2021;5:686–694. doi: 10.1038/s41562-021-01110-y. [DOI] [PubMed] [Google Scholar]
- 97.Demange P.A., Malanchini M., Mallard T.T., Biroli P., Cox S.R., Grotzinger A.D., Tucker-Drob E.M., Abdellaoui A., Arseneault L., van Bergen E., et al. Investigating the genetic architecture of noncognitive skills using GWAS-by-subtraction. Nat. Genet. 2021;53:35–44. doi: 10.1038/s41588-020-00754-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Reiersøl O. Almqvist & Wiksell; 1945. Confluence Analysis by Means of Instrumental Sets of Variables. [Google Scholar]
- 99.Bowden J., Davey Smith G., Haycock P.C., Burgess S. Consistent Estimation in Mendelian Randomization with Some Invalid Instruments Using a Weighted Median Estimator. Genet. Epidemiol. 2016;40:304–314. doi: 10.1002/gepi.21965. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 100.Hartwig F.P., Davey Smith G., Bowden J. Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. Int. J. Epidemiol. 2017;46:1985–1998. doi: 10.1093/ije/dyx102. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101.Bowden J., Davey Smith G., Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int. J. Epidemiol. 2015;44:512–525. doi: 10.1093/ije/dyv080. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 102.Zhu Z., Zheng Z., Zhang F., Wu Y., Trzaskowski M., Maier R., Robinson M.R., McGrath J.J., Visscher P.M., Wray N.R., Yang J. Causal associations between risk factors and common diseases inferred from GWAS summary data. Nat. Commun. 2018;9:224. doi: 10.1038/s41467-017-02317-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 103.Verbanck M., Chen C.Y., Neale B., Do R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat. Genet. 2018;50:693–698. doi: 10.1038/s41588-018-0099-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Zhu Z., Zheng Z., Zhang F., Wu Y., Trzaskowski M., Maier R., Robinson M.R., McGrath J.J., Visscher P.M., Wray N.R., Yang J. Causal associations between risk factors and common diseases inferred from GWAS summary data. Nat. Commun. 2018;9:224. doi: 10.1038/s41467-017-02317-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105.Hemani G., Tilling K., Davey Smith G. Orienting the causal relationship between imprecisely measured traits using GWAS summary data. PLoS Genet. 2017;13:e1007081. doi: 10.1371/journal.pgen.1007081. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 106.O'Connor L.J., Price A.L. Distinguishing genetic correlation from causation across 52 diseases and complex traits. Nat. Genet. 2018;50:1728–1734. doi: 10.1038/s41588-018-0255-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 107.Morrison J., Knoblauch N., Marcus J.H., Stephens M., He X. Mendelian randomization accounting for correlated and uncorrelated pleiotropic effects using genome-wide summary statistics. Nat. Genet. 2020;52:740–747. doi: 10.1038/s41588-020-0631-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 108.Sanderson E. Multivariable mendelian randomization and mediation. Cold Spring Harb. Perspect. Med. 2021;11:a038984. doi: 10.1101/cshperspect.a038984. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 109.Mountjoy E., Davies N.M., Plotnikov D., Smith G.D., Rodriguez S., Williams C.E., Guggenheim J.A., Atan D. Education and myopia: assessing the direction of causality by mendelian randomisation. BMJ (Clinical Research Ed.) 2018;361:k2022. doi: 10.1136/bmj.k2022. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.Adam M.P., Ardinger H.H., Pagon R.A., Wallace S.E., Bean L.J., Stephens K., Amemiya A. University of Washington; 1993. GeneReviews. [Google Scholar]
- 111.Gupta R.M., Hadaya J., Trehan A., Zekavat S.M., Roselli C., Klarin D., Emdin C.A., Hilvering C.R.E., Bianchi V., Mueller C., et al. A genetic variant associated with five vascular diseases is a distal regulator of endothelin-1 gene expression. Cell. 2017;170:522–533.e15. doi: 10.1016/j.cell.2017.06.049. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 112.Small K.S., Todorčević M., Civelek M., El-Sayed Moustafa J.S., Wang X., Simon M.M., Fernandez-Tajes J., Mahajan A., Horikoshi M., Hugill A., et al. Regulatory variants at KLF14 influence type 2 diabetes risk via a female-specific effect on adipocyte size and body composition. Nat. Genet. 2018;50:572–580. doi: 10.1038/s41588-018-0088-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.Lin Y.T., Seo J., Gao F., Feldman H.M., Wen H.L., Penney J., Cam H.P., Gjoneska E., Raja W.K., Cheng J., et al. APOE4 causes widespread molecular and cellular alterations associated with Alzheimer's Disease phenotypes in human iPSC-derived brain cell types. Neuron. 2018;98:1141–1154.e7. doi: 10.1016/j.neuron.2018.05.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 114.Horowitz J.E., Kosmicki J.A., Damask A., Sharma D., Roberts G.H.L., Justice A.E., Banerjee N., Coignet M.V., Yadav A., Leader J.B., et al. Genome-wide analysis provides genetic evidence that ACE2 influences COVID-19 risk and yields risk scores associated with severe disease. Nat. Genet. 2022;54:382–392. doi: 10.1038/s41588-021-01006-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 115.COVID-19 Host Genetics Initiative. Mapping the human genetic architecture of COVID-19. Nature. 2021;600:472–477. doi: 10.1038/s41586-021-03767-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 116.Roberts G.H.L., Partha R., Rhead B., Knight S.C., Park D.S., Coignet M.V., Zhang M., Berkowitz N., Turrisini D.A., Gaddis M., et al. Expanded COVID-19 phenotype definitions reveal distinct patterns of genetic association and protective effects. Nat. Genet. 2022;54:374–381. doi: 10.1038/s41588-022-01042-x. [DOI] [PubMed] [Google Scholar]
- 117.COVID-19 Host Genetics Initiative. A first update on mapping the human genetic architecture of COVID-19. Nature. 2022;608:E1–E10. doi: 10.1038/s41586-022-04826-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 118.Hsieh M.-H., Beirag N., Murugaiah V., Chou Y.C., Kuo W.S., Kao H.F., Madan T., Kishore U., Wang J.Y. Human surfactant protein D binds spike protein and acts as an entry inhibitor of SARS-CoV-2 pseudotyped viral particles. Front. Immunol. 2021;12:641360. doi: 10.3389/fimmu.2021.641360. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 119.Pairo-Castineira E., Clohisey S., Klaric L., Bretherick A.D., Rawlik K., Pasko D., Walker S., Parkinson N., Fourman M.H., Russell C.D., et al. Genetic mechanisms of critical illness in COVID-19. Nature. 2021;591:92–98. doi: 10.1038/s41586-020-03065-y. [DOI] [PubMed] [Google Scholar]
- 120.Abani O., Abbas A., Abbas F., Abbas J., Abbas K., Abbas M., Abbasi S., Abbass H., Abbott A., Abbott A., Abdallah N. Baricitinib in patients admitted to hospital with COVID-19 (RECOVERY): a randomised, controlled, open-label, platform trial and updated meta-analysis. Lancet. 2022;400:359–368. doi: 10.1016/S0140-6736(22)01109-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 121.Ochoa D., Karim M., Ghoussaini M., Hulcoop D.G., McDonagh E.M., Dunham I. Human genetics evidence supports two-thirds of the 2021 FDA-approved drugs. Nat. Rev. Drug Discov. 2022;21:551. doi: 10.1038/d41573-022-00120-3. [DOI] [PubMed] [Google Scholar]
- 122.Savage L.J., Wittmann M., McGonagle D., Helliwell P.S. Ustekinumab in the treatment of psoriasis and psoriatic arthritis. Rheumatol. Ther. 2015;2:1–16. doi: 10.1007/s40744-015-0010-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 123.Banaszczyk K. Risankizumab in the treatment of psoriasis–literature review. Reumatologia. 2019;57:158–162. doi: 10.5114/reum.2019.86426. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 124.Duerr R.H., Taylor K.D., Brant S.R., Rioux J.D., Silverberg M.S., Daly M.J., Steinhart A.H., Abraham C., Regueiro M., Griffiths A., et al. A genome-wide association study identifies IL23R as an inflammatory bowel disease gene. Science (New York, N.Y.) 2006;314:1461–1463. doi: 10.1126/science.1135245. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 125.de Lange K.M., Moutsianas L., Lee J.C., Lamb C.A., Luo Y., Kennedy N.A., Jostins L., Rice D.L., Gutierrez-Achury J., Ji S.G., et al. Genome-wide association study implicates immune activation of multiple integrin genes in inflammatory bowel disease. Nat. Genet. 2017;49:256–261. doi: 10.1038/ng.3760. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 126.Sandborn W.J., Feagan B.G., Fedorak R.N., Scherl E., Fleisher M.R., Katz S., Johanns J., Blank M., Rutgeerts P., Ustekinumab Crohn's Disease Study Group. A randomized trial of Ustekinumab, a human interleukin-12/23 monoclonal antibody, in patients with moderate-to-severe Crohn's disease. Gastroenterology. 2008;135:1130–1141. doi: 10.1053/j.gastro.2008.07.014. [DOI] [PubMed] [Google Scholar]
- 127.Sandborn W.J., Gasink C., Gao L.L., Blank M.A., Johanns J., Guzzo C., Sands B.E., Hanauer S.B., Targan S., Rutgeerts P., et al. Ustekinumab induction and maintenance therapy in refractory Crohn's disease. N. Engl. J. Med. 2012;367:1519–1528. doi: 10.1056/NEJMoa1203572. [DOI] [PubMed] [Google Scholar]
- 128.Feagan B.G., Sandborn W.J., Gasink C., Jacobstein D., Lang Y., Friedman J.R., Blank M.A., Johanns J., Gao L.L., Miao Y., et al. Ustekinumab as induction and maintenance therapy for Crohn’s disease. N. Engl. J. Med. 2016;375:1946–1960. doi: 10.1056/NEJMoa1602773. [DOI] [PubMed] [Google Scholar]
- 129.Feagan B.G., Sandborn W.J., D'Haens G., Panés J., Kaser A., Ferrante M., Louis E., Franchimont D., Dewit O., Seidler U., et al. Induction therapy with the selective interleukin-23 inhibitor risankizumab in patients with moderate-to-severe Crohn's disease: a randomised, double-blind, placebo-controlled phase 2 study. Lancet. 2017;389:1699–1709. doi: 10.1016/S0140-6736(17)30570-6. [DOI] [PubMed] [Google Scholar]
- 130.Feagan B.G., Panés J., Ferrante M., Kaser A., D'Haens G.R., Sandborn W.J., Louis E., Neurath M.F., Franchimont D., Dewit O., et al. Risankizumab in patients with moderate to severe Crohn's disease: an open-label extension study. Lancet. Gastroenterol. Hepatol. 2018;3:671–680. doi: 10.1016/S2468-1253(18)30233-4. [DOI] [PubMed] [Google Scholar]
- 131.So H.-C., Chau C.K.L., Chiu W.T., Ho K.S., Lo C.P., Yim S.H.Y., Sham P.C. Analysis of genome-wide association data highlights candidates for drug repositioning in psychiatry. Nat. Neurosci. 2017;20:1342–1349. doi: 10.1038/nn.4618. [DOI] [PubMed] [Google Scholar]
- 132.Schmidt A.F., Finan C., Gordillo-Marañón M., Asselbergs F.W., Freitag D.F., Patel R.S., Tyl B., Chopade S., Faraway R., Zwierzyna M., Hingorani A.D. Genetic drug target validation using Mendelian randomisation. Nat. Commun. 2020;11:3255. doi: 10.1038/s41467-020-16969-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 133.De Jong S., Vidler L.R., Mokrab Y., Collier D.A., Breen G. Gene-set analysis based on the pharmacological profiles of drugs to identify repurposing opportunities in schizophrenia. J. Psychopharmacol. 2016;30:826–830. doi: 10.1177/0269881116653109. [DOI] [PubMed] [Google Scholar]
- 134.Bell N., Uffelmann E., van Walree E., de Leeuw C., Posthuma D. Using genome-wide association results to identify drug repurposing candidates. Preprint at medRxiv. 2022 doi: 10.1101/2022.09.06.22279660. [DOI] [Google Scholar]
- 135.Wray N.R., Goddard M.E., Visscher P.M. Prediction of individual genetic risk to disease from genome-wide association studies. Genome Res. 2007;17:1520–1528. doi: 10.1101/gr.6665407. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 136.Ongen H., Brown A.A., Delaneau O., Panousis N.I., Nica A.C., GTEx Consortium. Dermitzakis E.T. Estimating the causal tissues for complex traits and diseases. Nat. Genet. 2017;49:1676–1683. doi: 10.1038/ng.3981. [DOI] [PubMed] [Google Scholar]
- 137.Gamazon E.R., Segrè A.V., van de Bunt M., Wen X., Xi H.S., Hormozdiari F., Ongen H., Konkashbaev A., Derks E.M., Aguet F., et al. Using an atlas of gene regulation across 44 human tissues to inform complex disease- and trait-associated variation. Nat. Genet. 2018;50:956–967. doi: 10.1038/s41588-018-0154-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 138.Regev A., Teichmann S.A., Lander E.S., Amit I., Benoist C., Birney E., Bodenmiller B., Campbell P., Carninci P., Clatworthy M., et al. The human cell atlas. Elife. 2017;6:e27041. doi: 10.7554/eLife.27041. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 139.Hao L., Kraft P., Berriz G.F., Hynes E.D., Koch C., Korategere V Kumar P., Parpattedar S.S., Steeves M., Yu W., Antwi A.A., et al. Development of a clinical polygenic risk score assay and reporting workflow. Nat. Med. 2022;28:1006–1013. doi: 10.1038/s41591-022-01767-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 140.Esserman L., Eklund M., Veer L.V., Shieh Y., Tice J., Ziv E., Blanco A., Kaplan C., Hiatt R., Fiscalini A.S., et al. The WISDOM study: a new approach to screening can and should be tested. Breast Cancer Res. Treat. 2021;189:593–598. doi: 10.1007/s10549-021-06346-w. [DOI] [PubMed] [Google Scholar]
- 141.Roux A., Cholerton R., Sicsic J., Moumjid N., French D.P., Giorgi Rossi P., Balleyguier C., Guindy M., Gilbert F.J., Burrion J.B., et al. Study protocol comparing the ethical, psychological and socio-economic impact of personalised breast cancer screening to that of standard screening in the “My Personal Breast Screening” (MyPeBS) randomised clinical trial. BMC Cancer. 2022;22:507. doi: 10.1186/s12885-022-09484-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 142.Saya S., Boyd L., Chondros P., McNamara M., King M., Milton S., Lourenco R.D.A., Clark M., Fishman G., Marker J., et al. The SCRIPT Trial: study protocol for a randomised controlled trial of a polygenic risk score to tailor colorectal cancer screening in primary care. Trials. 2022;23:810. doi: 10.1186/s13063-022-06734-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 143.Widén E., Junna N., Ruotsalainen S., Surakka I., Mars N., Ripatti P., Partanen J.J., Aro J., Mustonen P., Tuomi T., et al. How communicating polygenic and clinical risk for atherosclerotic cardiovascular disease impacts health behavior: an observational follow-up study. Circ. Genom. Precis. Med. 2022;15:e003459. doi: 10.1161/CIRCGEN.121.003459. [DOI] [PubMed] [Google Scholar]
- 144.Our Future Health. 2022. Genomics plc to generate polygenic risk scores for Our Future Health. https://ourfuturehealth.org.uk/genomics-plc-to-generate-polygenic-risk-scores-for-our-future-health/ [Google Scholar]
- 145.Kozlov M. The controversial embryo tests that promise a better baby. Nature. 2022;609:668–671. doi: 10.1038/d41586-022-02961-9. [DOI] [PubMed] [Google Scholar]
- 146.Kumar A., Im K., Banjevic M., Ng P.C., Tunstall T., Garcia G., Galhardo L., Sun J., Schaedel O.N., Levy B., et al. Whole-genome risk prediction of common diseases in human preimplantation embryos. Nat. Med. 2022;28:513–516. doi: 10.1038/s41591-022-01735-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 147.Turley P., Meyer M.N., Wang N., Cesarini D., Hammonds E., Martin A.R., Neale B.M., Rehm H.L., Wilkins-Haug L., Benjamin D.J., et al. Problems with using polygenic scores to select embryos. N. Engl. J. Med. 2021;385:78–86. doi: 10.1056/NEJMsr2105065. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 148.Lencz T., Backenroth D., Granot-Hershkovitz E., Green A., Gettler K., Cho J.H., Weissbrod O., Zuk O., Carmi S. Utility of polygenic embryo screening for disease depends on the selection strategy. Elife. 2021;10:e64716. doi: 10.7554/eLife.64716. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 149.Musunuru K., Chadwick A.C., Mizoguchi T., Garcia S.P., DeNizio J.E., Reiss C.W., Wang K., Iyer S., Dutta C., Clendaniel V., et al. In vivo CRISPR base editing of PCSK9 durably lowers cholesterol in primates. Nature. 2021;593:429–434. doi: 10.1038/s41586-021-03534-y. [DOI] [PubMed] [Google Scholar]
- 150.Kingwell K. Base editors hit the clinic. Nat. Rev. Drug Discov. 2022;21:545–547. doi: 10.1038/d41573-022-00124-z. [DOI] [PubMed] [Google Scholar]
- 151.Lambert S.A., Abraham G., Inouye M. Towards clinical utility of polygenic risk scores. Hum. Mol. Genet. 2019;28:R133–R142. doi: 10.1093/hmg/ddz187. [DOI] [PubMed] [Google Scholar]
- 152.Visscher P.M. Genetics of cognitive performance, education and learning: from research to policy? NPJ Sci. Learn. 2022;7:8. doi: 10.1038/s41539-022-00124-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 153.Fatumo S., Chikowore T., Choudhury A., Ayub M., Martin A.R., Kuchenbaecker K. A roadmap to increase diversity in genomic studies. Nat. Med. 2022;28:243–250. doi: 10.1038/s41591-021-01672-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 154.Whose genomics? Nat. Hum. Behav. 2019;3:409–410. doi: 10.1038/s41562-019-0619-1. [DOI] [PubMed] [Google Scholar]
- 155.Peterson R.E., Kuchenbaecker K., Walters R.K., Chen C.Y., Popejoy A.B., Periyasamy S., Lam M., Iyegbe C., Strawbridge R.J., Brick L., et al. Genome-wide association studies in ancestrally diverse populations: opportunities, methods, pitfalls, and recommendations. Cell. 2019;179:589–603. doi: 10.1016/j.cell.2019.08.051. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 156.Martin A.R., Stroud R.E., 2nd, Abebe T., Akena D., Alemayehu M., Atwoli L., Chapman S.B., Flowers K., Gelaye B., Gichuru S., et al. Increasing diversity in genomics requires investment in equitable partnerships and capacity building. Nat. Genet. 2022;54:740–745. doi: 10.1038/s41588-022-01095-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 157.American Society of Human Genetics. ASHG denounces attempts to link genetics and racial supremacy. Am. J. Hum. Genet. 2018;103:636. doi: 10.1016/j.ajhg.2018.10.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 158.Novembre J., Barton N.H. Tread Lightly Interpreting Polygenic Tests of Selection. Genetics. 2018;208:1351–1355. doi: 10.1534/genetics.118.300786. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 159.Durvasula A., Lohmueller K.E. Negative selection on complex traits limits phenotype prediction accuracy between populations. Am. J. Hum. Genet. 2021;108:620–631. doi: 10.1016/j.ajhg.2021.02.013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 160.Yair S., Coop G. Population differentiation of polygenic score predictions under stabilizing selection. Philos. Trans. R. Soc. Lond. B Biol. Sci. 2022;377:20200416. doi: 10.1098/rstb.2020.0416. [DOI] [PMC free article] [PubMed] [Google Scholar]



