Author manuscript; available in PMC: 2019 Sep 4.
Published in final edited form as: Nature. 2018 Sep 26;562(7726):268–271. doi: 10.1038/s41586-018-0566-4

Common genetic variants contribute to risk of rare severe neurodevelopmental disorders

Mari E K Niemi 1, Hilary C Martin 1, Daniel L Rice 1, Giuseppe Gallone 1, Scott Gordon 2, Martin Kelemen 1, Kerrie McAloney 2, Jeremy McRae 1, Elizabeth J Radford 1,3, Sui Yu 4, Jozef Gecz 5,6, Nicholas G Martin 2, Caroline F Wright 7, David R Fitzpatrick 8, Helen V Firth 1,9, Matthew E Hurles 1, Jeffrey C Barrett 1
PMCID: PMC6726472  EMSID: EMS84141  PMID: 30258228

Abstract

There are thousands of rare human disorders caused by a single deleterious, protein-coding genetic variant1. However, patients with the same genetic defect can have different clinical presentations2–4, and some individuals carrying known disease-causing variants can appear unaffected5. What explains these differences? Here, we study a cohort of 6,987 children assessed by clinical geneticists to have severe neurodevelopmental disorders, such as global developmental delay and autism, often with abnormalities of other organ systems. While the genetic causes of these neurodevelopmental disorders are expected to be almost entirely monogenic, we show that 7.7% of variance in risk is attributable to inherited common genetic variation. We replicated this genome-wide common variant burden by showing that it is over-transmitted from parents to children with neurodevelopmental disorders in an independent sample of 728 trios from the same cohort. Our common variant signal is significantly positively correlated with genetic predisposition to fewer years of schooling, decreased intelligence, and risk of schizophrenia. We found that common variant risk was not significantly different between individuals with and without a known protein-coding diagnostic variant, suggesting that common variant risk is not confined to patients without a monogenic diagnosis. In addition, previously published common variant scores for autism, height, birth weight, and intracranial volume were all correlated with those traits within our cohort, suggesting that phenotypic expression in individuals with monogenic disorders is affected by the same variants as the general population. Our results demonstrate that common genetic variation affects both overall risk and clinical presentation in neurodevelopmental disorders typically considered to be monogenic.


We carried out a genome-wide association study (GWAS) in 6,987 patients with severe neurodevelopmental disorders and 9,270 ancestry-matched controls, using common variants with a minor allele frequency ≥5% (Figure 1, Extended Data Figure 1, Supplementary Tables 1-2 and Methods). The patients were recruited by senior clinical geneticists in the UK and Ireland as part of the Deciphering Developmental Disorders (DDD) study6,7. They all had at least one abnormality affecting the central nervous system morphology or physiology, and to be recruited to the study their clinical features were sufficiently severe that their disorder was thought likely to be monogenic. In addition to neurodevelopmental defects (e.g. global developmental delay, intellectual disability, cognitive impairment or learning disabilities in 86%, autism spectrum disorders in 16%, Figure 2a), 88% also had abnormalities in at least one other organ system (Figure 2b and Extended Data Table 1).

Figure 1. Outline of analysis exploring the contribution of common variants to risk of severe neurodevelopmental disorders.


We first conducted a discovery GWAS in a large dataset of neurodevelopmental disorder patients, and replicated the common variant contribution by analysing polygenic transmission in independent trios from the same cohort. Next, we looked for overlap of common variant effects between neurodevelopmental disorder risk and other published GWAS, and replicated these findings in an independent Australian cohort. Finally, we explored how polygenic effects were distributed within our discovery patient cohort, and whether common variants contributed to expressivity of specific phenotypes.

Figure 2. Patients recruited to the DDD study have diverse phenotypes.


A. Examples of specific phenotypes affecting different organ systems, observed in the full DDD cohort and the neurodevelopmental subset of patients. B. Distribution of the number of distinct organ systems affected in the set of 6,987 patients with neurodevelopmental abnormalities (Methods).

We did not find any single variant associations at genome-wide significance (Extended Data Figure 2a), which was unsurprising given the heterogeneity of our clinical phenotype and the presumption that these disorders are monogenic. We did, however, observe a modest inflation in the test statistics (λ=1.097, Extended Data Figure 2b), which could indicate either residual bias between cases and controls, or evidence of a polygenic contribution of common variants to disease risk. We therefore estimated common variant heritability using LD score regression8, which can differentiate between these two possibilities, and found that 7.7% (SE=2.1%) of variance in risk (on the liability scale) for neurodevelopmental disorders in our sample was attributable to common genetic variants, when assuming a population prevalence of 1% (Methods). This common variant heritability estimate (h2) is similar to what has been reported for common disorders such as autism (h2=11.8%, SE=1.0%)9 and major depressive disorder (h2=8.9%, SE=0.4%)10. To replicate this signal we analysed an independent set of 728 parent-child trios recruited as part of the same study, but who were not in the initial GWAS. We calculated polygenic scores for each individual by summing the genetic effects across all independent variants from our discovery GWAS (Figure 1 and Methods). We then performed a polygenic transmission disequilibrium test11, which compares the mean parental polygenic scores to those of the affected children. We found that our neurodevelopmental disorder risk score was over-transmitted in these trios (P=0.0035, t=2.48, df=727, one-sided t-test), confirming that common variants contribute to risk of disorders widely presumed to be monogenic.

Previous studies have shown that risk of more common neuropsychiatric disorders (e.g. schizophrenia and bipolar disorder12,13) and variation in other brain-related traits, including educational attainment13, is driven in part by shared common genetic effects. We therefore used the LD score method14 to test for genetic correlation between our neurodevelopmental disorder GWAS and available GWAS data for common neuropsychiatric disorders, cognitive and educational traits, anthropometric traits, and negative control diseases that have well powered GWAS but are not related to neurodevelopment. We found that genetic risk for neurodevelopmental disorders was significantly negatively correlated with genetic predisposition to higher educational attainment15 (rg = -0.49, SE = 0.08, P = 5.3x10-10) and intelligence16 (as measured by Spearman’s g) (rg = -0.44, SE = 0.10, P = 2.2x10-5), and positively correlated with genetic risk of schizophrenia (rg = 0.28, SE = 0.07, P = 2.7x10-5) (Figure 3 and Extended Data Table 2). None of the anthropometric traits, nor the negative control traits, were significantly genetically correlated with our data, after accounting for multiple testing. We also used partitioned LD score regression17 to show that heritability of neurodevelopmental disorders was nominally significantly enriched in cells of the central nervous system (P = 0.02), and in mammalian constrained regions18 (P = 0.009) (Supplementary Table 2), consistent with similar analyses for other neuropsychiatric and cognitive traits. Together, these results suggest that thousands of common variants have individually small effects on brain development or function, which in turn influences neuropsychiatric disease risk, cognitive traits, and risk for severe neurodevelopmental disorders.

Figure 3. Genetic correlations between neurodevelopmental disorder risk (6,987 cases and 9,270 controls) against nineteen other traits.


Cognitive/psychiatric (purple), anthropometric (orange) and negative control traits (green), with SNP heritability (h2) displayed for the trait. SNP heritability for dichotomous traits is displayed on the liability scale. Genetic correlation was calculated using bivariate LD score correlation14, with the bars representing 95% confidence intervals (using standard error) before correction for multiple testing. Uncorrected P-values are from a two-sided z-score, and are only shown if they pass Bonferroni correction for 19 traits. Sample sizes for 19 other GWAS are shown in Extended Data Table 2.

We next investigated how general our genetic correlation findings were, by attempting to replicate them in another neurodevelopmental disorder cohort (Figure 1). We obtained GWAS data for 1,270 neurodevelopmental disorder cases from Australia and 1,688 ancestry-matched Australian controls. This sample size is too small to do direct genetic discovery or to reliably apply LD score regression, so we tested common variant polygenic scores using summary statistics from our discovery GWAS and published GWAS, including educational attainment15 and intelligence16. This approach requires specification of P-value thresholds, and is less robust to population structure and cryptic relatedness, but it produced similar results to LD score in our discovery GWAS, so we believe it is well suited to a replication analysis. We replicated our observation of lower polygenic scores for educational attainment and intelligence in the Australian neurodevelopmental disorder cases compared to controls (P = 1.0x10-8 and P = 7.6x10-4 respectively), and found that cases had a nominally significantly increased score for schizophrenia (P = 0.014) (Methods and Extended Data Table 3). We did not see a significant difference between Australian cases and controls for the score constructed from our own discovery GWAS. We should have had 95% power (Methods) to detect a difference if the two cohorts had identical phenotypes, suggesting that differential phenotypic ascertainment between the British and Australian cohorts diluted our ability to quantify their shared genetics.

These findings could mean that common variants entirely explain a subset of patients with neurodevelopmental disorders, and are not relevant in the remainder, or that all patients’ disorders have both rare and common variant contributions (Figure 1). We have exome sequenced our cohort of patients, as well as their parents, and have previously reported a variety of both de novo and inherited diagnostic variants19,20. We therefore compared polygenic scores for cognitive traits and neuropsychiatric disorders between patients for whom we had identified diagnostic or probably diagnostic variants in a known developmental disorder gene21 (N=1,127) and those who had no candidate diagnostic variant (N=2,479), but we found no significant differences for any polygenic score we tested after controlling for multiple testing (Extended Data Table 4 and Methods). We showed by simulations that if the “diagnosed” cases had the same distribution of the educational attainment polygenic score as controls, we would have had sufficient power to detect a difference between them and the undiagnosed cases (Methods). This is consistent with a previous study in autism11 that similarly found no evidence for a difference in polygenic risk scores between autism cases with a de novo diagnostic mutation compared to those without. This suggests that both common and rare variants are contributing in many neurodevelopmental disorder patients. However, as the DDD project continues to identify new diagnoses, we anticipate that the increase in power may show that monogenic and polygenic contributions are not purely additive.

In addition to showing that common variation affects overall risk of severe neurodevelopmental disorders, we sought to determine if it can also affect individual presentation of symptoms. We identified four phenotypes measured in our neurodevelopmental disorder cohort for which independent GWAS data are available: autism (16% of cohort), birth weight, height, and intracranial volume. On average, our neurodevelopmental patients had a head circumference 1.20 standard deviations (SD) smaller, they were 0.72 SD shorter than, and weighed 0.15 SD less than the age and sex-adjusted population average. We constructed common variant polygenic scores for the four phenotypes as described above, and tested for association between the relevant score and phenotype in our cohort. In all four cases, there was significant association (Table 1 and Extended Data Table 5), demonstrating that common variation contributes to the expression of these traits in our study. Consistent with previous reports9 we also found that individuals with autism in our cohort had higher polygenic scores for educational attainment compared to those without autism. We next tested for association between the educational attainment polygenic score and severity of overall neurodevelopmental phenotype. We found that patients with severe intellectual disability or developmental delay (N=911, Methods) had higher scores (i.e. greater educational attainment, proxy for higher cognitive function, P=0.004, Table 1) than those with mild or moderate disability or delay (N=1,902). 
This finding, which might seem initially counter-intuitive, is consistent with epidemiological studies22 which found that the siblings of patients with severe intellectual disability showed a normal distribution of IQ, whereas siblings of patients with milder intellectual disability had lower IQ than average, implying that mild intellectual disability represents the tail-end of the distribution of polygenic effects on intelligence and severe intellectual disability has a different etiology.

Table 1. Polygenic score analyses in the DDD Study.

| Measured trait | Polygenic score | Beta (a) | Standard error (a) | P-value (a) | R2 (a) |
|---|---|---|---|---|---|
| Birth weight (N=6,496) | Birth weight | 0.187 | 0.017 | 2.55x10-28 | 0.02 |
| Height (N=5,465) | Height | 0.408 | 0.033 | 1.18x10-35 | 0.033 |
| Head circumference (N=6,074) | Intracranial volume | 0.132 | 0.031 | 1.79x10-5 | 0.004 |
| Autistic behavior: affected (N=1,121), unaffected (N=5,866) | Autism spectrum disorder | 0.12 | 0.033 | 2.53x10-4 | 0.006 (c) |
| Developmental delay or intellectual disability: severe (N=911), mild/moderate (N=1,902) (b) | Educational attainment | 0.116 | 0.04 | 0.004 | 0.008 (c) |

(a) Results are from linear or logistic regression of measured traits in the DDD Study against the respective polygenic score, including ten ancestry principal components as covariates. P-values are two-sided, from t-distribution (linear) and z-score distribution (logistic).

(b) Severe cases were labelled as 1 in the logistic regression.

(c) Nagelkerke R2

The study of human disease genetics has often been segregated into rare, single gene disorders, and common complex disorders. There is abundant evidence that rare variants in individual genes can cause phenotypes seen much more commonly in individuals without a monogenic cause, including genes for maturity onset diabetes of the young23 and familial Parkinson’s disease24. There is also emerging evidence that the cumulative effect of common variants can modify the penetrance of rare variants in complex phenotypes like educational attainment25, schizophrenia26 and breast cancer27. Here we have shown that the same interplay between rare and common variation exists even in severe neurodevelopmental disorders typically presumed to be monogenic. Previous studies have shown that the penetrance and expression of these disorders are affected by which specific missense variant is carried28 and the presence of mutations in secondary modifier genes29. Here, we have demonstrated that they are also modified by common variants that influence neurodevelopmental traits in the general population. We analysed individuals of European ancestry (as, alas, do the vast majority of published GWAS) and, since the genetic architecture of neurodevelopmental disorders may differ between populations30, further studies will be required to generalise our findings. Altogether, our findings suggest that fully understanding the genetic architecture of neurodevelopmental disorders will require considering the full spectrum of alleles from those unique to an individual to those shared across continents.

Methods

DDD cohort phenotypes

Recruitment and phenotyping of DDD patients is described in detail elsewhere6,7. The DDD study has UK Research Ethics Committee approval (10/H0305/83, granted by the Cambridge South Research Ethics Committee and GEN/284/12, granted by the Republic of Ireland Research Ethics Committee). Families gave informed consent for participation. Briefly, the DDD study recruited patients with a previously undiagnosed developmental disorder, in the UK and Ireland. Patient phenotypes were systematically recorded by clinical geneticists using Human Phenotype Ontology (HPO) terms in a central database, DECIPHER21.

The DDD cohort is very heterogeneous in terms of patient phenotypes, and so we narrowed our analyses to singleton patients and trios where the proband had at least one of the following HPO terms or daughter terms of: abnormal metabolic brain imaging by MRS (HP:0012705), abnormal brain positron emission tomography (HP:0012657), abnormal synaptic transmission (HP:0012535), abnormal nervous system electrophysiology (HP:0001311), behavioural abnormality (HP:0000708), seizures (HP:0001250), encephalopathy (HP:0001298), abnormality of higher mental function (HP:0011446), neurodevelopmental abnormality (HP:0012759), abnormality of the nervous system morphology (HP:0012639). This “neurodevelopmental” subset included both individuals who have since recruitment to the DDD study been found to carry diagnostic exome mutations in protein-coding genes6,19,20,31, and individuals who are awaiting diagnosis. We therefore define our main phenotype, “neurodevelopmental disorder risk”, as the risk of having a previously undiagnosed developmental disorder and being included in the DDD study, and having at least one neurodevelopmental HPO. In addition to HPOs, some DDD patients also had a clinical record of growth measurements such as height, birth weight and head circumference.

We counted the proportion of DDD patients with particular medically relevant HPOs, displayed in Figure 2a. Individuals with the HPO were counted using a word search of the particular HPO and its daughter nodes. When counting the number of distinct organ systems affected in each DDD patient (Figure 2b), we faced the issue that some HPOs fall under multiple organ systems. For example, microcephaly, a common term in the cohort, falls under three categories: "nervous system", "head or neck" and "skeletal system". In order to assign each HPO to only one organ system, we first ranked organ systems by the number of individuals in the full DDD cohort with at least one term under that system (Extended Data Table 1). We then looked for individuals with at least one HPO under the organ system ranked most commonly affected, and assigned these individuals an organ system count of one. We then removed these HPOs from the patients’ lists, before continuing to identify individuals with at least one HPO in the organ system ranked second most commonly affected. We continued to count organ systems and remove HPOs until we had assigned all individuals a count of organ systems affected out of 19 non-overlapping systems.
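The counting procedure above can be illustrated with a minimal Python sketch. It is not the code used in the study. The function name, the data structures and the toy term labels are hypothetical, and the sketch ranks organ systems within whatever patient set it is given, whereas the ranking in the text was derived from the full DDD cohort.

```python
from collections import Counter


def count_organ_systems(patient_terms, term_systems):
    """Count distinct organ systems per patient, using each HPO term once.

    patient_terms: dict of patient ID -> set of HPO terms (hypothetical input).
    term_systems:  dict of HPO term -> set of organ systems it falls under.
    """
    # Rank systems by how many patients have at least one term under them.
    prevalence = Counter()
    for terms in patient_terms.values():
        systems = set()
        for term in terms:
            systems |= term_systems.get(term, set())
        prevalence.update(systems)
    # Ties are broken alphabetically so the ranking is deterministic (an assumption).
    ranked = sorted(prevalence, key=lambda s: (-prevalence[s], s))

    counts = {}
    for patient, terms in patient_terms.items():
        remaining = set(terms)
        n_systems = 0
        for system in ranked:
            hits = {t for t in remaining if system in term_systems.get(t, set())}
            if hits:
                n_systems += 1
                remaining -= hits  # a term counted here cannot count again
        counts[patient] = n_systems
    return counts


# Toy example: microcephaly falls under three systems but is counted only once.
term_systems = {
    "microcephaly": {"nervous system", "head or neck", "skeletal system"},
    "seizures": {"nervous system"},
    "scoliosis": {"skeletal system"},
}
patient_terms = {
    "p1": {"microcephaly"},
    "p2": {"microcephaly", "scoliosis"},
    "p3": {"seizures"},
}
counts = count_organ_systems(patient_terms, term_systems)
```

In the toy data the most prevalent system ("nervous system") absorbs microcephaly, so patient p1 is assigned a single affected system rather than three.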

Australian developmental disorder cohort phenotypes

We obtained a replication cohort of 1,270 developmental disorder cases from South Australia, originally genotyped (using the Illumina Infinium CytoSNP-850k BeadChip) as part of routine clinical care to ascertain pathogenic copy number variants. The majority (>95%) were under 18 years old. 50-60% were recruited through clinical genetics units, and the rest through neurologists, neonatologists, paediatricians and cardiologists. Based on reviewing information on the request forms, the majority of patients had developmental delay/intellectual disability and malformations involving at least one organ (e.g. brain, heart, and kidney). 15-20% were recruited as neonates with multiple malformations involving brain, heart and/or other organs, and were too young to be diagnosed with developmental delay/intellectual disability.

Datasets and Quality Control

We genotyped 11,304 patients and 930 full trios recruited to the DDD study, on Illumina HumanCoreExome and HumanOmniExpress chips, respectively. Genotyping was carried out by the Wellcome Trust Sanger Institute genotyping facility. As controls for the discovery GWAS, we used genotype data for 10,484 individuals from the UK-based Understanding Society (UKHLS)32,33. Recruitment to this study was carried out through UK-wide household longitudinal survey. For replication, we obtained GWAS data from a cohort of neurodevelopmental disorder cases from South Australia and population-matched controls from the Brisbane Longitudinal Twin Study (Queensland Institute of Medical Research34,35). All data were on GRCh37, and detailed information of genotyping chips is shown in Supplementary Table 1.

We performed variant and sample quality control for each dataset separately. We removed samples whose reported sex was inconsistent with the genotype data, who had high sample missingness (≥3% of MAF≥10% variants), samples with high or low heterozygosity (±3 standard deviations from the mean, using MAF≥10% variants) to control for admixture and inbreeding, and sample duplicates (alleles identical by descent ≥98%, using MAF>10% variants). We removed one individual from pairs of related individuals (alleles identical by descent >12%, using PLINK) from the case-control cohorts. Individuals in the discovery cohort were not related to the independent DDD trios. We also removed trios with a high number of Mendel errors (>2,000 errors). For variant quality control, we removed variants if they had high genotype missingness (≥3%), Hardy Weinberg Equilibrium test p<1×10-5, no strand information, if they were duplicates, if the alleles were discordant between case and control datasets, or if alleles and their frequency in Europeans were discordant with HRC v1.1 imputation reference panel. We only included variants on chr1-22. For the HumanCoreExome data and the Australian data, we removed rare variants MAF≤0.5% before imputation. Post imputation, we removed imputed variants with INFO≤0.9 or high missingness (≥5%).
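As an illustration of one of the sample filters described above, the heterozygosity criterion (±3 standard deviations from the mean) might be sketched as follows. This is an illustrative sketch only. The function name and input format are hypothetical, and the per-sample heterozygosity rates are assumed to have been computed beforehand from MAF≥10% variants.

```python
from statistics import mean, stdev


def heterozygosity_outliers(het_rates, n_sd=3.0):
    """Return the IDs of samples whose heterozygosity rate lies more than
    n_sd standard deviations from the cohort mean.

    het_rates: dict of sample ID -> heterozygosity rate (hypothetical input).
    """
    values = list(het_rates.values())
    centre = mean(values)
    spread = stdev(values)
    lower = centre - n_sd * spread
    upper = centre + n_sd * spread
    return {s for s, h in het_rates.items() if h < lower or h > upper}


# Toy example: nineteen typical samples and one with excess heterozygosity.
rates = {f"s{i}": 0.30 + 0.001 * (i % 3) for i in range(19)}
rates["s_out"] = 0.50
flagged = heterozygosity_outliers(rates)
```

Because the mean and standard deviation are computed with the outliers included, an extreme sample inflates the spread. Very small cohorts therefore cannot produce a 3 SD outlier at all, which is one reason such filters are applied to large batches.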

We defined sample ancestry based on a projection principal component analysis (PCA) using PLINK with 1000 Genomes Phase 3 populations, using SNPs that overlapped between the datasets (DDD+UKHLS and Australian cases+controls separately) and the reference populations. For this, we used SNPs with a minor allele frequency (MAF) of ≥10%, excluded A/T G/C SNPs, removed regions of extended linkage disequilibrium (including the HLA region), and thinned the SNPs by pruning those with pairwise r2>0.2 in batches of 50 SNPs with sliding windows of 5 (“--indep-pairwise 50 5 0.2” in PLINK). This left 52,836 SNPs for the projection PCA with the DDD/UKHLS data and 40,626 SNPs with the Australian data. For analyses described in this paper, we carried forward individuals of European ancestry, defined by selecting samples clustering around the 1000 Genomes Great British (GBR) samples in the PCA (Extended Data Figures 1 and 3). The distribution of ancestries was different between cases and controls, likely due to marked differences in ascertainment (e.g. individuals from ancestries with high levels of consanguinity are more likely to be recruited to studies of rare genetic disorders). Because we tightly filtered based on PCA, these differences do not affect our results.

Phasing and imputation

After sample and variant quality control, we imputed European samples from all datasets in order to boost the coverage of the genome for association testing and to increase overlap of datasets genotyped on different chips. We used reference-based haplotype phasing and imputation. The discovery GWAS cohorts genotyped on the HumanCoreExome backbone were phased and imputed together using variants that intersected between the different versions of the chip. Trios were phased and imputed in a second batch because they were genotyped on a different chip. We phased and imputed the Australian GWAS data in a third batch, using variants that intersected between the CytoSNP-850K chip and the Illumina 610K chip. None of the analyses in our paper were performed directly across batches, so there is no bias introduced by this approach. We used the Sanger Institute Imputation Service36 to carry out phasing and imputation on the DDD discovery dataset, DDD trios dataset and Australian dataset, using Eagle2 (v2.0.5)37 for phasing and PBWT38 for imputation, selecting the Haplotype Reference Consortium as the reference panel (release 1.1, chr1-22, X) 36.

Discovery GWAS of neurodevelopmental disorder risk

We carried out a genome-wide association study for neurodevelopmental disorder risk in the discovery neurodevelopmental set of 6,987 cases and 9,270 controls of European ancestry only, using BOLT linear mixed models39 with sex as a covariate. We included in our analysis genotyped variants or high-confidence imputed variants (INFO≥0.9) with a MAF of ≥5%.

SNP heritability

From the discovery GWAS summary statistics, we removed the MHC region (chromosome 6, region 26–34 Mb), and estimated trait heritability using LDSC8 in LD Hub40. Given the ascertainment of the DDD neurodevelopmental cases in this study, estimating the true population prevalence was not feasible. We therefore estimated single nucleotide polymorphism (SNP) heritability for our discovery GWAS on the liability scale for a range of prevalences between 0.2% and 2%, and found that SNP heritability varies from 5.5% (SE=1.5%) to 9.1% (SE=2.5%). We report heritability assuming a prevalence of 1% in the population. Heritability on the observed scale in our discovery GWAS was 13.8% (SE=3.7%).
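The dependence of the liability-scale estimate on the assumed prevalence can be illustrated with the standard observed-to-liability transformation for ascertained case-control data. The sketch below assumes that this is the conversion applied, which the text does not state explicitly. The function name is hypothetical and the inputs are the rounded figures quoted above, so the output only approximates the reported values.

```python
from statistics import NormalDist


def liability_scale_h2(h2_observed, sample_prevalence, population_prevalence):
    """Convert observed-scale SNP heritability from a case-control sample to
    the liability scale:

        h2_liab = h2_obs * K^2 (1 - K)^2 / (z^2 * P (1 - P))

    K is the assumed population prevalence, P the proportion of cases in the
    sample, and z the standard normal density at the liability threshold.
    """
    K = population_prevalence
    P = sample_prevalence
    threshold = NormalDist().inv_cdf(1.0 - K)
    z = NormalDist().pdf(threshold)
    return h2_observed * (K * (1.0 - K)) ** 2 / (z ** 2 * P * (1.0 - P))


# Figures quoted in the text: 6,987 cases, 9,270 controls, observed h2 of 13.8%.
case_fraction = 6987 / (6987 + 9270)
for prevalence in (0.002, 0.01, 0.02):
    estimate = liability_scale_h2(0.138, case_fraction, prevalence)
    print(f"prevalence {prevalence:.1%}: liability-scale h2 = {estimate:.1%}")
```

Under these assumptions the estimate for a 1% prevalence falls close to the reported 7.7%, and it increases with the assumed prevalence across the 0.2% to 2% range.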

Polygenic transmission disequilibrium test

We used the pTDT method, described in ref. 11, to investigate transmission disequilibrium of effect alleles for traits within DDD trios, using imputed genotype data. Briefly, the test compares the means of two polygenic score distributions: one comprising the scores of the probands, and the other the average parent-pair scores. The test is equivalent to a one-sample t-test, assessing whether the mean of the score distribution in probands deviates from the mean of the parent-pair score averages. We report a one-sided p-value for over-transmission.
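A minimal sketch of the test statistic, under stated assumptions, is given below. The deviation of each proband from the mid-parent score is scaled by the standard deviation of the mid-parent scores, which is our reading of the pTDT definition and does not change the t statistic. The function name and inputs are hypothetical. The one-sided p-value uses a normal approximation to the t distribution, which is reasonable only for large numbers of trios.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def ptdt(child_scores, father_scores, mother_scores):
    """One-sample t-test of proband polygenic scores against mid-parent scores.

    Returns (t statistic, degrees of freedom, approximate one-sided p-value
    for over-transmission).
    """
    mid_parent = [(f + m) / 2.0 for f, m in zip(father_scores, mother_scores)]
    scale = stdev(mid_parent)
    deviations = [(c - mp) / scale for c, mp in zip(child_scores, mid_parent)]
    n = len(deviations)
    t_stat = mean(deviations) / (stdev(deviations) / sqrt(n))
    # Normal approximation (assumption): adequate when n is in the hundreds.
    p_one_sided = 1.0 - NormalDist().cdf(t_stat)
    return t_stat, n - 1, p_one_sided


# Toy trios in which probands tend to score above their mid-parent value.
fathers = [0.1, -0.3, 0.5, 0.0, -0.2]
mothers = [0.2, 0.1, -0.4, 0.3, -0.1]
children = [0.35, 0.0, 0.35, 0.15, 0.05]
t_stat, df, p_value = ptdt(children, fathers, mothers)
```

For an exact p-value, the t distribution with the returned degrees of freedom would be used in place of the normal approximation.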

Genetic correlation

We carried out genetic correlation of the neurodevelopmental disorder risk discovery GWAS against multiple published traits using bivariate LDSC14. For traits included in LD Hub we used the online server, and for traits not included in LD Hub we used the LDSC software. For genetic correlation with neurodevelopmental disorder risk, we pre-selected a range of different types of traits and diseases: traits relating to cognitive performance, education, psychiatric traits and diseases, anthropometric traits and non-brain related traits and diseases. Ninety-five percent confidence intervals in Figure 3 are shown before correction for multiple testing. We set the significance threshold to p<0.0026 (0.05/19 tests).

Partitioned heritability

We used partitioned LDSC17 to look for enrichment of heritability in cell type groups and functional genomic categories. To do this we used the baseline model LD scores and regression weights available online. For cell type groups and functional categories we set the significance threshold to P<0.005 (0.05/10 tests) and P<9.6x10-4 (0.05/52 tests), respectively.
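The multiple-testing thresholds quoted in this section and in the genetic correlation analysis follow from a Bonferroni correction, which can be checked with a short sketch (the function name is hypothetical):

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error
    rate at alpha across n_tests tests."""
    return alpha / n_tests


# Numbers of tests as stated in the Methods text.
analyses = [
    ("genetic correlation traits", 19),
    ("cell type groups", 10),
    ("functional categories", 52),
]
for label, n_tests in analyses:
    print(f"{label}: P < {bonferroni_threshold(0.05, n_tests):.2g}")
```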

Polygenic scores

We constructed polygenic scores using summary statistics from our neurodevelopmental disorder risk GWAS and seven published GWAS (educational attainment15, intelligence16, schizophrenia41, autism9, intracranial volume42, height43 and birth weight44). For all traits, we included only variants that had a MAF≥5% and were directly genotyped or imputed with high confidence (INFO≥0.9) in the respective study cohort (discovery case-control, trios or Australians). To construct the polygenic scores for individuals, we then multiplied the variant effects (betas) with the individual’s allele counts. For imputed variants, we used genotype probabilities rather than hard-called allele counts. To find independent variants for our scores, we pruned variants intersecting the original study summary statistics and our GWAS data using PLINK, by taking the top variant and removing variants within 500kb and that have r2≥0.1 with the top variant. We then repeated the process until no variant had a P-value below a pre-defined threshold, which we based on prior knowledge of variance in the phenotype explained. For the neurodevelopmental disorder risk score, we tested seven P-value thresholds (P<1, 0.5, 0.1, 0.05, 0.01, 0.005, 0.001) and chose the one which resulted in a score that explained the most variance (Nagelkerke’s R2) in case/control status in an independent subset of DDD patients. Specifically, we repeated our neurodevelopmental disorder risk GWAS having removed a random subset of 20% of cases and controls, then calculated a score in this leave-out subset, and performed a logistic regression to assess association of case-control status with the score. The threshold P<1 performed best in ten independent permutations, and we used this threshold to construct scores in pTDT and Australian case-control analyses. We additionally tested all seven thresholds when constructing scores in the Australians, however varying the threshold did not change our results. 
When deciding the P-value thresholds for published GWAS, we used the threshold that had been found to explain the most variation in other published studies for the trait (years in education, P<1 [ref. 45]; intelligence [ref. 16]; schizophrenia, P<0.05 [ref. 41]; autism, P<0.1 [ref. 11]). For traits for which we had phenotype data in the DDD, we used thresholds that explained the most variation in DDD cases (intracranial volume P<1, birth weight P<0.01, height P<0.005). Thresholds and the number of variants used for each score are shown in Extended Data Tables 3-5. All scores were normalised to a mean of 0 and variance of 1. To test for association between trait and score, we used R (version 1.90b3) to perform logistic regression for binary traits and linear regression for quantitative traits, including the first ten principal components from the ancestry PCA to control for possible population stratification.
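The score construction described above (effect sizes multiplied by allele counts, with genotype probabilities used for imputed variants, followed by normalisation) can be sketched as follows. This is an illustrative sketch rather than the pipeline used in the study. The function names, data structures and variant identifiers are hypothetical, and the variants passed in are assumed to have already been clumped and thresholded.

```python
from statistics import mean, pstdev


def expected_dosage(p_het, p_hom_effect):
    """Expected effect-allele count from imputed genotype probabilities:
    one copy for heterozygotes, two for effect-allele homozygotes."""
    return p_het + 2.0 * p_hom_effect


def polygenic_scores(dosages, betas):
    """Sum beta * dosage over variants, then standardise across samples.

    dosages: dict of sample ID -> dict of variant ID -> effect-allele dosage.
    betas:   dict of variant ID -> effect size from the source GWAS.
    """
    raw = {
        sample: sum(betas[v] * geno[v] for v in betas)
        for sample, geno in dosages.items()
    }
    centre = mean(raw.values())
    spread = pstdev(raw.values())  # population SD, so the variance is exactly 1
    return {sample: (score - centre) / spread for sample, score in raw.items()}


# Toy example with two variants and three samples.
betas = {"rs1": 0.10, "rs2": -0.05}
dosages = {
    "a": {"rs1": 0.0, "rs2": 2.0},
    "b": {"rs1": 1.0, "rs2": 1.0},
    "c": {"rs1": 2.0, "rs2": expected_dosage(0.5, 0.25)},
}
scores = polygenic_scores(dosages, betas)
```

Using expected dosages rather than hard calls carries imputation uncertainty into the score, so a poorly imputed genotype contributes a value between the possible allele counts instead of a potentially wrong integer.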

In order to assess power for detecting differences in scores between diagnosed and undiagnosed patients, we tested the hypothesis that diagnosed patients were effectively a random sample of controls with respect to their polygenic scores. Specifically, we randomly sampled 1,127 controls (i.e. the same number as we had diagnosed patients) and compared the polygenic scores between them and the undiagnosed patients using logistic regression. We repeated this 10,000 times and determined the proportion of times we detected a significant difference P<0.007 (P<0.05/7 correcting for seven polygenic scores) as proxy for power. For educational attainment, this was 99.1% of simulations, 93.6% for schizophrenia, and 61.2% for intelligence.
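The resampling scheme above can be sketched in a few lines. Note one deliberate substitution: the study compared groups using logistic regression, whereas this sketch uses a two-sample z-test on mean scores so that it runs with the standard library alone. All names and the synthetic scores are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, variance


def two_sample_p(a, b):
    """Two-sided p-value for a difference in means (large-sample z-test).
    This stands in for the logistic regression used in the study."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def resampling_power(control_scores, undiagnosed_scores, n_draw, n_reps, alpha, seed=0):
    """Proportion of replicates in which a random draw of controls, standing
    in for the diagnosed patients, differs significantly from the
    undiagnosed patients."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        pseudo_diagnosed = rng.sample(control_scores, n_draw)
        if two_sample_p(pseudo_diagnosed, undiagnosed_scores) < alpha:
            hits += 1
    return hits / n_reps


# Synthetic scores: undiagnosed patients shifted one SD below controls.
gen = random.Random(1)
controls = [gen.gauss(0.0, 1.0) for _ in range(200)]
undiagnosed = [gen.gauss(-1.0, 1.0) for _ in range(200)]
power = resampling_power(controls, undiagnosed, n_draw=100, n_reps=50, alpha=0.05 / 7)
```

In the study the draw size was 1,127 and the procedure was repeated 10,000 times. The smaller numbers here only keep the example fast.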

We used AVENGEME46 to calculate power to find significant association (at P<0.05) between our polygenic score for neurodevelopmental disorders and case/control status in the Australian dataset. We assumed that the SNP heritability is the same (7.7%) in both the Australian and British cohorts, and that the genetic correlation between them was 1.

The schizophrenia PGC-CLOZUK study included some controls from the Australian cohort used in our study. We therefore ran polygenic score analyses in the Australians using PGC-CLOZUK summary statistics from which these samples had been removed (obtained through personal communication from A. Pardiñas).

Subsetting the DDD cohort

We defined a set of patients with an exonic diagnosis and a set with no likely diagnostic variants. This was based on the clinical filtering procedure described previously [6], which focuses on identifying rare, damaging variants that fit an appropriate inheritance mode in a set of genes known to cause developmental disorders (https://www.ebi.ac.uk/gene2phenotype/). Variants that pass clinical filtering are uploaded to DECIPHER, where the patients’ clinicians classify them as “definitely pathogenic”, “likely pathogenic”, “uncertain”, “likely benign” or “benign”. This process of clinical classification is necessarily dynamic as new disorders are identified and patients manifest new phenotypes. Our “diagnosed” set consists of 1,127 patients who fulfilled at least one of these criteria: a) they were amongst the diagnosed set in a recent reanalysis of the first 1,133 trios [47], or b) they had at least one variant (or pair of compound heterozygous variants) rated as “definitely pathogenic” or “likely pathogenic” by a clinician, or c) they had at least one variant (or pair of compound heterozygous variants) in a class with a high positive predictive value that passed clinical filtering but had not yet been rated by clinicians. We considered de novo or compound heterozygous loss-of-function (LoF) variants to have high positive predictive value, since of the ones that had been rated by clinicians, 100% of compound heterozygous LoFs and 94.% of de novo LoFs had been classed as “definitely” or “likely pathogenic”. Our “undiagnosed” set consists of 2,479 patients who had no variants that passed our clinical filtering, or in whom the variants that had passed clinical filtering had all been rated as “likely benign” or “benign” by clinicians, or who were amongst the “undiagnosed” set in the first 1,133 trios that have previously been extensively clinically reviewed [6].
Note that our diagnosed versus undiagnosed analysis excludes 3,375 patients who had one or more variants that passed clinical filtering in a class with a relatively low positive predictive value, but that have not yet been rated by clinicians.
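The grouping rules above can be written as a small decision function. The sketch below is an illustrative reading of the criteria, not the study's pipeline: the function name, the input layout, the order in which the rules are checked and the handling of variants rated “uncertain” (treated here as excluded) are assumptions.

```python
PATHOGENIC = {"definitely pathogenic", "likely pathogenic"}
BENIGN = {"likely benign", "benign"}

def assign_group(variants, diagnosed_in_reanalysis=False, undiagnosed_in_first_trios=False):
    """Assign a patient to 'diagnosed', 'undiagnosed' or 'excluded'.

    variants: list of (rating, high_ppv) tuples, one per variant (or compound
    heterozygous pair) that passed clinical filtering. rating is the clinician's
    classification, or None if not yet rated; high_ppv is True for classes with
    high positive predictive value (de novo or compound heterozygous LoF).
    """
    # Criterion (a): diagnosed in the reanalysis of the first 1,133 trios.
    if diagnosed_in_reanalysis:
        return "diagnosed"
    # Criterion (b): at least one variant rated definitely or likely pathogenic.
    if any(rating in PATHOGENIC for rating, _ in variants):
        return "diagnosed"
    # Criterion (c): at least one unrated variant in a high-PPV class.
    if any(rating is None and high_ppv for rating, high_ppv in variants):
        return "diagnosed"
    # Undiagnosed: nothing passed filtering, or everything was rated benign.
    if not variants or all(rating in BENIGN for rating, _ in variants):
        return "undiagnosed"
    # Undiagnosed: previously reviewed and left undiagnosed in the first trios.
    if undiagnosed_in_first_trios:
        return "undiagnosed"
    # Remaining patients (e.g. unrated variants in low-PPV classes) are left out.
    return "excluded"
```

For example, `assign_group([(None, False)])` returns `"excluded"`, which corresponds to the patients omitted from the diagnosed versus undiagnosed comparison.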

We defined patients to present with autistic behaviours if their phenotype included autistic behaviour (HP:0000729) or any of its daughter nodes. We defined patients as having “mild/moderate intellectual disability or delay” if their HPO phenotypes included borderline, mild or moderate intellectual disability (HP:0006889, HP:0001256, HP:0002342) and/or mild or moderate global developmental delay (HP:0011342, HP:0011343). Patients were included in the “severe ID or delay” set if they had severe or profound intellectual disability (HP:0010864, HP:0002187) and/or severe or profound global developmental delay (HP:0011344, HP:0012736). We excluded patients with ID or global developmental delay of undefined severity.
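A minimal Python sketch of the severity grouping follows, using the HPO identifiers listed above. The function name is hypothetical, and returning no group for a patient who carries both mild/moderate and severe terms is an assumption, since the text does not say how such patients were handled.

```python
MILD_MODERATE_TERMS = {
    "HP:0006889", "HP:0001256", "HP:0002342",   # borderline, mild, moderate ID
    "HP:0011342", "HP:0011343",                 # mild, moderate global delay
}
SEVERE_TERMS = {
    "HP:0010864", "HP:0002187",                 # severe, profound ID
    "HP:0011344", "HP:0012736",                 # severe, profound global delay
}

def severity_group(hpo_terms):
    """Return 'mild/moderate', 'severe', or None if severity is undefined."""
    terms = set(hpo_terms)
    has_mild = bool(terms & MILD_MODERATE_TERMS)
    has_severe = bool(terms & SEVERE_TERMS)
    if has_mild and has_severe:
        return None          # assumption: conflicting severities are left ungrouped
    if has_severe:
        return "severe"
    if has_mild:
        return "mild/moderate"
    return None              # no severity term recorded: excluded
```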

Extended Data

Extended Data Figure 1. Ancestry principal components analysis of UK and Australian samples.


a, b, Reference samples (N=2,504) from 1000 Genomes Phase 3, coloured by the five super populations, used for a projection PCA of UK cohorts (DDD and UKHLS) (a) or Australian cohorts (b). c, All DDD cases (discovery N=11,304 and from trios N=930), and d, all Australian cases (N=2,283) from their respective projection PCA with 1000 Genomes. Case samples with European ancestry are plotted in red and non-Europeans in grey. e, All UKHLS controls (N=10,396) and f, all Australian controls (N=4,274) from their respective projection PCA with 1000 Genomes. Control samples with European ancestry are plotted in blue and non-Europeans in grey. All cases and controls coloured in grey (panels c, d, e and f) were excluded from analysis due to non-European ancestry. UK cohorts are plotted after removal of samples that failed quality control, and Australian cohorts before removal of samples that failed quality control.

Extended Data Figure 2. Discovery GWAS of neurodevelopmental disorder risk.


a, Manhattan plot of the neurodevelopmental disorder discovery GWAS, with 6,987 DDD cases and 9,270 ancestry-matched UKHLS controls (both European ancestry), using 4,134,438 variants with MAF≥5% on chromosomes 1-22. P-values were from a two-tailed chi-squared distribution. Red line = threshold for genome-wide significance (P=5×10⁻⁸). b, Quantile-quantile plot of the neurodevelopmental disorder discovery GWAS. Red line = expected values under the null.

Extended Data Figure 3. Ancestry principal components analysis of UK and Australian samples (PCs 2-5).


Reference samples (N=2,504) from 1000 Genomes Phase 3, coloured by the five super populations, are plotted on the left-hand side, from projection PCAs with UK cohorts. Middle panels show the PCs plotted for DDD cases (discovery N=10,556 and from trios N=911) (UK samples) and Australian cases (N=2,283). Red=European ancestry case samples, grey=non-European samples, which were excluded from analyses. Right-hand panels show PCs for UKHLS controls (N=10,396) (UK samples) and Australian controls (N=4,274). Blue=European ancestry control samples, grey=non-European samples, which were excluded from analyses. UK cohorts are plotted after removal of samples that failed quality control, and Australian cohorts before removal of samples that failed quality control.

Extended Data Table 1. Proportions of neurodevelopmental disorder patients who have at least one HPO term belonging to a particular organ system category.

Organ system | % of all DDD patients (N=13,558) | % of unrelated DDD patients, GBR ancestry (N=6,987)
Nervous system | 87 | 100
Head or neck | 68.9 | 71.2
Skeletal system | 61.7 | 61.8
Limbs | 35.1 | 35.3
Eye | 34.9 | 35.3
Integument | 31.2 | 31.9
Ear | 20.1 | 19.7
Digestive system | 20 | 19.1
Musculature | 19.9 | 18.7
Cardiovascular system | 15.1 | 13.5
Genitourinary system | 12.4 | 11.4
Respiratory system | 8.1 | 7.3
Connective tissue | 7.4 | 6.3
Immune system | 6.8 | 6.5
Endocrine system | 4.1 | 4.1
Metabolism homeostasis | 4.1 | 4
Breast | 3.7 | 3.7
Blood and blood forming tissues | 2.1 | 2.1
Voice | 1.1 | 1.1

The HPO tree descends from “phenotypic abnormality”, through different organ systems, down to specific terms describing particular phenotypes. Each HPO term used by clinicians to describe patients was traced up the tree to the organ system level. However, some HPOs may belong to more than one organ system category: for example, microcephaly will be counted under "nervous system", "head or neck" and "skeletal system" in the HPO tree, whilst global developmental delay will only appear under "nervous system".
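The tracing of terms up the tree can be sketched as a graph traversal. The Python example below uses a toy parent map invented for illustration (it is not the real HPO structure, and the intermediate node is hypothetical); it collects every organ-system node, defined as a direct child of the root, that is reachable from a term, and then counts each patient once per organ system.

```python
def organ_systems(term, parents, root="phenotypic abnormality"):
    """Return all organ-system nodes (direct children of root) above a term."""
    systems, seen, stack = set(), set(), [term]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for parent in parents.get(node, []):
            if parent == root:
                systems.add(node)      # node sits at the organ-system level
            else:
                stack.append(parent)   # keep climbing
    return systems

def organ_system_percentages(patient_terms, parents):
    """Percentage of patients with at least one term in each organ system."""
    counts = {}
    for terms in patient_terms:
        patient_systems = set()
        for term in terms:
            patient_systems |= organ_systems(term, parents)
        for system in patient_systems:
            counts[system] = counts.get(system, 0) + 1
    return {s: 100.0 * c / len(patient_terms) for s, c in counts.items()}

# Toy parent map (assumed for illustration only).
TOY_PARENTS = {
    "nervous system": ["phenotypic abnormality"],
    "head or neck": ["phenotypic abnormality"],
    "skeletal system": ["phenotypic abnormality"],
    "abnormal skull size (toy)": ["skeletal system"],
    "microcephaly": ["nervous system", "head or neck", "abnormal skull size (toy)"],
    "global developmental delay": ["nervous system"],
}
```

With this toy map, `organ_systems("microcephaly", TOY_PARENTS)` yields three organ systems, whereas global developmental delay maps only to the nervous system, mirroring the example in the text.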

Extended Data Table 2. Genetic correlations between neurodevelopmental disorder risk and a range of traits, calculated using the LD score method.

Trait 2 | Genetic correlation between developmental disorder risk and trait 2 | Standard error | 95% confidence interval (standard error) lower bound | 95% confidence interval (standard error) upper bound | P-value | SNP heritability for trait 2 (a) | SE for trait 2 SNP heritability | Population prevalence used for liability scale conversion | N for trait 2 GWAS (Ncases:Ncontrols if dichotomous)
Years of schooling | -0.491 | 0.079 | -0.336 | -0.645 | 5.31×10⁻¹⁰ | 0.112 | 0.004 |  | 766,345
Intelligence (Spearman's g) | -0.441 | 0.104 | -0.237 | -0.645 | 2.15×10⁻⁵ | 0.203 | 0.013 |  | 78,308
Schizophrenia | 0.279 | 0.066 | 0.148 | 0.409 | 2.71×10⁻⁵ | 0.242 | 0.008 | 0.01 | 11,260:24,542
Attention deficit hyperactivity disorder | 0.727 | 0.292 | 0.155 | 1.299 | 0.013 | 0.071 | 0.031 |  | 17,666
Major depressive disorder | 0.389 | 0.177 | 0.042 | 0.736 | 0.028 | 0.087 | 0.017 | 0.15 | 9,240:9,519
Childhood IQ | -0.252 | 0.153 | 0.048 | -0.553 | 0.1 | 0.279 | 0.051 |  | 12,441
Autism spectrum disorder | -0.078 | 0.103 | 0.123 | -0.28 | 0.445 | 0.118 | 0.01 | 0.012 | 18,381:27,969
Bipolar disorder | 0.033 | 0.105 | -0.172 | 0.238 | 0.751 | 0.25 | 0.023 | 0.01 | 7,481:9,250
Height | -0.176 | 0.07 | -0.038 | -0.314 | 0.012 | 0.336 | 0.021 |  | 253,288
Body mass index | 0.174 | 0.071 | 0.035 | 0.312 | 0.015 | 0.189 | 0.01 |  | 336,107
Child birth length | -0.291 | 0.155 | 0.013 | -0.595 | 0.061 | 0.165 | 0.027 |  | 28,459
Intracranial volume | -0.319 | 0.218 | 0.107 | -0.746 | 0.142 | 0.167 | 0.053 |  | 11,373
Birth weight | -0.133 | 0.098 | 0.059 | -0.326 | 0.174 | 0.095 | 0.008 |  | 143,677
Alzheimer's disease | 0.424 | 0.259 | -0.083 | 0.932 | 0.101 | 0.068 | 0.013 | 0.05 | 17,008:37,154
Coronary artery disease | 0.077 | 0.091 | -0.101 | 0.254 | 0.396 | 0.07 | 0.005 | 0.05 | 60,801:123,504
Lumbar Spine bone mineral density | 0.101 | 0.132 | -0.158 | 0.36 | 0.447 | 0.116 | 0.018 |  | 44,731
Parkinson's disease | 0.093 | 0.136 | -0.173 | 0.359 | 0.494 | 0.167 | 0.05 | 0.002 | 1,713:3,978
Type 2 Diabetes | 0.071 | 0.122 | -0.168 | 0.309 | 0.562 | 0.12 | 0.147 | 0.015 | 12,171:56,862
Crohn's disease | -0.024 | 0.096 | 0.164 | -0.211 | 0.804 | 0.252 | 0.027 | 0.003 | 5,956:14,927

Trait 2 is the trait that neurodevelopmental disorder risk is compared to. Uncorrected P-values are from a two-sided z-score.

(a) SNP heritability for dichotomous traits is on the liability scale.
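As a worked illustration of how a two-sided z-score P-value and an interval can be derived from an estimate and its standard error, the Python sketch below assumes a normal approximation with bounds at the estimate plus or minus 1.96 standard errors. This is an assumption about how the table's bounds were obtained, and the function name is hypothetical. Because the tabulated estimates and standard errors are rounded, recomputed values only approximately match the table.

```python
import math

def z_test(estimate, se, z_crit=1.96):
    """Return (z, two-sided P-value, (lower, upper)) under a normal approximation."""
    z = estimate / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))   # equals 2 * (1 - Phi(|z|))
    interval = (estimate - z_crit * se, estimate + z_crit * se)
    return z, p_value, interval

# Schizophrenia row: genetic correlation 0.279, standard error 0.066.
z, p_value, (lower, upper) = z_test(0.279, 0.066)
```

For the schizophrenia row this gives a P-value of the same order as the tabulated 2.71×10⁻⁵ and bounds close to 0.148 and 0.409.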

Extended Data Table 3. Polygenic score analyses comparing 1,266 Australian neurodevelopmental cases and 1,688 controls.

Columns 1-4: polygenic score parameters. Columns 5-7: results (a).

Polygenic score | r2 for SNP pruning | P-value threshold for SNP pruning | Number of SNPs in score | Beta | Standard error | P-value
Educational attainment (SSGAC, 2018) | 0.1 | 1 | 92,091 | -0.218 | 0.038 | 9.97×10⁻⁹
Height (Wood et al., 2014) | 0.1 | 0.005 | 9,809 | -0.155 | 0.04 | 8.84×10⁻⁵
Intelligence (Sniekers et al., 2017) | 0.1 | 0.05 | 21,551 | -0.126 | 0.038 | 7.61×10⁻⁴
Schizophrenia (QIMR removed) (Pardiñas et al., 2018) | 0.1 | 0.05 | 23,878 | 0.092 | 0.038 | 0.014
Intracranial volume (Adams et al., 2016) | 0.1 | 1 | 90,928 | -0.078 | 0.038 | 0.041
Autism (Grove et al., 2017) | 0.1 | 0.1 | 26,846 | 0.07 | 0.038 | 0.063
Birth weight (Horikoshi et al., 2016) | 0.1 | 0.01 | 6,828 | -0.062 | 0.038 | 0.098
Developmental disorder risk (discovery GWAS) | 0.1 | 1 | 67,001 | -0.047 | 0.038 | 0.212

(a) Logistic regression of case/control status on polygenic score using 10 ancestry principal components as covariates. P-values are uncorrected, two-sided, and from z-score distribution.

Extended Data Table 4. Polygenic score analyses comparing DDD patients with an exome diagnosis (N=1,127) against undiagnosed patients (N=2,479).

Columns 1-4: parameters. Columns 5-7: results (a).

Polygenic score trait | r2 for SNP pruning | P-value threshold for SNP pruning | Number of SNPs in score | Beta | Standard error | P-value
Educational attainment | 0.1 | 1 | 79,292 | 0.08 | 0.037 | 0.028
Intelligence | 0.1 | 0.05 | 19,387 | 0.063 | 0.036 | 0.08
Schizophrenia | 0.1 | 0.05 | 21,321 | 0.017 | 0.036 | 0.644
Autism | 0.1 | 0.1 | 23,648 | -0.077 | 0.036 | 0.032
Intracranial volume | 0.1 | 1 | 76,788 | 4.98×10⁻³ | 0.036 | 0.891
Birth weight | 0.1 | 0.01 | 6,212 | 1.54×10⁻³ | 0.036 | 0.966
Height | 0.1 | 0.005 | 9,019 | 1.34×10⁻³ | 0.036 | 0.971

(a) Logistic regression of diagnosed/undiagnosed status on polygenic score using 10 ancestry principal components as covariates. P-values are uncorrected, two-sided, and from z-score distribution.

Extended Data Table 5. Polygenic score analyses in DDD patients for measured traits.

Columns 1-5: score parameters. Columns 6-9: results (a).

Measured trait | Polygenic score | r2 for SNP pruning | P-value threshold for SNP pruning | Number of SNPs in score | Beta | Standard error | P-value | R2
Birth weight (N=6,496) | Birth weight | 0.1 | 0.01 | 6,212 | 0.187 | 0.017 | 2.55×10⁻²⁸ | 0.02
Height (N=5,465) | Height | 0.1 | 0.005 | 9,019 | 0.408 | 0.033 | 1.18×10⁻³⁵ | 0.033
Head circumference (N=6,074) | Intracranial volume | 0.1 | 1 | 76,788 | 0.132 | 0.031 | 1.79×10⁻⁵ | 0.004
Autistic behavior: affected (N=1,121), unaffected (N=5,866) | Autism | 0.1 | 0.1 | 23,648 | 0.12 | 0.033 | 2.53×10⁻⁴ | 0.006 (c)
Developmental delay or intellectual disability: severe (N=911), mild/moderate (N=1,902) (b) | Educational attainment | 0.1 | 1 | 79,292 | 0.116 | 0.04 | 0.004 | 0.008 (c)

(a) Linear or logistic regression on polygenic score using 10 ancestry principal components as covariates. P-values are uncorrected, two-sided, and from t-distribution (linear) and z-score distribution (logistic).

(b) Severe cases were labelled as 1 in the logistic regression.

(c) Nagelkerke R2.

Acknowledgements

We thank the families for their participation and patience, the DDD study clinicians, research nurses and clinical scientists in the recruiting centres for their hard work and perseverance on behalf of families, Minna Niemi for help making Figure 1, and Varun Warrier for useful discussions. The DDD study presents independent research commissioned by the Health Innovation Challenge Fund [grant number HICF-1009-003], a parallel funding partnership between Wellcome and the Department of Health, and the Wellcome Sanger Institute [grant number WT098051]. The views expressed in this publication are those of the author(s) and not necessarily those of Wellcome or the Department of Health. The study has UK Research Ethics Committee approval (10/H0305/83, granted by the Cambridge South REC, and GEN/284/12 granted by the Republic of Ireland REC). Families gave informed consent for participation. The research team acknowledges the support of the National Institute for Health Research, through the Comprehensive Clinical Research Network. This study makes use of data generated by the DECIPHER community. A full list of centres who contributed to the generation of the data is available from http://decipher.sanger.ac.uk and via email from decipher@sanger.ac.uk. Funding for the project was provided by the Wellcome Trust.

We used data from Understanding Society: The UK Household Longitudinal Study, which is led by the Institute for Social and Economic Research at the University of Essex and funded by the Economic and Social Research Council (Grant Number: ES/M008592/1). The data were collected by NatCen and the genome-wide scan data were analysed by the Wellcome Trust Sanger Institute. Information on how to access the data can be found on the Understanding Society website https://www.understandingsociety.ac.uk/. Data governance was provided by the METADAC data access committee, funded by ESRC, Wellcome, and MRC (Grant Number: MR/N01104X/1).

Australian controls from the Brisbane Longitudinal Twin Study were collected and genotyped with grants from the National Health and Medical Research Council.

We thank Antonio Pardiñas for producing the PGC-CLOZUK summary statistics without the Australian controls.

Footnotes

Author contributions

Study design: J.C.B., C.F.W., D.R.F., H.V.F. and M.E.H.

Data analysis and methods: M.E.K.N., H.C.M., D.L.R., G.G., M.K., J.M., and E.J.R.

Australian data collection: S.Y., J.G. and N.G.M.; Australian data preparation: K.M. and S.G.

Wrote paper: M.E.K.N., H.C.M. and J.C.B.

Analytical supervision: H.C.M. and J.C.B.

Project supervision: J.C.B.

Competing interests

M.E.H. is a co-founder of, consultant to, and holds shares in, Congenica Ltd, a genetics diagnostic company. J.C.B. is an employee of Genomics plc.

Data availability

The raw genotype data, post-quality control genotype data and discovery GWAS summary statistics generated/analysed during the current study are available through European Genome-phenome Archive under EGAS00001000775.

References

  • 1. Boycott KM, et al. International Cooperation to Enable the Diagnosis of All Rare Genetic Diseases. Am J Hum Genet. 2017;100:695–705. doi: 10.1016/j.ajhg.2017.04.003.
  • 2. Owen CI, et al. Extending the phenotype associated with the CSNK2A1-related Okur-Chung syndrome-A clinical study of 11 individuals. Am J Med Genet A. 2018. doi: 10.1002/ajmg.a.38610.
  • 3. Singh T, et al. Rare loss-of-function variants in SETD1A are associated with schizophrenia and developmental disorders. Nat Neurosci. 2016;19:571–577. doi: 10.1038/nn.4267.
  • 4. Balasubramanian M, et al. Delineating the phenotypic spectrum of Bainbridge-Ropers syndrome: 12 new patients with de novo, heterozygous, loss-of-function mutations in ASXL3 and review of published literature. J Med Genet. 2017;54:537–543. doi: 10.1136/jmedgenet-2016-104360.
  • 5. Minikel EV, et al. Quantifying prion disease penetrance using large population control cohorts. Sci Transl Med. 2016;8:322ra9. doi: 10.1126/scitranslmed.aad5169.
  • 6. Wright CF, et al. Genetic diagnosis of developmental disorders in the DDD study: a scalable analysis of genome-wide research data. Lancet. 2015;385:1305–1314. doi: 10.1016/S0140-6736(14)61705-0.
  • 7. Deciphering Developmental Disorders Study. Large-scale discovery of novel genetic causes of developmental disorders. Nature. 2015;519:223–228. doi: 10.1038/nature14135.
  • 8. Bulik-Sullivan BK, et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet. 2015;47:291–295. doi: 10.1038/ng.3211.
  • 9. Grove J, et al. Common risk variants identified in autism spectrum disorder. bioRxiv. 2017:224774. doi: 10.1101/224774.
  • 10. Wray NR, et al. Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression. Nat Genet. 2018. doi: 10.1038/s41588-018-0090-3.
  • 11. Weiner DJ, et al. Polygenic transmission disequilibrium confirms that common and rare variation act additively to create risk for autism spectrum disorders. Nat Genet. 2017. doi: 10.1038/ng.3863.
  • 12. International Schizophrenia Consortium et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature. 2009;460:748–752. doi: 10.1038/nature08185.
  • 13. Anttila V, et al. Analysis of shared heritability in common disorders of the brain. bioRxiv. 2017:048991. doi: 10.1101/048991.
  • 14. Bulik-Sullivan B, et al. An atlas of genetic correlations across human diseases and traits. Nat Genet. 2015;47:1236–1241. doi: 10.1038/ng.3406.
  • 15. Social Science Genetic Association Consortium. [Accessed: 18th March 2018]; Social Science Genetic Association Consortium. Available at: https://www.thessgac.org/data.
  • 16. Sniekers S, et al. Genome-wide association meta-analysis of 78,308 individuals identifies new loci and genes influencing human intelligence. Nat Genet. 2017;49:1107–1112. doi: 10.1038/ng.3869.
  • 17. Finucane HK, et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat Genet. 2015;47:1228–1235. doi: 10.1038/ng.3404.
  • 18. Lindblad-Toh K, et al. A high-resolution map of human evolutionary constraint using 29 mammals. Nature. 2011;478:476–482. doi: 10.1038/nature10530.
  • 19. Deciphering Developmental Disorders Study. Prevalence and architecture of de novo mutations in developmental disorders. Nature. 2017;542:433–438. doi: 10.1038/nature21062.
  • 20. Martin HC, et al. Quantifying the contribution of recessive coding variation to developmental disorders. bioRxiv. 2017:201533. doi: 10.1101/201533.
  • 21. Firth HV, et al. DECIPHER: Database of Chromosomal Imbalance and Phenotype in Humans Using Ensembl Resources. Am J Hum Genet. 2009;84:524–533. doi: 10.1016/j.ajhg.2009.03.010.
  • 22. Reichenberg A, et al. Discontinuity in the genetic and environmental causes of the intellectual disability spectrum. Proc Natl Acad Sci U S A. 2016;113:1098–1103. doi: 10.1073/pnas.1508093112.
  • 23. Flannick J, Johansson S, Njølstad PR. Common and rare forms of diabetes mellitus: towards a continuum of diabetes subtypes. Nat Rev Endocrinol. 2016;12:394–406. doi: 10.1038/nrendo.2016.50.
  • 24. Hernandez DG, Reed X, Singleton AB. Genetics in Parkinson disease: Mendelian versus non-Mendelian inheritance. J Neurochem. 2016;139(Suppl 1):59–74. doi: 10.1111/jnc.13593.
  • 25. Ganna A, et al. Ultra-rare disruptive and damaging mutations influence educational attainment in the general population. Nat Neurosci. 2016;19:1563–1565. doi: 10.1038/nn.4404.
  • 26. Tansey KE, et al. Common alleles contribute to schizophrenia in CNV carriers. Mol Psychiatry. 2016;21:1085–1089. doi: 10.1038/mp.2015.143.
  • 27. Kuchenbaecker KB, et al. Evaluation of Polygenic Risk Scores for Breast and Ovarian Cancer Risk Prediction in BRCA1 and BRCA2 Mutation Carriers. J Natl Cancer Inst. 2017;109. doi: 10.1093/jnci/djw302.
  • 28. Martinelli S, et al. Functional Dysregulation of CDC42 Causes Diverse Developmental Phenotypes. Am J Hum Genet. 2018. doi: 10.1016/j.ajhg.2017.12.015.
  • 29. Khanna H, et al. A common allele in RPGRIP1L is a modifier of retinal degeneration in ciliopathies. Nat Genet. 2009;41:739–745. doi: 10.1038/ng.366.
  • 30. Martin HC, et al. Quantifying the contribution of recessive coding variation to developmental disorders. bioRxiv. 2017:201533. doi: 10.1126/science.aar6731.
  • 31. Short PJ, et al. De novo mutations in regulatory elements in neurodevelopmental disorders. Nature. 2018;555:611–616. doi: 10.1038/nature25983.
  • 32. Understanding Society: Waves 1-7, 2009-2016 and Harmonised BHPS: Waves 1-18, 1991-2009. Available at: https://discover.ukdataservice.ac.uk/.
  • 33. Understanding Society: Waves 2 and 3 Nurse Health Assessment, 2010-2012. University of Essex: Institute for Social and Economic Research and National Centre for Social Research.
  • 34. Wright MJ, Martin NG. Brisbane Adolescent Twin Study: Outline of study methods and research projects. Aust J Psychol. 2004;56:65–78.
  • 35. Mina-Vargas A, et al. Heritability and GWAS Analyses of Acne in Australian Adolescent Twins. Twin Res Hum Genet. 2017;20:541–549. doi: 10.1017/thg.2017.58.
  • 36. McCarthy S, et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat Genet. 2016. doi: 10.1038/ng.3643.
  • 37. Loh P-R, et al. Reference-based phasing using the Haplotype Reference Consortium panel. Nat Genet. 2016;48:1443–1448. doi: 10.1038/ng.3679.
  • 38. Durbin R. Efficient haplotype matching and storage using the positional Burrows-Wheeler transform (PBWT). Bioinformatics. 2014;30:1266–1272. doi: 10.1093/bioinformatics/btu014.
  • 39. Loh P-R, et al. Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat Genet. 2015;47:284–290. doi: 10.1038/ng.3190.
  • 40. Zheng J, et al. LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis. Bioinformatics. 2017;33:272–279. doi: 10.1093/bioinformatics/btw613.
  • 41. Pardiñas AF, et al. Common schizophrenia alleles are enriched in mutation-intolerant genes and in regions under strong background selection. Nat Genet. 2018. doi: 10.1038/s41588-018-0059-2.
  • 42. Adams HHH, et al. Novel genetic loci underlying human intracranial volume identified through genome-wide association. Nat Neurosci. 2016;19:1569–1582. doi: 10.1038/nn.4398.
  • 43. Wood AR, et al. Defining the role of common variation in the genomic and biological architecture of adult human height. Nat Genet. 2014;46:1173–1186. doi: 10.1038/ng.3097.
  • 44. Horikoshi M, et al. Genome-wide associations for birth weight and correlations with adult disease. Nature. 2016;538:248–252. doi: 10.1038/nature19806.
  • 45. Okbay A, et al. Genome-wide association study identifies 74 loci associated with educational attainment. Nature. 2016;533:539–542. doi: 10.1038/nature17671.
  • 46. Palla L, Dudbridge F. A Fast Method that Uses Polygenic Scores to Estimate the Variance Explained by Genome-wide Marker Panels and the Proportion of Variants Affecting a Trait. Am J Hum Genet. 2015;97:250–259. doi: 10.1016/j.ajhg.2015.06.005.
  • 47. Wright CF, et al. Making new genetic diagnoses with old data: iterative reanalysis and reporting from genome-wide data in 1,133 families with developmental disorders. Genet Med. 2018. doi: 10.1038/gim.2017.246.
