Increased burden of deleterious variants in essential genes in autism spectrum disorder

Xiao Ji; Rachel L Kember; Christopher D Brown; Maja Bućan

doi:10.1073/pnas.1613195113

2016 Dec 12;113(52):15054–15059. doi: 10.1073/pnas.1613195113

PMCID: PMC5206557 PMID: 27956632

Significance

Essential genes (EGs) are necessary for the survival and development of an organism. Our study investigates the role of EGs in autism spectrum disorder (ASD). With a comprehensive catalog of 3,915 mammalian EGs, we show that there is both an elevated burden of damaging mutations in EGs in ASD probands and an enrichment of EGs among known ASD risk genes. Moreover, the analysis of EGs in the developing brain identified clusters of coexpressed EGs implicated in ASD. Overall, we provide evidence that genes that are essential for survival and fitness also contribute to ASD risk and lead to the disruption of normal social behavior.

Keywords: essential genes, mouse knockouts, mutational burden, autism spectrum disorder, coexpression modules

Abstract

Autism spectrum disorder (ASD) is a heterogeneous, highly heritable neurodevelopmental syndrome characterized by impaired social interaction, communication, and repetitive behavior. It is estimated that hundreds of genes contribute to ASD. We asked if genes with a strong effect on survival and fitness contribute to ASD risk. Human orthologs of genes with an essential role in pre- and postnatal development in the mouse [essential genes (EGs)] are enriched for disease genes and under strong purifying selection relative to human orthologs of mouse genes with a known nonlethal phenotype [nonessential genes (NEGs)]. This intolerance to deleterious mutations, commonly observed haploinsufficiency, and the importance of EGs in development suggest a possible cumulative effect of deleterious variants in EGs on complex neurodevelopmental disorders. With a comprehensive catalog of 3,915 mammalian EGs, we provide compelling evidence for a stronger contribution of EGs to ASD risk compared with NEGs. By examining the exonic de novo and inherited variants from 1,781 ASD quartet families, we show a significantly higher burden of damaging mutations in EGs in ASD probands compared with their non-ASD siblings. The analysis of EGs in the developing brain identified clusters of coexpressed EGs implicated in ASD. Finally, we suggest a high-priority list of 29 EGs with potential ASD risk as targets for future functional and behavioral studies. Overall, we show that large-scale studies of gene function in model organisms provide a powerful approach for prioritization of genes and pathogenic variants identified by sequencing studies of human disease.

Autism spectrum disorder (ASD) is a heterogeneous, heritable neurodevelopmental syndrome characterized by impaired social interaction, communication, and repetitive behavior (1, 2). The highly polygenic nature of ASD (3–5) suggests that the analysis of the full spectrum of sequence variants in hundreds of genes will be necessary for deeper understanding of disrupted neuronal function. Prioritization of ASD risk genes initially focused on known pathways with recognized relevance to pathogenesis of ASD, such as synaptic function and neuronal development (6). However, combined analyses of de novo, inherited, and case–control variation in over 2,500 ASD parent–child nuclear families identified around 100 genes contributing to ASD risk (7–9), converging on pathways implicated in transcriptional regulation and chromatin remodeling in addition to synaptic function.

The main challenge in understanding the genetic architecture of ASD is the need to study the interplay between variants with a high effect (for example, recurrent de novo variants) and a background of variants that have an intermediate effect but nevertheless disrupt proper neuronal development. Essential genes (EGs), or genes that are necessary for successful completion of pre- and postnatal development, are prime candidates for the source of this background or load of variants with a cumulative intermediate effect. EGs are highly enriched for human disease genes and under strong purifying selection (10–14). In addition to intolerance to loss-of-function and deleterious mutations, the functional impact of EGs is reflected by the haploinsufficiency that is commonly observed for heterozygous mutations (11, 15). Beyond their role in defining a “minimal gene set” (16, 17), EGs tend to play important roles in protein interaction networks (18). Therefore, one may consider that EGs are involved in rate-limiting steps that affect a range of disease pathways (19).

Recently, three large-scale screens (gene trap and CRISPR-Cas9) have been performed to assess the effect of single-gene mutations on cell viability or survival of haploid human cancer cell lines (“cell-based essentiality”) (20–22). These studies identified an overlapping core set of genes that were essential in the majority of cell lines tested (n = 956), although a subset of genes were essential only in specific cell lines. In an alternative and complementary approach, we assembled a catalog of human orthologs of EGs in the mouse (n = 3,326) (14) based on the organismal-level phenotypes of loss-of-function mouse mutants from the Mouse Genome Informatics (MGI) database (23) and the International Mouse Phenotyping Consortium (IMPC) web portal (24). Based on these data, homozygous loss-of-function mutations in 3,326 genes lead to prenatal or preweaning lethality, with a significant overlap between the core set of human cell EGs and human orthologs of EGs in the mouse (14). These studies are consistent with 30% (or ∼6,000) of protein-coding genes being essential for pre- and postnatal survival (14, 25).

A deeper understanding of the mutational spectrum of EGs in a neurodevelopmental disorder, such as ASD, is important, because EGs are less likely to be redundant, are more likely to have functional consequences when mutated, and may produce a gradation of phenotypes (25). Our previous work reported an enrichment of EGs among genes with de novo mutations in ASD patients (11). Several groups reported an enrichment of de novo and rare inherited single-nucleotide loss-of-function variants in ASD probands (8, 26), although there is a depletion of damaging mutations in ASD risk genes in population controls (12, 27, 28). In this report, we compiled, to our knowledge, the most comprehensive list of human EGs and extended the analysis to both de novo and inherited damaging variants in 1,781 ASD families. In addition to disease status, we further showed the effect of damaging variants in EGs on ASD-related traits, such as the social skill measurement in 2,348 ASD probands. Finally, we performed coexpression analysis of EGs in the developing human brain to identify clusters of interacting EGs that contribute to ASD risk and suggest ASD candidate genes.

Results

To identify the most comprehensive set of EGs in mammals, we combined the set of human orthologs of EGs in the mouse (n = 3,326) (14) with a set of human “core EGs” (n = 956) that were found to be essential in cell-based assays (20–22). Based on a significant overlap between tested mouse and human EGs (14), we expanded our original set of 3,326 EGs with the addition of nonoverlapping 589 EGs identified only in human cell lines for a total of 3,915 EGs (SI Materials and Methods and Dataset S1). In our subsequent analyses, we compared features of and genetic variation in these EGs with 4,919 human orthologs of genes with reported nonlethal phenotypes in the mouse [nonessential genes (NEGs)].
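The gene counts in this paragraph can be checked with simple set arithmetic. The sketch below is illustrative only: it uses placeholder integer identifiers rather than real gene symbols (an assumption), and only the set sizes are taken from the text. The overlap of 367 genes is not stated directly but follows from 956 core EGs minus the 589 that were found only in human cell lines.

```python
# Placeholder gene identifiers: only the set sizes are taken from the text.
N_MOUSE_EG = 3326      # human orthologs of mouse EGs
N_CELL_CORE_EG = 956   # core EGs from human cell-based screens
N_CELL_ONLY = 589      # core EGs not already in the mouse-derived set

# Overlap between the two sources implied by the numbers above.
n_shared = N_CELL_CORE_EG - N_CELL_ONLY

mouse_eg = set(range(N_MOUSE_EG))
# The first n_shared cell EGs reuse mouse identifiers; the remainder are new.
cell_eg = set(range(n_shared)) | set(range(N_MOUSE_EG, N_MOUSE_EG + N_CELL_ONLY))

combined_eg = mouse_eg | cell_eg
print(n_shared, len(combined_eg))  # 367 3915
```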

Homozygous loss-of-function mutations in EGs lead to lethality (or miscarriages in humans) and as such, cannot contribute to disease. Although we and others reported a depletion of loss-of-function mutations in EGs in humans (11, 12, 14), heterozygosity for a loss-of-function mutation or other “milder” alleles in EGs may contribute to both dominant and recessive diseases. We illustrated this point using a catalog of disease-linked genes in Online Mendelian Inheritance in Man (29) (SI Materials and Methods); EGs were enriched relative to NEGs in 1,000 genes underlying dominant diseases (odds ratio = 1.95, P value = 3.17 × 10⁻¹⁹; two-sided Fisher’s exact test) and 1,645 genes underlying recessive disease (odds ratio = 1.52, P value = 4.94 × 10⁻¹¹; two-sided Fisher’s exact test) (Fig. 1A). A stronger enrichment of EGs among genes underlying dominant disease implies that dominant negative alleles and haploinsufficiency play an important role. We provide multiple lines of evidence for higher probability of haploinsufficiency of EGs (Fig. 1A and SI Materials and Methods). First, using the systematically rated dosage-sensitive genes from ClinGen (30), we found that EGs were significantly enriched compared with NEGs and that the levels of EG enrichment positively correlated with levels of evidence supporting dosage sensitivity of rated genes (odds ratio = 3.94, P value = 5.07 × 10⁻²⁰ for “sufficient evidence”; odds ratio = 5.26, P value = 7.08 × 10⁻⁵ for “some evidence”; odds ratio = 2.52, P value = 0.0106 for “little evidence”; odds ratio = 1.14, P value = 0.608 for “not dosage sensitive”; two-sided Fisher’s exact test). Second, as an extension of the earlier findings from the work by Georgi et al. (11), we confirmed the enrichment of EG relative to NEG for 262 human haploinsufficient genes (31) with the updated EG and NEG list (183 EGs vs. 62 NEGs; P value = 1.64 × 10⁻²², odds ratio = 3.84; two-sided Fisher’s exact test). 
Third, EGs are significantly overrepresented among 313 human orthologs of mouse genes with heterozygous alleles associated with mutant phenotypes from the MGI (23) (odds ratio = 3.43, P value = 2.74 × 10⁻²³; two-sided Fisher’s exact test). Fourth, with two genome-wide prediction models of haploinsufficient genes in the human genome (32, 33), we observed that EGs have significantly higher probability of exhibiting haploinsufficiency compared with NEGs (P value < 2.2 × 10⁻¹⁶ for both models; two-sided Wilcoxon rank sum test) (Fig. 1 B and C and SI Materials and Methods). Based on our findings that EGs linked to Mendelian disease are overwhelmingly dosage-sensitive, we explored the possibility that a cumulative effect of pathogenic variants in multiple EGs may underlie the genetic basis of a complex disease with early postnatal onset, such as ASD.
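As a worked illustration of the enrichment tests used throughout this paragraph, the sketch below computes a sample odds ratio and a two-sided Fisher's exact P value from a 2 × 2 table. It assumes that the comparison background is the full sets of 3,915 EGs and 4,919 NEGs, and that the two-sided P value sums the probabilities of all tables no more likely than the observed one; neither choice is spelled out in this section, and the function names are hypothetical. Under these assumptions, the haploinsufficient-gene counts (183 EGs vs. 62 NEGs) give the reported odds ratio of 3.84.

```python
from math import exp, lgamma

def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def fisher_two_sided(a, b, c, d):
    """Sample odds ratio and two-sided Fisher's exact P value for the table
    [[a, b], [c, d]] (rows: EG / NEG; columns: in gene set / not in gene set)."""
    odds_ratio = (a * d) / (b * c)
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_prob(x):
        # Hypergeometric log-probability of x EGs among the col1 genes in the set.
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_comb(total, col1)

    lo, hi = max(0, col1 - row2), min(col1, row1)
    observed = log_prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(exp(lp) for lp in (log_prob(x) for x in range(lo, hi + 1))
            if lp <= observed + 1e-7)
    return odds_ratio, min(p, 1.0)

N_EG, N_NEG = 3915, 4919   # gene-set sizes from the text
hi_eg, hi_neg = 183, 62    # haploinsufficient genes that are EGs / NEGs
odds_ratio, p_value = fisher_two_sided(hi_eg, N_EG - hi_eg, hi_neg, N_NEG - hi_neg)
print(round(odds_ratio, 2))  # 3.84
```

Log-gamma is used instead of exact binomial coefficients because the intermediate values for gene sets of this size overflow double-precision floats.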

Fig. 1. — Haploinsufficiency of EGs. (A) For each class of genes with different essentiality status (EG in red, NEG in turquoise, and unknown in gray), the proportion of genes among each gene set of interest is plotted in *Left*. Dosage-sensitive genes from ClinGen (30) were classified into five categories (1, sufficient evidence; 2, some evidence; 3, little evidence; 4, no evidence; and 5, not sensitive/recessive). Two-sided Fisher’s exact test was performed to assess the enrichment of EGs vs. NEGs, and the P values were indicated. The odds ratios for enrichment of EGs compared with NEGs and the 95% confidence intervals of odds ratios are plotted in *Right*. OMIM, Online Mendelian Inheritance in Man (29). (B and C) Histograms and estimated density curves indicating the distribution of (B) the Haploinsufficiency Score (HIS) (32) and (C) the Genome-Wide Haploinsufficiency Score (GHIS) (33) across three gene sets, including EGs (red), NEGs (turquoise), and all protein-coding genes (56) (gray). EGs have significantly higher probability of exhibiting haploinsufficiency compared with NEGs (P value < 2.2 × 10⁻¹⁶ for both models; two-sided Wilcoxon rank sum test).

To address a possible cumulative effect of variants in EGs in ASD in a larger cohort of 1,781 ASD quartet families (with 1,781 probands and 1,781 siblings) from the Simons Simplex Collection (34), we acquired de novo and rare inherited mutations from the exome sequencing data of these families (8, 26). We examined the individual mutational burden defined by the number of de novo loss-of-function (dnLoF), de novo nonsynonymous damaging (dnNSD), and inherited rare damaging (inhRD) mutations per individual (Fig. S1, SI Materials and Methods, and Datasets S2–S4). On average, an ASD proband carried 0.06 dnLoF, 0.21 dnNSD, and 10.74 inhRD mutations in EGs. The mutational burden in EGs was significantly elevated in ASD probands compared with unaffected siblings for the three classes of variants considered (P value = 4.75 × 10⁻⁷ for dnLoF, P value = 3.41 × 10⁻⁴ for dnNSD, and P value = 0.017 for inhRD; one-sided Wilcoxon signed-rank test) (Fig. 2A and Table S1). In contrast, no significant difference in mutational burden in NEGs was observed (P value = 0.10 for dnLoF, P value = 0.069 for dnNSD, and P value = 0.75 for inhRD) (Table S1). Interestingly, 10,823 genes that are currently not assigned as EG or NEG (i.e., phenotypically uncharacterized in mouse knockouts and human cell-based assays) have a moderately elevated burden of dnLoF but not dnNSD or inhRD variants in ASD probands (P value = 0.0042) (Table S1). Notably, the effect sizes of EG burden in each variant type correspond to our understanding of the severity of the variant type; de novo mutations, which are expected to have a larger functional impact, also display the strongest difference between ASD probands and unaffected siblings (effect size = 0.117 for dnLoF; effect size = 0.079 for dnNSD; Cohen’s d). In contrast, inherited mutations are expected to have a moderate functional impact, and a smaller difference is observed between probands and siblings (effect size = 0.042 for inhRD).
Although we observed marginally increased burden of dnLoF and dnNSD mutations in EGs in female (n = 325) compared with male (n = 2,043) probands (Table S2), the analysis of families divided by gender of proband–sibling pairs (female–female, male–female, female–male, and male–male) showed that gender bias does not underlie the observed differences in mutational burden between probands and siblings (Table S3).
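The paired proband–sibling comparison described above can be sketched as a one-sided Wilcoxon signed-rank test. The version below is a minimal illustration on toy mutation counts, not the authors' implementation: it assumes a normal approximation with a tie correction, no continuity correction, and removal of zero differences. None of these settings is stated in this section, and the function name is hypothetical.

```python
from math import erfc, sqrt

def wilcoxon_signed_rank_greater(x, y):
    """One-sided Wilcoxon signed-rank test (alternative: x tends to exceed y).
    Normal approximation with tie correction; zero differences are dropped."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    # Test statistic: sum of ranks of the positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / sqrt(var)
    p = 0.5 * erfc(z / sqrt(2))  # upper-tail probability P(Z >= z)
    return w_plus, p

# Toy per-individual mutation counts for eight proband-sibling pairs.
probands = [1, 0, 2, 1, 0, 3, 1, 0]
siblings = [0, 0, 1, 0, 0, 1, 0, 1]
w_plus, p_value = wilcoxon_signed_rank_greater(probands, siblings)
print(w_plus, round(p_value, 3))
```

De novo counts per individual are mostly zero, so ties and zero differences dominate in real data; how they are handled can change the P value, which is why the assumptions above matter.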

Fig. S1. — Variant filtering steps for the mutational burden analysis. EVS, Exome Variant Server (release ESP 6500); MAF, minor allele frequency.

Fig. 2. — Assessment of the contribution of EGs to ASD risk. (A) Individual mutational burden analysis in 1,781 pairs of ASD probands and unaffected siblings (Table S1). The analyses were performed separately for 3,915 EGs (red) and 4,919 NEGs (turquoise). The individual mutational burden is defined by the number of dnLoF, dnNSD, and inhRD mutations per individual. Effect sizes were measured by Cohen’s d, which is defined as the difference between both means divided by the SD of the paired differences. The estimated 95% confidence intervals of effect sizes are plotted (*SI Materials and Methods*). P values were obtained from one-sided Wilcoxon signed-rank test. *P value < 0.05. (B) ASD candidate genes categorized by SFARI gene scores (S, syndromic; 1, high confidence; 2, strong candidate; 3, suggestive evidence; 4, minimal evidence; 5, hypothesized; and 6, not supported) (37) and their essentiality status (EG in red, NEG in turquoise, and unknown in gray). ***The P value from two-sided Fisher’s exact test (EG vs. NEG) is less than 0.001. (C) The distribution of TADA FDR q values of EGs and NEGs. The FDR q value of the TADA test evaluates ASD association based on combined evidence from de novo SNVs and small deletions, rare inherited variants, and case–control variants (9). The observed negative log₁₀ (q) values of 3,915 EGs (red) and 4,919 NEGs (turquoise) are compared with the expected counterparts under the null hypothesis. The dashed lines indicate the FDR thresholds (FDR = 0.1 in red and FDR = 0.5 in blue) for identification of ASD risk genes. The 95% confidence intervals of the expected negative log₁₀ (q) values are shaded in gray.

Table S1.

Mutational burden analysis in 1,781 ASD quartet families

Variant type and gene set	No. of genes	Proband average	Sibling average	Effect size	Effect size 95% CI low	Effect size 95% CI high	P value
dnLoF
EG: this work	3,915	0.0640	0.0286	0.1170	0.0715	0.1596	4.75 × 10⁻⁷
EG: Dickinson et al. (14)	3,326	0.0595	0.0253	0.1176	0.0730	0.1603	4.16 × 10⁻⁷
EG: Georgi et al. (11)	2,472	0.0494	0.0168	0.1254	0.0820	0.1671	7.82 × 10⁻⁸
Human cell EGs (20, 21, 22)	956	0.0079	0.0056	0.0193	−0.0269	0.0637	0.2118
NEG	4,919	0.0387	0.0309	0.0300	−0.0157	0.0774	0.1028
Phenotypically uncharacterized genes	10,823	0.0752	0.0533	0.0606	0.0143	0.1084	0.004257
dnNSD
EG: this work	3,915	0.2061	0.1589	0.0794	0.0324	0.1274	3.41 × 10⁻⁴
EG: Dickinson et al. (14)	3,326	0.1875	0.1376	0.0892	0.0429	0.1353	8.13 × 10⁻⁵
EG: Georgi et al. (11)	2,472	0.1505	0.1050	0.0895	0.0435	0.1366	7.36 × 10⁻⁵
Human cell EGs (20, 21, 22)	956	0.0371	0.0365	0.0021	−0.0435	0.0499	0.4696
NEG	4,919	0.1611	0.1404	0.0374	−0.0100	0.0827	0.0691
Phenotypically uncharacterized genes	10,823	0.2471	0.2791	−0.0419	−0.0884	0.0044	0.9636
inhRD
EG: this work	3,915	10.7428	10.6042	0.0420	−0.0041	0.0887	0.01688
EG: Dickinson et al. (14)	3,326	9.3257	9.2358	0.0287	−0.0185	0.0757	0.04139
EG: Georgi et al. (11)	2,472	7.0236	6.9163	0.0402	−0.0053	0.0867	0.02622
Human cell EGs (20, 21, 22)	956	2.3745	2.3779	−0.0022	−0.0485	0.0435	0.5935
NEG	4,919	12.7816	12.8355	−0.0150	−0.0618	0.0308	0.7456
Phenotypically uncharacterized genes	10,823	20.3947	20.4559	−0.0133	−0.0592	0.0342	0.5404

Effect sizes were measured by Cohen's d, which is defined as the difference between both means divided by the SD of the paired differences. P values were obtained from one-sided Wilcoxon signed-rank test. 95% CI, 95% confidence interval.
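The paired effect size defined in this footnote can be written out in a few lines. The sketch below uses toy counts and assumes the sample (n − 1) standard deviation, which the footnote does not specify. It also back-calculates the SD of paired differences implied by the first row of Table S1 (dnLoF, EG: this work), a derived quantity that is not reported in the table.

```python
from statistics import mean, stdev

def paired_cohens_d(probands, siblings):
    """Mean of the paired differences divided by their sample SD."""
    diffs = [p - s for p, s in zip(probands, siblings)]
    return mean(diffs) / stdev(diffs)

# Toy per-individual mutation counts for six proband-sibling pairs.
probands = [1, 0, 0, 1, 0, 2]
siblings = [0, 0, 0, 0, 0, 1]
d = paired_cohens_d(probands, siblings)

# SD of paired differences implied by Table S1 (dnLoF, EG: this work):
# (proband average - sibling average) / effect size.
implied_sd = (0.0640 - 0.0286) / 0.1170
print(round(d, 4), round(implied_sd, 4))  # 0.9129 0.3026
```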

Table S2.

Difference in individual mutational burden between male and female probands

Variant type and gene set	Female proband average	Male proband average	Effect size	P value
dnLoF
EG	0.0862	0.0597	0.1042	0.0355^*
NEG	0.0462	0.0357	0.0551	0.1782
dnNSD
EG	0.2400	0.1948	0.1014	0.0388^*
NEG	0.2000	0.1596	0.0993	0.0742
inhRD
EG	11.0523	10.9633	0.0151	0.4711
NEG	13.2677	13.0113	0.0360	0.5271

Effect sizes were measured by Cohen's d, which is defined as the difference between both means divided by pooled SD.

^*P values with statistical significance.

Table S3.

Mutational burden analysis in 1,781 ASD quartet families (dissected by genders of proband–sibling pairs)

Variant type, gene set, and proband gender	Sibling gender	No. of families	Proband average	Sibling average	Effect size	P value
dnLoF
EG
All	All	1,781	0.0640	0.0286	0.1170	4.75 × 10⁻⁷
Female	Male	101	0.0891	0.0099	0.2588	0.0067
Male	Female	826	0.0593	0.0327	0.0893	0.0053
Male	Male	732	0.0615	0.0246	0.1228	0.0005
Female	Female	122	0.0902	0.0410	0.1461	0.0600
NEG
All	All	1,781	0.0387	0.0309	0.0300	0.1028
Female	Male	101	0.0396	0.0297	0.0374	0.3884
Male	Female	826	0.0412	0.0266	0.0558	0.0549
Male	Male	732	0.0369	0.0314	0.0213	0.2838
Female	Female	122	0.0328	0.0574	−0.0818	0.8302
dnNSD
EG
All	All	1,781	0.2061	0.1589	0.0794	0.0003
Female	Male	101	0.2178	0.1683	0.0724	0.2392
Male	Female	826	0.2094	0.1755	0.0552	0.0454
Male	Male	732	0.1885	0.1270	0.1136	0.0013
Female	Female	122	0.2787	0.2295	0.0725	0.2157
NEG
All	All	1,781	0.1611	0.1404	0.0374	0.0691
Female	Male	101	0.1881	0.1980	−0.0155	0.5696
Male	Female	826	0.1465	0.1477	−0.0022	0.5515
Male	Male	732	0.1667	0.1175	0.0904	0.0080
Female	Female	122	0.2049	0.1803	0.0379	0.3817
inhRD
EG
All	All	1,781	10.7428	10.6042	0.0420	0.0169
Female	Male	101	10.3762	10.6436	−0.0778	0.8260
Male	Female	826	10.8341	10.7034	0.0401	0.1120
Male	Male	732	10.5765	10.4372	0.0417	0.0449
Female	Female	122	11.4262	10.9016	0.1619	0.0430
NEG
All	All	1,781	12.7816	12.8355	−0.0150	0.7456
Female	Male	101	12.5050	13.0792	−0.1398	0.9143
Male	Female	826	12.8693	13.0182	−0.0424	0.7802
Male	Male	732	12.6134	12.5546	0.0165	0.5576
Female	Female	122	13.4262	13.0820	0.0907	0.1327

To evaluate the effect of rare damaging mutations in EGs on ASD-associated traits, we used the available quantitative phenotype data on social and cognitive impairments in ∼2,500 ASD families from Simons Simplex Collection (8, 26) (Dataset S2). As a measure of sociability, we used the total raw score from the Social Responsiveness Scale (SRS) (35), and as cognitive measures, we used three different intelligence quotient (IQ) scores (full-scale IQ, verbal IQ, and nonverbal IQ). As previously reported (36), SRS scores were unrelated to IQ, especially in subjects with IQ higher than 50 (Fig. S2). In male probands, we observed that the mutational burden in EGs was positively correlated with the SRS total raw score (P value = 1.08 × 10⁻⁶; Poisson regression) (Table 1). The effect was not significant in NEGs (P value = 0.21). In female probands, mutational burden in NEGs but not EGs was negatively correlated with SRS total raw score (P value = 0.085 for EG and P value = 6.06 × 10⁻⁶ for NEG). In addition, we found that mutational burden in both EGs and NEGs had a significant effect (P value < 2.2 × 10⁻¹⁶) on verbal and nonverbal IQ scores and that the effect sizes of mutational burden in EGs and NEGs were comparable (Table S4). These results suggest that, in ASD probands, deleterious variants in EGs contribute to decreased social skills in males, whereas deleterious variants in both EGs and NEGs lead to decreased IQ.
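For readers unfamiliar with Poisson regression, the sketch below fits a single-predictor model, log E[y] = b0 + b1·x, by Newton–Raphson maximum likelihood. It is a minimal illustration under stated assumptions, not the authors' analysis: the function names are hypothetical, no covariates are included, and the toy data are chosen so that the exact solution (b0 = 0, b1 = ln 2) is known analytically. In the setting above, x would be an individual's burden of rare damaging mutations and y the SRS total raw score.

```python
from math import exp, log, sqrt

def _information(x, mu):
    """Entries of the 2 x 2 Fisher information matrix [[i00, i01], [i01, i11]]."""
    i00 = sum(mu)
    i01 = sum(m * xi for xi, m in zip(x, mu))
    i11 = sum(m * xi * xi for xi, m in zip(x, mu))
    return i00, i01, i11

def fit_poisson(x, y, max_iter=50, tol=1e-12):
    """Fit log E[y] = b0 + b1 * x; return (b0, b1, standard error of b1)."""
    b0, b1 = log(sum(y) / len(y)), 0.0   # start from the intercept-only fit
    for _ in range(max_iter):
        mu = [exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - m for yi, m in zip(y, mu))
        g1 = sum((yi - m) * xi for xi, yi, m in zip(x, y, mu))
        i00, i01, i11 = _information(x, mu)
        det = i00 * i11 - i01 * i01
        # Newton step: inverse information times the score.
        step0 = (i11 * g0 - i01 * g1) / det
        step1 = (i00 * g1 - i01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < tol:
            break
    # Standard error of the slope from the information at the final estimate.
    mu = [exp(b0 + b1 * xi) for xi in x]
    i00, i01, i11 = _information(x, mu)
    se_b1 = sqrt(i00 / (i00 * i11 - i01 * i01))
    return b0, b1, se_b1

# Toy data: y doubles with each unit of x, so b1 should equal ln(2).
burden = [0, 1, 2, 3]
score = [1, 2, 4, 8]
b0, b1, se_b1 = fit_poisson(burden, score)
print(round(b0, 4), round(b1, 4), round(se_b1, 4))
```

A positive slope corresponds to a higher expected score with increasing burden, which is how the sign of the estimates in Table 1 should be read.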

Fig. S2. — Correlation between SRS and IQ. For each of 2,368 ASD probands from Simons Simplex Collection, the Pearson correlation between SRS total raw scores and three IQ scores (full-scale IQ, verbal IQ, and nonverbal IQ) was plotted. The probands were divided by IQ scores: (A, C, and E) IQ < 50 and (B, D, and F) IQ ≥ 50.

Table 1.

Relationship between individual mutational burden and SRS in ASD probands

Group and gene set	Estimate	Standard error	P value
2,031 Male probands
EG (3,915 genes)	0.001860	0.000381	1.08 × 10⁻⁶^*
NEG (4,919 genes)	0.000407	0.000324	0.209
317 Female probands
EG (3,915 genes)	−0.001511	0.000877	0.085
NEG (4,919 genes)	−0.003084	0.000682	6.04 × 10⁻⁶

Coefficients for Poisson regression are shown, which model the relationship between SRS total raw score and individual burden of all rare damaging mutations (including dnLoF, dnNSD, and inhRD mutations).
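The P values in Table 1 are consistent with a two-sided Wald test on each coefficient (z = estimate / standard error). This is an inference from the reported numbers, not something stated in the text, and the function name below is hypothetical. The sketch recomputes the P values from the table's estimates and standard errors; small discrepancies are expected because the inputs are rounded.

```python
from math import erfc, sqrt

def wald_p_value(estimate, std_error):
    """Two-sided P value for H0: coefficient = 0, from a normal z statistic."""
    z = estimate / std_error
    return erfc(abs(z) / sqrt(2))

# (estimate, standard error) pairs copied from Table 1.
table1 = {
    "male EG": (0.001860, 0.000381),
    "male NEG": (0.000407, 0.000324),
    "female EG": (-0.001511, 0.000877),
    "female NEG": (-0.003084, 0.000682),
}
recomputed = {label: wald_p_value(est, se) for label, (est, se) in table1.items()}
for label, p in recomputed.items():
    print(label, f"{p:.3g}")
```

Under this assumption, the male NEG and female EG rows come out at approximately 0.21 and 0.085, matching the table, and the two highly significant rows fall in the same 10⁻⁶ range as reported.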

^*P value with statistical significance and a positive estimated effect (P value < 0.05; estimate > 0).

Table S4.

Relationship between individual mutational burden and IQ in ASD probands

Trait and gene set	Estimate	SE	P value
Verbal IQ
EG (3,915 genes)	−0.007279	0.000400	<2.2 × 10⁻¹⁶
NEG (4,919 genes)	−0.005307	0.000383	<2.2 × 10⁻¹⁶
Nonverbal IQ
EG (3,915 genes)	−0.007172	0.000336	<2.2 × 10⁻¹⁶
NEG (4,919 genes)	−0.004906	0.000320	<2.2 × 10⁻¹⁶

Coefficients for Poisson regression are shown, which model the relationship between verbal/nonverbal IQ and individual burden of all rare damaging mutations (including dnLoF, dnNSD, and inhRD mutations).

To initially explore the overlap between EGs and known ASD genes, we examined the essentiality status of ∼500 ASD candidate genes from the Simons Foundation Autism Research Initiative (SFARI) AutDB database (updated December of 2015) (37) (Fig. 2B). Compared with NEGs, EGs were enriched among ASD candidates categorized as “syndromic” (category S: odds ratio = 3.95, P value = 0.0003; two-sided Fisher’s exact test), candidates with “high confidence” (category 1: odds ratio = 15.12, P value = 0.0004), and candidates with “suggestive evidence” (category 3: odds ratio = 2.14, P value = 0.0006). Trends of enrichment of EGs were also observed for “strong candidates” (category 2: odds ratio = 1.62, P value = 0.21). We did not observe enrichment of EGs among candidate genes with less supportive evidence (categories 4–6).

To further address whether EGs contribute to ASD risk, we compared the strength of ASD association signals between EGs and NEGs in data from a recent comprehensive analysis of ASD genomic architecture (9), where the transmission and de novo association (TADA) test (38) was used to evaluate ASD association based on combined evidence from de novo single-nucleotide variants (SNVs), de novo small deletions, and rare inherited variants from Simons Simplex Collection cohorts as well as case–control data from Autism Sequencing Consortium (ASC) cohorts (39). There was a significant enrichment of EGs compared with NEGs in 65 high-confidence TADA ASD genes [TADA false discovery rate (FDR) q values < 0.1] identified by Sanders et al. (9) (36 EGs vs. 15 NEGs; odds ratio = 3.03, P value = 1.82 × 10⁻⁴; one-sided Fisher’s exact test). In a broader set of 441 “potential” TADA ASD genes (TADA FDR < 0.5), EGs were also enriched compared with NEGs (132 EGs vs. 117 NEGs; odds ratio = 1.43, P value = 0.00537). Furthermore, by comparing the observed TADA FDR with the expected TADA FDR, we detected a strong deviation from the null distribution in EGs, especially in 132 EGs with potential ASD association (TADA FDR < 0.5) (Fig. 2C). In contrast, NEGs were not enriched for association relative to the background expectation, suggesting that the association signals between EGs and ASD were stronger and less likely to be false positive compared with NEGs.
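The one-sided enrichment test applied to the TADA gene sets can be sketched as an upper-tail hypergeometric probability. As an assumption, the background is taken to be all 3,915 EGs and 4,919 NEGs; with that background, the sample odds ratios reported above (3.03 and 1.43) are reproduced. The P values depend on the exact background and software the authors used, so they are computed here but not compared with the reported values. The function names are hypothetical.

```python
from math import exp, lgamma

def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def fisher_greater(a, b, c, d):
    """Sample odds ratio and one-sided Fisher's exact P value, P(X >= a), for
    the table [[a, b], [c, d]] (rows: EG / NEG; columns: in set / not in set)."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    upper = min(col1, row1)
    # Upper tail of the hypergeometric distribution, starting at the observed count.
    p = sum(exp(log_comb(row1, x) + log_comb(row2, col1 - x) - log_comb(total, col1))
            for x in range(a, upper + 1))
    return (a * d) / (b * c), min(p, 1.0)

N_EG, N_NEG = 3915, 4919   # gene-set sizes from the text

# High-confidence TADA genes (FDR < 0.1): 36 EGs vs. 15 NEGs.
or_high, p_high = fisher_greater(36, N_EG - 36, 15, N_NEG - 15)
# Potential TADA genes (FDR < 0.5): 132 EGs vs. 117 NEGs.
or_potential, p_potential = fisher_greater(132, N_EG - 132, 117, N_NEG - 117)
print(round(or_high, 2), round(or_potential, 2))  # 3.03 1.43
```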

We hypothesize that a cumulative effect of deleterious variants in several EGs, within the same pathway or across pathways, may underlie impaired brain development and an individual’s ASD risk. To identify clusters of potentially interacting genes, we evaluated the spatiotemporal expression of EGs and NEGs using RNA sequencing (RNA-seq) data from BrainSpan (40). We identified 41 coexpression modules with distinct expression patterns across 16 brain regions and 31 pre- and postnatal time points (Fig. S3 and SI Materials and Methods). We observed that the majority of EG-enriched modules (11 of 14; FDR < 0.1; two-sided Fisher’s exact test) (Fig. 3A, Fig. S3, and Table S5) exhibited an “early-expression” pattern, where the expression levels were higher at early fetal stages (starting from 8 postconceptual weeks) and gradually declined before birth. In contrast, the majority of the NEG-enriched modules (15 of 18) exhibited a “later-expression” pattern, with expression levels that were lower at early fetal stages and gradually increased until birth.
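Because each of the 41 modules is tested separately, the per-module P values need a multiple-testing correction before the FDR < 0.1 threshold is applied. This section does not name the procedure used; the sketch below assumes the Benjamini–Hochberg step-up method, a common choice, and runs it on toy P values. The function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg FDR q values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q

# Toy per-module P values (not taken from Table S5).
p_values = [0.01, 0.04, 0.03, 0.005]
q_values = benjamini_hochberg(p_values)
enriched = [i for i, qv in enumerate(q_values) if qv < 0.1]
print([round(qv, 3) for qv in q_values], enriched)  # [0.02, 0.04, 0.04, 0.02] [0, 1, 2, 3]
```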

Fig. S3. — Expression profiles of 41 coexpression modules in the brain. Expression profiles of genes from 41 coexpression modules based on the RNA-seq data from BrainSpan (25) are shown. The y axis represents the first principal component of the module-level expression profiles in each brain tissue type. The x axis represents developmental stages in chronological order (Fig. 3B shows the labels of the time points). The vertical dashed lines indicate the time of birth. The total number of protein-coding genes in each module (n) is indicated along with the module name.

Fig. 3. — Coexpression analysis of EGs in developing human brain. (A) Coexpressed modules enriched in EGs and NEGs. The upper barplot displays the level of enrichment of EGs vs. NEGs for each of 41 coexpression modules based on BrainSpan RNA-seq data. The lower barplot displays the level of enrichment (green) of 441 potential ASD genes in EGs from 41 coexpression modules. The heights of the bars represent negative log₁₀ (FDR q value). The upper and lower red dashed lines indicate FDR q value threshold of 0.1. (B) The brain expression trajectories of genes from three coexpression modules implicated in ASD. The expression trajectories in brain for 1,601 genes in M01 (orange), 1,150 genes in M02 (purple), and 347 genes in M16 (green) were fitted based on the first principal components of the module-level expression profiles (y axis). The x axis represents developmental stages in chronological order. The vertical dashed line indicates the time of birth. pcw, Postconceptual week. (C) Coexpression network of 973 EGs from M01 (orange), M02 (purple), and M16 (green). Edges indicate coexpression between gene pairs.

Table S5.

Coexpression modules in the developing brain

Module	No. of genes	Expression pattern	Enrichment	No. of EGs	No. of NEGs	Odds ratio (EG/NEG)	FDR q value (EG/NEG)	No. of potential ASD genes	Odds ratio (ASD genes)	FDR q value (ASD genes)
M01	1,601	Early expressed	EG enriched	501	251	2.73	7.38 × 10⁻³⁸^*	55	1.52	0.004^*
M02	1,150	Early expressed	EG enriched	367	208	2.34	2.80 × 10⁻²²^*	53	2.13	2.58 × 10⁻⁶^*
M03	1,054	Mixed	NEG enriched	204	340	0.74	9.67 × 10⁻⁴^*	18	0.72	0.934
M04	810	Late expressed	NEG enriched	122	326	0.45	3.19 × 10⁻¹⁴^*	19	1.00	0.529
M05	781	Late expressed	NEG enriched	156	239	0.81	0.0491^*	24	1.32	0.122
M06	702	Late expressed	NEG enriched	129	254	0.63	1.55 × 10⁻⁵^*	11	0.65	0.948
M07	663	Early expressed	EG enriched	251	141	2.32	1.23 × 10⁻¹⁵^*	8	0.50	0.989
M08	580	Early expressed	EG enriched	193	114	2.19	3.62 × 10⁻¹¹^*	13	0.95	0.613
M09	559	Late expressed	NEG enriched	104	206	0.62	9.26 × 10⁻⁵^*	16	1.23	0.246
M10	503	Early expressed	EG enriched	126	114	1.40	0.0102^*	9	0.74	0.847
M11	457	Late expressed	NEG enriched	79	178	0.55	7.33 × 10⁻⁶^*	9	0.83	0.753
M12	420	Late expressed	NEG enriched	62	163	0.47	1.90 × 10⁻⁷^*	7	0.69	0.874
M13	418	Late expressed	NEG enriched	97	193	0.62	1.46 × 10⁻⁴^*	7	0.69	0.877
M14	370	Late expressed	EG enriched	81	58	1.77	0.00102^*	4	0.45	0.977
M15	368	Mixed	EG enriched	104	95	1.39	0.0251^*	5	0.57	0.934
M16	347	Early expressed	EG enriched	106	90	1.49	0.00570^*	20	2.57	2.80 × 10⁻⁴^*
M17	339	Early expressed	EG enriched	102	59	2.20	1.20 × 10⁻⁶^*	16	2.05	0.008
M18	306	Late expressed		66	61	1.37	0.0874	5	0.67	0.861
M19	299	Late expressed	NEG enriched	31	118	0.32	1.81 × 10⁻⁹^*	2	0.28	0.994
M20	296	Late expressed	NEG enriched	51	91	0.70	0.0498^*	5	0.72	0.823
M21	291	Early expressed		54	73	0.93	0.719	5	0.72	0.818
M22	278	Early expressed	EG enriched	83	25	4.24	6.17 × 10⁻¹²^*	2	0.29	0.991
M23	272	Late expressed	NEG enriched	41	84	0.61	0.0108^*	2	0.31	0.988
M24	258	Early expressed		51	49	1.31	0.189	11	1.84	0.047
M25	244	Early expressed	EG enriched	86	49	2.23	6.66 × 10⁻⁶^*	11	1.98	0.031
M26	239	Early expressed	EG enriched	79	18	5.61	8.28 × 10⁻¹⁴^*	4	0.70	0.821
M27	213	Late expressed	NEG enriched	45	85	0.66	0.0261^*	6	1.19	0.399
M28	197	Late expressed		32	41	0.98	1	1	0.21	0.991
M29	193	Late expressed	NEG enriched	33	69	0.60	0.0158^*	2	0.43	0.943
M30	188	Late expressed	NEG enriched	11	43	0.32	2.92 × 10⁻⁴^*	3	0.69	0.808
M31	187	Late expressed		41	64	0.80	0.323	6	1.38	0.279
M32	172	Late expressed	NEG enriched	24	60	0.50	0.00388^*	3	0.75	0.766
M33	170	Late expressed		41	40	1.29	0.263	4	1.00	0.568
M34	163	Mixed	EG enriched	48	22	2.76	5.06 × 10⁻⁵^*	2	0.51	0.904
M35	151	Mixed	NEG enriched	21	48	0.55	0.0207^*	6	1.73	0.147
M36	151	Late expressed		22	44	0.63	0.0815	3	0.82	0.707
M37	146	Early expressed	EG enriched	38	9	5.35	3.81 × 10⁻⁷^*	2	0.57	0.862
M38	128	Late expressed	NEG enriched	17	63	0.34	2.11 × 10⁻⁵^*	4	1.36	0.347
M39	115	Early expressed		29	42	0.87	0.632	4	1.47	0.298
M40	99	Unknown		4	13	0.39	0.0926	1	0.45	0.890
M41	74	Unknown	NEG enriched	4	16	0.31	0.0400^*	1	0.59	0.816

^*P values with statistical significance.

We found that EGs in three EG-enriched modules (M01, M02, and M16) were significantly enriched (FDR < 0.1; one-sided Fisher’s exact test) for 441 potential TADA ASD genes (Fig. 3A). Notably, all three modules were early-expressed across fetal brain regions (Fig. 3 A and B). From the pathway enrichment analysis of these EG-enriched modules in the Reactome database (41, 42), we found that the top enriched pathways included “transcription” (M01), “chromatin modifying enzymes and chromatin organization” (M02), and “axon guidance” (M16) (Table S6), in agreement with the insights from recent large-scale autism studies showing that genes for synaptic formation, transcriptional regulation, and chromatin remodeling are disrupted in autism (7–9). This combined analysis identified 974 EGs from three modules that are coexpressed with known ASD candidate genes at distinct stages of brain development.

Table S6.

Reactome pathways enriched in three EG-enriched modules implicated in ASD

Module and term	Overlap	P value	Adjusted P value	Genes
M01
Transcription	25/202	2.40 × 10⁻⁶	6.79 × 10⁻⁴^*	GTF3C3; HDAC2; CCNT2; GTF3C4; RRN3; CSTF3; GTF2E1; CLP1; PCF11; POLR2B; SNAPC3; CSTF1; RNGTT; TBP; NCBP1; NCBP2; GTF2H3; NFIA; POLR3B; NFIB; POLR3C; POLR1B; POLR1E; TFAM; TAF5
Processing of capped intron-containing pre-mRNA	22/144	4.34 × 10⁻⁷	3.67 × 10⁻⁴^*	NCBP1; NUP133; DHX9; NCBP2; CSTF3; CDC5L; HNRNPU; PLRG1; YBX1; NUP160; EFTUD2; PRPF4; CLP1; HNRNPH1; PCF11; POLR2B; NUP50; CSTF1; NUPL1; RAE1; SF3B1; CTNNBL1
Folding of actin by CCT/TriC	7/9	1.11 × 10⁻⁶	4.70 × 10⁻⁴^*	CCT3; CCT6A; CCT2; TCP1; CCT7; CCT5; CCT4
mRNA splicing	17/113	1.11 × 10⁻⁵	0.00188^*	NCBP1; DHX9; NCBP2; CSTF3; CDC5L; HNRNPU; PLRG1; YBX1; EFTUD2; PRPF4; CLP1; HNRNPH1; PCF11; POLR2B; CSTF1; SF3B1; CTNNBL1
HIV infection	23/218	6.34 × 10⁻⁵	0.00589^*	CCNT2; PSMD11; RNGTT; TBP; TSG101; NCBP1; NUP133; NCBP2; XRCC5; HMGA1; NEDD4L; GTF2H3; GTF2E1; NUP160; AP1G1; POLR2B; NUP50; PSMD2; TAF5; NUPL1; PAK2; RAE1; KPNB1
HIV lifecycle	18/137	3.19 × 10⁻⁵	0.00451^*	CCNT2; RNGTT; TBP; TSG101; NCBP1; NUP133; NCBP2; XRCC5; HMGA1; NEDD4L; GTF2H3; GTF2E1; NUP160; POLR2B; NUP50; TAF5; NUPL1; RAE1
snRNP assembly	10/49	8.30 × 10⁻⁵	0.00589^*	NCBP1; NUP133; NCBP2; NUP50; TGS1; DDX20; NUPL1; RAE1; NUP160; WDR77
Formation of tubulin-folding intermediates by TriC/CCT	7/20	5.94 × 10⁻⁵	0.00589^*	CCT3; CCT6A; CCT2; TCP1; CCT7; CCT5; CCT4
Association of TriC/CCT with target proteins during biosynthesis	8/29	7.19 × 10⁻⁵	0.00589^*	CCT3; CCT6A; CCT2; TCP1; XRN2; CCT7; CCT5; CCT4
Regulation of cholesterol biosynthesis by SREBP	10/53	1.47 × 10⁻⁴	0.00890^*	SQLE; SEC24B; GGPS1; NFYA; TGS1; CYP51A1; HMGCR; SEC24D; KPNB1; FDFT1
M02
Chromatin organization	35/208	3.76 × 10⁻¹⁵	1.41 × 10⁻¹²^*	PHF2; KDM5C; SMARCB1; TRRAP; EHMT2; EHMT1; CHD4; ACTB; PHF21A; NSD1; SAP130; EP400; WDR5; EP300; BRD8; WHSC1; MTA2; KDM6B; BRD1; CREBBP; KDM4B; SMARCC2; KDM2B; SETDB1; SETD1B; USP22; DNMT3A; ARID1A; GATAD2A; HCFC1; SMARCA4; NCOR1; KAT6B; KAT6A; RCOR1
Processing of capped intron-containing pre-mRNA	19/144	3.55 × 10⁻⁷	8.87 × 10⁻⁵^*	NUP214; SF3A1; SF3B2; SF3B3; NUP155; FUS; DDX23; SMC1A; PRPF8; SRRM1; NUP93; PRPF6; U2AF2; NUP62; POLR2D; TPR; DHX38; NUP98; SNRNP200
Transcription	18/202	1.02 × 10⁻⁴	0.00660^*	GTF3C1; NFIX; POU2F1; EHMT2; CHD4; SSRP1; GATAD2A; SRRM1; POLR3A; POLR1A; U2AF2; POLR2D; TCEB3; UBTF; DHX38; MTA2; TAF4; TAF1
PKMTs methylate histone lysines	7/29	8.03 × 10⁻⁵	0.00660^*	SETDB1; EHMT2; NSD1; SETD1B; WDR5; EHMT1; WHSC1
Transport of mature mRNA derived from an intron-containing transcript	9/50	5.80 × 10⁻⁵	0.00660^*	NUP214; NUP93; NUP155; U2AF2; NUP62; TPR; DHX38; NUP98; SRRM1
HATs acetylate histones	13/105	4.91 × 10⁻⁵	0.00660^*	BRD1; CREBBP; TRRAP; USP22; ACTB; HCFC1; KAT6B; KAT6A; SAP130; EP400; WDR5; EP300; BRD8
Transport of mature transcript to cytoplasm	9/54	9.84 × 10⁻⁵	0.00660^*	NUP214; NUP93; NUP155; U2AF2; NUP62; TPR; DHX38; NUP98; SRRM1
mRNA splicing	13/113	9.74 × 10⁻⁵	0.00660^*	SF3A1; SF3B2; SF3B3; FUS; DDX23; SMC1A; PRPF8; SRRM1; PRPF6; U2AF2; POLR2D; DHX38; SNRNP200
Regulation of lipid metabolism by peroxisome proliferator-activated receptor alpha	13/114	1.06 × 10⁻⁴	0.00660^*	ABCA1; MED1; CREBBP; NCOA6; NRF1; MED26; SREBF2; MED12; MED14; MED24; NCOR1; SIN3A; EP300
Transcriptional regulation of white adipocyte differentiation	11/78	6.57 × 10⁻⁵	0.00660^*	MED12; MED1; CREBBP; MED14; MED24; NCOR1; NCOA6; EP300; LPL; MED26; SREBF2
M16
Axon guidance	11/327	2.24 × 10⁻⁴	0.0740	GSK3B; ARHGEF12; ROCK2; RASA1; KCNQ3; ANK2; ANK3; ARHGEF7; GRIN2B; MYH10; ITGA9
Synthesis of PIPs at the early endosome membrane	3/13	3.84 × 10⁻⁴	0.0740	INPP4A; PIKFYVE; PIK3C3
CREB phosphorylation through the activation of Ras	3/27	2.54 × 10⁻³	0.122	PDPK1; BRAF; GRIN2B
Insulin receptor signaling cascade	5/92	0.00191	0.122	PDPK1; GRB10; PIK3C3; TSC1; MTOR
Eph-ephrin signaling	5/94	0.00209	0.122	ROCK2; RASA1; ARHGEF7; GRIN2B; MYH10
Sema4D-induced cell migration and growth cone collapse	3/24	0.00187	0.122	ARHGEF12; ROCK2; MYH10
Interaction between L1 and ankyrins	3/29	0.00306	0.131	KCNQ3; ANK2; ANK3
Post NMDA receptor activation events	3/35	0.00501	0.143	PDPK1; BRAF; GRIN2B
Signaling by insulin receptor	5/116	0.00497	0.143	PDPK1; GRB10; PIK3C3; TSC1; MTOR
PI3K cascade	4/68	0.00423	0.143	PDPK1; PIK3C3; TSC1; MTOR


CREB, cAMP response element binding protein; HATs, histone acetyltransferases; NMDA, N-methyl-D-aspartate; PKMTs, protein lysine methyltransferases; PIPs, phosphatidylinositol phosphates; PI3K, phosphoinositide 3-kinase; snRNP, small nuclear ribonucleoproteins; SREBP, sterol regulatory element-binding proteins; TriC/CCT, TCP1-ring complex or chaperonin containing TCP1.

^*Adjusted P values with statistical significance.

To further prioritize known EGs as candidates for ASD, we constructed a coexpression network for the 974 EGs from the three modules enriched for potential ASD genes (Fig. 3C and SI Materials and Methods). Of these 974 genes, 844 have a close interaction with high-confidence ASD genes (connected to at least two genes with TADA FDR < 0.1), and 370 harbor de novo or inherited loss-of-function mutations in ASD individuals from the Simons Simplex Collection or ASC cohorts. Of these, 52 have a TADA FDR less than 0.5. Among these 52 genes, 23 have been previously shown to contribute to ASD risk [categories syndromic (S), 1, 2, 3, and 4 in SFARI]. The remaining 29 EGs have not yet been linked to ASD risk. We argue that they represent the strongest candidates for additional investigation of their role in ASD (Fig. S4 and Table S7), based on (i) the importance of EGs in ASD etiology, as shown by their role in critical developmental stages and the increased burden of rare, damaging mutations in ASD individuals; (ii) their coexpression with high-confidence ASD genes in the brain; and (iii) the suggestive genetic evidence from the TADA analysis. According to available mouse phenotypes from the MGI (23) and the IMPC (24), 11 of these 29 EGs have reported heterozygous phenotypes in mice (Table S7). Among them, four EGs (CHD1, FBXO11, KDM4B, and VCP) have been associated with abnormal neural development and/or behavioral phenotypes in heterozygotes.
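The prioritization described above can be read as a chain of filters. The sketch below assumes the filters are applied one after another, which the text implies but does not state outright; the `Gene` record, its field names, and all values are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    symbol: str
    n_high_conf_neighbors: int  # coexpressed genes with TADA FDR < 0.1
    has_lof_in_asd: bool        # de novo or inherited LoF seen in ASD individuals
    tada_fdr: float
    in_sfari: bool              # already listed in SFARI categories S, 1, 2, 3, or 4


def prioritize(genes):
    """Apply the four filters in the order given in the text."""
    kept = [g for g in genes if g.n_high_conf_neighbors >= 2]
    kept = [g for g in kept if g.has_lof_in_asd]
    kept = [g for g in kept if g.tada_fdr < 0.5]
    return [g.symbol for g in kept if not g.in_sfari]


# Hypothetical records; symbols and values are placeholders.
toy = [
    Gene("GENE_A", 15, True, 0.17, False),   # passes every filter
    Gene("GENE_B", 1, True, 0.10, False),    # too few coexpressed ASD genes
    Gene("GENE_C", 12, False, 0.20, False),  # no LoF mutation in ASD individuals
    Gene("GENE_D", 9, True, 0.62, False),    # TADA FDR too high
    Gene("GENE_E", 14, True, 0.05, True),    # already a known ASD gene
]
candidates = prioritize(toy)
print(candidates)
```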

Fig. S4. — Chromosomal distribution of 29 EGs suggested as strong ASD candidate genes. The locations of each gene along the chromosomes are shown in red.

Table S7.

Priority list of 29 EGs as strong ASD candidates

Gene	Chromosome	Start	End	Module	TADA FDR q value	No. of high-confidence ASD genes that are coexpressed	Disease associations
BIRC6	2	32357028	32618899	M02	0.47	15	—
CHD1	5	98853985	98928957	M01	0.17	15	CHD8 has been previously associated with autism
CUL1	7	148697914	148801036	M01	0.49	12	—
DHX29	5	55256245	55307722	M01	0.40	17	—
DVL3	3	184155388	184173610	M02	0.33	10	Robinow syndrome-3 characterized by skeletal abnormalities
EP300	22	41091786	41180077	M02	0.45	13	Rubinstein–Taybi syndrome characterized by short stature, moderate to severe learning difficulties, distinctive facial features, and broad thumbs and first toes
EP400	12	131949920	132081102	M02	0.43	9	—
FBXO11	2	47789316	47905793	M01	0.15	17	Associated with chronic otitis media with effusion and recurrent otitis media, a hearing loss disorder, and the N-ethyl-N-nitrosourea (ENU) knockout of the homologous mouse gene results in the deaf mouse mutant Jeff
KDM4B	19	4969113	5153595	M02	0.30	14	—
LDB1	10	102107560	102120453	M02	0.42	14	—
LTN1	21	28928144	28992956	M16	0.37	3	—
MORC3	21	36320189	36386148	M01	0.50	10	—
MYH10	17	8474205	8630761	M16	0.13	3	Essential for normal spine morphology and dynamics; pharmacologic or genetic inhibition of Myh10 altered protrusive motility of spines, destabilized their mushroom head morphology, and impaired excitatory synaptic transmission
NFIB	9	14081843	14398983	M01	0.45	15	—
PBX1	1	164555584	164899296	M01	0.46	16	—
PHF21A	11	45929323	46121178	M02	0.48	11	—
RFX7	15	56087280	56243266	M01	0.25	17	—
RNF38	9	36336396	36487548	M01	0.41	18	—
SMARCE1	17	40624962	40648508	M01	0.41	12	Meningiomas (brain and spinal cord tumors)
SNW1	14	77717599	77761207	M01	0.44	12	—
STXBP5	6	147204425	147390476	M16	0.37	2	—
SUFU	10	102503987	102633535	M02	0.47	14	Familial meningioma, medulloblastoma
TAF4	20	61953469	62065810	M02	0.30	10	Interference of transcription by the binding of TAF4 with expanded polyglutamine stretches is involved in the pathogenetic mechanisms underlying neurodegeneration
TANC2	17	63009556	63427699	M02	0.32	14	—
TNPO3	7	128954180	129055173	M01	0.19	17	Mutations found in patients with muscular dystrophy
UTP6	17	31860899	31901765	M01	0.19	12	—
VCP	9	35056064	35073249	M02	0.49	9	Inclusion body myopathy with Paget disease of bone and frontotemporal dementia, amyotrophic lateral sclerosis, Charcot–Marie–Tooth disease type 2Y
WHSC1	4	1871424	1982207	M02	0.27	13	Located in the Wolf–Hirschhorn syndrome critical region
YTHDC1	4	68310387	68350089	M01	0.48	11	—


SI Materials and Methods

Identification of EGs.

We identified 3,023 protein-coding EGs annotated with 50 Mouse Phenotype (MP) terms, including prenatal, perinatal, and postnatal lethal phenotypes from the MGI (23) (Table S8). The MGI database was also used to extract 4,995 protein-coding NEGs with nonlethal phenotypes in the mouse. Phenotype data from the IMPC database portal (24) expanded the lethal gene list by 252 lethal genes and 101 genes with subviable phenotypes. We further supplemented the nonlethal gene list with 701 genes with viable phenotypes from the IMPC. Where the reported lethality status differed between the MGI and the IMPC, we deferred to the phenotypes reported by the IMPC, because these mouse lines were generated on a defined C57BL/6N background and phenotypically characterized using a standardized pipeline. One-to-one mouse–human orthology of lethal and nonlethal genes was established based on MGI annotation and manual curation, resulting in 3,326 essential and 4,919 nonessential human orthologs.

Table S8.

MP terms for lethal phenotypes

MP identification	Lethality type	Lethality description
MP:0002058	Neonatal lethality	Death within the neonatal period after birth (Mus: P0)
MP:0002080	Prenatal lethality	Death anytime between fertilization and birth (Mus: approximately E18.5)
MP:0002081	Perinatal lethality	Death anytime within the perinatal period (Mus: E18.5 through postnatal day 1)
MP:0002082	Postnatal lethality	Premature death anytime between the neonatal period and weaning age (Mus: P1 to ∼3 wk of age)
MP:0006204	Embryonic lethality before implantation	Death anytime between fertilization and implantation (Mus: E0 to less than E4.5)
MP:0006205	Embryonic lethality between implantation and somite formation	Death anytime between the point of implantation and somite formation (Mus: E4.5 to less than E8)
MP:0006206	Embryonic lethality between somite formation and embryo turning	Death anytime between somite formation and the initiation of embryo turning (Mus: E8 to less than E9)
MP:0006207	Embryonic lethality during organogenesis	Death anytime between embryo turning and the completion of organogenesis (Mus: E9–E9.5 to less than E14)
MP:0006208	Lethality throughout fetal growth and development	Death anytime between the completion of organogenesis and birth (Mus: E14 to approximately E18.5)
MP:0008527	Embryonic lethality at implantation	Death because of failure of implantation (Mus: E4.5)
MP:0008569	Lethality at weaning	Premature death at weaning age, often caused by the inability to make the transition to solid food
MP:0008762	Embryonic lethality	Death of an animal within the embryonic period before organogenesis (Mus: before E14)
MP:0009850	Embryonic lethality between implantation and placentation	Death anytime between the point of implantation and the initiation of placentation (Mus: E4.5 to less than E9)
MP:0010770	Preweaning lethality	Death anytime between fertilization and weaning age (Mus: ∼3–4 wk of age)
MP:0010831	Partial lethality	The appearance of lower than Mendelian ratios of offspring of a given genotype because of death of some but not all of the organisms
MP:0010832	Lethality during fetal growth through weaning	Death anytime between the completion of organogenesis and weaning age (Mus: E14 to ∼3 wk of age)
MP:0011083	Complete lethality at weaning	Premature death at weaning age of all organisms of a given genotype in a population, often because of the inability to make the transition to solid food
MP:0011084	Partial lethality at weaning	The appearance of lower than Mendelian ratios of organisms of a given genotype because of death of some but not all of the organisms at weaning age
MP:0011085	Complete postnatal lethality	Premature death anytime between the neonatal period and weaning age of all organisms of a given genotype in a population (Mus: P1 to ∼3 wk of age)
MP:0011086	Partial postnatal lethality	The appearance of lower than Mendelian ratios of organisms of a given genotype because of death of some but not all of the organisms anytime between the neonatal period and weaning age (Mus: P1 to ∼3 wk of age)
MP:0011087	Complete neonatal lethality	Death of all organisms of a given genotype in a population within the neonatal period after birth (Mus: P0)
MP:0011088	Partial neonatal lethality	The appearance of lower than Mendelian ratios of organisms of a given genotype because of death of some but not all of the organisms within the neonatal period after birth (Mus: P0)
MP:0011089	Complete perinatal lethality	Death of all organisms of a given genotype in a population within the perinatal period (Mus: E18.5 through postnatal day 1)
MP:0011090	Partial perinatal lethality	The appearance of lower than Mendelian ratios of organisms of a given genotype because of death of some but not all of the organisms within the perinatal period (Mus: E18.5 through postnatal day 1)
MP:0011091	Complete prenatal lethality	Death of all organisms of a given genotype in a population between fertilization and birth (Mus: approximately E18.5)
MP:0011092	Complete embryonic lethality	Death of all organisms of a given genotype in a population within the embryonic period before organogenesis (Mus: before E14)
MP:0011093	Complete embryonic lethality at implantation	Death of all organisms of a given genotype in a population at the point of implantation (Mus: E4.5)
MP:0011094	Complete embryonic lethality before implantation	Death of all organisms of a given genotype in a population between fertilization and implantation (Mus: E0 to less than E4.5)
MP:0011095	Complete embryonic lethality between implantation and placentation	Death of all organisms of a given genotype in a population between the point of implantation and the initiation of placentation (Mus: E4.5 to less than E9)
MP:0011096	Complete embryonic lethality between implantation and somite formation	Death of all organisms of a given genotype in a population between the point of implantation and somite formation (Mus: E4.5 to less than E8)
MP:0011097	Complete embryonic lethality between somite formation and embryo turning	Death of all organisms of a given genotype in a population between somite formation and the initiation of embryo turning (Mus: E8 to less than E9)
MP:0011098	Complete embryonic lethality during organogenesis	Death of all organisms of a given genotype in a population between embryo turning and the completion of organogenesis (Mus: E9–E9.5 to less than E14)
MP:0011099	Complete lethality throughout fetal growth and development	Death of all organisms of a given genotype in a population between the completion of organogenesis and birth (Mus: E14 to approximately E18.5)
MP:0011100	Complete preweaning lethality	Death of all organisms of a given genotype in a population between fertilization and weaning age (Mus: ∼3–4 wk of age)
MP:0011101	Partial prenatal lethality	The appearance of lower than Mendelian ratios of organisms of a given genotype because of death of some but not all of the organisms between fertilization and birth (Mus: approximately E18.5)
MP:0011102	Partial embryonic lethality	The appearance of lower than Mendelian ratios of organisms of a given genotype because of death of some but not all of the organisms within the embryonic period before organogenesis (Mus: before E14)
MP:0011103	Partial embryonic lethality at implantation	The appearance of lower than Mendelian ratios of organisms of a given genotype because of death of some but not all of the organisms at the point of implantation (Mus: E4.5)
MP:0011104	Partial embryonic lethality before implantation	The appearance of lower than Mendelian ratios of organisms of a given genotype because of death of some but not all of the organisms between fertilization and implantation (Mus: E0 to less than E4.5)
MP:0011105	Partial embryonic lethality between implantation and placentation	The appearance of lower than Mendelian ratios of organisms of a given genotype because of death of some but not all of the organisms between the point of implantation and the initiation of placentation (Mus: E4.5 to less than E9)
MP:0011106	Partial embryonic lethality between implantation and somite formation	The appearance of lower than Mendelian ratios of organisms of a given genotype because of death of some but not all of the organisms between the point of implantation and somite formation (Mus: E4.5 to less than E8)
MP:0011107	Partial embryonic lethality between somite formation and embryo turning	The appearance of lower than Mendelian ratios of organisms of a given genotype because of death of some but not all of the organisms between somite formation and the initiation of embryo turning (Mus: E8 to less than E9)
MP:0011108	Partial embryonic lethality during organogenesis	The appearance of lower than Mendelian ratios of organisms of a given genotype because of death of some but not all of the organisms between embryo turning and the completion of organogenesis (Mus: E9–E9.5 to less than E14)
MP:0011109	Partial lethality throughout fetal growth and development	The appearance of lower than Mendelian ratios of organisms of a given genotype because of death of some but not all of the organisms between the completion of organogenesis and birth (Mus: E14 to approximately E18.5)
MP:0011110	Partial preweaning lethality	The appearance of lower than Mendelian ratios of organisms of a given genotype because of death of some but not all of the organisms between fertilization and weaning age (Mus: ∼3–4 wk of age)
MP:0011111	Complete lethality during fetal growth through weaning	Death of all organisms of a given genotype in a population between the completion of organogenesis and weaning age (Mus: E14 to ∼3 wk of age)
MP:0011112	Partial lethality during fetal growth through weaning	The appearance of lower than Mendelian ratios of organisms of a given genotype because of death of some but not all of the organisms between the completion of organogenesis and weaning age (Mus: E14 to ∼3 wk of age)
MP:0011400	Complete lethality	All individuals of a given genotype in a population die before the end of the normal lifespan, but time(s) of death are unspecified
MP:0013292	Embryonic lethality before organogenesis	Death before the completion of embryo turning (Mus: E9–E9.5)
MP:0013293	Embryonic lethality before tooth bud stage	Death before the appearance of tooth buds (Mus: E12–E12.5)
MP:0013294	Prenatal lethality before heart atrial septation	Death before the completion of heart atrial septation (Mus: E14.5–E15.5)


E, embryonic day; Mus, Mus musculus.

The catalog of EGs was further augmented with the addition of cell EGs from three recent studies (20–22) aimed at the characterization of EGs in human cell lines. We obtained 1,580 core EGs (genes above essentiality threshold in at least three of five cell lines in the study) from the work by Hart et al. (22), 1,739 core EGs (genes above essentiality threshold in at least two of four cell lines in the study) from the work by Wang et al. (21), and 1,734 core EGs (genes above essentiality threshold in at least one of two cell lines in the study) from the work by Blomen et al. (20). By taking the overlap of three sets of core EGs, we obtained 956 high-confidence human EGs. Among 956 EGs in human cell lines, 348 genes (36.4%) are also human orthologs of EGs in the mouse, 19 genes (2.0%) are human orthologs of NEGs in the mouse, and 589 genes (61.6%) have not been tested in the mouse (14).
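As a minimal sketch of the overlap step, the three core-EG lists can be treated as sets and intersected. The gene symbols below are placeholders; only the counts in the second half (348, 19, and 589) come from the text.

```python
# Toy core-EG sets standing in for the three screens (placeholder symbols).
hart = {"G1", "G2", "G3", "G4"}
wang = {"G2", "G3", "G4", "G5"}
blomen = {"G3", "G4", "G6"}

# Genes called essential in all three studies.
high_confidence = hart & wang & blomen
print(sorted(high_confidence))

# Breakdown of the high-confidence set, using the counts reported in the text.
counts = {
    "mouse EG ortholog": 348,
    "mouse NEG ortholog": 19,
    "untested in mouse": 589,
}
total = sum(counts.values())
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}
print(total, shares)
```

The three categories sum to 956, and the rounded shares agree with the percentages quoted above.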

Analysis of Haploinsufficiency of EGs.

We collected gene sets from multiple studies and resources for the analysis of patterns of inheritance and haploinsufficiency of EGs. First, a catalog of human disease genes was obtained from Online Mendelian Inheritance in Man (OMIM; downloaded on July 12, 2016) (29). From the OMIM catalog, we identified 1,411 genes annotated with genetic disorders that are “autosomal dominant” or “X-linked dominant” and 2,056 genes annotated with genetic disorders that are “autosomal recessive” or “X-linked recessive.” By intersecting these two gene lists, we obtained 1,000 genes underlying only dominant diseases, 1,645 genes underlying only recessive diseases, and 411 genes that were linked to both dominant and recessive disorders. Second, a list of 616 protein-coding genes that were systematically assessed for evidence of dosage sensitivity was obtained from the ClinGen Dosage Sensitivity Map (30). Among these 616 genes, 239 genes were dosage-sensitive with sufficient evidence, 41 genes were dosage-sensitive with some evidence, 47 genes were dosage-sensitive with little evidence, 200 genes had no evidence for dosage pathogenicity so far, and 89 genes were not dosage-sensitive or had an autosomal recessive phenotype. Third, a list of 262 haploinsufficient genes based on text-mining from PubMed and OMIM was obtained from the work by Dang et al. (31). Fourth, from the MGI, we identified 313 human orthologs of mouse genes associated with heterozygous phenotypes. For each of the gene sets, we evaluated the enrichment of EGs compared with NEGs using Fisher’s exact test.
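The enrichment comparison for each gene set reduces to a 2 × 2 table (EG or NEG, inside or outside the gene set). The following is a minimal sketch of a one-sided Fisher’s exact test written from the hypergeometric distribution; whether the analysis used a one- or two-sided alternative for these gene sets is not stated here, and the counts in the example are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: EGs in the gene set      b: EGs outside the gene set
    c: NEGs in the gene set     d: NEGs outside the gene set
    Returns (sample odds ratio, P(X >= a)) under the hypergeometric null.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    p_value = tail / comb(n, col1)
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, p_value


# Hypothetical counts: 30 of 1,000 EGs and 10 of 1,000 NEGs fall in a gene set.
odds_ratio, p_value = fisher_exact_greater(30, 970, 10, 990)
print(f"OR = {odds_ratio:.2f}, P = {p_value:.3g}")
```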

We acquired the Haploinsufficiency Scores (32) and the Genome-Wide Haploinsufficiency Scores (33) for genome-wide prediction of the probability of haploinsufficiency. For each prediction model, the raw scores were ranked and converted to percentiles. The histograms and estimated density curves were plotted using ggplot2 (geom_histogram and geom_line) in R.
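The rank-to-percentile conversion can be sketched as follows. The exact percentile definition is not given in the text, so the mid-rank convention used here (ties share the average position) is an assumption, and the scores are placeholders.

```python
from bisect import bisect_left, bisect_right


def to_percentiles(scores):
    """Map each gene's raw score to a percentile (0-100), averaging over ties."""
    ordered = sorted(scores.values())
    n = len(ordered)
    return {
        gene: 100.0 * (bisect_left(ordered, s) + bisect_right(ordered, s)) / (2 * n)
        for gene, s in scores.items()
    }


# Hypothetical raw haploinsufficiency scores for four genes.
raw = {"G1": 0.1, "G2": 0.5, "G3": 0.5, "G4": 0.9}
percentiles = to_percentiles(raw)
print(percentiles)
```

Converting both prediction models to percentiles puts them on a common scale before plotting.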

Burden Analysis of Mutations in EGs in ASD Families.

The Simons Simplex Collection contains genetic and phenotypic information from 2,600 ASD families, each of which has one child affected with ASD and unaffected parents and siblings (34). ASD probands were defined by clinical consensus from the Autism Diagnostic Interview–Revised (57) and the Autism Diagnostic Observation Schedule (58). Multiple individual phenotypic measures, including the SRS (35) and IQ, were available (8, 26).

We aimed to investigate the impact of both de novo and rare inherited variants in EGs on ASD risk. We acquired a list of 5,648 de novo variants from an exome sequencing study on 2,517 ASD families from the Simons Simplex Collection (8) and an additional list of 1,544 de novo variants from a reanalysis of the same cohort (2,377 ASD families) with a different pipeline (26). Among the 7,192 de novo variants, 674 were loss-of-function mutations (i.e., stop-loss, stop-gain, start-loss, or splicing donor or acceptor SNVs and frameshift indels), and 3,462 were nonsynonymous mutations (i.e., missense SNVs and nonframeshift indels). The deleterious de novo nonsynonymous mutations were selected using a threshold of the Combined Annotation-Dependent Depletion (CADD) (59) phred-scale score above 10. In addition, we obtained 249,729 rare inherited mutations from 2,377 ASD families (26). From the variants successfully called by both GATK (60) and FreeBayes (61), we extracted loss-of-function mutations and nonsynonymous mutations with minor allele frequency in Exome Variant Server (European ancestry) (62) less than 0.01 and CADD phred-scale score above 10. At the end of the variant filtering steps, we obtained 372 dnLoF variants, 1,497 dnNSD variants, and 77,891 inhRD variants in EGs or NEGs for mutational burden analysis (Fig. S1 and Datasets S3 and S4).
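The inherited-variant filter can be sketched as a single predicate. The record layout, field names, and effect labels below are hypothetical, and the sketch assumes that the frequency and CADD cutoffs apply to both loss-of-function and nonsynonymous variants, which is one reading of the sentence above.

```python
LOF_EFFECTS = {"frameshift", "stop-gain", "stop-loss", "start-loss",
               "splice-donor", "splice-acceptor"}
NONSYN_EFFECTS = {"missense", "nonframeshift-indel"}


def keep_inherited(variant, maf_cutoff=0.01, cadd_cutoff=10.0):
    """Return True if a variant passes caller, class, frequency, and CADD filters."""
    called_by_both = {"GATK", "FreeBayes"} <= set(variant["callers"])
    right_class = variant["effect"] in LOF_EFFECTS | NONSYN_EFFECTS
    rare = variant["evs_maf"] < maf_cutoff
    damaging = variant["cadd_phred"] > cadd_cutoff
    return called_by_both and right_class and rare and damaging


both = ["GATK", "FreeBayes"]
# Hypothetical variant records (illustrative values only).
variants = [
    {"id": "v1", "effect": "stop-gain", "callers": both, "evs_maf": 0.001, "cadd_phred": 35.0},
    {"id": "v2", "effect": "missense", "callers": ["GATK"], "evs_maf": 0.001, "cadd_phred": 22.0},
    {"id": "v3", "effect": "missense", "callers": both, "evs_maf": 0.05, "cadd_phred": 22.0},
    {"id": "v4", "effect": "missense", "callers": both, "evs_maf": 0.002, "cadd_phred": 4.0},
    {"id": "v5", "effect": "synonymous", "callers": both, "evs_maf": 0.001, "cadd_phred": 12.0},
    {"id": "v6", "effect": "missense", "callers": both, "evs_maf": 0.003, "cadd_phred": 18.5},
]
kept_ids = [v["id"] for v in variants if keep_inherited(v)]
print(kept_ids)
```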

The individual mutational burden was defined as the number of mutations carried by each subject in the gene sets of interest (i.e., 3,915 EGs and 4,919 NEGs) for each class of variants (dnLoF, dnNSD, and inhRD). Among all Simons Simplex Collection ASD families, there were 1,781 ASD quartets where exome sequence data from an affected proband and an unaffected sibling were available. The individual mutational burden in the 1,781 ASD probands was compared with the burden in their unaffected siblings using a one-sided Wilcoxon signed-rank test.
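The paired comparison can be sketched with a hand-written signed-rank test. This version drops zero differences and uses the normal approximation with a tie correction and no continuity correction; these are assumptions, since the text does not say which variant of the test was run. The `burden` helper and all data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def burden(mutated_genes, gene_set):
    """Count a subject's mutations that fall in the gene set of interest."""
    return sum(1 for g in mutated_genes if g in gene_set)


def signed_rank_greater(probands, siblings):
    """One-sided Wilcoxon signed-rank test that proband burden exceeds sibling burden.

    Returns (W_plus, p_value) using the normal approximation.
    """
    diffs = [p - s for p, s in zip(probands, siblings) if p != s]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        # Extend j over the block of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average rank of the tied block
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / sqrt(var)
    return w_plus, 1 - NormalDist().cdf(z)


# Hypothetical burdens for five proband-sibling pairs.
w_plus, p_value = signed_rank_greater([3, 2, 4, 1, 5], [1, 2, 1, 2, 3])
print(w_plus, round(p_value, 3))
```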

We acquired SRS total raw scores for 2,348 probands and 1,678 siblings as well as verbal/nonverbal IQs for 2,359 probands for 1,781 ASD quartets and 587 ASD trios from Simons Simplex Collection families (Dataset S2). Poisson regression analysis was carried out separately for each trait (i.e., SRS total raw score, verbal IQ, and nonverbal IQ) as the dependent variable, with the individual burdens of all rare damaging mutations (including dnLoF, dnNSD, and inhRD) in EGs or NEGs as the independent variables.
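A Poisson regression of a trait on mutational burden can be sketched with a single predictor. The fitting routine below is a plain Newton-Raphson on the Poisson log-likelihood, not the software used in the study (which is not named here); covariates are omitted, and the burden and score values are hypothetical.

```python
from math import exp, log


def fit_poisson(x, y, n_iter=50, tol=1e-10):
    """Fit log E[y] = b0 + b1 * x by Newton-Raphson; returns (b0, b1)."""
    b0, b1 = log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(n_iter):
        mu = [exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Fisher information, a symmetric 2x2 matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        # Newton step: inverse information times score.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Hypothetical data: burden in EGs (x) and SRS total raw score (y).
eg_burden = [10, 12, 14, 15, 17, 20]
srs_score = [70, 82, 85, 95, 101, 118]
intercept, slope = fit_poisson(eg_burden, srs_score)
print(round(intercept, 3), round(slope, 4))
```

A positive slope would indicate that a higher burden is associated with a higher expected score.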

Construction of Coexpression Modules and Coexpression Network in Brain.

Coexpression analysis in human brain was conducted based on RNA-seq data from BrainSpan: Atlas of the Developing Human Brain (40). We used the Weighted Correlation Network Analysis (WGCNA) package (63) for data quality control and identification of modules of coexpressed genes. The expression data for 52,376 Ensembl genes (56) (including protein-coding genes, noncoding genes, or pseudogenes) across 525 samples were obtained; 1,716 genes with too many missing entries or zero variance in expression levels were removed by the “goodSamplesGenes” function in the WGCNA, and 12,613 genes with very low expression levels [maximum reads per kilobase of transcript per million mapped reads (RPKM) less than 0.5 across samples] were removed. As a final step for gene-level data cleaning, only protein-coding genes were selected for additional analysis. For sample-level data cleaning, three outlier subjects (300, 303, and 306) were removed according to subject-level clustering result. Ten brain tissue types (caudal ganglionic eminence, cerebellum, dorsal thalamus, lateral ganglionic eminence, medial ganglionic eminence, occipital neocortex, parietal neocortex, primary motor sensory cortex, temporal neocortex, and upper rhombic lip) with data from fewer than 10 developmental stages were removed. The final quality-controlled dataset consisted of expression levels of 15,952 protein-coding genes in 16 brain tissue types across 31 pre- and postnatal developmental stages (495 samples in total). For module detection, we used the “blockwiseModules” function in the WGCNA with default parameters, except for the network type (power = 6, deepSplit = 2, and networkType = “signed”). We used the signed version of coexpression analysis that links two genes with positive correlation of expression levels.
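The gene-level cleaning steps can be sketched as a single predicate applied to each gene’s expression vector. This is a plain-Python stand-in, not the WGCNA implementation: the missing-data threshold (`min_fraction_present`) is an assumed value, and the expression values are placeholders. Only the 0.5 RPKM cutoff comes from the text.

```python
def gene_passes_qc(rpkm, min_fraction_present=0.5, min_max_rpkm=0.5):
    """Return True if a gene survives the three gene-level filters.

    rpkm: expression values across samples, with None for missing entries.
    """
    present = [v for v in rpkm if v is not None]
    if not present or len(present) < min_fraction_present * len(rpkm):
        return False  # too many missing entries
    if max(present) == min(present):
        return False  # zero variance across samples
    return max(present) >= min_max_rpkm  # drop genes with very low expression


# Hypothetical expression matrix: gene -> RPKM across four samples.
expression = {
    "G1": [1.2, 3.4, 0.8, 2.1],      # kept
    "G2": [0.1, 0.2, 0.0, 0.3],      # maximum RPKM below 0.5
    "G3": [2.0, 2.0, 2.0, 2.0],      # zero variance
    "G4": [None, None, None, 4.0],   # mostly missing
}
kept_genes = [g for g, values in expression.items() if gene_passes_qc(values)]
print(kept_genes)
```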

Coexpression between gene pairs was calculated based on the quality-controlled BrainSpan RNA-seq data with 495 brain samples. Two genes were defined as “coexpressed” in the brain if the Pearson correlation of the expression levels of both genes across 495 brain samples was greater than or equal to 0.8. In total, there were 8,600,150 coexpression links among protein-coding genes. The coexpression network was created using the GeneMania plugin (64) within Cytoscape 3.2.1 (65). Of 974 EGs from three modules (M01, M02, and M16) implicated in ASD, coexpression data were available for 973 genes, which were used as the input gene set for network construction. The coexpression network consists of a main connected component with 963 nodes and 187,443 edges as well as 10 isolated nodes.
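The pairwise definition of coexpression can be sketched directly from the stated threshold. The gene names and expression vectors are placeholders, and the all-pairs loop is only practical for small inputs.

```python
from itertools import combinations
from math import sqrt


def pearson(u, v):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = sqrt(sum((a - mean_u) ** 2 for a in u))
    sd_v = sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (sd_u * sd_v)


def coexpression_edges(expr, threshold=0.8):
    """Link every gene pair whose correlation is at or above the threshold."""
    return [
        (g, h)
        for g, h in combinations(sorted(expr), 2)
        if pearson(expr[g], expr[h]) >= threshold
    ]


# Hypothetical expression levels across four samples.
expr = {
    "G1": [1, 2, 3, 4],
    "G2": [2, 4, 6, 9],   # strongly correlated with G1
    "G3": [4, 3, 2, 1],   # anticorrelated with G1
    "G4": [2, 1, 4, 3],   # moderately correlated with G1 (r = 0.6)
}
edges = coexpression_edges(expr)
print(edges)
```

Because only correlations of at least 0.8 create a link, strongly anticorrelated pairs such as G1 and G3 are left unconnected.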

Discussion

We provide multiple lines of evidence suggesting that deleterious variants in EGs have a cumulative effect on ASD risk. Using the most comprehensive list of 3,915 EGs established to date, we show that there is both an elevated burden of damaging mutations in EGs in ASD probands and also, an enrichment of EGs in the recently identified high-confidence ASD-associated genes. Moreover, the analysis of EGs in the developing brain identified clusters of coexpressed EGs implicated in ASD, including 29 EGs functionally related to previously identified ASD risk genes.

We find that ASD individuals have a higher burden of mutations in EGs compared with their unaffected siblings. It is notable but not surprising that this effect is particularly pronounced when considering de novo mutations, because this class of mutations is only subject to selection pressure after originating in the individual and has exhibited some of the most prominent associations with the risk of ASD (8, 43–45). Similarly, a moderately increased burden of dnLoF variants in ASD probands was detected with a group of 10,823 phenotypically uncharacterized genes. Based on current estimates, one-fifth of these uncharacterized genes (∼2,000) are expected to be EGs, which may explain the higher mutational burden of dnLoF variants in ASD probands. Recent studies have begun to show that additional genetic factors, such as rare and common inherited variations, also contribute to ASD (26, 46). Our result supports this finding, showing that inherited, rare, damaging mutations in EGs also have a significant effect on ASD risk. Furthermore, we show an EG-specific effect on social responsiveness, a measure of the social aspects of ASD. In contrast, mutational burden in both EGs and NEGs has an effect on IQ measures. Complex social behaviors result from a range of different cognitive processes; however, in ASD subjects, there is a striking dissociation in the level of impairment in social interaction or communication and general cognitive abilities (as measured by IQ) (36) (Fig. S2). Moreover, studies in model organisms clearly show a fetal origin for social behavior deficits (47). Our results are in line with these findings and suggest that, although a higher mutational burden over all genes may have consequences on IQ, mutational burden in a set of genes with a role at critical early developmental stages influences the development of social behavior. 
Our findings are further supported by the recent report that genomic regions under accelerated evolution have essential functions in human brain development and, when mutated, may increase the risk for autism (48). Therefore, understanding the regulatory landscape of dosage-sensitive EGs expressed at critical stages of brain development may reveal risk alleles for many neurodevelopmental and psychiatric disorders.

The analysis of the overlapping set of Simons Simplex Collection ASD families by several groups using complementary approaches led to the identification of around 100 ASD risk genes and the finding of a depletion of damaging mutations in ASD risk genes (12, 27, 28). We show that a significant number of reported ASD risk genes are essential for survival and fitness and therefore have a distinctive mutational spectrum, providing a biological foundation for this intolerance to damaging mutations. Of the spectrum of existing alleles, homozygosity or compound heterozygosity for loss-of-function alleles will never be observed. Also, because of synthetic lethality, some combinations of mutations in EGs are eliminated. Therefore, individuals will have only a subset of “milder” coding or regulatory alleles. The current list of candidate genes consists of 100 (high-confidence ASD genes) to 400 genes (potential ASD genes) (9). It is striking that our study provides strong statistical evidence for the aggregate effect across 3,915 EGs impacting risk for this neurodevelopmental disorder. A recent SNP-based heritability study reported the extreme polygenicity of schizophrenia, with 70% of 1-Mb genomic regions harboring schizophrenia risk alleles (49). Assuming a similar genetic architecture in ASD and schizophrenia, genomic maps of EGs with “surviving” deleterious and regulatory variants in ASD probands represent a complementary approach for the analysis of combinations of culprit genes or alleles.

Because of the fundamental functional role of EGs in an organism, genetic variants in these genes are likely to contribute to many traits and diseases as reflected by the previous finding that EGs are enriched for human disease genes (11, 13, 14). Our study is focused on a specific neurodevelopmental disorder—ASD—because it has been suggested that ASD has its roots in abnormalities in prenatal brain development (50–52). Specifically, our analysis of the temporal expression patterns of coexpressed gene modules in the developing brain shows that genes in three EG-enriched coexpression modules implicated in ASD are expressed at a high level at the earliest stages of brain development, as early as 8 weeks after conception. In contrast, at later stages of brain development, the expression levels of genes in these EG-enriched modules decrease, whereas the expression levels of genes in NEG-enriched modules increase. This finding suggests that EGs have a distinctive influence at some of the earliest brain developmental stages as previously reported for constrained genes (53) and genes in functional networks perturbed in ASD (54). However, it is not clear whether the contribution of EGs is specific to ASD or widespread across disorders with various underlying mechanisms. A comparison of the burden of deleterious variants in EGs across other complex disorders, including those with a later onset, is warranted.

Each individual can carry a number of deleterious mutations, each of which can have a small effect. Because brain function may be particularly sensitive to mutation accumulation, identifying a specific set of genes in which mutations have a behavioral effect will assist us in understanding how mutation accumulation within an individual can result in a phenotype, such as ASD. Hallmarks of ASD are phenotypic heterogeneity, frequent comorbidities, and the absence of any uniquely implicated brain region or cell type (5), which further supports a role for genes with a global effect on embryonic and fetal development. Here, we provide evidence that genes that are essential for survival and fitness also contribute to ASD risk and lead to the disruption of normal social behavior.

Materials and Methods

Identification of EGs.

Mouse Phenotype (MP) terms for the annotation of EGs are listed in Table S8. More details on identification of the catalog of EGs are in SI Materials and Methods.

Analysis of Haploinsufficiency of EGs.

Details on collection of gene sets for the analysis of haploinsufficiency of EGs are in SI Materials and Methods.

Burden Analysis of Mutations in EGs in ASD Families.

Details on collection of genetic and phenotypic data of ASD families and on the variant filtering process are in SI Materials and Methods.

Comparison Between Observed and Expected TADA FDR q Values.

To compare the strength of association signals to ASD between EGs and NEGs, FDR q values for the TADA test of 18,665 genes were obtained from the work by Sanders et al. (9). For each gene set of interest (i.e., 3,915 EGs or 4,919 NEGs), the null distribution of TADA FDR q values was generated by random resampling with replacement. Within one iteration of the resampling procedure, for each gene in the gene set of interest, the TADA FDR q value of a gene drawn at random from the 18,665 tested genes was obtained. The resampled TADA FDR q values were then ranked from low to high. The resampling procedure was repeated for 100,000 iterations. For each observed TADA FDR q value ranked from low to high, the median of the 100,000 resampled q values with the same rank was considered the expected TADA FDR q value. The 2.5th and 97.5th percentiles of the 100,000 resampled q values with the same rank were taken as the bounds of the estimated 95% confidence interval of each expected TADA FDR q value. The observed FDR q values were then compared with the expected FDR q values.
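The resampling procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the deposited Fig2C_getExpectedTADAFDR.py script: the function names (`percentile`, `expected_q_values`), the linear-interpolation percentile rule, and the small default iteration count are assumptions made for the example.

```python
import random


def percentile(sorted_vals, pct):
    """Percentile of an ascending list, using linear interpolation (an assumed rule)."""
    k = (len(sorted_vals) - 1) * pct / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def expected_q_values(all_q, set_size, n_iter=1000, seed=0):
    """Return one (expected, ci_low, ci_high) tuple per rank for a gene set of `set_size`.

    all_q    : q values of all tested genes (18,665 in the paper)
    set_size : number of genes in the set of interest (e.g., 3,915 EGs)
    n_iter   : number of resampling iterations (100,000 in the paper)
    """
    rng = random.Random(seed)
    # by_rank[r] collects the r-th smallest resampled q value from every iteration
    by_rank = [[] for _ in range(set_size)]
    for _ in range(n_iter):
        # draw one random gene's q value per gene in the set, with replacement
        draw = sorted(rng.choices(all_q, k=set_size))
        for r, q in enumerate(draw):
            by_rank[r].append(q)
    result = []
    for vals in by_rank:
        vals.sort()
        result.append((percentile(vals, 50.0),    # expected q value (median)
                       percentile(vals, 2.5),     # lower bound of the 95% interval
                       percentile(vals, 97.5)))   # upper bound of the 95% interval
    return result


if __name__ == "__main__":
    # toy stand-in for the genome-wide q values; real input would come from ref. 9
    toy_q = [i / 100 for i in range(1, 101)]
    for expected, low, high in expected_q_values(toy_q, set_size=5, n_iter=200):
        print(round(expected, 3), round(low, 3), round(high, 3))
```

The observed q values of a gene set, sorted from low to high, would then be compared rank by rank with the `expected` values; observed values falling below `ci_low` at a given rank indicate a stronger association signal than expected for a random gene set of the same size.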

Construction of Coexpression Modules and Coexpression Network in Brain.

Details on construction of coexpression modules and coexpression network in the developing human brain are in SI Materials and Methods.

Pathway Enrichment Analysis.

We performed pathway enrichment analysis in the Reactome database (42) using Enrichr (55) for three EG-enriched modules (M01, M02, and M16) that were also enriched for potential ASD genes (Table S6). The enriched pathways were ranked by P values with Benjamini–Hochberg adjustment (FDR q values) from the Fisher’s exact test.
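The statistics behind this ranking can be illustrated with a minimal Python sketch: a one-sided Fisher's exact test for the overlap between a gene module and a pathway, followed by Benjamini–Hochberg adjustment. The analysis itself was run with Enrichr; the function names, the choice of a one-sided test, and the background size used here are assumptions for illustration, not a description of Enrichr's internals.

```python
from math import comb


def fisher_enrichment_p(overlap, module_size, pathway_size, background):
    """One-sided Fisher's exact test: P(overlap >= observed) under the hypergeometric null."""
    total = comb(background, module_size)
    upper = min(module_size, pathway_size)
    tail = sum(comb(pathway_size, k) * comb(background - pathway_size, module_size - k)
               for k in range(overlap, upper + 1))
    return tail / total


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg FDR q values in the same order as `pvals`."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # walk from the largest p value down, enforcing monotone q values
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals


if __name__ == "__main__":
    # hypothetical counts: (genes shared with pathway, pathway size) for three pathways
    pathways = [(12, 150), (3, 200), (25, 400)]
    module_size, background = 300, 20000  # assumed sizes, for illustration only
    pvals = [fisher_enrichment_p(k, module_size, n, background) for k, n in pathways]
    for p, q in zip(pvals, benjamini_hochberg(pvals)):
        print(p, q)
```

Pathways would then be ranked by the resulting q values, from lowest to highest.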

Code Availability.

Details on availability of code used to generate reported results are in Table S9.

Table S9.

Analysis code for figures and tables generated

File name	Figure/table	Description
Fig1A_Fig2B_plotGeneSetEnrichment.r	Figs. 1A and 2B	Plotting enrichment of EGs among haploinsufficient genes and ASD risk genes
Fig1BC_plotHIScoreDistribution.r	Fig. 1 B and C	Plotting the distribution of haploinsufficiency scores
Fig2A_getForestPlot_burdenAnalysis.r	Fig. 2A	Plotting the results for mutational burden analysis
Fig2A_Table1_TableS1_S2_S3_S4_burdenAnalysis.r	Fig. 2A, Table 1, and Tables S1, S2, S3, and S4	Performing mutational burden analysis
Fig2C_getExpectedTADAFDR.py	Fig. 2C	Generating the null distribution of TADA FDR q values for gene set
Fig2C_plotTADAfdrQQ.r	Fig. 2C	Plotting the observed vs. null distribution of TADA FDR q values for gene set
Fig3A_plotModuleEnrichment.r	Fig. 3A	Plotting the enrichment of EGs/NEGs among coexpression modules
Fig3C_plotNetworkAttibutes.r	Fig. 3C	Plotting the coexpression network of gene modules implicated in ASD
FigS1_Fig3B_plotEigengenes.r	Fig. 3B and Fig. S1	Plotting the expression trajectories of coexpression modules
FigS2_plotSRS_IQ.r	Fig. S2	Plotting the correlation between SRS and IQ in ASD probands
DatasetS3_S4_getVariantList.py	Datasets S3 and S4	Generating lists of variants for mutational burden analysis

Analysis code for the generated figures and tables was deposited in GitHub (https://github.com/Bucanlab/Ji_PNAS_2016).

Supplementary Material

Supplementary File: pnas.1613195113.sd01.xlsx (2.9 MB, xlsx)

Supplementary File: pnas.1613195113.sd02.xlsx (349.8 KB, xlsx)

Supplementary File: pnas.1613195113.sd03.xlsx (472.5 KB, xlsx)

Supplementary File: pnas.1613195113.sd04.xlsx (12 MB, xlsx)

Acknowledgments

We thank Steve Murray and the International Mouse Phenotyping Consortium (IMPC) for help with generation of gene lists, and Benjamin Georgi, Benjamin Voight, Hakon Hakonarson, Steve Brown, Judith Miller, Edward Brodkin, and Lu Chen for discussions. X.J. was supported by a fellowship from Biomedical Graduate Studies at the University of Pennsylvania. This work was supported by the Pennsylvania Commonwealth Grant and NIH Grants R01MH101822 (to C.D.B.) and R01MH093415 (to M.B. and Steven M. Paul; multiple principal investigators).

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1613195113/-/DCSupplemental.

References

1. State MW, Levitt P. The conundrums of understanding genetic risks for autism spectrum disorders. Nat Neurosci. 2011;14(12):1499–1506. doi: 10.1038/nn.2924.
2. Huguet G, Ey E, Bourgeron T. The genetic landscapes of autism spectrum disorders. Annu Rev Genomics Hum Genet. 2013;14:191–213. doi: 10.1146/annurev-genom-091212-153431.
3. Willsey AJ, State MW. Autism spectrum disorders: From genes to neurobiology. Curr Opin Neurobiol. 2015;30:92–99. doi: 10.1016/j.conb.2014.10.015.
4. De Rubeis S, Buxbaum JD. Recent advances in the genetics of autism spectrum disorder. Curr Neurol Neurosci Rep. 2015;15(6):36. doi: 10.1007/s11910-015-0553-1.
5. de la Torre-Ubieta L, Won H, Stein JL, Geschwind DH. Advancing the understanding of autism disease mechanisms through genetics. Nat Med. 2016;22(4):345–361. doi: 10.1038/nm.4071.
6. Geschwind DH, Levitt P. Autism spectrum disorders: Developmental disconnection syndromes. Curr Opin Neurobiol. 2007;17(1):103–111. doi: 10.1016/j.conb.2007.01.009.
7. De Rubeis S, et al.; DDD Study; Homozygosity Mapping Collaborative for Autism; UK10K Consortium. Synaptic, transcriptional and chromatin genes disrupted in autism. Nature. 2014;515(7526):209–215. doi: 10.1038/nature13772.
8. Iossifov I, et al. The contribution of de novo coding mutations to autism spectrum disorder. Nature. 2014;515(7526):216–221. doi: 10.1038/nature13908.
9. Sanders SJ, et al.; Autism Sequencing Consortium. Insights into autism spectrum disorder genomic architecture and biology from 71 risk loci. Neuron. 2015;87(6):1215–1233. doi: 10.1016/j.neuron.2015.09.016.
10. Zhang M, Zhu C, Jacomy A, Lu LJ, Jegga AG. The orphan disease networks. Am J Hum Genet. 2011;88(6):755–766. doi: 10.1016/j.ajhg.2011.05.006.
11. Georgi B, Voight BF, Bućan M. From mouse to human: Evolutionary genomics analysis of human orthologs of essential genes. PLoS Genet. 2013;9(5):e1003484. doi: 10.1371/journal.pgen.1003484.
12. Petrovski S, Wang Q, Heinzen EL, Allen AS, Goldstein DB. Genic intolerance to functional variation and the interpretation of personal genomes. PLoS Genet. 2013;9(8):e1003709. doi: 10.1371/journal.pgen.1003709.
13. Dickerson JE, Zhu A, Robertson DL, Hentges KE. Defining the role of essential genes in human disease. PLoS One. 2011;6(11):e27368. doi: 10.1371/journal.pone.0027368.
14. Dickinson ME, et al.; International Mouse Phenotyping Consortium; Jackson Laboratory; Infrastructure Nationale PHENOMIN, Institut Clinique de la Souris (ICS); Charles River Laboratories; MRC Harwell; Toronto Centre for Phenogenomics; Wellcome Trust Sanger Institute; RIKEN BioResource Center. High-throughput discovery of novel developmental phenotypes. Nature. 2016;537(7621):508–514. doi: 10.1038/nature19356.
15. Deutschbauer AM, et al. Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics. 2005;169(4):1915–1925. doi: 10.1534/genetics.104.036871.
16. Mushegian AR, Koonin EV. A minimal gene set for cellular life derived by comparison of complete bacterial genomes. Proc Natl Acad Sci USA. 1996;93(19):10268–10273. doi: 10.1073/pnas.93.19.10268.
17. Koonin EV. Comparative genomics, minimal gene-sets and the last universal common ancestor. Nat Rev Microbiol. 2003;1(2):127–136. doi: 10.1038/nrmicro751.
18. Hwang YC, et al. Predicting essential genes based on network and sequence analysis. Mol Biosyst. 2009;5(12):1672–1678. doi: 10.1039/B900611G.
19. Chakravarti A, Turner TN. Revealing rate-limiting steps in complex disease biology: The crucial importance of studying rare, extreme-phenotype families. BioEssays. 2016;38(6):578–586. doi: 10.1002/bies.201500203.
20. Blomen VA, et al. Gene essentiality and synthetic lethality in haploid human cells. Science. 2015;350(6264):1092–1096. doi: 10.1126/science.aac7557.
21. Wang T, et al. Identification and characterization of essential genes in the human genome. Science. 2015;350(6264):1096–1101. doi: 10.1126/science.aac7041.
22. Hart T, et al. High-resolution CRISPR screens reveal fitness genes and genotype-specific cancer liabilities. Cell. 2015;163(6):1515–1526. doi: 10.1016/j.cell.2015.11.015.
23. Eppig JT, et al.; Mouse Genome Database Group. The Mouse Genome Database (MGD): From genes to mice--a community resource for mouse biology. Nucleic Acids Res. 2005;33(Database issue):D471–D475. doi: 10.1093/nar/gki113.
24. Koscielny G, et al. The International Mouse Phenotyping Consortium Web Portal, a unified point of access for knockout mice and related phenotyping data. Nucleic Acids Res. 2014;42(Database issue):D802–D809. doi: 10.1093/nar/gkt977.
25. White JK, et al.; Sanger Institute Mouse Genetics Project. Genome-wide generation and systematic phenotyping of knockout mice reveals new roles for many genes. Cell. 2013;154(2):452–464. doi: 10.1016/j.cell.2013.06.022.
26. Krumm N, et al. Excess of rare, inherited truncating mutations in autism. Nat Genet. 2015;47(6):582–588. doi: 10.1038/ng.3303.
27. Samocha KE, et al. A framework for the interpretation of de novo mutation in human disease. Nat Genet. 2014;46(9):944–950. doi: 10.1038/ng.3050.
28. Iossifov I, et al. Low load for disruptive mutations in autism genes and their biased transmission. Proc Natl Acad Sci USA. 2015;112(41):E5600–E5607. doi: 10.1073/pnas.1516376112.
29. Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA. Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res. 2005;33(Database issue):D514–D517. doi: 10.1093/nar/gki033.
30. Rehm HL, et al.; ClinGen. ClinGen--the Clinical Genome Resource. N Engl J Med. 2015;372(23):2235–2242. doi: 10.1056/NEJMsr1406261.
31. Dang VT, Kassahn KS, Marcos AE, Ragan MA. Identification of human haploinsufficient genes and their genomic proximity to segmental duplications. Eur J Hum Genet. 2008;16(11):1350–1357. doi: 10.1038/ejhg.2008.111.
32. Huang N, Lee I, Marcotte EM, Hurles ME. Characterising and predicting haploinsufficiency in the human genome. PLoS Genet. 2010;6(10):e1001154. doi: 10.1371/journal.pgen.1001154.
33. Steinberg J, Honti F, Meader S, Webber C. Haploinsufficiency predictions without study bias. Nucleic Acids Res. 2015;43(15):e101. doi: 10.1093/nar/gkv474.
34. Fischbach GD, Lord C. The Simons Simplex Collection: A resource for identification of autism genetic risk factors. Neuron. 2010;68(2):192–195. doi: 10.1016/j.neuron.2010.10.006.
35. Constantino J, Gruber C. The Social Responsiveness Scale Manual. Western Psychological Services; Los Angeles: 2005.
36. Constantino JN, et al. Validation of a brief quantitative measure of autistic traits: Comparison of the social responsiveness scale with the autism diagnostic interview-revised. J Autism Dev Disord. 2003;33(4):427–433. doi: 10.1023/a:1025014929212.
37. Abrahams BS, et al. SFARI Gene 2.0: A community-driven knowledgebase for the autism spectrum disorders (ASDs). Mol Autism. 2013;4(1):36. doi: 10.1186/2040-2392-4-36.
38. He X, et al. Integrated model of de novo and inherited genetic variants yields greater power to identify risk genes. PLoS Genet. 2013;9(8):e1003671. doi: 10.1371/journal.pgen.1003671.
39. Buxbaum JD, et al.; Autism Sequencing Consortium. The autism sequencing consortium: Large-scale, high-throughput sequencing in autism spectrum disorders. Neuron. 2012;76(6):1052–1056. doi: 10.1016/j.neuron.2012.12.008.
40. BrainSpan. BrainSpan: Atlas of the Developing Human Brain. 2011. Available at brainspan.org. Accessed October 4, 2013.
41. Croft D, et al. The Reactome pathway knowledgebase. Nucleic Acids Res. 2014;42(Database issue):D472–D477. doi: 10.1093/nar/gkt1102.
42. Fabregat A, et al. The Reactome pathway Knowledgebase. Nucleic Acids Res. 2016;44(D1):D481–D487. doi: 10.1093/nar/gkv1351.
43. Sanders SJ, et al. De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature. 2012;485(7397):237–241. doi: 10.1038/nature10945.
44. O’Roak BJ, et al. Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature. 2012;485(7397):246–250. doi: 10.1038/nature10989.
45. Iossifov I, et al. De novo gene disruptions in children on the autistic spectrum. Neuron. 2012;74(2):285–299. doi: 10.1016/j.neuron.2012.04.009.
46. Gaugler T, et al. Most genetic risk for autism resides with common variation. Nat Genet. 2014;46(8):881–885. doi: 10.1038/ng.3039.
47. Belinson H, et al. Prenatal β-catenin/Brn2/Tbr2 transcriptional cascade regulates adult social and stereotypic behaviors. Mol Psychiatry. 2016;21(10):1417–1433. doi: 10.1038/mp.2015.207.
48. Doan RN, et al. Mutations in human accelerated regions disrupt cognition and social behavior. Cell. 2016;167(2):341–354.e12. doi: 10.1016/j.cell.2016.08.071.
49. Loh PR, et al.; Schizophrenia Working Group of Psychiatric Genomics Consortium. Contrasting genetic architectures of schizophrenia and other complex diseases using fast variance-components analysis. Nat Genet. 2015;47(12):1385–1392. doi: 10.1038/ng.3431.
50. Willsey AJ, et al. Coexpression networks implicate human midfetal deep cortical projection neurons in the pathogenesis of autism. Cell. 2013;155(5):997–1007. doi: 10.1016/j.cell.2013.10.020.
51. Parikshak NN, et al. Integrative functional genomic analyses implicate specific molecular pathways and circuits in autism. Cell. 2013;155(5):1008–1021. doi: 10.1016/j.cell.2013.10.031.
52. Stoner R, et al. Patches of disorganization in the neocortex of children with autism. N Engl J Med. 2014;370(13):1209–1219. doi: 10.1056/NEJMoa1307491.
53. Choi J, Shooshtari P, Samocha KE, Daly MJ, Cotsapas C. Network analysis of genome-wide selective constraint reveals a gene network active in early fetal brain intolerant of mutation. PLoS Genet. 2016;12(6):e1006121. doi: 10.1371/journal.pgen.1006121.
54. Chang J, Gilman SR, Chiang AH, Sanders SJ, Vitkup D. Genotype to phenotype relationships in autism spectrum disorders. Nat Neurosci. 2015;18(2):191–198. doi: 10.1038/nn.3907.
55. Chen EY, et al. Enrichr: Interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics. 2013;14:128. doi: 10.1186/1471-2105-14-128.
56. Flicek P, et al. Ensembl 2014. Nucleic Acids Res. 2014;42(Database issue):D749–D755. doi: 10.1093/nar/gkt1196.
57. Lord C, Rutter M, Le Couteur A. Autism Diagnostic Interview-Revised: A revised version of a diagnostic interview for caregivers of individuals with possible pervasive developmental disorders. J Autism Dev Disord. 1994;24(5):659–685. doi: 10.1007/BF02172145.
58. Lord C, et al. The autism diagnostic observation schedule-generic: A standard measure of social and communication deficits associated with the spectrum of autism. J Autism Dev Disord. 2000;30(3):205–223.
59. Kircher M, et al. A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet. 2014;46(3):310–315. doi: 10.1038/ng.2892.
60. McKenna A, et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20(9):1297–1303. doi: 10.1101/gr.107524.110.
61. Garrison E, Marth G. Haplotype-based variant detection from short-read sequencing. 2012. arXiv:1207.3907.
62. NHLBI Exome Sequencing Project (ESP). Exome Variant Server. Available at evs.gs.washington.edu/EVS/. Accessed November 11, 2015.
63. Langfelder P, Horvath S. WGCNA: An R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9:559. doi: 10.1186/1471-2105-9-559.
64. Mostafavi S, Ray D, Warde-Farley D, Grouios C, Morris Q. GeneMANIA: A real-time multiple association network integration algorithm for predicting gene function. Genome Biol. 2008;9(Suppl 1):S4. doi: 10.1186/gb-2008-9-s1-s4.
65. Shannon P, et al. Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498–2504. doi: 10.1101/gr.1239303.


[r51] 51.Parikshak NN, et al. Integrative functional genomic analyses implicate specific molecular pathways and circuits in autism. Cell. 2013;155(5):1008–1021. doi: 10.1016/j.cell.2013.10.031. [DOI] [PMC free article] [PubMed] [Google Scholar]

[r52] 52.Stoner R, et al. Patches of disorganization in the neocortex of children with autism. N Engl J Med. 2014;370(13):1209–1219. doi: 10.1056/NEJMoa1307491. [DOI] [PMC free article] [PubMed] [Google Scholar]

[r53] 53.Choi J, Shooshtari P, Samocha KE, Daly MJ, Cotsapas C. Network analysis of genome-wide selective constraint reveals a gene network active in early fetal brain intolerant of mutation. PLoS Genet. 2016;12(6):e1006121. doi: 10.1371/journal.pgen.1006121. [DOI] [PMC free article] [PubMed] [Google Scholar]

[r54] 54.Chang J, Gilman SR, Chiang AH, Sanders SJ, Vitkup D. Genotype to phenotype relationships in autism spectrum disorders. Nat Neurosci. 2015;18(2):191–198. doi: 10.1038/nn.3907. [DOI] [PMC free article] [PubMed] [Google Scholar]

[r55] 55.Chen EY, et al. Enrichr: Interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics. 2013;14:128. doi: 10.1186/1471-2105-14-128. [DOI] [PMC free article] [PubMed] [Google Scholar]

[r56] 56.Flicek P, et al. Ensembl 2014. Nucleic Acids Res. 2014;42(Database issue):D749–D755. doi: 10.1093/nar/gkt1196. [DOI] [PMC free article] [PubMed] [Google Scholar]

[r57] 57.Lord C, Rutter M, Le Couteur A. Autism Diagnostic Interview-Revised: A revised version of a diagnostic interview for caregivers of individuals with possible pervasive developmental disorders. J Autism Dev Disord. 1994;24(5):659–685. doi: 10.1007/BF02172145. [DOI] [PubMed] [Google Scholar]

[r58] 58.Lord C, et al. The autism diagnostic observation schedule-generic: A standard measure of social and communication deficits associated with the spectrum of autism. J Autism Dev Disord. 2000;30(3):205–223. [PubMed] [Google Scholar]

[r59] 59.Kircher M, et al. A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet. 2014;46(3):310–315. doi: 10.1038/ng.2892. [DOI] [PMC free article] [PubMed] [Google Scholar]

[r60] 60.McKenna A, et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20(9):1297–1303. doi: 10.1101/gr.107524.110. [DOI] [PMC free article] [PubMed] [Google Scholar]

[r61] 61.Garrison E, Marth G. 2012. Haplotype-based variant detection from short-read sequencing. arXiv:1207.3907.

[r62] 62. NHLBI Exome Sequencing Project (ESP) Exome Variant Server. Available at evs.gs.washington.edu/EVS/. Accessed November 11, 2015.

[r63] 63.Langfelder P, Horvath S. WGCNA: An R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9:559. doi: 10.1186/1471-2105-9-559. [DOI] [PMC free article] [PubMed] [Google Scholar]

[r64] 64.Mostafavi S, Ray D, Warde-Farley D, Grouios C, Morris Q. GeneMANIA: A real-time multiple association network integration algorithm for predicting gene function. Genome Biol. 2008;9(Suppl 1):S4. doi: 10.1186/gb-2008-9-s1-s4. [DOI] [PMC free article] [PubMed] [Google Scholar]

[r65] 65.Shannon P, et al. Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498–2504. doi: 10.1101/gr.1239303. [DOI] [PMC free article] [PubMed] [Google Scholar]
