Summary
Both mild and severe epilepsies are influenced by variants in the same genes, yet an explanation for the resulting phenotypic variation is unknown. As part of the ongoing Epi25 Collaboration, we performed a whole-exome sequencing analysis of 13,487 epilepsy-affected individuals and 15,678 control individuals. While prior Epi25 studies focused on gene-based collapsing analyses, we asked how the pattern of variation within genes differs by epilepsy type. Specifically, we compared the genetic architectures of severe developmental and epileptic encephalopathies (DEEs) and two generally less severe epilepsies, genetic generalized epilepsy and non-acquired focal epilepsy (NAFE). Our gene-based rare variant collapsing analysis used geographic ancestry-based clustering that included broader ancestries than previously possible and revealed novel associations. Using the missense tolerance ratio (MTR), we found that variants in DEE-affected individuals are in significantly more intolerant genic sub-regions than those in NAFE-affected individuals. Only previously reported pathogenic variants absent in available genomic datasets showed a significant burden in epilepsy-affected individuals compared with control individuals, and the ultra-rare pathogenic variants associated with DEE were located in more intolerant genic sub-regions than variants associated with non-DEE epilepsies. MTR filtering improved the yield of ultra-rare pathogenic variants in affected individuals compared with control individuals. Finally, analysis of variants in genes without a disease association revealed a significant burden of loss-of-function variants in the genes most intolerant to such variation, indicating additional epilepsy-risk genes yet to be discovered. Taken together, our study suggests that genic and sub-genic intolerance are critical characteristics for interpreting the effects of variation in genes that influence epilepsy.
Keywords: epilepsy, epileptic encephalopathy, seizures, whole-exome sequencing, focal epilepsy, generalized epilepsy, intolerance, ClinVar, Epi25, Louvain
Introduction
Epilepsy is a clinical diagnosis in which the individual has an enduring predisposition to seizures. Although the most severe types most commonly begin in childhood with profound impact, epilepsies can begin at any age and have a cumulative incidence approaching 4%.1, 2, 3 While the genetics of the epilepsies are complex, uncovering pathogenic variants can, in some cases, provide opportunities for targeted or precision medicines.4,5 Whole-exome sequencing (WES) case-control studies have led to multiple insights into the epilepsies, such as the contribution of de novo variants in developmental and epileptic encephalopathy (DEE [MIM: 308350]), the role of the GABA pathway in genetic generalized epilepsy (GGE [MIM: 600669]), and the link between non-acquired focal epilepsy (NAFE [MIM: 604364, 245570]) and GATOR1 complex genes.6, 7, 8, 9, 10 DEEs are a severe form of early onset, intractable epilepsy associated with developmental delay.8,11, 12, 13, 14 In contrast, GGE and NAFE, characterized by generalized seizures and focal seizures, respectively, are more common and generally less severe.1,2,15, 16, 17 Yet, exome sequencing has revealed that a set of 43 genes typically associated with DEE also harbor ultra-rare variants in milder epilepsies.7,9
It is unknown how these variants cause such different epilepsy phenotypes despite being drawn from a set of shared genes and even from within the same gene. The likelihood of a gene’s being associated with disease can be predicted in silico, in part, by a given gene’s intolerance to functional variation in the general population.18, 19, 20 Epilepsy-causing variants tend to be rare in the general population and located in the least tolerant genes.7,9,18,21 While genic intolerance may help determine the likelihood of a gene-disease association, it does not clarify the differential impact of variants within the same gene.22 Variants within the same gene may lead to widely different epilepsy phenotypes.23, 24, 25, 26, 27, 28, 29 Predicting the differential effects of two variants within the same gene requires an understanding of sub-genic intolerance because different regions or domains will have varied importance for the protein’s function and may therefore contribute differentially to disease phenotype or severity.22 Consistent with this idea, distributions of disease mutations often cluster in specific genic sub-regions.30 In general, epilepsy variants cluster in the most intolerant genic sub-regions.22,31, 32, 33 The relationship between the severity of epilepsy caused by SCN2A variants and sub-genic intolerance has been explored,32 but a more systematic study of the association of sub-genic intolerance and epilepsy severity has not been undertaken. Given that a single variant may lead to variable phenotypes,34, 35, 36, 37 we do not expect sub-genic intolerance to explain all severity variability, but a deeper investigation will add to our understanding of the complex sequelae of a single variant.
The Epi25 Collaborative (Epi25) is the largest epilepsy exome analysis to date with more than 200 partners from 40 research cohorts contributing exome and phenotype data from more than 19,000 individuals with epilepsy (see web resources). The aspiration of the collaborative is that extensive exome data combined with accurate phenotypic data will allow for well-matched cohorts and clarify genotype-phenotype relationships in epilepsy, and Epi25 analyses have already yielded rich results for rare variants in the epilepsies.9 A dataset of this magnitude and detail allows us to examine the presence of curated variants from a clinical database such as ClinVar.38,39 Similarly, we are able to test for the burden of damaging variants in the ~15,000 genes not yet associated with Mendelian disease to detect the potential for epilepsy-gene discovery. Combining expansive genetic data from Epi25 and recently developed sub-genic intolerance metrics, we show that in a set of genes harboring missense variants in both milder and more severe epilepsies, variants in more severe epilepsies are preferentially located in less tolerant genic sub-regions. Furthermore, only ultra-rare (i.e., not found in a public database) pathogenic/likely pathogenic40 ClinVar variants are increased in our cohort, and our sub-genic intolerance finding is replicated in these ultra-rare variants. Finally, there most likely remain undiscovered epilepsy-associated or epilepsy-risk genes among the genes most intolerant to loss-of-function variation.
Subjects and methods
Study design and participants
As described previously, we collected DNA and detailed phenotyping data on individuals with epilepsy from 40 sites in Europe, North America, Australasia, and Asia (Table S1).9 Here, we analyzed individuals with DEEs (n = 2,007), GGE (also known as idiopathic generalized epilepsy; n = 5,771), and NAFE (n = 7,489), accounting for the first 3 years of enrollment in Epi25. A subset of the data is available on dbGaP: phs001489. Following sample quality control (QC), relatedness testing (see sample and variant QC), and clustering (see clustering), the combined epilepsy analysis included 13,171 affected individuals (1,782, 5,048, and 6,341 individuals with DEE, GGE, and NAFE, respectively) along with 14,100 control individuals (2,048 genomes and 12,052 exomes). In the included clusters in the individual epilepsy analyses, 1,835 individuals with DEE were compared to 13,978 control individuals, 5,303 individuals with GGE were compared to 15,677 control individuals, and 6,349 individuals with NAFE were compared to 15,678 control individuals. Control individuals were aggregated from local collections at the Institute for Genomic Medicine at Columbia University (IGM - Columbia University, New York, NY, USA). Control individuals who passed the same QC and who were not known to have phenotypes overlapping DEE, GGE, or NAFE or be related to a proband with epilepsy were analyzed following geographic ancestry clustering (Figure S1, Table S2).
Phenotyping procedures
As described previously, epilepsies were clinically diagnosed by epileptologists (see below for criteria for DEEs, GGE, and NAFE) in accordance with the International League Against Epilepsy (ILAE) classification at the time of diagnosis and recruitment.2,9 De-identified (non-PHI [protected health information]) phenotyping data were entered into the Epi25 data repository (hosted at the Luxembourg Centre for Systems Biomedicine) via online case record forms based on the REDCap platform. De-identified data for subjects of previous coordinated efforts with phenotyping (e.g., the Epilepsy Phenome/Genome Project41 and the EpiPGX Project, see web resources) that were already entered into a database were accessed and transferred to the new platform. Phenotyping data underwent review for uniformity among sites and QC, and inconsistencies were reviewed by the phenotyping committee.
Epilepsy definitions
Epilepsy diagnoses and classification for Epi25 have been described previously.9 Briefly, DEE diagnosis required severe refractory epilepsy of unknown etiology with developmental plateau or regression and epileptiform features on electroencephalogram (EEG). Exclusion criteria included epileptogenic lesions on MRI. GGE diagnosis required a history of generalized seizure types with generalized epileptiform discharges on EEG. Exclusion criteria included focal seizures, moderate-to-severe intellectual disability, and epileptogenic lesions found on neuroimaging (when available). Diagnosis of NAFE required a history of focal seizures with either focal epileptiform discharges or normal findings on EEG. Exclusion criteria included neuroimaging lesions (except hippocampal sclerosis), a history of generalized seizures, and moderate-to-severe intellectual disability.
Informed consent
Adult subjects or the legal guardians of enrolled children signed informed consent at participating centers per the local ethical requirements at the time of enrollment.9 To be included in the study, the consent could not exclude data sharing. Consent forms for samples collected after January 25, 2015 required specific language according to the National Institutes of Health’s Genomic Data Sharing Policy (see web resources). For control individuals, protocols were approved by Columbia University’s institutional review board and participants provided informed consent for the use of DNA in genetic research.
Next-generation sequencing data generation
All Epi25 samples were sequenced at the Broad Institute of Harvard and the Massachusetts Institute of Technology (MIT) on the Illumina HiSeq X platform with the use of 151 bp paired-end reads. Exome capture was performed with Illumina Nextera Rapid Capture or TruSeq Rapid Exome enrichment kit (target size 38 Mb). FastQ files were transferred to the IGM.
Next-generation sequencing of control individuals was performed at the IGM, or the data were transferred to the IGM, and comprised a mixture of whole-genome sequencing and whole-exome sequencing. Exomes were captured with multiple capture kits and sequenced according to standard protocols on Illumina’s HiSeq 2000, HiSeq 2500, and NovaSeq 6000 (Illumina, San Diego, CA, USA) platforms with 150 bp paired-end reads. Genomes were sequenced according to standard protocols on Illumina’s HiSeq 2000, HiSeq 2500, and NovaSeq 6000 (Illumina, San Diego, CA, USA) platforms.
Variant calling
Both affected individuals and control individuals were processed with the same IGM bioinformatic pipeline for variant calling. Reads were aligned to human reference GRCh37 via DRAGEN (Edico Genome, San Diego, CA, USA)42 and duplicates were marked with Picard (Broad Institute, Boston, MA, USA). Variants were called according to the Genome Analysis Toolkit (GATK - Broad Institute, Boston, MA, USA) Best Practices recommendations v.3.6.43,44 Finally, variants were annotated with ClinEff45 and custom annotations, including Genome Aggregation Database (gnomAD) v.2.1 frequencies,20 regional-intolerance metrics,31,32 in silico filters,46 and ClinVar (as of 10/20/2020)38,39 clinical annotation, were added via the IGM’s in-house analysis tool for annotated variants (ATAV) platform.47
Sample and variant QC
Only samples with at least 90% of the consensus coding sequence (CCDS release 20)48 covered at a minimum of 10×, ≤2% contamination levels according to VerifyBamID,49 and single nucleotide variants (SNVs) and indels overlapping the Single Nucleotide Polymorphism database (dbSNP)50 at least 85% and 80%, respectively, were included. We removed samples with a discordance between self-declared and sequence-derived gender to prevent phenotype-genotype mismatch. We used kinship-based inference for GWAS (KING) to detect related individuals and removed one of each pair that had an inferred relationship of second-degree or closer while favoring the inclusion of affected individuals over control individuals and well covered over poorly covered.51
We restricted analyses to variants within the CCDS inclusive of two base intronic extensions to accommodate canonical splice variants. All variants had to fulfill the following criteria to be included: (1) at least 10× coverage of the site, (2) quality score (QUAL) ≥ 50, (3) genotype quality score (GQ) ≥ 20, (4) quality by depth score (QD) ≥ 5, (5) mapping quality score (MQ) ≥ 40, (6) read position rank sum score (RPRS) ≥ −3, (7) mapping quality rank sum score (MQRS) ≥ −10, (8) Fisher’s strand bias score (FS) ≤ 60 (SNVs) or ≤ 200 (indels), (9) strand odds ratio (SOR) ≤ 3 (SNVs) or ≤ 10 (indels), (10) GATK Variant Quality Score Recalibration filter “PASS,” and (11) alternate allele fraction for heterozygous calls ≥ 0.3. Known sequencing artifacts as described previously52 as well as low-quality variants per Exome Aggregation Consortium,53 gnomAD,20 or the Exome Variant Server were excluded (see web resources).
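The eleven variant-level criteria above can be read as a single predicate. The following Python snippet is a minimal sketch of that predicate, not the IGM pipeline itself; the `VariantCall` record and its field names are hypothetical and stand in for whatever annotations the actual call set carries.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical per-genotype record; field names are illustrative only."""
    is_indel: bool
    coverage: int
    qual: float
    gq: float
    qd: float
    mq: float
    rprs: float          # read position rank sum
    mqrs: float          # mapping quality rank sum
    fs: float            # Fisher's strand bias
    sor: float           # strand odds ratio
    vqsr_filter: str
    is_het: bool
    alt_fraction: float  # alternate allele fraction


def passes_variant_qc(v: VariantCall) -> bool:
    """Apply criteria (1) to (11) as listed in the text."""
    fs_max = 200 if v.is_indel else 60
    sor_max = 10 if v.is_indel else 3
    return (
        v.coverage >= 10
        and v.qual >= 50
        and v.gq >= 20
        and v.qd >= 5
        and v.mq >= 40
        and v.rprs >= -3
        and v.mqrs >= -10
        and v.fs <= fs_max
        and v.sor <= sor_max
        and v.vqsr_filter == "PASS"
        # The allele-fraction threshold only applies to heterozygous calls.
        and (not v.is_het or v.alt_fraction >= 0.3)
    )


example_snv = VariantCall(
    is_indel=False, coverage=30, qual=100.0, gq=60.0, qd=12.0, mq=60.0,
    rprs=0.0, mqrs=0.0, fs=5.0, sor=1.0, vqsr_filter="PASS",
    is_het=True, alt_fraction=0.45,
)
strand_biased_snv = replace(example_snv, fs=100.0)
```

Here `example_snv` passes every criterion, while `strand_biased_snv` fails only the SNV strand-bias limit; the same FS value would be acceptable for an indel.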
Clustering
As previously described by Cameron-Christie and colleagues,54 we performed principal-component analysis (PCA) for dimensionality reduction on a set of pre-defined variants to capture population structure. We applied the Louvain method of community detection with the first six principal components (PCs) as input to identify clusters within the data that reflect the geographic ancestry of the samples as previously described.55,56 To check the quality of the clusters, we performed further dimensionality reduction by using the Uniform Manifold Approximation and Projection (UMAP)57 on the first six PCs (Figures S1A–S1C) to disentangle geographic ancestry, which is then reflected in the cluster membership.58,59 A neural-network pre-trained on samples with known geographic ancestry generated probability estimates for each of six groups (European, African, Latino, East Asian, South Asian, and Middle Eastern). We used a 95% probability cutoff to assign a geographic ancestry label to each sample. Samples that did not reach 95% for any of the ancestry groups were labeled “admixed” (Figure S1).
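The labeling rule in the last two sentences is simple enough to state as code. This is a minimal sketch that assumes the classifier output is available as a mapping from ancestry group to probability; the function name and input layout are hypothetical, and the neural network itself is not reproduced.

```python
ANCESTRY_GROUPS = (
    "European", "African", "Latino", "East Asian", "South Asian", "Middle Eastern",
)


def assign_ancestry_label(probabilities, cutoff=0.95):
    """Return the group whose probability reaches the cutoff, else "admixed".

    `probabilities` maps each ancestry group to the classifier's estimate
    for one sample (a hypothetical input format).
    """
    group, probability = max(probabilities.items(), key=lambda item: item[1])
    return group if probability >= cutoff else "admixed"
```

Because the probabilities for one sample sum to at most 1, no more than one group can reach a 95% cutoff, so taking the maximum is sufficient.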
Clustering was performed on the combined epilepsies as previously described.56 Clusters containing at least 20 affected individuals in each epilepsy type (DEE, GGE, and NAFE) and 20 controls were kept (Figure S1C, Table S3). Each epilepsy type/control group separately underwent clustering again to optimize ancestry matching for each epilepsy type (Figures S1D–S1L). The individual epilepsy clustering was used for individual epilepsy quantile-quantile plots (see quantile-quantile plots and genomic inflation factor λ, Figure 1), the analysis of common enrichment among DEE genes (Figure 2), and associated supplementary figures and tables. The combined epilepsy clustering was used for the combined epilepsy collapsing analysis, sub-genic comparisons, and ClinVar pathogenic/likely pathogenic analyses (Figures 3 and 4, control data in Figure 5) and associated supplementary figures and tables. The individual epilepsy clustering was also used to demonstrate potential for gene discovery (Figure 6) with associated supplementary figures and tables. All clusters underwent coverage harmonization (see coverage harmonization).
Coverage harmonization
As described previously,52 coverage differences between affected individuals and control individuals introduce a bias because no variants can be called without sufficient coverage. To reduce the influence of coverage differences caused by different capture kits or sequencing depth in general, we used a site-based pruning approach and removed sites where the absolute difference in percentages of affected individuals compared to control individuals with at least 10× coverage was greater than 7.0%. Each cluster (see clustering) underwent independent coverage harmonization. This resulted in four sets of coverage maps (Figure S1).
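The site-based pruning rule can be sketched as follows. This is an illustration under stated assumptions rather than the ATAV implementation: the inputs are hypothetical dictionaries mapping a site identifier to the number of individuals covered at 10× or more, and the function is meant to be called once per cluster.

```python
def harmonize_sites(case_covered, ctrl_covered, n_cases, n_ctrls, max_diff_pct=7.0):
    """Keep sites whose case/control coverage rates differ by at most max_diff_pct.

    case_covered / ctrl_covered: site -> count of individuals with >= 10x
    coverage at that site (hypothetical input layout).
    """
    kept = []
    for site in sorted(set(case_covered) | set(ctrl_covered)):
        case_pct = 100.0 * case_covered.get(site, 0) / n_cases
        ctrl_pct = 100.0 * ctrl_covered.get(site, 0) / n_ctrls
        # Sites with an absolute difference greater than the threshold are removed.
        if abs(case_pct - ctrl_pct) <= max_diff_pct:
            kept.append(site)
    return kept
```

For example, a site covered in 95% of affected individuals and 90% of control individuals is retained, whereas one covered in 95% and 80%, respectively, is pruned.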
Qualifying variant
In the context of collapsing analyses, qualifying variants have been defined in order to identify a set of variants that are enriched for real variant calls and variants with strong functional effects.60 Here, we defined a qualifying variant (QV) as a variant passing both QC filters (see sample and variant QC) and model-specific filters (Table S4), such as variant effect filters, pathogenicity predictors, and internal and external minor allele frequency (MAF) filters. Variants could be drawn from three pools: (1) variants from Epi25 data and matched controls blinded to ClinVar status, (2) variants from Epi25 data and matched controls designated pathogenic/likely pathogenic (P/LP) in ClinVar as of 10/20/2020, or (3) all published P/LP ClinVar variants as of 10/20/2020. For analyses of variants in Epi25 data and matched controls blinded to ClinVar status (1) (Figures 1, 2, and 3, control data in Figures 5 and 6, Table 1), we applied the following filtering in addition to the variant QC filtering (see sample and variant QC): (1) all variants are “ultra-rare,” meaning they are not found in any non-neuro gnomAD population; (2) we filtered all protein-truncating variants (PTVs) with loss-of-function transcript effect estimator (LOFTEE) to remove likely false-positive PTVs;20 (3) we removed all variants located in regions with highly repetitive elements to reduce false-positive variants;61 (4) we removed all variants in regions with a proportion expression across transcripts (pext) value less than 1/10 the maximum pext value for that gene because they are unlikely to affect translated mRNA;62 and (5) we excluded variants with an internal allele frequency greater than 0.05% applied to the combined case-control call set by cluster excluding one allele to allow for clusters in which one allele might exceed that allele frequency threshold.62 PTV effects included stop gain, frameshift, splice acceptor, and splice donor variants.
Table 1.
GGE-associated gene | Number of GGE-affected individuals in Epi25 | GGE p value | NAFE-associated gene | Number of NAFE-affected individuals in Epi25 | NAFE p value |
---|---|---|---|---|---|
NLGN2 | 3 | 8.6 × 10⁻³ | WDR18 | 4 | 0.01 |
HDLBP | 4 | 8.9 × 10⁻³ | SOCS7 | 5 | 0.01 |
RC3H2 | 4 | 0.01 | TRIM9 | 3 | 0.05 |
XPO5 | 3 | 0.02 | ENAH | 2 | 0.05 |
Genes listed are among the most intolerant decile to loss-of-function variation and harbor protein-truncating variants (PTVs) in more than one epilepsy-affected individual but harbor no PTVs in control individuals. Only the top four gene associations are shown per epilepsy. Full tables can be found in the supplemental information (Tables S37 and S38). p values drawn from ultra-rare protein-truncating variants collapsing analysis (Figure S9, Tables S19 and S20).
For P/LP variants found in Epi25 and matched controls (Figure 4) (2) and all published P/LP variants (Figure 5, non-control data) (3), no universal filtering was applied beyond variant QC. ClinVar variants could additionally be filtered by ClinVar “review status,” which attempts to capture the level of review supporting the assertion of clinical significance for the variant with increasing number of “gold stars” from 0 to 4.63, 64, 65
In addition to the filtering applied above, we defined the following categories of missense variants to be utilized in the study. For “damaging” missense variants, REVEL46 filter ≥ 0.5 (when defined) was applied. For “intolerant” missense variants, a missense tolerance ratio (MTR) filter ≤ 0.78 (when defined), which represents a variant in the most intolerant quartile of all regions in the exome to missense variation, was applied (see web resources).32 To further enhance missense variants for those located in intolerant genic sub-regions, we utilized a separate model in which we added an exon-based localized intolerance model using Bayesian regression (LIMBR) percentile < 25. LIMBR is a sub-genic intolerance score previously shown to enhance selection for missense variants associated with DEEs.31
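The three missense filters can be expressed as small predicates. The sketch below is illustrative only. It assumes that "when defined" means a variant without a REVEL or MTR score is not excluded by that filter, which is one reading of the text, and that a variant without a LIMBR percentile does not qualify for the LIMBR model; both choices are assumptions, as are the function names.

```python
def is_damaging(revel):
    """REVEL >= 0.5; an undefined score (None) is assumed to pass the filter."""
    return revel is None or revel >= 0.5


def is_intolerant(mtr):
    """MTR <= 0.78; an undefined score (None) is assumed to pass the filter."""
    return mtr is None or mtr <= 0.78


def in_intolerant_exon(limbr_exon_percentile):
    """Exon-based LIMBR percentile < 25; undefined is assumed not to qualify."""
    return limbr_exon_percentile is not None and limbr_exon_percentile < 25
```

How these predicates combine within each collapsing model is specified in Table S4 and is not reproduced here.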
Gene-based collapsing
As described previously,7,52,56 we performed gene-based collapsing to test whether there is a significant enrichment of affected individuals harboring a QV in a given gene compared to controls. For each gene within each cluster, we assigned an indicator variable (1/0 states) to each individual on the basis of the presence of at least one qualifying variant in the gene (state 1) or no qualifying variants in that gene (state 0) to create a gene-by-subject matrix for each cluster. From the collapsing matrices of the individual clusters, we extracted the number of affected individuals/control individuals with and without a QV per gene and used the exact two-sided Cochran-Mantel-Haenszel (CMH) test66,67 to test for an association between disease status and QV status (Table S4) while controlling for cluster membership. Finally, we created quantile-quantile (QQ) plots (described below). We defined a study-wide Bonferroni multiplicity-adjusted significance threshold of p < 1.6 × 10⁻⁷ (0.05 / [18,650 CCDS genes × 17 non-synonymous models]).
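To make the collapsing step concrete, the sketch below derives the per-cluster 2 × 2 counts for one gene, combines clusters with the Mantel-Haenszel common odds ratio, and reproduces the Bonferroni arithmetic. It is a simplified illustration: the exact two-sided CMH p value used in the study is not implemented, and the input layout (one `(is_case, genes_with_qv)` pair per individual) is a hypothetical stand-in for the gene-by-subject matrix.

```python
def count_table(samples, gene):
    """Return (case_qv, case_no_qv, ctrl_qv, ctrl_no_qv) for one cluster.

    samples: iterable of (is_case, genes_with_qv) pairs, where genes_with_qv
    is the set of genes in which the individual carries at least one QV.
    """
    case_qv = case_no_qv = ctrl_qv = ctrl_no_qv = 0
    for is_case, genes_with_qv in samples:
        has_qv = gene in genes_with_qv  # the 1/0 indicator state
        if is_case and has_qv:
            case_qv += 1
        elif is_case:
            case_no_qv += 1
        elif has_qv:
            ctrl_qv += 1
        else:
            ctrl_no_qv += 1
    return case_qv, case_no_qv, ctrl_qv, ctrl_no_qv


def mantel_haenszel_odds_ratio(tables):
    """Common odds ratio across clusters: sum(a*d/n) / sum(b*c/n)."""
    numerator = denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator if denominator > 0 else float("inf")


# Study-wide threshold from the text: 0.05 / (18,650 genes x 17 models).
BONFERRONI_THRESHOLD = 0.05 / (18_650 * 17)
```

Stratifying by cluster in this way means that each affected individual is only ever compared with ancestry-matched control individuals, which is the purpose of the cluster-aware test.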
The synonymous model was used as a putatively negative control (Figures S2 and S3A, Tables S4, S6–S8, and S16). Additional details for the 17 non-synonymous models can be found in Table S4. The top 200 ranked genes for each analysis can be found in the supplemental tables (Tables S6–S26). The membership of each gene in the following gene sets is also indicated: (D) 43 dominant genes associated with DEE in the Online Mendelian Inheritance in Man (OMIM, see web resources) (see gene set enrichment testing), (P) 101 dominant genes with epilepsy or related terms in their OMIM phenotypes, (L) the 1,920 genes most intolerant to loss-of-function variation in the general population (see gene set enrichment testing), top 200 ranked genes in prior Epi25 DEE (D25), GGE (G25), or NAFE (N25) association analyses,9 or top 300 ranked genes in prior GGE (G4K) or NAFE (N4K) Epi4K association analyses.7 Epi4K was a large WES epilepsy project completed prior to Epi25.
Quantile-quantile plots and genomic inflation factor λ
We generated quantile-quantile (QQ) plots with empirical (permutation-based) expected probability distributions by using a previously described method.7,52 For each collapsing model and cluster, the original case and control labels were randomly permuted, while the rest of the gene-by-sample matrix was kept fixed. For each cluster, we extracted the number of newly sampled cases/controls with and without a QV per gene and used the CMH test to test for an association between case/control status (see gene-based collapsing) and QV status (see qualifying variant) while controlling for cluster membership. This process was repeated 1,000 times, and for each permutation, the p values were ordered. The mean of each rank-ordered estimate across the 1,000 permutations (i.e., the average 1st order statistic, the average 2nd order statistic, etc.) represents the empirical estimates of the expected ordered p values. We plotted the negative logarithm of the permutation-based expected distribution relative to the observed ordered statistic to get permutation-based QQ plots. We also used the permutation-based expected p values to estimate the genomic inflation factor λ on the basis of the regression method as described previously.7,52 Genes labeled in black are known epilepsy-associated genes on the basis of manual review of the literature, while genes labeled in color are candidate epilepsy-associated genes.
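The permutation procedure for the expected p values can be sketched as below. This is a schematic outline, not the published code: `pvalues_for_labels` is a hypothetical callback that would rerun the per-gene CMH tests for a given labeling and return one p value per gene, and labels are shuffled within each cluster so that cluster sizes and case counts are preserved.

```python
import random


def permute_within_clusters(labels_by_cluster, rng):
    """Shuffle case/control labels separately inside each cluster."""
    permuted = {}
    for cluster, labels in labels_by_cluster.items():
        shuffled = list(labels)
        rng.shuffle(shuffled)
        permuted[cluster] = shuffled
    return permuted


def expected_ordered_pvalues(labels_by_cluster, pvalues_for_labels, n_perm=1000, seed=0):
    """Average the k-th smallest p value over permutations, for every rank k."""
    rng = random.Random(seed)
    rank_sums = None
    for _ in range(n_perm):
        permuted = permute_within_clusters(labels_by_cluster, rng)
        ordered = sorted(pvalues_for_labels(permuted))
        if rank_sums is None:
            rank_sums = [0.0] * len(ordered)
        for rank, p in enumerate(ordered):
            rank_sums[rank] += p
    return [total / n_perm for total in rank_sums]
```

Plotting the negative logarithm of these averaged order statistics against the negative logarithm of the observed ordered p values gives the permutation-based QQ plot described above.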
Gene set enrichment testing
As described previously,7 biologically informed gene sets can reveal important pathways or gene characteristics by aggregated signal across related genes (Table S5). We utilized the following gene sets (GS-1 to GS-6) informed by their OMIM disease associations, inheritance patterns, and genic intolerance.
(GS-1) 43 established dominant (e.g., autosomal dominant or X-linked dominant) DEE-associated genes drawn from OMIM Phenotypic Series PS308350 and PS617711 on 10/9/2020.
(GS-2) 24 genes drawn from the 43 genes in GS-1 in which all three epilepsies have a damaging missense variant.
(GS-3) 101 established dominant genes associated with OMIM phenotypes containing epilepsy and epilepsy related terms on 02/16/2021.
(GS-4) 14 genes harboring ultra-rare missense variants associated with both DEE and with epilepsy but not DEE in ClinVar (SZT2, SCN2A, SCN1A, HCN1, GABRA1, GABRG2, KCNQ3, SPTAN1, KCNT1, GRIN2B, GABRB3, CHD2, TBC1D24, and KCNQ2) as of 10/20/2020.
(GS-5) 10 gene sets comprising the genes without a confirmed disease phenotype in OMIM on 02/16/2021 (18,852 CCDS genes – 3,964 genes = 14,888 genes), grouped by their loss-of-function observed/expected upper bound fraction (LOEUF) decile.20 LOEUF is the 90% upper bound of the confidence interval of the observed/expected ratio of predicted loss-of-function variants in gnomAD and can be used to bin genes into deciles of approximately 1,920 genes each.
(GS-6) 10 gene sets comprising the genes without a confirmed phenotype in OMIM on 02/16/2021 (18,852 CCDS genes – 3,964 genes = 14,888 genes), grouped by their missense Z score decile.19,20,68 Missense Z score captures the number of observed missense variants in a gene compared to the expected number of missense variants in the general population. The score was used to bin genes into deciles of approximately 1,920 genes each.
For a gene set analysis, we extracted the number of affected individuals/control individuals with and without at least one QV among any of the genes in the gene set and used the exact two-sided CMH test66,67 to test for an association between disease status and QV status while controlling for cluster membership. To examine association with LOEUF deciles (Figure 6), we only used control individuals without a disease association in our database (“controls” and “healthy family members”) (Table S2). We used a false discovery rate (FDR) correction for multiple comparisons. We performed 123 CMH tests to determine odds ratios for gene set enrichment testing and defined a significant enrichment at FDR < 0.05. For forest plots, odds ratios and p values were displayed for associations with an unadjusted p value < 0.05.
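The text does not name the FDR procedure that was used. As one common choice, the sketch below implements the Benjamini-Hochberg step-up adjustment; treating it as the method applied here is an assumption made purely for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

Under this procedure, a gene set would be called significantly enriched when its adjusted p value across the 123 CMH tests falls below 0.05.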
Sub-genic intolerance comparison
We examined sub-genic intolerance scores (MTR) in multiple ways. We compared the raw MTR and MTR domain percentile scores across epilepsy-affected and control individuals directly by using the Kruskal-Wallis test by rank. For groups with p value < 0.05, we performed pairwise comparisons by using the Wilcoxon rank-sum test. This method may not be an adequate comparison because, despite enriching for damaging missense variants with REVEL, control individuals with qualifying variants (which are unlikely to be true positives) remain, indicating that some of the qualifying variants found in affected individuals may also be benign. Direct comparison of sub-genic intolerance scores among epilepsies is therefore difficult to interpret because the QV burden is different among epilepsies (see results) and the true positive rate among these QVs is unknown.
To compare MTR among epilepsies, it was necessary to estimate and compare the “true positive” distribution of scores for each epilepsy. To achieve this, we created a weighted average of the cumulative distribution function (CDF) of MTR scores for ultra-rare damaging missense variants in each epilepsy (CDF_DEE, CDF_GGE, and CDF_NAFE) and the CDF of ultra-rare damaging missense variants in our controls (CDF_CTRL) to obtain the “true positive” CDF for each epilepsy (CDF_DEE_TP, CDF_GGE_TP, and CDF_NAFE_TP). Only damaging missense variants with defined MTR scores were considered.
At a given MTR value, the “true positive” CDF is a weighted average of the epilepsy and control CDF with the weights determined by the QV rate of the control population at that MTR value. For example, if at an MTR score of 0.5, 4% of DEE-affected individuals have an ultra-rare damaging missense variant and 1% of control individuals have an ultra-rare damaging missense variant, then CDF_DEE_TP(0.5) = 0.75 × CDF_DEE(0.5) + 0.25 × CDF_CTRL(0.5).69 We then used a Kolmogorov–Smirnov test (statistic D) to compare the distribution of “true positive” MTR CDFs of each epilepsy pair. Given that we did not know the distribution of D, we performed a permutation test with 10,000 permutations for each comparison. We assessed significance at p < 0.05.
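The worked example can be reproduced with a few lines of Python. This sketch assumes that the control weight is the ratio of the control QV rate to the epilepsy QV rate (1% / 4% = 0.25 in the example), which matches the stated numbers but is an inference rather than a definition given in the text; the function names are hypothetical, and the 10,000-permutation significance test is not shown.

```python
from bisect import bisect_right


def ecdf(sorted_scores, x):
    """Empirical CDF of a sorted list of MTR scores evaluated at x."""
    return bisect_right(sorted_scores, x) / len(sorted_scores)


def true_positive_cdf(case_cdf, ctrl_cdf, case_rate, ctrl_rate):
    """Weighted average of the epilepsy and control CDF values at one MTR value."""
    ctrl_weight = ctrl_rate / case_rate  # assumed weighting, e.g. 0.01 / 0.04 = 0.25
    return (1.0 - ctrl_weight) * case_cdf + ctrl_weight * ctrl_cdf


def ks_statistic(cdf_a, cdf_b):
    """Kolmogorov-Smirnov D: largest gap between two CDFs on a shared MTR grid."""
    return max(abs(a - b) for a, b in zip(cdf_a, cdf_b))
```

With `case_rate=0.04` and `ctrl_rate=0.01`, `true_positive_cdf` applies the 0.75 and 0.25 weights from the example above.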
To compare sub-genic intolerance scores by gene, we compared the “true positive” mean MTR by gene for DEE compared to NAFE and compared to GGE. In a given gene, the “true positive” mean MTR is a weighted average of the epilepsy mean MTR and control mean MTR scores with the weights determined by the QV rate of the control population in that gene. For example, if in gene X, 4% of DEE-affected individuals have an ultra-rare damaging missense variant and 1% of control individuals have an ultra-rare damaging missense variant, then Mean_DEE_TP(X) = 0.75 × Mean_DEE(X) + 0.25 × Mean_CTRL(X). For those genes with no control variants, the means were calculated without weighting. We measured the number of genes where DEE had a lower weighted mean MTR and measured significance with a binomial test with the null hypothesis that DEE variants had a lower Mean_TP in half of the genes in the tested gene set.
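The per-gene comparison can be sketched in the same way. As before, the control weight is assumed to be the ratio of the control to the epilepsy QV rate, and the binomial test is shown as an exact two-sided test against a probability of 0.5; the sidedness is not stated in the text and is an assumption here, as are the function names.

```python
from math import comb


def true_positive_mean(case_mean, ctrl_mean, case_rate, ctrl_rate):
    """Weighted mean MTR for one gene; unweighted when no control variants exist."""
    if ctrl_mean is None or ctrl_rate == 0:
        return case_mean
    ctrl_weight = ctrl_rate / case_rate  # assumed weighting, as in the example
    return (1.0 - ctrl_weight) * case_mean + ctrl_weight * ctrl_mean


def binomial_two_sided_p(successes, trials):
    """Exact two-sided binomial p value under the null probability 0.5."""
    probabilities = [comb(trials, k) * 0.5 ** trials for k in range(trials + 1)]
    observed = probabilities[successes]
    # Sum the probabilities of all outcomes at least as unlikely as the observed one.
    total = sum(p for p in probabilities if p <= observed * (1 + 1e-7))
    return min(1.0, total)
```

Here `successes` would be the number of genes in which DEE has the lower weighted mean MTR and `trials` the number of genes in the tested gene set.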
To compare the MTR values of published ClinVar variants (i.e., not drawn from our affected individuals or control individuals), we divided the variants into those associated with DEE and non-DEE epilepsy. ClinVar variants with phenotypes containing “epilepsy” or “epileptic” were considered associated with epilepsy. Those with phenotypes containing “West,” “Dravet,” “Lennox-Gastaut,” “infantile spasm,” “Ohtahara,” “myoclonic,” or “glut 1” were considered associated with DEE, while the remainder were classified as non-DEE epilepsy. There was an inadequate number of variants specifically associated with GGE and NAFE to further sub-divide them. For variants with multiple clinical associations, the most severe association was assigned. We looked at only ultra-rare variants with a defined MTR value. We limited our analysis to only those genes harboring variants in both epilepsy groups (see gene set enrichment testing). The control variant set was drawn from the combined epilepsy analysis (Figures S1A–S1C). We used a two-sample Wilcoxon test to assess significance. We measured the number of genes where DEE had a lower mean MTR and measured significance with a binomial test with the null hypothesis that DEE variants had a lower mean MTR in half of the genes in the tested gene set.
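The keyword-based grouping of ClinVar phenotypes can be sketched as a small classifier. Several details are assumptions made for illustration: matching is done as a case-insensitive substring search, a DEE term alone is treated as sufficient for a DEE label, and the function name and return values are hypothetical.

```python
DEE_TERMS = (
    "West", "Dravet", "Lennox-Gastaut", "infantile spasm",
    "Ohtahara", "myoclonic", "glut 1",
)
EPILEPSY_TERMS = ("epilepsy", "epileptic")


def classify_clinvar_phenotypes(phenotypes):
    """Return "DEE", "non-DEE epilepsy", or None for a variant's phenotype list.

    DEE is checked first so that a variant with several clinical
    associations receives the most severe one.
    """
    lowered = [phenotype.lower() for phenotype in phenotypes]
    if any(term.lower() in phenotype for phenotype in lowered for term in DEE_TERMS):
        return "DEE"
    if any(term in phenotype for phenotype in lowered for term in EPILEPSY_TERMS):
        return "non-DEE epilepsy"
    return None
```

A plain substring search can over-match (for instance, "west" inside an unrelated word), so a production version would likely need word-boundary matching or manual review.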
Lollipop and MTR plots
Lollipop mutation diagrams were generated for the 24 genes analyzed for the sub-genic intolerance comparison (GS-2) via lollipops-v.1.5.3.70 All 614 missense variants (DEE = 100, GGE = 133, NAFE = 153, and control = 228) were displayed across the linear gene structure of the associated gene. For each gene, the MTR distribution with missense variant locations plotted was juxtaposed against the lollipop mutation diagram. MTR data were downloaded from the MTR-Viewer website (see web resources).71
Comparison of evolutionarily constrained regions
Evolutionary constraint for missense variants was assessed at three levels. For base-level scores, we used the GERP++ “rejected substitution” (RS) score in which higher scores correspond to greater constraint.72,73 For exonic and domain constraint, we used exonic and domain subGERP scores, respectively.22 We compared scores across epilepsies and controls directly by using the Kruskal-Wallis test by rank. No group reached statistical significance (p value < 0.05), so no pairwise comparisons were performed.
Candidate non-OMIM epilepsy genes
To ascertain additional potential epilepsy-gene associations not found in OMIM, we highlighted genes that (1) are in the most intolerant decile to loss-of-function (LOF) variation in the general population by LOEUF rank, (2) are not associated with a disease in OMIM, (3) harbor PTVs with LOFTEE filtering in more than one affected individual, and (4) harbor no control PTVs with LOFTEE filtering.
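The four criteria can be expressed as a simple filter. The sketch below is illustrative only: the record type, field names, and gene labels are hypothetical, and decile 1 is assumed to denote the most LOF-intolerant decile by LOEUF rank.

```python
from dataclasses import dataclass


@dataclass
class GeneSummary:
    """Hypothetical per-gene summary; field names are illustrative."""
    symbol: str
    loeuf_decile: int        # assumption: 1 = most LOF-intolerant decile
    in_omim: bool            # gene already has a disease association in OMIM
    case_ptv_carriers: int   # affected individuals with a LOFTEE-filtered PTV
    ctrl_ptv_carriers: int   # control individuals with a LOFTEE-filtered PTV


def is_candidate(gene):
    """Apply criteria (1) through (4) from the text."""
    return (gene.loeuf_decile == 1
            and not gene.in_omim
            and gene.case_ptv_carriers > 1
            and gene.ctrl_ptv_carriers == 0)


# Made-up records, not study data.
genes = [
    GeneSummary("GENE_A", 1, False, 3, 0),  # passes all four criteria
    GeneSummary("GENE_B", 1, True, 4, 0),   # fails (2): already in OMIM
    GeneSummary("GENE_C", 1, False, 1, 0),  # fails (3): only one case carrier
    GeneSummary("GENE_D", 2, False, 5, 0),  # fails (1): not the top decile
    GeneSummary("GENE_E", 1, False, 2, 1),  # fails (4): a control carrier
]
print([g.symbol for g in genes if is_candidate(g)])  # ['GENE_A']
```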
Data analysis and display
Unless otherwise noted in the methods, data analysis and visualization were performed with R (v.3.6.0).74 Notches in boxplots extend 1.58 × interquartile range / sqrt(n) on either side of the median, which approximates the 95% confidence interval of the median.75
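For readers who want to reproduce the notch arithmetic outside R, a minimal Python sketch follows. The function name and data are hypothetical. Quartiles are taken with the "inclusive" method of `statistics.quantiles`, whereas R's boxplot uses Tukey hinges, so the limits may differ slightly from R's for some sample sizes.

```python
import math
import statistics


def notch_limits(values):
    """Return (lower, upper) notch limits: median -/+ 1.58 * IQR / sqrt(n)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = 1.58 * (q3 - q1) / math.sqrt(len(values))
    return median - half_width, median + half_width


# Toy data: median = 5, IQR = 4, n = 9, so the half-width is 1.58 * 4 / 3.
lower, upper = notch_limits([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(round(lower, 3), round(upper, 3))
```

Non-overlapping notches between two boxes are conventionally read as rough evidence that the medians differ.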
Results
Gene-based collapsing in three types of epilepsies
The results of the gene-based collapsing should be viewed through the lens of prior rare-variant association analyses of epilepsy data and, specifically, Epi25 data. The data in this analysis are a superset of the data used in prior Epi25 analyses.9 The cluster-based collapsing analysis allows for the inclusion of multiple ancestries because each geographic ancestry-matched cluster is analyzed separately (Figure S1). The results are then combined with the CMH test (see gene-based collapsing) accounting for population sub-structure.56 The sample size increased in all three epilepsies (1,835 from 1,021 DEE-affected individuals, 5,303 from 3,108 GGE-affected individuals, and 6,349 from 3,597 NAFE-affected individuals) because of increased enrollment in Epi25 and the inclusion of affected individuals with non-European geographic ancestry. Other differences include a different control set and different in silico methods of indicating QV status. We ran gene-based collapsing (Tables S6–S26) for gene-discovery counting PTVs and damaging missense variants for all three epilepsies (Figure 1, Tables S9, S12, and S14) and all epilepsies combined (Figure S3B, Table S17). There was expected overlap among the top ranked genes from prior Epi25 analyses as well as the suggestion of candidate genes not previously associated with epilepsy (Tables S11–S26).
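To make the pooling step concrete, the sketch below computes the standard Mantel-Haenszel pooled odds ratio across cluster-level 2 × 2 tables. It illustrates the estimator only, not the CMH test statistic or p value, and it is not the authors' implementation. The function name and counts are hypothetical.

```python
import math


def mantel_haenszel_or(strata):
    """Pooled odds ratio over strata of 2x2 tables.

    Each stratum is (case_qv, case_no_qv, ctrl_qv, ctrl_no_qv): counts of
    affected and control individuals with and without a qualifying variant.
    """
    numerator = 0.0
    denominator = 0.0
    for case_qv, case_no_qv, ctrl_qv, ctrl_no_qv in strata:
        n = case_qv + case_no_qv + ctrl_qv + ctrl_no_qv
        if n == 0:
            continue  # an empty cluster contributes nothing
        numerator += case_qv * ctrl_no_qv / n
        denominator += case_no_qv * ctrl_qv / n
    if denominator == 0:
        # No control carriers in any cluster: the estimate is unbounded.
        return math.inf if numerator > 0 else math.nan
    return numerator / denominator


# Two made-up ancestry clusters, not study data.
clusters = [(10, 90, 5, 95), (4, 46, 2, 48)]
print(round(mantel_haenszel_or(clusters), 2))
```

Because each cluster contributes its own case and control comparison, differences in variant rates between clusters do not by themselves inflate the pooled estimate.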
In the DEE collapsing analysis (Figure 1A, Table S9), the top two ranked genes were the same as in the prior Epi25 analysis, but now SCN1A ([MIM: 182389] OR = 7.1, p = 4.4 × 10−8) and NEXMIF (previously known as KIAA2022 [MIM: 300524] OR = 26.5, p = 8.6 × 10−8) both achieve study-wide significance. In contrast to prior Epi25 analyses, nine of the top ten ranked genes are known epilepsy genes,76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87 demonstrating the strength of the increased sample size and clustering methodology. The remaining gene, AP3S2 ([MIM: 602416] OR = 70.5, p = 2.7 × 10−4), is a component of the AP3 complex, an adaptor-related complex with no prior association to epilepsy, although it was a top 200 hit in the prior Epi25 DEE analysis.9,88 Hermansky-Pudlak syndrome 10 (MIM: 617050), which is notable for infantile onset of immunodeficiency and intractable seizures, is caused by bi-allelic mutations in AP3D1 (MIM: 607246), a different component of the same AP3 complex.89 To highlight candidate genes, we removed DEE-affected individuals in Figure 1A that harbored a qualifying variant in any of the 101 dominant genes with epilepsy or related terms in the OMIM phenotype and re-ran the collapsing analysis (Figure S4, Table S11). The 5th ranked gene, SRCAP ([MIM: 611421] OR = 6.8, p = 1.6 × 10−3), is highly intolerant to loss-of-function variants (LOEUF = 0.1) and is associated with Floating-Harbor syndrome (MIM: 136140), which can include seizures.90,91 In summary, this enlarged DEE analysis with affected individuals of non-European geographic ancestry produced results that more consistently elevated known epilepsy-associated genes and, importantly, proposed genes without prior epilepsy associations (AP3S2 and SRCAP).
Four of the top ten ranked genes in the gene-based collapsing analysis for GGE (Figure 1B, Table S12) were previously associated with epilepsy (SLC6A1 [MIM: 137165], SCN1A, GRIN2A [MIM: 138253], and GABRA1 [MIM: 137160]).92, 93, 94, 95 The top hit is SLC6A1 (OR = 16.6, p = 2.1 × 10−6), which was a top 200 gene in the prior Epi25 GGE analysis but now approaches study-wide significance.9 SLC6A1 was initially implicated in DEE, but its role in generalized epilepsies has only been more recently revealed.95,96 Among the remaining genes, there are two promising candidates: (1) FBXO42 ([MIM: 609109] OR = 13.6, p = 4.5 × 10−4), which is a highly intolerant gene (LOEUF = 0.27) important in the regulation of p53 and not yet implicated in disease but was a top 200 GGE-associated gene in the prior Epi25 analysis,9 and (2) KCNK18 ([MIM: 613655] OR = Inf, p = 1.6 × 10−3), which is a potassium channel implicated in migraine pathology.97,98 Promising candidate genes for GGE from the prior Epi25 analysis (CACNA1G [MIM: 604065] and UNC79 [MIM: 616884]) were not among the top 200 associated genes, which may be related to the different method of filtering missense variants.9 Further limiting missense variants to intolerant as well as damaging (Figure S5B, Table S13) elevated CACNA1B ([MIM: 601012], OR = 5.5, p = 3.5 × 10−4). Bi-allelic LOF variants in CACNA1B cause severe epilepsy.99 CACNA1B was the top gene associated with GGE in Epi4K,7 a large WES epilepsy project prior to Epi25. No association was found in the prior Epi25 analysis and there is limited other literature linking CACNA1B to GGE. This new GGE Epi25 collapsing analysis did not confirm promising candidate genes from the prior Epi25 analysis but did provide additional support for the association between CACNA1B and GGE and proposed candidate genes (FBXO42 and KCNK18).
Gene-based collapsing analysis for NAFE (Figure 1C, Table S14) showed a familiar top hit, DEPDC5 ([MIM: 614191] OR = 5.4, p = 1.3 × 10−6), and four additional genes (GRIN2A, SCN1A, SCN8A [MIM: 600702], and NPRL2 [MIM: 607072]) that have previously been implicated in NAFE.7,9,80,84,92,100,101 Renin, the protein encoded by REN ([MIM: 179820] OR = 12.7, p = 4.2 × 10−4), is produced by juxtaglomerular cells of the kidney but has been implicated as a target of adjuvant therapy for epilepsy.102,103 ADORA2B ([MIM: 600446] OR = Inf, p = 4.5 × 10−4) is a small gene encoding an adenosine receptor not associated with disease but being explored for its role in epileptogenesis.104,105 DAW1 (OR = 30.0, p = 1.8 × 10−4), a little-understood gene, supports cilia function.106 The increased sample size did not further support promising genes from the prior Epi25 analysis, such as TRIM3 (MIM: 605493), PPFIA3 (MIM: 603144), and KCNJ3 (MIM: 601534).9 Further limiting missense variants to intolerant as well as damaging (Figure S5C, Table S15) removed all control-enriched genes from the top ten ranked genes and elevated known epilepsy genes. Interestingly, the 7th ranked gene, TSC1 ([MIM: 605284], OR = 14, p = 1.7 × 10−3), is typically associated with focal epilepsy in the context of tuberous sclerosis-1 (MIM: 191100) or focal cortical dysplasia, type II, somatic (MIM: 607341), although the individuals with focal epilepsy in this study do not have a lesion on MRI.107,108 Like the GGE collapsing analysis, the NAFE collapsing analysis proposed different candidate genes rather than confirming those from prior Epi25 analyses.
Milder epilepsies remain enriched for ultra-rare variants in a limited gene set
Our group has previously observed that milder epilepsies are enriched for variants in genes also associated with severe phenotypes.7,9 To limit the degree to which individual genes in the gene set drove that finding and facilitate comparisons of variants across epilepsies, we recapitulated that analysis but narrowed the gene set of dominant DEE-associated genes to include only those 24 genes containing at least one damaging missense variant in all three epilepsies (Figure 2, Tables S5 and S27). DEE (CMH pooled odds ratio [OR] = 2.1, FDR-adjusted p value [adj.p] = 1.9 × 10−9) and NAFE (OR = 1.3, adj.p = 1.2 × 10−3) are enriched for all missense variants. All three epilepsies are enriched for damaging missense variants (DEE OR = 3.7, adj.p = 6.8 × 10−17; GGE OR = 1.7, adj.p = 1.2 × 10−4; NAFE OR = 1.7, adj.p = 6.4 × 10−5), and removing the damaging filter, all three epilepsies are also enriched for variants in intolerant genic sub-regions (DEE OR = 3.5, adj.p = 1.6 × 10−14; GGE OR = 1.7, adj.p = 1.2 × 10−4; NAFE OR = 1.6, adj.p = 3.5 × 10−4). Combining both improves enrichment in all three epilepsies (DEE OR = 5.5, adj.p = 8.1 × 10−19; GGE OR = 2.2, adj.p = 1.0 × 10−6; NAFE OR = 2.0, adj.p = 1.8 × 10−5). Only DEE and GGE were enriched for loss-of-function variants (DEE OR = 12.7, adj.p = 1.9 × 10−9; GGE OR = 3.8, adj.p = 4.6 × 10−4), which is consistent with prior analyses.9 In summary, despite restricting our DEE-associated gene set further to ensure that at least one affected individual per epilepsy harbored a damaging missense variant in each gene and enlarging our samples to include individuals of non-European ancestry, a familiar pattern of enrichment exists in the milder epilepsies.
Ultra-rare DEE variants in Epi25 are located in intolerant genic sub-regions
After demonstrating that milder epilepsies (GGE, NAFE) were enriched for ultra-rare damaging missense variants in the same gene set as severe epilepsies (DEE) (Figure 2), we tested the hypothesis that variants associated with DEE were located in more intolerant sub-regions than those associated with GGE or NAFE. Despite filtering for pathogenicity with REVEL, there remains a background rate of enrichment of ultra-rare damaging missense variants in the control population (Figure 2, Table S29). This suggests that a portion of the ultra-rare damaging missense variants in our epilepsy-affected individuals are also benign, which makes direct comparison of the sub-genic intolerance score among epilepsy subtypes (Figure S6A) difficult to interpret because the burden of damaging missense variants in DEE-affected individuals is higher than that in GGE- or NAFE-affected individuals (CMH; DEE-GGE OR = 2.2, adj.p = 7.8 × 10−7; DEE-NAFE OR = 2.3, adj.p = 9.4 × 10−8; Table S28). Instead, we estimated the distribution of MTR scores of “true positive” ultra-rare damaging missense variants in each epilepsy and made pairwise comparisons by using a Kolmogorov-Smirnov (K-S) test (see sub-genic intolerance comparison, Figure 3). Consistent with our hypothesis, the distribution of MTR scores for DEE variants was significantly different from NAFE (“true positive” median MTR DEE = 0.670 versus NAFE = 0.721, K-S, p < 0.0156), while the difference from GGE did not achieve statistical significance (“true positive” median MTR DEE = 0.670 versus GGE = 0.710, K-S, p = 0.38). On a per gene basis, the MTR scores of DEE variants are not uniformly more intolerant than GGE and NAFE (Figure S7). Although the above analysis demonstrates that DEE variants lie in more intolerant genic sub-regions than NAFE variants, it does not account for the possible differential contribution of specific genes to specific epilepsies among the 24 genes.
To address this concern, we performed a second analysis that compared the weighted mean MTR of DEE with that of NAFE and of GGE (Table S29). The weighted mean MTR score of the DEE variants was lower in 15 of the 24 genes compared to NAFE (binomial test, p = 0.31) and in 15 of the 24 genes compared to GGE (binomial test, p = 0.31).
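The reported p value can be checked with an exact two-sided binomial test under the null hypothesis that DEE has the lower mean in half of the genes. The sketch below is a stand-alone illustration with a hypothetical function name, not the authors' R code. It sums the probabilities of all outcomes that are no more likely than the observed one, which for a null probability of 0.5 reduces to comparing binomial coefficients.

```python
from math import comb


def binomial_two_sided_p(successes, trials):
    """Exact two-sided binomial p value for a null success probability of 0.5."""
    observed = comb(trials, successes)
    # Under p = 0.5 every outcome k has probability comb(trials, k) / 2**trials,
    # so "no more likely than observed" is a comparison of coefficients.
    extreme = sum(comb(trials, k) for k in range(trials + 1)
                  if comb(trials, k) <= observed)
    return extreme / 2 ** trials


p = binomial_two_sided_p(15, 24)
print(f"{p:.2f}")  # prints 0.31
```

With 24 genes, 18 or more would need to favor DEE before this test reached p < 0.05, which illustrates how limited the power of a per-gene sign comparison is at this gene-set size.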
No clear relationship exists between gene, protein domain, and epilepsy type (Figure S8). Despite the large Epi25 dataset, we most likely remain underpowered to untangle the epilepsy by protein space relationship on an individual gene level.33 MTR is calculated on a sliding window, making it independent of known gene structures. Domain-based MTR showed a smaller difference among the epilepsies (Figures S6A and S6B), suggesting that the sub-genic intolerance differences among the epilepsies are at least partially independent from gene structures.32 We also examined whether missense variants associated with DEE were located in more evolutionarily constrained bases, exons, or domains than those associated with milder epilepsies (Figures S6C–S6E). No comparison met statistical significance. This was true despite both evolutionarily constrained and intolerant domains harboring pathogenic variants, although differences in domains may be difficult to assess given the limited number per gene.22
Only ultra-rare pathogenic/likely pathogenic ClinVar variants are enriched in Epi25
The sample size of Epi25 allows us to assess the representation of variants found in ClinVar, a heavily used clinical database of curated variants, in our three epilepsy sub-groups and investigate whether sub-genic intolerance might add clinically useful information.38,39 Using a set of 101 genes with epilepsy or related terms in their OMIM phenotypes (Table S5), we examined the burden of P/LP variants in our affected individuals compared to control individuals (Figure 4A, Table S30). Given the prior findings that epilepsy-affected individuals are enriched with ultra-rare variants but not more common variants,7 we divided our ClinVar analysis into variants not found in the non-neuro gnomAD populations (ultra-rare) and variants seen in the general population (public). Consistent with prior reports, there was an increased burden of ultra-rare P/LP variants in our epilepsy-affected individuals compared to control individuals irrespective of epilepsy type (CMH; DEE OR = 84.5, adj.p = 8.9 × 10−38; GGE OR = 14.5, adj.p = 1.8 × 10−11; NAFE OR = 14.4, adj.p = 6.9 × 10−13). There was no enrichment in public variants (Figure 4A). Epilepsy variants in ClinVar also found in gnomAD or future public datasets may require additional investigation to confirm pathogenicity.
Severe pathogenic/likely pathogenic ClinVar variants are located in intolerant genic sub-regions
Among ultra-rare ClinVar variants, we sought to determine whether we could further differentiate epilepsy variants from control variants (Figure 4B, Table S31). ClinVar “review status” attempts to capture the level of review supporting the assertion of clinical significance for the variant with increasing number of “gold stars” from zero to four.63, 64, 65 Filtering ultra-rare P/LP ClinVar on the basis of review status did not improve discrimination in a dose-dependent fashion. In all three epilepsies, there were no zero star controls but the enrichment of variants with more than one star exceeded the enrichment of variants with one star (CMH; DEE OR = 47.5, adj.p = 7.8 × 10−12 → OR = 91.6, adj.p = 1.4 × 10−21; GGE OR = 9.1, adj.p = 5.6 × 10−4 → OR = 17.2, adj.p = 6.6 × 10−7; NAFE OR = 8.2, adj.p = 2.0 × 10−3 → OR = 10.7, adj.p = 1.3 × 10−4). We next examined whether sub-genic intolerance filtering could further improve discrimination of affected individuals compared to control individuals. After filtering with MTR, the OR of ultra-rare missense variants increased in all three epilepsies (CMH; DEE OR = 92.5, adj.p = 3.4 × 10−32 → OR = 335.4, adj.p = 1.4 × 10−25; GGE OR = 14.9, adj.p = 1.2 × 10−9 → OR = 59.6, adj.p = 3.9 × 10−10; NAFE OR = 12.3, adj.p = 3.8 × 10−9 → OR = 34.7, adj.p = 9.2 × 10−8). All three epilepsies were enriched with ultra-rare PTVs in ClinVar (DEE OR = 49.8, adj.p = 3.4 × 10−6; GGE OR = 11.0, adj.p = 0.045; NAFE OR = 24.7, adj.p = 1.6 × 10−4). Among the few public variants, only missense variants filtered with MTR were statistically enriched in NAFE-affected individuals, and overall, MTR filtering removed all 12 control missense variants but only four of ten epilepsy variants (Table S32). In summary, sub-genic intolerance filtering improved discrimination of both ultra-rare and public variants in ClinVar, suggesting sub-genic intolerance provides additive information to identify potential false-positive or variable penetrance variants in ClinVar.
Using ultra-rare P/LP ClinVar variants, we sought to confirm our Epi25 finding (Figure 3) that missense variants in severe epilepsies are located in more intolerant genic sub-regions than those in milder epilepsies. We compared median sub-genic intolerance scores between DEE and non-DEE epilepsies (see sub-genic intolerance comparison) in genes with missense variants in both epilepsy groups (Figure 5, Tables S5 and S33). The median MTR score was lower (more intolerant) for published ClinVar DEE variants compared to non-DEE epilepsy ClinVar variants (median DEE MTR = 0.57 versus median non-DEE MTR = 0.70, two-sample Wilcoxon test, p < 6.7 × 10−3). When examined by gene, the mean MTR score for the DEE variants was lower than that for the non-DEE variants in 11 of 14 genes tested (binomial test, p = 0.057). Reassuringly, both DEE and non-DEE variants existed in more intolerant regions than ultra-rare control variants (median control MTR = 0.83, control set drawn from combined epilepsy clusters, see clustering).
Epilepsy genes remain to be discovered and are most likely loss-of-function intolerant
There are ~3,900 genes identified in OMIM as harboring variants that are causative or a risk factor for disease.109 Analyzing likely damaging variants in non-OMIM genes may give a sense of as-yet to be discovered epilepsy genes (Figure 6, Tables S34 and S35). GGE and NAFE revealed a significant burden of PTVs in the intersection of non-OMIM genes with the decile of genes most intolerant to loss-of-function variation in the general population (GGE OR = 1.3, adj.p = 2.7 × 10−4; NAFE OR = 1.2, adj.p = 0.013) (Figures 6B and 6C). We highlighted the top four genes in the most intolerant decile associated with GGE and NAFE that had more than one case PTV and no control PTVs (Table 1). The most significant GGE candidate gene, NLGN2 (MIM: 606479, 3 cases), encodes neuroligin 2, which is a trans-synaptic adhesion molecule important in the synapse.110 The most significant NAFE candidate gene was WDR18 (4 cases), whose protein product forms the PELP1-TEX10-WDR18 complex important in ribosomal maturation.111 Tables of potential DEE, GGE, and NAFE genes are included in the supplement (Tables S36–S38). Finally, to investigate additional candidate genes, we performed ultra-rare variant collapsing analysis with only PTVs (Figure S9, Tables S18–S20), only damaging missense variants (Figure S10, Tables S21–S23), and PTVs combined with damaging and intolerant missense variants further limited to intolerant LIMBR exons (see qualifying variant, Figure S11, Tables S24–S26).31
DEE-affected individuals also revealed a trend toward increased burden in the intersection of non-OMIM genes with the 7th most intolerant decile (DEE OR = 1.1, adj.p = 0.14, Figure 6A), which may reflect genes associated with recessive epilepsies.20 None of the epilepsies revealed a significant burden of damaging and intolerant missense variants in missense intolerant genes (Table S35).
Discussion
In this, the largest Epi25 exome study of epilepsies to date including individuals of non-European geographic descent, we reaffirm that ultra-rare variants contribute to the three major epilepsy groups (Figure 1). Our collapsing analyses proposed epilepsy-associated genes (AP3S2, SRCAP, FBXO42, KCNK18, REN, and ADORA2B) requiring future confirmation. These associations reveal the power of increasing sample size with Epi25 and our clustering technique’s inclusion of non-European populations. The p values in DEE analyses must be regarded in light of the smaller sample size of individuals with DEE (1,835 with DEE compared to 5,303 with GGE and 6,349 with NAFE). We were unable to confirm several promising candidate genes from the prior Epi25 analysis, which may be secondary to different control groups, different in silico filters, or a larger sample size.9 We confirmed enrichment of ultra-rare variants in GGE and NAFE in genes associated with DEE, even when the analysis was limited to genes in which all epilepsies have a damaging missense variant so that single, distinct genes would not drive the associations with different epilepsies (Figure 2).
Sub-genic intolerance has broad implications. It has been shown to help improve discrimination between pathogenic and benign variants and confirm the pathogenicity of new variants.22,32,112, 113, 114, 115 Pathogenic variants may cluster in areas of regional intolerance,31,32,116 and sub-genic intolerance scores may inform biochemical exploration, yielding novel insights into protein function.117 To our knowledge, this is the broadest demonstration that sub-genic intolerance scores might not only be different between case and control but also affect disease severity (Figures 3 and 5).32 This discrepancy may broadly inform the functional similarities of mutations leading to more severe disease across genes or, interestingly, across gene families.118
Using the large Epi25 dataset allowed us to assess variants documented in ClinVar (Figure 4). Allele frequency is known to be inversely associated with pathogenicity, and among Epi25 participants, only ultra-rare variants were enriched in affected individuals compared to control individuals (Figure 4A). Previous analyses have used population-based MAFs to reclassify variants as benign.32,68,119,120 The evolving nature of ClinVar classifications has been noted previously as more population-wide control data become available.63,64,121 Within the ultra-rare MAF bin, review status did not provide additional enrichment in a dose-dependent manner in our data (Figure 4B), although it has indicated higher true positive value in other studies focused on more common variants.63, 64, 65,122 One and two star ultra-rare pathogenic variants in ClinVar have been reported as possible false-positives,122 although no study to our knowledge has systematically evaluated ultra-rare P/LP ClinVar variants for false-positivity or incomplete penetrance. Finally, four of the five ultra-rare and all 12 public missense P/LP variants harbored by control individuals were located in more tolerant regions of the exome (Figure 4B, Tables S31 and S32). The enrichment of ClinVar variants with MTR filtering suggests that regional intolerance may provide additional information to clinicians assessing ClinVar variants.
There most likely remain genes that will ultimately be associated with a disease, although the pace of discovery may be slowing.109 In this Epi25 cohort, GGE and NAFE contained an increased burden of PTVs in the non-OMIM genes most intolerant to loss-of-function variation in the general population (Figure 6). No increase was seen for individuals with DEE, suggesting that gene discovery for DEE is advanced compared to the milder epilepsies. There are several genes with PTVs in multiple affected individuals but in no control individuals that are potential epilepsy or epilepsy-risk genes (Tables 1 and S37–S39). With increased sample size, these genes may become more prominent in future collapsing analyses.
Limitations of this study include that individuals with epilepsy were enrolled at variable ages, leaving open the possibility that a case may evolve from one epilepsy to another. While we posit that variant location determines the severity of the variant and therefore determines the phenotype, this does not address variants that have one autosomal dominant phenotype and a different autosomal recessive phenotype. The sub-genic intolerance score-by-gene interaction (Figures S6 and S7) may be secondary to different numbers of variants per gene, incomplete capture of all sub-genic intolerance information by MTR, or other factors that contribute to epilepsy severity. Examining the collective sub-genic intolerance scores of variants from multiple genes does not take into account within-gene comparisons (i.e., sub-genic intolerance distributions differ per gene, as do the epilepsy type-by-gene burdens). We attempted to address these confounds (Tables S29 and S33) but were under-powered. Future studies will be needed to understand the gene-by-intolerance score interaction. Finally, segregation analysis of variants in candidate epilepsy-associated genes (Table 1) could weaken or bolster the proposed relationships. Unfortunately, we do not have access to Epi25 family member data. As Epi25 enrollment increases, we expect the added power to allow further elucidation of the genetic architectures of the epilepsies.
Consortia
The members of the Epi25 Collaborative are Joshua E. Motelow, Gundula Povysil, Ryan S. Dhindsa, Kate E. Stanley, Andrew S. Allen, Yen-Chen Anne Feng, Daniel P. Howrigan, Liam E. Abbott, Katherine Tashman, Felecia Cerrato, Caroline Cusick, Tarjinder Singh, Henrike Heyne, Andrea E. Byrnes, Claire Churchhouse, Nick Watts, Matthew Solomonson, Dennis Lal, Namrata Gupta, Benjamin M. Neale, Gianpiero L. Cavalleri, Patrick Cossette, Chris Cotsapas, Peter De Jonghe, Tracy Dixon-Salazar, Renzo Guerrini, Hakon Hakonarson, Erin L. Heinzen, Ingo Helbig, Patrick Kwan, Anthony G. Marson, Slavé Petrovski, Sitharthan Kamalakaran, Sanjay M. Sisodiya, Randy Stewart, Sarah Weckhuysen, Chantal Depondt, Dennis J. Dlugos, Ingrid E. Scheffer, Pasquale Striano, Catharine Freyer, Roland Krause, Patrick May, Kevin McKenna, Brigid M. Regan, Caitlin A. Bennett, Costin Leu, Stephanie L. Leech, Terence J. O’Brien, Marian Todaro, Hannah Stamberger, Danielle M. Andrade, Quratulain Zulfiqar Ali, Tara R. Sadoway, Heinz Krestel, André Schaller, Savvas S. Papacostas, Ioanna Kousiappa, George A. Tanteles, Yiolanda Christou, Katalin Štěrbová, Markéta Vlčková, Lucie Sedláčková, Petra Laššuthová, Karl Martin Klein, Felix Rosenow, Philipp S. Reif, Susanne Knake, Bernd A. Neubauer, Friedrich Zimprich, Martha Feucht, Eva M. Reinthaler, Wolfram S. Kunz, Gábor Zsurka, Rainer Surges, Tobias Baumgartner, Randi von Wrede, Manuela Pendziwiat, Hiltrud Muhle, Annika Rademacher, Andreas van Baalen, Sarah von Spiczak, Ulrich Stephani, Zaid Afawi, Amos D. Korczyn, Moien Kanaan, Christina Canavati, Gerhard Kurlemann, Karen Müller-Schlüter, Gerhard Kluger, Martin Häusler, Ilan Blatt, Johannes R. Lemke, Ilona Krey, Yvonne G. Weber, Stefan Wolking, Felicitas Becker, Stephan Lauxmann, Christian Boßelmann, Josua Kegele, Christian Hengsbach, Sarah Rau, Bernhard J. Steinhoff, Andreas Schulze-Bonhage, Ingo Borggräfe, Christoph J. 
Schankin, Susanne Schubert-Bast, Herbert Schreiber, Thomas Mayer, Rudolf Korinthenberg, Knut Brockmann, Markus Wolff, Dieter Dennig, Rene Madeleyn, Reetta Kälviäinen, Anni Saarela, Oskari Timonen, Tarja Linnankivi, Anna-Elina Lehesjoki, Sylvain Rheims, Gaetan Lesca, Philippe Ryvlin, Louis Maillard, Luc Valton, Philippe Derambure, Fabrice Bartolomei, Edouard Hirsch, Véronique Michel, Francine Chassoux, Mark I. Rees, Seo-Kyung Chung, William O. Pickrell, Robert Powell, Mark D. Baker, Beata Fonferko-Shadrach, Charlotte Lawthom, Joseph Anderson, Natascha Schneider, Simona Balestrini, Sara Zagaglia, Vera Braatz, Michael R. Johnson, Pauls Auce, Graeme J. Sills, Larry W. Baum, Pak C. Sham, Stacey S. Cherny, Colin H.T. Lui, Norman Delanty, Colin P. Doherty, Arif Shukralla, Hany El-Naggar, Peter Widdess-Walsh, Nina Barišić, Laura Canafoglia, Silvana Franceschetti, Barbara Castellotti, Tiziana Granata, Francesca Ragona, Federico Zara, Michele Iacomino, Antonella Riva, Francesca Madia, Maria Stella Vari, Vincenzo Salpietro, Marcello Scala, Maria Margherita Mancardi, Lino Nobili, Elisabetta Amadori, Thea Giacomini, Francesca Bisulli, Tommaso Pippucci, Laura Licchetta, Raffaella Minardi, Paolo Tinuper, Lorenzo Muccioli, Barbara Mostacci, Antonio Gambardella, Angelo Labate, Grazia Annesi, Lorella Manna, Monica Gagliardi, Elena Parrini, Davide Mei, Annalisa Vetro, Claudia Bianchini, Martino Montomoli, Viola Doccini, Carmen Barba, Shinichi Hirose, Atsushi Ishii, Toshimitsu Suzuki, Yushi Inoue, Kazuhiro Yamakawa, Ahmad Beydoun, Wassim Nasreddine, Nathalie Khoueiry Zgheib, Birute Tumiene, Algirdas Utkus, Lynette G. Sadleir, Chontelle King, S. Hande Caglayan, Mutluay Arslan, Zuhal Yapıcı, Pınar Topaloglu, Bulent Kara, Uluc Yis, Dilsad Turkdogan, Aslı Gundogdu-Eken, Nerses Bebek, Meng-Han Tsai, Chen-Jui Ho, Chih-Hsiang Lin, Kuang-Lin Lin, I-Jun Chou, Annapurna Poduri, Beth R. Shiedley, Catherine Shain, Jeffrey L. Noebels, Alicia Goldman, Robyn M. Busch, Lara Jehi, Imad M. 
Najm, Lisa Ferguson, Jean Khoury, Tracy A. Glauser, Peggy O. Clark, Russell J. Buono, Thomas N. Ferraro, Michael R. Sperling, Warren Lo, Michael Privitera, Jacqueline A. French, Steven Schachter, Ruben I. Kuzniecky, Orrin Devinsky, Manu Hegde, David A. Greenberg, Colin A. Ellis, Ethan Goldberg, Katherine L. Helbig, Mahgenn Cosico, Priya Vaidiswaran, Eryn Fitch, Samuel F. Berkovic, Holger Lerche, Daniel H. Lowenstein, and David B. Goldstein. See supplemental information for consortium member affiliations.
Declaration of interests
B.M.N. is a member of the scientific advisory board at Deep Genomics and RBNC Therapeutics, a member of the scientific advisory committee at Milken, and a consultant for Camp4 Therapeutics, Takeda Pharmaceutical, and Biogen. R.S.D. is a consultant for AstraZeneca. D.B.G. is a founder and shareholder in Praxis Precision Medicines, a shareholder in and member of the scientific advisory board for Apostle Inc., a shareholder in Q State – Biosciences, and a consultant for Gilead Sciences, AstraZeneca, and GoldFinch Bio.
Acknowledgments
We thank the Epi25 principal investigators, local staff overseeing individual cohorts, and all of the individuals with epilepsy and their families who participated in Epi25 for their commitment to this international collaboration. J.E.M. is supported by the National Institutes of Health (TL1TR001875). This work is part of the Centers for Common Disease Genomics (CCDG) program, funded by the National Human Genome Research Institute (NHGRI) and the National Heart, Lung, and Blood Institute (NHLBI). CCDG-funded Epi25 research activities at the Broad Institute, including genomic data generation in the Broad Genomics Platform, are supported by NHGRI grant UM1 HG008895 (PIs: Eric Lander, Stacey Gabriel, Mark Daly, and Sekar Kathiresan). The Genome Sequencing Program efforts were also supported by NHGRI grant 5U01HG009088. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. A supplemental grant for Epi25 phenotyping was supported by “Epi25 Clinical Phenotyping R03,” National Institutes of Health (1R03NS108145-01); D.H.L. and S.F.B. were the principal investigators. We also thank the Stanley Center for Psychiatric Research at the Broad Institute for supporting the genomic data generation. Additional funding sources and acknowledgment of individual cohorts are listed in the supplemental information.
Published: April 30, 2021
Footnotes
Supplemental information can be found online at https://doi.org/10.1016/j.ajhg.2021.04.009.
Contributor Information
Epi25 Collaborativejm4279@cumc.columbia.edudg2875@cumc.columbia.edu:
Joshua E. Motelow, Gundula Povysil, Ryan S. Dhindsa, Kate E. Stanley, Andrew S. Allen, Yen-Chen Anne Feng, Daniel P. Howrigan, Liam E. Abbott, Katherine Tashman, Felecia Cerrato, Caroline Cusick, Tarjinder Singh, Henrike Heyne, Andrea E. Byrnes, Claire Churchhouse, Nick Watts, Matthew Solomonson, Dennis Lal, Namrata Gupta, Benjamin M. Neale, Gianpiero L. Cavalleri, Patrick Cossette, Chris Cotsapas, Peter De Jonghe, Tracy Dixon-Salazar, Renzo Guerrini, Hakon Hakonarson, Erin L. Heinzen, Ingo Helbig, Patrick Kwan, Anthony G. Marson, Slavé Petrovski, Sitharthan Kamalakaran, Sanjay M. Sisodiya, Randy Stewart, Sarah Weckhuysen, Chantal Depondt, Dennis J. Dlugos, Ingrid E. Scheffer, Pasquale Striano, Catharine Freyer, Roland Krause, Patrick May, Kevin McKenna, Brigid M. Regan, Caitlin A. Bennett, Costin Leu, Stephanie L. Leech, Terence J. O’Brien, Marian Todaro, Hannah Stamberger, Danielle M. Andrade, Quratulain Zulfiqar Ali, Tara R. Sadoway, Heinz Krestel, André Schaller, Savvas S. Papacostas, Ioanna Kousiappa, George A. Tanteles, Yiolanda Christou, Katalin Štěrbová, Markéta Vlčková, Lucie Sedláčková, Petra Laššuthová, Karl Martin Klein, Felix Rosenow, Philipp S. Reif, Susanne Knake, Bernd A. Neubauer, Friedrich Zimprich, Martha Feucht, Eva M. Reinthaler, Wolfram S. Kunz, Gábor Zsurka, Rainer Surges, Tobias Baumgartner, Randi von Wrede, Manuela Pendziwiat, Hiltrud Muhle, Annika Rademacher, Andreas van Baalen, Sarah von Spiczak, Ulrich Stephani, Zaid Afawi, Amos D. Korczyn, Moien Kanaan, Christina Canavati, Gerhard Kurlemann, Karen Müller-Schlüter, Gerhard Kluger, Martin Häusler, Ilan Blatt, Johannes R. Lemke, Ilona Krey, Yvonne G. Weber, Stefan Wolking, Felicitas Becker, Stephan Lauxmann, Christian Boßelmann, Josua Kegele, Christian Hengsbach, Sarah Rau, Bernhard J. Steinhoff, Andreas Schulze-Bonhage, Ingo Borggräfe, Christoph J. Schankin, Susanne Schubert-Bast, Herbert Schreiber, Thomas Mayer, Rudolf Korinthenberg, Knut Brockmann, Markus Wolff, Dieter Dennig, Rene Madeleyn, Reetta Kälviäinen, Anni Saarela, Oskari Timonen, Tarja Linnankivi, Anna-Elina Lehesjoki, Sylvain Rheims, Gaetan Lesca, Philippe Ryvlin, Louis Maillard, Luc Valton, Philippe Derambure, Fabrice Bartolomei, Edouard Hirsch, Véronique Michel, Francine Chassoux, Mark I. Rees, Seo-Kyung Chung, William O. Pickrell, Robert Powell, Mark D. Baker, Beata Fonferko-Shadrach, Charlotte Lawthom, Joseph Anderson, Natascha Schneider, Simona Balestrini, Sara Zagaglia, Vera Braatz, Michael R. Johnson, Pauls Auce, Graeme J. Sills, Larry W. Baum, Pak C. Sham, Stacey S. Cherny, Colin H.T. Lui, Norman Delanty, Colin P. Doherty, Arif Shukralla, Hany El-Naggar, Peter Widdess-Walsh, Nina Barišić, Laura Canafoglia, Silvana Franceschetti, Barbara Castellotti, Tiziana Granata, Francesca Ragona, Federico Zara, Michele Iacomino, Antonella Riva, Francesca Madia, Maria Stella Vari, Vincenzo Salpietro, Marcello Scala, Maria Margherita Mancardi, Lino Nobili, Elisabetta Amadori, Thea Giacomini, Francesca Bisulli, Tommaso Pippucci, Laura Licchetta, Raffaella Minardi, Paolo Tinuper, Lorenzo Muccioli, Barbara Mostacci, Antonio Gambardella, Angelo Labate, Grazia Annesi, Lorella Manna, Monica Gagliardi, Elena Parrini, Davide Mei, Annalisa Vetro, Claudia Bianchini, Martino Montomoli, Viola Doccini, Carmen Barba, Shinichi Hirose, Atsushi Ishii, Toshimitsu Suzuki, Yushi Inoue, Kazuhiro Yamakawa, Ahmad Beydoun, Wassim Nasreddine, Nathalie Khoueiry Zgheib, Birute Tumiene, Algirdas Utkus, Lynette G. Sadleir, Chontelle King, S. Hande Caglayan, Mutluay Arslan, Zuhal Yapıcı, Pınar Topaloglu, Bulent Kara, Uluc Yis, Dilsad Turkdogan, Aslı Gundogdu-Eken, Nerses Bebek, Meng-Han Tsai, Chen-Jui Ho, Chih-Hsiang Lin, Kuang-Lin Lin, I-Jun Chou, Annapurna Poduri, Beth R. Shiedley, Catherine Shain, Jeffrey L. Noebels, Alicia Goldman, Robyn M. Busch, Lara Jehi, Imad M. Najm, Lisa Ferguson, Jean Khoury, Tracy A. Glauser, Peggy O. Clark, Russell J. Buono, Thomas N. Ferraro, Michael R. Sperling, Warren Lo, Michael Privitera, Jacqueline A. French, Steven Schachter, Ruben I. Kuzniecky, Orrin Devinsky, Manu Hegde, David A. Greenberg, Colin A. Ellis, Ethan Goldberg, Katherine L. Helbig, Mahgenn Cosico, Priya Vaidiswaran, Eryn Fitch, Samuel F. Berkovic, Holger Lerche, Daniel H. Lowenstein, and David B. Goldstein
Data and code availability
The accession number for the Epi25 Year 1 whole-exome sequencing data reported in this paper is dbGaP: phs001489. Epi25 Year 2 will be available in the near future under the same accession number. Epi25 Year 3 is not yet publicly available.
Web resources
Consensus Coding Sequence, https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi
Epi25 Collaborative, http://epi-25.org/
Epi25 WES results browser, https://epi25.broadinstitute.org/
EpiPGX project, http://www.epipgx.eu
Exome Aggregation Consortium (ExAC), http://exac.broadinstitute.org
Exome Variant Server, https://evs.gs.washington.edu/EVS/
Genome Aggregation Database (gnomAD), https://gnomad.broadinstitute.org
Genome Analysis Toolkit (GATK), https://gatk.broadinstitute.org/hc/en-us
lollipops-v.1.5.3, https://github.com/joiningdata/lollipops
MTR-Viewer, http://biosig.unimelb.edu.au/mtr-viewer/
NIH Genomic Data Sharing Policy, https://osp.od.nih.gov/scientific-sharing/policies/
OMIM, https://www.omim.org
Rare Exome Variant Ensemble Learner (REVEL), https://sites.google.com/site/revelgenomics/
- 103.Gasparini S., Ferlazzo E., Sueri C., Cianci V., Ascoli M., Cavalli S.M., Beghi E., Belcastro V., Bianchi A., Benna P., Epilepsy Study Group of the Italian Neurological Society Hypertension, seizures, and epilepsy: a review on pathophysiology and management. Neurol. Sci. 2019;40:1775–1783. doi: 10.1007/s10072-019-03913-4. [DOI] [PubMed] [Google Scholar]
- 104.Liu Y.J., Chen J., Li X., Zhou X., Hu Y.M., Chu S.F., Peng Y., Chen N.H. Research progress on adenosine in central nervous system diseases. CNS Neurosci. Ther. 2019;25:899–910. doi: 10.1111/cns.13190. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105.Chen J.F., Eltzschig H.K., Fredholm B.B. Adenosine receptors as drug targets--what are the challenges? Nat. Rev. Drug Discov. 2013;12:265–286. doi: 10.1038/nrd3955. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 106.Lesko S.L., Rouhana L. Dynein assembly factor with WD repeat domains 1 (DAW1) is required for the function of motile cilia in the planarian Schmidtea mediterranea. Dev. Growth Differ. 2020;62:423–437. doi: 10.1111/dgd.12669. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 107.Gupta A., de Bruyn G., Tousseyn S., Krishnan B., Lagae L., Agarwal N., TSC Natural History Database Consortium Epilepsy and Neurodevelopmental Comorbidities in Tuberous Sclerosis Complex: A Natural History Study. Pediatr. Neurol. 2020;106:10–16. doi: 10.1016/j.pediatrneurol.2019.12.016. [DOI] [PubMed] [Google Scholar]
- 108.Lim J.S., Gopalappa R., Kim S.H., Ramakrishna S., Lee M., Kim W.I., Kim J., Park S.M., Lee J., Oh J.H. Somatic Mutations in TSC1 and TSC2 Cause Focal Cortical Dysplasia. Am. J. Hum. Genet. 2017;100:454–472. doi: 10.1016/j.ajhg.2017.01.030. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 109.Bamshad M.J., Nickerson D.A., Chong J.X. Mendelian Gene Discovery: Fast and Furious with No End in Sight. Am. J. Hum. Genet. 2019;105:448–455. doi: 10.1016/j.ajhg.2019.07.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.Chubykin A.A., Atasoy D., Etherton M.R., Brose N., Kavalali E.T., Gibson J.R., Südhof T.C. Activity-dependent validation of excitatory versus inhibitory synapses by neuroligin-1 versus neuroligin-2. Neuron. 2007;54:919–931. doi: 10.1016/j.neuron.2007.05.029. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 111.Finkbeiner E., Haindl M., Muller S. The SUMO system controls nucleolar partitioning of a novel mammalian ribosome biogenesis complex. EMBO J. 2011;30:1067–1078. doi: 10.1038/emboj.2011.33. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 112.Kelly M., Park M., Mihalek I., Rochtus A., Gramm M., Pérez-Palma E., Axeen E.T., Hung C.Y., Olson H., Swanson L., Undiagnosed Diseases Network Spectrum of neurodevelopmental disease associated with the GNAO1 guanosine triphosphate-binding region. Epilepsia. 2019;60:406–418. doi: 10.1111/epi.14653. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.Szczałuba K., Chmielewska J.J., Sokolowska O., Rydzanicz M., Szymańska K., Feleszko W., Włodarski P., Biernacka A., Murcia Pienkowski V., Walczak A. Neurodevelopmental phenotype caused by a de novo PTPN4 single nucleotide variant disrupting protein localization in neuronal dendritic spines. Clin. Genet. 2018;94:581–585. doi: 10.1111/cge.13450. [DOI] [PubMed] [Google Scholar]
- 114.Havrilla J.M., Pedersen B.S., Layer R.M., Quinlan A.R. A map of constrained coding regions in the human genome. Nat. Genet. 2019;51:88–95. doi: 10.1038/s41588-018-0294-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 115.Samocha K.E., Kosmicki J.A., Karczewski K.J., O’Donnell-Luria A.H., Pierce-Hoffman E., MacArthur D.G., Neale B.M., Daly M.J. Regional missense constraint improves variant deleteriousness prediction. bioRxiv. 2017 doi: 10.1101/148353. [DOI] [Google Scholar]
- 116.Hemati P., Revah-Politi A., Bassan H., Petrovski S., Bilancia C.G., Ramsey K., Griffin N.G., Bier L., Cho M.T., Rosello M., C4RCD Research Group. DDD study Refining the phenotype associated with GNB1 mutations: Clinical data on 18 newly identified patients and review of the literature. Am. J. Med. Genet. A. 2018;176:2259–2275. doi: 10.1002/ajmg.a.40472. [DOI] [PubMed] [Google Scholar]
- 117.Ogden K.K., Chen W., Swanger S.A., McDaniel M.J., Fan L.Z., Hu C., Tankovic A., Kusumoto H., Kosobucki G.J., Schulien A.J. Molecular Mechanism of Disease-Associated Mutations in the Pre-M1 Helix of NMDA Receptors and Potential Rescue Pharmacology. PLoS Genet. 2017;13:e1006536. doi: 10.1371/journal.pgen.1006536. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 118.Pérez-Palma E., May P., Iqbal S., Niestroj L.M., Du J., Heyne H.O., Castrillon J.A., O’Donnell-Luria A., Nürnberg P., Palotie A. Identification of pathogenic variant enriched regions across genes and gene families. Genome Res. 2020;30:62–71. doi: 10.1101/gr.252601.119. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 119.Shearer A.E., Eppsteiner R.W., Booth K.T., Ephraim S.S., Gurrola J., 2nd, Simpson A., Black-Ziegelbein E.A., Joshi S., Ravi H., Giuffre A.C. Utilizing ethnic-specific differences in minor allele frequency to recategorize reported pathogenic deafness variants. Am. J. Hum. Genet. 2014;95:445–453. doi: 10.1016/j.ajhg.2014.09.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 120.Whiffin N., Minikel E., Walsh R., O’Donnell-Luria A.H., Karczewski K., Ing A.Y., Barton P.J.R., Funke B., Cook S.A., MacArthur D., Ware J.S. Using high-resolution variant frequencies to empower clinical genome interpretation. Genet. Med. 2017;19:1151–1158. doi: 10.1038/gim.2017.26. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 121.Yang S., Lincoln S.E., Kobayashi Y., Nykamp K., Nussbaum R.L., Topper S. Sources of discordance among germ-line variant classifications in ClinVar. Genet. Med. 2017;19:1118–1126. doi: 10.1038/gim.2017.60. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 122.Wright C.F., Eberhardt R.Y., Constantinou P., Hurles M.E., FitzPatrick D.R., Firth H.V., DDD Study Evaluating variants classified as pathogenic in ClinVar in the DDD Study. Genet. Med. 2021;23:571–575. doi: 10.1038/s41436-020-01021-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
Data Availability Statement
The accession number for the Epi25 Year 1 whole-exome sequencing data reported in this paper is dbGaP: phs001489. Epi25 Year 2 data will be made available in the near future under the same accession number. Epi25 Year 3 data are not yet publicly available.