Author manuscript; available in PMC: 2014 Aug 18.
Published in final edited form as: Nature. 2014 Jan 22;506(7487):185–190. doi: 10.1038/nature12975

Extended Data Table 6. Further stratification of enrichment analyses by class of variant.

a. Geneset analyses for damaging missense mutations only. For the primary geneset and its 12 constituent subsets, the table compares disruptive mutations with (strictly defined) damaging missense mutations, i.e. an independent set of variants. The omnibus result for the primary test is modest (P = 0.04) and did not withstand correction for multiple testing: as illustrated in Figure 1 and the main text, the bulk of the enrichment signal we observe comes from (singleton) disruptive mutations. Nonetheless, specific genesets such as ARC and the NMDAR network are highly and independently enriched for missense variants. N is the number of genes with at least one mutation of this class observed in the sample. A/U are the case/control counts of non-reference genotypes. OR is the odds ratio (not corrected for exome-wide rates), estimated by Firth’s method for sets with small cell counts. All tests are empirical and 1-sided (higher values expected in cases), as described in the main text and methods.

b. Enrichment analyses of novel and case-unique disruptive mutations. For primary and secondary genesets (and constituent subsets) as well as the composite set: results of three alternative burden analyses. The first focuses only on genes without any control disruptive variants; no further frequency filter is imposed. Here “N genes(A/U)” indicates the number of genes with at least one disruptive variant, followed by the number of genes with case-only disruptive mutations and (for comparison) the number with control-only disruptive mutations. The “A(U)” column gives the number of case variants in the case-only genes; the test statistic is based on the empirical distribution of this count. The U in this field is the corresponding quantity for controls (not explicitly used in the statistic). The second set of analyses comprises standard burden/enrichment tests (i.e. as in Tables 2 and 4) but stratified for novel versus known disruptive variants, according to dbSNP and the Exome Sequencing Project/Exome Variant Server (ESP/EVS) database. Novel variants show greater enrichment, although most rare variants observed in our study (both in cases and in controls) are novel, so tests of novel variants will have greater power.
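The case-unique statistic described in panel b can be illustrated with a minimal Python sketch. It assumes that the empirical null distribution is generated by permuting case/control labels across samples; that is an assumption of this sketch and is not stated in the legend. The gene names, sample indices, and function names (`case_unique_count`, `empirical_p`) are hypothetical.

```python
import random
from collections import defaultdict


def case_unique_count(variants, is_case):
    """Count case variants in genes that carry no control variant."""
    cases = defaultdict(int)
    controls = defaultdict(int)
    for gene, sample in variants:
        if is_case[sample]:
            cases[gene] += 1
        else:
            controls[gene] += 1
    # Only genes with zero control variants contribute to the statistic.
    return sum(n for gene, n in cases.items() if controls.get(gene, 0) == 0)


def empirical_p(variants, is_case, n_perm=2000, seed=0):
    """One-sided empirical P: how often a permuted count reaches the observed one."""
    rng = random.Random(seed)
    observed = case_unique_count(variants, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # assumed null: shuffle case/control labels
        if case_unique_count(variants, labels) >= observed:
            hits += 1
    # Add one to numerator and denominator so P is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: samples 0-4 are cases, samples 5-9 are controls.
is_case = [True] * 5 + [False] * 5
variants = [
    ("GENE_A", 0), ("GENE_A", 1), ("GENE_A", 2),  # case-only gene
    ("GENE_B", 3), ("GENE_B", 6),                 # seen in both groups
    ("GENE_C", 7),                                # control-only gene
]
observed, p = empirical_p(variants, is_case)
print(observed, p)
```

In the toy data only GENE_A is case-only, so the observed statistic is its three case variants; GENE_B is excluded because one control also carries a variant.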

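a

Column prefixes: Disruptive = disruptive singletons; Missense = damaging missense (strict) singletons.

| Primary geneset | Disruptive: P | Disruptive: N | Disruptive: A/U | Disruptive: OR | Missense: P | Missense: N | Missense: A/U | Missense: OR |
|---|---|---|---|---|---|---|---|---|
| All primary genes | 0.0008 | 905 | 852/716 | 1.20 | 0.0393 | 1357 | 2080/2001 | 1.04 |
| SCZ de novo genes | | | | | | | | |
| Exome sequencing (disruptive) | 0.0349 | 40 | 56/38 | 1.48 | 0.5613 | 53 | 121/120 | 1.01 |
| Exome sequencing (nonsyn) | 0.0059 | 332 | 384/309 | 1.25 | 0.2776 | 393 | 750/736 | 1.02 |
| Copy number variants | | | | | | | | |
| de novo CNV genes (Kirov et al, 2012) | 0.0224 | 64 | 61/40 | 1.53 | 0.0593 | 90 | 125/112 | 1.12 |
| SCZ-associated CNV genes | 0.3378 | 72 | 65/55 | 1.19 | 0.0310 | 111 | 148/119 | 1.25 |
| GWAS | | | | | | | | |
| Voltage-gated calcium channel genes | 0.0021 | 9 | 12/1 | 8.40 | 0.4629 | 18 | 37/35 | 1.06 |
| Common SNPs (P < 1e-4 intervals) | 0.1832 | 185 | 165/146 | 1.14 | 0.9246 | 268 | 359/395 | 0.91 |
| miRNA-137 targets | 0.6643 | 140 | 98/100 | 0.99 | 0.1415 | 263 | 376/361 | 1.05 |
| Synaptic genes | | | | | | | | |
| PSD (human core) | 0.0824 | 219 | 172/145 | 1.19 | 0.0070 | 394 | 646/581 | 1.12 |
| ARC | 0.0012 | 9 | 9/0 | 19.20 | 0.0069 | 19 | 32/15 | 2.14 |
| NMDAR network | 0.0162 | 17 | 17/5 | 3.42 | 0.0003 | 34 | 76/45 | 1.70 |
| PSD-95 | 0.0018 | 16 | 17/3 | 5.10 | 0.0218 | 34 | 44/30 | 1.47 |
| mGluR5 | 0.1335 | 10 | 9/3 | 3.02 | 0.1715 | 22 | 52/36 | 1.45 |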
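b

Column prefixes: Case-unique = case-unique burden analysis; Known and Novel = known and novel variants, each tested for singletons and for MAF < 0.1%. Empty P cells are rows with no variants in that class (N = 0).

| Geneset | Case-unique: P | Case-unique: N genes(A/U) | Case-unique: A(U) | Known singletons: P | Known singletons: N | Known MAF < 0.1%: P | Known MAF < 0.1%: N | Novel singletons: P | Novel singletons: N | Novel MAF < 0.1%: P | Novel MAF < 0.1%: N |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Composite | 0.0006 | 829(275/214) | 378(297) | 0.3733 | 145 | 0.2202 | 226 | 0.0002 | 683 | 0.0005 | 744 |
| Primary | 0.0022 | 1026(325/265) | 440(367) | 0.1003 | 191 | 0.0417 | 299 | 0.0074 | 831 | 0.0058 | 910 |
| SCZ de novo genes | | | | | | | | | | | |
| Exome sequencing (disruptive) | 0.0018 | 47(16/6) | 29(12) | 0.5514 | 13 | 0.2196 | 24 | 0.0362 | 35 | 0.0010 | 40 |
| Exome sequencing (nonsyn) | 0.0037 | 371(108/80) | 159(116) | 0.3647 | 94 | 0.2064 | 144 | 0.0142 | 302 | 0.0071 | 326 |
| Copy number variants | | | | | | | | | | | |
| de novo CNV genes | 0.1267 | 79(25/17) | 32(24) | 0.0156 | 13 | 0.0116 | 24 | 0.0819 | 59 | 0.0679 | 67 |
| SCZ-associated CNV genes | 0.7971 | 90(20/23) | 24(32) | 0.0355 | 23 | 0.0069 | 34 | 0.6081 | 63 | 0.9187 | 76 |
| GWAS | | | | | | | | | | | |
| Voltage-gated calcium channel | 0.0129 | 9(5/1) | 10(1) | 0.4922 | 2 | 0.7077 | 3 | 0.0006 | 7 | 0.0022 | 8 |
| P < 1e-4 intervals | 0.1079 | 211(65/55) | 91(78) | 0.0394 | 44 | 0.0409 | 69 | 0.4425 | 164 | 0.1882 | 180 |
| miRNA-137 targets | 0.4498 | 156(52/50) | 67(60) | 0.9939 | 14 | 0.9972 | 22 | 0.3846 | 133 | 0.2757 | 147 |
| Synaptic genes | | | | | | | | | | | |
| PSD (human core) | 0.2234 | 244(92/79) | 113(109) | 0.5348 | 25 | 0.3072 | 40 | 0.1091 | 205 | 0.1629 | 226 |
| ARC | 0.0008 | 9(9/0) | 9(0) | | 0 | | 0 | 0.0016 | 9 | 0.0013 | 9 |
| NMDAR network | 0.0105 | 21(13/4) | 18(4) | 1.0000 | 1 | 0.6905 | 2 | 0.0075 | 16 | 0.0085 | 19 |
| PSD-95 | 0.0137 | 16(13/2) | 14(2) | 0.1218 | 1 | 0.1559 | 1 | 0.0034 | 15 | 0.0022 | 15 |
| mGluR5 | 0.1363 | 11(7/1) | 8(1) | | 0 | 0.1458 | 1 | 0.1427 | 10 | 0.1826 | 11 |
| Secondary (autism/ID) | 0.0916 | 1249(348/314) | 479(471) | 0.1679 | 226 | 0.3543 | 352 | 0.1834 | 1041 | 0.0807 | 1143 |
| De novo genes (exome sequencing) | | | | | | | | | | | |
| Autism (disruptive) | 0.662 | 65(17/17) | 20(26) | 0.4463 | 18 | 0.0781 | 23 | 0.6161 | 50 | 0.7009 | 56 |
| Autism (nonsyn) | 0.220 | 407(101/96) | 143(154) | 0.3198 | 89 | 0.4487 | 133 | 0.6656 | 336 | 0.4960 | 369 |
| ID (disruptive) | 0.262 | 8(4/1) | 4(2) | 1.0000 | 1 | 0.2747 | 3 | 0.3558 | 8 | 0.0578 | 8 |
| ID (nonsyn) | 0.052 | 69(22/18) | 35(28) | 0.1303 | 14 | 0.0934 | 26 | 0.5331 | 62 | 0.3368 | 66 |
| Neurodevelopmental candidates | | | | | | | | | | | |
| Betancur (2011), ASD candidates | 0.110 | 37(12/6) | 16(7) | 0.5824 | 9 | 0.7543 | 14 | 0.0484 | 24 | 0.0429 | 29 |
| Betancur (2011), ID candidates | 0.994 | 88(14/28) | 16(38) | 0.6553 | 16 | 0.7056 | 24 | 0.9488 | 74 | 0.9556 | 82 |
| Autism PPI networks | | | | | | | | | | | |
| CHD8 network | 1.000 | 1(0/1) | 0(1) | | 0 | | 0 | 1.0000 | 1 | 1.0000 | 1 |
| O’Roak et al. 49-gene network | 0.796 | 19(3/7) | 4(16) | 0.4755 | 5 | 0.7326 | 7 | 0.6081 | 30 | 0.7231 | 33 |
| O’Roak et al. 74-gene network | 0.654 | 33(6/13) | 10(28) | 0.6438 | 4 | 0.6667 | 4 | 0.7285 | 17 | 0.8411 | 19 |
| Fragile X mental retardation protein targets | | | | | | | | | | | |
| Darnell et al. targets | 0.022 | 341(131/95) | 169(133) | 0.3048 | 39 | 0.3889 | 61 | 0.0007 | 288 | 0.0022 | 309 |
| Ascano et al. targets | 0.449 | 517(134/131) | 187(200) | 0.5571 | 83 | 0.7281 | 128 | 0.5261 | 439 | 0.4089 | 482 |
| Ascano et al. FMRP/autism | 0.423 | 33(10/6) | 12(12) | 0.0384 | 5 | 0.4624 | 11 | 0.6088 | 23 | 0.3954 | 28 |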
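
For intuition about why the OR column in panel a needs a small-count estimator when a cell is zero (for example ARC, A/U = 9/0), the Python sketch below adds 0.5 to every cell of the 2×2 table (the Haldane-Anscombe correction). This is a simpler stand-in, not Firth’s penalized-likelihood method, and it is not expected to reproduce the tabulated values. The sample sizes (2,500 cases and 2,500 controls) are hypothetical, as is the function name `corrected_odds_ratio`; treating each non-reference genotype as one carrier is a further simplification.

```python
def corrected_odds_ratio(case_count, control_count, n_cases, n_controls):
    """Odds ratio with 0.5 added to every cell (Haldane-Anscombe correction)."""
    a = case_count + 0.5                   # cases with a variant
    b = n_cases - case_count + 0.5         # cases without a variant
    c = control_count + 0.5                # controls with a variant
    d = n_controls - control_count + 0.5   # controls without a variant
    return (a * d) / (b * c)


# Hypothetical sample sizes; the counts 9 and 0 are the ARC A/U values in panel a.
or_arc = corrected_odds_ratio(9, 0, n_cases=2500, n_controls=2500)
print(round(or_arc, 2))
```

Without the correction the control cell is zero and the plain odds ratio is undefined (division by zero), which is why some form of small-count estimator is needed for sets such as ARC.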