Extended Data Table 6. Further stratification of enrichment analyses by class of variant.
a. Geneset analyses for damaging missense mutations only. For the primary geneset and its 12 constituent subsets, a comparison of disruptive versus (strictly defined) damaging missense mutations, i.e. an independent set of variants. The omnibus result for the primary test is modest (P=0.04) and does not withstand correction for multiple testing: as illustrated in Figure 1 and the main text, the bulk of the enrichment signal we observe comes from (singleton) disruptive mutations. Nonetheless, specific genesets such as ARC and the NMDAR network are highly and independently enriched for missense variants. N represents the number of genes with at least one mutation of this class observed in the sample. A/U represent case/control counts of non-reference genotypes. OR represents the odds ratio (not corrected for exome-wide rates), estimated by Firth’s method for sets with small cell counts. All tests are empirical and 1-sided (higher values expected in cases), as described in the main text and methods.
b. Enrichment analyses of novel and case-unique disruptive mutations. For primary and secondary genesets (and constituent subsets) as well as the composite set: results of three alternative burden analyses. The first focuses only on genes without any control disruptive variants; no further frequency filter is imposed. Here “N genes(A/U)” indicates the number of genes with at least one disruptive variant, followed by the number of genes with case-only disruptive mutations and (for comparison) the number with control-only disruptive mutations. The “A(U)” column gives the number of case variants in the case-only genes: the test statistic is based on the empirical distribution of this count. The U in this field represents the corresponding quantity for controls (not explicitly used in the statistic). The second set of analyses represents standard burden/enrichment tests (i.e. as in Tables 2 and 4), stratified by novel versus known disruptive variants according to dbSNP and the Exome Sequencing Project/Exome Variant Server (ESP/EVS) database. Novel variants show greater enrichment, although most rare variants observed in our study (both in cases and in controls) are novel, so tests of novel variants will have greater power.
a

| Primary geneset | Disruptive singletons | | | | Damaging missense (strict) singletons | | | |
|---|---|---|---|---|---|---|---|---|
| | P | N | A/U | OR | P | N | A/U | OR |
| All primary genes | 0.0008 | 905 | 852/716 | 1.20 | 0.0393 | 1357 | 2080/2001 | 1.04 |
| SCZ de novo genes | | | | | | | | |
| Exome sequencing (disruptive) | 0.0349 | 40 | 56/38 | 1.48 | 0.5613 | 53 | 121/120 | 1.01 |
| Exome sequencing (nonsyn) | 0.0059 | 332 | 384/309 | 1.25 | 0.2776 | 393 | 750/736 | 1.02 |
| Copy number variants | | | | | | | | |
| de novo CNV genes (Kirov et al, 2012) | 0.0224 | 64 | 61/40 | 1.53 | 0.0593 | 90 | 125/112 | 1.12 |
| SCZ-associated CNV genes | 0.3378 | 72 | 65/55 | 1.19 | 0.0310 | 111 | 148/119 | 1.25 |
| GWAS | | | | | | | | |
| Voltage-gated calcium channel genes | 0.0021 | 9 | 12/1 | 8.40 | 0.4629 | 18 | 37/35 | 1.06 |
| Common SNPs (P < 1e-4 intervals) | 0.1832 | 185 | 165/146 | 1.14 | 0.9246 | 268 | 359/395 | 0.91 |
| miRNA-137 targets | 0.6643 | 140 | 98/100 | 0.99 | 0.1415 | 263 | 376/361 | 1.05 |
| Synaptic genes | | | | | | | | |
| PSD (human core) | 0.0824 | 219 | 172/145 | 1.19 | 0.0070 | 394 | 646/581 | 1.12 |
| ARC | 0.0012 | 9 | 9/0 | 19.20 | 0.0069 | 19 | 32/15 | 2.14 |
| NMDAR network | 0.0162 | 17 | 17/5 | 3.42 | 0.0003 | 34 | 76/45 | 1.70 |
| PSD-95 | 0.0018 | 16 | 17/3 | 5.10 | 0.0218 | 34 | 44/30 | 1.47 |
| mGluR5 | 0.1335 | 10 | 9/3 | 3.02 | 0.1715 | 22 | 52/36 | 1.45 |
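
For sets with very small cell counts (for example ARC, with 9 case and 0 control disruptive singletons), a raw odds ratio is undefined because one cell of the 2×2 table is zero. The sketch below illustrates how a bias-reduced estimate stays finite in that situation. It does not reproduce the Firth penalized-likelihood fit used for the table: it uses the simpler device of adding 0.5 to every cell. It also assumes that each A/U count corresponds to one distinct carrier, and the sample sizes (2,500 per group) are hypothetical placeholders, not the study's.

```python
def corrected_odds_ratio(case_carriers, control_carriers, n_cases, n_controls):
    """Odds ratio for carrier status with 0.5 added to every cell.

    Illustrative stand-in for a bias-reduced estimate; NOT Firth's method.
    All four arguments are plain counts supplied by the caller.
    """
    a = case_carriers + 0.5                      # case carriers
    b = (n_cases - case_carriers) + 0.5          # case non-carriers
    c = control_carriers + 0.5                   # control carriers
    d = (n_controls - control_carriers) + 0.5    # control non-carriers
    return (a * d) / (b * c)


# ARC disruptive singletons from panel a: A/U = 9/0.
# n_cases and n_controls are HYPOTHETICAL values chosen for illustration.
arc_or = corrected_odds_ratio(9, 0, n_cases=2500, n_controls=2500)
print(f"corrected OR for a 9/0 split: {arc_or:.2f}")
```

With a zero control count the uncorrected ratio would require division by zero, whereas the corrected estimate is large but finite, which is the qualitative behaviour seen for the smallest sets in panel a.
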
b

| Geneset | Case-unique burden analysis | | | Known variants | | | | Novel variants | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | Singletons | | MAF < 0.1% | | Singletons | | MAF < 0.1% | |
| | P | N genes(A/U) | A(U) | P | N | P | N | P | N | P | N |
| Composite | 0.0006 | 829(275/214) | 378(297) | 0.3733 | 145 | 0.2202 | 226 | 0.0002 | 683 | 0.0005 | 744 |
| Primary | 0.0022 | 1026(325/265) | 440(367) | 0.1003 | 191 | 0.0417 | 299 | 0.0074 | 831 | 0.0058 | 910 |
| SCZ de novo genes | | | | | | | | | | | |
| Exome sequencing (disruptive) | 0.0018 | 47(16/6) | 29(12) | 0.5514 | 13 | 0.2196 | 24 | 0.0362 | 35 | 0.0010 | 40 |
| Exome sequencing (nonsyn) | 0.0037 | 371(108/80) | 159(116) | 0.3647 | 94 | 0.2064 | 144 | 0.0142 | 302 | 0.0071 | 326 |
| Copy number variants | | | | | | | | | | | |
| de novo CNV genes | 0.1267 | 79(25/17) | 32(24) | 0.0156 | 13 | 0.0116 | 24 | 0.0819 | 59 | 0.0679 | 67 |
| SCZ-associated CNV genes | 0.7971 | 90(20/23) | 24(32) | 0.0355 | 23 | 0.0069 | 34 | 0.6081 | 63 | 0.9187 | 76 |
| GWAS | | | | | | | | | | | |
| Voltage-gated calcium channel | 0.0129 | 9(5/1) | 10(1) | 0.4922 | 2 | 0.7077 | 3 | 0.0006 | 7 | 0.0022 | 8 |
| P < 1e-4 intervals | 0.1079 | 211(65/55) | 91(78) | 0.0394 | 44 | 0.0409 | 69 | 0.4425 | 164 | 0.1882 | 180 |
| miRNA-137 targets | 0.4498 | 156(52/50) | 67(60) | 0.9939 | 14 | 0.9972 | 22 | 0.3846 | 133 | 0.2757 | 147 |
| Synaptic genes | | | | | | | | | | | |
| PSD (human core) | 0.2234 | 244(92/79) | 113(109) | 0.5348 | 25 | 0.3072 | 40 | 0.1091 | 205 | 0.1629 | 226 |
| ARC | 0.0008 | 9(9/0) | 9(0) | | 0 | | 0 | 0.0016 | 9 | 0.0013 | 9 |
| NMDAR network | 0.0105 | 21(13/4) | 18(4) | 1.0000 | 1 | 0.6905 | 2 | 0.0075 | 16 | 0.0085 | 19 |
| PSD-95 | 0.0137 | 16(13/2) | 14(2) | 0.1218 | 1 | 0.1559 | 1 | 0.0034 | 15 | 0.0022 | 15 |
| mGluR5 | 0.1363 | 11(7/1) | 8(1) | | 0 | 0.1458 | 1 | 0.1427 | 10 | 0.1826 | 11 |
| Secondary (autism/ID) | 0.0916 | 1249(348/314) | 479(471) | 0.1679 | 226 | 0.3543 | 352 | 0.1834 | 1041 | 0.0807 | 1143 |
| De novo genes (exome sequencing) | | | | | | | | | | | |
| Autism (disruptive) | 0.662 | 65(17/17) | 20(26) | 0.4463 | 18 | 0.0781 | 23 | 0.6161 | 50 | 0.7009 | 56 |
| Autism (nonsyn) | 0.220 | 407(101/96) | 143(154) | 0.3198 | 89 | 0.4487 | 133 | 0.6656 | 336 | 0.4960 | 369 |
| ID (disruptive) | 0.262 | 8(4/1) | 4(2) | 1.0000 | 1 | 0.2747 | 3 | 0.3558 | 8 | 0.0578 | 8 |
| ID (nonsyn) | 0.052 | 69(22/18) | 35(28) | 0.1303 | 14 | 0.0934 | 26 | 0.5331 | 62 | 0.3368 | 66 |
| Neurodevelopmental candidates | | | | | | | | | | | |
| Betancur (2011), ASD candidates | 0.110 | 37(12/6) | 16(7) | 0.5824 | 9 | 0.7543 | 14 | 0.0484 | 24 | 0.0429 | 29 |
| Betancur (2011), ID candidates | 0.994 | 88(14/28) | 16(38) | 0.6553 | 16 | 0.7056 | 24 | 0.9488 | 74 | 0.9556 | 82 |
| Autism PPI networks | | | | | | | | | | | |
| CHD8 network | 1.000 | 1(0/1) | 0(1) | | 0 | | 0 | 1.0000 | 1 | 1.0000 | 1 |
| O’Roak et al. 49-gene network | 0.796 | 19(3/7) | 4(16) | 0.4755 | 5 | 0.7326 | 7 | 0.6081 | 30 | 0.7231 | 33 |
| O’Roak et al. 74-gene network | 0.654 | 33(6/13) | 10(28) | 0.6438 | 4 | 0.6667 | 4 | 0.7285 | 17 | 0.8411 | 19 |
| Fragile X mental retardation protein targets | | | | | | | | | | | |
| Darnell et al. targets | 0.022 | 341(131/95) | 169(133) | 0.3048 | 39 | 0.3889 | 61 | 0.0007 | 288 | 0.0022 | 309 |
| Ascano et al. targets | 0.449 | 517(134/131) | 187(200) | 0.5571 | 83 | 0.7281 | 128 | 0.5261 | 439 | 0.4089 | 482 |
| Ascano et al. FMRP/autism | 0.423 | 33(10/6) | 12(12) | 0.0384 | 5 | 0.4624 | 11 | 0.6088 | 23 | 0.3954 | 28 |
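
The case-unique statistic described for panel b (the number of case variants falling in genes with no control disruptive variant, evaluated against its empirical distribution) can be sketched as follows. This is a minimal illustration, not the authors' implementation: shuffling case/control labels across individuals, the add-one correction to the empirical P value, and the toy variant list are all assumptions made for the example.

```python
import random


def case_unique_count(variants, is_case):
    """Count case variants in genes that carry no control variant.

    variants: list of (gene, sample_id) pairs, one per observed variant.
    is_case:  dict mapping sample_id -> True (case) or False (control).
    """
    genes_with_control = {gene for gene, sample in variants if not is_case[sample]}
    return sum(
        1
        for gene, sample in variants
        if is_case[sample] and gene not in genes_with_control
    )


def empirical_p(variants, is_case, n_perm=1000, seed=0):
    """One-sided empirical P: how often a permuted count is >= the observed count.

    Labels are shuffled across individuals (an assumed permutation scheme).
    """
    rng = random.Random(seed)
    observed = case_unique_count(variants, is_case)
    samples = list(is_case)
    labels = [is_case[s] for s in samples]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        permuted = dict(zip(samples, labels))
        if case_unique_count(variants, permuted) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): four cases (c1..c4) and four controls (u1..u4).
toy_is_case = {f"c{i}": True for i in range(1, 5)}
toy_is_case.update({f"u{i}": False for i in range(1, 5)})
toy_variants = [
    ("G1", "c1"), ("G1", "c2"),   # case-only gene, two case variants
    ("G2", "c3"),                 # case-only gene, one case variant
    ("G3", "u1"), ("G3", "c4"),   # gene also hit in a control: excluded
    ("G4", "u2"),                 # control-only gene
]

print(empirical_p(toy_variants, toy_is_case))
```

Because only the case count enters the statistic, the control quantity reported in the A(U) column serves as a descriptive comparison rather than as an input to the test.
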