Author manuscript; available in PMC: 2014 Aug 18.

Published in final edited form as: Nature. 2014 Jan 22;506(7487):185–190. doi: 10.1038/nature12975

Extended Data Table 6. Further stratification of enrichment analyses by class of variant.

a. Geneset analyses for damaging missense mutations only. For the primary geneset and the 12 constituent subsets, a comparison of disruptive versus (strictly defined) damaging missense mutations, i.e. an independent set of variants. The omnibus result for the primary test is modest (P=0.04) and did not withstand correction for multiple testing: as illustrated in Figure 1 and the main text, the bulk of the enrichment signal we observe comes from (singleton) disruptive mutations. Nonetheless, specific genesets such as ARC and the NMDAR network are highly and independently enriched for missense variants. N represents the number of genes with at least one mutation of this class observed in the sample. A/U represents case/control counts of non-reference genotypes. OR represents the odds ratio (not corrected for exome-wide rates), estimated by Firth’s method for sets with small cell counts. All tests are empirical and 1-sided (higher values expected in cases) as described in the main text and methods.

b. Enrichment analyses of novel and case-unique disruptive mutations. For primary and secondary genesets (and constituent subsets) as well as the composite set: results of three alternative burden analyses. The first focuses only on genes without any control disruptive variants; no further frequency filter is imposed. Here “N genes(A/U)” indicates the number of genes with at least one disruptive variant, followed by the number of genes with case-only disruptive mutations and (for comparison) the number with control-only disruptive mutations. The “A(U)” column gives the number of case variants in the case-only genes: the test statistic is based on the empirical distribution of this count. The U in this field represents the similar quantity for controls (not explicitly used in the statistic). The second set of analyses represents standard burden/enrichment tests (i.e. as Tables 2 and 4) but stratified for novel versus known disruptive variants, according to dbSNP and the Exome Sequencing Project/Exome Variant Server (ESP/EVS) database. Novel variants show greater enrichment, although most rare variants observed in our study (both in cases and in controls) are novel, so tests of novel variants will have greater power.
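The legend describes every test as empirical and 1-sided, with higher values expected in cases. As a rough illustration of that general idea only, the sketch below permutes case/control labels and counts how often the permuted case total reaches the observed one. It is not the procedure used for this table: the function name, its inputs, and the add-one correction are assumptions made for illustration, and the actual analyses (described in the main text and methods) involve details that are not reproduced here.

```python
import random


def empirical_one_sided_p(counts, is_case, n_perm=10000, seed=0):
    """Hypothetical helper: one-sided permutation p-value for a case excess.

    counts  : per-individual number of qualifying variants in a geneset
    is_case : per-individual booleans (True = case, False = control)
    """
    rng = random.Random(seed)
    labels = list(is_case)
    # Test statistic: total number of variants carried by cases.
    observed = sum(c for c, case in zip(counts, labels) if case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # permute case/control labels
        permuted = sum(c for c, case in zip(counts, labels) if case)
        if permuted >= observed:  # 1-sided: higher values expected in cases
            hits += 1
    # Add-one correction (an assumption here) keeps the estimate above zero.
    return (hits + 1) / (n_perm + 1)
```

Because the statistic only asks whether permuted totals are at least as large as the observed case total, a geneset with more variants in controls than in cases yields a p-value near 1 rather than a small one.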

a
Primary geneset	Disruptive singletons				Damaging missense (strict) singletons
Primary geneset	P	N	A/U	OR	P	N	A/U	OR
All primary genes	0.0008	905	852/716	1.20	0.0393	1357	2080/2001	1.04
SCZ de novo genes
Exome sequencing (disruptive)	0.0349	40	56/38	1.48	0.5613	53	121/120	1.01
Exome sequencing (nonsyn)	0.0059	332	384/309	1.25	0.2776	393	750/736	1.02
Copy number variants
de novo CNV genes (Kirov et al, 2012)	0.0224	64	61/40	1.53	0.0593	90	125/112	1.12
SCZ-associated CNV genes	0.3378	72	65/55	1.19	0.0310	111	148/119	1.25
GWAS
Voltage-gated calcium channel genes	0.0021	9	12/1	8.40	0.4629	18	37/35	1.06
Common SNPs (P < 1e-4 intervals)	0.1832	185	165/146	1.14	0.9246	268	359/395	0.91
miRNA-137 targets	0.6643	140	98/100	0.99	0.1415	263	376/361	1.05
Synaptic genes
PSD (human core)	0.0824	219	172/145	1.19	0.0070	394	646/581	1.12
ARC	0.0012	9	9/0	19.20	0.0069	19	32/15	2.14
NMDAR network	0.0162	17	17/5	3.42	0.0003	34	76/45	1.70
PSD-95	0.0018	16	17/3	5.10	0.0218	34	44/30	1.47
mGluR5	0.1335	10	9/3	3.02	0.1715	22	52/36	1.45

b
Geneset	Case-unique burden analysis			Known variants				Novel variants
				Singletons		MAF < 0.1%		Singletons		MAF < 0.1%
	P	N genes(A/U)	A(U)	P	N	P	N	P	N	P	N
Composite	0.0006	829(275/214)	378(297)	0.3733	145	0.2202	226	0.0002	683	0.0005	744
Primary	0.0022	1026(325/265)	440(367)	0.1003	191	0.0417	299	0.0074	831	0.0058	910
SCZ de novo genes
Exome sequencing (disruptive)	0.0018	47(16/6)	29(12)	0.5514	13	0.2196	24	0.0362	35	0.0010	40
Exome sequencing (nonsyn)	0.0037	371(108/80)	159(116)	0.3647	94	0.2064	144	0.0142	302	0.0071	326
Copy number variants
de novo CNV genes	0.1267	79(25/17)	32(24)	0.0156	13	0.0116	24	0.0819	59	0.0679	67
SCZ-associated CNV genes	0.7971	90(20/23)	24(32)	0.0355	23	0.0069	34	0.6081	63	0.9187	76
GWAS
Voltage-gated calcium channel	0.0129	9(5/1)	10(1)	0.4922	2	0.7077	3	0.0006	7	0.0022	8
P < 1e-4 intervals	0.1079	211(65/55)	91(78)	0.0394	44	0.0409	69	0.4425	164	0.1882	180
miRNA-137 targets	0.4498	156(52/50)	67(60)	0.9939	14	0.9972	22	0.3846	133	0.2757	147
Synaptic genes
PSD (human core)	0.2234	244(92/79)	113(109)	0.5348	25	0.3072	40	0.1091	205	0.1629	226
ARC	0.0008	9(9/0)	9(0)		0		0	0.0016	9	0.0013	9
NMDAR network	0.0105	21(13/4)	18(4)	1.0000	1	0.6905	2	0.0075	16	0.0085	19
PSD-95	0.0137	16(13/2)	14(2)	0.1218	1	0.1559	1	0.0034	15	0.0022	15
mGluR5	0.1363	11(7/1)	8(1)		0	0.1458	1	0.1427	10	0.1826	11
Secondary (autism/ID)	0.0916	1249(348/314)	479(471)	0.1679	226	0.3543	352	0.1834	1041	0.0807	1143
De novo genes (exome sequencing)
Autism (disruptive)	0.662	65(17/17)	20(26)	0.4463	18	0.0781	23	0.6161	50	0.7009	56
Autism (nonsyn)	0.220	407(101/96)	143(154)	0.3198	89	0.4487	133	0.6656	336	0.4960	369
ID (disruptive)	0.262	8(4/1)	4(2)	1.0000	1	0.2747	3	0.3558	8	0.0578	8
ID (nonsyn)	0.052	69(22/18)	35(28)	0.1303	14	0.0934	26	0.5331	62	0.3368	66
Neurodevelopmental candidates
Betancur (2011), ASD candidates	0.110	37(12/6)	16(7)	0.5824	9	0.7543	14	0.0484	24	0.0429	29
Betancur (2011), ID candidates	0.994	88(14/28)	16(38)	0.6553	16	0.7056	24	0.9488	74	0.9556	82
Autism PPI networks
CHD8 network	1.000	1(0/1)	0(1)		0		0	1.0000	1	1.0000	1
O’Roak et al. 49-gene network	0.796	19(3/7)	4(16)	0.4755	5	0.7326	7	0.6081	30	0.7231	33
O’Roak et al. 74-gene network	0.654	33(6/13)	10(28)	0.6438	4	0.6667	4	0.7285	17	0.8411	19
Fragile X mental retardation protein targets
Darnell et al. targets	0.022	341(131/95)	169(133)	0.3048	39	0.3889	61	0.0007	288	0.0022	309
Ascano et al. targets	0.449	517(134/131)	187(200)	0.5571	83	0.7281	128	0.5261	439	0.4089	482
Ascano et al. FMRP/autism	0.423	33(10/6)	12(12)	0.0384	5	0.4624	11	0.6088	23	0.3954	28