Meta-analysis of genome-wide association study data identifies additional type 1 diabetes loci

Jason D Cooper; Deborah J Smyth; Adam M Smiles; Vincent Plagnol; Neil M Walker; James Allen; Kate Downes; Jeffrey C Barrett; Barry Healy; Josyf C Mychaleckyj; James H Warram; John A Todd

doi:10.1038/ng.249

Author manuscript; available in PMC: 2009 Jun 1.

Published in final edited form as: Nat Genet. 2008 Nov 2;40(12):1399–1401. doi: 10.1038/ng.249

PMCID: PMC2635556 EMSID: UKMS2312 PMID: 18978792

Abstract

To provide more power to detect type 1 diabetes (T1D) loci, we performed a meta-analysis of data from three genome-wide association (GWA) studies. We tested 305,090 SNPs in 3,561 T1D cases and 4,646 controls of European ancestry. We obtained further support for 4q27/IL2-IL21 (P = 1.9×10^-8) and, after genotyping 6,225 cases, 6,946 controls and 2,828 families, convincing evidence for four previously unknown and distinct loci in chromosome regions 6q15/BACH2 (4.7×10^-12), 10p15/PRKCQ (3.7×10^-9), 15q24/CTSH (3.2×10^-15) and 22q13/C1QTNF6 (2.0×10^-8).

In the present study, we undertook a meta-analysis of three GWA studies, combining the British Wellcome Trust Case Control Consortium (WTCCC)¹ T1D case-control data with 1,785 American T1D cases from the Genetics of Kidneys in Diabetes (GoKinD) study²,³ and 1,727 American controls from the National Institute of Mental Health (NIMH). All samples had been genotyped using the Affymetrix 500K SNP chipset, but each study had used a different scoring algorithm, potentially introducing a differential bias in genotype calling between American cases and controls and hence false-positive associations⁴. Consequently, we started the analysis by re-scoring the American case-control data (Supplementary Methods).

After re-scoring the American case-control data, for consistency between the studies, we also updated the WTCCC and NIMH SNP information to NCBI Human Genome Build 36 and aligned the SNP alleles between studies. We then applied SNP and sample quality control filters to each study (Supplementary Methods). After applying clustering quality and minimum minor allele frequency filters, 330,183 WTCCC and 335,565 GoKinD/NIMH SNPs remained. We note that inspection of allele signal intensity plots was still required for the GoKinD/NIMH SNPs (Supplementary Methods). The sample quality control filters consisted of excluding duplicate samples, first or second degree relatives, samples with low heterozygosity and samples with substantial non-European ancestry (Supplementary Methods), all deduced from their genotype. This resulted in 3,561 cases (1,960 British and 1,601 American) and 4,646 controls (2,942 British and 1,704 American).

We analysed the American case-control data separately in order to gauge their quality and suitability for inclusion in the meta-analysis. To control for population structure within the American data, we used a propensity score derived from principal components (Supplementary Methods), which reduced the inflation of the test statistic from 18% to 14%. We convincingly detected (least significant P = 9.2×10^-4) eight of the ten confirmed T1D associated regions⁵ (1p13/PTPN22, 2q33/CTLA4, 6p21/HLA, 10p15/IL2RA, 12q13/ERBB3, 12q24/C12orf30, 16p13/CLEC16A and 18p11/PTPN2; Supplementary Table 1). The remaining regions were 2q24/IFIH1 (P = 0.020) and 11p15/INS, as the closest SNP to the INS gene on the Affymetrix 500K SNP chipset¹ was not available for American cases (Supplementary Note). Although no new T1D loci at genome-wide levels of significance (P < 5.0×10^-7; ref.¹) were evident (Supplementary Figure 1), we did find additional support for the previously detected but unconfirmed⁵ 4q27/IL2-IL21 region at P = 4.8×10^-3 (Supplementary Table 2).
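As an illustration of what "inflation of the test statistic" means, the sketch below computes an inflation factor in the way it is commonly estimated: the median of the observed 1-df test statistics divided by the median of a 1-df chi-square distribution. Whether the authors used exactly this estimator is an assumption (the details are in the Supplementary Methods), and the statistics in `toy_stats` are made up for the example.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Observed median 1-df statistic divided by its expectation under the null."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Toy statistics, NOT taken from the study; chosen so the median is 0.5186.
toy_stats = [0.02, 0.21, 0.5186, 1.4, 3.9]
lam = inflation_factor(toy_stats)
print(f"lambda = {lam:.2f} ({(lam - 1) * 100:.0f}% inflation)")
```

A factor of 1.14 corresponds to the "14%" inflation quoted above; a factor of 1.00 would indicate no inflation.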

We then performed a meta-analysis of the evidence for 305,090 SNPs available in both the British and American studies. To produce an overall score test for these SNPs, we summed the score statistics and score variances from the British and American case-control analyses. As expected, the meta-analysis was dominated by known T1D associated regions (Supplementary Table 1 and Supplementary Figure 1); the test statistic inflation was 12% (Supplementary Figure 2). Despite very different population backgrounds (Great Britain, GB, versus USA) and ascertainment (paediatric T1D clinics versus longstanding T1D cases with or without diabetic nephropathy) there was no evidence of heterogeneity in the ten previously established T1D loci (Supplementary Table 1). The combined evidence for 4q27/IL2-IL21 was P = 1.9×10^-8, suggesting that this is a true T1D locus (Supplementary Table 2).
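The combination step described above (summing score statistics and score variances across studies) can be sketched as follows. The function names and the numeric inputs are hypothetical; only the rule "sum the scores, sum the variances, then form a 1-df test" comes from the text.

```python
import math


def score_test(u, v):
    """1-df score test: chi-square = U^2 / V, with its two-sided P-value."""
    chi2 = u * u / v
    # For 1 df, P(chi-square > x) = erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


def combine_studies(studies):
    """Sum per-study score statistics (U) and score variances (V), then test."""
    u_total = sum(u for u, _ in studies)
    v_total = sum(v for _, v in studies)
    return score_test(u_total, v_total)


# Made-up (U, V) pairs for one SNP in two studies; not values from the paper.
british = (30.0, 100.0)
american = (25.0, 80.0)

chi2_gb, p_gb = score_test(*british)
chi2_us, p_us = score_test(*american)
chi2_all, p_all = combine_studies([british, american])
print(f"British  P = {p_gb:.2e}")
print(f"American P = {p_us:.2e}")
print(f"Combined P = {p_all:.2e}")
```

Because the scores are summed with their signs, two studies whose effects point in the same direction reinforce each other, while effects in opposite directions cancel.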

Although no new T1D loci were associated at genome-wide levels of significance, we followed up the most associated SNPs by genotyping an additional 6,225 case and 6,946 control samples from GB. We removed the known T1D loci and SNPs with an r² ≥ 0.1 with them, and shortlisted the top 30 ranked SNPs (least significant P = 1.2×10^-5 and corrected for 12% inflation, P = 3.4×10^-5) for follow-up (Supplementary Table 3). A further 11 SNPs from 12q24 and five SNPs from 1p13 were removed as their associations were explained by known T1D loci, rs3184504/SH2B3 (ref.⁵) (Supplementary Methods) and rs2476601/PTPN22 (ref.⁶) respectively. A SNP on 10p15, rs947474/PRKCQ, which lies 260 kb centromeric of the T1D associated IL2RA region⁷, proved to be independently associated by regression analysis (Supplementary Table 4) and is separated from that region by a number of recombination hotspots (Supplementary Figure 3). In addition, we note that we were unable to find any evidence of an extended haplotype connecting the IL2RA region with rs947474/PRKCQ (data not shown). A further four SNPs, despite passing SNP quality control filters (Supplementary Methods), were excluded after inspection of the genotype signal intensity plots (Supplementary Table 3). As four of the remaining ten SNPs were from 6q15/BACH2, we genotyped a single BACH2 SNP, giving seven SNPs in total (Table 1).
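The relationship between the uncorrected and inflation-corrected thresholds quoted above (P = 1.2×10^-5 and P = 3.4×10^-5) can be approximately reproduced by dividing the 1-df test statistic by the inflation factor. The sketch below assumes this genomic-control style correction; the exact procedure is described in the Supplementary Methods and may differ in detail, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def adjust_p_for_inflation(p, lam):
    """Rescale a two-sided 1-df P-value by an inflation factor lam.

    Assumption: the correction divides the chi-square statistic by lam.
    """
    # Convert the two-sided P-value back to |z|, so that chi-square = z^2.
    z = -NormalDist().inv_cdf(p / 2.0)
    chi2_adjusted = z * z / lam
    # For 1 df, P(chi-square > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2_adjusted / 2.0))


p_adj = adjust_p_for_inflation(1.2e-5, 1.12)
print(f"adjusted P = {p_adj:.1e}")
```

With a 12% inflation factor the adjusted value lands in the low 10^-5 range, in line with the corrected figure in the text; small differences are expected from rounding of the published numbers.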

Table 1.

A summary of the results for the seven most associated SNPs found in the meta-analysis and followed up in additional samples. We report the maximum number of case, control and family samples with genotype data. The MAF was estimated in a minimum of 9,647 controls from the British 1958 Birth Cohort and UK Blood Services. The signal intensity plots are shown in Supplementary Figure 7. A 2-df test is reported when there was a significant difference between the genotypic effects model and the multiplicative allelic effects model. The SNP genotypes are reported in Supplementary Table 5. We used one-tailed tests for the family data (that is, the null hypothesis was not rejected unless the effect was in the same direction as the original study).

Chromosome	Gene region	SNP	MAF in British controls	GWA, British (1,960 cases, 2,942 controls): P (1-df)	GWA, British: OR (95% c.i.)	GWA, American (1,601 cases, 1,704 controls): P (1-df)	GWA, American: OR (95% c.i.)	GWA, British and American: P (1-df)	Follow-up, British (6,225 cases, 6,946 controls): P (1-df)	Follow-up, British: OR (95% c.i.)	Follow-up, 2,828 families (3,064 parent-child trios): P (1-df)	Follow-up, families: RR (95% c.i.)	Combined P (1-df)

2p23	intergenic region	rs2165738 G>C	0.270	0.0157	1.12 (1.02-1.23)	6.09×10^-5	1.26 (1.12-1.40)	1.03×10^-5	0.0147	1.07 (1.01-1.13)	N/A		3.65×10^-6
5q34	gene desert	rs6887079 T>C	0.498	1.68×10^-3	1.14 (1.05-1.24)	5.60×10^-4	1.19 (1.08-1.31)	3.95×10^-6	0.454	1.02 (0.97-1.07)	N/A		0.465
	(2-df test)			8.56×10^-4
6q15	BACH2 ^†	rs11755527 C>G	0.465	3.00×10^-3	1.13 (1.04-1.23)	4.16×10^-4	1.20 (1.08-1.33)	6.07×10^-6	6.93×10^-7	1.13 (1.08-1.19)	0.0185	1.08 (1.01-1.16)	4.66×10^-12
	(2-df test)								2.35×10^-8
10p15	PRKCQ	rs947474 A>G	0.187	2.03×10^-4	0.81 (0.73-0.91)	4.56×10^-3	0.83 (0.73-0.94)	3.34×10^-6	5.49×10^-3	0.91 (0.85-0.97)	1.32×10^-3	0.86 (0.78-0.95)	3.65×10^-9
15q24	CTSH	rs3825932 T>C	0.318	3.56×10^-3	0.87 (0.80-0.96)	5.47×10^-4	0.83 (0.74-0.92)	8.93×10^-6	8.67×10^-8	0.86 (0.82-0.91)	2.29×10^-4	0.86 (0.80-0.93)	3.17×10^-15
16p13	C16orf75, PRM3, TNP2	rs416603 A>T	0.440	8.01×10^-4	0.87 (0.80-0.94)	2.02×10^-3	0.85 (0.77-0.94)	5.61×10^-6	0.0158	0.94 (0.89-0.99)	N/A		2.63×10^-6
22q13	C1QTNF6	rs229541 C>T	0.427	1.06×10^-3	1.15 (1.06-1.25)	3.36×10^-3	1.16 (1.05-1.29)	1.16×10^-5	8.10×10^-5	1.11 (1.05-1.16)	0.146	1.04 (0.97-1.12)	1.98×10^-8

† The most associated BACH2 SNP was rs619192 (meta-analysis P-value = 3.98×10^-6), which was in perfect linkage disequilibrium (r² = 1) with rs11755527, genotyped based on an initial meta-analysis. MAF = minor allele frequency, df = degree-of-freedom, OR = odds ratio for minor allele, c.i. = confidence interval, N/A = not attempted.
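As a rough consistency check when reading Table 1, an odds ratio and its 95% confidence interval can be converted back into an approximate z statistic and P-value. The sketch below assumes a symmetric Wald-type interval on the log odds ratio scale, which is an assumption rather than something stated in the text; because the table values are rounded to two decimals, only approximate agreement with the reported P-values should be expected.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile


def p_from_or_ci(odds_ratio, ci_low, ci_high):
    """Approximate z and two-sided P from an OR and its 95% c.i. (log scale)."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Follow-up case-control result for rs11755527/BACH2 from Table 1:
# OR = 1.13 (1.08-1.19), reported P = 6.93×10^-7.
z, p = p_from_or_ci(1.13, 1.08, 1.19)
print(f"z = {z:.2f}, approximate P = {p:.1e}")
```

The recovered P-value is of the same order of magnitude as the reported one, which is as much as the rounded inputs allow.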

We obtained additional support for four of the seven associations: rs11755527/BACH2 on 6q15 (P = 6.9×10^-7; OR for minor allele G = 1.13); rs947474/PRKCQ on 10p15 (P = 5.5×10^-3; OR for minor allele G = 0.91); rs3825932/CTSH on 15q24 (P = 8.7×10^-8; OR for minor allele C = 0.86); and rs229541/C1QTNF6 on 22q13 (P = 8.1×10^-5; OR for minor allele T = 1.11) (Table 1). In addition, to validate the genotyping, we regenotyped the seven SNPs in a minimum of 1,771 case and 2,756 control samples used in the WTCCC; concordance was 99.3% or better.

We also genotyped the four most associated SNPs, rs11755527/BACH2, rs947474/PRKCQ, rs3825932/CTSH and rs229541/C1QTNF6, in 871 multiplex and 1,957 simplex families, resulting in P-values of 0.019, 1.3×10^-3, 2.3×10^-4 and 0.15 respectively (Table 1). The overall, combined P-values (2.0×10^-8 to 3.2×10^-15; Table 1) provided convincing support for four, previously undetected and distinct, T1D loci.

There are several interesting functional candidate genes located within the four new associated regions (Supplementary Methods). The 365 kb associated region on 6q15 contains only one gene, BACH2, intron 3 of which contains rs11755527. BACH2 encodes BTB and CNC homology 1, basic leucine zipper transcription factor 2, which has a role as a key regulator of nucleic acid-triggered antiviral responses in human cells⁸ and is highly expressed in B cells⁹ (GNF SymAtlas¹⁰) (Supplementary Figure 4). In the 234 kb associated region of 10p15, the gene protein kinase C, theta (PRKCQ) is 79 kb telomeric of rs947474 (Supplementary Figure 3). PRKCQ controls several fundamental processes in T cell biology, including integration of T cell receptor (TCR) and CD28 signalling, leading to activation of transcription factors (NF-κB and AP-1)¹¹. PRKCQ deficient mice display defects in the differentiation of T helper subsets, particularly in Th2 and Th17 mediated inflammatory responses¹¹. Furthermore, its selective role in T cell effector function makes PRKCQ an attractive therapeutic target in T cell mediated disease processes¹².

In the 660 kb associated region of 15q24, rs3825932 is located in intron 1 of cathepsin H (CTSH); the region contains eight other genes (Supplementary Figure 5). On 22q13, rs229541 is located between the genes C1q and tumor necrosis factor related protein 6 (C1QTNF6) and somatostatin receptor 3 (SSTR3), in a 125 kb associated region that contains two other genes (Supplementary Figure 6). The meta-analysis results suggest that coding and intronic sequences of the strong candidate gene IL2RB, 56 kb centromeric of rs229541, are not associated with T1D, as we previously reported⁵, although causal variants could affect the function of regulatory sequences that control the expression of genes hundreds of kb away.

To conclude, we present convincing evidence for four, previously undetected and distinct, T1D loci (6q15/BACH2, 10p15/PRKCQ, 15q24/CTSH and 22q13/C1QTNF6). In addition, we provide further support for 4q27/IL2-IL21, increasing the total of T1D loci with convincing evidence from ten (refs.⁵,¹³,¹⁴) to 15 (including the HLA region). The evidence for these new T1D loci was obtained by forming an American case-control GWA study from existing data and incorporating these data into a meta-analysis with the British (WTCCC) data, followed by independent replication in cases and controls, and with some success, in families.

Recently, an additional locus on chromosome 21q22.3, including the UBASH3A gene, was reported by a Type 1 Diabetes Genetics Consortium (T1DGC) study¹⁵. In this study, we have demonstrated the effectiveness of combining the evidence from GWA studies to find disease loci with typical effect sizes (OR < 1.2), and that GWA studies can be successfully formed using case and control data from different studies, provided that allele signal intensity data are available for recalling and checking the SNP genotypes.

Supplementary Material

NIHMS2312-supplement-Supplementary_.pdf (510.3 KB, pdf)

Acknowledgements

This work was funded by the Juvenile Diabetes Research Foundation International, the Wellcome Trust and the National Institute for Health Research Cambridge Biomedical Centre. The Cambridge Institute for Medical Research (CIMR) is in receipt of a Wellcome Trust Strategic Award (079895). We gratefully acknowledge the participation of all the patients, control subjects and family members.

We gratefully acknowledge David Clayton for methodology advice and comments on the manuscript.

We acknowledge use of DNA from the British 1958 Birth Cohort collection, funded by the Medical Research Council and the Wellcome Trust. We thank The Avon Longitudinal Study of Parents and Children laboratory in Bristol and the British 1958 Birth Cohort team, including S. Ring, R. Jones, M. Pembrey, W. McArdle, D. Strachan and P. Burton for preparing and providing the control DNA samples. We also thank H. Stevens, P. Clarke, G. Coleman, S. Duley, D. Harrison, S. Hawkins, M. Maisuria, T. Mistry and N. Taylor for preparation of DNA samples.

We acknowledge use of DNA from the Human Biological Data Interchange and Diabetes UK for the USA and UK multiplex families, respectively; the Norwegian Study Group for Childhood Diabetes (D. Undlien and K. Ronningen) for the Norwegian families; D. Savage, C. Patterson, D. Carson and P. Maxwell for the Northern Irish families; Genetics of Type 1 Diabetes in Finland (GET1FIN) (J. Tuomilehto, L. Kinnunen, E. Tuomilehto-Wolf, V. Harjutsalo and T. Valle) for the Finnish families; and C. Guja and C. Ionescu-Tirgoviste for the Romanian families.

WTCCC:

This study makes use of data generated by the Wellcome Trust Case Control Consortium. A full list of the investigators who contributed to the generation of the data is available from www.wtccc.org.uk. Funding for the project was provided by the Wellcome Trust under award 076113 (see Nature 2007; 447; 661-78).

NIMH:

We gratefully acknowledge the National Institute of Mental Health for generously allowing the use of their control CEL and genotype data. Biomaterials and phenotypic data were obtained from the following projects that participated in the NIMH Control Samples:

Control subjects from the National Institute of Mental Health Schizophrenia Genetics Initiative (NIMH-GI), data and biomaterials are being collected by the “Molecular Genetics of Schizophrenia II” (MGS-2) collaboration. The Investigators and co-investigators are: ENH/Northwestern University, Evanston, IL, MH059571, Pablo V. Gejman, M.D. (Collaboration Coordinator: PI), Alan R. Sanders, M.D.; Emory University School of Medicine, Atlanta, GA, MH59587, Farooq Amin, M.D. (PI); Louisiana State University Health Sciences Center; New Orleans, Louisiana, MH067257, Nancy Buccola APRN, BC, MSN (PI); University of California-Irvine, Irvine, CA, MH60870, William Byerley, M.D. (PI); Washington University, St. Louis, MO, U01, MH060879, C. Robert Cloninger, M.D. (PI); University of Iowa, Iowa, IA, MH59566, Raymond Crowe, M.D. (PI), Donald Black, M.D.; University of Colorado, Denver, CO, MH059565, Robert Freedman, M.D. (PI); University of Pennsylvania, Philadelphia, PA, MH061675, Douglas Levinson M.D. (PI); University of Queensland, Queensland, Australia, MH059588, Bryan Mowry, M.D. (PI); Mt. Sinai School of Medicine, New York, NY MH59586, Jeremy Silverman, Ph.D. (PI).

The samples were collected by V L Nimgaonkar’s group at the University of Pittsburgh, as part of a multi-institutional collaborative research project with J Smoller, M.D. D.Sc. and P Sklar, M.D. Ph.D. (Massachusetts General Hospital) (grant MH 63420).

GoKinD:

We gratefully acknowledge the National Institute of Health for generously allowing the use of their control allele signal intensity and genotype data. The dataset(s) used for the analyses described in this manuscript were obtained from the GAIN Database found at http://view.ncbi.nlm.nih.gov/dbgap - controlled through dbGaP accession number phs000018.v1.p1.

Footnotes

URLs.

National Institute of Mental Health (NIMH): http://www.nimhgenetics.org/

Further information about T1D loci in T1DBase: http://www.t1dbase.org/

References

1. Wellcome Trust Case Control Consortium. Nature. 2007;447:661–678.
2. Mueller PW, et al. J Am Soc Nephrol. 2006;17:1782–1790. doi: 10.1681/ASN.2005080822.
3. Manolio TA, et al. Nat Genet. 2007;39:1045–1051. doi: 10.1038/ng2127.
4. Clayton DG, et al. Nat Genet. 2005;37:1243–1246. doi: 10.1038/ng1653.
5. Todd JA, et al. Nat Genet. 2007;39:857–864. doi: 10.1038/ng2068.
6. Smyth DJ, et al. Diabetes. 2008;57:1730–1737. doi: 10.2337/db07-1131.
7. Lowe CE, et al. Nat Genet. 2007;39:1074–1082. doi: 10.1038/ng2102.
8. Hong SW, et al. Biochem Biophys Res Commun. 2008;365:426–432. doi: 10.1016/j.bbrc.2007.10.183.
9. Muto A, et al. EMBO J. 1998;17:5734–5743. doi: 10.1093/emboj/17.19.5734.
10. Su AI, et al. Proc Natl Acad Sci U S A. 2002;99:4465–4470. doi: 10.1073/pnas.012025199.
11. Hayashi K, Altman A. Pharmacol Res. 2007;55:537–544. doi: 10.1016/j.phrs.2007.04.009.
12. Chaudhary D, Kasaian M. Curr Opin Investig Drugs. 2006;7:432–437.
13. Hakonarson H, et al. Diabetes. 2008;57:1143–1146. doi: 10.2337/db07-1305.
14. Smyth DJ, et al. Nat Genet. 2006;38:617–619. doi: 10.1038/ng1800.
15. Concannon P, et al. Diabetes advance online publication. 2008 Jul 22. doi: 10.2337/db08-0753.
