American Journal of Human Genetics. 2005 Mar 17;76(5):773–779. doi: 10.1086/429843

Localization of a Type 1 Diabetes Locus in the IL2RA/CD25 Region by Use of Tag Single-Nucleotide Polymorphisms

Adrian Vella 1, Jason D Cooper 1, Christopher E Lowe 1, Neil Walker 1, Sarah Nutland 1, Barry Widmer 2, Richard Jones 3, Susan M Ring 3, Wendy McArdle 3, Marcus E Pembrey 3,4, David P Strachan 5, David B Dunger 2, Rebecca C J Twells 1, David G Clayton 1, John A Todd 1
PMCID: PMC1199367  PMID: 15776395

Abstract

As part of an ongoing search for genes associated with type 1 diabetes (T1D), a common autoimmune disease, we tested the biological candidate gene IL2RA (CD25), which encodes a subunit (IL-2Rα) of the high-affinity interleukin-2 (IL-2) receptor complex. We employed a tag single-nucleotide polymorphism (tag SNP) approach in large T1D sample collections consisting of 7,457 cases and controls and 725 multiplex families. Tag SNPs were analyzed using a multilocus test to provide a regional test for association. We found strong statistical evidence in the case-control collection (P=6.5×10⁻⁸) for a T1D locus in the CD25 region of chromosome 10p15 and replicated the association in the family collection (P=7.3×10⁻³; combined P=1.3×10⁻¹⁰). These results illustrate the utility of tag SNPs in a chromosome-regional test of disease association and justify future fine mapping of the causal variant in the region.

Introduction

Despite hundreds of association studies, few have been consistently replicated (Dahlman et al. 2002; Hirschhorn et al. 2003; Ioannidis et al. 2003; Lohmueller et al. 2003). In type 1 diabetes (T1D [MIM 222100]), only four loci have been identified and successfully replicated: the HLA class II genes on chromosome 6p21 (Cucca et al. 2001), the insulin gene (INS) on chromosome 11p15 (Bell et al. 1984; Barratt et al. 2004), the CTLA-4 gene on chromosome 2q33 (Nisticó et al. 1996; Ueda et al. 2003), and the recently associated PTPN22 gene on chromosome 1p13 (Bottini et al. 2004; Smyth et al. 2004). It is now generally accepted that large numbers of individuals and more stringent criteria for interpreting association studies are required to ensure reliable detection of association (Dahlman et al. 2002; Ioannidis et al. 2003; Lohmueller et al. 2003; Thomas and Clayton 2004; Wacholder et al. 2004; Freimer and Sabatti 2004; Smyth et al. 2005; Wang et al. 2005).

Most cases of T1D result from immune-mediated destruction of the insulin-producing β cells of the pancreas in an inflammatory process that involves many cell types of the immune system, including T lymphocytes. The four identified T1D loci underpin these known features of the disease, since they are all involved in T cell development (for INS, in terms of thymic tolerance of the insulin molecule), activation, expansion, and regulation (Todd and Wicker 2001; Ueda et al. 2003; Anjos and Polychronakos 2004). As part of an ongoing search for genes associated with T1D, we tested the biological candidate gene CD25 (MIM 147730). In common with the T1D loci identified elsewhere, CD25 is central to immune regulation. CD25 expression on regulatory T cells is essential for their function in suppressing T cell immune responses and autoimmune disease (Salomon et al. 2000; Malek and Bayer 2004; Viglietta et al. 2004). Additionally, in humans, a rare mutation of CD25 caused severe autoimmune disease (Sharfe et al. 1997).

We adopted a linkage disequilibrium (LD)-mapping approach to test for an association between T1D and the CD25 region, using tag SNPs (Johnson et al. 2001; Chapman et al. 2003; Clayton et al. 2004) in large case-control and family collections. Elsewhere, we have shown that the use of tag SNPs can reduce genotyping costs by approximately two-thirds (Chapman et al. 2003; Clayton et al. 2004; Lowe et al. 2004).

Subjects and Methods

Subjects

The resequencing panel consisted of 32 CEPH individuals (Utah residents with ancestry from northern and western Europe) (Fondation Jean Dausset–CEPH).

The 3,527 cases were recruited as part of the United Kingdom Genetic Resource Investigating Diabetes (U.K. GRID) study, which is a joint project between the University of Cambridge Department of Paediatrics and the Department of Medical Genetics at the Cambridge Institute for Medical Research and is funded by the Juvenile Diabetes Research Foundation and the Wellcome Trust. The eventual aim of the project is to collect 8,000 cases with T1D for comparison with 8,000 controls from the 1958 British Birth Cohort (1958 BBC), to allow well-powered genetic association studies. The 1958 BBC is an ongoing follow-up of all persons born in Great Britain during one week in 1958 (National Child Development Study), including a recent biomedical assessment during 2002–2004 at which blood samples and informed consent were obtained for creation of a genetic resource. All cases were white, and at least 97% of the controls were of white ethnicity.

All families were of white European descent and were composed of two parents and at least one affected child. The population studied consisted of 472 multiplex families from the Diabetes United Kingdom Warren collection and 268 multiplex families from the (U.S.) Human Biological Data Interchange. The characteristics and inclusion criteria for each family collection have been described elsewhere (Vella et al. 2004), and these and reference to the case-control samples can be obtained from the Juvenile Diabetes Research Foundation/Wellcome Trust Web site. All DNA samples were collected with approval of the relevant research ethics committees, and written informed consent was obtained from the participants.

Identifying Polymorphisms

To identify polymorphisms in the CD25 gene, the exons, exon/intron boundaries, and up to 3 kb of 3′ and 5′ flanking sequence were resequenced in DNA samples from 32 CEPH individuals by use of nested PCR products; we also sequenced the regions −9,000 to −8,000, −4,000 to −3,000, and +3,000 to +4,000 bases, numbered relative to the transcription start site, to encompass the CD28 response element (CD28rE), the Positive Regulatory Region (PRR) III, and PRR IV, respectively (Toledano et al. 1990; Kim and Leonard 2002). In total, 15 kb was sequenced from the CD25 region, spanning ∼60 kb. The sequencing reactions were performed using the Applied Biosystems (ABI) BigDye (version 3.1) chemistry, and the sequences were analyzed using an ABI 3700 capillary sequencer. Analysis of the sequence traces was performed using the Staden package (Bonfield et al. 1998) and was double scored by a second operator. All sequence information and primer locations are provided at the T1DBase Web site.

Tag SNPs

As described elsewhere (Chapman et al. 2003; Clayton et al. 2004), we used the resequencing genotype data to investigate the ability of smaller subsets of SNPs to predict the genotypes of the remainder. Predictive performance was assessed using a locus R² measure (coefficient of determination), which measures the ability to predict each known SNP genotype by linear regression on the tag SNP genotypes (Chapman et al. 2003). We considered only SNPs with a minor-allele frequency (MAF) ≥5% and required that the subset of tag SNPs predict the remaining SNPs with a minimum R² of 0.8.

We selected an optimal set of tag SNPs, using a mixture of step-up, step-down, and exhaustive subset search algorithms. Since the exhaustive subset search procedure can be slow, we initially identified a set of tag SNPs selected by both step-up and step-down searches, and we determined the best additional set of tag SNPs by exhaustive subset search of the remaining SNPs (Lowe et al. 2004). The programs for the selection and analysis of tag SNPs are implemented in STATA and can be downloaded from D.G.C.'s Web site.
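The locus R² criterion and a step-up search can be sketched in a few lines. The following is a minimal illustration only, not the STATA programs referred to above: it implements a simplified step-up rule (repeatedly promote the worst-predicted SNP to a tag) without the step-down or exhaustive stages, and the genotype data, SNP names, and function names are all hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def locus_r2(y, predictors):
    """R^2 from least-squares regression of y on the predictors plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + [[float(v) for v in p] for p in predictors]
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(a * b for a, b in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(bk * col[i] for bk, col in zip(beta, cols)) for i in range(n)]
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return 1.0 - ss_res / ss_tot


def step_up(genotypes, threshold=0.8):
    """Add tags until every untagged SNP is predicted with R^2 >= threshold."""
    tags = []
    while True:
        r2 = {s: locus_r2(g, [genotypes[t] for t in tags])
              for s, g in genotypes.items() if s not in tags}
        if not r2 or min(r2.values()) >= threshold:
            return tags, r2
        tags.append(min(r2, key=r2.get))  # promote the worst-predicted SNP


# Hypothetical genotypes (0, 1, or 2 copies of an allele) for 8 individuals.
genotypes = {
    "snpA": [0, 0, 1, 1, 2, 2, 0, 1],
    "snpB": [0, 0, 1, 1, 2, 2, 0, 1],  # in perfect LD with snpA
    "snpC": [2, 2, 1, 1, 0, 0, 2, 1],  # perfectly (negatively) correlated with snpA
    "snpD": [0, 1, 0, 1, 0, 1, 2, 0],  # only weakly correlated with the others
}
tags, remaining_r2 = step_up(genotypes)
print(tags, remaining_r2)
```

In this toy data set, two tags suffice: snpD plus any one of the three SNPs that are in perfect LD with each other.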

Genotyping

Tag SNPs were genotyped using either Taqman (Applied Biosystems) or Invader (Third Wave Technologies) technologies on a British case-control collection (3,527 cases and 3,930 controls), in accordance with the manufacturers’ protocols. All genotyping data were double scored to minimize error.

Multilocus Test for Association

Chapman et al. (2003) suggested the use of a multivariate test statistic in the analysis of a tagged region. Essentially, the test contrasts the profile of tag SNP allele frequencies between cases and controls by use of Hotelling’s T² test (Xiong et al. 2002; Chapman et al. 2003; Fan and Knapp 2003). The test does not assume Hardy-Weinberg equilibrium in cases and controls; since no imputation of haplotype phase is required, variance and covariances of genotypes are estimated empirically. In the analysis of the case-control collection, the multilocus test was stratified by broad geographical region within Great Britain to exclude the possibility of confounding by geography. For each of 12 regions, we computed the vector of contrasts between case and control allele frequencies. The final test was based on a weighted sum of these contrast vectors, with weights inversely proportional to variance. This procedure is a multivariate generalization of standard methods for control of confounding in epidemiological studies (Breslow and Day 1980; Clayton and Hills 1993).
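One way to picture the stratified test is as a score test in which each region contributes a contrast vector U and its null variance V, and the contributions are summed before the quadratic form is evaluated. The sketch below is written under that assumption and is not taken from the authors' software; the permutation variance formula, the function names, and the toy genotypes are illustrative choices.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def stratum_score(cases, controls):
    """Return (U, V) for one stratum.

    U sums, over cases, the deviation of the genotype vector from the
    stratum mean. V is the variance of U when case labels are permuted
    within the stratum, so it is estimated empirically from the genotypes.
    """
    everyone = cases + controls
    n, n1, k = len(everyone), len(cases), len(everyone[0])
    mean = [sum(x[j] for x in everyone) / n for j in range(k)]
    u = [sum(x[j] - mean[j] for x in cases) for j in range(k)]
    scale = n1 * (n - n1) / (n * (n - 1))
    v = [[scale * sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in everyone)
          for j in range(k)] for i in range(k)]
    return u, v


def stratified_t2(strata):
    """Sum U and V over strata, then return the quadratic form U' V^-1 U."""
    k = len(strata[0][0][0])
    u_tot = [0.0] * k
    v_tot = [[0.0] * k for _ in range(k)]
    for cases, controls in strata:
        u, v = stratum_score(cases, controls)
        for i in range(k):
            u_tot[i] += u[i]
            for j in range(k):
                v_tot[i][j] += v[i][j]
    w = solve(v_tot, u_tot)
    return sum(ui * wi for ui, wi in zip(u_tot, w))


# Toy data for two tag SNPs: within each region cases and controls have the
# same genotype distribution, but the regions differ in allele frequency and
# in the ratio of cases to controls.
base1 = [[0, 0], [1, 1], [2, 0], [1, 2]]
base2 = [[2, 2], [2, 1], [1, 2]]
region1 = (base1, base1)
region2 = (base2, base2 + base2)
pooled = (base1 + base2, base1 + base2 + base2)

t2_stratified = stratified_t2([region1, region2])
t2_unstratified = stratified_t2([pooled])
print(t2_stratified, t2_unstratified)
```

The stratified statistic is zero (up to rounding) because there is no case-control difference within either region, whereas pooling the regions produces a nonzero statistic purely through confounding by geography.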

In the analysis of the family collection, parent-child trios were analyzed by a very closely related procedure. Each trio contributed a vector of transmitted allele pairs and a vector of untransmitted allele pairs. These can be thought of as “cases” and matched “pseudocontrols,” respectively. Each pair was then scored 0, 1, or 2, and the case and control profiles were compared using a paired Hotelling’s T² test (Chapman et al. 2003).
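The construction of "case" and "pseudocontrol" scores from a trio can be illustrated as follows. This is a hedged sketch restricted to two tag SNPs so that the 2×2 covariance matrix can be inverted by hand; the trio genotypes are invented, and the textbook paired Hotelling's T² used here is an assumption about the form of the test rather than a transcription of the authors' code.

```python
def trio_scores(father, mother, child):
    """Score the transmitted and untransmitted allele pairs at each SNP.

    Genotypes count copies of one allele (0, 1, or 2). The transmitted pair
    is the child's genotype; the untransmitted pair is what remains of the
    four parental alleles.
    """
    transmitted = list(child)
    untransmitted = [f + m - c for f, m, c in zip(father, mother, child)]
    return transmitted, untransmitted


def paired_hotelling_t2(diffs):
    """Paired Hotelling's T^2 for two-dimensional difference vectors."""
    n = len(diffs)
    m1 = sum(d[0] for d in diffs) / n
    m2 = sum(d[1] for d in diffs) / n
    s11 = sum((d[0] - m1) ** 2 for d in diffs) / (n - 1)
    s22 = sum((d[1] - m2) ** 2 for d in diffs) / (n - 1)
    s12 = sum((d[0] - m1) * (d[1] - m2) for d in diffs) / (n - 1)
    det = s11 * s22 - s12 ** 2
    # n * dbar' S^-1 dbar, using the closed-form inverse of a 2x2 matrix
    return n * (m1 * m1 * s22 - 2 * m1 * m2 * s12 + m2 * m2 * s11) / det


# Hypothetical trios: (father, mother, child) genotypes at two tag SNPs.
trios = [
    ([1, 0], [1, 1], [2, 0]),
    ([1, 1], [0, 1], [1, 1]),
    ([2, 1], [1, 0], [1, 1]),
    ([1, 2], [1, 1], [2, 1]),
    ([0, 1], [1, 1], [1, 2]),
    ([1, 1], [1, 0], [0, 0]),
]
diffs = []
for father, mother, child in trios:
    t, u = trio_scores(father, mother, child)
    diffs.append([ti - ui for ti, ui in zip(t, u)])

t2 = paired_hotelling_t2(diffs)
print(diffs, t2)
```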

These tests are most powerful when the effect of the causal variant is codominant. A recessive mode of inheritance results in reduced heterozygosity at the causal locus, but this may be reflected only weakly at any one tag SNP. Chapman et al. (2003) suggested a method for incorporation of this information into the multilocus test, but we did not judge it to be necessary in our study.

The evidence from case-control and family collections was combined in the same manner as was used to amalgamate the evidence from different geographical regions in the analysis of the case-control collection; a vector of contrasts comparing case and control allele frequencies for the set of tag SNPs was contributed from both studies, and a weighted mean of this vector was computed and tested against 0 (appendix A).

Imputation of Missing Tag SNP Genotypes

The multilocus test takes account of correlations between genotype for different tags and, therefore, requires that a complete set of scores be entered for each subject. Even a modest genotyping failure rate can result in a substantial attrition of subjects for such “complete case” analyses and substantial loss of power. We have avoided this by imputation of missing values.
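The scale of this attrition can be seen with simple arithmetic. Assuming, purely for illustration, that genotypes fail independently across SNPs and individuals, the expected fraction of units with complete data is (1 − missing rate) raised to the number of genotypes required; the function name below is hypothetical.

```python
def complete_fraction(missing_rate, n_tags, people_per_unit=1):
    """Expected fraction of units with every tag SNP genotype observed,
    assuming genotypes are missing independently at the given rate."""
    return (1.0 - missing_rate) ** (n_tags * people_per_unit)


for n_tags in (4, 12, 20):
    for rate in (0.05, 0.10):
        single = complete_fraction(rate, n_tags)                   # a case or a control
        trio = complete_fraction(rate, n_tags, people_per_unit=3)  # both parents and child
        print(f"{n_tags:2d} tags, {rate:.0%} missing: "
              f"individuals {single:.1%}, trios {trio:.1%}")
```

Under these assumptions, with 20 tag SNPs and 5% missing genotypes only about 36% of individuals, and fewer than 5% of trios, would be retained in a complete-case analysis.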

Imputation of missing genotypes was performed using linear regression; that is, the missing genotypes (scored 0, 1, or 2) of tag SNP t_i (i = 1, 2, …, n) were predicted from the regression of t_i on the set of complete tag SNP genotypes, excluding t_i. Clayton et al. (2004) justify this procedure for high-LD regions. Imputation was performed under the null hypothesis so that, for the same genotypes at observed loci, a missing locus would be assigned the same score whether the subject were a case or a control. The effect of this is to shrink case-control differences toward zero, but, since their variances and covariances are estimated empirically, the size of the test is preserved.
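A deliberately simplified sketch of regression imputation follows. It uses a single predictor tag SNP rather than the full set of remaining tags, and the data and function name are hypothetical. The point it illustrates is that disease status never enters the model, so a case and a control with the same observed genotypes receive the same imputed score, which may be fractional.

```python
def impute_by_regression(target, predictor):
    """Fill None entries of `target` by simple linear regression on `predictor`.

    The regression is fitted on all subjects with an observed target
    genotype, pooling cases and controls (that is, under the null).
    """
    pairs = [(x, y) for x, y in zip(predictor, target) if y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y if y is not None else intercept + slope * x
            for x, y in zip(predictor, target)]


# Hypothetical genotype scores for 8 subjects at two tag SNPs in LD.
predictor = [0, 0, 1, 1, 2, 2, 1, 0]
target = [0, 0, 1, 0, 2, None, None, 0]
imputed = impute_by_regression(target, predictor)
print(imputed)
```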

We evaluated the effect of imputation on type 1 error rates and on power in a simulation study. A number of scenarios, defined by the number of tag SNPs (maximum 20) and the percentage of missing at-random tag SNP genotypes (5% or 10%), in both case-control samples (1,000 cases and 1,000 controls) and families (1,000 parent-child trios), were considered. The effect of imputation on type 1 error rates in data generated under the null hypothesis of no association was evaluated on the basis of how often the null hypothesis was rejected when a significance test with a critical P value of P1 was applied. The null hypothesis should be rejected with probability P1. The power, with and without the imputation of missing tag SNP genotypes, was evaluated on the basis of how often the null hypothesis was rejected when applied to data generated under the alternative hypothesis.
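The logic of the type 1 error check (simulate under the null, apply the test, count rejections) can be sketched with a much simpler test than the multilocus one. The example below uses a single SNP, a 1-df allelic chi-square test, no missing data, and smaller samples than those described above so that it runs quickly; all of these are simplifying assumptions, and the function names are hypothetical.

```python
import math
import random


def allelic_p_value(case_minor, control_minor, n_alleles):
    """P value of the 1-df chi-square test on a 2x2 table of allele counts,
    with n_alleles chromosomes in each group."""
    a, c = case_minor, control_minor
    b, d = n_alleles - a, n_alleles - c
    if a + c == 0 or b + d == 0:
        return 1.0  # monomorphic sample: no evidence either way
    total = 2 * n_alleles
    chi2 = total * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 df


def draw_minor_count(rng, n_alleles, freq):
    """Binomial draw of the number of minor alleles among n_alleles chromosomes."""
    return sum(rng.random() < freq for _ in range(n_alleles))


def type1_error(n_sims=2000, n_subjects=200, freq=0.35, alpha=0.05, seed=1):
    """Fraction of null simulations in which the test rejects at level alpha."""
    rng = random.Random(seed)
    n_alleles = 2 * n_subjects
    rejections = 0
    for _ in range(n_sims):
        cases = draw_minor_count(rng, n_alleles, freq)
        controls = draw_minor_count(rng, n_alleles, freq)  # same frequency: null is true
        if allelic_p_value(cases, controls, n_alleles) < alpha:
            rejections += 1
    return rejections / n_sims


rate = type1_error()
print(f"empirical rejection rate at alpha = 0.05: {rate:.3f}")
```

A well-calibrated test should give a rejection rate close to the nominal level; power is estimated in the same way, with data generated under the alternative hypothesis.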

Results

The resequencing of CD25 for 32 CEPH individuals identified 55 polymorphisms (table 1), 54 of which were SNPs; 13 of these SNPs were novel when compared with dbSNP build 123, and 1 polymorphism, ss35031434, was a novel G insertion/deletion. Sixteen SNPs had an MAF <5% and were consequently not included in the tag SNP selection. From the 39 common SNPs (MAF ⩾5%), 20 tag SNPs were selected and genotyped in the case-control collection (table 2). All tag SNP genotypes in cases and controls were in Hardy-Weinberg equilibrium.

Table 1.

Polymorphisms Identified in CD25

Columns: SNP (a); DIL No. (b); Map Position (c); Gene Position; MAF (%); Locus R²
35031438 (G→A) 4573 6116611 5′ 33 92.45
35031439 (G→A) 8087 6112706 5′ 11 Tag SNP
35031440 (C→T) 4574 6112445 5′ 39 Tag SNP
35031441 (C→T) 4575 6112346 5′ 48 98.74
35031426 (G→A) 8088 6110031 5′ 3
35031442 (A→G) 8089 6109962 5′ 9 100
35031443 (G→T) 8090 6109842 5′ 39 100
35031427 (T→C) 4579 6107878 Intron 1 3
35031444 (G→A) 8091 6107768 Intron 1 10 100
35031445 (C→T) 4580 6104731 Intron 1 33 92.45
35031432 (G→A) 8098 6104477 Intron 1 4
35031433 (T→G) 8112 6098360 Intron 1 4
35031421 (C→T) 8144 6084199 Intron 1 2
35031446 (C→T) 4581 6083852 Intron 1 48 Tag SNP
35031422 (G→A) 8161 6071975 Exon 2d 2
35031447 (T→C) 4583 6071694 Intron 2 45 95
35031448 (G→T) 8162 6071309 Intron 2 23 87.18
35031428 (G→C) 8163 6071193 Intron 2 3
35031449 (C→T) 8164 6070468 Intron 2 16 Tag SNP
35031429 (G→A) 8165 6070467 Intron 2 3
35031430 (G→T) 8166 6070423 Intron 2 3
35031450 (G→C) 4584 6070206 Intron 3 23 87.18
35031451 (A→T) 4585 6070201 Intron 3 45 95
35031437 (C→T) 8175 6067514 Exon 4d 5 83.35
35031452 (G→A) 8176 6067325 Intron 4 14 Tag SNP
35031453 (G→A) 8186 6065787 Intron 5 8 Tag SNP
35031434 (G→-) 8187 6065736 Intron 5 4
35031454 (G→A) 4588 6065505 Intron 5 8 84.45
35031455 (T→G) 4589 6065485 Intron 5 8 Tag SNP
35031456 (G→C) 4593 6062741 Intron 7 13 84.8
35031423 (G→C) 8193 6062426 Intron 7 2
35031457 (A→G) 4599 6062422 Intron 7 9 Tag SNP
35031435 (C→G) 4600 6062398 Intron 7 4
35031458 (T→C) 4601 6062329 Intron 7 28 Tag SNP
35031431 (C→T) 4602 6062151 Intron 7 3
35031459 (T→G) 4603 6061780 Intron 7 13 Tag SNP
35031424 (C→T) 8194 6061745 Intron 7 2
35031460 (C→A) 8195 6061738 Intron 7 14 Tag SNP
35031461 (G→A) 4604 6059610 Intron 7 18 Tag SNP
35031462 (C→T) 8198 6059295 Intron 7 6 Tag SNP
35031463 (A→T) 4605 6059228 Intron 7 27 87.49
35031464 (A→G) 8199 6059156 Intron 7 5 Tag SNP
35031465 (G→C) 8200 6059130 Intron 7 5 Tag SNP
35031466 (G→A) 8201 6059048 Intron 7 8 87.71
35031467 (G→A) 4606 6058957 Intron 7 13 Tag SNP
35031436 (G→A) 8202 6058771 3′ UTR 4
35031468 (G→A) 4607 6058164 3′ UTR 6 100
35031469 (G→A) 4608 6057872 3′ UTR 9 Tag SNP
35031425 (G→A) 8203 6057815 3′ UTR 2
35031470 (A→G) 4609 6057574 3′ UTR 48 86.1
35031471 (T→C) 4610 6057380 3′ 42 Tag SNP
35031472 (C→T) 4611 6057292 3′ 6 Tag SNP
35031473 (A→T) 4612 6057169 3′ 12 Tag SNP
35031474 (C→T) 4613 6057139 3′ 27 87.49
35031475 (C→G) 8204 6056961 3′ 7 95.95
a Alleles are coded by National Center for Biotechnology Information (NCBI) ss number (major allele→minor allele).

b DIL = Diabetes and Inflammation Laboratory.

c Positions are from NCBI build 34.

d Synonymous.

Table 2.

Summary of Tag SNPs for the Case-Control Collection

Columns: SNP (a) and Population; MAF (%); Tag SNP Genotype Counts (1/1, 1/2, 2/2)
35031439 (G→A):
 Case 5 11 372 3094
 Control 6 12 349 3490
35031440 (C→T):
 Case 35 1332 1663 477
 Control 38 1656 1743 463
35031446 (C→T):
 Case 46 967 1726 740
 Control 47 1102 1904 820
35031449 (C→T):
 Case 21 2088 1172 198
 Control 23 2395 1300 183
35031452 (G→A):
 Case 18 126 1075 2260
 Control 19 105 1181 2537
35031453 (G→A):
 Case 7 19 507 2925
 Control 8 33 530 3293
35031455 (T→G):
 Case 13 36 601 2850
 Control 10 68 875 2925
35031457 (A→G):
 Case 12 2672 766 63
 Control 13 2966 852 46
35031458 (T→C):
 Case 24 182 1226 2004
 Control 23 216 1400 2183
35031459 (T→G):
 Case 17 59 838 2553
 Control 14 106 1097 2657
35031460 (C→A):
 Case 15 86 997 2382
 Control 17 91 992 2754
35031461 (G→A):
 Case 27 223 1367 1862
 Control 26 285 1493 2034
35031462 (C→T):
 Case 4 3078 361 11
 Control 6 3484 311 12
35031464 (A→G):
 Case 2 3361 115 1
 Control 2 3701 151 1
35031465 (G→C):
 Case 8 17 512 2931
 Control 8 29 535 3301
35031467 (G→A):
 Case 16 71 821 2576
 Control 14 87 1060 2680
35031469 (G→A):
 Case 11 42 680 2747
 Control 11 37 746 3077
35031471 (T→C):
 Case 44 638 1702 1122
 Control 43 719 1932 1167
35031472 (C→T):
 Case 3 3263 231 5
 Control 3 3636 234 3
35031473 (A→T):
 Case 16 2500 894 79
 Control 15 2745 1017 96

Note.— Tag SNPs are listed in chromosomal order, as genotyped in the case-control collection, which consisted of 3,527 cases and 3,930 controls. Alleles are coded “1” or “2,” alphabetically.

a Alleles are coded by NCBI ss number (major allele→minor allele).

The multilocus test P value for the case-control collection was 6.5×10⁻⁸ (3,521 case and 3,930 control genotypes; F(20, 7419)=3.7). The multilocus test was stratified by 12 broad geographic regions, to minimize any confounding due to variation in allele frequencies across Great Britain (see the “Subjects and Methods” section) (unstratified P value was 1.4×10⁻⁸).

We proceeded to genotype the tag SNPs in a large family collection (472 British and 268 U.S. multiplex families with T1D). Tag SNP genotypes in parents and affected offspring were all in Hardy-Weinberg equilibrium. We replicated the association in the family collection, with a multilocus test P value of 7.3×10⁻³ (parent-child trio genotypes=1,378; χ² with 20 df=38.7) (table 3), thus providing independent evidence of an association between T1D and CD25. Figure 1 shows the striking agreement between the odds ratios and the relative risks (transmission ratios) for the minor alleles of the tag SNPs genotyped in the case-control and family collections. Consequently, when results from both studies were combined, the multilocus test (Smyth et al. 2004) (appendix A) provided strong statistical support for a T1D locus in the CD25 region of chromosome 10p15.1 (χ² with 20 df=88.6; P=1.3×10⁻¹⁰). There was no suggestion, in cases or affected offspring, of the reduced heterozygosity that would indicate recessive inheritance.
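The P values quoted for the family and combined analyses can be checked directly from the reported χ² statistics. For an even number of degrees of freedom the upper tail of the χ² distribution has a closed form, so no statistical library is needed; the function name below is hypothetical.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(chi-square with `df` degrees of freedom > x), valid for even df only.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


p_family = chi2_upper_tail_even_df(38.7, 20)    # family collection
p_combined = chi2_upper_tail_even_df(88.6, 20)  # combined analysis
print(f"family: {p_family:.2e}, combined: {p_combined:.2e}")
```

Both values agree, to the precision reported, with the P values of 7.3×10⁻³ and 1.3×10⁻¹⁰ given above.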

Table 3.

Summary of Tag SNPs for the Family Collection

Columns: SNP (a) and Population; MAF (%); No. of Trios; No. of Transmissions (Transmitted, Untransmitted)
35031439 (G→A):
 U.K. 4 804 1,544 1,539
 U.S. 5 506 964 960
35031440 (C→T):
 U.K. 36 732 537 501
 U.S. 36 492 367 335
35031446 (C→T):
 U.K. 44 767 689 674
 U.S. 46 520 472 477
35031449 (C→T):
 U.K. 24 775 370 356
 U.S. 22 482 219 211
35031452 (G→A):
 U.K. 20 785 1,257 1,262
 U.S. 19 497 805 806
35031453 (G→A):
 U.K. 8 794 1,460 1,475
 U.S. 8 494 896 921
35031455 (T→G):
 U.K. 11 790 1,407 1,389
 U.S. 11 504 921 887
35031457 (A→G):
 U.K. 12 790 169 201
 U.S. 13 517 135 139
35031458 (T→C):
 U.K. 23 762 1,192 1,152
 U.S. 24 489 765 735
35031459 (T→G):
 U.K. 15 767 1,316 1,306
 U.S. 13 507 904 861
35031460 (C→A):
 U.K. 14 750 1,293 1,301
 U.S. 17 486 805 820
35031461 (G→A):
 U.K. 26 750 1,128 1,095
 U.S. 26 496 739 719
35031462 (C→T):
 U.K. 4 802 62 70
 U.S. 5 523 54 44
35031464 (A→G):
 U.K. 1 773 21 15
 U.S. 1 517 17 13
35031465 (G→C):
 U.K. 8 796 1,477 1,465
 U.S. 8 474 860 891
35031467 (G→A):
 U.K. 14 728 1,257 1,238
 U.S. 14 487 855 832
35031469 (G→A):
 U.K. 10 781 1,413 1,394
 U.S. 11 504 888 900
35031471 (T→C):
 U.K. 43 780 882 883
 U.S. 43 498 568 546
35031472 (C→T):
 U.K. 3 810 52 41
 U.S. 3 520 31 34
35031473 (A→T):
 U.K. 16 796 245 262
 U.S. 16 524 137 184

Note.— Alleles are listed in chromosomal order, as genotyped in the family collection, which consisted of 472 U.K. (1,745 individuals and 841 parent-child trios) and 268 U.S. (1,061 individuals and 537 parent-child trios) multiplex families. Alleles are coded as major allele→minor allele; transmissions refer to the allele coded as 2. Parental MAFs are reported.

a Alleles are coded by NCBI ss number (major allele→minor allele).

Figure 1.

Upper panel, Odds ratios and transmission ratios for the minor allele of CD25 tag SNPs genotyped in the case-control (filled circles) and family (open circles) collections, respectively. Vertical lines indicate 95% CIs. Lower panel, Chromosome position of CD25 tag SNPs. Open long rectangle indicates UTR, filled long rectangles indicate exons, filled short rectangles indicate regulatory regions, and the arrow labeled “+1” indicates the transcript start site. A version of this figure can be viewed at the T1DBase Web site.
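For readers who wish to reproduce a point of this kind, an allelic odds ratio with a Woolf-type 95% CI can be computed from the genotype counts in table 2. The sketch below uses the counts for ss35031455 and treats allele "1" (the rarer allele in these counts) as the allele of interest. It is an unstratified calculation and may not match exactly the method used to draw the figure; the function names are hypothetical.

```python
import math


def allele_counts(n11, n12, n22):
    """Convert genotype counts (1/1, 1/2, 2/2) to counts of allele 1 and allele 2."""
    return 2 * n11 + n12, 2 * n22 + n12


def allelic_odds_ratio(case_genotypes, control_genotypes, z=1.96):
    """Odds ratio for allele 1 versus allele 2, with a Woolf (log-scale) CI."""
    a, b = allele_counts(*case_genotypes)     # cases: allele 1, allele 2
    c, d = allele_counts(*control_genotypes)  # controls: allele 1, allele 2
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Genotype counts (1/1, 1/2, 2/2) for ss35031455 as listed in table 2.
odds_ratio, lower, upper = allelic_odds_ratio((36, 601, 2850), (68, 875, 2925))
print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```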

Simulations of data generated under the null hypothesis indicated that imputation did not affect type 1 error rates; that is, the null hypothesis was rejected with probability P1 when a significance test with a critical P value of P1 was applied. For example, with 5% of tag SNP genotypes missing, the null hypothesis was rejected in 483, 102, and 9 of 10,000 simulations of 1,000 cases and 1,000 controls genotyped at 20 tag SNPs, at critical P values of 0.05, 0.01, and 0.001, respectively. Simulations also indicated that imputation partially recaptures the loss of power incurred by restricting the analysis to subjects with a complete set of tag SNP genotypes (table 4). For example, if a genomic region has 20 tag SNPs that are genotyped in 1,000 cases and 1,000 controls with 5% of tag SNP genotypes missing at random, by imputing missing genotypes, power to detect a causal variant with an odds ratio of 1.5 (causal allele frequency of 0.35 in controls) at a significance level of P=.05 increases from 46.5% to 96.0% in 1,000 simulations. In the equivalent exercise in families (1,000 parent-child trios), power increases from 12.9% to 99.1%. Imputation is particularly important for parent-child trios and for candidate genes with a relatively large number of tag SNPs, since, in those cases, restriction of the analysis to complete cases is particularly damaging. The multilocus test P values for the case-control and family collections without imputation were 2.1×10⁻⁶ (2,812 case and 2,981 control genotypes; F(20, 5761)=3.2) and .052 (parent-child trio genotypes=558; χ² with 20 df=31.2), respectively.

Table 4.

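Effect of Imputation on Power of a Simulated Case-Control Study and of a Family Study

Entries are power (%), by percentage of tag SNP genotypes missing at random (5% or 10%). CC = case-control study; Fam = family study.

| No. of Tag SNPs | P | CC, No Imputation, 5% | CC, No Imputation, 10% | CC, Imputation, 5% | CC, Imputation, 10% | Fam, No Imputation, 5% | Fam, No Imputation, 10% | Fam, Imputation, 5% | Fam, Imputation, 10% |
|---|---|---|---|---|---|---|---|---|---|
| 4 | .05 | 99.9 | 98.8 | 100.0 | 99.8 | 97.3 | 78.6 | 99.8 | 99.9 |
| 4 | .01 | 98.6 | 95.4 | 99.6 | 99.3 | 90.8 | 62.5 | 98.7 | 97.8 |
| 4 | .001 | 94.3 | 84.8 | 98.1 | 96.6 | 75.2 | 35.3 | 95.5 | 92.0 |
| 12 | .05 | 89.7 | 57.5 | 99.7 | 99.5 | 38.9 | 3.3 | 99.3 | 98.6 |
| 12 | .01 | 74.2 | 33.2 | 97.8 | 97.3 | 19.0 | .1 | 95.8 | 93.6 |
| 12 | .001 | 48.7 | 13.1 | 92.4 | 91.0 | 3.3 | .0 | 85.1 | 79.7 |
| 20 | .05 | 46.5 | 16.9 | 96.0 | 94.7 | 12.9 | .0 | 99.1 | 97.7 |
| 20 | .01 | 24.1 | 5.2 | 88.1 | 85.3 | 2.1 | .0 | 95.3 | 92.4 |
| 20 | .001 | 7.9 | .6 | 67.7 | 65.2 | .1 | .0 | 80.9 | 78.0 |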

Note.— Power was estimated from 1,000 simulations of each type of study (1,000 each, cases, controls, and parent-child trios). The causal variant had an OR of 1.5 and a causal-allele frequency of ∼35% in controls and parents.

Discussion

In the present study, we have assessed only the polymorphisms located in or close to exons and known regulatory regions, as well as up to 3 kb of 3′ and 5′ flanking sequence. Since we have found strong statistical evidence for a T1D locus in the CD25 region, we have now started to resequence an extended and, where possible, contiguous CD25 chromosome region of ∼190 kb including CD25 to identify potential causal variants. Fine mapping of an extended region is required, since the CD25 tag SNPs could be in LD with a causal variant beyond CD25. For example, IL15RA, a strong functional candidate (Fehniger and Caligiuri 2001), is ∼30 kb from the 3′ flanking sequence of CD25.

Since the causal variant(s) in the CD25 region remains unknown, to replicate the association with T1D reported in the present study, a tag SNP approach would be required, genotyping either the same or a new set of tag SNPs. A new selection could be required—when the population in a subsequent study has a different pattern of LD in the CD25 region, for instance. The temptation to ignore the foundations of the reported association, the set of tag SNPs, and the pattern of LD behind their selection and to genotype only the most-associated tag SNP may well lead to false-negative results, since the power of the present study is based on the complete set of tag SNPs.

Acknowledgments

We gratefully acknowledge the participation of all patients with T1D, their family members, and control individuals. We thank Helen Rance, Tasneem Hassanali, Jayne Hutchings, Gillian Coleman, Sarah Field, Trupti Mistry, Kirsi Bourget, Sally Clayton, Matthew Hardy, Jennifer Keylock, Pamela Lauder, Meeta Maisuria, William Meadows, Meera Sebastian, and Sarah Wood, for preparing DNA samples; the Avon Longitudinal Study of Parents and Children Laboratory in Bristol, for preparing DNA samples; Luc Smink, Oliver Burren, and Alex Lam, for bioinformatic support. We acknowledge use of DNA from the 1958 BBC collection, funded by Medical Research Council grant G0000934 and Wellcome Trust grant 068545/Z/02. We acknowledge Diabetes UK and the Human Biological Data Interchange, for multiplex family collections, and the Wellcome Trust, the Juvenile Diabetes Research Foundation, Diabetes UK, and the Medical Research Council, for financial support. A.V. is a Mayo Foundation Scholar, and D.G.C. is a Juvenile Diabetes Research Foundation/Wellcome Trust Principal Research Fellow.

Appendix A

In an earlier study (Smyth et al. 2004), we used and defined a score test to combine single-locus tests from family- and population-based studies; in the present article, we used a multivariate version of that test. Chapman et al. (2003) defined the multilocus test as a multivariate score test, $T^2 = U^{\mathsf{T}} V^{\theta} U$, where U is a score vector with one element for each tag SNP, contrasting allele frequencies in cases and controls or, in family studies, frequencies of transmitted and untransmitted alleles; V is the estimated variance of the score statistic; and the superscript θ denotes a generalized inverse. The test statistic is asymptotically distributed as χ², with degrees of freedom equal to the rank of V, which is equal to the number of tag SNPs. When combining results from family- and population-based studies, we first calculate the U vector and V matrix for each study. We then calculate an overall U and V by summation of the contributions from each study, $U = U_1 + U_2$ and $V = V_1 + V_2$, and calculate the T² test statistic as before.
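The combination step can be illustrated numerically for the smallest nontrivial case of two tag SNPs. The score vectors and variance matrices below are invented for illustration and do not come from the study; because the summed V is full rank here, an ordinary 2×2 inverse stands in for the generalized inverse, and the 2-df P value uses the closed form exp(−T²/2).

```python
import math


def add_vectors(u1, u2):
    return [a + b for a, b in zip(u1, u2)]


def add_matrices(v1, v2):
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(v1, v2)]


def quadratic_form_2x2(u, v):
    """Return U' V^-1 U for a 2-vector U and an invertible 2x2 matrix V."""
    det = v[0][0] * v[1][1] - v[0][1] * v[1][0]
    w0 = (v[1][1] * u[0] - v[0][1] * u[1]) / det  # first element of V^-1 U
    w1 = (-v[1][0] * u[0] + v[0][0] * u[1]) / det  # second element of V^-1 U
    return u[0] * w0 + u[1] * w1


# Hypothetical score contributions from a case-control study (1)
# and a family study (2).
u1, v1 = [4.0, -2.0], [[10.0, 2.0], [2.0, 8.0]]
u2, v2 = [3.0, -1.0], [[6.0, 1.0], [1.0, 5.0]]

u = add_vectors(u1, u2)
v = add_matrices(v1, v2)
t2 = quadratic_form_2x2(u, v)
p_value = math.exp(-t2 / 2.0)  # upper tail of chi-square with 2 df
print(f"T2 = {t2:.3f}, P = {p_value:.3f}")
```

Summing U and V before forming the statistic means that each study is weighted by the information it carries, rather than the two P values being combined after the fact.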

Electronic-Database Information

The URLs for data presented herein are as follows:

  1. dbSNP Home Page, http://www.ncbi.nlm.nih.gov/SNP/index.html
  2. D.G.C.'s Web site, http://www-gene.cimr.cam.ac.uk/clayton/software/ (for the programs used for the selection and analysis of tag SNPs)
  3. Fondation Jean Dausset–CEPH, http://www.cephb.fr/ (for information about the individuals used in the sequencing panel)
  4. Juvenile Diabetes Research Foundation/Wellcome Trust, http://www-gene.cimr.cam.ac.uk/todd/dna-refs.shtml (for references to DNA collections used at the Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory)
  5. National Center for Biotechnology Information (NCBI), http://www.ncbi.nlm.nih.gov/
  6. National Child Development Study, http://www.cls.ioe.ac.uk/Cohort/Ncds/mainncds.htm (for information about the control collection)
  7. Online Mendelian Inheritance in Man (OMIM), http://www.ncbi.nlm.nih.gov/Omim/ (for T1D and CD25)
  8. STATA, http://www.stata.com (for statistical software used)
  9. T1DBase, http://www.t1dbase.org/cgi-bin/welcome.cgi (for further information about the CD25 region)
  10. U.K. GRID, http://www-gene.cimr.cam.ac.uk/ucdr/grid.shtml (for information about the case collection)

References

  1. Anjos S, Polychronakos C (2004) Mechanisms of genetic susceptibility to type I diabetes: beyond HLA. Mol Genet Metab 81:187–195. doi:10.1016/j.ymgme.2003.11.010
  2. Barratt BJ, Payne F, Lowe CE, Hermann R, Healy BC, Harold D, Concannon P, Gharani N, McCarthy MI, Olavesen MG, McCormack R, Guja C, Ionescu-Tirgoviste C, Undlien DE, Ronningen KS, Gillespie KM, Tuomilehto-Wolf E, Tuomilehto J, Bennett ST, Clayton DG, Cordell HJ, Todd JA (2004) Remapping the insulin gene/IDDM2 locus in type 1 diabetes. Diabetes 53:1884–1889
  3. Bell GI, Horita S, Karam JH (1984) A polymorphic locus near the human insulin gene is associated with insulin-dependent diabetes mellitus. Diabetes 33:176–183
  4. Bonfield JK, Rada C, Staden R (1998) Automated detection of point mutations using fluorescent sequence trace subtraction. Nucleic Acids Res 26:3404–3409. doi:10.1093/nar/26.14.3404
  5. Bottini N, Musumeci L, Alonso A, Rahmouni S, Nika K, Rostamkhani M, MacMurray J, Meloni GF, Lucarelli P, Pellecchia M, Eisenbarth GS, Comings D, Mustelin T (2004) A functional variant of lymphoid tyrosine phosphatase is associated with type I diabetes. Nat Genet 36:337–338. doi:10.1038/ng1323
  6. Breslow NE, Day NE (1980) Statistical methods in cancer research: volume 1, the analysis of case-control studies. IARC Scientific Publications, Lyon
  7. Chapman JM, Cooper JD, Todd JA, Clayton D (2003) Detecting disease associations due to linkage disequilibrium using haplotype tags: a class of tests and the determinants of statistical power. Hum Hered 56:18–31. doi:10.1159/000073729
  8. Clayton D, Chapman J, Cooper J (2004) Use of unphased multilocus genotype data in indirect association studies. Genet Epidemiol 27:415–428. doi:10.1002/gepi.20032
  9. Clayton D, Hills M (1993) Statistical models in epidemiology. Oxford Science Publications, Oxford, United Kingdom
  10. Cucca F, Lampis R, Congia M, Angius E, Nutland S, Bain SC, Barnett AH, Todd JA (2001) A correlation between the relative predisposition of MHC class II alleles to type 1 diabetes and the structure of their proteins. Hum Mol Genet 10:2025–2037. doi:10.1093/hmg/10.19.2025
  11. Dahlman I, Eaves IA, Kosoy R, Morrison VA, Heward J, Gough SC, Allahabadia A, Franklyn JA, Tuomilehto J, Tuomilehto-Wolf E, Cucca F, Guja C, Ionescu-Tirgoviste C, Stevens H, Carr P, Nutland S, McKinney P, Shield JP, Wang W, Cordell HJ, Walker N, Todd JA, Concannon P (2002) Parameters for reliable results in genetic association studies in common disease. Nat Genet 30:149–150. doi:10.1038/ng825
  12. Fan R, Knapp M (2003) Genome association studies of complex diseases by case-control designs. Am J Hum Genet 72:850–868
  13. Fehniger TA, Caligiuri MA (2001) Interleukin 15: biology and relevance to human disease. Blood 97:14–32. doi:10.1182/blood.V97.1.14
  14. Freimer N, Sabatti C (2004) The use of pedigree, sib-pair and association studies of common diseases for genetic mapping and epidemiology. Nat Genet 36:1045–1051. doi:10.1038/ng1433
  15. Hirschhorn JN (2003) Genetic epidemiology of type 1 diabetes. Pediatr Diabetes 4:87–100. doi:10.1034/j.1399-5448.2001.00013.x
  16. Ioannidis JPA, Trikalinos TA, Ntzani EE, Contopoulos-Ioannidis DG (2003) Genetic associations in large versus small studies: an empirical assessment. Lancet 361:567–571. doi:10.1016/S0140-6736(03)12516-0
  17. Johnson GCL, Esposito L, Barratt BJ, Smith AN, Heward J, Di Genova G, Ueda H, Cordell HJ, Eaves IA, Dudbridge F, Twells RCJ, Payne F, Hughes W, Nutland S, Stevens H, Carr P, Tuomilehto-Wolf E, Tuomilehto J, Gough SCL, Clayton DG, Todd JA (2001) Haplotype tagging for the identification of common disease genes. Nat Genet 29:233–237. doi:10.1038/ng1001-233
  18. Kim HP, Leonard WJ (2002) The basis for TCR-mediated regulation of the IL-2 receptor α chain gene: role of widely separated regulatory elements. EMBO J 21:3051–3059. doi:10.1093/emboj/cdf321
  19. Lohmueller KE, Pearce CL, Pike M, Lander ES, Hirschhorn JN (2003) Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nat Genet 33:177–182 10.1038/ng1071
  20. Lowe CE, Cooper JD, Chapman JM, Barratt BJ, Twells RCJ, Green EA, Savage DA, Guja C, Ionescu-Tîrgovişte C, Tuomilehto-Wolf E, Tuomilehto J, Todd JA, Clayton DG (2004) Cost-effective analysis of candidate genes using htSNPs: a staged approach. Genes Immun 5:301–305 10.1038/sj.gene.6364064
  21. Malek TR, Bayer AL (2004) Tolerance, not immunity, crucially depends on IL-2. Nat Rev Immunol 4:665–674 10.1038/nri1435
  22. Nisticó L, Buzzetti R, Pritchard LE, Van der Auwera B, Giovannini C, Bosi E, Larrad MT, Rios MS, Chow CC, Cockram CS, Jacobs K, Mijovic C, Bain SC, Barnett AH, Vandewalle CL, Schuit F, Gorus FK, Tosi R, Pozzilli P, Todd JA (1996) The CTLA-4 gene region of chromosome 2q33 is linked to, and associated with, type 1 diabetes. Hum Mol Genet 5:1075–1080 10.1093/hmg/5.7.1075
  23. Salomon B, Lenschow DJ, Rhee L, Ashourian N, Singh B, Sharpe A, Bluestone JA (2000) B7/CD28 costimulation is essential for the homeostasis of the CD4+CD25+ immunoregulatory T cells that control autoimmune diabetes. Immunity 12:431–440 10.1016/S1074-7613(00)80195-8
  24. Sharfe N, Dadi HK, Shahar M, Roifman CM (1997) Human immune disorder arising from mutation of the α chain of the interleukin-2 receptor. Proc Natl Acad Sci USA 94:3168–3171 10.1073/pnas.94.7.3168
  25. Smyth D, Cooper JD, Collins JE, Heward JM, Franklyn JA, Howson JM, Vella A, Nutland S, Rance HE, Maier L, Barratt BJ, Guja C, Ionescu-Tirgoviste C, Savage DA, Dunger DB, Widmer B, Strachan DP, Ring SM, Walker N, Clayton DG, Twells RC, Gough SC, Todd JA (2004) Replication of an association between the lymphoid tyrosine phosphatase locus (LYP/PTPN22) with type 1 diabetes, and evidence for its role as a general autoimmunity locus. Diabetes 53:3020–3023
  26. Smyth D, Howson JM, Lowe CE, Walker N, Lam AC, Nutland S, Hutchings J, Tuomilehto-Wolf E, Tuomilehto J, Guja C, Ionescu-Tirgoviste C, Undlien DE, Ronningen KS, Savage DA, Dunger DB, Twells RC, McArdle W, Strachan DP, Todd JA (2005) Assessing the validity of the association between the SUMO4 M55V variant and risk of type 1 diabetes. Nat Genet 37:110–111 10.1038/ng0205-110
  27. Thomas DC, Clayton DG (2004) Betting odds and genetic associations. J Natl Cancer Inst 96:421–423
  28. Todd JA, Wicker LS (2001) Genetic protection from the inflammatory disease type 1 diabetes in humans and animal models. Immunity 15:387–395 10.1016/S1074-7613(01)00202-3
  29. Toledano MB, Roman DG, Halden NF, Lin BB, Leonard WJ (1990) The same target sequences are differentially important for activation of the interleukin 2 receptor α-chain gene in two distinct T-cell lines. Proc Natl Acad Sci USA 87:1830–1834
  30. Ueda H, Howson JM, Esposito L, Heward J, Snook H, Chamberlain G, Rainbow DB, et al (2003) Association of the T-cell regulatory gene CTLA4 with susceptibility to autoimmune disease. Nature 423:506–511 10.1038/nature01621
  31. Vella A, Howson JM, Barratt BJ, Twells RC, Rance HE, Nutland S, Tuomilehto-Wolf E, Tuomilehto J, Undlien DE, Ronningen KS, Guja C, Ionescu-Tirgoviste C, Savage DA, Todd JA (2004) Lack of association of the Ala(45)Thr polymorphism and other common variants of the NeuroD gene with type 1 diabetes. Diabetes 53:1158–1161
  32. Viglietta V, Baecher-Allan C, Weiner HL, Hafler DA (2004) Loss of functional suppression by CD4+CD25+ regulatory T cells in patients with multiple sclerosis. J Exp Med 199:971–979 10.1084/jem.20031579
  33. Wacholder S, Chanock S, Garcia-Closas M, El Ghormli L, Rothman N (2004) Assessing the probability of false-positive reports in molecular epidemiology studies. J Natl Cancer Inst 96:434–442
  34. Wang WYS, Barratt BJ, Clayton DG, Todd JA (2005) Genome-wide association studies: theoretical and practical concerns. Nat Rev Genet 6:109–118 10.1038/nrg1522
  35. Xiong M, Zhao J, Boerwinkle E (2002) Generalized T² test for genome association studies. Am J Hum Genet 70:1257–1268

Articles from American Journal of Human Genetics are provided here courtesy of American Society of Human Genetics
