JAMA Netw Open. 2020 Oct 22;3(10):e2017666. doi: 10.1001/jamanetworkopen.2020.17666

Association of Uncommon, Noncoding Variants in the APOE Region With Risk of Alzheimer Disease in Adults of European Ancestry

Elizabeth E Blue 1, Anqi Cheng 2, Sunny Chen 3, Chang-En Yu 3,4, for the Alzheimer’s Disease Genetics Consortium
PMCID: PMC7582128  PMID: 33090224

This genetic association study assesses whether variants near the apolipoprotein E gene (APOE) are associated with Alzheimer disease independently of ε2/ε3/ε4 genotype.

Key Points

Question

Is genetic variation near the apolipoprotein E gene (APOE) associated with risk of Alzheimer disease (AD) independently of the ε2/ε3/ε4 genotype?

Findings

In this genetic association study of 18 795 participants of European ancestry from the Alzheimer’s Disease Genetics Consortium, an association was found between rs2075650 and AD risk among ε4 homozygotes and a significant association was found between rs192879175 and AD risk among ε3 homozygotes.

Meaning

These findings suggest that even among individuals with the same ε2/ε3/ε4 genotype, genetic variation within the APOE neighboring region may be associated with risk of AD.

Abstract

Importance

The ε2 and ε4 alleles of the apolipoprotein E (APOE) gene are associated with Alzheimer disease (AD) risk. Although nearby genetic variants have also been shown to be associated with AD, including rs2075650 in the TOMM40 gene and rs4420638 near the APOC1 gene, it is unknown whether these associations are independent of the ε2 and ε4 alleles.

Objective

To assess whether variants near APOE are associated with AD independently of the ε2/ε3/ε4 genotype.

Design, Setting, and Participants

In this genetic association study of the Alzheimer’s Disease Genetics Consortium imputed genotype data, 14 415 variants near APOE (±500 kilobases) for 18 795 individuals with European ancestry were tested for association with AD using 4 logistic mixed models adjusting for sex, cohort, population structure, and relatedness. Model 1 had no APOE adjustment, and model 2 adjusted for the count of ε2 and ε4 alleles. Model 3 was restricted to ε3 homozygotes, and model 4 was restricted to ε4 homozygotes. Data were downloaded from May 31, 2018, to June 3, 2018, and analyzed from November 1, 2018, to June 24, 2020.

Main Outcomes and Measures

Alzheimer disease affection status was defined by clinicians using standard National Institute of Neurological and Communicative Disorders and Stroke and Alzheimer Disease and Related Disorders Association criteria. Association was evaluated using score tests; results with P < .05 divided by the number of independent tests per model were considered statistically significant.

Results

Among the 18 795 individuals in the study, 9704 were affected by AD and 9066 were control individuals; the median age at onset/evaluation was 76 (interquartile range, 70-82) years; and 11 167 were female (59.4%). Associations with AD were found for rs2075650 (odds ratio [OR], 2.59; 95% CI, 2.45-2.75; P = 3.19 × 10−228) and rs4420638 (OR, 2.77; 95% CI, 2.62-2.94; P = 2.99 × 10−254) without APOE adjustment. Although rs2075650 was nominally associated with AD among the ε4 homozygotes (OR, 1.33; 95% CI, 1.00-1.77; P = .047), the association between rs4420638 and AD was eliminated by APOE adjustment (model 2 OR, 1.06 [95% CI, 0.96-1.18; P = .24]; model 3 OR, 1.13 [95% CI, 0.95-1.34; P = .18]; model 4 OR, 0.90 [95% CI, 0.56-1.45; P = .66]). There was a significant association between rs192879175 and AD among ε3 homozygotes (OR, 0.50; 95% CI, 0.37-0.68; P = 8.30 × 10−6).

Conclusions and Relevance

The results of this genetic association study suggest that ε2/ε3/ε4 alleles are not the only variants in the APOE region that are associated with AD risk. Additional work with independent data is needed to replicate these results.

Introduction

The association between the apolipoprotein E (APOE [OMIM 107741]) gene and Alzheimer disease (AD) has been known for longer than 25 years1,2 and has remained the strongest and most consistent association between AD risk and a common DNA variant.3,4 Dozens of genetic loci are associated with risk of AD, and hundreds of variants across 3 genes (APP [OMIM 605714], PSEN1 [OMIM 104311], and PSEN2 [OMIM 600759]) are known to cause early-onset, autosomal dominant forms of AD.4,5,6 This genetic heterogeneity has also been observed at the APOE locus. Two independent missense variants in APOE, rs429358 and rs7412, are consistently associated with large effects on AD risk, and together define the ε2/ε3/ε4 alleles. Associations between many other single-nucleotide variants (SNVs) at the APOE locus with AD risk, age at onset, and/or biomarkers have been reported.7,8

Whether the association between SNVs in the APOE region and AD is independent of the effects of rs429358 and rs7412 is not settled. Many of these SNVs are in linkage disequilibrium (LD) with rs429358 in European ancestry samples, and most are noncoding changes that could affect gene expression.7,8 The APOE locus includes a long cluster of genes transcribed in the same direction, suggesting that they may be coregulated by cis regulatory elements. These genes have also been implicated in shared biological pathways, including lipid metabolism, the immune system, and mitochondrial function,5,7 which suggests that changes in either quality or quantity of the products of these genes may also be associated with AD.

Two noncoding SNVs at the APOE locus have consistently shown an association with AD risk and related traits: rs2075650 (the TOMM40 SNV [OMIM 608061]) and rs4420638 (the APOC1 SNV [OMIM 107710]). The association between these SNVs and AD is not always robust to APOE adjustment.9,10 Both SNVs are also associated with memory and cognitive function, cerebral spinal fluid biomarkers for immune response,11 oxidative stress markers,9 and longevity.12,13,14 However, because both SNVs are in moderate LD (0.2 < r2 < 0.8) with rs429358, these associations may not be independent of the ε4 allele.

We investigated whether rs2075650, rs4420638, or other SNVs in the extended APOE locus are associated with risk of AD independently of ε2/ε3/ε4 genotype in a large cohort with European ancestry. We hypothesized that the analytical strategy to adjust for APOE effects may influence these association signals.

Methods

Samples and Genotype Data

This genetic association study used Alzheimer’s Disease Genetics Consortium (ADGC) data, which were accessed through an application on the ADGC website.15 All participants reported European ancestry. This study was approved by the University of Washington institutional review board and followed the Strengthening the Reporting of Genetic Association Studies (STREGA) reporting guideline. This study evaluated publicly available deidentified data provided by the ADGC. Informed consent was obtained for all research participants as previously described.16

The ADGC imputed genotype data were previously generated using the segmented haplotype estimation and imputation tool (SHAPEIT)17 and IMPUTE, version 2,18 or MaCH19 and Minimac20 software and the 1000 Genomes Project (1KGP) sequence data as reference (phase 3; hg19/GRCh37),5,21 in which imputed variants with minor allele frequencies (MAFs) of at least 0.01 and either an r2 or an information measure of less than 0.40 were removed. After excluding 2 data sets owing to incomplete data files, we extracted the SNVs on a bead chip array (Infinium OmniExpress; Illumina) to create a genome-wide association study (GWAS) panel used to estimate principal components, relatedness, and genomic inflation (λ statistic).22 Single-nucleotide variants with an MAF of less than 0.05, variant-level missing rate of greater than 0.05, or ambiguous alleles were excluded from analysis, as were samples with individual-level missing rate of greater than 0.05; 510 665 variants in 18 795 participants remained. We extracted the 14 415 imputed SNVs within the APOE gene (±500 kilobase [kb]) (chromosome 19: 44 909 039-45 912 650) for association testing. Case individuals were defined as those affected by AD as determined by clinicians using the National Institute of Neurological and Communicative Disorders and Stroke and Alzheimer Disease and Related Disorders Association criteria,23,24 and control individuals were those not affected by AD. APOE genotypes (ε2/ε3/ε4) were extracted from the cohort-specific covariate files. APOE was genotyped differently across ADGC cohorts.16
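
The quality-control thresholds for the GWAS panel can be illustrated with a minimal Python sketch. It assumes genotypes are stored as alternate-allele counts (0, 1, or 2, with None for a missing call) in a dictionary keyed by variant. The function names, the data layout, and the order of filtering (variants first, then samples) are assumptions made for illustration and are not taken from the ADGC pipeline. The ambiguous-allele filter is omitted because it requires allele labels.

```python
from typing import Dict, List, Optional, Tuple

Genotype = Optional[int]  # alternate-allele count 0/1/2, or None if missing


def missing_rate(calls: List[Genotype]) -> float:
    """Fraction of calls that are missing."""
    return sum(c is None for c in calls) / len(calls)


def minor_allele_frequency(calls: List[Genotype]) -> float:
    """MAF computed from the non-missing calls of one variant."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1.0 - alt_freq)


def filter_panel(genotypes: Dict[str, List[Genotype]],
                 maf_min: float = 0.05,
                 max_missing: float = 0.05) -> Tuple[List[str], List[int]]:
    """Return (kept variant names, kept sample indices).

    Assumed order: drop variants with MAF < maf_min or missing rate
    > max_missing, then drop samples whose missing rate across the
    remaining variants exceeds max_missing.
    """
    kept_variants = [
        name for name, calls in genotypes.items()
        if missing_rate(calls) <= max_missing
        and minor_allele_frequency(calls) >= maf_min
    ]
    if not kept_variants:
        return [], []
    n_samples = len(genotypes[kept_variants[0]])
    kept_samples = [
        i for i in range(n_samples)
        if missing_rate([genotypes[v][i] for v in kept_variants]) <= max_missing
    ]
    return kept_variants, kept_samples
```

On real data these filters would normally be applied with dedicated genetics software; the sketch is only meant to make the thresholds concrete.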

Statistical Analysis

Data were downloaded from May 31, 2018, to June 3, 2018, and analyzed from November 1, 2018, to June 24, 2020. The GENESIS package was used to test for the association between SNVs and AD risk,25,26 an approach that accounts for both population and pedigree structure (eMethods in the Supplement). PC-AiR27 performed a principal components analysis on the GWAS panel to detect population structure, accounting for kinship estimates provided using the KING approach for robust inference.28 PC-Relate29 then used these principal components to estimate a genetic relatedness matrix that is adjusted for population structure. Plots of the first 2 principal components were used to identify outliers among those with self-reported European ancestry. We fit 4 logistic mixed models adjusted for sex, cohort, the first 10 principal components, and a polygenic random effect with covariance structure given by the genetic relatedness matrix. Model 1 included all samples with no APOE adjustment, model 2 included all samples and adjusted for ε2 and ε4 allele counts, model 3 was restricted to ε3 homozygotes, and model 4 was restricted to ε4 homozygotes. Score tests were performed for each logistic model for all SNVs with an MAF of greater than 0.01, with missing genotype data imputed using observed allele frequencies within the data. We estimated the odds ratio (OR) and its 95% CI as follows: OR = exp(score statistic/standard error²), with the 95% CI obtained by exponentiating (score statistic/standard error²) ± 1.96 × (1/standard error). For each model m, the number of independent tests tm was estimated using the genetic type 1 error calculator.30 Statistical significance was defined as P < .05/tm.

Linkage disequilibrium between pairs of SNVs was measured using PLINK, version 1.07.31 Correlations between ε2 and ε4 genotypes and imputed genotypes at rs7412 and rs429358 were estimated using R, version 3.5.2.32 The mismatch between observed and imputed ε2 and ε4 genotypes was calculated as the number of alleles differing between the observed and imputed genotypes divided by the number of alleles observed. Basic variant annotations, including LD in the 1KGP subset with European ancestry, were performed using HaploReg, version 4.1.33 Ancestry-matched reference allele frequencies (European MAF) were extracted for non-Finnish Europeans in the gnomAD database, version 2.1.34
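
The conversion from a score test to an odds ratio described in this section can be sketched in Python. The sketch reads the formula as an approximate log odds ratio equal to the score statistic divided by its squared standard error, with an approximate standard error of 1/(standard error of the score), and exponentiates both the estimate and its interval limits. The function and argument names are hypothetical, and the inputs in the usage lines are invented for illustration.

```python
import math
from typing import Tuple


def score_to_odds_ratio(score: float, score_se: float,
                        z: float = 1.96) -> Tuple[float, float, float]:
    """Approximate OR and 95% CI from a score statistic and its standard error."""
    if score_se <= 0:
        raise ValueError("score_se must be positive")
    log_or = score / score_se ** 2   # approximate effect on the log-odds scale
    log_or_se = 1.0 / score_se       # approximate standard error of log_or
    return (math.exp(log_or),
            math.exp(log_or - z * log_or_se),
            math.exp(log_or + z * log_or_se))


# Invented inputs, not values from the study.
odds_ratio, ci_low, ci_high = score_to_odds_ratio(score=-10.0, score_se=5.0)
print(f"OR = {odds_ratio:.2f} (95% CI, {ci_low:.2f}-{ci_high:.2f})")
```

Working on the log-odds scale and exponentiating at the end keeps the interval limits positive and makes the interval symmetric around the estimate on the log scale.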

Results

Summary Statistics

The data within the APOE region include 14 415 SNVs and 18 795 individuals, of whom 11 167 were women (59.4%) and 7628 were men (40.6%) (median age at onset/evaluation, 76 [interquartile range, 70-82] years); 9704 were affected by AD (51.6%), and 9066 were controls (48.2%) (eTable 1 in the Supplement). Among cases, the ε2 allele frequency was 680 of 19 408 (3.5%), and the ε4 allele frequency was 7360 of 19 408 (37.9%); among controls, the ε2 frequency was 1444 of 18 132 (8.0%) and the ε4 frequency was 2490 of 18 132 (13.7%). We observed 71 ε2 homozygotes, 8848 ε3 homozygotes, and 1503 ε4 homozygotes. No outliers were identified by principal components analysis (eFigure 1 in the Supplement), and relatedness estimates were robust to the inclusion of genotypes from chromosome 19 (eFigure 2 in the Supplement). The number of independent tests within the APOE region was similar across analysis models (t1 and t2, 1128; t3, 1055; and t4, 1013), with similar significance thresholds (t1 and t2, P = 4.43 × 10−5; t3, P = 4.74 × 10−5; and t4, P = 4.94 × 10−5).
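
The model-specific significance thresholds and the allele frequencies reported above follow from simple arithmetic, which the Python sketch below reproduces. The counts are taken from this paragraph; the function and variable names are illustrative only.

```python
def bonferroni_threshold(n_independent_tests: int, alpha: float = 0.05) -> float:
    """Significance threshold: alpha divided by the number of independent tests."""
    return alpha / n_independent_tests


def allele_frequency(allele_count: int, total_alleles: int) -> float:
    """Allele frequency as a proportion of all alleles in the group."""
    return allele_count / total_alleles


# Effective numbers of independent tests reported for each model.
effective_tests = {"model 1": 1128, "model 2": 1128, "model 3": 1055, "model 4": 1013}
for model, n_tests in effective_tests.items():
    print(f"{model}: P < {bonferroni_threshold(n_tests):.2e}")

# Allele counts reported among cases (19 408 alleles) and controls (18 132 alleles).
counts = {
    ("cases", "e2"): (680, 19408),
    ("cases", "e4"): (7360, 19408),
    ("controls", "e2"): (1444, 18132),
    ("controls", "e4"): (2490, 18132),
}
for (group, allele), (count, total) in counts.items():
    print(f"{group} {allele}: {100 * allele_frequency(count, total):.1f}%")
```

The total allele counts are twice the number of cases (9704) and controls (9066) because each participant carries 2 APOE alleles.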

Associations of rs2075650 and rs4420638 With ε2, ε4, and AD Risk

There was a stronger LD among rs2075650 (TOMM40), rs4420638 (APOC1), and rs429358 (ε4) in the ADGC data than in 1KGP Europeans, and none of these SNVs were in LD with rs7412 (ε2) (eTable 2 in the Supplement). Among the 1KGP Europeans, both rs2075650 (r2 = 0.48) and rs4420638 (r2 = 0.65) had moderate LD with rs429358 and modest LD with each other (r2 = 0.30). These correlations were strengthened in the ADGC data, in which r2 ranged from 0.50 to 0.83 among these 3 SNVs.
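
As a rough illustration of the r2 values discussed here, the Python sketch below computes the squared Pearson correlation between alternate-allele counts at 2 SNVs. This genotype-based quantity is a common approximation to LD and is an assumption of the sketch; the estimator implemented in PLINK may differ from it. The function name and the toy genotypes are hypothetical.

```python
import math
from typing import List, Optional


def genotype_r2(x: List[Optional[int]], y: List[Optional[int]]) -> float:
    """Squared Pearson correlation of allele counts (0/1/2), skipping missing pairs."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not pairs:
        return math.nan
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    if sxx == 0 or syy == 0:
        return math.nan  # a monomorphic SNV has no defined correlation
    return sxy ** 2 / (sxx * syy)


# Toy genotypes, invented for illustration.
snv_1 = [0, 1, 2, 0, 1, None]
snv_2 = [0, 1, 2, 0, 1, 2]
print(genotype_r2(snv_1, snv_2))
```

Because r2 is a squared quantity, it is the same whether the 2 alternate alleles tend to occur together or on opposite haplotypes.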

The association between AD status and the TOMM40 and APOC1 SNVs varied across models (Table 1), each showing no evidence for genomic inflation (λ1 = 1.03; λ2 = 1.03; λ3 = 1.01; and λ4 = 0.99) (eFigure 3 in the Supplement).

Table 1. Association Between the TOMM40, APOC1, and APOE SNVs and AD With and Without APOE Adjustment or Stratification.

| Modelᵃ | SNV | Nearest gene | No. of participants | AAC | AAF | OR (95% CI) | P value |
|---|---|---|---|---|---|---|---|
| 1 | rs2075650 | TOMM40 | 18 211 | 8108 | 0.2226 | 2.59 (2.45-2.75) | 3.19 × 10−228ᵇ |
| 2 | rs2075650 | TOMM40 | 18 211 | 8108 | 0.2226 | 1.09 (0.99-1.19) | .07 |
| 3 | rs2075650 | TOMM40 | 8642 | 746 | 0.0432 | 1.16 (0.98-1.38) | .09 |
| 4 | rs2075650 | TOMM40 | 1426 | 2106 | 0.7400 | 1.33 (1.00-1.77) | .047ᶜ |
| 1 | rs4420638 | APOC1 | 15 894 | 7967 | 0.2506 | 2.77 (2.62-2.94) | 2.99 × 10−254ᵇ |
| 2 | rs4420638 | APOC1 | 15 894 | 7967 | 0.2506 | 1.06 (0.96-1.18) | .24 |
| 3 | rs4420638 | APOC1 | 7821 | 674 | 0.0431 | 1.13 (0.95-1.34) | .18 |
| 4 | rs4420638 | APOC1 | 1058 | 1893 | 0.8900 | 0.90 (0.56-1.45) | .66 |

Abbreviations: AAC, alternate allele count; AAF, alternate allele frequency; AD, Alzheimer disease; APOE, apolipoprotein E; OR, odds ratio; SNV, single-nucleotide variant.

ᵃ Model 1 included all samples, no APOE adjustment; model 2, all samples, adjusted for APOE ε2 and ε4 allele counts; model 3, restricted to ε3 homozygotes; and model 4, restricted to ε4 homozygotes.

ᵇ Indicates passing the model-specific significance threshold.

ᶜ Indicates nominally significant.

Each SNV was significantly associated with AD without APOE adjustment (model 1) (OR for rs2075650, 2.59 [95% CI, 2.45-2.75; P = 3.19 × 10−228]; OR for rs4420638, 2.77 [95% CI, 2.62-2.94; P = 2.99 × 10−254]), although these associations weakened with APOE adjustment or stratification. rs4420638 was not associated with AD with APOE adjustment (model 2: OR, 1.06; 95% CI, 0.96-1.18; P = .24), among ε3 homozygotes (model 3: OR, 1.13; 95% CI, 0.95-1.34; P = .18), or among ε4 homozygotes (model 4: OR, 0.90; 95% CI, 0.56-1.45; P = .66). The association between rs2075650 and AD was nominally significant among ε4 homozygotes (model 4) (OR, 1.33; 95% CI, 1.00-1.77; P = .047) but failed to reach significance after APOE adjustment (model 2; OR, 1.09; 95% CI, 0.99-1.19; P = .07) or among ε3 homozygotes (model 3; OR, 1.16; 95% CI, 0.98-1.38; P = .09).
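
A reader can roughly check how an OR, its 95% CI, and the P value fit together with a Wald-type back-calculation, sketched below in Python. This is an approximation introduced here for illustration: the study's P values came from score tests, and the inputs are rounded values from Table 1, so the recomputed P values are expected to be close to, but not identical with, the reported ones. The function name is hypothetical.

```python
import math
from typing import Tuple


def wald_p_from_ci(odds_ratio: float, ci_low: float, ci_high: float,
                   z_crit: float = 1.96) -> Tuple[float, float]:
    """Approximate z statistic and 2-sided P value from an OR and its 95% CI."""
    log_se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z_stat = math.log(odds_ratio) / log_se
    # 2-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z_stat) / math.sqrt(2))
    return z_stat, p_value


# Rounded rs2075650 estimates reported for models 2 to 4.
reported = {
    "model 2": (1.09, 0.99, 1.19),
    "model 3": (1.16, 0.98, 1.38),
    "model 4": (1.33, 1.00, 1.77),
}
for model, (or_, low, high) in reported.items():
    z_stat, p_value = wald_p_from_ci(or_, low, high)
    print(f"{model}: z = {z_stat:.2f}, approximate P = {p_value:.3f}")
```

The model 4 estimate sits at the boundary of nominal significance, with a lower confidence limit of 1.00, which is why rounding alone can move the recomputed P value to either side of .05.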

Another TOMM40 variant (rs10524523, also known as poly-T 523) has been reported to be associated with AD risk35 but was not available in ADGC data. Using a proxy SNV, rs8106922, which best defines the phylogenetic clade separating long vs short poly-T alleles,36 we found that rs2075650, rs4420638, rs429358, and rs7412 were not in LD with rs8106922 in ADGC data or 1KGP Europeans (r2 < 0.20). Although the minor allele at rs8106922 was significantly associated with reduced risk of AD under model 1 (OR, 0.69; 95% CI, 0.65-0.72; P < .001), the association was not significant under any model adjusting for or stratifying by APOE genotype (eTable 3 in the Supplement).

Imputed vs Measured APOE Genotyping

We observed discordance between the observed ε2 and ε4 genotypes and the imputed genotypes at the SNVs used to define them. Both rs429358 (ε4) and rs7412 (ε2) were polymorphic in the imputed data in which they should not have been observed, that is, among ε3 homozygotes (173 of 17 276 and 79 of 17 052 alleles, respectively) and ε4 homozygotes (2314 of 2492 and 41 of 2664 alleles, respectively). Within the ADGC data, the ε2 and rs7412 genotypes were correlated (r2 = 0.77; P < .001), with a 1.3% mismatch between observed and imputed genotypes, and both the correlation (r2 = 0.88; P < .001) and mismatch (2.3%) between the ε4 and rs429358 genotypes were higher. Both the correlation between observed and imputed genotypes and the mismatch between them varied by APOE genotyping strategies (eTable 4 in the Supplement). The SNV-based genotyping had the highest correlation with imputed ε2 (r2 = 0.81) and ε4 (r2 = 0.90), and high-throughput sequencing had the lowest (r2 = 0.47 and r2 = 0.78, respectively). This discordance between observed and imputed ε2 and ε4 genotypes may have led to spurious associations with AD; there was a nominal association between imputed genotypes at rs429358 and AD after APOE adjustment (model 2 OR, 1.16; 95% CI, 1.00-1.34; P = .04) and among ε3 homozygotes (model 3 OR, 1.73; 95% CI, 1.26-2.38; P = 6.32 × 10−4). Imputation accuracy varies based on both the observed marker panel and the reference data set; older arrays performed worse with the 1KGP reference panel used by the ADGC (rs7412, r2 = 0.75; rs429358, r2 = 0.82) than newer arrays (rs7412, r2 = 0.95; rs429358, r2 = 0.95), and both performed better when using the Haplotype Reference Consortium reference panel (r2 > 0.98).37
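
The mismatch measure defined in the Statistical Analysis section (alleles differing between observed and imputed genotypes, divided by the number of alleles observed) can be sketched in Python as follows. The sketch assumes that both genotypes are coded as counts (0, 1, or 2) of the allele of interest, that imputed genotypes are hard calls, and that each participant contributes 2 observed alleles; these coding choices, the function name, and the toy data are assumptions for illustration.

```python
from typing import List


def allele_mismatch_rate(observed: List[int], imputed: List[int]) -> float:
    """Differing alleles between observed and imputed genotypes, per observed allele."""
    if len(observed) != len(imputed):
        raise ValueError("observed and imputed must have the same length")
    if not observed:
        raise ValueError("at least one genotype is required")
    # A genotype that is off by one allele count differs by one allele.
    differing_alleles = sum(abs(o - i) for o, i in zip(observed, imputed))
    alleles_observed = 2 * len(observed)
    return differing_alleles / alleles_observed


# Toy e4 allele counts, invented for illustration.
observed_e4 = [0, 1, 2, 0]
imputed_rs429358 = [0, 1, 1, 1]
print(allele_mismatch_rate(observed_e4, imputed_rs429358))  # 2 of 8 alleles differ: 0.25
```

Counting at the allele level rather than the genotype level means that a homozygote imputed as a heterozygote contributes 1 mismatch, whereas a homozygote imputed as the opposite homozygote contributes 2.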

Additional Associations of APOE Region SNVs and AD Risk

Among the 14 415 SNVs in the APOE region, we identified 1 significant association across models after correcting for the effective number of independent tests. The Figure provides Manhattan plots of the associations with AD across models 2 to 4, and Table 2 summarizes the 12 strongest associations with AD across these 3 models. One SNV (rs192879175) was significantly associated with AD among ε3 homozygotes (model 3 OR, 0.50; 95% CI, 0.37-0.68; P = 8.30 × 10−6). No other SNVs were significantly associated with AD after multiple testing correction. None of these 12 SNVs were common in the ADGC data set (MAF > 0.10) or in LD with either rs429358 or rs7412 (maximum r2 = 0.006), and the BCAM missense variant rs117737673 represented the only coding change.

Figure. Manhattan Plot of Association Results Between Single-Nucleotide Variations in the Apolipoprotein E (APOE) Gene Region and Risk for Alzheimer Disease Across Analysis Models.

Variant positions on chromosome 19 are relative to the hg19/GRCh37 reference genome. For model 2, the analysis included all samples adjusted for APOE ε2 and ε4 allele counts, with 1128 effective independent tests. For model 3, the analysis was restricted to APOE ε3 homozygotes, with 1055 effective independent tests. For model 4, the analysis was restricted to APOE ε4 homozygotes, with 1013 effective independent tests. The horizontal orange line denotes the statistical significance threshold per model (P < .05/number of independent tests), whereas the blue dotted line denotes P = 1/number of independent tests. The blue dotted square highlights the genes falling within the region harboring variants with P < 1/effective number of tests. Mb indicates megabase.

Table 2. Additional SNVs Within the APOE Region With an Association With AD Status Across Models 2, 3, and 4.

| Modelᵃ | SNV | BP37 | ALT | No. of participants | AAC | AAF | OR (95% CI) | P value |
|---|---|---|---|---|---|---|---|---|
| 2 | rs143764218 | 45222739 | AC | 16 714 | 915 | 0.03 | 0.76 (0.64-0.89) | 6.26 × 10−4 |
| 3 | rs143764218 | 45222739 | AC | 7794 | 507 | 0.03 | 0.69 (0.56-0.85) | 5.20 × 10−4 |
| 3 | rs1979377 | 45259002 | C | 7396 | 801 | 0.05 | 0.71 (0.59-0.84) | 6.84 × 10−5 |
| 3 | chr19:45264102:I | 45264102 | TG | 7518 | 555 | 0.04 | 0.68 (0.56-0.83) | 1.67 × 10−4 |
| 3 | rs10416720 | 45264110 | T | 7491 | 846 | 0.06 | 0.75 (0.63-0.88) | 5.72 × 10−4 |
| 3 | rs145414981 | 45265003 | C | 7355 | 718 | 0.05 | 0.74 (0.62-0.88) | 9.18 × 10−4 |
| 3 | rs73572003 | 45302665 | G | 7982 | 1250 | 0.08 | 0.79 (0.69-0.91) | 8.32 × 10−4 |
| 3 | rs143695016 | 45302840 | T | 8003 | 1251 | 0.08 | 0.79 (0.68-0.90) | 6.59 × 10−4 |
| 3 | rs192879175 | 45305363 | T | 8635 | 256 | 0.01 | 0.50 (0.37-0.68) | 8.30 × 10−6ᵇ |
| 3 | rs28399650 | 45314364 | A | 8633 | 433 | 0.03 | 0.68 (0.54-0.85) | 7.80 × 10−4 |
| 3 | rs28399652 | 45314975 | G | 8640 | 434 | 0.03 | 0.67 (0.54-0.85) | 7.36 × 10−4 |
| 3 | rs2968180 | 45318153 | T | 8218 | 1542 | 0.09 | 0.79 (0.70-0.90) | 3.31 × 10−4 |
| 3 | rs117737673 | 45322316 | T | 8489 | 546 | 0.03 | 0.70 (0.57-0.86) | 8.42 × 10−4 |

Abbreviations: AAC, alternate allele count; AAF, alternate allele frequency; AD, Alzheimer disease; ALT, alternate allele; APOE, apolipoprotein E; BP37, position on the hg19 map; OR, odds ratio; SNV, single-nucleotide variant.

ᵃ Model 2 included all samples, adjusted for APOE ε2 and ε4 allele counts; model 3, restricted to ε3 homozygotes; and model 4, restricted to ε4 homozygotes. The effective number of tests under model 2 was 1128 of 3408 SNVs; under model 3, 1055 of 3346 SNVs; and under model 4, 1013 of 3238 SNVs. All variants are on chromosome 19.

ᵇ Indicates passing the model-specific significance threshold.

Evidence for Replication

Limited evidence for replication of the significant associations presented in Tables 1 and 2 was available and was derived from 2 GWAS of AD in European ancestry samples. The family-based GWAS of the National Institute on Aging-Late Onset Alzheimer Disease Family Study (NIA-LOAD)38 included association tests within APOE strata. That analysis of 1421 ε3 homozygotes did not provide evidence for an association between rs2968180 and AD, whereas the analysis of 408 ε4 homozygotes supported the association between rs2075650 and AD. This evidence was not independent of the ADGC, because the NIA-LOAD sample was represented in the ADGC LOAD cohort (eTable 1 in the Supplement). The stage 1 meta-analysis of the International Genomics Alzheimer Project data39 represented 53 711 participants, including 10 273 from the ADGC.40 We compared results from the International Genomics Alzheimer Project analysis of 34 152 ε4-negative participants with our analysis of ε3 homozygotes. Results were available for 4 SNVs from Table 2; the associations between AD and rs145414981 and rs1979377 were nominally significant, whereas the associations between AD and rs143695016 and rs73572003 were not.39

Discussion

This study found associations between several SNVs near APOE and AD risk. Among these, rs192879175 was significantly associated with risk of AD among ε3 homozygotes, rs143764218 was nominally associated with AD after APOE adjustment and among ε3 homozygotes, and rs2075650 was nominally associated with AD among ε4 homozygotes. There was a stronger association between SNVs near APOE and AD status in the APOE-stratified vs the APOE-adjusted models. This finding was likely because these strata were restricted to individuals who shared 2 copies of an APOE allele identical by state and were therefore more likely to share recent common ancestry.

The TOMM40 SNV rs2075650 has a long history of an association with AD and related traits, including 8 GWAS for AD risk41,42,43,44,45,46,47,48 and several studies of healthy aging and longevity.12,13,49,50 It is a common variant located within intron 2 of TOMM40 (European MAF = 0.14). rs2075650 overlaps with promoter/enhancer histone marks in immune cells and brain tissues, is predicted to alter 8 transcription factor binding site motifs,51 and is significantly associated with TOMM40, PVRL2, and HIF3A expression levels.11,52,53 Both rs192879175 and rs143764218 are uncommon (rs192879175: European MAF, 0.01; rs143764218: European MAF, 0.05), are located between genes, and bear features consistent with regulatory variants. rs192879175 is 1.5 kb 3′ of CBLC, sits within enhancer histone marks and a DNase I hypersensitivity site in liver, and is predicted to alter a transcription factor binding site motif.51 Similarly, rs143764218 is 8.7 kb 3′ of CEACAM16, sits within promoter/enhancer histone marks and DNase I hypersensitivity site across multiple tissues, and is predicted to alter 7 transcription factor binding site motifs.51 Neither rs192879175 nor rs143764218 has previously been shown to be associated with AD or other traits by GWAS,54 perhaps owing to their uncommon allele frequencies.

We identified associations between noncoding variants in the APOE region and risk of AD. Haplotypic differences among participants sharing the same APOE genotype are associated with risk of AD.55,56,57 Haplotypes derived from rs429358, rs7412, and neighboring noncoding SNVs that vary in frequency across populations are associated with increased risk of AD.55 Admixture analyses in Puerto Rican, African American, and Caribbean Hispanic data sets have shown that ε4 alleles inherited on an African background are associated with reduced risk of AD compared with those inherited on a European background, again suggesting that haplotype structures correlated with ε4 vary between populations and are associated with AD risk.56,57 All SNVs with significant associations with AD were located within a 186-kb region immediately 5′ of APOE. All 5 genes in this region share the same transcriptional orientation as APOE, suggesting synchronized cis regulation might exist. Regulatory variants could modify this transcriptional pathway and subsequently change the gene expression profiles within this entire region.

Few AD genetics studies have accounted for APOE genotype, hampering replication efforts. As summarized above, 2 studies38,39 offered limited support for the 5 SNVs with evidence for association with AD in our study. However, both studies included a subset of the ADGC data analyzed herein and were not truly independent. Larger data sets with high-quality APOE genotype data are needed to replicate the results of the present study, particularly for the associations identified among ε4 homozygotes, including 1326 cases and 177 controls. Laboratory-based procedures such as molecular haplotyping, haplotype-based fine mapping,58 and reporter assays are needed to investigate the potential functional consequences of SNVs and how those consequences may influence AD pathogenesis.

Limitations

This study has limitations. Imputed genotype data are not without error. Most of the discordant genotypes we observed involved ε2 or ε4 alleles being imputed as ε3 alleles, consistent with prior work59; this likely contributed to the spurious association between rs429358 and AD among the ε3 homozygotes. The ADGC data were collected on a mixture of older and newer arrays, which may explain some of the discordance we observed between the observed and imputed APOE genotypes. We observed lower mismatch rates at ε2 and ε4 among those genotyped by an SNV-based approach with high accuracy (error rate, 0.00237) compared with those genotyped by next-generation sequencing, suggesting that genotyping error may explain these differences. The stronger correlation between the APOE region genotypes in the ADGC compared with the 1KGP Europeans (consistent with previous reports of differing LD patterns between AD cases and controls60) suggests that using sequence data generated on a large and diverse sample set ascertained for AD status as a reference may improve the quality of imputed genotypes in AD GWAS. Our data represent only those with European ancestry; thus, our results may not apply to other populations.

Conclusions

This genetic association study found that ε2/ε3/ε4 alleles as well as other variants in the APOE region were associated with AD risk. Although future work in independent data is needed to replicate these results, our findings appear to provide valuable new candidate sites for targeted genetic analyses on larger sample sets representing diverse ethnic groups. The findings suggest that increased LD between SNVs within the APOE region in samples ascertained for AD vs population samples may influence the accuracy of imputation within AD-related data sets. The correlation between imputed vs measured ε2 and ε4 genotypes within the ADGC varied by genotyping platform, suggesting next-generation sequencing at rs7412 and rs429358 may not be as accurate as alternative approaches. Association testing results in the APOE region vary between models adjusting for or stratifying by ε2/ε3/ε4 genotype; future GWAS using these alternative approaches may yield novel results in existing data sets.

Supplement.

eMethods. Software Tools

eTable 1. Sample Summary Table by ADGC Cohort

eTable 2. Evidence for Linkage Disequilibrium Among the TOMM40, APOC1, and APOE SNVs

eTable 3. Evidence for Association Between rs8106922 With and Without APOE Adjustment or Stratification

eTable 4. Comparison of APOE ε2 and ε4 Genotypes With Imputed Genotypes at rs7412 and rs429358

eFigure 1. Principal Components Analysis Results

eFigure 2. Difference Between Kinship Estimates Based on Genotypes for All Autosomes vs All Autosomes Except Chromosome 19

eFigure 3. Quantile-Quantile Plots of Genome-Wide Association Tests Under Each Analysis Model

eReferences.

References

  • 1.Corder EH, Saunders AM, Strittmatter WJ, et al. Gene dose of apolipoprotein E type 4 allele and the risk of Alzheimer’s disease in late onset families. Science. 1993;261(5123):921-923. doi: 10.1126/science.8346443 [DOI] [PubMed] [Google Scholar]
  • 2.Strittmatter WJ, Saunders AM, Schmechel D, et al. Apolipoprotein E: high-avidity binding to beta-amyloid and increased frequency of type 4 allele in late-onset familial Alzheimer disease. Proc Natl Acad Sci U S A. 1993;90(5):1977-1981. doi: 10.1073/pnas.90.5.1977 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Alzheimer’s Association 2017 Alzheimer’s disease facts and figures. Alzheimer's Dementia. 2017;13(4):325-373. doi: 10.1016/j.jalz.2017.02.001 [DOI] [Google Scholar]
  • 4.Pimenova AA, Raj T, Goate AM. Untangling genetic risk for Alzheimer’s disease. Biol Psychiatry. 2018;83(4):300-310. doi: 10.1016/j.biopsych.2017.05.014 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Kunkle BW, Grenier-Boley B, Sims R, et al. ; Alzheimer Disease Genetics Consortium (ADGC); European Alzheimer’s Disease Initiative (EADI); Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium (CHARGE); Genetic and Environmental Risk in AD/Defining Genetic, Polygenic and Environmental Risk for Alzheimer’s Disease Consortium (GERAD/PERADES) . Genetic meta-analysis of diagnosed Alzheimer’s disease identifies new risk loci and implicates Aβ, tau, immunity and lipid processing. Nat Genet. 2019;51(3):414-430. doi: 10.1038/s41588-019-0358-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Kinoshita J, Clark T. Alzforum. Methods Mol Biol. 2007;401:365-381. doi: 10.1007/978-1-59745-520-6_19 [DOI] [PubMed] [Google Scholar]
  • 7.Yashin AI, Fang F, Kovtun M, et al. Hidden heterogeneity in Alzheimer’s disease: insights from genetic association studies and other analyses. Exp Gerontol. 2018;107:148-160. doi: 10.1016/j.exger.2017.10.020 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Nho K, Kim S, Horgusluoglu E, et al. ; Alzheimer’s Disease Neuroimaging Initiative (ADNI) . Association analysis of rare variants near the APOE region with CSF and neuroimaging biomarkers of Alzheimer’s disease. BMC Med Genomics. 2017;10(suppl 1):29. doi: 10.1186/s12920-017-0267-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Prendecki M, Florczak-Wyspianska J, Kowalska M, et al. Biothiols and oxidative stress markers and polymorphisms of TOMM40 and APOC1 genes in Alzheimer’s disease patients. Oncotarget. 2018;9(81):35207-35225. doi: 10.18632/oncotarget.26184 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Huentelman M, Corneveaux J, Myers A, et al. Genome-wide association study for Alzheimer’s disease risk in a large cohort of clinically characterized and neuropathologically verified subjects. Alzheimer Dementia. 2010;6(4):e13. doi: 10.1016/j.jalz.2010.08.041 [DOI] [Google Scholar]
  • 11.Liu C, Chyr J, Zhao W, et al; Alzheimer’s Disease Neuroimaging Initiative. Genome-wide association and mechanistic studies indicate that immune response contributes to Alzheimer’s disease development. Front Genet. 2018;9:410. doi: 10.3389/fgene.2018.00410
  • 12.Pilling LC, Atkins JL, Bowman K, et al. Human longevity is influenced by many genetic variants: evidence from 75,000 UK Biobank participants. Aging (Albany NY). 2016;8(3):547-560. doi: 10.18632/aging.100930
  • 13.Yashin AI, Arbeev KG, Wu D, et al. Genetics of human longevity from incomplete data: new findings from the Long Life Family Study. J Gerontol A Biol Sci Med Sci. 2018;73(11):1472-1481. doi: 10.1093/gerona/gly057
  • 14.Shadyab AH, Kooperberg C, Reiner AP, et al. Replication of genome-wide association study findings of longevity in White, African American, and Hispanic women: the Women’s Health Initiative. J Gerontol A Biol Sci Med Sci. 2017;72(10):1401-1406. doi: 10.1093/gerona/glw198
  • 15.Alzheimer’s Disease Genetics Consortium. Accessed May 31, 2018. http://www.adgenetics.org/content/feedback-and-queries
  • 16.Naj AC, Jun G, Beecham GW, et al. Common variants at MS4A4/MS4A6E, CD2AP, CD33 and EPHA1 are associated with late-onset Alzheimer’s disease. Nat Genet. 2011;43(5):436-441. doi: 10.1038/ng.801
  • 17.Delaneau O, Marchini J, Zagury JF. A linear complexity phasing method for thousands of genomes. Nat Methods. 2011;9(2):179-181. doi: 10.1038/nmeth.1785
  • 18.Howie BN, Donnelly P, Marchini J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 2009;5(6):e1000529. doi: 10.1371/journal.pgen.1000529
  • 19.Li Y, Willer CJ, Ding J, Scheet P, Abecasis GR. MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genet Epidemiol. 2010;34(8):816-834. doi: 10.1002/gepi.20533
  • 20.Howie B, Fuchsberger C, Stephens M, Marchini J, Abecasis GR. Fast and accurate genotype imputation in genome-wide association studies through pre-phasing. Nat Genet. 2012;44(8):955-959. doi: 10.1038/ng.2354
  • 21.Auton A, Brooks LD, Durbin RM, et al; 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature. 2015;526(7571):68-74. doi: 10.1038/nature15393
  • 22.Devlin B, Roeder K. Genomic control for association studies. Biometrics. 1999;55(4):997-1004. doi: 10.1111/j.0006-341X.1999.00997.x
  • 23.McKhann G, Drachman D, Folstein M, Katzman R, Price D, Stadlan EM. Clinical diagnosis of Alzheimer’s disease: report of the NINCDS-ADRDA Work Group under the auspices of Department of Health and Human Services Task Force on Alzheimer’s Disease. Neurology. 1984;34(7):939-944. doi: 10.1212/WNL.34.7.939
  • 24.Dubois B, Feldman HH, Jacova C, et al. Research criteria for the diagnosis of Alzheimer’s disease: revising the NINCDS-ADRDA criteria. Lancet Neurol. 2007;6(8):734-746. doi: 10.1016/S1474-4422(07)70178-3
  • 25.GENESIS: Genetic Estimation and Inference in Structured Samples: statistical methods for analyzing genetic data from samples with population structure and/or relatedness [computer program]. R package version 2.4.0. R Foundation for Statistical Computing; 2016.
  • 26.Zheng X, Levine D, Shen J, Gogarten SM, Laurie C, Weir BS. A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics. 2012;28(24):3326-3328. doi: 10.1093/bioinformatics/bts606
  • 27.Conomos MP, Miller MB, Thornton TA. Robust inference of population structure for ancestry prediction and correction of stratification in the presence of relatedness. Genet Epidemiol. 2015;39(4):276-293. doi: 10.1002/gepi.21896
  • 28.Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen WM. Robust relationship inference in genome-wide association studies. Bioinformatics. 2010;26(22):2867-2873. doi: 10.1093/bioinformatics/btq559
  • 29.Conomos MP, Reiner AP, Weir BS, Thornton TA. Model-free estimation of recent genetic relatedness. Am J Hum Genet. 2016;98(1):127-148. doi: 10.1016/j.ajhg.2015.11.022
  • 30.Li MX, Yeung JM, Cherny SS, Sham PC. Evaluating the effective numbers of independent tests and significant P-value thresholds in commercial genotyping arrays and public imputation reference datasets. Hum Genet. 2012;131(5):747-756. doi: 10.1007/s00439-011-1118-2
  • 31.Purcell S, Neale B, Todd-Brown K, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559-575. doi: 10.1086/519795
  • 32.R: a language and environment for statistical computing [computer program]. R Foundation for Statistical Computing; 2011.
  • 33.Ward LD, Kellis M. HaploReg v4.1: systematic mining of putative causal variants, cell types, regulators and target genes for human complex traits and disease. Nucleic Acids Res. 2016;44(D1):D877-D881. doi: 10.1093/nar/gkv1340
  • 34.Lek M, Karczewski KJ, Minikel EV, et al; Exome Aggregation Consortium. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016;536(7616):285-291. doi: 10.1038/nature19057
  • 35.Roses AD, Lutz MW, Amrine-Madsen H, et al. A TOMM40 variable-length polymorphism predicts the age of late-onset Alzheimer’s disease. Pharmacogenomics J. 2010;10(5):375-384. doi: 10.1038/tpj.2009.69
  • 36.Lutz MW, Crenshaw DG, Saunders AM, Roses AD. Genetic variation at a single locus and age of onset for Alzheimer’s disease. Alzheimers Dement. 2010;6(2):125-131. doi: 10.1016/j.jalz.2010.01.011
  • 37.Lupton MK, Medland SE, Gordon SD, et al. Accuracy of inferred APOE genotypes for a range of genotyping arrays and imputation reference panels. J Alzheimers Dis. 2018;64(1):49-54. doi: 10.3233/JAD-171104
  • 38.Wijsman EM, Pankratz ND, Choi Y, et al; NIA-LOAD/NCRAD Family Study Group. Genome-wide association of familial late-onset Alzheimer’s disease replicates BIN1 and CLU and nominates CUGBP2 in interaction with APOE. PLoS Genet. 2011;7(2):e1001308. doi: 10.1371/journal.pgen.1001308
  • 39.Jun G, Ibrahim-Verbaas CA, Vronskaya M, et al; IGAP Consortium. A novel Alzheimer disease locus located near the gene encoding tau protein. Mol Psychiatry. 2016;21(1):108-117. doi: 10.1038/mp.2015.23
  • 40.Lambert JC, Ibrahim-Verbaas CA, Harold D, et al; European Alzheimer’s Disease Initiative (EADI); Genetic and Environmental Risk in Alzheimer’s Disease; Alzheimer’s Disease Genetic Consortium; Cohorts for Heart and Aging Research in Genomic Epidemiology. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer’s disease. Nat Genet. 2013;45(12):1452-1458. doi: 10.1038/ng.2802
  • 41.Harold D, Abraham R, Hollingworth P, et al. Genome-wide association study identifies variants at CLU and PICALM associated with Alzheimer’s disease. Nat Genet. 2009;41(10):1088-1093. doi: 10.1038/ng.440
  • 42.Heinzen EL, Need AC, Hayden KM, et al. Genome-wide scan of copy number variation in late-onset Alzheimer’s disease. J Alzheimers Dis. 2010;19(1):69-77. doi: 10.3233/JAD-2010-1212
  • 43.Lambert JC, Heath S, Even G, et al; European Alzheimer’s Disease Initiative Investigators. Genome-wide association study identifies variants at CLU and CR1 associated with Alzheimer’s disease. Nat Genet. 2009;41(10):1094-1099. doi: 10.1038/ng.439
  • 44.Seshadri S, Fitzpatrick AL, Ikram MA, et al; CHARGE Consortium; GERAD1 Consortium; EADI1 Consortium. Genome-wide analysis of genetic loci associated with Alzheimer disease. JAMA. 2010;303(18):1832-1840. doi: 10.1001/jama.2010.574
  • 45.Pérez-Palma E, Bustos BI, Villamán CF, et al; Alzheimer’s Disease Neuroimaging Initiative; NIA-LOAD/NCRAD Family Study Group. Overrepresentation of glutamate signaling in Alzheimer’s disease: network-based pathway enrichment using meta-analysis of genome-wide association studies. PLoS One. 2014;9(4):e95413. doi: 10.1371/journal.pone.0095413
  • 46.Nelson PT, Estus S, Abner EL, et al; Alzheimer’s Disease Genetic Consortium. ABCC9 gene polymorphism is associated with hippocampal sclerosis of aging pathology. Acta Neuropathol. 2014;127(6):825-843. doi: 10.1007/s00401-014-1282-2
  • 47.Naj AC, Beecham GW, Martin ER, et al. Dementia revealed: novel chromosome 6 locus for late-onset Alzheimer disease provides genetic evidence for folate-pathway abnormalities. PLoS Genet. 2010;6(9):e1001130. doi: 10.1371/journal.pgen.1001130
  • 48.Kim S, Swaminathan S, Shen L, et al; Alzheimer’s Disease Neuroimaging Initiative. Genome-wide association study of CSF biomarkers Abeta1-42, t-tau, and p-tau181p in the ADNI cohort. Neurology. 2011;76(1):69-79. doi: 10.1212/WNL.0b013e318204a397
  • 49.Deelen J, Beekman M, Uh HW, et al. Genome-wide association study identifies a single major locus contributing to survival into old age; the APOE locus revisited. Aging Cell. 2011;10(4):686-698. doi: 10.1111/j.1474-9726.2011.00705.x
  • 50.Sebastiani P, Solovieff N, Dewan AT, et al. Genetic signatures of exceptional longevity in humans. PLoS One. 2012;7(1):e29848. doi: 10.1371/journal.pone.0029848
  • 51.Ward LD, Kellis M. HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res. 2012;40(database issue):D930-D934. doi: 10.1093/nar/gkr917
  • 52.Lappalainen T, Sammeth M, Friedländer MR, et al; Geuvadis Consortium. Transcriptome and genome sequencing uncovers functional variation in humans. Nature. 2013;501(7468):506-511. doi: 10.1038/nature12531
  • 53.Greenawalt DM, Dobrin R, Chudin E, et al. A survey of the genetics of stomach, liver, and adipose gene expression from a morbidly obese cohort. Genome Res. 2011;21(7):1008-1016. doi: 10.1101/gr.112821.110
  • 54.Buniello A, MacArthur JAL, Cerezo M, et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 2019;47(D1):D1005-D1012. doi: 10.1093/nar/gky1120
  • 55.Babenko VN, Afonnikov DA, Ignatieva EV, Klimov AV, Gusev FE, Rogaev EI. Haplotype analysis of APOE intragenic SNPs. BMC Neurosci. 2018;19(suppl 1):16. doi: 10.1186/s12868-018-0413-4
  • 56.Blue EE, Horimoto ARVR, Mukherjee S, Wijsman EM, Thornton TA. Local ancestry at APOE modifies Alzheimer’s disease risk in Caribbean Hispanics. Alzheimers Dement. 2019;15(12):1524-1532. doi: 10.1016/j.jalz.2019.07.016
  • 57.Rajabli F, Feliciano BE, Celis K, et al. Ancestral origin of ApoE ε4 Alzheimer disease risk in Puerto Rican and African American populations. PLoS Genet. 2018;14(12):e1007791. doi: 10.1371/journal.pgen.1007791
  • 58.Zhou X, Chen Y, Mok KY, et al; Alzheimer’s Disease Neuroimaging Initiative. Non-coding variability at the APOE locus contributes to the Alzheimer’s risk. Nat Commun. 2019;10(1):3310. doi: 10.1038/s41467-019-10945-z
  • 59.Oldmeadow C, Holliday EG, McEvoy M, et al. Concordance between direct and imputed APOE genotypes using 1000 Genomes data. J Alzheimers Dis. 2014;42(2):391-393. doi: 10.3233/JAD-140846
  • 60.Kulminski AM, Huang J, Wang J, He L, Loika Y, Culminskaya I. Apolipoprotein E region molecular signatures of Alzheimer’s disease. Aging Cell. 2018;17(4):e12779. doi: 10.1111/acel.12779

Associated Data

Supplementary Materials

Supplement.

eMethods. Software Tools

eTable 1. Sample Summary Table by ADGC Cohort

eTable 2. Evidence for Linkage Disequilibrium Among the TOMM40, APOC1, and APOE SNVs

eTable 3. Evidence for Association Between rs8106922 With and Without APOE Adjustment or Stratification

eTable 4. Comparison of APOE ε2 and ε4 Genotypes With Imputed Genotypes at rs7412 and rs429358

eFigure 1. Principal Components Analysis Results

eFigure 2. Difference Between Kinship Estimates Based on Genotypes for All Autosomes vs All Autosomes Except Chromosome 19

eFigure 3. Quantile-Quantile Plots of Genome-Wide Association Tests Under Each Analysis Model

eReferences.


Articles from JAMA Network Open are provided here courtesy of American Medical Association
