Scientific Reports. 2022 Apr 12;12:6117. doi: 10.1038/s41598-022-09825-2

Manifestations of Alzheimer’s disease genetic risk in the blood are evident in a multiomic analysis in healthy adults aged 18 to 90

Laura Heath 1,2, John C Earls 1,3, Andrew T Magis 1, Sergey A Kornilov 1, Jennifer C Lovejoy 1, Cory C Funk 1, Noa Rappaport 1, Benjamin A Logsdon 2, Lara M Mangravite 2, Brian W Kunkle 4,5, Eden R Martin 4,5, Adam C Naj 6,7, Nilüfer Ertekin-Taner 8,9, Todd E Golde 10, Leroy Hood 1,11, Nathan D Price 1,3; Alzheimer’s Disease Genetics Consortium
PMCID: PMC9005657  PMID: 35413975

Abstract

Genetics play an important role in late-onset Alzheimer’s Disease (AD) etiology and dozens of genetic variants have been implicated in AD risk through large-scale GWAS meta-analyses. However, the precise mechanistic effects of most of these variants have yet to be determined. Deeply phenotyped cohort data can reveal physiological changes associated with genetic risk for AD across an age spectrum that may provide clues to the biology of the disease. We utilized over 2000 high-quality quantitative measurements obtained from blood of 2831 cognitively normal adult clients of a consumer-based scientific wellness company, each with CLIA-certified whole-genome sequencing data. Measurements included: clinical laboratory blood tests, targeted chip-based proteomics, and metabolomics. We performed a phenome-wide association study utilizing this diverse blood marker data and 25 known AD genetic variants and an AD-specific polygenic risk score (PGRS), adjusting for sex, age, vendor (for clinical labs), and the first four genetic principal components; sex-SNP interactions were also assessed. We observed statistically significant SNP-analyte associations for five genetic variants after correction for multiple testing (for SNPs in or near NYAP1, ABCA7, INPP5D, and APOE), with effects detectable from early adulthood. The ABCA7 SNP and the APOE2 and APOE4 encoding alleles were associated with lipid variability, as seen in previous studies; in addition, six novel proteins were associated with the e2 allele. The most statistically significant finding was between the NYAP1 variant and PILRA and PILRB protein levels, supporting previous functional genomic studies in the identification of a putative causal variant within the PILRA gene. We did not observe associations between the PGRS and any analyte. Sex modified the effects of four genetic variants, with multiple interrelated immune-modulating effects associated with the PICALM variant. 
In post-hoc analysis, sex-stratified GWAS results from an independent AD case–control meta-analysis supported sex-specific disease effects of the PICALM variant, highlighting the importance of sex as a biological variable. Known AD genetic variation influenced lipid metabolism and immune response systems in a population of non-AD individuals, with associations observed from early adulthood onward. Further research is needed to determine whether and how these effects are implicated in early-stage biological pathways to AD. These analyses aim to complement ongoing work on the functional interpretation of AD-associated genetic variants.

Subject terms: Genetic association study, Alzheimer's disease

Introduction

The rapidly decreasing cost of genomics paired with technological advances in the generation of multi-omic data has resulted in multiple datasets of deeply phenotyped individuals with a variety of health outcomes1–3. The data collected in these studies have the potential to yield important insights into potential molecular drivers of health observable in the blood periphery. The present study seeks to leverage a unique and relatively large set of multi-omic, deep-phenotyping data to shed light on genetic pathways to late-onset Alzheimer’s disease (AD) by assessing differences in ~ 2000 analytes in the blood that show association with known genetic risk variants for AD. Coupled with high-dimensional data sets, this approach has the potential to yield clues into gene pleiotropy, disease processes, and possible early-intervention strategies, which are critically important given the essentially untreatable nature of late-stage Alzheimer’s disease once significant brain deterioration has occurred.

Genetic variation plays a substantial role in AD risk, with twin studies estimating AD heritability at 58–79%4. While the emergence of recent large-scale consortia efforts has facilitated well-powered meta-analyses of genome-wide association studies (GWAS) to identify multiple common variants with small effect sizes5,6, the research community is still untangling exactly how this genetic variation influences disease risk. Functional genomics studies are beginning to identify likely genetic pathways to disease with the aid of transcriptomic, epigenomic, and endophenotypic data7–10. So far, genetic and multi-omic studies of AD have largely focused on older individuals with either clinically diagnosed AD or milder symptoms of cognitive decline, despite research pointing to highly variable AD pathobiology that occurs on a spectrum and begins decades before clinical symptom onset11.

In this study, we leveraged the results from a large-scale GWAS meta-analysis5 alongside data from a deeply phenotyped wellness cohort to investigate the peripheral physiological effects of genetic risk for AD in individuals of all ages without known cognitive impairment. We undertook an agnostic approach by adopting a phenome-wide association study (PheWAS) design12. By examining how genetic variation impacts 2008 analytes in the blood of 2831 individuals, we sought to complement previous functional genomics studies as well as potentially reveal new testable hypotheses for future studies. In addition, we tested for associations between a polygenic risk score (PGRS) for AD and blood analytes to determine if a relative burden of genetic risk might impact observable changes in the blood, and we assessed for effect modification of genetic risk by sex.

Results

Summary of population and study design

Sixty-one percent of Arivale participants were female, 22% were of non-white self-reported ethnicity, and 28% were obese (Table 1). The mean age at blood draw was 47 years, with a range of 18 to 89+. In general, individuals who joined Arivale had somewhat higher levels of cardiovascular risk markers compared to the US population, and slightly lower rates of obesity and pre-diabetes3 (these rates were consistent with rates in the geographies and ethnicities of the population, mostly from the west coast region of the United States).

Table 1.

Baseline self-reported characteristics of Arivale participants with available whole-genome sequences.

Characteristic (a)                                   N = 2831
Age, mean (sd)                                       47.0 (12.0)
Female, n (%)                                        1719 (60.7)
Nonwhite (b), n (%) (n = 2725)                       597 (21.9)
  Afro-Caribbean                                     1 (< 0.1)
  American Indian or Alaska Native                   5 (0.2)
  Ashkenazi Jewish                                   49 (1.8)
  Asian                                              84 (3.1)
  Black or African American                          64 (2.3)
  East Asian                                         91 (3.3)
  Hispanic Latino or Spanish origin                  120 (4.4)
  Middle Eastern or North African                    18 (0.7)
  Native Hawaiian or other Pacific Islander          17 (0.6)
  Sephardic Jewish                                   4 (0.1)
  South Asian                                        79 (2.9)
  White                                              2128 (78.1)
  Other                                              65 (2.4)
BMI, mean (sd) (n = 2750)                            27.9 (6.4)
Obese (c), n (%) (n = 2750)                          802 (29.2)
Moderate activity ≥ 3×/week, n (%) (n = 2275)        1460 (64.2)
Vigorous activity ≥ 3×/week, n (%) (n = 2271)        697 (30.7)
Ever smoke, n (%) (n = 2207)                         565 (25.6)
Current meds for cholesterol, n (%) (n = 2378)       287 (12.1)
Past and/or current self-report of
  Migraine, n (%) (n = 2229)                         558 (25.0)
  High cholesterol, n (%) (n = 2301)                 558 (24.2)
  Depression, n (%) (n = 2278)                       521 (22.9)
  GERD, n (%) (n = 2220)                             464 (20.9)
  Hypertension, n (%) (n = 2316)                     434 (18.7)
  Asthma, n (%) (n = 2361)                           376 (15.9)

(a) For categories with missing data, total non-missing N is reported in parentheses.

(b) Race/ethnicity categories presented to participants in Arivale questionnaire.

(c) Obese defined as BMI ≥ 30.

Phenome-wide association study results

We observed 33 SNP-analyte associations that were statistically significant at FDR-adjusted p-value < 0.05, with most of the associations observed for the APOE SNPs (rs7412, or the e2-defining allele, and rs429358, or the e4-defining allele). The other SNPs showing significant associations with at least one clinical chemistry, protein, or metabolite were rs10933431, rs12539172, and rs3752246 (Fig. 1, Table S2). Complete PheWAS results, including beta coefficients, sample sizes, minor allele frequencies, Hardy–Weinberg Equilibrium p-values, and raw and adjusted p-values for each SNP are in Supplementary Excel File 1. Sample sizes varied among analytes collected (particularly among protein analytes, as a small subset of the population (N = 354) had samples submitted for the full range of protein panels, as described in “Methods” section).
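The multiple-testing step behind the "FDR-adjusted p-value < 0.05" threshold can be illustrated with a short sketch. It assumes the Benjamini-Hochberg procedure, which this section does not name explicitly; the function name and the toy p-values in the usage note are hypothetical and are not taken from the study.

```python
import numpy as np


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg FDR-adjusted p-values, in the input order."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)  # indices that sort the raw p-values ascending
    # Scale each sorted p-value by (number of tests / rank).
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity: an adjusted value may not exceed any adjusted
    # value belonging to a larger raw p-value.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted_sorted = np.clip(adjusted_sorted, 0.0, 1.0)
    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted  # restore the original ordering
    return adjusted
```

Calling `benjamini_hochberg([0.01, 0.04, 0.03, 0.005])` returns one adjusted value per input test; an association would then be called significant when its adjusted value falls below 0.05.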

Figure 1.

Statistically significant SNP-analyte associations after correcting for multiple testing (threshold FDR-adjusted p-value = 0.05), by SNP. Top panel: log-transformed beta-coefficient from the linear regression model adjusted for sex, age, and genetic principal components 1–4; markers above the zero line (orange) indicate analytes that increased in value with the minor allele, while markers below the line indicate analytes that decreased in value. Second panel: FDR-adjusted − log10 p-value; orange line at FDR-p = 0.05. Proteins = red, metabolites = blue, clinical chemistries = purple. Metabolite codes: DG diacylglycerol, LC lactosylceramide, o oleoyl, a arachidonoyl, g glycerol, l linoleoyl, p palmitoyl. Third panel: minor allele frequency (MAF). Bottom panel: Total sample size for each analyte-SNP regression.

NYAP1

The most robust SNP-analyte associations we observed were between rs12539172 in the 3’ region of NYAP1 (Neuronal Tyrosine Phosphorylated Phosphoinositide-3-Kinase Adaptor 1), and two co-regulated proteins, paired immunoglobulin-like type 2 receptors beta and alpha (PILRB and PILRA) (Fig. 2). Carriage of the minor allele (AD risk odds ratio (OR) = 0.92) was associated with significant reduction in normalized protein expression (NPX) of PILRB and PILRA compared to individuals homozygous for the major allele (FDR-adjusted p-values = 2.2 × 10^−33 and 2.3 × 10^−17, respectively), while the overall level of NPX increased with age among all genotypes. The reduction in protein levels appears roughly dose-dependent with the number of minor alleles and was observed in all but the oldest and youngest age groups (likely due to small numbers of the minor allele in these groups (Table S3A)). These observations led us to previous studies pointing to variation in PILRA as the causal gene at this locus, with a missense SNP as a leading candidate (G78R, rs1859788)13–16. In post-hoc analysis, we repeated the PheWAS with this putative causal SNP (which was in LD with our index SNP rs12539172, R² = 0.77), and the associations became stronger (FDR-adjusted p-value for PILRB = 3.6 × 10^−52; for PILRA = 1.4 × 10^−22) (Fig. 2), with genotype differences observed in all age groups (Table S3A).

Figure 2.

Unadjusted box plots of normalized protein expression (NPX) levels of PILRA and PILRB by genotype and age group. White boxplots = individuals who are homozygous for the major allele, gray boxplots = heterozygotes, black boxplots = minor allele homozygotes. Box plot midline = median value, lower/upper hinges = 25th and 75th percentiles, respectively; lower whisker ends/upper whisker ends no further than 1.5× interquartile range from the hinge. Data beyond whiskers are outlying points. Top panel: NPX of PILRA and PILRB by rs12539172 (NYAP1) genotype; Bottom panel: NPX of PILRA and PILRB by rs1859788 genotype.
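The box plot conventions in this caption (median midline, 25th/75th percentile hinges, whiskers reaching no further than 1.5× the interquartile range from the hinge) can be written out as a small sketch. The function name and the toy values in the usage note are hypothetical, and plotting libraries may compute hinges with slightly different quantile rules than the linear interpolation NumPy uses by default.

```python
import numpy as np


def boxplot_summary(values):
    """Summarize values using the box plot conventions in the caption."""
    x = np.sort(np.asarray(values, dtype=float))
    lower_hinge, median, upper_hinge = np.percentile(x, [25, 50, 75])
    iqr = upper_hinge - lower_hinge
    lower_fence = lower_hinge - 1.5 * iqr
    upper_fence = upper_hinge + 1.5 * iqr
    # Whiskers end at the most extreme observations still inside the fences.
    inside = x[(x >= lower_fence) & (x <= upper_fence)]
    outliers = x[(x < lower_fence) | (x > upper_fence)]
    return {
        "median": float(median),
        "lower_hinge": float(lower_hinge),
        "upper_hinge": float(upper_hinge),
        "lower_whisker": float(inside.min()),
        "upper_whisker": float(inside.max()),
        "outliers": outliers.tolist(),
    }
```

For example, `boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])` flags 30 as an outlying point, and the upper whisker stops at the largest value inside the fence.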

APOE4

We observed significant associations between rs429358 (which encodes the e4 allele) and multiple related clinical measures of cholesterol (Fig. S1). Differences by genotype were less pronounced in older age groups, likely due to statin use (Table S3B); exploratory analyses visualizing only individuals who did not report use of cholesterol-lowering medications showed more consistent genotype-dependent differences between rs429358 and the top cholesterol marker, low-density lipoprotein (LDL) particle number (Fig. S2, Table S3B). The concentrations of two proteins in the blood were associated with the e4 allele: PLA2G7 (Platelet Activating Factor Acetylhydrolase) and CD28 (T-Cell-Specific Surface Glycoprotein CD28). Selected lipid metabolites in the blood were positively associated with e4: two diacylglycerol (DG) metabolites (one of which was measured twice in the Metabolon panel) were higher in e4 carriers compared to non-carriers.

APOE2

We observed significantly lower levels of multiple clinical measures of LDL cholesterol associated with carriage of the e2 allele (Fig. S3). As the unadjusted plots show, e2 homozygotes differ markedly from other genotypes, though it should be noted that few e2 homozygotes were present in the population (n = 16) and all were within a limited age range (30–59 years). Selected lipid metabolites in the blood were positively associated with e2: a monoglyceride (MG) and four diacylglycerol (DG) metabolites (one of which was a replicate) were higher in e2 carriers compared to non-carriers. We observed six e2-protein associations (Fig. 3), such that each of the following proteins was observed at higher levels in e2 carriers: low density lipoprotein receptor (LDLR), heme oxygenase-1 (HMOX-1), SLAM family member 8 (SLAMF8), ring finger protein 31 (RNF31), contactin associated protein 2 (CNTNAP2), and signal recognition particle 14 (SRP14).

Figure 3.

Unadjusted box plots of normalized protein expression levels (NPX) of six proteins significantly associated with APOE2 genotype, by age group. White boxplots = individuals who are homozygous for the major allele, gray boxplots = heterozygotes, black boxplots = minor allele homozygotes. Box plot midline = median value, lower/upper hinges = 25th and 75th percentiles, respectively; lower whisker ends/upper whisker ends no further than 1.5× interquartile range from the hinge. Data beyond whiskers are outlying points. LDLR low-density lipoprotein receptor, HMOX1 heme oxygenase-1, SLAMF8 SLAM family member 8, RNF31 E3 ubiquitin-protein ligase RNF31, CNTNAP2 contactin-associated protein-like 2, SRP14 signal recognition particle 14 kDa protein.

ABCA7

The ABCA7 (ATP Binding Cassette Subfamily A Member 7) variant (rs3752246), which has been associated with increased risk of AD (OR 1.15, Table S1), was associated with lower levels of two lactosylceramide (LC) metabolites in the sphingolipid family. These differences were evident starting in the youngest age groups (Fig. S4, Table S3A). The minor allele of rs3752246 was also associated with higher levels of DEFA1 (Defensin Alpha 1), an antimicrobial peptide.

INPP5D

An intronic SNP in INPP5D (Inositol Polyphosphate-5-Phosphatase D) (rs10933431), which was associated with a lowered risk of AD in meta-analyses, was associated with lower levels of the protein IDUA (alpha-L-iduronidase) (Fig. S4).

Polygenic risk score

No associations were observed between the APOE-free PGRS and any analyte after FDR correction for multiple testing, either in primary analyses or in analyses adjusted for e4 status, or among non-e4 individuals only. No effect modification by sex or APOE4 status was observed.

Sex-specific findings

We observed a SNP × sex interaction involving the AD-protective PICALM variant, such that the minor allele was associated with higher levels of 30 proteins in men and lower levels of the same proteins in women (Fig. 4, Fig. S5, Table S4). These proteins were highly correlated with one another (mean pairwise Spearman’s rho = 0.49); thus, it is unclear whether the associations are independently biologically meaningful, or whether there is a passenger effect, in which one or a few proteins are driving the sex-differential association with genotype observed in the data. In addition, the PICALM variant was associated with a sex-specific effect on five highly correlated long-chain fatty acid (LCFA) metabolites and one polyunsaturated fatty acid (PFA) metabolite (docosahexaenoic acid) (Fig. 4). To investigate further, we conducted a post-hoc analysis examining the impact of this variant on AD risk stratified by sex, in a meta-analysis of clinically diagnosed late-onset AD (18,812 individuals, Table S5). While AD risk was reduced in both men and women among carriers of the minor allele, the effect was stronger among men (Table 2, Table S6), which was consistent with the sex-stratified SNP-analyte analyses (data not shown).
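How a genotype × sex interaction term relates to sex-stratified effects can be shown with a minimal sketch on noise-free synthetic data. Everything here is an assumption made for illustration: the variable names, the coding of sex (1 = male, 0 = female), the effect sizes, and the omission of the genetic principal components. It is not the study's model or data.

```python
import numpy as np


def ols_coefficients(design, y):
    """Ordinary least squares coefficients for y ~ design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


rng = np.random.default_rng(0)
n = 400
genotype = rng.integers(0, 3, size=n).astype(float)  # minor allele count 0/1/2
sex = rng.integers(0, 2, size=n).astype(float)       # 1 = male, 0 = female (arbitrary coding)
age = rng.uniform(18, 90, size=n)

# Synthetic analyte: the minor allele lowers it in women (-0.5 per allele)
# and raises it in men (+0.5 per allele); no noise, so fits are exact.
analyte = -0.5 * genotype + 1.0 * genotype * sex + 0.02 * age

ones = np.ones(n)

# Full model with an explicit genotype x sex interaction column.
full_design = np.column_stack([ones, genotype, sex, genotype * sex, age])
interaction_beta = ols_coefficients(full_design, analyte)[3]


def stratified_beta(mask):
    """Per-allele genotype effect within one sex stratum."""
    design = np.column_stack([ones[mask], genotype[mask], age[mask]])
    return ols_coefficients(design, analyte[mask])[1]


male_beta = stratified_beta(sex == 1)
female_beta = stratified_beta(sex == 0)
```

In this toy setting the interaction coefficient equals the male per-allele effect minus the female per-allele effect, which is why opposite-signed stratified betas show up as a large interaction term.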

Figure 4.

Heatmap of statistically significant genotype × sex interaction terms at FDR-adjusted p-value < 0.1. Beta coefficients obtained from sex-stratified analyses, middle-column p-values from interaction term in the full model. SL sphingolipid, LCFA long-chain fatty acid, PFA polyunsaturated fatty acid.

Table 2.

Results of sex-specific analysis and sex-SNP interaction analysis of PICALM variant rs3851179 in the ADGC.

Sex-stratified results (a)   Beta               Std error   p-value    MAF
Male model 1                 −0.206             0.035       5.62E−09   0.358
Male model 2                 −0.176             0.038       4.08E−06   0.359
Female model 1               −0.083             0.029       4.37E−03   0.354
Female model 2               −0.087             0.031       5.60E−03   0.352

Interaction results (b)      Interaction beta   Std error   p-value    MAF
Model 1                      0.116              0.044       8.05E−03   0.354
Model 2                      0.372              0.048       7.84E−02   0.354

N = 9135 cases (60% female), 9677 controls (60% female).

(a) Model 1: adjusted for age, sex, and PCs; Model 2: adjusted for age, sex, PCs, and APOE genotype.

(b) Model 1: adjusted for age and PCs; Model 2: adjusted for age, PCs, and APOE.
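As a rough cross-check on Table 2, the difference between two independently estimated betas can be tested with a simple z-statistic. This is a sketch only: the function name is hypothetical, the normal approximation is an assumption, and the calculation is not the interaction model reported in the table, so close agreement with the tabulated interaction results should not be expected.

```python
import math


def difference_z_test(beta_a, se_a, beta_b, se_b):
    """Two-sided z-test for the difference between two independent estimates."""
    z = (beta_a - beta_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided p-value under a standard normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Model 1 sex-stratified estimates copied from Table 2 (male, then female).
z_stat, p_diff = difference_z_test(-0.206, 0.035, -0.083, 0.029)
```

With the Model 1 values, the resulting p-value is of the same order of magnitude as the Model 1 interaction p-value in the table, which is what one would expect if the male and female effects truly differ.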

Other observed sex-specific effects were more modest. The SNP near CD2AP (CD2 Associated Protein) interacted with sex to affect three highly correlated sphingomyelins and three plasmalogens, while the SNP in SPI1 (Transcription Factor PU.1) interacted with sex to affect SPARC related modular calcium binding 2 (SMOC2). Lastly, the missense ABCA7 SNP interacted with sex to affect levels of ubiquitin conjugating enzyme E2 F (UBE2F).

Stratification by self-identified race/ethnicity

Because the numbers in individual self-identified groups were very small (Table 1), we were not able to assess genetic risk effects within individual groups with statistical rigor. As expected, analyses restricted to white individuals recapitulated results of the overall analysis (Fig. S6). In the nonwhite group overall, we observed effect sizes that were consistent with the overall results and white-only results (Fig. S7).

Discussion

Our study examines associations between known genetic risk factors for AD and blood markers (clinical labs, proteins, and metabolites). It provides insight into the manifestation of AD-related genetic risk in blood-borne analytes from cognitively normal individuals and demonstrates how AD-related genetic variation manifests in the blood across adulthood. Our results contribute to the growing literature highlighting a potential causal variant (missense SNP in PILRA), point to potential new mechanisms of protection among APOE2 carriers, and suggest a role for infectious diseases as AD risk factors, alongside lipid metabolism, immune response, and endocytosis. We also uncovered intriguing differences between men and women in how genetic risk manifests in the blood. These analyses not only add to the existing literature on functional genomics in AD, but also offer multiple potential new hypotheses to catalyze future studies.

The strongest associations in the study were between the NYAP1 SNP (rs12539172) and the PILRB/PILRA proteins. This locus was originally identified by rs1476679 near ZCWPW16. NYAP1 and ZCWPW1 are located near PILRA and PILRB on chromosome 7, within a linkage disequilibrium (LD) block. In previous gene expression studies, the initial index SNP for ZCWPW1 has been associated with expression of multiple PILRB and PILRA transcripts in brain9,17. PILRA and PILRB are paired, co-regulated inhibiting/activating receptors, respectively, that are expressed on innate immune cells, recognize certain O-glycosylated proteins, and have an important role in regulating acute inflammatory reactions18. The R78 substitution in PILRA (rs1859788) has been shown to reduce the binding capacity of endogenous ligands and thereby potentially increase microglial activity16. In addition, while controversial, work from our group and others19–21 has suggested a potential viral role in AD risk. Notably, the R78 variant has been implicated in HSV-1 (Herpes Simplex Virus type 1) infectivity16 and differences in HSV-1 antibody titer levels13. While previous studies have hypothesized that reduced activity of PILRA was due to steric conformational changes in the protein leading to reduced binding of key ligands (including HSV-1 glycoprotein B), our results suggest that reduced levels of circulating PILRA protein in R78 carriers could also be a factor in the overall protective effect of this genetic variant.

Statistically significant associations were observed between multiple lipid analytes and the SNPs encoding both APOE4 (rs429358) and APOE2 (rs7412). APOE normally plays a key role in lipid transport, including shuttling cholesterol to neurons in healthy brains. Notably, APOE has a role in beta-amyloid (Aβ) metabolism, and while the exact mechanism is unknown, the e4 variant appears to accelerate neurotoxic Aβ accumulation, aggregation, and deposition in the brain22. The associations we observed between the e4 variant and increased levels of total cholesterol and LDL cholesterol, along with lower levels of high-density lipoprotein (HDL), were consistent with previous cardiovascular disease cohort studies that included young, middle-aged, and older adults23–26. The e4 allele was associated with increased NPX of two inflammatory proteins. PLA2G7 is a known cardiovascular risk marker with pro-inflammatory and oxidative activities27 which has previously been associated with APOE genotypes28 and implicated in AD and cognitive decline27,29. To our knowledge, CD28 protein levels have not previously been associated with e4 status, though this relatively weak association may be a downstream result of APOE isoform-specific effects on inflammation30.

Blood cholesterol levels among APOE2 carriers were also largely consistent with a body of existing data24; the e2 variant was associated with lower levels of multiple measures of LDL cholesterol. It should be noted that while 5–10% of e2 homozygotes develop type III hyperlipoproteinemia (typically in the presence of an existing metabolic disorder31) resulting in elevated cholesterol levels, all e2 homozygotes in the study had significantly decreased levels of LDL cholesterol compared to other genotypes. In contrast, the e2 variant was associated with higher levels of six lipid metabolites in the diacylglycerol and monoacylglycerol family; interestingly, both the e4 and e2 variants were associated with increased levels of the same two lipid metabolites in the diacylglycerol family, despite the opposite effects of these two variants on circulating blood cholesterol. Diacylglycerol is a precursor to triacylglyceride (TG), which is typically higher in APOE2 carriers26. The effects of high DGs and TGs remain unclear. Diabetic APOE-knockout mice fed DG-rich diets had reduced atherosclerosis and lower plasma cholesterol than mice fed TG-rich or western diets32,33; however, non-targeted metabolomics studies have shown elevated levels of DGs and MGs in AD and mild cognitive impairment (MCI) patient brains and blood compared to cognitively intact individuals34,35.

We observed six proteins that were significantly upregulated in APOE2 carriers (Fig. 3). The LDLR protein had higher levels of NPX in e2 carriers, particularly in e2 homozygotes. Though APOE2 is known to bind poorly to LDLR (~ 2% of e3 or e4 binding activity)36, APOE2 was associated with lower levels of LDL cholesterol across age groups as noted previously, perhaps due to compensatory up-regulation of LDLR26. Greater understanding of the compensatory mechanism leading to upregulated LDLR and lower circulating LDL cholesterol is needed. The e2 variant was associated with increased levels of the highly inducible HMOX-1, which has antioxidant properties and has been associated with both neuroprotection and neurodegeneration37. SLAMF8 may be another link to an antioxidant effect of APOE2, as it has been implicated in modulation of reactive oxygen species and inflammation via negative regulation of NOX activity38. APOE2 carriers displayed higher levels of RNF31 protein (aka HOIP). HOIP is the catalytic component of the linear ubiquitin chain assembly complex (LUBAC), which was shown to have a role in the recognition and degradation of misfolded proteins39. Variation in CNTNAP2, a member of the neurexin superfamily of proteins involved in cell–cell interactions in the nervous system, has been associated with neurodevelopmental disorders40, and has been implicated in AD-related dementia41. Lastly, SRP14, which has a role in targeting secretory proteins to the rough endoplasmic reticulum (ER) membrane, has been identified as one of many tau-associated ER proteins in AD brains42. To our knowledge, the APOE2-protein associations described here are novel and may help point to the mechanisms of protection associated with the e2 variant.

ABCA7 is involved in lipid efflux from cells into lipoprotein particles, plays a role in lipid homeostasis43, and has also been implicated in Aβ processing and deposition in the brain44. Our results support ABCA7’s lipid-related function by showing lower levels of two LC metabolites among individuals carrying the AD-risk allele of rs3752246. In contrast, we observed higher NPX of DEFA1 protein in carriers of the ABCA7 variant, which is consistent with previous studies showing higher levels of this protein in cerebrospinal fluid (CSF) and sera of AD patients compared to controls45,46, potentially linking ABCA7 with an inflammatory response pathway to AD. Lastly, lower NPX of IDUA was associated with the INPP5D SNP. INPP5D, which encodes the lipid phosphatase SHIP1, is a negative regulator of immune signaling and is expressed in microglia47. To our knowledge, this association has not been previously observed.

Genetic variation likely affects men and women differentially, pointing to mechanisms that contribute to known differences in AD pathology between the sexes48. The proteins that were differentially affected by sex and PICALM genotype are primarily implicated in immune processes, cell adhesion, and regulatory processes, with widely overlapping functions (Fig. S8). Our results highlight an interaction between the AD-risk variant in PICALM and multiple proteins implicated in immune response in a sex-specific manner, and support emerging research showing sex differences in the neuroimmune response that impact microglia function49. We also observed a sex-differential effect of the variant on multiple LCFA metabolites and one PFA metabolite (DHA). A potential link between PICALM function, lipids, and AD is feasible: fatty acids, and DHA in particular, have long been known to have a role in maintaining brain health and cognition50, while PICALM expression has been shown to influence cholesterol homeostasis through multiple mechanisms51. This multi-analyte interaction was supported by results from sex-stratified GWAS meta-analyses, which showed differing effect sizes of the variant in men vs. women.

In addition to these sex-specific PICALM effects, the SNP near CD2AP, a scaffolding protein, interacted with sex to affect three highly correlated sphingomyelins and three plasmalogens, while the SNP in SPI1, a transcription factor associated with microglial activation52, interacted with sex to affect SMOC2, a protein involved in microgliosis that has been previously associated with Aβ positivity in CSF53.

We also examined an AD-specific polygenic risk score. While the PGRS is predictive of disease in case/control studies54, it was not associated with any blood analytes in the all-ages AD-free Arivale cohort. Combining genetic effects into a single score for AD likely served to dilute any individual genetic effect on the manifestation of genetic risk in the blood. In addition, the relative youth and cognitive health of this cohort should be considered. The PGRS may be more likely to detect perturbation in analytes that are markers of systemic inflammation or immune dysfunction in later life and among cohorts experiencing cognitive impairment.

The results presented here are novel and we believe will be of interest to the AD-related functional genomics community, though several limitations should be noted. The study population was not a random sample but was self-selected. The population is largely self-identified non-Hispanic white, was mostly located on the west coast, and likely has higher than average socio-economic status (though these data were not captured). Thus, results may not be generalizable to a broader population. At this time, we are not aware of a suitable replication cohort that contains parallel omics panels in an all-ages health-heterogeneous cohort. Future studies will be needed to assess generality of the findings to other populations, not only for the sake of replicability of the findings, but due to the relative ancestral homogeneity of this data set. Previous studies have shown genetic heterogeneity between white and non-white individuals, particularly with regard to African Americans and risk of cognitive outcomes among carriers of APOE and ABCA7 variants55,56. Given known wide-ranging racial/ethnic disparities in dementia incidence57, it is imperative that future deep-phenotyping studies are far more inclusive than the study presented here.

Another limitation to the interpretation of results concerns the issue of pleiotropy; we cannot discern pleiotropic, non-AD-related effects from true causal effects that are implicated in AD pathogenesis. However, even if the associations described here are purely the result of pleiotropy and are unrelated to causal mechanisms of AD, the novel associations we described may provide clues to the function of several genes that are highly interesting to the AD community. Relatedly, we obtained only peripheral plasma and are unable to examine effects in AD-relevant compartments such as brain or CSF. Although we had high-coverage WGS available, we did not interrogate other types of genetic variation such as copy number variants, indels, and short tandem repeats. Lastly, data harmonization with other studies will be a challenge. For instance, most previous metabolomics studies used metabolomics data that lacked complete speciation, and more work is needed with newer technologies that yield high fidelity data to determine the biological effects of specific serum metabolites.

This study also has multiple strengths. While most studies focused on AD-related genetic variation consist of case/control cohorts in older adults, the Arivale data offered an unprecedented look into how genetic variation perturbs physiological pathways in the blood long before disease onset, in health-heterogeneous individuals of all ages. This feature allowed us to observe subtle changes in blood associated with genetic variation, due to the relatively large sample size (2831 individuals with WGS) and the high quality of the blood analytes collected. Our results are from a “real-world” cohort, which promises to be an increasing source of large-scale data in the community going forward, with its accompanying advantages and disadvantages. Some results were previously unobserved and need to be replicated (such as the associations between APOE2 and multiple proteins), while other results agree with previous findings and serve to reinforce confidence that the results are reasonably representative and not simply spurious.

Conclusions

Due to a unified world-wide effort, dozens of genetic variants have been robustly implicated in the development of AD, though we are still in the early stages of understanding exactly how genetic variation contributes to disease. Our study showed that AD-related genetic variation manifests in the blood, from early adulthood onward, and highlights known targets for prevention in early and mid-life, such as cholesterol monitoring, mitigation of inflammation, and possibly, HSV-1 prevention and/or viral load management. Importantly, as well as yielding new insight into the pathobiology of AD through adulthood, these results may provide a significant number of new drug targets that are highly novel and biologically plausible or may serve as biomarkers if confirmed to have a consistent influence on AD pathophysiology. Lastly, these results highlight the need to assess for sex differences in future studies. Taken together, these results not only illustrate previously unobserved biological phenomena resulting from AD-associated genetic variation, but also serve as an important resource for the generation of hypotheses for future functional genomics studies and emphasize the potential insight that can be gleaned from deeply phenotyped individuals.

Methods

Population

The Institute for Systems Biology, through partnership with their spin-out company Arivale, has access to a wealth of data collected from subscribers in the commercially available (now closed) Arivale Scientific Wellness program3,58, from July 2015 to May 2019. In brief, participants in the Arivale program were assigned a health coach upon joining the program, who then utilized data from clinical blood assays and detailed health-history and behavioral questionnaires to personalize health advice and management of health goals.

All research was conducted in accordance with regulations and guidelines for observational research in human subjects. Informed consent was obtained from all participants for the use of their anonymized data in research. The study was reviewed and approved by the Western International Review Board (Study Number 1178906 at Arivale and Study Number 20170658 at the Institute for Systems Biology, in Seattle, WA).

Blood-derived clinical laboratory tests and whole genome sequencing

We identified 2831 individuals with whole genome sequencing (WGS) and at least one class of blood-derived analyte, described as follows. For each participant, fasting clinical blood laboratory tests were measured upon joining the program. Blood samples were collected at either local facilities hosted by LabCorp (North Carolina, USA) or Quest Diagnostics (New Jersey, USA). Whole genome sequencing was performed on DNA extracted from whole blood with library preparation using the Illumina TruSeq Nano Library prep kit and sequenced using Illumina HiSeq X, PE-150, target 30× coverage at a single Clinical Laboratory Improvement Amendments (CLIA)-approved sequencing laboratory. Only values with < 20% missing were included, and no imputation was performed. At the baseline blood draw, 2827 of the 2831 individuals with sequenced whole genomes had up to 63 fasting clinical blood lab tests. Clinical blood tests included standard markers for cardiometabolic health (lipid levels), diabetes, inflammation, kidney and liver function, nutrition (vitamins and minerals), and blood cell counts. All clinical lab tests included, with descriptions and units where available, are in Supplementary Excel File 2.

Proteomics

Frozen plasma samples (aliquots of the initial blood draw) were also sent to Olink (Olink Bioscience, Sweden) for targeted proteomics assays based on Olink’s proximity extension assay (PEA) technique59, which is a dual-recognition, DNA-coupled methodology that is quantified by quantitative real-time PCR and enables high multiplex, high throughput proteomics that are both sensitive and specific (for further details, see https://www.olink.com/our-platform/our-pea-technology/). Full details of normalization and batch effect adjustment have been described previously60. For analysis, only proteins with < 20% missing were included and no imputation was performed. Up to 2694 of these participants had quantitative proteomic data on 274 proteins from three Olink panels (Cardiovascular II, Cardiovascular III, and Inflammation panels). An additional 919 proteins (from 10 additional panels available at Olink at the time) were obtained from a subsample of 354 individuals, in which Apolipoprotein E (APOE) e2/e2 and APOE e4/e4 genotypes were overrepresented. Since multiple batches were performed, previously generated pooled control samples were run with each batch and used for batch correction, and multiple control samples were included on each plate.

Metabolomics

Aliquots of frozen plasma samples were sent to Metabolon, Inc. (North Carolina) to conduct metabolomics assays using the Metabolon HD4 discovery platform. In brief, Metabolon conducted their Global Metabolomics high-performance liquid chromatography (HPLC)-mass spectrometry assays on the plasma samples. Full details of sample handling, quality control, biochemical identification, data curation, and quantification and normalization have been described previously60,61. For analysis, only metabolites with < 20% missing (or undetectable) values were included and no imputation was performed. Up to 1909 of the participants had data from 754 metabolites, though due to technical variability and variation in detection rates of rare metabolites, sample sizes ranged from 1539 to 1909 after restricting to metabolites with < 20% missing. Relative concentration values were reported for each metabolite. Full biochemical annotation for each metabolite (when available), as provided by Metabolon at the time of quantification, can be found in Supplementary Excel File 2.
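The < 20% missingness rule applied to the clinical labs, proteins, and metabolites can be illustrated with a minimal Python sketch. It is not the authors' code (their analysis was run in R); the function names, the analyte names, and the use of `None` to mark a missing or undetected value are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def missing_fraction(values: List[Optional[float]]) -> float:
    """Fraction of entries that are missing (None) for one analyte."""
    if not values:
        return 1.0
    return sum(v is None for v in values) / len(values)


def filter_analytes(table: Dict[str, List[Optional[float]]],
                    max_missing: float = 0.20) -> Dict[str, List[Optional[float]]]:
    """Keep analytes whose missing fraction is strictly below max_missing.

    No imputation is done: retained analytes still carry their None entries,
    so per-analyte sample sizes can differ.
    """
    return {name: vals for name, vals in table.items()
            if missing_fraction(vals) < max_missing}


# Toy data with hypothetical analyte names and values.
toy = {
    "analyte_a": [1.2, 0.9, None, 1.1, 1.0, 1.3, 0.8, 1.1, 1.0, 1.2],    # 10% missing
    "analyte_b": [None, None, 2.0, 2.1, None, 1.9, 2.2, 2.0, 2.1, 2.0],  # 30% missing
}
kept = filter_analytes(toy)
print(list(kept))
```

Because missing values are left in place rather than imputed, each retained analyte is analyzed on the participants who have a value for it, which is consistent with the varying sample sizes reported above.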

SNP selection

We selected 25 common and somewhat-rare (> 1% allele frequency) single nucleotide polymorphisms (SNPs) that were significantly associated with AD in a large-scale meta-analysis based on updated data from the International Genomics of Alzheimer’s Project (IGAP)5. In addition to these variants, we also included the SNP coding for APOE e2 (rs7412). The 25 SNPs were linked to 24 genes (two SNPs in APOE), as detailed in Table S1.

Polygenic risk score calculation for AD

PGRS for age-associated AD risk was computed using summary statistics from the initial IGAP-driven GWAS meta-analysis6. Briefly, the set of SNPs included in the PGRS was determined as follows. The Benjamini–Hochberg62 procedure was applied to the p-values for all SNPs tested in the GWAS to account for multiple testing by controlling the false discovery rate (FDR) at a 5% level. This FDR-filtered set of SNPs was then further pruned using linkage disequilibrium (LD): pairs of SNPs in close proximity capturing highly correlated information (r² > 0.2) were identified, and the SNP with the smaller p-value in the pair was kept; this was repeated until all remaining SNPs were mutually uncorrelated (r² < 0.2 for all pairs). The PGRS for each individual was then calculated by summing up the published effect size for each selected SNP multiplied by the number of effect alleles the individual carried for that SNP, across all of the selected SNPs. Missing genotypes were mean imputed using the effect allele frequency.
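The three steps described above (FDR filtering, LD pruning, and weighted allele counting) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the pruning is a greedy pass in order of ascending p-value, which approximates the pairwise procedure in the text and ignores the "close proximity" window; a missing genotype is replaced by the expected dosage 2 × effect allele frequency, which is one reading of "mean imputed"; and all SNP names, effect sizes, and r² values are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional


def benjamini_hochberg(pvalues: List[float], alpha: float = 0.05) -> List[bool]:
    """Flag each p-value as passing the BH step-up rule at level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff_rank = rank  # largest rank that meets its threshold
    passed = [False] * m
    for rank, idx in enumerate(order, start=1):
        passed[idx] = rank <= cutoff_rank
    return passed


def ld_prune(snps: List[str], pvalue: Dict[str, float],
             r2: Dict[FrozenSet[str], float], threshold: float = 0.2) -> List[str]:
    """Greedy pruning: visit SNPs by ascending p-value and keep a SNP only if
    its r2 with every already-kept SNP does not exceed the threshold.
    Pairs absent from the r2 lookup are treated as uncorrelated (assumption)."""
    kept: List[str] = []
    for snp in sorted(snps, key=lambda s: pvalue[s]):
        if all(r2.get(frozenset((snp, k)), 0.0) <= threshold for k in kept):
            kept.append(snp)
    return kept


def polygenic_score(dosages: Dict[str, Optional[int]], beta: Dict[str, float],
                    eaf: Dict[str, float]) -> float:
    """Sum of effect size times effect-allele count over the selected SNPs.
    A missing genotype is replaced by its expected dosage, 2 * EAF."""
    total = 0.0
    for snp, b in beta.items():
        d = dosages.get(snp)
        if d is None:
            d = 2.0 * eaf[snp]
        total += b * d
    return total


# Hypothetical summary statistics for four SNPs.
names = ["snp_a", "snp_b", "snp_c", "snp_d"]
pvals = {"snp_a": 1e-8, "snp_b": 1e-6, "snp_c": 0.03, "snp_d": 0.5}
betas = {"snp_a": 0.10, "snp_b": 0.08, "snp_c": -0.05, "snp_d": 0.01}
freqs = {"snp_a": 0.3, "snp_b": 0.3, "snp_c": 0.4, "snp_d": 0.2}
ld = {frozenset(("snp_a", "snp_b")): 0.8}  # snp_a and snp_b are in high LD

flags = benjamini_hochberg([pvals[s] for s in names])
fdr_set = [s for s, ok in zip(names, flags) if ok]
selected = ld_prune(fdr_set, pvals, ld)
weights = {s: betas[s] for s in selected}

# One individual: two effect alleles at snp_a, genotype missing at snp_c.
score = polygenic_score({"snp_a": 2, "snp_c": None}, weights, freqs)
print(selected, round(score, 4))
```

In this toy example snp_d fails the FDR filter, snp_b is pruned because it is in LD with the more significant snp_a, and the missing snp_c genotype contributes its expected dosage of 0.8.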

Statistical analysis

Following a phenome-wide association study approach (PheWAS)12,63, the primary model for each SNP used linear regression, with genotype (0, 1, or 2, with 0 indicating homozygosity for the major allele and 2 indicating homozygosity for the minor allele) as the predictor, and each continuous quantitative analyte as the dependent variable. Clinical lab and metabolite values were natural log transformed to account for right skewness and outliers, with + 1 added to each natural log transformation to prevent zero values. Proteomic quantities were presented as normalized protein expression (NPX), Olink’s arbitrary unit, which is in log2 scale. Genetic ancestry was represented by principal components (PCs) 1–4, calculated using previously described methods64. All SNP models were adjusted for age, sex, genetic ancestry PCs 1–4, and vendor identification for the clinical labs. Secondary models tested effect modification by sex by including a gene × sex interaction term in the models. We accounted for multiple comparisons by applying the Benjamini–Hochberg method62 at alpha = 0.05 on a per-SNP basis to the main effect of genotype in the primary models, and we used B-H alpha = 0.1 for the sex-SNP interaction term as the threshold in the gene × sex interaction models. The FDR took into account testing for all 2008 possible analytes, with the understanding that this adjustment was highly conservative given a high degree of correlation among multiple groups of analytes, and the fact that some analytes were sampled in only a subset of individuals. Both raw and adjusted p-values are reported.
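A single SNP-analyte model of the kind described above can be sketched in plain Python. This is a simplified illustration, not the authors' R code: ordinary least squares is solved through the normal equations, p-values use a large-sample normal approximation instead of the t distribution, the log transform is assumed to mean ln(x + 1), and the toy data omit the ancestry PCs and the vendor covariate for brevity. All names and numbers are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def log_transform(value: float) -> float:
    """Natural log with a +1 offset; assumed reading of the text, ln(x + 1)."""
    return math.log(value + 1.0)


def design_row(genotype: int, age: float, sex: int, pcs: Sequence[float],
               interaction: bool = False) -> List[float]:
    """Intercept, minor-allele count (0/1/2), age, sex (0/1), ancestry PCs,
    and optionally the genotype x sex product for the secondary model.
    A vendor indicator for clinical labs would be one more column."""
    row = [1.0, float(genotype), float(age), float(sex), *pcs]
    if interaction:
        row.append(float(genotype * sex))
    return row


def invert(a: List[List[float]]) -> List[List[float]]:
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols(X: List[List[float]], y: List[float]) -> Tuple[List[float], List[float], List[float]]:
    """Return coefficients, standard errors, and two-sided p-values
    (normal approximation, reasonable only for large samples)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    rss = sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
              for row, yi in zip(X, y))
    sigma2 = rss / (n - k)
    se = [math.sqrt(sigma2 * inv[i][i]) for i in range(k)]
    pvals = [math.erfc(abs(b / s) / math.sqrt(2)) for b, s in zip(beta, se)]
    return beta, se, pvals


# Toy cohort of 60 people; y is taken to be already on the transformed scale.
rows, y = [], []
for i in range(60):
    g, age, sex = i % 3, 20 + i % 5, (i // 2) % 2
    noise = 0.1 if i % 2 == 0 else -0.1
    rows.append(design_row(g, age, sex, pcs=[], interaction=True))
    y.append(1.0 + 0.5 * g + 0.01 * age + 0.2 * sex + 0.3 * g * sex + noise)

beta, se, pvals = ols(rows, y)
print("genotype effect:", round(beta[1], 3), "interaction:", round(beta[-1], 3))
```

In a full PheWAS this fit would be repeated for every analyte, and the genotype p-values for one SNP would then be passed together to a Benjamini–Hochberg adjustment (alpha = 0.05 for main effects, 0.1 for the interaction term).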

We also repeated the primary PheWAS approach with participants stratified by self-identified race, due to evidence for variable genetic risk for cognitive outcomes between non-Hispanic white (hereafter referred to as “white”) and non-white populations55,56. Unfortunately, due to small numbers of individuals in specific non-white racial and ethnic groups, which become vanishingly small when accounting for allele frequency and numbers of available samples (Table 1), we were not able to assess genetic risk effects in individual groups with statistical rigor and had to group all non-white participants into one stratum for analysis. The stratified white and non-white group analyses serve as an investigation into whether our primary results reflected the majority-white makeup of the Arivale population. PheWAS was applied as described above, with FDR to account for multiple comparisons.

To visualize genotype-analyte associations across adulthood, we created boxplots of the log-transformed analyte values by genotype, stratified by age group (by decade, from 18–29 to 70 and over). One-way analysis of variance (ANOVA) was used to test whether there was an overall difference between genotypes within each age group. All statistical analyses were performed in R v3.5.1 (https://www.R-project.org/).
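The age-stratified comparison can be illustrated with a short Python sketch that bins ages by decade and computes the one-way ANOVA F statistic across genotype groups within a bin. The bin labels follow the range given above (18–29 through 70 and over); the function names and the toy values are hypothetical, and the p-value step is omitted because it needs the F distribution, which the Python standard library does not provide.

```python
from typing import List, Tuple


def age_group(age: float) -> str:
    """Decade bins from 18-29 up to 70 and over."""
    if age < 18:
        raise ValueError("cohort is restricted to adults aged 18 and over")
    if age < 30:
        return "18-29"
    if age >= 70:
        return "70+"
    low = int(age // 10) * 10
    return f"{low}-{low + 9}"


def one_way_anova(groups: List[List[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for two or more non-empty groups."""
    groups = [g for g in groups if g]
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and n > number of groups")
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical log-transformed analyte values for genotypes 0, 1, and 2
# within a single age group.
by_genotype = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat, df_b, df_w = one_way_anova(by_genotype)
print(age_group(45), f_stat, df_b, df_w)
```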

In post-hoc exploratory analysis focused on the SNP in the PICALM (Phosphatidylinositol Binding Clathrin Assembly Protein) locus (rs3851179), sex-stratified and sex-interaction analyses were performed on 12,324 cases (57.7% female) and 11,453 controls (59.9% female) of European ancestry from the Alzheimer’s Disease Genetics Consortium (ADGC) (see Supplementary Table 4 for dataset details). Datasets were imputed to the Haplotype Reference Consortium (HRC)65 panel using the Michigan Imputation Server (https://imputationserver.sph.umich.edu/index.html#!). Standard pre-imputation quality control was performed on all datasets individually, including exclusion of individuals with low call rate, individuals with a high degree of relatedness, and variants with low call rate66. Individuals with non-European ancestry according to principal components analysis of ancestry-informative markers were excluded from further analysis. Detailed descriptions of individual ADGC datasets can be found in Kunkle et al.5 and Table S5. Study-specific logistic regression analyses employed Plink67 for sex-interaction analysis and SNPTest68 for sex-stratified analysis. Sex-interaction analysis, which tested the sex × variant interaction, and sex-stratified analysis of males and females separately, were each performed under two models, one adjusting for age, sex and PCs (model 1) and a second adjusting for age, sex, PCs and APOE (model 2). Results were meta-analyzed with METAL using inverse variance-based analysis69. In order to explore the relationships among the proteins associated with the PICALM variant, we input the list of sex-interacting proteins into Cytoscape software, utilizing the CLUEGO plug-in70,71, which drew a network linking proteins through their known GO Biological processes.
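The inverse variance-based meta-analysis step can be illustrated with a minimal fixed-effect sketch in Python. It is not METAL itself, only the standard weighting scheme that the term refers to: each study's effect estimate is weighted by the reciprocal of its squared standard error. The function name and the per-study numbers are hypothetical.

```python
import math
from typing import List, Tuple


def inverse_variance_meta(betas: List[float],
                          ses: List[float]) -> Tuple[float, float, float, float]:
    """Fixed-effect pooled estimate, its standard error, Z score, and
    two-sided p-value from a normal distribution."""
    if len(betas) != len(ses) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se ** 2 for se in ses]  # precision of each study
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


# Hypothetical log-odds-ratio estimates from two datasets.
pooled_beta, pooled_se, z, p = inverse_variance_meta([0.1, 0.3], [0.1, 0.1])
print(round(pooled_beta, 3), round(pooled_se, 4), round(z, 3), round(p, 5))
```

With equal standard errors the pooled estimate is the simple average of the two inputs, and the pooled standard error shrinks by a factor of √2 relative to either study.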

Author contributions

L.Heath and N.D.P. conceived and designed the study; L.Heath, J.C.E., B.W.K., A.C.N., and E.R.M. performed data analyses and figure generation; A.T.M., J.C.L., B.W.K., E.R.M., N.E.-T., L.Hood, N.R., and N.D.P. acquired the data; L.Heath, J.C.E., N.R., A.T.M., S.A.K., J.C.L., C.C.F., B.A.L., L.M.M., N.E.-T., T.E.G., L.Hood, and N.D.P. made substantial contributions to interpretation of the results. L.Heath and N.D.P. were primary authors of the manuscript. All authors read and approved the final manuscript.

Funding

This study was supported by the National Institutes of Health, National Institute on Aging Grants U01 AG046139 (N.E-T, T.E.G, N.D.P), R01 AG061796 (N.E-T), RF1 AG051504 (N.E-T), R01-AG062634-01 (B.W.K, E.R.M), and U19 AG023122 (N.R). ADGC. The National Institutes of Health, National Institute on Aging (NIH-NIA) supported this work through the following Grants: ADGC, U01 AG032984, RC2 AG036528; Samples from the National Cell Repository for Alzheimer’s Disease (NCRAD), which receives government support under a cooperative agreement Grant (U24 AG21886) awarded by the National Institute on Aging (NIA), were used in this study. We thank contributors who collected samples used in this study, as well as patients and their families, whose help and participation made this work possible; Data for this study were prepared, archived, and distributed by the National Institute on Aging Alzheimer’s Disease Data Storage Site (NIAGADS) at the University of Pennsylvania (U24-AG041689-01); NACC, U01 AG016976; NIA LOAD (Columbia University), U24 AG026395, U24 AG026390, R01AG041797; Banner Sun Health Research Institute P30 AG019610; Boston University, P30 AG013846, U01 AG10483, R01 CA129769, R01 MH080295, R01 AG017173, R01 AG025259, R01 AG048927, R01AG33193, R01 AG009029; Columbia University, P50 AG008702, R37 AG015473, R01 AG037212, R01 AG028786; Duke University, P30 AG028377, AG05128; Emory University, AG025688; Group Health Research Institute, UO1 AG006781, UO1 HG004610, UO1 HG006375, U01 HG008657; Indiana University, P30 AG10133, R01 AG009956, RC2 AG036650; Johns Hopkins University, P50 AG005146, R01 AG020688; Massachusetts General Hospital, P50 AG005134; Mayo Clinic, P50 AG016574, R01 AG032990, KL2 RR024151; Mount Sinai School of Medicine, P50 AG005138, P01 AG002219; New York University, P30 AG08051, UL1 RR029893, 5R01AG012101, 5R01AG022374, 5R01AG013616, 1RC2AG036502, 1R01AG035137; North Carolina A&T University, P20 MD000546, R01 AG28786-01A1; Northwestern University, P30 AG013854; Oregon Health & Science University, P30 AG008017, R01 AG026916; Rush University, P30 AG010161, R01 AG019085, R01 AG15819, R01 AG17917, R01 AG030146, R01 AG01101, RC2 AG036650, R01 AG22018; TGen, R01 NS059873; University of Alabama at Birmingham, P50 AG016582; University of Arizona, R01 AG031581; University of California, Davis, P30 AG010129; University of California, Irvine, P50 AG016573; University of California, Los Angeles, P50 AG016570; University of California, San Diego, P50 AG005131; University of California, San Francisco, P50 AG023501, P01 AG019724; University of Kentucky, P30 AG028383, AG05144; University of Michigan, P50 AG008671; University of Pennsylvania, P30 AG010124; University of Pittsburgh, P50 AG005133, AG030653, AG041718, AG07562, AG02365; University of Southern California, P50 AG005142; University of Texas Southwestern, P30 AG012300; University of Miami, R01 AG027944, AG010491, AG027944, AG021547, AG019757; University of Washington, P50 AG005136, R01 AG042437; University of Wisconsin, P50 AG033514; Vanderbilt University, R01 AG019085; and Washington University, P50 AG005681, P01 AG03991, P01 AG026276. The Kathleen Price Bryan Brain Bank at Duke University Medical Center is funded by NINDS Grant # NS39764, NIMH MH60451 and by Glaxo Smith Kline. Support was also from the Alzheimer’s Association (LAF, IIRG-08-89720; MP-V, IIRG-05-14147), the US Department of Veterans Affairs Administration, Office of Research and Development, Biomedical Laboratory Research Program, and BrightFocus Foundation (MP-V, A2111048). P.S.G.-H. is supported by Wellcome Trust, Howard Hughes Medical Institute, and the Canadian Institute of Health Research. Genotyping of the TGEN2 cohort was supported by Kronos Science. The TGen series was also funded by NIA Grant AG041232 to AJM and MJH, The Banner Alzheimer’s Foundation, The Johnnie B. Byrd Sr. Alzheimer’s Institute, the Medical Research Council, and the state of Arizona and also includes samples from the following sites: Newcastle Brain Tissue Resource (funding via the Medical Research Council, local NHS trusts and Newcastle University), MRC London Brain Bank for Neurodegenerative Diseases (funding via the Medical Research Council), South West Dementia Brain Bank (funding via numerous sources including the Higher Education Funding Council for England (HEFCE), Alzheimer’s Research Trust (ART), BRACE as well as North Bristol NHS Trust Research and Innovation Department and DeNDRoN), The Netherlands Brain Bank (funding via numerous sources including Stichting MS Research, Brain Net Europe, Hersenstichting Nederland Breinbrekend Werk, International Parkinson Fonds, Internationale Stiching Alzheimer Onderzoek), Institut de Neuropatologia, Servei Anatomia Patologica, Universitat de Barcelona. ADNI data collection and sharing was funded by the National Institutes of Health Grant U01 AG024904 and Department of Defense award number W81XWH-12-2-0012. ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California. We thank Drs. D. Stephen Snyder and Marilyn Miller from NIA who are ex-officio ADGC members.

Data availability

The datasets generated and/or analysed during the current study are not publicly available because the data were generated by a private investment firm under legal terms that mandate researchers to sign a data access agreement permitting the use of these data for non-profit research purposes. Upon reasonable request, researchers can access the Arivale deidentified dataset supporting the findings in this study for research purposes from ISB. Requests should be sent to data-access@isbscience.org. The data are available to qualified researchers on submission and approval of a research plan.

Code availability

Code used for PheWAS statistical analysis is available through the Sage Bionetworks Github (https://github.com/Sage-Bionetworks/ADsnps_PheWAS.git).

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A list of authors and their affiliations appears at the end of the paper.

Contributor Information

Laura Heath, Email: laura.heath@sagebase.org.

Nathan D. Price, Email: nprice@thorne.com

Alzheimer’s Disease Genetics Consortium:

Erin Abner, Perrie M. Adams, Marilyn S. Albert, Roger L. Albin, Mariet Allen, Alexandre Amlie-Wolf, Liana G. Apostolova, Steven E. Arnold, Sanjay Asthana, Craig S. Atwood, Clinton T. Baldwin, Robert C. Barber, Lisa L. Barnes, Sandra Barral, Thomas G. Beach, James T. Becker, Gary W. Beecham, Duane Beekly, David Bennett, Eileen H. Bigio, Thomas D. Bird, Deborah Blacker, Bradley F. Boeve, James D. Bowen, Adam Boxer, James R. Burke, Jeffrey M. Burns, Will Bush, Mariusz Butkiewicz, Joseph D. Buxbaum, Nigel J. Cairns, Laura B. Cantwell, Chuanhai Cao, Chris S. Carlson, Cynthia M. Carlsson, Regina M. Carney, Helena C. Chui, Paul K. Crane, David H. Cribbs, Elizabeth A. Crocco, Michael L. Cuccaro, Philip L. De Jager, Charles DeCarli, Malcolm Dick, Dennis W. Dickson, Beth A. Dombroski, Rachelle S. Doody, Ranjan Duara, Nilufer Ertekin-Taner, Denis A. Evans, Kelley M. Faber, Thomas J. Fairchild, Kenneth B. Fallon, David W. Fardo, Martin R. Farlow, Lindsay A. Farrer, Steven Ferris, Tatiana M. Foroud, Matthew P. Frosch, Douglas R. Galasko, Marla Gearing, Daniel H. Geschwind, Bernardino Ghetti, John R. Gilbert, Alison M. Goate, Robert C. Green, John H. Growdon, Jonathan Haines, Hakon Hakonarson, Ronald L. Hamilton, Kara L. Hamilton-Nelson, Lindy E. Harrell, Lawrence S. Honig, Ryan M. Huebinger, Matthew J. Huentelman, Christine M. Hulette, Bradley T. Hyman, Gail P. Jarvik, Lee-Way Jin, Gyungah R. Jun, M. Ilyas Kamboh, Anna Karydas, Mindy J. Katz, Jeffrey A. Kaye, C. Dirk Keene, Ronald Kim, Neil W. Kowall, Joel H. Kramer, Walter A. Kukull, Brian W. Kunkle, Amanda B. Kuzma, Frank M. LaFerla, James J. Lah, Eric B. Larson, James B. Leverenz, Allan I. Levey, Andrew P. Lieberman, Richard B. Lipton, Kathryn L. Lunetta, Constantine G. Lyketsos, John Malamon, Daniel C. Marson, Eden R. Martin, Frank Martiniuk, Deborah C. Mash, Eliezer Masliah, Richard Mayeux, Wayne C. McCormick, Susan M. McCurry, Andrew N. McDavid, Ann C. McKee, Marsel Mesulam, Bruce L. Miller, Carol A. Miller, Joshua W. Miller, Thomas J. Montine, John C. Morris, Shubhabrata Mukherjee, Amanda J. Myers, Adam C. Naj, Sid O’Bryant, John M. Olichney, Joseph E. Parisi, Henry L. Paulson, Margaret A. Pericak-Vance, William R. Perry, Elaine Peskind, Ronald C. Petersen, Aimee Pierce, Wayne W. Poon, Huntington Potter, Liming Qu, Joseph F. Quinn, Ashok Raj, Murray Raskind, Eric M. Reiman, Barry Reisberg, Joan S. Reisch, Christiane Reitz, John M. Ringman, Erik D. Roberson, Ekaterina Rogaeva, Howard J. Rosen, Roger N. Rosenberg, Donald R. Royall, Mark A. Sager, Mary Sano, Andrew J. Saykin, Gerard D. Schellenberg, Julie A. Schneider, Lon S. Schneider, William W. Seeley, Susan Slifer, Amanda G. Smith, Yeunjoo Song, Joshua A. Sonnen, Salvatore Spina, Peter St George-Hyslop, Robert A. Stern, Russell H. Swerdlow, Mitchell Tang, Rudolph E. Tanzi, John Q. Trojanowski, Juan C. Troncoso, Debby W. Tsuang, Otto Valladares, Vivianna M. Van Deerlin, Linda J. Van Eldik, Jeffery Vance, Badri N. Vardarajan, Harry V. Vinters, Jean Paul Vonsattel, Li-San Wang, Sandra Weintraub, Kathleen A. Welsh-Bohmer, Patrice Whitehead, Kirk C. Wilhelmsen, Jennifer Williamson, Thomas S. Wingo, Randall L. Woltjer, Clinton B. Wright, Chuang-Kuo Wu, Steven G. Younkin, Chang-En Yu, Lei Yu, and Yi Zhao

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-022-09825-2.

References

  • 1. Price ND, et al. A wellness study of 108 individuals using personal, dense, dynamic data clouds. Nat. Biotechnol. 2017 doi: 10.1038/nbt.3870.
  • 2. Schüssler-Fiorenza Rose SM, et al. A longitudinal big data approach for precision health. Nat. Med. 2019;25:792–804. doi: 10.1038/s41591-019-0414-6.
  • 3. Zubair N, et al. Genetic predisposition impacts clinical changes in a lifestyle coaching program. Sci. Rep. 2019 doi: 10.1038/s41598-019-43058-0.
  • 4. Gatz M, et al. Role of genes and environments for explaining Alzheimer disease. Arch. Gen. Psychiatry. 2006;63:168–174. doi: 10.1001/archpsyc.63.2.168.
  • 5. Kunkle BW, et al. Genetic meta-analysis of diagnosed Alzheimer’s disease identifies new risk loci and implicates Aβ, tau, immunity and lipid processing. Nat. Genet. 2019;51:414–430. doi: 10.1038/s41588-019-0358-2.
  • 6. Lambert JC, et al. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer’s disease. Nat. Genet. 2013;45:1452–1458. doi: 10.1038/ng.2802.
  • 7. Karch CM, Goate AM. Alzheimer’s disease risk genes and mechanisms of disease pathogenesis. Biol. Psychiatry. 2015;77:43–51. doi: 10.1016/j.biopsych.2014.05.006.
  • 8. Pimenova AA, Raj T, Goate AM. Untangling genetic risk for Alzheimer’s disease. Biol. Psychiatry. 2018;83:300–310. doi: 10.1016/j.biopsych.2017.05.014.
  • 9. Allen M, et al. Late-onset Alzheimer disease risk variants mark brain regulatory loci. Neurol. Genet. 2015;1:e15. doi: 10.1212/NXG.0000000000000012.
  • 10. Ertekin-Taner N. Gene expression endophenotypes: A novel approach for gene discovery in Alzheimer’s disease. Mol. Neurodegener. 2011;6:31. doi: 10.1186/1750-1326-6-31.
  • 11. Atri A. The Alzheimer’s disease clinical spectrum: Diagnosis and management. Med. Clin. N. Am. 2019;103:263–293. doi: 10.1016/j.mcna.2018.10.009.
  • 12. Pendergrass SA, Ritchie MD. Phenome-wide association studies: Leveraging comprehensive phenotypic and genotypic data for discovery. Curr. Genet. Med. Rep. 2015;3:92–100. doi: 10.1007/s40142-015-0067-9.
  • 13. Agostini S, et al. The PILRA G78R variant correlates with higher HSV-1-specific IgG titers in Alzheimer’s disease. Cell. Mol. Neurobiol. 2019;39:1217–1221. doi: 10.1007/s10571-019-00712-5.
  • 14. Jansen IE, et al. Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer’s disease risk. Nat. Genet. 2019;51:404–413. doi: 10.1038/s41588-018-0311-9.
  • 15. Patel T, et al. Whole-exome sequencing of the BDR cohort: Evidence to support the role of the PILRA gene in Alzheimer’s disease. Neuropathol. Appl. Neurobiol. 2018;44:506–521. doi: 10.1111/nan.12452.
  • 16. Rathore N, et al. Paired Immunoglobulin-like type 2 receptor alpha G78R variant alters ligand binding and confers protection to Alzheimer’s disease. PLoS Genet. 2018;14:e1007427. doi: 10.1371/journal.pgen.1007427.
  • 17. Karch CM, Ezerskiy LA, Bertelsen S, Alzheimer’s Disease Genetics Consortium (ADGC), Goate AM. Alzheimer’s disease risk polymorphisms regulate gene expression in the ZCWPW1 and the CELF1 Loci. PLoS ONE. 2016;11:e0148717. doi: 10.1371/journal.pone.0148717.
  • 18. Wang J, Shiratori I, Uehori J, Ikawa M, Arase H. Neutrophil infiltration during inflammation is regulated by PILRα via modulation of integrin activation. Nat. Immunol. 2013;14:34–40. doi: 10.1038/ni.2456.
  • 19. Eimer WA, et al. Alzheimer’s disease-associated β-amyloid is rapidly seeded by herpesviridae to protect against brain infection. Neuron. 2018;99:56–63. doi: 10.1016/j.neuron.2018.06.030.
  • 20. Itzhaki RF, Wozniak MA. Herpes simplex virus type 1 in Alzheimer’s disease: The enemy within. J. Alzheimers Dis. 2008;13:393–405. doi: 10.3233/JAD-2008-13405.
  • 21. Readhead B, et al. Multiscale analysis of independent Alzheimer’s cohorts finds disruption of molecular, genetic, and clinical networks by human herpesvirus. Neuron. 2018;99:64–82. doi: 10.1016/j.neuron.2018.05.023.
  • 22. Liu C-C, Liu C-C, Kanekiyo T, Xu H, Bu G. Apolipoprotein E and Alzheimer disease: Risk, mechanisms and therapy. Nat. Rev. Neurol. 2013;9:106–118. doi: 10.1038/nrneurol.2012.263.
  • 23. Karjalainen J-P, et al. The effect of apolipoprotein E polymorphism on serum metabolome—A population-based 10-year follow-up study. Sci. Rep. 2019;9:458. doi: 10.1038/s41598-018-36450-9.
  • 24. Downer B, Estus S, Katsumata Y, Fardo DW. Longitudinal trajectories of cholesterol from midlife through late life according to apolipoprotein E allele status. Int. J. Environ. Res. Public Health. 2014;11:10663–10693. doi: 10.3390/ijerph111010663.
  • 25. Grönroos P, et al. Influence of apolipoprotein E polymorphism on serum lipid and lipoprotein changes: A 21-year follow-up study from childhood to adulthood. The Cardiovascular Risk in Young Finns Study. Clin. Chem. Lab. Med. 2007;45:592–598. doi: 10.1515/CCLM.2007.116.
  • 26. Bennet AM, et al. Association of apolipoprotein E genotypes with lipid levels and coronary risk. JAMA. 2007;298:1300–1311. doi: 10.1001/jama.298.11.1300.
  • 27. Davidson JE, et al. Plasma lipoprotein-associated phospholipase A2 activity in Alzheimer’s disease, amnestic mild cognitive impairment, and cognitively healthy elderly subjects: A cross-sectional study. Alzheimers Res. Ther. 2012;4:51. doi: 10.1186/alzrt154.
  • 28. Drenos F, et al. Integrated associations of genotypes with multiple blood biomarkers linked to coronary heart disease risk. Hum. Mol. Genet. 2009;18:2305–2316. doi: 10.1093/hmg/ddp159.
  • 29. van Oijen M, et al. Lipoprotein-associated phospholipase A2 is associated with risk of dementia. Ann. Neurol. 2006;59:139–144. doi: 10.1002/ana.20721.
  • 30. Zhang H, Wu L-M, Wu J. Cross-talk between apolipoprotein E and cytokines. Mediat. Inflamm. 2011;2011:949072. doi: 10.1155/2011/949072.
  • 31. Mahley RW, Huang Y, Rall SC. Pathogenesis of type III hyperlipoproteinemia (dysbetalipoproteinemia). Questions, quandaries, and paradoxes. J. Lipid Res. 1999;40:1933–1949. doi: 10.1016/S0022-2275(20)32417-2.
  • 32. Fujii A, Allen TJ, Nestel PJ. A 1,3-diacylglycerol-rich oil induces less atherosclerosis and lowers plasma cholesterol in diabetic apoE-deficient mice. Atherosclerosis. 2007;193:55–61. doi: 10.1016/j.atherosclerosis.2006.08.024.
  • 33. Ijiri Y, et al. Dietary diacylglycerol extenuates arterial thrombosis in apoE and LDLR deficient mice. Thromb. Res. 2006;117:411–417. doi: 10.1016/j.thromres.2005.04.001.
  • 34. Wood PL, Barnette BL, Kaye JA, Quinn JF, Woltjer RL. Non-targeted lipidomics of CSF and frontal cortex grey and white matter in control, mild cognitive impairment, and Alzheimer’s disease subjects. Acta Neuropsychiatr. 2015;27:270–278. doi: 10.1017/neu.2015.18.
  • 35. Wood PL, et al. Targeted lipidomics of fontal cortex and plasma diacylglycerols (DAG) in mild cognitive impairment and Alzheimer’s disease: Validation of DAG accumulation early in the pathophysiology of Alzheimer’s disease. J. Alzheimers Dis. 2015;48:537–546. doi: 10.3233/JAD-150336.
  • 36. Mahley RW. Apolipoprotein E: From cardiovascular disease to neurodegenerative disorders. J. Mol. Med. Berl. Ger. 2016;94:739–746. doi: 10.1007/s00109-016-1427-y.
  • 37. Barone E, Di Domenico F, Mancuso C, Butterfield DA. The Janus face of the heme oxygenase/biliverdin reductase system in Alzheimer disease: It’s time for reconciliation. Neurobiol. Dis. 2014;62:144–159. doi: 10.1016/j.nbd.2013.09.018.
  • 38. Wang G, et al. Cutting edge: Slamf8 is a negative regulator of Nox2 activity in macrophages. J. Immunol. 2012;188:5829–5832. doi: 10.4049/jimmunol.1102620.
  • 39. van Well EM, et al. A protein quality control pathway regulated by linear ubiquitination. EMBO J. 2019 doi: 10.15252/embj.2018100730.
  • 40. Rodenas-Cuadrado P, Ho J, Vernes SC. Shining a light on CNTNAP2: Complex functions to complex disorders. Eur. J. Hum. Genet. EJHG. 2014;22:171–178. doi: 10.1038/ejhg.2013.100.
  • 41. van Abel D, et al. Direct downregulation of CNTNAP2 by STOX1A is associated with Alzheimer’s disease. J. Alzheimers Dis. 2012;31:793–800. doi: 10.3233/JAD-2012-120472.
  • 42. Meier S, et al. Identification of novel tau interactions with endoplasmic reticulum proteins in Alzheimer’s disease brain. J. Alzheimers Dis. 2015;48:687–702. doi: 10.3233/JAD-150298.
  • 43. De Roeck A, Van Broeckhoven C, Sleegers K. The role of ABCA7 in Alzheimer’s disease: Evidence from genomics, transcriptomics and methylomics. Acta Neuropathol. (Berl.) 2019;138:201–220. doi: 10.1007/s00401-019-01994-1.
  • 44. Sakae N, et al. ABCA7 deficiency accelerates amyloid-β generation and Alzheimer’s neuronal pathology. J. Neurosci. Off. J. Soc. Neurosci. 2016;36:3848–3859. doi: 10.1523/JNEUROSCI.3757-15.2016.
  • 45.Szekeres M, et al. Relevance of defensin β-2 and α defensins (HNP1-3) in Alzheimer’s disease. Psychiatry Res. 2016;239:342–345. doi: 10.1016/j.psychres.2016.03.045. [DOI] [PubMed] [Google Scholar]
  • 46.Watt AD, et al. Peripheral α-defensins 1 and 2 are elevated in Alzheimer’s disease. J. Alzheimers Dis. 2015;44:1131–1143. doi: 10.3233/JAD-142286. [DOI] [PubMed] [Google Scholar]
  • 47.Efthymiou AG, Goate AM. Late onset Alzheimer’s disease genetics implicates microglial pathways in disease risk. Mol. Neurodegener. 2017;12:43. doi: 10.1186/s13024-017-0184-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Dumitrescu L, Mayeda ER, Sharman K, Moore AM, Hohman TJ. Sex differences in the genetic architecture of Alzheimer’s disease. Curr. Genet. Med. Rep. 2019;7:13–21. doi: 10.1007/s40142-019-0157-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Osborne BF, Turano A, Schwarz JM. Sex differences in the neuroimmune system. Curr. Opin. Behav. Sci. 2018;23:118–123. doi: 10.1016/j.cobeha.2018.05.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Fotuhi M, Mohassel P, Yaffe K. Fish consumption, long-chain omega-3 fatty acids and risk of cognitive decline or Alzheimer disease: A complex association. Nat. Clin. Pract. Neurol. 2009;5:140–152. doi: 10.1038/ncpneuro1044. [DOI] [PubMed] [Google Scholar]
  • 51.Mercer JL, et al. Modulation of PICALM levels perturbs cellular cholesterol homeostasis. PLoS ONE. 2015;10:e0129776. doi: 10.1371/journal.pone.0129776. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Verheijen J, Sleegers K. Understanding Alzheimer disease at the interface between genetics and transcriptomics. Trends Genet. 2018;34:434–447. doi: 10.1016/j.tig.2018.02.007. [DOI] [PubMed] [Google Scholar]
  • 53.Whelan CD, et al. Multiplex proteomics identifies novel CSF and plasma biomarkers of early Alzheimer’s disease. Acta Neuropathol. Commun. 2019;7:169. doi: 10.1186/s40478-019-0795-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Harrison JR, Mistry S, Muskett N, Escott-Price V. From polygenic scores to precision medicine in Alzheimer’s disease: A systematic review. J. Alzheimers Dis. 2020;74:1271–1283. doi: 10.3233/JAD-191233. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.N’Songo A, et al. African American exome sequencing identifies potential risk variants at Alzheimer disease loci. Neurol. Genet. 2017;3:e141. doi: 10.1212/NXG.0000000000000141. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Berg CN, Sinha N, Gluck MA. The effects of APOE and ABCA7 on cognitive function and Alzheimer’s disease risk in African Americans: A focused mini review. Front. Hum. Neurosci. 2019;13:387. doi: 10.3389/fnhum.2019.00387. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Mayeda ER, Glymour MM, Quesenberry CP, Whitmer RA. Inequalities in dementia incidence between six racial and ethnic groups over 14 years. Alzheimers Dement. J. Alzheimers Assoc. 2016;12:216–224. doi: 10.1016/j.jalz.2015.12.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Xu X, et al. Habitual sleep duration and sleep duration variation are independently associated with body mass index. Int. J. Obes. 2018;2005(42):794–800. doi: 10.1038/ijo.2017.223. [DOI] [PubMed] [Google Scholar]
  • 59.Assarsson E, et al. Homogenous 96-plex PEA immunoassay exhibiting high sensitivity, specificity, and excellent scalability. PLoS ONE. 2014;9:e95192. doi: 10.1371/journal.pone.0095192. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Manor O, et al. A multi-omic association study of trimethylamine N-oxide. Cell Rep. 2018;24:935–946. doi: 10.1016/j.celrep.2018.06.096. [DOI] [PubMed] [Google Scholar]
  • 61.Wittmann BM, et al. Bladder cancer biomarker discovery using global metabolomic profiling of urine. PLoS ONE. 2014;9:e115870. doi: 10.1371/journal.pone.0115870. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Hochberg Y, Benjamini Y. More powerful procedures for multiple significance testing. Stat. Med. 1990;9:811–818. doi: 10.1002/sim.4780090710. [DOI] [PubMed] [Google Scholar]
  • 63.Hall MA, et al. Detection of pleiotropy through a phenome-wide association study (PheWAS) of epidemiologic data as part of the environmental architecture for genes linked to environment (EAGLE) study. PLoS Genet. 2014;10:e1004678. doi: 10.1371/journal.pgen.1004678. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Conomos MP, et al. Genetic diversity and association studies in US Hispanic/Latino populations: Applications in the hispanic community health study/study of latinos. Am. J. Hum. Genet. 2016;98:165–184. doi: 10.1016/j.ajhg.2015.12.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.McCarthy S, et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 2016;48:1279–1283. doi: 10.1038/ng.3643. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Sims R, et al. Rare coding variants in PLCG2, ABI3, and TREM2 implicate microglial-mediated innate immunity in Alzheimer’s disease. Nat. Genet. 2017;49:1373–1384. doi: 10.1038/ng.3916. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Chang CC, et al. Second-generation PLINK: Rising to the challenge of larger and richer datasets. GigaScience. 2015;4:7. doi: 10.1186/s13742-015-0047-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Marchini J, Howie B, Myers S, McVean G, Donnelly P. A new multipoint method for genome-wide association studies by imputation of genotypes. Nat. Genet. 2007;39:906–913. doi: 10.1038/ng2088. [DOI] [PubMed] [Google Scholar]
  • 69.Willer CJ, Li Y, Abecasis GR. METAL: Fast and efficient meta-analysis of genomewide association scans. Bioinform. Oxf. Engl. 2010;26:2190–2191. doi: 10.1093/bioinformatics/btq340. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Shannon P, et al. Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13:2498–2504. doi: 10.1101/gr.1239303. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Bindea G, et al. ClueGO: A cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics. 2009;25:1091–1093. doi: 10.1093/bioinformatics/btp101. [DOI] [PMC free article] [PubMed] [Google Scholar]

Data Availability Statement

The datasets generated and/or analysed during the current study are not publicly available because the data were generated by a private investment firm under legal terms that require researchers to sign a data access agreement permitting use of these data for non-profit research purposes. Upon reasonable request, researchers can access the deidentified Arivale dataset supporting the findings of this study for research purposes from ISB. Requests should be sent to data-access@isbscience.org. The data are available to qualified researchers on submission and approval of a research plan.

Code used for the PheWAS statistical analysis is available through the Sage Bionetworks GitHub (https://github.com/Sage-Bionetworks/ADsnps_PheWAS.git).


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
