Abstract
Background
The use of polygenic risk scores (PRS) for predicting disease risk in Japanese populations, particularly for dementia and related phenotypes, remains largely unexplored. The aim of this study was to bridge this gap by developing a novel PRS model designed to predict amyloid-β (Aβ) deposition utilizing positron emission tomography (PET) imaging data from a Japanese cohort.
Methods
Using the polygenic risk score-continuous shrinkage (PRS-CS) algorithm, we calculated PRS based on significant single nucleotide polymorphisms (SNPs) associated with Alzheimer’s disease (AD) in this population. We applied a PRS calculation approach informed by Japanese genome-wide association studies (GWAS) summary statistics to a Japanese dementia cohort from Keio University.
Results
Our findings revealed that a p-value threshold of pT < 0.1 optimally enhanced the predictive capability of the Japanese Aβ PET positivity risk model. Moreover, we demonstrated that distinguishing between the counts of APOE2 and APOE4 alleles in our calculations significantly elevated model performance, achieving an area under the curve (AUC) of 0.759. Remarkably, this predictive accuracy remained robust even when the pT was adjusted to be < 1.0 × 10⁻⁵, maintaining an AUC of 0.735. This study validated the efficacy of the model in identifying individuals with an increased risk of amyloid pathology.
Conclusions
These findings highlight the potential of PRS as a noninvasive tool for early detection and risk stratification of AD, which could lead to enhanced clinical applications and interventions.
Supplementary Information
The online version contains supplementary material available at 10.1186/s13195-025-01754-2.
Keywords: Alzheimer’s disease, Polygenic risk score, Amyloid-β deposition, Positron emission tomography, Genome-wide association studies, Japanese population
Introduction
Alzheimer’s disease (AD), the leading cause of dementia worldwide, is characterized by progressive cognitive decline and memory impairment. A distinguishing characteristic of AD is the accumulation of amyloid-β (Aβ) plaques in the brain, which can be detected through positron emission tomography (PET) imaging. Genetic factors play a crucial role in AD pathogenesis, and recent genome-wide association studies (GWAS) have identified numerous loci associated with AD risk [1, 2]. Although the APOE ε4 allele is well-established as a significant risk factor, other genetic loci also contribute to AD susceptibility and warrant further investigation.
Recent research has shown that APOE4 homozygotes experience earlier symptom onset at an average age of 65.1 years. By this age, the majority of APOE4 homozygotes exhibit AD pathology, such as abnormal amyloid levels in the cerebrospinal fluid, indicating almost complete AD penetrance in this population. This concerning finding highlights the urgency of recognizing APOE4 homozygotes as a distinct category within the broader spectrum of AD, similar to autosomal dominant AD (ADAD) and Down syndrome-associated AD (DSAD) [3]. Understanding the role of APOE4 in AD is essential; nevertheless, it is also imperative to consider the multiple factors contributing to disease development, including genetic background and environmental influences [4, 5]. Elucidating these genetic factors is crucial for developing effective preventive and therapeutic strategies. Predicting AD risk before clinical onset is important for selecting participants for clinical trials as well as for preventive healthcare. Recent studies have demonstrated the utility of polygenic risk scores (PRS) in identifying individuals at high risk of AD [6].
PRS quantifies the cumulative effect of genetic mutations on the disease risk of an individual [7]. Combining data on APOE ε4 allele, APOE ε2 allele, and PRS has enhanced disease prediction accuracy in Caucasian populations [8, 9]. Moreover, PRS has been associated with AD-related phenotypes, including brain volumes [10–12], brain Aβ accumulation [12], and plasma phosphorylated tau levels [13]. Studies have also demonstrated its effectiveness in predicting progression from mild cognitive impairment (MCI) to AD [14, 15]. However, the applicability of these findings to Aβ deposition and their relevance in non-Caucasian populations, such as Japanese, remain unexplored. Therefore, in this study, we aimed to investigate various methodologies and single nucleotide polymorphism (SNP) selection approaches for predicting Aβ PET positivity in Japanese individuals using the Japanese dementia cohort from Keio University, to accurately identify high- and low-risk individuals.
PRS-CS, a Python-based command-line tool, estimates posterior SNP effect sizes using continuous shrinkage (CS) priors [16]. This tool calculates the effect sizes of genetic variants based on GWAS summary statistics and an external linkage disequilibrium (LD) reference panel [17]. We used PRS-CS to calculate AD PRS in Japanese individuals, validated methods for modeling APOE with PRS, and determined the optimal p-value threshold for SNP selection. This study provides recommendations for using PRS to reliably identify individuals at risk for Aβ PET positivity.
Methods
Participants and clinical measurements
The PRS model for the Japanese training and validation cohorts was developed using participants of Japanese ancestry recruited from Keio University Hospital. A total of 218 participants provided complete information and were included in the final analysis dataset. This study included patients who visited Keio University Hospital for routine diagnostic dementia evaluation between September 2018 and 2023. Individuals classified as cognitively normal (CN) were recruited through hospital websites and from a patient recruitment agency (3H Medi Solution Inc., Tokyo, Japan). The inclusion and exclusion criteria have been described previously [18]. Detailed sample information is presented in Table 1. Samples with high heterozygosity rates and those that failed quality control (QC) criteria were excluded from the analysis. Detailed information on the QC procedures can be found in the section “Genetic data acquisition and processing.” Ultimately, 137 participants recruited during the early phase (July 2018–October 2021) were included in the training dataset, of whom 54 (39.4%) were Aβ PET positive. The validation cohort consisted of participants from the late phase of recruitment (October 2021–August 2023), who underwent whole-genome sequencing (WGS) later than those in the training cohort. This cohort included 81 individuals, of whom 31 (38.3%) were Aβ PET positive. Written informed consent was obtained from all participants before conducting any assessment. Ethics approval was obtained from the Keio University Research Ethics Committee (approval No. N20170237).
Table 1.
Summary of the Keio cohort
| | Training cohort | Validation cohort |
|---|---|---|
| Number of samples | 137 | 81 |
| Mean Age, years (SD) | 69.1 (10.7) | 70.4* (9.80) |
| Female (%) | 64 (46.7%) | 41* (50.0%) |
| Mean Education years (SD) | 14.6 (2.09) | 14.8* (2.04) |
| APOE ε2 carrier (%) | 11 (8.03%) | 1 (1.23%) |
| APOE ε4 carrier (%) | 47 (34.3%) | 35 (43.2%) |
| Aβ PET positive (%) | 54 (39.4%) | 31 (38.3%) |
| Median MMSE (SD) | 25.6** (5.24) | 25.3* (5.49) |
| MCI dementia (%) | 18** (13.3%) | 8*** (10.4%) |
| AD dementia (%) | 26** (19.3%) | 18*** (23.4%) |
*: one individual with missing information
**: two individuals with missing information
***: four individuals with missing information
Genetic data acquisition and processing
Genomic DNA was extracted from 218 participants, and sequencing libraries were prepared using the NovaSeq 6000 S4 Reagent Kit (Illumina, San Diego, CA, USA) and sequenced on an Illumina NovaSeq 6000 instrument (Illumina, San Diego, CA, USA) with 150 bp paired-end reads. The samples were sequenced to at least 30× mean coverage (Takara Bio Inc., city, Japan). Sequence reads were mapped to the hg38 reference genome build, and analyses were performed using the GATK4 pipeline [19–22]. QC was carried out using PLINK v1.9 [23]. SNPs were excluded if they had a missing call rate > 1%, had a minor allele frequency (MAF) < 1%, or deviated from Hardy–Weinberg equilibrium with a p-value < 1 × 10⁻⁶. Individuals were excluded when the missing call rate was > 1% or heterozygosity was outlying (± 3 standard deviations). One individual was removed during the QC process because of outlying heterozygosity. Pruning was performed with a window size of 200 variants and a step size of 50 variants, and SNPs with LD r² > 0.25 were filtered out.
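The variant-level and sample-level filters described above can be illustrated with a minimal sketch. It is not the PLINK implementation used in the study: the function names and data layout are hypothetical, the Hardy–Weinberg test and LD pruning are omitted, and only the missingness, MAF, and ± 3 SD heterozygosity rules are shown.

```python
from statistics import mean, pstdev

def variant_passes_qc(genotypes, max_missing=0.01, min_maf=0.01):
    """genotypes: list of 0/1/2 alternate-allele counts, None for a missing call.

    Keeps a variant only if its missing call rate is <= 1% and its MAF is >= 1%
    (hypothetical helper; the HWE filter is intentionally not implemented here).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    missing_rate = 1 - len(called) / len(genotypes)
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)
    return missing_rate <= max_missing and maf >= min_maf

def heterozygosity_outliers(het_rates, n_sd=3.0):
    """Return indices of samples whose heterozygosity rate lies more than
    n_sd standard deviations from the cohort mean."""
    mu, sd = mean(het_rates), pstdev(het_rates)
    if sd == 0:
        return []
    return [i for i, h in enumerate(het_rates) if abs(h - mu) > n_sd * sd]
```

In practice these filters would be run by the QC tool itself; the sketch only makes the thresholds explicit.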
Aβ PET measurement
A 20 min static scan was conducted 90 min after the intravenous infusion of 300 MBq ± 10% [18F]florbetaben (FBB) using a PET/computed tomography (CT) system (Siemens Biograph mCT or Siemens Biograph mCT flow, Munich, Germany) [24, 25]. [18F]florbetaben was manufactured according to good manufacturing practices at Keio University Hospital using an automated synthesizer (Synthera V2; IBA, Louvain-la-Neuve, Belgium). The acquired PET data were reconstructed using an ordered subset expectation maximization algorithm (four iterations and 24 subsets) with a matrix size of 200 × 200 pixels. Gaussian post-reconstruction filtering with a full width at half maximum of 3 mm was applied along with scatter correction. For attenuation correction and anatomic registration, CT was performed with a tube voltage of 120 kVp, tube current of 50 mAs, 0.5 s per rotation, and a slice thickness of 2 mm. Visual assessment of the reconstructed images as Aβ-positive or Aβ-negative was conducted by a neuroradiologist with the required training and a dementia specialist. The visual assessment was performed by comparing signal intensity between the gray and white matter in axial PET slices at the lateral temporal, frontal, and parietal lobes, as well as the posterior cingulate cortex/precuneus. Scoring was performed using a regional cortical tracer uptake (RCTU) scoring system. When tracer uptake in the gray matter was equal to or greater than that of the adjacent white matter, an RCTU score of 2 or 3 was assigned, indicating positive tracer uptake, whereas a score of 1 was assigned for no tracer uptake. The RCTU scores from the four brain regions were subsequently aggregated to determine the brain amyloid plaque load score, with Aβ positivity being determined when one or more RCTU scores exceeded 1 [26].
The final diagnostic determination for all cases was made during a monthly conference attended by multiple neurologists and psychiatrists specializing in dementia and neuroimaging.
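The positivity rule stated above (Aβ positive when at least one of the four regional RCTU scores exceeds 1) can be written as a short sketch. The region keys and function name are hypothetical labels chosen for illustration, and the sketch does not model the consensus conference that made the final determination.

```python
# Hypothetical region labels for the four regions named in the text.
REGIONS = (
    "lateral_temporal",
    "frontal",
    "parietal",
    "posterior_cingulate_precuneus",
)

def is_amyloid_positive(rctu_scores):
    """rctu_scores: dict mapping each region to an RCTU score of 1, 2, or 3.

    Returns True when one or more regional scores exceed 1.
    """
    missing = [r for r in REGIONS if r not in rctu_scores]
    if missing:
        raise ValueError(f"missing regions: {missing}")
    return any(rctu_scores[r] > 1 for r in REGIONS)
```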
Base data of the PRS model
This study used Japanese AD GWAS data from the National Center for Geriatrics and Gerontology in Japan, including Japanese patients diagnosed with AD and healthy controls (3962 patients with AD and 4074 controls) [27]. The GWAS control group consisted of individuals with normal cognitive function whose subjective cognitive complaints were confirmed through neuropsychological assessment. The patients with AD were diagnosed based on the criteria established by the National Institute on Aging Alzheimer’s Association workgroups [28, 29]. Summary statistics were accessed through the National Bioscience Database Center (NBDC) at the Japan Science and Technology Agency (JST) at https://humandbs.dbcls.jp/en/ through accession number hum0237.v1.gwas.v1.
Polygenic risk score calculation
PRS were computed using PRS-CS with the East Asian subset of the 1000 Genomes Project Phase 3 dataset as the LD reference panel [16, 30]. The global shrinkage parameter phi (φ) was set to auto; in this automatic mode, φ is estimated from the discovery GWAS and post hoc tuning is not required. Using the thresholding method, variants with GWAS p-values at or above the chosen p-value threshold (pT) were removed, so that only SNPs with p < pT were retained. To determine the optimal pT, values of 1.0 × 10⁻⁶, 1.0 × 10⁻⁵, 1.0 × 10⁻⁴, 0.001, 0.01 and 0.1 were tested. Weighted risk scores were then calculated from the retained SNPs based on their effect sizes. PLINK 1.9 was used to sum all effect alleles of SNPs, which were weighted by the effect sizes derived from PRS-CS (June 6, 2019 version), into PRS for each individual in the training and external validation cohorts (Table 1).
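The thresholding and weighted-sum steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the PRS-CS or PLINK code path: the function names, the dictionary layout, and the SNP identifiers in the usage example are hypothetical, and the sketch assumes the p-value filter is applied to SNPs that already carry posterior effect sizes.

```python
def select_snps(summary_stats, p_threshold):
    """summary_stats: dict snp_id -> (posterior_effect_size, gwas_p_value).

    Returns snp_id -> effect size for SNPs with p < p_threshold.
    """
    return {
        snp: beta
        for snp, (beta, p) in summary_stats.items()
        if p < p_threshold
    }

def polygenic_score(dosages, weights):
    """dosages: dict snp_id -> effect-allele count (0, 1, or 2) for one person.

    Sums effect size x allele count over the selected SNPs; SNPs that were
    not genotyped in this individual are skipped (an assumption of the sketch).
    """
    return sum(beta * dosages[snp] for snp, beta in weights.items() if snp in dosages)

# Usage with made-up values (not study data):
stats = {"rsA": (0.2, 1e-7), "rsB": (-0.1, 0.05), "rsC": (0.3, 0.5)}
person = {"rsA": 2, "rsB": 1, "rsC": 2}
score_loose = polygenic_score(person, select_snps(stats, 0.1))
score_strict = polygenic_score(person, select_snps(stats, 1e-5))
```

Lowering the threshold shrinks the SNP set, which is why the number of SNPs contributing to the score differs so much between pT < 0.1 and pT < 1.0 × 10⁻⁵.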
Predictive model for Aβ PET positivity construction
The PRS models used in this manuscript are detailed in Table 2 and Supplementary Table 1. Given the known predictive power of SNPs in the APOE region for AD, a model was also developed excluding the region around apolipoprotein E (apoE) (chr19:44.4–46.5 Mb) to eliminate its influence. This region has high LD, which may disproportionately affect PRS calculations. In studies on AD PRS in Caucasian populations, higher predictive accuracy has been reported when directly genotyped APOE isoforms ε2 and ε4 were used as separate covariates in the regression model, rather than when the APOE region was included in the PRS (PRS.AD) [8, 9]. The dementia cohort at Keio University Hospital was used as one case-control dataset (54 Aβ PET positive and 83 negative cases; details are provided in Table 1). In Table 2, the first section shows the model in which PRS was calculated using the whole genome (Aβ.PRS.full). The second section presents the PRS calculated with the APOE region excluded (Aβ.PRS.no.APOE). The third section displays the model using Aβ.PRS.no.APOE and APOE4 (Aβ.PRS.1), whereas the fourth section shows the model using Aβ.PRS.no.APOE together with the APOE2 and APOE4 allele counts (Aβ.PRS.2). Additionally, Supplementary Table 1 presents models that include age, sex, and years of education as covariates, along with APOE status. Predictive models based on PRS, APOE status, and covariates were developed using logistic regression models with the glm() function in R 3.6.3.
Table 2.
Model description for the PRS models presented in the manuscript
| Model Name | Model description |
|---|---|
| Aβ.PRS.full | PRS including SNPs with a p-value < pT |
| Aβ.PRS.no.APOE | PRS including SNPs with a p-value < pT and excluding SNPs in the APOE region (chr19:44.4–46.5 Mb) |
| Aβ.PRS.1 | PRS calculated as a weighted sum of Aβ.PRS.no.APOE (including SNPs with a p-value < pT) and the number of APOE4 alleles |
| Aβ.PRS.2 | PRS calculated as a weighted sum of Aβ.PRS.no.APOE (including SNPs with a p-value < pT), the number of APOE2 alleles, and the number of APOE4 alleles |
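How a fitted logistic model of the Aβ.PRS.2 form turns its inputs into a predicted probability can be sketched as below. The study fitted its models with glm() in R; this Python sketch only shows the prediction step, and the coefficient values are hypothetical placeholders, not the fitted values from this study.

```python
import math

def predicted_probability(prs, apoe2_count, apoe4_count, coef):
    """coef: dict with keys 'intercept', 'prs', 'apoe2', 'apoe4' (log-odds scale).

    Setting coef['apoe2'] to 0 gives a model of the Aβ.PRS.1 form.
    """
    linear = (
        coef["intercept"]
        + coef["prs"] * prs
        + coef["apoe2"] * apoe2_count
        + coef["apoe4"] * apoe4_count
    )
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients for illustration only.
example_coef = {"intercept": -0.8, "prs": 0.5, "apoe2": -1.0, "apoe4": 1.2}

p_e4_carrier = predicted_probability(0.0, 0, 1, example_coef)
p_e2_carrier = predicted_probability(0.0, 1, 0, example_coef)
```

Keeping the APOE allele counts as separate terms lets each allele have its own direction and size of effect, instead of folding the APOE region into the genome-wide score.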
Model evaluation
The prediction accuracy of the PRS models was assessed using the area under the curve (AUC) from receiver operating characteristic (ROC) analysis. The 95% confidence interval (CI) was estimated using the R package pROC version 1.18.5. Sensitivity, specificity, and accuracy were calculated using a risk score threshold of 0.5.
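The evaluation metrics named above can be illustrated with a minimal sketch. It computes the AUC through its rank interpretation (the probability that a positive case receives a higher score than a negative case) and derives sensitivity, specificity, and accuracy at a 0.5 cut-off. The study used the pROC package; this sketch does not reproduce its confidence intervals, and treating a score equal to the threshold as positive is an assumption.

```python
def auc(scores, labels):
    """Rank-based AUC; ties between a positive and a negative count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

def classification_metrics(scores, labels, threshold=0.5):
    """Sensitivity, specificity, and accuracy when score >= threshold is called positive."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 1)
    tn = sum(1 for p, y in zip(pred, labels) if p == 0 and y == 0)
    fp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(pred, labels) if p == 0 and y == 1)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / len(labels),
    }
```

Because the AUC depends only on the ranking of scores, it is unaffected by the choice of cut-off, whereas sensitivity and specificity change with it.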
k-Fold cross-validation
Cross-validation is an essential step in assessing the performance and generalizability of a PRS model. The predictive models were evaluated and compared by splitting data into subsets, with the model being trained on some subsets (training set) and validated on the remaining subset (validation set). This process provides an estimate of how well the model generalizes to data not used for fitting and helps detect overfitting. To robustly evaluate the predictive performance of the PRS models, k-fold cross-validation, a widely used technique in machine learning and statistical modeling, was employed. The dataset was randomly divided into k = 5 folds, with one fold being used as the test set and the remaining four folds being used for model training. This procedure was repeated k times, with each fold being used exactly once as the test set. The AUC from each test set was averaged to produce a single estimate of model performance.
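The five-fold procedure described above can be sketched as follows. The helper names, the fixed random seed, and the unstratified split are assumptions of this illustration; the text does not state whether folds were stratified by Aβ PET status.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is in exactly one test fold."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

def cross_validated_mean(n_samples, fit, evaluate, k=5, seed=0):
    """fit(train_idx) -> model; evaluate(model, test_idx) -> metric such as AUC.

    Returns the mean metric across folds and the per-fold values.
    """
    values = [
        evaluate(fit(train), test)
        for train, test in k_fold_indices(n_samples, k, seed)
    ]
    return sum(values) / len(values), values
```

With 137 training participants, each test fold holds 27 or 28 individuals, so per-fold AUC values can vary considerably, which is consistent with reporting both the mean and the SD across folds.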
External validation
The performance of the PRS model was evaluated on an independent dataset that was not used during the model development phase. This step is deemed crucial for ensuring the generalizable predictive power of the PRS model and avoiding overfitting to the original dataset. The PRS models were validated in an independent Japanese cohort from Keio University Hospital to assess their predictive accuracy (Table 1). Similar QC procedures were applied to the validation cohort to ensure data integrity, and samples with high heterozygosity rates or other QC failures were excluded from the analysis. The PRS for each individual in the validation cohort was calculated using the same SNPs and weights as those obtained from the discovery cohort. The effect sizes of the APOE2 and APOE4 alleles, as well as those of the covariates, from the discovery cohort were applied to the model. Model performance was assessed using metrics such as AUC, sensitivity, specificity, and accuracy. Additionally, the distribution of PRS was evaluated.
Results
Optimal p-value threshold (pT) for SNP selection
We tested different p-value thresholds to identify the optimal set of SNPs that contributed to the PRS. We assessed the predictive ability of the PRS model using the AUC. Table 3 and Fig. 1 show the AUC for the case-control dataset in four scenarios using six SNP p-value thresholds (pT < 1.0 × 10⁻⁶, 1.0 × 10⁻⁵, 1.0 × 10⁻⁴, 0.001, 0.01, 0.1).
Table 3.
PRS prediction accuracy for the Aβ PET case-control dataset using varying pT thresholds and APOE modeling methods
| pT (or APOE-only model) | Aβ.PRS.full, N SNPs | Aβ.PRS.full, AUC | Aβ.PRS.no.APOE, N SNPs | Aβ.PRS.no.APOE, AUC | Aβ.PRS.1 (APOE4), AUC | Aβ.PRS.2 (APOE4 + APOE2), AUC |
|---|---|---|---|---|---|---|
| APOE4 | 1 | 0.687 | – | – | – | – |
| APOE2 + APOE4 | 2 | 0.716 | – | – | – | – |
| 1.0 × 10⁻⁶ | 14 | 0.624 | 9 | 0.565 | 0.714 | 0.732 |
| 1.0 × 10⁻⁵ | 41 | 0.641 | 36 | 0.506 | 0.716 | 0.735 |
| 1.0 × 10⁻⁴ | 182 | 0.603 | 176 | 0.539 | 0.722 | 0.740 |
| 0.001 | 1,144 | 0.571 | 1,137 | 0.529 | 0.716 | 0.740 |
| 0.01 | 9,744 | 0.563 | 9,719 | 0.538 | 0.686 | 0.715 |
| 0.1 | 88,065 | 0.610 | 87,971 | 0.609 | 0.740 | 0.759 |
Fig. 1.
ROC plot of PRS prediction accuracy for the Aβ PET case-control dataset using different pT thresholds and APOE modeling methods
At a threshold of pT < 0.1, the Aβ.PRS.full model, which included the full set of SNPs, achieved an AUC of 0.610 (95% confidence interval [CI] = 0.515–0.706). The Aβ.PRS.no.APOE model, which excluded SNPs from the APOE region, showed an AUC of 0.609 (95% CI = 0.513–0.706). The Aβ.PRS.2 model, which combined the PRS with the counts of APOE2 and APOE4 alleles, achieved the highest AUC (0.759; 95% CI = 0.678–0.839). A model that combined the PRS with only the APOE4 allele count also demonstrated high predictive accuracy (AUC = 0.740; 95% CI = 0.655–0.825).
At a threshold of pT < 1.0 × 10⁻⁵, the Aβ.PRS.2 model again achieved the highest AUC among the four models (0.735; 95% CI = 0.651–0.819). The model combining PRS with only the APOE4 allele count showed the second-highest predictive accuracy (AUC = 0.716; 95% CI = 0.627–0.805).
Moreover, Supplementary Tables 2 and 3, along with Supplementary Figs. 1 and 2, present the AUC values for models in which gender, age, and years of education were individually included as covariates in the Aβ.PRS.1 and Aβ.PRS.2 models across six pT thresholds. At both thresholds of pT < 0.1 and pT < 1.0 × 10⁻⁵, the inclusion of age as a covariate resulted in the greatest improvement in AUC for both models (Aβ.PRS.1 + Age: AUC = 0.775 [95% CI = 0.693–0.857] for pT < 0.1 and AUC = 0.773 [95% CI = 0.690–0.856] for pT < 1.0 × 10⁻⁵; Aβ.PRS.2 + Age: AUC = 0.796 [95% CI = 0.721–0.871] for pT < 0.1 and AUC = 0.798 [95% CI = 0.722–0.874] for pT < 1.0 × 10⁻⁵). Although gender and years of education contributed to model performance in certain cases, their impact on predictive accuracy was less pronounced compared to age.
k-fold cross-validation of the polygenic risk score model
We conducted k-fold cross-validation by dividing the data into k equally sized folds (Table 4; Fig. 2). The 5-fold cross-validation demonstrated that the Aβ.PRS.1 and Aβ.PRS.2 models maintained stable and generalizable predictive performance. These models achieved mean AUC values exceeding 0.7 at both pT thresholds (0.1 and 1.0 × 10⁻⁵), indicating robust model stability. In contrast, models using only APOE4 information or PRS alone did not achieve a mean AUC of 0.7. However, the model using only APOE4 and APOE2 counts maintained consistent predictive accuracy.
Table 4.
k-Fold cross validations for the Aβ PET case-control dataset using two different pT thresholds and methods to model APOE
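| Model Description | pT | k-fold | AUC |
|---|---|---|---|
| APOE4 | - | k1 | 0.727 |
| APOE4 | - | k2 | 0.666 |
| APOE4 | - | k3 | 0.685 |
| APOE4 | - | k4 | 0.705 |
| APOE4 | - | k5 | 0.644 |
| APOE4 | - | mean (SD) | 0.685 (± 0.0326) |
| APOE4 + APOE2 | - | k1 | 0.775 |
| APOE4 + APOE2 | - | k2 | 0.757 |
| APOE4 + APOE2 | - | k3 | 0.714 |
| APOE4 + APOE2 | - | k4 | 0.614 |
| APOE4 + APOE2 | - | k5 | 0.738 |
| APOE4 + APOE2 | - | mean (SD) | 0.719 (± 0.0634) |
| Aβ.PRS.full | 0.1 | k1 | 0.808 |
| Aβ.PRS.full | 0.1 | k2 | 0.583 |
| Aβ.PRS.full | 0.1 | k3 | 0.578 |
| Aβ.PRS.full | 0.1 | k4 | 0.489 |
| Aβ.PRS.full | 0.1 | k5 | 0.519 |
| Aβ.PRS.full | 0.1 | mean (SD) | 0.595 (± 0.125) |
| Aβ.PRS.full | 1.0 × 10⁻⁵ | k1 | 0.508 |
| Aβ.PRS.full | 1.0 × 10⁻⁵ | k2 | 0.722 |
| Aβ.PRS.full | 1.0 × 10⁻⁵ | k3 | 0.604 |
| Aβ.PRS.full | 1.0 × 10⁻⁵ | k4 | 0.790 |
| Aβ.PRS.full | 1.0 × 10⁻⁵ | k5 | 0.531 |
| Aβ.PRS.full | 1.0 × 10⁻⁵ | mean (SD) | 0.631 (± 0.122) |
| Aβ.PRS.no.APOE | 0.1 | k1 | 0.423 |
| Aβ.PRS.no.APOE | 0.1 | k2 | 0.455 |
| Aβ.PRS.no.APOE | 0.1 | k3 | 0.668 |
| Aβ.PRS.no.APOE | 0.1 | k4 | 0.790 |
| Aβ.PRS.no.APOE | 0.1 | k5 | 0.719 |
| Aβ.PRS.no.APOE | 0.1 | mean (SD) | 0.611 (± 0.166) |
| Aβ.PRS.no.APOE | 1.0 × 10⁻⁵ | k1 | 0.436 |
| Aβ.PRS.no.APOE | 1.0 × 10⁻⁵ | k2 | 0.428 |
| Aβ.PRS.no.APOE | 1.0 × 10⁻⁵ | k3 | 0.390 |
| Aβ.PRS.no.APOE | 1.0 × 10⁻⁵ | k4 | 0.477 |
| Aβ.PRS.no.APOE | 1.0 × 10⁻⁵ | k5 | 0.388 |
| Aβ.PRS.no.APOE | 1.0 × 10⁻⁵ | mean (SD) | 0.424 (± 0.0369) |
| Aβ.PRS.1 | 0.1 | k1 | 0.850 |
| Aβ.PRS.1 | 0.1 | k2 | 0.781 |
| Aβ.PRS.1 | 0.1 | k3 | 0.765 |
| Aβ.PRS.1 | 0.1 | k4 | 0.494 |
| Aβ.PRS.1 | 0.1 | k5 | 0.713 |
| Aβ.PRS.1 | 0.1 | mean (SD) | 0.720 (± 0.136) |
| Aβ.PRS.1 | 1.0 × 10⁻⁵ | k1 | 0.866 |
| Aβ.PRS.1 | 1.0 × 10⁻⁵ | k2 | 0.789 |
| Aβ.PRS.1 | 1.0 × 10⁻⁵ | k3 | 0.626 |
| Aβ.PRS.1 | 1.0 × 10⁻⁵ | k4 | 0.685 |
| Aβ.PRS.1 | 1.0 × 10⁻⁵ | k5 | 0.625 |
| Aβ.PRS.1 | 1.0 × 10⁻⁵ | mean (SD) | 0.718 (± 0.106) |
| Aβ.PRS.2 | 0.1 | k1 | 0.711 |
| Aβ.PRS.2 | 0.1 | k2 | 0.856 |
| Aβ.PRS.2 | 0.1 | k3 | 0.802 |
| Aβ.PRS.2 | 0.1 | k4 | 0.665 |
| Aβ.PRS.2 | 0.1 | k5 | 0.750 |
| Aβ.PRS.2 | 0.1 | mean (SD) | 0.757 (± 0.0749) |
| Aβ.PRS.2 | 1.0 × 10⁻⁵ | k1 | 0.668 |
| Aβ.PRS.2 | 1.0 × 10⁻⁵ | k2 | 0.639 |
| Aβ.PRS.2 | 1.0 × 10⁻⁵ | k3 | 0.850 |
| Aβ.PRS.2 | 1.0 × 10⁻⁵ | k4 | 0.710 |
| Aβ.PRS.2 | 1.0 × 10⁻⁵ | k5 | 0.806 |
| Aβ.PRS.2 | 1.0 × 10⁻⁵ | mean (SD) | 0.735 (± 0.0903) |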
Fig. 2.
ROC plot of Aβ.PRS.1 and Aβ.PRS.2 performance using two different pT thresholds with five-fold cross-validation
External validation of the polygenic risk score model
Our validation cohort comprised Japanese individuals from the Keio Dementia Cohort, which was distinct from the original samples used for PRS model development. Table 1 presents detailed information on the validation cohort. The external validation results (Table 5, Supplementary Table 4, Fig. 3, and Supplementary Fig. 3) demonstrated that the Aβ.PRS.no.APOE model showed inadequate predictive performance. Although lowering the pT threshold improved the accuracy of the Aβ.PRS.full model, it did not match the performance of models incorporating APOE information. In the validation cohort, the Aβ.PRS.2 model maintained an AUC value of approximately 0.7 regardless of pT threshold, demonstrating moderate predictive accuracy. Similarly, the Aβ.PRS.1 model achieved AUC values approaching 0.7. Models that incorporated both APOE information and age as covariates consistently maintained AUC values above 0.7. In contrast, the inclusion of only sex or years of education as covariates in the APOE models did not improve AUC and, in fact, resulted in a slight decrease in performance. Accuracy, sensitivity, and specificity were calculated at a risk score threshold of 0.5 for each model.
Table 5.
External validation for the Aβ PET case-control dataset using two different pT thresholds and methods to model APOE
| Model Description | pT | AUC | 95% CI | Sensitivity | Specificity | Accuracy |
|---|---|---|---|---|---|---|
| Aβ.PRS.full | 0.1 | 0.617 | 0.481–0.753 | 0.188 | 0.959 | 0.654 |
| Aβ.PRS.full | 1.0 × 10⁻⁵ | 0.678 | 0.554–0.802 | 0.281 | 0.898 | 0.654 |
| Aβ.PRS.no.APOE | 0.1 | 0.594 | 0.459–0.729 | 0.125 | 0.959 | 0.630 |
| Aβ.PRS.no.APOE | 1.0 × 10⁻⁵ | 0.477 | 0.347–0.607 | 0 | 1 | 0.605 |
| Aβ.PRS.1 | 0.1 | 0.698 | 0.571–0.825 | 0.594 | 0.714 | 0.667 |
| Aβ.PRS.1 | 1.0 × 10⁻⁵ | 0.698 | 0.577–0.820 | 0.625 | 0.694 | 0.667 |
| Aβ.PRS.2 | 0.1 | 0.703 | 0.578–0.828 | 0.594 | 0.714 | 0.667 |
| Aβ.PRS.2 | 1.0 × 10⁻⁵ | 0.703 | 0.583–0.824 | 0.625 | 0.694 | 0.667 |
Fig. 3.
ROC plot of the performance of each prediction model using two different pT thresholds and the external validation method
The red line shows the ROC curve of Aβ.PRS.full; the green line represents the ROC curve of Aβ.PRS.no.APOE; the blue line shows the ROC curve of Aβ.PRS.1; the purple line represents the ROC curve of Aβ.PRS.2
Discussion
This study evaluated the utility of PRS for predicting Aβ deposition, as measured by PET imaging, in a Japanese population, and the results support that utility. We determined that the optimal p-value threshold for the Japanese Aβ PET positivity risk model was pT < 0.1 and demonstrated the advantage of modeling the counts of APOE2 and APOE4 alleles separately from the PRS calculations. Notably, even when we adjusted the pT to < 1 × 10⁻⁵, the model maintained comparable accuracy.
These findings are consistent with those of previous studies that reported similar accuracies in AD PRS models for Caucasian populations. Specifically, research has indicated that combining the counts of APOE2 and APOE4 alleles with PRS calculated from SNPs using a threshold of pT < 0.1 can yield high predictive accuracy [8]. In the present study, although the Japanese AD GWAS used for PRS calculation had a smaller sample size and lower statistical power than its Caucasian counterparts, we achieved comparable accuracy levels.
External validation was essential for evaluating the generalizability and robustness of the PRS models. Validation of the Aβ PET positivity PRS model in an independent Japanese cohort demonstrated both predictive accuracy and robustness. However, the predictive accuracy of the Aβ.PRS.1 and Aβ.PRS.2 models was nearly identical, and including APOE2 alongside APOE4 did not significantly enhance the predictive power. This outcome was likely attributable to the low prevalence of APOE2 carriers in the validation cohort. The Japanese cohort used in this study contained very few APOE2 carriers, all of whom were Aβ PET negative. This may have contributed to an overestimation of APOE2’s effect within the model.
These results highlight the challenges associated with developing robust predictive models when sample sizes are limited. In this context, the model incorporating only APOE4 allele counts demonstrated comparable accuracy, suggesting this simplified approach may be viable. To enhance the predictive accuracy of Aβ PET positivity risk models, increasing target sample sizes and including individuals from diverse genetic backgrounds is essential.
Models were developed that incorporated PRS and APOE information, along with gender, age, and years of education as covariates. The inclusion of age, a known risk factor for dementia onset, resulted in improved prediction accuracy. However, gender and the environmental factor of years of education did not significantly impact the model’s accuracy. Additionally, external validation results indicated that including age as a covariate did not lead to a dramatic increase in the AUC. These findings suggest that a model predicting Aβ deposition using only genomic information, without incorporating environmental factors and aging associated with dementia onset, may offer valuable insights.
When the pT was set to 1.0 × 10⁻⁵ and the APOE region was excluded, 36 SNPs remained (Table 6). Using this threshold, the Aβ.PRS.2 model exceeded an AUC of 0.7. This indicated that even with a lower pT threshold and fewer SNPs, maintaining robust predictive accuracy was possible. The nearest gene names around SNPs were assigned using information from dbSNP [31]. The SNPs mapped to genes such as SORL1 (sortilin-related receptor 1), which has been associated with AD in GWAS [1, 2, 32]. SORL1 plays a crucial role in AD pathogenesis through its involvement in amyloid precursor protein (APP) processing and Aβ generation [33]. RYR2 (ryanodine receptor 2) encodes a protein that functions as a channel in the sarcoplasmic reticulum membrane of cardiac muscle cells [34]. Although dysregulation of calcium via RYR2 may contribute to AD pathogenesis, the association between RYR2 and AD remains unclear [35]. FAM47E, PAPOLG, and RAB3C represent novel loci associated with AD in the Japanese population [27]. However, the relationship between these genes and AD requires further investigation. These SNPs may provide crucial information for predicting AD risk in Japanese individuals.
Table 6.
Details of the 36 SNPs used for creating the PRS for Aβ PET positivity
| Chr | Position | SNP | A1 | A2 | Gene |
|---|---|---|---|---|---|
| 1 | 237,216,056 | rs1124814 | G | A | RYR2 |
| 2 | 60,951,203 | rs10173062 | T | C | - |
| 2 | 60,952,315 | rs12713425 | C | T | - |
| 2 | 60,952,688 | rs10177082 | A | G | - |
| 2 | 60,955,060 | rs1966705 | T | C | - |
| 2 | 60,962,303 | rs3796067 | C | T | - |
| 2 | 60,962,423 | rs12713426 | C | T | - |
| 2 | 60,962,589 | rs11692811 | C | T | - |
| 2 | 60,965,747 | rs10184932 | A | C | - |
| 2 | 60,967,961 | rs7576218 | T | C | - |
| 2 | 60,968,070 | rs7579240 | T | G | - |
| 2 | 60,969,547 | rs10210517 | C | T | - |
| 2 | 60,972,786 | rs1866207 | A | G | - |
| 2 | 60,982,796 | rs12471897 | C | T | PAPOLG |
| 2 | 60,982,846 | rs12475640 | G | A | PAPOLG |
| 2 | 60,994,438 | rs28459296 | G | A | PAPOLG |
| 2 | 60,994,945 | rs7599341 | T | C | PAPOLG |
| 2 | 61,003,193 | rs11687809 | T | C | PAPOLG |
| 2 | 61,027,921 | rs11678813 | T | G | PAPOLG |
| 2 | 61,032,181 | rs11125856 | T | C | - |
| 2 | 61,041,022 | rs9989783 | G | A | - |
| 2 | 61,041,377 | rs9989794 | G | A | - |
| 2 | 105,019,482 | rs4300852 | T | C | LOC100287010 |
| 4 | 77,139,510 | rs7685696 | A | G | FAM47E/SCARB2 |
| 4 | 77,140,733 | rs10032423 | C | T | FAM47E/SCARB2 |
| 4 | 77,142,346 | rs4282210 | C | A | FAM47E/SCARB2 |
| 5 | 58,091,485 | rs10805510 | G | A | RAB3C |
| 6 | 74,774,958 | rs4708114 | A | G | - |
| 6 | 114,228,811 | rs517399 | G | A | LINC02880 |
| 11 | 121,436,270 | rs3781832 | T | G | SORL1 |
| 11 | 121,445,940 | rs3781834 | G | A | SORL1 |
| 11 | 121,470,646 | rs12274541 | T | C | SORL1 |
| 11 | 121,473,391 | rs11218360 | C | T | SORL1 |
| 11 | 121,474,025 | rs12287339 | C | T | SORL1 |
| 11 | 121,474,239 | rs17125523 | G | A | SORL1 |
| 11 | 121,477,816 | rs3737529 | T | C | SORL1 |
The successful application of PRS calculation methods in this study based on Japanese AD GWAS summary statistics shows promising potential. Although previous studies have demonstrated relatively high performance when calculating PRS for Japanese individuals using European AD GWAS data, applying PRS methods based on different ethnic groups’ GWAS summary statistics may reduce predictive accuracy [36]. This reduction stems from population differences in genetic structures, such as linkage disequilibrium blocks, which complicate the adaptation of SNP weights to non-European populations [37]. The relatively small Japanese GWAS sample size highlights the necessity for more statistically powerful GWAS to calculate reliable PRS. Larger AD GWAS in Japanese populations are necessary and require further investigation and data collection. We anticipate that using GWAS statistics from large East Asian cohorts, including Japanese individuals, will facilitate more robust validation of polygenic effects.
This study had certain limitations. First, the current target and validation sample sizes for PRS analysis remain small, increasing the likelihood of false positives and potentially exaggerating effect sizes. Second, participant ages at Aβ PET diagnosis varied. As APOE locus allelic variations relate to AD onset age and brain Aβ accumulation changes, a cohort with uniform age at Aβ PET measurement could yield a more accurate model.
There is no consensus regarding the optimal number of SNPs to include in an AD PRS. In this study, we compared two models using different p-value thresholds. The model using more SNPs (pT < 0.1; AUC 0.759) showed slightly higher accuracy, whereas the model using fewer SNPs (pT < 1.0 × 10−5; AUC 0.735) would allow a smaller and more cost-effective genotyping panel. Although our findings are notable, the predictive accuracy of the PRS-based models remains insufficient for clinical application. In the future, combining the PRS with other biomarkers, such as plasma proteomics and metabolomics, may enhance model accuracy. Additionally, deep learning-based polygenic risk analysis offers a promising approach to increasing predictive power [38]. More accurate predictive models would facilitate early evaluation of individual disease risk, potentially contributing to lifestyle changes and disease prevention. Moreover, investigating individuals with a high PRS who do not develop AD could provide valuable insights into AD resistance factors and lead to the identification of new drug targets.
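The comparison of models by accuracy can be illustrated with a short sketch of the area under the ROC curve, computed here as the probability that a randomly chosen Aβ PET-positive individual receives a higher score than a randomly chosen negative individual (ties count as one half). This is a generic rank-based AUC, not necessarily the routine used in this study; the function name `auc` and the toy scores are hypothetical and carry no relation to the cohort data.

```python
import numpy as np

def auc(scores, labels):
    """Rank-based AUC: P(score of a positive > score of a negative), ties = 0.5."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative label")
    # Compare every positive with every negative.
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())

# Toy example (hypothetical): 3 PET-positive and 3 PET-negative individuals
labels = np.array([1, 1, 1, 0, 0, 0])
score_many_snps = np.array([0.9, 0.7, 0.4, 0.5, 0.2, 0.1])  # lenient threshold
score_few_snps = np.array([0.8, 0.3, 0.4, 0.5, 0.2, 0.1])   # strict threshold

auc_many = auc(score_many_snps, labels)
auc_few = auc(score_few_snps, labels)
```

With samples this small, a single reordered pair shifts the AUC noticeably, which is one reason small target cohorts make differences between models hard to interpret.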
Conclusion
This study demonstrated the effectiveness of PRS in predicting Aβ deposition in a Japanese population, highlighting its potential as a noninvasive tool for early AD detection. Such a tool could facilitate early intervention and appropriate follow-up of high-risk patients. Interventions may include lifestyle modifications, cognitive training, exercise, and nutritional management. Clinical trials of new AD therapies could also enroll individuals with a high Aβ PET positivity PRS to evaluate therapeutic efficacy. However, further research is required to validate these applications. Exploring appropriate approaches while considering the genetic background of the Japanese population remains important. Although we demonstrated the potential of PRS in a Japanese cohort, future studies should incorporate larger and more genetically diverse samples to enhance the accuracy and generalizability of Aβ PET positivity risk models.
Acknowledgements
We express our sincere gratitude to all participants of this study, including those who provided specimens and data, as well as individuals involved in study applications, recruitment, sample management, and data cleaning. We also extend our appreciation to Drs. Hirofumi Aoyagi, Yoshiaki Sato, and all members of EKID for their invaluable support. Furthermore, we thank Keisuke Takahata, Hajime Tabuchi, Ms. Natsumi Suzuki, Ayaka Morimoto, Azusa Oosumi, and Yuka Hoshino from the Department of Neuropsychiatry; Dr. Morinobu Seki from the Department of Neurology; Drs. Yu Iwabuchi and Masahiro Jinzaki from the Department of Diagnostic Radiology; and the staff of the Division of Nuclear Medicine and Department of Radiology at Keio University School of Medicine for their assistance with PET examinations and image processing. Additionally, we acknowledge the support of Drs. Michael Nagle and Ying Wu (Eisai Inc.). This study was supported by AMED under grant number JP17pc0101006. The NCGG Japanese GWAS summary data used in this research were originally obtained by a research project (No. hum0237.v1) led by Prof. Kouichi Ozaki and are available on the website of the NBDC Human Database (https://humandbs.biosciencedbc.jp/en/) of the Japan Science and Technology Agency (JST).
Abbreviations
- FBB: [18F]florbetaben
- AD: Alzheimer’s disease
- APP: Amyloid precursor protein
- Aβ: Amyloid-β
- AUC: Area under the curve
- ADAD: Autosomal dominant AD
- CN: Cognitively normal
- CT: Computed tomography
- CI: Confidence interval
- CS: Continuous shrinkage
- DSAD: Down syndrome-associated AD
- GWAS: Genome-wide association studies
- LD: Linkage disequilibrium
- MCI: Mild cognitive impairment
- MAF: Minor allele frequency
- pT: p-value threshold
- PRS: Polygenic risk scores
- PET: Positron emission tomography
- QC: Quality control
- ROC: Receiver operating characteristic
- RCTU: Regional cortical tracer uptake
- SNPs: Single nucleotide polymorphisms
- WGS: Whole-genome sequencing
Author contributions
MK: Study design, data analysis and interpretation, and manuscript drafting. JI, KTakahashi, and JK: Data interpretation and manuscript revision. KTai: Sample collection, preparation, and management. DI and SB: Led sample recruitment, collected samples, and revised the manuscript. All authors read and approved the final manuscript.
Funding
This work was supported by a grant from the Japan Agency for Medical Research and Development (AMED) (grant number JP17pc0101006 to MK, JI, KTakahashi, KTai, JK, SB, and DI). The funder had no role in the study design, data collection, decision to publish, or preparation of the manuscript.
Data availability
GWAS statistics for Alzheimer’s disease in the Japanese population were obtained from the National Bioscience Database Center Human Database (https://humandbs.biosciencedbc.jp/en/), Japan (Research ID: hum0237.v1, 2021). To protect the privacy of participants in the Keio cohort, individual whole-genome sequences and phenotype data cannot be made publicly available.
Declarations
Ethics approval and consent to participate
The study design and protocol were approved by the Certified Review Board of Keio University (#N20170237) and adhered to the principles of the Declaration of Helsinki. All participants (plus their proxies as needed) provided written informed consent for participation in the study. The study was registered with the University Hospital Medical Information Network Clinical Trials Registry (UMIN-CTR; https://www.umin.ac.jp/ctr/index.htm, ID# UMIN000032027) on March 31, 2018, and the Japan Registry of Clinical Trials (jRCT; https://jrct.niph.go.jp/, ID# jRCTs031180225) on March 11, 2019.
Consent for publication
Not applicable.
Competing interests
MK, JI, KTakahashi, KTai, and JK are employees of Eisai. DI has received honorariums from Daiichi Sankyo, Nihon Medi-Physics, Kowa, PDRadiopharma, and Eisai. No other relationships or activities that could have influenced the submitted work are reported. SB declares no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Jansen IE, Savage JE, Watanabe K, Bryois J, Williams DM, Steinberg S, et al. Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer’s disease risk. Nat Genet. 2019;51(3):404–13.
- 2. Bellenguez C, Küçükali F, Jansen IE, Kleineidam L, Moreno-Grau S, Amin N, et al. New insights into the genetic etiology of Alzheimer’s disease and related dementias. Nat Genet. 2022;54(4):412–36.
- 3. Fortea J, Pegueroles J, Alcolea D, Belbin O, Dols-Icardo O, Vaqué-Alcázar L, et al. APOE4 homozygozity represents a distinct genetic form of Alzheimer’s disease. Nat Med. 2024;30(5):1284–91.
- 4. Lourida I, Hannon E, Littlejohns TJ, Langa KM, Hyppönen E, Kuzma E, Llewellyn DJ. Association of lifestyle and genetic risk with incidence of dementia. JAMA. 2019;322(5):430–7.
- 5. Licher S, Ahmad S, Karamujić-Čomić H, Voortman T, Leening MJG, Ikram MA, Ikram MK. Genetic predisposition, modifiable-risk-factor profile and long-term dementia risk in the general population. Nat Med. 2019;25(9):1364–9.
- 6. Chasioti D, Yan J, Nho K, Saykin AJ. Progress in polygenic composite scores in Alzheimer’s and other complex diseases. Trends Genet. 2019;35(5):371–82.
- 7. Dudbridge F. Power and predictive accuracy of polygenic risk scores. PLoS Genet. 2013;9(3):e1003348.
- 8. Leonenko G, Baker E, Stevenson-Hoare J, Sierksma A, Fiers M, Williams J, et al. Identifying individuals with high risk of Alzheimer’s disease using polygenic risk scores. Nat Commun. 2021;12(1):4506.
- 9. de Rojas I, Moreno-Grau S, Tesi N, Grenier-Boley B, Andrade V, Jansen IE, et al. Common variants in Alzheimer’s disease and risk stratification by polygenic risk scores. Nat Commun. 2021;12(1):3417.
- 10. Sabuncu MR, Buckner RL, Smoller JW, Lee PH, Fischl B, Sperling RA. The association between a polygenic alzheimer score and cortical thickness in clinically normal subjects. Cereb Cortex. 2012;22(11):2653–61.
- 11. Mormino EC, Sperling RA, Holmes AJ, Buckner RL, De Jager PL, Smoller JW, Sabuncu MR. Polygenic risk of alzheimer disease is associated with early- and late-life processes. Neurology. 2016;87(5):481–8.
- 12. Ge T, Sabuncu MR, Smoller JW, Sperling RA, Mormino EC. Dissociable influences of APOE ε4 and polygenic risk of AD dementia on amyloid and cognition. Neurology. 2018;90(18):e1605–12.
- 13. Zettergren A, Lord J, Ashton NJ, Benedet AL, Karikari TK, Lantero Rodriguez J, et al. Association between polygenic risk score of Alzheimer’s disease and plasma phosphorylated Tau in individuals from the Alzheimer’s disease neuroimaging initiative. Alzheimers Res Ther. 2021;13(1):17.
- 14. Daunt P, Ballard CG, Creese B, Davidson G, Hardy J, Oshota O, et al. Polygenic risk scoring is an effective approach to predict those individuals most likely to decline cognitively due to Alzheimer’s disease. J Prev Alzheimers Dis. 2021;8(1):78–83.
- 15. Pyun JM, Park YH, Lee KJ, Kim S, Saykin AJ, Nho K. Predictability of polygenic risk score for progression to dementia and its interaction with APOE ε4 in mild cognitive impairment. Transl Neurodegener. 2021;10(1):32.
- 16. Ge T, Chen CY, Ni Y, Feng YA, Smoller JW. Polygenic prediction via bayesian regression and continuous shrinkage priors. Nat Commun. 2019;10(1):1776.
- 17. Slatkin M. Linkage disequilibrium–understanding the evolutionary past and mapping the medical future. Nat Rev Genet. 2008;9(6):477–85.
- 18. Shimohama S, Tezuka T, Takahata K, Bun S, Tabuchi H, Seki M, et al. Impact of amyloid and Tau PET on changes in diagnosis and patient management. Neurology. 2023;100(3):e264–74.
- 19. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The genome analysis toolkit: a mapreduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20(9):1297–303.
- 20. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011;43(5):491–8.
- 21. Huang PJ, Chang JH, Lin HH, Li YX, Lee CC, Su CT, et al. DeepVariant-on-Spark: Small-Scale genome analysis using a Cloud-Based computing framework. Comput Math Methods Med. 2020;2020:7231205.
- 22. Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, Del Angel G, Levy-Moonshine A, et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr Protoc Bioinformatics. 2013;43(1110):11.0.1-.0.33.
- 23. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559–75.
- 24. Mashima K, Ito D, Kameyama M, Osada T, Tabuchi H, Nihei Y, et al. Extremely low prevalence of amyloid positron emission tomography positivity in Parkinson’s disease without dementia. Eur Neurol. 2017;77(5–6):231–7.
- 25. Tezuka T, Takahata K, Seki M, Tabuchi H, Momota Y, Shiraiwa M, et al. Evaluation of [(18)F]PI-2620, a second-generation selective Tau tracer, for assessing four-repeat Tauopathies. Brain Commun. 2021;3(4):fcab190.
- 26. Bun S, Ito D, Tezuka T, Kubota M, Ueda R, Takahata K, et al. Performance of plasma Aβ42/40, measured using a fully automated immunoassay, across a broad patient population in identifying amyloid status. Alzheimers Res Ther. 2023;15(1):149.
- 27. Shigemizu D, Mitsumori R, Akiyama S, Miyashita A, Morizono T, Higaki S, et al. Ethnic and trans-ethnic genome-wide association studies identify new loci influencing Japanese Alzheimer’s disease risk. Transl Psychiatry. 2021;11(1):151.
- 28. McKhann GM, Knopman DS, Chertkow H, Hyman BT, Jack CR Jr., Kawas CH, et al. The diagnosis of dementia due to Alzheimer’s disease: recommendations from the National Institute on Aging-Alzheimer’s association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimers Dement. 2011;7(3):263–9.
- 29. Albert MS, DeKosky ST, Dickson D, Dubois B, Feldman HH, Fox NC, et al. The diagnosis of mild cognitive impairment due to Alzheimer’s disease: recommendations from the National Institute on Aging-Alzheimer’s association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimers Dement. 2011;7(3):270–9.
- 30. Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, et al. A global reference for human genetic variation. Nature. 2015;526(7571):68–74.
- 31. Sherry ST, Ward M, Sirotkin K. dbSNP-database for single nucleotide polymorphisms and other classes of minor genetic variation. Genome Res. 1999;9(8):677–9.
- 32. Miyashita A, Koike A, Jun G, Wang LS, Takahashi S, Matsubara E, et al. SORL1 is genetically associated with late-onset Alzheimer’s disease in Japanese, Koreans and Caucasians. PLoS ONE. 2013;8(4):e58618.
- 33. Mishra S, Knupp A, Szabo MP, Williams CA, Kinoshita C, Hailey DW, et al. The Alzheimer’s gene SORL1 is a regulator of endosomal traffic and recycling in human neurons. Cell Mol Life Sci. 2022;79(3):162.
- 34. Bround MJ, Wambolt R, Luciani DS, Kulpa JE, Rodrigues B, Brownsey RW, et al. Cardiomyocyte ATP production, metabolic flexibility, and survival require calcium flux through cardiac Ryanodine receptors in vivo. J Biol Chem. 2013;288(26):18975–86.
- 35. Bezprozvanny I. Calcium signaling and neurodegenerative diseases. Trends Mol Med. 2009;15(3):89–100.
- 36. Kikuchi M, Miyashita A, Hara N, Kasuga K, Saito Y, Murayama S, et al. Polygenic effects on the risk of Alzheimer’s disease in the Japanese population. Alzheimers Res Ther. 2024;16(1):45.
- 37. Rosenberg NA, Pritchard JK, Weber JL, Cann HM, Kidd KK, Zhivotovsky LA, Feldman MW. Genetic structure of human populations. Science. 2002;298(5602):2381–5.
- 38. Zhou X, Chen Y, Ip FCF, Jiang Y, Cao H, Lv G, et al. Deep learning-based polygenic risk analysis for Alzheimer’s disease prediction. Commun Med (Lond). 2023;3(1):49.