Abstract
Animal studies show aging varies between individuals as well as between organs within an individual1–4, but whether this is true in humans and its effect on age-related diseases is unknown. We utilized levels of human blood plasma proteins originating from specific organs to measure organ-specific aging differences in living individuals. Using machine learning models, we analysed aging in 11 major organs and estimated organ age reproducibly in five independent cohorts encompassing 5,676 adults across the human lifespan. We discovered nearly 20% of the population show strongly accelerated age in one organ and 1.7% are multi-organ agers. Accelerated organ aging confers 20–50% higher mortality risk, and organ-specific diseases relate to faster aging of those organs. We find individuals with accelerated heart aging have a 250% increased heart failure risk and accelerated brain and vascular aging predict Alzheimer’s disease (AD) progression independently from and as strongly as plasma pTau-181 (ref. 5), the current best blood-based biomarker for AD. Our models link vascular calcification, extracellular matrix alterations and synaptic protein shedding to early cognitive decline. We introduce a simple and interpretable method to study organ aging using plasma proteomics data, predicting diseases and aging effects.
Subject terms: Proteome informatics, Prognostic markers, Machine learning, Diagnostic markers, Predictive markers
Blood plasma protein data was combined with machine learning models for a simple method to determine differences in organ-specific aging; the study provides a basis for the prediction of diseases and aging effects using plasma proteomics.
Main
Aging results in organism-wide deterioration of tissue structure and function that drastically increases the risk of most chronic diseases. Comprehensive studies of the molecular changes that occur with aging across multiple organs in mice have identified unique molecular aging trajectories and timings1–4, and susceptibility and resilience to diseases of aging in specific organs such as the brain, heart and kidney vary substantially across the population6. However, little is known about how human organs change molecularly with age. A molecular understanding of human organ aging is of critical importance to address the massive global disease burden of aging and could revolutionize patient care, preventative medicine and drug development7. In particular, preclinical studies have demonstrated that rejuvenating interventions affect organs differently3,8. To translate these studies into transformative medicines, we must be able to accurately measure aging across the body and understand the diversity of human aging not only across but also within individuals.
While many methods to measure molecular aging in humans have been developed9–11, most of them provide just a single measure of aging for the whole body. This is difficult to interpret given the complexity of human aging trajectories. Some recent methods have used clinical chemistry markers which include some markers of organ function12–15. However, many of these markers have low organ specificity, making them difficult to interpret for organ-specific aging. Methods to measure brain aging have used MRI-based brain volume and functional connectivity measurements, which are costly and do not provide molecular insights16, or have required tissue samples, which prevents their application in living persons17. Building off the wealth of literature and clinical practice that uses certain organ-specific plasma proteins to noninvasively assess aspects of organ health, such as alanine transaminase for liver damage, we hypothesized that comprehensive quantification of organ-specific proteins in plasma could enable minimally invasive assessment and tracking of human aging for any organ.
Plasma proteins can model organ aging
To test this, we measured 4,979 proteins in a total of 5,676 subjects across five independent cohorts (Supplementary Table 1) and mapped the putative organ-specific plasma proteome, which we used to train models of organ aging (Fig. 1a). We mapped the organ-specific plasma proteome using human organ bulk RNA sequencing (RNA-seq) data from the Genotype-Tissue Expression (GTEx) project18. We classified genes as ‘organ enriched’ if they were expressed at least four times higher in one organ compared to any other organ, according to the definition proposed in the Human Protein Atlas19 (Extended Data Fig. 1, Supplementary Tables 2 and 3, and Methods). We annotated the 4,979 human proteins measured by the SomaScan assay with this information and found 893 (18%) proteins met this definition, with the highest number from the brain. We performed additional quality control to remove proteins with a high coefficient of variation or a low correlation between the two different versions of the SomaScan assay present across our cohorts, leaving us with 4,778 proteins (856 organ enriched, 17.9%) which were used for downstream analysis (Supplementary Fig. 1 and Supplementary Tables 4 and 5).
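The organ-enrichment rule described above (expression at least four times higher in one organ than in any other) can be illustrated with a minimal sketch. The input layout, the function name `organ_enriched` and the toy values are assumptions made for illustration; they are not GTEx data or the authors' code.

```python
def organ_enriched(expression, fold=4.0):
    """Map each gene to its enriched organ, or None if no organ qualifies.

    expression: {gene: {organ: expression level}} (hypothetical layout).
    A gene is called enriched when its top organ is at least `fold` times
    higher than the second-highest organ, and hence than every other organ.
    """
    calls = {}
    for gene, by_organ in expression.items():
        ranked = sorted(by_organ.items(), key=lambda kv: kv[1], reverse=True)
        (top_organ, top), (_, second) = ranked[0], ranked[1]
        calls[gene] = top_organ if top > 0 and top >= fold * second else None
    return calls


# Toy values invented for illustration only.
toy = {
    "GENE_A": {"brain": 80.0, "heart": 5.0, "kidney": 2.0},
    "GENE_B": {"brain": 10.0, "heart": 9.0, "kidney": 8.0},
    "GENE_C": {"brain": 1.0, "heart": 3.0, "kidney": 40.0},
}
calls = organ_enriched(toy)
```

In this toy input, GENE_A and GENE_C pass the fourfold threshold for brain and kidney respectively, while GENE_B is expressed at similar levels everywhere and is left unassigned.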
We and others have previously shown that plasma proteins can be used to train machine learning models to estimate chronological age in independent cohorts20,21. For each individual, an aging model produces an ‘age gap’, a measure of that individual’s biological age relative to other same-aged peers based on their molecular profile9 (Fig. 1a). Several studies have shown associations between age gaps and mortality risk or other age-related phenotypes9, supporting the hypothesis that the age gap contains information about relative biological aging.
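As a concrete illustration of the age gap, one simple formulation (assumed here, not necessarily the exact procedure used in this study) takes the residual of model-predicted age after a least-squares fit on chronological age, so that each individual is compared with same-aged peers. The ages and predictions below are invented.

```python
from statistics import mean


def age_gaps(chronological, predicted):
    """Residuals of predicted age after a least-squares line on chronological age."""
    mx, my = mean(chronological), mean(predicted)
    sxx = sum((x - mx) ** 2 for x in chronological)
    sxy = sum((x - mx) * (y - my) for x, y in zip(chronological, predicted))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(chronological, predicted)]


# Invented example values: four people and their model-predicted ages.
ages = [50, 60, 70, 80]
preds = [55, 58, 73, 78]
gaps = age_gaps(ages, preds)
```

A positive gap marks someone whose molecular profile looks older than that of same-aged peers, and a negative gap marks someone who looks younger. By construction the gaps average to zero across the sample.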
Based on this concept, we trained a bagged ensemble of least absolute shrinkage and selection operator (LASSO) aging models for 11 major organs using the mutually exclusive organ-enriched proteins we identified as inputs (Fig. 1a, Extended Data Fig. 2a,b, Supplementary Fig. 3 and Supplementary Tables 6–8). We chose to restrict our analyses to adipose tissue, artery, brain, heart, immune tissue, intestine, kidney, liver, lung, muscle and pancreas because of their relatively well-understood contributions to diseases of aging and the availability of relevant age-related phenotype data in the tested cohorts. We also trained an ‘organismal’ aging model using the 3,907 organ-nonspecific plasma proteins as inputs to compare the contribution of specific organs to an organ-shared aging signature, and a ‘conventional’ proteomic aging model using all 4,778 proteins to compare the organ aging models to a global plasma proteomic aging signature as previously reported20,21. We trained our models in 1,398 healthy participants from the Knight Alzheimer’s Disease Research Center (Knight-ADRC) cohort (mean age = 75, age range = 27–104) and then tested these models in four fully independent cohorts and in held-out test participants with dementia in the Knight-ADRC (Fig. 1a, Extended Data Figs. 2 and 3, and Supplementary Fig. 2). All 11 organ aging models and the organismal model significantly estimated age in all five cohorts after multiple test correction (Supplementary Fig. 3b). Organ-specific proteins selected by our approach were highly enriched for organ-specific functions (Supplementary Information).
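The idea of a bagged LASSO ensemble can be sketched in a few lines of standard-library Python: fit one LASSO model per bootstrap resample and average the predictions. The coordinate-descent solver, the penalty `alpha`, the number of models and the synthetic data are all assumptions for illustration and do not reproduce the study's training pipeline.

```python
import random


def soft_threshold(value, penalty):
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def fit_lasso(X, y, alpha, n_iter=200):
    """Coordinate descent for (1/2n)*||y - b0 - Xb||^2 + alpha*||b||_1."""
    n, p = len(X), len(X[0])
    weights = [0.0] * p
    intercept = sum(y) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = norm = 0.0
            for i in range(n):
                # Prediction with feature j left out (partial residual).
                rest = intercept + sum(weights[k] * X[i][k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - rest)
                norm += X[i][j] ** 2
            weights[j] = soft_threshold(rho / n, alpha) / (norm / n) if norm else 0.0
        intercept = sum(
            y[i] - sum(w * x for w, x in zip(weights, X[i])) for i in range(n)
        ) / n
    return intercept, weights


def fit_bagged_lasso(X, y, alpha=0.1, n_models=20, seed=0):
    """Fit one LASSO model per bootstrap resample of the training set."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_lasso([X[i] for i in idx], [y[i] for i in idx], alpha))
    return models


def predict(models, row):
    """Average the predictions of all models in the ensemble."""
    preds = [b0 + sum(w * x for w, x in zip(ws, row)) for b0, ws in models]
    return sum(preds) / len(preds)


# Synthetic data: "age" depends on the first feature only.
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(60)]
y = [60 + 10 * a + rng.gauss(0, 1) for a, _ in X]
models = fit_bagged_lasso(X, y)
```

Averaging over resamples stabilizes the predictions, while the L1 penalty shrinks uninformative features towards zero within each model.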
We observed across all cohorts that individuals with the same conventional age gap had diverse organ aging profiles (Fig. 1b). At the population level, this resulted in a low-to-moderate correlation between the age gaps of different organs (mean pairwise Pearson r = 0.29, Fig. 1c). While organ aging is correlated, the majority of variance in one organ age gap is not explained by others, with the exception of the organismal and conventional age gaps which were highly correlated. Further, we observed that some individuals had extreme aging in one or more organs relative to the general population (Fig. 1d). We scored individuals across all cohorts as outliers for a given organ age gap using a two standard deviation cutoff and clustered individuals into extreme aging types (e-ageotypes) (Fig. 1e and Extended Data Fig. 4a–c). Although it might be expected that extreme aging in one organ would co-occur with extreme aging in other organs, we instead observed segregation into distinct organ e-ageotypes. We found that approximately 18.4% of individuals had a highly organ-specific e-ageotype that was dominated by the aging of only one organ. Only approximately 1.7% of individuals showed extreme aging in multiple organs; the only multi-organ e-ageotype discovered through unbiased clustering was defined by extreme adipose, brain, conventional, heart, immune, liver and organismal age gaps. These observations suggest that organ age gaps may capture unique aging information, which may have implications for organ-specific biological aging and diseases of aging.
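The two standard deviation outlier scoring can be sketched as follows. The sketch assumes that only the upper tail (accelerated aging) is flagged and that age gaps are z-scored within the pooled sample; the subsequent clustering into e-ageotypes is not shown. Function name and data are hypothetical.

```python
from statistics import mean, pstdev


def extreme_agers(age_gaps_by_organ, cutoff=2.0):
    """Return {person index: [organs whose z-scored age gap exceeds cutoff]}."""
    flagged = {}
    for organ, gaps in age_gaps_by_organ.items():
        mu, sd = mean(gaps), pstdev(gaps)
        for person, gap in enumerate(gaps):
            if sd > 0 and (gap - mu) / sd > cutoff:
                flagged.setdefault(person, []).append(organ)
    return flagged


# Invented age gaps (years) for ten people and two organs.
toy_gaps = {
    "heart": [0, 0, 0, 0, 0, 0, 0, 0, 0, 10],
    "kidney": [1, -1, 1, -1, 1, -1, 1, -1, 1, -1],
}
flagged = extreme_agers(toy_gaps)
```

Here only the last person is flagged, and only for the heart, which corresponds to the single-organ pattern described above.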
Organ age predicts health and disease
To assess the relationship between organ age and biological aging, we tested whether organ e-ageotypes were associated with nine age-related disease states for which we had sufficient data in at least two independent cohorts: AD, atrial fibrillation, cerebrovascular disease, diabetes, heart attack, hypercholesterolaemia, hypertension, obesity and gait impairment. Organ e-ageotypes were associated with specific disease states with known high impact on their respective organs (23 of 117, 20%, associations significant in a meta-analysis after multiple testing correction, Extended Data Fig. 4d and Supplementary Table 9). The kidney ageotype was the most significantly associated with metabolic diseases (diabetes, obesity, hypercholesterolaemia and hypertension), the heart ageotype was the most significantly associated with heart diseases (atrial fibrillation and heart attack), the muscle ageotype was the most significantly associated with gait impairment, the brain ageotype was the most significantly associated with cerebrovascular disease and the organismal ageotype was the most significantly associated with AD. At the whole population level, the relationships between organ age gaps and disease showed the same trends as ageotypes, but more diseases were significantly associated with age gaps due to higher statistical power (65 of 117, 56%, statistically significant after multiple test correction, Extended Data Fig. 4e and Supplementary Table 10).
At the population level, the two most significant associations between disease and age gap were between the kidney age gap and metabolic disease traits. Individuals with hypertension had kidneys that were approximately one year older than their same-aged peers, while individuals with diabetes had kidneys approximately 1.3 years older (Fig. 2a,b and Supplementary Tables 8 and 10). The third and fourth top associations were between the heart age gap and the heart aging traits atrial fibrillation (2.8 years older) and heart attack (2.6 years older) (Fig. 2c,d). Overall, we found that certain diseases, such as heart attack and AD, were associated with accelerated aging in virtually all organs, while others had impacts on a particular organ or subset of organs (Extended Data Fig. 4e and Supplementary Table 10).
Kidney aging proteins were highly expressed by kidney cell types (Fig. 2e,f) and had known roles in kidney biology and disease. Feature importance plots for the model identified renin (REN), a kidney enzyme known to regulate blood pressure via the renin-angiotensin pathway22, as an important protein in kidney aging. The model also identified the putative longevity factor klotho (KL)23, as well as multiple proteins with unknown functions including uromodulin (UMOD) and kidney associated antigen 1 (KAAG1), as important kidney aging proteins. UMOD has been genetically linked to chronic kidney disease, where it is observed to have age-dependent effects24, and rare mutations are the major cause of autosomal dominant tubulointerstitial kidney disease25.
Heart aging proteins were expressed primarily by cardiomyocytes (Fig. 2g,h) and had known roles in heart biology and disease. Pro-brain natriuretic peptide (NPPB), a negative regulator of blood pressure that increases in response to heart damage, and troponin T (TNNT2), a heart muscle protein involved in contraction, had the strongest weights in the heart aging model (Fig. 2g). They are both established clinical markers of acute heart failure26, and NPPB has been previously associated with heart attack risk27. This suggests the possibility of a link between subclinical heart disease and the ‘normal’ heart aging process, which should be investigated further with more detailed heart imaging and electrophysiology. Less well-characterized heart proteins include cardiac myosin light chain (MYL7), peroxidasin like (PXDNL) and bone morphogenetic protein 10 (BMP10). MYL7 is expressed by atrial cardiomyocytes and has recently become a promising target for hypertrophic cardiomyopathy28, suggesting that this could be a repurposing target for heart aging more generally.
Given the strong associations between heart aging traits and the heart age gap, we used longitudinal follow-up among healthy participants in the LonGenity cohort to test if organ age was significantly associated with future heart failure risk (Fig. 2i and Supplementary Table 11). We found that among people with no active disease or clinically abnormal biomarkers at baseline, every 4.1 years of additional heart age (one standard deviation) conferred an almost 2.5-fold increased risk of heart failure over a 15-year follow-up (23% increased risk per year of heart aging, Fig. 2i). Age gaps from multiple other tissues, but not the conventional aging model, also trended towards significance.
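The per-year and per-standard-deviation figures above are related by the usual scaling of hazard ratios, which multiply across units (equivalently, add on the log-hazard scale). A quick check using the rounded values quoted in the text:

```python
import math

hr_per_year = 1.23  # "23% increased risk per year of heart aging"
sd_years = 4.1      # one standard deviation of the heart age gap, in years

# Scale the per-year hazard ratio to one standard deviation of heart age.
hr_per_sd = math.exp(sd_years * math.log(hr_per_year))
print(round(hr_per_sd, 2))
```

With these rounded inputs the per-standard-deviation hazard ratio comes out at roughly 2.3, the same order as the "almost 2.5-fold" figure; the small difference presumably reflects rounding of the reported values.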
We next tested the associations between organ age gaps and all-cause mortality. We found that the age gaps from 10 out of 11 organs, the organismal model and the conventional model were significantly associated with future risk of all-cause mortality after multiple test correction in the LonGenity cohort over 15 years of follow-up (Fig. 2j and Supplementary Table 12). A standard deviation increase (approximately four years of extra organ aging, Supplementary Table 8) in heart, adipose, liver, pancreas, brain, lung, immune or muscle age gap each conferred a 15–50% increased all-cause mortality risk. These hazard ratios are similar in size to those of methylation-based mortality predictors in independent aging cohorts over similar follow-up times, despite the fact that organ aging models are trained to predict chronological age instead of mortality directly (DNAm GrimAge hazard ratio = 1.3, 14-year mortality follow-up29). Further, we found that for some organs, there was a nonlinear relationship between the age gap and mortality risk (Supplementary Information, Supplementary Fig. 4 and Supplementary Table 13).
Finally, to better understand the relationship between organ age and additional markers of health and disease, we tested the associations between organ age gaps and 43 clinical biochemistry and cell count markers in the test cohort Covance (Extended Data Fig. 5 and Supplementary Fig. 5, see Supplementary Information for additional discussion). We also used these markers to calculate Phenotypic age14 (PhenoAge), a clinical biochemistry-based aging clock which predicts mortality and morbidity risk, for all participants in Covance (Extended Data Fig. 5a). We found that the PhenoAge age gap was significantly correlated with multiple organ age gaps, but only a small portion of the variance in any model was explained by another (Extended Data Fig. 5b).
We found 226 out of 559 (40%) associations between organ age gaps and clinical biochemistry markers were significant after multiple testing correction (Extended Data Fig. 5c and Supplementary Table 14). The strongest associations included associations between liver age gap and blood AST:ALT ratio, a clinical marker of liver health and function that is known to change with age (adjusted Pearson r = 0.25, q = 6.13 × 10⁻¹⁷), and between kidney age gap and serum creatinine, the standard clinical marker of kidney function (adjusted Pearson r = 0.23, q = 1.65 × 10⁻¹⁶). While these results are highly significant, they only partially explain the relationship between organ age gaps and disease phenotypes. Even after correcting for estimated glomerular filtration rate (eGFR), the kidney age gap is still significantly associated with hypertension and diabetes (Supplementary Fig. 6).
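The q values above come from a multiple testing correction whose exact procedure is not stated in this section. As an illustration only, the widely used Benjamini-Hochberg adjustment (an assumption here, not a confirmed detail of the study) can be sketched as follows, with invented P values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so each q is the minimum of
    # p * m / rank over its own rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Invented P values for four hypothetical tests.
p = [0.001, 0.04, 0.03, 0.5]
q = benjamini_hochberg(p)
```

An association is then called significant when its q value falls below the chosen false discovery rate, for example 0.05.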
Collectively, organ age gap associations with disease and blood biochemistry demonstrate that aging models derived from organ-specific plasma proteins capture disease-relevant heterogeneity of aging within and across individuals, which is not captured by other aging clocks or clinical markers.
Brain aging in cognitive decline and AD
Although the largest risk factor for neurodegenerative diseases is age, little is known about the contribution of molecular brain aging to disease. The brain age gap correlated significantly with AD in held-out participants in the Knight-ADRC, but did not replicate in the Stanford Alzheimer’s Disease Research Center (Stanford-ADRC) (Supplementary Table 10). Therefore, to better understand how underlying proteins contributed to the brain aging model’s predictive abilities for brain aging phenotypes, we developed the feature importance for biological aging (FIBA) algorithm, which uses feature permutation to generate a per-protein importance score for both chronological and biological age, as defined by a particular age-related trait (Extended Data Fig. 6a and Methods). We applied FIBA to the brain age model using the trait global clinical dementia rating (CDRGLOB) in the Knight-ADRC cohort to understand how brain proteins contributed to the association between the age gap and cognitive decline. We observed that some proteins, such as complexins, increased both the model age prediction accuracy and the age gap association with dementia severity (FIBA+), while others decreased the age gap association with dementia severity (FIBA−) (Fig. 3a and Supplementary Table 15).
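The permutation idea behind FIBA can be sketched generically: shuffle one protein at a time, recompute the predicted ages and age gaps, and record how much the correlation with chronological age and the association with the trait each drop. The full algorithm is defined in Methods; the scoring functions, the linear model with fixed weights and the synthetic data below are assumptions for illustration.

```python
import random
from statistics import mean


def pearson(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def residuals(x, y):
    """Residuals of y after a least-squares line on x (a simple age gap)."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]


def scores(X, weights, age, trait):
    """Correlation of predicted age with age, and of the age gap with the trait."""
    pred = [sum(w * v for w, v in zip(weights, row)) for row in X]
    return pearson(pred, age), pearson(residuals(age, pred), trait)


def permutation_importance(X, weights, age, trait, n_repeats=20, seed=0):
    """Per-feature mean drop in (age correlation, trait association) when shuffled."""
    rng = random.Random(seed)
    base_age, base_trait = scores(X, weights, age, trait)
    out = []
    for j in range(len(weights)):
        d_age = d_trait = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            Xp = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            r_age, r_trait = scores(Xp, weights, age, trait)
            d_age += (base_age - r_age) / n_repeats
            d_trait += (base_trait - r_trait) / n_repeats
        out.append((d_age, d_trait))
    return out


# Synthetic data: feature 0 rises with age and with the trait; feature 1
# rises weakly with age but falls with the trait.
rng = random.Random(42)
n = 500
age = [rng.gauss(0, 1) for _ in range(n)]    # standardized age
trait = [rng.gauss(0, 1) for _ in range(n)]  # standardized severity score
X = [[a + t + rng.gauss(0, 0.5), 0.3 * a - t + rng.gauss(0, 0.3)]
     for a, t in zip(age, trait)]
weights = [1.0, 0.6]
importance = permutation_importance(X, weights, age, trait)
```

In this toy setting, shuffling feature 0 lowers both scores, so it would be labelled FIBA+ in the terminology above, whereas shuffling feature 1 strengthens the age gap association with the trait, so it would be labelled FIBA−.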
We used this information to train a second-generation brain aging model, which we term the CognitionBrain aging model, by only using CDRGLOB FIBA+ brain-specific proteins (Fig. 3b and Supplementary Tables 16–19). This method is similar to second-generation methylation aging clocks which are trained jointly on chronological age and aging phenotypes14. We found that the CognitionBrain age gap had a stronger association with AD than the first-generation brain age gap and the conventional age gap in the Knight-ADRC cohort (Extended Data Fig. 6b). This result replicated in the independent test cohort Stanford-ADRC. In a meta-analysis, individuals with AD had approximately two years of additional CognitionBrain aging (meta-analysis P value = 9.23 × 10⁻³⁶) compared to individuals without AD (Fig. 3c and Supplementary Table 20). The CognitionBrain age gap was also significantly associated with risk of future dementia progression in both ADRC cohorts. A standard deviation increase in the CognitionBrain age gap conferred a 34% increased risk (meta-analysis P value = 1.03 × 10⁻¹⁵) of a clinically relevant two-point increase in the Clinical Dementia Rating Sum-of-Boxes score (CDR-SB) within five years (Supplementary Table 21). We also tested associations between CognitionBrain age gap and changes in brain volume using matched volumetric MRI in the Stanford-ADRC and Stanford Aging and Memory Study (SAMS) cohorts (Extended Data Fig. 6c, Supplementary Table 22, Supplementary Fig. 7 and Supplementary Information), and found CognitionBrain age gap significantly predicted brain volume in multiple AD-sensitive regions.
Given its associations with AD status, cognitive decline risk and brain volume, we asked whether the CognitionBrain aging model could be used in combination with other biomarkers of AD and predictors of cognitive decline, including plasma pTau-181 (ref. 5) and an AD polygenic risk score30, to better stratify AD patients for future clinical outcomes. We tested a multivariate dementia progression Cox proportional hazards model with baseline CDRGLOB, age, CognitionBrain age gap, plasma pTau-181 and an AD polygenic risk score (Fig. 3d) in the Stanford-ADRC. We found that the CognitionBrain age gap had the highest adjusted hazard ratio (hazard ratio = 1.57; P = 8.95 × 10⁻³) of the AD biomarkers, and that both plasma pTau-181 and CognitionBrain age gap were additive for risk prediction (estimated combined hazard ratio = 2.08, Fig. 3e). Individuals with fluid biomarker levels two standard deviations above average had a 75% probability of dementia progression, while individuals with levels two standard deviations below average had under a 10% probability of dementia progression within five years. Pairwise correlation between all biomarkers also showed that the CognitionBrain age gap was largely independent from other biomarkers (Extended Data Fig. 6d). Taken together, these data suggest CognitionBrain age gap provides molecular information about brain aging not captured by other approaches.
Given the significant associations between the CognitionBrain age model and several brain aging metrics, we sought to uncover new insights into brain aging mechanisms by examining the proteins that make up the model. A total of 47 of the 49 model proteins were detectable in human brain single-cell RNA sequencing (scRNA-seq) data and most could be mapped to neurons and glia with high specificity (Fig. 3f). Proteins with the largest positive weights in the model (Fig. 3c) included the synaptic proteins complexin 1 (CPLX1), complexin 2 (CPLX2) and neurexin 3 (NRXN3), which all have genetic links to cognition and AD31–33, and stathmin 2 (STMN2) and olfactomedin 1 (OLFM1), which are involved in neurite outgrowth and axon growth cone collapse34,35. Proteins with large negative weights in the model included Aldolase Fructose-Bisphosphate C (ALDOC), neuronal pentraxin receptor (NPTXR), carnosine dipeptidase 1 (CNDP1) and Lanc Like Glutathione S-Transferase 1 (LANCL1). ALDOC, NPTXR and CNDP1 are expressed in astrocytes, neurons and oligodendrocytes, respectively (Fig. 3f) and have been proposed as CSF biomarkers for AD36,37. LANCL1, which is primarily expressed in oligodendrocytes (Fig. 3f), has been shown to be crucial for neuronal health in mouse models38. The model also implicated alterations in the glycosylated extracellular matrix through the proteins tenascin R (TNR), neurocan (NCAN) and heparan sulfate-glucosamine 3-sulfotransferase 4 (HS3ST4), underlining the role of the extracellular matrix in brain aging.
We assessed the highest weighted CognitionBrain proteins for their changes with age and AD in the Knight-ADRC and Stanford-ADRC cohorts, as well as their changes with AD in brain tissue at the protein39, bulk RNA39 and single-cell RNA levels from publicly available datasets (Fig. 3g). We observed a consistent pattern of decreases in AD brain tissue and increases in the blood with age and AD. This suggests that the increase of synapse and neurite growth related protein levels in the blood could reflect a loss or alteration in protein processing and subsequent shedding of these crucial factors in the brain. A similar inverse relationship between fluid and brain protein levels is seen with amyloid beta, whereby lower CSF Aβ42 is correlated with increased Aβ plaques in the brain40.
Organ aging in cognitive decline and AD
We next sought to apply the FIBA optimization framework to other organ aging models to understand how the aging of other organs contributes to brain aging phenotypes (Fig. 4a). As with the brain aging model, we applied CDRGLOB FIBA to all aging models using the Knight-ADRC (Extended Data Figs. 7 and 8). The CognitionArtery, CognitionBrain, CognitionOrganismal and CognitionPancreas age gap associations with AD replicated in both ADRCs (Fig. 4b and Extended Data Fig. 8c,d), so we focused on these four aging models to understand peripheral versus central contributions to cognitive decline.
To understand the full temporal sequence of cognitive decline, we tested if age gaps were associated with cognition in cognitively normal individuals using a composite score of overall cognition in the LonGenity cohort. Decreased cognitive function was significantly associated with all four age gaps (Fig. 4c, Extended Data Fig. 9a and Supplementary Table 23). We replicated these associations in the healthy SAMS cohort, where we observed that individuals with worse memory recall had higher CognitionOrganismal and CognitionBrain age gaps (Extended Data Fig. 9b and Supplementary Table 23).
We next tested associations between age gaps and risk of transition from cognitively normal to mild cognitive impairment (MCI) (CDR-Global Score 0 to greater than or equal to 0.5) using 15 years of clinical cognitive assessment in the Knight-ADRC (Fig. 4d and Supplementary Table 24). We found that the CognitionOrganismal (hazard ratio = 1.17, P = 0.02) and CognitionArtery (hazard ratio = 1.15, P = 0.04) age gaps significantly predicted conversion to MCI, while the CognitionBrain (hazard ratio = 1.11, P = 0.14) trended towards significance (Fig. 4d). The prediction of future conversion to MCI over 15 years is unlikely to be explained by undiagnosed cognitive impairment, placing changes detected by these aging models early in the causal chain of cognitive decline and neurodegenerative disease.
To understand the biological processes and proteins involved in early cognitive decline, we plotted the aging trajectory of all model proteins and found that highly weighted CognitionOrganismal and CognitionArtery proteins changed with age earlier and at a faster rate than CognitionBrain and CognitionPancreas proteins (Fig. 4e). The earliest changes occurred in a highly correlated cluster of CognitionOrganismal proteins: pleiotrophin (PTN), transgelin (TAGLN), WNT1 Inducible Signalling Pathway Protein 2 (WISP2), CUB Domain Containing Protein 1 (CDCP1) and chordin like 1 (CHRDL1; Fig. 4f). Though not organ-specific, these genes were all highly expressed in the arteries and brain (Extended Data Fig. 10a). Single-cell expression of these genes in human vasculature41,42 indicated these genes are expressed primarily by smooth muscle cells, pericytes and fibroblasts (Fig. 4g and Extended Data Fig. 10b). Loss of brain pericytes, smooth muscle cells and perivascular fibroblasts is associated with age and AD42,43 (Fig. 4g), and pericyte-specific deletion of PTN renders neurons prone to ischaemic and excitotoxic injury44. This early changing signature in the CognitionOrganismal model may thus represent degenerative changes to the cellular integrity of the brain vasculature and the loss of its neuroprotective functions with aging (Fig. 4h).
The five proteins composing the CognitionArtery model, TNF receptor superfamily member 11b (TNFRSF11B), sclerostin (SOST), melanocortin 2 receptor accessory protein (MRAP2), frizzled related protein (FRZB) and matrix gla protein (MGP), were also primarily expressed in vascular smooth muscle cells, pericytes and fibroblasts41 (Extended Data Fig. 10c) and are all strongly implicated in vascular calcification. TNFRSF11B/APOE double knockout mice show increased calcium deposition by vascular smooth muscle cells45, MGP deficiency-causing mutations in humans lead to Keutel syndrome, a disease characterized by soft tissue calcification46, and SOST and FRZB are negative regulators of WNT signalling that drive calcification and are increased in the plasma of people with vascular calcification47,48. We found that CognitionArtery proteins and the vascular signature in the CognitionOrganismal proteins form an interaction network using StringDB (Fig. 4i). Additional model proteins in this interaction network included integrin binding sialoprotein (IBSP), osteoglycin (OGN), collagen type III alpha 1 chain (COL3A1), proline rich and gla domain 1 (PRRG1) and growth arrest specific 6 (GAS6). In total, this protein network is involved in extracellular matrix, cartilage development and osteoblast signalling pathways, and implicates vascular calcification and extracellular matrix alterations as a major component of aging that underlies the early phases of cognitive decline and neurodegenerative disease (Fig. 4i,j).
Discussion
Our study introduces a framework for modelling organ health and biological aging using plasma proteomics. The resulting organ aging models can predict mortality, organ-specific functional decline, disease risk and progression and aging heterogeneity between tissues. This approach is minimally invasive, requiring only a small blood sample, and could be easily applied to understand the effects of health interventions, such as lifestyle modifications and drug therapies, at the organ level. We provide a large and comprehensive resource of organ aging information in nearly 6,000 individuals spanning the adult lifespan and multiple age-related disease states, and we have developed an easy-to-use Python package called organage to calculate the organ ages of any plasma proteomics sample from the SomaScan assay.
There are many future directions for this work. While we have shown that plasma proteomic organ aging models are distinct from previous proteomics models, clinical chemistry-based models and imaging-based models, future studies should assess how proteomic organ aging relates to other molecular measures of aging and disease such as methylation aging clocks and disease-specific prediction models. Although we were unable to perform direct comparisons, our models predict mortality with comparable effect sizes to models trained specifically to predict mortality and heart disease in independent cohorts49,50. We demonstrated that our approach added increased value to established biomarkers of AD, and we expect that multimodal aging and disease prediction models may have similar impacts in other diseases.
We present one of the largest studies of plasma proteome aging to date, but as larger plasma proteomics resources emerge, the power of this approach will further increase. Our current models rely on approximately 5,000 proteins measured with the SomaScan assay, but the approach is platform agnostic, and we expect that even more biological information could be gained with additional proteomic coverage, including cell and organ-specific splice isoforms and posttranslational modifications. The rapidly growing number of human gene expression maps at single-cell resolution41 will help further refine organ and cell-type specific aging models and allow for a comprehensive understanding of organismal physiology based on the plasma proteome.
Another question for future studies is which organ-specific aging proteins are causal drivers of aging, given that multiple plasma proteins have been shown to directly modulate aging phenotypes8. Of note, many of the proteins with large weights in the models, such as KLOTHO, UMOD, MYL7, CPLX1, CPLX2 and NRXN3, have genetic associations with diseases of their respective organs or are validated therapeutic targets, suggesting a potential causal role of these proteins in organ aging. Future genomic studies should further investigate the genetic architecture of organ aging clocks and their relationships to disease using GWAS and post-GWAS methods such as colocalization and Mendelian randomization.
This study has multiple limitations. First, we have limited the study to a subset of organs to avoid over-interpretation of models for which we lacked convincing organ-relevant aging phenotypes. It remains unclear if this approach will generalize to all organs in the body, such as reproductive organs, and future studies should address this question. Second, we observe many instances of nonlinear dynamics in the plasma proteome and in aging phenotypes. While our current models serve as a proof of principle for this approach, since they are trained and evaluated largely on older adults, caution should be used when applying them to young people. More sophisticated nonlinear machine learning methods such as neural networks or random forests may further improve the accuracy and generalizability of this approach in the future. Lastly, the models were trained and tested on American and Caucasian-skewed cohorts, and future studies should assess the generalizability of the findings in more ethnically and geographically diverse populations.
Altogether, we show that large-scale plasma proteomics and machine learning can be leveraged to noninvasively measure organ health and aging in living people. We show that biologically motivated modelling, in which we use sets of organ-specific proteins and the FIBA algorithm to further subset to physiological age-related proteins, enables deconvolution of the different rates of aging within an individual and measurement of aging at organ-level resolution.
Methods
Human cohorts
Covance
Details of the Covance study have been previously published54. Briefly, Covance is a multi-site cross-sectional study of health across the lifespan collected at five hospital sites in the United States in 2008. A total of 1,028 subjects were included in analyses for this study. Cohort demographic characteristics are summarized in Supplementary Table 1. Exclusion criteria for the study included uncontrolled hypertension, self-reported treatment for a malignancy other than squamous cell or basal cell carcinoma of the skin in the last two years, self-reported pregnancy, self-reported chronic infection, autoimmune condition or other inflammatory condition, self-reported chronic kidney or liver disease, chronic heart failure or diagnosed with myocardial infarction in the last three months, self-reported diabetes (HbA1c > 8% if known), self-reported acute bacterial or viral infection in the past 24 h or a temperature greater than 38 °C within 24 h of enrolment, self-reported participation in any therapeutic study within 14 days before blood sampling and taking more than 20 mg of prednisone or related drugs.
Clinical blood chemistry was performed on the same samples, including a complete blood count and comprehensive metabolic panel, lipid panel and liver function tests. Basic physical workup (blood pressure, pulse and respirations) was also collected. Lifestyle information was also collected from all participants using a survey which asked about smoking, alcohol, exercise, habits and frequency of consumption of different meats and vegetables.
LonGenity
Details of the LonGenity cohort have been previously published55,56. Briefly, LonGenity is an ongoing longitudinal study initiated in 2008 and designed to identify biological factors that contribute to healthy aging. The LonGenity study enrols older adults of Ashkenazi Jewish descent with an age range of 65–94 years at baseline. Approximately half of the cohort consists of offspring of parents with exceptional longevity, defined as having at least one parent who survived to 95 years of age. The other half of the cohort includes offspring of parents with usual survival, defined as not having a parental history of exceptional longevity. A total of 962 subjects were included in analyses for this study. The cohort characteristics are summarized in Supplementary Table 1. LonGenity participants are thoroughly characterized demographically and phenotypically at annual visits that include collection of medical history and physical and detailed neurocognitive assessments (described in detail below). The LonGenity study was approved by the institutional review board (IRB) at the Albert Einstein College of Medicine.
Subjects in the LonGenity cohort underwent extensive cognitive examination. The Overall Cognition Composite score was determined by the relative performance of the subject in the Free and Cued Selective Reminding Test, WMS-R Logical Memory I, RBANS Figure Copy, RBANS Figure Recall, WAIS-III Digit Span, WAIS-III Digit Symbol Coding, Phonemic Fluency (FAS), Categorical Fluency, Trail Making Test A and Trail Making Test B. For each task, a standardized score (z) was calculated based on the population. The z-scores for all tasks were then combined to create the overall cognition composite.
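As a rough illustration of the composite described above (not the code used in this study), the sketch below z-scores each task across the cohort and then averages the per-task z-scores for each subject. Averaging is an assumption, since the text only states that the z-scores were combined; the task names and values are invented, and tasks for which lower scores are better (for example, completion times) would need their sign flipped first.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize a list of numbers to mean 0 and population s.d. 1."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


def cognition_composite(scores_by_task):
    """scores_by_task: {task: [score per subject]} -> composite per subject.

    Assumption: per-task z-scores are combined with a simple mean.
    """
    z_by_task = [zscore(v) for v in scores_by_task.values()]
    return [mean(subject_z) for subject_z in zip(*z_by_task)]


# Hypothetical scores for three subjects on two tasks.
toy_scores = {
    "digit_span": [10.0, 12.0, 14.0],
    "phonemic_fluency": [30.0, 40.0, 50.0],
}
composite = cognition_composite(toy_scores)
```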
Stanford Alzheimer’s Disease Research Center
Samples were acquired through the National Institute on Aging (NIA)-funded Stanford Alzheimer’s Disease Research Center (Stanford-ADRC). The Stanford-ADRC cohort is a longitudinal observational study of clinical dementia subjects and age-sex-matched nondemented subjects. The collection of plasma was approved by the Institutional Review Board of Stanford University and written consent was obtained from all subjects. Blood collection and processing were done according to a rigorous standardized protocol to minimize variation associated with blood draw and blood processing. Briefly, about 10 cc of whole blood was collected in a vacutainer ethylenediaminetetraacetic acid (EDTA) tube (Becton Dickinson vacutainer EDTA tube) and spun at 3,000 RPM for 10 mins to separate out plasma, leaving 1 cm of plasma above the buffy coat and taking care not to disturb the buffy coat to circumvent cell contamination. Plasma processing times averaged approximately one hour from the time of the blood draw to the time of freezing and storage. All blood draws were done in the morning to minimize the impact of circadian rhythm on protein concentrations. Plasma pTau-181 levels were measured using the fully automated Lumipulse G 1200 platform (Fujirebio US, Inc, Malvern, PA) by experimenters blind to diagnostic information, as previously described57.
All healthy control participants were deemed cognitively unimpaired during a clinical consensus conference that included board-certified neurologists and neuropsychologists. Cognitively impaired subjects underwent Clinical Dementia Rating and standardized neurological and neuropsychological assessments to determine cognitive and diagnostic status, including procedures of the National Alzheimer’s Coordinating Center (https://naccdata.org/). Cognitive status was determined in a clinical consensus conference that included neurologists and neuropsychologists. All participants were free from acute infectious diseases and in good physical condition. A total of 409 subjects were included in analyses for this study. Cohort demographics and clinical diagnostic categories are summarized in Supplementary Table 1.
Stanford Aging Memory Study
SAMS is an ongoing longitudinal study of healthy aging. Blood collection and processing were done by the same team and using the same protocol as in Stanford-ADRC. Neurological and neuropsychological assessments were performed by the same team and using the same protocol as in Stanford-ADRC. All SAMS participants had CDR = 0 and a neuropsychological test score within the normal range; all SAMS participants were deemed cognitively unimpaired during a clinical consensus conference that included neurologists and neuropsychologists. A total of 192 cognitively unimpaired SAMS participants were included in the present study. The collection of plasma was approved by the Institutional Review Board of Stanford University and written consent was obtained from all subjects. Cohort demographics and clinical diagnostic categories are summarized in Supplementary Table 1.
Knight Alzheimer’s Disease Research Center
The Knight-ADRC cohort is an NIA-funded longitudinal observational study of clinical dementia subjects and age-matched controls. Research participants at the Knight-ADRC undergo longitudinal cognitive, neuropsychologic, imaging and biomarker assessments including Clinical Dementia Rating (CDR). Among individuals with CSF and plasma data, AD cases corresponded to those with a diagnosis of dementia of the Alzheimer’s type (DAT) using criteria equivalent to the National Institute of Neurological and Communication Disorders and Stroke-Alzheimer’s Disease and Related Disorders Association for probable AD58, and AD severity was determined using the Clinical Dementia Rating (CDR)59 at the time of lumbar puncture (for CSF samples) or blood draw (for plasma samples). Controls received the same assessment as the cases but were nondemented (CDR = 0). Blood samples were collected in EDTA tubes (Becton Dickinson vacutainer purple top) at the visit time, immediately centrifuged at 1,500g for 10 min, aliquoted into two-dimensional barcoded Micronic tubes (200 µl per aliquot) and stored at −80 °C. The plasma was stored in a monitored −80 °C freezer until it was pulled and sent to Somalogic for data generation. The Institutional Review Board of Washington University School of Medicine in St. Louis approved the study and research was performed in accordance with the approved protocols. A total of 3,075 participants were included in the present study. Cohort demographics and clinical diagnostic categories are summarized in Supplementary Table 1.
Proteomics data acquisition and quality control
SomaScan assay
We used the SomaLogic SomaScan assay, which uses slow off-rate modified DNA aptamers (SOMAmers) to bind target proteins with high specificity, to quantify the relative concentration of thousands of human proteins in plasma. The assay has been used in hundreds of studies and described in detail previously54,60. Two versions of the SomaScan assay were used in this study. The v.4 assay (4,979 protein targets) was applied to the Covance and LonGenity cohorts, and the v.4.1 assay (7,288 protein targets) was applied to the SAMS, Stanford-ADRC and Knight-ADRC cohorts. All v.4 targets are included in the v.4.1 assay based on SeqId, and only the v.4 targets were analysed for this study.
Somalogic normalization and quality control
Standard Somalogic normalization, calibration and quality control were performed on all samples54,61–63. Briefly, pooled reference standards and buffer standards are included on each plate to control for batch effects during assay quantification. Samples are normalized within and across plates using median signal intensities in reference standards to control for both within-plate and across-plate technical variation. Samples are further normalized to a pooled reference using an adaptive maximum likelihood procedure. Samples are additionally flagged by SomaLogic if signal intensities deviated significantly from the expected range and these samples were excluded from analysis. The resulting expression values are the provided data from Somalogic and are considered ‘raw’ data.
The v.4 → v.4.1 multiplicative scaling factors provided by Somalogic were applied to the raw v.4 assay expression values to allow for direct comparisons across the two v.4 and three v.4.1 cohorts. We discarded proteins for which the correlation between assay versions v.4 and v.4.1 was low or the estimated replicate coefficient of variation64 was high (Supplementary Fig. 1). This resulted in 4,778 proteins for downstream analysis. The raw data were log10 transformed before analysis, as the assay has an expected log-normal distribution.
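The harmonization and transformation steps above can be sketched as follows. This is a minimal illustration, not the pipeline used in this study: the function name, the SeqIds, the signal values and the scaling factors are all hypothetical, and dropping proteins that lack a scaling factor is an assumption.

```python
import math


def harmonize_v4_to_v41(raw_v4, scale_factors):
    """Multiply raw v.4 values by per-protein scaling factors, then take log10.

    raw_v4: {seq_id: raw signal}
    scale_factors: {seq_id: multiplicative v.4 -> v.4.1 factor}
    Assumption: proteins without a scaling factor are dropped.
    """
    return {
        seq_id: math.log10(value * scale_factors[seq_id])
        for seq_id, value in raw_v4.items()
        if seq_id in scale_factors
    }


# Hypothetical SeqIds, signals and factors, for illustration only.
raw = {"seq_A": 1000.0, "seq_B": 50.0, "seq_C": 10.0}
factors = {"seq_A": 1.0, "seq_B": 2.0}
log_values = harmonize_v4_to_v41(raw, factors)
```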
Somalogic probe validation
Somalogic has analysed close to 1 million samples with their technology at the time of this publication, resulting in some 700 publications (https://somalogic.com/publications/). There is minimal replicate sample variability64,65 (coefficient of variation, CV). The majority of SomaScan protein measurements are stable and a subset of proteins have been validated as laboratory-developed tests (LDTs), and have been delivered out of Somalogic’s CLIA-certified laboratory to physicians and patients in the context of medical management66.
- All 7,524 probes on the assay undergo rigorous primary validation of binding and sensitivity to the target protein:
  - Determination of equilibrium binding affinity dissociation constant (KD).
  - Pull-down assay of cognate protein from buffer.
  - Demonstration of dose-responsiveness in the SomaScan assay.
  - Estimation of endogenous cognate protein signals in human plasma above the limit of detection.
- A total of 70% of their probes have at least one orthogonal source of validation (Supplementary Fig. 1b) from:
  - Mass spectrometry: approximately 900 probes, which measure mostly high- and mid-abundance proteins (due to sensitivity limitations of mass spectrometry), have been confirmed with either data-dependent acquisition (DDA) or multiple reaction monitoring (MRM) mass spectrometry.
  - Antibody: approximately 390 probe measurements correlate with antibody-based measurements.
  - Cis-protein quantitative trait loci (pQTL): approximately 2,860 probe measurements are associated with genetic variation in the cognate protein-encoding gene.
  - Absence of binding with nearest neighbour: approximately 1,150 probes do not detect signal from the protein that is most closely related in sequence to the cognate protein.
  - Correlation with RNA: approximately 1,460 probe measurements correlate with mRNA levels in cell lines.
Identification of organ-enriched plasma proteins
We used the Genotype-Tissue Expression (GTEx) human tissue bulk RNA-seq database18 to identify organ-enriched genes and plasma proteins (Extended Data Fig. 1). Tissue gene expression data were normalized using the DESeq2 (ref. 67) R package. We define organ-enriched genes in accordance with the definition proposed by the Human Protein Atlas19: a gene is enriched if it is expressed at least four times higher in a single organ compared to any other organ. Within GTEx, we grouped tissues of the same organ together, such that a gene’s expression level for a given organ was the maximum gene expression value among its subtissues. For example, all GTEx brain regions were considered subtissues of the brain organ. We define the immune organ, which is not a GTEx tissue, as expression in the blood and the spleen tissues. Organ-enriched genes were mapped to the 4,979 plasma proteins quantified in the v.4 SomaScan assay.
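The enrichment rule above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function name, gene names, tissue names and expression values are hypothetical, and the input is assumed to be already normalized.

```python
def organ_enriched_genes(tissue_expr, tissue_to_organ, fold=4.0):
    """Return {gene: organ} for genes expressed at least `fold` times higher
    in one organ than in every other organ.

    tissue_expr: {gene: {tissue: normalized expression}}
    tissue_to_organ: {tissue: organ}; an organ's level is the maximum
    over its subtissues.
    """
    enriched = {}
    for gene, by_tissue in tissue_expr.items():
        organ_level = {}
        for tissue, value in by_tissue.items():
            organ = tissue_to_organ[tissue]
            organ_level[organ] = max(organ_level.get(organ, 0.0), value)
        ranked = sorted(organ_level.items(), key=lambda kv: kv[1], reverse=True)
        (top_organ, top), (_, runner_up) = ranked[0], ranked[1]
        if top > 0 and top >= fold * runner_up:
            enriched[gene] = top_organ
    return enriched


# Hypothetical tissues and values; blood and spleen form the immune organ.
tissue_to_organ = {
    "brain_cortex": "brain",
    "brain_cerebellum": "brain",
    "whole_blood": "immune",
    "spleen": "immune",
    "liver": "liver",
}
toy_expr = {
    "GENE_A": {"brain_cortex": 80.0, "brain_cerebellum": 5.0,
               "whole_blood": 2.0, "spleen": 10.0, "liver": 1.0},
    "GENE_B": {"brain_cortex": 10.0, "brain_cerebellum": 12.0,
               "whole_blood": 1.0, "spleen": 1.0, "liver": 30.0},
}
enriched = organ_enriched_genes(toy_expr, tissue_to_organ)
```

Here GENE_A is brain-enriched (80 versus at most 10 elsewhere), whereas GENE_B is not enriched because its liver level (30) is less than four times its brain level (12).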
Bootstrap aggregated LASSO aging models
To estimate biological age using the plasma proteome, we built LASSO regression-based chronological age predictors (Extended Data Figs. 2–3 and Supplementary Fig. 3) using the scikit-learn68 python package. We employed bootstrap aggregation for model training. Briefly, we resampled with replacement to generate 500 bootstrap samples of our training data (Knight-ADRC: 1,398 healthy individuals). Each bootstrap sample was the same size as the training data, 1,398. For each bootstrap sample, we trained a model on z-scored log10 normalized protein expression values with sex (F = 1, M = 0) as a covariate to predict chronological age. For model training, we performed hyperparameter tuning of the L1 regularization parameter, λ, with five-fold cross validation using the GridSearchCV function from scikit-learn. To reduce model complexity and avoid overfitting, we selected the highest λ value that retained 95% performance relative to the best model. The mean predicted age from all 500 bootstrap models was used.
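Two pieces of this procedure, the choice of λ and the aggregation over bootstrap models, can be sketched without reference to any particular library. The sketch below is illustrative only: the function names are hypothetical, the cross-validated scores are invented, and it assumes a positive, higher-is-better performance score (the text does not specify the metric). Model fitting itself (LASSO with GridSearchCV in scikit-learn) is omitted.

```python
import random
from statistics import mean


def select_lambda(cv_score_by_lambda, retain=0.95):
    """Pick the largest lambda whose score is >= retain * best score.

    Assumption: scores are positive and higher is better.
    """
    best = max(cv_score_by_lambda.values())
    eligible = [lam for lam, s in cv_score_by_lambda.items() if s >= retain * best]
    return max(eligible)


def bootstrap_indices(n, n_boot, seed=0):
    """Resample n row indices with replacement, n_boot times."""
    rng = random.Random(seed)
    return [[rng.randrange(n) for _ in range(n)] for _ in range(n_boot)]


def bagged_prediction(per_model_predictions):
    """Average the predicted ages of all bootstrap models for each sample."""
    return [mean(p) for p in zip(*per_model_predictions)]


# Invented cross-validated scores for four candidate lambda values.
cv_scores = {0.001: 0.80, 0.01: 0.79, 0.1: 0.77, 1.0: 0.50}
chosen_lambda = select_lambda(cv_scores)

# Two hypothetical bootstrap models predicting age for two samples.
ensemble_age = bagged_prediction([[70.0, 60.0], [72.0, 62.0]])
resamples = bootstrap_indices(n=5, n_boot=3)
```

With these toy scores the rule picks λ = 0.1: it is the strongest penalty that still reaches 95% of the best score (0.76), which favours a sparser model over the marginally better fit at λ = 0.001.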
We trained our models in 1,398 cognitively unimpaired participants from the Knight-ADRC cohort. We evaluated their performance in the Covance (n = 1,029), LonGenity (n = 962), SAMS (n = 192), Stanford-ADRC (n = 409) cohorts and Knight-ADRC cognitively impaired subjects (n = 1,677). Models that included sex as a covariate and models trained separately on males and females showed similar age prediction performance on both sexes, so we controlled for sex to extend the generality of the findings and reduce analytic complexity (Supplementary Fig. 3a–c). There was a correlation between age estimation accuracy and the number of proteins used as input to each model (Supplementary Fig. 3c,d). However, several models with few protein inputs, such as the adipose (five proteins) and heart models (ten proteins), predicted chronological age better than models with more protein inputs (Extended Data Fig. 3).
Age gap calculation and independent validation
To calculate each individual sample age gap for each aging model, we performed the following steps for each aging model. We fit a local regression between predicted and chronological age using the lowess function from the statsmodels69 python package with fraction parameter set to 2/3 to estimate the true population mean (Supplementary Fig. 3e). A local regression is used in place of a simple linear regression because of extensive evidence that the plasma proteome changes nonlinearly with age1, which we see replicated in all five cohorts (Supplementary Fig. 8). Individual sample age gaps were then calculated as the difference between predicted age and the lowess regression estimate of the population mean. Age gaps were calculated separately per cohort to account for cohort differences (Supplementary Fig. 3e). Age gaps were z-scored per aging model to account for the differences in model variability (Supplementary Fig. 3f). This allowed for direct comparison between organ age gaps in downstream analyses.
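The age gap calculation can be illustrated with the simplified sketch below. It is not the authors' implementation: in place of the statsmodels lowess function it uses a plain tricube-weighted local linear regression without robustness iterations, and the ages and predictions are invented. In practice the calculation would be run separately for each cohort and each aging model.

```python
from statistics import mean, pstdev


def local_linear_fit(x, y, frac=2 / 3):
    """Tricube-weighted local linear regression evaluated at each x[i].

    A simplified stand-in for LOWESS (no robustness iterations).
    """
    n = len(x)
    k = max(2, int(round(frac * n)))           # neighbours used per local fit
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12               # bandwidth: k-th nearest distance
        w = [max(0.0, 1 - (abs(xi - x0) / h) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def z_scored_age_gaps(chron_age, pred_age, frac=2 / 3):
    """Age gap = predicted age minus the local fit, z-scored across samples."""
    expected = local_linear_fit(chron_age, pred_age, frac)
    gaps = [p - e for p, e in zip(pred_age, expected)]
    mu, sd = mean(gaps), pstdev(gaps)
    return [(g - mu) / sd for g in gaps]


# Invented data: the 65-year-old is predicted to be older than peers.
ages = [50, 55, 60, 65, 70, 75, 80]
predicted = [50, 55, 60, 72, 70, 75, 80]
z_gaps = z_scored_age_gaps(ages, predicted)
```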
Phenotypic age calculation
We used the published coefficients14 to calculate the phenotypic age of participants in the Covance cohort using albumin, creatinine, glucose, C-reactive protein, % lymphocyte, mean cell volume, red cell distribution width, alkaline phosphatase, white blood cell count and age.
Statistical methods to associate organ age gaps with age-related phenotypes
Study design
A flowchart of the study design is provided in Supplementary Fig. 2. Each box in the flowchart was treated as a separate analysis for the purpose of multiple testing correction. Multiple testing correction was done using the Benjamini–Hochberg method and the significance threshold was a 5% false discovery rate. To summarize the flowchart, the age gaps from all 11 organ aging models, the organismal model and the conventional model were used in the following analyses: prediction of future mortality in the LonGenity cohort with a Cox proportional hazards model (CPH) (12 of 13 tests significant after FDR), prediction of future heart disease in the LonGenity cohort with a CPH (12 of 13 tests significant after FDR), association with nine diseases of aging in a cross-cohort meta-analysis (66 of 117 tests significant after FDR) and association with 42 clinical biochemistry markers in the Covance cohort (237 of 588 tests significant after FDR, PhenoAge gap also tested for 14 × 42 tests).
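For reference, the step-up procedure used for multiple testing correction can be sketched as below, applied to the P values of one analysis (one box of the flowchart) at a time. This is a generic illustration with invented P values, not the code used in this study.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans: True where the test is significant at `fdr`."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr;
    # every test ranked at or below k is declared significant.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            k_max = rank
    significant = [False] * m
    for idx in order[:k_max]:
        significant[idx] = True
    return significant


# Invented P values for two small families of four tests each.
family_one = benjamini_hochberg([0.001, 0.04, 0.03, 0.20])
family_two = benjamini_hochberg([0.01, 0.02, 0.03, 0.04])
```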
The 12 cognition-optimized models (11 organs + organismal model) were tested on additional brain aging phenotypes. The CognitionBrain age gap only was tested for association with 65 MRI brain volumes and an MRI-based brain age gap (40 of 66 tests significant after FDR). The CognitionBrain age gap only was included in a multivariate CPH model of dementia progression in AD (1 of 1 tests significant, no FDR). The 12 cognition-optimized model age gaps were tested for association with AD status in the Knight-ADRC (12 of 12 tests significant after FDR), then a replication analysis was performed in Stanford-ADRC (4 of 12 tests significant at P < 0.05, no FDR). The four models that replicated (CognitionBrain, CognitionOrganismal, CognitionArtery and CognitionPancreas) were then tested for associations with overall cognition in healthy elderly people (LonGenity, 4 of 4 tests significant, no FDR), memory function in the Stanford-ADRC (2 of 4 tests significant, no FDR) and 15-year prediction of conversion from normal cognition to mild cognitive impairment in the Knight-ADRC with a CPH model (2 of 4 tests significant, no FDR).
Linear modelling
Estimation of chronological age is not sufficient in determining whether an organ aging model measures the age-related physiological dysfunction of an organ. To determine whether estimated organ age contains physiologically relevant information, we associated organ age gaps with various age-related phenotypes across Covance, LonGenity, SAMS, Stanford-ADRC and Knight-ADRC cohorts. Most organ age gap versus trait associations in this study (Figs. 2a–d and 3c and Extended Data Figs. 4d,e, 5c, 6b,c,7, 8c,d and 9) were assessed using linear models controlled for age and sex as follows: age gap ≈ trait + age + sex and adjusted for multiple testing burden using the Benjamini–Hochberg method when appropriate. To describe disease associations in relation to years of additional aging in the main text, we took the coefficient for the trait variable—which provides an estimate of the mean difference in z-scored age gaps between disease and control—and converted that to an estimate of mean difference in raw age gaps, using the standard deviation of raw age gaps provided in Supplementary Table 8.
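The linear model and the conversion to years can be illustrated with the sketch below. It is a toy example under stated assumptions, not the analysis code: the least-squares solver is a bare-bones normal-equations implementation, the data are simulated without noise, and the standard deviation of raw age gaps (4.0 years) is a hypothetical placeholder for the values in Supplementary Table 8.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (Gauss-Jordan solve).

    X: list of rows, each starting with 1.0 for the intercept.
    """
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(p):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][p] for i in range(p)]


# Simulated, noise-free data: z-scored age gap ~ trait + age + sex.
trait = [0, 1, 0, 1, 0, 1]          # 1 = disease, 0 = control
age = [60, 65, 70, 75, 80, 85]
sex = [0, 1, 1, 0, 0, 1]            # F = 1, M = 0
z_gap = [0.5 * t + 0.01 * a + 0.1 * s for t, a, s in zip(trait, age, sex)]

X = [[1.0, t, a, s] for t, a, s in zip(trait, age, sex)]
beta = ols(X, z_gap)
beta_trait = beta[1]                 # mean difference in z-scored age gap

sd_raw_gap_years = 4.0               # hypothetical s.d. of raw age gaps
extra_years = beta_trait * sd_raw_gap_years
```

In this toy example the trait coefficient of 0.5 standard deviations corresponds to about 2 additional years of aging.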
Meta-analyses
Meta-analyses to compare and aggregate effect sizes and confidence intervals from multiple cohorts were performed in R using the metafor70 package with an inverse variance weighted fixed effects model.
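The pooled estimate of an inverse variance weighted fixed effects model can be written out directly. The sketch below is a Python illustration of the standard formula rather than the metafor call used in this study; the effect sizes and standard errors are invented.

```python
import math


def fixed_effects_meta(effects, std_errors, z=1.96):
    """Inverse-variance weighted pooled estimate, its s.e. and a 95% CI."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, effects)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se, (pooled - z * pooled_se, pooled + z * pooled_se)


# Invented per-cohort effect sizes and standard errors.
pooled, pooled_se, ci = fixed_effects_meta([0.2, 0.4], [0.1, 0.1])
```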
Cox proportional hazard modelling
Cox proportional hazards models were used to assess the association between organ age gaps and future risk of mortality, congestive heart failure and increase in clinical dementia rating using the following model: event risk ≈ organ age gap + age + sex. Models were tested using the lifelines71 python package. Kaplan–Meier curves were generated at population-average covariate values in the relevant subject populations.
Extreme agers
Extreme agers were defined as individuals who had an age gap value two standard deviations above or below the mean (z-scored age gap greater than 2 or z-scored age gap less than −2) for at least one aging model. A total of 23% of the population across all cohorts were extreme agers. All extreme agers showed accelerated aging; no individuals displayed extreme youth signatures without an extreme aging signature in a different organ (Extended Data Fig. 4a). To identify different groups of extreme agers with similar aging profiles, we performed k-means clustering (n = 13) of the extreme agers. Z-scored age gap values between −2 and 2 (that is, non-extreme values) were set to zero before clustering. The clusters showed distinct organ agers (Fig. 1e and Extended Data Fig. 4b). A multi-organ ager cluster was also identified. Individuals who were extreme agers in at least five different organs were manually set to multi-organ agers. Extreme ageotypes (clusters) were associated with major age-related diseases using logistic regression (trait ≈ e-ageotype) in a cross-cohort meta-analysis (Extended Data Fig. 4d and Supplementary Table 9).
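The definitions of extreme agers and multi-organ agers can be sketched as a simple labelling step. This illustration covers only the thresholding, not the k-means clustering; the function name, subject identifiers and values are hypothetical, and counting organs by absolute z-score is an assumption based on the definition above.

```python
def classify_extreme_agers(z_gaps, threshold=2.0, multi_organ_min=5):
    """z_gaps: {subject: {organ: z-scored age gap}} -> {subject: label}.

    A subject is an extreme ager if |z| > threshold for at least one model,
    and a multi-organ ager if that holds for at least `multi_organ_min` organs.
    """
    labels = {}
    for subject, gaps in z_gaps.items():
        extreme = [organ for organ, z in gaps.items() if abs(z) > threshold]
        if len(extreme) >= multi_organ_min:
            labels[subject] = "multi-organ ager"
        elif extreme:
            labels[subject] = "extreme ager"
        else:
            labels[subject] = "not extreme"
    return labels


# Hypothetical subjects and z-scored age gaps.
organs = ["heart", "brain", "kidney", "liver", "lung"]
toy_gaps = {
    "subject_1": {"heart": 2.5, "brain": 0.1, "kidney": -0.3},
    "subject_2": {"heart": 0.3, "brain": -0.4, "kidney": 1.9},
    "subject_3": {organ: 2.1 for organ in organs},
}
labels = classify_extreme_agers(toy_gaps)
```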
Feature importance for biological aging
FIBA is an adaptation of permutation feature importance (PFI)72 (Extended Data Fig. 6a). PFI is traditionally used in machine learning to assess how much a model depends on a given feature for prediction accuracy of the target variable. The PFI score is defined as the decrease in a model’s performance when values from a single feature are randomized. In our case, for chronological age predictors, the PFI score would be calculated as the difference between the model’s original prediction accuracy (Pearson correlation between predicted and chronological age) and the model’s prediction accuracy after randomization of a single feature. The final PFI score is the mean PFI score from five randomizations.
FIBA builds on the concept of PFI and applies it to the field of aging to assess the importance of a feature in measuring biological age, instead of the target variable, chronological age. We assume that information about biological age lies in the model age gap and its association with an age-related trait. Thus, randomization of an important feature would reduce the association between the model age gap and the trait (in the expected direction). The FIBA score for a protein is calculated based on this logic and is defined as the difference between the model age gap’s original association with a trait and the association with that trait after randomization of a single feature.
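The relationship between PFI and FIBA can be made concrete with the toy sketch below. It is a simplified illustration, not the FIBA implementation in the organage repository: the model is a fixed linear predictor, the age gap is taken as predicted minus chronological age (rather than a lowess residual), the association is a Pearson correlation (rather than a covariate-adjusted linear model), a positive association is assumed to be the expected direction, and all data are invented.

```python
import random
from statistics import mean


def pearson(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def predict(rows, weights):
    """Fixed linear age predictor (stand-in for a trained LASSO model)."""
    return [sum(w * v for w, v in zip(weights, row)) for row in rows]


def permute_column(rows, col, rng):
    """Return a copy of rows with one feature column shuffled."""
    shuffled = [row[col] for row in rows]
    rng.shuffle(shuffled)
    return [row[:col] + [s] + row[col + 1:] for row, s in zip(rows, shuffled)]


def pfi_score(rows, weights, age, col, n_perm=5, seed=0):
    """Mean drop in correlation(predicted age, chronological age)."""
    rng = random.Random(seed)
    base = pearson(predict(rows, weights), age)
    return mean(
        base - pearson(predict(permute_column(rows, col, rng), weights), age)
        for _ in range(n_perm)
    )


def fiba_score(rows, weights, age, trait, col, n_perm=5, seed=0):
    """Mean drop in the age gap versus trait association after permutation."""
    rng = random.Random(seed)

    def association(r):
        gap = [p - a for p, a in zip(predict(r, weights), age)]  # simplified gap
        return pearson(gap, trait)

    base = association(rows)
    return mean(
        base - association(permute_column(rows, col, rng)) for _ in range(n_perm)
    )


# Invented data: feature 0 carries the trait signal, feature 1 is unused.
ages = [60.0 + i for i in range(20)]
trait = [float((7 * i) % 20) for i in range(20)]
features = [[a + 2.0 * t, float(i % 3)] for i, (a, t) in enumerate(zip(ages, trait))]
weights = [1.0, 0.0]

fiba_important = fiba_score(features, weights, ages, trait, col=0)
fiba_unused = fiba_score(features, weights, ages, trait, col=1)
pfi_unused = pfi_score(features, weights, ages, col=1)
```

Permuting the feature that carries the trait signal weakens the association between age gap and trait, giving a positive FIBA score, whereas permuting a feature with zero weight leaves both scores at zero.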
We applied FIBA to understand aging model protein contributions to associations with cognition using the CDR-Global score. The mean FIBA score after five permutations was calculated for all 500 bootstraps for all organ aging models (Supplementary Table 15). A protein was defined as significant (FIBA+) if less than 5% (empirical single-tailed P < 0.05) of its FIBA scores across bootstraps were negative. Only proteins with nonzero coefficients in at least 100/500 bootstraps were considered. FIBA+ organ-specific proteins were used to train new cognition-optimized aging models from cognitively unimpaired individuals in the Knight-ADRC cohort.
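The FIBA+ criterion can be expressed as a short check per protein. The sketch below is illustrative: the function name and the toy scores are hypothetical, and computing the negative fraction over all bootstraps (rather than only those with a nonzero coefficient) is an assumption.

```python
def is_fiba_positive(fiba_scores, coefficients, alpha=0.05, min_nonzero=100):
    """Decide whether one protein is FIBA+.

    fiba_scores: mean FIBA score of the protein in each bootstrap model.
    coefficients: the protein's model coefficient in each bootstrap model.
    """
    n_nonzero = sum(1 for c in coefficients if c != 0)
    if n_nonzero < min_nonzero:
        return False
    frac_negative = sum(1 for s in fiba_scores if s < 0) / len(fiba_scores)
    return frac_negative < alpha


# Invented scores across 500 bootstraps.
mostly_positive = [-0.01] * 10 + [0.02] * 490     # 2% negative
too_many_negative = [-0.01] * 30 + [0.02] * 470   # 6% negative
always_selected = [0.3] * 500                     # nonzero in 500 bootstraps
rarely_selected = [0.3] * 50 + [0.0] * 450        # nonzero in 50 bootstraps

call_a = is_fiba_positive(mostly_positive, always_selected)
call_b = is_fiba_positive(too_many_negative, always_selected)
call_c = is_fiba_positive(mostly_positive, rarely_selected)
```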
Biological pathway enrichment and protein–protein interaction analysis
Biological pathway enrichment analyses were performed using g:Profiler73 with the set of all human genes as the background distribution. Protein–protein interaction networks were generated using the STRING database74.
Single-cell RNA sequencing analysis
Preprocessed human heart52 and kidney51 scRNA-seq data were accessed from studies in the Human Cell Atlas. Preprocessed brain scRNA-seq data were accessed from ref. 53. Preprocessed human brain vasculature scRNA-seq data were accessed from ref. 42. Preprocessed human vasculature scRNA-seq data were accessed from Tabula Sapiens41. Gene expression counts data were log(CPM + 1) transformed and z-scored for visualization.
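The transformation applied for visualization can be sketched as follows. This is a minimal illustration with invented counts; it assumes the natural logarithm and z-scoring of each gene across cells, neither of which is specified in the text.

```python
import math
from statistics import mean, pstdev


def log_cpm(counts):
    """counts: {gene: raw count} for one cell -> {gene: log(CPM + 1)}."""
    total = sum(counts.values())
    return {g: math.log(c / total * 1e6 + 1) for g, c in counts.items()}


def zscore_gene(values):
    """z-score one gene across cells; constant genes map to 0."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]


# Invented counts for two cells and two genes.
cells = [{"GENE_A": 1, "GENE_B": 9}, {"GENE_A": 5, "GENE_B": 5}]
transformed = [log_cpm(cell) for cell in cells]
gene_a_z = zscore_gene([cell["GENE_A"] for cell in transformed])
```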
Brain tissue bulk proteomics and RNA sequencing
Differential expression statistics of proteins and RNA from AD versus control brains were accessed from ref. 39.
Brain MRI data from Stanford-ADRC and SAMS cohorts
MRI acquisition
Whole-brain MRI scans were collected from all subjects in the Stanford-ADRC and SAMS cohorts. All MRI data were collected at the Stanford Richard M. Lucas Center for Imaging. A total of 271 subjects underwent MRI scanning on a 3 T MRI scanner (GE Discovery MR750). T1-weighted SPGR scans were collected (TR/TE/TI = 8.2/3.2/900 ms, flip angle = 9°, 1 × 1 × 1 mm) and used to define grey matter volumes. A total of 134 subjects underwent MRI scanning on a hybrid PET/MRI scanner (Signa 3 T, GE Healthcare). T1-weighted SPGR scans were collected (TR/TE/TI = 7.7/3.1/400 ms, flip angle = 11°, 1.2 × 1.1 × 1.1 mm) and used to define grey matter volumes.
Structural MRI processing
Region of interest (ROI) labelling was implemented using the FreeSurfer75 software package v.7 (http://surfer.nmr.mgh.harvard.edu). In brief, structural images were bias field corrected, intensity normalized and skull stripped using a watershed algorithm. These images underwent a white matter-based segmentation, grey/white matter and pial surfaces were defined, and topology correction was applied to these reconstructed surfaces. Subcortical and cortical ROIs spanning the entire brain were defined in each subject’s native space, using the aparc+aseg atlas in FreeSurfer.
MRI brainageR algorithm
Using matched brain MRI and plasma proteomic data from n = 541 samples in SAMS and Stanford-ADRC, we compared our plasma proteomic organ clocks with established brain MRI-based clocks, brainageR16 and BARACUS Brain-Age76.
We used a pretrained machine learning algorithm (https://github.com/james-cole/brainageR) and raw T1-weighted MRI scans to estimate brain age. This software uses SPM12 (https://www.fil.ion.ucl.ac.uk/spm/software/spm12/) to perform tissue segmentation and normalization of individual scans to Montreal Neurological Institute (MNI) template space. The software relies on a model that used Gaussian process regression to predict brain age on 3,777 participants from seven publicly available datasets (mean age = 40.1, range = 18–90 years). It applies the results of this training to predict brain age in any new T1-w data, utilizing the RNifti (v.1.4.5) and kernlab (v.0.9-32) packages within R v.4.2.
We also used another pretrained algorithm, BARACUS (https://github.com/bids-apps/baracus, ref. 76) to estimate brain age from FreeSurfer v.5.3 processed T1-w scans. The vertex-wise cortical thickness and surface area values (transformed from subject space to fsaverage4 standard space), along with the subcortical volumetric statistics, were used as input to BARACUS’s linear support vector machine model. This model was trained on 1,166 participants with no objective cognitive impairment (566 female, mean age = 59.1, range = 20–80 years). It returns a ‘stacked-anatomy’ prediction among its results, which we used as the estimate of brain age for this method.
MRI regions of interest analysis
The volume of the AD signature region was calculated as the sum of the volumes of the parahippocampal gyrus, entorhinal cortex, inferior parietal lobules, hippocampus and precuneus. Following best practice, ROIs were linearly adjusted for estimated total intracranial volume to account for differences in human size that are unrelated to cognitive function and neurodegeneration. Associations between organ age gaps and adjusted brain ROIs were tested using a linear model controlled for age and sex. Associations were performed for all ROIs in the aparc+aseg atlas.
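The AD signature volume and the intracranial volume adjustment can be sketched as below. This is an illustration under assumptions: the ROI keys are hypothetical labels rather than exact atlas names, the volumes are invented, and the adjustment is implemented as a residual from a simple linear regression re-centred at the cohort mean, which is one common reading of a linear adjustment.

```python
from statistics import mean

# Hypothetical ROI keys for the five regions named in the text.
AD_SIGNATURE = [
    "parahippocampal",
    "entorhinal",
    "inferior_parietal",
    "hippocampus",
    "precuneus",
]


def ad_signature_volume(roi_volumes):
    """Sum the volumes of the AD signature regions for one subject."""
    return sum(roi_volumes[roi] for roi in AD_SIGNATURE)


def adjust_for_icv(volumes, icv):
    """Remove the linear effect of intracranial volume from ROI volumes."""
    mv, mi = mean(volumes), mean(icv)
    slope = sum((i - mi) * (v - mv) for i, v in zip(icv, volumes)) / sum(
        (i - mi) ** 2 for i in icv
    )
    return [v - slope * (i - mi) for v, i in zip(volumes, icv)]


# Invented volumes (cm^3): here ROI volume scales exactly with head size.
subject = {"parahippocampal": 4.0, "entorhinal": 3.5, "inferior_parietal": 25.0,
           "hippocampus": 7.5, "precuneus": 20.0}
signature = ad_signature_volume(subject)
adjusted = adjust_for_icv([14.0, 15.0, 16.0], [1400.0, 1500.0, 1600.0])
```

In the toy data the ROI volume is fully explained by intracranial volume, so all adjusted values collapse to the cohort mean of 15.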
Alzheimer’s disease polygenic risk score in the Stanford-ADRC cohort
AD polygenic risk scores (PRS) were calculated in the Stanford-ADRC cohort to compare to the CognitionBrain age gap. PRSs were determined from whole-genome sequencing. The Genome Analysis Toolkit workflow Germline short variant discovery was used to map genome sequencing data to the reference genome (GRCh38) and to produce high-confidence variant calls using joint-calling77. Six individuals were excluded from further whole-genome sequencing analysis due to discordance between their reported sex and genetic sex. APOE genotype (ε2/ε3/ε4) was determined using allelic combinations of single nucleotide variants rs7412 and rs429358. The independent loci identified in the largest AD GWAS to date were used to compute AD PRS. Namely, the 84 variants and their effect size available from Tables 1 and 2 in ref. 30 were used, in addition to rs7412 (odds ratio = 0.6) and rs429358 (odds ratio = 3.7). PLINK 1.9 (ref. 78) with the ‘--score’ flag was used to formally compute the PRS, while providing the individual genotypes and the list of variants with their effect size as input. Three individuals with pathogenic mutations in PSEN1 or GBA were removed from this analysis.
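Conceptually, the PRS is a weighted sum of effect-allele dosages. The sketch below illustrates that idea in Python and is not a reimplementation of PLINK: it uses a plain sum with the natural log of the odds ratio as the weight (PLINK 1.9 reports an average by default unless its sum modifier is given), skips variants missing from the genotype input, and uses an invented third variant and invented dosages. Only the two APOE odds ratios are taken from the text above.

```python
import math


def polygenic_risk_score(dosages, odds_ratios):
    """Sum of effect-allele dosage (0, 1 or 2) times ln(odds ratio).

    Assumption: variants absent from `dosages` are skipped.
    """
    return sum(
        dosages[variant] * math.log(odds_ratio)
        for variant, odds_ratio in odds_ratios.items()
        if variant in dosages
    )


# rs7412 and rs429358 odds ratios come from the text; "rs_hypothetical"
# and all dosages are invented for illustration.
odds_ratios = {"rs7412": 0.6, "rs429358": 3.7, "rs_hypothetical": 1.1}
e4_homozygote = {"rs7412": 0, "rs429358": 2, "rs_hypothetical": 1}
e2_carrier = {"rs7412": 1, "rs429358": 0, "rs_hypothetical": 0}

prs_e4 = polygenic_risk_score(e4_homozygote, odds_ratios)
prs_e2 = polygenic_risk_score(e2_carrier, odds_ratios)
```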
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41586-023-06802-1.
Acknowledgements
We thank A. Keller, D. Gate, O. Leventhal, R. Vest, T. Iram, S. R. Shuken, A. Kaur, S. Shi, E. Costa, A. Shankar, A. Morningstar and other members of the Wyss-Coray laboratory for feedback and support, and D. Berdnick, H. Zhang and K. Dickey for laboratory management. This work was supported by the Stanford Alzheimer’s Disease Research Center (National Institute on Aging grants P50AG047366 and P30AG066515), the National Institute on Aging (AG072255, T.W.-C.; AG057909, AG061155 and AG044829, S.M. and N.B.; AG066206, Z.H.), the National Institutes of Health (R01AG044546, RF1AG053303, RF1AG058501 and U01AG058922, C.C.; P01AG003991, C.C. and J.C.M.; RF1AG074007, Y.J.S.), the Michael J. Fox Foundation (L.I. and C.C.), the Alzheimer’s Association Zenith Fellows Award (ZEN-22-848604, C.C.), the Milky Way Research Foundation, Nan Fung Life Sciences (T.W.-C.), the Stanford Graduate Fellowship (H.O. and J.R.), the Stanford Translational Program in Aging Research (T32AG047126, D.N.) and the National Science Foundation Graduate Research Fellowship (H.O.).
Author contributions
T.W.-C., B.L., H.O. and J.R. conceptualized the study. J.R. and H.O. led and performed all analyses. J.R., H.O. and P.M.-L. assessed quality control and normalization methods for SomaScan plasma proteomics data. H.O. and J.R. developed the FIBA algorithm. R.P. and D.N. advised on machine learning best practices. D.N., Z.H. and S.B.M. advised on statistical methods. O.A. aided in brain MRI data analyses from the SAMS and Stanford-ADRC cohorts. D.Y.U. and T.M.M. aided in analyses. K.K. and P.M.-L. created the shiny app. D.C. led plasma collection for the Stanford-ADRC cohort. Y.J.S., L.W., J.T., D.W., M.L., P.K., J.B. and C.C. generated proteomics from the Knight-ADRC cohort. E.N.W. and K.I.A. led plasma tau data collection in the Stanford-ADRC cohort. Y.G. and M.D.G. generated Alzheimer’s polygenic risk scores in the Stanford-ADRC cohort. R.P., M.H. and A.C.Y. aided in single-cell RNA-seq analyses. S.S. collected proteomics and E.F.W. led cognition tests for the LonGenity cohort. S.M. and N.B. established the LonGenity project and provided data. A.D.W. and E.M. established the SAMS cohort and provided data and insights. V.W.H. assisted in Stanford-ADRC data acquisition. V.W.H., F.M.L. and T.W.-C. lead the Stanford-ADRC. H.O. assembled the figures. J.R. and H.O. wrote the manuscript. J.R. edited the manuscript. T.W.-C. supervised the study. All authors critically revised the manuscript for intellectual content. All authors read and approved the final version of the manuscript.
Peer review
Peer review information
Nature thanks Christiaan Leeuwenburgh, Anthony Rosenzweig, Stephen Williams, Alex Zhavoronkov and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Data availability
Stanford-ADRC data are available upon reasonable request to the Stanford-ADRC data release committee, https://web.stanford.edu/group/adrc/cgi-bin/web-proj/datareq.php. All Stanford-ADRC data will be made publicly available after an embargo period at https://twc-stanford.shinyapps.io/adrc/. SAMS data are available to qualified investigators upon request to principal investigators Beth Mormino (bmormino@stanford.edu) or Anthony Wagner (awagner@stanford.edu). Knight-ADRC data were generated by the laboratory of principal investigator Carlos Cruchaga (cruchagac@wustl.edu) and are available upon reasonable request to the National Institute on Aging Genetics of Alzheimer’s Disease Data Storage Site (NIAGADS) (Study ID: ng00130), https://www.niagads.org/knight-adrc-collection. Data from the Covance and LonGenity cohorts can be accessed according to the policies described in the initial study publications (refs. 54–56). Preprocessed human heart (ref. 52) and kidney (ref. 51) scRNA-seq data were accessed from studies in the Human Cell Atlas. Preprocessed brain scRNA-seq data were accessed from ref. 53. Preprocessed human brain vasculature scRNA-seq data were accessed from Yang et al. 2022 (ref. 42). Preprocessed human vasculature scRNA-seq data were accessed from Tabula Sapiens (ref. 41). Differential expression statistics of proteins and RNA from Alzheimer’s disease versus control brains were accessed from ref. 39. Change-with-age information for approximately 5,000 SomaScan v.4 plasma proteins across all five cohorts (Supplementary Fig. 8 and Supplementary Table 25) is available in a public Shiny app (https://twc-stanford.shinyapps.io/aging_plasma_proteome_v2/).
Code availability
All analyses were carried out using freely available software packages in Python and R. All aging models are available through the organage Python package and its associated GitHub repository (https://github.com/hamiltonoh/organage). The package takes SomaScan data (v.4 or higher), age and sex as inputs, and outputs estimated organ ages and age gaps. The aging models can be downloaded from the package, and the model coefficients are available in Supplementary Tables 6 and 17. Code for the FIBA algorithm is in the package’s GitHub repository.
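As an illustration of what an "age gap" output can look like, the following minimal sketch takes chronological ages and model-estimated organ ages and returns standardized residuals from a least-squares line. It does not use or reproduce the organage package. The function name `age_gaps`, the toy data, and the residual-based definition of the age gap are all assumptions made for illustration, and the package may define the quantity differently.

```python
from statistics import mean, pstdev


def age_gaps(chronological, predicted):
    """Return z-scored residuals of predicted age regressed on chronological age.

    Assumption: an "age gap" is treated here as the residual from a simple
    least-squares line; the organage package may define it differently.
    """
    if len(chronological) != len(predicted) or len(chronological) < 3:
        raise ValueError("need two equal-length sequences with at least 3 samples")
    mx, my = mean(chronological), mean(predicted)
    sxx = sum((x - mx) ** 2 for x in chronological)
    if sxx == 0:
        raise ValueError("chronological ages must not all be identical")
    # Ordinary least-squares slope and intercept of predicted vs chronological age.
    slope = sum((x - mx) * (y - my) for x, y in zip(chronological, predicted)) / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x) for x, y in zip(chronological, predicted)]
    spread = pstdev(residuals)
    if spread == 0:
        return [0.0 for _ in residuals]
    # Scale residuals to unit standard deviation so organs are comparable.
    return [r / spread for r in residuals]


# Hypothetical toy data: chronological ages and model-estimated heart ages.
ages = [40, 50, 60, 70, 80]
heart_ages = [42, 49, 66, 69, 79]
gaps = age_gaps(ages, heart_ages)
```

In this toy example a positive value marks an individual whose estimated organ age is higher than that of same-aged peers in the sample, and a negative value marks the opposite.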
Competing interests
T.W.-C., H.O., J.R., B.L. and Stanford University have filed a patent application related to this work, PCT/US2023/027896. T.W.-C., H.O. and J.R. are co-founders and scientific advisors of Teal Omics Inc. and have received equity stakes. T.W.-C. is a co-founder and scientific advisor of Alkahest Inc. and Qinotto Inc. and has received equity stakes in these companies. C.C. has received research support from GSK and EISAI. The funders of the study had no role in the collection, analysis or interpretation of data; in the writing of the report; nor in the decision to submit the paper for publication. C.C. is a member of the advisory board of Vivid Genomics and Circular Genomics and owns stocks in these companies. S.B.M. is a consultant for BioMarin, MyOme and Tenaya Therapeutics. All other authors have certified they have no competing interests to declare.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Hamilton Se-Hwee Oh, Jarod Rutledge
Extended data is available for this paper at 10.1038/s41586-023-06802-1.
Supplementary information
The online version contains supplementary material available at 10.1038/s41586-023-06802-1.
References
- 1. Schaum N, et al. Ageing hallmarks exhibit organ-specific temporal signatures. Nature. 2020;583:596–602. doi: 10.1038/s41586-020-2499-y.
- 2. Almanzar N, et al. A single-cell transcriptomic atlas characterizes ageing tissues in the mouse. Nature. 2020;583:590–595. doi: 10.1038/s41586-020-2496-1.
- 3. Pálovics R, et al. Molecular hallmarks of heterochronic parabiosis at single-cell resolution. Nature. 2022;603:309–314. doi: 10.1038/s41586-022-04461-2.
- 4. Zahn JM, et al. AGEMAP: a gene expression database for aging in mice. PLoS Genet. 2007;3:e201. doi: 10.1371/journal.pgen.0030201.
- 5. Janelidze S, et al. Plasma P-tau181 in Alzheimer’s disease: relationship to other biomarkers, differential diagnosis, neuropathology and longitudinal progression to Alzheimer’s dementia. Nat. Med. 2020;26:379–386. doi: 10.1038/s41591-020-0755-1.
- 6. Hajat C, Stein E. The global burden of multiple chronic conditions: a narrative review. Prev. Med. Rep. 2018;12:284–293. doi: 10.1016/j.pmedr.2018.10.008.
- 7. Kaeberlein M, Rabinovitch PS, Martin GM. Healthy aging: the ultimate preventative medicine. Science. 2015;350:1191–1193. doi: 10.1126/science.aad3267.
- 8. Conboy IM, et al. Rejuvenation of aged progenitor cells by exposure to a young systemic environment. Nature. 2005;433:760–764. doi: 10.1038/nature03260.
- 9. Rutledge J, Oh H, Wyss-Coray T. Measuring biological age using omics data. Nat. Rev. Genet. 2022;23:715–727.
- 10. Hannum G, et al. Genome-wide methylation profiles reveal quantitative views of human aging rates. Mol. Cell. 2013;49:359–367. doi: 10.1016/j.molcel.2012.10.016.
- 11. Horvath S. DNA methylation age of human tissues and cell types. Genome Biol. 2013;14:3156. doi: 10.1186/gb-2013-14-10-r115.
- 12. Putin E, et al. Deep biomarkers of human aging: application of deep neural networks to biomarker development. Aging. 2016;8:1021–1030. doi: 10.18632/aging.100968.
- 13. Belsky DW, et al. Quantification of biological aging in young adults. Proc. Natl Acad. Sci. USA. 2015;112:E4104–E4110. doi: 10.1073/pnas.1506264112.
- 14. Levine ME, et al. An epigenetic biomarker of aging for lifespan and healthspan. Aging. 2018;10:573–591. doi: 10.18632/aging.101414.
- 15. Tian YE, et al. Heterogeneous aging across multiple organ systems and prediction of chronic disease and mortality. Nat. Med. 2023;29:1221–1231. doi: 10.1038/s41591-023-02296-6.
- 16. Cole JH, et al. Brain age predicts mortality. Mol. Psychiatry. 2018;23:1385–1392. doi: 10.1038/mp.2017.62.
- 17. Glorioso C, Oh S, Douillard GG, Sibille E. Brain molecular aging, promotion of neurological disease and modulation by Sirtuin5 longevity gene polymorphism. Neurobiol. Dis. 2011;41:279–290. doi: 10.1016/j.nbd.2010.09.016.
- 18. GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science. 2020;369:1318–1330. doi: 10.1126/science.aaz1776.
- 19. Uhlén M, et al. Tissue-based map of the human proteome. Science. 2015;347:1260419. doi: 10.1126/science.1260419.
- 20. Tanaka T, et al. Plasma proteomic signature of age in healthy humans. Aging Cell. 2018;17:e12799. doi: 10.1111/acel.12799.
- 21. Lehallier B, et al. Undulating changes in human plasma proteome profiles across the lifespan. Nat. Med. 2019;25:1843–1850. doi: 10.1038/s41591-019-0673-2.
- 22. Sparks MA, et al. Classical renin-angiotensin system in kidney physiology. Compr. Physiol. 2014;4:1201–1228.
- 23. Buchanan S, Combet E, Stenvinkel P, Shiels PG. Klotho, aging, and the failing kidney. Front. Endocrinol. 2020;11:560. doi: 10.3389/fendo.2020.00560.
- 24. Gudbjartsson DF, et al. Association of variants at UMOD with chronic kidney disease and kidney stones—role of age and comorbid diseases. PLoS Genet. 2010;6:e1001039. doi: 10.1371/journal.pgen.1001039.
- 25. Devuyst O, Pattaro C. The UMOD locus: insights into the pathogenesis and prognosis of kidney disease. J. Am. Soc. Nephrol. 2018;29:713–726. doi: 10.1681/ASN.2017070716.
- 26. Shrivastava A, Haase T, Zeller T, Schulte C. Biomarkers for heart failure prognosis: proteins, genetic scores and non-coding RNAs. Front. Cardiovasc. Med. 2020;7:601364. doi: 10.3389/fcvm.2020.601364.
- 27. Ho JE, et al. Protein biomarkers of cardiovascular disease and mortality in the community. J. Am. Heart Assoc. 2018;7:e008108. doi: 10.1161/JAHA.117.008108.
- 28. Saberi S, et al. Mavacamten favorably impacts cardiac structure in obstructive hypertrophic cardiomyopathy. Circulation. 2021;143:606–608. doi: 10.1161/CIRCULATIONAHA.120.052359.
- 29. McCrory C, et al. GrimAge outperforms other epigenetic clocks in the prediction of age-related clinical phenotypes and all-cause mortality. J. Gerontol. A Biol. Sci. Med. Sci. 2021;76:741–749. doi: 10.1093/gerona/glaa286.
- 30. Bellenguez C, et al. New insights into the genetic etiology of Alzheimer’s disease and related dementias. Nat. Genet. 2022;54:412–436. doi: 10.1038/s41588-022-01024-z.
- 31. Yu L, et al. Cortical proteins associated with cognitive resilience in community-dwelling older persons. JAMA Psychiatry. 2020;77:1172–1180. doi: 10.1001/jamapsychiatry.2020.1807.
- 32. Begemann M, et al. Modification of cognitive performance in schizophrenia by complexin 2 gene polymorphisms. Arch. Gen. Psychiatry. 2010;67:879–888. doi: 10.1001/archgenpsychiatry.2010.107.
- 33. Hishimoto A, et al. Neurexin 3 transmembrane and soluble isoform expression and splicing haplotype are associated with neuron inflammasome and Alzheimer’s disease. Alzheimer’s Res. Ther. 2019;11:28. doi: 10.1186/s13195-019-0475-2.
- 34. Klim JR, et al. ALS-implicated protein TDP-43 sustains levels of STMN2, a mediator of motor neuron growth and repair. Nat. Neurosci. 2019;22:167–179. doi: 10.1038/s41593-018-0300-4.
- 35. Nakaya N, Sultana A, Lee H-S, Tomarev SI. Olfactomedin 1 interacts with the Nogo A receptor complex to regulate axon growth. J. Biol. Chem. 2012;287:37171–37184. doi: 10.1074/jbc.M112.389916.
- 36. Yin GN, Lee HW, Cho J-Y, Suk K. Neuronal pentraxin receptor in cerebrospinal fluid as a potential biomarker for neurodegenerative diseases. Brain Res. 2009;1265:158–170. doi: 10.1016/j.brainres.2009.01.058.
- 37. Bader JM, et al. Proteome profiling in cerebrospinal fluid reveals novel biomarkers of Alzheimer’s disease. Mol. Syst. Biol. 2020;16:e9356. doi: 10.15252/msb.20199356.
- 38. Tan H, et al. LanCL1 promotes motor neuron survival and extends the lifespan of amyotrophic lateral sclerosis mice. Cell Death Differ. 2020;27:1369–1382. doi: 10.1038/s41418-019-0422-6.
- 39. Johnson ECB, et al. Large-scale deep multi-layer analysis of Alzheimer’s disease brain reveals strong proteomic disease-related changes not observed at the RNA level. Nat. Neurosci. 2022;25:213–225. doi: 10.1038/s41593-021-00999-y.
- 40. Tang W, Huang Q, Wang Y, Wang Z-Y, Yao Y-Y. Assessment of CSF Aβ42 as an aid to discriminating Alzheimer’s disease from other dementias and mild cognitive impairment: A meta-analysis of 50 studies. J. Neurol. Sci. 2014;345:26–36. doi: 10.1016/j.jns.2014.07.015.
- 41. The Tabula Sapiens Consortium. The Tabula Sapiens: a multiple-organ, single-cell transcriptomic atlas of humans. Science. 2022;376:eabl4896. doi: 10.1126/science.abl4896.
- 42. Yang AC, et al. A human brain vascular atlas reveals diverse mediators of Alzheimer’s risk. Nature. 2022;603:885–892. doi: 10.1038/s41586-021-04369-3.
- 43. Sengillo JD, et al. Deficiency in mural vascular cells coincides with blood–brain barrier disruption in Alzheimer’s disease. Brain Pathol. 2013;23:303–310. doi: 10.1111/bpa.12004.
- 44. Nikolakopoulou AM, et al. Pericyte loss leads to circulatory failure and pleiotrophin depletion causing neuron loss. Nat. Neurosci. 2019;22:1089–1098. doi: 10.1038/s41593-019-0434-z.
- 45. Callegari A, Coons ML, Ricks JL, Rosenfeld ME, Scatena M. Increased calcification in osteoprotegerin-deficient smooth muscle cells: dependence on receptor activator of NF-κB ligand and interleukin 6. J. Vasc. Res. 2014;51:118–131. doi: 10.1159/000358920.
- 46. Köhler S, et al. The human phenotype ontology in 2021. Nucleic Acids Res. 2021;49:D1207–D1217. doi: 10.1093/nar/gkaa1043.
- 47. Qureshi AR, et al. Increased circulating sclerostin levels in end-stage renal disease predict biopsy-verified vascular medial calcification and coronary artery calcification. Kidney Int. 2015;88:1356–1364. doi: 10.1038/ki.2015.194.
- 48. Touw WA, et al. Association of circulating Wnt antagonists with severe abdominal aortic calcification in elderly women. J. Endocr. Soc. 2017;1:26–38. doi: 10.1210/js.2016-1040.
- 49. Lu AT, et al. DNA methylation GrimAge strongly predicts lifespan and healthspan. Aging. 2019;11:303–327. doi: 10.18632/aging.101684.
- 50. Ganz P, et al. Development and validation of a protein-based risk score for cardiovascular outcomes among patients with stable coronary heart disease. JAMA. 2016;315:2532–2541. doi: 10.1001/jama.2016.5951.
- 51. Stewart BJ, et al. Spatiotemporal immune zonation of the human kidney. Science. 2019;365:1461–1466. doi: 10.1126/science.aat5031.
- 52. Litviňuková M, et al. Cells of the adult human heart. Nature. 2020;588:466–472. doi: 10.1038/s41586-020-2797-4.
- 53. Haney MS, et al. APOE4/4 is linked to damaging lipid droplets in Alzheimer’s microglia. Preprint at bioRxiv. 2023. doi: 10.1101/2023.07.21.549930.
- 54. Williams SA, et al. Plasma protein patterns as comprehensive indicators of health. Nat. Med. 2019;25:1851–1857. doi: 10.1038/s41591-019-0665-2.
- 55. Gubbi S, et al. Effect of exceptional parental longevity and lifestyle factors on prevalence of cardiovascular disease in offspring. Am. J. Cardiol. 2017;120:2170–2175. doi: 10.1016/j.amjcard.2017.08.040.
- 56. Sathyan S, et al. Plasma proteomic profile of age, health span, and all-cause mortality in older adults. Aging Cell. 2020;19:e13250. doi: 10.1111/acel.13250.
- 57. Wilson EN, et al. Performance of a fully-automated Lumipulse plasma phospho-tau181 assay for Alzheimer’s disease. Alzheimers Res. Ther. 2022;14:172. doi: 10.1186/s13195-022-01116-2.
- 58. Berg L, et al. Clinicopathologic studies in cognitively healthy aging and Alzheimer disease: relation of histologic markers to dementia severity, age, sex, and apolipoprotein E genotype. Arch. Neurol. 1998;55:326–335. doi: 10.1001/archneur.55.3.326.
- 59. Morris JC. The clinical dementia rating (CDR): current version and scoring rules. Neurology. 1993;43:2412. doi: 10.1212/WNL.43.11.2412-a.
- 60. Gold L, et al. Aptamer-based multiplexed proteomic technology for biomarker discovery. PLoS ONE. 2010;5:e15004. doi: 10.1371/journal.pone.0015004.
- 61. SomaLogic. SomaScan v4 data standardization and file specification technical note. https://somalogic.com/tech-notes/ (2021).
- 62. SomaLogic. SomaScan v4 data standardization. https://somalogic.com/tech-notes/ (2022).
- 63. SomaLogic. Technical specification: adaptive normalization using maximum likelihood. https://somalogic.com/tech-notes/ (2020).
- 64. Candia J, Daya GN, Tanaka T, Ferrucci L, Walker KA. Assessment of variability in the plasma 7k SomaScan proteomics assay. Sci. Rep. 2022;12:17147. doi: 10.1038/s41598-022-22116-0.
- 65. Katz DH, et al. Proteomic profiling platforms head to head: leveraging genetics and clinical traits to compare aptamer- and antibody-based methods. Sci. Adv. 2022;8:eabm5164. doi: 10.1126/sciadv.abm5164.
- 66. SomaSignal Tests — Products and Services. SomaLogic. https://somalogic.com/somasignal-tests-for-research-use/ (2023).
- 67. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550. doi: 10.1186/s13059-014-0550-8.
- 68. Pedregosa F, et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 2011;12:2825–2830.
- 69. Seabold S, Perktold J. Statsmodels: econometric and statistical modeling with Python. In Proc. 9th Python in Science Conference (eds van der Walt S, Millman J) 92–96 (SciPy, 2010).
- 70. Viechtbauer W. Conducting meta-analyses in R with the metafor package. J. Stat. Softw. 2010;36:1–48. doi: 10.18637/jss.v036.i03.
- 71. Davidson-Pilon C. lifelines, survival analysis in Python (v.0.27.0). Zenodo. 2022. doi: 10.5281/zenodo.6359609.
- 72. Breiman L. Random forests. Mach. Learn. 2001;45:5–32. doi: 10.1023/A:1010933404324.
- 73. Raudvere U, et al. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res. 2019;47:W191–W198. doi: 10.1093/nar/gkz369.
- 74. Szklarczyk D, et al. The STRING database in 2021: customizable protein–protein networks, and functional characterization of user-uploaded gene/measurement sets. Nucleic Acids Res. 2021;49:D605–D612. doi: 10.1093/nar/gkaa1074.
- 75. Fischl B. FreeSurfer. NeuroImage. 2012;62:774–781. doi: 10.1016/j.neuroimage.2012.01.021.
- 76. Liem F, et al. Predicting brain-age from multimodal imaging data captures cognitive impairment. NeuroImage. 2017;148:179–188. doi: 10.1016/j.neuroimage.2016.11.005.
- 77. Poplin R, et al. Scaling accurate genetic variant discovery to tens of thousands of samples. Preprint at bioRxiv. 2018. doi: 10.1101/201178.
- 78. Chang CC, et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015;4:7.