Abstract
Background
Epigenetic clocks are promising tools for assessing biological age. We assessed the accuracy of pediatric epigenetic clocks in gestational and chronological age determination.
Results
Our study used data from seven tissue types on three DNA methylation profiling microarrays and found that the Knight and Bohlin clocks performed similarly for blood cells, while the Lee clock was superior for placental samples. The pediatric-buccal-epigenetic clock performed the best for pediatric buccal samples, while the Horvath clock is recommended for children's blood cell samples. The NeoAge clock stands out for its unique ability to predict post-menstrual age with high correlation with the observed age in infant buccal cell samples.
Conclusions
Our findings provide valuable guidance for future research and development of epigenetic clocks in pediatric samples, enabling more accurate assessments of biological age.
Supplementary Information
The online version contains supplementary material available at 10.1186/s13148-023-01552-3.
Keywords: Epigenetic clock, DNA methylation, Gestational age, Early childhood chronological age
Background
DNA methylation (DNAm), a molecular mark in which a methyl group is covalently added to the fifth carbon of cytosine next to guanine (CpG dinucleotides), is a well-studied and stable epigenetic mark associated with a diverse array of age-related chronic diseases [1–3], including the process of aging itself [4]. Various epigenetic clocks have been developed to predict chronological ages using DNAm values from tens to hundreds of CpGs identified with statistical and machine learning methods. While these clocks correlate strongly with chronological age by design, they also provide an estimate of an individual's biological age [4, 5]. These clocks have been extensively studied in adult populations in whom accelerated epigenetic age (DNAm-predicted age older than chronological age) exhibits strong associations with age-related diseases, mortality, and health outcomes [4–6]. In recent years, a variety of epigenetic clocks have been built for pediatric populations, including clocks predicting gestational age (GA) and pediatric chronological age (CA). However, there is limited research on the reliability and accuracy of epigenetic clocks for pediatric samples across different tissues and platforms.
Epigenetic clocks are used to evaluate the impact of various environmental exposures on aging and children’s health outcomes. Understanding how these clocks perform across tissue types and developmental stages throughout early life is critical for appropriate study design and interpretation of results. The evaluation of epigenetic clocks in early life stages may also shed light on the role of epigenetic modifications in developmental processes and the emergence of diseases later in life.
In this study, we conducted a comprehensive performance evaluation of seven epigenetic clocks using DNAm data from various tissues and different Infinium arrays during the early stages of life. The seven clocks evaluated include the Horvath clock [7], trained across various tissues and cell types to predict CA across the lifecourse; the Knight [8] and Bohlin [9] clocks, both developed based on cord blood data to predict GA; the Lee [10] and Mayne [11] clocks, developed for use with placental data to predict GA; the PedBE clock [12], trained in buccal cells to predict CA across childhood and adolescence; and the NeoAge clock [13], trained in buccal cells from preterm infants to predict neonatal age, including post-menstrual age (PMA, time from estimated conception onward) and post-natal age (PNA, time elapsed after birth). Comparisons were performed by analyzing a large number of diverse DNAm profiles (N = 4555) from newborns, infants, and young children in the Environmental influences on Child Health Outcomes (ECHO) Project. The goal of our study was to provide recommendations for the most suitable epigenetic clock in each scenario, ultimately advancing our understanding of this important biomarker for healthy development in early life.
Methods
Study participants
Data used in this study were obtained through the ECHO Research Program. ECHO is a consortium of established pregnancy and pediatric cohort studies seeking to investigate the effects of early environmental exposures on child health [14]. Our analysis included participants who met the following two criteria: (1) availability of high-quality DNAm data, generated on an Illumina platform, from cord blood cells, cord blood mononuclear cells (CBMC), newborn blood spots, placental samples, buccal cells, peripheral blood mononuclear cells (PBMC), or peripheral whole blood, and (2) collection of these samples either near the time of birth or during childhood (age < 18 years). A total of 3789 participants (with 4555 samples) across 20 U.S. cohorts were included in the current analysis (see Additional file 1: Table S1). The study protocol was approved by the local or single ECHO institutional review board (IRB). The single IRB registered with the Office for Human Research Protections (OHRP and FDA) is IRB00000533. For participation in the ECHO-wide Cohort Data Collection Protocol and specific cohorts, written informed consent or parent/guardian permission was obtained, as well as child assent as appropriate.
DNA methylation data
DNA methylation was measured using Illumina Infinium arrays: the earliest Infinium HumanMethylation27 BeadChip (27K), the subsequent Infinium HumanMethylation450 BeadChip (450K), or the most recent Infinium MethylationEPIC array (EPIC [850K]). Preprocessing of DNAm data from the three arrays was conducted in parallel using the same pipeline in R v.4.0.3, mainly with the package minfi 1.36.0 [15]. At the probe level, we removed probes with more than 1% low-quality samples (detection P value > 0.05), cross-reactive probes that map to multiple genomic locations [16], and probes with SNP(s) (single nucleotide polymorphism) at the single base pair extension or CpG site. At the sample level, we excluded samples with poor bisulfite conversion efficiency at a cutoff of 4000, low overall array intensity at a cutoff of 10, low call rate (> 1% low-quality probes [detection P value > 0.05 or bead count < 3]), replicates, or a discrepancy between predicted sex and reported sex. After applying these quality control steps, we applied the normal-exponential using out-of-band probes method, commonly referred to as “noob”, to correct for background signal and dye bias [17]. DNAm levels were calculated as β-values, which represent the proportion of cells/chromosomes in which the DNA is methylated at the interrogated CpG site and range from 0 to 1.
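The probe-level detection filter and the β-value described above can be illustrated with a minimal Python sketch. This is not the study pipeline, which used minfi in R; the function names, the dictionary layout of the detection P values, and the optional `offset` term in the denominator are assumptions made for illustration only.

```python
def beta_value(meth, unmeth, offset=0.0):
    """Proportion methylated: M / (M + U + offset).

    `offset` is a hypothetical stabilising constant; it defaults to 0 here.
    """
    total = meth + unmeth + offset
    return meth / total if total > 0 else float("nan")


def failing_fraction(pvals, threshold=0.05):
    """Fraction of samples whose detection P value exceeds the threshold."""
    return sum(p > threshold for p in pvals) / len(pvals)


def keep_probes(detection_p, max_fail=0.01, threshold=0.05):
    """Keep probes for which at most `max_fail` of samples are low quality.

    detection_p: dict mapping probe ID -> list of per-sample detection P values.
    """
    return [
        probe
        for probe, pvals in detection_p.items()
        if failing_fraction(pvals, threshold) <= max_fail
    ]


# Toy example with made-up probe IDs: cg_b fails in 3 of 200 samples (1.5%).
toy = {
    "cg_a": [0.001] * 200,
    "cg_b": [0.001] * 197 + [0.2] * 3,
}
kept = keep_probes(toy)
```

In this toy input only `cg_a` is retained, because `cg_b` exceeds the 1% limit on low-quality samples.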
Epigenetic clocks
In this study, a total of seven epigenetic clocks were evaluated: Horvath [7], Knight [8], Bohlin [9], Lee [10], Mayne [11], PedBE [12], and NeoAge [13] (see Additional file 1: Table S2). The Knight clock consists of 148 CpGs and was developed using training data derived from cord blood in both 27K and 450K arrays. The Bohlin clock is calculated based on 96 CpGs in 450K data from cord blood. The Lee clock is based on 558 CpGs designed using placental data from the 450K and EPIC arrays. The Mayne clock has 62 CpGs and was developed using placental data from both 27K and 450K arrays. The PedBE clock, which involves 94 CpGs, is the first epigenetic clock focusing on pediatric samples (0–20 years old). The training data for the PedBE clock was obtained from buccal cell DNA profiled with both 450K and EPIC arrays. To evaluate the performance of the PedBE clock in predicting chronological ages in pediatric samples, we compared it with the pan-tissue Horvath clock (353 CpGs). The NeoAge clock was trained to predict both PMA and PNA for preterm infants in buccal cell samples using 303–522 CpGs.
Each epigenetic clock was calculated in corresponding tissues to match their training datasets, as outlined in Additional file 1: Table S3. Specifically, Knight and Bohlin clocks were calculated for blood samples collected at birth including cord blood, CBMC, blood spot, and peripheral whole blood. For placental samples, the Lee and Mayne clocks were compared. For samples obtained from infants or children in buccal cells, PBMC, or peripheral whole blood, the Horvath and PedBE clocks were applied and compared. In addition, we tested the NeoAge clock in preterm infants by analyzing DNA from buccal cells, placenta, and blood spot samples. For NeoAge, the predicted PNA was compared to chronological age in weeks, and the PMA was compared to the sum of gestational age at birth and the time elapsed after birth in weeks.
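The observed ages used as the reference for NeoAge follow directly from the definitions above. The short Python sketch below shows the arithmetic; the function names and the assumption that age at collection is recorded in days are illustrative and not taken from the study code.

```python
def postnatal_age_weeks(age_days):
    """Observed PNA: time elapsed after birth, converted from days to weeks."""
    return age_days / 7.0


def postmenstrual_age_weeks(ga_weeks, age_days):
    """Observed PMA: gestational age at birth plus time elapsed after birth."""
    return ga_weeks + postnatal_age_weeks(age_days)


# Hypothetical infant born at 26 weeks' gestation and sampled 84 days after birth.
pna = postnatal_age_weeks(84)
pma = postmenstrual_age_weeks(26.0, 84)
```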
Notably, some of the datasets utilized in this study were previously incorporated in the training data of certain epigenetic clocks. Specifically, the Neonatal Neurobehavior and Outcomes in Very Preterm Infants (NOVI) dataset contributed to the training of NeoAge clock. The Lee clock, on the other hand, employed placental samples from the New Hampshire Birth Cohort Study (NHBCS) as training data, whereas only cord blood samples from this cohort were utilized in our study. Although the Conditions Affecting Neurocognitive Development and Learning in Early Childhood (CANDLE) study was utilized as a testing dataset in the creation of the Knight clock for evaluation purposes, it was not utilized as a training dataset. Therefore, the only intersection between our study and the development of the epigenetic clocks is the utilization of the NOVI dataset in the development of the NeoAge clock.
We calculated the Knight, Bohlin, Mayne, PedBE, and NeoAge clocks using the methods described for each clock, and we used the existing R packages planet [18] for the Lee clock and ENmix [19] for the Horvath clock. For the Knight, Bohlin, Mayne, PedBE, and NeoAge clocks, if a required CpG site was missing, then the closest CpG site in the dataset was used in its place [20] (see Additional file 1: Table S4).
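The substitution rule for missing CpG sites can be sketched as follows. This Python example assumes that a clock is scored as an intercept plus a weighted sum of β-values and that "closest" means the smallest genomic distance on the same chromosome; both are simplifying assumptions (some clocks apply an additional age transformation, which is omitted here), and all names, weights, and positions are hypothetical.

```python
def nearest_available(cpg, positions, available):
    """Return the measured CpG closest to `cpg` on the same chromosome, or None.

    positions: dict mapping CpG ID -> (chromosome, base-pair position).
    available: iterable of CpG IDs present in the dataset.
    """
    chrom, pos = positions[cpg]
    candidates = [
        (abs(positions[c][1] - pos), c)
        for c in available
        if c in positions and positions[c][0] == chrom
    ]
    return min(candidates)[1] if candidates else None


def clock_age(betas, weights, intercept, positions):
    """Intercept plus weighted sum of beta values, substituting missing CpGs."""
    total = intercept
    for cpg, weight in weights.items():
        source = cpg if cpg in betas else nearest_available(cpg, positions, betas)
        if source is None:
            raise KeyError(f"no substitute available for {cpg}")
        total += weight * betas[source]
    return total


# Toy data: cg1 is required by the clock but was not measured.
positions = {
    "cg1": ("chr1", 100),
    "cg2": ("chr1", 150),
    "cg3": ("chr1", 900),
    "cg4": ("chr2", 110),
}
betas = {"cg2": 0.5, "cg3": 0.1, "cg4": 0.9}
weights = {"cg1": 2.0, "cg3": 10.0}
age = clock_age(betas, weights, intercept=1.0, positions=positions)
```

Here `cg2` stands in for the missing `cg1` because it is the nearest measured site on chromosome 1; `cg4` is ignored because it lies on a different chromosome.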
Statistical analysis
Four measures were considered in evaluating the outputs of each epigenetic clock. First, the Spearman correlation coefficient (r) between the predicted epigenetic age and observed gestational/chronological age was calculated to assess how well the relationship between the predicted and observed age could be described by a monotonic function. Second, the absolute difference between the predicted and the observed age for each sample was calculated, and the “median error” was defined as the median of the set of absolute differences. Third, the signed difference between the predicted and the observed age for each sample was calculated, and the “mean difference” was defined as the arithmetic mean of the set of signed differences. Lastly, the residuals obtained from regressing the predicted epigenetic age onto the observed age (age acceleration residuals) were used. This residual-based analysis enabled the evaluation of how well the epigenetic clock's predictions aligned with the observed age after accounting for linear dependencies, and it has been shown to be robust with respect to normalization methods and measurement platforms [21]. The “median residual” was defined as the median of the absolute residuals.
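The four measures defined above can be written out in a few lines. The following pure-Python sketch is an illustration under stated assumptions, not the analysis code used in the study: the function names are hypothetical, ties receive average ranks, and when all observed ages are identical the regression slope is set to zero so that residuals reduce to deviations from the mean prediction.

```python
from statistics import mean, median


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; NaN when either variable has no variation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def ols_residuals(x, y):
    """Residuals from the least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0  # assumption: flat line if x is constant
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def evaluate_clock(predicted, observed):
    """Return the four evaluation measures for one clock and one subgroup."""
    diffs = [p - o for p, o in zip(predicted, observed)]
    residuals = ols_residuals(observed, predicted)
    return {
        "spearman_r": spearman(predicted, observed),
        "median_error": median(abs(d) for d in diffs),
        "mean_difference": mean(diffs),
        "median_residual": median(abs(r) for r in residuals),
    }


# Toy gestational ages in weeks (made-up values).
observed = [38.0, 39.0, 40.0, 41.0]
predicted = [37.0, 39.5, 39.0, 42.0]
summary = evaluate_clock(predicted, observed)
```

A negative mean difference in this output indicates that the clock underestimates age on average, whereas the median error ignores the direction of the deviation.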
We initiated our analysis by comparing the suitability of various epigenetic clocks for each tissue type, aiming to provide a comprehensive summary of the most appropriate epigenetic clock for each specific tissue. Following that, we proceeded to evaluate the performance of these epigenetic clocks across diverse populations. This evaluation included comparing epigenetic clocks between preterm and term infants within the same tissue type, analyzing different self-reported racial groups, comparing males and females, and assessing the consistency of epigenetic age estimates across different tissue types within the same set of participants.
Results
Sample characteristics
In this study, data were collected from 3789 participants, resulting in a total of 4555 tissue samples from seven different tissue types collected at birth or early childhood, as indicated in Table 1 and Fig. 1. The sample set consisted of 2273 male and 2282 female samples. The majority of participants self-identified as White (n = 2302 [51%]); the remainder identified as Black (n = 988 [22%]), Asian (n = 94 [2%]), or other racial groups (including Hawaiian or other Pacific Islander, American Indian or Alaska Native, multiple race and other race; n = 752 [17%]). DNAm data were generated using three different types of arrays: (1) 27K (n = 159 [3%]), (2) 450K (n = 1963 [43%]), and (3) EPIC (n = 2433 [53%]). Notably, the study also included two cohorts of infants born very preterm (GA < 30 weeks), consisting of 1151 samples.
Table 1.
Characteristic | Cord blood (N = 1938) | CBMC (N = 142) | Blood spot (N = 701) | Placenta (N = 579) | Buccal (N = 552) | PBMC (N = 290) | Peripheral whole blood (N = 353) | Total (N = 4555) |
---|---|---|---|---|---|---|---|---|
Self-reported race | ||||||||
American Indian or Alaska Native | 11 | < 5* | 6 | < 5* | < 5* | < 5* | < 5* | 24 |
Asian | 31 | < 5* | 21 | 18 | 19 | < 5* | < 5* | 94 |
Black | 364 | 112 | 124 | 122 | 99 | 117 | 50 | 988 |
Hawaiian or other Pacific Islander | < 5* | < 5* | < 5* | < 5* | 7 | < 5* | < 5* | 9 |
Multiple race | 176 | 9 | 65 | 38 | 104 | 11 | 153 | 556 |
Other race | 94 | < 5* | 12 | 16 | 36 | < 5* | < 5* | 163 |
White | 929 | < 5* | 435 | 367 | 284 | 142 | 143 | 2302 |
Missing | 333 | 17 | 36 | 14 | < 5* | 16 | < 5* | 419 |
Sex | ||||||||
Male | 914 | 70 | 366 | 313 | 303 | 144 | 163 | 2273 |
Female | 1,024 | 72 | 335 | 266 | 249 | 146 | 190 | 2282 |
Birth term | ||||||||
Preterm | < 5* | < 5* | 272 | 399 | 480 | < 5* | < 5* | 1151 |
Term | 1,938 | 142 | 439 | 180 | 72 | 290 | 353 | 3414 |
Array type | ||||||||
27K | 159 | < 5* | < 5* | < 5* | < 5* | < 5* | < 5* | 159 |
450K | 1,194 | < 5* | 253 | 59 | < 5* | 150 | 307 | 1963 |
EPIC | 585 | 142 | 448 | 520 | 552 | 140 | 46 | 2433 |
Gestational age at delivery | 39.1 ± 1.5 | 39.0 ± 1.4 | 33.7 ± 8.6 | 29.8 ± 6.2 | 28.1 ± 4.3 | 39.1 ± 1.2 | 39.0 ± 1.7 | |
Age at sample collection | 0 | 0 | 0 | 0 | 74.3 ± 40.3 days | 9.1 ± 3.0 years | 6.0 ± 2.8 years |
*It is an ECHO requirement that table cells and figures that report data from fewer than 5 participants must be suppressed to protect Participant confidentiality (i.e., marked as < 5)
CBMC Cord blood mononuclear cells, PBMC Peripheral blood mononuclear cells
Epigenetic clocks
This study involved the calculation of seven different epigenetic clocks, which collectively incorporated a total of 2587 CpGs. Most of these CpGs (n = 2206) were specific to only one clock, as shown in Additional file 1: Table S5. Twenty CpGs were included in two clocks, with the Horvath clock having the largest overlap, sharing 11 CpGs with other clocks. Additional file 2: Fig. S1 displays the distribution of CpGs for each clock across the genome. The performance of the clocks was compared and summarized in Table 2, with the measurements in correlation, median error, median residual, and mean difference.
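The overlap counts reported above amount to a simple tally of how many clocks contain each CpG. A minimal Python sketch of such a tally is given below; the function name and the toy CpG identifiers are hypothetical and do not correspond to the actual clock definitions.

```python
from collections import Counter


def cpg_overlap_summary(clock_cpgs):
    """Summarise CpG sharing across clocks.

    clock_cpgs: dict mapping clock name -> iterable of CpG IDs.
    Returns (number of distinct CpGs,
             {number of clocks: number of CpGs found in that many clocks},
             {clock name: number of its CpGs shared with another clock}).
    """
    membership = Counter(
        cpg for cpgs in clock_cpgs.values() for cpg in set(cpgs)
    )
    by_n_clocks = dict(Counter(membership.values()))
    shared = {
        name: sum(1 for cpg in set(cpgs) if membership[cpg] > 1)
        for name, cpgs in clock_cpgs.items()
    }
    return len(membership), by_n_clocks, shared


# Toy example with made-up clocks and CpG IDs.
toy_clocks = {
    "A": ["cg1", "cg2", "cg3"],
    "B": ["cg3", "cg4"],
    "C": ["cg5"],
}
n_distinct, by_n_clocks, shared = cpg_overlap_summary(toy_clocks)
```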
Table 2.
Tissue | Array | Birth term | GA (weeks) | N | Knight r* | Knight median error† | Knight median res§ | Knight mean diff‡ | Bohlin r* | Bohlin median error† | Bohlin median res§ | Bohlin mean diff‡ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Gestational age (weeks) estimation | ||||||||||||
Blood spot | 450K | Term | 39.36 ± 1.91 | 253 | 0.35 | 1.11 | 0.96 | − 0.08 | 0.47 | 1.02 | 0.52 | − 0.94 |
Blood spot | EPIC | Term | 38.73 ± 1.60 | 176 | 0.48 | 0.89 | 0.91 | − 0.32 | 0.65 | 1.42 | 0.43 | − 1.35 |
Blood spot | EPIC | Preterm | 25.65 ± 1.27 | 272 | 0.57 | 2.52 | 1.01 | 2.52 | 0.67 | 3.91 | 0.51 | 4.01 |
CBMC | EPIC | Term | 38.98 ± 1.39 | 142 | 0.38 | 2.02 | 0.83 | − 1.90 | 0.66 | 1.73 | 0.35 | − 1.56 |
Cord blood | 27K | Term | 39.00 ± 1.29 | 159 | 0.31 | 1.10 | 0.97 | − 0.39 | 0.37 | 0.58 | 0.14 | − 0.31 |
Cord blood | 450K | Term | 39.19 ± 1.43 | 1194 | 0.41 | 0.98 | 0.82 | − 0.45 | 0.57 | 1.10 | 0.46 | − 0.99 |
Cord blood | EPIC | Term | 38.92 ± 1.74 | 585 | 0.34 | 1.11 | 0.91 | − 0.49 | 0.55 | 1.64 | 0.52 | − 1.41 |
Tissue | Array | Birth term | GA (weeks) | N | Lee r* | Lee median error† | Lee median res§ | Lee mean diff‡ | Mayne r* | Mayne median error† | Mayne median res§ | Mayne mean diff‡ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Placental gestational age (weeks) estimation | ||||||||||||
Placenta | 450K | Term | 38.98 ± 1.15 | 59 | 0.5 | 0.60 | 0.52 | − 0.11 | 0.27 | 1.55 | 0.83 | − 1.62 |
Placenta | EPIC | Term | 38.79 ± 1.18 | 121 | 0.36 | 0.88 | 0.76 | − 0.32 | − 0.05 | 2.11 | 1.37 | 0.92 |
Placenta | EPIC | Preterm | 25.68 ± 1.25 | 399 | 0.6 | 2.50 | 0.93 | 2.59 | 0.35 | 7.27 | 1.22 | 7.38 |
Tissue | Array | Birth term | Age (days) | N | Horvath r* | Horvath median error† | Horvath median res§ | Horvath mean diff‡ | PedBE r* | PedBE median error† | PedBE median res§ | PedBE mean diff‡ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Pediatric chronological age (years) estimation | ||||||||||||
Buccal cells | EPIC | Term | 3.62 ± 5.24 | 72 | − 0.36 | 0.98 | 0.17 | 1.00 | 0.12 | 0.05 | 0.05 | 0.04 |
Buccal cells | EPIC | Preterm | 84.95 ± 31.64 | 480 | 0.41 | 1.15 | 0.34 | 1.26 | 0.62 | 0.09 | 0.09 | 0.07 |
PBMC | 450K | Term | 4044.20 ± 1098.28 | 150 | 0.79 | 1.72 | 1.26 | − 1.33 | 0.78 | 6.73 | 0.26 | − 5.29 |
PBMC | EPIC | Term | 2555 | 140 | NA | 3.48 | 1.24 | 3.48 | NA | 0.83 | 0.38 | − 0.72 |
Peripheral whole blood | 450K | Term | 2504.60 ± 705.79 | 307 | 0.68 | 0.85 | 0.86 | 0.30 | 0.56 | 1.73 | 0.30 | − 1.07 |
Peripheral whole blood | EPIC | Term | 153 ± 93.44 | 46 | 0.74 | 1.50 | 0.29 | 1.49 | 0.68 | 4.09 | 0.18 | 4.09 |
*Spearman correlation between predicted epigenetic age and observed gestational age/chronological age
†Median of the absolute difference between the predicted age and the observed age (weeks for gestational age and years for chronological age)
§Median of the absolute residuals from regressing the predicted epigenetic age onto the observed age (weeks for gestational age and years for chronological age)
‡Mean of the signed difference between the predicted age and the observed age (weeks for gestational age and years for chronological age).
CBMC: Cord blood mononuclear cells, GA: gestational age, PBMC: Peripheral blood mononuclear cells
Gestational age (GA) prediction in blood samples collected at birth
We assessed the performance of the Knight and Bohlin clocks in three types of blood samples (cord blood, CBMC, and blood spot) collected at birth, stratified by array type and gestational age at birth category (term [> 37 weeks] or preterm [< 37 weeks]). The Bohlin clock consistently showed less variation in age acceleration residuals (Table 2).
In cord blood DNA, the Bohlin clock GA was more correlated with observed GA than the Knight clock across all three array types (Table 2). The median error for the Knight clock was consistently around 1 week across array types, whereas the Bohlin clock exhibited varied median errors, with the lowest at 0.58 week in the 27K array, 1.10 weeks in the 450K array, and the largest at 1.64 weeks in the EPIC array. Similar trends were observed for mean difference, with the Knight clock consistently underestimating GA, typically resulting in predictions 0.5 weeks less than observed GA. The Bohlin clock had a mean difference of − 0.31 week for the 27K array, − 0.99 week for the 450K array, and − 1.41 weeks for the EPIC array.
In CBMC DNA using the EPIC array, the Bohlin clock showed a higher correlation with observed GA than the Knight clock (0.66 vs. 0.38), with smaller median error (1.73 vs. 2.02 weeks), median residual (0.35 vs. 0.83 week) and mean difference (− 1.56 vs. − 1.90 weeks).
In blood spot DNA, the Bohlin clock GA showed stronger correlation with the observed GA than the Knight clock using the 450K and EPIC arrays (0.47 and 0.65 for Bohlin vs. 0.35 and 0.48 for Knight, Table 2). The median errors were around 1 week for both clocks in 450K data, but the Knight clock had a smaller median error than Bohlin clock for EPIC array data (0.89 vs. 1.42 weeks). The Knight clock also had smaller mean difference estimates than the Bohlin clock for both 450K (− 0.082 vs. − 0.94 week) and EPIC array data (− 0.32 vs. − 1.35 weeks).
For the preterm cohort with blood spot DNA, using the EPIC array, the Bohlin clock GA had a stronger correlation than the Knight clock (0.67 vs. 0.57), but with higher median error (3.91 vs. 2.52 weeks) and mean difference (4.01 vs. 2.52 weeks).
Thus, in all the blood samples collected at birth, the Bohlin clock GA demonstrates a stronger correlation with observed GA compared to the Knight clock GA. Nevertheless, the Knight clock exhibits a consistent trend of having lower median error and mean differences, except in cases where GA is predicted in cord blood DNA using the 27K array and in CBMC DNA using the EPIC array.
Gestational age (GA) prediction in placenta
We evaluated GA prediction in placental DNA using the Lee (robust placental clock [RPC]) and Mayne clocks, both of which were developed for placental samples. The Lee clock GA correlated more strongly with the observed GA than the Mayne clock GA in term placentas on the 450K (0.50 vs. 0.27) and EPIC arrays (0.36 vs. − 0.054), and in preterm placentas on the EPIC array (0.6 vs. 0.35). The Lee clock also had lower median errors and mean differences (Table 2). The age acceleration residuals were less varied for the Lee clock compared to the Mayne clock (Table 2).
Pediatric chronological age (CA) prediction
The performance of the PedBE and Horvath clocks was assessed across various tissue types and age ranges. For DNA obtained from buccal cells of children under the age of one, the PedBE clock exhibited a better correlation with the observed CA than the Horvath clock in both preterm (0.62 for PedBE vs. 0.41 for Horvath) and term babies (0.12 for PedBE vs. − 0.36 for Horvath). Furthermore, the PedBE clock demonstrated a smaller median error and mean difference, as well as less variability in age acceleration residuals (Table 2). It is worth noting that the term babies from which the buccal cells were obtained were all less than one month of age. When converted to CA in years, the observed age ranged from 0 to 0.082 years, a very narrow range that makes accurate prediction challenging. The predicted Horvath age ranged from 0 to 1.8 years, and the negative correlation demonstrated the poor predictive capacity of the Horvath clock in this subset of samples.
PBMC DNA samples tested with the EPIC array were all collected at the same age (7 years), so no correlation between the predicted and observed CA was expected, but the PedBE clock showed a smaller median error, median residual, and mean difference than the Horvath clock. The 450K array data, which included PBMC samples collected around ages 8 and 14, revealed that both clocks had similar correlations (~ 0.8) with the observed CA, but the PedBE clock had a larger median error and more pronounced mean difference than the Horvath clock.
In contrast, for peripheral whole blood samples collected under the age of 1 or between 3 and 10 years of age, the Horvath clock demonstrated higher correlation with the observed CA, with smaller median error and mean difference than the PedBE clock in both the 450K and EPIC array data. This difference in correlation between the Horvath and PedBE clocks was observed only in peripheral whole blood samples and not in PBMC. This discrepancy may reflect the variation in blood cell types present between these two sample types. For instance, whole blood includes neutrophils and eosinophils, which are not present in PBMC.
The results indicate that the PedBE clock performs better than the Horvath clock when analyzing DNA extracted from infant buccal cells. However, it is recommended to use the Horvath clock for analyzing pediatric blood samples, including PBMC and peripheral whole blood.
Epigenetic ages for preterm infants
This study offered a unique opportunity to assess the performance of epigenetic clocks on DNA obtained from preterm infants as samples of blood spot, placenta, and buccal cell DNA were available. Specifically, the Knight and Bohlin clocks were calculated for blood spot DNA, while the Lee and Mayne clocks were calculated for placental DNA. For buccal cells, both the Horvath and PedBE clocks were computed.
When evaluating these epigenetic clocks in the same tissue type and same platform, the predicted epigenetic ages were better correlated with the observed GA and CA in preterm infants compared with term infants (Table 2). These results demonstrate the accuracy of applying epigenetic clocks to preterm infants, suggesting that they can be used within these populations.
For blood spot and placental DNA, corresponding epigenetic clocks (Knight and Bohlin clocks for blood spot DNA, Lee and Mayne clocks for placental DNA) showed mean difference ranges of 2.52 to 7.38 weeks for GA prediction. In term infants, these clocks showed mean difference ranges of − 1.32 to 0.92 weeks. For buccal cells collected during the first year of life, the corresponding epigenetic clocks (Horvath and PedBE clocks) showed mean difference ranges from 0.07 to 1.26 years in preterm infants, and a similar range of 0.04 to 1.00 years in term infants. Thus, in blood spot and placental DNA collected in newborns, epigenetic ages predicted by the corresponding clocks showed age acceleration in preterm infants but not in term infants. However, in DNA collected from buccal cells of infants at a later life-stage (> 1 year), epigenetic ages estimated by corresponding clocks do not show as much age acceleration in preterm infants compared to term infants.
NeoAge clock
The NeoAge clock was trained on buccal cell samples and specifically developed for neonatal aging in preterm infants, providing predictions for both PMA and PNA. We first compared the NeoAge clock’s prediction of PNA with the PedBE clock’s prediction in buccal cells. In the current study, the largest cohort with buccal cells collected from preterm infants was also the cohort that contributed to the training dataset for the NeoAge clock [13]. Therefore, the prediction performance is very strong in this cohort and outperforms the PedBE clock (Table 3).
Table 3.
Tissue | Array | Birth term | GA (weeks) | N | NeoAge PMAỻ r* | NeoAge PMAỻ median error† | NeoAge PMAỻ median res§ | NeoAge PMAỻ mean diff‡ | Knight r* | Knight median error† | Knight median res§ | Knight mean diff‡ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Post-menstrual age (weeks) estimation | ||||||||||||
Blood spot | 450K | Term | 39.36 ± 1.91 | 253 | 0.18 | 4.72 | 0.29 | − 4.71 | 0.35 | 1.11 | 0.96 | − 0.08 |
Blood spot | EPIC | Term | 38.73 ± 1.60 | 176 | 0.27 | 3.08 | 0.21 | − 2.85 | 0.48 | 0.89 | 0.91 | − 0.32 |
Blood spot | EPIC | Preterm | 25.65 ± 1.27 | 272 | 0.27 | 8.31 | 0.29 | 8.53 | 0.57 | 2.52 | 1.01 | 2.52 |
Tissue | Array | Birth term | GA (weeks) | N | NeoAge PMAỻ r* | NeoAge PMAỻ median error† | NeoAge PMAỻ median res§ | NeoAge PMAỻ mean diff‡ | Lee r* | Lee median error† | Lee median res§ | Lee mean diff‡ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Placenta | EPIC | Term | 38.79 ± 1.18 | 121 | 0.16 | 1.91 | 0.39 | − 2.01 | 0.36 | 0.88 | 0.76 | − 0.32 |
Placenta | EPIC | Preterm | 25.68 ± 1.25 | 399 | 0.15 | 9.88 | 0.36 | 10.13 | 0.6 | 2.50 | 0.93 | 2.59 |
Tissue | Array | Birth term | GA (weeks) | N | NeoAge PMAỻ r* | NeoAge PMAỻ median error† | NeoAge PMAỻ median res§ | NeoAge PMAỻ mean diff‡ | Knight r* | Knight median error† | Knight median res§ | Knight mean diff‡ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Buccal cells | EPIC | Term | 38.13 ± 1.09 | 72 | 0.44 | 0.98 | 0.44 | − 0.87 | 0.25 | 11.01 | 0.59 | − 10.96 |
Buccal cells¶ | EPIC | Preterm | 26.63 ± 1.89 | 480 | 0.99 | 0.71 | 0.23 | − 0.78 | − 0.06 | 2.00 | 0.74 | − 0.75 |
Tissue | Array | Birth term | Age (weeks) | N | NeoAge PNA (weeks) r* | NeoAge PNA (weeks) median error† | NeoAge PNA (weeks) median res§ | NeoAge PNA (weeks) mean diff‡ | PedBE (weeks) r* | PedBE (weeks) median error† | PedBE (weeks) median res§ | PedBE (weeks) mean diff‡ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Post-natal age (weeks) estimation | ||||||||||||
Buccal cells | EPIC | Term | 0.52 ± 0.75 | 72 | − 0.02 | 7.35 | 0.65 | 7.44 | 0.12 | 2.66 | 2.53 | 2.30 |
Buccal cells¶ | EPIC | Preterm | 12.14 ± 4.52 | 480 | 1 | 1.00 | 0.20 | − 1.19 | 0.62 | 4.46 | 4.70 | 3.38 |
*Spearman correlation between predicted epigenetic age and observed gestational age/chronological age
†Median of the absolute difference between the predicted age and the observed age (weeks for gestational age and years for chronological age)
§Median of the absolute residuals from regressing the predicted epigenetic age onto the observed age (weeks for gestational age and years for chronological age)
‡Mean of the signed difference between the predicted age and the observed age (weeks for gestational age and years for chronological age)
CBMC Cord blood mononuclear cells, GA gestational age, PBMC Peripheral blood mononuclear cells
¶Training data used in developing NeoAge clock
ỻThe observed PMA was calculated as the sum of gestational age and the time elapsed after birth in weeks
In another cohort with buccal cells collected from term infants within the first month, the PedBE clock had a better correlation with the observed CA (0.12 vs. − 0.02), with smaller median error (2.66 vs. 7.35 weeks) and mean difference (2.30 vs. 7.44 weeks), than the NeoAge PNA (Table 3). However, the NeoAge clock is currently the only clock that predicts PMA, which incorporates both the GA (the time from conception to birth) and the time elapsed after birth. In this cohort, the correlation between the PMA predicted by the NeoAge clock and the observed GA plus the time after birth surpassed the correlation between the Knight clock-predicted GA and the observed GA (0.44 vs. 0.25). Moreover, the NeoAge clock exhibited substantially smaller median error (0.98 vs. 11.01 weeks) and mean difference (− 0.87 vs. − 10.96 weeks) than the Knight clock.
We further examined the NeoAge PMA performance in two additional tissue types. Specifically, we compared its performance to that of the Knight clock in blood spot samples and to the Lee clock in placental samples. Our findings suggest that the NeoAge PMA prediction did not perform as well in blood spot samples compared to the Knight clock. Similarly, the NeoAge clock did not perform as well as the Lee clock in placental samples (Table 3).
Variation in epigenetic age among diverse self-identified racial groups
Race is a social construct that may reflect the lived experiences of the reporter. These lived experiences may be associated with biological age [22]. Our study incorporates data from various self-reported race groups, presenting an opportunity to explore how epigenetic clocks perform across racially heterogeneous groups. To ensure a robust comparison, we required a minimum of 40 samples from each self-reported race group so that a Spearman correlation of approximately 0.3 could reach a significance level of p < 0.05 [23]. Our study specifically examines the Knight and Bohlin clocks in cord blood and blood spot samples, Lee and Mayne clocks in placental samples, and Horvath and PedBE clocks in buccal cells among both self-identified White and Black individuals (see Additional file 1: Table S6). To maintain simplicity and clarity, we will henceforth refer to the self-identified racial groups as either "White" or "Black" in the subsequent sections of the study.
When comparing the Knight and Bohlin clocks in cord blood samples, we observed similar performances for both clocks in both White and Black participants. However, using data from the 27K array, both the Knight and Bohlin clocks were more correlated with observed GA in Black individuals (n = 58) than White individuals (n = 66) (see Additional file 2: Fig. S2). In contrast, using data from 450K and EPIC arrays, both clocks were more correlated with observed GA in White individuals than Black individuals (see Additional file 2: Figs. S3 and S4). Also, we found consistently larger median errors for participants in the Black group than the White group across all three arrays and for both clocks (average 1.23 vs. 0.91 weeks, t test P = 0.005). The estimated mean differences were similar across groups for both clocks. It should be noted that the White and Black groups had similar sample sizes in the 27K array data, but the White group had > 3 times the sample size in the 450K array data and ~ 2 times the sample size in the EPIC array data.
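The link between a minimum of 40 samples and a detectable correlation of about 0.3 can be checked with a rough calculation. The Python sketch below uses the Fisher z normal approximation, which is an assumption on our part: it is borrowed from Pearson's correlation, is only approximate for Spearman's coefficient, and is not necessarily the method of the cited reference. The function name is hypothetical.

```python
from math import sqrt, tanh
from statistics import NormalDist


def min_significant_correlation(n, alpha=0.05):
    """Smallest |r| reaching two-sided significance at level alpha.

    Uses the Fisher z approximation: atanh(r) * sqrt(n - 3) ~ N(0, 1)
    under the null hypothesis of no correlation.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return tanh(z_crit / sqrt(n - 3))


threshold_40 = min_significant_correlation(40)
```

Under this approximation, the threshold for n = 40 is about 0.31, in line with the value of approximately 0.3 stated above; larger subgroups can detect weaker correlations.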
In blood spot samples collected from term infants using the EPIC array, we found both the Knight and Bohlin clocks had better correlations with observed GA in the Black group (n = 47) than the White group (n = 101), with larger mean differences (see Additional file 2: Fig. S5). The median errors were similar across both groups. In contrast, for blood spot samples from preterm infants, we observed better correlations with observed GA for both clocks in the White group (n = 170) compared to the Black group (n = 76) (see Additional file 2: Fig. S6). The median errors were similar between both groups, as were the mean differences.
Analyses of placental samples using the EPIC array in preterm infants revealed that both the Lee and Mayne clocks had better correlations with observed GA in the White group (n = 245) than in the Black group (n = 112), with similar median errors and mean differences (see Additional file 2: Fig. S7 and Additional file 1: Table S6). For buccal samples collected within the first year of life from preterm infants using the EPIC array, the Horvath clock showed better correlation with smaller median error and mean difference for the Black group (n = 94) than for the White group (n = 231); however, there was substantial overestimation in both groups. The PedBE clock showed overall better performance and similar results across race groups (see Additional file 2: Fig. S8).
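The three summary measures reported in these comparisons (Spearman correlation, median error and mean difference) can be computed per subgroup along the following lines. This is a minimal sketch, not the authors' code: it assumes that the median error is the median absolute difference between predicted and observed age and that the mean difference is the mean of predicted minus observed age. The function names and the example values are hypothetical.

```python
from statistics import mean, median


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def clock_metrics(predicted, observed):
    """Summarize agreement between clock-predicted and observed ages."""
    diffs = [p - o for p, o in zip(predicted, observed)]
    return {
        # Spearman correlation is the Pearson correlation of the ranks.
        "spearman_r": pearson(average_ranks(predicted), average_ranks(observed)),
        "median_error": median([abs(d) for d in diffs]),
        "mean_difference": mean(diffs),
    }


if __name__ == "__main__":
    # Made-up gestational ages in weeks, for illustration only.
    observed = [38.0, 39.0, 40.0, 41.0, 37.0]
    predicted = [38.5, 38.8, 40.9, 40.2, 37.6]
    print(clock_metrics(predicted, observed))
```

Applying the same function separately to each race, sex or tissue subgroup yields values that can be compared in the way described in this section. A positive mean difference indicates that the clock overestimates age on average.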
Comparison of epigenetic clocks between sexes
We conducted a comparative analysis of epigenetic clocks between males and females in each subgroup, with a requirement of a minimum of 40 samples. It is important to highlight that we had balanced sample sizes for both sexes in all subsets (Additional file 1: Table S7).
The Knight and Bohlin clocks displayed comparable performances in blood spot samples, with slight variations observed in EPIC array data. Specifically, in blood spot samples collected from term babies with the EPIC array, the Knight clock showed a better correlation in females with smaller median error and mean difference. Conversely, the Bohlin clock had a better correlation in males with smaller median error and mean difference. In blood spot samples obtained from preterm babies with EPIC array data, both clocks showed better correlations in males. In CBMC, both the Knight and Bohlin clocks exhibited better correlations with observed GA in males compared to females, with similar median errors, median residuals and mean differences. For cord blood samples, the Knight clock showed a slightly better correlation in 27K array data for females, while it showed a better correlation with the observed GA in males for 450K array data. On the other hand, the Bohlin clock showed slightly better correlations in males for both 450K and EPIC array data (see Additional file 1: Table S7).
For placental samples, both the Lee and Mayne clocks had better correlations with the observed GA in males, showing similar median errors, median residuals and mean differences between sexes (see Additional file 1: Table S7).
In the case of buccal cells, PBMC, and peripheral whole blood samples, both the Horvath and PedBE clocks exhibited similar performances between sexes (see Additional file 1: Table S7).
Comparison of predicted epigenetic age across tissues
In this study, some participants had DNAm measures from multiple tissue types. We subsequently compared the estimated epigenetic ages, using appropriate clocks, across various tissues for these subsets of participants (see Additional file 1: Table S8).
We first analyzed data from 258 preterm infants who had both blood spot and placental samples collected at birth. For blood spot DNA, we used the Knight clock, and for placental DNA, we used the Lee clock. Our analysis revealed that the predicted epigenetic ages were similar across tissues. Both clocks predicted similar age acceleration of approximately 2.5 weeks. This cross-tissue validation provides additional confidence in our findings that preterm infants have an older epigenetic age than term infants.
For another subset of 68 term newborns who had both cord blood and placental samples, we used the Knight clock for the cord blood DNA and the Lee clock for the placental DNA. The estimated epigenetic ages were also similar, with a mean difference of −0.28 weeks for the Knight clock and 0.06 weeks for the Lee clock.
Our results indicate that epigenetic age estimations are comparable across tissue types available in our study (blood spot, cord blood and placenta) when tissue-appropriate clocks are employed.
Discussion
Our study aimed to evaluate epigenetic clocks for measuring gestational age and early-childhood chronological age by analyzing 4,555 DNAm samples from 7 tissue types with 3 arrays in cohorts from a large national consortium. Our performance comparisons emphasize the strengths and limitations of each epigenetic clock in accurately determining gestational or chronological age in diverse pediatric populations. These analyses suggested three major conclusions: (1) the Knight and Bohlin clocks had comparable performance in predicting gestational age in blood cell samples (cord blood, CBMC, and blood spot), with the Bohlin clock being more highly correlated but with larger errors and mean differences in some cases; (2) the Lee clock outperformed the Mayne clock in predicting gestational age in placental samples; and (3) the PedBE clock was more accurate than the Horvath clock in predicting CA in infant buccal cells, with higher correlation with the observed CA and smaller median error and mean difference. Conversely, the Horvath clock performed better in pediatric blood cell samples (PBMC and peripheral whole blood).
By analyzing DNAm from preterm infants, we had the opportunity to compare the performance of epigenetic clocks based on term status and to evaluate the NeoAge clock, which was specialized for preterm infants. Our results showed that all six clocks had better correlations with observed gestational/chronological ages in preterm infants compared to term infants in corresponding tissues. The predicted epigenetic ages exhibited mean differences ranging from 2.5 to 7.4 weeks for preterm infants. This finding is consistent with earlier research demonstrating gestational age acceleration in preterm newborns [24–26]. While the NeoAge clock showed better accuracy in predicting post-natal and post-menstrual age in buccal cells for preterm infants, it did not perform well in blood spot and placental DNA samples. The PedBE clock exhibited superior accuracy in predicting post-natal age for buccal cells collected within 30 days of birth in term infants, as compared to the NeoAge clock. Nevertheless, the NeoAge clock offers a unique prediction for post-menstrual age, which is an estimate of the duration between conception and tissue collection, and this estimate corresponds well with the actual age at tissue collection.
Using data on self-reported racial group, we were able to test whether the epigenetic clocks performed equally for the White and Black individuals in cord blood, blood spot, placental and buccal samples. The results of epigenetic clocks across these groups were not consistent. Although the majority of training data for these epigenetic clocks are based upon samples from White individuals, our study found that in some cases, epigenetic clocks performed better in self-reported Black individuals. Specifically, the Knight and Bohlin clocks in cord blood DNA using 27K array and in blood spot DNA using EPIC array, as well as the Horvath clock in buccal cells using EPIC array showed better correlations with observed gestational/chronological age in Black individuals. These findings suggest that further research is needed to explore the adaptability of epigenetic clocks for diverse populations.
Our study is the largest to date evaluating epigenetic clocks in pediatric populations across various tissue types using different arrays, providing valuable insights for future applications of epigenetic clock tools in specific tissues throughout different stages of pediatric life. However, it is important to consider certain limitations when interpreting the results. Firstly, the sample sizes from blood tissues are considerably larger than those from other tissues, which may limit the power for comparisons in certain tissues. Secondly, the number of chronological age points from available pediatric samples is limited, which could constrain the evaluation of epigenetic clocks that predict chronological age. Additionally, the observed gestational ages in our study may have been obtained from different sources, such as maternal self-report or first-trimester ultrasound. The variability in the source of gestational age data could introduce potential biases and affect the reliability and accuracy of the results. Similarly, not all CpGs used in each epigenetic clock are available in our data, necessitating the use of the nearest CpGs as proxies, which may influence the performance of the epigenetic clocks. For example, 90% (88 out of 97) of the CpGs in the Bohlin clock were missing on the 27K array. Another important consideration is the heterogeneity of the pediatric cohort samples included in this study. Variation in factors such as sample size, demographics, and recruitment weighted toward specific underlying health conditions could contribute to the variation in epigenetic age estimates. Therefore, these data should be interpreted with caution, particularly as they pertain to conclusions about the performance of specific epigenetic clocks in pediatric populations with varied health backgrounds. Furthermore, the lack of genetic data limits our ability to examine the effects of genetic ancestry on the performance of epigenetic clocks.
Lastly, it is worth noting that our study primarily focuses on comparing age estimation accuracy, which may not fully reflect the clinical relevance of epigenetic age during the process of development and aging. To gain a comprehensive understanding of the implications, it is essential to consider other common covariates, such as cell type proportions, batch effects, sex, and health conditions when examining the associations between epigenetic clocks and health outcomes.
Conclusion
Our study has provided valuable insights into the performance of seven epigenetic clocks across various tissue and array types for samples collected at birth or early childhood. Our results suggest that the optimal choice of epigenetic clock depends on the specific tissue and age group under investigation (Fig. 2). For instance, the Bohlin and the Knight clocks are the preferred options for newborn blood cell samples, while the Lee clock is recommended for placental samples. The higher correlation makes the Bohlin clock a better choice to evaluate health and development. In contrast, for age prediction purposes, the Knight clock is preferred for its lower median errors. The PedBE clock is suitable for children's buccal cell samples, while the Horvath clock is recommended for children's blood cell samples. Notably, the NeoAge clock stands out for its unique ability to predict post-menstrual age and high correlation with the observed age in infant buccal cell samples. Overall, our study provides practical recommendations for selecting the most appropriate epigenetic clock in different research contexts, highlighting the significance of accounting for tissue and platform differences when interpreting results.
Acknowledgements
The authors wish to thank our ECHO Colleagues; the medical, nursing, and program staff; and the children and families participating in the ECHO cohorts. We also acknowledge the contribution of the following ECHO Program collaborators: ECHO Components—Coordinating Center: Duke Clinical Research Institute, Durham, North Carolina: Smith PB, Newby LK; Data Analysis Center: Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland: Jacobson LP; Research Triangle Institute, Durham, North Carolina: Catellier DJ; Person-Reported Outcomes Core: Northwestern University, Evanston, Illinois: Gershon R, Cella D. ECHO Awardees and Cohorts—Arnold Palmer Hospital for Children, Orlando, FL: Laham FR; Boston Children’s Hospital, Boston, MA: Mansbach JM; Children's Hospital of Los Angeles, Los Angeles, CA: Wu S; Children’s Hospital of Philadelphia, Philadelphia, PA: Spergel JM; Children's Hospital of Pittsburgh of UPMC, Pittsburgh, PA: Celedón JC; Children's Mercy Hospital & Clinics, Kansas City, MO: Puls HT; Children's National Hospital, Washington, DC: Teach SJ; Cincinnati Children's Hospital and Medical Center, Cincinnati, OH: Porter SC; Connecticut Children's Medical Center, Hartford, CT: Waynik IY; Dell Children's Medical Center of Central Texas, Austin, TX: Iyer SS; Massachusetts General Hospital, Boston, MA: Samuels-Kalow ME; Nemours Children's Hospital, Wilmington, DE: Thompson AD; Norton Children’s Hospital, Louisville, KY: Stevenson MD; Phoenix Children’s Hospital, Phoenix AZ: Bauer CS; Oklahoma University – Tulsa, Tulsa, OK: Inhofe NR; Seattle Children's Hospital, Seattle, WA: Boos M; Texas Children's Hospital, Houston, TX: Macias CG; University of Wisconsin, Madison WI: Gern J; University of Wisconsin, Madison, WI: Jackson D; Children’s Hospital of New York, New York, NY: Bacharier L, Kattan M; Johns Hopkins University, School of Medicine, Baltimore, MD: Wood R; Washington University in St Louis, St Louis, MO: Rivera-Spoljaric K, Bacharier L; 
University of Southern California, Los Angeles, CA: Bastain T, Farzan S, Habre R; University of Washington, Department of Environmental and Occupational Health Sciences, Seattle, WA: Karr C; University of Tennessee Health Science Center, Memphis, TN: Tylavsky F, Mason A, Zhao Q; Seattle Children’s Research Institute, Seattle, WA: Sathyanarayana S; University of California, San Francisco, San Francisco, CA: Bush N, LeWinn KZ; Women & Infants Hospital of Rhode Island, Providence, RI: Lester B; Children’s Mercy, Kansas City, MO: Carter B; Corewell Health, Helen DeVos Children’s Hospital, Grand Rapids, MI: Pastyrnak S; Kapiolani Medical Center for Women and Children, Providence, RI: Neal C; Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Los Angeles, CA: Smith L; Wake Forest University School of Medicine, Winston Salem, NC: Helderman J; Oregon Health and Science University, Portland, OR: McEvoy C; Indiana University, Riley Hospital for Children, Indianapolis, IN: Tepper R; AJ Drexel Autism Institute, Philadelphia, PA: Lyall K; Johns Hopkins Bloomberg School of Public Health, Baltimore, MD: Volk H; University of California Davis Health, MIND Institute, Davis, CA: Schmidt R; Kaiser Permanente Northern California Division of Research, Oakland, CA: Croen L; University of North Carolina, Chapel Hill, NC: O’Shea M; Baystate Children’s Hospital, Springfield, MA: Vaidya R; Beaumont Children’s Hospital, Royal Oak, MI: Obeid R; Boston Children’s Hospital, Boston, MA: Rollins C; East Carolina University, Brody School of Medicine, Greenville, NC: Bear K; Corewell Health, Helen DeVos Children’s Hospital, Grand Rapids, MI: Pastyrnak S; Michigan State University College of Human Medicine, East Lansing, MI: Lenski M; Tufts University School of Medicine, Boston, MA: Singh R; University of Chicago, Chicago, IL: Msall M; University of Massachusetts Chan Medical School, Worcester, MA: Frazier J; Atrium Health Wake Forest Baptist, Winston Salem, NC: Gogcu S; Yale School of Medicine, New Haven, CT: Montgomery A; Boston Medical Center, Boston, MA: Kuban K, Douglass L, Jara H; Boston University, Boston, MA: Joseph R; Michigan State University, East Lansing, MI: Kerver JM; Columbia University Medical Center, New York, NY: Perera F.
Abbreviations
- CA: Chronological age
- CANDLE: Conditions Affecting Neurocognitive Development and Learning in Early Childhood
- CBMC: Cord blood mononuclear cells
- CpGs: Cytosine guanine dinucleotides
- DNAm: DNA methylation
- ECHO: Environmental influences on Child Health Outcomes
- GA: Gestational age
- NHBCS: New Hampshire Birth Cohort Study
- NOVI: Neonatal Neurobehavior and Outcomes in Very Preterm Infants
- PBMC: Peripheral blood mononuclear cells
- PedBE: Pediatric-buccal-epigenetic
- PMA: Post-menstrual age
- PNA: Post-natal age
- r: Spearman correlation coefficient
- RPC: Robust placental clock
- SNP: Single nucleotide polymorphism
Author contributions
All authors reviewed the results and approved the manuscript. FF, GP, and CL conceived and designed the study. FF and LZ conducted analyses. WP, CJM, AKK, AC, MTA, MFH, IMA, JMG, AKS, AG, RCF, EO, GO, DMR, LT, JBH, CAJC, ALD, DMD, MRK, CVB, CO and TME contributed to data collection, study design, analysis plan and interpretation of the results.
Funding
The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Research reported in this publication was supported by the Environmental influences on Child Health Outcomes (ECHO) program, Office of the Director, National Institutes of Health, under Award Numbers U2COD023375 (Coordinating Center), U24OD023382 (Data Analysis Center), U24OD023319 with co-funding from the Office of Behavioral and Social Science Research (PRO Core). The following grants supported data collection: Project Viva is supported by NIH UH3OD023286 (Oken) and R01HD034568; The Healthy Start Study is funded by the National Institutes of Health (NIH) R01DK076648 and UH3OD023248 (Dabelea). The URECA studies are supported by NIH grants UG3/UH3OD023282 (Gern), UM1 AI114271, and UM1 AI160040; ARCH is supported by NIH UG3OD023285 and UH3OD023285 (Kerver); MADRES is supported by UH3OD023287 (Breton); ECHO-NOVI is supported by R01HD072267, UG3OD023347, UH3OD023347 (Lester), and R01HD084515; UH3OD023318 (Dunlop), R01MD009064 (Smith/Dunlop); UH3OD023253 (Camargo); UH3OD023275 (Karagas); UH3OD023271 (Karr); UH3OD023288 (McEvoy); UH3OD023342 (Lyall); UH3OD023348 (O’Shea); UH3OD023290 (Herbstman) and UH3OD023305 (Trasande). Dr. Perng is supported by the Center for Clinical and Translational Sciences Institute KL2-TR002534 and ADA-7-22-ICTSPM-08.
Availability of data and materials
All the data generated in this study are included in the manuscript and supplementary files. General data relevant to ECHO are available from the Environmental influences on Child Health Outcomes (ECHO) Program website of the National Institutes of Health (NIH).
Declarations
Ethics approval and consent to participate
The ECHO-wide Cohort Data Collection Protocol operates under a single Institutional Review Board (IRB) administered by the Western Institutional Review Board (WIRB) Copernicus Group IRB, which is registered with both the Office for Human Research Protections (OHRP) and the FDA as IRB00000533. Compliance with regulatory requirements for the ECHO-wide Cohort Data Collection Protocol at participating cohort sites is the responsibility of properly constituted Institutional Review Boards, which can either be the ECHO single IRB or the ECHO cohort's local IRB. These governing IRBs review ECHO protocols, as well as informed consent/assent forms, HIPAA authorization forms, recruitment materials, and other relevant information before any ECHO-wide Cohort Data Collection Protocol-related procedures or activities can begin. The ECHO Data Analysis Center's work is approved by the Institutional Review Board at the Johns Hopkins Bloomberg School of Public Health. ECHO Cohort Investigators, or their designated study personnel, obtained written informed consent from all participants in the current study for both ECHO-wide Cohort Data Collection Protocol participation and participation in their specific cohorts.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
See Acknowledgments for full listing of collaborators.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Fang Fang, Email: ffang@rti.org.
on behalf of program collaborators for Environmental influences on Child Health Outcomes:
P. B. Smith, L. K. Newby, L. P. Jacobson, D. J. Catellier, R. Gershon, D. Cella, F. R. Laham, J. M. Mansbach, S. Wu, J. M. Spergel, J. C. Celedón, H. T. Puls, S. J. Teach, S. C. Porter, I. Y. Waynik, S. S. Iyer, M. E. Samuels-Kalow, A. D. Thompson, M. D. Stevenson, C. S. Bauer, N. R. Inhofe, M. Boos, C. G. Macias, J. Gern, D. Jackson, L. Bacharier, M. Kattan, R. Wood, K. Rivera-Spoljaric, L. Bacharier, T. Bastain, S. Farzan, R. Habre, C. Karr, F. Tylavsky, A. Mason, Q. Zhao, S. Sathyanarayana, N. Bush, K. Z. LeWinn, B. Lester, B. Carter, S. Pastyrnak, C. Neal, L. Smith, J. Helderman, C. McEvoy, R. Tepper, K. Lyall, H. Volk, R. Schmidt, L. Croen, M. O’Shea, R. Vaidya, R. Obeid, C. Rollins, K. Bear, S. Pastyrnak, M. Lenski, R. Singh, M. Msall, J. Frazier, S. Gogcu, A. Montgomery, K. Kuban, L. Douglass, H. Jara, R. Joseph, J. M. Kerver, and F. Perera
References
- 1. Santos KF, Mazzola TN, Carvalho HF. The prima donna of epigenetics: the regulation of gene expression by DNA methylation. Braz J Med Biol Res. 2005;38(10):1531–1541. doi: 10.1590/S0100-879X2005001000010.
- 2. Nishiyama A, Nakanishi M. Navigating the DNA methylation landscape of cancer. Trends Genet. 2021;37(11):1012–1027. doi: 10.1016/j.tig.2021.05.002.
- 3. Wu YL, Lin ZJ, Li CC, Lin X, Shan SK, Guo B, Zheng MH, Li F, Yuan LQ, Li ZH. Epigenetic regulation in metabolic diseases: mechanisms and advances in clinical study. Signal Transduct Target Ther. 2023;8(1):98. doi: 10.1038/s41392-023-01333-7.
- 4. Noroozi R, Ghafouri-Fard S, Pisarek A, Rudnicka J, Spolnicka M, Branicki W, Taheri M, Pospiech E. DNA methylation-based age clocks: from age prediction to age reversion. Ageing Res Rev. 2021;68:101314. doi: 10.1016/j.arr.2021.101314.
- 5. Horvath S, Raj K. DNA methylation-based biomarkers and the epigenetic clock theory of ageing. Nat Rev Genet. 2018;19(6):371–384. doi: 10.1038/s41576-018-0004-3.
- 6. Di Lena P, Sala C, Nardini C. Evaluation of different computational methods for DNA methylation-based biological age. Brief Bioinform. 2022;23(4):bbac274. doi: 10.1093/bib/bbac274.
- 7. Horvath S. DNA methylation age of human tissues and cell types. Genome Biol. 2013;14(10):R115. doi: 10.1186/gb-2013-14-10-r115.
- 8. Knight AK, Craig JM, Theda C, Baekvad-Hansen M, Bybjerg-Grauholm J, Hansen CS, Hollegaard MV, Hougaard DM, Mortensen PB, Weinsheimer SM, et al. An epigenetic clock for gestational age at birth based on blood methylation data. Genome Biol. 2016;17(1):206. doi: 10.1186/s13059-016-1068-z.
- 9. Bohlin J, Haberg SE, Magnus P, Reese SE, Gjessing HK, Magnus MC, Parr CL, Page CM, London SJ, Nystad W. Prediction of gestational age based on genome-wide differentially methylated regions. Genome Biol. 2016;17(1):207. doi: 10.1186/s13059-016-1063-4.
- 10. Lee Y, Choufani S, Weksberg R, Wilson SL, Yuan V, Burt A, Marsit C, Lu AT, Ritz B, Bohlin J, et al. Placental epigenetic clocks: estimating gestational age using placental DNA methylation levels. Aging (Albany NY). 2019;11(12):4238–4253. doi: 10.18632/aging.102049.
- 11. Mayne BT, Leemaqz SY, Smith AK, Breen J, Roberts CT, Bianco-Miotto T. Accelerated placental aging in early onset preeclampsia pregnancies identified by DNA methylation. Epigenomics. 2017;9(3):279–289. doi: 10.2217/epi-2016-0103.
- 12. McEwen LM, O'Donnell KJ, McGill MG, Edgar RD, Jones MJ, MacIsaac JL, Lin DTS, Ramadori K, Morin A, Gladish N, et al. The PedBE clock accurately estimates DNA methylation age in pediatric buccal cells. Proc Natl Acad Sci USA. 2020;117(38):23329–23335. doi: 10.1073/pnas.1820843116.
- 13. Graw S, Camerota M, Carter BS, Helderman J, Hofheimer JA, McGowan EC, Neal CR, Pastyrnak SL, Smith LM, DellaGrotta SA, et al. NEOage clocks - epigenetic clocks to estimate post-menstrual and postnatal age in preterm infants. Aging (Albany NY). 2021;13(20):23527–23544. doi: 10.18632/aging.203637.
- 14. Knapp EA, Kress AM, Parker CB, Page GP, McArthur K, Gachigi KK, Alshawabkeh AN, Aschner JL, Bastain TM, Breton CV, et al. The environmental influences on child health outcomes (ECHO)-wide cohort. Am J Epidemiol. 2023;192(8):1249–1263. doi: 10.1093/aje/kwad071.
- 15. Aryee MJ, Jaffe AE, Corrada-Bravo H, Ladd-Acosta C, Feinberg AP, Hansen KD, Irizarry RA. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics. 2014;30(10):1363–1369. doi: 10.1093/bioinformatics/btu049.
- 16. Chen YA, Lemire M, Choufani S, Butcher DT, Grafodatskaya D, Zanke BW, Gallinger S, Hudson TJ, Weksberg R. Discovery of cross-reactive probes and polymorphic CpGs in the Illumina Infinium HumanMethylation450 microarray. Epigenetics. 2013;8(2):203–209. doi: 10.4161/epi.23470.
- 17. Triche TJ Jr, Weisenberger DJ, Van Den Berg D, Laird PW, Siegmund KD. Low-level processing of Illumina Infinium DNA methylation BeadArrays. Nucleic Acids Res. 2013;41(7):e90. doi: 10.1093/nar/gkt090.
- 18. Yuan V, Price EM, Del Gobbo G, Mostafavi S, Cox B, Binder AM, Michels KB, Marsit C, Robinson WP. Accurate ethnicity prediction from placental DNA methylation data. Epigenet Chromatin. 2019;12(1):51. doi: 10.1186/s13072-019-0296-3.
- 19. Xu Z, Niu L, Li L, Taylor JA. ENmix: a novel background correction method for Illumina HumanMethylation450 BeadChip. Nucleic Acids Res. 2016;44(3):e20. doi: 10.1093/nar/gkv907.
- 20. Ladd-Acosta C, Vang E, Barrett ES, Bulka CM, Bush NR, Cardenas A, Dabelea D, Dunlop AL, Fry RC, Gao X, et al. Analysis of pregnancy complications and epigenetic gestational age of newborns. JAMA Netw Open. 2023;6(2):e230672. doi: 10.1001/jamanetworkopen.2023.0672.
- 21. McEwen LM, Jones MJ, Lin DTS, Edgar RD, Husquin LT, MacIsaac JL, Ramadori KE, Morin AM, Rider CF, Carlsten C, et al. Systematic evaluation of DNA methylation age estimation with common preprocessing methods and the Infinium MethylationEPIC BeadChip array. Clin Epigenet. 2018;10(1):123. doi: 10.1186/s13148-018-0556-2.
- 22. Crimmins EM, Thyagarajan B, Levine ME, Weir DR, Faul J. Associations of age, sex, race/ethnicity, and education with 13 epigenetic clocks in a nationally representative US sample: the Health and Retirement Study. J Gerontol A Biol Sci Med Sci. 2021;76(6):1117–1123. doi: 10.1093/gerona/glab016.
- 23. Looney SW. Practical issues in sample size determination for correlation coefficient inference. SM J Biom Biostat. 2018;3(1):1027–1030.
- 24. Knight AK, Smith AK, Conneely KN, Dalach P, Loke YJ, Cheong JL, Davis PG, Craig JM, Doyle LW, Theda C. Relationship between epigenetic maturity and respiratory morbidity in preterm infants. J Pediatr. 2018;198(168–173):e162. doi: 10.1016/j.jpeds.2018.02.074.
- 25. Clark J, Bulka CM, Martin CL, Roell K, Santos HP, O'Shea TM, Smeester L, Fry R, Dhingra R. Placental epigenetic gestational aging in relation to maternal sociodemographic factors and smoking among infants born extremely preterm: a descriptive study. Epigenetics. 2022;17(13):2389–2403. doi: 10.1080/15592294.2022.2125717.
- 26. Wang X, Cho HY, Campbell MR, Panduri V, Coviello S, Caballero MT, Sambandan D, Kleeberger SR, Polack FP, Ofman G, et al. Epigenome-wide association study of bronchopulmonary dysplasia in preterm infants: results from the discovery-BPD program. Clin Epigenet. 2022;14(1):57. doi: 10.1186/s13148-022-01272-0.