Abstract
Maternal age is an established predictor of preterm birth independent of other recognized risk factors. The use of chronological age assumes that individuals age at a similar rate. It therefore does not capture interindividual differences that may exist due to genetic background and environmental exposures. As a result, there is a need to identify biomarkers that more closely index the rate of cellular aging. One potential candidate is biological age (BA) estimated from the DNA methylome. This study investigated whether maternal BA, estimated in early and/or late pregnancy, predicts gestational age at birth. BA was estimated from a genome-wide DNA methylation platform using the Horvath algorithm. Linear regression methods assessed the relationship between BA and pregnancy outcomes, including gestational age at birth and prenatal perceived stress, in a primary and a replication cohort. Prenatal BA estimates from early pregnancy explained variance in gestational age at birth above and beyond the influence of other recognized preterm birth risk factors. Sensitivity analyses indicated that this signal was driven primarily by self-identified African American participants. This predictive relationship was sensitive to small variations in the BA estimation algorithm. Benefits and limitations of using BA in translational research and clinical applications for preterm birth are considered.
Subject terms: DNA methylation, Predictive markers, Reproductive disorders
Introduction
Preterm birth (PTB; birth before 37 completed weeks of gestation) remains the leading contributor to neonatal mortality and morbidity worldwide1. In addition to the emotional distress associated with PTB, the monetary costs associated with PTB complications in the United States exceeded $26 billion in 2005 alone2. Prenatal interventions to reduce the prevalence of PTB have shown promise, but identifying women at high risk for preterm delivery can be challenging. Results from epidemiological and family studies confirm that genetic, environmental, and behavioral factors jointly influence PTB risk liability3–6. However, translating these results into improved clinical prediction models has proven difficult. One potential avenue for relating these three contributory sources to PTB risk is through DNA methylation-based biological age estimates.
Biological age (BA) describes the rate of cellular aging and progression towards senescence. Conventionally, researchers and clinicians have used chronological age to proxy BA, but accumulating evidence suggests that deviations between BA and chronological age are informative about risk for future adverse health outcomes, such as early mortality and cancer7–11. The notion that advanced BA indexes biological changes with respect to aging and senescence is supported by association studies with health outcomes7, including (but not limited to) reports that individuals with Werner and Hutchinson-Gilford progeria syndromes, genetic disorders of premature aging, exhibit markedly advanced BAs12,13. BA is most commonly estimated by patterns of DNA methylation (DNAm), which have been shown to correlate with chronological age14. DNAm is an epigenetic modification to DNA associated with genomic stability, transcriptional activity, and chromatin conformation. DNAm patterns change over time as a function of normal physiology15. Several hundred genomic loci have been robustly associated with age-related DNAm remodeling, and the DNAm levels at these sites are used to estimate BA12,16,17. The rationale that DNAm may index cellular aging stems from the susceptibility of DNAm remodeling to genetic, environmental, and behavioral factors which change throughout the life course. Moreover, aberrant DNAm patterns have been associated with negative health outcomes, including congenital disorders, developmental delay, and elevated risk for cancer, which further underscores the notion that unexpected changes in DNAm (and BA) are salient to current and future health outcomes15.
Three primary lines of evidence support investigating the potential of BA to improve clinical predictive models of PTB risk. First, BA calculated from DNAm may reflect influences of past behaviors (e.g., smoking) and environmental exposures (e.g., pollution, trauma)18–20. This sensitivity to PTB risk factors alone suggests that BA may be more useful than chronological age, which is uniform regardless of life experiences. Second, incorporating genomic information in the form of polygenic risk scores (PRS) has improved clinical prediction algorithms for other multifactorial disorders, like breast cancer, prostate cancer, and type 1 diabetes21,22. Similar success may be possible for cumulative epigenomic summaries like BA. Third, significant racial health disparities in PTB rates have persisted in the United States for decades between individuals who self-identified as non-Hispanic African American (AA) and non-Hispanic European American (EA)23. A putative driver of this disparity is biological weathering, which is premature cellular deterioration due to chronic social, economic, and environmental stressors24–26. In the United States, AA women experience increased levels of chronic stressors compared to EA women due to differences in social and environmental determinants of health like access to medical care and experiences of discrimination and racism27,28. The weathering hypothesis posits that the accumulation of these chronic stressors causes a physiological response that promotes cellular dysfunction and deterioration29. This notion is supported by the observation that advanced maternal age-related perinatal complications begin, on average, at younger ages for AA women compared to EA women30. The biological mechanisms linking stressful and traumatic experiences to increased risk for complex disorders like pregnancy complications have not been confirmed, but BA provides a plausible mechanism to explain how chronic stressors affect health. 
Moreover, the observed variability in the age at which risk for age-related complications begins further underscores the idea that BA may be more informative of individual risk than chronological age.
The purpose of this study was to explore the relationship between BA and gestational age at delivery (GAAD) in a racially diverse longitudinal cohort of pregnant women. To date, PTB research has primarily focused on investigating postnatal fetal measurements of cellular aging rather than maternal BA during pregnancy31–35. By measuring fetal BA, these studies could be assessing epigenetic changes that provide information about the developmental maturity of an infant at birth, rather than biological processes related to the onset of labor. As a result, little is known about the behavior of maternal BA during pregnancy and the relationship between maternal BA and pregnancy outcomes. This study sought to address these gaps by using repeated measures of DNAm from two longitudinal cohorts of pregnant women to characterize the stability of maternal DNAm-based BA across pregnancy, assess its relationship with GAAD, and determine whether these predictions vary by self-identified Census-based race category. The impact of chronological age, prenatal perceived stress level and tobacco smoking on the association between BA and GAAD was considered to evaluate the potential for BA to account for additional variation in GAAD above and beyond these established PTB risk factors. BA stability during pregnancy also was examined to identify informative intervals for evaluating PTB risk. Replication analyses were conducted in an independent cohort.
Results
Participant demographics
After filtering for all inclusion/exclusion criteria, the PREG cohort consisted of 177 women who self-identified as non-Hispanic Black (n = 89) or non-Hispanic White (n = 88). Meeting the same criteria, the GAPPS cohort included 52 women, all of whom self-identified as non-Hispanic Caucasian and not as AA (see Table 1 for additional participant demographics). To maintain consistent terms across cohorts, EA and AA are used to describe women who self-identified as non-Hispanic White/Caucasian or as Black/AA, respectively. The demographic attributes of the PREG EA subset and GAPPS cohort were, for the most part, more similar to each other than to the PREG AA subset. Overall, PREG AA women were more likely to be younger and to report higher levels of perceived stress, and were less likely to report taking daily prenatal vitamins. The PTB rates for the PREG and GAPPS cohorts were similar, but PREG AA women had significantly earlier GAAD (Table 1; see Supplemental Figs. S1 and S2 for full distribution of GAAD).
Table 1.
Characteristic | PREG EA | PREG AA^a | GAPPS EA^a |
---|---|---|---|
N | 88 (49.7%) | 89 (50.3%) | 52 (100%) |
Age (years) | 31.0 (3.4) [23, 38] | 27.0 (5.5) [18, 40]* | 31.1 (5.8) [19, 41] |
Educational attainment | | | |
< High school diploma | 1 (1.1%) | 20 (22.5%)* | 0 (0%) |
High school diploma | 3 (3.4%) | 31 (34.8%)* | 6 (11.5%) |
At least some college | 82 (93.2%) | 36 (40.4%)* | 45 (86.6%) |
Pregnancy characteristics | | | |
GAAD (days) | 277.6 (8.3) [259, 294] | 272.5 (10.5) [229, 294]* | 275.8 (10.6) [232, 289] |
Preterm delivery^b | 1 (1.1%) | 8 (9.0%)* | 3 (5.8%) |
Primiparous | 38 (43.2%) | 19 (21.3%)* | 18 (34.6%)* |
Prenatal vitamin use^c | 76 (86.4%) | 28 (31.5%)* | 50 (96.2%)* |
Early prenatal perceived stress^d,f | 12.3 (6.2) [1, 25] | 15.8 (6.3) [2, 28]* | 14.1 (4.9) [5, 31] |
Late prenatal perceived stress^e,f | 10.5 (6.1) [0, 27] | 14.5 (6.4) [0, 30]* | 14.9 (5.4) [4, 30]* |
Biological age (years) | | | |
Early pregnancy^d | 39.4 (5.4) [23.9, 53.6] | 35.4 (6.6) [16.9, 46.1]* | 43.6 (4.4) [31.0, 53.3]* |
Late pregnancy^e | 40.3 (4.6) [24.0, 49.4] | 35.3 (6.3) [19.5, 51.6]* | 43.7 (4.1) [32.0, 52.8]* |
Age difference^g (years) | | | |
Early pregnancy^d | 8.8 (4.1) [0.4, 18.4] | 8.3 (4.1) [0.6, 19.4] | 12.6 (4.2) [3.9, 21.5]* |
Late pregnancy^e | 9.5 (3.1) [0.3, 18.0] | 8.4 (4.2) [0.0, 21.0] | 12.5 (4.3) [4.4, 21.7]* |
Values are M (SD) [min, max] or N (%).
EA = European American, AA = African American, GAAD = gestational age at delivery (in days).
* Statistically significant difference (Welch’s t-test).
^a All comparisons tested against the PREG EA-only subset.
^b Delivery before 260 days gestation.
^c Only assessed at first study visit.
^d Corresponds to late first or second trimester.
^e Corresponds to third trimester.
^f Assessed using the Perceived Stress Scale55.
^g Absolute difference between maternal chronological age and biological age.
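As a rough illustration of the group comparisons summarized in Table 1, the sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom from summary statistics alone. It uses the maternal age row (PREG EA vs. PREG AA) and assumes that the group sizes for that row equal the N row of the table, which the text does not confirm. The function name `welch_t` is hypothetical, and this is not the authors' analysis code.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    computed from group means, standard deviations, and sample sizes."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2 mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Maternal age from Table 1: PREG EA 31.0 (3.4), PREG AA 27.0 (5.5).
# Group sizes (88 and 89) are assumed to match the N row.
t_age, df_age = welch_t(31.0, 3.4, 88, 27.0, 5.5, 89)
print(f"t = {t_age:.2f}, df = {df_age:.1f}")
```

Unlike Student's t-test, this form does not assume equal variances, which matters here because the AA subset shows a visibly larger spread in age.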
After filtering DNAm data based on quality metrics, 262 and 94 person time points of data remained for the PREG and GAPPS cohorts, respectively. Subsequent division of measures based on gestational age (GA) at assessment resulted in 95 early pregnancy time points and 167 late pregnancy time points in PREG (Fig. 1). The GAPPS cohort contributed 45 early pregnancy measurements and 49 late pregnancy measurements (Fig. 1). Because all participants provided a minimum of two samples during pregnancy, both early and late pregnancy measurements were available for the majority of women. However, some participants had only early pregnancy time points and some had only late pregnancy time points, because measurements collected mid-pregnancy did not meet the early/late definitions. The mean GA at collection was 72.3 days for early pregnancy measurements and 213.2 days for late pregnancy measurements (standard deviations of 16.8 and 23.5 days, respectively; see Supplemental Fig. S3 for full distribution). The mean GAAD did not differ significantly between participants with early and late measures in either PREG or GAPPS.
BA estimates were nominally higher than chronological age (Table 1 and Fig. 4). Maternal chronological age and BA were moderately correlated in the PREG study (Pearson’s r; 0.63 and 0.74 [ and 0.62, and 0.73], in early and late pregnancy, respectively). In the GAPPS cohort, the correlation between chronological age and BA was 0.71 in early pregnancy and 0.66 in late pregnancy (Pearson’s r). Intraindividual variation in BA measurements was relatively low, with mean absolute differences between early and late pregnancy estimates of 3.1 years in PREG and 2.6 years in GAPPS. During preprocessing, 13 of the Horvath probes were identified as poor quality and removed from PREG, and 33 probes were removed from GAPPS (46 unique Horvath probes in total across both cohorts). To assess the impact of different probe subsets, analyses were performed with both the largest possible Horvath probe set for each cohort (340 probes [96%] in PREG and 320 probes [91%] in GAPPS; see Fig. 1) and the subset of Horvath probes shared between the two cohorts (307 probes [87%]).
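The probe coverage percentages reported above follow from simple arithmetic on the 353-probe Horvath set described in the Methods. The short sketch below reproduces that arithmetic; the variable and function names are hypothetical, and it assumes the 13 PREG and 33 GAPPS removals do not overlap, as implied by the stated total of 46 unique probes.

```python
# Illustrative arithmetic only; all names here are hypothetical.
HORVATH_PROBES = 353   # loci in the Horvath algorithm (see Methods)
removed_preg = 13      # Horvath probes removed from PREG
removed_gapps = 33     # Horvath probes removed from GAPPS
removed_union = 46     # unique Horvath probes removed across both cohorts

def coverage(n_removed, total=HORVATH_PROBES):
    """Return (probes retained, percent of the full Horvath set, rounded)."""
    kept = total - n_removed
    return kept, round(100 * kept / total)

preg = coverage(removed_preg)      # largest possible PREG set
gapps = coverage(removed_gapps)    # largest possible GAPPS set
shared = coverage(removed_union)   # probes available in both cohorts
print(preg, gapps, shared)
```

The retained fractions come out at 96%, 91%, and 87%, matching the bracketed percentages in the text.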
Association between BA and GAAD
The coefficients, standard errors, and p-values for all models tested with the PREG cohort are reported in Table 2. For each model, BA and GAAD were the predictor and response variables, respectively. In the full PREG sample, BA estimates outperformed chronological age in predicting GAAD (adjusted R-squared for chronological age: 3.57%). The full PREG sample showed a significant relationship between the early pregnancy Horvath-derived BA estimates and GAAD (p = 0.003; Table 2), which survived Bonferroni adjustment for multiple testing. BA estimates were positively associated with GAAD, indicating that an earlier GAAD was associated with younger BA. Although the relationship between BA and GAAD was primarily supported by the AA subset, the significant relationship in the full sample remained after including a self-reported race variable in the model. However, the relationship between early prenatal BA and GAAD was attenuated when the maximum number of available probes was retained (Supplementary Table S1) rather than the shared probe set (Table 2). There were no significant relationships between GAAD and late pregnancy BA estimates.
Table 2.
Model | Full sample coef | Full sample SE | Full sample p-value | EA subset coef | EA subset SE | EA subset p-value | AA subset coef | AA subset SE | AA subset p-value |
---|---|---|---|---|---|---|---|---|---|
Predicts GAAD | | | | | | | | | |
Early BA | 0.63 | 0.21 | 0.003* | 0.40 | 0.24 | 0.107 | 0.71 | 0.33 | 0.038 |
Late BA | 0.06 | 0.18 | 0.763 | | 0.26 | 0.669 | 0.03 | 0.27 | 0.918 |
Predicts early prenatal PSS | | | | | | | | | |
Early BA | | 0.13 | 0.009 | | 0.19 | 0.389 | | 0.17 | 0.016 |
Predicts late prenatal PSS | | | | | | | | | |
Early BA | | 0.14 | 0.319 | | 0.21 | 0.600 | | 0.19 | 0.435 |
Late BA | | 0.13 | 0.514 | | 0.20 | 0.715 | 0.07 | 0.17 | 0.669 |
Predicts late BA | | | | | | | | | |
Early PSS | | 0.05 | 0.098 | | 0.06 | 0.365 | | 0.08 | 0.742 |
Horvath probe sets were reduced to match the probes available for GAPPS.
coef = coefficient, SE = standard error, EA = European American, AA = African American, PSS = perceived stress scale total score, GAAD = gestational age at delivery, BA = Horvath-derived biological age estimates.
Maternal chronological age was included as a covariate in all models.
* Survives Bonferroni adjustment for 6 tests.
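To make the multiple-testing correction concrete, the sketch below applies a Bonferroni threshold to the six full-sample p-values listed in Table 2. The family-wise error rate of 0.05 is an assumption (the threshold itself is not recoverable from the text), and the dictionary keys are illustrative labels rather than the authors' model names.

```python
ALPHA = 0.05  # assumed family-wise error rate; not stated in the text

# Full-sample p-values as listed in Table 2 (labels are illustrative).
full_sample_p = {
    "early BA -> GAAD": 0.003,
    "late BA -> GAAD": 0.763,
    "early BA -> early PSS": 0.009,
    "early BA -> late PSS": 0.319,
    "late BA -> late PSS": 0.514,
    "early PSS -> late BA": 0.098,
}

# Bonferroni: divide alpha by the number of tests in the family.
threshold = ALPHA / len(full_sample_p)
survivors = [name for name, p in full_sample_p.items() if p < threshold]
print(f"threshold = {threshold:.4f}; survivors = {survivors}")
```

Under this assumption only the early BA and GAAD model falls below the adjusted threshold, which is consistent with the single starred entry in the table and with the early BA and PSS result being described as marginal.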
A marginally significant relationship between prenatal PSS and BA estimates in early pregnancy was identified in the full sample (p = 0.009; Table 2). Consistent with the direction of the relationship identified in the GAAD analyses, a higher PSS was associated with a lower BA. A nominally significant relationship between BA and GAAD remained even after adjusting for perceived stress in early pregnancy. A follow-up analysis in the GAPPS sample, composed entirely of women with EA ancestry, showed no significant relationships between BA and GAAD or between perceived stress and BA (Table 3). Given previously identified associations between tobacco use and DNAm18, the effect of smoking status on the BA-GAAD relationship was similarly considered. A nominally significant relationship between BA and GAAD remained after including smoking history (i.e., never, former, current) as a covariate in the sensitivity analyses.
Table 3.
Model | coef | SE | p-value |
---|---|---|---|
Predicts GAAD | | | |
Early BA | | 0.50 | 0.821 |
Late BA | 0.37 | 0.41 | 0.371 |
Predicts early prenatal PSS | | | |
Early BA | | 0.23 | 0.341 |
Predicts late prenatal PSS | | | |
Early BA | | 0.27 | 0.139 |
Late BA | | 0.26 | 0.599 |
Predicts late BA | | | |
Early PSS | | 0.10 | 0.898 |
Horvath probe sets were reduced to match the probes available for PREG.
coef = coefficient, SE = standard error, PSS = perceived stress scale total score, GAAD = gestational age at delivery, BA = Horvath-derived biological age estimates.
Maternal chronological age was included as a covariate in all models.
* Survives Bonferroni adjustment.
Evaluation of BA as a potential clinical marker for GAAD
Residualized BA scores were calculated by regressing BA onto chronological age and reflect the deviation of BA from the value expected for a given chronological age. For PREG, residualized BA scores were calculated using the largest possible Horvath probe subset. Overall, residualized BA scores were relatively stable over the course of pregnancy regardless of self-identified race and showed significant between-person heterogeneity (Fig. 2). GAAD was significantly related to the baseline BA measurement (i.e., the model intercept) but not to the rate of change in BA across pregnancy (i.e., the model slope; see Supplement). This finding agrees with the results from the linear regression models showing that early BA was associated with GAAD.
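The residualization step can be sketched as a simple linear regression of BA on chronological age, keeping the residuals as the score. The example below is a minimal illustration with made-up numbers, not the study data or the authors' code; the function name `residualize` and the toy values are hypothetical.

```python
def residualize(ba, age):
    """Regress BA on chronological age (ordinary least squares, one
    predictor plus intercept) and return the residuals. A positive
    residual means BA is higher than expected for that age."""
    n = len(ba)
    mean_age = sum(age) / n
    mean_ba = sum(ba) / n
    sxx = sum((a - mean_age) ** 2 for a in age)
    sxy = sum((a - mean_age) * (b - mean_ba) for a, b in zip(age, ba))
    slope = sxy / sxx
    intercept = mean_ba - slope * mean_age
    return [b - (intercept + slope * a) for a, b in zip(age, ba)]

# Toy values for illustration only (years).
age = [22.0, 27.0, 31.0, 35.0, 39.0]
ba = [30.5, 37.0, 38.2, 45.1, 46.0]
residuals = residualize(ba, age)
print([round(r, 2) for r in residuals])
```

By construction the residuals average to zero and are uncorrelated with chronological age, so any association between the score and an outcome cannot be attributed to a linear effect of chronological age alone.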
Critically, there was greater variability in the residualized BA scores in the PREG AA subset compared to the EA subset (Fig. 3a). Follow-up analyses revealed that residualized BA scores were sensitive to probe subset size and self-identified race. Residualized scores were calculated for both the full PREG and shared Horvath probe subsets, and self-identified Census-based race significantly predicted BA residuals for the shared probe set above and beyond the BA residuals for the full PREG probe subset (Fig. 3b). The sensitivity of BA estimation to probe subset size and composition was further highlighted by comparing the correlation between BA and chronological age in the PREG and GAPPS cohorts, which had different subsets of Horvath probes available (Fig. 4).
Discussion
Excitement over the potential benefits associated with using BA to index personal risk liability for adverse health outcomes has prompted dozens of studies7. Indeed, such a biological marker could improve the accuracy of screening algorithms for multifactorial disorders. To our knowledge, this study is the first to examine the relationship between longitudinal measurements of prenatal maternal BA and GAAD. The results of this study highlight both potential benefits and caveats associated with using BA in translational research and clinical applications. Several characteristics of maternal prenatal BA are appealing for future follow up studies assessing clinical utility. Importantly, early prenatal BA was the most strongly associated with GAAD, which means that PTB risk assessments could occur in time to consider medical interventions and preventative measures. Further, this study observed large interindividual variation in baseline BA estimates which remained relatively stable throughout pregnancy. Early prenatal BA was significantly associated with GAAD above and beyond other risk factors like maternal prenatal perceived stress and chronological age. These findings suggest that early prenatal BA may be a promising candidate for inclusion in a precision clinical obstetrics screening algorithm.
Although results from this study support the possibility of adopting BA for estimating risk for PTB, some critical caveats also were noted. First, sensitivity analyses revealed that the relationship between early prenatal BA and GAAD was impacted by probe set composition. Based on these findings, researchers should take care when estimating BA and clearly report the number of probes used in BA calculations. Second, the strongest association signal was found in the AA subset of the PREG sample. Although this relationship remained significant in the full PREG cohort after adjusting for self-identified Census-based race and multiple testing correction, sensitivity analyses using residualized BA scores suggest that the reliability of BA may vary by genetic ancestry and/or demographic factors. These findings suggest that cryptic, currently unidentified factors may be influencing the predictive validity and reliability of DNAm-based BA estimation. The problem of genomically informed risk assessments failing to generalize to non-European populations has received increasing attention not only because such results limit the utility of clinical assessments but also because they threaten to exacerbate existing racial health disparities36. Another issue is that the biological significance of the individual sites of DNAm included in BA algorithms is poorly understood37,38, which makes it difficult to identify the specific molecular processes that BA actually reflects7. This knowledge gap makes predicting factors that will influence generalizability challenging. Researchers must be careful when studying populations that include individuals from diverse backgrounds, especially given that most DNAm-based BA estimation algorithms work analogously to other methods that exhibit variable predictive validity by genetic ancestry (e.g., polygenic risk score calculation)36.
Although significant relationships were identified, the direction of the relationship between BA and GAAD was unexpected. Advanced biological aging is a putative driver of increased risk for negative health outcomes and would be expected in individuals with higher levels of perceived stress and pregnancies with a lower GAAD. In this case, the algorithm predicts that, on average, AA participants are biologically younger than their EA counterparts despite group differences in lifetime exposure to stressors that would predict greater positive deviations from chronological age. Given that a younger BA is associated with adverse outcomes during pregnancy, the results from this study may not support the traditional weathering hypothesis. The interpretation of BA-disease relationships may be complicated by the fact that risk for PTB is increased among both the youngest and oldest mothers39, rather than increasing over the lifetime like other age-related disorders. This nonlinear distribution between maternal chronological age and PTB could be similarly reflected in BA, so that any prominent deviations from mean BA, rather than advanced BA alone, may highlight those pregnancies at higher risk.
These findings differ from the results of another study, which did not find a significant relationship between Horvath BA and GAAD, but did identify an inverse relationship between maternal BA estimated using another DNAm-derived BA algorithm and length of gestation40. However, other studies have similarly noted an unexpected direction of the association between DNAm-based BA and adverse pregnancy outcomes, including research assessing the relationships between the BA of infants at birth and maternal antenatal depression, PTB, and future psychiatric problems41. Contradictory relationships between fetal and placental telomere length, an alternative measure of cellular aging, and GAAD are also prevalent in the literature34,35,42. These results could arise from measurement variance that leads to unreliable BA estimates due to genetic background and/or physiological status (e.g., pregnancy). The generalizability and reliability of genomic risk scores depend on the diversity and size of the training dataset, respectively. To our knowledge, no existing BA algorithm was trained on blood samples from pregnant women. As a result, BA estimates could be influenced by pregnancy-related DNAm remodeling. As the epigenetic aging field advances, BA estimators for specific populations have been established31,43,44, and future algorithms should be tailored for birth outcomes research and include pregnant women. Integrating DNAm-derived BA with other indices of cellular senescence (e.g., telomere length) could further increase our understanding of the molecular processes reflected in BA.
Overall, these results suggest that BA estimates hold potential to serve as a biomarker for PTB, but extreme care must be taken to assess the accuracy and generalizability of BA across a wide variety of genetic and demographic backgrounds. The ability to assess risk for PTB at the beginning of pregnancy would provide opportunities for early intervention and targeted medical care throughout gestation. Logistically, many attributes of DNAm-based BA make for a good candidate biomarker45,46. DNAm is a stable mark that can be measured reliably, and BA estimates are easily calculated using the Horvath method. In this study, DNAm was measured in peripheral blood, a tissue with a minimally invasive collection procedure that is already a normal part of pregnancy monitoring, posing no additional risk to patients. While more research is necessary to examine how reliably BA predicts GAAD in other samples, in the future BA should be considered for potential clinical applications.
Strengths and limitations
To our knowledge, this is the largest study to investigate maternal BA during pregnancy and the first to examine the stability of prenatal BA and its relationship across time with GAAD. Major strengths of this study include the use of a primary and a replication cohort, both containing longitudinal measurements during pregnancy. The inclusion of a diverse cohort allowed for the investigation of BA differences by self-reported race. Finally, all analyses and hypotheses examined in this study were preregistered on the Open Science Framework47 using the AsPredicted format.
The results of this study should be considered in the context of four primary study limitations. First, cross-study comparisons were complicated by variation in data collection protocols. Perceived stress was assessed at four study visits in PREG, whereas only two measures were collected in GAPPS. This limitation would have been easier to resolve if more detailed information about GA at assessment had been available for GAPPS participants (e.g., GA in days). Second, the two study populations differed significantly in demographic composition (Table 1). These differences were particularly problematic given that main effects of BA on GAAD were seen primarily in the PREG AA subsample. Additionally, notable demographic differences were observed between the PREG AA subsample, the PREG EA subsample, and the GAPPS EA cohort. It is possible that both measured and unmeasured demographic differences (e.g., differences in parity and personal pregnancy history) contributed to differences in GAAD and BA residuals. Future work will be needed to assess the impact of reproductive history characteristics (e.g., prior history of preterm delivery, parity) on biological aging. Third, neither the PREG nor the GAPPS samples had complete probe data for the full Horvath algorithm. The GAPPS sample was measured using a newer technology missing seventeen of the Horvath probes, and both samples had probes removed during quality control. It is not clear whether or how these missing probes influenced the final results, but the strength of the association between early prenatal BA and GAAD was slightly attenuated in the maximum possible probe subset compared to the smaller probe subset for PREG (see Supplement for results from analyses including all available probes). Finally, the PREG and GAPPS participants were generally healthy women with uncomplicated pregnancies due, in part, to exclusion criteria related to placental and amniotic abnormalities and hypertensive disorders.
The exclusion of heterogeneous causes of PTB putatively increases statistical power for genetic research at the cost of limiting observed biological variability. Future studies will be needed to characterize maternal BA stability and correlates in high-risk pregnancies.
Methods
Study cohort
Pregnancy, Race, Environment, Genes (PREG)
The Pregnancy, Race, Environment, Genes (PREG) Study is a prospective longitudinal cohort assessing the relationship between epigenetic factors, environmental exposures, and pregnancy outcomes48. Self-report questionnaires and maternal peripheral blood samples were collected up to four times throughout pregnancy. Inclusion criteria at enrollment were (1) singleton pregnancy conceived without assisted reproductive technology, (2) maternal age of 18–40 years with no diagnosis of diabetes, (3) enrollment before 24 completed weeks of gestation, and (4) self-identification of both mother and father as either both White or both Black without Hispanic or Middle Eastern ancestry. The rationale for limiting the cohort by ancestry was to maximize the statistical power for genetic/epigenetic analyses and to investigate the contribution of environmental and epigenetic factors to perinatal health disparities. Exclusion criteria included diagnosis of maternal blood pressure disorders (e.g., preeclampsia), fetal congenital anomalies, placental or amniotic anomalies (e.g., placenta previa, polyhydramnios), fewer than three study time points completed, or use of a cerclage. GA was confirmed by ultrasound. GA at each study visit and GAAD were recorded in days since conception.
Replication cohort
Global Alliance to Prevent Prematurity and Stillbirth (GAPPS)
Maternal blood specimens were obtained from the Global Alliance to Prevent Prematurity and Stillbirth (GAPPS) BioServices repository. GAPPS participant selection criteria matched most PREG study inclusion and exclusion criteria to facilitate cross-study comparisons. AA samples were not available from GAPPS at the time of study initiation. Maternal peripheral blood samples were collected along with self-report questionnaires up to three times across pregnancy. Due to the smaller number of total possible study visits, GAPPS participants were included if they had at least two time points of data. GAAD was reported in days since conception, but GA at each study visit was reported as trimester (i.e., 1, 2, or 3).
Biological age measurement
BA was estimated from genome-wide DNAm measurements using the Horvath method12. The Horvath algorithm calculates BA from DNAm levels at 353 genomic loci, each measured by a single probe. Most of the loci contribute only modestly to the final age estimate (i.e., median weight is 6 weeks; range is 0.00000594 to 3.07 years)12. Both PREG and GAPPS measured DNAm from peripheral blood specimens using Illumina microarray technology. The PREG study used the Infinium HumanMethylation450 BeadChip (450k); GAPPS, the Infinium EPIC BeadChip (850k). The 850k array is a newer sister technology to the 450k and includes 92% of the 450k probe set. The newer 850k array design omits 17 of the Horvath probes (4.8%). Despite the probe set differences, previous reports have suggested that the Horvath age estimates are only slightly underestimated in peripheral blood when these probes are missing49,50. Both PREG and GAPPS microarray experiments were separately performed at HudsonAlpha Institute for Biotechnology according to the manufacturer’s protocol (Illumina, San Diego, CA, USA). For both cohorts, individual specimen placement was randomized on the array, but all specimens from a single participant were loaded onto a single array to minimize potential batch effects (see Supplement).
Before calculating BA, the quality of DNAm microarrays was assessed (Fig. 1) using the Bioconductor R package minfi51. Probes with either poor signal intensity or known cross-hybridization activity were removed in accordance with established best practices (see Supplement for additional details). Principal components analysis was used to identify potential experimental artifacts (e.g., batch effects), and based on this analysis, probe Beta-values were adjusted for positional effects using ComBat52. BA estimates for each specimen were calculated from adjusted Beta-values using the wateRmelon R package53. All statistical analyses were conducted in the R environment (version 3.5)54.
Perceived stress measurement
The Perceived Stress Scale (PSS) is a ten-question validated self-report instrument for assessing the magnitude and severity of recent stress levels55. Each item is a 5-point Likert-type question, with 0 indicating “never” and 4 indicating “very often”. Possible scores range from 0 to 40 with higher scores indicating greater levels and interference of perceived stress. The PSS was administered at every visit for the PREG study and in the second and third trimester health questionnaires for the GAPPS study. PSS scores have been associated with advanced BA and with greater vulnerability to depressive symptoms precipitated by stressful life events. For this study, PSS scores were used to index each participant’s feelings of cumulative stress and control over the events in her life.
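A minimal scoring sketch for the ten-item instrument is given below. The reverse-scoring of the four positively worded items (items 4, 5, 7, and 8) follows the standard PSS-10 scoring key and is not described in the text above, so it should be read as an assumption about how the totals were derived; the function name is hypothetical.

```python
# Positively worded PSS-10 items (1-based) that are reverse-scored under the
# standard scoring key; assumed here, not stated in the manuscript.
REVERSE_ITEMS = {4, 5, 7, 8}


def score_pss10(responses):
    """Sum ten 0-4 Likert responses into a 0-40 total, reversing items 4, 5, 7, 8."""
    if len(responses) != 10:
        raise ValueError("PSS-10 requires exactly ten responses")
    if any(r not in range(5) for r in responses):
        raise ValueError("each response must be an integer from 0 to 4")
    total = 0
    for item, response in enumerate(responses, start=1):
        total += (4 - response) if item in REVERSE_ITEMS else response
    return total


# Highest possible stress: "very often" on negative items, "never" on positive items.
max_score = score_pss10([4, 4, 4, 0, 0, 4, 0, 0, 4, 4])
# Lowest possible stress: the mirror-image response pattern.
min_score = score_pss10([0, 0, 0, 4, 4, 0, 4, 4, 0, 0])
print(min_score, max_score)
```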
Data analysis
Linear regression was used to test the relationship of BA estimates from early and late pregnancy with GAAD and with prenatal perceived stress. To harmonize the data across studies while maintaining sample size, early and late prenatal DNAm measurements were defined in PREG as blood specimens obtained at a GA of less than 100 days and after 180 days, respectively. In GAPPS, early pregnancy was defined as measurements collected in the first trimester, while late pregnancy measurements were those obtained in the third trimester. To control for individual differences in chronological age, maternal age (collected at the time of study enrollment) was included as a covariate in all analyses. Lifetime smoking status (i.e., never, former, current), self-reported race, and prenatal perceived stress levels were included as covariates in the regression models for sensitivity analyses. Cell-type proportion estimates were not included because the Horvath BA algorithm is robust to biases related to cell-type heterogeneity12. Prenatal BA trajectories were characterized using linear latent growth curve models evaluated in Mplus and built using the R package MplusAutomation56. The purpose of the growth curve model was to quantify the interindividual differences in the baseline and rate of change of BA estimates across pregnancy.
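The PREG sampling windows and the age-adjusted regression can be sketched as below. This is an illustrative Python outline rather than the R code used in the study: the records are fabricated toy values generated from a known linear rule so that the fit can be checked, GAAD is assumed to be expressed in days, and the function names are hypothetical. The least-squares fit is obtained by solving the normal equations directly, which is adequate for a small example but is not how a statistics package would typically do it.

```python
def pregnancy_window(ga_days):
    """Classify a PREG specimen by gestational age at collection (days)."""
    if ga_days < 100:
        return "early"
    if ga_days > 180:
        return "late"
    return None  # specimens between the two windows are not used


def ols(X, y):
    """Ordinary least squares via the normal equations and Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if A[pivot][col] == 0:
            raise ValueError("design matrix is rank deficient")
        A[col], A[pivot] = A[pivot], A[col]
        piv = A[col][col]
        A[col] = [v / piv for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Toy specimens: (GA at draw in days, BA in years, maternal age in years).
toy = [(70, 25.0, 24.0), (85, 31.0, 29.0), (92, 28.0, 31.0), (60, 36.0, 33.0),
       (99, 22.0, 26.0), (75, 30.0, 27.0), (140, 29.0, 30.0), (200, 33.0, 32.0)]
# Outcome generated from a known rule so the recovered coefficients can be checked.
specimens = [{"ga_days": g, "ba": b, "age": a, "gaad": 280 - 0.5 * b + 0.1 * a}
             for g, b, a in toy]

early = [s for s in specimens if pregnancy_window(s["ga_days"]) == "early"]
X = [[1.0, s["ba"], s["age"]] for s in early]  # intercept, BA, maternal age covariate
y = [s["gaad"] for s in early]
intercept, beta_ba, beta_age = ols(X, y)
print(len(early), intercept, beta_ba, beta_age)
```

Here `beta_ba` is the BA coefficient adjusted for maternal age; sensitivity covariates such as smoking status or race would enter as additional columns of `X`.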
Informed consent and ethical approvals
The PREG study received Virginia Commonwealth University Institutional Review Board approval (14000) and all research was performed in accordance with relevant guidelines and regulations. Written confirmation of informed consent was obtained from each participant.
Preregistration
Analyses presented in this manuscript were preregistered on the Open Science Framework and are available at https://osf.io/6a9db. All of the original preregistered study questions were addressed in these analyses. However, there are notable deviations from the analyses outlined in the preregistration document. Originally, two BA algorithms prominently featured in the literature, the Horvath and Hannum methods, were selected for this study. However, several probes included in the Hannum algorithm were removed during quality control processing steps. The Hannum method is known to be more sensitive to missing probes, potentially leading to biased BA estimates50. As per the original preregistered study design, the same analyses were completed with the Hannum clock (see Supplement for relevant methods and results). Interestingly, both epigenetic clocks performed similarly in these samples, suggesting they are capturing the same biological phenomenon. Additionally, methods and results for a secondary analysis examining the use of Y chromosome probes to detect cell-free DNA contamination of maternal samples are available in the Supplement. Finally, a more parsimonious model was selected to adjust for chronological age variability in the models. Rather than adopting a two-step approach in which BA is first regressed on chronological age before modeling the resulting residual, maternal age was simply included as a covariate in all analyses.
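For readers unfamiliar with the two-step approach that was set aside, its first step can be sketched as follows. This is an illustrative Python outline with fabricated toy values and hypothetical function names, not code from the preregistration: BA is regressed on chronological age, and the residual (often called age acceleration) is what would then be carried into the outcome model.

```python
def simple_linear_fit(x, y):
    """Least-squares intercept and slope for y regressed on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def age_acceleration(ba, age):
    """Step one of the two-step approach: residuals of BA regressed on chronological age."""
    intercept, slope = simple_linear_fit(age, ba)
    return [b - (intercept + slope * a) for a, b in zip(age, ba)]


# Toy values in years (fabricated for illustration).
toy_age = [28.0, 25.0, 33.0, 38.0, 31.0]
toy_ba = [30.0, 27.0, 35.0, 41.0, 29.0]

residuals = age_acceleration(toy_ba, toy_age)
print(residuals)
```

By construction the residuals average zero and are uncorrelated with chronological age in the sample. Entering maternal age directly as a covariate, as was done here, achieves the adjustment in a single model instead.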
Acknowledgements
The Pregnancy, Race, Environment, Genes (PREG) longitudinal study and its postpartum extension were supported by the NIMHD (P60MD002256, PI: York, Strauss). The John and Polly Sparks Foundation and Brain and Behavior Research Foundation (24712, PI: York) supported the use of GAPPS data and materials. Support was received from the Clinical and Translational Science Award (CTSA) award No. UL1TR000058 from the National Center for Advancing Translational Sciences. Its contents are solely the responsibility of the authors and do not necessarily represent official views of the National Center for Advancing Translational Sciences or the National Institutes of Health. Timothy York, Ph.D., holds a 2015 Preterm Birth Research Grant from the Burroughs Wellcome Fund (1015040). Dana Lapato, Ph.D., was supported by a T32 NIMH grant (T32MH020030, PI: Michael Neale).
Author contributions
All authors designed the study. D.L. and E.L. cleaned the data, performed the analyses, and wrote the initial draft of the manuscript. C.J.C. oversaw the processing/preparation of specimens for assessment in the PREG study. R.R.N., J.S. and T.Y. provided substantive feedback and revisions, and T.Y. and J.S. planned and secured funding for the PREG and GAPPS studies. All authors approved the final version of the manuscript.
Data availability
The preregistration document and R code used to analyze the data and generate figures are available on the Open Science Framework (OSF) project landing page (https://osf.io/sqmzg). Sharing PREG and GAPPS study data is limited by Institutional Review Board agreements and participant consent forms, which restrict openly sharing individual-level DNAm measures. Anyone interested in data access or collaboration is encouraged to contact Dr. Timothy P. York (timothy.york@vcuhealth.org) for more information.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information is available for this paper at 10.1038/s41598-021-94281-7.
References
- 1. Blencowe H, et al. National, regional, and worldwide estimates of preterm birth rates in the year 2010 with time trends since 1990 for selected countries: A systematic analysis and implications. Lancet. 2012;379:2162–2172. doi: 10.1016/S0140-6736(12)60820-4.
- 2. Behrman RE, Butler AS. Preterm Birth: Causes, Consequences, and Prevention. National Academies Press; 2007.
- 3. York TP, Eaves LJ, Neale MC, Strauss JF 3rd. The contribution of genetic and environmental factors to the duration of pregnancy. Am. J. Obstet. Gynecol. 2014;210:398–405. doi: 10.1016/j.ajog.2013.10.001.
- 4. York TP, Strauss JF, Neale MC, Eaves LJ. Estimating fetal and maternal genetic contributions to premature birth from multiparous pregnancy histories of twins using MCMC and maximum-likelihood approaches. Twin Res. Hum. Genet. 2009;12:333–342. doi: 10.1375/twin.12.4.333.
- 5. de Andrade Ramos BR, da Silva MG. The burden of genetic and epigenetic traits in prematurity. Reprod. Sci. 2018;25:471–479. doi: 10.1177/1933719117718270.
- 6. Burris HH, et al. Racial disparities in preterm birth in USA: A biosensor of physical and social environmental exposures. Arch. Dis. Child. 2019;104:931–935. doi: 10.1136/archdischild-2018-316486.
- 7. Horvath S, Raj K. DNA methylation-based biomarkers and the epigenetic clock theory of ageing. Nat. Rev. Genet. 2018;19:371–384. doi: 10.1038/s41576-018-0004-3.
- 8. Dugué P-A, et al. DNA methylation-based biological aging and cancer risk and survival: Pooled analysis of seven prospective studies. Int. J. Cancer. 2018;142:1611–1619. doi: 10.1002/ijc.31189.
- 9. Perna L, et al. Epigenetic age acceleration predicts cancer, cardiovascular, and all-cause mortality in a German case cohort. Clin. Epigenetics. 2016;8:64. doi: 10.1186/s13148-016-0228-z.
- 10. Chen BH, et al. DNA methylation-based measures of biological age: Meta-analysis predicting time to death. Aging. 2016;8:1844–1865. doi: 10.18632/aging.101020.
- 11. Marioni RE, et al. The epigenetic clock is correlated with physical and cognitive fitness in the Lothian Birth Cohort 1936. Int. J. Epidemiol. 2015;44:1388–1396. doi: 10.1093/ije/dyu277.
- 12. Horvath S. DNA methylation age of human tissues and cell types. Genome Biol. 2013;14:R115. doi: 10.1186/gb-2013-14-10-r115.
- 13. Horvath S, et al. Epigenetic clock for skin and blood cells applied to Hutchinson Gilford Progeria syndrome and ex vivo studies. Aging. 2018;10:1758–1775. doi: 10.18632/aging.101508.
- 14. Jylhävä J, Pedersen NL, Hägg S. Biological age predictors. EBioMedicine. 2017;21:29–36. doi: 10.1016/j.ebiom.2017.03.046.
- 15. Greenberg MVC, Bourc’his D. The diverse roles of DNA methylation in mammalian development and disease. Nat. Rev. Mol. Cell Biol. 2019;20:590–607. doi: 10.1038/s41580-019-0159-6.
- 16. Hannum G, et al. Genome-wide methylation profiles reveal quantitative views of human aging rates. Mol. Cell. 2013;49:359–367. doi: 10.1016/j.molcel.2012.10.016.
- 17. Levine ME, et al. DNA methylation age of blood predicts future onset of lung cancer in the women’s health initiative. Aging. 2015;7:690–700. doi: 10.18632/aging.100809.
- 18. Gao X, Zhang Y, Breitling LP, Brenner H. Relationship of tobacco smoking and smoking-related DNA methylation with epigenetic age acceleration. Oncotarget. 2016;7:46878–46889. doi: 10.18632/oncotarget.9795.
- 19. Beach SRH, et al. Methylomic aging as a window onto the influence of lifestyle: Tobacco and alcohol use alter the rate of biological aging. J. Am. Geriatr. Soc. 2015;63:2519–2525. doi: 10.1111/jgs.13830.
- 20. Wolf EJ, et al. Traumatic stress and accelerated DNA methylation age: A meta-analysis. Psychoneuroendocrinology. 2018;92:123–134. doi: 10.1016/j.psyneuen.2017.12.007.
- 21. Udler MS, McCarthy MI, Florez JC, Mahajan A. Genetic risk scores for diabetes diagnosis and precision medicine. Endocr. Rev. 2019;40:1500–1520. doi: 10.1210/er.2019-00088.
- 22. Lambert SA, Abraham G, Inouye M. Towards clinical utility of polygenic risk scores. Hum. Mol. Genet. 2019;28:R133–R142. doi: 10.1093/hmg/ddz187.
- 23. Martin JA, Hamilton BE, Osterman M, Driscoll AK. Births: Final Data for 2018. Natl. Vital Stat. Rep. 2019;68(13):1–47.
- 24. Geronimus AT. Black/white differences in the relationship of maternal age to birthweight: A population-based test of the weathering hypothesis. Soc. Sci. Med. 1996;42:589–597. doi: 10.1016/0277-9536(95)00159-X.
- 25. Geronimus AT, et al. Do US black women experience stress-related accelerated biological aging? A novel theory and first population-based test of Black-White differences in telomere length. Hum. Nat. 2010;21:19–38. doi: 10.1007/s12110-010-9078-0.
- 26. Simons RL, et al. Economic hardship and biological weathering: The epigenetics of aging in a US sample of black women. Soc. Sci. Med. 2016;150:192–200. doi: 10.1016/j.socscimed.2015.12.001.
- 27. Manuel JI. Racial/Ethnic and gender disparities in health care use and access. Heal. Serv. Res. 2018;53:1407–1429. doi: 10.1111/1475-6773.12705.
- 28. Giscombé CL, Lobel M. Explaining disproportionately high rates of adverse birth outcomes among African Americans: The impact of stress, racism, and related factors in pregnancy. Psychol. Bull. 2005;131:662–683. doi: 10.1037/0033-2909.131.5.662.
- 29. Geronimus AT, Hicken M, Keene D, Bound J. "Weathering" and age patterns of allostatic load scores among Blacks and Whites in the United States. Am. J. Public Heal. 2006;96:826–833. doi: 10.2105/AJPH.2004.060749.
- 30. Schummers L, et al. Variation in relationships between maternal age at first birth and pregnancy outcomes by maternal race: A population-based cohort study in the United States. BMJ Open. 2019;9:e033697. doi: 10.1136/bmjopen-2019-033697.
- 31. Knight AK, et al. An epigenetic clock for gestational age at birth based on blood methylation data. Genome Biol. 2016;17:206. doi: 10.1186/s13059-016-1068-z.
- 32. Simpkin AJ, et al. Prenatal and early life influences on epigenetic age in children: A study of mother-offspring pairs from two cohort studies. Hum. Mol. Genet. 2016;25:191–201. doi: 10.1093/hmg/ddv456.
- 33. Javed R, Chen W, Lin F, Liang H. Infant’s DNA methylation age at birth and epigenetic aging accelerators. Biomed Res. Int. 2016;2016:4515928. doi: 10.1155/2016/4515928.
- 34. Vasu V, et al. Preterm infants have significantly longer telomeres than their term born counterparts. PLoS ONE. 2017;12:e0180082. doi: 10.1371/journal.pone.0180082.
- 35. Menon R, et al. Short fetal leukocyte telomere length and preterm prelabor rupture of the membranes. PLoS ONE. 2012;7:e31136. doi: 10.1371/journal.pone.0031136.
- 36. Martin AR, et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 2019;51:584–591. doi: 10.1038/s41588-019-0379-x.
- 37. Mendelson MM. Epigenetic age acceleration: A biological doomsday clock for cardiovascular disease? Circ. Genom. Precis. Med. 2018;11:e002089. doi: 10.1161/CIRCGEN.118.002089.
- 38. Oh G, et al. Cytosine modifications exhibit circadian oscillations that are involved in epigenetic diversity and aging. Nat. Commun. 2018;9:644. doi: 10.1038/s41467-018-03073-7.
- 39. Gilbert W, Jandial D, Field N, Bigelow P, Danielsen B. Birth outcomes in teenage pregnancies. J. Matern. Fetal. Neonatal Med. 2004;16:265–270. doi: 10.1080/jmf.16.5.265.270.
- 40. Ross KM, et al. Epigenetic age and pregnancy outcomes: GrimAge acceleration is associated with shorter gestational length and lower birthweight. Clin. Epigenet. 2020;12:120. doi: 10.1186/s13148-020-00909-2.
- 41. Suarez A, et al. The epigenetic clock at birth: Associations with maternal antenatal depression and child psychiatric problems. J. Am. Acad. Child Adolesc. Psychiatry. 2018;57:321–328.e2. doi: 10.1016/j.jaac.2018.02.011.
- 42. Friedrich U, Schwab M, Griese EU, Fritz P, Klotz U. Telomeres in neonates: New insights in fetal hematopoiesis. Pediatr. Res. 2001;49:252–256. doi: 10.1203/00006450-200102000-00020.
- 43. Lee Y, et al. Placental epigenetic clocks: Estimating gestational age using placental DNA methylation levels. Aging. 2019;11:4238–4253. doi: 10.18632/aging.102049.
- 44. McEwen LM, et al. The PedBE clock accurately estimates DNA methylation age in pediatric buccal cells. Proc. Natl. Acad. Sci. USA. 2020;117(38):23329–23335. doi: 10.1073/pnas.1820843116.
- 45. Mikeska T, Craig JM. DNA methylation biomarkers: Cancer and beyond. Genes. 2014;5:821–864. doi: 10.3390/genes5030821.
- 46. Levenson VV. DNA methylation as a universal biomarker. Expert. Rev. Mol. Diagn. 2010;10:481–488. doi: 10.1586/erm.10.17.
- 47. Foster ED, Deardorff A. Open science framework (OSF). J. Med. Libr. Assoc. 2017;105:203–206.
- 48. Lapato DM, et al. Prospective longitudinal study of the pregnancy DNA methylome: The US Pregnancy, Race, Environment, Genes (PREG) study. BMJ Open. 2018;8:e019721. doi: 10.1136/bmjopen-2017-019721.
- 49. McEwen LM, et al. Systematic evaluation of DNA methylation age estimation with common preprocessing methods and the Infinium MethylationEPIC BeadChip array. Clin. Epigenet. 2018;10:123. doi: 10.1186/s13148-018-0556-2.
- 50. Dhingra R, et al. Evaluating DNA methylation age on the Illumina MethylationEPIC Bead Chip. PLoS ONE. 2019;14:e0207834. doi: 10.1371/journal.pone.0207834.
- 51. Aryee MJ, et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics. 2014;30:1363–1369. doi: 10.1093/bioinformatics/btu049.
- 52. Leek JT, Johnson WE, Parker HS, Jaffe AE, Storey JD. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics. 2012;28:882–883. doi: 10.1093/bioinformatics/bts034.
- 53. Pidsley R, et al. A data-driven approach to preprocessing Illumina 450K methylation array data. BMC Genom. 2013;14:293. doi: 10.1186/1471-2164-14-293.
- 54. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing; 2016.
- 55. Cohen S, Kamarck T, Mermelstein R. A global measure of perceived stress. J. Heal. Soc. Behav. 1983;24:385–396. doi: 10.2307/2136404.
- 56. Hallquist MN, Wiley JF. MplusAutomation: An R Package for Facilitating Large-Scale Latent Variable Analyses in Mplus. Struct. Equ. Modeling. 2018;25(4):621–638. doi: 10.1080/10705511.2017.1402334.