Abstract
BACKGROUND:
Variation in the timing of menarche has been linked with adverse health outcomes in later life. There is evidence that exposure to hormonally active agents (or endocrine disrupting chemicals; EDCs) during childhood may play a role in accelerating or delaying menarche. The goal of this study was to generate hypotheses on the relationship between exposure to multiple EDCs and timing of menarche by applying a two-stage machine learning approach.
METHODS:
We used data from the National Health and Nutrition Examination Survey (NHANES) for years 2005–2008. Data were analyzed for 229 female participants 12–16 years of age who had blood and urine biomarker measures of 41 environmental exposures, all with >70% above limit of detection, in seven classes of chemicals. We modeled risk for earlier menarche (<12 years of age vs older) with exposure biomarkers. We applied a two-stage approach consisting of a random forest (RF) to identify important exposure combinations associated with timing of menarche followed by multivariable modified Poisson regression to quantify associations between exposure profiles (“combinations”) and timing of menarche.
RESULTS:
RF identified urinary concentrations of monoethylhexyl phthalate (MEHP) as the most important feature in partitioning girls into homogeneous subgroups, followed by bisphenol A (BPA) and 2,4-dichlorophenol (2,4-DCP). In this first stage, we identified 11 distinct exposure biomarker profiles, containing five different classes of EDCs, associated with earlier menarche. MEHP appeared in all 11 exposure biomarker profiles and phenols appeared in five. Using these profiles in the second stage of analysis, we found a relationship between lower MEHP and earlier menarche (MEHP ≤ 2.36 ng/mL vs >2.36 ng/mL: adjusted PR = 1.36, 95% CI: 1.02, 1.80). Combinations of lower MEHP with benzophenone-3, 2,4-DCP, and BPA had similar associations with earlier menarche, though slightly weaker in those smaller subgroups. For girls not having lower MEHP, exposure profiles included other biomarkers (BPA, enterodiol, monobenzyl phthalate, triclosan, and 1-hydroxypyrene); these showed largely null associations in the second-stage analysis. Adjustment for covariates did not materially change the estimates or CIs of these models. We observed weak or null effect estimates for some exposure biomarker profiles, and relevant profiles consisted of no more than two EDCs, possibly due to small sample sizes in subgroups.
CONCLUSION:
A two-stage approach incorporating machine learning was able to identify interpretable combinations of biomarkers in relation to timing of menarche; these should be further explored in prospective studies. Machine learning methods can serve as a valuable tool to identify patterns within data and generate hypotheses that can be investigated within future, targeted analyses.
Keywords: Multiple exposures, Mixtures, Environmental exposures, Machine learning, Menarche
1. Introduction
Earlier timing of menarche has been linked with adverse health outcomes in adolescence and in adulthood including obesity, asthma, type 2 diabetes, and cardiovascular disease (Charalampopoulos et al. 2014; Janghorbani et al. 2014; Lieberoth et al. 2014; Luijken et al. 2017; Petry et al. 2018; Prentice and Viner 2013; Sun et al. 2018). There is strong evidence that earlier age at menarche is associated with an increased risk of breast and endometrial cancer (CGHFBC 2002; Euling et al. 2008; Gong et al. 2015; Pike et al. 2004) and later menarche is linked to increased risk of cardiovascular disease (Day et al. 2015; JJ Lee et al. 2019). Previous studies have shown a decline in age at menarche globally from the late 1800s to mid-1900s (Sørensen et al. 2012). In the United States, the average age at menarche has declined by 0.7 to 1.4 years depending on the racial or ethnic group (Euling et al. 2008; McDowell et al. 2007; Wyshak and Frisch 1982). It is hypothesized that exposure to hormonally active agents may play a role in accelerating or delaying the onset of menarche (Gore et al. 2015).
Hormonally active agents, often referred to as endocrine disrupting chemicals (EDCs), are exogenous chemical compounds or their metabolites that can mimic, alter or attenuate the action of natural hormones found in the body (Gore et al. 2015). A wide range of natural and synthetic classes of compounds including phytoestrogens, pesticides, phenols, polycyclic aromatic hydrocarbons (PAHs), phthalates, and some heavy metals have been recognized as possessing endocrine disrupting activity (Diamanti-Kandarakis et al. 2009; Meeker 2012). Moreover, EDCs are multisource and multi-route, and the ubiquitous use of these compounds has resulted in widespread population exposure (Meeker 2012). A recent review of epidemiologic data on early-life exposure to EDCs and pubertal development in girls found conflicting evidence in the associations between specific EDCs and timing of menarche (JE Lee et al. 2019). It is important to note, the majority of these studies investigated exposure to a single chemical, which may help explain inconsistent findings across studies. Given the ubiquitous use of EDCs, the average adolescent girl is exposed to multiple hormonally active agents concurrently or sequentially (Hendryx and Luo 2018). Therefore, it is possible that individual EDCs can alter timing of menarche by acting jointly at lower concentrations than what would be necessary for an individual EDC (Kortenkamp 2014).
Potential joint effects of co-exposure to various classes of EDCs on the timing of menarche in girls are relatively unexplored. Addressing this gap in the literature is important because the individual biologic effects of some EDCs are weak but can collectively influence the onset of menarche via a common mechanism (e.g., estrogen receptor agonism, interference with androgen synthesis, thyroid action, etc.). However, understanding exposure to real-world mixtures and the associated health effects is challenging given the statistical complexity of analyzing co-exposures to various environmental chemicals. Some of these statistical challenges include multicollinearity among co-exposures, pure additivity assumptions, identifying synergistic or antagonistic interactions, and the presence of co-pollutant confounding (Braun et al. 2016; Sun et al. 2013). Applying data-driven methods, such as machine learning algorithms, prior to targeted analysis can help identify exposures relevant to health outcomes of interest within high-dimensional exposure data as well as uncover potential interactions between exposures, a task that is difficult with traditional statistical techniques (Patel 2017). Moreover, there is an overreliance on examining only exposures known to be of concern, and hypothesis-generating machine learning methods may provide further insight into the relationship between co-pollutants and their impact on health outcomes (Braun et al. 2016; Sun et al. 2013). A recent application of data-driven machine learning methods utilized classification and regression trees (CaRT) to investigate 104 ambient air toxics and early cognitive skills in a large sample of children (Stingone et al. 2017). The two-step method implemented in that study allowed the authors to identify relevant and interpretable exposure combination profiles associated with lower math test scores.
The goal of this study is to discover whether a data-driven machine learning approach can identify EDC exposure-menarche relationships previously reported in other studies that used traditional analytic methods and to identify EDC exposure combinations that have not been previously explored. Our two-stage machine learning approach incorporates: (1) identifying combinations of environmental exposures associated with timing of menarche and (2) using epidemiologic methods to quantify the associations between exposure profiles and timing of menarche. In this pilot study, we demonstrate the utility of this two-stage analytical method using data from the National Health and Nutrition Examination Survey (NHANES).
2. Materials and Methods
2.1. Study population
This study was conducted using data from NHANES, a nationally representative, cross-sectional survey of non-institutionalized persons living in the United States. NHANES, administered by the National Center for Health Statistics and the Centers for Disease Control and Prevention (CDC), is an ongoing health and nutrition survey conducted in two-year cycles that utilizes a multistage probability sampling design. The dataset is compiled from in-person interviews, physical examinations, and laboratory samples. A more detailed description of NHANES is available elsewhere (Johnson et al. 2013). All survey and consent documents for NHANES were approved by the CDC Institutional Review Board. We restricted our study sample to female adolescents between 12 and 16 years of age to capture exposures closest to age at menarche. Girls were included in this analysis if complete data were available on the outcome (age at menarche), selected biomarkers of environmental exposures, and covariates. We chose the NHANES cycles 2005–2008 to maximize the number of girls with both menarche and exposure biomarker data. The final sample consisted of 229 adolescent girls from the 2005–2006 and 2007–2008 sampling cycles with complete data on 41 environmental exposure biomarkers of interest.
2.2. Outcome assessment: age at menarche
Age at menarche was obtained from the NHANES reproductive health questionnaire from the question: “How old were you when you had your first menstrual period?” Based on previous literature, age at menarche was then categorized as “earlier” if menarche was achieved before 12 years of age and “later” if menarche was achieved at 12 years or older. Girls who were at least 12 years old and had not reached menarche at the time of the interview were classified as later menarche.
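The coding rule above can be sketched in a few lines. This is an illustrative helper, not the authors' code; the function name and the use of `None` to represent a girl who had not yet reached menarche are assumptions made for the example.

```python
from typing import Optional


def classify_menarche(age_at_menarche: Optional[float]) -> str:
    """Return "earlier" if menarche occurred before age 12, else "later".

    `None` (an assumed encoding) stands for a girl aged 12 or older who had
    not reached menarche at interview; the study classifies her as "later".
    """
    if age_at_menarche is None:
        return "later"
    return "earlier" if age_at_menarche < 12 else "later"
```

Because every participant was at least 12 years old at interview, a girl without menarche cannot belong to the "earlier" group, which is why the rule needs no further information.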
2.3. Exposure assessment: environmental agents
Whole blood and spot urine samples were collected during the physical examination portion of the NHANES surveys (Johnson et al. 2013). A random subsample (one-third) of participants six years and older in each survey cycle provided biological samples that were frozen at −20°C until testing (Johnson et al. 2013). Participants younger than 12 years were excluded from this analysis because data on age at menarche were obtained from girls who were 12 years or older at the time of the interview. All samples were analyzed at the CDC National Center for Environmental Health Laboratory, with detailed laboratory analysis methods described elsewhere (Johnson et al. 2013). Environmental analytes measured in blood or urine that had a detection frequency of 70% or higher were included in this analysis, resulting in a total of 41 exposure biomarkers across seven classes of environmental agents. Urinary metabolites included phytoestrogens (genistein, daidzein, o-desmethylangolensin (o-DMA), equol, enterodiol, and enterolactone), phthalates (mono(2-ethylhexyl) phthalate (MEHP), mono(2-ethyl-5-oxohexyl) phthalate (MEOHP), mono(2-ethyl-5-hydroxyhexyl) phthalate (MEHHP), mono(2-ethyl-5-carboxypentyl) phthalate (MECPP), mono(carboxynonyl) phthalate (MCNP), mono(carboxyoctyl) phthalate (MCOP), monobenzyl phthalate (MBzP), monobutyl phthalate (MBP), monoisobutyl phthalate (MiBP), monoethyl phthalate (MEP), mono(3-carboxypropyl) phthalate (MCPP), and mono-n-methyl phthalate (MMP)), parabens (propylparaben, methylparaben, ethylparaben, and butylparaben), phenols (bisphenol A (BPA), benzophenone-3 (BP3), triclosan (TCS), 2,5-dichlorophenol (2,5-DCP), 2,4-dichlorophenol (2,4-DCP), 2,4,5-trichlorophenol (2,4,5-TCP), and 2,4,6-trichlorophenol (2,4,6-TCP)), and PAHs (1-naphthol, 2-naphthol, 3-hydroxyphenanthrene, 2-hydroxyfluorene, 1-hydroxyphenanthrene, 2-hydroxyphenanthrene, 1-hydroxypyrene, and 9-hydroxyfluorene).
The oxidative phthalate metabolites included in this analysis represent parent diesters that are used commercially. Blood metals including cadmium, lead, and mercury, and measures of blood cotinine concentrations were also included in the final dataset. Analytes below the limit of detection (LOD) were assigned a value of LOD/√2 by the CDC.
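The detection-frequency screen and the LOD/√2 substitution described above can be illustrated with a short sketch. In the NHANES files the substitution has already been applied by the CDC, so the code below is only a hypothetical reconstruction of the two rules; the function names and the assumption that raw values can be compared directly with the LOD are ours.

```python
import math
from typing import List, Sequence


def detection_frequency(values: Sequence[float], lod: float) -> float:
    """Fraction of measurements at or above the limit of detection."""
    return sum(v >= lod for v in values) / len(values)


def keep_biomarker(values: Sequence[float], lod: float,
                   min_frequency: float = 0.70) -> bool:
    """Apply the 70% detection-frequency inclusion rule."""
    return detection_frequency(values, lod) >= min_frequency


def substitute_below_lod(values: Sequence[float], lod: float) -> List[float]:
    """Replace values below the LOD with LOD / sqrt(2)."""
    fill = lod / math.sqrt(2)
    return [v if v >= lod else fill for v in values]
```

For example, with a hypothetical LOD of 0.4 ng/mL, the measurements `[0.1, 0.5, 1.2, 2.0]` have a detection frequency of 0.75, so the biomarker would be retained and the first value would be replaced by 0.4/√2.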
2.4. Covariates
Confounders in the relationship between environmental chemical exposures and menarche were identified for model adjustment in the targeted stage of the analysis (Stage 2) based on previous literature. These covariates include poverty income ratio (PIR), race/ethnicity, and body mass index (BMI) percentile. PIR, defined as the ratio of income to the family’s appropriate poverty threshold, was categorized as <1, 1–2, 2–3, and >3. Race/ethnicity was recoded as non-Hispanic white, non-Hispanic black, Mexican American/other Hispanic, and other. BMI was calculated from measures of height and weight (kg/m2). CDC 2000 growth charts were used to convert BMI values into age- and sex-specific BMI percentile levels to provide a measure of BMI relative to other children of the same sex and age (Kuczmarski et al. 2002). Because this is a cross-sectional survey, we did not have pre-menarcheal BMI measurements or measures of other potential mediators. Under the assumption of stable BMI trajectories, we adjusted models based on concurrent BMI-percentile values at the time of EDC exposure assessment.
2.5. Statistical analysis
2.5.1. Stage I: Identifying exposure profiles through random forest
We used tree-based methods, a non-parametric modeling approach, to identify exposure profiles for girls with earlier vs. later menarche. Tree-based methods are a recursive partitioning technique that “learns” a tree using a set of features (exposure variables of interest) to best split the study population into homogeneous groups based on the target outcome. A branch within a single tree, from node to leaf, represents an exposure profile that identifies a subpopulation that is enriched for the target outcome of interest. Advantages of using tree-based methods include: 1) interpretability; 2) accommodation of a large number of feature variables; 3) allowance for the mixed use of categorical and numerical features; and 4) identification of high-order and non-linear interactions between features (as evidenced by being within the same branch) (Lampa et al. 2014). However, a single tree tends to be highly unstable and can overfit the data, resulting in exposure profiles that are unlikely to generalize to new data samples.
Random forests extend the method of tree-based modeling by aggregating over an ensemble of individual classification trees (Breiman 2001). Random forest reduces overfitting by repeated resampling with replacement and random feature selection when generating each tree. Random forest is often used to identify the optimal prediction model for an outcome of interest. However, we used random forest as an exploratory and hypothesis-generating step to identify individual exposure biomarkers and combinations of exposure biomarkers that may be relevant to the timing of menarche. To identify exposure biomarker profiles, we analyzed the structure of the trees (branches from each tree) within the random forest. Our random forest analysis consisted of the 41 environmental exposure biomarkers modeled on a binary outcome, earlier and later menarche. Random forest methods do not allow for traditional covariate adjustment; therefore, urinary concentrations were normalized for dilution by dividing the biomarker concentration in ng/mL (μg/L) by the creatinine concentration in g/L. All exposure biomarker measures were kept as linear terms. We used the following hyperparameters for the random forest analysis: 1) 2000 as the number of trees in the forest, 2) 15 as the minimum number of observations in each terminal node, and 3) 10 as the maximum number of terminal nodes that each tree in the forest can have. We varied the minimum terminal node size (range: 15 to 30 observations) and the maximum number of terminal nodes (range: 5 to 15) to examine the impact on our results. The goal of this study is to identify EDC exposure biomarker combinations, not to create an optimal prediction model; therefore, we used a bagging approach where all 41 exposure biomarkers were tried at each split. The outcome was set to target earlier menarche. The randomForest R package was used to learn our random forest (Breiman and Cutler 2018).
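The creatinine normalization described above is a simple unit calculation: a concentration in ng/mL (equivalently μg/L) divided by creatinine in g/L gives μg of analyte per g of creatinine. The sketch below illustrates it; the function names are hypothetical, and the mg/dL-to-g/L helper assumes that creatinine arrives in mg/dL, which the text does not state.

```python
def creatinine_mg_dl_to_g_l(creatinine_mg_dl: float) -> float:
    """Convert creatinine from mg/dL to g/L (1 mg/dL = 0.01 g/L)."""
    return creatinine_mg_dl * 0.01


def creatinine_adjust(analyte_ng_ml: float, creatinine_g_l: float) -> float:
    """Return the analyte concentration in ug per g creatinine.

    ng/mL is numerically equal to ug/L, so dividing by creatinine in g/L
    yields ug/g creatinine.
    """
    if creatinine_g_l <= 0:
        raise ValueError("creatinine concentration must be positive")
    return analyte_ng_ml / creatinine_g_l
```

Under these assumptions, a urinary concentration of 2.36 ng/mL in a sample with 200 mg/dL creatinine (2.0 g/L) corresponds to 1.18 μg/g creatinine.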
As noted above, we performed a downstream analysis of the random forest results which involved decomposing each tree into its constituent branches (“exposure biomarker profiles”) and counting their frequencies across the 2000 trees. For each tree, the algorithm recursively selected the exposure biomarker and a value for that exposure biomarker that best split girls into homogeneous subgroups based on the target outcome of earlier and later menarche. We used permutation tests to identify exposure biomarker profiles that warranted follow-up in Stage 2. Specifically, we built a null distribution of exposure biomarker profiles by randomly permuting the outcome (i.e., earlier/later menarche) with respect to the EDC exposure biomarkers. The correlation structure between the exposure biomarkers was maintained, but the statistical relationship between the exposure biomarkers and the outcome was severed. We then applied the random forest algorithm and calculated exposure biomarker profile frequencies across the 2000 trees using the permuted data. This was repeated 100 times to generate a distribution of exposure biomarker profile frequencies that occurred in the absence of a relationship between the exposure biomarkers and the outcome. The exposure biomarker profile frequency at the 99.9th percentile (corresponding to a p-value of 0.001) was used to calculate a threshold value, that is, the number of times an exposure biomarker profile needed to appear in the forest to be considered inconsistent with a null association with the outcome.
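The permutation procedure can be outlined as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: `count_profiles` is a hypothetical callable standing in for "fit a random forest on the fixed exposure matrix with the given outcome vector and count each branch", and pooling all null frequencies before taking a nearest-rank percentile is our reading of how the threshold was derived.

```python
import math
import random
from typing import Callable, Dict, List, Sequence


def nearest_rank_percentile(values: Sequence[float], pct: float) -> float:
    """Smallest observed value with at least pct% of the data at or below it."""
    if not values:
        raise ValueError("no values supplied")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def permutation_threshold(
    outcome: Sequence[int],
    count_profiles: Callable[[List[int]], Dict[str, int]],
    n_permutations: int = 100,
    pct: float = 99.9,
    seed: int = 0,
) -> float:
    """Profile frequency expected to be exceeded rarely under the null."""
    rng = random.Random(seed)
    null_frequencies: List[int] = []
    for _ in range(n_permutations):
        # Shuffle only the outcome: the exposures (held inside
        # count_profiles) keep their correlation structure, while any
        # exposure-outcome relationship is broken.
        shuffled = list(outcome)
        rng.shuffle(shuffled)
        null_frequencies.extend(count_profiles(shuffled).values())
    return nearest_rank_percentile(null_frequencies, pct)
```

A profile observed in the real forest more often than the returned value would then be carried forward to Stage 2. Other percentile conventions (for example, linear interpolation) would give slightly different thresholds.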
2.5.2. Stage II: Estimation of effect size through regression analyses
In Stage 2, exposure biomarker profiles that appeared more frequently than the threshold value obtained from Stage 1 were evaluated for targeted analysis. This two-step machine learning approach is illustrated in Fig. 1. As described in Section 2.5.1, the random forest algorithm selects the exposure biomarker and a node split value for the respective exposure biomarker that best splits girls into homogeneous subgroups based on the target outcome of earlier and later menarche. Exposure biomarkers were dichotomized based on the median of the node-split values from the RF analysis unless there was an extreme imbalance in the number of observations for the binary variable. In that case we used an alternative cutpoint based on a quartile or tertile value of the distribution rather than the median of the node-split values. For exposures nested within a root node exposure biomarker, the cutpoint values were calculated based on the distribution of the “child” exposure biomarker within the respective strata, not in the full population. The appropriate cutpoint was assessed in sensitivity analyses, comparing the direction of effect when using the median of the node-split values versus alternative cutpoints. Models were adjusted for confounding variables and accounted for the complex survey design (e.g., oversampling, adjustment for non-response, stratification and clustering) of NHANES. Because we used two NHANES cycles, survey weights were adjusted by dividing by the number of cycles, as instructed in NHANES analytic protocols (https://wwwn.cdc.gov/nchs/nhanes/). Poisson regression analyses with robust standard errors were conducted using the ‘survey’ package in R v.3.4.3 (Lumley 2019). The ‘survey’ package allowed us to account for the complex sampling design and to obtain the appropriate variance around the effect estimates. The prevalence ratio (PR) and 95% confidence intervals (CI) were obtained for exposure biomarker profiles associated with earlier vs. later menarche. 
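Two of the data-preparation steps above lend themselves to a brief illustration: deriving a cutpoint from the node-split values and rescaling survey weights when two cycles are combined. The sketch below is hypothetical (names and input shapes are assumptions) and does not cover the alternative quartile or tertile cutpoints or the design-based regression itself.

```python
from statistics import median
from typing import List, Sequence


def median_split_cutpoint(node_split_values: Sequence[float]) -> float:
    """Median of the split values chosen for one biomarker across branches."""
    return median(node_split_values)


def dichotomize_lower(values: Sequence[float], cutpoint: float) -> List[int]:
    """Code each concentration as 1 ("lower", at or below the cutpoint) or 0."""
    return [1 if v <= cutpoint else 0 for v in values]


def combined_cycle_weights(two_year_weights: Sequence[float],
                           n_cycles: int = 2) -> List[float]:
    """Divide each two-year survey weight by the number of combined cycles."""
    return [w / n_cycles for w in two_year_weights]
```

In a nested profile, the same dichotomization would be applied to the "child" biomarker using only the observations inside the root-node stratum, consistent with the description above.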
We constructed three models: unadjusted (Model 1), adjusted for race and PIR (Model 2), and Model 2 further adjusted for BMI percentile, as BMI may be a confounder or a mediator (Model 3). When an exposure biomarker profile consisted of only one EDC we estimated the association between the binary EDC exposure biomarker variable and timing of menarche in the entire study sample (n=229). When an exposure profile consisted of more than one EDC, the effect of the exposure biomarker profile on timing of menarche was estimated in appropriate subsets. For example, if an exposure biomarker profile for earlier menarche was higher levels of biomarker A (root node) and biomarker B (leaf node) we estimated the association of biomarker B (higher vs lower concentration) and timing of menarche in the stratum with higher levels of biomarker A.
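The stratified comparison for a two-biomarker profile can be sketched for the unadjusted case. With a single binary exposure and no covariates, the exponentiated coefficient of a weighted log-link Poisson model equals the ratio of the weighted outcome prevalences in the two exposure groups, so the point estimate can be computed directly. The record layout and names below are hypothetical, and the sketch deliberately omits covariate adjustment and the design-based confidence intervals that the survey-weighted models provide.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Record:
    root_exposed: bool      # in the root-node stratum (e.g., higher biomarker A)
    leaf_exposed: bool      # leaf-node exposure (e.g., higher biomarker B)
    earlier_menarche: bool  # outcome
    weight: float           # survey weight


def weighted_prevalence(records: Sequence[Record]) -> float:
    """Weighted proportion of records with earlier menarche."""
    total = sum(r.weight for r in records)
    if total <= 0:
        raise ValueError("empty group or non-positive total weight")
    return sum(r.weight for r in records if r.earlier_menarche) / total


def stratified_prevalence_ratio(records: Sequence[Record]) -> float:
    """Unadjusted PR for the leaf exposure within the root-node stratum."""
    stratum: List[Record] = [r for r in records if r.root_exposed]
    exposed = [r for r in stratum if r.leaf_exposed]
    unexposed = [r for r in stratum if not r.leaf_exposed]
    return weighted_prevalence(exposed) / weighted_prevalence(unexposed)
```

Records outside the root-node stratum are ignored entirely, which mirrors the rule that the leaf-node contrast is estimated only among girls who satisfy the root-node condition. The ratio is undefined when no unexposed girl in the stratum has the outcome.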
Fig. 1.
An illustration of the two-step machine learning approach used in this analysis. Stage 1) 2000 classification trees were generated in a random forest analysis based on 41 exposure biomarkers, targeting the outcome of earlier versus later menarche. Branches from each tree were decomposed to represent exposure biomarker profiles (“combinations”) and counted across all 2000 trees. Combination frequencies were compared to a threshold from the permutation test. Stage 2) Combinations that passed the threshold were assessed in epidemiologic models adjusting for confounding.
3. Results
Characteristics of the study population overall and by earlier and later menarche are summarized in Table 1. A total of 229 adolescent girls between 12 and 16 years of age were included in the analysis. Overall, the mean age at the time of the survey was 14.5 years with a standard deviation (SD) of 0.14 years. About 30% of girls experienced menarche before age 12 years. Approximately 9% (n=20) of girls had not reached menarche at the time of the survey and were classified as “later”. The majority of girls were non-Hispanic white (64%) and in the 5th to < 85th BMI percentile (63%), and about 43% of girls had a middle or higher socioeconomic status indicated by a PIR ≥ 3.
Table 1.
Distribution of population characteristics (n and percent or mean and SD) by menarche timing.
| Characteristic | All participants (N=229): N (unweighted)ᵃ | All participants (N=229): weighted % or mean (SD)ᵇ | Earlier menarche: N (unweighted)ᵃ | Earlier menarche: weighted % or mean (SD)ᵇ | Later menarche: N (unweighted)ᵃ | Later menarche: weighted % or mean (SD)ᵇ |
|---|---|---|---|---|---|---|
| Age at interview, years | 229 | 14.5 (0.14) | 86 | 14.2 (0.18) | 143 | 14.6 (0.17) |
| Age at menarche, years | | | | | | |
| < 12 years | 86 | 30% | | | | |
| ≥ 12 years† | 143 | 70% | | | | |
| Race/ethnicity | | | | | | |
| Non-Hispanic White | 65 | 64% | 17 | 51% | 48 | 70% |
| Non-Hispanic Black | 68 | 14% | 36 | 26% | 32 | 9% |
| Hispanic | 86 | 17% | 31 | 22% | 55 | 14% |
| Other | 10 | 5% | 2 | 1% | 8 | 7% |
| BMI percentile | | | | | | |
| < 5th percentile | 9 | 5% | 4 | 5% | 5 | 5% |
| 5th to < 85th percentile | 130 | 63% | 43 | 50% | 87 | 69% |
| 85th to < 95th percentile | 32 | 13% | 12 | 15% | 20 | 13% |
| ≥ 95th percentile | 58 | 19% | 27 | 30% | 31 | 13% |
| Poverty income ratio | | | | | | |
| < 1.0 | 80 | 24% | 33 | 30% | 47 | 22% |
| 1.0 – 1.9 | 49 | 13% | 18 | 14.5% | 31 | 12% |
| 2.0 – 2.9 | 36 | 20% | 13 | 19.5% | 23 | 20% |
| ≥ 3 | 64 | 43% | 22 | 36% | 42 | 46% |
Abbreviations: BMI, body mass index (kg/m2); SD, standard deviation. BMI percentiles are based on age- and sex-specific estimates.
† Includes 20 girls who had not experienced menarche at the time of the interview.
ᵃ Unweighted number of participants.
ᵇ Summary measure accounting for complex survey design; represents measures that are generalizable to the U.S. population.
3.1. Stage I: Identifying exposure profiles
The median concentrations of the 41 exposure biomarkers are summarized in Supplemental Table 1. Correlations among the 41 exposure biomarkers included in this study are shown in Supplemental Fig. 1 and indicate moderate-to-high correlation between chemicals of the same class. The variable importance feature plot calculated from the mean decrease in Gini impurity criterion is shown in Fig. 2 and illustrates the relative importance of each exposure biomarker in splitting the sample into homogeneous subgroups (based on menarche status). The Gini impurity criterion is a measure of node impurity, and a node is 100% pure when all of its observations belong to a single outcome class. The mean decrease in the Gini impurity criterion indicates how well the exposure biomarker classified girls based on their menarche status across the random forest, where the higher the Gini value the more important the feature was in differentiating subgroups with greater homogeneity across the random forest. As shown in Fig. 2, the random forest identified MEHP as the most important feature that partitioned girls based on their outcome status of earlier or later menarche. BPA was the second most important feature followed by 2,4-DCP and MCPP. The number of times an exposure appeared as the root node across the 2000 trees is presented in Supplemental Table 2. Consistent with the findings from the variable importance feature plot, MEHP was the most frequently occurring root node (the first selected feature that best partitioned the data) in the random forest, appearing as the root node in approximately 46% of the 2000 trees. BPA and MCPP appeared as the root node in 7% and 8% of the trees, respectively. Due to the small sample size, results were sensitive to using different hyperparameters. Specifically, choosing larger values for the population sizes of terminal nodes limited the depth of the trees (data not shown).
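The Gini impurity of a node with class proportions p_k is 1 − Σ p_k², and the decrease attributed to a split is the parent impurity minus the size-weighted impurities of the two child nodes. The sketch below computes both quantities for a single split; it is illustrative only (function names are ours) and does not reproduce how a random forest package aggregates these decreases over all splits and trees into an importance score.

```python
from typing import Hashable, Sequence


def gini_impurity(labels: Sequence[Hashable]) -> float:
    """1 - sum of squared class proportions; 0.0 for a pure (or empty) node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def gini_decrease(parent: Sequence[Hashable],
                  left: Sequence[Hashable],
                  right: Sequence[Hashable]) -> float:
    """Impurity reduction achieved by splitting `parent` into `left` and `right`."""
    n = len(parent)
    return (gini_impurity(parent)
            - len(left) / n * gini_impurity(left)
            - len(right) / n * gini_impurity(right))
```

For a binary outcome the impurity is largest (0.5) when a node holds equal numbers of girls with earlier and later menarche, and a split that separates them perfectly removes all of that impurity.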
Fig. 2.
Ordered relative variable importance plot based on mean decrease in Gini impurity. The features are ranked based on mean decrease in Gini impurity criterion, a measure of node impurity. The mean decrease in Gini indicates how well the exposure biomarker classified girls based on their menarche status across the random forest, where the higher the Gini value the more important the feature was in differentiating subgroups across the random forest.
Based on the permutation tests, constituent branches from the random forest that appeared more than 23 times were considered inconsistent with the null distribution and eligible for follow-up in Stage 2. As shown in Table 2, 11 exposure biomarker profiles met the permutation test threshold. The profiles covered five classes of EDCs including phthalates, phenols, PAHs, phytoestrogens, and heavy metals. MEHP appeared in all 11 exposure biomarker profiles, as the root node in nine profiles and the leaf node in two profiles. Phenols were the second most frequent exposure class, appearing in five profiles. Biomarkers of exposure to both MEHP and BPA appeared in two of the 11 profiles (73 branches total). Only one exposure biomarker profile consisted of EDCs from the same class (MEHP root node and MBzP leaf node). Among girls with lower levels of MEHP, higher concentrations of o-DMA or lower levels of BP3 were the second and third most populated branches in Table 2, respectively. Blood concentration of cadmium was the only heavy metal to appear in a nonrandom exposure profile. Due to limited sample size, the random forest methods did not identify relevant exposure profiles with more than two biomarkers.
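The tallies in this paragraph follow directly from the branch counts in Table 2 and the permutation threshold of 23. The short sketch below reproduces them; the counts and biomarker names are taken from Table 2, while the tuple layout and variable names are only an illustrative choice.

```python
from typing import List, Optional, Tuple

THRESHOLD = 23  # permutation-test frequency threshold reported in the text

# (root-node biomarker, leaf-node biomarker or None, number of branches)
Profile = Tuple[str, Optional[str], int]
PROFILES: List[Profile] = [
    ("MEHP", None, 280),
    ("MEHP", "o-DMA", 64),
    ("MEHP", "BP3", 52),
    ("MEHP", "BPA", 48),
    ("MEHP", "Cadmium", 47),
    ("2,4-DCP", "MEHP", 43),
    ("MEHP", "Enterodiol", 43),
    ("MEHP", "MBzP", 32),
    ("MEHP", "Triclosan", 31),
    ("MEHP", "1-hydroxypyrene", 25),
    ("BPA", "MEHP", 25),
]

# Profiles appearing more often than the null threshold go on to Stage 2.
retained = [p for p in PROFILES if p[2] > THRESHOLD]

mehp_as_root = sum(1 for root, _, _ in retained if root == "MEHP")
mehp_as_leaf = sum(1 for _, leaf, _ in retained if leaf == "MEHP")
mehp_bpa_branches = sum(n for root, leaf, n in retained
                        if {root, leaf} == {"MEHP", "BPA"})
```

Here `retained` contains all 11 profiles, MEHP is the root node in nine and the leaf node in two, and the two MEHP and BPA profiles together account for 73 branches.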
Table 2.
Exposure biomarker profiles identified by random forest analysis to be associated with earlier vs later menarche.
| Root node biomarker | Root node direction / median value* | Leaf node biomarker | Leaf node direction / median value* | Number of branches among 9113 total branches generated by RF |
|---|---|---|---|---|
| MEHP | ≤ 0.53 ng/mL | – | – | 280 |
| MEHP | ≤ 0.75 ng/mL | o-DMA | > 1.53 ng/mL | 64 |
| MEHP | ≤ 0.79 ng/mL | BP3 | ≤ 24.50 ng/mL | 52 |
| MEHP | < 0.76 ng/mL | BPA | ≤ 0.80 ng/mL | 48 |
| MEHP | ≤ 0.76 ng/mL | Cadmium | ≤ 0.26 μg/L | 47 |
| 2,4-DCP | ≤ 0.67 ng/mL | MEHP | ≤ 0.51 ng/mL | 43 |
| MEHP | > 0.82 ng/mL | Enterodiol | > 152.75 ng/mL | 43 |
| MEHP | > 0.76 ng/mL | MBzP | ≤ 1.76 ng/mL | 32 |
| MEHP | > 0.81 ng/mL | Triclosan | > 838.21 ng/mL | 31 |
| MEHP | > 0.82 ng/mL | 1-hydroxypyrene | > 0.22 ng/mL | 25 |
| BPA | > 0.94 ng/mL | MEHP | ≤ 0.51 ng/mL | 25 |
Abbreviations: 2,4-DCP, 2,4-dichlorophenol; BP3, benzophenone-3; BPA, bisphenol A; MBzP, monobenzyl phthalate; MEHP, mono(2-ethylhexyl) phthalate; o-DMA, o-desmethylangolensin.
* Median node split value (calculated for observations within all branches where that exposure profile appeared). Total number of branches across random forest = 9113; permutation frequency threshold = 23.
3.2. Stage II: Effect size estimates
The unadjusted and adjusted prevalence ratios and 95% CI of the exposure biomarker profiles identified in Stage 1 are shown in Table 3. As described in section 2.5.2, due to small sample sizes in subpopulations some EDCs were dichotomized at alternative cutpoints rather than the median of the node-split values to stabilize the risk models. The less extreme concentration ranges may explain why we observed weak or null effect estimates for some exposure profiles.
Table 3.
Unadjusted and adjusted prevalence ratios (PR) and 95% confidence intervals (CI) for associations between EDC exposure profiles identified by random forest analysis and earlier menarche.
| Biomarker exposure profiles* | Number of girls with exposure combination / Number of girls in root node | Model 1 PR (95% CI) | Model 2 PR (95% CI) | Model 3 PR (95% CI) |
|---|---|---|---|---|
| LOWER MEHP COMBINATIONS | | | | |
| MEHP ≤ 2.36 ng/mL* | 113/229 | 1.41 (1.06, 1.87) | 1.41 (1.05, 1.88) | 1.36 (1.02, 1.80) |
| MEHP ≤ 2.36 ng/mL* and o-DMA > 1.53 ng/mL | 73/113 | 1.08 (0.78, 1.50) | 1.08 (0.79, 1.48) | 1.06 (0.78, 1.44) |
| MEHP ≤ 2.36 ng/mL* and BP3 ≤ 24.50 ng/mL | 71/113 | 1.48 (0.94, 2.30) | 1.47 (0.94, 2.28) | 1.48 (0.96, 2.31) |
| MEHP ≤ 2.36 ng/mL* and cadmium ≤ 0.26 μg/L | 93/113 | 1.14 (0.63, 2.05) | 1.13 (0.62, 2.07) | 1.13 (0.59, 2.14) |
| 2,4-DCP ≤ 0.67 ng/mL* and MEHP ≤ 1.00 ng/mL | 22/90 | 1.47 (0.62, 3.48) | 1.47 (0.61, 3.55) | 1.51 (0.66, 3.43) |
| BPA > 0.94 ng/mL and MEHP ≤ 2.58 ng/mL* | 94/189 | 1.34 (0.99, 1.81) | 1.33 (0.98, 1.81) | 1.31 (0.99, 1.73) |
| COMBINATIONS FOR GIRLS NOT HAVING LOWER MEHP (MEHP > 2.36 ng/mL) | | | | |
| BPA ≤ 1.31 ng/mL* | 29/116 | 1.16 (0.88, 1.52) | 1.19 (0.90, 1.58) | 1.17 (0.87, 1.59) |
| Enterodiol > 38.98 ng/mL* | 58/116 | 1.02 (0.82, 1.28) | 1.02 (0.82, 1.27) | 1.04 (0.83, 1.31) |
| MBzP ≤ 7.16 ng/mL* | 28/116 | 1.22 (0.87, 1.73) | 1.24 (0.87, 1.76) | 1.17 (0.86, 1.60) |
| Triclosan > 64.12 ng/mL* | 29/116 | 1.07 (0.80, 1.41) | 1.09 (0.84, 1.42) | 1.14 (0.89, 1.46) |
| 1-hydroxypyrene > 0.22 ng/mL | 11/116 | 1.16 (0.73, 1.87) | 1.12 (0.72, 1.73) | 1.15 (0.77, 1.70) |
Abbreviations: 2,4-DCP, 2,4-dichlorophenol; BP3, benzophenone-3; BPA, bisphenol A; MBzP, monobenzyl phthalate; MEHP, mono(2-ethylhexyl) phthalate; o-DMA, o-desmethylangolensin.
* Values based on alternative cutpoints. Model 1: Unadjusted; Model 2: Adjusted for race and PIR; Model 3: Adjusted for race, PIR and BMI percentile. Exposed: girls in subpopulation defined by root node with exposure levels based on leaf node; unexposed: (total population defined by root node) - (exposed).
Girls with lower levels of MEHP (≤ 2.36 ng/mL) were more likely to have earlier menarche than girls with higher levels of MEHP. We observed similar associations when examining the exposure profile of lower BP3 among girls with lower MEHP. Associations between lower levels of MEHP and earlier menarche were also seen among girls with either lower 2,4-DCP concentrations or higher BPA concentrations, though estimates were slightly weaker in those smaller subgroups. Among girls with lower MEHP, associations with higher o-DMA or lower cadmium were close to null.
For girls who did not have lower levels of MEHP, levels of other biomarkers (BPA, enterodiol, MBzP, triclosan, and 1-hydroxypyrene) identified in Stage 1 showed largely null associations within Stage 2. Adjustment for covariates did not materially change the estimates or the CIs of these models.
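The exposed and unexposed groups described in the table footnote, and an unadjusted prevalence ratio comparing them, can be sketched as follows. This is a minimal, unweighted illustration rather than the authors' analysis: it ignores the NHANES survey design and covariate adjustment, and the function names, the record layout, and the Wald interval on the log scale are assumptions introduced here.

```python
import math

def exposure_groups(records, root, leaf):
    """Split records into exposed/unexposed within the root-node subpopulation.

    root and leaf are hypothetical rule tuples: (biomarker, op, cutpoint),
    where op is "<=" or ">". Records are dicts of biomarker values.
    """
    def meets(rec, rule):
        name, op, cut = rule
        return rec[name] <= cut if op == "<=" else rec[name] > cut

    # Subpopulation defined by the root node
    subpop = [r for r in records if meets(r, root)]
    # Exposed: also satisfies the leaf-node rule; unexposed: the remainder
    exposed = [r for r in subpop if meets(r, leaf)]
    unexposed = [r for r in subpop if not meets(r, leaf)]
    return exposed, unexposed

def prevalence_ratio(a, n1, c, n0, z=1.96):
    """Unadjusted PR with a Wald CI computed on the log scale.

    a: early-menarche cases among n1 exposed girls
    c: early-menarche cases among n0 unexposed girls
    """
    pr = (a / n1) / (c / n0)
    se_log = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(pr) - z * se_log)
    upper = math.exp(math.log(pr) + z * se_log)
    return pr, lower, upper

# Toy records (hypothetical values, not NHANES data)
toy = [
    {"MEHP": 1.00, "BP3": 10.0},
    {"MEHP": 2.00, "BP3": 30.0},
    {"MEHP": 3.00, "BP3": 5.0},
    {"MEHP": 2.36, "BP3": 24.5},
]
exposed, unexposed = exposure_groups(
    toy, root=("MEHP", "<=", 2.36), leaf=("BP3", "<=", 24.50)
)
print(len(exposed), len(unexposed))
print(prevalence_ratio(a=30, n1=60, c=20, n0=60))  # hypothetical counts
```

The adjusted estimates in Models 2 and 3 require a regression model (the paper uses modified Poisson regression), which this sketch does not attempt.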
4. Discussion
We applied a two-stage approach to examine multiple EDCs across various classes of environmental chemicals to identify exposure biomarker profiles associated with the timing of menarche. In a setting of multiple EDC co-exposures, our results highlight the importance of exposure to MEHP (alone and in combination with other EDCs) as it relates to timing of menarche in adolescent girls. MEHP, a metabolite of di-2-ethylhexyl phthalate (DEHP), is the EDC that best partitioned girls into homogeneous subgroups based on timing of menarche and the only feature to appear in all 11 relevant exposure profiles. Consistent with previous hypotheses and epidemiologic studies, our findings suggest a relationship between lower levels of MEHP metabolites and earlier menarche. Additionally, we identified various combinations in the remaining ten exposure profiles, which represent hypotheses that can be targeted in larger studies of adolescent girls to further understand how EDCs may impact timing of menarche.
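To illustrate what it means for a biomarker to partition girls into homogeneous subgroups, the sketch below scores candidate cutpoints by the decrease in Gini impurity, the criterion commonly used for classification trees. The text does not state which splitting criterion was used, so the criterion, the function names, and the toy data are all assumptions made for illustration.

```python
def gini(labels):
    """Gini impurity for binary labels (1 = earlier menarche, 0 = otherwise)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def impurity_decrease(values, labels, cut):
    """Drop in Gini impurity when splitting at `value <= cut` vs `value > cut`."""
    left = [y for x, y in zip(values, labels) if x <= cut]
    right = [y for x, y in zip(values, labels) if x > cut]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted

def best_cut(values, labels):
    """Return the midpoint cutpoint with the largest impurity decrease."""
    xs = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    if not candidates:
        raise ValueError("need at least two distinct values to split")
    return max(candidates, key=lambda c: impurity_decrease(values, labels, c))

# Toy biomarker concentrations and outcomes (hypothetical, not NHANES data)
conc = [1.0, 2.0, 3.0, 4.0]
early = [1, 1, 0, 0]
cut = best_cut(conc, early)
print(cut, impurity_decrease(conc, early, cut))
```

A larger decrease means the two resulting subgroups are more homogeneous with respect to the outcome than the parent group.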
To our knowledge, few studies have explored the relationship between exposure to multiple environmental chemicals and pubertal timing. Concurrent or sequential exposure to a variety of EDCs is more reflective of the real-world experience; therefore, it is important to identify individual exposures as well as combinations of exposures that may affect timing of menarche and other pubertal milestones. However, disentangling the independent effect of any single chemical or group of chemicals on timing of menarche in the presence of other chemical exposures is challenging. Various machine learning techniques exist to help overcome the statistical challenges of investigating complex environmental mixtures, specifically in child and adolescent health (Oskar and Stingone 2020). Each method addresses a different research question. Methods like weighted quantile sum (WQS) and Bayesian kernel machine regression (BKMR) assess the overall effect of a mixture in relation to the targeted outcome (Gibson et al. 2019) and may not reflect combinations of exposures that are present in populations. Given the potential variability in patterns of exposure to EDCs, our research goal was not to identify the overall impact of the mixture on timing of menarche, but to uncover distinct exposure biomarker profiles that may be related to timing of menarche. Thus, we applied an approach that combines machine learning and epidemiologic methods to help identify interpretable and relevant environmental exposure profiles that exist within the population while considering some of these statistical challenges. Random forest methods do not account for confounding in their standard implementation; therefore, it was important to implement a two-stage approach that combines identifying co-exposure profiles with strategies for assessing adjusted effect estimates.
In contrast to our study, previous studies have focused on later menarche rather than earlier menarche. Our finding of an association between lower MEHP and earlier menarche can be reframed as an association between higher MEHP and later menarche. These results are consistent with two (Watkins et al. 2014; Watkins et al. 2017) of three (Cathey et al. 2020) studies that examined prenatal exposure to MEHP and three (Binder et al. 2018; Kasper-Sonnenberg et al. 2017; Watkins et al. 2014) of four (Zhang et al. 2015) studies that examined childhood exposure to MEHP and menarche. We also observed a weak inverse association between levels of MEHP and menarche among girls with higher levels of BPA. In our study, we found slightly elevated estimates of earlier menarche for other EDC exposure profiles, but lacked sufficient power to detect small differences in the smaller subsets of girls. Nevertheless, the direction of the relationship for EDCs like cadmium, BPA, MBzP, O-DMA, and triclosan with timing of menarche is consistent with other single-chemical studies (Berger et al. 2018; Binder et al. 2018; Buttke et al. 2012; Harley et al. 2019; Reynolds et al. 2020). Our finding of an inverse relationship between benzophenone-3 and 2,4-dichlorophenol with timing of menarche was inconsistent with previous single-chemical studies that found evidence of earlier menarche with higher concentrations of either chemical (Binder et al. 2018; Harley et al. 2019). Inconsistencies may be attributed to differences in timing of exposure measurement (i.e., prenatal vs. childhood exposures) as well as to our assessment of non-linear effects and co-exposure to other EDCs, as our observations were found in combination with MEHP exposure. Lastly, within the stratum of girls with higher levels of MEHP, girls with higher concentrations of 1-hydroxypyrene had earlier menarche. 1-hydroxypyrene is one of the prominent PAH metabolites.
PAHs are predominantly emitted from incomplete combustion of hydrocarbons, including motor vehicle exhaust and tobacco smoke. Although there have been studies observing earlier menarche among girls with higher exposure to air pollutants, previous studies have not specifically assessed biomarker measures of PAH exposure and timing of menarche (Jung et al. 2018). These hypotheses would not have been observed using traditional variable importance metrics from random forest, since those metrics do not account for combinations of features. Thus, some but not all of our observations are consistent with previous research; the differences may arise from the use of combinations in this study.
This study has several notable strengths. To our knowledge, it is the first study to investigate the relationship between exposure to multiple environmental chemicals across seven classes of compounds and timing of menarche in a nationally representative sample of the U.S. general population. Blood and urinary biomarker measures allowed us to objectively quantify exposure to environmental chemicals. The random forest algorithm provided an assessment of how important specific EDCs were, relative to other EDCs, in partitioning the data into homogeneous subgroups based on the timing of menarche. The variable importance metrics provided by traditional random forest analyses do not provide insight into the threshold values for each “important” feature or the direction of the relationship. Hence, a key strength of this study is our approach to discovering exposure profiles by counting unique branches in the random forest. Discovery of exposure profiles involved: 1) decomposing each tree; 2) counting constituent branches across trees; and 3) employing permutation tests to identify relevant profiles. Furthermore, we observed relationships between specific EDCs and timing of menarche that were consistent with previous single-chemical studies and discovered new relationships that could be replicated in future studies. Lastly, application of random forest methods, which do not assume additivity or linearity in the exposure-outcome relationship, allowed us to capture non-monotonic and complex relationships. With larger sample sizes, this approach would also allow for the discovery of pairwise interactions between exposures.
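The first two discovery steps, decomposing each tree and counting constituent branches across trees, can be sketched on a toy forest. This is an illustrative reading of those steps, not the authors' implementation: the nested-dict tree format, the choice to key a profile by feature and direction while ignoring the cutpoint, counting a profile at most once per tree, and the depth limit are all assumptions. The permutation-test step is not shown.

```python
from collections import Counter

def branches(node, path=()):
    """Yield (path, leaf_label) for every root-to-leaf path in a toy tree.

    Internal nodes are dicts with keys feature, cut, left, right;
    leaves are plain class labels. Each path element is (feature, op, cut).
    """
    if not isinstance(node, dict):
        yield path, node
        return
    f, c = node["feature"], node["cut"]
    yield from branches(node["left"], path + ((f, "<=", c),))
    yield from branches(node["right"], path + ((f, ">", c),))

def count_profiles(forest, target="early", max_depth=2):
    """Count, across trees, the branches that end in the target class."""
    counts = Counter()
    for tree in forest:
        seen = set()
        for path, label in branches(tree):
            if label == target and len(path) <= max_depth:
                # Profile = ordered (feature, direction) pairs, cutpoints dropped
                seen.add(tuple((f, op) for f, op, _ in path))
        counts.update(seen)  # each profile counted at most once per tree
    return counts

# Toy forest with hypothetical splits (cutpoints are illustrative only)
forest = [
    {"feature": "MEHP", "cut": 2.36,
     "left": {"feature": "BP3", "cut": 24.5, "left": "early", "right": "later"},
     "right": "later"},
    {"feature": "MEHP", "cut": 2.58,
     "left": {"feature": "BP3", "cut": 20.0, "left": "early", "right": "later"},
     "right": {"feature": "BPA", "cut": 1.31, "left": "early", "right": "later"}},
    {"feature": "MEHP", "cut": 2.40, "left": "early", "right": "later"},
]
for profile, n in count_profiles(forest).most_common():
    print(n, profile)
```

Unlike a variable importance score, each counted branch retains the direction of every split, which is what makes the resulting profiles interpretable as exposure combinations.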
There are some limitations to this study. The cross-sectional NHANES data preclude assessing temporality in the relationship between exposure biomarkers and timing of menarche, and we acknowledge the limitations of inferring causal relationships from the observed associations. Many of the environmental chemicals that we examined, including phthalates and phenols, are rapidly metabolized within the body; thus, a single urinary measurement may not be representative of average exposure. However, single measurements of some EDC biomarkers have been shown to be relatively consistent over time (Engel et al. 2014; Johns et al. 2015; Teitelbaum et al. 2008). Small sample size is another limitation. We chose to maximize the number of biomarkers to capture a wide range of EDC exposure, which restricted the study sample to girls with complete biomarker measures for all 41 exposures of interest. The small sample size limited our ability to categorize age at menarche as earlier, typical, and later, as well as to estimate the effect size for some exposure profiles. Additionally, relevant exposure profiles consisted of no more than two EDCs due to the small sample size. We also lacked the statistical power to investigate additive and multiplicative interaction between exposure profiles. The small sample size, coupled with an unbalanced target class prevalence, limited our power to detect small effect sizes with appropriate precision. Lastly, because of the sample size, our results were sensitive to tuning of the random forest node size hyperparameter.
5. Conclusion
In conclusion, we employed a novel machine learning approach using NHANES data to provide insight into potentially important individual EDCs and combinations of EDCs that may be relevant to timing of menarche within a complex co-exposure setting. We showed how data-driven methods can serve as a valuable tool for hypothesis generation, which can be used to uncover relationships between co-exposures that have not been previously explored. Our two-stage approach should be applied in similar ways to other studies with prospective data and larger sample sizes for hypothesis generation or refinement. Future high-powered longitudinal studies are needed to evaluate the relationship between our identified exposure profiles and timing of menarche.
Highlights.
Exposure to endocrine disrupting chemicals (EDCs) may alter timing of menarche.
Girls are exposed to multiple EDCs but research has examined EDCs individually.
We used machine learning methods to identify exposure profiles associated with menarche timing.
A phthalate, MEHP, was the most commonly identified EDC associated with menarche.
Acknowledgments
Funding: This work was supported by the Mount Sinai Institute for Exposomics-Honest Company Innovation Grant and grants from National Institutes of Health / National Institute of Environmental Health Sciences (ES027022 and ES023515).
Footnotes
CRediT author statement
Sabine Oskar: Conceptualization, Methodology, Formal analysis, Writing - Original Draft, Writing - Review & Editing, Visualization
Mary S. Wolff: Conceptualization, Methodology, Writing - Review & Editing
Susan L. Teitelbaum: Conceptualization, Methodology, Writing - Review & Editing
Jeanette A. Stingone: Conceptualization, Methodology, Software, Writing - Review & Editing, Supervision, Funding acquisition
Declaration of interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
References
- Berger K, Eskenazi B, Kogut K, Parra K, Lustig RH, Greenspan LC, et al. 2018. Association of prenatal urinary concentrations of phthalates and bisphenol A and pubertal timing in boys and girls. Environ Health Perspect 126:97004.
- Binder AM, Corvalan C, Calafat AM, Ye X, Mericq V, Pereira A, et al. 2018. Childhood and adolescent phenol and phthalate exposure and the age of menarche in Latina girls. Environ Health 17:32.
- Braun JM, Gennings C, Hauser R, Webster TF. 2016. What can epidemiological studies tell us about the impact of chemical mixtures on human health? Environ Health Perspect 124:A6–9.
- Breiman L. 2001. Random forests. Mach Learn 45:5–32.
- Breiman L, Cutler A. 2018. randomForest: Breiman and Cutler’s random forests for classification and regression. R package version 4.6-14. https://CRAN.R-project.org/package=randomForest
- Buttke DE, Sircar K, Martin C. 2012. Exposures to endocrine-disrupting chemicals and age of menarche in adolescent girls in NHANES (2003–2008). Environ Health Perspect 120:1613–1618.
- Cathey A, Watkins DJ, Sánchez BN, Tamayo-Ortiz M, Solano-Gonzalez M, Torres-Olascoaga L, et al. 2020. Onset and tempo of sexual maturation is differentially associated with gestational phthalate exposure between boys and girls in a Mexico City birth cohort. Environ Int 136:105469.
- CGHFBC. 2002. Collaborative group on hormonal factors in breast cancer. Breast cancer and breastfeeding: Collaborative reanalysis of individual data from 47 epidemiological studies in 30 countries, including 50302 women with breast cancer and 96973 women without the disease. Lancet 360:187–195.
- Charalampopoulos D, McLoughlin A, Elks CE, Ong KK. 2014. Age at menarche and risks of all-cause and cardiovascular death: A systematic review and meta-analysis. Am J Epidemiol 180:29–40.
- Day FR, Elks CE, Murray A, Ong KK, Perry JR. 2015. Puberty timing associated with diabetes, cardiovascular disease and also diverse health outcomes in men and women: The UK Biobank study. Sci Rep 5:11208.
- Diamanti-Kandarakis E, Bourguignon JP, Giudice LC, Hauser R, Prins GS, Soto AM, et al. 2009. Endocrine-disrupting chemicals: An Endocrine Society scientific statement. Endocr Rev 30:293–342.
- Engel LS, Buckley JP, Yang G, Liao LM, Satagopan J, Calafat AM, et al. 2014. Predictors and variability of repeat measurements of urinary phenols and parabens in a cohort of Shanghai women and men. Environ Health Perspect 122:733–740.
- Euling SY, Herman-Giddens ME, Lee PA, Selevan SG, Juul A, Sørensen TI, et al. 2008. Examination of US puberty-timing data from 1940 to 1994 for secular trends: Panel findings. Pediatrics 121 Suppl 3:S172–191.
- Gibson EA, Goldsmith J, Kioumourtzoglou MA. 2019. Complex mixtures, complex analyses: An emphasis on interpretable results. Curr Environ Health Rep 6:53–61.
- Gong TT, Wang YL, Ma XX. 2015. Age at menarche and endometrial cancer risk: A dose-response meta-analysis of prospective studies. Sci Rep 5:14051.
- Gore AC, Chappell VA, Fenton SE, Flaws JA, Nadal A, Prins GS, et al. 2015. EDC-2: The Endocrine Society’s second scientific statement on endocrine-disrupting chemicals. Endocr Rev 36:E1–E150.
- Harley KG, Berger KP, Kogut K, Parra K, Lustig RH, Greenspan LC, et al. 2019. Association of phthalates, parabens and phenols found in personal care products with pubertal timing in girls and boys. Hum Reprod 34:109–117.
- Hendryx M, Luo J. 2018. Children’s environmental chemical exposures in the USA, NHANES 2003–2012. Environ Sci Pollut Res Int 25:5336–5343.
- Janghorbani M, Mansourian M, Hosseini E. 2014. Systematic review and meta-analysis of age at menarche and risk of type 2 diabetes. Acta Diabetol 51:519–528.
- Johns LE, Cooper GS, Galizia A, Meeker JD. 2015. Exposure assessment issues in epidemiology studies of phthalates. Environ Int 85:27–39.
- Johnson CL, Paulose-Ram R, Ogden CL, Carroll MD, Kruszon-Moran D, Dohrmann SM, et al. 2013. National Health and Nutrition Examination Survey: Analytic guidelines, 1999–2010. Vital Health Stat 2:1–24.
- Jung EM, Kim HS, Park H, Ye S, Lee D, Ha EH. 2018. Does exposure to pm. Environ Int 117:16–21.
- Kasper-Sonnenberg M, Wittsiepe J, Wald K, Koch HM, Wilhelm M. 2017. Pre-pubertal exposure with phthalates and bisphenol A and pubertal development. PLoS One 12:e0187922.
- Kortenkamp A. 2014. Low dose mixture effects of endocrine disrupters and their implications for regulatory thresholds in chemical risk assessment. Curr Opin Pharmacol 19:105–111.
- Kuczmarski RJ, Ogden CL, Guo SS, Grummer-Strawn LM, Flegal KM, Mei Z, et al. 2002. 2000 CDC growth charts for the United States: Methods and development. Vital Health Stat 11:1–190.
- Lampa E, Lind L, Lind PM, Bornefalk-Hermansson A. 2014. The identification of complex interactions in epidemiology and toxicology: A simulation study of boosted regression trees. Environ Health 13:57.
- Lee JE, Jung HW, Lee YJ, Lee YA. 2019. Early-life exposure to endocrine-disrupting chemicals and pubertal development in girls. Ann Pediatr Endocrinol Metab 24:78–91.
- Lee JJ, Cook-Wiens G, Johnson BD, Braunstein GD, Berga SL, Stanczyk FZ, et al. 2019. Age at menarche and risk of cardiovascular disease outcomes: Findings from the National Heart Lung and Blood Institute-sponsored Women’s Ischemia Syndrome Evaluation. J Am Heart Assoc 8:e012406.
- Lieberoth S, Gade EJ, Brok J, Backer V, Thomsen SF. 2014. Age at menarche and risk of asthma: Systematic review and meta-analysis. J Asthma 51:559–565.
- Luijken J, van der Schouw YT, Mensink D, Onland-Moret NC. 2017. Association between age at menarche and cardiovascular disease: A systematic review on risk and potential mechanisms. Maturitas 104:96–116.
- Lumley T. 2019. survey: Analysis of complex survey samples. R package version 3.35-1. https://cran.r-project.org/web/packages/survey/index.html
- McDowell MA, Brody DJ, Hughes JP. 2007. Has age at menarche changed? Results from the National Health and Nutrition Examination Survey (NHANES) 1999–2004. J Adolesc Health 40:227–231.
- Meeker JD. 2012. Exposure to environmental endocrine disruptors and child development. Arch Pediatr Adolesc Med 166:E1–7.
- Oskar S, Stingone JA. 2020. Machine learning within studies of early-life environmental exposures and child health: Review of the current literature and discussion of next steps. Curr Environ Health Rep 7:170–184.
- Patel CJ. 2017. Analytic complexity and challenges in identifying mixtures of exposures associated with phenotypes in the exposome era. Curr Epidemiol Rep 4:22–30.
- Petry CJ, Ong KK, Dunger DB. 2018. Age at menarche and the future risk of gestational diabetes: A systematic review and dose response meta-analysis. Acta Diabetol 55:1209–1219.
- Pike MC, Pearce CL, Wu AH. 2004. Prevention of cancers of the breast, endometrium and ovary. Oncogene 23:6379–6391.
- Prentice P, Viner RM. 2013. Pubertal timing and adult obesity and cardiometabolic risk in women and men: A systematic review and meta-analysis. Int J Obes (Lond) 37:1036–1043.
- Reynolds P, Canchola AJ, Duffy CN, Hurley S, Neuhausen SL, Horn-Ross PL, et al. 2020. Urinary cadmium and timing of menarche and pubertal development in girls. Environ Res 183:109224.
- Sørensen K, Mouritsen A, Aksglaede L, Hagen CP, Mogensen SS, Juul A. 2012. Recent secular trends in pubertal timing: Implications for evaluation and diagnosis of precocious puberty. Horm Res Paediatr 77:137–145.
- Stingone JA, Pandey OP, Claudio L, Pandey G. 2017. Using machine learning to identify air pollution exposure profiles associated with early cognitive skills among U.S. children. Environ Pollut 230:730–740.
- Sun X, Yang L, Pan J, Yang H, Wu Y, Chen Z, et al. 2018. Age at menarche and the risk of gestational diabetes mellitus: A systematic review and meta-analysis. Endocrine 61:204–209.
- Sun Z, Tao Y, Li S, Ferguson KK, Meeker JD, Park SK, et al. 2013. Statistical strategies for constructing health risk models with multiple pollutants and their interactions: Possible choices and comparisons. Environ Health 12:85.
- Teitelbaum SL, Britton JA, Calafat AM, Ye X, Silva MJ, Reidy JA, et al. 2008. Temporal variability in urinary concentrations of phthalate metabolites, phytoestrogens and phenols among minority children in the United States. Environ Res 106:257–269.
- Watkins DJ, Téllez-Rojo MM, Ferguson KK, Lee JM, Solano-Gonzalez M, Blank-Goldenberg C, et al. 2014. In utero and peripubertal exposure to phthalates and BPA in relation to female sexual maturation. Environ Res 134:233–241.
- Watkins DJ, Sánchez BN, Téllez-Rojo MM, Lee JM, Mercado-García A, Blank-Goldenberg C, et al. 2017. Phthalate and bisphenol A exposure during in utero windows of susceptibility in relation to reproductive hormones and pubertal development in girls. Environ Res 159:143–151.
- Wyshak G, Frisch RE. 1982. Evidence for a secular trend in age of menarche. N Engl J Med 306:1033–1035.
- Zhang Y, Cao Y, Shi H, Jiang X, Zhao Y, Fang X, et al. 2015. Could exposure to phthalates speed up or delay pubertal onset and development? A 1.5-year follow-up of a school-based population. Environ Int 83:41–49.