Author manuscript; available in PMC: 2020 Apr 1.
Published in final edited form as: Environ Int. 2018 Dec 22;125:505–514. doi: 10.1016/j.envint.2018.11.037

Exposome-wide association study of semen quality: systematic discovery of endocrine disrupting chemical biomarkers in fertility require large sample sizes

Ming Kei Chung 1, Germaine M Buck Louis 2,3, Kurunthachalam Kannan 4, Chirag J Patel 1,*
PMCID: PMC6400484  NIHMSID: NIHMS1517319  PMID: 30583854

Abstract

Objectives:

Exposome-wide association studies (EWAS) are a systematic and unbiased way to investigate multiple environmental factors associated with phenotype. We applied EWAS to study semen quality and queried the sample size requirements to detect modest associations in a reproductive cohort.

Study Design and Setting:

We 1) conducted a multivariate EWAS of 128 endocrine disrupting chemicals (EDCs) from 15 chemical classes measured in urine/serum relative to 7 semen quality endpoints in a prospective cohort study comprising 473 men and 2) estimated the sample size requirements for EWAS etiologic investigations.

Results:

None of the EDCs were associated with semen quality endpoints after adjusting for multiple tests. However, several EDCs (e.g., polychlorinated biphenyl congeners 99, 105, 114, and 167) were associated with raw p < 0.05. In a post hoc statistical power analysis with the observed effect sizes, we determined that EWAS research in male fertility will require a mean sample size of 2696 men (1795 – 3625) to attain a power of 0.8. The average size of four published studies is 201 men.

Conclusion:

Existing cohort studies with hundreds of participants are underpowered (< 0.8) for EWAS-related investigations. Merging cohorts to ensure a sufficient sample size can facilitate the use of EWAS methods for assessing EDC mixtures that impact semen quality.

Keywords: chemical mixtures, endocrine disruptors, exposome, fecundity, semen quality, statistical power

Graphical Abstract

For an exposome-wide association study (EWAS) taking samples from the general population with about 100 commonly detected low-dose chemical exposures, it would require a sample size in the range of thousands instead of hundreds to detect any potential associations with semen quality parameters.

1. Introduction

Unintended exposure to endocrine disrupting chemicals (EDCs), including pesticides and phthalates, may adversely influence reproductive health, for example through diminished semen quality: decreased sperm concentration and motility, and increased abnormal morphology [1–4]. Low dose and consistent exposures to EDCs are hypothesized to influence human reproduction and neurodevelopment [5]. Nevertheless, it is unclear how these chemicals as a group may influence semen phenotypes.

Traditionally, investigations of EDCs and semen phenotypes have assessed exposure one EDC at a time or one chemical class at a time, such as polychlorinated biphenyls (PCBs), rather than using a mixture-based analytic strategy that more closely resembles human exposure. However, as measurement capacity increases and more phenotypes and exposures are measured in epidemiological and observational cohorts, findings are prone to publication bias and associations could be falsely identified [6,7]. Exposome-wide association study, or equivalently, environment-wide association study (EWAS), techniques are an agnostic, data-driven approach that accounts for multiple testing of chemical mixtures associated with a phenotype. The approach systematically tests the associations between all measured exposures and the outcome while controlling the type I error rate. EWAS techniques have recently been used to assess environmental factors in relation to chronic diseases (e.g., type 2 diabetes, high blood pressure, and peripheral arterial disease) and mortality [8–11]. However, EWAS techniques have not been empirically used for the assessment of human fecundity endpoints and, in particular, semen quality, despite the relevance of the exposome research paradigm for the sensitive windows underlying human reproduction and development [12]. Furthermore, the power to detect associations in existing cohorts designed for discovery of exposures in male fertility has not been well documented. These are critical and timely questions in light of concerns about temporal patterns reflecting declining semen quality believed attributable to environmental factors and the community drive to design epidemiological investigations to shed light on these concerns [13–15].

Although EWAS is a powerful approach in the emerging exposome era, detecting signals from noise requires large sample sizes. EDC biomarkers have dense correlational structure [16], modest and small association sizes [2,17], and low concentrations with a substantial proportion of concentrations below laboratory detection limits [18]. Along with multiple testing, these characteristics present unique analytical challenges that hinder discovery in reproductive health research. These issues often decrease the sensitivity of the statistical models but can be ameliorated by increasing the sample size of the study. In contrast to cohorts of tens of thousands of participants [8], applying EWAS to smaller cohorts such as the Longitudinal Investigation of Fertility and the Environment (LIFE) Study, which targets more specific questions about EDCs and reproductive health, remains unexplored, and it is unclear how many participants are needed to drive discovery.

In this study, we sought to explore the utility of EWAS techniques for assessing the relation between a mixture of EDCs measured in urine/serum and semen quality phenotypes, and to provide methodologic insights for future exposome-related research. Specifically, we 1) used EWAS techniques to investigate the association between 128 EDCs and seven semen quality endpoints using data from the LIFE Study; 2) conducted a post hoc estimation of statistical power for EWAS-type analysis; and 3) assessed the statistical power of four existing cohorts for answering questions about EDC mixtures and semen quality.

Methods

Study Population

The referent study population comprised 501 couples who participated in the LIFE Study, all of whom were discontinuing contraception for purposes of becoming pregnant. Participants were recruited between 2005 and 2009 from 16 counties in Michigan and Texas, USA. From this cohort, 473 (94%) male partners provided semen samples and represent the study cohort for analysis. Inclusion criteria for male partners were minimal: ≥ 18 years of age, in a committed relationship, and no physician-diagnosed infertility. Complete details about the construction of the cohort are provided elsewhere [19].

Data Collection

Male partners completed standardized baseline interviews then provided blood and urine samples for quantification of a mixture of persistent and non-persistent EDCs, respectively. Human subject approval was granted for all participating institutions, and informed consent was obtained from all men prior to any data collection.

Quantification of EDCs & Semen Analysis

Persistent and non-persistent EDCs (n = 128) representing 15 chemical classes were quantified at the Centers for Disease Control and Prevention and the Wadsworth Center (New York State Department of Health), respectively, using published methods. As listed in Table 1, five classes of persistent EDCs measured in serum were included in the analysis: 1) organochlorine pesticides (OCPs); 2) polybrominated biphenyl (PBB); 3) polybrominated diphenyl ethers (PBDEs); 4) PCBs; and 5) per- and polyfluoroalkyl substances (PFASs). Gas chromatography with high-resolution mass spectrometry was used to quantify persistent EDCs with the exception of PFASs, which were measured using high performance liquid chromatography-tandem mass spectrometry (HPLC-MS/MS) [20–22].

Table 1.

Listing of 128 endocrine disruptors included in the analysis.

Chemical class | No. of chemicals | Chemicals
Serum persistent organic compounds | |
Polychlorinated biphenyls (PCBs) | 36 | Congeners: 28, 44, 49, 52, 66, 74, 87, 99, 101, 105, 110, 114, 118, 128, 138, 146, 149, 151, 153, 156, 157, 167, 170, 172, 177, 178, 180, 183, 187, 189, 194, 195, 196, 201, 206, and 209
Organochlorine pesticides (OCPs) | 9 | Hexachlorobenzene (HCB), β-hexachlorocyclohexane (β-HCH), γ-hexachlorocyclohexane (γ-HCH), oxychlordane, trans-nonachlor, p,p’-DDT, o,p’-DDT, p,p’-DDE, and mirex
Polybrominated diphenyl ethers (PBDEs) | 10 | Congeners: 17, 28, 47, 66, 85, 99, 100, 153, 154, and 183
Polybrominated biphenyl (PBB) | 1 | Congener: 153
Per- and polyfluoroalkyl substances (PFASs) | 7 | 2-(N-ethyl-perfluorooctane sulfonamido) acetate (Et-PFOSA-AcOH), 2-(N-methyl-perfluorooctane sulfonamido) acetate (Me-PFOSA-AcOH), perfluorodecanoate (PFDeA), perfluorononanoate (PFNA), perfluorooctane sulfonamide (PFOSA), perfluorooctane sulfonate (PFOS), and perfluorooctanoate (PFOA)
Urinary non-persistent organic compounds | |
Anti-microbials (a) | 12 | Triclosan (TCS) and triclocarban (TCC); Parabens: methyl paraben (MP), ethyl paraben (EP), propyl paraben (PP), butyl paraben (BP), benzyl paraben (BzP), heptyl paraben (HP), 4-hydroxy benzoic acid (4-HB), 3,4-dihydroxy benzoic acid (3,4-DHB), methyl-protocatechuic acid (OH-Me-P), and ethyl-protocatechuic acid (OH-Et-P)
Phytoestrogens | 6 | Genistein, daidzein, O-desmethylangolensin (O-DMA), equol, enterodiol, and enterolactone
Phthalate metabolites | 14 | Mono (3-carboxypropyl) phthalate (mCPP), monomethyl phthalate (mMP), monoethyl phthalate (mEP), mono (2-isobutyl) phthalate (miBP), mono-n-butyl phthalate (mBP), mono (2-ethyl-5-carboxypentyl) phthalate (mECPP), mono-[(2-carboxymethyl) hexyl] phthalate (mCMHP), mono (2-ethyl-5-oxohexyl) phthalate (mEOHP), mono (2-ethyl-5-hydroxyhexyl) phthalate (mEHHP), monocyclohexyl phthalate (mCHP), monobenzyl phthalate (mBzP), mono (2-ethylhexyl) phthalate (mEHP), mono-isononyl phthalate (mNP), and monooctyl phthalate (mOP)
Benzophenones (BPs) | 5 | 4-hydroxybenzophenone (4-OH-BP), 2,4-dihydroxybenzophenone (2,4-OH-BP), 2,2’,4,4’-tetrahydroxybenzophenone (2,2’4,4’-OH-BP), 2-hydroxy-4-methoxybenzophenone (2-OH-4-MeO-BP), and 2,2’-dihydroxy-4-methoxybenzophenone (2,2’-OH-4-MeO-BP)
Bisphenol A (BPA) | 1 | Total bisphenol A
Paracetamol & derivatives | 2 | Paracetamol and 4-aminophenol
Short-lived chemicals | |
Blood metals | 3 | Cadmium (Cd), lead (Pb), and mercury (Hg)
Urinary metals | 17 | Manganese (Mn), chromium (Cr), beryllium (Be), cobalt (Co), molybdenum (Mo), cadmium (Cd), tin (Sn), caesium (Cs), barium (Ba), nickel (Ni), copper (Cu), zinc (Zn), tungsten (W), platinum (Pt), thallium (Tl), lead (Pb), and uranium (U)
Urinary metalloids | 4 | Selenium (Se), arsenic (As), antimony (Sb), and tellurium (Te)
Lifestyle chemicals | |
Serum cotinine (b) | 1 | Cotinine

(a) Anti-microbials contain mostly parabens with TCS and TCC.

(b) Serum cotinine is not an endocrine disrupting chemical but is included for completeness of the study.

For six classes of non-persistent EDCs, we used HPLC-MS/MS methods to quantify urinary 1) bisphenol A (BPA); 2) benzophenones; 3) anti-microbials (triclosan, triclocarban, and parabens); 4) phthalate metabolites; 5) paracetamol and derivatives; and 6) phytoestrogens using established protocols [23–28]. We quantified metal(loid)s using inductively coupled plasma mass spectrometry [29].

Other quantified exposures included serum cotinine using HPLC-MS/MS method [30] and serum lipids using commercially available enzymatic method [31,32]. We used a Roche/Hitachi Model 912 clinical analyzer (Dallas, TX) and the Creatinine Plus Assay to quantify creatinine, a marker of urinary dilution for non-persistent EDCs.

Semen Analysis

Consistent with the population-based sampling framework used in the LIFE Study, men collected a semen sample following two days of abstinence after enrollment into the cohort and a second sample approximately 1 month later. Both samples were shipped by overnight delivery to the National Institute for Occupational Safety and Health’s andrology laboratory for next day analysis. Within 24 hours, samples were analyzed for next day motility (%), volume (mL), sperm concentration (×10⁶/mL), total sperm count (×10⁶), morphology using both strict and WHO criteria (%), DNA fragmentation index (%), and high DNA stainability (%). A complete description of the laboratory methods for assessing semen quality is provided elsewhere [33].

Statistical Analyses

A. Overall Analyses

Figure 1 shows the overall analytical scheme of our study. We conducted three major analyses, namely: 1) EWAS with the LIFE Study data to uncover associations between EDCs and semen quality using a multivariate model to assess the relationships between each EDC and all semen endpoints simultaneously (i.e., multiple phenotypes versus an exposure biomarker); 2) post hoc power analysis with LIFE Study findings to investigate statistical power relative to the observed effect sizes from step 1; and 3) implementation of a field-wide post hoc power analysis [34] to investigate if existing cohorts studying semen quality are powered for EWAS-type investigations.

Figure 1.

Analytical scheme of the current study. (A) Using LIFE cohort data, we conducted a multivariate exposome-wide association study (EWAS) to systematically assess the associations between seven semen quality endpoints simultaneously and endocrine disrupting chemicals (EDCs). (B) Using LIFE cohort data, we conducted a post hoc power analysis to gauge the sample size requirement for each endpoint for future EWAS. (C) We employed the meta-review technique to identify studies investigating the effects of EDC exposure on semen quality and pooled related outcomes together. We conducted a post hoc power analysis to assess whether the sample sizes of existing fecundity and fertility cohorts can produce enough statistical power to drive EWAS.

B. Multivariate EWAS

Figure 2 shows the EWAS procedure. As an initial step, we assessed the distributions of all exposures and semen outcomes and characterized the cohort by key covariates. Given the high correlation between the WHO and strict criteria for determining normal morphology, we included only the former. Thus, our EWAS approach considered seven continuous semen endpoints, viz., next day motility, seminal volume, sperm concentration, total sperm count, morphology (WHO criteria), DNA fragmentation, and high DNA stainability.

Figure 2.

Illustration of the analytical scheme for the exposome-wide association study (EWAS) analysis. The LIFE Study comprises 473 men for whom 128 endocrine disrupting chemicals (EDCs) have been quantified. Missing EDC concentrations were imputed using multiple imputation techniques. All semen quality endpoints, except percent morphologically normal sperm, were log-transformed prior to analysis. For each EDC, we modeled seven semen variables with a multivariate multiple regression, adjusted for a set of five a priori covariates and, depending on chemical class, for lipids or creatinine. We did not adjust for lipids or creatinine for per- and polyfluoroalkyl substances, blood metals, or cotinine. Then, we combined the F statistics from the imputed data sets and controlled the type I error using the family-wise error rate and the false discovery rate.

All instrument-derived chemical concentrations were used to minimize bias introduced from adjusting values below laboratory limits of detection when estimating human health outcomes with the model [35,36]. We imputed missing EDC data stemming from insufficient sample volume with a multiple imputation technique under the “missing-at-random” assumption. Specifically, we imputed data using all demographic and chemical variables and then created 10 imputed data sets for EWAS analyses.

To search for EDCs associated with semen quality in the context of a mixture, we executed a multivariate multiple regression model for each EDC with all seven semen endpoints as dependent variables. Specifically:

[morphology + log(motility) + log(volume) + log(concentration) + log(count) + log(fragmentation) + log(stainability)] = log(EDC + 1) + age + BMI + smoke + exercise + parity + lipid/creatinine

We also adjusted for a fixed set of five a priori potential confounders, i.e., age (years), body mass index (lean/normal < 25.0, overweight 25.0–29.9, obese ≥ 30.0), currently smoking (yes/no), regular vigorous exercise in past year (yes/no), and having previously fathered a pregnancy (yes/no). In addition, we included either total serum lipids (ng/g serum) calculated according to Phillips et al. [32] for lipophilic EDCs or creatinine (mg/dL urine) for urinary EDCs. Since EDC distributions are right-skewed and reported in different units, we log-transformed (x+1) and rescaled each to have zero mean and unit variance to facilitate comparison with each other. We also log-transformed six semen endpoints (excluding percent morphology) to conform with the multivariate normality assumption.
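The exposure transformation described above can be illustrated with a minimal Python sketch. It is not the authors' R code; the function name `log1p_standardize` and the example concentrations are hypothetical, and the sketch assumes all values are greater than −1 so that log(x + 1) is defined.

```python
import math
import statistics

def log1p_standardize(concentrations):
    """Apply log(x + 1), then rescale to zero mean and unit variance."""
    logged = [math.log(x + 1.0) for x in concentrations]
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)  # sample standard deviation (n - 1 denominator)
    return [(value - mean) / sd for value in logged]

# Hypothetical right-skewed biomarker concentrations (illustrative values only)
example = [0.0, 0.2, 0.5, 1.1, 3.4, 12.0]
scaled = log1p_standardize(example)
```

After this step every exposure is on the same unitless scale, so regression coefficients for different EDCs can be compared directly.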

For our regression model, the null hypothesis is that the coefficients of an EDC are simultaneously equal to zero across all semen phenotypes, while the alternative hypothesis is that one or more of the EDC coefficients differ from zero. To test this hypothesis, we calculated the multivariate F statistic (Pillai’s Trace statistic) using the multivariate analysis of variance technique. We combined the F statistics from the imputed data sets using the miceadds package in R [37]. Finally, we estimated both the family-wise error rate (FWER) with Bonferroni correction and the Benjamini-Hochberg false discovery rate (FDR) using p values obtained from the model to adjust for multiple comparisons [38]. Bonferroni correction is a conservative method and, hence, we provided FDR for comparison.
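To make the two multiplicity adjustments concrete, the following Python sketch applies a Bonferroni threshold and the standard Benjamini-Hochberg step-up adjustment to a list of p values. The p values are hypothetical, and the function names are illustrative rather than taken from the study's R code.

```python
def bonferroni(p_values, alpha=0.05):
    """Flag p values that pass the Bonferroni threshold alpha / m."""
    m = len(p_values)
    return [p < alpha / m for p in p_values]

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p value, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p values from five exposure tests (illustrative only)
p = [0.001, 0.008, 0.039, 0.041, 0.60]
passes_bonferroni = bonferroni(p)
q_values = benjamini_hochberg(p)

# With 128 exposures, as in this study, the Bonferroni threshold is 0.05 / 128
threshold_128 = 0.05 / 128
```

An exposure is declared a discovery under FDR control at level 0.1 when its adjusted value falls below 0.1, which is a less stringent criterion than the Bonferroni threshold.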

C. Post Hoc Power Analysis

We conducted post hoc statistical power calculations using the R package pwr [39] to inform future EWAS-type investigations. Power is defined as the probability of rejecting the null hypothesis given that the alternative hypothesis is true, i.e., the probability of detecting a true effect. We ran EWASs on each of the semen endpoints to study the power and sample size relationship. The association size is an estimate used to quantify the association between an EDC and a semen quality endpoint. We used the effect size ƒ² [40], which is calculated by comparing the variance explained (R²) in the full and reduced multiple regression models (formula 1).

$$ f^2 = \frac{R^2_{\mathrm{Full}} - R^2_{\mathrm{Reduced}}}{1 - R^2_{\mathrm{Full}}} \qquad \text{(formula 1)} $$

The predictors of the full model included an EDC and a set of covariates as described earlier (Figure 2), while the reduced model excluded the EDC. The power analysis set the Bonferroni corrected significance level to 0.05/128. Since the effect sizes are typically low and we do not know the biologically significant sizes for EDC exposures, we assumed a null effect size distribution and took the 95th percentile (P) ƒ² as a threshold of important effect size to estimate the required sample size and statistical power. We selected Bonferroni correction in preference to FDR for direct interpretation of the effect of multiple comparisons. All sample sizes were reported with Bonferroni correction unless otherwise specified. For comparison, we also estimated the required sample sizes to reach 80% power using FDR methods. Details of the FDR simulation procedures can be found in Appendix A.
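As a rough illustration of how formula 1 feeds into a sample size estimate, the Python sketch below computes ƒ² from two R² values and then approximates the required sample size. This is not the pwr calculation used in the study: it assumes a single tested predictor, replaces the noncentral F distribution with a normal approximation, and ignores the degrees of freedom used by covariates. The R² values and function names are hypothetical.

```python
import math
from statistics import NormalDist

def cohens_f2(r2_full, r2_reduced):
    """Effect size from formula 1: (R2_full - R2_reduced) / (1 - R2_full)."""
    return (r2_full - r2_reduced) / (1.0 - r2_full)

def approx_sample_size(f2, alpha, power=0.8):
    """Approximate n for a one-degree-of-freedom test (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    z_power = z.inv_cdf(power)
    noncentrality = (z_alpha + z_power) ** 2
    return math.ceil(noncentrality / f2)

# Hypothetical R2 values for the full and reduced models (illustrative only)
f2 = cohens_f2(r2_full=0.060, r2_reduced=0.050)

n_bonferroni = approx_sample_size(f2, alpha=0.05 / 128)  # corrected for 128 tests
n_unadjusted = approx_sample_size(f2, alpha=0.05)        # no correction
```

The comparison between `n_bonferroni` and `n_unadjusted` shows how strongly the multiplicity correction drives the sample size requirement when ƒ² is small.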

D. Post Hoc Field-Wide Power Analysis

Lastly, we sought to ascertain whether EWAS techniques could be readily applied to the typical cohort sizes utilized in epidemiologic and clinical research. To this end, we employed meta-review (i.e., overview of reviews) techniques [41,42] to systematically extract and summarize the association sizes reported for human research (Figure 3). The Medical Subject Headings (MeSH) is a thesaurus that contains a set of controlled descriptors in a hierarchical structure for indexing biomedical journals. Investigators at the National Library of Medicine annotate each article indexed in PubMed. We used MeSH to perform a search in PubMed with the following terms: “Semen Analysis”[Mesh] AND (“Endocrine Disruptors”[Mesh] OR “Environmental Pollutants”[Mesh]). We identified 423 papers and selected 40 publications in English meeting our inclusion criteria. We used the reviews to identify relevant observational studies that reported the Pearson correlation coefficient (r) as a metric of effect size between serum/plasma or urinary EDCs (e.g., pesticides and PCBs) and semen related outcomes (e.g., semen volume and sperm count). Although the odds ratio is the most commonly reported point estimate for the magnitude of an association (e.g., testing cases versus controls) and we estimated ƒ² in the previous analysis, we chose r in this field-wide analysis given 1) the ease of computation; 2) simpler assumptions (e.g., without specifying baseline prevalence of outcome and case-control sample size ratio); and 3) the direct comparability that a standardized effect size such as r provides. Finally, we included 47 pairs of rs from four independent research papers for the power analysis [43–46].

Figure 3.

Flow diagram illustrating meta-review techniques of the published literature for the extraction of Pearson correlation coefficients (n = 47 pairs) for endocrine disrupting chemicals (EDCs) and semen phenotypes.

We compared multiplicity in three scenarios: 1) without adjustment; 2) adjusting with 12 pairs of comparisons (empirical average); and 3) adjusting with 100 pairs of comparisons (arbitrarily set for an EWAS). We set the Bonferroni corrected significance level at 0.05, 0.05/12 and 0.05/100, respectively. We selected the 95th P of the absolute r distribution as a metric for important effect size and used a one-sided alternative hypothesis in the power analysis. Similar to the previous analysis, all sample sizes were reported with Bonferroni correction unless otherwise specified.
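The three scenarios can be sketched in Python using the Fisher z approximation for the power of a one-sided test of a Pearson correlation. This is an approximation chosen for illustration and is not expected to match the output of the pwr package exactly; the correlation and sample size below are hypothetical placeholders.

```python
import math
from statistics import NormalDist

def power_pearson_r(r, n, alpha):
    """Approximate one-sided power to detect correlation r with n subjects."""
    z = NormalDist()
    # Fisher z transform of r, scaled by its approximate standard error
    effect = math.atanh(abs(r)) * math.sqrt(n - 3)
    critical = z.inv_cdf(1.0 - alpha)
    return z.cdf(effect - critical)

# Hypothetical effect size and cohort size (illustrative only)
r_example, n_example = 0.2, 200

scenarios = {
    "no adjustment": 0.05,
    "12 comparisons": 0.05 / 12,
    "100 comparisons": 0.05 / 100,
}
power_by_scenario = {
    name: power_pearson_r(r_example, n_example, alpha)
    for name, alpha in scenarios.items()
}
```

Holding r and n fixed, power falls as the number of comparisons in the Bonferroni denominator grows, which is the pattern the three scenarios are designed to expose.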

Further analysis of the correlations between semen quality endpoints can be found in Appendix B. We conducted all statistical analyses using the computing environment R (v 3.3.1).

Results

Overall, the LIFE cohort comprised largely white men (81%) with a mean age of 31.8 (± 4.9) years and a body mass index of 29.9 (± 5.6) with most (55%) having previously fathered a pregnancy (Table 2). Most of the men resided in a household with an annual income of ≥ $90,000.

Table 2.

Description of study cohort (n = 473).

Characteristic | n | %
Age (years): | |
 < 25 | 16 | 3
 25–29 | 151 | 32
 30–34 | 176 | 37
 ≥ 35 | 130 | 28
Race/ethnicity: | |
 Black, Non-Hispanic | 20 | 4
 White, Non-Hispanic | 381 | 81
 Hispanic | 38 | 8
 Other | 34 | 7
Household income: | |
 < $50,000 | 71 | 15
 $50,000–$89,999 | 120 | 26
 ≥ $90,000 | 275 | 59
Fathered a pregnancy before enrollment: | |
 No | 215 | 45
 Yes | 258 | 55
Characteristic | Mean (± SD) |
Age (years) | 31.8 (4.9) |
Body mass index (kg/m²) | 29.9 (5.6) |
Mean abstinence time (days) | 4.1 (3.4) |
Characteristic | Geometric mean (95% CI) |
Serum cotinine (ng/mL) | 0.04 (0.04, 0.06) |
Serum total lipids (ng/g) | 693 (593.0, 811.8) |
Urinary creatinine (mg/dL) | 6.55 (6.45, 6.66) |

Figure 4 is a Manhattan plot that illustrates the EWAS results for the 128 EDCs. The p values for each EDC estimated from the multivariate F test are shown on the vertical axis. We found that 7/15 chemical classes had p values < 0.1, viz., PCBs, PBDEs, PFASs, phthalates, benzophenones, anti-microbials, and urinary metals. Only two PCB congeners, 104 and 115, and one PFAS (Me-PFOSA-AcOH) had p values < 0.01. Overall, the findings were not robust to adjustment for multiple comparisons. We did not observe any EDCs to be significantly associated with semen phenotypes, as none of the p values passed the Bonferroni correction (0.05/128; shown as the red line in the plot) or the FDR threshold of 0.1 (line not shown).

Figure 4.

Manhattan plot showing the results from multivariate exposome-wide association study. We tested the null hypothesis that endocrine disrupting chemicals (EDCs) were not associated with any of the seven semen quality endpoints. The Y axis represents the –log10 of the p values associated with multivariate F statistic. The X axis represents the 128 EDCs from persistent lipophilic to non-persistent compounds (left to right) that are colored by chemical class. Horizontal lines are drawn at p values of 0.005, 0.001, and Bonferroni correction level (0.05/128). None of the EDCs were statistically significant at a false discovery rate of 0.1. EDCs with p values < 0.05 are labeled. PCB: Polychlorinated biphenyl; OCPs: Organochlorine pesticides; PBBs: Polybrominated biphenyls; PBDE: Polybrominated diphenyl ether; PFASs: Per- and polyfluoroalkyl substances; Me-PFOSA-AcOH: 2-(N-methyl-perfluorooctane sulfonamido) acetate; PFOSA: Perfluorooctane sulfonamide.

Since effect sizes of EDCs are modest and mixture analysis requires adjusting for multiple comparisons, we investigated whether power could explain the null findings. In a post hoc analysis, we found that the power of our study to detect associations with the 128 EDCs was low (Figure 5). Taking sperm motility as an example (Figure 5A), with a cohort size of 473, the third, second, and first quartiles of statistical power were 0.012, 0.002, and 0.001, respectively. To detect the 95th P ƒ² with a statistical power of ≥ 0.8 at a Bonferroni corrected significance level (0.05/128), the number of men required would be: 2100 (next day motility), 2168 (seminal volume), 3625 (sperm concentration), 3486 (total sperm count), 2185 (morphology), 3510 (DNA fragmentation), and 1795 (DNA stainability). The cohort would need to be 3.8 times its current size (n = 473) to detect the 95th P associations between EDCs and DNA stainability.
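The summary figures quoted in this paper follow directly from the seven per-endpoint sample sizes listed above, as the short Python check below shows. The variable names are illustrative.

```python
# Required sample sizes per endpoint at power >= 0.8 (Bonferroni, 0.05/128)
required_n = {
    "next day motility": 2100,
    "seminal volume": 2168,
    "sperm concentration": 3625,
    "total sperm count": 3486,
    "morphology": 2185,
    "DNA fragmentation": 3510,
    "DNA stainability": 1795,
}
current_n = 473

mean_n = sum(required_n.values()) / len(required_n)       # about 2696
smallest, largest = min(required_n.values()), max(required_n.values())
fold_increase = required_n["DNA stainability"] / current_n  # about 3.8
```

The mean of roughly 2696 men and the range of 1795 to 3625 men are the values reported in the abstract, and even the least demanding endpoint requires a cohort nearly four times the size of the LIFE Study.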

Figure 5.

Graph showing the relationships between statistical power and sample size in the LIFE Study. The graph reflects post hoc power analysis using empirical data. Panels A to G represent analyses with different endpoints: A) next day motility; B) seminal volume; C) sperm concentration; D) total sperm count; E) morphology (WHO criteria); F) DNA fragmentation; and G) high DNA stainability. For example, in A), we regressed sperm motility on each endocrine disrupting chemical (EDC) separately. We calculated Cohen’s ƒ² as the effect size and used a Bonferroni corrected significance level of 0.05/128 to estimate power. In the graph, each of the 128 EDCs is represented by a curve. The color of the curve denotes the EDC class. The top five EDCs are annotated. PCB: Polychlorinated biphenyl; OCPs: Organochlorine pesticides; PBBs: Polybrominated biphenyls; PBDE: Polybrominated diphenyl ether; PFASs: Per- and polyfluoroalkyl substances.

With FDR, the sample sizes needed to reach a power of ≥ 0.8 to detect the 95th P effect sizes were generally lower than with Bonferroni correction: 1094 (next day motility), 1110 (seminal volume), 2024 (sperm concentration), 1744 (total sperm count), 1331 (morphology), 2116 (DNA fragmentation), and 925 (DNA stainability).

The sample size and statistical power relationship to detect correlation rs in our field-wide post hoc analysis is depicted in Figure 6. After pooling the data, the 25th, 50th, 75th, and 95th P of rs were: 0.044, 0.090, 0.140, and 0.229, respectively. We found that, on average, these studies had 201 men and tested a subset of individual EDCs with 12 hypotheses per study. Using these average settings and the 95th P r, previous investigations had a statistical power of 0.69 at a significance level of 0.05/12. For the scenarios that did not adjust for multiple testing and that adjusted for 100 pairs of comparisons, the statistical power was 0.91 and 0.42, respectively.

Figure 6.

Graph showing the relationships between statistical power and sample size, using data from four published semen quality studies. Vertical dash-dot blue lines indicate study sample sizes. We have shown the results in three different Bonferroni-corrected significance level (α) scenarios: no comparison (α = 0.05); 12 comparisons per study (α = 0.05/12), which is the average of selected studies; 100 comparisons per study (α = 0.05/100), which is a value we arbitrarily set for a comprehensive exposome-wide association study. We extracted a total of 47 Pearson correlation coefficients as input and used Bonferroni corrected α to estimate power. In the graph, curves corresponding to the correlations at 25th, 50th, 75th and 95th percentiles (P) are shown.

Discussion

Multivariate EWAS

While we could not identify robust associations when analyzing a mixture of 128 EDCs, several associations have been reported for the LIFE Study when assessing specific candidate chemical classes of EDCs (e.g., benzophenones, phthalates) and semen quality [2,27,29,47]. Other cohorts of men also have reported adverse associations between PCBs and PBDEs and sperm motility [1,46,48] and sperm morphology for PBDEs and PFASs [49,50].

Possible reasons accounting for our inability to identify significant EDCs in the EWAS analysis include choice of statistical models, differences in model specification, sources of biofluids for EDC measurement (urine, serum, semen), and a lack of attention to multiple testing in chemical class approaches that do not account for mixtures. Although EWAS is the most sensitive approach for the detection of associations when assessing mixtures [51], we hypothesize that limited statistical power is a key reason for our null findings in this study. For example, to detect association sizes similar to those estimated in this study (n = 473), we concluded that we would require at least 1795 men to detect the associations with sperm DNA stainability.

Consideration of Multiple Comparisons in Mixture Analysis

The large sample size requirement underlying EWAS techniques stems from correcting the type I error rate in the context of multiple comparisons (along with overcoming errors in measurement of the exposures and phenotypes). To highlight power requirements for the field, we analyzed selected studies in the field-wide analysis. Importantly, none reported their original findings with correction for multiplicity. We calculated that the power to detect associations was as high as 0.91 (a 32% increase from 0.69) without adjustment for multiple comparisons (significance level at 0.05). On average, each study tested 12 hypotheses and hence the Bonferroni corrected significance level is 0.05/12. This illustrates that failing to adjust for even 12 pairs of comparisons in a study with a modest sample size may inflate statistical power and is therefore more likely to lead to false discovery. Given that the associations between EDCs and semen quality endpoints are modest, cautious interpretation of findings on EDCs and semen quality is required [52].

Limitations of Post Hoc Statistical Power Analysis

Our field-wide power analysis in semen quality has several limitations. First, studies providing data were published from 2002 to 2011, and the extent to which they have external validity for other time periods remains unknown. Second, we could only extract 47 pairs of rs from four studies, which may not be representative enough. Third, we summarized semen endpoints and chose r as the effect size measurement. Therefore, results could not be compared directly with our LIFE post hoc power analysis, which estimated ƒ² on individual semen endpoints.

Approaches to Increase Statistical Power of EWAS

There are several approaches to increase the power for EWAS investigations as we move forward to implement these tools. First, EDC concentrations are typically low and/or below the laboratory detection limits, especially when study participants are sampled from the general population. To reduce the number of comparisons, one possibility is to exclude exposures with a low detection percentage (e.g., 5%). Alternatively, one can retain the information by aggregating rare exposures by chemical class [53]. Secondly, when calculating the FWER, tests are assumed to be independent. One may take account of the correlations between exposures and estimate a new significance threshold. For example, Bonferroni correction is calculated by dividing an a priori significance level (e.g., 0.05) by the number of comparisons made. Nyholt [54] provided a method to calculate an “effective number of variables”, which is smaller than the number of comparisons when tests are correlated and produces a higher significance threshold. Alternatively, FDR controlling procedures are generally more powerful than those for FWER, but at the cost of a higher type I error rate [55,56]. Thirdly, one may use joint analyses for studies with more than one correlated endpoint [57], i.e., modeling data using one multivariate multiple regression in place of several multiple regressions, as we attempted in this investigation. In this study, we selected seven endpoints. For EWAS driven by multiple regressions, we would have 896 pairs of comparisons (128 × 7), whereas the number of comparisons is reduced to 128 for multivariate multiple regression. Lastly, while it is not feasible for individual investigators to conduct large-scale studies without substantial resources, meta-analyzing existing cohorts may be a cost-effective way to increase effective sample size and hence statistical power; however, a challenge remains in harmonizing studies across regions and with varying methodologies.
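Two of the approaches above can be sketched numerically in Python. The first function uses the effective number of variables as we recall it from Nyholt [54], M_eff = 1 + (M − 1)(1 − Var(λ)/M), where λ are the eigenvalues of the correlation matrix of the M exposures; readers should confirm the formula against the original reference. The eigenvalues are taken as given inputs and the examples are hypothetical limiting cases, not LIFE Study data.

```python
import statistics

def nyholt_effective_tests(eigenvalues):
    """Effective number of independent tests from correlation-matrix eigenvalues."""
    m = len(eigenvalues)
    variance = statistics.variance(eigenvalues)  # sample variance (n - 1)
    return 1.0 + (m - 1.0) * (1.0 - variance / m)

# Limiting cases for four exposures (hypothetical eigenvalues)
independent = nyholt_effective_tests([1.0, 1.0, 1.0, 1.0])        # uncorrelated
fully_correlated = nyholt_effective_tests([4.0, 0.0, 0.0, 0.0])   # identical

# Joint modeling reduces the number of comparisons in this study
n_edcs, n_endpoints = 128, 7
separate_regressions = n_edcs * n_endpoints   # one test per EDC-endpoint pair
multivariate_regression = n_edcs              # one joint test per EDC
alpha_separate = 0.05 / separate_regressions
alpha_joint = 0.05 / multivariate_regression
```

In both approaches the Bonferroni denominator shrinks, so the per-test significance threshold becomes less stringent and power increases for a fixed sample size.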

Conclusions

We did not identify any EDC significantly associated with diminished semen quality in a multivariate EWAS (FDR = 0.1). In a post hoc power analysis, we conclude that the sample size requirements range from 1795 to 3625 men when using Bonferroni correction and from 925 to 2116 men when using the false discovery rate to mitigate type I error. Lastly, despite the importance of investigating endocrine disrupting chemicals for public health and male fertility, we found that existing cohort investigations are vastly underpowered to undertake discovery-based or EWAS-like approaches. Greater investment in larger sample sizes is required to identify environmental factors associated with semen phenotypes, given their modest association sizes.

Supplementary Material

1

Highlights.

  • For an exposome-wide association study assessing mixtures of 128 endocrine disrupting chemicals and semen quality, a sample size of 501 men is not enough

  • Such a study would require 1795 – 3625 men when using Bonferroni correction

  • Alternatively, the study would need 925 – 2116 men when using false discovery rate

  • Numerous analytical methods are available to increase statistical power

  • Increasing sample sizes through pooling of cohorts is a strategy to use EWAS and find robust associations

Acknowledgements:

This work was supported by the Intramural Research Program of the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), National Institutes of Health (Contracts #N01-HD-3–3355, N01-HD-3–3356, N01-HD-3–3358, HHSN27500001, HHSN27500002, HHSN27500003, HHSN27500006), and the National Institute of Environmental Health Sciences grants (ES023504 and ES025052). NICHD had a signed memo of understanding with the Centers for Disease Control and Prevention for the analysis of semen quality and persistent environmental chemicals.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Declaration of Financial Interests: None of the authors has any competing interests with this work.

References

  • [1]. Abdelouahab N, Ainmelk Y, Takser L. Polybrominated diphenyl ethers and sperm quality. Reprod Toxicol 2011;31:546–50.
  • [2]. Buck Louis GM, Chen Z, Schisterman EF, Kim S, Sweeney AM, Sundaram R, et al. Perfluorochemicals and human semen quality: the LIFE Study. Environ Health Perspect 2015;123:57–63.
  • [3]. Joensen UN, Bossi R, Leffers H, Jensen AA, Skakkebaek NE, Jørgensen N. Do perfluoroalkyl compounds impair human semen quality? Environ Health Perspect 2009;117:923–7.
  • [4]. Vitku J, Heracek J, Sosvorova L, Hampl R, Chlupacova T, Hill M, et al. Associations of bisphenol A and polychlorinated biphenyls with spermatogenesis and steroidogenesis in two biological fluids from men attending an infertility clinic. Environ Int 2016;89–90:166–73.
  • [5]. Diamanti-Kandarakis E, Bourguignon JP, Giudice LC, Hauser R, Prins GS, Soto AM, et al. Endocrine-disrupting chemicals: an endocrine society scientific statement. Endocr Rev 2009;30:293–342.
  • [6]. Ioannidis JPA. Why most published research findings are false. PLoS Med 2005;2.
  • [7]. Ioannidis JPA. Why most discovered true associations are inflated. Epidemiol Camb Mass 2008;19:640–8.
  • [8]. McGinnis DP, Brownstein JS, Patel CJ. Environment-wide association study of blood pressure in the national health and nutrition examination survey (1999–2012). Sci Rep 2016;6:30373.
  • [9]. Patel CJ, Bhattacharya J, Butte AJ. An Environment-Wide Association Study (EWAS) on type 2 diabetes mellitus. PloS One 2010;5:e10746.
  • [10]. Zhuang X, Ni A, Liao L, Guo Y, Dai W, Jiang Y, et al. Environment-wide association study to identify novel factors associated with peripheral arterial disease: Evidence from the National Health and Nutrition Examination Survey (1999–2004). Atherosclerosis 2018;269:172–7.
  • [11]. Patel CJ, Rehkopf DH, Leppert JT, Bortz WM, Cullen MR, Chertow GM, et al. Systematic evaluation of environmental and behavioural factors associated with all-cause mortality in the United States national health and nutrition examination survey. Int J Epidemiol 2013;42:1795–810.
  • [12]. Buck Louis GM, Yeung E, Sundaram R, Laughon SK, Zhang C. The exposome – exciting opportunities for discoveries in reproductive and perinatal epidemiology. Paediatr Perinat Epidemiol 2013;27:229–36.
  • [13]. Skakkebaek NE, Rajpert-De Meyts E, Buck Louis GM, Toppari J, Andersson AM, Eisenberg ML, et al. Male reproductive disorders and fertility trends: influences of environment and genetic susceptibility. Physiol Rev 2016;96:55–97.
  • [14]. Smarr MM, Sapra KJ, Gemmill A, Kahn LG, Wise LA, Lynch CD, et al. Is human fecundity changing? A discussion of research and data gaps precluding us from having an answer. Hum Reprod 2017;32:499–504.
  • [15]. Vested A, Giwercman A, Bonde JP, Toft G. Persistent organic pollutants and male reproductive health. Asian J Androl 2014;16:71–80.
  • [16]. Patel CJ, Manrai AK. Development of exposome correlation globes to map out environment-wide associations. Pac Symp Biocomput 2015:231–42.
  • [17]. Mumford SL, Kim S, Chen Z, Gore-Langton RE, Barr DB, Buck Louis GM. Persistent organic pollutants and semen quality: the LIFE Study. Chemosphere 2015;135:427–35.
  • [18]. Rappaport SM, Barupal DK, Wishart D, Vineis P, Scalbert A. The blood exposome and its role in discovering causes of disease. Environ Health Perspect 2014;122:769–74.
  • [19]. Buck Louis GM, Schisterman EF, Sweeney AM, Wilcosky TC, Gore-Langton RE, Lynch CD, et al. Designing prospective cohort studies for assessing reproductive and developmental toxicity during sensitive windows of human reproduction and development--the LIFE Study. Paediatr Perinat Epidemiol 2011;25:413–24.
  • [20]. Kuklenyik Z, Needham LL, Calafat AM. Measurement of 18 perfluorinated organic acids and amides in human serum using on-line solid-phase extraction. Anal Chem 2005;77:6085–91.
  • [21]. Sjödin A, Jones RS, Lapeza CR, Focant JF, McGahee EE, Patterson DG. Semiautomated high-throughput extraction and cleanup method for the measurement of polybrominated diphenyl ethers, polybrominated biphenyls, and polychlorinated biphenyls in human serum. Anal Chem 2004;76:1921–7.
  • [22]. Kato K, Wong LY, Jia LT, Kuklenyik Z, Calafat AM. Trends in exposure to polyfluoroalkyl chemicals in the U.S. Population: 1999–2008. Environ Sci Technol 2011;45:8037–45.
  • [23]. Kunisue T, Wu Q, Tanabe S, Aldous KM, Kannan K. Analysis of five benzophenone-type UV filters in human urine by liquid chromatography-tandem mass spectrometry. Anal Methods 2010;2:707–13.
  • [24]. Guo Y, Alomirah H, Cho HS, Minh TB, Mohd MA, Nakata H, et al. Occurrence of phthalate metabolites in human urine from several Asian countries. Environ Sci Technol 2011;45:3138–44.
  • [25]. Zhang Z, Alomirah H, Cho HS, Li YF, Liao C, Minh TB, et al. Urinary bisphenol A concentrations and their implications for human exposure in several Asian countries. Environ Sci Technol 2011;45:7044–50.
  • [26]. Asimakopoulos AG, Wang L, Thomaidis NS, Kannan K. A multi-class bioanalytical methodology for the determination of bisphenol A diglycidyl ethers, p-hydroxybenzoic acid esters, benzophenone-type ultraviolet filters, triclosan, and triclocarban in human urine by liquid chromatography-tandem mass spectrometry. J Chromatogr A 2014;1324:141–8.
  • [27]. Mumford SL, Kim S, Chen Z, Barr DB, Louis GMB. Urinary phytoestrogens are associated with subtle indicators of semen quality among male partners of couples desiring pregnancy. J Nutr 2015;145:2535–41.
  • [28]. Smarr MM, Grantz KL, Sundaram R, Maisog JM, Honda M, Kannan K, et al. Urinary paracetamol and time-to-pregnancy. Hum Reprod 2016;31:2119–27.
  • [29]. Bloom MS, Whitcomb BW, Chen Z, Ye A, Kannan K, Buck Louis GM. Associations between urinary phthalate concentrations and semen quality parameters in a general population. Hum Reprod 2015;30:2645–57.
  • [30]. Bernert JT, Turner WE, Pirkle JL, Sosnoff CS, Akins JR, Waldrep MK, et al. Development and validation of sensitive method for determination of serum cotinine in smokers and nonsmokers by liquid chromatography/atmospheric pressure ionization tandem mass spectrometry. Clin Chem 1997;43:2281–91.
  • [31]. Akins JR, Waldrep K, Bernert JT. The estimation of total serum lipids by a completely enzymatic “summation” method. Clin Chim Acta 1989;184:219–26.
  • [32]. Phillips DL, Pirkle JL, Burse VW, Bernert JT, Henderson LO, Needham LL. Chlorinated hydrocarbon levels in human serum: effects of fasting and feeding. Arch Environ Contam Toxicol 1989;18:495–500.
  • [33]. Buck Louis GM, Sundaram R, Schisterman EF, Sweeney A, Lynch CD, Kim S, et al. Semen quality and time-to-pregnancy, the LIFE Study. Fertil Steril 2014;101:453–62.
  • [34]. Serghiou S, Patel CJ, Tan YY, Koay P, Ioannidis JPA. Field-wide meta-analyses of observational associations can map selective availability of risk factors and the impact of model specifications. J Clin Epidemiol 2016;71:58–67.
  • [35]. Richardson DB, Ciampi A. Effects of exposure measurement error when an exposure variable is constrained by a lower limit. Am J Epidemiol 2003;157:355–63.
  • [36]. Schisterman EF, Vexler A, Whitcomb BW, Liu A. The limitations due to exposure detection limits for regression models. Am J Epidemiol 2006;163:374–83.
  • [37]. Robitzsch A, Grund S, Henke T. miceadds: Some Additional Multiple Imputation Functions, Especially for “mice.” 2017.
  • [38]. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc 1995;57:289–300.
  • [39]. Champely S, Ekstrom C, Dalgaard P, Gill J, Weibelzahl S, Anandkumar A, et al. pwr: Basic Functions for Power Analysis. 2017.
  • [40]. Cohen J. Statistical power analysis for the behavioral sciences. Hillsdale, N.J.: L. Erlbaum Associates; 1988.
  • [41]. Smith V, Devane D, Begley CM, Clarke M. Methodology in conducting a systematic review of systematic reviews of healthcare interventions. BMC Med Res Methodol 2011;11:15.
  • [42]. Francke AL, Smit MC, de Veer AJ, Mistiaen P. Factors influencing the implementation of clinical guidelines for health care professionals: A systematic meta-review. BMC Med Inform Decis Mak 2008;8:38.
  • [43]. De Jager C, Farias P, Barraza-Villarreal A, Avila MH, Ayotte P, Dewailly E, et al. Reduced seminal parameters associated with environmental DDT exposure and p,p’-DDE concentrations in men in Chiapas, Mexico: a cross-sectional study. J Androl 2006;27:16–27.
  • [44]. Haugen TB, Tefre T, Malm G, Jönsson BAG, Rylander L, Hagmar L, et al. Differences in serum levels of CB-153 and p,p’-DDE, and reproductive parameters between men living south and north in Norway. Reprod Toxicol 2011;32:261–7.
  • [45]. Richthoff J, Rylander L, Jönsson BAG, Akesson H, Hagmar L, Nilsson-Ehle P, et al. Serum levels of 2,2’,4,4’,5,5’-hexachlorobiphenyl (CB-153) in relation to markers of reproductive function in young males from the general Swedish population. Environ Health Perspect 2003;111:409–13.
  • [46]. Rignell-Hydbom A, Rylander L, Giwercman A, Jönsson BAG, Nilsson-Ehle P, Hagmar L. Exposure to CB-153 and p,p’-DDE and male reproductive function. Hum Reprod 2004;19:2066–75.
  • [47]. Buck Louis GM, Chen Z, Kim S, Sapra KJ, Bae J, Kannan K. Urinary concentrations of benzophenone-type ultraviolet light filters and semen quality. Fertil Steril 2015;104:989–96.
  • [48]. Meeker JD, Hauser R. Exposure to polychlorinated biphenyls (PCBs) and male reproduction. Syst Biol Reprod Med 2010;56:122–31.
  • [49]. Hauser R, Chen Z, Pothier L, Ryan L, Altshul L. The relationship between human semen parameters and environmental exposure to polychlorinated biphenyls and p,p’-DDE. Environ Health Perspect 2003;111:1505–11.
  • [50]. Toft G, Jönsson BAG, Lindh CH, Giwercman A, Spano M, Heederik D, et al. Exposure to perfluorinated compounds and human semen quality in arctic and European populations. Hum Reprod 2012;27:2532–40.
  • [51]. Agier L, Portengen L, Chadeau-Hyam M, Basagaña X, Giorgis-Allemand L, Siroux V, et al. A systematic comparison of linear regression-based statistical methods to assess exposome-health associations. Environ Health Perspect 2016;124:1848–56.
  • [52]. Patel CJ, Ioannidis JPA. Placing epidemiological results in the context of multiplicity and typical correlations of exposures. J Epidemiol Community Health 2014;68:1096–100.
  • [53]. Auer PL, Lettre G. Rare variant association studies: considerations, challenges and opportunities. Genome Med 2015;7.
  • [54]. Nyholt DR. A simple correction for multiple testing for single-nucleotide polymorphisms in linkage disequilibrium with each other. Am J Hum Genet 2004;74:765–9.
  • [55]. Benjamini Y, Yekutieli D. The control of the false discovery rate in multiple testing under dependency. Ann Stat 2001;29:1165–88.
  • [56]. Kim KI, van de Wiel MA. Effects of dependence in high-dimensional multiple testing problems. BMC Bioinformatics 2008;9:114.
  • [57]. Zhou X, Stephens M. Efficient multivariate linear mixed model algorithms for genome-wide association studies. Nat Methods 2014;11:407–9.
