Abstract
The length of gestation can affect offspring health and performance. Both maternal and fetal effects contribute to gestation length; however, paternal contributions to gestation length remain elusive. Using genome-wide association study (GWAS) in 27,214 Holstein bulls with millions of gestation records, here we identify nine paternal genomic loci associated with cattle gestation length. We demonstrate that these GWAS signals are enriched in pathways relevant to embryonic development, and in differentially methylated regions between sperm samples with long and short gestation length. We reveal that gestation length shares genetic and epigenetic architecture in sperm with calving ability, body depth, and conception rate. While several candidate genes are detected in our fine-mapping analysis, we provide evidence indicating ZNF613 as a promising candidate for cattle gestation length. Collectively, our findings support that the paternal genome and epigenome can impact gestation length potentially through regulation of the embryonic development.
Lingzhao Fang et al. studied the paternal genetic variants that affect gestational length in cattle. They found that paternal genes from pathways involved in embryonic development were associated with gestation length, and that these were often found in differentially methylated regions of the genome.
Introduction
Gestation length measures the fetal development period from conception to subsequent parturition in mammals, which is crucial for mammalian development. Events that occur during gestation can have important consequences for the health, productivity, and fertility of the offspring1. Abnormal gestation length can lead to either preterm or post-term birth, resulting in acute and long-term adverse health outcomes in humans2–4. In cattle, the length of gestation correlates highly with health, production, and reproductive performance5. For instance, prolonged gestation length has been reported to be associated with increased fetal weight, reduced pregnancy rate, and more difficult calving in dairy cattle5,6. Furthermore, gestation is a unique immunological state (i.e., the immune clock of pregnancy) that may help predict preterm birth (i.e., shortened gestation length)7, which can influence the risk of developing immune-related diseases8. Moreover, gestation length as a trait also has direct applications in the dairy industry, because more precise expected calving dates can be used to assist herd management practices on health and nutritional aspects5.
Gestation length is a complex phenotype affected by many genetic and environmental factors, including the progesterone rise, prenatal growth, maternal age, and maternal and fetal immune systems9–11. It has been reported that gestation length is a highly heritable trait with a heritability of 30–50% in humans and cattle10,12,13. By measuring properties of pregnancy, gestation length also has a complex genetic architecture with direct contributions from maternal and fetal genomes, and likely an indirect, paternal influence through regulation of the fetal development. Many studies in humans have explored the maternal genetic factors that were associated with gestation length4,14,15, but few of them have investigated the indirect impacts of paternal genetics on gestation length, possibly due to the limited availability of data.
The U.S. dairy industry has a long history of collecting phenotypic records of dairy cattle. Based on millions of mating and dairy production records, the U.S. dairy cattle database has archived a large amount of reliable phenotypes on gestation length for thousands of service bulls. This data resource provides valuable information to study the paternal impacts on gestation length in mammals. A previous study reported that the heritability of gestation length for service sires (the paternal contribution) was about 47 and 33% when mated to heifers and cows, respectively10. On the other hand, the epigenetic information in sperm has been shown to influence the embryonic development through regulating gene expression in embryos16–18, warranting interest to explore whether the epigenetic alterations in sperm can also contribute to gestation length.
In this study, we sought to investigate the paternal genetic contribution to gestation length in cattle by associating ~3 million imputed sequence variants with gestation length in a large sample of 27,214 Holstein bulls. In addition, we characterized genome-wide DNA methylation alterations in sperm that were associated with gestation length and three genetically correlated traits of economic importance, sire calving ease (SCE), body depth (BDE), and cow conception rate (CCR), by comparing sperm methylomes of 18 representative animals with extreme phenotypes19. Moreover, we integrated the genetic variants of gestation length with DNA methylation alterations in sperm. We further validated our findings by examining publicly available transcriptome data across 87 adult and embryonic tissues20. Finally, we provided genetic, epigenetic, and selection evidence implicating a candidate gene, ZNF613, for gestation length. Collectively, our results illustrated the importance of the paternal genome and epigenome to gestation length through regulation of embryonic development, and provided insights into the genetic and biological mechanisms underpinning gestation length. We believe that our findings in cattle can provide valuable knowledge for other mammals, including humans and rodents.
Results
The complex genetic architecture underlying gestation length
A total of 27,214 Holstein bulls with ~3 million imputed sequence variants and highly reliable phenotypes were included in the current analyses. In total, our single-marker GWAS revealed nine quantitative trait loci (QTL) located on Bos taurus chromosomes (BTA) 4, 5, 7, 10, 14, 18, 19, and 28 (Fig. 1a). BTA 18 had two QTLs, one of which was the most significant, consistent with a previous study that used a smaller sample size of 4743 bulls21. Our subsequent fine-mapping analyses of these nine QTL regions determined 25 candidate genes (posterior probability of causality > 0.05) for gestation length, including multiple genes participating in embryonic development (e.g., HSF122, MYH1023,24, NDEL125, and NRG226), immune responses (e.g., HCFC2 and CYSTM1), DNA processing (e.g., WWP2, CDKL1, ZNF613, and CPSF1), and cell differentiation (e.g., ZNF16 and ARID4B) (Fig. 1a; Supplementary Data 1). By examining the Bovine Gene Atlas data that measured the transcriptome across 87 tissues in cattle (http://www.innatedb.com/)20, we found that 5 out of 25 fine-mapped genes exhibited the highest expression level in placenta, including SLC39A4, WWP2, DNAH2, ZNF613, and MYH10. We further found five fine-mapped genes having the highest expression in the immune and growth-related glands (e.g., thymus, anterior pituitary and thyroid), including NFYB, NRG2, FBXL6, ZNF613, and ARID4B, among which NFYB and ZNF613 were also highly expressed in the embryonic tissues (i.e., fetal tongue surface) (Supplementary Data 1). Through examining the human GWAS Catalog (https://www.ebi.ac.uk/gwas/home), we found that the human orthologues of three fine-mapped genes, DNAH2, CYSTM1, and WWP2, have been reported to be significantly associated with vascular endothelial growth factor levels, intelligence (neurogenesis and myelination), and menarche in humans, respectively, suggesting their important roles in development and fertility (Supplementary Data 1).
Together, all these results suggest that our fine-mapped genes likely affect gestation in a tissue-specific manner and potentially through the regulation of the fetal genome and development.
Based on the omnigenic model of complex phenotypes27, we conducted GWAS signal enrichment analyses to determine the core molecular interaction networks that are engaged in regulating gestation length. We employed five commonly used gene annotation sources, including Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, Reactome metabolism pathways, Medical Subject Headings (MeSH), and miRNA-target networks (miRBase). As shown in Fig. 1b (see Supplementary Data 2 for details), various biological processes and pathways may affect gestation length, including multiple embryonic developmental processes (e.g., positive regulation of angiogenesis, Wnt signaling pathway, mTOR signaling pathway, and mRNA surveillance pathway), growth factor signaling pathways (e.g., fibroblast growth factor receptors (FGFRs), ghrelin synthesis and insulin-like growth factor 1 receptor (IGF1R) regulation), and DNA processing pathways (e.g., DNA biosynthetic process, DNA replication and mismatch repair). These results were consistent with previous findings that DNA damage was repaired very effectively during pregnancy in humans28. Analyses on the basis of MeSH revealed that gestation length was strongly associated with feeding behaviors, postpartum and peripartum periods, and autocrine and paracrine communications, in agreement with a previous study that reported the important roles of autocrine and paracrine signaling in embryonic skeletal development through regulating the IGF1R signaling pathways29.
Our miRNA-target network analyses showed that 84 out of the 755 tested miRNAs were significantly (P < 0.05) involved in gestation length (Supplementary Data 2), and their targets were enriched (FDR < 0.05) in the regulation of actin cytoskeleton, mRNA surveillance pathways, multiple growth-related hormone metabolism (e.g., adrenergic signaling and parathyroid hormone synthesis, secretion and action), and immune responses (e.g., leukocyte transendothelial migration and bacterial invasion of epithelial cells) (Supplementary Fig. 1). Of note were the enrichments of all the miRNA-targets, which were significantly (P < 0.01; t-test) higher than those of all the other four annotation databases (Fig. 1b; Supplementary Data 2). Previous studies showed that maternal plasma miRNAs can be used as biomarkers during pregnancy to predict preterm birth and pregnancy loss in humans30,31, and the altered expression of circulating miRNAs has also been proposed to be associated with pregnancy in cattle32. We further validated that 8 and 4 out of the 84 significant miRNAs were differentially expressed in milk33 and plasma34, respectively, during early pregnancy in cattle, including bta-miR-20a, bta-miR-106b, bta-miR-100, bta-miR-143, bta-miR-99b, bta-miR-125b, bta-miR-125a, bta-miR-93, bta-miR-99a-5p, bta-miR-99b, bta-miR-125b, and bta-miR-29a. Our findings were consistent with previous findings that miRNAs played an important role in pregnancy, especially for fetal growth and regulation of the immune system35,36.
Combining gestation length with other economically important dairy traits, we showed that gestation length was significantly positively correlated with calving ability (i.e., service sire and dam effects on still birth and calving ease, which were combined to measure calving ability) and body conformation traits (e.g., stature and body depth), whereas it was significantly, yet negatively correlated with conception/pregnancy rates and milk production performance (Fig. 2a). Through conducting GWAS signal enrichment analysis using QTLs of 208 complex traits in the Cattle QTLdb37 (https://www.animalgenome.org/cgi-bin/QTLdb/index), we confirmed that gestation length was not only highly associated with calving ability and withers height, but also highly associated with milk caprylic acid percentage, as well as milk phosphorus and copper content (Fig. 2b; Supplementary Data 2). This was consistent with other studies that indicated phosphorus and copper deficiency can affect prenatal development and pregnancy outcomes in human38,39. These results suggest that gestation length may share the underlying genetic basis with many dairy traits of economic importance, reflecting the biological and genetic complexity as well as economic implications of gestation length in cattle.
Sperm methylation changes associated with gestation length
By comparing sperm methylomes between bulls with high and low gestation length, we aimed to determine the differentially methylated regions (DMRs) in sperm (adjusted P value (q) < 0.01 and the absolute difference in methylation >5%), which may contribute to gestation length through regulating fetal development. In total, we detected 66,318 out of 593,035 tested regions as significant DMRs; a QQ plot for the DMR analysis is shown in Supplementary Fig. 2. We found that many of these gestation length associated DMRs were clustered in BTA 13, 18, and 25 as compared to other chromosomes (Supplementary Fig. 3). We further observed that gestation length associated DMRs intersected many genomic elements, but were more likely to be enriched in promoters, CpG islands, miRNAs, and QTLs of gestation length (i.e., ±1 Mb around the top associated SNPs) (Fig. 3a). Since there was an inflation of the test statistics in the QQ plot of the DMR analysis, our analysis was not focused on the significant DMRs. Instead, we defined multiple sets of DMRs with different q value cutoffs, and then investigated the enrichment of GWAS signals in each DMR set using GWAS signal enrichment analysis (see Methods). Our results revealed that GWAS signals were significantly (P < 0.05) enriched in gestation length associated DMRs. This enrichment was also significant for other traits, including days to first breeding after calving (DFB), somatic cell score (SCS, which is highly related to mastitis40), multiple body type traits (e.g., strength and stature), and milk production traits (e.g., milk and fat yields). Of note, we found that DMRs that lost methylation in animals with higher gestation length had stronger enrichment (i.e., smaller P-value) than those that gained methylation across gestation length and multiple reproduction and body type traits (Fig. 3b).
However, milk production traits exhibited an opposite trend, where DMRs that gained methylation in animals with higher gestation length had more significant enrichment than those that lost methylation (Fig. 3b). Our DMR-set enrichment using Reactome pathways further validated that DMRs that lost methylation in animals with higher gestation length were significantly enriched in pathways related to telomere and chromosome maintenance, translation, and DNA damage repairs, suggesting their potential role in pregnancy and embryonic development41,42 (Fig. 3c; Supplementary Data 3). DMRs that gained methylation were significantly engaged in lipid metabolism (e.g., arachidonic acid metabolism and fatty acid metabolism), indicating their important roles in the regulation of milk production (Fig. 3c; Supplementary Data 3). Furthermore, we found that 17 out of the 25 fine-mapped genes were overlapped (gene body and promoter) with DMRs (Supplementary Data 4), and eight of them were reported to be transcriptionally active during early embryonic development (from four cells to blastocyst) before implantation43, including ARID4B, HCFC2, NFYB, CPSF1, WWP2, ZNF613, NDEL1, and PCDHA13 (Supplementary Data 4). All of these results provide evidence that sperm methylation alterations influence gestation length through regulating fetal development, as well as suggest that epigenetic alterations in germline cells induced by environmental perturbations may impact complex phenotypes through a transgenerational inheritance model44, potentially contributing to the genetic architecture underlying complex traits and diseases.
Since sire calving ease (SCE), body depth (BDE), and cow conception rate (CCR) were genetically correlated with gestation length (Fig. 2a), we hypothesized that they also share the epigenetic architecture in sperm with gestation length. To test this, we further detected DMRs associated with SCE, BDE, and CCR based on the comparisons of high-SCE vs. low-SCE bulls, high BDE vs. low BDE, and high CCR vs. low CCR, respectively. We found that gestation length associated DMRs significantly overlapped with DMRs that were associated with SCE, BDE and CCR, respectively (Fig. 4a). There were 9524, 15,371, and 17,795 DMRs shared in gestation length & SCE, gestation length & BDE, and gestation length & CCR, respectively, which were used as three interesting DMR groups for further comparisons (Fig. 4a). Based on DMRs in these three groups, we defined three groups of genes that overlapped (gene body and promoter) with them. We found that genes in the three groups were commonly and significantly (FDR < 0.05) engaged in parathyroid hormone metabolism, inflammatory mediator regulation of TRP channels, phosphatidylinositol signaling system, and AMPK signaling pathway (Fig. 4b). Notably, for the shared DMRs in the gestation length & SCE group, genes were selectively and significantly engaged in the longevity regulating pathway, endocrine-regulated calcium reabsorption and amoebiasis (Fig. 4b), implying their important roles in pregnancy and calving45–48. For the shared DMRs in the gestation length & BDE group, genes were significantly engaged in focal adhesion, insulin secretion and peroxisome (Fig. 4b), suggesting their roles in the regulation of embryo growth49,50. For the shared DMRs in the gestation length & CCR group, genes were selectively and significantly engaged in the hippo signaling pathway, glycerolipid metabolism and lysosome (Fig. 4b), indicating their potential roles in the regulation of conception and fertility51. For the shared DMRs in the gestation length & SCE group, 71% of the DMRs had the same direction of change for gestation length and SCE (Fig. 4c). In contrast, lower proportions of DMRs (58 and 60%) had the same change direction for the gestation length & BDE and gestation length & CCR groups (Fig. 4c). This finding was in line with the genetic evidence that gestation length was more genetically correlated with SCE than with BDE and CCR. Furthermore, we observed that the genetic correlation between SCE and gestation length was higher in their shared DMRs of the same change direction, compared to those of opposite directions or the entire genome background (Fig. 4d). These results further supported that the epigenetic alterations in sperm were associated with the genetic architecture underlying gestation length and other dairy traits.
Multiple lines of evidence implicating ZNF613 in gestation length
The most significant QTL of gestation length was on BTA18, which was also significantly associated with SCE, BDE, and CCR (Supplementary Fig. 4). This was consistent with previous studies that reported this QTL to be associated with many fertility and body conformation traits in cattle, including gestation length21, stillbirth52, calving ability52–58, longevity59, calf birth weight19,60, young stock survival61, conformation54–56,62, and udder types55. However, the high extent of linkage disequilibrium and lack of functional annotation in this QTL region hampered the efforts to determine the causal gene and variants. Here, the fine-mapping analysis in our current and previous studies63 identified ZNF613 as the candidate gene for all four traits (gestation length, SCE, BDE and CCR), which was consistent with previous studies in the Nordic dairy cattle population that proposed ZNF613 to be associated with calving difficulty and longevity59,61. Of note, ZNF613 had a common DMR in its second intron across all four traits, and this common DMR was also the top one for gestation length in the corresponding genomic region (Fig. 5a). We further observed that animals with higher gestation length, SCE, BDE, and CCR had lower methylation levels in this particular DMR (Fig. 5b), which implied that the loss of methylation in the second intron of ZNF613 could be associated with a prolonged gestation, a more difficult calving, a bigger body size, but a higher conception rate. By examining the Bovine Gene Atlas data, we found that ZNF613 was expressed in 85 out of the 87 tissues, but was highly expressed in the thyroid, placenta above cotyledon, fetal tongue surface, anterior pituitary and hippocampus, suggesting its potential role in the regulation of embryonic development.
We further validated that ZNF613 was highly expressed in the embryonic brain and ovary, intercaruncular tissue (e.g., placenta), thyroid and testes among 174 tissues in sheep64 (Supplementary Data 5). Its orthologous gene in mouse, zfp157, was highly expressed in the early conceptus and embryo ectoderm, and was associated with phenotypes related to endocrine/exocrine glands, integument and reproductive system (http://www.informatics.jax.org/marker/MGI:1919404). In addition, ZNF613 was reported to be transcriptionally active at the blastocyst stage before implantation in cattle43. We further confirmed this by observing that nucleosomes were retained around ZNF613 in the cattle mature sperm65,66 (Fig. 5b), as sperm-retained nucleosomes package genes for the embryonic development67. Thus, it is tempting to propose that ZNF613 functions at the very early embryonic developmental stage after fertilization, and that its epigenetic marks in sperm play important roles in the regulation of gestation length by influencing the fetal development, as well as that ZNF613 is related to many fertility and conformation traits potentially due to its effect on gestation length.
Our previous fine-mapping study on 35 complex traits with the same dataset demonstrated that ZNF613 was also the fine-mapped gene for eight dairy traits (average posterior probability of causality = 0.418), including days to first breeding after calving (DFB), heifer conception rate, net merit, productive life, rump angle, sire still birth, stature, strength and teat length63. By comparing the sign of marker effects in ZNF613 across all these associated traits, we found that gestation length was under the same selection direction as SCE, daughter calving ease, sire still birth, DFB, stature, strength and teat length, while gestation length was under the opposite selection direction to CCR, heifer conception rate, rump angle, net merit and productive life (Fig. 5c). This finding was consistent with the epigenetic evidence that animals with lower methylation levels in the second intron of ZNF613 had a prolonged gestation, a more difficult calving and a bigger body size; however, these animals had a higher conception rate. We found that the minor allele frequency (MAF) of the top associated SNP (chr18:58141989; P = 7.97e-84) in ZNF613 was 0.07 in the current U.S. Holstein population. The MAF of this SNP decreased dramatically over the years from 1952 to 2012 (Fig. 5d) due to the strong selection against it, suggesting that current and future selection may quickly remove this undesired variant from the cattle population. These results demonstrated an example where selection shaped the genetic architecture of complex traits.
Discussion
The data generated by the dairy industry are a valuable resource that provides a unique opportunity to investigate the paternal contribution to pregnancy-related traits such as gestation length. In this study, we combined genomic and sperm epigenomic data to dissect the paternal effects on gestation. Our GWAS identified nine QTLs of gestation length passing the genome-wide significance level. Using a Bayesian fine-mapping analysis, we identified 25 candidate genes for gestation length. By investigating multiple sources of functional genomics data, we showed strong evidence supporting ZNF613 as the candidate gene for the most significant QTL on BTA 18. Despite the evidence presented in this study, direct functional studies are still needed to validate the candidate genes reported here.
In conclusion, our study is the first to explore the paternal genetic and epigenetic contributions to gestation length in a large cattle population. We demonstrate that paternal effects on gestation length may occur through regulating the embryo development, and the underlying epigenetic architecture of gestation length in sperm correlates with its genetic architecture. Gestation length shares both genetic and epigenetic architecture with other dairy traits of economic importance, such as SCE, BDE and CCR. In addition, we provide genetic, epigenetic, and selection evidence implicating ZNF613 as the candidate gene for the major QTL on BTA18, indicating that both epigenetic alterations and selection pressures could contribute to the genetic architecture underlying complex traits. Our study also demonstrates the usefulness of integrating multiple layers of biological information to understand the phenotypic variation of complex traits.
Methods
Single-marker GWAS
The phenotype and genotype data have been described in previous studies63,68. The phenotypes were de-regressed breeding values (predicted transmitting abilities or PTA) with high reliability for 27,214 Holstein bulls, which have been adjusted for known systematic effects including herd, year, season, and parity10. The high-density SNP genotypes (n = 312 K) of all the individuals were imputed to sequence variants (n = 3,148,506) with an imputation accuracy of 96.7%68 using reference data from Run 5 of the 1000 Bull Genomes Project69. The imputation was conducted with the FindHap software (https://aipl.arsusda.gov/software/findhap). A total of 2,619,418 imputed variants with minor allele frequency (MAF) > 0.01 and Hardy–Weinberg Equilibrium (HWE) test (P > 1e-06) were kept for further analyses.
Details of the single-marker GWAS analyses were described in63. Briefly, a linear mixed model, implemented in MMAP (https://mmap.github.io/), was employed to test for association of the imputed sequence variants:
$$y = \mu + Xb + g + e$$
where $y$ is the de-regressed PTA, $\mu$ is the overall mean, $X$ is the genotype of a candidate marker (coded as 0, 1, or 2), $b$ is the marker effect, $g \sim N(0, G\sigma_g^2)$ is the polygenic effect accounting for familial relationship and population structure, and $e \sim N(0, R\sigma_e^2)$ is the residual. $G$ is the genomic relationship matrix70, which was built using HD markers with MAF > 0.01. $R$ is a diagonal matrix with $R_{ii} = 1/r_i^2 - 1$, where $r_i^2$ is the reliability of phenotype for the ith individual. A tested variant with P < 1.91e-08 (Bonferroni correction) was considered significant at the genome-wide level.
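The genome-wide threshold quoted above can be checked with simple arithmetic. The following is a minimal sketch, assuming that the Bonferroni correction divides a nominal level of 0.05 by the 2,619,418 variants retained after filtering; the nominal level and the function name are assumptions made for illustration, not details stated in the text.

```python
# Assumption: the family-wise error rate is 0.05 and the number of tests
# equals the number of imputed variants kept after the MAF and HWE filters.
N_TESTED_VARIANTS = 2_619_418  # count reported in the Methods
ALPHA = 0.05                   # assumed nominal significance level


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_TESTED_VARIANTS)
print(f"{threshold:.2e}")  # prints 1.91e-08
```

Under these assumptions the result agrees with the reported cutoff of 1.91e-08.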
Besides gestation length, we have also analyzed 35 other dairy traits in a previous GWAS63. These 35 traits were clustered into three groups based on their genetic correlations, including 17 body type, 12 reproduction, and 6 production traits. The pair-wise genetic correlations between gestation length and other 35 traits were approximately computed using the Pearson’s correlation of effects (b) of all tested variants71.
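The approximation described above can be illustrated as follows. This is a minimal sketch with made-up marker effects, not the authors' code: it computes the Pearson correlation between two vectors of variant effects, one for gestation length and one for another trait, with both vectors assumed to list the same variants in the same order.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of marker effects."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical effects (b) of the same five variants on two traits.
effects_gestation = [0.10, -0.20, 0.05, 0.30, -0.15]
effects_other_trait = [0.08, -0.25, 0.02, 0.35, -0.10]

r_approx = pearson(effects_gestation, effects_other_trait)
print(round(r_approx, 2))
```

In practice the vectors would hold the effects of all tested variants, so the correlation is driven by genome-wide sharing of effect directions rather than by a few top hits.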
Gene-level fine-mapping analysis
The genomic regions used for fine-mapping analyses were defined by extending the QTL regions (i.e., the minimal regions covering all significant SNPs) 1 Mb upstream and downstream. We conducted fine-mapping analysis for each of these candidate regions by employing our Bayesian fine-mapping approach (BFMAP) (http://terpconnect.umd.edu/~jiang18/bfmap/)63, which follows a framework similar to that of Huang et al.72. BFMAP attempts to determine independent association signals in a genomic region by assessing a posterior probability of causality (PPC) for each variant within the region. The PPC of a gene was calculated as the sum of the PPCs of all variants located within the gene or within 2 kb upstream or downstream of it. BFMAP has been shown to have power at least equal to that of commonly used fine-mapping approaches63, such as PAINTOR73 and CAVIARBF74.
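The gene-level summary described above can be sketched as follows. This is an illustrative example with invented positions and PPC values, not BFMAP itself; the function name, the inclusive window boundaries, and the toy data are assumptions. The 0.05 cutoff is the candidate-gene threshold quoted in the Results.

```python
def gene_ppc(variants, gene_start, gene_end, flank=2000):
    """Sum variant PPCs over a gene body extended by `flank` bp on each side.

    variants: iterable of (position, ppc) pairs on the same chromosome.
    Window boundaries are treated as inclusive (an assumption).
    """
    lower = gene_start - flank
    upper = gene_end + flank
    return sum(ppc for pos, ppc in variants if lower <= pos <= upper)


# Hypothetical variants: (position in bp, variant-level PPC).
toy_variants = [(9_500, 0.02), (10_100, 0.03), (12_500, 0.01), (15_000, 0.40)]

# Hypothetical gene spanning 11,000-12,000 bp; the window is 9,000-14,000 bp.
toy_gene_ppc = gene_ppc(toy_variants, gene_start=11_000, gene_end=12_000)
is_candidate = toy_gene_ppc > 0.05  # candidate-gene cutoff used in the Results
print(round(toy_gene_ppc, 2), is_candidate)
```

The variant at 15,000 bp has the largest PPC but lies outside the window, so it does not contribute to this gene.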
Public gene annotation sources and genomic features
We used the org.Bt.eg.db v. 3.6.0, reactome.db v. 1.64.0, and MeSH v. 1.10.0 packages, as implemented in Bioconductor v. 3.7 (https://bioconductor.org/packages/release/data/annotation/html/org.Bt.eg.db.html), to link genes to GO terms, KEGG pathways, Reactome pathways, and MeSH terms. A total of 755 miRNA-target networks were built as previously described75. Briefly, 755 cattle miRNAs were obtained from miRBase (http://www.mirbase.org/). The miRmap software76 was used to predict the targets of each miRNA, and only the top 25% of predicted targets were considered. For a given trait in the Cattle QTLdb (release 35, April 29 2018, https://www.animalgenome.org/cgi-bin/QTLdb/BT/index), we considered genes that were located inside or in the closest proximity to the QTL regions as genes associated with this particular trait. We excluded traditional QTL mapping results in the QTLdb due to their large QTL regions. In the end, we kept the GO terms (n = 889), KEGG pathways (n = 225), Reactome pathways (n = 804), miRNA-target networks (n = 755), and trait-associated gene networks (n = 208) that comprised at least 10 genes as genomic features for further GWAS signal enrichment analyses. For instance, a GO term with ≥10 genes can be considered as a genomic feature.
GWAS signal enrichment analysis
Because the complex phenotypes being analyzed are highly polygenic or even omnigenic27,77, we employed the following sum-based marker-set test approach (http://psoerensen.github.io/qgg/) to determine whether the GWAS signals were enriched in a predefined genomic feature (i.e., a gene list defined using the above annotation sources, including GO terms, KEGG and Reactome pathways, miRNA-targets, trait-associated gene networks, and differentially methylated regions). This enrichment analysis is based on all variants in a GWAS rather than only the top variants passing the genome-wide significance level. Previous studies have demonstrated that this approach has at least equal power to many commonly used GWAS signal enrichment methods in humans78, Drosophila melanogaster79 and livestock species80–82, particularly in highly polygenic phenotypes. The sum-based statistic can be expressed as
$$T_{sum} = \sum_{i=1}^{m_f} b_i^2$$
where $m_f$ is the number of genomic variants in a pre-defined genomic feature, and $b_i$ is the effect of the ith variant. Here, SNPs located in different genes within a genomic feature (e.g., a biological pathway) were often not in linkage disequilibrium (LD). This approach is similar to the popular LD score regression in human studies83, and it controls LD patterns among variants and variant-set sizes by employing the following genotype cyclical permutation strategy78,79. In brief, we first ordered the test statistics (i.e., $b^2$) for all $m$ variants according to their chromosome positions (i.e., $b_1^2, b_2^2, \cdots, b_m^2$). We then randomly chose one test statistic (i.e., $b_k^2$) from this vector as the first, and shifted the remaining test statistics to new locations while retaining their original order (i.e., $b_k^2, b_{k+1}^2, \cdots, b_m^2, b_1^2, \cdots, b_{k-1}^2$) to maintain the correlation patterns among variants. We calculated a new summary statistic for the tested genomic feature using its original genomic positions. We repeated this permutation procedure 10,000 times for each genomic feature being tested, and obtained an empirical P-value by using a one-tailed test of the proportion of random summary statistics greater than that observed.
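The cyclical permutation procedure described above can be sketched in a few lines. This is a simplified illustration on synthetic data, not the qgg implementation: the function names, the random seed, and the toy effect sizes are assumptions, and the vector `b2` stands for squared variant effects already ordered by chromosome position.

```python
import random


def sum_statistic(b2, feature_idx):
    """Sum of squared effects over the variants that belong to a genomic feature."""
    return sum(b2[i] for i in feature_idx)


def cyclic_permutation_pvalue(b2, feature_idx, n_perm=10_000, seed=0):
    """Empirical one-tailed P-value from cyclical shifts of the ordered statistics.

    Each permutation picks a random start k and rotates the whole vector, so the
    order (and hence local correlation) of the statistics is preserved while the
    feature keeps its original positions.
    """
    rng = random.Random(seed)
    m = len(b2)
    observed = sum_statistic(b2, feature_idx)
    n_greater = 0
    for _ in range(n_perm):
        k = rng.randrange(m)  # index of the statistic moved to the first place
        permuted = sum(b2[(i + k) % m] for i in feature_idx)
        if permuted > observed:
            n_greater += 1
    return observed, n_greater / n_perm


# Synthetic squared effects for 500 ordered variants (illustrative only).
toy_b2 = [((i * 7919) % 13) / 100 for i in range(500)]
for i in range(100, 120):  # plant a block of strong signals
    toy_b2[i] = 1.0

enriched_feature = list(range(100, 120))  # overlaps the planted block
null_feature = list(range(300, 320))      # background region

obs_enriched, p_enriched = cyclic_permutation_pvalue(toy_b2, enriched_feature)
obs_null, p_null = cyclic_permutation_pvalue(toy_b2, null_feature)
print(obs_enriched, p_enriched, p_null)
```

Rotating the whole vector, rather than shuffling individual statistics, is what keeps neighbouring variants together and therefore preserves LD-driven correlation among test statistics in each permuted data set.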
Sperm WGBS data and bioinformatics analyses
No animal experiments were conducted in this study, and ethics committee approval was thus not required. All sperm methylation data were generated in previous studies, and references were provided where animal data were used.
The sperm whole-genome bisulfite sequencing (WGBS) data were generated in previous studies66. All the semen samples used in this study were collected from bulls by an artificial insemination company using a standardized procedure with artificial vaginas. Briefly, one semen straw (0.5 ml) often contains 10–40 million sperm, and is transported and stored in a liquid nitrogen tank. After thawing a semen straw, PBS buffer was used to wash away the extender three times by mild centrifugation. Visual examination of washed sperm samples was conducted under a microscope, and over 90% of sperm cells were found to be morphologically normal. At first, eight semen straws were sampled from eight representative Holstein bulls, among which four animals had high PTA of gestation length and the other four had low PTA of gestation length (Supplementary Table 1). For other traits such as SCE, BDE, and CCR, we also collected six semen samples in two groups (i.e., high PTA vs. low PTA) for each trait, so that each group had three biological replicates. In total, all of these sperm samples were from 18 fertile, age-matched and representative animals, because several animals had extreme values for multiple traits. The reliabilities of PTA for all of the 18 animals were greater than 0.97, 0.93, 0.87 and 0.83 for gestation length, SCE, BDE and CCR, respectively. Genomic DNA was isolated using the QIAamp DNA Mini Kit protocol (QIAGEN, Valencia, CA, USA). The quality of isolated DNA was evaluated using the 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). The sequence libraries were constructed using all of the qualified genomic DNA66, and were then sequenced using HiSeq X 10 (Illumina, San Diego, CA, USA) with a 150 bp paired-end technology.
FastQC v0.11.2 (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and Trim Galore v0.4.0 (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) were employed to check and clean the raw data, respectively66. In general, adapters and reads of low quality (Q < 20) or shorter than 20 bp were removed. All cleaned data were mapped to the cattle reference genome (UMD 3.1) using bowtie284, with an average mapping rate of 69.23% (range, 53.10–78.90%). The total number of mapped reads per sample ranged from 134,984,436 to 227,534,395, with an average of 171,982,028. Bismark software85 was applied to extract methylcytosine information. Only loci covered by at least 10 clean reads were kept for further analyses. More details have been described previously66.
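The coverage filter described above can be illustrated with a minimal Python sketch. The record layout (`CpGCall`) and the function names are hypothetical and chosen only for illustration; they do not reproduce Bismark's output format or the authors' scripts. Only the threshold of 10 reads comes from the text.

```python
from typing import Iterable, List, NamedTuple


class CpGCall(NamedTuple):
    """Hypothetical per-locus record of methylation read counts."""
    chrom: str
    pos: int
    methylated: int    # reads supporting a methylated C
    unmethylated: int  # reads supporting an unmethylated C


MIN_COVERAGE = 10  # threshold stated in the text


def filter_by_coverage(calls: Iterable[CpGCall],
                       min_cov: int = MIN_COVERAGE) -> List[CpGCall]:
    """Keep only loci covered by at least `min_cov` reads."""
    return [c for c in calls if c.methylated + c.unmethylated >= min_cov]


def methylation_level(call: CpGCall) -> float:
    """Fraction of reads at the locus that support methylation."""
    return call.methylated / (call.methylated + call.unmethylated)
```

For example, a locus with 7 methylated and 3 unmethylated reads passes the filter with a methylation level of 0.7, whereas a locus with 9 reads in total is dropped.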
Because methylation alterations that are associated with complex traits often exhibit spatial correlation patterns86, DMR (differentially methylated regions) rather than DMC (differentially methylated cytosines) were identified using methylKit87. The entire genome was first tiled into 2000 bp windows with a 2000 bp step size (i.e., non-overlapping tiles), and methylation levels were then summarized for each tile. A logistic regression model implemented in the calculateDiffMeth function was employed to detect DMR: for each sample, the number of methylated Cs and the number of unmethylated Cs in a given region are specified, and a logistic regression test is applied to compare the fraction of methylated Cs between the two groups under comparison. P-values were calculated by comparing the fit of the alternative model with that of the null model, and were corrected to q-values for multiple testing using the SLIM method88. Since methylation alterations associated with complex phenotypes are generally very weak86,89, an absolute methylation difference >5% and a range of q cutoffs (i.e., 0.05, 0.01, 1e-5, 1e-8, and 1e-10) were used to define DMR for the downstream analyses.
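The tiling and testing steps can be sketched in plain Python as follows. This is an illustrative simplification under stated assumptions, not methylKit's implementation: it assumes no covariates and no overdispersion correction, in which case the logistic regression with a single group indicator reduces to comparing pooled binomial proportions, and the P-value comes from a likelihood-ratio test with one degree of freedom. The function names and input layout are hypothetical, and q-values (e.g., from SLIM) are assumed to be computed separately.

```python
import math
from collections import defaultdict

TILE_SIZE = 2000  # window length and step size from the text


def tile_counts(calls, tile_size=TILE_SIZE):
    """Sum read counts per non-overlapping tile.

    `calls` holds (chrom, pos, methylated, unmethylated) tuples with
    1-based positions (an assumed layout). Keys are (chrom, tile_start).
    """
    tiles = defaultdict(lambda: [0, 0])
    for chrom, pos, meth, unmeth in calls:
        start = ((pos - 1) // tile_size) * tile_size + 1
        tiles[(chrom, start)][0] += meth
        tiles[(chrom, start)][1] += unmeth
    return {key: tuple(val) for key, val in tiles.items()}


def _binom_loglik(meth, unmeth, p):
    """Binomial log-likelihood kernel; zero counts contribute nothing."""
    ll = 0.0
    if meth:
        ll += meth * math.log(p)
    if unmeth:
        ll += unmeth * math.log(1.0 - p)
    return ll


def tile_lrt(group_a, group_b):
    """Likelihood-ratio test for one tile.

    Each group is a list of (methylated, unmethylated) counts, one pair
    per sample; both groups are assumed to have non-zero coverage.
    Returns (difference in percentage points, P-value).
    """
    ma = sum(m for m, _ in group_a)
    ua = sum(u for _, u in group_a)
    mb = sum(m for m, _ in group_b)
    ub = sum(u for _, u in group_b)
    pa = ma / (ma + ua)
    pb = mb / (mb + ub)
    p0 = (ma + mb) / (ma + ua + mb + ub)
    ll_alt = _binom_loglik(ma, ua, pa) + _binom_loglik(mb, ub, pb)
    ll_null = _binom_loglik(ma + mb, ua + ub, p0)
    stat = max(0.0, 2.0 * (ll_alt - ll_null))
    # Survival function of a chi-square variable with 1 d.f.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return 100.0 * (pa - pb), p_value


def is_dmr(diff_pct, q_value, q_cutoff=0.01, min_diff_pct=5.0):
    """Apply the DMR definition: |difference| > 5% and q below the cutoff."""
    return abs(diff_pct) > min_diff_pct and q_value < q_cutoff
```

Passing different values of `q_cutoff` (0.05, 0.01, 1e-5, 1e-8, 1e-10) to `is_dmr` mirrors the series of thresholds used for the downstream analyses.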
DMR-set enrichment analysis
The following count-based approach was used to test whether a Reactome pathway was enriched for DMR,
$$T_{count} = \sum_{i=1}^{m_f} I(q_i < q_0) \qquad (2)$$
where $m_f$ is the total number of 2000 bp-tiles being tested that were overlapped (at least 1 bp) with genes in a pathway, $q_i$ is the q value for the $i$th tested tile, $q_0$ is an arbitrarily chosen threshold, and $I$ is an indicator function that takes value 1 when $q_i < q_0$, and value zero otherwise. Here $q_0 = 0.01$ was used tentatively, and gene regions were extended 10 Kb upstream and downstream to cover potential regulatory regions. Under the null hypothesis (i.e., DMR are distributed in the genome randomly), $T_{count}$ was assumed to follow a hypergeometric distribution: $T_{count} \sim \mathrm{Hyper}(m, m_g, m_f)$, where $m$ is the total number of tiles being tested in the entire genome, $m_g$ is the total number of DMR detected in the entire genome, and $m_f$ is the number of tiles being tested in a pathway. The null hypothesis (i.e., no enrichment) will be rejected when the adjusted P-value is less than 0.05. Here, a tile was considered belonging to a pathway if the tile intersected any gene (±10 Kb up- and down-stream) in the particular pathway.
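As a worked illustration of this test, the sketch below counts the DMR among the tiles of a pathway and computes an upper-tail hypergeometric P-value. The function names are hypothetical, and the use of the upper tail P(X ≥ T_count) is an assumption that follows from testing for enrichment; the text does not specify the software used. Exact integer arithmetic is used for clarity, which can be slow at genome scale.

```python
from math import comb


def t_count(q_values, q0=0.01):
    """Number of pathway tiles whose q-value falls below the threshold q0."""
    return sum(1 for q in q_values if q < q0)


def hypergeom_upper_tail(t, m, mg, mf):
    """P(X >= t) for X ~ Hyper(m, mg, mf).

    m  : tiles tested in the entire genome
    mg : DMR detected in the entire genome
    mf : tiles tested in the pathway (the number of draws)
    """
    lower = max(t, 0)
    upper = min(mg, mf)
    numerator = sum(comb(mg, k) * comb(m - mg, mf - k)
                    for k in range(lower, upper + 1))
    return numerator / comb(m, mf)
```

With toy values m = 10, mg = 4, and mf = 3, observing two DMR in the pathway gives P(X ≥ 2) = (36 + 4)/120 = 1/3.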
Functional enrichment analyses for gene lists in DMR were conducted using the R package clusterProfiler90, in which a hypergeometric test was applied against the current KEGG database. P-values were adjusted for multiple testing using the FDR method91, and FDR < 0.05 was considered significant.
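The FDR adjustment cited above (Benjamini and Hochberg91) can be sketched as a standard step-up procedure. This is a generic illustration with a hypothetical function name, not the code path used inside clusterProfiler.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    n = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted
```

A pathway would then be reported as significant when its adjusted value is below 0.05.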
Reporting summary
Further information on experimental design is available in the Nature Research Reporting Summary linked to this article.
Acknowledgements
This work was supported in part by AFRI grant numbers 2013-67015-20951 and 2016-67015-24886 from the USDA National Institute of Food and Agriculture (NIFA) and BARD grant number US-4997-17 from the US-Israel Binational Agricultural Research and Development (BARD) Fund. We thank the 1000 Bull Genomes Project for providing genome references for sequence imputation.
Author contributions
L.F., J.J., G.E.L., and L.M. conceived and designed the experiments. P.M.V. and J.B.C. provided genotype and phenotype data. Y.Z. and G.E.L. collected samples and/or generated sperm methylation data. L.F., J.J., and E.F. performed computational and statistical analyses. L.F., J.J., B.L., and L.M. wrote the paper. All authors read and approved the final manuscript.
Data availability
All 18 cattle sperm methylomes have been submitted to NCBI under accession number GSE119263. All genomic annotation files of cattle (UMD 3.1.1) are available for download from the Ensembl database (https://uswest.ensembl.org/index.html). The authors confirm that the original genotype data are owned by the Council on Dairy Cattle Breeding (CDCB). A request to CDCB is necessary to obtain these data for research and may be sent to: João Dürr, CDCB Chief Executive Officer (joao.durr@cdcb.us). All other data are shown in the manuscript and supplementary data.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Lingzhao Fang, Jicai Jiang.
Contributor Information
George E. Liu, Email: George.Liu@ars.usda.gov
Li Ma, Phone: +1-301-405-1389, Email: lima@umd.edu.
Supplementary information
Supplementary information accompanies this paper at 10.1038/s42003-019-0341-6.
References
- 1. Arnott G, et al. Board invited review: the importance of the gestation period for welfare of calves: maternal stressors and difficult births. J. Anim. Sci. 2012;90:5021–5034. doi: 10.2527/jas.2012-5463.
- 2. Lemons JA, et al. Very low birth weight outcomes of the National Institute of Child health and human development neonatal research network, January 1995 through December 1996. Pediatrics. 2001;107:e1–e1. doi: 10.1542/peds.107.1.e1.
- 3. Yoshida S, et al. Setting research priorities to improve global newborn health and prevent stillbirths by 2025. J. Glob. Health. 2016;6:010508. doi: 10.7189/jogh.06.010508.
- 4. Huusko JM, et al. Whole exome sequencing reveals HSPA1L as a genetic risk factor for spontaneous preterm birth. PLoS Genet. 2018;14:e1007394. doi: 10.1371/journal.pgen.1007394.
- 5. Vieira-Neto A, Galvão K, Thatcher W, Santos J. Association among gestation length and health, production, and reproduction in Holstein cows and implications for their offspring. J. Dairy Sci. 2017;100:3166–3181. doi: 10.3168/jds.2016-11867.
- 6. Nogalski Z, Piwczyński D. Association of length of pregnancy with other reproductive traits in dairy cattle. Asian-Australas. J. Anim. Sci. 2012;25:22. doi: 10.5713/ajas.2011.11084.
- 7. Aghaeepour N, et al. An immune clock of human pregnancy. Sci. Immunol. 2017;2:eaan2946. doi: 10.1126/sciimmunol.aan2946.
- 8. Goedicke-Fritz S, et al. Preterm Birth affects the risk of developing immune-mediated diseases. Front. Immunol. 2017;8:1266. doi: 10.3389/fimmu.2017.01266.
- 9. Jukic AM, Baird DD, Weinberg CR, McConnaughey DR, Wilcox AJ. Length of human pregnancy and contributors to its natural variation. Hum. Reprod. 2013;28:2848–2855. doi: 10.1093/humrep/det297.
- 10. Norman H, et al. Genetic and environmental factors that affect gestation length in dairy cattle. J. Dairy Sci. 2009;92:2259–2269. doi: 10.3168/jds.2007-0982.
- 11. Morel MD, Newcombe J, Holland S. Factors affecting gestation length in the Thoroughbred mare. Anim. Reprod. Sci. 2002;74:175–185. doi: 10.1016/s0378-4320(02)00171-9.
- 12. Clausson B, Lichtenstein P, Cnattingius S. Genetic influence on birthweight and gestational length determined by studies in offspring of twins. BJOG. 2000;107:375–381. doi: 10.1111/j.1471-0528.2000.tb13234.x.
- 13. York TP, et al. Fetal and maternal genes’ influence on gestational age in a quantitative genetic analysis of 244,000 Swedish births. Am. J. Epidemiol. 2013;178:543–550. doi: 10.1093/aje/kwt005.
- 14. Schierding W, et al. GWAS on prolonged gestation (post-term birth): analysis of successive Finnish birth cohorts. J. Med. Genet. 2017;55:55–63. doi: 10.1136/jmedgenet-2017-104880.
- 15. Zhang G, et al. Genetic associations with gestational duration and spontaneous preterm birth. N. Engl. J. Med. 2017;377:1156–1167. doi: 10.1056/NEJMoa1612665.
- 16. Carrell DT, Hammoud SS. The human sperm epigenome and its potential role in embryonic development. Mol. Hum. Reprod. 2009;16:37–47. doi: 10.1093/molehr/gap090.
- 17. Jenkins TG, Carrell DT. The sperm epigenome and potential implications for the developing embryo. Reproduction. 2012;143:727–734. doi: 10.1530/REP-11-0450.
- 18. Teperek M, et al. Sperm is epigenetically programmed to regulate gene transcription in embryos. Genome Res. 2016;26:1034–1046. doi: 10.1101/gr.201541.115.
- 19. Cole J, et al. Distribution and location of genetic effects for dairy traits. J. Dairy Sci. 2009;92:2931–2946. doi: 10.3168/jds.2008-1762.
- 20. Harhay GP, et al. An atlas of bovine gene expression reveals novel distinctive tissue characteristics and evidence for improving genome annotation. Genome Biol. 2010;11:R102. doi: 10.1186/gb-2010-11-10-r102.
- 21. Maltecca C, Gray K, Weigel K, Cassady J, Ashwell M. A genome‐wide association study of direct gestation length in US Holstein and Italian Brown populations. Anim. Genet. 2011;42:585–591. doi: 10.1111/j.1365-2052.2011.02188.x.
- 22. Xiao X, et al. HSF1 is required for extra‐embryonic development, postnatal growth and protection during inflammatory responses in mice. EMBO J. 1999;18:5943–5952. doi: 10.1093/emboj/18.21.5943.
- 23. Ma X, Adelstein RS. A point mutation in Myh10 causes major defects in heart development and body wall closure. Circulation. 2014;113:000455. doi: 10.1161/CIRCGENETICS.113.000455.
- 24. Ridge LA, et al. Non-muscle myosin IIB (Myh10) is required for epicardial function and coronary vessel formation during mammalian development. PLoS Genet. 2017;13:e1007068. doi: 10.1371/journal.pgen.1007068.
- 25. Sasaki S, et al. Complete loss of Ndel1 results in neuronal migration defects and early embryonic lethality. Mol. Cell. Biol. 2005;25:7812–7827. doi: 10.1128/MCB.25.17.7812-7827.2005.
- 26. Zhao YY, et al. Neuregulin signaling in the heart: dynamic targeting of erbB4 to caveolar microdomains in cardiac myocytes. Circ. Res. 1999;84:1380–1387. doi: 10.1161/01.res.84.12.1380.
- 27. Boyle EA, Li YI, Pritchard JK. An expanded view of complex traits: from polygenic to omnigenic. Cell. 2017;169:1177–1186. doi: 10.1016/j.cell.2017.05.038.
- 28. Furness D, Dekker G, Roberts C. DNA damage and health in pregnancy. J. Reprod. Immunol. 2011;89:153–162. doi: 10.1016/j.jri.2011.02.004.
- 29. Wang Y, Bikle DD, Chang W. Autocrine and paracrine actions of IGF-I signaling in skeletal development. Bone Res. 2013;1:249. doi: 10.4248/BR201303003.
- 30. Gray C, McCowan LM, Patel R, Taylor RS, Vickers MH. Maternal plasma miRNAs as biomarkers during mid-pregnancy to predict later spontaneous preterm birth: a pilot study. Sci. Rep. 2017;7:815. doi: 10.1038/s41598-017-00713-8.
- 31. Hosseini MK, Gunel T, Gumusoglu E, Benian A, Aydinli K. MicroRNA expression profiling in placenta and maternal plasma in early pregnancy loss. Mol. Med. Rep. 2018;17:4941–4952. doi: 10.3892/mmr.2018.8530.
- 32. Ioannidis J, Donadeu FX. Changes in circulating microRNA levels can be identified as early as day 8 of pregnancy in cattle. PLoS One. 2017;12:e0174892. doi: 10.1371/journal.pone.0174892.
- 33. Schanzenbach CI, Kirchner B, Ulbrich SE, Pfaffl MW. Can milk cell or skim milk miRNAs be used as biomarkers for early pregnancy detection in cattle? PLoS One. 2017;12:e0172220. doi: 10.1371/journal.pone.0172220.
- 34. Ioannidis J, Donadeu FX. Circulating miRNA signatures of early pregnancy in cattle. BMC Genom. 2016;17:184. doi: 10.1186/s12864-016-2529-1.
- 35. Laresgoiti-Servitje E. Pregnancy-related miRNAs participate in the regulation of the immune system during the gestational period. J. Clin. Cell Immunol. 2015;6:2.
- 36. Cai M, Kolluru GK, Ahmed A. Small molecule, big prospects: microrna in pregnancy and its complications. J. Pregnancy. 2017;2017:6972732. doi: 10.1155/2017/6972732.
- 37. Hu ZL, Reecy JM. Animal QTLdb: beyond a repository. Mamm. Genome. 2007;18:1–4. doi: 10.1007/s00335-006-0105-8.
- 38. Reitz RE, Daane TA, Woods JR, Weinstein RL. Calcium, magnesium, phosphorus, and parathyroid hormone interrelationships in pregnancy and newborn infants. Obstet. Gynecol. 1977;50:701–705.
- 39. Keen CL, et al. Effect of copper deficiency on prenatal development and pregnancy outcome. Am. J. Clin. Nutr. 1998;67:1003S–1011S. doi: 10.1093/ajcn/67.5.1003S.
- 40. Heringstad B, Gianola D, Chang Y, Ødegård J, Klemetsdal G. Genetic associations between clinical mastitis and somatic cell score in early first-lactation cows. J. Dairy Sci. 2006;89:2236–2244. doi: 10.3168/jds.S0022-0302(06)72295-0.
- 41. Hande M. DNA repair factors and telomere-chromosome integrity in mammalian cells. Cytogenet. Genome Res. 2004;104:116–122. doi: 10.1159/000077475.
- 42. Ménézo Y, Dale B, Cohen M. DNA damage and repair in human oocytes and embryos: a review. Zygote. 2010;18:357–365. doi: 10.1017/S0967199410000286.
- 43. Graf A, et al. Fine mapping of genome activation in bovine embryos by RNA sequencing. Proc. Natl Acad. Sci. USA. 2014;111:4139–4144. doi: 10.1073/pnas.1321569111.
- 44. Johannes F, et al. Assessing the impact of transgenerational epigenetic variation on complex traits. PLoS Genet. 2009;5:e1000530. doi: 10.1371/journal.pgen.1000530.
- 45. Armon P. Amoebiasis in pregnancy and the puerperium. BJOG. 1978;85:264–269. doi: 10.1111/j.1471-0528.1978.tb10498.x.
- 46. Pitkin RM. Endocrine regulation of calcium homeostasis during pregnancy. Clin. Perinatol. 1983;10:575–592.
- 47. Tiezzi F, Arceo ME, Cole JB, Maltecca C. Including gene networks to predict calving difficulty in Holstein, Brown Swiss and Jersey cattle. BMC Genet. 2018;19:20. doi: 10.1186/s12863-018-0606-y.
- 48. de Maturana EL, Ugarte E, González-Recio O. Impact of calving ease on functional longevity and herd amortization costs in Basque Holsteins using survival analysis. J. Dairy Sci. 2007;90:4451–4457. doi: 10.3168/jds.2006-734.
- 49. Ashworth M, Leach F, Milner R. Development of insulin secretion in the human fetus. Arch. Dis. Child. 1973;48:151. doi: 10.1136/adc.48.2.151.
- 50. Shiokawa S, et al. Functional role of focal adhesion kinase in the process of implantation. Mol. Hum. Reprod. 1998;4:907–914. doi: 10.1093/molehr/4.9.907.
- 51. Kawamura K, et al. Hippo signaling disruption and Akt stimulation of ovarian follicles for infertility treatment. Proc. Natl Acad. Sci. USA. 2013;110:17474–17479. doi: 10.1073/pnas.1312830110.
- 52. Thomasen J, Guldbrandtsen B, Sørensen P, Thomsen B, Lund M. Quantitative trait loci affecting calving traits in Danish Holstein cattle. J. Dairy Sci. 2008;91:2098–2105. doi: 10.3168/jds.2007-0602.
- 53. Müller MP, et al. Genome-wide mapping of 10 calving and fertility traits in Holstein dairy cattle with special regard to chromosome 18. J. Dairy Sci. 2017;100:1987–2006. doi: 10.3168/jds.2016-11506.
- 54. Mao X, et al. Fine mapping of a calving QTL on Bos taurus autosome 18 in Holstein cattle. J. Anim. Breed. Genet. 2016;133:207–218. doi: 10.1111/jbg.12187.
- 55. Brand B, et al. Quantitative trait loci mapping of calving and conformation traits on Bos taurus autosome 18 in the German Holstein population. J. Dairy Sci. 2010;93:1205–1215. doi: 10.3168/jds.2009-2553.
- 56. Pausch H, et al. Genome-wide association study identifies two major loci affecting calving ease and growth related traits in cattle. Genetics. 2010;187:289–297.
- 57. Purfield D, Bradley D, Kearney J, Berry D. Genome-wide association study for calving traits in Holstein–Friesian dairy cattle. Animal. 2014;8:224–235. doi: 10.1017/S175173111300195X.
- 58. Purfield DC, Bradley DG, Evans RD, Kearney FJ, Berry DP. Genome-wide association study for calving performance using high-density genotypes in dairy and beef cattle. Genet. Sel. Evol. 2015;47:47. doi: 10.1186/s12711-015-0126-4.
- 59. Zhang Q, Guldbrandtsen B, Thomasen JR, Lund MS, Sahana G. Genome-wide association study for longevity with whole-genome sequencing in 3 cattle breeds. J. Dairy Sci. 2016;99:7289–7298. doi: 10.3168/jds.2015-10697.
- 60. Cole J, Waurich B, Wensch-Dorendorf M, Bickhart D, Swalve H. A genome-wide association study of calf birth weight in Holstein cattle using single nucleotide polymorphisms and phenotypes predicted from auxiliary traits. J. Dairy Sci. 2014;97:3156–3172. doi: 10.3168/jds.2013-7409.
- 61. Wu X, Guldbrandtsen B, Nielsen US, Lund MS, Sahana G. Association analysis for young stock survival index with imputed whole-genome sequence variants in Nordic Holstein cattle. J. Dairy Sci. 2017;100:6356–6370. doi: 10.3168/jds.2017-12688.
- 62. Magee DA, et al. DNA sequence polymorphisms in a panel of eight candidate bovine imprinted genes and their association with performance traits in Irish Holstein-Friesian cattle. BMC Genet. 2010;11:93. doi: 10.1186/1471-2156-11-93.
- 63. Jiang J, Cole JM, Da Y, VanRaden PM, Ma L. Fast Bayesian fine-mapping of 35 production, reproduction and body conformation traits with imputed sequences of 27K Holstein bulls. bioRxiv. 2018;428227.
- 64. Clark EL, et al. A high resolution atlas of gene expression in the domestic sheep (Ovis aries). PLoS Genet. 2017;13:e1006997. doi: 10.1371/journal.pgen.1006997.
- 65. Samans B, et al. Uniformity of nucleosome preservation pattern in Mammalian sperm and its connection to repetitive DNA elements. Dev. Cell. 2014;30:23–35. doi: 10.1016/j.devcel.2014.05.023.
- 66. Zhou Y, et al. Comparative whole genome DNA methylation profiling of cattle sperm and somatic tissues reveals striking hypomethylated patterns in sperm. Gigascience. 2018;7:giy039. doi: 10.1093/gigascience/giy039.
- 67. Hammoud SS, et al. Distinctive chromatin in human sperm packages genes for embryo development. Nature. 2009;460:473. doi: 10.1038/nature08162.
- 68. VanRaden PM, Tooker ME, O’connell JR, Cole JB, Bickhart DM. Selecting sequence variants to improve genomic predictions for dairy cattle. Genet. Sel. Evol. 2017;49:32. doi: 10.1186/s12711-017-0307-4.
- 69. Daetwyler HD, et al. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nat. Genet. 2014;46:858. doi: 10.1038/ng.3034.
- 70. VanRaden PM. Efficient methods to compute genomic predictions. J. Dairy Sci. 2008;91:4414–4423. doi: 10.3168/jds.2007-0980.
- 71. Zhu X, et al. Meta-analysis of correlated traits via summary statistics from GWASs with an application in hypertension. Am. J. Human. Genet. 2015;96:21–36. doi: 10.1016/j.ajhg.2014.11.011.
- 72. Huang H, et al. Fine-mapping inflammatory bowel disease loci to single-variant resolution. Nature. 2017;547:173. doi: 10.1038/nature22969.
- 73. Kichaev G, et al. Integrating functional data to prioritize causal variants in statistical fine-mapping studies. PLoS Genet. 2014;10:e1004722. doi: 10.1371/journal.pgen.1004722.
- 74. Chen W, McDonnell SK, Thibodeau SN, Tillmans LS, Schaid DJ. Incorporating functional annotations for fine-mapping causal variants in a Bayesian framework using summary statistics. Genetics. 2016;116:188953. doi: 10.1534/genetics.116.188953.
- 75. Fang L, et al. MicroRNA-guided prioritization of genome-wide association signals reveals the importance of microRNA-target gene networks for complex traits in cattle. Sci. Rep. 2018;8:9345. doi: 10.1038/s41598-018-27729-y.
- 76. Vejnar CE, Zdobnov EM. MiRmap: comprehensive prediction of microRNA target repression strength. Nucleic Acids Res. 2012;40:11673–11683. doi: 10.1093/nar/gks901.
- 77. Kemper KE, Goddard ME. Understanding and predicting complex traits: knowledge from cattle. Hum. Mol. Genet. 2012;21:R45–R51. doi: 10.1093/hmg/dds332.
- 78. Rohde PD, Demontis D, Cuyabano BCD, Børglum AD, Sørensen P. Covariance association test (CVAT) identifies genetic markers associated with schizophrenia in functionally associated biological processes. Genetics. 2016;203:1901–1913. doi: 10.1534/genetics.116.189498.
- 79. Sørensen IF, Edwards SM, Rohde PD, Sørensen P. Multiple trait covariance association test identifies gene ontology categories associated with chill coma recovery time in Drosophila melanogaster. Sci. Rep. 2017;7:2413. doi: 10.1038/s41598-017-02281-3.
- 80. Sarup P, Jensen J, Ostersen T, Henryon M, Sørensen P. Increased prediction accuracy using a genomic feature model including prior information on quantitative trait locus regions in purebred Danish Duroc pigs. BMC Genet. 2016;17:11. doi: 10.1186/s12863-015-0322-9.
- 81. Fang L, et al. Exploring the genetic architecture and improving genomic prediction accuracy for mastitis and milk production traits in dairy cattle by mapping variants to hepatic transcriptomic regions responsive to intra-mammary infection. Genet. Sel. Evol. 2017;49:44. doi: 10.1186/s12711-017-0319-0.
- 82. Fang L, et al. Integrating sequence-based GWAS and RNA-Seq provides novel insights into the genetic basis of mastitis and milk production in dairy cattle. Sci. Rep. 2017;7:45560. doi: 10.1038/srep45560.
- 83. Finucane HK, et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 2015;47:1228. doi: 10.1038/ng.3404.
- 84. Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25. doi: 10.1186/gb-2009-10-3-r25.
- 85. Krueger F, Andrews SR. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics. 2011;27:1571–1572. doi: 10.1093/bioinformatics/btr167.
- 86. Guo S, et al. Identification of methylation haplotype blocks aids in deconvolution of heterogeneous tissue samples and tumor tissue-of-origin mapping from plasma DNA. Nat. Genet. 2017;49:635. doi: 10.1038/ng.3805.
- 87. Akalin A, et al. methylKit: a comprehensive R package for the analysis of genome-wide DNA methylation profiles. Genome Biol. 2012;13:R87. doi: 10.1186/gb-2012-13-10-r87.
- 88. Wang HQ, Tuominen LK, Tsai CJ. SLIM: a sliding linear model for estimating the proportion of true null hypotheses in datasets with dependence structures. Bioinformatics. 2010;27:225–231. doi: 10.1093/bioinformatics/btq650.
- 89. Teschendorff AE, Relton CL. Statistical and integrative system-level analysis of DNA methylation data. Nat. Rev. Genet. 2018;19:129. doi: 10.1038/nrg.2017.86.
- 90. Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16:284–287. doi: 10.1089/omi.2011.0118.
- 91. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B. 1995;57:289–300.