Abstract
Epigenetic dynamics, such as DNA methylation and chromatin accessibility, have been extensively explored in human preimplantation embryos. However, the active demethylation process during this crucial period remains largely unexplored. In this study, we use single-cell chemical-labeling-enabled C-to-T conversion sequencing (CLEVER-seq) to quantify the DNA 5-formylcytosine (5fC) levels of human preimplantation embryos. We find that 5-formylcytosine phosphate guanine (5fCpG) exhibits genomic element-specific distribution features and is enriched in L1 and endogenous retrovirus-K (ERVK), the subfamilies of repeat elements long interspersed nuclear elements (LINEs) and long terminal repeats (LTRs), respectively. Unlike in mice, paired pronuclei in the same zygote present variable difference of 5fCpG levels, although the male pronuclei experience stronger global demethylation. The nucleosome-occupied regions show a higher 5fCpG level compared with nucleosome-depleted ones, suggesting the role of 5fC in organizing nucleosome position. Collectively, our work offers a valuable resource for ten-eleven translocation protein family (TET)-dependent active demethylation-related study during human early embryonic development.
The use of CLEVER-seq to delineate the 5-formylcytosine landscape of human early embryos at single-cell resolution enhances our understanding of TET-dependent active demethylation processes during human preimplantation embryo development.
Introduction
Epigenetic regulation is crucial for early embryonic development to control the expression of key regulators to complete the reprogramming process [1–4]. The dynamics of DNA methylation and chromatin accessibility have been extensively analyzed in human preimplantation embryos [5–11]. Human early embryos go through waves of global demethylation after fertilization [12]. Active demethylation is mediated by the ten-eleven translocation protein family (TET) to produce a series of oxidized derivatives of 5-methylcytosine (5mC), including 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC) [13–17]. As an essential active demethylation status, 5fC can organize the nucleosome position and be occupied by corresponding regulatory proteins [18–20]. Additionally, the production of 5fC in promoters and tissue-specific enhancers is usually associated with the up-regulation of gene expression [18,21,22]. However, the genome-wide 5fC dynamics during human preimplantation development are largely unknown. Here, we performed single-cell chemical-labeling-enabled C-to-T conversion sequencing (CLEVER-seq) to dissect the 5fC landscapes of human early embryos to provide new insights into the potential function of active demethylation in this process.
Results
Genome-wide identification of 5fC during human early embryonic development
5fC is a relatively rare DNA modification in the genome, with only 20 to 200 ppm of cytosines [15,16,23–26]. To map 5fC in a single cell’s genome, we applied CLEVER-seq to human gametes, the first polar bodies, human preimplantation embryos at six key developmental stages (zygotes, 2-cell, 4-cell, 8-cell embryos, morulae, and blastocysts) and human embryonic stem cells (hESCs). In total, we obtained 130 euploid individual cells without copy number variations (CNVs) for further analysis (Fig 1A, S1A Fig and S1 Table). In CLEVER-seq, malononitrile is a key chemical that can specifically label 5fC with high efficiency [22]. The synthesized model DNA containing 5fC modifications was spiked into each sample to quantify the efficiency of malononitrile labeling (S2 Table). The average conversion rate was 79.6% in all 130 malononitrile-treated single-cell samples (S1B Fig). With approximately 5× sequencing depth for each malononitrile-treated single-cell sample, the average number of clean reads of each sample was 117.8 million with a 74.6% averaged mapping rate (S1C Fig). On average, 9.0 million, 6.4 million, and 5.0 million unique cytosine phosphate guanine (CpG) sites were covered in each malononitrile-treated sample at ≥1×, ≥3×, and ≥5× coverage, respectively (S1D and S1E Fig).
The highly efficient single-cell genome amplification by multiple annealing- and looping-based amplification cycles (MALBAC) enabled us to detect as many 5fC sites as possible [27]. We focused on 5fC at CpG sites, referred to as 5fCpG. The 5fCpG calling scheme followed our previous study to keep 5fCpG site calling consistent [22]. The stringent 5fC candidate pool was established by selecting CpG sites with ≥65% C-to-T ratio in at least two single-cell samples with malononitrile treatment, which was expected to remove potential polymerase chain reaction (PCR) errors. In the meantime, a “background pool” was defined by selecting sites with ≥65% C-to-T ratio in at least two untreated (negative control) single-cell samples, which was used to subtract additional background noise. Known SNPs from the dbSNP database (hg19, v135) were removed from the candidate sites. The binomial test was used to identify 5fCpG candidate sites from our sequencing data, and CpG sites with Holm–Bonferroni (HB) method-adjusted p < 0.01 were selected as 5fCpG sites. A total of 24,830 to 45,398 5fCpG sites were identified from merged samples for each developmental stage analyzed (S1F Fig and Fig 1B). A total of 8,033 to 393,602 CpG sites were covered across all cells in each developmental stage at ≥3× sequencing depth, and 0.17% to 0.89% of them were 5fCpG sites in the analysis aggregated by developmental stage (S1G Fig). That is, 0.17–0.89% of the CpG sites covered in every individual cell had formyl modification detected in at least one individual cell of the same developmental stage. The percentage of 5fCpG in the genome of individual sperm was higher than that in the genome of individual oocytes (0.065% and 0.050%, respectively; Fig 1C). The first polar body had a higher 5fCpG level (0.062%) than the oocyte, indicating that active demethylation occurred to different extents during oocyte maturation. After fertilization, the 5fCpG content drastically increased in pronuclei.
At 16 to 18 h after fertilization, the male and female pronuclei presented comparable 5fCpG levels, 0.106% and 0.109%. Immunostaining of human zygotes using 5fC antibody also proved the existence of 5fC (Fig 1D). From the zygote to the 2-cell stage, the 5fCpG level decreased to 0.053% (Fig 1C). During the cleavage stages, a higher 5fCpG percentage was observed at the 8-cell stage (0.066%). The 5fCpG level of pluripotent hESCs (0.041%) was lower than that of the pluripotent inner cell mass (ICM, 0.062%). These results indicated that human early embryos exhibited extensive 5fC dynamics due to active demethylation.
The distribution characteristics of 5fCpG in human early embryos
When we compared two consecutive developmental stages, most 5fCpG sites were newly generated, suggesting that the production of 5fCpG was highly dynamic during human early embryonic development (Fig 2A) [28]; 66%, 71%, and 67% of 5fCpG sites were newly generated in male pronuclei, female pronuclei, and 2-cell stages compared with those in sperm, oocyte and pronuclei stages, respectively. However, a certain proportion, 10% to 25% of 5fCpG, was inherited from the former developmental stage. Then we explored the relationship between newly generated 5fCpG in promoter regions and RNA expression of the corresponding genes (S2A Fig). Unlike in mice, we did not observe the phenomenon that 5fCpG production in promoters precedes the up-regulation of corresponding genes in the following human early embryo developmental stages [22]. Instead, the promoter 5fCpG-marked genes exhibited a trend of up-regulation in the current developmental stage, such as the oocytes, female pronuclei, 2-cell, 4-cell, 8-cell embryos, and TE, when compared with the former stage. Only 2,823 (10%) 5fCpG sites in hESCs were shared between ICM and hESCs, indicating drastically distinct active demethylation patterns between them (Fig 2A and S2B Fig). According to genomic annotation, over half of the 5fCpG sites were located in intergenic regions, and a steady portion of 5fCpG sites were situated in the intron (30% on average) and exon regions (10% on average) due to the relatively longer length of those elements (Fig 2B). However, relative enrichment analysis showed that 5fCpG sites were enriched in functional genomic regions, such as gene body regions, transcriptional termination sites (TTSs) and enhancers (S2C Fig), whereas these sites were depleted from intergenic regions, 5′ untranslated regions (UTRs), 3′UTRs, and CpG islands (CGIs). The enrichment of 5fCpG sites in promoters was variable at different developmental stages.
In most stages except for the oocyte, 8-cell and trophectoderm (TE) stages, 5fCpG sites were depleted from the promoters. Then, we calculated the 5fCpG ratio in different genomic regions. In the intergenic region, the 5fCpG level was significantly higher than that in the intragenic region in the oocyte (p = 3.7 × 10−2), the first polar body (p = 2.0 × 10−2), four-cell (p = 1.3 × 10−2), ICM (p = 1.8 × 10−2), and hESC (p = 2.2 × 10−3) stages (S2D Fig). Furthermore, exon, intron, and TTS regions showed higher 5fCpG levels than did other genomic elements (S3A Fig). In contrast, CGIs, except for the oocyte stage, showed relatively lower 5fCpG levels that coincided with their low DNA methylation (5mC) levels (S3A Fig). When promoters were classified into high-density CpG promoters (HCPs), intermediate-density CpG promoters (ICPs), and low-density CpG promoters (LCPs) according to CpG densities, the 5fCpG abundance in HCPs, ICPs, and LCPs was similar in most stages (S3B Fig). At the 2-cell stage, the 5fCpG level in HCPs was lower than that in ICPs and LCPs (Student's t test p = 5.0 × 10−6 and p = 6.9 × 10−5, respectively). In male pronuclei and at the 4-cell stage, the 5fCpG levels in HCPs were also lower than those in LCPs (Student's t test p = 3.0 × 10−2 and p = 4.5 × 10−3, respectively).
Then, we conducted unsupervised clustering of the embryos by 5fCpG sites (Fig 2C). The oocytes, the first polar bodies, and female pronuclei clustered together but were separated from other embryo stages. Sperm and male pronuclei also clustered together. Similarly, ICM and TE clustered together. The cleavage stages, from the 2-cell to morula stage, were clustered together. These results indicated the similarity and association of 5fCpG sites in neighboring developmental stages. To estimate the heterogeneity of the 5fCpG site distribution, we calculated the variance in the 5fCpG level in a consecutive 1-kb window in the genome among cells at each developmental stage [29]. The variance in sperm was highest, up to a median of 8.9 × 10−3 (S3C Fig). The variance in the morula stage was lowest, with a median of only 3.3 × 10−3. Then t-distributed stochastic neighbor embedding (t-SNE) analysis of 5fCpG-marked windows was performed at the window size of 1 kb, 10 kb, 100 kb, 1 Mb, 10 Mb, respectively. The cells at sperm and male pronuclei stages clustered relatively closely at the windows of 1 kb, 10 kb, and 1 Mb, while cells of different developmental stages dispersedly distributed in t-SNE map (S3D Fig). Similarly, principal component analyses (PCAs) were also conducted based on 5fCpG-marked windows. For the 1-kb and 10-kb windows, the heterogeneity among single cells of male pronuclei and female pronuclei was stronger than that of other stages (S4 Fig). For the window of 100 kb, 1 Mb, 10 Mb, the heterogeneity among single cells of male pronuclei and sperm was more dominant than that of other stages. The 5fCpG sites in the intergenic region exhibited higher variance compared with those in the intragenic region (Fig 2D). The variance in 5fCpG sites located in repeat element long interspersed nuclear elements (LINEs) was higher than that in short interspersed nuclear elements (SINEs) and long terminal repeats (LTRs). 
The binding sites obtained from chromatin immunoprecipitation sequencing (ChIP-seq) data of hESCs [30,31] were used as presumed binding sites in human early embryos. Heterochromatin regions (H3K9me3 marked) showed higher variance in 5fCpG sites than did regions marked by H3K4me3 and H3K27me3. The 5fCpG sites in promoters, CGIs, and RNA polymerase II binding sites showed relatively lower variance, suggesting a conserved 5fC distribution in functionally essential genomic regions. According to the published ChIP-seq data of hESCs [30,31], 5fCpG sites were enriched in heterochromatin regions marked by H3K9me3 and euchromatin regions marked by H3K4me3 (Fig 2E). However, these sites were depleted from the binding sites of CCCTC-binding factor (CTCF) and from DNase I hypersensitive sites (DHS) in most preimplantation developmental stages. Additionally, 5fCpG sites were enriched in the binding sites of pluripotency transcription factors, such as NANOG, POU5F1, and SOX2. To explore the relationship between 5fCpG and 5mC, we calculated the DNA methylation level in 5fCpG-marked 1-kb windows in each developmental stage of human early embryos. The majority of 5fCpG-marked regions had relatively low methylation levels compared to the average 5mC level of the same stage in sperm, male pronuclei, female pronuclei, and 2-cell embryos (Fig 2F). However, this tendency of 5fCpG-marked bins toward low DNA methylation weakened from the 4-cell stage onward.
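The per-window variance used above to gauge heterogeneity among cells of one stage can be illustrated with a minimal sketch. The input layout (one dictionary per cell mapping a window key to a pair of called 5fCpG sites and covered CpG sites), the function names, the use of the sample variance, and the `min_cells` filter are all assumptions made for illustration and are not taken from the original analysis.

```python
from statistics import median, variance


def window_levels(cell):
    """Per-window 5fCpG level (called sites / covered CpG sites) for one cell.

    `cell` maps a window key, e.g. ("chr1", 0), to (n_5fCpG, n_covered_CpG).
    Windows without covered CpG sites are skipped.
    """
    return {w: n_5f / n_cov for w, (n_5f, n_cov) in cell.items() if n_cov > 0}


def median_window_variance(cells, min_cells=2):
    """Median, over windows, of the across-cell variance of the 5fCpG level.

    Only windows covered in at least `min_cells` cells are used
    (hypothetical filter; the sample variance needs at least two values).
    """
    levels = [window_levels(cell) for cell in cells]
    windows = set().union(*levels)
    per_window = []
    for w in windows:
        values = [lv[w] for lv in levels if w in lv]
        if len(values) >= min_cells:
            per_window.append(variance(values))
    return median(per_window)


# Two hypothetical cells, two 1-kb windows each.
example_cells = [
    {("chr1", 0): (1, 10), ("chr1", 1): (0, 10)},
    {("chr1", 0): (3, 10), ("chr1", 1): (0, 10)},
]
example_median = median_window_variance(example_cells)
```

In this toy input the first window varies between cells and the second does not, so the stage-level summary is the median of the two window variances.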
5fCpG distribution on repeat elements
Repeat elements have been shown to participate in the regulation of early embryonic development [32]. Enrichment analysis showed that 5fCpG sites were enriched in LINEs and satellites compared with those in SINEs and LTRs (S5A Fig). Moreover, the 5fCpG levels in LINEs and the SINE/variable number of tandem repeats/Alu (SVA) were higher than those in other repeat elements (S5B Fig). For the subfamilies of repeat elements, 5fCpG sites were relatively enriched in endogenous retrovirus-K (ERVK), L1 (except for two-cell stages) and SVA (except for oocyte stages; Fig 3A). The level of 5fCpG was higher in L1, the evolutionarily younger subfamily of LINEs, compared with that in L2, the evolutionarily older subfamily (Fig 3B and 3C). The subfamilies of SINEs, Alu and mammalian-wide interspersed repeat (MIR), showed similar 5fCpG levels. The 5fCpG level in ERVK was also higher than that in other subfamilies of LTRs (Fig 3B and 3D). Then we explored the RNA expression and DNA methylation levels of L1 and ERVK elements with 5fCpG modification. Compared with randomly selected elements containing only unmodified CpG, the 5fCpG-marked L1 elements showed lower expression levels in most developmental stages (S5C Fig). The 5fCpG-marked ERVK elements exhibited lower expression levels in 2-cell embryos, morula, and TE but higher expression levels in oocytes and 4-cell embryos. The markedly high 5fCpG level in L1 and ERVK may be associated with the regulation of the transcription of these repeat elements. The 5fCpG-marked L1 and ERVK showed significantly higher DNA methylation levels compared to the randomly selected ones (S5D Fig), which suggested that repeat element regions with higher DNA methylation had a higher tendency to retain the active demethylation marks.
Comparison of 5fCpG in paternal and maternal genomes
The parental genomes undergo different extents of demethylation and remethylation during human embryonic development [12,33]. To determine the 5fCpG features of parental genomes, we compared 5fCpG-marked regions, defined as consecutive 1-kb genomic windows containing at least one 5fCpG site, in gametes and pronuclei. A total of 3,513 (10%) 5fCpG-marked regions in oocytes were shared with those in sperm (Fig 4A). Additionally, 7,948 (20%) 5fCpG sites in female pronuclei were shared with those in male pronuclei (Fig 4B, 4C and 4D). Enrichment analysis showed that in contrast to gamete- and pronuclei-shared regions, gamete-specific and pronuclei-specific 5fCpG-marked regions were enriched in promoters, exons, CGIs, and enhancers (Fig 4E). The gamete-shared and pronuclei-shared 5fCpG-marked regions were enriched in LINEs and satellites (S5E Fig), especially in L1, and depleted in MIR compared with gamete-specific and pronuclei-specific regions (Fig 4F). These results suggested that 5fCpG was distinctly distributed on parental genomes.
Variable differences of 5fCpG levels in paired pronuclei
In the pronuclei stage, the 5fCpG level is higher than that in other stages, which is similar to the situation in mice [22]. In mice, for paired pronuclei, the 5fCpG levels of male pronuclei are always higher than those of paired female pronuclei. However, in humans, paired pronuclei showed a distinct pattern. In total, we successfully obtained 10 paired pronuclei to quantify 5fCpG levels. Five pairs presented higher 5fCpG levels in male pronuclei, and the highest difference was up to 0.067% (Fig 5A and 5B). In contrast, in the remaining 5 pairs, female pronuclei had higher 5fCpG levels than did paired male pronuclei. That is, in some zygotes the 5fCpG levels of the paternal genome were higher than those of the maternal genome, whereas in others the opposite was true. To a certain extent, the differences in both intergenic and intragenic regions collectively contributed to the differences in the whole genome (Fig 5C). However, the difference in 5fCpG levels in enhancers and promoters between paired parental nuclei was nonsignificant. The most drastic differences in paired pronuclei resulted from the repeat elements, LINEs and ALRs (Fig 5D). In particular, the subfamilies L1 and ERVK showed the most significant differences. The variable differences of 5fCpG levels in paired pronuclei were completely different from those in mice, mainly resulting from the difference in 5fCpG on repeat elements.
5fCpG levels in paired two-cell embryos
For paired blastomeres in two-cell embryos, we compared 5fCpG levels between two blastomeres from the same embryo. Three pairs of two-cell blastomeres showed different 5fCpG levels, and one pair showed similar levels (S6A Fig). Additionally, the difference in 5fCpG levels was maintained in enhancers, promoters, exons, and introns (S6B Fig). For subfamilies of repeat elements, these four pairs of blastomeres showed similar 5fCpG levels on Alu, MIR, L2, ERV1, ERVL, and ERVL-MaLR (S6C Fig). However, the difference in 5fCpG distribution was mainly exhibited in L1 and ERVK, subfamilies of LINEs, and LTRs, respectively, which resembled the situation in paired pronuclei (Fig 5D).
Regulatory elements enriched in 5fC-marked regions
Taking advantage of the DNase-seq data of hESCs as putative DHS of human pluripotent ICM at the blastocyst stage [34,35], we analyzed the chromatin state surrounding 5fC sites. Throughout all developmental stages we analyzed, compared with its flanking regions, the center of 5fC sites showed relatively less chromatin accessibility (Fig 6A). For the proximal 5fC sites (within TSS ± 2 kb), this phenomenon was even more prominent. However, the shore regions (approximately 1392 base pairs [bp] upstream or downstream of TSS) around the center showed strong DHS signals (Fig 6A). Using the chromatin overall omic-scale landscape sequencing (scCOOL-seq) data of human early embryos [7], we analyzed the 5fCpG levels on the nucleosome-occupied regions and nucleosome-depleted regions (NDRs; Fig 6B). Interestingly, the 5fCpG level in the nucleosome-occupied regions was higher than that in NDRs in most developmental stages, such as the oocyte (p = 1.5 × 10−2), two-cell (p = 2.2 × 10−2), four-cell (p = 1.2 × 10−2), ICM (p = 1.3 × 10−3), and hESC (p = 5.9 × 10−6) stages. This finding is compatible with previously published results showing that 5fCpG could interact with histones and organize nucleosome positioning [18,36].
According to previous studies, 5fCpG on tissue-specific enhancers is associated with the up-regulation of the expression of corresponding genes [18,37]. We performed motif analysis of distal and proximal 5fCpG-marked genomic regions to identify the transcription factors enriched in these regulatory regions. In the distal 5fCpG-marked regions (2 kb away from TSS), the binding motif of the enhancer-associated protein EP300, which was expressed across human preimplantation development, acts as a histone acetyltransferase to regulate transcription, and is vital for embryonic development and cell proliferation [38,39], showed consistent enrichment (Fig 6C). The binding motif of EWSR1, which is involved in mitosis [40] and neuronal morphology [41], was strongly enriched in distal 5fCpG-marked regions, with gradually increased expression across all stages analyzed. ZNF136, ZNF281, and ZNF675, members of the zinc-finger family, also exhibited significant enrichment in these regions. In addition, the binding motifs of pluripotency transcription factors, such as POU5F1 and SOX2, were significantly enriched. Moreover, the binding motifs of several transcription factors that determine cell fate during early lineage differentiation, such as SOX4, SOX15, and GATA6, were also enriched in these distal 5fC-marked regions. For the proximal 5fCpG-marked genomic regions (TSS ± 2 kb), the binding sequences of transcription factors regulating the cell cycle and mitosis, such as E2F3 and EGR1, were enriched (Fig 6D). Additionally, the binding motif of MAZ, a transcription factor regulating transcription complex formation that was expressed in all stages analyzed, was consistently enriched in the proximal 5fCpG-marked regions across human early embryonic development. Collectively, these analyses demonstrated that 5fCpG-marked regulatory regions tended to harbor binding sites of important transcription factors during human early embryonic development.
Discussion
After fertilization, embryos undergo drastic epigenomic reprogramming, especially TET-mediated DNA demethylation. Here, we used single-cell CLEVER-seq to depict the DNA formylation profiles of human early embryos. During preimplantation development in humans, pronuclei (16–18 h after intracytoplasmic sperm injection [ICSI]) exhibited the highest 5fCpG levels. Male pronuclei and female pronuclei showed comparable 5fCpG levels, although the paternal genome experienced a much more extensive demethylation process from sperm to male pronuclei compared to the maternal genome [12]. Furthermore, we analyzed 10 pairs of pronuclei from the same zygotes and found variable 5fCpG content in pronuclei. That is, in some zygotes, 5fCpG levels were higher in male pronuclei than in female ones, whereas in other zygotes it was the opposite. Different from mice, in 5 pairs of human pronuclei, the 5fCpG level in male pronuclei was higher than that in paired female pronuclei; the remaining 5 pairs showed the opposite pattern, in which female pronuclei had a higher 5fCpG level than did male pronuclei. Because TDG is expressed at extremely low levels in human early embryos, it may not account for the inconsistent 5fCpG level in paired pronuclei [5,42]. Using immunostaining, our previous work showed that the 5hmC distribution in male and female pronuclei of humans was also inconsistent: although the majority of male pronuclei showed stronger 5hmC signals, some female pronuclei still exhibited higher 5hmC fluorescence intensity than the paired male ones [5]. In mouse zygotes, the 5fC and 5hmC levels in the paternal genome were always higher than those in the maternal genome in the same embryo [22,43]. Different from mice, the extent of active demethylation between parental genomes in human zygotes may vary in an embryo-specific manner due to the complicated genetic background of human embryos.
The obvious difference in 5fCpG in paired pronuclei came from the 5fCpG in repeat element L1 and ERVK, the subfamilies of LINEs and LTRs. In addition to pronuclei, early embryos in other stages showed relatively higher 5fCpG abundance in L1 and ERVK. 5fC could reduce the specificity of the substrate to RNA polymerase II; thus, the high level of 5fC in those repeat elements may reduce their transcription to improve genome stability [44,45].
The formyl group of 5fC has high activity to react with the amine residues of protein to form the Schiff base complex [18]. The transcription regulators selectively binding to 5fC had been identified by proteomics screening, such as FOXK1, FOXP1, and FOXI3 [19]. In human embryos, we also found that the binding motifs of essential transcription factors, such as POU5F1 and SOX2 to maintain pluripotency and E2F3 and EGR1 to regulate the cell cycle, were significantly enriched in 5fCpG-marked regulatory regions. In addition to transcription factors, the lysines of histone proteins can interact with the formyl group of 5fC [36,46]. Consistent with the above data, our data showed that the 5fCpG level in nucleosome-occupied regions was higher than that in nucleosome-depleted regions in many developmental stages. The function of 5fC in organizing nucleosome position and altering the structure of the DNA helix could contribute to transcription control [18,36,47]. Increasing evidence shows that 5fC does not only serve as an intermediate of active DNA demethylation but also participates in functional regulations [48]. In the future, functional work is needed to determine the comprehensive roles of 5fC in vivo.
For a better understanding of the complicated active DNA demethylation process after fertilization, the features of 5hmC and 5caC also remain to be investigated, which rely on the improvement of techniques to accurately quantify these DNA modifications [49–52]. In summary, our work presents the 5fC landscapes of human early embryos, which provides new insights into the complex regulation network during human preimplantation development.
Methods
Ethics statement
The research project was approved by the Reproductive Study Ethics Committee of Peking University Third Hospital (research license number 2017SZ-081), and the study strictly followed the guidelines set forth by the Ethics Committee. Oocytes were voluntarily donated by female donors at the Center for Reproductive Medicine, Peking University Third Hospital, with signed written informed consent. Sperm donation was provided by a healthy man with proven fertility who signed the written informed consent.
Standard in vitro fertilization (IVF) protocols for gamete and embryo collection, thawing, and culture were executed as previously described [53,54].
Collection of human gamete and early embryos
Gametes and the first polar body
After several rounds of washing with HTF medium (Life global), the swim-up sperm were collected for further processing. Motile single sperm with normal morphology was collected via micromanipulation. For the metaphase II oocyte, the granulosa cells were removed with the treatment of hyaluronidase (Sigma). The first polar bodies were collected by a micropipette with laser-assisted biopsy. The metaphase II oocytes without the first polar body were transferred into an acidic solution drop (1 μL 36% HCl diluted in 1 mL DPBS) to remove the zona pellucida. The oocytes were then washed several times in DPBS with 0.1% HSA by gentle pipetting to eliminate any somatic contamination.
Human preimplantation embryos
The assessment of embryo development and the scheduling of embryo collection were performed according to a previous study [55]. Following oocyte retrieval, oocytes were fertilized via ICSI and signs of fertilization were checked on the following day. Male and female pronuclei of zygotes were collected through careful micromanipulation after breaking the zona pellucida and the cytoplasmic membrane. The 2-cell and 4-cell embryos were collected at 27 h and 48 h post fertilization, respectively. The 8-cell and morula stage embryos were obtained at Day 3 and Day 4, respectively, according to their morphology. At Days 5 and 6, the ICM and TE of blastocysts were isolated with laser-assisted micromanipulation. Embryos with developmental retardation were excluded from this study. The embryos were treated with acidic solution (1 μL 36% HCl diluted in 1 mL DPBS) to remove the zona pellucida. The embryos without zona pellucida were then washed several times by gentle pipetting in DPBS with 0.1% HSA. Subsequently, a 1:1 mixture of Accutase (Sigma) and 0.25% trypsin was used to digest the embryo into single cells in a humidified incubator for 20 to 50 min with 5% CO2 at 37°C. Then single blastomeres were carefully washed in DPBS with 0.1% HSA before being transferred into lysis buffer.
hESC culture
The hESCs (H9) were purchased from WiCell Institute. hESCs were cultured on hESC-Qualified Matrix (Corning) under feeder-free conditions in complete E8 culture medium. With regular passaging, we collected the colonies of hESCs and digested them into a single-cell suspension using Accutase (Sigma) at 37°C for 1 h. Then the single hESCs were used for CLEVER-seq library construction.
CLEVER-seq library construction
The CLEVER-seq experiment was carried out according to the previous study with slight modification [22]. The single cells of human early embryos were transferred into lysis buffer (20 mM Tris, 2 mM EDTA, 20 mM KCl, 0.3% Triton X-100, and 1 mg/mL QIAGEN protease) by mouth pipette. After centrifugation (1 min at 7,000 r.p.m., 4°C), the sample was incubated at 50°C for 3 h to release the genomic DNA, followed by 70°C for 30 min to inactivate the protease. Then a trace amount of spike-in DNA containing 5fC modification was added to the sample. To label the 5fC, 150 mM malononitrile (J&K) was used. Mineral oil (15 μL) was added onto the surface of the sample to prevent evaporation. The sample was incubated at 37°C for 20 h with constant shaking at 850 r.p.m. in a Thermomixer (Eppendorf). Then the single-cell genome was amplified by MALBAC (Yikon Genomics) according to the user guide. After amplification, the genomic DNA was fragmented into approximately 300 bp with a Covaris S2. The library construction was performed using the KAPA Hyper Prep Kit (Kapa Biosystems). The libraries were sequenced on the Illumina HiSeq 4000 platform in paired-end 150-bp mode (Novogene).
Genomic DNA sequencing
To differentiate the male and female pronuclei, genomic DNA from the peripheral blood and sperm pellet of the donor couple was sequenced to call SNPs. The genomic DNA was extracted using the DNeasy Blood and Tissue Kit (Qiagen). About 500 ng of DNA was fragmented into 300 bp with a Covaris S2. The libraries were constructed using the KAPA Hyper Prep Kit (Kapa Biosystems) and sequenced in the same way as the CLEVER-seq libraries.
Immunostaining of 5fC and 5mC in human zygote
The immunostaining was carried out according to a previous study with slight modification [5]. Briefly, the zona pellucida of human zygotes was removed by treatment with acidic solution (1 μL 36% HCl diluted in 1 mL DPBS). After several washes in DPBS, the sample was fixed in 4% paraformaldehyde for 30 min at room temperature. After washing with DPBS containing 0.1% HSA, the membrane was permeabilized in 1% Triton X-100 at room temperature for 15 min. For detection of 5fC and 5mC, the DNA of embryos was denatured with 4N HCl at room temperature for 20 min and then neutralized for 10 min with 100 mM Tris-HCl. The sample was blocked in 1% goat serum at room temperature for 1.5 h. The primary antibodies against 5fC (Active Motif, Cat#61223) and 5mC (Eurogentec, Cat#BI-MECY-1000) were diluted at a ratio of 1:500. The sample was incubated in the diluted antibody buffer for 3 h at room temperature. After washing, the sample was incubated with Alexa Fluor 594 goat anti-rabbit IgG (1:500, Invitrogen, Cat#A-11012) and Alexa Fluor 488 goat anti-mouse IgG (1:500, Invitrogen, Cat#A-11001) for 1 h at room temperature. All the images were captured and analyzed using Nikon A1R high-speed laser confocal microscopy.
Reads quality control and alignment
We used Trim Galore (version 0.3.3) to remove low-quality bases, adaptor, and MALBAC primer sequences. Then, reads that passed quality control were mapped to the human reference genome (version: hg19) using Bismark (version 0.13.0) [56]. PCR duplicates were then removed with the command ‘samtools rmdup’ (version 0.1.18) [57]. A 137-bp model DNA containing 5fC modification (model 1 or model 2; information is provided in S2 Table) and lambda DNA were added to the same library to estimate the C-to-T conversion rate and the random C-to-T conversion rate in each single-cell experiment.
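The two rates mentioned above reduce to the same ratio computed on different spike-ins: reads reporting T divided by reads reporting C or T. The following is a minimal sketch of that arithmetic; the function name and all read counts are hypothetical and serve only to show the calculation.

```python
def conversion_rate(n_t: int, n_c: int) -> float:
    """Fraction of reads reporting T among reads reporting C or T."""
    depth = n_t + n_c
    if depth == 0:
        raise ValueError("no informative reads at the queried positions")
    return n_t / depth


# Hypothetical read counts, for illustration only.
# Known 5fC positions in the model DNA: labeling (C-to-T) conversion rate.
spike_in_rate = conversion_rate(n_t=796, n_c=204)
# Unmodified cytosines in lambda DNA: random C-to-T conversion rate.
random_rate = conversion_rate(n_t=12, n_c=988)
```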
CNV detection with CLEVER-seq data
First, we selected euploid cells for downstream analysis. Briefly, read counts in nonoverlapping 1-Mb windows were used to infer CNVs with the R package HMMcopy [58], with GC and mappability correction.
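The window-counting step that precedes CNV inference can be sketched as follows. This is only an illustrative sketch in Python: the function name and the input format (chromosome name plus 0-based read start) are assumptions, and the GC and mappability correction and the segmentation performed by HMMcopy are not reproduced here.

```python
from collections import Counter

WINDOW_SIZE = 1_000_000  # 1-Mb nonoverlapping windows, as described in the text


def count_reads_per_window(read_positions, window=WINDOW_SIZE):
    """Count reads per (chromosome, window index).

    read_positions: iterable of (chrom, start) pairs with 0-based start
    coordinates (hypothetical input format chosen for this sketch).
    """
    counts = Counter()
    for chrom, start in read_positions:
        counts[(chrom, start // window)] += 1
    return counts


# Example: three reads in the first window of chr1, one in the second.
example_reads = [("chr1", 10), ("chr1", 999_999), ("chr1", 500_000),
                 ("chr1", 1_000_000)]
example_counts = count_reads_per_window(example_reads)
```

Windows with unusually high or low corrected counts relative to their neighbors would then be candidates for copy number gains or losses.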
Gender estimation of pronuclei
With the read counts of the sex chromosomes, we can estimate the gender of each embryo. In this way, we can unambiguously distinguish the male PN from the female PN in paired pronuclei having chromosomes XY, and also identify the male PN among unpaired pronuclei having chromosomes XY. For pronuclei having chromosomes XX, we used SNPs identified from parental genomic DNA to distinguish the male PN from the female PN. Using an established pipeline [12,59], parental genomic data were cleaned with Trim Galore (version 0.3.3) and mapped to the human reference genome hg19 with bwa (version 0.7.12) [60]. PCR duplicates were removed with Picard (version 1.126), and SNPs were then identified with GATK HaplotypeCaller [61].
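The read-count logic for paired pronuclei can be illustrated with a small sketch. The function names, the input format, and in particular the chrY fraction threshold of 0.2 are hypothetical choices made for this example; the text does not state the threshold that was actually used.

```python
def y_fraction(read_counts):
    """Fraction of sex-chromosome reads that map to chrY."""
    x_reads = read_counts.get("chrX", 0)
    y_reads = read_counts.get("chrY", 0)
    total = x_reads + y_reads
    return y_reads / total if total else 0.0


def assign_paired_pronuclei(pn_a, pn_b, min_y_fraction=0.2):
    """Label two pronuclei from the same zygote by parental origin.

    pn_a, pn_b: dicts mapping chromosome name to read count.
    min_y_fraction is a hypothetical cut-off, not a value from the study.
    Returns a (label_a, label_b) tuple, or None when neither pronucleus
    carries chrY (an XX zygote), in which case parental SNPs are needed.
    """
    frac_a, frac_b = y_fraction(pn_a), y_fraction(pn_b)
    if max(frac_a, frac_b) < min_y_fraction:
        return None
    return ("male", "female") if frac_a > frac_b else ("female", "male")


xy_pair = assign_paired_pronuclei({"chrX": 10, "chrY": 90},
                                  {"chrX": 100, "chrY": 1})
xx_pair = assign_paired_pronuclei({"chrX": 100}, {"chrX": 80, "chrY": 1})
```

A return value of None marks the XX case, for which the parental SNP calls described above would be used instead.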
5fCpG site identification and 5fCpG level estimation
Following a previous study [22], we restricted 5fC calling to CpG sites without known SNPs (dbSNP database, hg19 version 135). All CpG sites with a C-to-T conversion rate ≥0.65 in at least two treated cells were grouped as the candidate pool to avoid PCR amplification errors, whereas CpG sites with a C-to-T conversion rate ≥0.65 in control cells were grouped as the candidate noise pool to avoid background noise. For each treated single cell, CpG sites covered ≥3 times with a C-to-T conversion rate ≥0.65 that were in the candidate pool but not in the noise pool were identified as candidate 5fCpG sites in that cell. To calculate the probability of observing a 5fCpG by chance, we used binomial distribution estimation with ‘dbinom’ in R, with the C-to-T conversion rate estimated from lambda DNA, according to a previous study [62]. For each site, the number of reads supporting ‘T’ was denoted ‘NT’, the number of reads supporting ‘C’ was denoted ‘NC’, and the sequencing depth was NT+NC. Candidate sites with a Holm–Bonferroni-adjusted p < 0.01 were retained as the final 5fCpG sites in that cell for further analysis. CpG sites covered ≥3 times with a C-to-T conversion rate ≤0.25 were identified as unmodified C sites. To model false positives, the negative control single cells (those not treated with malononitrile) were analyzed with the same procedures. For both the treated and untreated samples, the number of called 5fCpG sites was divided by the total number of sequenced CpG sites (defined as ‘5fCpG abundance’). The false-positive detection rate was calculated by dividing the 5fCpG abundance of the untreated samples by that of the treated samples. At the C-to-T cut-off of 0.65, a false-positive detection rate of 8% was observed in hESC samples.
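The per-cell calling rule can be sketched in Python as follows. This is a simplified illustration, not the authors' code (the analysis in the study used ‘dbinom’ in R): it assumes the binomial point probability of NT converted reads out of NT+NC is used directly as the p value, and it omits the candidate-pool and noise-pool filters. All function names and the input format are hypothetical.

```python
from math import comb


def dbinom(k, n, p):
    """Binomial point probability, analogous to R's dbinom(k, n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def holm_adjust(pvalues):
    """Holm-Bonferroni step-down adjustment, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvalues[idx]))
        adjusted[idx] = running_max
    return adjusted


def call_5fcpg(sites, random_rate, min_depth=3, min_conv=0.65, alpha=0.01):
    """sites: dict of site id -> (NT, NC). Returns ids retained as 5fCpG."""
    candidates = []
    for site, (nt, nc) in sites.items():
        depth = nt + nc
        if depth >= min_depth and nt / depth >= min_conv:
            candidates.append((site, dbinom(nt, depth, random_rate)))
    adjusted = holm_adjust([p for _, p in candidates])
    return [site for (site, _), q in zip(candidates, adjusted) if q < alpha]


def false_positive_rate(called_untreated, cpg_untreated,
                        called_treated, cpg_treated):
    """Ratio of 5fCpG abundance in untreated versus treated samples."""
    return (called_untreated / cpg_untreated) / (called_treated / cpg_treated)


example_sites = {"chr1:100": (10, 0), "chr1:200": (2, 1),
                 "chr1:300": (1, 9), "chr1:400": (2, 0)}
example_calls = call_5fcpg(example_sites, random_rate=0.0115)
```

In this toy input, the third site fails the conversion-rate cut-off and the fourth fails the depth cut-off, so only the first two sites are tested against the random conversion rate.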
For the random C-to-T rate of lambda DNA, we selected the C sites covered ≥4 times in an individual cell sample to avoid sequencing errors and counted the number of T readouts at these C sites. The random C-to-T rate was calculated as the number of T readouts divided by the total number of these C sites sequenced. Based on lambda DNA, the average false-positive rate in the 130 treated samples was 1.15%.
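A minimal Python sketch of this calculation is given below. It assumes that "the total number of these C sites sequenced" means the total read coverage at the selected lambda cytosines; the function name and the per-site (T reads, C reads) input format are hypothetical.

```python
def random_c_to_t_rate(lambda_sites, min_depth=4):
    """Estimate the random C-to-T rate from lambda DNA spike-in reads.

    lambda_sites: iterable of (t_reads, c_reads) per lambda cytosine.
    Only sites covered at least min_depth times contribute.
    """
    t_total = 0
    depth_total = 0
    for t_reads, c_reads in lambda_sites:
        depth = t_reads + c_reads
        if depth >= min_depth:
            t_total += t_reads
            depth_total += depth
    if depth_total == 0:
        return None  # no lambda cytosine passed the coverage filter
    return t_total / depth_total


# The third site has depth 3 and is excluded by the coverage filter.
example_rate = random_c_to_t_rate([(0, 10), (1, 9), (1, 2)])
```

The resulting rate is the per-read probability that would be passed to the binomial model when testing candidate 5fCpG sites.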
Clustering and the variance of the 5fCpG level were calculated in 1-kb windows across the genome, and only windows with 5fCpG sites identified in at least one stage were retained for this analysis. The motif enrichment analysis was done with HOMER (version 4.10.3) based on regions spanning 100 bp upstream to 100 bp downstream of the 5fCpG sites identified in each stage.
The global or genomic-region 5fCpG ratio was estimated as the number of 5fCpG sites divided by the sum of the numbers of 5fCpG and unmodified C sites. The genomic locations of exons, introns, CGIs, 5′UTRs, 3′UTRs, and repeat elements and their subfamilies were downloaded from the UCSC genome browser. Promoters were defined as the regions from 1 kb upstream to 0.5 kb downstream of the TSS and were grouped into three classes based on CpG density, that is, HCP, ICP, and LCP [63]. The significance of differences in 5fCpG level between genomic regions was calculated with a two-tailed Student's t test.
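The ratio and the promoter definition can be written out as a short Python sketch. The function names are hypothetical, and the strand-aware handling of the promoter window is an assumption about how "upstream" and "downstream" were applied to genes on the minus strand.

```python
def fcpg_ratio(n_5fcpg, n_unmodified):
    """5fCpG ratio: 5fCpG sites over 5fCpG plus unmodified C sites."""
    total = n_5fcpg + n_unmodified
    if total == 0:
        return None  # region has no informative CpG sites
    return n_5fcpg / total


def promoter_interval(tss, strand, upstream=1000, downstream=500):
    """Promoter as 1 kb upstream to 0.5 kb downstream of the TSS.

    For minus-strand genes, upstream lies at higher coordinates, so the
    window is mirrored around the TSS. Returns (start, end).
    """
    if strand == "+":
        return max(0, tss - upstream), tss + downstream
    return max(0, tss - downstream), tss + upstream


plus_promoter = promoter_interval(10_000, "+")
minus_promoter = promoter_interval(10_000, "-")
example_ratio = fcpg_ratio(5, 95)
```

The same ratio function applies to any annotated region (exon, intron, CGI, repeat subfamily) once the 5fCpG and unmodified C sites falling inside it have been counted.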
The published data sets used in this study were downloaded from the Gene Expression Omnibus (GEO) with the following accession numbers: GSE100272 (scCOOL-seq of human preimplantation embryos), GSE36552 (scRNA-seq data of human preimplantation embryos), GSE32970 (DNase-seq peak and signal of human ESCs), GSE29611 (ChIP-seq of human ESCs), GSE61475 (transcription factor binding sites in human ESCs).
DNA methylation levels and RNA expression levels of 5fCpG marked repeats
We used the read counts located in each 5fC-marked repeat region and the RPKM (reads per kilobase of transcript per million mapped reads) method to estimate the expression levels of repeat elements. As a control, we randomly selected the same number of L1 or ERVK regions in which only unmodified CpG sites were covered and calculated the corresponding DNA methylation levels or RNA expression levels. The repeat annotation was downloaded from the hg19 RepeatMasker track at UCSC. The RNA-seq data are from GSE36552, and the DNA methylation data are from GSE81233.
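The RPKM normalization and the size-matched random control can be sketched as follows. This is a minimal illustration in Python; the function names, the fixed random seed, and the representation of regions as simple identifiers are assumptions made for the example.

```python
import random


def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads."""
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


def pick_control_regions(unmodified_regions, n_marked, seed=0):
    """Randomly draw as many control regions as there are marked regions.

    unmodified_regions: list of repeat regions (for example L1 or ERVK
    copies) in which only unmodified CpG sites were covered.
    The seed is a hypothetical choice to make the draw reproducible.
    """
    rng = random.Random(seed)
    return rng.sample(unmodified_regions, n_marked)


# 100 reads on a 2-kb repeat copy in a library of 10 million mapped reads.
example_rpkm = rpkm(100, 2000, 10_000_000)
example_controls = pick_control_regions(["r1", "r2", "r3", "r4", "r5"], 3)
```

Comparing the RPKM distribution of the marked regions with that of the control draw indicates whether 5fCpG-marked repeats differ in expression from unmarked copies of the same subfamily.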
t-SNE and PCA analysis of 5fCpG sites
The 5fCpG percentage in 1-kb, 10-kb, 100-kb, 1-Mb, or 10-Mb windows was calculated as the number of 5fCpG sites divided by the sum of unmodified CpG and 5fCpG sites. Only windows containing a 5fCpG modification in at least one single cell were considered when performing dimensionality reduction analysis. PCA was performed with the ‘pcaMethods’ package in R, and t-SNE analysis was performed with the ‘tsne’ package in R.
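How the input matrix for dimensionality reduction could be assembled is sketched below in Python. The function name and the per-cell input format are hypothetical, and uncovered windows are represented as None here; how missing values were handled before PCA and t-SNE is not stated in the text.

```python
from collections import defaultdict


def window_matrix(cells, window=1000):
    """Build a cell-by-window matrix of 5fCpG fractions.

    cells: dict of cell id -> list of (chrom, pos, is_5fc) CpG calls,
    where is_5fc is True for 5fCpG and False for unmodified CpG.
    Only windows with a 5fCpG in at least one cell are kept.
    Returns (windows, matrix); matrix maps cell id -> list of fractions,
    with None for windows not covered in that cell.
    """
    per_cell = {}
    marked = set()
    for cell, sites in cells.items():
        counts = defaultdict(lambda: [0, 0])  # window -> [5fCpG, unmodified]
        for chrom, pos, is_5fc in sites:
            key = (chrom, pos // window)
            counts[key][0 if is_5fc else 1] += 1
            if is_5fc:
                marked.add(key)
        per_cell[cell] = counts
    windows = sorted(marked)
    matrix = {}
    for cell, counts in per_cell.items():
        row = []
        for key in windows:
            if key in counts:
                n_5fc, n_unmod = counts[key]
                row.append(n_5fc / (n_5fc + n_unmod))
            else:
                row.append(None)
        matrix[cell] = row
    return windows, matrix


example_cells = {
    "c1": [("chr1", 100, True), ("chr1", 200, False), ("chr1", 1500, False)],
    "c2": [("chr1", 150, False), ("chr1", 2500, True)],
}
example_windows, example_matrix = window_matrix(example_cells)
```

The window size argument can be set to 10,000, 100,000, and so on to reproduce the other resolutions listed above.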
Code availability
All data analysis was conducted in Perl, Python, and R. All computational code used in this study is available from the corresponding authors upon request.
Supporting information
Acknowledgments
We are very grateful to Prof. C. Yi for generously providing the model DNA containing the 5fC modification as a spike-in. We thank Dr. C. Zhu and Mr. B. He for advice on the bioinformatics analysis, and Dr. H. Zeng and Mr. D. Bai for their experimental suggestions. Some parts of the bioinformatics analyses were performed on the Computing Platform at the Center for Life Science.
Abbreviations
- bp
base pairs
- CGI
CpG island
- ChIP-seq
chromatin immunoprecipitation sequencing
- CLEVER-seq
chemical-labeling-enabled C-to-T conversion sequencing
- CNV
copy number variation
- CpG
cytosine phosphate guanine
- CTCF
CCCTC-binding factor
- dbSNP
database of single nucleotide polymorphism
- DHS
DNase I hypersensitive sites
- ERVK
endogenous retrovirus-K
- GEO
Gene Expression Omnibus
- HB
Holm–Bonferroni
- HCP
high-density CpG promoter
- hESC
human embryonic stem cell
- ICM
inner cell mass
- ICP
intermediate-density CpG promoter
- ICSI
intracytoplasmic sperm injection
- IVF
in vitro fertilisation
- LCP
low-density CpG promoter
- LINE
long interspersed nuclear element
- LTR
long terminal repeat
- MALBAC
multiple annealing- and looping-based amplification cycles
- MIR
mammalian-wide interspersed repeat
- NDR
nucleosome-depleted region
- PCA
principal component analysis
- PCR
polymerase chain reaction
- RPKM
reads per kilobase of transcript per million mapped reads
- scCOOL-seq
single-cell chromatin overall omic-scale landscape sequencing
- SINE
short interspersed nuclear element
- SNP
single nucleotide polymorphism
- SVA
SINE/variable number of tandem repeats/Alu
- TE
trophectoderm
- TET
ten-eleven translocation protein family
- t-SNE
t-distributed stochastic neighbor embedding
- TTS
transcriptional termination site
- UTR
untranslated region
- 5caC
5-carboxylcytosine
- 5fC
5-formylcytosine
- 5fCpG
5-formylcytosine phosphate guanine
- 5hmC
5-hydroxymethylcytosine
- 5mC
5-methylcytosine
Data Availability
The raw data of CLEVER-seq are deposited at the Genome Sequence Archive for Human (GSA-Human) under the accession code HRA000194, while the processed data are deposited at the Gene Expression Omnibus (GEO) under the accession code GSE124035.
Funding Statement
This project was supported by Beijing Municipal Science & Technology Commission (Z181100001318001 to F.T.). This project was also supported by grants from National Key R&D Program of China (2018YFC1003100 to L.W., 2018YFA0107601 and 2017YFA0102702 to F.T.). This project was also supported by grants from the National Natural Science Foundation of China (81521002 to J.Q. and F.T., 81730038 and 31871482 to J.Q., 31571544 and 31871447 to L.Y., 31625018 to F.T.). The work was also supported by the Beijing Advanced Innovation Center for Genomics at Peking University (F.T. and J.Q.). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Hackett JA, Surani MA. Beyond DNA: programming and inheritance of parental methylomes. Cell. 2013;153(4):737–9. doi:10.1016/j.cell.2013.04.044
- 2. Saitou M, Kagiwada S, Kurimoto K. Epigenetic reprogramming in mouse pre-implantation development and primordial germ cells. Development. 2012;139(1):15–31. doi:10.1242/dev.050849
- 3. Burton A, Torres-Padilla ME. Chromatin dynamics in the regulation of cell fate allocation during early embryogenesis. Nature reviews Molecular cell biology. 2014;15(11):723–34. doi:10.1038/nrm3885
- 4. Li E. Chromatin modification and epigenetic reprogramming in mammalian development. Nature reviews Genetics. 2002;3(9):662–73. doi:10.1038/nrg887
- 5. Guo H, Zhu P, Yan L, Li R, Hu B, Lian Y, et al. The DNA methylation landscape of human early embryos. Nature. 2014;511(7511):606–10. doi:10.1038/nature13544
- 6. Smith ZD, Chan MM, Humm KC, Karnik R, Mekhoubad S, Regev A, et al. DNA methylation dynamics of the human preimplantation embryo. Nature. 2014;511(7511):611–5. doi:10.1038/nature13581
- 7. Li L, Guo F, Gao Y, Ren Y, Yuan P, Yan L, et al. Single-cell multi-omics sequencing of human early embryos. Nature cell biology. 2018.
- 8. Wu J, Xu J, Liu B, Yao G, Wang P, Lin Z, et al. Chromatin analysis in human early development reveals epigenetic transition during ZGA. Nature. 2018;557(7704):256–60. doi:10.1038/s41586-018-0080-8
- 9. Gao L, Wu K, Liu Z, Yao X, Yuan S, Tao W, et al. Chromatin Accessibility Landscape in Human Early Embryos and Its Association with Evolution. Cell. 2018;173(1):248–59 e15. doi:10.1016/j.cell.2018.02.028
- 10. Okae H, Chiba H, Hiura H, Hamada H, Sato A, Utsunomiya T, et al. Genome-wide analysis of DNA methylation dynamics during early human development. PLoS Genet. 2014;10(12):e1004868. doi:10.1371/journal.pgen.1004868
- 11. Fulka H, Mrazek M, Tepla O, Fulka J Jr. DNA methylation pattern in human zygotes and developing embryos. Reproduction. 2004;128(6):703–8. doi:10.1530/rep.1.00217
- 12. Zhu P, Guo H, Ren Y, Hou Y, Dong J, Li R, et al. Single-cell DNA methylome sequencing of human preimplantation embryos. Nature genetics. 2018;50(1):12–9. doi:10.1038/s41588-017-0007-6
- 13. Tahiliani M, Koh KP, Shen Y, Pastor WA, Bandukwala H, Brudno Y, et al. Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science. 2009;324(5929):930–5. doi:10.1126/science.1170116
- 14. Ito S, D'Alessio AC, Taranova OV, Hong K, Sowers LC, Zhang Y. Role of Tet proteins in 5mC to 5hmC conversion, ES-cell self-renewal and inner cell mass specification. Nature. 2010;466(7310):1129–33. doi:10.1038/nature09303
- 15. Globisch D, Munzel M, Muller M, Michalakis S, Wagner M, Koch S, et al. Tissue distribution of 5-hydroxymethylcytosine and search for active demethylation intermediates. PLoS ONE. 2010;5(12):e15367. doi:10.1371/journal.pone.0015367
- 16. Ito S, Shen L, Dai Q, Wu SC, Collins LB, Swenberg JA, et al. Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. Science. 2011;333(6047):1300–3. doi:10.1126/science.1210597
- 17. Williams K, Christensen J, Helin K. DNA methylation: TET proteins-guardians of CpG islands? EMBO reports. 2011;13(1):28–35. doi:10.1038/embor.2011.233
- 18. Raiber EA, Portella G, Martinez Cuesta S, Hardisty R, Murat P, Li Z, et al. 5-Formylcytosine organizes nucleosomes and forms Schiff base interactions with histones in mouse embryonic stem cells. Nature chemistry. 2018;10(12):1258–66. doi:10.1038/s41557-018-0149-x
- 19. Iurlaro M, Ficz G, Oxley D, Raiber EA, Bachman M, Booth MJ, et al. A screen for hydroxymethylcytosine and formylcytosine binding proteins suggests functions in transcription and chromatin regulation. Genome biology. 2013;14(10):R119. doi:10.1186/gb-2013-14-10-r119
- 20. Song CX, He C. Potential functional roles of DNA demethylation intermediates. Trends in biochemical sciences. 2013;38(10):480–4. doi:10.1016/j.tibs.2013.07.003
- 21. Iurlaro M, McInroy GR, Burgess HE, Dean W, Raiber EA, Bachman M, et al. In vivo genome-wide profiling reveals a tissue-specific role for 5-formylcytosine. Genome biology. 2016;17(1):141. doi:10.1186/s13059-016-1001-5
- 22. Zhu C, Gao Y, Guo H, Xia B, Song J, Wu X, et al. Single-Cell 5-Formylcytosine Landscapes of Mammalian Early Embryos and ESCs at Single-Base Resolution. Cell stem cell. 2017;20(5):720–31 e5. doi:10.1016/j.stem.2017.02.013
- 23. Pfaffeneder T, Hackner B, Truss M, Munzel M, Muller M, Deiml CA, et al. The discovery of 5-formylcytosine in embryonic stem cell DNA. Angewandte Chemie. 2011;50(31):7008–12. doi:10.1002/anie.201103899
- 24. Xia B, Han D, Lu X, Sun Z, Zhou A, Yin Q, et al. Bisulfite-free, base-resolution analysis of 5-formylcytosine at the genome scale. Nature methods. 2015;12(11):1047–50. doi:10.1038/nmeth.3569
- 25. Booth MJ, Marsico G, Bachman M, Beraldi D, Balasubramanian S. Quantitative sequencing of 5-formylcytosine in DNA at single-base resolution. Nature chemistry. 2014;6(5):435–40. doi:10.1038/nchem.1893
- 26. Wagner M, Steinbacher J, Kraus TF, Michalakis S, Hackner B, Pfaffeneder T, et al. Age-dependent levels of 5-methyl-, 5-hydroxymethyl-, and 5-formylcytosine in human and mouse brain tissues. Angewandte Chemie. 2015;54(42):12511–4. doi:10.1002/anie.201502722
- 27. Zong C, Lu S, Chapman AR, Xie XS. Genome-wide detection of single-nucleotide and copy-number variations of a single human cell. Science. 2012;338(6114):1622–6. doi:10.1126/science.1229164
- 28. Su M, Kirchner A, Stazzoni S, Muller M, Wagner M, Schroder A, et al. 5-Formylcytosine Could Be a Semipermanent Base in Specific Genome Sites. Angewandte Chemie. 2016;55(39):11797–800. doi:10.1002/anie.201605994
- 29. Smallwood SA, Lee HJ, Angermueller C, Krueger F, Saadeh H, Peat J, et al. Single-cell genome-wide bisulfite sequencing for assessing epigenetic heterogeneity. Nature methods. 2014;11(8):817–20. doi:10.1038/nmeth.3035
- 30. Consortium EP. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489(7414):57–74. doi:10.1038/nature11247
- 31. Tsankov AM, Gu H, Akopian V, Ziller MJ, Donaghey J, Amit I, et al. Transcription factor binding dynamics during human ES cell differentiation. Nature. 2015;518(7539):344–9. doi:10.1038/nature14233
- 32. Percharde M, Lin CJ, Yin Y, Guan J, Peixoto GA, Bulut-Karslioglu A, et al. A LINE1-Nucleolin Partnership Regulates Early Development and ESC Identity. Cell. 2018;174(2):391–405 e19. doi:10.1016/j.cell.2018.05.043
- 33. Fang F, Hodges E, Molaro A, Dean M, Hannon GJ, Smith AD. Genomic landscape of human allele-specific DNA methylation. Proceedings of the National Academy of Sciences of the United States of America. 2012;109(19):7332–7. doi:10.1073/pnas.1201310109
- 34. Thurman RE, Rynes E, Humbert R, Vierstra J, Maurano MT, Haugen E, et al. The accessible chromatin landscape of the human genome. Nature. 2012;489(7414):75–82. doi:10.1038/nature11232
- 35. Consortium EP. A user's guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol. 2011;9(4):e1001046. doi:10.1371/journal.pbio.1001046
- 36. Li F, Zhang Y, Bai J, Greenberg MM, Xi Z, Zhou C. 5-Formylcytosine Yields DNA-Protein Cross-Links in Nucleosome Core Particles. Journal of the American Chemical Society. 2017;139(31):10617–20. doi:10.1021/jacs.7b05495
- 37. Raiber EA, Beraldi D, Ficz G, Burgess HE, Branco MR, Murat P, et al. Genome-wide distribution of 5-formylcytosine in embryonic stem cells is associated with transcription and depends on thymine DNA glycosylase. Genome biology. 2012;13(8):R69. doi:10.1186/gb-2012-13-8-r69
- 38. Yao TP, Oh SP, Fuchs M, Zhou ND, Ch'ng LE, Newsome D, et al. Gene dosage-dependent embryonic development and proliferation defects in mice lacking the transcriptional integrator p300. Cell. 1998;93(3):361–72. doi:10.1016/s0092-8674(00)81165-4
- 39. Visel A, Blow MJ, Li ZR, Zhang T, Akiyama JA, Holt A, et al. ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature. 2009;457(7231):854–8. doi:10.1038/nature07730
- 40. Embree LJ, Azuma M, Hickstein DD. Ewing sarcoma fusion protein EWSR1/FLI1 interacts with EWSR1 leading to mitotic defects in zebrafish embryos and human cell lines. Cancer research. 2009;69(10):4363–71. doi:10.1158/0008-5472.CAN-08-3229
- 41. Yoon Y, Park H, Kim S, Nguyen PT, Hyeon SJ, Chung S, et al. Genetic Ablation of EWS RNA Binding Protein 1 (EWSR1) Leads to Neuroanatomical Changes and Motor Dysfunction in Mice. Exp Neurobiol. 2018;27(2):103–11. doi:10.5607/en.2018.27.2.103
- 42. Yan L, Yang M, Guo H, Yang L, Wu J, Li R, et al. Single-cell RNA-Seq profiling of human preimplantation embryos and embryonic stem cells. Nature structural & molecular biology. 2013;20(9):1131–9.
- 43. Amouroux R, Nashun B, Shirane K, Nakagawa S, Hill PW, D'Souza Z, et al. De novo DNA methylation drives 5hmC accumulation in mouse zygotes. Nature cell biology. 2016;18(2):225–33. doi:10.1038/ncb3296
- 44. Kellinger MW, Song CX, Chong J, Lu XY, He C, Wang D. 5-formylcytosine and 5-carboxylcytosine reduce the rate and substrate specificity of RNA polymerase II transcription. Nature structural & molecular biology. 2012;19(8):831–3.
- 45. Kitsera N, Allgayer J, Parsa E, Geier N, Rossa M, Carell T, et al. Functional impacts of 5-hydroxymethylcytosine, 5-formylcytosine, and 5-carboxycytosine at a single hemi-modified CpG dinucleotide in a gene promoter. Nucleic acids research. 2017;45(19):11033–42. doi:10.1093/nar/gkx718
- 46. Ji S, Shao H, Han Q, Seiler CL, Tretyakova NY. Reversible DNA-Protein Cross-Linking at Epigenetic DNA Marks. Angewandte Chemie. 2017;56(45):14130–4. doi:10.1002/anie.201708286
- 47. Raiber EA, Murat P, Chirgadze DY, Beraldi D, Luisi BF, Balasubramanian S. 5-Formylcytosine alters the structure of the DNA double helix. Nature structural & molecular biology. 2015;22(1):44–9.
- 48. Bachman M, Uribe-Lewis S, Yang X, Burgess HE, Iurlaro M, Reik W, et al. 5-Formylcytosine can be a stable DNA modification in mammals. Nature chemical biology. 2015;11(8):555–7. doi:10.1038/nchembio.1848
- 49. Mooijman D, Dey SS, Boisset JC, Crosetto N, van Oudenaarden A. Single-cell 5hmC sequencing reveals chromosome-wide cell-to-cell variability and enables lineage reconstruction. Nature biotechnology. 2016;34(8):852–6. doi:10.1038/nbt.3598
- 50. Zeng H, He B, Xia B, Bai D, Lu X, Cai J, et al. Bisulfite-Free, Nanoscale Analysis of 5-Hydroxymethylcytosine at Single Base Resolution. Journal of the American Chemical Society. 2018;140(41):13190–4. doi:10.1021/jacs.8b08297
- 51. Schutsky EK, DeNizio JE, Hu P, Liu MY, Nabel CS, Fabyanic EB, et al. Nondestructive, base-resolution sequencing of 5-hydroxymethylcytosine using a DNA deaminase. Nature biotechnology. 2018.
- 52. Liu Y, Siejka-Zielinska P, Velikova G, Bi Y, Yuan F, Tomkova M, et al. Bisulfite-free direct detection of 5-methylcytosine and 5-hydroxymethylcytosine at base resolution. Nature biotechnology. 2019.
- 53. Chian RC, Lim JH, Tan SL. State of the art in in-vitro oocyte maturation. Current opinion in obstetrics & gynecology. 2004;16(3):211–9.
- 54. Sathananthan AH, Osianlis T. Human embryo culture and assessment for the derivation of embryonic stem cells (ESC). Methods in molecular biology. 2010;584:1–20. doi:10.1007/978-1-60761-369-5_1
- 55. Niakan KK, Han J, Pedersen RA, Simon C, Pera RA. Human pre-implantation embryo development. Development. 2012;139(5):829–41. doi:10.1242/dev.060426
- 56. Krueger F, Andrews SR. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics. 2011;27(11):1571–2. doi:10.1093/bioinformatics/btr167
- 57. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078–9. doi:10.1093/bioinformatics/btp352
- 58. Ha G, Roth A, Lai D, Bashashati A, Ding J, Goya R, et al. Integrative analysis of genome-wide loss of heterozygosity and monoallelic expression at nucleotide resolution reveals disrupted pathways in triple-negative breast cancer. Genome research. 2012;22(10):1995–2007. doi:10.1101/gr.137570.112
- 59. Li L, Guo F, Gao Y, Ren Y, Yuan P, Yan L, et al. Single-cell multi-omics sequencing of human early embryos. Nature cell biology. 2018.
- 60. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–60. doi:10.1093/bioinformatics/btp324
- 61. Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, Del Angel G, Levy-Moonshine A, et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Current protocols in bioinformatics. 2013;43:11.10.1–11.10.33.
- 62. Yu M, Hon GC, Szulwach KE, Song CX, Zhang L, Kim A, et al. Base-resolution analysis of 5-hydroxymethylcytosine in the mammalian genome. Cell. 2012;149(6):1368–80. doi:10.1016/j.cell.2012.04.027
- 63. Xie W, Schultz MD, Lister R, Hou Z, Rajagopal N, Ray P, et al. Epigenomic analysis of multilineage differentiation of human embryonic stem cells. Cell. 2013;153(5):1134–48. doi:10.1016/j.cell.2013.04.022