Stem Cells International. 2021 Feb 27;2021:6660936. doi: 10.1155/2021/6660936

The Regulation and Functions of Endogenous Retrovirus in Embryo Development and Stem Cell Differentiation

Yangquan Xiang 1, Hongqing Liang 1
PMCID: PMC7937486  PMID: 33727936

Abstract

Endogenous retroviruses (ERVs) are repetitive sequences in the genome, belonging to the retrotransposon family. Throughout life, ERVs are associated with multiple aspects of chromatin and transcriptional regulation in development and pathological conditions. In mammalian embryos, ERVs are extensively activated in early embryo development, but with a highly restricted spatial-temporal pattern, and they are drastically silenced during differentiation, with exceptions in extraembryonic tissue and germlines. The dynamic activation pattern of ERVs raises questions about how ERVs are regulated in the life cycle and whether they are functionally important to cell fate decision during early embryo and somatic cell development. In this review, we therefore focus on the evidence for the regulation and functions of ERVs during stem cell differentiation, which suggests that ERV activation is not a passive result of cell fate transition but an actively regulated epigenetic and transcriptional process during mammalian development and stem cell differentiation.

1. Introduction

ERVs belong to the Class I family of retrotransposon elements in the genome. Together with DNA transposons, they are known as transposable elements (TEs), which are derived from DNA fragments able to transpose within the genome. Owing to their capacity to move and copy themselves within the genome, TEs are considered one of the main driving forces in reshaping the genome during mammalian evolution. To date, most TEs have lost the ability to transpose [1, 2], presumably because transposition events can lead to genome instability. ERVs and other TE families used to be considered “junk DNA,” but with technological advances in genome-wide expression and epigenetic profiling, we have started to better appreciate their functional contribution to development and disease. We now understand that the complexity of the mammalian genome is achieved not through a significant increase in protein-coding sequences, but through the vast expansion of regulatory capacity imparted by non-coding sequences. TEs occupy nearly half of the non-coding genome and are thus thought to play critical roles in shaping the complexity of the mammalian gene regulatory network.

Compared with other repeat element families, such as short interspersed nuclear elements (SINEs) and long interspersed nuclear elements (LINEs), ERVs have greater sequence complexity and thus may play more specific regulatory roles in the genome [1, 3]. Although ERVs are the smallest class of the retrotransposon family, they are significantly enriched and over-represented in cell type-specific active regulatory sequences [4]. ERVs are thought to have been generated as by-products of retroviral infection and integration events in the ancestral mammalian genome. During evolution, they were endogenized and inherited through germline transmission [5]. Most ERVs are now tamed in the host genome through mutations of their transposition machinery or through coevolution of host regulatory factors that repress ERV activation [5]. A full-length ERV consists of two long terminal repeats (LTRs) flanking the 5′ and 3′ ends, with the open reading frames (GAG, POL, and ENV) in the center. It should be emphasized that LTRs are the regulatory elements of ERVs [6]. The LTR regions of ERVs possess binding sites for a broad range of transcription factors, which allows them to interact with the host gene regulatory machinery and achieve precise control of ERV activity [7]. Meanwhile, exaptation of the LTRs' cis-regulatory functions (enhancer and promoter) also leads to innovations in the transcription network of the host genome. ERVs also exclusively possess the primer binding site (PBS), which recruits a complementary tRNA to prime viral reverse transcription. PBS sequences have also been found to be binding sites for ERV silencing factors from the host [7]. Based on the similarity of the PBS region to tRNA sequences, ERVs can be further classified into several families, including ERVH, ERVW, ERVK, and ERVL. ERVs constitute about 8% of the human genome; of these, 90% exist as solitary ERVs in which only the LTR sequences remain and the viral protein-coding ERV-int regions have been lost [3].

The expression level of ERVs is dynamically regulated in early embryogenesis, differentiated tissues, and germ cells [8]. Interestingly, the expression of different ERV sub-families exhibits high temporal specificity during early human embryo development [8], suggesting that ERVs can serve as stringent markers for specific embryonic stages (Figure 1). In addition, multiple lines of evidence have demonstrated that abnormal ERV expression may lead to different types of diseases [9–13]. ERVs can affect genome-wide transcription through multiple layers of regulation, as discussed below. Thus, their activities should be tightly controlled in the mammalian genome to coordinate with proper development and cell fate decision. The precise control of ERVs in the host genome is achieved largely through transcriptional and epigenetic regulation. DNA methylation is considered a common regulatory mechanism to repress ERV expression. Many human ERVs are heavily methylated and silenced in differentiated tissues but show loss of methylation and aberrant expression in cancer [14]. Apart from DNA methylation, the Krüppel-associated box domain-containing zinc finger proteins (KRAB-ZFPs) are known to regulate chromatin configuration surrounding ERV elements [15]. ERV elements are bound by the zinc finger domains of the KRAB-ZFPs, and the KRAB domain can recruit tripartite motif-containing 28 (TRIM28), resulting in the trimethylation of histone H3 lysine 9 (H3K9me3) and ERV silencing in embryonic stem cells [16]. Histone deacetylation is also involved in ERV regulation. It has been found that histone deacetylase inhibitor (HDACi) treatment led to ERV9 activation, which prevented testicular cancer progression, but did not lead to upregulation of other ERV sub-families, implying that histone deacetylation may regulate human ERV silencing in a sub-family-specific manner [15, 17]. In general, it can be envisaged that a combination of different kinds of epigenetic modifications is orchestrated to tightly control ERV activity.

Figure 1. The dynamic regulation of ERVs during development. During embryonic and somatic development, ERVs are selectively activated, whereas aberrant activation or silencing of ERVs results in pathological consequences.

Over the last 10 years, increasing evidence has shown that LTRs may play under-recognized regulatory roles in mammalian development and diseases [9–13]. In the following sections, we discuss in detail the current knowledge on the functions of ERVs in chromatin and transcription regulation, how these functions are achieved, and how they contribute to cell fate decision during mammalian embryonic development and stem cell differentiation.

2. The Functions of ERV in Gene Regulation

If chromatin regulation is a symphony, then ERVs have several instruments to play. ERVs recruit transcription factors, work as alternative promoters, encode long non-coding RNAs (lncRNAs), and produce protein products to mediate cellular function. These abilities could stem from intrinsic functions of ERVs or could have been co-opted during coevolution with the host genome. In either case, the functions of ERVs have become an integral part of the regulatory machinery in the genome and are indispensable for the normal development and homeostasis of mammals.

2.1. The Recruitment of Transcription Factors

In silico mapping revealed that many ERVs are enriched with transcription factor-binding sites, suggesting that ERVs may act as cis-regulatory elements for transcription [4]. Putative epigenetic markers for promoters and enhancers, such as H3K4me3 and H3K27ac, are frequently seen on the LTR regions [11]. Activated ERVs are largely associated with cell type-specific open chromatin configuration. For example, in human pluripotent stem cells, the HERVH sub-family is enriched with binding sites for pluripotency transcription factors such as OCT4 and KLF4, as well as active histone modifications like H3K4me3 and H3K27ac, and adopts an open chromatin conformation [11, 18]. In addition, DUX4 and its mouse homolog DUX can bind to the ERVL sub-family in human and mouse, respectively. This leads to epigenetic activation of genes downstream of the ERVL elements, which are essential for initiating zygotic genome activation (ZGA) in early human and mouse embryos [19]. Human DUX4 is kept silenced in differentiated tissues, as aberrant activation of DUX4 in muscle tissue upregulates HERVL, leading to unscheduled transcriptional activation of early embryonic genes, which eventually results in facioscapulohumeral muscular dystrophy [10]. Together, these lines of evidence suggest that ERVs can recruit transcription factors to actively influence the epigenetic landscape in the nearby region, thus contributing to cell type-specific gene regulation.

Moreover, ERVs can also modulate signaling pathways to coordinate cell fate change. It has been found that ERVs shaped the evolution of the transcription network underlying the interferon response [20]. For instance, one of the ERV sub-families, MER41, is enriched with interferon-induced STAT1-binding sites [20]. STAT1-bound MER41 regions were enriched with H3K27ac upon interferon stimulation. The knockout of MER41 impaired the expression of interferon-induced genes such as AIM2, which senses cytosolic foreign DNA and activates inflammatory responses [20]. This suggests that ERVs can respond to the interferon signaling pathway and feed back to regulate innate immunity.

2.2. Alternative Promoters and Alternative Splicing

The LTR elements in ERVs possess intrinsic promoter activity to drive ERV expression. LTRs can also function as alternative promoters to drive host ORF expression. It has been estimated that up to 75% of human genes take advantage of alternative promoters to achieve tissue-specific regulation [21]. The employment of ERVs as alternative promoters not only results in stage- or tissue-specific gene expression patterns but also generates different protein isoforms [3, 21, 22]. In addition, ERVs are over-represented in regions close to protein-coding sequences, suggesting that they are closely related to transcription initiation in the genome [23]. For instance, MT2 of the mouse ERVL sub-family is highly activated in the mouse 2C embryo and functions as an alternative promoter to upregulate genes near MERVL, generating chimeric transcripts with junctions to MERVL elements [24]. As an example, Zfp352 has two promoters (P1 and P2) that are active in mouse early embryos and somatic cells, respectively [25–27]. Interestingly, the active promoter of Zfp352 in early embryos overlaps with MT2B1 repeats, indicating that the ERV promoter may be critical for the early activation of Zfp352 [25–27]. A recent large-scale transcriptomic analysis discovered that 23% of all protein-coding genes expressed in various cancer types possess at least two promoters that cause a significant tumor type-specific change in isoform expression [28]. For example, JAZF1 prefers the 3′ full-length promoter (prmtr.40310) in KIRP cancer, whereas in KIRC cancer, a truncated promoter (prmtr.40312) is favored [28].

The presence of alternative promoters not only leads to context-dependent gene activation but also creates alternative splicing variants of the transcripts [21]. Alternative splicing can occur in the retroviral RNA itself, which has been correlated with cancer initiation [9]. For example, the open reading frame of HERVK provides a source for alternative splicing, and the spliced variants of HERVK can be detected in various cancers, some of which are cancer type-specific [29]. The differentially expressed retroviral RNA isoforms raise the questions of how these isoforms are generated and what functional differences exist between them. Apart from retroviral isoforms, ERVs are also involved in generating alternatively spliced isoforms of coding genes. For instance, the upstream MER4A can be utilized as an alternative promoter for GTSO1, leading to the generation of 15 isoforms of GTSO1 that may function differently under different disease contexts [30].

2.3. ERV-Derived Long Non-coding RNA

Many ERVs can also encode lncRNAs. These lncRNAs are involved in various processes, such as recruiting transcription factors, cooperating with epigenetic regulators or modifiers, and interacting with miRNAs [31–33].

A few studies have demonstrated that ERV-derived lncRNAs can participate in signal transduction by regulating protein recruitment and protein degradation [34–37]. One of the ERV sub-family members, ALVE1, is transcribed into lnc-ALVE1-AS1, which activates the TLR3 signaling pathway in the cytoplasm and induces antiviral innate immunity [35]. In addition, transcriptome analysis revealed that a human ERV-derived lncRNA, termed TROJAN, binds to metastasis-repressing factors and promotes their degradation through a ubiquitin-associated signaling pathway [36], thus promoting breast cancer progression. Conversely, an antisense oligonucleotide repressing TROJAN markedly slows breast cancer progression in vivo, suggesting that TROJAN promotes cancer invasion and can serve as a potential therapeutic target [36].

2.4. ERV-Derived Proteins

In addition to RNAs, the proteins translated from ERVs can also perform specific functions under certain contexts. These proteins are derived from the open reading frames of ERVs, including GAG, POL, and ENV. The functions of these viral proteins are diverse [38–40]. For instance, the ENV protein from HERVK can upregulate the p-ERK1/2 and RAS signaling pathways in human pancreatic cancer, and knockdown of ENV suppressed the activity of the ERK signaling pathway [40]. Moreover, ENV proteins from HERVW and HERVFRD aid in trophectoderm cell fusion and facilitate mammalian embryo implantation into the uterus [41, 42], and the GAG protein produced by HERVK promotes prostate cancer progression by inducing androgen hormone release [38].

3. ERV in Stem Cell Differentiation

Embryonic development is initiated after fertilization, followed by zygote cleavage. In the early embryo cleavage stages, the zygotic genome is activated, accompanied by global remodeling and rewiring of the transcription network. Before the first cell fate segregation in late morula and blastocyst, cells in embryos retain the capacity to give rise to the complete embryo proper and are thus considered totipotent. In blastocyst, cells are committed to the outer layer trophectoderm and inner cell mass which gives rise to the pluripotent epiblast and differentiates into three germ layers and somatic tissues. Numerous genetic and epigenetic programs governing the embryo developmental processes have been revealed, but mostly focusing on the regulation of the coding genome. Non-coding elements such as ERVs are poorly understood in this context but are increasingly gaining attention. ERVs are extensively activated in early embryo development, with a highly restricted spatial-temporal pattern, and are drastically silenced during differentiation with exceptions of extraembryonic tissue and germlines (Figure 1). Here, we will focus on the functions and regulation of ERVs in a few key developmental stages and context to discuss the emergent roles of ERVs in chromatin regulation and stem cell differentiation.

3.1. ERV in Totipotency Regulation

During both mouse and human embryo development, ERVL subfamily is activated around ZGA but gradually silenced thereafter. It seems that ERVL is predominantly associated with the totipotent state. In mouse, transcripts from MERVL loci occupy 2% of the total mRNA in 2C embryo [24]. More than 307 genes were found to form chimeric transcripts with partial MERVL sequence [24]. These chimeric transcripts are mostly associated with metabolism and transcription regulation involved in mouse ZGA. For instance, in mouse 2C embryo, MT2-SPIN chimeric transcript excludes 3 exons at the N-terminus compared to the native isoform [43], resulting in the native and chimeric isoforms of SPIN that bear different phosphorylation sites by MAPK [43] and thus may mediate different signaling functions. MT2, together with partial MERVL-int sequence, is also a robust fluorescence reporter for 2C embryo as well as 2C-like cells in mouse embryonic stem cells (mESCs) [24]. MT2 also exhibits regulatory functions in activating distal 2C-specific genes. MT2 drives Zscan4 cluster gene expression in mouse 2C embryo, and the upregulation of Zscan4 can further activate MT2, resulting in DNA demethylation and open chromatin configuration to further activate 2C-specific genes nearby MT2 loci [44]. Interestingly, ectopic activation of MERVL by CRISPR activation system also resulted in the upregulation of 2C genes [45], implying that MERVL can act as a cis-regulatory element to control totipotent gene expression.

Similarly, in human, HERVL expression is enriched at the 8C stage, corresponding to the time of human embryo ZGA [8]. MERVL and HERVL can be bound by mouse DUX and human DUX4, respectively, but cross-species binding is minimal, suggesting independent but convergent evolution in mouse and human [19, 46]. Over-expression of Dux in mESCs can activate MERVL and downstream 2C genes. Similarly, human DUX4 over-expression results in HERVL activation and simultaneous upregulation of human 8C-specific genes [19, 46].

Upon exit from the 2C stage, MERVL is rapidly silenced and its expression falls back to baseline in mouse 8C embryos. The silencing of MERVL is mediated by ZFP809. ZFP809 is a mouse-specific zinc finger protein, containing the KRAB domain at the N-terminus and seven zinc finger domains at the C-terminus [47]. The zinc finger domains allow ZFP809 to bind to the PBS sequence of MERVL, and the KRAB domain recruits TRIM28, together with NURD (histone deacetylase) and SETDB1 (histone methyltransferase), leading to a condensed chromatin configuration and repression of MERVL activity [7, 47, 48]. Interestingly, Zfp809 produces two isoforms: a full-length protein and a truncated protein that lacks 50 residues at the C-terminus. The full-length protein is selectively stable in ESCs but degraded in other cell types, whereas the short isoform is constitutively expressed in both ESCs and differentiated cells. The underlying impact and functional differences between the two differentially expressed isoforms remain unknown [47]. A critical question that remains to be addressed is whether failure to silence MERVL delays the development of mouse early embryos by trapping the cells in totipotency.

3.2. ERV in Pluripotency Regulation

Upon exit from the totipotent state, cells make the first cell fate decision to become extraembryonic trophectoderm or pluripotent epiblast. ERVL is rapidly silenced along with the exit from totipotency, while other sub-families of ERVs are upregulated [8, 11, 45]. The HERVH sub-family is one of the most predominant ERVs in pluripotent stem cells. Its internal sequence (ERV-int) has degenerated more slowly than that of other ERVs, suggesting a potential function of the HERVH-int sequence in the pluripotent state [5, 6]. It is not known whether the silencing of HERVL is a prerequisite for the activation of HERVH during human embryo development. However, it is possible that if HERVL is not silenced, the totipotency transcription network will remain active, and cells might be trapped in the totipotent state. Similarly, forced activation of HERVL in pluripotent stem cells may also induce totipotent gene expression and shut down HERVH expression [45].

HERVH copies are highly enriched with putative binding sites for pluripotency factors including KLF4, NANOG, and OCT4 [11]. In hESCs, HERVH is also enriched with H3K4me3 and H3K27ac [11], implying that these copies are potentially active promoters or enhancers for pluripotent gene regulation. Ectopic activation of HERVH sub-families by the CRISPR activation system can result in extensive upregulation of genes located up to 200 kb from HERVH sequences [49]. In addition, a total of 128 and 145 chimeric transcripts of HERVH are detected in hiPSCs and hESCs, respectively, suggesting that HERVH can function as an alternative promoter to activate pluripotency-related genes [11]. In contrast, the native promoters of these genes are rarely active in pluripotent stem cells [11]. Although there could be functional distinctions between chimeric transcripts from ERV promoters and original transcripts from native promoters, the ERV-mediated activation of these genes in early embryonic development offers an additional opportunity to rewire gene expression and innovate on transcription regulation.

In addition, the lncRNAs derived from HERVH also play critical roles in pluripotency regulation. They may function as scaffold units to recruit chromatin modifiers and direct them towards specific locations [50, 51]. Specifically, the HERVH lncRNAs mainly localize to the nucleus, and they can recruit chromatin modifiers such as P300 to the genomic loci of LTRs to regulate transcription of nearby pluripotency genes [52]. HERVH knockdown leads to fibroblast-like cell morphology [52] and downregulates more than 1000 genes, including a 50% reduction in NANOG and OCT4 expression, resulting in the partial loss of pluripotency and upregulation of differentiation markers [11]. In line with its role in hESCs, HERVH exhibits similar functions during somatic cell reprogramming [52]. HERVH expression is substantially upregulated upon ectopic expression of reprogramming factors, while depletion of HERVH during reprogramming reduces iPSC colony-forming efficiency [52]. Together, these lines of evidence indicate that HERVH is indispensable for both pluripotency establishment and maintenance.

Despite the importance of HERVH, the evidence remains controversial as to whether HERVH is required for naïve or primed pluripotency [11, 36, 52, 53]. Based on the LTR regions, HERVH can be further divided into several sub-families, such as LTR7Y, LTR7B, and LTR7. Some of the LTRs, like LTR7, are predominantly expressed in primed pluripotency [8], while LTR7Y may be more specific to naïve pluripotency [8]. Thus, naïve and primed pluripotency might employ different sub-families of HERVH controlled by the respective LTRs, but how this specificity is achieved requires further investigation.

3.3. ERV in Extraembryonic Tissue Differentiation

Research work has shed more light on the roles of ERVs in trophectoderm differentiation since the 1990s [54]. The roles of ERVs in extraembryonic tissue differentiation are mediated by regulating trophectoderm-specific transcription and by encoding for fusion proteins during the syncytia formation.

Many ERVs are robustly expressed during placenta development [55]. Among them, HERVW, HERVFRD, and HERV3 are the three most active sub-families and express high levels of the ENV gene [56, 57]. SYNCYTIN 1, translated from the ENV gene of HERVW, lacks an immunosuppressive domain compared to the full-length ENV protein. It is specifically upregulated in syncytiotrophoblast during implantation [41, 42]. The hydrophobic domain in SYNCYTIN 1 enables its fusion with the plasma membrane and potentially aids in uterus invasion [58]. Ectopic expression of the HERVW ENV gene can induce cell fusion, which is reversed by neutralizing antibodies against SYNCYTIN 1 [59]. Conversely, the lack of SYNCYTIN 1 in primary trophoblast cells reduces their ability to form syncytia [60]. Similar to SYNCYTIN 1, the SYNCYTIN 2 produced by HERVFRD also promotes cell fusion upon ectopic expression in several cell lines [61]. Interestingly, the ENV protein derived from HERV3 is expressed not only in syncytiotrophoblast but also in a wide range of tissues, particularly those producing hormones [62, 63]. Notably, 1% of the Caucasian population carries a premature stop codon near the N-terminus of this protein, resulting in a non-functional short isoform. However, this does not lead to observable physiological defects in these individuals [54, 55]. One possible explanation is that ENV proteins from different ERV sub-families play redundant roles. Apart from the proteins, ERVs also function as cis-regulatory elements in extraembryonic differentiation. In the mouse placenta, one of the ERV sub-families, RLTR13D5, is highly enriched with H3K27ac and H3K4me1, suggesting its potential role as an enhancer [64]. Moreover, RLTR13D5 can be functionally bound by CDX2, EOMES, and ELF5 to regulate transcription in trophoblast stem cells and contribute to placenta development [64].

3.4. ERV in Somatic Tissue Differentiation

Despite the high activity of ERVs and other TE families in early embryos, they are thought to be largely deactivated during differentiation. The silencing mechanism involves coevolution between the host transcription regulatory machinery and ERVs to tame ERV expression and limit their transposition. Improper silencing of ERVs is associated with loss of tissue homeostasis and pathological conditions. For example, the ENV protein derived from HERVW is highly expressed in type 1 diabetes and inhibits the secretion of insulin [65]. Transcripts and proteins of HERVK are also detected in amyotrophic lateral sclerosis brain tissue, where they may contribute to the inhibition of neurite growth [66]. In human muscle cells, aberrantly expressed DUX4 binds to HERVL and induces its expression, and the activated HERVL elements serve as alternative promoters that alter the transcription network in facioscapulohumeral muscular dystrophy [10]. Moreover, the HERV-derived lncRNA TROJAN promotes ubiquitin-associated degradation of metastasis-repressing factors and accelerates breast cancer progression [36].

Beyond the conventional view that ERV activation in differentiated tissue leads to pathological conditions, a growing number of tissue-specific ERVs have been identified, and they are thought to contribute to cell type-specific differentiation or tissue-specific functions [67]. For instance, during mouse gastrulation, different ERV sub-families are activated in different cell fates: the erythroid lineage has high RLTR10F activity, while mesoderm favors ERVB4 [67]. However, the exact function of these ERVs in the respective lineages remains elusive. Similarly, during the differentiation of human pluripotent stem cells (hPSCs) to cardiomyocytes in vitro, distinct sets of ERVs are selectively activated in different cell populations. For example, LTR32, MER57A-int, and MER45A are specifically expressed in definitive cardiomyocytes, while MLT1H1, HERVIP10B-int, and LTR5A are selectively active in non-contractile cells [67]. Notably, many transcriptional regulators of ERVs in ESCs, such as KLF-family members, are also expressed in tissue-specific cell types; thus, they may regulate ERVs in the respective contexts.

Taken together, these pieces of evidence demonstrate two fundamental aspects of ERV in differentiation: (1) Various ERVs are now associated with tissue differentiation and specific cell lineage. (2) Aberrant ERV expression in differentiated tissues may be toxic while targeting these ERVs could provide potential therapeutic means to slow down disease progression.

3.5. ERV in Germline Formation

Although ERVs are largely silenced during differentiation, they are highly expressed and activated during germline formation. The first observation of ERV expression in germline cells dates back to 1983, when the virus-like “intracisternal A particle (IAP)” was detected in mouse oocytes [68]. To date, more than 800 types of LTRs have been detected in mouse oocytes, and they are involved in diverse functions that aid in oocyte transcription regulation and facilitate oogenesis [69]. For instance, DICER protein is present in both mouse somatic cells and oocytes [70]. However, instead of being transcribed from the native promoter, oocyte-specific DICER expression is driven by the LTR of MTC and produces an isoform lacking the N-terminal DExD helicase domain compared to the full-length somatic DICER produced from its native promoter. Deletion of the LTR region of MTC impaired oocyte-specific DICER expression, resulting in female sterility [70]. Many of the ERVs activated in the oocyte are passed down to the zygote as maternal factors, which are thought to be involved in ZGA [71], but their exact functions remain to be dissected.

In contrast, the progenitors of mouse germline cells, namely primordial germ cells (PGCs), show repressed ERV activity. ERV sequences in PGCs are enriched with H3K9me3 and H3K27me3, which induce a repressive chromatin configuration [72]. Specifically, the methyltransferase SETDB1 protects PGCs from ERV activity. SETDB1 knockout PGCs show upregulated ERV activity, a low survival rate, and postnatal hypogonadism [72]. Although this contrasts with the general observation that ERVs are upregulated in germ cells, it is possible that different families of ERVs are involved in different stages of germ cell formation.

4. Conclusion and Outlook

ERVs were previously thought to coordinate with the host genome during mammalian evolution, and they are now considered integral parts of species- and cell type-specific gene regulatory networks. Research on ERVs in stem cell fate decision and differentiation has only begun to unravel their roles, and many questions remain to be answered. Given the observed stage-specific expression pattern of ERVs (Figure 1), what is the specific function of each ERV sub-family in different developmental stages? How do different cell types achieve specific activation of ERV sub-families? What is the consequence of unscheduled activation or silencing of ERVs during early embryogenesis? Do ERVs exhibit cell type-specific expression beyond blastocyst stages? Will ERVs represent novel targets for diseases? Future studies will shed light on these questions and open up the fascinating but less charted road of ERVs.

Acknowledgments

We thank Ziying Tan for comments on the manuscript. This work was supported by the National Natural Science Foundation of China, grant numbers 31871372 and 31950410535.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

  • 1. Goke J., Ng H. H. CTRL+INSERT: retrotransposons and their contribution to regulation and innovation of the transcriptome. EMBO Reports. 2016;17(8):1131–1144. doi: 10.15252/embr.201642743.
  • 2. Slotkin R. K., Martienssen R. Transposable elements and the epigenetic regulation of the genome. Nature Reviews. Genetics. 2007;8(4):272–285. doi: 10.1038/nrg2072.
  • 3. Chuong E. B., Elde N. C., Feschotte C. Regulatory activities of transposable elements: from conflicts to benefits. Nature Reviews. Genetics. 2017;18(2):71–86. doi: 10.1038/nrg.2016.139.
  • 4. Jacques P. E., Jeyakani J., Bourque G. The majority of primate-specific regulatory sequences are derived from transposable elements. PLoS Genetics. 2013;9(5, article e1003504) doi: 10.1371/journal.pgen.1003504.
  • 5. Romer C., Singh M., Hurst L. D., Izsvak Z. How to tame an endogenous retrovirus: HERVH and the evolution of human pluripotency. Current Opinion in Virology. 2017;25:49–58. doi: 10.1016/j.coviro.2017.07.001.
  • 6. Izsvak Z., Wang J., Singh M., Mager D. L., Hurst L. D. Pluripotency and the endogenous retrovirus HERVH: conflict or serendipity? BioEssays. 2016;38(1):109–117. doi: 10.1002/bies.201500096.
  • 7. Wolf D., Goff S. P. Embryonic stem cells use ZFP809 to silence retroviral DNAs. Nature. 2009;458(7242):1201–1204. doi: 10.1038/nature07844.
  • 8. Göke J., Lu X., Chan Y. S., et al. Dynamic transcription of distinct classes of endogenous retroviral elements marks specific populations of early human embryonic cells. Cell Stem Cell. 2015;16(2):135–141. doi: 10.1016/j.stem.2015.01.005.
  • 9. Agoni L., Guha C., Lenz J. Detection of human endogenous retrovirus K (HERV-K) transcripts in human prostate cancer cell lines. Frontiers in Oncology. 2013;3:p. 180. doi: 10.3389/fonc.2013.00180.
  • 10. Young J. M., Whiddon J. L., Yao Z., et al. DUX4 binding to retroelements creates promoters that are active in FSHD muscle and testis. PLoS Genetics. 2013;9(11, article e1003947) doi: 10.1371/journal.pgen.1003947.
  • 11. Wang J., Xie G., Singh M., et al. Primate-specific endogenous retrovirus-driven transcription defines naive-like stem cells. Nature. 2014;516(7531):405–409. doi: 10.1038/nature13804.
  • 12. Argaw-Denboba A., Balestrieri E., Serafino A., et al. HERV-K activation is strictly required to sustain CD133+ melanoma cells with stemness features. Journal of Experimental & Clinical Cancer Research. 2017;36(1):p. 20. doi: 10.1186/s13046-016-0485-x.
  • 13. Tovo P. A., Rabbone I., Tinti D., et al. Enhanced expression of human endogenous retroviruses in new-onset type 1 diabetes: potential pathogenetic and therapeutic implications. Autoimmunity. 2020;53(5):283–288. doi: 10.1080/08916934.2020.1777281.
  • 14. Szpakowski S., Sun X., Lage J. M., et al. Loss of epigenetic silencing in tumors preferentially affects primate-specific retroelements. Gene. 2009;448(2):151–167. doi: 10.1016/j.gene.2009.08.006.
  • 15. Hurst T. P., Magiorkinis G. Epigenetic control of human endogenous retrovirus expression: focus on regulation of long-terminal repeats (LTRs). Viruses. 2017;9(6):p. 130. doi: 10.3390/v9060130.
  • 16. Imbeault M., Helleboid P. Y., Trono D. KRAB zinc-finger proteins contribute to the evolution of gene regulatory networks. Nature. 2017;543(7646):550–554. doi: 10.1038/nature21683.
  • 17. Beyer U., Kronung S. K., Leha A., Walter L., Dobbelstein M. Comprehensive identification of genes driven by ERV9-LTRs reveals TNFRSF10B as a re-activatable mediator of testicular cancer cell death. Cell Death and Differentiation. 2016;23(1):64–75. doi: 10.1038/cdd.2015.68.
  • 18. Gaspar-Maia A., Alajem A., Polesso F., et al. Chd1 regulates open chromatin and pluripotency of embryonic stem cells. Nature. 2009;460(7257):863–868. doi: 10.1038/nature08212.
  • 19. Hendrickson P. G., Doráis J. A., Grow E. J., et al. Conserved roles of mouse DUX and human DUX4 in activating cleavage-stage genes and MERVL/HERVL retrotransposons. Nature Genetics. 2017;49(6):925–934. doi: 10.1038/ng.3844.
  • 20. Chuong E. B., Elde N. C., Feschotte C. Regulatory evolution of innate immunity through co-option of endogenous retroviruses. Science. 2016;351(6277):1083–1087. doi: 10.1126/science.aad5497.
  • 21. Cohen C. J., Lock W. M., Mager D. L. Endogenous retroviral LTRs as promoters for human genes: a critical assessment. Gene. 2009;448(2):105–114. doi: 10.1016/j.gene.2009.06.020.
  • 22. Batut P., Dobin A., Plessy C., Carninci P., Gingeras T. R. High-fidelity promoter profiling reveals widespread alternative promoter usage and transposon-driven developmental gene expression. Genome Research. 2013;23(1):169–180. doi: 10.1101/gr.139618.112.
  • 23. Faulkner G. J., Kimura Y., Daub C. O., et al. The regulated retrotransposon transcriptome of mammalian cells. Nature Genetics. 2009;41(5):563–571. doi: 10.1038/ng.368.
  • 24. Macfarlan T. S., Gifford W. D., Driscoll S., et al. Embryonic stem cell potency fluctuates with endogenous retrovirus activity. Nature. 2012;487(7405):57–63. doi: 10.1038/nature11244.
  • 25. Chen H. H., Liu T. Y., Huang C. J., Choo K. B. Generation of Two Homologous and Intronless Zinc-Finger Protein Genes, Zfp352 and Zfp353, with Different Expression Patterns by Retrotransposition. Genomics. 2002;79(1):18–23. doi: 10.1006/geno.2001.6664.
  • 26. Liu T. Y., Chen H. H., Lee K. H., Choo K. B. Display of different modes of transcription by the promoters of an early embryonic gene, Zfp352, in preimplantation embryos and in somatic cells. Molecular Reproduction and Development. 2003;64(1):52–60. doi: 10.1002/mrd.10218.
  • 27. Ge S. X. Exploratory bioinformatics investigation reveals importance of "junk" DNA in early embryo development. BMC Genomics. 2017;18(1):p. 200. doi: 10.1186/s12864-017-3566-0.
  • 28. Demircioğlu D., Cukuroglu E., Kindermans M., et al. A Pan-cancer transcriptome analysis reveals pervasive regulation through alternative promoters. Cell. 2019;178(6):1465–1477.e17. doi: 10.1016/j.cell.2019.08.018.
  • 29. Agoni L., Lenz J., Guha C. Variant splicing and influence of ionizing radiation on human endogenous retrovirus K (HERV-K) transcripts in cancer cell lines. PLoS One. 2013;8(10, article e76472) doi: 10.1371/journal.pone.0076472.
  • 30. Conley A. B., Piriyapongsa J., Jordan I. K. Retroviral promoters in the human genome. Bioinformatics. 2008;24(14):1563–1567. doi: 10.1093/bioinformatics/btn243.
  • 31. Tsai M. C., Manor O., Wan Y., et al. Long noncoding RNA as modular scaffold of histone modification complexes. Science. 2010;329(5992):689–693. doi: 10.1126/science.1192002.
  • 32. Panni S., Lovering R. C., Porras P., Orchard S. Non-coding RNA regulatory networks. Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms. 2020;1863(6, article 194417) doi: 10.1016/j.bbagrm.2019.194417.
  • 33. Durruthy-Durruthy J., Sebastiano V., Wossidlo M., et al. The primate-specific noncoding RNA HPAT5 regulates pluripotency during human preimplantation development and nuclear reprogramming. Nature Genetics. 2016;48(1):44–52. doi: 10.1038/ng.3449.
  • 34. Wang J., Li X., Wang L., et al. A novel long intergenic noncoding RNA indispensable for the cleavage of mouse two-cell embryos. EMBO Reports. 2016;17(10):1452–1470. doi: 10.15252/embr.201642051.
  • 35. Chen S., Hu X., Cui I. H., et al. An endogenous retroviral element exerts an antiviral innate immune function via the derived lncRNA lnc-ALVE1-AS1. Antiviral Research. 2019;170:p. 104571. doi: 10.1016/j.antiviral.2019.104571.
  • 36. Jin X., Xu X.-E., Jiang Y.-Z., et al. The endogenous retrovirus-derived long noncoding RNA TROJAN promotes triple-negative breast cancer progression via ZMYND8 degradation. Science Advances. 2019;5(3, article eaat9820) doi: 10.1126/sciadv.aat9820.
  • 37. Zhou B., Qi F., Wu F., et al. Endogenous retrovirus-derived long noncoding RNA enhances innate immune responses via derepressing RELA expression. mBio. 2019;10(4) doi: 10.1128/mBio.00937-19.
  • 38. Reis B. S., Jungbluth A. A., Frosina D., et al. Prostate cancer progression correlates with increased humoral immune response to a human endogenous retrovirus GAG protein. Clinical Cancer Research. 2013;19(22):6112–6125. doi: 10.1158/1078-0432.CCR-12-3580.
  • 39. Wang-Johanning F., Li M., Esteva F. J., et al. Human endogenous retrovirus type K antibodies and mRNA as serum biomarkers of early-stage breast cancer. International Journal of Cancer. 2014;134(3):587–595. doi: 10.1002/ijc.28389.
  • 40. Li M., Radvanyi L., Yin B., et al. Downregulation of human endogenous retrovirus type K (HERV-K) viral ENV RNA in pancreatic cancer cells decreases cell proliferation and tumor growth. Clinical Cancer Research. 2017;23(19):5892–5911. doi: 10.1158/1078-0432.CCR-17-0001.
  • 41. Black S. G., Arnaud F., Palmarini M., Spencer T. E. Endogenous retroviruses in trophoblast differentiation and placental development. American Journal of Reproductive Immunology. 2010;64(4):255–264. doi: 10.1111/j.1600-0897.2010.00860.x.
  • 42. Dong C., Beltcheva M., Gontarz P., et al. Derivation of trophoblast stem cells from naïve human pluripotent stem cells. eLife. 2020;9 doi: 10.7554/eLife.52504.
  • 43. Peaston A. E., Evsikov A. V., Graber J. H., et al. Retrotransposons regulate host genes in mouse oocytes and preimplantation embryos. Developmental Cell. 2004;7(4):597–606. doi: 10.1016/j.devcel.2004.09.004.
  • 44. Eckersley-Maslin M. A., Svensson V., Krueger C., et al. MERVL/Zscan4 network activation results in transient genome-wide DNA demethylation of mESCs. Cell Reports. 2016;17(1):179–192. doi: 10.1016/j.celrep.2016.08.087.
  • 45. Guallar D., Bi X., Pardavila J. A., et al. RNA-dependent chromatin targeting of TET2 for endogenous retrovirus control in pluripotent stem cells. Nature Genetics. 2018;50(3):443–451. doi: 10.1038/s41588-018-0060-9.
  • 46. De Iaco A., Planet E., Coluccio A., Verp S., Duc J., Trono D. DUX-family transcription factors regulate zygotic genome activation in placental mammals. Nature Genetics. 2017;49(6):941–945. doi: 10.1038/ng.3858.
  • 47. Wang C., Goff S. P. Differential control of retrovirus silencing in embryonic cells by proteasomal regulation of the ZFP809 retroviral repressor. Proceedings of the National Academy of Sciences. 2017;114(6):E922–E930. doi: 10.1073/pnas.1620879114.
  • 48. Wolf G., Yang P., Füchtbauer A. C., et al. The KRAB zinc finger protein ZFP809 is required to initiate epigenetic silencing of endogenous retroviruses. Genes & Development. 2015;29(5):538–554. doi: 10.1101/gad.252767.114.
  • 49. Fuentes D. R., Swigut T., Wysocka J. Systematic perturbation of retroviral LTRs reveals widespread long-range effects on human gene regulation. eLife. 2018;7 doi: 10.7554/eLife.35989.
  • 50. Pera M. F., Tam P. P. L. Extrinsic regulation of pluripotent stem cells. Nature. 2010;465(7299):713–720. doi: 10.1038/nature09228.
  • 51. Burgess D. J. HOTTIP goes the distance. Nature Reviews Genetics. 2011;12(5):p. 300. doi: 10.1038/nrg2992.
  • 52. Lu X., Sachs F., Ramsay L. A., et al. The retrovirus HERVH is a long noncoding RNA required for human embryonic stem cell identity. Nature Structural & Molecular Biology. 2014;21(4):423–425. doi: 10.1038/nsmb.2799.
  • 53. Wang J., Singh M., Sun C., et al. Isolation and cultivation of naive-like human pluripotent stem cells based on HERVH expression. Nature Protocols. 2016;11(2):327–346. doi: 10.1038/nprot.2016.016.
  • 54. Cohen M., Powers M., O'Connell C., Kato N. The nucleotide sequence of the env gene from the human provirus ERV3 and isolation and characterization of an ERV3-specific cDNA. Virology. 1985;147(2):449–458. doi: 10.1016/0042-6822(85)90147-3.
  • 55. Harris J. R. Placental endogenous retrovirus (ERV): structural, functional, and evolutionary significance. BioEssays. 1998;20(4):307–316. doi: 10.1002/(SICI)1521-1878(199804)20:4<307::AID-BIES7>3.0.CO;2-M.
  • 56. Venables P. J. W., Brookes S. M., Griffiths D., Weiss R. A., Boyd M. T. Abundance of an endogenous retroviral envelope protein in placental trophoblasts suggests a biological function. Virology. 1995;211(2):589–592. doi: 10.1006/viro.1995.1442.
  • 57. Blond J. L., Lavillette D., Cheynet V., et al. An envelope glycoprotein of the human endogenous retrovirus HERV-W is expressed in the human placenta and fuses cells expressing the type D mammalian retrovirus receptor. Journal of Virology. 2000;74(7):3321–3329. doi: 10.1128/JVI.74.7.3321-3329.2000.
  • 58. de Parseval N., Lazar V., Casella J. F., Benit L., Heidmann T. Survey of human genes of retroviral origin: identification and transcriptome of the genes with coding capacity for complete envelope proteins. Journal of Virology. 2003;77(19):10414–10422. doi: 10.1128/JVI.77.19.10414-10422.2003.
  • 59. Mi S., Lee X., Li X. P., et al. Syncytin is a captive retroviral envelope protein involved in human placental morphogenesis. Nature. 2000;403(6771):785–789. doi: 10.1038/35001608.
  • 60. Frendo J. L., Olivier D., Cheynet V., et al. Direct involvement of HERV-W env glycoprotein in human trophoblast cell fusion and differentiation. Molecular and Cellular Biology. 2003;23(10):3566–3574. doi: 10.1128/MCB.23.10.3566-3574.2003.
  • 61. Blaise S., de Parseval N., Benit L., Heidmann T. Genomewide screening for fusogenic human endogenous retrovirus envelopes identifies syncytin 2, a gene conserved on primate evolution. Proceedings of the National Academy of Sciences. 2003;100(22):13013–13018. doi: 10.1073/pnas.2132646100.
  • 62. Boyd M. T., Bax C. M. R., Bax B. E., Bloxam D. L., Weiss R. A. The human endogenous retrovirus ERV-3 is upregulated in differentiating placental trophoblast cells. Virology. 1993;196(2):905–909. doi: 10.1006/viro.1993.1556.
  • 63. Rote N. S., Chakrabarti S., Stetzer B. P. The role of human endogenous retroviruses in trophoblast differentiation and placental development. Placenta. 2004;25(8-9):673–683. doi: 10.1016/j.placenta.2004.02.008.
  • 64. Chuong E. B., Rumi M. A. K., Soares M. J., Baker J. C. Endogenous retroviruses function as species-specific enhancer elements in the placenta. Nature Genetics. 2013;45(3):325–329. doi: 10.1038/ng.2553.
  • 65. Levet S., Medina J., Joanou J., et al. An ancestral retroviral protein identified as a therapeutic target in type-1 diabetes. JCI Insight. 2017;2(17) doi: 10.1172/jci.insight.94387.
  • 66. Li W., Lee M.-H., Henderson L., et al. Human endogenous retrovirus-K contributes to motor neuron disease. Science Translational Medicine. 2015;7(307, article 307ra153) doi: 10.1126/scitranslmed.aac8201.
  • 67. He J., Babarinde I. A., Sun L., et al. Unveiling transposable element expression heterogeneity in cell fate regulation at the single-cell level. bioRxiv; 2020.
  • 68. Miller G. G., Makarova I. V., Iazykov A. A. Detection of endogenous intracisternal type A particles in early mouse zygotes. Biulleten'eksperimental'noi Biologii i Meditsiny. 1983;96:94–96.
  • 69. Franke V., Ganesh S., Karlic R., et al. Long terminal repeats power evolution of genes and gene expression programs in mammalian oocytes and zygotes. Genome Research. 2017;27(8):1384–1394. doi: 10.1101/gr.216150.116.
  • 70. Flemr M., Malik R., Franke V., et al. A retrotransposon-driven dicer isoform directs endogenous small interfering RNA production in mouse oocytes. Cell. 2013;155(4):807–816. doi: 10.1016/j.cell.2013.10.001.
  • 71. Evsikov A. V., Marin de Evsikova C. Gene expression during the oocyte-to-embryo transition in mammals. Molecular Reproduction and Development. 2009;76(9):805–818. doi: 10.1002/mrd.21038.
  • 72. Liu S., Brind’Amour J., Karimi M. M., et al. Setdb1 is required for germline development and silencing of H3K9me3-marked endogenous retroviruses in primordial germ cells. Genes & Development. 2014;28(18):2041–2055. doi: 10.1101/gad.244848.114.

Articles from Stem Cells International are provided here courtesy of Wiley
